Acta Polytechnica Vol. 52 No. 1/2012

GRB duration distribution considering the position of the Fermi satellite

D. Szécsi, Z. Bagoly, I. Horváth, L. G. Balázs, P. Veres, A. Mészáros

Abstract
The Fermi satellite follows a particular attitude profile during its flight which enables it to catch gamma-ray bursts particularly well. The side-effect of this favourable feature is that the lightcurves of the GBM detectors are affected by a rapidly and strongly varying background. Before these data are processed, they need to be separated from the background. The commonly used methods [3,7] were useless for most cases of Fermi, so we developed a new technique based on the motion and orientation of the satellite. The background-free lightcurve can be used to perform statistical surveys, hence we demonstrated the efficiency of our background-filtering method by presenting a statistical analysis known from the literature.

Keywords: gamma-ray burst, Fermi, background.

1 Introduction

NASA's Fermi Gamma-ray Space Telescope is designed to measure the position of a burst in seconds and change the detectors' orientation so that the orientation of the effective area of the main detector (LAT) and the celestial position of the burst have the smallest angle [6]. The problem is that the secondary detector (GBM) measures a quickly varying background superposed on the substantial data because of the rapid motion of the satellite. Work with the GBM data must therefore be preceded by removing the background, but the widely used methods, like fitting the background with a polynomial function of time, are too simple and not very effective in the case of Fermi. We made a model that involves the real motion and orientation of the satellite during its flight and the position and contribution of the main gamma-ray sources to the background.

Fig. 1: Lightcurve of the Fermi burst 091030613 measured by the 3rd GBM detector, with a fitted polynomial background of degree 3. The burst is at 0 s, but the variation of the background is comparable with its height, and cannot be modelled by a simple polynomial function of time of degree 2 or 3

2 The data and the model

After some basic data transformation, we can plot a GBM lightcurve as shown in Figure 1, which is typical. Since the variation of the background is caused by the rapid motion of the satellite, we need a specific model involving the position and orientation of Fermi in every second. We made such a model considering the detector's orientation and the celestial position of the burst, the Sun and the Earth limb. The effect of every other gamma-ray source is included in an isotropic constant gamma-ray background in this approximation. If the Earth limb is in the detector's field of view, it has two effects. First, it shields the cosmic gamma-ray background, which in our case we presume to be homogeneous and isotropic. Second, there are so-called terrestrial gamma-ray sources, like lightning in the upper atmosphere. These sources can influence the actual level of the background [6]. In both cases, the detected background depends on how much of the detector's field of view the Earth limb covers. In order to measure this, we compute the ratio of the uncovered sky to the size of the field of view and use this as an underlying variable.
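To illustrate how the underlying variables of such a model can be computed, the following sketch evaluates the celestial (great-circle) distance between the detector axis and a given sky position from equatorial coordinates. This is only a hedged Python illustration, not the authors' code; the function name and arguments are ours.

import numpy as np

def angular_distance(ra1, dec1, ra2, dec2):
    # great-circle separation of two sky positions; all angles in degrees
    ra1, dec1, ra2, dec2 = map(np.radians, (ra1, dec1, ra2, dec2))
    cos_sep = (np.sin(dec1) * np.sin(dec2)
               + np.cos(dec1) * np.cos(dec2) * np.cos(ra1 - ra2))
    return np.degrees(np.arccos(np.clip(cos_sep, -1.0, 1.0)))

Evaluated once per second along the orbit, for the burst and for the Sun, this yields two of the three background variables shown later in Figure 2.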
Comparing Figure 1 to Figure 2, we can note the connection between them: where the angles and the Earth limb's uncovered ratio vary, the background of the lightcurve also varies. Therefore, we use these 3 variables to fit the background of the burst.

3 Fitting the background

In order to separate the background from the burst's real data, we fitted a 3-dimensional hypersurface of degree 3 to the lightcurve's background. The 3 dimensions are shown in Figure 2. After subtracting the hypersurface from the lightcurve, we get the background-free lightcurve shown in Figure 3.

Fig. 2: From left to right: the celestial distance of the 3rd GBM detector and the position of the Fermi burst 091030613; the same with the Sun; and the ratio of the Earth-uncovered sky to the 3rd GBM detector's field of view, all as functions of time. It is worth comparing these figures with Figure 1

Fig. 3: After background fitting: the lightcurve of the Fermi burst 091030613 measured by the 3rd GBM detector. Comparing Figure 1 to Figure 3, we can state that our background filtering was successful, since the resulting lightcurve is essentially background-free

Fig. 4: The distribution of the log T90 statistical parameter for 332 bursts according to our method. The coloured curves show the fitted Gaussian functions: green and blue for the short peak and the long peak, light blue and purple for their sum

4 Testing the method

In order to verify our method described above, we made a statistical analysis based on our background-filtered data set. Since we were investigating the temporal characteristics of the lightcurve, the obvious choice was to compute the T90 statistical parameter (see [3]). We ran our background filtering method on 332 Fermi bursts and computed the T90 values for them. It is well known from the literature that the (logarithmic) distribution of the duration of gamma-ray bursts shows 2 or 3 peaks (see for example [1] or [4]). We present the distribution of the log T90 variable computed by us in Figure 4. In Figure 4, the duration distribution of our data follows the general T90 distribution shape (e.g. it has two peaks). The greater peak on the right-hand side surely represents the long/soft bursts, while the smaller peak on the left-hand side can represent the short/hard bursts or also the intermediate group (see [2] and [5]). We need further investigations to decide this question, but since the shape of the distribution is analogous to the literature (a logarithmic distribution with two peaks), we will develop and use this method for further Fermi GRB analyses.
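For reference, the sketch below shows the standard definition of T90 that the analysis relies on: the time during which the cumulative counts grow from 5 % to 95 % of the total burst fluence. It is a minimal Python illustration assuming uniformly binned, already background-filtered counts; it is not the project's pipeline code.

import numpy as np

def t90(times, counts):
    # cumulative fluence of the background-subtracted lightcurve
    cum = np.cumsum(counts)
    t05 = times[np.searchsorted(cum, 0.05 * cum[-1])]
    t95 = times[np.searchsorted(cum, 0.95 * cum[-1])]
    return t95 - t05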
5 Conclusion

The method can be further developed by including the effects of other gamma-ray sources, e.g. the Moon, some Galactic sources or nearby supernova remnants. Spectral (energy-dependent) analysis will be performed as well, which is one of our goals for the future. However, we can announce that we have successfully created a method that is able to separate the Fermi data from the motion-induced background so effectively that statistical analyses can be made using the data.

Acknowledgement

This work was supported by OTKA grant K077795, by OTKA/NKTH A08-77719 and A08-77815 grants (Z.B.), by GAČR grant No. P209/10/0734 (A.M.), by the Research Program MSM0021620860 of the Ministry of Education of the Czech Republic (A.M.) and by a Bolyai Scholarship (I.H.).

References

[1] Balázs, L., et al.: A&A, 1998, 339, 1.
[2] Balázs, L., et al.: BaltA, 2004, 13, 207B.
[3] Horváth, I., et al.: ApJ, 1996, 470, 56.
[4] Horváth, I., et al.: A&A, 2002, 392, 791.
[5] Horváth, I., et al.: A&A, 2008, 489, L1.
[6] Meegan, C., et al.: ApJ, 2009, 702, 791.
[7] Sakamoto, T., et al.: ApJS, 2008, 175, 179.

Dorottya Szécsi, Eötvös University, Budapest
Zsolt Bagoly, Eötvös University, Budapest; Bolyai Military University, Budapest
István Horváth, Bolyai Military University, Budapest
Lajos G. Balázs, Konkoly Observatory, Budapest
Péter Veres, Eötvös University, Budapest; Bolyai Military University, Budapest; Konkoly Observatory, Budapest
Attila Mészáros, Charles University, Prague

Acta Polytechnica Vol. 52 No. 3/2012

Research and development of CO2 capture and storage technologies in fossil fuel power plants

Lukáš Pilař(1), Jan Hrdlička(2)
(1) Nuclear Research Institute Řež plc, Vyskočilova 3/741, 140 21 Prague 4, Czech Republic
(2) CTU in Prague, Faculty of Mechanical Engineering, Department of Energy Engineering, Technická 4, 166 07 Prague 6, Czech Republic
Correspondence to: pilar@egp.cz

Abstract
This paper presents the results of a research project on the suitability of post-combustion CCS technology in the Czech Republic. It describes the ammonia CO2 separation method and its advantages and disadvantages. The paper evaluates its impact on the recent technology of a 250 MWe lignite-fired power plant. The main result is a decrease in electric efficiency by 11 percentage points, a decrease in net electricity production by 62 MWe, and an increase in the amount of waste water. In addition, more consumables are needed.

Keywords: post-combustion, ammonia, fossil fuel.

1 Introduction

A key goal of many current research projects is to reduce CO2 emissions. Several lignite-fired power plants are operated in the Czech Republic where CCS technology might be applied. This work is a part of a project that studies two CO2 separation methods, oxyfuel combustion and chemical absorption, together with storage in geological structures. While oxyfuel combustion is more suitable for a newly constructed plant, chemical absorption might be applied in power plants that are already in operation. This paper offers a detailed discussion of a post-combustion method based on chemical absorption of CO2, with an evaluation of the key parameters for a given fossil fuel fired power plant.

2 Methods of CO2 capture from flue gas

The methods for removing CO2 from flue gas can be classified according to their chemical and physical principles as follows [1–4]:
• absorption — scrubbing by an absorbent liquid
• adsorption — capture on the surface of solid matter, or extraction by ionic liquids
• physical separation — membrane processes, cryogenic separation
• hybrid approaches
• biological capture

These methods are currently at different levels of development, from laboratory scale to pilot units.
For power plants in the Czech Republic, only absorption techniques are under consideration, because these are currently the most technically developed. In this case, the CO2 is either captured by dissolving it physically in a solvent, or it is bound by a chemical reaction. These technologies have a similar operating principle. The flue gas enters an absorption tower, where it is scrubbed in counter-current by an absorption liquid (solvent). The saturated solvent is transferred to another tower, where the solvent is regenerated and the dissolved CO2 is removed at high concentration. During operation, there are certain losses of solvent, e.g. due to unwanted reactions and products, or because the solvent is released along with the flue gas. The solvent is therefore a consumable. At present, the solvents that are most widely applied are water solutions of:
– amines of various kinds (primary, secondary, tertiary, heterocyclic)
– ammonia
– carbonates of alkaline metals (sodium or potassium carbonate)
– blended solutions

3 Suitability of the methods

The most developed absorption methods have been described in great detail in the literature, and there are reports on the operation of pilot plants. Detailed information can be found about technologies that are currently under intensive development, or that are being specially developed for application in current power plants. We have selected two absorption methods for CO2 capture that are considered to be suitable for application in the Czech Republic. These two methods are in the most advanced stage of technological development, and they are expected to be the first built commercially. The first method uses amine scrubbing, and the second uses ammonia as the solvent. Other methods for CO2 capture are currently in the research and development process, but have not yet gone beyond laboratory-scale application. The advantages and disadvantages of these two methods are compared from the point of view of application in the Czech Republic, and also from the point of view of energy and material demands. The main differences between the two methods are as follows:
– Financial demands — the investment costs are about 20 % lower for the ammonia method. An advantage when operating the plant is that ammonia is cheaper than amines.
– Chemical properties of the solvent — both solvents are toxic and corrosive. Amines tend toward oxidative degradation, but the degradation caused by SO2 and NOx in the flue gas is a more important issue. Compared to the ammonia method, the amine technology requires less than 30 mg/Nm3 of SO2 and NOx in the flue gas from the combustion system. The amine technology therefore requires additional desulfurization, and a DeNOx system also needs to be used.
– Operating temperature — the amine technology works at higher temperatures. It requires higher steam parameters (temperature), while the ammonia method requires steam at about 140 °C. Generally, the heat consumption is higher for the amine method; however, the cooling consumption is higher for the ammonia method — besides the cooling water, it requires an additional cooling supply, because absorption takes place at approximately 0 °C.
– CO2 capture — it has been found that the ammonia method can absorb three times more CO2 per kg of solvent than the amine method. This is valid for monoethanolamine. However, studies are being carried out to increase this capacity.
– Energy demands — information is available only from journal articles, conference proceedings and company materials. The heat consumption is about 65 % lower for the ammonia method. The decrease in efficiency for an entire power plant is estimated to be 9 percentage points for the amine method, and 4 percentage points for the ammonia method. The decrease in efficiency was calculated for a current power plant using hard coal as the fuel.

On the basis of the considerations discussed here, the ammonia method was chosen as the reference method for application in power plants in the Czech Republic.

4 Input parameters of the study, and a technology proposal

For the technology proposal, the parameters of the flue gas after the desulfurization process from a reference coal-fired power plant were used. The parameters are summarized in Table 1.

Table 1: Input parameters

Dry flue gas: 766 045 Nm3/h
Water steam: 218 493 Nm3/h
Water (droplets): 80 kg/h
Temperature: 62 °C
CO2: 13.94 % vol.
O2: 5.44 % vol.
N2: 80.62 % vol.
NOx: 207.5 mg/Nm3
PM: 10.4 mg/Nm3
SO2: 155.6 mg/Nm3
SO3: 12.44 mg/Nm3

The ammonia process consists of the following main components:
– flue gas cooling and the flue gas fan
– CO2 absorption
– final cleaning of the scrubbed flue gas
– CO2 desorption
– CO2 final cleaning
– CO2 compression
– auxiliary cooling source
– ammonia treatment

A more detailed description is provided in the following paragraphs:
– Flue gas cooling — the absorption process for the ammonia method takes place at low temperatures (5–10 °C), so the flue gas must be cooled down as much as possible before entering the process. Condensed water steam is produced during the cooling process. Two-stage cooling is proposed; the first step involves counter-current water cooling, and the second step involves compression cooling. A flue gas fan is proposed after the cooling system to cover all pressure drops along the process.
– Absorption — the absorber is in principle similar to that for flue gas desulfurization. The CO2 is initially dissolved in water, and then it reacts with a solution of ammonia and ammonium carbonate. Crystallized ammonium bicarbonate does not react further, and is removed for regeneration. The regenerated solvent from desorption, which must also be cooled down in advance, is introduced at the highest level of the absorption tower. After the absorber, the flue gas passes through ammonia capture. The cleaned flue gas, at approximately 10 °C, enters the gas-gas heat exchanger, where it is warmed to approximately 50 °C by the flue gas entering the capture technology. The flue gas is then transported to the cooling towers. The suspension from the absorber is transported into a hydro-cyclone to dewater the ammonium bicarbonate to more than 50 % dry matter. The solution is pumped back to the absorber at 3.2 MPa. The suspension passes through a regenerative heat exchanger, where it is warmed by the solution that returns from desorption. The crystals are melted by heating, and enter the desorption column.
– Flue gas final cleaning — coming from the absorber, the flue gas enters the ammonia removal (scrubbing) device to remove the ammonia slip before the gas is released into the atmosphere.
– Desorption — decomposition of ammonium bicarbonate to ammonia and CO2 takes place here. The ammonia remains dissolved under pressure, and the CO2 is released in gaseous form. The process takes place at 3 MPa and 120 °C. All reaction heat and additional heat must be returned to warm the solution to 120 °C.
This heat is supplied by steam extracted from the turbine. The CO2 stream is collected at the head of the column at approximately 115 °C and passes through a cooler, where it is cooled to 30 °C. Condensed water droplets are removed in the separator, and the pure CO2 is compressed to the pressure required for transport, which is 10 MPa at a temperature of 50 °C. This means that the CO2 is in a supercritical, liquid-like state.
– CO2 compression — a two-stage radial compressor with an intercooler (integrally geared compressor) is proposed. The output temperature from the compressor will be 117 °C, and further cooling is proposed. In this study, a separate cooling loop will be integrated to utilize the heat from the cooling of the compressed CO2.
– Auxiliary cooling source — two cooling sources will be used for cooling the technology. The first (with the highest power) is a cooling loop with a cooling tower. However, the required temperature of around 0 °C cannot be attained there; for example, in summer the temperature will probably not be lower than 23 °C. Compression cooling with ammonia as the working fluid is therefore proposed. The ammonia loop typically reaches −12 °C, which is fully sufficient for our purposes.
– Ammonia treatment — this is necessary for ammonia storage and feeding. Storage will be in the liquid state.

5 Impacts on the current power plant

The proposed technology will necessarily have a negative influence on the whole power plant. The most important impacts are:
– Increased amount of water — the proposed cooling requires a large amount of water. The proposed water consumption is calculated in Table 2. The calculation assumes a temperature difference of 10 °C in the cooling tower. The system is designed as a closed system, filled with cooling water, with dense salt water removed as needed.

Table 2: Calculated amount of water

Device | Removed heat [MWt] | Cooling water [t/h] | Evaporation [t/h] | Condensate [t/h] | Salt removal [t/h]
1st cooling stage | 105.68 | 9 086.46 | 121.43 | 138.55 |
Cooling of desorbed CO2 | 8.68 | 746.35 | 9.97 | |
Compressor cooling of CO2 (1st stage) | 2.03 | 174.69 | 2.33 | |
Total | | 10 007.5 | 133.73 | 138.55 | 66.87
Cooling of the compressor cooler | 98.18 | 8 441.69 | 112.81 | | 56.40

– Increased energy self-consumption — the electricity needs of the main drives are already known (compressor, flue gas fan, compression cooling). The self-consumption is estimated at approx. 50 MWe, and is calculated in Table 3.

Table 3: Energy consumption

Flue gas fan: 2.03 MWe
Compression cooling: 36.82 MWe
Compressor: 6.32 MWe
Other: 4.52 MWe
Total: 49.69 MWe

– Steam consumption — steam is required for the desorption process, to heat up the suspension. Approximately 20.7 kg/s of steam is required.
– Consumption of demi water — demineralized water is required as make-up for the absorber, to sustain the required concentration and the required amount of solvent.
– Decreased efficiency — in this case, the post-combustion CO2 capture technology decreases the power plant's efficiency by approximately 11 percentage points. At a nominal power output of 250 MWe, the efficiency will be 28 % (a numerical cross-check follows this list).
– Waste water — the waste water contains residues of salts, and the total amount of waste water will increase.
– Required area — according to the literature, approximately 25 000 m2 of free area is required for a 600 MWe power plant. This is a very demanding requirement.
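The net-output and efficiency figures quoted in this list (and collected in Table 4 below) can be cross-checked with simple arithmetic. A minimal Python sketch, using only values stated in the paper:

energy_in_fuel = 588.0                                 # MWt
gross_power = {"current": 250.0, "with CCS": 238.0}    # MWe
plant_self_consumption = 24.0                          # MWe
ccs_consumption = {"current": 0.0, "with CCS": 50.0}   # MWe

for case in ("current", "with CCS"):
    net = gross_power[case] - plant_self_consumption - ccs_consumption[case]
    print(case, net, "MWe,", round(100.0 * net / energy_in_fuel, 1), "%")
    # -> current: 226 MWe, 38.4 %; with CCS: 164 MWe, 27.9 %

This reproduces the net generation of 226 MWe versus 164 MWe and the efficiency drop from 38.4 % to 27.9 % reported below.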
Table 4 summarizes all the important calculated data for a reference 250 MWe power plant running on lignite. The table compares the current situation and the situation after CCS construction with ammonia scrubbing.

Table 4: Summary

Parameter | Unit | Current situation | With CCS
Power output | MWe | 250 | 238
Coal consumption | t/h | 214 | 214
Energy in fuel | MWt | 588 | 588
Self-consumption | MWe | 24 | 24
CO2 production | t/h | 211 | 211
Captured CO2 | t/h | 0 | 190
CO2 emissions | t/h | 211 | 21
Consumption of CCS | MWe | 0 | 50
Net electricity generation | MWe | 226 | 164
Total efficiency | % | 38.4 | 27.9
Efficiency decrease | % | 0 | 10.5

6 Conclusion

The study presented here has shown that the ammonia post-combustion CO2 capture method is technologically suitable for a current 250 MWe power plant running on lignite. The technology is quite well known and available. However, the impact is very significant. The calculations have shown that the addition of CCS technology decreases the total efficiency of the power plant by nearly 11 percentage points. This means that the net electricity production decreases by approx. 62 MWe, mostly due to the self-consumption of the new technology. It also means that the electric efficiency of the power plant falls from the current level of 38.4 % to just 27.9 %. Further negatives are the increased production of waste water, and the addition of new consumables.

Acknowledgement

Financial support for this work under project number TIP FR-TI1/379 "Výzkum a vývoj optimální koncepce a technologie zachycování CO2 ze spalin elektrárny spalující hnědé uhlí v České republice" (Research and development of an optimal concept and technology for capturing CO2 from the flue gas of a lignite-fired power plant in the Czech Republic) is gratefully acknowledged.

References

[1] Herzog, H., Meldon, J., Hatton, A.: Advanced post-combustion CO2 capture. Prepared for the Clean Air Task Force, 2009. [cit. 2012-03-28] Available from: http://web.mit.edu/mitei/docs/reports/herzog-meldon-hatton.pdf
[2] Neathery, J.: CO2 capture for power plant applications. In Proceedings of the Illinois Basin Energy Conference, 2008.
[3] Rhudy, R., Freeman, R.: Assessment of post-combustion capture technology developments. Palo Alto: EPRI, 2007, 1012796.
[4] Feron, P. H. M.: Progress in post-combustion CO2 capture. [cit. 2012-03-28] Available from: http://www.co2symposium.com/presentations/colloqueco2_session3_02_feron_tno.pdf

Acta Polytechnica Vol. 51 No. 2/2011

Improving the photometry of the "Pi of the Sky" system

A. F. Żarnecki, K. Małek, M. Sokołowski

Abstract
The "Pi of the Sky" robotic telescope was designed to monitor a significant fraction of the sky with good time resolution and range. The main goal of the "Pi of the Sky" detector is to look for short-timescale optical transients arising from various astrophysical phenomena, mainly for the optical counterparts of gamma-ray bursts (GRBs). The system design, the observation methodology and the algorithms that have been developed make this detector a sophisticated instrument for looking for novae and supernovae and for monitoring the activity of blazars and AGNs. The final detector will consist of two sets of 12 cameras, one camera covering a field of view of 20° × 20°. For data taken with the prototype detector at the Las Campanas Observatory, Chile, a photometry uncertainty of 0.018–0.024 magnitude was obtained for stars of 7–10m. With a new calibration algorithm taking into account the spectral type of the reference stars, the stability of the photometry algorithm can be significantly improved. Preliminary results for the variable star BG Ind are presented, showing that an uncertainty of the order of 0.013 can be obtained.
Keywords: gamma-ray burst (GRB), prompt optical emissions, optical flashes, novae, variable stars, robotic telescopes, photometry.

1 Introduction

The "Pi of the Sky" experiment [1,2] was designed for continuous observations of a large part of the sky, in the search for astrophysical phenomena varying on scales from seconds to months, especially for prompt optical counterparts of gamma-ray bursts (GRBs). Other scientific goals include searching for novae and supernovae and monitoring the activity of blazars and AGNs. The large amount of data obtained in the project also enables the identification and cataloging of many different types of variable stars. The "Pi of the Sky" project involves scientists, engineers and students from leading Polish academic and research units: the Andrzej Sołtan Institute for Nuclear Studies, the Center for Theoretical Physics (Polish Academy of Sciences), the Institute of Experimental Physics (Faculty of Physics, University of Warsaw), the Warsaw University of Technology, the Space Research Center (Polish Academy of Sciences), the Faculty of Mathematics, Informatics and Mechanics (University of Warsaw), Cardinal Wyszyński University, and the Pedagogical University of Cracow.

2 Detector

The full "Pi of the Sky" system will consist of 2 sites separated by a distance of ∼ 100 km, which will allow satellites and other near-Earth objects to be rejected by parallax. Each site will consist of 12 highly sustainable, custom-designed CCD survey cameras. The cameras will be placed on custom-designed parallactic mounts (4 cameras per mount) with high tracking precision and two observation modes: "deep", with all cameras observing the same field (increasing measurement precision and/or time resolution), and "wide", with the cameras covering adjacent fields (maximizing the field of view). Pairs of cameras will work in coincidence and will observe the same field of view. The whole system will be capable of continuous observation of about 1.5 steradians of the sky, which roughly corresponds to the field of view of the Swift BAT instrument. The full system should be completed by the end of 2011.

Fig. 1: The "Pi of the Sky" prototype detector located in the Las Campanas Observatory in Chile

Hardware and software solutions were tested with a prototype device installed in the Las Campanas Observatory in Chile in June 2004 and upgraded in 2006 (see Figure 1). It consists of two CCD cameras (2 000 × 2 000 pixels, 15 μm × 15 μm each) observing the same field of view (20° × 20°) with a time resolution of 10 seconds. Each camera is equipped with Canon lenses, f = 85 mm, d = f/1.2, which enables them to observe objects down to ∼ 11m (∼ 13m for 20 co-added frames). The prototype runs fully autonomously, including diagnostics and recovery from known problems. Human supervision is possible via the Internet.

3 Data processing

With each camera taking about 3 000 images per night, processing the large amount of data is a non-trivial task. The search for fast optical transients (e.g. GRB flashes) requires very fast data processing and identification of events in real time. However, the nova search and the variable star analysis are based on precise photometry, which requires time-consuming detailed image analysis and data reduction. To meet both requirements, two independent analysis paths were developed: the on-line part, which performs fast data scanning in real time, and the off-line part, which performs a detailed data analysis.
3.1 On-line analysis

On-line data analysis is based on dedicated fast algorithms optimized for transient searches. In the full system, real-time frame-by-frame analysis will enable alerts to be distributed to the community for follow-up observations. After dark frame subtraction, an image is transformed by a special transformation called the Laplace filter. A new value for each pixel is calculated, taking into account the sum of the pixels around it and the sum of the pixels surrounding the central region. The idea of this transformation is to calculate the simple aperture brightness for each pixel (a fast aperture photometry algorithm). The resulting image, after the Laplace filter, is compared with a reference image stored in memory (based on a series of previous images). Any difference observed (above the estimated noise level) is considered a possible "candidate event". All events are then processed through a set of selection algorithms to reject backgrounds such as background fluctuations, hot pixels, cosmic-ray hits, or satellites. Coincidence between cameras is crucial for recognizing CCD-related background and cosmic rays. To allow efficient background rejection, a multilevel selection system with pipeline data processing, similar to trigger systems in particle physics experiments, is used.
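The Laplace-filter idea described above (an aperture brightness for every pixel: the sum over a small central box minus a background estimated from the surrounding region) can be sketched as follows. This is our reading of the description, not the project's actual kernel; the box sizes are illustrative.

import numpy as np
from scipy.ndimage import uniform_filter

def laplace_like_filter(image, inner=3, outer=7):
    # box sums obtained from mean filters scaled by the box area
    inner_sum = uniform_filter(image, inner) * inner**2
    outer_sum = uniform_filter(image, outer) * outer**2
    # mean background level in the ring between the two boxes
    ring_mean = (outer_sum - inner_sum) / (outer**2 - inner**2)
    # aperture brightness: central sum minus the expected background
    return inner_sum - ring_mean * inner**2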
3.2 Off-line data reduction

The aim of the off-line data analysis is to identify all objects in an image, and to add their measurements to the database. The reduction pipeline consists of three main stages: photometry, astrometry and cataloging. The data stored in the catalogue are then subjected to off-line analysis, which consists of several different algorithms. Algorithms optimized for off-line data reduction are applied to the sums of 20 subsequent frames, which is equivalent to an analysis of 200 seconds of exposure. After dark frame subtraction and flat correction, multiple aperture photometry is used, adopted from the ASAS experiment [3]. The procedure prepares lists of stars with (x, y) coordinates on the CCD and estimated magnitudes for each camera. The lists are then the input for the astrometry procedure. This is an iterative procedure in which stars from the list are matched against reference stars from the catalog (the Tycho catalog is currently used). After successful reference star matching, their measurements are used to calculate the photometry corrections (the final measurement is normalized to V magnitudes from the Tycho catalog). Finally, all measurements are added to the PostgreSQL database. All data taken by the "Pi of the Sky" prototype and stored in the project databases are publicly accessible. Two data sets are currently available: the first database covers the period from July 2004 until June 2005, and contains about 790 million measurements for about 4.5 million objects, while the new one covers the period from May 2006 until April 2009, and includes about 2.16 billion measurements for about 16.7 million objects. A dedicated web interface has been developed to facilitate public access [4].

4 Photometry

4.1 Data quality cuts

Off-line data reduction algorithms are designed for maximum efficiency. All collected data are stored in the database. Additional cuts have to be applied in the analysis stage to select data with high measurement precision. It is necessary to remove measurements affected by detector imperfections (hot pixels, measurements close to the CCD edge, background due to an opened shutter) or by observation conditions (planet or planetoid passage, Moon halo). Dedicated filters, taking into account all known effects, have been developed to remove bad object measurements (or whole images). The photometry accuracy obtained after applying the standard set of cuts to remove bad quality data is illustrated in Figure 2. For stars from 7m to 10m, an average photometry uncertainty of about 0.018–0.024m has been obtained.

Fig. 2: Precision of star brightness measurements from standard photometry, for 200 s exposures (20 co-added frames) from the "Pi of the Sky" prototype in the Las Campanas Observatory in Chile

4.2 Spectral corrections

Until 2009 the prototype detector installed in LCO was not equipped with any filter, except for an IR+UV cut filter (since summer 2009 one of the cameras has been equipped with a standard R filter). This resulted in a relatively wide spectral sensitivity of the "Pi of the Sky" detector, as shown in Figure 3. The average wavelength, λ ≈ 585 nm, is closest to the V filter, which we use as a reference in the photometry corrections. When trying to improve the photometry precision for the variable star BG Ind, we observed that the average magnitude m_PI of the reference stars, as measured by "Pi of the Sky", is shifted systematically with respect to the catalogue magnitude V, depending on the spectral type of the star as given by the difference of the catalogue magnitudes B − V or J − K, see Figure 4.

Fig. 3: Spectral sensitivity of the "Pi of the Sky" detector, as resulting from the CCD sensitivity and the IR+UV filter transmission, compared to the transmission curves of standard photometric filters

Fig. 4: Average difference between the "Pi of the Sky" magnitude m_PI and the catalogue V magnitude for the reference stars, as a function of the spectral type given by B − V

The dependence of the average differences between the measured and catalogue magnitudes on the spectral type has been approximated by a linear function. This corrects the measurement of each reference star so that, on average, the measured magnitude m_corr is the same as the catalogue V magnitude, independently of spectral type. This so-called "spectral correction" significantly reduces the systematic uncertainties in the reference star magnitude measurements. The distribution of the average magnitude shift for the reference stars used in the BG Ind analysis, before and after the spectral correction, is shown in Figure 5.

Fig. 5: Distribution of the difference between the measured reference star magnitude and the catalogue V magnitude before (red) and after (blue) the spectral correction

The corrected reference star measurements are used to evaluate an additional photometry correction for the studied object (the variable star BG Ind is used as an example). To calculate the correction, only reference stars with catalogue magnitude 6 < V < 10 and with an angular distance from the object smaller than 4 degrees are used. These cut values were found to give the most precise and most stable photometry corrections, resulting in the smallest uncertainty in the final determination of the brightness of BG Ind.
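A hedged sketch of the spectral correction just described: the colour term of the reference stars is approximated by a linear function of B − V and removed, so that the corrected magnitudes match the catalogue V values on average. The names are ours, and the actual pipeline implementation may differ.

import numpy as np

def spectral_correction(m_pi, v_cat, b_minus_v):
    # fit the linear dependence of (m_PI - V) on (B - V) ...
    slope, intercept = np.polyfit(b_minus_v, m_pi - v_cat, 1)
    # ... and remove it from the measured magnitudes
    return m_pi - (slope * b_minus_v + intercept)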
Significantly improved measurement precision is also obtained when the photometry correction is not calculated as a simple average over all selected reference stars, but when a quadratic dependence of the correction on the reference star position in the sky is fitted for each frame. The effect of the new photometry correction procedure on the reconstructed BG Ind light curve is shown in Figure 6. After applying the new corrections, the measurement quality improves significantly. An uncertainty of the order of 0.013m can be obtained.

Fig. 6: Phased light curve of the variable star BG Ind before (left) and after (right) the new correction procedure described in this work

5 Conclusions

The "Pi of the Sky" prototype has been working since 2004, and has delivered a large amount of photometric data, which is publicly available [4]. With improved understanding of the detector and new filtering algorithms, the data quality and the stability of the photometry algorithm can be significantly improved. Work on the new photometry corrections is still ongoing, and further improvements are still possible. Additional corrections can take into account the dependence of the magnitude error of a star on its catalogue brightness, the CCD pixel structure and pixel response non-uniformity, as well as information on the correction quality from the fit. We hope to be able to obtain a measurement precision of ∼ 0.01m for stars up to 10m (in optimal observation conditions). An independent study is also under way to prepare a photometry algorithm based on a detailed PSF (point spread function) model.

Acknowledgement

We are very grateful to G. Pojmański for access to the ASAS dome and for sharing his experience with us. We would like to thank the staff of the Las Campanas Observatory for their help during the installation and maintenance of our detector. This work was financed between 2005 and 2010 by the Polish Ministry of Science and Higher Education, as a research project.

References

[1] Burd, A., et al.: Pi of the Sky — all-sky, real-time search for fast optical transients, New Astronomy, 10 (5), 2005, 409–416.
[2] Małek, K., et al.: "Pi of the Sky" detector, Advances in Astronomy, 2010, 2010, 194946.
[3] Pojmański, G.: The All Sky Automated Survey, Acta Astronomica, 47, 1997, 467.
[4] "Pi of the Sky" measurement databases are available at http://grb.fuw.edu.pl/pi/databases

Aleksander Filip Żarnecki, e-mail: filip.zarnecki@fuw.edu.pl, Institute of Experimental Physics, University of Warsaw, ul. Hoża 69, 00-681 Warsaw, Poland
Katarzyna Małek, Center for Theoretical Physics, Warsaw, Poland
Marcin Sokołowski, Sołtan Institute for Nuclear Studies, Warsaw, Poland

Acta Polytechnica Vol. 50 No. 3/2010

Physics on the Terascale

J. Engelen

The Terascale

At an energy of order 10^12 eV, i.e. 1 TeV, the fundamental interactions between quarks, leptons and gauge bosons need an additional ingredient in order to preserve unitarity. This ingredient is naturally provided by the Brout-Englert-Higgs mechanism. This mechanism invokes a scalar field with a "Mexican hat" type potential, in which the ground state of the vacuum occurs at a finite value of the field ("symmetry breaking"); the quantum of this field, the Higgs boson, provides a way of "generating" massive gauge bosons whilst preserving the renormalizability of the theory. The existence of the Higgs boson has not yet been demonstrated experimentally; in fact the elucidation of "the Higgs sector" represents a huge experimental program – arguably including the "next large collider" ILC-CLIC – that only begins with the discovery of the (or "a") Higgs boson. Furthermore, access to the Terascale holds the promise of more discoveries.
A theoretically attractive scenario, in which the electroweak and strong interactions unify at a scale of 10^15 GeV ("grand unification"), requires an additional ingredient to enter the renormalization group equations at a scale of 1 TeV. This could mark the scale at which "supersymmetry" is revealed and a new world of supersymmetric partners of all known particles is discovered. Finally, and increasingly speculatively, the Terascale might give us access to "large" (of the order of 0.1 mm) extra dimensions, open only to gravity: this would open up possibilities for the experimental study of quantum gravity well below the Planck scale (10^19 GeV).

An accelerator for the Terascale

The Large Hadron Collider was approved by the CERN Council in December 1996. Initial ideas can be traced back to the late 1970s, but even in 1996 an extensive R&D programme still had to be completed before the feasibility (and the cost) of the LHC could be established. The nominal energy of 14 TeV, i.e. 7 TeV per beam, required new superconducting magnet technology; the nominal luminosity of 10^34 cm^-2 s^-1 required, among other things, a novel collimation system. Moreover, experiments able to cope with the interaction rate at this energy needed new concepts in detector technology and in the on-line selection of potentially interesting events ("triggering"). This is the subject of a separate presentation at this symposium, by Peter Jenni (spokesman of the ATLAS collaboration). In addition, the huge amounts of data, 15 petabytes per year, required a new approach to computing: the now operational Worldwide LHC Computing Grid (WLCG), also serving as a platform for applications outside high energy physics.

The LHC dipole magnets have all now been successfully produced, and the last magnet was installed in the tunnel in March 2008. In total 1 232 of these magnets and about 400 quadrupole magnets (and a large number of smaller, higher-order correction magnets) form the LHC lattice. The LHC is a marvel of technology in all its facets, crowned by the superconducting dipole magnets. They are 15 m in length, and feature two coils (for the two beams) in one "cold mass" (flux return yoke). The coils consist of niobium-titanium cables (made of 7-micron filaments) and are operated at a temperature of 1.9 K, cooled by superfluid helium.

The LHC was successfully put into operation on September 10, 2008. In the period September 10–19, 2008 this very complex machine proved to have been extremely well designed: injecting, circulating and "capturing" (by the RF acceleration system) the beam was achieved in a matter of days. Preparations for first collisions were in full swing when an unfortunate incident revealed a weakness in one of the joints between two superconducting cables between two magnets. There are more than 10 000 such connections in the LHC, but one of them had a resistance of ∼ 100 nano-ohm (instead of less than 1, as in the specifications; a few additional "suspect" joints were identified during subsequent inspections). The dissipated energy (at a current of 10 kA) led to warming and eventual failure of the joint. The loose cable subsequently discharged, burning a hole in the pipe carrying the liquid helium. The helium escaping into the vacuum system caused a pressure wave that damaged (and in a number of cases dislodged) magnets over a distance of several hundred meters. The repairs and the implementation of additional diagnostics (and of some measures to limit the damage should such an event occur again, but it should not!)
are estimated to take a full year, so that the restart of the accelerator is foreseen for October 2009. (Note added: meanwhile the LHC has very successfully resumed operation in November 2009.)

(Brief summary of a colloquium given in honor of Professor J. Niederle on the occasion of his 70th birthday.)

Conclusion

Research into the interactions of elementary particles and fields is on the eve of a new era. The Large Hadron Collider, a marvel of technology and a wonderful example of European leadership in worldwide collaboration, will allow the exploration of new and uncharted territory, where exciting new phenomena are waiting to be discovered. Finally, Jiří Niederle has contributed prominently to the European leadership referred to in the previous paragraph: as the Czech delegate, he is a long-time member of the CERN Council; on the basis of his authority as a prominent scientist he has successfully helped to steer CERN in the right direction!

References

[1] The Large Hadron Collider: a Marvel of Technology. EPFL Press, 2009, edited by Lyndon Evans.

Jos Engelen, NIKHEF and NWO – the Netherlands Organisation for Scientific Research, P/O Box 93138, NL 2509 AC Den Haag, Netherlands

Acta Polytechnica Vol. 51 No. 6/2011

Influence of the lossy compression JPEG2000 standard on the deformation of PSF

P. Páta

Abstract
This paper deals with the influence of lossy compression algorithms on the deformation of the point spread function (PSF) of imaging systems in astronomy. Lossy compression algorithms reduce irrelevant information in image functions, and their application distorts the image function. Astronomical images have typical specific properties: high grayscale bit depth, size, noise occurrence and special processing algorithms. They belong to the class of scientific images, as do medical and similar images. Their processing and compression is quite different from the classical approach of multimedia image processing. The influence of the JPEG2000 coder on the deformation of the PSF is presented in this paper.

Keywords: astronomical lossy image compression, PSF deformation, Astronomical Context Coder (ACC), JPEG2000.

1 Introduction

This paper deals with the influence of the lossy compression standard JPEG2000 on the deformation of the point spread function (PSF) of an imaging system. JPEG2000 is a lossy compression standard exploiting the wavelet approach [4]. Image compression based on the wavelet transform is nowadays very popular because of its nice properties. The whole imaging system can be described using a model based on the point spread function. This approach demonstrates the influence of each part of the system on the quality of the acquired image. In the case of a system with linear and space-invariant parameters, the final PSF can be expressed as a convolution of the PSFs of the parts of the system:

$PSF(x, y) = PSF_{air}(x, y) \ast PSF_{optics}(x, y) \ast PSF_{sensor}(x, y) \ast PSF_{IP}(x, y) \ast PSF_{compr}(x, y)$,  (1)

where $PSF_{air}(x, y)$ is the point spread function of the Earth's atmosphere, $PSF_{optics}(x, y)$ is the influence of the system optics, $PSF_{sensor}(x, y)$ is the PSF of the imaging sensor, $PSF_{IP}(x, y)$ represents the image processing part, and $PSF_{compr}(x, y)$ covers the influence of the image compression method used. Each of these steps can distort the acquired image and can change it irreversibly. When a lossless compression algorithm is used, the corresponding point spread function is equal to the Dirac impulse.
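A minimal numerical illustration of Eq. (1): the system PSF is built up by convolving the component PSFs (here all modelled as Gaussians purely for illustration), and convolution with a discrete Dirac impulse, the PSF of a lossless coder, leaves the result unchanged. A hedged Python sketch with assumed sizes and sigmas:

import numpy as np
from scipy.signal import fftconvolve

def gaussian_psf(size=65, sigma=3.0):
    y, x = np.mgrid[:size, :size] - size // 2
    g = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    return g / g.sum()   # normalized to unit volume

# system PSF as a convolution of component PSFs (illustrative sigmas)
psf = gaussian_psf(sigma=2.0)
for sigma in (1.5, 1.0):
    psf = fftconvolve(psf, gaussian_psf(sigma=sigma), mode="same")

# a lossless coder contributes a Dirac impulse and changes nothing
delta = np.zeros_like(psf)
delta[psf.shape[0] // 2, psf.shape[1] // 2] = 1.0
assert np.allclose(fftconvolve(psf, delta, mode="same"), psf)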
The application of lossy compression methods deforms the point spread function of the imaging system, of course [5]. It is therefore necessary to consider the compression intensity that is used. The deformation of the point spread function is used as the most important quality criterion in this paper. In most cases, the imaging system can be considered linear and space-invariant. Other systems can be described as piecewise linear and space-invariant [6].

2 Image data and point spread function model

Astronomical images are a special class of images. They have different parameters from multimedia image classes. The most important distinctions are:
• high bit depth (up to 16 bits)
• grayscale and color representation different from the multimedia RGB system
• significant noise level
• sophisticated algorithms for processing astronomical images

Three images have been chosen as typical representatives of astronomical images. The first is an image acquired by a wide-field camera of the BOOTES experiment (Burst Observer and Optical Transient Exploring System) [1]. The image captures the neighborhood of M7, with objects (stars) with a full width at half maximum (FWHM) of approximately a few pixels (see Figure 1). The other two images can be classified in the deep-sky class. These images are acquired with longer-focus optics. Image M42 contains a satellite trail, and image M51 is in the red filter (see Figure 1). These images are stored in the DEIMOS image database [2]. A 2D Gaussian or Moffat model of the point spread function (PSF) can be used for space-invariant linear systems without a dominant influence of image aberrations [8].

Fig. 1: Input image data from the DEIMOS database: a) image from the wide-field camera: M7 and the Milky Way with many objects (size smaller than 10 pixels); b) the M51 galaxy in the red filter, a deep-sky image with bigger objects; c) the M42 nebula with a satellite trail, an image obtained from the deep-sky camera of the BOOTES system

3 Results and measurements

The Java implementation of the Astronomical Context Coder [7] has been used to study the influence of lossy compression on the PSF deformation. This software package contains the JPEG2000 standard with an extension for 16-bit images. Criteria were chosen for evaluating the quality of the compressed images. These criteria are based on a description of the image function with respect to the deformation of objects in the image, and also on the precision of the photometric and astrometric algorithms. The following set of criteria has been chosen:
• Point spread function deformation measured by the Moffat function fit (the β parameter, see Figure 2a; a fitting sketch follows at the end of this section).
• Object position error expressed through the object center of mass (in pixels, see Figure 2b).
• Object flux error. This flux is defined as the sum of the image function over the object expressed as sensor irradiation, with the background value removed (as a percentage, see Figure 2c).

Fig. 2: Influence of lossy compression on a) the Moffat parameter deformation, b) the object position error, and c) the change of the object flux with the compression ratio

The IRAF software package has been chosen for analyzing the images [3]. Two objects with different brightness were selected to demonstrate the results. These objects have different FWHM, and they are therefore not equally sensitive to damage by lossy compression methods.
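The first of these criteria can be sketched as a least-squares fit of a circular Moffat profile to an object; the β parameter is then compared before and after compression. The sketch below is a hedged Python illustration with assumed names and start values, not the evaluation code actually used.

import numpy as np
from scipy.optimize import curve_fit

def moffat(r, amplitude, alpha, beta):
    # circular Moffat profile I(r) = A * (1 + (r/alpha)^2)^(-beta)
    return amplitude * (1.0 + (r / alpha) ** 2) ** (-beta)

def fit_beta(image, x0, y0):
    # radial distances of all pixels from the object centre (x0, y0);
    # the background is assumed to be already subtracted
    yy, xx = np.indices(image.shape)
    r = np.hypot(xx - x0, yy - y0).ravel()
    popt, _ = curve_fit(moffat, r, image.ravel(), p0=(image.max(), 2.0, 2.5))
    return popt[2]   # the beta parameter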
4 Conclusion

The compression and processing of astronomical images are different tasks from the classical ones known from multimedia technology. It is not possible to use a setup optimized for human vision. The influence of the lossy compression standard JPEG2000 on the distortion of the object profile has been verified in this paper. The profile is closely related to the point spread function of the imaging system. It can be said that the influence of lossy compression is more significant for faint objects with FWHM (equal object area) not exceeding a few pixels (image m7-300ff.fits). Special quality criteria and an acceptable distortion level are necessary for defining the application of a lossy compression standard.

Acknowledgement

This work has been supported by grant No. P102/10/1320 "Research and modeling of advanced methods of image quality evaluation" of the Grant Agency of the Czech Republic and by research project MSM 6840770014 "Research of perspective information and communication technologies" of MSMT of the Czech Republic.

References

[1] BOOTES, Burst Observer and Optical Transient Exploring System, 2011. [online] http://www.laeff.esa.es/bootes/ing/index.html
[2] Fliegel, K., Klíma, M., Páta, P.: New open source image database for testing and optimization of image processing algorithms, in Optics, Photonics, and Digital Technologies for Multimedia Applications, SPIE Proceedings, Vol. 7723, 2010.
[3] IRAF, Image Reduction and Analysis Facility, 2011. [online] http://www.iraf.noao.edu
[4] ISO/IEC 15444-1:2000: JPEG2000 image coding system (core coding system), 2000. [online] http://www.jpeg.org/fcd15444-1.htm
[5] Páta, P., Hanzlík, P., Schindler, J., Vítek, S.: Influence of lossy compression techniques on processing precision of astronomical images, 6th IEEE ISSPIT Conference, Athens, Greece, 2005.
[6] Řeřábek, M., Páta, P.: Astronomical image compression techniques based on ACC and KLT coder, Proceedings of the 7th INTEGRAL/BART Workshop, IBWS2010, Acta Polytechnica, 2011.
[7] Schindler, J., Páta, P., Klíma, M., Páta, P.: Advanced processing of images obtained from wide-field astronomical optical systems, Proceedings of the 7th INTEGRAL/BART Workshop, IBWS2010, Acta Polytechnica, 2011.
[8] Starck, J. L., Murtagh, F.: Astronomical Image and Data Analysis, Springer, 2002.

Petr Páta, Department of Radioengineering, Faculty of Electrical Engineering, Czech Technical University in Prague, Technická 2, Prague, Czech Republic

Acta Polytechnica Vol. 52 No. 2/2012

An optimal procedure for determining the height parameters of fracture surfaces

T. Ficker

Abstract
This paper presents an attempt to find an optimal procedure for determining the height parameters of fracture surfaces. This is a useful task that may significantly increase the reliability of topographic analyses of solids. The paper focuses on seeking an optimum number of measuring sites to ensure sufficient reliability of the resulting height parameters determined by the confocal technique. The statistical tests show that the number may be close to 25 measuring sites.

Keywords: profile analysis, 3D profile parameters, fracture surfaces, cement-based materials, confocal microscopy.

1 Introduction

In fracture mechanics there has been long-term interest in the topographic properties of fracture surfaces. It is believed that fracture surfaces bear valuable information not only on the structure but also on the mechanical properties of solids. Software analysis of surface topography requires digital replicas of the surfaces to be available. Among several techniques available for this purpose, confocal microscopy is very convenient, since it produces three-dimensional replicas without any contact with the surface, i.e. without any mechanical damage to the specimens. Confocal replicas are reconstructed from a series of horizontal sections, which are in fact digital two-dimensional microscopic images. As soon as the digital three-dimensional surface relief is formed, software analysis of the surface can begin, and it can reveal various useful surface quantities, including the 3D-profile and 3D-roughness parameters, which are often employed for classifying surface height irregularities [1–4]. These parameters are closely related to structural parameters such as porosity [5], or to mechanical properties, including compressive strength [6]. The 3D-profile parameters H_a, H_q are computed by means of the reconstructed surface profile f(x, y) within the plane rectangle L × M:

$H_a = \frac{1}{L \cdot M} \iint_{(L \times M)} |f(x, y)| \, \mathrm{d}x \, \mathrm{d}y$  (1)

$H_q = \sqrt{\frac{1}{L \cdot M} \iint_{(L \times M)} [f(x, y)]^2 \, \mathrm{d}x \, \mathrm{d}y}$  (2)

These parameters were tested previously [1,5,6], and were shown to be reliable indicators of both the porosity and the compressive strength of hydrated cement materials. However, the determination of the parameters H_a, H_q involves certain particularities that should be borne in mind. The first problem is that we do not know the optimal number of surface sites to be visited with the confocal microscope in order to obtain reliable profile parameter values. To resolve this problem, we performed a large series of measurements on a particular fracture surface and used statistical considerations to infer a number of site measurements sufficient to ensure precise results. This paper reports on the whole procedure.
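On a sampled profile, the integrals in Eqs. (1) and (2) reduce to simple means. A minimal Python sketch, assuming f is a confocal height map on a uniform grid, already referred to its mean plane:

import numpy as np

def height_parameters(f):
    # f: 2D array of surface heights over the rectangle L x M
    ha = np.mean(np.abs(f))         # Eq. (1), arithmetic mean height
    hq = np.sqrt(np.mean(f ** 2))   # Eq. (2), root-mean-square height
    return ha, hq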
among several techniques available for this purpose, confocal microscopy is very convenient, since it produces three-dimensional replicas without any contact with the surface, i.e. without any mechanical damage to the specimens. confocal replicas are reconstructed from a series of horizontal sections, which are in fact digital two-dimensional microscopic images. as soon as the digital three-dimensional surface relief is formed, software analysis of the surface can begin, and it can reveal various useful surface quantities, including the 3d-profile and 3d-roughness parameters, which are often employed for classifying surface height irregularities [1–4]. these parameters are closely related to structural parameters such as porosity [5], or to mechanical properties, including compressive strength [6]. the 3d-profile parameters ha, hq computed by means of the reconstructed surface profile f(x, y) within the plane rectangle l × m ha = 1 l · m ∫ ∫ (lm) |f(x, y)|dxdy (1) hq = √ 1 l · m ∫ ∫ (lm) [f(x, y)]2dx dy (2) were tested previously [1,5, 6], and were shown to be reliable indicators of both the porosity and the compressive strength of hydrated cement materials. however, the determination of parameters ha, hq is connectedwith certain particularities that should be borne in mind when they are determined. the first problem is that we do not know the optimal number of surface sites to be visited with the confocal microscope in order to obtain reliable profile parameter values. to resolve this problem,weperformeda large series of measurements on a particular fracture surface and used statistical considerations to infer a sufficient number of sitemeasurements to ensure precise results. this paper reports on the whole procedure. 2 experiment the fracture surface of hydrated cement paste (fig. 1) was tested. the test specimen was prepared from ordinary portland cement cem 42,5 i r-sc. the water-to-cement ratio was set to 0.5 and the period of hydration was one year, i.e. the specimen was of high maturity. fig. 1: the surface of a fractured specimen of hydrated cement paste 31 acta polytechnica vol. 52 no. 2/2012 fig. 2: confocal relief of a fractured surface the 3d profiles f(x, y) were created using an olympus lext 3100 confocal microscope. one of these profiles is shown in figure 2. the profiles were formed by the software, which processed a series of optical sections taken at various heights of the fracture surface. approximately 200 image sections were taken (magnification 20×) for each measured surface site, starting from the very bottom of the surface depressions (valleys) and processing to the very top of the surface protrusions (peaks) with a step of 1.28μm. the investigated area l × m = 1280μm×1280μm(1024 pixels×1024 pixels) was chosen in 280 surface sites. these measuring sites consisted of seven groups, i.e. 5, 10, 15, 25, 50, 75 and 100 measuring sites located randomly on the surface. for each group of measurements, parameters ha, hq were computed and their averages were determined. in this way we obtained seven couples of average values, whose statistical reliabilities increased with increasing number of measurements. naturally, the group of 100 measurements yielded the most reliable averages, and they were therefore adopted as reference values h(100)a , h (100) q of high precision close to exact values. 
The averages H_a^(n), H_q^(n) of the remaining groups n = 5, 10, 15, 25, 50, and 75 were then classified according to their percentage deviations from the precise reference values:

$P_a(n) = \frac{|H_a^{(100)} - H_a^{(n)}|}{H_a^{(100)}} \times 100\ (\%)$  (3)

$P_q(n) = \frac{|H_q^{(100)} - H_q^{(n)}|}{H_q^{(100)}} \times 100\ (\%)$  (4)

As the optimal number of measuring sites n, we chose a number which ensured that the percentage deviation would be less than five, i.e. P(n) < 5 %.

3 Results and discussion

Figure 3 shows the two graphs P_a(n) and P_q(n), displaying the behavior of the percentage deviations from the reference average values H_a^(100), H_q^(100). Both graphs clearly indicate that the optimum number of measuring sites is close to n = 25. This number of measuring sites ensures that the percentage deviation is about 5 %, which is a normal laboratory statistical deviation. For larger n > 25, the resulting profile parameters H_a^(n), H_q^(n) would show still lower statistical uncertainty, but at the expense of an enormous measuring and computational effort. On the other hand, results with n < 25 rapidly increase in statistical uncertainty. For example, at n = 5 the percentage deviation reaches a value of almost 25 %, which indicates rather high uncertainty and low reliability.

Fig. 3: Percentage deviations of the average profile parameters from the reference value representing 100 site measurements

4 Conclusion

The statistical tests presented in this paper have shown that the optimum number of measuring surface sites for determining 3D-profile parameters using the confocal microscopic technique is close to 25 sites, in order to ensure sufficient reliability of the numerical results. In practice, this means that the measurements may be performed within a matrix of 5 × 5 surface points uniformly distributed over the tested surface.

Acknowledgement

This work was supported by the Ministry of Education, Youth and Sports of the Czech Republic under contract No. ME 09046 (Kontakt).

References

[1] Ficker, T., Martišek, D., Jennings, H. M.: Roughness of fracture surfaces and compressive strength of hydrated cement pastes, Cem. Concr. Res., 40, 2010, 947–955.
[2] Qu, J., Shih, A. J.: Analytical surface roughness parameters of a theoretical profile consisting of elliptical arcs, Machining Science and Technology, 7, 2003, 281–294.
[3] Hong, E.-S., Lee, J.-S., Lee, I.-M.: Underestimation of roughness in rough rock joints, Int. J. Numer. Anal. Meth. Geomech., 32, 2008, 1385–1403.
[4] Fardin, N., Stephansson, O., Jing, L.: The scale dependence of rock joint surface roughness, International Journal of Rock Mechanics & Mining Sciences, 38, 2001, 659–669.
[5] Ficker, T., Martišek, D., Jennings, H. M.: Roughness and porosity of hydrated cement pastes, Acta Polytechnica, 51, 2011, 7–20.
[6] Ficker, T.: Fracture surfaces of porous materials, Acta Polytechnica, 51, 2011, 21–24.

Prof. RNDr. Tomáš Ficker, DrSc., phone: +420 541 147 661, e-mail: ficker.t@fce.vutbr.cz, Department of Physics, Faculty of Civil Engineering, Brno University of Technology, Veveří 95, 662 37 Brno, Czech Republic

Acta Polytechnica Vol. 52 No. 4/2012

Holding pressure and its influence on quality in PIM technology

Pavel Petera(1)
(1) Department of Engineering Technology, Faculty of Mechanical Engineering, Technical University of Liberec, Studentská 2, Liberec, Czech Republic
Correspondence to: pavel.petera@tul.cz

Abstract
The PIM (powder injection molding) process consists of several steps in which faults can occur. The quality of the part that is produced usually cannot be seen until the end of the process.
It is therefore necessary to find a way to discover the fault earlier in the process. The cause of defects is very often "phase separation" (inhomogeneity in powder distribution), which can also be influenced by the holding pressure. This paper evaluates the powder distribution with a new method based on density measurement. Measurements were made using various holding pressure values.

Keywords: PIM, holding pressure, phase separation.

1 Introduction

Powder injection molding (PIM) technology is a very effective and precise technology, especially for producing small metal and ceramic parts with complex shapes in large-scale production. The technology itself consists of two subcategories, CIM (ceramic injection molding) and MIM (metal injection molding). A clear difference between CIM and MIM is in the use of different feedstocks, where a different powder is used (either metal-based or ceramic-based). The powder is mixed with a polymeric binder and is granulated. The processing is very similar to standard injection molding, but due to the highly abrasive properties of the powder it is necessary to use a suitable machine with abrasion-resistant parts (screw, cylinder, etc.). After the injection-molding stage of the process, it is necessary to remove the binder. There are various ways to remove the polymeric binder, including catalytic debinding, water dissolving or thermal debinding. The last stage of production is sintering, which produces a part with high density and is accompanied by considerable shrinkage. The shrinkage depends on several factors, e.g. the material (the ratio between powder and binder in the feedstock, the type of binder, the material of the powder, grain size, etc.), the processing conditions, the mold, the machine, etc.

The quality is often influenced by a range of factors, but the influence of holding pressure remains poorly described. The aim of our project is to study the influence of holding pressure on the quality of the molded part. This paper describes the influence of holding pressure on the powder distribution in various places in the specimen, characterised by a new method: density measurement.

2 Materials and methods

2.1 Preparation of the specimens

Inmafeed 1008 feedstock, produced by Inmatec (Germany), was selected as the material for the specimens. The exact composition of this material is protected, but it is based on Al2O3 and consists of powder (about 81–90 %) and a binder (about 10–19 %). The powder composition is min. 96 % Al2O3 and up to 4 % inorganic flux additives. It uses two-step debinding (water and thermal). The sintering temperature is 1620 °C. The binder is based on a polyolefin and wax mixture.

A cavity with dimensions 120 × 10 × 4 mm was selected for the specimens, whose density was evaluated in various places. The specimens were produced by powder injection molding technology, with the mold gated from the shorter side. An Arburg Allrounder 270S injection molding machine was used. The processing conditions were set according to the material sheet, with two different holding pressure values (see Table 1).

Table 1: Processing parameters for injection molding of the specimens

processing temperature: 159 °C (feeding zone) to 162 °C (nozzle tip)
mold temperature: 60 °C
injection speed: 80 ccm/s
dose volume: 30 ccm
holding pressure: 300/150 bar
switchover to holding pressure: 11 ccm
cooling time: 10 s
2.2 Evaluation of the density in different places

The decision was taken to measure the average density in 28 different places (4 rows on the shorter side of the sample and 7 rows on the longer side; especially close to the gate and at the end of the part the measuring points were placed more densely, because a greater influence of the holding pressure was expected there). More details are shown in Figure 1. All specimens were conditioned according to EN ISO 291 before the density was measured. In the places where the measurements were to be made, the tested sample was cut into small pieces 2 × 2 mm in dimensions, while the thickness remained constant (4 mm). This means that variations in properties across the thickness were not considered. The Archimedean immersion method was selected as the testing method (according to EN ISO 1183). An A&D GF-300 balance was used for weighing the samples. Due to the high density of the feedstock that was used, the immersion fluid was distilled water. It was necessary to use a surfactant to reduce the high surface tension; the sample then immersed much more easily. The densities were calculated according to the Archimedes principle, and the results are shown in the graphs in Figures 2 and 3.

Figure 1: Specimen with 28 measuring points: on the shorter side, all distances are 2 mm, and on the longer side the points are placed 2/4/8/40/80/114/118 mm from the gate

Figure 2: Average density in each of the 28 points on the specimen, where the holding pressure was 300 bar

Figure 3: Average density in each of the 28 points on the specimen, where the holding pressure was 150 bar

3 Results and discussion

In the case of PIM, the flow properties are significantly different from those in conventional injection molding, because of the smaller amount of polymeric material and the combination with inorganic grains. These inorganic grains have much worse fluidity than a standard polymeric material, and also have an abrasive effect on the machine and on the mold. The effect of the holding pressure is also different: it causes a different coordination of the two components of the melt in different places and, as a result, also different properties, shrinkage, etc. It is then quite difficult to set up the machine, especially the holding pressure.

It can be seen from the graphs, which show the density across the sample, that where there was higher holding pressure the distribution of the density values was more homogeneous. This means that the distribution of the ceramic powder should also be more homogeneous. If we focus on specific places, the density values were also more constant in the middle of the specimens (longitudinal direction) than in the area close to the gate or at the end of the sample. In these places with more constant values, it is likely that a quality surface can be obtained after sintering. It can also be seen that with higher holding pressure the average density is also higher. This indicates a higher content of ceramic particles, which are of higher density than the polymeric binder.
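Since the evaluation itself is simple, a minimal sketch of the density computation (EN ISO 1183, Archimedes principle) may be useful: the density of each cut piece follows from its mass in air and its apparent mass when immersed. All numerical values below are invented for illustration, and the homogeneity measure (coefficient of variation) is our own addition, not a quantity used in the paper.

```python
import numpy as np

RHO_FLUID = 0.9970  # g/cm^3, distilled water near room temperature (assumed)

def archimedes_density(m_air, m_immersed, rho_fluid=RHO_FLUID):
    """Archimedes principle: the buoyant loss of weight equals the mass
    of the displaced fluid, so rho = m_air * rho_f / (m_air - m_imm)."""
    return m_air * rho_fluid / (m_air - m_immersed)

# hypothetical weighings (g) for four of the 28 measuring points
m_air = np.array([0.0482, 0.0479, 0.0485, 0.0478])
m_imm = np.array([0.0302, 0.0298, 0.0305, 0.0297])

rho = archimedes_density(m_air, m_imm)
print("densities [g/cm^3]:", np.round(rho, 3))

# a simple way to compare the homogeneity of two holding pressures:
# the relative scatter of the point densities across the specimen
cv = rho.std(ddof=1) / rho.mean() * 100
print(f"coefficient of variation: {cv:.2f} %")
```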
4 Conclusion

PIM technology is quite sensitive as far as changing the holding pressure is concerned, but the influence on the properties of PIM parts has not yet been described properly. The approach of evaluating the influence of the holding pressure on quality, represented by the powder distribution and quantified by density measurements, is a new way to predict the distribution of ceramic (or metal) powder and binder in an injection molded part produced with a given holding pressure. This could be useful for discovering defects before sintering has been performed, and therefore for avoiding wasted time, material, energy, etc., in the subsequent steps (recycling can still be done in the "green part" stage).

The experimental measurements showed that samples produced with higher holding pressure had a more homogeneous density distribution, and generally also higher density values. This could indicate a greater content of ceramic powder, which has a higher density than the polymeric binder. Also, as anticipated, the densities were slightly more homogeneous in the middle of the specimen than in the area of the gate or at the end of the specimen. The method is quite simple and provides visual results, but its accuracy is limited even when the work is carried out with great precision. It is therefore necessary to continue investigating this topic, to seek new approaches and methods, and to make further measurements.

Acknowledgement

This topic has been solved under the terms of the research program SGS 2822.

Acta Polytechnica Vol. 50 No. 6/2010

Non-mechanical (mezic) type forces in the foundations of quantum mechanics

Č. Šimáně

Abstract

Many authors have attempted to derive the fundamental equations of quantum mechanics from classical hydrodynamics. In the present contribution we presume that the continuous, electrically charged material substance moves simultaneously under the influence of the electric field and at the same time undergoes a diffusion process. This assumption leads to the appearance of non-mechanical (mezic) type forces responsible for inner sources of matter (positive or negative), similar to those whose existence is supposed in relativistic hydrodynamics. We obtained a non-linear differential equation, convertible by linearization to a form coinciding with the Schrödinger equation, as a condition for the establishment of the same steady states with discrete energies.

Keywords: hydrodynamics, quantum mechanics, mezic (non-mechanical) forces.

1 Introduction

The discovery of quantum mechanics signified a revolution in the physical image of our world. However, the price paid for the general acceptance of quantum mechanics was very high. One had to sacrifice many fundamental principles of classical mechanics, among them especially its determinism, and to accept controversial uncertainty relations.
The constant ħ discovered by Planck, together with Einstein's relations E = hν and \( \vec p = \hbar\vec k \) between the corpuscular and wave aspects of matter, were the starting points for the discovery of quantum mechanics by Schrödinger, Louis de Broglie, Born, Heisenberg and Dirac. Many physicists have tried to bridge the profound abyss between classical and quantum mechanics, proposing various models leading to the fundamental equations of quantum mechanics [1]. This work is in some way a continuation of the author's earlier article on this subject [2].

2 From classical to quantum mechanics

The Schrödinger equation for a spinless particle in spherical s states of the hydrogen atom has been chosen to demonstrate that the assumption of the existence of non-mechanical (mezic) forces and of the diffusion of electrically charged continuous electron matter in the field of an electric potential is sufficient to deduce its formal equivalent from classical hydrodynamics. Because one can hardly reconcile a continuous hydrodynamical system with electrons as particles in the planetary model of the atom, the concept of particles is the first that had to be sacrificed. Instead, one has to replace the particles by a cloud of charged electron matter around the nucleus. Further, one has to propose some mechanism which could lead to the existence of discrete, time-independent states of the system.

Inspiration comes from the paper by Nelson [3], who used statistical mechanics to derive equations that led to discrete states in planetary atomic systems. His basic idea was that the electron simultaneously executes two kinds of motion: one of the classical kind, with velocity \( \vec v \) (he refers to it as the flow velocity), in the electrostatic field of the nucleus, and a certain type of Brownian motion resulting in a motion with osmotic velocity \( \vec u \) obeying the diffusion law

\( \mu\vec u = -D\nabla\mu \)   (1a)

which leads to the relations

\( \vec u = -D\,\frac{\nabla\mu}{\mu}, \qquad \nabla\mu = -D^{-1}\mu\vec u, \qquad u^2 = D^2\,\frac{(\nabla\mu)^2}{\mu^2}, \qquad \operatorname{div}(\mu\vec u) = -D\Delta\mu \)   (1b)

Inspired by Nelson, we constructed a heuristic Lagrangean L for the motion of a volume element δV containing mμδV of electron matter (m is the electron rest mass, μ the distribution function normalized to unity, eμ the electric charge density):

\( L = \int_{\delta V} \mu\left( m\,\frac{v^2 + u^2}{2} - e\varphi(\vec x) \right)\mathrm{d}V \)   (2)

This is a sum of two Lagrangeans in which the integrand is the sum of the densities of the kinetic potentials corresponding to the motions with uncorrelated velocities \( \vec v \) and \( \vec u \) (\( \operatorname{rot}\vec v = \operatorname{rot}\vec u = 0 \)) and the density of the potential energy eμφ. For infinitely small δV the Lagrangean is replaceable by

\( L = m\mu\left[ \frac{v^2 + u^2}{2} - \frac{e}{m}(\varphi - \varphi_0) \right]\delta V \)   (3)

From the general formula

\( \frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial L}{\partial \dot q_i} - \frac{\partial L}{\partial q_i} = 0 \)   (4)

we get the equations of motion of the volume element:

\( \frac{\mathrm{d}}{\mathrm{d}t}\left[\mu(\vec v+\vec u)\,\delta V\right] - \nabla\left[\mu\left(\frac{v^2+u^2}{2} - \frac{e}{m}(\varphi-\varphi_0)\right)\right] m\,\delta V = 0 \)   (5)

Here φ0 is for the moment an arbitrary constant potential, which disappears in the potential gradient. The first member on the left side of (5) is the inertial force acting on the accelerated volume element δV. The second member represents the external force densities

\( \left[ -\frac{e}{m}\,\mu\nabla\varphi + \left(\frac{v^2+u^2}{2} - \frac{e}{m}(\varphi-\varphi_0)\right)\nabla\mu \right] m\,\delta V \)   (6)

The first member of (6) is the classical electrical force density

\( -\frac{e}{m}\,\mu\nabla\varphi \)   (7a)

The second member is interesting. If ∇μ is substituted from (1b), this force density becomes

\( \vec u\,D^{-1}\mu\left(\frac{v^2+u^2}{2} - \frac{e}{m}(\varphi-\varphi_0)\right) = \vec u\,q \)   (7b)

a product of the velocity \( \vec u \) and the sum of the densities of the kinetic and electrical potentials, multiplied by D⁻¹.
Setting D = ħ/(2m), with dimension m² s⁻¹, q gets the dimension kg s⁻¹ and may be interpreted as the change of the matter content of a volume element per unit time, i.e. as the density of an internal source of matter. Then the equations of motion can be written explicitly in the form

\( \left[ \mu\frac{\mathrm{d}\vec v}{\mathrm{d}t}\,\delta V + \mu\frac{\mathrm{d}\vec u}{\mathrm{d}t}\,\delta V + (\vec v+\vec u)\frac{\mathrm{d}\mu}{\mathrm{d}t}\,\delta V + \mu(\vec v+\vec u)\frac{\mathrm{d}(\delta V)}{\mathrm{d}t} + \frac{e}{m}\,\mu\nabla\varphi\,\delta V - \vec u\,q\,\delta V \right] m = 0 \)   (8)

The time derivative of δV in the fourth member of (8) must be treated as the sum of the derivatives at \( \vec v \) and \( \vec u \) constant:

\( \mu(\vec v+\vec u)\frac{\mathrm{d}(\delta V)}{\mathrm{d}t} = \mu(\vec v+\vec u)\left[\operatorname{div}\vec v + \operatorname{div}\vec u\right]\delta V = \mu\vec v\operatorname{div}\vec v\,\delta V + \mu\vec u\operatorname{div}\vec u\,\delta V \)   (9)

Then the equations of motion of the unit volume element take the form

\( \mu\frac{\mathrm{d}\vec v}{\mathrm{d}t} + \mu\frac{\mathrm{d}\vec u}{\mathrm{d}t} + \vec v\left[\left(\frac{\partial\mu}{\partial t}\right)_{\vec u} + \vec v\cdot\nabla\mu\right] + \mu\vec v\operatorname{div}\vec v + \vec u\left[\left(\frac{\partial\mu}{\partial t}\right)_{\vec v} + \vec u\cdot\nabla\mu\right] + \mu\vec u\operatorname{div}\vec u + \frac{e}{m}\,\mu\nabla\varphi + \vec u\,q = 0 \)   (10)

or, putting together the third and fourth members and the fifth, sixth and last members in (10),

\( \mu\frac{\mathrm{d}\vec v}{\mathrm{d}t} + \vec v\left[\left(\frac{\partial\mu}{\partial t}\right)_{\vec u} + \operatorname{div}(\mu\vec v)\right] + \frac{e}{m}\,\mu\nabla\varphi + \mu\frac{\mathrm{d}\vec u}{\mathrm{d}t} + \vec u\left[\left(\frac{\partial\mu}{\partial t}\right)_{\vec v} + \operatorname{div}(\mu\vec u)\right] + \vec u\,q = 0 \)   (11)

Here again the partial time derivatives must be taken at \( \vec v \) and \( \vec u \) constant. The sum of the first and the third member in (11) is the equation of motion

\( \mu\frac{\mathrm{d}\vec v}{\mathrm{d}t} + \frac{e}{m}\,\mu\nabla\varphi = 0 \)   (12a)

and the expression in the square brackets in the second member of (11) is the equation of continuity

\( \left(\frac{\partial\mu}{\partial t}\right)_{\vec u} + \operatorname{div}(\mu\vec v) = 0 \)   (13a)

for the motion with velocity \( \vec v \). The remaining members in (11) represent the equation of motion with velocity \( \vec u \), which according to (1) is a function of μ. The last two members in (11) can be put together, so that this equation of motion becomes

\( \mu\frac{\mathrm{d}\vec u}{\mathrm{d}t} + \vec u\left[\left(\frac{\partial\mu}{\partial t}\right)_{\vec v} + \operatorname{div}(\mu\vec u) + q\right] = 0 \)   (12b)

The expression in the square brackets in the second member of (12b) may be interpreted as an equation of continuity with internal sources of matter q expressed by (7b):

\( \left(\frac{\partial\mu}{\partial t}\right)_{\vec v} + \operatorname{div}(\mu\vec u) - D^{-1}\mu\left(\frac{v^2+u^2}{2} - \frac{e}{m}(\varphi-\varphi_0)\right) = 0 \)   (13b)

3 Steady states

From now on we will be interested only in steady, time-independent solutions of the equations of motion and of the continuity equations. Therefore we shall omit all partial time derivatives. Equations (12a) and (13a) concern the classical motion of matter in the electrostatic field of the nucleus, for which the equation of continuity (13a) reduces to its second member, i.e. to

\( \vec v\cdot\nabla\mu + \mu\operatorname{div}\vec v = 0 \)   (14)

The gradient of the density μ may take any value between minus and plus infinity, so that equation (14) can be satisfied only for \( \vec v \) identically equal to zero everywhere. This is of course inconsistent with (12a), unless another electric force density compensates the force density in the electrostatic field of the nucleus. The only electrical charges present that could create the compensating electric field are the charges of the electron cloud. Supposing spherical symmetry of the electrical charge, the total flux of the electrical induction through the surface of a sphere of radius r equals the charge within this sphere:

\( 4\pi r^2 E = -e\,4\pi \int_{\rho=0}^{r} \mu(\rho)\,\rho^2\,\mathrm{d}\rho \)   (15)

Because the charges outside the sphere do not contribute to the intensity, one can extend the integration over the whole space and thus, owing to the normalization of the distribution function, the field intensity is

\( E = -e\,\frac{4\pi}{4\pi r^2} \int_{\rho=0}^{\infty} \mu(\rho)\,\rho^2\,\mathrm{d}\rho = -\frac{e}{r^2} \)   (16)

The resulting electrical intensity from the positive charge of the proton and the negative charge of the electron cloud is

\( E_N + E_e = \frac{e}{r^2} - \frac{e}{r^2} = 0 \)   (17)

Now let us pay attention to the process with osmotic velocity \( \vec u \), to which equations (12b) and (13b) pertain.
Equation (12b) of motion with osmotic velocity has the form

\( \vec u\,\mu\operatorname{div}\vec u + \vec u\left[\operatorname{div}(\mu\vec u) + q\right] = 0 \)   (18)

If the equation of continuity (13b) is valid, then the second member in (18) equals zero and the equation of motion reduces to

\( \vec u\operatorname{div}\vec u = \frac{\mathrm{d}\vec u}{\mathrm{d}t} = 0 \quad \left(\frac{\partial\vec u}{\partial t} = 0\right) \)   (19)

in agreement with the characteristic property of the mezic forces, which do not accelerate the material substance, so that its spatial distribution remains time-independent. After replacing u² in (13b), according to (1b), by D²(∇μ)²/μ² and div(μ\( \vec u \)) by −DΔμ, we get the equation of continuity in the form

\( -D\Delta\mu + D^{-1}\mu\left(\frac{v^2}{2} + \frac{D^2(\nabla\mu)^2}{2\mu^2} - \frac{e}{m}(\varphi-\varphi_0)\right) = 0 \)   (20)

which is then the condition for a time-independent, steady state of the substance in the volume element at a given point of the cloud. Simultaneously, equation (14) must also be satisfied, which, as shown above, is possible only for \( \vec v \) identically equal to zero. Therefore the final condition for a steady, time-independent state has the form

\( -D\Delta\mu + D^{-1}\mu\left(\frac{D^2(\nabla\mu)^2}{2\mu^2} - \frac{e}{m}(\varphi-\varphi_0)\right) = 0 \)   (21)

The solution μ(\( \vec x \), eφ0) of (21) depends on the value of the constant eφ0. If at all points in space the solution fulfils (21) for the same value of eφ0, then equation (21) represents the condition for a steady state of the whole cloud and may be interpreted in the following way: the steady, time-independent state of the cloud is reached when in each of its volume elements the outflow (inflow) of matter through its surface is just compensated by the internal positive (negative) sources of matter evoked by non-mechanical (mezic) type forces.

Substituting μ = R² into the non-linear equation (21), one obtains a linear equation:

\( \left(\frac{\hbar^2}{2m}\,\Delta + e\varphi\right) R = e\varphi_0 R \equiv ER \)   (22)

Equation (22), derived from a hydrodynamic model, is in fact sufficient to bring order into most of the experimental spectroscopic data on discrete energy states. It does not resemble a wave equation. It is asymmetric: on the left side we have an operator, and on the right side a simple algebraic expression. Replacing this expression by an operator of the form −iħ∂/∂t acting on the exponential function

\( \psi = R\exp\left(-\mathrm{i}\,\frac{E}{\hbar}\,t\right) \)   (23)

we obtain the same equation (22), but with operators on both sides:

\( H\psi = -\mathrm{i}\hbar\,\frac{\partial}{\partial t}\psi \)   (24)

where ψ can rightly be declared a wave function.

The author would like to remark that one gets the same results if the velocity \( \vec u \) is taken from the beginning as purely imaginary. Then (11) is obtained as a complex equation, and by separating the real and imaginary parts one immediately gets equations (12a,b) and (13a,b).

Before finishing this chapter, the author would like to mention the historical paper by D. Bohm [4] concerning the so-called hidden variables of quantum theory. Bohm supposed the function ψ = R exp(−iS/ħ) (R and S are real functions of time and coordinates) to be the solution of the Schrödinger equation

\( \mathrm{i}\hbar\,\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\,\nabla^2\psi + V(\vec x)\,\psi \)   (25)

Substituting μ^{1/2} for R, putting ∂S/∂t = E and \( \vec v = \nabla S/m \), he obtained from (25) two equations:

\( \frac{\partial\mu}{\partial t} + \nabla\cdot\left(\mu\,\frac{\nabla S}{m}\right) = 0 \)   (26a)

\( \frac{\partial S}{\partial t} + \frac{(\nabla S)^2}{2m} - e\varphi - \frac{\hbar^2}{4m}\left[\frac{\Delta\mu}{\mu} - \frac{(\nabla\mu)^2}{2\mu^2}\right] = 0 \)   (26b)

Oriented on the planetary atomic model, Bohm interpreted (26b) as a Jacobi–Hamilton equation for the motion of a particle in the field of the classical electric potential and in the field of a quantum potential expressed by the last member on the left side of (26b). However, this equation, multiplied by μ/D, with ∇S/m replaced by \( \vec v \) and (μ/D)∂S/∂t by (∂μ/∂t)_{\( \vec v \)}, and with the use of (1b), can be transformed into equation (13b) with an entirely different interpretation.
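As a quick consistency check of the linearization step above (this verification is ours, not part of the original paper): substituting μ = R² into (21), with D = ħ/(2m) and with the potential term carrying the factor e/m as in (13b), the nonlinear gradient terms cancel identically. Using ∇μ = 2R∇R and Δμ = 2(∇R)² + 2RΔR,

\( -D\Delta\mu + D^{-1}\mu\left(\frac{D^2(\nabla\mu)^2}{2\mu^2} - \frac{e}{m}(\varphi-\varphi_0)\right) = -2D(\nabla R)^2 - 2DR\,\Delta R + 2D(\nabla R)^2 - \frac{e}{mD}\,R^2(\varphi-\varphi_0) = -2DR\,\Delta R - \frac{e}{mD}\,R^2(\varphi-\varphi_0) \)

Setting this to zero, dividing by −2R and inserting D = ħ/(2m) gives

\( \frac{\hbar}{2m}\,\Delta R + \frac{e}{\hbar}\,(\varphi-\varphi_0)R = 0 \quad\Longrightarrow\quad \left(\frac{\hbar^2}{2m}\,\Delta + e\varphi\right)R = e\varphi_0 R \)

which is exactly (22).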
4 Discussion

A necessary condition for the existence of a steady, time-independent spatial density distribution and of fluxes of the electronic substance in an electrically charged material continuum is the validity of the equation of continuity with internal sources or sinks of matter, which compensate the outflows or inflows of matter through the surface of each volume element of the object. The basic presumptions necessary for the derivation of this equation of continuity, formally reducible to a form of the Schrödinger equation, were that:

• the electron in the bound state does not exist as a particle; it forms a cloud of continuous, electrically charged electron substance around the nucleus, bearing the total electrical charge, whose spatial distribution is the same as that of the electron substance;
• there exist two kinds of uncorrelated motions of the continuum, one with flow velocity \( \vec v \) and the other with osmotic velocity \( \vec u \), the latter obeying the diffusion law;
• the continuous electron substance moves in the electric field of the nucleus and in the field of its own electrical charge, and at the same time diffuses against the density gradient;
• inner sources or sinks of matter due to the action of non-mechanical (mezic) type forces exist in the continuous electron substance;
• in the steady, time-independent case, the flow velocity is zero and the intensity of the electrical field of the nucleus is compensated by the field of the electron cloud;
• the possibility of reducing the nonlinear equations of continuity to the linear Schrödinger equation offers a practical means for solving them.

It is worth noting that a continuous distribution of the electron matter around the nucleus was strongly defended by Schrödinger in his disputes with Born [5]. In relativistic hydromechanics, the existence of inner sources of matter is attributed to the action of forces of non-mechanical (mezic) character [6]. Non-relativistic hydromechanics does not reckon with their existence. As such, they have never been observed, because they vanish in steady states, and one can infer their existence only indirectly from the fact that with their aid one can obtain the same steady, time-independent states as from the corresponding Schrödinger equation. Nevertheless, the non-mechanical forces should manifest their existence in non-stationary processes during the so-called jumps from one state to another. The violation of the continuity equation in the steady state by an external perturbation cancels this state and starts the transition from a steady state to a time-dependent state. This transition process may end in the re-establishment of the initial steady state or in a transition to some new state with a different energy. Without an exact mathematical description of the transition process, which is very rapid and which manifests itself as a jump, one can only predict the probability of the transitions to the end states. The situation could change once one is able to follow the transition process theoretically in time, in which the non-mechanical forces will play an important role.

A very serious historical difficulty was mentioned already by Madelung [7] in his attempt to deduce a hydrodynamical model from the Schrödinger equation, namely that one has to assume that the electrically charged volume elements of the same electron cloud do not mutually interact. In the planetary model, this difficulty disappears, as the inner electric field is concentrated in the point-like particle.
We suppose that the inner electrical intensity in the electron cloud just compensates the electrical intensity of the electric field of the nucleus, which allows the flow velocity in the discrete, time-independent state to be obtained equal to zero. We have no idea how to describe the process by which the continuous electron cloud takes on the form and properties of a free particle. The dual wave–particle character, manifested mainly in the vicinity of the rest energy of the free particle and depending on the type of the experiment, might be explained as fluctuations between the continuous and particle states of the electron, due to small perturbations of the continuity equations provoked by weak external fields, by their time dependence, and by the boundary conditions in various types of experiments.

5 Conclusion

The hydrodynamical equations of motion deduced above, which are based on Newtonian mechanics and include non-mechanical (mezic) forces combined with the diffusion of the charged electronic substance in the field of an electric potential, may be considered only in a limited sense equivalent to quantum mechanics based on the discoveries of Schrödinger, Louis de Broglie, Born, Heisenberg and Dirac, which covers a much broader range of experimentally observed quantum phenomena. It is too early to draw general conclusions from the one isolated special case treated above. Nevertheless, one should not omit to pay attention to the possible role of mezic forces in quantum mechanics.

References

[1] Jammer, M.: The Philosophy of Quantum Mechanics, J. Wiley & Sons, New York, 1974.
[2] Šimáně, Č.: Discrete states of continuous electrically charged matter, Concepts of Physics, Vol. V, No. 3, 2008, 499–512.
[3] Nelson, E.: Derivation of the Schrödinger equation from Newtonian mechanics, Phys. Rev. 150, 1966, 1079–1085.
[4] Bohm, D.: A suggested interpretation of the quantum theory in terms of "hidden variables" I, Phys. Rev. 85, 1952, 166–179.
[5] Born, M.: The conceptual situation in physics and the prospects of its future development, Proc. Phys. Soc. A, Vol. LXVI, 1953, 501–513.
[6] Möller, C.: The Theory of Relativity, second edition, Clarendon Press, Oxford, 1972, Russian edition, Moskva, Atomizdat, 1975, pp. 105–108; Votruba, V.: Základy speciální teorie relativity (Foundations of the special theory of relativity), Academia Press, Prague, 1969, pp. 307–313.
[7] Madelung, E.: Quantentheorie in hydrodynamischer Form, Zeitschr. für Physik, 40, 1926, 166.

Prof. Dipl. Ing. Čestmír Šimáně, DrSc.
Phone: +420 284 819 279
E-mail: csimane@centrum.cz
Nuclear Physics Institute AS CR
Prof. Emeritus of the Czech Technical University in Prague
Home address: U Svobodárny 9, 190 00 Praha 9, Czech Republic

Acta Polytechnica Vol. 51 No. 2/2011

Czech participation in the INTEGRAL satellite: a review

R. Hudec, M. Blažek, V. Hudcová

Abstract

The ESA INTEGRAL satellite, launched in October 2002, is the first astrophysical satellite of the European Space Agency (ESA) with Czech participation. The results of the first 7 years of investigations of various scientific targets, e.g. cataclysmic variables, blazars, X-ray sources and GRBs, with the ESA INTEGRAL satellite with Czech participation are briefly presented and discussed.

Keywords: high-energy astrophysics, high-energy satellites.

1 Introduction

The ESA INTEGRAL project was the first ESA project in space astronomy with official Czech participation, based on a collaboration agreement between ESA and the Czech Republic, i.e. prior to full membership of the Czech Republic in ESA.
The INTEGRAL (International Gamma-Ray Astrophysics Laboratory) satellite has now been more than 7 years in orbit, so some general conclusions may be drawn at this point. There are four co-aligned instruments on board INTEGRAL: (1) the IBIS gamma-ray imager (15 keV–10 MeV, fully coded field of view (FOV) 8.3 × 8 deg, 12 arcmin FWHM), (2) the SPI gamma-ray spectrometer (12 keV–8 MeV, fully coded FOV 16 × 16 deg), (3) the JEM-X X-ray monitor (3–35 keV, fully illuminated FOV diameter 4.8 deg), and (4) the OMC optical monitoring camera (Johnson V filter, FOV 5 × 5 deg) (Winkler et al. 2003). These experiments allow simultaneous observation in the optical, medium X-ray, hard X-ray and gamma spectral regions (or at least a suitable upper limit) for each object, assuming that it is inside the field of view. The basic modes of the observations are as follows: (a) regular (weekly) galactic plane scans (GPS) (−14 deg < bII < +14 deg), (b) pointed observations (AO), (c) targets of opportunity (ToO). In this paper we deal with examples of observations and analyses of INTEGRAL data with our participation, focusing on two categories of objects, namely cataclysmic variables (CVs) and blazars.

2 Czech involvement in the INTEGRAL project

Czech involvement in the ESA INTEGRAL project started in 1996 by inviting the first author of this paper to the OMC and ISDC consortia, based on a collaboration agreement between ESA and the Czech Republic. Our participation then focused on the ISDC and the OMC. For the OMC (Optical Monitoring Camera), the participation focused on various software packages, such as OMC PS (OMC Pointing Software) for the INTEGRAL ISOC, and also on the design, development and operation of the OMC TD (Test Device), a ground-based camera with output analogous (pixel size 18 arcsec) to the real OMC. For the ISDC (INTEGRAL Science and Data Center), located in Versoix, Switzerland, the main part of the contribution consisted of providing manpower, i.e. one person working within the team, with various responsibilities and involvements in the ISDC operations. As for the scientific responsibilities, the first author of this paper was delegated to lead the study of cataclysmic variables, and he also became a member of the working groups on gamma-ray bursts (GRBs) and AGNs. In this paper we very briefly summarize the scientific achievements obtained in these directions. In addition, we participated in developing and operating the dedicated robotic telescopes, considered as the ground-based segment of the project, and in delivering supplementary optical data for satellite triggers. These efforts were carried out mainly by young research fellows and by students. The Czech scientific participation focused on topics allocated by the INTEGRAL bodies, mostly cataclysmic variables, but also blazars and some other objects such as gamma-ray bursts (GRBs).

3 INTEGRAL and cataclysmic variables

The field of cataclysmic variables (CVs) and related objects was delegated to the responsibility of the first author of this paper. The results of the hard X-ray detections of these binary galactic objects are surprising.

Fig. 1: Sub-windowing of an OMC CCD frame by OMC PS (Pointing Software). The generated sub-windows cover images of astrometric and photometric stars, and also the positions of various objects of interest from catalogues. Up to 100 sub-windows can be created per frame

Fig. 2: Selection of objects for OMC photometric shots by OMC PS (Pointing Software)
Fig. 3: Left: the OMC onboard camera. Right: the OMC Test Device operated in Ondřejov prior to the launch of INTEGRAL

Fig. 4: Image from the OMC Test Device operated in Ondřejov prior to the launch of INTEGRAL. The pixel size of the CCD camera was identical to that of the real OMC (17 arcsec)

Fig. 5: The Czech INTEGRAL team at the 2nd IBWS INTEGRAL/BART workshop held in Kostelní Střímelice in October 2003

The soft X-ray emission of the group was already known in advance, but the hard X-ray extension to (in some cases) more than 80 keV was a new discovery. These findings even led to considerations that CVs may represent a contribution to the galactic X-ray background. So far, 32 cataclysmic variables (CVs) have been detected by the INTEGRAL IBIS gamma-ray telescope (surprisingly, this is more than had been expected before launch, and represents almost 10 percent of the INTEGRAL detections). 22 CVs were seen by IBIS and found by the IBIS survey team (Barlow et al. 2006, Bird et al. 2007, Galis 2008), based on the correlation of the IBIS data with the Downes CV catalogue (Downes et al. 2001). Four sources are CV candidates revealed by optical spectroscopy of IGR sources (Masetti et al. 2006), i.e. new CVs, not in the Downes catalogue. They are mainly magnetic systems: 22 are confirmed or probable IPs, 4 are probable magnetic CVs, 3 are polars, 2 are dwarf novae, and 1 is unknown. The vast majority have an orbital period Porb > 3 hr, i.e. above the period gap (only one has Porb < 3 hr), and 5 objects are long-period systems with Porb > 7 hr.

The long lifetime of the satellite allows long-term variability studies, albeit limited by the observation sampling. At least in some cases, the hard X-ray fluxes of CVs seen by INTEGRAL exhibit time variations, very probably related to activity/inactivity states of the objects. The spectra of the CVs observed by IBIS are similar in most cases. The power-law or thermal bremsstrahlung models compare well with the previous high-energy spectral fits (de Martino et al. 2004, Suleimanov et al. 2005, Barlow et al. 2006).

Fig. 6: The IBIS gamma-ray light curve for the CV V1223 Sgr. The fluxes, especially in the (15–25) keV and (25–40) keV bands, are long-term variables with a significant drop around MJD 53650. Optical variations are correlated with the changes in the (15–25) keV, (25–40) keV and (40–60) keV spectral bands with correlation coefficients 0.81, 0.82 and 0.89, respectively. The fluxes from INTEGRAL/JEM-X were persistent within their errors in the monitored time period. Right: the OMC optical (V band) light curve for V1223 Sgr

Another surprise is the fact that while the group of IPs represents only ∼ 2 percent of the catalogued CVs, it nevertheless dominates the group of CVs detected by IBIS. More such detections and new identifications can therefore be expected, as confirmed by our search for IPs in the IBIS data, which provided 6 new detections (Galis et al. 2008). Many CVs covered by the core program (CP) remain unobservable by IBIS because of the short exposure time, but new CVs have been discovered. IBIS tends to detect IPs and asynchronous polars: in hard X-rays, these objects seem to be more luminous (up to a factor of 10) than synchronous polars. Detection of CVs by IBIS typically requires 150–250 ksec of exposure time or more, but some of them remained invisible even after 500 ksec. This can, however, at least in some cases, be related to the activity state of the sources: the hard X-ray activity is temporary or variable.
For short-term variations, there is an indication of a hard X-ray flare in a CV system, namely V1223 Sgr, seen by IBIS (a flare lasting for ∼ 3.5 hr during revolution 61 (MJD 52743), with a peak flux ∼ 3 times the average (Barlow et al. 2006)). Such flares were already seen in the optical in the past by a ground-based instrument (with a duration of several hours) (van Amerongen & van Paradijs 1989). This confirms the importance of an OMC-like instrument (preferably with the same FOV as the gamma-ray telescope) on board gamma-ray satellites: even with a V limiting magnitude of 15, it can provide valuable simultaneous optical data for the gamma-ray observations. Analogous flares are also known for other IPs in the optical, but not in hard X-rays. An example is TV Col (Hudec et al. 2005), where 12 optical flares have been observed so far, five of them on archival plates from the Bamberg Observatory, and the remaining ones by other observers. TV Col is an IP and the optical counterpart of the X-ray source 2A 0526−328 (Cooke et al. 1978). This was the first CV discovered through its X-ray emission, newly confirmed as an INTEGRAL source. The physics behind the outbursts in IPs is either a disk instability or an increase in the mass transfer from the secondary.

4 INTEGRAL and blazars

Blazars are among the most important and also most optically violently variable extragalactic high-energy objects. Below we list a few examples of blazars analyzed with INTEGRAL observations. We focus on objects found by data mining in the INTEGRAL archive for faint and hidden objects. For more details on blazar analyses with INTEGRAL, see Hudec et al. (2007). In addition, successful blazar observations were performed, mostly in the ToO regime.

Fig. 7: The list of blazars observed by the INTEGRAL satellite in hard X-rays (IBIS ISGRI)

Fig. 8: Left: IBIS gamma-ray light curve of 1ES 1959+650. This blazar is visible in the INTEGRAL IBIS gamma-ray imager only in the data set corresponding to the optical flare. Right: optical light curve of the blazar 1ES 1959+650 (Tuorla Observatory blazar monitoring program). In hard X-rays (IBIS), the blazar is visible only during the large optical flare in the 2nd half of 2006

Fig. 9: The light curves of selected eclipsing binaries obtained by the OMC camera

The large collaboration led by E. Pian can serve as an example (Pian et al. 2007). We have developed procedures to access faint blazars in the IBIS database. The blazar 1ES 1959+650 can serve as an example: it is a gamma-ray loud variable object visible by IBIS in 2006 only, and invisible in the total mosaics and/or other periods. The optical light curve available for this blazar confirms the relation between the active gamma-ray and active optical states.

5 OMC optical camera

The small optical camera (OMC) on board the INTEGRAL satellite delivered a great deal of valuable simultaneous optical data for observations of gamma-ray burst sources. However, this is the case only for some triggers, as the field of view (FOV) of the OMC is much smaller than the FOV of the mainly used instrument on INTEGRAL, namely IBIS (5 vs. 8 degrees). On the other hand, the OMC proved to be an efficient tool for optical objects without gamma-ray counterparts, such as eclipsing binaries. For these objects, the uninterrupted nature (no day/night cycles) of space-based observations was an advantage for studying the light curves and for determining the times of the minima.
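As an aside on the last point, one elementary way to extract a time of minimum from a sampled eclipsing-binary light curve is a local parabola fit around the faintest points. This sketch is ours and purely illustrative; it does not reproduce the OMC pipeline or the classical Kwee–van Woerden method, and the photometry below is invented.

```python
import numpy as np

def time_of_minimum(t, mag, half_width=5):
    """Fit a parabola to the samples around the deepest point of an
    eclipse (magnitudes: larger = fainter) and return the vertex time."""
    i = int(np.argmax(mag))                       # faintest sample
    lo, hi = max(0, i - half_width), min(len(t), i + half_width + 1)
    a, b, _ = np.polyfit(t[lo:hi], mag[lo:hi], 2)
    return -b / (2.0 * a)                         # vertex of a*t^2 + b*t + c

# invented photometry: one eclipse centred near t = 0.30 d
t = np.linspace(0.0, 0.6, 120)
mag = 12.0 + 0.8 * np.exp(-((t - 0.30) / 0.02) ** 2)
mag += np.random.default_rng(2).normal(0.0, 0.01, t.size)

print(f"estimated time of minimum: {time_of_minimum(t, mag):.4f} d")
```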
Fig. 10: We observed the longest and brightest outburst of the (most likely) neutron-star low-mass X-ray binary SAX J1810.8−2609 observed by INTEGRAL so far. During the observation period spanning from Sept. 12 to Sept. 21, 2007, the binary system SAX J1810.8−2609 became increasingly brighter. It even emitted a type I X-ray burst, whose flux exceeded that of the Crab nebula (IBIS image)

Fig. 11: Left: the light curve of the type I X-ray burst observed by INTEGRAL/JEM-X. The time of the peak corresponds to 2007-09-24T19:53:06, when SAX J1810.8−2609 reached a flux of 1.3 ± 0.2 Crab (3–10 keV; shown in black, upper light curve) and 1.1 ± 0.3 Crab (10–20 keV; shown in red, bottom light curve). Right: RXTE light curve of the object, with the INTEGRAL observation indicated by a dot

Fig. 12: The "Labor Day" GRB 030501 allocated to our responsibility. Left: the GRB image by SPI; right: the obtained light curve

6 GRBs and other objects

Additional results have also been obtained with our participation for other types of high-energy sources, the two most important being briefly mentioned below. These results are illustrated in Figures 11 and 12, one for a neutron-star low-mass X-ray binary, the other for a gamma-ray burst (GRB).

Fig. 13: An example of a study of faint sources in IBIS data and noise reduction: a comparison of visualization methods. XY Ari is an example of a newly detected CV (IP) in the IBIS data

7 Conclusion

It is obvious that the INTEGRAL satellite is an effective tool for analyzing CVs and blazars. So far, 21 blazars, 32 CVs and 3 symbiotics have been detected, with the number increasing with time. The successful observations of CVs by INTEGRAL provide proof that CVs can be successfully detected and observed in hard X-rays with INTEGRAL (for most CVs these are considerably harder passbands than were possible previously). These results show that more CVs (and in harder passbands) will be detectable with increasing integration time. There is also an increasing probability of detecting objects in outbursts, high and low states, etc. Simultaneous hard X-ray and optical monitoring of CVs and blazars (or at least suitable upper limits) can provide valuable inputs for a better understanding of the physical processes that are involved.

Acknowledgement

We acknowledge ESA PECS project C98023, grants 205/08/1207 and 102/09/0997 provided by the Grant Agency of the Czech Republic, and MSMT Kontakt project ME09027.

References

[1] Barlow, E. J., Knigge, C., Bird, A. J., et al.: MNRAS, 2006, 372, 224.
[2] Bianchini, A., Sabbadin, F.: IBVS, 1985, 2751, 1.
[3] Bird, A. J., Malizia, A., Bazzano, A., et al.: ApJ Suppl. S., 2007, 170, 175.
[4] Cooke, B. A., Ricketts, M. J., Maccacaro, T., et al.: MNRAS, 1978, 182, 489.
[5] de Martino, D., Matt, G., Belloni, T., et al.: A&A, 2004, 415, 1009.
[6] Downes, R. A., Webbink, R. F., Shara, M. M., et al.: PASP, 2001, 113, 764.
[7] Garnavich, P., Szkody, P.: PASP, 1988, 100, 1522.
[8] Hudec, R.: BAIC, 1981, 32, 93.
[9] Hudec, R., Šimon, V., Skalický, J.: The Astrophysics of Cataclysmic Variables and Related Objects, Proc. of ASP Conf., 2005, Vol. 330. San Francisco: ASP, p. 405.
[10] Hudec, R., Šimon, V., Munz, F., Galis, R., Štrobl, J.: INTEGRAL results on cataclysmic variables and related objects, presented at the INTEGRAL Science Workshop, Sardinia, Oct 2007, http://projects.iasf-roma.inaf.it/integral/integral5thanniversarypresentations.asp
[11] Ishida, M., Sakao, T., Makishima, K., et al.: MNRAS, 1992, 254, 647.
[12] King, A. R., Ricketts, M. J., Warwick, R. S.: MNRAS, 1979, 187, 77.
[13] Masetti, N., Morelli, L., Palazzi, E., et al.: A&A, 2006, 459, 21.
[14] Meintjes, P. J., Raubenheimer, B. C., de Jager, O. C., et al.: ApJ, 1992, 401, 325.
[15] Motch, C., Haberl, F.: Proceedings of the Cape Workshop on Magnetic Cataclysmic Variables, San Francisco: ASP, 1995, Vol. 85, p. 109.
[16] Motch, C., et al.: A&A, 1996, 307, 459.
[17] Mürset, U., Jordan, S., Wolff, B.: Supersoft X-ray Sources, LNP, Vol. 472, p. 251.
[18] Patterson, J., Skillman, D. R., Thorstensen, J., et al.: PASP, 1995, 107, 307.
[19] Šimon, V., Hudec, R., Štrobl, J., et al.: The Astrophysics of Cataclysmic Variables and Related Objects, Proc. of ASP Conf., 2005, Vol. 330. San Francisco: ASP, p. 477.
[20] Sokoloski, J. L., Luna, G. J. M., Bird, A. J., et al.: AAS, 2005, 207, 3207.
[21] Suleimanov, V., Revnivtsev, M., Ritter, H.: A&A, 2005, 443, 291.
[22] van Amerongen, S., van Paradijs, J.: A&A, 1989, 219, 195.
[23] Watson, M. G., King, A. R., Osborne, J.: MNRAS, 1985, 212, 917.
[24] Winkler, C., Courvoisier, T. J.-L., Di Cocco, G., et al.: A&A, 2003, 411, L1.
[25] Galis, R.: Proceedings of the 7th INTEGRAL Workshop, Proceedings of Science, 2008.
[26] Hudec, R., et al.: Nuclear Physics B Proceedings Supplements, 2007, Vol. 166, p. 255–257.
[27] Pian, E., et al.: Observations of blazars in outburst with the INTEGRAL satellite, in Triggering Relativistic Jets (eds. William H. Lee, Enrico Ramírez-Ruiz), Revista Mexicana de Astronomía y Astrofísica (Serie de Conferencias), Vol. 27, 2007, contents of supplementary CD, p. 204.

René Hudec
E-mail: rene.hudec@gmail.com
Astronomical Institute, Academy of Sciences of the Czech Republic, CZ-25165 Ondřejov, Czech Republic
Faculty of Electrical Engineering, Czech Technical University in Prague, Technická 2, CZ-16627 Prague, Czech Republic

Martin Blažek
Astronomical Institute, Academy of Sciences of the Czech Republic, CZ-25165 Ondřejov, Czech Republic
Faculty of Electrical Engineering, Czech Technical University in Prague, Technická 2, CZ-16627 Prague, Czech Republic

Věra Hudcová
Astronomical Institute, Academy of Sciences of the Czech Republic, CZ-25165 Ondřejov, Czech Republic

Acta Polytechnica Vol. 50 No. 4/2010

A study of gait and posture with the use of cyclograms

O. Hajný, B. Farkašová

Abstract

Present-day science makes extensive use of simulation. Our work focuses on simulating human gait. Simulations of human gait can be used in prosthetics and therapy, e.g. rehabilitation, optimizing the movements made by sportsmen, evaluating advances in rehabilitation, etc. Methods of AI can also be used for predicting gait movement and for identifying disorders. Our project is about measuring human gait, simulating the musculo-skeletal system in order to study walking, and predicting and quantifying gait with the use of neural networks. The research is being carried out in the biomechanics laboratory at FBE CTU, and is intended for use in clinical practice at the 2nd Faculty of Medicine, Charles University.

Keywords: simulation, human body model, walking, artificial intelligence, gait angles, bilateral cyclograms.

1 Methods

For a study of gait angles, we decided to use methods based on measurements of the geometric properties of bilateral cyclograms (also called angle–angle diagrams). The symmetry measures are simple and physically meaningful, objective, reliable and well suited for a statistical study [1].
Furthermore, the technique is strongly rooted in geometry and the symmetry measures are intuitively understandable [3]. Owing to the cyclicity of gait, cyclograms are closed trajectories generated by simultaneously plotting two (or more) joint variables. In gait studies, the easily identifiable planar hip–knee cyclograms have traditionally received the most attention. In order to quantify the symmetry of human walking, we also obtained and studied cyclograms of the same joint from the two sides of the body [2].

Fig. 1: Illustration of marker movements

1.1 Measuring systems

We used two methods for measuring gait and movement: an infrared (IR) camera with active markers, and a web camera. First, we had to measure human gait to obtain a quantity of data. For this purpose, we used two methods of measuring movement in space. The first method used an IR camera (Fig. 2) with active markers (Lukotronic AS 200), which was available from the external workplace of the joint CTU Department of Biomedical Engineering and Charles University. We placed LED diode markers on the measured person at the following points: malleolus lateralis, epicondylus lateralis, trochanter major and spina iliaca anterior superior. By this method we were able to register the movement in three-dimensional space. The second method was to record a video of human walking, using a web camera. The video was subsequently analyzed in Coach6, version 6.1. In this case, we made our own circular markers, which contrasted with the clothing of the measured person, who was dressed in black. The Coach6 program is an adequate tool for detecting the markers. We chose frames of the video that were usable for our analysis, and we marked the positions of the markers in them, one by one. This video method provided only two-dimensional co-ordinates of the captured markers.

Fig. 2: Lukotronic AS200 IR camera system [6]

1.2 Model

A model of the human body was created in the Matlab version R2008b environment with SimMechanics (a Simulink toolbox) for simulating and modeling mechanical elements and their directions. To create the model of the human body we used the SimMechanics tool blocks. For a practically usable model, a big block was built to form the base of the model. A skeleton was formed by a body block and custom joints. With the help of the joint actuator block, the data acquired by one of the measuring systems was imported into the body model. Using the joint sensor block, data could be exported from the body model (Fig. 3).

Fig. 3: Model of a human body created in the Simulink workspace [5]

The present model was updated with the mass of the body. The user working with the program fills in a small form including information on height, weight and sex. With a subprogram we computed the length and weight of each segment of the subject's body. Several methods can be used for determining the weight of the individual segments. Unfortunately, none of them is absolutely exact, so we have to reckon with a weight error. We decided to calculate the mass of the segments by means of experimentally acquired coefficients. The coefficients are stochastic, and applying them to an average population involves a weight error, because the authors (Zatsiorsky, Bohn, Shan, et al.) carried out their experiments on a different population. We wanted to calculate the weight of the individual body segments. The optimal equation using the experimentally determined coefficients b_0i, b_1i and b_2i takes the form

\( m_i = b_{0i} + b_{1i}\cdot m + b_{2i}\cdot h \)   (1)
where m_i is the weight of an individual segment (kg), m is the total weight of the subject (kg), and h is the height of the subject (cm). Table 1 shows the experimental weight coefficients. Other methods for finding the weight of the segments can also be used, e.g. subaqueous weighing according to the Archimedes principle; however, this method is not applicable in our case. We can compute the loading of the joints and the moments of inertia of the body segments using a similar equation:

\( I_{ti} = b_{0i} + b_{1i}\cdot m + b_{2i}\cdot h \)   (2)

Table 1: Weight coefficients for computing the weight of the segments

segment: b_0i, b_1i, b_2i
foot: −0.8290, 0.0077, 0.0073
shin: −1.5920, 0.0362, 0.0121
femur: −2.6490, 0.1463, 0.0137
hand: −0.1165, 0.0036, 0.0017
forearm: 0.3185, 0.0144, −0.0011
upper arm: 0.2500, 0.0301, −0.0027
head: 1.2960, 0.0170, 0.0143
upper torso: 8.2144, 0.1862, 0.0584
medial torso: 7.1810, 0.2234, −0.0663
lower torso: −7.4980, 0.0976, 0.0490

Our body model was created mainly to calculate the angles in the joints and to use the results in the simulation of human gait. To compute the angles in the two-dimensional system we used equation (3):

\( \cos\varphi = \frac{u_1 v_1 + u_2 v_2}{\sqrt{u_1^2 + u_2^2}\cdot\sqrt{v_1^2 + v_2^2}} \)   (3)

where u_1, u_2, v_1, v_2 are the components of the vectors of the body segments (femur, shin, foot, etc.), each represented by at least two points (markers).

Fig. 4: Figure of the computed angle in the knee

For computing the angles in the three-dimensional system, the model uses the following equation:

\( \cos\varphi = \frac{u_1 v_1 + u_2 v_2 + u_3 v_3}{\sqrt{u_1^2 + u_2^2 + u_3^2}\cdot\sqrt{v_1^2 + v_2^2 + v_3^2}} \)   (4)

Based on this equation, the model is able to calculate the angles in the hip, knee and ankle if there are enough markers. In the Lukotronic AS200 system, the maximum number of markers is 32, but we had a limited edition with 10 markers. With 32 markers we will be able to measure more joints in the body, including the upper extremities, but this is beyond the scope of our project. By means of simulation we can check the accuracy of the data: because we use more than one system for imaging the movements of a point in space, the markers may be confused or other human lapses may occur.

Fig. 5: Animation controlled and computed by the Simulink model [5]

The model uses measured data of a moving point in space to calculate the angles in the joints. To calculate the angle in one joint we need at least three points (markers). This data was acquired using a webcam and was processed in Coach6, version 6.1. The frame rate of the webcam was 26 frames per second.
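Equations (1), (3) and (4) are easy to reproduce outside the Simulink model. The following sketch is ours: the subject data and marker coordinates are invented, and only three of the Table 1 segments are carried along for brevity.

```python
import numpy as np

# regression coefficients (b0, b1, b2) from Table 1 (a subset)
COEFFS = {"foot":  (-0.8290, 0.0077, 0.0073),
          "shin":  (-1.5920, 0.0362, 0.0121),
          "femur": (-2.6490, 0.1463, 0.0137)}

def segment_mass(segment, m_total, height_cm):
    """Eq. (1): m_i = b0 + b1*m + b2*h (kg)."""
    b0, b1, b2 = COEFFS[segment]
    return b0 + b1 * m_total + b2 * height_cm

def joint_angle(u, v):
    """Eqs. (3)/(4): angle between two segment vectors, 2D or 3D."""
    c = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.degrees(np.arccos(np.clip(c, -1.0, 1.0)))

# hypothetical subject and one frame of 3D marker positions [m]
m_total, height = 75.0, 180.0
hip   = np.array([0.00, 0.90, 0.00])
knee  = np.array([0.10, 0.50, 0.02])
ankle = np.array([0.05, 0.10, 0.00])

print(f"femur mass: {segment_mass('femur', m_total, height):.2f} kg")
print(f"knee angle: {joint_angle(hip - knee, ankle - knee):.1f} deg")
```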
2 Results

By means of the model we obtained graphs of the dependency of the change in angle over time in the knee and hip (Fig. 6). This is important for the subsequent computation of the cyclograms. The graph is plotted from the data obtained from the webcam measurements. The graph captures the half step and the full step, because the subject was asked to make a step from the standing position (Fig. 1). It follows from the graph that the angle in the knee changes from 1° (stretched leg) to 78° (relaxed leg), where the values represent the angle between the femur and the shin. The second curve plots the changing angle between the femur and the body.

The cyclograms are obtained very easily: they are graphs plotting the angles in the knee and the hip together. There are various kinds of cyclograms; we can plot e.g. the angles of the knee and the ankle, etc. (Fig. 6). For our case, we used the formula

\( \text{cyclogram} = \frac{\text{angle in knee}}{\text{angle in hip}} \)   (5)

Fig. 6: Computed changes in the angles of the knee and hip over time

We had to decide which kind of cyclogram was applicable for our research. The choice fell on the cyclogram of the knee and the hip, because it is easiest and most accurate to measure the angles in these joints.

Fig. 7: Cyclogram of gait (knee/hip)

3 Gait prediction

We selected an artificial neural network to analyze the cyclograms, because we consider NNs to be very interesting, and we think they will be used extensively in the future [4]. The M-S (musculo-skeletal) body model contains an option for movement prediction, or more accurately inferior limb motion prediction, which is represented by the NN. We used the Matlab AI toolbox for the NN. We created our NN to predict the angles in the knee and hip of the right leg, and we used a backpropagation network training function with 10 input layers (we defined this number according to a calculation of the breaks in the behavior of the angle function) and one output layer. We used a log-sigmoid function as the transfer function of the input layers and a linear transfer function in the output layer. Our neural network learns for 500 generations, because the mean squared error (MSE) of the function predicting the angles in the knee and the hip was smallest at that point.
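The Matlab network described above (10 log-sigmoid units, a linear output and 500 training generations) can be approximated in other environments as well. The sketch below uses scikit-learn; the sliding 10-sample window used to feed the input layer and the toy angle series are our assumptions, not the authors' actual data handling.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# toy knee-angle series over repeated gait cycles [degrees], 26 fps
t = np.linspace(0.0, 10.0, 260)
knee = 39.5 - 38.5 * np.cos(2 * np.pi * t)     # sweeps 1..78 deg

WIN = 10                                        # 10 inputs, as in the paper
X = np.array([knee[i:i + WIN] for i in range(len(knee) - WIN)])
y = knee[WIN:]                                  # next angle to predict

# log-sigmoid hidden layer, linear output; max_iter mirrors the
# 500 training generations mentioned above
net = MLPRegressor(hidden_layer_sizes=(10,), activation="logistic",
                   solver="lbfgs", max_iter=500, random_state=1)
net.fit(X, y)

mse = np.mean((net.predict(X) - y) ** 2)
print(f"training MSE: {mse:.3f} deg^2")
print("next predicted angles [deg]:", np.round(net.predict(X[-3:]), 1))
```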
Fig. 8: Data predicted by the NN

4 Conclusion

We designed a functional user interface in the Matlab GUIDE component in which the user can easily handle all parts of our program. The output of our project is a user interface for work with our application, consisting of an analysis (graphs of measured data and of the change of angles in the knee and hip, bilateral cyclograms), a body model (animation of motion) and a prediction (settings of the NN, graphs of the input vector, the target vector and the output vector). The program that we are pursuing will be ready for use in the biomechanics laboratory for the study of gait. The body model can be modified by changing the weight of the body or of individual segments, from which we can compute the moments of inertia of these segments and the forces in the joints. A more realistic human model could be created, e.g. in CAD. In the next step of our research we would like to produce a hydraulic mechanism that will be controlled by the NN and could help patients in their rehabilitation.

4.1 Future developments

In the future we would like to build on our previous work and develop our system. The next usable function in the model will calculate the load of the joints from the mass of the body. It may be important to study during which movement, and in which part of the movement, the joint is most loaded by the mass. Matlab Simulink offers a connection with the CAD technical graphic program. We would like to make use of this connection and to recreate a real human body in CAD; the animation will be more distinct in this way. It will take a considerable time to predict the angles in the knee and hip, and then to compute the cyclograms. We aim to teach the NN to predict the cyclograms correctly.

Acknowledgement

This work was carried out at FBE CTU in the frame of research program No. MSM 6840770012 "Transdisciplinary Biomedical Engineering Research II" of CTU, sponsored by the Ministry of Education, Youth and Sports of the Czech Republic.

References

[1] Goswami, A.: Kinematic quantification of gait symmetry based on bilateral cyclograms, XIXth Congress of the International Society of Biomechanics (ISB), Dunedin, New Zealand, 2003.
[2] Goswami, A.: New gait parameterization technique by means of cyclogram moments: application to human slope walking, Gait and Posture, 1998, p. 15–26.
[3] Heck, A., Holleman, A.: Walk like a mathematician: an example of authentic education, Proceedings of ICTMT6, New Technologies Publications, 2003, p. 380–387.
[4] Ju Won Lee, Gun Ki Lee: Gait angle prediction for lower limb orthotics and prostheses using an EMG signal and neural networks, International Journal of Control, Automation, and Systems, 2005, p. 152–158.
[5] Jelínek, R.: Tvorba modelu svalově-kosterního systému pro studijní účely (Creation of a musculo-skeletal system model for study purposes), Kladno (Czech Republic), 2009.
[6] Lukotronic [online]. URL: http://www.lukotronic.com/

About the authors

Ondřej Hajný was born in Chomutov on 19. 11. 1987, where he graduated from basic school and then from grammar school. He is now studying at the Faculty of Biomedical Engineering of CTU in Kladno. He is employed as a teaching and research assistant at the Institute of Normal, Pathological and Clinical Physiology, Charles University in Prague.

Barbora Farkašová was born in Ostrava, where she graduated from basic school and then from grammar school. She is now studying at the Faculty of Biomedical Engineering of CTU in Kladno.

Ondřej Hajný, Barbora Farkašová
E-mail: hajnyond@fbmi.cvut.cz, farkabar@fbmi.cvut.cz
Department of Biomedical Technology, Faculty of Biomedical Engineering
Czech Technical University in Prague
Nám. Sítná 3105, 272 01 Kladno, Czech Republic

Acta Polytechnica Vol. 49 No. 1/2009

Simulation and study of natural fire in a wide-framed multipurpose hall with steel roof trusses

D. Pada

Abstract

In this case study, the structural fire safety of unprotected steel roof trusses in a wide-framed multipurpose hall was evaluated according to the natural fire safety concept. The design fires were simulated with FDS in order to determine the temperature development inside the hall. The temperature of the steel was calculated based on the results from the simulation, and the structural analysis was carried out in Robot. It was established that the steel roof trusses could be left unprotected under certain conditions; however, a more violent design fire resulted in failure of the truss.
keywords: natural fire safety concept, case study, multipurpose hall, steel roof truss, design fires, sprinkler system, fire resistance, fire simulation, fds, steel temperature, structural analysis, fem. fig. 1: air photo of the architects’ suggestion for the helsinki fair centre span and the total number of trusses is 17. the free height inside the hall varies from 10 m to 16 m. the total length of the building is just over 170 m and the total width is about 88 m. the height of the steel roof trusses varies between 4.5 m and 7 m, the bottom width of the truss is about 1.3 m and the top width is 3.0 m. the trusses are made up of structural steel hollow sections of varying dimensions with a steel quality of s355. the roof structure will consist of wooden elements with mineral wool insulation supported by the roof trusses. a large part of the exterior walls will be made of glass, which is assumed to break when the temperature reaches 200 °c. the rest of the exterior walls will be made of steel sheets with mineral wool insulation and concrete sandwich structures. the new hall will form one single fire compartment together with an existing hall at the helsinki fair centre. hence the total area of the fire compartment will be in the order of 33000 m2. 2.1 fire resistance requirement this study focuses only on the first part of the essential requirement for the limitation of fire risks according to the construction products directive 89/106/eec, i.e. “the load bearing resistance of the construction can be assumed for a specified period of time” [1]. the required time period in this case study was 60 minutes. 3 design fires two prescribed design fires were used in this study. they were chosen from the project’s performance-based fire safety design report [2], where a rough risk analysis was made and several different fires were considered, and a report by hietaniemi [3]. the choice of the design fires from these reports was based on the possible severity of their effect on the steel roof trusses. 3.1 spectator stand fire the spectator stand fire represents the case when the multipurpose hall is used for e.g. indoor sport events or concerts. as spectator stands can also be placed above floor level, the design fire is closer to the roof than a fire on floor level, hence posing a larger threat to the load-carrying structure. according to hietaniemi [3], the maximum rate of heat release, rhr, of the seat material for a spectator stand fire can be assumed to be 2000 kw/m2, giving a maximum rhr of the spectator stand of 1500 kw/m2, while the total fire load is 510 mj/m2. the studied spectator stand section measured 8.0 m×13.6 m with a height of 3 m. it was placed 5.2 m above floor level in the lower part of the hall, leaving only about 2 m of free height to the bottom chord of the roof truss. the whole section was assumed to be on fire, giving a maximum rhr of 163.2 mw. fig. 3 presents the rhr curve for the spectator stand design fire. 3.2 exhibition stand fire the exhibition stand fire represents one of the main purposes of use, where the fire load can be very high. also this design fire was based on hietaniemi’s report [3], a 10 m×10 m exhibition stand, made of burnable materials, for small motor vehicles, e.g. motorcycles or atv’s, placed on floor level in the lower part of the hall just below a roof truss. the maximum rhr of the design fire was 53 mw, the total fire load was 1720 mj/m2 and the height of the stand was assumed to be 2 m. the rhr curve for the exhibition stand fire is shown in fig. 4. 
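the rhr histories of the two design fires are given in the paper only graphically (figs. 3 and 4). as an illustration of how such a prescribed curve can be parameterized, the sketch below builds a t-squared fire capped at the maximum rhr, with a linear decay once 70 % of the fire load has burned. the growth coefficient, the decay threshold and the decay shape are our assumptions for illustration, not values from the paper or from hietaniemi's report [3].

```python
import numpy as np

def rhr_curve(alpha, q_max, e_total, dt=1.0, decay_frac=0.7):
    """illustrative design-fire rhr history: t-squared growth capped at
    q_max, then linear decay once decay_frac of the fire load has burned.
    alpha [w/s^2], q_max [w], e_total [j]; returns time [s] and rhr [w]."""
    t_list, q_list, released, t = [], [], 0.0, 0.0
    while released < decay_frac * e_total:          # growth + steady phase
        q = min(alpha * t * t, q_max)
        t_list.append(t); q_list.append(q)
        released += q * dt
        t += dt
    n = int(2.0 * (e_total - released) / (q * dt))  # decay triangle burns the rest
    for i in range(1, n + 1):                       # linear decay phase
        t_list.append(t); q_list.append(q * (1.0 - i / n))
        t += dt
    return np.array(t_list), np.array(q_list)

# exhibition stand fire of section 3.2: 53 mw peak and
# 1720 mj/m2 over a 10 m x 10 m stand, i.e. 172 gj in total;
# the growth coefficient (a "fast" t^2 fire) is an assumption
t, q = rhr_curve(alpha=47.0, q_max=53e6, e_total=172e9)
```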
3.3 sprinkler system failure

as the sprinkler system is usually considered to be one of the most effective and reliable active fire protection systems in a building, a total sprinkler system failure was not considered in this study. however, a partial sprinkler system failure, where two nozzles above the design fire were inoperative, was considered in both design fires.

4 fds model creation

4.1 general

the design fires were simulated with fire dynamics simulator, fds, version 5.2, and the input files were made with pyrosim. the whole multipurpose hall was modeled in pyrosim with a cell size ranging from 0.2 m × 0.2 m × 0.2 m to 0.8 m × 0.8 m × 0.8 m; the total number of cells was thereby kept at about 2 million.

4.2 sprinkler system

the model was equipped with an automatic sprinkler system that activated when the temperature of the nozzle reached 74 °c. the effect of the sprinklers on the gas temperature and on the combustion occurring in the gas phase was taken into account in the simulation. the suppressing effect of the sprinklers on the design fire was, however, not taken into account in the two original simulations, as there was no way of determining how large the effect could be without carrying out real fire tests. in a second, additional simulation of the spectator stand fire, the effect was taken into account according to a method described in hietaniemi's report [3]. according to this method, the rhr is only allowed to double from the value it has when the sprinkler system is activated (a short sketch of this capping rule is given after this section). as the model was so large, the time when 20 nozzles had been activated in the original simulation, i.e. approximately 7 minutes into the simulation, was used as the sprinkler activation point. at that point the rhr was about 28 mw, and it was hence allowed to grow to 56 mw.

4.3 smoke exhaust system

the model was also equipped with a smoke exhaust system. the hall was divided into smoke sections of about 2400 m2 each, assuming that the system could be active in only two sections at the same time. the time at which the smoke exhaust system was activated was assumed to be 400 s in the spectator stand fire and 600 s in the exhibition stand fire. these times were approximated based on simulation tests of the sprinkler activation times, assuming that the smoke exhaust system would be activated roughly at the same time as the sprinklers.

4.4 data measured

the most important data measured during the fire simulation, from the point of view of structural fire safety design, was the adiabatic surface temperature every 5 m along the bottom chord of the steel roof trusses situated above and near to the fire. the adiabatic surface temperature is the temperature that the bottom chord "sees", and is the quantity that is representative of the heat flux to the solid surface [4]. this temperature was used to calculate the temperature of the steel cross section. the gas temperature of the building was also measured at several different heights and points to get a picture of the total temperature development in the building as a function of time.
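the capping rule of section 4.2 can be written down directly; a minimal sketch, assuming the unsuppressed rhr history and the activation time are already known:

```python
import numpy as np

def apply_sprinkler_cap(t, q_free, t_act):
    """cap an rhr history at twice its value at sprinkler activation,
    following the method of hietaniemi [3] described in section 4.2."""
    q_act = np.interp(t_act, t, q_free)        # rhr when the sprinklers trigger
    cap = 2.0 * q_act                          # the doubling rule
    return np.where(t < t_act, q_free, np.minimum(q_free, cap))

# in the paper, activation was taken at ~7 min, where the rhr was about
# 28 mw, so the capped curve peaked at 56 mw instead of 163.2 mw;
# the t^2 growth curve below is only an illustrative stand-in
t = np.arange(0.0, 1400.0)
q_free = np.minimum(47.0 * t**2, 163.2e6)      # illustrative growth [w]
q_capped = apply_sprinkler_cap(t, q_free, t_act=7 * 60.0)
```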
5 simulation of fire

5.1 spectator stand fire 1

the original spectator stand fire simulation was ended at 1400 s, as the fire only lasted 1380 s. the rhr reached a maximum value of 167 mw during the simulation, i.e. very close to the intended value of 163.2 mw. the first sprinkler nozzle activated a bit sooner than expected, already at 308 s, being one of the sprinklers above the fire. in total, 251 out of the 310 functional sprinkler nozzles in the model were activated during the simulation. the temperature inside the hall remained quite low in general, except of course over and near to the fire. close to the roof the temperature reached about 65 °c, while at 5 m above floor level it remained just above 20 °c. the measured adiabatic surface temperature of the bottom chord just above the fire is plotted in fig. 5, and this was also the measurement used to calculate the temperature of the steel.

fig. 5: adiabatic surface temperature [°c] of the bottom chord above spectator stand fire 1

5.2 spectator stand fire 2

in the additional simulation of the spectator stand fire, where the suppressing effect of the sprinklers on the design fire was taken into account, the rhr reached a maximum value of 56 mw. as the sprinklers were removed from the model in order to speed up the simulation, the only measured data taken into consideration was the adiabatic surface temperature of the bottom chord just above the fire. this temperature is plotted in fig. 6.

fig. 6: adiabatic surface temperature [°c] of the bottom chord above spectator stand fire 2

5.3 exhibition stand fire

the exhibition stand fire simulation was ended at 3670 s, as there was no need to study the fire situation beyond one hour. the maximum rhr measured during the simulation was just over 53 mw, i.e. almost exactly the intended value. the first sprinkler nozzle activated a bit later than expected, at 668 s. in total, only 148 out of the 310 functional sprinkler nozzles were activated. the temperature close to the roof reached circa 60 °c, while at 5 m above floor level it increased only a few degrees above the original temperature of 20 °c, again except close to and over the fire. the adiabatic surface temperature of the bottom chord above the fire is plotted in fig. 7.

fig. 7: adiabatic surface temperature [°c] of the bottom chord above the exhibition stand fire

6 fire safety design of the steel roof truss

the temperature development of the unprotected steel members just above the fire was calculated according to eurocode 1 and 3. the temperature was assumed to be vertically equivalent.

6.1 steel temperature

in the original spectator stand fire, the maximum temperature of the bottom chord with a wall thickness of 10 mm was established to be 727 °c, whereas it was 897 °c for the diagonals with a wall thickness of 5 mm. in the additional simulation of the spectator stand fire, the temperature of the bottom chord reached 614 °c, whereas the diagonals reached a temperature of 661 °c. in the case of the exhibition stand fire, the maximum temperature of the bottom chord was 387 °c and the maximum temperature of the diagonals was 396 °c, i.e. almost the same temperature was reached in all steel sections, independent of wall thickness.
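the eurocode heating calculation behind these numbers is the standard lumped-capacitance step rule for unprotected steel, $\Delta\theta_a = k_{sh}\,\frac{A_m/V}{c_a\,\rho_a}\,\dot h_{net}\,\Delta t$, driven here by the adiabatic surface temperature. a minimal sketch of that integration follows; the section factors, emissivity, convection coefficient and the synthetic temperature history are our illustrative assumptions, not the paper's inputs.

```python
import numpy as np

SIGMA = 5.67e-8          # stefan-boltzmann constant [w/(m2 k4)]
RHO_A = 7850.0           # steel density [kg/m3]

def steel_temperature(t, theta_ast, am_v, eps=0.7, h_c=25.0, k_sh=1.0):
    """en 1993-1-2 style step integration for unprotected steel, driven by
    an adiabatic surface temperature history theta_ast [deg c]; am_v is the
    section factor a_m/v [1/m]. c_a is kept constant for brevity, although
    the eurocode gives it as temperature dependent."""
    theta = np.empty_like(theta_ast)
    theta[0] = theta_ast[0]
    c_a = 600.0          # simplified constant specific heat [j/(kg k)]
    for i in range(1, len(t)):
        dt = t[i] - t[i - 1]
        t_f = theta_ast[i] + 273.15      # "fire" side [k]
        t_s = theta[i - 1] + 273.15      # steel [k]
        h_net = h_c * (t_f - t_s) + eps * SIGMA * (t_f**4 - t_s**4)
        theta[i] = theta[i - 1] + k_sh * am_v / (c_a * RHO_A) * h_net * dt
    return theta

# thinner walls heat faster: the 5 mm diagonals have roughly twice the
# section factor of the 10 mm bottom chord, matching the ordering of the
# reported maxima (897 deg c vs 727 deg c in spectator stand fire 1)
t = np.arange(0.0, 3600.0, 5.0)
theta_ast = 20.0 + 750.0 * (1.0 - np.exp(-t / 900.0))   # made-up ast history
chord = steel_temperature(t, theta_ast, am_v=1 / 0.010)
diagonal = steel_temperature(t, theta_ast, am_v=1 / 0.005)
```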
6.2 structural analysis

an fem model of the roof truss was made in robot millennium 21. in the model, the truss was subjected to snow load, the self-weight of the roof structure, the self-weight of equipment such as lighting and ventilation ducts hung from the bottom chord of the truss, and of course the self-weight of the truss itself. the temperature of the truss was assumed to be uniform in order to simplify the calculations and to avoid having to take the effect of heat conduction inside the roof truss into consideration. the effective yield strength and the modulus of elasticity of the steel were changed to correspond to the values at the different elevated temperatures. with the help of the fem model, the critical temperature of the roof truss could be established to be 590 °c. at this temperature the highest degree of utilization was 0.94 and took place in the diagonals in compression closest to the ends of the truss. the deflection was established to be 400 mm in the middle of the truss, without taking the pre-camber of 100 mm into consideration. hence the actual deflection would be in the order of 300 mm, which equals the length of the truss, 78 m, divided by 260.
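the deflection figure can be spot-checked directly (a sketch of the arithmetic only, added here):

```latex
\delta_{\mathrm{net}} = 400\ \mathrm{mm} - 100\ \mathrm{mm} = 300\ \mathrm{mm},
\qquad
\frac{L}{\delta_{\mathrm{net}}} = \frac{78\,000\ \mathrm{mm}}{300\ \mathrm{mm}} = 260 .
```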
7 conclusions

by comparing the temperature of the steel reached in the different design fires with the critical temperature of the truss, it could be established that the truss could well withstand the exhibition stand fire without any fire protection. in the case of the spectator stand fire, however, the temperature proved to be too high for the unprotected truss to endure the fire, though the temperature did not rise very much above the critical temperature when the suppressing effect of the sprinklers on the fire was taken into account.

references

[1] european committee for standardization: sfs-en 1991-1-2 eurocode 1: actions on structures. part 1-2: general actions. actions on structures exposed to fire. finland: finnish standards association, 2003.
[2] hämäläinen, s.: toiminnallisen palosuunnittelun määrittely ja käytettävät lähtöarvot, suomen messut. helsinki, finland: näyttelyhallin laajennus, 12. 06. 2008.
[3] hietaniemi, j.: palon voimakkuuden kuvaaminen toiminnallisessa paloteknisessä suunnittelussa. finland: vtt, 2007.
[4] mcgrattan, k. et al.: nist special publication 1019-5, fire dynamics simulator (version 5) user's guide. washington, usa: u.s. government printing office, 2007.

dan pada
e-mail: dan.pada@ako.fi
helsinki university of technology, faculty of engineering and architecture, espoo, finland

effect algebras of positive self-adjoint operators densely defined on hilbert spaces

z. riečanová

abstract: we show that (generalized) effect algebras may be suitable, very simple and natural algebraic structures for sets of (unbounded) positive self-adjoint linear operators densely defined on an infinite-dimensional complex hilbert space. in these cases the effect-algebraic operation, as a total or partially defined binary operation, coincides with the usual addition of operators in hilbert spaces.

keywords: quantum structures, (generalized) effect algebra, hilbert space, (unbounded) positive linear operator.

1 introduction

for any linear operator a densely defined on a hilbert space h one can define its adjoint operator a*. if a* coincides with a, then the operator a is called self-adjoint. self-adjoint (unbounded) linear operators on infinite-dimensional complex hilbert spaces are important in quantum mechanics, since they represent physical observables, e.g. the position or momentum of an elementary particle. differential operators form a class of unbounded operators; the laplace operator is an example of an unbounded positive linear operator.

the algebraic structures of sets of such operators differ from classical boolean logics. this follows from the fact that, e.g., the distributive law fails due to the noncompatibility of some pairs of operators. for instance, the position x and momentum p of an elementary particle cannot be measured simultaneously with arbitrarily prescribed accuracy, hence x and p are noncompatible. non-classical logic for the calculus of propositions of a quantum mechanical system was started in 1936 by birkhoff and von neumann (see [2]). effect algebras were introduced in 1994 in [5]. a survey of algebras of unbounded operators can be found in [1].

the aim of this paper is to show that (generalized) effect algebras may be suitable, very simple and natural algebraic structures for sets of linear operators (including unbounded ones) densely defined on an infinite-dimensional complex hilbert space, for which the effect-algebraic operation coincides with the usual sum of operators. more details on linear operators on hilbert spaces can be found, e.g., in [3], and about effect algebras in [4].
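the noncompatibility of position and momentum mentioned above can be made concrete in a finite-dimensional truncation; a small sketch (our illustration, not from the paper), using the standard harmonic-oscillator matrices $x \propto a + a^\dagger$, $p \propto i(a^\dagger - a)$:

```python
import numpy as np

n = 8                                        # truncation dimension (assumed)
a = np.diag(np.sqrt(np.arange(1.0, n)), 1)   # annihilation operator
x = (a + a.conj().T) / np.sqrt(2.0)          # dimensionless position
p = 1j * (a.conj().T - a) / np.sqrt(2.0)     # dimensionless momentum

comm = x @ p - p @ x                         # equals i*identity away from the
print(np.round(comm, 10))                    # truncation corner: x and p do
                                             # not commute, hence noncompatible
```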
2 basic definitions and some known facts

in this paper we assume that h is an infinite-dimensional complex hilbert space, i.e., a linear space with inner product (·,·) which is complete in the induced metric. conventions differ as to which argument the sesquilinear form (·,·) should be linear in. recall that here, for any x, y ∈ h, we have (x, y) ∈ c (the set of complex numbers) such that (x, αy + βz) = α(x, y) + β(x, z) for all α, β ∈ c and x, y, z ∈ h. moreover, $(x, y) = \overline{(y, x)}$, and finally (x, x) ≥ 0, where (x, x) = 0 iff x = 0.

the term dimension of h in the following always means the hilbertian dimension, defined as the cardinality of any orthonormal basis of h (see [3, p. 44]). moreover, we will assume that all considered linear operators a (i.e., linear maps a : d(a) → h) have a domain d(a) that is a linear subspace dense in h with respect to the metric topology induced by the inner product, so $\overline{d(a)} = h$. our results will concern positive linear operators a (denoted by a ≥ 0), meaning that (ax, x) ≥ 0 for all x ∈ d(a); such operators are then also symmetric (for more details see [3]). we will denote the set of all such operators by v(h).

recall that a : d(a) → h is called a bounded operator if there exists a real constant c ≥ 0 such that ‖ax‖ ≤ c‖x‖ for all x ∈ d(a); a is an unbounded operator if for every c there exists x_c ∈ d(a) with ‖ax_c‖ > c‖x_c‖. for every linear operator a with $\overline{d(a)} = h$ there exists the adjoint linear operator a* of a such that d(a*) = {y ∈ h | there exists y* ∈ h such that (y*, x) = (y, ax) for all x ∈ d(a)} and a*y = y* for every y ∈ d(a*). if a* = a then a is called self-adjoint. the set of all positive self-adjoint linear operators densely defined in h will be denoted by sp(h); hence sp(h) = {a ∈ v(h) | a = a*}.

a densely defined linear operator a on h is called symmetric if a ⊂ a*. here we write a ⊂ b iff d(a) ⊆ d(b) and ax = bx for every x ∈ d(a). the condition a ⊂ a* is equivalent to (y, ax) = (ay, x) for all x, y ∈ d(a). an operator a : d(a) → h is called closed if for every sequence (x_n), x_n ∈ d(a), such that x_n → x ∈ h and ax_n → y ∈ h as n → ∞, one has x ∈ d(a) and ax = y. since a ∈ v(h) is positive and hence also symmetric (see [3, p. 142]), there exists a closed operator $\overline{a}$ such that a ⊂ $\overline{a}$ and $\overline{a}$ ⊂ b for every closed operator b ⊃ a. moreover, $\overline{a}$ is symmetric, and it is called the closure of a. a symmetric operator is called essentially self-adjoint if $(\overline{a})^* = \overline{a}$, and then $\overline{a}$ is the unique self-adjoint extension of a [3, p. 96].

we shall show in section 3 that, under the partially defined usual sum of linear operators, the sets v(h) and sp(h) form quantum structures called (generalized) effect algebras (see also [9]). now we recall their definitions.

definition 2.1 (foulis and bennett, 1994): a partial algebra (e; ⊕, 0, 1) is called an effect algebra if 0, 1 are two distinguished elements and ⊕ is a partially defined binary operation on e which satisfies the following conditions for any x, y, z ∈ e:
(e1) x ⊕ y = y ⊕ x if x ⊕ y is defined,
(e2) (x ⊕ y) ⊕ z = x ⊕ (y ⊕ z) if one side is defined,
(e3) for every x ∈ e there exists a unique y ∈ e such that x ⊕ y = 1 (we put x′ = y),
(e4) if 1 ⊕ x is defined then x = 0.

we often denote the effect algebra (e; ⊕, 0, 1) briefly by e. on every effect algebra e a partial order ≤ and a partial binary operation ⊖ can be introduced as follows: x ≤ y and y ⊖ x = z iff x ⊕ z is defined and x ⊕ z = y. if e with the defined partial order is a lattice (a complete lattice), then (e; ⊕, 0, 1) is called a lattice effect algebra (a complete lattice effect algebra).

generalizations of effect algebras (i.e., without a top element 1) have been studied by kôpka and chovanec (1994) (difference posets), foulis and bennett (1994) (cones), kalmbach and riečanová (1994) (abelian ri-posets and abelian ri-semigroups) and hedĺıková and pulmannová (1996) (generalized d-posets and cancellative positive partial abelian semigroups). it can be shown that all of the above-mentioned generalizations of effect algebras are mutually equivalent and extend similar previous results for generalized boolean algebras and orthomodular lattices and posets.

definition 2.2:
(1) a generalized effect algebra (e, ⊕, 0) is a set e with an element 0 ∈ e and a partial binary operation ⊕ satisfying, for any x, y, z ∈ e, the conditions
(ge1) x ⊕ y = y ⊕ x if one side is defined,
(ge2) (x ⊕ y) ⊕ z = x ⊕ (y ⊕ z) if one side is defined,
(ge3) if x ⊕ y = x ⊕ z then y = z,
(ge4) if x ⊕ y = 0 then x = y = 0,
(ge5) x ⊕ 0 = x for all x ∈ e.
(2) a binary relation ≤ (being a partial order) on e can be defined by: x ≤ y iff x ⊕ z = y for some z ∈ e.
(3) q ⊆ e is called a sub-generalized effect algebra (sub-effect algebra) of the generalized effect algebra e (effect algebra e) iff it has the following property: if at least two of the elements x, y, z ∈ e with x ⊕ y = z are in q, then all of x, y, z are in q.

note that a sub-generalized effect algebra (sub-effect algebra) q ⊂ e is a (generalized) effect algebra in its own right.
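as a toy instance of definition 2.1 (our illustration, not from the paper): the real interval [0, 1] with x ⊕ y defined iff x + y ≤ 1 is the simplest effect algebra. the sketch below spot-checks axioms (e1)–(e4) on a small grid.

```python
import itertools

def oplus(x, y):
    """partial operation on [0, 1]: x + y, defined iff x + y <= 1."""
    s = x + y
    return s if s <= 1.0 + 1e-12 else None

grid = [i / 4 for i in range(5)]        # 0, 0.25, ..., 1

for x, y in itertools.product(grid, repeat=2):
    assert oplus(x, y) == oplus(y, x)                     # (e1) commutativity
for x, y, z in itertools.product(grid, repeat=3):
    left, right = oplus(x, y), oplus(y, z)
    lhs = oplus(left, z) if left is not None else None
    rhs = oplus(x, right) if right is not None else None
    assert lhs == rhs                                     # (e2) associativity
for x in grid:
    assert oplus(x, 1.0 - x) == 1.0                       # (e3): x' = 1 - x
    if oplus(1.0, x) is not None:
        assert x == 0.0                                   # (e4)
```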
3 generalized effect algebras of positive operators on a hilbert space and their sub-generalized effect algebras

in [9] the following theorem on positive linear operators with a common domain was proved:

theorem 3.1 [9, theorem 3.1]: let h be a complex hilbert space and let d ⊆ h be a linear subspace dense in h (i.e., $\overline{d} = h$). let g_d(h) = {a : d → h | a is a positive linear operator defined on d}. then (g_d(h); ⊕, 0) is a generalized effect algebra, where 0 is the null operator and ⊕ is the usual sum of operators defined on d. in this case ⊕ is a total operation.

if d = h in theorem 3.1, then g_d(h) is a generalized effect algebra of all bounded positive linear operators acting on h, with the usual addition as the effect-algebraic operation ⊕. hence in the case d = h all operators in g_d(h) are self-adjoint. on the other hand, if d ≠ h then every bounded operator in g_d(h) is a restriction a|_d of a bounded operator $\overline{a}$ with d($\overline{a}$) = h. thus, in this case, $\overline{a} = \overline{a}^* = (a|_d)^* \neq a|_d$. it follows that every self-adjoint operator in g_d(h) for d ≠ h is necessarily unbounded. nevertheless, it is well known (see, e.g., [10]) that every densely defined positive operator a has a positive self-adjoint extension â, called the friedrichs extension. moreover, â extends every symmetric extension a′ of a; thus if a′ is self-adjoint then a′ = â. but in general d(a) ≠ d(â), hence â ∉ g_d(h). clearly, for domains d_1 ≠ d_2, g_{d_1}(h) ∩ g_{d_2}(h) = ∅. however, it is well known that bounded linear operators have unique extensions to the whole space h. theorem 3.1 remains true if we substitute for g_d(h) the set g̃_d(h) = {a : d(a) → h | a is a positive linear operator with d(a) = d if a is unbounded, and d(a) = h if a is bounded}. then for d_1 ≠ d_2 we obtain g̃_{d_1}(h) ∩ g̃_{d_2}(h) = b^+(h), where b^+(h) is the set of all bounded positive linear operators a with d(a) = h.

theorem 3.2 [9, theorem 3.5]: let h be an infinite-dimensional complex hilbert space. let v(h) = {a : d(a) → h | a ≥ 0 with $\overline{d(a)}$ = h, and d(a) = h if a is bounded}. let ⊕ be a partial binary operation on v(h) defined by a ⊕ b = a + b with d(a ⊕ b) = h for any bounded a, b ∈ v(h), and a ⊕ b = b ⊕ a = a + b|_{d(a)} with d(a ⊕ b) = d(a) if a is unbounded and b is bounded. then (v(h); ⊕, 0) is a generalized effect algebra. moreover, b^+(h) is a sub-generalized effect algebra of v(h) with respect to the inherited ⊕-operation, which is defined for every pair a, b ∈ b^+(h).

now we are going to show that sp(h) = {a ∈ v(h) | a = a*} is a sub-generalized effect algebra of v(h), hence it is a generalized effect algebra. moreover, we show that sp(h) = f(h) = {â | a ∈ v(h), â is the friedrichs positive self-adjoint extension of a}.

lemma 3.3: under the assumptions of theorem 3.2, for every a ∈ v(h):
(i) $\overline{a}$ and a* exist, â is closed, and $a \subset \overline{a} \subset \hat{a} = (\hat{a})^* \subset (\overline{a})^* = a^*$,
(ii) â = $\overline{a}$ iff a is essentially self-adjoint,
(iii) sp(h) = f(h).

proof. (i) let a ∈ v(h). since $\overline{d(a)}$ = h, the adjoint a* of a exists [3, p. 93]. further, the assumption a ≥ 0 implies that a is symmetric (see [3, p. 142]), and there exists the so-called friedrichs positive self-adjoint extension â of a (see, e.g., [3] or [10]), hence a ⊂ â = (â)* ⊂ a*. it follows that h = $\overline{d(a)}$ = $\overline{d(\hat{a})}$ = $\overline{d(a^*)}$, which gives that a* and (â)* are closed (see [3, p. 95]). as â = (â)*, we obtain that â is closed. moreover, since a is symmetric, its closure $\overline{a}$ is also symmetric (see [3, p. 96]). thus we obtain $a \subset \overline{a} \subset \hat{a} = (\hat{a})^* \subset (\overline{a})^* \subset a^*$. further, a** = $\overline{a}$ and $(\overline{a})^* = a^*$ (see [3, p. 96]).
(ii) if a is essentially self-adjoint then $(\overline{a})^* = \overline{a}$ implies â = $\overline{a}$. conversely, if $\overline{a}$ = â then $\overline{a} = \hat{a} = (\hat{a})^* = (\overline{a})^*$.
(iii) if a ∈ sp(h) then a = a*, hence, by (i), a = â ∈ f(h). conversely, if a ∈ f(h) then a is self-adjoint, hence a ∈ sp(h).
theorem 3.4: under the assumptions of theorem 3.2, let sp(h) = {a ∈ v(h) | a = a*} and let ⊕_s = ⊕|_{sp(h)} be the restriction of the ⊕-operation defined on v(h) to the set sp(h). then (sp(h); ⊕_s, 0) is a sub-generalized effect algebra of (v(h); ⊕, 0).

proof. we have to show that if a, b, c ∈ v(h) with a ⊕ b = c, and at least two of a, b, c are in sp(h), then a, b, c ∈ sp(h).

(i) assume first that a, b ∈ sp(h). if a, b are bounded then c = a ⊕ b is again bounded and d(a) = d(b) = d(c) = h, hence c ∈ sp(h). further, if a is unbounded and b is bounded then c = a + b|_{d(a)} and d(c) = d(a). moreover, a, b ∈ sp(h) implies that a = a*, hence d(a) = d(a*), and b = b* ⊂ (b|_{d(a)})*, which gives b = (b|_{d(a)})* on h. it follows, as b|_{d(a)} is bounded, that (a ⊕ b)* = (a + b|_{d(a)})* = a* + (b|_{d(a)})* = a* + b = a* + b|_{d(a*)} = a + b|_{d(a)} = a ⊕ b. again c ∈ sp(h).

(ii) assume now that a, c ∈ sp(h). if c is bounded then d(c) = h and then a, b are bounded, hence a, b ∈ sp(h). if c and a are unbounded then b is bounded (since otherwise a ⊕ b is not defined) and again b ∈ sp(h). finally, if c is unbounded and a is bounded then b is unbounded and c = a|_{d(b)} + b. it follows that d(c) = d(b). moreover, c* = (a|_{d(b)})* + b* = a + b*, hence d(c*) = d(b*). now, the assumption that c is self-adjoint implies d(c) = d(c*), which gives d(b*) = d(b), hence b ∈ sp(h).

in theorem 3.4 we may substitute f(h) for sp(h). hence f(h) is a generalized effect algebra; more precisely:

corollary 3.5: let h be an infinite-dimensional complex hilbert space. let f(h) be the set of all friedrichs positive self-adjoint extensions of all positive densely defined linear operators in h, with d(a) = h if a is bounded. let ⊕ be a partial binary operation defined for a, b ∈ f(h) iff at least one of the operators a, b is bounded, and then a ⊕ b = a + b is the usual sum of operators in h. then (f(h); ⊕, 0) is a generalized effect algebra.

assume that (e; ⊕, 0) is a generalized effect algebra. then (see, e.g., [11]) for any fixed q ∈ e, q ≠ 0, the interval [0, q]_e = {x ∈ e | there exists y ∈ e with x ⊕ y = q} is an effect algebra ([0, q]_e; ⊕_q, 0, q) with unit q and with the partial operation ⊕_q defined for x, y ∈ [0, q]_e by: x ⊕_q y exists and x ⊕_q y = x ⊕ y iff x ⊕ y ∈ [0, q]_e exists in e.

we have shown that v(h) and sp(h) are generalized effect algebras under the partial operations ⊕ and ⊕_s, respectively. moreover, a ⊕ b (for a, b ∈ v(h)) and a ⊕_s b (for a, b ∈ sp(h)) coincide with the usual sum of the operators a, b when at least one of them is bounded. if both a, b are unbounded then a ⊕ b and a ⊕_s b, respectively, are not defined. since for any fixed q ∈ sp(h), q ≠ 0, it holds that [0, q]_{sp(h)} = [0, q]_{v(h)} ∩ sp(h), we obtain the following effect algebras of positive self-adjoint operators:

theorem 3.6: let q ∈ sp(h), q ≠ 0, be fixed. then ([0, q]_{sp(h)}; ⊕_q, 0, q) is an effect algebra (with unit q) of positive self-adjoint operators densely defined in h under the operation ⊕_q defined for a, b ∈ [0, q]_{sp(h)} by: a ⊕_q b exists and a ⊕_q b = a + b (the usual sum of a, b in h) iff at least one of the operators a, b is bounded and a + b ∈ [0, q]_{v(h)}.

note that if we substitute v(h) for sp(h) in the preceding theorem, then for every fixed q ∈ v(h) we have [0, q]_{v(h)} = {a ∈ v(h) | there exists c ∈ v(h) such that at least one of a, c is bounded and a + c = q}. then ([0, q]_{v(h)}; ⊕_q, 0, q) is an effect algebra with unit q and the partial binary operation ⊕_q defined in theorem 3.6.
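a finite-dimensional toy version of the interval construction in theorem 3.6 (our illustration; the paper's setting is infinite-dimensional and includes unbounded operators): for positive semidefinite matrices, a ⊕_q b = a + b is defined iff a + b ≤ q in the loewner order.

```python
import numpy as np

def is_psd(m, tol=1e-10):
    """loewner order test: m >= 0 iff all eigenvalues of m are nonnegative."""
    return bool(np.all(np.linalg.eigvalsh(m) >= -tol))

def oplus_q(a, b, q):
    """partial sum in the interval [0, q]: defined iff a + b <= q."""
    s = a + b
    return s if is_psd(q - s) else None

q = np.diag([2.0, 1.0])            # the fixed "unit" of the interval algebra
a = np.diag([1.0, 0.25])
b = np.diag([0.5, 0.5])
print(oplus_q(a, b, q))            # defined: a + b <= q
print(oplus_q(a, np.diag([1.5, 0.2]), q))   # none: the sum exceeds q
```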
remark 3.7:
(i) if q ∈ sp(h) is a bounded operator then [0, q]_{sp(h)} = [0, q]_{v(h)}, and it is an effect algebra of all bounded self-adjoint positive operators between 0 and q (with domain h). moreover, ⊕_q coincides with the usual sum of operators whenever a ⊕_q b exists in [0, q]_{sp(h)}.
(ii) it follows from (i) that if q = i (the identity operator with domain h) then [0, q]_{sp(h)} = [0, q]_{v(h)} = e(h) is the hilbert space effect algebra of all self-adjoint operators between 0 and the identity operator i (see [5]).
(iii) if q ∈ sp(h) is an unbounded operator with $\overline{d(q)}$ = h, then every unbounded operator a ∈ [0, q]_{sp(h)} has d(a) = d(q), since then there exists a bounded operator c ∈ sp(h) (hence d(c) = h) such that a + c = q.
(iv) if q ∈ sp(h) is an unbounded self-adjoint operator then (2q)* = 2q* = 2q ∈ sp(h). in this case, for any operators a, b ∈ [0, q]_{sp(h)} one has a + b ∈ sp(h) (the usual sum of operators), even if a, b are unbounded. this follows from the fact that there are bounded operators c_a, c_b ∈ sp(h) such that q = a ⊕ c_a = b ⊕ c_b. thus (a ⊕ c_a) + (b ⊕ c_b) = 2q, hence (a + b) + (c_a + c_b) = 2q. here c_a + c_b ∈ sp(h), and because sp(h) is a generalized effect algebra and also 2q ∈ sp(h), we obtain that a + b ∈ sp(h).
(v) it is worth noting that effect algebras are very natural structures as carriers of states (or probability measures) when we also handle noncompatible pairs or unsharp elements.

acknowledgement

supported by the vega 1/0297/11 grant of the ministry of education of the slovak republic.

references

[1] bagarello, f.: algebras of unbounded operators and physical applications: a survey, reviews in mathematical physics 19 (2007), 231–271.
[2] birkhoff, g., von neumann, j.: the logic of quantum mechanics, ann. math. 37 (1936), 823–843.
[3] blank, j., exner, p., havĺıček, m.: hilbert space operators in quantum physics (second edition), springer, 2008.
[4] dvurečenskij, a., pulmannová, s.: new trends in quantum structures. dordrecht: kluwer, the netherlands, 2000.
[5] foulis, d. j., bennett, m. k.: effect algebras and unsharp quantum logics, found. phys. 24 (1994), 1331–1352.
[6] hedĺıková, j., pulmannová, s.: generalized difference posets and orthoalgebras, acta math. univ. comenianae lxv (1996), 247–279.
[7] kalmbach, g., riečanová, z.: an axiomatization for abelian relative inverses, demonstratio math. 27 (1994), 769–780.
[8] kôpka, f., chovanec, f.: d-posets, math. slovaca 44 (1994), 21–34.
[9] polakovič, m., riečanová, z.: generalized effect algebras of positive operators densely defined on hilbert space, internat. j. theor. phys. 50 (2011), 1167–1174.
[10] reed, m., simon, b.: methods of modern mathematical physics ii, fourier analysis, self-adjointness. new york, san francisco, london: academic press, 1975.
[11] riečanová, z.: subalgebras, intervals and central elements of generalized effect algebras, international journal of theoretical physics 38 (1999), 3209–3220.

zdenka riečanová
e-mail: zdenka.riecanova@stuba.sk
department of mathematics, faculty of electrical engineering and information technology stu, ilkovičova 3, sk-81219 bratislava

unobtrusive non-contact detection of arrhythmias using a "smart" bed

ch. brüser

abstract: we present an instrumented bed for unobtrusive, non-contact monitoring of cardiac and respiratory activity. the system presented here is based on the principle of ballistocardiography (bcg), and measures cardiopulmonary vibrations of the body by means of an electromechanical foil (emfi) attached to the mattress. using our system, a clinical study with 13 participants was conducted to assess the bcg's ability to distinguish atrial fibrillations from normal sinus rhythms. by computing a time-frequency representation of the recorded signals based on parametric autoregressive estimators, we can show clear qualitative differences between normal and arrhythmic bcg episodes.
the same distinctive features could also be observed when applying our method to a simultaneously recorded reference ecg. our results suggest that the ecg and the bcg both contain the same basic information with respect to the presence of atrial fibrillations, and that a bed-mounted bcg sensor can indeed be used to detect atrial fibrillations.

keywords: ballistocardiography, arrhythmia, bed, unobtrusive home-monitoring, atrial fibrillation.

1 introduction

cardiovascular diseases in general, and heart failure in particular, are among the commonest reasons for hospitalization in the industrialized countries [1]. in order to deal with the growing number of patients, there is a need for technical solutions which enable personalized monitoring and treatment, preferably at home. in recent years, the bed has emerged as a promising place for long-term monitoring of cardiopulmonary activity at home, as virtually everyone spends a significant portion of the day in bed. instrumented beds could also be applied in the general wards of hospitals, where fully automatic, unobtrusive monitoring systems could reduce the workload of the staff and increase the safety of the patients.

a promising approach for measuring cardiopulmonary activity is the integration of highly sensitive mechanical sensors into the bed frame or mattress, which record the vibrations of the body caused by the mechanical activity of the heart and by the respiratory movements of the thorax. this basic principle, known under the term "ballistocardiography" (bcg), was first reported in the late 19th century [2]. through improvements in sensor technologies and digital signal processing techniques, the field has gained renewed interest in recent years. using a variety of different sensors, including but not limited to strain gauges [3,4], pvdf and emfi sensors [5,6], accelerometers [7], hydraulic [8] and pneumatic sensors [9,10] as well as optical devices [11], bcg systems have been integrated into objects of daily life, such as beds [4–12], chairs [13] and even weighing scales [3]. these systems share the common advantage that they are unobtrusive and do not require direct skin contact, unlike, for example, a conventional ecg. user errors, such as incorrectly attached electrodes, can mostly be avoided by bed-based bcg systems, which do not require any user interaction or professional supervision to perform nightly measurements over extended periods of time. hence bcg systems are very well suited for long-term monitoring.

while there have been significant contributions from various sensor modalities for bcg recording, as well as a few studies evaluating modern bcg systems using healthy subjects, there is currently a lack of research on whether these systems actually provide clinically useful information when measuring real patients. after all, the answer to this question will decide whether or not these systems come into use in clinical practice. arrhythmias, in particular, have so far been regarded as undesirable artefacts in many publications. we therefore devised a clinical study with the explicit goal of evaluating our bcg system's ability to detect cardiac arrhythmias. for our study we focused on atrial fibrillation (af), since it is the most common type of arrhythmia [14]. during atrial fibrillation, the two upper chambers of the heart (the atria) fibrillate and do not perform coordinated contractions [15]. af is a marker for other severe illnesses such as congestive heart failure [16].
long-term monitoring of af episodes might enable an early detection of worsening conditions in heart failure patients.

in the remainder of this paper, we first introduce our bcg measurement system (section 2.1). then we describe the design of the clinical study that was performed to acquire bcg data from real arrhythmia patients (section 2.2). next, we present the signal processing techniques that were applied to evaluate the acquired data (section 2.3). we conclude by discussing the results and presenting our conclusions (sections 3 and 4).

2 materials and methods

2.1 bcg acquisition system

the bcg acquisition system used in this study consists of a single electromechanical-film (emfi) sensor [17] (30 cm × 60 cm, thickness < 1 mm) and the acquisition electronics for amplifying and digitizing the analog sensor signal. mechanical deformation of the electromechanical film generates a charge, $\delta q$, which is proportional to the dynamic force, $\delta f$, acting along the thickness direction of the sensor:

$$\delta q = k\, \delta f \qquad (1)$$

where k is the sensitivity coefficient. the resulting charge is amplified by a charge amplifier, and is then digitized with 12 bits at a sampling frequency of 128 hz.

the emfi foil is mounted on the underside of a thin foam overlay, which is then placed on top of the mattress of a regular bed (see fig. 1). due to its thinness and its flexible properties, the presence of the sensor is almost imperceptible to the person lying in the instrumented bed. owing to the sensitivity of the emfi foil, however, cardiopulmonary movements of the person lying in bed can be recorded. in order to obtain optimal signals, the sensor is mounted in a fixed position under the thorax region. a short segment of a signal recorded by our bcg system, containing three heart beats, is shown in fig. 2; vertical dotted lines indicate the time points at which r peaks occurred in the reference ecg.

fig. 1: picture of the emfi bcg sensor attached to the bottom of the thin foam overlay

fig. 2: exemplary trace of three heart beats recorded by the bed sensor. vertical lines indicate the occurrence of r peaks in the simultaneously recorded reference ecg

2.2 measurement scenario

in order to assess whether atrial fibrillations and sinus rhythm can be distinguished in a bcg recording, the following study was performed at the university hospital in aachen, germany. the study was approved by the ethics board of the university hospital aachen (ref. number: ek075/10, date: 05.05.2010). a total of 13 patients (3 female, 10 male, age: 63.6 ± 16.3 years, bmi: 28.6 ± 4.1 kg/m2) who were visiting the hospital to undergo ambulatory treatment for atrial fibrillation gave their informed written consent and were included in our study.

to return the patients' hearts to a regular sinus rhythm, a routine procedure called synchronized electrical cardioversion [18] was performed on each patient. during this procedure, an electrical current is administered to the heart. unlike defibrillation, the initial current dose is smaller and the shock is triggered by the r-peak in the ecg in order to reduce the risk of induced ventricular fibrillation. for the entire duration of their treatment, the subjects were placed in a hospital bed instrumented with an emfi foil sensor, as described above. in addition to the bcg, a 3-lead reference ecg was recorded with a sampling rate of 500 hz. bcg and ecg data were continuously acquired before, during, and after the procedure. the mean length of the individual bcg recordings is 45 minutes.

this measurement scenario has the major advantage that it allows the bcg of the same patient to be recorded while exhibiting the pathology (i.e.
arrhythmia/atrial fibrillations) as well as when the patient's heart is returned to a normal sinus rhythm. hence an inter-personal as well as an intra-personal comparison of the bcg signal during arrhythmias and during normal rhythms is possible.

2.3 signal analysis

psd estimation using ar models

autoregressive (ar) models are a common choice for parametric estimation of the power spectral density (psd) of a signal [19]. first, a given signal x(n), n ∈ [0, n−1], is modelled as the output of a discrete, all-pole, infinite impulse response (iir) filter whose input is white noise, w(n), of variance σ²:

$$x(n) = -\sum_{k=1}^{p} a_k\, x(n-k) + w(n) \qquad (2)$$

thus, for an ar(p) model of p-th order, only the filter coefficients a_1, ..., a_p and the noise variance σ² need to be estimated to fully describe this process. after estimating these parameters, the psd, p(f), of the modelled process can be computed as:

$$p(f) = \frac{\sigma^2}{\left|1 + \sum_{k=1}^{p} a_k\, e^{-j 2\pi f k}\right|^2} \qquad (3)$$

a number of different ar parameter estimation methods are known in the literature [19]. these are typically based on minimizing an estimate of the prediction error power. popular estimators include the yule-walker method and burg's method. for the analysis which follows, we have chosen to use the burg estimator [20].

autoregressive spectral estimation can provide better modelling of the peaks in the psd than nonparametric methods, especially when dealing with short signal lengths [19]. this improvement comes at the cost of a less accurate description of the valleys of the psd. when dealing with quasi-periodic signals measuring cardiac activity, however, this is a valid trade-off, as we are primarily interested in the peaks of the psd. this advantage of ar spectral estimators exists only when the assumption of an underlying ar process is indeed valid for the given signal; furthermore, the model order needs to be carefully chosen to achieve high-quality estimates.

spectrogram

since biosignals such as the ecg and the bcg are also highly non-stationary in nature (especially in the presence of arrhythmias), a deeper insight into the properties of these signals can be achieved by analysing their time-frequency distributions. a common approach to obtain a time-varying spectral representation of a signal is to divide the signal into smaller (overlapping) epochs and to estimate the psd for each of these epochs separately [21]. this so-called spectrogram is usually computed by means of the short-time fourier transform (stft); however, ar-based spectral estimators can equally be used. let $p_m^L(f)$ denote the estimated psd of an epoch of the signal x(n) which starts at the m-th sample and has a length of l samples. we can then define the ar-based spectrogram as

$$s(f, n) = p_n^L(f) \qquad (4)$$

the better distinction of the peaks that can be obtained through ar estimators means that smaller epoch lengths can be chosen. this allows an increase in time resolution while at the same time maintaining a similar resolution in the frequency domain.

ecg and bcg signal analysis

both the bcg signals and the lead ii ecg signals recorded during our study were first low-pass filtered to 15 hz and then downsampled to 30 hz. the signals were then split into 5 s long epochs with 4 s of overlap, thus resulting in one epoch every second. for each epoch, the psd was estimated using an ar model of order 50. spectrograms were then obtained for each signal from the sets of estimated psds.
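the pipeline above (burg-estimated ar(50) psds on 5 s epochs, one per second) can be sketched in a few lines of numpy. this is our minimal reconstruction; the synthetic test signal, the frequency grid and all variable names are ours, not the paper's.

```python
import numpy as np

def burg_ar(x, order):
    """estimate ar coefficients a_1..a_p and the noise variance sigma^2
    with burg's method (minimizing forward + backward prediction error)."""
    x = np.asarray(x, dtype=float)
    a = np.zeros(order + 1)
    a[0] = 1.0
    e = np.dot(x, x) / len(x)            # prediction error power
    f, b = x.copy(), x.copy()            # forward and backward errors
    for m in range(1, order + 1):
        fp, bp = f[m:], b[m - 1:-1]
        k = -2.0 * np.dot(fp, bp) / (np.dot(fp, fp) + np.dot(bp, bp))
        prev = a[:m + 1].copy()
        a[:m + 1] = prev + k * prev[::-1]    # levinson coefficient update
        f[m:], b[m:] = fp + k * bp, bp + k * fp
        e *= 1.0 - k * k
    return a[1:], e

def ar_psd(a, sigma2, freqs, fs):
    """p(f) = sigma^2 / |1 + sum_k a_k exp(-j 2 pi f k / fs)|^2   (eq. 3)"""
    k = np.arange(1, len(a) + 1)
    denom = 1.0 + np.exp(-2j * np.pi * np.outer(freqs, k) / fs) @ a
    return sigma2 / np.abs(denom) ** 2

def ar_spectrogram(x, fs, freqs, epoch=5.0, step=1.0, order=50):
    """s(f, n): one ar psd per 5 s epoch, sliding in 1 s steps   (eq. 4)"""
    l, h = int(epoch * fs), int(step * fs)
    rows = []
    for n in range(0, len(x) - l + 1, h):
        seg = x[n:n + l]
        a, s2 = burg_ar(seg - seg.mean(), order)
        rows.append(ar_psd(a, s2, freqs, fs))
    return np.array(rows)

fs = 30.0                                    # post-downsampling rate, as above
t = np.arange(0, 60.0, 1.0 / fs)
x = np.sin(2 * np.pi * 1.2 * t) + 0.3 * np.sin(2 * np.pi * 2.4 * t) \
    + 0.1 * np.random.randn(t.size)          # synthetic ~72 bpm beat + harmonic
freqs = np.linspace(0.0, fs / 2, 256)
s = ar_spectrogram(x, fs, freqs)             # rows peak near 1.2 hz and 2.4 hz
```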
3 results and discussion

figure 3 shows the bcg and ecg spectrograms of a healthy reference subject. both spectrograms show a very similar image containing clear bright lines related to the heart frequency and its harmonics. from a purely visual standpoint, one could conclude that both signals, i.e. the non-contact bed measurement and the reference ecg, contain similar information about the current heart rate of the subject.

fig. 3: spectrograms of simultaneously recorded bcg and ecg signals of a healthy subject. both images show clearly visible lines corresponding to the heart rate of the subject and its harmonics. the higher power densities in the lower frequencies of the bcg spectrogram are related to respiratory-induced motions in the bcg signal

much to our surprise, the spectrograms also showed apparent visual similarities during pathologic episodes of atrial fibrillation. figure 4 shows the time-frequency analysis of the signals recorded during the treatment session of one of the patients in our study. before the cardioversion was performed, the patient suffered from atrial fibrillation, which caused strong and rapid fluctuations in the ecg-derived heart rates, as shown in the second plot of the figure. after the cardioversion event, the patient's heart returned to a sinus rhythm and the heart rate stabilized.

fig. 4: time-frequency analysis of the bcg and the reference ecg of patient 11 before and after the cardioversion is performed. following the cardioversion, both spectrograms change from a noise-like appearance into clearly visible lines representing a base frequency and its harmonics. (from top to bottom: 1. bcg signal recorded by the bed sensor, 2. beat-to-beat heart rates computed from the reference ecg, 3. spectrogram of the bcg signal, and 4. spectrogram of the reference ecg signal)

when inspecting the spectrograms of the ecg signal and the bcg signal, respectively, the change of state induced by the cardioversion is also immediately visible. while the spectrogram during atrial fibrillation has a smeared, almost noise-like appearance, the spectrograms change into the previously seen pattern of distinct lines when the subject's cardiac activity returns to a sinus rhythm. this pattern conforms with what we observed earlier for healthy subjects. in addition to the example shown here, we observed the same differences in spectrogram patterns between arrhythmic and normal periods for all patients who took part in our study.

these preliminary results seem to support our initial hypothesis that arrhythmic cardiac activity can indeed be detected using a bed-mounted bcg system. furthermore, the observation that the ecg and bcg spectrograms undergo the same qualitative changes leads us to believe that the bcg signal recorded using our unobtrusive, non-contact sensor system does indeed contain similar information as the ecg with respect to the presence of arrhythmias. nevertheless, our experiments also highlight a major challenge on the road towards a fully automatic bcg-based arrhythmia detector. as shown in figure 4, motion artefacts immediately after the cardioversion (evident through the increased amplitude of the bcg signal) cause distortions in the bcg spectrogram which, at first glance, appear similar to arrhythmias.
however, an automatic algorithm might still be able to distinguish motion artefacts from arrhythmias by taking the bcg signal amplitude and the details of the frequency distribution into account.

4 conclusion

we have introduced a bed-based sensor system that can unobtrusively monitor the cardiopulmonary activity of a person lying in bed. unlike previous work in the field, which has mostly treated arrhythmias in the data sets as undesirable artefacts, we have devised and presented a clinical study dedicated explicitly to the goal of evaluating the fitness of the proposed system for detecting cardiac arrhythmias. through the analysis of the data acquired during this study by means of an ar-model-based time-frequency representation, we have shown that the proposed system does indeed enable cardiac arrhythmias to be detected. our work prepares the way for future research on fully automatic algorithms for detecting arrhythmias in bcg signals. while the types of arrhythmias which can be detected in bcg recordings are still limited to atrial fibrillation, our findings might facilitate improvements in the long-term management and treatment of cardiac diseases for which af episodes can be an important marker.

acknowledgement

the research presented in this paper was supervised by prof. s. leonhardt, rwth aachen university, aachen, germany, and was sponsored by philips research, eindhoven, the netherlands. the author also thanks s. dewaele of philips research for insightful discussions. further thanks go to prof. p. schauerte and m. zink of the department of cardiology, medical clinic i, university hospital aachen, for enabling and supporting the execution of this study.

references

[1] world health organization: the world health report 2004. 2004. http://www.who.int/whr/2004/annex/topic/en/annex 2 en.pdf
[2] gordon, j. w.: certain molar movements of the human body produced by the circulation of the blood, journal of anatomy and physiology, 1877, vol. 11, pp. 533–536.
[3] inan, o. t., etemadi, m., wiard, r. m., giovangrandi, l., kovacs, g. t. a.: robust ballistocardiogram acquisition for home monitoring, physiological measurement, 2009, vol. 30, pp. 169–185.
[4] brüser, c., stadlthanner, k., brauers, a., leonhardt, s.: applying machine learning to detect individual heart beats in ballistocardiograms, in proc. 32nd ann. int. conf. of the ieee embs, buenos aires, argentina, 2010, pp. 1926–1929.
[5] kortelainen, j. m., virkkala, j.: fft averaging of multichannel bcg signals from bed mattress sensor to improve estimation of heart beat interval, proc. 29th ann. int. conf. of the ieee embs, cité internationale, lyon, france, 2007, pp. 6685–6688.
[6] aubert, x. l., brauers, a.: estimation of vital signs in bed from a single unobtrusive mechanical sensor: algorithms and real-life evaluation, proc. 30th ann. int. conf. of the ieee embs, british columbia, canada, 2008, pp. 4744–4747.
[7] chuo, y., tavakolian, k., kaminska, b.: evaluation of a novel integrated sensor system for synchronous measurement of cardiac vibrations and cardiac potentials, journal of medical systems, 2009, pp. 1–11.
[8] zhu, x., chen, w., nemoto, t., kanemitsu, y., kitamura, k., yamakoshi, k., wei, d.: real-time monitoring of respiration rhythm and pulse rate during sleep, ieee trans. biomed. eng., no. 12, vol. 53, 2006, pp. 2553–2563.
[9] watanabe, k., watanabe, t., watanabe, h., ando, h., ishikawa, t., kobayashi, k.: noninvasive measurement of heartbeat, respiration, snoring and body movements of a subject in bed via a pneumatic method, ieee trans. biomed. eng., no. 12, vol. 52, 2005, pp. 2100–2107.
[10] chee, y., han, j., youn, j., park, k.: air mattress sensor system with balancing tube for unconstrained measurement of respiration and heart beat movements, physiological measurement, vol. 26, 2005, pp. 413–422.
[11] spillman jr., w. b., mayer, m., bennett, j., gong, j., meissner, k. e., davis, b., claus, r. o., muelenaer jr., a. a., xu, x.: a 'smart' bed for non-intrusive monitoring of patient physiological factors, measurement science and technology, vol. 15, 2004, pp. 1614–1620.
[12] mack, d. c., patrie, j. t., suratt, p. m., felder, r. a., alwan, m. a.: development and preliminary validation of heart rate and breathing rate detection using a passive, ballistocardiography-based sleep monitoring system, ieee trans. inf. technol. biomed., no. 1, vol. 13, 2009, pp. 111–120.
[13] junnila, s., akhbardeh, a., värri, a.: an electromechanical film sensor based wireless ballistocardiographic chair: implementation and performance, journal of signal processing systems, no. 3, vol. 57, 2008, pp. 305–320.
[14] go, a. s., hylek, e. m., phillips, k. a., chang, y., henault, l. e., selby, j. v., singer, d. e.: prevalence of diagnosed atrial fibrillation in adults: national implications for rhythm management and stroke prevention: the anticoagulation and risk factors in atrial fibrillation (atria) study, jama, no. 18, vol. 285, 2001, pp. 2370–2375.
[15] katz, a. m.: physiology of the heart. lippincott williams & wilkins, 2005.
[16] maisel, w. h., stevenson, l. w.: atrial fibrillation in heart failure: epidemiology, pathophysiology, and rationale for therapy, american journal of cardiology, no. 6, vol. 91, 2003, pp. 2d–8d.
[17] paajanen, m., lekkala, j., kirjavainen, k.: electromechanical film (emfi): a new multipurpose electret material, sensors and actuators a: physical, no. 1–2, vol. 84, 2000, pp. 95–102.
[18] shea, j. b., maisel, w. h.: cardioversion, circulation, no. 22, vol. 106, 2002, pp. e176–e178.
[19] kay, s. m.: modern spectral estimation: theory and application. prentice hall, 1999.
[20] broersen, p.: finite-sample bias propagation in autoregressive estimation with the yule-walker method, ieee trans. instrumentation and measurement, no. 5, vol. 58, 2009, pp. 1354–1360.
[21] mitra, s. k.: digital signal processing: a computer-based approach. mcgraw-hill, 2001.

about the author

christoph brüser was born in troisdorf, germany, in 1983. he holds a dipl.-ing. degree in computer engineering from rwth aachen university, aachen, germany. currently, he is pursuing a dr.-ing. (ph.d.) degree in the department of medical information technology, rwth aachen university, where he is also working as a research assistant. his research interests include biosignal processing and classification as well as unobtrusive physiological measurement techniques.

christoph brüser
e-mail: brueser@hia.rwth-aachen.de
philips chair for medical information technology, rwth aachen university, pauwelsstrasse 20, 52074 aachen, germany
checking the geometric accuracy of a machine tool for selected geometric parameters

adam janásek, robert čep, josef brychta

všb – technical university of ostrava, department of machining and assembly, 17. listopadu 15/2172, 708 33 ostrava-poruba, czech republic
correspondence to: adam.janasek@vsb.cz

abstract: this paper deals with the control parameters for selected geometric accuracy measurements for a machine tool. the parameters were needed after a refurbished milling machine was purchased. after setting up the machine, it was necessary to check the geometric accuracy that can be used for precise milling. the whole check was performed in accordance with iso 10791. only selected parameters of geometric accuracy were inspected, and they were later compared with the prescribed values. on the basis of a comparison of these values we were able to determine whether the machine tool can be used for accurate machining.

keywords: geometric accuracy, testing, milling machine.

selected geometric parameters

the measurements were performed on a vertical spindle milling machine, see figure 1. for this milling machine, we selected eight geometric accuracy tests from the iso 10791-2 standard for a machine with a vertical spindle, and two accuracy tests for horizontal milling machines. for each geometric accuracy test, we performed at least as many measurements as were stipulated in the standard. before each measurement was performed, the gauge adjustment required for the specific test was made, and the measured surfaces were cleaned. the tests were performed at a constant temperature of 20 °c. the test scheme is shown for each test of geometric accuracy from the iso 10791-2 standard. further tests were supplemented by an individual value, as allowed by the iso 10791-2 standard.

tests performed on the milling machine [1]:
a) measurement of periodic axial spindle movement on the lateral surface, test g10b.
b) measurement of periodic axial spindle movement, test g10a.
c) measurement of peripheral whipping – internal taper spindle at the end of the spindle, test g11a.
d) measurement of peripheral whipping – internal taper spindle at a distance of 110 mm from the end of the spindle, test g11b.
e) measurement of axial parallelism with spindle movement on the z axis, test g12.
f) measurement of the angular deviation movement on the y axis, test g5a.
g) measurement of the angular deviation movement on the x axis, test g4a.
h) measurement of the middle guide groove – parallel to the table feed at a distance of 500 mm, test 8a, and the same measurement on the opposite side of the guide groove, test 8b.
i) measurement of the movement parallelism on the y axis with the table surface, distance y = 170 mm, test g17.

figure 1: tested milling machine with a vertical spindle – side view [3]

test g10b

this test was conducted on a radius of a = 40 mm. the measurement was based on the fact that the dial indicator was attached to the lateral surface of the spindle, so the deviation could be checked during the rotation of the spindle. before making the measurements we had to remove the drag-stones, which hindered the measurement. ten measurements were performed, and the values were processed.

figure 2: measurement scheme [1]

$$u_A = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n(n-1)}} = 8.33 \cdot 10^{-4}\ \mathrm{mm}$$
$$X = \bar{x} \pm u_A = (0.00750 \pm 0.00083)\ \mathrm{mm}$$

measured value: (0.00750 ± 0.00083) mm
value according to the standard: 0.01 mm
the milling machine is satisfactory in terms of this test.
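every test below reports the mean of ten readings together with the type-a standard uncertainty $u_A = \sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2 / (n(n-1))}$. a minimal sketch of that evaluation; the readings array is made up for illustration, not the paper's raw data:

```python
import numpy as np

def type_a(readings):
    """mean and type-a standard uncertainty of repeated dial-indicator
    readings: u_a = sqrt( sum (x_i - mean)^2 / (n (n - 1)) )."""
    x = np.asarray(readings, dtype=float)
    n = x.size
    u_a = np.sqrt(np.sum((x - x.mean()) ** 2) / (n * (n - 1)))
    return x.mean(), u_a

# ten hypothetical readings [mm] for a test like g10b
mean, u_a = type_a([0.007, 0.008, 0.007, 0.008, 0.008,
                    0.007, 0.008, 0.007, 0.008, 0.007])
print(f"x = ({mean:.5f} +/- {u_a:.5f}) mm")    # compare with the 0.01 mm limit
```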
test g10a

this test was conducted inside the spindle at a height of 10 mm above the drag-stones. the measurement was based on the fact that the dial indicator was attached to the internal surface of the spindle, so the deviation could be checked during the rotation of the spindle. again, ten measurements were performed and the values were processed [3]. finally, the drag-stones were remounted on the spindle.

figure 3: measurement scheme [1]

$$u_A = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n(n-1)}} = 7.65 \cdot 10^{-4}\ \mathrm{mm}$$
$$X = \bar{x} \pm u_A = (0.00350 \pm 0.00077)\ \mathrm{mm}$$

measured value: (0.00350 ± 0.00077) mm
value according to the standard: 0.005 mm
the milling machine is satisfactory in terms of this test.

test g11a

this test was conducted at the end of the spindle. the mandrel was clamped to the spindle and the peripheral whipping was evaluated. the perpendicularity of the axis of the mandrel (and of the spindle to the axis of the longitudinal feed of the table) was checked before starting the measurements [3]; this check was important for precise testing of the peripheral spindle whipping. the measurement was based on the fact that the dial indicator was attached to the mandrel.

figure 4: measurement scheme [1]

$$u_A = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n(n-1)}} = 8.16 \cdot 10^{-4}\ \mathrm{mm}$$
$$X = \bar{x} \pm u_A = (0.01300 \pm 0.00082)\ \mathrm{mm}$$

measured value: (0.01300 ± 0.00082) mm
value according to the standard: 0.01 mm
the milling machine is unsatisfactory in terms of this test.

test g11b

this test was conducted at a distance of 110 mm from the end of the milling spindle. the mandrel was clamped into the milling spindle, and the peripheral whipping was measured. it was not necessary to check the perpendicularity of the axis of the mandrel, because this had been done in the previous measurement. the measurements were taken after measuring out a distance of 110 mm from the end of the spindle and after setting up a magnetic pedestal with a dial indicator on the milling table. the measurements were based on the fact that the mandrel was attached to the dial indicator and the deviation was checked during rotation. for this test, it was necessary to convert the allowed values according to [1], because the tolerance is given by the standard for a measured length of 300 mm [3], while our measurement was performed on a length of 110 mm.

figure 5: measurement scheme [1]

$$u_A = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n(n-1)}} = 8.5 \cdot 10^{-4}\ \mathrm{mm}$$
$$X = \bar{x} \pm u_A = (0.02300 \pm 0.00085)\ \mathrm{mm}$$

conversion of the standard values for a length of 110 mm: a length of 300 mm allows 0.01 mm according to [1], i.e. 0.01/300 = 3.33 · 10⁻⁵ mm per 1 mm of length; for 110 mm the allowed value used was 0.014 mm.

measured value: (0.02300 ± 0.00085) mm
value according to the standard: 0.014 mm
the milling machine is unsatisfactory in terms of this test.
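the conversions in tests g11b and g12 scale the standard's 300 mm tolerance to the measured length. a sketch of that scaling (our reading, not a statement from the paper): pure linear scaling reproduces the g12 limit below (0.015 mm → 0.0055 mm), while the 0.014 mm limit used for g11b is recovered only if a fixed allowance at the spindle nose is added to the scaled part, which suggests the standard's runout tolerance has that two-part form.

```python
def scale_tolerance(tol_per_300mm, length_mm, base=0.0):
    """tolerance at a given measured length: an optional fixed base
    allowance plus a part specified per 300 mm of length."""
    return base + tol_per_300mm / 300.0 * length_mm

print(scale_tolerance(0.015, 110))              # g12: 0.0055 mm, as in the text
print(scale_tolerance(0.010, 110, base=0.010))  # ~0.0137 mm, close to the
                                                # 0.014 mm used for g11b
```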
test g12
this test was conducted at a distance of 110 mm. the mandrel was clamped into the milling spindle and the parallelism of the spindle axis with the movement on the z axis was measured. the measurements were taken after setting out the distance of 110 mm and after placing a magnetic pedestal with a dial indicator on the milling table. ten measurements were performed at position 0°, and the next ten measurements were performed with the mandrel rotated by 180°, in order to eliminate errors resulting from inaccuracies of the mandrel itself. for this test, it was again necessary to convert the allowed values according to [1], because the tolerance given by the standard is for a measured length of 300 mm, whereas our measurement was performed on a length of 110 mm.

figure 6: measurement scheme [1]

$$u_A = 8.16\cdot10^{-4}\ \mathrm{mm}, \qquad x = \bar x \pm u_A = (0.00300 \pm 0.00082)\ \mathrm{mm}$$

conversion of the standard value to a length of 110 mm: a length of 300 mm allows 0.015 mm according to [1], i.e. 0.015/300 = 5·10⁻⁵ mm per 1 mm of length, giving 0.0055 mm on 110 mm.

measured value: (0.00300 ± 0.00082) mm
value according to the standard: 0.0055 mm
the milling machine is satisfactory in terms of this test.

test g5a
this test was conducted over a distance of 1000 mm and was performed in the transverse direction. a coincidence spirit level was set up on the milling table and was gradually placed at three locations: first on the right side of the table, then in the middle, and finally on the left side. in each of these positions we made ten measurements. after these measurements, the coincidence spirit level was rotated by 180°, and the measurements were again performed ten times in the same places [4]. the measurement was based on observing the two half-images inside the coincidence spirit level, the deflection being recorded at the moment when the two half-images were brought into coincidence [2]. the resultant value was obtained as the difference between the maximum and minimum values.

figure 7: measurement scheme [1]

$$u_A = \frac{u_{A,p} + u_{A,l}}{2} = 4.7\cdot10^{-3}\ \mathrm{mm}, \qquad x = x_{\max} - x_{\min} = x_p - x_l = (0.1900 \pm 0.0047)\ \mathrm{mm},$$

with the partial results, each computed from the type-a formula above:

$$u_{A,p} = 4.43\cdot10^{-3}\ \mathrm{mm},\ x_p = (2.1200 \pm 0.0044)\ \mathrm{mm}; \qquad u_{A,s} = 1.0089\cdot10^{-2}\ \mathrm{mm},\ x_s = (2.076000 \pm 0.010089)\ \mathrm{mm}; \qquad u_{A,l} = 5.0\cdot10^{-3}\ \mathrm{mm},\ x_l = (1.930 \pm 0.005)\ \mathrm{mm}$$

(the subscripts p, s and l denote the right, middle and left positions of the level, respectively).

measured value: (0.1900 ± 0.0047) mm
value according to the standard: 0.04 mm
the milling machine is unsatisfactory in terms of this test.

test g4a
this test was conducted over a distance of 1000 mm, and was performed in the longitudinal direction. a coincidence spirit level was set up on the milling table and was gradually placed at three locations, in the same order as in the previous measurement. in each of these positions we made ten measurements [3]. in essence, the principle of the measurement was the same as in test g5a (above).

figure 8: measurement scheme [1]

$$u_A = \frac{u_{A,s} + u_{A,p}}{2} = 9.5\cdot10^{-3}\ \mathrm{mm}, \qquad x = x_{\max} - x_{\min} = x_s - x_p = (0.2000 \pm 0.0095)\ \mathrm{mm},$$

with the partial results

$$u_{A,p} = 0.011\ \mathrm{mm},\ x_p = (2.630 \pm 0.011)\ \mathrm{mm}; \qquad u_{A,s} = 7.9\cdot10^{-3}\ \mathrm{mm},\ x_s = (2.8300 \pm 0.0079)\ \mathrm{mm}; \qquad u_{A,l} = 8.9\cdot10^{-3}\ \mathrm{mm},\ x_l = (2.7100 \pm 0.0089)\ \mathrm{mm}$$

measured value: (0.2000 ± 0.0095) mm
value according to the standard: 0.06 mm
the milling machine is unsatisfactory in terms of this test.

test 8a and test 8b
this test was conducted over a distance of 500 mm, with individual readings every 50 mm. the measurement was based on the fact that the magnetic pedestal was attached to the structure of the mill and the point of the dial indicator touched the side surface of the guide groove. the measured deflection was noted every 50 mm while passing along the groove [5]. then we measured the opposite side of the guide groove (test 8b); the principle is exactly the same as in test 8a.
figure 9: measurement scheme [1]

test 8a: $u_A = 4.69\cdot10^{-3}\ \mathrm{mm}$, $x = \bar x \pm u_A = (0.0230 \pm 0.0047)\ \mathrm{mm}$
test 8b: $u_A = 1.95\cdot10^{-3}\ \mathrm{mm}$, $x = \bar x \pm u_A = (0.0073 \pm 0.0020)\ \mathrm{mm}$

measured value: (0.0230 ± 0.0047) mm; value according to the standard: 0.02 mm. the milling machine is unsatisfactory in terms of this test (test 8a).
measured value: (0.0073 ± 0.0020) mm; value according to the standard: 0.02 mm. the milling machine is satisfactory in terms of this test (test 8b).

test g17
this test was conducted on the right and on the left side of the milling table. first, a ruler had to be set up at a distance of 250 mm from the centre of the table towards the right side, and then towards the left side of the milling table. ten measurements were performed for the right side, and then ten measurements for the left side.

figure 10: measurement scheme [1]

$$u_{A,p} = 1.25\cdot10^{-3}\ \mathrm{mm},\ x_p = (0.0140 \pm 0.0013)\ \mathrm{mm}; \qquad u_{A,l} = 8.16\cdot10^{-4}\ \mathrm{mm},\ x_l = (0.00800 \pm 0.00082)\ \mathrm{mm}$$

measured value: (0.0140 ± 0.0013) mm; value according to the standard: 0.02 mm
measured value: (0.00800 ± 0.00082) mm; value according to the standard: 0.02 mm
the milling machine is satisfactory in terms of these tests.

conclusions
this paper has reported on tests of selected geometric precision parameters for a milling machine in the laboratories of the department of machining and assembly, faculty of mechanical engineering, všb – technical university of ostrava. the measurements were performed after selecting suitable tests and providing the necessary gauges and equipment for the tests. on the basis of the results of the geometric precision tests for a milling machine with a vertical spindle, the following conclusions can be drawn:
• the milling machine failed 5 out of the 10 tests that were performed.
• measurement of the middle guide groove, parallel to the table feed at a distance of 500 mm (test 8a): the main reason for the unsatisfactory result was a groove that ran through the whole side. the same measurement was therefore performed on the opposite side of the groove, and in this case compliance was achieved.
• measurement of the angular deviation movement on the y axis (test g5a) and on the x axis (test g4a): the main reason for the unsatisfactory results was poor setting up in the laboratory. it was not possible to bring the machine into the correct position, because the milling machine was not equipped with levelling screws.
• the peripheral whipping measurements — the internal taper spindle at the end of the spindle (test g11a) and at a distance of 110 mm from the end of the spindle (test g11b): the main reason for the unsatisfactory results was probably wear of the constant-ratio gears inside the milling head, or bearing wear. in order to eliminate this inaccuracy, it would be necessary to overhaul the milling head.

acknowledgement
this paper is an outcome of project no. cz.1.07/2.4.00/17.0082: increasing of professional skills by practical acquirements and knowledge, supported by the education for competitiveness operational programme, funded from european union structural funds and from the state budget of the czech republic.
references
[1] iso 10791-2: test conditions for machining centres — part 2: geometric tests for machines with vertical spindle or universal heads with vertical primary rotary axis (vertical z-axis).
[2] tichá, š.: strojírenská metrologie – část 1. ostrava: všb – tu ostrava, 2006, p. 112. isbn 80-248-0671-1.
[3] brázda, f.: kontrola vybraných parametrů geometrické přesnosti obráběcího stroje. bachelor project, katedra obrábění a montáže, všb – technical university of ostrava, 2006. supervisor: robert čep.
[4] čep, r., janásek, a., valíček, j., čepová, l.: testing of greenleaf ceramic cutting tools with an interrupted cutting. tehnički vjesnik – technical gazette, vol. 18, no. 3, 2011, p. 327–332. issn 1330-3651.
[5] iso 10791-1: test conditions for machining centres — part 1: geometric tests for machines with horizontal spindle and with accessory heads (horizontal z-axis).

statistics of electron avalanches and streamers

t. ficker

abstract
we have studied the severe systematic deviations of the populations of electron avalanches from the furry distribution, which has been held to be the statistical law corresponding to them, and a possible explanation has been sought. a new theoretical concept based on fractal avalanche multiplication has been proposed, and is shown to be a convenient candidate for explaining these deviations from furry statistics.

keywords: furry and pareto statistics, fractal multiplication of electron avalanches, fractal statistical pattern, fractal dimension.

1 introduction
electron avalanches, as precursors of stable gas discharges, have been studied from the very beginning of the 20th century. townsend was the first to make a major study of this type of non-self-sustained discharge. in the 1930s, in connection with streamer discharges, raether [1], loeb [2], and meek [3] built up a new streamer theory which incorporated the townsend electron avalanche as a necessary starting base for launching the streamer channel. according to this view, a single avalanche with a high electron population $n \in (10^8, 10^9)$ is accompanied by uv radiation intense enough to facilitate photoionisation as the major “driving force” of electron multiplication, while collisional ionisation successively loses its governing role. according to this theory, a single big townsend avalanche ($n \approx 10^8$) starts the streamer luminous channel of cold plasma that follows the track of the initial townsend avalanche.

at first sight it might seem that there are certain statistical similarities between streamers and townsend avalanches. but, surprisingly, this is not the case. both theoretical [4] and experimental results [1, 5–7] have shown that the electron populations $n$ of poor townsend avalanches ($n < 10^5$) developed inside a discharge gap of length $d$ are governed by the furry [8] probability density

$$w(n,d) = \frac{1}{\bar n}\left(1 - \frac{1}{\bar n}\right)^{n-1} \;\approx\; \frac{1}{\bar n}\exp\left(-\frac{n}{\bar n}\right) \quad (\bar n \gg 1), \qquad \bar n = \exp\left(\int_0^d \alpha(x)\,\mathrm dx\right), \tag{1}$$

whereas the populations of pre-streamer ($n \in (10^5, 10^8)$) and streamer ($n > 10^8$) avalanches are governed by another statistical law [9–17], different from that of furry. the experimental data from these highly populated avalanches is best fitted [11–17] by the pareto distribution function

$$w(n,d) = \mathrm{const}\cdot n^{-(1+D)}, \tag{2}$$

where $D$ is the so-called fractal dimension.

the different statistical behaviour of lowly and highly populated avalanches has been verified many times in our laboratory [11–17] when measuring the height statistics of dc partial discharges with sandwiched electrode systems. depending on the actual experimental conditions, partial discharges are in fact electron avalanches, often mixed with streamers (see fig. 1) and sometimes even developed into microscopic sparks.
fig. 1: streamer spots on the dielectric barrier

the height statistics of dc partial discharges are nothing other than statistical distributions of electron avalanches via their populations, i.e. via the number of charge carriers that they carry. fig. 2 shows one of our avalanche statistics measured with an ultra-fast digitiser [12, 13]. the avalanches were detected across a resistance (r = 100 kω, connected in series with the discharge gap (c), so that the two components formed a classical rc circuit – for more details see ref. [14]) as short voltage pulses with random heights u. since we did not calibrate the voltage pulses u against the number of electrons n, our resulting distribution curves w are functions of u instead of n. fig. 2 shows that the power function $w(u) = c_0\,u^{-(1+D)}$ represents an excellent fit of the measured data. assuming linear proportionality $u = c\cdot n$, our curve w(u) will preserve the same shape as the pareto form $w(n) = \mathrm{const}\cdot n^{-(1+D)}$, i.e. both will possess the same value of the exponent (1 + D).

fig. 2: avalanche statistics (e/p = 52 v pa⁻¹ m⁻¹, voltage pulses) in air at normal laboratory conditions

unfortunately, no theory has been developed so far to explain the different statistical behaviour of lowly and highly populated avalanches. to resolve this puzzle, it is necessary to find the corresponding physical process underlying the phenomenon. there are several important points that should be taken into account when forming a theoretical concept explaining the cross-over from furry statistics to pareto statistics:
• the change in the statistical behaviour occurs simultaneously with the change in electron multiplication, i.e., when photoionisation starts dominating over collisional ionisation and highly populated avalanches appear ($n > 10^5$).
• since all fractal objects are governed by pareto statistics, the electron multiplication mechanism that forms the pareto set of avalanches has to be of a fractal nature, too.
• the fractal photoionisation multiplication should be based on creating additional smaller avalanches accompanying the initial (parent) avalanche, because an increase in the electron populations within the parent avalanches leads only to an increase in the average population $\bar n$, which does not change the character of the furry distribution (1) itself.

a proposal for a convenient fractal mechanism of electron multiplication, capable of creating the pareto set of electron avalanches, is formulated in the next paragraphs of this paper. in addition, a derivation of the general statistical pattern which follows the pareto behaviour (2), together with its application to experimental data, will also be presented.
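to make the contrast concrete before introducing the model, here is a small sketch evaluating the exponential (furry) approximation of eq. (1) against the pareto form (2) for illustrative parameter values; in bilogarithmic co-ordinates the furry curve falls off exponentially beyond $\bar n$, while the pareto curve stays linear.

```cpp
#include <cmath>
#include <cstdio>

// exponential approximation of the Furry density, eq. (1)
double furry(double n, double nMean) {
    return std::exp(-n / nMean) / nMean;
}

// Pareto density, eq. (2); 'c' is a normalisation constant
double pareto(double n, double D, double c) {
    return c * std::pow(n, -(1.0 + D));
}

int main() {
    const double nMean = 1.0e5;       // illustrative mean population
    const double D = 0.63, c = 1.0;   // illustrative fractal dimension
    for (int e = 3; e <= 8; ++e) {
        double n = std::pow(10.0, e);
        std::printf("n=%8.0e  furry=%10.3e  pareto=%10.3e\n",
                    n, furry(n, nMean), pareto(n, D, c));
    }
    return 0;
}
```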
2 fractal multiplication of electron avalanches
on the basis of the experimental observations and deductions mentioned above, it is clear that the multiplication mechanism of highly populated avalanches, whose populations follow pareto's distribution, has to be governed by a fractal scenario very similar to the following:

(i) besides a parent avalanche, a series of additional smaller avalanches arises inside the discharge gap. these smaller avalanches are generated in a hierarchical manner with different mean populations $\bar n_{d,j}$ so as to fulfil the fractal scenario

$$\left\{\bar n_{d,j}\right\}_{j=0}^{j^*} = \left\{\mathrm e^{\alpha(d - j\delta)}\right\}_{j=0}^{j^*}, \qquad \delta > 0. \tag{3}$$

in this way the number of less populated avalanches increases and, as a consequence, deviations from the furry distribution may occur.

(ii) the multiplication of highly populated avalanches with the mean populations $\{\bar n_{d,j}\}_{j\geq 0}$ must be generated according to a fractal scenario of branching or partitioning, like most fractals when going to smaller scales. therefore, some type of fractal avalanche branching should be taken into account. the branching should originate with a parent avalanche possessing a mean population $\bar n_{d,0} = \mathrm e^{\alpha d}$ (the fractal initiator). after passing a certain initial distance $\delta > 0$ and gathering a certain number of energetic electrons $\bar n = \mathrm e^{\alpha\delta} > 1$, which are capable of creating a group of uv photons, a photoionisation process starts and a swarm of $\bar k > 1$ smaller avalanches with mean populations $\bar n_{d,1} = \mathrm e^{\alpha(d-\delta)}$ appears along the parent avalanche. let us call them “side avalanches of the first generation”. the side avalanches of the first generation actually represent the so-called fractal generator, given by the multiplicity $\bar k$. the side avalanches, once created, become parent avalanches for the next generation of new side avalanches. so, the side avalanches of the first generation become parent avalanches for the side avalanches of the second fractal generation, with the mean population $\bar n_{d,2} = \mathrm e^{\alpha(d-2\delta)}$. this process of avalanche multiplication may or may not continue up to the last possible generation $j^* = d/\delta - 1$ with the mean population $\bar n_{d,j^*} = \mathrm e^{\alpha(d - j^*\delta)}$. provided the multiplying process reaches the j-th generation, the mean (average) total number of side avalanches in this generation is just $\bar k^j$. the described multiplicative process yields a hierarchy of avalanches and, when extended to infinity ($j \to \infty$), it yields an infinite set of avalanches that is similar to the well-known cantor fractal set [18]. using this similarity, a relation between the avalanche characteristics and the properties of the cantor fractal set can easily be found; in particular, the fractal dimension of the avalanche set reads

$$D = \frac{\ln \bar k}{\ln \bar n}. \tag{4}$$

since all fractal objects obey pareto statistics with a probability density in the form of the power law (2), the studied avalanche set, being of a fractal nature, will also follow this statistical law

$$w(n, x) = \mathrm{const}\cdot n^{-(1+D)} = \mathrm{const}\cdot n^{-(1+\ln\bar k/\ln\bar n)}. \tag{5}$$

such a strictly deterministic mechanism as described above could hardly be expected in a real situation. instead, a strongly stochastic mechanism is more probable, with certain distributions of the quantities $\delta$, $k$, and $n$. however, the use of their average values $\bar\delta$, $\bar k$, and $\bar n$ makes the treatment more realistic and partly advocates the deterministic view of the problem.
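points (i) and (ii) can be turned into a few lines of code. the sketch below uses illustrative parameters (not the measured ones) to tabulate the cascade: generation j contributes $\bar k^j$ side avalanches of mean population $\mathrm e^{\alpha(d - j\delta)}$, and the fractal dimension is $D = \ln\bar k/\ln\bar n$ with $\bar n = \mathrm e^{\alpha\delta}$.

```cpp
#include <cmath>
#include <cstdio>

int main() {
    // illustrative cascade parameters (not taken from the paper's data):
    const double alpha = 1.0;   // effective ionisation coefficient [1/mm]
    const double delta = 1.0;   // spacing of photoionisation steps [mm]
    const double d     = 12.0;  // gap length [mm]
    const double kBar  = 2.0;   // mean multiplicity of side avalanches
    const int jMax = static_cast<int>(d / delta) - 1;

    // generation j: kBar^j avalanches of mean population e^{alpha(d - j*delta)}
    for (int j = 0; j <= jMax; ++j) {
        double nMean = std::exp(alpha * (d - j * delta));
        double count = std::pow(kBar, j);
        std::printf("j=%2d  n=%12.4e  count=%12.4e\n", j, nMean, count);
    }
    // slope of log(count) vs log(n) is -ln(kBar)/(alpha*delta) = -D
    std::printf("D = %.4f\n", std::log(kBar) / (alpha * delta));
    return 0;
}
```

plotting count against population in bilogarithmic co-ordinates gives a straight line of slope $-D$; the probability density, which divides each count by the corresponding population scale, then carries the exponent $-(1+D)$, as eq. (5) requires.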
(iii) the described fractal mechanism of multiplication of highly populated avalanches anticipates that the most probable place where a parent avalanche initiates side avalanches is in some of the first $\delta$-intervals, because at larger distances the parent avalanche is, due to diffusion, broadened enough to absorb the side avalanches.

the foregoing paragraphs have summarised the main properties of the concept of fractal multiplication of highly populated avalanches. if all these assumptions are sound, the derivation of a new statistical pattern using the existence of side avalanches should be fruitful. naturally, this new pattern must be capable of generating linear behaviour in bilogarithmic co-ordinates with the slope $-(1 + \ln\bar k/\ln\bar n)$. such a derivation is realised in the next section.

3 statistical pattern of fractal avalanche multiplication
with reference to points (i)–(iii), the probability densities for the generations of side avalanches can be formed as follows.

j = 0: zero generation – parent avalanche

$$w_0(n, d) = \frac{1}{\bar n_{d,0}}\left(1 - \frac{1}{\bar n_{d,0}}\right)^{n-1}. \tag{6}$$

j = 1: the first generation of side avalanches

$$w_1(n, d) = \frac{\bar k}{\bar n_{d,1}}\left(1 - \frac{1}{\bar n_{d,1}}\right)^{n-1}. \tag{7}$$

j = 2: the second generation of side avalanches

$$w_2(n, d) = \frac{\bar k^2}{\bar n_{d,2}}\left(1 - \frac{1}{\bar n_{d,2}}\right)^{n-1}. \tag{8}$$

j: the j-th generation of side avalanches

$$w_j(n, d) = \frac{\bar k^j}{\bar n_{d,j}}\left(1 - \frac{1}{\bar n_{d,j}}\right)^{n-1}, \tag{9}$$

where

$$\bar n_{d,j} = \mathrm e^{\alpha(d - j\delta)} = \bar n_{d,0}\,\mathrm e^{-\alpha j\delta}, \qquad j = 0, 1, 2, \ldots \tag{10}$$

the probability density f(n, d) measured at the anode, which is placed at a distance d from the cathode, is given as the sum of all the probability densities of the avalanches created within the discharge gap

$$f(n, d) = \sum_{j=0}^{j^*} w_j(n, d) = \sum_{j=0}^{j^*} \frac{\bar k^j}{\bar n_{d,j}}\left(1 - \frac{1}{\bar n_{d,j}}\right)^{n-1}, \tag{11}$$

where

$$j^* = \frac{d}{\delta} - 1, \qquad \bar n > 1, \qquad \bar k \geq 1. \tag{12}$$

within the exponential approximation (1), the total probability density (11) reads

$$f(n, d) = \sum_{j=0}^{j^*} \frac{\bar k^j}{\bar n_{d,j}}\exp\left(-\frac{n}{\bar n_{d,j}}\right). \tag{13}$$

relations (11) and (13) are generalised statistical distributions that include the original furry distribution (1) and its exponential approximation as special cases (j = 0), when no side avalanches are generated.

at first sight the probability densities (11)/(13) seem unlikely to follow the pareto power law (2), but the opposite is true. when these functions are plotted with convenient parameters $\bar n_d$, $\bar k$, and $\bar n$ in bilogarithmic co-ordinate systems, their graphs indeed show linear sections (power behaviour) spanning many orders of magnitude of electron populations – see fig. 3. the relation for the slope, $s = -\left(1 + \frac{\ln\bar k}{\ln\bar n}\right) = -1.6309$ for the parameters used, is also fulfilled, as can be verified from fig. 3.

nevertheless, to provide fully rigorous grounds for the fractality (power-law behaviour) of the generalised probability density f(n, d), it would be necessary to perform a full mathematical proof of this property. this would not be an original task, as several researchers [19–28] have worked on a similar problem, though with different starting functions (usually of lévy type).
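the power-law section of eq. (13) can also be checked numerically. the sketch below evaluates the sum for illustrative parameters chosen so that $\ln\bar k/\ln\bar n = \ln 2/\ln 3 \approx 0.6309$, and estimates the bilogarithmic slope, which should come out near $-1.6309$ (small log-periodic oscillations of the discrete sum may shift the two-point estimate slightly).

```cpp
#include <cmath>
#include <cstdio>

// generalised density, exponential approximation, eq. (13):
// f(n,d) = sum_j kBar^j / nMean_j * exp(-n / nMean_j)
double f(double n, double alpha, double delta, double d, double kBar) {
    const int jMax = static_cast<int>(d / delta) - 1;
    double sum = 0.0;
    for (int j = 0; j <= jMax; ++j) {
        double nMean = std::exp(alpha * (d - j * delta));
        sum += std::pow(kBar, j) / nMean * std::exp(-n / nMean);
    }
    return sum;
}

int main() {
    // illustrative parameters: nBar = e^{alpha*delta} = 3, kBar = 2
    const double alpha = 1.0, delta = std::log(3.0), d = 20.0 * delta;
    const double kBar = 2.0;
    // estimate the bilogarithmic slope over the power-law section
    double n1 = 1.0e3, n2 = 1.0e6;
    double s = (std::log(f(n2, alpha, delta, d, kBar)) -
                std::log(f(n1, alpha, delta, d, kBar))) /
               (std::log(n2) - std::log(n1));
    std::printf("slope = %.4f (expected about -%.4f)\n",
                s, 1.0 + std::log(2.0) / std::log(3.0)); // about -1.6309
    return 0;
}
```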
professor t. f. nonnenmacher – inspired probably by the theoretical works [19], [23] dealing with the origin of fractal scaling laws in biophysics – systematically studied [24–28] a similar problem that emerged from protein gating kinetics. as a prominent mathematical physicist, he succeeded in finding an original way to transform functions that assume the form of a discrete exponential chain, like series (13), into power-law functions. the required exact mathematical proof can therefore be found in his works [24–28].

finally, to illustrate straightforwardly the capability of the generalised statistical pattern (13) to provide a faithful fit for experimental population statistics, the experimental data from fig. 2 have been fitted by pattern (13) in bilogarithmic co-ordinates – see fig. 4. as can be seen from the two figures, the fractal dimensions are in excellent agreement (≈ 0.471 versus ≈ 0.473). at this point we would like to mention that all the other data measured in our laboratory have been processed in the same way, showing the clear capability of the statistical pattern (13) to reproduce faithfully the statistical data of highly populated electron avalanches. in brief, the generalised statistical pattern (13) seems to provide a very convenient approximation capable of incorporating all the main specific features of fractal avalanche multiplication.

fig. 3: fractal statistical pattern developed into a power law
fig. 4: fractal pattern fit of pre-streamer statistics (voltage pulses – data taken from fig. 2)

4 conclusions
in conclusion we would like to underline several main points that have been presented in this paper:
• a new concept of fractal multiplication of highly populated avalanches (streamer-like avalanches) has been proposed. the concept is based on a generalised photoionisation mechanism leading to side branching of avalanches.
• the branching may propagate to higher generations of side avalanches. this process is inherently stochastic and requires the introduction of the average multiplicity $\bar k$ and the average number $\bar n$ of initiating electrons to provide an analytical description of the branching procedure.
• the generalised statistical pattern (13), representing the probability density of the avalanches created in a discharge gap, has been derived. this pattern includes the original furry distribution (1) and its exponential approximation as special cases (j = 0), when no side avalanches are generated.
• the capability of the generalised statistical pattern (13) to fit faithfully experimental data from avalanche experiments has also been illustrated (fig. 4).

acknowledgments
this work has been supported by the grant agency of the czech republic under grant no. 202/07/1207.

references
[1] raether, h.: electron avalanches and breakdown in gases. london: butterworths, 1964.
[2] loeb, l. b.: basic processes of gaseous electronics. berkeley, ca: university of california press, 1960.
[3] meek, j. m., craggs, j. d.: electrical breakdown of gases. new york: wiley, 1978.
[4] wijsman, r.: phys. rev., vol. 75 (1949), p. 833.
[5] van brunt, r. j.: ieee trans. el. insul., vol. 26 (1991), p. 902.
[6] frommhold, l.: zeitschrift für physik, vol. 150 (1958), p. 172.
[7] schlumbohm, h.: zeitschrift für physik, vol. 152 (1958), p. 49.
[8] furry, w. h.: phys. rev., vol. 52 (1937), p. 569.
[9] schlumbohm, h.: zeitschrift für physik, vol. 151 (1958), p. 563.
[10] richter, k.: zeitschrift für physik, vol. 158 (1960), p. 312.
[11] ficker, t.: j. appl. phys., vol. 78 (1995), p. 5289.
[12] ficker, t., macur, j., kliment, m., filip, s., pazdera, l.: j. el. eng., vol. 51 (2000), p. 240.
[13] ficker, t., macur, j., pazdera, l., kliment, m., filip, s.: ieee trans. diel. el. insul., vol. 8 (2001), p. 220.
[14] ficker, t.: ieee trans. diel. el. insul., vol. 10 (2003), p. 689.
[15] ficker, t.: ieee trans. diel. el. insul., vol. 10 (2003), p. 700.
[16] ficker, t., macur, j., kapička, v.: czech. j. phys., vol. 53 (2003), p. 509.
[17] ficker, t.: ieee trans. diel. el. insul., vol. 11 (2004), p. 136.
[18] mandelbrot, b. b.: the fractal geometry of nature. new york: freeman, 1983.
[19] montroll, e. w., shlesinger, m. f.: proc. natl. acad. sci., vol. 79 (1982), p. 3380.
[20] shlesinger, m. f.: j. stat. phys., vol. 36 (1984), p. 639.
[21] montroll, e. w., bendler, j. t.: j. stat. phys., vol. 34 (1984), p. 129.
[22] shlesinger, m. f., west, b. j., klafter, j.: phys. rev. lett., vol. 50 (1988), p. 1100.
[23] west, b. j., bhargava, v., goldberger, a. l.: j. appl. physiol., vol. 60 (1988), p. 1089.
[24] nonnenmacher, t. f., nonnenmacher, d. j. f.: in: albeverio, s., casati, g., cattaneo, u., merlini, d., moresi, r. (eds.): stochastic processes, physics and geometry. proc. of the 2nd ascona/locarno conference: world scientific, july 1988.
[25] nonnenmacher, t. f., nonnenmacher, d. j. f.: phys. lett. a, vol. 140 (1989), p. 323.
[26] nonnenmacher, t. f.: coll. polym. sci., vol. 267 (1989), p. 753.
[27] nonnenmacher, t. f.: coll. polym. sci., vol. 268 (1990), p. 401.
[28] nonnenmacher, t. f., losa, g. a., weibel, e. r. (eds.): fractals in biology and medicine. basel: birkhäuser verlag, 1994.

prof. rndr. tomáš ficker, drsc.
phone: +420 541 147 661
e-mail: ficker.t@fce.vutbr.cz
department of physics, faculty of civil engineering, university of technology, žižkova 17, 662 37 brno, czech republic

differences between doped and undoped zirconium alloy oxide layers

h. frank

abstract
two samples, one of undoped zr1nb and one of zry-4w doped with additives of sn, fe and cr, were oxidized for 42 days at 360 °c, forming oxide layers 1.9 μm in thickness. the i-v characteristics, measured from room temperature up to 160 °c, were used to compare the transport properties in order to assess the influence of doping. in both samples the activation energy was equal to 1.2 ev, and the temperature dependences of the resistivity, the electron mobility and the carrier concentration, as well as the behavior of the injection and extraction currents, were found to be equal within the error limits, thus proving the doping to be ineffective. zirconium oxide fits into the group of oxide semiconductors, being an n-type reduction semiconductor, with conduction depending on the stoichiometric deviation, i.e. on missing oxygen.

keywords: zirconium alloy oxide layers, doped and undoped samples, i-v characteristics, temperature dependence of transport parameters, activation energy, injection and extraction currents and their time dependence, reduction semiconductor.

1 introduction
in normal atomic semiconductors, e.g. si, the conductivity is sensitive to doping with atoms of different valency, whereas in oxidic semiconductors the conductivity depends rather on stoichiometric deviations and less on doping [1]. it is the aim of this work to assess the influence of alloying atoms acting as dopants in oxide layers.
there were two samples: undoped zr1nb (alloying with 1 % of niobium does not introduce doping centers), and zry-4w without nb but alloyed with sn, fe and cr at 1.46, 0.2 and 0.1 wt %, respectively. two tube specimens, 30 mm in length and 9 mm in outer diameter, were oxidized under vver conditions (tab. 1).

table 1: characterization of samples
sample number | type | short name | medium | temperature (°c) | time (d) | thickness (μm)
e 110g 6136268 | zry-4w | d 61 (doped) | vver | 360 | 42 | 1.91
e 110 7136108 | zr1nb | u 71 (undoped) | vver | 360 | 42 | 1.90

2 experimental
the oxide on the end faces of the tubes was ground off for good contact. painted-on electrodes of colloidal silver, 6.0 mm in diameter, were used. the samples were mounted in a mini-thermostat for measurement at temperatures up to 160 °c. details of the measuring procedure are given in [2]. the relative permittivity was calculated from the electrode capacity measured at 1000 hz with known geometrical factors. the levels were extremely low: 12.4 and 8.3, for d 61 and u 71, respectively.

the v-i characteristics were symmetrical; therefore only the forward-voltage branch, with the positive terminal connected to the zirconium metal, was measured, with voltage steps of 0.5 v up to 7 v, either at constant temperatures in steps of 20 °c or with continually increasing temperature. at lower temperatures, the samples had normal i-v characteristics of the form of eq. (1), with space-charge limited currents, which tended to become linear at higher temperatures:

$$i = au^2 + bu + c. \tag{1}$$

eq. (1) can be used to compute the resistivity ρ, the mobility μ and the carrier concentration n. further details concerning the theoretical aspects are given in [3, 4].

3 results and discussion
in order to stabilize the colloidal ag contacts, it was necessary to anneal the layers at a temperature of over 100 °c. the temperature was continually increased, and at each rise of 5 °c the zero current was recorded; at 140 °c the temperature was decreased again. in both samples, the zero current was constant at about 12 pa up to 80 °c and then increased to over 1000 pa at 140 °c. at the second heating, the zero current grew steadily in an exponential manner. the current measured with a constant voltage of 0.5 v and continually increasing temperature gave straight lines in the plot of log current against reciprocal temperature, allowing computation of the activation energies. in d 61, 1.19 ev was found for increasing temperature, and 1.32 ev for decreasing temperature. similar behavior in u 71 gave 1.10 ev and 1.31 ev, respectively. taking time-dependent deviations into account, the mean value for both samples is 1.2 ev.

fig. 1: temperature dependence of i-v characteristics, d 61 (doped)
fig. 2: temperature dependence of i-v characteristics, u 71 (undoped)
fig. 3: linear i-v characteristics at higher temperatures, d 61 (doped)
fig. 4: as in fig. 3, u 71 (undoped)

due to the small thickness, the critical field strength is reached already at about 1 v, and therefore the i-v characteristics were measured only up to 7 v; they were found to be very similar in both samples. the characteristics (figs. 1, 2) of both samples are very similar, and their values are shown in tab. 2.

table 2: parameters of both samples at room temperature (figs. 3, 4)
sample | thickness (μm) | resistivity (ωcm) | mobility (cm²/vs) | concentration (cm⁻³)
d 61 | 1.9 | 11.3·10¹⁴ | 2.4·10⁻¹⁰ | 2.1·10¹⁴
u 71 | 1.90 | 1.07·10¹⁴ | 2.8·10⁻¹⁰ | 2.1·10¹⁴
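the activation energy follows from the arrhenius-type slope of log ρ (or log i) against the reciprocal absolute temperature. a minimal sketch with two (t, ρ) points follows; the values below are illustrative, constructed to be consistent with 1.2 ev, not measured data.

```cpp
#include <cmath>
#include <cstdio>

// Arrhenius-type estimate of the activation energy from two
// (temperature, resistivity) points: rho = rho0 * exp(Ea / (kB*T))
double activationEnergyEv(double T1, double rho1, double T2, double rho2) {
    const double kB = 8.617e-5;              // Boltzmann constant [eV/K]
    return kB * std::log(rho1 / rho2) / (1.0 / T1 - 1.0 / T2);
}

int main() {
    // illustrative pair constructed to be consistent with Ea = 1.2 eV:
    double T1 = 295.0, T2 = 395.0;           // [K]
    double rho1 = 1.0e14;                    // [Ohm*cm]
    double rho2 = rho1 * std::exp(1.2 / 8.617e-5 * (1.0 / T2 - 1.0 / T1));
    std::printf("Ea = %.2f eV\n", activationEnergyEv(T1, rho1, T2, rho2));
    return 0;
}
```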
the i-v characteristics measured at constant temperatures were taken as base values for assessing the resistivity, the mobility, the carrier concentration and the activation energy. as can be seen by comparing figs. 1–4, the behavior of both samples is very similar: at lower temperatures, up to 80 °c, eq. (1) is obeyed, giving second-order curves, whereas at higher temperatures a linear dependence of the current on the voltage is observed. the change from the quadratic to the linear form is shown in fig. 5, where at 101 °c the change takes place at 3.5 v. plotting log ρ as a function of the reciprocal absolute temperature (fig. 6) gives the same activation energy of 1.2 ev, proving the ineffectiveness of the doping. the similarity of the two samples is confirmed by the practically identical temperature dependences.

fig. 5: at 100 °c (and higher), the current rises in a linear way (beginning at 3.5 v)
fig. 6: temperature dependence of resistivity using data of the i-v characteristics (each sample measured twice to prove reproducibility); the activation energy of resistivity is equal in both samples
fig. 7: equal temperature dependence of mobility in both samples
fig. 8: equal temperature dependence of carrier concentration in both samples

the time dependence of the injection and extraction currents was compared in the two samples, and was found at room and higher temperatures to be very nearly equal, as can be seen in figs. 9, 10. as shown in figs. 9, 10, on application of voltage a high starting current, injecting most of the carriers, gradually diminishes, building up a space charge, until a stable equilibrium current obeying eq. (1) is reached after a certain time. when the contacts with the picoamperemeter are shorted, the injected space charge flows out and gives rise to a negative extraction current, which is equal to the former (positive) injection current and obeys the power law [5]

$$i = b\,t^{-n}, \tag{2}$$

with time t and exponent n < 1. the extracted charge q can be computed by integration,

$$q = \int_{t_1}^{t_2} b\,t^{-n}\,\mathrm dt = \frac{b\left(t_2^{1-n} - t_1^{1-n}\right)}{1-n}. \tag{3}$$

fig. 9: d 61, time dependence of the injection current after application of a constant voltage, and of the extraction current measured without driving voltage, electrodes shorted, using a pico-amperemeter
fig. 10: u 71, as in fig. 9, with lower injection voltage

the slopes of the straight lines in figs. 9, 10 give the increase of the extracted charge per unit voltage at injection, dq/du = c. this means that the oxide layer behaves like a capacitor which can be charged and discharged. the temperature dependence of the exponent n in eq. (2) is shown in fig. 11, and is equal for both samples within the measuring errors. the extracted specific charges c for both samples are also equal, as can be seen in fig. 12.

fig. 11: comparison of the temperature-dependent exponents n
fig. 12: comparison of the extracted charge (integrated extraction currents)
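eqs. (2) and (3) are easy to verify numerically: the sketch below integrates the power-law extraction current both analytically and by the trapezoid rule, for illustrative values of b, n and the time window.

```cpp
#include <cmath>
#include <cstdio>

// extraction current i(t) = b * t^(-n), eq. (2), with n < 1;
// extracted charge between t1 and t2, eq. (3)
double chargeAnalytic(double b, double n, double t1, double t2) {
    return b * (std::pow(t2, 1.0 - n) - std::pow(t1, 1.0 - n)) / (1.0 - n);
}

double chargeNumeric(double b, double n, double t1, double t2, int steps) {
    double h = (t2 - t1) / steps, q = 0.0;
    for (int i = 0; i < steps; ++i) {
        double a = t1 + i * h, c = a + h;
        q += 0.5 * h * (b * std::pow(a, -n) + b * std::pow(c, -n));
    }
    return q;
}

int main() {
    // illustrative values: b in pA*s^n, n dimensionless, t in s -> q in pC
    double b = 100.0, n = 0.8, t1 = 1.0, t2 = 1000.0;
    std::printf("q = %.3f pC (analytic) vs %.3f pC (trapezoid)\n",
                chargeAnalytic(b, n, t1, t2),
                chargeNumeric(b, n, t1, t2, 100000));
    return 0;
}
```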
table 3: comparison of parameters (22 °c)
parameter | d 61 | u 71
ρ (ωcm) | 1.73·10¹⁴ | 8.94·10¹³
activation energy of ρ (ev) | 1.21 | 1.20
μ (cm²/vs) | 1.44·10⁻¹⁰ | 1.98·10⁻¹⁰
activation energy of μ (ev) | −0.68 | −0.67
n (cm⁻³) | 2.7·10¹⁴ | 2.3·10¹⁴
activation energy of n (ev) | −0.48 | −0.49

4 conclusions
comparing the two samples shows a striking similarity of all parameters. values taken at room temperature are given in tab. 3; at higher temperatures they change exponentially, and the exponents are equal. taking the measurement errors into account, the sums of the exponents of the mobility and the carrier concentration are equal to the activation energy of 1.2 ev for both samples, as follows from the resistivity ρ = 1/(enμ). the resistivity of the doped sample, contrary to expectation, is about twice as high as that of the undoped sample, maybe due to the doping atoms filling up the vacancies. the differences of the parameters, expressed by their ratio d/u, are of the order of 10 %. the sum of the ratios for μ and n has the same (absolute) value as that for ρ. even the injection and extraction currents are of the same form and value over the whole temperature range. at temperatures over 100 °c, a positive zero current (pico-amperemeter connected directly to the sample electrodes, without an external voltage source) of similar magnitude, indicating continuing oxidation, is observed in both samples (an extraction current would be negative). only the relative permittivity is different, being 8.3 and 12.4 for d 61 and u 71, respectively.

from all these observed equalities it follows that zirconium oxide fits into the group of oxidic semiconductors, where the (low) conductivity is provoked by stoichiometric deviations and not by doping. zro2 is an n-type reduction semiconductor, with conduction depending on missing oxygen. at higher temperatures, an additional component of ionic conduction by oxygen ions can be observed.

acknowledgement
support for this work from ujp, praha, a.s., and from msm grant 680770015 is greatly appreciated. special thanks are due to mrs. v. vrtilková for providing the specimens of measured thickness.

references
[1] hintenberger, h.: z. phys., 119, 1 (1942).
[2] frank, h.: j. nucl. mater., 340, 119 (2005).
[3] mott, n. f., gurney, r. w.: electronic processes in ionic crystals. clarendon, oxford (1940).
[4] gould, r. d.: j. appl. phys., 53, 3353 (1982).
[5] frank, h.: acta physica slovaca, 55, 341 (2005).

prof. rndr. helmar frank
phone: +420 224 358 559
e-mail: helmarfrank@fjfi.cvut.cz
department of solid state engineering, faculty of nuclear sciences and physical engineering, czech technical university, trojanova 13, prague 2, czech republic

design process of energy effective shredding machines for biomass treatment

juraj beniak, juraj ondruška, viliam čačko
faculty of mechanical engineering, slovak university of technology in bratislava, institute of production systems, environmental technology and quality management, nám. slobody 17, bratislava, slovakia
corresponding author: juraj.beniak@stuba.sk

abstract
the shredding process has not been sufficiently investigated for the design of better, energy and material saving shredding machines. in connection with present-day concern about the environment, ecology, energy saving, recycling, and finding new sources of energy, we need to look at the design of shredding machinery, the efficiency of the machines that we are using, and ways of improving them to save the electric energy needed for their operation. this paper deals with sizing and designing shredding machines from the point of view of energy consumption and optimization for specific types of processed material.

keywords: shredding, disintegration, shredding machines.

1 introduction
research and development of new types of machinery requires knowledge of each structural node and each machine part. on the basis of this knowledge, and with the help of practical experience, we can optimize the design of the device, minimize its input power and achieve optimal performance.
a feature of shredding (disintegrating) machinery is the broad range of disintegration processes and of raw materials that are processed. disintegration technology is stochastic as regards both the basic principle of disintegration and the raw material that is to be processed. this is the main reason why the field has received such sparse scientific investigation, and why the design process for new machinery is mainly based on experience. the design and production of new types of devices has been held back by this lack of research and development.

in order to be able to describe the process, it is necessary to make measurements on specially adapted devices. an experimental stand was therefore designed for measuring the basic parameters that influence the disintegration process. the stand was designed to allow direct and indirect measurement of the analyzed parameters: we can make measurements on the wedge to determine directly the force necessary for disintegrating a material, or we can determine the force indirectly by converting the torque moment, which can be measured on the clutch between the drive and the measurement stand.

a methodology and an experimental plan were designed [3]. we used the complete plan of the experiment to measure the torque moment, with four selected variables, and we used the analysis of means (anom) and the analysis of variance to assess the significance of these factors [7]. the experimental tests supported the working hypothesis [1] that the face angle also affects the value of the disintegrative force, and the aim of our experiment was to verify this hypothesis. the primary assumption, based on an analysis of the theoretical problem, was that the size of the tool angles, together with the other cutting conditions, is decisive for the productivity of tools and machines, and for the cost-effectiveness of all types of material working [8], see figure 1. incorrectly selected cutting angles can accelerate blunting of the tool, reduce the lifetime of the machine, increase the cutting resistance, and impair the productivity and cost-effectiveness of the machinery. figure 2 shows the geometry of a disintegrative tool [6] (α — back angle; β — disintegrative wedge angle; γ — face angle).

figure 1: impact of the geometry of a disintegrative tool on input power [6]: 1 — hardwood, 2, 3 — softwood.
figure 2: geometry of a disintegrative wedge.

2 impact assessment of selected parameters
based on the proposed experiment plan, one of the measured parameters was the impact of changes in the face angle γ on the torque moment that is necessary for disintegrating the material samples. on the basis of the measurement results, modifications were made to the basic form of the mathematical model describing the disintegration process, and the following form was reached [1]:

$$M_k = \tau\, r\, S_m (1 - \tan\gamma), \qquad F_{d1} = \tau\, S_m (1 - \tan\gamma),$$

where $M_k$ is the torque moment (in nm); τ is the shear strength of the material (in mpa); r is the disintegrative disk radius (in mm); $S_m$ is the disintegrative surface area (in mm²); γ is the face angle (in degrees); and $F_{d1}$ is the disintegrative force for a single wedge (in n).
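the model is easy to exercise numerically. in the sketch below the product τ·r·s_m is chosen so that γ = 0 gives roughly the 473.9 nm of table 1 (the individual values of τ, r and s_m are our assumptions, as the paper does not list them), and the power is taken as p = m_k·ω with a rotor speed inferred from the tabulated p/m_k ratio, again our assumption.

```cpp
#include <cmath>
#include <cstdio>

int main() {
    // model: Mk = tau * r * Sm * (1 - tan(gamma)); illustrative inputs
    // chosen so that gamma = 0 reproduces roughly the 473.9 Nm of table 1
    const double tau = 7.0;      // shear strength [MPa = N/mm^2] (assumed)
    const double r   = 100.0;    // disintegrative disk radius [mm] (assumed)
    const double sm  = 677.0;    // disintegrative surface area [mm^2] (assumed)
    const double omega = 4.64;   // rotor speed [rad/s], inferred from P/Mk
    const double pi = 3.14159265358979;

    for (int gammaDeg = -10; gammaDeg <= 40; gammaDeg += 10) {
        double gamma = gammaDeg * pi / 180.0;
        double mk = tau * r * sm * (1.0 - std::tan(gamma)) / 1000.0; // [Nm]
        std::printf("gamma = %3d deg  Mk = %7.1f Nm  P = %5.2f kW\n",
                    gammaDeg, mk, mk * omega / 1000.0);
    }
    return 0;
}
```

with these inputs the computed torques track the whole of table 1 closely, which suggests that the tabulated values were generated directly from the model.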
the mathematical model was created on the basis of experimental measurements that took into account the geometry of the disintegrative wedge (figure 2: face angle γ and back angle α), the rotor rotating frequency n, and the cross-section area $S_m$ of the disintegrated material, which reflects the width b and the height h of the disintegrative wedge and the thickness $h_m$ of the processed material. the formula expresses how the parameters influence the force (or, more precisely, the torque moment and the input power) necessary to disintegrate the materials; the relations among the parameters are also expressed numerically (tables 1–4). the rotor rotating frequency does not appear in the final model, because this parameter was evaluated as insignificant, and we therefore neglected it in the rest of our study. the other parameters, e.g. the width and height of the wedge and the thickness of the processed material, are contained in the parameter $S_m$.

table 1: values of the torque moment and power for the selected face angle.
γ (in °) | mk (in nm) | p (in kw)
−10 | 557.5 | 2.6
0 | 473.9 | 2.2
10 | 390.4 | 1.8
20 | 301.4 | 1.4
30 | 200.3 | 0.9
40 | 76.3 | 0.4

figure 3: torque and input power in relation to the shredding face angle.

the most obvious and biggest impact on the necessary input power is that of the face angle γ (figure 3, table 1). the bigger this angle is, the smaller the force necessary to overcome the resistance of the material into which the wedge penetrates. this is because during disintegration with face angle γ = −10° the whole surface of the tool face presses on the material, so there is a bigger surface penetrating the disintegrated material, whereas with face angle γ = 40° the wedge penetrates the material progressively: it does not need to disintegrate a big section all at once, but can disintegrate it progressively. this was evident not only from the measured values, but also visually and acoustically, according to how the device was loaded. with a wedge face angle of 40°, the device runs considerably more easily and more smoothly. looking at the power needed as the face angle changes, we can observe that as the face angle increases the necessary input power decreases; over the reported range the total change is as much as a factor of 6.5. in comparison with the other parameters (tables 2–4), this is the biggest change.

table 2: torque moment and power values for selected disk radii.
r (in mm) | mk (in nm) | p (in kw)
70 | 335.0 | 1.6
80 | 382.8 | 1.8
90 | 430.7 | 2.0
100 | 478.6 | 2.3
110 | 526.4 | 2.5
120 | 574.3 | 2.7
130 | 622.1 | 2.9

figure 4: torque and input power in relation to the radius of the shredding disk.

table 3: torque moment and power values for selected wedge heights.
h (in mm) | mk (in nm) | p (in kw)
16 | 546.5 | 2.6
18 | 590.6 | 2.8
20 | 634.6 | 3.0
22 | 678.7 | 3.2
24 | 722.8 | 3.4
26 | 766.9 | 3.6
28 | 810.9 | 3.8

figure 5: torque and input power in relation to the height of the shredding wedge.

table 4: torque moment and power values for selected wedge widths.
b (in mm) | mk (in nm) | p (in kw)
16 | 539.9 | 2.5
18 | 561.9 | 2.6
20 | 584.0 | 2.8
22 | 606.0 | 2.9
24 | 628.0 | 3.0
26 | 650.1 | 3.1
28 | 672.1 | 3.2

figure 6: torque and input power in relation to the width of the shredding wedge.

the experiment presented in [2] was performed with spruce wood samples with various cross-sections. however, it is also necessary to design an experiment for other types of materials and for other subsidiary parameters that can affect the process of material disintegration, and thus increase the energy consumed in these processes (by these devices). the complete plan of the experiment presented in [1], for eight selected factors that can be changed on the measuring stand, with a minimum of three levels for each factor and ten repeated measurements for each combination, gives us a total of 65610 repetitions, which is not a practicable number. in practice, we used not more than 5 factors for the complete plan of the experiment. for this reason, incomplete and reduced experimental plans are used [3, 4, 5], or specific experimental plans, as e.g. in the experiment design proposed by taguchi [9, 10]. the experimental plan designed by taguchi for eight factors, where one factor has two levels and the seven other factors have three levels, with ten repetitions for each combination, requires a total of 180 experiments (measurements). this is much more acceptable, and can also be performed much more quickly than the complete experimental plan (65610 repetitions); the counts are worked out below.
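the run counts quoted above follow directly. the 180-run plan corresponds to the standard taguchi l18 orthogonal array, which accommodates one two-level and seven three-level factors in 18 runs, repeated here ten times:

```latex
\[
N_{\mathrm{full}} = 3^{8}\times 10 = 6561\times 10 = 65\,610,
\qquad
N_{\mathrm{L18}} = 18\times 10 = 180 .
\]
```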
the experimental plan designed by taguchy for h (in mm) mk (in nm) p (in kw) 16 546.5 2.6 18 590.6 2.8 20 634.6 3.0 22 678.7 3.2 24 722.8 3.4 26 766.9 3.6 28 810.9 3.8 table 3: torque moment and power values for selected wedge height. figure 5: torque and input power in relation to shredding face angle. eight factors, when one factor has two levels and the seven other factors have three levels, with ten repetitions for each combination, requires a total of 180 experiments (measurements). this is much more acceptable, and can also be performed much more quickly than the complete experimental plan (65610 repetitions). each of the diagrams (figures 3–6) and also the tables of calculated values (tables 1–4) show how a small change in tool geometry leads to big changes in the loading of the disintegrative machine. this can finally be expressed in terms of the cost of the drive and in terms of electricity consumption. the initial cost for purchasing a bigger drive for the device is a once-off expense, whereas electricity consumption is a running cost. if we need to use a bigger drive, there will be permanently higher costs for electricity consumption, which is an important consideration with the present-day prices of electricity. we assume that not only face angle but also material moisture will have a major impact on power requirements, and also on the input power of the device. in the case of material moisture, we have some practical experience, but we have not carried out any 135 acta polytechnica vol. 52 no. 5/2012 b (in mm) mk (in nm) p (in kw) 16 539.9 2.5 18 561.9 2.6 20 584.0 2.8 22 606.0 2.9 24 628.0 3.0 26 650.1 3.1 28 672.1 3.2 table 4: torque moment and power values for selected wedge width. figure 6: torque and input power in relation to the width of the shredding wedge. scientific experiments. we can therefore only work with suppositions and assumptions, and with analyses drawn from other fields of wood material processing. fig. 7 [6] shows the impact of moisture on the shear strength of pine and spruce wood material. as the moisture value increases, the shear strength of pine and spruce wood material decreases. this information is taken from the field of wood treatment, which is a related field, though the principle is different. 3 conclusion we have attempted to show the importance of correct adjustment of device parameters for energy efficiency. devices need to be adjusted for the specific conditions under which they will be running. it is necessary to take all input parameters into consideration, e.g. the processed material and its moisture, and to design the device on the basis of all relevant parameters. this attitude is not applicable to all standard mass production machines, as there are cases when we cannot define the marginal conditions in advance. in figure 7: impact of moisture and type of wood on the shear strength of a disintegrating material [6]: a,b — swedish pine (schlyter), c — spruce (newlin). addition, a wide assortment of materials sometimes has to be processed, e.g. municipal blended wastes, which consist of heterogeneous materials. acknowledgements this paper is one of the outcomes of the project “developing of progressive technology of biomass compaction and production of prototypes and highly productive tools” (itms code: 26240220017), supported by the research and development operational programme, funded by the european regional development fund. references [1] j. beniak. optimalizácia konštrukcie dezintegračného stroja. 
[2] j. beniak, e. kureková. design of experiment for measurement of torque of the disintegration device. in mechanical engineering 2007: the 11th international scientific conference, bratislava, 2007.
[3] v. chudý, r. palenčár, e. kureková et al. meranie technických veličín. vydavateľstvo stu v bratislave, 1999.
[4] e. jarošová. navrhování experimentů. česká společnost pro jakost, 1997.
[5] e. kureková, m. halaj, a. bohunický. návrh plánu experimentu na zistenie opakovanej presnosti polohovania plazmového rezacieho stroja. in 4th international conference aplimat 2005, brno, 2005.
[6] j. lisičan. teória a technika spracovania dreva. matcentrum, zvolen, 1996.
[7] r. palenčár, j.-m. ruiz, i. janiga et al. štatistické metódy v metrologických a skúšobných laboratóriách. grafické štúdio ing. peter juriga, 2001.
[8] s. prokeš. obrábění dřeva a nových hmot ze dřeva. sntl praha, 1982.
[9] six sigma. taguchi orthogonal arrays. http://www.micquality.com/reference_tables/taguchi.htm#l27, [15 may 2007].
[10] taguchi. http://www.uni-kassel.de/fb15/lbk/downloads/doe/taguchi/, [15 may 2007].

oofem — an object-oriented simulation tool for advanced modeling of materials and structures

bořek patzák
department of mechanics, faculty of civil engineering, czech technical university in prague, thákurova 7, 166 29 prague, czech republic
corresponding author: bp@cml.fsv.cvut.cz

abstract
the aim of this paper is to describe the object-oriented design of a finite element based simulation code. the overall object-oriented structure is described, and the role of the fundamental classes is discussed. the paper discusses the advanced parallel, adaptive and multiphysics capabilities of the oofem code, and illustrates them on the basis of selected examples.

keywords: object-oriented design, finite element analysis, parallel and distributed processing, multiphysics simulations.

1 introduction
the aim of this paper is to describe the object-oriented design of a finite element based simulation code. the design follows several design patterns that have contributed to an extremely modular and extensible structure and have sustained nearly twenty years of active development, during which the code has been extended from a tool originally oriented toward solid mechanics into a truly multiphysics modeling tool with adaptivity and distributed parallel processing support, without any large-scale redesign. the oofem code was originally developed by the author at the czech technical university. at present, the code consists of more than 200k lines of c++ source code. it is freely available under the general public license, and it is being developed by an international community (visit the project web pages [1] for further information).

the design is based on traditional object-oriented paradigms, such as encapsulation, abstraction, and inheritance. on top of these fundamental concepts, a hierarchy of classes is designed to map the mathematical problem, described by a set of partial differential equations in space and time, into flexible and cooperating software objects that solve the problem. initially, the object-oriented design may seem complex in comparison with traditional finite element codes. however, when it is structured properly and in agreement with the object-oriented philosophy, many benefits can be achieved.
the encapsulation of attributes and methods into a class can hide the implementation details, provided that the object state is requested and manipulated only through the corresponding services. inheritance allows specialization and extension of existing classes, and, when combined with abstract methods defined by parent classes, it allows an extremely modular and extensible design. the concept of abstract classes allows an abstract interface to be designed in terms of a service specification. the abstract interface has to be implemented by the derived classes by implementing the required services. this allows all derived classes to be treated through the same high-level interface, without regard to the particular details of each derived class.

in the implementation, single inheritance has been preferred wherever the parent class defines a compulsory basic interface. however, in many cases, optional functionality may also be implemented by derived classes. as an example, consider the implementation of a particular finite element. the compulsory part of the element interface (involving methods for evaluating characteristic matrices and vectors, etc.) is defined by the parent element class. additional functionality, e.g. an error estimation capability, may be supported only by some elements. it is not a smart idea to extend the basic interface with such optional functionality, as this would result in a very complex interface specification with many methods, which would hinder the implementation of simple elements without the optional functionality. a possible remedy is based on multiple inheritance, where a particular element is derived from the base class and, optionally, from classes defining the optional interfaces. this solution may seem natural, but it has one important drawback: if a class is derived from another one, it is automatically derived from all of that class's ancestors as well. this prevents the selective application of optional interfaces. this problem has been solved in some object-oriented languages, e.g. in java, where only single inheritance is supported, but a class can implement several interfaces; derived classes do not inherit interfaces automatically, and an interface implementation has to be declared by each class explicitly. as the presented code is implemented in c++, an additional concept has been defined to support optional interfaces. it is based on multiple inheritance, but supplemented by an abstract interface-request service, which allows selective decisions on the implemented interfaces by particular elements.
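a minimal sketch of the interface-request idea follows (the names are illustrative, not oofem's actual api): the base element returns a null pointer unless it explicitly exposes the requested optional interface, so optional capabilities never bloat the compulsory interface and are never inherited by accident.

```cpp
#include <cstdio>

enum class InterfaceType { ErrorEstimation };

class Interface { public: virtual ~Interface() = default; };

class ErrorEstimationInterface : public Interface {
public:
    virtual double estimateError() = 0;
};

class Element {
public:
    virtual ~Element() = default;
    // returns nullptr unless the element explicitly implements the interface
    virtual Interface* giveInterface(InterfaceType) { return nullptr; }
};

class SimpleElement : public Element {};  // exposes no optional interfaces

class AdaptiveElement : public Element, public ErrorEstimationInterface {
public:
    Interface* giveInterface(InterfaceType t) override {
        if (t == InterfaceType::ErrorEstimation)
            return static_cast<ErrorEstimationInterface*>(this);
        return nullptr;
    }
    double estimateError() override { return 0.01; }  // dummy value
};

int main() {
    AdaptiveElement e;
    auto* ee = static_cast<ErrorEstimationInterface*>(
        e.giveInterface(InterfaceType::ErrorEstimation));
    if (ee) std::printf("error = %g\n", ee->estimateError());
    return 0;
}
```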
2 overall design
this section presents the general structure of the oofem code using unified modeling language (uml) notation (see [2] for details). the abstract classes are represented by rectangles; the lines marked with a triangle represent the generalization/specialization relation (inheritance), where the triangle points to the parent class; the lines marked with a diamond represent the whole/part relation, in which the diamond points to the “whole” class possessing the “part” class; and an association is represented by a solid line drawn between the classes.

figure 1: problem representation in oofem (engngmodel, domain, element, dofmanager, boundarycondition, initialcondition, material, numericalmethod).

the problem under consideration is represented by a class derived from the engngmodel class, see figure 1. its role is to assemble the governing equation(s) and to use a suitable numerical method (represented by a class derived from the numericalmethod class) to solve the resulting system of equations. the discretization of the problem domain is represented by the domain class, which maintains the lists of objects representing nodes, elements, material models, boundary conditions, etc. the domain class is an attribute of the engngmodel. for each solution step, the engngmodel instance assembles the governing equation(s) by summing up the contributions requested from the domain components (nodes, elements, etc.). this abstract approach is supported by suitable abstract services of the element and nodal classes for getting code numbers and evaluating the characteristic components.

since the governing equations are typically represented numerically in matrix form, the implementation is based on vector and sparse matrix representations, see figure 2. the parent sparsematrix class declares the abstract interface, allowing different sparse matrix formats to be manipulated through the same interface. the derived classes represent different sparse storage schemes and implement, for example, the skyline, compressed row, or compressed column formats, as well as interfaces to third-party libraries, e.g. iml [3], spooles [4], dss [5], or petsc [6].

figure 2: nonlinear static problem and corresponding classes.

the advantage of the described design is its modularity, with decoupled problem formulation, numerical solution, and sparse storage. it allows a particular problem to be implemented in a form that will work with all suitable numerical solvers and sparse matrix storage schemes, even those added in the future.
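a sketch of this decoupling follows (illustrative names, not oofem's classes): assembly and solution code sees only an abstract sparse-matrix interface, while concrete storage schemes, here a deliberately naive coordinate map standing in for skyline or compressed-row storage, remain interchangeable.

```cpp
#include <cstdio>
#include <map>
#include <utility>
#include <vector>

// abstract interface seen by assembly and solver code
class SparseMatrix {
public:
    virtual ~SparseMatrix() = default;
    virtual void addTo(int i, int j, double v) = 0;             // assembly
    virtual void timesVector(const std::vector<double>& x,
                             std::vector<double>& y) const = 0;  // y = A*x
};

// a trivially simple coordinate-map storage scheme
class MapMatrix : public SparseMatrix {
    std::map<std::pair<int, int>, double> a;
public:
    void addTo(int i, int j, double v) override { a[{i, j}] += v; }
    void timesVector(const std::vector<double>& x,
                     std::vector<double>& y) const override {
        y.assign(x.size(), 0.0);
        for (const auto& e : a)
            y[e.first.first] += e.second * x[e.first.second];
    }
};

int main() {
    MapMatrix A;
    A.addTo(0, 0, 2.0); A.addTo(0, 1, -1.0);
    A.addTo(1, 0, -1.0); A.addTo(1, 1, 2.0);
    std::vector<double> x{1.0, 1.0}, y;
    A.timesVector(x, y);                  // y = (1, 1)
    std::printf("y = (%g, %g)\n", y[0], y[1]);
    return 0;
}
```

any iterative solver written against the abstract interface works unchanged with every storage scheme, which is the essence of the decoupling described above.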
the individual finite elements are represented by classes derived from elementgeometry and by one or more classes derived from elementevaluator, see fig. 3.

[figure 3: element implementation.]

the elementgeometry class defines the element geometry and keeps a list of element nodes. the elementevaluator (or, more precisely, the problem-related class derived from the parent elementevaluator) defines the problem-specific methods (e.g. methods for evaluating the stiffness matrix or the element load vector) required by the corresponding problem. the implementation of individual elements is further facilitated by the library of classes representing integration points, integration rules, and finite element interpolation spaces. the problem-specific methods of an elementevaluator can be implemented by the evaluator itself, without regard to the details of the particular element, because the evaluator has access to representations of the element geometry, element interpolation and integration rules, each of which can be manipulated through an abstract interface. multiple integration rules can be created by elements to implement reduced or selective integration schemes, and multiple interpolations can be created as well. many lagrange-based interpolation schemes are provided, as well as b-spline, nurbs, and t-spline based interpolations. thanks to the t-spline based interpolations, the implementation of isogeometric analysis [7] has been quite straightforward (see [8, 9] for more details).

the essential feature is the decoupling of the element geometry description (represented by the elementgeometry class) and the problem-specific functionality (represented by the classes derived from the elementevaluator class), which allows natural implementation of elements for coupled problems, where one needs to combine functionality from individual subproblems into a single element (represented by corresponding classes derived from elementevaluator). this is a better solution than deriving the coupled element from several sub-problem elements, which would lead to duplication of the element geometry data. introducing the coupledevaluator class makes it possible to combine individual evaluators. coupledevaluator is derived from the base elementevaluator class, and comes with the capability to group individual low-level evaluators together (complemented by evaluators for the coupling terms) by performing local assembly from the individual contributions. when a problem-specific evaluator is available, the definition of a particular element is straightforward. it consists in: 1. defining a new class, derived from elementgeometry and from the class(es) representing the problem-specific evaluator(s); 2. setting up its interpolation(s) and integration rules. no additional coding is necessary.

nonlinear, path-dependent material models require keeping a loading history in each integration point, described by a set of internal variables. naturally, these variables should be stored in the corresponding integration point. however, an efficient implementation is not straightforward, as the amount and the type of internal variables vary for different material models. instead, each material model defines an associated status, a class derived from materialstatus, which serves as a container for the internal variables specific to that material model. the integration point comes with the possibility to store a unique material status instance, which is created by the corresponding material. as the integration point is a parameter of all services provided by a material model, the associated status and the corresponding internal variables are conveniently accessible. a minimal sketch of this mechanism is given at the end of this section. the elements do not communicate directly with the associated material model, as an additional layer is inserted between them, representing the cross section model. the role of the cross section is to integrate the response (in terms of stress and strain, for example) over the cross section geometry, composed of possibly different materials. this is especially helpful in the case of shell and beam elements, where the introduction of the cross section allows cross section details to be hidden by decoupling the finite element and cross section formulations. this approach allows, for example, the same beam element formulation to be used with integral or layered cross section descriptions. for problems where the cross section model is irrelevant, a simple dummy cross section model is provided, routing all requests directly to the underlying material model.
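the material status mechanism mentioned above can be illustrated by the following minimal c++ sketch; again the names (MaterialStatus, GaussPoint, giveStatus) are ours and only approximate the real code.

#include <memory>
#include <utility>

// base container for the history variables of one material model
// at one integration point
class MaterialStatus {
public:
    virtual ~MaterialStatus() {}
};

// status of a hypothetical damage model: its own internal variables
class DamageMaterialStatus : public MaterialStatus {
public:
    double damage = 0.0;    // accumulated damage
    double maxStrain = 0.0; // largest equivalent strain reached so far
};

// the integration point owns exactly one status instance
class GaussPoint {
    std::unique_ptr<MaterialStatus> status;
public:
    MaterialStatus *giveStatus() const { return status.get(); }
    void setStatus(std::unique_ptr<MaterialStatus> s) { status = std::move(s); }
};

class Material {
public:
    virtual ~Material() {}
    // every service receives the integration point, so the history is at hand
    virtual double giveStress(GaussPoint &gp, double strain) = 0;
};

class DamageMaterial : public Material {
public:
    double giveStress(GaussPoint &gp, double strain) override {
        // create the model-specific status on first use
        if (!gp.giveStatus())
            gp.setStatus(std::make_unique<DamageMaterialStatus>());
        auto *st = static_cast<DamageMaterialStatus *>(gp.giveStatus());
        if (strain > st->maxStrain) st->maxStrain = strain;
        // toy damage law, for illustration only
        st->damage = st->maxStrain / (1.0 + st->maxStrain);
        return (1.0 - st->damage) * strain; // unit stiffness assumed
    }
};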
3 parallelization strategy

the code provides support for parallel and distributed computations. the design of parallel algorithms requires partitioning of the problem into a set of tasks, the number of which is greater than or equal to the number of available processors. the partitioning of the problem can be fixed (static load balancing) or can change during the solution (dynamic load balancing). the latter option is often necessary in order to achieve good load balancing of the work between processors, and thus optimal scalability. the adopted parallelization strategy is based on the domain decomposition paradigm, where the computational mesh is divided into partitions assigned to individual processing units. the node cut approach is used. this approach is based on the unique assignment of individual elements to partitions. a node is then either assigned to a partition (local node), if it is surrounded exclusively by elements assigned to that partition, or is shared by several partitions (shared node), if it is incident to elements owned by different partitions. the communication model is based on the message passing paradigm, which is available on most hardware configurations. this allows different parallel architectures to be supported, ranging from systems with shared memory to massively parallel systems with distributed memory. the message passing interface (mpi) [10, 11] is used. efficient parallelization requires that all steps in the solution procedure can be processed in parallel. in a typical problem, this involves assembling the characteristic problem, solving it, and postprocessing the results. high-level communication services were developed on top of the message passing library, providing transparent data streams for parallel, non-blocking communication between partitions. provided that the partitioning of the problem is given, each processor can assemble its local contribution to the global characteristic equation(s). the rows of global matrices and vectors are distributed over individual processors according to the distribution of nodes and the corresponding dofs. this part is identical to the serial assembly process. after the local assembly is finished, the shared node contributions are exchanged. this is integrated into the general purpose assembly services, so that the users can use any sparse matrix representation and the exchange of shared contributions is performed transparently (a schematic sketch of this exchange is given at the end of this section). after the assembly process is finished, the global set of (linearized) equations is solved in parallel. after the solution, the local solution vectors on each partition are collected, and local postprocessing (stress and strain evaluation, for example) is performed on each partition in parallel, typically without the need for further communication. the load balance recovery is achieved by repartitioning the problem domain and transferring the work (typically represented by finite elements) from one sub-domain to another. there are in general two basic factors causing load imbalance between individual subdomains: 1. factors coming from the nature of the application, e.g. switching from a linear to a nonlinear response in certain regions, or local adaptive refinement; 2. external factors, caused by resource reallocation, typical in non-dedicated cluster environments, where individual processors are shared by different applications and users, leading to variation in the allocated processing power. repartitioning is an optimization problem with multiple constraints. the optimal algorithm should balance the work while minimizing the work transfer and keeping the sub-domain interfaces as small as possible. other constraints can reflect the differences in the processing power of individual processors, or may be induced by the topology of the network. the load balancing layer is transparently integrated into the computational kernel. details on implementing dynamic load balancing in oofem can be found in [12, 13].
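the exchange of shared node contributions announced above can be sketched schematically in c++/mpi as follows; buffer packing and dof bookkeeping are omitted, and the function and buffer names are ours.

#include <mpi.h>
#include <cstddef>
#include <vector>

// non-blocking exchange of shared-node contributions: each neighbouring
// partition receives the locally assembled values of the dofs it shares
// with us, and sends us its own contributions in return
void exchangeSharedContributions(const std::vector<int> &neighbours,
                                 std::vector<std::vector<double>> &sendBuf,
                                 std::vector<std::vector<double>> &recvBuf) {
    std::vector<MPI_Request> requests(2 * neighbours.size());
    for (std::size_t n = 0; n < neighbours.size(); ++n) {
        MPI_Irecv(recvBuf[n].data(), static_cast<int>(recvBuf[n].size()),
                  MPI_DOUBLE, neighbours[n], 0, MPI_COMM_WORLD,
                  &requests[2 * n]);
        MPI_Isend(sendBuf[n].data(), static_cast<int>(sendBuf[n].size()),
                  MPI_DOUBLE, neighbours[n], 0, MPI_COMM_WORLD,
                  &requests[2 * n + 1]);
    }
    // purely local work can overlap with the communication here
    MPI_Waitall(static_cast<int>(requests.size()), requests.data(),
                MPI_STATUSES_IGNORE);
    // the received values are then added to the locally owned rows
}

since only the shared-node values travel, the communication volume scales with the size of the partition interfaces, which is one reason why the repartitioning step tries to keep the interfaces small.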
4 multiphysics and adaptive capabilities

the framework supports both fully and weakly (staggered) coupled multiphysics simulations. the development of multiphysics solution schemes is relatively straightforward, as the individual elements can easily be constructed from existing single-physics formulations by supplying only a definition of the corresponding coupling terms, i.e. by implementing the corresponding element evaluator. concurrent multiscale simulations have also been developed, using a microscale model of the representative volume element in each integration point to obtain the macroscale material response by means of homogenization. additionally, any primary or secondary variable can be represented as a field. fields can represent any scalar, vector or tensorial quantity and provide services to evaluate the field at any point of interest. this feature simplifies mutual field exchange in staggered simulations to a large extent, naturally allowing different discretizations on each subproblem. the individual subproblems can be arbitrarily assembled into a staggered solution scheme, and data is transparently exchanged. the field mapping is also essential in h-adaptive, nonlinear solution procedures, enabling the solution state to be mapped from the old to the updated discretization. the sequence of meshes is generated on the basis of the spatial error distribution, estimated by a suitable a-posteriori error estimator, represented by a class derived from the abstract errorestimator class. the mesh can be refined by a built-in, fully parallel, subdivision-based remeshing algorithm or by an external application. at present, an interface to the t3d mesh generator [14, 15] is supported.

5 examples

5.1 parallel analysis of a 3d anchor test

the capabilities and the performance of the parallel adaptive load-balancing framework are illustrated on a three-dimensional adaptive analysis of an anchor pullout test. in this example, h-adaptive analysis is used together with a heuristic error indicator based on the attained damage level. in order to assess the behavior and performance of the proposed methodology, the case study analyses were run without dynamic load balancing (static partitioning was employed, marked as “nolb”), and with dynamic load balancing performed before error assessment (“prelb”) or after error assessment (“postlb”).

[figure 4: geometry of the anchor pullout test, with the plane of symmetry and the free surface indicated.]

[figure 5: anchor pullout: solution times and speedups (relative to 4 cpus) for the runs without load balancing and with pre/post load balancing applied each step or each 2nd step.]

the geometry and setup of the test are shown in fig. 4. the anchor is located close to the boundary, requiring a full 3d analysis with only one plane of symmetry. as the steel anchor is pulled out of the concrete, a crack surface is initiated at the anchor head and starts to propagate towards the boundary as the loading increases. an anisotropic, non-local damage based model has been used for modeling the concrete fracture. the original mesh consists of 16772 linear tetrahedral elements and 1456 nodes; it was subsequently refined in 20 steps into a final mesh with 125400 elements and 22441 nodes.
the problem was solved on an sgi altix 4700 machine installed at the ctu computing center. the obtained solution times (averaged over two or three analysis runs) and the corresponding speedups (relative to 4 cpus) are summarized in fig. 5. the results reveal that the effect of dynamic load balancing is quite substantial. when no load balancing is applied, the solution times decrease only slightly with the number of cpus. this is a direct consequence of the heavy imbalance due to the localized refinement, which results in a dramatic increase in the number of elements in one subdomain or in a small number of subdomains. with dynamic load balancing, this effect is alleviated. the obtained speedup of the load-balanced computation shows a nice linear trend, indicating very good scalability of the parallel algorithm.

5.2 multiscale heat transport

this example illustrates a multiscale simulation, which helped to find the optimal position of the cooling pipes and a cooling regime for an arch bridge over the oparno valley, czech republic. the bridge was built between 2008 and 2010, with the arches spanning 135 m, see fig. 6.

[figure 6: oparno bridge.]
[figure 7: multiscale computational scheme.]

hydrating concrete produces a significant amount of hydration heat, which causes several problems in massive concrete elements. there were concerns that significant tensile stresses might appear during the cooling of the massive cross-section of the arch. these tensile stresses can lead to cracking, which can negatively affect the durability of the final structure. another negative effect can appear when the temperature exceeds 70 degrees celsius, when the formation of monosulphate can be followed by ettringite formation (so-called delayed ettringite formation, def). two computational scales were involved (see fig. 7): 1. the level of cement paste, where the cemhyd3d [16] material model predicts the evolution of a discrete microstructure on the scale of micrometers and returns the liberated heat; 2. the structural level, where the heat balance equation is solved with finite elements. the details can be found in [17].

[figure 8: oparno bridge: temperature profile.]
[figure 9: oparno bridge: comparison with measurements.]

fig. 8 shows the temperature evolution during concrete hardening and the induced out-of-plane stress. the simulation runs on the left symmetric part of the arch cross-section. fig. 9 compares the multiscale simulation with the temperature in the core of the cross-section. the temperature measured in the concrete remained below 65 °c during summer casting, which was found acceptable.

5.3 casting simulation

the last example illustrates the simulation of fresh concrete casting, where the fresh concrete is modeled as a homogeneous, non-newtonian fluid. the simulation is based on an incompressible flow model of two immiscible fluids (concrete and air), where an interface tracking technique is used to follow the position of the interface between the fluids. the geometry and setting of the experimental setup, known as the l-box test, are presented in fig. 10.

[figure 10: l-box test: geometry, with the concrete reservoir, sliding door, walls and surrounding air indicated.]
the concrete is confined in an l-shaped reservoir with a vertical gate. after the removal of the gate, the concrete starts spreading into the formwork. frictionless boundary conditions are assumed on the walls forming the reservoir, and also on the formwork. the numerical simulations use the bingham model for the concrete suspension. the unstructured, triangular grid consists of 3652 nodes and 6927 triangles. realistic modeling of the gradual gate opening is particularly important, as the results, especially in the initial phase, are very sensitive to the gate opening speed.

[figure 11: l-box test: spreading concrete profiles.]
[figure 12: l-box test: profiles of the spreading concrete at different times (1 s, 1.5 s, 2 s and 3 s), simulation versus experiment.]

the profiles of the fresh concrete at different times (see fig. 11) are compared with experimental observations in fig. 12. very good agreement has been obtained.

6 conclusions

the paper has focused on a description of the object-oriented structure of a finite element based simulation framework. the fundamental design patterns and their consequences have been discussed. the paper has also described the design and implementation of some advanced features, including parallel and distributed support and multiphysics. in the final part, selected examples have been used to demonstrate the capabilities of the code.

acknowledgments

this work was supported by the ministry of education of the czech republic, under project msm 6840770003.

references

[1] b. patzák. oofem project home page, 2012. http://www.oofem.org.
[2] d. pilone, n. pitman. uml 2.0 in a nutshell. o’reilly media, 2nd edition, 2005.
[3] j. dongarra, a. lumsdaine, r. pozo, k. remington. iml++ (iterative methods library) project page, 2012. http://math.nist.gov/iml++.
[4] c. ashcraft et al. spooles: sparse object oriented linear equations solver, 2012. www.netlib.org/linalg/spooles/spooles.2.2.html.
[5] r. vondráček. use of a sparse direct solver in engineering applications of the finite element method. česká technika – nakladatelství čvut, isbn 978-80-01-04245-8, 2008.
[6] s. balay, k. buschelman, w. d. gropp, d. kaushik, m. g. knepley, l. curfman mcinnes, b. f. smith, h. zhang. petsc web page, 2001. http://www.mcs.anl.gov/petsc.
[7] t. j. r. hughes, j. a. cottrell, y. bazilevs. isogeometric analysis: cad, finite elements, nurbs, exact geometry and mesh refinement. computer methods in applied mechanics and engineering, 194, 4135–4195, 2005.
[8] d. rypl, b. patzák. from the finite element analysis to the isogeometric analysis in an object oriented computing environment. advances in engineering software, 44(1), 116–125, 2012.
[9] d. rypl, b. patzák. object oriented implementation of the t-spline based isogeometric analysis. advances in engineering software, 50, 137–149, 2012.
[10] message passing interface forum. mpi: a message-passing interface standard. technical report, university of tennessee, 1995.
[11] m. snir, s. otto, s. huss-lederman, d. walker, j. dongarra. mpi: the complete reference. mit press, boston, 1996.
[12] b. patzák, d. rypl, z. bittnar. parallel explicit finite element dynamics with nonlocal constitutive models. computers and structures, 79(26–28), 2287–2297, 2001.
[13] b. patzák, d. rypl. object-oriented, parallel finite element framework with dynamic load balancing. advances in engineering software, 47(1), 35–50, 2012.
[14] d. rypl. t3d project home page, 2012. http://www.t3d.info.
[15] d. rypl.
sweeping of unstructured meshes over generalized extruded volumes. finite elements in analysis and design, 46(1–2), 203–215, 2010.
[16] v. šmilauer, t. krejčí. multiscale model for temperature distribution in hydrating concrete. international journal for multiscale computational engineering, 7(2), 1543–1649, 2009.
[17] v. šmilauer, j. l. vítek, b. patzák, z. bittnar. optimalizace chlazení oblouku oparenského mostu [cooling optimization in the oparno bridge’s arch]. časopis beton tks, 11(4), 62–65, 2011.

acta polytechnica vol. 52 no. 1/2012

on the relation between grb classes and x-ray flashes

z. bagoly, p. veres, i. horváth, a. mészáros, l. g. balázs

abstract

gamma-ray bursts are usually classified into either short-duration or long-duration bursts. going beyond the short-long classification scheme, it has been shown on statistical grounds that a third, intermediate population is needed in this classification scheme. we are looking for physical properties which discriminate the intermediate-duration bursts from the other two classes. as the intermediate group is the softest, we argue that it is related to the x-ray flashes among the grbs. we give a new, probabilistic definition for this class of events.

keywords: gamma rays, bursts, observations.

1 introduction

to discern the physical properties of grbs as a whole, we need to understand the number of physically different underlying classes of the phenomenon. with the launch of the swift satellite [1], a new perspective has opened up in the study of gamma-ray bursts and their afterglows. the intermediate grb population in the studies [2–5] so far has always been the softest among the groups, meaning that intermediate grbs emit the bulk of their energy in the low-energy gamma-rays. here we report on a significant difference in the peak-flux distribution between the intermediate and the short populations, and between the intermediate and long populations. we identify a third population using a multi-component model and we show that this group has a significant overlap with x-ray flashes. the first swift bat catalog [6] was augmented with bursts up to august 7, 2009. after excluding the outliers and the bursts without measured parameters, our sample has a total of 408 grbs. to obtain the spectral parameters we fitted the spectra, integrated over the duration of each burst, with a power law model and a power law model with an exponential cutoff. the most widely used duration measure is t90, which is defined as the period during which 5 % to 95 % of the incoming counts arrive. to find the fluences s(emin, emax) we integrated the model spectrum in the usual swift energy bands with 15–25–50–100–150 kev as their boundaries. we define the hardness ratio (hij, where i and j mark the two energy intervals) as the ratio of the fluences in different channels for a given burst.

2 classification

there are many indications that the phenomenon which we observe as gamma-ray bursts has more than one underlying population. the goal is to identify classes which are physically different. by using t90 and h32 we include a basic temporal and a basic spectral characteristic of the bursts. we carry out three types of classifications: model-based multivariate classification, k-means clustering and hierarchical clustering. we use the algorithms implemented in the r software (http://cran.r-project.org). studies show that, for example, the distribution of the logarithm of the duration can be adequately described by a superposition of three gaussians [2].
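with the channel boundaries above (channel 2: 25–50 kev, channel 3: 50–100 kev, channel 4: 100–150 kev), the hardness ratio used throughout this classification reads explicitly (our reconstruction from the channel numbering):

$$ h_{32} = \frac{s_{50-100\,\mathrm{kev}}}{s_{25-50\,\mathrm{kev}}} . $$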
here we find the model parameters using the expectation-maximization (em) maximum likelihood method. we use the bayesian information criterion (bic), introduced by [7, 8], to find the most probable model (including the number of components) and the parameters of this model. for k components (bivariate gaussians) the number of free parameters is 6k − 1, since the sum of the weights is 1. we have applied this classification scheme to our sample, and found that the model with three components gives the best fit to the data in the bic sense: the shape of the bivariate gaussians is the same for each group (σ_lg t90,i = σ_lg t90,j and σ_lg h32,i = σ_lg h32,j for i, j ∈ {short, long, intermediate}), with no correlation, and only the weights differ. the best model has a value of bic = −262.14. the clustering based on this model shows that a three-component bivariate model is the most preferred: the best two-component models have bic ∼ −276 and the best four-component models have bic ∼ −274, both clearly below the maximum. the best-fit model has 10 free parameters and three bivariate gaussian components. we assign class membership probabilities using the ratio of the fitted bivariate models at the burst location on the duration-hardness plane.

table 1: bivariate model parameters for the best-fitted (eei) model. the standard deviations in the direction of the coordinate axes and the correlation coefficients are constrained by the model.

groups    p_l    lg t90c   lg h32c   σ_lg t90   σ_lg h32   r   n_l
short     0.08   −0.331    0.247     0.509      0.090      0   31
interm.   0.12   1.136     −0.116    0.509      0.090      0   46
long      0.80   1.699     0.114     0.509      0.090      0   331

the model has the following three components, with equal standard deviations in both directions and with no correlation (r = 0) (see also table 1): 1. the first component is the known short class of grbs (shortest duration and hardest spectra). the average duration is 0.47 s and the average hardness ratio is 1.77. it has 31 members, and the weight of this model component is 0.08. 2. the second, most numerous model component is the long class, also identified in many previous studies. it has an average duration of 50.0 s and an average hardness of 1.30. it has 331 members and the weight of the model component is 0.80. 3. the third and softest class is intermediate in duration. it has overlapping regions with previous definitions of the intermediate class [4]. the average duration is 13.7 s, and the average hardness of this class is 0.77. it has 46 members and the weight of the model component is 0.12.

[fig. 1: grb populations on the duration-hardness plane (lg t90 versus lg h32). triangles mark the long class, squares mark the short class and circles mark the intermediate class. filled symbols mark bursts with measured redshifts. one and two sigma ellipses are superimposed on the figure to illustrate the model components found as described in the text. the dashed line indicates the definition of x-ray flashes (xrfs) given by [9].]

all components have the same standard deviation in both directions. this means the shape of the gaussian is the same for the three groups (though obviously their weights are different). models with non-zero correlation coefficients between the two variables are not favored in the bic sense, contrary to the models with r = 0.
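as a consistency check of the two parameter counts quoted above (our arithmetic, not from the paper): an unconstrained mixture of k bivariate gaussians has 5 parameters per component (two means, two standard deviations, one correlation) plus k − 1 independent weights, while the eei-type constraint (common axis-aligned standard deviations, r = 0) leaves only the 2k means, the 2 shared standard deviations and the k − 1 weights:

$$ 5k + (k-1) = 6k - 1, \qquad \bigl(2k + 2 + (k-1)\bigr)\big|_{k=3} = 10 . $$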
using both model-based and non-parametric k-means and hierarchical clustering methods, we have experimented with using t50 instead of t90, and with using different hardness ratios (e.g., h42 = s100−150/s25−50, h432 = s100−150/(s25−50 + s50−100), etc.). the classification remained essentially the same.

3 discussion

our analysis of the swift grbs supports the earlier results that there are three distinct groups of bursts. again, besides the long and the short population, the intermediate-duration class appears to be the softest. the peak fluxes of the intermediate bursts are systematically lower than those of the long ones, while their redshift range is either lower or similar. we thus conclude that the intermediate class is intrinsically dimmer. if the intermediate population were part of the long population, the lower peak flux would require a physical explanation. the observational properties show that intermediate bursts are the softest among the three groups, meaning that their emission is concentrated in the low-energy bands. as the intermediate population is the softest, it is worth searching for a link with a similar phenomenon that is softer than the classical gamma-ray bursts: the x-ray flashes (xrfs) (for a review, see [11]). [9] gives a working definition of x-ray flashes (xrf) and x-ray rich grbs (xrr) for swift using the fluence ratio. the s23 fluence ratio is the reciprocal of the hardness (h32 = (s23)^−1). the current understanding of xrfs indicates that they are related to long bursts and that they form a continuous distribution in the peak energy (epeak) of the νfν spectrum [9]. according to the fuzzy classification model we do not get a definite membership for a given burst, but rather a probability that a burst belongs to a group. to identify the intermediate population (and tentatively the x-ray flashes), we use the indicator function

i_interm.(log10 t90, log10 h32) =
    [p_interm. × p(log10 t90, log10 h32 | interm.)] /
    [Σ_{l ∈ {short, interm., long}} p_l × p(log10 t90, log10 h32 | l)] ,    (1)

where the values of the parameters should be taken from table 1. the joint distribution function of the fitted model is shown in gray in figure 2, and the probability contours of the third population are drawn in black, with probability level contours shown.

[fig. 2: contour plot of the swift duration-hardness distribution based on the eei model with three components. points show individual bursts. the dashed line shows the region belonging to the xrf population, and the indicator function of the intermediate group is shown by solid lines. one can observe the strong coincidence between the xrfs and the intermediate group.]

[9] defines xrfs as events with hardness ratio h32 < 0.76. the limit is found using a pseudo-burst with spectral parameters α = −1, β = −2.5 and epeak = 100 kev for a band spectrum [10]. based on this definition we identify 24 bursts in our 408-burst sample. the average of these bursts’ probabilities of belonging to the intermediate group is 95 %. this high value allows us to conclude that all xrfs belong, with high probability, to the intermediate group defined by the eei model. we propose that the members of the third component are probably x-ray flashes. therefore, using the model-based classification method we can give a probabilistic definition of the x-ray flashes based on the duration-hardness distribution. this definition adds 22 further bursts that belong to the intermediate population and hence to the xrfs.
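note the numerical consistency with table 1 (our observation): the 24 bursts satisfying the h32 < 0.76 criterion together with the 22 additional bursts of the probabilistic definition give

$$ 24 + 22 = 46, $$

which equals the membership n_l of the intermediate component in table 1, assuming, as the text suggests, that all 24 xrf bursts are themselves assigned to the intermediate component.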
all the x-ray flashes are in the region where the third component has the highest probability, but not all third-component bursts can be unambiguously classified as x-ray flashes according to the criterion of [9]. in other words, the third component in the eei model contains all the x-ray flashes and some additional, very soft bursts. the mechanism behind the x-ray flashes is still not clear. there are various scenarios that could produce these phenomena (e.g. dirty fireballs, inefficient internal shocks, structured jets with off-axis viewing angle, etc.; for a review of the models see [12]). a more precise experimental definition of xrfs can result in more stringent constraints on the models.

acknowledgement

this work was supported by otka grant k077795, by otka/nkth a08-77719 and a08-77815 grants (z.b.), by gačr grant no. p209/10/0734 (a.m.), by the research program msm0021620860 of the ministry of education of the czech republic (a.m.) and by a bolyai scholarship (i.h.). we thank peter mészáros, gábor tusnády, lídia rejtő and jakub řípa for valuable comments.

references

[1] gehrels, n., et al.: apj, 611, 1005–1020.
[2] horváth, i., et al.: a&a, 447, 23–30.
[3] huja, d., mészáros, a., řípa, j.: a&a, 504, 67.
[4] horváth, i., et al.: a&a, 713, 552.
[5] řípa, j., et al.: a&a, 498, 399.
[6] sakamoto, t., et al.: apj suppl., 175, 179–190.
[7] schwarz, g.: annals of statistics, 6(2), 461–464.
[8] liddle, a. r.: mnras, 377, l74–l78.
[9] sakamoto, t., et al.: apj, 679, 570–586.
[10] band, d., et al.: apj, 413, 281–292.
[11] hullinger, d.: early afterglow evolution of x-ray flashes observed by swift. phd thesis, university of maryland, college park, 2006.
[12] zhang, b.: chinese journal of astronomy and astrophysics, 7, 1–50.

z. bagoly: department of physics of complex systems, eötvös university, 1518 budapest, pf. 32, hungary; department of physics, bolyai military university, 1581 budapest, pob 15, hungary

p. veres: department of physics of complex systems, eötvös university, 1518 budapest, pf. 32, hungary; department of physics, bolyai military university, 1581 budapest, pob 15, hungary; konkoly observatory, 1505 budapest, pob 67, hungary

i. horváth: department of physics, bolyai military university, 1581 budapest, pob 15, hungary

l. g. balázs: konkoly observatory, 1505 budapest, pob 67, hungary

a. mészáros: astronomical institute, faculty of mathematics and physics, charles university, v holešovičkách 2, 180 00 prague 8, czech republic

acta polytechnica vol. 50 no. 5/2010

on uq(sl2)-actions on the quantum plane

s. duplij, s. sinel’shchikov

abstract

to give the complete list of uq(sl2)-actions on the quantum plane, we first obtain the structure of the quantum plane automorphisms. then we introduce some special symbolic matrices to classify the series of actions using the weights. there are uncountably many isomorphism classes of the symmetries. we give the classical limit of the above actions.

keywords: quantum universal enveloping algebra, hopf algebra, verma module, representation, composition series, projection, weight.

we present and classify uq(sl2)-actions on the quantum plane [1]. the general form of an automorphism of the quantum plane [5] allows us to use the notion of weight. to classify the actions we introduce a pair of symbolic matrices, which label the presence of nonzero weight vectors. finally, we present the classical limit of the obtained actions.
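for orientation, we recall the standard presentation of uq(sl2) (in one common convention, cf. [3]): it is the algebra generated by k, k^{-1}, e, f subject to

$$ kk^{-1} = k^{-1}k = 1, \quad kek^{-1} = q^2 e, \quad kfk^{-1} = q^{-2} f, \quad [e,f] = \frac{k - k^{-1}}{q - q^{-1}} , $$

with the hopf structure making k grouplike; the precise conventions assumed in what follows are those of [3].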
the definitions of a hopf algebra h and of an h-action, the quantum universal enveloping algebra uq(sl2) (determined by its generators k, k^{-1}, e, f), and other notations can be found in [3]. the quantum plane is a unital algebra cq[x, y] generated by x, y and the relation yx = qxy, and we assume that 0 < q < 1. the notation cq[x, y]_i is used for the i-th homogeneous component of cq[x, y], being the linear span of the monomials x^m y^n with m + n = i. denote by (p)_i the i-th homogeneous component of a polynomial p ∈ cq[x, y], that is, the projection of p onto cq[x, y]_i parallel to the direct sum of all other homogeneous components of cq[x, y]. denote by c[x] and c[y] the linear spans of {x^n | n ≥ 0} and {y^n | n ≥ 0}, respectively. the direct sum decompositions cq[x, y] = c[x] ⊕ y·cq[x, y] = c[y] ⊕ x·cq[x, y] are obvious. let (p)_x be the projection of a polynomial p ∈ cq[x, y] onto c[x] parallel to y·cq[x, y].

proposition 1. let ψ be an automorphism of cq[x, y]; then there exist nonzero constants α, β such that [5]

ψ: x ↦ αx,  y ↦ βy.  (1)

to any uq(sl2)-action on cq[x, y] we associate a 2×3 matrix, to be referred to as the full action matrix

m = ( k(x)  k(y)
      e(x)  e(y)
      f(x)  f(y) ).  (2)

an extension of a uq(sl2)-action from the generators to cq[x, y] is given by

(ab)u = a(bu),  a(uv) = Σ_i (a'_i u)·(a''_i v),  a, b ∈ uq(sl2), u, v ∈ cq[x, y],

(with Δ(a) = Σ_i a'_i ⊗ a''_i in sweedler notation) together with the natural compatibility conditions [3]. we have from (1) that the action of k is determined by its action ψ on x and y, given by the 1×2 matrix

m_k = ( k(x), k(y) ) = ( αx, βy ),  (3)

where α, β ∈ c \ {0}. this allows us to introduce the weight of x^n y^m ∈ cq[x, y] as wt(x^n y^m) = α^n β^m. another submatrix of m is

m_ef = ( e(x)  e(y)
         f(x)  f(y) ).  (4)

we call m_k and m_ef the action k-matrix and the action ef-matrix, respectively. each entry of m is a weight vector (by (3) and (1)), and all the nonzero monomials which constitute a specific entry have the same weight. we use the notation

wt(m) = ( wt(k(x))  wt(k(y))          ( wt(x)         wt(y)            ( α         β
          wt(e(x))  wt(e(y))     ≐      q^2·wt(x)     q^2·wt(y)    =     q^2·α     q^2·β
          wt(f(x))  wt(f(y)) )          q^{-2}·wt(x)  q^{-2}·wt(y) )     q^{-2}·α  q^{-2}·β ),  (5)

where the matrix relation ≐ is treated as a set of elementwise equalities whenever they are applicable, that is, when the corresponding entry of m is nonzero (hence admits a well-defined weight). denote by (m)_i the i-th homogeneous component of m, which, if nonzero, admits a well-defined weight. introduce the constants a0, b0, c0, d0 ∈ c such that the zero-degree component of the full action matrix is

(m)_0 = ( 0   0
          a0  b0
          c0  d0 )_0.  (6)

we keep the subscript 0 on the matrix on the right-hand side to emphasize the origin of this matrix as the 0-th homogeneous component of m. the weights of the nonzero projections of the (weight) entries of m should equal the weights of those entries, so

wt((m)_0) ≐ ( 0        0
              q^2·α    q^2·β
              q^{-2}·α q^{-2}·β )_0.  (7)

all the entries of (m)_0 are constants (6), and so

wt((m)_0) ≐ ( 0 0
              1 1
              1 1 )_0.  (8)

let us use (m_ef)_i to construct a symbolic matrix (m̂_ef)_i whose entries are the symbols 0 or ⋆, as follows: a nonzero entry of (m_ef)_i is replaced by ⋆, while a zero entry is replaced by the symbol 0. for the 0-th components, the specific relations involved in (7) imply that each column of (m̂_ef)_0 should contain at least one 0; therefore we have 9 possibilities. applying e and f to yx = qxy and using (3), we get

y·e(x) − qβ·e(x)·y = q·x·e(y) − α·e(y)·x,  (9)
f(x)·y − q^{-1}β^{-1}·y·f(x) = q^{-1}·f(y)·x − α^{-1}·x·f(y).  (10)
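the projections of (9)–(10) onto homogeneous components below rely on the weight being well defined on monomials; this follows from k being grouplike (Δ(k) = k ⊗ k), so that its action is multiplicative (a one-line check, ours):

$$ k(x^n y^m) = k(x)^n\, k(y)^m = (\alpha x)^n (\beta y)^m = \alpha^n \beta^m\, x^n y^m, \qquad \mathrm{wt}(x^n y^m) = \alpha^n \beta^m . $$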
we project (9)–(10) onto cq[x, y]_1 and obtain

a0(1 − qβ)·y = b0(q − α)·x,  d0(1 − qα^{-1})·x = c0(q − β^{-1})·y,

which gives

a0(1 − qβ) = b0(q − α) = d0(1 − qα^{-1}) = c0(q − β^{-1}) = 0.  (11)

due to (11), the weight constants α and β satisfy

1) a0 ≠ 0 ⟹ β = q^{-1},  (12)
2) b0 ≠ 0 ⟹ α = q,  (13)
3) c0 ≠ 0 ⟹ β = q^{-1},  (14)
4) d0 ≠ 0 ⟹ α = q.  (15)

we compare this to (7)–(8) and deduce that the symbolic matrices containing two ⋆’s should be excluded. using (7) and (12)–(15) we conclude that the position of ⋆ in the remaining symbolic matrices determines the associated weight constants:

( ⋆ 0 ; 0 0 )_0 ⟹ α = q^{-2}, β = q^{-1},  (16)
( 0 ⋆ ; 0 0 )_0 ⟹ α = q, β = q^{-2},  (17)
( 0 0 ; ⋆ 0 )_0 ⟹ α = q^2, β = q^{-1},  (18)
( 0 0 ; 0 ⋆ )_0 ⟹ α = q, β = q^2,  (19)

and the matrix ( 0 0 ; 0 0 )_0 does not determine any weight constants. in the 1-st homogeneous component we have wt(e(x)) = q^2·wt(x) ≠ wt(x) (because 0 < q < 1), which implies (e(x))_1 = a1·y, and similarly we obtain

(m_ef)_1 = ( a1·y  b1·x ; c1·y  d1·x )_1,  (20)

where a1, b1, c1, d1 ∈ c. so we introduce a symbolic matrix (m̂_ef)_1 as above. the relations between the weights, similar to (7), give

wt((m_ef)_1) ≐ ( q^2·α  q^2·β ; q^{-2}·α  q^{-2}·β )_1 ≐ ( β α ; β α )_1.  (21)

as a consequence, every row and every column of (m̂_ef)_1 should contain at least one 0. we project (9)–(10) onto cq[x, y]_2 and get

a1(1 − qβ)·y^2 = b1(q − α)·x^2,  (22)
d1(1 − qα^{-1})·x^2 = c1(q − β^{-1})·y^2,  (23)

whence a1(1 − qβ) = b1(q − α) = d1(1 − qα^{-1}) = c1(q − β^{-1}) = 0. so we obtain

1) a1 ≠ 0 ⟹ β = q^{-1},  (24)
2) b1 ≠ 0 ⟹ α = q,  (25)
3) c1 ≠ 0 ⟹ β = q^{-1},  (26)
4) d1 ≠ 0 ⟹ α = q.  (27)

the symbolic matrix ( ⋆ 0 ; 0 ⋆ )_1 can be discarded from the list of symbolic matrices having at least one 0 in every row and column, because of (21) and (24)–(27). for the other symbolic matrices with the above property we have

( ⋆ 0 ; 0 0 )_1 ⟹ α = q^{-3}, β = q^{-1},  (28)
( 0 ⋆ ; 0 0 )_1 ⟹ α = q, β = q^{-1},  (29)
( 0 0 ; ⋆ 0 )_1 ⟹ α = q, β = q^{-1},  (30)
( 0 0 ; 0 ⋆ )_1 ⟹ α = q, β = q^3,  (31)
( 0 ⋆ ; ⋆ 0 )_1 ⟹ α = q, β = q^{-1},  (32)

and the matrix ( 0 0 ; 0 0 )_1 does not determine the weight constants. let us introduce a table of families of uq(sl2)-actions, each family labeled by two symbolic matrices (m̂_ef)_0, (m̂_ef)_1; we call it a [(m̂_ef)_0 ; (m̂_ef)_1]-series. note that the series labeled by pairs of nonzero symbolic matrices at both positions are empty, because each such matrix at the first position determines a pair of specific weight constants α and β (16)–(19) which fails to coincide with any pair of such constants associated to the set of nonzero symbolic matrices at the second position (28)–(32). also the series with the zero symbolic matrix at the first position and a symbolic matrix containing only one ⋆ at the second position are empty. in this way we get 24 “empty” [(m̂_ef)_0 ; (m̂_ef)_1]-series. let us turn to the “non-empty” series and begin with the case in which the action ef-matrix is zero.

theorem 2. the [( 0 0 ; 0 0 )_0 ; ( 0 0 ; 0 0 )_1]-series consists of four uq(sl2)-module algebra structures on the quantum plane given by

k(x) = ±x, k(y) = ±y,  (33)
e(x) = e(y) = f(x) = f(y) = 0,  (34)

which are pairwise non-isomorphic.

the next theorem describes the well-known symmetry [6, 7].

theorem 3. the [( 0 0 ; 0 0 )_0 ; ( 0 ⋆ ; ⋆ 0 )_1]-series consists of a one-parameter (τ ∈ c \ {0}) family of uq(sl2)-module algebra structures on the quantum plane:

k(x) = qx, k(y) = q^{-1}y,  (35)
e(x) = 0, e(y) = τx,  (36)
f(x) = τ^{-1}y, f(y) = 0.  (37)

all these structures are isomorphic, in particular to the action as above with τ = 1.
the essential claim here, which is not covered by [6, 7], is that no higher (> 1) degree terms can appear in the expressions for e(x), e(y), f(x), f(y) in (36) and (37). this can be proved by a routine computation which relies upon our assumption 0 < q < 1. consider now the symmetries whose symbolic matrix (m̂_ef)_0 contains one ⋆.

theorem 4. the [( 0 ⋆ ; 0 0 )_0 ; ( 0 0 ; 0 0 )_1]-series consists of a one-parameter (b0 ∈ c \ {0}) family of uq(sl2)-module algebra structures on the quantum plane:

k(x) = qx, k(y) = q^{-2}y,  (38)
e(x) = 0, e(y) = b0,  (39)
f(x) = b0^{-1}·xy, f(y) = −q·b0^{-1}·y^2.  (40)

all these structures are isomorphic, in particular to the action as above with b0 = 1.

theorem 5. the [( 0 0 ; ⋆ 0 )_0 ; ( 0 0 ; 0 0 )_1]-series consists of a one-parameter (c0 ∈ c \ {0}) family of uq(sl2)-module algebra structures on the quantum plane:

k(x) = q^2·x, k(y) = q^{-1}y,  (41)
e(x) = −q·c0^{-1}·x^2, e(y) = c0^{-1}·xy,  (42)
f(x) = c0, f(y) = 0.  (43)

all these structures are isomorphic, in particular to the action as above with c0 = 1.

theorem 6. the [( ⋆ 0 ; 0 0 )_0 ; ( 0 0 ; 0 0 )_1]-series consists of a three-parameter (a0 ∈ c \ {0}, s, t ∈ c) family of uq(sl2)-actions on the quantum plane:

k(x) = q^{-2}x, k(y) = q^{-1}y,  (44)
e(x) = a0, e(y) = 0,  (45)
f(x) = −q·a0^{-1}·x^2 + t·y^4, f(y) = −q·a0^{-1}·xy + s·y^3.  (46)

the generic domain {(a0, s, t) | s ≠ 0, t ≠ 0} with respect to the parameters splits into uncountably many disjoint subsets {(a0, s, t) | s ≠ 0, t ≠ 0, ϕ = const}, where ϕ = t/(a0·s^2). each of these subsets corresponds to an isomorphism class of uq(sl2)-module algebra structures. additionally there exist three more isomorphism classes, which correspond to the subsets

{(a0, s, t) | s = 0, t ≠ 0}, {(a0, s, t) | s ≠ 0, t = 0}, {(a0, s, t) | s = 0, t = 0}.  (47)

the specific form of the weights of x and y discards from the outset all but finitely many terms (monomials) that could appear in the expressions for e(x), e(y), f(x), f(y) in (45) and (46). thus it becomes much easier to establish the latter relations than to do so for the corresponding relations in the previous theorems. to prove the rest of the claims, one needs to guess the explicit form of the required isomorphisms.

theorem 7. the [( 0 0 ; 0 ⋆ )_0 ; ( 0 0 ; 0 0 )_1]-series consists of a three-parameter (d0 ∈ c \ {0}, s, t ∈ c) family of uq(sl2)-actions on the quantum plane:

k(x) = qx, k(y) = q^2·y,  (48)
e(x) = −q·d0^{-1}·xy + s·x^3, e(y) = −q·d0^{-1}·y^2 + t·x^4,  (49)
f(x) = 0, f(y) = d0.  (50)

here we have the domain {(d0, s, t) | s ≠ 0, t ≠ 0}, which splits into the disjoint subsets {(d0, s, t) | s ≠ 0, t ≠ 0, ϕ = const} with ϕ = t/(d0·s^2). this uncountable family of subsets is in one-to-one correspondence with the isomorphism classes of uq(sl2)-module algebra structures. in addition, one also has three more isomorphism classes, which are labelled by the subsets {(d0, s, t) | s = 0, t ≠ 0}, {(d0, s, t) | s ≠ 0, t = 0}, {(d0, s, t) | s = 0, t = 0}.

remark 8. uq(sl2)-symmetries on cq[x, y] picked from different series are non-isomorphic, and the actions of k in different series are different.

remark 9. there are no uq(sl2)-symmetries on cq[x, y] other than those presented in the above theorems, because the assumptions exhaust all admissible forms of the components (m_ef)_0, (m_ef)_1 of the action ef-matrix.
the associated classical limit actions of the lie algebra sl2 (here, the lie algebra generated by e, f, h subject to the relations [h, e] = 2e, [h, f] = −2f, [e, f] = h) on c[x, y] by differentiations are derived from the quantum actions by substituting k = q^h with a subsequent formal passage to the limit as q → 1. we present all quantum and classical actions in table 1. note that there exist more sl2-actions on c[x, y] by differentiations (see, e.g. [8]) than one can see in table 1. it follows from our results that the rest of the classical actions admit no quantum counterparts. on the other hand, among the quantum actions listed in the first row of table 1, the only one to which the above classical limit procedure is applicable is the action with k(x) = x, k(y) = y. the remaining three actions of this series admit no classical limit in the above sense.

acknowledgement

one of the authors (s.d.) is grateful to yu. bespalov, j. cuntz, b. dragovich, j. fuchs, a. gavrilik, h. grosse, d. gurevich, j. lukierski, m. pavlov, h. steinacker, z. rakić, w. werner, and s. woronowicz for many fruitful discussions. also, he would like to thank m. znojil for his invitation to the conference “analytic and algebraic methods vi”, villa lanna, prague, and for the kind hospitality at the doppler institute in rez.

references

[1] manin, y. i.: topics in noncommutative differential geometry. princeton university press, princeton, 1991.
[2] castellani, l., wess, j. (eds.): quantum groups and their applications in physics. ios press, amsterdam, 1996.
[3] kassel, c.: quantum groups. springer-verlag, new york, 1995.
[4] sweedler, m. e.: hopf algebras. benjamin, new york, 1969.
[5] alev, j., chamarie, m.: dérivations et automorphismes de quelques algèbres quantiques. comm. algebra 20, 1787–1802 (1992).
[6] montgomery, s., smith, s. p.: skew derivations and uq(sl(2)). israel j. math. 72, 158–166 (1990).
[7] lambe, l. a., radford, d. e.: introduction to the quantum yang-baxter equation and quantum groups: an algebraic approach. kluwer, dordrecht, 1997.
[8] gonzález-lópez, a., kamran, n., olver, p.: quasi-exactly solvable lie algebras of differential operators in two complex variables. j. phys. a: math. gen. 24, 3995–4078 (1991).

table 1: the uq(sl2)-module algebra structures on the quantum plane and their classical limits (sl2-actions by differentiations).

series [( 0 0 ; 0 0 )_0 ; ( 0 0 ; 0 0 )_1]:
quantum: k(x) = ±x, k(y) = ±y; e(x) = e(y) = 0; f(x) = f(y) = 0.
classical limit: h(x) = 0, h(y) = 0; e(x) = e(y) = 0; f(x) = f(y) = 0.

series [( 0 ⋆ ; 0 0 )_0 ; ( 0 0 ; 0 0 )_1]:
quantum: k(x) = qx, k(y) = q^{-2}y; e(x) = 0, e(y) = b0; f(x) = b0^{-1}·xy, f(y) = −q·b0^{-1}·y^2.
classical limit: h(x) = x, h(y) = −2y; e(x) = 0, e(y) = b0; f(x) = b0^{-1}·xy, f(y) = −b0^{-1}·y^2.

series [( 0 0 ; ⋆ 0 )_0 ; ( 0 0 ; 0 0 )_1]:
quantum: k(x) = q^2·x, k(y) = q^{-1}y; e(x) = −q·c0^{-1}·x^2, e(y) = c0^{-1}·xy; f(x) = c0, f(y) = 0.
classical limit: h(x) = 2x, h(y) = −y; e(x) = −c0^{-1}·x^2, e(y) = c0^{-1}·xy; f(x) = c0, f(y) = 0.

series [( ⋆ 0 ; 0 0 )_0 ; ( 0 0 ; 0 0 )_1]:
quantum: k(x) = q^{-2}x, k(y) = q^{-1}y; e(x) = a0, e(y) = 0; f(x) = −q·a0^{-1}·x^2 + t·y^4, f(y) = −q·a0^{-1}·xy + s·y^3.
classical limit: h(x) = −2x, h(y) = −y; e(x) = a0, e(y) = 0; f(x) = −a0^{-1}·x^2 + t·y^4, f(y) = −a0^{-1}·xy + s·y^3.

series [( 0 0 ; 0 ⋆ )_0 ; ( 0 0 ; 0 0 )_1]:
quantum: k(x) = qx, k(y) = q^2·y; e(x) = −q·d0^{-1}·xy + s·x^3, e(y) = −q·d0^{-1}·y^2 + t·x^4; f(x) = 0, f(y) = d0.
classical limit: h(x) = x, h(y) = 2y; e(x) = −d0^{-1}·xy + s·x^3, e(y) = −d0^{-1}·y^2 + t·x^4; f(x) = 0, f(y) = d0.

series [( 0 0 ; 0 0 )_0 ; ( 0 ⋆ ; ⋆ 0 )_1]:
quantum: k(x) = qx, k(y) = q^{-1}y; e(x) = 0, e(y) = τx; f(x) = τ^{-1}y, f(y) = 0.
classical limit: h(x) = x, h(y) = −y; e(x) = 0, e(y) = τx; f(x) = τ^{-1}y, f(y) = 0.

steven duplij, dr. habil.
e-mail: steven.a.duplij@univer.kharkov.ua, http://webusers.physics.umn.edu/~duplij
theory group, nuclear physics laboratory, v. n. karazin kharkov national university, 4 svoboda sq., 61077 kharkov, ukraine

sergey sinel’shchikov
e-mail: sinelshchikov@ilt.kharkov.ua
mathematics division, b. i. verkin institute for low temperature physics and engineering, 47 lenin ave., 61103 kharkov, ukraine

acta polytechnica vol. 51 no. 4/2011

two remarks to bifullness of centers of archimedean atomic lattice effect algebras

m. kalina

abstract

lattice effect algebras generalize orthomodular lattices as well as mv-algebras. this means that within lattice effect algebras it is possible to model such effects as unsharpness (fuzziness) and/or non-compatibility. the main problem is the existence of a state. there are lattice effect algebras with no state. for this reason we need some conditions that simplify checking the existence of a state. if we know that the center c(e) of an atomic archimedean lattice effect algebra e (which is again atomic) is a bifull sublattice of e, then we are able to represent e as a subdirect product of lattice effect algebras ei, where the top element of each ei is an atom of c(e). in this case it is enough to find a state in at least one of the ei, and we are able to extend this state to the whole lattice effect algebra e. in [8] an atomic lattice effect algebra e (in fact, an atomic orthomodular lattice) with an atomic center c(e) was constructed, where c(e) is not a bifull sublattice of e. in this paper we show that for atomic lattice effect algebras e (atomic orthomodular lattices) neither completeness (and atomicity) of c(e) nor σ-completeness of e is a sufficient condition for c(e) to be a bifull sublattice of e.

keywords: lattice effect algebra, orthomodular lattice, center, atom, bifullness.

1 preliminaries

effect algebras, introduced by d. j. foulis and m. k. bennett [3], have their importance in the investigation of uncertainty. lattice-ordered effect algebras generalize orthomodular lattices and mv-algebras. thus they may include non-compatible pairs of elements as well as unsharp elements.

definition 1 (foulis and bennett [3]). an effect algebra is a system (e; ⊕, 0, 1) consisting of a set e with two different elements 0 and 1, called zero and unit, respectively, and a partially defined binary operation ⊕ satisfying the following conditions for all p, q, r ∈ e:

(e1) if p ⊕ q is defined, then q ⊕ p is defined and p ⊕ q = q ⊕ p.
(e2) if q ⊕ r is defined and p ⊕ (q ⊕ r) is defined, then p ⊕ q and (p ⊕ q) ⊕ r are defined and p ⊕ (q ⊕ r) = (p ⊕ q) ⊕ r.
(e3) for every p ∈ e there exists a unique q ∈ e such that p ⊕ q is defined and p ⊕ q = 1.
(e4) if p ⊕ 1 is defined, then p = 0.

the element q in (e3) will be called the supplement of p, and will be denoted by p′. in the whole paper, for an effect algebra (e, ⊕, 0, 1), writing a ⊕ b for arbitrary a, b ∈ e will mean that a ⊕ b exists. on an effect algebra e we may define another partial binary operation ⊖ by a ⊖ b = c ⇔ b ⊕ c = a. the operation ⊖ induces a partial order on e: namely, for a, b ∈ e we set b ≤ a if there exists c ∈ e such that a ⊖ b = c. if e with respect to ≤ is lattice-ordered, we say that e is a lattice effect algebra; for the sake of brevity we will write just lea. further, in this article we often briefly write ‘an effect algebra e’, skipping the operations. s. p. gudder ([5, 6]) introduced the notions of sharp elements and of sharply dominating lattice effect algebras.
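a standard example illustrating definition 1 and the derived operations (well known, see e.g. [2]): take the real unit interval with truncated addition,

$$ e = [0,1], \qquad a \oplus b := a + b \ \text{ defined iff } a + b \le 1, \qquad a' = 1 - a . $$

here a ⊖ b = a − b whenever b ≤ a, and the induced order is the usual one; moreover a ∧ a′ = min(a, 1 − a), so the elements with a ∧ a′ = 0 (the sharp elements, recalled below) are exactly 0 and 1, and ord(a) = ⌊1/a⌋ for a ≠ 0 (the isotropic index, also recalled below), so this effect algebra is archimedean.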
recall that an element x of the lea e is called sharp if x ∧ x′ = 0. jenča and riečanová proved in [7] that in every lattice effect algebra e the set s(e) = {x ∈ e; x ∧ x′ = 0} of sharp elements is an orthomodular lattice which is a sub-effect algebra of e, meaning that if among x, y, z ∈ e with x ⊕ y = z at least two elements are in s(e), then x, y, z ∈ s(e). moreover, s(e) is a full sublattice of e; hence the supremum of any set of sharp elements, if it exists in e, is again a sharp element. further, each maximal subset m of pairwise compatible elements of e, called a block of e, is a sub-effect algebra and a full sublattice of e, and e = ⋃{m ⊆ e; m is a block of e} (see [16, 17]). central elements and centers of effect algebras were defined in [4]. in [14, 15] it was proved that in every lattice effect algebra e the center

c(e) = {x ∈ e; (∀y ∈ e) y = (y ∧ x) ∨ (y ∧ x′)} = s(e) ∩ b(e),  (1)

where b(e) = ⋂{m ⊆ e; m is a block of e}. since s(e) is an orthomodular lattice and b(e) is an mv-effect algebra, we obtain that c(e) is a boolean algebra. note that e is an orthomodular lattice if and only if e = s(e), and e is an mv-effect algebra if and only if e = b(e). thus e is a boolean algebra if and only if e = s(e) = b(e) = c(e). recall that an element p of an effect algebra e is called an atom if and only if p is a minimal nonzero element of e, and e is atomic if for each x ∈ e, x ≠ 0, there exists an atom p ≤ x.

definition 2. let (e, ⊕, 0) be an effect algebra. to each a ∈ e we assign its isotropic index, denoted ord(a), as the maximal positive integer n such that na := a ⊕ · · · ⊕ a (n times) exists. we set ord(a) = ∞ if na exists for each positive integer n. we say that e is archimedean if for each a ∈ e, a ≠ 0, ord(a) is finite.

an element u ∈ e is called finite if there exists a finite system of atoms a1, . . . , an (not necessarily distinct) such that u = a1 ⊕ · · · ⊕ an. an element v ∈ e is called cofinite if there exists a finite element u ∈ e such that v = u′. we say that a finite system f = (xj)_{j=1}^k of not necessarily different elements of an effect algebra (e, ⊕, 0, 1) is ⊕-orthogonal if for all n ≤ k the sum x1 ⊕ x2 ⊕ · · · ⊕ xn = (x1 ⊕ x2 ⊕ · · · ⊕ xn−1) ⊕ xn exists in e (briefly we write ⊕_{j=1}^n xj). we also define ⊕∅ = 0.

definition 3. for a lattice (l, ∧, ∨) and a subset d ⊆ l we say that d is a bifull sublattice of l if and only if, for any x ⊆ d, ∨_l x exists if and only if ∨_d x exists, and ∧_l x exists if and only if ∧_d x exists, in which case ∨_l x = ∨_d x and ∧_l x = ∧_d x.

it is known that if e is a distributive effect algebra (i.e., the effect algebra e is a distributive lattice, e.g. if e is an mv-effect algebra), then c(e) = s(e). if moreover e is archimedean and atomic, then the set of atoms of c(e) = s(e) is the set {n_a·a; a ∈ e is an atom of e}, where n_a = ord(a) (see [20]). since s(e) is a bifull sublattice of e if e is an archimedean atomic lea (see [13]), we obtain that

1 = ∨_{c(e)} {p ∈ c(e); p is an atom of c(e)} = ∨_e {p ∈ c(e); p is an atom of c(e)}

for every archimedean atomic distributive lattice effect algebra e. in [8] it was shown that there exists an lea e for which this property fails to be true. important properties of archimedean atomic lattice effect algebras with an atomic center were proven by riečanová in [21].

theorem 1 (riečanová [21]). let e be an archimedean atomic lattice effect algebra with an atomic center c(e). let a_e be the set of all atoms of e and a_{c(e)} the set of all atoms of c(e).
the following conditions are equivalent:

1. ∨_e a_{c(e)} = 1.
2. for every atom a ∈ a_e there exists an atom p_a ∈ a_{c(e)} such that a ≤ p_a.
3. for every z ∈ c(e) it holds that z = ∨_{c(e)} {p ∈ a_{c(e)}; p ≤ z} = ∨_e {p ∈ a_{c(e)}; p ≤ z}.
4. c(e) is a bifull sublattice of e.

in this case e is isomorphic to a subdirect product of archimedean atomic irreducible lattice effect algebras.

theorem 2 (paseka, riečanová [13]). let e be an atomic archimedean lattice effect algebra. then the set s(e) of all sharp elements of e is a bifull sublattice of e.

we will deal only with atomic archimedean lattice effect algebras e. we have c(e) ⊂ s(e) ⊂ e. because of this inclusion and theorem 2, considering the bifullness of the center c(e) in e is equivalent to considering the bifullness of c(e) in s(e); and s(e) is an orthomodular lattice. for this reason, in the rest of the paper we restrict our attention to atomic orthomodular lattices l and their centers c(l). for the sake of completeness, we give the definition of an orthomodular lattice.

definition 4. let l be a bounded lattice with a unary operation ′ (called complementation) satisfying the following conditions:

1. for all a ∈ l, (a′)′ = a;
2. for all a, b ∈ l, if a ≤ b then b′ ≤ a′;
3. for all a, b ∈ l, if a ≤ b then a ∨ (a′ ∧ b) = b.

then l is said to be an orthomodular lattice (oml for brevity).

remark 1. though in omls we have just the lattice-theoretical operations ∨ and ∧, we will also use the effect-algebraic operations ⊕ and ⊖, with the meaning a ⊕ b = a ∨ b iff a ≤ b′, and a ⊖ b = c iff b ⊕ c = a.

2 orthomodular lattice l whose center is not a bifull sublattice

let us have the following sequences of atoms (sets):

a0 = {(x, y) ∈ r^2; 0 ≤ x ≤ 1, y ∈ r},
al = {(x, y) ∈ r^2; l < x ≤ l + 1, y ∈ r}, for l = 1, 2, . . .,
b0 = {(x, y) ∈ r^2; −1 ≤ x < 0, y ∈ r},
bl = {(x, y) ∈ r^2; −l − 1 ≤ x < −l, y ∈ r}, for l = 1, 2, . . .,  (2)
cj = {(x, y) ∈ r^2; −j ≤ x ≤ j, y ≤ j·x}, for j = 1, 2, . . .,
dj = {(x, y) ∈ r^2; −j ≤ x ≤ j, y > j·x}, for j = 1, 2, . . .,
pj = {j}, for j = 1, 2, . . ..

for such a choice of atoms, q1 ≠ q2 are compatible if and only if q1 ∩ q2 = ∅. fig. 1 shows the compatibility among the atoms.

[fig. 1: greechie diagram of the sets of atoms p1, p2, p3, . . ., a0, b0, a1, b1, . . . and c1, d1, c2, d2, . . ..]

for their non-compatibility (denoted by ↮) the following rules hold:

cj ↮ ai, cj ↮ bi for all j = 1, 2, . . . and i = 0, . . . , j − 1,
dj ↮ ai, dj ↮ bi for all j = 1, 2, . . . and i = 0, . . . , j − 1,
cj ↮ di for all i, j = 1, 2, . . . such that i ≠ j,
cj ↮ ci, dj ↮ di for all i, j = 1, 2, . . . such that i ≠ j.

for non-compatible atoms the following equalities hold:

cj ⊕ dj = ⊕_{i=0}^{j−1} (ai ⊕ bi) = ck ∨ cj = dk ∨ dj = ck ∨ dj = dk ∨ cj = cj ∨ al = cj ∨ bl = dj ∨ al = dj ∨ bl, for 1 ≤ k < j and 0 ≤ l < j.

denote by b̂0, b̂j (for j = 1, 2, . . .) the complete atomic boolean algebras with the corresponding sets of atoms a0, aj (j = 1, 2, . . .), given by

a0 = ⋃_{i=0}^∞ {ai} ∪ ⋃_{i=0}^∞ {bi} ∪ ⋃_{j=1}^∞ {pj},  (3)
aj = ⋃_{i=j}^∞ {ai} ∪ ⋃_{i=j}^∞ {bi} ∪ ⋃_{j=1}^∞ {pj} ∪ {cj, dj}.  (4)

the disjointness occurring among some atoms of the system (2) is equivalent to the fact that a0 and aj (j = 1, 2, . . .) are the unique maximal sets of pairwise compatible atoms.

theorem 3 (kalina [9]). let l̂ = ⋃_{i=0}^∞ b̂i. let l1 be the complete oml generated by the sets of atoms ⋃_{i=0}^∞ {ai, bi} ∪ ⋃_{j=1}^∞ {cj, dj}, and n the complete boolean algebra generated by the set of atoms ⋃_{j=1}^∞ {pj}. then (l̂, ∨, ∧, 0, 1) is a complete oml and l̂ ≅ l1 × n.
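as a concrete check of the first of the equalities displayed earlier in this section (our computation, for j = 1, identifying the ⊕ of disjoint compatible sets with their union, as the construction suggests): the sets c1 and d1 partition the vertical strip −1 ≤ x ≤ 1, and so do a0 and b0, whence

$$ c_1 \oplus d_1 = \{(x,y) \in \mathbb{r}^2;\ -1 \le x \le 1\} = a_0 \oplus b_0 . $$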
an element u ∈ b̂l is finite if and only if u = q1 ⊕ q2 ⊕ · · · ⊕ qn for some n ∈ n and q1, q2, . . . , qn ∈ al. set ql = {u ∈ b̂l; u is finite}, l = 0, 1, 2, . . .. then ql is a generalized boolean algebra, since bl = ql ∪̇ q*l is a boolean algebra, where q*l = {u*; u* = 1l ⊖ u and u ∈ ql} (see [22], or [2, pp. 18–19]). this means that bl is the boolean subalgebra of finite and cofinite elements of b̂l (l = 0, 1, 2, . . .).

theorem 4 (kalina [8]). denote l = ⋃_{l=0}^∞ bl. then (l, ∨, ∧, 0, 1) is a compactly generated orthomodular lattice with the family (bl)_{l=0}^∞ of atomic blocks of l. the center of l, c(l), is not a bifull sublattice of l.

3 completion of the center of l

we are going to show that it is possible to extend the orthomodular lattice l from theorem 4 to l̄, whose center, c(l̄), is a complete boolean algebra which is not a bifull sublattice of l̄. denote by f a fixed non-trivial ultrafilter on n (the index set of the atoms pj). then f has the following properties, which will be important for our construction:

• let f ⊂ n. then either f ∈ f or n \ f ∈ f.
• let f ⊂ n be a finite set. then f ∉ f.
• if f1 ∈ f and f2 ∈ f, then f1 ∩ f2 ∈ f.
• if f1 ∈ f and f2 ⊃ f1, then f2 ∈ f.

let ql1 denote the set of all finite elements of l1. further set

pf = {⊕_{i∈f} pi; f ∉ f}  (5)

and g = {f ⊕ g; g ∈ ql1, f ∈ pf}, g⊥ = {h′ ∈ l̂; h ∈ g}.

theorem 5. let l̃ = g ∪̇ g⊥. then the system (l̃, ∨, ∧, 0, 1) is an orthomodular lattice. the center is c(l̃) = {f ∈ l̃; f ∈ pf or f′ ∈ pf}, and c(l̃) is a complete boolean algebra which is not bifull in l̃.

proof. first we show that l̃ is a bounded lattice. consider elements h1, h2 ∈ g. then there exist elements g1, g2 ∈ ql1 and elements f1, f2 ∈ pf such that

h1 = f1 ⊕ g1, h2 = f2 ⊕ g2.  (6)

by the properties of the non-trivial ultrafilter f we get that f1 ∨ f2 ∈ pf and f1 ∧ f2 ∈ pf (indeed, if the index sets f1, f2 ∉ f, then n \ f1, n \ f2 ∈ f, hence n \ (f1 ∪ f2) = (n \ f1) ∩ (n \ f2) ∈ f, so f1 ∪ f2 ∉ f). since g1, g2 are finite elements of l1, we get that g1 ∨ g2 ∈ ql1 and also g1 ∧ g2 ∈ ql1. since l1 is generated by the sets of atoms ⋃_{i=0}^∞ {ai, bi} and ⋃_{j=1}^∞ {cj, dj}, each g ∈ ql1 is ⊕-orthogonal to each f ∈ pf. this implies that g is closed under ∨ and ∧. because g⊥ consists of the complements of the elements of g, we have that g⊥ is also closed under ∨ and ∧. now assume that h1 ∈ g and h2 ∈ g⊥. then h′2 ∈ g and we can write h1 = f1 ⊕ g1, h′2 = f2 ⊕ g2, with the same meaning of f1, f2, g1, g2 as in formula (6). this means that h2 = f′2 ⊖ g2. then, because of the monotonicity of the ultrafilter f, we have (f1 ∨ f′2)′ ∈ pf and hence f1 ∨ f′2 ∈ g⊥. moreover, g2 ∈ ql1 is orthogonal to f1, which implies (f1 ∨ f′2) ⊖ g2 = f1 ∨ (f′2 ⊖ g2) ∈ g⊥. since g is a monotone system (meaning that with an arbitrary element δ1 ∈ g it contains also all elements δ2 ∈ l̂ such that δ2 ≤ δ1), we get from the duality between g and g⊥ that (f1 ∨ g1) ∨ (f′2 ⊖ g2) = h1 ∨ h2 ∈ g⊥, and dually we get that h1 ∧ h2 ∈ g. this implies that l̃ = g ∪̇ g⊥ is a lattice. obviously it is a bounded and orthocomplemented lattice. showing that it is an oml is a matter of routine, and we omit the detailed proof. let us consider an element f ∈ l̃ such that f ∈ pf or f′ ∈ pf. then f is a central element. if f is such that neither f ∈ pf nor f′ ∈ pf, then there exist atoms α1, α2 ∈ ⋃_{i=0}^∞ {ai, bi} ∪ ⋃_{j=1}^∞ {cj, dj} fulfilling α1 ↮ α2 and α1 ≤ f, α2 ≰ f. then f is not a central element. this proves that c(l̃) = {f ∈ l̃; f ∈ pf or f′ ∈ pf}. due to the fact that f is a non-trivial ultrafilter, c(l̃) is a complete boolean algebra.
4 σ-complete orthomodular lattice L̃σ whose center is not a bifull sublattice

Let $I$ denote the set of all ordinal numbers less than $\omega_1$ (the first uncountable ordinal number). Further, denote by $E$ the set of all limit ordinal numbers up to $\omega_1$, and $J = I \setminus E$. Assume sets of elements $\{p_i;\ i \in I\}$, $\{a_i;\ i \in I\}$, $\{b_i;\ i \in I\}$, $\{c_i;\ i \in I\}$, $\{d_i;\ i \in I\}$, where the corresponding elements for $i \in J$ will act as atoms. We will have a partial relation $\nleftrightarrow$ modelling non-compatibility. Among atoms this partial relation will have the following form:

$c_j \nleftrightarrow a_i$, $c_j \nleftrightarrow b_i$ for all $j \in J$ and $i \leq j$,
$d_j \nleftrightarrow a_i$, $d_j \nleftrightarrow b_i$ for all $j \in J$ and $i \leq j$,
$c_j \nleftrightarrow d_i$ for all $i, j \in J$ such that $i \neq j$,
$c_j \nleftrightarrow c_i$, $d_j \nleftrightarrow d_i$ for all $i, j \in J$ such that $i \neq j$.

The sets of elements $\{p_i;\ i \in I\}$, $\{a_i;\ i \in I\}$, $\{b_i;\ i \in I\}$, $\{c_i;\ i \in I\}$, $\{d_i;\ i \in I\}$ represent atoms for $i \in J$, and for $\kappa \in E$ we set

$p_{\kappa} = \bigvee_{i<\kappa} p_i$,    (7)
$a_{\kappa} = \bigvee_{i<\kappa} a_i$,    (8)
$b_{\kappa} = \bigvee_{i<\kappa} b_i$,    (9)
$c_{\kappa} = \bigvee_{i<\kappa} c_i = \bigvee_{i<\kappa} d_i = d_{\kappa} = a_{\kappa} \oplus b_{\kappa}$.    (10)

As a possible model for the just presented sets of elements fulfilling the non-compatibility relation we may take the following. Let us choose a good order of the positive real numbers of type $\omega_1$, i.e., the positive real numbers will be enumerated by the ordinal numbers from $J$. For $i \in J$ and $r > 0$, $r \in \mathbb{R}$, we denote by $r_i$ the $i$-th number in the chosen good order. Then we identify the set $\{p_i;\ i \in J\}$ with the set of all positive real numbers, i.e., $p_i = r_i$. Further we put, for $i, j \in J$,

$a_i = \{(r_i, y) \in \mathbb{R}^2;\ y \in \mathbb{R}\}$,
$b_i = \{(-r_i, y) \in \mathbb{R}^2;\ y \in \mathbb{R}\}$,
$c_i = \{(r_j, y) \in \mathbb{R}^2;\ j \leq i,\ y \leq r_i \cdot r_j\} \cup \{(-r_j, y) \in \mathbb{R}^2;\ j \leq i,\ y \leq -r_i \cdot r_j\}$,
$d_i = \{(r_j, y) \in \mathbb{R}^2;\ j \leq i,\ y > r_i \cdot r_j\} \cup \{(-r_j, y) \in \mathbb{R}^2;\ j \leq i,\ y > -r_i \cdot r_j\}$.

For $\kappa \in E$ we define the corresponding elements $p_{\kappa}, a_{\kappa}, b_{\kappa}, c_{\kappa}, d_{\kappa}$ by the equalities (7), (8), (9), (10), respectively. Compatibility among different atoms is given by the disjointness of the corresponding sets. This implies that the uniquely given maximal sets of pairwise compatible atoms are

$\tilde{\mathcal{A}}_0 = \bigcup_{i \in J} \{a_i, b_i, p_i\}$,
$\tilde{\mathcal{A}}_j = \bigcup_{i \in J,\ i > j} \{a_i, b_i\} \cup \bigcup_{i \in J} \{p_i\} \cup \{c_j, d_j\}$ for $j \in J$.

The sets of atoms $\tilde{\mathcal{A}}_0$ and $\tilde{\mathcal{A}}_j$ for $j \in J$ generate complete Boolean algebras $\tilde{B}_0$ and $\tilde{B}_j$ for $j \in J$, respectively. For $\kappa \in E$ we get complete atomic Boolean algebras $\tilde{B}_{\kappa}$ generated by the sets of atoms

$\tilde{\mathcal{A}}_{\kappa} = \bigcup_{i \in J} \{p_i\} \cup \{a_{\kappa}, b_{\kappa}\} \cup \bigcup_{i \in J,\ i > \kappa} \{a_i, b_i\}$.

This means that for $\kappa \in E$ we have $\tilde{B}_{\kappa} \subset \tilde{B}_0$. The union of all these complete atomic Boolean algebras, $\tilde{L} = \tilde{B}_0 \cup \bigcup_{i \in I} \tilde{B}_i$, is a complete OML.

An element $f \in \tilde{L}$ will be called countable if there exists an at most countable set of atoms (with an at most countable set of indices $K$), $\{q_k\}_{k \in K} \subset \tilde{\mathcal{A}}_0$ or $\{q_k\}_{k \in K} \subset \tilde{\mathcal{A}}_i$ for $i \in J$, such that $f = \bigoplus_{k \in K} q_k$. By the definition of the elements $p_i, a_i, b_i, c_i, d_i$ for $i \in I$ we get that each of these elements is countable. Let $K$ denote the set of all countable elements of $\tilde{L}$ and $K^{\perp} = \{f \in \tilde{L};\ f' \in K\}$. Further, let $P$ denote the set of all countable elements generated by $\{p_i;\ i \in J\}$, and $P^{\perp} = \{f \in \tilde{L};\ f' \in P\}$.

Theorem 6. Let $\tilde{L}_{\sigma} = K \mathbin{\dot\cup} K^{\perp}$.
Then $(\tilde{L}_{\sigma}, \vee, \wedge, 0, 1)$ is a σ-complete OML. The center is $C(\tilde{L}_{\sigma}) = P \mathbin{\dot\cup} P^{\perp}$, and it is not a bifull sublattice of $\tilde{L}_{\sigma}$.

Proof. Each of the atoms $p_i, a_i, b_i, c_i, d_i$ for $i \in J$ (and hence also each of the elements $p_i, a_i, b_i, c_i, d_i$ for $i \in I$) is countable. This implies that $\tilde{L}_{\sigma}$ is an OML. Since it is by definition closed under countable meets and joins, it is σ-complete. The elements $p_i$ for $i \in I$ are central because each of the elements $p_i$ is compatible with all atoms of $\tilde{L}_{\sigma}$. This implies that $P \mathbin{\dot\cup} P^{\perp} \subset C(\tilde{L}_{\sigma})$. On the other hand, let $f$ be a countable element, $f \notin P$. Then there exists $c_i$, $i \in J$, such that $c_i \nleq f$, and an atom $e \in \{a_j, b_j, c_j, d_j\}$ for some $j < i$ with $e \leq f$. Then $c_i \nleftrightarrow e$ and hence $c_i \nleftrightarrow f$. Similarly, if $f \in K^{\perp}$, there exists $c_i \leq f$ and an atom $e \in \{a_j, b_j, c_j, d_j\}$ for some $j < i$ such that $e \nleq f$. In this case $e \nleftrightarrow c_i$ and hence also $e \nleftrightarrow f$. We conclude that $C(\tilde{L}_{\sigma}) = P \mathbin{\dot\cup} P^{\perp}$.

We show that $C(\tilde{L}_{\sigma})$ is not a bifull sublattice of $\tilde{L}_{\sigma}$. Obviously $\bigvee_{C(\tilde{L}_{\sigma})} \{p_i;\ i \in I\} = 1$. Assume that $\bigvee_{\tilde{L}_{\sigma}} \{p_i;\ i \in I\}$ does exist. Then all elements $e \in \bigcup_{i \in I} \{a_i, b_i, c_i, d_i\}$ are orthogonal to all elements of the set $\bigcup_{j \in I} \{p_j\}$, and consequently also to $\bigvee_{\tilde{L}_{\sigma}} \{p_i;\ i \in I\}$. This implies $\bigvee_{\tilde{L}_{\sigma}} \{p_i;\ i \in I\} \neq 1$. This means that $C(\tilde{L}_{\sigma})$ is not a bifull sublattice of $\tilde{L}_{\sigma}$. $\square$

Acknowledgement

Support from the Science and Technology Assistance Agency under contract No. APVV-0073-10, and from the VEGA grant agency, grant number 1/0297/11, is gratefully acknowledged.

References

[1] Chang, C. C.: Algebraic analysis of many-valued logics. Trans. Amer. Math. Soc. 88 (1958), 467–490.
[2] Dvurečenskij, A., Pulmannová, S.: New Trends in Quantum Structures. Kluwer Acad. Publishers, Dordrecht, Boston, London, and Ister Science, Bratislava, 2000.
[3] Foulis, D. J., Bennett, M. K.: Effect algebras and unsharp quantum logics. Found. Phys. 24 (1994), 1325–1346.
[4] Greechie, R. J., Foulis, D. J., Pulmannová, S.: The center of an effect algebra. Order 12 (1995), 91–106.
[5] Gudder, S. P.: Sharply dominating effect algebras. Tatra Mountains Math. Publ. 15 (1998), 23–30.
[6] Gudder, S. P.: S-dominating effect algebras. Internat. J. Theor. Phys. 37 (1998), 915–923.
[7] Jenča, G., Riečanová, Z.: On sharp elements in lattice ordered effect algebras. BUSEFAL 80 (1999), 24–29.
[8] Kalina, M.: On central atoms of Archimedean atomic lattice effect algebras. Kybernetika 46 (2010), 4, 609–620.
[9] Kalina, M.: Mac Neille completion of centers and centers of Mac Neille completions of lattice effect algebras. Kybernetika 46 (2010), 6, 635–647.
[10] Kôpka, F.: Compatibility in D-posets. Internat. J. Theor. Phys. 34 (1995), 1525–1531.
[11] Mosná, K.: About atoms in generalized effect algebras and their effect algebraic extensions. J. Electr. Engrg. 57 (2006), 7/s, 110–113.
[12] Mosná, K., Paseka, J., Riečanová, Z.: Order convergence and order and interval topologies on posets and lattice effect algebras. In: Proc. Internat. Seminar UNCERTAINTY 2008, Publishing House of STU, 2008, 45–62.
[13] Paseka, J., Riečanová, Z.: The inheritance of BDE-property in sharply dominating lattice effect algebras and (o)-continuous states. Soft Computing 15 (2011), 543–555.
[14] Riečanová, Z.: Compatibility and central elements in effect algebras. Tatra Mountains Math. Publ. 16 (1999), 151–158.
[15] Riečanová, Z.: Subalgebras, intervals and central elements of generalized effect algebras. Internat. J. Theor. Phys. 38 (1999), 3209–3220.
[16] Riečanová, Z.: Generalization of blocks for D-lattices and lattice ordered effect algebras. Internat. J. Theor. Phys. 39 (2000), 231–237.
[17] Riečanová, Z.: Orthogonal sets in effect algebras. Demonstratio Mathematica 34 (2001), 525–532.
[18] Riečanová, Z.: Smearing of states defined on sharp elements onto effect algebras. Internat. J. Theor. Phys. 41 (2002), 1511–1524.
[19] Riečanová, Z.: Subdirect decompositions of lattice effect algebras. Internat. J. Theor. Phys. 42 (2003), 1425–1433.
[20] Riečanová, Z.: Distributive atomic effect algebras. Demonstratio Mathematica 36 (2003), 247–259.
[21] Riečanová, Z.: Lattice effect algebras densely embeddable into complete ones. Kybernetika 47 (2011), 1, 100–109.
[22] Riečanová, Z., Marinová, I.: Generalized homogeneous, prelattice and MV-effect algebras. Kybernetika 41 (2005), 129–142.

Martin Kalina, e-mail: kalina@math.sk, Dept. of Mathematics, Faculty of Civil Engineering, Slovak Univ. of Technology, Radlinského 11, SK-813 68 Bratislava, Slovakia

Toda tau functions with quantum torus symmetries

K. Takasaki

Abstract: The quantum torus algebra plays an important role in a special class of solutions of the Toda hierarchy. Typical examples are the solutions related to the melting crystal model of topological strings and 5D SUSY gauge theories. The quantum torus algebra is realized by a 2D complex free fermion system that underlies the Toda hierarchy, and exhibits mysterious "shift symmetries". This article is based on collaboration with Toshio Nakatsu.

Keywords: Toda hierarchy, melting crystal model, quantum torus algebra.

1 Introduction

This paper is a review of our recent work [1, 2] on an integrable structure of the melting crystal model of topological strings [3] and 5D gauge theories [4]. It is shown here that the partition function of this model, on being suitably deformed by special external potentials, is essentially a tau function of the Toda hierarchy [5]. A technical clue to this observation is a kind of symmetries (referred to as "shift symmetries") in the underlying quantum torus algebra. These symmetries enable us, firstly, to convert the deformed partition function to the tau function and, secondly, to show the existence of hidden symmetries of the tau function. These results can be extended to some other Toda tau functions that are related to the topological vertex [6] and the double Hurwitz numbers of the Riemann sphere [7].

2 Quantum torus algebra

Throughout this paper, $q$ denotes a constant with $|q| < 1$, and $\Lambda$ and $\Delta$ denote the $\mathbb{Z} \times \mathbb{Z}$ matrices

$\Lambda = \sum_{i \in \mathbb{Z}} e_{i,i+1} = (\delta_{i+1,j}), \qquad \Delta = \sum_{i \in \mathbb{Z}} i\, e_{ii} = (i\,\delta_{ij}).$

Their combinations

$v^{(k)}_m = q^{-km/2}\, \Lambda^m q^{k\Delta} \quad (k, m \in \mathbb{Z})$    (1)

satisfy the commutation relations

$[v^{(k)}_m, v^{(l)}_n] = \left(q^{(lm-kn)/2} - q^{(kn-lm)/2}\right) v^{(k+l)}_{m+n}$    (2)

of the quantum torus algebra. This Lie algebra can thus be embedded into the Lie algebra $gl(\infty)$ of $\mathbb{Z} \times \mathbb{Z}$ matrices $A = (a_{ij})$ for which there exists $N$ such that $a_{ij} = 0$ for $|i - j| > N$. To formulate a fermionic realization of this Lie algebra, we introduce the creation/annihilation operators $\psi_i, \psi^*_i$ ($i \in \mathbb{Z}$) with the anti-commutation relations

$\psi_i \psi^*_j + \psi^*_j \psi_i = \delta_{i+j,0}, \qquad \psi_i \psi_j + \psi_j \psi_i = 0, \qquad \psi^*_i \psi^*_j + \psi^*_j \psi^*_i = 0$

and the 2D free fermion fields

$\psi(z) = \sum_{i \in \mathbb{Z}} \psi_i z^{-i-1}, \qquad \psi^*(z) = \sum_{i \in \mathbb{Z}} \psi^*_i z^{-i}.$

The vacuum states $\langle 0|$, $|0\rangle$ of the Fock space and its dual space are characterized by the vacuum conditions

$\psi_i |0\rangle = 0\ (i \geq 0), \quad \psi^*_i |0\rangle = 0\ (i \geq 1), \quad \langle 0| \psi_i = 0\ (i \leq -1), \quad \langle 0| \psi^*_i = 0\ (i \leq 0).$
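Relation (2) can be verified numerically on a finite window of the $\mathbb{Z} \times \mathbb{Z}$ matrices. The sketch below is ours, for illustration only; the truncation spoils entries near the border of the window, so only an interior block is compared.

```python
import numpy as np

# Finite window of the Z x Z matrices: rows/columns carry the indices
# i = -N//2, ..., N//2 - 1.
N = 40
idx = np.arange(N) - N // 2

def shift(m):
    """(Lambda^m)_{ij} = delta_{i+m,j}, restricted to the window."""
    S = np.zeros((N, N))
    for r in range(N):
        if 0 <= r + m < N:
            S[r, r + m] = 1.0
    return S

def v(k, m, q):
    """v^(k)_m = q^{-km/2} Lambda^m q^{k Delta}, with (q^{k Delta})_{jj} = q^{kj}."""
    return q ** (-k * m / 2) * shift(m) @ np.diag(q ** (k * idx.astype(float)))

q = 0.5
for (k, m), (l, n) in [((1, 2), (2, -1)), ((0, 3), (2, 1)), ((-1, 2), (3, 2))]:
    lhs = v(k, m, q) @ v(l, n, q) - v(l, n, q) @ v(k, m, q)
    rhs = (q ** ((l * m - k * n) / 2) - q ** ((k * n - l * m) / 2)) \
          * v(k + l, m + n, q)
    a, b = 10, N - 10          # interior block, away from the truncated border
    assert np.allclose(lhs[a:b, a:b], rhs[a:b, a:b], rtol=1e-9, atol=1e-12)
print("commutation relation (2) holds on the interior of the window")
```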
To any element $A = (a_{ij})$ of $gl(\infty)$ one can associate the fermion bilinear

$\hat{A} = \sum_{i,j \in \mathbb{Z}} a_{ij} :\psi_{-i}\psi^*_j:, \qquad :\psi_{-i}\psi^*_j: \;=\; \psi_{-i}\psi^*_j - \langle 0|\psi_{-i}\psi^*_j|0\rangle.$

These fermion bilinears form a one-dimensional central extension $\widehat{gl}(\infty)$ of $gl(\infty)$. The special fermion bilinears [1, 2]

$V^{(k)}_m = \widehat{v^{(k)}_m} = q^{k/2} \oint \frac{dz}{2\pi i}\, z^m :\psi(q^{k/2}z)\,\psi^*(q^{-k/2}z):$    (3)

satisfy the commutation relations

$[V^{(k)}_m, V^{(l)}_n] = \left(q^{(lm-kn)/2} - q^{(kn-lm)/2}\right) \left( V^{(k+l)}_{m+n} - \frac{q^{k+l}}{1-q^{k+l}}\, \delta_{m+n,0} \right)$    (4)

for $k$ and $l$ with $k + l \neq 0$, and

$[V^{(k)}_m, V^{(-k)}_n] = \left(q^{-k(m+n)/2} - q^{k(m+n)/2}\right) V^{(0)}_{m+n} + m\,\delta_{m+n,0}.$    (5)

Thus $\widehat{gl}(\infty)$ contains a central extension of the quantum torus algebra, in which the $\widehat{u}(1)$ algebra is realized by

$J_m = V^{(0)}_m = \widehat{\Lambda^m} \quad (m \in \mathbb{Z}).$    (6)

3 Shift symmetries

Let us introduce the operators

$G_{\pm} = \exp\left( \sum_{k=1}^{\infty} \frac{q^{k/2}}{k(1-q^k)}\, J_{\pm k} \right), \qquad W_0 = \sum_{n \in \mathbb{Z}} n^2 :\psi_{-n}\psi^*_n:.$    (7)

The $G_{\pm}$ play the role of "transfer matrices" in the melting crystal model [3, 4]. $W_0$ is a fermionic form of the so-called "cut-and-join" operator for Hurwitz numbers [8]. $G_{\pm}$ and $q^{W_0/2}$ induce the following two types of "shift symmetries" [1, 2] in the (centrally extended) quantum torus algebra.

• First shift symmetry:

$G_- G_+ \left( V^{(k)}_m - \delta_{m,0} \frac{q^k}{1-q^k} \right) (G_- G_+)^{-1} = (-1)^k \left( V^{(k)}_{m+k} - \delta_{m+k,0} \frac{q^k}{1-q^k} \right)$    (8)

• Second shift symmetry:

$q^{W_0/2}\, V^{(k)}_m\, q^{-W_0/2} = V^{(k-m)}_m$    (9)

4 Toda tau function in melting crystal model

A general tau function of the 2D Toda hierarchy [5] is given by

$\tau(s, t, \bar{t}) = \langle s|\exp\left( \sum_{k=1}^{\infty} t_k J_k \right) g \exp\left( -\sum_{k=1}^{\infty} \bar{t}_k J_{-k} \right) |s\rangle,$    (10)

where $t = (t_1, t_2, \cdots)$ and $\bar{t} = (\bar{t}_1, \bar{t}_2, \cdots)$ are the time variables of the Toda hierarchy, $\langle s|$ and $|s\rangle$ are the ground states

$\langle s| = \langle -\infty| \cdots \psi^*_{s-1}\psi^*_s, \qquad |s\rangle = \psi_{-s}\psi_{-s+1} \cdots |-\infty\rangle$

in the charge-$s$ sector of the Fock space, and $g$ is an element of $GL(\infty) = \exp(gl(\infty))$. On the other hand, the partition function $Z(Q, s, T)$ of the deformed melting crystal model [1, 2] can be cast into the apparently similar (but essentially different) form

$Z(Q, s, T) = \langle s| G_+ e^{H(T)} Q^{L_0} G_- |s\rangle,$    (11)

where $Q$ and $T = (T_1, T_2, \cdots)$ are coupling constants of the model, and $H(T)$ and $L_0$ are the following operators:

$H(T) = \sum_{k=1}^{\infty} T_k H_k, \qquad H_k = V^{(k)}_0, \qquad L_0 = \sum_{n \in \mathbb{Z}} n :\psi_{-n}\psi^*_n:.$    (12)

The shift symmetries (8) and (9) imply the operator identity

$G_+ e^{H(T)} G_+^{-1} = \exp\left( \sum_{k=1}^{\infty} \frac{T_k q^k}{1-q^k} \right) G_-^{-1} q^{-W_0/2} \exp\left( \sum_{k=1}^{\infty} (-1)^k T_k J_k \right) q^{W_0/2} G_-.$

Inserting this identity and using the facts that

$\langle s| G_-^{-1} q^{-W_0/2} = q^{-s(s+1)(2s+1)/12} \langle s|, \qquad q^{-W_0/2} G_+^{-1} |s\rangle = q^{-s(s+1)(2s+1)/12} |s\rangle,$

we can rewrite $Z(Q, s, T)$ as

$Z(Q, s, T) = \exp\left( \sum_{k=1}^{\infty} \frac{T_k q^k}{1-q^k} \right) q^{-s(s+1)(2s+1)/6}\, \tau(s, t, 0), \qquad t_k = (-1)^k T_k,$    (13)

where the $GL(\infty)$ element $g$ defining the tau function is given by

$g = q^{W_0/2} G_- G_+ Q^{L_0} G_- G_+ q^{W_0/2}.$    (14)

Actually, the shift symmetries imply the operator identity

$G_-^{-1} e^{H(T)} G_- = \exp\left( \sum_{k=1}^{\infty} \frac{T_k q^k}{1-q^k} \right) G_+ q^{W_0/2} \exp\left( \sum_{k=1}^{\infty} (-1)^k T_k J_{-k} \right) q^{-W_0/2} G_+^{-1}$

as well. This leads to another expression of $Z(Q, s, T)$ in which $\tau(s, t, 0)$ is replaced with $\tau(s, 0, -t)$. The existence of the two different expressions can be explained by the intertwining relations

$J_k\, g = g\, J_{-k} \quad (k = 1, 2, \ldots),$    (15)

which, too, are a consequence of the shift symmetries. These intertwining relations imply the constraints

$\left( \partial_{t_k} + \partial_{\bar{t}_k} \right) \tau(s, t, \bar{t}) = 0 \quad (k = 1, 2, \ldots)$    (16)

on the tau function. The tau function $\tau(s, t, \bar{t})$ thereby becomes a function $\tau(s, t - \bar{t})$ of the difference $t - \bar{t}$. In particular, $\tau(s, t, 0)$ and $\tau(s, 0, -t)$ coincide. The reduced function $\tau(s, t)$ may be thought of as a tau function of the 1D Toda hierarchy.
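The step from (15) to (16) is standard but worth spelling out. With the shorthand $\xi(t) = \sum_{k\geq 1} t_k J_k$ and $\bar\xi(\bar t) = \sum_{k\geq 1} \bar t_k J_{-k}$ (our notation), and using that $J_k$ commutes with $e^{\xi(t)}$ and $J_{-k}$ with $e^{-\bar\xi(\bar t)}$, differentiating (10) gives

```latex
\partial_{t_k}\tau
  = \langle s|\, e^{\xi(t)}\, J_k\, g\, e^{-\bar\xi(\bar t)}\, |s\rangle
  \overset{(15)}{=} \langle s|\, e^{\xi(t)}\, g\, J_{-k}\, e^{-\bar\xi(\bar t)}\, |s\rangle
  = -\,\partial_{\bar t_k}\tau ,
```

hence $(\partial_{t_k} + \partial_{\bar t_k})\tau = 0$, and $\tau$ is constant along the characteristics of these first-order constraints, i.e. $\tau(s, t, \bar t) = \tau(s, t - \bar t, 0)$. Setting $t = 0$ reproduces the coincidence of $\tau(s, t, 0)$ and $\tau(s, 0, -t)$ quoted above.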
Relations (15) are a special case of the more general intertwining relations

$\left( V^{(k)}_m - \delta_{m,0} \frac{q^k}{1-q^k} \right) g = q^{-k}\, g \left( V^{(-k)}_{-2k-m} - \delta_{2k+m,0} \frac{q^{-k}}{1-q^{-k}} \right).$    (17)

We can translate these relations into the language of the Lax formalism of the Toda hierarchy. A study of this issue is now in progress.

5 Other models

The following Toda tau functions can be treated more or less in the same way as the foregoing tau function. We shall discuss this issue elsewhere.

1. The generating function of the two-leg amplitude $W_{\lambda\mu}$ in the topological vertex [6] is a Toda tau function determined by

$g = q^{W_0/2} G_+ G_- q^{W_0/2}.$    (18)

2. The generating function of the double Hurwitz numbers of the Riemann sphere [7] is a Toda tau function determined by

$g = e^{-\beta W_0} Q^{L_0}.$    (19)

The parameter $q$ is interpreted as $q = e^{-\beta}$.

Acknowledgement

This work has been partly supported by the JSPS Grants-in-Aid for Scientific Research No. 19104002, No. 21540218 and No. 22540186 from the Japan Society for the Promotion of Science.

References

[1] Nakatsu, T., Takasaki, K.: Melting crystal, quantum torus and Toda hierarchy. Comm. Math. Phys. 285 (2009), 445–468.
[2] Nakatsu, T., Takasaki, K.: Integrable structure of melting crystal model with external potential. Advanced Studies in Pure Math. Vol. 59 (Math. Soc. Japan, 2010), pp. 201–223.
[3] Okounkov, A., Reshetikhin, N., Vafa, C.: Quantum Calabi-Yau and classical crystals. In: P. Etingof, V. Retakh, I. M. Singer (eds.), The Unity of Mathematics, Progr. Math. 244, Birkhäuser, 2006, pp. 597–618.
[4] Maeda, T., Nakatsu, T., Takasaki, K., Tamakoshi, T.: Five-dimensional supersymmetric Yang-Mills theories and random plane partitions. JHEP 0503 (2005), 056.
[5] Takasaki, K., Takebe, T.: Integrable hierarchies and dispersionless limit. Rev. Math. Phys. 7 (1995), 743–808.
[6] Zhou, J.: Hodge integrals and integrable hierarchies. arXiv:math.AG/0310408.
[7] Okounkov, A.: Toda equations for Hurwitz numbers. Math. Res. Lett. 7 (2000), 447–453.
[8] Kazarian, M.: KP hierarchy for Hodge integrals. Adv. Math. 221 (2009), 1–21.

Kanehisa Takasaki, e-mail: takasaki@math.h.kyoto-u.ac.jp, Graduate School of Human and Environmental Studies, Kyoto University, Yoshida, Sakyo, Kyoto, 606-8501, Japan

Using the spindle cooling temperature as a tool for compensating the thermal deformation of machines

J. Vyroubal

Thermal error compensation of machine tools is a relatively complex problem nowadays. Machine users have very high expectations, and it is necessary to use all means to improve the cutting accuracy of existing machines. This paper deals with a novel approach, which combines standard temperature measurement of a machine tool and new temperature measurement of the spindle cooling liquid. A multinomial regression equation is then used to calculate the compensation correction of the position of the tool. This calculation does not critically overload the control system of the machines, so no external computing hardware is required. The cooling liquid approach improves the accuracy of the machine tool over an operational time of several hours.

Keywords: machine tool, thermal compensation.

1 Introduction

From the point of view of temperature measurements, machine tools can be divided into two groups. The first group consists of so-called intelligent (advanced) machines, in which all required sensors are implanted directly during machine production. A machine spindle with temperature sensors built into the bearings, motor, etc., is a typical representative of this group. Such a machine can solve deformation problems using mechatronic methods. Unfortunately, these machines are not very common, and they are also relatively expensive to produce. The second group comprises conventional machines that have only a limited number of built-in sensors (about five). These sensors are specially added to the surface of the machine frame, and the spindle is generally not sufficiently monitored. Machines of this type are the most widespread nowadays [1, 2, 3].

Conventional machine tools can be supplemented by additional sensors. However, the placement of the sensors can be extremely problematic, see [4]. The problems can be summed up as follows:

• The sensor cannot be placed directly inside the heat source.
• The sensor cannot be placed in the correct position because of the equipment and structure of the machine.
• The sensor is too big for the chosen measuring place.
• The sensor cannot be sunk into the material.
• The contact surface between the sensor and the frame is not perfect.

An example of sensor mounting is shown in Fig. 1.

These problems are most visible in spindle thermal behavior analysis. As mentioned, it is not possible to disassemble the spindle. In addition, the spindle is not designed for installing additional elements, due to its very sophisticated inner structure. Nowadays the most commonly applied option is to install the required sensors on the spindle head, as close as possible to the spindle. Another option is to place the sensors on the spindle tube.

In terms of heat generation, the spindle is the major source of heat, and its thermal deformation is a major cause of overall machine deformation. This effect is multiplied when an electro-spindle with an integrated electro-motor is used. The high deformation of this type of spindle is caused by its typical mechanical structure, with a group of front and rear bearings and the electro-motor winding placed between the two bearing groups. These three parts of the spindle form a considerable source of heat, but they are covered by other spindle mass (lubricating and cooling circuits, spindle tube, etc.). If the sensors are placed on the outer surface of the spindle tube, there is a relatively large transit time delay of the transferred heat on the way out from the heat source to the temperature sensors. This delay can cancel out the thermal compensation of the machine deformation: the sensors fail to react when the spindle (and also the machine frame) is already deformed by heat [5], and the cutting accuracy is lower than expected. The effort to eliminate this negative effect is a basic challenge for engineers working on machine tools. An unexplored approach to the problem is to use the spindle cooling liquid as a carrier of information about the thermal state inside the spindle.

Fig. 1: Example of a sensor mounted on a spindle.

2 Making use of the spindle cooling liquid

Our task is to find a method for obtaining information about the inner thermal behavior of the spindle by measuring the temperature outside the spindle. The only option is to use the spindle cooling liquid. This liquid flows close around the groups of bearings and around the electro-motor (Fig. 2), and it removes the generated heat by circulating through the spindle. If a temperature sensor is placed in the liquid tube at the exit of the spindle, it can determine the conditions inside the spindle.
An advantage of this measurement is the speed with which the liquid carries temperature information from the bearings to the sensor. This delivery time is shorter than the time taken by heat passing through the mass of material, from the bearings to the outer surface of the spindle, where sensors are standardly placed. Our experiments prove that the reaction of the sensor to a temperature change is much faster (in the case of the outgoing spindle cooling liquid) than for other sensors installed on the machine frame. The timing results are given below.

3 Machine tool thermal behavior monitoring

Experiments aimed at verifying our hypothesis were performed on the MCFV 5050LN 3-axis machining center, equipped with an electro-spindle and a linear motor in all three axes. This machine has a type C frame, the most common type of frame. The goal of our project was to eliminate the thermal deformation of the vertical z-axis, caused by the spindle. The overall machine deformation was monitored in the place of the tool, in the direction of the z-axis.

The machine was repeatedly thermally loaded by the rotation of the spindle. The analysis began from the cool state, after the machine had been switched off 48 hours before the experiment. In this way, the machine was brought to room temperature. Then the machine was started and referenced, and the spindle was set in motion at a constant rotation speed of 7500 rpm (50 % of n_max). Fig. 3 shows that the test ran for about 10 hours. This warmed the structure sufficiently to show the direction and the amount of the heat flux flowing from the spindle to the machine frame. The deformation under this type of thermal load in the z direction is shown in Fig. 4. The initial very fast phase is caused by the spindle itself. The middle phase, between "50 min" and "150 min", is a mixture of influences from the spindle and from the column deformation. The last phase, from "150 min" on, is created by the column only.

Fig. 2: Cooling circuits of the tested spindle.
Fig. 3: Thermal behavior of MCFV 5050LN.

Another problem in implementing a compensation mechanism for the machine is its cost. It is necessary to offer solutions that are inexpensive but still function well. A typical solution is a multiregression analysis: even without additional hardware, the machine control system is not overloaded.

4 Multiregression compensation

Multiregression compensation is based on the principle of a result calculated from several inputs. This can be written as the equation

$\delta = \sum_{n=1}^{m} a_n\, t_n,$    (1)

where $\delta$ is the calculated deformation, $t_n$ are the measured temperatures, $a_n$ are the calculated coefficients, and $m$ is the number of used sensors.

For a better overview of the complete thermal behavior of the machine, many sensors are installed on the frame and spindle. The sensors are selected by a comparative analysis, using two parameters: the first parameter is a match between increased deformation and increased temperature at a specific place; the second parameter is the reaction speed to a defined temperature change of the measured place. The reaction limit was set to 0.5 deg for this experiment. Four sensors were chosen. The sensors covered the spindle (2 sensors), the vertical machine column and the influence of the z-axis linear motor. The sensors are shown in Table 1, and their placement is shown in Fig. 5.
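The coefficients $a_n$ of eq. (1) are typically identified from a recorded thermal-load test by ordinary least squares. The sketch below is illustrative only (the function names and the intercept handling are our choices, not the plant code); an intercept column is included, matching the constant term that appears in the fitted z-axis equation quoted later.

```python
import numpy as np

def fit_compensation(T_train, delta_train):
    """Fit eq. (1) by least squares.
    T_train: (N, m) matrix of temperatures, one column per sensor;
    delta_train: (N,) measured z-axis deformation from the thermal-load test."""
    X = np.hstack([np.ones((T_train.shape[0], 1)), T_train])  # add intercept
    coef, *_ = np.linalg.lstsq(X, delta_train, rcond=None)
    return coef                                   # [a_0, a_1, ..., a_m]

def correction(coef, t):
    """Compensation value delta for one temperature sample t = (t_1,...,t_m)."""
    return coef[0] + coef[1:] @ np.asarray(t)
```

In operation, the control system would evaluate correction(...) in a fixed time cycle, exactly as the correction signal described below.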
The temperature data serve as the input for the compensation (eq. (1)) through the machine control system. The result of eq. (1), calculated in a given time cycle, presents a correction signal for the control system of the machine. Instead of the time-deformation characteristic, a special temperature-deformation characteristic is used to form the multinomial compensation: the heating process can vary in time, but from the physical point of view the temperature change is dominant for the deformation magnitude.

Fig. 4: z-axis deformation in the place of the tool.

Table 1: Sensors and their reaction time

sensor   reaction time
c10      32 min
c9       25 min
a67       3 min
c11       8 min
0086      6 min

Fig. 5: Placement of the chosen sensors.
Fig. 6: First thermal measurement (thermal behavior and residuum after compensation).

The resulting deformation equation for the z-axis with four inputs has the following form:

$\Delta z = 236.81 - 0.27\, t_1 - 11.93\, t_2 - 23.98\, t_3 - 26.85\, t_4.$

This equation was determined by calculating the deformation and temperature tracks during the first machine thermal behavior analysis, Fig. 6. The calculation was checked by another machine measurement, with different initial conditions. The machine was in a different initial thermal state, due to a different room temperature. In addition, the frame itself was in a semi-warm state, due to incomplete cooling down from the previous working day. The cooling process had taken place only overnight, which is not long enough for this type of machine. The thermal load of the spindle was the same as in the first analysis.

5 Compensation results

The residual deformation after compensation is shown in Fig. 7. Clearly, the applied compensation has a positive influence. A good improvement can be seen in the middle transient phase, where the influences of the spindle deformation and of the column deformation are opposite. It is always difficult to describe this phase, because the superposition of the two deformations plays a significant role. Also during the first phase, when the big spindle deformation takes place, we can see the good quality of the compensation mechanism: there is a very fast increase in the spindle deformation, and the multinomial compensation with the spindle cooling liquid measurement eliminates this effect in a shorter time than without the compensation.

The principle of the multinomial regression calculation, together with the small number of sensors that have to be installed, limits the reaction speed when there are unexpected changes in machine behavior. This effect can be seen in Fig. 6 around time "470 min": a sudden spindle cooling failure causes a deformation variation. The compensation mechanism reacts, but not sufficiently. This is due to the sensors that are included in the compensation calculation. To improve this compensation type, a special multinomial approach is necessary for the spindle.

6 Conclusion

Using the spindle cooling liquid improves the multinomial regression compensation mechanism. The resulting residual deformation of the MCFV 5050LN machining center, in the z-axis, is better than when a standard regression calculation is made using only a machine frame temperature measurement.
The deformation can be eliminated faster in the critical first phase. Because the calculation is compounded from four sensors (the machine frame, the spindle and the cooling liquid), the reaction of the compensation is not fast enough for special unexpected events such as a cooling failure.

Fig. 7: Residual deformation after compensation (comparison with and without the cooling liquid).

Acknowledgment

This research has been supported by the Ministry of Education of the Czech Republic, project 1M6840770003.

References

[1] Vyroubal, J.: Elimination of thermal deformation due to CNC machine tool spindles. In: 4th International Conference and Exhibition on Design and Production of Machines and Dies/Molds, Middle East Technical University, 2007, p. 251–257.
[2] Ko, T. J. et al.: Particular behavior of spindle thermal deformation by thermal bending. IJMT, Vol. 43 (2003), p. 17–23.
[3] Ramesh, R. et al.: Thermal error measurement and modeling in machine tools. Part I. Influence of varying operating conditions. IJMT, Vol. 43 (2003), p. 391–404.
[4] Lo, Ch. H.: Optimal temperature variable selection by grouping approach for thermal error modeling and compensation. IJMT, Vol. 39 (1999), p. 1383–1396.
[5] Hornych, J., Vyroubal, J.: Inteligentní chladicí systém – část I: tepelné chování izolované pinoly [Intelligent cooling system, part I: thermal behavior of an insulated quill]. Report V-07-055. Research Centre of Manufacturing Technology, 2007.

Ing. Jiří Vyroubal, phone: +420 221 990 930, fax: +420 221 990 999, e-mail: j.vyroubal@rcmt.cvut.cz, Czech Technical University in Prague, Research Centre of Manufacturing Technology, Horská 3, 128 00 Prague 2, Czech Republic

Color portion of solar radiation in the partial annular solar eclipse, October 3rd, 2005, at Helwan, Egypt

A. H. Hassan, U. A. Rahoma, M. Sabry, A. M. Fathy

Abstract: Measurements were made of various solar radiation components, global, direct and diffuse and their fractions, during the partial annular solar eclipse on October 3rd, 2005 at Helwan, Egypt (lat. 29.866° N, long. 31.20° E), and an analysis has been made. The duration of the solar eclipse was 3 h 17 min, and the maximum magnitude of the eclipse in this region was 0.65. The optical depth of the direct component and the relative humidity decreased, while both the transparency and the air temperature increased towards the maximum eclipse. The general trends of the global components are a decreasing optical depth and an increasing transparency between the first contact and the last contact. The prevailing color during the eclipse duration was diffuse infrared (77 % of the total diffuse radiation level).

Keywords: diffuse infrared solar radiation, partial solar eclipse, meteorological data, optical depth, transparency, Linke turbidity, Angstrom turbidity.

1 Introduction

The greatest eclipse is defined as the instant when the axis of the moon's shadow passes closest to the earth's center. For total eclipses, the instant of greatest eclipse is virtually identical to the instants of greatest magnitude and greatest duration. However, for annular eclipses, the instant of greatest duration may occur either at the time of greatest eclipse or near the sunrise and sunset points of the eclipse path.
On Monday, October 3rd, 2005, a partial annular solar eclipse of the sun was visible from within a narrow corridor which traversed half the earth. The instant of greatest eclipse occurred at 10:32 UT, when the axis of the moon's shadow passed closest to the center of the earth. The maximum duration of annularity was 4 min 32 s; the sun's altitude was 71°, and the path width was 162 km. The maximum magnitude was 0.95, at lat. 12°8′ N and long. … E [1].

Extinction may be caused by molecular absorption that is wavelength-dependent and follows the Rayleigh formula. The molecular absorption appears in the form of absorption bands projected onto the continuous background spectrum. In a clear sky, at the lower atmospheric layers in the wavelength range 500–650 nm, extinction has three known components: molecular absorption following Rayleigh scattering; absorption at discrete wavelengths by water vapor; and weakening by the Chappuis band of ozone [2].

The absorbing components of the atmosphere are O2, O3, H2O, CO2, N2, O, N, NO, N2O, CO, CH4 and their isotopic modifications, though the contributions of the latter are small. The spectra due to electronic transitions of molecular and atomic O, N and O3 lie chiefly in the ultraviolet region, while those due to the vibration and rotation of polyatomic molecules such as H2O, CO2 and O3 lie in the infrared region. There is very little absorption in the visible region. As the absorption coefficients associated with electronic transitions are generally very large, much of the UV is absorbed in the upper layers of the atmosphere. Some of the oxygen and nitrogen molecules are dissociated into atomic oxygen and nitrogen owing to the absorption of solar radiation, while other molecules are ionized. Dissociated atomic oxygen and nitrogen are also able to absorb solar radiation of still shorter wavelengths, and some of these atoms become ionized as a result. The ionized layers in the upper atmosphere are formed mainly because of these processes. Owing to the very strong absorption by O2, N2, O, N and O3 in the spectral region up to about 300 nm, the solar radiation in this region does not reach the earth's surface. In the visible region, there is some absorption due to the weak Chappuis bands of ozone and due to the red bands of molecular oxygen which occur at about 690 and 760 nm. IR absorption by water vapor occurs at about 700, 800, 900, 1100, 1400, 1900, 2700, 3200 and 6300 nm, and by CO2 at about 1600, 2000, 2700 and 4300 nm. These bands play a part in the absorption in the lower atmosphere, below 50 km, where the water vapor and the CO2 are largely concentrated. No photochemical action is associated with absorption in this region, and the absorbed energy is used entirely to heat the lower atmosphere [3].

Many studies of the variation of solar radiation and transparency have been carried out in recent years, dealing with solar eclipse totality and partiality in different countries. A study of the atmospheric responses due to the 11 August 1999 total solar eclipse in Romania, conducted by Copaciu and Yousef (1999), indicated that both the global radiation and the UVB radiation dropped dramatically to a minimum around totality. There was an opportunity to study the attenuation of such radiation due to clouds. The net radiation became negative for about 17 minutes at Căldărușani. The temperature dropped to about 30 °C soon after totality at both Afumați and Călărași, although at the beginning of the eclipse it was about 46.5 °C at Afumați and 34 °C at Călărași.
At Căldărușani, the surface temperature dropped from 34.1 °C to 29 °C. It seems likely that the air temperature inside the umbra was between 29–30 °C. The response time of the minimum surface temperature was about 18 minutes, which is comparable to the duration of the negative part of the net radiation, when the backward radiation became higher than the incident radiation [4].

Simultaneous measurements of radiation, photolysis frequencies, O3, CO, OH, PAN and NOx species were carried out in the boundary layer, along with the relevant meteorological parameters, under total solar eclipse conditions [5]. This experiment, performed at about 34° solar zenith angle and under noontime conditions, thus provided a case study of the interactions between radiation and photochemistry under fast "day-night" and "night-day" transitions at high solar elevation. The results revealed a close correlation between the photolysis frequencies and the UV radiation flux. Due to the decreasing fraction of direct radiation at shorter wavelengths, all three parameters showed much weaker cloud shading effects than the global solar radiation. The NO and OH concentrations decreased to essentially zero during totality. Subsequently, the NO and OH concentrations increased almost symmetrically with their decrease preceding totality. The NO/NO2 ratio was proportional to NO2 over ±30 min before and after totality, indicating that the partitioning of the NOx species was determined by jNO2. Simple box model simulations show the effect of the reduced solar radiation on the photochemical production of O3 and PAN.

A study was made of the depression of the different solar radiation components during the solar eclipse of August 11th, 1999, over Egypt (a partial solar eclipse, with 70 % covering of the solar disk in Helwan, Egypt) [6]. The maximum depression in the different components of solar radiation was 54 % in red solar radiation (for global and direct), while the minimum depression was in infrared solar radiation (34 % for global and 41 % for direct). The clearness index and the diffuse fraction were 0.634 and 0.232, respectively. The atmospheric red radiation was 7.4 % and the atmospheric infrared radiation was 10.7 %. The percentage of ultraviolet was 3 %.

A study of the spectral composition of global solar radiation using interference metallic filters was carried out during the same eclipse of August 11th, 1999 in Helwan, Egypt [7]. The conclusions indicated an increase in the whole interval from 350–450 nm, but without risk to the human eye. This interval lies at the end of the ultraviolet solar radiation, while the minimum variation lies between 500–700 nm. This interval represents the normal maximum peak of the solar spectrum, and extends up to the 700–900 nm band. The change in the meteorological parameters is related to the variability of the solar spectrum shift from the short-wave band to the long-wave band. The maximum drop in the solar spectrum lies in the interval that contains the normal peak of the solar spectrum, from 500–600 nm.

An investigation was made of the effects of pollutants on the color portion, where the increase in pollutants reduces the violet-blue band by 11 %, the green-yellow band by 14 %, the red band by 13 % and the infrared band by 5 % of the average annual values [8]. Using ground-based spectral radiometric measurements taken over the Athens atmosphere in May 1995, an investigation was carried out of the influence of gaseous pollutants and aerosols on the spectral radiant energy distribution.
It was found that the spectral measurements exhibited variations based on various polluted urban atmospheric conditions, as determined by an analysis of the gaseous pollutant records. The relative attenuation caused by gaseous pollutants and aerosols can exceed 27 %, 17 % and 16 % in the global ultraviolet, visible and near infrared portions of the solar spectrum, respectively, as compared to "background" values. In contrast, an enhancement of the near infrared diffuse component by 66 % was observed, while in the visible and ultraviolet bands the relative increases were 54 % and 21 %, respectively [9].

The aim of the present work is to study and determine the percentage of the color portion variations for the different solar radiation components during the solar eclipse of October 3rd, 2005 at Helwan, Egypt.

Instruments and measurements

The equipment was installed on the roof of the National Research Institute of Astronomy and Geophysics (NRIAG) building in Helwan (29°54′ N, 30°20′ E, 126 m elevation above sea level). The background is taken as desert and pollution. Measurements were conducted from sunrise to sunset. The time of the measurements was taken as the local mean time of Cairo (GMT+2 hours). The instruments used in this work were:

• A pyrheliometer for measuring the direct solar radiation in three different bands, direct yellow (Y) (530–2800 nm), direct red (R) (630–2800 nm) and direct infrared (IR) (695–2800 nm), as well as the total direct band (I) (280–2800 nm).
• Four pyranometers for measuring the different components of the global solar radiation (G, 280–2800 nm), global ultraviolet (GUV, 285–385 nm) and global infrared (GIR, 695–2800 nm), and a black-and-white type sensor to measure the diffuse (D, 280–2800 nm) solar radiation.
• A meteorological station to measure the different meteorological parameters.

Theoretical background

To calculate the extraterrestrial solar radiation at any time during the partial solar eclipse, we use

$E_i^{\text{eclipse}} = E_i\,(1 - M),$    (1)

where $E_i$ stands for any radiation quantity, e.g. $G_o$, $GIR_o$, $GUV_o$, $B_1$, $B_2$, $B_3$ and $B_4$, and $M$ is the eclipse magnitude. The optical depth ($\alpha$) and the transparency ($\tau$) are calculated from the formula

$I_{b\lambda} = I_{o\lambda} \exp(-\alpha m),$    (2)

where $m$ is the air mass, so that

$\alpha = -\ln(I_{b\lambda}/I_{o\lambda})\,(1/m),$    (3)

$\tau = 1/\exp(\alpha \cdot m).$    (4)

The calculations of the extraterrestrial solar radiation for the various component bands are based on [10]. The diffuse infrared (DIR) was calculated from the equation

$DIR = GIR - IR\,\cos(\theta).$    (5)

The Linke turbidity factor, $L_T$, is given by

$L_T = \frac{1}{\delta_R\, m_a} \ln\left( \frac{I_0}{I} \right),$    (6)

where $\delta_R$ is given by

$\delta_R = \frac{1}{9.4 + 0.9\, m_a}$    (7)

and $m_a$ is given by

$m_a = \frac{p}{1013.25} \cdot \frac{1}{\cos\theta + 0.15\,(93.885 - \theta)^{-1.253}}.$    (8)

The total amount of water vapor in the atmosphere in the vertical direction is highly variable and depends on the instantaneous local conditions. This amount, expressed as the precipitable water $w$ (cm), can be readily computed from a number of standard routine atmospheric observations. The precipitable water vapor can vary from 0.0 to 5 cm [11, 12]:

$w = \frac{0.493\,\phi_r \exp\left[26.23 - (5416/T_k)\right]}{T_k}.$    (9)

The Angstrom turbidity coefficient is a dimensionless index that represents the amount of aerosol, and the relation between the Linke turbidity ($L_T$) and the Angstrom turbidity ($\beta$) for Helwan is [13]

$\beta = -0.194933 + 0.0620059\, L_T.$    (10)
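For illustration, eqs. (2)–(10) translate directly into code. The sketch below is ours; the sample input values are hypothetical (only the pressure is taken from Table 2), and the variable names are our own choices.

```python
import numpy as np

def air_mass(theta_deg, p_hpa=1013.25):                          # eq. (8)
    t = np.radians(theta_deg)
    return (p_hpa / 1013.25) / (np.cos(t) + 0.15 * (93.885 - theta_deg) ** -1.253)

def optical_depth(I_b, I_o, m):                                  # eq. (3)
    return -np.log(I_b / I_o) / m

def transparency(alpha, m):                                      # eq. (4)
    return np.exp(-alpha * m)

def diffuse_infrared(GIR, IR, theta_deg):                        # eq. (5)
    return GIR - IR * np.cos(np.radians(theta_deg))

def linke_turbidity(I, I_o, m_a):                                # eqs. (6), (7)
    delta_r = 1.0 / (9.4 + 0.9 * m_a)
    return np.log(I_o / I) / (delta_r * m_a)

def precipitable_water(phi_r, T_k):                              # eq. (9)
    return 0.493 * phi_r * np.exp(26.23 - 5416.0 / T_k) / T_k

def angstrom_beta(LT):                                           # eq. (10)
    return -0.194933 + 0.0620059 * LT

# Hypothetical sample: zenith angle 35 deg, pressure from Table 2,
# direct-band irradiances in W/m^2, r.h. as a fraction, temperature in K.
m = air_mass(35.0, p_hpa=999.62)
alpha = optical_depth(I_b=650.0, I_o=1050.0, m=m)
LT = linke_turbidity(650.0, 1050.0, m)
print(alpha, transparency(alpha, m), LT, angstrom_beta(LT),
      precipitable_water(0.546, 29.1 + 273.15))
```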
Results and discussion

Table 1 characterizes the phase and the magnitude (the fraction of the sun's diameter covered at mid eclipse) at the different stages of the partial solar eclipse under study at Helwan, Egypt. The duration of the solar eclipse was 3 h 05 min, with a maximum magnitude in this region of 0.65 [1].

Table 1: Characterization of the partial solar eclipse, 3-10-2005, at Helwan

eclipse phase   magnitude   time (hh:mm:ss)   a (°)   az (°)
s.p.e           0           10:27:14.4        51      149
m.e             0.65        11:59:11.2        56      187
e.p.e           0           13:31:41.3        47      222

Table 2 characterizes the environmental and meteorological conditions of the eclipse day at Helwan: sunrise (s.r), upper transit of the sun (t.s), sunset (s.s), dry-bulb temperature (td), wet-bulb temperature (tw), air pressure (p), cloud cover (cl.), visibility (vis.), relative humidity (r.h) and dew point (d.p).

Table 2: Characterization and environmental conditions of the eclipse day

s.r (hh:mm)  t.s (hh:mm)  s.s (hh:mm)  td (°C)  tw (°C)  p (hPa)  cl. (okta)  vis. (km)  r.h (%)  d.p (°C)
05:48        11:43        17:38        29.1     21.88    999.62   0           5.0        54.6     18.62

Fig. 1 shows the temporal measurements of the hourly variation of the various global solar radiation components, global horizontal (G), global infrared horizontal (GIR) and total diffuse horizontal (D), as well as the extraterrestrial global solar radiation, Go, in W/m². This figure clearly shows the depression of the irradiance of all components due to the eclipse; the depression of the diffuse radiation is lower, because a percentage of the direct radiation changes to diffuse through the layers of the atmosphere.

Fig. 1: Hourly variation of G, GIR, D and Go.

Fig. 2 shows the hourly variation measurements of the global ultraviolet solar radiation (GUV) along with the extraterrestrial global ultraviolet solar radiation (GUVo).

Fig. 2: Hourly variation of GUV and GUVo.

Fig. 3 shows the hourly variation measurements of the various direct solar radiation components: total direct (I), direct yellow (Y), direct red (R) and direct infrared (IR), in W/m². The depression gradient from the total band to the infrared band, passing through the yellow and red bands, before and after the eclipse is clearly shown in the graph, but the difference between the bands is narrow during the maximum eclipse.

Fig. 3: Hourly variation of I, Y, R and IR.

Fig. 4 shows the hourly variation of the various clearness indices, e.g. kt, kuv and the diffuse fraction kd. The clearness indices have higher values during the eclipse, while the kd value shows almost no variation, because the percentage decrease of the diffuse is equal to the percentage decrease of the global value.

Fig. 4: Hourly variation of kt, kd and kuv.

Fig. 5 shows information on the meteorological conditions before, during and after the partial solar eclipse at the Helwan site, i.e. the dry-bulb temperature (td) and the relative humidity (rh %). The decrease in the ambient temperature and the increase in the relative humidity between the first contact and the maximum eclipse are recorded as 1.8 °C and 2 %, respectively.

Fig. 5: Hourly variation of the ambient temperature and the relative humidity.

Fig. 6 shows the hourly variation of the color portion of the direct bands B1, B2, B3 and B4 over the whole day, from sunrise to sunset. The highest short wavelength values (B1 and B2) over the day occur at low air mass (around true noon), while the top long wavelength value (B4) occurs at higher air mass, near sunrise and sunset.

Fig. 6: Hourly variation of B1, B2, B3 and B4.
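The band splitting behind Fig. 6 follows from the cumulative pyrheliometer filters (see the nomenclature at the end of the paper): $B_1 = I - Y$ (280–530 nm), $B_2 = Y - R$ (530–630 nm), $B_3 = R - IR$ (630–695 nm) and $B_4 = IR$ (695–2800 nm). A small sketch (ours; the irradiance values are hypothetical, not measured data):

```python
import numpy as np

def color_portions(I, Y, R, IR):
    """Color bands from the cumulative direct channels and their percentages
    of the total direct beam: B1 = I - Y, B2 = Y - R, B3 = R - IR, B4 = IR."""
    bands = np.array([I - Y, Y - R, R - IR, IR])
    return 100.0 * bands / I

print(color_portions(I=820.0, Y=640.0, R=560.0, IR=505.0))
# -> approx. [22.0, 9.8, 6.7, 61.6] percent for B1..B4, i.e. B4 > B1 > B2 > B3,
#    the ordering of the prevailing colors reported in the conclusions
```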
Fig. 7 shows the hourly variation of the horizontal global infrared GIR and the direct infrared IR over the whole day. The IR component is predominant outside the eclipse period, but this changes during the eclipse. The difference between GIR and IR is the diffuse infrared (DIR) that appears in equation (5). It is clear from this figure that the direct infrared is dominant before and after the eclipse, but during the eclipse the global infrared is dominant.

Fig. 7: Hourly variation of IR and GIR.

Fig. 8 shows the hourly variation of the irradiance of the total diffuse D and the diffuse infrared DIR; the total diffuse radiation values were higher than the diffuse infrared radiation values before, after and during the eclipse.

Fig. 8: Hourly variation of D and DIR.

Fig. 9 shows the ratio (rd) of the diffuse infrared (DIR) to the total diffuse: the maximum ratio rd coincides with the maximum eclipse, and the duration of the elevated rd coincides with the duration of the eclipse.

Fig. 9: Hourly variation of DIR/D along with the eclipse magnitude.

Fig. 10 compares the total diffuse fraction kd with the infrared diffuse fraction kdir. The top value of kdir occurred very near sunrise and sunset (high air mass) and near the maximum of the eclipse, while the top kd value occurred very near sunrise and sunset only, and the minimum value occurred in the middle of the day (low air mass).

Fig. 10: Hourly variation of kd and kdir.

Fig. 11 shows the percentages of the color portion (c-p) through the day, from 08:00 to 16:00, passing through the different intervals of the eclipse, for the different bands. This figure shows that the prevailing order in the clear case is IB4 > IB1 > IB2 > IB3. These results agree with Figs. 7 and 9, where the prevalent color during the eclipse is diffuse infrared. The percentage of the color portion in the true eclipse interval shows the same trend. However, the c-p in the direct infrared IB4 during the eclipse is low, while the high values are very near to sunrise and sunset, where the high air masses are the major cause of the absorption and scattering of the infrared wavelengths. Generally, the results for the color portion agree with previous work at this location [8].

Fig. 11: Variation of the color portion of the different bands over the day of the eclipse.

Table 3 presents the hourly variation of the optical depth (α) and the transparency (τ) from 08:00 to 16:00, through the duration of the eclipse, for the global (G), global infrared (GIR) and global ultraviolet (GUV) components and the total direct (I), as well as for the direct color bands IB1, IB2, IB3 and IB4. The table demonstrates that the top optical depth (α) and the lowest transparency (τ) belong to GUV; the low transparency of GUV has two causes. Firstly, the intensity of the scattered light is proportional to $1/\lambda^4$ and to the square of the volume of the particle [10, 11]; this means that the intensity of the scattering of UV light is about eleven times the scattering of red light. This region is also characterized by the high pollutant level and the large size of the pollutants (mainly Ca and Fe) [8, 14]. Secondly, the ozonosphere absorbs a large amount of this band. Generally, the transparency increases gradually from the short wavelengths to the long wavelengths. The general trend of the global components G and GIR is a low optical depth (α) and a high transparency (τ). The atmospheric attenuation coefficient decreases with increasing wavelength, showing a general reddening.
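As an order-of-magnitude check of the $1/\lambda^4$ statement above (our numbers; representative wavelengths of roughly 350 nm for the mid GUV band and 650 nm for the red band are assumptions):

```latex
\frac{I_{\mathrm{scat}}(\lambda_{\mathrm{UV}})}{I_{\mathrm{scat}}(\lambda_{\mathrm{red}})}
  \;=\; \left(\frac{\lambda_{\mathrm{red}}}{\lambda_{\mathrm{UV}}}\right)^{4}
  \;=\; \left(\frac{650\,\mathrm{nm}}{350\,\mathrm{nm}}\right)^{4}
  \;\approx\; 11.9,
```

consistent with the factor of about eleven quoted in the text.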
At the maximum eclipse, the optical depth (α) is lower and the transparency (τ) is higher in all bands. This is because the air temperature decreased by 1.8 °C and the r.h increased by 2 %, as shown in Fig. 5, due to the lower thickness of the atmospheric layers and, accordingly, a lower optical thickness and a higher transparency; the relation between transparency and temperature is inverted [7].

Table 4 presents the different values of the Linke turbidity (LT), the Angstrom turbidity (β), the precipitable water (w, cm), the total diffuse fraction kd and the infrared diffuse fraction kdir during the day, passing through the eclipse period. Both the Linke and the Angstrom turbidity values are higher in the afternoon than in the morning. This is because of the high temperature in the afternoon, which expands and excites the gases and the dust in the atmosphere. The distributions of the precipitable water (w), the diffuse fraction kd and the diffuse infrared fraction kdir give a picture of the atmospheric character through the eclipse. The high values of LT, β and w indicate the high turbidity of the day of observation. The prevailing color throughout the duration of the eclipse was diffuse infrared (77 % of the total diffuse).

Table 3: Optical depth (α) and transparency (τ) values during the day, including the phases of the eclipse (f.c = first contact at 10:31, m.e = maximum eclipse at 11:59, l.c = last contact at 13:31), for G, GIR, GUV, I, B1, B2, B3 and B4

time     8:00   9:00   10:00  10:31  11:00  11:59  13:00  13:31  14:00  15:00  16:00
                              (f.c)         (m.e)         (l.c)
G   τ    0.696  0.829  0.818  0.784  0.796  0.945  0.796  0.732  0.697  0.653  0.492
    α    0.125  0.100  0.136  0.190  0.202  0.047  0.182  0.237  0.257  0.245  0.282
GIR τ    0.657  0.783  0.759  0.727  0.781  0.798  0.789  0.707  0.672  0.565  0.459
    α    0.148  0.132  0.191  0.254  0.201  0.190  0.192  0.269  0.288  0.334  0.315
GUV τ    0.164  0.207  0.233  0.216  0.196  0.243  0.215  0.190  0.173  0.154  0.104
    α    0.622  0.838  0.991  1.200  1.300  1.172  1.226  1.265  1.249  1.078  0.901
I   τ    0.432  0.535  0.615  0.574  0.494  0.654  0.558  0.508  0.466  0.388  0.198
    α    0.289  0.333  0.331  0.434  0.563  0.352  0.465  0.516  0.544  0.674  0.645
B1  τ    0.360  0.473  0.594  0.509  0.449  0.613  0.521  0.474  0.433  0.329  0.112
    α    0.352  0.398  0.354  0.528  0.639  0.405  0.520  0.568  0.596  0.641  0.870
B2  τ    0.422  0.595  0.692  0.696  0.539  0.712  0.676  0.575  0.498  0.436  0.173
    α    0.297  0.276  0.250  0.285  0.493  0.281  0.312  0.421  0.476  0.478  0.698
B3  τ    0.563  0.689  0.764  0.615  0.664  0.585  0.476  0.673  0.601  0.531  0.263
    α    0.198  0.198  0.183  0.383  0.327  0.444  0.592  0.301  0.362  0.365  0.532
B4  τ    0.451  0.530  0.586  0.569  0.486  0.632  0.530  0.485  0.456  0.387  0.237
    α    0.274  0.338  0.363  0.444  0.576  0.380  0.506  0.551  0.559  0.547  0.573

Table 4: Values of the Linke turbidity (LT), Angstrom turbidity (β), precipitable water (w), total diffuse fraction kd and infrared diffuse fraction kdir during the day, passing through the eclipse period

time     8:00   9:00   10:00  10:31  11:00  11:59  13:00  13:31  14:00  15:00  16:00
                              (f.c)         (m.e)         (l.c)
LT       3.541  3.754  3.608  4.657  3.608  3.749  3.749  5.578  5.888  6.071  7.630
β        0.024  0.038  0.028  0.094  0.028  0.037  0.037  0.151  0.170  0.181  0.278
w        3.296  3.128  2.922  3.402  3.440  3.365  3.470  3.575  3.918  3.756  3.710
kd       0.194  0.193  0.188  0.172  0.192  0.167  0.175  0.184  0.199  0.244  0.401
kdir     0.380  0.347  0.228  0.198  0.358  0.295  0.278  0.253  0.250  0.216  0.366

2 Conclusion

The results obtained from this analysis of the spectral composition of the global and direct irradiance show that various atmospheric parameters cause considerable changes to the spectral distribution of the radiant energy reaching the ground. Our conclusions are summarized as follows:

1. The percentage order of the prevalent colors during the day is B4 > B1 > B2 > B3. The predominant colors during the eclipse were infrared, blue, green and yellow, respectively.
The variation of the color portion is clearly evident in B2 and B3, where the percentage was higher in B2 and lower in B3 during the eclipse period. The c-p of B1 and B4 underwent almost no change. The lowest percentage of the color portion was in the red.

2. At the maximum eclipse (m.e), the optical depth (α) is lower and the transparency (τ) is higher. The air temperature decreased by 1.8 °C and a 2 % increase in r.h was recorded; this was due to the lower thickness of the atmospheric layers. The optical thickness was therefore lower and, accordingly, this raised the transparency values. The general trend of the global components G, GIR and GUV is a low optical depth (α) and a high transparency (τ) at the first contact in comparison with the last contact. There was a high optical depth (α) and a low transparency (τ) in GUV, where the ozonosphere and the air pollutants absorbed a large amount of this band. The top percentage of the short wavelengths (IB1 and IB2) over the day occurred at low air mass (around true noon), while the top percentage of the long wavelength (IB4) occurred at higher air mass. The prevalent color throughout the eclipse was diffuse infrared (77 % of the total diffuse).

References

[1] Espenak, F., Anderson, J.: Total solar eclipse of 2006 March 29. NASA/TP, 212762.
[2] Green, Robin M.: Spherical Astronomy. Cambridge University Press, 1993.
[3] Robinson, N.: Solar Radiation. Elsevier, London, 1966.
[4] Copaciu, V., Yousef, S. M.: Some atmospheric responses to the 11 August 1999 total solar eclipse near Bucharest. Rom. Astron. J. 9 (1999), p. 19–23.
[5] Fabian, P., Rappenglück, B., Stohl, A., Werner, H., Winterhalter, M., Schlager, H., Berresheim, H., Stock, P., Kaminski, U., Koepke, P., Reuder, J., Birmili, W.: Boundary layer photochemistry during a total solar eclipse. Meteorologische Zeitschrift 10 (2001), (3), 187–192.
[6] Hassan, A. H., Shaltout, M. A., Rahoma, U. A.: The depression of different solar radiation components during the solar eclipse, 11 August 1999 over Egypt. J. Astron. Soc. of Egypt 12 (2004), (i), 70–81.
[7] Rahoma, U. A., Shaltout, M. A., Hassan, A. H.: Study of spectral global solar radiation during the partial solar eclipse of 11 August, 1999 at Helwan, Egypt. J. Astron. Soc. of Egypt 12 (2004), (i), 31–45.
[8] Shaltout, M. A., Ghoniem, M. M., Hassan, A. H.: Measurements of the air pollution effects on the color portions of solar radiation at Helwan, Egypt. 4th World Ren. Energy Cong., USA, 1996, 2, 1279–1282.
[9] Jacovides, C. P., Steven, M. D., Asimakopoulos, D. N.: Spectral solar irradiance and some optical properties for various polluted atmospheres. Sol. Energy 69(3) (2000), p. 215–227.
[10] Iqbal, M.: An Introduction to Solar Radiation. Academic Press, 1983.
[11] Kasten, F.: A simple parameterization of the pyrheliometric formula for determining the Linke turbidity factor. Meteor. Rdsch. 33 (1980), 124–127.
[12] Louche, A., Peri, G., Iqbal, M.: An analysis of the Linke turbidity factor. Sol. Energy 37(6), 124–127.
[13] Elminir, Hamdy K., Rahoma, U. A., Benda, V.: Comparison between atmospheric turbidity coefficients of desert and temperate climates. Acta Polytechnica 41 (2001), (2).
[14] Own, Hala S., Elminir, Hamdy K., Abdel-Hady, Yasser A., Fathy, A. M.: Adaptation of wavelet features to predict the local noon erythemal ultraviolet irradiance. International Journal of Computational Intelligence Research (IJCIR) 4 (2008), issue 4.
A. H. Hassan, U. A. Rahoma, M. Sabry, A. M. Fathy, phone: +20 227 044 422, e-mail: mohamed.ma.sabry@gmail.com, National Research Institute of Astronomy and Geophysics, Helwan, Egypt

Nomenclature

a: altitude of the sun above the horizon
az: azimuth of the sun
d.p: dew point (°C)
D: measured horizontal diffuse solar radiation
DIR: infrared diffuse solar radiation
f.c: first contact of the eclipse
G: global solar radiation, 280–2800 nm
GIR: horizontal infrared global solar radiation, 695–2800 nm
Go: extraterrestrial global solar radiation, 250–2800 nm
GUV: horizontal UV global solar radiation, 280–385 nm
GUVo: extraterrestrial UV global solar radiation, 250–385 nm
I: direct solar radiation, 280–2800 nm
B1: value of the band I − Y, 280–530 nm
B2: value of the band Y − R, 530–630 nm
B3: value of the band 630–695 nm
B4: value of the band 695–2800 nm
IB1: color portion of B1 as a percentage
IB2: color portion of B2 as a percentage
IB3: color portion of B3 as a percentage
IB4: color portion of B4 as a percentage
Ibλ: measured spectral irradiance at wavelength λ
Ioλ: extraterrestrial spectral irradiance corrected for the actual sun-earth distance
IR: direct infrared solar radiation, 695–2800 nm
kd: diffuse fraction (D/G)
kdir: IR diffuse fraction (diffuse infrared/global infrared = DIR/GIR)
kt: clearness index (G/Go)
kuv: clearness index of UV (GUV/GUVo)
l.c: last contact of the eclipse
LT: Linke turbidity factor
m.e: maximum eclipse (mid eclipse)
m: air mass, m = sec θ
M: magnitude of the solar eclipse, defined as the fraction of the solar diameter that is obscured
ma: relative optical air mass
p: air pressure (hPa)
r.h: relative humidity
R: direct red solar radiation, 630–2800 nm
rd: ratio of the diffuse infrared to the total diffuse, DIR/D
s.p.e: start of the partial eclipse in local time (first contact)
s.r: sunrise
s.s: sunset
S: measured sunshine duration (hours)
So: calculated sunshine duration (hours)
t.s: upper transit of the sun
td: dry-bulb temperature (°C)
tw: wet-bulb temperature (°C)
Tk: ambient temperature in kelvins
Y: direct yellow solar radiation, 530–2800 nm
α: optical depth of a band
β: Angstrom turbidity coefficient
θ: zenith angle
τ: transparency of a band
δR: spectrally integrated optical depth of the clean dry atmosphere
φr: relative humidity as a fraction of one (φr = r.h/100)

A neural network model for predicting NOx at the Mělník 1 coal-powder power plant

Ivo Bukovský¹, Michal Kolovratník²

¹ Czech Technical University in Prague, Faculty of Mechanical Engineering, Department of Instrumentation and Control Engineering, Technická 4, 166 07 Prague, Czech Republic
² Czech Technical University in Prague, Faculty of Mechanical Engineering, Department of Energy Engineering, Technická 4, 166 07 Prague, Czech Republic

Correspondence to: ivo.bukovsky@fs.cvut.cz

Abstract: This paper presents a non-conventional dynamic neural network that was designed for real-time prediction of NOx at the coal-powder power plant Mělník 1, and results on real data are shown and discussed. The paper also presents the signal preprocessing techniques, the input-reconfigurable architecture, and the learning algorithm of the proposed neural network, which was designed to handle the non-stationarity of the burning process as well as individual failures of the measured variables. The advantages of our designed neural network over conventional neural networks are discussed.

Keywords: dynamic neural networks, prediction, NOx emissions, signal processing.
1 introduction

neural networks (nn) are a popular and widely studied real data-driven nonlinear modeling tool for complicated systems where mathematical-physical analysis is unavailable for deriving a model. unlike analytical or linear models, nns are black-box models, or sometimes gray-box models, that require a proper design of their mathematical architecture and an efficient learning algorithm. for the principles of fundamental neural networks we may refer, e.g., to [1], and to earlier reviews [2, 3] for studies of nns in energetic processes. for more recent works, including studies of conventional nns in energetic processes, we may refer to papers [5–8], which deal with computational intelligence tools (neural networks, genetic algorithms) focused on biomass combustion. the study of non-conventional neural architectures for modeling steady-state hot steam turbine data and for modeling a large-scale energetic boiler can be found in [9], where the advantages of a static quadratic neural unit (qnu, [1, 4]) and a special quadratic neural network [9] over conventional multilayer perceptron neural networks (mlp) are demonstrated, with reference to the overfitting and local minima problems, which are typical drawbacks of mlp (even with a single hidden layer). the advantage of qnu is its nonlinear input-output mapping, while this neural model is linear in its parameters [10] (unlike mlp). this allows us to monitor and maintain adaptation stability by a comprehensible evaluation of the eigenvalues of the weight-update system [10], which offers promising opportunities for adaptive monitoring, modeling, and process optimization by adaptive nonlinear control. qnu can be seen as a component of a higher-order neural network (honn), sometimes also referred to as a polynomial neural network (pnn). the origins of these neural networks can be traced back to works [11–14], while the concept of standalone higher-order neural units (honus) as a building component of honn can be found in [1] and in [4]. the fundamental gradient-based learning rules for training dynamic neural networks are known as real-time recurrent learning (rtrl) [15] and back-propagation through time (bptt) [16, 17]. these algorithms can be made comprehensible and practically useful for real-time computations.

in this paper, we present the resulting neural network architecture that has been designed and tested for nox prediction for a pulverized coal firing boiler at the power plant “elektrárna mělník 1 (eme 1)”; the nominal steam load of this boiler is 250 tons per hour. the goal is to design and test a model that does not involve measured o2 or co in its input and that can potentially be used for optimizing the energetic process regarding the nox and co emissions of the pulverized firing boiler at eme 1. the resulting discrete-time dynamic (recurrent) neural network merges the concept of a conventional recurrent (mlp) neural network with qnu [4, 9].

figure 1: the data preprocessing before each reconfiguration and retraining of the neural network

the data pre-processing and retraining strategy is described in the next section, which in turn is followed by the mathematical notation of the neural architecture that led to the results shown in the section on results and discussion.
2 data preprocessing and network training

the nox dynamics of the pulverized boiler is highly nonstationary, due to the varying technical condition of the boiler, the varying quality of the coal powder, and also because of the measurement outages that occur quite often on an hourly basis. it was therefore not possible to obtain a neural network model that would reliably predict the nox emissions from the long-term data. to handle the non-stationary nature of the boiler in eme 1, we arrived at the data preprocessing technique that is sketched in figure 1, where $u(k)$ is a matrix of the recent history (a retraining window) of all measured input variables (excluding measured o2, nox, and co) at a reference time $k$, given as follows:

$$u(k) = \begin{bmatrix} u_1(k-n_{\mathrm{train}}+1) & \dots & u_1(k-1) & u_1(k) \\ u_2(k-n_{\mathrm{train}}+1) & \dots & u_2(k-1) & u_2(k) \\ \vdots & \ddots & \vdots & \vdots \\ u_n(k-n_{\mathrm{train}}+1) & \dots & u_n(k-1) & u_n(k) \end{bmatrix}. \qquad (1)$$

the measured input variables in $u(k)$ are the primary, secondary, and tertiary air valves, and also optionally the steam load or the air flow before the ventilator (in total $n = 18, 19, 20$ variables), excluding o2, nox, and co. the principal component analysis (pca) block is an application of pca to the linearly correlated variables, so the number of input variables in $u_{\mathrm{pca}}(k)$ is $m < n$, which importantly decreases the computational load while maintaining the information in the measured input data (note that figure 1 shows only a simplified sketch; a detailed implementation of pca that benefits from process knowledge of the pulverized boiler at eme 1 may be provided on the basis of an official request [19]). the structure of the resulting input data matrix $u_{\mathrm{pca}}(k)$ that is used as the neural network input (after the preprocessing shown in figure 1) is as follows:

$$u_{\mathrm{pca}}(k) = \begin{bmatrix} u_{\mathrm{pca}1}(k-n_{\mathrm{train}}+1) & \dots & u_{\mathrm{pca}1}(k-1) & u_{\mathrm{pca}1}(k) \\ \vdots & \ddots & \vdots & \vdots \\ u_{\mathrm{pca}m}(k-n_{\mathrm{train}}+1) & \dots & u_{\mathrm{pca}m}(k-1) & u_{\mathrm{pca}m}(k) \end{bmatrix}. \qquad (2)$$

the presented data pre-processing technique removes variables with measurement outages. principal component analysis results in a lower computational load, because of the reduced number of external inputs into the neural network ($m < n$). pca has a filtering effect, and it also contributes to more accurate calculations of the matrix inversion in the bptt training technique by reducing redundant and linearly correlated data.
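as an illustration of this preprocessing step, the following is a minimal numpy/scikit-learn sketch; the function name preprocess_window and the variance threshold are hypothetical choices for this sketch, not taken from the paper. outage-affected variables are dropped and pca reduces the n measured inputs of eq. (1) to the m < n rows of eq. (2).

```python
import numpy as np
from sklearn.decomposition import PCA

def preprocess_window(u, var_keep=0.99):
    """sketch of figure 1: reduce the n x n_train window u(k) of eq. (1)
    to the m x n_train matrix u_pca(k) of eq. (2), with m < n.

    u        : array (n, n_train), one row per measured input variable
    var_keep : fraction of variance retained by the principal components
    """
    valid = ~np.isnan(u).any(axis=1)          # drop variables with outages (nan)
    pca = PCA(n_components=var_keep)          # keep components up to var_keep
    u_pca = pca.fit_transform(u[valid].T).T   # pca wants samples in rows
    return u_pca, pca

# example: 19 measured valves/flows over a 5-hour window at 1-minute sampling
rng = np.random.default_rng(0)
u = rng.standard_normal((19, 298))
u[3, 100:115] = np.nan                        # a simulated outage of one variable
u_pca, model = preprocess_window(u)
print(u_pca.shape)                            # (m, 298) with m < 19
```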
3 neural network for nox prediction

this section describes the mathematical notation of the designed neural network for nox prediction. this neural network is a discrete-time recurrent architecture, i.e., a non-linear difference equation system, composed of a recurrent hidden layer of conventional sigmoid neurons and an output quadratic neural unit with feedbacks also from the output to its input. in particular, the neural network predictive model is given as follows. the window of external inputs for retraining the network at reference time $k$ are the pre-processed measured variables $u_{\mathrm{pca}}(k)$, as given in (2) and in figure 1. the external inputs that enter the neural network for $n_s$ samples ahead prediction at time $k$ are in the last column of $u_{\mathrm{pca}}(k)$, as follows:

$$\mathbf{u}_{\mathrm{pca}}(k) = \begin{bmatrix} u_{\mathrm{pca}1}(k) & u_{\mathrm{pca}2}(k) & \dots & u_{\mathrm{pca}r}(k) \end{bmatrix}^{\mathrm{T}}, \qquad (3)$$

where $k$ is a reference sample time index and $r$ is the dimension of the vector of all measured external inputs reduced by the pca method. the input vector to the hidden layer of the neural network is given in (4) as

$$\mathbf{x}(k) = \begin{bmatrix} y_n(k+n_{ya}) & \dots & y_n(k-n_{yb}) & \mathbf{u}_{\mathrm{pca}}(k+n_{ua})^{\mathrm{T}} & \dots & \mathbf{u}_{\mathrm{pca}}(k-n_{ub})^{\mathrm{T}} & \boldsymbol{\xi}(k) \end{bmatrix}^{\mathrm{T}}, \qquad (4)$$

where $y_n(\cdot)$ are step-delayed neural outputs; $n_{ya}$, $n_{yb}$, $n_{ua}$, and $n_{ub}$ are input configuration parameters; and $\boldsymbol{\xi}(k)$ is the step-delayed feedback of the hidden layer outputs. the output of the hidden sigmoidal layer $\boldsymbol{\xi}(k+1)$ (6) is calculated using the hidden layer weight matrix $\mathbf{w}$ and the classical sigmoid function (5), as follows:

$$\varphi(\nu) = \frac{2}{1+e^{-\nu}} - 1, \qquad (5)$$

where $\boldsymbol{\xi}(k+1)$ is augmented with a unit as

$$\boldsymbol{\xi}(k+1) = \begin{bmatrix} 1 \\ \varphi(\mathbf{w}\cdot\mathbf{x}(k)) \end{bmatrix}, \qquad (6)$$

where the unit allows for biases of the hidden layer (the first column of $\mathbf{w}$) and also of the qnu ($v_{0,0}$ in (7)), so the neural output is calculated by a quadratic neural unit [1, 4, 9, 10], using (3)–(6), as follows:

$$y_n(k+n_s) = \sum_{i=0}\sum_{j=i} v_{i,j}\,\xi_i(k+1)\,\xi_j(k+1). \qquad (7)$$

the proposed dynamic neural network has purposely designed properties that are worth mentioning and explaining. the hidden recurrent layer of neurons, which is calculated in (6) as $\varphi(\mathbf{w}\cdot\mathbf{x}(k))$, reduces cognitively (by training) the number of already pca-preprocessed neural inputs, and thus (6) results in the augmented vector of state variables $\boldsymbol{\xi}(k+1)$ that is fed both forward to the qnu and also back to the network input $\mathbf{x}(k)$, as in (4). without the first hidden layer, the number of input variables entering the qnu directly would still be too large for the given 1-minute sampling period, as we feed an approximately twelve-minute history of each pca-preprocessed variable into the network input, i.e., $n_{ya} - n_{yb} = 12$ and also $n_{ua} - n_{ub} = 12$ (the estimated time constant of this pulverized firing boiler has been specified by experts as approximately 12 minutes). also, the first layer (6) plays a filtering role due to its step-delayed feedback to the network input (4), and its recurrent feedback naturally calls for training by the back-propagation through time method (bptt) [15–17], which is a powerful, efficient and yet practical optimization method, as it can be achieved by a combination of a gradient descent rule and the levenberg-marquardt algorithm [18]. the sigmoid function $\varphi(\cdot)$, which is usually considered the main nonlinearity of conventional neural networks, has another importance for this dynamic network, because the major nonlinearity is provided by the qnu [9, 10, 18]. the sigmoid function $\varphi(\cdot)$ limits the output of the hidden layer to the given range of values $(-1, +1)$, which importantly assures the stability of the hidden layer (as of a discrete-time dynamic system; this could not be so simply assured for continuous-time nns). then, there are always limited values entering the qnu, so its output is also naturally limited; thus, the stability of the state variables and of the output of the proposed neural network is naturally assured by preserving the sigmoid function in the hidden neurons. as regards the stability of the learning algorithm, and thus its convergence, we proposed a novel approach to weight-update stability for gradient descent training of qnu in [10], and this approach is applicable to both static and dynamic qnu, and also to the hidden-layer weight system of this network for nox prediction.
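a compact numerical reading of (4)–(7) is sketched below, with hypothetical names (forward_step) and toy dimensions; it exploits the fact that the double sum in (7) is the quadratic form of the augmented state with the upper-triangular qnu weight matrix, and that the unit prepended in (6) supplies the bias terms.

```python
import numpy as np

def sigmoid(v):
    # eq. (5): phi(nu) = 2 / (1 + exp(-nu)) - 1, output bounded in (-1, +1)
    return 2.0 / (1.0 + np.exp(-v)) - 1.0

def forward_step(w, v_qnu, x):
    """one forward evaluation of eqs. (4)-(7): w is the hidden-layer weight
    matrix, v_qnu the upper-triangular qnu weight matrix, x the input (4)."""
    xi = np.concatenate(([1.0], sigmoid(w @ x)))   # eq. (6), unit term for biases
    y = xi @ np.triu(v_qnu) @ xi                   # eq. (7), quadratic neural unit
    return y, xi                                   # xi is fed back into x(k+1)

# a toy configuration: 25 inputs, 8 hidden sigmoid neurons
rng = np.random.default_rng(0)
w = 0.1 * rng.standard_normal((8, 25))
v_qnu = 0.1 * rng.standard_normal((9, 9))          # 8 hidden outputs + the unit
y, xi = forward_step(w, v_qnu, rng.standard_normal(25))
```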
4 results and discussion

this section shows the results of 3-minute predictions of the nox emissions (in fact of the 3-minute floating averages) of the pulverized firing boiler at eme 1 by the proposed neural network ($n_s = 3$, sampling 1 minute). the recurrent network does not include measured o2, nox, or co on its input; the introduction of nox as a measured external input resulted in a prediction failure; the model learned to blindly follow the previous measured nox, which is a typical problem with the improper use of neural networks for complicated systems.

figure 2: nox prediction by the neural network (1)–(7) with re-configurations and re-training (figure 1)

figure 3: detail from figure 2 — a good prediction

figure 4: detail from figure 2 — a bad prediction due to outliers in the training data

the permanent computation run of the prediction for 24 days, with one-minute sampling and retraining every 30 minutes, is shown in figure 2. figure 2 shows superimposed 24-day recordings of the measured nox (thick line) and the three-minute prediction ($n_s = 3$) of nox (bold line); the three-minute floating average of nox is predicted, and the neural network is retrained every 30 minutes with 5 hours of the very last measured data (one-minute sampling; the model input excludes measured o2, nox, co). the network (1)–(7) was retrained every 30 minutes by the back-propagation through time algorithm [18] with a blindly selected most recent history of 298 samples (5 hours) of the measured process variables. each retraining took less than 3 minutes of real computation time in matlab on a pc (win7, i7), which is practical for a real-time retraining implementation. the good performance of the nox prediction is apparent from the details in figure 3. figure 3 also shows a temporary measurement outage (∼ 15 minutes) after sample k = 2.82e+4; the output of the dynamic neural network substitutes the measurement outage; the good neural network prediction depends on the availability of good retraining data in this observed period. however, this kind of nox outage affects retraining, see figure 4. the prediction accuracy and prediction reliability of the nox prediction depend significantly on the retraining data (here, the last 298 samples before each predicted value). the impact of nox outliers is apparent if we compare the prediction details in figure 3 and figure 4, and it is clear that another signal processing technique for selecting the retraining data needs to be involved in order to avoid nox outliers in the retraining data; the neural network fails in prediction after k = 3.122e+4 because of the poor retraining data and also because of the outliers at k = 3.12e+4. the prediction becomes correct again for k > 3.133e+4, because the related retraining data already does not include the outliers.
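the retraining cadence described in this section can be sketched as follows; note that a ridge regressor is used here only as a runnable stand-in for the recurrent network (1)–(7), and all names and settings other than the 298-sample window, the 30-minute retraining period and n_s = 3 are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import Ridge

def rolling_prediction(y, u, retrain_every=30, window=298, ns=3):
    """illustrates only the retraining cadence: every `retrain_every` samples
    the model is refit on the last `window` samples and then predicts the
    target ns samples ahead (1-minute sampling)."""
    preds = np.full(len(y), np.nan)
    model = None
    for k in range(window + ns, len(y) - ns):
        if model is None or k % retrain_every == 0:
            x_train = u[k - window + 1 - ns : k + 1 - ns]  # inputs ns steps back
            y_train = y[k - window + 1 : k + 1]            # aligned targets
            model = Ridge(alpha=1.0).fit(x_train, y_train)
        preds[k + ns] = model.predict(u[k : k + 1])[0]
    return preds

# synthetic demo data: 6 pca-reduced inputs, a linear target plus noise
rng = np.random.default_rng(0)
u = rng.standard_normal((2000, 6))
y = u @ rng.standard_normal(6) + 0.1 * rng.standard_normal(2000)
p = rolling_prediction(y, u)
```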
5 conclusions

we designed and tested a non-conventional recurrent neural network for predicting the nox emissions of a pulverized firing boiler without using o2, nox, or co on the model input. the proposed method handles the process non-stationarity by frequent retraining, and it handles the outages of the input process variables by input data preprocessing (but not yet the outages of the predicted nox itself); it is assumed that this can be resolved by an automatically supervised selection of the training data in which nox outliers do not appear, and by avoiding unnecessary retraining.

acknowledgement

this work has been supported by grant mpo fr-ti1/538, and in part by grant sgs10/252/ohk2/3t/12.

references

[1] gupta, m. m., liang, j., homma, n.: static and dynamic neural networks: from fundamentals to advanced theory. ieee press and wiley-interscience, john wiley & sons, inc., 2003.
[2] kalogirou, s. a.: artificial intelligence for the modeling and control of combustion processes: a review. progress in energy and combustion science, 29, 2003, p. 515–566, elsevier. issn 0360-1285.
[3] mellit, a., kalogirou, s. a.: artificial intelligence techniques for photovoltaic applications: a review. progress in energy and combustion science, 34, 2008, p. 574–632, elsevier. issn 0360-1285.
[4] bukovsky, i., bila, j., gupta, m. m., hou, z.-g., homma, n.: foundation and classification of nonconventional neural units and paradigm of nonsynaptic neural interaction. in discoveries and breakthroughs in cognitive informatics and natural intelligence, the acini book series, ed. yingxu wang, university of calgary, canada. igi publishing, hershey pa, usa, 2009. isbn 978-1-60566-902-1.
[5] pitel’, j., mižák, j.: approximation of co/lambda biomass combustion dependence by artificial intelligence techniques. in annals of daaam for 2011 & proceedings of the 22nd international daaam symposium, vienna, austria, 23–26th november 2011. vienna: daaam international, 2011, p. 0143–0144. isbn 978-3-901509-83-4, issn 1726-9679.
[6] mižák, j., pitel’, j.: using artificial neural networks for biomass combustion process control. in proceedings of the 2nd international seminar “system analysis, control and information processing”, divnomorskoje, russia. [cd-rom]. rostov on don: don state technical university, 2011, p. 343–348. isbn 978-5-7890-0666-5.
[7] pitel’, j., boržíková, j., mižák, j.: biomass combustion process control using artificial intelligence techniques. in proceedings of the xxxvth seminar asr’2010 “instruments and control”. ostrava: všb-tu ostrava, 2010, p. 317–321. isbn 978-80-248-2191-7.
[8] hošovský, a.: genetic optimization of neural networks structure for modeling of biomass-fired boiler emissions. journal of applied science in thermodynamics and fluid mechanics, vol. 9, no. 2, 2011, p. 1–6. issn 1802-9388.
[9] bukovský, i., lepold, m., bílá, j.: quadratic neural unit and its network in validation of process data of steam turbine loop and energetic boiler. wcci 2010, ieee int. joint conf. on neural networks ijcnn, barcelona, spain, 2010.
[10] bukovský, i., bílá, j., noriasu, h., rodriguez, r.: prospects of gradient methods for nonlinear control. automatizácia a riadenie v teórii a praxi artep 2012, slovakia, 2012. isbn 978-80-553-0835-7.
[11] ivakhnenko, a. g.: polynomial theory of complex systems. ieee trans. on systems, man and cybernetics, vol. smc-1, 4, 1971, p. 364–378.
[12] nikolaev, n. y., iba, h.: learning polynomial feedforward neural networks by genetic programming and backpropagation. ieee trans. on neural networks, vol. 14, no. 2, march 2003, p. 337–350.
[13] taylor, j. g., commbes, s.: learning higher order correlations. neural networks, 6, 1993, p. 423–428.
[14] kosmatopoulos, e., polycarpou, m., christodoulou, m., ioannou, p.: high-order neural network structures for identification of dynamical systems. ieee trans. on neural networks, vol. 6, no. 2, march 1995, p. 422–431.
[15] williams, r. j., zipser, d.: a learning algorithm for continually running fully recurrent neural networks. neural comput., vol. 1, 1989, p. 270–280.
[16] werbos, p. j.: backpropagation through time: what it is and how to do it. proc. ieee, vol. 78, no. 10, oct. 1990, p. 1550–1560. issn 0018-9219.
[17] pearlmutter, b. a.: gradient calculation for dynamic recurrent neural networks: a survey. ieee transactions on neural networks, 6, 5, 1995, p. 1212–1228. doi: 10.1109/72.410363.
[18] gupta, m. m., bukovský, i., noriasu, h., solo, m. g., hou, z.-g.: fundamentals of higher order neural networks for modeling and simulation. in artificial higher order neural networks for modeling and simulation, ed. m. zhang, igi global, 2012 (accepted, to appear in 2012).
[19] bukovský, i., křehlík, k.: tests of the neural model of the boiler of the mělník i power plant. research report no. 6 – zi00069/e06. department of instrumentation and control engineering, faculty of mechanical engineering, ctu in prague, 2011. (in czech)

acta polytechnica vol. 52 no. 5/2012

gated graphene electrical transport characterization

josef náhlík1, michal janoušek1, zbyněk šobáň1,2
1 dept. of microelectronics, faculty of electrical engineering, czech technical university in prague, technická 2, 166 27 praha, czech republic
2 institute of physics ascr, cukrovarnická 10/112, 162 00 praha, czech republic
corresponding author: nahlijo1@fel.cvut.cz

abstract: graphene is a very interesting new material, and promises attractive applications in future nanodevices. it is a 2d carbon structure with very interesting physical behavior. graphene is an almost transparent material that has higher carrier mobility than any other material at room temperature. graphene can therefore be used in applications such as ultrahigh-speed transistors and transparent electrodes. in this paper, we present our preliminary experiments on the transport behavior of graphene at room temperature. we measured the resistivity of hall-bar samples depending on the gate voltage (backgated graphene). hysteresis between the forward and backward sweep directions was observed.

keywords: graphene, hysteresis in electric transport.

1 graphene

graphene is a very interesting new material that was first prepared by a. k. geim and k. s. novoselov in 2004 [9]. graphene is a monolayer of carbon atoms in a honeycomb lattice, and can be prepared in various ways. graphene has very interesting physical behavior, which promises applications in nanodevices such as ultra-high speed transistors and transparent electrodes.

1.1 methods of preparation

the first preparation method is exfoliation, which was published by geim and novoselov. it is a very simple technique for preparing graphene “flakes” from highly-oriented pyrolytic graphite (hopg), but the dimensions of the “flakes” are very small – about tens of µm. the second method is growth by chemical vapor deposition (cvd) on a copper or nickel foil [5, 7, 8]. this method is based on the thermal decomposition of methane at high temperature. the third method is high-temperature annealing of silicon carbide in an argon atmosphere, in another inert gas, or in a vacuum [3]. it is better to use semi-insulating silicon carbide for easy electrical characterization, and it is important to use sic in a specific orientation.

1.2 basic properties of graphene

graphene is an interesting material not only for its electrical properties, but also for its mechanical and optical properties. in this paper we focus on its electrical properties. graphene is composed of sp2-bonded carbon atoms with lattice constant a_c−c = 1.42 å (carbon-to-carbon length). the remaining p orbitals create π bonds responsible for the dominant planar conduction phenomena. the energy-momentum dispersion relation in the vicinity of the k point of the brillouin zone is linear, and the conduction and valence bands overlap at one point (the dirac point). due to this specific relation, charge carriers are seen as zero-mass relativistic particles with an effective “speed of light” c ∼ 10^6 m/s [11]. the type of conductivity is related to the dispersion relation.
as shown in fig. 1, the change in fermi energy when the gate voltage is applied changes the type of conductivity. when the fermi level matches the dirac point, the conductivity has a minimum value due to the lack of free carriers.

figure 1: ambipolar electric field effect in single-layer graphene. the positive (negative) gate voltage changes the fermi energy and the induced electrons (holes), and causes electron (hole) conductivity.

the carrier mobility is also very high. it can be more than 10^6 cm2/vs, but the charge mobility depends on many parameters. the main influences include the concentration of impurities, the roughness of the substrate, and the temperature. the highest measured charge mobility is about 2 × 10^5 cm2/vs [1]. this is more than one hundred times higher than in silicon.

1.3 identification methods

graphene is a very thin material and its optical transparency is very high [10, 11] (monolayer – nearly 98 % for visible wavelengths). it is therefore almost invisible on a transparent material. interference with the substrate is used for optical identification. a silicon substrate with silicon dioxide 90 nm or 300 nm in thickness, where the optical contrast between a monolayer of graphene and the substrate can be as high as 12 %, is preferred for this reason. the most reliable method for confirming that the examined material is graphene is raman spectroscopy, which is a noninvasive technique for identifying the composition of the studied material. a carbon monolayer has its own unique spectrum (multilayer graphene or graphite has a different raman spectrum) [2, 11]. our graphene samples were checked by this technique. identification can also be performed by afm (atomic force microscopy). the height of a graphene layer on an oxidized silicon substrate is not 0.35 nm, i.e. the interlayer distance of graphite; the height is 0.8–1.2 nm, due to the native van der waals inter-layer distance [11].

2 electrical hysteresis in graphene

the applied gate voltage changes the fermi energy position, and consequently also the resistivity of the graphene. hysteresis between the points of maximum resistivity in the two different sweep directions of the gate voltage has been observed. this phenomenon is most often explained by the influence of ambient air humidity [4, 6, 12]. a layer of silanol groups (sioh) is formed on the surface of the silicon dioxide substrate (sio2 on silicon), and attracts oh groups (or other molecules). dipolar water molecules are easily adsorbed, and they can influence the charge transfer. water molecules behave as doping atoms, because the gate voltage affects their polarization and they add their own electric field intensity to the intensity caused by the voltage source that is connected between the conductive substrate and the graphene (so-called backgated graphene). fig. 2 presents the scheme of the polarized water molecules.

figure 2: scheme of the polarization of water molecules by gate voltage in backgated graphene.

polarized molecules and their electric intensity could be one of the effects that cause hysteresis between the forward and backward sweeps of the gate voltage in the resistivity/gate-voltage characteristic. another way to explain the hysteresis is by the presence of charge traps formed by surface silanol or surface-bound h2o molecules [4, 6]. to suppress the influence of water, the substrate can be covered with a hydrophobic hexamethyldisilazane (hmds) layer [4, 6].
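for illustration only (this model is not used in the paper), the hysteresis can be mimicked with a common backgated-graphene toy model in which the carrier density is n = sqrt(n0^2 + (c_ox(v_g − v_dirac)/e)^2) and the slow dipole/trap response shifts the dirac point v_dirac between the two sweep directions; all parameter values below are assumed, not measured.

```python
import numpy as np

E = 1.602e-19          # elementary charge [c]
C_OX = 1.15e-4         # gate capacitance per area of 300 nm sio2 [f/m^2], assumed

def sheet_resistance(vg, v_dirac, mu=0.2, n0=5e15):
    """toy backgated-graphene model: gate-induced carrier density plus a
    residual density n0 at the dirac point. mu [m^2/vs], n0 [1/m^2];
    returns the sheet resistance [ohm/sq]."""
    n = np.sqrt(n0**2 + (C_OX * (vg - v_dirac) / E) ** 2)
    return 1.0 / (E * mu * n)

vg = np.linspace(-40, 70, 500)                 # the gate range used in the paper
r_forward = sheet_resistance(vg, v_dirac=20.0)
r_backward = sheet_resistance(vg, v_dirac=28.0)   # dirac point shifted by the
                                                  # slow dipole/trap response
print(vg[np.argmax(r_backward)] - vg[np.argmax(r_forward)])   # hysteresis width
```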
3 experimental setup

our graphene samples were prepared by chemical vapor deposition on a copper foil and then transferred to a silicon substrate with 300 nm of silicon dioxide. the presence of a graphene monolayer was confirmed by raman spectroscopy. after the deposition of the samples, electron lithography was used to define the 2 µm wide hall-bar shown in fig. 3. the graphene was etched by oxygen plasma, and the metal contacts (cr – 5 nm, au – 100 nm) were fabricated by standard uv lithography. thereafter the samples were placed into ceramic chip carriers and were wirebonded.

figure 3: a graphene hall-bar structure defined by electron lithography.

the graphene samples were electrically characterized using an hp4156c semiconductor parameter analyzer. resistivity measurements of the graphene hall-bar vs. gate voltage were performed in a four-point configuration at room temperature. the samples were measured using a constant current source. the gate voltage was connected to the doped silicon substrate. for each gate voltage we measured about 10 points of the va characteristic to confirm that each of the characteristics is linear.

4 results and discussion

the gate voltage was varied in the −40 v to 70 v range. the dependence of the resistivity of the graphene samples on the gate voltage is shown in fig. 4. the characteristics exhibit hysteresis between the forward and backward sweep directions, as we had assumed. as shown in fig. 4, only one dirac point was reached. a dependence on the sweep rate was observed on this sample [4, 12]. low-speed gating (15-second applied gate voltage before measuring the va characteristics) exhibits a smaller difference between the resistivities of the forward and backward sweep directions than higher-speed gating (5-second applied gate voltage), which is shown in fig. 5.

figure 4: hysteresis behavior for low-speed gating (15-second applied gate voltage before measuring va characteristics).

figure 5: hysteresis behavior for higher-speed gating (5-second applied gate voltage before measuring va characteristics).
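a minimal sketch of the corresponding data reduction, under the assumption that the raw va points and the two sweeps are available as arrays (the helper names are hypothetical): the four-point resistance is the slope of a linear fit through the ~10 va points, and the hysteresis can be quantified as the gate-voltage distance between the resistivity maxima of the two sweeps.

```python
import numpy as np

def resistance_from_iv(i, v):
    """four-point resistance: slope of a linear fit through the measured va
    points; the residual scatter indicates how linear (ohmic) the contact is."""
    slope, offset = np.polyfit(i, v, 1)
    residual = np.std(v - (slope * i + offset))
    return slope, residual

def dirac_separation(vg, r_fwd, r_bwd):
    """hysteresis width: gate-voltage distance between the resistivity
    maxima (dirac points) of the forward and backward sweeps."""
    return abs(vg[np.argmax(r_bwd)] - vg[np.argmax(r_fwd)])

# demo with synthetic va points around a 1.2 kohm sample
i = np.linspace(-1e-6, 1e-6, 10)
v = 1.2e3 * i + 2e-6 * np.random.default_rng(2).standard_normal(10)
print(resistance_from_iv(i, v)[0])   # ~1.2e3 ohm
```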
5 conclusion

we have presented a preliminary study of graphene transport behavior. we prepared samples that were checked by raman spectroscopy. the dependence between the resistivity and the gate voltage was measured, and hysteresis between the forward and backward sweep directions was shown. the influence of the sweep rate was also investigated. in our future work, a new set of samples will be covered with a hydrophobic hmds layer to reduce the influence of air humidity, and a different way of depositing the hmds layer will be used.

acknowledgements

the research presented in this paper was supervised by assoc. prof. j. voves, fee ctu in prague, and was supported by gacr grant no. p108/11/0894 and by the grant agency of the czech technical university in prague, grant no. sgs10/281/ohk3/3t/13.

references

[1] bolotin, k. i., et al. ultrahigh electron mobility in suspended graphene. solid state communications 146:351–355, 2008.
[2] calizo, i., et al. temperature dependence of raman spectra of graphene multilayers. nano lett 7(9):2645–2649, 2007.
[3] emstev, k. v., et al. towards wafer-size graphene layers by atmospheric pressure graphitization of silicon carbide. nature materials 8:203, 2009.
[4] joshi, p., et al. intrinsic doping and gate hysteresis in graphene field effect devices fabricated on sio2 substrates. j phys: condens matter 22:334214, 2010.
[5] kim, k. s., et al. large-scale pattern growth of graphene films for stretchable transparent electrodes. nature 457:706, 2009.
[6] lafkoti, m., et al. graphene on a hydrophobic substrate: doping reduction and hysteresis suppression under ambient conditions. nano lett 10(4):1149–1153, 2010.
[7] li, x., et al. large-area synthesis of high-quality and uniform graphene films on copper foils. science 324:1312, 2009.
[8] mattevi, c., kim, h., chhowalla, m. a review of chemical vapour deposition of graphene on copper. j mater chem 21:3324, 2011.
[9] novoselov, k. s., et al. electric field effect in atomically thin carbon films. science 306:666, 2004.
[10] orlita, m., potemski, m. dirac electronic states in graphene systems: optical spectroscopy studies. semicond sci technol 25:063001, 2010.
[11] soldano, c., mahmood, e., dujardin, e. production, properties and potential of graphene. carbon 48:2127, 2010.
[12] wang, h., et al. hysteresis of electronic transport in graphene transistors. acs nano 4(12):7221–7228, 2010.

acta polytechnica vol. 51 no. 2/2011

xmm observations of metal abundances in galaxy clusters

l. lovisari, s. schindler, w. kapferer

abstract: the hot gas that fills the space between galaxies in clusters is rich in metals. due to their large potential well, galaxy clusters accumulate metals over the whole history of the cluster, and retain important information on cluster formation and evolution. we derive detailed metallicity maps for a sample of 5 clusters, observed with xmm-newton, to study the distribution of metals in the intra-cluster medium (icm). we show that even in relaxed clusters the distribution of metals shows many inhomogeneities, with several maxima separated by low metallicity regions. we also found a deviation from the expected temperature-metallicity relation.

keywords: galaxies: clusters: general – galaxies: abundances – x-ray: galaxies: clusters – intergalactic medium.

introduction

since the first x-ray observations of the 7 kev iron line feature in the 1970s we have known that the intra-cluster medium (icm) contains not only primordial elements but also heavy elements. as heavy elements are only produced in stars, which reside mainly in galaxies, the enriched material must have been ejected into the icm by the member galaxies. due to the large potential wells of galaxy clusters, they retain all the enriched material. this makes them excellent laboratories for the study of nucleosynthesis and of the chemical enrichment history of the universe. because the gas transfer affects the evolution of galaxies and galaxy clusters, it is important to know when and how the enrichment takes place. the components in a galaxy cluster interact with each other in many different ways. a study of the distribution of the ejected metals can therefore give us important information on the mechanisms that transported the enriched gas into the icm. several processes have been proposed for explaining the observed enrichment of the icm: ram-pressure stripping, galactic winds, galaxy-galaxy interactions, agn outflows, intra-cluster supernovae, and others. simulations show an inhomogeneous distribution of the metals independent of the enrichment processes [1]. nevertheless, agn outflows and galaxy-galaxy interactions can add metals to the icm [2]. simulations suggest that the metal enrichment of the icm is primarily due to galactic winds and ram-pressure stripping. a detailed comparison between the enrichment due to galactic winds and that due to ram-pressure stripping revealed that these two processes yield different metal distributions and a different time dependence of the enrichment [3].
in massive clusters, ram-pressure stripping provides a much more centrally concentrated distribution than galactic winds. this is because galactic winds can be suppressed in the cluster center, while ram-pressure stripping is most efficient there, due to the fact that the icm density as well as the galaxy velocities are larger in the cluster center [3]. x-ray spectra are the only measure of the metallicity of the icm. the metallicity is derived mainly by measuring the equivalent width of the iron line once the continuum (almost entirely due to thermal bremsstrahlung) is known. with the first generation of satellites it was only possible to determine radial metallicity profiles [4]. with deep observations of bright clusters of galaxies by the chandra and xmm-newton satellites it is now possible to extract the metallicities in certain regions of a galaxy cluster and to construct x-ray weighted metallicity maps [5, 6, 7, 8, 9, 10, 11, 12, 13, 14].

metallicity maps

in order to study the distribution of metals, we prepared adaptively binned abundance maps for a sample of five clusters (centaurus, a2029, a496, s159-03 and hydra a). appropriately cleaned data sets for all three xmm-newton epic cameras, with point sources removed, were used to create the source spectra. high statistics are required in order to obtain a metallicity measurement with good accuracy. thus, to ensure an acceptable error also in the outskirts of the clusters, we set a minimum required count number (∼ 5 000 source counts per region in the 0.3–10 kev band) for proceeding with the spectral fit. in order to model the emission from a single (or multi) temperature plasma, we fit the spectra with an apec (+apec) model multiplied by the galactic column density fixed at the galactic value. the spectral regions for the map were selected following the method presented in [14], which we can summarize as follows: a square region centered on the x-ray peak was defined to include the area with high surface brightness. the region size of the pixels was optimized to be as small as possible by splitting each region into horizontal or vertical segments through its center, while including at least 5 000 source counts.

fig. 1: left panels: metallicity maps based on spectra from all three epic cameras; right panels: plot of abundances against temperature for each bin
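a minimal sketch of such adaptive binning is given below, assuming a plain 2d counts image and hypothetical names (split_region, MIN_COUNTS); the actual pipeline works on cleaned epic event lists with point sources excised.

```python
import numpy as np

MIN_COUNTS = 5000   # ~5000 source counts per region, 0.3-10 kev band

def split_region(img, x0, x1, y0, y1, bins):
    """recursively split a region through its center, vertically or
    horizontally, while both halves keep at least MIN_COUNTS; regions that
    cannot be split any further become the map bins."""
    xm, ym = (x0 + x1) // 2, (y0 + y1) // 2
    cuts = []
    if x1 - x0 > 1:
        cuts.append(((x0, xm, y0, y1), (xm, x1, y0, y1)))   # vertical cut
    if y1 - y0 > 1:
        cuts.append(((x0, x1, y0, ym), (x0, x1, ym, y1)))   # horizontal cut
    for half_a, half_b in cuts:
        ca = img[half_a[2]:half_a[3], half_a[0]:half_a[1]].sum()
        cb = img[half_b[2]:half_b[3], half_b[0]:half_b[1]].sum()
        if ca >= MIN_COUNTS and cb >= MIN_COUNTS:
            split_region(img, *half_a, bins)
            split_region(img, *half_b, bins)
            return
    bins.append((x0, x1, y0, y1))   # leaf: extract and fit a spectrum here

img = np.random.default_rng(3).poisson(2.0, size=(256, 256))
bins = []
split_region(img, 0, 256, 0, 256, bins)
print(len(bins))   # number of adaptive map bins
```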
the metallicity distribution appears clearly non-spherical. in fig. 1 (left panels) we show 3 examples of the obtained metallicity maps. for centaurus and sérsic 159-03, there is a peak in the center and then the metallicity decreases in the outskirts, while a496 (as well as a2029 and hydra a) shows high metallicity clumps both in the center and in the outskirts. several clumps are not significant, but most of them deviate significantly (99 % c.l.) from the average profile. we note that, since centaurus has a very low redshift (z = 0.0114), in these observations we are looking at the very central part of the cluster (r < 200 kpc) compared with the other clusters, and this could explain its different shape. on the other hand, for sérsic, for which we map the metal distribution out to more than 350 kpc, we observe the same shape as for centaurus. several maxima are visible in the metal distribution which are not associated with the cluster center. from simulations [15, 16], we know that the maxima are typically at places where galaxies have just lost a lot of gas due to ram pressure. since the gas lost by galaxies is not mixed immediately with the icm at the place where we observe a metal blob, we should also observe a low temperature, due to the fact that the gas in galaxies is cooler than the icm. we therefore produced temperature maps of the clusters with the same spatial resolution as obtained for the metal maps, and then we plotted the abundance of the bins against their temperature, see fig. 1 (right panels). since we are searching for cool high-metallicity clumps due to the ejection of gas from galaxies, and not the cool high-metallicity bins found in the cool cores, we did not plot the inner bins. apart from centaurus, we see a deviation from the expected temperature-metallicity relation. most of the deviating points correspond to the locus of low metallicity and low kt. the deviation could be due to the iron bias effect, which causes an underestimation of the metallicity when we attempt to fit a plasma that is, in fact, characterized by a combination of different temperatures with a single temperature model [17, 18]. on the other hand, in the range of temperatures between 3 and 4 kev, the inverse iron bias effect, which causes an overestimation of the metal abundance [19], can play an important role. the combination of these two effects could explain the large scatter in the distribution of fig. 1. it is also possible that the ejected gas, with t < 1 kev and metallicity in the range 0.5–1.5 [20, 21], will be heated up to the temperature of the surrounding gas (icm) on a shorter time-scale than the time-scale of metal mixing. in this case, after a while we should observe a region of high metallicity (not yet dispersed) and high temperature (heated up to the icm temperature). another possible explanation for the spread in the distribution could be related to the number of intra-cluster supernovae. during ram-pressure stripping events, many stars in fact form in the tail of stripped gas. the stars evolve and explode as sne directly in the icm, and they can enrich the icm very efficiently. in this case we should see clumps of high metallicity (due to the sne explosions) and high temperature. obviously, more complex heating and cooling processes are at work, and thus the simple picture of stripped gas does not hold.

conclusion

based on xmm-newton observations, we have studied the spatial distribution of metal abundances in a sample of 5 relaxed clusters. we found that even for relaxed clusters the distribution of metals is clearly non-spherical. it appears very inhomogeneous, with several maxima separated by low metallicity regions. moreover, we found a deviation from the temperature-metallicity anti-correlation that can be partially explained by the iron bias and inverse iron bias effects. however, a set of simulations that takes thermal conduction into account should be performed to better understand the physics of the intra-cluster medium.

references

[1] schindler, s., diaferio, a.: metal enrichment processes. space science reviews, 2008.
[2] kapferer, w., knapp, a., schindler, s., kimeswenger, s., van kampen, e.: star formation rates and mass distributions in interacting galaxies. a&a, 438, 2005, p. 87.
[3] kapferer, w., ferrari, c., domainko, w., mair, m., kronberger, t., schindler, s., kimeswenger, s., van kampen, e., breitschwerdt, d., ruffert, m.: simulations of galactic winds and starbursts in galaxy clusters. a&a, 447, 2006, p. 827.
[4] de grandi, s., ettori, s., longhetti, m., molendi, s.: on the iron content in rich nearby clusters of galaxies. a&a, 419, 2004, p. 7.
[5] schmidt, r. w., fabian, a. c., sanders, j. s.: chandra temperature and metallicity maps of the perseus cluster core. mnras, 337, 2002, p. 71.
[6] sanders, j. s., fabian, a. c., allen, s. w., schmidt, r. w.: mapping small-scale temperature and abundance structures in the core of the perseus cluster. mnras, 349, 2004, p. 952.
[7] durret, f., lima neto, g. b., forman, g. b.: an xmm-newton view of the cluster of galaxies abell 85. a&a, 432, 2005, p. 809.
[8] o’sullivan, e., vrtilek, j. m., kempner, j. c., david, l. p., houck, j. c.: awm 4: an isothermal cluster observed with xmm-newton. mnras, 357, 2005, p. 1134.
[9] sauvageot, j. l., belsole, e., pratt, g. w.: the late merging phase of a galaxy cluster: xmm epic observations of a3266. a&a, 444, 2005, p. 673.
[10] werner, n., de plaa, j., kaastra, j. s., vink, j., bleeker, j. a. m., tamura, t., peterson, j. r., verbunt, f.: xmm-newton spectroscopy of the cluster of galaxies 2a0335+096. a&a, 449, 2006, p. 475.
[11] sanders, j. s., fabian, a. c.: enrichment in the centaurus cluster of galaxies. mnras, 371, 2006, p. 1496.
[12] hayakawa, a., hoshino, a., ishida, m., furusho, t., yamasaki, n. y., ohashi, t.: detailed xmm-newton observation of the cluster of galaxies abell 1060. pasj, 58, 2006, p. 695.
[13] simionescu, a., werner, n., böhringer, h., kaastra, j. s., finoguenov, a., brüggen, m., nulsen, p. e. j.: chemical enrichment in the cluster of galaxies hydra a. a&a, 493, 2009, p. 409.
[14] lovisari, l., kapferer, w., schindler, s., ferrari, c.: metallicity map of the galaxy cluster a3667. a&a, 508, 2009, p. 191.
[15] kapferer, w., kronberger, t., ferrari, c., riser, t., schindler, s.: on the influence of ram-pressure stripping on interacting galaxies in clusters. mnras, 389, 2008, p. 1405.
[16] kapferer, w., sluka, c., schindler, s., ferrari, c., ziegler, b.: the effect of ram-pressure on the star formation, mass distribution and morphology of galaxies. a&a, 499, 2009, p. 87.
[17] buote, d. a.: x-ray evidence for multiphase hot gas with nearly solar fe abundances in the brightest groups of galaxies. mnras, 311, 2000, p. 176.
[18] rasia, e., mazzotta, p., bourdin, h., borgani, s., tornatore, l., ettori, s., dolag, k., moscardini, l.: x-mas2: study systematics on the icm metallicity measurements. apj, 674, 2008, p. 728.
[19] gastaldello, f., ettori, s., balestra, i., brighenti, f., buote, d. a., de grandi, s., ghizzardi, s., gitti, m., tozzi, p.: apparent high metallicity in 3–4 kev galaxy clusters: the inverse iron-bias in action in the case of the merging cluster abell 2028. 2010, arxiv:1006.3255.
[20] matsushita, k., ohashi, t., makishima, k.: metal abundances in the hot interstellar medium in early-type galaxies observed with asca. pasj, 52, 2000, p. 685.
[21] athey, a. e., bregman, j. n.: oxygen metallicity determinations from optical emission lines in early-type galaxies. apj, 696, 2009, p. 681.

lorenzo lovisari, sabine schindler, wolfgang kapferer
e-mail: lorenzo.lovisari@uibk.ac.at
institute of astro- and particle physics
technikerstrasse 25, a-6020 innsbruck, austria

acta polytechnica vol. 51 no. 1/2011

the velocity tensor and the momentum tensor

t. lanczewski

abstract: this paper introduces a new object called the momentum tensor. together with the velocity tensor it forms a basis for establishing the tensorial picture of classical and relativistic mechanics. some properties of the momentum tensor are derived as well as its relation with the velocity tensor.
for the sake of clarity, only the two-dimensional case is investigated. however, the general conclusions are also valid for higher dimensional spacetimes.

keywords: relativistic classical mechanics, velocity tensor, momentum tensor.

1 introduction

in [1], an object called the velocity tensor $V^\mu{}_\nu(v)$ was described. it comes from a generalization of the equation

$$\mathrm{d}x(t) - v(t)\,\mathrm{d}t = 0 \qquad (1)$$

into the generally covariant form

$$V^\mu{}_\nu(v)\,\mathrm{d}x^\nu = 0. \qquad (2)$$

the two-dimensional matrix of the classical velocity tensor takes the form

$$V(v) = V^0{}_1 \begin{pmatrix} -v & 1 \\ -v^2 & v \end{pmatrix}, \qquad (3)$$

while in the relativistic case

$$V(\beta) = \gamma^2\,V^0{}_1 \begin{pmatrix} -\beta & 1 \\ -\beta^2 & \beta \end{pmatrix}, \qquad (4)$$

where $V^0{}_1$ is some arbitrary constant, $\beta = v/c$ and $\gamma = (1-\beta^2)^{-1/2}$. as was shown in [1], the tensorial description has an obvious advantage over the standard description, since it does not use the notion of the proper time

$$\tau = t\,\sqrt{1 - \frac{v^2(t)}{c^2}} \qquad (5)$$

and therefore it allows a description of non-uniform motions and of systems with an arbitrary number of material points. it also provides a cornerstone for formulating a generally covariant mechanics. however, the velocity tensor deals solely with kinematical issues. to make the tensorial description complete, we need to introduce another tensorial object called the momentum tensor $\Pi^\mu{}_\nu(v)$. by means of this tensor it is possible to solve dynamical problems.

2 definition of the momentum tensor

in classical and relativistic mechanics the following formula holds true [2]:

$$\frac{\mathrm{d}p(x,t)}{\mathrm{d}t} = F(x,t). \qquad (6)$$

the tensorial equivalent of eq. (6) is presumed to be

$$\partial_\mu \Pi^\mu{}_\nu(x,t) = \Phi_\nu(x,t), \qquad (7)$$

where $\Pi^\mu{}_\nu(x,t)$ is the momentum tensor and $\Phi_\nu(x,t)$ is the influence of the exterior on a body. it should be stressed here that we do not assume a priori a relationship between $F(x,t)$ and $\Phi_\nu(x,t)$. the choice of the form of the mixed tensor $\Pi^\mu{}_\nu(x,t)$ comes from the assumption that the momentum tensor should be some function of the velocity tensor. since the velocity tensor is a function of the classical velocity $v$, the momentum tensor is $\Pi^\mu{}_\nu(x,t) := \Pi^\mu{}_\nu(v)$.

3 general construction of the momentum tensor

in general, the momentum tensor $\Pi^\mu{}_\nu(v)$ is represented by a square matrix

$$\Pi(v) = \begin{pmatrix} \Pi^0{}_0(v) & \Pi^0{}_1(v) & \cdots & \Pi^0{}_n(v) \\ \Pi^1{}_0(v) & \Pi^1{}_1(v) & \cdots & \Pi^1{}_n(v) \\ \vdots & \vdots & \ddots & \vdots \\ \Pi^n{}_0(v) & \Pi^n{}_1(v) & \cdots & \Pi^n{}_n(v) \end{pmatrix},$$

where the elements $\Pi^\mu{}_\nu(v)$ are some functions of the velocity $v$, variable with time. in order to determine them, we make use of the transformation relation for a mixed tensor. passing from an inertial reference frame $s$ to an inertial system $s'$ that moves with velocity $u$ relative to $s$, the momentum tensor $\Pi^\mu{}_\nu(v)$ transforms in accordance with the following formula:

$$\Pi^\mu{}_\nu(v) \to \Pi^{\mu'}{}_{\nu'}(v') = L^{\mu'}{}_\mu(u)\,\Pi^\mu{}_\nu(v)\,L^\nu{}_{\nu'}(u), \qquad (8)$$

or in matrix notation

$$\Pi(v) \to \Pi'(v') = L(u)\,\Pi(v)\,L(-u). \qquad (9)$$

assuming that $\Pi(v)$ is form-invariant, i.e. $\Pi'(v') = \Pi(v')$, we arrive at a functional equation for $\Pi(v)$ in the form

$$\Pi(v') = L(u)\,\Pi(v)\,L(-u), \qquad (10)$$

where $v'$ is the velocity of a material point in the system $s'$ and $v$ is its velocity in $s$. it is easy to prove [1] that after some simple substitutions and rearrangements in eq. (10) we get the solution

$$\Pi(v) = L(-v)\,\Pi(0)\,L(v), \qquad (11)$$

where $\Pi(0)$ is an arbitrary square matrix formed by constant elements.

4 two-dimensional momentum tensor

4.1 non-relativistic case

in this case we substitute into eq. (11) the galilean transformation in the form

$$G(v) = \begin{pmatrix} 1 & 0 \\ -v & 1 \end{pmatrix}$$

and hence we get

$$\Pi(v) = \begin{pmatrix} 1 & 0 \\ v & 1 \end{pmatrix} \begin{pmatrix} \Pi^0{}_0 & \Pi^0{}_1 \\ \Pi^1{}_0 & \Pi^1{}_1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ -v & 1 \end{pmatrix} = \begin{pmatrix} \Pi^0{}_0 - v\,\Pi^0{}_1 & \Pi^0{}_1 \\ \Pi^1{}_0 + v\,(\Pi^0{}_0 - \Pi^1{}_1) - v^2\,\Pi^0{}_1 & \Pi^1{}_1 + v\,\Pi^0{}_1 \end{pmatrix}, \qquad (12)$$

where all elements $\Pi^\mu{}_\nu$ in eq. (12) are constant.
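the algebra of eqs. (10)–(12) is easy to verify symbolically; the following sympy sketch (illustrative, not part of the paper) reproduces the matrix of eq. (12) from the solution (11) and checks the form-invariance (10), using the fact that for the galilean case the velocity in the moving frame is v' = v − u.

```python
import sympy as sp

v, u = sp.symbols('v u', real=True)
p0 = sp.Matrix(2, 2, list(sp.symbols('Pi00 Pi01 Pi10 Pi11')))  # constant Pi(0)
G = lambda w: sp.Matrix([[1, 0], [-w, 1]])                     # galilean boost

Pi_v = sp.expand(G(-v) * p0 * G(v))    # eq. (11): Pi(v) = L(-v) Pi(0) L(v)
print(Pi_v)                            # matches the matrix of eq. (12)

# form-invariance (10): L(u) Pi(v) L(-u) equals Pi(v') with v' = v - u
lhs = sp.expand(G(u) * Pi_v * G(-u))
rhs = sp.expand(Pi_v.subs(v, v - u))
print(sp.simplify(lhs - rhs))          # zero matrix

# relativistic case: the lorentz boost reproduces eq. (19) below
b = sp.symbols('beta', real=True)
L = lambda w: sp.Matrix([[1, -w], [-w, 1]]) / sp.sqrt(1 - w**2)
Pi_b = sp.simplify(L(-b) * p0 * L(b))
print(sp.factor(Pi_b[1, 0]))  # gamma^2 (Pi10 + beta(Pi00 - Pi11) - beta^2 Pi01)
```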
since the above equation is only time-dependent, eq. (7) leads to the expression

$$\partial_0 \Pi^0{}_\nu(v) = \Phi_\nu, \qquad (13)$$

where $\partial_0 = \mathrm{d}/\mathrm{d}t$, and therefore we get

$$\Phi_0 = \partial_0 \Pi^0{}_0(v) = \partial_0\left(\Pi^0{}_0 - v\,\Pi^0{}_1\right) = -\dot v\,\Pi^0{}_1 \qquad (14)$$

and

$$\Phi_1 = \partial_0 \Pi^0{}_1(v) = \partial_0 \Pi^0{}_1 = 0. \qquad (15)$$

hence, in order to reconstruct the classical newtonian equation of motion, we have to assume that

$$\Pi^0{}_1 = m \quad \text{and} \quad \Phi_0 = -F, \qquad (16)$$

where $m$ is the mass of a material point and $F$ is a classical newtonian force in a two-dimensional spacetime. the choice of the sign in eq. (16) results from considerations in higher dimensional spacetimes. it results from eqs. (12), (14) and (15) that only the element $\Pi^0{}_1$ takes part in dynamical processes, since no other coefficient appears in eq. (14). therefore, the other elements may take arbitrary values, and each specific choice among them will lead to the same dynamics. in particular, we may choose them in such a way that the relation

$$\Pi(v) = m\,V(v) \qquad (17)$$

is satisfied. keeping in mind that $V(v)$ is given by eq. (3), we get that

$$\Pi(v) = \Pi^0{}_1 \begin{pmatrix} -v & 1 \\ -v^2 & v \end{pmatrix}. \qquad (18)$$

the fact that in the considered case $\Phi_1 = 0$ leads to the general assumption that the component $\Phi_0$ plays a key role in the dynamics, and the components $\Phi_k$ are auxiliary quantities that provide the formalism covariance.

4.2 relativistic case

in the case of substituting into eq. (11) the lorentz transformation given by

$$L(\beta) = \gamma \begin{pmatrix} 1 & -\beta \\ -\beta & 1 \end{pmatrix}$$

we get that

$$\Pi(\beta) = \gamma^2 \begin{pmatrix} \Pi^0{}_0 + \beta(\Pi^1{}_0 - \Pi^0{}_1) - \beta^2\,\Pi^1{}_1 & \Pi^0{}_1 + \beta(\Pi^1{}_1 - \Pi^0{}_0) - \beta^2\,\Pi^1{}_0 \\ \Pi^1{}_0 + \beta(\Pi^0{}_0 - \Pi^1{}_1) - \beta^2\,\Pi^0{}_1 & \Pi^1{}_1 + \beta(\Pi^0{}_1 - \Pi^1{}_0) - \beta^2\,\Pi^0{}_0 \end{pmatrix}. \qquad (19)$$

according to eq. (13) we obtain that

$$\Phi_0 = \partial_0 \Pi^0{}_0(\beta) = \partial_0\,\gamma^2\left[\Pi^0{}_0 + \beta(\Pi^1{}_0 - \Pi^0{}_1) - \beta^2\,\Pi^1{}_1\right] = \gamma^4\dot\beta\left[(1+\beta^2)(\Pi^1{}_0 - \Pi^0{}_1) + 2\beta(\Pi^0{}_0 - \Pi^1{}_1)\right], \qquad (20)$$

$$\Phi_1 = \partial_0 \Pi^0{}_1(\beta) = \partial_0\,\gamma^2\left[\Pi^0{}_1 + \beta(\Pi^1{}_1 - \Pi^0{}_0) - \beta^2\,\Pi^1{}_0\right] = \gamma^4\dot\beta\left[(1+\beta^2)(\Pi^1{}_1 - \Pi^0{}_0) + 2\beta(\Pi^0{}_1 - \Pi^1{}_0)\right]. \qquad (21)$$

as we can observe, in this case generally all coefficients $\Pi^\mu{}_\nu$ take part in the dynamics, since all of them are present in eq. (20). in order to illustrate the role of the parameters $\Pi^\mu{}_\nu$, let us consider a general case of dynamics where $\Phi_0 = \text{const}$. after the integration of eq. (20) we find that

$$\gamma^2\left[\Pi^0{}_0 + \beta(\Pi^1{}_0 - \Pi^0{}_1) - \beta^2\,\Pi^1{}_1\right] = \Phi_0 t + C, \qquad (22)$$

where $C$ is an integration constant. taking into consideration the initial condition for $t = 0$, we obtain that

$$C = \gamma_0^2\left[\Pi^0{}_0 + \beta_0(\Pi^1{}_0 - \Pi^0{}_1) - \beta_0^2\,\Pi^1{}_1\right],$$

where $\beta_0$ and $\gamma_0$ are the values for $t = 0$. if we additionally assume that $\beta_0 = 0$ (i.e. $\gamma_0 = 1$), then $C = \Pi^0{}_0$. substituting this into eq. (22) and making simple rearrangements, we arrive at the following:

$$\beta^2\left(\Phi_0 t + \Pi^0{}_0 - \Pi^1{}_1\right) + \beta\left(\Pi^1{}_0 - \Pi^0{}_1\right) - \Phi_0 t = 0. \qquad (23)$$

the solutions of the above equation are of the form

$$\beta_\pm = \frac{\left(\Pi^0{}_1 - \Pi^1{}_0\right) \pm \sqrt{\left(\Pi^0{}_1 - \Pi^1{}_0\right)^2 + 4\,\Phi_0 t\left(\Phi_0 t + \Pi^0{}_0 - \Pi^1{}_1\right)}}{2\left(\Phi_0 t + \Pi^0{}_0 - \Pi^1{}_1\right)}. \qquad (24)$$

in the standard formalism of the special theory of relativity [3], when a constant force $F$ is applied to a body, one gets the following solutions of the equations of motion for the velocity:

$$\beta^{\mathrm{STR}}_\pm = \pm\sqrt{\frac{F^2 t^2}{m^2 c^2 + F^2 t^2}}. \qquad (25)$$

if we expect that eq. (23) also has two symmetric solutions, we have to assume that $\Pi^0{}_1 = \Pi^1{}_0$. hence in this case we find that

$$\beta_\pm = \pm\sqrt{\frac{\Phi_0 t}{\Phi_0 t + \Pi^0{}_0 - \Pi^1{}_1}}. \qquad (26)$$

fig. 1: comparison of $\beta(t)$ (green dashed) and $\beta^{\mathrm{STR}}(t)$ (red). $F = \Phi_0 = 1$, $m^2c^2 = \Pi^0{}_0 - \Pi^1{}_1 = 1$ are assumed here

it should be stressed here that the asymptotes of eqs. (25) and (26) are identical, i.e.:

$$\lim_{t\to\infty} \beta^{\mathrm{STR}}_\pm = \lim_{t\to\infty} \beta_\pm = \pm 1 \quad \text{and} \quad \lim_{t\to 0} \beta^{\mathrm{STR}}_\pm = \lim_{t\to 0} \beta_\pm = 0.$$
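the comparison in fig. 1 can be reproduced numerically under its stated assumptions (f = φ0 = 1 and m²c² = π⁰₀ − π¹₁ = 1); the short sketch below evaluates the plus branches of eqs. (25) and (26).

```python
import numpy as np

# assumptions taken from fig. 1: f = phi0 = 1, m^2 c^2 = pi00 - pi11 = 1
t = np.linspace(0.0, 7.0, 200)
beta_str = np.sqrt(t**2 / (1.0 + t**2))   # eq. (25), plus branch
beta_tensor = np.sqrt(t / (t + 1.0))      # eq. (26), plus branch

# both curves start at 0 and tend to 1, but approach the light-speed
# asymptote at different rates
print(beta_str[-1], beta_tensor[-1])
```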
as we can see from eq. (26), the constant $\Pi^1{}_1$ plays the role of a “renormalization” constant for $\Pi^0{}_0$, hence it can be discarded without losing the generality of the considerations. then matrix (19) takes the form

$$\Pi(\beta) = \gamma^2 \begin{pmatrix} \Pi^0{}_0 & \Pi^0{}_1 - \beta\,\Pi^0{}_0 - \beta^2\,\Pi^0{}_1 \\ \Pi^0{}_1 + \beta\,\Pi^0{}_0 - \beta^2\,\Pi^0{}_1 & -\beta^2\,\Pi^0{}_0 \end{pmatrix}. \qquad (27)$$

matrix (27) can also be rewritten as

$$\Pi(\beta) = \gamma^2\,\Pi^0{}_0 \begin{pmatrix} 1 & -\beta \\ \beta & -\beta^2 \end{pmatrix} + \Pi^0{}_1 \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad (28)$$

where the second matrix on the right-hand side of eq. (28) is constant in time. assuming that $\Pi^0{}_1 = \Pi^1{}_0$ and $\Pi^1{}_1 = 0$, eqs. (20) and (21) turn into

$$\Phi_0 = \partial_0 \Pi^0{}_0(\beta) = \partial_0\,\gamma^2\,\Pi^0{}_0 = 2\gamma^4\dot\beta\,\beta\,\Pi^0{}_0, \qquad (29)$$

$$\Phi_1 = \partial_0 \Pi^0{}_1(\beta) = \partial_0\,\gamma^2\left(-\beta\,\Pi^0{}_0\right) = -\gamma^4\dot\beta\left(1+\beta^2\right)\Pi^0{}_0.$$

in order to compare this with the standard formalism of the special theory of relativity, let us recall that in the standard description the equation of motion is given by [3]

$$F = \frac{\mathrm{d}p}{\mathrm{d}t} = \frac{mc\,\dot\beta}{(1-\beta^2)^{3/2}} = \gamma^3 mc\,\dot\beta,$$

and therefore $\dot\beta = \gamma^{-3}\,F/(mc)$. substituting this expression into eq. (29) we get

$$\Phi_0 = 2\gamma\,\frac{F}{mc}\,\beta\,\Pi^0{}_0, \qquad (30)$$

$$\Phi_1 = -\gamma\,\frac{F}{mc}\left(1+\beta^2\right)\Pi^0{}_0.$$
niewodniczański institute of nuclear physics polish academy of sciences radzikowskiego 152, pl 31342 kraków, poland 46 ap08_5.vp 1 introduction the problem of complex probability functions is gaining in importance in quantum mechanics [1, 9], where the phase functions has been recognized as necessary for information processing and quantum system modeling. the contextual interpretation of phase functions was presented in [7], and the wave probabilistic models were introduced as necessary part of probabilistic multi-models. a more rigorous introduction to wave probabilistic models was presented in [14], where phase parameters are interpreted as dependency functions between events. the link between wave probabilistic functions and the complementarity principle was first introduced in [10]. the quantization principle as the consequence of phase parameters was defined in [6]. the goal of this paper is to continue in this way of thinking and to provide a more rigorous definition of wave probabilistic models together with their basic features. chapter 2 presents the mathematical theory of wave functions, together with their geometric interpretation. chapter 3 covers the link between wave probabilities and entanglement. chapter 4 describes the general estimation algorithm of wave probabilities. chapter 5 presents the methodology for modeling non-ergodic binary time series. chapter 6 offers an illustrative example of binary time series, and chapter 7 concludes the paper. 2 mathematical theory of wave probabilistic functions a probability space consists of a sample space s and a probability function p(.), mapping the events of s to real numbers in [0,1], such that p(s) � 1, and if a1, a2, … is a sequence of disjoint events, then the sum rule is fulfilled: p a p(ai i n i i n� � � � � � � � � � ) . (1) if the events a1, a2, … are not disjoint the following (product and inclusion-exclusion) rules can be defined: p(a a a p(a p(a a p(a a a p(a a a 1 2 1 2 1 3 1 2 1 � � � � � � � � � � � � n n n ) ) ) ) 1), (2) p(a a a p(a p(a a p(a a a 1 2 1 � � � � � � � � � � � n i i n i j i j n i j k ) ) ) ) i j k n n n � � � � � � � � � �( ) ).1 1 1 2p(a a a (3) taking into consideration the basic laws of probability defined above, we can rewrite them with the help of the complex representations (wave functions) summarized in theorem 1. theorem 1: let us define n events ai, i n�{ , , , }1 2 � of a sample space s, with defined probability functions p a( )i , i n�{ , , , }1 2 � and let us define the n complex functions � � � �( ( ) , { , , , }a ) e p a ei i j i ji i i n� � � � � 1 2 � (4) together with their superposition state � � as a quantum object at its measurement place (or at time series position) �: � � � � � � � � � � � � � � � ( ) ( ) ( ) a a a a a a 1 1 2 2 � n n (5) with modules p a( )i and phases �i where the reference phase assigned to event a1 is chosen as �1 0� , then the inclusion-exclusion rule given in (3) is represented for measurements on quantum objects � �� �1, ,� n (or at a time series window) by: p a a a a( ) ( )1 1 2 2 1 2 � �� � � � � � � � � � � n n i i n , (6) where the phases �i are given: �i i i i i a� � � � � � � � � cos ( ) ) ( ) ( 1 2 1 2 1 1 2 1 p( a a a a p a a a p a � � ) , , , � � � � � � � �1 2 1� i (7) and �1 2 1, , ,� i is computed: � � � � � ( ) ( ) ( ) , , , , a a a a a e 1 2 1 1 1 1 2 1 1 2 � � � � � � � � � � � � i i i j , ,� i 1 (8) proof: the proof is presented in [15]. 42 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 48 no. 
5/2008 wave probabilistic model of binary time series m. svítek this paper presents the theory of wave probabilistic models, together with important features, such as the inclusion-exclusion rule, the product rule, the complementary principle and entanglement. these features are mathematically described, and an illustrative example of binary time series is shown to demonstrate possible applications of the theory. keywords: quantum models, wave probabilistic functions, binary time series. in the next part of this paper, the set of complex functions (4) plus superposition state (5) together with the inclusion-exclusion rule (6, 7, 8) is called the wave probabilistic model. theorem 2: let us define n events ai, i n�{ , , , }1 2 � of a sample space s, with defined probability functions p(a i), i n�{ , , , }1 2 � , and the n complex functions �(a i) with modules p a( )i and phases �i defined in (4–8), where the reference phase assigned to event a1 is chosen as �1 0� , then the inclusion-exclusion rule given for the subset of events a r, r k k km�{ , , , }1 2 � is given as: p p ( ) lim ( ( ) , , a a a a a k k k m k k k i m k m 1 2 1 1 2 0 � �� � � � � � � � � � � � � � ) i n � 1 2 (9) proof: the proof of theorem 2 arises directly from theorem 1, which was proven for all probabilistic values p a( )i , a i, i n�{ , , , }1 2 � including zeros. the zero probabilistic values have an impact on phase parameters �i and change them in such a way that equation (9) is fulfilled. in this paper we assume that quantum objects � are distinguishable. when two identical particles interact (there is a significant overlap of their wave functions), we can not distinguish between them. these overlapping quantum objects are in general bosons (plus sign corresponding to a symmetric wave function under exchange of quantum objects) or fermions (minus sign corresponding to an anti-symmetric wave function under exchange of quantum objects). 3 wave probabilities and entanglement quantum entanglement (the definition for describing the entanglement principle from wikipedia, available on: http://en.wikipedia.org/wiki/quantum_entanglement ) is a quantum mechanical phenomenon in which the quantum states of two or more quantum objects have to be described with reference to each other, even though the individual objects may be spatially separated. this leads to correlations between observable physical properties of the systems. for example, it is possible to prepare two particles in a single quantum state such that when one is observed to be spin-up, the other one will always be observed to be spin-down and vice versa, despite the fact that it is impossible to predict, according to quantum mechanics, which set of measurements will be observed. as a result, measurements performed on one system seem to be instantaneously influencing other systems entangled with it. 
theorem 3: let us define $n$ events $a_i$, $i \in \{1, 2, \dots, n\}$, of a sample space $s$, with probability functions $p(a_i)$, $i \in \{1, 2, \dots, n\}$, and the $n$ complex functions $\psi(a_i)$ with modules $\sqrt{p(a_i)}$ and phases $\varphi_i$ defined in (4)–(8), where the reference phase assigned to event $a_1$ is chosen as $\varphi_1 = 0$. then all events $a_r$, $r \in \{r_1, r_2, \dots, r_n\}$, each with only the two possible states $a_{r_i}$ and its inversion $\bar a_{r_i}$, are entangled if the following form holds:

$p(a_{r_1} \cup a_{r_2} \cup \dots \cup a_{r_n}) = \lim_{p(a_k) \to 0,\; k \notin \{r_1, \dots, r_n\}} \left|\psi(a_1) + \psi(a_2) + \dots + \psi(a_n)\right|^2 = 0,$   (10)

which yields the form

$p(\bar a_{r_1} \cap \bar a_{r_2} \cap \dots \cap \bar a_{r_n}) = 1,$   (11)

where $\bar a_{r_i}$ means the inverted state in comparison with $a_{r_i}$ on the quantum object $\alpha_i$.

proof: theorem 3 emerges directly from the inclusion-exclusion rules and from the definition of wave probabilities. the phase parameters assigned to the wave functions can be either positive or negative. form (10) modifies the phases of the selected subset of events $\{a_{r_1}, a_{r_2}, \dots, a_{r_n}\}$ so as to comply with the inclusion-exclusion rule (9). for special cases, the inclusion-exclusion rule can yield zero due to wave resonances among the phases of the events; this special case can occur for a special set-up of phases. the zero probability (10) directly yields equation (11), which states that the state characterized by $\bar a_{r_1} \cap \dots \cap \bar a_{r_n}$ will surely occur. this state is not random but fully deterministic, and it is thus spatially spread over the events $\{a_{r_1}, a_{r_2}, \dots, a_{r_n}\}$. it can be stated that entanglement is a logical consequence of probabilistic wave functions, and it represents something like a resonance of wave functions yielding deterministic states.

4 estimation algorithm of the wave probabilistic model of time series

we assume a time series composed of many quantum objects, each of which is described by a wave function covering the superposition of all possible events $\{a_1, a_2, \dots, a_n\}$:

$|\psi_\alpha\rangle = \psi_1 |a_1\rangle + \dots + \psi_n |a_n\rangle,$   (12)

where $\alpha$ denotes the $\alpha$-th quantum object. if we take into consideration a window of $i$ quantum objects, the corresponding wave function $|\tilde\psi\rangle$ is given by the kronecker product [14]:

$|\tilde\psi\rangle = |\psi_{\alpha_1}\rangle \otimes |\psi_{\alpha_2}\rangle \otimes \dots \otimes |\psi_{\alpha_i}\rangle,$   (13)

where $|\tilde\psi|^2$ gives the probabilities that measurements on the set of $i$ quantum objects $\{\alpha_1, \dots, \alpha_i\}$ will yield a given series of predefined events.
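as an illustration of (13) – a sketch added by the editor, with arbitrary assumed moduli and phases – the following fragment forms the kronecker product of two binary quantum objects and prints the probabilities of the four joint outcomes {00, 01, 10, 11}.

#include <complex.h>
#include <math.h>
#include <stdio.h>

int main(void)
{
    /* two binary quantum objects |psi> = psi0|0> + psi1|1>;
       the moduli and phases are assumed example values      */
    double complex a[2] = { sqrt(0.5), sqrt(0.5) * cexp(I * M_PI / 3.0) };
    double complex b[2] = { sqrt(0.2), sqrt(0.8) };

    /* kronecker product (13): joint amplitude of the outcome (i, k) */
    for (int i = 0; i < 2; i++)
        for (int k = 0; k < 2; k++) {
            double complex joint = a[i] * b[k];
            printf("p(%d%d) = %.4f\n", i, k, pow(cabs(joint), 2.0));
        }
    return 0;
}

for such a product state the joint probabilities simply factorize; the dependencies between neighbouring objects enter through the phase parameters, which is what the estimation algorithm below recovers from data.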
now we assume that we have the time series and, in return, we estimate the wave functions (4). the algorithm for estimating the parameters of the wave functions can be decomposed into the following steps:

1. let us start with two events $\{a_1, a_2\}$ and assume $\hat\varphi_1 = 0$.

2. we can estimate the phase $\hat\varphi_2$ from the following equation:

$\hat p(a_1 \cup a_2) = \left|\hat\psi_1 e^{j\hat\varphi_1} + \hat\psi_2 e^{j\hat\varphi_2}\right|^2 = \hat\psi_1^2 + \hat\psi_2^2 + 2\,\hat\psi_1 \hat\psi_2 \cos(\hat\varphi_2 - \hat\varphi_1),$   (14)

with the help of the occurrence rates $\hat\psi_1, \hat\psi_2, \hat\varphi_1, \hat\varphi_2$ estimated from the time series. in principle we estimate the probabilities $\hat p(a_1)$, $\hat p(a_2)$, $\hat p(a_1 \cup a_2)$ and compute the parameters $\hat\psi_1, \hat\psi_2, \hat\varphi_1, \hat\varphi_2$.

3. we continue and extend the algorithm to three events $\{a_1, a_2, a_3\}$:

$\hat p(a_1 \cup a_2 \cup a_3) = \left|\hat\psi_1 e^{j\hat\varphi_1} + \hat\psi_2 e^{j\hat\varphi_2} + \hat\psi_3 e^{j\hat\varphi_3}\right|^2 = \hat\psi_1^2 + \hat\psi_2^2 + \hat\psi_3^2 + 2\hat\psi_1\hat\psi_2\cos(\hat\varphi_2 - \hat\varphi_1) + 2\hat\psi_2\hat\psi_3\cos(\hat\varphi_3 - \hat\varphi_2) + 2\hat\psi_1\hat\psi_3\cos(\hat\varphi_3 - \hat\varphi_1),$   (15)

where the estimate of the angle $\hat\varphi_3$ can be computed from the following equation:

$\hat\psi_{1,2}\cos(\hat\varphi_3 - \hat\gamma_{1,2}) = \hat\psi_2\cos(\hat\varphi_3 - \hat\varphi_2) + \hat\psi_1\cos(\hat\varphi_3 - \hat\varphi_1)$   (16)

under knowledge of the estimated parameters $\hat\psi_{1,2}$, $\hat\gamma_{1,2}$, $\hat\psi_1$, $\hat\psi_2$, $\hat\varphi_1$, $\hat\varphi_2$. this step must be made numerically, because equation (16) is non-linear.

4. the procedure described above can be extended to a general $n$-th step, where the unknown angle $\hat\varphi_n$ is computed numerically from the following equation:

$\hat\psi_{1,2,\dots,n-1}\cos(\hat\varphi_n - \hat\gamma_{1,2,\dots,n-1}) = \hat\psi_1\cos(\hat\varphi_n - \hat\varphi_1) + \hat\psi_2\cos(\hat\varphi_n - \hat\varphi_2) + \dots + \hat\psi_{n-1}\cos(\hat\varphi_n - \hat\varphi_{n-1})$   (17)

under knowledge of the estimated parameters $\hat\psi_1, \dots, \hat\psi_{n-1}$, $\hat\varphi_1, \dots, \hat\varphi_{n-1}$, $\hat\psi_{1,2,\dots,n-1}$ and $\hat\gamma_{1,2,\dots,n-1}$. fortunately, for a binary time series we need only steps 1 and 2, and the result can be given analytically.

5 wave probabilistic model of non-ergodic binary time series

we start our discussion with a probabilistic binary time series, and we will show models of the occurrence of zero or one (probability and structure) in the form of wave probabilistic functions. let us define the binary quantum object in "bra-ket" form [17]:

$|\psi_\alpha\rangle = \sqrt{1-p}\;|0_\alpha\rangle + \sqrt{p}\; e^{j\varphi}\,|1_\alpha\rangle.$   (18)

parameter $p$ defines the probability of occurrence of state $|1_\alpha\rangle$ at time series position $\alpha$; the probability of occurrence of state $|0_\alpha\rangle$ must be $(1-p)$. the phase $\varphi$ plays the role of a "structural" parameter that expresses the rate of randomness of the time series [16].

the ergodic theorem allows the time average of a conforming process to equal the ensemble average. in practice, this means that statistical sampling can be performed at one instant across a group of identical processes, or sampled over time on a single process, with no change in the measured result. the quantum objects defined in (18) fulfill the ergodic theorem, because the time average of a series along the time trajectories exists almost everywhere and is related to the space (set of realizations) average:

$e_\psi = \int_\Omega \psi_{\omega,t}\; \mathrm{d}\mu(\omega) = \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} \psi_{\omega,t_i},$   (19)

where $e_\psi$ is the mean value of the stochastic process and $\mu(\omega)$ is a probability measure. the first part of (19) is the $\omega$-ensemble average, which does not depend on time $t$, so the process is stationary. the second part of (19) is the time average of the quantum objects $\psi_{\omega,t_1}, \dots, \psi_{\omega,t_n}$ with respect to a selected process realization $\omega$.

if the complex parameters of the "bra-ket" model are time dependent, we speak of a non-ergodic probabilistic binary process. since non-ergodic processes are very difficult to model, we ease our requirements to a special class of quasi-non-ergodic processes. quasi-non-ergodic processes are characterized by linear time invariant (lti) evolution of the complex wave functions $\psi(a_i, t)$ assigned to the states $|a_i\rangle$ in time interval $t$ and position $\alpha$. if we add a time-varying phase parameter into the binary process, we can rewrite the "bra-ket" objects (18) as follows:

$|\psi_{\alpha,t}\rangle = \sqrt{1-a}\;|0_\alpha\rangle + \sqrt{a}\; e^{j\varphi} e^{j\Omega t}\,|1_\alpha\rangle = \sqrt{1-a}\;|0_\alpha\rangle + \sqrt{a}\; e^{j(\varphi + \Omega t)}\,|1_\alpha\rangle,$   (20)

where parameter $a$ defines the probability of occurrence of state $|1_\alpha\rangle$.
parameter $\varphi$ expresses the initial structure of the studied set of $\alpha$-positioned states, and parameter $\Omega$ represents the frequency of continual structuring and randomizing of the $\alpha$-positioned states. due to the time evolution of the complex parameters (20), the ergodic condition is not fulfilled. the parameters for state $|0_\alpha\rangle$ can easily be computed from the parameters assigned to state $|1_\alpha\rangle$, because $a$ can be at most equal to one to be a probability function, and the phase assigned to state $|0_\alpha\rangle$ is assumed to be the normalized reference phase introduced in theorem 1 and equal to zero.

the frequency $\Omega$ can be interpreted as the energy spent to "structure" or "randomize" the set of $\alpha$-positioned states with respect to the chosen frequency $\Omega$. we use the notation $a_\Omega$, $\varphi_\Omega$ for the modulus and the initial phase parameters assigned to frequency $\Omega$. they represent the frequency decomposition in the same way as the fourier transform. equation (20) can be used as one frequency component (modulus and initial phase) of a non-ergodic binary quantum object. the general periodic non-ergodic behavior can be expressed as the sum of different frequency components:

$|\psi_{\alpha,t}\rangle = \psi(0,t)\,|0_\alpha\rangle + \psi(1,t)\,|1_\alpha\rangle = \sqrt{1 - \left|\sum_{i=1}^{n}\sqrt{a_i}\; e^{j(\varphi_i + \Omega_i t)}\right|^2}\;|0_\alpha\rangle + \left(\sum_{i=1}^{n}\sqrt{a_i}\; e^{j(\varphi_i + \Omega_i t)}\right)|1_\alpha\rangle,$   (21)

where the final modulus and phase assigned to state $|1_\alpha\rangle$ are

$\sum_{i=1}^{n}\sqrt{a_i}\; e^{j(\varphi_i + \Omega_i t)} = \sqrt{\tilde a(t)}\; e^{j\tilde\varphi(t)},$   (22)

where $\tilde a(t)$ is the time evolution of the probability of state $|1_\alpha\rangle$, and $\tilde\varphi(t)$ is the evolution of the link among the differently $\alpha$-positioned quantum objects (expressing structuring and randomizing). the complex parameter assigned to $|0_\alpha\rangle$ is computed from the normalization and reference condition:

$|\psi_{\alpha,t}\rangle = \sqrt{1 - \tilde a(t)}\;|0_\alpha\rangle + \sqrt{\tilde a(t)}\; e^{j\tilde\varphi(t)}\,|1_\alpha\rangle.$   (23)

equation (23) can be understood as the "bra-ket" representation of a general non-ergodic binary quantum object. we can see in (21) that for each state $|0_\alpha\rangle$, $|1_\alpha\rangle$ a discrete modulus and phase spectrum can be defined. this means that the time evolution is modeled by a periodic function. the discrete spectrum can be replaced by a continuous spectrum; in this case, the sums in (21) are replaced by integrals. this replacement means the transition from a fourier series to the fourier transform.

6 illustrative example – binary time series

let us take a two-valued $\{a_0 = 0, a_1 = 1\}$ time series represented by two complex wave functions $\psi(a_0)$ and $\psi(a_1)$. in the following we use for simplicity the notation $\psi(a_0) = \psi_0$ and $\psi(a_1) = \psi_1$.

a. complementarity principle

we demonstrate the complementarity principle on a binary time series. the discrete fourier transform (dft) is defined for $k, i \in \{0, 1, \dots, n-1\}$:

$x_k = \sum_{i=0}^{n-1} x_i\; e^{-j\frac{2\pi}{n}ik}, \qquad x_i = \frac{1}{n}\sum_{k=0}^{n-1} x_k\; e^{j\frac{2\pi}{n}ik}.$

the probability in the x-representation is given by (4):

$p(a_0) = \psi_0 \psi_0^{*}, \qquad p(a_1) = \psi_1 \psi_1^{*}.$   (24)

first we compute the dft of the wave functions $\psi_0$, $\psi_1$ (here $n = 2$):

$\tilde\psi(k=0) = \psi_0 + \psi_1, \qquad \tilde\psi(k=1) = \psi_0 + \psi_1 e^{-j\pi} = \psi_0 - \psi_1,$   (25)

where $\tilde\psi(\cdot)$ are the dft-transformed functions. the probability function in the k-representation can be given [10]:

$\tilde p(k=0) = \tilde\psi(k=0)\,\tilde\psi^{*}(k=0) = |\psi_0|^2 + |\psi_1|^2 + 2|\psi_0||\psi_1|\cos(\Delta\varphi),$   (26)

$\tilde p(k=1) = \tilde\psi(k=1)\,\tilde\psi^{*}(k=1) = |\psi_0|^2 + |\psi_1|^2 - 2|\psi_0||\psi_1|\cos(\Delta\varphi),$   (27)

where $\Delta\varphi$ means the phase difference between the complex numbers $\psi_0$ and $\psi_1$.
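a minimal numerical sketch of the complementarity principle (added for illustration; the normalization of (26)–(27) follows the reconstruction above, and the equal moduli $|\psi_0|^2 = |\psi_1|^2 = 1/2$ are an assumption):

#include <math.h>
#include <stdio.h>

int main(void)
{
    double p0 = 0.5, p1 = 0.5;             /* assumed |psi0|^2, |psi1|^2 */
    double dphi[3] = { 0.0, M_PI / 2.0, M_PI };

    for (int i = 0; i < 3; i++) {
        double cross = 2.0 * sqrt(p0 * p1) * cos(dphi[i]);
        double pk0 = p0 + p1 + cross;      /* (26), weight of k = 0 */
        double pk1 = p0 + p1 - cross;      /* (27), weight of k = 1 */
        printf("dphi = %.4f : p(k=0) = %.3f, p(k=1) = %.3f\n",
               dphi[i], pk0, pk1);
    }
    return 0;
}

the run shows how the whole k-spectrum is governed by the single phase difference, which carries the structural information discussed next.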
the inverse dft of the probabilities (26) and (27) yields a convolution in the x-representation and describes the links between two successive quantum objects. if the phase parameter is zero, $\Delta\varphi = 0$, the values are strongly independent and the time series is fully random. the phase $\Delta\varphi = \pi/2$ means that the probability of changes {0, 1} or {1, 0} in the time series is very low; such a binary time series looks like $\{0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1\}$. on the other hand, if the phase parameter is $\Delta\varphi = \pi$, the probability of finding a pair {1, 1} or {0, 0} tends to zero; the corresponding time series looks like $\{0, 1, 0, 1, \dots, 0, 1, 0, 1\}$ and appears fully deterministic.

b. entanglement principle

let us take two quantum objects of a time series with wave functions $\psi_0$ and $\psi_1$ defined in (13) and (18). let us define the probability

$p\big((a_0)_\alpha \cup (a_1)_{\alpha+1}\big) = |\psi_0|^2 + |\psi_1|^2 + 2|\psi_0||\psi_1|\cos(\Delta\varphi),$   (28)

where $\Delta\varphi$ is the phase difference between the wave functions $\psi_0$ and $\psi_1$. suppose now, with respect to theorem 3, that

$p\big((a_0)_\alpha \cup (a_1)_{\alpha+1}\big) = 0.$   (29)

this case can occur for the following value of $\Delta\varphi$:

$\Delta\varphi = \arccos\left(-\frac{|\psi_0|^2 + |\psi_1|^2}{2|\psi_0||\psi_1|}\right).$   (30)

if, for example, $|\psi_0|^2 = |\psi_1|^2 = \frac{1}{2}$, then $\Delta\varphi = \pi$ represents the entanglement. as a result of the entanglement we can write that the following event will surely happen (there are no random values):

$p\big((a_1)_\alpha \cap (a_0)_{\alpha+1}\big) = 1.$   (31)

we can also start with the following probability instead of (29):

$p\big((a_1)_\alpha \cup (a_0)_{\alpha+1}\big) = 0.$   (32)

then the entanglement yields

$p\big((a_0)_\alpha \cap (a_1)_{\alpha+1}\big) = 1.$   (33)

equations (32) and (33) can both be written in the quantum "bra-ket" representation:

$|\tilde\psi\rangle = \frac{1}{\sqrt 2}\big(|01\rangle + |10\rangle\big).$   (34)

measuring the first quantum object of the representation $|\tilde\psi\rangle$ (the probability of measuring event 0 is 1/2 and the probability of measuring 1 is also 1/2) fully determines the value that will be measured on the second object. equation (34) is the well-known bell state, which is used in many applications, e.g. in quantum teleportation, quantum cryptography, etc.

c. wave function estimation

let us define the time series {0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0} with the estimated occurrence rates

$\hat p(a_\alpha = 1) = \frac{7}{15}, \qquad \hat p(a_\alpha = 0) = \frac{8}{15},$   (35)

$\hat p\big((a_\alpha = 0) \cap (a_{\alpha+1} = 1)\big) = \frac{3}{14}.$   (36)

by using the inclusion-exclusion rule we can apply the estimated probabilities and write the following equation:

$\hat p\big((a_\alpha = 0) \cup (a_{\alpha+1} = 1)\big) = \hat p(a = 0) + \hat p(a = 1) + 2\sqrt{\hat p(a = 0)}\sqrt{\hat p(a = 1)}\cos(\Delta\hat\varphi) = \hat p(a = 0) + \hat p(a = 1) - \hat p\big((a_\alpha = 0) \cap (a_{\alpha+1} = 1)\big),$   (37)

from which the estimated phase parameter can be computed:

$\Delta\hat\varphi = \arccos\left(-\frac{\hat p\big((a_\alpha = 0) \cap (a_{\alpha+1} = 1)\big)}{2\sqrt{\hat p(a = 0)}\sqrt{\hat p(a = 1)}}\right) \doteq 1.7874.$   (38)

if we use the estimated angle, we can compute the inclusion-exclusion probability as

$\hat p\big((a_\alpha = 0) \cup (a_{\alpha+1} = 1)\big) = \hat p(a = 0) + \hat p(a = 1) + 2\sqrt{\hat p(a = 0)}\sqrt{\hat p(a = 1)}\cos(\Delta\hat\varphi) = \frac{11}{14},$   (39)

where the estimate (39) corresponds to the occurrence rate directly estimated from the time series:

$\hat p\big((a_\alpha = 0) \cup (a_{\alpha+1} = 1)\big) = \frac{11}{14}.$   (40)

we can see the compliance between (39) and (40).
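the estimation in this example can be checked mechanically; the sketch below (added by the editor) counts the occurrence rates (35)–(36) from the sample series, computes $\Delta\hat\varphi$ from (38), and re-evaluates the union probability (39).

#include <math.h>
#include <stdio.h>

int main(void)
{
    int x[15] = { 0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0 };
    int n = 15, ones = 0, pairs01 = 0;

    for (int i = 0; i < n; i++)     ones    += x[i];
    for (int i = 0; i < n - 1; i++) pairs01 += (x[i] == 0 && x[i + 1] == 1);

    double p1  = (double)ones / n;           /* (35): 7/15 */
    double p0  = 1.0 - p1;                   /* (35): 8/15 */
    double p01 = (double)pairs01 / (n - 1);  /* (36): 3/14 */

    double dphi = acos(-p01 / (2.0 * sqrt(p0 * p1)));        /* (38) */
    double p_un = p0 + p1 + 2.0 * sqrt(p0 * p1) * cos(dphi); /* (39) */

    printf("dphi = %.4f rad, p(union) = %.4f (11/14 = %.4f)\n",
           dphi, p_un, 11.0 / 14.0);
    return 0;
}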
7 conclusion

wave probabilistic models have been introduced, and a mathematical comparison between commonly used probabilistic models and wave probabilistic models has been presented. the mathematical theory points to the applicability of wave probabilistic models and their special features. quantum entanglement is explained as a consequence of the phase parameters, and it can be interpreted as a resonance principle of wave functions. the results of wave function resonance are fully deterministic, spatially distributed states with many properties. the complementarity principle presents the studied time series in both the x- and k-representations, where the x-representation provides us with the probabilities of occurrence of different events, and the k-representation carries information about the links between events and about how the time series is structured.

the general estimation algorithm for the phase parameters of wave probabilities was introduced and shown on an illustrative example – a binary time series. this methodology yields new models that take into account the structure of a time series. the application of the methodology presented here to non-ergodic time series modelling is also described and shown on a binary time series. this opens new ways for modelling non-ergodic or quasi-non-ergodic processes of this kind. the inspiration for the problem defined here came from quantum physics [1, 4, 6, 7]. the analogy with quantum mechanics seems very interesting, and it will inspire future work in the statistical modelling area and in wave probabilistic models.

8 acknowledgments

this project was supported by mšmt cost 356, oc194 and av čr, iaa201240701.

references

[1] gold, j. f.: knocking on the devil's door – a naive introduction to quantum mechanics. tachyon publishing company, 1995.
[2] raymer, m. g.: measuring the quantum mechanical wave function. contemporary physics, vol. 38 (1997), no. 5, p. 343–355.
[3] radon, j.: über die bestimmung von funktionen durch ihre integralwerte längs gewisser mannigfaltigkeiten. berichte sächsische akademie der wissenschaften, leipzig, mathematisch-physikalische klasse, vol. 69 (1917), p. 262–277.
[4] bogdanov, yu. i.: the sixth hilbert's problem and the principles of quantum informatics. institute of physics and technology, russian academy of sciences. http://arxiv.org/ftp/quant-ph/papers/0612/0612025.pdf
[5] clifton, r.: complementarity between position and momentum as a consequence of kochen-specker arguments. http://eprintweb.org/s/article/quant-ph/9912108
[6] svítek, m.: dynamical systems with reduced dimensionality. monograph nnw no. 6, czech academy of science, 2006, 161 p. isbn 80-903298-6-1.
[7] svítek, m.: probabilistic multi-modeling by quantum calculus. in ictta'06, ieee conference, damascus, syria, 2006.
[8] deutsch, d.: quantum theory, the church-turing principle and the universal quantum computer. proceedings of the royal society of london, series a400, 1985, p. 97–117.
[9] lawden, d. f.: the mathematical principles of quantum mechanics. dover publications, inc., mineola, new york, 1995. isbn 0-486-44223-3.
[10] svítek, m.: complementary variables and its application in statistics. neural network world, 2007, no. 3, prague, p. 237–253.
[11] svítek, m., novovičová, j.: performance parameters definition and processing. neural network world, 2005, no. 6, p. 567–577.
[12] svítek, m.: theory and algorithm for time series compression. neural network world, 2005, no. 1, p. 53–67.
[13] clarke, c. j. s.: the role of quantum physics in the theory of subjective consciousness. mind and matter, vol. 5 (2007), no. 1, p. 45–81.
[14] vedral, v.: introduction to quantum information science. oxford university press, 2006.
[15] svítek, m.: wave probabilistic models. neural network world, 2007, no. 5, p. 469–481.
[16] svítek, m.: quantum system modelling. international journal on general systems, vol. 37 (2008), no. 5, p. 603–626.

prof. dr. ing. miroslav svítek
phone: +420 224 359 631, fax: +420 224 359 545
e-mail: svitek@fd.cvut.cz
department of control and telematics
czech technical university in prague, faculty of transportation sciences
konviktská 20, 110 00 prague 1, czech republic

single-phase pulse width modulated rectifier

j. bauer

abstract: the pwm rectifier is a very popular topic nowadays. with the expansion of electronics, conversion of electric parameters is also needed. for this purpose the side effects of passive rectifiers, e.g. the production of harmonics and reactive power, must be taken into account. all these side effects fall away with the application of pwm rectifiers. this paper compares the differences between a phase-angle controlled rectifier and a pwm rectifier.

keywords: rectifier, pulse width modulation, harmonics analysis, active front-end.

1 introduction

at the end of the 19th century, electric energy came into use in many technical fields, and from the beginning ways were sought to change its parameters, such as voltage, frequency and current. converters of electric parameters can be divided into two main groups. the first group uses faraday's law of induction, e.g. the ward-leonard drive. the second group includes converters that work on the controlled switching principle, i.e. semiconducting rectifiers, inverters, etc.

2 phase controlled rectifiers

electric energy conversion by semiconducting converters is being used more and more. this has led to the growth of negative phenomena that appeared negligible when only a few converters were in use. the development of semiconductor structures has enabled higher power to be transmitted, and has also led to the widespread use of converters. in this way, converters have a negative effect on the supply network. the regressive effects of overloading the network with harmonics and of reactive power consumption are becoming the major disadvantages of phase controlled (mostly thyristor) rectifiers. these side effects need to be compensated by additional filtering circuits with capacitors or inductances. however, such circuits raise the costs and also increase the material and space requirements of the converter.

phase control and commutation of the semiconducting devices affect the phase displacement between the first harmonic of the consumed current and the first harmonic of the supply voltage. this displacement leads to power factor degradation and to reactive power consumption. the consumed current harmonics cause non-sinusoidal voltage drops on the supply network impedances and lead to supply voltage deformation. this may cause malfunctions of other devices that are sensitive to the sinusoidal shape of the supply voltage (e.g. measurement apparatuses, communication and control systems). the reactive power rises with longer control angle delays, so the rectifier acts as a time-variable impedance that is nonlinear and causes deformed current consumption.

3 pwm rectifiers

in order to suppress these negative phenomena caused by power rectifiers, rectifiers with a more sophisticated control algorithm are used. such rectifiers are realized with semiconductor devices that can be switched off – igbt transistors. the rectifier is controlled by pulse width modulation. a rectifier controlled in this way consumes a current of the required shape, which is mostly sinusoidal. it works with a given phase displacement between the consumed current and the supply voltage. the power factor can also be controlled, and there are minimal effects on the supply network. pwm rectifiers can be divided into two groups according to the power circuit connection – the current type and the voltage type.
for the proper function of a current type rectifier, the maximum value of the supply voltage must be higher than the value of the rectified voltage. the main advantage is that the rectified voltage is regulated from zero. current type rectifiers are suitable for work with dc loads (dc motors, current inverters). for proper function, voltage type rectifiers require a higher voltage on the dc side than the maximum value of the supply voltage. the rectified voltage on the output is smoother than the output voltage of the current type rectifier. they also require a more powerful microprocessor for their control. an output voltage lower than the voltage on the input side can be obtained only with increased reactive power consumption.

the function of the rectifier depends on the type of supply network. there are two types of supply network – "hard" and "soft". ordinary rectifiers, which work on a relatively "hard" supply network, do not affect the shape of the supply voltage waveform; the harmonics produce electromagnetic distortion, and the network will be loaded with reactive power. the pwm rectifier aims to consume sinusoidal current and to work with a given power factor. a pwm rectifier connected to a "soft" supply network has more potential to affect the shape of the supply voltage. it can be controlled so that the current consumed by the pwm rectifier partly compensates the non-harmonic consumption of other devices connected to the supply network.

4 control of the pwm rectifier

the basic block diagram of a one-phase pwm rectifier is shown in fig. 1. the rectifier consists of 4 igbt transistors, which form a full bridge, the input inductance, and the capacitor at the output. it is controlled by pulse width modulation. the supply voltage $u_s$ and the voltage at the rectifier input $u_r$ are sinusoidal waveforms separated by the input inductance. the energy flow therefore depends on the angle between these two phasors; see the phasor diagram in fig. 2a. the power transferred from the supply to the input terminals of the rectifier is

$p = \frac{u_s u_r}{x_s}\sin\delta$   (1)

$\phantom{p} = u_s i_s \cos\varphi,$   (2)

where $u_s$ is the rms value of the input supply voltage (v), $u_r$ is the rms value of the first harmonic consumed by the ac rectifier input (v), $\delta$ is the phase displacement between the phasors $u_s$ and $u_r$, $x_s$ is the input inductor reactance at 50 hz ($\Omega$), and $\cos\varphi$ is the power factor.

fig. 1: block diagram of the power section

in order to make the rectified voltage constant, the input and output powers must be balanced. then, as the phasor diagram in fig. 2a shows:

$i_s\cos\varphi = \frac{u_r\sin\delta}{x_s},$   (3)

$i_s\sin\varphi = \frac{u_s - u_r\cos\delta}{x_s}.$   (4)

as long as the reactive power consumed is equal to zero, the power factor is equal to unity. therefore (3) and (4) can be adapted to:

$i_s x_s = u_r\sin\delta,$   (5)

$u_s = u_r\cos\delta.$   (6)

fig. 2: phasor diagrams

phasor diagrams of the rectifier working both as a rectifier and as an inverter are shown in figs. 2b and 2c.
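as a quick numerical illustration of relations (5) and (6) – an added sketch with assumed component values, not measurement data from the paper – the following fragment computes the control angle δ and the required rectifier input voltage u_r for unity power factor operation.

#include <math.h>
#include <stdio.h>

int main(void)
{
    /* assumed operating point (illustrative values only) */
    double us = 230.0;                   /* rms supply voltage [V]         */
    double is = 10.0;                    /* desired rms supply current [A] */
    double ls = 0.010;                   /* input inductance [H]           */
    double xs = 2.0 * M_PI * 50.0 * ls;  /* reactance at 50 Hz [ohm]       */

    /* dividing (5) by (6) gives tan(delta) = is*xs/us */
    double delta = atan(is * xs / us);
    double ur    = us / cos(delta);      /* from (6) */

    printf("delta = %.3f rad (%.2f deg), ur = %.1f V\n",
           delta, delta * 180.0 / M_PI, ur);
    return 0;
}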
the aim is to control the rectifier in such a way that it consumes a harmonic current from the supply network that is in phase with the supply voltage. this can be achieved by controlling the rectifier in one of a number of ways, e.g. by pulse width modulation. the voltage and current under this control are shown in fig. 3. one possible way of transistor switching is shown at the bottom of fig. 3. it can be seen that two states alternate: first, the current flows into the load (d1 and d2 conduct), and second, the input of the rectifier is short-circuited (d1 and t3 conduct). the grey areas mean that the transistor conducts; the white areas mean that the passive element conducts – the transistor is turned off and the current flows through the antiparallel diode. the switching of the devices must be precisely synchronized with the supply voltage.

fig. 3: voltage and current waveforms in a pwm rectifier

the output voltage of the rectifier $u_d$ is usually controlled to a constant value by another converter, e.g. an inverter. it is therefore possible, at a given current $i_d$ at the converter output, to assign to the output voltage $u_d$ particular values of $z$ and $\delta$. if we consider $\cos\varphi = 1$, the power equality on the ac and dc sides can be written as

$i_d = \frac{z\, i_{s(1)m}}{2}\cos\varphi,$   (7)

$i_{s(1)m} = \frac{2\, u_{dav}\, i_{dav}}{u_{sm}}.$   (8)

from the phasor diagram in fig. 2b,

$\delta = \arctan\frac{\omega l_s\, i_{s(1)m}}{u_{sm}},$   (9)

$u_{g(1)m} = \frac{u_{sm}}{\cos\delta}.$   (10)

the rectifier controller can be designed on the basis of these equations. it is obvious that only two variables are needed for such rectifier control. firstly, the displacement angle $\delta$ must be controlled, and the other necessary variable is $z$,

$z = \frac{u_{ref}}{u_{pil}},$   (11)

which is defined as the ratio of the modulation signal amplitude $u_{ref}$ to the saw-tooth voltage amplitude $u_{pil}$. during displacement angle $\delta$ control it is necessary to consider that if $\delta$ increases, the current through the inductance will also increase. for proper rectifier function it is also necessary to synchronize the control algorithm with the supply voltage. by detecting the zero crossing, the controller controls the displacement angle $\delta$. the synchronization circuit must be very accurate and also fast and reliable.
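the two control variables follow directly from (8)–(11); the sketch below is an added illustration with assumed input values, and taking the required rectifier fundamental $u_{g(1)m}$ as the modulation reference amplitude, as well as the saw-tooth amplitude value itself, are assumptions of this sketch.

#include <math.h>
#include <stdio.h>

int main(void)
{
    /* assumed dc-side operating point and input inductor */
    double udav = 400.0, idav = 5.0;  /* average dc voltage [V] and current [A] */
    double usm  = 325.0;              /* supply voltage amplitude [V]           */
    double ls   = 0.010;              /* input inductance [H]                   */
    double w    = 2.0 * M_PI * 50.0;  /* angular line frequency [rad/s]         */
    double upil = 400.0;              /* assumed saw-tooth amplitude [V]        */

    double is1m  = 2.0 * udav * idav / usm;    /* (8)  */
    double delta = atan(w * ls * is1m / usm);  /* (9)  */
    double ug1m  = usm / cos(delta);           /* (10) */
    double z     = ug1m / upil;                /* modulation depth, cf. (11) */

    printf("is1m = %.2f A, delta = %.3f rad, ug1m = %.1f V, z = %.3f\n",
           is1m, delta, ug1m, z);
    return 0;
}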
5 measurement results

a functional sample of the pwm rectifier was built. the control algorithm was realized on the basis mentioned above. four igbt transistors act as the power part, and a motorola 56f805 controller microprocessor was chosen. a simplified block diagram is shown in fig. 4. an analog synchronization circuit was chosen, which later proved to be a bad idea: zero cross detection did not function properly, because the transistor switching disturbed the synchronization output. the waveforms obtained from the measurements are shown in figs. 5–8. fig. 5 shows typical diode rectifier waveforms. it can be seen that the current $i_s$ consumed from the supply is non-sinusoidal. a harmonics analysis of this current is shown in fig. 6; the amplitudes of the third, fifth, etc., harmonics are high. fig. 7 shows the waveforms taken with pwm control of the rectifier. the current $i_s$ is nearly sinusoidal and in phase with the supply voltage.

fig. 4: simplified block diagram of the pwm rectifier
fig. 5: waveforms of the diode rectifier
fig. 6: harmonics analysis of the diode rectifier input current
fig. 7: waveforms of the pwm rectifier

a harmonics analysis of the rectifier controlled by pulse width modulation is shown in fig. 8. the difference between these currents can be clearly seen. the non-sinusoidal shape of the consumed current can be expressed by the harmonic coefficients

$\mu = \frac{i_{(1)}}{i},$   (14)

$k_h = \frac{\sqrt{\sum_{\nu=2}^{\infty} i_{(\nu)}^2}}{i_{(1)}},$   (15)

$k_z = \frac{\sqrt{\sum_{\nu=2}^{\infty} i_{(\nu)}^2}}{i},$   (16)

where $i$ is the total rms value of the consumed current and $i_{(\nu)}$ is the rms value of its $\nu$-th harmonic. the coefficients calculated for the measurements are given in table 1.

fig. 8: harmonics analysis of the pwm rectifier input current

table 1: harmonic coefficients

coefficient | diode rectifier | pwm rectifier
μ           | 93 %            | 100 %
k_h         | 40 %            | 2 %
k_z         | 37 %            | 2 %
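the coefficients (14)–(16) are straightforward to evaluate from a measured harmonic spectrum; the following sketch (added for illustration, with made-up harmonic amplitudes) shows the computation.

#include <math.h>
#include <stdio.h>

int main(void)
{
    /* assumed rms harmonic content of a consumed current [A]:
       ih[0] is the fundamental i_(1), the rest are higher harmonics */
    double ih[4] = { 10.0, 3.0, 2.0, 1.0 };

    double sum2_all = 0.0, sum2_hi = 0.0;
    for (int n = 0; n < 4; n++) {
        sum2_all += ih[n] * ih[n];
        if (n > 0) sum2_hi += ih[n] * ih[n];
    }
    double i_tot = sqrt(sum2_all);      /* total rms current       */
    double mu = ih[0] / i_tot;          /* (14) fundamental factor */
    double kh = sqrt(sum2_hi) / ih[0];  /* (15) harmonic content   */
    double kz = sqrt(sum2_hi) / i_tot;  /* (16) distortion factor  */

    printf("mu = %.3f, kh = %.3f, kz = %.3f\n", mu, kh, kz);
    return 0;
}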
conclusion

the use of pwm control in rectifiers eliminates the problems caused by phase controlled rectifiers. the pwm rectifier can perform well in many applications, for example as an active filter, or as an input rectifier for an indirect frequency converter. this application is useful mainly in traction, where the ac voltage from the trolley wire is first rectified, and the traction inverters and also other auxiliary converters are fed from the output of the rectifier. a traction vehicle equipped with a pwm rectifier does not consume reactive power, does not load the supply network with harmonics, and can recuperate. another possible application of the converter is as an active filter; an active front-end will have the capacitor at the output.

acknowledgments

the research described in the paper was supervised by doc. j. lettl csc., fee ctu in prague.

ing. jan bauer
e-mail: bauerj1@fel.cvut.cz
department of electric drives and traction
czech technical university in prague, faculty of electrical engineering
technická 2, 166 27 praha, czech republic

lab-scale technology for biogas production from lignocellulose wastes

lukáš krátký, tomáš jirout, jiří nalezenec

ctu in prague, faculty of mechanical engineering, department of process engineering, technická 4, 166 07 prague 6, czech republic
correspondence to: lukas.kratky@fs.cvut.cz

abstract

currently-operating biogas plants are based on the treatment of lignocellulose biomass, which is contained in materials such as agricultural and forestry wastes, municipal solid wastes, waste paper, wood and herbaceous energy crops. a lab-scale biogas technology was specially developed for evaluating the anaerobic biodegradability and the specific methane yields of solid organic substrates. this technology falls into two main categories – pretreatment equipment and fermentation equipment. the pretreatment units use physical principles based on mechanical comminution (ball mill, macerator) or hydrothermal treatment (liquid hot water pretreatment technology). the biochemical methane potential test is used to evaluate the specific methane yields of treated or non-treated organic substrates. this test can be performed both in lab testing units and in a lab fermenter.

keywords: ball mill, batch test, fermenter, liquid hot water pretreatment, macerator.

1 introduction

the application of anaerobic digestion technology is growing rapidly around the world at the present time, because of the environmental and economic benefits that it offers. to implement this technology, it is necessary to determine the biogas potential of various types of wastes, which can be found in materials such as agricultural and forestry wastes, municipal solid wastes, waste paper, wood and herbaceous energy crops. these lignocellulose substrates are generally composed of cellulose, hemicellulose, lignin and a wide variety of organic and inorganic compounds. the cellulose and hemicellulose fractions are converted to monosaccharides, which can subsequently be fermented to biogas. however, their inherent properties, due to the composite structure, make them resistant to enzymatic attack. these substrates biodegrade very slowly, and the extent of biodegradation does not exceed 20 % [1]. pretreatment of biomass is therefore an essential step in order to increase biomass digestibility. the pilot-plant biogas technology presented in figure 1 was developed in order to evaluate the influence of various pretreatments on biodegradability and on biogas production. the first part of the lab technology, the pretreatment section, enables size reduction machines and hydrothermal pretreatment to be used. afterwards, the treated material can be anaerobically digested, either in a testing unit or in a fermenter, where the quantity and quality of the biogas are evaluated in relation to the energy requirement of the pretreatment that it has undergone.

figure 1: lab-technology for biogas yield evaluation

2 biomass size reduction machines

comminution technology is an essential part of any biogas plant.
the goal of this pretreatment is to disrupt the inherent structure of the lignocelluloses, and to decrease the particle size and the degree of cellulose crystallinity. in general, the recommended final particle size for effective digestibility is 1–2 mm. size reduction increases the total hydrolysis yield by 5–25 %, and also reduces the digestion time by 23–59 %. however, comminution is a very expensive operation that consumes about 33 % of the total electricity demand [2]. reducing the energy requirement and finding the right solution for biomass disintegration would clearly improve the economics of the whole process.

figure 2: size reduction machines in the lab-scale technology: a) macerator, b) ball mill

knives, hammers, roll mills and colloid mills are widely used for disintegrating biomass. a prototype of a new type of mill, called a macerator, is being tested, see figure 2a. the idea of this machine is to combine the most efficient milling principles of knife and roll mills. the macerator consists of a single horizontal roll and a drum sieve, both with sharp-edged segments. the roll-sieve gap is easily adjustable by screws. in principle, the lignocellulose biomass is fed into the gap between the roll and the sieve, where it is soaked in hot water. due to the high shear and cutting forces, the biomass is disrupted and reduced in size, and it passes through the holes in the sieve into the storage vessel. the variable parameters are the amount of biomass, the gap size, the rotational speed of the roll, and the flow rate and temperature of the hot water. the effectiveness of the macerator is evaluated by the energy demand for comminution, by the structure, and by biogas tests of the treated material.

the second type of size reduction machine used in the lab-scale technology is a ball mill, see figure 2b. this universal equipment has been found to be a very efficient machine for disintegrating lignocelluloses, and it can be used both for wet milling and for dry milling. the shear and compressive forces disrupt the lignocellulose matrix, and reduce the size and the crystallinity of the cellulose. however, ball milling has also been found to be a time-consuming operation with a very high energy demand for comminution [2]. the effectiveness of the ball mill is determined by the same parameters as for the macerator, see above.

3 lhw pretreatment technology

liquid hot water pretreatment (lhw) is a hydrothermal pretreatment that came into use several decades ago, e.g. in the pulp industry. in this treatment, lignocellulose biomass is heated in water maintained in the liquid state by pressure. above a temperature of 160 °c the lignocellulose matrix begins to be soluble, which means that the nutrients become more accessible to enzymatic attack. the major advantages of lhw are that no chemicals are added and no inhibitors are formed. the reactor is inexpensive to construct because there is low corrosion potential. the effectiveness of lhw depends on the composition and the ph of the substrate, on the processing temperature, and on the residence time [3,4]. the lab lhw pretreatment technology is used in batch mode. it is composed of three main parts – the hydrolyser, the expansion vessel and a ball valve equipped with a pneumatic actuator, see figure 3. the hydrolyser is a double-jacketed pressure vessel which can treat biomass up to 8 liters in volume at a maximum processing temperature of 200 °c and a pressure of 1.6 mpa.
the substrate is indirectly heated by oil circulating in the double jacket; an electric spiral with a power of 12 kw is used for heating the oil. the expansion vessel is an apparatus at atmospheric pressure, used for storing the expanded substrate. this vessel is also equipped with cooling, to ensure faster vapor condensation after expansion. the third main part, the ball valve, separates the pressure space in the hydrolyser from the atmospheric space in the expansion vessel.

figure 3: the lhw technology in the lab-scale application: a – equipment, b – apparatus for hydrolysis (1 – hydrolyser, 2 – expansion vessel, 3 – ball valve)

biomass processing by batch lhw is based on the following principle. the hydrolyser is filled with a suspension containing lignocelluloses and hot water. the substrate is heated and then, when the processing temperature is reached, it is kept constant for the processing time. then the ball valve is rapidly opened, and the substrate immediately expands into the expansion vessel. two products are formed during expansion, i.e. vapor and hydrolyzate. after vapor condensation, the expanded material is removed from the expansion vessel and anaerobically digested, primarily in the test units. the effectiveness of lhw is evaluated mainly by the ph, the glucose yield, the structure changes and by biogas tests [5].

the lhw technology was used for treating various types of biomass, e.g. wheat straw, silage, office paper and boxboard. in the first experiments, a suspension containing 5 % by mass of non-disintegrated wheat straw was treated. this material was initially incubated at a temperature of 60 °c to achieve good straw maceration. then the substrate was filled into the hydrolyser. the initial ph value was 7.14 ± 0.05 and the glucose yield was 0.14 ± 0.02 g·l⁻¹. figure 4 plots the dependence of the final glucose yield and of the ph on the thermal conditions 170–200 °c and processing times 0–60 min. the glucose yield (figure 4a) rises with increasing temperature and time; on the other hand, the ph (figure 4b) falls with increasing temperature and time. generally, lhw pretreatment causes liquid water under pressure to penetrate into the pores in the biomass. because of the rapid expansion, the liquid water changes phase to vapor, and the associated volumetric change disrupts the substrate, and especially the cell walls. these effects not only increase the glucose yield and cause ph changes, but primarily increase the biodegradation rate of the biomass. the influence of lhw on the structure of wheat straw and boxboard is shown in figure 5.

figure 4: the dependence of a) glucose yield and b) ph on temperature and processing time – wheat straw

figure 5: influence of lhw on the structure of biomass (processing parameters 200 °c, 20 min): a – straw before, b – straw after, c – boxboard before, d – boxboard after

in general, substrate pretreated with lhw is more digestible, the amount of biogas is increased, the residence time in the fermenter decreases, and pumping and homogenization are facilitated [5].

4 lab-batch tests for an investigation of biogas yield

a test known as the biochemical methane potential (bmp) test is widely used for evaluating the anaerobic biodegradability of wastes.
generally, the biogas potential can be determined in batch mode or in continuous mode. however, batch systems are more widely used, because they are easier to set up, simpler, and easier to monitor and evaluate. these tests are based on the same principle, i.e. measuring the biogas/methane production: the basic approach is to incubate a waste with an anaerobic inoculum and measure the biogas/methane production [6]. the biochemical methane potential of untreated/treated waste is investigated in our technology with the use of two pieces of equipment: a lab testing unit and a lab fermenter.

figure 6: lab testing unit for investigating biogas yield: a – scheme of the unit, b – lab unit (1 – bottle with substrate, 2 – burette, 3 – balancing bottle)

first, the lab testing unit shown in figure 6 is used for primary testing of the bmp. the anaerobic digestion experiments are carried out in accordance with the standards vdi 4630 [7] and čsn en iso 11734 [8]. in detail, eight glass batch digesters with a capacity of 0.5 liters are used. these bottles are filled with a mixture of the tested waste and digested sludge from a running biogas plant: five digesters are used for the replications and the other three bottles for the reference samples. the digesters are incubated under mesophilic conditions at a constant temperature of 35 °c, which is maintained by a water bath. the gas measuring system is based on a simple volumetric method: the biogas produced inside moves to the external burette, where it displaces an equivalent volume of barrier solution from the balancing bottle, which provides constant pressure conditions. the amount of biogas is monitored daily, except at the beginning of the test, when the increase in volume is evaluated more frequently. the anaerobic digestion test is considered finished when the volumetric changes are lower than 1 % of the total biogas volume. the quality of the biogas is analyzed by absorbing the carbon dioxide co2 into potassium hydroxide koh. the evaluation and the characterisation of the tested material and the anaerobic sludge are based on the initial and final analyses of the ph, the chemical oxygen demand cod, the total solids content ts, and the total organic solids content ots. in addition, the waste in its initial state is also investigated for element composition (c, n, s, p, mg, k) or analyzed for its content of fats, carbohydrates, proteins and lignin.
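the termination rule of the batch test (daily volumetric change below 1 % of the cumulative biogas volume) can be written down compactly; the fragment below is an added sketch with made-up daily burette readings, not software used in the laboratory.

#include <stdio.h>

int main(void)
{
    /* assumed daily biogas increments [ml] read from the burette */
    double daily[10] = { 60, 85, 70, 40, 25, 12, 6, 3, 1.5, 0.8 };
    double total = 0.0;

    for (int day = 0; day < 10; day++) {
        total += daily[day];
        /* the test is finished when the daily change drops below 1 % of the total */
        if (daily[day] < 0.01 * total) {
            printf("test finished on day %d, total %.1f ml\n", day + 1, total);
            return 0;
        }
    }
    printf("criterion not reached, total %.1f ml\n", total);
    return 0;
}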
secondly, the lab fermenter, as presented in figure 7, is used for more detailed bmp tests and especially for scale-up verification. these tests are carried out according to the same processing parameters and rules as in the testing units. the lab fermenter enables bmp tests with a substrate volume of 20–35 liters. the tested substrate is also indirectly incubated under mesophilic conditions, usually at a temperature of 35 °c, by hot water circulating in the external double jacket of the fermenter. owing to the adjustable pressure difference, the biogas produced moves to the biogas analysis part, see figure 7b. first, the biogas is cooled to remove the humidity; then the quality and the flow rate of the biogas are analyzed in detail. the evaluation and the characterisation of the waste during the biodegradation process are provided by continual measurement of the process parameters in the liquid phase and in the gas phase. in the liquid phase, the total organic carbon toc, the total nitrogen tn, the ph, the chemical oxygen demand cod, the redox potential orp and the temperature are measured. in the gas phase, the pressure, the temperature, the humidity, the amount of biogas and the quality of the biogas are analyzed. the anaerobic process control and the monitoring results can be viewed on websites [9].

figure 7: lab fermenter: a – equipment, b – part for the biogas yield investigation (1 – cooling, 2 – gas analyser, 3 – flow meter)

5 conclusion

a lab pilot-plant technology was constructed in order to determine the biogas potential of various types of wastes. this technology makes use of several pretreatment methods aimed at enhancing the anaerobic biodegradability of the substrate and at increasing the biogas yield and the biogas quality. these methods are based on hydrothermal pretreatment or on mechanical comminution.

• the lab ball mill and the new prototype macerator are used for biomass comminution. the energy demands for achieving the required final particle size and structure are under investigation.
• liquid hot water (lhw) pretreatment is used to obtain more digestible biomass. the effectiveness of lhw pretreatment grows with increasing processing temperature and time. lhw pretreatment increases the amount of biogas and decreases the residence time in the fermenter. homogenization and pumping are facilitated.
• biochemical methane potential tests can be carried out either by a simple procedure in a testing unit or by a more laborious procedure in a lab fermenter. the testing units are used primarily for investigating biogas yields, and the lab fermenter is used for scale-up investigations.

acknowledgement

this work was carried out within the project "the development of environment-friendly decentralized power engineering", no. msm6840770035, supported by the ministry of education of the czech republic.

references

[1] pandey, a.: handbook of plant-based biofuels. crc press, new york, 2009, 297 p. isbn 978-1-56022-175-3.
[2] krátký, l., jirout, t.: biomass size reduction machines for enhancing biogas production. chemical engineering and technology, 2011, vol. 34, no. 3, p. 391–399.
[3] taherzadeh, m. j., karimi, k.: pretreatment of lignocellulosic wastes to improve ethanol and biogas production: a review. international journal of molecular sciences, 2008, vol. 9, p. 1621–1651.
[4] hendriks, a., zeeman, g.: pretreatments to enhance the digestibility of lignocellulosic biomass. bioresource technology, 2009, vol. 100, p. 10–18.
[5] krátký, l., jirout, t., dostál, m.: laboratorní zařízení pro termicko-expanzní hydrolýzu substrátů při výrobě biopaliv [laboratory equipment for thermal-expansion hydrolysis of substrates in biofuel production]. in 58th national congress of chemical and process engineering chisa 2011, 2011, 10 p. isbn 978-80-905035-0-2.
[6] raposo, f., de la rubia, m. a., fernández-cegrí, v., borja, r.: anaerobic digestion of solid organic substrates in batch mode: an overview relating to methane yields and experimental procedures. renewable and sustainable energy reviews, 2012, vol. 16, p. 861–877.
[7] vdi 4630: fermentation of organic materials – characterisation of the substrate, sampling, collection of material data, fermentation tests. ics 13.030.30, april 2006, verein deutscher ingenieure.
[8] čsn en iso 11734: water quality – evaluation of the "ultimate" anaerobic biodegradability of organic compounds in digested sludge – method by measurement of the biogas production. ics 07.100.20, october 1999, czech office for standards, metrology and testing.
[9] skočilas, j., dostál, m., petera, k., šulc, r.: měření a regulace provozních parametrů laboratorního fermentoru [measurement and control of the operating parameters of a laboratory fermenter]. in 56th national congress of chemical and process engineering chisa 2009, 2009, 16 p. isbn 978-80-86059-51-8.
phphmm tool for generating speech recogniser source codes using web technologies

r. krejčí

abstract: this paper deals with the "phphmm" software tool, which facilitates the development and optimisation of speech recognition algorithms. this tool is being developed in the speech processing group at the department of circuit theory, ctu in prague, and it is used to generate the source code of a speech recogniser by means of the php scripting language and the mysql database. the input of the system is a model of speech in the standard htk format and a list of words to be recognised. the output consists of the source codes and data structures in the c programming language, which are then compiled into an executable program. the tool is operated via a web interface.

keywords: speech recognition, dsp, php, mysql, omap, tms320c674x, arm.

1 introduction

an automatic speech recogniser is a computer program consisting of interconnected algorithms whose input is human speech converted from a microphone into digital form, and whose output is a text transcription of this speech. the construction of a speech recogniser consists of two main phases. in the first phase, so-called "training" is carried out, resulting in the creation and filling of data structures that describe a speech model. in the second phase of this process, decoding algorithms are developed that perform the speech recognition itself, using the speech models obtained in the training phase. since huge amounts of data are needed in order to create the speech recogniser, and huge amounts of data are elaborated, many activities are performed automatically using scripts. this facilitates the work and eliminates the need for repeated manual data processing. this is usually done using the htk toolkit [1], with which a complete speech recogniser for the pc platform can be created. however, when creating a speech recogniser to be run on various hardware platforms, e.g. digital signal processors, no such public tool is available, and thus proprietary software has to be programmed. in this case, the speech models trained using the htk toolkit can be utilised, but it is necessary to use totally different algorithms and optimisation methods for their treatment than those used on the pc platform. to test the optimisation methods, it is often necessary to change the data structures and convert their parameters. for this purpose, the speech processing group at the department of circuit theory, ctu in prague, has been developing the "phphmm" tool, which facilitates and integrates the development of speech recognition algorithms for alternative hardware platforms.

2 phphmm tool

the phphmm tool is a set of scripts in the php scripting language [2] using the mysql database server [3]. this technology has become one of the standards for generating web pages, but it is also useful for generating other texts, such as source code in any programming language. the basis of the phphmm tool is a class of functions that can easily be included in a superior system written in the php language. the scripts are run on a server (either on a local computer configured as a server or on a publicly accessible web server), and their output is visible via a user-friendly graphical web interface. the source code of the speech recogniser is created in a sequence of single steps, which will be discussed in the following text.

2.1 speech model

the result of the training phase of the speech recogniser using the htk toolkit is a text file in a defined format that describes a general model of speech, created on the basis of the utterances of a training database. the models of speech may have a huge number of different variations, e.g. the type of parametrisation (extraction of speech features), the number of hmm (hidden markov model) states, streams and mixtures, the number of coefficients in each mixture, etc. during recognition, these parameters enter the output probability density function $b(o)$ [4]:

$b_j(\bar o_t) = \prod_{s=1}^{S} \left[ \sum_{m=1}^{M_s} c_{jsm}\, \mathcal{N}(\bar o_{st};\, \bar\mu_{jsm},\, \Sigma_{jsm}) \right]^{\gamma_s};$

$\mathcal{N}(\bar o_{st};\, \bar\mu_{jsm},\, \Sigma_{jsm}) = \frac{1}{\sqrt{(2\pi)^{n_s}\,|\Sigma_{jsm}|}}\; e^{-\frac{1}{2}(\bar o_{st} - \bar\mu_{jsm})^{\mathrm T}\, \Sigma_{jsm}^{-1}\, (\bar o_{st} - \bar\mu_{jsm})},$   (1)

where $S$ is the count of streams, $\gamma_s$ is the stream weight, $M_s$ is the count of mixtures in a stream, $c_{jsm}$ is the weight of the $m$-th mixture, and $\mathcal{N}(\bar o;\, \bar\mu,\, \Sigma)$ is the multivariate gaussian distribution with a vector of mean values $\bar\mu$ and a covariance matrix $\Sigma$. this function represents the acoustic similarity of the input signal to the reference models of the speech units (phonemes).
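for illustration, a minimal sketch added by the editor: assuming a single stream, diagonal covariance matrices and toy dimensions (which is not necessarily how the generated phphmm code is organised), the output probability (1) of one hmm state can be evaluated as follows.

#include <math.h>
#include <stdio.h>

#define D 3 /* feature vector dimension (toy value) */
#define M 2 /* number of mixtures (toy value)       */

/* log of b_j(o) for one state: single stream, diagonal covariances */
double log_b(const double o[D], const double c[M],
             const double mu[M][D], const double var[M][D])
{
    double b = 0.0;
    for (int m = 0; m < M; m++) {
        double e = 0.0, logdet = 0.0;
        for (int d = 0; d < D; d++) {
            double diff = o[d] - mu[m][d];
            e      += diff * diff / var[m][d];
            logdet += log(var[m][d]);
        }
        /* gaussian density of mixture m, weighted by c[m] */
        b += c[m] * exp(-0.5 * (e + logdet + D * log(2.0 * M_PI)));
    }
    return log(b);
}

int main(void)
{
    double o[D]      = { 0.2, -0.1, 0.4 };   /* toy observation vector */
    double c[M]      = { 0.6, 0.4 };         /* mixture weights        */
    double mu[M][D]  = { { 0, 0, 0 }, { 1, 1, 1 } };
    double var[M][D] = { { 1, 1, 1 }, { 2, 2, 2 } };

    printf("log b(o) = %.4f\n", log_b(o, c, mu, var));
    return 0;
}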
2.1 speech model the result of the training phase of the speech recogniser usinghtktoolkit is a text file in a defined format thatdescribes a generalmodel of speech, created on the basis of the utterances of a training database. the models of speech may have a huge number of different variations, e.g. the type of parametrisation (extraction of speech features), the number of hmm (hidden markov model) states, streams and mixtures, the number of coefficients in eachmixture, etc. during recognition, these parameters enter the output probability density function b(o) [4]: bj(ōt)= ∏s s=1 [ ms∑ m=1 cjsmn (ōst; μ̄jsm,σjsm) ]γs ; n (ōst; μ̄jsm,σjsm)= (1) 1√ (2π)ns |σ| e −12(ōst−μ̄jsm) t σ−1 jsm (ōst−μ̄jsm), 58 acta polytechnica vol. 51 no. 5/2011 ~h "a" 5 2 39 1.437809e+00 -6.805577e+00 -8.517246e+00 -9.976683e+00 ... 39 2.393653e+01 4.407170e+01 3.864353e+01 4.710320e+01 ... 1.341746e+02 3 39 2.916575e+00 -8.322930e+00 -1.077090e+01 -9.984103e+00 ... 39 1.245955e+01 3.486024e+01 3.388573e+01 4.059823e+01 ... 1.130805e+02 4 39 4.856239e-01 -1.422903e+00 -6.716645e+00 -3.694754e+00 ... 39 1.848022e+01 2.745304e+01 3.125877e+01 4.468990e+01 ... 1.222291e+02 5 0.000000e+00 1.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 6.224011e-01 3.775989e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 7.666833e-01 2.333166e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 5.902151e-01 4.097848e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 fig. 1: example of simple hidden markov model of “a” phoneme in text form where s is count of streams, γs is streamweight, ms is count of mixtures in a stream, cjsm is weight of the m-th mixture, n (ō; μ̄,σ) is multivariate gaussian distribution with a vector of mean values μ̄ and a covariancematrix σ. this function represents the acoustic similarity of the input signal with the reference models of speech units (phonemes). all these factors enter into the phphmm tool by uploading the text file with the speech model. 2.2 parsing and storing into the database after a text file with hidden markov models is uploaded, it is parsed and converted from text form into data structures in the memory of the server. at the same time, some basic integrity checks of the file are carried out. then database tables are created in themysqldatabase and they are populated with relevant data from the uploaded file. it is convenient to use the server-based (mysql) database, inter alia, because it enables easy selection of data by means of (even complicated) sql queries. selection and processing of data using a server-based database is significantly faster and more comfortable than searching in a text file. for our current experiments, it is advantageous to store the data in a “memory” table type, as this storage allows faster access than the commonly used “myisam” type. there are also many techniques for optimizing the performance of the database, such as the use of keys and indexes [4]. 2.3 glossary of words our goal is to create a speech recogniser that will be able to handle continuous speech in real time, but currently we are dealing with recognition of individual words and short phrases. in this step, we can simply specify all thewordswhich the recogniserwill be able to recognise, either by typing in the text-box, or by uploading a text file. the more words are to be recognised, the greaterwill be the demands on the recogniser hardware, and hence on optimizing the algorithms. 
2.4 phonetic transcription

in all languages there are differences between the written language and the spoken form of speech. this step automatically creates a phonetic transcription of the words entered in the previous step; e.g. the czech word "zpěv" will be rewritten by the transcription "spjef".

2.5 selection of hardware platform

we work on optimizing the algorithms of speech recognisers for the multi-core digital signal processor platforms of the tms320c6000 family from texas instruments. the intention of phphmm is to create a general tool for a large number of hardware and software platforms. currently, this step offers a choice between a "general" platform and the "omap-l137" platform. omap-l137 is a dual-core heterogeneous processor from texas instruments with both a 32-bit arm9 core and a tms320c674x dsp core.

2.6 selection of optimisation methods

if a speech recogniser is to be run on a system with limited hardware resources, it is necessary to optimize the computationally intensive algorithms. in this step, a combination of optimisation methods can be chosen for testing. the optimisation is done at all levels of the design of the speech recogniser – from the layout of the data structures up to modifying the algorithms so that they are performed faster on the chosen hardware platform.

2.7 creating word models

depending on the optimisation method, models of the words are created as sequences of states with which the viterbi algorithm [1] works. for each word, the phoneme models are chained into a sequence of states; e.g. the transcription "spjef" of the czech word "zpěv" creates the sequence of states shown in fig. 3.

fig. 3: sequence of states of the word "zpěv" [spjef]

2.8 assembling the source code and data structures

the main task of the phphmm tool is to set up the source code and data structures on the basis of the input data, the specification of which has just been described. depending on the type of parametrisation, the structure of the models and the required optimisations, the system generates the sources of the speech recogniser with the relevant data. the source code must be generated anew for each selection of hardware platform and optimisations before it can be compiled. the source code can be set up very effectively using php: the code of the php scripting language can be inserted directly into the source code in c. as described in [5], php can be used as a preprocessor with many more possibilities than the standard c preprocessor; for example, it can create cycles or compute with goniometric functions. a hamming window lookup table can be generated as follows (the php body of this fragment did not survive the print extraction; the loop below is a reconstructed sketch assuming a window length of n = 256):

const float hamming_ar[]={
<?php
  $n = 256;  /* assumed window length */
  for ($i = 0; $i < $n; $i++) {
    /* hamming coefficient w(i) = 0.54 - 0.46*cos(2*pi*i/(n-1)) */
    echo 0.54 - 0.46 * cos(2 * M_PI * $i / ($n - 1));
    echo ($i < $n - 1) ? ",\n" : "\n";
  }
?>
};

the generated code is subsequently compiled by the appropriate compiler. however, this is already beyond the function of the current phphmm tool, although in future it may be possible, after generating the source code, simply to run the compiler and get the program in an executable format.

3 results

although the phphmm tool is used to generate the entire speech recogniser, the following text discusses some examples of using the generated code for faster calculations.

3.1 mfcc optimisations

one of the optimisation methods calculates results in advance if all the operands are known at compile time. this avoids computing the same results repeatedly in the recognition process, and it speeds up the calculation. this so-called "lookup table" method was used to generate the hamming window coefficients, which are applied at the beginning of the signal parametrisation by mel-frequency cepstral coefficients (mfcc) [1]. the parametrisation method never changes during the recognition process, and therefore the hamming window coefficients do not change; the calculation then reduces to reading the coefficient from a one-dimensional data field.
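on the target side, the frames are then weighted by plain table reads; the fragment below is an added toy illustration (window length n = 8, constant frame), filling the table once at start-up instead of linking against the generated file.

#include <math.h>
#include <stdio.h>

#define N 8 /* tiny window for demonstration */

static float hamming_ar[N]; /* stands in for the php-generated table */

int main(void)
{
    /* fill the lookup table once */
    for (int i = 0; i < N; i++)
        hamming_ar[i] = 0.54f - 0.46f * cosf(2.0f * (float)M_PI * i / (N - 1));

    float frame[N] = { 1, 1, 1, 1, 1, 1, 1, 1 };

    /* per-frame work is a table read, no cos() evaluation */
    for (int i = 0; i < N; i++) frame[i] *= hamming_ar[i];
    for (int i = 0; i < N; i++) printf("%.3f ", frame[i]);
    printf("\n");
    return 0;
}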
This so-called "lookup table" method was used to generate the Hamming window coefficients, which are calculated at the beginning of the signal parametrisation by the mel-cepstral coefficients (MFCC) [1]. The parametrisation method never changes during the recognition process, and therefore the Hamming window coefficients do not change either. The calculation then reduces to reading the coefficient from a one-dimensional data field.

A part of the parametrisation block of the signal, where speech attributes are extracted from the input signal, is the calculation of the discrete cosine transform (DCT) [6]. Using the standard method for calculating the DCT, which evaluates the trigonometric functions directly, a parametrisation calculation time of approximately 55 ms per segment was achieved on the tested digital signal processor. Since the numbers of input and output DCT coefficients are constants known at compile time and do not change during recognition, the concrete cosine results can be calculated in advance and stored in a data structure. When running the DCT algorithm in real time, the cosine is (paradoxically) not calculated; the pre-calculated cosine value is used according to the appropriate arguments. The calculation of each coefficient is thus reduced to reading its value from the pre-calculated table. By this optimisation, a calculation time lower than 6 ms was achieved, i.e. approximately a ninefold acceleration.

Fig. 4: Computation time vs. optimisation methods for MFCC parametrisation

3.2 Output probability density optimisations

Some of our proposed optimisation methods use transformed parameters, which arise by converting the original model parameters. E.g. a modified algorithm for calculating the output probability density function b(o), based on the a = a + b × c dot-product operation ("multiply and accumulate" — "MAC"), requires a recalculation of the original coefficients by a simple transformation [3]. This transformation is performed while generating the source code, i.e. at compile time. The calculation without optimisations on the dual-core TMS320C674x DSP architecture lasted 1477 ms/segment. After applying the appropriate optimisations by recomputing the data structures, the best time of 52 ms/segment was achieved when using the modified MAC algorithm [3].

Fig. 5: Computation time vs. optimisation methods for the b(o) function

Fig. 6: Computation time of the maximum of neighboring values

3.3 Viterbi algorithm optimisation

The Viterbi algorithm, which evaluates the most probable passage through the model, contains a part which compares adjacent values in the vector of results of previous operations. Various methods have been tried, and the "loop unroll" method proved to be the fastest in this case. The code that was originally performed repeatedly in the cycle is broken down into multiple particular operations without the cycle loop. This not only reduces the overhead of cycle organisation, but also provides an opportunity for greater use of the hardware architecture. In our case, instead of 32 passes through the cycle, a sequence of 32 individual operations with directly addressed operands was created. This loop unrolling made it possible to use the "MAX2" instruction of the TMS320C6000 architecture, an SIMD (single instruction, multiple data) instruction that simultaneously compares two pairs of 16-bit operands and returns two results.
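The effect of this restructuring can be illustrated with a small, hardware-independent sketch: the vectorised neighbour maximum below plays the role that the unrolled code and the MAX2 instruction play on the DSP. This is an analogy only; the names and the vector length are illustrative.

import numpy as np

# Scores produced by the previous Viterbi step (32 values, as in the text)
prev = np.arange(32, dtype=np.int16)

# Loop version: the neighbouring-value comparison the unrolled C code replaces
looped = np.empty(31, dtype=np.int16)
for i in range(31):
    looped[i] = max(prev[i], prev[i + 1])

# Data-parallel version: all neighbour comparisons issued at once — the same
# idea MAX2 applies to two 16-bit pairs per instruction
vectorised = np.maximum(prev[:-1], prev[1:])

assert np.array_equal(looped, vectorised)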
Fig. 6 shows the effectiveness of this loop unrolling for different numbers of test vectors, compared with the best time achieved without the loop unroll method.

4 Conclusion

The phphmm software tool for developing speech recognition algorithms focuses on applications for digital signal processors. The advantages of this tool include easy comparison of optimisation methods, easily changeable parameters, and a user-friendly graphical environment. It is used for generating source code and data structures tailored to the application.

Acknowledgement

This research was supported by grants GAČR 102/08/0707 "Speech Recognition under Real-World Conditions", GAČR 102/08/H008 "Analysis and Modelling of Biomedical and Speech Signals", and by research activity MSM 6840770014 "Perspective Informative and Communications Technicalities Research".

References

[1] Young, S., et al.: The HTK Book. Cambridge University Engineering Department, 2006. [Online] http://htk.eng.cam.ac.uk/ftp/software/htkbook.pdf.zip
[2] PHP [online]. 2011 [cit. 2011-03-12]. http://www.php.net/
[3] MySQL. The world's most popular open source database [online]. 2011 [cit. 2011-03-12]. http://www.mysql.com/
[4] Krejčí, R.: Optimization of computationally intensive part of speech recognizer. In 19th Czech-German Workshop on Speech Processing [CD-ROM]. Praha: Institute of Photonics and Electronics AS CR, 2009, p. 22-26. ISBN 978-80-86269-18-4.
[5] Krejčí, R.: Use PHP preprocessor for generating source codes in C programming language. In Králíky 2010. Brno: Brno University of Technology, 2010, p. 84-87. ISBN 978-80-214-4139-2.
[6] Uhlíř, J., et al.: Technologie hlasových komunikací. Praha: Nakladatelství ČVUT, 2007. 276 p. ISBN 978-80-01-03888-8.

About the author

Robert Krejčí deals with digital signal processing and speech recognition, focusing on the optimisation of speech recogniser algorithms for systems with limited hardware resources.

Robert Krejčí
E-mail: robert.krejci@centrum.cz
Department of Circuit Theory
Czech Technical University
Technická 2, 166 27 Praha, Czech Republic

Acta Polytechnica Vol. 50 No. 4/2010

Single Phase Voltage Source Inverter Photovoltaic Application

J. Bauer

Abstract

Photovoltaic applications have been developing and spreading rapidly in recent times. This paper describes the control strategy of the voltage source inverter that is the important tail end of many photovoltaic applications. In order to supply the grid with a sinusoidal line current without harmonic distortion, the inverter is connected to the supply network via an L-C-L filter. The output current is controlled by a hysteresis controller. To improve the behaviour of the L-C-L filter, active damping of the filter is used. This paper discusses the controller design and simulation results.

Keywords: voltage source inverter, L-C-L filter, hysteresis controller, pulse width modulation.

1 Introduction

Increasing efforts are being made nowadays to use renewable energy sources. Processing the energy obtained from sun, wind or water is coming to the fore. The energy supplied by these sources does not have constant values, but fluctuates according to the surrounding conditions (intensity of sun rays, water flow, etc.). These sources are therefore supplemented by additional converters, most often inverters or DC/DC converters. The area of high power converters for solar applications is already covered by industrial manufacturers. However, the area of low power devices is not fully covered.
These converters are mostly built from commercially produced parts that can perform the demanded functions, but they are not developed for this type of application, and therefore the efficiency of the whole system is low. Low power devices are important in applications where no voltage grid is present and electric power is required (mountains, desert expeditions, etc.). The simplified block structure of the investigated system is shown in Fig. 1. A DC/DC converter with an MPPT (maximum power point tracker) is connected to the solar array. A second DC/DC converter is connected to the output of this converter and raises the voltage acquired from the solar system to the voltage level demanded by the VSI.

Fig. 1: Block diagram of the system

2 DC voltage control

The power produced by solar arrays depends on the surrounding conditions, e.g. the temperature ϑ and the intensity of sun exposure E. The solar array output power is therefore not constant. The output power also depends on the withdrawn current and voltage (Fig. 2).

Fig. 2: Output characteristic of the solar cell and the maximum power point tracking principle

The MPPT control algorithm ensures that the solar array operates at its optimal point and delivers maximal power. The algorithm periodically perturbs the duty cycle of the converter within a defined hysteresis range — it increases the duty cycle, then decreases it — and compares the powers delivered by the solar cell. It then keeps the duty cycle that corresponds to the maximal delivered power. If the converter operates on the left side of the MPP (area 1), the algorithm will decrease the duty cycle. When it operates on the right side of the MPP (area 3), the controller will increase the duty cycle. If the converter is operating at the MPP (area 2), the power obtained on both sides will be lower and the duty cycle remains the same.

The second DC/DC converter is a boost-type full bridge with galvanic insulation. The converter works with a fixed duty cycle and boosts the input voltage to the voltage level required by the inverter. For a single phase VSI, 400 V should be sufficient.

3 Voltage source inverter

The VSI is used for converting energy from DC to AC voltage. The detailed scheme of the inverter is shown in Fig. 3. The power part of the inverter is made of four MOSFETs, and the L-C-L filter is connected to the output of the inverter. This filter ensures the sinusoidal shape of the output current.

Fig. 3: Inverter diagram

The inverter in this application can operate in two different modes. Firstly, it can operate in the so-called "island mode", which means that the converter acts as a voltage source and supplies devices with sinusoidal voltage of the common network parameters. PWM control is used for this purpose. The second possible operation mode delivers current to the supply network. PWM control is not suitable for this operation mode, and current hysteresis control is therefore used.

3.1 Current control

The simplest hysteresis controller can be realized as a simple bang-bang regulator. The actual value of the output current is controlled so that it remains within a defined band. This method is fast and simple, and provides good results with an L-C-L filter. The only problem is the variable switching frequency of the semiconductor switches, which is a direct consequence of this control strategy. Better results can be obtained when the hysteresis width itself is controlled.
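A minimal sketch of the fixed-band bang-bang rule just described (the signal names and the band value are illustrative, not taken from the paper):

def bang_bang(i_actual, i_ref, band=0.5, state=True):
    """Return the new switch state: True applies +Udc (current rises),
    False applies -Udc (current falls)."""
    if i_actual > i_ref + band / 2:
        return False   # current above the upper limit: apply -Udc
    if i_actual < i_ref - band / 2:
        return True    # current below the lower limit: apply +Udc
    return state       # inside the band: keep the previous state

Because the slope of the current depends on the instantaneous voltages, a fixed band like this yields the variable switching frequency mentioned above.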
According to [1], the width depends on the demanded switching frequency f_s of the converter, on the inverter-side filter inductance L_i, on the actual value of the DC-link voltage U_dc, and on the filter capacitor voltage u_Cf:

h_y = \frac{|u_{Cf}|\,(U_{dc} - |u_{Cf}|)}{L_i f_s U_{dc}}   (1)

Implementation of equation (1) keeps the switching frequency approximately constant. The influence of the variable hysteresis width is depicted in Fig. 4.

Fig. 4: Switching frequency during variable hysteresis control

A further improvement of the output current shape involves adapting the hysteresis controller. A simple hysteresis controller alternates only between the two combinations (S1, S2) or (S3, S4), which means that either +U_dc or -U_dc appears on the output (Fig. 5).

Fig. 5: Bipolar hysteresis controller

With the help of the PWM switching analogy, bipolar switching can easily be adapted to unipolar switching. This method generates three output voltage levels: +U_dc, 0 and -U_dc. The controller is shown in Fig. 6. Two hysteresis controllers of different hysteresis widths are used instead of one. The first controls switches S2, S3 and the second controls S1, S4. The outer hysteresis is used in areas where the current crosses zero and the drop in the desired current value is faster than the drop in the output current. The output current then reaches the outer hysteresis, and the controller switches from S1 to S4 or vice versa.

Fig. 6: Unipolar hysteresis controller

This control of the VSI is used when the converter operates on the supply network and generates current. The output voltage is fixed by the supply network itself; the converter "pushes" current into the grid.

3.2 PWM control

Hysteresis control cannot operate in "island mode", because there is no supply network voltage to fix the generated voltage. In this mode the converter is supposed to generate an output voltage with a sinusoidal shape, as in the supply network. The PWM control algorithm was therefore used for "island mode". The controller is shown in Fig. 7. A PI controller is used in the voltage control loop; its output is the duty cycle for the modulator.

Fig. 7: Island mode control loop

4 L-C-L filter design

The filter is an important part of every semiconductor converter. The filter reduces the effects caused by switching semiconductor devices on other devices. The harmonics generated around the switching frequency can be filtered by a large inductance connected to the output of the converter. But a large inductance decreases the dynamics of the system and also the operation range of the converter. Instead of a single inductance, the inverter can be equipped with an L-C-L filter. The L-C-L filter has good current ripple attenuation even with small inductance values. However, it can also bring resonances and unstable states into the system. The filter must therefore be designed precisely according to the parameters of the specific converter. In the technical literature we can find many articles on the design of L-C-L filters [2, 3].

Table 1: Parameters for calculating the filter components
  Grid voltage (V): 230
  Output power of the inverter (kVA): 1.5
  DC link voltage (V): 400
  Grid frequency (Hz): 50
  Switching frequency (Hz): 3000

Now the filter design will be described. The system parameters considered for calculating the components for a filter with a power of approx. 1.5 kVA are shown in Tab. 1. First, the base values need to be calculated; these values are later used for calculating the filter components.
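The arithmetic of the design, derived step by step in equations (2)-(9) below, can be cross-checked with a few lines of code. The following sketch is not the author's design tool; it takes the component values obtained in this section and reproduces the resonant frequency and damping resistor quoted below.

import math

# Component values as derived below in eqs. (4), (6) and (7)
L_i = 17.7e-3      # inverter-side inductance [H]
C_f = 3.45e-6      # filter capacitance [F]
L_g = 0.32 * L_i   # grid-side inductance, eq. (7) [H]

# Resonant frequency of the L-C-L filter, eq. (8)
f_res = math.sqrt((L_i + L_g) / (L_i * L_g * C_f)) / (2 * math.pi)

# Passive damping resistor, eq. (9)
R_sd = 1 / (3 * 2 * math.pi * f_res * C_f)

print(f"Lg = {L_g * 1e3:.1f} mH, f_res = {f_res / 1e3:.2f} kHz, R_sd = {R_sd:.1f} Ohm")
# prints roughly 5.7 mH, 1.30 kHz and 11.8 Ohm, close to the values
# quoted in eqs. (7)-(9) (small differences come from rounding)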
Z_b = \frac{U_n^2}{S_n} = 46\ \Omega   (2)

C_b = \frac{1}{\omega_n Z_b} = 69.91\ \mu F   (3)

The first step in calculating the filter components is the design of the inverter-side inductance L_i, which limits the output current ripple to 10 % of the nominal amplitude. It can be calculated according to the equation derived in [3]:

L_i = \frac{U_{dc}}{16 f_s \Delta i_{L\,max}} = 17.7\ mH   (4)

where Δi_{L max} is the 10 % current ripple specified by (5):

\Delta i_{L\,max} = 0.01\, \frac{P_n \sqrt{2}}{U_n} = 0.234\ A   (5)

The design of the filter capacity proceeds from the fact that the maximal power factor variation acceptable by the grid is 5 %. The filter capacity can therefore be calculated as a multiple of the system base capacitance C_b:

C_f = 0.05\, C_b = 3.45\ \mu F   (6)

The grid-side inductance L_g can be calculated as (7):

L_g = r L_i = 0.32\, L_i = 5.7\ mH   (7)

The last step in the design is the control of the resonant frequency of the filter. The resonant frequency must keep a distance from the grid frequency and must be at most one half of the switching frequency, because the filter must have enough attenuation at the switching frequency of the converter. The resonant frequency of the L-C-L filter can be calculated as (8):

f_{res} = \frac{1}{2\pi} \sqrt{\frac{L_i + L_g}{L_i L_g C_f}} = 1.30\ kHz   (8)

In order to reduce oscillations and unstable states of the filter, a resistor should be connected in series with the capacitor. This solution is sometimes called "passive damping". It is simple and reliable, but it increases the heat losses in the system and greatly decreases the efficiency of the filter. The value of the damping resistor can be calculated as (9):

R_{sd} = \frac{1}{3\, \omega_{res} C_f} = 11.2\ \Omega   (9)

So-called "active damping" methods with a virtual resistor were therefore developed. As mentioned in [4], there are four possible ways to place a virtual resistor for active damping. One of them is shown in Fig. 8.

Fig. 8: A single phase L-C-L filter and an alternative circuit

Figure 8 shows a model of a single phase L-C-L filter with a damping resistor connected in series with the filter capacitor. The inverter can be considered as a current source. The resistor reduces the voltage across the capacitor by a voltage proportional to the current that flows through it. In the control loop, the current through C_f is measured and differentiated by the term s C_f R_sd. A real resistor is not used; instead, the calculated value is subtracted from the demanded current (Fig. 9).

Fig. 9: A single phase L-C-L filter with a virtual resistor

In this way the filter is actively damped with a virtual resistor, without losses. The disadvantage of this method is that an additional current sensor is required, and the differentiator may bring noise problems because it amplifies high frequency signals.

5 Simulation results

A model of the VSI with its control was made with the help of MATLAB-Simulink. All simulations were made for output current i_g = 3 A, output voltage u_g = 230 V and output frequency f = 50 Hz. The switching frequency of the inverter was f_s = 3 kHz. A filter with the parameters designed in this paper, L_i = 17.7 mH, C_f = 3.45 μF, L_g = 5.7 mH and damping resistor R_sd = 11.2 Ω, was connected to the output of the inverter. Fig. 10 shows the effect of the virtual damping resistor: from time t = 0.1 s the filter is damped with the virtual resistor.

Fig. 10: Effect of the virtual resistor on the L-C-L filter

Fig. 11: Simulation results for current hysteresis control

Figure 11 shows the simulation results of the inverter with double hysteresis control. The filtered current i_g is in phase with the grid voltage.
Except for small spikes in areas where the current changes its direction, the shape of the filtered current is sinusoidal. The same simulations were done for "island mode" operation. In this case, the output voltage is regulated to u_g = 230 V and the current is determined by the load. The shape of the current is smoother, but there is a slight phase shift, caused by the output filter inductance and the load character. The harmonic content of both currents is good.

Fig. 12: Simulation results for island mode with PWM control

6 Conclusion

The control algorithm for a grid-connected voltage source inverter has been presented here. A sinusoidal line current is produced by the hysteresis controller. The switching frequency is almost constant thanks to the variable hysteresis width control. The output current filter has been designed and simulated. The obtained results seem to be promising. However, we will not be able to evaluate the whole system until after it has been realized.

Acknowledgement

This work was supported by the Grant Agency of the Czech Technical University in Prague, grant No. SGS 10800630. The research described in the paper was supervised by Prof. J. Lettl, CSc., FEE CTU in Prague.

References

[1] Hinz, H., Mutschler, P., Calais, M.: Control of a single phase three level voltage source inverter for grid connected photovoltaic systems. PCIM 1997.
[2] Liserre, M., Blaabjerg, F., Hansen, S.: Design and control of an LCL-filter based three-phase active rectifier. In Industry Applications Conference, 2001. Thirty-Sixth IAS Annual Meeting. Conference Record of the 2001 IEEE, Vol. 1, 2001.
[3] Araújo, S. V., Engler, A., Sahan, B.: LCL filter design for grid-connected NPC inverters in offshore wind turbines. In The 7th International Conference on Power Electronics. Daegu (Korea), 2007.
[4] Dahono, P. A.: A method to damp oscillations on the input LC filter of current-type AC-DC PWM converters by using a virtual resistor. In Telecommunications Energy Conference INTELEC '03, 2003.

About the author

Jan Bauer was born in Prague on January 14, 1983. He studied electrical engineering at CTU in Prague and was awarded his master's degree in June 2007. He is currently working on his PhD thesis at CTU in Prague, FEE, at the Department of Electric Drives and Traction. His main fields of interest are modern control algorithms for inverters and matrix converters.

Jan Bauer
E-mail: bauerj1@fel.cvut.cz
Dept. of Electric Drives and Traction
Faculty of Electrical Engineering
Czech Technical University in Prague
Technická 2, 166 27 Praha, Czech Republic

Acta Polytechnica Vol. 50 No. 5/2010

Exceptional Points and Dynamical Phase Transitions

I. Rotter

Abstract

In the framework of non-Hermitian quantum physics, the relation between exceptional points, dynamical phase transitions and the counterintuitive behavior of quantum systems at high level density is considered. The theoretical results obtained for open quantum systems and proven experimentally some years ago on a microwave cavity may explain environmentally induced effects (including dynamical phase transitions) which have been observed in various experimental studies. They also agree (qualitatively) with the experimental results reported recently in PT symmetric optical lattices.

Keywords: non-Hermitian quantum physics, dynamical phase transitions, exceptional points, open quantum systems, PT-symmetric optical lattices.
Many years ago, Kato [1] introduced the notion of exceptional points for singularities appearing in the perturbation theory for linear operators. Consider a family of operators of the form

T(\varsigma) = T(0) + \varsigma T'   (1)

where ς is a scalar parameter, T(0) is the unperturbed operator and ςT' is the perturbation. Then the number of eigenvalues of T(ς) is independent of ς, with the exception of some special values of ς where (at least) two eigenvalues coalesce. These special values of ς are called exceptional points. An example is the operator

T(\varsigma) = \begin{pmatrix} 1 & \varsigma \\ \varsigma & -1 \end{pmatrix}.   (2)

In this case, the two values ς = ± i give the same eigenvalue 0. According to Kato, not only the number of eigenvalues but also the number of eigenfunctions is reduced at the exceptional point.

Operators of the type (2) appear in the description of physical systems, for example in the theory of open quantum systems [2]. Here, the function space of the system, consisting of well localized states, is embedded in an extended environment of scattering wavefunctions. Due to this embedding, the Hamiltonian of the system is non-Hermitian. The interaction between two neighboring levels is given by a 2 × 2 symmetric Hamiltonian that describes the two-level system with the unperturbed energies ε₁ and ε₂ and the interaction ω between the two levels,

H(\omega) = \begin{pmatrix} \epsilon_1 & \omega \\ \omega & \epsilon_2 \end{pmatrix}.   (3)

The operators (2) and (3) are, indeed, of the same type. In the following, we will discuss the role played by exceptional points in physical systems. It will be shown that they influence not only resonance states but also discrete states lying beyond the energy window coupled directly to the environment. Furthermore (and most important), they are responsible for the appearance of dynamical phase transitions occurring in the regime of overlapping resonances.

In an open quantum system, two states can interact directly (corresponding to a first-order term) as well as via an environment (described by a second-order term) [2]. Here, we consider the case where the direct interaction is contained in the energies ε_k (k = 1, 2). This means that the ε_k are considered to be eigenvalues of a non-Hermitian Hamilton operator H₀ which contains both the direct interaction v between the two states and the coupling of each of the two individual states to the environment of scattering wavefunctions. Then ω contains exclusively the coupling of the two states via the environment. This allows one to study environmentally induced effects in open quantum systems in a very clear manner. The eigenvalues of the operator H(ω) are

\varepsilon_{1,2} = \frac{\epsilon_1 + \epsilon_2}{2} \pm z, \qquad z = \frac{1}{2}\sqrt{(\epsilon_1 - \epsilon_2)^2 + 4\omega^2}.   (4)

The physical meaning of Re(z) is the well-known level repulsion occurring at small (mostly real) ω, while Im(z) is related to width bifurcation. The two eigenvalue trajectories cross when z = 0, i.e. when

\frac{\epsilon_1 - \epsilon_2}{2\omega} = \pm i.   (5)

At these crossing points, the two eigenvalues coalesce,

\varepsilon_1 = \varepsilon_2 \equiv \varepsilon_0.   (6)

The crossing points may therefore be called exceptional points. They have a nontrivial topological structure. For details and for a reference to the experimental proof, see the review [2].

The eigenfunctions of the non-Hermitian operator (3) are biorthogonal,

\langle \phi_k^* | \phi_l \rangle = \delta_{kl}.   (7)

When the distance between the two individual states is large and they do not overlap, they are almost orthogonal in the standard manner, ⟨φ*_k|φ_l⟩ ≈ ⟨φ_k|φ_l⟩. In approaching the exceptional point, however, they become linearly dependent,

\phi_1^{cr} \to \pm i\, \phi_2^{cr}, \qquad \phi_2^{cr} \to \mp i\, \phi_1^{cr}.   (8)
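Relations (4)-(6) are easy to verify numerically. In the following sketch the values ε₁ = -ε₂ = 1 are illustrative choices, so that the exceptional point of eq. (5) sits at ω = ± i:

import numpy as np

def H(omega, e1=1.0, e2=-1.0):
    # Two-level non-Hermitian Hamiltonian of eq. (3)
    return np.array([[e1, omega], [omega, e2]], dtype=complex)

# Far from the exceptional point the eigenvalues are distinct ...
print(np.linalg.eigvals(H(0.5)))      # approx. +1.118 and -1.118
# ... and as omega -> i they coalesce at (e1 + e2)/2 = 0, eq. (6)
print(np.linalg.eigvals(H(0.999j)))   # two eigenvalues close to 0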
Hence, the phases of the eigenfunctions φ_k of the non-Hermitian Hamilton operator (3) are not rigid. A quantitative measure for the phase rigidity is the value

r_k \equiv \frac{\langle \phi_k^* | \phi_k \rangle}{\langle \phi_k | \phi_k \rangle}   (9)

which varies between 1 at large distance of the states and 0 at the exceptional point. Further details, including the experimental proof of relations (8) and (9), are discussed in the review [2].

One of the most interesting differences between Hermitian and non-Hermitian quantum physics is surely the fact that the phases of the eigenfunctions of the Hamiltonian are rigid (r_k = 1) in the first case, while they may vary according to (9) in the second case [2]. It is therefore possible that the wavefunction of one of the two states aligns with the scattering wavefunction of the environment while the other state decouples (more or less) from the environment. This phenomenon, called resonance trapping, is caused by Im(z) in (4), i.e. by width bifurcation. It starts near (or at) the crossing (exceptional) point under the influence of the continuum of scattering wavefunctions. This means that non-Hermitian quantum physics is able to describe environmentally induced effects, for example spectroscopic redistribution processes induced by the mixing of the states via the continuum of scattering wavefunctions, which is described by ω in (3).

Another feature involved in non-Hermitian quantum physics is the appearance of nonlinearities in the neighborhood of exceptional points [2]. For example, the S matrix at a double pole (corresponding to an exceptional point) in the two-level one-continuum case reads

S = 1 - \frac{2 i \Gamma_0}{E - E_0 + \frac{i}{2}\Gamma_0} - \frac{\Gamma_0^2}{(E - E_0 + \frac{i}{2}\Gamma_0)^2}   (10)

where the notation (6) is used and ε₀ ≡ E₀ - (i/2)Γ₀. At the exceptional point, the cross section vanishes due to interferences. The minimum is washed out in the neighborhood of the double pole; however, the resonance is broader than a Breit-Wigner resonance according to (10).

Further studies have shown that the effects discussed above by means of the toy model (3) survive when the full problem in the whole function space with many levels is considered. This means that, when the level density is high and the individual resonances overlap, Hamilton operators of type (3) and the exceptional points related to them play an important role in the dynamics of the system.

Mainly two types of phenomena are caused by exceptional points in physical systems. The two phenomena condition each other (for details see [2]).

First, the spectroscopy of discrete and resonance states is strongly influenced by exceptional points. Both types of states are eigenstates of a non-Hermitian many-level Hamilton operator analogous to (3). They differ by the boundary conditions. The states are discrete (corresponding to an infinitely long lifetime) when their energy is beyond the window coupled to the continuum of scattering wavefunctions. The states are resonant (corresponding, in general, to a finite lifetime) when their energy is inside the window coupled to the continuum of scattering wavefunctions. Accordingly, the exceptional points influence the behavior not only of the resonance states but also of the discrete states. For example, the avoided crossing of discrete states can be traced back to an exceptional point and, furthermore, the mixing of discrete states around an avoided crossing of levels is shown to arise from the existence of an exceptional point in the continuum of scattering wavefunctions.
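The phase rigidity (9) can likewise be traced numerically. The sketch below uses the same illustrative Hamiltonian as above and takes the magnitude of r_k as the measure; the values fall from nearly 1 towards 0 as ω approaches the exceptional point at ω = i:

import numpy as np

def phase_rigidity(omega, e1=1.0, e2=-1.0):
    """|r_k| of eq. (9) for both eigenvectors of the Hamiltonian (3)."""
    Hm = np.array([[e1, omega], [omega, e2]], dtype=complex)
    _, vecs = np.linalg.eig(Hm)   # columns normalised so that <phi|phi> = 1
    # with <phi|phi> = 1 the rigidity reduces to |phi^T phi|
    return [abs(vecs[:, k] @ vecs[:, k]) for k in range(2)]

for omega in (0.1j, 0.9j, 0.99j):     # approaching the exceptional point
    print(omega, phase_rigidity(omega))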
Discrete states have been well described in the framework of conventional quantum mechanics for very many years. The Hamiltonian is Hermitian, with effective forces that are not calculated in the standard theory. The effective forces simulate (at least partly) the principal value integral which arises from the coupling to other states via the continuum (denoted by ω in (3)). The phases of the eigenfunctions are rigid (r_k = 1), the discrete states avoid crossing, and the topological phase of the diabolic point is the Berry phase. Due to r_k = 1, the Schrödinger equation is linear, and the levels are mixed (entangled) in the whole parameter range of avoided level crossings. At the critical point, the mixing is maximal (1 : 1).

Resonance states are well described when quantum theory is extended by including the environment of scattering wavefunctions into the formalism. The Hamiltonian is, in general, non-Hermitian, and ω in (3) is complex since it contains both the principal value integral and the residuum arising from the coupling to other states via the continuum. The phases of the eigenfunctions are, in general, not rigid, corresponding to (9) with 0 ≤ r_k ≤ 1. This can be seen in the skewness of the basis. The resonance states can cross in the continuum (at the exceptional point), and the topological phase of the crossing point is twice the Berry phase. When r_k < 1 (the regime of resonance overlapping with avoided level crossings), the Schrödinger equation is nonlinear, and the levels are mixed (entangled) in the parameter range in which the resonances overlap. The parameter range shrinks to one point when the levels cross, i.e. when r_k → 0 and (8) is approached.

Secondly, a dynamical phase transition is induced by exceptional points in the regime of overlapping resonances. Such a phase transition is environmentally induced and occurs due to width bifurcation. The number of localized states is reduced, since a few resonance states align to the scattering states of the environment and cease to be localized. By this, the dynamical phase transition destroys the relation between localized states below and above the critical regime in which the resonances overlap.

The two phases are characterized by the following properties. In one of the phases, the discrete and narrow resonance states have individual spectroscopic features. Here, the real parts (energies) of the eigenvalue trajectories avoid crossing while the imaginary parts (widths) can cross. As a function of increasing (but small) coupling strength between system and environment, the number of localized states does not change and the widths of the resonance states increase, as expected. Here, the exceptional points are of minor importance.

In the other phase, the narrow resonance states are superimposed with a smooth background and the individual spectroscopic features are lost. The narrow resonance states appear due to resonance trapping, i.e. as a consequence of the alignment of a small number of resonance states to the environment (for details see [2]). Here, the real parts (energies) of the eigenvalue trajectories of the narrow (trapped) resonance states can cross with those of the broad (aligned) states, since they exist on different time scales. The narrow resonance states show a counterintuitive behavior: with increasing (strong) coupling strength between system and environment, the widths of the narrow (trapped) states decrease.
Furthermore, the number of trapped resonance states is smaller than the number of individual (basic) states. This means that the number of localized states is reduced when the (complex) interaction ω in (3) is sufficiently large. This phase results from the spectroscopic redistribution processes caused by exceptional points.

The transition region between the two phases is the regime of overlapping resonances. Here, short-lived and long-lived resonance states coexist, i.e. they are not clearly separated from one another on the time scale. In nuclear physics, this regime is described well by the doorway picture. According to this picture, the long-lived states are decoupled from the continuum while the doorway states are coupled to both the continuum and the long-lived states. In the transition region, the cross section is enhanced due to the (partial) alignment of some states (the doorway states) with the scattering states of the environment.

It is interesting to see that the system behaves according to expectations only at low level density. Here, the resonance states are characterized by their individual spectroscopic properties, and their number does not change when a parameter is varied. After passing the transition regime with overlapping resonances by further variation of the parameter, the behavior of the system becomes counterintuitive: the narrow resonance states decouple more or less from the continuum of scattering wavefunctions, and the number of localized states decreases. The decoupling increases with increasing coupling strength between system and environment. This counterintuitive behavior was proven experimentally, some years ago, in a study on a microwave cavity [3].

Recently, a dynamical phase transition and counterintuitive behavior at strong coupling between system and environment have been observed experimentally also in PT symmetric optical lattices. At small loss, the transmission through the system decreases with increasing loss, according to expectations. With further increasing loss, however, the PT symmetry breaks and the transmission is enhanced [4, 5, 6]. An interpretation of these results from the point of view of a dynamical phase transition can be found in [7]. There, the difference between the discrete states in PT symmetric systems and the bound states in the continuum (resonance states with vanishing decay width) in open symmetric quantum systems with overlapping resonances is also discussed.

Dynamical phase transitions in other systems have been observed experimentally. They are discussed in [2, 7, 8]. Common to all of them is that the dynamical phase transition takes place in the regime of overlapping resonances. As a result, a few states are aligned to the scattering states of the environment while the remaining ones (long-lived trapped resonance states) are (almost) decoupled from the continuum of scattering wavefunctions.

The results discussed in the present paper can be summarized as follows. Exceptional points play an important role in the dynamics of quantum systems. They are responsible, e.g., for the appearance of dynamical phase transitions in the regime of overlapping resonances. In approaching exceptional points by varying a parameter, some states align with the states of the environment by trapping almost all the other resonance states. Such a process can be described in non-Hermitian quantum physics since the phases of the eigenfunctions of the Hamiltonian are not rigid, 1 ≥ r_k ≥ 0.
However, it cannot be described in conventional Hermitian quantum theory with fixed phases of the eigenfunctions, r_k = 1. Due to the alignment of some states with the states of the environment, physical processes such as transmission may be enhanced in a comparably large parameter range. The alignment increases with increasing coupling strength between system and environment and causes a behavior of the system at high level density which is counterintuitive at first glance. Further theoretical and experimental studies in this field will broaden our understanding of quantum mechanics. Moreover, the results are expected to be of great value for applications.

References

[1] Kato, T.: Perturbation Theory for Linear Operators. Springer, Berlin, 1966.
[2] Rotter, I.: A non-Hermitian Hamilton operator and the physics of open quantum systems, J. Phys. A 42 (2009) 153001 (51 pp), and references therein.
[3] Persson, E., Rotter, I., Stöckmann, H. J., Barth, M.: Observation of resonance trapping in an open microwave cavity, Phys. Rev. Lett. 85 (2000) 2478-2481.
[4] Guo, A., Salamo, G. J., Duchesne, D., Morandotti, R., Volatier-Ravat, M., Aimez, V., Siviloglou, G. A., Christodoulides, D. N.: Observation of PT-symmetry breaking in complex optical potentials, Phys. Rev. Lett. 103 (2009) 093902 (4 pp).
[5] Rüter, C. E., Makris, G., El-Ganainy, R., Christodoulides, D. N., Segev, M., Kip, D.: Observation of parity-time symmetry in optics, Nature Physics 6 (2010), 192-195.
[6] Kottos, T.: Broken symmetry makes light work, Nature Physics 6 (2010), 166-167.
[7] Rotter, I.: Environmentally induced effects and dynamical phase transitions in quantum systems, J. Opt. 12 (2010) 065701 (9 pp).
[8] Müller, M., Rotter, I.: Phase lapses in open quantum systems and the non-Hermitian Hamilton operator, Phys. Rev. A 80 (2009) 042705 (14 pp).

Prof. Ingrid Rotter
E-mail: rotter@pks.mpg.de
Max-Planck-Institut für Physik komplexer Systeme
D-01187 Dresden, Germany

Acta Polytechnica Vol. 49 No. 2-3/2009

Noise Reduction in Car Speech

V. Bolom

Abstract

This paper presents properties of chosen multichannel algorithms for speech enhancement in a noisy environment. These methods are suitable for hands-free communication in a car cabin. Criteria for the evaluation of these systems are also presented. The criteria consider both the level of noise suppression and the level of speech distortion. The performance of the multichannel algorithms is investigated for a mixed model of speech signals and car noise and for real signals recorded in a car.

Keywords: beamforming, adaptive array processing, signal processing, microphone arrays, speech enhancement.

1 Introduction

This paper presents some possible ways of speech enhancement in a car cabin. This task is very important for speech control of devices in a car and for mobile communication. Both of these applications contribute to greater traffic safety. Multichannel methods of digital signal processing can be successfully used for speech enhancement. This class of methods outperforms single channel methods and achieves greater noise suppression.

2 Spatial filtering

A microphone array is a basic part of multichannel processing. A uniformly spaced microphone array is the simplest arrangement. The input acoustic signal is sampled in space, due to the microphone spacing, and in time. Thanks to the spatial sampling it is possible to distinguish signals coming from different directions. An input multichannel signal x[n] can be described as a mixture of the desired signal and interference. Most multichannel systems are described under several assumptions. First, the microphone array is focused to the direction of arrival (DOA) of the desired signal. Second, it is assumed that the source signal is far enough from the array, so the input acoustic signal can be treated as a plane wave [9]. The input signal at the m-th channel can be expressed as

x_m[n] = s[n] + u_m[n],   (1)

where s[n] denotes the n-th sample of the desired signal and u_m[n] denotes the noise and interference at the m-th sensor.

3 Interference in multichannel systems

Three types of interference are usually considered in a multichannel system.
A criterion for the classification is the coherence function Γ(e^{jωT}). This function expresses the mutual dependency (correlation) of particular signals in individual frequency bands. The coherence function Γ_ij(e^{jωT}) of two signals is defined by the relation [14]

\Gamma_{ij}(e^{j\omega T}) = \frac{\Phi_{ij}(e^{j\omega T})}{\sqrt{\Phi_{ii}(e^{j\omega T})\, \Phi_{jj}(e^{j\omega T})}},   (2)

where Φ_ii(e^{jωT}) denotes the power spectral density (PSD) of the signal in the i-th channel and Φ_ij(e^{jωT}) the cross-power spectral density (CPSD) of the signals in the i-th and the j-th channel. The magnitude squared coherence (MSC), defined as

MSC(e^{j\omega T}) = |\Gamma_{ij}(e^{j\omega T})|^2,   (3)

is also often used. The type of interference is distinguished according to the shape of MSC(e^{jωT}). Three types of interference are recognized: spatially coherent, spatially incoherent and diffuse interference.

3.1 Spatially coherent interference

First, let us consider a plane wave reaching an array of two microphones under an angle θ_c. This situation is illustrated in Fig. 1.

Fig. 1: An array of two sensors

The spectrum of the signal at sensor 2 is X₂(e^{jωT}). The wavefront reaching sensor 1 is attenuated by a constant a and delayed by

\tau_c = \frac{d \cos\theta_c}{c},   (4)

where c denotes the propagation speed of the acoustic signal and d denotes the sensor spacing. The spectrum of the signal at sensor 1 is given by

X_1(e^{j\omega T}) = a\, X_2(e^{j\omega T})\, e^{-j\omega\tau_c}.   (5)

Substituting (5) into (2) results in an expression for the coherence function:

\Gamma_{12}(e^{j\omega T}) = \frac{a\, \Phi_{22}(e^{j\omega T})\, e^{-j\omega\tau_c}}{\sqrt{a^2\, \Phi_{22}(e^{j\omega T})\, \Phi_{22}(e^{j\omega T})}} = e^{-j\omega \frac{d\cos\theta_c}{c}}.   (6)

Thus the expression for MSC(e^{jωT}) reveals full coherence:

MSC(e^{j\omega T}) = |\Gamma_{12}(e^{j\omega T})|^2 = 1.   (7)

3.2 Spatially incoherent interference

In the case of spatially incoherent interference, the coherence computed from samples obtained at two different points in space is equal to zero over the whole frequency band, because E[X₁(e^{jωT}) X₂*(e^{jωT})] = 0, where X₁ and X₂ denote the spectra of the two interferences and the asterisk denotes the complex conjugate. Incoherent noise is represented by the electrical noise of the microphones.

3.3 Spatially diffuse interference

A reverberant environment, in which many reflections occur, is often encountered. The delayed reflected signal reaches the array together with the direct wave. The characteristics of the delayed signal (magnitude and phase) depend on the acoustic properties of the given environment, e.g. a car cabin. This type of interference is very often present in real environments, and it is called spatially diffuse interference. Diffuse noise can be modelled by an infinite number of independent sources distributed on a sphere [3].
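In practice the MSC of equations (2) and (3) is estimated from finite data by averaging over segments. A small sketch using scipy (the synthetic signals and all values are illustrative):

import numpy as np
from scipy.signal import coherence

fs = 8000
t = np.arange(4 * fs) / fs
rng = np.random.default_rng(0)

# a common (coherent) 440 Hz component plus independent sensor noise
common = np.sin(2 * np.pi * 440 * t)
x1 = common + 0.5 * rng.standard_normal(t.size)
x2 = common + 0.5 * rng.standard_normal(t.size)

f, msc = coherence(x1, x2, fs=fs, nperseg=256)
print(msc[np.argmin(np.abs(f - 440))])  # close to 1: coherent component
print(np.median(msc))                   # low elsewhere: incoherent noise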
A formula for the coherence derived from this spherical model is given by

\Gamma_{12}(e^{j\omega T}) = \frac{\sin\left(\frac{\omega d}{c}\right)}{\frac{\omega d}{c}},   (8)

where ω denotes the angular frequency and d and c have been defined above. The shape of the MSC for diffuse noise is depicted in Fig. 2, for microphone spacings d = 5 cm, 10 cm and 20 cm. An analysis of equation (8) and Fig. 2 shows that the closer together the microphones are placed, the wider the main lobe of the MSC is.

Fig. 2: MSC for diffuse noise and different microphone spacings

Fig. 3: MSC for noise in a car cabin and different microphone spacings

An analysis of noise recorded in a car cabin revealed interference of a diffuse nature. Fig. 3 depicts the shapes of the MSC for various microphone distances. The shapes are very close to the model of diffuse noise.

4 Processing in the frequency domain

Algorithms of multichannel processing can be implemented in the time or frequency domain. The basic algorithms, e.g. GSC [7], operate in the time domain. A speech signal cannot be assumed stationary, so adaptive algorithms are used. The coefficients of the adaptive filters are usually controlled by the LMS algorithm. However, advanced algorithms require processing in the frequency domain. A block diagram of processing in the frequency domain is depicted in Fig. 4. First, the input signal is divided into quasi-stationary overlapping segments, and each segment is weighted by a Hamming window; a typical segment length is 16 ms. Second, the short-time spectrum is computed. Third, the short-time spectra are processed. The output signal is finally obtained using the inverse Fourier transform and the overlap-and-add (OLA) method [14]. Weight adaptation is performed block by block, according to the minimum mean square error (MMSE) criterion. The advantage of this approach is that the weights in each frequency band change according to the power of the noise in that particular band.

Fig. 4: Block diagram of processing in the frequency domain

5 Beamforming algorithms

The performance of four algorithms will be presented in this paper; their principles are explained in this section. The following algorithms will be presented: the beamformer with adaptive postprocessing (BAP) [16], the generalised sidelobe canceller (GSC) [7], the linearly constrained beamformer (LCB) [5] and modified coherence filtering (MCF) [10].

5.1 BAP

The delay-and-sum beamformer (DAS) is the first block of BAP [16]. The output of this block, y_b, is an average of the input channels; the weights w_i are equal to 1/M. BAP improves the DAS beamformer by using a Wiener filter (WF) behind the DAS structure (Fig. 5). The main contribution of the WF is in improving the suppression of uncorrelated interferences. The derivation of the WF weights can be found in [15]. The weights in the frequency domain are obtained as

W(e^{j\omega T}) = \frac{\Phi_{xs}(e^{j\omega T})}{\Phi_{xx}(e^{j\omega T})},   (9)

Fig. 5: BAP

where Φ_xx(e^{jωT}) denotes the power spectral density (PSD) of the signal x[k] (the input of the WF), and Φ_xs(e^{jωT}) is the cross-power spectral density (CPSD) of the signals x[k] and s[k] (the desired output of the WF). It is assumed that the interferences are mutually uncorrelated (E[U_i(e^{jωT}) U_j*(e^{jωT})] = 0 for all i ≠ j) and that the desired signal is uncorrelated with the interferences (E[S(e^{jωT}) U_i*(e^{jωT})] = 0 for all i). S(e^{jωT}) is the spectrum of the desired signal and U_i(e^{jωT}) is the spectrum of the interference at the i-th sensor. Under these assumptions it holds that

\Phi_{xs}(e^{j\omega T}) = \Phi_{sx}(e^{j\omega T}) = \Phi_{ss}(e^{j\omega T}).   (10)
The weights of the WF can now be expressed as

W(e^{j\omega T}) = \frac{\Phi_{ss}(e^{j\omega T})}{\Phi_{xx}(e^{j\omega T})}.   (11)

In the case of the BAP structure, the PSDs in relation (11) are estimated by averaging over the channels [13]:

\hat{\Phi}_{ss}(e^{j\omega T}) = \frac{2}{M^2 - M} \sum_{i=1}^{M-1} \sum_{m=i+1}^{M} \mathrm{Re}\{X_i(e^{j\omega T})\, X_m^*(e^{j\omega T})\},   (12)

\hat{\Phi}_{xx}(e^{j\omega T}) = \frac{1}{M} \sum_{i=1}^{M} |X_i(e^{j\omega T})|^2,   (13)

where X_i(e^{jωT}) is the spectrum of the input signal in the i-th channel.

5.2 GSC

The structure of the GSC [7] is depicted in Fig. 6; it is equivalent to the adaptive beamformer [6]. The system consists of the DAS beamformer and the adaptive noise canceller (ANC). The ANC serves to suppress the coherent interference. The weights of the ANC filters are in accordance with Wiener theory [7]. A formula for the optimal weights is given by

H_i(e^{j\omega T}) = \frac{\Phi_{y_i y_b}(e^{j\omega T})}{\Phi_{y_i y_i}(e^{j\omega T})}, \qquad i = 1, \ldots, M-1,   (14)

where Φ_{y_i y_b}(e^{jωT}) denotes the CPSD of the signals y_i and y_b, the meaning of which is obvious from Fig. 6, and Φ_{y_i y_i}(e^{jωT}) is the PSD of y_i.

The proper function of the ANC depends on perfect separation of the desired signal from the input signal. Let us denote any coherent signal incident on the array from any direction except the DOA as coherent interference. Under this assumption, an interference can be separated from the input signal by an appropriate combination of the input channels x_i[k]. This separation is arranged by the blocking matrix (BM). The most commonly used BM differentiates neighbouring channels. The BM consists of M columns and (M - 1) rows, and looks like this [7]:

BM = \begin{pmatrix} 1 & -1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & -1 & \cdots & 0 & 0 \\ \vdots & & \ddots & \ddots & & \vdots \\ 0 & 0 & 0 & \cdots & 1 & -1 \end{pmatrix}   (15)

Fig. 6: GSC

5.3 LCB

The LCB utilizes the GSC and BAP beamformers [5]. The structure of the LCB is depicted in Fig. 7. The direct branch, composed of BAP, suppresses incoherent interference. The lower branch, consisting of the ANC, is responsible for coherent interference suppression. The greatest difference between GSC and LCB is the way in which the weights of the ANC filters are computed. In LCB they are computed from the signals at the outputs of the BM and the WF. The relation for calculating the ANC filters is

H_i(e^{j\omega T}) = \frac{\Phi_{y_i y_w}(e^{j\omega T})}{\Phi_{y_i y_i}(e^{j\omega T})}, \qquad i = 1, \ldots, M-1,   (16)

where Φ_{y_i y_w} denotes the CPSD of the signals y_i and y_w, the meaning of which is obvious from Fig. 7.

Fig. 7: LCB

5.4 Coherence filtering

Coherence filtering differs from the other multichannel systems; it is a representative of double channel methods. The idea of this method [2] is based on the fact that the coherence function of the spatially coherent desired signal is close to one, while the coherence of the incoherent interference is close to zero. The authors of [10] propose a modification to coherence filtering in which the coherence filter is included in the BAP structure, see Fig. 8. The coefficients of the modified coherence filter (MCF), c(k), are computed as follows:

c(k) = \begin{cases} W(k), & \text{if } \Gamma(k) \geq T \\ \Gamma^{\beta}(k), & \text{if } \Gamma(k) < T \end{cases}   (17)

where W(k) denotes the estimated frequency response of the Wiener filter, equation (9), T denotes the threshold, and β is the exponent applied to the coherence below the threshold.

Fig. 8: Structure of a modified coherence filter
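Equations (11)-(13) translate directly into code. The following sketch estimates the BAP postfilter from the short-time spectra of the channels; the clipping of the weights to [0, 1] is a common practical addition, not part of the equations, and the names are illustrative.

import numpy as np

def bap_postfilter(X):
    """Estimate the Wiener postfilter of eq. (11) from short-time spectra
    X with shape (M, F): M channels, F frequency bins."""
    M, F = X.shape
    # eq. (12): averaged real cross-spectra of all channel pairs i < m
    phi_ss = np.zeros(F)
    for i in range(M - 1):
        for m in range(i + 1, M):
            phi_ss += np.real(X[i] * np.conj(X[m]))
    phi_ss *= 2.0 / (M * M - M)
    # eq. (13): averaged auto power spectra
    phi_xx = np.mean(np.abs(X) ** 2, axis=0)
    return np.clip(phi_ss / np.maximum(phi_xx, 1e-12), 0.0, 1.0)

# toy usage: 4 channels, 129 frequency bins
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 129)) + 1j * rng.standard_normal((4, 129))
print(bap_postfilter(X)[:5])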
6 Testing procedure

It is very difficult to separate the desired signal and the noise when the level of noise suppression is evaluated, yet this separation is crucial for an assessment of the properties of the algorithms. The following approach was therefore chosen for testing the algorithms. The desired signal and the interference are recorded separately. The input mixture x[n] is obtained before processing, so that the SNR is defined. Recordings of utterances in a standing car with the engine switched off are taken as the desired signal. The noise is represented by recordings of noise in a moving car without the presence of speech.

A block diagram of the testing procedure is depicted in Fig. 9. The input signals s[n] (desired signal) and u[n] (noise) are mixed at a defined SNR to make x[n]. The output signal y[n] originates by processing the input mixture. During processing, the coefficients of the adaptive filters are set. Using these coefficients, the clean signals s[n] and u[n] are also processed separately, which results in the output signals ys[n] and yu[n]. These signals carry information about the influence of the system on the desired signal and on the interference.

Fig. 9: Block diagram of the testing procedure

7 Criteria for system evaluation

The criteria for assessing the level of speech enhancement can be classified into two classes, objective and subjective. The subjective criteria are represented by listening tests. Listening tests are very demanding: it is necessary to gather several qualified listeners, and the test also consumes a great deal of time. However, these tests can show how the output signals are perceived by human subjects. Objective criteria give exact information and are not influenced by external factors, e.g. the mood of the listener. The following criteria will be used for evaluating the algorithms: noise reduction (NR), log area ratio (LAR), signal to noise ratio enhancement (SNRE) and spectrograms. All of the criteria are computed from quasi-stationary segments of the signal.

7.1 Noise reduction

NR expresses the ability of an algorithm to reduce noise. NR is defined as

NR(e^{j\omega T}) = \frac{\Phi_{uu}(e^{j\omega T})}{\Phi_{y_u y_u}(e^{j\omega T})},   (18)

where Φ_uu(e^{jωT}) is the PSD of the interference at the input of the system, and Φ_{y_u y_u}(e^{jωT}) is the PSD of the interference processed by the system. The assumption for the NR calculation is that no desired signal is present at the input of the system. NR considers only the influence of the system on the interference; it does not consider the influence on the desired signal. This criterion therefore has to be combined with other criteria.

7.2 Log area ratio

LAR [12] takes into account the influence of the system on the desired signal and on speech intelligibility. An advantage of this criterion is its high correlation with listening tests [4]. A precondition for using this criterion is the presence of speech. LAR is calculated on the basis of the partial correlation (PARCOR) coefficients of the autoregressive model [8]. Computing LAR requires the clean speech signal s[n] and the output signal ys[n]. The computation is performed in the following steps:

1. Estimation of the PARCOR coefficients k(p, l) of the signal segment, where the index p denotes the p-th PARCOR coefficient and l the signal segment. The order of the model is chosen as P = 12. A Burg algorithm can be used for estimating the coefficients [8].

2. Calculation of the area coefficients

g(p, l) = \frac{1 + k(p, l)}{1 - k(p, l)}, \qquad p = 1, \ldots, 12,   (19)

where k(p, l) is the p-th PARCOR coefficient of the l-th segment. (PARCOR coefficients k(p, l) are denoted in some sources [11] as the negative of the reflection coefficients.)

3.
Calculation of LAR for block l:

LAR(l) = 20 \sum_{p=1}^{12} \log_{10} \frac{g_s(p, l)}{g_{y_s}(p, l)}.   (20)

LAR expresses the "distance" of the model of the signal s[n] from the model of the signal ys[n]. The lower the LAR is, the less the speech is distorted.

7.3 SNRE

SNRE is very often used for evaluating systems for speech enhancement. The value of SNRE is also calculated segment by segment. SNRE is obtained as the difference SNRout - SNRin. The signals s[n] and u[n] are used for calculating SNRin, and ys[n] and yu[n] are used for calculating SNRout.

8 Database of car speech

A database of car speech and car noise has been created for developing and verifying multichannel systems performing speech enhancement. The database creation procedure was chosen to fulfil the requirements of the testing procedure described in Section 6. The signals were recorded in a Škoda Fabia. A microphone array of 12 sensors with a constant spacing of 4 cm was used. The desired signal was represented by reproduced recordings of female and male utterances. Noise signals were recorded under various conditions. More details about the database are summarized in [1].

9 Experiments

Two approaches were used to verify the algorithms presented in this paper. First, a model of the desired signal and noise recorded in a car were used as the input mixture. The model of the desired signal was created by copying the clean speech signal into all channels. The purpose of this approach is to verify the performance of the algorithms while the influence of the properties of the microphone array is not considered. Breaking the assumptions mentioned in Section 3 introduces additional delays of the signals between the individual microphones. Additional delays can be due to the fact that the acoustic signals cannot be represented by plane waves, and due to array imperfections. Solving these problems is a separate issue. The purpose of the second experiment is to show the properties of the whole system. It should show that the properties of the array are significant and that it is worth taking them into account.

Each of the experiments was performed for two different environments. The first environment was a standing car with a running engine, and the second environment was a car moving outside a village (70 km/h). The criteria NR, LAR and SNRE were calculated for segments of 128 samples, and a mean value was calculated for each criterion. The experiments were performed for an array of 4 microphones with 4 cm spacing, and SNRin was set to 0 dB. The sample rate was 8 kHz. The parameters of MCF were set to T = 0.2 and β = 2.

Tables 1 and 2 show the results for the model of the desired signal. The results for a real signal are displayed in Tables 3 and 4. The experiment with the model of the signal was also done for different values of SNRin; the results are summarized in the graphs in Figs. 10, 11 and 12.

Table 1: Results for a model of a signal, standing car
       LAR    SNRE   NR
BAP    0.50   1.52   13.37
GSC    0.00   4.95   2.43
LCB    0.50   5.14   14.75
MCF    1.66   2.19   2.33

Table 2: Results for a model of a signal, running car (70 km/h)
       LAR    SNRE   NR
BAP    0.53   0.90   13.85
GSC    0.00   3.11   -1.46
LCB    0.53   3.22   10.86
MCF    1.83   1.38   2.72

Table 3: Results for a real signal, standing car
       LAR    SNRE   NR
BAP    4.42   -0.32  13.66
GSC    7.36   -1.80  4.40
LCB    7.62   -1.90  17.24
MCF    6.26   -0.33  2.84

Table 4: Results for a real signal, running car (70 km/h)
       LAR    SNRE   NR
BAP    4.41   -0.78  14.64
GSC    7.25   -1.47  5.95
LCB    7.52   -1.51  18.95
MCF    6.38   -0.69  4.21

10 Conclusion

The experiments enabled a comparison of the methods for speech enhancement presented here. The results are very different for the model of the desired signal and for the real signal; array imperfections and signal propagation are the most important influences. LCB provided the best results for the model of the signal. The experiments also showed the importance of using several criteria: BAP achieves low SNRE and high NR according to the results in Table 1, while GSC behaves in the opposite way.
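As an aside on reproducibility, the segmentwise criteria behind Tables 1-4 can be sketched in a few lines. The sketch below uses broadband segment powers as a simplification of the PSD-based eq. (18); the names and segment length are illustrative.

import numpy as np

def seg_power(x, seg=128):
    n = len(x) // seg
    return np.array([np.mean(x[i * seg:(i + 1) * seg] ** 2) for i in range(n)])

def nr_db(u, yu, seg=128):
    """Mean segmentwise noise reduction, broadband form of eq. (18), in dB."""
    return np.mean(10 * np.log10(seg_power(u, seg) / seg_power(yu, seg)))

def snre_db(s, u, ys, yu, seg=128):
    """Mean segmentwise SNR enhancement: SNR_out - SNR_in in dB."""
    snr_in = 10 * np.log10(seg_power(s, seg) / seg_power(u, seg))
    snr_out = 10 * np.log10(seg_power(ys, seg) / seg_power(yu, seg))
    return np.mean(snr_out - snr_in)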
Returning to the comparison: MCF seems to have the weakest performance. It produced high speech distortion (high values of LAR) and low SNRE and NR. The zero speech distortion in the case of GSC is worth noting; this is due to the perfect separation of the desired signal at the input of the ANC filters.

All of the algorithms showed a much worse performance for real signals. There is both high speech distortion and low enhancement. There are no significant differences between the standing car and the moving car; the NR of BAP and LCB is an exception. The lowest values of LAR and SNRE are for BAP and MCF.

The second experiment focused on the influence of SNRin on the results. The shape of NR (Fig. 10) reveals a strong dependence on SNRin for GSC and LCB. The NR of LCB falls below BAP, and GSC falls below MCF, for high values of SNRin. The shape of SNRE (Fig. 12) shows a very similar trend. BAP and MCF are almost independent of SNRin with respect to both NR and SNRE. Only BAP, LCB and MCF can be considered when observing LAR (Fig. 11). GSC does not distort speech in the case of the model of the input signal, due to the perfect separation of the desired signal. BAP and LCB have the same shape of LAR for the same reason. The highest speech distortion was for MCF. The figure also shows that speech distortion decreases with growing SNRin.

Fig. 10: NR for various SNRin
Fig. 11: LAR for various SNRin
Fig. 12: SNRE for various SNRin

This paper has shown the properties of selected algorithms for speech enhancement in a noisy environment. The experiments with a model of the input signal showed that these methods are capable of speech enhancement. A problem occurred when the methods were used for real signals: the assumptions of proper functionality were broken in this case, and the input signals did not match the model that the methods were developed for. It is necessary to focus on compensating the array imperfections and signal propagation in future work.

Acknowledgments

The research described in this paper was supervised by Prof. Pavel Sovka. This paper was mainly supported by research activity MSM 6840770012 "Transdisciplinary Research in Biomedical Engineering II", GAČR grant 102/08/H008 "Analysis and Modelling of Biomedical and Speech Signals" and GAČR grant GA102/08/0707 "Speech Recognition under Real-World Conditions".

References

[1] Bolom, V., Sovka, P.: Multichannel database of car speech. In Digital Technologies 2008, Vol. 1 (2008), Žilina: University of Žilina, Faculty of Electrical Engineering.
[2] Le Bouquin, R.: Enhancement of noisy speech signals: application to mobile radio communications. Speech Commun., Vol. 18 (1996), No. 1, p. 3-19.
[3] Cron, B. F., Sherman, C. H.: Spatial-correlation functions for various noise models.
[4] fischer, s., kammeyer, k.-d., simmer, k. u.: adaptive microphone arrays for speech enhancement in coherent and incoherent noise fields. invited talk at the 3rd joint meeting of the acoustical society of america and the acoustical society of japan, honolulu, hawaii, december 1996.
[5] fischer, s., simmer, k. u.: beamforming microphone arrays for speech acquisition in noisy environments. speech communication, vol. 20 (1996), p. 215–227.
[6] frost, o. l.: an algorithm for linearly constrained adaptive array processing. proceedings of the ieee, vol. 60 (1972), p. 926–934.
[7] griffiths, l. j., jim, c. w.: an alternative approach to linearly constrained adaptive beamforming. ieee transactions on antennas and propagation, vol. 30 (1982), p. 27–34.
[8] haykin, s.: adaptive filter theory (3rd ed.). upper saddle river, nj (usa): prentice-hall, inc., 1996.
[9] herbordt, w.: sound capture for human/machine interfaces. practical aspects of microphone array signal processing. springer, 2005.
[10] mahmoudi, d., drygajlo, a.: combined wiener and coherence filtering in wavelet domain for microphone array speech enhancement. in proceedings of the 1998 ieee international conference on acoustics, speech and signal processing, 12–15 may 1998, vol. 1 (1998), p. 385–388.
[11] psutka, j., müller, l., matoušek, j., radová, v.: mluvíme s počítačem česky [speaking czech with a computer]. prague: academia, 2006.
[12] simmer, k. u., bitzer, j., marro, c.: post-filtering techniques. in microphone arrays, berlin, heidelberg, new york: springer, may 2001, p. 39–57.
[13] simmer, k. u., wasiljeff, a.: adaptive microphone arrays for noise suppression in the frequency domain. in cost-229 workshop on adaptive algorithms in communications, bordeaux, france, sep 1992, p. 185–194.
[14] uhlíř, j., sovka, p.: číslicové zpracování signálů [digital signal processing]. prague: ctu publishing house, 2002.
[15] widrow, b., stearns, s. d.: adaptive signal processing. prentice-hall, 1985.
[16] zelinski, r.: a microphone array with adaptive post-filtering for noise reduction in reverberant rooms. in international conference on acoustics, speech and signal processing, new york, 1988, p. 2578–2581.

václav bolom
e-mail: bolomv1@fel.cvut.cz
department of circuit theory
czech technical university in prague, faculty of electrical engineering
technická 2, 166 27 prague, czech republic
czech participation in integral: 1996–2011

r. hudec, m. blažek, v. hudcová

abstract

the european space agency (esa) integral satellite, launched in october 2002, is the first astrophysical satellite of the european space agency with czech participation. the results of the first 8 years of investigations of various scientific targets are briefly presented and discussed here, with emphasis on cataclysmic variables and blazars observed with the esa integral satellite with czech participation.

1 introduction

there is a long tradition of involvement of czech scientists in high-energy space projects, starting nearly 40 years ago with czech involvement in various satellite experiments within the interkosmos programme. collaboration with the european space agency (esa) started soon after the political changes in czechoslovakia in 1989. the esa integral project was the first esa project in space astronomy with official czech participation, based on a collaboration agreement between esa and the czech republic, i.e. prior to full membership of the czech republic in esa. the integral (international gamma-ray astrophysics laboratory) satellite has now been in orbit for more than 8 years, and some general conclusions may be drawn at this point.

fig. 1: omc test device (providing real test images) operated at ai ondrejov prior to the launch of integral:
bart wide-field ccd camera, fov 6×7 degrees, limiting magnitude 15.5, identical with the integral omc test device (18 arcsec/pixel)

there are four co-aligned instruments on board integral: (1) the ibis gamma-ray imager (15 kev to 10 mev, fully coded field of view (fov) 8.3×8 deg, 12 arcmin fwhm), (2) the spi gamma-ray spectrometer (12 kev–8 mev, fully coded fov 16×16 deg), (3) the jem-x x-ray monitor (3–35 kev, fully illuminated fov diameter 4.8 deg), and (4) the omc optical monitoring camera (johnson v filter, fov 5×5 deg) (winkler et al. 2003). these experiments enable simultaneous observation in the optical, medium x-ray, hard x-ray and gamma spectral regions (or at least a suitable upper limit) for each object, assuming that it is inside the field of view. the basic observation modes are as follows: (a) regular (weekly) galactic plane scans (gps) (−14 deg < bii < +14 deg), (b) pointed observations (ao), (c) targets of opportunity (too). in this paper, we deal with examples of observations and analyses of integral data with czech participation, focusing on two categories of objects, namely cataclysmic variables (cvs) and blazars.

2 czech involvement in the integral project

czech involvement in the esa integral project started in 1996, when rene hudec was invited to join the omc and isdc consortia, on the basis of a collaboration agreement between esa and the czech republic. at that time, our participation focused on isdc and omc. in omc (optical monitoring camera), our participation focused on various software packages, such as omc ps (omc pointing software) for integral isoc, and also on the design, development and operation of omc td (test device), a ground-based camera with output analogous (pixel size 18 arcsec) to the real omc. for isdc (integral science and data centre), located in versoix, switzerland, the main part of our contribution involved providing manpower, i.e. one person working within the team, with various responsibilities and involvements in the isdc operations. as for the scientific responsibilities, rene hudec was delegated to lead the study of cataclysmic variables, and he was later also a member of the working groups on gamma-ray bursts (grbs) and agns. in this paper, we very briefly summarize the scientific achievements in these fields. in addition, we have participated in the development and operation of dedicated robotic telescopes, considered as the ground-based segment of the project, and in delivering supplementary optical data for satellite triggers. this work has been done mainly by young research fellows and by students. the czech scientific participation focused on topics allocated by integral bodies, mostly cataclysmic variables but also blazars and some other objects, such as gamma-ray bursts (grbs).

fig. 2: test image provided by the omc test device operated at the ondrejov observatory (fov 6×7 deg, lim mag 15.5, pixel size 18 arcsec, identical with the real omc camera)

fig. 3: example of an omc image from space

3 cataclysmic variables

responsibility for the category of cataclysmic variables (cvs) and related objects was delegated to rene hudec. the results of hard x-ray detections of these binary galactic objects were surprising. the soft x-ray emission of the group was already known in advance, but the hard x-ray extension to (in some cases) more than 80 kev was a new discovery. these findings have even led to the idea that cvs may make a contribution to the galactic x-ray background.
in total, 32 cataclysmic variables (cvs) have been detected by the integral ibis gamma-ray telescope (this is more than had been expected before launch, and represents almost 10 percent of integral detections). 22 cvs have been seen by ibis and found by the ibis survey team (barlow et al. 2006, bird et al. 2007, galis 2008), based on a correlation of the ibis data and the downes cv catalogue (downes et al. 2001). four sources are cv candidates revealed by optical spectroscopy of igr sources (masetti et al. 2006), i.e. new cvs, not in the downes catalogue. they are mainly magnetic systems: 22 were confirmed or probable ips, 4 probable magnetic cvs, 3 polars, 2 dwarf novae, 1 unknown. the vast majority have an orbital period porb > 3 hr, i.e. above the period gap (only one has porb < 3 hr), but 5 objects are long-period systems with porb > 7 hr. the long lifetime of the integral satellite (> 10 years) has enabled long-term variability studies, albeit limited by observation sampling. at least in some cases, the hard x-ray fluxes of cvs seen by integral exhibit time variations, very probably related to the activity/inactivity states of the objects. the spectra of the cvs observed by ibis are in most cases similar. a power law or thermal bremsstrahlung model compares well with the previous high-energy spectral fits (de martino et al. 2004, suleimanov et al. 2005, barlow et al. 2006). another surprise is that while the group of ips represents only ∼2 percent of the catalogued cvs, it dominates the group of cvs detected by ibis. more such detections and new identifications can therefore be expected, as confirmed by our search for ips in the ibis data, which provided 6 new detections (galis et al. 2008). many cvs covered by the core program (cp) remain unobservable by ibis because of short exposure time, but new cvs have been discovered. ibis tends to detect ips and asynchronous polars: in hard x-rays, these objects seem to be more luminous (up to a factor of 10) than synchronous polars. detection of cvs by ibis typically requires 150–250 ksec of exposure time or more, but some of them remained invisible even after 500 ksec. at least in some cases, however, this can be related to the activity state of the sources — the hard x-ray activity is temporary or variable.

fig. 4: preview of 32 cvs observed by integral — integral ibis sky coverage (up to march 2009)

fig. 5: symbiotic star rt cru observed as an ibis source up to energy 60 kev. the detection of symbiotic stars in hard x-rays by integral was a surprise

detecting hard x-ray flaring activity is another important issue. there is an indication of a hard x-ray flare in a cv system, namely v1223 sgr, seen by ibis (a flare lasting > 3.5 hr during revolution 61 (mjd 52743), with the peak flux > 3 times above the average (barlow et al. 2006)). these flares had already been seen in the optical in the past by a ground-based instrument (duration of several hours) (van amerongen & van paradijs 1989). this confirms the importance of an omc-like instrument (preferably with the same fov as the gamma-ray telescope) on board gamma-ray satellites: even with the v limiting mag 15, this can provide valuable simultaneous optical data for gamma-ray observations. analogous flares are also known for other ips in the optical, but not in hard x-rays. tv col (hudec et al.
2005) can serve as an example: in this system, 12 optical flares have been observed so far, five of them on archival plates from the bamberg observatory, and the remaining flares by other observers. tv col is an ip, and the optical counterpart of the x-ray source 2a0526–328 (cooke et al. 1978). this system is the first cv discovered through its x-ray emission, newly confirmed as an integral source. the physics behind the outbursts in ips is either the instability of the disk or an increase in the mass transfer from the secondary.

4 blazars

another category of integral targets that we have investigated is a special class of agn (active galactic nuclei), known as blazars. these objects belong to the most important and also optically most violently variable extragalactic high-energy objects. we focus on objects found by data mining in the integral archive for faint and hidden objects. for more details on blazar analyses with integral, see hudec et al. (2007). in addition, successful blazar observations have been performed mostly in the too regime. the extensive collaboration led by e. pian serves as an example (pian et al. 2007). we have developed procedures for accessing faint blazars in the ibis database. the blazar 1es 1959+650 can serve as an example. this blazar is a gamma-ray loud variable object, visible by ibis in 2006 only, and invisible in total mosaics and/or other periods. the optical light curve available for this blazar confirms the relation between the active gamma-ray state and the active optical state.

fig. 6: the most significant result of the ibis data mining procedure for faint sources. the flux corresponding to the excess in the lower spectral band for mrk 501 is (1.57 ± 0.24) × 10^−11 erg/cm²/s. the coordinates of the images are given in pixels, one pixel being 4.9 arcmin; the mosaics are centered on the catalogue position of the source

5 omc

the small optical camera (omc) on board the integral satellite delivered a great amount of valuable simultaneous optical data for observations of gamma-ray and hard x-ray sources. however, for gamma-ray
in addition, wehaveworkedon transferring isdc s/w packages, further development of tools for effective and interactive scientific analyses and the use of integral data, further improving the quality of astrometry and photometry, and on operating the second (local) isdc center/office (ondrejov integral data center, oidc) at the ondrejov astronomical institute, enabling the astronomical community in the czech republic and in central and east europe to participate in scientific activities related to integral, data evaluation, data archiving, and interpretation. the scientific activities have focused on allocated scientific tasks, especially cataclysmic variables and white dwarfs, gamma ray bursts, and blazars-agns. czech participation in isdc (integral science and data centre) in versoix has included the following tasks, which are listed below as examples. idx merge tool. idx merge is a tool program developed at isdc, which was used in archive processing for merging two fits indices. petr kubanek carried out benchmark tests to provide information about possible speed-up in this program. the tests identified the fast-merge patch as the best possible solution. a fast-merge patch was developed by petr kubanek, tests for this patch were made, and the patchwas delivered to isdc. the patch significantly speeded up archive processing. isr – integral source results web pages. based on discussions with mathias beck, mohamed meherga and roland walter, petr kubanek created the integralsourceresultswebpages. theseenabled users not familiar with osa (offline scientific analysis, a software package used for analysing integral data) to access data products from standard osa runs, which are executed at isdc (kubanek and hudec, 2007). the web pages were later further developed by the isdc staff. the web pages enabled access to light-curves, spectra and ibis andjem-xprocessed images of objects. they contained processed data for all public observations of sources that were flagged as detected in the integralsourceresults catalogue, version 15. isdc repeatably reprocessed this data with new osa releases, which provided better results on this quick-look page. a user guide for isr was written, and also a description of isr for the isdc newsletter. the isr perl source code was fully documented and delivered as the isdc sw package. lc extract tools. based on discussions with filipmunz,petrkubanekdeveloped lc extract tools, which were tested for extracting the countrates of weak sources from the integral ibis detector. lc extract uses a variant of the pure open/closemask element method to detect weak sources inside the ibis field of view. vo – virtual observatory at isdc almost all facilities dealing with astronomical data archiving contribute to the development of virtual observatory. virtual observatory can be used for quick multispectral analysis of various sources, and for computer-driven data mining and processing. it can help researchers to gain quick access to information that they need in a format that they can use, so that they can focus onvalidating their theories rather than on learning variousmethods for processing data from various earth and space based observatories. we contributed significantly to the development of virtual observatory by conforming access to integral data. petr kubanek prepared the environment for enablingvirtual observatory to provide access to integral data. this work included installing andconfiguring theapachetomcat server on the isdc solaris computers. 
he decided to implement the vo services at isdc as a set of java servlets. the main reason for this was his experience with java servlets, which he found superior to the perl::cgi approach. object-oriented programming (oop), which forms the basis of the java language, allows better design of complex programs than procedural programming. at the cost of a longer design phase, it enables initial small code subsets to grow into full-feature services. it also promotes separation of code into small subsets with clearly defined interfaces; thanks to this approach, the code can be reused. it should be noted that oop was also introduced to perl, but since perl was not invented for oop, the oop implementation in it comes at the cost of various design requirements, which layer oop over the original perl procedural language. java also introduced javadoc for writing documentation directly in the code. this enables better and more up-to-date documentation than writing separate programming documentation.

after deciding on the target environment (the java servlet container tomcat from the apache foundation), petr kubanek implemented vo access to the integral catalogue (a schematic stand-in for this kind of service is sketched below). this first servlet was used as the prototype for developing another servlet, which handles fits image search and extraction. we developed a prototype for vo access to integral ibis mosaics. the advanced vo access was later offered by the isdc staff to the world astronomical community, after further development of vo access, taking it from a prototype to production status, and including other high-level products in the vo database for all integral instruments.

as various vo developers have pointed out, the only currently available pure java library for accessing fits files, the nom.tam.fits library, has significant drawbacks. these include bugs in reading big gzipped files, resulting in an inability to read most of the integral data, and a lack of support for wcs (world coordinate system) extensions (which are used for storing information about the part of the sky that the image contains). petr kubanek patched the nom.tam.fits library so that he could use it in his vo servlets. however, on the basis of discussions with other vo developers, he decided to recode the java fits access library, so that it will not suffer from the drawbacks noted during its use. he has also made changes to the uk starlink starjava package, so that he can use it to quickly generate pages used for vo access. these efforts have been further developed by the isdc staff.

7 ground based segment

the optical camera (omc) on board integral has delivered valuable data, but there are some limits on magnitude, on accuracy, and on the available fov. there is an obvious need to provide additional optical data for simultaneous analyses of astrophysical objects detected by the onboard high-energy experiments, above all ibis. a similar procedure is considered for the esa gaia satellite, since the photometric sampling of the gaia photometry will not be optimal in many cases. for this reason we have from the beginning laid emphasis not only on space experiments but also on the related ground-based segment, namely optical ground-based experiments, with emphasis on robotic telescopes. the rts2 dedicated control program has been designed and developed. rts2 is installed and runs on (nowadays) numerous robotic telescopes, which are spread around the globe.
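the catalogue-access servlet mentioned above can be illustrated with a small language-neutral stand-in. the following python fragment is hypothetical — the real isdc services were java servlets, and the toy catalogue, column names, port and the flat-sky distance are all invented for the example — but it shows the shape of a cone-search-like http endpoint:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs

# toy in-memory "catalogue"; a real service would query a fits table or database
CATALOGUE = [
    {"name": "mrk 501",      "ra": 253.468, "dec": 39.760},
    {"name": "1es 1959+650", "ra": 299.999, "dec": 65.149},
]

class ConeSearchHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        q = parse_qs(urlparse(self.path).query)
        ra = float(q.get("ra", ["0"])[0])
        dec = float(q.get("dec", ["0"])[0])
        sr = float(q.get("sr", ["1"])[0])        # search radius in degrees
        # flat-sky distance: ignores spherical geometry and ra wraparound,
        # which is only acceptable for a toy example
        hits = [row for row in CATALOGUE
                if (row["ra"] - ra) ** 2 + (row["dec"] - dec) ** 2 <= sr ** 2]
        body = json.dumps(hits).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("localhost", 8080), ConeSearchHandler).serve_forever()
```

a request such as http://localhost:8080/?ra=253&dec=40&sr=2 would return the matching catalogue rows as json.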
rts2 was originally developed for conducting observations of gamma-ray burst error boxes in the optical window, but it has evolved into a full-featured package for any robotic telescope (http://rts2.org). on the basis of experience gained from developing rts1, rts2 is layered into an abstract, device-independent communication layer and drivers for various devices. thanks to this layering, new devices can be integrated smoothly and very rapidly into rts2. members of the czech integral group have continued to develop rts2 (remote telescope system, 2nd version). a major change involved separating the execution and selection logic, which had previously been handled by a single rts2 component (planc), into two independent components (rts2-executor and rts2-selector); a schematic sketch of this split is given below. this separation enabled us to better fulfil the different requirements of different scheduling algorithms for different telescopes (http://rts2.org).

rts2 was installed e.g. on the 60 cm bir (bootes infra-red) telescope. bir is located at the instituto de astrofisica de andalucia (iaa) observatorio de sierra nevada (osn). rts2 has also been installed on the fram telescope at the pierre auger observatory in argentina, where it is used for monitoring the atmospheric conditions above the pierre auger optical detectors. rts2 has also been installed on the watcher 1 telescope, located at the boyden observatory in south africa. as rts2 is released under the gnu licence, the university college dublin members who built watcher downloaded it, customized it to fit their purpose and installed it on their telescope. members of our group helped them with the installation process, and provided help in customizing rts2. rts2 was also customized for use on the mark telescope, which is located at the prague stefanik observatory, and we are running preliminary tests of rts2 at this site. thanks to the mark tests, rts2 acquired the ability to control observation setups with a cupola. this ability will be very important for the use of rts2 on larger telescopes; such use is currently under negotiation for various telescopes (more than 1 m in diameter). rts2 uses the libnova library to carry out various astronomical calculations. petr kubanek, who co-maintains libnova, has further patched and developed libnova.

8 other works

the secondary science centre in ondrejov has been put into operation. various versions of the osa data analysis software have been transferred and successfully installed. data for the most promising sources has been reprocessed with the osa software packages and organized into a local archive in a way that enables any combination of scws to be constructed on demand.

source database. considerable time has been spent on developing a web interface for the integral working groups devoted to studies of blazars (http://altamira.asu.cas.cz/iblwg) and cataclysmic variable (cv) stars (http://altamira.asu.cas.cz/icvwg). most of the features of these pages are supplied by a common code written in php with an underlying mysql database. this database is filled with information from the isdc archive (position and quality of individual pointings), and also with available high-energy data on the sources, to predict their possible detection using the integral instruments. we still lack information on the x-ray spectra of cvs (only a small collection of about 20 spectra from asca observations is available above 1 kev).
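the executor/selector split mentioned above can be sketched schematically. the fragment below is only an illustration of the design idea — rts2 itself is not written this way, and the target fields and the merit rule are invented for the example. the point of the split is that the scheduling policy lives entirely in the selector, so it can be replaced without touching the execution code:

```python
import time

class Selector:
    """picks the next target; only this class knows the scheduling policy."""
    def __init__(self, targets):
        self.targets = targets                      # list of dicts: name, priority, visible

    def next_target(self):
        visible = [t for t in self.targets if t["visible"]]
        return max(visible, key=lambda t: t["priority"], default=None)

class Executor:
    """runs observations; knows nothing about how targets are chosen."""
    def observe(self, target, exposure_s=1.0):
        print(f"slewing to {target['name']} and exposing {exposure_s} s")
        time.sleep(exposure_s)                       # stands in for mount/camera drivers

targets = [
    {"name": "grb error box",  "priority": 10, "visible": True},
    {"name": "blazar monitor", "priority": 3,  "visible": True},
]
selector, executor = Selector(targets), Executor()
t = selector.next_target()
if t is not None:
    executor.observe(t)
```

swapping in a different Selector (e.g. one driven by the mysql pointing database described above) leaves the Executor untouched, which is exactly the flexibility the separation was meant to provide.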
more recently, a large new set of possible blazar positions (about 700, half of them corresponding to veron-cetty agn locations) from astro-ph/0506720 has been included. we are currently checking the candidates with the highest exposures. an important feature of these web pages is a scheduler that uses data from the isoc pages (a complex script for retrieving scheduled pointings and importing them into the database was written by jiri polcar). this allows us to plan simultaneous observations with optical telescopes in advance, not just to react to gcn alerts about new integral pointings (which is more suitable for robotic telescopes). in some cases, a given source is below the horizon at the time of the integral observation, so optical monitoring should be performed before the alert is issued.

weak source analysis. our basic tool for extracting physical data from reconstructed images is mosaic_spec, a small program in c intended originally as an alternative to standard spectral analysis (started at isdc with roland walter in 2004). while the old version is currently employed by the integral source results web interface to the isdc archive, the czech team members have added some new features that allow us to obtain more information about the shape of the analysed peak in an ibis image and about the properties of the background (to sort out most of the false detections), and finally to retrieve a cutout from a large mosaic. this latter feature allows us to reduce significantly (by several orders of magnitude) the amount of data that needs to be transferred from isdc when analyzing large mosaics (either available directly for pre-defined observation groups — ogs — or constructed from selected scws using the ii_skyimage mosaicking capability). since the energy binning of the reconstructed ibis images in the isdc archive (revision 2) is too fine for a search for weak sources, mosaic_spec can also sum up several energy bins together to improve the statistics. the cutouts should soon be available (once the bitmap conversion has been mastered).

the analysis of short-time pre-defined ogs (up to 3 days in length) is well suited for studies of blazars (whose flares can appear on these time scales) but not so well suited for a search for cataclysmic variables (whose variability has a more periodic nature). where their basic periods (orbital, rotation, or the beat of these two) are known, we can employ phase-resolved analysis. a new tool called lc_extract has been developed for this purpose by petr kubanek. it uses a pixel illumination factor (pif) method similar to the standard ibis light curve extraction process, but it should be less sensitive to variations of the fluxes of strong sources in the field of view (which is the case for cvs close to the galactic bulge); a schematic sketch of the pif idea is given below. the production version of this tool script includes gti and noisy pixel treatment.

more recently, we have participated in the efforts at isdc to further develop isdc into a more general scientific and data centre for space astronomy. in addition, data from astronomical plate archives have also been analysed for some of the targets, adding an additional time dimension to the investigations, identification and classification of integral sources. not only the long-term evolution of the optical light curves of objects (in some cases for up to 100 years) can be studied this way; low-dispersion spectra (for various time epochs) can also be extracted and analysed.
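to make the pif idea concrete, here is a minimal sketch. it is an illustration only, not the lc_extract code: the detector frames, the pif map and the joint flat-background fit are invented for the example. each time bin's counts are fitted as flux × pif + background by least squares, so pixels that are more illuminated by the source contribute more to the flux estimate:

```python
import numpy as np

def pif_lightcurve(frames, pif, mask=None):
    """least-squares flux per time bin: counts_i ≈ flux * pif_i + background.

    frames : (ntime, npix) detector counts per time bin
    pif    : (npix,) pixel illumination factor of the source, values in [0, 1]
    mask   : optional (npix,) boolean, false for noisy/dead pixels (gti-like cut)
    """
    if mask is None:
        mask = np.ones(pif.shape, dtype=bool)
    p = pif[mask]
    a = np.column_stack([p, np.ones_like(p)])   # design matrix [pif, 1]
    fluxes = []
    for c in frames[:, mask]:
        (flux, bkg), *_ = np.linalg.lstsq(a, c, rcond=None)
        fluxes.append(flux)
    return np.array(fluxes)

# toy usage: 5 time bins, 1000 pixels, true flux 3.0 on top of background 10
rng = np.random.default_rng(0)
pif = rng.uniform(0.0, 1.0, 1000)
frames = rng.poisson(10 + 3.0 * pif, size=(5, 1000)).astype(float)
print(pif_lightcurve(frames, pif))   # values scatter around 3.0
```

fitting the background jointly in each time bin is what makes the estimate less sensitive to slow variations of the overall count level, in the spirit of the robustness to strong field sources mentioned above.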
9 conclusions

the czech integral team has contributed to various fields of integral science; only a few examples have been given in this paper. in general, the integral satellite opens a new 10–100 kev x-ray observational window to which there had previously been only very limited access. the x-ray emission of some cvs and sss extends to 80 kev. our results confirm that the integral satellite is an effective tool for finding new cvs, mainly ips. the contribution of the czech participants in the esa integral project has focused on the onboard omc camera, on work at isdc, and on integral science, with emphasis on cataclysmic variables and blazars. the integral satellite is clearly an effective tool for analyzing both cvs and blazars. so far, 21 blazars, 32 cvs and 3 symbiotics have been detected, and the number is increasing with time. the successful observations of cvs using integral provide proof that cvs can be successfully detected and observed in hard x-rays with integral (for most cvs, these are considerably harder passbands than had been possible previously). these results show that more cvs (in harder passbands) will be detectable with increasing integration time. there is also an increasing probability of detecting objects in outbursts, high and low states, etc. simultaneous hard x-ray and optical monitoring of cvs and blazars (or at least suitable upper limits) can provide valuable inputs for a better understanding of the physical processes that are involved.

acknowledgement

the international gamma-ray astrophysics laboratory (integral) is a european space agency mission with the instruments and the scientific data centre funded by the esa member states (especially the pi countries: denmark, france, germany, italy, spain, switzerland), the czech republic and poland, and with the participation of russia and the usa. this study was supported by the esa pecs integral 98023 project and partly by grant 205/08/1207. the plate analyses have recently been supported by msmt me09027. the work described in this paper has been supported by current and past members of the czech integral team and contributors, mainly p. kubánek, j. štrobl, v. hudcová, m. kocka, m. blažek, r. gális, m. nekola, c. polášek, j. polcar, f. munz, v. šimon, i. sujová, f. hroch, m. topinka, and others.

references

[1] barlow, e. j., knigge, c., bird, a. j., et al.: mnras, 372, 224, 2006.
[2] bianchini, a., sabbadin, f.: ibvs, 2751, 1, 1985.
[3] bird, a. j., malizia, a., bazzano, a., et al.: apj suppl. s., 170, 175, 2007.
[4] cooke, b. a., ricketts, m. j., maccacaro, t., et al.: mnras, 182, 489, 1978.
[5] de martino, d., matt, g., belloni, t., et al.: a&a, 415, 1009, 2004.
[6] downes, r. a., webbink, r. f., shara, m. m., et al.: pasp, 113, 764, 2001.
[7] garnavich, p., szkody, p.: pasp, 100, 1522, 1988.
[8] hudec, r.: baic, 32, 93, 1981.
[9] hudec, r., šimon, v., skalický, j.: the astrophysics of cataclysmic variables and related objects, proc. of asp conf. vol. 330. san francisco: asp, 2005, p. 405.
[10] hudec, r., simon, v., munz, f., galis, r., strobl, j.: integral results on cataclysmic variables and related objects, presented at the integral science workshop, sardinia, oct 2007. http://projects.iasf-roma.inaf.it/integral/integral5thanniversarypresentations.asp
[11] ishida, m., sakao, t., makishima, k., et al.: mnras, 254, 647, 1992.
[12] king, a. r., ricketts, m. j., warwick, r. s.: mnras, 187, 77, 1979.
[13] masetti, n., morelli, l., palazzi, e., et al.: a&a, 459, 21, 2006.
[14] meintjes, p. j., raubenheimer, b. c., de jager, o. c., et al.: apj, 401, 325, 1992.
[15] motch, c., haberl, f.: proceedings of the cape workshop on magnetic cataclysmic variables. san francisco: asp, 1995, vol. 85, p. 109.
[16] motch, c., et al.: a&a, 307, 459, 1996.
[17] mürset, u., jordan, s., wolff, b.: supersoft x-ray sources, lnp, 1996, vol. 472, p. 251.
[18] patterson, j., skillman, d. r., thorstensen, j., et al.: pasp, 107, 307, 1995.
[19] šimon, v., hudec, r., štrobl, j., et al.: the astrophysics of cataclysmic variables and related objects, proc. of asp conf. vol. 330, san francisco: asp, 2005, p. 477.
[20] sokoloski, j. l., luna, g. j. m., bird, a. j., et al.: aas, 207, 3207, 2005.
[21] suleimanov, v., revnivtsev, m., ritter, h.: a&a, 443, 291, 2005.
[22] van amerongen, s., van paradijs, j.: a&a, 219, 195, 1989.
[23] watson, m. g., king, a. r., osborne, j.: mnras, 212, 917, 1985.
[24] winkler, c., courvoisier, t. j.-l., di cocco, g., et al.: a&a, 411, l1, 2003.
[25] galis, r.: proceedings of the 7th integral workshop, proceedings of science, 2008.
[26] hudec, r., et al.: nuclear physics b proceedings supplements, 2007, vol. 166, p. 255–257.
[27] pian, e., et al.: observations of blazars in outburst with the integral satellite, in triggering relativistic jets (eds. william h. lee, enrico ramírez-ruiz), revista mexicana de astronomia y astrofisica (serie de conferencias), vol. 27, 2007, contents of supplementary cd, p. 204.
[28] kubanek, p., hudec, r.: proceedings of the 6th integral workshop, 2007. http://hea.iki.rssi.ru/integral06/papers/

rené hudec
e-mail: rene.hudec@gmail.com
astronomical institute, academy of sciences of the czech republic, cz-25165 ondřejov, czech republic
czech technical university in prague, faculty of electrical engineering, technicka 2, cz-16627 prague, czech republic

martin blažek
astronomical institute, academy of sciences of the czech republic, cz-25165 ondřejov, czech republic
czech technical university in prague, faculty of electrical engineering, technicka 2, cz-16627 prague, czech republic

věra hudcová
astronomical institute, academy of sciences of the czech republic, cz-25165 ondřejov, czech republic

from gauge anomalies to gerbes and gerbal representations: group cocycles in quantum theory

j. mickelsson

abstract

in this paper i shall discuss the role of group cohomology in quantum mechanics and quantum field theory. first, i recall how cocycles of degree 1 and 2 appear naturally in the context of gauge anomalies. then we investigate how group cohomology of degree 3 comes from a prolongation problem for group extensions and we discuss its role in quantum field theory. finally, we discuss a generalization of representation theory where a representation is replaced by a 1-cocycle or its prolongation by a circle, and point out how this type of situation comes up in the quantization of yang-mills theory.

1 introduction

a projective bundle over a base m is completely determined, up to equivalence, by the dixmier-douady class, which is an element of h^3(m, z). this is the origin of gerbes in quantum field theory: a standard example of this type of situation is the case when m is the moduli space of gauge connections in a vector bundle over a compact spin manifold, [5]. topologically, a gerbe on a space m is just an equivalence class of pu(h) = u(h)/s^1 bundles over m. here u(h) is the (contractible) unitary group in a complex hilbert space h.
in terms of čech cohomology subordinate to a good cover {u_α} of m, the gerbe is given as a c^×-valued cocycle {f_αβγ} with

f_αβγ f_αβδ^{-1} f_αγδ f_βγδ^{-1} = 1

on intersections u_α ∩ u_β ∩ u_γ ∩ u_δ. this cocycle arises from the lifting problem: a pu(h) bundle is given in terms of transition functions g_αβ with values in pu(h). after lifting these to u(h), one gets a family of functions ĝ_αβ which satisfy the 1-cocycle condition up to a phase,

ĝ_αβ ĝ_βγ ĝ_γα = f_αβγ · 1.

the notion of a gerbal representation was introduced in the recent paper [7]. this is to be viewed as the next level after projective actions related to central extensions of groups, and is given in terms of third group cohomology. one can view this setting as a categorification of the representation theory of central extensions of groups. we shall not discuss the problem in this generality, since our categories are of a special kind: a category of groups for us is just a principal bundle over a base m. each fiber can be identified as a group g, but only after fixing a point in the fiber and calling the chosen point the unit element in g. fixing a representation of g defines in a standard way a vector bundle over m. however, if the representation is only a projective representation, we obtain in general only a projective vector bundle over m. we are now in the setting for gerbes, and we have a characteristic class in h^3(m, z). but there is a role also for third group cohomology. in fact, the appearance of third group cohomology in this context is not new, and it is related to group extensions, as explained in [9]. in the simplest form, the problem is the following. let f be an extension of g by the group n,

1 → n → f → g → 1

an exact sequence of groups. suppose that

1 → a → n̂ → n → 1

is a central extension by the abelian group a. then one can ask whether the extension f of g by n can be prolonged to an extension of g by the group n̂. the obstruction to this is an element in the group cohomology h^3(g, a) with coefficients in a. in the case of lie groups, there is a corresponding lie algebra cocycle representing a class in h^3(g, a), where a is the lie algebra of a. we shall demonstrate this in detail for an example arising from the quantization of gauge theory. it is closely related to the idea in [2], further elaborated in [3], which in turn was a response to a discussion in 1985 on the breaking of the jacobi identity for the field algebra in yang-mills theory [6].

the paper is organized as follows. in section 2 we recall the basics about the role of group cohomology of degree 1 and 2 in quantum theory. in section 3 we then explain how 3-cocycles come from a prolongation problem for group extensions. in section 4 we take as an example the gauge group extensions arising from the gauge action on bundles of fermionic fock spaces over background gauge fields, and the corresponding lie algebra cocycles. in sections 5 and 6 we explain the use of 1-cocycles as generalized representations, with an example from quantum field theory.

2 cocycles of degree 1 and 2 in quantum theory

in quantum mechanics a symmetry group g (e.g., the group of galilean symmetries) acts on schrödinger wave functions as

(t(g)ψ)(x) = ω(g; x) ψ(g^{-1}x),

where ω(g; x), for g ∈ g, is a (matrix-valued) phase factor. in order that the group multiplication rule is preserved, it has to satisfy as a consistency condition the 1-cocycle condition

ω(g_1; x) ω(g_2; g_1^{-1}x) ω(g_1g_2; x)^{-1} = 1.
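a one-line check (standard material, not specific to this paper) shows why this condition is exactly the requirement that t preserve the group law:

```latex
\begin{aligned}
(T(g_1)T(g_2)\psi)(x) &= \omega(g_1;x)\,(T(g_2)\psi)(g_1^{-1}x)
  = \omega(g_1;x)\,\omega(g_2;g_1^{-1}x)\,\psi\big((g_1g_2)^{-1}x\big),\\
(T(g_1g_2)\psi)(x) &= \omega(g_1g_2;x)\,\psi\big((g_1g_2)^{-1}x\big),
\end{aligned}
```

so t(g_1)t(g_2) = t(g_1g_2) for all wave functions precisely when the 1-cocycle condition above holds.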
it may happen that the cocycle condition does not hold, for example in the case of galilean transformations on wave functions of massive particles; in this case the left-hand side defines an s^1-valued 2-cocycle (it does not depend on the coordinate x). the representation of the galilei group is now projective, but it can still be viewed as a true representation of a central extension ĝ of g, [1].

1-cocycles appear also in the context of symmetry breaking in qft. classically, one might expect that the (exponentiated) quantum action is invariant under a group g (the group of gauge symmetries or the group of diffeomorphisms of space-time),

z(a) = z(a^g),

where a denotes a set of fields; in the case of a gauge action the right action is a^g = g^{-1}ag + g^{-1}dg. but in the case of the chiral anomaly, for example,

z(a^g) = ω(g; a) z(a),

where ω is a phase factor. consistency requires again that ω is a 1-cocycle. however, unlike in the case of the galilei group, the nontrivial 1-cocycle has serious physical consequences: it signals the breaking of gauge symmetry. nontriviality means that there is no way to modify the quantum effective action by a multiplicative phase, z(a) ↦ z′(a) = η(a)z(a), such that the modified action z′ would be gauge invariant. this means that

ω(g; a) ≠ η(a^g) η(a)^{-1}

for any phase function η. in the case of an equality, we say that the 1-cocycle ω is a coboundary of the 0-cochain η.

denote by a the space of all (smooth) fields a. if g acts smoothly and freely on a, then x = a/g is a manifold and the cocycle ω defines a complex line bundle l. sections of l are complex functions on a satisfying ψ(a^g) = ω(g; a)ψ(a). the complex line bundle has chern class c ∈ h^2(x, z), which is obtained by transgression from ω. in the case of the chiral anomaly, the action z is thus a section of the complex line bundle l, which is called the dirac determinant bundle. indeed, the function z(a) can be viewed as a regularized determinant of the massless weyl-dirac operator d_a^+, which is a linear map from the left-handed spinor fields to the right-handed sector. to define the determinant, one fixes a map d_{a_0}^- from the right-handed sector to the left-handed sector, by fixing a background potential a_0, and then one applies the zeta function regularization to the determinant of the operator d_{a_0}^- d_a^+.

in hamiltonian quantization, the breaking of (gauge, diffeomorphism) symmetry is best seen in the modified commutation relations in the lie algebra g of g,

[x, y] ↦ [x, y] + c(x, y),

where c takes values in an abelian ideal a; in the simplest case a = c, and c satisfies the lie algebra 2-cocycle condition

c(x, [y, z]) + c(y, [z, x]) + c(z, [x, y]) = 0.

a famous example is given by the central extension of the loop algebra lg of smooth functions on the unit circle with values in a semisimple lie algebra g,

c(x, y) = (k / 2πi) ∫_{s^1} ⟨x(φ), y′(φ)⟩ dφ,

where k is a constant (the “level” of a representation), which is equal to a nonnegative integer in a positive energy representation when the invariant bilinear form ⟨·, ·⟩ on g is properly normalized. given a (central) extension of a lie algebra, one expects that there is a central extension of the corresponding group. in the case of lg, the group is the loop group lg of maps from s^1 to a (compact) group g. a central extension of lg would then be given by a circle-valued function ω: lg × lg → s^1 with the group 2-cocycle property

ω(g_1, g_2) ω(g_1g_2, g_3) = ω(g_1, g_2g_3) ω(g_2, g_3).

however, in the case of lg there is a topological obstruction: ω is defined only in an open neighborhood of the unit element.
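before turning to this obstruction, note that the loop algebra 2-cocycle written above can be checked numerically on concrete loops. the sketch below is an illustration only: the su(2)-valued trigonometric loops, the choice ⟨a, b⟩ = tr(ab), the spectral derivative and all names are assumptions made for the example.

```python
import numpy as np

N = 256                                   # grid points on the circle
phi = 2 * np.pi * np.arange(N) / N
PAULI = [np.array([[0, 1], [1, 0]], complex),
         np.array([[0, -1j], [1j, 0]], complex),
         np.array([[1, 0], [0, -1]], complex)]

def random_loop(rng, modes=3):
    """su(2)-valued trigonometric polynomial x(phi), shape (N, 2, 2)."""
    x = np.zeros((N, 2, 2), complex)
    for s in PAULI:
        for m in range(modes + 1):
            a, b = rng.standard_normal(2)
            x += (a * np.cos(m * phi) + b * np.sin(m * phi))[:, None, None] * (1j * s)
    return x

def dphi(x):
    """spectral derivative along the loop parameter."""
    k = 1j * np.fft.fftfreq(N, d=1.0 / N)
    return np.fft.ifft(k[:, None, None] * np.fft.fft(x, axis=0), axis=0)

def bracket(x, y):
    return x @ y - y @ x                  # pointwise matrix commutator

def c(x, y, k_level=1.0):
    """c(x, y) = (k / 2 pi i) * integral of tr(x(phi) y'(phi)) dphi."""
    integrand = np.trace(x @ dphi(y), axis1=1, axis2=2)
    return k_level / (2j * np.pi) * integrand.sum() * (2 * np.pi / N)

rng = np.random.default_rng(1)
x, y, z = (random_loop(rng) for _ in range(3))
print(abs(c(x, bracket(y, z)) + c(y, bracket(z, x)) + c(z, bracket(x, y))))
# the cyclic sum is zero to machine precision: c is a 2-cocycle
```

the vanishing rests on the invariance of the trace form and an integration by parts on the circle, which is exactly the content of the 2-cocycle condition stated above.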
the obstruction is given by an element in h^2(lg, z) whose de rham representative is a left-invariant 2-form fixed by the lie algebra 2-cocycle c. in the case of a compact simple simply connected lie group, the cohomology h^2(lg, z) is equal to h^3(g, z), which is equal to z. the lie algebra cocycle for lg can be viewed as a left-invariant 2-form on the group lg. for a correct choice of normalization of the bilinear form ⟨·, ·⟩ on g, the generator in h^2(lg, z) corresponds to the basic extension with level k = 1.

3 3-cocycles

group and lie algebra cohomology with coefficients in an abelian group (lie algebra) is defined in any degree. so what about degree 3? and the relation to de rham cohomology in dimension 3? first, let us recall the basic definitions. assume that a group g acts as automorphisms of an abelian group a. a map f: g × g × … × g → a (n arguments) is an n-cocycle if δf = 0, where the coboundary operator δ is defined by

(δf)(g_1, g_2, …, g_{n+1}) = g_1 · f(g_2, …, g_{n+1}) + Σ_{i=1}^{n} (−1)^i f(g_1, …, g_i g_{i+1}, …, g_{n+1}) + (−1)^{n+1} f(g_1, …, g_n).

cocycles of the type δf are exact cocycles. the group cohomology in degree n is then defined as the abelian group h^n(g; a) of n-cocycles modulo exact cocycles. in the case of a lie algebra g, the cochains are alternating multilinear maps f: g × … × g → a, and the cocycles are elements in the kernel of the lie algebra coboundary operator, which is now defined as

(δf)(x_1, …, x_{n+1}) = Σ_{i<j} (−1)^{i+j} f([x_i, x_j], x_1, …, x̂_i, …, x̂_j, …, x_{n+1}) + Σ_i (−1)^{i+1} x_i · f(x_1, …, x̂_i, …, x_{n+1}),

where a hat denotes an omitted argument.

… > dim m, [15]. according to richard palais, u_p is homotopy equivalent to u_2 for all p ≥ 1, [16], so we can define generalized representations of map(m, g) from this equivalence and the embedding into u_p.

6 application to gauge theory

let d_a be a dirac hamiltonian on an odd-dimensional compact spin manifold, coupled to a gauge potential a. the quantization d̂_a of d_a acts in a fermionic fock space. for different potentials the representations of the fermion algebra are inequivalent [17]. in scattering problems one would like to realize the operators d̂_a in a single fock space f, the fock space of free fermions, a = 0. this can be achieved by choosing for each a a unitary operator t_a which reduces the off-diagonal blocks of d_a, with respect to the ‘free’ polarization ε = d_0/|d_0|, to hilbert-schmidt operators. then each d′_a = t_a^{-1} d_a t_a can be quantized in the free fock space, [12], [13]. this has a consequence for the implementation of the gauge action a ↦ a^g = g^{-1}ag + g^{-1}dg in the fock space. in the 1-particle space the action of g is replaced by

ω(a; g) = t_a^{-1} g t_{a^g}, with ω(a; gg′) = ω(a; g) ω(a^g; g′).

now the shale-stinespring condition [ε, ω(a; g)] ∈ l_2 is satisfied and we can quantize in f, ω(a; g) ↦ ω̂(a; g). for the lie algebra of the gauge group we have the lie algebra cocycle

dω(a; x) = t_a^{-1} x t_a + t_a^{-1} (l_x t_a),

with quantization d̂ω(a; x). let x be an element in the lie algebra map(m, g) of the gauge group g = map(m, g). its quantization is then

g_x = l_x + d̂ω(a; x),

where l_x is the lie derivative in the direction of x, corresponding to the infinitesimal gauge action a ↦ [a, x] + dx. the commutation relations are now modified by a cocycle c,

[g_x, g_y] = g_{[x,y]} + c(a; x, y),

l_x c(a; y, z) + c(a; x, [y, z]) + cyclic combinations = 0.

the lie derivative l_x is needed since the fock spaces f depend on the external gauge field a: although as hilbert spaces the f_a are all the same, the gauge action explicitly depends on a. the 2-cocycle property guarantees the jacobi identity for the extension lie(ĝ) = map(m, g) ⊕ map(a, c).
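the multiplicative property of ω quoted above can be verified in one line from its definition ω(a; g) = t_a^{-1} g t_{a^g}, using only (a^g)^{g′} = a^{gg′}:

```latex
\omega(A;g)\,\omega(A^{g};g')
  = T_A^{-1}\,g\,T_{A^{g}}\;T_{A^{g}}^{-1}\,g'\,T_{(A^{g})^{g'}}
  = T_A^{-1}\,(g g')\,T_{A^{g g'}}
  = \omega(A;g g').
```

this is the 1-cocycle property of ω over the space of gauge potentials, the group-level counterpart of the lie algebra cocycle dω(a; x).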
in the case when m = s^1 one can take t_a = 1, and we obtain the standard central extension of the loop algebra map(s^1, g). in the case when dim m = 3 one can show that the 2-cocycle c is equivalent to the local form

c ≡ const. ∫_m tr a [dx, dy],

where the trace under the integral sign is computed in a finite-dimensional representation of g. this representation is the same as the one defined by the g-action on fermions in the 1-particle space. actually, the coefficient in front of the integral is nonzero only for chiral fermions (the schwinger terms from the left and right chiral sectors cancel).

references

[1] bargmann, v.: on unitary ray representations of continuous groups. ann. of math. (2) 59, (1954), 1–46.
[2] carey, a. l.: the origin of three-cocycles in quantum field theory. phys. lett. b 194, (1987), 267–270.
[3] carey, a. l., grundling, h., raeburn, i., sutherland, c.: group actions on c*-algebras, 3-cocycles and quantum field theory. commun. math. phys. 168, (1995), 389–416.
[4] carey, a. l., crowley, d., murray, m. k.: principal bundles and the dixmier douady class. comm. math. phys. 193, (1998), no. 1, 171–196.
[5] carey, a. l., mickelsson, j., murray, m. k.: bundle gerbes applied to quantum field theory. rev. math. phys. 12, (2000), no. 1, 65–90.
[6] grossman, b.: the meaning of the third cocycle in the group cohomology of nonabelian gauge theories. phys. lett. b 160, (1985), 94–100. jackiw, r.: three-cocycle in mathematics and physics. phys. rev. lett. 54, (1985), 159–162. jo, s. g.: commutator of gauge generators in nonabelian chiral theory. nuclear phys. b 259, (1985), 616–636.
[7] frenkel, e., xinwen zhu: gerbal representations of double loop groups. arxiv:math/0810.1487.
[8] hekmati, p., mickelsson, j.: fractional loop group and twisted k-theory. arxiv:0801.2522.
[9] maclane, saunders: homology. die grundlehren der mathematischen wissenschaften, band 114. springer verlag (1963).
[10] kac, victor: infinite-dimensional lie algebras. third edition. cambridge university press, cambridge, (1990).
[11] pressley, a., segal, g.: loop groups. oxford mathematical monographs. the clarendon press, oxford university press, new york, (1986).
[12] mickelsson, j.: wodzicki residue and anomalies of current algebras. integrable models and strings (espoo, 1993), 123–135, lecture notes in phys., 436, springer, berlin, (1994).
[13] langmann, e., mickelsson, j.: scattering matrix in external field problems. j. math. phys. 37, (1996), no. 8, 3933–3953.
[14] mickelsson, j.: from gauge anomalies to gerbes and gerbal actions. arxiv:0812.1640. to be published in the proceedings of “motives, quantum field theory, and pseudodifferential operators”, boston university, june 2–13, 2008.
[15] mickelsson, j., rajeev, s. g.: current algebras in d + 1 dimensions and determinant bundles over infinite-dimensional grassmannians. comm. math. phys. 116, (1988), no. 3, 365–400.
[16] palais, r.: on the homotopy type of certain groups of operators. topology 3, (1965), 271–279.
[17] shale, david, stinespring, w. f.: spinor representations of infinite orthogonal groups. j. math. mech. 14, (1965), 315–322.

jouko mickelsson
department of mathematics and statistics, university of helsinki
department of theoretical physics, royal institute of technology, stockholm

structural analysis of steel structures under fire loading

c. crosti

abstract

this paper focuses on the structural analysis of a steel structure under fire loading. in this framework, the objective is to highlight the importance of the right choice of analyses to develop, and of finite element codes able to model the resistance and stiffness reduction due to the temperature increase. in addition, the evaluation of the structural collapse under fire load of a real building is considered, paying attention to the global behavior of the structure itself.

keywords: thermo-plastic material, non-linear analysis, steel structure, thermal buckling, bowing effect, structural collapse.

1 introduction

this paper focuses on the structural analysis of a steel structure under fire loading.
the use of an analysis with thermo-plastic materials and geometric nonlinearities, together with modeling of the fire action using parametric curves, allows a faithful evaluation of the effective behavior of steel structures subjected to fire. in this context, once these two basic factors are clarified for isolated beam elements, they are applied to a steel structure under fire loading. this is done to highlight the importance of the right choice of analyses to develop, and of finite element codes that are able to model the resistance and stiffness reduction due to the temperature increase. in addition, considering that for such a structure the evaluation of the structural collapse is very tricky and depends on many factors, a factor worth highlighting in a performance-based approach is the global configuration of the structure itself.

2 definition of the generic model

when assessing the fire resistance of steel elements, an important factor to take into account is the effect of temperature on the material in question. in the case of steel, in particular, the yield strength, the element ductility and its elastic properties, e.g. young’s modulus, poisson’s ratio and the proportional limit of stress, are strongly influenced by the temperature increase. these characteristics therefore have to be used in the modeling, in order to achieve an accurate fire resistance assessment. in this paper, the material used is of a thermo-plastic type. the mechanical evolutive parameters taken into account in structures subjected to fire are:

- young’s modulus, e0;
- effective yield stress, sy, which represents the maximum capacity of the material;
- coefficient of thermal expansion, αt.

a simple proportional ratio between the values at ambient temperature and the values at a specific temperature [1] provides the evolution of these parameters as a result of the temperature increase. table 1 lists how much these parameters change with the temperature increase.
table 1: temperature effect on a thermo-plastic material

t (°c)   e (pa)     sy (pa)    epl (pa)   αt (1/°c)   eu
0        2.10e+11   2.35e+08   1.05e+10   1.17e−05    0.20
20       2.10e+11   2.35e+08   1.05e+10   1.17e−05    0.20
100      2.10e+11   2.35e+08   1.05e+10   1.20e−05    0.20
200      1.89e+11   2.35e+08   9.45e+09   1.23e−05    0.20
300      1.68e+11   2.35e+08   8.40e+09   1.26e−05    0.20
400      1.47e+11   2.35e+08   7.35e+09   1.30e−05    0.20
500      1.26e+11   1.83e+08   6.30e+09   1.31e−05    0.20
600      6.51e+10   1.10e+08   3.26e+09   1.34e−05    0.20
700      2.73e+10   5.41e+07   1.37e+09   1.36e−05    0.20
800      1.89e+10   2.59e+07   9.45e+08   1.38e−05    0.20
900      1.42e+10   1.41e+07   7.09e+08   1.40e−05    0.20
1000     9.45e+09   9.40e+06   4.73e+08   1.42e−05    0.20

a thermo-plastic material requires the use of a finite element code (fec) which enables the correct implementation of these well-known evolutive parameters. not all fecs allow the evaluation of thermo-plastic material behavior. the calculation method finds numerical results by means of an iterative procedure, i.e. a procedure that updates the parameters with the temperature increase. for solving the problem, it is therefore necessary to develop a nonlinear analysis, which is the only way to obtain a realistic representation of the problem. in particular, geometric effects are very important for the study of structures under fire action: where these effects are not taken into consideration, buckling phenomena are not included in the analysis. this can be an important error, especially in the case of redundant structures. this aspect will be highlighted in the course of the paper.

in this paper, the fire action is applied to the element as a temperature curve; the assigned curve is the parametric curve called iso 834 [1], with a zero thermal gradient in each finite element. this particular curve is used for identifying the class of fire resistance of a general structure, as well as for a general conventional assessment of fire resistance.

3 numerical analysis on isolated elements

this section shows two basic examples related to steel elements subjected to fire action. these two examples enable the chosen model to be validated, evaluating how two different finite element codes, adina [2] and strand [3], approach the fire resistance of steel structures. these cases also enable us to deal with some typical behavior of a simple structure in which the collapse is due to loss of resistance and stiffness [4].

3.1 structural elements with unconstrained lateral expansion

a three-meter-long simply supported beam has been considered, with a rectangular section of 0.3×0.3 meters. this beam is subjected to a constant vertical force (f) at midspan of 1410 kn (fig. 1). by means of analyses accounting for material and geometric nonlinearity, it is possible, for example, to evaluate how the two finite element codes adina [2] and strand [3] manage to capture some of the phenomena that frequently emerge in steel elements subjected to fire action. one important consideration is the trend of the displacements in time, i.e. during the temperature increase. as an example, the trend of the vertical displacement of the node at the midspan of the beam is shown in fig. 2, which shows that after 400 s there are noteworthy displacements, because this time limit represents a critical time of fire resistance for the considered element. a similar consideration can be made by observing the trend of the horizontal displacement of the simple support (fig.
3 numerical analysis on isolated elements

this section shows two basic examples related to steel elements subjected to fire action. these two examples enable the chosen model to be validated, evaluating how two different finite element codes, adina [2] and strand [3], approach the fire resistance of steel structures. these cases also enable us to deal with some typical behavior of a simple structure in which the collapse is due to loss of resistance and stiffness [4].

3.1 structural elements with unconstrained lateral expansion

a three-meter-long simply supported beam has been considered, with a rectangular section of 0.3×0.3 meters. this beam is subjected to a constant vertical force (f) of 1410 kn at the midspan (fig. 1).

fig. 1: mesh of the element (nodes 1–11, elements 1–10)

by means of analyses accounting for material and geometric nonlinearity, it is possible to evaluate how the two finite element codes, adina [2] and strand [3], manage to capture some of the phenomena that frequently emerge in steel elements subject to fire action. one important consideration is the trend of the displacements in time, i.e. during the temperature increase. as an example, the trend of the vertical displacement of the node at the midspan of the beam is shown in fig. 2.

fig. 2: trend of vertical displacement in midspan, node no. 6 (fig. 1)

fig. 2 shows that after 400 sec there are noteworthy displacements, due to the fact that this time limit represents a critical time of fire resistance for the considered element. a similar consideration can be made observing the trend of the horizontal displacement of the simple support (fig. 3), in which it is nevertheless possible to point out a phenomenon very frequent in structures subjected to fire.

fig. 3: trend of horizontal displacement of node no. 11 (fig. 1)

in fact, after an initial progressive lateral expansion, due to the initial temperature rise that generates positive displacements, the trend of the displacement is inverted. this inversion occurs because, after a certain time, the beam is no longer able to support the applied load: the progressive temperature rise leads to a noteworthy reduction in the load-bearing capacity of the structural element, and thus to the progressive approaching of the external nodes, a phenomenon known in the literature as the bowing effect [5]. therefore, referring to the horizontal displacement of node no. 11 of the beam, it is simple to see how the coupling between material and geometric nonlinearity enables us to capture the progressive approaching of the extreme nodes of the beam as the temperature increases and the resistance and stiffness decrease. this behavior is pointed out equally well by both codes used here.

3.2 structural element with constrained lateral expansion

this paragraph analyzes the thermal buckling of an axially loaded beam, with both ends fixed against rotation. the beam considered here is 3 meters long, subjected to fire (iso834) [1] and to a lateral force f = 14 kn at mid-length to simulate initial member imperfections. the beam is assumed to be restrained against lateral torsional buckling, so only in-plane overall buckling checks need to be made, this being the most critical case. the beam has a hypothetical square section in order to simplify the analysis. the restrained thermal expansion produces an indirect axial force that must be taken into account for a correct assessment of fire resistance. a square section of 0.1×0.1 meters has been considered first: for this section, the analysis provides a solution only while the axial reaction produced by the thermal expansion remains below the critical buckling load value, as shown in fig. 4.

fig. 4: collapse for thermal buckling of an element with section 0.1×0.1 meters

with an increase in the element section, for example from 0.1×0.1 to 0.2×0.2 meters, the solution stops when the axial reaction curve crosses the yield curve, as shown in fig. 5. above this yield load there is an important fall in the critical buckling load, which becomes smaller than the yield load, in such a way that the axial reaction cannot increase. in fact, on entering the plastic domain, euler's equation, used to determine the critical buckling load in the elastic domain, can still be used as long as young's modulus is replaced by the tangent modulus value. this formula is known as shanley's equation [6].
fig. 5: collapse for thermal buckling of an element with section 0.2×0.2 meters

by increasing the cross-section dimensions even more (up to 0.6×0.6 and 0.8×0.8 meters) it is possible to capture the post-yield strength, characterized by a noteworthy decrease in stiffness, denoted by the trend of the curves of the axial reaction (fig. 6).

fig. 6: collapse for thermal buckling of an element with section 0.6×0.6 meters

the results are synthesized in table 2, where the critical temperature is also determined; a numerical sketch of these buckling checks is given below. for further considerations, the reader is referred to [4].

table 2: comparative results

case | section (m) | t (°c) | collapse for the achievement of:
1    | 0.1×0.1     | 77     | elastic buckling load (euler)
2    | 0.2×0.2     | 89.5   | elastic-plastic buckling load (shanley)
3    | 0.6×0.6     | 331    | elastic-plastic buckling load (shanley)
4    | 0.7×0.7     | 655    | elastic-plastic buckling load (shanley)
5    | 0.8×0.8     | 1000   | no collapse
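the python sketch below illustrates the comparison behind table 2. the fully restrained axial reaction n = e(T)·a·α_t·ΔT, the fixed-fixed effective length 0.5·l for the euler load, and the identification of the tangent modulus with the epl column of table 1 (which there equals e/20) are our assumptions, not statements of the paper, whose analyses were run in adina and strand:

```python
# a minimal sketch of the thermal buckling check of section 3.2: the restrained
# thermal reaction is compared with the euler, shanley (tangent-modulus) and
# yield loads as the temperature grows. assumptions as stated in the lead-in.
import numpy as np

L = 3.0                                   # member length (m)
b = 0.2                                   # section side (m): the 0.2 x 0.2 case
A, I = b * b, b**4 / 12.0                 # area and second moment of the square section

T  = np.array([20, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000], float)
E  = np.array([2.10e11, 2.10e11, 1.89e11, 1.68e11, 1.47e11, 1.26e11,
               6.51e10, 2.73e10, 1.89e10, 1.42e10, 9.45e9])
ET = E / 20.0                             # tangent modulus, taken as the epl column
SY = np.array([2.35e8] * 5 + [1.83e8, 1.10e8, 5.41e7, 2.59e7, 1.41e7, 9.40e6])
AT = np.array([1.17e-5, 1.20e-5, 1.23e-5, 1.26e-5, 1.30e-5, 1.31e-5,
               1.34e-5, 1.36e-5, 1.38e-5, 1.40e-5, 1.42e-5])

for temp, e, et, sy, at in zip(T, E, ET, SY, AT):
    N       = e * A * at * (temp - 20.0)          # fully restrained axial reaction
    P_euler = 4.0 * np.pi**2 * e  * I / L**2      # elastic buckling (fixed-fixed ends)
    P_shan  = 4.0 * np.pi**2 * et * I / L**2      # elastic-plastic buckling (shanley)
    P_yield = sy * A                              # yield load of the gross section
    failed  = N > min(P_euler, P_shan, P_yield)   # crude single-bound check
    print(f"T={temp:5.0f} C  N={N:9.3e}  euler={P_euler:9.3e}  "
          f"shanley={P_shan:9.3e}  yield={P_yield:9.3e}  collapse={failed}")
```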
4 structural analysis of steel structures under fire

this section investigates the structural behavior of a real steel structure. the objective of the analyses is, firstly, to highlight some of the peculiar effects arising from the fire loading, and, secondly, to evaluate how changing the fire scenario also changes the collapse of the structure. the analyses (implemented in the commercial fec strand) account for the material and geometric nonlinearities, and are thus able to describe accurately the actual behavior of the structure. to carry out the fire resistance analyses, only vertical loads are applied to the structure, in accordance with the italian code [7] for structures subjected to fire action.

4.1 description of the structure and initial considerations

the structure under inquiry is a single-storey steel-framed open-deck car park, 32 meters long, 15 meters wide, with a maximum height of 3 meters. the deck consists of three rows of primary beams supporting seven rows of secondary beams, while appropriate steel braces complete the vertical elements. the fem of the structure is shown in fig. 7. due to the simplicity of the configuration of the structure and its particular use, no comprehensive risk analysis has been performed at this stage. the fire scenarios were identified during the preliminary risk analysis process. just four scenarios are taken into account, as shown in fig. 8.

fig. 7: finite element model of the structure
fig. 8: finite element model of the structure with the four fire scenarios (scenario n°1 – scenario n°4)

as stated above, the fire action is modeled by means of the iso834 parametric curve applied directly to the elements involved in the fire. in scenario 1, for example, this curve is applied to some secondary and principal beams and to the two columns of the central frame. by means of nonlinear analyses, it was possible to draw considerations on structural fire resistance, and also to figure out how some elements, such as the columns, are fundamental for the stability of the whole structure.

4.2 non-linear analysis results

it is interesting to analyze the trend of the node at the top of the column involved in the fire in scenario 1, to understand the global behavior of the structure. as shown in fig. 9, two phases can be clearly identified. in the first phase, the thermal expansion produces a positive displacement of the node. after around 600 seconds, a second phase takes place, characterized by an inversion of the trend of the displacement, which goes from a positive value to a negative value, due to the loss of resistance and stiffness of the column, which sags laterally. fig. 9 shows the final deformed shape for the first scenario, where the columns are involved in the fire. from the figure, we can evaluate how the application of fire to the columns leads to a critical state for the structure, due to the appearance of buckling phenomena. in fact, from the moment it loses its load-bearing capacity, the column is no longer able to withstand the applied load and it sags in its weak direction.

the situation in scenario 4 is different. here, the fire is applied only to some secondary beams and does not involve any columns; in fig. 10, the zone involved by the fire is highlighted. from that scenario, it is also possible to evaluate how the structure seems to retain a good bearing capacity, even when there are large displacements of the secondary beams. this is because the damage is localized and does not compromise the stability of the whole structure. we can conclude that, in performance-based fire analyses, the global behavior of the structure is more important than a study of the individual elements composing the structure. this leads to some considerations regarding the structural dependability of the facility [8], in terms of collapse resistance [9]. in fact, when a structure is redundant, there are many alternative load paths, large deformations can develop without a loss of load-bearing capacity, and structural failure must be accounted for in a different way. this creates a sufficient reserve capacity to allow most such structures to survive fires with reasonable structural damage. fig. 11 shows the final deformed shapes for the first three scenarios, in which columns are involved in the fire. finally, fig. 12 summarizes the deformed shape, the critical time and the critical temperature for each considered scenario, locating in scenario 1 the worst case for the stability of this structure subjected to fire, having the lowest critical time and temperature of fire resistance.
fig. 9: vertical displacements of the node at the top of the column
fig. 10: deformed shape for scenario 4
fig. 11: deformed shapes for scenarios 1, 2 and 3
fig. 12: summary of critical states for the four scenarios considered here (scenario 1: critical time 670 sec, critical temperature 675 °c; scenario 3: 1110 sec, 750 °c; scenario 4: 950 sec, 725 °c; scenario 2: 4445 sec, 975 °c)

5 conclusions

this paper raises some considerations concerning the performance of steel structures under fire loading. for this purpose, the application of nonlinear analysis to the thermo-mechanical behavior of the materials and to the structure as a whole, together with appropriate fire modeling in appropriate scenarios, comes together to demonstrate and verify the performance of the structure in terms of resistance to fire during the design phase. all of this focuses on the importance of understanding the behavior of single elements, while noting that the fundamental consideration in the structural collapse of a complex structure is the global behavior of the structure itself [10].

acknowledgments

the author is grateful to her phd advisor professor franco bontempi, of the sapienza, university of rome, for his constant guidance in this research. ing. mauro caciolai and ing. claudio de angelis, of the italian fire service, are also acknowledged for their fundamental opinions. finally, special thanks are due to my colleagues luisa giuliani, konstantinos gkoumas and francesco petrini, who have spared their valuable time for discussions.

references
[1] eurocode 3 – design of steel structures, part 1.2: structural fire design. commission of the european communities, brussels, 1993.
[2] adina, www.adina.com
[3] strand, www.strand7.com
[4] crosti, c., bontempi, f.: performance assessment of steel structures subject to fire action. proceedings of the cst2008 & ect2008 conferences, athens, 2–5 september 2008.
[5] usmani, a. s., rotter, j. m.: fundamental principles of structural behavior under thermal effects. first international workshop on structures in fire, copenhagen, denmark, 2000.
[6] shanley, f. r.: inelastic column theory. journal of the aeronautical sciences, vol. 14 (1947), no. 5, p. 261–268.
[7] d.m. 14/09/2005 norme tecniche per le costruzioni (technical standards for constructions, in italian). ministry of infrastructures and transportation, italy, 2005.
[8] bontempi, f., giuliani, l., gkoumas, k.: handling the exceptions: dependability of systems and structural robustness. in recent developments in structural engineering, mechanics and computation, alphose zingoni (ed.), millpress, rotterdam, 2007.
[9] starossek, u., wolff, m.: design of collapse resistant structures. workshop on robustness of structures (jcss and iabse), november 28–29, 2005.
[10] crosti, c., bontempi, f.: safety performance evaluation of steel structure under fire action by means of nonlinear analysis. proceedings of the cst2008 & ect2008 conferences, athens, 2–5 september 2008.
chiara crosti
e-mail: chiara.crosti@uniroma1.it
sapienza – university of rome, school of engineering, rome, italy

implementation of a power combining network for a 2.45 ghz transmitter combining linc and eer
f. wang, o. koch

a power combining network with a 180° hybrid for a 2.45 ghz transmitter combining linc and eer has been analyzed, built and measured. the network feeds the wasted outphasing power partly back to the power supply and therefore improves the overall power efficiency. the recycling circuit was designed and simulated with ads. a measured peak recycling efficiency of 60% was achieved with commercial schottky diodes at 2.45 ghz at an input power of 36 dbm.

keywords: linc, eer, clier, power combining network, recycling.

1 introduction

power efficiency and linearity are two important properties in conventional linear amplifier design and cannot be achieved simultaneously. linear amplification with nonlinear components (linc [1]) can produce output signals with high linearity after combining the antiphase outputs of two nonlinear power amplifiers. envelope elimination and restoration (eer [2]) is an approach for high-efficiency amplification. due to the class-s power amplifier, which acts as a power supply for the output power amplifier of eer, eer is limited to signals with bandwidths up to several mhz. the combination of linc and eer (clier [3]) takes advantage of three nonlinear power amplifiers to achieve linear amplification. the inherent characteristic of this architecture allows the power amplifiers to operate continuously at their peak power efficiency, and potentially improves the overall efficiency of the system. the structure is shown in fig. 1.

fig. 1: block diagram of the clier principle

however, a major disadvantage of this approach is the power wasted when the two power amplifiers are outphasing, if a conventional 180° hybrid is used as a combiner. an alternative combining approach, named after chireix and extensively analyzed in [4], suffers from incomplete isolation of the two class-e power amplifiers: the two amplified signal components tend to travel and reflect back and forth between the two amplifier branches, and the two amplifiers appear to interfere with each other. as a result, significant signal distortion can occur. in this paper, a different approach is presented and designed: the hybrid approach with a rat race coupler is used, but the power recycling scheme simply takes an rf-dc converter to recycle part of the wasted power back to the battery and enhance power efficiency. the focus of this paper is on the recycling network.

2 theory

the baseband source signal of clier, s(t), can be written as

s(t) = a(t) e^(jφ(t)), a(t) ≥ 0, (1)

where a(t) is the envelope of the baseband input signal and e^(jφ(t)) describes its phase. the envelope of the input signal is digitally low-pass filtered. the resulting signal a1(t) is amplified using a highly efficient class-s amplifier. the quotient of the envelope a(t) and the low-pass filtered envelope a1(t), together with the original phase of the signal, is then amplified via the linc principle. the resulting outphasing phase θ(t) is given by

θ(t) = arccos( a(t) / (k a1(t)) ). (2)
note that the argument of the arccosine is limited to 1; if the limits are exceeded, clipping occurs. the amount of clipping is determined by the clipping factor k. the lower part of the envelope, a1(t), is fed to the class-e amplifiers via the supply voltage, restoring the lower part of the envelope. after the dsp, the resulting linc signals can be written as

s1(t) = (k v a1(t)/2) exp(j(φ(t) + θ(t))), (3)
s2(t) = (k v a1(t)/2) exp(j(φ(t) − θ(t))). (4)

these two signals are fed into a 3 db combiner. if the two amplifier branches are perfectly matched, i.e. their gain and phase characteristics are precisely the same, an amplified replica of the original signal is achieved, as the in-phase components add together and the out-of-phase components cancel each other. in this case the desired output signal is obtained at the summing port. the differential portion of the power is consumed at the resistive load and turns into waste heat, which degrades the overall power efficiency of clier. an important figure of merit of a linc-based system is the average output efficiency, which can be expressed as

η_o = ⟨|s+(t)|²⟩ / ( ⟨|s+(t)|²⟩ + ⟨|s−(t)|²⟩ ), (5)

where s+(t) and s−(t) are the in-phase and out-of-phase components of s1(t) and s2(t), respectively. it describes how much of the produced power is used at the output. in this paper a technique for partial recovery of the power wasted at the differential port is implemented; the isolation between the two amplifiers is not degraded. this technique was called the power recycling or re-use technique in [5, 6]. the idea is simple: replace the power-wasting resistive load with an rf-dc converter to recover the wasted power back to the power supply, and hence improve the overall power efficiency of the amplifier system. the overall efficiency of the entire clier system can then be written as

η = p/p_dc = (η_o η_c η_e η_s) / (1 − (1 − η_o) η_c η_e η_s η_r), (6)

in which η_o, η_c, η_e, η_s, η_r are the output efficiency, the efficiency of the combiner, the efficiency of the class-e amplifier, the efficiency of the class-s amplifier and the efficiency of the recycling network, respectively. the total efficiency in dependence on the output and recycling efficiencies is shown in fig. 2. we can see that the system efficiency depends strongly on the output efficiency, and that it increases significantly with recycling for medium η_o.

fig. 2: efficiency depending on the output efficiency η_o and the recycling efficiency η_r; the other values are η_c = 0.95, η_e = 0.9, η_s = 0.9

a diagram of this recycling approach is illustrated in fig. 3. a 180° hybrid combiner is configured as the power splitter to divide the wasted power into two 180° outphased portions. these two signals are then fed to a high-speed schottky diode pair through an impedance matching network. the schottky diodes rectify the rf waves and the dc components are withdrawn back to the power supply. a large-value shunt capacitor and/or a series inductor may be used to reject the harmonic currents. the matching network is to be adjusted to optimize the system performance. an optional isolator can be added between the hybrid combiner and the power splitter to improve the isolation.

fig. 3: outphasing power amplifier with power recycling
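the decomposition and the efficiency bookkeeping of eqs. (1)–(6) are easy to check numerically. a minimal python sketch follows; the test envelope, v, k and the low-pass stand-in are illustrative values of ours, and only η_c, η_e, η_s are the values used for fig. 2:

```python
# a minimal numerical sketch of eqs. (1)-(6): linc decomposition of a sampled
# envelope and the overall clier efficiency with power recycling.
import numpy as np

k, v = 1.0, 1.0
t    = np.linspace(0.0, 1.0, 8)
a    = 0.2 + 0.8 * np.abs(np.sin(2 * np.pi * t))   # test envelope a(t) >= 0
phi  = 2 * np.pi * t                               # test phase phi(t)
a1   = np.full_like(a, a.max())                    # crude stand-in for the low-pass envelope

theta = np.arccos(np.clip(a / (k * a1), -1.0, 1.0))    # eq. (2), with clipping
s1 = 0.5 * k * v * a1 * np.exp(1j * (phi + theta))     # eq. (3)
s2 = 0.5 * k * v * a1 * np.exp(1j * (phi - theta))     # eq. (4)
print("max reconstruction error:", np.max(np.abs(s1 + s2 - v * a * np.exp(1j * phi))))

# eq. (5): output efficiency for this signal (in-phase vs. out-of-phase power)
p_sum  = np.mean(np.abs(s1 + s2) ** 2)
p_diff = np.mean(np.abs(s1 - s2) ** 2)
eta_o  = p_sum / (p_sum + p_diff)

# eq. (6): overall efficiency with and without recycling
eta_c, eta_e, eta_s = 0.95, 0.90, 0.90
for eta_r in (0.0, 0.6):
    eta = eta_o * eta_c * eta_e * eta_s / (1.0 - (1.0 - eta_o) * eta_c * eta_e * eta_s * eta_r)
    print(f"eta_o = {eta_o:.3f}  eta_r = {eta_r:.1f}  overall eta = {eta:.3f}")
```

for a perfectly matched combiner the reconstruction error is at machine precision, and the printed values show that recycling raises the overall efficiency noticeably when η_o is moderate.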
apparently, the simplest case is that the input signal is a continuous wave, where the signal magnitude is constant. to simplify the analysis, an ideal resistive model is assumed for the schottky diode: a fixed on-resistance r_d in series with the built-in potential v_d, an infinite off-resistance, and a negligible shunt capacitance. all other components are assumed ideal. the analysis starts from the diode side. each schottky diode conducts over an angle of 2φ. the current through the upper diode can be written in the following form [6]:

i_d1(t) = (v_pk cos(ω_c t) − v_sup − v_d)/r_d, for cos(ω_c t) ≥ (v_sup + v_d)/v_pk;
i_d1(t) = 0, otherwise, (7)

where v_pk is the peak signal voltage applied to the diode, v_sup is the power supply voltage and ω_c is the carrier frequency of the input signal. the conduction angle φ of the upper diode is thus determined by

cos φ = (v_sup + v_d)/v_pk, 0 ≤ φ ≤ π/2. (8)

the input signal is a periodic function with a period of 2π. in the actual circuit, the signal does not only contain the carrier frequency (fundamental frequency) but also higher-order frequencies (harmonics), because of the nonlinearity of the recycling network. the actual diode current can therefore be written as a fourier series. for the simplest case, the diode current can be expanded as

i(t) = i_0 + Σ_(k=1..∞) i_k cos(k ω_c t), (9)

where i_k is the kth-order harmonic of the diode current. the dc component of the upper diode current is thus given by

i_0 = ((v_sup + v_d)/(π r_d)) (tan φ − φ). (10)

the fundamental component and the higher-order harmonics of the upper diode current are

i_k = ((v_sup + v_d) sec φ/(π r_d)) [ sin((k−1)φ)/(k−1) + sin((k+1)φ)/(k+1) − 2 cos φ sin(kφ)/k ]. (11)

obviously, the fundamental components and all odd-order harmonic currents of the two diodes are 180° out of phase, and hence cancel out; only the dc components and even-order harmonics are left. a large-value shunt capacitor may be sufficient to short the harmonic currents to ground, and a series inductor may be added to better reject the harmonic currents from the power supply. in practice, microstrip lines are used instead of the capacitor. the recycled portion of the power is hence

p_r = 2 i_0 v_sup. (12)

the rf-dc conversion is a strongly nonlinear process. in such a case, the large-signal impedance of the device is usually estimated from the fundamental components of the voltage and current waveforms. the fundamental component of the upper diode current at the carrier frequency is

i_1 = ((v_sup + v_d)/(π r_d)) (φ sec φ − sin φ). (13)

the power available to the recycling network needs to be known in order to calculate the recycling efficiency. an isolator can be placed between the two hybrids to eliminate the wave reflected back to the two power amplifiers. looking from the first hybrid (see fig. 3), the load is always matched to 50 Ω, while looking from the second hybrid, the isolator acts like an ideal voltage source v_s in series with an internal resistance of z_0 = 50 Ω. the source voltage is thus

v_s = 2 v_pk/n + 2 n z_0 i_1, (14)

where the scaling factor n results from the 1:n matching network and the factor 2 comes from the fact that the hybrid is a power addition device. the power available to the recycling network, p_ava, is

p_ava = v_s²/(8 z_0) = ((v_sup + v_d)²/(2 z_0)) [ sec φ/n + (n z_0/(π r_d)) (φ sec φ − sin φ) ]². (15)

so the recycling efficiency is

η_r = p_r/p_ava. (16)

if we only account for the fundamental components, we can obtain the input impedance of the recycling network

z_in = n² v_pk/i_1 = n² π r_d / (φ − sin φ cos φ). (17)

so the input impedance of the recycling network and the recycling efficiency vary with the diode conduction angle, which is in fact determined by the power delivered to the recycling network. the reflection coefficient of the recycling network is

Γ_in = (z_in − z_0)/(z_in + z_0), (18)

and the vswr is, according to its standard definition,

vswr = (1 + |Γ_in|)/(1 − |Γ_in|). (19)
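the closed-form analysis of eqs. (7)–(19) can be swept directly. the python sketch below uses the diode and supply values later quoted for figs. 4 and 5 (v_sup = 30 v, v_d = 1.6 v, r_d = 4.8 Ω) and an illustrative transform ratio n of our choosing, with the conduction half-angle φ standing in for the delivered power:

```python
# a minimal sketch of eqs. (7)-(19): sweep the diode conduction half-angle and
# recover the recycled power, the available power, the recycling efficiency
# and the input reflection coefficient of the recycling network.
import numpy as np

v_sup, v_d, r_d, z0 = 30.0, 1.6, 4.8, 50.0
n = 2.0                                            # illustrative 1:n transform ratio

for phi in np.linspace(0.1, 1.4, 6):               # conduction half-angle (rad)
    v_pk  = (v_sup + v_d) / np.cos(phi)                                   # eq. (8)
    i0    = (v_sup + v_d) * (np.tan(phi) - phi) / (np.pi * r_d)           # eq. (10)
    i1    = (v_sup + v_d) * (phi / np.cos(phi) - np.sin(phi)) / (np.pi * r_d)  # eq. (13)
    p_r   = 2.0 * i0 * v_sup                                              # eq. (12)
    v_s   = 2.0 * v_pk / n + 2.0 * n * z0 * i1                            # eq. (14)
    p_ava = v_s ** 2 / (8.0 * z0)                                         # eq. (15)
    eta_r = p_r / p_ava                                                   # eq. (16)
    z_in  = n ** 2 * v_pk / i1                                            # eq. (17)
    gamma = abs((z_in - z0) / (z_in + z0))                                # eq. (18)
    vswr  = (1.0 + gamma) / (1.0 - gamma)                                 # eq. (19)
    print(f"phi={phi:4.2f}  P_ava={p_ava:9.2f} W  eta_r={eta_r:5.3f}  "
          f"|Gamma|={gamma:5.3f}  VSWR={vswr:6.2f}")
```

the sweep reproduces the qualitative behaviour of figs. 4 and 5: the best recycling efficiency appears where the input impedance approaches 50 Ω, i.e. where the reflection coefficient is lowest.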
fig. 4 and fig. 5 show 3d plots of the recycling efficiency and the reflection coefficient as functions of the impedance transform ratio n of the matching network and the source available power p_ava, with the following parameters: v_sup = 30 v, v_d = 1.6 v and r_d = 4.8 Ω. it is clear that there is a close relationship between the optimum recycling efficiency and the lowest reflection coefficient.

fig. 4: recycling efficiency as a function of n and available power
fig. 5: reflection coefficient as a function of n and available power

3 measured results and discussion

the power recycling circuit and the 180° hybrid ring coupler of fig. 3 were designed and simulated with advanced design system (ads) and fabricated on rt/duroid 5880. the measurement setup is illustrated in fig. 6. the power amplifier is used to provide the proper power level to the recycling network, since the signal generator is not capable of delivering so much power. the directional coupler is used to monitor the input power into the recycling network via the power meter. the circulator is used to isolate the power amplifier and the recycling network, and enables the measurement of the reflected power using the spectrum analyzer. the external load simulates the power dissipation of the class-s power amplifier. the center frequency used for all tests is 2.45 ghz. the dc current i_rec is measured and the recycled power calculated. for all measurements the supply voltage is kept fixed at 30 v (25 v, 20 v) and the input power level is varied to determine the power recycling efficiency.

fig. 6: diode detector test setup

one of the recycling circuits is optimized at 36 dbm input power for a 30 v power supply voltage with surface-mount schottky diodes (hbat540c). the layout of this circuit is shown in fig. 7.

fig. 7: photograph of the measured circuit optimized for 36 dbm

figs. 8 and 9 show the measured recycling efficiencies and reflection coefficients as functions of the input power for three different power supply voltages. the measured peak recycling efficiency is found to be 57.13 % for a supply voltage of 20 v, 61.88 % for a supply voltage of 25 v and 60.9 % for a supply voltage of 30 v. in order to test the bandwidth, the supply voltage and the input power level were kept constant and the frequency was swept; the result is shown in fig. 10. the bandwidth of this circuit is about 100 mhz. another circuit is optimized at 41 dbm input power for a 30 v power supply with the surface-mount schottky diode group (hsms280e). its measured peak recycling efficiency is found to be 62.7 % for the 20 v supply, 63.4 % for the 25 v supply, and 63.7 % for the 30 v supply.

4 conclusion

the power recycling technique has been presented for the optimum power combining network of the clier system. this network acts as an rf-dc converter while maintaining sufficient isolation.
the analysis demonstrates that a proper trade-off among the diodes, the power supply and the input power of the recycling network is critical for the performance of the system. with the diode groups, two more recycling circuits were designed; their recycling efficiencies were optimized for input power levels of 36 dbm and 41 dbm, respectively. for all measurements, a peak recycling efficiency of about 60 % was achieved, but problems arise due to excessively high currents and voltages at the diodes. the measurement results agree well with the simulations and still show some similarity to the ideal case. this simple technique promises to improve the power efficiency of the outphasing microwave power amplifier, while maintaining its high linearity performance.

fig. 8: comparison of calculated, simulated and measured recycling efficiency as a function of input power level
fig. 9: comparison of calculated, simulated and measured reflection coefficients as a function of input power level

references
[1] cox, d. c.: linear amplification with nonlinear components. ieee-tc, vol. 41 (1974), no. 4, p. 1942–1945.
[2] kahn, l. r.: single-sideband transmission by envelope elimination and restoration. radioengineering, vol. 40 (1952), p. 803–806.
[3] rembold, b., koch, o.: clier – combination of linc and eer method. electronics letters, vol. 42 (2006), p. 900–901.
[4] birafane, a., kouki, a.: on the linearity and efficiency of outphasing microwave amplifiers. ieee-mtts, vol. 52 (2004), no. 5, p. 1702–1708.
[5] langridge, r., thornton, t., asbeck, p. m., larson, l. e.: a power reuse technique for improved efficiency of outphasing microwave power amplifiers. ieee-mtts, vol. 47 (1999), no. 8, p. 1467–1470.
[6] zhang, x., larson, l. e., asbeck, p. m., langridge, r. a.: analysis of power recycling techniques for rf and microwave outphasing power amplifiers. ieee-tcs, vol. 49 (2002), no. 5, p. 312–320.

fei wang
e-mail: wang@ihf.rwth-aachen.de
olivier koch
e-mail: koch@ihf.rwth-aachen.de
institute of high frequency technology, rwth aachen university, melatener strasse 25, 52074 aachen, germany
unobtrusive health screening on an intelligent toilet seat
t. schlebusch

abstract

home monitoring is a promising way to improve the quality of medical care in an ageing society. to circumvent the problem that especially demented patients may forget or be stressed by the use of medical devices at home, monitoring devices should be embedded in objects of daily life to check the patient's health status whenever possible, without any interaction with the patient him/herself. this paper presents an intelligent toilet performing an unobtrusive health check when a person sits down. a variety of physical, electro-physical and urine parameters are analysed. this paper takes electrocardiogram and bioimpedance spectroscopy measurements and shows the practicability of measuring them on a toilet seat.

keywords: monitoring of vital parameters, personal healthcare, ambient assisted living, electrocardiogram, bioimpedance, toilet seat.

1 introduction

german society is undergoing a profound change in its age distribution. in about thirty years, around one third of the german population will be over 65 years old. along with the change in age distribution comes also a change in the main disease patterns. chronic diseases such as diabetes and chronic heart failure will further gain in relevance and, if not detected at an early stage, these diseases will burden the health systems with overwhelming costs. to limit public health expenditure and ensure appropriate health care and quality of life for patients, automatic monitoring of vital functions at home is an important building block for future health care systems.
commercial home monitoring systems, such as the "telemedizin für's herz" programme of the german health insurance company tk¹, rely on interaction between a patient with a device and a telemedicine center. this means the patients themselves have to take measurements with several devices, for example a weighing scale, a blood pressure monitor and probably even more devices, and note the measured values. the patient then calls the telemedicine center and transmits the measured parameters by phone. this leads to a very high daily workload for the patient if the number of monitored parameters increases. further, the scheme is unfeasible for elderly patients, who may have problems controlling the measurement devices or may forget to take the measurements. to overcome these drawbacks, more and more research projects have tried to take the burden of daily measurements from the patient and perform them unobtrusively during normal daily activities. one solution is to embed the measurement electronics in textiles worn by the patient, e.g. an ecg shirt [1], or to measure the body composition by electronics integrated into smart clothing [2]. these systems provide the best long-term monitoring of patients, but cannot be treated like regular clothing: the batteries need to be charged regularly, and the electronics has to be removed before washing. the second approach is to embed monitoring electronics into devices used by the patient in his daily life, e.g. sleep monitoring in his bed [3,4] or monitoring of the heart function on his chair [5], in his bath tub [6] or on his toilet seat [7–9]. while measurements in a bed or on a chair always have to deal with the clothing between the capacitive sensor and the human body, measurements on a toilet seat provide direct skin contact to the sensors and even enable urine analysis [10].

for early detection of a disease, e.g. diabetes or chronic heart failure, more than one parameter has to be measured and tracked for parameter trend analysis. in the following sections an intelligent toilet will be presented that measures a wide variety of vital parameters, see figure 1. the intelligent toilet detects that a user sits down by regularly polling the weight sensors in the toilet seat. when a person sits down, automatic measurement is started and the results are transmitted wirelessly to a central control unit (ccu), as shown in figure 2. the ccu can collect data from several intelligent toilets, e.g. from different apartments in a retirement home, and forward them via a mobile phone or an internet connection to a central database. the database holds records for each patient (identified by toilet-id) and performs several threshold operations on the vital parameters and their calculated trends. in this way, for example, not only a threshold for the absolute weight of a patient can be monitored, but also a threshold for weight loss per time can be set (a small sketch of such a check is given below). this paper will emphasize two important technological aspects of the intelligent toilet and will focus in detail on electrocardiogram (ecg) and bioimpedance spectroscopy (bis) analysis.

fig. 1: parameters measured by the intelligent toilet
fig. 2: overview of the data transmission setting from the intelligent toilet to the physician

¹ http://www.tk.de/tk/innovative-verfahren/telemedizin/herz/9784
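a minimal python sketch of such a threshold operation on the stored records; the limits and the weight series are purely illustrative, the paper does not specify them:

```python
# a minimal sketch of the database-side threshold logic described above:
# both an absolute weight limit and a weight-loss-per-time trend are checked.
import numpy as np

days   = np.array([0, 1, 2, 3, 4, 5, 6], float)
weight = np.array([82.0, 81.6, 81.1, 80.9, 80.2, 79.8, 79.1])  # kg, one reading per day

slope, _ = np.polyfit(days, weight, 1)     # kg per day, linear trend of the record

if weight[-1] < 60.0:                      # absolute lower threshold (illustrative)
    print("alarm: absolute weight below limit")
if slope < -0.3:                           # trend threshold in kg lost per day (illustrative)
    print(f"alarm: weight loss trend {slope:.2f} kg/day exceeds limit")
```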
2 theoretical background

an electrocardiogram represents the potential caused by the electric activity of the heart, measured on the body surface over time. since the cardiac pulse spreads from the right atrium over the heart, the resulting electrical potential can be measured between two opposite places on the human chest. the resulting field and equipotential lines are shown in principle in figure 3. the strength of the recorded ecg signal depends on the strength of the potential difference between the electrode positions. from two opposite positions on the chest, positions a and b in figure 3, a higher amplitude can be measured than between positions c and d, as we have for the ecg measurement on a toilet seat. since this amplitude is several orders smaller than that of the conventional chest ecg, most commercial ecg recorders will not be able to record any signal. for the measurements on a toilet seat, special hardware with cascaded gain and filter stages with an overall gain of 12000 has been constructed.

fig. 3: equipotential lines on the surface of a human body, data for the simplified illustration taken from [11, 12]

bioimpedance spectroscopy is a method for determining body composition [14] by measuring the complex impedance of tissue over a wide frequency range, usually from 5 khz to 1 mhz [13]. a very low alternating current i(jω) is injected by two current electrodes applied to the human body. between the current electrodes, two additional voltage electrodes are placed that measure the resulting voltage drop u(jω). the complex impedance can then be reconstructed as z(jω) = u(jω)/i(jω) for each frequency. by using separate electrodes for current injection and for voltage measurement, the effect of the electrode-skin contact impedance can be neglected. a typical plot of the acquired impedance data in the complex impedance plane is shown in figure 4.

fig. 4: typical bis measurement raw data

cole [15] has shown that cell membranes show capacitive behaviour. at dc level the measurement current cannot pass the cell membrane and can only flow in the extracellular space. towards higher frequencies, the current can pass the cell membrane and take a shorter path through the cells. this can be modelled by an equivalent circuit having resistance re, denoting the extracellular resistance, in parallel to the series connection of capacitor cm and resistance ri, denoting the capacitance of the cell membranes and the intracellular resistance, respectively. for determining the hydration status of a person, re is of great interest, since a tight correlation between total body water and a change in re has been shown [16]. the bis electrodes are usually applied to the right hand and the right foot, forming a whole-body measurement. medrano [16] was recently able to show that a measurement between the legs, as in our toilet-seat setting, could be sufficient for reliably determining the amount of body fluid.
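a minimal python sketch of the equivalent circuit just described; the component values are illustrative tissue-like numbers, not fitted data from the measurements:

```python
# a minimal sketch of the cole-type equivalent model from section 2:
# extracellular resistance re in parallel with (ri in series with cm).
import numpy as np

def z_model(f, re, ri, cm):
    """complex impedance of the re || (ri + 1/(j*w*cm)) circuit."""
    w  = 2.0 * np.pi * f
    zi = ri + 1.0 / (1j * w * cm)          # intracellular branch
    return re * zi / (re + zi)             # parallel combination

f = np.logspace(np.log10(5e3), 6, 50)      # 5 khz ... 1 mhz, the usual bis range
z = z_model(f, re=60.0, ri=30.0, cm=3e-9)  # illustrative tissue parameters

# the low-frequency limit tends to re, the high-frequency limit to re*ri/(re+ri)
print("z(5 khz) =", z[0])
print("z(1 mhz) =", z[-1])
print("re*ri/(re+ri) =", 60.0 * 30.0 / 90.0)
```

the low-frequency plateau is exactly the parameter re that, as stated above, tracks the hydration status; in practice one would fit re, ri and cm to the measured semicircle of figure 4.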
3 materials and methods

before being able to take first measurements with the intelligent toilet, it was necessary to embed adequate electrodes in the toilet seat. four electrodes are necessary for bis measurements, and it was decided to reuse them to connect the three ecg cables. for ecg and bis measurements, hydro-gel electrodes are usually used. they consist of metallic electrodes, such as silver/silver-chloride or aluminium, with a layer of hydro-gel which also acts as a glue layer to attach the electrodes to the skin. these electrodes have a number of advantages, but unfortunately they are disposables, which prevents their use in a maintenance-free toilet seat measurement system.

in search of appropriate electrodes for integration in a toilet seat, it was decided to use printed circuit boards (pcb) as the basic material, since they are cheap and easy to process. a layer of tin was applied chemically to the copper layer of the pcb, and then a gold layer was applied by a galvanic process. the final electrode showed good contact impedances and long-term stability. the gold layer makes the electrode surface biocompatible, which is important for direct contact with human skin. since the electrodes have to be placed on the toilet seat in such a way that all individuals are in good contact with them, a toilet seat was equipped with pressure sensors and ten student volunteers (seven male and three female) were asked to sit as they usually do when they use a toilet. the area featuring the highest contact pressure for all individuals was then chosen as the right place to embed the electrodes on the toilet seat.

the ecg measurement was made by connecting the developed ecg amplifier to the toilet seat electrodes. as a reference, a conventional lead-1 chest ecg was taken with a bsamp biosignal amplifier (g.tec medical engineering gmbh, schiedlberg, austria). normal hydro-gel electrodes were used for the reference ecg. the outputs of both amplifiers were connected to a national instruments usb data acquisition card, enabling synchronous recording with the use of a connected laptop. unfortunately, the driven right leg lead of the bsamp amplifier could not be connected to the subject since it interfered with the toilet seat ecg amplifier, resulting in a higher noise level on the reference ecg than is usually present.

for bis measurements, the ecg electronics was disconnected from the toilet seat and a hydra 4200 (xitron technologies, usa) bis device was connected. it was also connected to a pc by a serial connection, and a labview programme was used to record the raw measurements from the bis device. after taking a measurement on the toilet seat, aluminium hydro-gel electrodes (fresenius medical care, bad homburg, germany) were attached to the legs of the subject. an effort was made to attach them as closely as possible to the same place where the toilet seat electrodes also contact the legs.

4 results

4.1 electrode positioning

the pressure measurements on the toilet seat showed interestingly clear differences between the genders. representative plots of the weight distributions measured on the toilet seat for the ten volunteers are shown for a male subject in figure 7a) and for a female subject in figure 7b). fortunately, though the pressure distributions are so different for the two genders, the areas of highest contact pressure (red or orange in figures 7a) and 7b)) are in the same region. the four electrodes were then embedded in this region, see figure 5.

fig. 5: picture of the experimental setup in the lab
fig. 6: comparison of toilet seat ecg with conventional lead 1
fig. 7: weight distribution on a toilet seat; a) typical male, b) typical female

4.2 ecg measurement

an example of an ecg measurement is shown in figure 6. the upper diagram shows the raw data from the data acquisition card for the reference ecg, while the lower diagram shows the raw data from the toilet seat ecg amplifier. as mentioned above, it was not possible to connect the driven right leg electrode of the biosignal amplifier to the subject since it would interfere with the toilet seat electronics. it was only possible to connect one of the two active feedback electrodes, so the feedback from the toilet seat ecg was chosen. this results in a much higher noise floor for the reference ecg. in the toilet seat ecg the r-peak is clearly visible, enabling r-peak detection and an estimation of heart rate and heart rate variability. it is hardly possible to detect additional information, e.g. propagation delays, without further signal processing. the intelligent toilet is equipped with an msp430f5437a microprocessor (texas instruments, usa), and the ep limited open source ecg analysis software² was ported to this processor. by connecting the toilet seat ecg amplifier to the integrated a/d converter of the microprocessor it was possible to automatically analyse the heart rate of a subject on the toilet seat. the system was tested for its robustness against motion artefacts. since the author expects motion on the toilet seat usually to be limited to leaning forwards or sidewards, only these movements were tested. a reliable measurement could be made during most movements; only when leaning extensively sidewards could the connection between one leg and the electrodes get lost, thus interrupting the ecg signal.

² http://www.eplimited.com, last checked 8/3/2011
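a minimal python sketch of the r-peak based heart-rate estimate of section 4.2; the plain threshold-and-refractory detector on a synthetic trace used here is only a stand-in for the ported ep limited software, whose internals the paper does not reproduce:

```python
# a minimal sketch: detect r-peaks on a synthetic ecg-like trace and derive
# heart rate and a simple heart-rate-variability measure from the rr intervals.
import numpy as np

fs = 250.0                                   # sampling rate (hz), illustrative
t  = np.arange(0, 10, 1 / fs)
ecg = 0.05 * np.random.randn(t.size)         # noise floor
for beat in np.arange(0.5, 10, 0.8):         # one narrow r-wave every 0.8 s
    ecg += np.exp(-((t - beat) ** 2) / (2 * 0.01 ** 2))

thresh, refractory = 0.5, int(0.3 * fs)      # amplitude threshold, 300 ms lockout
peaks, last = [], -refractory
for i in range(1, t.size - 1):
    if ecg[i] > thresh and ecg[i] >= ecg[i - 1] and ecg[i] >= ecg[i + 1] \
       and i - last > refractory:
        peaks.append(i)
        last = i

rr = np.diff(t[np.array(peaks)])             # rr intervals (s)
print(f"heart rate: {60.0 / rr.mean():.1f} bpm, hrv (sdnn): {1e3 * rr.std():.1f} ms")
```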
fig. 8: comparison of a bis measurement using the dry electrodes in the toilet seat and a reference measurement using aluminium hydro-gel electrodes

4.3 bis measurement

bis measurements could be taken using the dry electrodes embedded in the toilet seat. figure 8 shows both the toilet seat and the reference measurements. for low frequencies, starting from 5 khz, both measurement methods yield the same results. for higher frequencies, especially the reference measurement using aluminium hydro-gel electrodes reveals results with a positive imaginary part, which can be qualified as a measurement error. as has also been observed in other measurement scenarios using the hydra 4200, use of the device in impedance ranges other than whole-body measurements leads to problems with the current source of the device. the device is intended for whole-body measurements at resistances around 900 Ω; in our measurement setup, the impedance is several orders lower, clearly leading to measurement errors at higher frequencies. for accurate results over the whole frequency range, the development of a special bis device for the toilet seat bis measurement would be necessary. nevertheless, the results are of medical value: even with errors in the impedance data at high frequencies, the cole-cole parameter re could be extracted. as mentioned above, this parameter is of great interest since it correlates with the hydration status of the patient.

5 conclusions

an intelligent toilet has been constructed which can perform a comprehensive health check when a person sits down. this paper has shown the practicability of ecg and bis measurements using dry electrodes embedded in a toilet seat. a biosignal amplifier with an overall gain of 12000 has been constructed for measuring the ecg signal. it has been shown that the signal is good enough to perform reliable r-peak detection, which can be used to estimate the heart rate and heart rate variability. typical light movements on the toilet seat do not disturb the measurement. monitoring body composition using bis has also been shown to be possible using the same electrodes.
the intelligent toilet presented here has high potential to increase the quality of medical care for elderly people living at home by keeping track of important health and nutrition parameters. when a deviation of the parameters is detected, a medical professional can be informed to get in contact with the patient.

acknowledgement

research presented in this paper was supervised by univ.-prof. dr.-ing. dr. med. s. leonhardt, chair for medical information technology at rwth aachen, and has been supported by the federal ministry of economics and technology (bmwi) and the german federation of industrial research associations (aif) under grant no. kf2561903fo9. this study forms part of a joint research programme with our partners kurt-schwabe-institut für mess- und sensortechnik e. v. meinsberg, clinpath gmbh berlin, innotas elektronik gmbh zittau and bitsz engineering zwickau.

references
[1] lee, y.-d., chung, w.-y.: wireless sensor network based wearable smart shirt for ubiquitous health and activity monitoring. sensors and actuators b, vol. 140, 2009, pp. 390–395.
[2] vuorela, t., kukkonen, k., rantanen, j., järvinen, t., vanhala, j.: bioimpedance measurement system for smart clothing. proceedings of the seventh ieee international symposium on wearable computers iswc'03, 2003.
[3] zhu, x., chen, w., nemoto, t., kanemitsu, y., kitamura, k., yamakoshi, k., wei, d.: real-time monitoring of respiration rhythm and pulse rate during sleep. ieee transactions on biomedical engineering, vol. 53, 2006, pp. 2553–2563.
[4] watanabe, k., watanabe, t., watanabe, h., ando, h., iskikawa, t., kobayashi, k.: noninvasive measurement of heartbeat, respiration, snoring and body movements of a subject in bed via a pneumatic method. ieee transactions on biomedical engineering, vol. 52, 2005, pp. 2100–2107.
[5] aleksandrowicz, a., walter, m., leonhardt, s.: wireless ecg measurement system with capacitive coupling. biomedizinische technik, vol. 52, no. 2, 2007, pp. 185–192.
[6] lim, y. k., kim, k. k., park, k. s.: the ecg measurement in the bathtub using the insulated electrodes. engineering in medicine and biology society, iembs '04, 26th annual international conference of the ieee, 2004, pp. 2383–2385.
[7] kim, k. k., lim, y. k., park, k. s.: the electrically non-contacting ecg measurement on the toilet seat using the capacitively-coupled insulated electrodes. proceedings of the 26th annual international conference of the ieee embs, 2004, pp. 2375–2378.
[8] baek, h. j., kim, j. s., kim, k. k., park, k. s.: system for unconstrained ecg measurement on a toilet seat using capacitive coupled electrodes: the efficacy and practicality. 30th annual international ieee embs conference, 2008, pp. 2326–2328.
[9] kim, j. s., chee, y. j., park, j. w., choi, j. w., park, k. s.: a new approach for nonintrusive monitoring of blood pressure on a toilet seat. physiological measurement, vol. 27, 2006, pp. 203–211.
[10] fichtner, w., schlebusch, t., leonhardt, s., mertig, m.: photometrische urinanalyse als baustein für ein mobiles patientenmonitoring mit der intelligenten toilette (photometric urine analysis as a building block for mobile patient monitoring with the intelligent toilet, in german). 3. dresdner medizintechnik-symposium, dresden, dec. 6th–9th, 2010.
[11] sachse, f. b., werner, c. d., meyer-waarden, k., dössel, o.: applications of the visible man dataset in electrocardiology: calculation and visualization of body surface potential maps of a complete heart cycle. second users conference of the national library of medicine, 1998, pp. 47–48.
[12] schneider, f., dössel, o., müller, m.: optimierung von elektrodenpositionen zur lösung des inversen problems der elektrokardiographie (optimization of electrode positions for solving the inverse problem of electrocardiography, in german). biomedizinische technik, vol. 43, 1998, pp. 58–59.
[13] moissl, u., wabel, p., leonhardt, s., isermann, r.: modellbasierte analyse von bioimpedanz-verfahren (model-based analysis of bioimpedance methods, in german). automatisierungstechnik, vol. 52, no. 6, 2004, pp. 270–279.
[14] kyle, u. g., bosaeus, i., de lorenzo, a. d., deurenberg, p., elia, m., gomez, j. m., heitmann, b. l., kent-smith, l., melchior, j. c., pirlich, m. et al.: bioelectrical impedance analysis – part i: review of principles and methods. clinical nutrition, vol. 23, no. 5, 2004, pp. 1226–1243.
[15] cole, k. s.: membranes, ions, and impulses. berkeley: university of california press, 1968.
[16] medrano, g., eitner, f., floege, j., leonhardt, s.: a novel bioimpedance technique to monitor fluid volume state during hemodialysis treatment. asaio journal, vol. 56, no. 3, 2010, pp. 215–220.

about the author

thomas schlebusch studied electrical engineering at rwth aachen university, germany, and at ntnu trondheim university, norway. he is now a ph.d. student with the institute for medical information technology at rwth aachen, germany. his current research fields are home monitoring, textile integration and impedance spectroscopy.

thomas schlebusch
e-mail: schlebusch@hia.rwth-aachen.de
philips chair for medical information technology, helmholtz institute for biomedical engineering, rwth aachen, pauwelsstr. 20, 52074 aachen, germany

optimization of a water window capillary discharge radiation source
m. stefanovič, m. vrbová

abstract

computer modeling of a fast electrical discharge in a nitrogen-filled alumina capillary was performed in order to discover discharge system parameters that lead to high radiation intensity in the so-called water window range of wavelengths (2–4 nm). the modeling was performed by means of the two-dimensional rmhd code z*. the time and spatial distributions of the plasma quantities were used for calculating the ion level populations and for estimating the absorption of the 2.88 nm radiation line in the capillary plasma, using the flychk code. optimum discharge parameters for the capillary discharge water window source are suggested. the heating of the electrodes and the role of capillary channel shielding were analyzed according to the z* code.

keywords: water window radiation, capillary discharge, soft x-rays, nitrogen plasma.

1 introduction

water window radiation sources are interesting for applications in the biomedical sciences [1]. high absorption of this radiation by proteins, but very little absorption by oxygen, makes imaging of living cells possible. we have looked for optimal parameters of a fast capillary discharge radiation source, namely the capillary dimensions, the electrodes' shapes and the gas filling pressure, in order to achieve maximum radiation power in the water window range of wavelengths.

2 rmhd computer modeling

calculations of the plasma quantities were executed using the two-dimensional rmhd (radiative magneto-hydrodynamic) z* code [2,3]. the physical model established in z* is based on the quasi-neutral multicharged-ion plasma magneto-hydrodynamic equations with self-consistent electromagnetic fields and radiation transport in a 2d axially symmetric geometry. this enables detailed engineering of the pinching discharge processes in the capillary.
The Z* computer code engine provides the radial and time evolutions of the plasma quantities, and also an estimation of the radiation generated in the capillary. These results are based on the input parameters inserted into the calculations by the user. These input parameters comprise the external electrical circuit parameters, the discharge geometry and the properties of the filling gas (pressure, temperature, degree of pre-ionization, etc.).

2.1 Code input

2.1.1 Electrical scheme of the discharge system

The capillary discharge is a part of an underdamped RLC circuit (Figure 1). In accordance with the experimental setup [4], the following values are taken into account for the simulations: capacitor C = 15 nF charged to initial voltage U0 = 80 kV, external resistivity and inductance of the circuit R = 0.73 Ω and L = 50 nH, respectively. The capillary itself has its own impedance, which is incorporated into the code via the capillary geometry input.

Fig. 1: Electrical scheme of the capillary discharge circuit
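Since the external circuit is a plain series RLC loop, its idealized current waveform can be sketched independently of Z* from the quoted values alone. The following minimal Python sketch is an illustration only: it uses the standard closed-form solution for a capacitor discharging through R and L, and neglects the time-varying plasma impedance of the capillary that Z* accounts for through the geometry input.

```python
import numpy as np

C = 15e-9    # capacitance [F]
U0 = 80e3    # initial voltage [V]
R = 0.73     # external resistance [ohm]
L = 50e-9    # external inductance [H]

alpha = R / (2.0 * L)              # damping rate [1/s]
omega0 = 1.0 / np.sqrt(L * C)      # undamped angular frequency [rad/s]
assert alpha < omega0              # underdamped regime, as stated in the text
omega_d = np.sqrt(omega0**2 - alpha**2)

t = np.linspace(0.0, 300e-9, 1001)                       # first 300 ns
i = U0 / (omega_d * L) * np.exp(-alpha * t) * np.sin(omega_d * t)
print(f"peak current ~ {i.max()/1e3:.0f} kA at t ~ {t[i.argmax()]*1e9:.0f} ns")
```

With these values the idealized current peaks at a few tens of kA within a few tens of ns, on the same scale as the current maxima discussed in the results below.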
2.1.2 Capillary discharge geometry and filling gas

An alumina (Al2O3) capillary (magnetic permeability μ/μ0 = 8.5, atomic weight A = 102) filled with nitrogen gas (Z = 7, A = 14) was investigated, and iron (Z = 26, A = 55.8) electrodes were presumed. Figure 2 shows the geometric details of the capillary grid used for the calculations. The capillary is represented in 2D cylindrical geometry; the left border of the drawing is the axis of symmetry. The outer radius of the capillary is 21 mm, while the inner radius R is varied (0.8, 1.65 and 2.5 mm). The length of the capillary L is the distance between the two electrodes; the simulations were done for capillary lengths in the range 40–180 mm. The capillary is closed at one end and open at the other, and the open-end electrode is indented ∼ 0.6 mm into the channel. The voltage supply electrodes are connected to the bottom and top right corners of the grid, and positive voltage is virtually applied to the electrode at the bottom.

Fig. 2: Capillary grid in 2D cylindrical geometry

The nodes of the grid (Figure 2) correspond to the space coordinates at which the physical quantities are calculated. The grid is denser inside the plasma and around the plasma–electrode and plasma–capillary boundaries. The thicknesses of the cathode and anode were chosen to be 9 and 6 mm, respectively. Nitrogen gas was chosen due to its strong spectral lines in the wavelength region between 2 and 3 nm; a recent paper [5] also confirmed nitrogen as a strong water window emitter. The initial temperatures of the gas and the capillary wall are presumed to be ∼ 300 K. A pre-ionized plasma channel with "effective pre-ionization degree" z ∼ 10⁻³ and electron temperature ∼ 0.05 eV is presumed to be created in the gas cylinder near the alumina wall. Calculations were made for various initial molecular gas pressures in the range 13–667 Pa.

2.2 Results of modeling

2.2.1 Spatial and time distributions of plasma quantities

When the charged capacitor is connected to the capillary channel, the electric current heats the plasma inside the channel, and its magnetic field causes plasma pinching. High-temperature, dense plasma is created during the pinch. The plasma quantities depend substantially on the parameters of the discharge system, namely on the applied voltage, the discharge geometry and the filling gas. Figure 3 illustrates the distribution of the plasma electron density at time t = 85 ns, shortly after the pinch. The electron density is uniform along the z direction, and decreases monotonically in the radial direction. On the axis, the value of the electron density is around 10¹⁷ cm⁻³. The electron temperature is found to be homogeneous in a broad capillary volume (0 < r < 1 mm), with a value of around Te ∼ 50 eV (Figure 4).

Fig. 3: Radial and longitudinal distribution of electron density ne at time t = 85 ns (the units in the figure are given in Av; the density expressed in cm⁻³ may be obtained by multiplying the value by the Avogadro constant NA = 6.02 × 10²³)

Fig. 4: Radial and longitudinal distribution of electron temperature Te at time t = 85 ns

2.2.2 Output radiation estimated by the Z* engine

The code provides information on the output power of the radiation generated in the capillary, divided into 20 spectral groups. As we are interested in radiation in the water window region, we observed the output emission power in group 13 of the Z* code, corresponding to the wavelength interval 2.1 nm < λ < 3.1 nm. We observed the evolution of the capillary plasma for the filling molecular gas pressure pm = 80 Pa. Figure 5 shows that the overall generated radiation (in the whole spectral range) Prad has two local maxima. The first radiation peak is reached at ∼ 15 ns, and the second, much higher peak appears at t ∼ 85 ns; the second peak is related to the pinch. The radiation in spectral group 13 (PEUV) has only one maximum, at the pinch time, and its peak value is only 3 times lower than the peak value of the total radiation Prad. The radiation peaks come later than the current maximum.

Fig. 5: Time dependences of the evaluated electric current (dashed line), the total radiation Prad and the radiation PEUV in the selected group 13

Fig. 6: Time dependences of the radiation power PEUV in group 13 for various initial gas pressures

Simulations for various initial gas pressures in the range 13–667 Pa were also executed for a 1.65 mm capillary radius, in order to find the optimum filling gas pressure. The radiation outputs are very sensitive to changes in the initial gas pressure, as shown in Figure 6. The highest output radiation value is achieved for an initial gas pressure of 80 Pa at time 85 ns; this is the optimum filling pressure of the gas for capillary radius R = 1.65 mm. A capillary L = 95 mm in length is considered here, but similar results in terms of optimum pressure are obtained for all capillary lengths 40–180 mm. The same procedure for finding the optimum gas pressure was repeated for two other capillary radii (0.8 and 2.5 mm). The results are shown in Figure 7; the line in Figure 7 shows the optimum pressure for the corresponding capillary radius. The optimum pressure is a monotonically decreasing function of the capillary radius. If we now compare the pressure-optimized output radiation powers for these three capillary radii, we can see that the highest emission is achieved for capillary radius 0.8 mm, with filling gas pressure 400 Pa (Figure 8).

Fig. 7: Dependence of the optimal pressure on the radius of the capillary

Fig. 8: Pressure-optimized time dependences of the radiation power PEUV in group 13 for different capillary radii

3 Spectrum according to FLYCHK

The one-dimensional FLYCHK code [6] was used to estimate the spectral properties of the output radiation from the capillary.
We considered the evolution of the plasma field quantities at the capillary centre (r = 0, z = L/2, where r and z are the radial and longitudinal coordinates, respectively, and L is the capillary length). Similar spectral properties of the plasma are also expected in the cylinder along the capillary axis 0.3 mm in radius, because of the uniformity of the plasma quantities in this region (see Figures 3, 4).

Fig. 9: Time evolutions of the electron density and temperature calculated by Z*, for gas pressure 80 Pa

Fig. 10: Time dependences of the relative nitrogen ion abundances for initial gas pressure 80 Pa (logarithmic scale)

The time evolutions of the electron density ne and the electron temperature Te were taken from the Z* code calculations (Figure 9) and were introduced as an input file into the FLYCHK code. Non-LTE plasma was presumed, and the simulations were executed in the time-dynamic regime, because the electron density is not high enough to fulfill the McWhirter criterion [7] for the LTE plasma model. Optically thick plasma 5 cm in length is presumed. The resulting evolutions of the relative abundances of the nitrogen ions for initial nitrogen pressure 80 Pa are shown in Figure 10 (only the 4+, 5+ and 6+ ion fractions are shown, since the other fractions have minor abundances). It is evident that He-like N5+ ions prevail throughout the current period only after time t ∼ 50 ns. Instantaneous spectra at time t = 85 ns for λ = 2.1–3.1 nm are shown in Figure 11. We see that the strongest spectral line, lying at the wavelength λ = 2.8785 nm and corresponding to the quantum transition 1s²–1s2p in helium-like ions, prevails over the other lines. The optical depth (log(I0/I)) of a 5 cm plasma column for different transitions is depicted in Figure 12. We see that there is huge absorption of the 2.88 nm spectral line in the plasma, i.e., plasma with such characteristics is almost totally non-transparent for this radiation. This conclusion is confirmed by applying the Elton formula to calculate the absorption coefficient [8]:

$k = n_l \, \frac{g_u}{g_l} \cdot \frac{\lambda^3}{8\pi c} \left(\frac{\Delta\lambda}{\lambda}\right)^{-1} A_{ul}$

where k is the absorption coefficient, gu and gl are the statistical weights of the upper and lower atomic level, respectively, nl is the number of ions on the ground level, λ is the radiation wavelength, Δλ is the line width due to Doppler broadening, and c is the speed of light. Presuming Δλ/λ = 1.45 × 10⁻⁴ [8], we calculated the value of the absorption coefficient k = 110 cm⁻¹, which, after multiplying by the 5 cm length, gives almost the same result as FLYCHK.

Fig. 11: Spectra for filling gas pressure 80 Pa at the time of maximum output radiation t = 85 ns

Fig. 12: Optical depth of the 5 cm plasma channel for gas pressure 80 Pa at the time of maximum output radiation t = 85 ns
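The Elton estimate is easy to reproduce numerically. In the sketch below, the wavelength, statistical weights and Doppler width follow the text; the ground-level ion density n_l and the transition probability A_ul are illustrative assumptions (the paper does not quote them), chosen only to show that the formula lands at the reported order of magnitude.

```python
import numpy as np

lam = 2.88e-7          # line wavelength [cm] (2.88 nm)
c = 3.0e10             # speed of light [cm/s]
g_u, g_l = 3.0, 1.0    # statistical weights of 1s2p (upper) and 1s^2 (lower)
dl_over_l = 1.45e-4    # Doppler line width, value quoted from [8]
A_ul = 1.8e12          # transition probability [1/s]; assumed, not from the paper
n_l = 1.0e17           # ground-level He-like ion density [cm^-3]; assumed

k = n_l * (g_u / g_l) * lam**3 / (8.0 * np.pi * c) / dl_over_l * A_ul
print(f"k = {k:.0f} cm^-1, optical depth over 5 cm = {5.0 * k:.0f}")
```

With these placeholder densities the coefficient comes out near the 110 cm⁻¹ reported in the text, i.e. an optical depth of several hundred over the 5 cm column, consistent with the FLYCHK result that the plasma is essentially opaque at 2.88 nm.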
4 Electrode heating

The Z* engine provides information on the temperature rise of the electrodes ΔTel from the beginning of the simulation. We investigated the heating of the electrodes during the discharge; this heating could be important when the source is working in a high-repetition regime. We considered the capillary discharge with the tip of one electrode protruding into the capillary channel (as in the experimental setup from reference [4], Figure 13), and compared it with a system with the electrodes lying in the extension of the capillary wall and open at both ends (schematically presented in Figure 14). According to the Z* code, the protruding tip is overheated. On the other hand, the heating is two times lower for the discharge geometry in Figure 14. This leads to the conclusion that the tip of the electrode in the first scheme has an adverse effect on the heating of the electrodes.

Fig. 13: Temperature increase of the electrodes for the discharge system with the tip drawn into the channel, 300 ns after the beginning of the discharge; the most heated parts are magnified

Fig. 14: Temperature increase of the electrodes for the discharge system with the electrodes in line with the capillary wall, 300 ns after the beginning of the discharge

5 The role of capillary shielding

In capillary discharge systems, the dielectric capillary is often surrounded by an outer metal shield in order to reduce the overall inductance of the discharge circuit: the smaller the inductance, the faster and higher the current pulse through the capillary will be. We performed simulations to investigate how the discharge current and the output radiation depend on the shielding of the capillary. For this purpose, a special geometry of the capillary was inserted as an input into the Z* code (see Figure 15). The shield is 7 mm from the axis of the capillary.

Fig. 15: Schematic diagram of a shielded capillary, represented in a cylindrically symmetrical geometry

The time dependences of the output radiation from the shielded and unshielded capillary, and also the corresponding discharge currents, are shown in Figure 16. The maximum output radiation in the Z* group 13 is approximately 15 % higher for the shielded capillary; at the same time, the current slope is steeper and the maximum is higher.

Fig. 16: Time dependences of the output radiation power in group 13 and the current for a shielded capillary (full line) and for an unshielded capillary (dashed line)

We also calculated the distribution of the z-component of the electric field in the discharge system at time 15 ns (the highest electric fields occur at this time, according to the Z* code) for a shielded and an unshielded capillary 4 cm in length (the radial component of the electric field is negligible in comparison with the z-component). The electric field reaches a value of around 45 kV/cm in the shielded capillary (Figure 17), and only 6 kV/cm in the unshielded capillary (Figure 18). Since electrical breakdown in air occurs at an electric field of 30 kV/cm, we conclude that the electrodes must be insulated (either by sinking into dielectric oil or by separation using an insulator, e.g. alumina) in a configuration with a shielded capillary.

Fig. 17: Distribution of the z-component of the electric field in the discharge for the shielded geometry (the axes are not proportional)

Fig. 18: Distribution of the z-component of the electric field in the discharge for the unshielded geometry (the axes are not proportional)
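The inductance reduction provided by the shield can be gauged with the textbook coaxial-line formula, treating the current path as a coaxial pair: forward current along the plasma column, return current through the shield. This is an assumed simplification (the effective channel radius in particular is a placeholder, and Z* does not use this model); the point is only the logarithmic scaling that makes a close return path lower the inductance.

```python
import numpy as np

mu0 = 4e-7 * np.pi   # vacuum permeability [H/m]

a = 1.65e-3    # effective radius of the current channel [m]; assumed placeholder
b = 7e-3       # shield radius, 7 mm from the axis (Section 5)
length = 0.04  # capillary length considered in Section 5 [m]

L_per_m = mu0 / (2.0 * np.pi) * np.log(b / a)   # coaxial inductance per length
print(f"added inductance ~ {L_per_m * length * 1e9:.1f} nH over {length*100:.0f} cm")
# a more distant return path (unshielded case) enlarges the log factor and hence
# the inductance, which is why the unshielded current pulse is slower and lower
```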
6 Suggested parameters of the new source

On the basis of the computer modeling described above, we propose the following parameters for a capillary discharge water window radiation source. The electrical parameters of the discharge may be as follows: capacitance C = 15 nF, initial voltage U0 = 80 kV, external resistivity R = 0.73 Ω, and external inductance L = 50 nH. A capillary radius of 0.8 mm is most suitable for a source with gas pressure 400 Pa. The capillary channel length should be only 4 cm, due to the high absorption of the water window radiation in the plasma column. Due to the high electric fields in a system with a shielded capillary, we suggest that an unshielded capillary be used as the source; the suggested discharge geometry is shown in Figure 18. The time dependences of the current and the radiation power in the Z* group 13 for the proposed system are shown in Figure 19. The highest XUV emission occurs approximately at the time of the current maximum, which is around 50 ns. The electron density and temperature distributions at the time of the emission peak are shown in Figures 20 and 21. The electron density is uniform along the z direction, while it decreases monotonically in the radial direction. The electron temperature is homogeneous in a broad capillary volume, with a value around 70 eV, suitable for creating N5+ ions.

Fig. 19: Time dependences of the output radiation power in group 13 and the current for the proposed geometry

Fig. 20: Radial and longitudinal distribution of electron density ne at the time of maximum current (the units in the figure are given in Av; the density expressed in cm⁻³ can be obtained by multiplying the value by the Avogadro constant NA = 6.02 × 10²³)

Fig. 21: Radial and longitudinal distribution of electron temperature Te at the time of the current maximum

7 Conclusions

The RMHD Z* code was used to model a capillary pinching discharge to obtain an incoherent, polychromatic "water window" radiation source. The radiation outputs in group 13 (2.1–3.1 nm) were evaluated for different initial gas pressures, different capillary radii and different capillary lengths. The optimal radius of the system was proposed (0.8 mm), with filling gas pressure around 400 Pa and capillary length 40 mm. Using the FLYCHK code as a postprocessor, the detailed kinetics of the nitrogen ions were computed and the relative abundances of the nitrogen ions were evaluated. After 50 ns, N5+ ions prevail over the other ions throughout the current period. The FLYCHK simulations showed very high absorption of the 2.88 nm radiation line in the plasma. Electrode heating was investigated for two different discharge configurations, and the role of capillary shielding was analyzed by Z*.

Acknowledgement

The authors gratefully acknowledge Professor Sergey Zakharov for his useful advice on the Z* code and on pinching discharges. This research was supported by MEYS research project C42 and by MEYS research project INGO LA 08024.

References

[1] Ford, T. W.: Imaging cells using soft X-rays. In: From Cells to Proteins: Imaging Nature across Dimensions. Amsterdam: Springer Netherlands, 2005, p. 167–185.

[2] Zakharov, S. V., Novikov, V. G., Choi, P.: Z* code for DPP and LPP source modeling. In: EUV Sources for Lithography (ed. V. Bakshi). Bellingham, Washington: SPIE Press, 2005, p. 223–275.

[3] Zakharov, S. V., Novikov, G. V., Mond, M., Choi, P.: Plasma dynamics in hollow cathode triggered discharge with influence of fast electrons on ionization phenomena and EUV emission. Plasma Sources Sci. Technol. 17(2), 2008, p. 13.

[4] Nevrkla, M.: Návrh a realizace zařízení pro studium kapilárního výboje v argonu. Diploma thesis. Praha: ČVUT – Fakulta jaderná a fyzikálně inženýrská, 2008.

[5] Vrba, et al.: Capillary pinching discharge as water window radiation source. 29th ICPIG, Cancún, México, 2009.

[6] Lee, R. W., Larsen, J. T.: A time-dependent model for plasma spectroscopy of K-shell emitters. JQSRT 56, 1996, p. 535–556.

[7] McWhirter, R. W. P.: In: Plasma Diagnostic Techniques (eds. R. H. Huddlestone, S. L. Leonard). Academic Press, New York, 1965, p. 201–264.
[8] Elton, R. C.: X-Ray Lasers. London: Academic Press, 1990.

Ing. Miloš Stefanovič
Prof. Miroslava Vrbová, CSc.
E-mail: stefanovic@fbmi.cvut.cz
Department of Natural Sciences
Faculty of Biomedical Engineering
Czech Technical University in Prague
Nám. Sítná 3105, Kladno 2, Czech Republic

Acta Polytechnica Vol. 52 No. 5/2012

Evaluation of bioelectrical impedance spectroscopy for the assessment of extracellular body water

Sören Weyer, Lisa Röthlingshöfer, Marian Walter, Steffen Leonhardt

Philips Chair for Medical Information Technology, Helmholtz Institute for Biomedical Engineering, RWTH Aachen University, Pauwelsstrasse 20, 52074 Aachen, Germany

Corresponding author: weyer@hia.rwth-aachen.de

Abstract

This study evaluates bioelectrical impedance spectroscopy (BIS) measurements to detect body fluid status. The multifrequency impedance measurements were performed in five female pigs. The animals were connected to an extracorporeal membrane oxygenation device during a lung disease experiment, and the fluid balance was recorded: every 15 min the amount of fluid infusion and the weight of the urine drainage bag were recorded, and the fluid balance was calculated from the fluid intake and output. These data were compared with values calculated from a mathematical model based on the extracellular tissue resistance and the Hanai mixture theory; the extracellular tissue resistance was measured with BIS. The experimental results strongly support the feasibility and clinical value of BIS for in vivo assessment of hydration status.

Keywords: bioelectrical impedance spectroscopy, fluid status, body composition, Hanai mixture theory.

1 Introduction

Maintenance of a constant volume and a stable composition of the body fluids is essential for homeostasis, and several of the most common and serious problems in clinical medicine are due to abnormalities in the constancy of the body fluid levels. Total body water is about 60 % of body weight; this implies that in an (average) adult person weighing 70 kg, total body water is about 42 liters. Total body fluid is distributed mainly between two compartments: two thirds of the fluid is inside the cells and is called intracellular fluid, while the remaining body fluid, located outside the cells, is called extracellular fluid. The extracellular fluid is mainly divided into interstitial fluid and blood plasma.

Edema is a condition that causes too much fluid to accumulate in the body. Any tissue or organ can be affected, but the feet and ankles are affected most often; this is known as peripheral edema. Other common types of edema affect the lungs, brain and eyes [10]. The extracellular edemas (which occur most frequently) are generally caused by an abnormal leakage of fluid from the plasma to the interstitial spaces across the capillaries, or by the failure of the lymphatic return from the interstitium back into the blood [10].

Due to the important role of body fluids, it is important to accurately assess the fluid status and its distribution. The gold standard for the measurement of body water is isotope dilution; however, this method is only appropriate in a research setting. Other methods, like dual-energy X-ray absorptiometry, are costly and impractical for continuous monitoring [1]. A promising alternative for continuous monitoring of fluid status is bioelectrical impedance spectroscopy (BIS).
In BIS, detection of body hydration is based on the significant impact of body water on the electrical impedance.

2 Bioimpedance

BIS allows measurement of the electrical body impedance at various frequencies, generally ranging from 5 kHz to 1 MHz. The frequency-dependent impedance is related to the different electrical properties of body tissue. Biological tissue is often divided into three regions of dispersion: α, β and γ. The α dispersion region is at low frequencies, generally < 1 kHz and mostly at frequencies < 10 Hz; in this region impedance changes of only 1–2 % occur, mainly due to surface conductance, ion gates and cell membranes. The β dispersion region generally lies between 10 kHz and 50 kHz for muscular tissue and is mostly based on phenomena of the cell membrane; the two major contributing factors to β dispersion are the shorting-out of the membrane capacitances and the rotational relaxation of biomacromolecules [8]. The γ dispersion region is at very high frequencies; this dispersion is caused by the sub-cellular components of the tissue and by the water relaxation of the tissue. The β dispersion lies within the frequency range measured by commercial BIS devices; it is therefore explained here in detail.

Figure 1: Low- and high-frequency current flow through the body (modified from [2])

The extracellular and the intracellular water content are mainly electrically resistive entities, whereas the cellular membrane (due to its lipid layer) has an isolating (capacitive) nature. Due to the conductive behavior of body fluid, the bioelectrical impedance is related to the fluid volume and its distribution [12]; the impedance of tissue is therefore strongly dependent on the frequency [2]. At low frequencies the extracellular fluid conducts the current, because the cell membranes and tissue interfaces act as capacitors: the current passes through the extracellular fluid and does not penetrate the cell membrane. At higher frequencies, this capacitive property is lost and the current is conducted by both types of fluid (see Fig. 1) [4, 2]. These physical characteristics are used to extrapolate the resistances of the extracellular and intracellular fluid, which are related to the fluid volumes. Typical BIS measurements result in a semi-circular arc in the complex impedance plane (Fig. 3). Figure 1 also shows the electrical equivalent circuit which can be deduced from this behavior; this circuit and its associated equation are known as the Cole-Cole model [3].

To calculate the resistances of the intracellular and extracellular body water, the Cole-Cole model is used. As mentioned above, the Cole model represents the tissue using an extracellular resistance (Re), an intracellular resistance (Ri) and a capacitor (cell membrane capacitance Cm). The model was extended to include a heuristic factor (α) representing the presence of different tissues in parallel with specific time constants [14]. Based on this model, the impedance can be expressed as [13]:

$Z(j\omega) = R_\infty + \frac{R_0 - R_\infty}{1 + (j\omega\tau)^{\alpha}}\, e^{-j\omega T_d}$,  (1)

with

$R_e = R_0$, $\quad R_i = \frac{R_e R_\infty}{R_e - R_\infty}$, $\quad \tau = (R_i + R_e)\, C_m$.

The values of the electrical model Re, Ri and Cm can be found using the body impedance at the frequencies ω = 0 (R0) and ω = ∞ (R∞) and solving the equations for the parallel circuit. Due to the technical setup of this study, only frequencies between 5 kHz and 1 MHz are used; thus, in practice, curve fitting methods are needed to calculate the parameters of the Cole-Cole model [2].
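A minimal numerical sketch of eq. (1) is given below, assuming illustrative parameter values rather than measured pig data; plotting −Im(Z) against Re(Z) over the 5 kHz–1 MHz sweep reproduces the semi-circular arc mentioned above.

```python
import numpy as np

def cole_cole(f, R0, Rinf, tau, alpha, Td=0.0):
    """Evaluate eq. (1): Z = Rinf + (R0 - Rinf)/(1 + (jw*tau)^alpha) * exp(-jw*Td)."""
    w = 2 * np.pi * f
    return Rinf + (R0 - Rinf) / (1 + (1j * w * tau) ** alpha) * np.exp(-1j * w * Td)

# Illustrative parameters (assumed, not values from this study)
R0, Rinf, tau, alpha = 80.0, 40.0, 1.2e-6, 0.85
f = np.logspace(np.log10(5e3), 6, 50)      # the 5 kHz .. 1 MHz sweep of the text
Z = cole_cole(f, R0, Rinf, tau, alpha)

# Extra- and intracellular resistances, as defined below eq. (1)
Re = R0
Ri = Re * Rinf / (Re - Rinf)
print(f"Re = {Re:.0f} ohm, Ri = {Ri:.0f} ohm")
# plotting (Z.real, -Z.imag) traces the semi-circular arc of Fig. 3
```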
3 Physiological modelling

Several physiological models have been proposed over the last decade. The simplest ones are the BIA standard model [5], the model known as the 0/∞-kHz parallel model [9], the 50-kHz parallel model [5, 9] and the 5/500-kHz parallel model [9]. The main disadvantage of these models is their inaccuracy: the model constants have to be suited to each specific subject measured. In practice, because this is very laborious and not always possible, the calculations are only rough approximations.

A more accurate model for the calculations, used in most commercial devices, is the Hanai suspension model [11, 6]. In this model, the body is approximated by means of several conductive cylinders in series, which represent the limbs and torso. All these cylinders are filled with a suspension containing nonconductive elements embedded in a conductive medium (extracellular fluid). Knowing the specific resistivity of the conductive medium allows the partial volumes (conductive and nonconductive) to be calculated. According to Hanai's theory, the specific resistivity ρa of a suspension of nonconductive spheres in a conductive medium is greater than the specific resistivity ρ0 of the medium and depends on the fractional volume of the nonconductive spheres. This apparent resistivity is related to the resistivity ρ0 of the conducting suspending medium and the volume fraction c (dimensionless) of non-conducting spheres by [11, 6]

$\rho_a = \frac{\rho_0}{(1 - c)^{3/2}}$.  (2)

By combining (2) with the formula for the resistance of a cylinder,

$R = \frac{\rho l}{A} = \frac{\rho l^2}{V}$,  (3)

it is possible to calculate the resistance of a cylinder filled with a suspension of nonconductive elements embedded in a conductive medium:

$R = \frac{\rho_0 h^2}{V (1 - c)^{3/2}}$,  (4)

where l has been replaced by the height h of the cylinder and V denotes its volume. Approximating the torso as a cylinder filled with a suspension of nonconductive elements in a conductive electrolyte, the nonconductive partial fraction c can be defined as

$c = 1 - \frac{V_{ecw}}{V_b}$,  (5)

where Vb is the volume of the body and Vecw the volume of the extracellular fluid. Using (4) and (5), the body resistance can be rewritten as

$R = \frac{\rho_{ecw}\, h^2\, V_b^{3/2}}{V_b\, V_{ecw}^{3/2}}$.  (6)

With (6), the extracellular fluid can be estimated as in [13]:

$V_{ecw} = \frac{1}{1000} \left(\frac{\rho_{ecw}^2}{d_b}\right)^{1/3} \left(\frac{h^2 \sqrt{w}}{R_e}\right)^{2/3}$.  (7)

Here w is the weight in kg and h is the height in cm; db is the body density, generally 1050 kg/m³ for biological tissues, and ρecw is the specific resistivity of the extracellular water, averaging 40 Ω cm [16]. Thus, it is possible to predict the extracellular volume as long as the physiological parameters are known.
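The mixture relations (2) and (4) are straightforward to evaluate numerically. The sketch below uses the 40 Ω cm extracellular resistivity quoted in the text; the cylinder height and volume are invented placeholders, so the printed numbers only illustrate how the resistance grows as the nonconductive fraction c rises.

```python
import numpy as np

rho0 = 40.0                       # resistivity of the suspending medium [ohm*cm]
c = np.linspace(0.0, 0.9, 10)     # nonconductive volume fraction (dimensionless)

rho_a = rho0 / (1.0 - c) ** 1.5   # eq. (2): apparent resistivity

# eq. (4): resistance of a cylinder of height h and volume V filled with
# the suspension; h and V are invented placeholder values
h, V = 100.0, 6.0e4               # [cm], [cm^3]
R = rho0 * h**2 / (V * (1.0 - c) ** 1.5)

for ci, ri in zip(c, R):
    print(f"c = {ci:.1f}  ->  R = {ri:6.1f} ohm")
```

Inverting this monotone relationship is exactly what eq. (7) does: a lower measured Re implies a smaller nonconductive fraction, i.e. more extracellular water.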
4 Experimental setup

The present study was realized as a sub-experiment during an extracorporeal membrane oxygenation (ECMO) trial in pigs. In a typical extracorporeal oxygenation system, blood is collected from the body through large-diameter cannulas placed in the femoral veins; from there, a blood pump transports the blood to the oxygenator and back into the body. In the present experiment, five female pigs (weighing from 60 to 80 kilograms) were positioned supine, intubated and mechanically ventilated in volume-controlled mode. A double-lumen catheter was percutaneously inserted into the femoral veins, as is standard in ECMO, and the venous blood was drained by a blood pump. Once the blood had passed through the ECMO circuit, it returned through a cannula inserted in the right internal jugular vein and pushed forward so that the tip lay just above the right atrium. The body temperature of the animals was maintained by varying the temperature of a heating pad underneath the animal. The bladder was catheterized and an infusion was started.

The ECMO circuit requires a high blood flow from the venous system; in order to avoid vasoconstriction in the femoral vein, a large volume of fluid must be syringed. The infusion consisted of Ringer's solution (as required), a saline solution, or a hydroxyethyl starch solution. The amount of infusion supplied was measured every 15 min using the liquid level scale on the infusion bottle, and the volume of urine was measured by weighing the urine drainage bag, in order to evaluate the fluid balance of the animal. At the time of recording the fluid level, a BIS measurement was also performed.

Figure 2: Sketch showing the position of the electrodes on the animal

A Xitron Hydra 4200 system was used to measure the bioimpedance at several frequencies. The device measures at 50 frequencies between 5 kHz and 1000 kHz and uses an excitation current in the range 50 µA to 700 µA. The system is designed for a tetra-polar arrangement of electrodes: one pair of electrodes passes a small alternating current into the body, and the other pair measures the resulting drop in voltage. The impedance is the ratio of voltage to current. This four-electrode technique reduces the effects of skin impedance [7, 15]. Each measurement was repeated at least ten times and arithmetically averaged, to minimize measurement errors due to movement artefacts caused by breathing. In this setup, a current electrode and an adhesive measurement electrode were placed on the shaved skin of the anterior leg and the posterior leg of the pig (see Fig. 2); these electrodes were fixed with adhesive tape.

5 Results

Figure 3: Plot of the BIS measurements on the complex impedance plane; the fading color indicates the increase of liquid

Figure 3 shows the results of the measurements in one pig. As expected, the body impedance decreased with an increase in infused volume. The bioelectrical parameters of each pig were estimated by fitting these curves to the Cole-Cole model (described by (1)), with the aim of relating its parameters to the physiological characteristics of the tissue. With the help of fitting algorithms, it is possible to determine the Cole-Cole model parameters Re, Ri, α and Cm from each curve. In the present case, the extracellular fluid is particularly interesting, since Ringer's solution is used to replace a deficit of extracellular fluid: it closely resembles physiological body fluid in terms of electrolyte concentration and osmolality. Hence, the extracellular fluid volume increases significantly after infusion of Ringer's lactate, whereas the intracellular fluid volume remains constant. Thus, with the fitted Re values and the weight and height of the pigs, it is possible to calculate the extracellular fluid volume with (7).
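A sketch of such a fitting step is shown below, with a synthetic sweep standing in for one averaged Hydra 4200 measurement (the true per-pig data are not reproduced here); a least-squares routine recovers the Cole-Cole parameters, from which Re and Ri follow as in eq. (1).

```python
import numpy as np
from scipy.optimize import least_squares

def cole_cole(f, R0, Rinf, tau, alpha):
    w = 2 * np.pi * f
    return Rinf + (R0 - Rinf) / (1 + (1j * w * tau) ** alpha)

def residuals(p, f, Z_meas):
    # fit real and imaginary parts simultaneously
    Z = cole_cole(f, *p)
    return np.concatenate([(Z - Z_meas).real, (Z - Z_meas).imag])

# synthetic "measurement" standing in for an averaged sweep
f = np.logspace(np.log10(5e3), 6, 50)
Z_meas = cole_cole(f, 75.0, 42.0, 1.0e-6, 0.85)

fit = least_squares(residuals, x0=[60.0, 30.0, 5e-7, 0.7], args=(f, Z_meas),
                    bounds=([1.0, 1.0, 1e-9, 0.1], [500.0, 500.0, 1e-3, 1.0]))
R0, Rinf, tau, alpha = fit.x
Re, Ri = R0, R0 * Rinf / (R0 - Rinf)
print(f"Re = {Re:.1f} ohm, Ri = {Ri:.1f} ohm, alpha = {alpha:.2f}")
```

Bounded least squares is a natural choice here because all four parameters are physically constrained (positive resistances, 0 < α ≤ 1), which keeps the fit from wandering when the measured arc is noisy.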
Figure 4: Estimation of the fluid balance, calculated with the Hanai model, plotted over the infused fluid volume

Figure 4 shows the results of the calculations with the Hanai model as a function of the change in fluid balance. To differentiate between the five pigs, different colors are used; each measurement in each animal experiment corresponds to one point in Fig. 4. As the fluid balance could only be recorded after catheterization of the animal in the operating room, the amount of liquid absorbed before the intervention could not be monitored and is unknown; in some cases a negative balance is therefore observed at the start of the experiment. The amount of fluid supplied was mainly determined by the blood flow requirements of the ECMO. In addition, some pigs developed renal failure during the experiment and could no longer excrete urine. For these reasons, the data sets do not cover the same range of fluid in all cases.

Figure 5: Bland-Altman plot illustrating the smallest and greatest differences between the changes in extracellular body fluid (ECF) estimated by BIS and the logged fluid balance, plotted against the mean of the two methods

The Bland-Altman plot (Fig. 5) illustrates the differences between the BIS-predicted fluid volume and the logged fluid balance.

6 Conclusion

This study demonstrates that BIS can be used to continuously estimate the extracellular water content. The limits of agreement (average difference ± 1.96 standard deviations of the difference) are 0.45 ℓ and −0.65 ℓ; the mean difference between the two methods is −100 ml. This deviation may be due to evaporation related to breathing or sweat, and to the unknown fluid conditions before the experiment. Initially, the limits of agreement seem large; however, the Bland-Altman plot shows that in the physiological fluid range (< 1.5 ℓ) the limits of agreement are 20 % smaller. With a high fluid infusion, on the other hand, the approximation of equation (7) loses its validity. BIS measurements are safe, non-invasive, portable, low-cost and simple, and they provide instant results. The estimated fluid balance calculated from the BIS measurements shows a strong correlation with the water balance. Furthermore, BIS is sensitive enough to detect fluid compartment differences and changes in fluid compartment volumes due to an acute intravenous administration of fluid. Therefore, BIS seems highly feasible for monitoring water content in humans.
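For reference, the limits-of-agreement computation behind Fig. 5 takes only a few lines. The data below are synthetic stand-ins whose bias and spread were chosen to mimic the reported agreement (−100 ml mean difference, limits near 0.45 ℓ and −0.65 ℓ); only the procedure is the point.

```python
import numpy as np

def limits_of_agreement(a, b):
    """Bland-Altman statistics: bias and mean difference +/- 1.96 SD."""
    diff = a - b
    bias = diff.mean()
    half_width = 1.96 * diff.std(ddof=1)
    return bias, bias - half_width, bias + half_width

# synthetic stand-in data, tuned to mimic the reported agreement
rng = np.random.default_rng(0)
balance = rng.uniform(0.0, 4.0, 40)                # logged fluid balance [l]
bis_ecf = balance + rng.normal(-0.1, 0.28, 40)     # BIS-estimated ECF change [l]

bias, lo, hi = limits_of_agreement(bis_ecf, balance)
print(f"bias = {bias*1000:.0f} ml, limits of agreement = [{lo:.2f}, {hi:.2f}] l")
```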
Acknowledgements

The project has been selected under the operational programme co-financed by the European Regional Development Fund (ERDF), Objective 2 "Regional Competitiveness and Employment" (2007–2013), North Rhine-Westphalia (Germany).

References

[1] Carlina V. Albanese, Evelyn Diessel, Harry K. Genant. Clinical applications of body composition measurements using DXA. Journal of Clinical Densitometry 6(2):75–85, 2003.

[2] L. Beckmann, S. Hahne, G. Medrano, et al. Monitoring change of body fluids during physical exercise using bioimpedance spectroscopy. Annual International Conference of the IEEE Engineering in Medicine and Biology Society 2009:4465–4468, 2009.

[3] Kenneth Stewart Cole. Membranes, Ions, and Impulses. University of California Press, 1968.

[4] P. L. Cox-Reijven, P. B. Soeters. Validation of bio-impedance spectroscopy: effects of degree of obesity and ways of calculating volumes from measured resistance values. International Journal of Obesity and Related Metabolic Disorders 24(3):271–280, 2000.

[5] K. J. Ellis. Human body composition: in vivo methods. Physiological Reviews 80(2):649–680, 2000.

[6] M. Fenech, M. Maasrani, M. Y. Jaffrin. Fluid volumes determination by impedance spectroscopy and hematocrit monitoring: application to pediatric hemodialysis. Artificial Organs 25(2):89–98, 2001.

[7] K. R. Foster, H. C. Lukaski. Whole-body impedance – what does it measure? The American Journal of Clinical Nutrition 64(3 Suppl):388–396, 1996.

[8] Sverre Grimnes, Ørjan Grøttem Martinsen. Bioimpedance and Bioelectricity Basics. Academic Press, San Diego, second edition, 2000.

[9] R. Gudivaka, D. A. Schoeller, R. F. Kushner, M. J. Bolt. Single- and multifrequency models for bioelectrical impedance analysis of body water compartments. Journal of Applied Physiology 87(3):1087–1096, 1999.

[10] Arthur Clifton Guyton, John Edward Hall. Textbook of Medical Physiology. Elsevier Saunders, Philadelphia, PA, 11th edition, 2006.

[11] Tetsuya Hanai, Naokazu Koizumi, Rempei Gotoh. Dielectric properties of emulsions. Kolloid-Zeitschrift & Zeitschrift für Polymere 184(2):143–148, 1962.

[12] E. C. Hoffer, C. K. Meador, D. C. Simpson. Correlation of whole-body impedance with total body water volume. Journal of Applied Physiology 27(4):531–534, 1969.

[13] A. De Lorenzo, A. Andreoli, J. Matthie, P. Withers. Predicting body cell mass with bioimpedance by using theoretical methods: a technological review. Journal of Applied Physiology 82(5):1542–1558, 1997.

[14] Ulrich Moissl, Peter Wabel, Steffen Leonhardt, Rolf Isermann. Modellbasierte Analyse von Bioimpedanz-Verfahren (Model-based analysis of bioimpedance methods). at – Automatisierungstechnik 52(6):270–279, 2004.

[15] J. Rosell, J. Colominas, P. Riu, et al. Skin impedance from 1 Hz to 1 MHz. IEEE Transactions on Biomedical Engineering 35(8):649–651, 1988.

[16] Xitron Technologies Inc., San Diego, California, U.S.A. Hydra 4200 Bioimpedance Spectroscopy Operating Manual, 2007.

Acta Polytechnica Vol. 52 No. 4/2012

Strontium as a structure modifier for non-binary Al–Si alloy

Barbora Bryksí Stunová

CTU in Prague, Faculty of Mechanical Engineering, Department of Manufacturing Technology, Technická 4, 166 07 Prague, Czech Republic

Correspondence to: barbora.stunova@fs.cvut.cz

Abstract

This paper presents a study of the influence of an addition of 400 ppm of strontium on the structure of AlSi10Mg alloy. Not only changes in the morphology of the eutectic silicon are monitored, but in particular changes in the morphology of the intermetallic phases, namely the phases containing iron and magnesium. The effect of strontium on structural defects, namely cavity formation, is also observed. It was found that in the non-binary Al–Si–Mg system the intermetallic phases of magnesium are also affected by the addition of strontium: in particular, the Mg2Si phase changes its morphology significantly from the unmodified to the modified structure. Moreover, the findings of other authors that strontium has a negative effect on the level of gas porosity and on the distribution of shrinkage are also confirmed.

Keywords: strontium, aluminum alloys, structure, structural defects.

1 Introduction

In the foundry industry, strontium is used as a modifier element.
When it is added, the eutectic crystallization is affected in order to improve the mechanical properties, especially ductility; see e.g. [1]. Findings of various authors in recent years [2–6] show that modification elements, including strontium, affect not only the morphology of the eutectic silicon but also the intermetallic phases. In this regard, interesting findings concerning iron have been published. Iron, as a harmful element creating undesirable intermetallic phases in Al–Si alloys, is commonly compensated in practical applications by the addition of manganese, which helps to create intermetallic phases with a more favorable morphology (skeletal "Chinese script"). It has been shown by other authors, e.g. [2, 5], that the morphology of the iron intermetallic phases can also be influenced by adding strontium. These authors usually examined binary alloys with an addition of iron, or alloys containing copper. Other authors [1] describe the influence of strontium modification on the structural defects of Al–Si alloys, which mainly involve shrinkage and gas porosity, often in combination. Generally, strontium has a negative influence on structural defects such as cavities.

The aim of this work was to perform a complex experiment with a non-binary Al-Si alloy modified by strontium, with the following goals:
• to observe the modification effect of various modification agents containing strontium in various amounts
• to confirm the influence of strontium on the intermetallic phases containing iron for a specific non-binary alloy
• to observe the influence of strontium on the other intermetallic phases present in the specific non-binary Al–Si alloy
• to confirm the influence of strontium on the structural defects of a non-binary alloy
• to compare the effect of different modification agents containing strontium in various amounts on the factors mentioned above
• to compare the structural changes with the unmodified structure
• to analyze the results and formulate the theoretical consequences.

2 Theoretical background

The modification effect of strontium on the morphology of eutectic silicon is widely known and widely used in foundry practice; however, the principle is still not satisfactorily explained. One of the most accepted theories for explaining the modification principle is the IIT (impurity induced twinning) mechanism, e.g. [1]. However, recent studies have shown that modification changes the nucleation frequency and dynamics of the eutectic grains, with associated effects on the growth rate. In unmodified commercial Al-Si alloys, a large number of eutectic grains nucleate at or near the primary aluminum dendrite tips, and eutectic aluminum forms epitaxially on the primary dendrites. On the other hand, with the addition of eutectic modifiers, e.g. Sr, a dramatic decrease in the nucleation frequency of the eutectic grains is observed, and the grains are nucleated independently of the primary phase at distributed centers in the interdendritic regions. The eutectic reaction in Al–Si alloys commences with the nucleation of the silicon phase, which is the leading phase during growth of the Al–Si eutectic. Aluminum phosphide (AlP) particles are very potent nuclei for eutectic silicon in commercial Al-Si alloys, where phosphorus is commonly present as an impurity element. The addition of sodium neutralizes AlP and thus makes the nucleation of eutectic silicon more difficult.
More recent studies of eutectic nucleation have confirmed that AlP nucleates eutectic silicon, and the large reduction in the nucleation frequency of eutectic grains in Sr-modified Al–Si alloys appears to be caused by some poisoning mechanism of the potent nuclei [2]. In addition to the effect of Sr addition on the growth of eutectic Si, recent studies have confirmed that Sr also significantly changes the nucleation behavior of the eutectic phases. It is proposed that the addition of Sr deactivates AlP and/or oxide bifilms as favored nucleation sites for eutectic Si; thus, Si is forced to nucleate at a lower temperature on some unknown substrate and grows as a fine, fibrous eutectic Si with high twin density. In order to confirm one of the proposed mechanisms, there is great interest in analyzing the local distribution of the modifying element within the Al-Si eutectic and at the Al/Si eutectic interfaces [3].

The modification may also affect the morphology of undesirable intermetallic phases, e.g. the phases containing iron. As exemplified in [2, 4–6], some modifier elements, e.g. strontium and potassium, have a positive effect on the morphology of β-Al5FeSi, or on the conversion of this phase to the α-Al8Fe2Si phase. The available literature does not satisfactorily describe the relationship between strontium and magnesium: some studies have identified magnesium as an element that facilitates the modification, while other studies have attributed to magnesium a negative effect on modification. However, no studies have been carried out on the effect of strontium on the morphology of the intermetallic phases containing magnesium. In Al-Si alloys, magnesium is usually found as the Mg2Si phase, but it may also be present in iron-containing phases such as the FeMg3Si6Al8 phase, or in other complex eutectics.

Modification elements may have not only a positive structural influence, but also a negative effect, manifested by an increase in porosity. One consideration is porosity due to hydrogen. Another important aspect of the modification, however, is the change in the solidification temperature range: the modification may change the pattern of shrinkage and may cause gas porosity. Experiments performed by other authors have shown that the total shrinkage, which is an alloy property, is not affected by modification at all. However, the way in which this shrinkage is distributed between macro-piping and microshrinkage depends strongly on whether or not the alloy is modified. Both sodium and strontium cause a significant diminution of the primary pipe and an increase in the amount of microporosity; in other words, shrinkage is redistributed when modification occurs [1].

3 Experimental

The non-binary EN AC-43100 (AlSi10Mg) alloy was chosen for the experiment (for the chemical composition, see Table 1). The batch consisted only of alloy ingots, with no recycled material; the melt was neither refined nor degassed. The samples were cast in a green sand molding mixture, with 4 casting samples in one mold. The reference samples were cast without modification; 400 ppm of strontium was added to the modified samples in the form of the following five modification agents: AlSr3.5, AlSr5 and AlSr10 pre-alloys (all wrought), a cast AlSr10 pre-alloy, and pure strontium. Samples for metallography were taken from the castings, and the chemical composition of the samples was measured (Table 2).
Table 1: Standard prescribed chemical composition of the experimental alloy AlSi10Mg (wt. %)

Alloy/Element | Si | Fe | Cu | Mn | Mg | Ni | Zn | Ti | Al
EN AC-43100 (AlSi10Mg) | 9–11 | 0.55 | 0.10 | 0.45 | 0.20–0.45 | 0.05 | 0.10 | 0.15 | remainder

Table 2: Resultant measured chemical composition of the casting samples (wt. %)

Agent/Element | Si | Fe | Cu | Mn | Mg | Ni | Zn | Ti | Sr | P | Al
Unmodified | 10.304 | 0.304 | 0.032 | 0.230 | 0.398 | 0.005 | 0.078 | 0.055 | 0 | 0.006 | 88.496
AlSr3.5 | 10.337 | 0.311 | 0.024 | 0.153 | 0.362 | 0.006 | 0.098 | 0.076 | 0.037 | 0.006 | 88.511
AlSr5 | 10.627 | 0.327 | 0.022 | 0.156 | 0.332 | 0.006 | 0.093 | 0.077 | 0.044 | 0.006 | 88.225
Cast AlSr10 | 10.177 | 0.254 | 0.040 | 0.320 | 0.402 | 0.002 | 0.037 | 0.024 | 0.031 | 0.006 | 88.613
Wrought AlSr10 | 10.247 | 0.246 | 0.041 | 0.318 | 0.412 | < 0.002 | 0.035 | 0.024 | 0.044 | 0.006 | 88.532
Pure Sr | 10.689 | 0.259 | 0.025 | 0.163 | 0.337 | 0.005 | 0.086 | 0.071 | 0.018 | 0.006 | 88.259

4 Results

The metallography provided some interesting findings. On closer examination, the eutectic grain boundary areas show coarse eutectic silicon. In addition, these sites contain coarse intermetallic phases, in particular phases of iron, of iron and manganese, and of magnesium (Figure 1).

Figure 1: AlSr3.5-modified sample — characteristic intermetallic phases at the grain boundaries, 200×; 1 – phase of iron, probably β-Al5FeSi; 2 – phase of iron and manganese, probably Al15(FeMn)3Si2; 3 – phase of magnesium, Mg2Si

A spectral analysis of selected phases was performed to confirm the above assumptions about the composition of the intermetallic phases at the grain boundaries. It turned out that magnesium is also present in the phases that were assumed to be of the Al(FeMn)Si type. Figure 2 shows a detail of connected intermetallic phases. Position spectrum 1 was identified as a phase in which, in addition to aluminum and silicon, chromium also occurs, and especially manganese and iron in the order of units to tens of percentage points. In addition, spectrum 3 shows a content of 1.32 wt. % of magnesium. Besides the aluminum and silicon content, positions 2 and 4 also show around 6 wt. % magnesium and 4 wt. % iron, but very little or no manganese. Similarly, Figure 3 shows, as position 1, a phase containing magnesium (12 %), iron (4.3 %) and manganese (0.83 %). Position 2 is the eutectic silicon phase. Position 3 is rich in magnesium and silicon, and its morphology may be considered as that of an Mg2Si phase, confirming the original assumption. When comparing the structures obtained with the different modifying agents, various morphologies of the individual intermetallic phases and of the cavity structural defects can be observed.

Figure 2: Detail of the intermetallic phases of the modified structure, from electron microscopy and spectral analysis

Figure 3: Detail of the intermetallic phases of the modified structure, from electron microscopy and spectral analysis

Figure 4: Unmodified structure, mag. 1000×

An unmodified structure (Figure 4) contains the primary dendrites of the solid solution and the coarse eutectic, together with intermetallic phases of iron, which are of acicular to skeletal character, the branches of these particles being quite massive. Occasionally there are skeletal particles of the magnesium phase Mg2Si, always tied to the eutectic silicon or to the iron phase; the magnesium phases are often fragmented. There are "lace" formations containing various intermetallics (Figure 4); according to their color, these mixed phases can be considered as phases of iron and magnesium. The unmodified structure does not contain a significant amount of shrinkage porosity.
In the structure modified by the AlSr3.5 agent, coarse acicular to flat formations of iron and manganese phases occur at the grain boundaries, and there are skeletal magnesium formations, which are always bound to the iron phase and/or to the modified eutectic silicon (fibrous, becoming acicular closer to the grain boundaries) (Figure 5). Cavities are present, usually a combination of interdendritic shrinkage and gas porosity.

Figure 5: Structure modified by AlSr3.5, mag. 1000×

In the structure modified by the AlSr5 agent (Figure 6), acicular to rough flat formations of iron and manganese phases and skeletal magnesium formations occur. There are large numbers of cavities in all samples, mostly interdendritic shrinkage porosity combined with gas porosity.

Figure 6: Structure modified by AlSr5, mag. 1000×

The structure modified by the cast AlSr10 agent shows rough acicular to skeletal phases of iron and manganese, and skeletal magnesium phase formations, which are always bound to the iron phases (Figure 7). Finer phases of iron interfere with the eutectic. Interdendritic shrinkage combined with gas bubbles appears in the structure.

Figure 7: Structure modified by the cast AlSr10 agent, mag. 1000×

In the structure modified by the wrought AlSr10 agent (Figures 8, 10, 11), coarse acicular to skeletal phases of iron and manganese, and skeletal magnesium phases, occur. These magnesium phases are always bound to the iron phases, and the iron phases interfere with the eutectic silicon. The coarse iron formations are rougher than in the structure modified by the cast AlSr10 agent. Cavities are present, usually a combination of interdendritic shrinkage and gas porosity.

Figure 8: Structure modified by the wrought AlSr10 agent, mag. 1000×

Pure Sr affected the structure (Figure 9) in such a way that there are coarse to flat formations of iron phases which interfere with the eutectic; they are often parallel. The magnesium phases are gentle, always tied to other phases. There are also small round to teardrop-shaped formations of mixed phases. Combined porosity is present.

Figure 9: Structure modified by pure Sr, mag. 1000×

5 Discussion of results
figure 10: hydrogen bubbles on the fracture area, both samples are modified by the wrought alsr10 agent figure 11: dendritic arms and interdendritic shrinkage on the fracture area, both samples are modified by the wrought alsr10 agent unmodified structures, especially characteristic intermetallic phases (figure 4), have a different morphology from modified structures. the most significant difference can be observed in the morphology of magnesium phase mg2si (black color particles in figure 4, or skeletal shapes in other pictures). the sample obtained by pure strontium modifying has the morphology of the mg2si phases closest to the unmodified structures. this similarity is also related to the residual strontium content in this sample (table 2). the iron phases also have a different morphol30 acta polytechnica vol. 52 no. 4/2012 ogy in the unmodified sample and in the structure obtained by modification, where these phases are more skeletal or flat, and not so coarse. the only exception is again the structure modified by pure strontium, which is similar to the unmodified structure, and also contains “lace” or teardrop-shaped formations of mixed phases. the specific phenomenon of non-binary alloys can be observed, e.g. the alsi10mg alloy that the intermetallic phases interfere with each other and with eutectic silicon on the eutectic grain boundaries. magnesium phase mg2si in particular, is always bound to phases of iron. structural defects, especially cavities, were also observed in the obtained structures. it can be stated that the unmodified structure contains less porosity than the modified structure. even the structure modified by pure strontium contains more cavities of combined porosity. 6 conclusions the following conclusions can be drawn based on the presented experiments: • all structures (apart from the unmodified structures) were modified, and the morphology of eutectic silicon was changed from coarse to finer acicular or fibrous eutectic silicon. modification was achieved by all the modification agents based on strontium that were used. • it was observed that the structure of a nonbinary alloy such as alsi10mg can also be influenced by strontium (in addition to the modification effect): the coarse acicular morphology of iron based intermetallic phases changes to a flatter shape or even to a skeletal shape when strontium is added. this finding confirms and complements findings of other authors (e.g. [2,4–6]). • it was found that in the non-binary system al– si–mg intermetallic phases of magnesium are also affected by the addition of strontium: especially phase mg2si changes significantly in morphology from an unmodified structure to a modified structure. the change can be described as a transformation from a fragmented phase to skeletal formations. • it was confirmed that the addition of strontium influences structural defects such as cavities. modified structures show a greater degree of porosity, which can generally be referred to as combined porosity (gas + shrinkage). • it was confirmed that various modification agents based on strontium, with various amounts of strontium in the pre-alloy, behave differently with different effects. the effect on the morphology and the mechanical properties of eutectic silicon has been published e.g. in [8–10]. the effect on the morphology of intermetallic phases is described above. 
For an exact description of the influence of each individual pre-alloy, it will be necessary to make further observations with a qualitative and quantitative evaluation of the morphology of the intermetallic phases.
• There are significant differences between the unmodified structures and the modified structures (irrespective of the modification agent): in addition to the change in the eutectic silicon morphology, changes in the morphology of the intermetallic phases, e.g. the phases based on iron and magnesium, are observed (see above).
• The changes in the morphology of the intermetallic phases influence the mechanical properties of the alloy. The work has shown that strontium affects the morphology of the intermetallic phase Mg2Si. This phase can influence not only tensile strength or ductility, but also the process of heat treatment, mainly the hardening of Al alloys, which comprises solution annealing and aging. It can be assumed that if the hardening phase Mg2Si has a suitable and controllable morphology, with uniform distribution in the volume of the casting, the annealing could be shortened or the effect of the heat treatment could be greater. Further studies are needed, and more experiments need to be performed, to verify this assumption.

Acknowledgement

The work presented in this paper was supported by project SGS OHK2-038/10.

References

[1] Gruzleski, J. E., Closset, B. E.: The Treatment of Liquid Aluminium–Silicon Alloys. Des Plaines: American Foundrymen's Society, Inc., 1999. 256 pp.

[2] Cho, Y. H., et al.: Effect of strontium and phosphorus on eutectic Al–Si nucleation and formation of β-Al5FeSi in hypoeutectic Al–Si foundry alloys. Metallurgical and Materials Transactions, 2008, Vol. 39, No. 10, p. 2435–2448. http://www.springerlink.com/content/6883p01714hp2774/ DOI: 10.1007/s11661-008-9580-8.

[3] Timpel, M., et al.: Microstructural investigation of Sr-modified Al–15 wt.% Si alloys in the range from micrometer to atomic scale. Ultramicroscopy, 2011, Vol. 111, Issue 6, p. 695–700. ISSN 0304-3991.

[4] Haro-Rodríguez, S., et al.: On influence of Ti and Sr on microstructure, mechanical properties and quality index of cast eutectic Al–Si–Mg alloy. Materials & Design, 2011, 32, p. 1865–1871. ISSN 0261-3069.

[5] Ashtari, P., Tezuka, H., Sato, T.: Modification of Fe-containing intermetallic compounds by K addition to Fe-rich AA319 aluminum alloys. Scripta Materialia, 2005, 53, p. 937–942. ISSN 1359-6462.

[6] Shabestari, S. G., Ghodrat, S.: Assessment of modification and formation of intermetallic compounds in aluminum alloy using thermal analysis. Materials Science & Engineering A, 2007, A 467, p. 150–158. ISSN 0921-5093.

[7] Li, Z., et al.: Parameters controlling the performance of AA319-type alloys: Part I. Tensile properties. Materials Science & Engineering A, 2004, 367, p. 96–110. ISSN 0921-5093.

[8] Bryksí Stunová, B.: Study of Modification Effect of Different Types of Agents Based on Strontium in Al-Si Alloys. Dissertation. CTU in Prague, Faculty of Mechanical Engineering, 2011. 148 pp.

[9] Bryksí Stunová, B.: Studium modifikačního účinku stroncia z hlediska času po modifikaci a koncentrace v předslitině. In 48. slévárenské dny – Abstracts Proceedings. Brno: Česká slévárenská společnost, 2011, p. 34. ISBN 978-80-02-02337-1.

[10] Stunová, B., Luňák, M.: Al4Sr particles size and morphology influence on modification of Al–Si alloys. In 19th International Conference on Metallurgy and Materials. Ostrava: Tanger Ltd, 2010, p. 642–646. ISBN 978-80-87294-17-8. WoS: 000286658700109.
wos: 000286658700109.

shape synthesis in mechanical design

c. p. teng, s. bai, j. angeles

the shaping of structural elements in the area of mechanical design is a recurrent problem. the mechanical designer, as a rule, chooses what are believed to be the "simplest" shapes, such as the geometric primitives: lines, circles and, occasionally, conics. the use of higher-order curves is usually not even considered, not to speak of curves other than polynomials. however, the simplest geometric shapes are not necessarily the most suitable when the designed element must withstand loads that can lead to failure-prone stress concentrations. indeed, as mechanical designers have known for a while, stress concentrations occur, first and foremost, by virtue of either dramatic changes in curvature or extremely high values thereof. as an alternative, we propose here the use of smooth curves that can be generated simply, using standard concepts such as non-parametric cubic splines. these curves can readily be used to produce either extruded surfaces or surfaces of revolution.

keywords: curve synthesis, cubic splines, optimum design, g2-continuity.

1 introduction

the problem of curve synthesis occurs frequently in mechanical design [1, 2, 3, 4, 5, 6]. this problem arises whenever two flat – in the case of a surface – or straight – in the case of a line – segments of a machine element are to be joined, either to close an orifice smoothly or to join that segment to the machine frame. a smooth transfer is required in order to prevent stress concentrations. mechanical designers over the years have designed machines with rectangular bores using circular arcs to round the corners. this is done with two purposes: (i) to avoid the stress concentrations that would arise due to an infinite curvature at the corner, and (ii) to ease the machining of the bore. the problem with circular arcs is that they provide only g1-continuity. second-order geometric continuity, g2, on the other hand, requires that the two curvatures coincide at the blending point; however, the curvature of the straight segment is zero, while that of the circular arc is the reciprocal of the radius of the circle, which is a finite quantity different from zero. the outcome is that stress concentrations are not eliminated, because of the curvature discontinuity at the point of blending [7]; this jump is written out explicitly at the end of section 2, below. in this paper, a methodology is proposed aiming at the production of g2-continuity at the blending of segments of two curves. the paper is based on the concept of curve synthesis, as first proposed in [1]. while the foregoing reference resorts to parametric cubic splines to synthesize geometric curves, we show in this paper that a geometric curve can also be synthesized using non-parametric splines, thereby streamlining the procedure.

2 problem statement

first and foremost, a distinction is made between a curve representing the plot of a function and a geometric curve: while the slope of the former is bounded, that of the latter is not; moreover, a geometric curve can have cusps, can cross itself, and need not be representable by analytic functions. the paper focuses on geometric curves, while resorting to non-parametric cubic splines. curve synthesis in this context is defined as: given two curve segments γa and γb lying in the same plane, find a third curve segment γc that blends smoothly with both γa and γb at the blending points a and b, with g2-continuity. notice that we stated the search for a segment, as opposed to the segment, to emphasize that the problem admits multiple solutions. in fact, the problem admits infinitely many solutions. to pinpoint one particular solution, we must impose additional conditions. many are possible; the one that we adopt here is that the segment sought, γc, be "as straight as possible." what this means is that we want the segment to have the smallest possible curvature. this requirement makes sense in design engineering, since a straight segment is the simplest shape to fabricate and, in the realm of structural engineering, the least likely to undergo high bending moments if, for example, γc were to be the neutral axis of a beam. the challenge here is to formulate the synthesis problem in a standard form, e.g., one that would lead to an optimization problem. to this end, we resort to a discretization of the curve. many forms of curve discretization are available in the realm of geometric modeling, namely, bézier curves, cubic splines, b-splines, non-uniform rational b-splines (nurbs), and so on [8]. we limit the search to the simplest of these tools, namely, non-parametric cubic splines.
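as announced in section 1, the g1-versus-g2 mismatch at a line-to-arc blend can be made concrete with a short worked equation (our illustration, not part of the original paper; s denotes arc length and s_0 the blend point):

\kappa(s) =
\begin{cases}
0, & s < s_0 \quad \text{(straight segment)}\\
1/r, & s \ge s_0 \quad \text{(circular arc of radius } r\text{)}
\end{cases}
\qquad
\lim_{s \to s_0^-} \kappa(s) = 0 \;\neq\; \frac{1}{r} = \lim_{s \to s_0^+} \kappa(s).

the tangent is continuous (g1), but the curvature jumps by 1/r at the blend, however large the fillet radius is chosen; this is the discontinuity that drives the stress concentration [7].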
3 problem formulation using cubic splines

in discretizing the problem at hand by means of cubic splines, we define n + 2 supporting points {p_k}_0^{n+1}, the kth point having cartesian coordinates p_k(x_k, y_k). shown in fig. 1 is a sketch of the blending of segments γa and γb by means of γc, with p denoting a generic point of the latter.

fig. 1: blending of two curve segments with a third one

alternatively, the points {p_k}_0^{n+1} are defined by their polar coordinates p_k(ρ_k, φ_k). henceforth, we define p_0 ≡ a and p_{n+1} ≡ b. now, let a(ρ_a, φ_a) and b(ρ_b, φ_b) be the polar coordinates of the blending points. moreover, let Δφ ≡ (φ_b − φ_a)/(n + 1) and define φ_k = φ_a + k Δφ, for k = 1, 2, …, n + 1, with φ_{n+1} = φ_b, Δφ thus being the uniform increment over the polar coordinate φ. in defining the polar coordinates of the blending curve γc, care must be taken to choose the origin conveniently, so as to avoid more than one point with the same φ-coordinate, which would render the discretization adopted here invalid. under these conditions, we assume that a function ρ = ρ(φ) exists in the interval [φ_a, φ_b]. further, we define the (n + 2)-dimensional vectors ρ, ρ′ and ρ″, namely,

\boldsymbol{\rho} = [\rho_0, \rho_1, \ldots, \rho_n, \rho_{n+1}]^t   (1)
\boldsymbol{\rho}' = [\rho'_0, \rho'_1, \ldots, \rho'_n, \rho'_{n+1}]^t   (2)
\boldsymbol{\rho}'' = [\rho''_0, \rho''_1, \ldots, \rho''_n, \rho''_{n+1}]^t   (3)

letting ρ_k(φ) be the cubic polynomial between two consecutive supporting points p_k(ρ_k, φ_k) and p_{k+1}(ρ_{k+1}, φ_{k+1}), we have

\rho_k(\varphi) = a_k(\varphi - \varphi_k)^3 + b_k(\varphi - \varphi_k)^2 + c_k(\varphi - \varphi_k) + d_k, \quad \varphi_k \le \varphi \le \varphi_{k+1}, \quad k = 0, 1, \ldots, n.   (4)

by virtue of the g2-continuity of non-parametric cubic splines, a linear relationship between ρ and ρ″ exists, namely,

\mathbf{a}\,\boldsymbol{\rho}'' = 6\,\mathbf{c}\,\boldsymbol{\rho},   (5)

where, at this stage, both a and c are n × (n + 2) matrices. further, we recall the expressions for the angle μ made by the tangent to the curve with the radius vector, and for the curvature, in polar coordinates:

\tan\mu = \frac{\rho}{\rho'},   (6)

\kappa = \frac{\rho^2 + 2\rho'^2 - \rho\,\rho''}{(\rho^2 + \rho'^2)^{3/2}},   (7)

where

\rho' = \frac{\mathrm{d}\rho}{\mathrm{d}\varphi},   (8)

\rho'' = \frac{\mathrm{d}\rho'}{\mathrm{d}\varphi}.   (9)

moreover, let t_k and κ_k denote tan μ and κ at p_k, i.e.,

t_k = \frac{\rho_k}{\rho'_k},   (10)

\kappa_k = \frac{\rho_k^2 + 2\rho_k'^2 - \rho_k\,\rho_k''}{(\rho_k^2 + \rho_k'^2)^{3/2}}.   (11)

that is, t_k = t_k(ρ_k, ρ′_k) and κ_k = κ_k(ρ_k, ρ′_k, ρ″_k).
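the curvature formula (7) is easy to sanity-check numerically. the following python sketch (our illustration, not part of the original paper; the function names are ours) evaluates κ for a circle of radius r described in polar form about an interior, off-center origin, for which κ must equal 1/r everywhere:

import numpy as np

def polar_curvature(rho, drho, ddrho):
    # eq. (7): curvature of a curve given in polar form rho(phi)
    return (rho**2 + 2.0*drho**2 - rho*ddrho) / (rho**2 + drho**2)**1.5

# test curve: a circle of radius r centered at (d, 0), described about the origin
r, d = 2.0, 0.5
phi = np.linspace(-0.5, 0.5, 201)
rho = d*np.cos(phi) + np.sqrt(r**2 - (d*np.sin(phi))**2)

h = phi[1] - phi[0]
drho = np.gradient(rho, h)      # finite-difference stand-ins for eqs. (8), (9)
ddrho = np.gradient(drho, h)

kappa = polar_curvature(rho, drho, ddrho)
print(np.allclose(kappa[2:-2], 1.0/r, atol=1e-3))   # curvature of a circle is 1/r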
let us now introduce two end conditions:

t_0 = t_a \equiv \tan\mu_a, \qquad t_{n+1} = t_b \equiv \tan\mu_b,   (12)

where μ_a and μ_b are the known tangent angles at points a and b of γa and γb, respectively; matrices a and c then become square. a linear relation between ρ and ρ′ also exists, namely,

\mathbf{p}\,\boldsymbol{\rho}' = \mathbf{q}\,\boldsymbol{\rho},   (13)

where p and q are (n + 2) × (n + 2) matrices. expressions for all four matrices a, c, p and q are included in the appendix. hence, the slope and curvature values at the unknown supporting points {p_k}_1^n of γc can all be expressed as functions of {ρ_k}_1^n. in formulating the curve-synthesis problem within the realm of optimum design, we let x be the vector of design variables, namely,

\mathbf{x} = [\rho_1, \ldots, \rho_n]^t.   (14)

notice that

\rho_0 = \rho_a, \qquad \rho_{n+1} = \rho_b,   (15)

which are known, and part of the data. we can now formulate the curve-synthesis problem at hand as an optimum-design problem:

z = \frac{1}{n}\sum_{k=1}^{n} w_k\,\kappa_k^2 \;\to\; \min_{\mathbf{x}},   (16)

where w_k is the normal weight at the kth supporting point, used whenever one needs to assign different importance to different points. we term the weights normal because they obey the relation

\sum_{k=1}^{n} w_k = 1,   (17)

the problem being constrained to obey the g2-continuity conditions at the two blending points. specifically, these constraints are

\kappa_0(\boldsymbol{\rho}, \boldsymbol{\rho}', \boldsymbol{\rho}'') = \kappa_a, \qquad \kappa_{n+1}(\boldsymbol{\rho}, \boldsymbol{\rho}', \boldsymbol{\rho}'') = \kappa_b,   (18)

where κ_a and κ_b are the curvature values of curve γa at point a and of curve γb at point b, respectively, thereby formulating an equality-constrained optimum-design problem. furthermore, special cases may call for additional constraints. for example, if convexity is to be imposed, then we must add the inequality constraints

\kappa_k \ge 0, \quad k = 1, 2, \ldots, n.   (19)

once the optimization problem is formulated, a solution is possible with scientific code available on the market. we resort to our own orthogonal-decomposition algorithm, implemented in the oda package [9], which is a library of c routines especially suited to solving nonlinear least-square problems.
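the optimum-design problem (14)–(19) maps directly onto a generic constrained minimizer. the sketch below is our own illustration, not the paper's method: the paper uses the oda package [9], whereas here scipy's slsqp is used, and the spline relations (5) and (13) are replaced by plain finite differences as a simplification. the end angles and end radii are borrowed from the example of section 4.1:

import numpy as np
from scipy.optimize import minimize

n = 18
phi_a, phi_b = np.arctan(12.0/89.5), np.pi/2 - np.arctan(12.0/75.5)
phi = np.linspace(phi_a, phi_b, n + 2)
h = phi[1] - phi[0]
rho_a, rho_b = 90.3009, 76.4477            # fixed end radii, eq. (15)

def curvature(rho_full):
    # eq. (7) with finite-difference derivatives standing in for the splines
    d1 = np.gradient(rho_full, h)
    d2 = np.gradient(d1, h)
    return (rho_full**2 + 2*d1**2 - rho_full*d2) / (rho_full**2 + d1**2)**1.5

def objective(x):                          # eq. (16) with uniform weights 1/n
    rho_full = np.concatenate(([rho_a], x, [rho_b]))
    return np.mean(curvature(rho_full)[1:-1]**2)

# eq. (18): zero curvature demanded at both blends, as in section 4.1
cons = [{'type': 'eq',
         'fun': lambda x, i=i: curvature(
             np.concatenate(([rho_a], x, [rho_b])))[-1 if i else 0]}
        for i in (0, 1)]

x0 = np.linspace(rho_a, rho_b, n + 2)[1:-1]
res = minimize(objective, x0, constraints=cons, method='SLSQP')
print(res.success, res.fun)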
4 examples

several shape-synthesis problems have been solved using cubic splines. two examples belonging to structural optimization are outlined in the balance of this paper.

4.1 synthesis of the neutral axis of a curved robotic link

the first case arose during the design of a novel spherical parallel robot, the "agile wrist" [10], as shown in fig. 2. as a module of an 11-degree-of-freedom (dof) long-reach robot operating on fragile objects, such as the fuselage of aircraft, the wrist requires high positioning accuracy, since any error in the positioning of the tool could lead to expensive damage. the kinematic chain of the "agile wrist" was borrowed from the design of the "agile eye", developed at laval university in quebec city [11]. many factors can affect robot accuracy, such as manufacturing and assembly errors, calibration errors, etc. our concern is the flexibility of the manipulator links, which significantly reduces the positioning accuracy of the robot due to external and inertial loading. for the most part, the geometric shape and the cross-section dimensions of robot links are designed arbitrarily, with simple shapes and constant cross sections. current structural optimization technology and the use of highly accurate cnc machine tools enable designers to produce optimum, if less obvious, shapes; an efficient selection of the design parameters should improve link stiffness without making the link heavier. a general trend in the design of lighter and stiffer structures has led to the use of sophisticated materials, like carbon fibre-reinforced composites, with the consequent slendering of structural components. these components should ensure that no failure can occur under a given range of loads, which has motivated intensive research in the area of structural optimization, a very active field since schmidt set forth an approach for coupling finite element analysis and nonlinear programming [12]. in our work, structural optimization was conducted through the optimum design of the links of the agile wrist, in order to enhance the load-carrying capacity while minimizing weight. we focus here on the shaping of the links, rather than on material selection. the material of choice, as decided earlier, was aluminum al2014. for optimization purposes, the problem at hand was decomposed into two main steps: the first consists in defining the neutral axis of the links, i.e., the mid-curve; the second consists in defining the cross section at each point of the mid-curve. the idea is rather simple: once the mid-curve of the link is defined, the link shape is obtained by simultaneously sweeping and blending a cross-section of a given shape (rectangle, circle, etc.), whose dimensions are variable along the curve and which lies in a plane normal to the mid-curve. since the second step can be completed with cae software, we focus on the synthesis of the mid-curve. to this end, the mid-curve γ of the proximal link, shown in fig. 3, is defined so as to minimize the root-mean-square value of the curvature throughout γ and to ensure a blending of the linear segments with the curved segment that is as smooth as possible. notice that the two ends lie at different distances from the center of the wrist, in order to increase the workspace of the agile wrist, besides reducing the likelihood of collisions among the moving links and the tool installed on top of the upper platform. twenty spline supporting points p_i, for i = 0, …, 19, are used, whose angles φ_i in polar coordinates are given by

\varphi_i = \varphi_a + \frac{i}{n+1}\,(\varphi_b - \varphi_a), \quad i = 0, 1, \ldots, n+1,   (20)

with \varphi_a = \arctan(l_v/a) and \varphi_b = \pi/2 - \arctan(l_h/b). with the aid of oda, the optimum values of the design variables were found. the objective function to be minimized is the rms value of the array of curvature values at the supporting points, while respecting the tangency and zero-curvature conditions at points p_0 and p_{n+1}. a convexity condition was added, so as to ensure that no changes in curvature sign occur.

fig. 2: prototype of the agile wrist
fig. 3: the mid-curve of the proximal link

these twenty points, expressed first in polar coordinates, were then expressed in cartesian coordinates, as required by cae software. the radii ρ_i and the cartesian coordinates (x_i, y_i) obtained for a = 89.5 mm, b = 75.5 mm and l_v = l_h = 12.0 mm are listed in table 1. the rms value of the curvature distribution over the synthesized curve is 0.0144 mm⁻¹. the curve is plotted in fig. 4a, its curvature distribution being shown in fig. 4b for verification of the curvature continuity at the two blending points.
based on this mid-curve, structural optimization was completed by considering the cross-section at each point of the curve. the overall stress analysis of an optimized link with two circular cross-sections at its ends is graphically displayed in fig. 5, where no stress concentrations are observed. it is noted that the link actually fabricated, shown in fig. 2, exhibits a rectangular, uniform cross-section, which is cheaper to machine and which we decided to adopt in order to stay within budget.

table 1: numerical results for the optimum mid-curve of the proximal link

 i   ρ_i [mm]   x_i [mm]   y_i [mm]
 0   90.3009    89.5000    12.0000
 1   91.2159    89.3859    18.1795
 2   91.9843    88.7004    24.3585
 3   92.4646    87.3132    30.4319
 4   92.6339    85.2225    36.3065
 5   92.4845    82.4522    41.8929
 6   92.0253    79.0508    47.1128
 7   91.2811    75.0882    51.9039
 8   90.2919    70.6501    56.2245
 9   89.1084    65.8309    60.0549
10   87.7878    60.7257    63.3962
11   86.3873    55.4222    66.2658
12   84.9587    49.9954    68.6910
13   83.5441    44.5046    70.7032
14   82.1737    38.9943    72.3323
15   80.8669    33.4957    73.6036
16   79.6338    28.0313    74.5372
17   78.4782    22.6175    75.1484
18   77.4004    17.2674    75.4497
19   76.4477    12.0000    75.5000

fig. 4: mid-curve synthesized for the agile wrist: (a) the curve, (b) its curvature distribution
fig. 5: von mises stress distribution along the optimized proximal link (the thickest cross-section is located closest to the motor)
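as a cross-check of table 1 and of the angle distribution (20), the cartesian coordinates can be regenerated from the listed radii. a small python verification of our own, not part of the original paper:

import numpy as np

# table 1 radii rho_i [mm], i = 0..19
rho = np.array([90.3009, 91.2159, 91.9843, 92.4646, 92.6339, 92.4845, 92.0253,
                91.2811, 90.2919, 89.1084, 87.7878, 86.3873, 84.9587, 83.5441,
                82.1737, 80.8669, 79.6338, 78.4782, 77.4004, 76.4477])

a, b, lv, lh = 89.5, 75.5, 12.0, 12.0
phi_a = np.arctan(lv / a)                    # angle of the first blending point
phi_b = np.pi / 2 - np.arctan(lh / b)        # angle of the second blending point
phi = phi_a + np.arange(20) / 19.0 * (phi_b - phi_a)   # eq. (20), uniform spacing

x, y = rho * np.cos(phi), rho * np.sin(phi)
print(np.round(x[[0, 10, 19]], 4), np.round(y[[0, 10, 19]], 4))
# matches (89.5000, 12.0000), (60.7257, 63.3962), (12.0000, 75.5000)
# of table 1 to within rounding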
4.2 the design of a wrist mechanism housing

the second example pertains to an innovative mechanism, a gearless pitch-roll wrist (prw) [13], namely, a robotic device intended to produce two-degree-of-freedom rotations of a robotic gripper about a fixed point, the wrist center. conventional means of producing such motions rely on a bevel-gear differential train, similar to those found in automotive driving-wheel axes. moreover, such trains, in robotic wrists, invariably bear straight-tooth bevel gears, which are a source of noise and of significant power losses. in an attempt to overcome these drawbacks, a prw is being designed at the robotic mechanical systems laboratory, mcgill university, montreal, as depicted in fig. 6a, with gears replaced by cam-roller pairs. the key component of the mechanism is an array of spherical stephenson mechanisms (ssms) used to transmit the power from the two independent cam rotations to the gripper, as illustrated in fig. 6b. the two cams are driven by the motion of the two roller-carrying disks of fig. 6a, which are rigidly coupled to their respective motors. when the two face-to-face motors turn at opposite angular velocities of identical absolute values, the whole array turns about the common axis of the two cams as a single rigid body (pitch); when the two motors turn at identical angular velocities, the plane containing the four spherical-linkage centers remains stationary, but the gripper turns about its axis (roll). the array must be supported by a housing that doubles as a protection means, isolating the spherical linkages from environmental dust and dirt. one approach to designing the housing is to make use of lamé curves [14], which are given by the implicit function

\left|\frac{x}{a}\right|^p + \left|\frac{y}{b}\right|^p = 1,   (21)

where p is an integer and a and b are real numbers that determine the dimensions, 2a × 2b, of the box circumscribing the curve. a prw housing design [15] is shown in fig. 7, which is based on lamé curves, with p = 4 for the inner surface and p = 6 for its outer counterpart. in most common applications, lamé curves are needed only in the first quadrant, where we can dispense with the absolute-value signs and rewrite eq. (21) as

\left(\frac{x}{a}\right)^p + \left(\frac{y}{b}\right)^p = 1, \quad 0 \le \frac{x}{a} \le 1, \quad 0 \le \frac{y}{b} \le 1.   (22)

in any event, moreover, the curvature of these curves at the points of intersection with the coordinate axes vanishes, thereby allowing for g2-continuity at the points of blending with two line segments at 90°. while lamé curves, in the foregoing case, provide a thicker cross section at the points of maximum curvature, a plus for these curves, the additional material, placed at a location distant from the cam axis, adds significantly to the moment of inertia of the whole device. an alternative design is thus desirable, replacing the current housing shape with a new shape that is free of the drawbacks of lamé curves. the new housing is to have both uniform thickness and zero curvature at the blending points with the straight bearing housings. the profile of the new design is displayed in fig. 8a, which was generated based on non-parametric cubic splines. the numerical results, namely, the values of the ρ-coordinate at the unknown supporting points, are listed in table 2. in this case, a = 198 mm and b = 98 mm, the half-length of the horizontal line being equal to 100 mm and the half-length of its vertical counterpart being equal to 20 mm. a lamé curve with p = 4 is displayed in fig. 8a together with the synthesized curve, for comparison. the curvature distributions of the two curves are shown in fig. 8b, which indicates that the synthesized curve has a lower maximum curvature value, implying a higher allowable bending moment. the rms value of the curvature of the synthesized curve is 0.0119 mm⁻¹, as compared with 0.0142 mm⁻¹ for the lamé curve with p = 4. a new housing is currently under design.

table 2: numerical results for the housing design

 i   ρ_i [mm]   κ_i [mm⁻¹]
 0   199.0075   0.0000
 1   199.7687   0.0094
 2   200.3707   0.0109
 3   200.6720   0.0122
 4   200.6039   0.0135
 5   200.1041   0.0145
 6   199.1170   0.0153
 7   197.5937   0.0159
 8   195.4919   0.0161
 9   192.7760   0.0160
10   189.4187   0.0155
11   185.4056   0.0146
12   180.7433   0.0133
13   175.4735   0.0117
14   169.6913   0.0098
15   163.5592   0.0076
16   157.2998   0.0054
17   151.1588   0.0033
18   145.3498   0.0014
19   140.0142   0.0000

fig. 6: an innovative prw with cam-roller pairs: (a) the mechanism; (b) an array of two ssms and their mirror-images
fig. 7: the stephenson-linkage housing made up of two identical covers
fig. 8: (a) the new housing profile and (b) the curvature distribution over the synthesized curve; shown dashed is the lamé curve
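the zero-curvature property claimed for eq. (22) at the axis intersections is easy to confirm numerically. the sketch below (our illustration; the variable names are ours) evaluates the curvature of the first-quadrant lamé curve as a graph y(x) and shows that it vanishes as x approaches 0, i.e. at the blend point (0, b); the same holds at (a, 0) by symmetry:

import numpy as np

def lame_y(x, a, b, p):
    # first-quadrant lamé curve, eq. (22), solved for y
    return b * (1.0 - (x / a)**p)**(1.0 / p)

def curvature_of_graph(f, x, h=1e-3):
    # curvature of y = f(x) via central differences
    y1 = (f(x + h) - f(x - h)) / (2 * h)
    y2 = (f(x + h) - 2 * f(x) + f(x - h)) / h**2
    return abs(y2) / (1 + y1**2)**1.5

a, b, p = 198.0, 98.0, 4          # housing dimensions used in section 4.2
for x in (0.01, 1.0, 50.0, 150.0, 190.0):
    print(x, curvature_of_graph(lambda t: lame_y(t, a, b, p), x))
# the curvature tends to 0 as x -> 0 and grows toward the 'corner' region,
# in qualitative agreement with the lamé curve of fig. 8b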
5 conclusions

a methodology of curve synthesis with g2-continuity was proposed, based on non-parametric cubic splines. two synthesis examples were provided to demonstrate the effectiveness of the methodology. the proposed methodology was applied to planar curves. it can also be used for spatial curves, if these curves are synthesized via their projections onto orthogonal planes [16]. for shape optimization, the procedure described herein can be integrated with cae software. the procedure is also expected to find applications in areas such as trajectory generation and path planning for manipulators and mobile robots.

references

[1] angeles, j.: synthesis of plane curves with prescribed geometric properties using periodic splines. computer-aided design, vol. 15 (1983), no. 3, p. 147–155.
[2] böhm, w., farin, g., kahmann, j.: a survey of curve and surface methods in cagd. computer aided geometric design, vol. 1 (1984), no. 1, p. 1–60.
[3] hosaka, m.: modeling of curves and surfaces in cad/cam. springer-verlag, 1996.
[4] hoschek, j., lasser, d.: fundamentals of computer aided geometric design. springer-verlag, 1993.
[5] lin, p. d., lin, m. f.: geometric modelling of an elliptic ball-end mill. proc. imeche – part b, vol. 219 (2005), no. 1, p. 87–97.
[6] pottmann, h.: industrial geometry: recent advances and applications in cad. computer aided design, vol. 37 (2005), no. 7, p. 751–766.
[7] neuber, h.: theory of notch stresses: principles for exact calculation of strength with reference to structural form and material. u.s. dept. of commerce, office of technical services, washington, 1961.
[8] pottmann, h., wallner, j.: computational line geometry. springer-verlag, heidelberg, berlin, 2001.
[9] teng, c. p., angeles, j.: a sequential-quadratic-programming algorithm using orthogonal decomposition with gerschgorin stabilization. asme j. of mechanical design, vol. 123 (2001), no. 4, p. 501–509.
[10] bidault, f., teng, c. p., angeles, j.: structural optimization of a spherical parallel manipulator using a two-level approach. proc. of asme 2001 design engineering technical conferences, 2001, #detc 2001-21030.
[11] gosselin, c. m., hamel, j.-f.: the agile eye: a high-performance three-degree-of-freedom camera orienting device. ieee int. conf. on robotics and automation, 1994, p. 781–786.
[12] schmidt, l. a.: structural design by systematic synthesis. proc. 2nd asce conf. electronic computation, pittsburgh, pennsylvania, usa, 1960, p. 1105–1122.
[13] hernandez, s., bai, s., angeles, j.: the design of a chain of stephenson mechanisms for a gearless pitch-roll wrist. proc. asme design engineering technical conferences (detc'04), salt lake city, utah, usa, 2004, #detc 2004-57424.
[14] gardner, m.: the superellipse: a curve that lies between the ellipse and the rectangle. scientific american, vol. 21 (1965), p. 222–234.
[15] hernandez, s.: optimum design of epicyclic trains of spherical cam-roller pair. master's thesis, mcgill university, montreal, canada, 2004.
[16] angeles, j., akhras, r., liu, z.: the synthesis of smooth trajectories for pick-and-place operations of spherical wrists. mechanism and machine theory, vol. 28 (1992), no. 2, p. 261–269.

chin-pun teng
department of technical r&d
the original cakerie ltd, b.c., canada

shaoping bai
e-mail: shb@ime.aau.dk
department of mechanical engineering
aalborg university, denmark

jorge angeles
e-mail: angeles@cim.mcgill.ca
department of mechanical engineering & centre for intelligent machines
mcgill university, montreal, canada

appendix

the linear relationship between ρ and ρ″ is expressed as a ρ″ = 6 c ρ, where

\mathbf{a} =
\begin{bmatrix}
2 & 1 &        &        & \\
1 & 4 & 1      &        & \\
  & \ddots & \ddots & \ddots & \\
  &        & 1      & 4      & 1\\
  &        &        & 1      & 2
\end{bmatrix},
\qquad
\mathbf{c} = \frac{1}{\Delta\varphi^{2}}
\begin{bmatrix}
c_{11} & 1 &        &        & \\
1 & -2 & 1      &        & \\
  & \ddots & \ddots & \ddots & \\
  &        & 1      & -2     & 1\\
  &        &        & 1      & c_{n+2,\,n+2}
\end{bmatrix},   (23a)

in which

c_{11} = -1 - \frac{\Delta\varphi}{t_a}, \qquad c_{n+2,\,n+2} = -1 + \frac{\Delta\varphi}{t_b}.   (23b)
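the matrices of eqs. (23a)–(23b) lend themselves to direct assembly. the following python sketch (our own illustration of the appendix relations; names are ours) builds a and c for given n, Δφ and end slopes t_a, t_b, and checks the relation a ρ″ = 6 c ρ on ρ(φ) = φ², for which ρ″ = 2 exactly:

import numpy as np

def assemble_a_c(n, dphi, t_a, t_b):
    # matrices of eqs. (23a)-(23b): a rho'' = 6 c rho for a spline with
    # end-slope conditions rho'_0 = rho_0/t_a, rho'_{n+1} = rho_{n+1}/t_b
    m = n + 2
    a = (np.diag(4.0 * np.ones(m)) + np.diag(np.ones(m - 1), 1)
         + np.diag(np.ones(m - 1), -1))
    a[0, 0] = a[-1, -1] = 2.0
    c = (np.diag(-2.0 * np.ones(m)) + np.diag(np.ones(m - 1), 1)
         + np.diag(np.ones(m - 1), -1)) / dphi**2
    c[0, 0] = (-1.0 - dphi / t_a) / dphi**2
    c[-1, -1] = (-1.0 + dphi / t_b) / dphi**2
    return a, c

# quick check on rho(phi) = phi**2, whose exact slopes give t = rho/rho' = phi/2
n, dphi = 18, 0.05
phi = 1.0 + dphi * np.arange(n + 2)
rho = phi**2
a, c = assemble_a_c(n, dphi, t_a=phi[0] / 2.0, t_b=phi[-1] / 2.0)
rho2 = np.linalg.solve(a, 6.0 * c @ rho)     # recovered second derivatives
print(np.allclose(rho2, 2.0, atol=1e-6))     # d2(phi^2)/dphi2 = 2 everywhere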
similarly, the linear relationship between ρ and ρ′ is expressed as p ρ′ = q ρ, where

\mathbf{p} =
\begin{bmatrix}
1 & 0 &        &        & \\
1 & 4 & 1      &        & \\
  & \ddots & \ddots & \ddots & \\
  &        & 1      & 4      & 1\\
  &        &        & 0      & 1
\end{bmatrix},
\qquad
\mathbf{q} = \frac{1}{\Delta\varphi}
\begin{bmatrix}
\Delta\varphi/t_a & 0 &        &        & \\
-3 & 0 & 3      &        & \\
  & \ddots & \ddots & \ddots & \\
  &        & -3     & 0      & 3\\
  &        &        & 0      & \Delta\varphi/t_b
\end{bmatrix}.   (23c)

it is noteworthy that when either μ_a or μ_b is equal to π/2, q becomes singular. in our procedure, however, we need not invert q, for all we compute is ρ′ = p⁻¹ q ρ.

probabilistic analysis of the hard rock disintegration process

k. frydrýšek

this paper focuses on a numerical analysis of the hard rock (ore) disintegration process. the bit moves and sinks into the hard rock (mechanical contact with friction between the ore and the cutting bit) and subsequently disintegrates it. the disintegration (i.e. the stress-strain relationship, contact forces, reaction forces and fracture of the ore) is solved via the fem (msc.marc/mentat software) and the sbra (simulation-based reliability assessment) method (monte carlo simulations, anthill and mathcad software). the ore is disintegrated by deactivating the finite elements which satisfy the fracture condition. the material of the ore (i.e. yield stress, fracture limit, young's modulus and poisson's ratio) is given by bounded histograms (i.e. stochastic inputs which better describe reality). the results (reaction forces in the cutting bit) are also stochastic quantities, and they are compared with experimental measurements.
application of the sbra method in this area is a modern and innovative trend in mechanics. however, it takes a long time to solve this problem (due to the material and structural nonlinearities, the large number of elements, the many iteration steps and the many monte carlo simulations). parallel computers were therefore used to handle the large computational needs of this problem.

keywords: hard rock (ore), cutting bit, disintegration process, fem, probability, sbra method, parallel computing.

1 introduction

scientific and technical developments (in all areas of world-wide industry) are affected by the growing demand for basic raw materials and energy. the provision of sufficient quantities of raw materials and energy for the processing industry is the main limiting factor of further development. it is therefore very important to understand the ore disintegration process, including an analysis of the bit (i.e. the excavation tool) used in mining operations. the main focus is on modeling the mechanical contact between the bit and the ore, see fig. 1.

fig. 1: a typical example of mechanical interaction between bits and hard rock (example of the ore disintegration process)

2 finite element model of the ore disintegration process

fem (i.e. msc.marc/mentat 2005r3 and 2008r1 software) was used in modeling the ore disintegration process. figure 2 shows the basic scheme (plane strain formulation, mechanical contact with friction between the bit and platinum ore, boundary conditions, etc.).

fig. 2: geometry of the 2d fe model, boundary conditions and details

fig. 2 shows that the bit moves into the ore with the prescribed time-dependent function u = f(t), and subsequently disintegrates it. when the bit moves into the ore (i.e. a mechanical contact occurs between the bit and the ore), the stresses σ_hmh (i.e. the equivalent von mises stresses) in the ore increase. when the situation σ_hmh ≥ r_m occurs (i.e. the equivalent stress is greater than the fracture limit) in some elements of the ore, then these elements break off (i.e. these elements are dead). hence, a part of the ore disintegrates. in msc.marc/mentat software, this is done by deactivating the elements that satisfy the condition σ_hmh ≥ r_m. this deactivation of the elements was performed in every 5th step of the solution. for further information see references [1] and [2].

3 probabilistic inputs – sbra (simulation-based reliability assessment) method

a deterministic approach (i.e. all types of loading, dimensions, material parameters etc. are constant) provides an older but simple way to simulate mechanical systems. however, a deterministic approach cannot truly include the variability of all inputs, because nature and the world are stochastic. simulations of the ore disintegration process via a deterministic approach are shown in [1] and [2]. however, this problem is also solved via probabilistic approaches, which are based on statistics. let us consider the "simulation-based reliability assessment" (sbra) method, a probabilistic approach in which all inputs are given by bounded histograms. bounded histograms include the real variability of the inputs. application of the sbra method (based on monte carlo simulations) is a modern and innovative trend in mechanics, see for example [3] to [5]. the material properties (i.e. isotropic and homogeneous materials) of the whole system are described in fig. 3, where e is young's modulus of elasticity and ν is poisson's ratio. the bit is made of sintered carbide (sharp edge) and steel. the ore material is elasto-plastic, with yield limit r_p = 9.946 (−0.911 / +1.722) mpa and fracture limit r_m = 12.661 (−0.650 / +0.925) mpa, which are given by bounded histograms, see figs. 3 and 4. the elastic properties of the ore are described by hooke's law, with the histograms e = 18513.8 (−2418.8 / +2608.8) mpa and ν = 0.199 (−0.019 / +0.021), see figs. 5 and 6.

fig. 3: material properties – whole model and material of the ore (stress σ vs. plastic strain ε_p)
fig. 4: stochastic inputs for the material of the ore (histograms for yield stress and fracture stress, results of anthill software)

applications of the sbra method in combination with fem, and the subsequent evaluation of the results, are shown in fig. 7. anthill, msc.marc/mentat and mathcad software were used.
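the bounded-histogram inputs translate naturally into a sampling routine. a minimal python sketch of the monte carlo input generation (our illustration only – the paper itself uses the anthill software, and the true histogram shapes are not reproduced here, so uniform sampling within the stated bounds is assumed):

import numpy as np

rng = np.random.default_rng(seed=1)

# bounded material inputs of the ore: (nominal, lower offset, upper offset)
bounds = {
    'r_p [mpa]': (9.946,   0.911,  1.722),   # yield limit
    'r_m [mpa]': (12.661,  0.650,  0.925),   # fracture limit
    'e [mpa]':   (18513.8, 2418.8, 2608.8),  # young's modulus
    'nu [-]':    (0.199,   0.019,  0.021),   # poisson's ratio
}

def sample_inputs(n_sim):
    # one value per monte carlo simulation; uniform within the bounded
    # interval (a stand-in for the real anthill histograms)
    return {k: rng.uniform(v[0] - v[1], v[0] + v[2], n_sim)
            for k, v in bounds.items()}

samples = sample_inputs(500)          # 500 simulations, as in the paper
for k, v in samples.items():
    print(k, round(v.mean(), 3), round(v.min(), 3), round(v.max(), 3))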
fig. 5: stochastic inputs for the material of the ore (histogram of young's modulus, results of anthill software)
fig. 6: stochastic inputs for the material of the ore (histogram of poisson's ratio, results of anthill software)

4 solution – sbra method in combination with fem

because of the material non-linearities, the mechanical contacts with friction, the large number of elements, the many iteration steps, and the choice of 500 monte carlo simulations, four parallel computers were used to handle the large computational requirements of this problem, see table 1. the domain decomposition method (i.e. application of parallel computers) was used, see fig. 8.

table 1: parallel computers used in this study (date: august–september 2008)

computer  | description                                                        | software                        | cpus | mc simulations | wall time [h]
alfa      | linux os, 8 nodes; each node 2x amd opteron 250 (2.4 ghz, 1 mb l2 cache), 4 gb ram (400 mhz ddr) | msc.marc/mentat                 | 16   | 312            | 70.395
opteron   | linux os, 1 node; 2x amd opteron 248 (2.2 ghz), 8 gb ram          | msc.marc/mentat                 | 2    | 28             | 54.6
quad      | linux os, 4 nodes; each node 1x amd opteron 848 (2.2 ghz), 4 gb ram | msc.marc/mentat               | 4    | 86             | 69.015
pca632d   | ms windows xp professional 64-bit os, 4 nodes; intel core 2 quad q9300 (2.5 ghz), 8 gb ram | msc.marc/mentat, anthill, mathcad | 4 | 74         | 68.82
total     |                                                                    |                                 | 26   | 500            | ≈ 70.4 overall

fig. 7: computational procedure – application of the sbra method (solution of the ore disintegration process)
fig. 8: domain decomposition method used for the application of 2 cpus and 4 cpus (i.e. ways of performing one monte carlo simulation)

the whole solution time for the non-linear solution (i.e. 1.04 s) was divided into 370 steps of variable length. the full newton-raphson method was used for solving the non-linear problem. table 1 shows that the solution of the 500 monte carlo simulations (calculated simultaneously on four different parallel computers) takes cca 70.4 hours.
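the 500 simulations are embarrassingly parallel, which is why splitting them across the four machines of table 1 scales almost linearly. a schematic python sketch of such a simulation farm (our illustration only – the actual runs used msc.marc/mentat with domain decomposition, not python; run_one_simulation is a hypothetical stand-in for one fem solve):

import numpy as np
from multiprocessing import Pool

def run_one_simulation(seed):
    # hypothetical stand-in for one msc.marc/mentat solve: samples the ore
    # inputs and returns a placeholder scalar response (not real mechanics)
    rng = np.random.default_rng(seed)
    r_m = rng.uniform(12.661 - 0.650, 12.661 + 0.925)   # fracture limit [mpa]
    return r_m

if __name__ == '__main__':
    # four workers, mirroring the four machines of table 1
    with Pool(processes=4) as pool:
        responses = pool.map(run_one_simulation, range(500))
    print(len(responses), round(float(np.median(responses)), 3))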
5 results – stochastic evaluation

figs. 9 to 14 show the equivalent stress (i.e. σ_hmh) distributions at selected times t of the solution, calculated for one of the 500 monte carlo simulations (i.e. for the situation when the material of the ore is described by the values r_p = 12 mpa, r_m = 13.5 mpa, e = 20000 mpa and ν = 0.2). the movement of the bit, and the subsequent disintegration of the ore caused by the cutting, are shown.

fig. 9: t = 0 s (fem results, start of the solution)
fig. 10: t = 3.37×10⁻² s (fem results)
fig. 11: t = 3.714×10⁻¹ s (fem results)
fig. 12: t = 8.335×10⁻¹ s (fem results)
fig. 13: t = 0.8511 s (fem results)
fig. 14: t = 1.026 s (fem results)

from the fem results, we can calculate the reaction forces r_x, r_y and the total reaction force

r = \sqrt{r_x^2 + r_y^2},

which acts in the bit, see figs. 15 and 16. figure 16 is calculated for one simulation (i.e. for the situation when the material of the ore is described by the values r_p = 12 mpa, r_m = 13.5 mpa, e = 20000 mpa and ν = 0.2). a distribution of the total reaction forces acquired from the 500 simulations is shown in fig. 17. the maximum total reaction force (acquired from the 500 monte carlo simulations) is given by the histogram r_max^(sbra,fem) = 5068 (−984 / +1098) n, see fig. 18.

fig. 15: reaction forces in the bit
fig. 16: reaction forces in the bit (fem results)
fig. 17: total reaction forces in the bit (sbra-fem results)
fig. 18: maximum total reaction forces in the bit (sbra-fem results of 500 monte carlo simulations), and an evaluation

6 comparison between stochastic results and experimental measurements

the calculated maximum forces (i.e. the sbra-fem solutions, see fig. 18) can be compared with the experimental measurements (i.e. compared with a part of fig. 19), see also [1] and [2]. the evaluation of one force measurement (fig. 19) shows that the maximum force is r_max^(exp) = 5280 n. hence, the relative error calculated for the acquired median value r_max^(sbra,fem,med) = 5068 n, see fig. 18, is

\Delta r_{max} = \left| \frac{ r_{max}^{exp} - r_{max}^{sbra,fem,med} }{ r_{max}^{exp} } \right| \times 100\,\% = 4.02\,\%.

this error of 4.02 % is acceptable. however, the experimental results also show large variability, due to the anisotropic and stochastic properties of the material and due to the large variability of the reaction forces, see fig. 19.

fig. 19: experimental measurement, compared with the sbra-fem results
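the evaluation in fig. 18 and the error check above reduce to a few lines of post-processing. a python sketch (ours – the paper used anthill and mathcad), assuming the 500 maximum-force values have been collected in an array; a triangular distribution within the stated bounds is used here purely as stand-in data:

import numpy as np

rng = np.random.default_rng(0)
# stand-in data: in the real study these are the 500 fem results [n]
r_max = rng.triangular(5068 - 984, 5068, 5068 + 1098, size=500)

r_med = np.median(r_max)                      # median of the histogram
r_exp = 5280.0                                # measured maximum force [n]
rel_err = abs(r_exp - r_med) / r_exp * 100.0  # relative error in %
print(round(float(r_med)), round(float(rel_err), 2))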
7 conclusions

this paper combines the sbra (simulation-based reliability assessment) method and fem as a suitable tool for simulating the hard rock (ore) disintegration process. all the basic factors have been explained (2d boundary conditions, material nonlinearities, mechanical contacts and friction between the cutting bit and the ore, the methodology for deactivating the finite elements during the ore disintegration process, and the application of parallel computers). the use of finite element deactivation during the ore disintegration process (as a way of expanding the crack) is a modern and innovative way of solving problems of this type. the error of the sbra-fem results (i.e. in comparison with the experiments) is acceptable. hence, sbra and fem can be a useful tool for simulating the ore disintegration process. because the real material of the ore (i.e. yield limit, fracture limit, young's modulus, poisson's ratio etc.) is very variable, stochastic theory and probability theory were applied (i.e. the sbra method). the sbra method, which is based on monte carlo simulations, can include all stochastic inputs, and then all the results are also stochastic quantities. however, for better application of the sbra method (for simulating this large problem in mechanics) it is necessary to use superfast parallel computers. instead of 500 monte carlo simulations (wall time cca 70 hours, as presented in this article), it is necessary to calculate >10^4 simulations (wall time cca 58 days), or more. our department will be able to make these calculations when faster parallel computers become available. all the results presented here were applied for optimizing and redesigning the bit. in the future, 3d fe models (instead of the 2d plane strain formulation) will be applied for greater accuracy. other methods for simulating the ore disintegration process are presented in [6] and [7].

acknowledgment

this work has been supported by the czech project frvš 534/2008 f1b.

references

[1] frydrýšek, k.: výpočtová zpráva styku nože a platinové rudy při těžbě (calculation report on contact between the bit and platinum ore during mining). czech republic, 2007, p. 17 (in czech).
[2] frydrýšek, k., gondek, h.: finite element model of the ore disintegration process. in: annals of the faculty of engineering hunedoara – journal of engineering, tome vi, fascicule 1, issn 1584-2665, university politehnica timisoara, faculty of engineering – hunedoara, romania, 2008, p. 133–138.
[3] frydrýšek, k.: performance-based design applied for a beam subjected to combined stress. in: annals of the faculty of engineering hunedoara – journal of engineering, tome vi, fascicule 2, issn 1584-2665, university politehnica timisoara, faculty of engineering – hunedoara, romania, 2008, p. 129–134.
[4] marek, p., brozzetti, j., guštar, m., tikalsky, p.: probabilistic assessment of structures using monte carlo simulation: background, exercises and software (2nd extended edition). itam cas, prague, czech republic, 2003, isbn 80-86246-19-1, p. 471.
[5] marek, p., guštar, m., anagnos, t.: simulation-based reliability assessment for structural engineers. crc press, boca raton, usa, 1995, isbn 0-8493-8286-6, p. 365.
[6] zubrzycki, j., jonak, j.: numeryczno-eksperymentalne badania wpływu kształtu powierzchni natarcia ostrza na obciążenie noża skrawającego naturalny materiał kruchy [numerical and experimental studies of the influence of the rake-face shape on the load of a tool cutting brittle natural material]. lubelskie towarzystwo naukowe, lublin, poland, 2003, isbn 83-87833-42-8, p. 90 (in polish).
[7] podgórski, j., jonak, j.: numeryczne badania procesu skrawania skał izotropowych [numerical studies of the cutting process in isotropic rocks]. lubelskie towarzystwo naukowe, lublin, poland, 2004, isbn 83-87833-53-3, p. 80 (in polish).

msc. karel frydrýšek, ph.d., ing-paed igip
phone: +420 597 324 552
e-mail: karel.frydrysek@vsb.cz
department of mechanics of materials
faculty of mechanical engineering
všb-tu ostrava
17. listopadu 15
708 33 ostrava, czech republic
a comparison of distillery stillage disposal methods

v. sajbrt, m. rosol, p. ditl

abstract

this paper compares the main stillage disposal methods from the point of view of technology, economics and energetics. attention is paid to the disposal of both the solid and the liquid phase. specifically, the following methods are considered: a) livestock feeding, b) combustion of granulated stillage, c) fertilizer production, d) anaerobic digestion with biogas production, and e) chemical pretreatment with subsequent secondary treatment. other disposal techniques mentioned in the literature (electro-fenton reaction, electrocoagulation and reverse osmosis) have not been considered, due to their high costs and technological requirements. energy and economic calculations were carried out for a planned production of 120 m3 of stillage per day in a given distillery. only specific treatment operating costs (per 1 m3 of stillage) were compared, including operational costs for energy, transport and chemicals. these values were determined for january 31st, 2009. the resulting sequence of cost effectiveness is: 1. chemical pretreatment, 2. combustion of granulated stillage, 3. transportation of stillage to a biogas station, 4. fertilizer production, 5. livestock feeding. this study found that chemical pretreatment of stillage with secondary treatment (a method developed at the department of process engineering, ctu) was more suitable than the other methods; it also has some important technical advantages. using this method, the total operating costs are approximately 1150 €/day, i.e. about 9.5 €/m3 of stillage. the price of chemicals is the most important item in these costs, representing about 85 % of the total operating costs.

keywords: disposal of distillery stillage, economic comparison, energy requirements.
1 introduction

one of the most important problems in distilling is the further processing or disposal of the distillation residues, known as stillage or slops. the urgency of finding a solution for this issue increases with the growing production of ethanol. nowadays, minimization of all energy losses and efficient use of waste (stillage in this case) are modern trends in all production facilities. this work deals with ways of processing stillage in the distillery under study. energy and financial calculations were carried out for the planned production of 120 m3 of stillage per day. only specific treatment operating costs (per 1 m3 of stillage) were compared, including operational costs for energy, transport and chemicals.

2 a description of ethanol production

there are two main methods of ethanol production. the chemical method, used in the chemical industry, mainly involves ethylene hydration:

ch2=ch2 + h2o → c2h5oh.

ethanol produced in this way is not suitable for consumption, because of the harmful ingredients contained in it. the biological method, used in distilleries, consists of fermentation and distillation of suitable biological substrates, which must contain saccharides. only ethanol produced in this way is suitable for human consumption. there are many primary raw materials suitable for ethanol production; only those containing a sufficient amount of starch are used in the distillery under study. starch is a polysaccharide with the formula (c6h10o5)n, consisting of two different polysaccharides: amylose and amylopectin. these polysaccharides consist of thousands of glucose molecules. the main raw materials used in the distillery are potatoes and cereals (wheat, rye, maize, sometimes barley and oats). residues from potato food production are also used for ethanol production. sugar beet is another substrate suitable for ethanol production, but it is not used in this distillery. at the beginning of the process, the input raw material is disintegrated into a mash containing particles from 0.4 to 1.2 millimeters. then an enzyme, α-amylase, is added and the batch is heated up to 90–94 °c. the starch contained in the mash becomes a gelatinous liquid. then the mash is pumped into tanks together with another enzyme, β-amylase. this enzyme breaks starch down into glucose molecules. the mash must be cooled down to 30–35 °c, and then it is pumped into the fermentation tanks. after the addition of fermentative microbial cultures (inoculation), fermentation begins. during this process, glucose is biologically transformed into the final product – ethanol – under anaerobic conditions. this process takes approximately 48–72 hours. many biochemical reactions run during the process; the most important chemical reaction can be written as

c6h12o6 → 2 c2h5oh + 2 co2,

meaning that during ideal, maximal fermentation of 1 kg of glucose, 0.511 kg of ethanol and 0.489 kg of carbon dioxide originate. however, approximately 6–8 wt. % of the glucose contained in the mash is consumed for the growth of the fermentative microbial cultures. after fermentation there is 6–9 wt. % of ethanol in the mash. the product is separated from the mash in a distillation column. crude spirit flows out from the top of the column, and stillage flows out from the bottom. the crude spirit must be refined in special refining columns to produce refined ethanol. this ethanol is suitable for producing alcoholic beverages.
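the 0.511/0.489 split quoted above follows directly from the molar masses in the fermentation reaction; a short worked check of our own, using standard molar masses:

% mass balance of c6h12o6 -> 2 c2h5oh + 2 co2, per 1 kg of glucose
\frac{2\,M_{\mathrm{C_2H_5OH}}}{M_{\mathrm{C_6H_{12}O_6}}} = \frac{2 \times 46.07}{180.16} \approx 0.511\ \mathrm{kg/kg},
\qquad
\frac{2\,M_{\mathrm{CO_2}}}{M_{\mathrm{C_6H_{12}O_6}}} = \frac{2 \times 44.01}{180.16} \approx 0.489\ \mathrm{kg/kg}.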
3 stillage

there are two basic waste products from fermentation – the solid parts of the input material, and the liquid fraction, i.e. stillage. according to [1]: "alcohol distillery stillage (the remains after distillation of fermented mash) is the main, high-strength 'waste' from distillation. the production of stillage in the distillery is in the range of 10–14 m3 per 1 m3 of pure ethanol. this waste contains between 5–8 wt. % of dry mass, is moderately acidic (with ph about 4) and has a high chemical oxygen demand value (cod of about 50 000 mg per liter). dried stillage from cereal is often used as a feed for livestock, containing about 30 % of proteins. unfortunately, it is complicated to feed wet stillage, because it is an organic material. nowadays, stillage is used for livestock feeding or it is used for energy production in biogas plants."

4 stillage disposal methods

there is only one initial condition for all the methods compared here – an average planned production of stillage of 120 m3 per day with a dry mass concentration of 5 wt. %. all cost calculations are based on the following assumptions (a small numerical cross-check of these unit prices is given below):
• machinery power input: centrifuge (all methods) – 20 kw, agitation system (chemical pretreatment) – 1.5 kw, hydrostatic pump (chemical pretreatment) – 10 kw.
• 1 kwh of electric energy costs 3.30 czk (0.13 €).
• thermal energy for stillage concentration or drying is obtained from the combustion of natural gas (all methods with an evaporator or a dryer). the efficiency of the gas boiler is 90 %; 1 nm3 of natural gas costs 9.50 czk (0.36 €).
• 1 kg of ca(oh)2 for stillage neutralization costs 3.80 czk (0.15 €).
• 1 hour of tractor work (fertilization of fields) costs 450 czk (17.30 €).
• 1 km of truck driving costs 32 czk (1.23 €) – transportation to a biogas plant.
• the biogas yield from 1 m3 of stillage with 3 wt. % of dry mass is 35 nm3. the electrical efficiency of the chp unit is 30 % (transportation to a biogas plant).

5 livestock feeding

first, there is the possibility of drying wet stillage in a dryer. the end product of drying – dry stillage – can be (and usually is) used as a feed, or as part of the feed, for livestock. not only cattle, but also pigs and sheep (often kept in cooperative farms in the neighbourhood of distilleries) can be fed with dried stillage. wet stillage can also be used as a feed, but only within three days of production; after this time, this biological mass degrades and is no longer suitable for feeding. another method of processing stillage as a feed is described in [3]: a french producer of bioethanol, bio-ethanol-nord picardie, produces granulated dried stillage as a feed. liquid stillage is cultivated in a special cultivation reactor and enriched with yeast extract. after concentration in an evaporator, mixing with the solid parts follows. this mixture is dried in dryers. livestock can be fed with this granulated mass, and meat and bone meal can be substituted by it. the whole process line is drawn in fig. 1.

fig. 1: livestock feeding process line; content of dry mass in wt. %

the biggest disadvantage of this method is the high energy demand for evaporation and drying. the total energy requirements for the production of 1 kg of dried stillage for livestock feeding are the same as the total energy demand for the production of 1 kg of dried stillage fuel for combustion in boilers, but combustion provides energy (electric or heat), resulting in a potential improvement of the energy balance.
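as promised above, the unit prices can be cross-checked against the cost tables that follow. a minimal python sketch of our own (it assumes the 20 kw centrifuge runs continuously for 24 h/day, which is not stated explicitly in the paper):

# electricity cost of the centrifuge per day of operation
power_kw = 20.0            # centrifuge power input, from the assumptions above
hours_per_day = 24.0       # assumed continuous operation
price_eur_per_kwh = 0.13   # unit price from the assumptions above

cost = power_kw * hours_per_day * price_eur_per_kwh
print(round(cost, 1))      # ~62 eur/day, consistent with the ~60 eur/day
                           # centrifuge entries in tables 1-3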
therefore, the stillage feeding method is currently not economically advantageous. nowadays, the price of conventional feed proteins is similar to the total production costs of evaporated and dried stillage for livestock feeding. the heart of this method is the concentration of liquid stillage in a centrifuge-evaporator cycle and further drying in a dryer. these two processes (concentration and drying) have the highest share of the total energy demand (an order-of-magnitude check of the dominant evaporation item is sketched below). it is necessary to cool down and condense the vapour from evaporation to an acceptable temperature. this condensate must be pre-treated in a special wastewater treatment plant in the distillery, because of its very high cod value (about 20 000 mg/l). then the pre-treated condensate can be drained away to the local wastewater treatment plant, where it can be processed without any problems. this condensate can also be added to the biomass for biogas production as so-called dilution water. concentration of stillage is followed by drying and pelleting; however, this is not strictly necessary – concentrated stillage can be used for feeding after neutralization. the number of cattle and pigs in the czech republic has been decreasing (since 1990 the number has decreased by approximately one order of magnitude), and bioethanol (and stillage) production has been increasing. therefore, the demand for dried stillage in the czech republic has been dropping.

note: there is a great difference between molasses-produced and grain-produced stillage. molasses-produced stillage can be (and usually is) evaporated up to a dry-weight content of about 35 wt. %. grain-produced stillage can be evaporated only to a maximum dry-weight content of 10 wt. %, due to problems with fouling and blocking of the mash in the evaporator tubes when a higher dry-weight content is reached. the distillery in this study produces only grain-produced stillage.

finally, according to [4]: "livestock can be fed on dry or wet stillage. feeding on wet stillage may be economically and energetically advantageous, but this method is limited by a short storage period (three days). wet stillage can be used only for those categories of livestock for which this feeding is acceptable; it depends on specific forage technologies. dried stillage does not have these restrictions, but the drying process has not only high energy and financial costs but also the risk of damage to the stillage due to excessive temperature. this 'burnt' stillage has less sappiness and a lower quantity of amino acids."

table 1: all operating costs of the livestock feeding process line

total cost – electric energy for centrifuge:      60 eur/day    2.8 %
total cost – gas for evaporation:               1120 eur/day   52.8 %
total cost – gas for dryer:                      640 eur/day   30.3 %
total transport costs:                            20 eur/day    0.9 %
total neutralization costs:                      170 eur/day    8.0 %
total costs for pre-treatment of condensate
from evaporation:                                110 eur/day    5.2 %
sale of the produced feed (dried stillage):      −50 eur/day    xxx
total operating costs:                          2070 eur/day  100.0 %
                                                 (17 eur/m3)

the result of the calculation for this method is that the total operating costs (due to energy and transport) are 2070 € per daily production of stillage; the total operating cost for 1 m3 of stillage is 17 €. the resulting operating costs are the highest of all the disposal methods compared here.
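the dominant item in table 1, gas for evaporation, can be reproduced to order of magnitude from first principles. a python sketch of our own, assuming single-effect evaporation, a latent heat of about 2.26 mj/kg, a natural-gas lower heating value of about 34 mj/nm3 (both our assumptions, not from the paper), and concentration from 5 wt. % to the 10 wt. % limit mentioned in the note above:

# order-of-magnitude check of the 'gas for evaporation' item in table 1
stillage_m3_per_day = 120.0
dry_in, dry_out = 0.05, 0.10           # wt. fractions before/after evaporation
water_evap_kg = stillage_m3_per_day * 1000.0 * (1.0 - dry_in / dry_out)
# = 60 000 kg/day of water to evaporate (density ~1000 kg/m3 assumed)

latent_heat_mj_per_kg = 2.26           # assumed latent heat of vaporization
gas_lhv_mj_per_nm3 = 34.0              # assumed lower heating value of gas
boiler_efficiency = 0.90               # from the assumptions in section 4

gas_nm3 = (water_evap_kg * latent_heat_mj_per_kg
           / (gas_lhv_mj_per_nm3 * boiler_efficiency))
cost_eur = gas_nm3 * 0.36              # unit gas price from section 4
print(round(gas_nm3), round(cost_eur))
# ~4400 nm3/day and ~1600 eur/day without heat recovery; the ~1120 eur/day
# in table 1 suggests some heat integration in the real evaporator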
Advantages of this method:
• Sale of the feed → savings.
• Lower feed prices for cooperative farms.

Disadvantages of this method:
• High investment costs (evaporator, dryer).
• High operating costs (energy costs).
• Great size of the apparatuses.

6 Combustion of dried stillage

Energy can be obtained from combustion of dried corn stillage in biomass boilers (either in a stoker-fired boiler or in a fluidized bed boiler). The production process of the fuel for combustion is identical to the production process of the granules for feeding. First, the stillage is concentrated in a centrifuge-evaporator cycle, then it is dried in a dryer. The fuel obtained from the dryer can be combusted to gain energy. This energy can be used either for electricity production in steam turbines or for improving the energy balance of the distillery (for heating columns, …). The whole process line is drawn in Fig. 2.

Fig. 2: Combustion process line. Content of dry mass in wt. %

Table 2: All operating costs of the combustion process line
  Total cost – electric energy for centrifuge:  60 EUR/day  (2.9 %)
  Total cost – gas for evaporation:  1 120 EUR/day  (53.3 %)
  Total cost – gas for dryer:  640 EUR/day  (30.5 %)
  Combustion savings – production of steam:  −780 EUR/day
  Total neutralization costs:  170 EUR/day  (8.1 %)
  Total costs for pre-treatment of condensate from evaporation:  110 EUR/day  (5.2 %)
  Total operating costs:  1 320 EUR/day  (100.0 %), i.e. 11 EUR/m3

The disadvantage of dried stillage combustion is the high content of sulphur and nitrogen in the flue gas, which exceeds the emission limits. This means that it is necessary to build a desulphurization plant. The main disadvantages of this method are the high investment costs for the fuel production line, biomass combustion and flue gas desulphurisation. Some advantages: great savings in energy costs, and low production of the final waste, ash. The result of the calculation for this method is that the total operating cost (energy and transport costs) is 1 320 € per daily production of stillage, i.e. 11 € per 1 m3 of stillage. This disposal method was found to be the second cheapest of all the methods, as a result of the savings due to steam production.

Advantages of this method:
• Energy gain during combustion.
• Small volume of the final waste, ash.
• Relatively low operating costs.

Disadvantages of this method:
• High investment costs (evaporator, dryer, boilers).
• Great size of the apparatuses.
• Harmful to the environment – exhaust gases.
• The need for a desulphurization plant.

7 Fertilization by stillage

If stillage is used for fertilization, it must be concentrated to a minimum of 20 wt. % of dry matter. This can be done in a centrifuge-evaporator cycle. The vapour condensate from evaporation must be pre-treated in the WWTP in the distillery. Molasses-produced stillage is most frequently mentioned in the context of fertilizer production, but in fact all types of stillage are suitable (after neutralization with calcium) for fertilizing fields. The whole process line is drawn in Fig. 3. Stillage is a rich source of residual sugars, organic nitrogen and other nutrients. This makes it a nutrient-rich and ecological fertilizer for agriculture. Stillage, either in its pure state or in combination with straw, can supply or substitute mineral fertilizers. Concentrated stillage is usually offered cost-free to farmers. The result of the calculation for this method is that the total operating costs (energy and transport costs) are 1 530 € per daily production of stillage, i.e. 13 € per 1 m3 of stillage.
This is the second most expensive of all the methods compared here, because it generates no earnings. There is also a legislative problem with fertilization by stillage. It is prohibited by law to spread stillage on fields in winter. It can be used for fertilization in summer, but only when concentrated to at least 20 wt. %, as mentioned above. According to recent trends, it seems quite possible that fertilization of fields with stillage will soon be prohibited by law throughout the year. The following acts deal specifically with these questions: Act No. 254/2001 Coll., on protection of water sources; Act No. 185/2001 Coll., on wastes, which defines stillage as a waste product; and Act No. 334/1992 Coll., on the protection of arable land. This last act strictly prohibits the use of stillage as a fertilizer [2].

Fig. 3: Fertilizing process line. Content of dry mass in wt. %

Table 3: All operating costs of the fertilizing process line
  Total cost – electric energy for centrifuge:  60 EUR/day  (3.9 %)
  Total cost – gas for evaporation:  1 120 EUR/day  (73.2 %)
  Total transport costs on the fields:  70 EUR/day  (4.6 %)
  Total neutralization costs:  170 EUR/day  (11.1 %)
  Total costs for pre-treatment of condensate from evaporation:  110 EUR/day  (7.2 %)
  Total operating costs:  1 530 EUR/day  (100.0 %), i.e. 13 EUR/m3

8 Transportation of stillage to a biogas plant

Biogas is a combination of gases generated during anaerobic digestion (i.e. the controlled microbial conversion of organic substances under anaerobic conditions to produce biogas and digestate). The usual composition of biogas is: 40–75 % methane (CH4), 25–55 % carbon dioxide (CO2), 0–10 % water vapour (H2O), 0–5 % nitrogen (N2), and the rest is a mixture of hydrogen (H2), ammonia (NH3) and hydrogen sulphide (H2S). Only methane and hydrogen are interesting from the energetic point of view. The problematic components are hydrogen sulphide and ammonia; it is usually necessary to remove these components from the biogas to protect the CHP unit. The agricultural wastes most frequently used as input mass for fermentation are manure, slurry and maize. These materials are acceptable for fermentation. Other types of organic waste – slaughterhouse waste, fats, sludge from sewage treatment plants and stillage – can be fermented too, but in this case there are some legislative problems. The biogas that is produced is used mainly for cogeneration, i.e. for producing heat and electric energy. Stillage should be neutralized before it is transported to a biogas plant.

The most important considerations for a distillery are the distance between the distillery and the biogas plant, and the agreement between them. The shorter the distance to the biogas plant, the lower the transport costs and the more profitable the operation will be, because transport costs are the biggest part of all the costs. In our specific case, a biogas plant with sufficient capacity is assumed at a distance of 70 km from the distillery. One disadvantage of this method is the fact that fermentation of stillage takes a longer time than the production of the same volume of stillage. Another factor is the distance. The whole process line is drawn in Fig. 4. This method seems to be financially attractive for distilleries, because the main costs are the transport costs to the nearest biogas plant. The result of the calculation for this method is that the total operating costs (energy and transport costs) are 1 410 € per daily production of stillage, i.e. 12 € per 1 m3 of stillage.
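The electricity credit in Table 4 below can be roughly reproduced from the stated biogas yield. The sketch assumes several values the paper does not give: dilution of the 5 wt. % stillage to 3 wt. % before digestion, ~60 % methane content, a methane heating value of ~9.97 kWh/Nm3, and a feed-in price of ~0.08 €/kWh.

```python
# Back-of-the-envelope check of the biogas electricity credit (a sketch;
# CH4 fraction, heating value, dilution and price are assumptions).

FEEDSTOCK_M3 = 120 * 5 / 3        # m3/day after dilution to 3 wt.% dry mass
BIOGAS_YIELD = 35                 # Nm3 biogas per m3 of 3 wt.% stillage (given)
CH4_FRACTION = 0.60               # assumed methane content
LHV_CH4_KWH = 9.97                # kWh per Nm3 of methane (assumed)
ETA_EL = 0.30                     # CHP electrical efficiency (given)
EUR_PER_KWH_SOLD = 0.08           # assumed feed-in price

biogas = FEEDSTOCK_M3 * BIOGAS_YIELD                      # ~7 000 Nm3/day
electricity = biogas * CH4_FRACTION * LHV_CH4_KWH * ETA_EL
print(f"biogas: {biogas:.0f} Nm3/day")
print(f"electricity: {electricity:.0f} kWh/day "
      f"-> {electricity * EUR_PER_KWH_SOLD:.0f} EUR/day")
# ~1 000 EUR/day, close to the 1 020 EUR/day credit in Table 4
```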
Advantages of this method:
• No investment costs (no evaporator, no dryer).
• Minimum requirements for modifying the original stillage.
• Potential savings due to cogeneration of electric energy from biogas.
• No wastewater treatment requirement.
• The digestate can be used for fertilizing fields.

Disadvantages of this method:
• High transportation costs.
• Dependence on the nearest biogas plant, and on biogas plants as a whole.
• Excessive damage to the environment and to transport infrastructure in the Czech Republic caused by trucks.

Fig. 4: Transportation of stillage to a biogas plant. Content of dry mass in wt. %

Table 4: All operating costs of transportation to a biogas plant
  Total neutralization costs:  170 EUR/day  (7.0 %)
  Total cost – electric energy for centrifuge:  60 EUR/day  (2.5 %)
  Total transport costs of stillage:  1 810 EUR/day  (74.5 %)
  Total transport costs of digestate to the fields:  390 EUR/day  (16.0 %)
  Value of the electric energy generated from biogas:  −1 020 EUR/day
  Total operating costs:  1 410 EUR/day  (100.0 %), i.e. 12 EUR/m3

9 Chemical pretreatment

This method consists of a chemical pretreatment of the stillage, which is then sent to a wastewater pretreatment plant (lagoons) to be cleaned up. This method is not yet used on an industrial scale, but it could be financially attractive for small and medium-size distilleries in the near future, and it is possible that it will soon be used industrially. According to [5]: "The heart of this method is the chemical precipitation of soluble organic and inorganic substances in the stillage, their sorption on an acceptable carrier and their flocculation with an organic flocculant. A significant reduction in the organic and inorganic substances contained in the stillage can be achieved in this way. The final products are sedimented floccules and the pretreated liquid. The floccules can be separated by filtration, and the filtrate can be finally treated biologically in lagoons with aeration. In this way, the limits for safe outflow into a recipient (a river) can be achieved. The separated floccules, together with the biological sludge, can be fed into a biogas plant."

During mixing and the addition of substances A, B, C and D into the stillage, flocculation (the creation of flakes from the organic and inorganic parts of the stillage) takes place. This flocculated stillage is pumped by a suitable type of pump (a hydrostatic pump or a membrane pump – the flakes must not be disintegrated) into a filter press (or into a belt dewatering line), where the floccules are separated from the liquid phase. The concentration of the solid phase in the filter cake is about 20 %. This mass can be composted or fed into a biogas plant. The filtrate, with a solid phase concentration of about 0.2 %, can be treated in aerated lagoons. The whole process line is drawn in Fig. 5.

Table 5: All operating costs for the chemical pre-treatment method
  Total cost – electric energy for centrifuge:  60 EUR/day  (5.2 %)
  Total cost – electric energy for mixing:  5 EUR/day  (0.4 %)
  Total price of chemicals:  990 EUR/day  (86.1 %)
  Total cost – electric energy for pump:  30 EUR/day  (2.6 %)
  Total transport costs of filtration cakes:  65 EUR/day  (5.7 %)
  Total operating costs:  1 150 EUR/day  (100.0 %), i.e. 10 EUR/m3

The most significant part of the operating costs are the expenses for the chemicals, which are not specified here; these costs constitute about 85 % of all the costs.
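The stream splits around the filter press can be estimated from the stated solid contents. The sketch below is a simple two-stream mass balance; it assumes a density of roughly 1 t/m3 for all streams and neglects the mass of the added chemicals, neither of which the paper states.

```python
# Sketch of the solid-phase mass balance around the filter press:
# 120 m3/day of stillage at 5 wt.% solids splits into a ~20 % filter cake
# and a ~0.2 % filtrate. Densities ~1 t/m3 assumed; chemicals neglected.

FEED_T = 120.0                        # t/day of stillage (approx. 1 t/m3)
X_FEED, X_CAKE, X_FILT = 0.05, 0.20, 0.002

# Solve: feed = cake + filtrate, and feed*x_feed = cake*x_cake + filtrate*x_filt
cake = FEED_T * (X_FEED - X_FILT) / (X_CAKE - X_FILT)
filtrate = FEED_T - cake
print(f"filter cake: {cake:.1f} t/day, filtrate: {filtrate:.1f} t/day")
# ~29 t/day of cake to composting/biogas and ~91 t/day of filtrate to lagoons
```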
The result of the calculation for this method is that the total operating costs (energy, transport costs and expenses for chemicals) are 1 150 € per daily production of stillage, i.e. 10 € per 1 m3 of stillage. This is the cheapest of all the methods compared here. The main reason for the cost efficiency is the substitution of the energy-demanding operations (evaporation, drying) by less energy-demanding processes (mixing and filtering).

Fig. 5: Chemical pretreatment process line. Content of dry mass in wt. %

Advantages of this method:
• Lower investment costs than in the other methods (an evaporator or a dryer is more expensive than a mixing tank, a filter press and a pump).
• The possibility of composting the filter cake.
• Low energy and transport costs.

Disadvantage of this method:
• The high price of the chemicals, which account for the main part of all the costs.

10 Conclusion

The following conclusions may be drawn:
• A comparison of the operating costs of all the methods is shown in Table 6. The methods are listed in order of increasing operating costs. The results are also summarized in Fig. 6.
• According to the calculations presented here, the chemical pretreatment method appears to be the cheapest in terms of operating costs. This method offers an interesting and cheap alternative to the currently used disposal methods.
• The energy and financial calculations were based on a distillery with a planned production of 120 m3 of stillage per day.
• The financial calculations cover operating costs, i.e. energy, transport and chemicals. The costs were determined at January 31st, 2009 prices.

Fig. 6: Comparison of operating costs for all the methods compared here

Table 6: Final comparison of the methods
  Disposal method – total operating costs per 120 m3 / per m3 of stillage – relative – ranking
  Chemical pretreatment:  1 150 EUR  /  10 EUR  –  100 %  –  1
  Combustion:             1 320 EUR  /  11 EUR  –  110 %  –  2
  Biogas plant:           1 410 EUR  /  12 EUR  –  120 %  –  3
  Fertilization:          1 530 EUR  /  13 EUR  –  130 %  –  4
  Livestock feeding:      2 070 EUR  /  17 EUR  –  170 %  –  5

Acknowledgement

This research has been supported by MŠMT of the Czech Republic, research project MSM6840770035.

References

[1] Rosol, M.: Energy Optimisation of a Distillery. Master thesis, Prague, 2007.
[2] Slabý, F.: Usage of Stillage from Bioethanol Production. Prokop Invest a.s., Pardubice, 2007. http://www.odpadoveforum.cz/symposium/textyof/493.pdf (5th August 2009)
[3] Párová, J.: Dry Complete Stillage from Bioethanol Production in Livestock Nutrition. Research Institute of Livestock Nutrition s.r.o., Pohořelice. http://www.keth.sweb.cz/krmivo%20vypalky.doc (5th August 2009)
[4] Picka, J., Mariaca, E.: http://biom.cz/cz/odborne-clanky/vypalky-ako-krmna-surovina, Tekro s.r.o. (5th August 2009)
[5] Nápravník, J., Ditl, P.: Modern methods of disposal of pig slurry. In: Actual Problems in Pig Breeding. Praha: Česká zemědělská univerzita, 2004, pp. 79–91. ISBN 80-213-1176-2.

Václav Sajbrt, Ing. Martin Rosol, Prof. Ing. Pavel Ditl, DrSc.
Phone: +420 224 352 549, e-mail: ditl@centrum.cz
Czech Technical University in Prague, Faculty of Mechanical Engineering, Department of Process Engineering
Technická 4, 166 07 Prague, Czech Republic

Parameter-Invariant Hierarchical Exclusive Alphabet Design for 2-WRC with HDF Strategy

T. Uřičář

Abstract

Hierarchical exclusive code (HXC) for the hierarchical decode and forward (HDF) strategy in the wireless 2-way relay channel (2-WRC) has an achievable rate region extended beyond the classical MAC region.
Although direct HXC design is in general highly complex, a layered approach to HXC design is a feasible solution. While the outer-layer code of the layered HXC can be any state-of-the-art capacity-approaching code, the inner layer must be designed in such a way that the exclusive property of the hierarchical symbols (received at the relay) is provided. The simplest case of the inner HXC layer is a simple signal space channel symbol memoryless mapper called a hierarchical exclusive alphabet (HXA). The proper design of the HXA is important, especially in the case of parametric channels, where the channel parametrization (e.g. phase rotation) can violate the exclusive property of the hierarchical symbols (as seen by the relay), resulting in significant capacity degradation. In this paper we introduce an example of a geometrical approach to parameter-invariant HXA design, and we show that the corresponding hierarchical MAC capacity region extends beyond the classical MAC region, irrespective of the channel parametrization.

Keywords: physical layer network coding, wireless 2-way relay channel, HDF strategy, HXA design.

1 Introduction

1.1 Background and related work

Communication scenarios based on principles similar to network coding (NC) [1] are expected to have great potential for wireless communication networks. Although pure NC operates with a discrete (typically binary) alphabet over lossless discrete channels, its principles can be extended into the wireless domain. Such an extension is, however, non-trivial, because signal space link models (e.g. the MAC phase in relay communications) lack the simple finite field properties found and used in pure discrete NC. NC-based approaches in the signal space domain are called physical network coding (PNC) or network coded modulation (NCM). The major benefit of NCM is the possibility to increase the throughput in the MAC phase of bidirectional communication, which is believed to be the bottleneck in the overall system. The strategy in which the relay decodes only hierarchical symbols (codewords), which jointly represent the information received from both sources, is called the hierarchical decode and forward (HDF) strategy [2, 3]. The increased MAC phase throughput of the HDF strategy provides a performance improvement over standard techniques based on the amplify & forward or joint decode & forward paradigms. The authors of [4] present the simplest realization of the HDF strategy with minimal cardinality mapping, which they call "modulo decoding". A more general relay output mapping, which also takes into account the possibility of extended cardinality, is introduced in [2]. Only limited code design and capacity region results are available even for the simplest possible scenario of the parametric 2-way relay channel (2-WRC), see [5, 6, 7, 8]. Lattice-based code construction [4, 9] using the principles from [10] is limited to non-parametric Gaussian channels. References [2, 3] present a layered approach to hierarchical exclusive code (HXC) design for the parametric 2-WRC, based on the hierarchical exclusive alphabet (HXA). Hierarchical MAC capacity regions for various alphabets, constellation point indexings and channel parametrizations are evaluated in [3]. Significant capacity degradation is caused by the channel parametrization, which highlights the importance of an HXA design resistant to channel parametrization.

1.2 Goals and contribution of this paper

The proper design of the HXA is critical for the overall system performance, especially in the case of the parametric 2-WRC, where the channel parametrization
(e.g. phase rotation) can cause significant capacity degradation. The parametrization should be taken into account in the design process, and an HXA resistant to the effects of channel parametrization should be found. One way to achieve this is by designing the HXA with parameter-invariant hierarchical decision maps (HDM) at the relay. The first design approaches of this kind (the E-PHXC design criteria [11, 12]) have so far led only to orthogonal or non-zero-mean HXAs with unsatisfactory performance (capacity limited by the classical MAC). In this paper, we introduce an example of a geometrical approach to the design of a multi-dimensional HXA which extends the hierarchical MAC capacity region beyond the classical MAC region, irrespective of the channel parametrization.

2 System model and definitions

We adopt the system model presented in [3]. We consider a parametric wireless 2-WRC system (Fig. 1), which has 3 physically separated nodes (sources A, B and relay R) supporting two-way communication through a common shared relay R. The source for data A is co-located with the destination for data B, and vice versa. The transmitted data of each source serves at the same time as side information (SI) for the reverse link. The system is wireless, and all transmitted and received symbols are signal space symbols. The channel is assumed to be linear and frequency flat, with additive white Gaussian noise (AWGN). The whole system operates in a half-duplex mode (one node cannot simultaneously receive and transmit). The overall bi-directional communication is split into a multiple access (MAC) phase and a broadcast (BC) phase.

Fig. 1: Model of the 2-WRC with side information (MAC phase and BC phase)

2.1 MAC phase

Now we define all the formal details. Subscripts A and B denote variables associated with nodes A and B, respectively. The source data messages are d_A, d_B, and they are composed of data symbols d_A, d_B ∈ A_d = {0, 1, …, M_d − 1}, with alphabet cardinality |A_d| = M_d. For notational simplicity, we omit the sequence number indices of the individual symbols. The source node codewords are c_A, c_B with code symbols c_A, c_B ∈ A_c, |A_c| = M_c. The encoding operation is performed by the encoders C_A, C_B with codebooks c_A ∈ C_A and c_B ∈ C_B. A signal space representation (with an orthonormal basis) of the transmitted channel symbols is s_A = s(c_A), s_B = s(c_B), s_A, s_B ∈ A_s ⊂ C^N. We assume a common channel symbol mapper A_s(•). A signal space representation of the overall coded frame is s_A(c_A) and s_B(c_B). The received useful signal is

u = s_A + α s_B.   (1)

It equivalently represents the parametric channel with both links parametrized according to the flat fading, which is assumed to be constant over the frame. It is obtained by a proper common rescaling of the true channel response u' = h_A s_A + h_B s_B by 1/h_A, and denoting α = h_B/h_A, with h_A, h_B, α ∈ C^1. The received signal at the relay is

x = h_A u + w,   (2)

where the circularly symmetric complex Gaussian noise w has variance σ_w² per complex dimension.
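A minimal numeric sketch of the MAC-phase model (1)–(2) follows; the alphabet, the value of α and the noise level are illustrative choices, not values used in the paper.

```python
# Sketch of eqs. (1)-(2): two BPSK-like sources, a relative fading
# parameter alpha and AWGN at the relay. All numeric values are examples.
import numpy as np

rng = np.random.default_rng(0)
M = 10_000
s_a = rng.choice([-1.0, 1.0], size=M)          # symbols of source A
s_b = rng.choice([-1.0, 1.0], size=M)          # symbols of source B
alpha = 0.8 * np.exp(1j * np.pi / 7)           # alpha = h_B / h_A (example)

u = s_a + alpha * s_b                          # useful signal, eq. (1)
sigma_w = 0.3
w = sigma_w * (rng.standard_normal(M) + 1j * rng.standard_normal(M))
x = 1.0 * u + w                                # relay observation, eq. (2), h_A = 1 after rescaling
print(x[:3])
```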
2.2 BC phase

The relay receives the signal x and processes it using the hierarchical decode and forward (HDF) strategy. More details will be given in Section 3. The output codeword and its code symbols are c_R and c_R. These are mapped into signal space channel symbols v ∈ A_r and signal space codewords v with the codebook v ∈ C_R, and are broadcast to destinations A and B. At node B (the destination for data A), the received signal space symbols are

y_A = v + w_A,   (3)

where the complex circularly symmetric AWGN w_A has variance σ_A² per complex dimension. We denote the signal space symbols at node A (the destination for data B) similarly, y_B = v + w_B.

3 Hierarchical exclusive code

3.1 Hierarchical decode and forward strategy

The HDF strategy is based on relay processing which fully decodes the hierarchical data (HD) message d_AB(d_A, d_B) and sends out the corresponding codeword v = v(d_AB), which represents the original data messages d_A and d_B only through the exclusive law [6]:

v(d_AB(d_A, d_B)) ≠ v(d_AB(d'_A, d_B)),  ∀ d_A ≠ d'_A,   (4)
v(d_AB(d_A, d_B)) ≠ v(d_AB(d_A, d'_B)),  ∀ d_B ≠ d'_B.   (5)

The hierarchical data is a joint representation of the data from both sources such that it uniquely represents the data of one source given full knowledge of the other source. Assuming that destination node B has perfect side information (SI) on the node's own data d_B, it can then decode the message d_A (and similarly for node A). Data d_B will be called complementary data from the perspective of the data d_A operations. The SI on the complementary data will be denoted as complementary SI (C-SI) [3]. A code (codebook) satisfying the exclusive law at the signal space codeword level (for the HD messages) is called a hierarchical exclusive code (HXC) [3]. We will denote the mapping satisfying the exclusive law by the operator X(•, •).

Unlike the standard relaying techniques based on the joint decode & forward paradigm, where the NC approach is used mainly for the BC phase of the communication, in the hierarchical approach (HDF) the hierarchical data is obtained directly from the received signal observations, without the need to decode the individual data streams. The fact that only the hierarchical data (not the individual data streams) is decoded at the relay in the MAC phase allows the system throughput to be increased above the constraints given by the classical MAC region. To facilitate this potential throughput benefit, the MAC stage encoding must be such that the observation at the relay allows direct HXC mapping to the hierarchical data. In other words, along the complete signal path (MAC and BC) the coding must always be an HXC w.r.t. the hierarchical data [3]. Throughout this paper we assume only an HXC with minimal cardinality of the relay hierarchical codebook (|C_R| = max(|C_A|, |C_B|)), which requires that both nodes (A, B) have perfect C-SI on the complementary data. A general discussion of the relay hierarchical codebook cardinality can be found in [3].

3.2 Layered HXC design for perfect C-SI

Direct design of the HXC codebook C_R providing the mapping v(d_AB(d_A, d_B)) is evidently highly complex. An alternative approach based on a layered HXC design is presented in [2, 3]. The layered HXC (Fig. 2) consists of an outer layer (error correcting) code and an inner layer (closer to the channel symbols), which provides the exclusive property of the hierarchical symbols. The outer layer code can be an arbitrary state-of-the-art capacity-achieving code (e.g. a turbo code or LDPC), and the inner layer can be designed (in the simplest case) as a simple signal space channel symbol memoryless mapper. An alphabet memoryless mapper

c_AB = X_c(c_A, c_B)   (6)

fulfilling the exclusive law will be called a hierarchical exclusive alphabet (HXA). The entire model of the layered system can be found e.g. in [3].

Fig. 2: Layered hierarchical exclusive code

Theorems in [2] show the viability of the layered approach to HXC design.
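The exclusive law (4)–(5) for a symbol-wise mapper (6) can be verified by brute force. The sketch below does exactly that; the bit-wise XOR mapper used later in the paper passes the test, while a deliberately lossy mapper such as min(c_A, c_B) (a hypothetical counter-example, not from the paper) fails it.

```python
# Sketch: brute-force check of the exclusive law for a hierarchical mapper.
from itertools import product

def is_exclusive(mapper, m: int) -> bool:
    """True iff c_AB = mapper(c_A, c_B) satisfies the exclusive law (4)-(5)."""
    for ca, cb, other in product(range(m), repeat=3):
        if other != ca and mapper(ca, cb) == mapper(other, cb):
            return False                      # violates (4)
        if other != cb and mapper(ca, cb) == mapper(ca, other):
            return False                      # violates (5)
    return True

print(is_exclusive(lambda a, b: a ^ b, 4))    # True  (bit-wise XOR)
print(is_exclusive(min, 4))                   # False (not exclusive)
```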
The capacity is constrained by the hierarchical symbols of the hierarchical-MAC (H-MAC) channel alphabet, and is achievable by an outer standard channel code C_A = C_B = C combined with an inner symbol-wise HXA A_s. The H-MAC rate region has a rectangular shape:

R_A = R_B = R_AB ≤ I(c_AB; x).   (7)

The exclusive property at the symbol level allows a simple determination of the soft per-symbol decoding metric for the hierarchical symbols at the relay. It can either be used directly by the full hierarchical relay decoder, or be properly source encoded and sent on a per-symbol basis without decoding.

4 Parameter-invariant HXA

The complexity of a proper HXA design increases in the case of a layered HXC design for a parametric channel. Some specific channel parameter values can cause the signal space points corresponding to different hierarchical symbols to fall into the same useful signal (1) and thereby break the exclusive property, resulting in significant capacity degradation [3]. One possible solution to this inconvenience is to take the channel parametrization into account inherently from the beginning of the HXA design, e.g. by forcing the hierarchical decision maps (HDM) at the relay to be invariant to the channel parameter α:

X_s(α)(s_A, s_B) = X_s(s_A, s_B),  ∀α.   (8)

HXAs that have the HDM invariant to the channel parametrization will be called parameter-invariant HXAs (PI-HXA).

4.1 E-PHXC design criteria

A first attempt to design the PI-HXA for the layered HXC in the 2-WRC was given by the extended parametric HXC (E-PHXC) design criteria [11]. These criteria utilize the criterion for the α-invariant hierarchical decision region pairwise boundary r_kl (i.e. the decision region boundary between the useful signal pair u^k(i_A, i_B) = s_{i_A} + α s_{i_B} and u^l(i'_A, i'_B) = s_{i'_A} + α s_{i'_B}):

⟨s_{i_A} − s_{i'_A}; s_{i_B} − s_{i'_B}⟩ = 0,   (9)
⟨s_{i_B} − s_{i'_B}; s_{i_B} + s_{i'_B}⟩ = 0,   (10)

to force some subset of the decision region boundaries to be invariant to the channel parametrization (see [13] for details). As shown in [12], the E-PHXC design criteria result in HXAs which have all permissible decision region boundaries invariant to the channel parametrization, hence the condition for a parameter-invariant HDM is naturally satisfied. Although the E-PHXC design criteria provide a feasible way to design an HXA with a parameter-invariant HDM, the resulting PI-HXA has a number of drawbacks and only limited performance (see [12] for details). The strictness of the E-PHXC design criteria forces the sources (A, B) to use different channel symbol mappers (A_A(•) and A_B(•)), and the solution leads to mutually orthogonal alphabets or alphabets with non-zero mean and non-equal distance (which is apparently not optimal). The solution with mutually orthogonal alphabets has a MAC capacity region equivalent to classical MAC decoding [12].

4.2 Generalized approach to the design

The strictness of the E-PHXC design criteria (which in turn causes only the orthogonal solution to PI-HXA design to be feasible) is due to the fact that all permissible pairwise boundaries are forced to be parameter invariant. This is obviously not necessary, since some decision region boundaries can be "overlaid" by other boundaries or remain somehow "hidden" inside the hierarchical decision region. In such cases, the resulting final shape of the HDM will remain unaffected by these boundaries. Hence these boundaries do not have to be considered by the PI-HXA design criteria.
This approach to PI-HXA design should relax the strictness of the design criteria (compared to E-PHXC), and hence non-orthogonal PI-HXAs with a rate region extending beyond the classical MAC region can possibly be found. A comparison of this "generalized" design approach with the E-PHXC based design is shown in Fig. 3.

Fig. 3: The final shape of the HDM at the relay is affected only by the pairwise boundaries between different hierarchical codewords (marked by the different colours of the regions in the figure). The generalized approach to PI-HXA design requires only these particular boundaries to be invariant to the channel parametrization (unlike E-PHXC).

5 PI-HXA design

5.1 Principles of geometrical design

The derivation of systematic design criteria (a design algorithm) for the PI-HXA is still relatively complex. The particular constellation space boundaries of the HDM at the relay result from the selected PI-HXA constellation (i.e. the alphabet mapper A_s), and the design criteria for invariant decision region boundaries directly affect the requirements placed on the PI-HXA constellation. This mutual relationship increases the complexity of a systematic solution to PI-HXA design. We show the viability of the layered HXC solution in parametric channels (i.e. the possibility to find a PI-HXA) by introducing some major simplifications which allow a geometric interpretation of the PI-HXA design problem. The idea of the geometric approach to PI-HXA design is based on the "constellation space patterns" of the useful signal u = s_A + α s_B.

Definition 1: A constellation space pattern U_{i_A} is the subspace spanned by the useful signal u = s_A + α s_B for s_A = s_{i_A}, ∀ s_B ∈ A_s and ∀ α ∈ C^1.

The absolute value of the channel parameter, |α| ∈ (0; ∞), causes the constellation space patterns to be potentially unbounded. This is the only remaining inconvenience for a simple geometric interpretation of PI-HXA design. As we will show in the following subsection, the constellation space patterns can be effectively bounded by simple processing at the relay.

5.2 Two-mode relay processing

The received (useful) signal u is obtained by rescaling the true channel response (u' = h_A s_A + h_B s_B) by 1/h_A. The only purpose of this rescaling is to obtain a simplified expression for the useful signal (1), which is (after rescaling) parametrized only by a single complex channel parameter α = h_B/h_A. It is obvious that the true channel response can alternatively be rescaled by 1/h_B, hence we can obtain two alternative models of the useful signal u:

M1: u_{M1} = s_A + α s_B,   (11)
M2: u_{M2} = (1/α) s_A + s_B.   (12)

This corresponds to two alternative models of the received signal at the relay:

x_{M1} = h_A u_{M1} + w,   (13)
x_{M2} = h_B u_{M2} + w.   (14)

The relay can potentially swap these two channel models (and the respective models of the useful signal) in such a way that the absolute value of the channel parameter (α for model M1 and 1/α for M2) is always less than (or equal to) one. This processing at the relay will be called 2-mode relay processing. If the hierarchical mapping at the relay is "symmetric",

c^{ij}_{AB} = X_c(c^i_A, c^j_B) = X_c(c^j_A, c^i_B),  ∀ i, j ∈ {1, …, |A_c|},   (15)

then 2-mode relay processing can be used transparently to both sources A, B (i.e. the sources are not aware which channel model is in use at the relay for the current transmission), and hence it is feasible for the HDF strategy with an HXC.
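A minimal sketch of the 2-mode model swap follows; the values of h_A and h_B are illustrative.

```python
# Sketch of 2-mode relay processing, eqs. (11)-(14): the relay picks the
# model whose channel parameter has magnitude <= 1.

def relay_model(h_a: complex, h_b: complex) -> tuple:
    """Return the selected model and its bounded channel parameter."""
    alpha = h_b / h_a
    if abs(alpha) <= 1:
        return "M1", alpha            # u = s_A + alpha * s_B, eq. (11)
    return "M2", 1 / alpha            # u = (1/alpha) * s_A + s_B, eq. (12)

print(relay_model(1.0, 0.6 + 0.3j))   # ('M1', ...) since |alpha| <= 1
print(relay_model(0.4, 1.2j))         # ('M2', ...) since |alpha| > 1
```

The swap is possible only because the hierarchical mapper is symmetric in the sense of (15); a mapper such as bit-wise XOR of the symbol indices satisfies this condition.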
The symmetry of the relay hierarchical mapper allows the relay to swap the two equivalent models of the useful signal (11), (12) transparently to both sources. In this way the relay can ensure that the value of the channel parameter in the useful signal model remains bounded, which in turn affects the subspaces spanned by the useful signals u_{M1}, u_{M2}, i.e. the constellation space patterns.

Definition 2: A constellation space pattern U_{i_B} is the subspace spanned by the useful signal u_{M2} = (1/α) s_A + s_B for s_B = s_{i_B}, ∀ s_A ∈ A_s and ∀ α ∈ C^1.

Definition 3: A bounded constellation space pattern U'_i is the subspace given by

U'_i = U_{i_A} for |α| ≤ 1, and U'_i = U_{i_B} for |α| > 1.   (16)

Fig. 4: Example of the constellation space patterns for 2-mode relay processing (|A_s| = |A_c| = 4). Each line style corresponds to a particular α s_B, ∀ s_B ∈ A_s.

Table 1: Principles of 2-mode relay processing
  Channel          |α|    |1/α|   u        U'_i
  |h_A| ≥ |h_B|    ≤ 1    ≥ 1     u_{M1}   U_{i_A}
  |h_A| < |h_B|    > 1    < 1     u_{M2}   U_{i_B}

The constellation space patterns U'_i are effectively bounded (Fig. 4) by this simple swapping of the useful signal models at the relay. The only requirement for 2-mode relay processing is the symmetry of the relay hierarchical output mapper (15). We summarize the principles of 2-mode relay processing in Table 1.

5.3 An example of a 2-dimensional PI-HXA

We assume a real-valued channel symbol memoryless mapper A_s ⊂ R² (common to both sources). The channel parameter is complex, α ∈ C^1, and the useful signals are u_{M1}, u_{M2} ∈ C². The assumption of a real-valued alphabet (A_s ⊂ R²) allows the following simple interpretation of the real and imaginary parts of the received useful signal u_{M1} ∈ C² (similarly for u_{M2}):

ℜ{u_{M1}} = s_A + ℜ{α} s_B,   (17)
ℑ{u_{M1}} = ℑ{α} s_B,   (18)

which corresponds to the following vector notation:

ℜ{[u_{M1,1}, u_{M1,2}]ᵀ} = [s_{A,1}, s_{A,2}]ᵀ + ℜ{α} [s_{B,1}, s_{B,2}]ᵀ,   (19)
ℑ{[u_{M1,1}, u_{M1,2}]ᵀ} = ℑ{α} [s_{B,1}, s_{B,2}]ᵀ.   (20)

It is obvious that the imaginary part of the useful signal, ℑ{u_{Mi}}, depends solely on the channel symbols from one source (source B for mode M1 (11) and source A for mode M2 (12)). This can be viewed as an additional side information transmission from the corresponding source. The relay employs 2-mode processing, hence the corresponding constellation space patterns U'_i are bounded. To simplify the design example even more, only the real part of the constellation space patterns (U'_{i,re} = ℜ{U'_i}) will be considered here. These assumptions allow a simple geometric interpretation of the PI-HXA design problem in R². The common channel symbol mapper A_s causes the patterns U_{i_A} and U_{i_B} to define identical subspaces for i_A = i_B; hence it is sufficient to consider only U_{i_A} for the case |α| ≤ 1 (this is equivalent to the analysis of U_{i_B} for the case |1/α| < 1). By defining the particular channel symbol mapper A_s, we also directly define the corresponding constellation space patterns U_{i_A}. In this case, the geometrical PI-HXA design example turns into a puzzle-like problem of arranging the constellation space patterns (U_{i_A,re} for i_A ∈ {1, 2, …, |A_s|}) in R². The main goal of this "puzzle" is to find a suitable channel symbol mapper A_s and a proper hierarchical exclusive mapper c^{ij}_{AB} = X_c(c^i_A, c^j_B) = X_s(s^i_A, s^j_B) which will jointly prevent any violation of the exclusive law for an arbitrary channel parameter α.
Fig. 5: Channel symbol mapper A_s (s_1 = (−1, 1), s_2 = (1, 1), s_3 = (−1, −1), s_4 = (1, −1)) and the resulting constellation space patterns (U'_{i,re}) for the example PI-HXA.

Here we show an example of a two-dimensional 4-ary (|A_s| = |A_c| = 4) PI-HXA, designed according to the assumptions given in this section. The selected constellation space symbols (i.e. the chosen channel symbol mapper A_s) and the resulting constellation space patterns (U'_{i,re}) are depicted in Fig. 5. The final task of this example of geometrical PI-HXA design is a proper choice of the hierarchical exclusive mapper. The selection of a suitable hierarchical exclusive mapper can be visualized as a "colouring" (partitioning) process of the constellation space patterns. A visualization of this colouring process for the example PI-HXA from Fig. 5 is presented in Fig. 6. The resulting hierarchical exclusive mapper X_s(s^i_A, s^j_B) corresponds to a bit-wise XOR of the symbol indices (i ∈ {1, 2, …, |A_s|}).

Fig. 6: Design of a suitable hierarchical exclusive mapper X_s(s^i_A, s^j_B) for the example PI-HXA from Fig. 5. The resulting hierarchical exclusive mapper corresponds to a bit-wise XOR of the symbol indices (i ∈ {1, 2, …, |A_s|}).

6 Numerical results

To show the viability of the layered HXC design in a parametric 2-WRC with the HDF strategy, we present some numerical evaluations of the mutual information (capacity) and the minimum squared distance for the example PI-HXA design from Fig. 5.

6.1 Mutual information (capacity)

Fig. 7: Capacity (mutual information) for the example PI-HXA from Fig. 5.

We evaluate the hierarchical and single-user (alphabet-limited cut-set bound) rates (Fig. 7) for the example alphabet A_s (see Fig. 5) and various channel parametrizations. The signal-to-noise ratio (SNR) is defined as the ratio of the real base-band symbol energy of one source (e.g. A, to have a fair comparison for the reference cases) to the noise power spectrum density, γ = (Ē_{s_A}/2)/N_0. Assuming an orthonormal-basis signal space complex envelope representation of the AWGN, we have σ_w² = 2N_0 and thus γ = E[‖s_A‖²]/σ_w². The alphabet A_s is indexed by the symbols c_A, c_B ∈ {0, …, M_c − 1}. The exclusive hierarchical mapping corresponds to a bit-wise XOR of the symbol indices (Fig. 6). The graph (Fig. 7) shows the classical MAC cut-set bounds (1st and 2nd order) related to one user, in comparison with the capacity of the HDF strategy with the example PI-HXA. The HDF capacity is parametrized by the actual relative phase shift of the source-relay channels, while the amplitude |α| is kept constant in our setup (to respect the symmetry of the rates from A and B). We show the minimal, maximal and mean values of the HDF capacity. The results were obtained by the technique shown in [3], where details can be found. It is obvious from Fig. 7 that the HDF capacity approaches the alphabet-constrained cut-set bound limit for medium to high SNR. For SNR values above approximately 2 dB, the capacity outperforms the classical MAC capacity, irrespective of the channel parametrization (the relative phase shift of the source-relay channels), which has only a minor impact on the resulting performance.
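The example design can be stated compactly in code. The sketch below encodes the alphabet from Fig. 5, the XOR hierarchical mapper from Fig. 6, and the 2-mode choice of the useful signal model; it is an illustration of the construction, not the authors' evaluation code.

```python
# Sketch of the example 4-ary PI-HXA: alphabet (Fig. 5), XOR mapper
# (Fig. 6) and the 2-mode useful signal model.
import numpy as np

A_s = {0: (-1, 1), 1: (1, 1), 2: (-1, -1), 3: (1, -1)}  # symbol index -> point in R^2

def hier_map(i: int, j: int) -> int:
    """Hierarchical symbol index: bit-wise XOR of the symbol indices.
    XOR is symmetric, so the 2-mode condition (15) holds."""
    return i ^ j

def useful_signal(i: int, j: int, alpha: complex) -> np.ndarray:
    """2-mode useful signal: model M1 for |alpha| <= 1, else model M2."""
    s_a, s_b = np.array(A_s[i]), np.array(A_s[j])
    if abs(alpha) <= 1:
        return s_a + alpha * s_b          # eq. (11)
    return s_a / alpha + s_b              # eq. (12)

print(hier_map(2, 3), useful_signal(2, 3, 0.5 + 0.2j))
```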
6.2 Minimum distance

The minimum distance performance is quite closely connected with the error rate of the whole system [8]. We define the minimum squared distance as

$d^2_{\min} = \min_{X_s(s_A, s_B) \neq X_s(\hat{s}_A, \hat{s}_B)} d^2_{(s_A, s_B)-(\hat{s}_A, \hat{s}_B)}$,   (21)

where d² is the squared Euclidean distance between the useful signal u = s_A + α s_B and its candidate û = ŝ_A + α ŝ_B:

$d^2_{(s_A, s_B)-(\hat{s}_A, \hat{s}_B)} = \|(s_A - \hat{s}_A) + \alpha (s_B - \hat{s}_B)\|^2$.   (22)

Fig. 8 depicts the squared minimum distance as a function of the channel parameter α. It is obvious from this figure that the minimum squared distance is highly resistant to the relative phase shift (arg α) of the source-relay channels. Note that the distance shortening at |α| → 0 is inevitable.

Fig. 8: Minimum squared distance as a function of the channel parameter (for the example PI-HXA from Fig. 5)
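A direct numeric evaluation of (21)–(22) for the example alphabet and XOR mapper above is sketched here; the chosen α values are illustrative grid points, and the exhaustive search is only practical because the alphabet is small.

```python
# Sketch: minimum squared distance (21)-(22) for the example PI-HXA.
import numpy as np

A_s = {0: (-1, 1), 1: (1, 1), 2: (-1, -1), 3: (1, -1)}

def d2_min(alpha: complex) -> float:
    best = np.inf
    for ia in range(4):
        for ib in range(4):
            for ja in range(4):
                for jb in range(4):
                    if (ia ^ ib) == (ja ^ jb):
                        continue              # same hierarchical symbol
                    da = np.array(A_s[ia]) - np.array(A_s[ja])
                    db = np.array(A_s[ib]) - np.array(A_s[jb])
                    best = min(best, np.linalg.norm(da + alpha * db) ** 2)
    return best

for phase in (0.0, np.pi / 4, np.pi / 2):
    alpha = np.exp(1j * phase)                # |alpha| = 1, varying phase
    print(f"phase = {phase:.2f} rad: d2_min = {d2_min(alpha):.3f}")
```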
7 Conclusion

The importance of finding an HXA resistant to the channel parametrization (PI-HXA) was stated in [3]. It was observed there that some particular channel parametrization values have disastrous effects on the system performance (significant capacity degradation caused by a violation of the exclusive property). The first approach to PI-HXA design (the E-PHXC design criteria [11]) has until now led only to orthogonal or non-zero-mean HXAs with unsatisfactory performance (capacity limited by the classical MAC). In this paper we have presented an example of a geometrical approach to PI-HXA design. Though the design was based on many simplifying assumptions, the numerical results show a relatively high resistance of the capacity (Fig. 7) and the minimum distance (Fig. 8) to the channel parametrization. In addition, our setup requires no adaptation of the hierarchical exclusive mapping (unlike [6, 8]), hence the processing at the relay can always be kept transparent to both sources. The numerical results presented in this paper show the viability of the layered approach to HXC design in the 2-WRC with the HDF strategy, even in the case of parametric channels. Deriving systematic criteria for PI-HXA design is a topic for future work.

Acknowledgement

This work was supported by the FP7-ICT SAPHYRE project, by the Grant Agency of the Czech Republic, project 102/09/1624, by the Ministry of Education, Youth and Sport of the Czech Republic, programme MSM6840770014, grant OC188, and by the Grant Agency of the Czech Technical University in Prague, grant No. SGS10/287/OHK3/3T/13.

References

[1] Yeung, R. W., Li, S.-Y. R., Cai, N., Zhang, Z.: Network Coding Theory. Now Publishers, 2006.
[2] Sykora, J., Burr, A.: Hierarchical exclusive codebook design using exclusive alphabet and its capacity regions for HDF strategy in parametric wireless 2-WRC. In COST 2100 MCM (Vienna, Austria), pp. 1–9, Sept. 2009. TD-09-933.
[3] Sykora, J., Burr, A.: Hierarchical alphabet and parametric channel constrained capacity regions for HDF strategy in parametric wireless 2-WRC. In Proc. IEEE Wireless Commun. Network. Conf. (WCNC) (Sydney, Australia), pp. 1–6, Apr. 2010.
[4] Baik, I.-J., Chung, S.-Y.: Network coding for two-way relay channels using lattices. In Proc. IEEE Internat. Conf. on Commun. (ICC), 2008.
[5] Popovski, P., Koike-Akino, T.: Coded bidirectional relaying in wireless networks. In Advances in Wireless Communications (V. Tarokh, ed.), Springer, 2009.
[6] Koike-Akino, T., Popovski, P., Tarokh, V.: Denoising maps and constellations for wireless network coding in two-way relaying systems. In Proc. IEEE Global Telecommun. Conf. (GlobeCom), 2008.
[7] Koike-Akino, T., Popovski, P., Tarokh, V.: Denoising strategy for convolutionally-coded bidirectional relaying. In Proc. IEEE Internat. Conf. on Commun. (ICC), 2009.
[8] Koike-Akino, T., Popovski, P., Tarokh, V.: Optimized constellations for two-way wireless relaying with physical network coding. IEEE J. Sel. Areas Commun., vol. 27, pp. 773–787, June 2009.
[9] Nam, W., Chung, S.-Y., Lee, Y. H.: Capacity bounds for two-way relay channels. In Proc. Int. Zurich Seminar on Communications, 2008.
[10] Erez, U., Zamir, R.: Achieving 1/2 log(1+SNR) on the AWGN channel with lattice encoding and decoding. IEEE Trans. Inf. Theory, vol. 50, pp. 2293–2314, Oct. 2004.
[11] Uricar, T., Sykora, J.: Extended design criteria for hierarchical exclusive code with pairwise parameter-invariant boundaries for wireless 2-way relay channel. In COST 2100 MCM (Vienna, Austria), pp. 1–8, Sept. 2009. TD-09-952.
[12] Uricar, T., Sykora, J.: Design criteria for hierarchical exclusive code with parameter invariant decision regions for wireless 2-way relay channel. Submitted for publication, 2010.
[13] Sykora, J.: Design criteria for parametric hierarchical exclusive constellation space code for wireless 2-way relay channel. In COST 2100 MCM (Valencia, Spain), pp. 1–6, May 2009. TD-09-855.

Tomáš Uřičář
E-mail: uricatom@fel.cvut.cz
Dept. of Radioelectronics, Faculty of Electrical Engineering
Czech Technical University in Prague
Technicka 2, 166 27 Prague, Czech Republic

Experimental Investigation of Radial Gas Dispersion Coefficients in a Fluidized Bed

Jiří Štefanica, Jan Hrdlička

CTU in Prague, Faculty of Mechanical Engineering, Department of Energy Engineering, Technická 4, 166 07 Praha, Czech Republic
Correspondence to: jiri.stefanica@seznam.cz

Abstract

In a fluidized bed boiler, the combustion efficiency, the NOx formation rate, flue gas desulphurization and fluidized bed heat transfer are all ruled by the gas distribution. In this investigation, the tracer gas method is used for evaluating the radial gas dispersion coefficient. CO2 is used as the tracer gas, and the experiment is carried out in a bubbling fluidized bed cold model. Ceramic balls are used as the bed material. The effects of gas velocity, radial position and bed height are investigated.

Keywords: fluidized bed, gas mixing, radial dispersion.

1 Introduction

Fluidized bed combustion technology is an efficient and ecological way of combusting low quality fuels. The fuel is combusted in a bed of inert material that is brought to a fluidized state by air passing through it. In the fluidized state, the bed particles are in force equilibrium, resulting in fluid-like behaviour of the bed. Extensive gas and solid mixing inside the fluidized bed provides a large active surface for all chemical reactions, and also for heat transfer. As a result, the combustion temperature can be lower than for other types of combustion, while high combustion efficiency is achieved. A temperature range of 800–950 °C is usually used, providing ideal conditions for in-situ desulphurisation and reduction of NOx. This temperature is also necessary to prevent agglomeration of the bed material. A detailed description of fluidized bed behaviour is needed for the design and control of a fluidized bed boiler. There are many quantities for characterizing a fluidized bed. One of the most important characteristics is the extent of mixing, because it rules the intensity of all mass and heat transport processes.
Gas and solid dispersion coefficients in the axial and radial directions are used to describe the extent of mixing in the bed. This paper presents an experimental evaluation of the gas dispersion coefficient in the radial direction, Dgr, for a bed of ceramic balls. Ceramic balls are not a typical inert bed material; quartz sand and coal ash are more widely used. Ceramic balls have recently come under consideration for use in the combustion of biomass and waste derived fuels [1]. The most widely used method for dispersion coefficient evaluation is the tracer gas method. This method uses a suitable tracer gas, which is injected into the fluidized bed at one point. The concentration of the tracer gas in the primary fluidization gas is then measured at other points of the bed. Gas dispersion coefficients can be determined using an appropriate model. The tracer gas experiment methodology is presented in detail in [2]. The step response tracer gas method is suitable for evaluating the axial dispersion coefficient Dga. Some authors have used the tracer gas method at steady state to determine also Dgr, the radial dispersion coefficient [3–5]. Considering the similarities of the two types of experiments, all tracer-gas related studies have been taken into account. Some authors (e.g. [5,7]) consider axial dispersion to be less important than radial dispersion. In this work, the steady state tracer gas method was used to determine the gas dispersion coefficient in the radial direction, Dgr.

2 Theory

The radial gas dispersion coefficient has been evaluated by the steady state tracer gas method. Various tracer gases have been described in the literature; we selected CO2 for our experiment. After setting the tracer gas flow rate, it was necessary to wait for the tracer gas concentration to stabilize and achieve a steady state. The tracer gas concentration was then measured at the investigated height, at points with different radial positions; see the concentration profiles for heights 5–20 cm and a fluidization velocity of 0.44 m/s in Figure 1.

Figure 1: Radial dispersion profiles for ceramic balls, 0.44 m/s

The dispersed plug flow model from equation (1) can be used for the evaluation:

$u \frac{\partial c_g}{\partial h} = D_{gr} \left[ \frac{1}{x} \frac{\partial}{\partial x} \left( x \frac{\partial c_g}{\partial x} \right) \right] + D_{ga} \frac{\partial^2 c_g}{\partial h^2}.$   (1)

Assuming dispersed plug flow, the concentration profiles should agree with equation (2), where x is the radial position and x_m is the radial position where the tracer gas concentration reaches half of its maximum value:

$\frac{c_g}{c_a} = \left( \frac{1}{2} \right)^{x^2 / x_m^2}.$   (2)

According to [5], the solution for sufficiently distant boundaries is given by equation (3), and D_gr can be calculated by equation (4):

$\frac{c_a}{c_\infty} = \frac{u\, r^2}{h\, D_{gr}},$   (3)

$D_{gr} = \frac{u \cdot r^2 \cdot c_\infty}{h \cdot c_a}.$   (4)

The results can also be seen in terms of dimensionless numbers, as the dependence of the Peclet number from equation (5) on the particle Reynolds number from equation (6):

$Pe = \frac{L \cdot u}{D},$   (5)

$Re_p = \frac{d_p \cdot u \cdot \rho_g}{\mu}.$   (6)
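The relations (4)–(6) translate directly into small helper functions, sketched below. The D_gr formula mirrors eq. (4) as reconstructed above; the air density and viscosity defaults are assumed ambient-condition values, not data from the paper.

```python
# Helper sketch for eqs. (4)-(6).

def d_gr(u, r, c_inf, h, c_a):
    """Radial dispersion coefficient, eq. (4): D_gr = u*r^2*c_inf/(h*c_a)."""
    return u * r**2 * c_inf / (h * c_a)

def peclet(L, u, D):
    """Pe = L*u/D, eq. (5)."""
    return L * u / D

def reynolds_p(d_p, u, rho_g=1.2, mu=1.8e-5):
    """Re_p = d_p*u*rho_g/mu, eq. (6); air at ~20 degC assumed."""
    return d_p * u * rho_g / mu

# Particle Reynolds numbers for the velocities used in this experiment
# (d_p = 3.11 mm from Table 1):
for u in (0.44, 0.58, 0.73, 0.88):
    print(f"u = {u:.2f} m/s -> Re_p = {reynolds_p(3.11e-3, u):.0f}")
```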
3 Experimental

The experiment was carried out in a bubbling fluidized bed cold model, see Figure 2. Considering the ambient temperature of the investigated bed, the model was made out of a 220 mm plexiglass cylinder to enable visual observation of the experiment. The body of the model was supported by a steel tripod stand (1). The fluidization air was supplied by an air fan in the lower section (2). Tracer gas was injected through pipe (3) in the middle of the perforated plate distributor (5) (open area of 2.51 %) that supported the fluidized bed. CO2 was used as the tracer gas, for safety reasons and also because of a readily available detection method. In the wind-box section (4), between the fluidization air inlet and the distributor plate, the air flow becomes steady. The tracer gas concentration was measured through the access points (6).

Figure 2: Experimental apparatus

The tracer gas concentration was measured using an Aseko AIR CMHN non-dispersive infrared gas analyser. A bed of ceramic balls was used for the Dgr investigation. The material properties are presented in Table 1. The dependence of Dgr on the mean fluidization velocity and the bed height was investigated.

Table 1: Material properties (ceramic balls)
  ρ [kg/m3]    800
  ε [–]        0.34
  φ [–]        0.95
  d_p [mm]     3.11
  u_mf [m/s]   0.49

During the experiment, the tracer gas flow rate was kept below 1.5 % of the primary fluidization air flow rate. Bed heights of 5, 10, 15 and 20 cm were measured. Fluidization velocities of 0.44 m/s, 0.58 m/s, 0.73 m/s and 0.88 m/s were used, representing (1–2) · u_mf. The velocities were chosen in accordance with the flow rate measurement capabilities.
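A short worked example of the tracer flow constraint: from the 220 mm column diameter and the lowest superficial velocity, the 1.5 % limit caps the CO2 flow at under 1 m3/h. The calculation below uses only the stated geometry and velocity.

```python
# Worked example (sketch): maximum admissible CO2 tracer flow at the
# lowest fluidization velocity, given the <=1.5 % share limit.
import math

D_BED = 0.220                                    # m, column diameter
u = 0.44                                         # m/s, superficial velocity
air_flow = math.pi * (D_BED / 2)**2 * u          # m3/s of fluidization air
print(f"air: {air_flow * 3600:.1f} m3/h, "
      f"tracer limit: {0.015 * air_flow * 3600:.2f} m3/h")   # ~60.2 and ~0.90
```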
4 Results and discussion

The Dgr values that were obtained are shown in Figures 3–6. The values varied between 0.26–4.68 cm2/s, which is in agreement with the findings of other authors. The measurements showed the same trend under all conditions. Regardless of bed height or fluidization velocity, two peak values were observed, marking the regions where the radial gas mixing is most intensive. The first peak is in the region at the centre of the bed, and the second peak region is near the wall. It can also be observed that, while the extent of mixing decreases with increasing fluidization velocity in the centre of the bed, the situation is reversed near the wall. With increasing gas velocity, the wall region gains in significance, due to the enhanced gas and solid back-mixing that takes place in the region near the walls. It can also be observed that radial gas mixing decreases with increasing bed height. In the course of varying the bed height, the most significant difference was observed when the bed height was increased from 5 cm to 10 cm. The drop in the extent of mixing was less significant for further increases in bed height. The results in terms of dimensionless numbers are shown in Figure 7. The expected linear increase of Pe with Re_p was observed. For higher beds, the dependence on Re_p increased. For lower bed heights, only a slight increase or even stagnation was observed.

Figure 3: Dgr for ceramic balls, 1 · u_mf
Figure 4: Dgr for ceramic balls, 1.33 · u_mf
Figure 5: Dgr for ceramic balls, 1.66 · u_mf
Figure 6: Dgr for ceramic balls, 2 · u_mf
Figure 7: Dimensionless parameters

5 Conclusion

This paper has described how the experimental apparatus for radial gas dispersion coefficient evaluation was designed and how a steady state tracer gas experiment was conducted. Ceramic balls were used as the bed material. The effects of fluidization gas velocity and bed height were investigated. For the conditions that were used, the radial gas dispersion coefficient values were determined to be 0.26–4.68 cm2/s. This is in agreement with previous experiments for a bubbling bed, where Dgr values of 1–5 cm2/s were mostly found. The highest value at each level was found either in the centre of the bed or at the wall. With increasing velocity, the peak near the wall was found to increase, while the peak in the bed centre decreased. The radial gas dispersion coefficients were found to decrease with bed height. A linear dependence of Pe on Re_p was found. For higher beds, the dependence on Re_p increased. For lower bed heights, the Pe value could be considered independent of Re_p, taking the measurement uncertainty into account.

Legend
ε [–] void fraction
φ [–] sphericity
μ [Pa·s] dynamic viscosity
ρ [kg/m3] density of the bed material
ρ_g [kg/m3] fluidization gas density
c_a [vol. %] maximum concentration of the tracer gas for a given height, measured at the centre of the bed
c_g [vol. %] concentration of the tracer gas at a given point
c_∞ [vol. %] concentration of the tracer gas after perfect mixing with the fluidization gas
D [m] bed diameter
D_ga [m2/s, cm2/s] gas dispersion coefficient in the axial direction
D_gr [m2/s, cm2/s] gas dispersion coefficient in the radial direction
d_p [m] particle diameter
L [m] characteristic dimension
x [m] radial distance from the vertical axis of the bed
x_m [m] value of x where c_g/c_a = 1/2
u [m/s] mean gas velocity
u_mf [m/s] gas velocity at minimum fluidization
Pe [–] Peclet number
Re_p [–] particle Reynolds number

References

[1] Hrdlička, J., Obšil, M., Hrdlička, F.: Field test of waste wood combustion using two different bed materials. In Proceedings of the 16th SCEJ Symposium on Fluidization and Particle Processing. Tokyo: The Society of Chemical Engineers, Japan, 2010, pp. 40–43.
[2] Kunii, D., Levenspiel, O.: Fluidization Engineering. Butterworth–Heinemann, 1991.
[3] Namkung, W., Kim, S. D.: Radial gas mixing in a circulating fluidized bed. Powder Technology, 113, 2000, pp. 23–29.
[4] Sternéus, J., Johnsson, F., Leckner, B.: Characteristics of gas mixing in a circulating fluidised bed. Powder Technology, 126, 2002, pp. 28–41.
[5] Chyang, C.-S., Han, Y.-L., Chien, C.-H.: Gas dispersion in a rectangular bubbling fluidized bed. Journal of the Taiwan Institute of Chemical Engineers, 41, 2010, pp. 195–202.
[6] Sane, S. U. et al.: An experimental and modelling investigation of gas mixing in bubbling fluidized beds. Chemical Engineering Science, 51, 1996, pp. 1133–1147.
[7] Atimtay, A., Cakaloz, T.: An investigation on gas mixing in a fluidized bed. Powder Technology, 20, 1978, pp. 1–7.

Evaluation of Methods Used for Separation of Vibrations Produced by Gear Transmissions

A. Dočekal, M. Kreidl, R. Šmíd

Abstract

This paper evaluates methods used for separating vibrations produced by a gear transmission from the vibration signal acquired on the gearbox. The paper presents a novel method for evaluating the algorithms used for this separation. The evaluation method takes into account the statistical reliability of the results achieved on multiple sets of signals acquired on the same machine and under the same conditions. The signal separation was applied in order to process data obtained during an experiment carried out with the aim of analyzing the influence of a torque load affecting a gearbox on the vibrations produced by the gear transmission. It is supposed that the vibration characteristics of the gear transmission are strongly affected by the value of the torque load influencing the gearbox shafts. This influence is analyzed using the vibration signal acquired on the gearbox housing. The vibration signal contains significant disturbances, and its interpretation is unclear. The vibration signal generated by the gear transmission can be separated using methods that make it possible to select the valid features included in the signal. Methods of feature selection which implement a systematic search in the state space and methods based on the genetic algorithm were applied. The genetic algorithm provides a robust stochastic global search of the state space that is well suited to dealing with nonlinear problems and also shortens the necessary computing time. The evaluation and comparison of the results achieved during the separation process using different methods have to be taken into account. In the case of signal separation, it is important to evaluate the differences between the results achieved during particular executions of the separation process performed by the same method on different datasets acquired in the case of the same experiment and conditions. Methods whose results vary, or that are different from the results given by other methods, are assumed not to be statistically reliable. It is also necessary to penalize methods leading to results that can vary greatly in some executions according to scattered data. Conversely, methods that give results varying around the right set of features seem more acceptable. A novel method for rating the statistical reliability of the results has been proposed. This method is essential for methods using a stochastic search in the state space.

Keywords: gear transmission, vibration, signal separation, selection error rate on multiple datasets.

1 Introduction

Vibrodiagnostics is a well-established technique for condition monitoring and for detecting faults in modern rotary machines, e.g. automotive gearboxes. The development of a new design for a rotary machine component is a typical field where the acoustic and vibration signals produced by the machine need to be analyzed in order to ensure a longer lifetime and quieter running of the machine component. The properties of a newly designed component are evaluated during experiments that test its reliability and its ability to deal with wide-ranging conditions. The gear transmission is an important machine component which is still under intensive development. Paper [1] describes the design and a way of testing gears with a non-standard profile developed at the Czech Technical University in Prague. The suitability of the gear transmission design was investigated in a stress test, in which various levels of torque load were forced on the two shafts paired by the tested gears. Besides an estimation of the lifetime, the main aim of the experiment was to discover the influence of the load value forced on the gear transmission on the vibration exposure of the gears. Paper [2] describes the signal processing method applied in order to analyze the vibrations acquired during this test.
This paper evaluates the efficiency that can be achieved by several suitable methods. Theoretical models of these designs are very complicated and often, as in this case, none is available. Some characteristics of the vibration exposure of gear transmissions are known, but they are too general and not accurate enough to satisfy our objective. The main general feature of gear vibration is that the energy of the vibrations produced by the gears is concentrated mainly around the harmonics of the tooth frequency (TF). The tooth frequency f_t can be estimated using equation (1):

f_t = n_1 · f_1 = n_2 · f_2,   (1)

where f_1 is the revolution rate of the first gear, n_1 is the number of teeth of the first gear, and f_2 and n_2 denote the same properties of the second gear. The TF (shown in Fig. 1) is characterized by the presence of many (sub)harmonics in the spectra and by their amplitude modulations by certain frequencies, such as the repetitive frequencies of the shafts in engagement. Other characteristic frequencies and their modulations can also occur, e.g. the hunting tooth frequency described in [3]. It is complicated to select the few frequencies that are most important for the vibration analysis of the inspected gear transmission. It requires a mutual comparison of many vibration spectra. This comparison can be performed via a cascade diagram (shown in Fig. 1). The significant features included in the acquired vibrations related to the new gear design are given by the differences between the power spectral densities (PSDs) of gear vibration signals acquired when they are forced by different levels of gearbox load, and between the PSDs of vibration signals acquired when the gearbox is forced by the same load value. The need to compare hundreds of PSDs to discover this influence is a big disadvantage. Therefore, it is convenient to simplify and automate this process. Because it is supposed that the torque load mainly affects the vibrations produced by the tested gear transmission, discovering the significant features included in the PSDs enables us to separate the undesirable vibration signals produced by the gear transmission.
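Equation (1) is easy to apply in practice; the sketch below uses hypothetical gear counts and a hypothetical shaft rate, not values from the experiment.

```python
# Worked example of the tooth (mesh) frequency, eq. (1). All numeric
# values are illustrative assumptions.

def tooth_frequency(f1_hz: float, n1: int) -> float:
    """f_t = n1 * f1 (which equals n2 * f2 for the mating gear)."""
    return n1 * f1_hz

f1, n1, n2 = 25.0, 20, 31          # hypothetical shaft rate and tooth counts
f_t = tooth_frequency(f1, n1)
f2 = f_t / n2                      # implied revolution rate of the second shaft
print(f"f_t = {f_t:.0f} Hz, f2 = {f2:.2f} Hz")   # 500 Hz, 16.13 Hz
```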
the genetic algorithm poses a robust stochastic global search in the state space that is well suited to deal with nonlinear problems and also shortens the necessary computing time. the evaluation and comparison of the results achieved during the separation process using different methods have to be taken into account. in the case of signal separation, it is important to evaluate differences between the results achieved during particular executions of the separation process performed by the same method on different datasets which were acquired in the case of the same experiment and conditions. methods with results that vary, or that are different from the results given by other methods, are assumed not to be statistically reliable. it is also necessary to penalize methods leading to results that can vary greatly in some executions according to the scatter data. conversely, methods that give results varying around the right set of features seem more acceptable. a novel method for rating the statistical reliability of the results has been proposed. this method is essential for methods using a stochastic search in the state space. keywords: gear transmission, vibration, signal separation, selection error rate on multiple datasets. tested gear transmission, discovering the significant features included in psds enables us to separate the undesirable vibration signals produced by the gear transmission. the tested gear transmission was placed in a gearbox fitted with flow cooling. because of many technical issues, the accelerometers were placed on the gearbox housing. the amount of background noise and vibration contained in the vibration signal acquired on the housing increases the difficulty of further analysis, and can make the results of further analysis unclear. in this case, it is essential to separate the vibration signal produced by the gear transmission from the background vibration produced by other sources. under these circumstances, the gear transmission vibration signal can be separated utilizing methods implementing feature separation based on the dependence of the vibration on a known independent parameter. one big group of these methods applied in our study comprises methods of feature selection. this uses a systematic search in the feature state space. branch and bound feature selection, sequential backward feature selection, sequential forward feature selection, pudil’s floating feature selection (forward), and plus-l-takeaway-r feature selection were applied. other applied methods were based on the genetic algorithm, which implements a stochastic search in the state space. methods based on the genetic algorithm are well suited to deal with nonlinear problems and they also support parallel implementation, which shortens the necessary computing time. the multilayered iterative algorithm from the group method of data handling, and the group of adaptive models evolution were used. the results achieved by various methods during the separation process have to be evaluated and compared. the methods need to be evaluated with regard to the ability of the separated part of the vibration signal to retroactively recognize the value of the applied torque load. this aspect was verified using inter/intra class distance. another important consideration should be an evaluation of the statistical reliability of the results achieved by the separation process. 
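before turning to the signal processing itself, a quick numerical check of equation (1) may be useful. the sketch below (python) uses hypothetical shaft rates and tooth counts chosen so that the tooth frequency comes out at the 398 hz reported for the tested transmission in section 5.1; the actual gear parameters are not given in the paper.

def tooth_frequency(f_shaft, n_teeth):
    # tooth (mesh) frequency: revolution rate times the number of teeth
    return f_shaft * n_teeth

f1, n1 = 24.875, 16   # first shaft [hz], teeth (hypothetical)
f2, n2 = 15.92, 25    # second shaft [hz], teeth (hypothetical)

ft_a = tooth_frequency(f1, n1)   # 398.0 hz
ft_b = tooth_frequency(f2, n2)   # 398.0 hz
assert abs(ft_a - ft_b) < 1e-9   # both gears in mesh share one tooth frequency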
2 signal processing

as mentioned in section 1, the actual condition of the gear transmission can be described by the vibration level at the frequency given by equation (1) and at its higher harmonic and subharmonic frequencies. these frequencies are characteristic for a certain gearing and revolution rate. an evaluation of the power of the vibration signal at these frequencies, and of its changes, reveals changes in gearing operational conditions, especially wear or a fault that has arisen on the gearing. the aim of the signal processing described here is to recognize the bands in the psd that are most important and also contain most information about the condition of the gear transmission.

fig. 1: cascade diagram showing the dependency of the vibration power spectral density on the torque load forcing on the gear transmission (axes: frequency (hz), torque load (nm), acceleration (mm/s)²; the tooth frequency f_t is marked)

the experiment and further signal processing focused on separating the vibration produced by the gear transmission from the mixture of vibrations acquired on the gearbox housing; the procedure is depicted in fig. 2. the gearwheels are operated under a defined stress or, more precisely, under a pre-selected torque load during the experiment. the power spectral density (psd) of the vibration signal was estimated. then the psd was split into bands. in order to reduce the number of features, the psd of the signal was represented by the power of the signal inside each band. this set of features was utilized to separate the vibration signal of the gear transmission using methods of feature selection. the set of selected features corresponds to the frequency response of the filter which separates the vibration signal produced by the gear transmission from the rest of the acquired signal.

fig. 2: experiment used for separating the vibration produced by the gear transmission from the mixture of vibrations measured on the gearbox housing (2h denotes the sensor placing; processing chain: gearing, vibration, data acquisition, pre-processing, psd estimation, split to the bands, feature selection, separated part of the signal (bands) mostly related to the torque load forcing on the gearing)
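a minimal sketch of this psd and band-power pipeline is given below (python, using scipy). the welch parameters follow those reported later in section 5.3 (hanning window, segment length of one hundredth of the signal, 25 % overlap, 20 hz bands, 150 features); the test signal and helper names are our own, so this is an illustration of the processing chain, not the authors' code.

import numpy as np
from scipy.signal import welch

def band_power_features(x, fs, bandwidth=20.0, n_bands=150):
    # estimate the psd by welch's method and represent it by the
    # power inside consecutive bands of constant width
    nperseg = len(x) // 100          # one hundredth of the whole signal
    noverlap = nperseg // 4          # 25 % overlap
    f, pxx = welch(x, fs=fs, window='hann',
                   nperseg=nperseg, noverlap=noverlap)
    df = f[1] - f[0]
    feats = []
    for i in range(n_bands):
        lo, hi = i * bandwidth, (i + 1) * bandwidth
        mask = (f >= lo) & (f < hi)
        feats.append(pxx[mask].sum() * df)   # integrated band power
    return np.array(feats)

fs = 8192.0                                    # hypothetical sampling rate
t = np.arange(0, 10, 1 / fs)
x = np.sin(2 * np.pi * 398 * t) + 0.1 * np.random.randn(t.size)  # tf component
features = band_power_features(x, fs)          # 150 features per record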
3 methods used for feature selection

this section briefly describes the methods used for separating the vibration signal produced by the gearing.

3.1 branch and bound feature selection

the branch and bound feature selection (bbfs) method is a method of feature selection designed to select the features valid for solving the task from the set of given features. the method selects the features that carry the most information in the sense of the selected criterion function. bbfs works on the basis of a systematic search in the feature state space by creating a decision tree using "depth first search with a backtrack mechanism" [4]. this is a recursive algorithm which is initialized with the complete set of given features and the corresponding value of the criterion function. in the first step, the feature whose removal causes a minimal decrease (or even an increase) in the criterion function is removed. then all valid branches of the decision tree are sought in dependence on the value of the criterion function. the algorithm continues recursively until the optimal set of features is found. a detailed description of the branch and bound method is given in [4].

3.2 sequential forward or backward feature selection

although the bbfs method can find the optimal solution very effectively, its computation can be quite demanding. feature selection methods are therefore used that can find only a suboptimal solution, but whose computation is less demanding. sequential forward feature selection (sffs) and sequential backward feature selection (sbfs) are two of these methods. both methods also systematically search the feature state space while creating the decision tree. unlike bbfs, these methods only search the bounds of the tree that provide the least decrease of the criterion function [4]. unlike bbfs and sbfs, sffs starts with the empty set and proceeds by adding the features one after another.

3.3 plus-l-takeaway-r feature selection

sffs and sbfs both work without a backtracking mechanism. once a feature is added (removed), this action cannot be undone. plus-l-takeaway-r feature selection (lrfs), also known as (l, r) search [4], is derived from the sequential forward selection algorithm and gets around this limitation by adding l features at a time, after which r features from the obtained set are excluded in accordance with the criterion function, etc. this algorithm results in better performance than sequential selection.

3.4 pudil's floating feature selection

pudil's floating feature selection (pffs) is also derived from sffs. it implements the "floating search algorithm" described in [5]. pudil's floating feature selection supports backtracking as long as there is an increase in the criterion function, which may not be monotonic. pffs is believed to give results similar to those given by branch and bound, but it needs far less computation effort [6].
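the greedy searches of sections 3.2 and 3.3 are simple enough to state compactly; the sketch below (python) shows sequential forward selection with an arbitrary criterion function (e.g. the inter/intra class distance introduced in section 4.1). this is a generic illustration of the algorithm, not the implementation used by the authors.

def sequential_forward_selection(n_features, criterion, n_select):
    # start from the empty set and repeatedly add the feature that
    # maximizes the criterion function of the growing subset
    selected, remaining = [], set(range(n_features))
    while len(selected) < n_select:
        best_f, best_val = None, float('-inf')
        for f in remaining:                      # try each candidate
            val = criterion(selected + [f])
            if val > best_val:
                best_f, best_val = f, val
        selected.append(best_f)
        remaining.remove(best_f)
    return selected

plus-l-takeaway-r selection wraps the same idea: add l features with steps like the loop above, then remove the r features whose removal decreases the criterion least.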
3.5 group method of data handling

the group method of data handling (gmdh) is a set of several methods for constructing inductive models [7]. this approach is based on gradually sorting out complicated models and selecting the best solution on the basis of the minimum of an external criterion. this leads to the selection of valid features that are able to describe the analyzed influence in the data. the vibration signals acquired on the gearbox were processed by the "multilayered iterative algorithm" (mia or mia gmdh). this approach uses a data set to construct a model of a complex system. the model is represented by a neural network which has been trained using the genetic algorithm. the genetic algorithm not only adjusts the network, but also has an influence on the network topology.

the mia algorithm works as follows. first the initial population of units with a given polynomial transfer function is generated. the units have two inputs, and therefore all pair-wise combinations of input variables are employed. then the coefficients of the unit transfer functions are estimated using stepwise regression or some other optimization method. the units are sorted by their error of output variable modeling. a few of the best-performing units are selected as inputs for the next layer according to the rules of the genetic algorithm. the next layers are generated identically until the error of modeling decreases. the units that are connected to the features carrying most of the information provide the best results, and so they should survive in the network with higher probability. the fitted mia gmdh neural network describes the solved problem via polynomial equations [7].

3.6 group of adaptive models evolution

the group of adaptive models evolution (game) [8] is derived from gmdh theory. it improves the multilayered iterative algorithm (mia). the game method uses the niching genetic algorithm [8] to build networks with neurons and connections proper to the data set. the connections can be more complex than mia provides, and several types of neurons are possible. game also contains a technique for verifying the models. a selected number of models are created during the training phase. the inner structures of all the models are compared, and possible correlations in inner structure are penalized (danger of possible creation of similar models). subsequently, the models obtained during the training phase are compared all together and also to the known right values (right answers). this results in the selection of a few best models, and simultaneously the credibility of the models for each value of a known parameter is determined. it has been proven that game networks are able to solve a certain type of complex problems that cannot be solved using mia gmdh [9]. the main disadvantage of using game is its higher computational cost.

4 evaluation of methods used for feature selection

the efficiency of the methods described in section 3 was evaluated using inter/intra class distance and selection error rate on multiple datasets.

4.1 inter/intra class distance

because it is assumed that the torque load mainly affects the vibrations produced by the tested gear transmission, the significant features are related to differences between their values for different loads. conversely, the differences should be small in the case of the same load. hence the load defines classes in the feature space. the values of the features for a given load value form a cluster. inter/intra class distance (iicd) evaluates the ability of the selected features to form clusters, and the ability of the clusters to be easily discriminated. the iicd criterion was calculated using the following equations:

$$\mathrm{iicd} = \frac{\displaystyle\sum_{k=1}^{K} n_k\,(\mathbf{s}_k - \mathbf{s})^{\mathsf T}(\mathbf{s}_k - \mathbf{s})}{\displaystyle\sum_{k=1}^{K}\sum_{n=1}^{n_k} (\mathbf{z}_{k,n} - \mathbf{s}_k)^{\mathsf T}(\mathbf{z}_{k,n} - \mathbf{s}_k)}, \qquad (2)$$

$$\mathbf{s}_k = \frac{1}{n_k}\sum_{n=1}^{n_k} \mathbf{z}_{k,n}, \qquad (3)$$

$$\mathbf{s} = \frac{1}{n_s}\sum_{n=1}^{n_s} \mathbf{z}_n, \qquad (4)$$

where $n_s$ is the number of samples, $K$ is the number of classes, and $n_k$ is the number of samples included in class $k$. $\mathbf{s}_k$ denotes the centre of class $k$, $\mathbf{s}$ is the centre of all samples, $\mathbf{z}_n$ denotes sample $n$, and $\mathbf{z}_{k,n}$ denotes sample $n$ belonging to class $k$. each sample is represented by the set of features.
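a direct transcription of equations (2)–(4) into code may clarify the criterion; the sketch below (python/numpy) follows our reading of the garbled originals as a ratio of between-class to within-class scatter, with samples as rows of z and class labels defining the torque-load clusters.

import numpy as np

def iicd(z, labels):
    s = z.mean(axis=0)                   # centre of all samples, eq. (4)
    inter, intra = 0.0, 0.0
    for k in np.unique(labels):
        zk = z[labels == k]
        sk = zk.mean(axis=0)             # centre of class k, eq. (3)
        inter += len(zk) * np.dot(sk - s, sk - s)   # between-class scatter
        intra += ((zk - sk) ** 2).sum()             # within-class scatter
    return inter / intra                 # eq. (2): larger is better

# usage: z has one row of 150 band powers per record, and labels holds the
# torque load class (0, 500, 1000, 1500 or 2000 nm) of each record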
4.2 selection error rate on multiple datasets

selection error rate on multiple datasets (sermd) is a novel method designed mainly for the evaluation of separation using a stochastic search. when applied to methods using a systematic search, sermd evaluates the sensitivity of the search when scattered data are applied.

sermd is based on analyzing the differences between the results (sets of selected features) achieved during single executions of the separation process done by the same method on different datasets acquired in the same experiment with the same settings. the presumption behind the idea of sermd is that the results (selected features) given by the search should be the same or similar when it is applied to data acquired during the same experiment under the same conditions (the same experimental settings). methods producing results that vary, or even results that are stochastic, are assumed not to be statistically reliable. it is also necessary to penalize methods that provide results that can be totally different for some executions depending on the data. on the other hand, methods that give results varying around the right set of features seem to be more acceptable.

sermd analyzes the sets of $z$ features and their ratings given by method $m$ when it is applied to $N$ datasets. the set of ratings of features selected when applied to dataset $n$ is denoted as $\mathbf{z}(n, m)$; it is a vector of length $z$. the element of $\mathbf{z}(n, m)$ is $z(n, m, f_i)$, where $f_i$ denotes the selected feature (matching the centre frequency $f_i$, or the index $i$, of the selected band). $z(n, m, f_i)$ corresponds to the rating of feature $f_i$ by method $m$ when it is applied to dataset $n$. when the method does not give a feature rating, the rating is set subsequently: if the feature has been selected, its rating is given by a probability with uniform distribution over all the selected features; if it is not selected, the rating is zero.

first, the most rated features (mrfs) given by all methods are estimated. the mrfs are given by the histogram of ratings of all features through all methods and all datasets available:

$$p(f_i) = \sum_{m=1}^{M}\sum_{n=1}^{N} z(n, m, f_i). \qquad (5)$$

the most rated features for all methods, $\mathbf{f}_e$, are given as follows:

$$\mathbf{f}_e = \underset{e}{\operatorname{argmax}}\,\{p(f_i)\}, \quad f_i \in \{f_1, f_2, \ldots, f_z\}, \qquad (6)$$

where the transformation "argmax" gives a vector of $e$ features assigned the $e$ maximum values included in $p(f_i)$, reflecting all features $f_i$. the vector of mrfs given by all methods, $\mathbf{f}_e$, forms a standard to which all methods are compared. the evaluation of method $m$ starts with estimating the mrfs which are given only for this method and dataset $n$. each vector of these mrfs is estimated by

$$\mathbf{f}(n, m) = \underset{e}{\operatorname{argmax}}\,\{z(n, m, f_i)\}, \quad f_i \in \{f_1, f_2, \ldots, f_z\}. \qquad (7)$$

then the matrix $\mathbf{D}_m$ is formed from the differences between the mrfs $\mathbf{f}(n, m)$ given by method $m$ when applied to dataset $n$ and the standard mrfs $\mathbf{f}_e$ given by all the methods and datasets. the matrix $\mathbf{D}_m$ is created by stacking the differences $\mathbf{d}(n, m)$, which form the columns of the matrix:

$$\mathbf{d}(n, m) = \mathbf{f}(n, m) - \mathbf{f}_e, \quad \mathbf{D}_m = [\mathbf{d}(1, m), \mathbf{d}(2, m), \ldots, \mathbf{d}(N, m)]. \qquad (8)$$

the efficiency of the method is estimated from the $e$ minimum values included in $\mathbf{D}_m$:

$$\mathbf{d}_e(m) = \underset{e}{\operatorname{argmin}}\,\{\mathbf{D}_m\}, \qquad (9)$$

where the transformation "argmin" gives a vector of the $e$ values assigned the $e$ minimum values included in $\mathbf{D}_m$. the value of the sermd criterion is given by the mean absolute error of the selected features:

$$\mathrm{sermd}(m) = \frac{1}{e}\sum_{\tilde e = 1}^{e} \left| d_e(m, \tilde e) \right|, \qquad (10)$$

where $d_e(m, \tilde e)$ is a value included in $\mathbf{d}_e(m)$. the multiple datasets are created by splitting and shuffling the acquired dataset. each class (torque load value) should be represented equidistantly, by the same or a similar number of samples.
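the sermd computation of equations (5)–(10) can be sketched as follows (python/numpy), again under our reading of the garbled originals. ratings[n, m] is the vector z(n, m) of feature ratings given by method m on dataset n, and centers maps a band index to its centre frequency, so the criterion comes out in hz; the stratified split helper mirrors the class-balance requirement stated above. names here are our own.

import numpy as np

def most_rated(rating_vector, e):
    # indices of the e highest-rated features, cf. eqs. (6) and (7)
    return np.sort(np.argsort(rating_vector)[-e:])

def sermd(ratings, m, centers, e=10):
    p = ratings.sum(axis=(0, 1))                     # eq. (5)
    fe = centers[most_rated(p, e)]                   # standard mrfs, eq. (6)
    diffs = []
    for n in range(ratings.shape[0]):
        fnm = centers[most_rated(ratings[n, m], e)]  # eq. (7)
        diffs.extend(np.abs(fnm - fe))               # columns of d_m, eq. (8)
    de = np.sort(diffs)[:e]                          # e minima, eq. (9)
    return float(np.mean(de))                        # eq. (10), in hz

def stratified_split(labels, n_sets, rng):
    # assign records to n_sets datasets class by class, so that each
    # torque-load class is represented by a similar number of samples
    assignment = np.empty(len(labels), dtype=int)
    for k in np.unique(labels):
        idx = rng.permutation(np.where(labels == k)[0])
        assignment[idx] = np.arange(len(idx)) % n_sets
    return assignment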
5 experimental setup

5.1 testing stand

a universal testing stand was used to test the gearwheels. the stand is designed for accelerated lifetime testing of the gearwheels. the design of niemann's closed loop circuit [10] was used for this purpose because of its lower energy intensiveness. the testing stand, shown in fig. 3 and fig. 4, consists of one measured gearbox and one auxiliary gearbox, a drive engine, tension equipment, and sensors for the torque load, shaft revolution rate, temperature and vibration of the tested gear transmission. unlike the testing gearbox, the auxiliary gearbox is overrated for the values of the torque load. the torque sensor works up to 2000 nm. the circuit is dimensioned for a maximal virtual power of 785 kw and for a revolution rate of 1450 rpm. f_t was 398 hz.

fig. 3: niemann's closed loop circuit [2]: 1 tensing screw, 2 auxiliary gearbox, 3 auxiliary gear transmission, 4 gear coupling, 5 fixed part of the gear coupling, 6 torsion shaft, 7 coupling with evolvent grooving, 8 tested gearwheel, 9 testing gearbox, 10 testing pinion, 11 etp-techno coupling, 12 radex-n coupling with cutable screw, 13 moment sensor, 14 axially fixed pinion, 15 radex-n coupling with cutable screw, 16 moment sensor, 17 radex-n coupling, 18 drive engine

fig. 4: visualization of the testing circuit [2]

5.2 data acquisition and sensor placing

the accelerometer was placed at a point located on the bearing housing above the shafts (shown in fig. 2 as 2h). the vibrations were acquired using a brüel & kjær pulse 7537 analyzer fitted with calibrated 4507 b accelerometers. the attachment of the accelerometers enables operation up to the upper limit frequency of 3 khz with a sensitivity of 10 mv/(m·s−2).

5.3 digital signal processing

the measured vibration signal was filtered by a low-pass filter with the stop frequency at 3 khz. the power spectral density was estimated using welch's method. the hanning window in time was applied. welch's segment length was selected as one hundredth of the whole signal length. the overlap value was 25 %. the psd of the acquired vibration was divided into uniformly spread bands with constant bandwidths of 20 hz. the division was performed in such a way that tf lay at the centre of a band. the vibration in each band was represented by its power value. for feature selection, each measurement of the data set was represented by a set of 150 features.

the vibration signals acquired during the stress test contained 5 classes formed by the different states of the torque forcing on the gearing. the dataset contained 192 records: 40 records for torque at 0 nm, 38 records for 500 nm, 38 records for 1000 nm, 38 records for 1500 nm, and 38 records for 2000 nm. the record order was shuffled using a random process with normal distribution. the 10 most rated features were selected for each dataset and method.

6 experimental results

a comparison of the results achieved by the methods is shown in table 1. the values of sermd stated in table 1 are given for evaluating 2 datasets. each dataset contained 96 samples of measurement. the 10 most rated features were selected for each dataset and method (e = 10).

table 1: comparison of methods used for feature selection

method      iicd (-)   sermd (hz)   avg. computing time (s)
sbfs        82.8       138.0        1052
sffs        48.8       30.8         117
lrfs        48.8       30.6         116
bbfs        112.6      67.1         3608
pffs        42.7       44.2         127
mia gmdh    52.1       74.4         43
game        52.5       36.8         158

the results given by sermd are strongly dependent on the size of the dataset used for feature selection. this dependency, corresponding to each method, is shown in fig. 5.

fig. 5: dependency of sermd on the size of the dataset used for feature selection
the dependency shown in fig. 5 is an approximation of the dependency which could be estimated by repetitive estimation on datasets created by random shuffling. given the computational cost of each method, the required computing time is also of interest. fig. 6 shows the maximum and minimum computing times that were needed to select the features. the values stated in fig. 6 were estimated on a pc fitted with an intel pentium m processor operated at 1.73 ghz, with a 533 mhz fsb and 2 gb ram. because of the way in which the times were estimated, the values stated in fig. 6 are only indicative, but they provide an overview of the computational cost of the methods. an overview of the power spectral density of the vibration and its evaluation by the group of adaptive models evolution is shown in fig. 7.

fig. 6: computing time required by the feature selection methods (maximum and minimum times shown; the longest run reached 5 hours)

conclusion

this paper has focused on an evaluation of methods applicable to feature selection. the novel method presented here, selection error rate on multiple datasets, takes into account the statistical reliability of the results achieved when the methods were applied repeatedly to multiple sets of signals acquired on the same machine under the same conditions. this is particularly important in the case of methods based on a stochastic search in the state space. when this criterion was applied to methods based on a systematic search, it evaluated the sensitivity of the search when scattered data are applied.

the methods for signal separation were applied in order to separate the vibration signals produced by a new design of gear transmission from the rest of the signal that was acquired during a stress test. the aim of the experiment was to inspect the influence of the torque load on the vibration exposure of the gear transmission. the vibration signals were acquired on the gearbox housing and contained significant disturbances, so their interpretation was unclear. the feature selection methods based on a systematic search in the state space, and methods based on the genetic algorithm, which implements a stochastic search in the state space, were applied. selection error rate on multiple datasets, inter/intra class distance, and computational cost were taken into account. the evaluation indicates that sequential forward feature selection, pudil's floating feature selection, plus-l-takeaway-r feature selection, and the group of adaptive models evolution seem to be suitable for the objective in view.

acknowledgments

this study was supported by research program no. msm 6840770015 "research of methods and systems for measurement of physical quantities and measured data processing" of the ctu in prague, sponsored by the ministry of education, youth and sports of the czech republic.
fig. 7: overview of the gear transmission vibration psd and the corresponding evaluation by game (torque load at 1500 nm)

references

[1] žák, p., dynybyl, v.: design and testing of gears with non-standard profile. in: 2007 proceedings of the asme international mechanical engineering congress and exposition. new york: american society of mechanical engineers – asme, 2007. isbn 0-7918-3812-9.
[2] dočekal, a., dynybyl, v., kreidl, m., šmíd, r., žák, p.: localization of the best measuring point for gearwheel behaviour testing using group of adaptive models evolution. measurement science review [online], vol. 8, no. 2, 2008, p. 42–45. http://www.measurement.sk/2008/s1/p.html. issn 1335-8871.
[3] smith, derek j.: gear noise and vibration. second edition. new york: marcel dekker, inc., 2003. isbn 0824741293.
[4] heijden, f., duin, r. p. w., ridder, d., tax, d. m. j.: classification, parameter estimation and state estimation. chichester: john wiley & sons, ltd., 2004. isbn 0-470-09013-8.
[5] pudil, p. et al.: floating search methods for feature selection with non-monotonic criterion functions. in: proceedings of the twelfth international conference on pattern recognition. jerusalem: iapr, 1994, p. 279–283.
[6] bensch, m. et al.: feature selection for high-dimensional industrial data. in: esann'2005 proceedings – european symposium on artificial neural networks. bruges, 2005, p. 375–380. isbn 2-930307-05-6.
[7] madala, h. r., ivakhnenko, a. g.: inductive learning algorithms for complex systems modelling. london: crc press inc., 1994. isbn 0-8493-4438-7.
[8] kordík, p., šnorek, m.: group of adaptive models evolution. in: simulation almanac 2005. praha: česká technika – nakladatelství čvut, 2005, p. 79–92. isbn 80-01-03322-8.
[9] kordík, p., saidl, j., šnorek, m.: evolutionary search for interesting behavior of neural network ensembles. in: 2006 ieee congress on evolutionary computation. los alamitos: ieee computer society, 2006, vol. 1, p. 235–238. isbn 0-7803-9489-5.
[10] žák, p., dynybyl, v.: spur gear tests – surface fatigue. materiálové inžinierstvo, vol. xiv, no. 3/2007, p. 81–86. issn 1335-0803.

ing. adam dočekal
phone: +420 224 352 346
e-mail: docekaa@fel.cvut.cz

doc. ing. marcel kreidl, csc.
phone: +420 224 352 117
e-mail: kreidl@fel.cvut.cz

doc. ing. radislav šmíd, ph.d.
phone: +420 224 352 131
e-mail: smid@fel.cvut.cz

department of measurement, czech technical university in prague, faculty of electrical engineering, technická 2, 166 27 prague 6, czech republic

analysis and simulation of the pi of the sky detector response

l. w. piotrowski, a. f. żarnecki

abstract

the pi of the sky project observes optical flashes of astronomical origin and other light sources variable on short timescales, down to tens of seconds. we search mainly for optical emissions of gamma ray bursts, but also for variable stars, blazars, etc. precise photometry with a very large field of view (20° × 20°) requires a careful study and modelling of the point spread function (psf), as presented in this paper.

keywords: gamma-ray bursts (grb), point spread function (psf), wide-field camera.
1 introduction

the pi of the sky experiment [2] is designed for continuous monitoring of a large fraction of the sky with high (in astronomical terms) time resolution (∼ 10 s). a real time analysis of the data stream, based on a multi-level triggering system, allows discoveries of grb optical counterparts independently of satellite experiments. additionally, the self-triggering capabilities allow the detection of other rapidly varying sources, such as novae and flare stars, or even as yet unclassified phenomena. this approach resulted in the autonomous detection of the "naked-eye" burst grb 080319b at its very beginning [3].

to meet the requirement for monitoring a large fraction of the sky, the pi of the sky apparatus makes use of cameras with a very wide field of view — about 20° × 20° each. however, for stars positioned far from the optical axis, this causes significant deformations of images, much bigger than in other astronomical experiments (as in the case of grb 080319b, see figure 1, left). therefore both profile and aperture astrometric and photometric algorithms, which assume a standard psf shape, introduce large uncertainties. to improve the brightness and coordinate measurements, an elaboration of the psf in the pi of the sky system is required. this requires high-precision profiles of the point sources in different parts of the frame, and a mathematical model that will describe them properly.

2 laboratory measurements and modelling

the derivation of a psf requires a star profile with sub-pixel resolution, which can be obtained from a superposition of single star images. however, the superposition of real images introduces big uncertainties due to the fact that the exact shape, and thus the centre, of the star image is unknown. additionally, stars are blurred due to mount vibrations when following the sky movement, etc. these factors show that ample data needs to be obtained in laboratory measurements.

the apparatus for the laboratory measurements consisted of an led diode (colour or white) placed behind a pinhole 0.1 mm in diameter, at a distance of 22 m from a ccd camera (1).

fig. 1: left: the second exposure of grb 080319b taken with the pi of the sky apparatus, during which the optical emission was strongest. the grb was recorded close to the corner of the frame, thus its shape is very deformed; right: the setup for laboratory measurements

(1) the standard pi of the sky camera consists of custom designed electronics with a readout noise of ∼ 16 e− at 2 mpixels/s, a commercial ∼ 2000 × 2000 pixel ccd and canon ef lenses with f = 85 mm and f/d = 1.2.

the pixel size of the ccd sensor was 15 × 15 μm (figure 1, right). this setup gives a geometrical spot size of the diode on the ccd sensor of less than 0.1 pixel — fulfilling the requirement of a point source.

2.1 intra-pixel measurements

the way in which the ccd sensor is designed causes the single-pixel sensitivity to light to be spatially non-uniform [4]. this is mainly due to the electrodes placed across the pixels and the channel stops separating the columns of the sensor. this setup, with an additional diaphragm 20 mm in radius on the lenses (reducing the psf size), enabled measurements of two functions describing the non-uniformity: the pixel response function (prf) and the pixel sensitivity function.

fig. 2: left: pixel response function; right: pixel sensitivity function for a red diode

prf describes a single pixel signal value vs. the position of the spot relative to the pixel edge.
a spot only partially contained in the pixel causes a sudden drop in the prf close to the pixel border (figure 2, left). however, the function is non-zero for a spot fully outside the pixel. this may be caused by the psf size, by diffraction of the spot, or by charge diffusion between the pixels. in the last case, the prf contributes to the whole shape of the psf.

the pixel sensitivity function is defined in a similar way to the prf, but the overall ccd signal value is taken instead of the single pixel signal. changes in pixel sensitivity are the main factor responsible for the signal changes caused by movement of the image across the ccd. with knowledge of the function and of the position of the centre of the source in the pixel, one can compensate for this effect, performing a more precise measurement of the brightness.

2.2 psf measurements and modelling

a high resolution profile for selected coordinates in the frame was obtained using multiple images of the diode. each exposure was taken for a specific position of the centre of the diode, and the full set of images covered 10 × 10 points inside a single pixel. all the images were superimposed, taking into account the coordinates of each image. sample profiles obtained for 0, 800 and 1400 pixels from the frame centre along the diagonal are shown in figure 3.

the spread of the image of the point source, the psf, is caused by deformations of the wavefront of the light passing through the optical system [1]. in the pi of the sky case, the psf should be described by a rayleigh-sommerfeld diffraction formula with aberrations, within a kirchhoff approximation. in the case of the wide-field lenses used in our project, it is not possible to use any of the paraxial approximations that lead to an fft, thus the formula has to be calculated directly. therefore, even though this approach should finally lead to satisfactory results, a faster model had to be developed. we use a set of modified zernike polynomials to describe the image directly. for each measured profile, the psf model is obtained by fitting polynomial coefficients to the data (figure 4). the general model is obtained by interpolating the coefficients determined for selected coordinates on the frame.

fig. 3: measured psfs 0, 800 and 1400 pixels from the centre of the frame, along the diagonal

fig. 4: psfs for 800 pixels from the frame centre. left: measured; right: polynomial model
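a minimal sketch of the coefficient fit of section 2.2 is given below (python/numpy). the basis here uses a few low-order polynomial terms as a stand-in for the modified zernike set, whose exact form is not given in the paper, and psf_data is a hypothetical measured high-resolution profile; the point is only the linear least-squares structure of the fit.

import numpy as np

def fit_psf(psf_data):
    ny, nx = psf_data.shape
    y, x = np.mgrid[-1:1:ny * 1j, -1:1:nx * 1j]      # normalized coordinates
    r2 = x ** 2 + y ** 2
    # stand-in basis: piston, tilts, defocus-like and astigmatism-like terms
    basis = np.stack([np.ones_like(x), x, y, r2, x * y, x ** 2 - y ** 2])
    a = basis.reshape(len(basis), -1).T              # design matrix
    coeffs, *_ = np.linalg.lstsq(a, psf_data.ravel(), rcond=None)
    model = np.tensordot(coeffs, basis, axes=1)      # fitted psf model
    return coeffs, model

# toy profile standing in for a measured one
grid = np.mgrid[-1:1:64j, -1:1:64j]
psf_data = np.exp(-(grid ** 2).sum(axis=0) / 0.1)
coeffs, model = fit_psf(psf_data)

interpolating such coefficient sets between the measured frame positions then gives the position-dependent psf model described above.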
3 model applications

the polynomial model that is obtained can be used for multiple purposes. the most straightforward use is for photometry and astrometry of stars, grbs and other objects of interest. as for the photometry, when testing on selected data samples the model did not prove better in determining the brightness of stars than the asas aperture photometry used in our experiment. this is probably due to the real fluctuations of the star signal, caused by miscellaneous factors in the experiment that dominate the measurement. however, determining the position of the stars on the frame is up to a factor of 2.5 more precise with the polynomial psf model (figure 5, left).

fig. 5: left: comparison of the position determination accuracy from polynomial model profile astrometry and from asas aperture astrometry, for stars brighter than 9m; right: the signal resulting from the psf model fit at the "naked-eye" burst position prior to the explosion (bars) and the extracted precursor limits (lines) (data from one camera)

another application of the model is in a dedicated search for a signal at specific coordinates, such as the search for an optical precursor to the "naked-eye" grb 080319b [5]. pi of the sky started observing the position of the burst more than 19 minutes before the trigger, thus providing data well suited for this task. no signal exceeding 3σ, coinciding on two cameras, was found when fitting the modelled psf to the frames preceding the burst, and the limiting magnitude was improved from 11.5m (resulting from standard photometry) to 12.25m (figure 5, right).

additionally, a feature-rich simulator of the pi of the sky frame was created, utilizing the modelled detector response. the photometry results for real and simulated data are in very good agreement (figure 6, left). this can be taken as proof of good simulation quality. to obtain photometric uncertainties similar to the real ones, fluctuations such as photon statistics, gain fluctuations, readout noise, position fluctuations and star turbulence had to be taken into account. the simulation results show that the photometry quality for stars fainter than 8.5m is most probably limited by gain fluctuations, and not by readout noise, as had previously been anticipated (figure 6, right).

fig. 6: left: polynomial photometry results for simulated and real sky images; right: relative error of the photometry performed on the simulated data, as a function of the star brightness (points) and the estimated contribution from different sources of variability (lines)

4 summary

due to the very large field of view, the psf in the pi of the sky detector is highly deformed and is not described by general models. on the basis of dedicated laboratory measurements, we have created an effective model describing the dependence of the psf on the position of the star in the frame. these results may help in further understanding and developing the pi of the sky detector and other future wide-field experiments. so far we have successfully introduced the model into a photometric algorithm, a dedicated signal search and astrometric algorithms, the last of which has given very promising results. additionally, a detailed simulator of the pi of the sky frame has been created.

acknowledgement

this work describes a research project supported by the polish ministry of science and higher education in 2009–2011.

references

[1] rerabek, m., pata, p.: the space variant psf for deconvolution of wide-field astronomical images. proceedings of spie — 7015 — adaptive optics systems. washington: spie, 2008, p. 70152g-1–70152g-12.
[2] siudek, m., et al.: pi of the sky telescopes in spain and chile, these proceedings.
[3] racusin, j. l., et al.: broadband observations of the naked-eye γ-ray burst grb 080319b. nature 455, 183–188, 2008.
[4] toyozumi, h., ashley, m. c. b.: intra-pixel sensitivity variation and charge transfer inefficiency — results of ccd scans. publications of the astronomical society of australia, 2005, 22, 257–266.
[5] paczynski, b.: optical flashes preceding grbs, astro-ph/0108522, 2001.

lech wiktor piotrowski, aleksander filip żarnecki
faculty of physics, university of warsaw, hoża 69, 00-681 warsaw

cosmos, time and creation

remarks to the philosophical, theological and physical conceptions of creation

p. zamarovský

abstract

the concept of the beginning of cosmos appears to be problematic. not only ancient theological, but also present-day physical approaches evoke many questions.
they originate in the definition of time, its dimensionality and its scale. if we accept the standard model, all physical processes, including the processes utilised in clocks (chronometric processes), lose their theoretical basis in the vicinity of the initial singularity. the singularity is hidden behind a horizon. does this mean that the singularity did not exist?

keywords: big bang, creation, cosmos, time scale, imaginary time, horizon, singularity.

like the tao, the indefinite that can be named is not the true indefinite.

1 ancient theological and philosophical approaches

every mortal has been born; every particular thing emerged at a certain moment in the past. however, can the concept of the beginning also be applied to the whole universe? and can it be applied to time itself? among the variety of theological and scientific theses that have appeared during history, we can find both positive and negative answers. the weightiest argument is perhaps that "nothing can emerge from nothing". this thesis comes from experience with the "ordinary world" and can be understood as an intuitively and vaguely formulated conservation law. creation and annihilation would therefore be impossible. from this rule, many concepts of an eternal (and usually also static) universe have been developed, e.g. the models of aristotle, of the epicureans, and also most of the modern concepts valid until the middle of the 20th century.

cosmological (rather cosmogonical) conceptions point to a fundamental problem: the definition of the very concepts of "beginning", "genesis", "creation" or "emergence". what do these words mean? i want to stress that they denote not only "coming out from pure nothingness", but also "rising up from something different by adding some new quality". the addition of a new quality means formation, ordering, organising — for example, the condensation of a new phase in some physical system, or some (re)structuring. the greek term for order was cosmos. from the time of the pythagoreans this term has also labelled the wholeness, the universe, the world. the reason was that the pythagoreans recognised and realised the order which governs the whole. (however, the idea of mathematical order in the universe is much older. it was common already in ancient egypt, in the old kingdom.) so the emergence of cosmos can mean the ordering of the initial state — chaos. such conceptions were widely spread during antiquity. we can find them, e.g., in egyptian and greek mythology, and also in many philosophical approaches. for example, according to anaxagoras the initial chaos was ordered by a supernatural agent called nous. and, according to plato, cosmos was arranged from the original chaos by the ordering activity of the "god" demiurgos ("skilled worker") [1]. the stoics believed in the rise of cosmos from fire, i.e. from an unorganised hot state of matter. fire was part of the whole universal cosmic cycle — ekpyrosis.

fig. 1: first words of genesis (in old hebrew)

1.1 creation and the book of genesis

did the christian god also create the world by ordering? what does the holy scripture describe? the traditional interpretations say that god created the universe from nothing. however, in the book of genesis it is written: in the beginning god created the heaven and the earth. . . [2] so, it is not stated that god created the wholeness, the universe itself. (however, old hebrew had no terms for abstractions.) and it is not even explicitly stated that creation was made from pure nothingness.
the creative activities of the christian god and of plato's demiurgos need not be principally in contradiction.

1.2 approaches of augustine and thomas aquinas

non est mundus factus in tempore, sed cum tempore. (the world was not made in time, but together with time.)
augustinus aurelius

in the 5th century, st. augustine formulated the idea that god created time together with the world. however, this assertion could appear paradoxical even from his own viewpoint. "creation" is, as a rule, regarded as an activity taking place in time; therefore it cannot be related to time itself. (it was augustine himself who stressed that, in all our expressions, "time" or temporal information is implicitly hidden in the grammar, and this "grammatical time" could contradict explicitly formulated assertions about time. "creation of time" is a self-referring, self-contradictory concept which, from a logical viewpoint, must be eliminated.) how can this paradox be solved? one possible answer can be found in the summa theologica by st. thomas aquinas (13th century) [4]. thomas distinguished between creation understood philosophically (metaphysically), with no reference to (ordinarily understood) temporality, and creation understood theologically (and also ordinarily, physically). he concluded that god created the world in the philosophical meaning of the term; creation is not any change, no ordering, nor an act in time. it represents continual activity (continual in ordinary time), a process causing the existence of whatever is. (so "god" is an answer to the question: "why is there something and not nothing?") creation and eternality are thus not in contradiction; god could create even an eternal world if he wanted to do so. (nevertheless, thomas himself believed in a temporally limited world.) [3]

can these ideas be reflected in the framework of present-day science? thomas' conceptions can easily be described by two-dimensional time. one coordinate is ordinary (physical, or "theological" in thomas's terminology) time. in this dimension all physical and human processes take place, and in this dimension the big bang also took place (as a singularity in the standard cosmology). the other, perpendicular coordinate represents "divine", metaphysical or imaginary ("philosophical") time, in which god created our cosmos with its whole history. i am aware that the introduction of a "divine quality", i.e., a quality without a clear relation to observable physical entities, may be unacceptable to most scientists. on the other hand, complex, i.e. two-dimensional, time with one real and one imaginary ("divine"?) coordinate has also been introduced by modern cosmology [5]. so imaginary coordinates not only solve (or rather describe) the "mystery of creation", but also help present-day cosmologies to understand the mystery of the singularity, the big bang. in this way modern cosmology returns to forgotten ancient conceptions. however, there remain many doubts: does "complex time" play any role in "ordinary" physics, or is it only a "deus ex machina", rescuing physicists from the troubles of the singularity?

2 physics of time

the term "time" has many meanings, but we will limit ourselves to "physical time" here. time has qualitative and quantitative aspects. its (local) quantitative aspect is represented by its scale. the theoretical and practical realization of the time scale is an issue for physics, because scales are determined by a physical realization of time units. however, the choice of a "proper" scale belongs rather to metaphysics.
in early times people used astronomical scales based on the rotation of the earth and its motion around the sun. a range of sundials were constructed. afterwards, clocks were developed, utilising various chronometric processes: the motion of sand or water in hourglasses and clepsydras, the oscillation of a balance wheel, pendulums, the oscillation of the photons emitted by an energy jump in a specific atom (atomic clocks), etc. there was an effort to use scales which are mutually interconnected ("solidary scales"). when a new, "better" clock (and a "better" scale) was introduced, it had to offer a better approximation to some "ideal scale", an "ideal time". however, the "ideal scale" itself is unattainable in practice. it is something like a platonic idea. so the practical criterion has been only the "mutual solidarity" of clocks: all clocks have to exhibit the same time, or their individual times have to be linearly dependent. continual improvements of clocks (scales) enable slighter and slighter physical phenomena to be measured, including the irregularities of the phenomena (chronometric processes) that had made an earlier clock "worse" — the non-uniformity of the rotation of the earth, the non-homogeneity of the gravitational field, etc. the mutual solidarity of clocks has led to the practical definition of the time scale used to this day: "official time" is defined as the weighted average of the times of certain representative precise clocks (now atomic clocks).

temporal scales are in practice realized by clocks (supplemented by calendars). a clock comprises some physical chronometric process and a measuring instrument for the characteristics of that process. the chronometric process is usually a periodic process, the measuring instrument being a counter of periods (i.e. a memory) and sometimes also a tool for measuring the phase within the period (with some important exceptions, of course, e.g. measuring elapsed time by the decay of radioactive nuclei — the c14 method, geological methods, measuring the age of the universe by its expansion, etc.). the realization of the chronometric process requires the validity of the whole chain of physical laws, the most general being perhaps the energy conservation law. however, the dependence is mutual. we can say that only a certain class of (linearly dependent) time scales guarantees the conservation of energy (and other physical laws).

an interesting attempt to solve cosmological problems by introducing a slightly varied time scale was made in the mid 20th century by e. a. milne. in addition to the ordinary "atomic time" scale, milne introduced another, "cosmological" time scale, which was (very slightly) nonlinearly dependent on the atomic scale. the divergences between these scales were not observable in practice, but the effect was such that the age of the universe in "cosmological" time was infinite [6].

3 metaphysics of the time scale

as i have mentioned, the choice of a chronometric process and of the (physical) law describing this process is principally arbitrary. as has been stressed by henri poincaré, the only criterion for a "reasonable choice" is the demand for simplicity of physical laws (i.e. also simplicity of the description of most processes around us [7]). nevertheless, there arises a fundamental problem: how can we define a time scale in situations when no clocks are available? naturally, we could extrapolate the validity of our physical laws, i.e. "extrapolate the existence of clocks".
most of us believe that our physics describes all natural (physical) phenomena, not only here and now. (this assumption has been an extremely fruitful epistemological tool throughout the history of science. this belief is also supported by astronomers, who observe events at great spatial and temporal distances.) so, where is the problem? i consider there could be serious trouble with the extrapolation of time — i.e. the scale of time — to the very beginning, to the big bang, or to creation, if you want. any extrapolation is a risky business when we extrapolate to a quite unknown situation. this is the situation of the very beginning of our cosmos. it seems evident that there were no rigid bodies, no oscillators based on them (pendulums, crystals, etc.), there were no bodies bound by simple gravitational interaction, there were no atoms with electron shells, etc. there was nothing from which "reasonable clocks" could be constructed. the conditions were so extraordinary that they cannot be described even by (present-day) physics. (was it still natural, or was it a supernatural state of affairs?) and, even worse, the most general framework for all chronometric processes — the energy conservation law — was also deconstructed here. (see the problems with the definition of energy in general relativity, or in theories concerning the higgs field and "false vacua".)

if we accept the standard cosmological model, we have no way to extrapolate our time scale to the very beginning, to the starting point, to creation. our physics (and the whole scientific approach) is confined by a horizon which can be approached but never reached or even exceeded. we can introduce various time scales, but no time scale can be "physically" extrapolated across the horizon. time is represented by an open set, a set without a first point. a similar situation concerns the spatial scale. (the definition of the metre is interconnected with the definition of the second through a constant, the speed of light.) so the deconstruction refers to the whole concept of space-time. the horizon is an epistemological horizon, a horizon of physics itself. it separates the (principally) known from the (principally) unknown, physics from metaphysics, cosmos from chaos, the natural from the supernatural. does the horizon also represent an ontological boundary?

4 conclusion

my short theorization implies that the very concept of a beginning of the universe (in the framework of the standard model) lacks a physical foundation. it is also the reason why we cannot extrapolate our temporal scale to the very beginning. the concept of the age of the universe therefore also remains unspecified.

references

[1] plato: timaeus and critias. czech edition timaios a kritias, oikoymenh, 2nd edition, praha, 1996.
[2] king james bible, old testament, the first book of moses, called genesis.
[3] carroll, w. e.: time and creation: thomas aquinas and contemporary cosmology. a paper for the international conference 1609–2009 from galilei's telescope to evolutionary cosmology, pontifical lateran university, vatican city, november 29th–december 2nd, 2009 (to be published).
[4] aquinas, t.: summa theologica i, 46, article 3. http://www.ccel.org/ccel/aquinas/summa.html
[5] hawking, s.: a brief history of time: from the big bang to black holes. bantam books inc., new york, 1988.
[6] milne, e. a.: kinematic relativity. clarendon press, oxford, 1948.
[7] poincaré, h.: la mesure du temps. revue métaphysique et morale, 6, 1898, p. 1–13.

about the author

peter zamarovský was born in prague, czech republic.
he studied physics, and now deals with the philosophical aspects of physics. he teaches philosophy at the czech technical university in prague.

rndr. peter zamarovský, csc.
phone: +420 731 512 121
e-mail: peter.zamarovsky@fel.cvut.cz
dept. of economics, management and humanities, czech technical university in prague, technická 2, 166 27 praha, czech republic

production of biodiesel from vegetable oil using microwave irradiation

n. kapilan

abstract

the petroleum oil supply crisis, the increase in demand and the price eruption have led to a search for an alternative fuel of bio-origin in india. among the alternative fuels, biodiesel is considered as a sustainable renewable alternative fuel to fossil diesel. non-edible jatropha oil has considerable potential for the production of biodiesel in india. the production of biodiesel from jatropha oil using a conventional heating method takes more than 1 h. in this work, microwave irradiation has been used as a source of heat for the transesterification reaction. a domestic microwave oven was modified and used for microwave heating of the reactants. the time taken for biodiesel production using microwave irradiation was 1 min. the fuel property analysis shows that the properties of jatropha oil biodiesel satisfy the biodiesel standards, and are close to the fossil diesel standards. from this work, it is concluded that biodiesel can be produced from vegetable oil using microwave irradiation, with a significant reduction in production time.

keywords: transesterification, jatropha oil, microwave irradiation, biodiesel property.

1 biodiesel production

1.1 introduction

the demand for fossil fuel in the world has been increasing due to industrialization and the increase in the number of automobiles. according to an estimation, world oil resources are judged to be sufficient to meet the projected growth in demand until 2030 [1]. the analysis by shahriar and erkan [2] indicates that the fossil fuel reserves depletion time for crude petroleum oil is approximately 35 years. india is the sixth largest energy consumer, and one of the fastest growing energy consumers in the world, with a rapidly growing economy, a rising population, and an expanding number of middle-class consumers [3]. the government of india is seriously concerned about the fluctuations in global oil prices and the consequent expenditure on oil imports. india's bio-fuel policy now aims to find ways to limit rising oil imports by promoting the use of bio-fuels as a renewable alternative source for fossil fuels [4]. in recent times there has been greater awareness of biofuels, in particular biodiesel, and significant attention has been given to the production of biofuels in order to improve the rural economy.

jatropha curcas belongs to the euphorbiaceae family, and is a small tree or bush with spreading branches. it grows in a number of climatic zones in tropical and sub-tropical regions of the world, and can be grown in areas of low rainfall (250 mm per year minimum, 900–1200 mm optimal) and in drought-prone areas. it has large green to pale-green leaves, alternate to sub-opposite, three-to-five-lobed with a spiral phyllotaxis, and can attain a height of up to eight or ten meters under favourable conditions. it starts yielding from the second year onwards and continues for 40 years. generally, fruits are produced in winter when the shrub is leafless. each inflorescence yields a bunch of approximately 10 or more ovoid fruits.
three bi-valved cocci are formed after the seeds mature and the fleshy exocarp dries [5–8]. figure 1 shows the jatropha and its seeds. the byproduct of oil extraction is seed cake, which contains 24 to 28 % protein on a dry basis. the seed cake protein showed the most promising results for adhesive and emulsifying properties, which increases the value of the jatropha. the major fatty acids present in jatropha oil are oleic acid, linoleic acid, and palmitic acid. jatropha is found in tropical america, africa and asia [9, 10]. in a combustion study of jatropha oil, it is reported that the fatty acid burned in the first step, and glycerol burned in the second step [11].

fig. 1: a) jatropha, b) jatropha seeds

masami et al. [12] suggested that a domestic biofuel program requires sustained government support, and offered possible justifications for government interventions in the production of biofuel. india is a developing country, and imports more than 70 % of its crude oil from other countries to meet the local energy demand [13]. in india's bio-fuel policy, jatropha curcas has been identified as the most suitable plant for biodiesel production. the government of india encourages the planting of jatropha on waste and degraded lands, with the aim of rural development and wasteland reclamation [14]. hence, in this work, jatropha oil was used as the raw material for biodiesel production.

demirbas [15] reported that vegetable oil, biodiesel, fischer-tropsch fuel and dimethyl ether can be used as substitutes for diesel in diesel engines. however, the direct use of vegetable oil as a substitute for diesel results in higher co, hc and smoke emissions and in a lower thermal efficiency of the diesel engine [16]. there are several methods for using vegetable oil as a substitute in diesel engines. one of the most widely used methods is to reduce the viscosity of the vegetable oil by transesterification [17, 18]. the conventional heating method used in transesterification needs more than one hour of reaction time to produce a biodiesel yield of more than 90 % [19, 20]. two-step transesterification is a way of producing biodiesel from crude vegetable oil [21, 22]. xin et al. [23] used an ultrasonic reactor for the production of biodiesel from jatropha oil, using two-step transesterification.

microwave irradiation has been used for a variety of applications, including organic synthesis. in recent years, microwave irradiation has been used for the production of biodiesel. yaakob et al. [24] used single-step, base catalyst transesterification for the production of biodiesel from jatropha oil. however, this method required a longer reaction time of 7 min and a higher catalyst concentration of 4 %. in this work, a feasibility study has therefore been carried out to produce biodiesel from jatropha oil using two-step transesterification.

1.2 materials

jatropha oil was collected from the university of agricultural sciences, bangalore, india. this oil was filtered and used for the production of biodiesel. an alkaline catalyst can be used as an excellent microwave irradiation absorber; in this work, sodium hydroxide was used as the catalyst. in comparison with other alcohols, methanol is cheaper and has better physical and chemical properties (it is polar and the shortest-chain alcohol), and it was used as a reactant. sodium hydroxide, methanol and sulphuric acid were purchased from merck company, india. all the chemicals used for transesterification were of analytical reagent grade.
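to make the reagent proportions of the two-step procedure described next concrete, here is a minimal sketch (python); the ratios are those cited in section 1.2.1 (0.60 w/w methanol and 1 % w/w h2so4 in the first step, 0.24 w/w methanol and 1.4 % w/w naoh in the second), while the 100 g oil batch is an arbitrary illustrative choice.

def reagents(oil_g):
    step1 = {'methanol_g': 0.60 * oil_g,    # 0.60 w/w methanol-to-oil
             'h2so4_g': 0.01 * oil_g}       # 1 % w/w acid catalyst
    step2 = {'methanol_g': 0.24 * oil_g,    # 0.24 w/w methanol-to-oil
             'naoh_g': 0.014 * oil_g}       # 1.4 % w/w base catalyst
    return step1, step2

step1, step2 = reagents(100.0)
# step1: 60 g methanol, 1 g h2so4; step2: 24 g methanol, 1.4 g naoh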
1.2.1 biodiesel production

jatropha oil has an initial acid value of 9 mg koh per g of oil, and two-step transesterification was therefore used. in the first step, acid-catalyzed transesterification was used to reduce the acid value from 9 to 1 mg koh per g of oil, based on the conditions (0.60 w/w methanol-to-oil ratio and 1 % w/w h2so4) recommended for the conventional method reported in the literature [22], at a reaction time of 1 min. the experimental setup used in the first step consists of a 250 ml round bottom flask, equipped with a reflux condenser, a mechanical stirrer and a stopper. the scheme is shown in figure 2. a domestic microwave oven was used for the microwave irradiation.

fig. 2: microwave-assisted biodiesel production unit

pre-treated jatropha oil was used in the base-catalyzed second-step transesterification. in the second step, transesterification was carried out at a methanol-to-oil ratio of 0.24 w/w and 1.4 % w/w naoh [22], with a reaction time of 1 min and a power supply of 100 w. after the reaction, the excess methanol was removed by vacuum distillation, and the transesterification products were then poured into a separating funnel for phase separation. after phase separation, the top layer (biodiesel) was separated and washed with distilled water in order to remove the impurities. the biodiesel was then heated above 100 °c to remove the moisture. the sample was then sent for nmr analysis to determine the biodiesel yield. the biodiesel sample was dissolved in chloroform-d (cdcl3) and the proton nmr spectra were obtained using a bruker amx 400 spectrometer operating at 300 mhz.

experimental errors can arise from the instruments that are used. uncertainty analysis is therefore needed to prove the accuracy of the experiments. in the present work, uncertainty analysis was performed as per reference [27], and the overall uncertainty was found to be ±1.5 %.

1.3 determination of fuel properties

the flash point of the biodiesel and diesel was determined as per the astm d 93 method. the pour point determinations were made using the astm d 97 method. the kinematic viscosity was determined using a viscometer, according to the astm d 7042 method. the density of the fuel was determined using a relative density bottle. the calorific value of the fuel was estimated by a bomb calorimeter. the ash content was determined using an electric muffle furnace, and a copper corrosion test was carried out by the copper strip tarnish test, following the astm d 130 method. the water content was determined following the astm d 95 method. the acid value of the biodiesel was measured following the astm d 974 method.

1.4 results and discussion

biodiesel was produced from jatropha oil by microwave irradiation, and the biodiesel yield was determined using the nmr spectrum. in the nmr spectra, triglyceride (tg) protons on acyl groups resonate at 0.8–2.9 ppm, while the protons h-1, h-2 and h-3 appear downfield at 4.0–5.6 ppm. when one or two acyl groups migrate from the tgs, h-1, h-2 and h-3 shift toward a higher field. this is due to the loss of the high electron density of the acyl group. figure 3 shows the nmr spectrum of the biodiesel. compared to the nmr spectrum of jatropha oil, a large signal at 3.6 ppm was observed, which was assigned to the methyl protons of the esters. in addition, some new peaks appeared; this was attributed to the methanolysis products of mono- and di-glycerides. the conversion of raw oil to biodiesel was calculated on the basis of reference [26].
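the conversion estimate from the 1h nmr spectrum can be sketched as follows. the exact expression of reference [26] is not reproduced in the text, so the sketch below uses the knothe-type relation between the methyl ester singlet near 3.6 ppm and the alpha-methylene signal near 2.3 ppm, which is a common choice in the nmr literature; the peak areas are illustrative placeholders, not measured values.

```python
# a minimal sketch of an nmr-based conversion estimate, assuming the
# common relation c = 100 * 2*a_me / (3 * a_ch2); the expression
# actually used in reference [26] may differ.

def ester_conversion(a_me: float, a_ch2: float) -> float:
    """estimate conversion (%) from integrated 1h nmr peak areas.

    a_me  -- area of the methyl ester singlet (~3.6 ppm, 3 protons)
    a_ch2 -- area of the alpha-methylene signal (~2.3 ppm, 2 protons)
    """
    # the factors 2 and 3 normalize the two signals to protons per group
    return 100.0 * (2.0 * a_me) / (3.0 * a_ch2)

if __name__ == "__main__":
    # hypothetical integrals chosen so that the conversion is ~91 %,
    # matching the 0.91 g/g yield reported below
    print(f"conversion = {ester_conversion(1.365, 1.0):.1f} %")
```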
a high biodiesel yield was obtained, found to be 0.91 g/g of jatropha oil. the rapid biodiesel production by microwave irradiation is due to the direct absorption of the radiation by the polar group (oh group) of the reactant. it is speculated that the oh group is directly excited by the microwave radiation, so that the local temperature around the oh group is very much higher than that of its environment. hence, microwave-assisted transesterification is a way of reducing the reaction time, the electrical energy and the labour costs as compared to the conventional method. the power consumption in this method was 16.6 j/kg of jatropha oil.

fig. 3: nmr spectra of biodiesel

table 1 compares the properties of jatropha biodiesel with the properties of diesel. the flash point of the biodiesel satisfies the fuel standards and is higher than the flash point of diesel; this is an important safety consideration when handling and storing flammable materials. the important cold flow properties of biodiesel are the cloud point and the pour point. astm standard d 6751 gives no limit for the pour point, and only requires it to be reported. the calorific value is an important property of biodiesel that determines its suitability as an alternative to diesel. in the astm and bis standards, no limit is given for the calorific value. however, in the european standard en 14214, the approved calorific value for biodiesel is 35 mj per kg. the table shows that the calorific value of mob is close to that of diesel.

table 1: fuel properties

property | astm d6751 | biodiesel | diesel
flash point (°c) | > 130 | 132 | 68
pour point (°c) | – | 6 | −15
calorific value (mj/kg) | – | 37.95 | 42.71
viscosity at 40 °c (mm2/s) | 1.9–6 | 4.21 | 2.28
density at 15 °c (kg/m3) | – | 889 | 846
water content (mg/kg) | < 500 | 129 | 102
acid number (mg koh/g) | < 0.50 | 0.42 | 0.34
copper strip corrosion | no. 3 max | 1 | 1
ash content (%) | < 0.02 | 0.01 | 0.01

according to the astm standards, the acceptable viscosity range for biodiesel is 1.9–6.0 mm2/s at 40 °c, and mob satisfies the biodiesel standard. the density of mob is close to that of diesel and satisfies the bis standard. the astm and bis biodiesel standards approve a maximum acid value for biodiesel of 0.5 mg koh/g, and this was fulfilled by mob. the degree of tarnish on the corroded copper strip correlates with the overall corrosiveness of the fuel sample; the copper strip corrosion property of mob is within the specifications of astm and bis. another important factor for biodiesel is the ash content. the ash content of mob satisfies both the astm standard and the bis standard.
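as a worked companion to table 1, a short script can check each measured property against its standard limit. the limits dictionary below is my own encoding of the astm d6751 column of table 1, not an authoritative copy of the standard.

```python
# a minimal compliance check against the astm d6751 limits as read
# from table 1; the limit encoding is an assumption of mine.

LIMITS = {
    "flash point (degC)":          lambda v: v > 130,
    "viscosity at 40degC (mm2/s)": lambda v: 1.9 <= v <= 6.0,
    "water content (mg/kg)":       lambda v: v < 500,
    "acid number (mg KOH/g)":      lambda v: v < 0.50,
    "copper strip corrosion":      lambda v: v <= 3,   # no. 3 max
    "ash content (%)":             lambda v: v < 0.02,
}

measured = {  # jatropha biodiesel values from table 1
    "flash point (degC)": 132,
    "viscosity at 40degC (mm2/s)": 4.21,
    "water content (mg/kg)": 129,
    "acid number (mg KOH/g)": 0.42,
    "copper strip corrosion": 1,
    "ash content (%)": 0.01,
}

for prop, within_limit in LIMITS.items():
    status = "pass" if within_limit(measured[prop]) else "fail"
    print(f"{prop:32s} {measured[prop]!s:>8}  {status}")
```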
2 conclusions

in this work, biodiesel was produced from jatropha oil using microwave irradiation with the help of two-step transesterification. it was observed that microwave irradiation helps the synthesis of methyl esters (biodiesel) from non-edible oil, and a high biodiesel conversion can be obtained within a few minutes, whereas the conventional heating process takes more than 60 minutes. the biodiesel produced by microwave irradiation satisfies the astm and bis biodiesel standards.

acknowledgement

the author wishes to thank visvesvaraya technological university, belgaum, for financially supporting this work (grant no. vtu/aca./2009-10/a9/11583, 2009).

references

[1] chedid, r., kobrosly, m., ghajar, r.: a supply model for crude oil and natural gas in the middle east, energy policy, 35, 2007, 2096–2109.

[2] shahriar, s., topal, e.: when will fossil fuel reserves be diminished? energy policy, 37, 2009, 181–189.

[3] international energy outlook 2008 report, http://www.eia.doe.gov/oiaf/ieo, http://www.eia.doe.gov/emeu/cabs/india/oil.htm (browsed on june 15, 2010).

[4] report of the committee on the development of bio-fuel, planning commission, government of india, 2003.

[5] http://en.wikipedia.org/wiki/jatropha (browsed on june 15, 2010).

[6] http://www.svlele.com/fattyacids.htm (browsed on june 15, 2010).

[7] http://www.pia.gov.ph (browsed on june 15, 2010).

[8] http://www.jatrophabiodiesel.org/ (browsed on june 15, 2010).

[9] lestari, d., mulder, w. j., sanders, j. p. m.: jatropha seed protein functional properties for technical applications, biochemical engineering journal, 53, 3, 2011, 297–304.

[10] hamarneh, a. i., heeres, h. j., broekhuis, a. a., picchioni, f.: extraction of jatropha curcas proteins and application in polyketone-based wood adhesives, international journal of adhesion and adhesives, 30, 7, 2010, 615–625.

[11] wardana, i. n. g.: combustion characteristics of jatropha oil droplets at various oil temperatures, fuel, 89, 3, 2010, 659–664.

[12] kojima, m., johnson, t.: biofuels for transport in developing countries: socioeconomic considerations, energy for sustainable development, 10, 2, 2006.

[13] subramanian, k. a., singal, s. k., saxena, m., singhal, s.: utilization of liquid biofuels in automotive diesel engines: an indian perspective, biomass and bioenergy, 29, 2005, 65–72.

[14] shanker, a., fallot, a.: global environmental facility — scientific and technical advisory panel workshop on liquid biofuels: main findings, energy for sustainable development, 10, 2, 2006, 19–25.

[15] demirbas, a.: present and future transportation fuels, energy sources, part a, 30, 2008, 1473–1483.

[16] devan, p. k., mahalakshmi, n. v.: performance, emission and combustion characteristics of poon oil and its diesel blends in a di diesel engine, fuel, 88, 5, 2009, 861–867.

[17] marchetti, j. m., miguel, v. u., errazu, a. f.: possible methods for biodiesel production, renewable and sustainable energy reviews, 11, 2007, 1300–1311.

[18] ilkılıç, c., aydın, s., behcet, r., aydin, h.: biodiesel from safflower oil and its application in a diesel engine, fuel processing technology, 92, 3, 2011, 356–362.

[19] jain, s., sharma, m. p.: biodiesel production from jatropha curcas oil, renewable and sustainable energy reviews, 14, 9, 2010, 3140–3147.

[20] koh, m. y., ghazi, t. i. m.: a review of biodiesel production from jatropha curcas l. oil, renewable and sustainable energy reviews, 15, 5, 2011, 2240–2251.

[21] balat, m.: production of biodiesel from vegetable oils: a survey, energy sources, part a: recovery, utilization, and environmental effects, 29, 2010, 895–913.

[22] berchmans, h. j., hirata, s.: biodiesel production from crude jatropha curcas l. seed oil with a high content of free fatty acids, bioresource technology, 99, 6, 2008, 1716–1721.

[23] deng, x., fang, z., liu, y.: ultrasonic transesterification of jatropha curcas l. oil to biodiesel by a two-step process, energy conversion and management, 51, 12, 2010, 2802–2807.

[24] yaakob, z., sukarman, i. s., kamarudin, s. k., abdullah, s. r. s., mohamed, f.: production of biodiesel from jatropha curcas by microwave irradiation, 2nd wseas/iasme international conference on renewable energy sources (res'08), corfu, greece, october 26–28, 2008.

[25] chauhan, b. s., kumar, n., jun, y. d., lee, k. b.: performance and emission study of preheated jatropha oil on medium capacity diesel engine, energy, 35, 6, 2010, 2484–2492.
[26] jin, f., kawasaki, k., kishida, h., tohji, k., moriya, t., enomoto, h.: nmr spectroscopic study on methanolysis reaction of vegetable oil, fuel, 86, 2007, 1201–1207.

[27] holman, j. p.: experimental methods for engineers. mcgraw hill, usa, 2006.

natesan kapilan
e-mail: kapil krecmech@yahoo.com
department of mechanical engineering
nagarjuna college of engineering and technology
devanahalli, bangalore – 562 110, india

capillary discharge parameter assessment for x-ray laser pumping

j. hübner

abstract

this paper determines the optimum capillary discharge characteristics for reaching the maximum emission gain at the wavelength λ = 18.2 nm, corresponding to the balmer-α transition of h-like carbon. the computer modelling of the capillary discharge evolution is carried out using the npinch programme, which implements a one-dimensional physical model based on mhd equations. the information about the capillary discharge evolution is then processed in the fly, flypaper and flyspec programmes, enabling the populations on specific levels to be modelled during the capillary discharge.

keywords: capillary discharge, x-ray laser, mhd (magnetohydrodynamic) simulation, population inversion, gain optimization, carbon.

1 introduction

the non-stationary plasma of a fast capillary electrical discharge was studied as a potential active medium for a soft x-ray laser. the aim was to achieve an optimum set of characteristics for lasing at the 18.2 nm wavelength of h-like carbon c5+ in an alumina capillary; optimum here means reaching a maximum gain value for each set of parameters. the capillary pinch dynamics is determined by several parameters: the capillary geometry (radius and capillary length), the substance of the capillary, the initial filling pressure, and the electric current time dependence. this dependence, in particular, is given by the electric circuit which is joined to the capillary.

a capillary discharge z-pinch acting as a medium for a soft x-ray laser uses ase, the "amplified spontaneous emission" effect, and electron-collisional recombination pumping. the main variable for ase is the gain, and the most important goal of this paper is to find the specific parameters of a capillary that maximize the gain. electron-collisional recombination, sometimes referred to as "three body" recombination, is the inverse process to electron-collisional ionization from excited levels. the combined recombination and radiative downward cascading process is illustrated by the equation

x^{(i+1)+}_0 + 2e → x^{i+}_n + e → x^{i+}_u + e   (1)

it is assumed that the ions exist plentifully in the initial state, which is one stage of ionization higher than that of the lasing ions. note that this is not a self-contained population-replenishment scheme; replenishment depends upon further reionization, which is provided in our case by the release of the next current pulse. it was assumed that ablation from the inner walls has only an imperceptible effect, and so it was not included in the calculations. the capillary discharge plasma quantities were calculated by means of the npinch [2] code under a one-dimensional, two-temperature, one-fluid mhd approximation. the output data was then processed by means of the fly code, which enables the creation of a history file of the populations on the individual levels along the capillary axis.
on the basis of the population histories, the gain and the gain history can be calculated.

2 optimization

the optimization itself involves searching for the maximum in the four-dimensional space of the parameters. two parameters relate to the current-pulse shape, the third parameter is the initial density (or pressure), and the fourth is the radius of the capillary. the right choice of the range of parameters is of fundamental importance. the results of our study should be instrumental for the experimenters in the capex-u [4] project; the range of parameters therefore has to be related to their experimental equipment, with a view to verifying the results. consequently, the choice of the first parameter was obvious, because an alumina capillary of radius r = 1.6 mm is used in capex-u. the choice of the current pulse had to coincide with the source in capex-u. the electric current in the capillary was fitted from measurements by the function

i(t) = i0 sin(πt/(2 t0)) exp(−t/t1)   (2)

the chosen parameters are therefore associated with equation (2): the current peak value imax and di/dt|t=0. two cases were calculated, imax ≈ 30 ka and imax ≈ 60 ka; for di/dt|t=0, values in the range 0.5·10^12–2.5·10^12 a/s were chosen.

table 1: parameters of the pulse shape for imax = 30 ka

di/dt|t=0 (a/s) | imax (ka) | tmax (ns) | i0 (ka) | t0 (ns) | t1 (ns)
0.750·10^12 | 30.023 | 67.6 | 33.900 | 71 | 570
0.923·10^12 | 30.031 | 54 | 33.082 | 56.3 | 570
1.000·10^12 | 30.003 | 49.5 | 32.786 | 51.5 | 570
1.250·10^12 | 30.049 | 39.4 | 32.229 | 40.5 | 570
1.500·10^12 | 30.015 | 32.5 | 31.799 | 33.3 | 570

table 2: parameters of the pulse shape for imax = 60 ka

di/dt|t=0 (a/s) | imax (ka) | tmax (ns) | i0 (ka) | t0 (ns) | t1 (ns)
0.923·10^12 | 60.056 | 115.15 | 74.213 | 126.3 | 570
1.000·10^12 | 60.029 | 105.05 | 72.765 | 114.3 | 570
1.250·10^12 | 60.006 | 82.020 | 69.630 | 87.5 | 570
1.500·10^12 | 60.044 | 67.878 | 67.8 | 71 | 570
1.750·10^12 | 60.027 | 57.57 | 66.51 | 59.7 | 570
2.000·10^12 | 60.002 | 50 | 65.27 | 51.5 | 570
2.250·10^12 | 60.005 | 43.63 | 64.88 | 45.3 | 570
2.500·10^12 | 60.095 | 39.59 | 64.45 | 40.5 | 570

the last parameter is the initial density n. according to the theoretical analysis, in order to achieve the gain maximum, the maximum electron density for carbon has to be ne ≈ 3·10^20 cm^-3. to accomplish this requirement, we chose the range n = 0.5·10^17–5·10^17 cm^-3.

3 gain evaluation

the selected parameters for each case served as input data for npinch. a very useful output of the calculation was the dens.dat file, which contained the time history of the electron and ion density and of the electron temperature. this file further served as input for the fly and flypaper codes, by means of which we obtained the time histories of the populations n3 on level n = 3 and n2 on level n = 2 of the h-like carbon ions. the inversion function f [3] is given by equation (3), where the ratio of statistical weights gu/gl is 9/4 for our transition, and nl, nu are the lower and upper laser state population densities:

f = 1 − 2.25 (nl/nu)   (3)

the stimulated emission cross-section, with the corresponding coefficients for our transition, λul = 18.2178·10^-7 cm and aul = 5.72·10^10 s^-1, is

σstim = λ³ aul / (8πc √(2π) (δλ/λ)) = (1.62·10^4 · (18.2178·10^-7)³ · 5.72·10^10) / (8π · 2.997925·10^10 · √(2π)) = 2.966136·10^-15 cm²   (4)

and finally, after substituting the cross-section, the gain can be written as

g = 2.966136·10^-15 · nu · f  cm^-1   (5)

for each calculation we listed the parameter data, information about the maximum gain value and the time at which it is reached, and the pinch time (the time when the maximum electron density ne is reached).
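to make equations (2), (3) and (5) concrete, the following sketch evaluates the fitted current pulse and the gain for given level populations. the population densities used below are illustrative placeholders; in the paper they come from the npinch/fly processing chain.

```python
import numpy as np

# a minimal sketch of equations (2), (3) and (5) for the 18.2 nm
# balmer-alpha transition of h-like carbon.

SIGMA_STIM = 2.966136e-15      # cm^2, from equation (4)
GU_OVER_GL = 9.0 / 4.0         # statistical weight ratio of the transition

def current(t, i0, t0, t1):
    """equation (2): i(t) = i0 sin(pi t / 2 t0) exp(-t / t1)."""
    return i0 * np.sin(np.pi * t / (2.0 * t0)) * np.exp(-t / t1)

def gain(n_u, n_l):
    """equations (3) and (5): g = sigma * n_u * (1 - 2.25 n_l/n_u), clipped at 0."""
    f = 1.0 - GU_OVER_GL * n_l / n_u
    return SIGMA_STIM * n_u * np.maximum(f, 0.0)

# pulse from the first row of table 2: i0 = 74.213 ka, t0 = 126.3 ns
t = np.linspace(0.0, 200.0, 2001)            # ns
i = current(t, 74.213, 126.3, 570.0)         # ka
print(f"peak current {i.max():.3f} ka at t = {t[i.argmax()]:.1f} ns")

# hypothetical level populations (cm^-3) shortly after the pinch
print(f"gain = {gain(n_u=3.0e15, n_l=1.0e15):.2f} cm^-1")
```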
4 imax = 30 ka

the results for imax = 30 ka are shown in fig. 1. the maximum gain value was achieved at di/dt|t=0 = 1·10^12 a/s and an initial carbon density of 1.4·10^17 cm^-3, with gmax ≈ 0.05 cm^-1. the calculated data resembled the gauss function, so they were fitted by this function, and the deviation between the calculated data and the fitted function was minimal. the most important information is shown in fig. 2: the best gain was achieved when the time of reaching the current maximum was equal to the pinch time.

fig. 1: maximum gmax against initial density for various current derivatives at imax = 30 ka. the data is fitted by the gauss function

fig. 2: comparison of the time of reaching maximum gain, tg, and maximum current, ti, in dependence on di/dt|t=0

5 imax = 60 ka

for the case where imax = 60 ka, the results are shown in fig. 3. the maximum gain value gmax was reached at di/dt|t=0 = 2.25·10^12 a/s and an initial carbon density of 4.1·10^17 cm^-3, with gmax ≈ 3.4 cm^-1. this data too was fitted by the gauss function.

fig. 3: maximum gmax against initial density for various current derivatives at imax = 60 ka. the data is fitted by the gauss function

fig. 4: comparison of the time of reaching maximum gain, tg, and maximum current, ti, in dependence on di/dt|t=0

6 example

now we can take a closer look at a particular case to show the history of certain variables. the parameters of the capillary discharge n0 = 4.0·10^17 cm^-3, di/dt|t=0 = 2·10^12 a/s, r0 = 1.6 mm and imax = 60 ka have been chosen. we plot the curves for the individual quantities: electron density, electron temperature, inversion of population, and gain.

fig. 5: behaviour of the electron density for parameters n0 = 4.0·10^17 cm^-3, di/dt|t=0 = 2·10^12 a/s, r0 = 1.6 mm and imax = 60 ka

fig. 6: behaviour of the electron temperature for parameters n0 = 4.0·10^17 cm^-3, di/dt|t=0 = 2·10^12 a/s, r0 = 1.6 mm and imax = 60 ka

fig. 7: behaviour of the populations on levels n = 2 and n = 3 for parameters n0 = 4.0·10^17 cm^-3, di/dt|t=0 = 2·10^12 a/s, r0 = 1.6 mm and imax = 60 ka

fig. 8: behaviour of the inversion function f and the gain (cm^-1). in addition, the figure contains the marked pinch time tp = 44 ns and the time of reaching the current maximum, timax = 55 ns, for parameters n0 = 4.0·10^17 cm^-3, di/dt|t=0 = 2·10^12 a/s, r0 = 1.6 mm and imax = 60 ka

figures 5 and 6 demonstrate the behaviour of the electron density and temperature. they show the pinch time tp = 44 ns, nemax = 3.8·10^20 cm^-3 and temax = 110 ev. after that, a rapid drop in these quantities follows due to adiabatic expansion: the electron density drops by nearly two orders of magnitude in 5 ns, and the electron temperature drops by nearly 90 ev. these are ideal conditions for creating population inversion at the desired levels.

figure 8 shows that at time 46.5 ns, 2.5 ns after the pinch time, both the inversion function and the gain are above zero. at time 47.2 ns the gain maximum gmax = 3.2 cm^-1 is reached, and about one ns later the inversion function also reaches its maximum and starts to fall steadily. the gain, however, begins to drop very rapidly, due to the rapid density decay of the whole mass (both the electrons and the lasing ions) along the axis. it follows from the gain behaviour that the hypothetical laser pulse could be 2 ns in length.
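the synchronization argument can be checked directly from equation (2): setting di/dt = 0 gives tan(πt/(2 t0)) = π t1/(2 t0) for the time of the current maximum, which the following sketch solves numerically and compares with the tmax column of table 1.

```python
import math
from scipy.optimize import brentq

# a minimal cross-check of the t_max values in tables 1 and 2: the
# maximum of i(t) = i0 sin(pi t / 2 t0) exp(-t / t1) satisfies
# tan(pi t / 2 t0) = pi t1 / (2 t0), solved here for t in (0, t0).

def pulse_maximum(t0: float, t1: float) -> float:
    """time (ns) at which the fitted current pulse peaks."""
    f = lambda t: math.tan(math.pi * t / (2.0 * t0)) - math.pi * t1 / (2.0 * t0)
    return brentq(f, 1e-6, t0 - 1e-6)

# a few rows from table 1 (t0, t1 in ns) with the tabulated t_max
for t0, t1, t_tab in [(71.0, 570.0, 67.6), (51.5, 570.0, 49.5), (33.3, 570.0, 32.5)]:
    print(f"t0 = {t0:5.1f} ns -> t_max = {pulse_maximum(t0, t1):5.1f} ns (table: {t_tab} ns)")
```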
how does the initial density influence the pinch behaviour? with increasing initial density the maximum density also increases, and at the same time the pinch time is delayed. the effect on the electron temperature is exactly the opposite: as the initial density increases, the maximum electron temperature value decreases.

7 conclusion

the optimal conditions for achieving maximum gain, in terms of the initial density and the electric current pulse shape, have been obtained for two different current maxima, imax = 30 ka and imax = 60 ka, and a capillary radius r = 1.6 mm. figures 2 and 4 show clearly that, for the optimal current derivative at zero time, the time of achieving the current maximum and the pinch time have to be synchronized, so that both times are almost identical. gmax = 0.05 cm^-1 was achieved at n0 = 1.4·10^17 cm^-3, di/dt|t=0 = 1·10^12 a/s, r0 = 1.6 mm and imax = 30 ka. gmax = 3.4 cm^-1 was achieved at n0 = 4.1·10^17 cm^-3, di/dt|t=0 = 2.25·10^12 a/s, r0 = 1.6 mm and imax = 60 ka. the assumed pulse length is 2 ns.

acknowledgement

the research described in this paper was supervised by ing. pavel vrba, csc., ipp as cr in prague, and was supported by the czech grant agency under grant no. 202/08/057.

references

[1] rocca, j. j.: table-top soft x-ray lasers, review of scientific instruments, 1999, vol. 70, no. 10.

[2] razinkova, t. l., sasorov, p. v.: npinch, a program for computing z-pinch dynamics, moscow, 1998.

[3] elton, r. c.: x-ray laser, academic press, new york, 1990.

[4] koláček, k. et al.: 10th icxrl 2006, berlin, germany (poster – 14 09 kolacek.pdf).

jakub hübner
e-mail: bukajus@centrum.cz
dept. of physical electronics
faculty of nuclear science and physical engineering
czech technical university
břehová 7, 115 19 prague 1, czech republic

lifetime assessment of a steam pipeline

jiří janovec, daniela poláchová, michal junek

ctu in prague, faculty of mechanical engineering, department of materials engineering, karlovo náměstí 13, 121 35 prague 2, czech republic

correspondence to: jiri.janovec@fs.cvut.cz

abstract

the aim of this paper is to design a method for assessing the life of steam pipes in czech power plants. the most widely used material in czech power plants is steel 15 128; our findings may also be applied to the international equivalents of this steel. the paper presents a classification of cavitation damage and of the microstructure status, based on the german vgb guideline, with reference to epri practice in the usa. calculations of remaining life based on russian experience are also shown. the possibility of applying this method to increase the operating parameters of power plants is discussed.

keywords: life assessment of a steam pipeline, creep damage, assessment of microstructure damage.

1 introduction

a switchover to high-parameter power plants requires an assessment of the remaining life of the existing steam piping. in creep conditions, this requires a large amount of information on the operating conditions (temperature, pressure, time) and the material characteristics (microstructure, creep strength, creep strain rate, etc.). it is also important to monitor the stress in exposed parts and weld joints. the greatest influence on component life is the formation and joining of internal defects, i.e. cavities. it is therefore necessary to provide a sophisticated method for observing and evaluating internal defects. this paper deals mainly with steel type 15 128, because the vast majority of the steam piping currently operating in the czech republic is made from this material.

2 material 15 128 (14mov6-3)

this is a low-alloy heat-resistant crmov steel with guaranteed weldability.
the main use of this steel is for steam piping, superheaters and boiler tubes operating at temperatures up to 580 °c. the microstructure of the steel in the initial state depends on the heat treatment that it has undergone. the initial state of 15 128.5 has a ferritic-bainitic microstructure with a fine dispersion of globular carbides of m4c3 or mc type, respectively, precipitated in the ferritic matrix. the initial state of 15 128.9 is formed by a fine carbide-bainitic microstructure. the two initial states differ in their mechanical properties; see table 3 [1].

table 1: equivalent materials according to international standards

čsn | din (w. nr. / marking) | gost | astm
15 128 | 1.7715 / 14mov6-3 | 12ch1mf | gr. p24

table 2: chemical composition of the material according to the standard (weight %), identification according to čsn: 15 128

c 0.10–0.18, mn 0.45–0.70, si 0.15–0.40, cr 0.50–0.75, mo 0.40–0.60, v 0.22–0.35, p max. 0.040, s max. 0.040, al max. 0.025

table 3: mechanical properties of steel 15 128 in dependence on heat treatment

state *) | rm [mpa] | rp0.2 [mpa] | a5 [%] | hb | rmt/10^5 hrs [mpa] at 550/575/600 °c | rmt/2.5·10^5 hrs [mpa] at 550/575/600 °c
.5 | 490–690 | min. 365 | min. 18 | 140–197 | 89 / 64 / 45 | 73 / 51 / 35
.9 | 570–740 | min. 430 | min. 17 | 163–223 | 107 / 75 / (51) | 88 / 59 / (38)

*) 15 128.5: 960 °c/30 min/air + 710 °c/1 hour/air; 15 128.9: 970 °c/30 min/water + 720 °c/1 hour/air

table 4: summary of tests for steam piping [2]

type of samples | test methodology | obtained data
pipeline (non-destructive tests) | replicas, visual tests, ultrasonic methods, hardness testing, penetration test, magnetic powder test, optical fibre measurement, chemical analysis | cavity formation, cracking, density of inclusions, spheroidization, segregation of sulphides, hardness, size and type of grain
metallographical samples (destructive tests) | replicas, hardness testing, light microscopy, electron microscopy, cryo-cracking, cross-weld stress rupture test | fusion line cracking, cavity formation, hardness, spheroidization, fine-grain haz damage, damage to weld metal, density of inclusions, size and type of grain

3 monitoring the operating parameters, service life management

the main parameters that should be monitored over time are the temperature and the pressure of the steam in the steam pipeline. regular monitoring of the operating parameters several times a year is most advantageous in terms of creep lifetime.

3.1 according to epri (usa) [2]

a three-stage approach is used for evaluating the lifetime of steam pipelines. at each stage, the estimated remaining life and the desired service life of the steam pipeline are compared.

stage 1: includes general calculations based on the operating history, and especially explores the possibility of degradation of components. the main inputs of this stage are the relevant technical drawings, material properties, operating hours and cycles, the inspection and maintenance history, the failure history (details of failures and their repairs), and the operational parameters and their maximum values (temperature and pressure).

stage 2: includes non-destructive testing of components, the results of which can improve the evaluation of the lifetime made in stage 1. for the tests carried out at this stage, see table 4. the visual examination includes observation of geometric inaccuracies (e.g. buckling). this includes geometry measurements (wall thickness, ovality) and measurements of the position of selected hinges and supports. if cracks are found by capillary tests, an ultrasonic examination is to be made.
the replica method provides a preliminary estimate of the lifetime of the steam piping (welded joints).

figure 1: creep curve with the states of cavitation damage marked on it [3]

stage 3: includes destructive testing and a detailed analysis of the samples. it is necessary to interrupt the operation and remove a part of the steam pipeline. this stage provides the most accurate estimate of the remaining life of the samples. the observed data is compared with the data determined in stage 2. the tests are presented in table 4. the high cost of this type of evaluation needs to be taken into account (for example, is it better to replace or repair a part of the steam piping, or is it better to operate under lower conditions?).

according to [3], the replica method can be used on properly prepared surfaces. two to three replicas should be taken from each site. the microstructure and the cavitation (creep) damage are evaluated on the basis of the replicas. figure 1 shows a creep curve with the states of cavitation damage marked on it. the structural condition is assessed according to the etalons. resistance to corrosion attack can also be assessed with the use of electrochemical polarization measurements.

3.2 russian approach (material 12ch1mf)

3.2.1 operation and inspection

on the straight parts of steam pipelines operating at temperatures from 450 to 545 °c it is necessary to measure the residual deformation 200 thousand hours after the pipeline comes into operation. for steam pipelines operating at temperatures from 546 to 570 °c, analogous measurements should be performed after 150 thousand hours. if the residual deformation exceeds 0.75 %, an assessment of the material is made in terms of mechanical properties and chemical composition. on the bent parts of the steam piping, measurements of the residual deformation, magnetic powder tests and ultrasonic flaw detection must be performed after 150 thousand hours; in the operating temperature range from 546 to 570 °c, these measurements are performed after 100 thousand hours.

for welded joints, the degree of exhausted life τho/τls is evaluated in accordance with the range of structures and their microdamage (τho is the hours of operation; τls is the limit service life at the stage of microcrack discovery). the residual service life (τrs) can be calculated from the difference τrs = τls − τho [4]. the typical damaged places of welded joints in the creep region are found mainly along the outside of the heat-affected zone of the material.

3.2.2 service life calculation

the lifetime of a steam pipeline operated in the creep region is assessed according to the degree of microdamage within the structure of the material. the basic parameter for calculating the lifetime is the number of micropores per unit area of the metallographic scratch pattern (replica). figure 2 shows an example of data processing for steel 12ch1mf at 600 °c. for a reliable extrapolation of the long-term strength at 100 000 hrs, it is necessary to start from an experiment planned for at least 4 000–5 000 hrs. in [5], the calculation was based on 50–300 thousand hrs in creep conditions, for the steam piping of 250–800 mw energy blocks at 515–560 °c and 3.7 to 25.5 mpa.

figure 2: example of experimental data processing in double logarithmic coordinates, after long-term strength tests on steel 12ch1mf at 600 °c [6]
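the bookkeeping of this subsection is simple enough to write down; a minimal sketch (with function names of my own choosing, and the thresholds taken from the text) follows.

```python
# a minimal sketch of the russian-approach rules above: the deadline
# for the first residual deformation measurement, and the residual
# service life tau_rs = tau_ls - tau_ho.

def first_inspection_hours(temp_c: float, bent_section: bool) -> int:
    """hours of operation after which residual deformation is first measured."""
    if 450 <= temp_c <= 545:
        return 150_000 if bent_section else 200_000
    if 546 <= temp_c <= 570:
        return 100_000 if bent_section else 150_000
    raise ValueError("temperature outside the 450-570 degC range covered here")

def residual_service_life(tau_ho: float, tau_ls: float) -> float:
    """tau_rs = tau_ls - tau_ho (both in hours)."""
    return tau_ls - tau_ho

print(first_inspection_hours(560, bent_section=True))             # -> 100000
print(residual_service_life(tau_ho=180_000, tau_ls=250_000))      # -> 70000.0
```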
3.2.3 creep time

the creep time from pore creation to the development of a macro-crack for 12ch1mf can be one half of the total operating time, which is (1–3)·10^5 hrs (in the course of a year the plant works for (7–8)·10^3 hrs). information on the degree of damage of the metal therefore allows the remaining uptime to be calculated. at the present time, the diagnostics is carried out mostly by metallographic methods when the energy blocks are shut down, or during an overhaul. the high labour intensity of preparing the metallographic sample reduces the control performance, and extensive repairs are needed. it is therefore necessary to introduce more progressive express methods.

4 increasing the operating parameters

the russian literature [4,7] presents examples of service life extension by up to an additional 100 thousand hours after appropriate repairs to the welds, on the basis of a microstructure analysis and the degree of damage. cyclic heat treatment (3 to 10 cycles), with heating up to 980 °c and subsequent cooling, effectively regenerates the operating characteristics if the material damage does not exceed 10–15 % in relation to the original state, i.e. when δρ/ρ ≤ 0.1–0.2 % [?]. for 12ch1mf steel with 0.13 % damage, just 2–3 cycles are needed; at 0.5 % damage, 4–6 cycles; at 1 % (the total damage state), 8–10 cycles. other specific cases of increasing the operational parameters are very difficult to access, as most of the results and experience gained in relation to this issue are covered by trade secrecy. however, it is obvious that even a small change in the operating parameters (e.g. a temperature increase) has a significantly negative effect on the residual service life.

5 assessment of microstructure damage

in our proposal for the metallographic evaluation of a structure, we utilize the classification degrees used by neubauer, isq, and vgb nordtest; a detailed description and a comparison are given in table 5 [3]. figure 3 shows various states of damage of material 14mov6-3.

table 5: microstructure rating classes [3]

nordtest nt ndt010 | vgb tw507 | neubauer and wedel | description | recommendation (nordtest) | consumed life-fraction (epri)
0 | – | – | as-received | – | –
1 | 1 | – | no creep cavitation | none | 0–0.14
2 (2a: isolated cavities; 2b: numerous cavities, no preferred orientation) | 2 | a | single cavities | re-examine after 20 000 hrs | 0.05–0.47
3 (3a: numerous oriented cavities; 3b: chains of cavities) | 3 | b | coherent cavities | re-examine after 15 000 hrs | 0.27–0.53
4 | 4 | c | creep cracks (micro) | re-examine after 10 000 hrs | 0.29–0.84
5 | 5 | d | creep macrocracks | issue immediate warning | 0.7–1.0

figure 3: states of damage of material 14mov6-3 [8]; assessment classes 1 (no creep cavitation), 2a (single cavities), 2b (numerous cavities), 3a (numerous oriented cavities), 3b (chain of cavities) and 4 (creep microcracks)
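table 5 can be turned into a small lookup that returns the recommendation and the epri consumed life-fraction band for a given assessment class. the dictionary below is my own encoding of the table, and the remaining-life band is an illustrative linear scaling of mine, not part of [3].

```python
# a minimal lookup of table 5: assessment class -> description,
# nordtest recommendation and epri consumed life-fraction band.

TABLE5 = {
    "1":  ("no creep cavitation",        "none",                        (0.00, 0.14)),
    "2a": ("isolated cavities",          "re-examine after 20 000 hrs", (0.05, 0.47)),
    "2b": ("numerous cavities",          "re-examine after 20 000 hrs", (0.05, 0.47)),
    "3a": ("numerous oriented cavities", "re-examine after 15 000 hrs", (0.27, 0.53)),
    "3b": ("chain of cavities",          "re-examine after 15 000 hrs", (0.27, 0.53)),
    "4":  ("creep cracks (micro)",       "re-examine after 10 000 hrs", (0.29, 0.84)),
    "5":  ("creep macrocracks",          "issue immediate warning",     (0.70, 1.00)),
}

def assess(klass: str, hours_run: float) -> str:
    desc, action, (lo, hi) = TABLE5[klass]
    # crude remaining-life band from the consumed life-fraction range;
    # this linear scaling is an illustrative assumption
    if lo == 0:
        remaining = "n/a"
    else:
        remaining = f"{hours_run*(1/hi - 1):,.0f} to {hours_run*(1/lo - 1):,.0f} hrs"
    return f"class {klass}: {desc}; {action}; est. remaining life {remaining}"

print(assess("3a", 180_000))
```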
6 conclusion

our proposal for czech power plants anticipates a shift in the service life criterion from grade 3a to 3b, i.e. to an allowed cavitation chain length of 100 μm at a density of 20 microcracks/mm², for steel 15 128 or its international en equivalents. the proposal is based on an experimental evaluation of the microstructures, the creep behavior and the changes in the wall thickness of bends and pipes at 6 cez power plants operated for a period of 25 to 35 years.

another topic for our research is the design of service life checks for steam pipelines in the area of creep damage, and the damage assessment of the microstructures of grade 15 steels and of the new class of t23, t24, p91, p92 and e911 steels.

references

[1] svobodová, m., tůmová, d., čmakal, j.: partial technical survey "overview of the material properties of steel 15 128 from the database of ujp praha, a. s." [report ujp 1450], ujp praha, a. s., praha, november 2011, 22 p. (in czech)

[2] fossil plant high-energy piping damage: theory and practice, volume 1: piping fundamentals. epri, palo alto, ca, 2007, 1012201.

[3] mentl, v.: service life management of steam pipelines in the power industry [technical survey report no. 16/2011/kmm], kmm fs zču in plzeň, november 2011, 239 p. (in czech)

[4] hromchenko, f. a.: service life of welded joints of steam pipelines. moscow: mashinostroenie, 2002, 246 p., isbn 5217031263. (in russian)

[5] kaligin, r. n.: prediction of the residual life of long-operated welded joints of steam pipelines under creep conditions on the basis of structural factors. all-russian thermal engineering research institute, moscow, 2008. (in russian)

[6] antikain, p. a.: metallovedenie (metal science). moscow: metallurgija, 1972, 183 p. (in russian)

[7] izyumskii, n. a.: improving the reliability of steam boilers and steam pipelines. (in russian)

[8] vgb-tw 507, teil 2/part 2, richtreihen/rating charts, january 2005.

concepts of emission reduction in fluidized bed combustion of biomass

amon purgar, franz winter

vienna university of technology, institute of chemical engineering, getreidemarkt 9/166, 1060 vienna, austria

correspondence to: amon.purgar@tuwien.ac.at

abstract

a status report on fluidized bed technology in austria is under preparation, in response to the fluidized bed conversion multi-lateral technology initiative of the international energy agency. this status report focuses on the current operation of fluidized bed combustors. combustors have been installed in the following industrial sectors: pulp and paper, biomass heat and power plants, waste-to-energy plants, and communal sewage sludge treatment plants. there are also some small demonstration plants. these plants all have in common that they treat renewable fuel types; in many cases, only bio-fuels are treated. besides the ability to burn a wide range of low-grade and difficult fuels, fluidized bed combustors have the advantages of low nox emissions and the possibility of in-process capture of so2. various emission reduction concepts for fluidized bed combustors that are typical for their industrial sector are discussed. the discussion of these concepts focuses on nox, so2 and dust.

keywords: fluidized bed combustion, emission reduction systems.

1 fluidized bed combustion

the history of fluidized bed conversion is considered to have started in about 1920. a name linked to the development of fluidized bed conversion is fritz winkler, who conducted flue gas into the bottom of a vessel containing coke particles. when the volume flow of the flue gas increased, the phenomenon of fluidization could be observed: the bulk coke increased in volume, and winkler observed that the motion of the coke particles was similar to that of a boiling liquid. this application can be described as fluidized bed gasification. he patented his findings in 1922, and continued building and investigating fluidized bed applications. the first boom in the commercial use of fluidized bed conversion was in the 1930s and 1940s.
the reason for this boom is easy to explain: air blowers became commercially available at that time. further information about the history of fluidized bed conversion can be found in [1]. it should also be mentioned that, at least in austria, there was a big increase in fluidized bed combustion applications between 1980 and 1993 in the pulp and paper industry. another increase in fluidized bed combustion technology, in the waste-to-energy industry, began in 2000 and is still going on. the reasons for these booms and their relationship to flue gas cleaning are discussed below.

there are two main concepts: bubbling and circulating fluidized bed combustors. the two concepts are illustrated in figure 1. fluidized bed combustors mainly consist of a vessel containing a gas distributor, the bulk bed material and the freeboard. the gas distributor, overlaid with the bed material, leads the fluidization air into the vessel; the air flows through the bed material, and fluidization takes place. gas distributors can be open or closed. if a closed gas distributor is built, all the bed material is above the distributor; if the distributor is open, the bed material is situated around the distributor. the bulk bed material consists mainly of inert sand, in most cases silica or dolomite. at standstill, the vessel is not entirely filled with bed material: it is filled to a height that allows for the expansion due to fluidization. the empty space at the top of the vessel is called the freeboard.

if the combustor is not equipped with an air staging system, the fluidization air is the entire combustion air. if there is an air staging system, the fluidization air is the primary air, possibly mixed with recirculated flue gas, and the secondary air is injected above the fluidized bed. the fluidized bed is heated to a certain temperature before the fuel supply is started; for this purpose, most fluidized bed combustors are equipped with a gas burner for the start-up. once the fluidized bed material is at the required temperature, the fuel is injected, and due to the horizontal and vertical movement of the bed material the fuel is well mixed into it. the excellent mixing behavior and the high heat capacity of the sand, which acts as a mobile heat reservoir, ensure an even temperature and an even fuel distribution in the combustion chamber.

figure 1: basic functionality of a fluidized bed combustor. left: a bubbling fluidized bed combustor. right: a circulating fluidized bed combustor

in addition, the overall temperature in the combustion chamber is very insensitive to fuel quality fluctuations over time. these stable temperature conditions and the possibility of in-process capture of so2, when limestone is used as an additive, are the main advantages over grate furnaces and pulverized combustors [4].

besides the enhanced constructional effort, fluidized bed combustion technology has two main limitations. first, depending on the ash composition, the maximum temperature in the fluidized bed is limited: when the ash melting point is reached, there is a possibility of agglomeration within the bed material, which can reduce or stop the fluidization. second, the superficial velocity within the reactor, which depends on the fluidization air flow and the cross-sectional area, must be above the minimum fluidization velocity, which ensures fluidization, and below the terminal velocity, which is the minimum velocity of the pneumatic transport regime [1,4].
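this operating window can be estimated with standard textbook correlations. the sketch below uses the wen-yu correlation for the minimum fluidization velocity and a spherical-particle drag balance (with the turton-levenspiel drag coefficient) for the terminal velocity; neither correlation, nor the sand and flue gas property values, are taken from this paper.

```python
import math

# a minimal sketch of the operating window: the superficial gas
# velocity must lie between u_mf and u_t. correlations and property
# values are illustrative textbook choices, not from this paper.

def u_mf_wen_yu(d_p, rho_p, rho_g, mu_g, g=9.81):
    """minimum fluidization velocity (m/s), wen & yu (1966)."""
    ar = rho_g * (rho_p - rho_g) * g * d_p**3 / mu_g**2   # archimedes number
    re_mf = math.sqrt(33.7**2 + 0.0408 * ar) - 33.7
    return re_mf * mu_g / (rho_g * d_p)

def u_t(d_p, rho_p, rho_g, mu_g, g=9.81):
    """terminal velocity (m/s) of a sphere, iterating on the drag coefficient."""
    u = u_mf_wen_yu(d_p, rho_p, rho_g, mu_g)  # starting guess
    for _ in range(100):
        re = max(rho_g * u * d_p / mu_g, 1e-12)
        cd = 24.0 / re * (1.0 + 0.15 * re**0.687) + 0.42 / (1.0 + 4.25e4 * re**-1.16)
        u = math.sqrt(4.0 * g * d_p * (rho_p - rho_g) / (3.0 * cd * rho_g))
    return u

# 0.5 mm silica sand in hot flue gas (~850 degC): illustrative values
d_p, rho_p, rho_g, mu_g = 0.5e-3, 2600.0, 0.32, 4.4e-5
print(f"u_mf = {u_mf_wen_yu(d_p, rho_p, rho_g, mu_g):.3f} m/s")
print(f"u_t  = {u_t(d_p, rho_p, rho_g, mu_g):.2f} m/s")
```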
2 fluidized bed combustors in austria

a status report on fluidized bed technology in austria is under preparation, in response to the fluidized bed conversion multi-lateral technology initiative of the international energy agency. this status report focuses on the current operation of fluidized bed combustors. besides two demonstration plants and other fluidized bed conversion plants, 23 fluidized bed combustors with a thermal capacity of more than 1 mw were found and investigated. the 23 combustors were categorized into the following four industrial sectors:

• pulp and paper: combustors which supply a pulp and paper plant with energy and do not utilize municipal wastes.
• waste-to-energy industry: combustors that utilize municipal wastes.
• biomass heat and power plants: combustors that utilize only renewable fuels and are not connected to the pulp and paper industry.
• treatment of communal sewage sludge: combustors utilizing only communal sewage sludge.

figure 2: total thermal capacity of fluidized bed combustors installed in the pulp & paper (pp), waste-to-energy (wte), biomass heat and power plants (bhp), and treatment of communal sewage sludge (tcss) industrial sectors

figure 3: thermal capacity of fluidized bed combustors installed in the pulp & paper (pp), waste-to-energy (wte), biomass heat and power plants (bhp) and treatment of communal sewage sludge (tcss) industrial sectors, over time

table 1: hourly/daily emission standards for the investigated plants in the waste-to-energy industry [5,6]

| dust (mg/m³) | corg (mg/m³) | so2 (mg/m³) | nox (mg/m³) | co (mg/m³)
minimum | 5/5 | 8/8 | 20/20 | 60/55 | 50/50
maximum | 10/10 | 10/10 | 50/50 | 100/70 | 100/50
average | 7.75/7.75 | 8.5/8.5 | 40/37.5 | 75/66.3 | 75/50

3 the influence of flue gas cleaning

pulp & paper industry: figure 3 shows that most of the fluidized bed combustors in the pulp and paper industry were installed between 1983 and 1986. widely-used fuels are coal, biomass and fibrous rejects of the pulp and paper industry; gas is used only for starting up the combustors. the main purpose of these boilers is to cover the base load of the energy demand of a pulp and paper plant. the legal framework at that time required these boilers to be equipped with electrostatic precipitators or baghouse filters and, depending on the fuels that were used, to be able to add bulk limestone to the combustion chamber for the in-process capture of so2. over time, some of the boilers have additionally been equipped with a selective non-catalytic reduction system or a dry flue gas cleaning system [3,6].

waste incineration industry: at the end of 2011, there were seven fluidized bed combustors in this sector, with a total capacity of 321 mw. four of these boilers, with a total thermal capacity of 268 mw, have been investigated closely. table 1 shows that there is a strict legal framework in the waste-to-energy industry. in order to meet these strict standards, all the investigated plants use a similar setup of flue gas treatment systems; see figure 4. due to the tightening of the standards in recent years, this elaborate flue gas cleaning setup became necessary both for fluidized bed combustors and for grate furnaces. it should be mentioned here that the standards are increasingly being set for shorter averaged sample time periods, which means that the flue gas cleaning systems must be designed to handle emission peaks. for this reason, the advantage of the stable operating conditions of fluidized bed combustors has become crucial when deciding between grate furnaces and fluidized bed boilers. in 2006, there were three grate furnaces in austria, with a total capacity of 87 tons of waste per hour, and three fluidized bed combustors, with a total capacity of 99 tons of waste per hour, in the process of planning [2,6].
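the hourly/daily structure of table 1 implies a simple averaging check, sketched below for nox with the 100 (hourly) / 70 (daily) mg/m³ limit pair from the table; the half-hourly measurement series is synthetic and purely illustrative.

```python
import numpy as np

# a minimal sketch of the hourly/daily limit logic behind table 1:
# a nox series is averaged over one hour and over one day and compared
# against the respective limits.

HOURLY_LIMIT, DAILY_LIMIT = 100.0, 70.0             # mg/m3, nox, from table 1

rng = np.random.default_rng(1)
nox = 60.0 + 15.0 * rng.standard_normal(24 * 2)     # half-hourly values, one day

hourly_means = nox.reshape(-1, 2).mean(axis=1)      # 24 hourly averages
daily_mean = nox.mean()

print("worst hourly mean:", round(hourly_means.max(), 1), "mg/m3",
      "->", "ok" if (hourly_means <= HOURLY_LIMIT).all() else "exceeded")
print("daily mean:       ", round(daily_mean, 1), "mg/m3",
      "->", "ok" if daily_mean <= DAILY_LIMIT else "exceeded")
```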
figure 4: basic setup of the flue gas cleaning system in the waste-to-energy industry. a) fluidized bed combustor, b) gravitation and/or centrifugal separators, c) dry flue gas cleaning, d) baghouse filter or electrostatic precipitator, e) wet scrubbers, f) selective catalytic reduction (scr)

figure 5: basic setup of the flue gas cleaning system for biomass heat and power plants. a) fluidized bed combustor, b) selective non-catalytic reduction, c) gravitation separators, d) selective catalytic reduction in high-dust switching, e) dry flue gas cleaning system, f) baghouse filter

treatment of communal sewage sludge: at the end of 2011, there were five boilers that exclusively utilize communal sewage sludge. two of them have a thermal capacity below 2 mw and are not discussed in this work. the other three combustors are structurally identical, and are all located at the same site. they have a thermal capacity of 20 mw each, and have basically the same flue gas cleaning systems as those sketched in figure 4. a notable difference is that there is a fixed bed activated carbon absorber between the wet scrubbers and the scr; in addition, no dry flue gas cleaning system is installed [6].

biomass heat and power plants: three fluidized bed combustors were put into operation in 2005 and 2006. together they have a thermal capacity of 163 mw. two of these boilers, with a thermal capacity of 116 mw, have been investigated closely. the two investigated biomass heat and power plants have almost the same flue gas treatment system, see figure 5, except that one of them also has selective catalytic reduction in high-dust switching. an obvious difference from the boilers in the waste-to-energy industry is that there are no wet scrubbers; this is because of the low sulfur content of the biomass [6,7].

4 summary

in austria, fluidized bed combustors are mainly used in the pulp and paper industry, in waste-to-energy plants, in biomass heat and power plants, and in communal sewage sludge treatment. each of these industrial sectors uses a typical fuel mixture, and a specific flue gas cleaning system setup is installed for the typical fuel mixture that is used. in the pulp and paper industry, mainly baghouse filters or electrostatic precipitators are used; some combustors also have a selective non-catalytic reduction system (sncr) and a dry flue gas cleaning system. in the waste-to-energy industry and in communal sewage sludge treatment, the plants are equipped with an elaborate flue gas cleaning system, basically containing gravitation and centrifugal separators, a dry flue gas cleaning system, baghouse filters, wet scrubbers, and a selective catalytic reduction system (scr). the flue gas cleaning systems of biomass heat and power plants contain gravitation and centrifugal separators, a dry flue gas cleaning system, and baghouse filters; additionally, a selective catalytic reduction system in high-dust switching can be installed.
acknowledgement

we would like to acknowledge the austrian federal ministry for transport, innovation and technology (http://www.bmvit.gv.at) for funding the austrian activities within the international energy agency fluidized bed conversion agreement (http://www.iea-fbc.org). in addition, we would like to acknowledge the support provided by the iea fluidized bed conversion network (http://www.iea-fbc.net).

references

[1] winter, f., szentannai, p.: iea fluidized bed conversion programme, status report 2010, österreichisches bundesministerium für verkehr, innovation und technologie, vienna, 2010.

[2] böhmer, s., kügler, i.: abfallverbrennung in österreich, statusbericht 2006. umweltbundesamt gmbh, vienna, 2007, isbn 3-85457-911-x.

[3] stubenvoll, j., holzerbauer, r.: technische maßnahmen zur minderung der staub- und nox-emissionen bei wirbelschicht- und laugenverbrennungskesseln. umweltbundesamt gmbh, vienna, 2007, isbn 3-85457-837-7.

[4] bis, z.: fluidized beds, in: handbook of combustion, vol. 4: solid fuels. wiley-vch verlag gmbh & co. kgaa, weinheim, 2010, p. 399–433, isbn 978-3-527-32449-1.

[5] amon, m., grech, h.: bericht des bundesministers für land- und forstwirtschaft, umwelt- und wasserwirtschaft über verbrennungs- und mitverbrennungsanlagen gemäß §18 avv, berichtszeitraum 2009, bundesministerium für land- und forstwirtschaft, umwelt- und wasserwirtschaft, vienna, 2010.

[6] information provided by the iea fluidized bed conversion network austria.

[7] selcuk, n., gogebakan, z.: co-firing biomass with coal in fluidized bed combustion systems, in: handbook of combustion, vol. 4: solid fuels. wiley-vch verlag gmbh & co. kgaa, weinheim, 2010, p. 557–608, isbn 978-3-527-32449-1.

thermal entanglement and critical behavior of magnetic properties on a triangulated kagomé lattice

n. ananikian, l. ananikyan, l. chakhmakhchyan, a. kocharian

abstract

the equilibrium magnetic and entanglement properties of a spin-1/2 ising-heisenberg model on a triangulated kagomé lattice are analyzed by means of an effective field obtained from the gibbs-bogoliubov inequality. the calculation is reduced to decoupled individual trimers (clusters), due to the separable character of the ising-type exchange interactions between the heisenberg trimers. the concurrence of the three-qubit isotropic heisenberg model in the effective ising field is non-zero even in the absence of a magnetic field. the magnetic and entanglement properties exhibit common (plateau, peak) features driven by the magnetic field and the (antiferromagnetic) exchange interaction. the (quantum) entangled and non-entangled phases can be exploited as a useful tool for signalling quantum phase transitions and crossovers at finite temperatures. the critical temperature of the order-disorder transition coincides with the threshold temperature of thermal entanglement.

keywords: triangulated kagomé lattice, ising-heisenberg model, gibbs-bogoliubov inequality, entanglement, concurrence.

1 introduction

geometrically frustrated spin systems exhibit fascinating new phases of matter, a rich variety of unusual ground states, and thermal properties resulting from zero and finite temperature phase transitions driven by quantum and thermal fluctuations, respectively [1]. the efforts aimed at a better understanding of these phenomena have stimulated an intensive search for transition-metal magnetic and molecular materials whose paramagnetic metal centers can be strongly frustrated by local geometric structures.
one of the most interesting geometrically frustrated magnetic two-dimensional structures is the triangulated kagomé (triangles-in-triangles) lattice, which applies to the magnet cu9x2(cpa)6·nh2o (x = f, cl, br and cpa = carboxypentonic acid) [2]. the magnetic architecture of this series of compounds, which can be regarded as a triangulated kagomé lattice (fig. 1), is currently under active theoretical investigation [3]. the spin-1/2 ising-heisenberg model on this lattice, which takes into account the quantum interactions between the cu2+ ions on a-sites, in the limit when the monomeric b-spins have an exchange of ising character, provides rich physics and displays the essential features of the copper-based coordination compounds [4,5].

entanglement is a generic feature of quantum correlations in systems, which cannot be quantified classically [6]. it provides a new perspective for understanding quantum phase transitions (qpts) and collective many-body phenomena in condensed matter physics. a key novel observation is that quantum entanglement can play an important role in proximity to qpts, which are controlled by quantum fluctuations in the vicinity of quantum critical points. a new line of research points to a connection between the entanglement of a many-particle system and the existence of qpts and scaling [7]. the basic features of entanglement in spin-1/2 finite systems are fairly well understood by now, while the role of local cluster topology and spin correlations in the thermodynamic limit still remains unanswered.

effective field theories can be constructed by using the gibbs-bogoliubov inequality for studying the thermodynamic and thermal entanglement properties of many-body systems [8]. although the method is not exact, it is still possible to see regions of criticality [9]. unlike a classical transition, controlled by temperature, a quantum phase transition (qpt) is driven solely by (quantum) interactions. in the case of the triangulated kagomé lattice, each a-type trimer interacts with its neighboring trimers through the (ferromagnetic) ising-type exchange, i.e. a classical interaction. therefore, the states of two neighboring a-clusters become separable (unentangled) [6]. thus, the concurrence (a measure of entanglement [10]), which characterizes the quantum (non-classical) features of each trimer, can be calculated separately in the effective field. the key result of the current work is a comparative analysis of the specific (peak and plateau) features in the magnetic and thermal entanglement properties of the spin-1/2 ising-heisenberg model on a triangulated kagomé lattice.

the rest of the paper is organized as follows: in sec. 2 we introduce the ising-heisenberg model on the triangulated kagomé lattice and provide a variational solution based on the gibbs-bogoliubov inequality. the basic principles for calculating the entanglement measure and some of the results on intrinsic properties are introduced in sec. 3. in sec. 4 we present a comparison of the magnetic properties and the thermal entanglement. concluding remarks are given in sec. 5.

2 basic formalism

we consider the spin-1/2 ising-heisenberg model on a triangulated kagomé lattice (tkl) (fig. 1), consisting of two types of sites (a and b). since the exchange coupling between cu2+ ions is almost isotropic, it is most appropriate to apply the isotropic heisenberg model.
there is a strong heisenberg antiferromagnetic exchange coupling jaa between the trimeric sites of type a, and a weaker ising-type ferromagnetic exchange jab between the trimeric a-sites and the monomeric b-sites. thus, the kagomé lattice of the ising spins (monomers) contains inside each triangle unit a smaller triangle of heisenberg spins (trimer).

fig. 1: a cross-section of the tkl. the solid lines represent the intra-trimer heisenberg interactions jaa, while the broken lines label the monomer-trimer ising interactions jab. the circle marks the k-th cluster. s^a_ki denotes the heisenberg spins and s^b_ki the ising spins

the hamiltonian can be written as follows:

h = jaa Σ_(i,j) s^a_i · s^a_j − jab Σ_(k,l) (s^z)^a_k (s^z)^b_l − h Σ_{k=1}^{2n/3} Σ_{i=1}^{3} [(s^z)^a_ki + (1/2)(s^z)^b_ki]   (1)

where s^a = {s^a_x, s^a_y, s^a_z} is the heisenberg spin-1/2 operator, and s^b is the ising spin. jaa > 0 corresponds to antiferromagnetic coupling and jab > 0 to ferromagnetic coupling. here, the total number of sites is 3n; the first two summations run over the a−a and a−b nearest neighbors, respectively, and the last sum incorporates the effect of a uniform magnetic field (the factor 1/2 accounts for each b-monomer being shared between two clusters).

the variational gibbs-bogoliubov inequality is adopted to solve the hamiltonian (1):

f ≤ f0 + ⟨h − h0⟩0   (2)

where h is the real hamiltonian which describes the system, and h0 is the trial one. f and f0 are the free energies corresponding to h and h0, respectively, and ⟨...⟩0 denotes the thermal average over the ensemble defined by h0. following [4], the trial hamiltonian is reduced to

h0 = Σ_{k∈trimers} h^c_0   (3a)

h^c_0 = λaa (s^a_k1·s^a_k2 + s^a_k2·s^a_k3 + s^a_k1·s^a_k3) − Σ_{i=1}^{3} [γa (s^z)^a_ki + (γb/2)(s^z)^b_ki]   (3b)

in this trial hamiltonian, the stronger quantum heisenberg antiferromagnetic interactions between the a-sites are treated exactly, while the weaker ising-type interactions between the a- and b-sites (|jab/jaa| ≈ 0.025 [11]) are replaced by self-consistent (effective) fields of two types: γa and γb. the variational parameters γa, γb and λaa can be found by minimizing the rhs of (2). using the fact that in terms of (3b) the s^a and s^b are statistically independent, and taking into account ⟨s_x⟩0 = ⟨s_y⟩0 = 0 and the single-site magnetizations ⟨(s^z)^a⟩0 = ma, ⟨(s^z)^b⟩0 = mb on the a- and b-sites, we obtain λaa = jaa, γa = 2 jab mb + h, γb = 4 jab ma + h.

the eigenvalues of h^c_0 are:

e1 = (3/4)(λaa + 2γa)
e2 = e3 = (1/4)(−3λaa + 2γa)
e4 = (1/4)(3λaa + 2γa)
e5 = e6 = (1/4)(−3λaa − 2γa)
e7 = (1/4)(3λaa − 2γa)
e8 = (3/4)(λaa − 2γa)   (4)

and the corresponding eigenvectors are given by

|ψ1⟩ = |000⟩
|ψ2⟩ = (1/√3)(q|001⟩ + q²|010⟩ + |100⟩)
|ψ3⟩ = (1/√3)(q²|001⟩ + q|010⟩ + |100⟩)
|ψ4⟩ = (1/√3)(|001⟩ + |010⟩ + |100⟩)
|ψ5⟩ = (1/√3)(q|110⟩ + q²|101⟩ + |011⟩)
|ψ6⟩ = (1/√3)(q²|110⟩ + q|101⟩ + |011⟩)
|ψ7⟩ = (1/√3)(|110⟩ + |101⟩ + |011⟩)
|ψ8⟩ = |111⟩   (5)

where q = e^{i2π/3} (these eigenvectors are also eigenstates of the cyclic rotation operator p, with eigenvalues 1, q and q², satisfying q² + q + 1 = 0).

for the a- and b-single-site magnetizations defined above we obtain (here and further the boltzmann constant is set to kb = 1):

ma = (1/6) [3 sinh(3γa/2t) + 2 e^{3λaa/2t} sinh(γa/2t) + sinh(γa/2t)] / [cosh(3γa/2t) + 2 e^{3λaa/2t} cosh(γa/2t) + cosh(γa/2t)]   (6a)

mb = (1/2) tanh(γb/2t)   (6b)

for the gibbs-bogoliubov free energy (fgb) of the system we obtain the following expression:

fgb/n = λaa/2 + 4 jab ma mb − 2t [ (1/3) ln{4 e^{3λaa/2t} cosh(γa/2t) + 2 cosh(γa/2t) + 2 cosh(3γa/2t)} + (1/2) ln{2 cosh(γb/2t)} ]   (7)
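equations (6a) and (6b) close on themselves through γa = 2 jab mb + h and γb = 4 jab ma + h, so ma and mb can be found by simple iteration. the following sketch does this; the mixing factor and the starting values are numerical choices of mine, not taken from the paper.

```python
import numpy as np

# a minimal sketch of the self-consistent loop implied by equations
# (6a) and (6b), with kb = 1 and lambda_aa = jaa.

def magnetizations(t, jaa, jab, h, iters=2000, mix=0.5):
    ma, mb = 0.4, 0.4                        # start from an ordered state
    for _ in range(iters):
        ga, gb = 2.0 * jab * mb + h, 4.0 * jab * ma + h
        e3 = np.exp(3.0 * jaa / (2.0 * t))
        num = 3*np.sinh(3*ga/(2*t)) + (2*e3 + 1)*np.sinh(ga/(2*t))
        den = np.cosh(3*ga/(2*t)) + (2*e3 + 1)*np.cosh(ga/(2*t))
        ma_new = num / (6.0 * den)           # equation (6a)
        mb_new = 0.5 * np.tanh(gb / (2*t))   # equation (6b)
        ma = (1 - mix) * ma + mix * ma_new   # damped update for stability
        mb = (1 - mix) * mb + mix * mb_new
    return ma, mb

# jaa = 1, alpha = jab/jaa = 0.025 as in the paper; h = 0 below tc,
# so the ordered solution ma ~ 1/6, mb ~ 1/2 should be recovered
print(magnetizations(t=0.005, jaa=1.0, jab=0.025, h=0.0))
```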
this allows us to study, in particular, the thermal (local) entanglement properties of the a-sublattice in terms of a three-qubit (isotropic) heisenberg model in a self-consistent field $\gamma_a$, which carries the properties of the whole system. as a measure of the pairwise entanglement we use the concurrence $c(\rho)$ [10], defined as

$$ c(\rho) = \max\{\lambda_1 - \lambda_2 - \lambda_3 - \lambda_4,\, 0\}, \qquad (8) $$

where $\lambda_i$ are the square roots of the eigenvalues, in descending order, of the operator $\tilde\rho = \rho_{12}(\sigma^y_1 \otimes \sigma^y_2)\rho^*_{12}(\sigma^y_1 \otimes \sigma^y_2)$. since we consider pairwise entanglement, we use the reduced density matrix $\rho_{12} = \mathrm{tr}_3\,\rho$. in the effective field, due to the classical character of the ising interaction (sec. 1) between trimers, the concurrence for each decoupled heisenberg cluster can be calculated individually. in our case, the density matrix has the form

$$ \rho = \frac{1}{z}\sum_{k=1}^{8} \exp(-e_k/t)\,|\psi_k\rangle\langle\psi_k|, $$

with $e_k$ and $|\psi_k\rangle$ taken from (4) and (5), and $z$ is the partition function,

$$ z = \mathrm{tr}\,\rho^{-1}\!\ldots = e^{-\frac{3(2\gamma_a+\lambda_{aa})}{4t}}\left(1 + e^{\frac{\gamma_a}{t}}\right)\left(1 + e^{\frac{2\gamma_a}{t}} + 2e^{\frac{2\gamma_a+3\lambda_{aa}}{2t}}\right). $$

while the construction of $\tilde\rho$ does not depend on whether $\gamma_a$ is an effective parameter or a real magnetic field, the self-consistent field solution for $\gamma_a$ is crucial in obtaining the final results. in this paper we skip the specific derivations and focus only on the final results. the concurrence $c(\rho)$ is given by [12]:

$$ c(\rho) = \frac{2}{z}\max\left(|y| - \sqrt{uv},\; 0\right), \qquad (9) $$

where

$$ u = \frac{1}{3}e^{\frac{2\gamma_a - 3\lambda_{aa}}{4t}}\left(1 + 3e^{\frac{\gamma_a}{t}} + 2e^{\frac{3\lambda_{aa}}{2t}}\right), \quad v = \frac{1}{3}e^{-\frac{3(2\gamma_a + \lambda_{aa})}{4t}}\left(3 + e^{\frac{\gamma_a}{t}} + 2e^{\frac{2\gamma_a + 3\lambda_{aa}}{2t}}\right), \qquad (10) $$
$$ w = \frac{1}{3}e^{-\frac{2\gamma_a + 3\lambda_{aa}}{4t}}\left(1 + e^{\frac{\gamma_a}{t}}\right)\left(1 + 2e^{\frac{3\lambda_{aa}}{2t}}\right), \quad y = -\frac{1}{3}e^{-\frac{2\gamma_a + 3\lambda_{aa}}{4t}}\left(1 + e^{\frac{\gamma_a}{t}}\right)\left(-1 + e^{\frac{3\lambda_{aa}}{2t}}\right). $$

first, we find that the concurrence $c(\rho)$ as an entanglement measure exhibits critical behavior upon temperature variation, shown in fig. 2 in the absence of a field. fig. 2: concurrence $c(\rho)$ versus temperature for $j_{aa} = 1$, $\alpha = 0.025$ and $h = 0$. the system is entangled at relatively low temperatures, below a threshold $t_{th}$. this effect occurs because the ising-type interaction is replaced by the effective field $\gamma_a = 2j_{ab}m_b + h$ acting upon the a-spins, which is non-zero even at $h = 0$. thus, the effective field provides a source of entanglement in the absence of a magnetic field. another important observation: the threshold temperature at which $c(\rho)$ becomes zero coincides with the critical temperature $t_c$ at which the spontaneous magnetization $m$ vanishes in the smooth (second order) phase transition between the ordered and disordered phases. expanding the self-consistency relation for $m_a$ into a series near the phase transition point,

$$ m = am + bm^3 + cm^5 + \ldots, \qquad (11) $$

one finds $t_c$ from the condition $a = 1$, $b < 0$ (for the case $j_{aa} = 1$ and $\alpha = 0.025$, $t_c = 0.0102062$). the coincidence of the critical and threshold temperatures for magnetization and concurrence is a consequence of the fact that at $t_c$ the system undergoes the order-disorder phase transition and the second term in $\gamma_a$ also vanishes ($m_b = 0$ when $h = 0$ and $t \ge t_c$). in general, we find a number of other similarities between the magnetic properties and the entanglement of the system. under variation of $h$, the entanglement and magnetic properties show very rich behavior in the low-temperature region. fig. 3 presents a three-dimensional plot of the concurrence as a function of the temperature and the external magnetic field. we will return to some other features in the behavior of $c(\rho)$ when discussing the magnetic and entanglement ground state properties in sec. 4.
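with the effective field $\gamma_a$ fixed, eqs. (9)-(10) give the trimer concurrence in closed form. a minimal python sketch (our own illustration; $\gamma_a$ would normally come from the self-consistency loop sketched above, and $\lambda_{aa} = j_{aa}$ is assumed):

```python
import numpy as np

def concurrence(jaa, ga, t):
    """trimer concurrence from eqs. (9)-(10), given the effective field ga.

    a sketch with lambda_aa = jaa; ga = 2*jab*mb + h should be taken from the
    self-consistent solution of eqs. (6a)-(6b) (see the previous sketch).
    """
    e, la = np.exp, jaa
    u = (1/3) * e((2*ga - 3*la) / (4*t)) * (1 + 3*e(ga/t) + 2*e(1.5*la/t))
    v = (1/3) * e(-3*(2*ga + la) / (4*t)) * (3 + e(ga/t) + 2*e((2*ga + 3*la) / (2*t)))
    y = -(1/3) * e(-(2*ga + 3*la) / (4*t)) * (1 + e(ga/t)) * (e(1.5*la/t) - 1)
    z = e(-3*(2*ga + la) / (4*t)) * (1 + e(ga/t)) * (1 + e(2*ga/t) + 2*e((2*ga + 3*la) / (2*t)))
    return (2/z) * max(abs(y) - np.sqrt(u*v), 0.0)

# with ga = 0 (disordered phase at h = 0) eqs. (9)-(10) give c(rho) = 0 identically,
# while a small non-zero effective field entangles the trimer at low temperature
print(concurrence(1.0, 0.0, 0.05), concurrence(1.0, 0.02, 0.005))
```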
fig. 3: concurrence $c(\rho)$ versus temperature $t$ and external magnetic field $h$ for $j_{aa} = 1$, $\alpha = 0.025$. 4 quantum critical points and phase diagrams now we consider the many-body quantum effects relevant to entanglement properties and discuss some similarities between magnetic (statistical) properties and (quantum) entanglement. as a statistical characteristic, the density distribution of the susceptibility $\chi = \partial m_a/\partial h$, reduced per one a-site, is shown in fig. 4(a) as a function of the coupling constant $j_{aa}$ and the external field $h$, at a relatively high temperature $t = 0.1$, higher than $t_c$. the white stripes, which have a certain finite width due to the nonzero temperature, correspond to the peaks of the susceptibility. a similar density plot is shown for entanglement in fig. 4(b) for the same range of $j_{aa}-h$ parameters. a comparison of these two graphs shows that the general behavior of the statistical properties in $\chi$ resembles the features of the quantum concurrence. fig. 4: density plot of (a) susceptibility $\chi$, (b) concurrence $c(\rho)$ versus $h$ and $j_{aa}$ for $\alpha = 0.025$ and $t = 0.1$. our calculations show that the peaks in the magnetic susceptibility correspond to the phase boundaries on the $j_{aa}-h$ density diagram in the concurrence $c(\rho)$, at which the quantum coherence vanishes. as can be seen, this is true only for the ising-heisenberg model on the tkl lattice with $j_{aa} > 0$ coupling; for $j_{aa} < 0$ coupling the system is non-entangled, $c(\rho) = 0$. thus, the extremal behavior of $\chi$ is not reproduced by the concurrence density there. hence, although the critical behavior of the two characteristics coincides in the antiferromagnetic region, only the (quantum) concurrence can be used as a reference for quantitative analysis of qpts. in addition, we study here the quantum criticality in the ground state phase diagram resulting from the magnetic field variation in the magnetization and entanglement properties of the a-sublattice. fig. 5(a) shows a phase diagram of constant magnetization for the a-sublattice. this diagram differentiates the following phases for $j_{aa} > 0$: phase i corresponds to the spontaneous magnetization $m_a = 1/6$, when spins in the a-sublattice are in one of the available $(\uparrow\uparrow\downarrow)$ configurations; phase ii corresponds to one of the possible configurations $(\downarrow\downarrow\uparrow)$ with $m_a = -1/6$. fig. 5: (a) phase diagram of the a-sublattice for $|\alpha| = 0.025$; (b) density plot of concurrence $c(\rho)$ versus $h$ and $j_{aa}$ for $|\alpha| = 0.025$ at zero temperature. in the ferromagnetic case ($j_{aa} < 0$) we have full spin saturation in regions iii and iv, with the value of the maximum magnetization per atom $m_a = 1/2$ [configuration $(\downarrow\downarrow\downarrow)$] and $m_a = -1/2$ [configuration $(\uparrow\uparrow\uparrow)$], respectively. phase i contains the two-fold degenerate states $|\psi_5\rangle$ and $|\psi_6\rangle$, while phase ii contains the two-fold degenerate states $|\psi_2\rangle$ and $|\psi_3\rangle$, with $c(\rho) = 1/3$. phases iii and iv correspond to the states $|\psi_1\rangle$ and $|\psi_8\rangle$, respectively. these phases are non-entangled ($c(\rho) = 0$) [fig. 5(b)]. the area of non-zero entanglement coincides with phase i+ii, where $|m_a| = 1/6$, while the one with zero entanglement ($c(\rho) = 0$) corresponds to phase iii+iv with $|m_a| = 1/2$. 5 conclusion in this work we have demonstrated correlations between magnetic properties and quantum entanglement in the spin-1/2 ising-heisenberg model on a triangulated kagomé lattice. we adopted a variational mean-field-like treatment (based on the gibbs-bogoliubov inequality) to decouple clusters in effective (interconnected) fields of two types (corresponding to the heisenberg a-trimers and the ising-type b-monomers).
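the susceptibility map of fig. 4(a) only needs a numerical derivative of the self-consistent magnetization. a short sketch (our own illustration; it assumes the magnetizations() helper from the sketch in sec. 2 is in scope):

```python
import numpy as np

def susceptibility(jaa, jab, h, t, dh=1e-6):
    """chi = d m_a / d h per a-site via a central finite difference; assumes
    the magnetizations() helper from the earlier sketch is available."""
    ma_plus, _ = magnetizations(jaa, jab, h + dh, t)
    ma_minus, _ = magnetizations(jaa, jab, h - dh, t)
    return (ma_plus - ma_minus) / (2.0 * dh)

# coarse scan of the jaa-h plane at t = 0.1, as in fig. 4(a)
for jaa in np.linspace(-2.0, 2.0, 5):
    row = [round(susceptibility(jaa, 0.025, hh, 0.1), 3) for hh in np.linspace(0.0, 3.0, 4)]
    print(round(jaa, 2), row)
```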
each of these fields taken separately describes not only the corresponding (a- or b-type) spins, but the system as a whole. we used the concurrence as a computable measure of bipartite entanglement for the trimeric units in terms of the isotropic heisenberg model in the effective magnetic field $\gamma_a$. using the fact that the "a subdivisions" are separable, we studied the entanglement for each of them individually in an effective ising-type field $\gamma_b$. the model exhibits quantum criticality, which can be identified and characterized by studying the behavior of the magnetic and entanglement properties with respect to the interaction, the magnetic field and the temperature that control the transition. it turned out that entanglement does not vanish in zero external field, as happens for the common three-qubit (isotropic) heisenberg model. we find that the temperature at which entanglement becomes zero coincides with the critical temperature of the second order phase transition at which the spontaneous magnetization disappears. in addition, we show that in the antiferromagnetic region (the interactions between trimeric a-sites are exactly of this type) the magnetic susceptibility peaks coincide with the boundary lines at which entanglement vanishes. however, this does not take place in the ferromagnetic case. therefore, one can detect a quite visible correlation of the line boundaries between the phases on the density diagrams for entanglement and magnetization as a signature of the corresponding quantum phase transition. note that a disordered spin liquid state can also exist in the ground state of the frustrated spin system, on the assumption that there is a sufficiently strong antiferromagnetic intra-trimer interaction. finally, the magnetization, magnetic susceptibilities and (quantum) entanglement features can be exploited to signal and understand the quantum critical points and phase transitions. acknowledgement this work was supported by the ps-1981, ps-2033 ansef and ecsp-09-08-sasp nfsat research grants. references [1] diep, h. t. (ed.): frustrated spin systems. singapore: world scientific, 2004; gardner, j. s., et al.: magnetic pyrochlore oxides. rev. mod. phys., 82(1), 2010, p. 53–107; zhitomirsky, m. e., et al.: field induced ordering in highly frustrated antiferromagnets. phys. rev. lett., 85(15), 2000, p. 3269–3272. [2] maruti, s., ter haar, l. w.: magnetic properties of the two-dimensional triangles-in-triangles kagomé lattice cu9x2(cpa)6 (x=f, cl, br). j. appl. phys., 75(10), 1994, p. 5949–5952; ateca, s., et al.: the two-dimensional triangles-in-triangles kagomé antiferromagnet cu9x2(cpa)6: a topological spin-glass?. j. magn. magn. mater., 147(3), 1995, p. 398–400; ramirez, a. p.: strongly geometrically frustrated magnets. annu. rev. mater. sci., 24, 1994, p. 453–480. [3] strečka, j., et al.: exact solution of the geometrically frustrated spin-1/2 ising-heisenberg model on the triangulated kagome (triangles-in-triangles) lattice. phys. rev. b, 78(2), 2008, 024427; ryo, n., et al.: a theoretical study of magnons for heisenberg model in triangulated kagomé lattice-characteristic dispersionless modes. j. phys. soc. jpn., 66, 1997, p. 3687–3688. [4] strečka, j.: magnetic properties of the geometrically frustrated spin-1/2 heisenberg model on the triangulated kagomé lattice. j. magn. magn. mater., 316(2), 2007, p. e346–e348. [5] yao, d.-x., et al.: xxz and ising spins on the triangular kagome lattice. phys. rev. b, 78(2), 2008, 024428. [6] amico, l., et al.: entanglement in many-body systems. rev. mod.
phys., 80(2), 2008, p. 517–576; amico, l., fazio, r.: entanglement and magnetic order. j. phys. a: math. theor., 42(50), 2009, 504001. [7] larsson, d., johannesson, h.: entanglement scaling in the one-dimensional hubbard model at criticality. phys. rev. lett., 95(19), 2005, 196406; yang, c., et al.: phase transitions and exact ground-state properties of the one-dimensional hubbard model in a magnetic field. j. phys.: condens. matter, 12(33), 7433; alba, v., et al.: entanglement entropy of two disjoint blocks in critical ising models. phys. rev. b, 81(6), 2010, 060411(r). [8] gong, s.-s., gang, s.: thermal entanglement in one-dimensional heisenberg quantum spin chains under magnetic fields. phys. rev. a, 80(1), 2009, 012323; asoudeh, m., karimipour, v.: thermal entanglement of spins in mean-field clusters. phys. rev. a, 73(6), 2006, 062109; canosa, n., et al.: description of thermal entanglement with the static path plus random-phase approximation. phys. rev. a, 76(2), 2007, 022310. [9] vedral, v.: mean-field approximations and multipartite thermal correlations. new j. phys., 6(1), 2004, 22. [10] wootters, w. k.: entanglement of formation of an arbitrary state of two qubits. phys. rev. lett., 80(10), 1998, p. 2245–2248. [11] mekata, m., et al.: magnetic ordering in triangulated kagomé lattice compound cu9cl2(cpa)6·nh2o. j. magn. magn. mater., 177–181, 1998, p. 731–732. [12] wang, x., et al.: thermal entanglement in three-qubit heisenberg models. j. phys. a: math. gen., 34(50), 2001, 11307. n. ananikian, yerevan physics institute, alikhanian br. 2, 0036 yerevan, armenia. l. ananikyan, e-mail: lev.ananikyan@gmail.com, yerevan physics institute, alikhanian br. 2, 0036 yerevan, armenia. l. chakhmakhchyan, yerevan physics institute, alikhanian br. 2, 0036 yerevan, armenia; yerevan state university, a. manoogian 1, 0025 yerevan, armenia. a. kocharian, yerevan physics institute, alikhanian br. 2, 0036 yerevan, armenia; department of physics, california state university, los angeles, ca 90032, usa. investigation into methods for predicting connection temperatures k. anderson, m. gillie abstract the mechanical response of connections in fire is largely based on material strength degradation and the interactions between the various components of the connection. in order to predict connection performance in fire, temperature profiles must initially be established in order to evaluate the material strength degradation over time. this paper examines two current methods for predicting connection temperatures: the percentage method, where connection temperatures are calculated as a percentage of the adjacent beam lower-flange, mid-span temperatures; and the lumped capacitance method, based on the lumped mass of the connection. results from the percentage method do not correlate well with experimental results, whereas the lumped capacitance method shows much better agreement with average connection temperatures. a 3d finite element heat transfer model was also created in abaqus, and showed good correlation with experimental results. keywords: connection, joint, heat transfer, steel, concrete, fire, temperature. 1 introduction structural fire design is, to a large extent, based on single-member tests. due to the nature of these tests, the behaviour of the connections is neglected, suggesting that they do not play a critical role in fire. in support of this theory, connections generally have a lower temperature than the surrounding structure during fires and are usually protected. this assumption of cooler connections is valid, but it does not justify ignoring them in fire design. during both heating and cooling, connections will be subject to conditions, for example large moments and shear forces, which they will not typically have been designed for [1]. the response of connections to these conditions is complex and is largely based on the material strength degradation and the interactions between the various components of the connection. to predict how the behaviour of connections affects global performance in fire, temperature profiles must initially be established in order to evaluate the material strength degradation over time. this paper examines two current methods available for predicting connection temperatures as defined in eurocode 3 [2]. the first of these methods suggests that connection temperatures can be defined as a percentage of the adjacent beam temperature, and the second is based on the lumped mass of material at the connection. a 3d finite element model is also created to predict connection temperatures using the commercial software package abaqus [3].
abaqus uses heat transfer theory to predict connection temperatures over time. these methods are all compared to experimental data, and the validity, accuracy and limitations of each are evaluated. 2 theory 2.1 eurocode percentages method eurocode 3 details two methods for predicting connection temperatures. the first assumes that connection temperatures follow the same general trend as local beam temperatures but are a percentage lower. this method simplifies connections into 2 categories based on whether the connecting beam is less than or equal to 400 mm deep, or greater than that. in the first case, where the top of the connection is adjacent to the concrete slab, connection temperatures are approximated at 88 % of the beam lower flange mid-span temperature at the bottom of the connection, 75 % at mid-height and 62 % at the top. between these points the temperature is assumed to vary linearly. where the connecting beam is more than 400 mm deep, temperatures are calculated as 88 % of the beam temperature at the bottom and mid-height of the connection, tapering to 70 % at the top. connections are cooler at the top for two main reasons. the largest contributor to heating is radiation from hot surfaces such as compartment or furnace walls. where there is no direct line of sight between these walls and the member, in this case the connection, the radiative heating will decrease. this phenomenon is known as shadowing and causes the top of the connection to be cooler than the rest. the provision of a heat sink in the form of a concrete slab will also reduce connection temperatures. 2.2 lumped capacitance connection temperatures can also be predicted using the ratio of material volume to exposed surface area. an average connection temperature, or that of a specific connection component such as a bolt or end plate, can be calculated, assuming the gas temperature-time curve is known. the lumped capacitance method [4] calculates the uniform temperature rise in an unprotected steel member using a series of finite time steps $\Delta t$, as given in eq. (1):

$$ \Delta t_s = \frac{h}{c_s\, w/d}\,(t_f - t_s)\,\Delta t, \qquad (1) $$

where $h$ is the heat transfer coefficient, $t_f$ the gas temperature, $t_s$ the steel temperature, $c_s$ the steel specific heat, $d$ the heated perimeter and $w$ the steel volume per meter length.
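eq. (1) defines an explicit time-marching scheme. a minimal python sketch of that scheme follows (our own illustration: the iso 834 gas curve, the section values and the volumetric reading of $c_s$ are our assumptions, not data from the paper):

```python
import math

def steel_temperature_history(gas_temp, h, c_s, w, d, dt=1.0, t_end=3600.0, t0=20.0):
    """explicit marching of eq. (1): delta_ts = h/(c_s*w/d) * (t_f - t_s) * dt.

    a sketch only; we assume c_s is supplied as a volumetric heat capacity
    [J/m^3K] so the units of eq. (1) close, w is the steel volume per metre
    length [m^3/m] and d is the heated perimeter [m].
    """
    ts, t, history = t0, 0.0, []
    while t <= t_end:
        history.append((t, ts))
        ts += h / (c_s * w / d) * (gas_temp(t) - ts) * dt
        t += dt
    return history

# placeholder gas curve: the iso 834 standard fire (t in seconds)
def iso834(t):
    return 20.0 + 345.0 * math.log10(8.0 * t / 60.0 + 1.0)

# illustrative section values, not data from the paper
history = steel_temperature_history(iso834, h=25.0, c_s=600.0 * 7850.0, w=5.0e-3, d=1.0)
print(history[-1])
```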
when applied to connections, this method must take into consideration the large volume of steel at the connection. the equation has therefore been modified to use the ratio of the volume of steel, $v$, to the heated surface area, $a$, as shown in eq. (2):

$$ \Delta t_s = \frac{h}{c_s\, v/a}\,(t_f - t_s)\,\Delta t. \qquad (2) $$

this method does not account for the concrete slab above the connection and, as the effect of the slab on connection temperatures has not been well researched, the validity of applying this method to connections is uncertain. 2.3 finite element modelling an alternative means of predicting structural temperatures is to carry out a finite-element heat transfer analysis. detailed temperature profiles can be created, and this information can be used as direct input for a structural model. in principle, this method is highly accurate; however, obtaining correct values for all the input parameters is very challenging. the modelling process is outlined below. for the purpose of this paper, the gas temperature-time curve to which a connection is exposed is assumed to be known; the modelling is therefore limited to the convective and radiative heat transfer between the gas and solid, and to the conduction between the connection components. perfect conduction is assumed between the various connection components, such as bolts and bolt holes, and through the welds. convective heat transfer is heating by movement of the hot gases, where the heat flux due to convection, $\dot q_c$, is given by:

$$ \dot q_c = h\,(t_f - t_{su}), \qquad (3) $$

where $t_{su}$ is the surface temperature. the heat transfer coefficient varies with temperature and depends on the hot gas velocity. its value for structural steel is given in eurocode 3 as 25 w/m²k [2]. at the connections, the convective heating will be less, due to lower gas velocities in these areas. however, no practical methods exist for accurately calculating the heat transfer coefficient at connections. this paper has therefore used the eurocode recommended value. this assumption was verified with a sensitivity study. radiative heat transfer is heating directly between the fire source and a structural element, or between one structural element and another. emissivity is used to describe the radiative power of an object and can be defined as the ratio of the radiative power of the object to the radiative power of a black body, where a black body is a perfect emitter and emissivity can never be greater than 1. emissivity varies with temperature and in large building fires is usually the dominant mode of heating. there are a huge number of factors which affect radiative heating, for example the make-up of the air in the room: if the air contains soot particles, the radiation between objects will be lower than in clear air, or if an element becomes charred or sooty, its emissive power will reduce, i.e. less heat will be absorbed by the element. due to the many variables affecting emissivity, predicting radiative heating is extremely difficult: for one structural member, there will be variations not only with temperature but also with factors such as location in the building, fuel type and ventilation conditions. the heat flux due to radiative heating, or the total emissive power of an object, is given in eq. (4):

$$ \dot q_e = \varepsilon\,\sigma\, t_{su}^4, \qquad (4) $$

where $\varepsilon$ is the emissivity and $\sigma$ the stefan-boltzmann constant. the emissivity at a connection will be lower than that of the local beams and columns due to the shadow effect, as discussed in section 2.1. eurocode 3 suggests that for 'shadowed' areas a reduction factor for unprotected steel temperatures can be defined as shown in eq. (5):

$$ k_{sh} = \frac{[a_m/v]_b}{[a_m/v]}, \qquad (5) $$

where $[a_m/v]_b$ is the box-section factor, $[a_m/v]$ the section factor and $a_m$ the surface area of the member per unit length. this method is suggested for beams but has not yet been validated for connections. further research is required to validate this assumption.
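eqs. (3)-(5) are directly computable. the sketch below (our own illustration) evaluates the two boundary heat-flux terms, converting to kelvin for the radiative term, which eq. (4) implies but the text leaves implicit, and the shadow reduction factor of eq. (5) for a plain i-section heated on all four sides (the perimeter formulas and the emissivity value are our assumptions):

```python
SIGMA = 5.67e-8  # stefan-boltzmann constant [W/m^2K^4]

def convective_flux(t_gas, t_surface, h=25.0):
    """eq. (3): q_c = h*(t_f - t_su), with the eurocode 3 value h = 25 W/m^2K."""
    return h * (t_gas - t_surface)

def emissive_power(t_surface_c, emissivity=0.7):
    """eq. (4): q_e = eps*sigma*T_su^4; the emissivity 0.7 is a placeholder,
    to be reduced near connections because of the shadow effect."""
    t_k = t_surface_c + 273.15
    return emissivity * SIGMA * t_k ** 4

def shadow_factor(h_sec, b_sec, t_web):
    """eq. (5): k_sh = [a_m/v]_box / [a_m/v]; the steel volume cancels, leaving
    the ratio of the bounding-box perimeter to the exposed profile perimeter
    (plain i-section exposed on all four sides assumed)."""
    box_perimeter = 2.0 * (h_sec + b_sec)
    profile_perimeter = 2.0 * h_sec + 4.0 * b_sec - 2.0 * t_web
    return box_perimeter / profile_perimeter

print(convective_flux(800.0, 400.0),
      emissive_power(400.0),
      shadow_factor(0.40, 0.18, 0.0086))
```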
3 results two sets of experimental data have been used to investigate the accuracy of these methods for predicting connection temperatures. these are briefly summarised here. (1) manchester university furnace tests carried out in 2008 [5–6], consisting of 4 beams spanning from one column with a concrete slab on top. the steel members were all unprotected. this whole assembly was tested in a furnace where the gas temperatures followed a 60 minute standard fire. connection temperatures were recorded at several locations. cooling was not considered. 4 connection types were used: flush and flexible end plate, fin plate and web cleats. the flush end plate and fin plate have been used for validation in this paper. (2) cardington full scale tests from january 2003 [7]. this was a compartment fire test on the 4th storey of an 8 storey building, where one of the main objectives was to monitor the connection behaviour, including temperature evolution during the heating and cooling phases. the interior beam-to-column connections were flexible end plates. the columns were protected to the underside of the beams whilst the connection remained unprotected. 3.1 eurocode percentages method the eurocode percentages method was used to predict the temperatures of two connections, a flush end plate from the manchester university tests and a flexible end plate from the cardington tests. the results of these are shown in figs. 1 and 2. fig. 1 shows the predicted temperatures at 3 locations on the connection (bottom, mid-height and top) in comparison
this method provides a very simple means of estimating connection temperatures where the only information required is the beam mid-span temperature and depth. however, results show it to be unreliable in both heating and, to a larger extent, in cooling. the implication of this is that the use of this method is inappropriate except for very crude calculations. 3.2 lumped capacitance the lumped capacitance method has been used to predict the average temperature of a fin-plate connection from the manchester university tests and of the flexible end plate used in the cardington tests. the results are shown in fig. 3 and compared to the recorded average connection temperature. for fin-plate, the average temperature is predicted well. however it is noteworthy that this experiment was carried out in a highly controlled environment. the same lumped capacitance method is then used to calculate the average temperature for the flexible end plate. the temperatures predicted are consistently higher than the experimental results by between 30 °c and 90 °c and are therefore conservative. however, there is a good correlation between the predicted and experimental trend. more input data is required for this method than for the percentages method: connection geometry and gas temperature-time curve. despite calculations being basic, results show that good average temperatures are predicted. there are, however, many factors that could affect the results, such as how much of the beam or column is considered to be part of the connection and what effect the concrete slab has on the heating rate. as the effect of the slab on connection temperatures has not been well researched, this assumption may be invalid. also, temperature gradients are present over connections, and mechanical response may vary notably between a connection with one average temperature to that with a temperature profile. therefore an average temperature may not be adequate for detailed calculation purposes. 3.3 finite element modelling a model of the flexible end plate was created in abaqus for the 200 minute fire. it includes a 140 minute cooling phase. in creating the finite element model there were three main areas for consideration: radiative heating, convective heating and the inclusion of a concrete slab. a sensitivity study was carried out to look at these three parameters and examine their effect on results. it was found that varying the heat transfer coefficient, and therefore the level of convective heating, at the connection had a negligible effect on the results. when the concrete slab was included in the model the temperatures of the upper flanges of the beams were affected, but other temperature predictions remained unchanged. based on these results the concrete slab was excluded from further modelling. © czech technical university publishing house http://ctn.cvut.cz/ap/ 73 acta polytechnica vol. 49 no. 1/2009 0 200 400 600 800 1000 0 20 40 60 80 100 120 time (mins) t e m p e ra tu re (° c ) fig. 1: ec % method: flexible end plate 0 200 400 600 800 1000 0 10 20 30 40 50 60 time (mins) t e m p e ra tu re (° c ) fig. 2: ec % method: flush end plate 0 200 400 600 800 1000 0 20 40 60 80 100 time (mins) t e m p e ra tu re (° c ). exper iment al: fin plat e lc met hod: fin plat e exper iment al: end plat e lc met hod: end plat e fig. 
fig. 3: lumped capacitance method for fin plate and flexible end plate (temperature [°c] versus time [mins]; curves show experimental and lc-method results for each connection). the value of emissivity affected the results, and therefore the area near to the connection was assigned a lower emissivity than the rest of the structure. this is based on the shadow effect in this location, as discussed in section 2.1. the known gas temperature over time was input to the model, and heating was assumed on all faces apart from the upper flanges of the beams and the top of the column, where there was contact with the concrete slab. results from the finite element modelling are shown in figs. 4 and 5, and all show close correlation with experimental data. fig. 6 shows that the biggest difference between predicted and experimental temperatures is on the underside of the beam upper flange. this is due to the concrete being excluded from the analysis. for detailed calculations where the exact connection geometry is known, this method provides accurate results. it can be used for all connection types where a detailed knowledge of the response in fire is required. this method, however, is time consuming, both in model creation and in simulation run time. it could not, therefore, be a day-to-day modelling approach. 4 conclusions this paper has investigated three methods for predicting connection temperatures. the eurocode suggests connection temperatures can be calculated as percentages of the mid-span beam flange temperature. however, results show this method to be unreliable, and it should therefore be used with caution. the lumped capacitance method, based on the heated surface area of the connection and its volume, showed good correlation with average connection temperatures. more work should be done to look at predicting temperatures of individual connection elements and to define what volume of the connection beams and columns should be included in calculations. the abaqus modelling also showed good correlation with experimental results. this method can therefore be recommended if a detailed temperature profile is needed for mechanical analysis. during the modelling it was found that the inclusion of the concrete slab did not affect the predicted temperatures of the connection. therefore it does not have to be included, allowing for much quicker computational times. a detailed yet simple method for predicting connection temperatures is still unavailable, and therefore more work is required in this field. acknowledgment the authors wish to thank arup fire and epsrc for funding this research. they also thank dr. y. wang and dr. x. dai of the university of manchester for allowing access to experimental results in advance of publication. references [1] bailey, c. g., lennon, t., moore, d. b.: the behaviour of full-scale steel framed buildings subjected to compartment fires. the structural engineer, april 1999, p. 15–21. [2] cen, ec 3: design of steel structures part 1.2: general rules – structural fire design, bs en 1993-1-2:2005, brussels: cen, european committee for standardisation, 2006b. [3] abaqus user's manual, version 6.6, providence, ri, usa. [4] incropera, f. p., dewitt, d. p., bergman, t. l., lavine, a. s.: fundamentals of heat and mass transfer. john wiley & sons, hoboken, usa, sixth edition, 2005. [5] dai, x. h., wang, y. c., bailey, c. g.: temperature distributions in unprotected steel connections in fire. proc. steel & composite structures, manchester, uk, 2007, p. 535–540. [6] dai, x. h., wang, y. c., bailey, c.
g.: effects of partial fire protection on temperature developments in steel joints protected by intumescent coating. fire safety j., jan. 2009, p. 16–32. fig. 4: finite element for flexible end plate: top of connection (temperature [°c] versus time [mins], experimental versus fe modelling). fig. 5: finite element for flexible end plate: bottom of connection (temperature [°c] versus time [mins], experimental versus fe modelling). fig. 6: finite element for flexible end plate: top flange of beam (temperature [°c] versus time [mins], experimental versus fe modelling). [7] lennon, t., moore, d. b.: results and observations from full-scale fire test at bre cardington, 16 january 2003. bre, watford, uk, 2004. kate anderson, e-mail: s0091706@sms.ed.ac.uk; martin gillie; university of edinburgh, school of engineering and electronics, edinburgh, united kingdom. a finite liouville dress for c < 1 boundary degenerate matter p. furlan, v. b. petkova, m. stanishkov abstract we review the derivation of a general formula for the liouville dressing factor in the boundary 3-point tachyon correlator with c < 1 degenerate matter. keywords: non-critical string, tachyon correlators, boundary conditions. 1 introduction the simplest example of a non-critical string theory is 2d liouville gravity induced by $c_m < 1$ matter [1]. it combines two virasoro theories with central charges parametrised by a generically real number $b$, $c_m = 13 - 6(b^2 + 1/b^2) < 1$ and $c_l = 26 - c_m > 25$, so that when we add a pair of reparametrisation ghosts of central charge $-26$ the total conformal anomaly vanishes. the d-brane dynamics in an open non-critical string is determined by the boundary correlation functions (numbers) of the physical fields of ghost number one, "massless tachyons"; see e.g. [2, 3, 4, 5, 6, 7] for more recent discussions. the full boundary tachyon field factorises into a matter times a liouville "dressing" vertex operator, producing a similar factorisation of the full 3-point function. in this work we turn our attention to the pure liouville factor in the case where the matter factor corresponds to degenerate virasoro representations. the matter fields are vertex operators of scaling dimension $\delta^m(e) = e(e - 1/b + b)$, labelled by degenerate $c_m < 1$ virasoro representations. this implies that the charges $\{\beta_i\}$ of the dressing liouville boundary vertex operators ${}^{\sigma_{i+1}}b^{\sigma_i}_{\beta_i}$, of scaling dimensions $\delta^l(\beta) = \beta(q - \beta) = 1 - \delta^m(e)$, take the values

$$ \beta_i = b + m_i b - \frac{n_i}{b}, \qquad 2m_i, 2n_i \in \mathbb{z}_{\ge 0}, \qquad (1.1) $$

or their reflected $\beta \to q - \beta$ counterparts ($q = b + 1/b$), so that without loss of generality we shall work with the values in (1.1). the range of the boundary parameters $\sigma_i$ is generically parametrised by the continuous liouville spectrum ($2\sigma - q$ pure imaginary), but also admits continuation to real values. these liouville boundary fields correspond to the fzz branes [8]. the matter factor of the 3-point boundary tachyon correlator is a straightforward generalisation of the factor in the rational $b^2$ case. it is alternatively reproduced by an analytic continuation of a residuum of the integral ponsot-teschner (pt) formula [9] at points corresponding to $c > 25$ degenerate virasoro representations.
the same analytic continuation applies to fusing matrices, which differ from the boundary field crossing matrices (3-point boundary correlators) by a renormalisation of the three boundary vertices. thus the formulae in [9, 10] for the quantum 3j and 6j symbols, designed generically for the continuous c > 25 spectrum, are in a sense universal, since we can reproduce from them the coulomb gas quantities in both c < 1 and c > 25 virasoro regions. however this integral formula is not very explicit, and its main characteristics are not immediately visible when applied to the spectrum of representations (1.1). another alternative is to solve the pentagon equations recursively. the final result is a meromorphic expression in the boundary cosmological parameters, the derivation of which we review here, see [11] for more details. it generalises a special (thermal) case result of [6] and partial results in the microscopic approach in [5]. 2 boundary 3-point liouville constant thematter fusion rules impose restrictionson the values in (1.1), namely all mkij := mi+mj−mk, n k ij = ni+nj−nk are non-negative integers, so that 2m123 = 3∑ i=1 2mi =0mod 2. 84 acta polytechnica vol. 50 no. 3/2010 the 3-point boundary liouville functions that we are interested in are related to the boundary field crossing matrices cσ2,q−β3 [ β2 β1 σ3 σ1 ] = 〈σ1bβ3 σ3bβ2 σ2bβ1 σ1〉 = cσ3,σ2,σ1β3,β2,β1 = s(σ1, β3, σ3)c σ3,σ2,σ1 q−β3,β2,β1, (2.1) where s(σ1, β3, σ3) is the reflection amplitude [8]. the associativity condition for ope of boundary fields, together with the fusion transformation relating the s and t channels, lead to an integral pentagon-like equation for the boundary field 3-point functions∫ dβscσ4,σ3,σ1q−β3,β2,βsc σ3,σ2,σ1 q−βs,β,β1fβs,βt [ β2 β β3 β1 ] = cσ4,σ2,σ1q−β3,βt,β1c σ4,σ3,σ2 q−βt,β2,β , (2.2) where fβs,βt is the fusing matrix computed in [10]. the boundary 3-point functions c σ3,σ2,σ1 β3,β2,β1 are meromorphic with respect to the variables β1, β2, β3 [9], while the fusion coeficients fβs,βt [ β2 β β3 β1 ] are meromorphic in all six variables and invariant under the reflections βi → q − βi. when one of the operators corresponds to a degenerate representation, the fβs,βt and c σ3,σ2,σ1 β3,β2,β1 coeficients develop singularities such that the integral in (2.2) gives rise to a finite sum over representations in accordance with the fusion rules [9]. in particular, for β = −b/2, equation (2.2) becomes (see e.g. [5]): cσ3,β2−t b2 ⎡ ⎣ β2 −b2 σ4 σ2 ⎤ ⎦cσ2=σ3± b2 ,β3 ⎡ ⎣ β2 − t b2 β1 σ4 σ1 ⎤ ⎦ = f+t ⎡ ⎣ β2 −b2 β3 β1 ⎤ ⎦ cσ3± b2 β1− b2 ⎡ ⎣ −b2 β1 σ3 σ1 ⎤ ⎦cσ3,β3 ⎡ ⎣ β2 β1 − b2 σ4 σ1 ⎤ ⎦ + (2.3) f−t ⎡ ⎣ β2 −b2 β3 β1 ⎤ ⎦ cσ3± b2 β1+ b2 ⎡ ⎣ −b2 β1 σ3 σ1 ⎤ ⎦cσ3,β3 ⎡ ⎣ β2 β1 + b2 σ4 σ1 ⎤ ⎦ , t = ±1 where c and f are the appropriate residues of c and f. for t =1 it becomes cσ3,σ2± b 2 ,σ1 β3,β2,β1 = γ(1−2β2b)γ((2β1 − b)b) γ(1− b(β2 + β3 − β1))γ(b(β1 + β3 − β2 − b)) · cσ3,σ2,σ1 β3,β2+ b 2 ,β1− b 2 + (2.4) b2 √ λlγ(1−2β2b)γ(1−2β1b)g ∓ (σ2, β1, σ1) 2sinπb(2β1 − q)γ(1− b(β1 + β2 + β3 − q))γ(1− b(β1 + β2 − β3)) · cσ3,σ2,σ1 β3,β2+ b 2 ,β1+ b 2 , where λl = πμγ(b 2) is the (normalised) cosmological constant and g−(σ2, β1, σ1) = g+(σ2, q − β1, σ1)= −4sinπb ( β1 − σ1 − σ2 + b 2 ) sinπb ( β1 − σ2 + σ1 − b 2 ) . (2.5) there is a second equation with a shift β2 − b/2 on the r.h.s. as well as two dual equations for 1/b → b/2. the derivation of (2.4) is standard and the coeficients in front of the correlators are given by the products of fusing matrix elements and 3-point boundary functions containing a fundamental field. 
the latter are computed by free field coulomb gas methods [8], assuming that for degenerate representations the cardy multiplicity coincides with the verlinde multiplicity. in the case above, this means that the two boundaries of the field σ2bσ1− b2 satisfy σ2 = σ1 ± b/2. 2.1 the simplest correlator we start with the derivation of the simplest correlator with three identical charges equal to b, i.e., the correlator of three cosmological operators, or boundary liouville screening charges. it is reproduced by the second term on the r.h.s. of the equality (2.4) choosing β1 = b 2 = β2, β3 = b. for this choice the equation needs regularisation since the coeficient in front of the correlator becomes divergent. the remaining two correlators are represented 85 acta polytechnica vol. 50 no. 3/2010 as reflections (2.1) with respect to β3 (the l.h.s.) and β2 (the first term on the r.h.s.) of correlators which also diverge, if we assume that they are given by the integral pt formula. indeed they satisfy the charge conservation conditions (q − β3)+β2+β1 = q and β3+(q − β2 − b/2)+(β1 − b/2)= q, respectively, and their residua equal 1/2π (to agree with the normalisation in [9]). thus, in a proper regularisation of (2.4), these two correlators are replaced by the corresponding reflection amplitudes, which appear as the initial data in the equation. we recall their general expression computed in [8], s(σ2, β, σ1) = 2π bγ(1+ 1 b (q −2β))γ(b(q −2β)) g2(σ2, β, σ1), (2.6) g2(σ2, β, σ1) = λ 1 2b(q−2β) l sb(2β − q)∏ s=± sb(β + s(σ2 + σ1 − q))sb(β + s(σ2 − σ1)) , where sb(α)=γb(α)/γb(q − α)= 2sinπb(α − b)sb(α − b) and γb(x) is the double gamma function; sb(b)= b. in the case under consideration here β = b and inserting in (2.4) we reproduce the cyclically symmetric expression proposed in the microscopic approach [5], c σ3,σ2,σ1 b,b,b = 2π √ λl −1 (γ(1− b2))2γ( 1 b2 −1) · g2(σ3, b, σ1)− g2(σ3, b, σ2) g−(σ1, b 2, σ2) = (2.7) 2πλ q−3b 2b l sb( 2 b )(γ(1− b2))2γ( 1 b2 −1) · (c̃1(c2 − c3)+ c̃2(c3 − c1)+ c̃3(c1 − c2) (c2 − c1)(c1 − c3)(c3 − c2) where the boundary cosmological constants ∼ ci and their dual appear, ci =2cosπb(b −2σi), c̃i =2cosπ 1 b ( 1 b −2σi ) . (2.8) similar regularisedversions of (2.4) arise for other values of the charges corresponding to reflections ofcoulomb gas correlators. 2.2 one parameter correlators, cyclic symmetry we shall use eq. (2.4) as a recursion relation, starting from the explicit expression (2.7). let us first introduce some general notation: g(−)(σ2, β, σ1) := sb(−β + σ2 + σ1)sb(q − β + σ2 − σ1)= g− ( σ2, β + b 2 , σ1 ) g(−)(σ2, β + b, σ1). (2.9) for a non-negative integer k and an integer n of parity p(n) denote b(σ2, σ1) (k;p(n)) := g(−)(σ2, −kb2 − n 2b , σ1) g(−)(σ2, b + kb 2 − n 2b , σ1) = (−1)(k+1)(n+1)b(σ1, σ2)(k;p(n)). (2.10) applying (2.9), the ratio (2.10) is expressed as a k +1 order polynomial in {ci} using that for k �=0 g− ( σ2, b 2 − k b 2 + n 2b , σ1 ) g− ( σ2, b 2 + k b 2 + n 2b , σ1 ) = c21 + c 2 2 − c1c2(−1) n2cosπkb2 − (2sinπkb2)2 (2.11) while b(σ2, σ1) (0;p(n)) =(−1)nc2 − c1. similarly, we define the dual b̃(σ2, σ1)(n;p(k)) b̃(σ2, σ1) (n;p(k)) := g(−)(σ2, − n2b − kb 2 , σ1) g(−)(σ2, 1 b + n2b − kb 2 , σ1) = (−1)(k+1)(n+1)b̃(σ1, σ2)(n;p(k)) (2.12) so that the reflection amplitude is expressed as the ratio of polynomials λ 2β2−q 2b l g2(σ2, β2 = b + m2b − n2 b , σ1) sb(2β2 − q) = g(−)(σ2, β2, σ1) g(−)(σ2, q − β2, σ1) = b̃(σ2, σ1) (2n2;p(2m2)) b(σ2, σ1)(2m2;p(2n2)) . (2.13) 86 acta polytechnica vol. 50 no. 
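the cyclically symmetric combination appearing in eq. (2.7) is a simple function of the boundary cosmological constants of eq. (2.8). a small python sketch (our own illustration of just that antisymmetrised ratio; the λ-dependent prefactor and the double sine functions of (2.7) are omitted):

```python
import math

def boundary_constants(sigma, b):
    """eq. (2.8): c = 2*cos(pi*b*(b - 2*sigma)) and its dual with b -> 1/b."""
    c = 2.0 * math.cos(math.pi * b * (b - 2.0 * sigma))
    c_dual = 2.0 * math.cos(math.pi / b * (1.0 / b - 2.0 * sigma))
    return c, c_dual

def symmetric_ratio(sigmas, b):
    """the antisymmetrised ratio entering eq. (2.7); cyclic symmetry is manifest."""
    (c1, d1), (c2, d2), (c3, d3) = (boundary_constants(s, b) for s in sigmas)
    num = d1 * (c2 - c3) + d2 * (c3 - c1) + d3 * (c1 - c2)
    den = (c2 - c1) * (c1 - c3) * (c3 - c2)
    return num / den

# cyclic-invariance check with generic real boundary parameters
s = (0.31, 0.47, 0.62)
print(symmetric_ratio(s, b=0.8), symmetric_ratio((s[1], s[2], s[0]), b=0.8))
```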
3/2010 finally we introduce p2 ≡ p σ3,σ2,σ1β3,β2,β1 := (−1) m312+2m2λ − m3 12 2 l sb((2m1 +1)b)sb((2m2 +1)b) sb(b) · m312∑ p=0 sb((m 3 12 +1)b) sb((p +1)b)sb((m312 +1− p)b) × g2(σ2 + p b 2, β2 − p b 2, σ3) g2(σ2, β2, σ3) (2.14) g2(σ2 − (m312 − p)b2, β1 − (m 3 12 − p)b2, σ1) g2(σ2, β1, σ1) andsimilarly p1 and p3,whichareobtained from(2.14)bycyclicpermutations. thefinite sum(2.14) isproportional to a truncated basic hypergeometric function 4φ3(. . . ;q, q). it can be expanded as a polynomial in the variables {ci} (a special case of askey-wilson polynomials). we begin with the “thermal” case with all ni = 0 in (1.1). we first use such a regularised equation in which the first term on the r.h.s. of (2.4) reduces to a 2-point function in order to obtain recursively the most general correlator with m213 = 0. then using the analog of the general equation (2.4) for shifts of the pair (β3, β2), we obtain c σ3,σ2,σ1 β3,β2,β1 = − λ q−β123 2b l ∏ (β3, β2, β1) b(σ1, σ2)(2m1;0)b(σ2, σ3)(2m2;0)b(σ3, σ1)(2m3;0) f σ3,σ2,σ1 β3,β2,β1 , f σ3,σ2,σ1 β3,β2,β1 = (−1)2m1((−1)2m2c̃2 − c̃3)b(σ3, σ1)(2m3;0)p σ3,σ2,σ1β3,β2,β1 − (−1)2m2((−1)2m3c̃3 − c̃1)b(σ2, σ3)(2m2;0)p σ2,σ1,σ3β2,β1,β3 = (2.15) − ( c̃1b(σ3, σ2) (2m2;0)p σ2,σ1,σ3 β2,β1,β3 + c̃2b(σ1, σ3) (2m3;0)p σ3,σ2,σ1 β3,β2,β1 + c̃3b(σ2, σ1) (2m1;0)p σ1,σ3,σ2 β1,β3,β2 ) where ∏ (β3, β2, β1)= be0(q−β123)γb(2q − β123)γb(q − β123)γb(q − β213)γb(q − β312) sb( 1 b )sb( 2 b )γb(q)γb(q −2β1)γb(q −2β2)γb(q −2β3) . (2.16) in the last equality of (2.15) we have exploited (2.10) and the relation. b(σ3, σ1) (2m3;0)p σ3,σ2,σ1 β3,β2,β1 +cyclic permutations= 0 (2.17) which is equivalent to the cyclic symmetry of the correlator, now explicit in (2.15). symmetry is ensured by the fact that the expression given by the first equality satisfies all the equations related by cyclic permutations. the composition of the reflection of all three fields with the reflection amplitude as in (2.1) and the duality transformation b → 1/b (changing notation mi → ni) gives the correlator in the other thermal case, when all mi =0 in (1.1). in this case the product of b (0;p(2ni)) replaces the denominator in (2.15) and the formula confirms the structure suggested in themicroscopic approach of [5]. the dual polynomial p̃ σ3,σ2,σ1β3,β2,β1 is defined by changing in (2.14) βi → q−βi, b → 1/b, mi → ni. with the help of some identities for the basic hypergeometric functions one reproduces the formula in [6] for the case {mi = 0, ni – integers}. the expression in [6] is however not explicitly symmetric under cyclic permutations, rather this symmetry is checked to hold on examples. 2.3 the general correlator to obtain the liouville correlator defined for general values (1.1), we can either use the dual pentagon equations, or we can start from the correlator with all mi =0. in one of the steps, the pentagon equation (2.4) is regularised again so that the second term on the r.h.s. is given by g2 times a non-trivial coulomb gas liouville correlator. the final result is an expression generalising the first line in (2.15), c σ3,σ2,σ1 β3,β2,β1 = λ q−β123 2b l ∏′ (β3, β2, β1) b(σ1, σ2)(2m1;p(2n1))b(σ2, σ3)(2m2;p(2n2))b(σ3, σ1)(2m3;p(2n3)) × (−1)2m22n1 ( (−1)2m1+2n2b̃(σ2, σ3)(2n2;p(2m2))p̃ σ2,σ1,σ3β2,β1,β3 b(σ3, σ1) (2m3;p(2n3))p σ3,σ2,σ1 β3,β2,β1 − (2.18) (−1)2m2+2n1b̃(σ3, σ1)(2n3;p(2m3))p̃ σ3,σ2,σ1β3,β2,β1 b(σ2, σ3) (2m2;p(2n2))p σ2,σ1,σ3 β2,β1,β3 ) with the prefactor ∏′ (β3, β2, β1)= (−1)m123n123 ∏ (β3, β2, β1)s 3 b( 1 b )sb( 2 b − b) sb( n312+1 b )sb( n123+1 b )sb( n213+1 b )sb( n123+2 b − b) . 
(2.19) 87 acta polytechnica vol. 50 no. 3/2010 here, say, the polynomial p2 is given by the first formula (2.14), where now all βi are given by (1.1), with only the sign in front of (2.14) modified to (−1)m 3 12(1+2n3)+2m32n3+2m2 = (−1)m123(1+2n3)+2m1. let us also write down the expression for one of the dual polynomials p̃1 ≡ p̃ σ2,σ1,σ3β2,β1,β3 (−1)n123(1+2m2)+2n3λ−n 2 13/2 l sb( 2n1+1 b )sb( 2n3+1 b ) sb( 1 b ) n213∑ u=0 sb( n213+1 b ) sb( 1+u b )sb( n213+1−u b ) × (2.20) g2(σ1 + u 2b , q − β1 − u 2b , σ2) g2(σ1, q − β1, σ2) g2(σ1 − n213−u 2b , q − β3 − n213−u 2b σ3) g2(σ1, q − β3, σ3) . the cyclic symmetry of the full correlator is ensured by construction and is equivalent to a relation generalising (2.17), (−1)2n2(2m2+1)b(σ3, σ1)(2m3;p(2n3))p2 +cyclic permutations= 0 (2.21) and its dual with the dual polynomials and mi ↔ ni. in particular, when all mi =0 the dual relation reproduces the cyclic identity satisfied by the first order dual polynomials b̃(σ2, σ3) (0;p(2m2)) = (−1)2m2c̃2 − c̃3, etc., which appear in the numerator in (2.15). the composition of the duality transformation b → 1/b, mi ↔ ni with reflection of all three fields keeps (2.18) invariant. 3 summary and discussion we have obtained the general liouville dressing factor in the tachyon 3-point boundary correlatorwith degenerate c < 1 representations. formula (2.18) represents the liouville correlator as a ratio of polynomials of the boundary cosmological parameters ci, c̃i generalising the partial results in [5, 6]. this solution of the liouville pentagon equations extends to theminimalgravity theorywith rational b2, inwhichcase theremayappear further truncations of the sums. thegeneral3-pointboundarytachyoncorrelator is aproductof (2.18)andthematter3-pointboundary correlator, satisfying a 4-term equation, see [11] for an explicit formula and further discussion. a possible extension of our result would allow us to describe also the 3-point boundary tachyon correlators corresponding to the zz branes. for this purpose, the roles of the matter andliouville spectra and the corresponding correlators are essentially inverted: the coulomb gas liouville correlator for degenerate c > 25 representations describing both the charges and the boundaries should be combinedwith amatter factor obtained by analytic continuation of the solution (2.18). note that the corresponding discrete c < 1 spectrum parametrises the irreducible representations embedded as submodules of the reducible virasoro modules. the analogous characteristics of the c > 25 spectrum (1.1) have been exploited in the construction of the 4-point bulk tachyon correlators [12]. acknowledgement p. furlan acknowledges support from the italian ministry of education, universities and research (miur). v. b. petkova acknowledges hospitality from the service de physique thèorique, cea-saclay, france, and ictp and infn, italy. this research has received some support from the french-bulgarian rila project, contract 3/8-2006. references [1] ginsparg, p., moore, g.: lectures on 2d gravity and 2d string theory (tasi 1992), hep-th/9304011. [2] martinec, e. j.: the annular report on non-critical string theory, hep-th/0305148. [3] seiberg, n., shih, d.: jhep 0402 (2004) 021, hep-th/0312170. [4] kostov, i. k.: nucl. phys. b 689, 3 (2004), hep-th/0312301. [5] kostov, i. k., ponsot, b., serban, d.: nucl. phys. b 683, 309 (2004), hep-th/0307189. [6] alexandrov, s. y., imeroni, e.: nucl.phys. b 731 (2005) 242, hep-th/0504199. [7] basu, a., martinec, e. j.: phys. rev. 
d 72, 106007 (2005), hep-th/0509142. [8] fateev, v., zamolodchikov, a., zamolodchikov, al.: boundary liouville field theory. i: boundary state and boundary two-point function, hep-th/0001012. [9] ponsot, b., teschner, j.: nucl. phys. b 622 (2002) 309, hep-th/0110244. [10] ponsot, b., teschner, j.: liouville bootstrap via harmonic analysis on a noncompact quantum group, hep-th/9911110; comm. math. phys. 224, 3 (2001) 613, math.qa/0007097. [11] furlan, p., petkova, v. b., stanishkov, m.: non-critical string pentagon equations and their solutions, to appear in: j. phys. a, arxiv:0805.0134. [12] belavin, a., zamolodchikov, al.: theor. math. phys. 147 (2006) 729, hep-th/0510214. p. furlan, dipartimento di fisica teorica dell'università di trieste, italy; istituto nazionale di fisica nucleare (infn), sezione di trieste, italy. v. b. petkova, institute for nuclear research and nuclear energy (inrne), bulgarian academy of sciences (bas), bulgaria. m. stanishkov, institute for nuclear research and nuclear energy (inrne), bulgarian academy of sciences (bas), bulgaria. czech participation in international x-ray observatory (ixo) r. hudec, l. pína, v. maršíková, a. inneman, m. skulinová, m. míka abstract here we describe the recent status of czech participation in the ixo (international x-ray observatory) space mission, with emphasis on the development of new technologies and test samples of x-ray mirrors with precise surfaces, based on new materials, and their applications in space. in addition, alternative x-ray optical arrangements are investigated, such as kirkpatrick-baez systems. keywords: x-ray satellites, x-ray telescopes, x-ray optics. 1 introduction the design and development of x-ray optics has a long tradition in the czech republic (e.g. hudec et al., 1991, 1999, 2000, 2001, inneman et al., 1999, 2000). a range of related technologies has been exploited and investigated, including technologies for future large, light-weight x-ray telescopes. future large space x-ray telescopes (such as ixo, considered by esa, or ixo/constellation-x by nasa) require precise, light-weight x-ray optics based on numerous thin reflecting shells. novel approaches and advanced technologies need to be developed and exploited. in this paper, we refer to czech efforts in connection with ixo (now athena), focusing on the results of test x-ray mirror shells produced by glass thermal forming (gtf) and by shaping si wafers. both glass foils and si wafers are commercially available, have excellent surface microroughness of a few 0.1 nm, and low weight (the volume density is 2.5 g·cm⁻³ for glass and 2.3 g·cm⁻³ for si). technologies need to be exploited for shaping these substrates to achieve the required precise x-ray optics geometries without degrading the fine surface microroughness. although glass and, more recently, silicon wafers have been considered the most promising materials for future advanced large-aperture x-ray telescopes, other alternative materials are also worth further study, such as amorphous metals and glassy carbon (marsch et al., 1997). in order to achieve sub-arcsec angular resolutions, the principles of active optics need to be adopted. the international x-ray observatory (ixo) is a new x-ray telescope with joint participation of nasa, the european space agency (esa) and the japan aerospace exploration agency (jaxa). this project supersedes both nasa's constellation-x and esa's xeus mission concepts.
in mid-2008, officials from esa, nasa and jaxa headquarters agreed to conduct a joint study of ixo with a single merged set of top-level science goals. this agreement established the key science measurement requirements (white et al., 2009). the spacecraft configuration for the ixo study is a mission featuring a single large x-ray mirror, an extendible optical bench with a focal length of ~20 m, and a suite of five focal plane instruments. the x-ray instruments under study for the ixo concept include: a wide field imaging detector, a high-spectral-resolution imaging spectrometer (calorimeter), a hard x-ray imaging detector, a grating spectrometer, a high timing resolution spectrometer and a polarimeter. the ixo mission concept was submitted to the u.s. decadal survey committee and to esa's cosmic vision process. note added in proofs: as a consequence of the us decadal survey, on the european side (esa) ixo will be replaced by the new project athena. as athena will also use imaging x-ray optics similar to ixo, the developments described in this paper now refer to athena. 2 czech involvement in ixo-related studies at the moment, the czech participation in ixo concentrates on: (1) participating in defining the scientific goals, justification and project preparation, (2) participating in the design and development of mirror technologies. the first author of this paper was delegated as a member of the ixo telescope working group. in the mirror development, we focus on supporting the esa estec micropore silicon technology design and also on designing and developing alternative background technologies, discussed in greater detail below. 3 the glass foil alternative for ixo glass science and technology has a long tradition in the czech republic. at the same time, glass technology is one of the most promising technologies for producing mirrors for ixo, as the volume density of glass is nearly four times less than the volume density of electroformed nickel layers. glass foils can be used as flats or may be shaped or thermally slumped to achieve the required geometry. thermal forming of glass is not a new technology, and it has been used in various sectors of the glass industry and in glass art, as well as in the production of cherenkov mirrors. however, the application of this technology in x-ray optics requires significantly improved accuracy and minimized errors. as a first step, small glass samples (of various sizes, typically less than 100×100 mm) of various types provided by various manufacturers were used and thermally shaped. the geometry was either flat or curved (cylindrical or parabolic). the project continued with larger samples (up to 300×300 mm) and further profiles. recent efforts have focused on optimizing the relevant parameters of both the glass material and the substrates, as well as the parameters of the slumping process. various approaches have been investigated (figure 1). we note that these are not quite identical to the efforts of other teams (e.g. zhang et al., 2010, ghigo et al., 2010). the glass samples were thermally formed at rigaku, prague, and also at the institute of chemical technology in prague. for large samples (300×300 mm), facilities at the optical development workshop in turnov were used. the strategy is to develop a technology suitable for inexpensive mass production of thin x-ray optics shells, i.e., to avoid expensive mandrels and techniques that are not suitable for mass production or that are too expensive.
numerous glass samples have been shaped and tested in order to find the optimal parameters. the shapes and profiles of both the mandrels and the resulting glass replicas have been carefully measured using metrological devices. the results show that the quality of the thermal glass replica can be significantly improved by optimizing the material and improving the design of the mandrel, by modifying the thermal forming process, as well as by optimizing the temperature (figure 2). after the modifications and improvements, some of them significant, we obtained a resulting deviation of the thermally formed glass foil from the ideal designed profile of less than 1 μm (peak-to-valley value) in the best case. this value is, however, strongly dependent on the exact temperature, so we believe that further improvements are still possible. fig. 1: the three investigated glass thermal forming arrangements: convex and concave mandrels and a double mandrel. fig. 2: an example of the optimization studies performed for glass thermal forming: optimization map for waviness (waviness wa as a function of the forming process parameters, i.e. the duration of the thermal forming process (τ) and the temperature (t); tg means the glass transformation temperature). the fine original microroughness (typically better than 1 nm) of the original float glass foil was found not to be degraded by the thermal forming process. we note that our approach to thermal glass forming is different from the approaches used by other authors. recent efforts have been devoted to optimizing the whole process, using and comparing different forming strategies etc., as the final goal is to further improve the forming accuracy to less than 0.1 μm. for the near future, we plan to continue these efforts together with investigations of computer-controlled forming of glass foils (according to the principles of active optics). 4 the silicon wafer alternative silicon is a relatively light material, and already during the manufacturing process it is lapped and polished (either on one side or on both sides) to very fine smoothness (better than a few 0.1 nm) and thickness homogeneity (of the order of 1 μm). another obvious alternative, recently considered one of the most promising for high-precision x-ray optics for ixo, is the use of x-ray optics based on commercially available silicon wafers manufactured mainly for the purposes of the semiconductor industry. the main advantages of the application of si wafers in space x-ray optics are (i) the volume density, which is more than 4 times lower than that of the electroformed nickel used in the past for galvanoplastic replication of multiply nested x-ray mirrors, and slightly less than that of the alternative approach of glass foils, (ii) the very high thickness homogeneity, typically less than 1 μm over 100 mm, and (iii) the very small surface microroughness, either on one side or on both sides (typically of the order of a few 0.1 nm or even less, e.g. figure 3). silicon wafers were expected to be used in the esa xeus project and are still under consideration for the ixo (now athena) project. the recent baseline optics for the ixo x-ray telescope design is based on x-ray high precision pore optics (x-hpo), a technology currently under development with esa funding (rd-opt, rd-hpo), in view of achieving large effective areas with low mass, reduced telescope length, high stiffness, and a monolithic structure, favoured for handling the thermal environment and for simplifying the alignment process (bavdaz et al.
4 the silicon wafer alternative

silicon is a relatively light material, and already during the manufacturing process it is lapped and polished (either on one side or on both sides) to very fine smoothness (better than a few 0.1 nm) and thickness homogeneity (of the order of 1 μm). another obvious alternative, recently considered one of the most promising for high-precision x-ray optics for ixo, is x-ray optics based on commercially available silicon wafers manufactured mainly for the purposes of the semiconductor industry. the main advantages of si wafers in space x-ray optics are (i) the volume density, which is more than 4 times lower than that of the electroformed nickel used in the past for galvanoplastic replication of multiply nested x-ray mirrors, and slightly lower than that of the alternative glass-foil approach, (ii) very high thickness homogeneity, typically better than 1 μm over 100 mm, and (iii) very small surface microroughness on one or both sides (typically of the order of a few 0.1 nm or even less, e.g. figure 3). silicon wafers were expected to be used in the esa xeus project and are still under consideration for the ixo (now athena) project. the recent baseline optics for the ixo x-ray telescope design is based on x-ray high-precision pore optics (x-hpo), a technology currently under development with esa funding (rd-opt, rd-hpo), in view of achieving large effective areas with low mass, reduced telescope length, high stiffness, and a monolithic structure, favoured for handling the thermal environment and for simplifying the alignment process (bavdaz et al. 2010). in addition, due to the higher packing density and the associated shorter mirrors required, the conical approximation to the wolter-i geometry becomes possible. the x-hpo optics is based on ribbed si wafers stacked together; to achieve the conical approximation, the si wafers are formed by stacking a large number of plates together on a mandrel. the typical size of the si wafers is 10 × 10 cm. there are also alternative x-ray optics arrangements using si wafers. in this paper, we refer to the development of an alternative design of innovative precise x-ray optics based on si wafers. our approach is based on two steps, namely (i) developing dedicated si wafers with properties optimized for use in space x-ray telescopes, and (ii) precisely shaping the wafers to the optical surfaces (figure 4). stacking to achieve nested arrays is performed after the wafers have been shaped. in this approach, multi foil optics (mfo) is thus created from shaped si wafers (figure 5). for more details on mfo, see hudec et al. (2005).

fig. 3: afm measurement results for si wafers

fig. 4: taylor-hobson profilometric measurement of a bent si wafer

fig. 5: multi foil optics (mfo) in the kirkpatrick-baez (k-b) arrangement

this alternative approach does not require the si wafers to have a ribbed surface, so problems with transferring any deviation, stress, and/or inaccuracy from one wafer to the neighbouring plates, or even to the whole stacked assembly, will be avoided. however, suitable technologies for precise stacking of optically formed wafers into a multiple array have to be developed. the si wafers available on the market are designed for use mainly in the semiconductor industry. it is obvious that the requirements of this industry are not the same as the requirements of precise space x-ray optics. si wafers are a monocrystal (single crystal) with some specifics, and this must also be taken into account. moreover, si wafers are fragile, and it is very difficult to bend and/or shape them precisely (for the thicknesses required for x-ray telescopes, i.e. around 0.3–1.0 mm). an exception is thin si wafers below 0.1 mm in thickness; however, these can hardly be used in this type of x-ray optics because of diffraction limits. also, while their thickness homogeneity is mostly perfect, the same is not true for the flatness of commercially available wafers (note that we refer here to the deviation of the upper surface of a free-standing si wafer from an ideal plane, while in the semiconductor community flatness is usually represented by a set of parameters). in order to achieve the very high accuracy required by future large space x-ray telescopes like the esa/nasa/jaxa ixo, now athena by esa, the parameters of the si wafers need to be optimized (for application in x-ray optics) at the production stage. for this purpose we have established and developed a multidisciplinary working group, including specialists from the development department of the si wafer industry, with the goal of designing and manufacturing si wafers with improved parameters (mostly flatness) optimized for application in x-ray telescopes. it should be noted that the manufacture of silicon wafers is a complicated process with numerous technological steps and with many free parameters that can be modified and optimized to achieve optimal performance. this can also be useful for further improving the quality of x-hpo optics.
as we are dealing with high-quality x-ray imaging, the smoothness of the reflecting surface is important. the standard microroughness of commercially available si wafers (we have used the products of on semiconductor, czech republic) is of the order of 0.1 nm, as confirmed by several independent measurements with various techniques, including the atomic force microscope (afm). this is related to the method of chemical polishing used in the manufacture of si wafers. the microroughness of si wafers is superior to the microroughness of glass foils and of most other alternative mirror materials and substrates. the flatness (in the sense of the deviation of the upper surface of a free-standing si wafer from a plane) of commercially available si wafers was, however, found not to be optimal for use in high-quality (of the order of arcsec angular resolution) x-ray optics. most si wafers show deviations from the plane of the order of a few tens of microns. after modifying the technological process during si wafer manufacture, we were able to reduce this value to just a few microns. the thickness homogeneity was also improved. in collaboration with the manufacturer, further steps are planned to improve the flatness (deviation from an ideal plane) and the thickness homogeneity of si wafers. these and further planned improvements introduced at the si wafer manufacture stage can also be applied to other designs of si wafer optics, including x-hpo, and can play a crucial role in the ixo project. the x-ray optics design for ixo (now athena) is based on the wolter 1 arrangement, and hence requires curved surfaces. however, due to the material properties of monocrystalline si, si wafers (except very thin ones) are extremely difficult to shape. it is obvious that we have to overcome this problem in order to achieve the fine accuracy and stability required for future large x-ray telescopes. the final goal is to provide optically shaped si wafers with no or little internal stress. three different alternative technologies for shaping si wafers have been designed and tested to achieve precise optical surfaces. the samples shaped and tested were typically 100 to 150 mm in size and typically 0.6 to 1.3 mm thick, and were bent to either cylindrical or parabolic test surfaces. the development described here is based on a scientific approach, and hence the large number of samples formed with different parameters must be precisely measured and investigated in detail. precise metrology and measurements play an especially crucial role in this type of experiment. the samples of wafers bent with the investigated technologies have been measured by several methods, including taylor-hobson mechanical as well as optical profilometry, optical interferometry (zygo), and afm (atomic force microscope) analyses (figure 3). it has been confirmed that none of these three technologies degrades the intrinsic fine microroughness of the wafer. while the two physical/chemical technologies exploited give peak-to-valley (pv) deviations (of the real surface of the sample compared with the ideal optical surface) of less than 1 to 2 μm over the 150 mm sample length, as preliminary values, the deviations of the first thermally bent sample are larger, of the order of 10 μm. taking into account that the applied temperatures, as well as other parameters, were not optimized for this first sample, we anticipate that the pv (peak-to-valley) value can be further reduced down to the order of 1 μm, and perhaps even below. fine adjustments of the parameters can also further improve the accuracy of the results for the other two techniques.
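the peak-to-valley figures quoted above are simply the spread of the residuals between a measured profile and the ideal optical surface, after the free alignment terms (piston and tilt) have been removed. a minimal sketch of that evaluation, with a hypothetical profilometer trace and a cylindrical reference in the sag approximation:

import numpy as np

def pv_deviation(x, z, R):
    """peak-to-valley residual of a measured profile (x, z) against an ideal
    cylinder of radius R, using the sag approximation z_ideal = x^2 / (2R)."""
    residual = z - x**2 / (2.0 * R)
    # subtract the best-fit line (piston + tilt) before taking the spread
    residual -= np.polyval(np.polyfit(x, residual, 1), x)
    return residual.max() - residual.min()

# hypothetical 150 mm trace of a wafer bent to a 2 km radius,
# with an artificial ~1 μm peak-to-valley figure error superposed
x = np.linspace(-0.075, 0.075, 501)                         # m
z = x**2 / (2.0 * 2000.0) + 0.5e-6 * np.sin(40 * x / 0.075)
print(f"pv = {pv_deviation(x, z, 2000.0) * 1e6:.2f} μm")    # ≈ 1 μm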
5 k-b alternative for ixo

although the wolter systems are generally well known, hans wolter was not the first to propose x-ray imaging systems based on the reflection of x-rays. in fact, the first grazing-incidence system to form a real image was proposed by kirkpatrick and baez in 1948. this system consists of a set of two orthogonal parabolas of translation. the first reflection focuses to a line, which is focused by the second surface to a point. this was necessary to avoid the extreme astigmatism suffered by a single mirror, but it still was not free of geometric aberrations. nevertheless, the system is attractive because it is easy to construct the reflecting surfaces. these surfaces can be produced as flat plates and then mechanically bent to the required curvature. in order to increase the aperture, a number of mirrors can be nested together, but it should be noted that this nesting introduces additional aberrations. this configuration is used mostly in experiments not requiring a large collecting area (solar, laboratory). recently, however, large modules of kb mirrors have also been suggested for stellar x-ray experiments. as mentioned above, si wafers are difficult to shape, especially to small radii. to overcome this difficulty, another x-ray optics arrangement can be considered, namely the kirkpatrick-baez (kb) system. the curvature radii are then much larger, of the order of a few km, while the imaging performance is similar. for the same effective area, however, the focal length of the kb system is about twice as large as the focal length of the wolter system. nevertheless, kb systems represent a promising alternative to the classical wolter systems in future large space x-ray telescopes. a very important factor is the ease (and hence the reduced cost) of constructing highly segmented modules based on multiply nested thin reflecting substrates, in comparison with the wolter design. while e.g. the wolter design for ixo requires the substrates to be precisely formed with curvatures as small as 0.25 m, the alternative kb arrangement uses almost flat or only slightly bent sheets. hence the feasibility of constructing a kb module with the required 5 arcsec fwhm at an affordable cost is higher than for the wolter arrangement. advanced kb telescopes are based on the multi foil optics (mfo) approach (x-ray grazing-incidence imaging optics based on numerous thin reflecting substrates/foils). the distinction between mfo and other optics using packed or nested mirrors is that mfo is based on numerous very thin (typically less than 0.1 mm) substrates. the mfo kb test modules were recently designed and constructed at rigaku innovative technologies europe (rite) in prague, and 2 modules were tested in full-aperture x-ray tests in the test facility of the university of colorado at boulder, with preliminary results of fwhm 26 arcsec for a full stack of 24 standard si plates at 5 kev (figures 6, 7).

fig. 6: left: test k-b modules assembled at rigaku (rite) in prague; right: a k-b module during full-aperture x-ray tests at the university of colorado at boulder

fig. 7: the measurement results of the k-b test module with 24 si wafers, full-aperture tests at 5 kev at the university of colorado at boulder. the estimated fwhm is 26 arcsec
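the practical difference between the two geometries can be made concrete through the sagitta (maximum depth of the arc) that a substrate must be given, h = l²/(8r). a short sketch using the curvatures quoted above (0.25 m for the wolter design, a few km for kb) and a 150 mm substrate:

# sagitta of a circular arc of chord l and radius r: h = l^2 / (8 r)
def sagitta(l, r):
    return l**2 / (8.0 * r)

l = 0.150  # m, a typical substrate length used in this work
print(f"wolter, r = 0.25 m: sag = {sagitta(l, 0.25) * 1e3:.1f} mm")    # ≈ 11 mm
print(f"kb, r = 2 km: sag = {sagitta(l, 2000.0) * 1e6:.2f} μm")        # ≈ 1.4 μm

a plate bent to a kb radius thus departs from a plane by only a micron or so over its full length, which is why nearly flat si wafers are well matched to the kb arrangement.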
6 conclusion

suitable technologies for future large x-ray telescopes require extensive research work. two promising technologies suitable for future large-aperture and fine-resolution x-ray telescopes, such as ixo, have been exploited and investigated in detail, namely glass thermal forming and si wafer bending. in both cases, promising results have been achieved, with peak-to-valley deviations of the final profiles from the ideal profiles being of the order of 1 μm in the best cases, with room for further essential improvements and optimization. in the czech republic, an interdisciplinary team with 10 members is cooperating closely with experienced specialists, including researchers from a large company producing si wafers. si wafers have been successfully bent to the desired geometry by three different techniques. in the best cases, the accuracy achieved for a 150 mm si wafer is 1–2 μm for the deviation from the ideal optical surface. experiments are continuing in an attempt to further improve the forming accuracy.

acknowledgement

we acknowledge the support provided by the grant agency of the academy of sciences of the czech republic, grant iaax 01220701, and by the ministry of education and youth of the czech republic, projects me918, me09028 and me09004. the investigations related to the esa ixo project were supported by esa pecs project no. 98038. m.s. acknowledges support from a junior grant of the grant agency of the czech republic, grant 202/07/p510. we also acknowledge collaboration with drs. j. sik and m. lorenc from on semiconductor czech republic, and with the team of prof. webster cash from the university of colorado at boulder for the x-ray tests of kb modules in their x-ray facility.

references

[1] hudec, r., valníček, b., červencl, j., et al.: spie, 1991, 1343, 162.
[2] hudec, r., pína, l., inneman, a.: spie, 1999, 3766, 62.
[3] hudec, r., pína, l., inneman, a.: spie, 2000, 4012, 422.
[4] hudec, r., inneman, a., pína, l.: lobster-eye: novel x-ray telescopes for the 21st century, new century of x-ray astronomy, asp conf. proc., 2001, 251, 542.
[5] hudec, r., pína, l., inneman, a., et al.: spie, 2005, 5900, 276.
[6] inneman, a., hudec, r., pína, l., gorenstein, p.: spie, 1999, 3766, 72.
[7] inneman, a., hudec, r., pína, l.: spie, 2000, 4138, 94.
[8] kirkpatrick, p., baez, a. v.: j. opt. soc. am. 38, 766 (1948).
[9] marsch, h., et al.: introduction to carbon technologies, university of alicante, 1997.
[10] white, n. e., hornschemeier, a. e.: bulletin of the american astronomical society, 2009, vol. 41, p. 388.
[11] white, n. e., parmar, a., kunieda, h., international x-ray observatory team: bulletin of the american astronomical society, 2009, vol. 41, p. 357.
[12] http://ixo.gsfc.nasa.gov
[13] bavdaz, m., et al.: proceedings of the spie, 2010, vol. 7732, pp. 77321e-77321e-9.
[14] zhang, w. w., et al.: proceedings of the spie, 2010, vol. 7732, pp. 77321g-77321g-8.
[15] ghigo, m., et al.: proceedings of the spie, 2010, vol. 7732, pp. 77320c-77320c-12.

rené hudec
e-mail: r.hudec@asu.cas.cz
astronomical institute, academy of sciences of the czech republic, cz-251 65 ondřejov, czech republic
czech technical university in prague, faculty of electrical engineering, technická 2, cz-166 27 prague, czech republic

ladislav pína
czech technical university in prague, faculty of nuclear engineering, břehová 78/7, cz-110 00 prague, czech republic

veronika maršíková
adolf inneman
rigaku innovative technologies europe, s. r. o.
novodvorská 994, cz-142 21 prague 4, czech republic

michaela skulinová
astronomical institute, academy of sciences of the czech republic, cz-251 65 ondřejov, czech republic

martin míka
institute of chemical technology, technická 5, cz-166 28 prague, czech republic

effect of gas mixture composition on the parameters of an internal combustion engine

andrej chríbik, marián polóni, ján lach

slovak university of technology in bratislava, faculty of mechanical engineering, institute of transport technology and engineering design, department of automobiles, ships and combustion engines, námestie slobody 17, 812 31 bratislava, slovak republic

correspondence to: andrej.chribik@stuba.sk

abstract

this paper deals with the use of the internal combustion piston engine as a drive unit for micro-cogeneration units. the introduction briefly surveys the gas mixture compositions that are useful for the purposes of combustion engines, together with the basic physical and chemical properties relevant to the burning of these gas mixtures. specifically, we discuss low-energy gases (syngases) and mixtures of natural gas with hydrogen. the second section describes the conversion of the lombardini lgw 702 combustion engine that is necessary for these types of combustion gases. before the experimental measurements, a simulation in the lotus engine simulation program was carried out to make a preliminary assessment of the impact on the performance of the internal combustion engine. the last section of the paper presents the experimental results of partial measurements of the performance and emission parameters of an internal combustion engine powered by alternative fuels.

keywords: natural gas, hydrogen, synthesis gas, engine parameters.

1 introduction

in the current development of combustion engines, there is a trend to put emphasis not only on performance and cost parameters, but also on environmental parameters, the aim being to achieve the most acceptable values. the spotlight is therefore also on non-traditional power sources. one of these energy carriers is hydrogen. a mixture of natural gas and hydrogen will above all be used to drive vehicles in urban areas, where there are higher requirements on complying with emission regulations. this mixture combines the advantages of the individual components of the mixture. the current natural gas (h2ng0) is made up mainly of methane (97 % vol.), and has a calorific value of 50 mj·kg⁻¹ and an octane number of 130. a combination of this natural gas with hydrogen to power internal combustion engines can achieve multiple benefits in terms of combustion and emissions:

– limited flammability – as the proportion of hydrogen in the mixture with natural gas increases, the flammability range also increases. a mixture that contains 30 % of hydrogen has a theoretical flammability range of λ = 2.67 to 0.52, while when natural gas is burned λ = 1.92 to 0.57.

– ignition delay time – adding hydrogen to natural gas reduces the ignition delay time; in the case of a 5 % proportion of hydrogen, the delay time is reduced to about one half of the delay when an internal combustion engine operates on natural gas.

– burning rate – increasing the proportion of hydrogen in the gas mixture increases the flame speed, and hence shortens the burning time of the mixture itself, resulting in faster heat release.
– emissions – there is some reduction in the content of carbon (co2, co, chx), since the addition of hydrogen increases the h/c ratio. however, the addition of hydrogen increases the amount of nox, because there is an increase in the combustion temperature.

the second fuel type that will be used for powering internal combustion engines in cogeneration units is called synthesis gas. these gases are derived mainly by the gasification of wastes; the fuel composition depends on the gasifier that is used and on the type of waste. synthesis gases can be divided into two basic groups: the first is synthesis gas with a 30 % proportion of inert gases, and the second is synthesis gas with a 60 % proportion of inert gases. the first group is characterized by the fuel syngas 1, with the following composition: ch4 – 15 %, h2 – 25 %, co – 30 %, co2 – 25 %, n2 – 5 %. a typical fuel in the second group is referred to as syngas 2, and has the following composition: ch4 – 10 %, h2 – 15 %, co – 15 %, co2 – 10 %, n2 – 50 %. in the rest of this paper these fuels (h2ng0, h2ng15 — natural gas with 15 % vol. of hydrogen, syngas 1 and syngas 2) will be used to drive a lombardini lgw 702 internal combustion engine, and the parameters will be evaluated.

2 lombardini lgw 702 combustion engine

our workplace has been applying h2ng gas mixtures (0 to 30 % vol. h2) to drive an lgw 702 internal combustion engine. this is a two-cylinder gas engine with electronically controlled richness of the mixture. the basic characteristics of the lgw 702 engine are summarized in table 1. before the actual application of a gas mixture in the combustion engine, it was necessary to make a simulation analysis to obtain an indicative idea of the behaviour of the internal combustion engine with the given fuel mixture. simulations were performed on a one-zone model of combustion for the lgw 702 engine, in the lotus engine simulation calculation program (les). this program is used for simulating the operation of the internal combustion engine, and it also serves as a valuable source of information on the operational parameters of the lgw 702 internal combustion engine in reaction to the various gas mixtures that will be experimentally verified step by step. the basic parameters required as input data characterizing a fuel in the les program are shown in table 2. figure 1 shows a model of the lgw 702 engine in the les program. this model will serve us later to optimize the combustion engine in terms of performance parameters for various types of fuel blends.
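as a quick cross-check of the fuel data used as input for les (table 2 below), the molar mass and the specific gas constant of the synthesis gases follow directly from the volumetric compositions given above; a minimal python sketch, with component molar masses taken from standard tables:

# mixture molar mass m = sum(x_i * m_i) and specific gas constant r = R/m
R_UNIVERSAL = 8314.0  # j/(kmol*k)
M_COMP = {"ch4": 16.04, "h2": 2.016, "co": 28.01, "co2": 44.01, "n2": 28.01}  # kg/kmol

syngas1 = {"ch4": 0.15, "h2": 0.25, "co": 0.30, "co2": 0.25, "n2": 0.05}
syngas2 = {"ch4": 0.10, "h2": 0.15, "co": 0.15, "co2": 0.10, "n2": 0.50}

for name, comp in [("syngas 1", syngas1), ("syngas 2", syngas2)]:
    m = sum(x * M_COMP[k] for k, x in comp.items())
    print(f"{name}: m = {m:.2f} kg/kmol, r = {R_UNIVERSAL / m:.1f} j/(kg*k)")

this reproduces the tabulated values to within rounding (23.72 kg/kmol and ≈ 350.6 j·kg⁻¹·k⁻¹ for syngas 1; 24.51 kg/kmol and ≈ 339.2 j·kg⁻¹·k⁻¹ for syngas 2).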
table 1: main parameters of the lgw 702 internal combustion engine

principle of operation:                 spark ignition
number of cylinders and their position: 2, in-line
angle of crank throw [°]:               360
displacement [cm³]:                     686
bore [mm] / stroke [mm]:                75 / 77.6
compression ratio [–]:                  12.5 : 1
camshafts:                              double overhead (ohc), driven by a gearing belt
valve timing:                           ivo – 18° btdc, ivc – 34° abdc, evo – 32° bbdc, evc – 32° atdc
preparation of mixture:                 external, in a mixer, with electronic regulation of the air excess ratio, voila plus system
cooling:                                by liquid, with forced circulation, double circuit, controlled with a thermostat, ventilated cooler
lubrication:                            pressure, forced-feed lubrication, with filtration, oil 1.6 l
mass without fillings [kg]:             66

table 2: basic physical-chemical properties of the fuel mixtures

fuel      m [kg·kmol⁻¹]  h/c [–]  ρ [kg·m⁻³]  r [j·kg⁻¹·k⁻¹]  hu [kj·kg⁻¹]  lt [kg·kg⁻¹]
h2ng0     16.64          3.93     0.705       499.7           48 825        17.00
h2ng15    14.45          4.28     0.613       575.4           50 314        17.36
syngas 1  23.72          2.44     0.969       350.5           11 147        3.67
syngas 2  24.52          2.80     1.002       339.1           6 418         2.11

figure 1: virtual model of the lgw 702 engine in the lotus engine simulation program

figure 2: course of the measured and calculated torque $m_t$, and the specific fuel consumption $m_{pe}$, depending on the speed $n$ of the lgw 702 engine for the h2ng0 and h2ng15 mixtures

figure 3: course of the calculated effective efficiency $\eta_e$ and the fuel consumption $m_f$ of the lgw 702 internal combustion engine for various fuels, for a stoichiometric mixture and a full load

figure 4: course of the composition of the exhaust gases, measured behind the catalytic converter, in dependence on the richness of the mixture, for the blends h2ng0 (0 % vol. h2) and h2ng15 (15 % vol. h2) at full load and at an internal combustion engine speed of 1 800 min⁻¹

3 results of the simulations and experiments

figure 2 shows two basic indicators of the internal combustion engine (torque and specific fuel consumption), which serve for comparing the simulation and the experiment. a comparison of the simulations and the experimental measurements showed that the difference between the real engine and the virtual model of the engine was approximately 2 %. the experimental results also showed that the torque of the engine in operation with h2ng15, compared to natural gas (h2ng0), was on average 2 % lower, which on average meant a decrease of 0.8 n·m. the specific fuel consumption was on average 3.3 % lower when the engine was running on the h2ng15 mixture than when running on natural gas (h2ng0). figure 3 shows a simulation of the effective efficiency and the fuel consumption for the four basic fuels listed above. the graph shows that the highest effective efficiency was achieved when the engine operated on natural gas, and it was 30 % in the speed range from 1 800 to 2 000 min⁻¹. by contrast, the lowest effective efficiency was achieved when the engine operated on syngas 2 (60 % inert gases), the efficiency being around 11 % in the speed range from 1 200 to 2 000 min⁻¹. the mass consumption of the fuel as a stoichiometric mixture was highest when the engine operated on the syngas 2 mixture; in comparison with the use of syngas 1, there was a rise in consumption of about 48 %. the mass fuel consumption of the h2ng15 mixture was about 3 % lower than when the engine operated on natural gas (h2ng0).
several previously published results have shown a significant effect of blending hydrogen into natural gas with the aim of improving the emission parameters of internal combustion engines. the reason is that hydrogen blending increases the molar h/c ratio, thus reducing the emissions of chx, co and co2. figure 4 shows the course of the composition of the exhaust gases, depending on the air excess ratio, for the various fuel types (h2ng0 and h2ng15). as can be seen, there is significant production of carbon monoxide co in operation on natural gas (h2ng0) in rich mixtures, where the lack of oxygen prevents perfect oxidation of co to co2. conversely, the production of co in lean mixtures is practically zero. the nature and course of the co yield is the same when the engine is running on the h2ng15 mixture (15 % vol. h2). complete combustion of hydrocarbon fuels produces carbon dioxide co2. it can be seen that the largest proportion of co2 occurs for a slightly lean mixture, with the highest combustion efficiency. from this value, the proportion of co2 decreases, similarly as in the rich mixture, which is caused by a lack of oxygen to complete the conversion to co2. the same situation also arises in the lean mixture, because of the limited amount of carbon in the fuel. when the engine runs on the h2ng15 mixture, there is a 9 % reduction in co2, on average, throughout the course, compared to h2ng0; this is because the proportion of carbon in the fuel decreases. nitrogen oxides nox, in the case of engine operation on natural gas, are formed by the oxidation of the nitrogen and oxygen present in the air; nitrogen and oxygen react with each other at high temperature. as can be seen, the highest growth occurs for a slightly lean mixture, which has the highest combustion temperature with a relative abundance of oxygen. with the gradual transition into leaner regions, the amount of nox decreases with the lower combustion temperature. in the region of rich mixtures, with a lack of oxygen, there is a decrease in nox. the course of nox formation is similar for the combustion of the h2ng15 mixture, but due to the higher combustion temperatures the concentration of nox increases by about 35 % at the maximum nox values. the formation of nox is closely related to the proportion of residual oxygen in the exhaust gas; in operation with the h2ng15 mixture, the formation is lower by about 5 % than in operation with h2ng0 natural gas. as shown in figure 4, an increase in the admixture of hydrogen to natural gas widens the operating range of an internal combustion engine, mainly towards lean mixtures. the addition of 15 % vol. of hydrogen shifts the lean operating limit of the internal combustion engine from around λ = 1.35 to a value of λ = 1.5.
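for orientation, the air excess ratio λ used throughout this discussion is the actual air-to-fuel mass ratio divided by the stoichiometric air requirement lt. a minimal sketch for pure methane, a reasonable proxy for h2ng0 (97 % methane); the flow-rate numbers at the end are made up purely for illustration:

# stoichiometric air requirement of methane: ch4 + 2 o2 -> co2 + 2 h2o,
# with air taken as 23.2 % oxygen by mass
M_CH4, M_O2 = 16.04, 32.00
L_T = (2 * M_O2 / M_CH4) / 0.232        # ≈ 17.2 kg air per kg fuel

def air_excess_ratio(m_air, m_fuel, lt=L_T):
    # lambda > 1 is a lean mixture, lambda < 1 a rich one
    return (m_air / m_fuel) / lt

print(f"lt = {L_T:.1f} kg/kg")                        # close to the 17.00 of table 2
print(f"lambda = {air_excess_ratio(90.0, 5.0):.2f}")  # illustrative flows -> 1.05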
4 conclusion

mixtures of natural gas and hydrogen have a major advantage over other types of fuels for powering internal combustion engines: they produce lower exhaust emissions, particularly of unburned hydrocarbons, carbon monoxide and carbon dioxide. however, adding hydrogen to natural gas increases the amount of nox, which can be partly removed in operating modes with lean mixtures. with an increasing proportion of hydrogen, there is a decrease in the performance parameters of the internal combustion engine. however, the benefits include operating the engine on a lean mixture at partial load, because with added hydrogen the operating range of a combustion engine working with a lean mixture increases. another benefit of our investigation is its contribution to the application of synthesis gas in internal combustion engines. synthesis gases have significant potential for the effective utilization of low-energy gases in co-generation units.

acknowledgement

this work was supported by the slovak research and development agency under contract no. apvv-0270-06 and contract no. apvv-0090-10.

path integrals for (complex) classical and quantum mechanics

r. j. rivers

abstract

an analysis of classical mechanics in a complex extension of phase space shows that a particle in such a space can behave in a way redolent of quantum mechanics; additional dimensions permit 'tunnelling' without recourse to instantons, and time/energy uncertainties exist. in practice, 'classical' particle trajectories with additional degrees of freedom arise in several different formulations of quantum mechanics. in this talk we compare the extended phase space of the closed time-path formalism with that of complex classical mechanics, to suggest that $\hbar$ has a role in our understanding of the latter. however, differences in the way that trajectories are used make a deeper comparison problematical. we conclude with some thoughts on quantisation as dimensional reduction.

keywords: phase space, path integral, complex classical physics.

1 introduction

it has been argued recently, in a series of papers by carl bender and collaborators [1–3], that classical mechanics in a complex extension of phase space has some attributes of quantum mechanics. with these additional dimensions, particles can negotiate otherwise impenetrable classical hills, enabling them to move from one potential well to another as an alternative to quantum tunnelling. however, the discussion has now gone beyond the mere cataloguing of these qualitative similarities, to suggesting that we are seeing behaviour that can approximate the probabilistic results of quantum mechanics quantitatively [3]. although the emphasis has been on mimicking $\hbar$-independent probability ratios, for this procedure to be sensible $\hbar$ must be implicit in the analysis, since the formalism of complex classical mechanics cannot, of itself, distinguish between quantum (action $O(\hbar)$) systems and classical (action $O(\hbar^0)$) systems. in fact, should there be any relationship between complex classical mechanics and quantum mechanics, there is a strong hint as to how this must happen, since the tunnelling complies with time/energy uncertainty relations [1], albeit with no $\hbar$ visible. on the other hand, if we adopt the contrary position of extending (real) classical behaviour to quantum mechanical behaviour, it is well known that there are several different ways in which classical or 'quasi-classical' paths can arise in the formulation of quantum dynamics, for which the role of $\hbar$ is clear.
in particular, we are interested in what we might term a moyal path integral approach, which (in co-ordinate space) reduces to the feynman-vernon closed time-path approach [4] for the evolution of density matrices, familiar in the analysis of decoherence. in fact, it was the use of classical trajectories by the author [5, 6] to approximate the evolution of the quantum density matrix that provoked this talk. in this commentary we shall contrast the quasi-classical paths of this moyal formalism with the classical paths in complex phase-space, in each case the solutions to a constrained hamiltonian system. for reasons that will rapidly become clear, we shall initially restrict ourselves to hermitian hamiltonians, even though part of the original motivation for considering complex phase space was to accommodate pseudo-hermitian hamiltonians which, by virtue of pt-symmetry, have real spectra. there is a huge literature on this field, and we would cite [7–9] as exemplary. in the next two sections we introduce path integrals for real classical mechanics, as developed by gozzi over several years [10–12], and show how the moyal path integrals for quantum mechanical systems evolve naturally from them, as originally proposed by marinov [13] and gozzi [14, 15]. we show that the dynamics of the quasi-classical trajectories that provide the mean-field approximation to this quantum system has the same symplectic structure as the trajectories of complex classical mechanics shown by smilga [18]. insofar as the two approaches have some similarities, we may hope to impute $\hbar$ behaviour to complex classical mechanics. insofar as they have differences, it might be thought that the better the moyal paths represent quantum mechanics, to which they are an identifiable approximation, the more poorly the paths from ccm will do so. in fact, as we shall see, the situation is somewhat more complicated. even prior to papers [13] and [14, 15] there was an extensive literature on the role of classical and quasi-classical paths in quantum mechanics (e.g. [16]) that has been developed since. given the brevity and simplicity of our observations, it is sufficient, in the context of this paper, to cite just [17] and the references therein for interested readers. we conclude with some thoughts on quantisation as dimensional reduction.

2 phase space path integrals for (real) classical mechanics

we restrict ourselves to a single particle, mass $m$, moving in phase space $\varphi = (x, p)$, under the hamiltonian
$$H_{cl}(\varphi) := H_{cl}(x, p) = \frac{p^2}{2m} + V(x).$$
the classical solutions $\varphi_{cl}$ satisfy hamilton's equations
$$\dot\varphi^a - \omega^{ab}\,\partial_b H_{cl} = 0, \qquad (2.1)$$
where $\omega^{ab} = i\sigma_2$ and $\partial_a = \partial/\partial\varphi^a$. classical phase-space densities $\rho_{cl}(\varphi, t)$ evolve as
$$\rho_{cl}(\varphi_f, t_f) = \int d\varphi_i\; K_{cl}(\varphi_f, t_f|\varphi_i, t_i)\,\rho_{cl}(\varphi_i, t_i), \qquad (2.2)$$
where the kernel $K_{cl}(\varphi_f, t_f|\varphi_i, t_i)$ is restricted to the classical paths of (2.1),
$$K_{cl}(\varphi_f, t_f|\varphi_i, t_i) = \int_{\varphi_i}^{\varphi_f} \mathcal{D}\varphi^a\;\delta[\varphi^a - \varphi^a_{cl}] = \int_{\varphi_i}^{\varphi_f} \mathcal{D}\varphi^a\;\delta[\dot\varphi^a - \omega^{ab}\partial_b H_{cl}]. \qquad (2.3)$$
the second equality of (2.3) is a consequence of the incompressibility of phase space.
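to make the index notation concrete (a check added here, not part of the original argument): with $\varphi = (\varphi^1, \varphi^2) = (x, p)$ and
$$\omega^{ab} = i\sigma_2 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix},$$
eq. (2.1) unpacks into the familiar pair
$$\dot x = \partial_p H_{cl} = \frac{p}{m}, \qquad \dot p = -\partial_x H_{cl} = -V'(x),$$
so the delta functional in (2.3) does indeed pin the path integral to the ordinary newtonian trajectories.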
we repeat the earlier analysis of gozzi [10] in doubling phase space through the functional fourier transform of the delta functional:
$$K_{cl}(\varphi_f, t_f|\varphi_i, t_i) = \int_{\varphi_i}^{\varphi_f} \mathcal{D}\varphi^a\,\mathcal{D}\pi_a\;\exp\left\{ i\int_{t_i}^{t_f} dt\;\pi_a(\dot\varphi^a - \omega^{ab}\partial_b H_{cl}) \right\}. \qquad (2.4)$$
if $\langle\cdots\rangle$ denotes averaging with respect to the (normalised) path integral then, on time-splitting, we see that the $\pi_a$ are conjugate to the $\varphi^a$, with equal-time commutation relations
$$\langle[\varphi^a, \pi_b]\rangle = i\,\delta^a_b. \qquad (2.5)$$
that is, in (2.4) we have the path integral realisation of the canonical koopman-von neumann (kvn) hilbert space description of classical mechanics [19, 20]. this becomes clearer if we rewrite $K_{cl}$ as
$$K_{cl}(\varphi_f, t_f|\varphi_i, t_i) = \int_{\varphi_i}^{\varphi_f} \mathcal{D}\varphi^a\,\mathcal{D}\pi_a\;\exp\{ i S_{cl}[\varphi, \pi] \}, \qquad (2.6)$$
where
$$S_{cl}[\varphi, \pi] = \int_{t_i}^{t_f} dt\; L_{cl}(\varphi, \pi) \qquad (2.7)$$
with $L_{cl}(\varphi, \pi) = \dot\varphi^a\pi_a - \mathcal{H}_{cl}(\varphi, \pi)$. the hamiltonian $\mathcal{H}_{cl}(\varphi, \pi)$,
$$\mathcal{H}_{cl}(\varphi, \pi) = \pi_a\,\omega^{ab}\,\partial_b H_{cl}, \qquad (2.8)$$
is no more than the liouville operator (up to a factor of $i$), as follows immediately if we adopt the $\varphi$-representation $\pi_a = -i\partial_a$ in $\mathcal{H}_{cl}$. the commutators with $\mathcal{H}_{cl}(\varphi, \pi)$ (in the sense of (2.5)) determine the evolution of $(\varphi, \pi)$, but it is more convenient to think of these solutions as just the solutions $(\varphi_{cl}, \pi_{cl})$ to $\delta S_{cl} = 0$.

3 phase space path integrals for quantum mechanics

to explore the role of semi-classical paths in quantum mechanics we stay as close to the classical formalism of the previous section as possible. the classical phase-space density $\rho_{cl}(\varphi, t)$ is replaced by the wigner function $\rho_W(\varphi, t)$,
$$\rho_W(x, p, t) = \frac{1}{\pi\hbar}\int dy\;\langle x-y|\hat\rho(t)|x+y\rangle\, e^{2ipy/\hbar}, \qquad (3.9)$$
which, although not strictly a density, reduces to one in the $\hbar \to 0$ limit. its evolution equation
$$\rho_W(\varphi_f, t_f) = \int d\varphi_i\; K_{qu}(\varphi_f, t_f|\varphi_i, t_i)\,\rho_W(\varphi_i, t_i) \qquad (3.10)$$
is determined by the kernel $K_{qu}$, which we define on the same extended phase-space as
$$K_{qu}(\varphi_f, t_f|\varphi_i, t_i) = \int_{\varphi_i}^{\varphi_f} \mathcal{D}\varphi^a\,\mathcal{D}\pi_a\;\exp\{ i S_{qu}[\varphi, \pi] \}, \qquad (3.11)$$
with
$$S_{qu}(\varphi, \pi) = \int_{t_i}^{t_f} dt\; L_{qu}(\varphi, \pi) = \int_{t_i}^{t_f} dt\;(\dot\varphi^a\pi_a - \mathcal{H}_{qu}(\varphi, \pi)). \qquad (3.12)$$
the quantum hamiltonian $\mathcal{H}_{qu}(\varphi, \pi)$ is a planckian finite-difference discretisation of $\mathcal{H}_{cl}(\varphi, \pi)$, to which it reduces as $\hbar \to 0$. several choices are possible. for the reasons given in [14, 15] we follow these authors in taking
$$\mathcal{H}_{qu}(\varphi, \pi) = -\frac{1}{2\hbar}\left[ H_{cl}(\varphi^a + \hbar\omega^{ab}\pi_b) - H_{cl}(\varphi^a - \hbar\omega^{ab}\pi_b) \right]. \qquad (3.13)$$
although we appreciate that the paths themselves can be less important than the ways in which they are put together, what singles out complex classical mechanics (ccm) is the importance attached to individual paths in tracking their times of passage through the important regions of the complexified classical potential landscape. a priori, we take the same stance here, in assuming that $K_{qu}$ is dominated by solutions to
$$\delta S_{qu}[\varphi, \pi] = 0. \qquad (3.14)$$
that is, we treat $S_{qu}[\varphi, \pi]$ as a quasi-classical theory in its own right, which we shall term mean-field quantum mechanics (mfqm). since $S_{qu} = O(\hbar^0)$, (3.14) is a stationary phase approximation with no small parameter, and is therefore to be taken circumspectly. in what follows we compare mfqm to ccm. although they do not match, they have suggestive similarities. to cast $S_{qu}$ in a more familiar form, we reproduce marinov [13] by introducing new phase space variables:
$$\xi^a := \hbar\,\omega^{ab}\pi_b. \qquad (3.15)$$
$K_{qu}$ then takes the integral form
$$K_{qu}(\varphi_f, t_f|\varphi_i, t_i) = \int_{\varphi_i}^{\varphi_f} \mathcal{D}\varphi^a\,\mathcal{D}\xi^a\;\exp\left\{ \frac{i}{\hbar}\, S_M[\varphi, \xi] \right\}, \qquad (3.16)$$
where
$$S_M[\varphi, \xi] = \int_{t_i}^{t_f} dt\; L_M(\varphi, \xi) \qquad (3.17)$$
with
$$L_M(\varphi, \xi) = \dot\varphi^a\,\omega_{ab}\,\xi^b + \frac{1}{2}\left[ H_{cl}(\varphi + \xi) - H_{cl}(\varphi - \xi) \right]. \qquad (3.18)$$
we stress again that the formalism of (3.16) is misleading, in that it suggests that the stationary phase approximation is also a small-$\hbar$ result, whereas $S_M$ is $O(\hbar)$. despite that, there has been considerable work that successfully utilises the stationary-phase solutions (see [17] and the applications cited therein).
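as a consistency check (our addition), taylor-expanding (3.13) in $\hbar$ shows explicitly how the liouvillian (2.8) is recovered:
$$\mathcal{H}_{qu}(\varphi, \pi) = -\frac{1}{2\hbar}\left[ 2\hbar\,\omega^{ab}\pi_b\,\partial_a H_{cl}(\varphi) + O(\hbar^3) \right] = \pi_a\,\omega^{ab}\,\partial_b H_{cl} + O(\hbar^2),$$
the last equality using the antisymmetry of $\omega^{ab}$; the even orders of the taylor series cancel between the two terms of (3.13), so the leading correction is $O(\hbar^2)$.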
3.1 structure of mfqm

let us define $\xi^1 := y$, $\xi^2 := q$ and
$$H^{\pm} := \frac{1}{2}\left[ H_{cl}(\varphi + \xi) \pm H_{cl}(\varphi - \xi) \right] \qquad (3.19)$$
with $\varphi \pm \xi = (x \pm y,\, p \pm q)$. we rearrange the original extended phase-space $(\varphi, \pi)$ into the 4d phase space $X$: $X^1 := x$, $X^2 := y$, $X^3 := p$, $X^4 := q$. on $X$ we introduce the poisson bracket $\{\{\cdot,\cdot\}\}$:
$$\{\{A, B\}\} := \Omega^{ab}\,\partial_a A\,\partial_b B, \qquad (3.20)$$
where
$$\Omega = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix}. \qquad (3.21)$$
hamilton's equations, which reduce to $\delta S_{qu} = 0$, then take the form
$$\dot X^a = \{\{X^a, H^+(X)\}\} = \Omega^{ab}\,\partial_b H^+(X). \qquad (3.22)$$
since $H^+$ and $H^-$ are related by $\partial_a H^- = \gamma^b_{\;a}\,\partial_b H^+$, where
$$\gamma = \begin{pmatrix} \sigma_1 & 0 \\ 0 & \sigma_1 \end{pmatrix}, \qquad (3.23)$$
it follows that
$$\{\{H^-, H^+\}\} = \Omega^{ac}\,\gamma^b_{\;c}\,\partial_a H^+\,\partial_b H^+ = 0, \qquad (3.24)$$
since $\Omega\gamma$ is antisymmetric. $H^- = \text{const.}$ is a first-class constraint upon the effective classical theory. we note that there is an equivalence between $H^+$ and $H^-$ in that, with respect to a slightly different symplectic matrix $\Omega'$, we could equally derive the equations of motion from $H^-$ as
$$\dot X^a = \{\{X^a, H^-(X)\}\}' = \Omega'^{ab}\,\partial_b H^-(X). \qquad (3.25)$$
this is more in accord with the closed time-path formalism (ctp), defined in the extended $(x, y)$ coordinate space, for which $H^-$ is the relevant hamiltonian prior to momentum integration. we shall not pursue this further. the classical limit is straightforward. remember that $y = \hbar\bar y$, $q = \hbar\bar q$, where $\bar y$, $\bar q$ are the $O(\hbar^0)$ kvn variables conjugate to $p$ and $x$. as $\hbar \to 0$ then $y, q \to 0$, as does
$$H^- = O(\hbar) \to 0. \qquad (3.26)$$
at the same time we get the contraction $H^+ \to H_{cl}$, $\Omega \to \omega$.

4 complex classical mechanics (ccm)

we now consider complex phase space, repeating the analysis of smilga [18], taking the complex extension of the real phase-space as
$$x \to z = x + iy, \qquad p \to P = p - iq. \qquad (4.27)$$
the hamiltonian $H_{cl}$ is decomposed as
$$H_{cl}(p, x) \to H_{cl}(P, z) = H_R(P, z) + iH_I(P, z), \qquad (4.28)$$
where
$$H_R = \frac{1}{2}\left[ H_{cl}(x + iy,\, p - iq) + H_{cl}(x - iy,\, p + iq) \right], \qquad (4.29)$$
$$H_I = \frac{1}{2i}\left[ H_{cl}(x + iy,\, p - iq) - H_{cl}(x - iy,\, p + iq) \right]. \qquad (4.30)$$
to see the symplectic structure of ccm we label the phase-space variables $X^a$ as before:
$$X^1 := x, \quad X^2 := y, \quad X^3 := p, \quad X^4 := q, \qquad (4.31)$$
and introduce a poisson bracket $\{\{\cdot,\cdot\}\}$ identical to (3.20):
$$\{\{A, B\}\} := \Omega^{ab}\,\partial_a A\,\partial_b B, \qquad (4.32)$$
for the same $\Omega$. hamilton's equations are now [18]
$$\dot X^a = \{\{X^a, H_R\}\} = \Omega^{ab}\,\partial_b H_R(X). \qquad (4.33)$$
$H_R$ and $H_I$ are related by the cauchy-riemann conditions as
$$\partial_a H_I = \lambda^b_{\;a}\,\partial_b H_R, \qquad (4.34)$$
where
$$\lambda = \begin{pmatrix} -i\sigma_2 & 0 \\ 0 & i\sigma_2 \end{pmatrix}. \qquad (4.35)$$
it follows that
$$\{\{H_I, H_R\}\} = (\Omega\lambda)^{ab}\,\partial_a H_R\,\partial_b H_R = 0, \qquad (4.36)$$
since $\Omega\lambda$ is also antisymmetric. that is, as before we have a constrained system with first-class constraint $H_I = \text{const}$.

5 ccm v. mfqm

there are obvious similarities between the two approaches, a consequence of the identities
$$H^+(x, iy, p, -iq) = H_R(x, y, p, q), \qquad H^-(x, iy, p, -iq) = iH_I(x, y, p, q). \qquad (5.37)$$
since $\{\{y, q\}\} = \{\{iy, -iq\}\}$, we can see why the symplectic structure remains unchanged. the similarity between the two formalisms is very apparent for the simple harmonic oscillator (sho), with hamiltonian $H_{cl} = \frac{1}{2}(p^2 + x^2)$. in the $x$-$y$ plane we find identical solutions for $x$ and $y$ in both ccm and mfqm. these are tilted ellipses,
$$x = a\sin(t + \alpha_1), \qquad y = b\sin(t + \alpha_2),$$
leading to identical $H^-$ and $H_I$,
$$H^- = H_I = ab\,\cos(\alpha_1 - \alpha_2).$$
this suggests a possible role for $\hbar$ in ccm for this and other potentials. we remember that $y = \hbar\bar y$, $q = \hbar\bar q$ in mfqm, which encourages us to take $y = \hbar\bar y$, $q = \hbar\bar q$ in ccm. then:

• ccm can now distinguish between 'large' and 'small' systems. the conventional classical limit is simply understood as the recovery of real phase space.

• with $\mathrm{Im}\,E$ now $O(\hbar)$, the empirical ccm tunnelling observation $\mathrm{Im}\,E\,\delta t \approx \text{const.}$ is understood as the quantum uncertainty relation $\delta E\,\delta t = O(\hbar)$.

• the ccm tunnelling results in [3] are understood as anomalous behaviour in the limit $\mathrm{Im}\,E \to 0$. we now understand this as the familiar small-$\hbar$ behaviour when looking for the persistence of 'quantum' effects.
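returning to the sho, the claimed constancy can be checked directly (a short verification added here): with $H_{cl} = \frac{1}{2}(p^2 + x^2)$, eq. (3.19) gives $H^- = xy + pq$, and on the tilted ellipses, with $p = \dot x$ and $q = \dot y$,
$$H^- = ab\left[ \sin(t + \alpha_1)\sin(t + \alpha_2) + \cos(t + \alpha_1)\cos(t + \alpha_2) \right] = ab\,\cos(\alpha_1 - \alpha_2),$$
a constant of the motion, as the first-class constraint (3.24) requires.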
looking at more general potentials, even in [13] and [14] it was appreciated that the extra dimensions permitted tunnelling without instantons. however, this should not blind us to strong differences between the two formalisms, for which the factors of $i$ are crucial. most importantly, the constant energy surfaces of mfqm are bounded, whereas those of ccm are unbounded. as a result, individual particle trajectories in ccm go to infinite distances in the $x$-$y$ plane (and back) in finite time, whereas those of mfqm are always bounded. to see the effects of this boundedness, it is convenient (with the former) to work with $z_{\pm} = \varphi \pm \xi$, since the $H_{cl}(z_{\pm})$ are individually conserved:
$$\dot z^a_{\pm} = \omega^{ab}\,\frac{\partial H_{cl}(z_{\pm})}{\partial z^b_{\pm}}.$$
we can think of the classical paths for $z_{\pm}$ as the tips of chords of length $O(\hbar)$ whose midpoints are quantum paths, constrained only by boundary conditions. in the lagrangian formalism (on integrating out $p$, $q$) this is the familiar closed time-path approach. for example, consider now 'tunnelling' in a double-well potential with binding energy $E_0$. in ccm we take $H_R = E_n < E_0$ to match the energy of a bound state of the potential, and fix $H_I = \mathrm{Im}\,E = \delta E \neq 0$. if, for example, we then consider paths with starting points $y = 0$, $x = x_0$, the particle flips from well to well in a 'symmetric' way using the additional dimensions [3]. if we now measure the ratio of the times the particle spends in each well as $\delta E \to 0$, we can compare this with the qm results obtained from the wavefunctions as $H_I = \mathrm{Im}\,E \to 0$; see [3] for details. while not compelling, the results are certainly interesting.
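the conservation law (4.36) is easy to verify numerically: for an analytic $H$, the complex hamilton equations $\dot z = \partial H/\partial P$, $\dot P = -\partial H/\partial z$ are equivalent, via cauchy-riemann, to the real flow (4.33), and conserve $H$, hence $H_I = \mathrm{Im}\,E$, exactly. the python sketch below integrates an illustrative double-well oscillator (the potential and initial data are ours, not those of [3]) and records which well $\mathrm{Re}\,z$ sits in:

import numpy as np

V = lambda z: 0.25 * (z**2 - 1.0)**2   # illustrative double well, minima at z = ±1
dV = lambda z: z * (z**2 - 1.0)
H = lambda z, P: 0.5 * P**2 + V(z)

def rk4_step(z, P, dt):
    f = lambda z, P: (P, -dV(z))       # dz/dt = dH/dP, dP/dt = -dH/dz
    k1 = f(z, P)
    k2 = f(z + 0.5*dt*k1[0], P + 0.5*dt*k1[1])
    k3 = f(z + 0.5*dt*k2[0], P + 0.5*dt*k2[1])
    k4 = f(z + dt*k3[0], P + dt*k3[1])
    return (z + dt*(k1[0] + 2*k2[0] + 2*k3[0] + k4[0])/6.0,
            P + dt*(k1[1] + 2*k2[1] + 2*k3[1] + k4[1])/6.0)

z, P = 1.2 + 0.05j, 0.0 + 0.0j         # a small Im z makes H_I = Im E nonzero
E = H(z, P)
wells = []
for _ in range(40000):                  # integrate to t = 20
    z, P = rk4_step(z, P, 5e-4)
    wells.append(z.real > 0)
print("Im E:", E.imag, " drift:", abs(H(z, P).imag - E.imag))  # drift ~ integrator error
print("fraction of time in the right-hand well:", np.mean(wells))

whether and when the orbit migrates between the wells, the 'tunnelling' of [3], can be read off the recorded well fractions as $\mathrm{Im}\,E$ is varied.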
however, the situation is very different in mfqm, because of the boundedness of the constant energy surfaces. we now have
$$H^+ = \frac{1}{2}\left[ H_{cl}(z_+) + H_{cl}(z_-) \right].$$
as shown in [17], to proceed we choose $H^+ = E < E_0$ with $H_{cl}(z_+) > E_0$ and $H_{cl}(z_-) < E_0$ (or vice versa), and fix $H^- = \delta E \neq 0$. that is, one end of the chord is trapped in a well, while the other end is free. the initial conditions mean that the particle flips from well to well 'asymmetrically' using the additional dimensions. for this reason it is not sensible to attempt to measure the relative time the particle spends in each well as $\delta E \to 0$. these trajectories make fundamentally clear a profound difference between ccm and mfqm that lies at the heart of quantum mechanics. for all the qualitative similarities between ccm and quantum mechanics, the defining ingredient of the latter is interference. this is immediately clear from the evolution equation (3.10) for real density matrices, which demands that $K_{qu}$ be real. at the very least, if $(\varphi_{cl}, \xi_{cl})$ are solutions to (3.22) then so are $(\varphi_{cl}, -\xi_{cl})$, and both solutions have to be combined ($z_+ \leftrightarrow z_-$) in (3.11), with the appropriate determinant of small fluctuations around the classical solutions. this is in contrast to the hamiltonian formulation of classical mechanics of section 2, for which there are potentially more observables, i.e. hermitian functions of $\varphi$ and $\pi$, than in the standard approach to (real) classical mechanics. because of the non-commutativity of $\varphi$ and $\pi$, as shown in (2.5), interference looks to be possible, although we know this is not the case. in fact, not all these extra observables are invariant under a set of universal local symmetries which appear once the formalism is extended to differential forms on phase space [21], and because of this they have to be removed. as gozzi has shown, this makes the superposition of states in (real) cm impossible [22]. whether this is equally true for complex classical mechanics is unclear, but we would be surprised if it were not. unfortunately, this makes a direct comparison between the approaches impossible: even in a pragmatic sense, we cannot use the virtues of the one to have implications for the other. in fact, we need more than interference between quasi-classical paths in mfqm, as can be seen from the simple $V(x) = -\frac{1}{2}x^2$ potential. quasi-classical solutions in mfqm show that the particle rebounds for $E < 0$, i.e. there is no 'tunnelling' despite the extra dimension. nonetheless, tunnelling happens in mfqm because of state preparation [23], whereby the tail of the wavefunction crosses the $x$-$p$ separatrix and is stretched to the other side of the potential hill. we conclude with a comment on the first-class constraints $H^- = 0$ and $H_I = 0$ that the formalisms possess. to accommodate them fully requires gauge fixing in the path integrals (or a change of bracket). at our simple level of comparison this is unnecessary, but see smilga [18] for a detailed discussion of gauge-fixing for ccm.

6 quantisation as dimensional reduction

so far we have compared the classical trajectories of particles in complex phase space with the classical trajectories $z_{\pm}$ of the ends of the chords whose midpoints are the stationary-phase solutions that define what we have called mean-field quantum mechanics. the doubling of the degrees of freedom by having to take account of two chord ends has its counterpart in the doubling of degrees of freedom by making phase space complex. since the chords are of length $O(\hbar)$, we recover real classical mechanics from mean-field quantum mechanics trivially and, if the complex phase-space coordinates are, equally, $O(\hbar)$, we recover real classical mechanics from complex classical mechanics at the same time. however, the real significance of the doubling of degrees of freedom in the chords is that these new coordinates are the quantum generalisations of the classical 'momenta' conjugate to the real phase space variables in the kvn hilbert space formalism of classical mechanics. this suggests an entirely different approach to 'deriving' quantum mechanics from classical mechanics, which we sketch below; more details will be presented elsewhere. it relies on the extension of the formalism of section 2 to differential forms by gozzi [10], which we have already alluded to above. for the moment we stay firmly with classical mechanics in real phase space. the rightmost integral of (2.3) contains the jacobian
$$J := \det(\delta^a_b\,\partial_t - \omega^{ac}\,\partial_c\partial_b H), \qquad (6.38)$$
which we have set to unity, as a consequence of the incompressibility of phase space. however, we can equally express it in terms of $2+2$ ghost-field grassmann variables $(c^a, \bar c_a)$ as [10]
$$J = \int \mathcal{D}\bar c_a\,\mathcal{D}c^a\;\exp\left[ -\int dt\;\bar c_a\left( \delta^a_b\,\partial_t - \omega^{ac}\,\partial_c\partial_b H \right) c^b \right]. \qquad (6.39)$$
inserting this in the integrand of the rightmost path integral in (2.3) enables us to write the kernel $K_{cl}$ as
$$K_{cl} = \int \mathcal{D}\varphi^a\,\mathcal{D}\pi_a\,\mathcal{D}\bar c_a\,\mathcal{D}c^a\;\exp\left[ i\int dt\;\bar L_{cl} \right], \qquad (6.40)$$
where we have dropped explicit mention of boundary conditions to simplify the notation.
in (6.40), $L_{cl}$ of (2.8) is replaced by
$$\bar L_{cl} = \pi_a[\dot\varphi^a - \omega^{ab}\partial_b H] + i\bar c_a[\delta^a_b\,\partial_t - \omega^{ac}\,\partial_c\partial_b H]c^b = \pi_a\dot\varphi^a + i\bar c_a\dot c^a - \bar{\mathcal{H}}_{cl}, \qquad (6.41)$$
where the hamiltonian associated with $\bar L_{cl}$ of (6.41) is now
$$\bar{\mathcal{H}}_{cl} = \pi_a\,\omega^{ab}\,\partial_b H + i\bar c_a\,\omega^{ac}\,\partial_c\partial_b H\,c^b. \qquad (6.42)$$
the equations of motion that follow from $\bar L_{cl}$ show that the $(c^a, \bar c_a)$ are conjugate jacobi variables in the sense of (2.5), and that $\bar{\mathcal{H}}_{cl}$ is the lie derivative of the hamiltonian flow associated with $H_{cl}$ [10]. all that matters for this discussion is that, on introducing grassmann partners $(\theta, \bar\theta)$ to time $t$, we can construct superspace phase space variables $\Phi = (X, P)$:
$$\Phi^a(t, \theta, \bar\theta) := \varphi^a(t) + \theta\,c^a(t) + \bar\theta\,\bar c^a(t) + i\bar\theta\theta\,\omega^{ab}\pi_b. \qquad (6.43)$$
we note that $\bar\theta\theta$ has the dimensions of inverse action. it follows [10] that
$$i\int d\theta\,d\bar\theta\; H_{cl}(\Phi) = \mathcal{H}_{cl}(\varphi, \pi). \qquad (6.44)$$
similarly,
$$i\int dt\,d\theta\,d\bar\theta\; L_{cl}(\Phi) = \int dt\; L_{cl}(\varphi, \pi), \qquad (6.45)$$
where
$$L_{cl} = P\dot X - H_{cl}. \qquad (6.46)$$
the theory possesses brs invariance. we retrieve quantum mechanics from classical mechanics by making the dimensional reduction [24]
$$i\hbar\int dt\,d\theta\,d\bar\theta \;\to\; \int dt \qquad (6.47)$$
in superspace, together with $(X, P) \to (x, p)$ in phase space. see also [25] for the relationship of this approach to 't hooft's derivation of quantum from classical physics [26, 27]. this analysis permits a natural extension to complex phase space, with co-ordinates $X^a$ of (4.31). the first step is to double this already doubled phase space by introducing four conjugate variables $\bar\pi_a$, to be supplemented by (now $4+4$) grassmann variables $(c^a, \bar c_a)$. we then look for brs symmetry in this extended space. a superspace realisation of this extended theory is possible. it is then interesting to look at its dimensional reduction, not yet attempted. if the outcome is quantum mechanics, it will be quantum mechanics defined on complex phase space, i.e. we shall be looking at probability densities in the complex plane [28]. this is to be distinguished from the results of [3], in which the comparison is made between ccm and quantum mechanics in the real plane. however, it will allow for a discussion of pseudo-hermitian hamiltonians, whose quantum mechanics has already been described in detail [7–9] and with which much of the discussion of [18] was concerned.

7 conclusions

our conclusions derived from these explorations of path integrals are somewhat schizophrenic. on the one hand, if we take the behaviour of particles in complex classical mechanics (ccm) as really reflecting attributes of quantum mechanics (qm), then the formal similarities between ccm and mean-field quantum mechanics (mfqm) suggest that the complex dimensions of ccm should be taken as $O(\hbar)$. this resolves several issues with ccm, such as how classical mechanics distinguishes between classical and quantum systems, and how to interpret the uncertainty relations. on the other hand, the differences between ccm and mfqm are at least as important as the similarities. in particular, the boundedness of the energy surfaces of the latter (in comparison to the unbounded nature of those of the former) means that the trajectories are very different, even though each permits tunnelling without instantons and we know that, in many circumstances, mfqm gives a reliable description of quantum mechanical particles. in particular, for mfqm quantum superposition is a necessary ingredient, whereas (for real phase space at least) superposition plays no role in the path integrals of classical mechanics. much of the original work on ccm was concerned with comparing its results to quantum mechanics in the real plane, e.g. [3].
as an alternative, we have raised the possibility of looking for supersymmetric realisations of ccm, with the potential of getting quantum mechanics in complex phase space. this will be pursued elsewhere.

acknowledgement

we acknowledge stimulating discussions with ennio gozzi and thank him for access to unpublished notes on 'tunnelling without instantons' (december, 1995). we also thank daniel hook for empirically demonstrating the nature of 'tunnelling' paths in mfqm.

references

[1] bender, c. m., brody, d. j., hook, d. w.: quantum effects in classical systems having complex energy, j. phys. a 41 (2008), 352003.
[2] bender, c. m., hook, d. w., meisinger, p. n., wang, q.-h.: complex correspondence principle, phys. rev. lett. 104 (2010), 061601.
[3] bender, c. m., hook, d. w.: tunneling in classical mechanics.
[4] feynman, r., vernon, f.: ann. phys. (n. y.) 24 (1963), 118.
[5] lombardo, f. c., mazzitelli, f. d., rivers, r. j.: classical behaviour after a phase transition, phys. lett. b 523 (2001), 317.
[6] lombardo, f. c., rivers, r. j., villar, p. i.: decoherence of domains and defects at phase transitions, phys. lett. b 648 (2007), 64.
[7] bender, c. m., boettcher, s.: real spectra in non-hermitian hamiltonians having pt symmetry, phys. rev. lett. 80 (1998), 5243.
[8] bender, c. m.: introduction to pt-symmetric quantum theory, contemp. phys. 46 (2005), 277.
[9] bender, c. m.: making sense of non-hermitian hamiltonians, rept. prog. phys. 70 (2007), 947.
[10] gozzi, e., reuter, m., thacker, w. d.: hidden brs invariance in classical mechanics. ii, phys. rev. d 40 (1989), 3363.
[11] gozzi, e., reuter, m., thacker, w. d.: symmetries of the classical path integral on a generalized phase-space manifold, phys. rev. d 46 (1992), 757.
[12] gozzi, e., reuter, m.: classical mechanics as a topological field theory, phys. lett. b 240 (1990), 137.
[13] marinov, m. s.: a new type of phase-space path integral, phys. lett. a 153 (1991), 5.
[14] gozzi, e., reuter, m.: quantum-deformed geometry on phase-space, mod. phys. lett. a 8 (1993), 1433.
[15] gozzi, e., reuter, m.: a proposal for a differential calculus in quantum mechanics, int. jour. mod. phys. a 9 (1994), 2191.
[16] gutzwiller, m. c.: chaos in classical and quantum mechanics. new york: springer-verlag, 1990.
[17] dittrich, t., gomez, e. a., pachon, l. a.: semiclassical propagation of wigner functions, j. chem. phys. 132 (2010), 214102.
[18] smilga, a. v.: cryptogauge symmetry and cryptoghosts for crypto-hermitian hamiltonians, j. phys. a 41 (2008), 244026.
[19] koopman, b. o.: hamiltonian systems and transformations in hilbert space, proc. natl. acad. sci. usa 17 (1931), 315.
[20] von neumann, j.: zur operatorenmethode in der klassischen mechanik, ann. math. 33 (1932), 587; ibid. 33 (1932), 789.
[21] deotto, e., gozzi, e., mauro, d.: hilbert space structure in classical mechanics: (ii), j. math. phys. 44 (2003), 5902.
[22] gozzi, e., pagani, c.: universal local symmetries and non-superposition in classical mechanics, phys. rev. lett. 105 (2010), 150604.
[23] balazs, n. l., voros, a.: wigner's function and tunneling, ann. phys. 199 (1990), 123.
[24] abrikosov, a. a. (jr.), gozzi, e., mauro, d.: geometric dequantization, ann. phys. 317 (2005), 24.
[25] blasone, m., jizba, p., kleinert, h.: path integral approach to 't hooft's derivation of quantum from classical physics, phys. rev. a 71 (2005), 052507.
[26] 't hooft, g.: quantum mechanical behaviour in a deterministic model, quant. grav. 13 (1996), 1023.
[27] 't hooft, g.: quantum gravity as a dissipative deterministic system, quant. grav. 16 (1999), 3263.
[28] bender, c. m., hook, d. w., meisinger, p. n., wang, q.-h.: probability density in the complex plane, ann. phys. 325 (2010), 2332.

r. j. rivers
e-mail: r.rivers@imperial.ac.uk
physics department, imperial college london, london sw7 2az, uk

influence of oxidation media on the transport properties of thin oxide layers of zirconium alloys

h. frank

two batches of tubes of zr1nb and of zry-4w were oxidized for 360 days at 425 °c in steam, and for 31 days at 500 °c in air, respectively.
the analysis of the i-v characteristics at constant temperatures up to 220 °c of oxide layers of nearly equal thickness gave an activation energy of 1,3 ev for the grey homogeneous steam samples, and of 0.4 ev for the white surface layer, and of 1.3 ev at temperatures over 140 °c, for the grey bottom layer of the air samples, respectively. the i-v characteristics were sub-linear in the air samples, the current growing less at rising voltages, but staightening to super-linear space-charge limited currents at higher temperatures. the injection currents flowing when voltage was applied did not reach constant equilibrium, but at a bend, continued with a lesser slope. the resistivity was about one order of magnitude greater in air samples and greater in zry-4w. the relative permittivity was greater in the steam samples and greater in zr1nb. the currents of the air samples were greater with au electrodes than with ag electrodes. keywords: zirconium alloys, oxide layers, relative permittivity, sub-linear i-v characteristics, space-charge limited current, zero-voltage, injection and extraction current, power law of current drop, temperature dependence of resistivity, activation energy. nb sn fe cr zr1nb 1 zry-4w 1.3 0.2 0.1 table 1: chemical composition (wt %) of the zircaloys used in this study sample short name/medium t (°c) time (d) thickness (�m) �r colour zr1nb 1744323 zr1nb steam 425 360 33.3 20.5 light grey zr1nb nm034 zr1nb air 500 31 29.8 15.0 white zry-4w 3744323 zry-4w steam 425 360 35.3 17.2 dark grey zry-4w 3890085 zry-4w air 500 31 32.4 11.4 light brown table 2: characterization of the samples different oxidation times were chosen in order to achieve oxide layers of approximately the same thickness of about 30 �m. the chemical composition of the samples is given in table 1, and the sample parameters are shown in table 2. the samples were wrapped in al-foil with circular openings of 6.0 mm diameter and gold of 300 nm thickness on a substrate layer of 30 nm ti was vacuum evaporated. close by a contact of colloidal silver (degussa) was painted on for the purpose of comparison. 3 experimental the samples were mounted in a small thermostat with a maximum temperature of 220 °c. the abraded front ends of the tubes of shining zirconium metal were in direct contact with pressed-on copper electrodes, on which a thermocouple was mounted for temperature control. the current was measured with a two-electrode arrangement, using only one contact for each electrode. the contact resistance between the pressed-on copper electrodes and the metallic bulk zirconium is certainly negligible in comparison with the layer resistance of more than g�, the same being true for the resistance between the metallic surface electrode and the pressed-on 0.3 mm thick phosphorbronze contact spring, 50 mm in length, to minimize thermal loading. a stabilized voltage source was connected to the zr metal. the surface electrode was earthed via a picoampere-meter with 0.1 pa resolution. the voltage drop of the meter was limited to 10 mv and was negligible at source voltages over 1 v. the i-v characteristics were measured at room temperature and at constant higher temperatures up to 220 °c, mostly in steps of about 20 °c. the normally existing asymmetry of the complete i-v characteristics due to a certain rectifying effect was not observed in this case. therefore only the forward voltage branch, with the positive voltage source terminal connected to the zirconium metal, was measured. 
current measurements i = f(t)|_u with a low constant voltage in the ohmic range at a continually rising temperature with a rate of 1 °c/min were used to determine the activation energy by means of the slope of log ρ = f(1/t), the temperature dependence of the resistivity. after being painted on, the silver contact was left to dry in air at room temperature for several hours, and was then slowly heated in the thermostat. the short-circuit current was measured without applying an external voltage. after the first heating, which served to anneal the silver contact, the same procedure was repeated after cooling down to room temperature, and the values of the short-circuit current and open-circuit voltage were used to plot the temperature dependence of the resistivity and to compute the activation energy. the observed short-circuit current is ionic and is due to the continuing oxidation of zirconium in air at elevated temperatures [1]. then the capacity was measured with a tesla bm 498 type capacitance bridge operating at 1000 hz with 0.1 % precision.

the i-v characteristics of homogeneous high-resistivity semiconductors start at low voltages with a linear part obeying ohm's law. when higher voltages are applied, the current rises faster due to the injection of majority carriers building up a space charge. the current develops a space-charge limited additional part. the measured current values can be fitted to a second order polynomial

i = a u² + b u + c. (1)

the zero-current expressed by constant c can be observed at elevated temperatures as a consequence of temperature-activated liberation of trapped electrons and/or continuing oxidation in air. constant b is the slope of the linear part of the characteristic and can be used to compute the resistivity

ρ = a / (w b), (2)

where a is the area of the contact and w is the thickness of the layer. the first term of eq. (1) is the space-charge limited current and obeys child's law [11]

i_sc = (9/8) ε_r ε_0 μ (u²/w³) a = a u², (3)

where ε_r and ε_0 are the relative and the vacuum permittivity, respectively, μ is the mobility of the carriers and u the applied voltage. the transition of the current from the linear to the square part occurs at the voltage u = b/a. further details concerning theoretical aspects are given in [13].

4 results and discussion

4.1 influence of media on oxide growth rate
table 2 shows the astonishing fact that the time to obtain equal thicknesses of the oxide layers is about ten times longer if the sample is oxidized in steam instead of in air. it is possible that in steam, an aqueous environment, the formation of a passive film on the surface as proposed by lee et al. [5] is responsible for the enhanced corrosion resistance. it may also be possible to assume that the diffusing oxygen, which is responsible for the oxide growth, is partially compensated by the hydrogen also present, stemming from the dissociation of the water molecules. given equal oxidation conditions, the oxide layer of zry-4w is about 5–8 % thicker than a layer of zr1nb.

4.2 relative permittivity
the capacity of the samples with silver electrodes was measured after annealing to 180 °c, and the relative permittivity was computed using the known geometric factors. measurements with au electrodes gave comparable results, shown in table 2. the value of 20.5 of the layer grown in steam on zr1nb is near to the normally accepted bulk value, while the value for zry-4w is slightly lower.
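before turning to the air-grown oxides, a minimal python sketch of how the fitting scheme of eqs. (1)–(3) from section 3 can be applied; the data below are synthetic and the geometry merely mirrors the nominal sample dimensions (this is an illustration, not the evaluation code used for this paper):

import numpy as np

# synthetic i-v data (voltage in V, current in A) shaped like eq. (1)
u = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90], dtype=float)
i = 1e-12 * (0.005 * u**2 + 0.5 * u + 2.0)

# fit the second-order polynomial i = a*u**2 + b*u + c (eq. (1));
# np.polyfit returns the coefficients highest degree first
a_fit, b_fit, c_fit = np.polyfit(u, i, 2)

# assumed geometry: 6.0 mm contact opening, ~30 um oxide layer
area = np.pi * (6.0e-3 / 2) ** 2        # contact area (m^2)
w = 30e-6                               # layer thickness (m)

rho = area / (w * b_fit)                # resistivity, eq. (2), in ohm*m
u_trans = b_fit / a_fit                 # linear-to-square transition voltage, u = b/a

# mobility from child's law, eq. (3): a = (9/8)*eps_r*eps_0*mu*area/w**3
eps_0, eps_r = 8.854e-12, 22.0
mu = a_fit * 8 * w**3 / (9 * eps_r * eps_0 * area)

print(f"rho = {rho:.3g} ohm*m, mu = {mu:.3g} m^2/Vs, transition at {u_trans:.1f} V")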
however, the relative permittivity is 27 % and 34 % lower in the oxides formed in air of zr1nb and of zry-4w, respectively.

4.3 time dependence of the electric current
when a voltage is applied, a relatively large injection current i_in flows. this gradually decreases with time, building up a space charge q, until an equilibrium state defined by the ohmic resistance is finally reached. if the voltage is switched off and the sample is short-circuited, an adequate extraction current i_ex = −i_in of opposite polarity flows [13], of the form

i_ex = b t^(−n). (4)

the time integral of the extraction current gives the extracted charge

q_ex = ∫ from t1 to t2 of b t^(−n) dt = b (t2^(1−n) − t1^(1−n)) / (1 − n), (5)

equalling the injected charge, q_in = −q_ex. due to the high resistance the injected charge remains practically unchanged and, as long as the layer is left open-circuited, it will only be discharged after a long time. the injected charge is a linear function of the injection voltage and of the injection time. the specific charge q/u has the dimension of a capacitance and can be expressed as the charge density per unit voltage, being of the order of μf/cm³.

a typical example of the time dependence of the injection current is shown in fig. 1. it can be seen that the current, 1 s after voltage application, is two orders of magnitude higher than half an hour later, when the injection phase is coming to an end, and constant current is expected to be established, as the current now begins to decrease. contrary to expectation, however, there is only a bend in the straight line and the current continues to decrease with a lesser slope, and even after 20 hours there is no end of the current drop. similar behaviour of the other samples is presented in table 3. this influence of the electrode metal was not expected. the ti substrate of the au seems to act as a potential barrier and gives rise to extremely long time delays. therefore the results achieved using the ag contacts were preferred, and were given greater weight. thus the time for reaching the end of the injection current is from 15 to 20 min. for all samples with ag electrodes, which seems reasonable, whereas with au electrodes there are extreme differences. there is a marked decrease in the specific capacitance going from zr1nb steam to zry-4w air, which is in accordance with the trend of decreasing conductivity.

4.4 short-circuit current and open-circuit voltage
measurement of the short-circuit current i0 and the open-circuit voltage u0 as functions of slowly rising temperature showed the current to increase exponentially, while the open-circuit voltage grew linearly, starting over 100 °c. the open-circuit voltage u0 can be measured directly by compensating the short-circuit current with an opposing voltage of the same value but of opposite polarity. as this is rather time consuming, it is better to assess the voltage u0 by computing the intersection point with the x-axis of a straight line having the slope of the resistivity and put through the point of the short-circuit current,

u0 = i0 ρ w / a. (6)

the short-circuit current of zr1nb reached 2500 pa at 210 °c with an open-circuit voltage of 300 mv, the rate being 3 mv/°c. both samples of zry-4w, oxidized in steam and in air, due to higher resistivity, achieved only a current of 5 pa at 240 °c with an open-circuit voltage of only 5 mv.
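as a quick consistency check of eqs. (4)–(6), the following python sketch compares the closed form of the extracted charge with a direct numerical integration and evaluates the open-circuit voltage; all numerical values here are assumed for illustration only (they are not the measured data of this paper), and scipy is assumed to be available:

import numpy as np
from scipy.integrate import quad

# extraction charge, eqs. (4)-(5), with assumed parameters
b_coef, n_exp = 2.0e-10, 0.5           # i_ex = b * t**(-n); b in A*s**n
t1, t2 = 1.0, 1200.0                   # time window (s)
q_closed = b_coef * (t2**(1 - n_exp) - t1**(1 - n_exp)) / (1 - n_exp)
q_numeric, _ = quad(lambda t: b_coef * t**(-n_exp), t1, t2)

# open-circuit voltage, eq. (6): u0 = i0 * rho * w / a
i0 = 2500e-12                          # short-circuit current (A)
rho = 1.0e8                            # assumed resistivity at elevated temperature (ohm*m)
w, area = 33.3e-6, np.pi * (6.0e-3 / 2) ** 2
u0 = i0 * rho * w / area               # comes out near the 300 mV scale quoted above

print(f"q_ex: {q_closed:.3e} C (closed) vs {q_numeric:.3e} C (numeric); u0 = {u0:.3f} V")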
the flowing diffusion current, in accordance with eq. (6), develops u0 as the voltage drop across the resistance of the oxide layer.

4.5 i-v characteristics
the i-v characteristics were measured at constant temperatures, from room temperature up to 220 °c in steps of about 20 °c, and with voltages up to 90 v, mostly in steps of 10 v. the normally existing asymmetry of the complete i-v characteristics due to a certain rectifying effect could be neglected in this case. therefore only the forward voltage branch, with the positive terminal of the voltage source connected to the zr metal of the sample, was measured.

fig. 1: typical example of the time-dependent current drop (log i versus log t; zr1nb air, ag, at 96 v) after application of voltage. after 20 min. no constant current level is reached, but the current continues to decrease with a smaller slope

table 3: comparison of injection currents of the various samples
sample                           zr1nb                     zry-4w
oxidation                        steam       air           steam       air
electrode                        au    ag    au     ag     au    ag    au     ag
exponent (n) of injection        –     0.35  0.76   0.7    –     0.56  0.76   0.47
bend after t (min.)              –     15    115    20     17 h  15    105    17
exponent (n) of continuing drop  –     –     0.26   0.35   –     0.16  –      0.14
capacitance (μf/cm³)             2.6   3.9   1.56   1.9    0.3   0.98  0.15   0.96

the i-v characteristics of the sample zr1nb oxidized in steam, shown in fig. 2, are normal space-charge limited current curves following eq. (1). at room temperature the resistivity is 1.7×10¹⁴ Ωcm, with a mobility of 1.8×10⁻⁹ cm²/vs and a carrier concentration of 2×10¹³ cm⁻³. the specimen of zr1nb grown in air at 500 °c was quite different. the i-v characteristics at low temperatures were sublinear (fig. 3), increasing less with growing voltages, but at higher temperatures gradually straightening and finally, after attaining a straight line (at 90 °c), they developed normal forms of space-charge limited current characteristics (fig. 4). it is interesting to note that the fitted curves still follow eq. (1), but with a changing sign of coefficient a, indicating the curvature of the characteristics. the temperature dependence of coefficients a and b is given in table 4. the temperature dependent resistivity can be computed using coefficient b in eq. (2).

fig. 2: normal i-v characteristics of zr1nb steam (ag; 80, 120 and 150 °c), with space-charge limited currents, observing eq. (1)
fig. 3: abnormal (sub-linear) i-v characteristics of zr1nb air (ag; 23 and 43 °c) at lower temperatures
fig. 4: normal, space-charge limited current characteristics of the sample in fig. 3, at higher temperatures (100, 120 and 139 °c)

table 4: temperature dependence of the coefficients of the polynomial (eq. (1)) in zr1nb air (ag)
t (°c)     23.5      43       62       82       100      120      139
a (pa/v²)  −0.0017   −0.004   −0.0045  −0.0055  +0.0045  +0.0083  +0.076
b (pa/v)   0.458     0.884    1.977    5.246    6.433    7.444    8.354

a similar behaviour was found in the zry-4w samples. the sample oxidized in steam, like zr1nb steam, was very near to a normal form of characteristics, only with an extraordinarily long linear part, and the sample oxidized in air was sub-linear, like its counterpart of zr1nb air. the i-v characteristics of zry-4w steam, shown in fig. 5, were linear up to 90 v and 220 °c, but space-charge limited currents were not observed, although injected and extracted charges of usual magnitude were measured, see table 3. the sub-linear characteristics of zry-4w air are presented in fig. 6, and the complete characteristics at room temperature in fig. 7 show the symmetry of the positive and negative parts, together with the dropping currents at larger voltages. similar complete characteristics of the other samples were measured, showing the symmetrical form without rectifying effects, and justifying the limitation to measurement of the positive branch only.

fig. 5: i-v characteristics of zry-4w steam (au; 23 to 80 °c), linear up to 90 v and 220 °c, without space-charge limited currents, although charges had been injected
fig. 6: abnormal (sub-linear) i-v characteristics of zry-4w air (ag; 41, 62 and 80 °c), similar to those in fig. 3
fig. 7: complete i-v characteristic of zry-4w air, room temperature

4.6 activation energy
we used the results of current measurements at different temperatures and at low voltages near the origin, where only the linear (ohmic) part of the characteristics was used, to compute and plot the temperature dependence of the resistivity in the usual form of log ρ = f(1/t) and to assess the activation energy of the carriers from the slope of the straight line parts. the parallel straight lines from room temperature up to 220 °c for the samples oxidized in steam, as shown in fig. 8, indicate the existence of a single phase of the same kind, which also follows from the same grey colour of the oxides. although the resistivity of zry-4w is about one order of magnitude higher, the activation energies have the same value of 1.3 ev.

fig. 8: comparison of the temperature dependence of the resistivity of samples zr1nb and zry-4w, both oxidized in steam at 425 °c; activation energies 1.30 and 1.27 ev, respectively

the complete i-v characteristics of zr1nb air for both ag and au electrodes at room temperature in fig. 9 are slightly different, but the temperature dependent resistivity in fig. 10 is practically the same for both contact metals. the low activation energy of 0.5 ev, observed between room temperature and 140 °c, increases to 1.3 ev at higher temperatures.

fig. 9: complete sub-linear i-v characteristics typical for samples oxidized in air, but with slightly higher currents of the gold electrodes
this is the same value as found in the samples oxidized in steam, and can be understood as the influence of the grey phase near the metal-oxide interface, whereas near the surface the white phase with 0.5 ev prevails at lower temperatures. a similar behaviour was found in the zry-4w air sample with regard to the course of the temperature dependent resistivity, but with different values obtained with the au and ag contacts. fig. 11 shows the temperature dependence of the resistivity of zry-4w air with both electrodes. the resistivity, using the au electrode, was computed from current measurement at a constant 10 v with continually rising temperature. at temperatures up to 160 °c the influence of the white phase prevails, with a low activation energy of about 0.3 ev, and at higher temperatures the grey modification dominates with 1.3 ev, on average. measurements under the same conditions on the ag electrode gave, at temperatures up to 140 °c, a straight line with the same slope, only higher by a factor of 1.7. at higher temperatures the grey part of the layer was activated with a higher activation energy, which appeared to be slightly different for the au and ag contacts, with a mean value of 1.3 ev. in this sample, unlike zr1nb air, there was a marked influence of the electrode metals. although the activation energies were found to be nearly the same, the temperature for changing over from 0.3 to 1.3 ev was higher for the au contact than for the ag contact, 160 °c and 140 °c, respectively. a comparison of the activation energies is given in table 5. the grey phase, independent of material and oxidation conditions, has an activation energy of 1.3 ± 0.1 ev, whereas the white phase, grown in air at 500 °c, has a lower value of 0.4 ± 0.1 ev.

fig. 10: temperature dependence of zr1nb air, no difference between ag and au contacts
fig. 11: temperature dependence of zry-4w air, with au and ag electrodes

table 5: comparison of the activation energies of the various samples
sample                 zr1nb                  zry-4w
oxidation              steam      air         steam        air
electrode              au   ag    au    ag    au    ag     au    ag
energy (ev), grey      1.3  1.3   –     –     1.21  1.25   –     –
up to 140 °c, white    –    –     0.45  0.54  –     –      0.30  0.30
over 160 °c, grey      –    –     1.34  1.43  –     –      1.43  1.15
average (ev): grey 1.3 ± 0.1, white 0.4 ± 0.1

the oxide films are not homogeneous, but consist of a substoichiometric black oxide layer of high conductivity near the metal-oxide interface, as stated by cox et al. [14], and of an almost stoichiometric white layer of high resistivity. at larger film thickness the high-resistivity layer dominates. fully oxidized layers are of monoclinic structure, whereas substoichiometric black layers with oxygen deficiency can have a tetragonal structure [15]. moreover, part of the layer near the surface can be porous [14], so that painted-on electrode material could enter into the pores and alter the effective thickness of the layers. the relative permittivity computed from capacitance measurements of specimens with different electrode metals often deviated from the accepted value of εr = 22 for bulk zirconium oxide. this may be due partly to the porosity of the layers, to wetting difficulties of the contacts, or to inhomogeneities of the layers [14].
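the activation energies collected in table 5 follow from arrhenius plots like figs. 8, 10 and 11. a minimal python sketch of that slope evaluation, using hypothetical data chosen to resemble the grey-phase behaviour (not the measured values of this paper):

import numpy as np

K_B = 8.617e-5  # boltzmann constant in eV/K

# hypothetical arrhenius data: log10(rho) against 1000/T, as in fig. 8
inv_kk = np.array([2.0, 2.2, 2.4, 2.6, 2.8, 3.0])         # 1000/T (1/K)
log_rho = np.array([9.5, 10.8, 12.1, 13.4, 14.7, 16.0])   # log10(rho / ohm*cm)

# rho ~ exp(E_a / kT)  =>  log10(rho) = E_a / (ln(10) * k) * (1/T) + const.
slope = np.polyfit(inv_kk / 1000.0, log_rho, 1)[0]        # d(log10 rho)/d(1/T), in K
e_a = slope * K_B * np.log(10)                            # activation energy (eV)

print(f"activation energy: {e_a:.2f} eV")   # ~1.3 eV for this synthetic slope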
oxidation in aqueous surroundings leads to the development of hydrogen, which can enter into the layers and even into the bulk zirconium [16].

5 conclusions
the main results can be stated as follows:
(a) in steam at 425 °c a substoichiometric grey oxide phase of high conductivity was formed, with an activation energy of about 1.3 ev.
(b) in air at 500 °c the oxide was inhomogeneous, and at the surface there prevailed an almost stoichiometric whitish phase with high resistivity and a lower activation energy of about 0.4 ev. near the metal-oxide interface the grey phase, with its activation energy of about 1.3 ev, became active at measuring temperatures over 140 °c.
(c) the injection current, flowing when voltage was applied, decreased with time obeying a power law t^(−n), where the exponent was n > 0.5, but did not attain constant equilibrium values. there was a bend at the expected time for equilibrium and the current continued to decrease at a lower slope with n < 0.5.
(d) the inhomogeneous oxide grown in air formed a kind of junction between the grey and white phases, manifesting itself by sublinear i-v characteristics, which even attained decreasing currents at rising voltages. the measuring points could still be fitted to the second order polynomial i = a u² + b u + c, but with negative coefficient a. at higher temperatures the sublinear characteristics flattened out and became super-linear, indicating space-charge limited currents, when the influence of the grey phase with its higher activation energy was dominant.
(e) in zry-4w samples there was a certain difference between au and ag contacts: currents were larger with au contacts, but there was no difference in samples of zr1nb.
(f) the resistivity was about one order of magnitude greater in oxides formed in air.
(g) the relative permittivity was greater in oxides grown in steam and greater in zr1nb.

acknowledgments
the support given to this work by ujp, praha a.s. and by grant msm 6840770015 is highly appreciated. special thanks are due to mrs. v. vrtilkova for providing specimens of measured thickness.

references
[1] frank, h.: j. nucl. mater. vol. 340 (2005), p. 119.
[2] frank, h.: j. nucl. mater. vol. 360 (2007), p. 282.
[3] franklin, d. g., lang, p. m., eucken, c. m., garde, a. m. (eds.): proc. 9th int. symp. nucl. industry, astm, philadelphia, 1991, p. 3, astm stp 1132.
[4] corrosion of zirconium alloys in nuclear power plants, iaea-tecdoc-684, vienna, 1993.
[5] lee, s. j., cho, e. a., ahn, s. j., kwon, h. s.: electrochim. acta, vol. 46 (2001), p. 2605.
[6] cox, b.: j. nucl. mater. vol. 37 (1970), p. 177.
[7] howlader, m. m. r., shiiyama, h., kinoshita, c., kutsuwada, m., inagaki, m.: j. nucl. mater. vol. 253 (1998), p. 149.
[8] inagaki, m., kanno, m., maki, h.: astm stp 1132 (1992), p. 437.
[9] charlesby, a.: acta metall. vol. 1 (1953), p. 348.
[10] hartman, t. e., blair, j. c., bauer, r.: j. appl. phys. vol. 37 (1966), p. 2468.
[11] mott, n. f., gurney, r. w.: electronic processes in ionic crystals, clarendon, oxford, 1940.
[12] gould, r. d.: j. appl. phys. vol. 53 (1982), p. 3353.
[13] frank, h.: j. nucl. mater. vol. 306 (2002), p. 85.
[14] cox, b., wong, y.-m., mostaghimi, j.: j. nucl. mater. vol. 226 (1995), p. 272.
[15] koski, k., holsa, j., juliet, p.: surf. coat. technol. vol. 120–121 (1999), p. 303.
[16] oskarson, m., ahlberg, e., sodervall, u., anderson, u., petterson, k.: j. nucl. mater. vol. 289 (2001), p. 315.

prof. rndr. helmar frank, drsc.
phone: +420 224 358 559, +420 221 912 407
e-mail: frank@km1.fjfi.cvut.cz
department of solid state engineering
czech technical university
faculty of nuclear sciences and physical engineering
trojanova 13, 120 00 prague 2, czech republic

supersymmetry in the quark-diquark and the baryon-meson systems
s. catto, y. choun

abstract
a superalgebra extracted from the jordan algebra of the 27 and 2̄7 dim. representations of the group e6 is shown to be relevant to the description of the quark-antidiquark system. a bilocal baryon-meson field is constructed from two quark-antiquark fields. in the local approximation the hadron field is shown to exhibit supersymmetry, which is then extended to hadronic mother trajectories and inclusion of multiquark states. solving the spin-free hamiltonian with light quark masses we develop a new kind of special function theory generalizing all existing mathematical theories of confluent hypergeometric type. the solution produces extra "hidden" quantum numbers relevant for description of supersymmetry and for generating new mass formulas.
keywords: supersymmetry, relativistic quark model.

a colored supersymmetry scheme based on su(3)c × su(6/1) algebraic realization of supersymmetry and its experimental consequences based on supergroups of type su(6/21) and color algebras based on split octonionic units ui (i = 0, . . . , 3) was considered in our earlier papers [1, 2, 3, 4, 5]. here, we add to them the local color group su(3)c. we could go to a smaller supergroup having su(6) as a subgroup. with the addition of color, such a supergroup is su(3)c × su(6/1). the fundamental representation of su(6/1) is 7-dimensional, and it decomposes into a sextet and a singlet under the spin-flavor group. there is also a 28-dimensional representation of su(7). under the su(6) subgroup it has the decomposition

28 = 21 + 6 + 1. (1)

hence, this supermultiplet can accommodate the bosonic antidiquark and fermionic quark in it, provided we are willing to add another scalar. together with the color symmetry, we are led to consider the (3, 28) representation of su(3) × su(6/1), which consists of an antidiquark, a quark and a color triplet scalar that we shall call a scalar quark. this boson is in some way analogous to the s quarks. the whole multiplet can be represented by an octonionic 7×7 matrix z at point x:

z = u · ( d*       q )
        ( i qᵀ σ2  s ).  (2)

here d* is a 6×6 symmetric matrix representing the antidiquark, q is a 6×1 column matrix, qᵀ is its transpose, and σ2 is the pauli matrix that acts on the spin indices of the quark so that, if q transforms with the 2×2 lorentz matrix l, qᵀ i σ2 transforms with l⁻¹ acting from the right. similarly we have

zc = u* · ( d    i σ2 q* )
          ( q†   s*      )  (3)

to represent the supermultiplet with a diquark and antidiquark. the mesons, exotic mesons and baryons are all in the bilocal field z(1) ⊗ zc(2), which we expand with respect to the center of mass coordinates in order to represent color singlet hadrons by local fields. the color singlets 56+ and 70− will then arise as shown in our earlier papers [1, 2, 3]. now the (d̄q) system belonged to the fundamental representation of the su(6/21) supergroup. but z belongs to the (28) representation of su(6/1), which is not its fundamental representation. are there any fields that belong to the 7-dimensional representation of su(6/1)?
it is possible to introduce such fictitious fields as a 6-dimensional spinor ξ and a scalar a without necessarily assuming their existence as particles. we put ξ =u∗ · ξ, a =u∗ ·a, (4) 77 acta polytechnica vol. 51 no. 1/2011 so that both ξ and a are color antitriplets. let λ = ( ξ a ) , λc = ( ξ̂ a∗ ) (5) where ξ̂ =u · (iσ2ξ∗), a∗ =u ·a∗. (6) consider the 7×7 matrix w = λλc† = ( ξξ̂† ξa aξ̂† 0 ) . (7) w belongs to the 28-dimensional representation of su(6/1) and transforms like z, provided the components of ξ are grassmann numbers and the components of a are even (bosonic) coordinates. the identification of z and w would give s= a×a=0, qα = ξα ×a, d∗αβ = ξα × ξβ (8) a scalar part in w can be generated by multiplying two different (7) representations. the 56+ baryons form the color singlet part of the 84-dimensional representation of su(6/1) while its colored part consists of quarks and diquarks. now consider the octonionic valued quark field qia, where i = 1,2,3 is the color index and a stands for the pair (α, μ) with α = 1,2 being the spin index and μ = 1,2,3 the flavor index. (if we have n flavors, a =1, . . . , n). as before qa = uiq i a =u ·qa (9) similarly the diquark dab which transforms like a color antitriplet is dab = qaqb = qbqa = ijku ∗ kq i aq j b =u ∗ ·dab we note, once again, that because qia are anticommuting fermion operators, dab is symmetric in its two indices. the antiquark and antidiquark are represented by q̄a =u ∗ · q̄a, (10) and d̄ab =u ·d̄ab , (11) respectively. if we have 3 flavors qa has 6 components for each color while dab has 21 components. at this point let us study the system (qa, d̄bc) consisting of a quark and an antidiquark, both color triplets. there are two possibilities: we can regard the system as a multiplet belonging to the fundamental representation of a supergroup su(6/21) for each color, or as a higher representation of a smaller supergroup. the latter possibility is more economical. to see what kind of supergroup we can have, we imagine that both quarks and diquarks are components ofmore elementary quantities: a triplet fermion f ia and a boson c i which is a triplet with respect to the color group and a singlet with respect to su(2n) (su(6)) for three flavors). the f ia is taken to have baryon number 1/3 while c has baryon number −2/3. the system qia = ijkf̄ j ac̄ k (12) will be a color triplet with baryon number 1/3. it can therefore represent a quark. we can write qa =u ·qa =(u∗ · f̄a)(u∗ · c̄) (13) with two anti-f fields we can form bosons that have the same quantum numbers as antidiquarks: d̄ab =u · d̄ab =(u∗ · f̄a)(u∗ · f̄b) (14) in this case the basic multiplet is (fa,c) which belongs to the fundamental representation of su(6/1) for each color component. the complete algebra to consider is su(3)c × su(6/1) and the basic multiplet corresponds to the representation (3,7) of this algebra. let f = ⎛ ⎜⎜⎜⎜⎜⎜⎜⎝ f1 f2 ... f6 c ⎞ ⎟⎟⎟⎟⎟⎟⎟⎠ , fa =u · fa, c =u ·c (15) 78 acta polytechnica vol. 51 no. 1/2011 we write f1, f2, etc., as f1 = ⎛ ⎜⎜⎜⎜⎜⎝ f11 f12 ... f16 ⎞ ⎟⎟⎟⎟⎟⎠ , f 2 = ⎛ ⎜⎜⎜⎜⎜⎝ f21 f22 ... f26 ⎞ ⎟⎟⎟⎟⎟⎠ , . . . (16) combining two such representations and writing x̄=f×ft we have x̄ =u∗ ·x̄= ( f1 c1 ) (f2t c2) − ( f2 c2 ) (f1t c1) (17) further making the identifications u∗ ·d11 =2f11 f 2 1 , u ∗ ·d12 = f11 f 2 2 − f 2 1 f 1 2 , etc. (18) and u∗ · q̄1 = f11c 2 − f21 c 1, u∗ · q̄2 = f12 c 2 − f22 c 1, etc., (19) we see that x̄ has the structure x̄ =u∗ · x̄=u∗ · ⎛ ⎜⎜⎜⎜⎜⎜⎜⎜⎜⎜⎜⎜⎝ d11 . . . d16 q̄1 d12 . . . d26 q̄2 d13 . . . d36 q̄3 d14 . . . 
d46 q̄4 d15 . . . d56 q̄5 d16 . . . d66 q̄6 −q̄1 . . . −q̄6 0 ⎞ ⎟⎟⎟⎟⎟⎟⎟⎟⎟⎟⎟⎟⎠ (20) the 27 dimensional representation decomposes into 21+6̄ with respect to its su(6) subgroup. consider now an antiquark-diquark system at point x1 = x − 1 2 ξ and another quark-antidiquark system at point x1 = x + 1 2 ξ, and take the direct product of x(x1) and x(x2). we see that we get (dab(x1), q̄d(x1))⊗ (d̄ef(x2), qc(x2)) (21) consisting of the pieces h(x1, x2)= ( q̄d(x1)qc(x2) dab(x1)qc(x2) q̄d(x1)d̄ef(x2) dab(x1)d̄ef(x2) ) (22) the diagonal pieces are bilocal fields representing color singlet 1+35 mesons and 1+35+405 exotic mesons, respectively, with respect to the subgroup su(3)c × su(6) of the algebra. the off diagonal pieces are color singlets that are completely symmetrical with respect to the indices (abc) and (def). they correspond to baryons and antibaryons in the representations 56 and 56 respectively of su(6). we can define the baryonic components fabc as fabc = − 1 2 {dab, qc} (23) and evaluate its components: dabqc = ( u∗1(q 2 aq 3 b + q 2 bq 3 a)+ u ∗ 2(q 3 aq 1 b + q 3 bq 1 a)+ u ∗ 3(q 1 aq 2 b + q 1 bq 2 a) ) × (u1q1c + u2q 2 c + u3q 3 c) (24) this becomes dabqc = −u∗0 ( (q2aq 3 b + q 2 bq 3 a)q 1 c +(q 3 aq 1 b + q 3 bq 1 a)q 2 c +(q 1 aq 2 b + q 1 bq 2 a)q 3 c ) (25) 79 acta polytechnica vol. 51 no. 1/2011 similarly the qc dab part is qc dab = −u0((q2aq 3 b + q 2 bq 3 a)q 1 c +(q 3 aq 1 b + q 3 b q 1 a)q 2 c +(q 1 aq 2 b + q 1 b q 2 a)q 3 c) (26) since u0 + u ∗ 0 =1, we have fabc = − 1 2 {dab, qc} =(q1aq 2 b + q 1 bq 2 a)q 3 c +(q 2 aq 3 b + q 2 bq 3 a)q 1 c +(q 3 aq 1 b + q 3 bq 1 a)q 2 c (27) which is completely symmetric with respect to indices (abc), corresponding to baryons. in the limit x2 − x1 = ξ → 0, h can be represented by a local supermultiplet with dimension 2 × 56+ 2(1+35)+405 = 589 of the original algebra. this representation includes 56 baryons, antibaryons, mesons and q2q̄2 exotic mesons. it is empirically well known for many years that all light-quark hadrons lie upon linear regge trajectories. relation between linear trajectories, linear confinement and relativistic dynamics had been well studied. massless quarks bound by a linear confinement potential generate a family of parallel regge trajectories. regge slopes and daughter spacings depend on the lorentz nature and other properties of the interaction. these are well known also through numerical studies (exact solutions) to the massless spinless salpeter wave equation and the quantized straight string. a generalized second order equationwith a normalizedwave function including quarkmass m is given by [6] for h2: h2 =4 [ (m + 1 2 br)2 + p2r + l(l +1) r2 ] ; p2r = − ∂2 ∂r2 − 2 r ∂ ∂r (28) and schrödinger equation is ∂2ψ ∂r2 + 2 r ∂ψ ∂r + ( e2 4 − 1 4 b2 ( r + 2m b )2 − l(l +1) r2 ) ψ =0 (29) we have developed [7] a new solution for the normalized wave function (including the small mass m): ψnr ,nζ ,l,m̄(r, θ, φ) = ( (nr −1)!(2l +2)!(nr + l − 12)! √ π (2b)l+ 2 3(l +1)!(l + 12)! − m {nr−1∑ p=0 2l+2(p + l +1)! bl+2p! ( (nr −1)!(12)! (nr − p −1)!(p − nr + 32)! )2 + nr−1∑ n=0 (nr −1)!(nr + l − 12)!(n − 1 2)!(n + l)!(2n + l +1) n!(−12)!(nr − n −1)!(n + l + 1 2)! × nζ−1∑ k=n (−1)nr+k2n+l+2(−12)!(nζ − n −1)!(n + k + l +1)!(n + k + 1 2)! bn+l+2(k + 12)!(nζ − k −1)!(k + l +1)!(k + n − nr 3 2)! })−12 × rle− b 4(r+ 2m b )2 n∑ n=0 (−1)n(nr −1)!(nr + l − 12)! n!(nr − n −1)!(n + l + 12 ( 1 2 br2 )n × { 1+ mr nζ−n−1∑ k=0 (−1)k(n + l)!(2n + l +1)(nζ − n −1)!(n − 12)! 2(k + n + 12)!(nζ − k − n −1)!(k + n + l +1)! 
(½ b r²)^k } × y_l^m̄(θ, φ). (30)

in the above equation

∑ from n=0 to n of (−1)^n (n_r − 1)! (n_r + l − ½)! / ( n! (n_r − n − 1)! (n + l + ½)! ) (½ b r²)^n = f_n^{|α|} ( |α| = n_r − 1, γ = l + 3/2; z = ½ b r² ). (31)

this last expression is the confluent hypergeometric function. for this reason we call eq. (31) the grand confluent hypergeometric function. n_r is the radial quantum number. we give the name n_ζ to an additional hidden radial quantum number that appears in our solution. in many of the special functions for second order differential equations there is only one eigenvalue, but when we include the small mass m, two eigenvalues are created. we already know that one eigenvalue is equal to

e_r² = 4b (l + 2n_r − ½). (32)

this was shown in our earlier paper [6]. the second one, caused by the small mass m, is equal to

e_ah² = 4b (l + 2n_ζ + ½). (33)

typical wave functions have only three quantum numbers, but this wave function has four quantum numbers: n_r, n_ζ, l and m̄. when we let the small mass m equal zero, eq. (30) simply becomes

ψ_{n_r, n_ζ, l, m̄}(r, θ, φ) = ( (n_r − 1)! (2l + 2)! (n_r + l − ½)! √π / ( (2b)^{l + 3/2} (l + 1)! (l + ½)! ) )^{−1/2} r^l e^{−(b/4) r²} f_n^{|α|} y_l^m̄(θ, φ), (34)

where f_n^{|α|} = f_n^{|α|} ( |α| = n_r − 1, γ = l + 3/2; z = ½ b r² ). eq. (34) is exactly the same as our previous wave function when the small mass m is neglected. our generalized solution gives rise to new mass formulas in remarkable agreement with experiments, which will be described in our forthcoming publications. we believe that the new special functions we created will have many uses in physics and other areas of science.

acknowledgement
this work was supported in part by doe contracts no. de-ac-0276 er 03074 and 03075. one of us (sc) thanks dean jeffrey peck for the travel fellowship and professor cestmir burdik for the kind invitation to this conference to present this paper.

references
[1] catto, s., gürsey, f.: nuovo cim. 86a, 201 (1985).
[2] catto, s., gürsey, f.: nuovo cim. 99a, 685 (1988).
[3] catto, s.: miyazawa supersymmetry. in h. sakai, b. f. gibson (eds.): new facets of three nucleon force. (2008) aip 1011.
[4] catto, s.: scalar mesons, multiquark states and supersymmetry. in g. rupp, b. hiller, f. kleefield (eds.): workshop on scalar mesons and related topics. (2008) aip 1030.
[5] catto, s.: effective supersymmetry based on su(3)c × s superalgebra. in v. dobrev (ed.): lie theory and its applications in physics. 2010.
[6] catto, s., cheung, h. y., gürsey, f.: mod. phys. lett. a 38, 3485 (1991).
[7] catto, s., choun, y.: in preparation.

sultan catto
e-mail: scatto@gc.cuny.edu
the graduate school, the city university of new york and baruch college, 17 lexington avenue, new york, ny 10010
physics theory group, the rockefeller university, 1230 york avenue, new york, ny 10021-6399

yoon choun
the graduate school, the city university of new york and baruch college, 17 lexington avenue, new york, ny 10010

topology control with anisotropic and sector turning antennas in ad-hoc and sensor networks
v. černý

abstract
during the last several years, technological advances have allowed the development of small, cheap, embedded, independent and rather powerful radio devices that can self-organise into data networks. such networks are usually called ad-hoc networks or, sometimes, depending on the application field, sensor networks.
one of the first standards for ad-hoc networks to impose itself as a fully industrial framework for data gathering and control over such devices is ieee 802.15.4 and, on top of it, its pair network architecture: zigbee. in the case of multiple radio devices clamped into a small geographical area, the lack of radio bandwidth becomes a major problem, leading to multiple data losses and unnecessary power drain from the batteries of these small devices. this problem is usually perceived as interference. the deployment of appropriate topology control mechanisms (tc) can solve interference. all of these algorithms calculate tc on the basis of isotropic antenna radiation patterns in the horizontal plane.

keywords: ad-hoc network simulation, antenna radiation patterns, mobile network simulation, omnet++, topology control, interference, sector antennas, rotating antennas, wireless networks, anisotropic antennas, sensor networks.

1 introduction
event simulators such as omnet++ were developed in order to study interference without the deployment of large and sometimes expensive networks. however, these simulators lack the proper radio propagation characteristics that are crucial for understanding the interference phenomenon. the widely-used radio propagation model is isotropic (in the horizontal plane we can imagine it as a circle with the transmitter in the center). this radio pattern is very hard to achieve in the real world. in addition, this model is unusable for simulating the radio patterns of sector antennas. classical tc is based on graph theory alone, and does not take into account antenna radiation patterns, propagation, receiver sensitivity or any other radio-related characteristics. such algorithms, e.g. relative neighbourhood graph [1], gabriel graph [1], yao graph [1], minimal spanning tree, xtc, itc [2], etc., are based on the following assumptions: all nodes are situated on a flat surface, no radio propagation model is involved (coverage being determined on geometric properties only), antennas are perfect emitters (isotropic radio pattern), and the receiver is a perfect receiver with no minimal sensitivity threshold. power regulation is modeled by varying (with infinitely fine increments and with no maximal limit) the radius of the circle that defines the radiation pattern. such simplified models are easy to test, but they are very far removed from reality. to obtain more realistic output it is necessary to involve new transmitter properties in our simulation.

2 a better antenna simulation model
for our simulation we do not need the whole 3d antenna propagation pattern, because we work only in the horizontal plane; hence a flat pattern is sufficient. the modeling is performed by sampling the signal power around the modeled transmitter. the correctness of such a model can be adjusted by changing the size of the sampling step. all measured received power values are recomputed to gain values [3]. in order to obtain an antenna model that is easy to understand and adapt, the radiation pattern for each antenna is defined in omnet++ as an xml file. each simulated node can be coupled with one or more modeled antennas; in this way, studies of spatial diversity can be performed easily, with little or no modification to the model itself. below is a sample code of an antenna radiation pattern:

...

the code was shortened due to the length of the file. this slicing into 10-degree sectors gives good results for reasonable amounts of run time and for the algorithm.
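since the xml sample above was shortened away, the idea of a pattern sampled in 10-degree sectors can be illustrated with a small python sketch; the gain values below are hypothetical (a real table comes from chamber measurements), and omnet++ itself consumes the xml form:

import math

# hypothetical gain table sampled every 10 degrees (dbi), 36 sectors
GAIN_TABLE_DBI = [2.0, 1.8, 1.5, 1.0, 0.2, -0.8, -2.0, -3.5, -5.0,
                  -6.0, -5.0, -3.5, -2.0, -0.8, 0.2, 1.0, 1.5, 1.8] * 2

def bearing_deg(x1, y1, x2, y2):
    """azimuth of the direction from (x1, y1) towards (x2, y2), in degrees."""
    return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 360.0

def gain_dbi(azimuth_deg):
    """gain of the 10-degree sector containing the given direction."""
    return GAIN_TABLE_DBI[int((azimuth_deg % 360.0) / 10.0)]

# e.g. the gain a node at the origin offers towards a node at (10, 5)
print(gain_dbi(bearing_deg(0.0, 0.0, 10.0, 5.0)))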
the radiation pattern for the above antenna (using the complete set of data) is provided in figure 1 and figure 2.

fig. 1: gain distribution pattern of a rubber-duck antenna measured in the antenna chamber
fig. 2: gain distribution pattern of a rubber-duck antenna in omnet++

3 tc algorithms with anisotropic antennas
this section contains the adapted (towards anisotropy) versions of the classical topology control algorithms.

3.1 arng — anisotropic relative neighbourhood graph
the subgraph g′ = arng(g) = (v, e′) obtained from graph g, where e′ is:

e′ = {(p, q) ∈ v × v | ∀u ∈ v \ {p, q}: u ∉ σ(p) ∩ σ(q)}, (1)

where σ(p) and σ(q) are the irregularly shaped coverage areas of the communication nodes, defined as the zone where the signal received by an antenna having gain 0 dbi will be at least the minimal power level p_min when p, respectively q, acts as a transmitter (figure 4).

fig. 3: step condition in rng
fig. 4: step condition in arng

3.2 agg — anisotropic gabriel graph
the subgraph g′ = agg(g) = (v, e′) obtained from graph g, where e′ is:

(p, q) ∈ e′ iff γ_{p,q} ∩ (v \ {p, q}) ∩ σ(p) ∩ σ(q) = ∅, (2)
γ_{p,q} = d((p + q)/2, δ(p, q)/2). (3)

put into words: the zone represented by the disc centred in the middle of the segment (p, q), with the euclidean distance between p and q as its diameter, intersected with the irregular coverage areas of p and q, must contain no other node in order for the edge (p, q) to be included in the e′ set (figure 6).

fig. 5: step condition in gg
fig. 6: step condition in agg

3.3 ayg — anisotropic yao graph
the subgraph g′ = ayg(g) = (v, e′) obtained from graph g, where e′ is:

(p, q) ∈ e′ iff δ(p, q) ≤ min over v ∈ c_{p,q} \ {p, q} of δ(p, v), and q ∈ σ(p). (4)

put into words: in each sector around node p we choose to connect to the closest node q that lies within the irregular coverage area of node p; even though a node r in the same sector may be closer to p than q, node p will not connect to it if r is outside the coverage area of p (figure 8; the arrow represents the node to connect to, not an oriented edge).

fig. 7: step condition in yg
fig. 8: step condition in ayg

3.4 examples of simulated networks
table 1 presents the differences between classical and anisotropic structures for a network containing a uniform random placement of 50 communication nodes. for the anisotropic simulations, the chosen antenna type was rubber-duck; this is motivated by the fact that it has the radiation pattern closest to the ideal dipole. it can be clearly seen that even in this case there are (minor) differences.
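the arng edge test of eq. (1) can be sketched in a few lines of python; the coverage predicate covered(p, u), standing for u ∈ σ(p), is a hypothetical helper that would evaluate the anisotropic pattern together with a propagation model:

from itertools import combinations

def arng_edges(nodes, covered):
    """anisotropic relative neighbourhood graph, eq. (1): keep the edge
    (p, q) unless some third node u lies in both coverage areas
    sigma(p) and sigma(q)."""
    edges = []
    for p, q in combinations(nodes, 2):
        blocked = any(covered(p, u) and covered(q, u)
                      for u in nodes if u is not p and u is not q)
        if not blocked:
            edges.append((p, q))
    return edges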
table 1: comparison between classical and anisotropic topology control algorithms (network plots for the isotropic and anisotropic variants of rng, gg and yg; graphical output not reproduced here)

4 particular case: high gain sector turning antennas
a sector turning antenna "i" is defined as a tuple sta_i(x, y, g1, g2, ϕ, θ, p, a, p_min, p_max, p_rmin, p_imin, δθ_min, δp_min), by the following parameters 〈in brackets is the measure unit〉:
• coordinates in plane: x 〈m〉, y 〈m〉;
• gains: g1 〈dbi〉, g2 〈dbi〉, with g2 > g1;
• opening of the main lobe: ϕ 〈deg〉;
• rotation of the symmetry axis of the main lobe towards north (azimuth/azimuthal angle): θ 〈deg〉;
• power injected in the antenna: p 〈dbm〉;
• antenna aperture: a 〈m²〉;
• minimal and maximal power levels: p_min 〈dbm〉 and p_max 〈dbm〉, respectively;
• minimal power threshold required for proper data reception: p_rmin 〈dbm〉;
• minimal power threshold beyond which interference is registered: p_imin 〈dbm〉;
• azimuth angle resolution: δθ_min 〈deg〉, with the condition δθ_min ≤ ϕ (in order to cover all the area around the antenna through rotation);
• power resolution: δp_min 〈dbm〉;
• i is the index to keep a record of the antennas in the network.
in this model, parameters x, y, g1, g2, ϕ, a, p_min, p_max, p_rmin, p_imin, δθ_min and δp_min are fixed (antennas do not move in a plane and do not modify the radiation lobe parameters, the limit power levels or the minimal variation intervals for power or turn).

fig. 9: example of a 2.4 ghz sector antenna. a – real antenna; b – radiation pattern; c – simulated radiation pattern
fig. 10: example of a ferimex sector antenna. a – real antenna; b – radiation pattern; c – simulated radiation pattern
fig. 11: example of a yaggi-uda lobe antenna. a – real antenna; b – radiation pattern; c – simulated radiation pattern

4.1 n-sector turning antenna
a n-sector turning antenna is an extended sta, a tuple nsta_i(x, y, g[], ϕ[], θ, p, a, p_min, p_max, p_rmin, p_imin, δθ_min, δp_min), where all the parameters except g[] and ϕ[] have the same definition as in the sta definition. g[] represents the gain vector g[] = g1, g2, . . . , gn−1, gn, gn−1, . . . , g2, g1, where g1 < g2 < gn, and ϕ[] represents the openings vector ϕ[] = ϕ1, . . . , ϕn, where ϕj is the angle opening of the zone having gain gj. this type of antenna mimics the behaviour of lobe antennas such as yaggi-uda antennas (figure 11).

4.2 network composed of turning sector antennas
a cluster composed of sector turning antennas is defined by a tuple k(nk, xk, yk, xk, yk, sta[k]), where:
• geographical extent of cluster: xk 〈m〉, yk 〈m〉;
• geographical coordinates of the centre: xk 〈m〉, yk 〈m〉;
• number of antennas: nk 〈〉;
• sta1, sta2, . . . , sta_nk component antennas.
all the parameters defining a cluster remain constant (clusters do not change positions or number of components). a network composed of clusters of sector turning antennas is defined as a tuple n(x, y, n, k, k[k], q), by the following parameters:
• geographical extent of the network: x 〈m〉, y 〈m〉;
• n 〈〉 number of antennas;
• k 〈〉 number of clusters;
• k1, k2, . . . , kk component clusters;
• a property q that has to be satisfied;
• nk[1] + nk[2] + . . . + nk[k] = n.
a network consisting of a single cluster will have k = 1, xk[1] = x, yk[1] = y and nk[1] = n — like for example a uniform distribution of nodes over the entire field. a data-structure sketch of these definitions follows below.
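as announced above, a minimal python sketch of the sta tuple and of a clustered placement; the field names follow section 4, the two thresholds are the values quoted later in sections 5.1 and 6, and the remaining defaults are assumptions for illustration (this is not the authors' simulator):

import random
from dataclasses import dataclass

@dataclass
class STA:
    """sector turning antenna of section 4 (illustrative field names)."""
    x: float                    # position (m)
    y: float
    g1: float                   # residual-lobe gain (dbi)
    g2: float                   # main-lobe gain (dbi), g2 > g1
    phi: float                  # main-lobe opening (deg)
    theta: float                # azimuth of the main lobe towards north (deg)
    p: float                    # injected power (dbm)
    p_min: float = -10.0        # assumed lower power limit (dbm)
    p_max: float = 20.0         # assumed upper power limit (dbm)
    p_rmin: float = -46.511     # reception threshold of section 5.1 (dbm)
    p_imin: float = -55.0       # interference threshold of section 6 (dbm)
    d_theta_min: float = 10.0   # azimuth resolution (deg)
    d_p_min: float = 1.0        # power resolution (dbm)

def make_cluster(n_k, xc, yc, extent):
    """place n_k antennas uniformly at random inside a square cluster."""
    return [STA(x=xc + random.uniform(-extent / 2, extent / 2),
                y=yc + random.uniform(-extent / 2, extent / 2),
                g1=3.5, g2=12.0, phi=10.0,
                theta=random.uniform(0.0, 360.0), p=20.0)
            for _ in range(n_k)]

# e.g. a 180 x 180 m playground with 5 clusters of 10 nodes (cf. section 9)
network = [sta for _ in range(5)
           for sta in make_cluster(10, random.uniform(0.0, 180.0),
                                   random.uniform(0.0, 180.0), 25.0)]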
different node placements were tested: random placement — nodes are placed randomly around the geographic network centre (defined by coordinates [x/2, y/2]) — and cluster placement — the centre of each cluster is chosen randomly around the network centre and the nodes composing one cluster are placed randomly around the centre of the respective cluster. all the parameters defining a network remain constant (the number of clusters in the network, positions of the clusters, etc.). for network n, let us define an evaluating metric e, which will be detailed in the next paragraph. symmetric network: network n is defined as symmetric if all the nodes in the network have the same shared parameters (g1, g2, ϕ, a, p_min, p_max, p_rmin, p_imin, δθ_min and δp_min — all parameters with the exception of power and rotation of the main lobe towards north). such a network can model for example a city-wide radio network which consists of identical sector antennas and in which a very limited number of radio channels is allocated. the operator of this network will be interested to connect the entire network (if possible) with low interference rather than low power (because the antennas — in such a network — are powered from the electric grid). applying tc to network n translates to finding an assignment of power and azimuth to north (p and θ) for each antenna in the network, such that the property q is achieved (if possible) and the value of evaluation e is improved towards other methods. classical tc is reduced to finding just a power level assignment (p) for each node in the network such that property q is fulfilled while e is evaluated.

5 connectivity and interference
this section defines the connectivity between two nodes and the methods for measuring interference in the network.

5.1 connectivity between two nodes
in the horizontal plane (which in our case corresponds to the magnetic h-plane of all antennas), in the far field, an ideal dipole radiator l 〈m〉 in length, which is fed an alternating current with frequency f 〈hz〉 and intensity i 〈a〉, will generate a power density (measured by the time-averaged poynting vector) at distance r 〈m〉 from the radiator p_dens 〈w/m²〉 given by:

p_dens = (1/2) (i l / (2 r λ))² η, (5)

where η 〈Ω〉 = 120π is the impedance of the free space and λ 〈m〉 = c/f is the radio signal wavelength (c being the speed of light) [3]. in our case λ = 0.12244 m (for f = 2450 mhz). considering the emitter antenna as sta1 and the receiver as sta2, the receiving antenna captures the power density on the whole aperture (a), thus inducing power p_ind 〈w〉 equal to the power density multiplied by the aperture of the receiver. however, this does not take into account antenna gains; the received power p_recv 〈dbm〉, with the gain contribution, becomes [3]:

p_recv 〈dbm〉 = p_ind 〈dbm〉 + g12 〈dbi〉 + g21 〈dbi〉 − fspl 〈db〉, (6)

where p_ind had to be converted from w to dbm, g12 is the gain of the emitter facing towards the receiver, g21 is the gain of the receiver towards the emitter, and fspl is the free space path loss of the connection [3]. in order to have communication between sta1 and sta2 it is required that p_recv ≥ p_rmin. thus, the minimal power density level that must be created by antenna sta1 to communicate with sta2 can be computed as:

p_dens 〈w·m⁻²〉 = 10^((p_rmin − g12 − g21)/10) / a, (7)

by converting dbm to w, and where a 〈m²〉 = 3λ²/8π is the aperture of the receiver (dipole-based sector antenna).
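a small python sketch of the link budget in the spirit of eq. (6), written in the standard friis form (transmit power plus both direction-dependent gains minus free space path loss); the frequency and the reception threshold are the values quoted in this section, everything else is illustrative:

import math

C = 299_792_458.0           # speed of light (m/s)
F = 2450e6                  # carrier frequency used in the paper (Hz)
LAMBDA = C / F              # ~0.12244 m

def fspl_db(r_m):
    """free space path loss: 20*log10(4*pi*r/lambda)."""
    return 20.0 * math.log10(4.0 * math.pi * r_m / LAMBDA)

def p_recv_dbm(p_tx_dbm, g12_dbi, g21_dbi, r_m):
    """received power: tx power plus emitter gain towards the receiver,
    plus receiver gain towards the emitter, minus path loss."""
    return p_tx_dbm + g12_dbi + g21_dbi - fspl_db(r_m)

P_RMIN = -46.511            # reception threshold (dBm) from section 5.1

def connected(p_tx_dbm, g12_dbi, g21_dbi, r_m):
    return p_recv_dbm(p_tx_dbm, g12_dbi, g21_dbi, r_m) >= P_RMIN

# e.g. a 150 m link with a 12 dBi main lobe facing a 2 dBi residual lobe
print(connected(10.0, 12.0, 2.0, 150.0))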
the power that must be injected into the emitter sta1 can be calculated by integrating the power density p_dens over the whole surface that is the 3d radiation pattern of the antenna; thus p 〈w〉 = p_dens 〈w/m²〉 · s 〈m²〉. in the case of dipole-based sector antennas or lobe antennas (like yaggi-uda), the radiation surface is s = 8πr²/3, and in the case of an ideal antenna (the radiation pattern of which is a perfect sphere) it is s = 4πr², with r in this case being the distance between the two antennas. in this paper the minimal receive level was chosen at p_rmin = −46.511 dbm, based on the xbee series
for this purpose, the antennas exchange hellomessages that contain: a unique id (antenna index i), current azimuth θ (read froman internal compass, for example) — based on this the receiver can determine whether the sender is using the strongest radiation lobe. the algorithm requires 360/δθmin rotation steps to complete. node discovery (all nodes execute in parallel): • set power level to maximum p = pmax; • while θ < θi +360 execute: • exchange hello packets; • identify neighbours reachable with azimuth θ; • θ = θ +δθmin; • build a neighbour table nt containing: neighbour id., possible azimuths θ that each neighbour can be reached with, and the power level that each neighbour was listened to; • exclude fromnt the cases when a neighbour can be seen only if both antennas are oriented with the strongest lobe towards each other. these neighbours are very distant and they will be disconnected; 7.2 maxdistanceminimise (mdistm) this algorithmisbasedonthe fact that linksbetween distant nodes can be achievedby orienting a stronger lobe of radiation in the desired direction, without increasing the power injected into the emitter. the 28 acta polytechnica vol. 51 no. 5/2011 algorithm requires maximum (pmax − pmin)/δpmin steps of power adjustment to complete. maxdistanceminimise (all nodes execute in parallel): • determine the most distant neighbour (mdn) from the nt (mdn is the node that has the lowest received signal over all registered entries in nt); • calculate the azimuth θx to reach the mdn; • set power level to maximum p = pmax; • while θ < θx execute: θ = θ +δθmin; • while p < pmin execute: • exchange hello messages; test connectivity with neighbours in nt; • if no nodes missing p = p −δpmin; • else p = p +δpmin and end while cycle; • while p > pmin execute: • exchange nt tables with neighbours; • test connectivity with neighbours through intermediate nodes (acting as relays); • if no nodes missing p = p −δpmin; • else p = p +δpmin and end while cycle. • result: (θx, p) for each node. 7.3 maxdegreeminimise (mdegm) this algorithm is based on the fact that in clustered networks it is desired that the main radiation lobe will be used to create intercluster links, while the weak lobe will be used to create intracluster links. the algorithm requires maximum (pmax − pmin)/δpmin steps of power adjustment to complete. the degree of a transmitter node in one direction θ is adapted from graph theory, and is defined as the number of other nodes that can receivea signal above p rmin fromthe transmitter if themain radiation lobe of the transmitter has azimuthal angle θ. maxdegreeminimise (all nodes execute in parallel): • from nt calculate the azimuth θx that corresponds to the direction for which the transmitter has the highest degree; • set power level to maximum p = pmax; • while θ < θx execute: θ = θ +δθmin; • while p > pmin execute: • exchange hello messages; • test connectivity with neighbours in nt; • if no nodes missing p = p −δpmin; • else p = p +δpmin and end while cycle; • while p > pmin execute: • exchange nt tables with neighbours; • test connectivity with neighbours through intermediate nodes (acting as relays); • if no nodes missing p = p −δpmin; • else p = p +δpmin and end while cycle. result: (θx, p) for each node. 8 power level adjustment the algorithms mdistm and mdegm presented in the previous sections make use of linear power decrease, which is the standard approach for having power adjustments implemented in hardware. 
8 power level adjustment

the algorithms mdistm and mdegm presented in the previous sections make use of a linear power decrease, which is the standard approach when power adjustment is implemented in hardware. however, faster results can be achieved if the power levels are adjusted exponentially:

powadj:
• set power level to maximum p = pmax;
• base b = 2; exponent e = 1;
• while p > pmin execute:
  • δp = pmax / b^e;
  • exchange nt tables with the neighbours;
  • test connectivity with the neighbours through intermediate nodes (acting as relays);
  • if no nodes are missing, p = p − δp;
  • else p = p + δp and end the while cycle;
  • e = e + 1.
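the following is a minimal sketch of this exponential adjustment; reading the loop as a binary-search-style descent, and the numeric values used in the usage line, are our own assumptions rather than details given by the paper.

```python
def pow_adj(p_max, p_min, test_connectivity, base=2.0):
    """exponential power adjustment (section 8): the step starts at
    p_max/base and shrinks geometrically, so roughly log2-many steps
    replace the linear (p_max - p_min)/dp_min descent."""
    p, e = p_max, 1
    while p > p_min:
        dp = p_max / base**e
        if test_connectivity(p - dp):
            p -= dp                 # keep descending with this step
        else:
            break                   # stop at the last connected level
        e += 1
    return p

# illustrative test: connectivity holds while p >= 7 dbm
print(pow_adj(16.0, 0.0, lambda p: p >= 7.0))  # -> 8.0 (last good level)
```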
9 simulation results

for the simulations, we chose a square flat playground of 180 × 180 m on which 20 or 50 nodes (equipped with sta or nsta; uniform networks, i.e. all nodes have one type of antenna only) were placed in different configurations: uniform random and clustered. a cluster placement is defined by choosing the number of clusters (2 to 10) and their geometric sizes (5 to 25 m): the centres of the clusters are placed randomly around the playground centre, and the nodes of each cluster are placed within the borders of the cluster, randomly around the cluster centre. the following experiments were performed for both 20 and 50 nodes, uniform or clustered, sta or nsta:

• sector opening (ϕ) variation: the gains g1 and g2 are maintained constant, while ϕ is increased from 10° to 350°;
• gain (g2) variation: the gain g1 and the sector opening ϕ are maintained constant, while the gain g2 is increased from 2 dbi to 20 dbi;
• sector openings for nsta with 5 to 10 gain zones in the gain vector, ranging from 1.5 dbi to 20 dbi.

in order to compare the interference results, we used the same simulator as in [1], with the same configurations for node distributions and wave propagation. in the following graphs we compare the results obtained by applying anisotropic tc (arng, agg, ayg, i.e. without antenna turning) with the results obtained with antenna turning. finally, the total power needed per network to keep it connected is estimated for all the methods. each presented result is an average over 100 simulations of the same type; a total of 52 000 networks were simulated for this paper.

fig. 12: 50 nodes, sta, gain increment for sector ϕ = 10°, g1 = 1.5 dbi, g2 = 2...20 dbi (0x); panels: a) interference (0y), uniform distribution; b) total power (0y, dbm), uniform distribution; c) interference (0y), cluster distribution; d) total power (0y, dbm), cluster distribution

fig. 13: 50 nodes, sta, sector increment ϕ = 10°...350° (0x) for gains g1 = 3.5 dbi, g2 = 7 dbi; panels a)-d) as in fig. 12

fig. 14: 50 nodes, sta, sector increment ϕ = 10°...350° (0x) for gains g1 = 3.5 dbi, g2 = 12 dbi; panels a)-d) as in fig. 12

fig. 15: 50 nodes, sta, sector increment ϕ = 10°...350° (0x) for gains g1 = 3.5 dbi, g2 = 20 dbi; panels a)-d) as in fig. 12

fig. 16: 50 nodes, nsta, sector increment, n = 5, ϕ1 = ϕ2 = ϕ3 = ϕ4 = 10°, ϕ5 = 20°...70° (0x), g1 = 2 dbi, g2 = 4 dbi, g3 = 8 dbi, g4 = 16 dbi, g5 = 20 dbi; panels a)-d) as in fig. 12

fig. 17: exponential node chain; a) exponential node chain and a solution for lower interference: introduction of auxiliary nodes; b) solution with sector antennas: no auxiliary nodes required, zero additional interference. note that in this image the radiation pattern is correlated to power, not coverage; thus a link can exist even when a node is not located inside the radiation pattern

10 conclusions

as shown in figures 12-16, mdistm and mdegm behave better (lower interference and power need) for cluster distributions than for a uniform distribution: this is explained by the fact that a high-gain lobe tends to create inter-cluster links, while a residual lobe tends to create intra-cluster links. another observation is the decrease in power need when the main lobe widens (figures 13-15, panels b and d). another example architecture is the exponential node chain, i.e. all nodes placed on the same axis with distances increasing exponentially (figure 17a, on axis 0x) [4]; in this case, the use of isotropic antennas yields huge interference values (this is in fact a counterexample showing that low-node-degree tc does not ensure low interference). however, this node topology can be solved elegantly by using sta (figure 17b), where each antenna has gain g2 = 2g1, which proves that antenna behaviour opens new horizons for tc. nodes which take advantage of the new tc with sta can create networks with lower energy consumption. as can be seen, networks created in this way model very well wifi infrastructure networks for metropolitan areas (for example, networks composed of turning high-gain antennas placed on building roofs). what is presented in this paper is a network which is capable of automatically connecting and reconnecting in case of a failure (due, for example, to the removal of one communication node). the principle described in this paper is interesting because it can be improved, for example by combining tc with smart antennas [3]. different tc methods which take slightly different parameters as input will generate slightly different results, suitable for different real-world scenarios. how to choose precisely what is best for each scenario is a subject for future improvement in this area.

references

[1] wagner, d., wattenhofer, r.: algorithms for sensor and ad hoc networks: advanced lectures. springer, 2007, pp. 85-90. issn 0302-9743.
[2] janecek, j., moucha, a.: comparison of local interference-aware topology control algorithms. proceedings of the 19th iasted international conference on parallel and distributed computing and systems, 2007. isbn 978-0-88986-704-8.
[3] balanis, c.: antenna theory: analysis and design. 3rd ed., wiley, 2005, pp. 154-162. isbn 0-471-66782-x.
[4] burkhart, m., von rickenbach, p., wattenhofer, r., zollinger, a.: does topology control reduce interference? mobihoc '04: proceedings of the 5th acm international symposium on mobile ad hoc networking and computing, 2004, pp. 9-19.

about the author

viktor černý was born in prague in 1982 and was awarded his master degree in 2009. currently he is a phd student in computer science at fel čvut.

viktor černý
e-mail: cernyvi2@fel.cvut.cz
dept. of computer science and engineering
faculty of electrical engineering
czech technical university
technická 2, 166 27 praha, czech republic
modeling of concrete creep based on microprestress-solidification theory

p. havlásek, m. jirásek

abstract

creep of concrete is strongly affected by the evolution of pore humidity and temperature, which in turn depend on the environmental conditions and on the size and shape of the concrete member. current codes of practice take this into account only approximately, in a very simplified way. a more realistic description can be achieved by advanced models, such as model b3 and its improved version that uses the concept of microprestress. the value of the microprestress is influenced by the evolution of pore humidity and temperature. in this paper, values of the parameters used by the microprestress-solidification theory (mps) are recommended, and their influence on the creep compliance function is evaluated and checked against experimental data from the literature. certain deficiencies of mps are pointed out, and a modified version of mps is proposed.

keywords: creep, concrete, compliance function, kelvin chain, solidification, microprestress, finite elements.

1 introduction

in contrast to metals, concrete exhibits creep already at room temperature. this phenomenon results in a gradual but considerable increase in deformation under sustained loads, and needs to be taken into account in the design and analysis of concrete structures. the present paper examines an advanced concrete creep model, which extends the original b3 model [1] and uses the concepts of solidification [5,6] and microprestress [3,4,2]. the main objective of the paper is to clarify the role of the non-traditional model parameters and provide hints on their identification. the creep tests performed by kommendant, polivka and pirtz [8], nasser and neville [9], and fahmi, polivka and bresler [7] are used as a source of experimental data, which are compared with the results of numerical simulations. references [8] and [9] focused mainly on the creep of sealed concrete specimens subjected to elevated but constant temperatures. reference [7] studied creep under variable temperature for both sealed and drying specimens. the same references were used in [2] to demonstrate the functionality of the microprestress-solidification theory, which is the constitutive model described in section 2. all numerical computations have been performed using the finite element package oofem [10-12], developed mainly at the ctu in prague by bořek patzák.

2 description of the material model

the complete constitutive model for creep and shrinkage of concrete can be represented by the rheological scheme shown in figure 1. it consists of (i) a non-aging elastic spring, representing instantaneous elastic deformation, (ii) a solidifying kelvin chain, representing short-term creep, (iii) an aging dashpot with viscosity dependent on the microprestress, s, representing long-term creep, (iv) a shrinkage unit, representing volume changes due to drying, and (v) a unit representing thermal expansion. all these units are connected in series, and thus the total strain is the sum of the individual contributions, while the stress transmitted by all units is the same. attention is focused here on the mechanical strain, composed of the first three contributions to the total strain, which are stress-dependent.

fig. 1: rheological scheme of the complete hygro-thermo-mechanical model (spring e0; solidifying kelvin chain with units e1-η1 ... em-ηm; flow dashpot governed by the microprestress s; shrinkage unit ksh; thermal unit αt; strain contributions εa, εv, εf, εsh, εt)
in the experiments, the shrinkage and thermal strains were measured separately on load-free specimens and subtracted from the strain of the loaded specimen under the same environmental conditions. it should be noted that even after subtraction of the shrinkage and thermal strains, the evolution of the mechanical strain is affected by humidity and temperature. the reference case is so-called basic creep, i.e. creep in sealed conditions and at room temperature. dry concrete creeps less than wet concrete, but the process of drying accelerates creep. elevated temperature leads to faster cement hydration and thus to a faster reduction of compliance due to aging, but it also accelerates the viscous processes that are at the origin of creep, as well as the process of microprestress relaxation.

solidification theory [5] reflects the process of concrete aging due to cement hydration, which leads to the deposition of new layers of solidified hydration products (mainly calcium-silicate-hydrate gels, c-s-h). it is assumed that the creep of c-s-h is described by non-aging viscoelasticity, and that aging is caused by the growth in volume of the solidified material, which leads to a special structure of the compliance function, reflected by model b3. according to this model, basic creep is described by a compliance function of the form

$$ J_b(t,t') = q_1 + q_2 \int_{t'}^{t} \frac{n\, s^{-m}}{(s-t') + (s-t')^{1-n}}\, \mathrm{d}s + q_3 \ln\!\left[ 1 + (t-t')^{n} \right] + q_4 \ln\frac{t}{t'}, \qquad t \ge t', \qquad (1) $$

where t is the current time (measured as the age of the concrete, expressed in days), t′ is the age at load application, n = 0.1, m = 0.5, and q1, q2, q3 and q4 are parameters determined by fitting of experimental results, or estimated from the concrete composition and strength using empirical formulae. the first (constant) term corresponds to the compliance q1 = 1/e0 of the elastic spring in figure 1, the second and third terms to the solidifying viscoelastic material (in numerical simulations approximated by a solidifying kelvin chain), and the fourth term to an aging viscous dashpot with viscosity ηf(t) = t/q4.
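as a quick numerical illustration of (1), here is a minimal sketch that evaluates the basic-creep compliance by numerical quadrature; the parameter values used in the usage line are those quoted in section 3.1 for the kommendant et al. data, and the quadrature settings are illustrative assumptions.

```python
import numpy as np
from scipy.integrate import quad

N, M = 0.1, 0.5   # fixed exponents of model b3

def j_basic(t, t1, q1, q2, q3, q4):
    """eq. (1): basic-creep compliance j_b(t, t') in 1e-6/mpa; the
    solidification integral has an integrable singularity at s = t',
    which quad handles since it never samples the endpoints."""
    integrand = lambda s: N * s**(-M) / ((s - t1) + (s - t1)**(1.0 - N))
    qint, _ = quad(integrand, t1, t, limit=200)
    return q1 + q2 * qint + q3 * np.log(1.0 + (t - t1)**N) + q4 * np.log(t / t1)

# compliance 100 days after loading at t' = 28 days (q-values from [2])
print(j_basic(t=128.0, t1=28.0, q1=20.0, q2=70.0, q3=5.6, q4=7.0))
```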
microprestress-solidification theory is an extension of the above model to variable humidity and temperature. it replaces the explicit dependence of the viscosity ηf on time by a dependence on the so-called microprestress, s, which is governed by a separate evolution equation. the microprestress is understood as the stress in the microstructure generated by large localized volume changes during the hydration process. it builds up at very early stages of microstructure formation and is then gradually reduced by relaxation processes. the microprestress is considered to be much bigger than any stress acting on the macroscopic level, and therefore it is not influenced by the macroscopic stress. additional microprestress is generated by changes in the internal relative humidity and temperature. this is described by the non-linear differential equation

$$ \frac{\mathrm{d}S}{\mathrm{d}t} + \psi_s(T,h)\, c_0 S^2 = k_1 \left| \frac{\mathrm{d}(T \ln h)}{\mathrm{d}t} \right| \qquad (2) $$

in which t denotes the absolute temperature, h is the relative pore humidity (the partial pressure of water vapor divided by the saturation pressure), c0 and k1 are constant parameters, and ψs is a variable factor that reflects the acceleration of microprestress relaxation at higher temperature and its deceleration at lower humidity (compared to the standard conditions). owing to the presence of the absolute value operator on the right-hand side of (2), additional microprestress is generated both by drying and wetting, and both by heating and cooling, as suggested in [2]. the dependence of the factor ψs on temperature and humidity is assumed in the form

$$ \psi_s(T,h) = \exp\!\left[ \frac{Q_s}{R} \left( \frac{1}{T_0} - \frac{1}{T} \right) \right] \left[ \alpha_s + (1-\alpha_s)\, h^2 \right] \qquad (3) $$

where qs is the activation energy, r is the boltzmann constant, t0 is the reference temperature (room temperature) on the absolute scale, and αs is a parameter. the default parameter values recommended in [2] are qs/r = 3000 k and αs ≈ 0.1.

as discussed in [3], a high microprestress facilitates sliding in the microstructure and thus accelerates creep. therefore, the viscosity of the dashpot that represents the long-term viscous flow is assumed to be inversely proportional to the microprestress. this viscosity acts as a proportionality factor between the flow rate and the stress. the model is thus described by the equations

$$ \sigma = \eta_f \frac{\mathrm{d}\varepsilon_f}{\mathrm{d}t} \qquad (4) $$

$$ \eta_f = \frac{1}{c\,S} \qquad (5) $$

with a constant parameter c, which is not independent and can be linked to the already introduced parameters. it suffices to impose the requirement that, under standard conditions (t = t0 and h = 1) and constant stress, the evolution of the flow strain should be logarithmic and should exactly correspond to the last term of the compliance function (1) of model b3. a simple comparison reveals that c = c0 q4. at the same time, we obtain the appropriate initial condition for the microprestress, which must supplement differential equation (2). the initial condition reads s0 = 1/(c0 t0), where t0 is a suitably selected time that precedes the onset of drying and temperature variations.

as already mentioned, parameters q1, q2, q3 and q4 are related to basic creep and can be predicted from the composition of the concrete mixture and its average 28-day compressive strength using empirical formulae [1]. the part of the compliance function that contains q2 and q3 is related to viscoelastic effects in the solidifying part of the model. in numerical simulations, this part of the compliance is approximated by a dirichlet series corresponding to a solidifying kelvin chain. the stiffnesses and viscosities of the individual kelvin units can be conveniently determined from the continuous retardation spectrum of the non-aging compliance function that describes the solidifying constituent. the flow term representing long-term creep is handled separately. for general applications with variable environmental conditions, it is necessary to determine the parameters c0 and k1 that appear in the microprestress evolution equation (2) and indirectly affect the flow viscosity. the model response is also influenced by the parameters qs/r and αs, for which default values have been recommended. as will be shown later, a better agreement with experimental results can be obtained if the default values are adjusted. also, the assumption that changes of t ln h contribute to the build-up of microprestress independently of their sign will be shown to be too simplistic.
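to illustrate how (2)-(5) interact, here is a minimal sketch that integrates the microprestress equation for a prescribed temperature and humidity history and converts it into the flow viscosity; the explicit euler scheme, the time grid, the heating episode, and the reference temperature of 296 k are illustrative assumptions (c0 and k1 are taken from the fits reported later), not a reproduction of the paper's oofem simulations.

```python
import numpy as np

QS_R, ALPHA_S, T0 = 3000.0, 0.1, 296.0   # defaults recommended in [2]; T0 assumed 23 c

def psi_s(T, h):
    """eq. (3): acceleration/deceleration factor of microprestress relaxation."""
    return np.exp(QS_R * (1.0 / T0 - 1.0 / T)) * (ALPHA_S + (1.0 - ALPHA_S) * h**2)

def integrate_s(times, T, h, c0, k1, t_init):
    """explicit euler integration of eq. (2) with the initial condition
    s0 = 1/(c0*t_init); returns the microprestress history."""
    s = np.empty_like(times)
    s[0] = 1.0 / (c0 * t_init)
    x = T * np.log(h)                        # the driving quantity t*ln(h)
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        rate = k1 * abs(x[i] - x[i - 1]) / dt - psi_s(T[i], h[i]) * c0 * s[i - 1]**2
        s[i] = s[i - 1] + dt * rate
    return s

times = np.linspace(21.0, 200.0, 2000)                        # days
T = np.where((times > 84) & (times < 166), 333.0, 296.0)      # one heating episode [k]
h = np.full_like(times, 0.98)                                 # sealed specimen
s = integrate_s(times, T, h, c0=0.235, k1=1.0, t_init=21.0)
eta_f = 1.0 / (0.235 * 7.0 * s)                               # eq. (5) with c = c0*q4
```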
3 numerical simulations

in this section, experimental data are compared to results obtained with mps theory, which reduces to the standard b3 model in the special case of basic creep. all examples concerning drying and thermally induced creep have been run as a staggered problem, with the heat and moisture transport analyses preceding the mechanical analysis. the available experimental data contained the mechanical strains (due to elasticity and creep), with the thermal and shrinkage strains subtracted.

3.1 experiments of kommendant, polivka and pirtz (1976)

at the time of writing, the original report was not available to the authors of the present paper; therefore the experimental data, as well as the recommended basic creep parameters (q1 = 20.0, q2 = 70.0, q3 = 5.6 and q4 = 7.0, all in 10^−6/mpa), were taken from [2]. under constant uniaxial load and constant temperature, it can be assumed that the conditions are similar in the whole specimen. this allowed all computations to be carried out on just one finite element. figure 2 shows the experimental (points) and calculated (curves) compliance functions for two different ages at loading and three different levels of temperature. for the younger age at loading, t′ = 28 days, the computed curves provide an excellent fit of the measured data for the temperatures 23°c and 43°c. for the highest temperature, t = 71°c, the compliance is somewhat overpredicted for load durations from 1 week to 2 years. for the higher age at loading, t′ = 90 days, the measured data are underpredicted for all temperatures, but starting from load durations of 10 days all the creep rates are predicted very well. the reduced accuracy can be attributed to the general tendency of the b3 model to overemphasize the effect of aging.

fig. 2: experimental data (kommendant, polivka and pirtz) and computed compliance functions for ages at loading t′ = 28 days (top) and t′ = 90 days (bottom); axes: compliance j [10^−6/mpa] vs. duration of loading t − t′ [days], curves for t = 23°c, 43°c and 71°c

in this example, the present results agree with those presented in the original work [2], which verifies the correct implementation. the calculated data are independent of the parameters c0 and k1. one can obtain exactly the same curves as for t = 23°c in figure 2 just by substituting the parameters q1-q4, the age at loading t′ and the duration of loading t − t′ into the full version of the b3 model; see equation (1).

the original b3 model contains a simple extension to basic creep at constant elevated temperatures; see section 1.7.2 in [1]. the actual age at loading and the load duration are replaced by the equivalent age and the equivalent load duration, which evolve faster at elevated temperatures (a rough numerical sketch of this equivalent-time idea is given at the end of this subsection). the compliance functions calculated with the default values of the activation energies, an assumed water content w = 200 kg/m3 and an average 28-day compressive strength f̄c = 35 mpa are shown in figure 3. for loading at the age t′ = 28 days, the initial compliance is overestimated, and for the highest temperature, 71°c, the rate of creep for longer load durations is too low. the compliance functions for loading at the age t′ = 90 days fit the experimental data nicely, except for the highest temperature. in all cases the rate of creep is captured better by mps theory.

fig. 3: experimental data (kommendant, polivka and pirtz) and compliance functions computed with the original model b3 for ages at loading t′ = 28 days (top) and t′ = 90 days (bottom)
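the b3 temperature extension of section 1.7.2 in [1] is not reproduced in detail here; the following is only a rough sketch of the equivalent-time idea it relies on, assuming a generic arrhenius acceleration factor with an illustrative activation energy, not the actual formulas or values of [1].

```python
import numpy as np

def equivalent_time(times, T, Q_over_R=5000.0, T_ref=293.0):
    """equivalent time t_e = integral of an arrhenius acceleration factor
    beta(T) = exp(Q/R (1/T_ref - 1/T)) over real time; at T = T_ref the
    equivalent time equals real time, at higher T it runs faster.
    Q_over_R and T_ref are illustrative placeholders."""
    beta = np.exp(Q_over_R * (1.0 / T_ref - 1.0 / T))
    increments = np.diff(times) * 0.5 * (beta[1:] + beta[:-1])  # trapezoid rule
    return np.concatenate(([times[0]], times[0] + np.cumsum(increments)))

times = np.linspace(0.0, 100.0, 1001)   # days of real load duration
T = np.full_like(times, 344.0)          # sustained 71 c in kelvin
print(equivalent_time(times, T)[-1])    # far more than 100 days of "maturity"
```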
3.2 experiments of nasser and neville (1965)

nasser and neville studied the creep of cylindrical concrete specimens subjected to three different levels of temperature. in their experiments, all specimens were sealed in water-tight jackets and placed in a water bath in order to guarantee a constant temperature. at the age of 14 days the specimens were loaded to 35%, 60% or 69% of the average compressive strength at the time of loading; unfortunately, only the lowest load level is in the range in which concrete creep can be considered as linear. paper [9] does not contain enough information to allow the parameters of mps theory to be predicted, but the values q1 = 15, q2 = 80, q3 = 24 and q4 = 5 (all in 10^−6/mpa) published in [2] again provide good agreement at room temperature; see the first graph in figure 4.

fig. 4: experimental data (nasser and neville) and compliance functions for temperatures 21°c, 71°c and 96°c (standard and modified model variants); axes: compliance j [10^−6/mpa] vs. duration of loading t − t′ [days]

fig. 5: experimental data (nasser and neville) and compliance functions obtained with the original model b3 for temperatures 21°c, 71°c and 96°c

for the higher temperature, t = 71°c, the agreement is good up to 20 days of loading, but afterwards the computed rate of creep is too low. a remedy can be sought in modifying the activation energy. reducing qs/r from the default value of 3000 k to an adjusted value of 2200 k leads to an excellent fit; see the curve labeled "modified" in figure 4. unfortunately, the prediction for the highest temperature (t = 96°c) is improved only partially. changes in the activation energy have no influence on the results when the temperature is close to room temperature. before loading, the specimens had been exposed to an environment at the given temperature, which accelerated the hydration processes in the concrete, i.e. its maturity. in other words, the higher the temperature, the lower the initial compliance. on the other hand, for longer periods of loading the higher temperature accelerates the rate of bond breakages, which accelerates creep. this justifies the shape of the obtained curve for the medium temperature, which is different from the one published in [2], where the initial compliance for this temperature was higher than for room temperature. the compliance functions obtained with the b3 model are shown for all tested temperatures in figure 5. again, default values were used for the activation energies, with an assumed water content w = 200 kg/m3 and compressive strength f̄c = 35 mpa. the experimental data for room temperature and for the highest temperature are captured nicely, but the compliance function for t = 71°c is overestimated (until 100 days of load duration), and the final rate of creep seems to be too low.
3.3 experiments of fahmi, polivka and bresler (1972)

in these experiments, all specimens had the shape of a hollow cylinder with inner diameter 12.7 cm, outer diameter 15.24 cm and height 101.6 cm. the weight ratio of the components of the concrete mixture was water : cement : aggregates = 0.58 : 1 : 2. from this we can estimate that the concrete mixture contained approximately 520 kg of cement per cubic meter. the average 21-day compressive strength was 40.3 mpa. using the ceb-fip recommendations, the 28-day strength can be estimated as 42.2 mpa. the experiment was performed for four different histories of loading, temperature and relative humidity. the loading programs of the first two specimens are summarized in tables 1 and 2; the other two loading programs, with cyclic thermal loading, are specified in tables 3 and 4.

table 1: testing program of the sealed specimen with one temperature cycle (data set #1)

  time duration [day] | t [°c] | rh [%] | σ [mpa]
  21                  | 23     | 100    | 0
  37                  | 23     | 98     | −6.27
  26                  | 47     | 98     | −6.27
  82                  | 60     | 98     | −6.27
  10                  | 23     | 98     | −6.27
  25                  | 23     | 98     | 0

table 2: testing program of the drying specimen with one temperature cycle (data set #2)

  time duration [day] | t [°c] | rh [%] | σ [mpa]
  18                  | 23     | 100    | 0
  14                  | 23     | 50     | 0
  37                  | 23     | 50     | −6.27
  108                 | 60     | 50     | −6.27
  10                  | 23     | 50     | −6.27
  25                  | 23     | 50     | 0

table 3: testing program of the sealed specimen subjected to several temperature cycles (data set #3); asterisks denote a section which is repeated 4×

  time duration [day] | t [°c] | rh [%] | σ [mpa]
  21                  | 23     | 100    | 0
  35                  | 23     | 98     | −6.27
  9                   | 40     | 98     | −6.27
  5                   | 60     | 98     | −6.27
  14                  | 23     | 98     | −6.27
  7*                  | 60     | 98     | −6.27
  7*                  | 23     | 98     | −6.27
  7                   | 60     | 98     | −6.27
  12                  | 23     | 98     | −6.27
  40                  | 23     | 98     | 0

table 4: testing program of the drying specimen subjected to several temperature cycles (data set #4); asterisks denote a section which is repeated 4×

  time duration [day] | t [°c] | rh [%] | σ [mpa]
  18                  | 23     | 100    | 0
  14                  | 23     | 50     | 0
  33                  | 23     | 50     | −6.27
  15                  | 60     | 50     | −6.27
  14                  | 23     | 50     | −6.27
  7*                  | 60     | 50     | −6.27
  7*                  | 23     | 50     | −6.27
  7                   | 60     | 50     | −6.27
  13                  | 23     | 50     | −6.27
  14                  | 23     | 50     | 0

the four parameters of the b3 model describing basic creep, q1, q2, q3 and q4, were determined from the composition of the concrete mixture and from the compressive strength using the empirical formulae of [1]. the result of this prediction exceeded expectations; only minor adjustments were necessary to get the optimal fit (see the first part of the strain evolution in figure 6). the following values were used: q1 = 19.5, q2 = 160, q3 = 5.25 and q4 = 12.5 (all in 10^−6/mpa). they differ significantly from the values recommended in [2], q1 = 25, q2 = 100, q3 = 1.5 and q4 = 6, which do not provide satisfactory agreement with the experimental data.

mps theory uses three additional parameters, c0, k1 and c, but parameter c should be equal to c0 q4. it has been found that the remaining parameters c0 and k1 are not independent: what matters for creep is only their product (a short derivation is given below). for different combinations of c0 and k1 giving the same product, the evolution of the microprestress is different, but the evolution of the creep strain is exactly the same. since the microprestress is not directly measurable, c0 and k1 cannot (and need not) be determined separately. in practical computations, k1 can be set to a fixed value (e.g. 1 mpa/k), and c0 can be varied until the best fit with the experimental data is obtained; in all the following figures c0 is specified in mpa^−1 day^−1. all other parameters were used according to the standard recommendations. a really good fit of the first experimental data set (98% relative humidity, i.e. h = 0.98) was obtained for c0 = 0.235 mpa^−1 day^−1; see figure 6.
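the statement that only the product c0 k1 matters can be seen by rescaling the microprestress; this short derivation is ours, written in the notation of equations (2), (4) and (5). substituting $S = k_1 \tilde{S}$ into (2) and dividing by $k_1$ gives

$$ \frac{\mathrm{d}\tilde{S}}{\mathrm{d}t} + \psi_s(T,h)\,(c_0 k_1)\,\tilde{S}^2 = \left| \frac{\mathrm{d}(T\ln h)}{\mathrm{d}t} \right|, \qquad \tilde{S}(t_0) = \frac{1}{(c_0 k_1)\, t_0}, $$

so the rescaled microprestress depends on c0 and k1 only through their product; and the flow viscosity (5), with c = c0 q4, becomes

$$ \eta_f = \frac{1}{c\,S} = \frac{1}{q_4\,(c_0 k_1)\,\tilde{S}}, $$

hence the flow strain rate in (4), and with it the creep strain, is also governed solely by the product c0 k1.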
the agreement is satisfactory except for the last interval, which corresponds to unloading. it is worth noting that the thermally induced part of creep accounts for more than a half of the total creep (compare the experimental data with the solid curve labeled "basic" in figure 6). unfortunately, with the default values of the other parameters, the same value of c0 could not be used to fit experimental data set number 2, because it would have led to an overestimation of the creep (see the dashed curve in figure 7). in the first loading interval of 37 days, creep takes place at room temperature, and the best agreement would be obtained with parameter c0 set to 0.940 mpa^−1 day^−1; see the dash-dotted curve in figure 7. however, at the later stage, when the temperature rises to 60°c, the creep would be grossly overestimated. a reasonable agreement during this stage of loading is obtained with c0 reduced to 0.067 mpa^−1 day^−1 (solid curve in figure 7), but then the creep is underestimated in the first interval (figure 6). raising the parameter αs from its recommended value 0.1 to 0.3 (short-dashed curve in figure 7) has approximately the same effect as decreasing c0 from 0.235 to 0.067 mpa^−1 day^−1. the parameter αs controls the effect of reduced humidity on the rate of microprestress relaxation, and its modification has no effect on the response of sealed specimens.

fig. 6: mechanical strain evolution for sealed specimens, with relative pore humidity assumed to be 98%, loaded by compressive stress 6.27 mpa at time t′ = 21 days; curves for basic creep and for c0 = 0.067, 0.235 and 0.671 mpa^−1 day^−1

fig. 7: mechanical strain evolution for drying specimens at 50% relative environmental humidity, loaded by compressive stress 6.27 mpa at time t′ = 32 days; curves for c0 = 0.067, 0.235 (also modified) and 0.940 mpa^−1 day^−1

fig. 8: mechanical strain evolution for a sealed specimen, loaded by compressive stress 6.27 mpa at time t′ = 21 days and subjected to cyclic variations of temperature; curves for c0 = 0.235 mpa^−1 day^−1 and for basic creep

fig. 9: mechanical strain evolution for drying specimens, loaded by compressive stress 6.27 mpa at time t′ = 32 days and subjected to cyclic variations of temperature; curve for c0 = 0.235 mpa^−1 day^−1

for the last two testing programs, described in tables 3 and 4, the agreement between the experimental and computed data is reasonable only until the end of the second heating cycle (solid curves in figures 8 and 9). for data set 3, the final predicted compliance exceeds the measured value almost twice (figure 8); for data set 4, almost five times (figure 9). in order to obtain a better agreement, parameter c0 would have to be reduced, but this would result in an underestimation of the creep in the first two testing programs. the experimental data show that the temperature cycles significantly increase the creep only in the first cycle; during subsequent thermal cycling, their effect on creep diminishes.
therefore it could be beneficial to enhance the material model by adding an internal memory, which would improve the behavior under cyclic thermal loading, while the response to sustained loading would remain unchanged.

another deficiency of the model is illustrated by the graphs in figure 10. they refer to the first set of experiments. as documented by the solid curve in figure 6, a good fit was obtained by setting parameter c0 = 0.235 mpa^−1 day^−1, assuming that the relative pore humidity is 98%. the pores are initially completely filled with water; however, even if the specimen is perfectly sealed, the relative humidity decreases slightly due to the water deficiency caused by the hydration reaction. this phenomenon is referred to as self-desiccation.

fig. 10: mechanical strain evolution for sealed specimens, loaded by compressive stress 6.27 mpa from the age of 21 days, with the assumed relative pore humidity varied from 95% to 100%; parameters of mps theory: k1 = 1 mpa/k, c0 = 0.235 mpa^−1 day^−1

the problem is that the exact value of the pore humidity in a sealed specimen, and its evolution in time, are difficult to determine. in simple engineering calculations, a constant value of 98% is often used. unfortunately, the response of the model is quite sensitive to this choice, and the creep curves obtained with other assumed values of pore humidity in the range from 95% to 100% would be different; see figure 10. the source of this strong sensitivity is the assumption that the instantaneously generated microprestress is proportional to the absolute value of the change of t ln(h); see the right-hand side of (2). rewriting (2) as

$$ \frac{\mathrm{d}S}{\mathrm{d}t} + \psi_s(T,h)\, c_0 S^2 = k_1 \left| \ln h \,\frac{\mathrm{d}T}{\mathrm{d}t} + \frac{T}{h} \frac{\mathrm{d}h}{\mathrm{d}t} \right| \qquad (6) $$

we can see that at (almost) constant humidity close to 100%, the right-hand side is proportional to the magnitude of the temperature rate, with a proportionality factor k1 |ln(h)| ≈ k1 (1 − h). for instance, if the assumed humidity is changed from 99% to 98%, this proportionality factor is doubled.

4 improved material model and its validation

as a simple remedy to overcome these problems, the microprestress relaxation equation (2) is replaced by

$$ \frac{\mathrm{d}S}{\mathrm{d}t} + \psi_s(T,h)\, c_0 S^2 = k_1 \left| \frac{T}{h} \frac{\mathrm{d}h}{\mathrm{d}t} - \kappa_T\, k_T(T)\, \frac{\mathrm{d}T}{\mathrm{d}t} \right| \qquad (7) $$

with

$$ k_T(T) = e^{-c_T (T_{max} - T)} \qquad (8) $$

in which κt and ct are new parameters and tmax is the maximum temperature reached. with κt = 0.02, the creep curves in figure 10, plotted for different assumed pore humidities, would be almost identical with the solid curve, which nicely fits the experimental results. the introduction of a new parameter provides more flexibility, which is needed to improve the fit of the second testing program in figure 7, with the combined effects of drying and temperature variation.
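for orientation, here is a minimal sketch of the right-hand side of the improved relaxation law (7)-(8); the parameter values are the "improved" set quoted below, while the discrete inputs in the usage lines are illustrative assumptions.

```python
import math

def improved_rhs(T, h, dT_dt, dh_dt, T_max, k1=4.0, kappa_T=0.005051, c_T=0.3):
    """right-hand side of eq. (7): the humidity and temperature rates enter
    separately, and the memory factor k_T(T) of eq. (8) suppresses the
    temperature term once T lies below the maximum reached value T_max."""
    k_T = math.exp(-c_T * (T_max - T))   # eq. (8)
    return k1 * abs(T / h * dh_dt - kappa_T * k_T * dT_dt)

# re-heating towards a previously reached 333 k generates far less
# microprestress than heating at the maximum reached temperature itself:
print(improved_rhs(T=300.0, h=0.98, dT_dt=5.0, dh_dt=0.0, T_max=333.0))
print(improved_rhs(T=333.0, h=0.98, dT_dt=5.0, dh_dt=0.0, T_max=333.0))
```

this is exactly the internal-memory behaviour motivated above: subsequent thermal cycles, which stay below the maximum reached temperature, contribute much less to the creep-accelerating microprestress than the first cycle.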
for sealed specimens and monotonous thermal loading, only the product c0 k1 κt matters, and so the good fit in figure 7 could be obtained with different combinations of κt and c0.

the results are shown in figures 11 and 12 for sustained thermal loading (data sets 1 and 2), and in figures 13 and 14 for cyclic thermal loading (data sets 3 and 4). default values of the parameters αs, αr, αe and of the activation energies are used. in these plots, the data series labeled "original mps" show the results obtained with standard mps. the data series "κt = −ln(0.98)" were obtained with c0 = 0.235 mpa^−1 day^−1, k1 = 1 mpa/k, κt = 0.020203 and ct = 0. the data series "κt adjusted" correspond to the parameters c0 = 0.235 mpa^−1 day^−1, k1 = 4 mpa/k, κt = 0.005051 and ct = 0. note that in the case of constant relative humidity (figures 11 and 13) these series coincide with the data series "original mps". the best agreement with the experimental data is obtained with c0 = 0.235 mpa^−1 day^−1, k1 = 4 mpa/k, κt = 0.005051 and ct = 0.3 k^−1; these series are labeled "improved". in figure 11, only a small change can be observed compared to the data series "original mps"; the differences arise when the temperature ceases to be monotonous. for the sealed specimen (figure 11), this change is detrimental, but looking at figures 13 and 14, this deterioration is negligible compared to the substantial improvement in the case of cyclic thermal loading.

fig. 11: mechanical strain evolution for sealed specimens loaded by compressive stress 6.27 mpa at time t′ = 21 days (experimental data, original mps, improved mps)

fig. 12: mechanical strain evolution for drying specimens loaded by compressive stress 6.27 mpa at time t′ = 32 days (experimental data, original mps, κt = −ln(0.98), κt adjusted, improved)

fig. 13: mechanical strain evolution for a sealed specimen, loaded by compressive stress 6.27 mpa at time t′ = 21 days and subjected to cyclic variations of temperature (experimental data, original mps, improved)

fig. 14: mechanical strain evolution for drying specimens, loaded by compressive stress 6.27 mpa at time t′ = 32 days and subjected to cyclic variations of temperature (experimental data, original mps, κt = −ln(0.98), κt adjusted, improved)

5 conclusions

the material model based on mps theory has been successfully implemented into the oofem finite element package, and has been used in simulations of concrete creep at variable temperature and humidity. mps theory performs well for standard sustained levels of temperature and for load levels within the linear range of creep, provided that the activation energy is properly adjusted. for higher sustained temperatures (above 70°c) the experimental data are reproduced with somewhat lower accuracy. for sealed specimens subjected to variable temperature, the results predicted by mps theory are very sensitive to the assumed value of the relative pore humidity (which is slightly below 100% due to self-desiccation). in order to overcome this deficiency, a modified version of the model has been proposed and successfully validated. the excessive sensitivity to the specific choice of relative humidity has been eliminated. also, it has become easier to calibrate the model, because the thermal and moisture effects on creep are partially separated. the original mps theory grossly overestimates creep when the specimen is subjected to cyclic temperature. a new variable kt has been introduced in order to reduce the influence of subsequent thermal cycles on creep. this modification does not affect creep tests in which the evolution of temperature is monotonous.
acknowledgement

financial support for this work was provided by projects 103/09/h078 and p105/10/2400 of the czech science foundation. the financial support is gratefully acknowledged.

references

[1] bažant, z. p., baweja, s.: creep and shrinkage prediction model for analysis and design of concrete structures: model b3. adam neville symposium: creep and shrinkage, structural design effects, 2000.
[2] bažant, z. p., cedolin, l., cusatis, g.: temperature effect on concrete creep modeled by microprestress-solidification theory. journal of engineering mechanics, 2004, 130, 6, 691-699.
[3] bažant, z. p., hauggaard, a. b., ulm, f.: microprestress-solidification theory for concrete creep. i: aging and drying effects. journal of engineering mechanics, 1997, 123, 11, 1188-1194.
[4] bažant, z. p., hauggaard, a. b., ulm, f.: microprestress-solidification theory for concrete creep. ii: algorithm and verification. journal of engineering mechanics, 1997, 123, 11, 1195-1201.
[5] bažant, z. p., prasannan, s.: solidification theory for concrete creep. i: formulation. journal of engineering mechanics, 1989, 115, 8, 1691-1703.
[6] bažant, z. p., prasannan, s.: solidification theory for concrete creep. ii: verification and application. journal of engineering mechanics, 1989, 115, 8, 1704-1725.
[7] fahmi, h. m., polivka, m., bresler, b.: effects of sustained and cyclic elevated temperature on creep of concrete. cement and concrete research, 1972, 2, 591-606.
[8] kommendant, g. j., polivka, m., pirtz, d.: study of concrete properties for prestressed concrete reactor vessels, final report, part ii: creep and strength characteristics of concrete at elevated temperatures. rep. no. ucsesm 76-3, prepared for general atomic company. berkeley: dept. of civil engineering, univ. of california, 1976.
[9] nasser, k. w., neville, a. m.: creep of concrete at elevated temperatures. aci journal, 1965, 62, 1567-1579.
[10] patzák, b.: oofem home page, http://www.oofem.org, 2000.
[11] patzák, b., bittnar, z.: design of object oriented finite element code. advances in engineering software, 2001, 32, 10-11, 759-767.
[12] patzák, b., rypl, d., bittnar, z.: parallel explicit finite element dynamics with nonlocal constitutive models. computers & structures, 2001, 79, 26-28, 2287-2297.

petr havlásek
e-mail: petr.havlasek@fsv.cvut.cz
milan jirásek
e-mail: milan.jirasek@fsv.cvut.cz
department of mechanics
faculty of civil engineering
czech technical university in prague
thákurova 7, 166 29 prague 6

development of an ultralight with a ducted fan

jiří brabec, martin helmich

department of aerospace engineering, faculty of mechanical engineering, ctu, karlovo namesti 1, 121 35, prague 2, czech republic
corresponding author: helmich@aerospace.fsik.cvut.cz

abstract

this paper introduces the ul-39 project, an ultralight aircraft with a ducted fan, and some of the problems that have arisen in the course of its development. several problems with the design of a non-traditional aircraft of this kind are mentioned, e.g. the design of the airframe and the design of the propulsion unit. the paper describes the specific procedure for determining the basic thrust characteristics of this unusual aircraft concept, and also the experimental determination of these characteristics. further options for applying the experience gained during the work, and the future focus of work on these issues, are outlined at the end of the paper.

keywords: available thrust, required thrust, propulsion, ducted fan.

figure 1: ul-39 model.
1 introduction

about 12 years ago, the idea of building an ultralight aircraft powered by a ducted fan arose at ctu in prague. an airplane of this type would offer the feeling of jet-powered flight to a relatively large population, at a much lower cost than a real jet plane. at the same time, it would enable a higher maximum speed than is usual for propeller-driven planes in this category. the existence of such an engine would also provide an impulse for the market to further improve its production, which has now stabilized at a certain level and is not developing any further. in the past, a number of aircraft were constructed which attempted to use either a ducted fan or a propeller in a ring. in most cases, these were amateur efforts with insufficient theoretical support, or with inadequate financial support, and in the end they were unsuccessful. the technical possibilities that have become available in recent years, especially the existence of relatively light and highly effective high-speed engines and significant developments in composite technologies, have enabled the construction of an airplane of this type under conditions accessible to producers of ultralights. a number of problems have arisen during the development of this type of aircraft that we do not face when developing a "regular" ultralight, and which require a non-traditional or completely new solution. some aspects of the design of this aircraft, and the work being done on developing it, will be described here.

2 specifications of the airframe design

even the conception of such an aircraft poses certain difficulties when compared with propeller airplanes, especially when it comes to designing the size of the aircraft. this then has a significant effect on the weight of the aircraft. the most difficult task is to keep to the weight limits determined by the construction regulations. the target weight category for these aircraft is limited to a maximum take-off weight of 472.5 kg when using a parachute rescue system (or 450 kg without a parachute rescue system). taking into account all variable weight items, such as pilots and fuel, we are left with about 320 kg for the airframe and the power unit. where can we save construction weight? because the size of the fuselage is given by the size of the ducted fan power unit and the necessity of locating the crew there, it cannot be changed significantly. savings have to be found first on the wings. minimization of the wing area required the use of aerodynamic profiles reaching high maximum lift coefficient values and the use of highly effective high-lift systems [1]. this resulted in a wing area that is 20 % smaller than usual for this category of two-seater aircraft, and the wing is about 20 % lighter than competitive wings. another way to lower the weight is by using progressive technologies and light-weight materials.

figure 2: real set-up of the aircraft propulsion.

3 conceptual set-up of the power unit

the propeller is the main means for converting energy into thrust in the ultralight category of "relatively low" weight airplanes. considering the differences, and attempting to create a new concept for an ultralight aircraft while keeping to the certification criteria, we arrive at the idea of using a non-conventional power unit in this category: propulsion with a fan in the outlet channel, i.e. a cold propulsor. this is a construction type that crosses the border from propeller to jet propulsion.
the whole propulsion system consists of the inlet channel, a one-stage low-pressure axial fan with a stator blading, powered via a transmission shaft by a piston engine, and an outlet channel with a jet nozzle. the propulsion set-up is a dominant feature of the structure, in which the inlet and the outlet channel form an internal integral sandwich structure of the aircraft fuselage. after long consideration, we selected as the heart of the power unit a four-cylinder motorcycle petrol engine with a displacement of 999 cm3, used in the bmw s1000rr motorcycle, with a maximum power of approx. 200 hp (see [3]).

4 technologies used

due to placing the outlet channel in the fuselage of the plane, and therefore increasing the weight of the airplane, and due to the necessity of complying with the weight limits of the selected construction regulations, the structure uses composite materials. contact lamination is currently the standard production technology for those parts of ultralight airplanes that are made of composite materials. contact lamination causes differences in the weight of the components and in their mechanical characteristics. at the same time, it introduces a risk of low temperature tolerance. all these factors, together with the idea of creating a really unique plane, led to the use of autoclave technology with prepreg materials, regularly used in the construction of army airplanes or airplanes in the general aviation category. up to now, though, autoclave technology had never been used in an airplane in the ultralight (ul) category.

figure 3: locating the power unit in the fuselage.

5 determining the thrust characteristics

in the course of developing a non-traditional airplane of this kind, a number of problems arise that are not directly connected with the structure itself, but which in the end influence it quite significantly. to determine the dimensions of most load-bearing parts of the structure, it is necessary to know the maximum speed that the aircraft can reach during horizontal flight. to determine the maximum speed, it is necessary to know the thrust characteristics of the power unit and the drag characteristics of the plane. these characteristics are represented by the values called the available thrust, i.e. the thrust that the plane can use at a specific flight speed, and the thrust required to overcome the plane's drag. both these values depend greatly on the flight speed. for a regular plane (with a propeller), both of these values can be determined using empirical relations, which generally correspond well with the real values. for a plane with a ducted fan power unit, the situation is slightly different, but very similar, if not practically the same, as for a jet airplane. in this case, the drag characteristics of the plane are quite strongly affected by the presence of the power unit. it is very difficult, if not impossible, to employ the simple calculation procedures that can be used for a propeller airplane. because the drag of the plane is greatly affected by the power unit regime (and thus depends on the available thrust), the two problems cannot be solved independently, which is the regular procedure with propeller propulsion. the procedure currently used to deal with this issue will be described below. the dependencies mentioned below will for now be presented without units and without real values, which should not be published at this point in the project.
5.1 methodology for determining the maximum speed

first, the necessary values were determined on the basis of empirical relations (in the case of the airframe) and cfd analysis (in the case of the power unit). generally, the required thrust of the power unit is determined as the sum of the contributions of the individual aircraft parts to the total drag power. in the calculation, the drags of the elementary parts of the airplane were considered, e.g. the fuselage, wing, tailplane, landing gear, canopy and air inlets. the theoretical courses of these values are marked in fig. 4. the maximum speed value is given by the meeting point of the curves of the available and required thrust in fig. 4 (curves tp and tv); a numerical sketch of this intersection is given at the end of this subsection.

figure 4: equiponderant thrust diagram.

as mentioned above, this is where we encounter the problem of determining the real values of the required and available thrust. determining the drag of the plane using the regular procedure does not take into consideration how it is influenced by the thrust of the power unit. in addition, theoretical determination of the available thrust of the power unit can be considered very difficult and inaccurate, due to the lack of experience with procedures of this kind. in the czech republic, this problem has undoubtedly been solved several times in the development of jet planes, but always in military applications, and the results are therefore not available to the development team. after long discussions about the methodology for determining the real performance parameters of the propulsion unit and the thrust required to power the plane, we decided on an experimental determination of the required characteristics and on compiling a methodology to determine the available and required thrust theoretically.
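as an illustration of the meeting-point construction in fig. 4, the following is a minimal sketch that locates the intersection of an available-thrust curve and a required-thrust curve on a sampled speed grid; both curve shapes and all coefficients are invented placeholders, since the project's real values are withheld.

```python
import numpy as np

def max_speed(v, t_available, t_required):
    """return the speed at which the available-thrust curve first drops
    below the required-thrust curve (curves tp and tv in fig. 4), found
    by linear interpolation between grid points."""
    excess = t_available - t_required
    (idx,) = np.nonzero((excess[:-1] > 0) & (excess[1:] <= 0))
    i = idx[0]
    frac = excess[i] / (excess[i] - excess[i + 1])
    return v[i] + frac * (v[i + 1] - v[i])

v = np.linspace(10.0, 120.0, 500)    # m/s, illustrative grid
t_avail = 2600.0 - 8.0 * v           # placeholder available-thrust curve [n]
t_req = 900.0 + 0.16 * v**2          # placeholder required-thrust curve [n]
print(max_speed(v, t_avail, t_req))  # placeholder vmax, about 81 m/s here
```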
5.2 determining the thrust characteristics on a static test bed

the first necessary and logical step for the successful development of the new aircraft concept was to manufacture a static test bed located in a laboratory. this test bed is intended for static measurements (without forward speed), for determining the functionality of the system, for investigating the thrust characteristics for several different configurations, and for making comparisons with the theoretical values obtained while developing the propulsion unit. the core of the test bed was the engine designed for the yamaha yzf-r1 2004, a 180 hp sports motorbike; that was the previous choice of power plant, before the bmw engine. this demonstrator of the aircraft propulsion unit enabled us to measure the important propulsion system parameters for configurations with one or two coolers placed in the outlet channel, with varying outlet nozzle diameters, etc. the most valuable data, e.g. the thrust, were determined by dynamometers and metal planchettes equipped with a tensometric gauge. the velocity and pressure fields in the outlet nozzle were investigated as functions of the engine or fan revolutions. from the operational point of view, the temperatures on the rotor bearings, near the coolers, in the exhaust system and in the engine compartment are all significant parameters. other measured parameters were the fuel consumption and the volume flow of coolant in the cooling system. the dependency of the available thrust on motor speed for different propulsion variants is shown in fig. 6.

figure 5: view of the back part of the static stand for tests on the new power unit under laboratory conditions.

figure 6: example of the performance characteristics of the stand.

5.3 determining the thrust characteristics during mobile tests

although the previous measurements provided a number of very valuable operational and performance parameters, it was now necessary to obtain the characteristics as functions of speed. this was absolutely necessary for further development of the flying prototype. for this purpose, it was necessary to carry out mobile tests of the propulsion unit. the prototype of the propulsion unit described above was attached to a ford f-350, which was able to drive with a load of about 500 kg at speeds over 100 km/h. the arrangement of the whole device is shown in fig. 7. a speed of only 140 km/h was reached with this arrangement. this was only half of the expected maximum speed of the aircraft, and was even lower than the cruising speed. it was necessary to find some other device that would be able to drive much faster and carry such a heavy load. there was practically only one possible way to achieve higher speeds on a ground device: the propulsion unit was placed on a buggyra racing truck. the speed reached in this arrangement was 245 km/h. this arrangement is shown in fig. 8.

figure 7: mobile tests of the propulsion unit with a ford car.

figure 8: mobile tests of the propulsion unit with a buggyra truck.

the forces on the test bed in the direction of the thrust were measured in both cases in a way similar to the static tests, by planchettes equipped with a tensometric gauge. it became evident when evaluating the measurement results that this force measurement method and our own testing method were inappropriate. very intense vibrations were transmitted from the car into the test bed, and it was very complicated to filter them out (a sketch of such filtering is given at the end of this subsection). the vibrations were primarily caused by the test bed-car connection, which was practically fixed. the measured values could be considered acceptable, after filtering, only for speeds of about 100 km/h, because at higher speeds the aerodynamic drag of the measuring device became more dominant than the thrust of the propulsion unit. this was due to the inappropriate shape of the test device, which had been primarily designed for static tests in a laboratory and was not shaped for mobile tests.
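the paper does not state how the vibration filtering was actually done; as one plausible post-processing approach, here is a minimal sketch that low-pass filters a noisy thrust record with a zero-phase butterworth filter. the cut-off frequency, sampling rate and synthetic signal are illustrative assumptions.

```python
import numpy as np
from scipy.signal import butter, filtfilt

fs = 1000.0                                  # assumed sampling rate [hz]
t = np.arange(0.0, 10.0, 1.0 / fs)
thrust = 800.0 + 5.0 * t                     # slowly varying "true" force [n]
noisy = thrust + 150.0 * np.sin(2 * np.pi * 35.0 * t)   # drivetrain vibration

# 4th-order low-pass butterworth, 2 hz cut-off, applied forward and
# backward (filtfilt) so the filtered record is not phase-shifted
b, a = butter(4, 2.0 / (fs / 2.0), btype="low")
filtered = filtfilt(b, a, noisy)
print(np.max(np.abs(filtered - thrust)))     # residual error [n]
```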
some measured thrusts of the propulsion unit as functions of speed are drawn in fig. 9 (dimensionless). the same problem would, of course, arise if we made an aerodynamically better-shaped device: the forces in the direction of the thrust will always be influenced by the drag of the device that is used. in view of the difficulties in predicting the required and available thrust mentioned above, it was not possible to determine which part of the measured force was caused by thrust and which part by drag. as a result of this reasoning, we came up with a certain modification to the methodology for determining the maximum speed mentioned above. this modification leads to the next step in the experimentation: the next and final step involved measuring a full-scale aircraft fuselage equipped with a functional propulsion unit in an aerodynamic tunnel. the following procedure will work with the experimentally obtained available thrust values; the maximum speed value will then be determined by the intersection of the curves of the available and required thrust (curves tp and tv).

figure 9: thrust characteristics measured for the mobile device.

5.4 measuring with the fuselage demonstrator

a fuselage demonstrator was made for the purposes of tunnel measurement and for testing the installation of the power plant in the fuselage. before measuring the fuselage equipped with a functional propulsion unit, measurements were made in an aerodynamic tunnel of the fuselage without a propulsion unit, in order to verify the possible positions of the fuselage in the tunnel and the correctness of the methodology. before the measurements were made, the fuselage was hung up on tensometric weights in the 3-metre diameter aerodynamic tunnel at vzlú a.s. in prague letňany. the method for hanging the fuselage in the tunnel is shown in fig. 10. the measurements in the tunnel were performed for speeds from 0 to 216 km/h. in this area, too, our aircraft is a pioneer. it is practically the first ultralight aircraft to be tested in an aerodynamic tunnel, and it is the first aircraft in the czech republic for which full-scale measurements have been performed (with the exception of unmanned aerial vehicles).

figure 10: fuselage demonstrator placed in an aerodynamic tunnel.

to verify the fuselage drag force values obtained in the tests in the aerodynamic tunnel, a comparison was made with the values obtained from the theoretical calculations performed in [2]. this comparison is presented in fig. 11. a comparison of the dependencies shows clearly that the value obtained by measurement is bigger by about 25 % at maximum speed. some part of this difference can be assigned to several gaps on the fuselage, which are perpendicular to the air flow (aircraft drag can generally be increased by 10 % due to this phenomenon). the greatest part will be caused by the differences between the flow in a free stream and the flow in the tunnel, the major part of which is filled by the measured object. even after eliminating these influences, a difference will remain between the theoretical results and the experimental results, even without the propulsion unit running, which will further strongly influence the results.

figure 11: a comparison between theory and experiment.

5.5 focus for further work

as a next step, we will work on supplementing the fuselage demonstrator with a functional propulsion unit, and tests will then be carried out in an aerodynamic tunnel. to evaluate the measurements, we will prepare correction coefficients for converting the values obtained by measurements in the tunnel to movement in a free stream, and we will correlate the experiment with the theoretically obtained values. based on the experience gained in the tests and from the comparison with theory, we will compile a methodology for determining the available and required thrust, which can then be used for other aircraft based on a similar concept.

6 conclusion

many problems have arisen in the course of developing this aircraft that have not previously been investigated in the ultralight weight category. this is primarily due to the revolutionary concept of the ul-39 aircraft. some of the problems that have been investigated are indicated in this paper. especially the problems with the thrust characteristics, to which we have given the most space in this paper, have good potential to form the basis for further theoretical and experimental work. this work, in turn, will be highly beneficial for the future development of aircraft in the ultralight weight category.
optimization of the propulsion unit, which is unique not only in the ultralight category, has a long way to go before serial production can be considered.

references

[1] j. brabec. koncepční studie letounu ul-39. technická zpráva tzp/ult/72/11, ústav letadlové techniky, čvut v praze, 2011.
[2] j. brabec. aerodynamický výpočet letounu ul-39. technická zpráva tzp/ult/8/11, ústav letadlové techniky, čvut v praze, 2011.
[3] m. helmich. uspořádání pohonné jednotky letounu ul-39. technická zpráva tzp/ult/68/2011, ústav letadlové techniky, čvut v praze, 2011.
[4] s. f. hoerner. fluid-dynamic drag. published by the author, 1965.
[5] j. roskam. airplane design, part vi: preliminary calculation of aerodynamic, thrust and power characteristics. the university of kansas, 1987.
[6] r. theiner. studie nekonvenčního ul letounu. phd thesis, čvut v praze, 2007.
[7] e. torenbeek. synthesis of subsonic airplane design. delft university press, 1976.

acta polytechnica vol. 51 no. 1/2011

quantized solitons in the extended skyrme-faddeev model

l. a. ferreira, s. kato, n. sawado, k. toda

abstract

the construction of axially symmetric soliton solutions with non-zero hopf topological charges within a theory known as the extended skyrme-faddeev model was performed in [1]. in this paper we show how masses of glueballs are predicted within this model.

keywords: integrable systems, solitons, monopoles, instantons, semiclassical quantizations.

we construct static soliton solutions carrying non-trivial hopf topological charges for a field theory that has found interesting applications in many areas of physics. it is a (3+1)-dimensional lorentz invariant field theory for a triplet of scalar fields $\vec n$, living on the two-sphere $s^2$, $\vec n^2 = 1$, and defined by the lagrangian density

$\mathcal{L} = M^2\, \partial_\mu \vec n \cdot \partial^\mu \vec n - \frac{1}{e^2}\,(\partial_\mu \vec n \wedge \partial_\nu \vec n)^2 + \frac{\beta}{2}\,(\partial_\mu \vec n \cdot \partial^\mu \vec n)^2, \quad (1)$

where the coupling constants $e^2$ and $\beta$ are dimensionless, and $M$ has a mass dimension. the first two terms correspond to the so-called skyrme-faddeev (sf) model [2, 3, 4], proposed following an idea of skyrme [5]. it was conjectured by faddeev and niemi [6] that the sf model describes the low energy (strong coupling) regime of the pure su(2) yang-mills theory. this was based on the so-called cho-faddeev-niemi-shabanov decomposition [6, 7, 8] of the su(2) yang-mills field $\vec A_\mu$, where its six physical degrees of freedom are encoded into a triplet of scalars $\vec n$ ($\vec n^2 = 1$), a massless u(1) gauge field, and two real scalar fields. gies has computed the one-loop wilsonian effective action for the su(2) yang-mills theory and has found agreement with the conjecture, provided that the sf model is modified by additional quartic terms in derivatives of the $\vec n$ field [9]. one can now stereographically project $s^2$ on a plane and work with a complex scalar field $u$ related to the triplet $\vec n$ by

$\vec n = \frac{1}{1+|u|^2}\,\bigl(u + u^*,\; -i\,(u - u^*),\; |u|^2 - 1\bigr). \quad (2)$

the static hamiltonian associated to (1) is

$\mathcal{H}_{\rm static} = 4 M^2\, \frac{\partial_i u\, \partial_i u^*}{(1+|u|^2)^2} - \frac{8}{e^2}\left[ \frac{(\partial_i u)^2 (\partial_j u^*)^2}{(1+|u|^2)^4} + (\beta e^2 - 1)\, \frac{(\partial_i u\, \partial_i u^*)^2}{(1+|u|^2)^4} \right]. \quad (3)$

therefore, it is positive definite for $M^2 > 0$, $e^2 < 0$, $\beta < 0$, $\beta e^2 \geq 1$. the euler-lagrange equations following from (1) read

$(1+|u|^2)\, \partial_\mu \mathcal{K}^\mu - 2 u^*\, \mathcal{K}^\mu\, \partial_\mu u = 0, \quad (4)$

together with its complex conjugate, where

$\mathcal{K}_\mu := M^2\, \partial_\mu u + \frac{4}{e^2}\, \frac{(\beta e^2 - 1)(\partial_\nu u\, \partial^\nu u^*)\, \partial_\mu u + (\partial_\nu u\, \partial^\nu u)\, \partial_\mu u^*}{(1+|u|^2)^2}. \quad (5)$
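since the projection (2) must land on the unit sphere, a quick symbolic check is possible. the following is a minimal sketch using sympy (the library choice is ours, not the paper's), verifying that $\vec n \cdot \vec n = 1$ for the triplet in (2):

```python
import sympy as sp

a, b = sp.symbols('a b', real=True)
u = a + sp.I * b                      # complex scalar field u
uc = sp.conjugate(u)
den = 1 + u * uc                      # 1 + |u|^2

# triplet n from the stereographic projection (2)
n1 = (u + uc) / den
n2 = -sp.I * (u - uc) / den
n3 = (u * uc - 1) / den

print(sp.simplify(n1**2 + n2**2 + n3**2))   # -> 1, confirming n.n = 1
```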
we choose to use the toroidal coordinates defined as

$x^1 = \frac{r_0}{p}\,\sqrt{z}\,\cos\varphi, \qquad x^2 = \frac{r_0}{p}\,\sqrt{z}\,\sin\varphi, \qquad x^3 = \frac{r_0}{p}\,\sqrt{1-z}\,\sin\xi, \quad (6)$

where $p = 1 - \cos\xi\,\sqrt{1-z}$, $x^i$ ($i = 1, 2, 3$) are the cartesian coordinates in $\mathbb{R}^3$, and $(z, \xi, \varphi)$ are the toroidal coordinates. we have $0 \leq z \leq 1$, $-\pi \leq \xi \leq \pi$, $0 \leq \varphi \leq 2\pi$, and $r_0$ is a free parameter with the dimension of length. we use the ansatz for the solution

$u = \sqrt{\frac{1 - g(z,\xi)}{g(z,\xi)}}\; e^{\,i\,\Theta(z,\xi) + i\,n\,\varphi}, \quad (7)$

with $n$ being an integer. we now impose the boundary conditions

$g(z=0, \xi) = 0, \qquad g(z=1, \xi) = 1 \qquad \text{for } -\pi \leq \xi \leq \pi \quad (8)$

and

$\Theta(z, \xi=-\pi) = -m\pi, \qquad \Theta(z, \xi=\pi) = m\pi \qquad \text{for } 0 \leq z \leq 1, \quad (9)$

with $m$ being an integer.

fig. 1: the hopf charge density isosurfaces for solutions with various charges $(m, n) = (1,2)$ (left), $(2,2)$ (middle) and $(4,1)$ (right) for $\beta e^2 = 1.1$

the finite energy solutions of the theory (1) define maps from the three-dimensional space $\mathbb{R}^3 \sim s^3$ to the target space $s^2$. these are classified into homotopy classes labeled by an integer $q_H$ called the hopf index, which takes the value $q_H = m n$. substituting (7) into (4) and (5), we get two coupled non-linear partial differential equations in two variables. we have constructed numerical solutions with several hopf charges up to four by a successive over-relaxation method. in fig. 1, we present some of the results for the hopf charge density for the charges $(m, n) = (1,2)$, $(2,2)$ and $(4,1)$.

probably the main interest in studying this class of models is to get an insight into the mass of the glueballs. since the solutions we find are classical, they must be properly quantized. making the following replacement for the fields [10]

$\vec n(\vec r) \cdot \vec\tau \;\to\; \vec n'(\vec r, t) \cdot \vec\tau := A(t)\,\bigl[\vec n\bigl(R(B(t))\,\vec r - \vec X\bigr) \cdot \vec\tau\bigr]\,A^\dagger(t), \quad (10)$

one obtains the kinetic contribution to the energy

$T = \frac{1}{2}\, a_i U_{ij} a_j - a_i W_{ij} b_j + \frac{1}{2}\, b_i V_{ij} b_j \quad (11)$

with $a_i := -i\,{\rm tr}(\tau_i A^\dagger \dot A)$ and $b_i := i\,{\rm tr}(\tau_i \dot B B^\dagger)$. here $A$ and $B$ are matrices in su(2), and $B$ acts through so(3) in the form $R_{ij}(B) = \frac{1}{2}\,{\rm tr}(\tau_i B \tau_j B^{-1})$. $\vec b$ is an angular velocity, and $\vec a$ is an angular velocity in isospace. quantizing these coordinates, one finally finds the quantized mass of the hopf soliton

$E_{\rm quanta} := \frac{M}{|e|}\, E_{\rm static} - \frac{M}{e^2 |e|} \left[ \frac{I(I+1)}{2 U_{11}} + \frac{J(J+1)}{2 V_{11}} + \frac{K_3^2}{2} \left( \frac{1}{U_{33}} - \frac{1}{U_{11}} - \frac{n^2}{V_{11}} \right) \right], \quad (12)$

where $E_{\rm static}$ is the energy corresponding to the hamiltonian (3) and the inertia tensors $U_{ab}$, $V_{ab}$ are functions of the classical fields $\vec n$. numerically $U_{11}$ has a quite large value, so we omit the terms with $U_{11}$. as a result, the states are essentially labeled by the two quantum numbers $(J, K_3)$. krusch applied the basic finkelstein-rubinstein (fr) constraints to the skyrme-faddeev model, and found that the quantum states with even topological charge $q_H$ can be bosonic [11]. they may be possible candidates for glueballs.

fig. 2: the quantized spectra for the topological charges $(m, n) = (1,2)$ and $(4,1)$ (bold line), compared with the result of the lattice simulation [12]

we plot the lowest three states of $(m, n) = (1,2)$ and $(4,1)$. first, we fitted the first two spectra to the result of the lattice simulation [12] and then computed the third lowest spectrum, corresponding to $J^{PC} = 2^{-+}$. for $(1,2)$, the third state has lower energy than the second state, because the second term on the right-hand side of (12) gives a negative contribution to the quantum energy. on the other hand, if we employ the solution $(4,1)$, the third state appears near the prediction of the lattice. this is quite promising. in [12], the authors also predicted the root mean square radius $\sqrt{\langle r^2 \rangle} \sim 0.481$ fm.
in our calculation, the radius is $\sqrt{\langle r^2 \rangle} = 0.466$ fm (for $(1,2)$) or $0.536$ fm (for $(4,1)$); the results are consistent.

we summarize our analysis. we have found new hopf soliton solutions for the extended skyrme-faddeev model. the model is a low energy effective model of qcd, and the solutions are possible candidates for glueballs. we have performed the collective coordinate quantization for the obtained classical solutions and have computed the quantum energies. some of our results are in good agreement with the study of the lattice gauge simulation.

acknowledgement

the authors acknowledge financial support from fapesp (brazil). one of the authors (l. a. f.) is partially supported by cnpq. this work was partially supported by a grant-in-aid for the global coe program "the next generation of physics, spun from universality and emergence" from the ministry of education, culture, sports, science and technology (mext) of japan. one of the authors (k. t.) is partially supported by the fy 2010 researcher exchange program between jsps and the german academic exchange service. one of the authors (n. s.) expresses his gratitude to the conference organizers of isqs19 for kind accommodation and hospitality.

references

[1] ferreira, l. a., sawado, n., toda, k.: static hopfions in the extended skyrme-faddeev model, journal of high energy physics, jhep 11 (2009) 124.
[2] faddeev, l. d.: quantization of solitons, princeton preprint ias print-75-qs70 (1975); faddeev, l. d., niemi, a. j.: knots and particles, nature 387, 58 (1997).
[3] battye, r. a., sutcliffe, p. m.: to be or knot to be?, phys. rev. lett. 81, 4798 (1998).
[4] hietarinta, j., salo, p.: faddeev-hopf knots: dynamics of linked un-knots, phys. lett. b 451, 60 (1999).
[5] skyrme, t. h. r.: a nonlinear field theory, proc. roy. soc. lond. a 260, 127 (1961).
[6] faddeev, l. d., niemi, a. j.: partially dual variables in su(2) yang-mills theory, phys. rev. lett. 82, 1624 (1999).
[7] cho, y. m.: a restricted gauge theory, phys. rev. d 21, 1080 (1980); cho, y. m.: extended gauge theory and its mass spectrum, phys. rev. d 23, 2415 (1981).
[8] shabanov, s. v.: an effective action for monopoles and knot solitons in yang-mills theory, phys. lett. b 458, 322 (1999).
[9] gies, h.: wilsonian effective action for su(2) yang-mills theory with cho-faddeev-niemi-shabanov decomposition, phys. rev. d 63, 125023 (2001).
[10] kondo, k. i., ono, a., shibata, a., shinohara, t., murakami, t.: glueball mass from quantized knot solitons and gauge-invariant gluon mass, j. phys. a 39, 13767 (2006).
[11] krusch, s., speight, j. m.: fermionic quantization of hopf solitons, commun. math. phys. 264, 391 (2006).
[12] morningstar, c. j., peardon, m. j.: the glueball spectrum from an anisotropic lattice study, phys. rev. d 60, 034509 (1999).

about the author

nobuyuki sawado is currently working as an associate professor at the institute of science and technology, department of physics, tokyo university of science. his research interest is in topological solitons and related areas, such as effective models of qcd, brane world scenarios and some topics in condensed matter physics.
luiz agostinho ferreira
institute of physics of são carlos, ifsc/usp, university of são paulo – usp, caixa postal 369, cep 13560-970, são carlos-sp, brazil

satoru kato
department of physics, institute of science and technology, tokyo university of science, noda, chiba 278-8510, japan

nobuyuki sawado
e-mail: sawado@ph.noda.tus.ac.jp
department of physics, institute of science and technology, tokyo university of science, noda, chiba 278-8510, japan

kouichi toda
department of mathematical physics, toyama prefectural university, kurokawa 5180, imizu, toyama, 939-0398, japan
and research and education center for natural sciences, hiyoshi campus, keio university, 4-1-1 hiyoshi, kouhoku-ku, yokohama, 223-8521, japan

acta polytechnica vol. 51 no. 2/2011

cosmology and the subgroups of gamma-ray bursts

a. mészáros, j. řípa, l. g. balázs, z. bagoly, p. veres, i. horváth

abstract

both short and intermediate gamma-ray bursts are distributed anisotropically in the sky (mészáros, a. et al.: apj, 539, 98 (2000); vavrek, r. et al.: mnras, 391, 1741 (2008)). hence, in the redshift range where these bursts take place, the cosmological principle is in doubt. it has already been noted that short bursts should be mainly at redshifts smaller than one (mészáros, a. et al.: gamma-ray burst: sixth huntsville symp., aip, vol. 1133, 483 (2009); mészáros, a. et al.: baltic astron., 18, 293 (2009)). here we show that intermediate bursts should be at redshifts up to three.

introduction

in several papers, the authors have shown that there are three subgroups of gamma-ray bursts (grbs); see horváth, i. et al.: apj, 713, 552 (2010) and the references therein. the three subgroups are shown in figures 1–3 for different instruments (batse on the compton gamma-ray observatory, http://heasarc.gsfc.nasa.gov/docs/cgro/batse/; the rhessi satellite, http://science.nasa.gov/missions/rhessi/; the swift satellite, http://heasarc.nasa.gov/docs/swift/swiftsc.html).

fig. 1: three subgroups of batse grbs separated with respect to duration and hardness. t90 is in seconds; for definitions of hardness, duration t90 and for more details see horváth et al.: a & a, 447, 23 (2006)

fig. 2: three similar subgroups of rhessi grbs. t90 is in seconds; řípa et al.: a & a, 498, 399 (2009)

fig. 3: three similar subgroups of swift grbs. t90 is in seconds; horváth, i. et al.: apj, 713, 552 (2010)

angular distribution of the batse grbs

the biggest number of detected grbs is in the batse database. if the cosmological principle holds, then they should be distributed isotropically in the sky. in other words, these bursts may well serve as a test of isotropic distribution. the sky distributions are shown in figures 4–6.

fig. 4: celestial distribution of short batse grbs. these short grbs are not distributed isotropically; vavrek, r. et al.: mnras, 391, 1741 (2008)

fig. 5: celestial distribution of intermediate batse grbs. they are also not distributed isotropically; mészáros, a. et al.: apj, 539, 98 (2000); vavrek, r. et al.: mnras, 391, 1741 (2008)

fig. 6: celestial distribution of long batse grbs. long grbs seem to be distributed isotropically; vavrek, r. et al.: mnras, 391, 1741 (2008)
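one simple way to make such an isotropy test quantitative is to compare a dipole statistic of the burst directions against isotropic monte-carlo catalogues. the sketch below (in python, with randomly generated stand-in positions rather than the actual batse sample) only illustrates the idea; the statistic and all values are assumptions for illustration, not the method used in the papers cited above.

```python
import numpy as np

rng = np.random.default_rng(0)

def unit_vectors(ra, dec):
    """convert ra/dec in radians to cartesian unit vectors."""
    return np.stack([np.cos(dec) * np.cos(ra),
                     np.cos(dec) * np.sin(ra),
                     np.sin(dec)], axis=1)

def dipole(vecs):
    """length of the mean direction vector; close to 0 for an isotropic sample."""
    return np.linalg.norm(vecs.mean(axis=0))

# stand-in 'catalogue' of n burst positions (here: drawn isotropically)
n = 500
ra = rng.uniform(0.0, 2.0 * np.pi, n)
dec = np.arcsin(rng.uniform(-1.0, 1.0, n))   # uniform on the sphere
d_obs = dipole(unit_vectors(ra, dec))

# monte-carlo reference distribution under isotropy
d_mc = np.array([dipole(unit_vectors(rng.uniform(0, 2*np.pi, n),
                                     np.arcsin(rng.uniform(-1, 1, n))))
                 for _ in range(2000)])
print(f"dipole = {d_obs:.4f}, isotropic p-value = {(d_mc >= d_obs).mean():.3f}")
```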
redshifts of short and intermediate grbs

in two previous papers it was already suggested that short grbs are mainly at z < 1 (z is the redshift); mészáros, a. et al.: gamma-ray burst: sixth huntsville symp., aip, vol. 1133, 483 (2009); mészáros, a. et al.: baltic astron., 18, 293 (2009). hence, up to z ∼ 1 the cosmological principle is in doubt. in this paper we have also acquired the redshifts of intermediate grbs (from the swift satellite; horváth, i. et al.: apj, 713, 552 (2010)). the redshifts of the known swift grbs are shown in figure 7. it is seen that these redshifts reach up to z ∼ 3.

fig. 7: redshift distributions of swift grbs (short grbs = solid line; intermediate grbs = dashed line; long grbs = dotted line). intermediate grbs are at even higher redshifts than short ones; horváth, i. et al.: apj, 713, 552 (2010)

conclusion

of course, it is not fully necessary that the redshifts of intermediate grbs from two different experiments (batse vs. swift) are the same in the statistical sense. however, keeping this eventuality in mind, it seems that anisotropies exist up to z ∼ 3 in the spatial distribution of grbs.

acknowledgement

this study was supported by otka grant k77795, by grant agency of the czech republic grants no. 205/08/h005 and p209/10/0734, by project svv 261301 of charles university in prague, and by research program msm0021620860 of the ministry of education of the czech republic.

references

[1] horváth, i., et al.: a & a, 447, 23 (2006).
[2] horváth, i., et al.: apj, 713, 552 (2010).
[3] http://heasarc.gsfc.nasa.gov/docs/cgro/batse/
[4] mészáros, a., et al.: apj, 539, 98 (2000).
[5] mészáros, a., et al.: gamma-ray burst: sixth huntsville symp., aip, vol. 1133, 483 (2009).
[6] mészáros, a., et al.: baltic astron., 18, 293 (2009).
[7] řípa, j., et al.: a & a, 498, 399 (2009).
[8] http://heasarc.nasa.gov/docs/swift/swiftsc.html
[9] vavrek, r., et al.: mnras, 391, 1741 (2008).

attila mészáros
e-mail: meszaros@cesnet.cz
charles university, faculty of mathematics and physics, astronomical institute, v holešovičkách 2, 180 00 prague 8, czech republic

jakub řípa
e-mail: ripa@sirrah.troja.mff.cuni.cz
charles university, faculty of mathematics and physics, astronomical institute, v holešovičkách 2, 180 00 prague 8, czech republic

lajos g. balázs
e-mail: balazs@konkoly.hu
konkoly observatory, po box 67, h-1525 budapest, hungary

zsolt bagoly
e-mail: zsolt@yela.elte.hu
lab. for information technology, eötvös university, pázmány p. s. 1/a, h-1518 budapest, hungary

péter veres
e-mail: veresp@elte.hu
lab. for information technology, eötvös university, pázmány p. s. 1/a, h-1518 budapest, hungary

istván horváth
e-mail: horvath.istvan@zmne.hu
dept. of physics, bolyai military university, h-1581 budapest, pob 15, hungary

acta polytechnica vol. 52 no. 6/2012

hygro-thermo-mechanical analysis of a reactor vessel

jaroslav kruis, tomáš koudelka, tomáš krejčí

department of mechanics, faculty of civil engineering, czech technical university in prague, thákurova 7, 166 29 prague

corresponding author: jk@cml.fsv.cvut.cz

abstract

determining the durability of a reactor vessel requires a hygro-thermo-mechanical analysis of the vessel throughout its service life. damage, prestress losses, the distribution of heat and moisture and some other quantities are needed for a durability assessment. a coupled analysis was performed on a two-level model because of the huge demands on computer hardware. this paper deals with a hygro-thermo-mechanical analysis of a reactor vessel made of prestressed concrete with a steel inner liner. the reactor vessel is located in temelín, czech republic.
keywords: durability of a reactor vessel, damage, coupled heat and moisture transport.

1 introduction

many nuclear power plants across europe are approaching the end of their design durability. this raises issues connected with the high costs of decommissioning and replacing existing plants. extending the serviceability of the plants is therefore welcome, because it is significantly cheaper than replacing them. however, detailed analyses of the reactor vessels have to be performed, because nuclear equipment is at the centre of public attention and there are severe safety requirements.

a comprehensive analysis of an existing reactor vessel contains mechanical and transport parts which are coupled together. several analyses are necessary in order to estimate the behaviour of a vessel after more than 30 years of service life. during this long period of time, the vessel has gone through many stages that have to be modelled. this paper deals with a hygro-thermo-mechanical analysis of a reactor vessel made of prestressed concrete with a steel inner liner. the reactor vessel is located in temelín, czech republic.

coupled hygro-thermo-mechanical analyses are very demanding, because many unknowns are discretized, and therefore many degrees of freedom are needed. moreover, the growing number of degrees of freedom in the nodes of a finite element mesh leads to a larger bandwidth of the system matrix, and direct solvers become significantly inefficient. at the same time, the properties of the system matrix are poor due to the very different orders of the particular entries, and iterative methods require many iterations. an efficient implementation of hygro-thermo-mechanical analysis is described in reference [1].

due to limited space, only selected models used in the analysis are described. section 2 is devoted to the orthotropic damage model, and section 3 describes the künzel model of coupled heat and moisture transport. section 4 describes the simulation of the reactor vessel.

2 mechanical models — orthotropic damage model

in the first approach, the containment was computed with the help of a scalar isotropic damage model, which showed that the tensile strength plays a key role. several analyses with different tensile strength values for the concrete were performed, and the results were unrealistic unless the tensile strength was greater than 3.7 mpa; lower tensile strength values led to unrealistic damage of the containment. the tensile strength was not measured experimentally, but the mean value of the compressive strength obtained from earlier experiments was 60 mpa, and the approximate tensile strength value, according to the czech technical standard, can then be set to 3.5 mpa. these results led to the conclusion that it was necessary to use a more advanced damage model for concrete which can better describe 3d problems.

in reference [2], the authors proposed a general anisotropic model for concrete which contains nine material parameters. laboratory measurements of the required material parameters have to be performed, but this causes difficulties in certain cases. additionally, the model requires a significant number of internal variables which have to be stored. these difficulties led to the development of a simplified version of the model, based on six material parameters — three for tension and three for compression.
the model is based on the following stress-strain relation

$\sigma_\alpha = \bigl( 1 - H(\bar\varepsilon_\alpha)\, d^{(t)}_\alpha - H(-\bar\varepsilon_\alpha)\, d^{(c)}_\alpha \bigr) \left[ \left( K - \frac{2}{3} G \right) \bar\varepsilon_V + 2 G\, \bar\varepsilon_\alpha \right], \quad (1)$

where the subscript $\alpha$ stands for the index of the principal components of the given quantity and $\bar\varepsilon_\alpha$ denotes the principal values of the strain tensor. the model defines two sets of damage parameters, $d^{(t)}_\alpha$ and $d^{(c)}_\alpha$, for tension and compression, respectively. in equation (1), the symbol $H(\cdot)$ denotes the heaviside function, $K$ is the bulk modulus, $G$ is the shear modulus and $\bar\varepsilon_V$ stands for the volumetric strain.

there are many evolution laws that can be used for describing $d^{(t)}_\alpha$ and $d^{(c)}_\alpha$. in our problems, two evolution laws similar to the laws used in the scalar isotropic damage model are used. the first law gives better results for compression, but it is more complicated to determine its material parameters. it can be written in the form

$d^{(\beta)}_\alpha = \frac{ a^{(\beta)} \bigl( |\bar\varepsilon^{(\beta)}_\alpha| - \bar\varepsilon^{(\beta)}_0 \bigr)^{b^{(\beta)}} }{ 1 + a^{(\beta)} \bigl( |\bar\varepsilon^{(\beta)}_\alpha| - \bar\varepsilon^{(\beta)}_0 \bigr)^{b^{(\beta)}} }, \quad (2)$

where the superscript $(\beta)$ represents the index $t$ or $c$, used for tension or compression. $a^{(\beta)}$ and $b^{(\beta)}$ are material parameters controlling the peak value and the slope of the softening branch, and $\bar\varepsilon^{(\beta)}_0$ is the strain threshold: damage evolves after the strains exceed the limit value $\bar\varepsilon_0$.

the second law involves a correction of the dissipated energy with respect to the size of the elements, and it provides a better description of the tension. it is defined by the non-linear equation (3), which can be solved using the newton method:

$\bigl( 1 - d^{(\beta)}_\alpha \bigr)\, E\, |\bar\varepsilon^{(\beta)}_\alpha| = f_\beta \exp\left( - \frac{ d^{(\beta)}_\alpha\, h\, |\bar\varepsilon^{(\beta)}_\alpha| }{ w^{(\beta)}_{cr0} } \right). \quad (3)$

in the above equation, $f_\beta$ represents the tensile or compressive strength, $w^{(\beta)}_{cr0}$ controls the initial slope of the softening branches, $E$ is the young modulus of elasticity, $h$ is the characteristic size of the finite element and $\bar\varepsilon^{(\beta)}_\alpha$ is the principal value of the strain tensor. more details about the implemented models can be found in [3], [4] and [5].
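as an illustration of how equation (3) can be solved by the newton method, the following python sketch finds the damage parameter for a given principal strain; all material values (young modulus, strength, element size, $w_{cr0}$) are assumed round numbers, not the parameters used in the actual analysis.

```python
import math

def damage_newton(eps, E=30e9, f=3.5e6, h=0.05, w_cr0=2e-4, tol=1e-10):
    """solve (1 - d) * E * eps = f * exp(-d * h * eps / w_cr0) for d
    by newton's method; eps is the (positive) principal strain.
    all parameter defaults are illustrative assumptions."""
    if E * eps <= f:            # below the strength limit: no damage yet
        return 0.0
    c = h * eps / w_cr0
    d = 0.5                     # initial guess inside (0, 1)
    for _ in range(50):
        g = (1.0 - d) * E * eps - f * math.exp(-c * d)    # residual of eq. (3)
        dg = -E * eps + f * c * math.exp(-c * d)          # its derivative
        step = g / dg
        d -= step
        if abs(step) < tol:
            break
    return min(max(d, 0.0), 1.0)

print(damage_newton(5e-4))      # -> damage parameter between 0 and 1
```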
3 coupled heat and moisture transfer — the künzel and kiessl model

this model introduces two unknowns in a material: the relative humidity $\varphi$ [–] and the temperature $T$ [k]. the model divides the overhygroscopic region into two subranges — the capillary water region and the supersaturated region — where different conditions for water and water vapour transport are considered. for the description of simultaneous water and water vapour transport, the relative humidity $\varphi$ is chosen as the only moisture potential for both the hygroscopic and the overhygroscopic range. the model uses certain simplifications; nevertheless, it describes all the substantial phenomena, and the predicted results comply well with experimentally obtained data. this is the main advantage of the model, together with the easy and quick determination of the material properties measured in a laboratory.

3.1 transport equations

künzel proposed that the moisture transport mechanisms relevant to numerical analysis in the field of building physics are just water vapour diffusion and liquid transport [6]. vapour diffusion is the most important in large pores, whereas liquid transport takes place on pore surfaces and in small capillaries. vapour diffusion in porous media is described in the model by fick's diffusion and effusion in the form

$j_v = -\delta_p \nabla p = -\frac{\delta}{\mu} \nabla p, \quad (4)$

where $j_v$ is the water vapour flux [kg m⁻² s⁻¹], $\delta_p$ [kg m⁻¹ s⁻¹ pa⁻¹] is the vapour permeability of the porous material, $p$ denotes the vapour pressure [pa], the vapour diffusion resistance number $\mu$ is a material property and $\delta$ [kg m⁻¹ s⁻¹ pa⁻¹] is the vapour diffusion coefficient in air.

the liquid transport mechanism includes the liquid flow in the absorbed layer (surface diffusion) and in the water-filled capillaries (capillary transport). the driving potential in both cases is the capillary pressure (matrix suction), or the relative humidity $\varphi$. the flux of liquid water is described by

$j_w = -D_\varphi \nabla \varphi, \quad (5)$

where the liquid conductivity $D_\varphi$ [kg m⁻¹ s⁻¹] is the product of the liquid diffusivity $D_w$ [m² s⁻¹] and the derivative of the water retention function, $D_\varphi = D_w \cdot {\rm d}w/{\rm d}\varphi$, where $w$ [kg m⁻³] is the water content of the material.

the heat flux is proportional to the thermal conductivity of the moist porous material and the temperature gradient (fourier's law)

$q = -\lambda \nabla T, \quad (6)$

where $\lambda$ [w m⁻¹ k⁻¹] is the thermal conductivity of the moist material. the enthalpy flow through moisture movement and the phase transitions are taken into account in the form of source terms in the heat balance equation.
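a minimal sketch of the three constitutive fluxes (4)–(6), evaluated for assumed one-dimensional gradients; the material values are placeholders of plausible magnitude, not measured properties of the containment concrete.

```python
import numpy as np

# assumed material data (illustrative values only):
delta = 1.9e-10      # vapour diffusion coefficient in air [kg m^-1 s^-1 Pa^-1]
mu = 15.0            # vapour diffusion resistance number [-]
D_phi = 2.0e-7       # liquid conductivity D_phi = D_w * dw/dphi [kg m^-1 s^-1]
lam = 1.5            # thermal conductivity of the moist material [W m^-1 K^-1]

grad_p = np.array([40.0, 0.0, 0.0])      # vapour pressure gradient [Pa/m]
grad_phi = np.array([0.1, 0.0, 0.0])     # relative humidity gradient [1/m]
grad_T = np.array([25.0, 0.0, 0.0])      # temperature gradient [K/m]

j_v = -(delta / mu) * grad_p             # fick diffusion/effusion, eq. (4)
j_w = -D_phi * grad_phi                  # liquid transport, eq. (5)
q = -lam * grad_T                        # fourier's law, eq. (6)
print(j_v, j_w, q)
```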
3.2 balance equations

the heat and moisture balance equations are closely coupled, because the moisture content depends on the total enthalpy and thermal conductivity, while the temperature depends on the moisture flow. the resulting set of differential equations describing the simultaneous heat and moisture transfer, expressed in terms of the temperature $T$ and the relative humidity $\varphi$, has the form of partial differential equations defined on a domain $\Omega$:

$\frac{{\rm d}w}{{\rm d}\varphi} \frac{\partial \varphi}{\partial t} = \nabla^T \bigl( D_\varphi \nabla \varphi + \delta_p \nabla (\varphi\, p_{\rm sat}) \bigr), \quad (7)$

$\left( \rho c + \frac{{\rm d}H_w}{{\rm d}T} \right) \frac{\partial T}{\partial t} = \nabla^T \bigl( \lambda \nabla T \bigr) + h_v\, \nabla^T \bigl( \delta_p \nabla (\varphi\, p_{\rm sat}) \bigr), \quad (8)$

where $H_w$ [j m⁻³] is the enthalpy of the moisture of the material, $h_v$ [j kg⁻¹] is the evaporation enthalpy of water, $p_{\rm sat}$ [pa] is the water vapour saturation pressure, $\rho$ [kg m⁻³] is the mass density of the porous material, $c$ [j kg⁻¹ k⁻¹] is the specific heat capacity and $t$ [s] denotes time.

the boundary of the domain $\Omega$ is split into parts $\Gamma_{TD}$, $\Gamma_{\varphi D}$, $\Gamma_{TN}$, $\Gamma_{\varphi N}$, $\Gamma_{TC}$ and $\Gamma_{\varphi C}$. $D$ indicates the dirichlet boundary conditions (prescribed values), $N$ denotes the neumann boundary conditions (prescribed fluxes) and $C$ denotes the cauchy/newton boundary conditions (flux caused by transmission). the parts $\Gamma_{TD}$, $\Gamma_{TN}$ and $\Gamma_{TC}$ are disjoint and their union is the whole boundary $\Gamma$; the same is valid for the parts $\Gamma_{\varphi D}$, $\Gamma_{\varphi N}$ and $\Gamma_{\varphi C}$. the heat fluxes are prescribed on the part $\Gamma_q = \Gamma_{TN} \cup \Gamma_{TC}$ and the moisture fluxes are prescribed on the part $\Gamma_j = \Gamma_{\varphi N} \cup \Gamma_{\varphi C}$.

the system of equations (7) and (8) is accompanied by three types of boundary conditions. the dirichlet boundary conditions have the form

$T(x, t) = \bar T(x, t), \quad x \in \Gamma_{TD}, \quad (9)$

$\varphi(x, t) = \bar\varphi(x, t), \quad x \in \Gamma_{\varphi D}. \quad (10)$

the neumann boundary conditions have the form

$q(x, t) = \bar q(x, t), \quad x \in \Gamma_{TN}, \quad (11)$

$j(x, t) = \bar j(x, t), \quad x \in \Gamma_{\varphi N}. \quad (12)$

the cauchy/newton boundary conditions have the form

$q(x, t) = \beta_T \bigl( T(x, t) - T_\infty(x, t) \bigr), \quad x \in \Gamma_{TC}, \quad (13)$

$j(x, t) = \beta_\varphi \bigl( p(x, t) - p_\infty(x, t) \bigr), \quad x \in \Gamma_{\varphi C}. \quad (14)$

in the previous relationships, $\bar T(x, t)$ is the prescribed temperature, $\bar\varphi(x, t)$ is the prescribed relative humidity, $\bar q(x, t)$ is the prescribed heat flux, $\bar j(x, t)$ is the prescribed moisture flux, $\beta_T$ [w m⁻² k⁻¹] and $\beta_\varphi$ [kg m⁻² s⁻¹ pa⁻¹] are the heat and mass transfer coefficients, $T_\infty$ is the ambient temperature and $p_\infty$ is the ambient water vapour pressure. besides the boundary conditions, the initial conditions are prescribed, i.e.

$T(x, 0) = T_0(x), \quad x \in \Omega, \quad (15)$

$\varphi(x, 0) = \varphi_0(x), \quad x \in \Omega, \quad (16)$

where $T_0(x)$ denotes the initial temperature and $\varphi_0(x)$ denotes the initial relative humidity.

3.3 discretization of the differential equations

the finite element method is used for the spatial discretization of the partial differential equations (7) and (8). the weighted residual statement [7] is applied to the mass balance equation, assuming $\delta T = 0$ on $\Gamma_{TD}$ and $\delta\varphi = 0$ on $\Gamma_{\varphi D}$:

$\int_\Omega \delta\varphi \left( \frac{{\rm d}w}{{\rm d}\varphi} \frac{\partial \varphi}{\partial t} - \nabla^T \bigl( D_\varphi \nabla \varphi + \delta_p \nabla (\varphi\, p_{\rm sat}) \bigr) \right) {\rm d}\Omega = 0 \quad (17)$

and also to the energy balance equation:

$\int_\Omega \delta T \left( \left( \rho c + \frac{{\rm d}H_w}{{\rm d}T} \right) \frac{\partial T}{\partial t} - \nabla^T \bigl( \lambda \nabla T \bigr) - h_v\, \nabla^T \bigl( \delta_p \nabla (\varphi\, p_{\rm sat}) \bigr) \right) {\rm d}\Omega = 0. \quad (18)$

in the finite element method, the temperature $T$ and the relative humidity $\varphi$ are approximated in the form

$T = N(x)\, d_T, \qquad \varphi = N(x)\, d_\varphi \quad (19)$

and the gradients of the temperature and the relative humidity are also needed:

$\nabla T = B(x)\, d_T, \qquad \nabla \varphi = B(x)\, d_\varphi. \quad (20)$

in the previous equations, $N(x)$ denotes the matrix of approximation functions, $B(x)$ is the matrix of their derivatives, $d_T$ denotes the vector of nodal temperatures and $d_\varphi$ denotes the vector of nodal relative humidities. a set of first order differential equations is obtained in the matrix form

$\begin{pmatrix} K_{\varphi\varphi} & K_{\varphi T} \\ K_{T\varphi} & K_{TT} \end{pmatrix} \begin{pmatrix} d_\varphi \\ d_T \end{pmatrix} + \begin{pmatrix} C_{\varphi\varphi} & C_{\varphi T} \\ C_{T\varphi} & C_{TT} \end{pmatrix} \begin{pmatrix} \dot d_\varphi \\ \dot d_T \end{pmatrix} = \begin{pmatrix} j_\varphi \\ q_T \end{pmatrix}. \quad (21)$

the matrices $K_{\varphi\varphi}$, $K_{\varphi T}$, $K_{T\varphi}$ and $K_{TT}$ form the conductivity matrix of the problem, and they have the form

$K_{\varphi\varphi} = \int_\Omega B^T D_{\varphi\varphi} B \, {\rm d}\Omega, \quad (22)$

$K_{\varphi T} = \int_\Omega B^T D_{\varphi T} B \, {\rm d}\Omega, \quad (23)$

$K_{T\varphi} = \int_\Omega B^T D_{T\varphi} B \, {\rm d}\Omega, \quad (24)$

$K_{TT} = \int_\Omega B^T D_{TT} B \, {\rm d}\Omega, \quad (25)$

where the material conductivity matrices $D_{\varphi\varphi}$, $D_{\varphi T}$, $D_{T\varphi}$ and $D_{TT}$ are diagonal matrices and the diagonal entries are equal to the appropriate conductivities

$k_{\varphi\varphi} = D_w \frac{{\rm d}w}{{\rm d}\varphi} + \delta_p p_{\rm sat}, \qquad k_{\varphi T} = \delta_p \varphi \frac{{\rm d}p_{\rm sat}}{{\rm d}T}, \quad (26)$

$k_{T\varphi} = h_v \delta_p p_{\rm sat}, \qquad k_{TT} = \lambda + h_v \delta_p \varphi \frac{{\rm d}p_{\rm sat}}{{\rm d}T}. \quad (27)$

the matrices $C_{\varphi\varphi}$, $C_{\varphi T}$, $C_{T\varphi}$ and $C_{TT}$ form the capacity matrix of the problem, and they have the form

$C_{\varphi\varphi} = \int_\Omega N^T H_{\varphi\varphi} N \, {\rm d}\Omega, \quad (28)$

$C_{\varphi T} = \int_\Omega N^T H_{\varphi T} N \, {\rm d}\Omega, \quad (29)$

$C_{T\varphi} = \int_\Omega N^T H_{T\varphi} N \, {\rm d}\Omega, \quad (30)$

$C_{TT} = \int_\Omega N^T H_{TT} N \, {\rm d}\Omega, \quad (31)$

where the material capacity matrices $H_{\varphi\varphi}$, $H_{\varphi T}$, $H_{T\varphi}$ and $H_{TT}$ are diagonal matrices and the diagonal entries are equal to the appropriate capacities

$c_{\varphi\varphi} = \frac{{\rm d}w}{{\rm d}\varphi}, \qquad c_{\varphi T} = 0, \quad (32)$

$c_{T\varphi} = 0, \qquad c_{TT} = \rho c + \frac{{\rm d}H_w}{{\rm d}T}. \quad (33)$

the vectors $j_\varphi$ and $q_T$ contain the prescribed nodal fluxes and have the form

$j_\varphi = \int_{\Gamma_j} N^T \bar j_\varphi \, {\rm d}\Gamma, \qquad q_T = \int_{\Gamma_q} N^T \bar q_T \, {\rm d}\Gamma, \quad (34)$

where $\bar j_\varphi$ denotes the mass boundary fluxes and $\bar q_T$ denotes the heat boundary fluxes.
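as a sketch of how the conductivity blocks (22)–(25) of the system (21) can be assembled, the following python fragment uses one-dimensional two-node elements and assumed constant material conductivities; a real implementation would integrate the state-dependent coefficients (26)–(27) over 3d elements.

```python
import numpy as np

# one-dimensional mesh of two-node linear elements across the wall
x = np.linspace(0.0, 1.2, 13)            # wall thickness 1.2 m, 12 elements
nn = x.size

# assumed constant material conductivities (diagonal entries of D_phiphi ... D_TT)
k = {'pp': 2.3e-7, 'pt': 1.0e-9, 'tp': 4.0e-4, 'tt': 1.8}

def conductivity_block(coef):
    """assemble int(B^T coef B) dx for linear elements; B = [-1/L, 1/L]."""
    K = np.zeros((nn, nn))
    for e in range(nn - 1):
        L = x[e + 1] - x[e]
        ke = coef / L * np.array([[1.0, -1.0], [-1.0, 1.0]])
        K[e:e+2, e:e+2] += ke
    return K

# the four blocks of the conductivity matrix in eq. (21)
K_pp, K_pt = conductivity_block(k['pp']), conductivity_block(k['pt'])
K_tp, K_tt = conductivity_block(k['tp']), conductivity_block(k['tt'])
K = np.block([[K_pp, K_pt], [K_tp, K_tt]])   # unknowns ordered (d_phi, d_T)
print(K.shape)                               # -> (26, 26)
```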
4 computer simulation of a reactor vessel

the reliability and the durability of reactor containments depend directly on the prestressing system. the general results from in-situ measurements throughout operation show an increase in deformations and an increase in prestress losses since the onset of service. most measurements also indicate that the temperature has a major influence on the prestress losses. these conclusions have been obtained, e.g., from thirty years of measured prestress at swedish nuclear reactor containments [8].

figure 1: change in the tendon force gradient during service time [8]

this section presents a computer simulation of a nuclear power plant containment under cyclic temperature loading during service, when the stages of service and planned stops alternate. it is well known that an increase in temperature influences the rate of concrete creep. this fact can cause significant prestress losses in the structure. moreover, increasing deformations are observed and additional cracks may occur.

an advanced two-level model is used for predicting the prestress losses and the structural response. it is a combination of a global macro-level model and a local model. the aim of the global model is to model the evolution of the prestress forces changed by the temperature and climatic loading. the local model is loaded by the mechanical and thermal loading from the global model. a staggered coupled thermo-mechanical analysis is the main part of the local model, which has to explain the time dependent processes in the containment wall. the heat transfer analysis runs in parallel with the mechanical analysis, where the effect of temperature on concrete creep is modeled by bazant's microprestress-solidification theory, described in [9] and [10]. the local model is subsequently supplemented by suitable damage models.

the study presented here is a part of the overall reliability and durability model of the nuclear power plant containment in temelín in the czech republic. the computation attempts to model and explain the increase in radial deformation and the decrease in tendon forces since the onset of service. many measurements have been made to explain these phenomena at the swedish nuclear reactor containments over a time period of 5 years (6.5 years in the czech republic). the time evolution of the tendon force in a selected tendon is plotted in logarithmic scale in figure 1. two gradients of tendon force losses were also observed in the prestress measurements at the czech containment. with reference to [8], it can be concluded that an increase in temperature accelerates creep; any change in temperature, moisture content and loading changes the creep rate. there is no doubt that temperature is one of the sources of the increase in prestress losses. the influence of temperature will be the dominant phenomenon, while damage will have a minor effect, and the increase in radial strains will be neglected.

4.1 basic data

the containment of the nuclear power plant in the czech republic is a monolithic post-tensioned structure made from reinforced concrete. it consists of two parts — the lower cylindrical part and the upper dome. the cylinder has an internal diameter of 45.00 m and the wall is 1.20 m in thickness. the dome is fixed into a massive girder. the scheme of the structure is shown in figure 2. the leakproofness of the containment is ensured by the 8 mm thick steel lining placed inside the structure. unbonded tendons are placed in three parallel layers in the containment wall.

figure 2: geometry — section view of the containment with non-injected (non-bonded) prestress tendons [8]

figure 3: tendon channels and reinforcement

4.2 global model

the global model is depicted in figure 2. it serves only for determining the prestress losses caused by temperature fluctuations. the prestress losses are determined analytically on the basis of shell theory.

4.3 local model

the local model is presented in figure 3. the cylindrical segment represents a periodic unit cell (puc) from the cylindrical part of the containment, with channels for the prestressing tendons and with vertical, radial and horizontal reinforcement. the puc is 2.12 m in height and covers a section with an angle of 7.5°. the prestressing tendons are not modeled.
their effect is introduced as a mechanical loading. the finite element mesh was generated by the t3d automatic mesh generator [11]. the thermo-mechanical coupled algorithm of the sifel finite element computer code [12] was used.

4.4 loading

temperature loading. the impact of temperature is modeled by the dirichlet boundary conditions. temperatures from in-situ measurements (inner and outer surface) are applied directly in the computation. the temperature cycle loading was considered in one year intervals.

mechanical loading. the mechanical loading of the cylindrical segment is considered as a combination of four types of loading:

• the dead weight of the segment.
• the dead weight of the containment above the segment, which is considered as a loading on the top surface.
• the vertical loading by the prestress forces, which is also considered as a loading on the top surface. it is computed from the reactions of the anchorage system, decreased by the prestress losses caused by friction in the tendon channels.
• the loading prescribed directly in the tendon channels, which consists of radial and tangential components.

the first two loadings are instantaneous. the latter two loadings are calculated as a multiple of the prestress forces in the tendons at the place of the anchorage system. the prestress force values are obtained from in-situ measurements by a magnetoelastic method (mem), and they are displayed in figure 4. it should be emphasized that several measurements were averaged and a homogenized prestress force was used in the numerical analysis. the prestress force is not reduced due to the effect of temperature on concrete creep, and it is therefore slightly different from the force depicted in figure 1. the data was approximated by a logarithmic regression method. the jumps in the graph, which simulate the cycle of service time and planned stops, are obtained from the global model.

figure 4: change in the prestress force in the anchorage system since the end of construction

4.4.1 material properties and equations

coupled heat and moisture transfer was assumed because of the climatic conditions on the exterior side. it was recognized that the relative humidity varies very little, and its influence on the containment response is negligible. the non-stationary heat transport was solved assuming constant material parameters.

the mechanical part of the computation considered four types of constitutive material models, namely creep, damage, plasticity and thermal dilatation. the b3 creep model, influenced by temperature and moisture changes, and a damage model describe the behaviour of the concrete. several damage models were used in the computer simulation: local and nonlocal versions of the scalar isotropic damage model, the anisotropic damage model and the orthotropic model. the results obtained using the orthotropic model showed the best coincidence with the in-situ measurements. the application of the damage models is described in detail in [5]. the steel reinforcement was modeled using bar finite elements with a plasticity model, using the huber-mises-hencky condition. the thermal dilatation model was assumed in both materials (concrete and reinforcement). all material parameters were obtained from the nuclear power plant operator.

figure 5: isosurfaces of damage parameter dt in the radial direction at the end of the prestressing phase
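the logarithmic regression of the measured prestress force mentioned above can be sketched as follows; the force samples are invented stand-ins for the averaged mem measurements, and the model form $F(t) = a + b \ln t$ is an assumption consistent with the logarithmic plot in figure 1, not the exact regression used in the analysis.

```python
import numpy as np

# stand-in measurements of the homogenized anchorage force [MN] over service time [days]
t = np.array([30., 90., 180., 365., 730., 1460., 2920.])
F = np.array([9.8, 9.6, 9.5, 9.3, 9.1, 8.9, 8.7])

# fit F(t) ~ a + b*ln(t) by linear least squares
A = np.vstack([np.ones_like(t), np.log(t)]).T
(a, b), *_ = np.linalg.lstsq(A, F, rcond=None)
print(f"F(t) = {a:.3f} + {b:.3f} * ln(t)  ->  F(10 yr) = {a + b*np.log(3650):.2f} MN")
```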
our analysis was concerned with capturing the trends in containment behaviour, and with identifying the important processes in concrete that have to be modelled. the key point for future analyses is the accessibility of the material parameters. from this point of view, several experiments on concrete specimens have to be performed in order to obtain exact material parameters, e.g. the tensile strength and the fracture energy. moreover, the model needs to be calibrated.

4.5 computation results

the relation between the response of the local model and the tensile strength of the concrete in the damage models was observed during the computer simulation. several calculations were therefore made with different tensile strengths in order to verify the damage evolution. the scalar isotropic damage model gives an upper estimate, because its damage parameter influences all principal directions. more realistic anisotropic or orthotropic models should therefore be used for a reliable prognosis of the durability of the containment. the distribution of the damage parameter for the orthotropic damage model is captured in figure 5. from the point of view of concrete creep, different levels of the effect of temperature on concrete creep were also studied.

for an explanation of the increase in radial deformation, the most closely monitored graphs are the strains in the radial reinforcement, depicted in figure 6. due to the accompanying effect of cracking strain evolution, the increase in radial deformation and the decrease in tendon forces during service life can be observed. the strains in the radial reinforcement are plotted in figure 6 and are compared with the averaged data from the in-situ measurements. noticeable increments in the strains are caused by temperature increases after the planned reactor shutdowns.

figure 6: comparison of the strain in the radial reinforcement obtained from the computation (black line) and from in-situ measurements (red line – averaged data)

5 conclusions

the explanation for the increase in radial strains and the decrease in tendon forces since the beginning of service is based on theoretical knowledge about concrete creep influenced by temperature changes, and also partly on measurements of prestress losses performed mainly at swedish nuclear reactor containments. the influence of an increase in temperature during service has been proved. the results obtained from connecting the simplified global model and the local model show relatively good coincidence with the in-situ measurements. for the best coincidence between the computer simulation and the measurements, it is necessary to calibrate all material models and their parameters and to compare them with laboratory and in-situ measurements. in particular, it is necessary to measure the tensile strength, which is the basic property for monitoring the hypothetical damage to the containment.

acknowledgements

financial support for this work was provided by project number vz 03 cez msm 6840770003 of the ministry of education of the czech republic. the financial support is gratefully acknowledged.

references

[1] kruis, j., koudelka, t., krejčí, t.: efficient computer implementation of coupled hygro-thermo-mechanical analysis, math. comput. simulat., 80, 2010, 1578–1588.
[2] papa, e., taliercio, a.: anisotropic damage model for the multiaxial static and fatigue behaviour of plain concrete, eng. fract. mech., 55, no. 2, 1996, 163–179.
[3] koudelka, t., krejčí, t.: an anisotropic damage model for concrete in coupled problems. in: proceedings of the ninth international conference on computational structures technology (editors: b. h. v. topping and m. papadrakakis), stirlingshire, uk, civil-comp press, 2008, paper 157.
[4] krejčí, t., koudelka, t., šejnoha, j., zeman, j.: computer simulation of concrete structures subject to cyclic temperature loading. in: proceedings of the twelfth international conference on civil, structural and environmental engineering computing (editors: b. h. v. topping, l. f. costa neves and r. c. barros), stirlingshire, uk, civil-comp press, 2009, paper 131.
[5] koudelka, t., krejčí, t., šejnoha, j.: analysis of a nuclear power plant containment. in: proceedings of the twelfth international conference on civil, structural and environmental engineering computing (editors: b. h. v. topping, l. f. costa neves and r. c. barros), stirlingshire, uk, civil-comp press, 2009, paper 132.
[6] künzel, h. m., kiessl, k.: calculation of heat and moisture transfer in exposed building components, int. j. heat mass tran., 40, 1997, 159–167.
[7] bittnar, z., šejnoha, j.: numerical methods in structural mechanics. new york: asce press, 1996.
[8] anderson, p.: thirty years of measured prestress of swedish nuclear reactor containment, nucl. eng. des., 235, 2005, 2323–2336.
[9] bažant, z. p., prasannan, s.: solidification theory for concrete creep: i. formulation, j. eng. mech-asce, 115, no. 8, 1989, 1691–1703.
[10] bažant, z. p., prasannan, s.: solidification theory for concrete creep: ii. verification and application, j. eng. mech-asce, 115, no. 8, 1989, 1704–1725.
[11] t3d — automatic mesh generator, http://mech.fsv.cvut.cz/~dr/t3d.html
[12] sifel — simple finite elements, http://mech.fsv.cvut.cz/~sifel/index.html

acta polytechnica vol. 51 no. 6/2011

multi-functional star tracker — future perspectives

j. roháč, m. řeřábek, r. hudec

abstract

this paper focuses on the idea of a multi-functional wide-field star tracker (wfst) and provides a description of the current state-of-the-art in this field. the idea comes from a proposal handed in to esa at the beginning of 2011. star trackers (sts) usually have more than one object-lens with a small field-of-view. they provide very precise information about the attitude in space based on consecutive evaluation of star positions. our idea of the wfst will combine the functions of several instruments, e.g. an st, a horizon sensor, and an all-sky photometry camera. the wfst will use a fish-eye lens. there is no comparable product on the present-day market; nowadays, spacecraft have to carry several instruments for these applications, which increases the weight of the instrumentation and reduces the weight available for the payload.

keywords: star tracker, horizon sensor, all-sky camera, photometry, astrometry, space variant image processing.

1 introduction

attitude determination functions are basic functions in all space applications. attitude information is used in attitude and orbit control systems to stabilize a spacecraft (s/c) in the required position and orientation. various attitude reference sources can be used for obtaining attitude information, e.g. earth horizon sensors (which can also be directed towards the sun, the moon, or other planets and stars), sun sensors, magnetometers, and star trackers (sts). however, only sts provide an accuracy of attitude estimation better than 30 arcseconds [1]. the first generation of sts acquired only a few bright stars in the field-of-view (fov) and provided the focal plane coordinates of these stars [2].
the coordinates were not related to inertial space, and the attitude information therefore had to be provided indirectly by an external unit. this was due to the insufficient power of the microcomputers of the time. rapid improvements in microcomputer power led to the autonomous functionality of sts, which then became the preferred attitude reference source. basically, a second generation st consists of an electronic camera and a microcomputer, as shown in figure 1. the st autonomously performs pattern recognition of the stars in the fov, compares the results with the star catalogue stored in the internal memory, and estimates the attitude.

fig. 1: principle sketch of a second generation star tracker

generally, the st has to operate in two modes. the first mode solves the lost-in-space function, which has no previous attitude information available, and therefore the star pattern in the fov has to be recognized; the identification can usually be accomplished in a few seconds. the other mode performs star tracking, which assumes that the current attitude is closely related to the previous attitude. this mode is computationally easier, because it only tracks previously identified stars at known positions.

the basic drivers of st designs are: accuracy, reliability, mass, power consumption, and size. typical requirements are stated in table 1. other parameters that influence the design are: star light sensitivity, the detection threshold, the average number of stars in the fov, and sky coverage.

table 1: typical requirements for autonomous star trackers and an example of the micro advanced stellar compass (μasc) flown on the proba-2 technology demonstration satellite

parameter | required | μasc
initial acquisition (lost-in-space function) | < 1 min | 30 msec
accuracy (eol) (arcsec) | 30 (3σ) | 2 (3σ)
attitude rate (deg/sec) | up to 1 | up to 20
availability (%) | 99.900 | 99.995
power (w) | < 10 | < 4
mass (kg) | < 2 | < 0.5

generally, second generation sts employ more than one camera head unit (chu), consisting of the baffle, the lens, and the optical sensor, usually with about a 20° × 20° fov. the multi-chu concept increases the average number of stars in the fov and thus the accuracy of the st. in addition, this concept ensures full immunity against simultaneous blinding.

2 new concept of a multi-functional star tracker

second generation sts already have better parameters than required, see table 1. nevertheless, they use the multi-camera-head concept to increase the average number of stars in the fov. to decrease the mass and size of the current st concept and to increase its functionality, we proposed to design, develop, and verify the multi-functional wide-field star tracker (wfst), which combines the functions of several instruments usually carried by s/cs.

one function of the wfst will be to provide precise attitude determination. this will be ensured by sophisticated algorithms using the widest possible fov optic characteristics, supported by innovative image and data processing and by a 3d inertial measurement unit (imu) consisting of accelerometers and angular rate sensors. for low orbit applications, the system will also be equipped with a gps receiver. all available sources of information, see figure 2, will be fused by an extended kalman filter (ekf). a two-speed updating approach, as proposed in [3], will be used for mechanizing and computing the inertial navigation equations. the precision will be between 1 arcminute and 1 arcsecond, according to the application and the optics used.
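a heavily simplified illustration of the sensor-fusion idea is sketched below: one predict/update step of a linear kalman filter for a single-axis angle with gyro bias, where the star tracker provides the angle measurement. the flight algorithm is an extended kalman filter over the full attitude; all matrices and noise values here are assumed for illustration only.

```python
import numpy as np

# single-axis toy model: state = [angle, gyro bias]; gyro propagates, star tracker corrects
dt = 0.1
F = np.array([[1.0, -dt], [0.0, 1.0]])   # state transition (assumed)
B = np.array([dt, 0.0])                  # gyro rate input
H = np.array([[1.0, 0.0]])               # star tracker measures the angle directly
Q = np.diag([1e-6, 1e-9])                # process noise (assumed)
R = np.array([[2.4e-11]])                # (1 arcsec)^2 in rad^2, assumed

x = np.zeros(2)                          # initial state
P = np.eye(2) * 1e-4

def fuse(x, P, gyro_rate, star_angle):
    # predict with the inertial measurement ...
    x = F @ x + B * gyro_rate
    P = F @ P @ F.T + Q
    # ... and update with the star tracker attitude fix
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + (K @ (star_angle - H @ x)).ravel()
    P = (np.eye(2) - K @ H) @ P
    return x, P

x, P = fuse(x, P, gyro_rate=0.01, star_angle=0.001)
print(x)
```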
fig. 2: available data sources for attitude evaluation

the second function focuses on providing a cost-effective, high-potential information source for scientific analyses of objects in the fov of the wfst, which extends the applicability of the wfst. this idea corresponds fully with esa cosmic vision 2015–2025, where the priorities are planet and universe studies. the wfst will be capable of monitoring fluctuations in the radiation of stars. the high potential will be provided by the fish-eye lens, which will enable observations of up to 180 degrees of sky, and by an innovative approach to image processing. systems of this kind using a fish-eye lens are commonly referred to as ultra wide field camera (uwfc) systems.

uwfc image data analysis is in general very difficult, due to the wide range of distortions in wide field images and mainly due to the space variant (sv) properties of uwfc systems. moreover, the objects in ultra-wide field images are very small (a few pixels per stellar object). precise astronomical measurements (in astrometry and photometry) need high quality images; even small distortions may lead to inaccurate determination of the position or movement of stellar objects. the error depends on the magnitude of the measured star — the higher it is, the higher the error can be. the properties of uwfc astronomical systems and the specific visual data in astronomical images make the evaluation of the image data complicated. these uwfc systems contain many different kinds of optical aberrations that have negative impacts on the image quality and the system transfer characteristics. for precise astrometry and photometry over the entire fov it is therefore very important to comprehend how the optical aberrations affect the transfer characteristics, how the astronomical measurements depend on the optical aberrations, and how the wavefront aberration affects the point spread function (psf).

precise stellar profile fitting is very important for astronomical measurements. nowadays, there are two common functions for fitting stellar profiles, referred to as the gaussian and moffat functions [4, 5]. efforts are made to match a star's profile with the gaussian or moffat profile and to store the parameters of the fit. if a star were ideal, the stellar profile would be represented by a small "dot". due to the many different distortions, the dot is blurred all around; the more it is blurred, the worse the psf of the whole imaging system. the centre of a star usually has a profile corresponding to the gaussian function, while the more distant parts of the star have a profile closer to the moffat function. therefore, the ideal fitting function should combine these two profiles. since the system is space variant, the psf is different for each point in the object plane. in our case, a sophisticated algorithm employing special functions based on zernike polynomials has to be applied in order to get a precise stellar profile fit for the entire fov.
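for reference, the two standard radial profiles can be written down directly. the sketch below evaluates a gaussian core and a moffat profile (whose power-law wings are wider), with illustrative amplitude and width parameters; fitting either profile to pixel data would typically use a least-squares routine such as scipy.optimize.curve_fit.

```python
import numpy as np

def gaussian(r, amp, sigma):
    """gaussian radial profile, typical for the stellar core."""
    return amp * np.exp(-r**2 / (2.0 * sigma**2))

def moffat(r, amp, alpha, beta):
    """moffat radial profile, with wider wings than the gaussian."""
    return amp * (1.0 + (r / alpha)**2) ** (-beta)

r = np.linspace(0.0, 6.0, 61)            # radius [pixels]
g = gaussian(r, amp=1.0, sigma=1.2)      # parameters are illustrative
m = moffat(r, amp=1.0, alpha=1.8, beta=2.5)
print(f"wing ratio at r = 5.1 px: moffat/gaussian = {m[-10]/g[-10]:.1f}")
```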
3 conclusion

the research and development of the multi-functional wfst will verify current technology potentials, based on analyses reflecting the fast current technological development and the improvements in digital optical sensors (cmos, ccd), low-cost imus consisting of mems (micro-electro-mechanical-system) sensors, and gps technology. all the analyses will determine the boundaries of the instrument functionality with respect to s/c maneuvering parameters, various optical sensor sizes, sensitivity, error sources, stability, lifetime, power consumption, mass, and the suitability of various wide fov optics. the proposed wfst should use cost-effective components and thus fill a gap in the market with a system that is novel and powerful, but cost-effective.

acknowledgement

this work has been performed under sciex nms-ch — chvi and ga 205/09/1302 "study of sporadic meteors and weak meteor showers using automatic video intensifier cameras" of the grant agency of the czech republic.

references

[1] jørgensen, j. l., pickles, a.: fast and robust pointing and tracking using a second generation star tracker, proceedings of spie, washington, 1998, p. 51–61.
[2] liebe, c. ch.: accuracy performance of star trackers — a tutorial, ieee transactions on aerospace and electronic systems, vol. 38, no. 2, 2002, p. 587–599.
[3] savage, p. g.: strapdown analytics. strapdown associates, minnesota, usa, 2000.
[4] rerabek, m., pata, p., koten, p.: processing of the astronomical image data obtained from uwfc optical systems, proceedings of spie, washington, 2008.
[5] starck, j., murtagh, f.: astronomical image and data analysis. berlin: springer, 2002.

jan roháč
faculty of electrical engineering, czech technical university in prague, technická 2, 166 27 prague, czech republic

martin řeřábek
faculty of electrical engineering, czech technical university in prague, technická 2, 166 27 prague, czech republic
multimedia signal processing group, ecole polytechnique fédérale de lausanne, station 11, lausanne, switzerland

rené hudec
faculty of electrical engineering, czech technical university in prague, technická 2, 166 27 prague, czech republic

acta polytechnica vol. 50 no. 1/2010

gearbox condition monitoring using advanced classifiers

p. večeř, m. kreidl, r. šmíd

abstract

new efficient and reliable methods for gearbox diagnostics are needed in the automotive industry because of the growing demand for production quality. this paper presents the application of two different classifiers for gearbox diagnostics — kohonen neural networks and the adaptive-network-based fuzzy inference system (anfis). two different practical applications are presented. in the first application, the tested gearboxes are separated into two classes according to their condition indicators. in the second example, anfis is applied to label the tested gearboxes with a quality index according to the condition indicators. in both applications, the condition indicators were computed from the vibration of the gearbox housing.

keywords: diagnostics, automotive gearbox, kohonen neural network, self-organizing map, anfis, classification system, data pre-processing.

1 introduction

a manual automotive gearbox comprises gearwheels, synchronizer pack components, bearings and shafts. each of these parts is quality checked before assembly. parts that work together are not paired, because high production capacity lowers the production time. the demand for high-quality completed gearboxes requires the inclusion of an inspection process in the gearbox production line. the inspection process for gearbox diagnostics can be divided into four main parts — signal acquisition, data pre-processing, condition indicator computation and signal classification, as shown in fig. 1.

fig. 1: the inspection process of gearbox diagnostics

inspection processes are usually based on vibration diagnostics, because the vibration of the gearbox housing during gearbox operation clearly indicates the technical condition of the tested gearbox. an important part of such a diagnostic chain is the classification unit that makes the final decision on the technical condition of the tested object. however, in some cases a classifier can be seen as a further part of the signal processing chain which only simplifies the results from condition computing to a form that is easier for the user to read; an example of this is mapping the test object to easily depicted surfaces.

neural networks belong to the group of classifiers successfully applied to diagnostics. neural networks successfully substitute for human power in tasks where the decision is made on the basis of many input features, or where the inspection process needs to be automated; signal conditioning methods often need a trained technician to interpret the results, which is a crucial constraint in present-day industry.

the study of neural networks began in the 19th century, when brain cells were analyzed [4].
the history of artificial neural networks goes back to 1943, when warren mcculloch and walter pitts described how neurons might work and designed the first simple neural network using electrical circuits [17]. the fundamental concept of how neural nets learn was introduced in 1949 by donald hebb [6]. in 1950, nathaniel rochester simulated on ibm electronic calculators a neural network [13] which had cells organized into a single layer, with the outputs from these cells connected to the inputs of other cells; this net was able to meet hebb's learning rule. in 1958, frank rosenblatt created the first perceptron model [7]. his net learned patterns which occurred regularly and consistently. the first work on neural network classification came from bernard widrow and marcian hoff, who presented the adaptive linear element (adaline) [2]. adaline was able to recognize a linear pattern. in 1962, rosenblatt combined his first perceptron model and adaline to create a new perceptron model [8]. further progress in neural classification came about in 1972, when two researchers independently came up with the idea of association networks [3]. the first multilayered network [10] appeared in 1975. then, in 1986, the idea of a multilayer backpropagation network arose [5].

a disadvantage of neural networks is that a trained net cannot be easily visualized. this disadvantage can be overcome by the self-adjusted fuzzy inference system known as the adaptive-network-based fuzzy inference system (anfis). anfis was introduced by jang [9] in 1992.

this paper deals with suitable applications of classifiers based on two different types of approach (som and anfis) in the gearbox quality assessment process. the first part of the paper summarizes the theory of the two networks, while the second part describes the experimental results and the third part summarizes the results and their consequences.

2 kohonen neural network (self-organizing maps)

the kohonen neural network [16], also called the self-organizing map (som), was presented by t. kohonen in 1984. a neuron in a som is characterized by excitation [12]. the excitation is usually computed from the euclidian distance between the weight vector and the input vector. with a som, a mapping from n-dimensional input space into two-dimensional output space can be found. the mapping procedure takes each input vector and compares it with the weight of each neuron. the neuron with the greatest excitation represents the response, and the appropriate input vector is mapped into this location.

2.1 topology

a som usually contains only two layers — an input layer and a kohonen layer. the input layer just provides the distribution of the input vectors to all neurons in the kohonen layer. the desired mapping from n-dimensional input space into two-dimensional space is defined by the kohonen layer. the neurons in the kohonen layer are oriented into a rectangular or hexagonal lattice.

2.2 using som

in the first stage, called initialization, the weights of all neurons in the som have to be set. these weights can be set randomly, or they can take into account the structure of the input data. after initialization, a new input vector is put into the input layer. the distances $d_j$ between the input vector and the $j$-th output neuron are computed by the equation [12]

$d_j = \sum_{i=0}^{n-1} \bigl( x_i(t) - w_{ij}(t) \bigr)^2, \quad (1)$

where $x_i(t)$ is the $i$-th element of the input vector $x$ at time $t$ and $w_{ij}(t)$ is the $i$-th value of the weight vector of the $j$-th output neuron at time $t$. the distance is computed for each output neuron. then we select the neuron with the shortest distance $d_w$, using the equation

$d_w = \min_j (d_j), \quad (2)$

where $j = 1, \ldots, m$ and $m$ is the total number of neurons in the net. finally, the weight vector of the "winning" neuron (the neuron with distance $d_w$) and the weights of its neighbourhood are adjusted:

$w_{ij}(t+1) = w_{ij}(t) + \eta(t) \bigl( x_i(t) - w_{ij}(t) \bigr), \quad (3)$

where $\eta$ is an auxiliary function. if not all input vectors have been put into the input layer, or if the desired number of learning steps has not been reached, this learning process (without initialization) is repeated. in the opposite case, the som is ready to use.
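a compact sketch of one som learning step, implementing the distance computation (1), the winner selection (2) and the weight update (3); the gaussian neighbourhood used to restrict the update to the winner's surroundings is an assumed choice, since the auxiliary function $\eta$ is not specified above.

```python
import numpy as np

rng = np.random.default_rng(1)
n_in, rows, cols = 4, 6, 6                      # 4 condition indicators, 6x6 kohonen layer
W = rng.uniform(0.0, 1.0, (rows * cols, n_in))  # one weight vector per output neuron

def train_step(W, x, eta=0.2, radius=1.0):
    """one som learning step: distances (1), winner (2), weight update (3)."""
    d = np.sum((x - W) ** 2, axis=1)            # squared euclidean distances, eq. (1)
    w_idx = np.argmin(d)                        # 'winning' neuron, eq. (2)
    # gaussian neighbourhood on the 2-d lattice (an assumed choice for eta)
    ij = np.stack(np.unravel_index(np.arange(W.shape[0]), (rows, cols)), axis=1)
    dist2 = np.sum((ij - ij[w_idx]) ** 2, axis=1)
    h = eta * np.exp(-dist2 / (2.0 * radius ** 2))
    W += h[:, None] * (x - W)                   # eq. (3) for winner and neighbours
    return W, w_idx

x = rng.uniform(0.0, 1.0, n_in)                 # one vector of condition indicators
W, winner = train_step(W, x)
print(f"input mapped to neuron {winner}")
```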
the mapping procedure takes each input vector and compares it with the weight vector of each neuron. the neuron with the greatest excitation represents the response, and the appropriate input vector is mapped onto this location.
2.1 topology
a som usually contains only two layers – an input layer and a kohonen layer. the input layer just provides the distribution of the input vectors to all neurons in the kohonen layer. the desired mapping from n-dimensional input space into two-dimensional space is defined by the kohonen layer. the neurons in the kohonen layer are arranged in a rectangular or hexagonal lattice.
2.2 using som
in the first stage, called initialization, the weights of all neurons in the som have to be set. these weights can be set randomly or can take into account the structure of the input data. after initialization, a new input vector is put into the input layer. the distance $d_j$ between the input vector and the j-th output neuron is computed by [12]:
$$d_j = \sum_{i=0}^{n-1} \left( x_i(t) - w_{ij}(t) \right)^2, \qquad (1)$$
where $x_i$ is the i-th element of the input vector x at time t, and $w_{ij}$ is the i-th value of the weight vector of the j-th output neuron at time t. the distance is computed for each output neuron. then we select the neuron with the shortest distance $d_w$:
$$d_w = \min_j (d_j), \qquad (2)$$
where $j = 1, \ldots, m$, and m is the total number of neurons in the net. finally, the weight vector of the "winning" neuron (the neuron with distance $d_w$) and the weights of its neighborhood are adjusted:
$$w_{ij}(t+1) = w_{ij}(t) + \eta(t) \left( x_i(t) - w_{ij}(t) \right), \qquad (3)$$
where $\eta$ is an auxiliary function (in practice the product of a learning rate and a neighborhood function centered on the winning neuron). if not all input vectors have been put into the input layer, or if the desired number of learning steps has not been achieved, this learning process (without initialization) has to be repeated. otherwise, the som is ready for use.
2.3 application of som to gearbox diagnostics
organizing the input data into meaningful structures is a crucial application of som. if the data computed from different gearboxes falls in the same cluster, we can assume that these gearboxes are in a similar technical condition. unsupervised learning, another important feature of som, allows gearbox damage to be identified without the corresponding benchmark vectors.
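to make the update rule of eqs. (1)–(3) concrete, the following minimal sketch (our illustration, not the authors' implementation; the gaussian neighborhood and all parameter values are assumptions) shows one som training pass in python:

```python
import numpy as np

def som_training_step(weights, x, eta, sigma, grid):
    """one som update following eqs. (1)-(3): find the winning neuron
    by squared euclidean distance, then pull the winner and its
    lattice neighborhood towards the input vector."""
    # eq. (1): squared distance between the input and every weight vector
    d = np.sum((weights - x) ** 2, axis=1)
    # eq. (2): index of the winning neuron
    w = np.argmin(d)
    # gaussian neighborhood on the 2-d lattice (the "auxiliary function"
    # of eq. (3), here a learning rate times a gaussian)
    h = eta * np.exp(-np.sum((grid - grid[w]) ** 2, axis=1) / (2 * sigma ** 2))
    # eq. (3): move the weights towards the input
    weights += h[:, None] * (x - weights)
    return w

# toy usage: a 4 x 3 map (as in section 4.1.3) on 6-dimensional inputs
rows, cols, dim = 4, 3, 6
rng = np.random.default_rng(0)
weights = rng.random((rows * cols, dim))          # random initialization
grid = np.array([(r, c) for r in range(rows) for c in range(cols)], float)
for step in range(10000):                          # rough ordering phase
    x = rng.random(dim)                            # stand-in input vector
    som_training_step(weights, x, eta=0.1, sigma=1.5, grid=grid)
```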
3 anfis
fuzzy expert systems are usually based on a fuzzy inference system (fis). an important property of fuzzy systems is that they transform human knowledge into the form of fuzzy if-then rules [1]. an fis is usually composed of four functional units. the basic configuration of an fis is depicted in fig. 2. first, the crisp input data is fuzzified. then fuzzy if-then rules are applied. a final decision is obtained from the aggregation process, in which all rules are evaluated. finally, the defuzzification process is applied.
fig. 2: basic fis configuration (crisp input – fuzzifier – knowledge/rule base and fuzzy inference engine – defuzzifier – crisp output)
fuzzy inference systems can be divided into three types [9]. for our purposes we use type 3, where takagi and sugeno's fuzzy if-then rules are applied. to design such a fuzzy inference system, the parameters of the membership functions and the fuzzy if-then rules must be known. however, there is a way to design a fuzzy system without a priori knowledge of membership values and rules: an adaptive-network-based fis obtains the needed information directly from an input-output data set [9].
3.1 anfis architecture
the architecture of anfis is based on adaptive networks. adaptive networks are feed-forward networks in which the node parameters are tuned by a learning algorithm to minimize a prescribed error measure. the anfis structure for type 3 fuzzy reasoning will be examined in a simple example [9]. suppose that the knowledge base contains two fuzzy if-then rules (takagi and sugeno's type):
rule 1: if x is a1 and y is b1, then $f_1 = p_1 x + q_1 y + r_1$
rule 2: if x is a2 and y is b2, then $f_2 = p_2 x + q_2 y + r_2$
the corresponding architecture of anfis is illustrated in fig. 3. each layer is described below.
fig. 3: anfis structure (five layers for the two-rule example)
layer 1: the function of this layer is to fuzzify the inputs. a node function can be expressed as:
$$o_i = \mu_{a_i}(x), \qquad (4)$$
where $\mu_{a_i}$ is the membership function for the linguistic value $a_i$, and x is the input value.
layer 2: this layer simply multiplies the incoming membership values according to the node function:
$$\omega_i = \mu_{a_i}(x) \, \mu_{b_i}(y), \qquad (5)$$
where $\mu_{a_i}$ represents the degree to which the given x satisfies the quantifier $a_i$, and $\mu_{b_i}$ represents the degree to which the given y satisfies the quantifier $b_i$.
layer 3: the nodes in this layer calculate the normalized firing strength:
$$\bar{\omega}_i = \frac{\omega_i}{\omega_1 + \omega_2}, \qquad i = 1, 2. \qquad (6)$$
layer 4: the nodes in this layer compute the consequent parts of the fuzzy if-then rules (takagi and sugeno's type):
$$o_i = \bar{\omega}_i f_i = \bar{\omega}_i \left( p_i x + q_i y + r_i \right). \qquad (7)$$
layer 5: in this layer the defuzzification process occurs:
$$o = \sum_i \bar{\omega}_i f_i = \frac{\sum_i \omega_i f_i}{\sum_i \omega_i}. \qquad (8)$$
3.2 learning algorithm
the learning algorithm is based on backpropagation and the lms algorithm [9]. the parameters of the premise membership functions are set according to backpropagation. the optimization of the consequent equations is provided via linear least-mean-squares estimation.
3.3 application of anfis to gearbox diagnostics
anfis can be used in the same manner as feed-forward neural networks. the net can classify tested gearboxes into classes defined in advance. in addition to the fact that anfis is in many cases faster in evaluation and learning, the main advantage is that, in contrast to a feed-forward neural network, we can view the decision surfaces of the trained net [15]. in contrast to som, we need to know the benchmark vectors for all classes.
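before turning to the experiments, a minimal sketch of the two-rule type-3 forward pass of eqs. (4)–(8) (our illustration; the generalized bell shape anticipates section 4.2.3, and all parameter values below are assumed for demonstration):

```python
import numpy as np

def gbell(x, a, b, c):
    """generalized bell membership function."""
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2 * b))

def anfis_forward(x, y, prem, conseq):
    """two-rule type-3 anfis forward pass, eqs. (4)-(8).
    prem: per-rule (a, b, c) parameters of the a_i and b_i sets;
    conseq: per-rule (p, q, r) of the takagi-sugeno consequents."""
    # layer 1, eq. (4): fuzzification
    mu_a = [gbell(x, *prem["a"][i]) for i in range(2)]
    mu_b = [gbell(y, *prem["b"][i]) for i in range(2)]
    # layer 2, eq. (5): firing strengths
    w = [mu_a[i] * mu_b[i] for i in range(2)]
    # layer 3, eq. (6): normalization
    wbar = [w[i] / (w[0] + w[1]) for i in range(2)]
    # layers 4-5, eqs. (7)-(8): weighted consequents and crisp output
    f = [p * x + q * y + r for (p, q, r) in conseq]
    return sum(wbar[i] * f[i] for i in range(2))

# toy parameters (assumed, for illustration only)
prem = {"a": [(2.0, 2.0, 0.0), (2.0, 2.0, 5.0)],
        "b": [(2.0, 2.0, 0.0), (2.0, 2.0, 5.0)]}
conseq = [(0.5, 0.3, 1.0), (-0.2, 0.8, 2.0)]
print(anfis_forward(1.5, 3.0, prem, conseq))
```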
4 experimental results
to demonstrate a promising method for a gearbox fault classifier, two experiments are proposed – one based on a classifier with unsupervised learning (section 4.1), and the other on a classifier that sets its parameters by supervised learning (section 4.2). the experimental results show only a possible solution of this task. for an adequate evaluation of the features of each classifier, a training dataset with a significantly higher number of tested gearboxes will be needed.
4.1 classification with a self-organizing map
the following experiment was designed to check the performance of the classification system [14]. the vibration data from five gearboxes was classified by the designed inspection system. som provides a cluster analysis of the input data. according to the technical condition of the tested gearboxes, they will be classified into two main clusters (g and ng). we assume that gearboxes mapped into the same cluster have the same technical condition. the advantage of this classification is that we do not need to know the technical state of each benchmark vector during the learning phase.
4.1.1 measured gearboxes
five gearboxes were chosen for this experiment. brief descriptions of the measured gearboxes are given in table 1.
table 1: measured gearboxes for 2-class separation using som
gearbox a – class g (for 1st and 5th gear): working well during all the tests
gearbox b – class ng for 1st gear, class g for 5th gear: noisy during the test in first gear; out of tolerance for a dimension on the first-gear toothing
gearbox c – class g for 1st gear, class ng for 5th gear: differential out of tolerance because of a toothing dimension
gearbox d – class g (for 1st and 5th gear): bearing on the drive shaft on the tolerance limit
4.1.2 data acquisition
the vibrations of the gearbox housing were measured during a simulated test drive on the test bench. two piezoelectric accelerometers were used for vibration signal acquisition. the first transducer was placed near the differential gearing. the second transducer was located near the five-gear gearing. these sensor placements ensured fault signal detection from all gearings. the data from the accelerometers was recorded directly onto the pc hard disk for off-line analysis by the b&k multi-analyzer system type 3560. the b&k mm0024 photoelectric tachometer probe captured the rotation speed of the drive shaft during the test procedure.
4.1.3 data pre-processing and classification
the raw signal was synchronously averaged and three amplitude features (rms, skewness, and kurtosis) were computed. the computed amplitude features from the first and second transducers were reordered into input vectors. for each gearbox five input vectors were computed. an example of an input vector is shown in table 2.
table 2: typical values of input vectors for the 1st gear
feature        a      b      c      d
rms1 (m/s^2)   4.74   5.80   1.32   2.43
rms2 (m/s^2)   2.77   8.74   2.35   3.13
skew1 (-)      0.35   0.31  -0.09  -0.17
skew2 (-)      0.39   0.43   0.24  -0.63
kur1 (-)       3.02   2.80   2.95   2.74
kur2 (-)       2.16   2.19   2.56   2.99
a som size of 4 × 3 neurons was chosen and the weights were initialized using random numbers. a hexagonal lattice type and a gaussian neighborhood type were selected. after this, the map was trained. the training was done in two phases. in the first phase, ten thousand training steps were set; during this phase the reference vectors of the map were roughly computed. in the second learning phase the reference vectors were fine-tuned. after this step of the procedure the map is learned and can be used for visualizing input vectors. sammon's mapping is a nonlinear projection of multidimensional input vectors to two-dimensional points on a plane, whereby the distances between the image vectors tend to approximate the euclidean distances of the input vectors [12]. the final visualization of the sammon's mapping after the second learning phase is shown in fig. 4. the som is properly learned.
fig. 4: sammon's visualization of the learned map after the second phase
the distances between the reference vectors of neighboring units and the input vector mapping are visualized in fig. 5. the distances between neurons are depicted using a grey scale: dark grey means a longer distance, light grey means a shorter distance. the gearboxes in good technical condition are concentrated on one side. the faulty gearbox is mapped on the right side of the map. these two sides of the map are divided by a sharp border. thus, the som divided the tested gearboxes into the two parts of the map: gearboxes in good technical condition are on one side, and the faulty gearbox is on the other side.
fig. 5: distance between reference vectors of neighboring neurons and the input vector mapping for the first-gear test (lighter grey – shorter distance, darker grey – longer distance)
the same test was performed for the vibration data retrieved from the second test (five-gear gearing test).
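the grey-scale pictures of figs. 5 and 6 show distances between neighboring reference vectors. a minimal sketch of how such a u-matrix-style picture can be computed (our illustration; the rectangular 4-connected indexing below is an assumption, the actual map used a hexagonal lattice):

```python
import numpy as np

def neighbour_distances(weights, rows, cols):
    """mean euclidean distance from each neuron's reference vector
    to its 4-connected lattice neighbours (u-matrix style, cf. figs. 5-6)."""
    w = weights.reshape(rows, cols, -1)
    dist = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            ds = []
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    ds.append(np.linalg.norm(w[r, c] - w[rr, cc]))
            dist[r, c] = np.mean(ds)
    return dist  # dark (large) values mark borders between clusters
```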
the distance between the reference vectors of neighboring units and the input vector mapping for this test is visualized in fig. 6. the gearbox with the faulty differential gearing (gearbox c) is mapped on the opposite side of the map, distinct from the others.
fig. 6: distance between reference vectors of neighboring neurons and the input vector mapping for the five-gear gearing test (lighter grey – shorter distance, darker grey – longer distance)
a comparison of the corresponding values in table 2 shows that the rms value computed from the vibration signal (rms2 – second transducer) has a major influence on the vector mapping. the other items have no major effect on the vector mapping. a similar effect was found in the second test.
4.2 classification with anfis
the following experiment was designed to check the useful features of anfis for gearbox diagnostics. the classifier was taught to label the tested gearboxes with a quality index according to a subjective evaluation made by a car tester. as in the previous experiment with som, the input data constituted vibration measured on the gearbox housing. the data was acquired during a test drive. the quality index classified the tested gearboxes into 10 classes (10 – no noticeable noise from the gearbox, 0 – unacceptable noise from the gearbox). unlike in the previous example with som, we need to have training vectors from the complete quality index scale, if possible. according to the decision surfaces of the trained anfis, we can judge the quality of the training data.
4.2.1 measured gearboxes
the subjective quality index for the 3rd gear, deceleration, rated by the test driver, is given in table 3.
table 3: measured gearboxes for classification with anfis
gearbox label:  e  f  g  h    i  j
quality index:  8  6  7  6–5  5  8
4.2.2 data acquisition
the vibrations of the gearbox housing were measured during a test drive. the test regime was deceleration in 3rd gear. a piezoelectric accelerometer was used for vibration signal acquisition. the vibration transducer was placed on the front side of the gearbox near the switch for the reverse speed. the data from the accelerometer was recorded directly onto the pc hard disk for off-line analysis by the b&k multi-analyzer system type 3560. information on the rotation speed of the gearbox input shaft was acquired from a can message sent by the engine control unit. a transducer converts the can message into an analog impulse signal; an impulse corresponds to one rotation of the input shaft. this signal was connected to the b&k multi-analyzer.
4.2.3 data pre-processing and classification
amplitude features (rms, peak, crest factor and kurtosis) were computed from the raw vibration signal. these features were reordered into the input vector and, together with the quality index subjectively evaluated by the car tester, were used as training vectors for the net.
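a minimal sketch of the four amplitude features used here, computed from a raw vibration record (our illustration, not the authors' code):

```python
import numpy as np

def amplitude_features(signal):
    """rms, peak, crest factor and kurtosis of a vibration record,
    the condition indicators used as anfis inputs in section 4.2.3."""
    x = np.asarray(signal, dtype=float)
    rms = np.sqrt(np.mean(x ** 2))
    peak = np.max(np.abs(x))
    crest = peak / rms                 # crest factor (-)
    z = (x - x.mean()) / x.std()
    kurt = np.mean(z ** 4)             # non-excess kurtosis, ~3 for gaussian
    return rms, peak, crest, kurt
```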
typical values of the computed amplitude features for the tested gearboxes are shown in table 4.
table 4: typical values of input vectors
gearbox   rms (m/s^2)   peak (m/s^2)   crest factor (-)   kurtosis (-)
e         45.62         515.05         11.29              6.78
f         55.67         504.09          9.05              5.76
g         72.57         649.80          8.95              5.76
h         49.19         441.65          8.98              6.19
i         38.67         368.66          9.53              7.25
j         64.62         480.03          7.43              5.51
the shape of the membership functions was chosen as generalized bell-shaped, and a two-rule anfis was chosen. the decision surfaces for each pair of amplitude features are depicted in fig. 7. according to the shape of these decision surfaces, it can be decided whether the training data set is large enough, and whether the selected amplitude features correspond to the subjective evaluation of the gearbox carried out by the test driver. in an ideal case, the decision surface will have only one peak, and the higher the condition indicator values, the lower the quality index will be. the decision surface tuned according to our training data showed clearly that the biggest impact on the quality index value came from the amplitude features rms and peak. the figure also shows that in his subjective evaluation the car tester gave a high quality assessment to two gearboxes that had a higher kurtosis value and a low peak value, and vice versa. the decision surface based on peak versus crest factor contains two peaks, because the training data does not contain enough samples from gearboxes in a variety of technical conditions. the decision surface based on crest factor versus kurtosis should have a lower quality index for a higher kurtosis value. however, as mentioned above, this is only a demonstration of a promising method for a gearbox fault classifier. in order to evaluate the features of the proposed classifier, a significantly larger training data set containing data from a larger number of tested gearboxes is needed. if we design the net with more rules, the decision surface will be closer to each subjective evaluation, and the decision surface will be peakier. to avoid these problems, we need more training data, spread equally through the whole range of the quality index.
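a sketch of how decision surfaces like those in fig. 7 can be rendered: evaluate the trained model over a grid of two features while holding the remaining inputs at their mean values (our illustration; `anfis_predict` stands for any trained model and is a hypothetical placeholder):

```python
import numpy as np

def decision_surface(anfis_predict, data, i, j, n=50):
    """evaluate a trained classifier over a grid of features i and j,
    fixing all other condition indicators at their mean values."""
    lo, hi = data.min(axis=0), data.max(axis=0)
    gi = np.linspace(lo[i], hi[i], n)
    gj = np.linspace(lo[j], hi[j], n)
    base = data.mean(axis=0)
    surface = np.zeros((n, n))
    for a, vi in enumerate(gi):
        for b, vj in enumerate(gj):
            x = base.copy()
            x[i], x[j] = vi, vj
            surface[a, b] = anfis_predict(x)   # predicted quality index
    return gi, gj, surface
```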
4.2.4 conclusion
the performance of a som classifier used for gearbox diagnostics has been demonstrated. one of the major problems of the som mapping used in this work is that we cannot be sure how close together or how far apart vectors located on opposite borders of the map are. this problem can be solved by transforming the map into 3-d and projecting it onto a round shape (e.g. cylindrical) [11]. however, in our experiment there was a sharp border between the tested gearbox from the ng group and the other gearboxes. the kohonen network can correctly classify gearboxes falling into a class that was not known when the net was trained. we need to know the diagnosis for only one gearbox; the diagnosis for the other gearboxes can be evaluated according to their mapping relative to the benchmark gearbox. crucial in the application of a kohonen map is an appropriate size of the map with respect to the number of test samples. if the map is too large, the tested samples will be too distinct from each other, and vice versa.
anfis represents a second type of network, one which uses supervised learning. a precondition for successful use of this type of network is to have training data in advance. the training data must contain enough gearboxes from all the classes that we are trying to distinguish between. the advantage of anfis is the possibility of visualizing its decision surfaces. according to the trained decision surface, we can decide whether our training data contains enough training samples for the selected net parameters. our results show that the root mean square value and the kurtosis of the vibration signal have the biggest influence on the final value of the quality index.
fig. 7: decision surfaces of the trained anfis (2 rules) for the quality index as a function of pairs of condition indicators (rms, peak, crest factor, kurtosis)
acknowledgments
this study was supported by research program no. msm 6840770015 "research of methods and systems for measurement of physical quantities and measured data processing" of the ctu in prague, sponsored by the ministry of education, youth and sports of the czech republic.
references
[1] matlab fuzzy logic. mathworks inc., 2003.
[2] widrow, b., hoff, m.: adaptive switching circuits. new york: ire wescon convention record, 1960.
[3] olmsted, d. d.: history and principles of neural networks from 1960 to 1990. www.neurocomputing.org, 2006.
[4] olmsted, d. d.: history and principles of neural networks to 1960. www.neurocomputing.org, 2006.
[5] rumelhart, d. e., hinton, g. e., williams, r. j.: learning representations by back-propagating errors. 1986.
[6] hebb, d. o.: the organization of behavior. new york: john wiley & sons, 1949.
[7] rosenblatt, f.: the perceptron: a probabilistic model for information storage and organization in the brain. psychological review, 1958.
[8] rosenblatt, f.: principles of neurodynamics: perceptrons and the theory of brain mechanisms. washington d.c.: spartan books, 1960.
[9] jang, j. r.: neuro-fuzzy modeling: architecture, analyses and applications. phd thesis, university of california, berkeley, ca, 1992.
[10] fukushima, k.: cognitron: a self-organizing multilayered neural network. biological cybernetics, 1975.
[11] jurčík, m.: som, 2003.
[12] šnorek, m.: neuronové sítě a neuropočítače. praha: čvut, 2002.
[13] rochester, n., holland, j. h., haibt, l. h., duda, w. l.: test on a cell assembly theory of the action of the brain using a large digital computer. ire transactions on information theory, 1956.
[14] večeř, p., kreidl, m., šmíd, r.: application of the self-organizing map to manual automotive transmission diagnostics. in: isspit 2003, darmstadt, germany, december 2003.
[15] večeř, p., kreidl, m., šmíd, r.: using the adaptive-network-based fuzzy inference system for manual transmission diagnostics, 2004.
[16] kohonen, t.: correlation matrix memories. berlin: springer-verlag, 1984.
[17] mcculloch, w. s., pitts, w. h.: a logical calculus of the ideas immanent in nervous activity. bulletin of mathematical biophysics, 1943.
ing. petr večeř
e-mail: pvecer@centrum.cz
doc. ing. marcel kreidl, csc.
phone: +420 224 352 117
e-mail: kreidl@fel.cvut.cz
doc. ing. radislav šmíd, ph.d.
phone: +420 224 352 131
e-mail: smid@fel.cvut.cz
department of measurements, faculty of electrical engineering, czech technical university in prague, technická 2, 166 27 prague 6, czech republic

uml-oriented risk analysis in manufacturing systems
j. jirsa, j. žáček
abstract: whenever we want to avoid failures or hazardous events in today's complex technological systems, it is advisable to carry out appropriate risk management. one of the most important aspects of risk management is the risk analysis process. the aim of this paper is to present a new risk analysis method based on the unified modelling language (uml), which is successfully used in software engineering for describing the problem domain. the paper also includes a small practical example, demonstrating the new method on an unreeling process in cable manufacturing.
keywords: risk analysis, unified modelling language, manufacturing systems.
1 introduction
risk analysis and risk control have been very frequently used words in recent years. it seems to be a modern trend to speak about risk, mainly in economics or in environmental management. however, if we focus on our everyday lives, we will find that there are many situations in which we subconsciously make a risk analysis. we do not usually recognise the particular phases of this intuitive risk analysis, and we cannot apply this biological process directly in technical applications. the reason is very simple: we might omit some important factors during an intuitive risk analysis, which can lead to fatal consequences. present-day technologies are complex systems; they work with various materials and utilise demanding processes. during the life-cycle of these technologies, hazardous events or failures can occur, so we try to find ways to prevent all potential damage. for this we need risk analysis. the paper focuses primarily on risk analysis for technological systems. our aim is to present a new application of standard software modelling tools to improve and enrich computer processing, and to simplify the steps in risk analysis. the paper therefore provides a new interdisciplinary view of the risk analysis issue, as will be discussed below.
2 risk analysis process
as mentioned above, each of us applies risk analysis several times per day. we do this fully automatically (subconsciously) and usually do not think about it in greater detail. for example, as we leave home, we try to remember whether all electric, gas and water devices have been switched off or closed, including the kitchen stove and the water taps. a similar example is when we prepare for a holiday: we think about possible hazards and try to avoid them, or at least to be well prepared. it is obvious that the same sort of thinking is applicable in technical branches. let us take a look at three common characteristic questions, mentioned by tichý in [1]:
1. what failures can occur in the inspected object or process?
2. how often can these failures arise?
3. what will happen after the failure occurs?
these questions are universal enough, and can be applied to every human activity. however, for real usage, especially in technical applications, it is necessary to specify concrete steps with specific rules.
it is recommended to follow the general risk analysis process for technological systems defined in the iec 300-3-9 standard [2]. in particular, it is necessary to take the following steps:
1. define the scope of the analysis,
2. identify the hazards and make an initial evaluation of the consequences,
3. estimate the risk,
4. verify,
5. make documentation,
6. update the analysis.
as might be expected, each step can be subdivided into more detailed tasks. although these steps may appear simple and easy, it is often quite difficult to implement them.
3 risk analysis techniques
in the course of history, many techniques have been developed for making a risk analysis, especially in the last hundred years. it is necessary to recognise that this is an ongoing process: we can observe the progress in risk analysis techniques in response to rapid technical advances. some modern techniques are derived from older methods, which have been suitably modified and enriched to fulfil current needs and make use of new possibilities. risk analysis techniques have originated in various fields of activity and in various historical eras (within the scope of technical and technological invention), but all of them use the rules of systematic analysis and logic.
table 1: the most widely used risk analysis methods (method; suitable for complex systems; suitable for new systems; quantitative analysis; top-down or bottom-up)
event tree analysis (eta): no; no; yes; bottom-up
failure modes and effects analysis (fmea): no; no; yes; bottom-up
fault tree analysis (fta): yes; yes; yes; top-down
hazard and operability studies (hazop): yes; yes; no; bottom-up
human reliability analysis (hra): yes; yes; yes; bottom-up
reliability block diagrams (rbd): no; no; yes; top-down
preliminary hazard analysis (pha): yes; yes; no; bottom-up
we can see the application of two main logical principles: induction and deduction. induction is used when we investigate the possible consequences of hazardous events. on the other hand, deduction is applied to find out the possible causes of hazards or failure modes. in terms of risk analysis, these principles are called bottom-up for induction and top-down for deduction. of course, there are ways to combine these two principles, but we can also apply them separately. in addition, risk analysis methods can be divided from another point of view, as can be seen in [4]: the qualitative or quantitative character (or both) of each method. this distinction is based on whether the method provides numerical results. another aspect of risk analysis methods is the output format. this is often determined by the principles that are applied, by the structure of the system (or process), and by the accompanying effort to achieve clear visualisation. thus we can find verbal, tabular or graphical outputs. for a complex technological system with many parts spread over a wide area, it can be a very hard task to describe the system in purely textual form, and it may be better to choose a suitable graphical representation. the most widely used risk analysis methods are presented in table 1, which has been taken from [4, 2] and reduced. all the methods mentioned here strictly use logical principles. however, let us focus on an alternative way of making an analysis, which is widely used in another technical branch: software development.
4 object-oriented principles
at the beginning of each software project, it is necessary to make several analytical steps. in these steps, software developers attempt to identify and describe all the desirable entities, their relationships and their behaviour. here we can clearly see a basic similarity with the risk analysis process: the initial steps are the same (see section 2). during the last three decades, object-oriented approaches have often been used for these purposes. object-oriented approaches are based on the idea that the world consists of objects which interact with each other. objects are characterised by the following features:
• encapsulation
• inheritance
• polymorphism
the term "encapsulation" indicates that the attributes and functions of an entity are joined together into one specific object. attributes describe the state of the object, and functions can change the state and behaviour of the object. inheritance reflects the everyday reality of the evolution of objects: it allows a new object to be derived from one or more parent objects, and during inheritance some new features can also be appended. polymorphism denotes the behaviour which different object types have in common. the idea of object-oriented approaches also involves structural aspects, so we can consider basic relationships such as aggregation, association and composition. interaction between objects is provided by sending messages: the source object sends a request for some function on the target object, and the target reacts to the incoming event in accordance with its state and conditions. the same principle has been unconsciously applied in technically-oriented risk analysis for a long time. we may inspect an object (e.g. a manufacturing system, an engine, a component) in view of its properties, functions and interconnections with other objects. in general, we can observe specific hazards assigned to a specific object; therefore we can say that these hazards are encapsulated in the object. our experience with similar objects gives us a guideline for estimating the potential hazard, so we work with inheritance. polymorphism in risk analysis can be seen in the following way: the same hazard can be caused by several different objects. the unified modelling language was developed for graphical visualisation of the previous concepts, and it also fits well for several other purposes.
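to make the analogy concrete, a small sketch (our illustration, not part of the paper) of hazards encapsulated in component objects, with inheritance and polymorphism:

```python
class Component:
    """a generic component encapsulating its own hazards (attributes
    and functions joined together, as in the encapsulation principle)."""
    def __init__(self, name):
        self.name = name
        self.hazards = []            # hazards encapsulated in the object

    def identify_hazards(self):
        return [(self.name, h) for h in self.hazards]

class RotatingPart(Component):
    """inheritance: a derived component starts from the parent's
    hazards and appends its own."""
    def __init__(self, name):
        super().__init__(name)
        self.hazards += ["entanglement", "part ejection"]

class PayOffUnit(RotatingPart):
    def __init__(self):
        super().__init__("portal pay-off unit")
        self.hazards += ["reel fall", "arm crush"]

# polymorphism: the same call works on different object types,
# and the same hazard may be reported by several different objects
for obj in (RotatingPart("draw caterpillar"), PayOffUnit()):
    print(obj.identify_hazards())
```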
5 uml modelling tools
the unified modelling language is a specific set of tools which can help in several fields of activity, not only in software engineering. it uses the object-oriented approach mentioned above and adds some complementary tools for a better description of the structure and behaviour of the system. we demonstrate that all these features can be utilised in several stages of risk management, mainly during the description of the system (or process) and also in visual hazard scenario modelling. although uml is a very complex language, let us take a brief look into its composition. the language includes the following basic construction blocks [3]:
• subjects
• relationships
• diagrams
subjects (or abstractions) can be further subdivided into:
• structural abstractions – nouns, e.g. classes, interfaces, collaborations, use cases, etc.;
• behaviour – modal verbs, e.g. interaction, state;
• aggregations – packages for grouping significantly related components;
• comments – additional useful annotations extending the model.
diagrams are a graphical representation of the model. there are symbols with predefined syntax and semantics to show the model in several views. obviously, diagrams allow better orientation in a description of the system, especially when several people are participating in the project. in this case, diagrams are an unambiguous form of description used in team cooperation, and they are often more effective than huge paragraphs of text. an important difference between models and diagrams is that when we remove a symbol from a diagram, it does not mean that the corresponding parts of the model are automatically discarded. all descriptions, recommendations and simple examples can be found in the uml specification [9]. we can also see an interesting uml feature: the metamodel approach. this means that the language definition can be described by its own means.
6 risk analysis based on uml
in this part of our paper we discuss the use of uml as a helpful tool for risk assessment. we have tried above to briefly describe the basic principles of the risk analysis process and of the object-oriented approach used in software application design. although there are many similarities between these different processes (both are usually organised as a project [6]), it is not easy to apply the steps used in software design to a risk analysis. uml does not provide recommended methodologies for using its own tools, nor the sequence in which to make the uml parts. several methodologies have been developed in software engineering that aim to describe the right order, for example rup (rational unified process), see [7]. we can take some inspiration from these methodologies, but the differences in our domain should not be forgotten. let us note that the rup methodology also includes some steps related to risk assessment. an important consideration is that the whole process of risk analysis cannot be finished at once and cannot be done quickly: if we want to obtain appropriate and valuable results, we have to iterate as many times as necessary. we would now like to propose a new uml-based method for risk assessment. this new method consists of the following steps:
1. describe the system structure by a class diagram or component diagram,
2. describe the behaviour by activity, sequence or state diagrams,
3. identify, qualify and add potential hazards into the class diagrams,
4. create rough hazard scenarios by use case diagrams,
5. describe detailed risk scenarios with the use of interaction or activity diagrams,
6. evaluate the results and the documentation.
7 practical example
let us take a look at a small practical example from electro-technical manufacturing. we will investigate possible risks in a processing line that produces nfa2x cable core. this is an aluminium twisted-wire core with xlpe (cross-linked polyethylene) insulation. a simple scheme of the processing line topology is shown in fig. 1, sketching its real configuration at the prakab pražská kabelovna, a.s. company.
fig. 1: nfa2x cable core processing line
the drum (or reel) with the wire core is fastened in the portal pay-off unit. the wire core is reeled off by the pushing caterpillar and goes into the head of the extrusion machine, where the insulation is deposited. in the next step, the new cable core is drawn through the water-filled cooling trough. then it is dried by compressed air, tested for dielectric strength by high voltage, and wound onto the drum. appropriate tension of the wire core is provided by the draw caterpillar placed in front of the portal winding unit. there are also two diameter monitoring devices and a length measurement device.
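as a sketch of step 1 of the proposed method (structural description), the line topology of fig. 1 could be captured as an ordered composition of components (our illustration; the station names are taken from the description above):

```python
from dataclasses import dataclass, field

@dataclass
class Device:
    """one station of the processing line, with the hazards
    identified for it (step 3 of the proposed method)."""
    name: str
    hazards: list = field(default_factory=list)

# structural description of the nfa2x line, in processing order
line = [
    Device("portal pay-off unit"),
    Device("pushing caterpillar"),
    Device("extrusion machine head"),
    Device("water-filled cooling trough"),
    Device("compressed-air drying"),
    Device("high-voltage dielectric test"),
    Device("diameter monitoring"),
    Device("length measurement"),
    Device("draw caterpillar"),
    Device("portal winding unit"),
]
line[0].hazards += ["reel fall", "cable core damage at reel fronts"]
for dev in line:
    print(dev.name, dev.hazards)
```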
as a risk analysis of the whole line would be very large, we will focus only on the first device in the line – the portal pay-off unit. we limit our analysis in order to focus on presenting uml as a risk analysis support tool, and for the same reason we will not work out the quantitative part. our aim is to identify potential hazards that could threaten the smoothness of the operation and to show possible realisation scenarios. if, as in most cases, we need a quantification, we can use the risk priority number (rpn) [8] calculation known from failure mode and effects analysis (fmea). the result is given by equation (1):
$$rpn = sv \times lk \times dt, \qquad (1)$$
where sv is a severity value, lk is a likelihood value and dt means detection. all three values are estimated and assigned during the risk analysis, usually from a predefined degree scale. it is important to remember that the degree scale should not begin with zero, and should have a suitable range.
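a minimal sketch of the rpn quantification of eq. (1) (our illustration; the 1–10 degree scale is an assumed convention, not prescribed by the paper):

```python
def rpn(severity, likelihood, detection, scale=(1, 10)):
    """risk priority number of eq. (1): rpn = sv * lk * dt.
    all three degrees must come from a scale that does not start at zero."""
    lo, hi = scale
    for v in (severity, likelihood, detection):
        if not lo <= v <= hi:
            raise ValueError("degree outside the agreed scale")
    return severity * likelihood * detection

# example: hazard "reel fall" with estimated degrees sv=7, lk=2, dt=4
print(rpn(7, 2, 4))   # -> 56
```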
first, we should start with object identification. there are three major objects that participate in the unwinding process:
• the portal pay-off unit,
• the cable reel,
• the operating staff.
let us focus on the first of these. fig. 2 shows the design of the portal pay-off unit. for better understanding, a cable reel is also drawn in its working position, but not fastened up. we can clearly see the structure and the relationships between the components. the portal consists of two movable interconnected arms. each arm has a lifter on its pole, which is coupled to the one on the opposite side. one arm is equipped with a compressed-air brake to supply reel braking. all movements are controlled by electrical drives. the whole unit traverses on floor rails across the unwinding direction. this feature is necessary for correct unwinding; otherwise the cable core can be damaged by the reel fronts at the terminal points.
fig. 2: portal pay-off unit [5] with a cable reel
7.1 system description and hazard identification
we can now go ahead, keeping in mind the steps described in section 6. as a first step, we should make a structural description of the system using a class diagram and a behavioural description by a sequence diagram. for the purposes of this paper, we can merge step 1 and step 3 of the order given above, so we can directly compose the identified hazards into the diagram. let us make some simplifications: we presume minimal failures of the electric drives and the power supply. therefore we can consider the three potential mechanical hazards shown in fig. 3.
fig. 3: class diagram showing a pay-off unit with possible hazards
the class diagram uses standard uml notation [9], with rectangles as classes and junction lines as the symbols for a relationship. the lines can be equipped with short verbal phrases for better comprehension, and also with special symbols. these symbols (e.g. diamonds) represent the type of relationship and, optionally, the direction. it is sometimes suitable to supplement the multiplicity of specific classes; this indicates a possible count of the same classes in the relationship. the notation of multiplicity can typically be expressed in interval form, e.g. 0..1 or 2..5, where the numbers represent lower and upper bounds. the important thing is that the diagram represents classes, not concrete objects. hence we can reuse this diagram for other similar devices located in production lines.
the behavioural aspects will be presented as a preparation for the unreeling process. the sequence diagram in fig. 4 describes the sequence of steps that an operator should take for a correct unreeling process. the rectangles with vertical dashed lines at the top of the picture are called lifelines, and they represent the participating objects in the interaction. the thin vertical rectangles situated on the dashed lines show the activity of the specific object.
fig. 4: sequence diagram describing preparation for unwinding
7.2 risk scenarios
in the following step we can create approximate risk scenarios. they will be presented as use case diagrams with various focused objects and actors. the left part of fig. 5 shows the first case, where the central part is the pay-off unit and external objects in the role of actors can cause possible hazards. the right side of fig. 5 shows another situation: the central investigated object is an operator, and the external actors are the pay-off unit and the cable reel, which can threaten the operator. we can note the same graphical symbol (an icon of a "stick man" – see the uml specification [9]) for the human actors and the technical object actors.
fig. 5: the first and second scenarios as a use case diagram
after the approximate scenarios, it is useful to develop a more detailed description of potential hazard occurrences, including their consequences. we can use an activity diagram for this purpose. formally, activity diagrams arise from petri nets, but they differ in several ways [9]; the common attribute is the flow of a token. we will show here only the diagram for the first scenario, drawn in fig. 6. let us note that activity diagrams can, up to a point, describe actions in terms of their location. this feature can also be very useful in risk analysis. another feature of activity diagrams is their ability to show concurrent activities and activity branching.
fig. 6: the more detailed first scenario in an activity diagram
as was mentioned above, we will not make the last steps (quantification, evaluation and documentation) of our new method, which lie outside the primary goal of this paper. we have therefore completed our task.
8 conclusion
in this paper, we have tried to present a new approach to a common interdisciplinary risk analysis process. our aim was to present the possibilities of the unified modelling language as a suitable tool for risk analysis. we decided that the best way was to show it on the basis of a small practical example. the utilisation of uml is very efficient, although the tool itself comes from a different technical branch. its abstract concept allows the modelling of complex technical and other issues, e.g. from the field of biology. it is impossible to show in a single paper the whole range of possibilities of uml, so we tried to emphasise the significant structural and behavioural modelling considerations. thanks to the origin of the unified modelling language, we can easily use it to convert the model into further computer processing. although the application of uml seems to be very beneficial, it is also necessary to discuss the negatives. obviously we have to learn uml, and we have to understand it. sometimes this could be quite difficult, especially when a risk analysis is made by a whole team of experts. another consideration is the need for a computer system. with the help of available software tools we can work better with uml-based risk models.
manipulation, storage and advanced computation of the models are much easier, but professional software tools can be very expensive. we could also draw the diagrams on paper, but in the case of huge systems this would involve a large amount of work. future work on this topic will involve making potential hazard templates in electro-technical manufacturing. these uml-based templates could consist of particular hazard classes and general risk scenarios, which are typical for this branch of industrial processing.
references
[1] tichý, m.: risk governance: analysis and management. 1st ed. praha: c. h. beck, 2006. 396 p. isbn 80-7179-415-5. in czech.
[2] iec 300-3-9. dependability management – part 3: application guide – section 9: risk analysis of technological systems. 1995. 28 p.
[3] arlow, j., neustadt, i.: uml 2 and the unified process: practical object-oriented analysis and design. 2nd edition. brno: computer press, a.s., 2007. 567 p. isbn 978-80-251-1503-9. in czech.
[4] iec 60300-3-1. dependability management – part 3-1: application guide – analysis techniques for dependability – guide on methodology. 2003. 55 p.
[5] transportkabel dixi a.s. [on-line]. april 2nd, 2006 [cit. 2010-06-24]. portal pay-off unit. available from www: http://www.tkdixi.cz/html eng/obd2800.htm.
[6] jirsa, j.: methods for analysis and modeling of failures in manufacturing systems. in: international conference pelincec 2005 [cd-rom], 2005. warsaw: warsaw university of technology.
[7] ibm rational unified process [on-line]. [cit. 2010-06-24]. available from www: http://en.wikipedia.org/wiki/ibm_rational_unified_process.
[8] fmea rpn [on-line]. [cit. 2010-07-09]. available from www: http://www.fmea-fmeca.com/fmea-rpn.html.
[9] object management group, inc.: omg unified modeling language, superstructure, v2.1.2 edition, november 2007. [cit. 2010-07-09]. available from www: http://www.omg.org/spec/uml/2.1.2/superstructure/pdf.
ing. jan jirsa
phone: +420 224 353 965
e-mail: jirsaj@fel.cvut.cz
department of electrotechnology, faculty of electrical engineering, czech technical university in prague, technická 2, 166 27 praha 6, czech republic
doc. ing. jaroslav žáček, csc.
phone: +420 224 352 198
e-mail: zacek@fel.cvut.cz
department of electrotechnology, faculty of electrical engineering, czech technical university in prague, technická 2, 166 27 praha 6, czech republic

active detectors for plasma soft x-ray detection at pals
c. granja, v. linhart, m. platkevič, j. jakůbek, e. krouský, s. pospíšil, o. renner, t. slavíček
abstract: this paper summarizes the work carried out for an experimental study of low-energy nuclear excitation by laser-produced plasma at the pals prague laser facility. we describe the adaptation and shielding of single-quantum active radiation detectors developed at ieap ctu prague to facilitate their operation inside the laser interaction chamber in the vicinity of the plasma target. the goal of this effort is direct real-time single-quantum detection of plasma soft x-ray radiation with energy above a few kev, and the subsequent identification of the decay of excited nuclear states via low-energy gamma rays, in a highly radiative environment with strong electromagnetic interference.
keywords: laser-induced nuclear excitation, laser-produced plasma, electromagnetic interference, photonuclear excitation.
1 introduction
the possibility of exciting low-energy nuclear states by laser-induced plasma has been proposed and experimentally investigated [1] at the prague asterix laser system (pals – see http://www.pals.cas.cz). this medium-energy high-power system yields interaction intensities at the level of 10^16–10^17 w cm^-2, which produce subrelativistic plasmas with an electron temperature of the order of 1–10 kev. such laser-produced plasma (lpp) can excite low-lying nuclear states. based on a systematic survey of suitable candidate nuclei [2], experiments have been undertaken to investigate the possible lpp excitation and the 6 μs decay of the 6.2 kev level in 181ta. in the framework of this project, particular effort has been devoted to the use of active detectors (i.e. real-time digital single-quantum counting devices) inside the interaction chamber for direct detection of plasma radiation and/or the subsequent nuclear radiation. two types of detector systems have been adapted for this purpose: hybrid semiconductor pixel detectors, and fast scintillating detectors. in addition to the rather complicated target (ta, w) and experimental setup (direct target observation, or indirect geometry using either a secondary target as an ion collector or a toroidally bent crystal x-ray spectrometer to monochromatize the radiation emitted from the plasma), a major challenge was to shield the detector systems against scattered x-ray radiation, high-energy particles, and in particular against the strong electromagnetic interference (emi) induced by the laser pulse. we report the development of special instruments complying with the demanding requirements for detecting the delayed x-ray emission close to the laser-irradiated targets, and we outline future work.
2 laser-produced plasma nuclear interaction
2.1 plasma radiation at pals
the pals medium-size high-power laser system [3] (fundamental laser wavelength 1.315 μm, pulse duration 250 ps) provides strong light pulses with energies at the level of several hundreds of joules and at subrelativistic intensities above 10^16 w cm^-2. the characteristics of the plasma emission accompanying the laser-matter interaction depend on the laser pulse parameters and the particular target material used. by irradiating a 181ta target with a laser intensity of 7 · 10^16 w cm^-2, the plasma electron temperature near the target surface exceeds 1 kev at an electron density close to 10^21 cm^-3; the number of ions in the whole plasma volume (about 3 · 10^-6 cm^3) is estimated at 4 · 10^13, and their average charge can reach 40 [2, 4, 5].
2.2 plasma-induced nuclear excitation
plasma-nuclei interactions in a hot, dense lpp can result in nuclear-photonic and nuclear-electronic excitations [6, 7], such as direct photoexcitation and nuclear excitation by electronic transition (neet). with the plasma characteristics achievable at pals, we consider these two processes as the dominant mechanisms for low-energy nuclear excitation [2, 4].
2.2.1 nuclear de-excitation
the decay of the low-energy excited level in 181ta (6.238 kev, 6.05 μs) proceeds via emission of γ rays or via internal electron conversion (ic) [2]. with decreasing transition energy (eγ < 100 kev), the latter mechanism becomes increasingly dominant. however, atomic shell ionization can significantly affect the decay rate and the decay branching ratio [2].
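to illustrate the timing constraint this decay places on the detectors (the plasma flash lasts ~250 ps, while the nuclear signal follows over microseconds), a small sketch of the surviving fraction of excited 181ta nuclei, reading the quoted 6.05 μs as a half-life (our assumption for illustration):

```python
import math

def surviving_fraction(t_us, half_life_us=6.05):
    """fraction of excited nuclei not yet decayed at time t after
    the laser flash, n(t)/n0 = exp(-ln2 * t / t_half)."""
    return math.exp(-math.log(2) * t_us / half_life_us)

def gate_fraction(t1_us, t2_us):
    """fraction of decays falling inside a counting gate opened
    at t1 and closed at t2 (both in microseconds)."""
    return surviving_fraction(t1_us) - surviving_fraction(t2_us)

# e.g. a gate from 1 us to 30 us after the shot catches most decays
print(gate_fraction(1.0, 30.0))   # ~0.86
```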
3 active detectors for direct detection of plasma radiation
two types of active detector systems are being implemented and adapted for direct detection of plasma radiation (see http://www.utef.cvut.cz): (i) hybrid semiconductor pixel detectors of the medipix type (developed in the framework of the medipix collaboration [8]), and (ii) fast scintillating detectors (of which baf2 was selected as the most suitable type).
3.1 medipix2/usb camera
state-of-the-art hybrid pixel semiconductor detectors of the medipix family [8] are characterized by a spatial, energy and temporal resolution which makes them attractive for nuclear and particle spectroscopy [9]. in addition to the position-sensitive and single-quantum detection capability of medipix2 [10], the new timepix device [11] also provides the possibility to determine the detection time and/or the energy deposition in each individual pixel. real-time operation of detectors of this type, as well as data readout and on-line visualization, are realized via the muros interface [12] or with the integrated usb-based readout interface [13], which links via a standard usb port to any pc and provides the necessary power supply (see http://www.utef.cvut.cz/medipix). operation and control of the system, as well as data acquisition, are driven by the windows-compatible pixelman software package [14]. data stream acquisition and storage proceed on-line at an overall rate of about 5 frames per second. the assembled timepix/usb device [15] has dimensions 142 × 50 × 20 mm^3 (shown in figure 1), and serves as a versatile, portable, real-time hand-held radiation camera for multiple particle types (x-rays, e, p, α, t, ions, n) with position-, time- and energy-sensitive capability for single-quantum and on-line particle and nuclear spectroscopy [9].
fig. 1: medipix2/usb-interface radiation camera. the medipix board (left) is attached to the usb-based readout interface (right)
3.2 baf2 scintillating detector for soft x-ray detection
we investigated the spectrometric response of a baf2 scintillating detector (korth kristalle gmbh) of dimensions 5 × 5 × 5 mm^3, decay time 630 ns (slow component) resp. 0.6–0.8 ns (fast component), light yield 10 photons/kev (slow) resp. 1.8 photons/kev (fast), and a photoelectron nai(tl) light yield of 16 % (slow) resp. 3 % (fast). a fast photomultiplier (hamamatsu no. h5783-04) was used. the energy threshold of this system was determined at 1 kev (noise level), corresponding to an amplitude of 2 mv. tests done with 5.9 kev x-rays from 55fe yield a maximum at 12 mv with a mean measured signal of about 4 mv. the amplitude spectra collected with this radioactive source are shown in figure 2.
fig. 2: amplitude spectra of background (left) and the 55fe source (right) collected with the baf2 detector
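a minimal sketch of a two-point linear amplitude-to-energy calibration built from the quoted figures (1 kev threshold at 2 mv, the 5.9 kev 55fe peak maximum at 12 mv); treating these two points as calibration anchors is our assumption for illustration:

```python
def energy_from_amplitude(a_mv, p1=(2.0, 1.0), p2=(12.0, 5.9)):
    """linear calibration e = e1 + k*(a - a1) through two
    (amplitude_mv, energy_kev) points: the 1 kev noise threshold
    and the 5.9 kev 55fe peak maximum."""
    (a1, e1), (a2, e2) = p1, p2
    k = (e2 - e1) / (a2 - a1)
    return e1 + k * (a_mv - a1)

print(energy_from_amplitude(4.0))   # mean 55fe signal amplitude -> ~2.0 kev
```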
4 detector shield for very high radiation and emi
experimental conditions for generating lpp are characterized by (i) a high vacuum, (ii) a very high radiation background, and (iii) extremely high electromagnetic interference (emi) in the interaction chamber. the noise accompanying the interaction of the high-power laser beams with matter results in significant challenges for the operation and effective shielding of active detectors inside such a hostile environment.
4.1 semiconductor pixel detector
first of all, the detector chip board and the usb interface were implemented and tested for operation in a high vacuum. moreover, background radiation and especially emi noise were expected to affect the operation of the detector chip, which consists of a bump-bonded detector sensor and a microelectronics chip with highly integrated components (preamplifier, adc converter, digital counter) for each of the 65 000 pixels, which act as individual detectors.
4.1.1 medipix2/usb-interface inside the chamber
in the first stage, we attempted to keep both the detector board and the readout interface as one unit (see figure 1) operating inside the interaction chamber [16]. in view of the expected high radiation and high electromagnetic noise, the detector system was doubly shielded. the first shield consisted of an aluminum casing with a cylindrical tube in front of the active sensor chip of the detector, connected to ground. the second shield, isolated from the first, consisted of a lead-coated plate grounded to the chamber. the communication and power supply cables were also double-shielded. in spite of the shielding, the laser pulse still affected the power supply and reset the cpu of the interface. this problem was solved by applying a dc-dc converter (20 v to 5 v) and a filtering capacitor close to the device. this scheme was functional at laser shot energies up to about 10 j. at more powerful flashes, the usb communication failed. we fixed this problem by disconnecting the data wires during the measurement by a relay; the corresponding modification of the software was also implemented. the measurement procedure required preprogramming the device to wait for a trigger, disconnecting the usb data wires (by relay), and, after the laser shot, reconnecting the usb data wires and reading out the data. however, this solution was not reliable, as it was only occasionally functional with laser energies up to about 50 j.
4.1.2 medipix board detached from the usb-interface by a devoted module
since strong effects of emi persisted, and in an effort to minimize the number of components in the very high noise field inside the interaction chamber, the usb readout interface (of dimensions 64 × 50 × 20 mm^3) was physically distanced from the medipix detector chip board. this was attained by constructing a radiation-resistant single-purpose communication lvds module (see figure 3), with separate data communication (three ethernet cables) and power lines between the detector board (placed inside the interaction chamber) and the usb readout interface (outside the chamber) [17]. a power supply to the module and to the detector board was provided for each device separately.
fig. 3: medipix2 detector and lvds module with communication ethernet and power cables. the emi shield for one cable is shown
while the medipix detector board freely operates in a vacuum, the heat in the lvds module has to be dissipated. for this purpose, we glued the bottom side of the module (and also that of the detector board – to ensure its long-term stability) to an aluminum panel using a thermo-conductive polymer (silicone elastomer sylgard 160). in a subsequent step, in view of the strong electromagnetic interference (emi) with very high and short gradients, a special emi shielding assembly was designed and constructed for all components operating inside the interaction chamber [17]. communication and power cables were emi-protected by a flexible shielding insulating tube through which a cable or a bundle of cables is conveyed. the material used is mu-copper, for both electrical and magnetic screening (holland shielding).
the inner emi shielding was grounded to earth. the outer emi shield was connected to the interaction chamber wall. the detector board was also emi-shielded by a gradual onion-like assembly of several (up to three) independent screening arrangements of reinforced amucor and mu-copper foils (hollandshielding.com). these successive layers were insulated from each other by additional insulating material. the ground was taken out independently by a single connection, to avoid induced loops. the interfaces and contacts between the cable shielding tubes and the detector board shield casing were secured by emi-insulating metal conductive tapes. to ensure reliability of operation, the cabling and the usb-based interface outside the interaction chamber were also emi-shielded.
4.2 scintillating detector
in view of both the ionizing radiation background and the non-ionizing emi noise, a massive and complex shield was required. emi induces a marked but gradually varying disturbance of the detector/preamplifier signal. this occurs even before the arrival of the laser shot (see figure 9 and figure 10), as a consequence of the laser charging and laser resonance effects. the design of the system shield against ionizing radiation was conceived [18] as three segments/layers: outer, intermediate and inner (see figure 4). the outer shield consisted of 2 cm-thick polyethylene, with the aim of stopping or slowing down electrons emitted by the target. the intermediate layer consisted of dural foils with a total thickness of 5 mm, placed namely at the interfaces and loose edges of the surrounding shielding blocks; their task was to stop the charged particles which eventually arise and cross the inter-block apertures. the inner layer contained three lead sheet plates with a full thickness of about 15 mm. they shielded the detector against photons originating directly from the plasma and/or from the interaction of the ion fluxes with the chamber and the shielding blocks. this inner layer should already prevent any direct charged-particle impact, with the exception of relativistic electrons. further thick al and pb plates, as well as thin al/c paper foils, were additionally implemented. the detector window was protected by a light-tight 125 μm thick be foil. mountable pb/pe collimators with bore openings of 3 mm and 8 mm were also used (see inset in figure 5).
fig. 4: baf2 detector with lead (pb), aluminum (al) and polyethylene (pe) shield
fig. 5: shield assembly for the baf2 detector with lead (pb), aluminum (al) and polyethylene (pe) segments. the pb/pe shield with the 8 mm collimator is shown in the inset
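for a rough feel of what the inner lead layer does to plasma photons, a beer-lambert transmission sketch through the layered shield; the attenuation coefficients below are placeholder values that in a real estimate would be taken from tabulated data at the photon energy of interest:

```python
import math

def transmission(layers):
    """fraction of photons passing a stack of
    (mu_cm2_per_g, rho_g_per_cm3, thickness_cm) layers,
    i/i0 = exp(-sum(mu * rho * x))."""
    return math.exp(-sum(mu * rho * x for mu, rho, x in layers))

# illustrative stack: 2 cm pe + 0.5 cm al (dural) + 1.5 cm pb;
# the mu values are placeholders, not tabulated data
stack = [(0.2, 0.94, 2.0),    # polyethylene
         (0.3, 2.70, 0.5),    # aluminum
         (1.0, 11.35, 1.5)]   # lead
print(transmission(stack))
```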
the fully assembled system in the interaction chamber is shown in figure 6. measurements were carried out using two different targets (ta and w) with the shielding assembly described above. the detector managed to remain operational at the desired laser energies (170 j). however, in view of the relatively short decay time (6 μs) of the excited level (6.2 kev) in ¹⁸¹ta, and the duration of the laser plasma flash pulse (250 ps), the additional use of a trigger should guarantee separation of the expected spectrometric signals from the large primary laser pulse and the subsequent intense plasma radiation.

fig. 6: medipix detector and shielding assembly setup in the pals interaction chamber. the diffracted x-rays (green arrow) coming from the bent crystal are detected by the detector placed at the focal plane

fig. 7: indirect experimental setup in the pals interaction chamber: the baf2 detector assembly, facing the secondary target – the collector, is shielded from the primary ta target

5.2 baf2 detector

with this detector, measurements were carried out in various experimental and shielding setups. in terms of target and detector position, we tested a direct view setup, where the detector window is directed at the primary target, and an indirect view setup, where the detector faces either the curved crystal (figure 6) or a secondary target functioning as an ion collector (figure 7). measurements were performed with different massive shielding arrays, which are illustrated in figure 8. the shield assembly for this detector was composed of blocks/layers made of polyethylene (pe), lead (pb) and aluminum (al) of different segment sizes and thicknesses (see figure 8). a thin beryllium (be) foil covered the detector window.

5.2.1 direct view on target

while the emi noise was significantly suppressed, even this fully closed and massive shield yielded background signals, as shown in figure 9, for a ta target. results of similar response tests obtained with varying laser energy, from zero up to 157 j, are shown in figure 10. in this geometry, even this strong shield is not sufficient, and the penetrating radiation remains non-negligible (i.e. relativistic electrons and/or high-energy gamma rays).

5.2.2 indirect view on secondary target-collector

in this geometry, a massively protected detector with a sealed aperture was used. the space between the target and the detector was complementarily shielded by massive lead plates. however, opening the aperture, even with varying collimator size and composition, resulted in a strongly disturbed response. the response measurements are illustrated in figure 11.

5.2.3 indirect view behind the monochromator crystal

with the toroidally bent crystal spectrometer, the detector response yielded results similar to those of the previous setups. collimated apertures alone did not fully shield the detector; at least thin plates of pb and al were required, even in this geometry.

6 conclusions

for the laser parameters and ta target used here, plasma-penetrating radiation remains the principal noise factor for scintillating detectors, while emi noise was the most important limiting factor for semiconductor pixel detectors. the possibility of successfully applying both types of detectors has been demonstrated. in future work, the timing of medipix data collection at the desired time interval (a few μs after the laser flash) should be driven by the laser trigger, which is usually generated 4–5 μs before the laser shot.
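the trigger-gated readout proposed in the conclusions can be sketched as a simple time-window selection: keep only events arriving a few microseconds after the shot, where de-excitation of the 6.2 kev level (6 μs decay time) is still expected but the prompt plasma flash has passed. the gate constants and the event-list format below are our own assumptions, not parameters from the experiment.

# sketch of trigger-gated event selection. event times are assumed to be in
# seconds relative to the laser trigger; the numbers are illustrative.

TRIGGER_TO_SHOT = 4.5e-6   # the trigger precedes the shot by ~4-5 us
GATE_START = 1.0e-6        # open the gate 1 us after the shot (prompt flash over)
GATE_WIDTH = 12.0e-6       # roughly two decay times of the 6 us level

def gated_events(event_times_s):
    """select events inside the post-shot spectroscopy gate."""
    t0 = TRIGGER_TO_SHOT + GATE_START
    t1 = t0 + GATE_WIDTH
    return [t for t in event_times_s if t0 <= t < t1]

# example: a prompt burst at the shot time plus a few delayed counts
events = [4.5e-6, 4.51e-6, 4.52e-6, 7.0e-6, 11.0e-6, 19.0e-6]
print(gated_events(events))  # -> [7e-06, 1.1e-05]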
the baf2 detector is most suitable for following the time evolution of the plasma radiation as a whole. this may proceed, e.g., via recording the complete set of emitted quanta in consecutive time intervals (e.g. 5 ns). until now, the baf2 detector could be used only for observing de-exciting nuclear radiation from indirect targets. moreover, the detector entrance window had to be at least partially shielded by thin pb/al foils, which significantly reduced the overall detection efficiency for the desired radiation. work on further improving and stabilizing the operation of these detectors in such a radiation and emi noise environment is in progress.

fig. 8: schematic illustration of the target and shield setups investigated for the baf2 detector. the geometric setup is determined by the position of the primary target, the (eventual) crystal/collector as a secondary target, and the detector. the shield assembly was composed of polyethylene (pe), lead (pb) and aluminum (al), with a thin beryllium (be) foil window, in various segment sizes and thicknesses

fig. 9: time spectrum of a laser shot (#09) with energy 155 j on a ta target in setup “a” (see figure 8). the full spectrum is shown (top). for illustration, parts of the spectrum are shown in detail: prior to and at the laser shot (second from top), at a later time – the region between 29 μs and 39 μs (third from top), as well as a region between 36 μs and 37 μs (bottom). the laser shot time and the pals laser trigger signal level (generally at about 4 μs before the laser shot) are indicated by the letters “s” and “t”, respectively

acknowledgement

this work was funded by research grant no. 202/06/0697 of the czech science foundation, and was carried out in the framework of research program msm 6840770029 of the czech ministry of education, youth and sports.

references

[1] granja, c., jakůbek, j., linhart, v., pospíšil, s., slavíček, t., uher, j., vykydal, z., kuba, j., sinor, m., drska, l., renner, o., juha, l., krása, j., krouský, e., pfeifer, m., ullschmied, j.: search for low-energy nuclear transitions in laser-produced plasma, czech. j. phys., 56, 2006, p. 478–484.

[2] granja, c., haiduk, a., kuba, j., renner, o.: survey of nuclei for low-energy nuclear excitation in laser-produced plasma, nuclear physics a, 784, 2007, p. 1–12.

[3] jungwirth, k., et al.: the prague asterix laser system pals, physics of plasmas, 8, 2001, p. 2495.

[4] renner, o., juha, l., krása, j., krouský, e., pfeifer, m., velyhan, a., granja, c., jakůbek, j., linhart, v., slavíček, t., vykydal, z., pospíšil, s., kravařík, j., ullschmied, j.: low-energy nuclear transitions in subrelativistic laser-generated plasmas, laser and particle beams, 26, 2008, p. 249–257.

[5] renner, o., granja, c., linhart, v., pospíšil, s., juha, l., krása, j., krouský, e.: search for low-energy nuclear transitions in laser-produced plasma, proc. eps plasma physics conference, vol. 32, plasma physics and controlled fusion series, bristol, institute of physics publishing, 2008, p. 126–131.

fig. 10: time spectra collected on the ta target in setup “a” (see figure 8) for varying laser pulse energy: (from top to bottom) no shot, 12 j, 61 j, 155 j (same as figure 9) and 157 j. trigger and shot level are indicated. time channel/range settings varied for some measurements

[6] harston, m. r., chemin, j. f.: mechanisms of nuclear excitation in plasmas, phys. rev. c, 59 (5), 1999, p. 2462–2473.
[7] tkalya, e. v.: mechanisms for the excitation of atomic nuclei in hot dense plasma, laser physics, 14, 2004, p. 360–377.

[8] see http://www.cern.ch/medipix and http://www.utef.cvut.cz/medipix

[9] granja, c., vykydal, z., jakůbek, j., pospíšil, s.: position-sensitive nuclear spectroscopy with pixel detectors, conf. proc. series, vol. 947, new york, american institute of physics, 2007, p. 449–452.

[10] llopart, x., campbell, m., dinapoli, r., san segundo, d., pernigotti, e.: medipix2 – a 64 k pixel readout chip with 55 μm square elements working in single photon counting mode, proc. ieee nss/mic, trans. nucl. sci., 49, san diego, ieee, 2002, p. 2279.

[11] llopart, x., ballabriga, r., campbell, m., wong, w.: timepix, a 65 k programmable pixel readout chip for arrival time, energy and/or photon counting measurements, nuclear instruments and methods in physics research a, 585, 2007, p. 106–108.

[12] san segundo bello, d., van beuzekom, m., jansweijer, p., verkooijen, h., visschers, j.: an interface board for the control and data acquisition of the medipix2 chip, nuclear instruments and methods in physics research a, 509, 2003, p. 164–170.

fig. 11: time spectra collected on a ta target (similar to figure 9) in the time range 0 μs to 15 μs for four different detector shield setups (from top to bottom: “a”, “b”, “c” and “e” – see figure 8). the laser shot energy (in joules) is indicated and was very similar for these graphs (155 j, 167 j, 165 j and 149 j; all in the 3rd harmonic). the pals laser trigger and shot signal levels (at about 0.7 μs and 4 μs, respectively) are labeled by the red boxes. the ta target is the same. time channel/range settings varied for some measurements (e.g., “b”)

[13] vykydal, z., jakůbek, j., pospíšil, s.: usb interface for medipix2 enabling energy and position detection of heavy-charged particles, nuclear instruments and methods in physics research a, 563, 2006, p. 112–115.

[14] holý, t., jakůbek, j., pospíšil, s., uher, j., vavrik, d., vykydal, z.: data acquisition and processing software package for medipix2, nuclear instruments and methods in physics research a, 563, 2006, p. 254–258.

[15] vykydal, z., jakůbek, j., holý, t., pospíšil, s.: a portable pixel detector operating as an active nuclear emulsion and its application for x-ray and neutron tomography, proc. 9th conf. on astroparticle, particle and space physics, detectors and medical applications, como, world scientific, 2006, p. 779–784.

[16] platkevič, m., granja, c., jakůbek, j., vykydal, z.: medipix in extremely hostile environment, proceedings workshop ctu prague, vol. 12, no. elp018, 2008, p. 212–213.

[17] platkevič, m., granja, c., jakůbek, j., pospíšil, s.: electromagnetic interference shielding for medipix detectors for laser-induced plasma radiation detection, proceedings workshop ctu prague, 2009, in print.

[18] granja, c., linhart, v., pospíšil, s., slavíček, t.: scintillating baf2 detector for low-energy nuclear excitation at pals, proceedings workshop ctu prague, 2009, in print.

doc. ing. carlos granja, ph.d.*
phone: +420 224 359 394
e-mail: carlos.granja@utef.cvut.cz
http://www.utef.cvut.cz
[*corresponding author]
institute of experimental and applied physics
czech technical university in prague
horská 3a/11, 128 00 prague 2, czech republic

ing. vladimír linhart, ph.d.
phone: +420 224 359 180
e-mail: vladimir.linhart@utef.cvut.cz
institute of experimental and applied physics
czech technical university in prague
horská 3a/11, 128 00 prague 2, czech republic
[current address – university of valencia, spain]

ing. michal platkevič
phone: +420 224 359 181
e-mail: michal.platkevic@utef.cvut.cz
institute of experimental and applied physics
czech technical university in prague
horská 3a/11, 128 00 prague 2, czech republic

jan jakůbek
phone: +420 224 359 181
e-mail: jan.jakubek@utef.cvut.cz
institute of experimental and applied physics
czech technical university in prague
horská 3a/11, 128 00 prague 2, czech republic

prom. fyz. eduard krouský, csc.
phone: +420 266 052 136
e-mail: krousky@fzu.cz
institute of physics, v.v.i.
academy of sciences of the czech republic
na slovance 2, 182 21 prague 8, czech republic

stanislav pospíšil
phone: +420 224 359 290
e-mail: stanislav.pospisil@utef.cvut.cz
institute of experimental and applied physics
czech technical university in prague
horská 3a/11, 128 00 prague 2, czech republic

ing. oldřich renner, drsc.
phone: +420 266 052 136
e-mail: renner@fzu.cz
institute of physics, v.v.i.
academy of sciences of the czech republic
na slovance 2, 182 21 prague 8, czech republic

tomáš slavíček
phone: +420 224 359 180
e-mail: tomas.slavicek@utef.cvut.cz
institute of experimental and applied physics
czech technical university in prague
horská 3a/11, 128 00 prague 2, czech republic

acta polytechnica vol. 52 no. 5/2012

the electrification of tramways in ostrava in 1900–1901

michaela závodná
dept. of history, centre for economic and social history, faculty of arts, university in ostrava, reální 5, 701 03 ostrava, czech republic
corresponding author: michaelazavodna@seznam.cz

abstract
this paper focuses on the electrification of tramways in ostrava in 1900–1901. as the administrative, economic and cultural centre of an evolving industrial agglomeration, ostrava had specific transportation requirements. tram electrification was a response to these requirements. two private companies, brünner lokaleisenbahngesellschaft and ganz & comp. (later mährisch-ostrauer elektrizitäts-aktien-gesellschaft), negotiated an agreement. the first part of this paper deals with the technical aspects of electrification, i.e. the change-over from steam power to electric power, while the second section analyses the dealings between these private companies, the subsequent municipalization, and the involvement of the municipal self-government of moravská ostrava.

keywords: history of technology, tramway, electrification, municipalization, ostrava.

1 introduction

electricity and trams are two sides of the same coin. electrification was a major economic and socio-spatial phenomenon. electric trams played a significant role as consumers of the electricity produced by the central power plant. electrification of means of transport provided new opportunities for the growth of cities. in this context, electrification was an essential step toward the modernization of society. tramway electrification was in most cases accompanied by municipalization. this meant that tramways were considered a monopoly of public necessity: “and, as such, should not be used to put large dividends into the pockets of shareholders.” [1, p. 172] tramways developed not as a money-making private enterprise, but for the benefit of the public. the development of the tramway network was seen as an act of social policy enabling both urban planning and important changes in the social structure of cities.
2 the electrification of tramways in ostrava in 1900–1901

electrification did not automatically mean municipalization, however, and there was no clear-cut connection between setting up power stations and operating electric trams. the ostrava-karviná mining district provides an example of how electric-powered trams became established. around 1900, moravská ostrava was taking shape and becoming the administrative, economic and cultural centre of the evolving industrial agglomeration. its mining and industrial history provided specific conditions for the development and expansion of local public mass transport.

beginning in 1894, the tramway transport network in ostrava was built up by the brünner lokaleisenbahngesellschaft transport company. on august 18th, 1894, the first steam-powered line was opened between přívoz, moravská ostrava and vítkovice. goods transport began at the same time. in 1899, a new line was opened to lhotka (present-day mariánské hory). until 1901, all these lines were operated by steam locomotives. the lines were used by 348 243 passengers in 1894. five years later, the number of passengers had increased to 1 480 980 [2, p. 7]. the transport company therefore established a new timetable, with trams running at 20-minute intervals.

problems soon arose with this new tram system. neither the vehicles nor the rails were robust enough for the rising traffic frequency. in addition, steam locomotives required large amounts of coke, and the operating costs kept rising. the transport company tried to deal with this negative trend. first, the number of trams was reduced from 16 to 9, and then the frequency of the service was reduced; the new timetable scheduled 30 minutes between trams [3]. however, it became clear that the operation needed new technical developments, and that it was time to redevelop mass transport in the district of ostrava.

the board of governors of brünner lokaleisenbahngesellschaft began to consider replacing uneconomical and expensive steam by electricity. the company contacted several electricity companies, and finally selected österreichische schuckert-werke. the first contacts between österreichische schuckert-werke and brünner lokaleisenbahngesellschaft were tentative. the transport company asked if schuckert-werke would be able to make detailed plans for the future electrification of trams in ostrava. the total investment was estimated at 700 000–800 000 kronen [3]. the cost of electrification rose to 1 120 000 kronen, and then to 1 300 000 kronen, in subsequent negotiations [5], [7]. brünner lokaleisenbahngesellschaft would cover this investment by issuing 5 350 new shares at 200 kronen; the rest would be paid from the reserve fund of the transport company [8]. a specific plan for electrification was made by österreichische schuckert-werke, and was debated by the state offices on july 9th–11th, 1900 [2, p. 7]. electric trams were to be put into operation in ostrava in october 1900 [5], [7]. however, österreichische schuckert-werke had many problems with the materials that were required and with installing the overhead wires, and the date for launching the electric trams was postponed several times. the works were completed in march 1901. the police and technical inspection was carried out on april 4th, 1901.

table 1: number of electric locomotives bought in 1900.
line – number of locomotives
railway station – moravská ostrava (downtown): 4
moravská ostrava – vítkovice: 5
moravská ostrava – lhotka: 2
reserve: 4
the next day, regular electric transport services were started [2, p. 8], [5], [7], [8]. the transport company sent two of its employees, chief operating officer reinhart and foreman fösl, to liberec for an internship. at the same time, brünner lokaleisenbahngesellschaft bought 15 new electric locomotives [5], [7], see tab. 1. only passenger transport was electrified; steam locomotives continued to operate for freight transport.

at the same time, the transport company had to solve problems with the electricity supply. brünner lokaleisenbahngesellschaft originally planned to have its own power station [4]. later, this idea was abandoned for two reasons. the first reason was the extremely high costs: as österreichische schuckert-werke wrote in its report, the price per 1 kwh was estimated at 20–24 heller [5], [7]. the second reason was more substantial. the central power plant in moravská ostrava, owned by ganz & comp., had come into operation in 1897. on may 31st, 1897, this company and the municipal self-government of moravská ostrava closed a deal: ganz & comp. was given an exclusive right to use public property (streets, squares, etc.) for the production and delivery of electricity in the district of moravská ostrava [11, p. 143]. brünner lokaleisenbahngesellschaft was able to buy a plot of land and build its own power station there, but it could not use the public streets for delivering electricity. the transport company also considered that this agreement between moravská ostrava and ganz & comp. infringed its own exclusive right to use public property [4].

in the end, the transport company and the electricity company negotiated an agreement. these negotiations led to an agreement on prices for electricity for the requirements of the tramway lines from přívoz to moravská ostrava, vítkovice and lhotka [5], [6], [7], [12, p. 528], see tab. 2. the same prices were also used in the tram network in brno.

table 2: prices of electricity negotiated by brünner lokaleisenbahngesellschaft and ganz & comp. in 1900.
quantity – price per kwh
< 350 000 kwh: 16 heller
351 000–700 000 kwh: 14 heller
> 700 000 kwh: 12 heller

a fifteen-year trade agreement between brünner lokaleisenbahngesellschaft and mährisch-ostrauer elektrizitäts-aktien-gesellschaft (the legal successor to ganz & comp. from may 1900) was signed on july 21st, 1900. electricity was to be supplied daily from 5.00 a.m. to 11 p.m. the 2 000 v alternating current was transformed to direct current by three 510 hp converters. from october to march, the power was to be delivered at 200 kw, and from march to october at 400 kw [9], [10]. electric trams could not have been operated in ostrava without these negotiations and agreements.

3 municipal involvement

although the municipal authorities knew about the tram electrification, because the municipality itself was a shareholder in mährisch-ostrauer elektrizitäts-aktien-gesellschaft and some members of the municipal self-government held shares in brünner lokaleisenbahngesellschaft, the municipal self-government itself did not attempt to municipalize the network of electric tramways in ostrava. electrification of the tram network was an enterprise undertaken by two independent, private companies. the main impulse was the rising cost of operating steam-powered trams. brünner lokaleisenbahngesellschaft was motivated mainly by economic considerations, rather than by social considerations, to replace steam by electricity.

references

[1] mckay, j. p.: tramways and trolleys. the rise of urban mass transport in europe. princeton, 1976.
[2] eber, m.: die entwicklung des verkehrs in und um mährisch ostrau. mährisch ostrau, 1913.

[3] amo (archiv města ostravy, archive of ostrava), fund smmd (společnost moravských místních drah), inv. no. 13. zápisy ze zasedání správní rady společnosti (board of governors proceedings). proceedings for may 17th, 1899.

[4] amo, fund smmd, inv. no. 13. proceedings for october 17th, 1899.

[5] amo, fund smmd, inv. no. 13. proceedings for march 10th, 1900.

[6] amo, fund smmd, inv. no. 9. xiv. annual general meeting proceedings for april 21st, 1900.

[7] amo, fund smmd, inv. no. 13. proceedings for november 5th, 1900.

[8] amo, fund smmd, inv. no. 9. xv. annual general meeting proceedings for april 20th, 1901.

[9] amo, fund smmd, inv. no. 292, box no. 39.

[10] amo, fund archiv města moravská ostrava (archive fund of the city of moravská ostrava), nová registratura (new register office), box no. 531.

[11] kladiwa, p.: obecní výbor moravské ostravy 1850–1913. komunální samospráva průmyslového města a její představitelé. ostrava, 2004.

[12] gemeindeverwaltung und gemeinde-statistik der landeshauptstadt brünn. bericht des bürgermeisters dr. august ritter v. wieser für das jahr 1898. brünn, 1899.

acta polytechnica vol. 52 no. 3/2012

explosion cladding of lead on steel

milan turňa, jozef ondruška, zuzana turňová
slovak university of technology, faculty of materials science and technology, j. bottu 25, 917 24 trnava, slovakia
correspondence to: milan.turna@stuba.sk

abstract
this work deals with explosion cladding of lead on steel. the welded materials and the semtex s type explosive are characterised. the preparation of the welded materials and the proposed welding assembly are described. the welding parameters, the welding conditions and the fabrication of the weld overlays are discussed. the quality of the fabricated bimetals was studied by optical and electron microscopy and by mechanical tests.

keywords: lead, structural carbon steel, explosion cladding.

1 introduction

as is well known, fusion welding, soldering and thermal cutting of lead, including metals combined with lead, is at present prohibited in technical practice. in some cases, however, the use of lead is possible and/or unavoidable. obvious examples are in the military and chemical industries. bimetals of lead and other structural materials are interesting for the chemical industry. bimetals such as pb–structural carbon steel are attractive mainly owing to the high corrosion resistance of lead and the sufficient strength of steel. these bimetals are used for fabricating vessels for storing dangerous materials, e.g. h2so4 90 %, h2so4 60 %, pscl3 95 %, pcl3, nano3. for storing aggressive media, a lead layer on steel 3–5 mm in thickness is used, depending on the corrosive medium. until recently, the interior of the chemical vessels was first tinned, and then a pb layer was deposited on it. this procedure posed a great health risk for the persons performing the operations, even resulting in occupational diseases. to avoid this kind of risk, solid state surfacing, namely by explosion cladding, has been selected as a more suitable technology.

2 experimental

the materials selected for our experiments were as follows: 99.95 % lead by stn 42 3701, and the steel type 11 373 by stn 42 5340 (s235jrg1 by en 10025-a1). owing to its low melting point, lead is used with working temperatures only from 150 to 170 °c.
only technically pure lead is used in the chemical industry, as pure lead has the highest resistance against corrosion. the chemical resistance of lead in some environments may be increased by the addition of a small amount of h2so4, which provides a protective layer on the material surface. so-called “hard lead” is not suitable for welding, since it causes embrittlement of welded joints. the chemical composition of the welded materials is shown in tables 1 and 2.

table 1: chemical composition of pb 99.95 by stn 42 3701 [wt.%]
pb min. 99.95; ag max. 0.0015; bi max. 0.030; cu max. 0.0015; sb max. 0.005; as max. 0.002; fe max. 0.002; zn max. 0.002; sn max. 0.002

table 2: chemical composition of steel type 11 373 by stn 42 5340 [wt.%]
c max. 0.170; n max. 0.007; p max. 0.045; s max. 0.045

the physical and mechanical properties of pb and steel needed for the welding process are presented in tables 3 and 4. the parameters of the cladding process are given in table 5.

table 3: physical properties of pb 99.95
density ρ: 11 341 kg·m⁻³
elasticity modulus in tension e: (14.71 to 17.65)·10³ mpa
elasticity modulus in shear g: 6.865·10³ mpa
sound propagation velocity v: 1 200 m·s⁻¹

table 4: selected mechanical properties of steel type 11 373
yield point rp0.2: 225 mpa
ultimate tensile strength rm: 363 to 441 mpa
minimum ductility a5: 25 %

table 5: cladding parameters (charge thickness H [mm], explosive density ρ [g·cm⁻³], detonation velocity vd [m·s⁻¹], distance spacing h [mm], setting angle of the accelerated metal α [°], deviation angle of the accelerated metal ϑd [°], collision velocity vk [m·s⁻¹], final velocity v0 [m·s⁻¹], explosive type)
no. 1: H 31.6, ρ 0.980, vd 1 336, h 5.1–9.8, α 2.23, ϑd 8.6–8.9, vk 1 060–1 070, v0 207, sp-14
no. 2: H 31.5, ρ 0.984, vd 1 340, h 7.5, α 0, ϑd 8.8, vk 1 340, v0 208, sp-14
no. 3: H 50.8, ρ 1.090, vd 1 454, h 7.2, α 0, ϑd 12.6, vk 1 454, v0 343, sp-14
no. 4: H 50.7, ρ 1.110, vd 1 474, h 7.0–13.1, α 2.80, ϑd 12.7–13.5, vk 1 210–1 230, v0 351, sp-14
no. 5: H 51.7, ρ 1.030, vd 1 390, h 8.0–17.5, α 4.35, ϑd 12.5–13.1, vk 1 040–1 050, v0 320, sp-14
no. 6: H 51.5, ρ 1.260, vd 1 250, h 8.0–17.5, α 4.35, ϑd –, vk –, v0 –, sn-12
no. 7: H 51.8, ρ 1.250, vd 1 250, h 8.8–18.0, α 4.26, ϑd –, vk –, v0 –, sn-12
no. 8: H 21.6, ρ 1.156, vd 1 990, h 4.9–22.7, α 8.10, ϑd 9.1–9.6, vk 1 070–1 080, v0 332, s-25

figure 1: parts of the fe–pb binary diagram

prior to welding, it was necessary to make a detailed study of the pb–steel binary diagram, and to ascertain the possibility that undesired phases might be formed (figure 1). all materials to be explosion welded must be free from impurities and organic deposits, irrespective of the cleaning effect of cleaning agents. the weld surfaces on the steel were machined by grinding to roughness ra = 3.2 μm, and shortly before welding they were degreased with acetone.
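as a cross-check of table 5, the standard explosive-welding estimate for the flyer-plate (final) velocity, v0 ≈ 2·vd·sin(ϑd/2), reproduces the tabulated v0 values to within a few percent. the paper does not state which model was used, so this relation is our assumption; the sketch below verifies it numerically for a few shots.

# cross-check of table 5 using the standard flyer-plate velocity estimate
# v0 ~ 2 * vd * sin(theta_d / 2). the choice of formula is our assumption,
# but it reproduces the tabulated v0 values to within a few percent.

import math

rows = [  # (shot no., detonation velocity vd [m/s], mean bend angle theta_d [deg], tabulated v0 [m/s])
    (1, 1336, 8.75, 207),
    (2, 1340, 8.8, 208),
    (4, 1474, 13.1, 351),
    (8, 1990, 9.35, 332),
]

for n, vd, theta, v0_tab in rows:
    v0 = 2 * vd * math.sin(math.radians(theta / 2))
    print(f"shot {n}: estimated v0 = {v0:5.0f} m/s, tabulated {v0_tab} m/s")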
after welding, the bimetals were cleaned from detonation products with a brush under running water and were then prepared for further study. the following assessment methods were applied for our study of the quality of the fabricated bimetal joints: ultrasonic defectoscopy, mechanical tests, corrosion resistance tests and metallographic investigation. 2.1 ultrasonic defectoscopy the inspection was performed from the lead side. probe coupling was ensured by the use of oil. the tests were performed usingthe usip 11equipment with a mseb 6h probe with 6 mhz frequency. no defects were found in the joint boundary zone. the initial 3.2 mm thickness of the lead plate was reduced to 2.8 to 3.0 mm owing to the high detonation velocity and compression. 2.2 mechanical tests the quality of the fabricated joints was assessed by the following mechanical tests: a peel test, a shear strength test, and a bend test. peel test: an ordinary shop hydraulic press was used for this test. it must be stated that this test is not standardised and therefore it serves only to provide information on the boundary quality. the scheme of the test specimens is shown in figure 2. the test showed that the boundary remained undamaged and the punch penetrated through the lead. shear strength test: three specimens with dimensions as shown in figure 3 were cut from the bimetals in the direction of detonation wave propagation. failure of the specimens occurred on the lead side. this suggests that the lead — steel boundary is of higher strength than the pure lead, thus proving sufficient joint strength. the tensile strength of the initial lead used for cladding was determined to be 12.8 mpa. the bend test is presented in figure 4. similar specimens to those used for the tensile test were machined from the fabricated bimetal. the specimen was firmly clamped in a vice and its free end was bent by 180◦ around a mandrel of �14 mm in diameter. the specimen was placed in such a manner that the notch face included a 30◦ angle with the normal plane. the boundary was so strong that no damage occurred even when the mandrel diameter was reduced to �8 mm. figure 4: scheme of the bend test of the bimetal 103 acta polytechnica vol. 52 no. 3/2012 2.3 metallographic assessment of the welded joints observation by optical microscopy showed that the basic structure of the steel was polyhedral, locally deformed on the boundary. the thickness of the deformed layer was around 0.5 mm. the measured wave amplitude was 0.08 to 0.1 mm, and the waving period was 0.48 to 0.53 mm. the weld boundary is shown in figures 5 to 8. figure 5: weld boundary of the lead — steel bimetal 110× figure 6: weld boundary of the lead — steel bimetal 600× a more detailed assessment of the character of the fractured surfaces of a pb — steel welded joint was performed by scanning electron microscope (sem). the specimens from which the lead was torn off in the peel test were observed. it was observed that failure occurred after preliminary plastic strain, mostly by ductile fracture. figures 7 and 8 show the ductile fractures in the lead which progressed to the darker zones where insufficient bonding of materials was observed. the bonding of pb to the parent steel material was around 55 %. figures 7 and 8 show the fractures after fusing down of pb from the steel surface in a vacuum. 
figure 7: ductile fracture in the lead (sem), 500×

figure 8: the lead fused from the steel surface, 500×

3 conclusions

the aim of our study was to develop and test a special technology for lead cladding on a steel plate made of structural carbon steel. a total of 8 bimetallic welded joints with dimensions of 120 × 120 × 14 mm were fabricated. on the basis of our results, the following can be stated: sound welded joints were fabricated by explosion cladding with optimum parameters. a new explosive was tested and was found to be suitable for cladding the lead. mechanical testing (a peel test, a shear test, and a bend test) of the bimetals proved their high quality. the failure of the specimens occurred on the lead side during the shear test. the boundary was so strong that no damage occurred even when the mandrel diameter was reduced to ø8 mm in the bend test. the peel test showed that the boundary remained undamaged and the punch penetrated through the lead. on the basis of our experience and results, it may be stated that this technology is a promising one for fabricating bimetals. it is important to emphasize that the technology we have developed poses no health hazards, unlike the risky process of lead surfacing by flame.

acknowledgement

this work was realised with support from ga vega mš vvš sr and sav, project no. 1/2594/12.

references

[1] turňa, m.: špeciálne metódy zvárania (special welding methods). bratislava: alfa, 1989. isbn 80-05-00097-9.

[2] farba, l’.: návrh technológie navárania olovom funkčných plôch chemických nádob (proposal of a technology for lead surfacing of functional surfaces of chemical vessels). bratislava: sjf svšt, 1985.

[3] available at: http://www.explozia.cz

[4] škorvánková, a.: posúdenie vhodnosti navárania olova na oceľ explóziou (feasibility study of explosion cladding of lead on steel). bratislava: sjf svšt, 1986.

[5] turňa, m.: zváranie kovov v pevnom stave (solid state welding of metals). lectures delivered at iwe, fs čvut prague, 2010.

[6] shane, a. h.: welding fume in the workplace. available at: http://www.aiha.org/localsections/html/nts/0602news1.pdf

[7] gul’bin, v. n., krupin, a. v., pashukov, y. n., yarutich, t. y.: examination and development of technology for explosion welding lead–titanium anodes. welding international, 1996, vol. 10, no. 8, p. 647–648.

acta polytechnica vol. 52 no. 4/2012

new abrasive materials and their influence on the surface quality of bearing steel after grinding

ondrej jusko
ctu in prague, faculty of mechanical engineering, technická 4, prague, czech republic
correspondence to: ondrej.jusko@fs.cvut.cz

abstract
this paper focuses on the influence of various types of abrasive grains on cutting properties during the grinding process for bearing steel. in this experiment, not only conventional and super-hard abrasive materials but also a new type of abrasive material were employed in grinding wheels. the measurement results were compared, and an evaluation was made of the cutting properties of the new abrasive material. the options were then evaluated for their practical applicability. the measurement results indicated that grinding wheels with abral and sg grains are the most suitable for grinding hardened bearing steel in order to achieve the best roughness and geometrical accuracy.

keywords: grinding, bearing steel, abrasive material, roughness, geometric accuracy.

1 introduction

research and development in the area of grinding wheels is a basic aspect of grinding technology. three groups of abrasive materials are employed in grinding wheels.
the first group consists of conventional grinding wheels made of al2o3. super-hard grinding wheels based on cbn and diamond form the second group. it is less expensive to apply grinding wheels based on al2o3, but there is a risk of thermal damage to the working surface. when we apply cbn and diamond-based grinding wheels, we can produce high-quality surfaces, but this is an expensive method. this situation provided the motivation to seek a new direction in the development of grinding wheels, and to investigate a type of grinding wheel with an innovative abrasive material. this third group of abrasive materials consists of grinding wheels with sg and abral grains.

2 experimental details

the bearing steel was ground using the plunge-cut grinding method on a cylindrical grinding machine. we used rings of 14 109.6 bearing steel as samples. the surface roughness was measured using the talysurf 6 laboratory-type contact profilometer, which measures the surface by the profile method, with progressive transformation of the information on the profile through the mechanical shift of the stylus point. the talyrond 30 device, which uses the rotary table principle, was used for measuring the roundness. the parameters for the experimental measurements were:
– speed of the wheel: vc = 45 m/s
– speed of the workpiece: vp = 30 m/min
– grinding cycle:
1st phase: ap = 0.03 mm, vfr = 0.52 mm/min
2nd phase: ap1 = 0.02 mm, vfr1 = 0.52 mm/min; ap2 = 0.01 mm, vfr2 = 0.26 mm/min
3rd phase: ap1 = 0.02 mm, vfr1 = 0.52 mm/min; ap2 = 0.005 mm, vfr2 = 0.26 mm/min; ap3 = 0.005 mm, vfr3 = 0.11 mm/min
– grinding tools:
al2o3 – white corundum (aluminium oxide), a conventional material. the results were used for comparative purposes. a 99a 60j 9v grinding wheel was used.
cbn – cubic boron nitride, another very hard material. its regular grain and firm bond make it suitable for grinding hard surfaces. a b-vii b64 k75 grinding wheel was used.
sg (seeded gel) – a ceramic aluminium oxide manufactured by a sintering process. each abrasive grit consists of sub-micron size particles which are separated from the grit under the grinding force. this keeps sg sharper than conventional abrasives, which can be dulled when flats are worn on the working points. an ag 92/99 80 hs(j) 9v grinding wheel was used.
abral – a crystalline mixture of al2o3 and alon. abral is produced by fusing aluminium oxide with alon, followed by slow solidification. the material is created by natural crystallization, and the monocrystalline mixture is not disrupted by crushing. the crystalline mixture has a firm structure and a firm lattice. for this reason, its surface is more resilient and harder than a regular corundum surface. an ag 92/99 80 hs(j) 9v c45 grinding wheel was chosen for this experiment.

3 results

3.1 surface roughness

the results of the study show that the most appropriate materials are the grinding wheels with sg and abral grains. this conclusion is based on the roughness criteria that were achieved. in the grinding cycle with one or two phases, abral seems to be the better grain option by the roughness criterion. in the three-phase grinding cycle, the ra values are almost identical for these abrasive materials; however, the sg material produced better profile results. the second roughness group of grains consists of al2o3 and cbn. the cbn wheel is made of sharper grains, and the tool has better thermal conductivity, i.e. its surface is not so heavily thermally loaded.
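the three grinding cycles listed in section 2 can be compared directly. assuming the infeed time of each phase is simply ap/vfr (spark-out and dwell times ignored), the short sketch below derives the total stock removal and infeed time per cycle from the parameters given above.

# comparison of the grinding cycles from section 2: per-phase infeed time is
# taken as ap / vfr; spark-out and dwell are ignored, so this is only a rough
# comparison of the programmed infeed portions.

cycles = {  # phase list: (ap [mm], vfr [mm/min])
    "1-phase": [(0.030, 0.52)],
    "2-phase": [(0.020, 0.52), (0.010, 0.26)],
    "3-phase": [(0.020, 0.52), (0.005, 0.26), (0.005, 0.11)],
}

for name, phases in cycles.items():
    t_min = sum(ap / vfr for ap, vfr in phases)   # minutes of infeed
    depth = sum(ap for ap, _ in phases)           # total stock removed per cycle
    print(f"{name}: total infeed {depth:.3f} mm in {t_min * 60:.1f} s")
# 1-phase: 0.030 mm in 3.5 s; 2-phase: 0.030 mm in 4.6 s; 3-phase: 0.030 mm in 6.2 s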
this results in a finer surface than that produced by the al2o3 wheel, whose grains have a larger radius. this also affected the surface roughness measured after the hardened bearing steel had been ground. an interesting finding was that, with this abrasive material, the best roughness results were obtained when the speed of the cross feed was highest and when the depth of the cut was greatest.

figure 1: influence of the grinding cycle on the surface roughness ra
figure 2: influence of the grinding cycle on the surface roughness rz
figure 3: influence of the grinding cycle on the surface roughness rt
figure 4: influence of the grinding cycle on the out-of-roundness

3.2 geometric accuracy

a comparison of the measurements shows that the cbn wheel has quite a strong tendency to deviate from roundness. this deviation may originate in the production of the grinding wheels, which have a thin layer of the abrasive material with the binder on their metallic body. the production process involves a certain deviation between the axis of the metallic body and the axis of the thin abrasive material layer. due to the high speed of the wheel, this eccentricity creates a periodic change in the depth of the cut-off layer. figure 4 shows that when a conventional wheel and an innovative type of abrasive grinding wheel are used with decreasing cut depth, thus reducing the radial feed rate, the roundness deviation also decreases. the roundness values that are achieved are comparable for these abrasive materials. the lowest roundness value is for the abral grinding wheel in the 1st and 2nd grinding cycles, and for the sg grinding wheel in the 3rd grinding cycle.

4 conclusion

our experiment has shown that the least appropriate materials for grinding wheels for cutting 14 109.6 bearing steel are cbn and al2o3 grains. abral and sg grinding wheels are more suitable; despite the high durability of the cbn grain layer on the perimeter of the wheel, its high cost and the high demands on the accuracy of the ring when mounting the bearing count against it. a comparison of the two innovative abrasive materials shows that the performance of abral is slightly superior. there are several unanswered questions that require further research and development on the influence of the abrasive material in relation to the formation of new surfaces in the cutting process. a possible follow-up would be to measure the temperature at the contact spot between the abrasive tool and the workpiece, and, consequently, its influence on the production and distribution of residual tension in the surface layer of the bearing steel after grinding.

references

[1] gašparek, j.: dokončovacie spôsoby obrábania (finishing methods of machining). bratislava: alfa, 1979. isbn 63-009-80.

[2] jusko, o.: vplyv nových brúsiacich materiálov na obrábací proces (the influence of new abrasive materials on the machining process). dissertation, praha: české vysoké učení technické v praze, fakulta strojní, 2011.

[3] mádl, j., holešovský, f.: integrita obrobených povrchů z hlediska funkčních vlastností (integrity of machined surfaces from the viewpoint of functional properties). ústí nad labem: fkk company, 2008. isbn 978-80-7414-092-2.

[4] marinescu, i., hitchiner, m., rowe, b.: handbook of machining with grinding wheels. new york: crc press, 2007. isbn 1-57444-671-1.

[5] ondirková, j., neslušan, m.: analýza reznosti sg brúsnych kotúčov pri brúsení 14 209.4 (analysis of the cutting ability of sg grinding wheels when grinding 14 209.4). in valivé ložiská a strojárska technológia 2002, súlov: žilinská univerzita, 2002.

acta polytechnica vol. 50 no. 4/2010
assembly of a pulmonary artery pressure sensor system

j. müntjes, s. meine, e. flach, m. görtz, r. hartmann, t. schmitz-rode, h. k. trieu, w. mokwa

abstract
this paper presents an implantable system for telemonitoring the intravascular pressure in the pulmonary artery. by implanting a catheter-bound pressure and temperature sensor into the pulmonary artery, it is possible to monitor the actual value and the time variations of the intravascular pressure with a frequency of 128 hz. thus hospitalization of patients suffering from heart insufficiency can be avoided by early changes in therapy. preliminary in vivo experiments have been conducted to verify the fixation mechanism and the positioning of the sensor at the right place in the pulmonary artery. it was shown that the proposed fixation mechanism and the packaging of the sensor promise to be stable.

keywords: heart insufficiency, pressure sensor, microsystems technology, assembly.

1 introduction

a considerable number of people in europe suffer from congestive heart failure. in germany, the number is around 1.8 million, rising by 200 000–300 000 each year [1]. this poses a substantial economic problem. in fig. 1, the occurrence of heart insufficiency is compared to that of other reasons for hospitalization [2]. as it is one of the most frequent non-surgical diseases, with recurring hospitalization and high mortality, a strategy to lower the costs is required.

fig. 1: reasons for hospitalization in germany 2006 [2]

patients can be monitored by measuring the pulmonary artery pressure and deriving the cardiac output by means of a modified pulse contour analysis. this provides information about the physical health of the patient, and serves as an early warning system for the physician. fig. 2 shows the pressure change over time before hospitalization, where the measured degradation gives information about the health state of the patient. there would, however, be a risk of infection if parts of the monitoring system were situated outside the body. therefore, a fully implantable system is presented which includes a catheter-bound pressure and temperature sensor inside the pulmonary artery. the system enables the physician to monitor the actual value and the time variations of the intravascular pressure with a frequency of 128 hz. thus hospitalization of patients suffering from heart insufficiency can be avoided by early changes in therapy.

fig. 2: pressure changes occurred in 9 of 12 events 4–2 days before hospitalization (major) [3]

2 design and fabrication

the system is the result of cooperation between a medical company and several university institutes. research at the institute of materials in electrical engineering covers the assembly process of the pulmonary artery pressure sensor. this paper therefore focuses on the assembly aspects of the sensor system.

2.1 design

the system, as shown in fig. 3, consists of intra- and extracorporeal parts. the implanted parts are
outside the body, a homemonitoring station (3) receives the data from the subcutaneous implanted electronics, which is able to communicate wirelessly. the data is then transferred by mobile communication via a service center to the attending physician. fig. 3: compass system (with (1) sensor element inside the pulmonary artery, (2) implanted electronics in a biocompatible housing and (3)homemonitoring station) fig. 4: sensor element (carrier with pressure sensor and signalprocessingchipplusadditionalelectronicdevices) the sensor element, as shown in fig. 4, contains electronic devices to support the transmissionbetween the sensor element and the implanted electronics and two silicon chips. one is a monolithically integrated pressure and temperature sensor (fig. 5), similar to the sensor presented in [4]. the circles that are visible on the chip layout represent the capacitive pressure sensors. the other chip is an additional interface circuit amplifying the signal obtained from the sensors. this allowssignal transmissionovera longdistancebetween the sensor and the telemetric unit. both chips aremountedona carrierandare electrically connected via wire bonds. fig. 5: monolithically integrated capacitive pressure and temperature sensor 2.2 fabrication stress-free assembly of the components is crucial to the system: the monolithically fabricated capacitive pressure sensor is very sensitive towards pre-stress. therefore the pressure sensor and the interface circuit are mounted on a carrier to which the cable running through the catheter can be attached. this carrier is designed to minimize the influence of temperature during operation inside the body. silicon is chosen as the carrier material because of its identical thermal expansion coefficient. fig. 6 shows the design of the silicon carrier. because of the limited space inside the metal sensor capsule, it is crucial to design the circuit lineswith the smallest possible pitch. toassist the signal processing chip, three capacitors and twodiodes in surface-mount technology are additionally soldered to the carrier. again the limited space inside the metal encapsulation has to be taken into account, and it is for this reason that small-scale smt devices have to be used. fig. 6: silicon carrier a single step process using flexible adhesive was developed to attach both silicon chips to the carrier. 57 acta polytechnica vol. 50 no. 4/2010 this flexibility is needed to reduce stress on the capacitive pressure sensor [5]. the low-viscosity adhesive fills the gaps between the two chips without any bubbles, see fig. 7. this is important in order to ensure drift-reduced operation of the sensor assembly. fig. 7: sensor assembly after the single step adhesion process, cross section fig. 8 shows the three fabrication stages of the assembly: (1) the carrier with additional smt devices after the single step adhesion process, (2) the signal processing chip is bonded to the carrier by wire bonding and (3) the complete structure providedwith a glob-top adhesive to protect the thin wire bonds. dummy chips are used instead of a pressure sensor and a signal processing chip. to protect the assembly against the highly aggressive environment which human blood poses to electronics, a metal tube forms the encapsulation. this tube is openedabove thepressure sensor toallowthebloodpressure tobemeasured. the tube itself is filledwith an incompressiblemedium to ensure good pressure propagation towards the sensor. fig. 
8: fabrication stages of thesensorassembly (dummy chips; (1) chipcarrier with smt devices after the single step adhesion process, (2) after wire bonding, (3) after glob-top) fig. 9 shows the metal pins to which the cable can be connected. the pins are soldered to the two long gold pads on the right of the sensor carrier. a ceramic connector then forms the closure of the metal tube. this decouples the pressure sensor from the stress induced by movements of the catheter inside the blood vessel, and at the same time relieves the strain. the complete system is integrated in a catheter 80 cm in length. its distal end is fixed in the pulmonary artery; the proximal end of the catheter is attached to the telemetric unit. fig. 9: metal connectors to sensor cable 3 experiments and discussion the silicon carrier, which had been thinned to fulfil the geometric requirements, was subjected to several shear tests. with the ceramic connector stabilizing the sensor element, a breaking point could be ascertained that was high enough to assure stability during the handling and operation process. the first in vivo experiments have alreadybeen conducted to verify the fixationmechanismandthepositioningof the sensorat the right place in the pulmonary artery. it was shown that the proposed fixation mechanism and the packaging of the sensor promise to be stable. long-term in vivo experiments will be conducted in the near future. 4 conclusion a new implantable pressure and temperature sensor system for monitoring pulmonary artery pressure and cardiac output has been presented. the assembly process promises to be stable and is expected to impose little stress on the capacitive pressure sensor measuring the pulmonary artery pressure. by avoiding stress on the sensor, the measurements will give stable results and the drift will be reduced to a minimum. acknowledgement the researchdescribed in the paperwas supervised by prof. w. mokwa, rwth aachen university and was supported by the german federal ministry of education and research (bmbf), grant compass. 58 acta polytechnica vol. 50 no. 4/2010 references [1] fischer, m., et al.: epidemiologie der linksventrikulären systolischen dysfunktion in der allgemeinbevölkerung deutschlands, zeitschrift fuer kardiologie, 92, 4, 294–302. [2] statistisches bundesamt: diagnosedaten der patienten und patientinnen in krankenhäusern (einschl. sterbeund stundenfälle). [3] adamson, p. b, et al.: ongoing right ventricular hemodynamics in heart failure: clinical value of measurements derived from an implantable monitoring system., j. am. coll cardiol, 41, 4, 565–571. [4] fassbender, h., et al.: fully implantable blood pressure sensor for hypertonic patients, the seventh ieee conference on sensors, 1226–1229. [5] wilde, j., deier,e.: thermomechanischeeinflüsse der chipklebung auf die genauigkeit mikromechanischer drucksensoren, teil 1: simulation, technisches messen, 70, 5, 251–257. about the author jutta müntjes is currently working at the institute of materials in electrical engineering at rwth aachen university. her work focuses on assembly technologies in the field of pressure sensor systems in medical technology. she studied electrical engineering at the university of erlangen-nürnberg and at kth stockholm, focusing onmicroelectronics andmicrosystems technology. jutta müntjes e-mail: jutta.muentjes@rwth-aachen.de rwth aachen university institute of materials in electrical engineering 1 aachen, germany sonja meine biotronik se & co. kg, berlin, germany erhard flach biotronik se & co. 
biotronik se & co. kg, berlin, germany

michael görtz
fraunhofer institute for microelectronic circuits and systems
duisburg, germany

renate hartmann
helmholtz institute
department of applied medical engineering
aachen, germany

thomas schmitz-rode
helmholtz institute
department of applied medical engineering
aachen, germany

hoc khiem trieu
fraunhofer institute for microelectronic circuits and systems
duisburg, germany

wilfried mokwa
rwth aachen university
institute of materials in electrical engineering 1
aachen, germany

acta polytechnica vol. 52 no. 3/2012

change of pressing chamber conicalness at briquetting process in briquetting machine pressing chamber

peter križan¹, miloš matúš¹, jaan kers², djordje vukelić³
¹ faculty of mechanical engineering, slovak technical university in bratislava, institute of production systems, environmental technology and quality management, nám. slobody 17, 812 31 bratislava, slovakia
² tallinn university of technology, department of polymer materials, chair of woodworking, teaduspargi 5, 12618 tallinn, estonia
³ university of novi sad, faculty of technical sciences, trg dositeja obradovica 6, 21000 novi sad, serbia
correspondence to: peter.krizan@stuba.sk

abstract
in this paper, we will present the impact of the conical shape of a pressing chamber, an important structural parameter. besides the known impact of the technological parameters of pressing chambers, it is also very important to pay attention to their structural parameters. in the introduction, we present a theoretical analysis of pressing chamber conicalness. an experiment aimed at detecting this impact was performed at our institute, and it showed that increasing the conicalness of a pressing chamber improves the quality of the final briquettes. the conicalness of the pressing chamber has a significant effect on the final briquette quality and on the construction of briquetting machines. the experimental findings presented here show the importance of this parameter in the briquetting process.

keywords: biomass, briquetting, pressing chamber shape, pressing chamber conicalness.

1 introduction

the briquetting process is a very interesting biomass treatment process. it is a very complicated process, because many parameters influence the process and the quality of the final briquettes. briquette quality is defined by eu standards, and is evaluated by mechanical and chemical-thermic indicators. briquette quality is evaluated mainly by density. the final briquette density is influenced by many parameters. on the basis of the experience that we have acquired and the analyses that we have made, we can divide the parameters into the following three groups [3]:
• material parameters
• technological parameters
• structural parameters

2 influencing parameters in the briquetting process

the material parameters emerge from the properties of the pressed material, i.e. material moisture, fraction size, chemical composition of the material, etc. the technological parameters are pressing temperature, compacting pressure, compacting speed, holding time, etc. these parameters can be changed in the course of pressing according to the capabilities of the briquetting machine. the structural parameters of the pressing chamber are also very important. for successful pressing of high-quality briquettes, all the parameters have to be in synergy. for an engineer, it is very important to know the behaviour of all parameters.
we know that pressing temperature, material moisture, compacting pressure and fraction size are very important for briquetting. we can obtain briquettes of suitable quality by adjusting the optimal values of these parameters according to the material that is to be pressed. we also know that we can achieve better results by changing some of the structural parameters. it is therefore very desirable to optimize the geometry of the pressing chamber. the main structural parameters influencing the final briquette density are [2, 6]:
• the diameter of the pressing chamber
• the length of the pressing chamber
• the conical shape of the pressing chamber
• the friction coefficient between the chamber and the pressing tool
• the length of the cooling channel
• the effect of counter pressure in the pressing chamber (created in various ways)

at our institute we have designed an experimental pressing stand (see figure 1) on which we are able to perform experiments to detect the impact of each of the parameters listed above. in this paper, we will present our findings on the conical shape of the pressing chamber. why is it so important to know the impact of changes in the conicalness of the pressing chamber? a pressing chamber with a conically-shaped wall is very often used in briquetting and pelleting machinery. however, we know of no studies that clearly show the interaction between final briquette density and changes in pressing chamber conicalness.

figure 1: experimental pressing stand with heating equipment [2]

3 theoretical analysis of pressing chamber conicalness

the geometry of the pressing chamber is very important in briquetting. however, few studies of the briquetting process have shown the influence of the individual parameters of this process, taking into account the pressing conditions in the pressing chamber during briquetting. the results presented in this paper are based on our experience, and on a comparison between our experimental results and analyses, on the one hand, and the existing mathematical model of pressing, on the other. we have attempted to find equations or mathematical models that can be used for calculating the pressing conditions in the pressing chamber. in [1], horrighs offers a clear description of the pressing conditions in a cylindrically-shaped pressing chamber. this theory is represented by the following mathematical model:

pg = pk · e^(−4·λ·μ·h/dk) (mpa), (1)

where pg is the counter pressure in the chamber, pk is the axial pressure of the hydraulic press, λ is the ratio of the main strains σr/σm, μ is the friction coefficient, h is the length of a pressed briquette and dk is the diameter of the pressing chamber.

the situation regarding the pressing conditions in a conical chamber is somewhat more complicated. the pressed material in the chamber is subjected to multi-axial pressing. this increases the pressing quality: it increases the briquette density and also the mechanical properties of the briquettes. however, the tool wear is also increased. we have often replaced a cylindrical pressing chamber by a conical pressing chamber and obtained briquettes of higher density. however, there is no mathematical model for a conical pressing chamber. we are therefore attempting to design a mathematical model for a conical pressing chamber. for an analysis of the influence of changes in conical pressing chambers, we used the theory of forward extrusion [4] as a basic metal volume moulding technology.
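equation (1) is straightforward to evaluate numerically. in the sketch below, the axial pressure is the 159 mpa compacting pressure used later in section 4, while λ, μ and the geometry are illustrative guesses of ours, not values taken from the paper.

# numerical sketch of the horrighs model, eq. (1), for the counter pressure in
# a cylindrical chamber: pg = pk * exp(-4 * lam * mu * h / dk). parameter
# values below are illustrative, not data from the paper.

import math

def counter_pressure(pk_mpa, lam, mu, h_mm, dk_mm):
    """counter pressure pg [mpa] behind a briquette of length h in a chamber of diameter dk."""
    return pk_mpa * math.exp(-4.0 * lam * mu * h_mm / dk_mm)

# example: 159 mpa axial pressure, assumed lam = 0.4, mu = 0.3,
# briquette 60 mm long, chamber 50 mm in diameter
print(f"pg = {counter_pressure(159, 0.4, 0.3, 60, 50):.1f} mpa")  # -> pg = 89.4 mpa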
in that theory, the force and pressure distribution is very close to the situation in biomass briquetting. a simple scheme of the main parts of a conical pressing chamber is shown in figure 2 (left and middle). this type of pressing chamber is often used in the structure of briquetting machines. the pressing chamber consists of 3 basic parts — a cylindrical part, a conical (reductive) part and a calibration part. the material is filled into the cylindrical part and then starts to be compacted by the pressing piston. the main pressing of the material takes place in the conical part. the pressure and the conical chamber cause a multi-axial pressing effect. some holding time while the pressed briquette is under pressure is necessary in order to eliminate pressing expansion. the calibration part gives the final shape to the briquette and provides the holding time under pressure and temperature.

figure 2 (right) provides a better description of the pressing conditions in a conical chamber, and shows all the acting forces and pressures. the maximum attained axial pressure $p_K$ depends on the length of the pressing chamber $L$, on the shape of the pressing chamber, on the size of the vertex angle of the conical chamber, and on the friction conditions between the pressed material and the wall of the chamber.

figure 2: main parts of a conical pressing chamber (left and middle), and the pressing conditions in a conical pressing chamber (right) [2,4–6] ($p_K$ – axial pressure of the press (mpa), $p_G$ – counter pressure in the chamber (mpa), $p_R$ – radial pressure (mpa), $p_M$ – axial pressure on the briquette (mpa), $D_0$ – input diameter of the pressing chamber (mm), $D_1$ – output diameter of the pressing chamber (mm), $D$ – diameter of the pressing chamber in cross-section (mm), $\mu$ – friction coefficient (–), $L$ – length of the conical part of the pressing chamber (mm))

figure 3: pressure distributions in space (left) and plane pressure distributions (right) on a conically-shaped element

the surface friction drag is given by the radial pressure $p_R$, by the cross factor of pressure $p_M$, which affects the wall of the chamber, by the friction coefficient $\mu$, and by the length of the pressing chamber $L$. the diameter of the chamber decreases linearly along the length of the chamber $L$. for a description of the pressing process, we have to start with a description of the pressures acting on an element of thickness $\mathrm{d}x$ cut in a conical shape (see figure 3, left). in the vertical direction, the axial compacting pressure $p_M$ acts on one face of the element, and $p_M + \mathrm{d}p_M$ acts in the opposite direction. friction also increases the pressure perpendicular to the wall of the chamber, $p_R$. to make a balanced equation, we will need to know the sizes of the surfaces on which the pressures act. on the basis of the plane pressure distributions (see figure 3, right) we were able to write the following equation (2):

$$p_M \cdot \frac{S_v}{2} + p_M \cdot S_v - (p_M + \mathrm{d}p_M) \cdot \frac{S_v}{2} - \mu \cdot (p_R + p_M \cdot \sin\alpha) \cdot \mathrm{d}S_{pl} \cdot \cos\alpha = 0 \qquad (2)$$

in order to implement equation (2) it will be necessary to define the boundary conditions and to define the final shape of a mathematical model suitable for optimization methods. this mathematical manipulation has not yet been finalized, so we are not yet able to present a mathematical model for a conical pressing chamber.

4 an experiment to measure the impact of conicalness

the next step was to measure the impact of conicalness.
we attempted to measure the interaction between changes in pressing chamber conicalness and final briquette density. for this experiment we used an experimental pressing stand. it was necessary to prepare some new components — new chambers with different wall conics (see figure 4). for the experiment we prepared three new chambers, with 1°, 2° and 3° conical walls. the results were evaluated in terms of final briquette density. we compared briquettes pressed in a conical chamber with briquettes pressed in a cylindrical chamber under the same conditions.

figure 4: example of a new chamber with a conical wall (left), and the cross-section of the pressing stand (right) (1 – pressing chamber; 2 – flange; 3 – start-up chamber; 4 – counter plug; 5 – piston; 6 – pressed material; 7 – chamber with a conical wall; 8 – sleeve connector; 9 – mounting screw)

figure 5: negatives of internal chamber space representing different geometries

table 1: experiment results — density of pressed briquettes, in kg/dm3

         t1 = 20 °c   t2 = 85 °c   t3 = 120 °c
α = 0°   0.852        1.152        1.207
α = 1°   –            1.221        1.216
α = 2°   –            1.236        1.224
α = 3°   –            –            –

∗ α represents the conicalness of the chamber

we chose the following conditions for this experiment: pressed pine sawdust material, material moisture 8 %, fraction size 1 mm, compacting pressure 159 mpa. the pressing was done without the use of additional heating equipment, i.e. without a pressing temperature effect. the measurements were carried out in laboratory conditions, at a temperature of 20 °c. for each setting we pressed 7 briquettes. these were measured, and then we were able to calculate their density. table 1 presents the results as average briquette density values, and these averages were compared. for pressing without an additional pressing temperature (under laboratory conditions) we did not obtain any briquettes in the conical chambers. the first column of table 1 (20 °c) therefore only shows the briquette density obtained by pressing with a cylindrically-shaped chamber. the problem was the very high friction force between the pressed material and the chamber wall.

the pressing forces in the pressing chamber are distributed as a laminar flow in a cylindrical pipe (see figure 6, left). the maximum axial pressing force is applied along the axis of the pressing chamber, see figure 6, middle and right.

figure 6: pressing force distribution across the pressing chamber

the experimental pressing stand is inserted into the hydraulic press, which can exert a maximum pressing force of 10 tons. in our case, this corresponds to a compacting pressure of 318 mpa. we also tried the maximum pressure, but without success. from our experience, we know that the friction force for briquetting can be reduced by lignin plastification. the friction force can be reduced by increasing the pressing temperature. for this purpose, we used heating equipment affixed to the pressing chamber. the heating equipment was controlled by a regulator that works on the basis of a signal coming in from the temperature sensor. lignin is a material component of all types of biomass. in the briquetting process, it has the function of a gluing medium, strongly joining the particles of the material into a compact briquette. we then decided to repeat the experiment at a different pressing temperature. the results are presented in table 1. with a higher pressing temperature we were able to obtain briquettes.
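as a consistency check on the press parameters just quoted (a 10-ton maximum force corresponding to a 318 mpa compacting pressure), the sketch below back-computes the piston diameter these two numbers imply, assuming a circular piston face; the resulting ≈20 mm is an inference, not a value stated in the paper.

```python
import math

F = 10_000 * 9.81   # 10-ton maximum press force, in newtons
p = 318e6           # corresponding compacting pressure, in pascals

# p = F / (pi * d**2 / 4)  =>  d = sqrt(4 * F / (pi * p))
d = math.sqrt(4.0 * F / (math.pi * p))
print(f"implied piston diameter: {d * 1000:.1f} mm")  # ~19.8 mm
```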
we can add that the briquette density increases as the wall angle in the chamber increases, at each temperature level. we also proved that increasing the pressing temperature has a positive impact on the friction forces between the pressed material and the chamber wall. however, we found that a chamber with a 3° wall cannot be used in our conditions. the experiment with this chamber was not successful: the maximum force of the hydraulic press was insufficient to extrude the pressed briquette from the conical chamber, and during the second squeezing the mounting screws were destroyed.

another interesting finding during the experiment was that we were able, during pressing, to recognize three compacting pressure values. the first value represents pressing, the second represents overcoming the friction force, and the third represents extruding the pressed briquette from the chamber. the following figures present a graphic record of these three pressure values, as recorded by the hydraulic press.

figure 7: graphic record of briquette pressing in conical chambers at 120 °c pressing temperature

figure 8: comparison of recognized pressures for pressing in conical chambers (blue columns represent the pressure needed to overcome the friction force, red columns represent pressing, and yellow columns represent the pressure needed to extrude the pressed briquette from the chamber)

these figures show that higher pressures act during pressing in a conical chamber than in a cylindrical chamber. this proves that higher briquette density can be obtained by pressing in a conical chamber. we can state that it is possible to increase the pressures acting in the chamber by changing the degree of conicalness of the chamber. however, it can be seen that higher friction forces act in a conical chamber with a higher degree of conicalness. the friction forces can be reduced by a higher pressing temperature. as the pressing temperature increases, the compacting pressure action decreases. a future study should investigate the unit production costs (energy costs and production costs).

5 conclusion

the main aim of this paper has been to present the results of our experiment to detect the impact of the conical shape of pressing chambers. we also wanted to show the importance of this type of parameter for the briquetting process. in the near future, we aim to design a mathematical model for a conically-shaped pressing chamber. of course, more experiments will need to be made. with this model, we will be able to calculate the optimal length of a conical pressing chamber in accordance with the standards for the final density of briquettes.

acknowledgement

this paper is an outcome of the project "development of progressive biomass compacting technology and the production of prototype and high-productive tools" (itms project code: 26240220017), on the basis of the operational programme research and development support funding by the european regional development fund.

references

[1] horrighs, w.: determining the dimensions of extrusion presses with a parallel-wall die channel for the compaction and conveying of bulk solids, aufbereitungs-technik, duisburg, germany, 1985, no. 12.

[2] križan, p.: process of wood waste pressing and conception of press construction, dissertation work, fme sut in bratislava, slovakia, july 2009, p. 150, (in slovak).
[3] šooš, ľ., križan, p.: analysis of the interaction of forces in the pressing chamber of a briquetting machine, proceedings of the 11th international conference top 2005, častá – papiernička, slovakia, 29. 6.–1. 7. 2005, fme stu in bratislava, p. 299–306, (in slovak). isbn 80-227-2249-9.

[4] storožev, m. v., popov, j. a.: theory of metal shaping. bratislava, slovakia: alfa, praha: sntl, 1978, 63-560-78, p. 488, (in slovak).

[5] križan, p., šooš, ľ., vukelić, dj.: action of counter pressure on compacted briquettes in a pressing chamber, proceedings of the 10th international scientific conference, 9.–10. 10. 2009, novi sad, serbia: faculty of technical sciences, 2009, p. 136–139. isbn 978-86-7892-223-7.

[6] križan, p., vukelić, dj.: shape of the pressing chamber for wood biomass compacting, international journal for quality research, vol. 2, no. 3 (2008), art grafika d.o.o. podgorica, montenegro, p. 205–209. issn 1800-6450.

an empirical method for estimating global solar radiation over egypt

s. a. khalil, a. m. fathy

abstract

global solar radiation has been estimated on the basis of measurements of sunshine duration for different selected sites in egypt (marsa-matruh, cairo, aswan, al-kharga, abu-simble and halaib-shalatin). the regression coefficients (a) and (b) of the angstrom-type correlation are calculated for the selected sites. the values of the regression coefficients are found to vary from 0.219–0.611 and 0.107–0.576, respectively. these values have been calculated by three different approaches. the estimated values of the global solar radiation are compared with the measured values. although the (a) and (b) values differ from one site to another, the summation (a + b) is almost the same for the selected sites. the difference between the estimated and measured values of the global solar radiation at the various sites varies from 4 % to 12 %.

keywords: regression coefficient, sunshine duration, global solar radiation.

1 introduction

the availability of solar radiation data and the relevant meteorological parameters are important to solar engineers and architects in order to give an accurate estimate of the available solar energy resource. solar radiation data is always a necessary basis for the design of any solar energy conversion device and for a feasibility study of the possible use of solar energy [1]. the sunshine duration at a given site depends on the topography of the site and the prevailing meteorological conditions, such as the clearness of the sky and the height above sea level, water vapor content, air temperature, pressure, humidity, wind direction and force, etc. [2]. the first attempt at estimating global solar radiation was the well-known empirical relation between global solar radiation under clear sky conditions and bright sunshine duration, given by angstrom, see [3]. theoretical and empirical models have been postulated to compute the components of the insolation [4–13]. some of these models are theoretical, dealing with the solution of the radiative transfer equation, while others are simply regression models. in this paper, an empirical sunshine-based model is applied to match observed values of the global solar radiation at selected geographical sites in egypt. this paper is simply a continuation of several previous studies conducted at different sites in egypt to assess the applicability of empirical insolation models in the examined area [14–16].

2 methodology

the original form, proposed by angstrom, expressed the correlation between clear sky global solar radiation and sunshine duration as follows:

$$G = G_c\,\{a + b\,(S/N)\} \qquad (1)$$

the linear relation correlating $G/G_0$ and $S/N$ is given in [4] as:

$$G = G_0\,\{a + b\,(S/N)\} \qquad (2)$$

where $G_0$ is the extraterrestrial solar radiation on a horizontal surface in kw/m2, given by:

$$G_0 = \frac{24}{\pi}\, I_{sc}\, E_0 \left(\cos\phi\,\cos\delta\,\sin\omega + \frac{\pi\,\omega}{180}\,\sin\phi\,\sin\delta\right) \qquad (3)$$

where $E_0$ is the correction factor of the earth's orbit and $\omega$ is the sunrise/sunset hour angle, given by:

$$E_0 = 1 + 0.033\,\cos\!\left(\frac{2\pi\,d_n}{365}\right) \qquad (4)$$

$$\omega = \cos^{-1}(-\tan\phi\,\tan\delta) \qquad (5)$$

the declination angle of the sun $\delta$ is given in degrees according to [17] as:

$$\delta = \big(0.006918 - 0.399912\,\cos\Gamma + 0.070257\,\sin\Gamma - 0.006758\,\cos 2\Gamma + 0.000907\,\sin 2\Gamma - 0.002697\,\cos 3\Gamma + 0.00148\,\sin 3\Gamma\big)\,\frac{180}{\pi} \qquad (6)$$

where $\Gamma$ is the day angle in radians, represented by:

$$\Gamma = \frac{2\pi\,(d_n - 1)}{365} \qquad (7)$$
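equations (3)–(7) combine into a short routine for the daily extraterrestrial radiation. the sketch below is a minimal illustration, assuming $I_{sc} = 1.367$ kw/m2 for the solar constant (a value the paper does not state); small differences from the tabulated g0 values are to be expected from the adopted constant and the representative day.

```python
import numpy as np

I_SC = 1.367  # assumed solar constant, kW/m^2

def daily_extraterrestrial(day_number, lat_deg):
    """daily extraterrestrial radiation g0 of eqs. (3)-(7), kWh/m^2 per day."""
    gamma = 2.0 * np.pi * (day_number - 1) / 365.0               # day angle, eq. (7)
    e0 = 1.0 + 0.033 * np.cos(2.0 * np.pi * day_number / 365.0)  # eq. (4)
    delta = (0.006918 - 0.399912 * np.cos(gamma) + 0.070257 * np.sin(gamma)
             - 0.006758 * np.cos(2 * gamma) + 0.000907 * np.sin(2 * gamma)
             - 0.002697 * np.cos(3 * gamma) + 0.00148 * np.sin(3 * gamma))
    # eq. (6) without the 180/pi factor, i.e. delta kept in radians
    phi = np.radians(lat_deg)
    omega = np.arccos(-np.tan(phi) * np.tan(delta))              # eq. (5), radians
    # with omega in radians, the pi*omega/180 factor of eq. (3) reduces to omega
    return (24.0 / np.pi) * I_SC * e0 * (np.cos(phi) * np.cos(delta) * np.sin(omega)
                                         + omega * np.sin(phi) * np.sin(delta))

# mid-january (day 15) at roughly cairo's latitude, ~30.1 deg n
print(daily_extraterrestrial(15, 30.1))  # ~5.8, close to the january g0 for cairo in table 3
```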
the t-statistic [18] is applied as an indicator to select the empirical method that gives the smallest percentage error in the estimated g-values:

$$t = \left[\frac{(n-1)\,\mathrm{MBE}^2}{\mathrm{RMSE}^2 - \mathrm{MBE}^2}\right]^{1/2} \qquad (8)$$

where rmse is the root mean square error and mbe is the mean bias error, used to assess the performance of the respective model, and n is the number of data pairs. the absolute percentage error of the estimated values of the global solar radiation at each site may be calculated from the following equation:

$$e\,\% = \left|\frac{G_m - G_{es}}{G_m}\right| \times 100 \qquad (9)$$

3 observational data

observations of the total solar radiation were carried out using a pyranometer with a sensitivity of 9 μv/wm2. the pyranometer was originally introduced by kimball and hobbs in 1923. the detector is a differential thermopile with the hot-junction receiver blackened and the cold-junction receiver whitened. the calibration of the pyranometer readings was carried out by the egyptian meteorological authority, and the defined errors of the observations range from 5 % to 8 %. the data in this paper was obtained from the meteorological authority of egypt. the data was gathered at six selected sites in egypt: marsa-matruh (lat. 31°33′ n & long. 27°35′ e), abu-simble (lat. 22°34′ n & long. 31°63′ e), cairo (lat. 30°05′ n & long. 32°17′ e), aswan (lat. 23°58′ n & long. 32°47′ e), al-kharga (lat. 25°27′ n & long. 30°34′ e) and halaib-shalatin (lat. 23°30′ n & long. 34°30′ e). the data was gathered at the marsa-matruh, cairo and aswan sites during the period 1991–1993, and at the abu-simble, al-kharga and halaib-shalatin sites during the period 1992–1995.

4 computations

substituting the global solar radiation data in equation (2), we obtained the values of the regression coefficients a and b for the examined sites by the least-squares method. the a and b values were determined by different methods. in the first method, we substituted the daily data values, while in the second method, the constants were calculated using the monthly data values of $G/G_0$ and $S/N$ for each month according to the available data.
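the least-squares determination of a and b in eq. (2) amounts to a straight-line fit of $G/G_0$ against $S/N$. a minimal sketch with invented sample pairs (the actual site data sets are not reproduced here):

```python
import numpy as np

# hypothetical clearness-index and sunshine-fraction pairs (g/g0, s/n)
k = np.array([0.70, 0.74, 0.71, 0.75, 0.72])   # g/g0
s = np.array([0.83, 0.88, 0.85, 0.91, 0.86])   # s/n

b, a = np.polyfit(s, k, 1)   # fit k = a + b*(s/n), eq. (2)
print(f"a = {a:.3f}, b = {b:.3f}")
```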
table 1: values of angstrom coefficients given by different methods at the selected sites

method        marsa-matruh   cairo   aswan   abu-simble   al-kharga   halaib-shalatin
(1)  a        0.351          0.449   0.219   0.241        0.311       0.191
     b        0.406          0.281   0.553   0.502        0.429       0.582
     a+b      0.757          0.730   0.772   0.743        0.740       0.773
(2)  a        0.249          0.461   0.446   0.472        0.291       0.490
     b        0.576          0.259   0.297   0.281        0.439       0.223
     a+b      0.825          0.720   0.743   0.753        0.730       0.713
(3)  a        0.338          0.568   0.596   0.611        0.609       0.593
     b        0.425          0.215   0.174   0.107        0.137       0.098
     a+b      0.763          0.783   0.770   0.718        0.746       0.691

table 2: comparison between measured (gm) and estimated (ges) global solar radiation values (kw/m2) at abu-simble given by different methods

                               method 1       method 2       method 3
month   g0      s/n     gm     ges    e %     ges    e %     ges    e %
jan.    6.63    0.911   4.64   4.95   6.6     4.70   1.3     4.76   2.6
feb.    7.75    0.939   5.61   6.15   9.5     5.93   5.7     6.10   8.7
mar.    9.19    0.825   6.51   6.09   6.5     6.36   2.3     5.91   9.3
apr.    10.61   0.841   7.54   7.91   4.9     7.47   0.9     7.64   1.4
may     11.34   0.826   7.84   7.64   2.5     7.72   1.5     7.45   5.0
jun.    11.04   0.853   8.20   8.42   2.7     8.23   0.4     7.82   4.6
jul.    10.54   0.947   7.76   8.12   4.8     7.92   2.1     7.76   0.1
aug.    10.32   0.924   7.53   7.33   2.7     7.62   1.3     7.44   1.3
sep.    10.07   0.886   6.99   6.80   2.6     7.17   2.6     6.99   0.04
oct.    8.58    0.850   6.19   6.45   4.2     6.59   6.6     6.50   5.1
nov.    7.56    0.892   5.32   5.63   5.8     5.51   3.7     5.21   2.1
dec.    6.42    0.814   4.55   4.36   3.2     4.64   2.0     4.47   1.8
mbe                            0.193          0.136          0.072
rmse                           0.783          0.491          0.463
t                              0.625          0.511          0.439

in the third case, the constants were calculated using the monthly mean daily values of the global solar radiation. the different values of the constants determined using the three methods for the given locations are listed in table 1. these constants were in turn used to recalculate the estimated values of the global solar radiation at the selected sites. a comparison of the measured and estimated global solar radiation values is shown in tables 2–7.

the physical significance of the regression coefficients a and b is that a represents the overall atmospheric transmission for overcast sky conditions, i.e. $S/N = 0$, while b is the rate of increase of $G/G_0$ with $S/N$. the summation $(a + b)$ denotes the overall atmospheric transmission under clear sky conditions, or the clearness index.

5 results and discussion

for cairo, the a-values given by the three methods are generally higher than the b-values (see table 1). in the coastal site at marsa-matruh an opposite behavior is noted, i.e. the b-values are higher than the a-values. aswan and abu-simble have the same behaviors, i.e. the b-values are higher than the a-values for method one, while the reverse results are provided by the second and the third methods. at the al-kharga and halaib-shalatin sites, the results reveal fluctuating behavior of these parameters. however, the summations $(a + b)$ are almost the same at the examined sites, except for the results of the third method at halaib-shalatin, where the summation is slightly lower. in fact, under full clear sky conditions, i.e. $S = N$, according to eq. (2), we find that the values of $G/G_0$ are equal to the value of $(a + b)$ at the examined location. the applied empirical methods give estimated values of global solar radiation nearly coinciding with the measured values at the various selected sites, where the errors range from 4 % to 12 % (see tables 2–7). according to eq. (9), we obtained and compared the values of (e %) for the different methods at all the selected sites.
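for reference, the error statistics behind tables 2–7 can be computed as below. the mbe/rmse definitions are the standard ones (the paper does not spell them out), and the sample pairs are taken from the january–april rows of table 2, method 1, purely as an illustration.

```python
import numpy as np

def error_stats(g_m, g_es):
    """mbe, rmse, the t-statistic of eq. (8) and the mean e% of eq. (9)."""
    d = g_es - g_m
    n = len(d)
    mbe = d.mean()                              # mean bias error
    rmse = np.sqrt((d ** 2).mean())             # root mean square error
    t = np.sqrt((n - 1) * mbe ** 2 / (rmse ** 2 - mbe ** 2))   # eq. (8)
    e_pct = np.abs((g_m - g_es) / g_m) * 100.0                 # eq. (9)
    return mbe, rmse, t, e_pct.mean()

# illustration only: january-april at abu-simble, method 1 (table 2)
g_m = np.array([4.64, 5.61, 6.51, 7.54])
g_es = np.array([4.95, 6.15, 6.09, 7.91])
print(error_stats(g_m, g_es))
```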
the smallest values are considered the best, but we have to compare their mbe, rmse and t-test values. thus the method that gives the smallest t-test value is considered the best method for estimating the global solar radiation for the different selected sites with an acceptable error. in fact, it is difficult to select one empirical method that explains the time fluctuations of the observed global solar radiation values at various sites. for example, methods 1 and 2 are more applicable at aswan, al-kharga and halaib-shalatin, but method 2 is best for estimating the global solar radiation at these sites. at cairo and abu-simble, methods 2 and 3 are more applicable throughout the various seasons, but method 3 is best for estimating at abu-simble, whereas method 2 is best at cairo. at marsa-matruh, methods 1 and 3 seem to be the best for representing the trend of the measured global solar radiation values throughout the seasons. method 1 is generally best at marsa-matruh.

table 3: comparison between measured (gm) and estimated (ges) global solar radiation values (kw/m2) at cairo given by different methods

                               method 1       method 2       method 3
month   g0      s/n     gm     ges    e %     ges    e %     ges    e %
jan.    6.16    0.598   3.40   3.29   3.1     3.13   7.9     3.26   4.2
feb.    6.64    0.647   3.92   3.67   6.2     3.75   4.3     3.83   2.3
mar.    8.24    0.689   5.26   5.07   3.5     4.99   5.1     5.07   3.6
apr.    10.04   0.771   6.20   5.96   3.8     6.37   2.7     6.08   1.8
may     11.06   0.815   7.33   7.59   3.5     7.47   2.0     7.45   1.7
jun.    11.51   0.859   7.99   8.11   1.5     7.77   2.7     8.17   2.3
jul.    11.28   0.883   7.76   7.84   1.1     7.81   0.6     7.95   2.5
aug.    10.60   0.809   7.26   7.17   1.1     7.45   2.6     7.36   1.8
sep.    9.99    0.731   6.18   6.57   6.3     6.05   2.1     6.37   3.1
oct.    8.09    0.702   5.26   4.97   5.3     5.07   3.5     5.37   2.1
nov.    6.15    0.693   3.94   4.34   10.2    4.11   4.5     4.07   3.6
dec.    5.16    0.645   3.30   3.44   4.0     3.49   5.6     3.54   7.1
mbe                            0.185          0.122          0.489
rmse                           0.649          0.571          0.631
t                              0.797          0.460          1.227

table 4: comparison between measured (gm) and estimated (ges) global solar radiation values (kw/m2) at aswan given by different methods

                               method 1       method 2       method 3
month   g0      s/n     gm     ges    e %     ges    e %     ges    e %
jan.    6.52    0.875   4.70   4.51   4.1     4.79   1.9     4.75   1.1
feb.    7.77    0.911   5.78   5.68   1.8     5.92   2.4     5.87   1.6
mar.    9.39    0.825   6.58   6.36   3.4     6.28   4.6     6.49   1.4
apr.    10.81   0.859   7.87   7.75   1.6     7.61   3.4     7.44   5.6
may     11.31   0.813   8.03   7.82   2.6     7.90   1.6     7.82   2.6
jun.    11.63   0.881   8.25   8.01   3.0     7.84   5.0     8.10   1.9
jul.    11.43   0.925   7.90   7.76   1.8     7.54   4.6     7.63   3.4
aug.    10.66   0.961   7.70   7.62   0.9     7.59   1.4     7.73   0.4
sep.    10.21   0.905   7.20   7.06   1.9     7.30   1.4     7.26   0.8
oct.    9.00    0.863   6.50   6.38   2.0     6.62   1.8     6.59   1.3
nov.    7.81    0.815   5.59   5.79   3.5     5.76   2.9     5.86   4.8
dec.    6.44    0.85    4.77   4.86   1.8     4.70   1.5     4.96   3.9
mbe                            0.171          0.69           0.059
rmse                           0.711          0.532          0.496
t                              0.692          0.486          0.436

table 5: comparison between measured (gm) and estimated (ges) global solar radiation values (kw/m2) at al-kharga given by different methods

                               method 1       method 2       method 3
month   g0      s/n     gm     ges    e %     ges    e %     ges    e %
jan.    6.64    0.863   4.79   4.75   0.8     4.99   4.1     4.90   2.3
feb.    7.93    0.905   5.86   5.79   1.2     5.97   1.9     5.71   2.7
mar.    9.47    0.846   6.62   6.66   0.6     6.71   1.3     6.79   2.5
apr.    10.91   0.849   7.98   8.12   1.8     8.19   2.6     7.95   0.4
may     11.34   0.872   8.16   8.07   1.1     8.02   1.7     8.29   1.5
jun.    11.66   0.909   8.29   8.16   1.6     8.04   2.9     7.92   4.5
jul.    11.60   0.913   7.92   7.81   1.4     7.80   1.5     7.76   2.1
aug.    10.79   0.932   7.76   7.87   1.5     7.98   2.8     7.37   4.9
sep.    10.32   0.895   7.19   7.29   1.3     7.38   2.6     6.98   2.9
oct.    9.33    0.879   6.49   6.42   1.1     6.35   2.2     6.30   2.9
nov.    7.88    0.845   5.87   5.80   1.2     5.97   1.7     5.77   1.6
dec.    6.59    0.870   4.96   4.76   4.2     5.09   2.7     5.04   1.6
mbe                            0.283          0.149          0.262
rmse                           0.641          0.627          0.539
t                              0.539          0.391          0.473

table 6: comparison between measured (gm) and estimated (ges) global solar radiation values (kw/m2) at marsa-matruh given by different methods

                               method 1       method 2       method 3
month   g0      s/n     gm     ges    e %     ges    e %     ges    e %
jan.    5.45    0.672   2.81   2.85   1.4     3.02   7.4     3.18   13.3
feb.    6.66    0.692   3.86   3.97   3.0     3.68   4.6     4.02   4.1
mar.    8.47    0.715   4.97   4.80   3.5     4.89   1.6     4.67   6.1
apr.    9.78    0.759   6.42   6.61   3.0     6.74   4.9     6.65   3.6
may     10.51   0.771   7.20   7.45   3.1     7.03   2.3     7.26   0.8
jun.    10.91   0.815   7.87   8.21   4.3     7.75   1.5     8.04   2.2
jul.    10.60   0.849   7.94   8.28   4.3     7.82   1.5     7.99   0.6
aug.    10.39   0.807   7.26   7.54   3.7     7.35   1.1     7.45   2.6
sep.    9.76    0.782   6.34   6.48   2.2     6.37   0.5     6.56   2.5
oct.    8.11    0.731   5.14   5.32   3.6     5.38   4.8     5.49   6.8
nov.    6.66    0.671   3.92   4.18   6.7     4.14   5.6     4.07   3.9
dec.    5.24    0.619   3.13   3.31   5.6     3.40   8.4     3.42   9.2
mbe                            0.086          0.137          0.479
rmse                           0.763          0.849          1.063
t                              0.211          0.473          1.395

table 7: comparison between measured (gm) and estimated (ges) global solar radiation values (kw/m2) at halaib-shalatin given by different methods

                               method 1       method 2       method 3
month   g0      s/n     gm     ges    e %     ges    e %     ges    e %
jan.    6.82    0.882   4.83   4.95   2.6     4.69   2.8     4.60   4.8
feb.    7.97    0.895   5.99   6.22   3.9     5.81   2.9     5.78   3.4
mar.    9.67    0.932   6.66   6.47   2.9     6.74   1.1     6.37   4.4
apr.    10.91   0.865   8.18   7.97   2.5     8.09   1.1     7.99   2.8
may     11.36   0.876   8.26   8.20   0.6     8.14   1.3     7.94   3.8
jun.    11.80   0.942   8.43   8.68   2.9     8.24   2.3     8.04   4.7
jul.    11.56   0.963   8.20   8.09   1.4     7.85   4.2     7.92   3.9
aug.    10.99   0.951   7.90   7.74   2.0     7.66   2.9     7.54   4.5
sep.    10.38   0.918   7.40   7.26   1.8     7.54   1.9     7.49   1.3
oct.    9.51    0.885   6.89   6.66   3.3     6.61   4.2     6.75   2.1
nov.    8.09    0.839   5.99   5.79   3.9     6.09   1.5     6.22   3.8
dec.    6.74    0.874   5.14   5.27   2.5     5.34   4.0     5.24   1.9
mbe                            0.231          0.181          0.211
rmse                           0.682          0.575          0.425
t                              0.573          0.431          0.511

6 conclusion

the results of this paper clearly indicate the primary importance of developing empirical approaches for formulating the global solar radiation field reaching the earth at various geographical sites in egypt. method two provides good agreement at the cairo, halaib-shalatin and al-kharga sites, while at the abu-simble and aswan sites the third method is considered best. for marsa-matruh, the first method is considered best. other topographic, climatological and environmental parameters should be inserted into the adopted empirical formula to increase the accuracy of the estimated values of the observed quantities. the dependence of coefficients a and b in the angstrom formula should be tested as a function of the prevailing environmental conditions at the tested sites.

list of symbols

gc    clear sky global solar radiation in kw/m2
s     sunshine duration in hours
n     length of the daylight in hours
g0    extraterrestrial solar radiation on a horizontal surface in kw/m2
e0    correction factor of the earth's orbit
ω     sunrise/sunset hour angle
dn    day number in the year (1 ≤ dn ≤ 365)
δ     declination of the sun
φ     latitude of the station
Γ     day angle in radians
rmse  root mean square error
mbe   mean bias error
n     number of data pairs
e %   absolute percentage error
gm    measured values of global solar radiation
ges   estimated values of global solar radiation

references

[1] gopinathan, k. k.: solar sky radiation estimation techniques, solar energy, vol. 49 (1992), no. 1, p. 9–11.
[2] ready, s. s.: an empirical method for the estimation of the total solar radiation, solar energy, vol. 24 (1971), p. 13.
[3] angstrom, a.: solar and terrestrial radiation, roy. met. soc., vol. 50 (1924), p. 121–127.
[4] prescott, j. a.: evaporation from a water surface in relation to solar radiation, trans. r. soc. s. austr., vol. 64 (1940), p. 114–118.
[5] haurwitz, b.: insolation in relation to cloudiness and cloud density, j. met., vol. 2 (1945), p. 154–156.
[6] daneshyar, m.: solar radiation statistics for iran, solar energy, vol. 21 (1978), p. 345–349.
[7] davies, j., abdel-wahab, m., mckay, d.: estimating solar irradiance on horizontal surface, int. j. sol. energy, vol. 2 (1984), p. 405.
[8] abdel-wahab, m.: simple model for estimation of global solar radiation, solar and wind technology, vol. 2 (1985), no. 1, p. 69–71.
[9] srivastava, s. k., singh, o. p., pandey, g. n.: estimation of global solar radiation in uttar pradesh (india) and comparison of some existing correlations, solar energy, vol. 51 (1993), p. 27–90.
[10] abdel-wahab, m.: new approach to estimate angstrom coefficient, solar energy, vol. 51 (1993), p. 241–245.
[11] badescu, v.: verification of some very simple clear and cloudy sky models to evaluate global solar irradiance, solar energy, vol. 61 (1997), p. 251–264.
[12] hamid, r. h.: formulation of the global solar radiation using sunshine duration over egypt, journal astronomical society of egypt, vol. 11 (2003), p. 39–52.
[13] beheary, m. m.: using the global solar radiation to estimate the spectroscopic structure of the normal incident solar radiation at selected sites in egypt, al-azhar bull. sci., vol. 15 (2004), no. 2, p. 93–106.
[14] kamal, m. a., shalaby, s. a., mostafa, s. s.: solar radiation over egypt; comparison of predicted and measured meteorological data, solar energy, vol. 50 (1993), p. 463–470.
[15] ibrahim, s. m. a.: predicted and measured global solar radiation in egypt, solar energy, vol. 35 (1985), p. 185–190.
[16] tadros, m. t. y.: uses of sunshine duration to estimate the global solar radiation over eight meteorological stations in egypt, renewable energy, vol. 21 (2000), p. 231–240.
[17] spencer, j. w.: fourier series representation of the position of the sun, search, vol. 2 (1975), no. 5, p. 165–172.
[18] iqbal, m.: an introduction to solar radiation, academic press, 1983.

doc. ahmed mohamed fathy
phone: +2-011-2689337
fax: +2-02-25548020
e-mail: amfathy2003@yahoo.co.uk
national research institute of astronomy and geophysics, solar and space department, helwan, egypt

doc. samy khalil abd el mordy
fax: +2-02-25548020
e-mail: samyakalil@yahoo.com
national research institute of astronomy and geophysics, helwan, egypt

influence of the amount of master alloy on the properties of austenitic stainless steel aisi 316l powder sintered in hydrogen

mateusz skałoń1, jan kazior1

1 cracow university of technology, mechanical department, warszawska street 24, 31-155 cracow, poland

correspondence to: kazior@mech.pk.edu.pl

abstract

aisi 316l austenitic stainless steel powder was modified with four different amounts of boron (0.1; 0.2; 0.3; 0.4 wt. %) in the form of master alloy micro-powder, and was sintered in a pure dry hydrogen atmosphere in order to obtain high-density sintered samples characterized by a thickened non-porous surface layer.
we investigated the influence of the amount of boron on: density, hardness, grain microhardness, porosity, microstructure and surface quality. the study revealed that it is possible by a conventional compacting and sintering process to obtain near full-density sintered samples with a non-porous superficial layer without boride precipitations.

keywords: master alloy, stainless steel, sintering.

1 introduction

sintered austenitic stainless steel powders (e.g. aisi 316l) pose some serious problems connected especially with the presence of open porosity. this allows an aggressive fluid to penetrate the bulk material, which remarkably intensifies the corrosion rate. in order to extend the applications of sintered stainless steels, increased density and decreased total porosity, in particular open porosity, are necessary. it has been found that when ferrous metal powders are sintered with the addition of boron powder, a non-porous surface layer is formed [1–11]. unfortunately, a high addition of elemental boron stimulates the appearance of a eutectic liquid phase during sintering, which intensifies the mass transport processes, while at the same time creating an almost continuous network of borides surrounding the matrix grains. this significantly decreases the mechanical properties of the sintered material. the main goal of our investigation was to obtain a high-density sintered material with a non-porous superficial layer without boride precipitations.

2 experimental procedure

cylindrical specimens 5 mm in length and 20 mm in diameter of aisi 316l austenitic stainless steel powders, modified with four different amounts of boron-containing master alloy micro-powder (0.1; 0.2; 0.3; 0.4 wt. %), were uniaxially compacted at 600 mpa. only the punch and die walls were lubricated with zinc stearate in order to avoid contamination of the mixtures of powders. the sintering process was performed in a tubular furnace in a pure dry hydrogen atmosphere according to the temperature profile presented below:

• heating from room temperature up to 1 240 °c, with a heating rate of 10 °c/min,
• isothermal sintering at a temperature of 1 240 °c for 30 min,
• cooling to 30 °c with a cooling rate of 20 °c/min,
• temperature stabilization at 30 °c for 60 min.

density was measured by the water displacement method according to the pn-en iso 2738:2001 standard. hardness tests were carried out using the vickers hardness tester. ten measurements were taken, and the results were calculated as an average value. the standard deviation calculation was also based on this data. specimens for the microstructures were obtained by means of a particular procedure aimed at avoiding modifying the porosity morphology. microstructures were revealed by etching in vilella's etchant for stainless steels at a temperature of 80 °c. the porosity was determined by digital image analysis as a function of the boron content. the microhardness of the grains was measured after etching: ten measurements were performed for each sample, with a 10 g test load. the load acting time was 10 seconds. porosity analysis was performed using aphelion software. a surface quality check was carried out by observing a cross-section of the sample using an optical microscope.

3 results and discussion

3.1 density

figure 1 presents the influence of the amount of boron on the sintered density of aisi 316l powders.
it is clear that the sintered density increased with an increasing amount of boron, in particular when the amount of boron exceeded 0.3 wt. %. the major increase in sintered density for the 0.3 wt. % amount of boron suggests a change in the sintering mechanism compared with the smaller amounts of boron.

figure 1: influence of the amount of boron on the sintered density of aisi 316l powders

3.2 hardness

figure 2 and table 1 present the results of vickers hv 10 hardness measurements of sintered aisi 316l powders. similarly as in the case of density, an increase in the amount of boron increases the hardness of the sintered materials. in particular, in the case of the addition of 0.4 wt. % boron, the hardness is 2.5 times higher than the hardness of the base material. this is directly connected with the increase in density.

3.3 microhardness

figure 3 presents the results of microhardness measurements of sintered aisi 316l powders. unlike in the case of density and hardness, the grain microhardness measurements indicate that the addition of boron does not influence the grain microhardness.

figure 2: influence of the amount of boron on the vickers hardness of sintered aisi 316l powders

figure 3: influence of the amount of boron on the microhardness of sintered aisi 316l powders

table 1: hardness data

boron amount [wt. %]   0.0    0.1    0.2    0.3    0.4
hv/10                  73     80     100    144    180
standard deviation     ±1.1   ±1.3   ±1.6   ±4.6   ±3.9

table 2: microhardness data

boron amount [wt. %]   0.0     0.1     0.2     0.3     0.4
hv/0.01                262     293     281     302     283
standard deviation     ±36.0   ±22.3   ±29.9   ±26.2   ±28.1

figure 4: influence of the amount of boron on the pore area in three different ranges: < 25 μm2, 25–75 μm2 and > 75 μm2

the boron-alloyed samples had only slightly higher microhardness than the base sintered material. this is due to some diffusion of the alloying elements from the master alloy powder to the austenitic matrix. this means that the increase in vickers hardness is a consequence of increased density, but that the microstructure does not change.

4 pore morphology

figure 4 presents the pore areas as a function of boron content. it may be noted that with increasing boron content the pore area changed. for the boron-free samples, a small pore area dominated, while for a higher amount of boron a larger pore area dominated. the results presented in figure 4 show that the trend changes at a 0.3 wt. % boron addition. this behavior correlates well with the development of density: with a 0.3 wt. % addition of boron, considerable growth in density was also observed. this behavior is well related to the phenomena associated with liquid-phase sintering, when microstructural coarsening occurs in the final stage of sintering. along with grain growth there are simultaneous changes in pore size, inter-granular neck size, grain–liquid interfacial area and mean grain separation.

5 microstructures

figure 5a presents the microstructure of boron-free sintered austenitic stainless steel, and figure 5b presents the microstructure of 0.4 wt. % boron-alloyed aisi 316l. in principle, the microstructure is characterized by two different layers. the first is the dense superficial layer (figure 7), which, after the precipitations appear, develops into a second one: a layer rich in borides but characterized by much lower porosity than the sample's core (figure 6). the network of borides surrounding the austenitic grains is caused by the solidification of the liquid phase that forms at high temperature through a eutectic reaction.
the network is constituted by complex cr-mo-fe borides [4].

figure 5: porosity of a) sintered aisi 316l base powder and b) aisi 316l modified with 0.4 wt. % of boron in master alloy form

figure 6: core microstructure of a) sintered aisi 316l base powder and b) aisi 316l modified with 0.4 wt. % of boron in master alloy form

figure 7: surface microstructure of a) sintered aisi 316l base powder, b) aisi 316l modified with 0.3 wt. % of boron in master alloy form, and c) aisi 316l modified with 0.4 wt. % of boron in master alloy form

in addition, the evolution of a boride network suggests that the surface layer is formed by a pore-filling mechanism described theoretically by lee and kang [11]. when the grain growth process occurs, there is a pressure imbalance between the outer surface of the specimen and the inner surface of the pores, which pushes the liquid into the internal pores. as the grain size increases, the radius of the external meniscus decreases, causing an increase in the external pressure with respect to the internal pressure. as a consequence of this pressure imbalance, the liquid is induced to flow into the internal pores of the sample. this phenomenon therefore produces penetration of the liquid towards the inner part of the sample, and the concomitant formation of a liquid-free layer on the surface, where the pores have almost completely disappeared. the penetration of the liquid enriches the intermediate layer, and the liquid tends to flow towards the bulk. when the boron content increases, the amount of the liquid phase increases overall in the samples, and this influences the pore-filling mechanism.

6 surface quality

in the case of 0.3 wt. % and 0.4 wt. % of added boron, a non-porous and boride-free superficial layer was obtained. the microhardness of the superficial layer is similar to the microhardness of the sample core. the thickness of the superficial layer varies between 40 μm and 100 μm. in addition, there is a noticeable improvement in surface quality for the boron-alloyed specimens, in particular for 0.3 and 0.4 wt. %, when compared with the surface microstructure of the base material.

7 conclusion

introducing a high amount (0.3 wt. % and 0.4 wt. %) of boron in the form of master alloy into austenitic stainless steel powders causes increased sintered density and, as a consequence, increased hardness of sintered austenitic stainless steels. it should also be pointed out that the introduction of boron in the form of master alloy enabled the formation of a non-porous and plain surface, with no precipitations. this led to an improvement in corrosion resistance. however, further investigations should focus on obtaining a non-porous layer that is at the same time free of the grain boride network in the bulk material.

acknowledgement

this research was supported by the polish ministry of science and higher education, grant no. n 508 479 138.

references

[1] klar, e., samal, p.: powder metallurgy stainless steel. ed. asm, 2007.
[2] tandon, r., german, r. m.: the international journal of powder metallurgy, vol. 34, no. 1, 1998, p. 40–49.
[3] molinari, a., kazior, j., marchetti, f., cantieri, r., cristofolini, a., tiziani, a.: powder metallurgy, 1994, vol. 37, no. 2, p. 115–122.
[4] molinari, a., straffelini, g., pieczonka, t., kazior, j.: the international journal of powder metallurgy, 1998, vol. 34, no. 2, p. 21–28.
[5] velasco, f., et al.: material science and technology, 1997, vol. 13, p. 847–851.
[6] toennes, c., ernst, p., meyer, g., german, r. m.: advances in powder metallurgy, 1992, vol. 3, p. 371–381.
[7] menapace, c., molinari, a., kazior, j., pieczonka, t.: powder metallurgy, 2007, vol. 50, no. 4, p. 326–335.
[8] sarasola, m., gomez-acebo, t., castro, f.: powder metallurgy, 2005, vol. 48, no. 1, p. 59–67.
[9] selecka, m., salak, a., danninger, h.: j. mat. proc. techn., 2003, vol. 143, p. 910–915.
[10] selecka, m., danninger, h., bures, r., parilak, l.: proc. pm world cong. 1998, granada, spain, october 1998, epma, vol. 2, p. 638–643.
[11] lee, s. m., kang, s. j.: acta materialia, 1998, vol. 46, no. 9, p. 3191–3202.

the intriguing nature of the cataclysmic variable ss cygni

f. giovannelli, l. sabau-graziati

abstract

the classification of ss cyg as a dwarf nova (dn), a subclass of the non-magnetic (nm) cataclysmic variables (cvs), has been considered by most of the community as well established because of a paper that appeared in nature (bath & van paradijs, 1983), which was a bandwagon for all the papers discussing ss cyg behaviour both from experimental and theoretical points of view. this classification has been widely accepted until nowadays, in spite of the many arguments and circumstantial proofs about its possible intermediate polar nature, as claimed by franco giovannelli's group for more than 25 years. the goal of this paper is to present an objective discussion of the problems connected with the controversial nature of ss cyg, using all the different interpretations of its multifrequency data in order to demonstrate beyond doubt its intermediate polar nature.

keywords: cataclysmic variables, dwarf novae, intermediate polars, optical, spectroscopy, photometry, sub-mm, ir, radio, uv, x-rays, individual: ss cyg ≡ bd+42° 4189a ≡ aavso 2138+43 ≡ 3a 2140+433 ≡ 1h 2140+433 ≡ integral1 121 ≡ 1rxs j214242.6+433506 ≡ euve j2142+43.6 ≡ swift j2142.7+4337.

1 introduction

historically, the classification of cvs was based on the optical outburst properties, by which one may distinguish four groups of cvs: (i) classical novae; (ii) recurrent novae; (iii) dwarf novae; (iv) nova-like objects (e.g., giovannelli & martinez-pais, 1991 and references therein; ritter, 1992; giovannelli, 2008). this classification, however, is neither self-consistent nor adequate, and it is much better to consider primarily the observed accretion behaviour (smak, 1985). one obvious advantage of such an approach is connected with the time scales of various accretion phenomena, which are sufficiently short to avoid any major observational bias. the mass accretion rates in cvs usually range from $10^{-11}$ to $10^{-8}\,M_\odot\,\mathrm{yr}^{-1}$ (patterson, 1984); the time scales are from tens of seconds (oscillations in dwarf novae at outbursts) to years (super-outbursts of su uma stars or long-term variations in vy scl stars). however, in the class of nova-like objects there are two sub-classes: dq her stars and am her stars. in these sub-classes of cvs, white dwarfs possess magnetic fields with intensity high enough to dominate the accretion disk and all the phenomena related to it. these classes of magnetic cvs, whose names come from the prototypes dq her and am her, later took the names of intermediate polars and polars, respectively. a short history of their discovery was discussed by warner (1995). fundamental papers about these sub-classes are those by patterson (1994), warner (1996a,b).
the class of ips has been split into two subclasses: one with a relatively large magnetic field and one with a relatively weak magnetic field (norton et al., 1999). one example of a system belonging to the latter subclass is do dra (previously registered as yy dra) (andronov et al., 2008). depending on the magnetic field intensity at the white dwarf, the accretion of matter from the secondary star onto the primary can occur either via an accretion disc (in the so-called non-magnetic cvs: nmcvs) or via channelling through the magnetic poles (in the case of polars: pcvs), or in an intermediate way (in the case of intermediate polars: ipcvs).

ss cyg is the most observed and most intriguing cv. for reviews, see the papers by giovannelli & martinez-pais (1991), giovannelli (1996), giovannelli & sabau-graziati (1998). the most extensive review about ss cyg before the advent of the space era is that by zuckerman (1961). the light curves of ss cyg have been produced continuously by the aavso observations since 1896 (mattei et al., 1985; mattei, waagen & foster, 1991, 1996; mattei, menali & waagen, 2002; aavso web page (http://www.aavso.org/)). the optical outbursts of ss cyg are not always the same. howarth (1978) discussed three possible kinds of outbursts: long, short and anomalous, with an average periodicity of 50.21 days. giovannelli et al. (1985), lombardi, giovannelli & gaudenzi (1987), gaudenzi et al. (1990; 2011) and giovannelli & martinez-pais (1991) discussed outbursts that give rise to different optical, uv and x-ray behaviour of the system. on the basis of its optical light curves, ss cyg was classified as a dwarf nova (bath & van paradijs, 1983), with a white dwarf mass equal to chandrasekhar's limit (patterson, 1981). however, we will discuss its intermediate polar nature, analyzing its multifrequency behaviour and different interpretations of the data from the literature. moreover, on the basis of more realistic values for its orbital parameters, we will try to reconcile all experimental multifrequency data with the magnetic nature of ss cyg.

2 on the controversial nature of ss cyg

with the historical classification of cvs based on the optical outburst properties, ss cyg ($\alpha_{2000}$ = 21h 42m 48s.2; $\delta_{2000}$ = +43° 35′ 09″.88, with the galactic coordinates $l_{2000}$ = 090.5592, $b_{2000}$ = −07.1106), whose distance is 166.2 ± 12.7 pc (harrison et al., 1999), is the brightest of the dwarf novae. its optical magnitude ranges from ∼ 12 to ∼ 8.5 during quiescent and outburst phases, respectively. because of these characteristics, it is the most observed cv, not only in the optical wavelength region, where measurements are available from the end of the 19th century to the present, but also in other wavelength regions. ss cyg shows oscillations of ∼ 10 s in both the optical and the x-ray ranges, orbital modulations ($P_{orb} \simeq 6.6$ h) of the intensities of balmer and uv emission lines and of the continuum, and almost periodic outbursts ($P_{outb} \sim 50$ days; howarth, 1978). all these characteristics, together with relatively high luminosity both in outburst and in quiescence, render ss cyg the most appropriate laboratory for studying the physical processes occurring in dwarf novae and in cvs in general. the orbital parameters of the binary system were derived by giovannelli et al. (1983) with the use of theoretical and experimental constraints from measurements obtained in different energy regions.
they are $i = 40^{+1}_{-2}$ degrees, $M_1 = 0.97^{+0.14}_{-0.05}\,M_\odot$, $M_2 = 0.56^{+0.08}_{-0.03}\,M_\odot$, $R_2 = 0.68^{+0.03}_{-0.01}\,R_\odot$, $R_{od} = 2.9 \times 10^{10}$ cm and $R_{id} = 3.6 \times 10^{9}$ cm, where 1 and 2 refer to the primary and secondary star, respectively, and $R_{od}$ and $R_{id}$ are the outer and inner accretion disk radii. these parameters have been confirmed by direct measurements of radial velocities (martinez-pais et al., 1994). martinez-pais et al. also determined that the optical companion of the ss cyg system is a k2–k3 late-type star. the mass of the white dwarf of the binary system ss cyg was considered for a long time to be as high as chandrasekhar's limit — because of a paper that appeared in the astrophysical journal (patterson, 1981) — a value completely useless for any serious interpretation of the many multifrequency experimental data and for modeling. in a study of the matter flow structure in ss cyg using the doppler tomography technique, bisikalo et al. (2008) found that $R_{id} = (2.6-3.3) \times 10^{9}$ cm, which is another important confirmation of the validity of giovannelli et al.'s parameters.

despite the enormous amount of multifrequency experimental data spread over many years, the morphology and the nature of ss cyg are still unsettled questions. indeed, ss cyg was classified as a non-magnetic cv (nmcv) by bath & van paradijs (1983). ricketts, king & raine (1979) explained the x-ray emission from ss cyg as owing to the radial inflow of matter onto a magnetized white dwarf ($B \sim 10^6$ g) from a disrupted accretion disk. fabbiano et al. (1981), using coordinated optical-uv and x-ray measurements of ss cyg, noted that its behavior is not compatible with a viscous disk model and confirmed the magnetic nature of the white dwarf with $B \le 1.9 \times 10^6$ g. ss cyg at quiescence is quite similar to am her, and its behaviour is consistent with a picture of polar magnetic accretion. further multifrequency data of ss cygni showed incompatibility of its behavior with that of an nmcv, and strongly favored its classification as an intermediate polar (see, e.g. giovannelli et al., 1985; giovannelli & martinez-pais, 1991; kjurkchieva, marchev & ogłoza, 1999; marchev, kjurkchieva & ogłoza, 1999; gaudenzi et al., 2002; long et al., 2005; schreiber, hameury & lasota, 2003; schreiber & lasota, 2007). moreover, in ss cyg, $L_{hard\,X} < L_{UV+soft\,X}$. this is compatible with thermonuclear burning onto the wd surface. thermonuclear burning was first suggested by igor mitrofanov (1978). gaudenzi et al. (2002) found that thermonuclear burning can occur over ∼ 24 % of the wd surface. körding et al. (2008) detected a radio jet from ss cyg. the hardness-intensity diagram shows an analogy between x-ray binaries (xrbs) and ss cyg. this result supports the presence of a rather strong magnetic field at the surface of the white dwarf. upper limits to linear and circular polarization have been found as 3.2 ± 2.7 % and −3.2 ± 2.7 %, respectively. integral/ibis and swift/xrt observations have shown that a conspicuous number of cvs have a strong hard x-ray emission (landi et al., 2009; scaringi et al., 2010). in their published sample of 22 cvs, 21 are classified as magnetic cvs (mcvs) (intermediate polars: ips) and only one (ss cyg) as an nmcv, even though all its characteristics are practically equal to those of the other 21 objects. this is one more strong circumstantial proof in favor of the magnetic nature of ss cyg. scaringi et al. (2010) reported the detection of one more ip, ao psc, which is added to the former sample.
the experimental evidence that ss cyg emits in the hard x-ray energy range is, in our opinion, conclusive evidence about its magnetic nature. however, there are many papers in the literature that seem to contradict the intermediate polar nature of ss cyg. indeed, gnedin et al. (1995) found from observations of intrinsic circular polarization in ss cyg, performed in the wings of the balmer hydrogen lines, that the true value of the magnetic field probably lies in the range $0.03 < B < 0.3$ mg. using extreme ultraviolet explorer satellite observations, mauche (1996) detected quasi-periodic oscillations (qpos) from ss cyg with a period in the range 7.19–9.3 s. this variation correlates with the extreme ultraviolet (euv) flux as $P \propto I_{EUV}^{-0.094}$. with a magnetospheric model to reproduce this variation, he found that a high-order, multipole field is required, and that the field strength at the surface of the white dwarf is $0.1 < B < 1$ mg. this field strength is at the lower extreme of those measured or inferred for bona fide magnetic cataclysmic variables. however, this does not exclude the possibility that at an outburst the accretion of matter could occur onto the magnetic poles of the white dwarf. giovannelli (1981) found qpos from ss cyg — during a rise up to a maximum of one long optical outburst — with periods in the range 8.96–9.91 s that show the inverse relationship between outburst luminosity and oscillation period as a general property of cvs (e.g. nather & robinson, 1974; nevo & sadeh, 1978). the amplitude of the optical oscillations has a maximum at the maximum of the outburst. then the white dwarf is heated directly by the surrounding luminous material, as by accretion, as suggested by hildebrand et al. (1980) for ah her. the variation of the rapid oscillation periods in ss cyg shows the same trend as in ah her. the temperature is minimum at the peak of the outburst. a possible explanation could be that very close to the optical maximum the inner part of the disk moves closer to the white dwarf surface and it becomes denser and renders cooling possible. this optical behaviour is in agreement with the x-ray behaviour of ss cyg. indeed, at the optical maximum, the hard x-ray emission is lower than during quiescence, and all the energy in the x-ray region is emitted below 2 kev (ricketts, king & raine, 1979). with regard to the question of the possible magnetic nature of ss cyg, mauche et al. (1997) discussed the case of the uv line ratios of cvs, which seem to be almost independent of the nature (magnetic or not) of cvs. okada, nakamura & ishida (2008) found from chandra hetg observations of ss cyg that the spectrum in quiescence is dominated by h-like kα lines, and is dominated in outburst by he-like lines, which are as intense as the h-like lines. the broad line widths and line profiles indicate that the line-emitting plasma is associated with the keplerian disk. in quiescence the lines are narrower and are emitted from an ionizing plasma at the entrance of the boundary layer. ishida et al. (2009) found from suzaku observations of ss cyg that the plasma temperature in quiescence is 20.4 kev and in outburst 6.0 kev. the 6.4 kev line is resolved into narrow and broad components, which indicates that both the white dwarf and the accretion disk contribute to the reflection. the standard optically thin boundary layer is the most plausible picture of the plasma configuration in quiescence.
the reflection in outburst originates from the accretion disk and an equatorial accretion belt. the broad 6.4 kev line suggests that the optically thin thermal plasma is distributed on the accretion disk, in a manner similar to that of a solar corona. long et al. (2005) found by fitting the double-peaked line profile in ss cyg that the fuv line-forming region is concentrated closer to the white dwarf than the region that forms the optical lines. their study provides no evidence of a hole in the inner disk. however, they cannot fit the ss cyg data with a simple model such as a white dwarf plus an accretion disk.

the ss cyg system is also important as a laboratory for the study of circumstellar dust in cvs. indeed, jameson et al. (1987) detected ir emission from ss cyg in outburst in iras bands i and ii (11.8 µm and 24.4 µm). the most likely origin of the ir emission is circumstellar dust heated by the enhanced uv flux during outburst. dubus et al. (2004) performed optical and mid-ir observations of several cvs, including ss cyg in quiescence. for ss cyg the measurements at 11.8 µm are consistent with the upper limits obtained by jameson et al. (1987) when the source was not yet in full outburst. the observed variability in the mid-ir flux on short time scales is hardly reconcilable with intrinsic or reprocessed emission from circumbinary disk material, while on the contrary free–free emission from a wind should be reconcilable. if any sizeable circum-binary disk is present in the system, it must be self-shadowed or perhaps dust-free, with the peak thermal emission shifted to far-ir wavelengths. gaudenzi et al. (2011) discussed the reasons for the variable reddening in ss cyg and demonstrated that this reddening consists of two components: the first is interstellar in origin, and the second (intrinsic to the system itself) is variable and changes during the evolution of a quiescent phase. moreover, an orbital modulation also exists. the physical and chemical parameters of the system are consistent with the possibility of the formation of fullerenes. the spitzer space telescope detected an excess (3–8) µm emission from magnetic cvs, due to dust (howell et al., 2006; brinkworth et al., 2007). this is a strong incentive for observing ss cyg carefully with spitzer. however, a weak ir excess was discovered in ss cyg looking at the spitzer data, but no conclusions were given, since the data at the different wavelengths were not simultaneous (harrison et al., 2010).

table 1: comparison of the characteristic parameters of ei uma — a well established ip (reimer et al., 2008) — and ss cyg (giovannelli et al., 1983; lombardi et al., 1987; gaudenzi et al., 1990, 2002; giovannelli & martinez-pais, 1991; schreiber & gänsicke, 2002)

ei uma                                     ss cyg
porb = 6.434 h                             porb = 6.603 h
popt = 745 s (∼ pxmm)                      popt = 745 s (∼ pxmm uv)
0.81 m⊙ < mwd < 1.2 m⊙                     mwd = 0.97 m⊙
rwd = 7 × 10^8 cm                          rwd = 5 × 10^8 cm
mr = 0.81 m⊙                               mr = 0.56 m⊙
rr = 0.76 r⊙                               rr = 0.68 r⊙
lx ∼ 10 × 10^32 erg s^−1                   lx ∼ (6.6 − 9.8) × 10^32 erg s^−1
ṁ = 3.6 × 10^17 g s^−1                     ṁ ≃ (1 − 4) × 10^17 g s^−1
mv = 5.4                                   mv = 5.9
f = rd/a = 0.2 – 0.3                       f = rd/a = 0.2
ubv orbital modulations of continuum       orbital modulations of ubv & uv continuum & emission lines ews

3 the intermediate polar nature of ss cyg

in our opinion, there are several incontestable arguments in favour of the ip nature of ss cyg, namely:

i) the strong analogy of ss cyg with the well-established ip ei uma (reimer et al., 2008). table 1 shows the parameters of ei uma and ss cyg;

ii) in the diagram $\log L_X$ vs $\log \dot M$ for the ips (warner, 1996a), ss cyg lies just in the place of the ips (figure 1, left panel). the x-ray luminosity of ss cyg — $L_X \sim (6.6-9.8) \times 10^{32}$ erg s$^{-1}$ — has been derived from the distance of ss cyg, 166 pc (harrison et al., 1999), and its x-ray flux, $F_X \sim (2-3) \times 10^{-10}$ erg cm$^{-2}$ s$^{-1}$ (giovannelli & martinez-pais, 1991; mcgowan, priedhorsky & trudolyubov, 2004). the average mass accretion rate is $\dot M \sim 2 \times 10^{17}$ g s$^{-1}$ (gaudenzi et al., 1990, and the references therein; schreiber & gänsicke, 2002);

iii) in the diagram $\log \dot M$ vs $\log P_{orb}$ (warner, 1996a) ss cyg lies just in the place of the ips (fig. 1, right panel). the orbital period of ss cyg is $P_{orb} = 6.6$ h (martinez-pais et al., 1994);

iv) ss cyg has been detected in the hard x-ray range, together with tens of other very well known ips (landi et al., 2009; scaringi et al., 2010). such an emission cannot be justified without the presence of an intense white dwarf magnetic field. moreover, if ss cyg were a dwarf nova (a non-magnetic cv) — as reported in the table of detected cvs in the former papers — why are other dwarf novae not detected with the same instruments?

fig. 1: left panel: relationship between x-ray luminosity and mass accretion rate for ips, which are indicated with x (by courtesy of warner, 1996a). right panel: relationship between mass accretion rate and orbital period for ips (by courtesy of warner, 1996a). ss cyg positions are indicated with a red ⋆ in both panels

fig. 2: observed emission fluxes converted to luminosity for magnetic cvs (after howell et al., 1999): c ii in the left panel, and c iv in the right panel. ss cyg positions are indicated with a red ⋆. luminosity is expressed in erg s^−1, and b in units of 10^6 gauss

fig. 3: left panel: relationship between pspin and porb for ips (by courtesy of warner, 1996a). right panel: relationship between the magnetic moment of the white dwarfs in cvs and ṁ (by courtesy of warner, 1996a). polars are indicated with x, and ips with their abbreviated titles. ss cyg positions are indicated with a red ⋆. the magnetic moment µ is expressed in g cm^3, and ṁ in units of 10^16 g s^−1

rudolf gális et al. (2011, ibws), when discussing the x-ray and optical activity of integral cvs, explicitly mentioned ss cyg as an ip, comparing its x-ray behaviour with the very similar behaviour of v 709 cas, a well-established ip. then we can assume, beyond reasonable doubt, that ss cyg is an ip. what is the intensity of its magnetic field? a relationship between the strength of high-state uv emission lines and the strength of the white dwarf magnetic field has been found by howell et al. (1999). from the fluxes of uv emission lines (gaudenzi et al., 1986) we derived the luminosity of c ii and c iv: $L_{C\,II} \sim 7.8 \times 10^{30}$ erg s$^{-1}$ and $L_{C\,IV} \sim 6.2 \times 10^{31}$ erg s$^{-1}$, respectively. using these values of the luminosity and the extrapolation of the line best fitting the emission line luminosity of c ii and c iv from howell et al. (1999), the magnetic field intensity of ss cyg is 2.5 and 2.0 mg, respectively, as shown in fig. 2 (left and right panels). taking into account the errors in the fluxes of the considered uv lines, we can infer a reasonable value of the white dwarf magnetic field in ss cyg of $B \sim 2.2 \pm 1.0$ mg. moreover, v) a periodicity at 12.18 ± 0.01 min in the r and i bands was detected in ss cyg by bartolini et al. (1985),
(1985), probably the beat period between the spin period of the white dwarf and the orbital period of the system. if the rotation is direct with orbital motion pspin = 11.82±0.01 m, corresponding to 709.2±0.6 s. if the rotation is inverse pspin = 12.56 ± 0.01 m (753.6 ± 0.6 s). tramontana (2007) found a periodicity of 12.175 ± 0.539 m using 504 images obtained in the r band on october 26, 2006 with ss cygni in outburst at the tacor teaching telescope of la sapienza university. using 10 ks xmm-om uv ob15 acta polytechnica vol. 52 no. 1/2012 servations of ss cyg, braga (2009) found a period of 709 ± 1 s that should be the rotational period of the white dwarf, following the model of ips developed by warner (1986). using pspin = 709 s and the relationship pspin vs porb (warner, 1996a), ss cyg lies in the zone of the ips very close to the line of 1 m⊙ (figure 3, left panel). the mass of the ss cyg white dwarf is 0.97+0.14−0.05 m⊙ (giovannelli et al., 1983). finally, with b ≈ 2 mg, and the radius of the white dwarf rwd = 5 × 10 8 cm (martinez-pais et al., 1994), the magnetic moment is µ ≈ 2.5 × 1032 g cm3. using this value of µ and the average value of ṁ, ss cyg lies just in the ips region, as shown in the right panel of figure 3. therefore, we can definitively affirm that ss cyg is an intermediate polar. acknowledgement we are pleased to thank the organizers of the karlovy vary 8th integral/bart workshop for the invitation. one of us (fg) wishes to thank the loc for logistical support. this research has made use of nasa’s astrophysics data system. references [1] andronov, i. l., chinarova, l. l., han, w., kim, y., yoon, j.-n.: a & a, 2008, 486, 855. [2] bath, g. t., van paradijs, j.: nature, 1983, 305, 33. [3] bartolini, c., et al.: multifrequency behaviour of galactic accreting sources, f. giovannelli (ed.) edizioni scientifiche siderea, roma : 1985, p. 50. [4] bisikalo, d. v., et al.: astron. rep., 2008, 52, no. 4, 318. [5] braga, v. f.: thesis in physics, roma : 2009, university la sapienza. [6] brinkworth, c. s., et al: 15th european workshop on white dwarfs, r. napiwotzki, m. r. burleigh (eds.) aspconf. ser. 2007, 372, 333. [7] dubus, g. et al.: mnras, 2004, 349, 869. [8] fabbiano, g., et al.: apj, 1981, 243, 911. [9] galis, r., et al.: 2011, this workshop. [10] gaudenzi, s., giovannelli, f., lombardi, r., claudi, r.: new insights in astrophysics. eight years of uv astronomy with iue, esa-sp, 1986, 263, 455. [11] gaudenzi, s., giovannelli, f., lombardi, r., claudi, r.: aca, 1990, 40, 105. [12] gaudenzi, s., et al.: multifrequency behaviour of high energy cosmic sources, f. giovannelli, l. sabau-graziati (eds.), mem.s.a.it. 2002, 73, n. 1, 213. [13] gaudenzi, s., et al.: a & a, 2011, 525, 147. [14] giovannelli, f.: ssr, 1981, 30, 213. [15] giovannelli, f.: multifrequency behaviour of high energy cosmic sources, f. giovannelli, l. sabau-graziati (eds.) mem. s.a.it. 1996, 67, 401. [16] giovannelli, f.: chja & a, 2008, 8 suppl., 237. [17] giovannelli, f., gaudenzi, s., piccioni, a., rossi, c.: aca, 1983, 33, 319. [18] giovannelli, f., et al.: multifrequency behaviour of galactic accreting sources, f. giovannelli (ed.) edizioni scientifiche siderea, roma : 1985, p. 37. [19] giovannelli, f., martinez-pais, i. g.: space sci. rev., 1991, 56, 313. [20] giovannelli, f., sabau-graziati, l.: ultraviolet astrophysics beyond the iue final archive, w. wamsteker, r. gonzalez-riestra (eds.) esa sp, 1998, 413, 419. [21] gnedin, yu. n., et al.: astr. lett., 1995, 21, 132. [22] harrison, t. 
e., et al.: apjl, 1999, 515, l93. [23] harrison, t. e., et al.: apj, 2010, 710, 325. [24] howarth, i. d.: j. brit. astron. ass., 1978, 88, 458. [25] howell, s. b., et al.: aj, 1999, 117, 1 014. [26] howell, s. b., et al.: apjl, 2006, 646, l65. [27] ishida, m., et al.: pasj, 2009, 61, s77. [28] jameson, r. f., et al.: observatory, 1987, 107, 72. [29] kjurkchieva, d., marchev, d., og loza, w.: ap& ss, 1999, 262, 53. [30] körding, e., et al.: science, 2008, 320, 1 318. [31] landi, et al.: mnras, 2009, 392, 630. [32] lombardi, r., giovannelli, f., gaudenzi, s.: ap & ss, 1987, 130, 275. 16 acta polytechnica vol. 52 no. 1/2012 [33] long, k. s., et al.: apj, 2005, 630, 511. [34] marchev, d., kjurkchieva, d., og loza, w.: aca, 1999, 49, 585. [35] martinez-pais, i. g., giovannelli, f., rossi, c., gaudenzi, s.: a & a, 1994, 291, 455. [36] mattei, j. a., saladyga, m., waagen, e. o.: ss cygni light curves 1896–1985, aavso monograph 1, cambridge : ma, 1985. [37] mattei, j. a., waagen, e. o., foster, e. g.: ss cygni light curves 1985–1990, aavso monograph 1 supplement 1, cambridge : ma, 1991. [38] mattei, j. a., waagen, e. o., foster, e. g.: ss cygni light curves 1991–1995, aavso monograph 1 supplement 2, cambridge : ma, 1996. [39] mattei, j. a., menoli, h. g., waagen, e. o.: ss cygni light curves 1996–2000, aavso monograph 1 supplement 3, cambridge : ma, 2002. [40] mauche, c., et al.: apjl, 1996, 463, l87. [41] mauche, c., et al.: apj, 1997, 477, 832. [42] mcgowan, k. e., priedhorsky, w. c., trudolyubov, s. p.: apj, 2004, 601, 1 100. [43] mitrofanov, i. g.: sov. astron. lett., 1978, 4, 119. [44] nather, r. e., robinson, e. l.: apj, 1974, 190, 637. [45] nevo, i., sadeh, d.: mnras, 1978, 182, 595. [46] norton, a. j., beardmore, a. p., allan, a., hellier, c.: a & a, 1999, 347, 203. [47] ricketts, m. j., king, a. r., raine, d. j.: mnras, 1979, 186, 233. [48] ritter, h.: the astronomy and astrophysics encyclopedia. cambridge, uk : cambridge university press, 1992, p. 61. [49] okada, s., nakamura, r., ishida, m.: apj, 2008, 680, 695. [50] patterson, j.: apjs, 1981, 45, 517. [51] patterson, j.: apjs, 1984, 54, 443. [52] patterson, j.: pasp, 1994, 106, 209. [53] reimer, t. w., et al.: apj, 2008, 678, 376. [54] scaringi, s., et al.: mnras, 2010, 401, 2 207. [55] schreiber, m. r., gänsicke, b. t.: a & a, 2002, 382, 124. [56] schreiber, m. r., lasota, j.-p.: a & a, 2007, 473, 897. [57] schreiber, m. r., hameury, j.-m., lasota, j.-p.: a & a,, 2003, 410, 239. [58] smak, j.: multifrequency behaviour of galactic accreting sources, f. giovannelli (ed.) edizioni scientifiche siderea, roma : 1985, p. 17. [59] tramontana, v.: thesis in physics. roma: university la sapienza, 2007. [60] warner, b.: mnras, 1986, 219, 347. [61] warner, b.: cape workshop on magnetic cataclysmic variables, d. a. h. buckley, b. warner (eds.) asp conf. ser., 1995, 85, 3. [62] warner, b.: ap & ss, 1996, 241, 263. [63] warner, b.: cataclysmic variable stars, cambridge university press, 1996. [64] zuckerman, m.-c.: ann. astrophys., 1961, 6, 431. franco giovannelli inaf – istituto di astrofisica spaziale e fisica cosmica – roma area di ricerca di roma-2 via del fosso del cavaliere 100, i 00133 roma, italy lola sabau-graziati inta – dpt de programas espaciales y ciencias del espacio ctra de ajalvir km 4 – e 28850 torrejón de ardóz, spain 17 ap08_3.vp 1 introduction cancer of the urinary bladder is the fourth most common malignancy among males and one of the top eight cancers for women in industrial countries. 
according to the american cancer society, 68,810 new cases and 14,100 deaths are estimated in 2008 in the united states [1]. bladder cancer tends to occur most commonly in individuals over 60 years of age. the major risk factors are cigarette smoking and exposure to aromatic amines, used in the chemical and dye industry. bladder cancer can be diagnosed during a cystoscopy, in which an endoscope is introduced through the urethra into the bladder, which is filled with isotonic saline solution. malignant tissues of the bladder wall can then be removed with the use of endoscopic tools, e.g. a resectoscope cutting loop. in white light illumination, small and flat tumors, whose structures do not differ strongly from the surrounding tissue are difficult to recognize and could thus be overlooked during the therapy. to reduce this risk, the visualization of tumor tissue can be improved by a photodynamic diagnosis (pdd) system. this technology uses imaging with fluorescent light, which is activated by the marker substance 5-aminolaevulinic acid (5-ala), accumulated in malignant tissue. thus, the contrast between tumor and benign tissue is enhanced and permits easier differentiation, as illustrated in fig. 1. a common disadvantage during an endoscopy is the limited field of view of the endoscope. the physician can examine only a small part of the whole operating field at once. this causes difficulties in navigation, especially in hollow organs. instead, a panoramic image provides an overview of the whole region of interest and links images taken from different angles of view. this additional information facilitates visual control and navigation, especially during a cystoscopy, and can be documented in medical evidence protocols. we have therefore developed an image mosaicing algorithm, which stitches single images of a pdd bladder video sequence and finally provides an expanded panoramic image of the urinary bladder. this paper is organized as follows: in section 2 we discuss the panorama algorithm in detail. further optimizations are given in section 3. section 4 describes the results and perspectives of the algorithm. finally, section 5 summarizes the proposed approach of our image mosaicing algorithm for endoscopic pdd images. 2 image mosaicing algorithm the image mosaicing algorithm processes single endoscopic pdd images provided by a video sequence. first, in a preprocessing step we separate the relevant image information of the input images. then the sift features [4] of two images are detected and matched. to refine the feature point correspondences we adapt and apply the ransac algorithm [7] to reject outliers. subsequently we stitch the two images together and interpolate the overlapping region using a linear cross blending method. then we apply our algorithm iteratively to the next input images. finally a complete panoramic image of the bladder is built. 2.1 image acquisition during a cystoscopy the endoscopic images, showing the internal urinary bladder wall, are captured by a pdd video cystoscopy system. in this process the bladder wall is illuminated by a pdd light source. an external camera is attached at the tail end of the rigid cystoscope, as shown in fig. 2, and captures video images with a resolution of 720×576 pixels at a frame rate of 25 frames per second. the video frames are transmitted to the computer video system and are processed. 50 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 48 no. 
3/2008 creating panoramic images for bladder fluorescence endoscopy a. behrens the medical diagnostic analysis and therapy of urinary bladder cancer based on endoscopes are state of the art in urological medicine. due to the limited field of view of endoscopes, the physician can examine only a small part of the whole operating field at once. this constraint makes visual control and navigation difficult, especially in hollow organs. a panoramic image, covering a larger field of view, can overcome this difficulty. directly motivated by a physician we developed an image mosaicing algorithm for endoscopic bladder fluorescence video sequences. in this paper, we present an approach which is capable of stitching single endoscopic video images to a combined panoramic image. based on sift features we estimate a 2-d homography for each image pair, using an affine model and an iterative model-fitting algorithm. we then apply the stitching process and perform a mutual linear interpolation. our panoramic image results show a correct stitching and lead to a better overview and understanding of the operation field. keywords: image mosaicing, stitching, panoramic, bladder, endoscopy, cystoscopy, fluorescence, photo dynamic diagnosis, pdd. white light pdd fig. 1: papillary tumors in different illuminations 2.2. image preprocessing in a preprocessing step we subsample the images by a factor of four to reduce computational time and the resolution of the final panoramic image. then we separate the relevant image information within the elliptical shape from the surrounding dark image region of the input images (see fig. 3), using otsu’s thresholding method [2]. thus, we transform the rgb input image to a gray value image and calculate a binary mask, which represents the two classes elliptical and surrounding region. otsu’s algorithm is a thresholding method for separating two classes of pixels so that their between-class variance � b 2 is maximal. the optimal threshold �t is then determined by � �b t t bt t 2 0 2( ) max ( )� � � � (1) with � � � � �b t t t t t 2 1 2 1 2 2( ) ( ) ( ) ( ( ) ( ))� � (2) and the two class probabilities �1, �2 and their mean levels �1, �2. after a further eroding operation we extract the elliptical region using the binary mask, as shown in fig. 3. 2.3 feature detection common stitching algorithms are based on registrations methods, which can be categorized into pixel-based alignment, feature-based methods, and global registration [3]. in this project we have chosen a feature-based method, since the pdd bladder images generally show a high-contrast vascularization structure. furthermore it allows a fast stitching process, since only single feature points have to be matched instead of large pixel blocks. according to the situation, these image structures can also vary in scale during the video sequence, e.g. caused by a zoom, we use distinctive scale invariant keypoints, calculated by the scale invariant feature transform (sift) [4]. sift features are located by detecting the local extrema of a difference-of-gaussian (dog) function. this closely approximates the scale-normalized laplacian of gaussian (log), which is introduced by lindeberg [5] for effective scale-invariant feature detection. mikolajczyk [6] showed that this extrema detection generally provides the most stable image features, compared to a range of other possible image functions, such as the gradient, hessian, or harris corner function. 
the relationship between dog and log can be understood from the heat diffusion equation: � �� � g g� � 2 , (3) � � �� � � � � � � � � � 2g g g x y k g x y k ( , , ) ( , , ) (4) � � � �g x y k g x y k g dog log ( , , ) ( , , ) ( )� � � � ���� ���� ��� 1 2 2 . (5) eq. (5) shows that the dog function with a constant scale difference k approximates the �2 scale-normalized log multiplied by a constant factor (k � 1), which does not influence the extrema detection. after the features are localized by a maximum operation over space and scale, a further refinement step is applied. low contrast points and keypoints along edges are rejected, since they are unstable to small amounts of noise. thus, the ratio of the eigenvalues of the 2×2 hessian matrix h � � � � � d d d d xx xy xy yy , (6) based on the second derivations of the dog image d(x, y, �) is calculated and compared to a threshold r. keypoints with a large ratio, which means having a large principal curvature across the edge but a small one in the perpendicular direction, are thus rejected. a result of feature detection is shown in fig. 4. 2.4. feature matching subsequent to the feature localization, described in the preceding section, the feature points of two images are matched. a robust matching algorithm requires distinctive keypoints. so we calculate for each feature point a rotation © czech technical university publishing house http://ctn.cvut.cz/ap/ 51 acta polytechnica vol. 48 no. 3/2008 fig. 2: principle setup of a pdd video cystoscopy system, showing a cystoscope introduced into the urinary bladder with its limited field of view (fov). the fluorescence pdd light source provides the illumination and a camera at the tail of the rigid cystoscope transmits the captured image data to a computer video system. fig. 3: left: original input image, right: binary mask (transparent overlay) fig. 4: sift features located in image one (left) and image two (right) invariant sift descriptor [4]. the sift descriptor takes into account the local neighborhood of the gaussian smoothed image l x y( , ) by calculating the gradient magnitude m x y( , ) and orientation �( , )x y , accumulated in a histogram, as illustrated in fig. 5, by � � � �m x y l x y l x y l x y l x y( , ) ( , ) ( , ) ( , ) ( , )� � � � � � � �1 1 1 12 2 (7) �( , ) arctan ( , ) ( , ) ( , ) ( , ) x y l x y l x y l x y l x y � � � � � � � � � � 1 1 1 1� � � ��. (8) the use of 8 orientations and a 4×4 array of histograms leads to a distinctive 128 element feature vector. after each feature point is described by its local descriptor, we apply the matching process using the minimum euclidean distance measurement � � �d d dl i i lmin 2. (9) in this process one local descriptor dl of the first image is compared to all feature descriptors di of the second image within the 128 dimensional feature space, and vice versa. the minimum squared error then leads to the best corresponding point �dl . afterwards, the next local descriptor � dl �1 is selected and matched. this procedure is repeated until all feature descriptors are processed. fig. 6 shows the resulting point correspondences of the two sequential images. 2.5 homography estimation in the next step we determine an image-to-image transform, so called 2-d homography, based on the final point correspondences. since the images of the video sequence describe a non-rigid camera movement, we choose an affine model for the homography. 
an affine transform provides six degrees of freedom, parametrized by a translation vector � t t tt x y� ( , ), rotation angle �, scales sx and sy, and a shear parameter a. in homogeneous coordinates the homography matrix m can bewritten as m a � � � � � � � � � � � � � � a b c d e f t t 0 0 1 0 1 � � (10) with a � � � � � � � � � �� � � �� �1 0 1 0 0 a s s x y cos( ) sin( ) sin( ) cos( � � � �) � � � � � �. (11) although the matching process shows a good matching result (see fig. 6), some matching errors can occur, due to the noise and blur of the pdd images. since matching errors have a high impact on the image transformation, we have to reject these outliers to perform a robust homography estimation. therefore we employ the ransac (random sample consensus) model fitting algorithm [7], which rejects outliers of the matching results during an iterative process. with the adaption of the fitting model to our 2-d affine model, ransac carries out the following steps: 1. select a set of p � 4 point matches randomly (four point correspondences are required to determine the affine model), 2. validate the points of being not collinear. if they are collinear, go back to step 1, 3. calculate the affine 2-d homography between image two and one, 4. apply the transform to each feature point of image two and the inverse transform to image one, respectively. count the number of points (so-called inliers), which lie within a spatial error tolerance of �max to the related reference points, 5. if the number of inliers is greater than the best previous score, save the homography matrix. 6. if the maximum number nmax of trials is reached, terminate the algorithm. otherwise go back to step 1. the rejected point correspondences of fig. 6 performed by the iterative ransac algorithm are labeled black in fig. 7. eventually the affine homography matrix between image two and one is determined by the greatest number of inliers. 2.6 stitching and blending now we apply the estimated homography to image two and combine image one and two. if a direct composition 52 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 48 no. 3/2008 fig. 5: 2×2 keypoint descriptor array representing the orientation histogram of an 8×8 set of samples (principle sketch) fig. 6: matched point correspondences of image one and image two fig. 7: rejected outlier points (labeled black) performed by the ransac algorithm method is used, visual artifacts like edges and signal discontinuities occur in the combined image, as shown in fig. 8. this effect results from the inhomogeneous illumination of the bladder wall during the cystoscopy. the illumination intensity decreases from the middle of the image to the borders. to overcome this problem, we apply a cross blending interpolation, suggested by wald et al. [8], during the composition process. the method performs an interpolation in the overlapping region based on a linear mutual weight distribution, as illustrated in fig. 9. due to to these weight functions, pixels with low illumination at the image borders have less impact on the interpolation than pixels in the image center. this approach reduces the visual artifacts significantly, as can be seen in fig. 10. finally, the two images are stitched and a combined image is built. then we apply the whole image mosaicing algorithm iteratively to the subsequent images, until all frames of the video sequence are processed. 
3 optimization we implement the image mosaicing algorithm firstly in an offline matlab program code. to reduce the computational time we apply some further improvements to the algorithm. since the camera movement along the video sequence is smooth and slow, we dynamically increase the processing frame step size and decrease the overlapping region of the images, respectively. if a sufficient number of reliable matching points is still found, the expansion of the panoramic image will perform faster. otherwise not enough matching points are found due to low contrast of the image structures or too small image overlap, and the matching process fails. in that case we successively decrease the processing frame step size of the subsequent images, until the matching succeeds. 4 results and perspectives after all images of the video sequence have been processed by our image mosaicing algorithm, the final panoramic image is built. fig. 11 represents a panoramic image of an original endoscopic pdd video sequence of 580 frames. the panorama shows a section of the left part of the urinary bladder, stitched by the single video frames. fig. 11 show that the papillary tumors are located on the upper left bladder wall related to the left urethral orifice. the spatial relation between adjacent images can now be directly accessed by the panoramic image, since this information is only given implicitly by the camera movement along the video sequence in © czech technical university publishing house http://ctn.cvut.cz/ap/ 53 acta polytechnica vol. 48 no. 3/2008 fig. 8: direct image compositions without the blending algorithm fig. 9: cross blending interpolation with a mutual weight distribution in the overlapping region of image i and ii fig. 10: image compositions applied with a linear cross blending interpolation fig. 11: panoramic image built from an original endoscopic pdd video sequence of 580 frames time. in addition, the panoramic image can be documented in medical evidence records to supplement the textual descriptions of the tumor positions in the bladder. this will lead to a better and more intuitive understanding, and can be used for follow-up cystoscopic examinations. since the computation time for each image pair takes several seconds, real-time image mosaicing is not supported yet. a software implementation in c�� should improve the performance, and will be applied in future work. further clinical tests and evaluations also have to be performed. nevertheless physicians’ first comments have indicated that offline results can already provide a high clinical benefit. 5 summary in this paper we have developed an image mosaicing algorithm for bladder video sequences in fluorescence endoscopy. first, we extracted the relevant image information and applied a feature-based algorithm to get robust and distinctive image feature points for each image pair. based on our affine parameter model we used an iterative optimization algorithm to estimate the best image-to-image transform according to a mean squared error measurement. then we described how visual artifacts, caused by inhomogeneous illumination, could be compensated during the stitching process by a mutual linear interpolation function. the results of our iterative image mosaicing algorithm were discussed and illustrated by a panoramic image of an original bladder endoscopic video sequence. finally we described some optimization steps and perspectives. acknowledgments i would like to thank dr. 
med andreas auge, waldklinikum gera, germany and olympus & winter ibe gmbh for technical support, providing the video sequences and giving preliminary feedback for this project. i would also like to thank prof. dr.-ing. til aach, institute of imaging and computer vision, rwth aachen university, germany, who supervised this project. references [1] american cancer society®, cancer facts and figures 2008 [online]. available: http://www.cancer.org/downoads/stt/ 2008cafffinalsecured.pdf [2] otsu, n.: a threshold selection method from gray-level histograms, ieee trans. syst. man cybern., vol. 9 (1979), p. 62–66. [3] szeliski, r.: image alignment and stitching: a tutorial, technical report msr-tr-2004-92, microsoft research, 2006. [4] lowe, d.: distinctive image features from scale-invariant keypoints, int. journal of computer vision, vol. 60 (2004), no. 2, p. 91–110. [5] lindeberg, t.: scale-space theory: a basic tool for analysing structures at different scales. journal of applied statistics, vol. 21 (1994), no. 2, p. 224–270. [6] mikolajczyk, k.: detection of local features invariant to affines transformations. ph.d. thesis, institut national polytechnique de grenoble, france, 2002. [7] fischler, m. a., bolles, r. c.: random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. communications of the acm, vol. 24 (1981), no. 6, p. 381–395. [8] wald, d., reeff, m., székely, g., cattin, p., paulus, d.: fliessende überblendung von endoskopiebildern für die erst ellung eines mosaiks. bildverarbeitung für die medizin, mar. 2005, p. 287–291. alexander behrens e-mail: alexander.behrens@lfb.rwth-aachen.de institute of imaging & computer vision rwth aachen university d-52056 aachen, germany 54 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 48 no. 3/2008 acta polytechnica vol. 51 no. 4/2011 singularities in structural optimization of the ziegler pendulum o. n. kirillov abstract structural optimization of non-conservative systems with respect to stability criteria is a research area with important applications influid-structure interactions, friction-induced instabilities, andcivil engineering. incontrast tooptimization of conservative systems where rigorously proven optimal solutions in buckling problems have been found, for nonconservative optimization problems only numerically optimized designs have been reported. the proof of optimality in non-conservative optimization problems is a mathematical challenge related to multiple eigenvalues, singularities in the stability domain, and non-convexity of the merit functional. we present here a study of optimal mass distribution in a classical ziegler pendulum where local and global extrema can be found explicitly. in particular, for the undamped case, the two maxima of the critical flutter load correspond to a vanishing mass either in a joint or at the free end of the pendulum; in the minimum, the ratio of the masses is equal to the ratio of the stiffness coefficients. the role of the singularities on the stability boundary in the optimization is highlighted, and an extension to the damped case as well as to the case of higher degrees of freedom is discussed. keywords: circulatory system, structural optimization, ziegler pendulum, beck column, flutter, divergence, damping, whitney umbrella. 
1 introduction structural optimization of conservative and non-conservative systems with respect to stability criteria is a rapidly growing research area with important applications in industry [1–3]. optimization of conservative elastic systems such as the problem of the optimal shape of a column against buckling is already non-trivial, because some optimal solutions could be multi-modal and thus correspond to a multiple semi-simple eigenvalue which creates a conical singularity of the merit functional [3]. despite these complications, a number of rigorous optimal solutions are known in conservative structural optimization. nevertheless, the increase in the critical divergence load given by the optimal design in such problems is usually not very large in comparison with the initial design [2, 3]. in contrast to conservative systems, non-conservative systems can loose stability both by divergence and by flutter. it is known that mass and stiffness modification can increase the critical flutter load by hundreds percent, which is an order of magnitude higher than typical gains achieved in optimization of conservative systems [4–18]. for example, ringertz [9] reported an 838 % increase of the critical flutter load for the beck column [19], from 20.05 for a uniform design to 188.1 for an optimized shape. recently, temis and fedorov [17] found for a free-free beam moving under the follower thrust an optimal design with a critical flutter load that exceeds the load for a uniform beam by 823 %. we note that although the very notion of the follower forces was debated in [20–22], the beck column [19] as well as its discrete analogues [23–25] including the ziegler pendulum [26], remain popular models for investigating mode-coupling instabilities in non-conservative systems and related optimization problems. in both conservative and non-conservative problems of structural optimization of slender structures, their optimal or optimized shapes often possess places with small or even vanishing cross-sections. the known optimized shapes of the beck column or of a free-free rod moving under follower thrust have an almost vanishing cross-section, e.g., at the free end, which means vanishing mass of a finite element in the corresponding discretization [9–12, 17, 18]. another intriguing feature of optimizing non-conservative systems is the ‘wandering’ critical frequency at the optimal critical load. during optimization the eigenvalue branches experience numerous mutual overlappings and veerings [4, 6, 7, 27–29] with a tendency for the critical frequency to increase and to correspond to higher modes [9, 14–17, 30]. this puzzling behavior of the critical frequency still awaits an explanation. in some problems, such as the optimal placement of the point mass along a uniform free-free rod moving under the follower thrust [5,8], the local maxima were found to correspond to singularities of the flutter boundary such as cuspidal points where multiple eigenvalues with the jordan block exist [11, 12]. in order for the last phenomenon to happen, we need at least three modes [11, 12], meaning that two-mode approximations [5, 8] 32 acta polytechnica vol. 51 no. 4/2011 are unable to detect such optima. this reflects a general question on model reduction and the validity of lowdimensional approximations in non-conservative problems, already discussed by bolotin [30] and gasparini et al. [23] and recently raised again in the context of friction-induced vibrations by butlin and woodhouse [31]. 
the above-mentioned phenomena make rigorous proofs of optimality in non-conservative optimization problems substantially more difficult than in conservative ones. to the best of our knowledge, no rigorously proven optimal solutions in optimization problems for distributed circulatory systems have been found. although the situation is not much better in the finite-dimensional case, it seems reasonable to try to understand the nature of the observed difficulties of optimization on final dimensional non-conservative systems that depend on a finite number of control parameters. let us consider a circulatory system mẍ + kx = 0, (1) where dot indicates time differentiation, m is a real symmetric m×m mass matrix and k is a real non-symmetric m × m matrix of positional forces that include both potential and non-potential (circulatory) forces. equation (1) typically originates after linearization and discretization of stability problems for structures under follower loads, in problems of friction-induced vibrations, and even in rotor dynamics when the damping is not taken into account [30, 28, 32]. the characteristic equation for the circulatory system (1) is given by the modified leverrier algorithm [33]. in the case when m = 2, it reads detmλ4 + (trmtrk− tr(mk))λ2 + detk = 0, (2) where λ is an eigenvalue that determines the stability of the trivial solution. the coefficients of matrix k usually contain the loading parameter, say p, that we need to increase by varying the coefficients of, e.g., the mass matrix. during this optimization process some masses can come close to zero, so that the mass matrix can degenerate yielding detm = 0. as a consequence, some eigenvalues λ can increase substantially. however, such a singular perturbation of the characteristic polynomial may cause large values of the gradient of the critical load with respect to the mass distribution. we see that a problem of optimal mass distribution in a finite-dimensional circulatory system (1) in order to increase the critical flutter load, looks promising for explaining the peculiarities of optimizing distributed non-conservative structures. however, it makes sense to tackle first not the most general system (1) but rather a particular finite-dimensional nonconservative system with m degrees of freedom. in this paper, we propose to take an m-link ziegler pendulum [23–26] as a toy model for an investigation of the optimal mass and stiffness distributions that give an extremum to the critical flutter load. it appears that even the classical two-link ziegler pendulum has rarely been studied from this point of view, in contrast to its continuous analogue — the beck column. the only example of such a study known to the author is contained in the book by gajewski and zyczkowski [1]. this paper is organized as follows. in the next section, we first consider optimization of an undamped two-link ziegler pendulum. we find an explicit expression for the critical flutter load as a function of the mass or stiffness distribution, and demonstrate that in the space of the two mass coefficients and the flutter load as well as in the space of the two stiffness coefficients and the flutter load, the critical flutter load forms a surface with a self-intersection and with the whitney umbrella singular point. 
we consider the problem of optimal mass redistribution and find that the only two maxima of the critical flutter load correspond to a vanishing mass either in a joint or at the free end of the pendulum; in the only minimum, the ratio of the masses is equal to the ratio of the stiffness coefficients. the maxima are attained at the singular cuspidal points of the stability domain and are characterized by a dramatic increase in the critical frequency of the vibrations. then, we write down the equations of motion of an m-link undamped ziegler pendulum and consider the case m = 3, in which we again find that the optimal mass distributions maximizing the critical flutter load correspond to vanishing of some of the three point masses. other types of local extrema are also found that correspond to points where the flutter boundary has a vertical tangent, such as cuspidal points with triple eigenvalues, cf. [11, 12], or points where the flutter boundary experiences a sharp turn, cf. [27–29]. finally we consider the problem of optimal mass distribution for a two-link ziegler pendulum with dissipation. in conclusion, we formulate an optimization problem for an m-link ziegler pendulum and discuss some hypotheses on plausible optimal solutions and their expected properties. 2 structural optimization of the ziegler pendulum let us consider the classical ziegler pendulum consisting of two light and rigid rods of equal length l. the pendulum is attached to a firm basement by a viscoelastic revolute joint with stiffness coefficient c1 and damping 33 acta polytechnica vol. 51 no. 4/2011 fig. 1: undamped 2-link ziegler pendulum. (a) the critical load p(c1, c2) as a function of the stiffness coefficients forms a conical surface in the (c1, c2, p)-space (the casewhen m1 = m2 =1and l =1 is shown); (b)the critical load p(m1, m2) as a function of the masses, forms a self-intersecting surface with the whitney umbrella singularity at point (0,0,2) of the (m1, m2, p)-space (the case when c1 = c2 =1 is shown) coefficient d1. another viscoelastic revolute joint with stiffness coefficient c2 and damping coefficient d2 connects the two rods [26, 32]. the point masses m1 and m2 are located at the second revolute joint and at the free end of the second rod, respectively. the second rod is subjected to a tangential follower load p [26, 32]. 2.1 undamped case small deviations from the vertical equilibrium for an undamped ziegler pendulum are described by equation (1) with mass and stiffness matrices that have the following form [26, 32] m = l2 ( m1 + m2 m2 m2 m2 ) , k = ( c1 + c2 − p l p l − c2 −c2 c2 ) , (3) where x = (θ1, θ2) t is the vector consisting of small angle deviations from the vertical equilibrium position. calculating the characteristic equation det(mλ2 +k) = 0 for the ziegler pendulum without dissipation, we find m1m2l 4λ4 + (m1c2 + 4m2c2 + c1m2 − 2p lm2)l2λ2 + c1c2 = 0. (4) by direct calculation of the roots of the characteristic equation (4) or by using the gallina criterion [24], we find a critical surface that separates the flutter instability and the marginal stability domains (2lm2p − 4m2c2 − m1c1 − m2c1)2 + (m1c1 − m2c2)2 = (m1c1 + m2c2)2. (5) equation (5) defines a conical surface in the (c1, c2, p )-space when m1,2 and l are fixed: flutter inside the cone, figure 1(a). however, in the (m1, m2, p )-space the critical surface (5) looks different and has the form of a self-intersecting surface with the whitney umbrella singularity at the (0, 0, 2c2/l)-point. 
indeed, expressing the critical load p from (5) we get p = 4m2c2 + (√ m1c2 ± √ m2c1 )2 2lm2 ≥ 2c2 l . (6) defining the new critical flutter load as p := p l/c2 we come to the more symmetric expression p = 2 + 1 2 (√ m1 m2 ± √ c1 c2 )2 ≥ 2. (7) 34 acta polytechnica vol. 51 no. 4/2011 fig. 2: undamped2-link ziegler pendulumwith c1 =1, c2 =1: the critical flutter load p(α) as a function of the azimuth angle α indicating the direction in the (m1, m2)-plane. point a is an absolute minimum of the flutter load: pa = 2, point b corresponds to the local maximum: pb =2+ c1/(2c2) with m1 =0, and the absolute maximum corresponds to point c (not shown) with pc =+∞ and m2 =0. point z corresponds to ziegler’s original design: m1/m2 =2 the case when c1 = c2 = 1, m1 = 2 and m2 = 1 corresponds to the classical result of ziegler [26] p = 7 2 ± √ 2. (8) the lower value of the critical load corresponds to the boundary between marginal stability and flutter, while the higher critical load corresponds to the conventional transition from flutter to divergence through the double zero eigenvalue with the jordan block. the critical load (7) as a function of the masses p = p(m1, m2) is plotted in figure 1(b). it is seen that the stability boundary has a self-intersection along a ray of the p-axis that starts at the whitney umbrella singularity with the coordinates (0, 0, 2) in the (m1, m2, p)-space. indeed, for small absolute values of m1c2 − m2c1 we can expand the critical load in a series p = 2 + (m1c2 − m2c1)2 8c1c2m22 + o((m1c2 − m2c1)3), (9) which gives an approximation to the flutter boundary having a canonical form for the whitney umbrella z = x2/y 2. due to the symmetry of the expression (7), the critical load as a function of the stiffness coefficients, p = p(c1, c2), forms the identical surface in the (c1, c2, p)-space. in the following we will study the function p = p(m1, m2). according to inequality (7), the critical load is always not less than p0 = 2. the minimum is reached when the masses satisfy the constraint m1c2 = m2c1. (10) note that the equal stiffness coefficients c1 = c2 imply equal masses m1 = m2. this situation corresponds to uniformly distributed mass and stiffness in continuous systems such as the beck column [19]. in structural optimization problems the uniformly distributed stiffness and mass are usually considered as the initial design that is the starting point in optimization procedures. the critical load of the optimized structure is conventionally compared to that of the same structure with uniform mass and stiffness distributions [4–10,13–18]. since p(m1, m2) is a ruled surface and thus p effectively depends on the mass ratio only, it is convenient to introduce the azimuth angle α by assuming m1 = cos α and m2 = sin α and to plot the critical load as a function of α. in figure 2, the curves p = p(α) bound the flutter domain shown in light gray. when α tends to zero, which corresponds to the vanishing mass m2, the critical load increases to infinity. when α tends to π 2 35 acta polytechnica vol. 51 no. 4/2011 and, correspondingly, mass m1 is vanishing, then the critical flutter load increases to the value pb = 2 + 1 2 c1 c2 . (11) at point b in the stability diagram of figure 2 the flutter boundary has a vertical tangent, which is a typical phenomenon in non-conservative optimization [27–29]. 
the lower part of the flutter boundary corresponds to designs with a complex conjugate pair of pure imaginary double eigenvalues with the jordan block; the upper part is the designs with two real double eigenvalues of the same magnitude and different sign, each with the jordan block. above the upper flutter boundary lies the domain of statical instability or divergence, with two unstable modes corresponding to two different positive real eigenvalues. at point b, the flutter boundary is tangent to the vertical part of the divergence boundary. to the right of this vertical line, there is a pair of pure imaginary eigenvalues and a pair of real eigenvalues with the same magnitude and different signs. the transition from the stability boundary to the divergence boundary below point b occurs when a pair of pure imaginary eigenvalues goes out of the origin in the complex plane to infinity, merges there and returns back along the real axis. this happens because at the boundary m1 = 0, i.e. detm = 0. a similar divergence boundary exists at α = 0, which corresponds to m2 = 0. transition through the vertical line above point b is accompanied by another eigenvalue bifurcation at infinity: two real eigenvalues of the same magnitude and different signs go out of the origin in the complex plane in order to merge at infinity and then come back along the imaginary axis. the very point b corresponds to an antagonist of a quadruple zero eigenvalue with the jordan block, i.e. to a quadruplet of complex eigenvalues that merge at infinity in the complex plane. to summarize, the popular initial design corresponding to uniformly distributed mass and stiffness turns out to give an absolute minimum to the critical flutter load of the ziegler pendulum. the critical flutter load attains its local maximum, pb , for m1 = 0 at the singular cuspidal point b of the stability boundary where the flutter domain has a vertical tangent and touches the boundary of the divergence domain. note that in [11] a local extremum of the flutter load for the free-free beam carrying a point mass was also found to be at the cuspidal point on the flutter boundary. the global maximum of the critical flutter load for the undamped ziegler pendulum is at infinity when m2 = 0. the global maximum corresponds to a vanishing mass at the free end of the column, which is qualitatively in agreement with the numerically found optimized designs of the beck column available in the literature [9,13–18]. indeed, all known optimized designs of the beck column are characterized by the vanishing cross-sections at the free end. moreover, the gradients of the critical flutter load with respect to the mass or stiffness distribution of the beck column are large, which is, again, in qualitative agreement with our stability diagram of figure 2. most interestingly, with the increase in the critical flutter load the higher and higher modes were reported to be involved into the coupling, which indicates the onset of flutter [9, 13–18]. our simple model shows that this phenomenon seems to be natural for the optimal design that causes the degeneracy in the mass matrix that gives rise to the critical frequency that increases without bounds. 2.2 the m-link ziegler pendulum it would be very desirable to extend our study to the case of the multiple-degrees-of-freedom ziegler pendulum. the corresponding models and recursive schemes for deriving the equations of motion were proposed by gasparini et al. [23] and by lobas [25]. 
a three-link ziegler pendulum was considered by gallina [24]. we take the linearized equations of lobas [25], since their model deals with the arbitrary masses and stiffnesses in the joints of an m-link ziegler pendulum, in contrast to the models of gasparini and gallina [23,24]. the mass and stiffness matrices of the ziegler-lobas model look like m = l2 ⎛ ⎜⎜⎜⎜⎜⎜⎜⎜⎜⎜⎜⎜⎜⎜⎜⎜⎝ m∑ i=1 mi m∑ i=2 mi · · · m∑ i=m−1 mi m∑ i=m mi m∑ i=2 mi m∑ i=2 mi · · · m∑ i=m−1 mi m∑ i=m mi · · · · · · · · · · · · · · · m∑ i=m−1 mi m∑ i=m−1 mi · · · m∑ i=m−1 mi m∑ i=m mi m∑ i=m mi m∑ i=m mi · · · m∑ i=m mi m∑ i=m mi ⎞ ⎟⎟⎟⎟⎟⎟⎟⎟⎟⎟⎟⎟⎟⎟⎟⎟⎠ , (12) 36 acta polytechnica vol. 51 no. 4/2011 fig. 3: undamped 3-link ziegler pendulum with l = 1, c1 = c2 = c3 = 1: the critical flutter load p(α) as a function of the azimuth angle α indicating the direction in the (m2, m3)-plane at the fixed radial distance r =1 for (a) m1 =0, (b) m1 =10, and (c) m1 =200 and for the fixed m1 =5 and (d) r =1, (e) r =0.65 and (f) r =0.1 k = ⎛ ⎜⎜⎜⎜⎜⎜⎜⎝ c1 + c2 − p l −c2 · · · 0 p l −c2 c2 + c3 − p l · · · 0 p l · · · · · · · · · · · · · · · 0 0 · · · cm−1 + cm − p l −cm−1 + p l 0 0 · · · −cm−1 cm ⎞ ⎟⎟⎟⎟⎟⎟⎟⎠ . (13) for m = 2, matrices (12) and (13) are reduced to (3). note that detm = l2m m∏ i=1 mi. (14) 37 acta polytechnica vol. 51 no. 4/2011 let us consider the ziegler-lobas pendulum with m = 3 links. now the mass at the free end is m3, while the masses m2 and m1 are located at the third and second joints, respectively. the first joint connects the first rod with the basement. the length of each of the three rods is equal to l. the stiffness coefficients of the joints are c3, c2, and c1, respectively. for simplicity we assume that c1 = c2 = c3 = c. then, the characteristic polynomial has the form a0λ 6 + a1λ 4 + a2λ 2 + a3 = 0, (15) with the coefficients a0 = l 6m1m2m3, a1 = cl 4(6m2m3 + 5m1m3 + m1m2) − 2l5p m3(m1 + m2), a2 = 3p 2l4m3 − 2(7m3 + m2)p l3c + (m1 + 5m2 + 14m3)l2c2, a3 = c 3. (16) composing the discriminant matrix [24] s = ⎛ ⎜⎜⎜⎜⎜⎜⎜⎜⎜⎝ a0 a1 a2 a3 0 0 0 3a0 2a1 a2 0 0 0 a0 a1 a2 a3 0 0 0 3a0 2a1 a2 0 0 0 a0 a1 a2 a3 0 0 0 3a0 2a1 a2 ⎞ ⎟⎟⎟⎟⎟⎟⎟⎟⎟⎠ (17) and calculating the discriminant sequence consisting of the determinants of the three main minors of even order, we find that δ1 = 3l 12m21m 2 2m 2 3 > 0. the expressions for determinants δ2 and δ3 = dets are rather involved, and for this reason we omit them here. however, numerical experiments evidence that the stability boundary is given by the equation δ3 = 0 for the stability condition δ3 > 0 implies δ2 > 0. in figure 3 using the inequality δ3 > 0 we present the stability diagrams in the (α, p )-plane, where the azimuth angle α in the (m2, m3)-plane is introduced by assuming m2 = r cos α and m3 = r sin α. we assume equal lengths of the links l = 1 and equal stiffness coefficients c1 = c2 = c3 = 1 and vary the radial distance r in the (m2, m3)-plane and the mass m1. since for m = 3 the critical surface p (m2, m3) is no longer a ruled surface as it was in the case m = 2, the pictures in the (α, p )-plane change with the variation of the radial distance r, which complicates the optimization problem. nevertheless, such diagrams are convenient for analyzing the geometry of the stability boundary and thus for identifying the potential extrema. moreover, the critical surface p (m2, m3) has a self-intersection along a ray of the p -axis that starts from the singularity whitney umbrella, as in the case of the 2-link pendulum. 
therefore, at small values of m2 and m3 the critical load can be locally approximated by a ruled surface. in the left column of figure 3 the radial distance r in the (m2, m3)-plane is fixed to r = 1 while the mass m1 is increasing. as in the case m = 2, (marginal) stability is possible for α ∈ [0, π/2]. for m1 = 0, two finite maximal values pa and pb are identified at α = 0 (m3 = 0) and α = π/2 (m2 = 0), respectively, figure 3(a). both maxima are attained at the cuspidal points of the stability boundary, where it has vertical tangents. however, the stability diagram changes when m1 = 10, figure 3(b). again, local extrema exist at the boundary points α = 0 (m3 = 0) and α = π/2 (m2 = 0), while the global maximum is at the point of the sharp turn of the flutter boundary with the vertical tangent near the cuspidal point c1 � (0.040 347 7, 11.961 144), which corresponds to triple pure imaginary eigenvalues λ � ±i1.163 524 3 with the jordan block of the third order, cf. [11, 12] where at such a singular point a maximum of the critical flutter load was found for a free-free beam under the follower thrust. with a further increase in the first mass up to m1 = 200, the stability diagram converges to that similar to the diagram of the two-link pendulum, cf. figure 2 and figure 3(c). this is not surprising because big inertia of the first joint makes the three-link pendulum effectively a two-link pendulum. the right column in figure 3 corresponds to the fixed first mass m1 = 5 and varied radial distance r in the (m2, m3)-plane. small values of r correspond effectively to a two-link pendulum. that is why figure 3(f) with r = 0.1 looks similar to figure 3(c) and figure 2. an increase in r is accompanied by a complication of the stability diagram. in particular, two cusp point singularities originate corresponding to triple pure imaginary eigenvalues with the jordan block, figure 3(d,e). 38 acta polytechnica vol. 51 no. 4/2011 near these singularities, the stability boundary experiences a sharp turn at points b1 and b2, where the tangent to the boundary is vertical. at such points the eigencurves imλ(p ) form a crossing that can change either to avoided crossing or to overlapping with the origination of a bubble of complex eigenvalues in dependence on the direction of variation of azimuth angle α. this phenomenon has been observed in many numerical studies of non-conservative optimization problems [4, 6, 7, 9, 10, 13, 14, 17, 18] and was described analytically by kirillov and seyranian in [27, 28]. moreover, the critical load at these points can jump to a higher value corresponding to the merging of other (often higher) modes, and is thus undefined [4, 6, 7]. nevertheless, these points could be local extrema of the merit functional, see [29] where the necessary conditions for that were derived. we stress that due to the finite number of degrees of freedom and the finite number of control parameters the stability boundary of an m-link ziegler pendulum can be thoroughly analyzed both analytically and numerically. in particular, the coordinates of the singular points can be calculated with arbitrary precision and thus the issues of high sensitivity of the critical flutter load at the optima could be successfully resolved in this model, unlike the optimization problems for distributed systems. the possibility to work with singularities related to coalescence of more than two eigenvalues allows us to make a qualitative investigation of the question ‘should low-order models be believed?’ [31]. 
although two-mode approximations work well far from such singularities, in their vicinity we have to take into account higher modes. 2.3 damped case in the presence of dissipation, the equation of the ziegler pendulum is mẍ + dẋ + kx = 0 (18) with the damping matrix [25, 26] d = ⎛ ⎜⎜⎜⎜⎜⎜⎜⎝ d1 + d2 −d2 · · · 0 0 −d2 d2 + d3 · · · 0 0 · · · · · · · · · · · · · · · 0 0 · · · dm−1 + dm −dm−1 0 0 · · · −dm−1 dm ⎞ ⎟⎟⎟⎟⎟⎟⎟⎠ . (19) for the two-link damped ziegler pendulum with m = 2, we find the characteristic equation a0λ 4 + a1λ 3 + a2λ 2 + a3λ + a4 = 0. (20) with the coefficients a0 = l 4m1m2, a1 = l 2(m1d2 + d1m2 + 4m2d2), a2 = d1d2 + m1l 2c2 + 4m2l 2c2 + c1m2l 2 − 2p l3m2, a3 = d1c2 + c1d2, a4 = c1c2. (21) applying the routh-hurwitz criterion, we find the critical flutter load p = 4m22(d 2 2c 2 1 + d 2 1c 2 2) + d1d2(8m2(m1 + 2m2)c 2 2 + (c1m2 − m1c2)2) 2(m2l(4m2d2 + d1m2 + m1d2)(c1d2 + d1c2) + 1 2 d1d2 m2l3 . (22) for c1 = c2 = 1, l = 1, m1 = 2 and m2 = 1 it was found to be [34] p = 4d21 + 33d1d2 + 4d 2 2 2(6d2 + d1)(d2 + d1) + 1 2 d1d2. (23) equation (23) defines a surface with the whitney umbrella singularity in the (d1, d2, p )-space which explains ziegler’s destabilization paradox by vanishing dissipation [26], as was first demonstrated by bottema already in 1956 [32, 35]. in contrast to earlier studies, e.g. [32, 34–36], we consider the critical flutter load (22) as a function of the masses p = p (m1, m2) for the fixed damping distribution. 39 acta polytechnica vol. 51 no. 4/2011 fig. 4: damped 2-link ziegler pendulum with l = 1, c1 = c2 = 1: the critical flutter load p(α) as a function of the azimuth angle α indicating the direction in the (m1, m2)-plane for (a) d1 = d2 = 1 and r = 1, (b) d1 = d2 = 1 and r =0.4, (c) d1 =1, d2 =0.1 and r =1, (d) d1 =1, d2 =0.1 and r =0.1 in figure 4, the stability diagrams for the damped two-link ziegler pendulum are shown in the assumption of c1 = c2 = 1, l = 1, m1 = r cos α and m2 = r sin α. the critical load p (m1, m2) does not constitute a ruled surface and thus the function p (α) depends on the radial distance r in the (m1, m2)-plane. we see that the extrema again correspond to the boundary points m1 = 0 and m2 = 0, although the singularity at α = π/2 is an intersection point. nevertheless, with an increase in the number of degrees of freedom and control parameters new types of singularities will appear. the planar stability diagrams of figure 4 depend on the damping distribution but do not tend to that of the undamped pendulum when damping goes to zero (destabilization paradox [32, 35, 36]). 2.4 ‘problema novum ad cuius solutionem mathematici invitantur’ ‘a new problem that mathematicians are invited to solve’ is a translation of the latin title of the work by j. bernoulli published in 1696 where he proposed the famous brachistochrone problem [37]. supporting this good old tradition, we would like to formulate the following optimization problem: given a circulatory system (1) with matrices m and k defined as in (12) and (13). find all local extrema, the absolute maximum of the critical flutter load p , and the corresponding extremal mass distributions {m1, m2, . . . , mm}. we can also consider the problem of optimal stiffness distribution or even vary both stiffness and mass. the same problems can be formulated for the damped pendulum with the damping matrix (19). we expect that both in the undamped case and in the damped case there exists a class of extrema corresponding to the distributions {m1, m2, . . . , mm} with some masses mi = 0. 
2.4 'problema novum ad cuius solutionem mathematici invitantur'

'a new problem that mathematicians are invited to solve' is a translation of the latin title of the work by j. bernoulli published in 1696, where he proposed the famous brachistochrone problem [37]. supporting this good old tradition, we would like to formulate the following optimization problem: given a circulatory system (1) with matrices m and k defined as in (12) and (13), find all local extrema and the absolute maximum of the critical flutter load p, and the corresponding extremal mass distributions {m1, m2, . . . , mm}. we can also consider the problem of optimal stiffness distribution, or even vary both stiffness and mass. the same problems can be formulated for the damped pendulum with the damping matrix (19).

we expect that both in the undamped case and in the damped case there exists a class of extrema corresponding to distributions {m1, m2, . . . , mm} with some masses mi = 0. it should be possible to find these optimal mass distributions explicitly, perhaps with the use of pontryagin's maximum principle. it would be interesting to identify the singularities of the stability boundary that correspond to these extrema; some of them could be related to infinite eigenvalues λ. on the other hand, some local extrema should exist with mass distributions that do not contain vanishing masses mi, and it would be interesting to understand at which points — smooth or non-smooth — of the stability boundaries they are attained. since the system is finite-dimensional and contains a finite number of control parameters with a clear physical meaning, the locations of the singularities corresponding to multiple eigenvalues can easily be found numerically with high accuracy. in the vicinity of such points, where at least three pure imaginary eigenvalues couple, the question 'should low-order models be believed?' [31] makes sense, because here one more degree of freedom is crucial for the correct solution.

knowledge of rigorously established optimal solutions at every m should shed light on the behavior of the optimal mass distributions and critical frequencies with an increase in the number of degrees of freedom. as a by-product, such a study will give an insight into the problem of dimension reduction, and will serve as a nice playground both for applying the existing methods of nonsmooth analysis and eigenvalue optimization [38–40] and for developing them further, in the direction of a tighter relation both with singularity theory and with the needs of applications. we expect that the proposed model optimization problem will yield useful recommendations for improving the algorithms for optimizing real non-conservative structures in industry. a brute-force version of this search for m = 2 is sketched below.
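the sketch assumes two-link matrices reconstructed so that det(λ²m + k) reproduces the undamped part of the coefficients (21) with l = c1 = c2 = 1; this is an illustration of the proposed problem, not a solution of it:

```python
import numpy as np

# two-link builders consistent with the undamped terms of (21), l = c1 = c2 = 1
def M2(m1, m2):
    return np.array([[m1 + m2, m2], [m2, m2]], dtype=float)

def K2(p):
    return np.array([[2.0 - p, p - 1.0], [-1.0, 1.0]])

def p_crit(m1, m2, dp=0.005, p_max=15.0):
    # smallest p at which an eigenvalue of M^{-1}K leaves the positive real axis
    p = 0.0
    while p < p_max:
        mu = np.linalg.eigvals(np.linalg.solve(M2(m1, m2), K2(p)))
        if np.any(np.abs(mu.imag) > 1e-9) or np.any(mu.real <= 0):
            return p
        p += dp
    return p_max

# brute-force scan over mass distributions with m1 + m2 = 1
ms = np.linspace(0.02, 0.98, 49)
loads = [p_crit(m, 1.0 - m) for m in ms]
print("best m1:", ms[int(np.argmax(loads))], " p* =", max(loads))
```

as a consistency check, the same routine with m1 = 2 and m2 = 1 returns the classical critical load p* ≈ 2.086 mentioned above.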
references

[1] gajewski, a., zyczkowski, m.: optimal structural design under stability constraints. dordrecht: kluwer, 1988, in particular, p. 137.
[2] zyczkowski, m. (ed.): structural optimization under stability and vibration constraints. wien–new york: springer-verlag, 1989.
[3] seyranian, a. p., lund, e., olhoff, n.: multiple eigenvalues in structural optimization problems. journal of structural and multidisciplinary optimization. 8 (1994), 207–227.
[4] claudon, j. l.: characteristic curves and optimum design of two structures subjected to circulatory loads. journal de mecanique. 14(3) (1975), 531–543.
[5] sundararajan, c.: optimization of a nonconservative elastic system with stability constraint. journal of optimization theory and applications. 16(3/4) (1975), 355–378.
[6] hanaoka, m., washizu, k.: optimum design of beck's column. computers and structures. 11(6) (1980), 473–480.
[7] kounadis, a. n., katsikadelis, j. t.: on the discontinuity of the flutter load for various types of cantilevers. international journal of solids and structures. 16 (1980), 375–383.
[8] park, y. p., mote, c. d.: the maximum controlled follower force on a free-free beam carrying a concentrated mass. journal of sound and vibration. 98 (1985), 247–256.
[9] ringertz, u.: on the design of beck's column. structural optimization. 8(2–3) (1994), 120–124.
[10] kirillov, o. n., seyranian, a. p.: optimization of stability of a flexible missile under follower thrust. aiaa paper 98-4969 (1998), 2063–2073.
[11] kirillov, o. n.: optimization of stability of the flying bar. young scientists bulletin. applied mathematics and mechanics 1(1) (1999), 64–78.
[12] kirillov, o. n., seyranian, a. p.: optimization of stability of a flying column. 3rd world congress of structural and multidisciplinary optimization, buffalo, new york (usa), may 17–21, 1999. short paper proceedings, 2 (1999), 355–357.
[13] langthjem, m. a., sugiyama, y.: optimum shape design against flutter of a cantilevered column with an end-mass of finite size subjected to a non-conservative load. journal of sound and vibration. 226(1) (1999), 1–23.
[14] langthjem, m. a., sugiyama, y.: optimum design of cantilevered columns under the combined action of conservative and nonconservative loads. part i: the undamped case. computers and structures. 74(4) (2000), 385–398.
[15] langthjem, m. a., sugiyama, y.: optimum design of cantilevered columns under the combined action of conservative and nonconservative loads. part ii: the damped case. computers and structures. 74(4) (2000), 399–408.
[16] langthjem, m. a., sugiyama, y.: dynamic stability of columns subjected to follower loads: a survey. journal of sound and vibration. 238 (2000), 809–851.
[17] temis, yu. m., fedorov, i. m.: shape optimization of nonconservatively loaded beams with a stability criterion. problems of strength and plasticity. 69 (2007), 24–37.
[18] katsikadelis, j. t., tsiatas, g. c.: optimum design of structures subjected to follower forces. international journal of mechanical sciences. 49 (2007), 1204–1212.
[19] beck, m.: die knicklast des einseitig eingespannten, tangential gedruckten stabes (the buckling load of the cantilevered, tangentially loaded rod). zeitschrift fur angewandte mathematik und mechanik. 3 (1952), 225–228.
[20] sugiyama, y., langthjem, m. a., ryu, b.-j.: realistic follower forces. journal of sound and vibration. 225 (1999), 779–782.
[21] sugiyama, y., langthjem, m. a., ryu, b.-j.: beck's column as the ugly duckling. journal of sound and vibration. 254 (2002), 407–410.
[22] elishakoff, i.: controversy associated with the so-called "follower forces": critical overview. applied mechanics reviews. 58 (2005), 117–142.
[23] gasparini, a. m., saetta, a. v., vitaliani, r. v.: on the stability and instability regions of non-conservative continuous system under partially follower forces. computer methods in applied mechanics and engineering. 124 (1995), 63–78.
[24] gallina, p.: about the stability of non-conservative undamped systems. journal of sound and vibration. 262(4) (2003), 977–988.
[25] lobas, l. g.: dynamic behavior of multilink pendulums under follower forces. international applied mechanics. 41(6) (2005), 587–613.
[26] ziegler, h.: die stabilitätskriterien der elastomechanik (stability criteria in elasticity theory). ingenieur-archiv. 20 (1952), 49–56.
[27] kirillov, o. n., seyranian, a. p.: overlapping of frequency curves in non-conservative systems. doklady physics. 46(3) (2001), 184–189.
[28] kirillov, o. n., seyranian, a. p.: metamorphoses of characteristic curves in circulatory systems. journal of applied mathematics and mechanics. 66(3) (2002), 371–385.
[29] kirillov, o. n., seyranian, a. p.: a non-smooth optimization problem. moscow university mechanics bulletin. 57(3) (2002), 1–6.
[30] bolotin, v. v.: non-conservative problems of the theory of elastic stability. moscow: fizmatgiz (in russian), 1961; oxford: pergamon, 1963.
[31] butlin, t., woodhouse, j.: friction-induced vibration: should low-order models be believed? journal of sound and vibration. 328 (2009), 92–108.
[32] kirillov, o. n., verhulst, f.: paradoxes of dissipation-induced destabilization or who opened whitney's umbrella? zeitschrift fur angewandte mathematik und mechanik. 90(6) (2010), 462–488.
[33] wang, g., lin, y.: a new extension of leverrier's algorithm. linear algebra and its applications. 180 (1993), 227–238.
[34] herrmann, g., jong, i. c.: on the destabilizing effect of damping in nonconservative elastic systems. transactions of the asme, journal of applied mechanics. 32(3) (1965), 592–597.
[35] bottema, o.: the routh–hurwitz condition for the biquadratic equation. indagationes mathematicae. 18 (1956), 403–406.
[36] kirillov, o. n.: destabilization paradox. doklady physics. 49 (2004), 239–245.
[37] bernoulli, j.: problema novum ad cuius solutionem mathematici invitantur (a new problem that mathematicians are invited to solve). acta eruditorum. 15 (1696), 264–269.
[38] lewis, a. s.: the mathematics of eigenvalue optimization. mathematical programming, ser. b 97 (2003), 155–176.
[39] burke, j. v., henrion, d., lewis, a. s., overton, m. l.: stabilization via nonsmooth, nonconvex optimization. ieee transactions on automatic control. 51(11) (2006), 1760–1769.
[40] ledyaev, yu. s., zhu, q. j.: nonsmooth analysis on smooth manifolds. transactions of the american mathematical society. 359(8) (2007), 3687–3732.
oleg n. kirillov
e-mail: o.kirillov@hzdr.de
magneto-hydrodynamics division (fwsh)
helmholtz-zentrum dresden-rossendorf
p.o. box 510119, d-01314 dresden, germany

compressive behaviour at high temperatures of fibre reinforced concretes

s. o. santos, j. p. c. rodrigues, r. toledo, r. v. velasco

abstract: this paper summarizes the research that is being carried out at the universities of coimbra and rio de janeiro on fibre reinforced concretes at high temperatures. several high strength concrete compositions reinforced with fibres (polypropylene, steel and glass fibres) were developed. the results of compressive tests at high temperatures (300 °c, 500 °c and 600 °c) and after heating and cooling down of the concrete are presented in the paper. in both research studies, the results indicated that polypropylene fibers prevent concrete spalling.

keywords: concrete, high temperatures, compressive strength, spalling, polypropylene fibres, steel fibres, glass fibres.

1 introduction

in the last 20 years, concrete has been developed to perform better in all situations. the changes made in mixture composition (lower water content, use of superplasticizers, optimization of grain size distribution, use of particles with pozzolanic activity, addition of fibers, etc.) have led to striking improvements in many properties, such as strength, rheology of fresh concrete, ductility and compactness. compactness usually results in better durability, but it may also lead to brittle behavior of high-performance concrete (hpc) in fire conditions. under certain thermal and mechanical stresses, hpc may spall [1]. spalling is the violent or non-violent breaking off of layers or pieces of concrete from the surface of a structural element when it is exposed to high and rapidly rising temperatures, as experienced in fires [2]. it results from two concomitant processes: (i) the thermo-mechanical process associated with the thermal expansion/shrinkage gradients that occur within the element being heated, and (ii) the thermo-hydral process that generates high-pressure fields of gas (water vapour and trapped air) in the porous network [3].

fires have occurred in several tunnels in europe in the past years. they have caused loss of life, as well as significant damage to the concrete structures. some of these tunnels, such as the channel tunnel and the great belt tunnel, were built recently. this shows that the spalling phenomenon is not yet sufficiently understood for it to be prevented in the construction process. this paper intends to contribute to the understanding of concrete behavior at high temperatures, and to the characterization of the spalling phenomenon, through the development of concretes with improved fire behavior. a portuguese study and a brazilian study are described, which were carried out to develop fiber concrete compositions with better fire behavior.
2 experimental studies

2.1 portuguese tests

studies are in progress in the laboratory of testing materials and structures of the university of coimbra to develop high-strength concretes (hsc) with improved fire behavior. three compositions of concrete are presented in this paper: one without fibers, one with both steel and polypropylene fibers, and a third one with glass fibers. compressive strength tests were carried out at high temperatures on cylindrical specimens of these three concrete compositions [4].

concrete compositions

table 1 provides details of the three compositions. all compositions contained portland cement (cem) type ii 42,5r, superplasticizer (sp) sika 3002 he, limestone filler (lf) and four different aggregates: fine sand (fs) with a fineness modulus of 1.87, coarse aggregate (ca) with a maximum size of 9.5 mm, and two calcareous crushed stones (cs1 and cs2, with maximum sizes of 12.5 mm and 25 mm, respectively). the compositions differ only in the fiber type, or the lack of fibers. the first composition (hsc) had no fibers; the second contained dramix rc zp305 steel fibers (sf), 30 mm in length and 0.55 mm in diameter, with a length/diameter ratio of 55 and a tensile strength of 1100 mpa, together with duro-fibril polypropylene fibers (pf), 31 μm in diameter and 6 mm in length (hscpsf); the third composition contained vimacrack glass fibers (gf), 12 mm in length and 14 μm in diameter (hscgf).

table 1: concrete compositions (per m3)

         cem [kg]  cs1 [kg]  cs2 [kg]  ca [kg]  fs [kg]  lf [kg]  w/c   sp [%cem]  pf [kg]  sf [kg]  gf [kg]
hsc      400       600       321       230      470      200      0.3   2.9        –        –        –
hscpsf   400       600       321       230      470      200      0.3   11.6       1        70       –
hscgf    400       600       321       230      470      200      0.3   11.6       –        –        1.5

table 2 shows the resistance classes for each concrete composition, obtained in compressive strength tests at room temperature after 28 days, carried out according to the european standard en 206-1 (2000). this table also incorporates the values corresponding to a compression load of 0.7 fcd, applied to the specimens during the heating process in the compressive strength tests at high temperatures. this load was intended to simulate the maximum load that the concrete is usually subjected to in building structures. when exposed to high temperatures, the polypropylene fibers form pathways that allow the steam to escape, thereby reducing the pressure accumulated in the porous network of the concrete element, while the steel fibers increase the ductility of the concrete, making it more mechanically and thermally resistant.
glass fibers were included in a concrete composition to assess to what extent these fibers can replace the polypropylene and steel fibers, taking into account the concrete strength at high temperatures.

table 2: compressive resistance classes

         fc [mpa]               fcm [mpa]  fck [mpa]  resistance class  compression load (0.7 fcd) [kn]
hsc      73.04, 72.25, 68.69    71.33      63.33      c50/60            130.5
hscpsf   75.79, 77.18, 75.65    76.21      68.21      c55/67            140.6
hscgf    65.19, 57.04, 70.23    64.16      56.16      c45/55            115.8

specimens

the specimens were cylinders 75 mm in diameter (∅) and 225 mm in height (h), with an h/∅ ratio of 3. five type k thermocouples (chromel-alumel), 0.5 mm in diameter, were placed within the specimen and on its surface to measure the temperature during the high-temperature tests. the location of the thermocouples was defined according to the recommendations of rilem tc-200 htc [5].

test procedure

the test system used is presented in fig. 1 (fig. 1: test system). it consisted of an amsler compression machine with a capacity of 5 mn (a), a cylindrical furnace with an internal diameter of 90 mm and a height of 300 mm, capable of temperatures up to 1200 °c (b), and a tml tds-601 datalogger for data acquisition (force, displacement, and furnace and specimen temperatures) (c). the test procedures followed the rilem tc-200 htc recommendations [5]. a load equal to 70 % of the design value of the concrete compressive strength at room temperature (0.7 fcd) was initially applied to the specimen. when the load level was reached, the specimen was heated at a rate of 3 °c/min until the desired temperature level was reached. three maximum temperatures were tested: 300 °c, 500 °c and 600 °c. the temperature was deemed to be reached when the average of the temperatures registered by the three specimen thermocouples on the surface matched the temperature of the furnace. the specimen was then kept at that temperature for an hour to stabilize. the temperature difference inside and outside the specimen, given by thermocouples at the same level, was also ascertained; this difference should not be greater than 20 °c.

results

fig. 2 shows a graph plotting the variation of concrete compressive strength vs the maximum temperature that the specimens were subjected to in the experimental tests. in tests carried out up to 300 °c, all specimens had a slight loss of strength: the hsc samples had a strength loss of 5 %, while the hscpsf and hscgf samples lost 10 %. in the 500 °c tests, the hsc and hscpsf specimens lost about 30 %, while the hscgf samples lost approximately 26 % of their strength at room temperature. in the maximum temperature tests, at 600 °c, all the specimens collapsed before temperature stabilization. this may indicate that the initial compressive load applied to the specimens was excessive for this temperature. there was a less explosive failure in the hscpsf specimens, which confirms that steel fibers achieve a more ductile concrete. explosive spalling was not observed in any of the heating-cooling tests, only some cracks. cracking was worse for tests up to 500 °c; at this temperature, the hscpsf specimens suffered exfoliation at the top. in the 300 °c tests the hsc specimens showed only some surface cracking.
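as an aside, the stabilization criterion used in the test procedure above is easy to restate in code. the following sketch works on hypothetical logged readings; the 5 °c furnace-match tolerance is an assumption of the sketch, since the paper only fixes the 20 °c inside-outside limit:

```python
import numpy as np

def temperature_stabilized(furnace_t, surface_t, inner_t, outer_t,
                           match_tol=5.0, gradient_tol=20.0):
    # surface criterion: the average of the three surface thermocouples
    # must match the furnace temperature (match_tol is an assumption)
    surface_ok = abs(np.mean(surface_t) - furnace_t) <= match_tol
    # gradient criterion: thermocouples at the same level, inside and
    # outside the specimen, may differ by at most 20 degrees c
    pairs_ok = np.all(np.abs(np.asarray(inner_t) - np.asarray(outer_t))
                      <= gradient_tol)
    return bool(surface_ok and pairs_ok)

# example: 500 degrees c test, one hour into the holding phase
print(temperature_stabilized(500.0, [498.5, 501.0, 499.5], [489.0], [501.0]))
```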
2.2 brazilian tests tests to assess the stress-strain behavior of polypropylene fiber reinforced high performance concrete at ambient temperature (27 °c) and at high temperatures of 400, 650 and 900 °c, were carried out in coppe’s laboratory of structures at the federal university of rio de janeiro, brazil [6]. the residual compressive strength and elastic modulus were determined after the concrete was subjected to a heating and cooling process. the role of polypropylene fiber in controlling the spalling of high performance concrete (hpc) was also investigated. concrete composition the concrete was made with portland cement (cpiii-40), sand with a fineness modulus of 2.70, crushed syenite with a maximum size of 9.50 mm and a specific weight of 2.70 g/cm3, naphthalene sulfonate based superplasticizer (sp) with a total amount of solid particles of 40 % and silica fume. the polypropylene fibres (pf) were 40 mm long and had a specific weight of 0.91 kg/dm3 and an elastic modulus of 3500 mpa. © czech technical university publishing house http://ctn.cvut.cz/ap/ 31 acta polytechnica vol. 49 no. 1/2009 fig. 2: concrete compressive strength vs maximum temperature series cement [kg] silica [kg] sand [kg] agg. [kg] water [l] sp [l] fibres [kg] c65 365 37 780 857 156 8.30 – c85 414 42 694 895 151 8.49 – c65pf0.25 365 37 780 857 156 8.30 2.28 c65pf0.5 365 37 780 857 156 8.30 4.56 c85pf0.25 414 42 694 895 151 8.49 2.28 c85pf0.5 414 42 694 895 151 8.49 4.56 table 3: mix proportions for the hpc tested (per m3) the mixtures were produced so as to reach compressive strength levels of 65 mpa (c65) and 85 mpa (c85) at 28 days. they were reinforced with 0.25 % and 0.5 % by volume of polypropylene fibers (c65pf0.25/c85pf0.25 and c65pf0.5/ c85pf0.5 series, respectively). the mix proportions of the mixtures are summarized in table 3. specimens and test procedure the compression tests were carried out on cylindrical specimens 100 mm×200 mm. a 2500 kn mts compression machine was used, at a loading rate of 0.00078 mm/s. three samples were tested for each mixture. average values were taken as representative for each analyzed mixture. prismatic specimens (150 mm×260 mm×100 mm) were used in the spalling and total porosity studies. they were heated in a computer-controlled electric furnace at a rate of 10 °c/min. three maximum temperatures (400 °c, 650 °c and 900 °c) were chosen. once the peak temperature was reached it was maintained for one hour and then cooled to room temperature at a rate of (0.4–0.5) °c/min. total porosity was measured by means of water absorption tests on cylindrical specimens (60 mm in height, with a diameter of 25.4 mm) extracted from the prismatic samples. results typical compressive stress-strain curves for the plain and pf reinforced hpc matrices at room temperature and after exposure to high temperatures are presented in fig. 3. at room temperature, the addition of small volume fractions of pf to the concrete did not significantly change the behavior in compression of the matrices. the results for the specimens that were subjected to temperatures of 400 °c show that the pf hpc specimens suffered a more pronounced loss of strength and rigidity than the plain hpc specimens. at 650 °c and 900 °c all concrete specimens experienced a similar strength and rigidity loss, regardless of the presence of fibers in the mixtures. the poor behavior of the pf concrete specimens is associated with the melting of polypropylene fibers at about 170 °c. 
a slight dilatation of about 10 % [3] of the polypropylene occurs during melting, which generates extra pore-pressure in the concrete and leads to a higher crack density than that observed in the plain concrete. the fiber beds left in the matrix after melting may also help in the local nucleation of cracks as they have sharp angles, favoring the propagation of microcracking. in this study, spalling was observed in c85 prismatic concrete samples for the 400 °c temperature series. the spalling occurred at about 200 °c–220 °c. the addition of 0.25 % by volume of pf to the c85 matrix prevented concrete spalling. 3 conclusions in the portuguese studies, it could be concluded that the inclusion of pf fibers in the concrete compositions prevented spalling. the specimens with steel and polypropylene fibers performed better than those with glass fibers. a small detachment of surface concrete was observed in the glass fiber specimens. a more explosive rupture occurred in specimens without fibers and in those with glass fibers. the benefit of steel fibers in controlling cracking was confirmed. the incorporation of shorter steel fibers and a small amount of polypropylene fibers conferred greater strength on the concrete specimens. in the portuguese compression tests the concretes with glass fibers exhibited behavior quite similar to that of the specimens with polypropylene and steel fibers. the loss of strength of hscgf specimens at high temperatures was only slightly lower than for the hscpsf specimens. it was concluded that glass fibers did not improve concrete performance at high temperatures. however, further studies with this type of fiber are needed, since only a small number of tests were carried out in this study. in the brazilian studies it was concluded that at high temperatures the pf hpc specimens showed a more pronounced strength and rigidity loss than the plain hpc specimens. the increase in porosity was also higher for the pf hpc. this behavior can be associated with melting of pf at about 170 °c. another conclusion from this study was that the addition of pf to the c85 matrix prevented concrete spalling. 32 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 49 no. 1/2009 t = 27 °c t = 400 °c t = 650 °c t = 900 °c t = 27 °c t = 400 °c t = 650 °c t = 900 °c (a) t = 27 °c t = 400 °c t = 650 °c t = 900 °c t = 27 °c t = 400 °c t = 650 °c t = 900 °c (b) fig. 3: compressive stress-strain curves at room temperature after exposure to high temperatures: (a) concrete series c85; (b) composite series c85pf0.25 given these findings, we can state that pf prevents the spalling of concrete. these fibers melt at approximately 170 °c and are then partially absorbed by the microcracked cement matrix, leaving a pathway for gas to dissipate and so reducing water vapor pore pressure. references [1] kalifa, p., menneteau, f., quenard d.: spalling and pore pressure in hpc at high temperatures, cement and concrete research, vol. 30 (1999), p. 1915–1927. [2] khoury, g. a.: effect of fire on concrete and concrete structures. prog. struct. engng mater., vol. 2 (2000), p. 429–447. [3] kalifa, p., chene, g., galle, c.: high-temperature behaviour of hpc with polypropylene fibres. from spalling to microstructure, cement and concrete research, vol. 31 (2001), p. 1487–1499. [4] santos, s., rodrigues, j. p.: spalling on concrete structures, proceedings of the national conference on structural concrete, 2008, guimarnes, portugal. 
[5] recommendations of rilem tc 200-htc: mechanical concrete properties at high temperature – modeling and applications. materials and structures, vol. 38 (2005), p. 913–919.
[6] velasco, r. v., toledo, r. d., fairbairn, e. m. r., lima, p. r. l., neumann, r.: spalling and stress-strain behaviour of polypropylene fibre reinforced hpc after exposure to high temperatures.

susana otero santos
e-mail: susana.otero.s@gmail.com
joao paulo c. rodrigues
faculty of science and technology of university of coimbra, portugal

romildo toledo
r. v. velasco
federal university of rio de janeiro
department of civil engineering, coppe
rio de janeiro, brazil

control algorithms for large-scale single-axis photovoltaic trackers

dorian schneider
institute of imaging & computer vision, rwth aachen university, 52056 aachen
corresponding author: schneider@lfb.rwth-aachen.de

abstract: the electrical yield of large-scale photovoltaic power plants can be greatly improved by employing solar trackers. while fixed-tilt superstructures are stationary and immobile, trackers move the pv-module plane in order to optimize its alignment to the sun. this paper introduces control algorithms for single-axis trackers (sat), including a discussion of optimal alignment and backtracking. the results are used to simulate and compare the electrical yield of fixed-tilt and sat systems. the proposed algorithms have been field tested, and are in operation in solar parks worldwide.

keywords: single-axis solar tracker, backtracking, photovoltaic, sun tracking.

1 introduction

the degree of efficiency of photovoltaic (pv) power plants can be maximized by optimizing the alignment of the photovoltaic module plane to the current position of the sun. unlike fixed-tilt superstructures, where modules are placed stationary in the field, tracking superstructures mount the modules on carriers able to rotate along one or two axes in order to maximize the electrical yield of the system. two different schemes exist for the construction of a pv tracker. dual-axis trackers (dat) provide two degrees of freedom, which theoretically allows optimal module-sun alignment, and hence the maximum possible yield, at any time. dats enable a rise of 30–45 % [7] in yield compared to an optimally positioned fixed-tilt plant at the same location. reported figures for tracker performance and extra yield vary strongly, since they depend heavily on the location where the system is installed [4]. the benefits of dats are achieved at the cost of a need for more space and of higher mechanical complexity, leading to higher costs for construction, planning and maintenance.

the second scheme, single-axis trackers (sat), provides only one degree of freedom, limiting the tracking motion so that perfect module-sun alignment cannot always be provided. although the tracking angle is limited, the system can provide yield increments of 10–20 % compared to a perfectly adjusted fixed-tilt plant at the same location. the constructional complexity is significantly lower than for dats, which has a positive effect on the costs for construction and maintenance. in order to choose the best system for a power plant of given size and location, it is necessary to optimize the cost-yield ratio of the system.
while the costs for construction and maintenance can be estimated easily, expected yields can only be analysed using sophisticated simulations. several algorithms and control schemes for high-precision dat control have been published [1, 2, 8, 9], but very few publications have handled the control of sats; in fact, the results found in [5, 6] are estimations, and do not give an exact solution to the backtracking problem. for this reason, this paper introduces control algorithms for sats which allow the system to be regulated to the optimal position at any time, without the need for any additional sensor technology or hardware, using the calculated sun position instead. the problem of self-shadowing is addressed with a high-precision, field-tested backtracking algorithm. the results are used for a basic energy yield analysis.

the structure of the paper is as follows. in section 2, the nomenclature and coordinate system for the later sections are defined. section 3 handles the basic mathematical equations for naive tracking, allowing the system to find the rotation angle that maximizes the yield for a given sun position. since the system should avoid self-shadowing by all means, a backtracking algorithm that finds the optimal shadow-free rotation angle is discussed in section 4. the results are combined in section 5 to simulate sat yield expectations for different latitudes and superstructure settings in matlab/simulink. section 6 discusses the results, and section 7 concludes the work.

2 coordinate system

we define the nomenclature and coordinate system that is used throughout the subsequent sections. due to the rotational movement of the sun, a polar coordinate system is used, as illustrated in fig. 1.

figure 1: schemes of the sat coordinate system. left: illustration of the sun azimuth and elevation angles. right: illustration of three adjacent sats with the corresponding distances and annotations (a: 21.6 m, d: 8.4 m, h: 3.5 m).

here, the azimuth angle φ is given by the angle between the orthogonal projection of the sun vector s onto the xy-plane and the x-axis (north-south axis). the azimuth takes positive values [+180°, 0°[ when the sun is eastward, and negative values ]0°, −180°] when it is westward; it is zero when the sun is in the south. the zenith angle θ gives an indication of the sun altitude by measuring the angle between the sun vector s and the z-axis; the sun elevation angle θe is then given by 90° − θ.

an optimal sat alignment would be lengthwise parallel to the x-axis of the coordinate system, i.e. perpendicular to the east-west axis. if the environmental setting of the power plant prohibits optimal alignment of the sat superstructure, a deviation angle η occurs, as illustrated in fig. 2 (figure 2: explanatory scheme for the sat misalignment angle η; the system is not optimally aligned on the east-west axis (y-axis) for η ≠ 0). as nomenclature we define η < 0 for clockwise rotation and η > 0 for counter-clockwise rotation. for simplicity, we do not rotate the sat system, but rotate the sun azimuth instead. the new sun azimuth becomes

$$\varphi_\eta = \varphi - \eta. \tag{1}$$

one can easily convert between polar and cartesian coordinates using

$$x_s = r \cos\varphi_\eta \sin\theta, \tag{2}$$
$$y_s = r \sin\varphi_\eta \sin\theta, \tag{3}$$
$$z_s = r \cos\theta, \tag{4}$$

where θ = 90° − θe is the zenith angle.
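a direct transcription of (1)–(4), as a small sketch; the azimuth sign convention and the zenith angle θ = 90° − θe follow the definitions above:

```python
import numpy as np

def sun_vector(azimuth_deg, elevation_deg, eta_deg=0.0, r=1.0):
    # eq. (1): fold the tracker misalignment eta into the sun azimuth
    phi = np.radians(azimuth_deg - eta_deg)
    theta = np.radians(90.0 - elevation_deg)            # zenith angle
    return np.array([r * np.cos(phi) * np.sin(theta),   # eq. (2)
                     r * np.sin(phi) * np.sin(theta),   # eq. (3)
                     r * np.cos(theta)])                # eq. (4)

# morning sun in the east (positive azimuth), 30 degrees above the horizon
print(sun_vector(azimuth_deg=120.0, elevation_deg=30.0))
```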
2.1 sun path mirroring

the sun starts its course early in the day in the east, reaches the south around noon, and ends its path in the evening in the west. the course of the sun is always symmetric with regard to the north-south axis. this fact is exploited to further simplify the subsequent calculations: a variable δ is introduced, according to

$$\delta = \begin{cases} +1 & \text{sun is in the east, i.e. } \varphi_\eta \in \,]0°, 180°], \\ -1 & \text{sun is in the west, i.e. } \varphi_\eta \in \,]{-180°}, 0°]. \end{cases}$$

using δ, the system behaviour must only be calculated once during simulation, for either west or east; the results are then mirrored to get the final result.

2.2 plane inclination

solar power plants are sometimes installed on non-plane surfaces — for example, in a hilly environment. the inclination of the surface must be taken into account for precise tracking. the scheme for a non-plane setting is illustrated in fig. 3 (figure 3: illustration of the plane inclination angle β for hilly environments). to keep all calculations as simple as possible, the plane inclination is embedded into the sat rotation angle by following the update rule

$$\alpha \leftarrow \alpha - \delta\beta. \tag{5}$$

3 basic tracking

according to lambert's law, the yield of a solar module is governed by the angle of incidence γ between the sunlight and the module normal vector for a given insolation intensity: the lower the angle, the higher the irradiation intensity, and the higher the yield. since the loss in irradiation intensity follows the function cos γ, maximal yield can only be achieved when the sun vector is orthogonal to the module plane, i.e. γ = 0. a single-axis tracker (unlike a dat) cannot always achieve orthogonality, due to the mechanical restrictions that apply, but the yield can be optimized over the span of one day. this section expresses the system rotation angle α as a function of the sun position in order to achieve maximum yield.

let the module plane of one sat be denoted as p, cf. fig. 1b. p can be rotated around the x-axis with the rotation angle α, but has no other degrees of freedom. in this way, p can be spanned by the two column vectors of

$$p = \begin{pmatrix} 1 & 0 \\ 0 & \cos\alpha \\ 0 & \sin\alpha \end{pmatrix}. \tag{6}$$

the angle α is negative when p is inclined eastward, positive when p is inclined westward. the origin of coordinates o = [0, 0, 0]ᵀ is placed as indicated in fig. 4. a normal vector v of p is obtained as the cross product of its two column vectors:

$$v = \begin{pmatrix} 0 \\ -\sin\alpha \\ \cos\alpha \end{pmatrix}. \tag{7}$$

as mentioned above, the maximal yield is equivalent to the minimal angle γ between the normal vector v and the sun vector s. this can be achieved by minimizing the absolute value of the cross product n of the two vectors:

$$n = v \times s = \begin{pmatrix} 0 \\ -\sin\alpha \\ \cos\alpha \end{pmatrix} \times \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} -(y\cos\alpha + z\sin\alpha) \\ x\cos\alpha \\ x\sin\alpha \end{pmatrix}. \tag{8}$$

minimizing the absolute value of (8) corresponds to minimizing the area of the parallelogram that the two vectors span, and hence to minimizing the angle between them:

$$|n|^2 = (y\cos\alpha + z\sin\alpha)^2 + (x\cos\alpha)^2 + (x\sin\alpha)^2 = (y\cos\alpha + z\sin\alpha)^2 + x^2. \tag{9}$$

from (9) it can be seen that perfect sat alignment (i.e. |n|² = 0) can only be achieved when the sun is positioned on the east-west axis (x² = 0). the best sat rotation angle for a given sun position is found when

$$y\cos\alpha + z\sin\alpha = 0 \tag{10}$$

applies. solving for α and applying (5) gives

$$\alpha = -\arctan\frac{y}{z} - \delta\beta \tag{11}$$

as the optimal rotation angle. care needs to be taken when the sun is rising or setting, i.e. z = 0.
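equation (11) in code, as a sketch; using arctan2 instead of arctan keeps the sunrise/sunset case z = 0 well defined:

```python
import numpy as np

def optimal_angle(sun, beta=0.0, delta=1.0):
    # eq. (11): alpha = -arctan(y/z) - delta*beta
    x, y, z = sun
    return -np.arctan2(y, z) - delta * beta

sun = np.array([0.13, -0.75, 0.64])   # assumed afternoon sun vector (west)
print(np.degrees(optimal_angle(sun, beta=0.0, delta=-1.0)))  # tilts westward
```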
4 backtracking

equation (11) allows the system to find an optimal rotation angle for a given sun position at any time. however, it does not consider the case when, for specific sun positions, self-shading may occur between adjacent sats. the problem is illustrated for a two-unit sat in fig. 4; with more units, the problem becomes even more important.

figure 4: illustration of the planes, lines and points needed for deriving the backtracking algorithm. a shadow is cast from surface f1 to surface f2. by rotating the sat so that line s and line w intersect, shading can be avoided while guaranteeing a minimal deviation from the optimal rotation angle.

in order to maximize the yield, shading should be avoided by all means, even though a shadow-free position is not optimal in terms of the module-sun alignment discussed in section 3. smart control algorithms should be able to detect when sat self-shading occurs and update the current rotation angle so that no shading can occur, while still optimizing the module alignment for maximum yield. this procedure is known as backtracking. the remainder of this section derives the equations for a shadow-free sat backtracking control mechanism that maximizes the electrical yield.

first, it is necessary to check whether self-shading occurs for a given sun vector s. as illustrated in fig. 4, the module surfaces of two adjacent sats of width a and height h will be denoted as f1 and f2, respectively. the origin of coordinates is placed at the center of gravity of f1. in order to calculate whether a shadow is cast from f1 to f2, one may project the corner point p of surface f1 onto surface f2 along the sun vector s. the point p can be expressed in cartesian coordinates as

$$p = \begin{pmatrix} \frac{a}{2} \\ y_p \\ z_p \end{pmatrix}, \quad\text{with } y_p = \frac{h}{2}\cos\alpha \text{ and } z_p = \frac{h}{2}\sin\alpha.$$

the line s that passes through p and is parallel to the sun vector s can be expressed as

$$s:\; p + \lambda s. \tag{12}$$

according to equation (7), the normal vector v of surface f1 corresponds to

$$v = \begin{pmatrix} 0 \\ -\sin\alpha \\ \cos\alpha \end{pmatrix}. \tag{13}$$

the plane that surface f2 is part of can be parametrized by

$$f_{2,\mathrm{plane}}:\; v \cdot r - b = 0, \tag{14}$$

where r is an arbitrary point on the plane and b is a constant that must be determined. we choose r = [0, d, 0]ᵀ and rearrange for b to find

$$b = -d\sin\alpha. \tag{15}$$

insertion of (12) for the vector r into (14) and rearranging for λ gives

$$v\cdot(p + \lambda s) - b = 0, \quad\text{thus}\quad \lambda = \frac{b - v\cdot p}{v\cdot s} = \frac{b}{v\cdot s} = -\frac{d\sin\alpha}{z_s\cos\alpha - y_s\sin\alpha}. \tag{16}$$

we are now in a position to detect whether shading from surface f1 onto f2 occurs: by inserting (16) into (12), one finds the cartesian coordinates of the plane-line intersection point b (cf. fig. 4). since b can lie anywhere on the plane, and must not necessarily lie on surface f2, an interval check must be performed to test whether b is inside the bounding box of f2. if the check is positive, shading occurs and a new, shadow-free rotation angle should be found; a sketch of this check follows below.
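the shading check of (12)–(16) as a sketch; the default dimensions are the figure-1 annotations, and the placement of the corner point at half the module width h/2 follows the derivation above:

```python
import numpy as np

def shadow_on_neighbor(sun, alpha, a=21.6, h=3.5, d=8.4):
    # project the corner point p of f1 onto the plane of f2 along the
    # sun vector (eqs. (12)-(16)) and test the bounding box of f2
    xs, ys, zs = sun
    p = np.array([a / 2, h / 2 * np.cos(alpha), h / 2 * np.sin(alpha)])
    denom = zs * np.cos(alpha) - ys * np.sin(alpha)   # v . s
    if abs(denom) < 1e-12:
        return False                                  # sun grazes the plane
    lam = -d * np.sin(alpha) / denom                  # eq. (16)
    b = p + lam * np.asarray(sun, dtype=float)        # intersection point b
    # local coordinates of b on f2: along the row (x) and across the module
    across = (b[1] - d) * np.cos(alpha) + b[2] * np.sin(alpha)
    return bool(abs(b[0]) <= a / 2 and abs(across) <= h / 2)
```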
if shading occurs, the tracker should be rotated so far that the lower bound of surface f2 (the green line in fig. 4, denoted as w) intersects with the line s. the line w can be parametrized by

$$w:\; \begin{pmatrix} -\frac{a}{2} + \mu a \\ d - \frac{h}{2}\cos\alpha \\ -\frac{h}{2}\sin\alpha \end{pmatrix}. \tag{17}$$

to find the rotation angle that avoids shading, the intersection point between s and w must be found by equalizing the line expressions, w = s, thus

$$\begin{pmatrix} -\frac{a}{2} + \mu a \\ d - \frac{h}{2}\cos\alpha \\ -\frac{h}{2}\sin\alpha \end{pmatrix} = \begin{pmatrix} \frac{a}{2} + \lambda x_s \\ \frac{h}{2}\cos\alpha + \lambda y_s \\ \frac{h}{2}\sin\alpha + \lambda z_s \end{pmatrix}. \tag{18}$$

this gives a nonlinear system with three unknowns (α, λ, µ) and three equations, which we solve numerically to find two solutions for the shadow-free rotation angle (including (5)) αsf:

$$\alpha_{sf,1} = \arccos\frac{a - \sqrt{b}}{c} - \delta\beta, \tag{19}$$

$$\alpha_{sf,2} = \arccos\frac{a + \sqrt{b}}{c} - \delta\beta, \tag{20}$$

where

$$a = h d z_s^2, \qquad b = h^4 y_s^4 + h^4 y_s^2 z_s^2 - h^2 d^2 y_s^2 z_s^2, \qquad c = h^2 y_s^2 + h^2 z_s^2.$$

the actual values of λ and µ are of no interest for our problem. it is intuitive that the system angle αsf should be minimal; equation (19) does not meet this requirement, which leaves us with (20) as the final solution for the best shadow-free backtracking angle. the proposed backtracking algorithm has been tested in the arizona desert on a real sat system, where the algorithmic framework proved to be exact at centimetre precision. meanwhile, more than 100 mw of sats worldwide are controlled using the proposed scheme.

5 results

equations (20) and (11) allow us to find an optimal, shadow-free operation angle for the sat. however, the exact position of the sun is needed for a given time and location. for this purpose, this paper uses the sun position algorithm proposed in [10], which allows us to determine the sun azimuth and elevation with an accuracy of ±0.0003°, without the need for any additional sensors or hardware. in fact, the sun position algorithm can run on the same controller as the tracking control, which is beneficial in terms of costs and installation complexity.

fig. 5 shows a simulation of the sat behaviour during one day in january for the location of berlin, germany. the shadow-free angle of operation αsf and the optimal angle α are plotted against time; additionally, the sun azimuth and elevation are shown.

figure 5: tracking and backtracking angles, sun azimuth and elevation plotted for january 1st, 2012 in berlin. the maximal rotation angle has been clipped to ±45°. as the sun elevation angle is low, backtracking occurs from 8 am to 10 am, and again between 2.30 pm and 4.30 pm.

the angle αsf is limited to a range of [−45°, 45°], due to mechanical limitations that apply to real-world structures. as a matter of fact, all sats are provided with a rotation limit, which may vary between 45° and 60°, depending on the design. the system switches to backtracking as soon as the sun rises, and continues shadow-free tracking until about 10 am. afterwards there is no longer any shading, and the system's angle of operation corresponds to the optimal angle until about 2.30 pm, when the sun elevation has declined enough to cause shading again.

the results were further used to simulate the relative extra yield of the sat compared to an optimally aligned fixed-tilt superstructure at the same location, cf. fig. 6. the results shown here are normed to the maximal achievable yield.

figure 6: comparison of fixed-tilt yields and sat yields for each month of a year. the results are normed to the maximum yield that could be achieved with a perfectly aligned module plane.

figure 7: averaged, annual extra yield of a sat compared to an optimally aligned fixed-tilt superstructure in dependence on latitude.
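before turning to these yield figures, the whole per-timestep control decision behind a simulation like fig. 5 can be pulled together from (11), the shading check and (20). the following is only a sketch: it reuses the `optimal_angle` and `shadow_on_neighbor` helpers from the previous snippets, keeps the assumed figure-1 dimensions, clips to the ±45° mechanical limit mentioned above, and leaves out the east/west mirroring of section 2.1 for brevity:

```python
import numpy as np

def controller_step(sun, beta=0.0, delta=1.0, a=21.6, h=3.5, d=8.4,
                    limit=np.radians(45.0)):
    # naive tracking angle, eq. (11)
    alpha = optimal_angle(sun, beta, delta)
    # if f1 would shade f2, fall back to the shadow-free angle of eq. (20)
    if shadow_on_neighbor(sun, alpha, a, h, d):
        ys, zs = sun[1], sun[2]
        aa = h * d * zs**2
        bb = h**4 * ys**4 + h**4 * ys**2 * zs**2 - h**2 * d**2 * ys**2 * zs**2
        cc = h**2 * (ys**2 + zs**2)
        arg = (aa + np.sqrt(max(bb, 0.0))) / cc   # guard the arccos domain
        alpha = np.arccos(np.clip(arg, -1.0, 1.0)) - delta * beta
    # mechanical rotation limit of real-world structures
    return float(np.clip(alpha, -limit, limit))
```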
it can be seen that sat clearly outperforms the fixed-tilt installation in the summer months, when the sun stands high in the sky and the days are long. in the winter months, the fixed-tilt installation performs better because the sun stands low in the sky, shadows are more likely and the sat system often switches to backtracking, hence missing the optimal alignment to the sun. however, the results are relative to the maximal achievable yield and have no information about the absolute annual extra yield. fig. 7 shows the simulation results for the total, absolute extra yield of sat compared to an optimally aligned fixed-tilt installation. the plot shows the extra yield plotted against the latitude. the clear-sky insolation model proposed in [3] has been used for this purpose. it is interesting to see how minimal gain can be achieved for latitudes of around 50° to 60°, which correspond to central europe. here, the simulation predicts an extra yield of about 8–10 %, which is consistent with the results found by other researchers [11]. according to fig. 7, the usage of a sat system becomes very lucrative for southern countries, where the gain may rise up to 30 %. finally, the effect of the spacing d between two adjacent sats has been investigated in this work. fig. 8 summarizes the results. it can be seen that in general the best distance in terms of extra gain and space efficiency can be achieved for a distance of about 12 meters, when the slope of the curve flattens out. however the best distance depends strongly on the park layout, and generally needs to be selected individually for each park. when dealing with diffuse insolation (which forms a major part of central europe’s insolation), a totally flat tracker would be the best operational position. theoretically, the maximum amount of light could be collected in this position. however, to detect the current type of insolation, extra sensors are required. alternative solutions could employ local weather broadcasts to adapt the tracking behaviour in order to avoid external hardware. 6 conclusions we have proposed an algorithmic framework for controlling photovoltaic single-axis trackers using back91 acta polytechnica vol. 52 no. 5/2012 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 0 1 2 3 4 5 6 7 8 9 10 11 12 distance between two adjacent sats [m] g ai n in y ie ld [% ] figure 8: averaged, annual extra yield of a sat compared to an optimally aligned fixed-tilt superstructure in dependence on the distance d between two adjacent sat units. tracking. the control schemes are able to maximize the electrical yield of a sat power plant by finding the best, i.e. shadow-free, rotation angle for the tracker in dependence on its location and environmental settings. in the simulation, the sat yields have been compared to fixed-tilt yields, giving an insight into the strengths and weaknesses of the system. the proposed control scheme has been field tested, and is in use in power plants with more than 100 mw output worldwide. references [1] a. catarius, m. christiner. azimuth-altitude dual axis solar tracker. bachelor thesis, worcester polytechnic institute, 2010. [2] t. n. kathiba , a. mohamed, r. j. khan et al. a novel sun tracking controller for photovoltaic panels. journal of applied sciences 9(22):4050– 4055, 2009. [3] b. keller, a. m. s. costa, a matlab gui for calculating the solar radiation and shading of surfaces on the earth. computer applications in engineering education 19(1):161–170, 2009. [4] d. l. king, w. e. boyson, j. a. kratochvil. 
analysis of factors influencing the annual energy production of photovoltaic systems. in photovoltaic specialists conference, 2003 [5] e. lorenzo, l. navarte, j. muoz. tracking and back-tracking. progress in photovoltaics: research and applications 19:747–753, 2011. [6] e. lorenzo, m. perez, a. ezpeleta et al. design of tracking photovoltaic systems with a single vertical axis. progress in photovoltaics: research and applications 10: 533–543, 2002. [7] t. m. pavlovic, d. d., milosavljevic, a. r. radivojevic et al. a comparison and assesment of electricity generation capacity for different types of photovoltaic solar plants of 1 mw in sokobanja, serbia. thermal science 15:605– 618, 2011. [8] t. peterson, j. rice, j. valentin. solar tracker. final project report, cornell university, 2005. [9] s. seme, g. stumberger, j. vorsic. maximum efficiency trajectories of a two-axis sun tracking system determined considering tracking system consumption. ieee transactions on power electronics 26:1280–1290, 2011. [10] i. reda, a andreas. solar position algorithm for solar radiation applications solar energy 7(5):577–589, 2004. [11] p. vanicek, s. stein. simulation of the impact of diffuse shading on the yields of large single-axis tracked pv plants. technical report, deutsche gesellschaft fuer sonnenenergie lv berlin brandenburg e.v., 2009. 92 ap09_1.vp 1 introduction the mokrsko fire test focused on the overall behaviour of the structure, and on the connection temperatures, neither of which can be observed on the separate elements. in addition to three types of flooring systems, six wall structures with mineral wool were tested. the fire experiment was conducted in mokrsko, central bohemia, czech republic, 50 km south from prague, on 18 september 2008, see [1]. the new building was set up at the joseph underground educational facility of the czech technical university in prague, see www.uef-josef.eu. the experiment follows on from the seven large-scale fire tests at the cardington laboratory on steel frames from 1998 – 2003, see [2]. knowledge acquired during the ostrava fire test was also used during the experiment, see [3] and [4]. the structure was designed by the excon a.s. prague design office, in cooperation with all parties involved in delivery of the structural parts. the fire design of the structure was prepared at the czech technical university in prague, the university of sheffield and the slovak technical university in bratislava. the behaviour of slender castellated beams and beams with a corrugated web, including the concrete slab and the connection behaviour at elevated temperatures, were simulated using the vulcan programme. 2 experimental structure the structure represents one floor of an administrative building 18×12 m in size, see fig. 1 and fig. 2. the composite slab on the castellated beams was designed with a span 9 to 12 m and on beams with corrugated webs with a span 9 to 6 m. the deck was a simple trapezoidal composite slab 60 mm in thickness, 120 mm in height over the rib with cf46 sheeting (cofraplus 0.75 mm) and concrete of measured cubic strength 34 n/mm3 in 28 days reinforced by a smooth mesh � 5 mm 100/100 mm, with strength 500 mpa and coverage 20 mm. the spiroll prefabricated panels 320 mm in height with hollow core openings formed a 9 m span. the panels were supported by a concrete wall and a primary hollow beam from welded double ipe 400 section. 
the castellated beams with sinusoid angelina openings, designed by arcelormittal, made of an ipe 270 section from steel s235 were 395 mm in height. the beams with corrugated webs, designed by kovové profily s.r.o., had flanges 220×15 mm and a web 2.5 mm in thickness and 500 mm in height, using 76 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 49 no. 1/2009 connection temperatures during the mokrsko fire test j. chlouba, f. wald the mokrsko fire test focused on the overall behaviour of the structure, which cannot be observed on the separate elements, and also on the temperature of connections with improved fire resistance. during the test, measurements were made of the temperature of the gas and of the elements, the overall and relative deformations, gas pressure, humidity, the radiation of the compartment to structural element and the external steel column, transport of the moisture through the walls, and also the climatic conditions. the results of the test show the differences between the behaviour of the element and the behaviour of the structure exposed to high temperatures during a fire. the collapse of the composite slab was reached. the results of the numerical simulations using the safir program compared well with the measured temperature values in the structure and also in the connections. keywords: fire test, fe simulation, fire safe connection, temperature prediction, angelina beam, compartment fire. fig. 1: experimental building fire door a b c 3 2 1 9 000 9 000 6 000 6 000 as7 +0.00 sand bag timber pile +4.00 n meteo meteo as6 as5 as4 as3 as2 as1 ce4 ce3 ce2 ce1 s5 s4 s3 s2 s1 ce2as2 opening 2,5 x 4 mopening 2,5 x 4 m fig. 2: plan view of the structure steel s320. the edge beams were from ipe 400 steel s235 sections. the fire protected columns were prepared from heb 180 sections. the horizontal stiffness of the frame was achieved with concrete walls 250 mm in thickness, made of c30/37 concrete, and two cross braces of l 80×80×8. the beam-to-beam and beam-to-column connections were designed as a header plate, plate 10 mm with four m16 bolts class 8.8. improved fire resistance was achieved by encasing two bolts in the concrete of the slab. two walls were composed from cladding, linear trays, mineral wool and external corrugated sheets. in two 6 m spans, a comparison was made of the system with the internal grid and horizontal sheeting and with vertical sheeting without the internal grid. two other walls were made of sandwich panels 150 mm in thickness, filled with mineral wool. in front of the concrete wall there was a brick wall made of plaster blocks. the fire protection of the columns, primary and edge beams, and also the bracings, was designed for r60 by promatect h 2×15 mm board protection. 3 mechanical and fire load the mechanical load was designed to comply with the load for a regular administrative building. the dead load of the tested structure was 2.6 kn/m2. the variable load 3.0 kn/m2 was simulated by 78 sand bags with road metal. the weight of the bags varied from 793 kg to 1087 kg. they were coupled on pallets in threes to achieve an average weight 900 kg, see fig. 3. the applied load represented the characteristic value of the variable action at elevated temperature 3.0 kn/m2 and characteristic value of flooring and partitions 1.0 kn/m2. the 15 m3 50×50 mm unwrought wooden cribs 1 m in length of softwood dried to moisture till 12 % formed the fire load. the cribs were placed into 50 piles, see fig. 4. 
each pile consisted of 12 rows with 10 cribs, i.e., 35.5 kg/m2 of timber, and simulated a fire load of 620 mj/m2. the design characteristic fire load of an administrative building is calculated as 420 mj/m2. simultaneous ignition of the piles was achieved by connecting them using steel thin-walled channels filled with mineral wool and penetrated by paraffin. the channels were located on the second layer of cribs, and connected three/four piles together. the fire test started by reaching a gas temperature of 50 °c. openings 2.54 m in height and 8.00 m in total length with a 1 m parapet ventilated the compartment. to allow smooth development of the fire, no glazing was installed. © czech technical university publishing house http://ctn.cvut.cz/ap/ 77 acta polytechnica vol. 49 no. 1/2009 fig. 3: position of mechanical load fig. 4: distribution of fire load fig. 5: thermocouples for gas and steel temperatures 4 measurements the gas temperature in the fire compartment was measured by 14 jacketed 3 mm thermocouples located 0.5 m below the ceiling in the level of the lower flanges of the beams, see fig. 5. two thermocouples were placed in the openings. the temperature profile along the compartment height was measured between the window and in the back of the fire compartment below the secondary beam. 2 mm jacketed thermocouples were used for measuring the temperature of the structure. there were 12 thermocouples in the composite slab, on the beams 11, in bolted connections 37, see fig. 6, in the hollow core panels 6, in the concrete wall 16, in the external cladding 24, in the fire pprotected internal column 7, and on the external column 24. on the west linear scaffold a meteorological station was installed to record the external temperature, the wind direction and wind speed. the behaviour was documented by photographs, a video and thermo imaging records. 4.1 temperatures the prediction of the gas temperatures by the parametric fire curve and by the zone model conservatively expected a temperature of 1057 °c in 60 min of fire, see fig. 7. under the composite slab with castellated beams, a temperature of 935 °c was measured in 60 min. at the beginning of the fire, the highest gas temperatures were reached at the front of the fire compartment, and during the fully-developed fire the highest temperatures were at the back of the fire compartment. the east and west part of the compartment showed different temperature developments. in the eastern part of the fire compartment with the concrete wall, a temperature of 810 °c was reached in 21 min, at temperature of 935 °c in 30 min, and a temperature of 855 °c in 58 min. in the western part of the fire compartment the developed gas temperature was very similar to the nominal standard fire curve [5]. the temperatures in the two parts of the fire compartment differed due to the different walls, and due to a small change in wind direction during the test. 4.2 response of the structure the lower flange at the midspan of the unprotected castellated beam as4 reached 487 °c in 23 min with deformation 135 mm, see fig. 8 and 9. in 34 min of fire, the temperature was 790 °c and the deflection was 378 mm. the slab failed 62 min into the test, in the cooling phase of the fire, with a measured temperature of 895 °c in the lower flange of the beam, at the mid span. the damage to the ceiling started in the southeast corner. the slab lost its resistance in compression 62 min into the experiment. the edge beam buckled on its developed free length. 
due to spalling of the top of the concrete column, the anchors lost their tensile resistance. the bolted connection of the primary box girder was exposed to torsion, which led to loss of its bolt shear resistance. 4.3 connection temperatures one of the goals of this experiment was to examine connections with higher fire resistance. this was done by encasing them in the concrete slab. the maximum temperature of 78 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 49 no. 1/2009 fig. 6: thermocouples on beam-to-column connection gas temparature, °c 0 15 30 45 60 75 90 105 120 135 150 200 0 400 600 800 1000 zone model time, min parametric fire curve tg06 tg08 thermocouple thermocouple nominal standard fire curve tg08 tg06 openings fig. 7: comparison of predicted and measured gas temperatures the lower part of the beam-to-column joint was 520 °c, whereas the upper encased part reached a temperature of 157 °c. the highest temperature of the lower flange of the beam in the midspan was 932 °c. in the case of beam-to-beam connections, the temperature differences were similar; the lower part of the joint reached a maximum temperature of 410 °c, and the upper encased part reached 198 °c, while the lower flange at the midspan of the beam reached a temperature of 881 °c. the end plate of the connections deformed plastically before the collapse of the slab, see fig. 10. fig. 11 presents the tem© czech technical university publishing house http://ctn.cvut.cz/ap/ 79 acta polytechnica vol. 49 no. 1/2009 0 as6 15 30 45 200 400 600 800 1000 as2 as5 as4 beams lower flange temparature [°c] time [min] 0 as2 as4 as5 as6 n tc fig. 8: temperatures of the lower flanges of the cellular beams 15 30 45 -200 -400 -600 -800 0 0 deflection [mm] v3 v1 v7 v5 time [min] v1 v3 v5 v7n fig. 9: deflection of the ceiling a) b) c) fig. 10: a) deflections of the structure in 58 min, b) the beam-to-column connection after the test, c) deformation of the end plate of the beam-to-beam connection perature in the connection of castellated beam as4 to the column, and fig. 12 shows the temperatures of the connection of castellated beam as5 to the primary beam. the safir program, see [6], was selected to predict the temperature in the connection, which was partially encased in the concrete slab. a 3d model of the joint is shown in fig. 13. the fire was modelled using the ozone 2.2 program, see [7]. the predicted temperatures in the connection are shown in fig. 14. 80 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 49 no. 1/2009 0 100 200 300 400 500 600 700 800 0 15 30 45 60 75 tc47 tc48 tc50 tc51 tc53 tc54 tc55 temperature, °c time, min tc48 tc47 tc55 tc50 tc51 tc53tc54 fig. 11: temperatures in the beam-to-column connection of the castellated beam as4 0 100 200 300 400 500 600 700 800 0 15 30 45 60 temperature, °c tc69 tc71 tc66 tc67 tc64 tc63 tc70 time, min75 tc64 tc63 tc71 tc66 tc67 tc69tc70 fig. 12: temperatures in the beam-to-beam connection of the castellated primary beam as5 fig. 13: simulation of temperatures of the beam-to-column connection in 60 min 0 200 400 600 800 1000 0 20 40 60 80 100 120 temperature, °c time, min gas temp. lower flange upper bolt next to upper bolt lower bolt upper flange fig. 14: predicted temperatures in the connection 4 summary the fire test shows the differences between the behaviour of the element and of the structure exposed to high temperatures during a fire test. the collapse of the composite slab was reached. 
the maximum temperature of the lower bolt in the beam-to-column connection reached 56 % of the temperature in the lower flange at the beam midspan, and the upper encased bolt reached 17 % of the maximum in the flange at the midspan. in the case of the beam-to-beam connection, the temperature in the lower unprotected bolt was 46 % of the maximum temperature in the flange of the beam at the midspan, while the upper protected bolt reached 22 % of the same maximum temperature.

acknowledgment

this work was supported by project no. oc 190 fire improved joints, and by project no. 1m0579 of the cideas research centre of the ministry of education, youth and sports.

references

[1] kallerová, p., wald, f.: požární zkouška na experimentálním objektu v mokrsku. praha: česká technika – nakladatelství čvut v praze, srpen 2008, isbn 978-80-01-04146-8.
[2] wald, f., simoes da silva, l., moore, d. b., lennon, t., chladna, m., santiago, a., beneš, m., borges, l.: experimental behaviour of a steel structure under natural fire. fire safety journal, vol. 41 (2006), issue 7, p. 509–522.
[3] kallerová, p., wald, f.: ostrava fire test. czech technical university in prague, cideas report no. 3-2-2-4/2, p. 18, www.cideas.cz.
[4] chlouba, j., wald, f., sokol, z.: temperature of connections during fire on steel framed building. international journal of steel structures, accepted for printing.
[5] en 1991-1-2: 2002. eurocode 1: basis of design and actions on structures – part 2-2: actions on structures – actions on structures exposed to fire. cen, brussels.
[6] franssen, j. m., kodur, v. k. r., mason, j.: user's manual for safir 2004: a computer program for analysis of structures subjected to fire. university of liège, 2005.
[7] ozone v2, university of liège, url: http://www.argenco.ulg.ac.be/logiciel.php.

jiří chlouba, e-mail: jiri.chlouba@fsv.cvut.cz
františek wald
department of steel and timber structures, czech technical university in prague, faculty of civil engineering, thákurova 7, 166 29 prague 6, czech republic

implementation of a power combining network for a 2.45 ghz transmitter combining linc and eer

f. wang, o. koch

abstract: a power combining network with a 180° hybrid for a 2.45 ghz transmitter combining linc and eer has been analyzed, built and measured. the network feeds part of the wasted outphasing power back to the power supply and therefore improves the overall power efficiency. the recycling circuit was designed and simulated with ads. a measured peak recycling efficiency of 60 % was achieved with commercial schottky diodes at 2.45 ghz at an input power of 36 dbm.

keywords: linc, eer, clier, power combining network, recycling.

1 introduction

power efficiency and linearity are two important properties in conventional linear amplifier design and cannot be achieved simultaneously. linear amplification with nonlinear components (linc [1]) can produce output signals with high linearity after combining the antiphase outputs of two nonlinear power amplifiers. envelope elimination and restoration (eer [2]) is an approach for high-efficiency amplification. due to the class-s power amplifier, which acts as a power supply for the output power amplifier of eer, eer is limited to signals with bandwidths up to several mhz. the combination of linc and eer (clier [3]) takes advantage of three nonlinear power amplifiers to achieve linear amplification. the inherent characteristic of this architecture allows the power amplifiers to operate continuously at their peak power efficiency, and potentially improves the overall efficiency of the system. the structure is shown in fig. 1.

fig. 1: block diagram of the clier principle

however, a major disadvantage of this approach is the power wasted when the two power amplifiers are outphasing, if a conventional 180° hybrid is used as a combiner. an alternative combining approach, named after chireix and extensively analyzed in [4], suffers from incomplete isolation of the two class-e power amplifiers.
the two amplified signal components tend to travel and reflect back and forth between the two amplifier branches, and the two amplifiers appear to interfere with each other. as a result, significant signal distortion can occur. in this paper, a different approach is presented and designed: the hybrid approach with a rat-race coupler is used, but the power recycling scheme simply uses an rf-dc converter to recycle part of the wasted power back to the battery and enhance the power efficiency. the focus of this paper is on the recycling network.

2 theory

the baseband source signal of clier, s(t), can be written as

$s(t) = a(t)\,e^{j\varphi(t)}, \qquad a(t) \ge 0,$   (1)

where $a(t)$ is the envelope of the baseband input signal and $e^{j\varphi(t)}$ describes its phase. the envelope of the input signal is digitally low-pass filtered. the resulting signal $a_1(t)$ is amplified using a highly efficient class-s amplifier. the quotient of the envelope $a(t)$ and the low-pass filtered envelope $a_1(t)$, together with the original phase of the signal, is then amplified via the linc principle. the resulting outphasing phase $\theta(t)$ is given by

$\theta(t) = \arccos\!\left(k\,\frac{a(t)}{a_1(t)}\right).$   (2)

note that the argument of the arccosine is limited to 1, and if the limit is exceeded, clipping occurs; the amount of clipping is determined by the clipping factor k. the lower part of the envelope, $a_1(t)$, is fed to the class-e amplifiers via the supply voltage, restoring the lower part of the envelope. after the dsp, the resulting linc signals can be written as

$s_1(t) = \frac{v\,a_1(t)}{2}\,\exp\!\big(j(\varphi(t)+\theta(t))\big),$   (3)

$s_2(t) = \frac{v\,a_1(t)}{2}\,\exp\!\big(j(\varphi(t)-\theta(t))\big).$   (4)

these two signals are fed into a 3 db combiner. if the two amplifier branches are perfectly matched, i.e. their gain and phase characteristics are precisely the same, an amplified replica of the original signal is obtained, as the in-phase components add together and the out-of-phase components cancel each other. in this case the desired output signal is obtained at the summing port. the differential portion of the power is consumed at the resistive load and turns into waste heat, which degrades the overall power efficiency of clier. an important figure of merit of a linc-based system is the average output efficiency, which can be expressed as

$\eta_o = \frac{\langle |s_+(t)|^2 \rangle}{\langle |s_+(t)|^2 \rangle + \langle |s_-(t)|^2 \rangle},$   (5)

where $s_+(t)$ and $s_-(t)$ are the in-phase and out-of-phase components of $s_1(t)$ and $s_2(t)$, respectively. it describes how much of the produced power is used at the output. in this paper a technique for partial recovery of the wasted power at the differential port is implemented; the isolation between the two amplifiers is not degraded.
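to make eqs. (1)–(5) concrete, here is a minimal numerical sketch of the outphasing decomposition and the resulting output efficiency; the toy envelope, the constant stand-in for a1(t) and all parameter values are illustrative assumptions, not taken from the measured system:

```python
import numpy as np

t = np.linspace(0.0, 1.0, 10_000)
a = np.abs(np.cos(2 * np.pi * 3 * t))       # toy envelope a(t)
phi = np.zeros_like(t)                      # toy phase phi(t)
a1 = np.full_like(t, a.max())               # crude stand-in for the low-pass filtered envelope a1(t)
k, v = 1.0, 1.0                             # clipping factor and amplifier gain (illustrative)

theta = np.arccos(np.clip(k * a / a1, -1.0, 1.0))       # outphasing angle, eq. (2)
s1 = 0.5 * v * a1 * np.exp(1j * (phi + theta))          # eq. (3)
s2 = 0.5 * v * a1 * np.exp(1j * (phi - theta))          # eq. (4)

s_sum = s1 + s2          # in-phase parts add: the wanted signal at the summing port
s_diff = s1 - s2         # out-of-phase parts: wasted at the differential port

eta_o = np.mean(np.abs(s_sum) ** 2) / (
    np.mean(np.abs(s_sum) ** 2) + np.mean(np.abs(s_diff) ** 2))   # eq. (5)
print(f"average output efficiency eta_o = {eta_o:.3f}")  # <cos^2 theta> = 0.5 here
```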
this new technique was called the power recycling or re-use technique in [5, 6]. the idea is simple: replace the power-wasting resistive load with an rf-dc converter to recover the wasted power back to the power supply, and hence improve the overall power efficiency of the amplifier system. the overall efficiency of the entire clier system can then be written as

$\eta = \frac{p}{p_{dc}} = \frac{\eta_o\,\eta_c\,\eta_e\,\eta_s}{1-(1-\eta_o)\,\eta_c\,\eta_e\,\eta_s\,\eta_r},$   (6)

in which $\eta_o$, $\eta_c$, $\eta_e$, $\eta_s$ and $\eta_r$ are the output efficiency, the efficiency of the combiner, the efficiency of the class-e amplifier, the efficiency of the class-s amplifier and the efficiency of the recycling network, respectively. the total efficiency as a function of the output efficiency and the recycling efficiency is shown in fig. 2. we can see that the system efficiency depends on the output efficiency, and that it increases significantly with recycling for medium $\eta_o$.

fig. 2: efficiency as a function of the output efficiency $\eta_o$ and the recycling efficiency $\eta_r$; the other values are $\eta_c = 0.95$, $\eta_e = 0.9$, $\eta_s = 0.9$
fig. 3: outphasing power amplifier with power recycling

a diagram of this recycling approach is illustrated in fig. 3. a 180° hybrid combiner is configured as the power splitter to divide the wasted power into two 180° outphased portions. these two signals are then fed to a high-speed schottky diode pair through an impedance matching network. the schottky diodes rectify the rf waves and the dc components are withdrawn back to the power supply. a large-value shunt capacitor and/or a series inductor may be used to reject the harmonic currents. the matching network is adjusted to optimize the system performance. an optional isolator can be added between the hybrid combiner and the power splitter to improve the isolation.

the simplest case is that of a continuous-wave input signal, where the signal magnitude is constant. to simplify the analysis, an ideal resistive model is assumed for the schottky diode, i.e. a fixed on-resistance $r_d$ in series with the built-in potential $v_d$, an infinite off-resistance and a negligible shunt capacitance. all other components are assumed ideal. the analysis starts from the diode side. each schottky diode conducts over an angle of $2\theta$. the current through the upper diode can be written in the following form [6]:

$i_{d1}(t) = \begin{cases} \dfrac{v_{pk}\cos(\omega_c t) - v_{sup} - v_d}{r_d}, & \cos(\omega_c t) \ge \dfrac{v_{sup}+v_d}{v_{pk}},\\[2mm] 0, & \text{otherwise}, \end{cases}$   (7)

where $v_{pk}$ is the peak signal voltage applied to the diode, $v_{sup}$ is the power supply voltage and $\omega_c$ is the carrier frequency of the input signal. the conduction angle $\theta$ of the upper diode is thus determined by

$\cos\theta = \frac{v_{sup}+v_d}{v_{pk}}, \qquad 0 \le \theta \le \frac{\pi}{2}.$   (8)

the input signal is a periodic function with a period of $2\pi$. in the actual circuit, the transmit signal contains not only the carrier frequency (the fundamental) but also higher-order frequencies (harmonics), because of the nonlinearity of the recycling network. so the actual diode current can be written as a fourier series. for the simplest case, the diode current can be expanded as

$i(t) = i_0 + \sum_{k=1}^{\infty} i_k \cos(k\,\omega_c t),$   (9)

where $i_k$ is the kth-order harmonic of the diode current. the dc component of the upper diode current is thus given by

$i_0 = \frac{v_{sup}+v_d}{\pi\,r_d}\,(\tan\theta - \theta).$   (10)
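a small sketch of eqs. (6), (8) and (10) follows; the diode parameters are those quoted later for figs. 4 and 5 ($v_{sup}$ = 30 v, $v_d$ = 1.6 v, $r_d$ = 4.8 Ω), while the peak voltage and the efficiencies fed to eq. (6) are illustrative assumptions:

```python
import numpy as np

def overall_efficiency(eta_o, eta_r, eta_c=0.95, eta_e=0.9, eta_s=0.9):
    """overall clier efficiency, eq. (6); recycled power reduces the net dc draw."""
    return (eta_o * eta_c * eta_e * eta_s) / (
        1.0 - (1.0 - eta_o) * eta_c * eta_e * eta_s * eta_r)

def conduction_angle(v_pk, v_sup=30.0, v_d=1.6):
    """diode conduction half-angle theta from eq. (8); the diode conducts over 2*theta."""
    return np.arccos((v_sup + v_d) / v_pk)

def dc_component(theta, v_sup=30.0, v_d=1.6, r_d=4.8):
    """dc component of the upper diode current, eq. (10)."""
    return (v_sup + v_d) / (np.pi * r_d) * (np.tan(theta) - theta)

v_pk = 45.0                       # illustrative peak voltage; must exceed v_sup + v_d
theta = conduction_angle(v_pk)
i0 = dc_component(theta)
print(f"theta = {np.degrees(theta):.1f} deg, i0 = {i0:.3f} a")
print(f"eta with recycling (eta_r = 0.6): {overall_efficiency(0.5, 0.6):.3f}")
print(f"eta without recycling:            {overall_efficiency(0.5, 0.0):.3f}")
```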
the fundamental component and the higher-order harmonics of the upper diode current are

$i_k = \frac{(v_{sup}+v_d)\sec\theta}{\pi\,r_d}\left[\frac{\sin\big((k-1)\theta\big)}{k-1} + \frac{\sin\big((k+1)\theta\big)}{k+1} - \frac{2\cos\theta\,\sin(k\theta)}{k}\right], \qquad k \ge 2.$   (11)

obviously, the fundamental components and all odd-order harmonic currents of the two diodes are 180° out of phase and hence cancel out; only the dc components and the even-order harmonics are left. a large-value shunt capacitor may be sufficient to short the harmonic currents to ground, and a series inductor may be added to better reject the harmonic currents from the power supply. in practice, microstrip lines are used instead of the capacitor. the recycled portion of the power is hence

$p_r = 2\,i_0\,v_{sup}.$   (12)

the rf-dc conversion is a strongly nonlinear process. in such a case, the large-signal impedance of the device is usually estimated from the fundamental components of the voltage and current waveforms. the fundamental component of the upper diode current at the carrier frequency is

$i_1 = \frac{v_{sup}+v_d}{\pi\,r_d}\,(\theta\sec\theta - \sin\theta).$   (13)

the power available to the recycling network needs to be known in order to calculate the recycling efficiency. an isolator can be placed between the two hybrids to eliminate the wave reflected back to the two power amplifiers. looking from the first hybrid (see fig. 3), the load is always matched to 50 Ω, while looking from the second hybrid, the isolator acts like an ideal voltage source $v_s$ in series with an internal resistance of 50 Ω. the source voltage is

$v_s = \sqrt{2}\left(\frac{v_{pk}}{n} + n\,z_0\,i_1\right),$   (14)

where the scaling factor n results from the 1:n matching network and the factor $\sqrt{2}$ comes from the fact that the hybrid is a power-addition device. the power available to the recycling network, $p_{ava}$, is

$p_{ava} = \frac{v_s^2}{8\,z_0} = \frac{(v_{sup}+v_d)^2}{4\,z_0}\left[\frac{\sec\theta}{n} + \frac{n\,z_0}{\pi\,r_d}\,(\theta\sec\theta-\sin\theta)\right]^2.$   (15)

so the recycling efficiency is

$\eta_r = \frac{p_r}{p_{ava}}.$   (16)

if we account only for the fundamental components, we can obtain the input impedance of the recycling network,

$z_{in} = n^2\,\frac{v_{pk}}{2\,i_1} = \frac{n^2\,\pi\,r_d}{2\,(\theta - \sin\theta\cos\theta)}.$   (17)

so the input impedance of the recycling network and the recycling efficiency vary with the diode conduction angle, which is in fact determined by the power delivered to the recycling network. the reflection coefficient of the recycling network is

$\Gamma_{in} = \frac{z_{in}-z_0}{z_{in}+z_0}.$   (18)

the vswr is thus, according to its standard definition,

$\mathrm{vswr} = \frac{1+|\Gamma_{in}|}{1-|\Gamma_{in}|}.$   (19)

figs. 4 and 5 show 3d plots of the recycling efficiency and the reflection coefficient as functions of the impedance transform ratio n of the matching network and the source available power $p_{ava}$, with the following parameters: $v_{sup} = 30$ v, $v_d = 1.6$ v and $r_d = 4.8$ Ω. it is clear that there is a close relationship between the optimum recycling efficiency and the lowest reflection coefficient.

fig. 4: recycling efficiency as a function of n and the available power
fig. 5: reflection coefficient as a function of n and the available power

3 measured results and discussion

the power recycling circuit and the 180° hybrid ring coupler of fig. 3 were designed and simulated with advanced design system (ads) and fabricated on rt/duroid 5880. the measurement setup is illustrated in fig. 6.
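the dependence of $\eta_r$ and $\Gamma_{in}$ on n and $p_{ava}$ (figs. 4 and 5) can be reproduced numerically by solving eq. (15) for the conduction angle. the sketch below uses the same ideal-diode assumptions; the swept n values are illustrative, and the solver bracket assumes $p_{ava}$ exceeds the θ → 0 limit of eq. (15), which holds for the values chosen here:

```python
import numpy as np
from scipy.optimize import brentq

V_SUP, V_D, R_D, Z0 = 30.0, 1.6, 4.8, 50.0   # parameters quoted for figs. 4 and 5

def p_available(theta, n):
    """source available power, eq. (15), as a function of the conduction angle."""
    term = 1.0 / (n * np.cos(theta)) + n * Z0 / (np.pi * R_D) * (
        theta / np.cos(theta) - np.sin(theta))
    return (V_SUP + V_D) ** 2 / (4.0 * Z0) * term ** 2

def recycling_point(n, p_ava):
    """solve eq. (15) for theta, then evaluate eta_r (16) and |gamma_in| (18)."""
    theta = brentq(lambda th: p_available(th, n) - p_ava, 1e-6, np.pi / 2 - 1e-6)
    i0 = (V_SUP + V_D) / (np.pi * R_D) * (np.tan(theta) - theta)      # eq. (10)
    eta_r = 2.0 * i0 * V_SUP / p_ava                                   # eqs. (12), (16)
    z_in = n ** 2 * np.pi * R_D / (2.0 * (theta - np.sin(theta) * np.cos(theta)))  # eq. (17)
    gamma = abs((z_in - Z0) / (z_in + Z0))                             # eq. (18)
    return eta_r, gamma

p_ava = 10 ** (36 / 10) / 1000.0   # 36 dbm -> watts, as in the measured circuit
for n in (1.5, 2.0, 2.5, 3.0):
    eta_r, gamma = recycling_point(n, p_ava)
    print(f"n = {n:.1f}: eta_r = {eta_r:.2f}, |gamma_in| = {gamma:.2f}")
```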
the power amplifier is used to provide the proper power level to the recycling network, since the signal generator is not capable of delivering so much power. the directional coupler is used to monitor the input power into the recycling network via the power meter. the circulator is used to isolate the power amplifier from the recycling network and enables the measurement of the reflected power using the spectrum analyzer. the external 50 Ω load simulates the power dissipation of the class-s power amplifier. the center frequency used for all tests is 2.45 ghz. the dc current $i_{rec}$ is measured and the recycled power calculated. for all measurements the supply voltage is kept fixed at 30 v (25 v, 20 v) and the input power level is varied to determine the power recycling efficiency.

fig. 6: diode detector test setup
fig. 7: photograph of the measured circuit optimized for 36 dbm

one of the recycling circuits is optimized at 36 dbm input power for a 30 v power supply voltage with surface-mount schottky diodes (hbat540c). the layout of this circuit is shown in fig. 7. figs. 8 and 9 show the measured recycling efficiency and the measured reflection coefficients as functions of the input power for three different power supply voltages. the measured peak recycling efficiency is found to be 57.13 % for a supply voltage of 20 v, 61.88 % for a supply voltage of 25 v and 60.9 % for a supply voltage of 30 v. in order to test the bandwidth, the supply voltage and the input power level were kept constant and the frequency was swept; the result is shown in fig. 10. the bandwidth of this circuit is about 100 mhz. another circuit is optimized at 41 dbm input power for a 30 v power supply with the surface-mount schottky diode group (hsms280e). its measured peak recycling efficiency is found to be 62.7 % for a 20 v supply, 63.4 % for a 25 v supply and 63.7 % for a 30 v supply.

fig. 8: comparison of calculated, simulated and measured recycling efficiency as a function of the input power level

4 conclusion

the power recycling technique has been presented for the optimum power combining network of the clier system. this network acts as an rf-dc converter while maintaining sufficient isolation. the analysis demonstrates that a proper trade-off among the diodes, the power supply and the input power of the recycling network is critical for the performance of the system. with the diode groups, two more recycling circuits were designed; their recycling efficiencies were optimized for input power levels of 36 dbm and 41 dbm, respectively. for all measurements, a peak recycling efficiency of about 60 % was achieved, but problems arise due to excessively high currents and voltages at the diodes. the measurement results agree well with the simulations and still show some similarity to the ideal case. this simple technique promises to improve the power efficiency of the outphasing microwave power amplifier while maintaining its high linearity performance.

references

[1] cox, d. c.: linear amplification with nonlinear components. ieee-tc, vol. 41 (1974), no. 4, p. 1942–1945.
[2] kahn, l. r.: single-sideband transmission by envelope elimination and restoration. radioengineering, vol. 40 (1952), p. 803–806.
[3] rembold, b., koch, o.: clier – combination of linc and eer method. electronics letters, vol. 42 (2006), p. 900–901.
[4] birafane, a., kouki, a.: on the linearity and efficiency of outphasing microwave amplifiers. ieee-mtts, vol. 52 (2004), no. 5, p. 1702–1708.
[5] langridge, r., thornton, t., asbeck, p. m., larson, l. e.: a power reuse technique for improved efficiency of outphasing microwave power amplifiers. ieee-mtts, vol. 47 (1999), no. 8, p. 1467–1470.
[6] zhang, x., larson, l. e., asbeck, p. m., langridge, r. a.: analysis of power recycling techniques for rf and microwave outphasing power amplifiers. ieee-tcs, vol. 49 (2002), no. 5, p. 312–320.

fig. 9: comparison of calculated, simulated and measured reflection coefficients as a function of the input power level

fei wang, e-mail: wang@ihf.rwth-aachen.de
olivier koch, e-mail: koch@ihf.rwth-aachen.de
institute of high frequency technology, rwth aachen university, melatener strasse 25, 52074 aachen, germany
eu emissions trading scheme as implemented in the czech republic

t. chmelík, p. zámyslický

abstract: the paper reports on the implementation of the european emissions trading scheme (eu ets) in the czech republic. trading in co2 emission allowances is one of the most dynamic and progressive economic instruments of environmental policy. since coming into force in january 2005, the system has incorporated into company decision making a truly new phenomenon – the price of carbon in the form of allowances tradable on the european market, where the price of an allowance represents the cost to a company of emitting one ton of co2. the incentive, given by the opportunity costs, should motivate operators of installations emitting co2 to analyze options for carbon emission reduction and to behave rationally. in total, the system should result in a quite significant reduction in the compliance costs of co2 reduction when compared to alternatives such as taxes or emission limits, as a result of the flexibility given by the market. the most controversial and sensitive part of the system is the initial distribution of the allowances to emitters. the so-called national allocation plan has to provide answers to very difficult questions – to whom, how and how many allowances will be given. this paper focuses on a detailed description of the sectors and companies in the czech republic that are affected by the system and discusses various aspects of the scheme.

keywords: eu emissions trading scheme (eu ets), national allocation plan (nap), allowances, industry.

1 what is eu ets?

the eu emissions trading scheme (eu ets) has been developed by the eu to serve as one of the main instruments of the european community to achieve its reduction targets, as agreed in the kyoto protocol to the united nations framework convention on climate change. in this protocol, a group of industrialized countries agreed to take on concrete targets to reduce emissions of greenhouse gases to support the main objective of the convention – to stabilize the concentrations of greenhouse gases in the atmosphere and limit anthropogenic (human-related) influences on the global climate system.
the european union, as one of the signatories, agreed to reduce its emissions in the so-called commitment period (2008–2012) by 8 % compared to 1990. to achieve this target, various policies and measures have been, and have yet to be, implemented. industry is one of the key sectors emitting greenhouse gases, and an important question was which instrument to use. based on the experience of the usa with sulphur dioxide (so2) emissions trading, a co2 trading system was developed to cover all (eu-25) countries. this started with the shorter pre-commitment phase (2005–2007) and will continue with five-year trading phases as of 2008. the system is based on the following five pillars:

- allocation (a system of distribution of emission allowances to participants, based on harmonized criteria),
- registration (each industrial participant has to operate on the basis of a permit defining conditions for its operation and the monitoring of emissions),
- monitoring (a harmonized system of monitoring of emissions focusing on accuracy and quality of data),
- reporting (companies are obliged to monitor co2 emissions annually and surrender allowances equal to their emissions),
- verification (emissions are independently verified).

trading with allowances across the eu is not further limited. not only emitters but also other participants (such as brokers and investors) can trade, which helps market liquidity.

2 czech national allocation plan

the national allocation plan (nap) is a key element of the whole trading system. on the basis of legislative criteria it has to include certain types of emission sources (called installations) that emit carbon dioxide (co2). the eu system defines the criteria for obligatory participation in the system for key sectors (energy production, refineries, production and processing of metals, the mineral industry (ceramics, glass, cement, lime), pulp and paper). a key element of the allocation plan is that decisions on allocations must be made before the trading period begins, and ex-post adjustments are not allowed; this provides investment certainty for participants. allowances are distributed to operators free of charge (a voluntary auction of small amounts of allowances is allowed). the nap therefore represents mainly a distributional challenge. the allocation has to be based on non-discriminatory and fair criteria, both on a national and on an individual level, and the nap is also screened against state aid rules. each allocation plan has to pass a scrutiny assessment by the european commission, which assesses whether the nap fulfills all the criteria listed in the legislation. the czech allocation plan has been approved, and 97.6 million allowances are to be distributed among more than 400 participants each year in the period 2005–2007. the allocations in the czech allocation plan are based on a mathematical formula using historical emissions of installations, adjusted to reflect some specific elements (combined heat and power production, centralized heating systems, early action, etc.).

2.1 sectoral coverage

a simple comparison shows that some sectors, while having almost the same number of installations, have a much smaller share of total emissions. this indicates that some criteria in the system do not reflect the real emissions, but rather the installed capacity. therefore the system covers a number of small installations in terms of their annual emissions, and for these installations the system could represent more of a
therefore the system covers a number of small installations in terms of their annual emissions, and for these installations the system could represent more of a © czech technical university publishing house http://ctn.cvut.cz/ap/ 43 acta polytechnica vol. 47 no. 6/2007 eu emissions trading scheme as implemented in the czech republic t. chmelík, p. zámyslický the paper reports on the implementation of the european emissions trading scheme (eu ets) in the czech republic. trading in co2 emission allowances is one of the most dynamic and progressive economic instruments of environmental policy. since coming into force in january 2005, the system has incorporated into company decision making a truly new phenomenon – the price of carbon in the form of allowances tradable on the european market, where the price of the allowance represents the costs of a company when emitting one ton of co2. the incentive, given by the opportunity costs should motivate operators of installations emitting co2 to analyze options for carbon emission reduction and to behave rationally. in total, the system should result in a quite significant reduction in the compliance costs of co2 reduction when compared to alternatives such as taxes or emission limits, as a result of flexibility given by the market. the most controversial and sensitive part of the system is the initial distribution of the allowances to emitters. the so-called national allocation plan has to provide answers to very difficult questions – to whom, how and how many allowances will be given. this paper focuses on a detailed description of the sectors and companies in the czech republic that are affected by the system and discusses various aspects of the scheme. keywords: eu emissions trading scheme (eu ets), national allocation plan (nap), allowances, industry. burden than an opportunity, also. in addition, the real impact of these small companies (their share of national emissions is almost negligible) is disputable. this is particularly visible in the czech energy sector – public energy production (including all large powerplants that supply power to the national grid) and corporate energy production. this effect can be further demonstrated in a graph of cumulative allocation (as allocation is based on historical data, the graph would look almost the same if historic emissions were used). the vast majority of these installations contribute only a small percentage of the total emissions (allocation). 2.2 selected participants a closer look at company level allocation yields the following observations. eleven of the fifteen installations are from the energy production sector, and eight of them are owned by a single company, čez, which is by far the biggest player on the czech allowance market. these 15 companies have an allocation of 56.5 million allowances (almost 58 % of the total allocation). 44 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 47 no. 6/2007 share of total emissions number of installations public energy production (utilities) 66.59 % 139 corporate energy production 3.53 % 135 refineries 1.10 % 4 chemicals 5.28 % 17 coke 0.26 % 2 production and processing of metals 16.22 % 19 cement 2.95 % 6 lime 1.34 % 5 glass 0.84 % 21 ceramics 0.78 % 60 pulp 0.19 % 2 paper and board 0.91 % 16 total 426 source: ministry of the environment, 2005 table 1: nap for the czech republic, by sector 0 00 %. 10 00 %. 20 00 %. 30 00 %. 40 00 %. 50 00 %. 60 00 %. 
fig. 1: share of total emissions, by sector
fig. 2: number of installations, by sector
fig. 3: cumulative allocation

an interesting point is that with the planned acquisition of mittal steel and vysoké pece ostrava by a single company, the biggest installation will not be an energy company, but a company from the metal sector. however, public energy production will remain the key sector from the allowance point of view.

3 conclusions

the simple survey made in this paper indicates that the eu ets system in the czech republic covers installations varying in size from several million allowances annually to several hundred allowances annually. while the number of participants in the system (around 12 000 throughout eu-25) clearly helps liquidity on the market and can bring opportunities in the variability of abatement costs, the administrative burden of the scheme must also be taken into account. this involves not only the size of the scheme and the burden related to government "management" of the system, but also the burden on the companies themselves (costs related to participation, monitoring, verification of data, etc.). the inclusion of sectors such as ceramics and glass is particularly questionable, because in these two sectors (together with corporate energy production) most installations emit only small amounts of co2. one issue discussed on the european level is the introduction of a minimum threshold on emissions, so that obligatory inclusion in the scheme would be based not only on technical criteria (e.g. installed capacity), but also on the level of emissions in past years. this could significantly reduce the number of installations covered, probably without a major impact on the scope of trading. however, such adjustments cannot be made without broader consensus across eu-25 and without changes to the legislation. it is planned to consider these ideas within a revision of the system due to begin in the second half of 2006.

acknowledgments

the study described in this paper was supervised by jaroslav knapek.

references

national allocation plan of the czech republic 2005 to 2007, ministry of environment of the czech republic, september 2005.

ing. tomáš chmelík, e-mail: tomas_chmelik@env.cz
ing. pavel zámyslický, e-mail: pavel_zamyslicky@env.cz
dept. of economics, management and humanities, czech technical university in prague, faculty of electrical engineering, technická 2, 166 27 praha, czech republic

fig. 4: allocation (amount of allowances) to selected installations (čez plants, energotrans mělník 1, třinecké železárny, mittal steel, sokolovská uhelná, chemopetrol, elektrárna opatovice and vysoké pece ostrava)
higher-spin triplet fields and string theory

d. sorokin

abstract: we review basic properties of reducible higher-spin multiplets, called triplets, and demonstrate how they naturally appear as part of the spectrum of string field theory in the tensionless limit. we show how, in the frame-like formulation, the triplet fields are endowed with the geometrical meaning of being components of higher-spin vielbeins and connections, and we present actions describing their free dynamics.

1 introduction

it is a pleasure to write this contribution on the occasion of the anniversary of jiri niederle. as a topic i have chosen a subject to which professor niederle and his collaborators have contributed several interesting papers [1, 2, 3]. it concerns higher-spin field theory. in particular, i would like to discuss a system of higher-spin fields which are called triplets. the physical states of triplet fields describe massless particles of decreasing spins: in the bosonic case the physical states of a triplet have spins s, s − 2, s − 4, . . . , 1 or 0, and the physical states of a fermionic triplet have spins s, s − 1, s − 2, . . . , 1/2. in other words, the triplets form reducible poincaré-group multiplets of massless higher-spin particles. as we shall see, they naturally arise in a tensionless limit of string field theory [4, 5, 6, 7, 8, 9, 10] as sets of three tensor fields (and this is where their name comes from [8]). we shall also discuss basic group-theoretical and geometrical properties of the higher-spin triplets revealed in [11].

let us start with a brief generic discussion of the problems of higher-spin field theory. the construction of a consistent interacting theory of higher-spin fields is one of the oldest long-standing problems in theoretical physics of particles and fields. since the early 1930s, it has been addressed by many distinguished theorists, including majorana, dirac, fierz & pauli, rarita & schwinger, bargmann & wigner, fang & fronsdal, weinberg, velo & zwanziger, aragone & deser, singh & hagen, de wit & freedman, fradkin & vasiliev and (later on) by many others (for reviews on various aspects of higher-spin field theory and references see e.g. [12, 13, 14, 15, 16, 17]).

the main problem of higher-spin field theory is the construction of consistent interactions of higher-spin fields. higher-spin interactions, e.g. with the electromagnetic field and gravity, must respect gauge symmetries and be consistent with quantum-mechanical principles such as unitarity, causality, etc. to solve the problem of higher-spin interactions, one first looks for the most appropriate description of free higher-spin fields which would allow for a generalization to an interacting theory. revealing the group-theoretical and geometrical structures underlying this theory is of great importance. until now, the most successful general approach to the construction of nonlinear (i.e. interacting) classical higher-spin field equations has been the so-called unfolding technique put forward by m. vasiliev [18, 19] (see [12] for a review and references). great efforts have been made in studying both massive and massless higher-spin field theories. jiri niederle has concentrated, in particular, on studying electromagnetic interactions of massive higher-spin fields, which, actually, reveal major issues of the generic problem [2, 3].
it would be very interesting and very important for the further development of this subject to compare the results obtained in the construction of higher-spin interactions with the structure of a known example of a consistent higher-spin field theory, which is string theory. (let us recall that string theory emerged as a proposal to explain, among other things, the mass-spin dependence, known as the regge trajectory, of higher-spin hadronic resonances observed in experiments in the 1960s.) people have worked in this direction since the early years of string theory, and this is how, for example, the system of higher-spin triplet fields was first found [4]. so let us make a very short overview of the place of higher-spin fields in string theory.

2 string theory as a theory of interacting massive higher-spin fields

string excitations give rise to an infinite number of fields of increasing spin and mass, the mass of a string state being a linear function of its spin (regge trajectory), with the proportionality coefficient given by the string tension t. for instance, in open string theory

$m_s^2 \sim t\,(s-1).$   (2.1)

the infinite tower of higher-spin string states plays a crucial role in ensuring a smooth ultraviolet behavior (or even uv finiteness) of superstring theory, thus making it a consistent theory of quantum gravity, since gravity is an intrinsic part of string theory.

in string theory the higher-spin excitations are massive, but our experience with the quantum field theory of vector fields teaches us that for quantum consistency the mass of the vector fields should be generated as a result of spontaneous breaking of a symmetry of massless vector fields. if one tries to extrapolate this statement to string theory, the natural question arises whether string theory can be a spontaneously broken phase of an underlying gauge theory of massless higher-spin fields. from eq. (2.1) we see that the mass tends to zero in the limit in which the string tension goes to zero. so people have tried to answer this question by taking a tensionless limit of string theory (see, e.g., [20, 21, 22, 23] for various aspects of the tensionless string limit and higher-spin theory).

one can try to deduce the structure of higher-spin field interactions from the action of string field theory. for example, the action of open string field theory has the following schematic chern-simons-like form (see [24] for more details)

$s_{\text{open string}} = \langle\phi|\,q\,|\phi\rangle + |\phi\rangle^3,$   (2.2)

where $|\phi\rangle$ is a string field and q is a brst operator associated with the symmetries of string theory. due to the complexity of the problem, only terms in the lagrangian describing a system of free massless higher-spin fields have been obtained from the string field theory action so far. as we have already mentioned, this system of massless fields was first obtained by s. ouvry and j. stern in 1986 [4]. the simplest triplet consists of symmetric tensor or (in the case of half-integer spins) tensor-spinor fields of rank s, s − 1 and s − 2,

$\phi_{m_1\cdots m_s}(x), \qquad c_{m_1\cdots m_{s-1}}(x), \qquad d_{m_1\cdots m_{s-2}}(x),$   (2.3)

where the indices m = 0, 1, . . . , d − 1 are indices of a space-time of dimension d. the spectrum of the physical states of these fields consists of particles of decreasing spin: s, s − 2, s − 4, . . . , 1 or 0 in the case of bosons, and s, s − 1, s − 2, . . . , 1/2 in the case of fermions.
since each value of spin corresponds to an irreducible representation of the poincaré group, the triplet of the fields (2.3) describes a reducible multiplet of higher-spin states.

3 massless higher-spin triplets from string theory

let us consider how the equations of motion of the triplet fields (2.3) arise in the tensionless and free-field limit of string field theory. in the tensionless limit $t \to 0$, the nilpotent open bosonic string brst operator is (see e.g. [9])

$q = c_0\,p_m p^m + \sum_{k\neq 0}\left(c_{-k}\,p_m\,a^m_k - \frac{k}{2}\,b_0\,c_{-k}\,c_k\right), \qquad k = \pm 1, \pm 2, \ldots, \pm\infty, \qquad q^2 = 0,$   (3.4)

where $p_m$ is the momentum of the string center of mass, $a^m_k$ are string creation (for k < 0) and annihilation (for k > 0) oscillator operators (with |k| = 1, 2, . . . , ∞ labelling the regge trajectories), $c_0$ and $c_k$ are string reparametrization ghosts, and $b_0$ and $b_k$ are antighosts. the (anti)commutation relations satisfied by these operators are

$[a^m_k, a^n_l] = k\,\delta_{l+k,0}\,\eta^{mn}, \qquad \{c_k, b_l\} = \delta_{k+l,0}.$   (3.5)

a string field is a sum of an infinite number of higher-spin fields generated by acting on the string vacuum with all possible combinations of creation operators,

$|\phi\rangle = \sum \phi_{mn\cdots p\cdots q\cdots}(x)\; a^m a^n \cdots a^p \cdots a^q\, c \cdots b \cdots b\,|0\rangle.$   (3.6)

now one may ask whether independent finite subsets of higher-spin fields exist inside (3.6) which satisfy (at least linearized) string field equations of motion that follow from the action (2.2). the linear field equations are

$q\,|\phi\rangle = 0.$   (3.7)

because of the nilpotency of the brst operator, they are invariant under the brst gauge transformations

$\delta|\phi\rangle = q\,|\lambda\rangle, \qquad q^2 = 0.$   (3.8)

to extract from (3.6) a finite independent set of fields satisfying eq. (3.7), let us pick the string states corresponding to a single regge trajectory (e.g. k = 1) and cut this trajectory at the level of spin s. it then turns out that the string field (3.6) reduces and splits into independent pieces, each of which is the sum of three terms:

$|\phi\rangle_{\text{triplet}} = \phi_{m_1\cdots m_s}(x)\,a^{m_1}_{-1}\cdots a^{m_s}_{-1}|0\rangle + c_{m_1\cdots m_{s-1}}(x)\,a^{m_1}_{-1}\cdots a^{m_{s-1}}_{-1}\,c_0\,b_{-1}|0\rangle + d_{m_1\cdots m_{s-2}}(x)\,a^{m_1}_{-1}\cdots a^{m_{s-2}}_{-1}\,c_{-1}\,b_{-1}|0\rangle.$   (3.9)

by independent we mean that each set of fields (3.9) satisfies the string field equation (3.7) on its own. the symmetric tensor fields $\phi_s(x)$, $c_{s-1}(x)$ and $d_{s-2}(x)$ form the simplest bosonic higher-spin triplet, which satisfies the entangled equations of motion that follow from (3.7), namely

$\Box\,\phi_{m_1\cdots m_s} = s\,\partial_{(m_s} c_{m_1\cdots m_{s-1})}, \qquad \Box \equiv \partial_n\,\partial^n,$
$c_{m_1\cdots m_{s-1}} = \partial^n \phi_{n m_1\cdots m_{s-1}} - (s-1)\,\partial_{(m_{s-1}} d_{m_1\cdots m_{s-2})},$
$\Box\, d_{m_1\cdots m_{s-2}} = \partial^n c_{n m_1\cdots m_{s-2}},$   (3.10)

where $(m_1\cdots m_p)$ denotes symmetrization of the indices with weight $\frac{1}{p!}$. equations (3.10) are invariant under the gauge transformations with a local symmetric parameter $\lambda_{m_1\cdots m_{s-1}}(x)$ that follow from the brst symmetry (3.8):

$\delta\phi_{m_1\cdots m_s} = s\,\partial_{(m_s}\lambda_{m_1\cdots m_{s-1})},$
$\delta c_{m_1\cdots m_{s-1}} = \Box\,\lambda_{m_1\cdots m_{s-1}},$
$\delta d_{m_1\cdots m_{s-2}} = \partial^n \lambda_{n m_1\cdots m_{s-2}}.$   (3.11)

from the second of eqs. (3.10) we see that $c_{s-1}$ is an auxiliary field, expressed in terms of derivatives of $\phi_s$ and $d_{s-2}$, and from the form of the gauge variation of $d_{s-2}$ we see that this field is pure gauge. thus all physical degrees of freedom are contained in the field $\phi_{m_1\cdots m_s}(x)$. as we have already mentioned, and as follows from the analysis of eqs. (3.10)–(3.11) [9], the physical degrees of freedom of $\phi_{m_1\cdots m_s}(x)$ are particle states of spin s, s − 2, s − 4, . . . , 1 or 0 (depending on whether s is odd or even). a question which can now be addressed is whether the triplet fields have some geometrical nature.
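as a quick consistency check, and a minimal sketch rather than anything from the original analysis, one can verify with sympy that the gauge variations (3.11) leave eqs. (3.10) invariant for the lowest nontrivial case s = 2; with unit-weight symmetrization the factor s in (3.10)–(3.11) becomes the explicit sum of gradients:

```python
import sympy as sp

# spin-2 check of the triplet gauge invariance, eqs. (3.10)-(3.11), in d = 4
x = sp.symbols('x0 x1 x2 x3')
eta = sp.diag(-1, 1, 1, 1)                                  # mostly-plus minkowski metric
lam = [sp.Function(f'lam{m}')(*x) for m in range(4)]        # gauge parameter lambda_m

def box(f):
    return sum(eta[m, m] * sp.diff(f, x[m], 2) for m in range(4))

def div(vec):
    return sum(eta[m, m] * sp.diff(vec[m], x[m]) for m in range(4))

# gauge variations for s = 2: delta phi_mn, delta c_m, delta d
dphi = [[sp.diff(lam[n], x[m]) + sp.diff(lam[m], x[n]) for n in range(4)] for m in range(4)]
dc = [box(lam[m]) for m in range(4)]
dd = div(lam)

# the variation of each equation of motion must vanish identically
eq1 = [[sp.simplify(box(dphi[m][n]) - sp.diff(dc[n], x[m]) - sp.diff(dc[m], x[n]))
        for n in range(4)] for m in range(4)]
eq2 = [sp.simplify(dc[m]
                   - sum(eta[n, n] * sp.diff(dphi[n][m], x[n]) for n in range(4))
                   + sp.diff(dd, x[m])) for m in range(4)]
eq3 = sp.simplify(box(dd) - div(dc))

assert all(e == 0 for row in eq1 for e in row) and all(e == 0 for e in eq2) and eq3 == 0
print("gauge variation of eqs. (3.10) vanishes identically for s = 2")
```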
4 geometrical meaning of hs triplet fields: frame-like formulation

higher-spin field theory is a gauge theory. understanding its underlying group-theoretical and geometrical structure is of great importance for making progress in solving the higher-spin interaction problem, and understanding the geometrical nature of the triplets is part of this generic problem. as far as the physical field $\phi_{m_1\cdots m_s}(x)$ is concerned, its group-theoretical properties resemble very much those of the metric field of general relativity. the gravitational field $g_{mn}(x)$ is symmetric and transforms under linearized diffeomorphisms as a gauge field, $\delta g_{mn} = \partial_m\xi_n(x) + \partial_n\xi_m(x)$, which is similar to the transformation properties (3.11) of $\phi_{m_1\cdots m_s}(x)$.

what about the auxiliary fields $c_{s-1}(x)$ and $d_{s-2}(x)$? can they be related to other geometrical quantities, such as a generalized christoffel symbol, i.e. a higher-spin connection associated with the higher-spin local symmetry? the answer to this question is positive. the geometrical nature of the triplet fields manifests itself rather clearly in the frame-like formulation [11], which is the generalization to higher-spin fields [25, 26] of the cartan-tamm-weyl formulation of gravity in terms of vielbeins and spin connections.

in the frame formulation of gravity the gravitational field is described by a vielbein one-form

$e^a = dx^m\,e_m{}^a(x),$   (4.12)

which carries a tangent-space lorentz index. in this formulation, in addition to the local diffeomorphisms, the theory of gravity possesses a local lorentz symmetry

$\delta e^a(x) = e^b\,\xi_b{}^a(x),$   (4.13)

where $\xi^{ab} = -\xi^{ba}$ is the parameter of the local lorentz transformations. the gauge field associated with the local lorentz symmetry is the spin connection

$\omega^{ab} = dx^m\,\omega^{ab}_m(x) = -\omega^{ba},$   (4.14)

which transforms under the infinitesimal local lorentz transformations as follows:

$\delta\omega^{ab} = d\,\xi^{ab} - \xi^{ac}\,\omega_c{}^{b} + \xi^{bc}\,\omega_c{}^{a}.$   (4.15)

the metric field $g_{mn}$ is composed of the vielbeins,

$g_{mn} = e^a_m\,e^b_n\,\eta_{ab}.$   (4.16)

in the linear approximation, in which $e_m{}^a = \delta^a_m + \tilde e_m{}^a(x)$ and $g_{mn} = \eta_{mn} + \tilde g_{mn}(x)$, eq. (4.16) reduces to

$\tilde g_{ab} = \tilde e_{ab} + \tilde e_{ba}.$   (4.17)

note that in the linear approximation there is no distinction between the world indices m, n, . . . and the tangent-space indices a, b, . . . einstein gravity is characterized by the so-called torsion-free constraint, which relates the vielbein and the spin connection:

$d\,e^a + e^b\,\omega_b{}^a = 0.$   (4.18)

by analogy with the frame formulation of gravity, in the frame-like formulation of higher-spin field theory [25, 26] one introduces a higher-spin vielbein one-form, symmetric in its s − 1 tangent-space indices,

$e^{a_1\cdots a_{s-1}} = dx^m\,e_m{}^{a_1\cdots a_{s-1}},$   (4.19)

and a higher-spin connection

$\omega^{a_1\cdots a_{s-1},b} = dx^m\,\omega_m{}^{a_1\cdots a_{s-1},b},$   (4.20)

which is symmetric in the indices $a_1\cdots a_{s-1}$ and satisfies the following properties: i) the symmetrization of all of its tangent-space indices a and b gives zero,

$\omega^{(a_1\cdots a_{s-1},b)} = 0,$   (4.21)

and ii) the trace of the index b with any of the indices a is zero,

$\omega^{a_1\cdots a_{s-1},b}\,\eta_{a_1 b} = 0.$   (4.22)

the properties (4.21) and (4.22) of the higher-spin connection are somewhat analogous to the antisymmetry property of the spin connection (4.14). to describe single higher-spin fields, one should impose on the higher-spin vielbein and connection the stronger traceless conditions (see [12, 11] for a review and references)

$e^{a_1 a_2\cdots a_{s-1}}\,\eta_{a_1 a_2} = 0, \qquad \omega^{a_1 a_2\cdots a_{s-1},b}\,\eta_{a_1 a_2} = 0.$   (4.23)

if eqs. (4.23) are satisfied, then the traceless condition (4.22) for ω follows from (4.23) and (4.21).
as has been shown in [11], the reducible triplet multiplets are described by e and ω which are not subject to the traceless constraints (4.23), while ω satisfies the less restrictive constraint (4.22). as in the case of einstein gravity, we also assume that the higher-spin vielbein and connection satisfy the torsion-free condition

$d\,e^{a_1\cdots a_{s-1}} - dx^m\,\omega^{a_1\cdots a_{s-1},b}\,\eta_{mb} = 0.$   (4.24)

the torsion-free condition is preserved under the following variations of e and ω, which are the local gauge symmetries of the frame-like formulation:

$\delta e^{a_1\cdots a_{s-1}} = d\,\xi^{a_1\cdots a_{s-1}}(x) - (s-1)\,dx^m\,\xi^{a_1\cdots a_{s-1},b}(x)\,\eta_{mb},$
$\delta\omega^{a_1\cdots a_{s-1},b} = d\,\xi^{a_1\cdots a_{s-1},b}(x) - (s-2)\,dx^m\,\xi^{a_1\cdots a_{s-1},bc}(x)\,\eta_{mc},$   (4.25)

where the parameter $\xi^{a_1\cdots a_{s-1}}$ is symmetric, the parameter $\xi^{a_1\cdots a_{s-1},b}(x)$ has the same symmetry properties as ω, and the parameter $\xi^{a_1\cdots a_{s-1},bc}$ is symmetric in the indices $a_1\cdots a_{s-1}$ and (separately) in the indices bc. in addition, the symmetrization in $\xi^{a_1\cdots a_{s-1},bc}$ of either b or c with all a gives zero, and the trace of b or c with any a also gives zero.

the higher-spin triplet fields (3.9) turn out to be components of the higher-spin vielbein and connection [11]. the relation is as follows:

$\phi^{a_1\cdots a_s} = e_m{}^{(a_1\cdots a_{s-1}}\,\eta^{a_s)m}, \qquad d^{a_1\cdots a_{s-2}} = e_m{}^{a_1\cdots a_{s-2}a_{s-1}}\,\delta^m_{a_{s-1}}, \qquad c^{a_1\cdots a_{s-1}} = (s-1)\,\omega_m{}^{a_1\cdots a_{s-1},m} + \partial^m e_m{}^{a_1\cdots a_{s-1}}.$   (4.26)

the relation (4.26) between the two sets of fields has been established in [11] by comparing their gauge transformations and equations of motion in the frame-like and metric-like formulations. in the frame-like formulation the equations of motion for the higher-spin vielbein and connection are the torsion-free condition (4.24), which allows one to express the components of ω in terms of derivatives of e, up to the stueckelberg gauge transformations with the parameter $\xi^{a_1\cdots a_{s-1},bc}(x)$ of eq. (4.25), and the following differential equation for ω (written schematically; the index after the semicolon is the world index and square brackets denote antisymmetrization):

$\delta^{m}_{(b}\,\partial^{c}\,\omega^{d}{}_{;\,n_1\cdots n_{s-2})\,[c,d]} + \partial^{d}\,\omega_{(b;\,n_1\cdots n_{s-2})}{}^{[m}{}_{d]} + \partial_{(b}\,\omega^{d}{}_{;\,n_1\cdots n_{s-2})\,[d}{}^{m]} = 0,$   (4.27)

this equation is the dynamical field equation of the higher-spin vielbein, once ω is expressed in terms of ∂e by virtue of the torsion-free conditions. the equations of motion (4.24) and (4.27) can be obtained from an action which generalizes to higher-spin fields the action for gravity in the frame formulation. in any dimension d > 1 the gravity action can be written as an integral over differential forms,

$s[e^a,\omega^{ab}] = \frac{1}{2}\int_{M^d} e^{a_1}\cdots e^{a_{d-2}}\,\varepsilon_{a_1\ldots a_{d-2}bc}\,R^{bc} = \frac{1}{2}\int_{M^d} e^{a_1}\cdots e^{a_{d-2}}\,\varepsilon_{a_1\ldots a_{d-2}bc}\,\big(d\,\omega^{bc} + \omega^{bd}\,\omega_d{}^{c}\big),$   (4.28)

or, up to a total derivative,

$s[e^a,\omega^{ab}] = \frac{1}{2}\int_{M^d}\varepsilon_{a_1\ldots a_{d-3}dbc}\,e^{a_1}\cdots e^{a_{d-3}}\Big((d-2)\,de^{d}\,\omega^{bc} + e^{d}\,\omega^{bf}\,\omega_f{}^{c}\Big) = \frac{d-2}{2}\int_{M^d}\varepsilon_{a_1\ldots a_{d-3}bcd}\,e^{a_1}\cdots e^{a_{d-3}}\Big(de^{b} - \frac{1}{2}\,e_{f}\,\omega^{bf}\Big)\,\omega^{cd},$   (4.29)

where the wedge product of the differential forms is understood. the higher-spin generalization of the linearized gravity action (4.29) in minkowski space is

$s = \int_{M^d} dx^{a_1}\cdots dx^{a_{d-3}}\,\varepsilon_{a_1\ldots a_{d-3}pqr}\left(d\,e^{n_1\ldots n_{s-2}p} - \frac{s-1}{2}\,dx_m\,\omega^{n_1\ldots n_{s-2}p,m}\right)\omega_{n_1\ldots n_{s-2}}{}^{q,r}.$   (4.30)

this action is a straightforward generalization of the four-dimensional action of [25]. the torsion-free condition (4.24) and the dynamical equation (4.27) are obtained by varying eq. (4.30) with respect to ω and e, respectively.
let us recall that if, in equations (4.24), (4.27) and (4.30), the higher-spin vielbein and connection, together with the corresponding gauge-symmetry parameters, were subject to the additional traceless condition (4.23) in their symmetric target-space indices, the frame-like formulation based on the action (4.30) would describe irreducible physical states corresponding to massless particles of a single spin s (see [11] for more details). it is a certain relaxation of the trace constraints on the higher-spin vielbeins and connections which extends the irreducible higher-spin system to the reducible (triplet) higher-spin multiplet. it should also be noted that, because of the peculiarity of the form of the frame-like action (4.30), which is constructed from the wedge product of the one-form ω and (the differential of) e, it describes higher-spin triplets whose physical states have spins s, s − 2, . . . , 3 or 2. the scalar and the vector fields can nevertheless be added to this construction in a conventional way (see [12, 11] for more discussion of this point).

an analogous formulation has been constructed in [11] for the fermionic higher-spin triplets describing physical states of spin s, s − 1, . . . , 3/2. as in the bosonic case regarding the scalar and the vector states of the triplets, the field of spin 1/2 can be included in this construction as an independent field. the action for the fermionic triplets is of first order in derivatives of the fermionic one-form

$\psi^{a_1\cdots a_{s-3/2},\,\alpha} = dx^m\,\psi_m{}^{a_1\cdots a_{s-3/2},\,\alpha},$   (4.31)

where α is the spinor index and s is now a half-integer; the field ψ is symmetric in the indices a. in a d-dimensional flat space-time the fermionic triplet action in the frame-like formulation has the very simple form

$s = i\int_{M^d} dx^{a_1}\cdots dx^{a_{d-3}}\,\varepsilon_{a_1\ldots a_{d-3}pqr}\left(\bar\psi_{d_1\ldots d_{s-3/2}}\,\gamma^{pqr}\,d\psi^{d_1\ldots d_{s-3/2}} - 6\left(s-\tfrac{3}{2}\right)\bar\psi_{d_1\ldots d_{s-5/2}}{}^{p}\,\gamma^{q}\,d\psi^{d_1\ldots d_{s-5/2}\,r}\right).$   (4.32)

it is invariant under the following local transformations of the field ψ:

$\delta\psi^{a_1\ldots a_{s-3/2}} = d\,\xi^{a_1\ldots a_{s-3/2}} - \left(s-\tfrac{3}{2}\right) dx_b\,\xi^{a_1\ldots a_{s-3/2},\,b},$   (4.33)

with the tensor-spinor parameter $\xi^{\alpha\,a_1\ldots a_{s-3/2},b}$ having the symmetry properties

$\xi^{a_1\ldots a_{s-3/2},b} = \xi^{(a_1\ldots a_{s-3/2}),b}, \qquad \xi^{(a_1\ldots a_{s-3/2},b)} = 0,$   (4.34)

and being subject to the (gamma-)trace constraints with respect to the index b,

$\gamma_b\,\xi^{a_1\ldots a_{s-3/2},b} = 0, \qquad \xi^{a_1\ldots a_{s-5/2}\,b,}{}_{b} = 0.$   (4.35)

if the fermionic one-form ψ and the fermionic parameters of the gauge transformations (4.33) were (in addition) gamma-traceless in the indices a, the action (4.32) would describe a single (irreducible) field of spin s.

the frame-like formulation of the bosonic and fermionic higher-spin triplet fields considered above can be generalized to describe these fields in anti-de sitter space [11]. in particular, the use of the frame-like formulation has allowed us to overcome technical difficulties encountered in [9, 27] and to obtain in [11] a description of the fermionic higher-spin triplets in ads. this is a first step towards the study of consistent interactions of fermionic triplets since, as has been known for a long time, a consistent theory of interacting massless higher-spin fields should be formulated in a space-time background of constant curvature, like the ads space.

5 conclusion

we have seen how a triplet system of fields of spin s, s − 2, . . . , 1 or 0 appears in a truncated action of string field theory.
using the frame-like formulation, we endowed these fields with the geometrical meaning of higher-spin vielbeins and connections, subject to a torsion-free condition and transforming under higher-spin local symmetries. the frame-like actions for the bosonic and fermionic triplet fields have been constructed in flat and ads spaces. it is of great interest and importance to generalize these results by analyzing possible interactions of the higher-spin triplet fields (see [17, 28] for a recent discussion of this issue and references). this would give new insight into the structure of higher-spin field theory, string theory and the ads/cft correspondence.

acknowledgement

this work was partially supported by the infn special initiative tv12, the intas project grant 051000008-7928, and an excellence grant of fondazione cariparo.

references

[1] kotecky, r., niederle, j.: conformally covariant field equations ii. first order massless equations. rept. math. phys. 12 (1977) 237–249.
[2] niederle, j., nikitin, a. g.: relativistic wave equations for interacting massive particles with arbitrary half-integer spins. phys. rev. d64 (2001) 125013.
[3] niederle, j., nikitin, a. g.: relativistic coulomb problem for particles with arbitrary half-integer spin. j. phys. a39 (2006) 10931–10944, arxiv:hep-th/0412214.
[4] ouvry, s., stern, j.: gauge fields of any spin and symmetry. phys. lett. b177 (1986) 335.
[5] bengtsson, a. k. h.: a unified action for higher spin gauge bosons from covariant string theory. phys. lett. b182 (1986) 321.
[6] henneaux, m., teitelboim, c.: first and second quantized point particles of any spin. in: quantum mechanics of fundamental systems 2, santiago 1987, proceedings, p. 113–152.
[7] pashnev, a. i.: composite systems and field theory for a free regge trajectory. theor. math. phys. 78 (1989) 272–277.
[8] francia, d., sagnotti, a.: on the geometry of higher-spin gauge fields. class. quant. grav. 20 (2003) s473–s486.
[9] sagnotti, a., tsulaia, m.: on higher spins and the tensionless limit of string theory. nucl. phys. b682 (2004) 83–116.
[10] barnich, g., bonelli, g., grigoriev, m.: from brst to light-cone description of higher spin gauge fields. arxiv:hep-th/0502232.
[11] sorokin, d. p., vasiliev, m. a.: reducible higher-spin multiplets in flat and ads spaces and their geometric frame-like formulation. nucl. phys. b809 (2009) 110–157.
[12] bekaert, x., cnockaert, s., iazeolla, c., vasiliev, m. a.: nonlinear higher spin theories in various dimensions. arxiv:hep-th/0503128.
[13] vasiliev, m. a.: higher spin gauge theories in various dimensions. fortsch. phys. 52 (2004) 702–717.
[14] sorokin, d.: introduction to the classical theory of higher spins. aip conf. proc. 767 (2005) 172–202, arxiv:hep-th/0405069.
[15] bouatta, n., compere, g., sagnotti, a.: an introduction to free higher-spin fields. arxiv:hep-th/0409068.
[16] sagnotti, a., sezgin, e., sundell, p.: on higher spins with a strong sp(2,r) condition. arxiv:hep-th/0501156.
[17] fotopoulos, a., tsulaia, m.: gauge invariant lagrangians for free and interacting higher spin fields. a review of the brst formulation. int. j. mod. phys. a24 (2009) 1–60.
[18] vasiliev, m. a.: equations of motion of interacting massless fields of all spins as a free differential algebra. phys. lett. b209 (1988) 491–497.
[19] vasiliev, m. a.: consistent equations for interacting massless fields of all spins in the first order in curvatures. annals phys. 190 (1989) 59–106.
[20] gross, d. j.: high-energy symmetries of string theory. phys. rev. lett. 60 (1988) 1229.
[21] Sundborg, B.: Stringy gravity, interacting tensionless strings and massless higher spins, Nucl. Phys. Proc. Suppl. 102 (2001) 113–119.
[22] Lindström, U., Zabzine, M.: Tensionless strings, WZW models at critical level and massless higher spin fields, Phys. Lett. B584 (2004) 178–185.
[23] Bonelli, G.: On the tensionless limit of bosonic strings, infinite symmetries and higher spins, Nucl. Phys. B669 (2003) 159–172.
[24] Witten, E.: Interacting field theory of open superstrings, Nucl. Phys. B276 (1986) 291.
[25] Vasiliev, M. A.: ‘Gauge’ form of description of massless fields with arbitrary spin (in Russian), Yad. Fiz. 32 (1980) 855–861.
[26] Aragone, C., Deser, S.: Higher spin vierbein gauge fermions and hypergravities, Nucl. Phys. B170 (1980) 329.
[27] Buchbinder, I. L., Galajinsky, A. V., Krykhtin, V. A.: Quartet unconstrained formulation for massless higher spin fields, Nucl. Phys. B779 (2007) 155–177.
[28] Fotopoulos, A., Tsulaia, M.: Current exchanges for reducible higher spin multiplets and gauge fixing, JHEP 10 (2009) 050.

Dmitri Sorokin
E-mail: dmitri.sorokin@pd.infn.it
INFN, Sezione di Padova
Via F. Marzolo 8, 35131 Padova, Italia

Analysis of feature point distributions for fast image mosaicking algorithms

A. Behrens, H. Röllinger

Abstract

In many algorithms the registration of image pairs is done by feature point matching. After the feature detection is performed, all extracted interest points are usually used for the registration process without further analysis of the feature point distribution. However, in the case of small and sparse sets of feature points of fixed size, suitable for real-time image mosaicking algorithms, a uniform spatial feature distribution across the image becomes relevant. Thus, in this paper we discuss and analyze algorithms which provide different spatial point distributions from a given set of SURF features. The evaluations show that a more uniform spatial distribution of the point matches results in lower image registration errors, and is thus more beneficial for fast image mosaicking algorithms.

Keywords: feature detection, point distribution, image registration, image mosaicking.

1 Introduction

Alignment and stitching of images into seamless photo mosaics is widely used in computer vision [1]. Besides the approach of minimizing pixel-to-pixel dissimilarities directly, a common way to combine image pairs into one image composition uses only sparse sets of interest points. Here, distinctive feature points are first extracted from the images and described, and then matched using a similarity measurement. Many different feature-based algorithms, like SIFT [2], SURF [3], GLOH [4], MOPS [5] or [6], have been proposed for extracting distinctive image features. Their robustness, repeatability and invariance to different illumination changes and image transformations have also been widely evaluated by [4, 7, 8, 9]. On the one hand, a high level of distinctiveness of the features, expressed by their descriptors and the filter response values of the detector, leads to robust matching and precise image alignment. On the other hand, these characteristics alone do not always result in a low registration error, since the spatial feature distribution in the images is also relevant. Images which show high contrast only in local image regions (see Fig. 1) usually create only clusters of strong features and provide an overall non-uniform spatial feature distribution.
Since the estimation of a global image transformation between a given image pair is based on matched feature points, image regions with many interest points will create only small registration errors, whereas the error in regions far from the feature clusters becomes larger. Thus, only the combination of distinctive and also spatially uniformly distributed feature points leads to a good global image transformation with low registration errors. While in many natural image scenes feature points can usually be extracted across the whole image, the spatial distribution is often a relevant consideration for medical images, e.g. endoscopic images of the internal urinary bladder wall. These images often show only sparsely located structures with high contrast, e.g. vasculature or lesions (see Fig. 1), and thus impede robust image mosaicking.

Fig. 1: Left: landscape with a high contrast region in the lower right corner. Right: a noisy and low contrast endoscopic bladder image showing a lesion.

During cystoscopy, where a rigid endoscope is introduced into the bladder, an inspection and analysis of bladder cancer (the 4th most common cancer among males in the United States, with about 52,810 new cases estimated in 2009 [10]) is performed by a physician based on endoscopic images captured by a video endoscope system. Since sufficient illumination of the bladder wall is only provided if the endoscope is guided close to the bladder wall, the field of view is very small. This results in difficulties in orientation and navigation for the physician. To provide navigation assistance during cystoscopy, a feature-based mosaicking algorithm composing local or global panoramic overview images from single video images has been proposed by [11, 12, 13, 14]. For this application, a fixed number of feature points is required for real-time processing [14]. It is therefore necessary to generate an adequate sample set of the extracted feature list for robust image mosaicking. While some registration algorithms have been developed that create feature distributions based on local non-maximum suppression [5] or space partitioning [15], these different techniques have not been evaluated. Thus, in this paper we analyze these different feature distribution algorithms and evaluate their average image registration errors.

The paper is organized as follows: in Section 2 the mosaicking algorithm is described. The distribution algorithms for selecting a feature sample set are described in Section 3. Section 4 presents the evaluated data, and the conclusions and future work are given in Section 5.

2 Mosaicking algorithm

In the following sections only a brief overview of the mosaicking algorithm is given; a more detailed description can be found in [12, 14]. During the mosaicking process, image pairs of a captured video sequence are sequentially stitched and blended to compose a successively growing panoramic overview image. When endoscopic images are used, the lens distortions of the fish-eye optics are compensated in a preprocessing step. Then, distinctive feature points are extracted and described by their local neighborhood region. Based on these sparse feature sets, point correspondences are matched between two subsequent images, and a global image-to-image transformation, a so-called homography, is estimated. The two images are registered by applying the transformation. Finally, the overlap regions are blended by an interpolation algorithm. Iteratively, all images of the video sequence are then sequentially processed to build the final panoramic overview image, as sketched below.
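To make the processing chain concrete, the following Python sketch outlines the sequential pairwise registration loop described above. It is an illustration only, not the authors' implementation; the four callables (detect_features, match_descriptors, estimate_homography_ransac, warp_and_blend) are hypothetical placeholders for the pipeline stages of Sections 2.1 and 2.2.

```python
import numpy as np

def mosaic(frames, detect_features, match_descriptors,
           estimate_homography_ransac, warp_and_blend):
    """Sequentially stitch a list of (undistorted) video frames into a panorama.

    All four callables are placeholders for the pipeline stages described in
    the text: feature detection, descriptor matching, RANSAC-based homography
    estimation, and blending of the overlap region.
    """
    panorama = frames[0].astype(np.float32)
    H_total = np.eye(3)   # accumulated transform: current frame -> panorama
    for prev, curr in zip(frames, frames[1:]):
        kp1, desc1 = detect_features(prev)    # e.g. SURF points + descriptors
        kp2, desc2 = detect_features(curr)
        matches = match_descriptors(desc1, desc2)          # ratio-tested pairs
        H = estimate_homography_ransac(kp1, kp2, matches)  # maps curr -> prev
        H_total = H_total @ H                 # chain into the global frame
        panorama = warp_and_blend(panorama, curr, H_total)
    return panorama
```

Chaining each pairwise homography into H_total registers every new frame directly into the reference system of the growing panorama.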
2.1 Feature detection

Feature extraction is performed by the Speeded Up Robust Features (SURF) detector [3]. Based on a very basic approximation of the Hessian matrix, the feature strength is calculated by
\[
\det(H) = \left\| \begin{bmatrix} L_{xx} & L_{xy} \\ L_{xy} & L_{yy} \end{bmatrix} \right\| \approx D_{xx} D_{yy} - (0.9 \cdot D_{xy})^2, \qquad (1)
\]
where $L_{xx}$ represents the convolution of the Gaussian second order derivative $\frac{\partial^2}{\partial x^2} g$ with the input image. Instead of iteratively reducing the image size, integral images and scalable box filters $D_{xx}$, $D_{yy}$, $D_{xy}$ are used to speed up the blob filter response. The distinctive feature points are localized in the space domain and over scales by non-maximum suppression. Feature descriptors are then extracted in a square region centered around the point of interest. Split up into smaller 4 × 4 square sub-regions, the Haar wavelet responses in the horizontal direction $h_x$ and the vertical direction $h_y$ are calculated and composed into a four-dimensional descriptor vector
\[
\vec d_{4d} = \left( \sum h_x, \sum h_y, \sum |h_x|, \sum |h_y| \right).
\]
The concatenation of the sixteen sub-regions then results in a descriptor vector $\vec d$ of length 64.

2.2 Matching and registration

The point correspondences between two images are determined by matching the SURF feature descriptors. Based on the least similarity measurement
\[
\delta_{\vec d_{ij}} = \min_j \left\| \vec d_i - \vec d_j \right\|_2^2, \qquad (2)
\]
a feature descriptor $\vec d_i$ of the first image is compared to all feature descriptors $\vec d_j$ of the second image within the 64-dimensional feature space, and vice versa. The minimum squared error $\delta_{\vec d_{ij}}$ then leads to the best point match $\vec d_i \leftrightarrow \vec d_j$. To increase the distinctiveness and robustness of the point correspondences, only matches with the ratio
\[
\frac{\delta_{\vec d_{ij}}}{\delta_{\vec d_{ij}}^{\,2nd}} < \tau \qquad (3)
\]
between the best and the second best match are considered. After single points of an image pair are matched, the homography H is estimated. The applied affine transformation model provides six degrees of freedom, parameterized by a translation vector $\vec t^{\,T} = (t_x, t_y)$, rotation angle α, scales $s_x$ and $s_y$, and skew a. In homogeneous coordinates, the homography matrix H can be written as
\[
H = \begin{bmatrix} a & b & c \\ d & e & f \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} A & \vec t \\ \vec 0^{\,T} & 1 \end{bmatrix} \qquad (4)
\]
with
\[
A = \begin{bmatrix} 1 & a \\ 0 & 1 \end{bmatrix} \begin{bmatrix} s_x & 0 \\ 0 & s_y \end{bmatrix} \begin{bmatrix} \cos(\alpha) & -\sin(\alpha) \\ \sin(\alpha) & \cos(\alpha) \end{bmatrix}. \qquad (5)
\]
To ensure robust homography estimation, unavoidable false point correspondences are rejected by the RANSAC (random sample consensus) model fitting algorithm [16]. Based on iteratively and randomly selected point correspondences $\vec p$, a homography $\hat H$ is estimated and the number of inliers is determined by the threshold operation
\[
\left\| \vec p_i - \hat H \cdot \vec p_j \right\|_2 < d. \qquad (6)
\]
If point correspondences $\vec p_i \leftrightarrow \vec p_j$ satisfy Eq. 6 with the estimation $\hat H$ and the given pixel distance d, they are marked as inliers. Finally, the estimated homography $\hat H$ with the highest number of inliers is selected as the final image transform and is used for registration by warping the second image into the reference system of the first image.
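The matching and robust estimation steps (Eqs. 2, 3 and 6) can be sketched compactly; the following is a minimal re-implementation for illustration, not the authors' code. The ratio threshold tau and the pixel distance d are free parameters, and the six-parameter affine model of Eq. 4 is fitted by least squares on three randomly drawn correspondences per RANSAC iteration (at least three correspondences are assumed).

```python
import numpy as np

def match_ratio(desc1, desc2, tau=0.7):
    """Return index pairs (i, j) passing the ratio test of Eq. (3)."""
    matches = []
    for i, d in enumerate(desc1):
        dist = np.sum((desc2 - d) ** 2, axis=1)   # squared errors, Eq. (2)
        j, j2 = np.argsort(dist)[:2]
        if dist[j] / dist[j2] < tau:              # best vs. second best match
            matches.append((i, j))
    return matches

def fit_affine(p, q):
    """Least-squares affine map q -> p from point arrays of shape (k, 2)."""
    A = np.hstack([q, np.ones((len(q), 1))])
    M, *_ = np.linalg.lstsq(A, p, rcond=None)     # (3, 2) parameter matrix
    H = np.eye(3)
    H[:2, :] = M.T
    return H

def ransac_homography(p, q, iters=500, d=3.0, rng=np.random.default_rng(0)):
    """Affine homography via RANSAC; inliers satisfy Eq. (6)."""
    best_H, best_inl = np.eye(3), 0
    ph = np.hstack([p, np.ones((len(p), 1))])
    qh = np.hstack([q, np.ones((len(q), 1))])
    for _ in range(iters):
        idx = rng.choice(len(p), size=3, replace=False)
        H = fit_affine(p[idx], q[idx])
        err = np.linalg.norm(ph - qh @ H.T, axis=1)   # pixel distances
        inl = np.sum(err < d)
        if inl > best_inl:                            # keep most inliers
            best_H, best_inl = H, inl
    return best_H
```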
3 Feature selection

Since the computational complexity of the SURF detector increases linearly with the number of feature points, the limitation to a fixed number is often desired for real-time mosaicking algorithms [14]. Also, calculating the feature descriptors is computationally more intensive than locating the features and evaluating the filter response value itself (Eq. 1). Thus, feature extraction is separated into two steps. First, the SURF feature detector is applied to the whole image to detect the locations and the strengths of potential interest points based on the filter response values. Then a sample set with a fixed number of points is chosen from all the point candidates. To assure constant computational complexity, only these features are then characterized by the 64-dimensional SURF descriptor and passed to the matching process. Since the usage of fewer feature points leads to less robust homography estimations and registration results, the selection of an adequate and representative sample set is very important.

3.1 Top N selection

A straightforward approach is to select the N strongest feature points from the whole set of interest points. For each feature point, the location and the response value of the blob filter are calculated according to Eq. 1 and used for comparison. After the extracted interest points have been sorted in descending order of their filter response values, the first N feature points are selected. Since this algorithm generates a sample set based only on the filter response, no additional information about the spatial distribution of the feature points is exploited.

3.2 Adaptive non-maximal suppression

To select a fixed number of interest points which are local maxima and whose response values are also significantly greater than those of all of their neighbors within a radius r, Brown et al. [5] developed an adaptive non-maximal suppression (ANMS) strategy. Conceptually, this can be performed by initializing the suppression radius r = 0 and then increasing it until the desired number of N feature points is obtained. In practice, this operation can be done by first sorting all local maxima by their response values, and then creating a second list sorted by decreasing suppression radius. The first entry in the list represents the global maximum, which is not suppressed at any radius. As the radius decreases from infinity, feature points are added to the list. To increase the robustness, a second constraint is defined, which requires that a neighbor feature has sufficiently greater strength. Thus, the minimum suppression radius $r_i$ is determined by
\[
r_i = \min_j \left| \vec x_i - \vec x_j \right|, \quad \text{with } f(\vec x_i) < c \cdot f(\vec x_j),\ \vec x_j \in I, \qquad (7)
\]
where $\vec x_i$ describes the position (x, y) of the feature point, $f(\vec x_i)$ its strength, and I the set of all feature point positions. The parameter c = 0.9 represents the robustness factor, which adjusts how significantly greater the strength of a neighbor feature must be for suppression to take place. For each feature point its radius is determined by Eq. 7, and the feature list is sorted by the radius in descending order. Then, the first N entries of the list are selected. This sample set now provides features which are spatially distributed across the whole image (cf. Fig. 4, right column).
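A direct O(k²) transcription of the suppression-radius rule of Eq. 7 may be helpful; this is an illustrative sketch with the robustness factor c = 0.9, not an optimized implementation.

```python
import numpy as np

def anms(points, strengths, n, c=0.9):
    """Adaptive non-maximal suppression after Brown et al. [5].

    points    : (k, 2) array of feature positions
    strengths : (k,) filter response values f(x_i)
    n         : number of features to keep
    Returns the indices of the n points with the largest suppression radii.
    """
    k = len(points)
    radii = np.full(k, np.inf)          # the global maximum keeps r = inf
    for i in range(k):
        # neighbours whose response is "sufficiently greater", Eq. (7)
        stronger = strengths[i] < c * strengths
        if np.any(stronger):
            dists = np.linalg.norm(points[stronger] - points[i], axis=1)
            radii[i] = dists.min()      # minimum suppression radius r_i
    return np.argsort(radii)[::-1][:n]  # decreasing suppression radius
```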
3.3 k-d tree partitioning

Another selection method that considers the spatial data distribution is based on space partitioning of high-dimensional data sets. Cheng et al. [15] developed an algorithm using a 2-dimensional k-d tree. Here, the feature points are separated into m rectangular image regions, or cells. In a recursive manner, each partition cuts the region with the current highest variance in two, using the median data value as the dividing line. The position of the dividing line is stored as the node's data. For each nonterminal node of the binary tree, the feature points of the related region are then handed over to the left and right child nodes, respectively, until the given number of m cells is obtained. Finally, each leaf node contains a list of the features which are located within the related image partition. An example is given in Fig. 2. From each cell, the ⌊n/m⌋ strongest features are selected as the output sample set (cf. Fig. 4, middle column). While Cheng [15] uses a balanced tree, which limits the number of leaf nodes to be a power of two, we have extended the algorithm to handle any integer number of points. Since only one feature is taken from each cell, the partitions are divided until the desired number of features n is obtained. Thus, a direct comparison between k-d tree partitioning, ANMS, and top N selection becomes more feasible; a sketch follows.

Fig. 2: Above: 2-dimensional k-d tree partitions with feature points. Below: the corresponding k-d tree.
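The recursive cell splitting can be sketched as follows; it divides on the axis of highest variance near the median, as described above, handles an arbitrary integer number of cells, and keeps the strongest feature per cell. A simplified illustration (assuming at least n feature points), not the implementation of [15].

```python
import numpy as np

def kdtree_select(points, strengths, n):
    """Select n features, one per k-d tree cell (strongest in each cell)."""
    def split(idx, cells):
        if cells == 1:
            return [idx]
        pts = points[idx]
        axis = int(np.argmax(pts.var(axis=0)))   # axis of highest variance
        order = idx[np.argsort(pts[:, axis])]
        left_cells = cells // 2
        right_cells = cells - left_cells
        # median-like cut, shifted if needed so each side keeps enough points
        mid = min(max(len(order) // 2, left_cells), len(order) - right_cells)
        return split(order[:mid], left_cells) + split(order[mid:], right_cells)

    cells = split(np.arange(len(points)), n)
    return np.array([c[np.argmax(strengths[c])] for c in cells])
```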
4 Evaluation and results

The three feature distribution algorithms (top N, ANMS, and k-d tree) are evaluated using three data sets (Fig. 3) which show varying contrast across the whole image. Each image pair represents one scene from two slightly different points of view, while still providing a large overlap region. Since the images are not synthesized, no ground truth data of the correct image composition is given. Thus, for each image pair control points $\vec p\,'_i \leftrightarrow \vec p\,'_j$ are manually set and matched. Based on these points, a homography is also estimated according to Eq. 4, and the registration error
\[
e = \frac{1}{2} \left( \left\| \vec p\,'_i - \hat H \cdot \vec p\,'_j \right\|_2 + \left\| \vec p\,'_j - \hat H^{-1} \cdot \vec p\,'_i \right\|_2 \right) \qquad (8)
\]
is calculated. These error values are used as references in the later evaluations.

Fig. 3: Data sets used for evaluation (tree-clinic, houses, bladder).

To evaluate the different feature distributions, SURF features are first extracted and listed for each data set. Each algorithm of Section 3 is then applied to the point lists to generate a sample set of only n feature points, resulting in different point distributions. Representative examples for each data set are shown in Fig. 4. The figure shows that the selection of the top N strongest features leads to single point clusters in all three data sets. For example, many SURF features show a high filter response value in the tree region of the tree-clinic sequence, since the dark branches have a high contrast against the bright background. By contrast, the k-d tree and also the ANMS algorithm provide a more spatially uniformly distributed set of feature points across the whole image and the relevant overlap regions, respectively. As shown in Fig. 4(c), more feature points are also located in the vasculature in the image center in the case of k-d tree partitioning and ANMS. Since this region is also present in the overlap region of the bladder sequence (cf. Fig. 3), consideration of these features during the homography estimation should provide a small registration error.

Fig. 4: For each data set, top N selection (left), k-d tree partitioning (middle), and ANMS (right) are performed (tree-clinic sequence with 200 feature points, houses sequence with 50 feature points, bladder sequence with 80 feature points).

After the different feature lists have been generated, they are passed to the matching process, and a global image-to-image transformation for each sample set is estimated, as described in Section 2.2. For a quantitative evaluation, the homographies are then applied to the ground truth control points $\vec p\,'_i \leftrightarrow \vec p\,'_j$ and the final registration errors are calculated using Eq. 8. Fig. 5 shows the registration error characteristics for each distribution method over the selected number of feature points. Since the RANSAC algorithm provides robust point matches and rejects outliers on the basis of a random process, mean error values are calculated. Thus, Fig. 5 shows the averaged registration errors and the numbers of inliers over the number of feature points for each data set. The results of the tree-clinic and houses sequences are averaged over 15 measurements, and the graphs of the bladder sequence are based on 20 measurements.

As can be seen from the characteristics, the mean registration errors calculated from feature points distributed by k-d tree partitioning and ANMS are usually smaller than with the top N method. Especially for a small number of points, the error difference increases. At higher numbers, the different algorithms provide almost the same registration errors. This is expected, since a larger point set usually leads to a spatial distribution with a higher variance. The results of the tree-clinic sequence also show that robust matching of the first N strongest features is not feasible until at least a minimum number of N = 210 feature points is obtained, and even then the error remains high. By contrast, the variance of the mean errors of the k-d tree and ANMS sample sets is much smaller. Although the registration errors of k-d tree partitioning and ANMS are almost equal in all image sequences, the number of inliers used for robust homography estimation is higher in the case of ANMS. Against this, the complexity of ANMS, with up to O((n − 1)!), is much higher than the O(log n) of the k-d tree. Thus, the integration of k-d tree partitioning in fast and real-time image mosaicking algorithms like [14] should be preferred.

5 Conclusions

An evaluation of the ANMS and k-d tree algorithms, which provide a spatially well uniformed feature distribution, has shown that the mean error of an image pair registration can be greatly reduced, compared to a sample set based on strongest-feature selection. This difference becomes significant in images showing regions of varying contrast. For applications that require a fixed or limited number of interest points, e.g. real-time image mosaicking algorithms for medical computer assistance systems, the usage of feature distribution algorithms is also beneficial. In future work, further evaluations will be made on more medical image data sets, together with time measurements of the distribution algorithms integrated in a real-time image mosaicking algorithm for fluorescence endoscopy. The generation of feature point sets based on an adaptive algorithm considering the feature strengths and their distribution according to the current image information will also be analyzed.

Acknowledgement

We would like to thank Prof. Dr.-Ing. Til Aach, Institute of Imaging and Computer Vision, RWTH Aachen University, Germany, for supervising this project. Additionally we would like to thank Olympus Winter & Ibe GmbH for providing the endoscopic image sequences.

Fig. 5: Mean registration errors (black) and numbers of inliers (red) of each distribution method (top N, ANMS, k-d tree) over the number of feature points, for the tree-clinic, houses and bladder sequences. The registration errors using the ground truth homography are highlighted in green.
References

[1] Szeliski, R.: Image alignment and stitching: a tutorial, Microsoft Research, 2006, Tech. Rep. MSR-TR-2004-92.
[2] Lowe, D.: Distinctive image features from scale-invariant keypoints, International Journal of Computer Vision, 2004, Vol. 60, No. 2, pp. 91–110.
[3] Bay, H., Ess, A., Tuytelaars, T., Van Gool, L.: SURF: Speeded up robust features, Computer Vision and Image Understanding (CVIU), 2008, Vol. 110, No. 3, pp. 346–359.
[4] Mikolajczyk, K., Schmid, C.: A performance evaluation of local descriptors, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005, Vol. 27, No. 10, pp. 1615–1630.
[5] Brown, M., Szeliski, R., Winder, S.: Multi-image matching using multi-scale oriented patches, IEEE Conference on Computer Vision and Pattern Recognition (CVPR'05), 2005, Vol. 1, pp. 510–517.
[6] Shi, J., Tomasi, C.: Good features to track, IEEE Conference on Computer Vision and Pattern Recognition (CVPR'94), 1994, pp. 593–600.
[7] Dorko, G.: Selection of discriminative regions and local descriptors for generic object class recognition, PhD thesis, 2006.
[8] Schmid, C., Mohr, R., Bauckhage, C.: Evaluation of interest point detectors, International Journal of Computer Vision, 2000, Vol. 37, No. 2, pp. 151–172.
[9] Mikolajczyk, K., Tuytelaars, T., Schmid, C., Zisserman, A., Matas, J., Schaffalitzky, F., Kadir, T., Van Gool, L.: A comparison of affine region detectors, Int. J. Comput. Vision, 2005, Vol. 65, No. 1–2, pp. 43–72.
[10] American Cancer Society: Cancer facts and figures 2009, 2009.
[11] Miranda-Luna, R., Daul, C., Blondel, W., Hernandez-Mier, Y., Wolf, D., Guillemin, F.: Mosaicking of bladder endoscopic image sequences: distortion calibration and registration algorithm, IEEE Transactions on Biomedical Engineering, 2008, Vol. 55, No. 2, pp. 541–553.
[12] Behrens, A.: Creating panoramic images for bladder fluorescence endoscopy, Acta Polytechnica Journal of Advanced Engineering, 2008, Vol. 48, No. 3, pp. 50–54.
[13] Behrens, A., Stehle, T., Gross, S., Aach, T.: Local and global panoramic imaging for fluorescence bladder endoscopy, Engineering in Medicine and Biology Society, EMBC 2009, 31st Annu. Int. Conf. of the IEEE, 2009, pp. 6990–6993.
[14] Behrens, A., Bommes, M., Stehle, T., Gross, S., Leonhardt, S., Aach, T.: A multi-threaded mosaicking algorithm for fast image composition of fluorescence bladder images, Proceedings SPIE Medical Imaging 2010: Visualization, Image-Guided Procedures, and Modeling, 2010, Vol. 7625.
[15] Cheng, Z., Devarajan, D., Radke, R. J.: Determining vision graphs for distributed camera networks using feature digests, EURASIP Journal on Advances in Signal Processing, 2007, Article ID 57034.
[16] Fischler, M. A., Bolles, R. C.: Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography, Communications of the ACM, 1981, Vol. 24, No. 6, pp. 381–395.

About the authors

Alexander Behrens was born in Bückeburg, Germany, in 1980. He received his Dipl.-Ing. degree in electrical engineering from the Leibniz University of Hannover, Hannover, Germany, in 2006. After receiving the degree, he worked as a research scientist at the university's Institut für Informationsverarbeitung. Since 2007, he has been a Ph.D. candidate at the Institute of Imaging and Computer Vision, RWTH Aachen University, Aachen, Germany.
His research interests are in medical image processing, signal processing, pattern recognition, and computer vision.

Hendrik Röllinger was born in 1983 in Aachen, Germany. He has been studying computer engineering at RWTH Aachen University since 2003, and will graduate with a Dipl.-Ing. degree in 2010.

Alexander Behrens, Hendrik Röllinger
E-mail: alexander.behrens@lfb.rwth-aachen.de, hendrik.roellinger@lfb.rwth-aachen.de
Institute of Imaging & Computer Vision
RWTH Aachen University
52056 Aachen, Germany

Psychoacoustic properties of Fibonacci sequences

J. Sokoll, S. Fingerhuth

Abstract

In 1202, Fibonacci set up one of the most interesting sequences in number theory. This sequence can be represented by so-called Fibonacci numbers, and by a binary sequence of zeros and ones. If such a binary Fibonacci sequence is played back as an audio file, a very dissonant sound results. This is caused by the "almost-periodic", "self-similar" property of the binary sequence. The ratio of zeros and ones converges to the golden ratio, as do the primary and secondary spectral components in their frequencies and amplitudes. These Fibonacci sequences will be characterized using listening tests and psychoacoustic analyses.

Keywords: Fibonacci sequence, psychoacoustics, listening tests.

1 Introduction

In 1202, the Italian mathematician Fibonacci (also known as Leonardo da Pisa) asked a simple question in his book Liber Abaci: if a pair of rabbits begets a pair of new rabbits after one year of maturing, and these rabbits beget another pair after maturing, how many pairs of rabbits will there be after n years? To make things simple, Fibonacci assumed that rabbits never die and breed every year. Fig. 1 shows the first 5 years of the rabbit population.

Fig. 1: The Fibonacci rabbit population.

The number of rabbit pairs in the nth year, $F_n$ (F for Fibonacci), is the number of pairs one year earlier, $F_{n-1}$ (because the rabbits do not die), plus the number of pairs two years earlier, $F_{n-2}$, because those are all mature and can reproduce:
\[
F_n = F_{n-1} + F_{n-2}. \qquad (1)
\]
Starting with one immature pair of rabbits ($F_1 = 1$, $F_2 = 1$), it is easy to calculate the number of pairs for the following years: 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, …

If an immature pair of rabbits is represented by a "0" and a mature pair by a "1", a binary Fibonacci sequence can be built. In the first year, a "0" represents the one immature pair, and in the second year a "1" represents the matured pair. In the third year, "10" represents the same mature pair and a newly born immature pair. Table 1 shows the binary Fibonacci sequence for the first 8 years.

Table 1: The binary Fibonacci sequence

year | sequence              | F_n | m  | n | m/n
1    | 0                     | 1   | 0  | 1 | 0
2    | 1                     | 1   | 1  | 0 | ∞
3    | 10                    | 2   | 1  | 1 | 1
4    | 101                   | 3   | 2  | 1 | 2
5    | 10110                 | 5   | 3  | 2 | 1.5
6    | 10110101              | 8   | 5  | 3 | 1.66…
7    | 1011010110110         | 13  | 8  | 5 | 1.6
8    | 101101011011010110101 | 21  | 13 | 8 | 1.625
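The construction behind Table 1 (year 1 is "0", year 2 is "1", and each later year is the previous year's sequence followed by the one before it, as discussed below) is easy to reproduce; the short Python sketch regenerates Table 1, including the ratio m/n.

```python
def binary_fibonacci(years):
    """Binary Fibonacci sequences: s(1) = '0', s(2) = '1',
    s(n) = s(n-1) + s(n-2), as in Table 1."""
    seqs = ['0', '1']
    while len(seqs) < years:
        seqs.append(seqs[-1] + seqs[-2])
    return seqs[:years]

for year, s in enumerate(binary_fibonacci(8), start=1):
    ones, zeros = s.count('1'), s.count('0')          # m and n of Table 1
    ratio = ones / zeros if zeros else float('inf')
    print(year, s, len(s), ones, zeros, round(ratio, 3))
# The last column approaches the golden ratio g = 1.6180...
```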
The self-similarity of the Fibonacci sequence can also be found in the binary sequence: after the first two years, the nth binary sequence can be built by attaching the sequence of year n − 2 to the end of the sequence of year n − 1. The sequence of the numbers of pairs $F_n$ is the original Fibonacci sequence. The numbers of mature pairs ("1"), m, are the Fibonacci numbers starting in year 2, and the numbers of immature pairs ("0"), n, are the Fibonacci numbers starting in year 3. The quotient of the ones and zeros, m/n, converges to the golden ratio:
\[
\frac{m}{n} = \frac{F_{n-1}}{F_{n-2}} \;\longrightarrow\; g \quad (n \to \infty). \qquad (2)
\]
The golden ratio is a geometric relation: a straight line is sectioned in such a way that the ratio of the total length to the longer segment equals the ratio of the longer segment to the shorter one. Calling the total length l and the longer segment a, the following equation describes the golden ratio:
\[
g:\quad \frac{l}{a} = \frac{a}{l - a}. \qquad (3)
\]
The only positive solution of this equation is
\[
g = \frac{1 + \sqrt 5}{2} \approx 1.618033989\ldots \qquad (4)
\]
The golden ratio is considered the most irrational of all ratios. The Fibonacci number $F_n$ can be easily calculated with this approximation (the wavy equal sign means: take the nearest integer) [1]:
\[
F_n \approxeq \frac{g^n}{\sqrt 5}. \qquad (5)
\]

2 Acoustical realization of the Fibonacci sequence

As a very dissonant sound is expected if the binary Fibonacci sequence is played as an audio file, it is interesting to analyze these sounds. For the acoustical realization of the Fibonacci sequence, a binary sequence of any length $F_n$ can be generated and played back with a digital-analog converter. Although there is no definite period, certain subsequences recur and the signal shows a distinct discrete spectrum. The analysis of longer sequences shows only small differences, so that the spectrum of a sequence with a length of $F_{20} = 6765$, as shown in Fig. 2, can be seen as the spectrum of a sequence of infinite length [2]. In the spectrum of the Fibonacci sound, the golden ratio g can be found again: in the top plot of Fig. 2, the second highest peak has the magnitude of the highest peak divided by g ($1/g \approx 0.618$) and the frequency of the highest peak multiplied by g ($392\,\mathrm{Hz} \cdot g \approx 634\,\mathrm{Hz}$).

Fig. 2: Binary sequence vs. sinc interpolation.

The binary Fibonacci sequence was interpolated with a sinc algorithm to minimize the spectral influences of the rectangle characteristic of a binary sequence. Because of the low-pass characteristic of the sinc interpolation, the high spectral components were cut off. The sinc-interpolated binary Fibonacci sequences were integrated in an internally developed wavetable synthesizer (see Fig. 3). With this synthesizer it is possible to play intervals and melodies with Fibonacci sounds; MIDI files are supported.

Fig. 3: The Fibonacci synthesizer.

To characterize the psychoacoustic properties of the Fibonacci sequences, listening tests and psychoacoustic analyses have been performed. For the listening tests, sequences of length $F_{16} = 987$ have been used.
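The discrete spectrum can be inspected with a few lines of NumPy; a minimal sketch, assuming the binary sequence of length F_20 = 6765 is treated as a sampled signal with unit sample rate (the absolute peak frequencies quoted above, 392 Hz and about 634 Hz, additionally depend on the playback rate used by the authors, which is not restated here).

```python
import numpy as np

def binary_fib(n_terms):
    a, b = '0', '1'
    for _ in range(n_terms - 2):
        a, b = b, b + a
    return np.array([int(c) for c in b], dtype=float)

x = binary_fib(20)                        # length F_20 = 6765, as in Fig. 2
x -= x.mean()                             # remove the DC component
spec = np.abs(np.fft.rfft(x))
freqs = np.fft.rfftfreq(len(x), d=1.0)    # in cycles per sample

k1 = int(np.argmax(spec))                 # strongest spectral line
spec2 = spec.copy()
spec2[max(k1 - 2, 0):k1 + 3] = 0          # suppress it and adjacent leakage
k2 = int(np.argmax(spec2))                # second strongest line
print(freqs[k2] / freqs[k1], spec[k1] / spec[k2])
# compare these two ratios with the golden-ratio peak structure of Fig. 2
```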
3 Listening tests

For a subjective psychoacoustic analysis, listening tests have to be performed. In the first test the subjects heard popular melodies played with the Fibonacci synthesizer. All subjects were able to recognize the melodies, though they could not identify the intervals. A "Fibonacci octave" cannot be perceived as a "real" octave with a frequency ratio of 1:2. The tritone paradox phenomenon [4] occurs as well.

The next test examined whether one of the peaks in the spectrum could be identified as the most remarkable peak. The subjects were told to change the pitch of a sine tone until it matched the Fibonacci sound. The sounds were presented with an electrostatic STAX headphone with a linear frequency response: the pure tone on one ear, the Fibonacci sound on the other ear. The effect of the binaural beat [3] was intentionally used to help find the right pitch. As there is more than one noticeable peak in the spectrum of the Fibonacci sounds, different subjects identified different peaks. Fig. 4 shows the results of this listening test; each dot represents the choice of one subject.

Because the subjects started each pass with the sine frequency of the last pass, two further tests were performed to find out whether the start frequency of the sine tone has an influence on the results. The subjects were asked to start each pass with the lowest and the highest sine tone of the frequency synthesizer, respectively. Fig. 4 shows the results of the listening test starting each pass with a sine frequency of 0 Hz: many more low frequencies were identified than in the first test. In the last listening test every pass started with a sine frequency of 10 kHz (see Fig. 4); here almost all subjects marked frequencies above the actual spectrum of the presented sounds.

Fig. 4: Listening test: pitch perception of Fibonacci sounds (panels: start frequency equal to the result of the last pass; start frequency 0 Hz; start frequency 10 kHz).

The different results of the three listening tests reflect the dissonant spectrum of the Fibonacci sounds. Although there is a major peak in the frequency domain which is higher in amplitude by a factor of g than the second major peak, this peak was only rarely identified in the pitch perception test.

4 Psychoacoustics

For an objective analysis, psychoacoustic parameters like loudness and pitch strength were calculated. Fig. 5 shows the loudness according to ISO 226:2003 [5] of a Fibonacci sound. The relative pitch strength [6] was calculated for the Fibonacci sequence that was used in the listening tests. If a sine of 80 dB and 1500 Hz has a pitch strength of 1, the Fibonacci sequence has a relative pitch strength of 0.1878.

5 Summary / outlook

The Fibonacci numbers are a very interesting mathematical phenomenon. As the golden ratio g occurs throughout the spectrum of the binary Fibonacci sequence, a very dissonant sound results. Although melodies can be recognized, it is not possible to identify specific intervals. Listening tests were performed in order to find the subjective pitch of the signals. It is not possible to define one most outstanding peak, as the subjects marked different frequencies. In the future, more psychoacoustic analyses will be performed. Another project of the Institute of Technical Acoustics will investigate consonance and tonality.

Acknowledgements

The authors would like to thank Prof. Michael Vorländer for supporting this work and discussion, and also all the participants in the listening tests.

References

[1] Schroeder, M. R.: Number theory in science and communication. 3rd edition. Springer Verlag, 1999.
[2] Schmidt, H.: Eigenschaften und akustische Realisierung von Fibonacci-Folgen. Proceedings of DAGA '88, Braunschweig, p. 621.
[3] Terhardt, E.: Akustische Kommunikation. Springer Verlag, 1998.
[4] Deutsch, D.: The tritone paradox: effects of spectral variables. Perception & Psychophysics, Vol. 41 (1987), pp. 563–575.
[5] ISO 226:2003: Acoustics – Normal equal-loudness-level contours.
[6] Fruhmann, M.: Introduction and practical use of an algorithm for the calculation of pitch strength. Journal of the Acoustical Society of America (JASA), Vol. 118 (2005), No. 3, Pt. 2, p. 1894.
Jan Sokoll
E-mail: jso@akustik.rwth-aachen.de

Sebastian Fingerhuth
E-mail: sfi@akustik.rwth-aachen.de

Institute of Technical Acoustics
RWTH Aachen University
D-52056 Aachen, Germany

Fig. 5: Loudness of a Fibonacci sound.
On representations of sl(n, C) compatible with a Z2-grading

M. Havlíček, E. Pelantová, J. Tolar

Abstract

This paper extends existing Lie algebra representation theory related to Lie algebra gradings. The notion of a representation compatible with a given grading is applied to finite-dimensional representations of sl(n, C) in relation to its Z2-gradings. For the representation theory of sl(n, C) the Gel'fand–Tseitlin method turned out to be very efficient. We show that it is not generally true that every irreducible representation can be compatibly graded.

1 Introduction

Contractions of Lie algebras, of interest in connecting physical theories, are traditionally understood as limit procedures through which Lie algebras are modified into different, non-isomorphic Lie algebras [5, 8]. Nevertheless, for many physical applications, especially in quantum theory, representations of Lie algebras are important. It should be noted that contractions usually produce non-compact Lie algebras whose unitary representations are infinite-dimensional. Remaining inside the framework of Lie algebras, a completely different notion of graded contractions was proposed in [11]. In a seminal paper [13], R. V. Moody and J. Patera pushed the theory of graded contractions of Lie algebras further with graded contractions of representations of Lie algebras.
By considering, along with graded Lie algebras, their compatibly graded finite-dimensional representations, they obtained a theory of contractions of representations that contains the Lie algebra contractions as a special case, namely for the adjoint representation. This is unfortunately the only existing mathematical theory of this matter, and moreover it is not concerned with the question of which representations can be compatibly graded. Namely, compatibly graded finite-dimensional representations were assumed throughout the paper [13], Eqs. (2.10) and (2.11), which is a valid assumption if the grading is induced by an inner automorphism. In this respect they also provided a recipe for finding the corresponding grading of the vector space V on which the representation is acting (5). One should also mention a short note [15] on the subject, but up to now nobody has gone ahead with a further study of representations of Lie algebras related to their gradings, especially when the gradings are induced by outer automorphisms.

We are aware that finite-dimensional representations of contracted Lie algebras can only be non-unitary. However, such representations, usually indecomposable, also have some interest in physics. Our paper, as a starting point for such an investigation, gives answers under the restrictive assumptions used in [13]. Thus we restrict our consideration to:

1. complex Lie algebras of type A,
2. finite-dimensional representations,
3. group gradings with the grading group Z2.

Z2-gradings are closely related to involutive (second order) automorphisms of Lie algebras. In physical applications they are especially useful as generalized parity transformations. In this connection our earlier paper [12] dealt with the well-known space-time parity transformations, space inversion and time reversal, and the associated graded contractions for the de Sitter Lie algebras (type B). Here we start with the simplest case of finite-dimensional representations of classical Lie algebras of type A.

The paper is organized as follows. Section 2 is devoted to representations compatible with a grading. Explicit results are obtained in Section 3 for finite-dimensional representations of sl(n, C) compatible with Z2-gradings generated either by an inner automorphism of order 2 or by an outer automorphism of order 2. Our concrete results are illustrated on the simple Lie algebra sl(3, C).

2 Representations compatible with grading

2.1 Graded contractions of Lie algebras

A grading of a Lie algebra L is a decomposition Γ of the vector space L into vector subspaces $L_j$, $j \in J$, such that L is the direct sum of these subspaces $L_j$ and, for any pair of indices $j, k \in J$, there exists $l \in J$ such that $[L_j, L_k] \subseteq L_l$. We denote the grading by $\Gamma: L = \bigoplus_{j \in J} L_j$. (Let us note that in our definition of grading we do not exclude trivial subspaces $L_i = \{0\}$.) It follows directly from the definition that for any grading $\Gamma: \bigoplus_{j \in J} L_j$ and any automorphism $g \in \mathrm{Aut}\, L$ the decomposition $\Gamma': \bigoplus_{j \in J} g(L_j)$ is also a grading. The gradings Γ and Γ′ are called equivalent.

Now we describe a specific type of grading, namely a group grading. A grading $\Gamma: L = \bigoplus_{j \in J} L_j$ is called a group grading if the index set J can be embedded into a semigroup G (whose binary operation is denoted by +) such that, for any pair of indices $j, k \in J$,
\[
[L_j, L_k] \subseteq L_{j+k}. \qquad (1)
\]
Since even trivial subspaces are generally allowed in the decomposition of L, the semigroup G itself may be used as the index set of the group grading. In this case we will speak about a G-grading Γ. We will focus in this paper on group gradings only, and we assume in the sequel that the indices of the grading subspaces belong to a group G, i.e. Γ is a G-grading of L.

A grading $\Gamma: L = \bigoplus_{i \in J} L_i$ of a Lie algebra L is a starting point for the study of graded contractions of the Lie algebra. This method for finding contractions of Lie algebras was introduced in [11, 13]. In this type of contraction, we define new Lie brackets by the prescription
\[
[x, y]_{\mathrm{new}} := \varepsilon_{j,k}\, [x, y], \quad \text{where } x \in L_j,\ y \in L_k. \qquad (2)
\]
The complex or real parameters $\varepsilon_{j,k}$, $j, k \in G$, must be determined in such a way that the vector space L with the binary operation $[\,\cdot\,,\cdot\,]_{\mathrm{new}}$ again forms a Lie algebra. Antisymmetry of the Lie brackets demands that $\varepsilon_{j,k} = \varepsilon_{k,j}$. If, moreover, the coefficients $\varepsilon_{j,k}$ fulfill the first basic set of contraction equations [14],
\[
\varepsilon_{i,j}\, \varepsilon_{i+j,k} = \varepsilon_{j,k}\, \varepsilon_{j+k,i} = \varepsilon_{k,i}\, \varepsilon_{k+i,j} \quad \text{for all } i, j, k \in G, \qquad (3)
\]
then the vector space L with the new brackets $[x, y]_{\mathrm{new}}$ satisfies the Jacobi identities as well. This new Lie algebra will be denoted by $L^\varepsilon$. Note that the equations (3) involve only the relevant parameters, i.e. those for which the corresponding commutators $[L_j, L_k]$ do not vanish.

Example 1 (Z2-grading). The most notorious case of group grading is the Z2-grading.¹ Here a Lie algebra L over C is decomposed into two non-zero grading subspaces $L_0$ and $L_1$, where
\[
0 \neq [L_0, L_0] \subseteq L_0, \quad 0 \neq [L_0, L_1] \subseteq L_1, \quad 0 \neq [L_1, L_1] \subseteq L_0. \qquad (4)
\]
Here we have applied the generic condition that in each class of commutators there exists at least one non-vanishing commutator. For a Z2-grading of a Lie algebra L, the generic system of equations (3) has a very simple form:
\[
(\varepsilon_{00} - \varepsilon_{01})\,\varepsilon_{01} = 0 = (\varepsilon_{00} - \varepsilon_{01})\,\varepsilon_{11}, \qquad \varepsilon_{10} = \varepsilon_{01}.
\]
There exist infinitely many solutions $\varepsilon = (\varepsilon_{jk})$ of this system. However, for many solutions the contracted algebras $L^\varepsilon$ are isomorphic. It can be shown that only the four solutions
\[
(\varepsilon_{jk}) = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}, \quad \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \quad \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \quad \text{and} \quad \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}
\]
give mutually non-isomorphic Lie algebras $L^\varepsilon$ over C. (The original Lie algebra is obtained with all parameters $\varepsilon_{ij} = 1$.) The contracted algebra obtained by the first solution is the semidirect sum of $L_0$ with a commutative algebra $L_1$ and corresponds to the Inönü–Wigner contraction. The second solution is the direct sum of $L_0$ and the commutative algebra $L_1$. The third solution corresponds to the central extension of $L_1$ (considered as a commutative algebra) by the commutative algebra $L_0$. The fourth solution is an abelian Lie algebra.

¹ Note that special Z2-graded contractions are closely related to Inönü–Wigner contractions [12].
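This solution list can be checked mechanically: brute-forcing the Z2 instance of the system (3) over normalized 0/1 entries (to which general complex solutions reduce, up to rescaling) recovers the four contraction matrices above plus the all-ones matrix that reproduces the original algebra. A small verification sketch in Python, not part of the original text:

```python
from itertools import product

def z2_contraction_solutions():
    """Solve the Z2 instance of Eq. (3):
    (e00 - e01) * e01 == 0 == (e00 - e01) * e11, with e10 = e01."""
    sols = []
    for e00, e01, e11 in product((0, 1), repeat=3):
        if (e00 - e01) * e01 == 0 and (e00 - e01) * e11 == 0:
            sols.append(((e00, e01), (e01, e11)))
    return sols

for m in z2_contraction_solutions():
    print(m)
# Five 0/1 matrices are printed: the four contracted solutions of Example 1
# plus ((1, 1), (1, 1)), which gives back the original (uncontracted) algebra.
```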
in this case we will speak about a g-grading γ. we will focus in this paper on group gradings only andwe assume in the sequel that the indices of the grading subspaces belong to a group g, i.e. γ is a g-grading of l. a grading γ:l = ⊕i∈j li of a lie algebra l is a starting point for the study of graded contractions of the lie algebra. this method for finding contractions of lie algebras was introduced in [11, 13]. in this type of contraction, we define new lie brackets by the prescription [x, y]new := εj,k[x, y], where x ∈ lj, y ∈ lk. (2) the complex or real parameters εj,k for j, k ∈ g must be determined in such away that the vector space l with the binary operation [., .]new again forms a lie algebra. antisymmetry of lie brackets demands that εj,k = εk,j. if, moreover, the coefficients εj,k fulfill the first basic set of contraction equations [14]: εi,j εi+j,k = εj,kεj+k,i = εk,iεk+i,j for all i, j, k ∈ g, (3) then the vector space l with new brackets [x, y]new satisfies the jacobi identities as well. this new lie algebra will be denotedby lε. note that the equations (3) involve only relevantparameters forwhich the corresponding commutators [lj, lk] do not vanish. example 1 z2-grading. the most notorious case of group grading is z2-grading. 1 here a lie algebra l over c is decomposed into two non-zero grading subspaces l0 and l1, where 0 = [l0, l0] ⊆ l0, 0 = [l0, l1] ⊆ l1, 0 = [l1, l1] ⊆ l0. (4) here we have applied the generic condition that in each class of commutators there exists at least one nonvanishing commutator. for a z2-grading of a lie algebra l, the generic system of equations (3) has a very simple form (ε00 − ε01)ε01 =0= (ε00 − ε01)ε11, ε10 = ε01. there exist infinitely many solutions ε = (εjk) of this system. however for many solutions, the contracted algebras lε are isomorphic. it can be shown that only four solutions (εjk)= ( 1 1 1 0 ) , ( 1 0 0 0 ) , ( 0 0 0 1 ) , and ( 0 0 0 0 ) give mutually non-isomorphic lie algebras lε over c. (the original lie algebra is obtainedwith all parameters εij =1.) the contracted algebra obtained by the first solution is the semidirect sum of l0 with a commutative algebra l1 and corresponds to the inönü-wigner contraction. the second solution is the direct sum of l0 and the commutative algebra l1. the third solution corresponds to the central extension of l1 (considered as a commutative algebra) by the commutative algebra l0. the fourth solution is an abelian lie algebra. 1note that special z2-graded contractions are closely related to inönü–wigner contractions [12]. 31 acta polytechnica vol. 50 no. 5/2010 2.2 representations of graded contractions let us focus on the question of a representation of the contracted lie algebra lε. we will reformulate the method proposed in [13], which enables us to find a representation of lε by modifying a given representation of the original algebra l. it involves a simultaneous grading of the lie algebra l and the representation space v . definition 2.1 let r:l �→ endv be a representation of lie algebra l and let γ:l = ⊕i∈gli be its g-grading. we say that the representation r is compatible with the g-grading, if there exists a decomposition of the vector space v into a direct sum v = ⊕i∈gvi such that r(xi)vj ⊂ vi+j for each i, j ∈ g and any xi ∈ li . (5) remark 2.2 let r be a representation of l compatible with the grading l = ⊕i∈gli and h ∈ aut l be any automorphismof l. then r◦h−1 is a representationof l compatiblewith the equivalentgrading l = ⊕i∈gh(li), since (r ◦ h−1)h(xi)vj = r(xi)vj ⊂ vi+j. 
suppose we are given a representation r of l compatible with the g-grading. we are looking for a representation rε of a contracted lie algebra lε. according to [13] we define rε(xi)vj := ψi,j r(xi)vj (for each i, j ∈ g , any xi ∈ li and any vj ∈ vj), (6) where ψi,j are unknown parameters. the requirement that r ε is a representation of lε formally means rε ( [xi, xj]new ) vk = [r ε(xi), r ε(xj)]vk = ( rε(xi)r ε(xj) − rε(xj)rε(xi) ) vk for any xi ∈ li, xj ∈ lj, and vk ∈ vk. using equations (2) and (6) and relation (5) we obtain ψj,kψi,j+kr(xi)r(xj) − ψi,kψj,i+kr(xj)r(xi)= εi,j ψi+j,kr([xi, xj]) since r is a representation of l, we know that r(xi)r(xj)− r(xj)r(xi)= r([xi, xj]). therefore, the choice of parameters ψi,j satisfying the second basic set of contraction equations [14] ψj,kψi,j+k = ψi,kψj,i+k = εi,j ψi+j,k (7) implies that rε defined by (6) is a representation of the contracted lie algebra lε. solutions of (7) determine the contractions of the chosen representations. let us stress that, if r([xi, xj] = 0 for all xi ∈ li, xj ∈ lj, conditions (7) are not necessary. comparing (7) and (3) we see that the system of quadratic equations for parameters ψi,j has at least one solution, namely ψi,j = εi,j for each pair i, j (adjoint representation of l ε). therefore the mapping rε:lε �→ endv defined by (6) is a representation of the graded lie algebra lε. usually, there also exist other solutions of the system (7), and therefore more representations of the same contracted algebra lε. example 2 z2-graded representation. consider a z2-grading of a lie algebra l and its representation r which is compatible with the grading. for the corresponding decomposition of the vector space v = v0 ⊕ v1 we may construct a basis b of v composed of the basis of v0 and the basis of v1. in such a basis b, the grading relations (4) acquire the block form explicitly r(x0)= ( a(x0) 0 0 b(x0) ) and r(x1)= ( 0 c(x1) d(x1) 0 ) . in the sequel, we will illustrate all notions on the lie algebra lε obtained by contraction from a z2-grading of a lie algebra l by the first solution (εjk)= ( 1 1 1 0 ) given in example 1. for this lie algebra lε the commutation relations have the form [x, y]new = [x, y], if x, y ∈ l0 or if x ∈ l0, y ∈ l1 and [x, y]new =0, if x, y ∈ l1. in this case the system of equations (7) is ψ00ψ00 = ψ00, ψ10ψ01 = ψ00ψ10 = ψ10 , 32 acta polytechnica vol. 50 no. 5/2010 ψ01ψ01 = ψ01, ψ11ψ00 = ψ01ψ11 = ψ11 , ψ10ψ11 =0 . all solutions (up to equivalence of representations) of this system are (ψjk)= ( 1 1 1 0 ) , ( 1 1 0 1 ) , ( 1 1 0 0 ) , ( 1 0 0 0 ) , ( 0 1 0 0 ) and ( 0 0 0 0 ) the representations rε of the contracted lie algebra lε in the chosen basis b of the vector space v have the block form rε(x0)= ( ψ00a(x0) 0 0 ψ01b(x0) ) and rε(x1)= ( 0 ψ11c(x1) ψ10d(x1) 0 ) , where for parameters (ψij) one may choose one of the six solutions. let us mention that only the first two solutions are interesting since the elements of subalgebra l1 are represented by zero operators in the remaining solutions. 2.3 group gradings and automorphisms the simplestway to find a groupgradingof a lie algebra is to decompose the vector space l into eigensubspaces of a diagonalizable automorphism g ∈ aut l [6]. for any pair of its eigenvectors xλ and xμ corresponding to eigenvalues λ and μ, respectively, we have g([xλ, xμ])= [g(xλ), g(xμ)]= λμ[xλ, xμ] . thus the commutator [xλ, xμ] is either zero or an eigenvector corresponding to the eigenvalue λμ. 
let us denote by σ(g) the spectrum of automorphism g and by lλ the eigensubspace corresponding to λ ∈ σ(g). the decomposition γ:l = ⊕ λ∈σ(g) lλ (8) is a group grading, where the multiplicative semigroup generated by the spectrum of g can be taken as a semigroup g. remark 2.3 if h ∈ aut l, then the decomposition of l into eigensubspaces of the automorphism hgh−1 is l = ⊕ λ∈σ(g) h(lλ), i.e. the gradings given by conjugated automorphisms g and hgh −1 are equivalent. therefore, the automorphisms g and hgh−1 are called equivalent as well. note however that different inequivalent automorphisms may even give the same grading. similarly, if g1, g2, . . . , gr are mutually commuting automorphisms of l, then the decomposition of l into commoneigensubspaces of all these automorphisms is a groupgradingof l. the semigroup suitable for indexing this grading is g1×g2×. . .×gr, where each gi is the semigroupgeneratedby the spectrumof the automorphism gi. furthermore, for lie algebrasover the complexfield c, any groupgrading canbe obtainedby this procedure. let us emphasize that this is not the case for real lie algebras. in the following, in order to study the compatibility problem, we shall consider the simplest case of group grading determined by one automorphism. 2.4 group grading determined by one automorphism let γ be a grading of the form (8), i.e. obtained by decomposition of l into eigensubspaces of a single automorphism g. we may assume that g has a finite order, say gk = id. for its spectrum we have σ(g) ⊂ {ei 2π k | � =0,1,2, . . . , k − 1} =: g. (9) this means that γ is a g-grading. let us consider an irreducible d-dimensional representation r of the lie algebra l. our aim is to discuss the question of compatibility of r with g-grading. let rg be a non-singular matrix in c d×d such that r(g(x)) = rgr(x)r −1 g for all x ∈ l . (10) 33 acta polytechnica vol. 50 no. 5/2010 as gk = id, the previous equality gives r(x) = r(gk(x))= rkg r(x)r −k g or [r k g , r(x)] = 0 for all x ∈ l . since the representation r is irreducible, by schur’s lemma rkg = α id for some α ∈ c. of course, any nonzero multiple of rg also satisfies the relation (10). therefore without loss of generality, we may assume that rkg = id , where k is the order of automorphism g. (11) this normalization guarantees that the spectrum of matrix rg and the spectrum of automorphism g belong to the same group g. in particular, since rkg is the identity, matrix rg is diagonalizable. let v = ⊕λ∈gvλ denote the decomposition of column space cd into eigensubspaces of matrix rg, i.e. rgvλ = λvλ for all vλ ∈ vλ. we will show that this decomposition is exactly the decomposition required in definition 2.1. let us consider some μ ∈ σ(g) so that g(xμ) = μxμ for all xμ ∈ lμ. relation (10) for x = xμ leads to the matrix relation r(g(xμ))rg = r(μxμ)rg = μ r(xμ)rg = rgr(xμ) which acts on a column vector vλ ∈ vλ as μ λ r(xμ)vλ = rgr(xμ)vλ . the last equalitymeans that the column r(xμ)vλ is either zero or it is an eigenvector ofmatrix rg corresponding to eigenvalue μ λ. therefore r(xμ)vλ ⊂ vμλ for any λ, μ ∈ g and any xμ ∈ lμ . this is relation (5) written in the multiplicative form. of course, our multiplicative group g defined in (9) is isomorphic to the additive group zk. we have seen that matrix rg with the properties (10) and (11) guarantees the compatibility of grading of l with the representation of the lie algebra l. such matrix rg will be called the simulation matrix of automorphism g. 
matrix rg depends on the chosen automorphism g and on the chosen representation r. the idea for finding the simulation matrix is more straightforward if g ∈ aut l is an inner automorphism. in this case it is natural to search for rg among matrices in the representation of the corresponding lie group. this idea was already presented in [11] and [15], where rg was a representation of an element of finite order [9]. nevertheless, we show that it is also possible to find the simulation matrix rg even for an outer automorphism g. in the sequel, we will concentrate on the lie algebras sl(n, c). the reason is that these algebras (with the exception of o(8, c)) are the only simple classical lie algebras over c for which the group of automorphisms contains an outer automorphism as well [7]. 3 representations of sl(n, c) compatible with z2-grading we will identify the lie algebra sl(n, c) with {x ∈ cn×n | trx =0}. any z2-grading of it is uniquely related to an automorphism of order 2. let us therefore recall the structure of aut sl(n, c) as described in [7]: 1. for any inner automorphism g there exists a matrix a ∈ sl(n, c) := {a ∈ cn×n | deta =1} such that g(x)= adax = axa −1 for any x ∈ sl(n, c); 2. the mapping given by the prescription outi x := −x t for any x ∈ sl(n, c) is an outer automorphism of order 2; 3. any outer automorphism g is a composition of an inner automorphism and the automorphism outi. 4. to any outer automorphism outa of order two there exists an inner automorphism adp such that ad−1p outaadp = outp t ap = outi, i.e. outa and outi are equivalent (see [6], lemma a.1). 34 acta polytechnica vol. 50 no. 5/2010 the next ingredient for the construction of simulationmatrices of automorphisms is the knowledge of finitedimensional irreducible representationsof sl(n, c). these representationsarewell describedbygel’fand-tseitlin formalism [4, 10, 2]. any irreducible representation r of sl(n, c) is in one-to-one correspondencewith an n-tuple (m1,n, m2,n, . . . , mn,n) of non-negative integer parameters m1,n ≥ m2,n ≥ . . . ≥ mn,n = 0. the dimension of the representation space of r = r(m1,n, m2,n, . . . , mn,n) is given by the number of triangular patterns m= ⎛ ⎜⎜⎜⎜⎜⎜⎜⎜⎜⎜⎝ m1,n m2,n m3,n . . . mn,n m1,n−1 m2,n−1 m3,n−1 . . . mn−1,n−1 m1,n−2 m2,n−2 . . . mn−2,n−2 ... ... ... m1,2 m2,2 m1,1 ⎞ ⎟⎟⎟⎟⎟⎟⎟⎟⎟⎟⎠ in which the numbers mi,j ∈ z satisfy mi,j+1 ≥ mi,j ≥ mi+1,j+1 for all 1 ≤ i ≤ j ≤ n − 1. to any such pattern m, we assign the basis vector ξ(m). the representation r is fully determined by the action r(ek ) on all basis vectors ξ(m) for any k, � = 1,2, . . . , n. (we have adopted the notation ek for n × n matrices with elements( ek ) ij = δikδ j.) this action can be found e.g. in [4], but for the reader’s convenience the representation of gl(n, c) is described in the appendix. 3.1 inner automorphisms of order two any innerautomorphism g oforder two is associatedby the equality g = ada withagroupelement a ∈ sl(n, c) such that a does not belong to the center z[sl(n, c)] and a2 belongs to the center. if we denote ω = e iπ n , then the center can bewritten explicitly z[sl(n, c)] = {ω2 in | � =0,1, . . . , n −1}. a simple calculation shows that any such element a ∈ sl(n, c) is up to equivalence one of the matrices an,s := ω η(s) ( in−s 0 0 −is ) where s =1, . . . , � n 2 � and η(s)= { 0 if s is even 1 if s is odd (note that an,0 = in belongs to z[sl(n, c)].) 
These matrices may be rewritten by using elements of the Lie algebra $\mathrm{sl}(n, \mathbb{C})$ as follows: $A_{n,s} = \exp(X_{n,s})$ with

$$X_{n,s} = i\pi \begin{pmatrix} \frac{\eta(s)}{n} I_{n-s} & 0 \\ 0 & \frac{\eta(s)}{n} I_s + M_s \end{pmatrix}, \quad \text{where } M_s = \mathrm{diag}(-1, 1, -1, \dots, (-1)^s) \in \mathbb{C}^{s \times s}.$$

One can use the notation $E_{kk}$ and write

$$X_{n,s} = i\pi \left( \frac{\eta(s)}{n} \sum_{k=1}^{n} E_{kk} + \sum_{k=n-s+1}^{n} (-1)^{n-s+1-k} E_{kk} \right). \qquad (12)$$

If $R$ is any representation of the Lie algebra $\mathrm{sl}(n, \mathbb{C})$, then $R_{A_{n,s}} := \exp(R(X_{n,s}))$ satisfies

$$R_{A_{n,s}}\, R(x)\, \big(R_{A_{n,s}}\big)^{-1} = R(A_{n,s}\, x\, A_{n,s}^{-1}) = R(\mathrm{Ad}_{A_{n,s}} x) \quad \text{and} \quad \big(R_{A_{n,s}}\big)^2 = \mathrm{id}.$$

Therefore, the matrix $R_{A_{n,s}}$ is the simulation matrix of the inner automorphism $g = \mathrm{Ad}_{A_{n,s}}$. We have shown

Theorem 3.1 Any $\mathbb{Z}_2$-grading of the Lie algebra $\mathrm{sl}(n, \mathbb{C})$ obtained by an inner automorphism and any irreducible representation of $\mathrm{sl}(n, \mathbb{C})$ are compatible.

Using (12) and the explicit form of the Gel'fand-Tseitlin representation, we obtain for any basis vector $\xi(M)$

$$R(X_{n,s})\, \xi(M) = i\pi \left( \frac{\eta(s)}{n} r_n(M) + 2\sum_{k=1}^{s-1} (-1)^{k-1} r_{n-s+k}(M) - r_{n-s}(M) - (-1)^{\eta(s)} r_n(M) \right) \xi(M).$$

Thus we have arrived at the explicit form of the simulation matrix of the automorphism $g = \mathrm{Ad}_{A_{n,s}}$:

$$R_{A_{n,s}}\, \xi(M) = e^{\,i\pi \left( \left( \frac{\eta(s)}{n} - 1 \right) r_n(M) - r_{n-s}(M) \right)}\, \xi(M).$$

3.2 Outer automorphism of order two

As explained at the beginning of Section 3, any outer automorphism of order two on $\mathrm{sl}(n, \mathbb{C})$ is, up to equivalence, the automorphism $\mathrm{out}_I(X) = -X^T$, and thus we will focus only on it, without loss of generality. It is well known that for an irreducible representation $R$, characterized in the Gel'fand-Tseitlin formalism by the $n$-tuple $(m_{1,n}, m_{2,n}, \dots, m_{n,n})$, the mapping $-R^T$ (to minus transposed matrices) is also an irreducible representation. This representation is equivalent to the contragredient representation $R_c$, which is characterized by the $n$-tuple $(m'_{1,n}, m'_{2,n}, \dots, m'_{n,n})$, where

$$m'_{i,n} = m_{1,n} - m_{n-i+1,n} \quad \text{for } i = 1, 2, \dots, n.$$

Let us consider a triangular pattern $M$ filled by indices $m_{i,j}$, $1 \leq i \leq j \leq n$, and associated with the basis vector $\xi(M)$ of the representation $R$. To any such pattern $M$ we may assign the unique triangular pattern $M'$ with indices $m'_{i,j} := m_{1,n} - m_{j-i+1,j}$. It is easy to check that the $m'_{i,j}$ satisfy the necessary inequalities for $M'$ to be a correct pattern of the contragredient representation $R_c$. Let us define the linear mapping $J$ of the representation space of $R$ onto the representation space of $R_c$ by

$$J\, \xi(M) := (-1)^{\sum_{i,j} m_{i,j}}\, \xi(M').$$

On the other hand, from the formulae in the Appendix one sees that

$$R^T(E_{ij}) = R(E_{ji}) = R(E_{ij}^T). \qquad (13)$$

Using this fact, one can prove by direct verification that the mapping $J$ satisfies

$$-J\, R^T(x) = R_c(x)\, J \quad \text{for any } x \in \mathrm{sl}(n, \mathbb{C}). \qquad (14)$$

Let us return to our original task. We are looking for the simulation matrix of the automorphism $g = \mathrm{out}_I$, i.e. for a matrix $R_g$ of order two such that

$$R(\mathrm{out}_I(x)) = -R(x^T) = R_g\, R(x)\, R_g^{-1}.$$

According to (13), we have $R(x^T) = R^T(x)$, and therefore the existence of the simulation matrix $R_g$ means equivalence of the representations $R$ and $-R^T$, i.e. equivalence of $R$ and its contragredient representation $R_c$. The Gel'fand-Tseitlin result states that this is possible if and only if the $n$-tuples $(m_{1,n}, m_{2,n}, \dots, m_{n,n})$ and $(m'_{1,n}, m'_{2,n}, \dots, m'_{n,n})$ coincide. In this case the simulation matrix $R_g$ is equal to $J$. We have deduced

Theorem 3.2 A $\mathbb{Z}_2$-grading of the Lie algebra $\mathrm{sl}(n, \mathbb{C})$ obtained by the outer automorphism $\mathrm{out}_I$ is compatible with an irreducible representation $R$ of $\mathrm{sl}(n, \mathbb{C})$ if and only if the representation is self-contragredient.
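The self-contragredience criterion of Theorem 3.2 reduces to an arithmetic check on the $n$-tuple. A small illustrative helper (ours, hypothetical, for this text only):

```python
def contragredient(m):
    """Highest weight of R_c: m'_i = m_{1,n} - m_{n-i+1,n}."""
    n = len(m)
    return tuple(m[0] - m[n - 1 - i] for i in range(n))

# The adjoint representation R(2, 1, 0) of sl(3, C) is self-contragredient,
# the defining representation R(1, 0, 0) is not, R(4, 2, 0) = R(2l, l, 0) is:
for m in [(2, 1, 0), (1, 0, 0), (4, 2, 0)]:
    print(m, contragredient(m), m == contragredient(m))
```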
If we do not insist on the irreducibility of the representation $R$, the class of representations compatible with the $\mathbb{Z}_2$-grading obtained by the automorphism $\mathrm{out}_I$ is larger. Of course, if for a representation $R_1$ it is possible to find a simulation matrix $R^{(1)}$, and for a representation $R_2$ a simulation matrix $R^{(2)}$, then the direct sum $R^{(1)} \oplus R^{(2)}$ is the simulation matrix for the direct sum $R_1 \oplus R_2$. To avoid a discussion of all such obvious cases, we will describe only those representations $R$ with simulation matrices $\mathcal{R}$ for which the operator set $\{\mathcal{R}\} \cup \{R(x) \mid x \in \mathrm{sl}(n, \mathbb{C})\}$ is irreducible, whereas the set $\{R(x) \mid x \in \mathrm{sl}(n, \mathbb{C})\}$ is reducible. If $R_0$ is a $d$-dimensional irreducible representation of $\mathrm{sl}(n, \mathbb{C})$, then the $2d$-dimensional representation $R := R_0 \oplus (-R_0^T)$ assigns to $x$ the matrix

$$R(x) = \begin{pmatrix} R_0(x) & 0 \\ 0 & -\big(R_0(x)\big)^T \end{pmatrix}$$

and therefore

$$R(\mathrm{out}_I(x)) = \begin{pmatrix} -\big(R_0(x)\big)^T & 0 \\ 0 & R_0(x) \end{pmatrix} = \begin{pmatrix} 0 & \mathrm{id} \\ \mathrm{id} & 0 \end{pmatrix} R(x) \begin{pmatrix} 0 & \mathrm{id} \\ \mathrm{id} & 0 \end{pmatrix}.$$

The matrix $\begin{pmatrix} 0 & \mathrm{id} \\ \mathrm{id} & 0 \end{pmatrix}$ is the simulation matrix of $\mathrm{out}_I$. It is easy to see that the simulation matrix together with all $R(x)$ forms an irreducible set.

3.3 Z2-grading of sl(3, C)

Let us illustrate the conclusions of the previous sections on the Lie algebra $\mathrm{sl}(3, \mathbb{C})$. On this algebra there exist only two inequivalent automorphisms of order two. In our notation, $g_1 = \mathrm{Ad}_{A_{3,1}}$ with

$$A_{3,1} = \omega \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix}, \quad \omega = e^{i\pi/3},$$

and $g_2 = \mathrm{out}_I$. The corresponding $\mathbb{Z}_2$-gradings are

$$\Gamma_1: \ \mathrm{sl}(3, \mathbb{C}) = \left\{ \begin{pmatrix} a & b & 0 \\ c & d & 0 \\ 0 & 0 & -a-d \end{pmatrix} \,\middle|\, a, b, c, d \in \mathbb{C} \right\} \oplus \left\{ \begin{pmatrix} 0 & 0 & a \\ 0 & 0 & b \\ c & d & 0 \end{pmatrix} \,\middle|\, a, b, c, d \in \mathbb{C} \right\},$$

$$\Gamma_2: \ \mathrm{sl}(3, \mathbb{C}) = \left\{ \begin{pmatrix} 0 & a & b \\ -a & 0 & c \\ -b & -c & 0 \end{pmatrix} \,\middle|\, a, b, c \in \mathbb{C} \right\} \oplus \left\{ \begin{pmatrix} a & b & c \\ b & d & e \\ c & e & -a-d \end{pmatrix} \,\middle|\, a, b, c, d, e \in \mathbb{C} \right\}.$$

The first grading $\Gamma_1$ is compatible with any irreducible representation. The simulation matrix $R_{g_1}$ of the automorphism $g_1 = \mathrm{Ad}_{A_{3,1}}$ acts on the Gel'fand-Tseitlin triangular patterns as follows:

$$R_{g_1}\, \xi(M) = e^{-\frac{2i\pi}{3}(m_{1,3} + m_{2,3})}\, e^{-i\pi(m_{1,2} + m_{2,2})}\, \xi(M).$$

The irreducible representations compatible with the second grading are only the self-contragredient representations, i.e. the representations $R = R(2\ell, \ell, 0)$. In such a representation, the operator $J$ is defined by

$$J\, \xi(M) = (-1)^{\ell + m_{1,2} + m_{2,2} + m_{1,1}}\, \xi(M'),$$

where $M$ has top row $(2\ell, \ell, 0)$ and $M'$ is obtained from $M$ by replacing the rows $(m_{1,2}, m_{2,2})$ and $(m_{1,1})$ by $(2\ell - m_{2,2},\ 2\ell - m_{1,2})$ and $(2\ell - m_{1,1})$.

The lowest-dimensional non-trivial self-contragredient representation is $R = R(2, 1, 0)$, in fact the adjoint representation. Its dimension is 8, and $R_{g_2}$ has the following explicit form on the basis vectors, where $\xi(m_{1,2}, m_{2,2}; m_{1,1})$ denotes the pattern with fixed top row $(2, 1, 0)$:

$$R_{g_2}\, \xi(2,1;2) = \xi(1,0;0), \qquad R_{g_2}\, \xi(1,0;0) = \xi(2,1;2),$$
$$R_{g_2}\, \xi(2,1;1) = -\xi(1,0;1), \qquad R_{g_2}\, \xi(1,0;1) = -\xi(2,1;1),$$
$$R_{g_2}\, \xi(2,0;2) = -\xi(2,0;0), \qquad R_{g_2}\, \xi(2,0;0) = -\xi(2,0;2),$$
$$R_{g_2}\, \xi(1,1;1) = \xi(1,1;1), \qquad R_{g_2}\, \xi(2,0;1) = \xi(2,0;1).$$

If the representation $R = R(m_{1,3}, m_{2,3}, 0)$ is not self-contragredient, then the grading $\Gamma_2$ is compatible with the reducible representation

$$R_\oplus(x) := \begin{pmatrix} R(x) & 0 \\ 0 & -\big(R(x)\big)^T \end{pmatrix},$$

and the corresponding simulation matrix on the double-dimensional space is $J = \sigma_1 \otimes I$, where $I$ is the identity operator on the representation space of $R$ and $\sigma_1$ denotes the first Pauli matrix.
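The grading $\Gamma_2$ is simply the split of a traceless matrix into its antisymmetric and symmetric parts under $\mathrm{out}_I$. The following sketch (ours) verifies the eigenvalue split and one instance of the grading property $[L_-, L_-] \subset L_+$:

```python
import numpy as np

def out_I(X):
    return -X.T                          # the outer automorphism out_I(X) = -X^T

# A traceless element of sl(3), split into the two pieces of Gamma_2.
X = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, -6.0]])
X_plus = (X + out_I(X)) / 2              # antisymmetric part: eigenvalue +1 of out_I
X_minus = (X - out_I(X)) / 2             # symmetric traceless part: eigenvalue -1

print(np.allclose(out_I(X_plus), X_plus), np.allclose(out_I(X_minus), -X_minus))

# [L_-, L_-] lands in L_+: the commutator of two symmetric traceless
# matrices is antisymmetric.
Y = np.diag([1.0, -2.0, 1.0])
C = X_minus @ Y - Y @ X_minus
print(np.allclose(C, -C.T))
```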
4 Conclusions

Since basic concepts connected with gradings of Lie algebras were laid down already in the work of J. Patera and H. Zassenhaus [16], including the notion of a compatibly graded representation, it is really surprising that there does not yet exist a theory of representations of graded Lie algebras compatible with a given grading. This work is devoted to first steps in investigating which irreducible representations of a Lie algebra $L$ are compatible with its $G$-grading, at least in a rather restricted framework. The main contribution of the paper consists in elucidating which representations of classical Lie algebras of type A are compatible with a $\mathbb{Z}_2$-grading. Concretely, the results are as follows: if the involutive automorphism producing the $\mathbb{Z}_2$-grading is inner, then every irreducible finite-dimensional representation is compatible with the grading; but if the automorphism producing the grading is not inner, then the only irreducible finite-dimensional representations compatible with the grading are the self-contragredient ones. For the outer automorphism there is also the possibility of reducible representations involving pairs of mutually contragredient irreducible representations. Thus it is not generally true that every irreducible representation can be compatibly graded. The $\mathrm{sl}(3, \mathbb{C})$ case is enclosed to illustrate the process.

One of our future goals is to enlarge the family of gradings of $L$ for which one can decide about compatibility with representations of $L$. Another goal is to study representations of physical interest of the so-called kinematical groups of space-times. The possible Lie algebras $L$ of these groups were classified in [1]. A rather remarkable fact was found there: very simple conditions imposed by space inversion and time reversal on the generators very severely constrain the possible Lie algebras. This result was confirmed in [12] from the corresponding $\mathbb{Z}_2 \times \mathbb{Z}_2$-contractions of the de Sitter Lie algebras. From this point of view it would be useful to identify the gradings implicitly present, for instance, in [3, 18] and in other papers where contractions of representations are studied.

Acknowledgement

The authors would like to express their gratitude to the referees for their careful reading of the manuscript and for their valuable critical comments, which have helped to improve the presentation. We are grateful to Vyacheslav Futorny for fruitful discussions on relations between the notions of grading and group grading, and to Jiří Patera for introducing us to the problems connected with representations of contracted Lie algebras. We acknowledge financial support from the grants MSM6840770039 and LC06002 of the Ministry of Education, Youth, and Sports of the Czech Republic.

Appendix. The Gel'fand-Tseitlin formalism

Let us give an explicit description of the irreducible representations of $\mathrm{gl}(n, \mathbb{C})$ in the Gel'fand-Tseitlin formalism [4, 10, 2]. Any irreducible representation $R$ of $\mathrm{gl}(n, \mathbb{C})$ is in one-to-one correspondence with an $n$-tuple $(m_{1,n}, m_{2,n}, \dots, m_{n,n})$ of non-negative integer parameters $m_{1,n} \geq m_{2,n} \geq \dots \geq m_{n,n} \geq 0$. Since any $E_{k\ell}$ can be obtained by commutation relations from $E_{k,k}$, $E_{k,k-1}$ and $E_{k-1,k}$, only formulas for $R(E_{k,k})$, $R(E_{k,k-1})$ and $R(E_{k-1,k})$ are needed:

$$R(E_{k,k})\, \xi(M) = (r_k - r_{k-1})\, \xi(M),$$

where $r_k = m_{1,k} + \dots + m_{k,k}$ for $k = 1, 2, \dots, n$ and $r_0 = 0$;

$$R(E_{k,k-1})\, \xi(M) = a^1_{k-1}\, \xi(M^1_{k-1}) + \dots + a^{k-1}_{k-1}\, \xi(M^{k-1}_{k-1}),$$

where $M^j_{k-1}$ denotes the triangular pattern obtained from $M$ by replacing $m_{j,k-1}$ by $m_{j,k-1} - 1$, and

$$a^j_{k-1} = \left[ \frac{-\prod_{i=1}^{k} (m_{i,k} - m_{j,k-1} - i + j + 1) \prod_{i=1}^{k-2} (m_{i,k-2} - m_{j,k-1} - i + j)}{\prod_{i \neq j} (m_{i,k-1} - m_{j,k-1} - i + j + 1)(m_{i,k-1} - m_{j,k-1} - i + j)} \right]^{1/2};$$

and

$$R(E_{k-1,k})\, \xi(M) = b^1_{k-1}\, \xi(M^1_{k-1}) + \dots + b^{k-1}_{k-1}\, \xi(M^{k-1}_{k-1}),$$

where $M^j_{k-1}$ now denotes the triangular pattern obtained from $M$ by replacing $m_{j,k-1}$ by $m_{j,k-1} + 1$, and

$$b^j_{k-1} = \left[ \frac{-\prod_{i=1}^{k} (m_{i,k} - m_{j,k-1} - i + j) \prod_{i=1}^{k-2} (m_{i,k-2} - m_{j,k-1} - i + j - 1)}{\prod_{i \neq j} (m_{i,k-1} - m_{j,k-1} - i + j)(m_{i,k-1} - m_{j,k-1} - i + j - 1)} \right]^{1/2}.$$
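Since the dimension of $R(m_{1,n}, \dots, m_{n,n})$ equals the number of triangular patterns, it can be obtained by direct enumeration. A short sketch (ours, not part of the original formalism) that counts the patterns:

```python
from itertools import product

def gt_patterns(top):
    """Enumerate all Gel'fand-Tseitlin patterns with top row `top`."""
    patterns = [[tuple(top)]]
    for length in range(len(top) - 1, 0, -1):
        new_patterns = []
        for p in patterns:
            above = p[-1]
            # Each row interlaces the one above: above[i] >= row[i] >= above[i+1].
            choices = [range(above[i + 1], above[i] + 1) for i in range(length)]
            for row in product(*choices):
                new_patterns.append(p + [tuple(row)])
        patterns = new_patterns
    return patterns

# dim R(2,1,0) = 8 (the adjoint of sl(3, C)); dim R(1,0,0) = 3 (the defining rep).
print(len(gt_patterns((2, 1, 0))), len(gt_patterns((1, 0, 0))))
```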
References

[1] Bacry, H., Lévy-Leblond, J.-M.: Possible kinematics. J. Math. Phys. 9 (1968), 1605-1614.
[2] Barut, A. O., Raczka, R.: Theory of Group Representations and Applications. World Scientific, Singapore, 2000, Chap. 10.
[3] De Bièvre, S., Cishahayo, C.: On the contraction of the discrete series of SU(1,1). Ann. Inst. Fourier 43 (1993), 551-567.
[4] Gel'fand, I. M., Tseitlin, M. L.: Finite-dimensional representations of the group of unimodular matrices. Dokl. Akad. Nauk SSSR 71 (1950), 825-828.
[5] Gilmore, R.: Lie Groups, Lie Algebras, and Some of Their Applications. Wiley, New York, 1974, Chap. 10.
[6] Havlíček, M., Patera, J., Pelantová, E.: On Lie gradings II. Lin. Alg. Appl. 277 (1998), 97-125.
[7] Helgason, S.: Differential Geometry, Lie Groups, and Symmetric Spaces. Academic Press, New York, 1978.
[8] Inönü, E., Wigner, E. P.: On the contraction of groups and their representations. Proc. Nat. Acad. Sci. U.S.A. 39 (1952), 510-525.
[9] Kac, V. G.: Automorphisms of finite order of semisimple Lie algebras. Funct. Anal. Appl. 3 (1969), 252-254.
[10] Lemire, F., Patera, J.: Formal analytic continuation of Gel'fand's finite dimensional representations of gl(n,C). J. Math. Phys. 20 (1979), 820-829.
[11] de Montigny, M., Patera, J.: Discrete and continuous graded contractions of Lie algebras and superalgebras. J. Phys. A: Math. Gen. 24 (1991), 525-549.
[12] de Montigny, M., Patera, J., Tolar, J.: Graded contractions and kinematical groups of space-time. J. Math. Phys. 35 (1994), 405-425.
[13] Moody, R. V., Patera, J.: Discrete and continuous graded contractions of representations of Lie algebras. J. Phys. A: Math. Gen. 24 (1991), 2227-2258.
[14] Patera, J.: Graded contractions of Lie algebras, representations and tensor products. AIP Conference Proceedings, Vol. 266 (1992), 46-54.
[15] Patera, J., Tolar, J.: On gradings of Lie algebras and their representations. In: Lie Theory and Its Applications in Physics II (eds. H.-D. Doebner, V. K. Dobrev, J. Hilgert), World Scientific, Singapore, 1998, 109-118.
[16] Patera, J., Zassenhaus, H.: On Lie gradings I. Lin. Alg. Appl. 112 (1989), 87-159.
[17] Patera, J., Zassenhaus, H.: The Pauli matrices in n dimensions and finest gradings of simple Lie algebras of type A_{n-1}. J. Math. Phys. 29 (1988), 665-673.
[18] Ström, S.: Construction of representations of the inhomogeneous Lorentz group by means of contraction of representations of the (1+4) de Sitter group. Arkiv för Fysik 30 (1965), 455-472.

Prof. Ing. Miloslav Havlíček, DrSc., e-mail: miloslav.havlicek@fjfi.cvut.cz
Prof. Ing. Edita Pelantová, CSc., e-mail: edita.pelantova@fjfi.cvut.cz
Prof. Ing. Jiří Tolar, DrSc., e-mail: jiri.tolar@fjfi.cvut.cz
Doppler Institute, Faculty of Nuclear Sciences and Physical Engineering, Czech Technical University in Prague, Czech Republic

Acta Polytechnica Vol. 51 No. 5/2011

Chaos in GDP

R. Kříž

Abstract

This paper presents an analysis of GDP and finds chaos in GDP.
I tried to find a nonlinear lower-dimensional discrete dynamic macroeconomic model that would characterize GDP. This model is represented by a set of difference equations. I have used the Mathematica and MS Excel programs for the analysis.

Keywords: macroeconomic modeling, gross domestic product, chaos theory.

1 Introduction

Humanity has always been concerned with the question whether the processes in the real world are of a stochastic or deterministic nature. Answers are explored by theologians, philosophers and scientists in various fields. I take the view that real processes are more deterministic in nature. An interesting case of determinism is deterministic chaos. The only purely stochastic process is a mathematical model described by mathematical statistics. The statistical model often works, and is the only possible description if we do not know the system. This also applies to economic quantities, including forecasts for GDP. In this paper I have tried to grasp the hidden essence of the problem in order to formulate a prediction more easily. The basic question is therefore the existence of chaotic behavior. If the system behaves chaotically, we are forced to accept only limited predictions. In this paper I will try to show the chaotic behavior of GDP and then propose a simple lower-dimensional system under which the system evolves.

2 Methodology

For the analysis I will briefly state the basic definitions and describe the basic methods for examining the input data.

2.1 Gross domestic product

Gross domestic product (GDP) is a major macroeconomic indicator. It measures the overall production performance of the economy. It is the total market value of all final goods and services produced within a country during some period, usually one year, expressed in monetary units [8]. It is often considered an indicator of a country's standard of living. GDP can be determined in three ways, all of which should, in principle, give the same result: the product (or output) approach, the income approach, and the expenditure approach. By the expenditure approach,

$$Y = C + I + G + (X - M), \qquad (1)$$

where
• C (consumption) is normally the largest GDP component in the economy, consisting of private household final consumption expenditure;
• I (investment) includes business investment in equipment, for example, and does not include exchanges of existing assets;
• G (government spending) is the sum of government expenditures on final goods and services;
• X − M (net exports) represents gross exports X minus gross imports M.

2.2 Hurst exponent

The Hurst exponent is widely used to characterize some processes. It is a measure that has been widely used to evaluate the self-similarity and correlation properties of fractional Brownian noise, the time series produced by a fractional (fractal) Gaussian process. As originally defined by Mandelbrot [5], the Hurst exponent H describes (among other things) the scaling of the variance of a stochastic process y(t),

$$\sigma^2 = \int_{-\infty}^{+\infty} y^2 f(y, t)\, \mathrm{d}y = c\, t^{2H}, \qquad (2)$$

where c is a constant. The Hurst exponent is used to evaluate the presence or absence of long-range dependence and its degree in a time series. The Hurst exponent (H) is defined in terms of the asymptotic behavior of the rescaled range as a function of the time span of a time series, as follows:

$$E\!\left[ \frac{R(n)}{S(n)} \right] = C\, n^{H} \quad \text{as } n \to \infty, \qquad (3)$$

where R(n)/S(n) is the rescaled range, E[·] denotes the expected value, n is the number of data points in a time series, and C is a constant. The algorithm for the calculation is taken from Wikipedia [12]. To calculate the Hurst exponent, one must estimate the dependence of the rescaled range on the time span n of observation. The average rescaled range is then calculated for each value of n. For a (partial) time series of length n, $y = y_1, y_2, \dots, y_n$, the rescaled range is calculated as follows:

1. Create a mean-adjusted series
$$u_t = y_t - \frac{1}{n} \sum_{i=1}^{n} y_i \quad \text{for } t = 1, 2, \dots, n. \qquad (4)$$
2. Calculate the cumulative deviate series
$$V_t = \sum_{i=1}^{t} u_i \quad \text{for } t = 1, 2, \dots, n. \qquad (5)$$
3. Compute the range
$$R(n) = \max(V_1, V_2, \dots, V_n) - \min(V_1, V_2, \dots, V_n). \qquad (6)$$
4. Compute the standard deviation
$$S(n) = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \bar{y})^2}. \qquad (7)$$
5. Calculate the rescaled range R(n)/S(n) and average over all the partial time series of length n.

The Hurst exponent is estimated by fitting the power law (3), according to the definition. The values of the Hurst exponent vary between 0 and 1, with higher values indicating a smoother trend, less volatility, and less roughness. A random walk has a Hurst exponent of 0.5.
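The paper performs the calculations in Mathematica and MS Excel; purely as an illustration of steps 1-5, the estimator can be sketched in Python as follows (the window lengths are an arbitrary choice of ours):

```python
import numpy as np

def hurst_rs(y, windows=(8, 16, 32, 64, 128)):
    """Estimate H by rescaled-range (R/S) analysis, following steps 1-5 above."""
    y = np.asarray(y, dtype=float)
    log_n, log_rs = [], []
    for n in windows:
        rs = []
        for start in range(0, len(y) - n + 1, n):   # partial series of length n
            block = y[start:start + n]
            u = block - block.mean()                # 1. mean-adjusted series
            v = np.cumsum(u)                        # 2. cumulative deviate series
            r = v.max() - v.min()                   # 3. range R(n)
            s = block.std()                         # 4. standard deviation S(n)
            if s > 0:
                rs.append(r / s)
        log_n.append(np.log(n))
        log_rs.append(np.log(np.mean(rs)))          # 5. average over partial series
    h, _ = np.polyfit(log_n, log_rs, 1)             # slope of the power-law fit (3)
    return h

print(hurst_rs(np.random.randn(1024)))              # white noise: H close to 0.5
```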
2.3 Fractal

The term "fractal" was first introduced by Mandelbrot [4]. A fractal is a complicated geometric figure that, unlike a conventional complicated figure, does not simplify when it is magnified. In the way that Euclidean geometry has served as a descriptive language for the classical mechanics of motion, fractal geometry is being used for the patterns produced by chaos [9]. The fractal dimension, D, is a statistical quantity that gives an indication of how completely a fractal appears to fill space as one zooms down to finer and finer scales. There are many specific definitions of fractal dimension. The Hurst exponent H is directly related to the fractal dimension D, because the maximum fractal dimension for a planar tracing is 2:

$$D + H = 2. \qquad (8)$$

2.4 Correlation dimension

The correlation dimension ($D_c$) describes the dimensionality of the underlying process in relation to its geometrical reconstruction in phase space. The correlation dimension is calculated using the fundamental definition. Define the correlation integral for a set of M data points:

$$C(r) = \frac{1}{M(M-1)} \sum_{\substack{i,j = 1 \\ i \neq j}}^{M} H\big(r - \lVert y_i - y_j \rVert\big), \qquad (9)$$

where H is the Heaviside step function,

$$H(y) = \begin{cases} 0 & y < 0 \\ \tfrac{1}{2} & y = 0 \\ 1 & y > 0. \end{cases} \qquad (10)$$

A Euclidean metric is used for all calculations in this paper. When a lower limit exists, the correlation dimension is then defined as

$$D_c = \lim_{\substack{r \to 0 \\ M \to \infty}} \frac{\ln C(r)}{\ln r}. \qquad (11)$$
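As an illustration of definitions (9)-(11), here is a small sketch of ours (the paper itself works with the GDP series, not with this toy data, and in practice one would use delay-embedded vectors rather than raw scalars):

```python
import numpy as np

def correlation_integral(y, r):
    """C(r) of eq. (9) for M scalar points; Euclidean metric, diagonal excluded."""
    y = np.asarray(y, dtype=float)
    m = len(y)
    dist = np.abs(y[:, None] - y[None, :])      # pairwise distances |y_i - y_j|
    count = np.count_nonzero(dist < r) - m      # subtract the i = j pairs
    return count / (m * (m - 1))

y = np.random.rand(400)                         # toy data set
radii = np.array([0.02, 0.05, 0.1, 0.2])
c = np.array([correlation_integral(y, r) for r in radii])
# D_c is the slope of ln C(r) versus ln r for small r (close to 1 here,
# since uniform scalar data fills a one-dimensional set).
print(np.polyfit(np.log(radii), np.log(c), 1)[0])
```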
2.5 Lyapunov exponents

The Lyapunov exponent (or Lyapunov characteristic exponent) of a dynamical system is a quantity that characterizes the rate of separation of infinitesimally close trajectories. Quantitatively, two trajectories in phase space with initial separation $\delta z_0$ diverge (provided that the divergence can be treated within the linearized approximation) as

$$|\delta z(t)| \approx e^{\lambda t}\, |\delta z_0|, \qquad (12)$$

where λ is the Lyapunov exponent. The maximal Lyapunov exponent can be defined as follows:

$$\lambda = \lim_{\substack{\delta z_0 \to 0 \\ t \to \infty}} \frac{1}{t} \ln \frac{|\delta z(t)|}{|\delta z_0|}. \qquad (13)$$

The limit $\delta z_0 \to 0$ ensures the validity of the linear approximation at any time. The maximal Lyapunov exponent determines a notion of predictability for a dynamical system. A positive maximal Lyapunov exponent is usually taken as an indication that the system is chaotic (provided some other conditions are met, e.g. phase space compactness) [13].

3 Introduction to nonlinear dynamics

If the world is not linear (and there is no qualitative reason to assume that it is linear), it should be natural to model dynamical economic phenomena nonlinearly [2]. In the general setting of nonlinear dynamics, a deterministic system can generate random-looking results that may nevertheless include hidden order. In this paper I have studied only low-dimensional discrete dynamical systems.

3.1 Discrete dynamical system

Definition from Goldsmith [6]: a discrete-time dynamical system on a set Y is just a function

$$\Phi: Y \to Y. \qquad (14)$$

This function, often called a map, may describe the deterministic evolution of some physical system: if the system is in state y at time t, then it will be in state Φ(y) at time t + 1. The study of discrete-time dynamical systems is concerned with iterates of the map: the sequence

$$y,\ \Phi(y),\ \Phi^2(y),\ \dots \qquad (15)$$

is called the trajectory of y, and the set of its points is called the orbit of y. These two terms are often used interchangeably, although we will remain consistent in their usage. The set Y in which the states of the system exist is referred to as the phase space or state space. We will restrict our attention to maps Φ(y) such that Y is a subset of $\mathbb{R}^d$.

3.2 One-dimensional discrete dynamical system

A one-dimensional discrete system is

$$y_{t+1} = f(y_t), \quad y_t \in \mathbb{R}. \qquad (16)$$

One-dimensional discrete dynamical systems in economics are well suited for demonstrating the relative ease with which complex behavior can be modeled. The mathematical properties of one-dimensional dynamical systems seem to be better understood than those of higher-dimensional systems [2]. The logistic equation is a typical example of a simple one-dimensional discrete dynamical system which can be chaotic:

$$y_{t+1} = \mu\, y_t (1 - y_t). \qquad (17)$$

The logistic equation has been described in many books and papers, e.g. [1, 2, 6, 9, 10]. The logistic equation is chaotic when the control parameter μ lies between 3.5699... and 4.
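Connecting this with Section 2.5: for a one-dimensional map, (13) reduces to averaging $\ln |f'(y_t)|$ along an orbit. A sketch of ours for the logistic map (parameter values chosen for illustration):

```python
import numpy as np

def lyapunov_logistic(mu, y0=0.4, transient=500, iters=5000):
    """Maximal Lyapunov exponent of y_{t+1} = mu*y_t*(1 - y_t)."""
    y = y0
    for _ in range(transient):                   # discard the transient
        y = mu * y * (1 - y)
    total = 0.0
    for _ in range(iters):
        y = mu * y * (1 - y)
        total += np.log(abs(mu * (1 - 2 * y)))   # log |f'(y_t)|
    return total / iters

print(lyapunov_logistic(3.2))   # negative: stable period-2 orbit
print(lyapunov_logistic(3.8))   # positive: chaotic regime
print(lyapunov_logistic(4.0))   # approaches ln 2 = 0.693..., the known value
```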
4 Analysis of GDP

4.1 Input data

The gross domestic product by type of expenditure in current prices is used in this paper. I have used data (quarterly, without seasonal adjustment) from the Czech Statistical Office. I analyze data from the Czech Republic between 1995 and 2010 [11].

Fig. 1: GDP with linear trend of GDP

4.2 Calculation of H and D_c

The main problem in analyzing GDP is the lack of data; therefore, all results are only estimates. I have computed the Hurst exponent for GDP, H = 0.75, according to the algorithm in Section 2.2. This value is in accordance with expectations. We know that the value of H is between 0 and 1, whilst real time series usually give values higher than 0.5. If the exponent value is close to 0 or 1, it means that the time series has long-range dependence. The value 0.75 lies directly between a stochastic and a deterministic process. I think that 0.75 is a sufficient value for credible prediction. Now we also know the fractal dimension: 2 − 0.75 = 1.25.

Fig. 2: Correlation integral, value with a linear trend

The correlation dimension is calculated using the fundamental definition in Section 2.4. We have had a problem with lack of data for this computation. I have put the calculated data into a graph in logarithmic coordinates, and I have made a linear interpolation (cf. Figure 2). On this basis, the correlation dimension for small values of r can be estimated. The estimate of the correlation dimension is a value lower than 2. If the correlation dimension is low, the Lyapunov exponent is positive and the Kolmogorov entropy has a finite positive value, chaos is probably present. From the estimates of H and D_c it can be concluded that GDP exhibits deterministic chaos.

4.3 Analyzing in phase space

In the previous section (Section 4.2) we verified the presence of chaos in GDP. Data with a trend can cause problems for further analysis; the trend is removed by subtracting the linear interpolation. Denote GDP without trend as Y(t).

Fig. 3: GDP without trend
Fig. 4: GDP in phase space

A 2D phase portrait of GDP is constructed so that each ordered pair $\{y_t; y_{t-1}\}$, $t = 2, \dots, n$, is displayed in the plane, where the x-axis represents the values of $y_t$ and the y-axis the values of $y_{t-1}$ (cf. Figure 4). The individual points $\{y_t; y_{t-1}\}$ of phase space are connected by a smooth curve. This curve looks like a chaotic attractor. The points are located mainly in the first and third quadrants. The 3D phase portrait of GDP is constructed so that each ordered trio $\{y_t; y_{t-1}; y_{t-2}\}$, $t = 3, \dots, n$, is displayed in space. The individual points of the 3D phase space are connected by a line (cf. Figure 5) or covered by a surface (cf. Figure 6). Figure 6 looks very strange.

Fig. 5: GDP in 3D phase space (y_t, y_{t−1}, y_{t−2})
Fig. 6: GDP in 3D phase space (y_t, y_{t−1}, y_{t−2})

5 GDP as a dynamical system

I have tried to find a system that will be similar to GDP, based on the previous analysis.

5.1 Comparison of GDP with a one-dimensional discrete dynamical system

A chaotic course $y_t$ can be immediately simulated by a logistic equation, but the progression in phase space is completely different. A logistic equation can be appropriate for simulation, but not for forecasting. If we consider a suitable function, it should be an odd function. I have studied several one-dimensional systems with a cubic function. It is reasonable to assume that the function has one root at 0. The desired equation can be written in the form

$$y_{t+1} = y_t (\alpha y_t^2 - \beta). \qquad (18)$$

A short numerical experiment with this map for α = 1, β = 2.8 is sketched after the figure captions below.

Fig. 7: Cubic function: Plot[y^3 − 2.8y, y, −2, 2]
Fig. 8: Bifurcation diagram of the function y³ − by, {b, 2.25, 3}
Fig. 9: Chaos in the cubic map y_{t+1} = y_t³ − 2.8 y_t
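A minimal numerical experiment with the map (18) for α = 1, β = 2.8 (our sketch; the paper's own computations were done in Mathematica and MS Excel):

```python
import numpy as np

def cubic_map(y0, b=2.8, n=2000):
    """Iterate y_{t+1} = y_t^3 - b*y_t, i.e. eq. (18) with alpha = 1, beta = b."""
    y = np.empty(n)
    y[0] = y0
    for t in range(n - 1):
        y[t + 1] = y[t]**3 - b * y[t]
    return y

y1 = cubic_map(0.1)
y2 = cubic_map(0.1 + 1e-9)
# Sensitive dependence on initial conditions: the two orbits separate rapidly.
print(np.max(np.abs(y1 - y2)[:50]), np.max(np.abs(y1 - y2)))
# The orbit stays in a bounded interval, as in Fig. 9.
print(y1.min(), y1.max())
```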
6 Conclusion

I have shown in this paper that GDP can be chaotic. I found a very simple nonlinear difference equation which properly captures GDP. According to the phase portrait, it seems that the function should be odd. The most appropriate function is the cubic equation with a single zero root. It is a simple system that is easy to interpret. It should be noted that this system is not perfect, and it would be useful to find a set of difference equations of higher order. The biggest problem in finding a suitable system is the lack of data. This paper analyzes GDP as such; it would be more sophisticated to create a theoretical model and calibrate it.

Acknowledgement

The research presented in the paper was supervised by Doc. Ing. Helena Fialová, CSc., FEE CTU in Prague, and supported by the Czech Grant Agency under grant No. 402/09/H045 "Nonlinear Dynamics in Monetary and Financial Economics. Theory and Empirical Models."

References

[1] Allen, R. G. D.: Matematická ekonomie. Academia, 1971.
[2] Lorenz, H.-W.: Nonlinear Dynamical Economics and Chaotic Motion. Springer-Verlag, 1989.
[3] Flaschel, P., Franke, R., Semmler, W.: Dynamic Macroeconomics. The MIT Press, 1997.
[4] Mandelbrot, B. B.: The Fractal Geometry of Nature. W. H. Freeman and Co., 1983.
[5] Mandelbrot, B. B., van Ness, J. W.: Fractional Brownian motions, fractional noises and applications. SIAM Rev. 10 (1968), 422.
[6] Goldsmith, M.: The Maximal Lyapunov Exponent of a Time Series. Thesis, 2009.
[7] Grassberger, P., Procaccia, I.: Characterization of strange attractors. Phys. Rev. Lett. 50 (1983), 346.
[8] Fialová, H., Fiala, J.: Ekonomický slovník. A plus, 2009.
[9] Alligood, K. T., Sauer, T. D., Yorke, J. A.: Chaos: An Introduction to Dynamical Systems. Springer, 2000.
[10] Chiarella, C.: The Elements of a Nonlinear Theory of Economic Dynamics. Springer-Verlag, 1990.
[11] http://www.czso.cz/eng/redakce.nsf/i/time series
[12] http://en.wikipedia.org/wiki/Hurst_exponent
[13] http://en.wikipedia.org/wiki/Lyapunov_exponent

About the author

Radko Kříž, MSc., was born in Broumov. He graduated at the Faculty of Electrical Engineering of the Czech Technical University in Prague, specializing in economics and management in electrical engineering, and he is currently a PhD student at the Faculty of Electrical Engineering of the Czech Technical University in Prague. His major fields of specialization are macroeconomics, energetics, nonlinear dynamics in economics, and chaos theory.

Radko Kříž, e-mail: krizradk@fel.cvut.cz
Dept. of Economics, Management and Humanities, Faculty of Electrical Engineering, Czech Technical University, Technická 2, 166 27 Praha, Czech Republic

Acta Polytechnica Vol. 51 No. 3/2011

Electronic Education

D. Vaněček, J. Jirsa

Abstract

The age in which we are living nowadays is characterized by rapid innovation in the development of information and communication technologies (ICT). This innovation has a significant influence on the education process. This paper deals with e-learning.

Keywords: e-learning, LMS, computer animation.

Introduction

The World Wide Web has had a fundamental impact on the way in which information and ideas are transmitted, and this has also changed education. Computer technology has penetrated into all areas of human activity and is becoming a part of our everyday life. However, there are still fields in which computers are only slowly beginning to participate and emerge into the foreground. This is particularly true for the social sciences; however, the situation has been changing significantly even there in recent times. This paper deals with the use of computers in education. We will pay particular attention to e-learning and its multimedia capabilities. We will focus on computer animations, which are discussed in greater detail in the second and third parts of this paper.

Electronic education

The idea of e-learning began with the computer age, and has developed dynamically as the internet has spread and browsers have improved. As recently as 1999, e-learning consisted mainly of static pages. Communication with the teacher either did not work at all, or was limited to e-mails. This phase of computer-based learning is also known as web-based training (WBT). However, as internet technology developed, e-learning courses became more sophisticated, and better teacher-student collaboration and feedback became possible. Moreover, thanks to easy updating, the content has changed over time and has become more and more multimedia-rich. Nowadays e-learning is widespread not only in education but also in commercial businesses, which often use it as a major lifelong learning tool for their staff. There are thousands of so-called corporate universities in the world, which push the use and methods of e-learning further. Today e-learning is often mentioned by educators, and it is regarded as a means for better teaching, especially in distance teaching. E-learning is a way of enriching and boosting teaching with the use of modern computing technologies (mostly web-based technologies).
E-learning changes our view on traditional ways of delivering education, and has also influenced pedagogical and psychological learning theories linked with technologies. If an e-course is well conceived and well created, its use in education is based on modern learning concepts: constructivism and connectivism. The terms remember, recall and learn are enriched by terms like reflect, create and form.

Definition

The notion of e-learning proves to be difficult to define, in spite of its high frequency of use. Numerous definitions have been proposed, and many of them differ significantly from each other. The usual differences are whether the definition is from a pedagogical viewpoint or from a technological viewpoint, and whether older forms of computer use in education (especially CBT) belong to e-learning (or whether e-learning should refer only to the use of internet technologies). For better understanding we will present some definitions of the notion of e-learning expressed by present-day experts, see [2]:

• For me, e-learning is studying by means of electronic media, be it learning through CDs or through the internet.
• By e-learning I understand educational and training systems (especially on-line systems).
• I understand e-learning as education supported by modern electronic means (computers, media, internet) in distance learning, part-time and full-time study.
• I imagine e-learning as electronic education, e.g. an educational course created in an LMS which is intended for self-study under the supervision of a teacher who communicates with the student electronically through this environment.

In their definitions of e-learning, experts mostly emphasize its more sophisticated and more exacting forms.

Utilization of e-learning at universities

E-learning is becoming integrated into the education at many Czech universities, and is created and run by LCMS/LMS systems. This integration of electronic education is of three types at the present time (see [3]):

• Electronic support for full-time study; classical education is mixed with elements of individual work by the student using electronic sources.
• Interactive elements are added to the electronic support; classical education is in some cases limited.
• Students have access to electronic courses, and to entire educational cycles. Classical education is limited to a minimum, or is omitted completely. The electronic support is highly interactive, and even tests of knowledge are often administered in this way.

LMS/LCMS systems

For the ordinary teacher, it is difficult to create a modern interactive electronic course (e-course or e-learning course) without expert help: high technical skills are needed. This can affect the teacher's decision-making process, if an e-course for students is to be created on a voluntary basis. The basic skill is in creating web pages. This requires knowledge of HTML documents, and if active elements are used, programming skills are also required. There is no reason to assume that each teacher (e.g. a teacher of social sciences) will be willing and technically able to master these computer skills. For these reasons, the creators of computer education systems soon began to develop user-friendly systems called web content management systems (WCMS). Most of these systems run over the internet and are available via web browsers. Nowadays WCMSs are used for creating and managing web pages, and the user does not need to have programming skills.
After logging in to the application, the course creator can insert some content, usually text. After it has been saved, this text is available to other users as a web page (e.g. Wikipedia). WCMSs are very often equipped with advanced features that even an experienced programmer would scarcely be able to implement alone. These features often include forums, discussions and questionnaires (as in social network pages, e.g. Facebook). All these features are usually well secured and debugged. Here we can mention popular systems such as Drupal and Joomla, or systems based on wiki.

The idea of utilizing these systems for the purposes of education followed soon after, with the creation of learning content management systems (LCMS). There are some differences from a regular WCMS, mainly in the creation of specific web page types (online learning courses) and in some additional specific functions. These special functions are characteristic educational tasks, such as setting tests, assigning homework and delivering lectures. In addition to LCMSs there are learning management systems (LMS). These are not intended for creating e-learning courses, but they are useful for controlling, managing and administering them. The use of an LMS is therefore well suited for monitoring students' results, for assigning homework tasks, and for controlling access to the separate parts of an e-course (parts created in advance in an LCMS). Most systems provide the functions of both LCMS and LMS. The most widespread LMS/LCMS system in the Czech Republic is Moodle.

For technically-oriented subjects, it is necessary to use interactive elements integrated into LMS/LCMS courses. The basic representatives of this branch are animated elements. Skalková made an important contribution here. Pedagogical research in the field of education technology has generally emphasized interactivity and hyper-media presentation of knowledge. Sociological research has shown that children and young people are living in an increasingly media-saturated world: computers, video and the internet have become everyday features of their living environment. We will now focus on educational computer animation, a significant interactive aspect of electronic learning courses in the framework of LMS/LCMS systems.

Animation: a multimedia foundation of e-learning

For teaching technical subjects at high schools and universities, for example constructive geometry, it is necessary to develop the students' spatial imagination. Vividness is often crucial for understanding the topic under discussion. The creation of a system of ideas and notions on the basis of directly perceiving real objects is heavily utilized by teachers of technical subjects, but it has its limits. Technical diagrams and photos can be an excellent aid. Verbally-illustrative vividness, based on a verbal description of phenomena, can prove difficult for students, especially if their personal experience cannot be relied on. Thanks to the huge development of computer technology, new options in support of vividness are now available. This new type of vividness might be called "media vividness". Its main representative is animation. Animation is essentially an illustrative-demonstrative method [1], combining demonstrations of visual information with dynamic projection. The reasons for using visual information come from the findings of cognitive psychology. From the point of view of the human psyche, an image is perceived differently than textual information.
We can say that an image has a closer relation to the physical world than a verbal description, because its structure preserves the structure of the world [2]. While watching an animation, a student undergoes a process of indirect observation: he/she observes a mediated model, purposefully perceives the presented facts, and creates notions about the phenomenon or technology. In addition, animation attempts to motivate students for other activities and to encourage their interest in the subject. The option of publishing animations on the internet is closely linked with the didactic principle of continuity and permanency: a student has the option to go back to the animation when working at home. Another advantage of animation is that it can take into account the individual pace of each student. Animation is thus a continuation and extension of usual classwork, basically fulfilling a consolidative function.

Historical development of animation

The idea of using animation in education is not new. It builds on the basic insight of J. A. Komensky that people easily remember and acquire new knowledge through visual information, and even better through all their senses. A correctly conceived animation can come very close to fulfilling this requirement. If we look back into the past for a moment, we can observe that animated presentation of the curriculum has been achieved by various means. When electronic media were not available, animations were made with paper, translucent foil, or specially-lit show cases with an electromechanical drive. Further development was based on electronics and TV broadcasting, which enabled educational programs on TV to be included into traditional classwork. These programmes included some animated elements, but they were made by classical techniques borrowed mainly from the film industry. Subsequent rapid development was connected with the expansion of the VCR. It became possible to record broadcast programs on tape, or to play educational programs distributed on VHS video cassettes. With the development of computer technology, its use in teaching and learning has also changed. The times when a computer served only for teaching programming techniques are long gone. In the process, we have learned the advantages of using computers not only as a tool for presenting a specific course, but also as a tool for testing and evaluating the knowledge of pupils and students. With the rise of the internet, other important changes in methods of education support have emerged. Teachers have created web pages for their students in support of the work done in class, e.g. by publishing study materials. Thus the option of distance study (not to be confused with part-time study!) has appeared, i.e. studying via the still expanding worldwide computer network. Nowadays, we can see the development of entire electronic educational systems with internet support, and the emergence of other conveniences, e.g. animations and simulations.

Features of good didactic animation

Based on our practical experience of creating technical animations, we will now present some basic features that should be taken into account when designing animations [3]:

• Vividness: this is a logical consideration when speaking about animation. Animations should graphically and visually supplement a notion about a topic that is being described theoretically. It is important to know what the intended message is for the student, and what features of the topic should be emphasized.
• Simplicity: when creating the animations, we try to go directly to the point, to portray just a specific point, avoiding aesthetic additions. If we add unnecessary extra elements into the animation, the student may get lost and will not know what he/she is supposed to note in the animation, and what the animator is attempting to explain. The principle of the combustion engine can serve as an example. If we want to explain the principle of a four-cylinder engine, we should portray only one cylinder, and on the basis of this one cylinder explain all four strokes that occur during one cycle. We should not add the three other cylinders, which are in a completely different part of the cycle in each stroke. A student would see how the whole system works, but he/she would hardly understand the individual parts of the cycle. By displaying all cylinders at once we could illustrate the function of the engine as a whole; thus we could consider the animation as illustrative or demonstrative. However, if we want to create a descriptive or telling animation, we should follow the principle that less is often more.

If we also want the student to have some fun while watching the animation, and not only stare blankly at the same backgrounds and colors, the animation should be catchy. Using a number of bright colors will definitely help the catchiness, in contrast with a black-and-white animation. Colorfulness can also be used for highlighting some parts of the images and for drawing attention to them. When we are explaining the details of a piece of technology, we can always highlight the detail in an appropriate color to make sure that it is obvious what we are talking about and where it is located. Of course, here too the principle of diversity with moderation applies.

If we want to retain the student's attention, the animation should be creative. The main issue is: how are we trying to pass on our vision? The more creative the animation is, the more interest it will arouse, and the better it will be remembered by the students. You can certainly remember from your own student years that it matters how a teacher transforms and presents the subject didactically. If a teacher, for example, combines a dry explanation of a physical phenomenon with some witty anecdote about the subject, we are able to recall it even after many years, together with the physical phenomenon. Animation should be used similarly. If we choose a witty design, our graphics can remain in the subconsciousness of our students for years, together with the phenomenon that they illustrated.

• Adequate length: it is necessary to understand that an excessively long animation can become boring. The attention span of a student has its limits, which should not be crossed. We should try to create a short, clear and diverse animation, so that students will not get bored and start thinking about other things. If students lose their attention in class, there will be a disruption, even if the initial intention was good. If we go to the cinema and the movie is too long and repetitive, we can fall asleep or walk out. Appropriate length always needs to be kept in mind.

• Speed: too much speed is disturbing for the student. It basically ruins the animation if everything we want to say happens in one quick moment. Excessive speed is therefore unsatisfactory. If we slow down the animation, which is in most cases done by adding more images to the animation, the final length and the final number of frames increase.
The size of the animation file also increases, so making the animation longer can cause a new problem without solving the original one. A problem can also occur if we want to portray the different speeds of ongoing phenomena within a single animation. For example, a portrayal of the solar system can be very confusing, with each planet rotating round the sun at a different speed, with different numbers of moons that have different periods of orbit. The student does not know what to concentrate on. For a motivational animation, a high-speed solar system might be suitable, but for an illustrative animation we must find another solution. In this case, it is enough to show each motion separately and spend some time on it, so that the students can comprehend it. At the end of the animation we can combine all motions together to represent the real situation, and it will be apparent which planet is faster, when, and under what conditions the speed changes.

• Size: the size of the animation depends on the number of frames and on their length. The smaller the size of the animation, the more accessible it is, because it can be downloaded faster from the internet. The size of an animation can be regulated by the choice of the graphic editor, or by its complexity. The more moving parts the animation has, the more information will have to be saved into the final file. Naturally, we do not always have to worry about the size of the graphics; it depends on the way in which our educational course is going to be distributed. If it is to be delivered on CDs, or nowadays rather on DVDs, we can afford to have larger graphic files, because the user will not have to download them from anywhere. The same applies if we create our own animations and keep them on the hard disk drive without any intention of distributing them.

• Use in a class: a scenario is closely linked to the didactic function of the animation, whether it is to serve for motivating, for simulating and understanding patterns, for practicing or for testing.

Technological aspects of animation

The first didactic animation types were created as animated GIFs. This format was suitable for web presentation purposes. It is based on a common cartoon concept, where the animation consists of a sequence of single pictures. We create the series of pictures, which we then merge into one final animation with a .gif file extension. It is easy to create and edit animations of this kind, because shareware or freeware software tools are available all over the internet (e.g. GIF Construction Set for Windows). There is no need to obtain expensive licenses for commercial tools such as Adobe Photoshop. The main disadvantage of these animations is their size. We have to remember that we use a sequence of single pictures, so the final size of the animation is the sum of all included pictures, and can be large. This can be an important consideration when there is a slow internet connection and the student may have to wait some minutes for the whole animation to be downloaded onto his computer. Another issue is the degree of animation. Classical film is based on 24 frames per second. If we want to achieve cinematic animation quality, we have to create the animation with 24 frames per second, but this leads to a large file size. An alternative is to use fewer than 24 frames, e.g. just one per second. This solution will lead to inferior visual effects and the animation will be jerky. The usage of such sequences is therefore only suitable for illustrating some simple idea, or for adding some interesting decoration. It is necessary to use animated GIFs appropriately, otherwise we can achieve the opposite of what was intended. Another disadvantage of GIFs is the absence of any audio track.
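As an aside (not from the original paper): with today's tools, merging a prepared series of pictures into a GIF takes only a few lines. The sketch below uses the Python Pillow library; the frame file names and the one-second delay per frame are our assumptions:

```python
from PIL import Image   # the Pillow imaging library

# Load a prepared sequence of frames; the file names are only an assumption.
frames = [Image.open(f"frame_{i:02d}.png") for i in range(12)]

frames[0].save(
    "animation.gif",
    save_all=True,              # write the whole sequence, not just frame 0
    append_images=frames[1:],   # the remaining frames
    duration=1000,              # display time per frame in milliseconds
    loop=0,                     # 0 = loop forever
)
```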
Dynamic HTML. The earliest concept of websites was based on the idea of static information pages without any changing content, though a new page might be added at any time. This property is characteristic of the basic HTML web site language (HyperText Markup Language). The basic elements of this language are tags, which tell the web browser how to display a certain part (font, hypertext link, etc.). As the internet progressed, static web content became less and less adequate: web developers wanted to add dynamically changing parts of web sites for users to view. This was made possible by dynamic HTML (DHTML) technology. Complex scripting languages are used, like JavaScript. These languages provide access to the Document Object Model (DOM) within the user's browser. However, DHTML is not ideally suited for creating animation: it can only move static pictures in the browser window via the properties of the DOM. If we move several graphical elements in different directions, we can obtain some degree of graphic animation. The two technologies, GIF and DHTML, are automatically recognized by the web browser, without any additional programs or plug-ins. However, it is hard to make a DHTML animation that will be correctly displayed in all web browsers; this is its disadvantage against GIF animation. It is also rather more difficult to program such animations, but fortunately there are software tools that can easily create the initial code for our web page. DHTML itself is not very efficient for web animation purposes, because it only involves moving static pictures inside the window area; we cannot make other types of animation, such as shape or color transformations. However, these animations are smoother than animated GIFs. More variable web animations can be created with the use of more sophisticated web browser plug-ins.

Java applets. Another way to create web animation is by using the multiplatform network programming language Java. In Java, we can create small applications downloadable and running in the web browser, provided we have installed the necessary plug-in. These applets are small, reduced software applications which cooperate with the web browser. The main advantage of Java is its cross-platform orientation and its ability to run on several operating systems. We can create highly interactive animations, including raster- or vector-based graphics, or mix them with other web page elements.

Adobe Flash (or Shockwave Flash). Adobe Flash and Adobe Director are applications that enable any graphic idea of the author to be animated. Both are widely-used software applications. In 2003, more than 97 percent of all internet users visited web sites with Macromedia Flash content (according to information provided by the Macromedia corporation). For this reason there are Adobe extension modules (plug-ins) for the widely-used web browsers that enable animation to be shown inside the web browser window. Of course, this is a very user-friendly feature, because we can get the requested content by one click of the mouse. Even if we do not have the plug-in installed, most web browsers offer automatic download of plug-ins from the relevant web sites, so there is no need to search for them all over the internet; we can, however, also download them manually from the Adobe web site if we want. The advantage of Adobe Flash or Adobe Director is that they create small files which are easily and quickly downloadable. This is due to the use of vector graphics inside their file format: we do not need to transfer whole pictures over the net, but only vectors of their changes. The second benefit is the popularity of this graphical format; it is well known in the community of animation creators. The third advantage concerns downloading. Even if the animation file is large, the user does not have to wait long: the concept of Flash animations allows the animation to start playing before the whole file has been fully downloaded. This concurrent playing and downloading is very useful, because the user can immediately focus on the information that is being presented, and does not need to think about the rest of the file. The question is which of them to choose, Flash or Director? In most cases the significant parameter is price; thus we will probably choose Adobe Flash, which is cheaper than Adobe Director. The file extension of Flash animation source files is .fla. This source file type is only editable and cannot be played directly out of the editor; we must therefore publish the animation into the playable format SWF (Shockwave Flash), or optionally into other formats like EXE or HTML.
however, we can also download them manually from the adobe web site if we want. the advantage of adobe flashoradobedirector is that they create small files which are easily and quickly downloadable. this is due to the utilization of vector graphics inside their file format. we therefore do not need to transfer whole pictures over the net, but only vectors of their changes. the second benefit is the popularity of this graphical format. it is a well-known format in the community of animation creators. the third advantage concernsdownloading. even if the animationfile is large, the user does nothave to wait for long. the conceptofflashanimationsallows the animation to start playing before the whole file hasbeen fully downloaded. these concurrentplaying and downloading activities are very useful, because the user can immediately focus on the information that is being presented, and does not need to think about the rest of file. the question is which of them to choose — flash or director? in most cases the significant parameter is price. thus we will probably choose the adobe flash, which is cheaper than adobe director. the file extension of flash animation is fla. this is the source file type of animation which is only editable, and cannot be played directly out of the editor. we must therefore publish the animation into a playable format swf (shockwave flash) or optionally into other formats like exe or html. 57 acta polytechnica vol. 51 no. 3/2011 flash animation can nowadays be found in many web sites, see the number of interactive advertisement banners. flash format also includes the programming language. this feature allows the creation of several interactive environments to fulfil the author’s graphic ideas. wink. wink is free software tool. it can create animationsdirectlyby recording screenshots. wecan utilize it especially for topicswhere the studentswork with computer workstations. in this case we need to show students some new actions, immediately after which they should try them themselves. examples of these activities are: algorithmprogramming, courses in computer graphics or presentation of application features. the principle of wink is based on making screenshots. first, we choose the required recording area onour screen. it is thenminimized andwaits for a signal from the user. after the right combination of keys has been pressed, the recording starts, while the tutor normally works, e.g. shows work with a new application. the tutor works continuously, and thewinkconcurrently takes snapshotsof the selected screenarea. when the recording is finished (bypressing the same combination again), smooth animation can be created from the sequence of recorded screenshots. if we find a group of identical screenshots due to our longer explanation, we can rip these screenshots out and leave only one of them. then we set a longer presentation time for this single screenshot. this leads to considerable saving of file. we can also add some additional comments, graphic symbols or pictures. the final animation looks like a video that is exportable into severalfile formats, includingflash orpdf.then it has the same advantages aswementioned above. the tutor can also add some audio comments that explain the actual situation on the screen. there is no reason to make this animation only from screenshots. we can also compose it from other picture formats e.g. jpg,gif, bmp.the major advantages of wink can be observed when students have to emulate the computer activity of a tutor. 
The main utilization of Wink is in informatics. Wink is used in a practical way at the technical secondary school of electrical engineering in Jecna Street, Prague, where it is used for teaching computer languages. Wink is very popular with teachers and with students. We should point out that the list of software applications mentioned here is incomplete. There are many other tools that are more or less suitable for creating animation, and as technology progresses we can anticipate further extension and innovation of programs of this sort.

Conclusion

The goal of this paper was to present to the reader some key issues in electronic education and some opportunities in multimedia. We have provided an introduction to technical animation as a fundamental multimedia aspect of electronic learning. We present a taxonomy of technical animation in the paper that follows.

References

[1] Vaněček, D.: Informační a komunikační technologie ve vzdělávání. Praha: ČVUT, 2008. ISBN 978-80-01-04087-4.
[2] Kassin, S.: Psychologie. Computer Press, 2007. ISBN 978-80-251-1716-3.
[3] Kropelnický, R.: Využití animace při tvorbě e-learningového kurzu. BP MÚVS ČVUT, Praha, 2007, ved. práce D. Vaněček.
[4] Vargová, M.: Metodika pracovnej výchovy a pracovného vyučovania. Nitra: PF UKF, 2007. ISBN 978-80-8094-171-0.

Ing. Paed. IGIP David Vaněček, Ph.D., e-mail: david.vanecek@muvs.cvut.cz
Czech Technical University in Prague, Masaryk Institute of Advanced Studies, Department of Engineering Pedagogy, Horská 3, 128 00 Praha 2, Czech Republic

Ing. Bc. Jan Jirsa, e-mail: jirsaj@fel.cvut.cz
Czech Technical University in Prague, Faculty of Electrical Engineering, Technická 2, 166 27 Praha 6, Czech Republic

Acta Polytechnica Vol. 48 No. 6/2008

Rock Burst Mechanics: Insight from Physical and Mathematical Modelling

J. Vacek, J. Vacek, J. Chocholoušová

Abstract

Rock burst processes in mines are studied by many groups active in the field of geomechanics. Physical and mathematical modelling can be used to better understand the phenomena and mechanisms involved in the bursts. In the present paper we describe both physical and mathematical models of a rock burst occurring in a gallery of a coal mine. For rock bursts (also called bumps) to occur, the rock has to possess certain particular rock burst properties leading to accumulation of energy and the potential to release this energy. Such materials may be brittle, or the rock burst may arise at the interfacial zones of two parts of the rock which have principally different material properties (e.g. in the Příbram uranium mines). The solution is based on experimental and mathematical modelling. These two methods have to allow the problem to be studied on the basis of three presumptions: the solution must be time dependent, it must allow the creation of cracks in the rock mass, and it must allow an extrusion of rock into an open space (the bump effect).

Keywords: rock burst, bump, mining, rock mechanics, mathematical and physical modelling.

1 Introduction

A rock burst is the most dangerous event that can occur during excavation works. During a rock burst, surrounding rock is extruded into an underground open space by a severe force. This event may cause an accident or even the death of mining workers, and it may destroy the excavation space. For this reason, studies of this problem are very important both theoretically and for practical applications; see [1], [2], [3] and [7-9]. Sufficiently high pressure at the place of the rock burst is necessary for bump occurrence (usually great depth, but also tectonic pressure), and the rock must be brittle and must have a disposition for bumps (properties that allow the creation of bumps). For the occurrence of bursts, mining velocity is also a very important factor. Under the same conditions, if we excavate slowly, we give the rock mass sufficient time to create cracks in the vicinity of the open space; the stress concentration next to the excavation therefore falls, and no bump occurs. If mining works proceed rapidly, cracks have no time to occur, and a rock burst appears. This feature is confirmed by old mining experience. For this reason, it is necessary to study this event as a time-dependent problem. Rock bursts were studied for the case of a mine gallery inside a horizontal coal seam. Their mechanics and the stress distribution on the top of the seam were studied by mathematical and experimental modelling.

2 Experimental part

2.1 Testing devices

2.1.1 Loading cell

Fig. 1 shows the loading cell. It consists of the lower steel tank, which is designed to take the horizontal forces caused by the vertical load on the araldite specimens. The loading cell is equipped with lucites on its sides, which allow observation of the samples during the tests. The tank is shown in Photo 2.
this loading cell models (simulates) the rock mass in the vicinity of the seam. in the loading cell (with dimensions of 160/400/70 mm) we placed two araldite specimens, which model the coal seam.

rock burst mechanics: insight from physical and mathematical modelling

j. vacek, j. vacek, j. chocholoušová

rock burst processes in mines are studied by many groups active in the field of geomechanics. physical and mathematical modelling can be used to better understand the phenomena and mechanisms involved in the bursts. in the present paper we describe both physical and mathematical models of a rock burst occurring in a gallery of a coal mine. for rock bursts (also called bumps) to occur, the rock has to possess certain particular rock burst properties leading to accumulation of energy and the potential to release this energy. such materials may be brittle, or the rock burst may arise at the interfacial zones of two parts of the rock which have principally different material properties (e.g. in the příbram uranium mines). the solution is based on experimental and mathematical modelling. these two methods have to allow the problem to be studied on the basis of three presumptions: the solution must be time dependent, the solution must allow the creation of cracks in the rock mass, and the solution must allow an extrusion of rock into an open space (bump effect).

keywords: rock burst, bump, mining, rock mechanics, mathematical and physical modelling.

fig. 1: scheme of the loading cell

the gap between them corresponds to the width of a working gallery in a mine. we observed the mechanism and the history of coal rock bursts. the araldite specimen was covered with a soft duralumin sheet, and force meters were placed on it in the following manner: 5 comparatively thick force meters were placed near its outer edge, and another 15 thinner force meters were placed next to them (see fig. 1 and photo 2). in order to embed the force meters properly and to prevent them from tilting, another double steel sheet, 1 mm thick, was placed over the force meters. a block of duralumin 300 mm in height was placed over this sheet. this block simulates the hanging wall and models the stress distribution similarly to that in reality (see photo 1). the bedrock was modelled by a steel sheet.

2.1.2 force meters

photo 2 shows the force meters. the force meters are 160 mm in length, 68 mm in height, and 16 or 32 mm in width. there are 4 strain gauges on each force meter – 2 on one side, 30 mm from the edge of the force meter, and 2 on the other side, 60 mm from the edge of the force meter. these allow us to measure the deformation along its full length. the data from each force meter was read automatically every 10 seconds by a brüel & kjær strain gauge bridge, and deposited in a computer. it takes 1.2 s to read 40 force meters.

2.1.3 conduct of the experiments

three tests with various speeds of loading were used during our experiments. the whole system was uniformly loaded at a speed of 50, 240 or 1200 kn/min. in all cases, the total force reached was 6 mn (the maximum force of the testing apparatus), i.e. the total experimental time was 120, 25 and 5 min, respectively. the average load of the araldite samples at the end of the test was 46.875 mpa, while the maximum load close to the mine gallery was approximately doubled, see figures 2 and 3.
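the loading programme quoted above is internally consistent, as a few lines of python confirm; note that the 2 × 160 mm × 400 mm contact area is our reading of the loading cell dimensions, an assumption rather than an explicitly stated value:

```python
# sanity check of the loading programme of the rock burst experiment
F_total = 6.0e6                        # total force at the end of a test [N]
for rate in (50e3, 240e3, 1200e3):     # loading speeds [N/min]
    print(f"{rate/1e3:6.0f} kN/min -> {F_total/rate:5.0f} min")
# -> 120, 25 and 5 min, matching the stated test durations

# average stress on the araldite "coal seam":
# two specimens, each assumed to be 160 mm x 400 mm in plan
A = 2 * 0.160 * 0.400                  # loaded area [m^2]
print(f"average load: {F_total/A/1e6:.3f} MPa")   # -> 46.875 MPa
```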
the force meters can sustain this load without reaching the yield limit of their steel. the reading frequency of the force meters was 10 s on a commercial brüel & kjær bridge. the speed camera/camcorder was equipped with a 2 s loop tape, i.e. a recording was overwritten after 2 s. after the rock burst, the camera stopped and the last two seconds remained recorded. the rock burst intensity was proportional to the intensity of the produced sound effect and, more accurately, proportional to the level of decrease in the load force. in the case of the strongest bursts, the force decreased all the way to zero.

2.2 some of the results

more results are given in refs. [4], [5] and [6].

photo 1: testing device. two camcorders recorded the test
photo 2: loading cell; right: with sample and force meters
fig. 2: mean stress distribution of the coal seam loading before (thick line) and after a bump. the force is 4170–4265
fig. 3: loading of the coal seam during test vw 21 in axonometry
photo 3: state before bump
photo 4: first extrusions appear
photo 5: early stage of bump
photo 6: late stage of bump
photo 7: bursting (t = 0.060 s)

2.2.1 mechanics of bumps

the last part of the experiments was also recorded with a high-speed camera. this made it possible to watch the bumps as events lasting several seconds. photos 3–6 were chosen from the continuous records. they show that the bump started (like an earthquake) with a small extrusion of rock mass, see photo 3, followed by the main bump, photos 4 and 5, and the state after the bump finished, photo 6. the event sometimes ends with small extrusions of rock, though not in the case described here.

2.2.2 what the tests showed

• the extruded rock mass is located only in a narrow strip next to the free rock surface. the breadth of this strip was only 15–40 % of the height of the coal seam, and was proportionate to the bump force: the bigger the bump, the broader the strip of extruded rock. the height of the coal seam was 70 mm in the experiment; the bump extruded approximately 10 to 30 mm of rock. after the bump, the surface of the coal was not smooth, it was uneven.
• in one case (out of about 30 cases) the bump came in two waves, see photos 7–9.
• bumps in experimental conditions last about 20 ms (the two-wave bump lasted 50 ms), and whole events last about 0.5 s.
• the time between the first extrusion and the bump was 20–200 ms.
• a great amount of energy is released during the bump. part of it extrudes the rock mass into an open space, and reaction forces strike the rest of the rock. this creates new joints in the rock and the bump does not continue. when the bump is only weak, no joints appear in the non-destroyed rock, and a second wave of the bump comes, see photos 7–9.
• the speed of loading considerably influences the rock burst intensity. while the intensity is rather low for the slow experiments, and the number of bursts is smaller (~3), in the intermediate experiments the bursts are stronger and their number increases to about 5; for the fast experiments we also observe ~5 bursts, but of exceptionally high intensity. one of these bursts even damaged our experimental equipment, completely destroying the plexiglass sidings.
3 mathematical model

pfc2d (particle flow code in two dimensions), developed by itasca, usa, was used for the numerical modelling part of the project. a physical problem concerning the movement and interaction of circular particles may be modelled directly by pfc2d. pfc2d models the movement and interaction of circular particles by the distinct element method (dem), as described by cundall and strack (1979). particles of arbitrary shape can be created by bonding two or more particles together. these groups of particles act as autonomous objects, provided that their bond strength is high. as a limiting case, each particle may be bonded to its neighbour. the resulting assembly can be regarded as a “solid” that has elastic properties and is capable of “fracturing” when the bonds break in a progressive manner. pfc2d contains extensive logic to facilitate the modelling of solids as close-packed assemblies of bonded particles; the solid may be homogeneous, or it may be divided into a number of discrete regions or blocks. the calculation method is a time-stepping, explicit scheme. modelling with pfc2d involves the execution of many thousands of time steps. at each step, newton's second law (force = mass × acceleration) is integrated twice for each particle to provide updated velocities and new positions, given a set of contact forces acting on the particle.

photo 8: first wave of bump (t = 0.146 s)
photo 9: second wave of bump (t = 0.178 s)

based on these new particle positions, contact forces are derived from the relative displacements for pairs of particles; a linear or non-linear force/displacement law at the contacts may be used. model parameters: in order to model rock bursting in a coal mine gallery we created a 2d box in the computer and filled the upper part with larger overburden particles (balls, 100 rows) and the lower part with smaller coal balls (50 rows). the gallery was simulated by a rectangular opening in the lower part of the box (see figs. 4–8). in the coal we set up joint planes at 45 degrees to the horizontal. the material properties were set to model known coal mine data (cf. table 1), with a time step of 0.00001 s. the loading was simulated by moving the top wall of the box downwards at a fixed speed at the beginning of the simulation (8000 steps). after the loading we typically performed about 200 000 integration steps (2 s) to let the dynamics evolve. the bursting started rather early (0.1 s).

we used the following main material properties to get the required behavior:
• friction – friction coefficient of ball or wall surface (not friction angle)
• kn – normal stiffness of ball or wall when in contact with other balls [force/displacement]
• ks – shear stiffness of ball or wall when in contact with other balls [force/displacement]
• density – density of ball material [mass/volume]
• s_bond – shear contact bond strength [force]
• n_bond – normal contact bond strength [force]

table 1: material properties of the coal, overburden and walls

material       friction   kn     ks     density   n_bond   s_bond
coal           0.7        1e10   1e10   1900      1e4      1e4
joint planes   0.1        –      –      –         1e2      1e2
overburden     0.7        1e10   1e10   2300      2e5      2e5
walls          1.0        1e12   1e12   –         –        –

fig. 4: mathematical model before the bump
fig. 5: the first stage of the bump
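to make the explicit time-stepping cycle concrete, the following minimal sketch (in python, not pfc2d's own scripting language) advances two circular particles through a few steps of a linear contact law; all values are illustrative and are not the model parameters of table 1:

```python
import numpy as np

# two circular particles on a line: position x, velocity v, radius r, mass m
x = np.array([0.0, 0.021])    # [m]; slight overlap (radii sum to 0.022 m)
v = np.array([1.0, 0.0])      # [m/s]
r = np.array([0.011, 0.011])  # [m]
m = np.array([1.0, 1.0])      # [kg]
kn = 1e6                      # normal contact stiffness [N/m]
dt = 1e-5                     # time step, as in the model above [s]

for step in range(10):
    # contact force from a linear force/displacement law
    overlap = (r[0] + r[1]) - (x[1] - x[0])
    f = kn * overlap if overlap > 0.0 else 0.0  # repulsive when overlapping
    forces = np.array([-f, +f])

    # newton's second law integrated twice (explicit scheme):
    # acceleration -> updated velocity -> new position
    a = forces / m
    v += a * dt
    x += v * dt

print(x, v)
```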
several simulations with varying numbers of coal particles and other parameters were performed, with no significant impact on the qualitative results of the simulations. figs. 4–8 show details of the mathematical model of the bump, and fig. 9 shows the typical stress distribution along the coal mine. the stress grows (i, ii) until the first bump initiation (iii). this occurs approximately in the 5th measurement cycle; then the stress decreases in this location, but it increases simultaneously in the 7th measurement cycle, when the second bump initiation occurs (iv). subsequent bump initiations can be expected in the 8th and 10th measurement cycles (v, vi).

fig. 7: next stages of the bump
fig. 6: next stages of the bump
fig. 8: late stage of the bump

4 conclusion

a combination of experimental and mathematical models appears very appropriate for a study of the stress distribution in a coal seam before and after rock burst initiation. both methods enable a time dependent study of the problem, and enable us to study the development of cracks during rock burst initiation, and then the extrusion of material into an open space during a burst. thus they offer a description of the problem that is very close to reality.

acknowledgement

this research and this paper have been supported by the grant agency of the czech republic (gačr), grant number 103/08/0922 “influence of shock and impact loading on structures”, and european commission project mcrtn-ct-2005-019481 “from flim to flin”.

references

[1] foss, m. m., westman, e. c.: seismic method for in-seam coal mine ground control problems. seg international exposition and 64th annual meeting, los angeles, 1994, p. 547–549.
[2] goodman, r. e.: introduction to rock mechanics. john wiley & sons, 1989, 562 p.
[3] toraño, j., rodríguez, r., cuesta, a.: using experimental measurements in elaboration and calibration of numerical models in geomechanics. computation methods and experimental measurements x, alicante, 2001, p. 457–476.
[4] vacek, j., procházka, p.: rock bumps occurrence during mining. computation methods and experimental measurements x, alicante, 2001, p. 437–446.
[5] vacek, j., bouška, p.: stress distribution in coal seam before and after bump initiation. geotechnika 2000, glivice–ustroň, 2000, p. 55–66.
[6] vacek, j., procházka, p.: behaviour of brittle rock in extreme depth. 25th conference on our world in concrete & structures, singapore, 2000, p. 653–660.
[7] williams, e. m., westman, e. c.: stability and stress evaluation in mines using in-seam seismic methods. 13th conference on ground control in mining, us bureau of mines, 1994, p. 290–297.
[8] šňupárek, r., zeman, v.: rockbursts in longwall gates during coal mining in ostrava-karviná basin. eurock 2005 – impact of human activity on the geological environment, balkema, 2005, p. 605–610.
[9] konečný, p., velička, v., šňupárek, r., takla, g., ptáček, j.: rockbursts in the period of mining activity reduction in ostrava-karviná coalfield. 10th congress of the isrm technology, johannesburg, 2003, p. 665–668.

prof. ing. jaroslav vacek, drsc., phone: +420 224 353 549, e-mail: vacek@klok.cvut.cz, czech technical university in prague, klokner institute, šolínova 7, 166 08 prague 6, czech republic
rndr. jaroslav vacek, ph.d., phone: +420 220 183 103, e-mail: vacek@eefus.colorado.edu
rndr. jana chocholoušová, ph.d., phone: +420 220 183 216, e-mail:
jana@eefus.colorado.edu, institute of organic chemistry and biochemistry, academy of sciences of the czech republic, flemingovo nám. 2, 166 10 prague 6, czech republic

fig. 9: stress distribution along the coal seam. i – 5000 cycles, ii – 7000 cycles, iii – 9000 cycles, iv – 12000 cycles, v – 15000 cycles, vi – 26000 cycles

advances in modern capacitive ecg systems for continuous cardiovascular monitoring

a. schommartz, b. eilebrecht, t. wartzek, m. walter, s. leonhardt

abstract

the technique of capacitive electrocardiography (cecg) is very promising in a flexible manner. already integrated into several everyday objects, the single lead cecg system has shown that easy-to-use measurements of electrocardiograms are possible without difficult preparation of the patients. multi-channel cecg systems enable the extraction of ecg signals even in the presence of coupled interferences, due to the additional redundant information. thus, this paper presents challenges for electronic hardware design to build on developments in recent years, going from the one-lead cecg system to multi-channel systems in order to provide robust measurements, e.g. even while driving an automobile.

keywords: capacitive electrodes, non-contact measurements, ecg, multi-channel sensor array, automotive application.

1 introduction

cardiovascular diseases have for years been the most common cause of death (german federal statistical office). today, however, it is essential to monitor cardiac patients not only in hospitals, but also in everyday life (e.g. while driving a car), because of the demographic change toward an aging population. in traffic situations, we do not risk only our own lives, and this makes monitoring even more important. the electrocardiogram (ecg) has been known for a long time as a fast examination tool which can provide important clues to the status of the cardiovascular system. it is often the tool of first choice in emergency situations, e.g. for detecting a heart attack. however, until now no extensive long-time monitoring system for people in high-risk groups could be developed, for technical and financial reasons. in recent years, a technique already known since 1967 through richardson has been in the main research focus: measuring potentials with isolated electrodes [1]. this capacitive measurement method is nowadays built into a range of everyday objects: office chairs [2,3], bathtubs [4], toilet seats [5], incubators [6], cars [7,8] and beds [9]. as an example, the so-called “aachen smartchair” from [10] integrates two capacitive electrodes in an off-the-shelf office chair (see figure 1) for easy-to-use measurements where no medical staff is needed and no difficult preparation has to be done. even a study on acceptability for medical staff and patients produced very positive results [11]. however, this technique has not yet been marketed, perhaps because most systems use a single lead, which is less robust than conventional conductive systems.

fig. 1: the “aachen smartchair”: a single lead cecg system in an office chair [10]

this paper presents an overview of the capacitive measurement method, the challenges for electronic hardware design, and the developments in recent years from single lead systems to multi-channel systems for robust and reliable measurements.
2 theory of capacitive ecg measurements

charge movements on the surface of the human body, caused by heart activity, influence the electric charge distribution on an electrode within a small distance from the body. this knowledge is used for capacitive measurements. conductive electrodes, often made of ag/agcl, need a contact gel as a conductive contact to the body surface to measure the potential between the two electrodes. this is mainly resistive behavior. with time, the contact gel dehydrates and reduces the quality of the signals. capacitive electrodes, isolated e.g. by a very thin lacquer coating with high surface resistance, need no conductive contact to the body. in addition, the potentials can be measured even through several layers of clothing (depending on material and thickness). each electrode forms a coupling capacitance $C$ with the patient's body, which is known to be

$C = \varepsilon_0 \varepsilon_r \frac{A}{d}$ (1)

where $A$ is the effective surface area of the electrode, $d$ is the thickness, $\varepsilon_r$ is the dielectric constant of the clothes, and $\varepsilon_0$ is the vacuum permittivity. with d = 0.3 mm (an off-the-shelf cotton shirt), εr = 1 and A = 40 mm × 80 mm, as in [10], the coupling capacitance is about 90 pf. for a high coupling capacitance, large electrode surfaces within small distances from the body are essential. one major disadvantage of this technique is the continuous changing of c due to the patient's movements, transpiration, varying thickness d and clothing materials, as well as static charges. in addition, it is a technical challenge to deal with high impedance biosignals in the range of a few mv. direct impedance conversion must be achieved in order to avoid disturbing voltages.

3 realization of capacitive ecg

the coupling capacitance forms the input of a high impedance input stage for impedance transformation (figure 2). typically, a high-impedance bias resistor $R_{\mathrm{bias}}$ (> 1 gω) is connected to ground to discharge the capacitance, but this results in a first-order high-pass behavior with the corresponding cut-off frequency

$f_c = \frac{1}{2\pi R_{\mathrm{bias}} C}$ (2)

thus, variations of the coupling capacitance can shift the cut-off frequency within 0.1–100 hz, into the spectrum of the ecg, and can thus deteriorate the ecg signal.
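a few lines of python reproduce the numbers behind equations (1) and (2): the quoted coupling capacitance of about 90 pf, and the way the bias resistor sets the high-pass corner. the 10 gω value below is an illustrative assumption; the text only bounds the bias resistor from below:

```python
import math

eps0 = 8.854e-12          # vacuum permittivity [F/m]
eps_r = 1.0               # dielectric constant of the shirt, as in [10]
A = 0.040 * 0.080         # electrode area, 40 mm x 80 mm [m^2]
d = 0.3e-3                # clothing thickness [m]

C = eps0 * eps_r * A / d              # equation (1)
print(f"C  = {C*1e12:.0f} pF")        # -> about 94 pF, i.e. ~90 pF

R_bias = 10e9             # illustrative bias resistor [ohm] (assumption)
fc = 1.0 / (2 * math.pi * R_bias * C)  # equation (2)
print(f"fc = {fc:.3f} Hz")            # -> about 0.17 Hz
```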
even the operational amplifier should have a very high input impedance for the same reason. further on, due to the small voltage amplitudes, low noise amplifiers should be used, and because of the shift of the operating point low bias currents are necessary. signals measured at two different positions are combined, forming the differential input of an instrumentation amplifier with a high cmrr (common mode rejection ratio) of about 120 db, in order to filter the coupled power line interferences that underlie both signals.

fig. 2: block diagram of the capacitive ecg system, modified from [10]
fig. 3: matlab simulation of the resulting gain in the spectrum of the ecg after filtering

due to asymmetries, e.g. variable lengths of wires or varying distances of electrodes, coupled interference may have a slightly different effect on each signal. therefore it cannot be eliminated in reality as well as in theory, and further filtering of the signal is indispensable. a low pass filter is used to cut off the high frequency components (typically > 100 hz), which do not provide information for the ecg, as well as higher harmonics of the 50 hz power line. with an additional notch filter at 50 hz this noise can be suppressed. a high pass filter with a low cut-off frequency of about 0.3 hz deletes the dc components and minimizes the baseline drifts. figure 3 shows the matlab simulation of the filter and gain cascade as a function of frequency, in the range of 0.1–1000 hz. a nearly constant gain of 61 db in the useful frequency range was achieved. the high pass filter with a cut-off frequency of 0.3 hz shows moderate absorption, whereas the high-pass behavior of the capacitive electrodes (due to uncertainties of the modelling, depending on the distance between electrode and body) is not considered in this simulation. the notch filter at 50 hz achieves a suppression of 35 db with regard to the desired signal; the remaining gain of 26 db referring to the input signal is low enough. higher suppression can be achieved by additional notch filter stages or by digital signal processing. to drive the further components of the ecg signal processing, i.e. the a/d converter, a significant amplification (about a factor of 1000) is usually implied at a good snr (signal to noise ratio). a first amplification can be realized by external circuitry of the instrumentation amplifier, increasing the snr without driving the subsequent components into saturation. the main amplification takes place at the end of the filter cascade, in the output or gain stage. figure 4 shows a realization of an ecg circuit with two simultaneous filter cascades.

fig. 4: realization of an ecg circuit for two simultaneous measurements of ecg

conventional ecg measurement systems make use of the principle known as “driven right leg” to eliminate interference in the signal even more [12,13]. this principle can also achieve interference reduction in capacitive ecg measurements; there it is often referred to as “driven ground plane”, as in [7], or “driven ground electrode” (dge), as in [14]. the negative feedback electrode connects the body to the output of an inverting operational amplifier, the potential of which is the sum of the active electrodes, with an amplification of about −1000. in this way, identical potential changes on the body surface are not transferred to the system output (compare [7]). dealing with these technical challenges alone will not ensure a robust cecg measurement system; challenges due to the patient also need to be overcome.
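the analog filter-and-gain cascade described above can also be prototyped digitally when working with recorded data; a sketch in python/scipy, in which the 2 khz sampling rate, the filter orders and the notch q-factor are illustrative assumptions (the actual system is the analog circuitry of figure 4):

```python
import numpy as np
from scipy import signal

fs = 2000   # sampling rate [Hz] (assumption)

# digital counterpart of the analog chain described above:
# 0.3 Hz high-pass, 100 Hz low-pass, 50 Hz notch, then gain
hp = signal.butter(1, 0.3, btype="highpass", fs=fs, output="sos")
lp = signal.butter(4, 100.0, btype="lowpass", fs=fs, output="sos")
b_notch, a_notch = signal.iirnotch(50.0, Q=30.0, fs=fs)

def cecg_filter(x, gain=10**(61/20)):   # 61 dB overall gain
    y = signal.sosfilt(hp, x)           # remove dc and baseline drift
    y = signal.sosfilt(lp, y)           # remove components above 100 Hz
    y = signal.lfilter(b_notch, a_notch, y)  # suppress 50 Hz mains
    return gain * y

# demo: 1 mV ecg-like spike train plus 50 Hz mains pickup and slow drift
t = np.arange(0, 5, 1/fs)
x = 1e-3*np.sin(2*np.pi*1.2*t)**63 + 5e-3*np.sin(2*np.pi*50*t) + 1e-2*t
print(cecg_filter(x)[:5])
```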
4 results

progress in digital circuit technology in recent years has enabled the implementation of complex algorithms in reasonable modules. improved types of electrodes integrated in everyday objects have been achieved, and there have been advances in signal processing.

fig. 5: left: experimental setup for multi-channel cecg; right: multi-channel cecg integrated into the driver's seat of a ford automobile

the latest cecg systems have been realized for multi-channel measurements. this redundancy may lead to robust measurements that ensure reliable medical statements on the basis of an ecg. as a first outcome, a multi-channel cecg system has been presented by [15], with 15 electrodes integrated in a tablet pc. however, the patient must additionally be connected to ground. to the best of the authors' knowledge, our group presented the first independent multi-channel cecg system with a free choice of the negative feedback dge to reduce the common-mode interference [14]: on a square aluminum plate, an array of nine round electrodes, positioned in a 3 × 3 matrix, was attached with flanged bearings enabling every electrode to be tilted (figure 5 left). in combination with the ability to have continuously adjustable positions of the electrodes on the plate, this allows proper adaptation to the silhouette of the patients. ancillary connectors with springs have been developed to achieve proper adaptation and surface pressure in the axial direction. this construction allows simultaneous measurements of up to eight cecgs, where the leads can be chosen manually, just like the dge. figure 6 shows a sequence of six leads according to einthoven (3) and goldberger (4), measured on the patient's back, which can be calculated as:

$I = U_8 - U_6 = \varphi_8 - \varphi_6, \quad II = U_8 = \varphi_8 - \varphi_1, \quad III = U_6 = \varphi_6 - \varphi_1,$ (3)

$aVR = U_8 - \frac{U_0 + U_6}{2}, \quad aVL = U_6 - \frac{U_0 + U_8}{2}, \quad aVF = U_0 - \frac{U_6 + U_8}{2},$ (4)

where $\varphi_i$ is the potential from electrode $i$, $\varphi_1$ is the reference potential, and the voltage $U_i = \varphi_i - \varphi_1$ (compare figure 5 left).
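equations (3) and (4) translate directly into code; a small sketch in python, where the electrode potentials are placeholder values, not measured data:

```python
def einthoven_goldberger(phi0, phi1, phi6, phi8):
    """compute the six leads of (3) and (4) from electrode potentials,
    with electrode 1 as the reference (u_i = phi_i - phi_1)."""
    u0, u6, u8 = phi0 - phi1, phi6 - phi1, phi8 - phi1
    return {
        "I":   u8 - u6,
        "II":  u8,
        "III": u6,
        "aVR": u8 - (u0 + u6) / 2,
        "aVL": u6 - (u0 + u8) / 2,
        "aVF": u0 - (u6 + u8) / 2,
    }

# placeholder potentials in mV (illustrative only)
print(einthoven_goldberger(phi0=0.12, phi1=0.00, phi6=0.35, phi8=0.80))
```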
the sequence in figure 6 shows that every channel has robust signals and that the r-peaks can be clearly identified and therefore used e.g. for qrs detection. the possibility of developing a multi-channel cecg system enables integration into the driver's seat (here a ford s-max) for cardiovascular monitoring even in traffic situations (figure 5 right). the positions of the electrodes in a 2 × 3 matrix in the back rest were chosen on the basis of pressure measurements with a flexible pressure sensor mattress on a group of ten males and females with different physiques. averaging the pressure distribution identified the best positions, and these were then verified by cecg measurements. a textile negative feedback electrode integrated into the seat panel is used for common mode reduction. measurements with the pressure sensor mattress, with corresponding ecg measurements, the best capacitive channel and a reference ecg as a gold standard, for two persons (see figure 7), show that proper pressure contact to the electrodes is of essential interest. the bmi, a criterion for the relation between the weight and height of human beings, of the male proband in the upper part is normal at 22.2 kg/m2, and in the lower part the bmi is low at 16.8 kg/m2. it can be seen that with decreasing surface pressure, the signal quality of the cecg decreased in comparison with the reference ecg below.

fig. 6: sequence of resulting leads according to einthoven and goldberger, as shown in (3), (4), transformed to the patient's back
fig. 7: pressure measurements of two subjects with different bmi: male proband with normal bmi in the upper part, below female proband with low bmi. on the right side the cecg and the reference ecg as gold standard are shown for these persons

in test runs at a ford car test site, the system was validated in static tests and while driving on different track surfaces. with 93 % of the 59 probands, robust cecgs were measured in static tests. on motorway tracks, in particular, the detection rate of 92.4 % is very robust [16]. city tracks, with more steering motion and road damage, lead to lower detection rates than for the reference ecg. importantly, irregularities in the ecg could be detected without restricting the driver's level of comfort.

5 conclusion

in recent years, capacitive ecg measurement systems developed by various groups have shown that this technique is very flexible due to integration into several everyday objects. however, the medical diagnostics and the robustness of the systems are limited because most systems are based on a single lead. multi-channel cecg systems present a promising approach for monitoring people in high-risk groups. even measurements while driving a car on a test track showed good performance. this is a major step towards monitoring and assisting drivers.

acknowledgement

parts of the research described in this paper have received funding from the european community's seventh framework programme under grant agreement no. fp7-216695 and the ford forschungszentrum aachen gmbh, aachen, germany.

references

[1] richardson, p.: the insulated electrode. proceedings of the 20th annual conference on engineering in medicine and biology, boston, 1967, vol. 157.
[2] lim, y., kim, k., park, s.: ecg measurement on a chair without conductive contact. ieee transactions on biomedical engineering, 2006, vol. 53, no. 5, p. 956–959.
[3] aleksandrowicz, a., walter, m., leonhardt, s.: ein kabelfreies, kapazitiv gekoppeltes ekg-messsystem / wireless ecg measurement system with capacitive coupling. biomedizinische technik, 2007, vol. 52, no. 2, p. 185–192.
[4] lim, y., kim, k., park, k.: the ecg measurement in the bathtub using the insulated electrodes. 26th annual international conference of the ieee embs, 2004, vol. 1.
[5] kim, k., lim, y., park, k.: the electrically noncontacting ecg measurement on the toilet seat using the capacitively-coupled insulated electrodes. 26th annual international conference of the ieee embs, 2004, vol. 1.
[6] kato, t., ueno, a., kataoka, s., hoshino, h., ishiyama, y.: an application of capacitive electrode for detecting electrocardiogram of neonates and infants. 28th annual international conference of the ieee embs, 2006, p. 916–919.
[7] leonhardt, s., aleksandrowicz, a.: non-contact ecg monitoring for automotive application. 5th international summer school and symposium on medical devices and biosensors, 2008, p. 183–185.
[8] chamadiya, b., heuer, s., hofmann, u., wagner, m.: towards a capacitively coupled electrocardiography system for car seat integration. proceedings ifmbe, 2008, vol. 22, p. 1217–1221.
[9] ishijima, m.: monitoring of electrocardiograms in bed without utilizing body surface electrodes. ieee transactions on biomedical engineering, 1993, vol. 40, no. 6, p. 593–594.
[10] aleksandrowicz, a., leonhardt, s.: wireless and non-contact ecg measurement system – the aachen smartchair. acta polytechnica, 2007, vol. 2, p. 68–71.
[11] czaplik, m., eilebrecht, b., ntouba, a., walter, m., schauerte, p., leonhardt, s., rossaint, r.: clinical proof of practicability for an ecg device without any conductive contact. biomedizinische technik, 2010, vol. 55, p. 291–300.
[12] neuman, m.: biopotential amplifiers. medical instrumentation: application and design, 1978, p. 292–296.
[13] winter, b., webster, j.: driven-right-leg circuit design. ieee transactions on biomedical engineering, 1983, p. 62–66.
[14] eilebrecht, b., schommartz, a., walter, m., wartzek, t., czaplik, m., leonhardt, s.: a capacitive ecg array with visual patient feedback. 32nd annual international conference of the ieee embs, 2010.
[15] oehler, m., ling, v., melhorn, k., schilling, m.: a portable ecg system with capacitive sensors. physiological measurement, 2008, vol. 29, p. 783–793.
[16] eilebrecht, b., wartzek, t., lem, j., vogt, r., leonhardt, s.: capacitive electrocardiogram measurement system in the driver seat. automobiltechnische zeitschrift atz, 2011, vol. 113, p. 50–55.

about the authors

antje schommartz was born in essen, germany, on march 31st, 1981.
she studied electrical engineering and information technology, specializing in medical engineering, at ruhr university bochum, germany, where she received her dipl.-ing. degree in 2009. she currently works as a ph.d. student at the philips chair of medical information technology, rwth aachen university, germany. her research interests are focused on capacitive ecg measurements and high frequency cardiac neuromodulation. she is a member of the vde, dgbmt and ieee.

benjamin eilebrecht was born in bochum, germany, on june 6th, 1982. he studied electrical engineering, with a specialization in medical engineering, at ruhr university bochum, germany, and received his dipl.-ing. degree in 2008. he is working as a ph.d. candidate at the philips chair of medical information technology, rwth aachen university, aachen, germany. his research interests include non-contact monitoring techniques, learning algorithms and modeling. he has been a member of the german electrical engineering association (vde) since december 2008.

tobias wartzek was born in krefeld in 1982. from 2003 to 2008, he studied electrical engineering focusing on information and communication technology, and received his dipl.-ing. degree from rwth aachen university, germany in 2008. he is currently pursuing a ph.d. degree at the chair of medical information technology, rwth aachen university, aachen, germany, where he is also working as a research assistant. his research interests are in the field of modeling physiological systems, new sensors for biomedical measurements, and automation and diagnosis support for the intensive care unit.

marian walter was born in saarbrücken, germany, on march 4th, 1966. he studied electrical engineering, with a specialization in control engineering, at the technical university of darmstadt and received his dipl.-ing. degree in 1995 and his dr.-ing. degree in 2002. he has worked in the medical engineering industry for three years and was appointed senior scientist and deputy head at the philips chair of medical information technology at rwth aachen university, aachen, germany, in 2004. his research interests include non-contact monitoring techniques, signal processing and feedback control in medicine.

steffen leonhardt was born in frankfurt, germany, on nov. 6th, 1961. he holds an m.s. in computer engineering from suny at buffalo, ny, usa, a dipl.-ing. in electrical engineering and a dr.-ing. degree in control engineering from the technical university of darmstadt, germany, and an m.d. in medicine from j. w. goethe university, frankfurt, germany. he has 5 years of r&d management experience in the medical engineering industry and was appointed full professor and head of the philips endowed chair of medical information technology at rwth aachen university, germany in 2003. his research interests include physiological measurement techniques, personal health care systems and feedback control systems in medicine.

antje schommartz, e-mail: schommartz@hia.rwth-aachen.de; benjamin eilebrecht; tobias wartzek; marian walter; steffen leonhardt; philips chair for medical information technology, rwth aachen university, pauwelsstr. 20, 52074 aachen, germany

czech contribution to athena

r. hudec, l. pína, v. maršíková, a. inneman, m. skulinová, m. míka
abstract

we describe the recent status of the czech contribution to the esa athena space mission, with emphasis on the development of new technologies and test samples of x-ray mirrors with precise surfaces, based on new materials, and their applications in space. in addition, alternative x-ray optical arrangements are investigated, such as kirkpatrick-baez systems.

1 introduction

design and development of x-ray optics have a long tradition in the czech republic (e.g. hudec et al., 1991, 1999, 2000, 2001; inneman et al., 1999, 2000). a range of related technologies have been exploited and investigated, including technologies for future large light-weight x-ray telescopes. future large space x-ray telescopes (such as athena, considered by esa [16]) require precise lightweight x-ray optics based on numerous thin reflecting shells. novel approaches and advanced technologies need to be developed and exploited. in this paper, we refer to czech efforts in connection with athena, focusing on the results of test x-ray mirror shells produced by glass thermal forming (gtf) and by shaping si wafers. both glass foils and si wafers are commercially available, have excellent surface microroughness of a few 0.1 nm, and low weight (the volume density is 2.5 g·cm−3 for glass and 2.3 g·cm−3 for si). technologies need to be exploited for shaping these substrates to achieve the required precise x-ray optic geometries without degrading the fine surface microroughness. although glass and, more recently, silicon wafers have been considered the most promising materials for future advanced large aperture x-ray telescopes, other alternative materials are also worth further study, such as amorphous metals and glassy carbon (marsch et al., 1997). in order to achieve sub-arcsec angular resolutions, the principles of active optics need to be adopted. the athena x-ray observatory is a new x-ray telescope of the european space agency (esa). this project supersedes ixo (international x-ray observatory) as well as nasa's constellation-x concept and also esa's xeus mission concept (white et al., 2009). the spacecraft configuration for the athena study is a mission featuring two large x-ray telescopes, each with an optical bench with a focal length of approx. 11.5 m and a suite of focal plane instruments. the athena mission concept was presented and discussed in detail at the 1st athena science meeting held at mpe in garching, germany, in june 2011 [16].

2 czech contribution to athena-related studies

at the moment, the czech contribution to athena concentrates on: (1) participating in defining the scientific goals, justification and project preparation, and (2) participating in the design and development of mirror technologies. the first author of this paper was delegated as a member of the athena telescope working group. in mirror development, we focus on supporting the esa estec micropore silicon technology design and also on designing and developing alternative background technologies and designs, as discussed in greater detail below. originally, these technologies were studied for the esa/nasa/jaxa ixo project, but we believe that at least some of them might be valuable for athena.

3 the glass foil option

glass science and technology has a long tradition in the czech republic. at the same time, glass technology is one of the most promising technologies for producing mirrors for athena and/or similar space telescopes, as the volume density of glass is more than 3 times less than the volume density of electroformed nickel layers.
glass foils can be used as flats, or may be shaped or thermally slumped to achieve the required geometry. thermal forming of glass is not a new technology. it has been used in various sectors of the glass industry and in glass art, as well as in the production of cherenkov mirrors. however, the application of this technology in x-ray optics is related to the need to improve accuracy significantly and minimize errors. as the first step, small (various sizes, typically less than 100 × 100 mm) glass samples of various types provided by various manufacturers were used and thermally shaped. the geometry was either flat or curved (cylindrical or parabolic). the project continued with larger samples (up to 300 × 300 mm) and further profiles. recent efforts have focused on optimizing the relevant parameters of both the glass material and the substrates, as well as the parameters of the slumping process. various approaches have been investigated. we note that these are not quite identical with the efforts of other teams (e.g. zhang et al., 2010; ghigo et al., 2010). the glass samples were thermally formed at rigaku, prague, and also at the institute of chemical technology in prague. for large samples (300 × 300 mm), facilities at the optical development workshop in turnov were used. the strategy is to develop a technology suitable for inexpensive mass production of thin x-ray optics shells, i.e., to avoid expensive mandrels and techniques that are not suitable for mass production, or that are too expensive. numerous glass samples have been shaped and tested in order to find the optimal parameters. the shapes and profiles of both the mandrels and the resulting glass replicas have been carefully measured using metrological devices. the results show that the quality of the thermal glass replica can be significantly improved by optimizing the material and improving the design of the mandrel, by modifying the thermal forming process, as well as by optimizing the temperature. after the modifications and improvements, some of them significant, the resulting deviation of the thermally formed glass foil from the ideal designed profile was less than 1 μm (peak-to-valley value) in the best case. this value is, however, strongly dependent on the exact temperature, so we believe that further improvements are still possible. the fine original microroughness (typically better than 1 nm) of the original float glass foil was found not to be degraded by the thermal forming process. we note that our approach to thermal glass forming is different from the approaches used by other authors. recent efforts have been devoted to optimizing the whole process, using and comparing different forming strategies etc., as the final goal is to further improve the forming accuracy to less than 0.1 μm. for the near future, we plan to continue these efforts together with investigations of computer-controlled forming of glass foils (according to the principles of active optics).

4 the silicon wafer option

silicon is a relatively light material, and already during the manufacturing process it is lapped and polished (either on one side or on both sides) to very fine smoothness (better than a few 0.1 nm) and thickness homogeneity (of the order of 1 μm). another obvious option, recently considered as one of the most promising for high-precision x-ray optics for athena, is the use of x-ray optics based on commercially available silicon wafers manufactured mainly for the purposes of the semiconductor industry.
the main advantages of the application of si wafers in space x-ray optics are (i) the volume density, which is more than 3 times lower than that of the electroformed nickel used in the past for galvanoplastic replication of multiply nested x-ray mirrors, and slightly less than the alternative approach of glass foils, (ii) very high thickness homogeneity, typically less than 1 μm over 100 mm, and (iii) very small surface microroughness either on one side or on both sides (typically of the order of a few 0.1 nm or even less). silicon wafers were expected to be used in the esa xeus and ixo projects, and are still under consideration for the athena project. the recent baseline optics for the athena x-ray telescope design is, like xeus and ixo, based on x-ray high precision pore optics (x-hpo), a technology currently under development with esa funding (rd-opt, rd-hpo), with a view to achieving large effective areas with low mass, reduced telescope length, high stiffness, and a monolithic structure, favoured for handling the thermal environment and for simplifying the alignment process (bavdaz et al., 2010). in addition, due to the higher packing density and the associated shorter mirrors required, conical approximation to the wolter-i geometry becomes possible. x-hpo optics is based on ribbed si wafers stacked together. the si wafers for achieving the conical approximation are formed by stacking a large number of plates together using a mandrel. the typical size of the si wafers is 10 × 10 cm. there are also alternative x-ray optics arrangements with the use of si wafers. in this paper, we refer to the development of an alternative design of innovative precise x-ray optics based on si wafers. our approach is based on two steps, namely (i) developing dedicated si wafers with properties optimized for use in space x-ray telescopes, and (ii) precisely shaping the wafers to the optical surfaces (figure 4). stacking to achieve nested arrays is performed after the wafers have been shaped. in this approach, multi foil optics (mfo) is thus created from shaped si wafers. for more details on mfo, see hudec et al. (2005). this alternative approach does not require the si wafers to have a ribbed surface, so problems with transferring any deviation, stress, and/or inaccuracy from one wafer to the neighbouring plates or even to the whole stacked assembly will be avoided. however, suitable technologies for precise stacking of optically formed wafers into a multiple array have to be developed. the si wafers available on the market are designed for use mainly in the semiconductor industry. it is obvious that the requirements of this industry
in order to achieve the very high accuracy required by esa for future large space x-ray telescopes like athena, the parameters of the si wafers need to be optimized (for application in x-ray optics) at the production stage. for this purpose we have established and developed a multidisciplinary working group including specialists from the development department of the si wafer industry with the goal to design and manufacture si wafers with improved parameters (mostly flatness) optimized for application in x-ray telescopes. it should be noted that the manufacture of silicon wafers is a complicated process with numerous technological steps and with many free parameters that can be modified and optimized to achieve optimal performance. this can also be useful for further improving the quality of x-hpo optics. as we are dealing with high-quality x-ray imaging, the smoothness of the reflecting surface is important. the standard microroughness of commercially available si wafers (we have used the products of on semiconductor, czech republic) is of the order of 0.1 nm, as confirmed by several independent measurements using various techniques including the atomic force microscope (afm). this is related to the method of chemical polishing used in themanufacture of si wafers. themicroroughness of si wafers exceeds the microroughness of glass foils andmost other alternativemirrormaterials and substrates. the flatness (in the sense of the deviation of the upper surface of a free-standing si wafer from a plane) of commercially available si wafers was however found not to be optimal for use in high-quality (order of arcsec angular resolutions) x-ray optics. most si wafers show deviations from the plane of the order of a few tens of microns. after modifying the technological process during si wafer manufacture, we were able to reduce this value to just a few microns. also, the thickness homogeneity was improved. in collaboration with the manufacturer, further steps are planned to improve the flatness (deviation from an ideal plane) and the thickness homogeneity of si wafers. these and planned improvements introduced at the si wafer manufacture stage can also be applied for other designs of si wafer optics including x-hpo, and can play a crucial role in the athena project. the x-ray optics design for athena is based on thewolter 1 arrangement, and hence requires curved surfaces. however, due to the material properties of monocrystalline si, si wafers (except very thin ones) are extremely difficult to shape. it is obvious thatwe have to overcomethis problem in order to achieve the fine accuracy and stability required for future large x-ray telescopes. the final goal is to provide optically shapedsiwaferswithno or little internal stress. threedifferent alternative technologies for shapingsi wafers have been designed and tested to achieve precise optical surfaces (figure 1). the samples shaped and tested were typically 100 to 150 mm large, typically 0.6 to 1.3 mm thick, and were bent to either cylindrical or parabolic test surfaces. fig. 1: example of precise silicon wafer shaping. after deposition of poly-si (thickness 1436 nm at temperature 615◦c) and for wafer thickness 507 μm, a warp of 110 mm (r = 25.6 m) was achieved. left: wafer deformation map. right: warp profile perpendicular to the facet 30 acta polytechnica vol. 51 no. 
the development described here is based on a scientific approach, and hence the large number of samples formed with different parameters must be precisely measured and investigated in detail. especially precise metrology and measurements play a crucial role in this type of experiment. the samples of bent wafers produced with the investigated technologies have been measured, including taylor-hobson mechanical and still optical profilometry, as well as optical interferometry (zygo) and afm (atomic force microscope) analyses. it has been confirmed that all three of these technologies do not degrade the intrinsic fine microroughness of the wafer. while the two physical/chemical technologies exploited give peak-to-valley (pv) deviations (of the real surface of the sample compared with the ideal optical surface) of less than 1 to 2 μm over the 150 mm sample length as preliminary values, the deviations of the first thermally bent sample are larger, of the order of 10 μm. taking into account that the applied temperatures, as well as other parameters, were not optimized for this first sample, we anticipate that the pv (peak-to-valley) value can be further reduced down to the order of 1 μm and perhaps even below. fine adjustments of the parameters can also further improve the accuracy of the results for the other two techniques.

5 the kirkpatrick-baez option

although wolter systems are generally well-known and widely used, hans wolter was not the first to propose x-ray imaging systems based on reflection of x-rays. in fact, the first grazing incidence system to form a real image was proposed by kirkpatrick and baez in 1948. this system consists of a set of two orthogonal parabolas of translation (figure 2). the first reflection focuses to a line, which is focused by the second surface to a point. this was necessary to avoid the extreme astigmatism suffered by a single mirror, but it still was not free of geometric aberrations. nevertheless, the system is attractive because it is easy to construct the reflecting surfaces. these surfaces can be produced as flat plates and then mechanically bent to the required curvature. in order to increase the aperture, a number of mirrors can be nested together, but it should be noted that this nesting introduces additional aberrations. this configuration is used mostly in experiments not requiring a large collecting area (solar, laboratory). the applications in x-ray astronomy and astrophysics were limited in the past, despite initial success on sounding rockets (e.g. gorenstein et al., 1978). nevertheless, large modules of kirkpatrick-baez (kb) mirrors based on float glass have also been suggested for stellar x-ray experiments (the lamar experiment in kb configuration designed for the shuttle, e.g. fabricant et al., 1988).

fig. 2: multi foil optics (mfo) in the kirkpatrick-baez (k-b) arrangement

superior silicon substrates have recently become available that make it possible to consider designing large kb modules with this novel material. as mentioned above, si wafers are difficult to shape, especially to small radii. to overcome this difficulty, another x-ray optics arrangement can be considered, namely the kirkpatrick-baez (kb) system. then the curvature radii are much larger, of the order of a few km, while the imaging performance is similar. for the same effective area, however, the focal length of the kb system is about twice as large as the focal length of the wolter system.
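the focal-length trade-off can be put into numbers with the small-angle plate-scale relation (focal-plane spot size = focal length × angular resolution); a back-of-the-envelope sketch, using the approx. 11.5 m athena focal length quoted in the introduction, its doubled kb counterpart as an illustration, and the 5 arcsec requirement discussed below:

```python
import math

def spot_size(fwhm_arcsec, focal_length_m):
    """focal-plane spot size for a given angular resolution (small angles)."""
    theta = fwhm_arcsec * math.pi / (180 * 3600)   # arcsec -> rad
    return theta * focal_length_m

# 5 arcsec at the ~11.5 m athena focal length quoted earlier
print(f"{spot_size(5, 11.5)*1e3:.2f} mm")   # ~0.28 mm
# a kb system with twice the focal length needs twice the detector scale
print(f"{spot_size(5, 23.0)*1e3:.2f} mm")   # ~0.56 mm
```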
nevertheless, kb systems represent a promising alternative to the classical wolter systems in future large space x-ray telescopes. a very important factor is the ease (and hence the reduced cost) of constructing highly segmented modules based on multiply nested thin reflecting substrates, in comparison with the wolter design. while e.g. the wolter design for athena requires the substrates to be precisely formed with curvatures as small as 0.25 m, the alternative kb arrangement uses almost flat or only slightly bent sheets. hence the feasibility of constructing a kb module with the required 5 arcsec fwhm at an affordable cost is higher than for the wolter arrangement. advanced kb telescopes are based on the multi foil optics (mfo) approach (x-ray grazing incidence imaging optics based on numerous thin reflecting substrates/foils). the distinction between mfo and other optics using packed or nested mirrors is that mfo is based on numerous very thin (typically less than 0.1 mm) substrates. the mfo kb test modules were recently designed and constructed at rigaku innovative technologies europe (rite) in prague, and 2 modules were tested in full aperture x-ray tests in the test facility of the university of colorado at boulder, with preliminary results of fwhm 26 arcsec for a full stack of 24 standard si plates at 5 kev, and even better for glass foils and 1d imaging (figures 3, 4).

fig. 3: left: test k-b modules assembled at rigaku rite in prague; right: the x-ray test facility for full aperture x-ray tests at the university of colorado in boulder
fig. 4: the measurement results of the k-b 1d test module with glass foils, full aperture tests at 5 kev at the university of colorado at boulder. the estimated fwhm of the 1d focus is 4 arcsec

in our opinion, the use of the kb design instead of wolter in athena might help to retain high performance (such as effective area) even in the event of a reduced budget for athena.

6 conclusion

suitable technologies for future large x-ray telescopes require extensive research work. two promising technologies suitable for future large-aperture and fine resolution x-ray telescopes, such as athena, were exploited and investigated in detail, namely glass thermal forming and si wafer bending. in both cases, promising results have been achieved, with peak-to-valley deviations of the final profiles from the ideal profiles being of the order of 1 μm in the best cases, with space for further essential improvements and optimization. in the czech republic, an interdisciplinary team with 10 members is cooperating closely with experienced specialists, including researchers from a large company producing si wafers. si wafers have been successfully bent to the desired geometry by three different techniques. in the best cases, the accuracy achieved for a 150 mm si wafer is 1–2 μm for the deviation from the ideal optical surface. experiments are continuing in an attempt to further improve the forming accuracy. as an alternative, we have investigated the kb option for athena, with quite promising results justifying further efforts in this direction. a major advantage of kb is its low cost, as there is no need for bending to small curvatures and/or for the use of (expensive) mandrels. less expensive optics can maintain a high effective area even on a reduced budget.
acknowledgement

we acknowledge the support provided by the grant agency of the academy of sciences of the czech republic, grant iaax01220701, and by the ministry of education and youth of the czech republic, projects me918, me09028 and me09004. the investigations related to the esa ixo (now athena) project were supported by esa pecs project no. 98038. we also acknowledge collaboration with drs. j. sik and m. lorenc from on semiconductor czech republic, and with the team of prof. webster cash from the university of colorado at boulder for the x-ray tests of kb modules in their x-ray facility.

references

[1] hudec, r., valnicek, b., cervencl, j., et al.: spie, 1343, 162, 1991.
[2] hudec, r., pina, l., inneman, a.: spie, 3766, 62, 1999.
[3] hudec, r., pina, l., inneman, a.: spie, 4012, 422, 2000.
[4] hudec, r., inneman, a., pina, l.: lobster eye: novel x-ray telescopes for the 21st century. in new century of x-ray astronomy, asp conf. proc., 251, 542, 2001.
[5] hudec, r., pina, l., inneman, a., et al.: spie, 5900, 276, 2005.
[6] inneman, a., hudec, r., pina, l., gorenstein, p.: spie, 3766, 72, 1999.
[7] inneman, a., hudec, r., pina, l.: spie, 4138, 94, 2000.
[8] kirkpatrick, p., baez, a. v.: j. opt. soc. am., 38, 766, 1948.
[9] marsch, h., et al.: introduction to carbon technologies. university of alicante, 1997.
[10] white, n. e., hornschemeier, a. e.: bulletin of the american astronomical society, vol. 41, 2009, p. 388.
[11] white, n. e., parmar, a., kunieda, h.: international x-ray observatory team, 2009, bulletin of the american astronomical society, vol. 41, p. 357.
[12] http://ixo.gsfc.nasa.gov
[13] bavdaz, m., et al.: proceedings of the spie, vol. 7732, 2010, p. 77321e–77321e-9.
[14] zhang, w. w., et al.: proceedings of the spie, vol. 7732, 2010, p. 77321g–77321g-8.
[15] ghigo, m., et al.: proceedings of the spie, vol. 7732, 2010, p. 77320c–77320c-12.
[16] http://www.mpe.mpg.de/athena/home.php?lang=en
[17] gorenstein, p., et al.: astrophys. j., 224, 718, 1978.
[18] fabricant, d. g., et al.: applied optics, 27, no. 8, 1456, 1988.

rené hudec, e-mail: rhudec@asu.cas.cz, astronomical institute, academy of sciences of the czech republic, 251 65 ondřejov, czech republic; faculty of electrical engineering, czech technical university in prague, technická 2, 166 27 prague, czech republic
ladislav pína, faculty of nuclear engineering, czech technical university in prague, břehová 78/7, 110 00 prague, czech republic
veronika maršíková, adolf inneman, rigaku innovative technologies europe, s.r.o., novodvorská 994, 142 21 prague 4, czech republic
michaela skulinová, astronomical institute, academy of sciences of the czech republic, 251 65 ondřejov, czech republic
martin míka, institute of chemical technology, technická 5, 166 28 prague, czech republic

1 introduction

the inherent variability of a rock mass is difficult to model, and for this reason engineers very often have to ask the question “what value should be used in analyses?” the answer to such a question requires a probabilistic approach to evaluating the uncertainty in the input parameters in geologic media. in recent times, specialists have encountered problems in the input parameters derived from uncertainty modelling based on the fuzzy set theory, monte carlo simulation, latin hypercube sampling, etc.

2 fuzzy methods

in the design of underground structures, it is very difficult to take into account the inherent variability of rock mass using the current rock mass classifications.
One of the main means of improving rock mass classification is by accounting for variations in the individual parameters using fuzzy mathematics [1]. The fuzzy set was first introduced in 1965 by Lotfi A. Zadeh [2] as a mathematical way to represent linguistic vagueness. In a classical set, an element either belongs to or does not belong to a set. The concepts and definitions of the fuzzy set theory are described in many publications. Dubois and Prade [3] provide the following definition: "Fuzzy set is a generalization of ordinary or classical set theory. It consists of mathematical tools developed to model and process incomplete and/or gradual information, ranging from interval-valued numerical data to symbolic and linguistic expressions."

Contrary to crisp (or ordinary) sets, fuzzy sets have no sharp or precise boundaries. In crisp sets, element x belongs to or does not belong to a set A, and the membership function (or degree of membership) μ_A(x) is unique. Fuzzy sets assign the membership function μ_A(x) of each element x a value over the interval (0 to 1). This type of membership function is characterised by a smooth transition from "belonging to a set" (1) to "not belonging to a set" (0), and gives fuzzy sets flexibility in modelling based on linguistic expressions of engineering practice (such as "fairly rough surface"). Membership functions can also be represented by analytical methods.

Two types of variables are used in the fuzzy model: fuzzy variables and fuzzy numbers. Fuzzy variables are defined directly as fuzzy sets based on a group of reference linguistic terms. Generally, a fuzzy variable x_j can acquire any value between the reference terms; for the linguistic terms the fuzzy variables are fuzzy singletons. Fuzzy variables need not be associated with any numerical universe. This can be because their values are qualitative in nature, or because they are treated as qualitative for the sake of convenience. Fuzzy numbers are defined over a continuous domain by triangular or trapezoidal membership functions (or numbers). A trapezoidal number is defined by the height and four distinct elements of the fuzzy set interval.

Fuzzy logic provides a number of functions for performing fuzzy arithmetic. The fuzzy arithmetic functions are a little different from the rest of fuzzy logic's functions in that they operate on lists of fuzzy numbers. The application of fuzzy arithmetic in rock mass classification is direct and generates a fuzzy number representing the classification value. Fuzzy mathematics thus introduces the uncertainty of the evaluated parameters into the rock mass classification. Let us take, for example, the index Q rock mass classification. This classification was established in 1974 in Norway (by Barton, Lien, Lunde and Loset) and is based on six parameters [4]:

Q = (RQD / J_n) · (J_r / J_a) · (J_w / SRF),   (1)

where RQD = rock quality designation, J_n = joint set number, J_r = joint roughness number, J_a = joint alteration number, J_w = joint water reduction number, SRF = stress reduction factor.

By applying fuzzy logic to the equation for the index Q (equation 1), we obtain results in the form of a fuzzy classification value with non-linear distributions. The convex nature of the flanks has the effect of increasing the possibility that the conditions will be worse than a single computed index Q.
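To make the fuzzy arithmetic concrete, the following minimal sketch (in Python, with illustrative parameter values and helper names of our own, not taken from the paper) evaluates equation (1) on triangular fuzzy numbers via alpha-cut interval arithmetic, assuming all parameters are positive:

```python
def alpha_cut(tri, a):
    """Interval of the triangular number tri = (left, peak, right) at level a."""
    lo, peak, hi = tri
    return (lo + a * (peak - lo), hi - a * (hi - peak))

def q_index(rqd, jn, jr, ja, jw, srf, a):
    """Alpha-cut interval of Q = (RQD/Jn) * (Jr/Ja) * (Jw/SRF), Eq. (1)."""
    num = [alpha_cut(t, a) for t in (rqd, jr, jw)]
    den = [alpha_cut(t, a) for t in (jn, ja, srf)]
    lo = num[0][0] * num[1][0] * num[2][0] / (den[0][1] * den[1][1] * den[2][1])
    hi = num[0][1] * num[1][1] * num[2][1] / (den[0][0] * den[1][0] * den[2][0])
    return lo, hi

# illustrative "fairly good" rock; each parameter is (left, peak, right)
params = dict(rqd=(60, 70, 80), jn=(6, 9, 12), jr=(1.5, 2, 3),
              ja=(1, 2, 3), jw=(0.66, 1, 1), srf=(1, 2.5, 5))
for a in (0.0, 0.5, 1.0):
    print(a, q_index(a=a, **params))
```

At alpha = 1 the intervals collapse to the peak values and the crisp index Q is recovered; at lower alpha levels the result is an interval, reflecting the spread of the fuzzy classification value.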
3 Monte Carlo simulation

Monte Carlo simulation is a well-known tool that is used to analyze random phenomena. In the Monte Carlo simulation, a random problem is transformed into several deterministic problems that are much easier to solve: sample inputs are used to generate sample outputs carrying statistical or probabilistic information about the random output quantity. Monte Carlo simulation is simple to use and has therefore found much favour in geomechanics, particularly in the stability analysis of rock slopes [5].

The simplest sampling scheme of a Monte Carlo simulation approach is to use a pseudorandom number generator to select random numbers between 0 and 1 and use them to generate values for each variable which is an input to the calculation. However, this simple (and best-known) random sampling scheme requires many samples for good accuracy and repeatability: in practice, generating a probability distribution of the safety factor of a rock slope requires a minimum of 200 up to 2000 selections (depending on the desired accuracy).

The simulation output (a random variable which depends upon random input variables, fields and processes) may be presented in several ways. One way is to define the probability that a safety factor F is less than a prescribed value F_0:

P(F < F_0) = n / N,   (2)

where n = number of trials in which F < F_0, and N = number of selections. Another approach [6] is to plot a cumulative distribution of F, which can be used to determine the probability of F being less than a given value.
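A minimal sketch of equation (2) (assuming NumPy; the limit-state function below is a toy stand-in, not a real slope-stability formula):

```python
import numpy as np

rng = np.random.default_rng(0)

def safety_factor(c, phi_deg):
    """Toy safety-factor model, for illustration only."""
    return c / 15.0 + np.tan(np.radians(phi_deg))

N = 2000                               # selections, cf. the 200-2000 range above
c = rng.normal(20.0, 4.0, N)           # cohesion [kPa]
phi = rng.normal(30.0, 3.0, N)         # friction angle [deg]
F = safety_factor(c, phi)

F0 = 1.5
n = np.count_nonzero(F < F0)           # trials with F < F0
print(f"P(F < {F0}) ~ n/N = {n}/{N} = {n / N:.3f}")
```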
4 Latin hypercube sampling (LHS)

Other sampling methods have been developed to reduce the number of samples required for good accuracy in the Monte Carlo simulation. One of the best is Latin hypercube sampling [7]. Latin hypercube sampling preserves the marginal probability distributions of each simulated variable (Fig. 1). To fulfil this aim, Latin hypercube sampling constructs a highly dependent joint probability density function for the random variables in the problem, which allows good accuracy in the response parameter even when using only a small number of samples.

Fig. 1: Probability density function – Gaussian distribution

Fig. 2: Dividing the distribution function into intervals

In Latin hypercube sampling, the bounded interval (0, 1) is uniformly divided into N non-overlapping intervals (Fig. 2), each with the same probability. One value per interval is generated afterwards. This can be performed by initially generating N random numbers within the range (0, 1). These values are linearly transformed to random numbers in the non-overlapping intervals for each random variable:

u_i = (u + i − 1) / N,   (3)

where i = 1, 2, ..., N (N = number of values), u = a random number (u ∈ (0, 1)), u_i = the random number in the i-th interval, and N = number of the non-overlapping intervals. From the above equation it follows that we generate exactly one value for each random variable in each of the N intervals, this value being randomly selected within its interval:

(i − 1) / N ≤ u_i ≤ i / N.   (4)

The N values obtained for the first random variable x_1 are paired in a random manner with the N values of the second random variable. These N pairs (x_1, x_2) are combined in a random manner with the N values of the third random variable. This combination results in N triads (x_1, x_2, x_3), which are combined with N values of the 4th random variable, and so on until all k random variables are combined (k being the number of random variables). Thus we can assemble an N × k matrix. It should be noted that statistical correlations between the columns of the matrix may have a significant influence on the results of the simulation [8].
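A minimal sketch of this sampling-and-pairing scheme (assuming NumPy; the function name is ours), following equations (3) and (4):

```python
import numpy as np

def latin_hypercube(N, k, rng):
    """Return an N x k sample matrix on (0, 1): column j holds a random
    permutation of one stratified sample per interval, u_i = (u + i - 1)/N,
    so that (i - 1)/N <= u_i <= i/N for i = 1..N."""
    u = rng.random((N, k))                       # one uniform draw per interval
    strata = (np.arange(N)[:, None] + u) / N     # row i lies in (i/N, (i+1)/N)
    return rng.permuted(strata, axis=0)          # random pairing between columns

rng = np.random.default_rng(1)
X = latin_hypercube(5, 3, rng)                   # e.g. 5 runs of 3 parameters
print(X)   # each column has exactly one value in each fifth of (0, 1)
```

Each column of the returned matrix contains exactly one value in each of the N equiprobable intervals, and the independent permutation of the columns implements the random pairing described above.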
5 Construction of tunnels in Prague

The north-western sector of the city circular road in Prague contains three major road tunnels of large cross-section area. While the Strahov and Mrázovka tunnels are already open for traffic, the Špejchar – Pelc Tyrolka tunnel is still in the planning phase. The very difficult geological conditions in the Prague Ordovician make it necessary to use NATM methods. Stabilization of the deformations was achieved only after closure of the whole primary lining, but the following interim supporting technical measures were performed: anchoring, widening of the top heading legs, micropile support under the top heading legs, reinforcing grouting performed in advance from the exploratory gallery, and closing the top heading by a temporary invert. Due to increased deformations occurring at the excavation with a horizontally divided face, a sequence with a vertical pattern was used in the further course of excavating the Mrázovka tunnel.

6 Modelling of the Mrázovka tunnel

The outputs of the Mrázovka tunnel modelling were verified by the Latin hypercube method [9]. The numerical analysis was carried out by means of the PLAXIS program system. The 3D behaviour of the excavation face area, and the correct description of the influence on the deformations and the state of the massif, were simulated by the common procedure of loading the excavation and lining using the so-called β-method (Fig. 3). The modelled area of the profile was approximately 200 m wide and 110 m high, and was divided into eight basic sub-areas according to the types of rock encountered (Fig. 4).

Fig. 3: Numerical modelling stages

Fig. 4: Geometry of generated mesh in WTT km 5.160

The rock mass behaviour was approximated by means of the Mohr-Coulomb model. The input geotechnical parameters of the rock mass (E_def, ν, γ, c, φ) were determined on the basis of the engineering-geological investigation results – Tables 1, 2 and 3 [10].

Table 1: Input φ values for various geotechnical layers based on random intervals

              a     b     in=1   in=2   in=3   in=4   in=5
value q_ij'   -1    1     -0.64  -0.26  0.00   0.26   0.64
ground 1      25    27    25.4   25.6   26.0   26.4   26.6
ground 2      27    29    27.4   27.6   28.0   28.4   28.6
ground 3      22    25    22.5   22.9   23.5   24.1   24.5
ground 4      25    30    25.9   26.5   27.5   28.6   29.1
ground 5      30    35    30.9   31.5   32.5   33.6   34.1
ground 6      35    38    35.5   35.9   36.5   37.1   37.5
ground 7      35    38    35.5   35.9   36.5   37.1   37.5
ground 8      25    28    25.5   25.9   26.5   27.1   27.5

Table 2: Random permutations of the input parameters q_i

calculation   q1 (E_def)   q2 (ν)   q3 (γ)   q4 (c)   q5 (φ)
1             4            2        2        5        3
2             1            3        5        2        1
3             3            5        3        4        5
4             5            1        4        3        4
5             2            4        1        1        2

Table 3: Input parameters for the first run

parameter q_ij       E_def [MPa]   ν [-]   γ [kN/m3]   c [kPa]   φ [°]
interval number in   4             2       2           5         3
ground 1             4.8           0.30    17.3        11.3      26.0
ground 2             13.6          0.30    18.8        5.3       28.0
ground 3             24.2          0.35    20.2        14.1      23.5
ground 4             79.7          0.31    22.6        22.3      27.5
ground 5             171.0         0.29    24.3        45.5      32.5
ground 6             271.0         0.26    25.1        91.0      36.5
ground 7             421.0         0.20    25.8        91.0      36.5
ground 8             24.5          0.33    23.1        13.2      26.5

A comparison of the theoretically determined deformations with the values obtained by monitoring [11] was used to verify the applicability of the mathematical model. The results of the statistical study of the West Tunnel Tube (WTT) in profile km 5.160 are shown in Table 4. They show that the probability of final settlements being between 71 mm and 213 mm is 95 %. The range of settlement without including deformations caused by the excavation of the pilot adit is 65 mm to 198 mm. The predicted range was consistent with the measured value of 194 mm (the value does not include the effect of the pilot adit).

Table 4: Results of the Mrázovka tunnel calculations

                         total settlement [mm]   pilot adit settlement [mm]   settlement without the pilot adit effect [mm]
calculation              surface    tunnel       surface    adit              surface    tunnel
1                        79         107          15         8                 64         99
2                        153        197          29         14                124        183
3                        92         125          18         10                74         115
4                        85         111          15         8                 70         103
5                        134        171          25         12                109        159
x̄ (average)              108.60     142.20       20.40      10.40             88.20      131.80
s (standard deviation)   29.41      35.61        5.64       2.33              23.80      33.31
x̄ + 2s (p = 95.45 %)     167.42     213.42       31.69      15.06             135.81     198.43
x̄ − 2s (p = 95.45 %)     49.78      70.98        9.11       5.74              40.59      65.17
x̄ + s (p = 68.27 %)      138.01     177.81       26.04      12.73             112.00     165.11
x̄ − s (p = 68.27 %)      79.19      106.59       14.76      8.07              64.40      98.49
var x (variance)         5.69       6.22         2.65       1.75              5.15       6.02
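The confidence bands in Table 4 can be reproduced directly from the five calculation results; a minimal sketch (assuming NumPy), shown here for the surface column of the total settlement. The paper's s matches the population standard deviation (ddof = 0):

```python
import numpy as np

surface = np.array([79, 153, 92, 85, 134])     # total surface settlement [mm]
x, s = surface.mean(), surface.std(ddof=0)
print(f"mean = {x:.2f}, s = {s:.2f}")                    # 108.60, 29.41
print(f"p = 95.45 %: {x - 2*s:.2f} .. {x + 2*s:.2f}")    # 49.78 .. 167.42
print(f"p = 68.27 %: {x - s:.2f} .. {x + s:.2f}")        # 79.19 .. 138.01
```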
7 Modelling of the Špejchar – Pelc-Tyrolka tunnel

The Špejchar – Pelc-Tyrolka project is 4,320 m long, with the length of the tunnelled section being 3,438 m [12]. As well as the tunnel, the project includes underground garages in Letná, four underground technology centres and the Trója bridge. As designed, the excavation will be carried out using the New Austrian Tunnelling Method (NATM). Due to the predicted conditions, a vertical excavation sequence will be adopted in the three-lane tunnel excavation. In the two-lane tunnels, a horizontal excavation sequence is expected. The position of the exploratory drift has been selected to coincide with the complicated sections of the tunnel, to provide information for predicting whether the vertical sequence should be applied to the top heading only, or to the entire cross section. Mechanical rock breaking is expected, but in combination with drill-and-blast when passing through the quartzites. A transition zone between the quartzites exists at the foot of the slope falling from the Letná plain. The gradient parameters mean that the tunnels run in the vicinity of fully saturated Quaternary sediments. A multicriteria analysis resulted in the selection of pre-excavation grouting.

The outputs from the exploratory drift near the transition zone (km 5.900 STT) were verified by the Latin hypercube sampling method [13]. The 2D and 3D numerical analyses were carried out by means of the CESAR-LCPC program system. The rock mass behaviour was approximated by means of the Mohr-Coulomb model. The intervals of the input parameters of the rock mass were determined on the basis of the engineering-geological investigation results. The results of the statistical study of the South Tunnel Tube (STT) at profile km 5.900 showed that the final settlements of the tunnel lining will be between 13 mm and 347 mm, with a probability of 95 %. The range of predicted surface settlements above the excavated tunnels is from 7 mm to 26 mm.

8 Conclusion

The results of the analyses show the effect of the variation of the input parameters describing rock mass behaviour on the results of FEM, and demonstrate that the differences arising from this influence can be significant. The differences in the results flow from variations in the geological conditions. The importance, for determination of the final structure behaviour, of at least a basic study of the variation of the input parameters can be clearly seen. The Latin hypercube sampling method is an advantageous procedure for the qualified statistical evaluation of FE calculations, making significant time savings possible compared with common statistical methods (Monte Carlo method, estimation of probability moments, etc.).

9 Acknowledgment

The authors would like to thank the Ministry of Education of the Czech Republic for research project VZ 03 CEZ MSM 6840770003 "Developments of the algorithms of computational simulations and their application in engineering" for assistance with the research.

References

[1] Hudson, J. A.: Rock Engineering Systems: Theory and Practice. Chichester: Ellis Horwood, 1998.
[2] Zadeh, L. A.: Fuzzy sets, Inform. Control, Vol. 8 (1965), p. 338–353.
[3] Dubois, D., Prade, H.: Fundamentals of Fuzzy Sets. Norwell, MA: Kluwer, 2000.
[4] Bieniawski, Z. T.: Engineering Rock Mass Classification. New York: John Wiley and Sons, 1989, ISBN 0-47160172-1.
[5] McMahon, B. K.: A statistical method for the design of rock slopes. Proceedings 1st Australia – New Zealand Conference on Geomechanics, Vol. 1, Melbourne, Australia, 1971, p. 314–321.
[6] Mahtab, M. A., Grasso, O.: Geomechanics Principles in the Design of Tunnels and Caverns in Rock. Amsterdam: Elsevier, 1992, ISBN 0-4440883088.
[7] Bažant, Z. P., Křístek, V.: The effect of shear lag on creep deformation. Journal of Structural Engineering, Vol. 113 (1987), No. 3, p. 557–569.
[8] et al.: Cesta k pravděpodobnostnímu posudku bezpečnosti, provozuschopnosti a trvanlivosti konstrukcí. In: Spolehlivost konstrukcí 2003 (in Czech), Ostrava: Dům techniky, 2003, ISBN 80-02-01410-3.
[9] Barták, J., Hilar, M., Pruška, J.: Statistical analysis of input parameters influence to the tunnel deformations modelling. In: Fourth International Conference on Computational Stochastic Mechanics [CD-ROM]. Houston: Rice University, 2002, p. 280–292.
[10] Hudek, J.: The Mrazovka tunnels – measurement and monitoring at construction of the WTT – presiometric checking on the success of saving grouting – profiles 003 to 009. Praha: PÚDIS, 1999.
[11] Kolečkář, M., Zemánek, I.: Monitoring of the Mrazovka tunnel. In: Volume of papers of the Int. Conf. Underground Construction 2000, Praha: ITA/AITES, 2000, p. 427–433.
[12] Butovič, A.: Exploratory drift for the Blanka twin tube. Tunel, Vol. 1 (2004), CTUK Praha, 2004, p. 13–18.
[13] Louženský, T.: Mathematical modeling of primary lining influence on rock mass stress. Prague: CTU in Prague, FCE, diploma thesis, 2006 (in Czech).

Doc. Ing. Matouš Hilar, M.Sc., Ph.D., phone/fax: +420 241 443 411, D2 Consult Prague s.r.o., Zelený pruh 95/97, 147 00 Praha 4, Czech Republic
Doc. Dr. Ing. Jan Pruška, phone: +420 224 344 547, fax: +420 224 344 556, e-mail: pruska@fsv.cvut.cz, Department of Geotechnics, Czech Technical University in Prague, Faculty of Civil Engineering, Thákurova 7, 166 29 Praha 6, Czech Republic

Polynomial Solutions of the Heun Equation

B. Shapiro, M. Tater

Abstract
We review properties of certain types of polynomial solutions of the Heun equation. Two aspects are particularly concerned: the interlacing property of spectral and Stieltjes polynomials in the case of real roots of these polynomials, and the asymptotic root distribution when complex roots are present.

Keywords: Heun equation, Van Vleck and Stieltjes polynomials, asymptotic root distribution, logarithmic potential.

1 Introduction

We study polynomial solutions of the Heun equation

{ Q(z) d²/dz² + P(z) d/dz + V(z) } S(z) = 0,   (1)

where Q, P, and V are given polynomials: Q is a polynomial of degree k, P is at most of degree k − 1, and V is at most of degree k − 2. E. Heine and T. Stieltjes posed the following problem:

Problem. Given a pair of polynomials {Q, P} and a positive integer n, find all polynomials V such that (1) has a polynomial solution S of degree n.

Polynomials V are referred to as Van Vleck polynomials and polynomials S as Stieltjes polynomials. For a generic pair {Q, P} there exist (n+k−2 choose n) distinct Van Vleck polynomials. The simplest case is k = 2, when equation (1) is an equation of hypergeometric type: Q is quadratic, P is at most linear, and V reduces to a (spectral) parameter. This situation was thoroughly studied in the past, and all polynomial solutions reduce to six types of either finite or infinite systems of orthogonal polynomials, see e.g. [4]. The asymptotic distribution of zeros of orthogonal polynomials has been studied for quite a long time and many important results are known [13].

2 The k = 3 case

The next natural step is k = 3. Even this problem has a long history, going back to G. Lamé. Already Heine and Stieltjes knew that for a fixed n the above mentioned problem has n + 1 solutions, i.e. that there exist n + 1 distinct Van Vleck polynomials.
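As a concrete illustration, the following minimal sketch (assuming SymPy; the helper name and the brute-force approach are ours, and practical only for small n) sets up the k = 3 problem with the Lamé choice P = Q′/2 as a polynomial system in the unknown coefficients of a monic S and a linear V:

```python
import sympy as sp

z, v0, v1 = sp.symbols('z v0 v1')

def heine_stieltjes(Q, n):
    """All pairs (V, S) with Q S'' + (Q'/2) S' + V S = 0, S monic of degree n."""
    P = sp.diff(Q, z) / 2                          # the Lamé choice P = Q'/2
    s = sp.symbols(f's0:{n}')                      # lower coefficients of monic S
    S = z**n + sum(s[i] * z**i for i in range(n))
    expr = sp.expand(Q * sp.diff(S, z, 2) + P * sp.diff(S, z) + (v1 * z + v0) * S)
    eqs = sp.Poly(expr, z).all_coeffs()            # n + 2 equations, n + 2 unknowns
    sols = sp.solve(eqs, (v0, v1) + s, dict=True)  # expect n + 1 solutions
    return [(sol[v1] * z + sol[v0], S.subs(sol)) for sol in sols]

Q = (z + 2) * (z - 1) * (z - 4)                    # the cubic used in Figure 1
for V, S in heine_stieltjes(Q, 2):
    print('V =', sp.expand(V), '   S =', sp.expand(S))
```

For n = 2 this prints the n + 1 = 3 pairs (V, S), in line with the count stated above.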
Moreover, in the case of the Lamé equation (P = Q′/2), and if we additionally assume that Q has three real and distinct roots a_1 < a_2 < a_3, then each root of each V and each S is real and simple, the roots of V and S lie between a_1 and a_3, none of the roots of S coincides with any a_i (i = 1, 2, 3), and the n + 1 polynomials S can be distinguished by the number of roots lying in the interval (a_1, a_2) (the remaining roots lie in (a_2, a_3)) [14]. Besides this, there is no zero of S between a_2 and the zero of the corresponding Van Vleck polynomial [1], cf. Figure 1.

Some additional results are known for fixed n. Each (linear) Van Vleck polynomial has a single zero ν_i, i = 1, ..., n + 1. We can form a so-called spectral polynomial made of these zeros:

Sp_n(λ) = ∏_{i=1}^{n+1} (λ − ν_i).

The zeros of two successive spectral polynomials Sp_n and Sp_{n+1} interlace: between any two roots of Sp_n lies a root of Sp_{n+1}, and vice versa [2]. On the other hand, in spite of the fact that these polynomials have simple zeros that interlace, the system {Sp_n}, n ≥ 1, is not orthogonal with respect to any measure. The proof in [2] is based on the finding that the asymptotic zero distribution of Sp_n [3] is different from that of orthogonal polynomials, showing also that the Sp_n do not obey any three-term recurrence relation.

As already mentioned above, the roots ν_i of the Van Vleck polynomials lie between a_1 and a_3 and are mutually different, making it possible to order the Stieltjes polynomials accordingly. So, for a fixed n, we have a sequence of n + 1 Stieltjes polynomials S_i^(n) of degree n, i = 1, ..., n + 1. Two interesting results are proved in [1]: the n zeros of S_i^(n) and the n zeros of S_{i+1}^(n) interlace, and in addition the smallest zero of S_{i+1}^(n) is smaller than the smallest zero of S_i^(n). Besides this, the zeros of S_i^(n) and S_j^(n+1) interlace if and only if i = j or i = j + 1; otherwise they do not interlace. There is no definitive answer to the question of orthogonality of the S_i^(n).

If complex roots of Q are admitted, G. Pólya proved [9] that all roots of both V and S belong to the convex hull Conv_Q of a_1, a_2, a_3, provided that all residues of P/Q are positive.

Investigations of the root asymptotics of both Van Vleck and Stieltjes polynomials have a considerably shorter history. We summarize here some salient results [10–12].

Fig. 1: The situation for P = 0 and n = 25. The thick black dots mark the roots of Q(x) = (x + 2)(x − 1)(x − 4), the thick green dots mark the roots of the n + 1 Van Vleck polynomials, and the small red dots mark the n roots of the corresponding Stieltjes polynomials

Fig. 2: The left part: the roots of the spectral polynomial Sp_51(λ) for Q(z) = (z + 1)(z − 2)(z − 2 − 4i) and P(z) = (z + 2 + 2i)(z − 1 + 3i). The thick black dots mark the roots of Q, the green dots mark the roots of Van Vleck polynomials. The right part: the thick green dot marks one of the 51 Van Vleck polynomials and the small red dots mark the 50 roots of the corresponding Stieltjes polynomial

The roots can be asymptotically localized: for any ε > 0 there exists n_ε such that for any n ≥ n_ε any root of any V, as well as any root of the corresponding S, lies in the ε-neighbourhood (in the usual Euclidean distance on C) of the convex hull of a_1, a_2, a_3. This result shows that the asymptotic behaviour of the roots is determined by Q, i.e. it is not influenced by P for sufficiently large n.
For a more detailed description of the asymptotic distribution we associate to each polynomial p_n a finite real measure

μ_n = (1/n) ∑_{j=1}^{n} δ(z − z_j),

where δ(z − z_j) is the Dirac measure supported at the root z_j. This probability measure is referred to as the root-counting measure of the polynomial p_n. Now two questions are to be answered: does the sequence {μ_n} converge (in the weak sense) to a limiting measure μ, and if so, what does μ look like?

We may ask these questions when p_n = Sp_n. The first question is answered positively [11, 12]: the sequence {μ_n} of the root-counting measures of the spectral polynomials converges to a probability measure μ supported on the union of three curves located inside Conv_Q and connecting the three roots of Q with a certain interior point, cf. Figure 2. Moreover, μ depends only on Q. The support of μ is a union of three curve segments Γ_i, i ∈ {1, 2, 3}. They may be described as the set of all b ∈ Conv_Q satisfying

∫_{a_j}^{a_k} sqrt( (b − t) / ((t − a_1)(t − a_2)(t − a_3)) ) dt ∈ R,

where j and k are the remaining two indices in {1, 2, 3} in any order, and the integration is taken over the straight interval connecting a_j and a_k. We can see that a_i belongs to Γ_i and that these three curves connect the corresponding a_i with a common point within Conv_Q. Take the segment of Γ_i connecting a_i with the common intersection point of all the Γ's, and denote the union of these three segments by Γ_Q. Then the support of the limiting root-counting measure μ coincides with Γ_Q.

Knowing the support of μ, it is also possible to define its density along the support using the linear differential equation satisfied by its Cauchy transform [11]:

Q(z) C″_ν(z) + Q′(z) C′_ν(z) + (Q″(z)/8) C_ν(z) + Q‴(z)/24 = 0.

In the case when Q(z) has all real zeros, the density is explicitly given in [3]. The Cauchy transform C_ν(z) and the logarithmic potential Pot_ν(z) of a (complex-valued) measure ν supported in C are given by:

C_ν(z) = ∫_C dν(ξ) / (z − ξ)   and   Pot_ν(z) = ∫_C log |z − ξ| dν(ξ).

C_ν(z) is analytic outside the support of ν [5]. In [11] we were able to find an additional probability measure ν which is easily described and from which the measure μ is obtained by inverse balayage, i.e. the support of μ is contained in the support of the measure ν and they have the same logarithmic potential outside the support of the latter. This measure is uniquely determined by the choice of a root of Q(z), and thus we have in fact constructed three different measures ν_i having the same measure μ as their inverse balayage.
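For numerical experiments it is convenient that the Cauchy transform of the root-counting measure of a degree-n polynomial p_n equals p′_n(z)/(n p_n(z)) outside the roots; a minimal sketch (assuming NumPy, with stand-in roots rather than an actual spectral polynomial) checks this identity:

```python
import numpy as np
P = np.polynomial.polynomial   # convenience alias for the polynomial module

def cauchy_transform(roots, z):
    """C of the root-counting measure: (1/n) sum_j 1/(z - z_j)."""
    return np.mean(1.0 / (z - roots))

rng = np.random.default_rng(2)
roots = rng.normal(size=40) + 1j * rng.normal(size=40)   # stand-in roots
p = P.polyfromroots(roots)                               # coefficients, low degree first
z = 5.0 + 2.0j                                           # a point outside the support
lhs = cauchy_transform(roots, z)
rhs = P.polyval(z, P.polyder(p)) / (len(roots) * P.polyval(z, p))
print(abs(lhs - rhs))                                    # ~ 0 up to rounding
```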
Let us try to formulate similar results for the asymptotic root behaviour of the Stieltjes polynomials. To this end we must specify which sequence of polynomials we are studying. Take a sequence of monic (leading coefficient 1) Van Vleck polynomials {Ṽ_n} converging to some monic linear polynomial Ṽ; the existence of such a Ṽ is ensured by the existence of the limit of the sequence of (unique) roots ν_{n,i_n} of {Ṽ_n}. The results mentioned above guarantee the existence of plenty of such converging sequences in Conv_Q, and the limit ν̃ of these roots must necessarily belong to Γ_Q. Having chosen {Ṽ_n}, we take any sequence of the corresponding Stieltjes polynomials {S_{n,i_n}}, deg S_{n,i_n} = n.

If we denote by μ_{n,i_n} the root-counting measure of the corresponding Stieltjes polynomial, we have proved that the sequence {μ_{n,i_n}} converges weakly to the unique probability measure μ_Ṽ whose Cauchy transform C_Ṽ(z) satisfies the equation

C²_Ṽ(z) = Ṽ(z) / Q(z)

almost everywhere in C. In order to formulate further results we used [12] the notion of the quadratic differential (cf. also [7, 8]). We avoid this way of formulating the results, because it would necessarily exceed the scope of this paper. Instead, we limit ourselves to presenting a typical example, cf. the right part of Figure 2. The support of the limit measure consists of singular trajectories of the quadratic differential. They run close to the roots shown in red. In this particular case, one trajectory joins two zeros of Q and the other one joins the third zero of Q with the root of the limiting Van Vleck polynomial.

3 Bispectral problems

Concerning the situation when k = 4, certain general statements have already been published (e.g. in [6, 7]). In the case when the roots of the Van Vleck and Stieltjes polynomials are real, we can still rely on the result of Stieltjes mentioned above, which makes ordering of the Stieltjes polynomials possible. The situation is shown in Figure 3. When complex roots come into play, the picture is less clear. Figure 3 suggests that the asymptotic root distribution of Van Vleck polynomials has a more complicated structure than before. On the other hand, the structure of the asymptotic root distribution of Stieltjes polynomials bears some resemblance to the k = 3 case. Several questions are still open. In addition, many other unsolved problems can be found for higher linear differential equations with polynomial coefficients.

Fig. 3: The left part: the location of roots for Q(x) = (x + 5)(x + 1)(x − 5)(x − 12), P = 0, and n = 6. The dots have the same meaning as in Fig. 1. The right upper part: the union of roots of (quadratic) Van Vleck polynomials for Q(z) = (z + 1)(z − 2)(z − 2 − 4i)(z + 3 − 2i), P = 0, and n = 20. The lower part: roots of a particular Stieltjes polynomial (in red) and the roots of the corresponding Van Vleck polynomial (in green)

Acknowledgement

This work has been supported by the Czech Ministry of Education, Youth and Sports within the project LC06002 and by GACR grant P203/11/0701.

References

[1] Bourget, A., McMillen, T.: On the distribution and interlacing of the zeros of Stieltjes polynomials, Proc. AMS 138 (2010), 3267–3275.
[2] Bourget, A., McMillen, T., Vargas, A.: Interlacing and nonorthogonality of spectral polynomials for the Lamé operator, Proc. AMS 137 (2009), 1699–1710.
[3] Borcea, J., Shapiro, B.: Root asymptotics of spectral polynomials for the Lamé operator, Commun. Math. Phys. 282 (2008), 323–337.
[4] Cotfas, N.: Systems of orthogonal polynomials defined by hypergeometric type equations with application to quantum mechanics, Cent. Eur. J. Phys. 2 (2004), 456–466.
[5] Garnett, J.: Analytic Capacity and Measure. Lecture Notes in Mathematics 297, Springer-Verlag, Berlin–New York, 1972.
[6] Holst, T., Shapiro, B.: On higher Heine-Stieltjes polynomials, Isr. J. Math. 183 (2011), 321–347.
[7] Martínez-Finkelshtein, A., Rakhmanov, E. A.: On asymptotic behavior of Heine-Stieltjes and Van Vleck polynomials, Contemporary Mathematics 507 (2010), 209–232.
[8] Martínez-Finkelshtein, A., Rakhmanov, E. A.: Critical measures, quadratic differentials, and weak limits of zeros of Stieltjes polynomials, Comm. Math. Physics 302 (2011), 53–111.
[9] Pólya, G.: Sur un théorème de Stieltjes, C. R. Acad. Sci. Paris 155 (1912), 767–769.
[10] Shapiro, B.: Algebro-geometric aspects of Heine-Stieltjes theory, J. London Math. Soc. 83 (2011), 36–56.
[11] Shapiro, B., Tater, M.: On spectral polynomials of the Heun equation. I, J. Approx. Theory 162 (2010), 766–781.
[12] Shapiro, B., Takemura, K., Tater, M.: On spectral polynomials of the Heun equation. II, arXiv:0904.0650.
[13] Szegő, G.: Orthogonal Polynomials. 1975, AMS, Providence, R.I.
[14] Whittaker, E. T., Watson, G.: A Course of Modern Analysis. Reprint of the 4th edition, 1996, Cambridge Univ. Press, UK.

Boris Shapiro, e-mail: shapiro@math.su.se, Department of Mathematics, Stockholm University, SE-106 91 Stockholm, Sweden
Miloš Tater, e-mail: tater@ujf.cas.cz, Department of Theoretical Physics, Nuclear Physics Institute AS CR v.v.i., CZ-250 68 Řež, Czech Republic

How to Protect Patients' Digital Images/Thermograms Stored on a Local Workstation

J. Živčák, M. Roško

Abstract
To ensure the security and privacy of patient electronic medical information stored on local workstations in doctors' offices, clinic centers, etc., it is necessary to implement a secure and reliable method for logging on and accessing this information. Biometrically-based identification technologies use measurable personal properties (physiological or behavioral), such as a fingerprint, in order to identify or verify a person's identity, and provide the foundation for highly secure personal identification, verification and/or authentication solutions. The use of biometric devices (fingerprint readers) is an easy and secure way to log on to the system. We have performed practical tests on HP notebooks that have the fingerprint reader integrated. Successful/failed logons have been monitored and analyzed, and calculations have been made. This paper presents the false rejection rates, false acceptance rates and failure to acquire rates.

Keywords: digital images, thermograms, biometrics, fingerprint, authentication.

1 Introduction

The Health Insurance Portability and Accountability Act (HIPAA), which was designed to ensure the security and privacy of personal health information, affects all areas of health care. If digital (radiology) images (any kind of images, e.g., CT images or thermograms) are locally stored at workstations, they must be secured against misuse. Nowadays, digital images and reports are distributed and accessed by authorized persons (clinicians, technologists, etc.) throughout doctors' offices and/or by health care providers. Thus, appropriate access control, authorization and subsequent audit trails are critical [1, 2].

Common approaches to securing access to patient medical information (digital images or thermograms, medical reports, and other digital data) include passwords and other more sophisticated user identification and/or authentication methods, such as smart cards, biometrics, etc. [3]. To improve security and be HIPAA compliant, imaging centers and imaging departments (of hospitals, clinics) must implement security procedures and appropriate user authentication. With increasing numbers of images/thermograms being transmitted over the Internet to physicians' offices, encryption is also a key component in HIPAA compliance [2].
The biometrics industry includes many hardware and software producers. Standards are emerging for a common software interface to enable the use of biometric identification in many solutions that provide security and positive identification [4]. Sharing of biometric templates, allowing effective evaluation and combination of two or more different biometric technologies, is offered by IDTECK or Precise 100MC/200MC/250MC (fingerprint and smart card readers) or Sagem Morpho (fingerprint, facial and iris recognition). Interoperable biometric applications and solutions are offered by Cross Match Technologies Inc., DigitalPersona, or Precise 100MC/200MC/250MC (which also offers integration with Microsoft Windows Active Directory) [5, 6, 7]. These are just a few examples of leading global producers of biometric identity software and hardware (applications and solutions).

2 Methods

We performed practical tests on 3 identical Hewlett Packard notebooks (model 6735b) with the Windows Vista Business operating system installed, and we interconnected 3 different users in a local area network (LAN), within a time frame of one month (February 2009). The biometric (fingerprint) Windows-based system environment was implemented, and the logon and authentication activity of users using a fingerprint instead of typing their password was monitored by enabling success and failure logon auditing in the Windows system's audit policy. The practical tests were carried out at the Clinic of Plastic and Aesthetic Surgery, Porta Med, Ltd., Košice (Slovak Republic).

3 Capturing of fingerprints

Fingerprints were captured using the integrated fingerprint scanning device (reader/sensor). The scanning device is an input device that transfers the user's biometric information into electrical information and then into digital information [8, 9, 10]. In Windows, the user must authenticate before access is granted to files, folders, and/or applications (on stand-alone clients, in Active Directory setups, or in some other network environment) [11]. Microsoft Windows assures security by using the following processes: authentication, which verifies the identity of something or someone, and authorization, which allows control of access to all local and network resources, such as files and printers [12].

There are four scenarios associated with the verification task. Based on whether the identity claim originates from an enrollee or from a fraud, the system either correctly or incorrectly accepts or rejects the identity claim [13] (Tab. 1).

Table 1: Biometric system decision vs. identity claim

identity claim   accept           reject
enrollee         genuine accept   false reject
fraud            false accept     genuine reject

Two steps are taken before a fingerprint is used to log on to Windows: (1) register the user's fingerprints in Credential Manager, and (2) set up Credential Manager to log on to Windows. To register a user's fingerprints in Credential Manager, at least 2 of the user's fingerprints must be registered to obtain biometric samples (templates) of sufficient quality. This means that the user must swipe the same finger slowly over the fingerprint reader several times, until the finger on the screen turns green and the progress indicator displays 100 %. The biometric templates were stored locally on the hard drive of each laptop.

In addition, auditing of account logon events was enabled. This governs auditing of each instance when a user logs on with a swipe of his/her finger over the fingerprint reader.
Auditing fingerprint logon attempts generates security events, depending on whether the auditing of successes or failures, or both (in our case we audited both), is enabled. Success auditing generates an audit entry when an account logon process is successful; failure auditing generates an audit entry when an attempted account logon process fails. The events recorded in Event Viewer were used to track each user's logon attempts occurring on each HP notebook locally. The numbers of entries in Event Viewer where the account logon process was successful and/or failed were counted and analyzed.

4 Results

We have already mentioned that the system correctly or incorrectly accepts or rejects the identity claim. Thus we encounter four situations, as per Tab. 1: (1) true positive – genuine accept of an enrollee, (2) false positive – false reject of an enrollee, (3) false negative – false accept of a fraud, and (4) true negative – genuine reject of a fraud [13].

A measure of the performance of a biometric system is its error rate, described by the false acceptance rate FAR (the probability that the biometric system incorrectly identified an enrollee or failed to reject a fraud), and the false rejection rate FRR (the probability that the biometric system failed to identify an enrollee, or rejected a legitimate identity claim as a fraud) [14, 15].

The false acceptance rate FAR is defined as:

FAR = (number of false acceptances) / (number of fraud recognition attempts).   (1)

The false rejection rate FRR is defined as:

FRR = (number of false rejections) / (number of enrollee recognition attempts).   (2)

The value at the point where FAR and FRR are equal is called the equal error rate (ERR). This value does not have any practical use, so we did not calculate it; however, it is an indicator of the accuracy of the device. For example, if we have two devices with error rates of 5 % and 10 %, we know that the first device is more accurate (it makes fewer errors) than the other. However, such comparisons are not straightforward in reality [15, 16].

The numbers of entries from Event Viewer — in this case fingerprint logon attempts where the account logon process was successful and/or failed (for each user on each notebook) — were collected, counted and analyzed. Tab. 2 and Tab. 3 show the FRR rates calculated from the real environment of three different computers (with the same type of fingerprint sensor/scanner) and three users. Although the error rates quoted by manufacturers (typically FAR < 0.01, FRR < 0.1, ERR < 1) may indicate that biometric systems are very accurate, the real situation is rather different; namely, the FRR is very high (over 10 %). In our case, the FRR values expressed as a percentage are in the range of 9.5 % to 18.5 % (Tab. 4). This can sometimes prevent a legitimate user (enrollee) from gaining access. Thus we must be very careful when interpreting such numbers/measurements.
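A minimal sketch of the rate definitions in equations (1), (2) and the failure-to-acquire rate of equation (3) below, reproducing a few entries of the tables that follow from the logged counts:

```python
def frr(successful, failed):
    """Eq. (2): false rejections / all enrollee recognition attempts."""
    return failed / (successful + failed)

def far(false_accepts, fraud_attempts):
    """Eq. (1): false acceptances / all fraud recognition attempts."""
    return false_accepts / fraud_attempts

def fta(refused, all_acquirements):
    """Eq. (3): refused acquirements / all acquirement attempts."""
    return refused / all_acquirements

print(round(frr(57, 7), 3))    # user 2 on notebook 1 in Tab. 2: 0.109
print(round(far(1, 50), 2))    # the fraud test described below: 0.02
print(round(fta(40, 238), 3))  # user 1 in Tab. 5: 0.168
```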
Table 2: Numbers of logins (successful, failed) for each user per notebook, and calculated false rejection rates FRR

notebook 1   successful   failed   FRR
user 1       46           8        0.142
user 2       57           7        0.109
user 3       66           7        0.095
total        169          22       0.115

notebook 2   successful   failed   FRR
user 1       99           12       0.108
user 2       44           10       0.185
user 3       133          22       0.141
total        276          44       0.137

notebook 3   successful   failed   FRR
user 1       65           8        0.109
user 2       71           9        0.112
user 3       89           14       0.135
total        225          31       0.121

Table 3: Total successful and failed logins (per user), and false rejection rates FRR

         successful   failed   FRR
user 1   210          28       0.117
user 2   172          26       0.131
user 3   288          43       0.129
total    670          97       0.126

Tab. 4 shows the FRR rates for each user and each computer/notebook (expressed as a percentage) out of the total of authorized and failed access attempts (fingerprint used to log on to Windows).

Table 4: FRR rates in [%] (NB – notebook)

         NB 1   NB 2   NB 3
user 1   14.2   10.8   10.9
user 2   10.9   18.5   11.2
user 3   9.5    14.1   13.5

The numbers of refused acquirement attempts for each user were counted in advance, and the failure to acquire rate FTA was calculated as follows [16]:

FTA = (number of refused acquirement attempts) / (number of all acquirement attempts).   (3)

An acquirement refusal means the inability of the fingerprint reader (sensor) to deliver the output data. No software or log files were used to count these refused acquirement attempts; manual counting was arranged by each user to record the attempts refused by the respective fingerprint reader (sensor). The numbers of refused logon attempts for each user (false reject of an enrollee) are shown in Tab. 5. These are only informative results indicating how many fingerprint logon attempts were not enrolled. The failure to acquire rates (FTA) were also calculated, and are shown in Tab. 5.

Table 5: FTA rates

         acquired attempts (successful and failed)   refused   FTA
user 1   238                                         40        0.168
user 2   198                                         32        0.161
user 3   331                                         52        0.157
total    767                                         124       0.161

Tab. 6 shows the numbers of genuine acceptances and false rejects and/or false acceptances and genuine rejects associated with user 1 and notebook 1. A false reject of an enrollee is referred to as a type 1 error of the identity claim, or a false positive; a false acceptance of a fraud is referred to as a type 2 error of the identity claim, or a false negative [13].

Table 6: The numbers of accepted and rejected attempts associated with user 1 and notebook 1 (note: the numbers of accepted and rejected attempts of enrollee/user 1 were taken from Tab. 2)

           accepted                          rejected
enrollee   46 true positive (genuine accept)   8 false positive (false reject)
fraud      1 false negative (false accept)     49 true negative (genuine reject)

False acceptance of a fraud (false negative) is a possible error in the statistical decision process that fails to reject enrollment when it should have been rejected. In real-life applications, one type of error may have more serious consequences than the other [7]. We measured the false acceptance rate FAR parameter for one user only (user 1) during his/her 50 login (recognition) attempts, when the user, instead of enrolling with his/her "registered" fingerprint (we used index fingers), provided some other "not registered" finger(s). (Note: a not registered finger means that the biometric samples/templates of the fingerprints had not been captured.) In this part of the test, user 1 passed the authentication (was not rejected) once, which represents 2 % of the total fraud login attempts.
The false acceptance rate (FAR), as mentioned above, is typically FAR < 0.01. As our measurements show, where the FAR rates were calculated as per (1), we had only one false acceptance of a fraud (false negative), which represents 2 % of the total number of fraud login attempts; thus in this case the false acceptance rate FAR = 0.02.

Related calculations [13] from Tab. 6:

false positive rate = false positive / (false positive + true negative),   (4)

false negative rate = false negative / (true positive + false negative).   (5)

Then

false positive rate = 8 / (8 + 49) = 0.14 [or 14 %],   (6)

false negative rate = 1 / (46 + 1) = 0.02 [or 2 %].   (7)

5 Conclusions

Utilizing fingerprints for personal authentication is becoming convenient and considerably more accurate than current methods, such as the utilization of passwords. Fingerprints cannot be forgotten, shared or misplaced. We have shown experimentally that the use of biometric techniques (fingerprint biometrics) is not yet perfect, but is reliable and secure enough to be used for logging on to, e.g., personal computers (workstations) and/or networks to obtain proper data access.

Some factors influence our results for authentication reliability (dryness or wetness of fingerprints, pressure, speed of finger swiping over the fingerprint reader, etc.). These factors influence the generation of a unique template for use each time an individual's biometric data is scanned and captured. Consequently (depending on the biometric system), a person may need to present biometric data several times in order to enroll. As regards fingerprint-based methods, note that the stored fingerprint templates should not enable reconstruction of the full fingerprint image. In this way, the system can comply perfectly well with privacy rules, so that it can only be used in co-operation with the person who is enrolled.

Acknowledgement

This paper is an outcome of the VEGA project No. 1/0829/08: "Correlation of input parameters and output thermograms changes within infrared thermography diagnostics", carried out at the Technical University of Košice, Faculty of Mechanical Engineering, Department of Biomedical Engineering, Automation and Measurement. We thank MUDr. Viliam Jurášek and his staff from the Clinic of Plastic and Aesthetic Surgery, Porta Med, Ltd., Košice, Slovak Republic, for their assistance with data collection.

References

[1] Gate, L.: PACS integration and work flow. Radiologic Technology, 2004, Vol. 75, No. 5, pp. 367–377. The American Society of Radiologic Technologists, 2004.
[2] Lehman, J.: HIPAA's impact on radiology. Radiology Management, 2003, Vol. 25, No. 1, pp. 45–46.
[3] Ross, A., Prabhakar, S., Jain, A.: An overview of biometrics. [on-line]. [cit. 3-23-2010]. http://biometrics.cse.msu.edu/info.html
[4] Chang, Kyong I., Bowyer, Kevin W., Flynn, Patrick J., Chen, Xin: Multi-biometrics using facial appearance, shape and temperature. 6th IEEE Int. Conf. on Automatic Face and Gesture Recognition FG'04, Seoul, Korea, May 17–19, 2004, pp. 43–48.
[5] Public attitudes toward the uses of biometric identification technologies by government and the private sector. Summary of survey findings. Prepared by ORC International, 2002. [on-line]. [cit. 3-23-2010]. http://www.ece.unh.edu/biometric/biomet/public_docs/biometricsurveyfindings.pdf
[6] Mullaney, J.: Biometric authentication a choice for banks. Software Quality News, 12 Oct 2006. [on-line]. [cit. 3-23-2010]. http://searchsoftwarequality.techtarget.com/news/article/0,289142,sid92_gci1222998,00.html
[7] Liu, S., Silverman, M.: A practical guide to biometric security technology. IT Professional, 2001, 3, pp. 23–32. [on-line]. [cit. 3-23-2010]. http://www.computer.org/itpro/homepage/jan_feb/security3.htm
[8] Ratha, N. K., Connell, J. H., Bolle, R. M.: Enhancing security and privacy in biometrics-based authentication systems. IBM Systems Journal, 2001, Vol. 40, No. 3, pp. 614–634.
[9] Rhodes, Keith A.: Information security. Challenges in using biometrics. Applied Research and Methods, 2003. [on-line]. [cit. 1-20-2009]. http://www.gao.gov/fraudnet/fraudnet.htm
[10] Maltoni, D., Maio, D., Jain, A. K., Prabhakar, S.: Handbook of Fingerprint Recognition. Springer Verlag, New York, 2003. [on-line]. [cit. 1-22-2009]. http://bias.csr.unibo.it/maltoni/handbook
[11] HP ProtectTools. Security Manager reference guide. [on-line]. [cit. 2-2-2009]. http://www.hp.com/notebook
[12] Understanding logon and authentication. Published: November 2005. [on-line]. [cit. 1-25-2009]. http://www.microsoft.com/technet/prodtechnol/
[13] Lehmann, E. L., Romano, Joseph P.: Testing Statistical Hypotheses (3rd ed.). New York, Springer. ISBN 0387988645.
[14] Association for Biometrics, International Computer Security Association: Glossary of biometric terms. 1999. [on-line]. [cit. 1-20-2009]. http://www.afb.org.uk/docs/glossary.htm
[15] Roško, M.: Biometrics: fingerprint verification and/or authentication in Windows-based system environment. In: Crisis Management, 02/2007, p. 6. University of Žilina (Faculty of Special Engineering), Žilina. ISSN 1336-0019.
[16] Říha, Z., Matyas, V.: Biometric authentication systems. Masaryk University (Faculty of Informatics). Technical report (FIMU-RS-2000-08), p. 46, November 2000.

Dr.h.c. prof. Ing. Jozef Živčák, PhD., phone: +421 556 022 381, fax: +421 556 022 363, e-mail: jozef.zivcak@tuke.sk, Technical University of Košice, Faculty of Mechanical Engineering, Department of Biomedical Engineering, Automation and Measurement, Letná 9/A, 042 00 Košice, Slovak Republic
Ing. Milan Roško, phone: +14 164 696 333, fax: +14 164 696 615, e-mail: milan.rosko@gmail.com, Toronto East General Hospital, 825 Coxwell Ave., M4C 3E7, Toronto, Canada

Root Asymptotics for the Eigenfunctions of Univariate Differential Operators

B. Shapiro

Abstract
This paper is a brief survey of the research conducted by the author and his collaborators in the field of root asymptotics of (mostly polynomial) eigenfunctions of linear univariate differential operators with polynomial coefficients.

Keywords: root-counting measure, exactly solvable operator, Schrödinger equation.

1 Objective

Study the asymptotic properties of sequences {p_n(z)} of polynomials/entire functions in z which either

1. are polynomial/entire eigenfunctions of a univariate linear ordinary differential operator with polynomial coefficients; or
2. are polynomial solutions of more general pencils of such operators, e.g. homogenized spectral problems and Heine-Stieltjes spectral problems; or
3. satisfy a finite recurrence relation with (in general) varying coefficients.

2 Basic notions and examples

Definition 1. An operator

T = ∑_{i=1}^{k} q_i(z) d^i/dz^i

is called exactly solvable if deg q_i(z) ≤ i and there exists at least one value i such that deg q_i(z) = i.

Obviously, T(z^j) = a_j z^j + lower order terms, i.e. T acts by an (infinite) triangular matrix in the monomial basis {1, z, z², ...} of C[z].
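A minimal sketch (assuming NumPy; the helper names are ours) of how this triangular action yields the unique eigenpolynomial of Lemma 1 below, for operators of the form T = q(z) d^k/dz^k used in the examples of Section 2.2:

```python
import numpy as np

def operator_matrix(q, k, n):
    """Matrix of T = q(z) d^k/dz^k on the basis 1, z, ..., z^n, truncated to
    degrees <= n; q holds the coefficients of q(z), lowest degree first.
    With deg q = k the matrix is upper triangular, as noted above."""
    M = np.zeros((n + 1, n + 1), dtype=complex)
    for j in range(k, n + 1):
        fall = np.prod(np.arange(j - k + 1, j + 1, dtype=float))  # j!/(j-k)!
        for m, c in enumerate(q):
            if j - k + m <= n:
                M[j - k + m, j] += c * fall
    return M

def eigenpolynomial(M):
    """Back-substitute (M - a_n I) p = 0 with p_n = 1, where a_n = M[n, n]."""
    n = M.shape[0] - 1
    p = np.zeros(n + 1, dtype=complex)
    p[n] = 1.0
    for i in range(n - 1, -1, -1):
        p[i] = -(M[i, i + 1:] @ p[i + 1:]) / (M[i, i] - M[n, n])
    return p

q = np.polynomial.polynomial.polyfromroots([0, 1, 1j])   # T_1: q(z) = z(z-1)(z-i)
p25 = eigenpolynomial(operator_matrix(q, k=3, n=25))     # modest n for stability
print(np.polynomial.polynomial.polyroots(p25))           # roots cluster as in Fig. 1
```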
Lemma 1. For any exactly solvable T and sufficiently large n there exists a unique (up to a scalar) eigenpolynomial p_n(z) of degree n.

Typical problem. Given an exactly solvable T, describe the root asymptotics for the sequence of polynomials {p_n(z)}.

2.1 Two asymptotic measures

Given a polynomial family {p_n(z)} where deg p_n(z) = n, we define two basic measures: (i) the asymptotic root-counting measure μ; (ii) the asymptotic ratio measure ν.

Definition 2. Associate to each p_n(x) a finite probability measure μ_n by placing the mass 1/n at every root of p_n(x). (If some root is multiple, we place at this point the mass equal to its multiplicity divided by n.) The limit μ = lim_n μ_n (if it exists in the sense of weak convergence) will be called the asymptotic root-counting measure of {p_n(z)}.

Definition 3. Consider the ratio q_n(z) = p_{n−1}(z)/p_n(z). (Assume for simplicity that p_n(z) has no multiple roots, and expand q_n(z) = ∑_{i=1}^{n} κ_{i,n}/(z − z_{i,n}).) Associate to q_n(z) the finite complex-valued measure ν_n obtained by placing the mass κ_{i,n} at z_{i,n}. Define the asymptotic ratio measure of the sequence {p_n(z)} as ν = lim_{n→∞} ν_n.

Observation. The supports of μ and ν coincide, but ν is often complex-valued.

2.2 Examples

Below we show the root distribution of p_55(z) for 4 different exactly solvable operators,

T_1 = z(z − 1)(z − i) d³/dz³;
T_2 = (z − i)(z + i)(z − 2 + 3i)(z − 3 − 2i) d⁴/dz⁴;
T_3 = (z − i)(z + i)(z − 2 + 3i)(z − 3 − 2i)(z + 3) d⁵/dz⁵;
T_4 = (z² + 1)(z − 2 + 3i)(z − 3 − 2i)(z + 3)(z + 1 + i) d⁶/dz⁶,

each of the form q(z) d^k/dz^k, where q(z) is a monic polynomial of degree k.

Fig. 1: Roots of p_55(z) for the above T's. The larger dots show the roots of the corresponding q(z) and the smaller dots are the fifty five roots of the corresponding p_55(z)

2.3 Classical prototypes

Theorem 1 (G. Szegő). If {p_n(z)} is a family of polynomials orthogonal w.r.t. a positive weight w(z) supported on [−1, 1] such that ∫_{−1}^{1} ln w(z) dz < ∞, then the asymptotic root-counting measure has the density 1/(π sqrt(1 − z²)), z ∈ [−1, 1].

Theorem 2 (G. Szegő). If {p_n(z)} is a family of polynomials orthogonal w.r.t. a weight w(z) supported on [−1, 1] such that ∫_{−1}^{1} ln w(z) dz / sqrt(1 − z²) > −∞, then the asymptotic ratio measure has the density 2 sqrt(1 − z²)/π, z ∈ [−1, 1].

3 First results

3.1 Non-degenerate exactly solvable operators

This subsection is based on [10, 2].

Definition 4. The Cauchy transform of a (complex-valued) measure ρ satisfying ∫_C dρ(ξ) < ∞ is given by

C_ρ(z) = ∫_C dρ(ξ) / (z − ξ).

Example. If ρ(z) = 1/(π sqrt(1 − z²)), z ∈ [−1, 1], then C_μ = 1/sqrt(z² − 1) in C \ [−1, 1], and C_ν = 2/(z + sqrt(z² − 1)) in C \ [−1, 1].

Definition 5. An exactly solvable operator T = ∑_{i=1}^{k} q_i(z) d^i/dz^i is called non-degenerate if deg q_k(z) = k.

Proposition 1. Assuming that ψ(z) = lim_{n→∞} p′_n(z)/(n p_n(z)) exists in some open neighborhood Ω of C, one gets that ψ(z) satisfies in Ω the algebraic equation

q_k(z) ψ^k(z) = 1.

Theorem 3 (H. Rullgård). Let q_k(z) be a monic degree k polynomial. Then there exists a unique probability measure μ_q such that 1) supp μ_q is compact; 2) its Cauchy transform C_μ satisfies the equation q_k(z) C^k_μ(z) = 1 almost everywhere in C.

Fig. 2: The measure μ_q before and after the straightening transformation in the case q(z) = (z − 1)(z − 3)(z − i)
Theorem 4 (main result, see Fig. 2). In the above notation:
1) supp μ_q is a curvilinear tree which is straightened out by the analytic mapping

ξ(z) = ∫_a^z dz / q_k(z)^{1/k};

2) supp μ_q contains all the zeros of q_k(z) and is contained in the convex hull of those;
3) there is a natural formula for the angles between the branches, and the masses of the branches satisfy the Kirchhoff law.

Below we show an example of such a measure in a proper scale and with all the angles between its vertices marked, see Fig. 3.

Fig. 3: Example of μ_q with angles

Problem 1. Is it true that the support of the measure μ_q is a subset of the Stokes lines of the corresponding operator q d^k/dz^k? Some partial results in this direction can be found in [12].

3.2 Degenerate exactly solvable operators

This subsection is based on [1].

Definition 6. An exactly solvable T of order k is called degenerate iff deg q_k < k.

Classical examples:

T = z d²/dz² + (az + b) d/dz,   T = d²/dz² + (az + b) d/dz,

leading to Laguerre resp. Hermite polynomials.

Proposition 2. The union of all roots of all polynomial eigenfunctions of an exactly solvable T is unbounded if and only if T is degenerate.

Problem 2. Given a degenerate T with the family of eigenpolynomials {p_n(z)}, how fast does the maximum r_n of the modulus of the roots of p_n(z) grow?

Conjecture 1. Given a degenerate T = ∑_{j=1}^{k} q_j(z) d^j/dz^j, denote by j_0 the largest j for which deg q_j(z) = j. Then

lim_{n→∞} r_n / n^d = c_T,

where c_T > 0 is a positive constant and

d := max_{j ∈ [j_0+1, k]} ( (j − j_0) / (j − deg q_j) ).

Corollary 1 (of the latter conjecture). The Cauchy transform C(z) of the asymptotic root measure μ of the scaled eigenpolynomial q_n(z) = p_n(n^d z) of a degenerate T satisfies the following algebraic equation for almost all complex z:

z^{j_0} C^{j_0}(z) + ∑_{j∈A} α_{j, deg q_j} z^{deg q_j} C^j(z) = 1,

where A is the set consisting of all j for which the maximum d := max_{j ∈ [j_0+1, k]} ((j − j_0)/(j − deg q_j)) is attained, i.e. A = {j : (j − j_0)/(j − deg q_j) = d}.
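Conjecture 1 can be checked numerically on the classical Hermite example T = d²/dz² − 2z d/dz (take a = −2, b = 0 above), where j_0 = 1 and d = (2 − 1)/(2 − 0) = 1/2; a minimal sketch assuming NumPy:

```python
import numpy as np

for n in (10, 50, 200, 1000):
    roots = np.polynomial.hermite.hermroots([0] * n + [1])  # roots of H_n
    print(n, np.max(np.abs(roots)) / n ** 0.5)              # r_n / n^(1/2)
```

The printed ratios r_n / n^(1/2) stabilize, consistent with the conjectured growth rate (for Hermite polynomials the largest root is known to behave like sqrt(2n)).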
4 homogenized spectral problem for non-degenerate t

this section is based on [6]. an observant reader has noticed that so far only the leading coefficient of an exactly solvable operator affected the asymptotic root-counting measure, which makes the situation somewhat unsatisfactory. to make the whole symbol of an operator important, we consider (following the classical pattern of e.g. w. wasow and m. fedoryuk) the homogenized spectral problem of the form
$$t_\lambda = \sum_{i=0}^{k} q_i(z)\,\lambda^{k-i}\,\frac{d^i}{dz^i},$$
where each $q_i(z) = a_{i,i}z^i + a_{i,i-1}z^{i-1} + \ldots$ is a polynomial of degree $i$.

definition 7. a non-degenerate $t$ is called of general type iff $\deg q_k(z) = k$ and $\sum_{i=0}^{k} a_{i,i}\lambda^{k-i} = 0$ has $k$ distinct zeros.

proposition 3. if $t$ is of general type then 1) for all sufficiently large $n$ there exist exactly $k$ distinct values $\lambda_{n,j}$, $j = 1,\ldots,k$, of the spectral parameter $\lambda$ such that the operator $t_\lambda$ has a polynomial eigenfunction $p_{n,j}(z)$ of degree $n$; 2) asymptotically $\lambda_{n,j} \sim n\lambda_j$, where $\lambda_1,\ldots,\lambda_k$ is the set of roots of the algebraic equation $\sum_{i=0}^{k} a_{i,i}x^{k-i} = 0$.

conjecture 2. if $t$ is of general type and all $\lambda_1,\ldots,\lambda_k$ have distinct arguments, then for each $j = 1,\ldots,k$ there exists a unique probability measure $\mu_j$ with compact support whose cauchy transform $c_j(z)$ satisfies almost everywhere in $\mathbb{c}$
$$\sum_{i=1}^{k} q_i(z)\,(\lambda_j c_j(z))^i = 0.$$

conjecture 3. $c_j(z) = \lim_{n\to\infty}\frac{p'_{n,j}(z)}{\lambda_{n,j}\,p_{n,j}(z)}$ outside the support of $\mu_j$, which is the union of finitely many segments of analytic curves forming a curvilinear tree.

observation. near $\infty \in \mathbb{cp}^1$ the cauchy transforms $\lambda_1 c_1(z),\ldots,\lambda_k c_k(z)$ are independent sections of the symbol equation of $t_\lambda$ considered as a branched cover over $\mathbb{cp}^1$.

problem 3. find an explicit description of (the support of) the measures $\mu_i$. is there any relation of these measures to the periods of the plane curve $\sum_{i=1}^{k} q_i(z)\,y^i = 0$?

fig. 5: three root-counting measures and their union for a homogenized spectral problem with an operator of order 3.

5 heine-stieltjes theory

this section is based on [11]. take an arbitrary univariate linear differential operator $t = \sum_{i=0}^{k} q_i(z)\frac{d^i}{dz^i}$ with polynomial coefficients and set $r = \max_i(\deg q_i(z) - i)$.

definition 8. if $r \ge 0$, $\deg q_k(z) = k+r$ and $q_k(z)$ has at least two distinct roots, we call $t$ a general lamé-type operator.

consider the following multi-parameter spectral problem: for a given non-negative integer $n$, find all polynomials $v(z)$ of degree at most $r$ such that the equation
$$t(p(z)) + v(z)\,p(z) = 0$$
has a polynomial solution $p(z)$ of degree $n$. (classically, $p(z)$ is called a stieltjes polynomial and $v(z)$ is called a van vleck polynomial.)

proposition 4. under the above assumptions, for any sufficiently large $n$ there exist exactly $\binom{n+r}{r}$ degree-$n$ stieltjes polynomials $p_{n,j}(z)$ and corresponding van vleck polynomials $v_{n,j}(z)$.

proposition 5. if a sequence $\{\tilde v_{n,j_n}(z)\}$, $n = 1,\ldots$, of scaled van vleck polynomials converges to some polynomial $\tilde v(z)$, then the sequence of finite measures $\mu_{n,j}$ of the corresponding family of eigenpolynomials $\{p_{n,j_n}(z)\}$ converges to a measure $\mu_{\tilde v}$ satisfying the following properties: a) $\mathrm{supp}\,\mu_{\tilde v}$ is a forest of curvilinear trees; b) the union of the leaves of $\mathrm{supp}\,\mu_{\tilde v}$ coincides with the union of all zeros of $q_k(z)$ and those of $\tilde v(z)$; c) $\mathrm{supp}\,\mu_{\tilde v}$ is straightened out by the transformation given by $\int_a^z \frac{\tilde v(z)\,dz}{q_k(z)}$.

explanations to figs. 6 and 7. in fig. 6 we give two examples of different van vleck polynomials $v(z)$ and the corresponding stieltjes polynomials $p(z)$. the average-size dots are the 4 roots of the polynomial $q(z) = (z^2+1)(z+2i-3)(z-3i-2)$, the unique large dot is the only root of $v(z)$ (which is linear in this case), and the small dots show the roots of $p(z)$. in fig. 7 we show the union of all roots of $p(z)$ of degree 25 for the same problem.

fig. 6: examples of $\mu_q$'s for $t = (z^2+1)(z+2i-3)(z-3i-2)\,\frac{d^3}{dz^3}$.

fig. 7: union of $\mu_q$'s for the above $t$.

6 schrödinger operator with polynomial potential

this section is based on [7, 8]. consider the operator $h = -\frac{d^2}{dz^2} + p(z)$, where $p(z) = z^{2l} + \sum_{i=0}^{2l-1} a_i z^i$ is a monic polynomial of even degree with real coefficients. it is well-known that the classical spectral problem
$$h(y) = \lambda y \qquad (1)$$
where $y$ belongs to $l^2(\mathbb{r})$, has a discrete and simple spectrum $0 < \lambda_0 < \lambda_1 < \lambda_2 < \ldots < \lambda_n < \ldots$. denote by $\phi_0(z), \phi_1(z), \ldots, \phi_n(z), \ldots$ the sequence of the corresponding eigenfunctions.
these eigenfunctions are real entire functions of order $l+1$, and $\phi_n(z)$ has exactly $n$ real zeros. set $\psi_n(z) = \phi_n(\sqrt[2l]{\lambda_n}\,z)$, which we call the scaled $n$-th eigenfunction.

the stokes graph of any complex polynomial $p(z)$ is the following object. each root of $p(z)$ is called a turning point. a (local) stokes line of $p(z)$ is a maximal segment of a real analytic curve containing at most two turning points (finite or infinite) which solves the equation
$$\Im\,\xi_{z_0}(z) = 0, \qquad (2)$$
where $\xi_{z_0}(z) = \int_{z_0}^{z}\sqrt{p(u)}\,du$ and $z_0$ is one of the turning points of $p(z)$. the stokes graph $st_p$ of the polynomial $p(z)$ is the union of all its local stokes curves. a local stokes line connecting two finite turning points, i.e. two roots of $p(z)$, is called short. (the stokes graph $st(p)$ of a generic $p(z)$ has no short stokes lines.)

proposition 6. for a given positive integer $l$, the stokes graph $st(z^{2l}-1)$ consists of: 1) $l$ short stokes lines for $l$ odd and $l-1$ short stokes lines for $l$ even, connecting all pairs of the roots of $z^{2l}-1$ which are symmetric w.r.t. the imaginary axis; 2) for $l$ odd, each root of $z^{2l}-1$ is connected by 2 infinite stokes lines to $\infty$; more exactly, the 2 infinite stokes lines passing through the root $e^{\pi i k/l}$, $k = 0,\ldots,2l-1$, are tangent at $\infty$ to the stokes rays having the nearest slope to $\pi k/l$; 3) for $l$ even, each root of $z^{2l}-1$ except for $\pm i$ is connected to $\infty$ by 2 infinite stokes lines with the same property as above; the roots $\pm i$ have 3 infinite stokes lines each.

theorem 5. for any monic polynomial $p(z)$ of even degree, the sequence of meromorphic functions $c_n(z) = \frac{\psi'_n(z)}{n\,\psi_n(z)}$ converges to $c(z) = -k_l\sqrt{z^{2l}-1}$ uniformly on any compact set lying in the domain $\mathbb{c}\setminus u_{cl}$, where
$$k_l = \frac{\sqrt{\pi}\,\gamma\!\left(\frac{3l+1}{2l}\right)}{\gamma\!\left(\frac{2l+1}{2l}\right)}.$$
(here by $-\sqrt{z^{2l}-1}$ we mean the branch which is negative for $z > 1$; $u_{cl}$ is a certain subset of local stokes lines, marked in bold in fig. 8.)

fig. 8: stokes lines of $z^{2l}-1$ for $l = 1, 2, 3$.

7 finite recurrences

this section is based on [4]. consider a finite recurrence of length $k+1$ given by
$$p_{n+1}(z) = q_1(z)\,p_n(z) + \ldots + q_k(z)\,p_{n-k+1}(z),$$
with polynomial or rational coefficients $\{q_1(z),\ldots,q_k(z)\}$; the sequence is uniquely determined by the initial $k$-tuple $\{p_0(z),\ldots,p_{k-1}(z)\}$.

theorem 6. there exists a finite subset $\theta \subset \mathbb{c}$ depending on the initial $k$-tuple, and a curve $\sigma$ depending on the recurrence, such that the asymptotic ratio $\psi(z) = \lim_{n\to\infty}\frac{p_{n+1}(z)}{p_n(z)}$ exists and satisfies the symbol equation
$$\psi^k(z) = q_1(z)\,\psi^{k-1}(z) + \ldots + q_k(z) \qquad (*)$$
in $\mathbb{c}\setminus(\sigma\cup\theta)$. here $\sigma$ is the so-called stokes discriminant of $(*)$, which is the set of all $z$ for which the equation $(*)$ has at least two roots with the same and maximal absolute value.

fig. 9: zeros of $p_{31}(z)$ satisfying the recurrence relation $(z+1)\,p_n(z) = (z^2+1)\,p_{n-1}(z) + (z-5i)\,p_{n-2}(z) + (z^3-1-i)\,p_{n-3}(z)$.
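the recurrence of fig. 9 can be iterated directly. since the factor $(z+1)$ on the left-hand side makes the $p_n$ rational in general, it is convenient to write $p_n = r_n/(z+1)^n$, which turns the relation into a polynomial recurrence for the numerators $r_n$. the sketch below is only an illustration with assumed data: the paper does not state the initial triple, so $r_0 = r_1 = r_2 = 1$ is an arbitrary choice made here.

```python
# an illustrative sketch (with assumed initial data): iterating the recurrence
# of fig. 9 via the numerator recurrence
#   r_n = q1 r_{n-1} + q2 (z+1) r_{n-2} + q3 (z+1)^2 r_{n-3},
# whose zeros away from z = -1 are the zeros of p_n.
import numpy as np
from numpy.polynomial import polynomial as P

q1 = np.array([1, 0, 1])           # z^2 + 1
q2 = np.array([-5j, 1])            # z - 5i
q3 = np.array([-1 - 1j, 0, 0, 1])  # z^3 - 1 - i
w  = np.array([1, 1])              # z + 1

r = [np.array([1.0]), np.array([1.0]), np.array([1.0])]   # assumed r0, r1, r2
for n in range(3, 32):
    term = P.polymul(q1, r[n - 1])
    term = P.polyadd(term, P.polymul(P.polymul(q2, w), r[n - 2]))
    term = P.polyadd(term, P.polymul(P.polymul(q3, P.polymul(w, w)), r[n - 3]))
    r.append(term)

print(P.polyroots(r[31]))          # a root cloud of the kind shown in fig. 9
```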
acknowledgement

i want to thank my coauthors t. bergkvist, j. borcea, r. bøgvad, a. eremenko, a. gabrielov, g. masson, h. rullgård for the pleasure of working with them and for the numerous insights and results we obtained together. i want to thank the organizers of the miniconference 'analytic and algebraic methods in quantum mechanics, v' for the financial support and the great pleasure of visiting prague in may 2009, where these results were presented.

references

[1] bergkvist, t.: on asymptotics of polynomial eigenfunctions for exactly-solvable differential operators, j. approx. theory 149(2) (2007), 151–187.
[2] bergkvist, t., rullgård, h.: on polynomial eigenfunctions for a class of differential operators, math. res. lett. 9 (2002), 153–171.
[3] bergkvist, t., rullgård, h., shapiro, b.: on bochner-krall orthogonal polynomial systems, math. scand. 94 (2004), 148–154.
[4] borcea, j., bøgvad, r., shapiro, b.: on rational approximation of algebraic functions, adv. math. 204 (2006), 448–480.
[5] borcea, j., shapiro, b.: root asymptotics of spectral polynomials for the lamé operator, comm. math. phys. 282 (2008), 323–337.
[6] borcea, j., bøgvad, r., shapiro, b.: homogenized spectral pencils for exactly solvable operators: asymptotics of polynomial eigenfunctions, publ. rims 45 (2009), 525–568.
[7] gabrielov, a., eremenko, a., shapiro, b.: zeros of eigenfunctions of some anharmonic oscillators, annales de l'institut fourier 58(2) (2008), 603–624.
[8] gabrielov, a., eremenko, a., shapiro, b.: high energy eigenfunctions of one-dimensional schrödinger operators with polynomial potentials, comput. methods funct. theory 8(2) (2008), 513–529.
[9] holst, t., shapiro, b.: on higher heine-stieltjes polynomials, to appear in isr. j. math.
[10] masson, g., shapiro, b.: on polynomial eigenfunctions of a hypergeometric-type operator, exper. math. 10 (2001), 609–618.
[11] shapiro, b.: algebro-geometric aspects of heine-stieltjes theory, submitted.
[12] shapiro, b., takemura, k., tater, m.: on spectral polynomials of the heun equation. ii, submitted.

prof. boris shapiro
e-mail: shapiro@math.su.se
department of mathematics
stockholm university
se-106 91, stockholm, sweden

acta polytechnica vol. 49 no. 2–3/2009

on tree pattern matching by pushdown automata

t. flouri

tree pattern matching is an important operation in computer science, on which a number of tasks such as mechanical theorem proving, term rewriting, symbolic computation and non-procedural programming languages are based. work has begun on a systematic approach to the construction of tree pattern matchers by deterministic pushdown automata which read subject trees in prefix notation. the method is analogous to the construction of string pattern matchers: for given patterns, a non-deterministic pushdown automaton is created and then it is determinised. in this first paper, we present the proposed non-deterministic pushdown automaton which will serve as a basis for the determinisation process, and prove its correctness.

keywords: tree, tree pattern, pushdown automaton, tree pattern matching.

1 introduction

tree pattern matching is one of the fundamental problems with many applications, and is often declared to be analogous to the problem of string pattern matching [3, 5, 17]. string pattern matching is the problem of finding all occurrences of string patterns and their positions in a given text. a model of computation for string pattern matching is the finite automaton [8]. one of the basic approaches used for string pattern matching is represented by finite automata which are constructed for string patterns, which means that the patterns are preprocessed. given a text of size n, such finite automata typically perform the search phase in time linear in n (see [8, 9, 19] for a survey). tree pattern matching is the problem of finding all occurrences of tree patterns, and their positions, in a subject tree. although many tree pattern matching methods have been described [4, 5, 6, 10, 11, 13, 14, 16, 17, 18, 20, 21], most of them fail to provide a search phase in linear time (based on the size of the subject tree) or have huge memory requirements. this paper presents the first attempt to perform tree pattern matching by a unified and systematic approach using pushdown automata. we present the first, basic, non-deterministic model of a pushdown automaton performing tree pattern matching. the goal of this research is to provide a method for determinising the non-deterministic model of the proposed pushdown automaton, which will make linear time (based on the size of the subject tree) pattern matching possible.
2 basic notions

2.1 ranked alphabet, tree, prefix notation, tree pattern

we define notions on trees similarly as they are defined in [1, 5, 7, 15, 17]. we denote the set of natural numbers by $\mathbb{n}$. a ranked alphabet is a finite, non-empty set of symbols, each of which has a unique, non-negative arity (or rank). given a ranked alphabet $\Sigma$, the arity of a symbol $a \in \Sigma$ is denoted arity(a). the set of symbols of arity $p$ is denoted by $\Sigma_p$. elements of arity 0, 1, 2, …, p are respectively called nullary, unary, binary, …, p-ary symbols. we assume that $\Sigma$ contains at least one constant. in the examples we use numbers at the end of identifiers for a short declaration of symbols with arity: for instance, a2 is a short declaration of a binary symbol a.

based on the concepts of graph theory (see [1]), a labeled, ordered, ranked tree over a ranked alphabet $\Sigma$ can be defined as follows.

an ordered directed graph $G$ is a pair $(N, R)$, where $N$ is a set of nodes and $R$ is a set of linearly ordered lists of edges, such that each element of $R$ is of the form $((f, g_1), (f, g_2), \ldots, (f, g_n))$, where $f, g_1, g_2, \ldots, g_n \in N$, $n \ge 0$. this element indicates that, for node $f$, there are $n$ edges leaving $f$, the first entering node $g_1$, the second entering node $g_2$, and so forth.

a sequence of nodes $(f_0, f_1, \ldots, f_n)$, $n \ge 1$, is a path of length $n$ from node $f_0$ to node $f_n$ if there is an edge which leaves node $f_{i-1}$ and enters node $f_i$ for $1 \le i \le n$. a cycle is a path $(f_0, f_1, \ldots, f_n)$ where $f_0 = f_n$. an ordered directed acyclic graph (dag) is an ordered directed graph that has no cycle. a labeling of an ordered graph $G = (A, R)$ is a mapping of $A$ into a set of labels. in the examples we use $a_f$ for a short declaration of node $f$ labeled by symbol $a$.

a labeled, ordered, ranked tree $t$ over a ranked alphabet $\Sigma$ is an ordered dag $t = (N, R)$ with a special node $r$ called the root, such that (1) $r$ has in-degree 0, (2) all other nodes of $t$ have in-degree 1, (3) there is just one path from the root $r$ to every $f \in N$, $f \ne r$, (4) every node $f \in N$ is labeled by a symbol $a \in \Sigma$ and the out-degree of $a_f$ is arity(a), (5) nodes labeled by nullary symbols are called leaves.

the prefix notation pref(t) of a labeled, ordered, ranked tree $t$ is obtained by applying the following step recursively, beginning at the root of $t$:

step: let this application of step be to node $a_f$. if $a_f$ is a leaf, list $a_f$ and halt. if $a_f$ is not a leaf, let its direct descendants be $a_{f_1}, a_{f_2}, \ldots, a_{f_n}$. then list $a_f$ and subsequently apply step to $a_{f_1}, a_{f_2}, \ldots, a_{f_n}$, in that order.

we note that in many papers on the theory of tree languages, such as [5, 7, 15, 17], labeled ordered ranked trees are defined with the use of ordered ranked ground terms. ground terms can be regarded as labeled, ordered, ranked trees in prefix notation. therefore, the notions ground term, tree and tree in prefix notation are used interchangeably in these papers.
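the recursive step translates directly into a few lines of code. the following sketch (an illustration, not the paper's code) represents a node as a (symbol, children) pair, with symbol names following the a2/a1/a0 convention used in the examples:

```python
# a direct transcription of step: pref(t) lists the symbol of each node before
# the prefix notations of its subtrees, in left-to-right order.

def pref(node):
    symbol, children = node
    out = [symbol]                  # list a_f first ...
    for child in children:         # ... then apply step to the descendants in order
        out.extend(pref(child))
    return out

# the tree t1 of example 1 below: a2(a2(a0, a1(a0)), a1(a0))
t1 = ("a2", [("a2", [("a0", []), ("a1", [("a0", [])])]),
             ("a1", [("a0", [])])])
print(" ".join(pref(t1)))          # a2 a2 a0 a1 a0 a1 a0
```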
example 1. consider a ranked alphabet $\Sigma = \{a2, a1, a0\}$. consider a tree $t_1$ over $\Sigma$, $t_1 = (\{a2_1, a2_2, a0_3, a1_4, a0_5, a1_6, a0_7\}, R)$, where $R$ is a set of the following ordered sequences of pairs: $((a2_1, a2_2), (a2_1, a1_6))$, $((a2_2, a0_3), (a2_2, a1_4))$, $((a1_4, a0_5))$, $((a1_6, a0_7))$. the prefix notation of tree $t_1$ is $\mathrm{pref}(t_1) = a2\ a2\ a0\ a1\ a0\ a1\ a0$. trees can also be represented graphically, and tree $t_1$ is illustrated in fig. 1.

fig. 1: tree $t_1$ from example 1 and its prefix notation.

the height of a tree $t$, denoted by height(t), is defined as the maximal length of a path from the root of $t$ to a leaf of $t$.

to define a tree pattern, we use a special nullary symbol $s$, $s \notin \Sigma$, which serves as a placeholder for any subtree. a tree pattern is defined as a labeled, ordered, ranked tree over the ranked alphabet $\Sigma \cup \{s\}$. by analogy, a tree pattern in prefix notation is defined as a labeled, ordered, ranked tree over the ranked alphabet $\Sigma \cup \{s\}$ in prefix notation. a pattern $p$ with $k \ge 0$ occurrences of the symbol $s$ matches a tree $t$ at node $n$ if there exist subtrees $t_1, t_2, \ldots, t_k$ (not necessarily the same) of the tree $t$ such that the tree $p'$, obtained from $p$ by substituting the subtree $t_i$ for the $i$-th occurrence of $s$ in $p$, $i = 1, 2, \ldots, k$, is equal to the subtree of $t$ rooted at $n$.

example 2. consider the tree $t_1 = (\{a2_1, a2_2, a0_3, a1_4, a0_5, a1_6, a0_7\}, R)$ from example 1, which is illustrated in fig. 1. consider a tree pattern $p_1$ over $\Sigma \cup \{s\}$, $p_1 = (\{a2_8, s_9, a1_{10}, s_{11}\}, R')$, where $R'$ is a set of the following ordered sequences of pairs: $((a2_8, s_9), (a2_8, a1_{10}))$, $((a1_{10}, s_{11}))$. the prefix notation of tree pattern $p_1$ is $\mathrm{pref}(p_1) = a2\ s\ a1\ s$. the tree pattern $p_1$ is illustrated in fig. 2; it has two occurrences in tree $t_1$, matching at nodes $a2_1$ and $a2_2$ of $t_1$.

fig. 2: tree pattern from example 2 and its prefix notation.
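the matching condition of example 2 can be checked mechanically. the sketch below (an illustration, not the paper's algorithm) matches a pattern against every node of a subject tree by parallel traversal, treating $s$ as a wildcard for a whole subtree; it reuses the (symbol, children) encoding of the earlier sketch:

```python
# an illustrative naive matcher: a pattern matches at a node iff the two trees
# agree symbol-by-symbol, with "s" absorbing an arbitrary subtree.

def matches(pattern, node):
    psym, pchildren = pattern
    if psym == "s":                       # the placeholder matches any subtree
        return True
    nsym, nchildren = node
    return (psym == nsym and len(pchildren) == len(nchildren)
            and all(matches(p, c) for p, c in zip(pchildren, nchildren)))

def occurrences(pattern, node):
    found = [node] if matches(pattern, node) else []
    for child in node[1]:
        found.extend(occurrences(pattern, child))
    return found

t1 = ("a2", [("a2", [("a0", []), ("a1", [("a0", [])])]),
             ("a1", [("a0", [])])])           # tree of example 1
p1 = ("a2", [("s", []), ("a1", [("s", [])])])  # pattern of example 2
print(len(occurrences(p1, t1)))                # 2 (at the nodes labeled a2)
```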
2.2 alphabet, language, context-free grammar, pushdown automaton

let an alphabet be a finite, non-empty set of symbols. a language over an alphabet $\Sigma$ is a set of strings over $\Sigma$. the symbol $\Sigma^*$ denotes the set of all strings over $\Sigma$, including the empty string, which is denoted by $\varepsilon$. the set $\Sigma^+$ is defined as $\Sigma^+ = \Sigma^* \setminus \{\varepsilon\}$. similarly, for a string $x \in \Sigma^*$, the symbol $x^m$, $m \ge 0$, denotes the $m$-fold concatenation of $x$, with $x^0 = \varepsilon$. the set $x^*$ is defined as $x^* = \{x^m : m \ge 0\}$ and $x^+ = x^* \setminus \{\varepsilon\} = \{x^m : m \ge 1\}$.

a context-free grammar (cfg) is a 4-tuple $G = (N, T, P, S)$, where $N$ and $T$ are finite, disjoint sets of nonterminal and terminal symbols, respectively, $P$ is a finite set of rules $A \to \alpha$ with $A \in N$, $\alpha \in (N \cup T)^*$, and $S \in N$ is the start symbol. a cfg $G = (N, T, P, S)$ is said to be in reversed greibach normal form if each rule from $P$ is of the form $A \to \alpha a$, where $a \in T$ and $\alpha \in N^*$. the relation $\Rightarrow$ is called derivation: $\alpha A \gamma \Rightarrow \alpha\beta\gamma$, with $A \in N$ and $\alpha, \beta, \gamma \in (N \cup T)^*$, holds if the rule $A \to \beta$ is in $P$. the symbols $\Rightarrow^+$ and $\Rightarrow^*$ are used for the transitive, and the transitive and reflexive, closure of $\Rightarrow$, respectively. a rightmost derivation $\Rightarrow_{rm}$ is a relation $\alpha A x \Rightarrow_{rm} \alpha\beta x$, where $x \in T^*$. the relation $A \Rightarrow^+ \alpha A \beta$ is called recursion; right recursion is $A \Rightarrow^+ \alpha A$, and hidden-left recursion is $A \Rightarrow^+ \beta A \alpha$, where $\beta \Rightarrow^* \varepsilon$. the language generated by a cfg $G$, denoted by $L(G)$, is the set of strings $L(G) = \{w : S \Rightarrow^* w,\ w \in T^*\}$.

a derivation tree is a labeled, ordered tree representing the syntactic structure of a string $w$ generated by the grammar $G$. its root is labeled by the start symbol $S$, and its leaves are labeled by terminal symbols or empty strings. each interior node of the tree is labeled by a nonterminal symbol $A$, and the children of such a node are labeled, from left to right, by the symbols from the right-hand side $\alpha$ of a rule $A \to \alpha \in P$. a derivation $S \Rightarrow^* w$ corresponds to a derivation tree whose leaves, if read from left to right, are labeled with the string $w$. a cfg $G$ is unambiguous if each string $w \in L(G)$ has just one derivation tree in the cfg $G$. a context-free language is a language generated by a cfg.
a cfg g is lr(0) if the two conditions for g: (1) s aw w rm rm * � �� , (2) s b x y rm rm * � �� , imply that � �ay b x� , that is � �� , a b� , and x y� . if the cfg g is not an lr(0) grammar, then the pda constructed as an lr(0) parser contains conflicts, which means the next transition to be performed cannot be determined according to the contents of the pushdown store only. for cfgs without hidden-left and right recursions, the number of consecutive reductions between the shifts of two adjacent symbols cannot be greater than a constant, and therefore the lr(0) parser for such a grammar can be optimized by precomputing all its reductions. then, the optimized resulting lr(0) parser reads one symbol on each of its transitions [2]. a language l accepted by a pushdown automaton m is defined in two distinct ways: (1) accepting by final state: l m( ) � { : ( , , )x q x z� 0 0 � m q x g q f * * *( , , ) }.� � �� � � � � �� (2) accepting by empty pushdown store: l m x q x z�( ) { :( , , )� 0 0 � m q x q q * *( , , ) }.� � � � � �� 3 a deterministic pushdown automaton accepting trees in prefix notation the prefix notation of a tree can be generated by a grammar g n t p s� ( , , , ), having rules p of the following form: (1)s a� 0 (2)s a s� 1 (3)s a ss� 2 … (n)s a sn n � � � 1 1 30 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 49 no. 2–3/2009 since the grammar is lr(0), belonging to the subclass of context-free grammars named as deterministic context free grammars, the generated language belongs to the class of deterministic context-free languages and can be recognised by a deterministic pushdown automaton. in this section we present the deterministic pushdown automaton m s s� �({ }, , { }, , , , )0 0 0� � , accepting arbitrary trees in prefix notation by empty pushdown store. the transitions of the automaton are in the form �( , , ) ( , )( )0 0x s s arity x� , for each x ��. have in mind that s 0 � �. the automaton, which is input-driven, is depicted in fig. 3. this particular automaton will be the basic building block for our non-deterministic automaton, which will serve as a pattern matcher. theorem 1. the automaton presented in section 3 accepts valid trees in prefix notation by empty pushdown store. proof. there are three possible types of input to be given to the automaton: (1) a valid input, which represents the prefix notation of a tree. (2) an invalid input, in which there exists a prefix that represents a valid prefix notation of a tree. (3) an invalid input which is the prefix of the prefix notation of some (unknown) tree. to show that the first type of (valid) input is accepted by our automaton by empty pushdown store, we use strong induction. let p(n) be a predicate defined over all integers n. predicate p(i) is true, if trees of height i are accepted by the presented deterministic pda. we define the base case and the inductive step in the following manner: (1) base case: p(0) is true. (2) inductive case: p p p n p n( ), ( ), , ( ) ( )0 1 1� � since the initial pushdown store symbol is symbol s, statement (1) is true, as the only trees of height (0) are trees with only one node x having arity 0. the transition to be taken is � �( , , ) ( , )0 0x s � , removing the initial symbol s from the pushdown store. each tree of depth n can be represented as a root x of arity k, where each of its children nodes can be subtrees of height at most n � 1. 
theorem 1. the automaton presented in section 3 accepts valid trees in prefix notation by empty pushdown store.

proof. there are three possible types of input to be given to the automaton: (1) a valid input, which represents the prefix notation of a tree; (2) an invalid input in which there exists a proper prefix that represents a valid prefix notation of a tree; (3) an invalid input which is a proper prefix of the prefix notation of some (unknown) tree.

to show that the first type of (valid) input is accepted by our automaton by empty pushdown store, we use strong induction. let $p(n)$ be a predicate defined over all integers $n \ge 0$: $p(i)$ is true if trees of height $i$ are accepted by the presented deterministic pda. we define the base case and the inductive step in the following manner: (1) base case: $p(0)$ is true; (2) inductive case: $p(0), p(1), \ldots, p(n-1) \Rightarrow p(n)$.

since the initial pushdown store symbol is the symbol $S$, statement (1) is true, as the only trees of height 0 are trees with a single node $x$ of arity 0, and the transition to be taken is $\delta(0, x, S) = (0, \varepsilon)$, removing the initial symbol $S$ from the pushdown store. each tree of height $n$ can be represented as a root $x$ of arity $k$ whose children are subtrees of height at most $n-1$. according to the inductive assumption (2), each of the subtrees can be accepted by the automaton by empty pushdown store, i.e. its processing removes one pushdown store symbol $S$. since processing the root replaces one symbol $S$ by $k$ symbols $S$ (a net gain of $k-1$), and the $k$ subtrees remove $k$ symbols $S$ from the pushdown store, the store is left empty. as a result, we have proven that our automaton accepts valid input (prefix notations of trees) by empty pushdown store.

in the second case (an invalid input in which some proper prefix represents a valid prefix notation of a tree), the pushdown store is emptied at the moment the prefix which represents a valid prefix notation has been read, and the remaining input cannot be processed. in the third case, as is apparent from the first case, the pushdown store is not emptied. we have thus proved that the automaton accepts only valid input (prefix notations of trees) by empty pushdown store. □

corollary 1. processing an arbitrary tree with the automaton introduced in section 3 results in one symbol being removed from the top of the pushdown store.

4 searching non-deterministic pushdown automaton

in this section we present the searching non-deterministic pushdown automaton (snpda), performing tree pattern matching. the structure of an snpda accepting all occurrences of the tree pattern for a given tree in prefix notation is described by algorithm 1. we note that the snpda accepts the matched patterns by final state. the snpda is loosely based on the searching non-deterministic finite automaton, which is used for pattern matching in strings, as described in [19]. it is constructed by extending the deterministic pushdown automaton presented in section 3 with states and transitions corresponding to the given tree pattern.

we start with the deterministic pushdown automaton $M = (Q, \Sigma, G, \delta, q_0, S, F)$ presented in section 3, where $\delta(q_0, x, S) = \{(q_0, S^{\mathrm{arity}(x)})\}$ for each $x \in \Sigma$, $F = \emptyset$ and $G = \{S\}$. the prefix notation of the pattern is read from left to right; for each node $x$ (except the special nullary symbol $s$) at position $i$ (the position of the first node is 1), the following steps are carried out:

1. create a new state $q_i$.
2. in case $i > 1$ do step 3, otherwise do step 4.
3. define a new transition $\delta(q_{i-1}, x_i, S) = \{(q_i, S^{\mathrm{arity}(x_i)})\}$.
4. append a new transition: $\delta(q_0, x_1, S) = \delta(q_0, x_1, S) \cup \{(q_1, S^{\mathrm{arity}(x_1)})\}$.

in case the nullary symbol $s$ is found at position $i$, the following steps are carried out:

1. create a new state $q_i$.
2. define a symbol $\#_j$, where $\#_j \notin G$.
3. add the new symbol $\#_j$ to the pushdown store symbol set $G$: $G = G \cup \{\#_j\}$.
4. define a new transition $\delta(q_{i-1}, \varepsilon, S) = \{(q_0, S\#_j)\}$.
5. define a new transition $\delta(q_0, \varepsilon, \#_j) = \{(q_i, \varepsilon)\}$.

the last created state (state $q_n$) is set as final (that is, $F = \{q_n\}$). examples of pdas constructed by algorithm 1 for various patterns are shown in figs. 4–6.

algorithm 1: construction of a searching non-deterministic pushdown automaton

input: $x = x_1 x_2 \ldots x_n$ — the prefix notation of a tree pattern over a ranked alphabet $\Sigma$
output: $M$ — a non-deterministic pushdown automaton

  $Q \leftarrow \emptyset$;  $G \leftarrow \{S\}$;  $\delta \leftarrow \emptyset$
  for $i \leftarrow 0$ to $n$ do $Q \leftarrow Q \cup \{q_i\}$
  $F \leftarrow \{q_n\}$
  foreach $y \in \Sigma$ do $\delta(q_0, y, S) \leftarrow (q_0, S^{\mathrm{arity}(y)})$
  $j \leftarrow 0$
  for $i \leftarrow 1$ to $n$ do
    if $x_i = s$ then
      $j \leftarrow j + 1$;  $G \leftarrow G \cup \{\#_j\}$
      $\delta(q_{i-1}, \varepsilon, S) \leftarrow (q_0, S\#_j)$
      $\delta(q_0, \varepsilon, \#_j) \leftarrow (q_i, \varepsilon)$
    else
      $\delta(q_{i-1}, x_i, S) \leftarrow (q_i, S^{\mathrm{arity}(x_i)})$
  $M \leftarrow (Q, \Sigma, G, \delta, q_0, S, F)$

fig. 4: searching non-deterministic pushdown automaton for the tree pattern $p = a2\ a1\ s\ a0$.

fig. 5: searching non-deterministic pushdown automaton for the tree pattern $p = a3\ s\ s\ a2\ s\ a0$.

fig. 6: searching non-deterministic pushdown automaton for the tree pattern $p = a2\ a1\ a0\ a0$.
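the construction is mechanical enough to transcribe into code. the sketch below (an illustration, not part of the paper) builds the snpda's transition table for a pattern in prefix notation; states are integers $0..n$ rather than the $q_i$ of the text, and the arities are again taken from the a0/a1/a2 naming convention:

```python
# an illustrative transcription of algorithm 1: a transition maps
# (state, input symbol or "" for epsilon, top of store) to a set of
# (state, pushed string) pairs, the leftmost pushed symbol being the new top.

def build_snpda(pattern, sigma_arities):
    delta = {(0, x, "S"): {(0, ("S",) * k)} for x, k in sigma_arities.items()}
    j = 0
    for i, x in enumerate(pattern, start=1):
        if x == "s":
            j += 1
            delta.setdefault((i - 1, "", "S"), set()).add((0, ("S", f"#{j}")))
            delta.setdefault((0, "", f"#{j}"), set()).add((i, ()))
        else:
            k = sigma_arities[x]
            delta.setdefault((i - 1, x, "S"), set()).add((i, ("S",) * k))
    final = {len(pattern)}
    return delta, final

# the pattern p1 = a2 s a1 s of example 2:
delta, final = build_snpda("a2 s a1 s".split(), {"a2": 2, "a1": 1, "a0": 0})
```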
theorem 2. the snpda constructed by algorithm 1 finds all occurrences of the tree pattern in a subject tree by final state.

proof. we provide a sketch of the proof. a tree pattern, for which the snpda $M = (Q, \Sigma, G, \delta, q_0, S, F)$ is constructed, has either the form $p = x_1 x_2 \ldots x_n$ (form 1), where $x_1 \in \Sigma$ and $x_i \in \Sigma \cup \{s\}$ for $i > 1$, or the form $p = s$ (form 2). the automaton is non-deterministic at state $q_0$: in the case of form 1 this is due to the transitions $\delta(q_0, x_1, S) = \{(q_0, S^{\mathrm{arity}(x_1)}), (q_1, S^{\mathrm{arity}(x_1)})\}$, and in the case of form 2 due to the transition $\delta(q_0, \varepsilon, S) = \{(q_0, S\#)\}$ conflicting with all other transitions. because of this property, the snpda can follow more than one path at each input symbol: it can either cycle through state $q_0$, or move to state $q_1$ and on towards $q_n$ in case the input symbols match the pattern. at the point of a nullary symbol $s$ in the pattern, an $\varepsilon$-transition leading to state $q_0$ is taken, replacing the top of the pushdown store (which is a symbol $S$) with $S\#_j$, where $j$ is distinct for each $s$ in the tree pattern. using this method, we simulate a new pushdown store on top of the current pushdown store; the symbol $\#_j$ denotes the end of the new, simulated pushdown store. from corollary 1 we know that reading a tree by cycling through state $q_0$ removes one symbol $S$ from the pushdown store. as a result, the top of the pushdown store will be $\#_j$, which indicates that a tree (required by the respective symbol $s$ in the tree pattern) has been processed. the $\#_j$-transition can now be taken to resume pattern matching at the point after the respective symbol $s$ in the pattern. while reading a tree by cycling at state $q_0$, a new pattern occurrence can be detected, since the automaton is non-deterministic. □

note that the snpda in fig. 6 is input-driven, and thus it can be determinised in the same way as finite automata. the deterministic version is illustrated in fig. 7.

fig. 7: deterministic searching pushdown automaton for the tree pattern $p = a2\ a1\ a0\ a0$.
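to see theorem 2 in action, one can simulate the snpda breadth-first over all of its configurations. the sketch below (illustrative, not from the paper) reuses the build_snpda table of the previous sketch, runs it on the tree $t_1$ of example 1 and reports the input positions at which the final state is reached:

```python
# an illustrative breadth-first simulation of the snpda: a configuration is
# (state, store) with the store a tuple, top at index 0. epsilon transitions
# are closed before each input symbol; reaching the final state marks the end
# position of one pattern occurrence.

def eclose(configs, delta):
    stack, seen = list(configs), set(configs)
    while stack:
        state, store = stack.pop()
        if store:
            for nstate, push in delta.get((state, "", store[0]), ()):
                nconf = (nstate, push + store[1:])
                if nconf not in seen:
                    seen.add(nconf)
                    stack.append(nconf)
    return seen

def match_positions(delta, final, subject):
    configs = eclose({(0, ("S",))}, delta)
    hits = []
    for pos, x in enumerate(subject, start=1):
        step = set()
        for state, store in configs:
            if store:
                for nstate, push in delta.get((state, x, store[0]), ()):
                    step.add((nstate, push + store[1:]))
        configs = eclose(step, delta)
        hits += [pos for state, _ in configs if state in final]
    return hits

subject = "a2 a2 a0 a1 a0 a1 a0".split()   # tree t1 of example 1
delta, final = build_snpda("a2 s a1 s".split(), {"a2": 2, "a1": 1, "a0": 0})
print(match_positions(delta, final, subject))  # [5, 7]: the two occurrences
```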
5 conclusion and future work

in this paper we have presented an innovative method of tree pattern matching by pushdown automata. we have introduced a non-deterministic model of the searching pushdown automaton, which correctly accepts all occurrences of a pattern in a given tree presented in its prefix notation. our goal is to perform determinisation of this automaton, which will lead to linear time (in the size of the subject tree) searching of patterns, as in the case of string pattern matching by deterministic finite automata. work on the determinisation of this automaton has already begun, and the first results were presented at the london stringology days conference held at king's college, london [12].

acknowledgements

the research described in this paper was supervised by prof. bořivoj melichar, drsc., of fee ctu in prague, and supported by the czech ministry of education, youth and sports under research program msm 6840770014, and by the czech science foundation as project no. 201/09/0807.

references

[1] aho, a. v., ullman, j. d.: the theory of parsing, translation and compiling. vol. 1: parsing, vol. 2: compiling. new york: prentice-hall, 1972.
[2] aycock, j., horspool, n., janoušek, j., melichar, b.: even faster generalized lr parsing. in: acta informatica, berlin: springer, vol. 37 (2001), no. 9, p. 633–651.
[3] bille, p.: pattern matching in trees and strings. phd thesis, it university of copenhagen, copenhagen, 2008.
[4] chase, d.: an improvement to bottom-up tree pattern matching. in: proc. 14th ann. acm symp. on principles of programming languages, 1987, p. 168–177.
[5] cleophas, l.: tree algorithms. two taxonomies and a toolkit. phd thesis, technische universiteit eindhoven, eindhoven, 2008.
[6] cole, r., hariharan, r., indyk, p.: tree pattern matching and subset matching in deterministic o(n log³ n) time. in: proceedings of the 10th acm-siam symposium on discrete algorithms, 1999, p. 245–254.
[7] comon, h., dauchet, m., gilleron, r., löding, c., jacquemard, f., lugiez, d., tison, s., tommasi, m.: tree automata techniques and applications. available on: http://www.grappa.univ-lille3.fr/tata, release october 12th, 2007.
[8] crochemore, m., hancart, ch.: automata for matching patterns. in: vol. 2: linear modeling: background and application. handbook of formal languages, berlin, heidelberg: springer, 1997, p. 399–462.
[9] crochemore, m., rytter, w.: text algorithms. oxford university press, 1994.
[10] dubiner, m., galil, z., magen, e.: faster tree pattern matching. in: journal of the acm, vol. 41 (1994), no. 2, p. 205–213.
[11] ferdinand, ch., seidl, h., wilhelm, r.: tree automata for code selection. in: acta informatica, berlin: springer, vol. 31 (1994), no. 8, p. 741–760.
[12] flouri, t., melichar, b., janoušek, j.: tree pattern matching by pushdown automata. abstract in: lsd 2009 talks' abstracts, department of computer science, king's college, london, p. 5.
[13] fraser, ch. w., hanson, d. a., proebsting, t. a.: engineering a simple, efficient code generator. in: acm letters on programming languages and systems, vol. 1 (1992), no. 3, p. 213–226.
[14] fraser, ch. w., henry, r. r., proebsting, t. a.: burg: fast optimal instruction selection and tree parsing. in: acm sigplan notices, vol. 27 (1992), p. 68–76.
[15] gecseg, f., steinby, m.: tree languages. in: vol. 3: beyond words. handbook of formal languages. berlin, heidelberg: springer, 1997, p. 1–68.
[16] glanville, r. s., graham, s. l.: a new approach to compiler code generation. in: proc. 5th acm symposium on principles of programming languages, 1978, p. 231–240.
[17] hoffmann, c. m., o'donnell, m. j.: pattern matching in trees. in: journal of the acm, vol. 29 (1982), no. 1, p. 68–95.
[18] madhavan, m., shankar, p., siddhartha, r., ramakrishna, u.: extending graham-glanville techniques for optimal code generation. in: acm transactions on programming languages and systems, vol. 22 (2000), no. 6, p. 973–1001.
[19] melichar, b., holub, j., polcar, j.: text searching algorithms. available on: http://stringology.org/athens/, release november 2005.
[20] ramesh, r., ramakrishnan, i. v.: nonlinear tree pattern matching. in: journal of the acm, vol. 39 (1992), no. 2, p. 295–316.
[21] shankar, p., gantait, a., yuvaraj, a., madhavan, m.: a new algorithm for linear regular tree pattern matching. in: theoretical computer science, elsevier, vol. 242 (2000), no. 1–2, p. 125–142.
tomáš flouri
e-mail: flourt1@fel.cvut.cz
department of computer science and engineering
czech technical university in prague
faculty of electrical engineering
karlovo nám. 13
121 35 prague 2, czech republic
acta polytechnica vol. 50 no. 3/2010

m2-branes in n = 3 harmonic superspace

e. ivanov

invited talk at the conference "selected topics in mathematical and particle physics", in honor of the 70th birthday of jiri niederle, prague, 5–7 may 2009

abstract

we give a brief account of the recently proposed n = 3 superfield formulation of the n = 6, 3d superconformal theory of aharony et al. (abjm) describing a low-energy limit of the system of multiple m2-branes on the ads4 × s7/zk background. this formulation is given in harmonic n = 3 superspace and reveals a number of surprising new features. in particular, the sextic scalar potential of abjm arises at the on-shell component level as the result of eliminating appropriate auxiliary fields, while there is no explicit superpotential at the off-shell superfield level.

1 preliminaries: ads/cft

1.1 ads/cft in iib superstring

as the starting point, i recall the essentials of the original ads/cft correspondence (for details see [1] and references therein). it is the conjecture that the iib superstring on ads5 × s5 is in some sense dual to the maximally supersymmetric n = 4, 4d super yang-mills (sym) theory. this hypothesis is to a large extent based upon the coincidence of the symmetry groups of both theories.
indeed,
$$ads_5\times s^5 \sim \frac{so(2,4)}{so(1,4)}\times\frac{so(6)}{so(5)} \subset \frac{su(2,2|4)}{so(1,4)\times so(5)},$$
so the superisometries of this background constitute the supergroup su(2,2|4). on the other hand, the supergroup su(2,2|4) defines the superconformal invariance of n = 4 sym, with so(2,4) and so(6) ∼ su(4) being, respectively, the 4d conformal group and the r-symmetry group. some related salient features of the ads/cft correspondence are as follows.

• ads5 × s5 (plus a constant closed 5-form on s5) is the bosonic "body" of the maximally supersymmetric curved solution $\frac{su(2,2|4)}{so(1,4)\times so(5)}$ of iib, 10d supergravity. it preserves 32 supersymmetries.
• the n = 4 sym action with the gauge group u(n) is the low-energy limit of a gauge-fixed action of a stack of n coincident d3-branes on ads5 × s5: 4 worldvolume co-ordinates of the latter system become the minkowski space-time co-ordinates, while the 6 transverse (u(n) algebra-valued) d3-brane co-ordinates yield just the 6 scalar fields of the non-abelian n = 4, 4d gauge multiplet.
• this system has the following on-shell content: 6 bosons and 16/2 = 8 fermions (all u(n) algebra-valued); the 2 "missing" bosonic degrees of freedom which are required by world-volume n = 4 supersymmetry come from a gauge field. this is a "heuristic" explanation of why just d3-branes, with the gauge fields contributing non-trivial degrees of freedom on shell, matter in the case of the ads5/cft4 correspondence.

1.2 ads/cft in m-theory

recently, there has been a surge of interest in another example of ads/cft duality, this time related to m-theory and the iia superstring. the fundamental (though not explicitly formulated as yet) m-theory can be defined as a strong-coupling limit of the iia, 10d superstring, with 11d supergravity as the low-energy limit. it has the following maximally supersymmetric classical curved solution:
$$ads_4\times s^7 \sim \frac{so(2,3)}{so(1,3)}\times\frac{so(8)}{so(7)} \subset \frac{osp(8|4)}{so(1,3)\times so(7)}$$
(plus a constant closed 7-form on s7), which preserves 32 supersymmetries. when trying to treat this option within the general ads/cft correspondence (like the previously discussed ads5 × s5 example), the following natural questions arise.

• what is the cft dual to this geometry?
1. it should be some 3d analog of n = 4 sym and should arise as a low-energy limit of multiple m2-branes (membranes of m-theory, analogs of the d3-branes of the iib superstring).
2. hence it should contain 8 (gauge algebra-valued) scalar fields which originate from the transverse co-ordinates of the m2-branes.
3. it should contain off-shell 16 physical fermions (16 other fermionic modes can be gauged away by the relevant κ symmetry).
4. finally, it should be superconformal, with osp(8|4) realized as the n = 8, 3d superconformal group.
• on shell there should be 8 + 16/2 = 8 + 8 degrees of freedom. hence the gauge fields should not contribute any degree of freedom on shell (in drastic contrast with the "type iib / n = 4 sym" correspondence).

the unique possibility which meets all these demands is that the dual theory is some supersymmetric extension of chern-simons gauge theory [2].

2 chern-simons theories

the standard bosonic chern-simons (cs) action is as follows:
$$s_{cs} = \frac{k}{4\pi}\,\mathrm{tr}\!\int d^3x\,\varepsilon^{mns}\left(a_m\partial_n a_s + \frac{2i}{3}\,a_m a_n a_s\right) \qquad (2.1)$$
$$\Rightarrow\quad f_{mn} = \partial_m a_n - \partial_n a_m + [a_m, a_n] = 0,$$
i.e. the ym field $a_n$ is pure gauge on shell. the n = 1 superextension of the cs action is obtained by extending $a_n$ to the n = 1 gauge supermultiplet:
$$a_n \;\Rightarrow\; (a_n, \chi_\alpha),\quad \alpha = 1, 2;\qquad l_{cs}(a)\;\Rightarrow\; l_{cs}(a) - \mathrm{tr}(\bar\chi\chi). \qquad (2.2)$$
the fermionic field χ is auxiliary, and no dynamical (dirac) equation for it appears.
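as a one-line check of the statement that the pure cs theory carries no propagating modes, one can vary (2.1). the short latex-style fragment below is a standard textbook computation, not taken from the talk:

```latex
% a standard check (not from the talk): varying (2.1) and integrating by parts,
% using cyclicity of the trace (and conventions in which the cubic term
% reassembles the commutator in f_mn), one finds
\begin{equation*}
\delta s_{cs} \;=\; \frac{k}{4\pi}\,\mathrm{tr}\!\int d^3x\;
\varepsilon^{mns}\,\delta a_m\,f_{ns}\,,
\qquad
f_{ns} \;=\; \partial_n a_s-\partial_s a_n+[a_n,a_s]\,,
\end{equation*}
% so stationarity for arbitrary delta a_m is precisely the flatness condition
% f_mn = 0 quoted above: the cs gauge field carries no local degrees of freedom.
```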
the same phenomenon takes place in the case of the n = 2 and n = 3 superextensions of the pure cs action: the physical fermionic fields (having standard kinetic terms) can appear only from matter supermultiplets coupled to the cs one.

keeping in mind these general properties of supersymmetric chern-simons theories, schwarz assumed [2] that the theory dual to ads4 × s7 must be the n = 8 superextension of the 3d cs theory, i.e. one should deal with the on-shell supermultiplet $(a_m, \phi^I, \psi^B_\alpha)$, $I = 1,\ldots,8$, $B = 1,\ldots,8$. how does one gain physical kinetic terms for the 16 (u(n) algebra-valued) fermions? the recipe: place the latter into matter multiplets of the manifest n = 1, n = 2 or n = 3 supersymmetries, consider the relevant combined "cs + matter" actions, and realize the extra supersymmetries as hidden ones mixing the cs supermultiplet with the matter multiplets.

3 blg and abjm models

3.1 attempts toward n = 8 cs theory

the first attempt to formulate the appropriate cs theory was undertaken by j. schwarz in 2004 [2]. he used the n = 2, 3d superfield formalism and tried to construct the n = 8 superconformal cs theory as the n = 2 cs theory plus 4 complex matter chiral superfields (with the off-shell content consisting of 8 physical bosons, 16 fermions and 8 auxiliary fields). however, these attempts failed. as became clear later, the reason for this failure is that the standard assumption that both matter and gauge fields are in the adjoint of the gauge group proves to be wrong in this specific case.

such a theory was constructed by bagger and lambert [3] and gustavsson [4]. the basic assumption of blg was that the scalar fields and fermions take values in an unusual "three-algebra"
$$[t^a, t^b, t^c] = f^{abc}{}_d\, t^d. \qquad (3.1)$$
the gauge group acts as automorphisms of this algebra, the gauge fields being still in the adjoint. the totally antisymmetric "structure" constants of the 3-algebra should satisfy a fundamental jacobi-type identity
$$f^{abc}{}_d\, f^{egh}{}_d + \text{some permutations of indices} = 0. \qquad (3.2)$$
blg managed to define n = 8 (on-shell) supersymmetry in such a system and to construct the invariant lagrangian
$$l_{n=8} = \tilde l_{cs}(a) + \text{covariantized kinetic terms of } \phi^I, \psi^A + \text{6-th order potential of } \phi^I + \ldots,$$
where $\tilde l_{cs}(a)$ is some generalization of the lagrangian in (2.1). all terms involve the constants $f^{abc}{}_d$ and contain only one free parameter, the cs level k.

3.2 problems with the blg construction

assuming that the 3-algebra is finite-dimensional and no ghosts are present among the scalar fields, the only solution of the fundamental identity (3.2) proved to be $f^{abcd} = \varepsilon^{abcd}$, $a, b = 1, 2, 3, 4$. thus the only admissible gauge group is so(4) ∼ su(2)_l × su(2)_r, and $\phi^I$, $\psi^A$ are in the "bi-fundamental" representation of this gauge group (in fact these are just so(4) vectors). no generalization to higher-dimensional gauge groups with a finite number of generators and a positive-definite killing metric is possible. the su(2) × su(2) gauge group case can be shown to correspond to just two m2-branes. how does one describe a system of n m2-branes?

3.3 way out: abjm construction

aharony, bergman, jafferis and maldacena in 2008 [5] proposed a way to evade this restriction on the gauge group. their main observation was that there is no need in exotic 3-algebras to achieve this at all: the fields $\phi^I$, $\psi^A$ should always be in the bi-fundamental of the gauge group u(n) × u(n), while the gauge fields remain in the adjoint.
the abjm theory is in fact dual to m-theory on ads4 × s7/zk, and in general it respects only n = 6 supersymmetry and so(6) r-symmetry. the invariant action is a low-energy limit of the worldvolume action of n coincident m2-branes on this manifold. for the gauge group su(2) × su(2), the abjm theory is equivalent to the blg theory. the full on-shell symmetry of the abjm action is the n = 6, 3d superconformal symmetry osp(6|4). characteristic features of this action are the presence of a sextic scalar potential of special form and the absence of any free parameter except for the cs level k. this k is common to both u(n) cs actions, which should appear with a relative minus sign (only in this case is there invariance under n = 6 supersymmetry).

3.4 superfield formulations

off-shell superfield formulations make manifest the underlying supersymmetries and frequently reveal unusual geometric properties of supersymmetric theories. thus it was advantageous to find a superfield formulation of the abjm model with the maximal number of supersymmetries being manifest and off-shell. n = 1 and n = 2 off-shell superfield formulations were given in refs. [6, 7, 8]. they allowed one to partly clarify the origin of the interactions of the scalar and spinor component fields. on-shell n = 6 and n = 8 formulations were also constructed for both the abjm and blg models (see e.g. [9, 10, 11]). the maximally possible off-shell supersymmetry for the cs theory coupled to matter is n = 3, 3d supersymmetry [12, 13]. thus it was an urgent problem to reformulate the general abjm models in n = 3, 3d superspace. this was recently done in [14]. this formulation uses the n = 3, 3d version [12] of the n = 2, 4d harmonic superspace [15, 16].

4 n = 3 superfield formulation of the abjm model

4.1 n = 3, 3d harmonic superspace

n = 3, 3d harmonic superspace (hss) is an extension of the standard real n = 3, 3d superspace by the harmonic variables parametrizing the sphere $s^2 \sim su(2)_r/u(1)_r$:
$$(x^m, \theta^{(ik)}_\alpha) \;\Rightarrow\; (x^m, \theta^{(ik)}_\alpha, u^{\pm}_j), \qquad u^{\pm i} \in su(2)_r/u(1)_r, \quad u^{+i}u^{-}_i = 1, \qquad (4.1)$$
with $m, n = 0, 1, 2$; $i, k, j = 1, 2$; $\alpha = 1, 2$. the most important feature of the n = 3, 3d hss is the presence of an analytic subspace in it, with a lesser number of grassmann variables (two 3d spinors, as opposed to the three such spinor coordinates of the standard superspace):
$$(\zeta^m) \equiv (x^m_a, \theta^{++}_\alpha, \theta^{0}_\alpha, u^{\pm}_k), \qquad \theta^{++}_\alpha = \theta^{(ik)}_\alpha u^{+}_i u^{+}_k, \quad \theta^{0}_\alpha = \theta^{(ik)}_\alpha u^{+}_i u^{-}_k. \qquad (4.2)$$
it is closed under both the n = 3, 3d poincaré supersymmetry and its superconformal extension osp(3|4). all the basic objects of the n = 3 superspace formulation live as unconstrained superfields on this subspace:

1. gauge superfields $v^{++}(\zeta)$, with
$$\delta v^{++} = -d^{++}\lambda(\zeta) - [v^{++}, \lambda], \qquad \lambda = \lambda(\zeta). \qquad (4.3)$$
2. matter superfields (hypermultiplets) $(q^{+}(\zeta), \bar q^{+}(\zeta))$, with
$$q^{+} = u^{+}_i f^{i} + (\theta^{++\alpha}u^{-}_k - \theta^{0\alpha}u^{+}_k)\,\psi^{k}_\alpha + \text{an infinite tail of auxiliary fields}. \qquad (4.4)$$

in eq. (4.3), $d^{++}$ is the analyticity-preserving derivative on the harmonic sphere $s^2$.

4.2 n = 3 action

the n = 3 superspace formulation of the u(n) × u(n) abjm model [14] involves:

1. the gauge superfields $v^{++}_l$ and $v^{++}_r$ for the left and right gauge u(n) groups. both of them have the following field contents in the wess-zumino gauge:
$$v^{++} \sim (a_m, \phi^{(kl)}, \lambda_\alpha, \chi^{(kl)}_\alpha, x^{(kl)}), \qquad (4.5)$$
i.e. (8+8) fields.
2. the hypermultiplets $q^{+a}$, $\bar q^{+}_a$, $a = 1, 2$, in the bi-fundamental of u(n) × u(n) (one u(n) index in the fundamental, the other in the antifundamental, each running from 1 to n). each hypermultiplet $q^{+a}$ contributes (8+16) physical fields off shell ((8+8) on shell).
the full superfield action is as follows:
$$s_{n=3} = s_{cs}(v^{++}_l) - s_{cs}(v^{++}_r) + \int d\zeta^{(-4)}\ \bar q^{+}_a \nabla^{++} q^{+a}, \qquad (4.6)$$
$$\nabla^{++} q^{+a} = d^{++} q^{+a} + v^{++}_l q^{+a} - q^{+a} v^{++}_r.$$

4.3 some salient features of the n = 3 formulation

• though the gauge superfield cs actions are given by integrals over the full harmonic superspace, their variations with respect to $v^{++}_l$, $v^{++}_r$ are represented by integrals over the analytic subspace:
$$\delta s_{cs} = -\frac{ik}{4\pi}\,\mathrm{tr}\!\int d\zeta^{(-4)}\ \delta v^{++}\, w^{++}, \qquad w^{++} = w^{++}(\zeta),\quad \nabla^{++} w^{++} = 0. \qquad (4.7)$$
as a result, the equations of motion are written solely in terms of analytic superfields, in the simple form
$$w^{++}_l = -i\,\frac{4\pi}{k}\, q^{+a}\bar q^{+}_a, \qquad w^{++}_r = -i\,\frac{4\pi}{k}\, \bar q^{+}_a q^{+a}, \qquad \nabla^{++} q^{+a} = \nabla^{++}\bar q^{+}_a = 0. \qquad (4.8)$$
• the n = 3 superfield action, in contrast to the n = 0, n = 1 and n = 2 superfield abjm actions, does not involve any explicit superfield potential, only minimal couplings to the gauge superfields. the correct 6-th order scalar potential emerges on shell after eliminating the appropriate auxiliary fields from both the cs and the hypermultiplet sectors.
• three hidden supersymmetries completing the manifest n = 3 supersymmetry to n = 6 are realized by the simple transformations
$$\delta v^{++}_l = \frac{8\pi}{k}\,\epsilon^{\alpha(ab)}\theta^{0}_\alpha\, q^{+}_a \bar q^{+}_b, \qquad \delta v^{++}_r = \frac{8\pi}{k}\,\epsilon^{\alpha(ab)}\theta^{0}_\alpha\, \bar q^{+}_a q^{+}_b, \qquad \delta q^{+}_a = i\,\epsilon^{\alpha(ab)}\nabla^{0}_\alpha q^{+}_b, \qquad (4.9)$$
where $\nabla^{0}_\alpha$ is the properly covariantized derivative with respect to $\theta^{0\alpha}$.
• the hidden r-symmetry transformations extending the r-symmetry of the n = 3 supersymmetry to so(6) also have a very transparent representation in terms of the basic analytic superfields.
• the n = 3 harmonic superspace formulation makes it manifest that the hidden n = 6 supersymmetry is compatible with other product gauge groups, e.g. with u(n) × u(m), n ≠ m, and with other types of bi-fundamental representation for the hypermultiplets. the hidden supersymmetry transformations have the universal form in all cases and suggest a simple criterion as to which gauge groups admit this hidden supersymmetry. in this way one can e.g. reproduce, at the n = 3 superfield level, the classification of admissible gauge groups worked out at the component level by schnabl and tachikawa in [17].
• the enhancement of the hidden n = 6 supersymmetry to n = 8, and of the r-symmetry so(6) to so(8), in the case of the gauge group su(2)_k × su(2)_{−k} is also very easily seen in the n = 3 superfield formulation. actually, this enhancement arises already in the case of the gauge group u(1) × u(1) with a doubled set of hypermultiplets (with 16 physical bosons, as compared to the 8 such bosons in the "minimal" u(1) × u(1) case [18]).
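the mechanism by which a sextic potential can arise purely from eliminating auxiliary fields is easy to illustrate in a toy model. the schematic latex fragment below is only such an illustration (it is not the abjm computation, whose auxiliary sector is far richer):

```latex
% a toy illustration (not the abjm computation): a lagrangian with an auxiliary
% field x coupled cubically to a physical scalar phi,
\begin{equation*}
l \;=\; \partial_m\bar\phi\,\partial^m\phi \;+\; \bar x x
        \;+\; g\left(x\,\phi^3 + \bar x\,\bar\phi^3\right),
\end{equation*}
% has the algebraic equation of motion x = -g \bar\phi^3; substituting it back,
\begin{equation*}
l\big|_{\text{on shell}} \;=\; \partial_m\bar\phi\,\partial^m\phi
        \;-\; g^2\,|\phi|^6,
\end{equation*}
% i.e. a sextic scalar potential appears on shell although no sextic
% (indeed, no) potential term for phi alone was present off shell.
```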
5 outlook

in conclusion, let me list some further problems which can be studied within the n = 3 superfield formulation sketched above.

• construction and study of the quantum effective action of the abjm-type models in the n = 3 superfield formulation. the fact that the superfield equations of motion are given solely in the analytic subspace hopefully implies some powerful non-renormalizability theorems [19].
• computing the correlation functions of composite operators directly in the n = 3 superfield approach, as comprehensive checks of the considered version of the ads4/cft3 correspondence.
• a study of the interrelations between the low-energy actions of m2- and d2-branes using the higgs mechanism [20], in which the second system is interpreted as a higgs phase of the first one.
• constructing the full effective actions of m2-branes in terms of the n = 3 superfields (with a nambu-goto action for the scalar fields in the case of one m2-brane, and its non-abelian generalization for n branes).
• etc.

acknowledgement

i thank the organizers of jiri niederle's fest for inviting me to present this talk, and my co-authors in refs. [14] and [19] for our fruitful collaboration. i acknowledge support from grants of the votruba-blokhintsev and the heisenberg-landau programs, as well as from rfbr grants 08-02-90490, 09-02-01209 and 09-01-93107.

references

[1] aharony, o., gubser, s. s., maldacena, j. m., ooguri, h., oz, y.: large n field theories, string theory and gravity, phys. rept. 323 (2000) 183, arxiv:hep-th/9905111.
[2] schwarz, j. h.: superconformal chern-simons theories, jhep 0411 (2004) 078, arxiv:hep-th/0411077.
[3] bagger, j., lambert, n.: modeling multiple m2's, phys. rev. d75 (2007) 045020, arxiv:hep-th/0611108; gauge symmetry and supersymmetry of multiple m2-branes, phys. rev. d77 (2008) 065008, arxiv:0711.0955 [hep-th]; comments on multiple m2-branes, jhep 0802 (2008) 105, arxiv:0712.3738 [hep-th]; three-algebras and n = 6 chern-simons gauge theories, phys. rev. d79 (2009) 025002, arxiv:0807.0163 [hep-th].
[4] gustavsson, a.: algebraic structures on parallel m2-branes, nucl. phys. b811 (2009) 66, arxiv:0709.1260 [hep-th]; selfdual strings and loop space nahm equations, jhep 0804 (2008) 083, arxiv:0802.3456 [hep-th].
[5] aharony, o., bergman, o., jafferis, d. l., maldacena, j.: n = 6 superconformal chern-simons-matter theories, m2-branes and their gravity duals, jhep 0810 (2008) 091, arxiv:0806.1218 [hep-th].
[6] benna, m., klebanov, i., klose, t., smedback, m.: superconformal chern-simons theories and ads4/cft3 correspondence, jhep 0809 (2008) 072, arxiv:0806.1519 [hep-th].
[7] mauri, a., petkou, a. c.: an n = 1 superfield action for m2 branes, phys. lett. b666 (2008) 527, arxiv:0806.2270 [hep-th].
[8] cherkis, s., sämann, c.: multiple m2-branes and generalized 3-lie algebras, phys. rev. d78 (2008) 066019, arxiv:0807.0808 [hep-th].
[9] cederwall, m.: n = 8 superfield formulation of the bagger-lambert-gustavsson model, jhep 0809 (2008) 116, arxiv:0808.3242 [hep-th]; superfield actions for n = 8 and n = 6 conformal theories in three dimensions, jhep 0810 (2008) 070, arxiv:0809.0318 [hep-th].
[10] bandos, i. a.: nb blg model in n = 8 superfields, phys. lett. b669 (2008) 193, arxiv:0808.3568 [hep-th].
[11] samtleben, h., wimmer, r.: n = 8 superspace constraints for three-dimensional gauge theories, jhep 1002 (2010) 070, arxiv:0912.1358 [hep-th].
[12] zupnik, b. m., khetselius, d. v.: three-dimensional extended supersymmetry in harmonic superspace, sov. j. nucl. phys. 47 (1988) 730.
[13] kao, h.-c., lee, k.: self-dual chern-simons higgs systems with n = 3 extended supersymmetry, phys. rev. d46 (1992) 4691.
[14] buchbinder, i. l., ivanov, e. a., lechtenfeld, o., pletnev, n. g., samsonov, i. b., zupnik, b. m.: abjm models in n = 3 harmonic superspace, jhep 0903 (2009) 096, arxiv:0811.4774 [hep-th].
[15] galperin, a., ivanov, e., kalitzin, s., ogievetsky, v., sokatchev, e.: unconstrained n = 2 matter, yang-mills and supergravity theories in harmonic superspace, class. quant. grav. 1 (1984) 469.
[16] galperin, a. s., ivanov, e. a., ogievetsky, v. i., sokatchev, e. s.: harmonic superspace, cambridge university press, 2001, 306 p.
[17] schnabl, m., tachikawa, y.: classification of n = 6 superconformal theories of abjm type, arxiv:0807.1102 [hep-th].
[18] bandres, m. a., lipstein, a. e., schwarz, j. h.: studies of the abjm theory in a formulation with manifest su(4) r-symmetry, jhep 0809 (2008) 027, arxiv:0807.0880 [hep-th].
[19] buchbinder, i. l., ivanov, e. a., lechtenfeld, o., pletnev, n. g., samsonov, i. b., zupnik, b. m.: quantum n = 3, d = 3 chern-simons matter theories in harmonic superspace, jhep 0910 (2009) 075, arxiv:0909.2970 [hep-th].
[20] mukhi, s., papageorgakis, c.: m2 to d2, jhep 0805 (2008) 085, arxiv:0803.3218 [hep-th].
[19] buchbinder, i. l., ivanov, e. a., lechtenfeld, o., pletnev, n. g., samsonov, i. b., zupnik, b. m.: quantum n = 3, d = 3 chern-simons matter theories in harmonic superspace, jhep 0910 (2009) 075, arxiv:0909.2970 [hep-th].
[20] mukhi, s., papageorgakis, c.: m2 to d2, jhep 0805 (2008) 085, arxiv:0803.3218 [hep-th].

evgeny ivanov
e-mail: eivanov@theor.jinr.ru
bogoliubov laboratory of theoretical physics, jinr, 141980, dubna, moscow region, russia

replacing natural gas by biogas — determining the bacterial contamination of biogas by pcr

jiřina čermáková1, jakub mrázek2, kateřina fliegerová2, daniel tenkrát1
1 institute of chemical technology, department of gas, coke and air protection, technická 5, 168 48 praha, czech republic
2 institute of animal physiology and genetics, as cr, v.v.i., vídeňská 1083, 142 20 prague, czech republic
correspondence to: cermakoi@vscht.cz

abstract

a promising way of using biogas is to upgrade it to natural gas quality; the product is referred to as substitute natural gas (sng) or biomethane. biomethane, or biogas, is produced by biological processes harnessing the ability of microorganisms to degrade organic material to methane. some of the microorganisms are aerosolized from the digester into the biogas; afterwards they form a biofilm that attaches to the surfaces of the distribution pipes and can find its way to the place where the end use of the biogas takes place. this paper deals with the detection of microbial species in biogas by molecular techniques, their influence on corrosion, and the potential risk that diseases can be spread via biogas. using molecular methods, we found that raw biogas contains about 8 million microorganisms per m3, which is most likely the result of microbial transmission from the anaerobic digestion process. some bacterial species may contribute to the corrosion of pipelines and equipment; others are opportunistic pathogens that can cause toxic reactions. however, most bacterial species, more than 40 % in biogas, are still unknown, as is their influence on the digestion process and on human health. further studies are needed to better understand the behavior of microorganisms in anaerobic digestion and to prevent microbially influenced corrosion and microbial dissemination.

keywords: biomethane, pcr, microorganisms.

1 introduction

biogas is produced by anaerobic digestion of organic substrates, such as manure, sewage sludge, energy crops and the organic fractions of household and industrial wastes. these raw materials are not technologically and economically suitable for combustion. the substrate composition affects the yield and the composition of the biogas. biogas consists mainly of methane (50–65 %) and carbon dioxide (35–50 %), but it also contains a number of other minor components, e.g. solid particles, nitrogen, water vapour, oxygen, hydrogen sulphide and ammonia [1].

the most common use of biogas is as a fuel for co-generation units, which generate combined heat and power (chp), because the production of electricity is supported by many mechanisms in the czech republic. part of the electricity that is produced is consumed by the plant itself (typically 5 to 10 % of the total); most of it, however, is sold to the local electrical grid. as far as heat is concerned, the situation is quite different. more than 60 % of the heat that is produced is difficult to utilize effectively over the whole year, especially during summer. part of the heat is needed for operating the plant (usually between 10–15 %), and it can also be used for other purposes, e.g. for heating service buildings. the low utilization of the heat is due to the location of the biogas plant, usually far from a city or from industrial buildings. excessive heat often worsens the overall energy balance of biogas use and, as a result, the overall efficiency is in reality less than 65 % of the input energy, and indeed often even well below 35 %.
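the energy-balance figures quoted above are easy to check with a small calculation. the following sketch (python; all shares are illustrative assumptions drawn from the ranges quoted in the text, not measured plant data) shows how the overall efficiency collapses toward the lower figure when the surplus heat cannot be sold:

```python
# minimal energy-balance sketch for a biogas chp plant. the efficiency and
# own-consumption shares below are assumed example values, not plant data.

def useful_share(eta_el=0.38, eta_th=0.45,
                 own_el=0.075, own_heat=0.125, heat_sold=0.40):
    """share of the biogas energy input that is actually put to use."""
    electricity_to_grid = eta_el * (1.0 - own_el)     # electricity sold
    heat_available = eta_th * (1.0 - own_heat)        # heat left after plant needs
    return electricity_to_grid + heat_available * heat_sold

print(f"with 40 % of the heat utilized: {useful_share():.0%}")            # ~51 %
print(f"with no external heat use:      {useful_share(heat_sold=0):.0%}") # ~35 %
```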
a more effective way of using biogas is to upgrade it to natural gas quality, the product being referred to as substitute natural gas (sng) or biomethane. biomethane has similar properties and uses to those of natural gas, and can be injected directly into the local natural gas distribution grid. the gas can thus be transported away from the place where it was produced, and all the heat generated from it can be utilized. to do this, carbon dioxide and all remaining trace components (hydrogen sulphide, water vapour, etc.) must first be removed. this increases the concentration of methane, resulting in improved energy density.

various commercial technologies are used for removing undesirable components from biogas; the most common are pressure swing adsorption (psa), water scrubbing and chemical absorption. other technologies are at the pilot or demonstration plant level, e.g. membrane separation or cryogenic separation [2, 3]. the best technology is selected on the basis of the composition of the raw biogas, the size of the plant, and the investment cost. the energy required for the upgrading process depends on the technology that is used, but generally the requirement is 5–10 % of the energy content of the biogas that is produced. this means that more than 70 % of the energy is available for other uses. if biomethane is to be injected into the local natural gas distribution grid and transported to its destination with higher efficiency, it has to be compressed to the pressure of the local distribution grid, and an odorant has to be added so that any leaks can be detected.

in order to ensure the safety, integrity and operability of gas networks, biomethane has to meet certain minimum gas quality requirements. the choice of gas quality parameters and their limiting values are generally specified nationally. in the czech republic, tpg 902 02 refers to the quality of the biomethane supplied in networks [4]. unfortunately, there are technological and administrative barriers and obstacles to the promotion of biogas as biomethane. for example, there are no specific requirements for the frequency and monitoring of the gas quality, and there are no regulations on organizational and investment problems, etc.

biomethane can also be used as a fuel for vehicles that run on natural gas. these vehicles have several advantages over vehicles equipped with petrol or diesel engines, especially ecological advantages. the emissions of nox and dust particles are drastically reduced, and the emission of co2 decreases by more than 35 %. in the czech republic, the major obstacle to wider use of cng is the limited number of cng filling stations.

2 microorganisms in biogas

biomethane, or biogas, is produced by biological processes harnessing the ability of microorganisms to degrade organic material to methane. some of the microorganisms are aerosolized from the digester into the biogas; afterwards they form a biofilm that attaches to the surfaces of the distribution pipes.
this biofilm can be dislodged and so reach the place where the biogas is finally used. most of the biogas that is produced comes from treated sewage sludge and biogas plants, and these systems do not contain a pasteurization step. pathogens can therefore be found in biogas. introducing the biogas that is produced into systems constructed for natural gas is currently a matter for debate, because of the risk of introducing pathogens into gas systems.

the microbial communities in the biogas are not just a rough copy of the anaerobic digester microbial communities, and not all microorganisms are equally conveyed by biogas. given this dispersal process, three types of behavior appear to be possible. the first type is passive behavior: the microorganisms are carried out at random by the biogas. the other two types of behavior are active: the microorganisms either seek to avoid transport into a hostile environment, or use the transport for dissemination [6].

investigations of the microbial species present in biogas or in natural gas have traditionally relied on the use of laboratory culture methods. laboratory growth media cannot accurately reflect the true conditions within a pipeline. in addition, it is known that at present only a small part (generally less than 1 %) of the total microbial diversity can be cultivated. the remaining part is composed of microorganisms for which no culture methods have yet been found, or which are in a viable but non-cultivable state. molecular techniques enable a better appreciation of the composition and the variability of microbial communities than traditional bacterial cultures [7].

3 material and methods

two different biogases were investigated: biogas from a biogas plant processing energy crops, grass silage and manure, and biogas from a waste water treatment plant (wwtp) equipped with an adsorption unit designed for sulphur and siloxane removal. biogas samples for dna extraction were collected by filtration through a 0.22 μm nylon membrane filter (millipore) directly behind the digester and the adsorption unit. anaerobic digester substrate and sludge were also analyzed. natural gas from the distribution grid was also collected in order to draw a comparison between the microbial diversity of natural gas and biogas. all samples were frozen for transport with dry ice, and dna extractions were performed according to the methods described by bartosch et al. (2004) [8]. the total dna was purified using a powersoil dna isolation kit (mobio) according to the manufacturer’s protocol.

two analyses were made: real-time pcr (qpcr) and pcr + dgge (polymerase chain reaction + denaturing gradient gel electrophoresis). qpcr was used for quantifying bacterial dna. the target organisms were quantified using specific primers for the chosen bacterial groups, and the qpcr results obtained were expressed as numbers of bacterial cells per gram of substrate or sludge, or per cubic meter of gas. the second analysis determined the bacterial species [8–12]. pcr was realized by targeting 16s rrna gene sequences with the universal bacterial primers 338gc (5’ – cgc ccg ccg cgc ccc gcg ccc ggc ccg ccg ccg ccg ccg cac tcc tac ggg agg cag cag – 3’) and rp534 (5’ – att acc gcg gct gct gg – 3’) [13]. the pcr products were processed by dgge on the dcode universal mutation detection system (bio-rad, usa). finally, we compared the sequences obtained with the genbank database, using the blastn algorithm [14].
table 1: identification of bacteria from dgge [15, 16]

| location | organism | sa [%] | classification | main characteristics and indication of pathogenicity |
|---|---|---|---|---|
| b1w, b2w, ng | pseudomonas sp. | 95 | proteobacteria; gammaproteobacteria | reduce nitrate to nitrite, opportunistic pathogen |
| b1w, b2w, bb, ng | butyrivibrio sp., pseudobutyrivibrio ruminis | 99 | firmicutes; clostridia | decompose cellulose and amylum, can cause microbial corrosion |
| sw | uncultured beta proteobacterium | 99 | proteobacteria; betaproteobacteria | typical for waste water treatment plants |
| b1w, b2w, sw | sphingomonas sp. | 99 | proteobacteria; alphaproteobacteria | decompose polysaccharide, can cause infection |
| sb | lactobacillus fermentum | 98 | firmicutes; lactobacillales | produce lactic acid, can cause infection |
| sb | uncultured lachnospiraceae bacterium | 98 | firmicutes; clostridia | – |
| sb, sw | uncultured bacteroidetes clone | 100 | bacteroidetes | produce propionic, lactic and acetic acid, can cause infection |
| b1w, ng | uncultured clostridium clone | 36 | firmicutes; clostridia | – |
| b2w | leucobacter | 100 | actinobacteria; actinobacteridae | – |
| sb | uncultured bacteroidetes bacterium | 95 | bacteroidetes | produce propionic, lactic and acetic acid, can cause infection |
| ng | uncultured clostridiales bacterium | 100 | firmicutes; clostridia | – |
| b1w, b2w, bb, sw, sb, ng | e. coli | 100 | proteobacteria; gammaproteobacteria | from intestinal micro-flora, can produce toxins |
| ng | staphylococcus sp. | 100 | firmicutes; bacilli | very adaptable pathogen |
| b1w, b2w, ng | pseudomonas fluorescens | 99 | proteobacteria; gammaproteobacteria | reduce nitrate, opportunistic pathogen |
| sw | candidatus nitrospira defluvii | 99 | nitrospirae; nitrospira | reduce nitrate |

sa stands for the percentage similarity of the closest sequence in genbank; b1w = biogas behind the digester from waste water treatment; b2w = biogas behind the adsorption unit from waste water treatment; bb = biogas from the biogas plant; sw = sludge from waste water treatment; sb = substrate from the biogas plant; ng = natural gas.

table 2: number and type of microorganisms found in the digester and biogas (substrate and sludge in 10^6 bacteria·g−1; gas samples in 10^6 bacteria·m−3; nd = not detected)

| microorganisms | substrate (biogas plant) | sludge (wwtp) | biogas (biogas plant) | biogas (wwtp, behind digester) | biogas (wwtp, behind adsorption unit) |
|---|---|---|---|---|---|
| total number | 80.7 | 1 360 | 11.2 | 70.5 | 34.3 |
| c. leptum | 6.38 | 940 | 0.46 | 5.09 | 0.62 |
| desulfovibrio | 25.8 | 124 | 0.64 | 3.46 | nd |
| faecalibacterium | 3.11 | 7.16 | nd | nd | 0.5 |
| lactobacillus | 17.9 | 49.9 | nd | nd | nd |
| enterobacteriaceae | 0.003 | 0.04 | 0.025 | 0.047 | 0.04 |
| other | 27.5 | 244 | 10 | 64.9 | 33.7 |

table 3: number and type of microorganisms found in the natural gas (all values in 10^6 bacteria·m−3)

| microorganisms | ng1 | ng2 | ng3 | ng4 | ng6 |
|---|---|---|---|---|---|
| total number | 3.08 | 2.11 | 8.37 | 18.8 | 4.29 |
| c. leptum | 0.50 | 0.41 | 2.6 | 5.06 | 1.55 |
| desulfovibrio | 0.20 | 0.05 | 0.31 | – | 0.24 |
| faecalibacterium | – | – | – | – | – |
| lactobacillus | – | – | – | – | – |
| enterobacteriaceae | 0.009 | – | 0.014 | – | 0.007 |
| other | 2.36 | 1.65 | 5.45 | 13.7 | 2.49 |

4 results and discussion

the microbial diversity of all samples was assessed and compared by pcr-dgge fingerprinting analysis of bacterial 16s rrna. the presence of bacteria in the particular samples and their main characterization are shown in table 1. the similarities based on sequence comparison varied between 36–100 %, whereby 94 % of the total sequences presented more than 95 % similarity with the known sequences found in the genbank database. only one sequence showed 36 % similarity. butyrivibrio sp., pseudobutyrivibrio ruminis and e. coli were present in all gaseous samples, while the others were present only in sludge or substrate.
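before turning to the quantitative results, it may help to make explicit the arithmetic behind expressing qpcr counts per cubic meter of gas. the sketch below (python) uses assumed example values for the copy number, the extract volume and the filtered gas volume; only the general conversion scheme, not the numbers, reflects the procedure described above:

```python
# hedged sketch: converting qpcr gene-copy numbers into bacterial cells per
# m3 of filtered gas. all numeric inputs are illustrative assumptions.

def cells_per_m3(copies_per_ul, extract_volume_ul, gas_volume_m3,
                 rrna_copies_per_cell=4.1):
    """estimate cells per m3 of gas from qpcr copies in the dna extract.

    rrna_copies_per_cell corrects for the fact that a genome typically
    carries several 16s rrna gene copies (assumed average value)."""
    total_copies = copies_per_ul * extract_volume_ul
    total_cells = total_copies / rrna_copies_per_cell
    return total_cells / gas_volume_m3

# e.g. 2.9e5 copies/ul in a 100 ul extract obtained from 1 m3 of biogas:
print(f"{cells_per_m3(2.9e5, 100.0, 1.0):.1e} bacteria per m3")
# ~7e6, the same order of magnitude as the 'about 8 million microorganisms
# per m3' figure quoted in the abstract
```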
the results obtained from real-time pcr are shown in table 2 and table 3. the comparison of biogas and digester microbial diversity shows that clostridium leptum, the desulfovibrio group, the faecalibacterium group, the enterobacteriaceae family and the lactobacillus group were present in significantly greater numbers in the anaerobic digester than in the biogas (table 2). this is in accordance with the idea that only some microorganisms from the digester are taken up in the biogas. the data revealed that the total number of microorganisms in the biogas behind the digester was two times higher than the total number behind the adsorption unit. this decrease could be explained by the capture of microorganisms on the activated carbon.

we detected only four bacterial groups in biogas (table 2), but more than 40 % of the detected microorganisms are unknown bacterial species. desulfovibrio and clostridium were the most frequently detected genera; they produce acidic conditions and hence accelerate the corrosion process. enterobacteriaceae was the least abundant group. it decomposes sugars to lactic acid and reduces nitrate to nitrite. this family also includes many pathogens, such as salmonella and e. coli.

upgraded biogas can be injected into the natural gas distribution grid. as a safety check against microbial contamination, the microbial diversity of natural gas was therefore also analyzed. we detected 3–19·10^6 bacteria·m−3, depending on the pressure in the distribution grid and the material of the pipeline (table 3). this value is quite similar to that found in biogas. we found three bacterial groups in natural gas: c. leptum, desulfovibrio and enterobacteriaceae. lactobacillus and faecalibacterium were not present, because they come from the intestinal microflora and participate in a fermentation process.

5 conclusion

biogas as a renewable energy source has been receiving growing attention in the czech republic and in the eu countries. nowadays, there are two main ways of using biogas: as a fuel for a co-generation unit, or by upgrading it to natural gas quality. when it is upgraded to natural gas, there is a potential risk of spreading disease via biogas. using molecular methods, we found that raw biogas contains microorganisms, most likely as a result of microbial transmission from the anaerobic digestion process. however, natural gas contains quite a similar quantity of microorganisms to biogas. our results show that firmicutes and proteobacteria were the most frequently encountered bacterial phyla. some bacterial species may contribute to corrosion of pipelines and equipment; others are opportunistic pathogens that can cause toxic reactions. however, most bacterial species, more than 40 % in biogas, are still unknown, as is their influence on the digestion process and on human health. further studies are needed for a better understanding of the behavior of microorganisms in anaerobic digestion, and to prevent microbially influenced corrosion and microbial dissemination.

acknowledgement

this work received financial support from specific university research (msmt no. 21/2011), the national agency for agriculture research (project no. qi92a286/2008), and the czech science foundation (project no. p503/10/p394).

references

[1] straka, f., a kol.: bioplyn. 2. roz. vyd. praha : gas, s. r. o., 2006.
[2] weidner, e.: technologien und kosten der biogasaufbereitung und einspeisung in das erdgasnetz. ergebnisse der markterhebung 2007–2008. institut umwelt-, sicherheits-, energietechnik, 2008.
[3] tentscher, w.: anforderung und aufbereitung von biomas zur einspeisung erdgasnetze. gas – erdgas, 2007.
[4] tpg 902 02. jakost a zkoušení plynných paliv s vysokým obsahem methanu. gas, s. r. o., 2009.
[5] deublein, d., steinhauser, a.: biogas from waste and renewable sources. an introduction. weinheim, 2008. isbn 978-3-527-31841-4.
[6] moletta, m., delgenes, j. p., godon, j. j.: differences in the aerosolization behaviour of microorganisms as revealed through their transport by biogas, science of the total environment, 2007, p. 75–88.
[7] malik, s., beer, m., megharaj, m., naidu, r.: the use of molecular techniques to characterize the microbial communities in contaminated soil and water. env. international, 2007, vol. 34, p. 265–276.
[8] bartosch, s., fite, a., macfarlane, g. t., mcmurdo, m. e. t.: characterization of bacterial communities in feces from healthy elderly volunteers and hospitalized elderly patients by using real-time pcr and effects of antibiotic treatment on the fecal microbiota, appl. environ. microbiol. 2004, vol. 70, 6, p. 3575–3581.
[9] nadkarni, m. a., martin, e. f., jacques, n. a., hunter, n.: determination of bacterial load by real-time pcr using a broad-range (universal) probe and primers set. microbiol. 2002, vol. 148, 1, p. 257–266.
[10] shen, j., zhang, b., wei, g., pang, x., wei, h., li, m., zhang, y., jia, w., zhao, l.: molecular profiling of the clostridium leptum subgroup in human fecal microflora by pcr-denaturing gradient gel electrophoresis and clone library analysis, appl. environ. microbiol. 2006, p. 5232–5238.
[11] salzman, n. h., hung, k., haribhai, d., chu, h., karlsson-sjberg, j., amir, e., teggatz, p., barman, m., hayward, m., eastwood, d., stoel, m., zhou, y., sodergen, e., weinstock, g. m., bewins, c. l., williams, c. b., bos, n. a.: enteric defences are essential regulators of intestinal microbial ecology, nat. immunol. 2009, vol. 11, p. 76–82.
[12] rinttila, t., kassinen, a., malinen, e., krogius, l., palva, a.: development of an extensive set of 16s rdna-targeted primers for quantification of pathogenic and indigenous bacteria in faecal samples by real-time pcr, j. appl. microbiol. 2004, vol. 97, p. 1166–1177.
[13] muyzer, g., de waal, e., uitterlinden, a. g.: profiling of complex microbial populations by denaturing gradient gel electrophoresis analysis of polymerase chain reaction amplified genes coding for 16s rrna. applied and environmental microbiology, 1993, p. 695–700.
[14] maidak, b. l., larsen, n., mccaughey, m. j., overbeek, r., olsen, g. j., foge, k., blandy, j., woese, c. r.: the ribosomal database project. nucleic acid. res. 1994, vol. 22, p. 3485–3487.
[15] http://microbewiki.kenyon.edu/index.php/bovine rumen, downloaded on 14. 11. 2010.
[16] atlas, r. m.: principles of microbiology. dubuque : mcgraw-hill, 1997. isbn 0-8151-0889-3.

ultra low-dispersion spectroscopy with gaia and photographic objective prism surveys

r. hudec, l. hudec, m. klíma

abstract

this paper discusses the ultra low-dispersion spectroscopy to be applied in the esa gaia space observatory and the ground-based objective-prism plate surveys.
although the dispersion in plate surveys is usually larger than in the gaia bp/rp spectrometers, the spectral resolutions differ by a factor of 2–3 only, since the resolution in ground-based spectra is seeing-limited. we argue that some of the algorithms developed for digitized objective-prism plates can also be applied to the gaia spectra. at the same time, the plate results confirm the feasibility of observing strong emission lines with gaia rp/bp.

keywords: astronomical plates, astronomy satellites, esa gaia, low-dispersion spectroscopy.

1 introduction

the esa gaia satellite payload consists of a single integrated instrument, the design of which is characterised by a dual telescope concept with a common structure and a common focal plane (http://sci.esa.int/gaia/). both telescopes are based on a three-mirror anastigmat (tma) design. beam combination is achieved in image space with a small beam combiner. silicon-carbide (sic) ultra-stable material is used for the mirrors and the telescope structure. there will be a large common focal plane with an array of 106 ccds. the large focal plane also includes areas dedicated to the spacecraft’s metrology and alignment measurements. three instrument functions/modes are designed: (i) astrometric mode for accurate measurements, even in densely populated sky regions of up to 3 million stars/deg2, (ii) photometric mode based on low-resolution, dispersive spectro-photometry using blue and red photometers (bp and rp) for continuous spectra in the 320–1000 nm band, for astrophysics and for the chromaticity calibration of the astrometry, and (iii) spectroscopic (rvs) mode for high resolution, with a grating, covering a narrow band: 847–874 nm. the expected limiting magnitude is 20 in photometric mode (http://sci.esa.int/gaia). see also figure 1.

in our study we focus on the “photometric mode” rp/bp. use of the dispersive element (prism) generates ultra low-dispersion spectra. one disperser, called bp for blue photometer, operates in the 330–660 nm wavelength range; the other, called rp for red photometer, covers the 650–1000 nm wavelength range. the dispersion is higher at short wavelengths, and ranges from 4 to 32 nm/pixel for bp and from 7 to 15 nm/pixel for rp (http://sci.esa.int/gaia). it should be noted, however, that the photometric ccds are located at the edge of the focal plane, where the quality of the images is more sensitive to aberrations than astrometric images (straizys, 2010). the spectral coverage of gaia, i.e. the g band and the rp, bp and rvs passbands, is illustrated in figure 1.

fig. 1: the spectral coverage by gaia (http://sci.esa.int/gaia)

fig. 2: bp (left) and rp (right) images simulated by the gibis simulator, the same sky field. the images have different scale parameter values (visualization in ds9). these images illustrate the image wings mentioned by straizys et al., 2006

the bp and rp spectra will be binned on-chip in the across-scan direction; no along-scan binning is foreseen. rp and bp will be able to reach object densities on the sky of at least 750,000 objects deg−2. the complex images obtained can be simulated by the gibis simulator (figure 2). gibis is a pixel-level simulator of the gaia mission intended to simulate how the gaia instruments will observe the sky, using realistic simulations of the astronomical sources and of the instrumental properties. it is a branch of the global gaia simulator (gaiasimu) under development within gaia coordination unit 2: data simulations.
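the quoted per-pixel dispersions determine how many ccd pixels a bp spectrum occupies. the toy model below (python) assumes, purely for illustration, that the dispersion grows linearly with wavelength between the two quoted endpoints; the real prism dispersion law is more complicated, so the result is only indicative:

```python
import numpy as np

# toy bp wavelength solution: assume (illustration only) that the dispersion
# varies linearly between the quoted endpoints of 4 and 32 nm/pixel.
lam = np.linspace(330.0, 660.0, 2000)               # bp range, nm
disp = np.interp(lam, [330.0, 660.0], [4.0, 32.0])  # nm per pixel

# pixel extent of the spectrum = integral of d(lambda) / dispersion(lambda)
pixels = np.sum(np.diff(lam) / disp[:-1])
print(f"a bp spectrum spans roughly {pixels:.0f} pixels in this toy model")
```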
we note that certain types of variable stars (vs), such as miras, cepheids, and a few cases of other stars, mostly peculiar variables, exhibit large variations in their spectral types. this field is, however, little exploited, as these studies used to be very laborious (plates were mostly visually inspected) and limited, and hence no review on the spectral variability among vs exists. the esa gaia is expected to deliver data to fill this gap. an important question is whether the dispersion of these devices is enough to detect and to study bright spectral features/emission lines.

we also briefly introduce the extended work of the us astrophysicist and nasa astronaut karl henize, who spent a large part of his scientific career on low-dispersion spectroscopy with an objective prism. we have found the 290 large 15 × 15 inch original low-dispersion spectral plates that he took about 60 years ago in south africa and analysed extensively. we found and investigated these plates (probably the complete henize collection) at the pari (pisgah astronomical research institute), nc, usa (figure 3). the plates show numerous examples of objects with very prominent emission lines, which he found in a very extended, time-consuming and laborious project (figure 4).

2 ultra low dispersion spectral plate databases

lds (low-dispersion spectroscopy) astrophysics was developed and performed at numerous observatories between ca 1930 and 1980. mostly lds with schmidt telescopes (plates with an objective prism) was used for various projects, e.g. qso, emission line and h-alpha surveys, star classifications, etc. the technique was, however, little used after 1980. some of these surveys are listed below (the dispersion data is given in the next section, and we also note that many other similar surveys exist):

(1) schmidt sonneberg camera. sky survey (selected fields) with a 50/70 cm schmidt telescope. no online access yet, but the scans can be provided upon request (http://www.stw.tu-ilmenau.de/).

(2) bolivia expedition spectral plates. these plates offer homogeneous but not full coverage (90 southern kapteyn’s selected areas nos. 116–206 were covered with plates representing 10 × 10 degrees each, hence 9000 square degrees in total) of the southern sky with spectral and direct plates, directed by the potsdam observatory. the plates are stored at the sonneberg observatory (http://www.stw.tu-ilmenau.de/) and were taken between 1926–1928; in total, about 70000 prism spectra were estimated and published in potsdam publications, see becker, 1929, and following papers.

fig. 3: one of the 290 karl henize southern sky h-alpha survey plates (michigan-mount wilson southern h-alpha survey); the bright emission objects found by karl henize are indicated

fig. 4: examples of objects with prominent emission lines found in the karl henize plate collection (michigan-mount wilson southern h-alpha survey) located at the pari institute, nc, usa

(3) hamburg quasar survey. a wide-angle objective prism survey searching for quasars with b < 17.5 in the northern sky. the survey plates were taken with the former hamburg schmidt telescope, located at calar alto/spain since 1980. online access (http://www.hs.uni-hamburg.de/de/for/exg/sur/index.html).

(4) byurakan survey. the digitized first byurakan survey (dfbs) is the digitized version of the first byurakan survey (fbs). it is the largest spectroscopic database in the world, providing low-dispersion spectra for 20,000,000 objects on 1139 fbs fields = 17,056 deg2.
online access (http://byurakan.phys.uniroma1.it/). sky coverage: dec > −15 deg, all ra (except the milky way). the prism spectral plates were taken by the 1 m schmidt telescope. limiting magnitude: 17.5 in v. spectral range: 340–690 nm, spectral resolution 5 nm.

(5) spectral survey plates in the astronomical photographic data archive (apda) located at the pisgah astronomical research institute (pari), usa, e.g. the case qso survey (http://www.pari.edu/library). telescope: 61/91 cm burrell schmidt at kitt peak, 1.8 deg prism, plate fov: 5 degrees by 5 degrees, limiting b magnitude: 18, emulsion: iiiaj baked, spectral range: 330 nm to 530 nm.

(6) karl henize h-alpha plate collection (located since 2010 at pari) — the michigan-mount wilson southern h-alpha survey (henize, 1954). a newly (in 2010) re-discovered, highly valuable plate collection: 290 high-quality plates, 15 × 15 inches, taken in 1950–1952 in south africa with a dedicated telescope by karl henize. telescope aperture d = 25 cm, dispersion 45 nm/mm at h-alpha, various filters used (henize, 1954).

fig. 5: examples of objects with emission lines visible at a spectral dispersion similar to gaia bp/rp: planetary nebula (left) and uv excess galaxy (right). source: digitized byurakan survey. note that although the theoretical spectral resolution of these plates is several times better than the spectral resolution of gaia rp/bp, in reality the resolutions are comparable, as the ground-based data are mostly affected by seeing and atmospheric influence and do not reach their theoretical values

3 ultra low dispersion spectral images by gaia rp/bp

3.1 algorithms

the algorithms for automated analyses of digitised spectral plates were developed by computer science students (hudec, 2007). the main goals are as follows: automated classification of spectral classes, searches for spectral variability (both continuum and lines), searches for objects with specific spectra, correlation of spectral and light changes, searches for transients, and application to gaia. the archival spectral plates taken with the objective prism offer the possibility to simulate the gaia low-dispersion spectra and related procedures, such as searches for spectral variability and variability analyses based on spectro-photometry. we focus on sets of spectral plates of the same sky region covering long time intervals with good sampling; this enables simulation of the gaia bp/rp outputs. the main task is the automatic classification of stellar objective prism spectra on digitised plates, as a simulation and a feasibility study for the low-dispersion gaia spectra. the algorithms developed and tested apply novel approaches and techniques, with emphasis on neural networks, for the automated recognition of the spectral types of stars by comparing them with atlas spectra, and they differ from the techniques discussed before (e.g. christlieb et al., 2002, or hagen et al., 1995). in the future we plan to continue developing innovative dedicated image processing methods and to continue our participation in data extraction and evaluation by providing expertise in high-level image processing, with a focus on solving the problems of data processing and data extraction arising from the peculiar way gaia functions. the expertise available at the department of radioelectronics of the ctu faculty of electrical engineering will be further used and developed in this direction.
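as a simplified stand-in for the classifiers mentioned above, the sketch below (python) assigns each observed low-dispersion spectrum the class of the nearest atlas template by a chi-square distance. it is not the authors' actual pipeline (which emphasizes neural networks); the templates and the test spectrum are synthetic toy data:

```python
import numpy as np

def classify(spectrum, templates):
    """nearest-template classification of a low-dispersion spectrum.

    spectrum: 1-d flux array; templates: dict mapping class name to a flux
    array on the same wavelength grid. both are normalized to unit sum so
    that only the spectral shape matters."""
    s = spectrum / spectrum.sum()
    best_cls, best_chi2 = None, np.inf
    for cls, t in templates.items():
        t = t / t.sum()
        chi2 = np.sum((s - t) ** 2 / (t + 1e-12))
        if chi2 < best_chi2:
            best_cls, best_chi2 = cls, chi2
    return best_cls

# toy usage with two hypothetical atlas templates:
grid = np.linspace(330.0, 660.0, 40)                 # nm, bp-like coverage
templates = {"early type": np.exp(-(grid - 420.0) ** 2 / 5e3),
             "late type":  np.exp(-(grid - 620.0) ** 2 / 8e3)}
rng = np.random.default_rng(1)
observed = 1.1 * templates["late type"] + 0.02 * rng.random(grid.size)
print(classify(observed, templates))                 # -> "late type"
```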
3.2 comparison of gaia low dispersion spectra and spectral plates

the motivation for these studies is as follows: (1) a comparison of the simulated gaia bp/rp images with those obtained from digitized schmidt spectral plates (both using dispersive elements) for 8 selected test fields, and (2) a feasibility study of applying the algorithms developed for the plates to gaia.

dispersion is an important parameter, and is discussed below. (1) gaia bp: 4–32 nm/pixel, i.e. 400–3200 nm/mm, with 9 nm/pixel, i.e. 900 nm/mm, at hγ; rp: 7–15 nm/pixel, i.e. 700–1500 nm/mm. the psf fwhm is ∼ 2 px, i.e. the spectral resolution is ∼ 18 nm. (2) schmidt sonneberg plates (typical mean values): the dispersion is 10 nm/mm at hγ for the 7 deg prism, and 23 nm/mm at hγ for the 3 deg prism. (3) bolivia expedition plates: 9 nm/mm, with a calibration spectrum. (4) hamburg qso survey: 1.7 deg prism, 139 nm/mm at hγ, spectral resolution of 4.5 nm at hγ. (5) byurakan survey: 1.5 deg prism, 180 nm/mm at hγ, resolution 5 nm at hγ. (6) pari prism: dispersion 150 nm/mm at 450 nm.

we see that the gaia bp/rp dispersion is ∼ 5 to 10 times less than the dispersion of a typical digitised spectral prism plate, and the spectral resolution of gaia is ∼ 3 to 4 times less than for the plates (table 1). note that for plates the spectral resolution is seeing-limited, hence the values represent the best values; on plates affected by bad seeing, the spectral resolution is only ∼ 2 times better than that of gaia bp/rp.

table 1: a comparison of gaia rp/bp and low-dispersion schmidt plate spectral surveys

4 examples of science with gaia rp/bp spectro-photometry

despite the low dispersion discussed above, the major strength of gaia for many scientific fields will be in spectro-photometry, as the low-dispersion spectra may be transferred to numerous well-defined color filters. as an example, the optical afterglows (oas) of gamma-ray bursts (grbs) are known to exhibit quite specific color indices, distinguishing them from other types of astrophysical objects (šimon et al., 2001 and 2004); hence a reliable classification of oas of grbs will, in principle, be possible using this method. the colors of microquasars may serve as another example: they display blue colors, with the individual objects forming a diagonal trend. this method can be used even for optically faint, and hence distant, objects. we note, however, that the correct color indices cannot be calculated without careful decontamination of the bp/rp spectra (straizys et al., 2006; straizys, 2010). the energy redistribution effect in the gaia bp and rp spectra arising from contamination by the wings of the image profiles was mentioned and investigated by straizys et al., 2006, montegriffo et al., 2007, and montegriffo, 2009. according to these researchers, the gaia spectra may be used for classifying stars either after applying contamination corrections, or by using standard calibration stars with known physical parameters observed with the gaia spectrophotometers. in the latter case, there is no way to calculate the real spectral energy distributions, magnitudes, color indices, color excesses or other photometric quantities. the classification has to be done by matching the observed pseudo-energy distributions of the target and the standard stars, or by using pattern recognition algorithms (template matching) over the whole spectrum to estimate the astrophysical parameters of stars.
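the transfer of low-dispersion spectra into color indices can be illustrated with a short sketch (python). the band edges and the input spectrum are assumed toy values; real gaia colors require calibrated passbands and, as stressed above, decontaminated bp/rp spectra:

```python
import numpy as np

def synthetic_color(wave_nm, flux, band1, band2):
    """magnitude difference between two rectangular bands, a crude analogue
    of a color index derived from a low-dispersion spectrum."""
    def band_flux(lo, hi):
        sel = (wave_nm >= lo) & (wave_nm < hi)
        return np.trapz(flux[sel], wave_nm[sel])
    return -2.5 * np.log10(band_flux(*band1) / band_flux(*band2))

wave = np.linspace(330.0, 1000.0, 400)         # bp + rp coverage, nm
flux = wave ** -1.5                            # toy blue continuum
color = synthetic_color(wave, flux, (330.0, 660.0), (660.0, 1000.0))
print(f"toy 'bp - rp' analogue: {color:+.2f} mag")   # negative, i.e. blue
```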
fig. 6: example of the automated star spectral classification process on a digitised sonneberg observatory low-dispersion spectral plate

5 conclusion

variability studies based on low-dispersion spectra are expected to provide unique novel data, and can use the algorithms recently developed for the automatic analyses of digitized spectral schmidt plates. the esa gaia satellite will provide ultra-low-dispersion spectra through bp and rp, representing a new challenge for astrophysicists and for computer science. the nearest analogy is digitized prism spectral plates: the sonneberg, pari, hamburg and byurakan surveys. these digitised surveys can be used for simulations and for tests of the gaia algorithms and gaia data. some algorithms have already been tested. some types of variable stars are known to exhibit large spectral type changes; however, this field is little exploited, and more discoveries can be expected with the gaia data, as gaia will allow us to investigate the spectral behavior of huge numbers of objects over 5 years with good sampling for spectroscopy. however, the data must first be decontaminated before they can be scientifically applied, as discussed above.

acknowledgement

czech participation in the esa gaia project is supported by pecs project 98058. the scientific part of the study is related to grants 205/08/1207 and 102/09/0997, provided by the grant agency of the czech republic. some aspects of the project described here are a natural continuation of czech participation in esa integral (esa pecs 98023). the analyses of digitised low-dispersion spectral plates are supported by msmt kontakt project me09027.

references

[1] becker, f.: spektral-durchmusterung der kapteyn-eichfelder des sudhimmels. i. pol und zone – 75 deg. potsdam publ., 27, 1, 1929.
[2] henize, karl g.: the michigan-mount wilson southern hα survey. astronomical journal, vol. 59, p. 325, 1954.
[3] hudec, l.: algorithms for spectral classification of stars, bsc. thesis, charles university, prague, 2007.
[4] jordi, c., carrasco, j. m.: in the future of photometric, spectrophotometric and polarimetric standardization, asp conference series, 364, 215, 2007.
[5] montegriffo, p., et al.: a model for the absolute photometric calibration of gaia bp and rp spectra. i. basic concepts. gaia-c5-tn-oabo-pmn-001-1, 2007.
[6] montegriffo, p.: a model for the absolute photometric calibration of gaia bp and rp spectra. ii. removing the lsf smearing. gaia-c5-tn-oabo-pmn-002, 2009.
[7] šimon, v., hudec, r., pizzichini, g., masetti, n.: a & a, 377, 450, 2001.
[8] šimon, v., hudec, r., pizzichini, g., masetti, n.: 30 years of discovery: gamma-ray burst symposium, aip conference proceedings, gamma-ray bursts, 727, 487–490, 2004.
[9] šimon, v., hudec, r., pizzichini, g.: a & a, 427, 901, 2004.
[10] http://sci.esa.int/gaia/
[11] straizys, v., et al.: baltic astronomy, vol. 15, 449–459, 2006.
[12] christlieb, n., wisotzki, l., grasshoff, g.: a & a, 391, 397, 2002.
[13] hagen, h.-j., et al.: a & as, 111, 195, 1995.

rené hudec
e-mail: rhudec@asu.cas.cz
astronomical institute, academy of sciences of the czech republic, cz-251 65 ondřejov, czech republic
czech technical university in prague, faculty of electrical engineering, technická 2, cz-166 27 prague, czech republic

lukáš hudec
miloš klíma
czech technical university in prague, faculty of electrical engineering, technická 2, cz-166 27 prague, czech republic

stochastic models of solid particles grinding

g. a. zueva, v. a. padokhin, p. ditl
abstract

solid particle grinding is considered as a markov process. mathematical models of disintegration kinetics are classified on the basis of the class of markov process that they belong to. a mathematical description of the disintegration kinetics of polydisperse particles milled in a shock-loading grinder is proposed on the basis of the theory of markov processes, taking into account the operational conditions in the device. the proposed stochastic model calculates the particle size distribution of the material at any instant in any place in the grinder. the experimental data are in accordance with the values predicted by the proposed model.

keywords: markov processes, grinding kinetics models, density of distribution.

1 introduction

alongside the traditional approach, based on the phenomenological introduction of continuum mechanics into the description of chemical engineering processes, in particular milling processes, stochastic approaches, particularly those based on markov processes, have become widely applied. to a certain extent, these two approaches are mutually complementary. however, in many cases the detailed formal means of stochastic theory enable models of milling to be constructed more rationally [1, 2, 6, 7, 12, 13]. stochasticity, which is an integral part of the fracture process of polydisperse material, is automatically taken into account. the markov property forms the basis for modeling the particle size distribution during milling: the future behavior of the system is not affected by the behavior of the system in the past [2].

it is worth noting that the methods of markov processes have been used rather productively in theoretical analyses of many processes in chemical technology, e.g. in mechanoactivation [8] and in the separation and classification of heterogeneous systems [9]. one of the main builders of the statistical theory of basic processes in chemical technology was a. m. kutepov [10, 11]. he formulated the following positive tendencies, which stimulate the further development of the statistical theory of the processes of chemical technology:

• interest in making active use of the methods of non-equilibrium statistical thermodynamics, the theory of random processes and synergetics is steadily growing.
• statistical theory is constantly being enriched by new, highly effective mathematical means.
• the rapid development of computer technologies has enabled the creation of automated systems for calculating and designing equipment for chemical manufacturing.
• to solve problems in simulating chemical processes it is necessary to create a bank of simple, transparent mathematical models of the processes of chemical technology which are easily solved using a computer.

stochastic models of these processes, obtained using the fundamental methods of modern statistical theory, fully satisfy these requirements. feller [14] gave a well-arranged and inspiring review of stochastic theories in his comprehensive book.

2 classification of grinding kinetics models

the existing variety of types of markov processes allows the construction of stochastic models of disintegration kinetics of varying complexity that adequately reflect the specific features of the process of grinding materials in a range of milling plants. the jump markov process most adequately describes the impulse character of the loading of solids in a shock-loading grinder. generally, bead vibrations result in a time-continuous spectrum of actions on the material.
this can be approximated by a continuous markov process. a classification of mathematical models of grinding kinetics, based on their relationship to a fixed class of markov processes, is given in table 1. this classification of models is incomplete and provisional. nevertheless, it convincingly shows the physical meaningfulness of markov models of grinding processes, and also the potential of the markov-based methodology.

table 1: classification of grinding kinetics models

| group no. | markov process type | model type | primary disintegration mechanism | grinder design |
|---|---|---|---|---|
| 1 | markov chain (state-discrete and time-discrete) | matrix | free impact | shock mills (shock-reflective, disintegrator, etc.), roller mills |
| 2 | markov process, state-discrete and time-continuous (transitions at random instants) | differential-differential | constrained impact, abrasion, crushing | vibrational, magnetic-vortical, drum-type, spherical, epicyclic mills, etc. |
| 3 | jump process (time-continuous and state-continuous, transitions at random instants) | integral-differential | free impact | shock-reflective mills, disintegrators, hammer mills, rotor mills, etc. |
| 4 | diffusive process (continuous) | diffusive | abrasion | bead, sand mills, etc. |
| 5 | mixed process | integral-differential + diffusive, integral-differential + differential-differential | abrasion + free impact, constrained impact | jet mills: counter-current, ring, pulsating, centrifugal counter-current, etc. |

3 stochastic models

an analysis of the milling of dispersed material in a periodically operated, ideally mixed device, as described by the matrix model, shows that this process can be classified as a stationary markov chain. for this purpose, in accordance with the terminology used in the theory of markov processes, we introduce the row vector of the probabilities of the states of the modeled system, which is identical to the size-distribution vector of the ground material [2]:

$$ \pi(k+1) = \pi(k) P\,, \quad k = 0, 1, 2, \ldots\,; \qquad (1) $$
$$ \pi(k)\big|_{k=0} = \pi_0\,, \quad \text{or} \quad \pi(k) = \pi_0 P^{k}\,, \qquad (2) $$

where $\pi_0$ is the vector of the initial probability distribution, $\pi(k)$ is the vector of the probabilities of the states at time step $k$, and $P$ is the transition matrix.

this is a model of periodical milling that can be developed with the help of a stationary markov chain. however, in real conditions the loading of particles happens at random time instants. it is therefore necessary to describe grinding processes with the help of a markov process with discrete states and continuous time, with transitions occurring at random time intervals. it is known [4] that a markov process with continuous time and discrete states is determined by the matrix $A$ of transition intensities, with time-constant components $a_{ij}$, and by the vector $\pi_0$ of the probabilities of the states of the system at the initial time instant. the mathematical description of the grinding process in this case is

$$ \frac{d\pi(t)}{dt} = \pi(t) A\,; \qquad \pi(0) = \pi_0\,. \qquad (3) $$

the solution of this equation is

$$ \pi(t) = \pi_0 \exp(At)\,. \qquad (4) $$

the matrix $A$ of the transition intensities is differential, and it has a close connection with the stochastic matrix $P$ of the transitions [4]. let us find the matrix $A$ for a time-continuous process for which the probabilities of the states at the moments $t = 0, 1, 2, \ldots$ are the same as for the time-discrete process described by the matrix $P$. the time of one transition of the discrete process is taken as the time unit. comparing the solutions of equations (2) and (4) at $t = k$, we can see that

$$ \exp(A) = P \quad \text{or} \quad A = \ln P\,. \qquad (5) $$
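equations (1)–(5) can be checked numerically in a few lines. the sketch below (python with numpy/scipy) uses an assumed 3-fraction toy transition matrix, not data from the paper; state 1 is the finest fraction, which cannot break further, hence the absorbing first row (cf. the structure of the matrix $P$ discussed below):

```python
import numpy as np
from scipy.linalg import expm, logm

# toy 3-fraction milling chain; rows sum to one. assumed example values.
P = np.array([[1.0, 0.0, 0.0],      # finest fraction stays as it is
              [0.3, 0.7, 0.0],      # 30 % of the middle fraction breaks per step
              [0.2, 0.3, 0.5]])     # coarsest fraction
pi0 = np.array([0.0, 0.0, 1.0])     # feed: all material in the coarsest bin

pi_k = pi0 @ np.linalg.matrix_power(P, 5)   # eq. (2): pi(k) = pi0 P^k

A = logm(P)                                 # eq. (5): A = ln P
pi_t = (pi0 @ expm(A * 5.0)).real           # eq. (4): pi(t) = pi0 exp(A t)

print(pi_k, pi_t)   # the two coincide at the integer time t = k = 5
```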
the procedures for finding the logarithmic and exponential functions of a matrix are well known [5]. expression (4) enables us to determine the share of particles of each fraction at any moment $t$ while grinding a batch of an ideal mixture in a device.

the hydrodynamic conditions in the device obviously have an essential influence on the milling process. they should therefore be reflected in the mathematical description. in accordance with the theory of markov processes, we have carried out a theoretical analysis of the grinding process in a continuously operated plug-flow (theoretical displacement) device. the equation describing the continuous process of grinding in a shock-centrifugal milling device with plug flow [5], in terms of markov processes, is:

$$ \frac{\partial \pi(t,x)}{\partial t} = \pi(t,x)\, A - v\, \frac{\partial \pi(t,x)}{\partial x}\,. \qquad (6) $$

here $\pi(t,x)$ is the particle size distribution at the moment $t$ after passage of the distance $x$ along the device (the vector of state probabilities), and $v$ is the linear speed of the stream. applying the laplace transformation to equation (6) twice [7], we obtain an expression for the image of the vector of the state probabilities:

$$ \pi(s,p) = \left( \pi(0,p) + \frac{v\,\pi_0}{s} \right) (sE - A + vpE)^{-1}\,. \qquad (7) $$

here $L[t] = s$; $L[x] = p$; $L[\pi(t,x)] = \pi(t,p)$; $L[\pi(t,p)] = \pi(s,p)$; $L[\pi(0,x)] = \pi(0,p)$, where $\pi(0,x)$ is the vector of the state probabilities at the initial moment of time in the section specified by the distance $x$:

$$ \pi(0,x) = \begin{cases} \pi_0\,, & x = 0\,, \\ 0\,, & x \neq 0\,. \end{cases} \qquad (8) $$

here $\pi_0$ is the density of the distribution of the state probabilities at the initial moment or, in other words, the density of the particle size distribution at the moment $t = 0$ at the input of the device. according to (8),

$$ \pi(0,p) = \begin{cases} \pi_0 / p\,, & x = 0\,, \\ 0\,, & x \neq 0\,. \end{cases} $$

therefore the image of the solution of equation (6) is: if $x \neq 0$, then

$$ \pi(s,p) = \frac{v\,\pi_0}{s}\, (sE - A + vpE)^{-1}\,; \qquad (9) $$

if $x = 0$, then

$$ \pi(s,p) = \left( \frac{\pi_0}{p} + \frac{v\,\pi_0}{s} \right) (sE - A + vpE)^{-1}\,. $$

we are interested in the case when $x \neq 0$, since the density of the probability distribution at the entrance of the grinding device ($x = 0$) is known at any instant; it equals $\pi_0$. so, having applied the inverse laplace transform to expression (9), first with respect to the variable $s$ and then with respect to the variable $p$, we obtain the density of the distribution of the state probabilities at any point $x$ at any moment $t$. in other words, we obtain the particle size distribution in any section $x$ at any moment $t$. note that under steady conditions $t \to \infty$, and the limiting vector of the density of the probability distribution, $\pi_\infty$, has components dependent on the value $x/v$ and also on the initial density of the probability distribution $\pi_0$. given the stationary particle size distribution $\pi_\infty$ at the output of the device and the average speed of the stream, we can define, for example, the necessary length of the device.

using the constructed model, we can also solve the inverse problem, i.e. we can find the elements of the matrix $A$ of the transition intensities. equation (6) for the vector $\pi_\infty$ of the limiting probabilities of the states becomes

$$ \frac{d\pi_\infty(x)}{dx} = \pi_\infty(x)\, \frac{A}{v}\,, \quad (0 \le x \le L)\,, \qquad (10) $$

where $L$ is the total length of the grinder. the boundary condition is

$$ \pi_\infty(0) = \pi_0\,. \qquad (11) $$

solving equation (10) in view of the boundary condition (11), we obtain the density of the limiting probabilities

$$ \pi_\infty(x) = \pi_0 \exp\!\left( A\,\frac{x}{v} \right), \quad (0 \le x \le L)\,. $$

substituting $x = L$, the elements of the matrix $A$ can be determined by solving the system of equations constructed according to the condition

$$ \pi_\infty(L) = \pi_0 \exp\!\left( A\,\frac{L}{v} \right) \qquad (12) $$

and the condition that the sum of the elements in each row of the matrix equals zero.
the matrix $A$ has the following appearance:

$$ A = \begin{pmatrix} 0 & 0 & \ldots & 0 \\ a_{21} & a_{22} & \ldots & 0 \\ \ldots & & & \\ a_{n1} & a_{n2} & \ldots & a_{nn} \end{pmatrix}, $$

where $a_{ij} = 0$ if $i < j$, $a_{11} = 0$, and

$$ a_{21} + a_{22} = 0\,; \quad \ldots\,; \quad a_{n1} + a_{n2} + \ldots + a_{nn} = 0\,. \qquad (13) $$

thus, having obtained experimentally the steady-state particle size distribution $\pi_\infty$ at the output of the device, we can unequivocally find the elements of the matrix $A$, and also the elements of the transition matrix $P$. the general form of the matrix $P$ is:

$$ P = \begin{pmatrix} 1 & 0 & \ldots & 0 \\ p_{21} & p_{22} & \ldots & 0 \\ \ldots & & & \\ p_{n1} & p_{n2} & \ldots & p_{nn} \end{pmatrix}. $$

considering that the elements of the matrix $P$ are formed as $p_{ij} = p_i \varphi_{ij}$, we can find the probabilities $p_i$ of destruction of the particles of each fraction $i$, and the probabilities $\varphi_{ij}$ of formation of particles of the $j$-th fraction at the destruction of larger particles of the $i$-th fraction $(i = 2, \ldots, n)$.

similarly, a model can be constructed for the continuous milling of a dispersed material in an ideally mixed device, modeling it by a markov process with discrete states and continuous time. in this case, the equation for $\pi(t)$ is [6]:

$$ \frac{d\pi(t)}{dt} = \pi(t) A + \frac{q}{V}\, \big( \pi(0) - \pi(t) \big)\,. \qquad (14) $$

here $q$ is the volumetric flow of the dispersed material, and $V$ is the operating volume of the device. in the case of an intermediate hydrodynamic mode, the grinding process can be simulated by means of a cell model. the process is then described by a system of differential equations of type (14); the number of equations should be equal to the number of ideally mixed cells into which the device is broken down. using a set of blocks that simulate milling in a continuous-action device with various hydrodynamic modes, it is possible to solve problems of modeling, optimization and engineering design of processes combined with grinding.

the mathematical model (6), (8) of particle milling in a continuously operated plug-flow device has been used to describe the process of milling in a rotor-pulse grinder. note that a two-level shock-reflective grinder working in a single pass is close, in the hydrodynamic structure of the dispersed particle stream, to a continuously operated plug-flow device. if we know the vector $\pi$ describing the distribution of the state probabilities, it is possible to find the density of the probability distribution $f$, in m−1 (or %·m−1). this procedure is well known in probability theory. obviously, $f$ is identical to the particle size distribution density.

4 a check on the adequacy of the mathematical model

in order to obtain the experimental density of the particle size distribution, we used the results of research on the grinding process in a shock-centrifugal mill. the check on the adequacy of the mathematical model of grinding involves comparing the calculated density of the particle size distribution, from equations (10) and (11), with the experimental density. the experiments proved that the time necessary to reach steady-state conditions was a few seconds or less. as an example, the results of a check on the adequacy of the mathematical model of grinding in the mill are shown. the operation of such equipment can be described by eqs. (6) to (13). for reasons of convenience, the calculations and the experiments used quartz sand and benzoic acid. the linear speed of the material was taken as v = 24 m·s−1 in both the calculations and the experiments. to obtain calculated values for the density of the particle size distribution, it is necessary to make use of expression (12).
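expression (12) is straightforward to evaluate numerically. the sketch below (python) computes the outlet distribution; the speed v = 24 m·s−1 is the value quoted above, while the intensity matrix, the device length and the feed vector are illustrative assumptions:

```python
import numpy as np
from scipy.linalg import expm

# toy intensity matrix with the structure required by (13): lower triangular,
# zero first row, each row summing to zero. units: 1/s. assumed values.
A = np.array([[ 0.0,  0.0,   0.0],
              [ 8.0, -8.0,   0.0],
              [ 5.0,  7.0, -12.0]])
pi0 = np.array([0.0, 0.0, 1.0])      # feed: coarsest fraction only
v, L = 24.0, 2.0                     # stream speed (m/s); assumed length (m)

pi_out = pi0 @ expm(A * L / v)       # eq. (12), residence time tau = L / v
print(pi_out, pi_out.sum())          # outlet size distribution, sums to 1
```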
having determined the probabilities of particle destruction $p_i$ and the values of the distribution functions $\varphi_{ij}$, and also the residence time of the particles $\tau = L/v$ corresponding to the regime and design parameters, we find the matrix $P$ and then the matrix $A$ ($A = \ln P$). having substituted the residence time $\tau$, the elements $a_{ij}$ of the matrix $A$ and the coordinates of the initial vector $\pi_0$ into the system of equations (12), we calculate the values of the particle size distribution density at the output of the device.

fig. 1: comparison of the experimental (solid lines) and calculated (dotted lines) density of the size distribution of quartz sand particles at the output of the device (n = 3000 rpm, g = 100 g/min): 1 – initial material; 2 – t = 1 min
j.: system analysis of processes of chemical technology. processes of crushing and mixture of loose materials. m.: science, 1985. [7] ditkin, v. a., prudnikov, a. p.: operation calculus. m.: higher school. 1966. [8] padokhin, v. a., zueva, g. a.: stochasticmarkov models of mechanostructural transformations in inorganic substances and high-molecular connections. energy and resource saving technologies and equipment, ecologically safe manufactures, ivanovo, 1, 2004, p. 215–224. [9] mizonov, v.: application of multi-dimensional markov chains to model kinetics of grinding with internal classification. the 10th symposium on comminution heidelberg, 2002, p. 14. [10] kutepov, a. m.: stochastic theory of processes of division of heterogeneous systems. iii intern. conf. theoretical andexperimental bases ofcreation of the new equipment. ivanovo, 1977. [11] ternovskij, a. m., kutepov, a. m.: hydrocyclones. m.: science, 1994. [12] kolmogoroff, a. n.: dansssr, 31, 99, (1941) (in russian). [13] nepomniatchiy, e. a.: toht, xi, 477 (1977) (in russian). [14] feller, w.: an introduction to probability theory and its applications. 3rd edition. wiley, 1968. 74 acta polytechnica vol. 50 no. 2/2010 prof. galina a. zueva e-mail: galina@isuct.ru ivanovo state university of chemistry and technology pr. f. engelsa 7, 153000 ivanovo, russia prof. valeriy a. padokhin institute of solution chemistry russian academy of sciences ul. akademicheskaya 1, 153045 ivanovo, russia prof. ing. pavel ditl, drsc. phone: +420 224 352 549 e-mail: pavel.ditl@fs.cvut.cz department of process engineering faculty of mechanical engineering czech technical university in prague technická 4, 166 07 prague 6, czech republic nomenclature a matrix of intensities of transitions aij components of matrix of transition intensities, s −1 d diameter of particle, m f density of distribution of probabilities of states which is identical to particle size distribution density, m−1 or % · m−1 g mass flow of a material, kg · sec−1 i, j index of state k time step, sec l device length, m l laplacian n rotation speed of the grinder rotor, sec−1, rpm p matrix of transition pi probability of particle destruction for the i-th fraction q volumetric flow rate of a crushed material, m3 · sec−1 v operating volume of the grinder, m3 v linear speed of a stream of material, m · sec−1 t time, sec x variable, distance coordinate, m greek letters ϕij probability of formation of particles of the j-th fraction at destruction of larger particles of the i-th fraction π(k) vector of state probabilities in time step k, 1 or % π0 vector of initial probability distribution, m −1, 1 or % π(t, x) particle size distribution density at the moment t at passage of distance x, m−1 π∞ stationary particle size distribution density on a device output, m −1 τ residence time, sec 75 ap-5-11.dvi acta polytechnica vol. 51 no. 5/2011 prague’s water supply station in podoĺı — a solution for the problems of clean water in the 1930s k. drnek abstract in the 1920s prague was seeking a solution to the problem of supplying its inhabitants with drinkable water. the water plant in káraný was not able to provide enough water, and the bold plan to bring water from a reservoir and to provide a dual system of potable and non-potable water faced an uncertain future. in order to stave off the crisis and make time to complete its plans, the city council decided to construct a new water supply plant inside the city next to the vltava river in the city district of podoĺı. keywords: káraný, ing. 
jaroslav vancl, podolí, puech-chabal, filtration, large mesh filters, prefilters, fine filters, ing. antonín engel.
1 introduction to the problem
in the 1920s prague was faced with a crucial problem — how to provide drinkable water for the inhabitants of the city. although there was a new system based on the newly constructed water plant in káraný, prague needed more water than this new source was able to provide. the water plant was modernized and expanded, but the amount of water that could be drawn was inadequate. new ideas about new resources were put forward, but nothing changed until the end of the 1920s, when the new water plant was built next to the vltava river in podolí.
2 prague's water supply system
in 1913, just before the demise of the austro-hungarian monarchy, prague finally completed the long proclaimed project of building a new water supply system that was able to transport clean water to the biggest city in the czech lands. the project was completed just in time: the first world war was imminent, and without this source of clean water the citizens of prague would have suffered more health problems than they did. however, shortly after the first world war ended, problems of water supply again came into the foreground. prague, the capital city of the newly established republic, was enlarged and combined with its suburbs, and the now increased number of inhabitants was in urgent need of clean water. the newly constructed water supply plant was suddenly inadequate, and the city authorities began to look for a solution.
the system that was supplying prague with clean water after the first world war was based on four sources, which supplied water of varying quality. the main source was the new water plant in káraný, which delivered water only to the inner city and the inner suburbs. the second biggest water source was infiltrated water from the vltava river (the infiltration was mixed with water from káraný); this water plant was located in bráník (founded in 1876), and water from this source was delivered into just two other city districts. both of these sources were capable of supplying water suitable for drinking. the third and fourth sources were both from the vltava: the river provided water for the factories around the city (many of them had their own water plants) and for the water wells that were located throughout the city. both of these sources supplied water that was not suitable for drinking. the main water source was still káraný, and the city fathers were working on providing enough water from káraný to supply the whole city. despite the modernization works, ing. jaroslav vancl, head of the water department in the technical office of the city, set up a project for a future extension of the water supplies.
2.1 water plant in káraný
the water plant was first set up in 1905, and was finally completed in 1913. the plant was planned only for the inhabitants of inner prague and the surrounding suburbs, and it was designed to supply only 120 l per day and per person. by the end of the war, the amount of water supplied had risen to 170 l per day and per person, but the dilapidated pipelines also led to considerable losses. during the 1920s, the capacity of the plant was enlarged to its final amount of 80000 m3 per day. however, this was not enough, because the number of people connected to the water plant kept growing, as did the amount of water used per person, which reached 200 l per day.
fig. 1: the water plant in káraný
2.2 the plan of ing. jaroslav vancl
in 1921, ing. j. vancl presented a plan to the city council. he had evaluated the development of the city, the growing number of inhabitants and the increase in the water drawn from káraný, and made a forecast for the next 70 years. his answer to the growing need for more water was to build dual water pipelines to prague, and to build more water plants to collect more water.
the dual pipelines were planned to supply water of two different types: drinkable water would be supplied from káraný, and non-potable water would be taken from the vltava, which could provide a sufficient amount of water for the needs of the city. ing. j. vancl planned to bring 130 l of water per day and per person into the city; only 50 l would be drinkable and drawn from káraný, while the rest (80 l) would be drawn from the vltava. a new water plant was planned to the south of the capital, in the village of štěchovice, where a huge dam would be constructed according to the american model. the main problems which prevented this plan from being implemented were the enormous cost (770 mil. kč to build all the dual pipelines and the dam) and the risk of bringing unclean water into the houses in prague: nobody could guarantee that the inhabitants would not drink this water.
3 podolí water supply plant
while the city council was still assessing vancl's plan, the situation in prague was slowly becoming worse and worse. in 1923, irrespective of vancl's plan, the city council decided to build a new water supply plant, which would supplement the water brought to the city from káraný.
fig. 2: the facade of the filtration building
fig. 3: location of the water supply plant
the new plant was located in podolí. two water plants had been located in podolí in the past, and the city council decided it was a suitable place for a new plant.
3.1 dispositions
the two old water plants in podolí were both founded in the 1880s. one belonged to the town of vinohrady (a suburb with the status of a town), while the other belonged to the city of prague. in the early 1920s, several tests were carried out to find the chemical composition of the underground water, which was planned to be the main source of drinkable water. unfortunately, the underground water was found to contain too much iron and manganese, and the water was also too hard. the solution was to use the ground water and mix it with river water. to collect ground water, the old system of collecting wells was used. they were dug on schwarzenberg island (now called the rowing island). the wells were built with an average length of 4 m and a depth of 6.50 m, and were placed 4 m under the surface of the island. they were connected to the water plant by a drainage pipe. all the machinery was newly rebuilt. the river water was taken through the newly built riverbed located next to the old water plant formerly belonging to vinohrady. because of the moderate quality of the river water, the builders had to construct a sophisticated filtration system, which had been proposed by vancl. it was decided to use the puech-chabal system, and a contract with chabal & cie. was signed in 1921. this system was considered one of the best, and the city already had good experience with it, as it had been used at the institute for the insane at bohnice.
fig. 4: scheme for taking water from the vltava
the filtration system was planned to have a maximum capacity of 35000 m3 of water per day.
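to put these capacities into perspective, a quick back-of-the-envelope check (my own illustrative arithmetic, using only the consumption and capacity figures quoted above) relates the plant outputs to the post-war consumption of roughly 200 l per day and per person:

```python
# illustrative arithmetic only: how many inhabitants each plant could serve
consumption = 200 / 1000          # ~200 l per person per day -> 0.2 m3

karany = 80_000                   # m3/day, final capacity of the karany plant
podoli = 35_000                   # m3/day, planned capacity of the podoli filters

print(karany / consumption)       # ~400,000 people served by karany alone
print(podoli / consumption)       # ~175,000 further people served by podoli
```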
3.2 construction
in 1922, when the plans for the filtration plant had come from chabal & cie., prague also received the plans for constructing the water plant. these were just simple plans for the facade, and the whole building was a simple structure. the skeleton of the building was based on a roof structure comprising vaults with a span of 6.50 m, which formed twelve longitudinal fields. as the project developed, the original plans for the building were found to be inadequate, and a tender for the architectural design of the building was announced. the winner of the competition was ing. antonín engel. engel proposed to build in stages, with each stage supported by a system of columns. later, because the columns made the system too complex and too confusing, engel changed the plans and devised a system of parabolic arches. these arches had a span of 24 m, and the biggest arch, above the prefilters, was 16 m in height. for this system, one huge hall, 60×24 m in size, was constructed in the water plant. the system was designed by prof. ing. františek klokner and ing. dr. bohumil hacar, who also made the calculations. further changes were made when ing. a. engel designed the windows and addressed the problem of sufficient sunlight: the original windows were too small, and they were replaced by squared windows over the whole area of the walls. the building was almost completely made of concrete, and was one of the biggest concrete structures in the whole of czechoslovakia. the entire complex of the water plant was to consist of four buildings — two filtration plants, one engine chamber and the administrative building. in the original project, only the first filtration plant (the northern one) and the two other support buildings were constructed. the whole complex cost kč 17316039.
fig. 5: the change in the architectural design of the building
3.3 filtration system
the system consisted of four levels: • large mesh filters • prefilters • fine filters • tanks for clean water. because of a lack of space in the water plant, it was decided to stack the levels above each other, and not to place them side-by-side, as was normal in the puech-chabal system. the large mesh filters and the prefilters were on the top floor, the fine filters were on the ground floor, and the tanks for clean water were located in the basement. the filters are divided into twelve parts, and the whole system is divided into two halves by an expansion joint in the longitudinal direction and also in the transverse direction. raw water is taken from the river and from the collecting wells through the engine building to the filtration building.
fig. 6: scheme of the pipelines heading from the engine building to the filtration building
fig. 7: longitudinal scheme of the filters
fig. 8: scheme of water production
3.3.1 filters
the system of the filters is quite simple. sand and gravel are put into a system of filtration beds (54 cm in length and 7 cm in height) with 64 holes that expand conically. the water flows through the holes to the bottom of the filter and then to the small chamber that collects the water before it flows down the cascade to the next level of the filtration.
fig. 9: scheme of the filtration beds
3.3.2 large mesh filters
the first level of filtration comprised three degrees to ensure complete removal of gross contaminants. the degrees of filtration were located above each other, like the rest of the filters.
table 1: parameters of the large mesh filters
first degree: floor area 259.30 m2, filter height 35 cm, grain size 2–2.5 cm, speed of filtration 116 m3/m2 per 24 hrs
second degree: floor area 404 m2, filter height 40 cm, grain size 1–2 cm, speed of filtration 74 m3/m2 per 24 hrs
third degree: floor area 1371 m2, filter height 50 cm, grain size 5–10 mm, speed of filtration 22 m3/m2 per 24 hrs
the first degree is supplied with unfiltered water from the doubled feeder channel, which has separate spaces for groundwater and for river water (the channel for the river water is enclosed, while the groundwater channel is open). at the expansion joint, the channel is connected with an iron pipeline with gibault system expansion joints. the water falls into the first degree through a squared hole located 7 cm below the water level in the channel. from the first degree, the water falls to the second degree and finally to the third degree of large mesh filters. in each degree, the water level above the gravel filter is about 75 cm. each step is separated from the others by a water cascade, which helps to oxidate the water. the cascades between the degrees are 30 cm in height, and the difference in water level between the first degree and the third degree is 2.10 m. the coarse sediments are removed in the large mesh filters: they settle on the surface of the filters, and this is the active part of the cleaning process.
3.3.3 prefilters
the prefilters are also divided into six parts with an expansion joint. however, due to the huge floor space, they are further divided into two parts: there are six prefilters on each side in two lines, with three prefilters located in each line, one behind the other.
fig. 10: scheme of the filters
table 2: parameters of the prefilters
prefilters: floor area 2625 m2, filter height 70 cm, grain size up to 7 mm, speed of filtration 11.4 m3/m2 per 24 hrs
the active water treatment system is the same as in the large mesh filters; the process and the filters in the prefilters are of course finer than before. the prefilters and their sand filters are also the main part of the filtration. the membrane on the prefilters is made from fine colloidal suspensions, which are taken from the water. to accelerate the formation of the membrane, unfiltered water can also be drawn from the large mesh filters.
fig. 11: scheme of the fountains
3.3.4 fine filters and tanks for clean water
the fine filters are the last level of the filtration process. they are located over the whole ground floor area, and the prefiltered water is brought there by a system of fountains, making use of the big difference in height between the prefilters and the fine filters.
table 3: parameters of the fine filters
fine filters: floor area 5604 m2, filter height 90 cm, grain size up to 4 mm, speed of filtration 5.3 m3/m2 per 24 hrs
the fountains are an important part of the oxidation of the water; together with the cascades between all levels of the filters, they accelerate the water cleaning process. all twelve parts of the fine filters have their own longitudinal groove. this groove collects the final product, filtered water, and leads to the pipelines that take the water into the collecting tanks.
fig. 12: plan of the fine filters
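read this way, with the tabulated filtration speeds taken as face velocities (m3 of water per m2 of filter area per 24 hours), which is my interpretation of the source units rather than an explicit statement in the article, the five stages are mutually consistent: each processes roughly the same daily volume, close to the plant's nominal 35000 m3 per day. a short check:

```python
# cross-check of tables 1-3: daily throughput per stage = floor area x face velocity
# (units assumed: area in m2, filtration speed in m3 per m2 per 24 h)
stages = {
    "large mesh, 1st degree": (259.30, 116),
    "large mesh, 2nd degree": (404,     74),
    "large mesh, 3rd degree": (1371,    22),
    "prefilters":             (2625,    11.4),
    "fine filters":           (5604,     5.3),
}
for name, (area, velocity) in stages.items():
    print(f"{name:>24s}: {area * velocity:8.0f} m3/day")
# every stage comes out near 30,000 m3/day, consistent with the planned
# maximum capacity of 35,000 m3 of water per day quoted earlier
```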
the tanks for clean water have a capacity of 15000 m3 of water. this volume was chosen so that water will be available even if the machines taking the water to the final areas work only 16 hours a day; this was done to avoid operating the machines in the daytime, when electricity is most expensive. the tanks are divided into four parts, and the water always flows because of the heading and suction pipelines, which are located 52 m apart.
3.4 after completion, and the present-day situation
the water from the podolí water plant entered prague's water supply system just in time. the whole project was constructed between 1923 and 1929, and in the final year the consumption of drinking water reached almost 200 l per day and per person. the water was to undergo biological tests lasting 3–4 months, but there was no time to perform these. the water plant was completed in march 1929, and from may the water was drawn into prague's pipelines together with water from káraný, even though this had not been part of the plan: the plan had been to use water from the vltava only as non-drinking water. more recently, water from the river has been drawn into its own water tank in flora, which was connected with the old water tank for water from káraný. the connection for mixing these two different kinds of water was used only when there was an inadequate stock of drinkable water.
in the 1950s, the consumption of water rose again, and even with modernization and improvements of the engines the water supply plant was not able to meet the increasing requirements. as a result, the city council decided to extend the water plant. engel was again chosen as the leading architect, and he was given an opportunity to complete his project in terms of his old plans. the building was constructed between 1957 and 1965, but from 1960 onward the unfinished water plant started to provide water for prague's pipelines and became prague's main source of water supply. the water plant was in use until the great floods in 2002. although the water plant was not damaged by the floods, thanks to engel's good planning, it was withdrawn from the prague system. it is now a standby water plant for use if the main water plant in želivka is out of order.
acknowledgement
the research presented in this paper was supervised by prof. i. jakubec, faculty of arts, charles university in prague and was supported by research programme no. 263101 “man in the perspective of historical science”.
about the author
kryštof drnek was born in prague in 1985. he graduated from the faculty of arts, charles university in prague with a bachelor degree in 2007, and with a master degree in 2010. since then he has been studying for a phd degree in economic history in the department of economic history at the faculty of arts, charles university in prague.
mgr. kryštof drnek e-mail: drnekk@gmail.com dept. of economic and social history charles university in prague náměstí jana palacha 2, 116 38 praha 1, czech republic
acta polytechnica vol. 52 no. 2/2012
influence of the surface layer when the cmt process is used for welding steel sheets treated by nitrooxidation
i. michalec, m. marônek
abstract
nitrooxidation is a non-conventional surface treatment method that can provide significantly improved mechanical properties as well as corrosion resistance. however, the surface layer is a major problem during the welding process, and welding specialists face many problems regarding the weldability of such steel sheets. this paper deals with the properties of a nitrooxidized surface layer, and evaluates ways of welding steel sheets treated by nitrooxidation using the cold metal transfer (cmt) process. the limited heat input and the controlled metal transfer, which are considered the main advantages of the cmt process, have a negative impact on weld joint quality: an excessive amount of porosity is observed, probably due to the high content of nitrogen and oxygen in the surface layer of the material and the fast cooling rate of the weld pool.
keywords: steel sheet, nitrooxidation, cold metal transfer, porosity.
1 introduction
steel sheets with surface treatment are widely used, above all in the automotive industry. this research work is in response to a problem encountered in practical applications of this material. an investigation is made of a new surface treatment method that has a major effect on improving the surface properties of steel sheets, e.g. their corrosion resistance. nitrooxidation in a fluidized bed is a non-conventional surface treatment method; the process consists of nitridation with subsequent oxidation. nitrooxidation leads to a significant increase in corrosion resistance (up to level 10), and also to an increase in mechanical properties, e.g. yield strength and tensile strength [1, 2, 3]. the process of welding damages the surface layer, and it is necessary to find a way to minimize this damage. a pulse process is applied to limit the thermal input in arc welding, where there is periodic pulsation of the welding voltage and the welding current. the cmt process, an innovative welding method, can also be used. cmt is a revolutionary short-circuit welding method, which works with digitally controlled short-circuit metal transfer in the welding arc. for the first time, this method integrates the wire motion into the welding process and into the overall control of the process. in this way, the thermal input can be limited, which is a major advantage of this welding method [1, 4, 5].
aims
previous studies [3, 5, 6] dealt with ways of welding steel sheets treated by nitrooxidation. several defects were identified in most of the tested methods, e.g. spatter, porosity and weld bead irregularities; only solid-state laser beam welding was considered satisfactory. because of the high initial cost of laser equipment, our research focused on finding an adequate option in the field of arc welding. the main aim of the research was to evaluate the possibility of welding nitrooxidized material using the innovative cmt welding method.
2 materials and methods
thin dc 01/din en 10130-9 steel sheets 1 mm in thickness were used in the experiments. the chemical composition of the steel is shown in table 1. the material was subjected to the nitrooxidation process: a fluid environment consisting of al2o3 grains 120 μm in diameter was used in the nitridation treatment, and the oxidation process was carried out in vapours of distilled water immediately after nitridation. the nitrooxidation treatment parameters are presented in table 2.
table 1: chemical composition of dc 01 en 10130/91 steel
dc 01: c max. 0.10 %, mn max. 0.45 %, p max. 0.03 %, s max. 0.03 %, si 0.01 %, al –
table 2: the nitrooxidation parameters
nitridation: temperature 580 ◦c, time 45 min
oxidation: temperature 350 ◦c, time 5 min
table 3: the chemical composition of ok aristorod 12.50
c 0.10 %, si 0.90 %, mn 1.50 %
the experiments were performed at fronius ltd., trnava. the welding process configuration is shown in figure 1. robotised cmt welding was used, and a total of 26 specimens with dimensions of 25×10×1 mm were produced. the welding parameters are presented in table 4. i joints, overlapped joints and flange joints were welded. ok aristorod 12.50, 1 mm in diameter, was used as the filler metal; its chemical composition is shown in table 3. corgon 18 (18 % co2, 82 % argon) with a flow rate of 15 l·min−1 was applied as the shielding gas.
fig. 1: welding process configuration
table 4: welding parameters of the specimens (shielding gas corgon 18 and filler metal ok 12.50 for all specimens); columns: specimen no., welding current [a], welding voltage [v], wire feed rate [m·min−1], welding speed [mm·s−1]
1: 92, 10.1, 3.6, 7
2: 92, 10.1, 3.6, 7
3: 81, 11.3, 3.9, 7
4: 90, 11.4, 3.6, 20
5: 89, 10.4, 3.6, 33
6: 90, 10.1, 3.7, 10
7: 88, 9.0, 5.3, 20
8: 88, 9.0, 5.3, 20
9: 88, 9.0, 5.3, 20
10: 92, 10.1, 3.6, 7
11: 92, 10.1, 3.6, 7
12: 81, 11.3, 3.9, 7
13: 90, 11.4, 3.6, 20
14: 89, 10.4, 3.6, 33
15: 90, 10.1, 3.7, 10
16: 98, 12.8, 5.0, 20
17: 86, 9.0, 5.2, 20
18: 94, 9.3, 5.5, 20
19: 103, 9.8, 6.0, 20
20: 103, 9.8, 6.0, 20
21: 107, 10.9, 4.5, 20
22: 120, 11.5, 5.0, 20
23: 158, 11.1, 5.0, 20
24: 107, 10.9, 4.5, 25
25: 179, 12.2, 6.0, 20
26: 158, 11.1, 5.0, 20
the evaluation of the material and of the specimens was performed at the faculty of materials science and technology in trnava. microscopic analysis, spectroscopy and microhardness measurements were performed as an analysis of the material properties; macroscopic analysis, microscopic analysis and microhardness measurements were performed for an evaluation of the joints. gd–oes–qdp (glow discharge – optical emission spectroscopy – quantitative depth profiling) was performed with the following process parameters: the current was 15 ma at a voltage of 1000 v. the analysis was performed on a nitrooxidized material and also on a material without surface treatment. vickers microhardness measurements were performed using buehler indentamet 1100 series equipment. the load force applied to the specimen was 981 mn, and the loading time was 10 s. the measurements were carried out on nitrooxidized and non-nitrooxidized steel sheets 1 mm in thickness. the measurements were made transversely to the material thickness (from top to bottom), and were repeated on three separate specimens (nitrooxidized and without nitrooxidation) in order to obtain average values. the distance between the indents was 80 μm, and the depth of the first indent beneath the surface was 60 μm.
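table 4 spans nearly a factor of five in arc energy per unit length. as a rough orientation (my own estimate, not from the paper; the arc efficiency η below is an assumed placeholder typical of short-circuit transfer), the heat input can be taken as q = η·u·i/v:

```python
# rough arc-energy estimate per specimen; eta is an assumed placeholder,
# the paper does not quote a thermal efficiency for the cmt arc
def heat_input(current_a, voltage_v, speed_mm_s, eta=0.8):
    """heat input in J/mm: q = eta * U * I / v."""
    return eta * voltage_v * current_a / speed_mm_s

print(heat_input(92, 10.1, 7))    # specimen 1:  ~106 J/mm
print(heat_input(89, 10.4, 33))   # specimen 5:  ~22 J/mm (fastest travel speed)
print(heat_input(179, 12.2, 20))  # specimen 25: ~87 J/mm (highest current)
```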
3 results
material analysis
the microstructure of the dc 01 steel after the nitrooxidation process is shown in figure 2. beneath the very thin oxide layer (up to 680 nm), an ε-phase 7–10 μm in thickness was observed; it was composed of fe2-3n. a compound layer, approximately 200 μm in thickness, was created under the ε-phase; it consisted of a ferritic matrix with needle-shaped γ′–fe4n nitrides. these phases were identified by scanning electron microscopy, using a jeol 7600-f scanning electron microscope.
fig. 2: microstructure of the dc 01 steel after nitrooxidation
the results of the gd-oes analysis (figure 3) showed that, after the nitrooxidation process, the nitrogen content had increased up to five times in the material surface. this increase was mainly at the expense of the iron content. the oxygen content at a depth of 1 μm was more than 50 % greater, which corresponds with the microscopic analysis, where an oxide surface layer 680 nm in thickness was observed.
fig. 3: gd-oes analysis of the material surface: a) material without nitrooxidation, b) material after nitrooxidation
fig. 4: microhardness trend comparison of nitrooxidized material (black) and material without surface treatment (red)
the results of the microhardness measurements are shown in figure 4. the graph compares the microhardness of the nitrooxidized material with the microhardness of the material without surface treatment. it can be observed that, after the nitrooxidation process, the material is affected to a depth of 400 μm. the highest microhardness values were observed on both surfaces of the steel sheet (distances 0 and 1000 μm in the graph); they were 47 % higher than the values for the material without surface treatment. due to limitations of the testing device, the microhardness values in the area of the ε-phase, where much higher values were expected [4], could not be measured.
analysis of the joints
inappropriate joints were rejected in a visual inspection: several defects, e.g. weld bead irregularity and weld bead reinforcement, were detected, and only specimens with the minimum amount of defects were analysed. the macroscopic analysis results are presented in figure 5. the picture shows an excessive amount of porosity, distributed over almost the full length of the joint; weld bead irregularity together with lack of fusion was also detected. the probable reason for the high porosity lies in the surface layer, which is rich in nitrogen and oxygen. however, the fast cooling rate may also have increased the porosity, by leaving insufficient time for the escape of gases originating in the nitrooxidized surface layer and dissolved in the weld metal during melting. these are as yet only hypotheses, and will be a topic for further research. the microscopic analysis results are shown in figure 6. the weld metal (wm — see figure 6a) was mainly composed of acicular ferrite; globular ferrite was observed along the columnar grain boundaries, and widmanstätten ferrite was also observed sporadically. figure 6b shows the interface between the high-temperature heat affected zone (hthaz) and wm. the hthaz consisted of upper bainite, in which fe3c needles were precipitated, together with coarse-grained acicular ferrite and proeutectoid ferrite; coarsening of the primary austenitic grains was also identified. the heat affected zone (haz — see figure 6c) was composed of a fine-grained structure consisting of globular ferrite.
fig. 5: macrostructure of the joints
fig. 6: microstructure of the joints: a) wm area, b) wm–hthaz interface, c) haz–bm interface
fig. 7: microhardness values trend
the results of the microhardness measurements (figure 7) proved that the highest microhardness values were detected in the wm. a continuous drop in microhardness was identified towards the haz. in the area of the bm, the microhardness stabilized, because there was no thermal affection within this area. these results correspond with the microscopic analysis, where acicular ferrite was observed in the wm area.
4 conclusion
the results show that for steel sheets treated by nitrooxidation there was a radical increase in microhardness values, up to 47 %, in comparison with the values for the same material without surface treatment. microscopic analysis identified the individual surface phases of the material treated by nitrooxidation. on the basis of the research results presented here, it can be stated that the given parameters of the cmt process were not suitable for welding steel sheets treated by nitrooxidation, due to the high level of porosity. future research will focus on changing the shielding gas and the filler metal, and on optimizing the parameters.
acknowledgement
this paper was prepared with support from the slovak research and development agency, grant no. 0057-07, and from the scientific grant agency, grant no. 1/0203/11.
references
[1] michalec, i.: cmt technology exploitation for welding of steel sheets treated by nitrooxidation. master's thesis, 2010.
[2] michalec, i. et al.: metallurgical joining of steel sheets treated by nitrooxidation by a hybrid cmt – laser process. in metal 2011 – 20th anniversary international conference on metallurgy and materials, brno: tanger, 2011.
[3] marônek, m. et al.: welding of steel sheets treated by nitrooxidation. in jom-16: 16th international conference on the joining of materials & 7th international conference on education in welding icew-7, tisvildeleje, 2011.
[4] dománková, m. et al.: influence of nitridation and nitrooxidation processes on microstructure and corrosion properties of low carbon deep-drawing steels. in materials science and technology [online], vol. 11, no. 1, 2011, p. 40–51.
[5] bárta, j. et al.: joining of thin steel sheets treated by nitrooxidation. in 15th seminar of esab: proceedings of lectures of the 15th esab + mtf stu seminar in the scope of seminars about welding and weldability. trnava: alumnipress, 2011, p. 57–67.
[6] jančár, j. et al.: laser beam utilisation in welding of steel sheets treated by nitrooxidation. in využití laseru v průmyslu. plzeň: 2011, p. 25–35.
ivan michalec e-mail: ivan.michalec@stuba.sk materiálovotechnologická fakulta stu katedra zvárania j. bottu 25, 917 24 trnava
milan marônek e-mail: milan.maronek@stuba.sk materiálovotechnologická fakulta stu katedra zvárania j. bottu 25, 917 24 trnava
acta polytechnica vol. 51 no. 1/2011
n = 4, d = 1 supersymmetric hyper-kähler sigma models and non-abelian monopole background
s. bellucci, s. krivonos, a. sutulin
abstract
we construct a lagrangian formulation of n = 4 supersymmetric mechanics with hyper-kähler sigma models in the bosonic sector in a non-abelian background gauge field. the resulting action includes a wide class of n = 4 supersymmetric mechanics describing the motion of an isospin-carrying particle over spaces with non-trivial geometry. in two examples that we discuss in detail, the background fields are identified with the field of bpst instantons in flat and taub-nut spaces.
keywords: supersymmetric mechanics, hyper-kähler spaces, non-abelian gauge fields.
1 introduction
n = 4 supersymmetric mechanics provides a nice framework for the study of many interesting features of higher-dimensional theories. at the same time, the existence of a variety of off-shell n = 4 irreducible linear supermultiplets in d = 1 [1, 2, 3, 4, 5] makes the situation in one dimension even more interesting, and this is what prompted us to investigate such supersymmetric models themselves, without reference to higher-dimensional counterparts. being a supersymmetric invariant theory, n = 4 mechanics admits a natural formulation in terms of superfields living in a standard and/or in a harmonic superspace [6], adapted to one dimension [7]. in any case, the preferred approach for discussing supersymmetric mechanics is the lagrangian one. being quite useful, the lagrangian approach has one subtle point when we try to describe the system in an arbitrary gauge background. while the inclusion of an abelian gauge background can be done straightforwardly [7], a non-abelian background requires new ingredients — isospin variables — which have to be included in the description [8, 9, 10, 11, 12, 13, 14, 15, 16, 17]. these isospin variables become purely internal degrees of freedom after quantization and form an auxiliary n = 4 supermultiplet, together with the auxiliary fermions.
there are various approaches for introducing such auxiliary superfields and couplings with them, but until now all constructed models have been restricted to conformally flat sigma models in the bosonic sector. this restriction has an evident source: it has been known for a long time that all linear n = 4 supermultiplets can be obtained through a dualization procedure from the n = 4 “root” supermultiplet — the n = 4 hypermultiplet [18, 19, 20, 21, 22, 23] — while the bosonic part of the general hypermultiplet action is conformal to the flat one. the only way to escape this flatness is to use nonlinear supermultiplets [24, 25, 26] instead of linear ones.
the main aim of this paper is to construct the lagrangian formulation of n = 4 supersymmetric mechanics on spaces conformal to hyper-kähler ones in non-abelian background gauge fields. to achieve this goal we combine two ideas:
• we introduce the coupling of matter supermultiplets with an auxiliary fermionic supermultiplet ψα̂ containing on-shell four physical fermions and four auxiliary bosons playing the role of isospin variables. the very specific coupling results in a component action which contains only time derivatives of the fermionic components present in ψα̂. then, we dualize these fermions into auxiliary ones, ending up with the proper action for matter fields and isospin variables. this procedure was developed in [11].
• as the next step, starting from the action for the n = 4 tensor supermultiplet [27, 28] coupled with the superfield ψα̂, following [24], we dualize the auxiliary component a into a fourth physical boson, finishing with an action whose bosonic-sector geometry is conformal to a hyper-kähler one.
the resulting action contains a wide class of n = 4 supersymmetric mechanics describing the motion of an isospin-carrying particle over spaces with non-trivial geometry and in the presence of a non-abelian background gauge field. in two examples that we discuss in detail, these backgrounds correspond to the field of the bpst instanton in the flat and taub-nut spaces. in order to make our presentation self-sufficient, we include in section 2 a sketchy description of our construction applied to the linear tensor and hypermultiplet.
we also discuss the relation between these supermultiplets in the context of our approach (section 3), which immediately leads to the generalized procedure presented in section 4. 13 acta polytechnica vol. 51 no. 1/2011 2 isospin particles in conformally flat spaces one way to incorporate the isospin-like variables in the lagrangian of supersymmetric mechanics is to couple the basic superfields with auxiliary fermionic superfields ψα̂,ψ̄α̂, which contain these isospin variables [11]. such a coupling, being written in a standard n = 4 superspace, has to be rather special, in order to provide a kinetic term of the first order in time derivatives for the isospin variables and to describe the auxiliary fermionic components present in ψα̂,ψ̄α̂. following [11], we introduce the coupling of auxiliary ψ superfields with some arbitrary, for the time being, n =4 supermultiplet x as sc =− 1 32 ∫ dtd4θ(x + g)ψα̂ψ̄α̂, g=const. (2.1) the ψ supermultiplet is subjected to the irreducible conditions [5] diψ1 =0, diψ2 + diψ1 =0, diψ 2 =0, (2.2) and thus it contains four fermionic and four bosonic components ψα̂ =ψα̂|, ui = −diψ̄2|, ūi = diψ1|, (2.3) where the symbol | denotes the θ = θ̄ = 0 limit and n =4 covariant derivatives obey standard relations{ di, dj } =2iδij ∂t. (2.4) it has been demonstrated in [11] that if the n = 4 superfield x is subjected to the constraints [29, 5] didix =0, did ix =0, [ di, di ] x =0, (2.5) then the component action which follows from (2.1) can be written as sc = ∫ dt [ −(x + g) ( ρ1ρ̄2 − ρ2ρ̄1 ) − i 4 (x + g) ( u̇iūi − ui ˙̄ui ) + 1 4 aij u iūj + (2.6) 1 2 ηi ( ūiρ̄2 + uiρ2 ) + 1 2 η̄i ( uiρ 1 + ūiρ̄ 1 )] , where the new fermionic components ρα̂, ρ̄α̂ are defined as ρα̂ = ψ̇α̂, ρ̄α̂ = ψ̇α̂. (2.7) the components of the superfield x entering the action (2.6) have been introduced as x = x|, aij = a(ij) = 1 2 [ di, dj ] x|, ηi = −idix|, η̄i = −idix|. (2.8) what makes the action (2.6) interesting is that, despite the non-local definition of the spinors ρα̂, ρ̄α̂ (2.7), the action is invariant under the following n =4 supersymmetry transformations: δρ1 = − ̄i ˙̄ui, δρ2 = i ˙̄ui, δui = −2i iρ̄1 +2i ̄iρ̄2, δūi = −2i iρ1 +2i ̄iρ2, δx = −i iηi − i ̄iη̄i, δηi = − ̄iẋ − i ̄j aij , δη̄i = − iẋ + i jaji , δaij = − (iη̇j) + ̄(i ˙̄ηj). (2.9) in the action (2.6) the fermionic fields ρα̂, ρ̄α̂ are auxiliary ones, and thus they can be eliminated by their equations of motion ρ1 = 1 2(x + g) ηiū i, ρ2 = − 1 2(x + g) η̄iūi. (2.10) finally, the actiondescribing the interactionofψand x supermultiplets acquires a very simple form sc = 1 4 ∫ dt [ −i(x + g) ( u̇iūi − ui ˙̄ui ) + aij u iūj + 1 x + g ηiη̄j ( uiūj + ujūi )] . (2.11) thus, in the fermionic superfieldsψ only the bosonic components ui, ūi, entering the action with a kinetic term linear in time-derivatives, survive. after quantization, these variables become purely internal degrees of freedom. in order to be meaningful, action (2.1) has to be extended by the action for the supermultiplet x itself. if the superfield x obeying (2.5) is considered as an independent superfield, then the most general action reads s = sx + sc = − 1 32 ∫ dtd4θf(x)+ sc, (2.12) where f(x) is an arbitrary function of x. in this case the components aij (2.8) are auxiliary ones, and they have to be eliminated by their equations of motion. the resulting action describes n = 4 supersymmetric mechanics with one physical boson x and four physical fermions ηi, η̄j interacting with isospin variables ui, ūi. this is the system that has been considered in [8, 10, 11]. 
it is clear that treating the scalar bosonic superfield x as an independent one is too restrictive, because the constraints (2.5) leave in this supermultiplet only one physical bosonic component x, which is not enough to describe the isospin particle. in the present approach, a way to overcome this limitation was proposed in [14, 17]. the key point is to treat superfield x as a composite one, constructed from n = 4 supermultiplets with a larger number of physical bosons. the two reasonable superfields 14 acta polytechnica vol. 51 no. 1/2011 fromwhich it is possible to construct superfield x are n = 4 tensor supermultiplet vij [27, 28] and a onedimensional hypermultiplet qiα [18, 19, 20, 21, 7]. tensor supermultiplet the n =4 tensor supermultiplet is described by the triplet of bosonic n = 4 superfields vij = vij subjected to the constraints d(ivjk) = d(ivjk) =0, ( vij )† = vij , (2.13) which leave in vij the following independent components: va = − i 2 (σa)i jvij|, λ i = 1 3 djvij|, (2.14) λ̄i = 1 3 djvji |, a = i 6 didjvij|. thus, its off-shell component field content is (3,4,1), i.e. three physical va and one auxiliary a bosons and four fermions λi, λ̄i [27, 28]. under n =4 supersymmetry these components transform as follows: δva = i i(σa)ji λ̄j − iλ i(σa)ji ̄j , δa = ̄iλ̇ i − i ˙̄λi, δλi = i ia + j(σa)ij v̇ a, (2.15) δλ̄i = −i ̄ia +(σa)ji ̄j v̇ a. now one may check that the composite superfield x = 1 |v| ≡ 1 √ vava , (2.16) where va = − i 2 (σa)i jvij, obeys (2.5) in virtue of (2.13). clearly, now all components of the x superfield, i.e. the physical boson x, fermions ηi, η̄i and auxiliary fields aij (2.8) are expressed through the components of the vij supermultiplet (2.14) as x = 1 |v| , ηi = va |v|3 (λσa)i, η̄i = va |v|3 (σaλ̄)i, aij = −3 vavb |v|5 (λσa)i(σbλ̄)j − va(σa)ij |v|3 a + (2.17) 1 |v|3 abcvav̇b(σc)ij + 1 |v|3 ( δij λ kλ̄k − λj λ̄i ) . in what follows, we will also need the expression for aij components (2.17) in terms of η i, η̄i fermions (2.8), which reads aij = − va(σa)ij |v|3 a + 1 |v|3 abcvav̇b(σc)ij − (2.18) |v| ( ηiη̄j + ηj η̄ i ) − 1 |v| va(ησaη̄) vb(σb)ij . finally, one should note that, while dealing with the tensor supermultiplet vij, onemay generalize the sx action (2.12) to have the full action in the form s = sv + sc = − 1 32 ∫ dtd4θf(v)+ sc, (2.19) wheref(v) is nowanarbitrary functionofvij. after eliminating the auxiliary component a in the component form of (2.19)wewill obtain the action describing the n = 4 supersymmetric three-dimensional isospin particle moving in the magnetic field of a wu-yang monopole and in some specific scalar potential [14]1. hypermultiplet similarly to the tensor supermultiplet one may construct the superfield x starting from the n =4 hypermultiplet. the n = 4 , d = 1 hypermultiplet is described in n =4 superspace by the quartet of real n =4 superfields qiα subjected to the constraints d(iqj)α = 0, d(iqj)α =0,( qiα )† = qiα. (2.20) this supermultiplet describes four physical bosonic and four physical fermionic variables qiα = qiα|, ηi = −idi ( 2 qjαqjα ) |, η̄i = −idi ( 2 qjαqjα ) |, (2.21) and it does not contain any auxiliary components [7, 18, 19, 20, 21]. one may easily check that if we define the composite superfield x as x = 2 qiαqiα , (2.22) then it will obey (2.5) in virtue of (2.20) [5]. 
for the hypermultiplet qiα we defined the fermionic components to coincide with those present in the x superfield (2.8),while the former auxiliary components aij are now expressed via the components of qiα as aij = − 4i (qkβ qkβ)2 ( q̇αi qjα + q̇ α j qiα ) − (qkβ qkβ) 2 (ηiη̄j + ηj η̄i) . (2.23) as in the case of the tensor supermultiplet, one may write the full action with the hypermultiplet selfinteracting part sq added as s = sq + sc = − 1 32 ∫ dtd4θf(q)+ sc, (2.24) 1an alternative description of the same system has recently been constructed in [16] 15 acta polytechnica vol. 51 no. 1/2011 where now f(q) is an arbitrary function of qiα. the action (2.24) describes the motion of an isospin particle on a conformally flat four-manifold carrying the non-abelian field of a bpst instanton [17]. this systemhas recently been obtained in different frameworks in [13, 15]. to close this section one should mention that, while dealing with the tensor supermultiplet vij and the hypermultiplet qiα, the structure of the action sc (2.1) can be generalized to be [17] sc = − 1 32 ∫ dtd4θ y ψα̂ψ̄α̂, (2.25) with y obeying �3y =0 in case of the tensor supermultiplet �4y =0 in case of the hypermultiplet. (2.26) here �n denotes the laplace operator in a flat euclidean n-dimensional space. clearly, our choice y = x + g with x defined in (2.16), (2.22) corresponds to spherically-symmetric solutions of (2.26). 3 from the hypermultiplet to the tensor supermultiplet and back oneof themost attractive features of our approach is the unified structure of the action sc (2.1) which has the same form for any type of supermultiplets that we are using to construct a composite superfield x. this iswhat opens theway to relate the different systems via duality transformations. indeed, it has been known for a long time [1, 2, 3, 4, 22, 23] that in one dimension one may switch between supermultiplets with a different number of physical bosons, by expressing the auxiliary components through the time derivative of physical bosons, and vice versa. here we will use this mechanism to obtain the action of the tensor multiplet (2.19) from the hypermultiplet (2.24) action and then, alternatively, action (2.24) (with some restrictions) from (2.19). inwhat follows, to make some expressions more transparent, we will use, sometimes, the following stereographic coordinates for the bosonic components of hypermultiplet (2.21) and tensor supermultiplet (2.14): q11 = e 1 2(u−iφ)√ 1+λλ λ, q21 = − e 1 2(u−iφ)√ 1+λλ , q22 = ( q11 )† , q21 = − ( q12 )† , (3.1) v 11 = 2i eu 1+λλ λ, v 22 = −2i eu 1+λλ λ, v 12 = −ieu ( 1−λλ 1+λλ ) . (3.2) one may easily check that these definitions are compatible with (2.16) and (2.22). from hypermultiplet to tensor supermultiplet themain ingredient for getting the tensor supermultiplet action from the hypermultiplet one is provided by the expression for “auxiliary” components aij in termsof the components of superfields vij andqiα in (2.18) and (2.23), respectively. identifying the right hand sides of (2.18) and (2.23), one may find the expression of the auxiliary component a present in the superfield vij in terms of components of qiα: a = i ( q̇i1q2i + q̇ i2q1i ) + 1 4 (qkαqkα) 2 qi1qj2 (ηiη̄j + ηj η̄i) . (3.3) another way, and probably the easiest one, to check the validity of (3.3) is to use the following superfield representation for the tensor supermultiplet [7]: vij = i ( qi1qj2 +qj1qi2 ) . (3.4) this “composite” superfield vij automatically obeys (2.13) as a consequence of (2.20). 
being partially rewritten in terms of components (3.1), expression (3.3) reads φ̇ = e−ua − i λ̇λ−λλ̇ 1+λλ − (3.5) 1 4 e−u(qkαqkα) 2 qi1qj2 (ηiη̄j + ηj η̄i) . thus, we see that, in order to get the action for the tensor supermultiplet, onehas to replace, in the componentaction for thehypermultiplet, the timederivative of field φ by the combination on the r.h.s. of (3.5), which includes the new auxiliary field a. an additional restriction comes from the sq part of the action (2.24), which now has to depend only on the “composite” superfield vij (3.4). if it is so, then in the full action (2.24) component φ will enter only through φ̇, and the discussed replacement will be valid. from the tensor supermultiplet to the hypermultiplet it is clear that the backward procedure also exists. indeed, from (2.17) and (2.23) one may get the following expression for a: a = 1 f [ φ̇ + ∂ ∂va f(λσaλ̄)− bav̇a ] , (3.6) where f = 1 |v| , and , b1 = − v2(v3 + |v|) (v21 + v 2 2)|v| , b2 = v1(v3 + |v|) (v21 + v 2 2)|v| , b3 =0. (3.7) 16 acta polytechnica vol. 51 no. 1/2011 it is easy to check that in the coordinates (2.14), (3.2) we have bav̇a = −i λ̇λ−λλ̇ 1+λλ and |v| = eu (3.8) in full agreement with (3.5). thus, to get the hypermultiplet action (2.24) from that for the tensor supermultiplet (2.19), one has to dualize the auxiliary component a into a new physical boson φ using (3.6). of course, we do not expect to get the most general action for thehypermultiplet interactingwith the isospin-containing supermultiplet ψ, because the sv part in (2.19) depends only on the vij supermultiplet. but we will surely get a particular class of hypermultiplet actions with one isometry, with the killing vector ∂φ. 4 hyper-kähler sigma model with isospin variables the considerationwe carried out in the previous section has one subtle point. indeed, if we rewrite (3.7) as φ̇ = bav̇a − f,a(λσaλ̄)+ f a, f,a ≡ ∂ ∂va f, (4.1) then the r.h.s. of (4.1) has to transform as a full time derivative under supersymmetry transformations (2.15). one may check that it is so, if f and ba are chosen as in (3.7). however, this choice is not unique. it has been proved in [24] that the r.h.s. of (4.1) transforms as a full time derivative, if the functions f and ba satisfy the equations �3f ≡ f,aa =0, f,a = abcbc,b. (4.2) thus, one may construct a more general action for four-dimensional n = 4 supersymmetric mechanics using the component action for the tensor supermultiplet and substituting there thenewdualizedversion of the auxiliary component a (4.1). integrating over theta’s in (2.19) and eliminating the auxiliary fermions ρα̂ (2.10), (2.17), we will get the following component action for the tensor supermultiplet: s = 1 8 ∫ dt [ f ( v̇av̇a + a 2 ) +i ( ξ̇iξ̄i − ξi ˙̄ξi ) + i abc f,a f v̇bσc − i f,a f σaa − 1 6 �3f f2 σaσa − 2i ( ẇiw̄i − wiw̄i ) + 4 1+3g|v| f(1+ g|v|)2|v|4 (vaia)(vbσb)− 4 g f(1+ g|v|)2|v| (iaσa)− 4i (1+ g|v|)|v|2 (vaia) a + 4i (1+ g|v|)|v|2 abcvav̇bic ] , (4.3) where f = �3 f(v)|, ia = i 2 (wσaw̄) , (4.4) σa = −i ( ξσaξ̄ ) , and the re-scaled fermions and isospin variables are chosen to be ξi = √ f λi, wi = √ g + 1 |v| ui. (4.5) substituting (4.1) into (4.3), we obtain the resulting action s = 1 8 ∫ dt [ f ( v̇av̇a + 1 f2 ( φ̇ − bav̇a )2) + i ( ξ̇iξ̄i − ξi ˙̄ξi ) −2i ( ẇiw̄i − wiw̄i ) − i [ 1 f δab ( φ̇ − bcv̇c ) + abcv̇c ] ·( f,a f σb + 4 (1+ g|v|)|v|2 vaib ) + 4 f 1+3g|v| (1+ g|v|)2|v|4 (vaia)(vbσb)− 1 f 4g (1+ g|v|)2|v| (iaσa)+ (4.6) 1 3f2 ( f,af,a f − f f,af,a f2 − 1 2 �3f ) σbσb ] . action (4.6) is our main result. 
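as an independent cross-check (my own, not part of the paper), one can verify numerically that the pair f = 1/|v|, ba of (3.6)–(3.7), which is (up to conventions) the dirac monopole potential in cartesian coordinates, satisfies the monopole condition f,a = ε_abc ∂_b b_c that is imposed below in (4.2), away from the string v1 = v2 = 0:

```python
# finite-difference check (mine, not from the paper) of f,a = eps_abc d_b b_c
# for f = 1/|v| and the vector potential b_a of (3.7)
import numpy as np

def f(v):
    return 1.0 / np.linalg.norm(v)

def b(v):
    v1, v2, v3 = v
    r = np.linalg.norm(v)
    rho2 = v1**2 + v2**2                   # distance squared from the dirac string
    return np.array([-v2 * (v3 + r), v1 * (v3 + r), 0.0]) / (rho2 * r)

def grad(fun, v, h=1e-6):
    e = np.eye(3)
    return np.array([(fun(v + h * e[i]) - fun(v - h * e[i])) / (2 * h)
                     for i in range(3)])

def curl(vec, v, h=1e-6):
    J = np.array([grad(lambda u, i=i: vec(u)[i], v, h) for i in range(3)])
    # J[a, c] = d_c b_a, so (curl b)_a = eps_abc d_b b_c
    return np.array([J[2, 1] - J[1, 2], J[0, 2] - J[2, 0], J[1, 0] - J[0, 1]])

v0 = np.array([0.3, 0.4, 0.5])             # a generic point away from v1 = v2 = 0
print(grad(f, v0))                         # gradient of f = 1/|v|
print(curl(b, v0))                         # agrees with grad f to numerical accuracy
```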
it describes the motion of a n = 4 supersymmetric four-dimensional isospin carrying particle in a non-abelian field of some monopole. the metric of this four-dimensional space isdefined in termsof two functions: thebosonic part of our pre-potential f (4.4) and the harmonic function f (4.2). the supersymmetric version of the coupling with the monopole (second line in action (4.6)) is defined by the same harmonic function f and the coupling constant g. in the more general case (2.25), we will have two harmonic functions — f and y , besides the pre-potential f . among all possible systems with action (4.6) there is a very interesting sub-class which corresponds to hyper-kähler sigma models in the bosonic sector. this case is distinguished by the condition f = f. (4.7) clearly, in this case the bosonic kinetic term of action (4.6)acquires the familiar formof the onedimensional version of the general hawking-gibbons solution for four-dimensional hyper-kähler metrics with 17 acta polytechnica vol. 51 no. 1/2011 one triholomorphic isometry [30]: skin = 1 8 ∫ dt [ f v̇av̇a + 1 f ( φ̇ − bav̇a )2] , �3f = 0, rot �b = �∇f. (4.8) it is worth noting that the bosonic part of n =4 supersymmetric four dimensional sigma models in one dimension does not necessarily have to be a hyperkähler one. this fact is reflected in the arbitrariness of the pre-potential f in action (4.6). only under the choice f = f is the bosonic kinetic term reduced to thegibbons-hawking form (4.8). let us note that for hyper-kähler cases the four-fermionic term in action (4.6) disappears. this fact has been previously established in [24]. now we can see that the additional interaction with the background non-abelian gauge field does not destroy these nice properties. among all possible bosonic metrics one may easily find the following interesting ones. conformally flat spaces there are two choices for the function f which correspond to the conformally flat metrics in the bosonic sector. the first choice is realized by f = 1 |v| . (4.9) this is just the casewe have considered in section 2. the gauge field in this case is the field of bpst instanton [17]. next, an almost trivial solution, also corresponding to theflatmetrics in thebosonic sector, is selected by the condition f =const., ba =0. (4.10) note that the relation with the tensor supermultiplet, in this case, is achieved through the following “composite” construction of vij [31] vij = q(iα). (4.11) onemay check that the constraints on vij (2.13) follow directly from (4.11) and (2.20). let us recall that in both these cases we have not specified the pre-potential f yet. therefore, the full metrics in the bosonic sector is defined up to this function. taub-nut space one should stress that the previous two cases are unique, because only for these choices of f can the resulting action (4.6) be obtained directly from the hypermultiplet action (2.24). with other solutions for f wecome to the theorywith thenonlinear n =4 hypermultiplet [24, 25]. among the possible solutions for f which belong to this new situation, the simplest one corresponds to one center taub-nut metrics with f = p1 + p2 |v| , p1, p2 =const. (4.12) in order to achieve the maximally symmetric case, we will choose these constants as p1 = g, p2 =1 → f = g + 1 |v| . (4.13) with such a definition, f coincides with the function y = g + 1 |v| (2.25) entering in our basic action sc in (2.1), (2.16). to get the taub-nut metrics, one has also to fix the pre-potential f to be equal to f. 
the resulting action which describes the n = 4 supersymmetric isospin carrying particle moving in a taub-nut space reads st aub−n ut = 1 8 ∫ dt [( g + 1 |v| ) v̇av̇a + 1( g + 1|v| ) (φ̇ − bav̇a)2 +i(ξ̇iξ̄i − ξi ˙̄ξi) − 2i ( ẇiw̄i − wiw̄i ) + i (1+ g|v|)|v|2 ·⎡ ⎣ va( g + 1|v| ) (φ̇ − bcv̇c) − abcvbv̇c ⎤ ⎦(σa −4ia)+ 4(1+3g|v|) (1+ g|v|)3|v|3 (vaia)(vbσb)− 4g (1+ g|v|)3 (iaσa) ] . (4.14) thebosonic term in the second line of this action can be rewritten as aaia = i 2 [ 1 f ∂ logf ∂va ( φ̇ − bav̇a ) − abc ∂ logf ∂vb v̇c ] ia, (4.15) where f is defined in (4.13). in this form the vector potential aa coincides with the potential of a yang-mills su(2) instanton in the taub-nut space [32, 33], if we may view ia, as defined in (4.4), as proper isospin matrices. the remaining terms in the second and third lines of (4.14) provide a n = 4 supersymmetric extension of the instanton. finally, to close this section, let us note that more general non-abelian backgrounds can be obtained from the multi-centered solutions of the equation for the harmonic function y (2.26), which defined the coupling of the tensor supermultiplet with 18 acta polytechnica vol. 51 no. 1/2011 auxiliary fermionic ones. thus, thevarietymodelswe constructed are defined through three functions: prepotential f (2.19)which is an arbitrary function, 3d harmonic function y (2.25), (2.26) defining the coupling with isospin variables and, through the again 3d harmonic function f (4.1), (4.2), which appeared during the dualization of the auxiliary component of the tensor supermultiplet. it is clear that we can always redefine f to be f = f̃ f. thus, all our models are conformal to hyper-kähler sigmamodels with n = 4 supersymmetry describing the motion of a particle in the background non-abelian field of the corresponding instantons. 5 conclusion in this paperwehaveconstructed thelagrangian formulation of n = 4 supersymmetric mechanics with hyper-kähler sigma models in the bosonic sector in the non-abelian background gauge field. the resulting action includes thewide class of n =4supersymmetricmechanicsdescribing themotionof an isospincarrying particle over spaces with non-trivial geometry. in two examples that we discussed in detail, the background fields are identified with the field of bpst instantons in the flat and taub-nut spaces. the approach used in the paper has utilized two ideas: (i) the coupling ofmatter supermultipletswith an auxiliary fermionic supermultiplet ψα̂ containing on-shell four physical fermions and four auxiliary bosons playing the role of isospin variables, and (ii) the dualization of the auxiliary component a of the tensor supermultiplet into a fourth physical boson. the final action that we constructed contains three arbitrary functions: the pre-potential f, a 3d harmonic function y which defines the couplingwith isospin variables and, again 3d harmonic, a function f which appeared during the dualization of the auxiliary component of the tensor supermultiplet. the usefulness of the proposed approach is demonstrated by the explicit example of the simplest system with non-trivial geometry — the n = 4 supersymmetric action for one-center taub-nut metrics. we identified the background gauge field in this case, which appears automatically in our framework, with the field of the bpst instanton in the taub-nut space. thus, one may hope that the other actions will possess the same structure. 
of course, the presented results are just preliminary in the quest for full understanding of n = 4 supersymmetric hyper-kähler sigma models in nonabelian backgrounds. interesting questions that remain unanswered include, in particular: • the full analysis of the general coupling with an arbitrary harmonic function y has yet to be carried out. • the structure of the background gauge field has to be further clarified: is this really the field of some monopole (instanton) for some hyperkähler metrics? • the hamiltonian construction is really needed. let us note that the supercharges have to be very specific, because the four-fermions coupling is absent in the case of hk metrics! • it is quite interesting to check the existence of the conservedrunge-lenz vector in the fully supersymmetric version. • explicit examples of other hyper-kählermetrics (say, multi-centered eguchi-hanson and taubnut ones) would be very useful. • questions of quantization and analysis of the spectra, at least in the cases of well known, simplest hyper-kählermetrics, are doubtless urgent tasks. finally, let us stress that our construction is restricted to the case of hyper-kählermetrics with one translational (triholomorphic) isometry. it will be very nice to find a similar construction applicable to the case of geometries with rotational isometry. we hope thismaybe donewithin the approachdiscussed in [34]. acknowledgement wethankandreyshcherbakov for useful discussions. this work was partially supported by the grants rfbf-09-02-01209 and 09-02-91349, by volkswagen foundation grant i/84 496, and also by erc advanced grant no. 226455, “supersymmetry, quantum gravity and gauge fields” (superfields). references [1] gates, s. j., rana, l.: ultramultiplets: a new representation of rigid 2-d, n = 8 supersymmetry, phys. lett. b342 (1995) 132, arxiv:hep-th/9410150. [2] pashnev, a., toppan, f.: on the classification of n-extended supersymmetric quantum mechanical systems, j. math. phys. 42 (2001) 5257, arxiv:hep-th/0010135. [3] faux, m., gates, s. j.: adinkras: a graphical technology for supersymmetric representation theory, phys. rev. d71 (2005) 065002, arxiv:hep-th/0408004. [4] bellucci, s., gates, s. j., orazi, e.: a journey through garden algebras,lect. notes phys. 698 (2006) 1–47, arxiv:hep-th/0602259. [5] ivanov, e., krivonos, s., lechtenfeld, o.: n =4, d =1 supermultiplets fromnonlinear re19 acta polytechnica vol. 51 no. 1/2011 alizations of d(2,1;α), class. quant. grav. 21 (2004) 1031, arxiv:hep-th/0310299. [6] galperin,a. s., ivanov,e.a., ogievetsky,v. i., sokatchev, e. s.: harmonic superspace, cambridge, uk : univ. press. (2001), 306 pp. [7] ivanov, e., lechtenfeld, o.: n = 4 supersymmetric mechanics in harmonic superspace, jhep 0309 (2003) 073, arxiv:hep-th/0307111. [8] fedoruk, s., ivanov, e., lechtenfeld, o.: supersymmetric calogero models by gauging, phys. rev. d79 (2009) 105015, arxiv:0812.4276[hep-th]. [9] gonzales, m., kuznetsova, z., nersessian, a., toppan, f., yeghikyan, v.: second hopf map and supersymmetric mechanics with yang monopole, phys. rev. d80 (2009) 025022, arxiv:0902.2682[hep-th]. [10] fedoruk, s., ivanov, e., lechtenfeld, o.: osp(4|2) superconformal mechanics, jhep 0908(2009)081, arxiv:0905.4951[hep-th]. [11] bellucci, s., krivonos, s.: potentials in n = 4 superconformal mechanics, phys. rev. d80 (2009) 065022, arxiv:0905.4633[hep-th]. [12] krivonos, s., lechtenfeld, o.: su(2) reduction in n =4 supersymmetricmechanics,phys. rev. d80 (2009) 045019, arxiv:0906.2469[hep-th]. 
[13] Konyushikhin, M., Smilga, A.: Self-duality and supersymmetry, Phys. Lett. B689 (2010) 95, arXiv:0910.5162 [hep-th].
[14] Bellucci, S., Krivonos, S., Sutulin, A.: Three dimensional N = 4 supersymmetric mechanics with Wu-Yang monopole, Phys. Rev. D81 (2010) 105026, arXiv:0911.3257 [hep-th].
[15] Ivanov, E. A., Konyushikhin, M. A., Smilga, A. V.: SQM with non-Abelian self-dual fields: harmonic superspace description, JHEP 1005 (2010) 033, arXiv:0912.3289 [hep-th].
[16] Ivanov, E., Konyushikhin, M.: N = 4, 3D supersymmetric quantum mechanics in non-Abelian monopole background, Phys. Rev. D82 (2010) 085014, arXiv:1004.4597 [hep-th].
[17] Krivonos, S., Lechtenfeld, O., Sutulin, A.: N = 4 supersymmetry and the BPST instanton, Phys. Rev. D81 (2010) 085021, arXiv:1001.2659 [hep-th].
[18] Coles, R. A., Papadopoulos, G.: The geometry of the one-dimensional supersymmetric nonlinear sigma models, Class. Quant. Grav. 7 (1990) 427.
[19] Gibbons, G. W., Papadopoulos, G., Stelle, K. S.: HKT and OKT geometries on soliton black hole moduli spaces, Nucl. Phys. B508 (1997) 623, arXiv:hep-th/9706207.
[20] Hellerman, S., Polchinski, J.: Supersymmetric quantum mechanics from light cone quantization, in Shifman, M. A. (ed.), The Many Faces of the Superworld, arXiv:hep-th/9908202.
[21] Hull, C. M.: The geometry of supersymmetric quantum mechanics, arXiv:hep-th/9910028.
[22] Bellucci, S., Krivonos, S., Marrani, A., Orazi, E.: "Root" action for N = 4 supersymmetric mechanics theories, Phys. Rev. D73 (2006) 025011, arXiv:hep-th/0511249.
[23] Delduc, F., Ivanov, E.: Gauging N = 4 supersymmetric mechanics, Nucl. Phys. B753 (2006) 211, arXiv:hep-th/0605211.
[24] Krivonos, S., Shcherbakov, A.: N = 4, d = 1 tensor multiplet and hyper-Kähler sigma-models, Phys. Lett. B637 (2006) 119, arXiv:hep-th/0602113.
[25] Bellucci, S., Krivonos, S., Lechtenfeld, O., Shcherbakov, A.: Superfield formulation of nonlinear N = 4 supermultiplets, Phys. Rev. D77 (2008) 045026, arXiv:0710.3832 [hep-th].
[26] Bellucci, S., Krivonos, S.: Geometry of N = 4, d = 1 nonlinear supermultiplet, Phys. Rev. D74 (2006) 125024, arXiv:hep-th/0611104; Bellucci, S., Krivonos, S., Ohanyan, V.: N = 4 supersymmetric MICZ-Kepler systems on S3, Phys. Rev. D76 (2007) 105023, arXiv:0706.1469 [hep-th].
[27] de Crombrugghe, M., Rittenberg, V.: Supersymmetric quantum mechanics, Ann. Phys. 151 (1983) 99.
[28] Ivanov, E. A., Smilga, A. V.: Supersymmetric gauge quantum mechanics: superfield description, Phys. Lett. B257 (1991) 79; Berezovoj, V. P., Pashnev, A. I.: Three-dimensional N = 4 extended supersymmetrical quantum mechanics, Class. Quant. Grav. 8 (1991) 2141; Maloney, A., Spradlin, M., Strominger, A.: Superconformal multi-black hole moduli spaces in four dimensions, JHEP 0204 (2002) 003, arXiv:hep-th/9911001.
[29] Ivanov, E. A., Krivonos, S. O., Leviant, V. M.: Geometric superfield approach to superconformal mechanics, J. Phys. A22 (1989) 4201.
[30] Gibbons, G. W., Hawking, S. W.: Gravitational multi-instantons, Phys. Lett. B78 (1978) 430.
[31] Ivanov, E., Lechtenfeld, O., Sutulin, A.: Hierarchy of N = 8 mechanics models, Nucl. Phys. B790 (2008) 493, arXiv:0705.3064 [hep-th].
[32] Pope, C. N., Yuille, A. L.: A Yang-Mills instanton in Taub-NUT space, Phys. Lett. 78B (1978) 424.
[33] Etesi, G., Hausel, T.: On Yang-Mills instantons over multi-centered gravitational instantons, Comm. Math. Phys. 235 (2003) 275, arXiv:hep-th/0207196.
[34] Bellucci, S., Krivonos, S., Shcherbakov, A.: Generic N = 4 supersymmetric hyper-Kähler sigma models in d = 1, Phys.
Lett. B645 (2007) 299, arXiv:hep-th/0611248.

Stefano Bellucci
INFN – Frascati National Laboratories
Via E. Fermi 40, 00044 Frascati, Italy

Sergey Krivonos
Bogoliubov Laboratory of Theoretical Physics
JINR, 141980 Dubna, Russia

Anton Sutulin
E-mail: sutulin@theor.jinr.ru
Bogoliubov Laboratory of Theoretical Physics
JINR, 141980 Dubna, Russia

Pi of the Sky telescopes in Spain and Chile

M. Siudek, T. Batsch, A. J. Castro-Tirado, H. Czyrkowski, M. Cwiok, R. Dabrowski, M. Jelínek, G. Kasprowicz, A. Majcher, A. Majczyna, K. Malek, L. Mankiewicz, K. Nawrocki, R. Opiela, L. W. Piotrowski, M. Sokolowski, R. Wawrzaszek, G. Wrochna, M. Zaremba, A. F. Żarnecki

Abstract

Pi of the Sky is a system of robotic telescopes designed for observations of short-timescale astrophysical phenomena, e.g. prompt optical GRB emissions. The apparatus is designed to monitor a large fraction of the sky with a 12–13m range and a time resolution of the order of 1–10 seconds. In October 2010 the first unit of the new Pi of the Sky detector system was successfully installed in the INTA El Arenosillo test centre in Spain. We also moved our prototype detector from the Las Campanas Observatory to the San Pedro de Atacama Observatory in March 2011. The status and performance of both detectors are presented.

Keywords: gamma ray burst (GRB), prompt optical emissions, optical flashes, nova stars, variable stars, robotic telescopes.

1 Introduction

Pi of the Sky [4] is a robotic telescope designed for observations of short-timescale astrophysical phenomena, especially prompt optical counterparts of gamma ray bursts (GRBs). Other scientific goals include searching for nova and supernova stars and monitoring interesting objects such as blazars, AGNs or variable stars. The apparatus design allows us to monitor a large fraction of the sky with good range and time resolution.

The Pi of the Sky project involves scientists, engineers and students from leading Polish academic and research units: the Andrzej Soltan Institute for Nuclear Studies, the Center for Theoretical Physics (Polish Academy of Sciences), the Institute of Experimental Physics (Faculty of Physics, University of Warsaw), Warsaw University of Technology and the Space Research Center (Polish Academy of Sciences).

The detector was designed mainly to search for and observe prompt optical counterparts of GRBs during or even before the gamma emission. To manage this goal it is necessary to develop advanced and fully automatic software for real-time data analysis and identification of flashes [2]. The standard approach assumes reaction to satellite alerts distributed by the Gamma Ray Burst Coordinates Network (GCN) [1] and moving the telescope to the target as fast as possible. This approach does not allow us to observe an optical emission from the source exactly at the moment of, or before, the GRB explosion, which is crucial for understanding the nature of GRBs. Pi of the Sky applies an innovative solution, which assumes continuous observation of a large part of the sky to increase the possibility of catching a GRB, and a self-triggering system to detect flashes. The observations of the famous "naked-eye" GRB 080319B have confirmed the usefulness of this strategy.

2 The prototype

Tests of hardware and software were performed with a prototype, which is just a small version of the final detector.
The prototype is equipped with two custom-designed 442A 2048 × 2048 CCD cameras equipped with Canon telephoto lenses with focal length f = 85 mm, f/d = 1.2, covering a 20° × 20° field of view. The pixel size is 15 × 15 μm², which corresponds to 36 arcsec in the sky. The CCD is cooled with a two-stage Peltier module to up to 40 degrees below the ambient temperature. Both cameras observe the same field of view with a time resolution of 10 seconds. The limiting magnitude for a single frame is 12m, and it rises to 13.5m for a frame stacked from 20 exposures. This rather short magnitude range is determined by the system design and by the observational strategy. All observations are made in white light and no filter is used, except for an IR-cut filter in order to minimize the sky background. Since May 2009 we have had a Bessel-Johnson R-band filter installed on one of the cameras in order to facilitate absolute calibration of the measurements.

The prototype worked at the Las Campanas Observatory (LCO) in Chile from June 2004 till the end of 2009 (see Figure 1). In 2008 the prototype automatically recognized and observed the prompt optical emission from the famous "naked-eye" GRB 080319B. These spectacular observations confirm the efficiency of the flash recognition algorithms and the usefulness of the observational strategy.

Fig. 1: Pi of the Sky prototype (left) located at the Las Campanas Observatory (right)

2.1 Moving the prototype from LCO to SPdA

In March 2011 the prototype was moved from LCO and installed in the San Pedro de Atacama Observatory (SPdA) (see Figure 2). The new location is situated approximately 740 km north of the previous location and about 2400 meters above sea level. Thanks to the shorter distance to the equator, the observed part of the sky is larger than that at LCO. The new site was selected because of the good and stable weather conditions. The sky is clear or almost clear for more than 80 % of the night. In 2010 there were 309 observing nights. The SPdA hosts several robotic telescopes, e.g. the 40 cm telescope from the Institute of Astrophysics of Andalusia, the 40 cm telescope used for exoplanet work on behalf of the MicroFUN project, and a variety of "tourist" telescopes. The observatory is coordinated by Alain Maury, who provides support, general maintenance and improvements for these telescopes. Alain Maury's weather station also provides real-time information about weather conditions.

Fig. 2: Pi of the Sky prototype (left) located in the new dome at SPdA (right)

3 New detector unit in Spain

The prototype is only a small version of the final detector. In October 2010 the first unit of the Pi of the Sky detector system was successfully installed in the INTA El Arenosillo test centre in Mazagón near Huelva, Spain, on the coast of the Atlantic Ocean. The final system consists of 4 CCD cameras on one specially designed equatorial mount (see Figure 3). The custom-designed cameras are an improved version of the cameras developed for the prototype system operational in Chile. Cameras with STA0820 2k × 2k CCD chips are equipped with EF Canon lenses with a focal length f = 85 mm (f/d = 1.2), and together they cover a 40° × 40° field of view.

Fig. 3: New Pi of the Sky unit located in Spain
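As a quick consistency check of the optics quoted above (our own illustrative sketch, not part of the original paper), the 36 arcsec pixel scale of the prototype follows directly from the 15 μm pixel pitch and the 85 mm focal length via the standard small-angle relation:

```matlab
% Plate-scale check (illustrative): pixel scale = pixel pitch / focal
% length, using the small-angle approximation.
pixel_pitch  = 15e-6;                      % m (15 um CCD pixel)
focal_length = 85e-3;                      % m (Canon EF 85 mm lens)
scale_rad    = pixel_pitch / focal_length; % angle per pixel [rad]
scale_arcsec = scale_rad * (180/pi) * 3600 % ~36.4 arcsec per pixel
```

The result of roughly 36 arcsec per pixel matches the value stated in the text, and also confirms that the focal length is 85 mm rather than 85 cm.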
3.1 Observation mode

The custom-designed equatorial mount located in Spain is an improved version of the mount developed by G. Pojmanski for the prototype system operational in Chile. The original design was scaled up to hold 4 cameras and, thanks to the mechanism for deflecting the cameras (see Figure 4), it enables work in two operation modes:

• common-target ("deep"), when all cameras point at the same object,
• side-by-side ("wide"), with the cameras covering adjacent fields (see Figure 5).

Fig. 4: Mount design with 2 CCD cameras (prototype in Chile) (left) and with 4 CCD cameras (new detector unit in Spain) (right)

Fig. 5: Two modes: side-by-side (left) and common target (right)

With harmonic drives, encoders and control solutions based on Ethernet and the industrial CAN standard, the new design of the telescope mount provides much better pointing accuracy and a shorter reaction time than the prototype.

4 The ultimate system

The full Pi of the Sky system will consist of two sites, each equipped with 12 custom-designed CCD cameras on 3 equatorial mounts, separated by a distance of about 100 km. The first new unit has already been installed near Huelva, and we are planning to install the second site near Malaga (see Figure 6). Pairs of cameras will work in coincidence and will observe the same field of view. The system is designed to identify and remove reflections from satellites by their parallax, and to eliminate cosmic rays by analyzing the coincidence on both cameras. This will significantly improve our real-time flash recognition algorithms, since tests performed with the prototype revealed that the most common background sources are flashes due to cosmic rays and near-Earth flashes from sunlight reflected off satellites. The whole system will be capable of continuous observations of about two steradians of the sky, which roughly corresponds to the field of view of the BAT instrument on board the Swift satellite [3]. The final system will largely eliminate the time needed for re-pointing the telescope to the coordinates from GCN, as well as the dead time of the decision process and of signal propagation from the satellite to GCN and from GCN to ground-based telescopes.

Fig. 6: Scheme of the final system

5 Summary

The prototype working in the period 2004–2009 at the Las Campanas Observatory was successfully moved and installed in the San Pedro de Atacama Observatory (SPdA) in March 2011. In October 2010 we managed to install the first unit of the new Pi of the Sky detector system in the INTA El Arenosillo test centre in Spain. Both Pi of the Sky instruments operate in fully autonomous mode, practically without any human supervision, and search for short-timescale astrophysical phenomena.

Acknowledgement

We are very grateful to G. Pojmanski for access to the ASAS dome and for sharing his experience with us. We would like to thank the staff of the Las Campanas Observatory, the San Pedro de Atacama Observatory and the BOOTES-1 station at ESAt/INTA-CEDEA in El Arenosillo (Mazagón, Huelva) for their help during the installation and maintenance of our detector. This work was financed by the Polish Ministry of Science and Higher Education in 2009–2011 as a research project.

References

[1] Barthelmy, S. D., et al.: The GRB Coordinates Network (GCN): A status report, Gamma-Ray Bursts, AIP Conference Proceedings, 428, p. 99, 1997.
[2] Burd, A., et al.: Pi of the Sky – all-sky, real-time search for fast optical transients, New Astronomy, 10 (5), 2005, 409–416.
[3] Gehrels, N., et al.: The Swift Gamma Ray Burst Mission, ApJ, 611, p. 1005–1020, 2004.
[4] Malek, K., et al.: Pi of the Sky detector, Advances in Astronomy 2010, 2010, 194946.
Malgorzata Siudek
Katarzyna Malek
Lech Mankiewicz
Rafal Opiela
Center for Theoretical Physics of the Polish Academy of Sciences
Al. Lotnikow 32/46, 02-668 Warsaw, Poland

Tadeusz Batsch
Ariel Majcher
Agnieszka Majczyna
Krzysztof Nawrocki
Marcin Sokolowski
Grzegorz Wrochna
The Andrzej Soltan Institute for Nuclear Studies
Hoza 69, 00-681 Warsaw, Poland

Alberto J. Castro-Tirado
Martin Jelínek
Instituto de Astrofísica de Andalucía CSIC
Glorieta de la Astronomía s/n, E-18080 Granada, Spain

Henryk Czyrkowski
Mikolaj Cwiok
Ryszard Dabrowski
Lech W. Piotrowski
Aleksander F. Żarnecki
Faculty of Physics, University of Warsaw
Hoza 69, 00-681 Warsaw, Poland

Grzegorz Kasprowicz
Institute of Electronic Systems, Warsaw University of Technology
Nowowiejska 15/19, 00-665 Warsaw, Poland

Roman Wawrzaszek
Space Research Center of the Polish Academy of Sciences
Bartycka 18a, 00-716 Warsaw, Poland

Marcin Zaremba
Faculty of Physics, Warsaw University of Technology
Koszykowa 75, 00-662 Warsaw, Poland

Prediction of lower extremity movement by cyclograms

P. Kutilek, S. Viteckova

Abstract

Human gait is nowadays undergoing extensive analysis. Predictions of leg movements can be used for orthosis and prosthesis programming, and also for rehabilitation. Our work focuses on predicting human gait with the use of angle-angle diagrams, also called cyclograms. In conjunction with artificial intelligence, cyclograms offer a wide area of medical applications. We have identified cyclogram characteristics, such as the slope and the area of the cyclogram, for a neural network learning algorithm. Neural networks trained on cyclograms offer wide applications in prosthesis control systems.

Keywords: gait, artificial intelligence, cyclogram, artificial neural networks.

1 Introduction

In medical practice, there is no appropriate widely-used application of a system based on artificial intelligence (AI) for identifying defects in the movement of human legs, or for controlling the actuators of prostheses or rehabilitation facilities. Above all, it is difficult to evaluate the quality of walking. Medical decisions are often based only on the subjective views of physicians, and appropriately accurate methods are not widely used in clinical practice. Several methods can be used in medical practice and in physiotherapeutic research for identifying defects in the movement of the human body. The most widely-used method for studying gait behavior in clinical practice is gait phase analysis by gait time phase cycles [1, 2]. The time phase cycles of gait behavior have been used to analyze gait with the application of artificial intelligence methods [3–9], but the findings have not subsequently been applied in medical practice. Very intensive research is now being done on predicting leg movements by artificial intelligence and EMG signal measurements [10–12], mainly with a view to their possible application in myoelectrical prosthesis control systems.

For our study of gait we have used new methods based on an analysis of gait angles using cyclograms (also called angle-angle diagrams or cyclokinograms) and artificial intelligence to predict the motion of human legs/prostheses. The concept of angle-angle diagrams, although known to the biomechanics community, has not been mentioned much in the recent literature. The first mention of the cyclogram [13] argued that a cyclic process such as walking is better understood if studied with a cyclic plot, e.g. an angle-angle diagram.
The creation of cyclograms is based on gait angles, which are objective, reliable and well suited for statistical study [14]. This technique is strongly rooted in geometry, and the quantities are intuitively understandable [15]. Depending on the cyclicity of the gait, cyclograms are closed trajectories generated by simultaneously plotting two (or more) joint quantities. In gait studies, most attention has traditionally been given to easily identifiable planar knee-hip cyclograms. In order to quantify the symmetry of human walking, we have obtained and studied the cyclogram for the same joint on the two sides of the body [16]. Applications of cyclograms in conjunction with artificial intelligence can offer a wide range of medical applications, but this approach has not yet been studied or applied in practice.

2 Methods

To create and study angle-angle diagrams, we use a model of the human body created in MATLAB Simulink and SimMechanics software. The movement of the model of a body is controlled by data measured by the motion capture system, which identifies the position of points/markers in the Cartesian coordinate system. We can generally use several methods for measuring movements in two/three-dimensional space, e.g. an infrared (IR) camera with active markers, or a web camera, which is cheaper. For our application, we used an IR medical camera with active markers (Lukotronic AS 200 system) and placed LED diode markers on the following points on the person being measured: fifth metatarsophalangeal joint, malleolus lateralis, epicondylus lateralis, trochanter major, spina iliaca anterior superior, etc., Figure 1. Using this method we can record the movement in three-dimensional space, though we primarily study the movement in the two-dimensional sagittal plane.

Fig. 1: Location of IR markers and angles measured during the examination

Human gait data commonly consists of the recorded positions of markers on the skin/dress at the extremities of the limb segments (the thigh, the shank, etc.) of a subject. If we have information on the movement of points/markers in space, and the points characterize parts of the human body, we can use these points to define the vectors of the positions of the body parts. The difference of the coordinates of two points in space defines a vector. The angles between each two segments are calculated by assuming the segments to be idealized rigid bodies. For computing the angles, we use the formula

$$\beta = \arccos\frac{\mathbf{u}\cdot\mathbf{v}}{|\mathbf{u}|\cdot|\mathbf{v}|}, \quad 0 \le \beta \le \pi \ \mathrm{[rad]}, \qquad (1)$$

for the two-dimensional system, because under the simplifying assumptions we assume that the knee or ankle joint has only one rotational degree of freedom [17]. Here u and v are the vectors of the body segments (thigh, shin, foot, etc.), each represented by at least two points, i.e. markers. The calculation is performed in an equivalent way for three-dimensional space. If we are interested in the angle between a body segment and the physical horizontal, we determine the angle between the horizontal vector (1, 0) and the vector defined by the coordinates of the points on the body segment, evaluated in the Cartesian coordinate system. The markers, i.e. the points, move in space together with the body segments, and the individual segments of the body move by translational or angular movement per unit time. We therefore use numerical differentiation to determine the translational and rotational speed and the acceleration of the individual segments, i.e. markers.
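A minimal sketch (ours, not the authors' code) of Eq. (1) applied to measured marker coordinates; the function and variable names are illustrative:

```matlab
% Joint angle from three markers via Eq. (1). p1, p2, p3 are 1x2 (or
% 1x3) marker positions, with p2 placed at the joint, e.g. the knee:
% trochanter major - epicondylus lateralis - malleolus lateralis.
function beta = joint_angle(p1, p2, p3)
    u = p1 - p2;                                 % first segment vector
    v = p3 - p2;                                 % second segment vector
    beta = acos(dot(u, v) / (norm(u)*norm(v)));  % radians, 0..pi
end
```

Applied frame by frame to the recorded marker trajectories, this yields the time series of joint angles from which the cyclograms are plotted.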
The aim of this study is not to evaluate the center of rotation of a joint, because even in practice, in the case of the motor control system of a prosthetic device, we do not expect complete conformity with the anatomical characteristics, and the deviations are negligible for our purpose, which is to provide verification of the movement prediction methods. For the calculation we therefore used vectors based on the coordinates of the markers, and the vectors u and v in the equation do not necessarily correspond to the directional unit vectors of the body parts. During the measurement, the measured data is grouped according to the age groups of the subjects, and it could also be grouped according to the diagnoses of the subjects. We create cyclograms of the angles, angular velocities and angular accelerations: left knee – left hip, right knee – right hip, left hip – right hip, left knee – right knee, right ankle – right hip, left ankle – left hip, etc. We were also able to study the following characteristics of cyclograms: length of the trajectory, frequency of the loops, slope of the loops, maximum range, average speed, total circumscribed area of the loops.

The most important part of our work is designing methods for applying cyclograms in practice in order to identify movement defects, and for applying cyclograms in the control system of the actuators of prostheses or rehabilitation facilities. For this purpose, we use artificial intelligence methods which are implemented in MATLAB toolboxes [18, 19]. For example, we can use artificial neural networks (ANNs) to predict the joint angles, i.e. for predicting cyclograms. Artificial neural networks are based on the neural structure of the brain [20, 21]. They process records one at a time, and "learn" by comparing their prediction of the record with the known actual record [22, 23]. The input to the first layer consists of the values in a data record. The final layer is the output layer, where there is one node for each physical quantity. The prediction of a time series using a neural network consists of teaching the net the history of the variable in a selected limited time and applying the taught information to the future. Data from the past is provided to the inputs of the neural network, and we expect future data from the outputs of the network.

Table 1: Learning process of the artificial neural network according to the calculated joint angles

Input data (joint angles)       | Target data (joint angle)
x1 x2 x3 x4 x5                  | x6
x2 x3 x4 x5 x6                  | x7
x3 x4 x5 x6 x7                  | x8
x4 x5 x6 x7 x8                  | x9
...                             | ...
xn−4 xn−3 xn−2 xn−1 xn          | xn+1

Our learning method is based on the proposal of a table with m + 1 columns of states. We assume that five columns of previous states plus one column for the prediction will be sufficient, Table 1. The first row of the table records the first five angles computed from the measured data, and the sixth column contains the next calculated value of the angle, as a target to which the artificial neural network learns by example. The second row records the second to sixth calculated values of the angle, and in the sixth column of the same row we insert the seventh value of the angle as a target. This cascading method fills a table of n − 4 rows, where n + 1 is the number of known values of the joint angles that we decided to use for the learning process, Table 1. Thus the method is generally based on information about a person's walking.
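The cascading construction of Table 1 is easy to automate. The following sketch (ours; the names are illustrative) builds the input and target matrices from a vector of n + 1 computed joint angles:

```matlab
% Build the Table 1 training set from a vector of joint angles
% x(1..n+1): five past angles form one input row, the next angle is
% the target, giving n - 4 rows in total, as described in the text.
function [X, T] = build_table(x)
    m = 5;                            % number of past states per row
    N = numel(x) - m;                 % number of rows (n - 4)
    X = zeros(N, m);
    T = zeros(N, 1);
    for k = 1:N
        X(k, :) = x(k : k+m-1);       % inputs: angles x_k .. x_{k+4}
        T(k)    = x(k+m);             % target: the next angle x_{k+5}
    end
end
```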
Walking is described by the cyclogram, and the cyclogram is used to train the neural networks. We chose the calculated values of the joint angles as the approach for selecting data for artificial neural network learning. For neural network learning we use the part of the curve of the cyclogram that represents a set of states for learning. These states are divided into past states and prospective states. With each presentation, the output of the neural network was compared to the desired output, and the error was computed. This error was then fed back (backpropagated) to the neural network, and was used to adjust the weights in such a way that the error decreases with each iteration and the neural model gets closer to producing the desired output.

The table of input and target data can be extended to other parameters that are also very important for predicting the movement of the lower limbs. Appropriate parameters are, for example, the angular acceleration (i.e. four states of acceleration), and also the subject's weight in kilograms and age in years, Table 2. We tested several modifications of ANNs. The first ANNs designed and tested were for predicting the angle in only one joint of the left leg (hip, knee and ankle). We also designed ANNs for predicting the complete knee-hip cyclogram or the ankle-knee cyclogram, i.e. for predicting two angles. An ANN structure that we designed for predicting the angles of the hip-knee curve of a cyclogram is presented in Figure 2. For training the NN, we can also add the height of the subject in meters, or other additional parameters, e.g. a predefined code for an illness, an operation of the musculoskeletal apparatus, etc. The backpropagation algorithm is used for training the neural networks; with backpropagation, the input data is repeatedly presented to the neural network (a minimal sketch of this training loop is given below).

Table 2: Learning process of artificial neural networks according to the angles and angular accelerations calculated in a joint, the subject's weight and the subject's age

Input data: joint angles | angular accelerations | weight | age || Target data: 1st joint angle | 2nd joint angle
x1 x2 x3 x4 x5 | ε1 ε2 ε3 ε4 | m | n || x6 | y6
x2 x3 x4 x5 x6 | ε2 ε3 ε4 ε5 | m | n || x7 | y7
x3 x4 x5 x6 x7 | ε3 ε4 ε5 ε6 | m | n || x8 | y8
x4 x5 x6 x7 x8 | ε4 ε5 ε6 ε7 | m | n || x9 | y9
... | ... | ... | ... || ... | ...
xn−4 xn−3 xn−2 xn−1 xn | εn−4 εn−3 εn−2 εn−1 | m | n || xn+1 | yn+1

Fig. 2: Artificial neural network 11-7-2 for predicting the movement of the lower limb
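The training loop described above can be sketched as follows. This is our minimal illustration of plain backpropagation for a one-hidden-layer network, not the authors' MATLAB toolbox setup; the layer size and learning rate are arbitrary choices:

```matlab
% Minimal backpropagation training on the sliding-window data
% (X: N-by-5 inputs, T: N-by-1 targets). Records are presented one at
% a time and the weights are adjusted so that the error decreases.
function [W1, b1, W2, b2] = train_backprop(X, T, nHidden, lr, nEpochs)
    [N, nIn] = size(X);
    rng(0);                              % reproducible initialization
    W1 = 0.1*randn(nHidden, nIn);  b1 = zeros(nHidden, 1);
    W2 = 0.1*randn(1, nHidden);    b2 = 0;
    for epoch = 1:nEpochs
        for k = 1:N
            x = X(k, :)';                % one input record
            h = tanh(W1*x + b1);         % hidden-layer activations
            y = W2*h + b2;               % linear output neuron
            e = y - T(k);                % prediction error
            dh = (W2'*e) .* (1 - h.^2);  % backpropagated error (tanh')
            W2 = W2 - lr*(e*h');   b2 = b2 - lr*e;   % weight updates
            W1 = W1 - lr*(dh*x');  b1 = b1 - lr*dh;
        end
    end
end
```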
The above-described methods for predicting motion using trained neural networks reflect only a short part of the curve of the angle-angle diagram of a gait cycle. The values by which the neural network is trained carry information on only a limited number of states within a short period of time. This method may not always adequately describe the stereotypes of human walking. The main problem is that the method does not describe the whole gait cycle and the transition from the current gait cycle into a new gait cycle. It is theoretically possible to design a number of input neurons of a neural network to describe the whole gait cycle, but the structure of the neural network will then be very complex, and the calculation will therefore be difficult and slow. For this reason, we proposed new ways to describe the gait cycle by characteristics of the cyclogram.

The proposed method takes the distribution of the values less into account, but to a greater extent it reflects the geometrical shape of the area A of one cycle. Several methods can be used; we decided to use the area moment of inertia to describe the two-dimensional plane shape of the cyclogram. The reason for using the second moment is that we can very easily describe the distribution of the circumscribed area of a cyclogram, and we can determine the inclination angle θ of the diagram. The value of the angle θ for which the product moment is zero is given by

$$\tan 2\theta = \frac{2 I_{xy}}{I_{xx} - I_{yy}}, \qquad (2)$$

where $I_{xx}$ is the second moment of the area about the x-axis, $I_{yy}$ is the second moment of the area about the y-axis, and $I_{xy}$ is the product moment of area. The angle θ is the inclination angle between the axes of the original coordinate system of the diagram and the principal axes of the area. The values of the inclination angle or the moments of the area can be used for training the neural network. The neural network is extended with neurons for the additional input values, e.g. the inclination angle of the gait cycle preceding the actual values of the joint angles. The neural network is also extended with new output neurons, so that the neural network learns from target values, e.g. the inclination angle of the gait cycle subsequent to the actual values of the joint angles, Table 3. The patterns in Table 4 are used for training the neural network using second moments and for predicting one joint angle. For training the ANNs, in addition to the variables mentioned above, we can theoretically also use the area A of one angle-angle diagram, the area of the ellipse of inertia, the center of the area, etc. Description of the cyclogram by its area, area moment and inclination angle has a great advantage, because the calculated characteristics of the cyclogram are not negatively affected if the data do not have a normal distribution. For this reason, as an alternative, we can also use linear regression to calculate the slope of the angle-angle diagram. Simple linear regression fits a straight line (axis) through the set of points, i.e. states. The polynomial equation of the regression principal axis is

$$y = a_0 + a_1 \cdot x, \qquad (3)$$

where $a_0$ and $a_1$ are parameters identified by the estimator. When y is the dependent variable of the first angle, e.g. the knee angle, and x is the independent variable of the second angle, e.g. the hip angle, the direction (slope) of the principal axis, i.e. the inclination angle θ of the angle-angle diagram, is obtained from the parameter $a_1$ as follows:

$$\tan\theta = a_1. \qquad (4)$$

Table 3: Learning process of artificial neural networks according to the joint angles and the inclination angles of the angle-angle diagram

Input data: 1st joint angle | 2nd joint angle | inclination of the previous gait cycle || Target data: 1st joint angle | 2nd joint angle | inclination of the subsequent gait cycle
x1 x2 x3 x4 | y1 y2 y3 y4 | θ1 || x5 | y5 | θ′1
x2 x3 x4 x5 | y2 y3 y4 y5 | θ2 || x6 | y6 | θ′2
x3 x4 x5 x6 | y3 y4 y5 y6 | θ3 || x7 | y7 | θ′3
x4 x5 x6 x7 | y4 y5 y6 y7 | θ4 || x8 | y8 | θ′4
... | ... | ... || ... | ... | ...
xn−3 xn−2 xn−1 xn | yn−3 yn−2 yn−1 yn | θn || xn+1 | yn+1 | θ′n

Table 4: Learning process of artificial neural networks according to the joint angle, the moments of the area and the inclination angles of one angle-angle diagram

Input data: joint angles | Ixx | Iyy | inclination of the diagram || Target data: joint angle | I′xx | I′yy | inclination of the subsequent diagram
x1 x2 x3 x4 | Ixx1 | Iyy1 | θ1 || x5 | I′xx1 | I′yy1 | θ′1
x2 x3 x4 x5 | Ixx2 | Iyy2 | θ2 || x6 | I′xx2 | I′yy2 | θ′2
x3 x4 x5 x6 | Ixx3 | Iyy3 | θ3 || x7 | I′xx3 | I′yy3 | θ′3
x4 x5 x6 x7 | Ixx4 | Iyy4 | θ4 || x8 | I′xx4 | I′yy4 | θ′4
... | ... | ... | ... || ... | ... | ... | ...
xn−3 xn−2 xn−1 xn | Ixxn | Iyyn | θn || xn+1 | I′xxn | I′yyn | θ′n
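Both descriptors can be computed directly from a sampled gait cycle. The sketches below are ours, not the authors' code. In the first one, as a simplification, the second moments are taken over the cycle's sampled points about their centroid rather than integrated over the enclosed area:

```matlab
% Inclination angle of a cyclogram via Eq. (2). x, y are vectors of
% the two joint angles sampled over one gait cycle. Simplification
% (ours): point-set moments about the centroid stand in for true
% area moments of the enclosed region.
function theta = inertia_inclination(x, y)
    xc = x - mean(x);  yc = y - mean(y);    % centre the cycle
    Ixx = sum(yc.^2);                       % second moment, x-axis
    Iyy = sum(xc.^2);                       % second moment, y-axis
    Ixy = sum(xc.*yc);                      % product moment
    theta = 0.5 * atan2(2*Ixy, Ixx - Iyy);  % principal-axis angle [rad]
end
```

The regression alternative of Eqs. (3)–(4) is even shorter:

```matlab
% Slope of the angle-angle diagram by simple linear regression,
% Eqs. (3)-(4): y = a0 + a1*x and tan(theta) = a1.
function theta = regression_inclination(x, y)  % x: hip, y: knee angles
    p = polyfit(x, y, 1);                   % least squares; p = [a1 a0]
    theta = atan(p(1));                     % inclination angle [rad]
end
```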
The trained neural networks prefer the typical changes from the previous cycle to the subsequent cycle, and avoid the use of atypical changes. In addition, we allow the NN to estimate the expected gait cycle based on a specified slope of the angle-angle diagram of a gait cycle. The method based on the moment of inertia is proposed for training neural networks on the basis of individual data for a particular person. However, the method can be modified to take into account anthropometric data such as the weight, height or age of the subject, and the neural network can be expanded to include the appropriate number of input neurons, and can thus become universal.

The main object of our study was to predict the trajectory of cyclogram curves on the basis of the current state of the lower extremities, and to make the prediction with the use of artificial intelligence. The set of data for training the artificial neural networks was measured on 10 volunteers recruited from students of the Czech Technical University in Prague. The subjects were asked to walk properly on a treadmill at a variable walking speed. The main human walking speed was 1.5 km/h for studying and adjusting the proposed method. We used the very accurate Lukotronic AS200 motion capture system to record the data. The IR camera with active markers was manufactured by Lutz Mechatronic Technology e.U. According to the CE certification for medical products, EU Directive 93/42/EEC, the manufacturer declares that the Lukotronic motion analysis system can be used for patient care in hospitals and rehabilitation centers. One camera system, mounted in front of or behind the subject moving on the treadmill, recorded the 3D motion of the lower limbs. Markers were placed in accordance with the manufacturer's recommendations for gait analysis by GaitLab software. The recommended marker set model is the same as the set defined by the Helen Hayes Hospital model [26] for Vicon Clinical Manager and the sagittal plane. Physicians defined the location of the markers by the Helen Hayes marker set model. The position of the markers on the foot was chosen mainly according to the requirements for a good record, because the movement of the feet is not usually measured in practice, and the manufacturer does not mention placing more than one marker on the foot.

After obtaining the measured data, we created the necessary cyclograms in MATLAB software. We obtained graphs of the changes of the angles at all the main points in the lower part of the human body in the sagittal plane. This was important for the subsequent computation of the curve of the cyclograms and for predicting the motion. We used cyclograms as models/patterns for training the artificial neural networks. After training the neural networks, we used them to predict the future states of the angles in the joints of the lower limbs.

3 Results

Our measured data indicate that the angle in the knee usually changes from 5° (stretched leg) to 70° (bent leg).
In the hip, the angle is usually from 5° to 40°, Figure 3, and in the ankle the angle is usually from 60° to 100°, Figure 6. Under the assumption of neglecting inaccurate placement of the markers, and other simplifying assumptions, we can state that the angles correspond to the angle between the femur and the tibia and between the tibia and the metatarsus. The cyclogram in Figure 3 shows that the swing phase typically starts at a thigh extension angle of 0° and knee flexion of about 80 % of the maximum. The subject weighed 65 kilograms and was 23 years old.

Fig. 3: Knee-hip cyclogram (treadmill, walking speed 1.5 km/h)

Fig. 4: Predicted knee-hip cyclogram (NN training not taking into account the inclination angles of the diagrams): predicted values of the angles (circular symbols); measured known values (cross symbols)

After training the neural networks, we used parts of the cyclograms for predicting the future states of the joint angles. The result was that, using short sections of the cyclogram curve loaded into a neural network, the trained neural network predicted the subsequent behavior of the gait by the predicted cyclogram curve. Two results of our proposed method for gait prediction are shown by the predicted knee-hip cyclogram (Figure 4) and the predicted ankle-knee cyclogram (Figure 7). For this prediction, we used NN training without taking into account the inclination angles of the diagrams. The two-dimensional knee-hip and ankle-knee cyclograms show the predicted cyclogram curves based only on measurements of past states, i.e. the remaining part of the curve is predicted by assuming knowledge of a short curve segment. We can see that the predicted curves correspond only partially to the usual form of cyclograms [13–16]. The 11-7-2 artificial neural network, Figure 2, was used for predicting the cyclograms (Figure 4 and Figure 7). The results show that the prediction is inaccurate, especially in the ankle-knee cyclogram, because of its complicated shape. The predicted knee-hip cyclogram is relatively accurate, and is similar to the pattern, Figure 3. We found that predictions of movement based only on an evaluation of a small section of the cyclogram are not suitable for predicting complex movements. In the case of gait, which often changes, the prediction is not always appropriate and accurate.

Fig. 5: Predicted knee-hip cyclogram (NN training taking into account the inclination angles of the diagrams): predicted values of the angles (circular symbols); measured known values (cross symbols)

Fig. 6: Ankle-knee cyclogram (treadmill, walking speed 1.5 km/h)

Improved predictions are achieved by increasing the monitored section of the curve of the cyclogram, but the complexity of the neural network and the computing time also increase. These features are undesirable, and we had to modify the neural networks and the methods of learning. For the ANN training and for the prediction, we then used the variables identified by the moment of inertia method. The predicted knee-hip cyclogram (Figure 5) and the predicted ankle-knee cyclogram (Figure 8) show that this prediction is very accurate. We identified slight variability in the prediction of the angles, but the variability was negligible, because there are small variations in the angles even in typical gait. This way of making predictions is suitable for situations where the movements follow stereotypes, with typical gait changes between these stereotypes of human walking.
According to the method described here, the cyclograms inform us about the previous positions of the lower limbs, and we can infer the expected conditions, i.e. the states during future predicted walking. This information is important in rehabilitation medicine, and also as an expected value of the angles used in control algorithms for lower limb prostheses.

Fig. 7: Predicted ankle-knee cyclogram (NN training not taking into account the inclination angles of the diagrams): predicted values of the angles (circular symbols); measured known values (cross symbols)

Fig. 8: Predicted ankle-knee cyclogram (NN training taking into account the inclination angles of the diagrams): predicted values of the angles (circular symbols); measured known values (cross symbols)

4 Discussion

Cyclograms in conjunction with artificial intelligence are broadly applicable in medicine. We have described a method for predicting the motion of the lower extremities, and the predicted data can be used for evaluating human gait in physiotherapeutic practice, based on a study of angle-angle diagrams. Predictions of the joint angles of the leg are based on the principles of artificial intelligence. In our study we have used artificial backpropagation neural networks. By analogy with the analytical method based on NNs for the study of two-dimensional cyclograms, we designed and tested methods based on NNs for the study of three-dimensional knee-hip-ankle cyclograms. The new methods can be applied in clinical practice for studies of disorders or characteristics in the motion function of the human body [27], and the method can be used in advanced control systems for controlled prostheses of the lower extremities [28]. In the past, it was almost impossible to use complex algorithms based on artificial intelligence in the slow control systems of a controlled prosthesis, but today we can consider applying the methods described here in the algorithms for new prosthetic knee control systems [29, 30]. There is an obvious opportunity to continue this research, and to use these methods to study and design a new hydraulic or pneumatic knee prosthesis.

A new robotic orthosis offers a second very important possibility of application in rehabilitation medicine. We can use the proposed methods in algorithms for a driven robotic gait orthosis for the purposes of locomotion therapy [31, 32]. The therapy is based on simulating the movement of the lower limbs of healthy people by sophisticated robotic devices. An example of such a system is the Hocoma Lokomat system, which supports the rehabilitation of patients suffering from neurological diseases (multiple sclerosis, post-stroke), patients after spinal cord injury or after a traumatic brain injury resulting in partial loss of the ability to walk. The Lokomat has been on the market since 2001, and it has crucially improved the art and science of locomotion therapy at the Motol University Hospital in Prague. We assume that our method will be applied in the control systems of such devices, because artificial intelligence applied in control systems could extend the possibilities of rehabilitation, i.e. the training possibilities. In addition, the identified and predicted gait pattern could be individually adjusted to the patient's needs [33, 34], because our movement prediction method also takes into account the patient's weight and age.
Moreover, the proposed new methods for identifying the technique of human walking, which is used for training the neural networks, can be modified and used in other areas of artificial intelligence, such as reinforcement learning [35, 36]. This work has not attempted to describe all potential ways of applying cyclograms in conjunction with artificial intelligence. We have shown new methods that have subsequently been proved by a number of simulations in MATLAB software. These methods based on cyclograms and ANNs could be suitable for a broad range of applications.

Acknowledgement

This work was done at the Joint Department of Biomedical Engineering CTU and Charles University in Prague in the framework of research program No. VG20102015002 (2010–2015, MV0/VG), sponsored by the Ministry of the Interior of the Czech Republic.

References

[1] Gage, R. J., Hicks, R.: Gait analysis in prosthetics. Clinical Prosthetics & Orthotics, Vol. 9, Issue 3, 1989, p. 17–21.
[2] Janura, M., Cabell, L., Svoboda, Z., Kozáková, J., Gregorková, A.: Kinematic analysis of gait in patients with juvenile hallux valgus deformity. Journal of Biomechanical Science and Engineering, Vol. 3, Issue 3, 2008, p. 390–398.
[3] Gioftsos, G., Grieve, D. W.: The use of neural networks to recognize patterns of human movement: gait patterns, Clinical Biomechanics, Vol. 10, Issue 4, 1995, p. 179–183.
[4] Koktas, N. S., Yalabik, N., Yavuzer, G.: Combining neural networks for gait classification. The 11th Iberoamerican Congress on Pattern Recognition (CIARP 2006), Cancun, Mexico, 2006.
[5] Jaraba, J. E. H., Orrite-Uruñuela, C., Pérez, J. D. B., Roy-Yarza, A.: Human recognition by gait analysis using neural networks. International Conference on Artificial Neural Networks, 2002, p. 364–369.
[6] Lai, D. T. H., Begg, R. K., Palaniswami, M.: Computational intelligence in gait research: a perspective on current applications and future challenges, IEEE Transactions on Information Technology in Biomedicine, Vol. 13, Issue 5, 2009, p. 687–702.
[7] Wang, L., Tan, T., Ning, H., Hu, W.: Automatic gait recognition based on statistical shape analysis, IEEE Trans. Image Processing, Vol. 12, Issue 9, 2003, p. 1120–1131.
[8] Mijailovic, N., Gavrilovic, Rafajlovic, S.: Gait phases recognition from accelerations and ground reaction forces: application of neural networks, Telfor Journal, Vol. 1, Issue 1, 2009, p. 34–36.
[9] Mordaunt, P., Zalzala, A.: Towards an evolutionary neural network for gait analysis. IEEE World Congress on Computational Intelligence, CEC02 Proceedings, IEEE Press, 2002, p. 1238–1243.
[10] Sepulveda, F., Wells, D., Vaughan, C.: A neural network representation of electromyography and joint dynamics in human gait, Journal of Biomechanics, Vol. 26, Issue 2, 1993, p. 101–109.
[11] Prentice, S. D., Patla, A. E., Stacey, D. A.: Artificial neural network model for the generation of muscle activation patterns for human locomotion, Journal of Electromyography and Kinesiology, Vol. 11, Issue 1, 2001, p. 19–30.
[12] Heller, B. W., Veltink, P. H., Rijkhoff, N. J. M., Rutten, W. L. C., Andrews, B.: Reconstructing muscle activation during normal walking: a comparison of symbolic and connectionist machine learning techniques, Biological Cybernetics, Vol. 69, Issue 4, 1993, p. 327–335.
[13] Grieve, D. W.: Gait patterns and the speed of walking, Biomedical Engineering, Vol. 3, Issue 3, 1968, p. 119–122.
[14] Grieve, D. W.: The assessment of gait, Physiotherapy, Vol. 55, Issue 11, 1969, p. 452–460.
[15] Goswami, A.: Kinematic quantification of gait symmetry based on bilateral cyclograms. XIXth Congress of the International Society of Biomechanics (ISB), Dunedin, New Zealand, 2003, p. 34–43.
[16] Goswami, A.: New gait parameterization technique by means of cyclogram moments: application to human slope walking, Gait and Posture, Vol. 8, Issue 1, 1998, p. 15–26.
[17] Heck, A., Holleman, A.: Walk like a mathematician: an example of authentic education. Proceedings of ICTMT6 – New Technologies Publications, 2003, p. 380–387.
[18] Beale, M., Demuth, H.: Neural Network Toolbox for Use with MATLAB. Natick: The MathWorks, 2002.
[19] Hajny, O., Farkasova, B.: A study of gait and posture with the use of cyclograms, Acta Polytechnica, Vol. 50, Issue 4, 2010, p. 48–51.
[20] Agre, P.: Computation and Human Experience. New York: Cambridge University Press, 1997.
[21] Dayan, P., Abbott, L.: Computational and Mathematical Modeling of Neural Systems. Cambridge: MIT Press, 2001.
[22] Wasserman, D.: Neural Computing Theory and Practice. New York: Van Nostrand Reinhold, 1989.
[23] MacKay, D.: Information Theory, Inference, and Learning Algorithms. New York: Cambridge University Press, 2003.
[24] Goldstein, H.: Classical Mechanics. (2nd ed.) Boston: Addison-Wesley, 1980.
[25] Pilkey, W. D.: Analysis and Design of Elastic Beams. New York: John Wiley & Sons, Inc., 2002.
[26] Vaughan, C. L., Davis, B. L., O'Connor, J. C.: Dynamics of Human Gait. 2nd edition. Cape Town: Kiboho Publishers, 1999.
[27] Yam, C. Y., Nixon, M. S., Carter, J. N.: Gait recognition by walking and running: a model-based approach. Proceedings of the 5th Asian Conference on Computer Vision, 2002, p. 1–6.
[28] Eng, J. J., Winter, D. A.: Kinetic analysis of the lower limbs during walking: what information can be gained from a three-dimensional model, Journal of Biomechanics, Vol. 28, Issue 6, 1995, p. 753–758.
[29] Bellmann, M., Schmalz, T., Blumentritt, S.: Comparative biomechanical analysis of current microprocessor-controlled prosthetic knee joints, Archives of Physical Medicine and Rehabilitation, Vol. 91, Issue 4, 2010, p. 644–652.
[30] Brian, J. H., Laura, L. W., Noelle, C. B., Katheryn, J. A., Douglas, G. S.: Evaluation of function, performance, and preference as transfemoral amputees: transition from mechanical to microprocessor control of the prosthetic knee, Archives of Physical Medicine and Rehabilitation, Vol. 88, Issue 2, 2010, p. 207–217.
[31] Boian, F. R., Burdea, C. G., Deutsch, E. J.: Robotics and virtual reality applications in mobility rehabilitation, Rehabilitation Robotics, I-Tech Education and Publishing, Vienna, 2007, p. 27–42.
[32] Cikajlo, I., Matjacic, Z.: Advantages of virtual reality technology in rehabilitation of people with neuromuscular disorders. Recent Advances in Biomedical Engineering, InTech, Vienna, 2009, p. 301–320.
[33] Kang, H. G., Dingwell, J. B.: Effects of walking speed, strength and range of motion on gait stability in healthy older adults, Journal of Biomechanics, Vol. 41, Issue 14, 2008, p. 2899–2905.
[34] Owings, T. M., Grabiner, M. D.: Variability of step kinematics in young and older adults, Gait and Posture, Vol. 20, Issue 1, 2004, p. 26–29.
[35] Watkins, Ch.: Learning from Delayed Rewards. PhD thesis. University of Cambridge, Psychology Department, 1989.
[36] Sutton, R., Barto, A.: Reinforcement Learning: An Introduction. Cambridge: MIT Press, 1998.
Patrik Kutilek
Phone: +420 312 608 302, +420 224 358 490
E-mail: kutilek@fbmi.cvut.cz
Faculty of Biomedical Engineering
Czech Technical University in Prague
Sitna Sq. 3105, Kladno, Czech Republic

Slavka Viteckova
Phone: +420 224 358 401
E-mail: slavka.viteckova@fbmi.cvut.cz
Faculty of Biomedical Engineering
Czech Technical University in Prague
Sitna Sq. 3105, Kladno, Czech Republic

Production of biocellulosic ethanol from wheat straw

Ismail, W. Ali¹, Braim, R. Rasul¹, Ketuly, K. Aziz², Awang Bujag, D. Siti Shamsiah², Arifin, Zainudin²
¹ Salahaddin University, Science Education College, Chemistry Department, Erbil, Kurdistan Region, Iraq
² Malaya University, Science Faculty, Chemistry Department, Malaysia
Correspondence to: wshyarali@yahoo.com, wshyarali@esc-ush.com (Ismail, W. Ali)

Abstract

Wheat straw is an abundant lignocellulosic feedstock in many parts of the world, and it has been selected for producing ethanol in an economically feasible manner. It contains a mixture of sugars (hexoses and pentoses). Two-stage acid hydrolysis with concentrated perchloric acid was carried out on wheat straw. The hydrolysate was concentrated by vacuum evaporation to increase the concentration of fermentable sugars, and was detoxified by over-liming to decrease the concentration of fermentation inhibitors. After the two-stage acid hydrolysis, the sugars and the inhibitors were measured. The ethanol yields obtained by converting the hexoses and pentoses in the hydrolysate with a co-culture of Saccharomyces cerevisiae and Pichia stipitis were higher than the ethanol yields produced with a monoculture of S. cerevisiae. Various conditions for hydrolysis and fermentation were investigated. The ethanol concentration was 11.42 g/l in 42 h of incubation, with a yield of 0.475 g/g, a productivity of 0.272 g/l·h, and a fermentation efficiency of 92.955 %, using the co-culture of Saccharomyces cerevisiae and Pichia stipitis.

Keywords: wheat straw, two-stage acid hydrolysis, bioethanol, co-culture, fermentation.

1 Introduction

Lignocellulosic materials are renewable, largely unused, and abundantly available sources of raw materials for the production of ethanol fuel. A potential route to future low-cost ethanol production is to utilize lignocellulosic materials such as crop residues, grasses, sawdust, wood chips, oil palm empty fruit bunches, trunks, and fronds [1]. These materials contain sugars polymerized in the form of cellulose and hemicellulose, which can be liberated by hydrolysis and subsequently fermented to ethanol by microorganisms [2]. Hydrolysis can be performed with the use of enzymes or chemicals to decompose the starch or cellulose to simple sugars [3]. The cellulose hydrolysis step is a major element in the total production cost of ethanol from lignocellulosic material [4]. Three groups of enzymes, i.e. endoglucanase, exo-glucanase and β-glucosidase, are involved in the cellulose-to-glucose process, with synergetic interactions [5]. Enzymatic hydrolysis, or saccharification, is mainly limited by a number of factors, including the crystallinity of the cellulose, the degree of polymerization, the moisture content, the available surface area, etc. [6]. Acid hydrolysis is becoming more popular, due to its lower cost and greater effectiveness compared with enzymatic hydrolysis [7]. The lignocellulosic material is subjected to strong concentrations of hydrochloric or sulfuric acid in order to begin the breakdown and separation of the materials [4].
Acid hydrolysis can be divided into two groups: (a) concentrated-acid hydrolysis and (b) dilute-acid hydrolysis. Concentrated-acid processes are generally reported to give a higher sugar yield (e.g. 90 % of the theoretical glucose yield), and consequently a higher ethanol yield, than dilute-acid processes [3]. In addition, concentrated-acid processes can operate at low temperature, which is a clear advantage over dilute-acid processes [8]. There are other studies that apply hydrothermal processes [9], steam explosion [10], wet oxidation [11], alkaline peroxide [12] and ammonia fiber explosion [13] for biomass pretreatment in the ethanol process.

The sugars formed in the hydrolysate are fermented into ethanol. The most common microorganisms for this purpose are Saccharomyces cerevisiae and Zymomonas mobilis, which do not metabolize pentoses. However, almost one-third of the reducing sugars obtained from hydrolyzed lignocellulosic materials are pentoses, composed primarily of D-xylose [14]. Yeasts that have the ability to convert xylose to ethanol have been reported, and one of the earliest identified with this unique capability was Pichia stipitis [15]. For efficient conversion of all sugars to ethanol, a co-culture of OVB 11 (Saccharomyces cerevisiae) and Pichia stipitis NCIM 348 is used to co-ferment hexoses and pentoses to ethanol [14].

The average yield of wheat straw is 589.670 to 635.029 g/g of wheat grain [16]. Wheat straw contains 35–45 % cellulose, 20–30 % hemicellulose, and 8–15 % lignin, and it can also serve as a low-cost, attractive feedstock for fuel alcohol production [17]. Several studies are available on the production of ethanol from wheat straw hydrolysates [18–22]. In this work, wheat straw was treated with different concentrations of perchloric acid in a two-stage hydrolysis to fermentable sugars. In addition, vacuum evaporation and over-liming of the hydrolysate were optimized. The fermentability of the hydrolysate was evaluated using a monoculture of baker's yeast and a co-culture of baker's yeast and P. stipitis. The optimum yields were obtained from variations of acid concentration, temperature and time. The ethanol produced was 11.42 g/l in 42 h of incubation, with a yield of 0.475 g/g, a productivity of 0.272 g/l·h, and a fermentation efficiency of 92.955 %. Perchloric acid was used because of its double function as an oxidising agent and a hydrolysing agent, and because it can be recycled from its KClO4 salt.

2 Materials and methods

2.1 Wheat straw and reagents

Sun-dried wheat straw, cultivated in June 2011, was obtained from the local harvest in Erbil, Kurdistan Region, Iraq. It was further dried in an oven at 70 °C for 12 h before it was milled in a hammer mill, and particles smaller than 0.12 mm were collected for further use in the experiments. The following reagents were obtained from Sigma-Aldrich: perchloric acid (70 % w/w), potassium hydroxide pellets, glucose, arabinose, xylose, ethanol, lime and charcoal.

2.2 Microorganisms and culture media

Hexose yeast: dry baker's yeast (Saccharomyces cerevisiae), widely used in the bakery and brewery industries (Mauri-pan instant yeast, AB Mauri Malaysia Sdn. Bhd.), was used for the ethanol fermentation of glucose. The yeast (10 g/l) was inoculated in YM broth (pH 6.0), which consisted of glucose (10 g/l), peptone (5 g/l), yeast extract (3 g/l), malt extract (3 g/l) and distilled water (up to 1 l). This culture was incubated at 30 °C for 48 h.
Pentose yeast: Pichia stipitis (CBS 5773) was purchased from the CBS culture collection centre, Netherlands, and was used for the ethanol fermentation of xylose. It was maintained on a GPYA (ATCC 144) agar plate medium (glucose (40 g/l), yeast extract (5 g/l), peptone (5 g/l), agar (15 g/l)). It was grown in GPYA medium at 30 °C, 100 rpm for 48 h, after xylose (40 g/l) had been substituted for glucose in the medium. The preculture yeast cells were collected by centrifugation at 6000 g for 10 min. The cells were washed twice with distilled water prior to inoculation.

2.3 Pretreatment

2.3.1 Acid hydrolysis

The powdered wheat straw (10 g) was treated with perchloric acid (17.5 %) at a solid-to-liquid ratio of 1:4 at 100 °C for 20 min. The mixture was cooled in an ice bath and filtered. The residual wheat straw from the filtration was kept for the second stage of hydrolysis. The filtrate was neutralised with KOH (10 M), KClO4 was precipitated, and the solution was filtered. The residual wheat straw was hydrolysed with HClO4 (35 %) at 100 °C for 40 min. The filtrate from this second stage was treated as above. The sugars and the by-products were measured by HPLC.

2.3.2 Concentrating the sugars

The combined filtrates from above (550 ml) were concentrated by vacuum distillation (Buchi Rotavap R215) at 80 °C. Sugars, phenolics and furans were measured by HPLC.

2.3.3 Detoxification

The concentrated filtrate was treated with calcium oxide with stirring, until pH 10. This mixture was incubated for half an hour, followed by centrifugation (3000 g, 20 min) and filtration. The pH of the hydrolysate was then brought down to pH 6 with HClO4 (10 M) [23]. Activated charcoal (3.5 g) was added to the hydrolysate and stirred for 1 h. The mixture was centrifuged and filtered. The sugars and the by-products were measured by HPLC.

2.4 Fermentation

Monoculture (S. cerevisiae): the detoxified hydrolysate (100 ml) was supplemented with 0.1 % (w/v) yeast extract, peptone, NH4Cl and KH2PO4, and 0.05 % of MgSO4·7H2O, MnSO4, CaCl2·2H2O, FeCl3·2H2O and ZnSO4·7H2O, in a 250 ml conical flask; the pH was adjusted to 5.5, and the flask was autoclaved at 115 °C for 15 min [23]. After the medium had been cooled to room temperature, it was transferred to a 500 ml jar (laboratory fermentor, B. E. Marubishi (Thailand) Co., Ltd.) under sterile conditions. Then the baker's yeast starter culture (10 ml) was inoculated and incubated anaerobically at 30 °C, 200 rpm for 72 h. Samples of the medium were withdrawn periodically at various intervals from the replicated fermenter jars, centrifuged at 600 g for 10 min at 10 °C and analyzed for ethanol.

The fermentation efficiency was calculated using the following formula [23]:

$$\mathrm{FE}\,(\%) = \frac{\text{practical yield}}{\text{theoretical yield}} \times 100,$$

where the practical yield is the ethanol that is produced, and the theoretical yield is 0.511 g of ethanol per gram of sugar consumed.
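As a worked check of this formula (our own arithmetic, using the figures reported in the abstract), the co-culture yield of 0.475 g of ethanol per g of sugar consumed gives:

```matlab
% Fermentation efficiency from the reported ethanol yield.
practical_yield   = 0.475;   % g ethanol per g sugar consumed (reported)
theoretical_yield = 0.511;   % g/g, stoichiometric maximum
FE = practical_yield / theoretical_yield * 100  % ~= 92.955 %, as reported
```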
after it had been transferred, the baker's yeast starter culture (10 ml) was inoculated and incubated anaerobically at 30 ◦c, 200 rpm for 24 h. then p. stipitis inoculum was added at a rate of 10 g/l and fermentation was allowed to continue at 30 ◦c, 300 rpm for 72 h at an aeration rate of 5 ml/min. samples from the medium were withdrawn periodically at various intervals from the replicated fermenter jars, centrifuged at 600 g for 10 min at 10 ◦c and analyzed for ethanol by hplc.

2.5 analytical methods

sugars, furfural, hmf, ethyl vanillin, syringaldehyde, acetic acid and ethanol were analysed using a waters hplc system with a refractive index detector (waters 600e multisolvent delivery system, waters 717plus autosampler, waters tcm column heater (item 2989), waters 2414 refractive index detector) equipped with a rezex column (roa organic acid 00f-0138-k0, 8 % h, 150 × 7.8 mm) and a rezex micro-guard cartridge column (03b-0138-k0, 50 × 7.8 mm). the eluent was h2so4 (5 mm) at a flow rate of 0.6 ml/min, and the injection volume was 10 μl.

3 results and discussion

3.1 acid hydrolysis

perchloric acid was used because of its double function as an oxidising agent and as a hydrolysing agent. in addition to the acid hydrolyzing effect of hclo4, its oxidizing function helps in delignification and requires less time and energy than other acid pretreatments [24]. in addition, neutralisation of the excess hclo4 with koh leads to precipitation of the insoluble kclo4, which can be recycled to hclo4.

first stage of acid hydrolysis: the effects of temperature and time on the acid hydrolysis of wheat straw were studied. in this study, the ratio of powdered wheat straw to the volume of perchloric acid was 1 : 4, and the concentration of perchloric acid was 17.5 %. the effects of four different temperatures (50, 70, 90, 100 ◦c) on the hydrolysis of wheat straw were investigated. the effects of different heating times from 10 min to 60 min were also investigated at each of the above temperatures. reducing sugars from the hydrolysis increased as the temperature and the heating time increased, as shown in figures 1 and 2. however, at 100 ◦c, the concentration of arabinose decreased slightly after 20 min of hydrolysis, and the xylose concentration approximately stabilized. this indicated that the hemicellulose hydrolysis of wheat straw was almost complete when it was hydrolysed at 100 ◦c for 20 min. the concentrations of by-products such as furfural, hmf, ethyl vanillin, syringaldehyde and acetic acid increased as the temperature and time increased, as shown in figures 2 and 3; consistent with this, no degradation products were observed at 50 ◦c. based on the results, the optimum conditions for the first stage were hydrolysis at 100 ◦c for 20 min.

table 1: effects of hclo4 concentration on hydrolysis at 100 ◦c for 60 min (all concentrations in g/l)

hclo4 (%) | glucose | xylose | arabinose | acetic acid | hmf | furfural | ethyl vanillin | syringaldehyde
17.5 | 4.226 | 4.926 | 0.958 | 0.749 | 0.345 | 0.195 | 0.26 | 0.21
20 | 5.097 | 4.906 | 0.946 | 1.099 | 0.364 | 0.308 | 0.29 | 0.24
25 | 6.123 | 4.876 | 0.923 | 1.783 | 0.407 | 0.396 | 0.29 | 0.28
30 | 6.502 | 4.812 | 0.902 | 2.117 | 0.446 | 0.442 | 0.28 | 0.30
35 | 6.765 | 4.656 | 0.889 | 2.378 | 0.479 | 0.422 | 0.27 | 0.29
40 | 6.796 | 4.520 | 0.867 | 2.524 | 0.490 | 0.382 | 0.25 | 0.27
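the screening in table 1 is simple enough to reproduce programmatically. the following python sketch picks out the acid concentration after which further acid no longer pays off; the stopping rule (a marginal glucose gain below 0.1 g/l) is our illustrative assumption, not a criterion stated by the authors.

```python
# illustrative screening of the table 1 data above. the stopping rule
# (stop raising the acid concentration once the marginal glucose gain
# drops below 0.1 g/l) is an assumption made for illustration only.
rows = [
    # (hclo4 %, glucose g/l), taken from table 1
    (17.5, 4.226),
    (20.0, 5.097),
    (25.0, 6.123),
    (30.0, 6.502),
    (35.0, 6.765),
    (40.0, 6.796),
]

def optimum_concentration(rows, min_gain=0.1):
    """return the concentration beyond which extra acid adds less than
    min_gain g/l of glucose, i.e. where the glucose yield stabilizes."""
    for (c0, g0), (c1, g1) in zip(rows, rows[1:]):
        if g1 - g0 < min_gain:
            return c0
    return rows[-1][0]

print(optimum_concentration(rows))  # -> 35.0, matching the reported optimum
```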
figure 1: effects of the temperatures and the heating times on hydrolysis (the first stage), where x is xylose and g is glucose
figure 2: effects of the temperatures and the heating times on hydrolysis (the first stage), where ar is arabinose and aa is acetic acid
figure 3: effects of the temperatures and the heating times on hydrolysis (the first stage), where h is hmf, s is syringaldehyde, e is ethyl vanillin and f is furfural

table 2: effects of heating time on the hydrolysis of the wheat straw residue using 35 % hclo4 at 100 ◦c (all concentrations in g/l)

time (min) | glucose | xylose | arabinose | acetic acid | ethyl vanillin | syringaldehyde | hmf | furfural
10 | 1.586 | 0.012 | 0 | 0.030 | 0.10 | 0.00 | 0 | 0
20 | 2.631 | 0.087 | 0.000 | 0.057 | 0.27 | 0.02 | 0 | 0
30 | 3.108 | 0.188 | 0.002 | 0.084 | 0.63 | 0.05 | 0 | 0
40 | 3.325 | 0.238 | 0.003 | 0.092 | 0.84 | 0.08 | 0 | 0
50 | 3.419 | 0.270 | 0.003 | 0.134 | 1.02 | 0.12 | 0 | 0
60 | 3.414 | 0.266 | 0.004 | 0.168 | 1.10 | 0.16 | 0 | 0

the effects of perchloric acid concentration on the hydrolysis were investigated. the perchloric acid concentration ranged from 17.5 % to 40 % (w/w). the glucose concentration from hydrolysis increased as the perchloric acid concentration increased, until a level of 35 % was reached, after which it stabilized. inversely, however, the xylose and arabinose concentrations decreased, as shown in table 1. these results show that the hemicellulose hydrolysis of wheat straw was approximately complete when it was hydrolyzed by 17.5 % perchloric acid at 100 ◦c for 60 min, but the cellulose of the wheat straw still remained and continued to hydrolyze. the concentrations of furfural, ethyl vanillin and syringaldehyde first increased as the concentration of perchloric acid increased, but later decreased. this is because furfural was oxidized to formic acid, ethyl vanillin was oxidized to isovanillic acid and syringaldehyde was oxidized to syringic acid by hclo4 [7]. the concentration of hmf continuously increased as the concentration of the acid increased, perhaps because the conversion rate of furfural is about four times faster than that of hmf [25]. based on the results, the optimum concentration of perchloric acid for the hydrolysis of cellulose was 35 %, which gave a high yield of glucose. in order to prevent degradation of xylose and arabinose and to minimize the amount of by-products, it was decided to separate the hydrolysis of hemicellulose and cellulose into two separate stages.

second stage of acid hydrolysis: the effects of various heating times from 10 min to 60 min on the hydrolysis of the wheat straw residue from the first-stage filtration were investigated. in this study, 40 ml of 35 % hclo4 and the wheat straw residue were added to the flask of the hydrolysis set, and hydrolysis was performed at 100 ◦c. the glucose concentration from the hydrolysis increased significantly as the heating time increased, until 50 min, after which it declined slightly, as shown in table 2. by contrast, only very small amounts of xylose, arabinose and acetic acid were produced, and their concentrations increased only slightly with increasing heating time. this is further evidence that the hemicellulose hydrolysis of wheat straw was approximately complete in the first stage of hydrolysis, but the cellulose of the wheat straw still remained and continued to hydrolyse. accordingly, as shown in table 2, furfural and hmf were not detected. the concentrations of ethyl vanillin and syringaldehyde increased with increasing heating time, which may be due to degradation of lignin from the residue, as shown in table 2. based on the results, 50 min was selected as the optimum heating time for the second stage of hydrolysis.

figure 4: effects of hydrolysate concentrating and detoxification on the amount of sugars, where b.c. is before concentrating, a.c. is after concentrating, b.d. is before detoxification and a.d. is after detoxification
figure 5: effects of hydrolysate concentrating and detoxification on the amount of inhibitors, where b.c. is before concentrating, a.c. is after concentrating, b.d. is before detoxification, a.d. is after detoxification, aa is acetic acid, h is hmf, f is furfural, e is ethyl vanillin and s is syringaldehyde
figure 6: effect of incubation time on ethanol production using a monoculture (baker's yeast) and a co-culture of baker's yeast and p. stipitis

3.2 concentrating the sugars

the sugars were concentrated roughly twofold from their initial concentrations, as shown in figure 4. the furfural, hmf, ethyl vanillin, syringaldehyde and acetic acid contents were 0.141, 0.354, 1.68, 0.31 and 1.27 g/l, respectively, from initial concentrations of 0.126, 0.236, 0.94, 0.25 and 0.88 g/l, respectively, as shown in figure 5. the by-products were concentrated at different rates. this may be due to degradation, e.g. of furfural to acetic acid and formic acid [26].

3.3 detoxification

partial neutralization, over-liming and activated charcoal treatments were used to minimize the effect of the microbial inhibitors (by-products) formed during acid hydrolysis, and to improve the formation of ethanol during the fermentation process. this process led to reductions of 87.2 %, 86.2 %, 75 %, 84.9 % and 84.4 % in furfural, hmf, acetic acid, ethyl vanillin and syringaldehyde, respectively, as shown in figure 5. only small amounts of glucose (3.03 %), xylose (2.95 %) and arabinose (2.79 %) were absorbed, as shown in figure 4.

3.4 fermentation

monoculture ethanol fermentation using baker's yeast (saccharomyces cerevisiae): the fermentability of the concentrated and detoxified hydrolysate was evaluated using the baker's yeast starter culture. the highest concentration of ethanol was 6.516 g/l after 36 h of incubation, as shown in figure 6. the resulting ethanol yield was equivalent to 0.271 g/g, with a volumetric productivity of 0.181 g/l · h and a fermentation efficiency of 53 %, based on the total fermentable sugars (24.044 g/l) of the hydrolysate. the ethanol efficiency declined after 36 h of incubation. this may be because most of the xylose remained unfermented by the baker's yeast, as the hydrolysate contains both pentoses and hexoses.

co-culture ethanol fermentation using baker's yeast and p. stipitis: the fermentability of the concentrated and detoxified hydrolysate of wheat straw was evaluated using the co-culture of baker's yeast and p. stipitis. the highest concentration of ethanol was 11.421 g/l after 42 h, the optimum incubation period, with a maximum fermentation efficiency of 92.955 %, as shown in figure 6. the resulting ethanol yield was equivalent to 0.475 g/g, with a volumetric productivity of 0.272 g/l · h, based on the total fermentable sugars (24.044 g/l) of the hydrolysate. however, the ethanol productivity declined after 42 h of incubation. the higher ethanol yield is attributed to the fermentation of both hexoses and pentoses in the hydrolysate.
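the yield, productivity and efficiency figures quoted above follow directly from the measured ethanol concentration, the incubation time and the total fermentable sugars, using the efficiency formula from section 2.4. a minimal python sketch of the arithmetic (note that, as in the paper, the yield is computed against the total fermentable sugars of the hydrolysate):

```python
def fermentation_metrics(ethanol_g_l, sugars_g_l, hours, theoretical=0.511):
    """yield (g ethanol / g sugar), volumetric productivity (g/l/h) and
    fermentation efficiency (%), as defined in section 2.4."""
    y = ethanol_g_l / sugars_g_l          # ethanol yield, g/g
    productivity = ethanol_g_l / hours    # volumetric productivity, g/l/h
    efficiency = 100.0 * y / theoretical  # fe %, practical vs theoretical
    return y, productivity, efficiency

# monoculture (s. cerevisiae): 6.516 g/l ethanol after 36 h
print(fermentation_metrics(6.516, 24.044, 36))   # ~ (0.271, 0.181, 53.0)
# co-culture (s. cerevisiae + p. stipitis): 11.421 g/l after 42 h
print(fermentation_metrics(11.421, 24.044, 42))  # ~ (0.475, 0.272, 92.96)
```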
4 conclusion

bioethanol was produced from wheat straw using two different concentrations of perchloric acid in two separate hydrolysis stages. perchloric acid was used because of its double function as an oxidizing agent and as a hydrolyzing agent, and it can also be recycled from its kclo4 salt. two-stage hydrolysis was preferred to one-stage hydrolysis because there is less sugar degradation from the hydrolyzed materials in the first stage, fewer fermentation inhibitors are formed, and less heating time is required. a higher yield of ethanol was obtained by concentrating the sugars that were produced and detoxifying the inhibitors in the hydrolysate. the use of a co-culture of s. cerevisiae and p. stipitis for fermenting the concentrated and detoxified hydrolysate led to bioconversion of both hexoses and pentoses, with higher ethanol yields than for ethanol fermentation by a monoculture of s. cerevisiae.

acknowledgement

this study was carried out between the university of salahaddin/erbil – kurdistan region-iraq and the university of malaya – malaysia as research collaboration, under a sabbatical leave program of the ministry of higher education and scientific research, kurdistan region-iraq. we want to thank associate prof. dr. koshy philip for letting us use his lab, and we also thank mr. kelvin swee chuan wei and miss ainnul azizan for assisting us with our work.

references

[1] millati, r., niklasson, c., taherzadeh, m. j.: effect of ph, time and temperature of overliming on detoxification of dilute-acid hydrolysates for fermentation by saccharomyces cerevisiae, process biochem., 38, 4, 2002, 515–522.
[2] palmqvist, e., hahn-hägerdal, b.: fermentation of lignocellulosic hydrolysates. ii: inhibitors and mechanisms of inhibition, bioresource technol., 74, 2000, 25–33.
[3] carvalheiro, f., duarte, l., gírio, f. m.: hemicellulose biorefineries: a review on biomass pretreatments, journal of scientific & industrial research, 67, 2008, 849–864.
[4] gonzalez, g., lopez-santin, j., caminal, g., sola, c.: dilute acid hydrolysis of wheat straw hemicellulose at moderate temperature: a simplified kinetic model, biotechnology and bioengineering, 28, 1986, 288–293.
[5] yah, c. s., iyuke, s. e., unuabonah, e. i., pillay, o., vishanta, c., tessa, s. m.: temperature optimization for bioethanol production from corn cobs using mixed yeast strains, online j. bio. sci., 10, 2010, 103–108.
[6] li, c., knierim, b., manisseri, c., arora, r., scheller, h., et al.: comparison of dilute acid and ionic liquid pretreatment of switchgrass: biomass recalcitrance, delignification and enzymatic saccharification, bioresource technol., 101, 2010, 4900–4906.
[7] taherzadeh, m. j., karimi, k.: acid-based hydrolysis processes for ethanol from lignocellulosic materials: a review, bioresources, 2, 3, 2007, 472–499.
[8] jones, j., semrau, k.: wood hydrolysis for ethanol production – previous experience and the economics of selected processes, biomass, 5, 1984, 109–135.
[9] bobleter, o., pape, g.: method to degrade wood, bark and other plant materials, aust. pat. 263661, 1968.
[10] schultz, t. p., templeton, m. c., biermann, c. j., mcginnis, g. d.: steam explosion of mixed hardwood chips, rice hulls, corn stalks, and sugar cane bagasse, j. agri. food chem., 32, 1984, 1166–1172.
[11] mcginnis, g. d., wilson, w. w., mullen, c. e.: biomass pretreatment with water and high-pressure oxygen – the wet oxidation process, ind. eng. chem. prod. res. develop., 22, 1983, 352–357.
[12] bjerre, a. b., olesen, a. b., fernqvist, t., ploger, a., schmidt, a. s.: pretreatment of wheat straw using combined wet oxidation and alkaline hydrolysis resulting in convertible cellulose and hemicellulose, biotechnol. bioeng., 49, 1996, 568–577.
[13] holtzapple, m. t., jun, j. h., ashok, g., patibandla, s. l., dale, b. e.: the ammonia freeze explosion (afex) process – a practical lignocellulose pretreatment, appl. biochem. biotechnol., 28, 9, 1991, 59–74.
[14] slininger, p., bothast, r., vancauwenberge, j., kurtzman, c. p.: conversion of d-xylose to ethanol by the yeast pachysolen tannophilus, biotechnol. bioeng., 24, 1982, 371–384.
[15] du preez, j., prior, b.: a quantitative screening of some xylose fermenting yeast isolates, biotechnol. lett., 7, 1985, 241–246.
[16] montane, d., farriol, x., salvado, j., jollez, p., chornet, e.: application of steam explosion to the fractionation and rapid vapour-phase alkaline pulping of wheat straw, biomass bioenergy, 14, 1998, 261–276.
[17] gould, j.: alkaline peroxide delignification of agricultural residues to enhance enzymatic saccharification, biotechnol. bioeng., 24, 1984, 46–52.
[18] ahring, b., jensen, k., nielsen, p., bijerre, a., schmidt, a. s.: pretreatment of wheat straw and conversion of xylose and xylan to ethanol by thermophilic anaerobic bacteria, bioresour. technol., 58, 1996, 107–113.
[19] ahring, b., licht, d., schmidt, a., sommer, p., thomsen, a.: production of ethanol from wet oxidized wheat straw by thermoanaerobacter mathranii, bioresour. technol., 68, 1999, 3–9.
[20] klinke, h., olsson, l., thomsen, a., ahring, b.: potential inhibitors from wet oxidation of wheat straw and their effect on ethanol production by saccharomyces cerevisiae: wet oxidation and fermentation by yeast, biotechnol. bioeng., 81, 2003, 738–747.
[21] nigam, j.: ethanol production from wheat straw hemicellulose hydrolyzate by pichia stipitis, j. biotechnol., 87, 2001, 17–27.
[22] delgenes, j., moletta, r., navarro, j.: acid hydrolysis of wheat straw and process considerations for ethanol fermentation by pichia stipitis y7124, proc. biochem., 25, 1990, 132–135.
[23] yadav, k., naseeruddin, s., sai prashanthi, g., sateesh, l., rao, l.: bioethanol fermentation of concentrated rice straw hydrolysate using co-culture of saccharomyces cerevisiae and pichia stipitis, bioresource technology, 102, 11, 2011, 6473–6478.
[24] martin, c., marcet, m., thomsen, a. b.: comparison between wet oxidation and steam explosion as pretreatment methods for enzymatic hydrolysis of sugarcane bagasse, bioresources, 3, 3, 2008, 670–683.
[25] harris, j., baker, a., zerbe, j.: two-stage, dilute sulfuric acid hydrolysis of hardwood for ethanol production, energy biomass wastes, 8, 1984, 1151–1170.
[26] dehkhoda, a., brandberg, t., taherzadeh, m.: comparison of vacuum and high pressure evaporated wood hydrolysate for ethanol production by repeated fed-batch using flocculating saccharomyces cerevisiae, bioresources, 4, 1, 2008, 309–320.

barriers to energy efficiency – focus on transaction costs

m. valentová

abstract

this paper assesses the main barriers that prevent economic energy efficiency potential from being realized. the main barriers discussed here include energy prices (and prices of technology), limited access to capital, lack of information, incorrect risk assessment (i.e. setting a discount rate), the principal-agent problem and transaction costs.
transaction costs are analyzed in greater detail, as they are one way or another related to all of the barriers mentioned here. based on the analysis, there is a discussion of implications for effective policy making. these are especially needed for transaction costs, where the availability of empirical data is very limited.

keywords: barriers to energy efficiency, transaction costs, energy efficiency gap, energy efficiency instruments.

1 introduction

energy demand has become a major political topic, both on a national level and on an international level. with increasing energy prices and steady depletion of classical fossil fuels, energy issues are likely to become even more important in the future. the benefits of increased energy efficiency (ee) are widely known. apart from lower energy demand and lower energy costs as such, it can provide a better working environment, environmental benefits, or from the macroeconomic point of view, job creation and lower import dependence. political targets have been set at eu level and at international level (the so-called 20-20-20 target for 2020, or the kyoto targets). energy efficiency as a strategic goal is clearly mentioned in the state energy concept of the czech republic. the economic potential for energy savings is not negligible; it is estimated to be around 20 to 30 %. however, this potential remains to a large extent unexploited, because of the barriers to energy efficiency. the paper analyzes the main barriers to energy efficiency, specifically focusing on transaction costs (the special interest in transaction costs is because they are in one way or another related to all of the other barriers, either stemming from them or including them). on the basis of this analysis, the paper presents possible implications for effective policies to overcome them.

2 barriers to efficiency

the main barriers to energy efficiency are (not in order of relevance) energy prices (and technology prices), limited access to capital, lack of information, incorrect risk assessment (i.e. setting a discount rate), the principal-agent problem and transaction costs. it is important to keep in mind, though, that the analyzed barriers never stand alone. on the contrary, the barriers are usually all interconnected and may even reinforce each other [1], which renders potential policymaking even more complex.

2.1 energy prices and prices of the technology

energy prices affect the implementation of energy efficiency measures in various ways. one factor is the actual share of energy costs in total costs, while another factor is the development of energy prices. energy costs very often form (except in energy intensive industries) only a small proportion of the overall expenditures. also, when making a decision about an investment, energy consumption is only one of many criteria for the decision [2]. this may negatively impact the implementation of otherwise cost-effective efficiency measures. the development of energy prices is another key determinant of adoption or non-adoption of efficiency measures. basically, increasing energy prices will lead to the use of more efficient technologies [3]. by the same logic, one could infer that, conversely, low energy prices will lead to the use of inefficient technologies. this is however true only to a certain extent. birol and keppler [3] call this a ratchet effect, meaning that some level of efficient technologies will remain in place even if energy prices fall (e.g. households will not tear down new insulation just because energy prices decreased).
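the ratchet effect lends itself to a compact statement: the installed efficiency level follows energy prices upwards but does not fall back when prices do. a stylized python sketch (the linear price response is an arbitrary illustrative assumption, not a calibrated model):

```python
def efficiency_path(prices, respond=lambda p: 0.5 + 0.01 * p):
    """stylized ratchet effect [3]: installed efficiency tracks rising
    energy prices, but never falls when prices drop, because installed
    measures (e.g. insulation) stay in place."""
    level = 0.0
    path = []
    for p in prices:
        level = max(level, respond(p))  # the ratchet: no downward moves
        path.append(level)
    return path

print(efficiency_path([10, 20, 15, 30, 12]))
# -> approximately [0.6, 0.7, 0.7, 0.8, 0.8]; no decline after prices fall
```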
however, at the same time, there will always be some level of rebound effect, which is the increase in demand for energy services (and eventually in energy) due to the de facto lower price of energy (because of efficiency measures) measured in terms of energy services or efficiency units. the level of the rebound effect cannot be determined conceptually (only empirically), but it will always be a fraction between zero and one. when the rebound effect is one, all the costs saved through efficiency measures are translated into higher demand for energy services. current empirical studies show that the level of the rebound effect varies in most cases between 10 % and 30 % [4, 5].

2.2 limited access to capital

another barrier that is usually mentioned in relation to energy efficiency measures is the often high upfront cost of energy efficiency investments (this is by no means always true, as shown e.g. in [2], where a study on refrigerators and freezers is cited in which little correlation was found between upfront costs and energy performance; in addition, behavior plays a strong role, which can be important but, of course, has zero cost). the problem itself concerns potentially limited access to capital. there are two groups of actors that are particularly influenced by this barrier: small and medium size enterprises (smes) and lower income households. paradoxically, it is the latter group that could make the highest efficiency gains [6]. at the same time, though, it is low income households that have very difficult access to capital and to credit (and are also more likely to be unable to repay the loan). another issue with households (irrespective of income levels) is that energy efficiency investments have to compete with many other investments that a household has to make [7]. households in general tend to be averse towards risk and towards credit. like low income households, small and medium size enterprises tend to have worse access to credit (with less favorable conditions than bigger companies). an important sector for energy efficiency measures, the public sector, is in general a trustworthy client for the banks, because the risk of non-repayment is relatively low. therefore, public sector actors will not usually have problems with access to capital per se. the constraint is different here – very often the level of indebtedness will be limited by law [7]; therefore, even though the public authorities could obtain credit for an energy efficiency investment, they will not be allowed to do so.

2.3 lack of information

one of the major barriers to energy efficiency is the lack of information of potential investors (both households and organizations). due to lack of information, the costs of energy saving measures are likely to exceed the benefits for individual users [8] (this is very much related to the issue of transaction costs, which is elaborated in detail below). as in the previous section, lack of information tends to be more relevant to households and smes (though reddy [2] notes that the problem of insufficient information also appears at governmental level, related to policy making). e.g. households will usually only compare the investment costs, but not the operation costs [1]. some authors [1, 9] mention the problem of lack of billing information. households and also most small and medium enterprises very often get information on energy consumption once a year, not split down into different end-uses. bills are frequently paid through monthly (fixed) payments with one clearance at the end of the year. it is therefore almost impossible to base one's decisions on energy consumption – and thus spot the biggest energy users (a solution to this seems to lie in a transition to smart grids and smart meters: apart from advantages for energy suppliers, it is believed that smart meters encourage energy savings through real-time provision of information on energy consumption, which can also be further split down to end-uses; more on the discussion on smart meters e.g. in [10]). the problem is not only lack of information, but also asymmetry in information.
sanstad and howarth [6] call this a special case of imperfect information where "two parties have access to different levels of information". typically, this would be a case of producers versus consumers. asymmetry in information levels is, according to the authors, a rule rather than an exception. schleich and gruber [7] offer audits as a solution to the information barrier. nonetheless, they add in the same breath that those who would make use of them are usually the ones who also lack the resources for such an audit. finally, even if we suppose that the actors do receive all the needed information, another barrier seems to be a lack of knowledge or capacity to evaluate it correctly and draw correct conclusions. this is a problem for households, and also for smes, which typically do not have a specialized energy expert.

2.4 incorrect risk assessment

generally, using an excessively high discount rate in assessing the economic effectiveness of energy saving measures is thought to be another major source of the so-called "efficiency gap" [11, 12]. (note that we are discussing the discount rate used by investors in the sense of the investors' opportunity cost, not the interest rate that the bank will give to the investor; the latter is discussed in the section on access to capital.) many empirical studies have shown that customers (households and firms) discount the energy savings by tens of percent, thus significantly lowering their present value (see e.g. [11, 6, 13]). howarth and anderson [8] estimate that the discount rates start from 20–25 % but can reach up to 800 %, much more than the returns on other investments would be. according to vine et al. [14], the level of the discount rate can reach up to 50 %. there are various reasons for this, but together they very often reflect the barriers discussed so far. one of the main determinants influencing the risk of achieving future energy savings is seen to be the energy price. this is why e.g. thompson [11] proposes using two different discounts, one for the old equipment and the other for the new, efficient equipment, because if the energy source is changed with the efficiency measure, then it may also be necessary to consider a different risk in energy prices for the two (old and new) installations. another example would be the transition to a different tariff; also, with a two-part price of energy, changing the type of equipment can lead to a change in the fixed part of the price. apart from the future energy price, another determinant is energy consumption.
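how strongly the implicit discount rate weighs on such an assessment is easy to quantify. the sketch below uses invented numbers, not figures from the cited studies; it shows the same saving stream turning from clearly profitable at a 5 % rate to clearly unprofitable at a 30 % rate of the magnitude reported above:

```python
def npv(investment, annual_saving, years, rate):
    """net present value of an efficiency measure: upfront cost now,
    a constant energy-cost saving in each of the following years."""
    pv = sum(annual_saving / (1 + rate) ** t for t in range(1, years + 1))
    return pv - investment

# illustrative measure: 1000 invested, 200 saved per year for 10 years
print(round(npv(1000, 200, 10, 0.05), 1))  # ->  544.3, adopted
print(round(npv(1000, 200, 10, 0.30), 1))  # -> -381.7, rejected
```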
if the energy efficiency measure is rather common or not too complex (e.g. lighting), future energy consumption can be well estimated from the technical parameters of the equipment. with more complex installations, future energy consumption can also represent a risk or an uncertainty, and thus may provide a reason for a higher discount rate. this is also related to the fact that energy-efficient equipment is often new to the market and thus investors will attach more risk to it [8] (in addition, future consumption does not depend only on technical specifications, but to a large extent also on the consumer's behavior and usage; similarly, future energy consumption will depend on the reference scenario, i.e. the future energy consumption of the existing equipment). in general, both of the risks or uncertainties analyzed above lead to the use of a higher discount rate. stochastic energy prices will, as schleich and gruber [7] note, raise the investor's required rate of return (discount). thompson [11] however argues that with energy saving measures, the correct method would actually be to use a lower discount rate. the reason is that the main source of uncertainty about the future is the energy price (and energy consumption), whereas the energy efficiency measure will then lead to a "lower variation in investor wealth" [11]. another explanation would be through the capital asset pricing model (capm). using more efficient equipment and thus achieving energy savings will lead to a decrease in the systemic risk [15]. the reason is that lower incomes of the market will usually correlate with higher energy prices, which, on the contrary, advantages energy saving measures. in other words, energy saving measures can serve as a "safety fuse" against price volatility.

2.5 the principal/agent problem

the "principal-agent" problem is basically a barrier of split incentives (or of the separation of responsibilities for energy expenditures and conservation actions [1]). the owners of the facility (of the rental unit) do have the incentive to invest in the efficiency measure. however, they will have no control over the use of the efficient equipment and thus no control over the efficiency gains. furthermore, the owner does not receive the benefits of the measure, because it is the user who pays lower energy bills. conversely, the user receives all the benefits from the efficiency measure, but has no incentive to invest, as there is high uncertainty as to the length of the contract. it is likely that the user will not be able to benefit fully from the cost savings (will simply have to move out before all the cost savings can be realized), which makes the investment economically disadvantageous. schleich and gruber [7] however point out that the barrier of split incentives is more significant for households than for private companies and for the public sector, as these usually have longer rental contracts than is the case for households.

2.6 transaction costs in energy efficiency

the reason why transaction costs are depicted separately is that they tend to include or stem from all the above barriers. the level of transaction costs is not negligible, and is likely to prevent energy efficiency measures from being implemented. however, the exact size of transaction costs still remains rather unclear, partly because there is no common method for evaluating them and including them in decision making. most authors state that the transaction costs in energy efficiency are real and are at a significant level. they may hamper the implementation of energy efficiency projects, or may even outweigh the gains of energy efficiency improvements and thus lead to their non-realization, or to a preference for inefficient or standard technologies. a suitable definition of transaction costs in energy efficiency seems to be provided by matthews [16]: ". . . the costs of arranging a contract ex ante and monitoring and enforcing it ex post, as opposed to production costs." this can be applied both to investments in efficiency measures and to policy instruments (in this case, the contract can be substituted for a policy instrument). the transaction costs are borne either by the project developers, by the programme managers or by the beneficiaries of ee programmes. the transaction costs therefore pertain to the costs related to investment, operation and maintenance, verification, and/or administrative costs. lack of information and transaction costs are sometimes interchanged [7]. energy efficiency transaction costs can be divided into four main stages in which transaction costs can occur: the planning phase, the implementation phase, the monitoring phase and the verification phase (table 1).
a suitable definition of transaction costs in energy efficiency seems to be providedbymatthews [16]: “. . .the costs of arranging a contract ex ante andmonitoring and enforcing it ex post, as opposed to production costs.” this can be applied both to investments in efficiency measures, and to policy instruments (in this case, the contract can be subsituted for a policy instrument). the transaction costs are borne either by the project developers, by the programmemanagers or by the beneficiaries of ee programmes. the transaction costs therefore pertain to the costs related to investment, operation andmaintenance, verification, and/or administrative costs. lack of information and transaction costs are sometimes interchanged [7]. energy efficiency transaction costs can be divided into four main stages in which transaction costs can occur: the planning phase, the implementation phase, monitoring phase and the verification phase (table 1). 8in addition, future consumption is not dependent only on technical specifications, but to large extent also on the consumer’s behavior and usage. similarly, future energy consumption will depend on the reference scenario, i.e. the future energy consumption of existing equipment. 89 acta polytechnica vol. 50 no. 4/2010 table 1: sources of transaction costs. source [17] project phase nature of transaction costs planning • search for information • search for costumers • legal fees • development of proposal (including development of baseline, m&v methodologies, etc.) • project identification and evaluation implementation • megotiation of contracts • procurement • project validation monitoring and verification • mechanisms to monitor, quantify and verify savings and related ghg emissions reductions (including installation of required equipment) theplanningphasemostly consists of searching for information, project identification, evaluationandproposal. in the implementation phase, the negotiation process is important. the last phase basically means monitoring and verifying the energy savings and/or ghg emissions reductions. björkqvist and wene [18] also emphasize the importance of including the potential active rejection in the calculation. active rejection means that the actors actually considered the option and actively rejected it. those actors also incur transaction costs. general knowledge about the negative impact of transaction has been supported by a number of studies (eg. in [19, 20, 2, 6]). however, empirical data is still lacking. the reasons for this include the fact that the actors are often reluctant to disclose information. also, there is a lackof ex-post evaluations,which serve as an important source for estimates of transaction costs. in addition, transaction costs are relatively case specific [21]. some of the small number of studies that evaluate the level of transaction costs of different programmes are presented here. it is important to keep in mind though, thatbecausedifferentmethods andsectors are being analyzed, the studies are not directly comparable. björkqvist and wene [18] udnertook a study of transaction costs in families that participated in a demand side management (dsm) programme in göteborg. they analyzed 51 families who decided to invest in upgrading their heating systems. the transaction costs were not measured in monetary terms, but in hours spent by the families. the authors found that on an average the families spent 18 hours on the decision making process. 
they also assessed the time spent by non-investors (active rejection), which was only 6 hours. importantly, a lot of information was provided by the energy supplier, who was the initiator of the dsm programme (e.g. the potential suppliers of energy efficiency equipment, information on options, etc.). in this way, some time for information searching was definitely saved. the authors transposed the hours into monetary terms, using labor costs as a proxy. the transaction costs then represented 28 % of the average investment if gross income was taken, or 13 % if net income was used. the authors admit that the numbers may be underestimated because, firstly, the households may not have remembered every minute of the time they spent on the decision and, secondly, because part of the decision stage was not included, as it was provided by the distribution company. michaelowa and jotzo [22] evaluated the transaction costs of several ghg emissions schemes, namely the aij in sweden (the predecessor of the joint implementation mechanism) and the cdm. the main sources of transaction costs under these schemes are the search costs, baseline development, approval costs (to have the co2 emission reduction approved by the approving authority), validation, registration and monitoring. the main finding is that there is a significant fixed part in the transaction costs in the ghg schemes. this means that there is a certain threshold of co2 savings below which the transaction costs outweigh the gains of the project. one study finds this threshold at a level of 50 000 t co2/yr for a 20-year project; another value is that the transaction costs should not be more than 25 % of the proceeds from permit sales in order to make a project viable [22]. overall, the level of transaction costs was estimated at ca. 20.5 % of total project costs.

table 2: empirical estimates of transaction costs. compiled by the author

study | level of tcs | field | note
björkqvist et al. [18] | 28 % (13 %) | households | if gross (net) income referred to
michaelowa and jotzo [22] | 20.5 % | cdm |
mundaca [21] | 10–20 % | audit scheme | % of the audit costs
mundaca [21] | 8–12 % | lighting | energy saving target programme
mundaca [21] | 24–36 % | insulation | energy saving target programme
sathaye [23] | 9–19 % | not specified |
easton consultants [24] | 20–40 % | escos |

mundaca [21] analyzed two energy efficiency programmes: free-of-charge energy audits in denmark, and the energy efficiency commitment in great britain. the results of the former cannot really be considered statistically relevant, as they were based on only 5 replies. nevertheless, a qualitative analysis was made. the idea is that the energy providers have to carry out some number of energy audits of their customers. the rationale behind the programme is that the market agents have asymmetric information and thus will not materialize all the energy improvements. the transaction costs are related mainly to finding clients for the audit as such, carrying out the audit, and also to follow-up measures, such as the search for partners if the client decides to implement some measures. another source of tcs is the accreditation process, as the energy audit has to be reported as part of the programme. the transaction costs were estimated as 10–20 % of the direct costs of the energy audit (not the costs of the potential resulting investment). the british programme sets an energy saving target to be fulfilled by energy supply companies. the companies can trade the savings among themselves.
the author [21] interviewed suppliers and asked them to identify and quantify the transaction costs related to the programme. the identified transaction costs were mainly searching for information (searching for households that could save), persuading customers, and approval of measures by the authority. in the implementation phase, the main source of tc was negotiating agreements or contracts with a third party; in monitoring and verification, random quality checks were the main source. all together, the level of tc differed according to the measure undertaken. in lighting, the transaction costs ranged from 8 % to 12 %, and with insulation measures the range was 24–36 % of the investment costs. sathaye [23] analysed various emission reduction projects (not only energy efficiency) in north and south america and asia, and estimated that the transaction costs ranged from 9 % to 19 % of total project costs. the transaction costs arise mainly from the negotiation process among parties, feasibility studies (including baselines and additionality), negotiation, monitoring and evaluation, and also approval from the managing authority. sathaye also believes that the key factor determining the level of transaction costs (at least in ghg projects) is project size. a few studies have focused on the transaction costs borne by energy service companies (escos). easton consultants [24] estimated that the transaction costs of energy efficiency projects carried out by escos represented from 20 % to 40 % of the total value of the project (figure 1).

fig. 1: costs associated with the esco project. source: [24]

3 policy implications

various policy instruments are needed to help remove the different barriers to energy efficiency. the instruments range from providing a regulatory framework, which should allow the market to develop, through hard regulatory instruments (minimum efficiency standards, building codes) and financial incentives (subsidies, loans) to soft (information) measures. there is no silver bullet; a set of instruments has to be put in place. in practice, the barrier of lack of information has been dealt with using various policy instruments. at the eu level, depending on the sector or the end-use, hard regulations or labelling have been adopted. the former (in the form of minimum efficiency standards) passes the difficulty and costs that consumers incur while searching for information on to the producers. if minimum efficiency standards (meps) are adopted, consumers are sure to buy efficient products instead of having to look for them. this is the case, for example, for standby regulation, where the search for information on each individual appliance would be too costly. the labelling approach is an example of a successful soft, information measure, which has helped in achieving energy efficiency at low cost. it has been most effective in the case of appliances that represent a significant portion of a household budget (e.g. refrigerators or washing machines). conversely, labelling has not worked for lighting, though the savings and the related economic effectiveness are unquestionable. in addition, in one of the above-mentioned case studies, the lack of information (and thus high transaction costs in the initial planning phase) was solved by passing the burden to a distribution company. the information was gathered and pooled at a single place and then distributed to the households, which reduced the transaction costs for the families and increased the effectiveness of the actions [18].
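the empirical estimates above can be folded into a simple viability check. the sketch below encodes the rule of thumb from [22] that transaction costs should not exceed 25 % of the proceeds from permit sales; all project numbers are hypothetical, chosen only to show how a dominant fixed cost penalizes small projects:

```python
def viable(co2_savings_t_per_yr, years, permit_price,
           fixed_tc, variable_tc_per_t, max_tc_share=0.25):
    """rule-of-thumb viability check from [22]: transaction costs should
    not exceed 25 % of the proceeds from permit sales. splitting the
    costs into a fixed and a variable part reflects the finding that the
    fixed part dominates for small projects."""
    proceeds = co2_savings_t_per_yr * years * permit_price
    tc = fixed_tc + variable_tc_per_t * co2_savings_t_per_yr * years
    return tc <= max_tc_share * proceeds

# hypothetical 20-year projects at a permit price of 5 per t co2:
print(viable(10_000, 20, 5.0, 300_000, 0.2))  # small project: False
print(viable(60_000, 20, 5.0, 300_000, 0.2))  # large project: True
```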
in some cases, it is sufficient to set up the right regulatory framework. that is the case for e.g. the principal/agent dilemma. schleich and gruber [7] suggest that this barrier could be avoided if "the investing party were able to credibly transmit the information about the benefits (i.e., future cost savings) arising from the investment, and to enter into a contract with those benefiting from the investment". one can imagine a policy instrument that will help in establishing the environment for such information transfer (by e.g. providing standard documents for both parties). as to risk management, a solution offered e.g. by mills [25] is to use insurance in energy saving projects. another way is to use real options, well developed in other sectors, but not yet in energy saving projects. some authors also imply that incorrect risk assessment is only a result of the other barriers; therefore, removing this barrier will depend on removing the other barriers.

3.1 transaction cost policy implications

the stakeholders are usually aware of the existence of transaction costs in energy efficiency projects and programmes, but know neither their structure nor their exact levels. therefore, a common method is needed for including them in decision making, both at the level of the investor and at the level of the policy makers. this will also allow a direct comparison, which is not possible at the moment. from the case studies mentioned above, some preliminary ideas for the development of such a method are drawn. firstly, there is an inverse relation between the size of the project and the transaction costs, which thus justifies streamlining. in other words, bundling and standardizing projects could reduce the fixed part of the transaction costs, which tends to be an inhibitor of efficiency measures in smaller projects (smes, households, and others). economies of scale play an important role in programmes where the fixed part of the transaction costs is significant [22]. the need to assess the transaction costs with respect to time is also stressed, as there may be a learning curve in (at least some forms of) transaction costs, depending on the general context of the programme [22]. it seems that the transaction costs tend to vary from sector to sector. in the household sector, the transaction costs are higher than in the commercial sector and in industry. the reason probably relates to the importance of project size. the commercial and industrial sectors will also more likely benefit from economies of scale and from fewer market imperfections. providing a framework for developing the energy service market may be a useful supplementary tool (an energy service is defined in the energy service directive (2006/32/ec) as "the physical benefit, utility or good derived from a combination of energy with energy efficient technology and/or with action, which may include the operations, maintenance and control necessary to deliver the service, which is delivered on the basis of a contract and in normal circumstances has proven to lead to verifiable and measurable or estimable energy efficiency improvement and/or primary energy savings"). the esco, from the very nature of the business, has an interest in including all the costs in its calculations [19], and thus also all the transaction and, more generally, hidden costs. energy efficiency services can solve the problem of limited access to capital, as one of the potential services is to offer repayment of the loan from guaranteed savings, thus removing the financial risk to the client.

acknowledgement

research described in the paper was supervised by doc. ing. jaroslav knápek, csc., fee ctu in prague.

references

[1] jochem, e., gruber, e.: obstacles to rational electricity use and measures to alleviate them. energy policy, 1990, vol. 18, no. 4, p. 340–350.
[2] reddy, a. k. n.: barriers to improvements in energy efficiency. energy policy, december 1991, p. 953–996.
[3] birol, f., keppler, j. h.: prices, technology development and the rebound effect. energy policy, 2000, vol. 28, p. 457–469.
[4] world energy council: energy efficiency policies around the world: review and evaluation. london: world energy council, 2008.
[5] sorrell, s., dimitropoulos, j., sommerville, m.: empirical estimates of the direct rebound effect: a review. energy policy, vol. 37, no. 4, 2009, p. 1356–1371.
[6] sanstad, a. h., howarth, r. b.: "normal" markets, market imperfections and energy efficiency. energy policy, 1994, vol. 22, no. 10, p. 811–818.
[7] schleich, j., gruber, e.: beyond case studies: barriers to energy efficiency in commerce and the services sector. energy economics, 2008, vol. 30, p. 449–464.
[8] howarth, r. b., anderson, b.: market barriers to energy efficiency. energy economics, october 1993, p. 262–272.
[9] darby, s.: implementing article 13 of the energy services directive and defining the purpose of new metering infrastructures. in proceedings of eceee 2009 summer study. act! innovate! deliver! reducing energy demand sustainably. edited by broussous, c., jover, c. eceee: stockholm, 2009, p. 441–452.
[10] darby, s.: the effectiveness of feedback on energy consumption. a review for defra of the literature on metering, billing and direct displays, 2006. available from http://www.eci.ox.ac.uk/research/energy/downloads/smart-metering-report.pdf (accessed february 2010).
[11] thompson, p. b.: evaluating energy efficiency investments: accounting for risk in the discounting process. energy policy, 1997, vol. 25, no. 12, p. 989–996.
[12] koopmans, c. c., willem te velde, d.: bridging the energy efficiency gap: using bottom-up information in a top-down energy demand model. energy economics, vol. 23, no. 1, january 2001, p. 57–75.
[13] oikonomou, v., rietbergen, m., patel, m.: an ex-ante evaluation of a white certificates scheme in the netherlands: a case study for the household sector. energy policy, 2007, vol. 35, p. 1147–1163.
[14] vine, e., kats, g., sathaye, j., joshi, h.: international greenhouse gas trading programs: a discussion of measurement and accounting issues. energy policy, vol. 31, no. 3, 2003, p. 211–224.
[15] stoft, s.: appliance standards and the welfare of poor families. the energy journal, vol. 14, no. 4, 1993, p. 123–128.
[16] matthews, r. c. o.: the economics of institutions and the sources of growth. the economic journal, 1986, vol. 96, no. 384, p. 903–918.
[17] mundaca, l., neij, l.: transaction costs of energy efficiency projects: a review of quantitative estimations. report prepared under work package 3 of the eurowhitecert project, 2006.
[18] björkqvist, o., wene, c.: a study of transaction costs for energy investments in the residential sector. in proceedings of the 1993 summer study. the european council for an energy efficient economy (eceee), stockholm, 1993, p. 23–30.
[19] ostertag, k.: transaction costs of raising energy efficiency.
international workshop on technologies to reduce greenhouse gas emissions: engineering-economic analyses of conserved energy carbon. washington d.c., usa, 1999. available from http://www.isi.fhg.de/publ/downloads/isi99a19/energyeffiency.pdf (accessed february 2010).
[20] ostertag, k.: no-regret potentials in energy conservation: an analysis of their relevance, size and determinants. springer berlin: heidelberg, 2003.
[21] mundaca, l.: transaction costs of energy efficiency policy instruments. in proceedings of the eceee 2007 summer study – saving energy – just do it!, edited by attali, s. and tillerson, k., eceee: stockholm, 2007, p. 281–291.
[22] michaelowa, a., jotzo, f.: transaction costs, institutional rigidities and the size of the clean development mechanism. energy policy, 2005, vol. 33, p. 511–523.
[23] sathaye, j. a.: expediting energy efficiency project methodologies. bonn, germany, lawrence berkeley national laboratory, 2005. available from http://www.meti.go.jp/policy/global environment/kyomecha/050531futurecdm/committee/sathaye-may%2020-methodologies-future%20cdm-japan.pdf (accessed february 2010).
[24] easton consultants, s. f. m. c.: energy service companies. a market research study, prepared for energy center of wisconsin: 64, 1999. available from http://www.ecw.org/ecwresults/181-1.pdf (accessed february 2010).
[25] mills, e.: risk transfer via energy-savings insurance. energy policy, vol. 31, no. 3, 2003, p. 273–281.

about the author

michaela valentová, msc. was born in prague. she graduated from central european university in budapest and is currently a phd student at the faculty of electrical engineering, czech technical university in prague. her major area of research is barriers to energy efficiency and the evaluation of energy efficiency policy instruments and their economic effectiveness.

michaela valentová
e-mail: valenmi7@fel.cvut.cz
dept. of economics, management and humanities
faculty of electrical engineering
czech technical university
technická 2, 166 27 praha, czech republic

the effect of temperature on the gasification process

marek baláš, martin lisý, ota štelcl

brno university of technology, faculty of mechanical engineering, energy institute, technická 2, 616 69 brno, czech republic
correspondence to: balas.m@fme.vutbr.cz

abstract

gasification is a technology that uses fuel to produce power and heat. this technology is also suitable for biomass conversion. biomass is a renewable energy source that is being developed to diversify the energy mix, so that the czech republic can reduce its dependence on fossil fuels and on raw materials for energy imported from abroad. during gasification, biomass is converted into a gas that can then be burned in a gas burner, with all the advantages of gas combustion. alternatively, it can be used in internal combustion engines. the main task during gasification is to achieve maximum purity and maximum calorific value of the gas. the main factors are the type of gasifier, the gasification medium, biomass quality and, last but not least, the gasification mode itself. this paper describes experiments that investigate the effect of temperature and pressure on gas composition and lower calorific value. the experiments were performed in an atmospheric gasifier in the laboratories of the energy institute at the faculty of mechanical engineering, brno university of technology.

keywords: biomass, gasification, temperature effect.
1 introduction

biomass is one of the most significant renewable energy sources worldwide. biomass technology has many advantages, e.g. there is a comparatively low negative impact on the environment, and it can be grown on surplus agricultural land that is not suitable for, or is not required for, food production purposes. in recent times, renewable energy sources have covered almost 4 % of power production in the czech republic, and biomass is a major contributor to these sources. there are various ways in which biomass can be used for heat and power production, ranging from oil esterification, biogas production and biogas utilization to thermal processes such as pyrolysis, combustion and gasification. at the present time, direct combustion is the most widely used method for producing power from biomass. it is the oldest method, with well-mastered technologies. however, low-capacity processing lines are mostly suitable for heat production from biomass, and not for the more desirable power production. fuel gasification, with subsequent utilization of the generated gas in a cogeneration unit, is another technology for producing power from biomass. gasification is a relatively new method that offers many advantages. increased efficiency of energy utilization from biomass, especially for power production, is a major advantage of the gasification process. combustion of the gas that is produced is a more easily controlled process than combustion of solid biomass; gasification therefore decreases the production of harmful emissions. higher power production efficiency is achieved by using the gas in gas turbines and steam-gas cycles. gasification also achieves lower heat loss and better energy production than biogas combustion [1]. gasification is the thermo-chemical conversion of organic mass with a limited oxygen supply into a lower calorific value gas (4–15 mj/m3n) whose main components include co, co2, h2, ch4, more complex hydrocarbons, n2 and pollutants. the operating temperatures are rather high, commonly from 750 up to 1 000 ◦c. the gas that is produced is then combusted in a boiler or in combustion engines (and/or combustion turbines). the use of air as the gasification medium results in a low calorific value of the gas (4–7 mj/m3n) due to dilution of the gas with nitrogen (more than 50 %). the use of mixtures of air and oxygen and/or steam as the gasification medium produces gas of higher calorific value, ranging from 10–15 mj/m3n [2]. heat for the endothermic reactions is most commonly produced by partial oxidation of the gasified material (air or oxygen as the gasification medium), or is supplied from external sources. the main purpose of the gasification process is to transform as much energy as possible from the fuel into gas [3]. the most closely monitored characteristics of the gas that is produced include its quality (lower calorific value, composition), the amount produced by means of gasification, and also the amount and composition of pollutants in the gas. our study focuses on the effect of temperature and pressure on the quality and composition of the gas that is produced.

2 methodology for measurements at the biofluid 100 gasification fluid generator

the study was performed at the biofluid 100 stand (see figure 1), which is equipped with a stationary fluidized bed.

figure 1: biofluid 100 experimental equipment
figure 2: scheme of the biofluid gasifier

a simplified scheme of the experimental equipment is presented in figure 2.
fuel is supplied from a fuel storage tank equipped with a shovel, and is introduced via a dosing screw with a frequency converter into the reactor. the primary supply of blower compressed air is led into the reactor under the bed, and secondary and tertiary supplies are located at two height levels. the energogas that is produced is stripped of its solid particulate matter in the cyclone. the output gas is combusted in a burner equipped with a stabilization burner for natural gas and an individual air supply. the ash from the reactor can be removed from the tank located beneath the bed. the power-based heater for the primary air supply is placed behind the blower to enable the impact of air preheating to be monitored. in recent years, filters for a study of the efficiency of various gas cleaning methods have been attached to the basic part of the stand.

reactor parameters:
• capacity (in produced gas) 100 kwt
• fuel demand (consumption) 150 kwt
• wood consumption 40 kg · h−1
• air flow rate 50 m3n · h−1

the basic fluid generator operation characteristics are described below:
• operation of the fluid generator – after ignition, the fluid generator is operated in combustion mode, so that it heats up quickly. after the required gasification temperatures are achieved, secondary and tertiary air is supplied to the generator, and thus the gas is immediately combusted and consequently heats up the generator. the air supplies are then shut off, and the generator is brought into a stable mode for a specific preset gasification temperature. a stable mode is achieved when the amount of fuel that is dosed remains unaltered, the amount of gasification air is even, and the temperature swings in the middle section of the gasification generator are stable within the narrow range specified by the gasification temperature.
• data entry for the gasification process – the monitored data is continuously recorded by a computer at time intervals of 10 seconds for each measurement. the following values are monitored:
• the frequency of the converter of the dosing screw, for determining the mass flow rate;
• the temperature in various parts of the equipment, which is measured by thermocouples; the positions of the thermocouples are given in detail in the scheme in figure 2. there are 3 thermocouples along the top of the generator, 1 thermocouple in the cyclone and 2 thermocouples in the semi-coke pipe.
4 results and discussion

the dependency of the gas composition on the temperature and pressure in the gasification generator was experimentally monitored. no dependency of the gas composition on the fluid bed temperatures (t 101 and t 102) or on other values was proved. on the contrary, the experimental results showed that the temperature in the freeboard section (t 103), where the chemical balance reactions take place, is the key temperature for the gas composition (see figures 3 through 10). the charts show that an increase in temperature results in an increase in the proportion of hydrogen and carbon monoxide, and a decrease in the proportion of carbon dioxide and methane. this is due to the decreasing speed of the methanizing reactions, and the higher probability of the water-gas reaction (c + h2o → co + h2).

the available equipment places limitations on analyses of the dependency on pressure. while the temperature can be set within a range of 100 °c, the pressure can be regulated only within the range from ca. 2.5 to 19 kpa, which is rather narrow. nevertheless, the proportion of methane is seen to increase with pressure, i.e. this dependency contrasts with the temperature trend.

figure 3: chart of the dependency of h2 content on t 103 temperature
figure 4: chart of the dependency of h2 content on pressure
figure 5: chart of the dependency of co2 content on t 103 temperature
figure 6: chart of the dependency of co2 content on pressure
figure 7: chart of the dependency of ch4 content on t 103 temperature
figure 8: chart of the dependency of ch4 content on pressure
figure 9: chart of the dependency of co content on t 103 temperature
figure 10: chart of the dependency of co content on pressure

subsequent analyses of the generated gas focused on its lower calorific value. as figure 11 shows, an increase in temperature results in a slight decrease in the proportion of nitrogen in the gas. thus, we see an increase in the proportion of the combustible components in the gas, and therefore an increase in the lower calorific value of the gas (see figure 13). the dependency on pressure again contrasts with the dependency on temperature (see figures 12 and 14).

figure 11: chart of the dependency of n2 content on t 103 temperature
figure 12: chart of the dependency of n2 content on pressure
figure 13: chart of the dependency of lower calorific value on t 103 temperature
figure 14: chart of the dependency of lower calorific value on pressure
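the lower calorific value of the gas follows from its composition as a volume-weighted sum of the component calorific values. the sketch below illustrates the calculation; the component lhv constants are rounded literature values and the sample composition is invented for the example, not a measured result from these experiments.

```python
# rounded lower heating values of the combustible components [mj/m3n]
LHV = {"h2": 10.8, "co": 12.6, "ch4": 35.8}

def lower_calorific_value(composition):
    """lcv [mj/m3n] of the gas as a volume-weighted sum of component values.
    non-combustible components (co2, n2) contribute nothing."""
    return sum(LHV.get(comp, 0.0) * frac for comp, frac in composition.items())

# illustrative air-blown producer gas (volume fractions), not a measured sample
sample = {"h2": 0.12, "co": 0.17, "ch4": 0.03, "co2": 0.14, "n2": 0.54}
print(f"lcv = {lower_calorific_value(sample):.2f} mj/m3n")  # ~4.5, inside the 4-7 range
```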
5 conclusion

this paper has analyzed the effect of gasification temperature and pressure on the quality of the gas generated from biomass. the results show that the temperature in the fluid bed of a gasification reactor has no effect whatsoever on the final composition of the gas. however, t 103, i.e. the temperature in the freeboard, has a direct impact on the final composition of the gas. an increase in temperature results in an increase in the proportion of co and h2, a decrease in the proportion of ch4 and co2, and an increase in the lower calorific value of the gas. a change in pressure has the opposite effect.

acknowledgement

financial support from gačr project no. 101/09/1464 is gratefully acknowledged.

references

[1] lisý, m., baláš, m., moskalik, j., pospíšil, j.: research into biomass and waste gasification in atmospheric fluidized bed. in: proceedings of the 3rd wseas international conference on renewable energy sources, res ’09, 2009. isbn 978-960-474-093-2.
[2] baláš, m., lisý, m.: vliv vodní páry na proces zplyňování biomasy [the influence of steam on the biomass gasification process], acta metallurgica slovaca, vol. 11, no. 1, košice, 2005. issn 1335-1532.
[3] ochrana, l., skála, z., dvořák, p., kubíček, j., najser, j.: gasification of solid waste and biomass, vgb powertech, 2004, vol. 84, no. 6, p. 70–74. issn 1435-3199.

nonlinear aspects of heat pump utilization

r. najman

abstract

this work attempts to answer the question: how much can we believe that the coefficient of performance provided by the manufacturer is correct, when a heat pump is required to face the real load coming from changes of temperature? the paper summarizes some basics of heat pump theory and describes the results of numerical models.

keywords: heat pumps, thermal mass of an object, coefficient of performance.

1 introduction

the idea of a heat pump is quite old, but nowadays we are confronted with the perspective of a potential energy crisis in the coming decades, and people are starting to look for ways to lower their expenditure on domestic heating. a few years ago, when we started studies of this topic, only a few thousand heat pumps were installed in the czech republic, but the number has grown by 20–50 % every year. houses with additional thermal insulation and new insulating windows have spread even more. these installations have a profound impact on the performance of heat pumps.

1.1 basic principles of heat pumps

heat pumps use energy from a colder source and release it into a warmer ambience. almost every refrigerator has one, so we have been living with heat pumps for a long time now.

legend:
1 condenser coil (hot side heat exchanger)
2 expansion valve (gas expands, cools and liquifies)
3 evaporator coil (cold side heat exchanger)
4 compressor
red = gas at high pressure and temperature
pink = gas at high pressure and reduced temperature
blue = liquid at low pressure and greatly reduced temperature
light blue = gas at low pressure and warmer temperature

fig. 1: basic idea of a heat pump [*1]

to describe a heat pump correctly we need the following data: the heating cop, the source of heat, and the target medium of the heat transition.

cop (efficiency):

$$\mathrm{COP}_{\mathrm{heating}} = \frac{Q}{P}. \qquad (1)$$

q: amount of heat transferred to the hot reservoir [w]
cop_heating: coefficient of performance (for heating purposes) [–]
p: dissipated work of the compressor [w]

1.2 source and target of heat

the three typical sources are air, water and soil. the targets are air or water. the systems are therefore referred to as air/air, air/water, water/air, water/water, soil/air and soil/water. for most technical and economic evaluation purposes, the target medium itself does not matter; only its temperature is important.

1.2.1 sources

air is the cheapest source in terms of initial investment, but unless you possess a source with a stable temperature (such as warm air from some technological process) the cop is quite low, especially when outside temperatures drop below zero (°c), or when you need an output temperature above 40 °c.

water is the "golden" middle way. it is a cheaper source than soil and its temperature is quite steady through the year, so the achievable cop is quite good. of course, its use can be limited by the unavailability of a usable water source.
in most cases, the water source must be approved by the local authorities.

soil is the most expensive, but surely the best source for cop. there are many technological ways to obtain heat from soil. the most common way is from drill holes, or from ground collectors. for new buildings, a quite cheap and effective way is to use energy piles (foundation piles with integrated heat collectors).

fig. 2: air, water, soil to water heat pumps [*2]

2 input values and equations

2.1 subject for testing our mathematical model

our test subject is a house with a surface area of 400 m2 (surfaces with applicable thermal insulation). our model heat pump is a water/water type, with source water at 7 °c (from a well) and an output of 35 °c for a screed floor. after making the first calculation of energy losses, the stiebel eltron wpw 7 heat pump seems to be the right choice, in combination with some thermal insulation. with no thermal insulation we should use some larger model. technical data of the wpw 7: 6.9 kw heat output at 35 °c water and cop 5.2.

2.2 required heat output

the required output of the heat source is calculated according to the following model: the indoor temperature t_v is maintained by regulating the temperature t_1.

t_1: highest heating temperature [°c]
t_wd: lowest heating temperature [°c]
t_v: indoor temperature [°c]
k_heating: constant representing the efficiency of heat transfer [w/k]
q_ven: heat losses through ventilation [w]
q_iz: heat losses through walls (insulated) [w]
q_ok: heat losses through windows [w]
q_tuv: heat consumed on supply water [w]

fig. 3: heat losses diagram

to calculate the heat losses we need to establish the thermal resistances

$$r_{thv} = \frac{1}{\alpha_v}, \quad r_{thz} = \frac{d}{\lambda_z}, \quad r_{thiz} = \frac{d_{iz}}{1000 \cdot \lambda_{iz}}, \quad r_{tha} = \frac{1}{\alpha_a}, \quad r_{tho} = \frac{1}{\lambda_o}, \qquad (2)$$

where:
r_thv: indoor convection thermal resistance [k·m²/w]
α_v: indoor heat transfer coefficient [w/(k·m²)]
r_thz: thermal resistance of the wall [k·m²/w]
d: wall width [m]
λ_z: thermal conductivity of the walls [w/(k·m)]
r_thiz: thermal resistance of the thermal insulation [k·m²/w]
λ_iz: thermal conductivity of the thermal insulation [w/(k·m)]
d_iz: thermal insulation width [mm]
r_tha: outdoor convection thermal resistance [k·m²/w]
α_a: outdoor heat transfer coefficient [w/(k·m²)]
r_tho: thermal resistance of the windows [k·m²/w]
λ_o: thermal conductivity of the windows [w/(k·m)]

$$q_{ok} = s_{ok} \cdot \frac{t_v - t_a}{r_{thv} + r_{tho} + r_{tha}} \qquad (3)$$

$$q_{iz} = s \cdot \frac{t_v - t_a}{r_{thv} + r_{thz} + r_{thiz} + r_{tha}} \qquad (4)$$

$$q_{ven} = 10 \cdot (t_v - t_a) \cdot os \quad \text{(empirical value)} \qquad (5)$$

when using a heat pump, the consumption for supply water is given by čsn as follows (heating to 35 °c, and then applying an additional source of heat; with heat sources other than heat pumps it can be done in one step):

$$q_{tuv} = \frac{os \cdot 82 \cdot c_p \cdot (35 - 7)}{3600 \cdot 24} + pom \qquad (6)$$

additional heating up to 55 °c:

$$pom = \frac{os \cdot 82 \cdot c_p \cdot (55 - 35)}{3600 \cdot 24} \qquad (7)$$

os: number of people [–]
c_p: specific heat capacity of water [j/(kg·k)]

2.3 heat pump in a thermal circuit

q_h2o-in: heat taken from the source [w]
p_el-hp: electricity consumption of the heat pump [w]
m_2: mass flow in the heating circuit [kg/s]
t_wh: max. temperature of the water in the heat pump [°c]
p_el-b: electricity consumption of the boiler [w]
(for others, see above)

fig. 4: thermal circuit

in the heat pump, the water is heated to t_wh (approximately 35 °c); then it passes through the water boiler, where it can be heated to t_1 (if necessary). in the screed floor, the temperature of the water goes down to t_wd, and the cycle is repeated.
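a minimal numerical sketch of the heat-loss model of eqs. (2)–(5) is given below. the material constants, heat transfer coefficients and the wall/window split of the 400 m2 envelope are illustrative assumptions; only the model structure comes from the text.

```python
def heat_losses(tv, ta, diz_mm, os=4,
                s_wall=360.0, s_win=40.0,        # assumed split of the 400 m2 envelope
                alpha_v=8.0, alpha_a=23.0,       # heat transfer coefficients [w/(k.m2)]
                d_wall=0.45, lam_z=0.8,          # wall thickness [m] and conductivity
                lam_iz=0.04, lam_o=1.1):         # insulation and window conductivities
    """total heat losses [w] per eqs. (2)-(5); all material data are assumed."""
    r_thv = 1.0 / alpha_v                        # eq. (2), indoor convection
    r_thz = d_wall / lam_z                       # wall
    r_thiz = diz_mm / (1000.0 * lam_iz)          # insulation (thickness given in mm)
    r_tha = 1.0 / alpha_a                        # outdoor convection
    r_tho = 1.0 / lam_o                          # windows
    q_ok = s_win * (tv - ta) / (r_thv + r_tho + r_tha)             # eq. (3)
    q_iz = s_wall * (tv - ta) / (r_thv + r_thz + r_thiz + r_tha)   # eq. (4)
    q_ven = 10.0 * (tv - ta) * os                                  # eq. (5)
    return q_ok + q_iz + q_ven

# -10 c outside, 21 c inside, 50 mm of insulation
print(f"{heat_losses(21.0, -10.0, 50):.0f} w")
```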
2.4 equations of the system

$$\mathrm{COP} \cdot p_{el\text{-}hp} = m_2 \cdot c_p \cdot (t_{wh} - t_{wd}) \qquad (8)$$

$$p_{el\text{-}b} = m_2 \cdot c_p \cdot (t_1 - t_{wh}) \qquad (9)$$

$$q_{celk} = m_2 \cdot c_p \cdot (t_1 - t_{wd}) \qquad (10)$$

$$q_{celk} = k_{heating} \cdot \left( \frac{t_1 + t_{wd}}{2} - t_v \right) \qquad (11)$$

q_celk: sum of all heat losses of the object [w]

this system of equations can be solved. the value of m_2 is chosen from the manufacturer's catalogue. the solutions are functions of d_iz and t_a (from the heat loss formulas).

2.5 thermal mass

the nonlinear behavior of a heat pump (the cop is a nonlinear function of the source and target temperatures) indicates that the thermal mass of the object must be taken into account. here i present the algorithm that allows us to estimate the impact of thermal masses on a defined thermal circuit. the first step is to define the command function, which makes it easy for us to input the desired indoor temperature:

p_top: heating output [w]

fig. 5: heating regulation

the following equations describe the thermal status of the system. for the wall:

$$\rho_s \cdot c_{ps} \cdot \frac{\partial t_s(x,t)}{\partial t} = \lambda_s \cdot \frac{\partial^2 t_s(x,t)}{\partial x^2} \qquad (12)$$

with the initial condition

$$t_s(x,0) = 0 \qquad (13)$$

and the boundary conditions

$$\frac{1}{\dfrac{1}{\alpha_a} + \dfrac{d_{iz}}{\lambda_{iz}}} \cdot \left( t_a(t) - t_s(0,t) \right) = -\lambda_s \cdot \left. \frac{\partial t_s(x,t)}{\partial x} \right|_{x=0} \qquad (14)$$

$$\alpha_v \cdot \left( t_s(d,t) - t_v(d,t) \right) = -\lambda_s \cdot \left. \frac{\partial t_s(x,t)}{\partial x} \right|_{x=d} \qquad (15)$$

where:
ρ_s: density [kg/m³]
c_ps: specific heat capacity [j/(kg·k)]
t_s: temperature in the wall [°c]
λ_s: thermal conductivity of the wall [w/(k·m)]
(for others, see above)

the equations for air, adapted to fit better into the mathematica software:

$$\rho_v \cdot c_{pv} \cdot \frac{\partial t_v(x,t)}{\partial t} = 100 \cdot \frac{\partial^2 t_v(x,t)}{\partial x^2} + \frac{p_{top}(t_v(x,t)) - s \cdot \alpha_v \cdot \left( t_v(x,t) - t_s(x,t) \right)}{v} - \frac{\rho_v \cdot c_{pv} \cdot \left( t_v(x,t) - t_a \right)}{\tau_v} \qquad (16)$$

with the initial condition

$$t_v(x,0) = 0 \qquad (17)$$

and the boundary conditions

$$\left. \frac{\partial t_v(x,t)}{\partial x} \right|_{x \to 0} = 0 \qquad (18)$$

$$\left. \frac{\partial t_v(x,t)}{\partial x} \right|_{x \to d} = 0 \qquad (19)$$

where:
t_v: air temperature [°c]
ρ_v: air density [kg/m³]
c_pv: air specific heat capacity [j/(kg·k)]
s: surface of the walls [m²]
v: indoor volume of air [m³]
p_top: heating output [w]
τ_v: time constant of ventilation [s]

the solutions to the equations give the heating output as a function of time, with a sinusoidal outdoor temperature, as in the graph below:

fig. 6: heating output

from the part of the solution where transient effects no longer apply, we extract a period of one day, $p_{top} = a + b \cdot \sin(\omega t + \varphi)$. if such heat losses are implemented into the heat pump model described above, we can obtain the dependencies of the variables needed to evaluate the effects of the thermal masses on the heat pump. in the figures below, "red" is the power input of the heat pump, and "black" is the power input of the water boiler.

variant a: outdoor temperature −10 to +10 °c, sinusoidal, without thermal mass

fig. 7: power inputs, variant a

variant b: outdoor temperature −10 to +10 °c, sinusoidal, with thermal mass

fig. 8: power inputs, variant b

variant c: outdoor temperature 0 °c

fig. 9: power inputs, variant c

if the power inputs are integrated, there is little difference between variant b (reality) and variant c (5.8 %). however, since integrating the full model including thermal mass into our heat pump circuit model would increase the numerical flaws and greatly reduce the stability and reliability of the outcome, we chose between the models according to variant a (no thermal mass) and variant c (the thermal mass is so great that it would negate all changes of temperature during one day). i have chosen to use a curve of input temperatures that will simulate a system equivalent to variant c. it is very close to reality, and can be solved more precisely with numerical methods in our system of equations.
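as a minimal illustration of eqs. (8)–(11), the sketch below solves the steady operating point for given heat losses: eqs. (10) and (11) fix t_1 and t_wd, eq. (8) then gives the heat pump's electrical input up to t_wh, and eq. (9) the boiler's share. the cop is held constant here for simplicity (in the model above it varies with the temperatures), and all numerical inputs are illustrative assumptions.

```python
def operating_point(q_celk, cop=5.2, m2=0.3, cp=4180.0,
                    k_heating=600.0, tv=21.0, twh_max=35.0):
    """steady-state solution of eqs. (8)-(11) for total heat losses q_celk [w]."""
    dt = q_celk / (m2 * cp)                # eq. (10): t1 - twd
    mean = tv + q_celk / k_heating         # eq. (11): (t1 + twd) / 2
    t1, twd = mean + dt / 2.0, mean - dt / 2.0
    twh = min(t1, twh_max)                 # the heat pump heats only up to twh
    pel_hp = m2 * cp * (twh - twd) / cop   # eq. (8)
    pel_b = m2 * cp * (t1 - twh)           # eq. (9), the boiler tops up the rest
    return t1, twd, pel_hp, pel_b

t1, twd, p_hp, p_b = operating_point(q_celk=6000.0)
print(f"t1 = {t1:.1f} c, twd = {twd:.1f} c, p_hp = {p_hp:.0f} w, p_boiler = {p_b:.0f} w")
```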
3 output values of the heat pump circuit model (for d_iz = 50 mm)

if the outdoor temperature is stable, the input powers of the heat pump (red) and the water boiler (black) are as follows:

fig. 10: power inputs in stable conditions

for choosing the heating period of the year we take the long-term average temperature and compare it with 15 °c.

fig. 11: average temperature

for the real effects on a heat pump, we add some oscillations to the outdoor temperature.

fig. 12: average temperature with oscillations

this gives us a good display of the consumed power of the heat pump and the water boiler in the course of a year. it shows us that the heat pump (red) covers almost all heat losses. the water boiler (black) needs to be switched on only when there is a long period of cold weather.

fig. 13: power consumptions in the course of a model year

we also obtain a very significant cop value (red):

fig. 14: cop in the course of a model year

results

with our input parameters the heat pump worked throughout the year with a cop of 4.82. the heat pump covers almost 100 % of heat consumption, with 5 900 kwh per year consumed for heating. the simplified models give us different results:

the "merchant" model, which is often used by heat pump sellers, gives us a cop of 5.2 and an energy consumption of 5 400 kwh. it calculates with an average temperature during the heating season.

the "normative" model, which uses t_a = −15 °c for the working parameters for heating, gives us a cop of 2.5 and a consumed power of circa 11 000 kwh per year.

this shows that it is worthwhile to investigate heat pump utilization more deeply. appropriate use of heat pumps should be considered on the basis of as much factual information as possible.

acknowledgement

the research described in this paper was supervised by doc. dr. jan kyncl.

references

[1] http://en.wikipedia.org/wiki/heat_pump
[2] stiebel, e.: heat pump catalogue 2008.

richard najman
e-mail: najmaric@fel.cvut.cz
dept. of electroenergetics
czech technical university
technická 2, 166 27 praha, czech republic

a comparison of power quality controllers

petr černek, michal brejcha, jan hájek, jan pígl

dept. of electrotechnology, czech technical university in prague, technická 2, 166 27 praha 6, czech republic
corresponding author: cernepet@fel.cvut.cz

abstract

this paper focuses on certain types of facts (flexible ac transmission system) controllers, which can be used for improving the power quality at the point of connection with the power network. it focuses on types of controllers that are suitable for use in large buildings, rather than in transmission networks. the goal is to compare the features of the controllers in specific tasks, and to clarify which solution is best for a specific purpose. it is in some cases better and cheaper to use a combination of controllers than a single controller. the paper also presents the features of a shunt active harmonic compensator, which is a very modern power quality controller that can be used in many cases, or in combination with other controllers. the comparison was made using a matrix diagram that resulted from mind maps and other analysis tools. the paper should help engineers to choose the best solution for improving the power quality in a specific power network at distribution level.

keywords: power quality, controllers, matrix diagram, facts.

1 introduction

in recent decades, power quality has become topic number one.
the target has been to obtain the most effective conditions for power transmission between power sources and users. the same demands arise in the construction of intelligent buildings and large public buildings. there are several ways to ensure power quality in buildings of this kind, and this paper tries to clarify them and compare them with each other. mind maps were used at the beginning of the analysis to define the important disturbances and the important controllers for the distribution level of networks. the disturbances are presented in section 2. a comparison of the controllers themselves is made in section 3. the analysis points to major considerations when using controllers and should help engineers to select an appropriate solution. a swot analysis of the shunt active harmonic filter (ahf) is also presented in section 3. the ahf is a very modern way to ensure power quality, so it is studied in greater detail.

2 power quality in power distribution networks

there are several power quality parameters that are important for transmission networks, e.g. the static and dynamic stability of the power system, voltage stability, frequency stability, etc. a very important issue here is the loop flow of power through parallel transmission lines. only some of these parameters are important at distribution level. users are not able to control the whole transmission network, so they cannot have much effect on the frequency stability, the power flow, or the stability of the power network. however, users are limited by the demands of their electric power supplier and by the demands of their power system. if a power network is being designed for a hospital, for example, the project engineer will be very concerned about supply continuity and voltage stability. another parameter that has to be taken into account is the minimum power factor defined by the power suppliers. four important power quality parameters at distribution level were determined using mind maps. these parameters are graphically represented in figure 1.

figure 1: important power quality parameters at distribution level.

2.1 power factor

the power factor is defined as

$$pf = \frac{P}{S},$$

where P is the active power and S is the apparent power. poor quality of this parameter is caused by:
• an inductive or capacitive load, which creates reactive power,
• current harmonics (a nonlinear load), or
• an unbalanced load in three-phase systems.

it is obvious that the power factor parameter is affected by the load and its currents. to compensate the effects mentioned above, the load has to be made resistive, linear and balanced. shunt compensators are therefore suitable for this purpose.
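for a nonlinear load under sinusoidal supply voltage, these effects combine in the standard relation pf = cos φ1 / sqrt(1 + thd²), i.e. the displacement factor of the fundamental times a distortion factor. a minimal sketch, with invented example numbers:

```python
import math

def true_power_factor(cos_phi1, thd_i):
    """power factor of a nonlinear load under sinusoidal supply voltage:
    displacement factor times the distortion factor 1/sqrt(1 + thd^2)."""
    return cos_phi1 / math.sqrt(1.0 + thd_i ** 2)

# nearly resistive load (cos phi1 = 0.98) drawing current with 40 % thd
print(f"pf = {true_power_factor(0.98, 0.40):.3f}")   # about 0.91
```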
2.2 voltage stability

the voltage in a power system may vary due to changes in the load. low voltage at a point of heavy load is caused by a voltage drop on the series impedance of the power network. similarly, at low loads the voltage can rise and place higher stress on the load. this effect can be compensated by changing the ratio of the distribution transformer or by changing the series impedance of the network. compensation can also be made by changing the character of the load. this effect is shown in figure 2. shunt compensators can be used for this purpose. if low voltage is detected, reactive power should be supplied as a countermeasure (capacitive load).

figure 2: the dependency of voltage on the character of the load.

2.3 continuity of supply

the primary sources of voltage dips are load switching events and short circuits occurring in the power network. short circuit events also lead to an unbalanced supply voltage. moreover, the system can be disconnected from the supply by the fuse due to short circuits. electric power users within the disconnected segment of the network suffer an interruption of supply. voltage dips and interruptions are problems on the power supply side. much standard electrical equipment has to be designed to overcome a short interruption of the power supply. the duration of the effect is therefore a major parameter. for example, computers and much other equipment can deal with a supply interruption lasting more than 1 ms. however, some special loads, e.g. measuring systems, are very sensitive to voltage changes and therefore have to be controlled.

2.4 voltage and current imbalance

unbalanced phase voltages are closely connected to the phase currents. if the phase currents are unbalanced, there is a different voltage drop in each phase. as a result, the phase voltages are also unbalanced. in three-wire systems, unbalanced loads create reactive power even though the loads are resistive. this phenomenon is less significant in a four-wire system. however, in large buildings it can be useful to compensate unbalanced loads in order to ensure proper utilization of the power network. imbalances at the fundamental frequency can be caused by negative sequence and zero sequence components. however, a zero sequence component can only appear in a three-phase grounded system, which induces current flow through the neutral wire. this part of the imbalances can be compensated by transformers in connection d/yn or y/yn.

figure 3: swot analysis of the shunt active harmonic filter.

3 a comparison of controllers

various types of controllers are compared in figure 4. the controllers have been divided into two groups according to their connection to the power network. only controllers intended for use in a power distribution network were selected for the comparison:

ahf — an active harmonic filter is a power converter that is able to compensate harmonics and reactive power. it is used as a shunt or as a series compensator, according to whether current or voltage is being controlled. an ahf can also improve the stability of the network or the efficiency of passive filters.

statcom — a static compensator is a solid state synchronous condenser connected in shunt with the power network. the output current is adjusted to control the reactive power. the output waveform is close to a sine wave, so it injects only a small amount of harmonics. if connected to a source of power, it can also provide active ac power and can compensate the imbalances.

svc — a static var compensator is a thyristor-controlled reactor connected in parallel with a capacitor. it changes the firing angle in order to control the reactive power in the power network. it is one of the most common types of compensation.

pass. filter — this column includes mechanically switched capacitor banks and passive lc filters. this type of compensation has only a low ability to adapt to changes in the power network. the main advantage is that the solution is very simple and can fit stable loads. some other controllers can be combined with lc filters, so that the compensation better meets the requirements.

dvr — a dynamic voltage restorer is a converter which can protect sensitive loads from all supply-side disturbances other than outages. it operates like a series voltage source. dvrs are usually less costly than ups solutions.
ups — uninterruptible power supplies are the only type of equipment that is able to compensate outages. there are many technologies, based on batteries, flywheels etc. depending on the solution, it can protect the loads from all supply-side disturbances. the main disadvantages are that ups occupy a relatively large area and are relatively expensive.

as can be seen from the analysis in figure 4 and from the diagram in figure 1, shunt compensators are suitable for load-side disturbances, and series compensators are suitable for supply-side disturbances. outages are the most important supply-side disturbances in most cases. ups devices are the only equipment that can overcome this problem. in a suitable solution, a ups can protect sensitive loads very well. other series equipment, e.g. an ahf or a dvr, is suitable only in special cases, e.g. for some voltage-sensitive loads, such as loads requiring constant power. this can be important in industrial applications.

for users, the power factor, which is given by their loads, is an important consideration. power suppliers require this factor to be kept close to 1. as stated above, the power factor depends on the reactive power, the harmonics and, in the case of three-wire systems, also on imbalances. these effects are caused especially by the currents, because the supply voltages are sinusoidal and are in most cases balanced.

figure 4: abilities of controllers.

in the simplest case, only inductive reactive power has to be compensated. a capacitor bank, an svc or a statcom is suitable for this purpose, depending on the required precision of the control. svc and statcom are very suitable in buildings that are equipped with solar cells or other sources of energy. intelligent buildings are crammed with electrical equipment and devices, which are major sources of harmonic disturbances. harmonics decrease the power factor and increase the power loss in the network. a passive filter, or a combination of a passive filter with a series active harmonic filter, can be used to compensate the current harmonics. a disadvantage is that this solution is relatively complex.

the best solution for improving the power factor is to use a shunt active harmonic filter. this equipment alone can meet many of the requirements for power quality. a swot analysis of active harmonic filters is presented in figure 3. the main advantage is that the reactive power, the current harmonics and the imbalances can be compensated by a single device. however, in many cases it is suitable to combine the ahf with other instruments to suppress its weaknesses. if the ahf is combined with a capacitor bank, the main part of the reactive power is compensated by the capacitors and the ahf can be rated lower. the resulting solution is then cheaper.

4 conclusion

with the expansion of intelligent buildings, more emphasis will be placed on power quality. some equipment for ensuring power quality has been presented in this paper, and comparisons have been made. the shunt active harmonic filter seems to be most promising, and can be used in most of the cases presented here. the paper presents one aspect of our research on constructing and testing this type of compensation equipment.

acknowledgements

the research presented in this paper was supported by "sgs čvut 2012: power conditioners — prototype", under grant no. ohk3-017/12.
wireless telegraphy at the german universal exhibition in ústí nad labem in 1903 [1]

t. okurka

this paper focuses on the transmission of wireless telegraphy between ústí nad labem and teplice during the german universal exhibition in ústí nad labem in 1903. though this was not the first transmission of wireless telegraphy in austria, as some newspapers claimed, it was probably the first transmission of wireless telegraphy over a large distance presented to the public in bohemia. the idea of presenting wireless telegraphy at this exhibition was promoted by wilhelm biscan, director of the elektrotechnikum school in teplice at that time. the telegraph was installed by the allgemeine elektrizitäts-gesellschaft in berlin. the device used the slaby-arco system.

keywords: history of technology, electrical engineering, wireless telegraphy, exhibitions.

fig. 1: exhibition grounds in ústí nad labem in 1903
fig. 2: pavilion of the wireless telegraphy in ústí nad labem
fig. 3: interior of the pavilion
fig. 4: profile of the country between teplice and ústí nad labem
fig. 5: connections at the transmitting station

industrial and universal exhibitions played a significant role before world war i. this refers both to world exhibitions, which became a global phenomenon in the second half of the 19th century, and to minor national and regional exhibitions. these were of high importance for the economy, since they fostered commercial intercourse. these events attracted large numbers of people and provided a source of knowledge and entertainment. political importance can also often be attributed to them. dissemination of scientific and technological progress was one of the main aims of the exhibitions. each of the world exhibitions presented some novelties of science and technology (which were often revolutionary) to the general public for the first time. the innovations included bell's telephone in philadelphia 1876, edison's phonograph in paris 1889, and radio in chicago 1893. "in spite of the haphazard and often purely fortuitous method of gaining knowledge by these exhibitions, it is doubtful that a more efficient promoter of technology could have been devised at the time." [2] this applies in the first place to the world expositions, but national and regional exhibitions also in many cases presented noteworthy innovations. wireless telegraphy at the german universal exhibition in ústí nad labem in 1903 is a good example.

the ústí nad labem exhibition was held by the local association of craftsmen. it did not focus solely on crafts, but also comprised industry, arts, agriculture, transportation, education, etc. it was one of the largest exhibitions before world war i in the german regions of bohemia. nine hundred exhibitors took part and six hundred thousand visitors attended the exhibition in the course of three months. it attracted extensive attention from the local press and various german newspapers in bohemia, austria and abroad.
however, the czech press – due to high national feelings in bohemia – neglected this german exhibition almost completely.

the idea of presenting wireless telegraphy at the exhibition in ústí nad labem was promoted by wilhelm biscan, founder and director of the elektrotechnikum school in teplice [3]. the telegraph was installed by the allgemeine elektrizitäts-gesellschaft in berlin. at the exhibition grounds in ústí nad labem a small pavilion was built with two high poles. the second station was situated in the elektrotechnikum in teplice. the distance between these stations was 14 km. in the course of the exhibition there was a successful wireless telegraph connection between these two towns. the wireless telegraphy pavilion was an object of extraordinary attention among visitors and the press; it was one of the highlights of the exhibition. the device used the slaby-arco system, which was a rival to the marconi system. the purpose of the pavilion was to show the principle of wireless telegraphy to the visitors. mr. biscan explained it, for example, to the bohemian governor karl coudenhove [4].

at the opening ceremony on june 20th, the organizers read out a message of greetings from teplice. this was said to be the first wireless telegram sent in austria [5]. that grandiose claim of primacy was, however, somewhat exaggerated. it was certainly not the first successful transmission of wireless telegraphy in austria. successful relays had been transmitted earlier by siemens & halske in vienna, and there were also radiotelegraphic transmissions by the navy near pula [6]. besides, it is not sure that the transmission on the opening day of the exhibition was successful. the primacy of this telegram was questioned by the daily newspaper deutsche volkszeitung. however, this nationalistic and antisemitic newspaper is not a very reliable source of information. in an article under the title "humbug", the newspaper wrote that because of unprofessional procedures the device did not work, and the organizers just read out the text of the greeting message that they had known in advance. several days later, an expert from berlin had to come to ústí nad labem to repair the device. only then – according to deutsche volkszeitung – could "the first-ever wireless telegram in austria, on which so much rubbish was prattled by the jewish newspapers", be received [7].

however, it is not of great importance whether the device actually worked on the opening day. it is obvious that the radiotelegraphic connection between ústí nad labem and teplice worked in the following weeks. evidence of this can be found in the german newspapers in bohemia and in technical journals in vienna and abroad. the vienna zeitschrift für post und telegraphie affirmed that "the device at the exhibition grounds in ústí nad labem works perfectly" [8]. the journal of the vienna electrical association, zeitschrift für elektrotechnik, published a detailed description of the wireless telegraphy between ústí nad labem and teplice [9]. a shorter version of this article was published by the british journal the electrician [10]. these articles did not declare this relay to be the first wireless transmission in austria, but they appreciated that the relay succeeded although the district, on account of the various hills, was not very favourable for wireless telegraphy. information on the wireless telegraphy at the exhibition in ústí nad labem was also published by the czech technical journal technický obzor [11].
however, other czech newspapers and magazines neglected this success absolutely. there is no doubt that wireless telegraphy was successfully presented to the general public at the exhibition in ústí nad labem in 1903. however, present-day technical literature refers to the transmission between prague and carlsbad during the exhibition of the prague chamber of commerce in 1908 as the first presentation of wireless telegraphy to the general public in the bohemian lands [12]. the relay between ústí nad labem and teplice fell into oblivion. this is probably because it took place in the german part of bohemia. generally, czechs neglected such events in this region of the country. if a success like this could not be interpreted as a national event, no significance was attributed to it. this oblivion can also be explained by the fact that wireless telegraphy was not widely used after the presentation in ústí nad labem.

the wireless telegraphy pavilion at the german universal exhibition in ústí nad labem 1903 was probably the first place in the bohemian lands where a transmission of wireless telegraphy over a large distance was presented to the general public. before that, only experiments over a short distance had been carried out (for example, the successful experiments of františek křižík in the year 1902) [13]. there are enough reliable reports on the relays between ústí nad labem and teplice to provide testimony to the beginning of wireless telegraphy in the bohemian lands.

after the exhibition, wilhelm biscan used the same device for experiments on wireless transmission in a train. a non-moving wireless station was placed in the railway station in teplice, and a mobile wireless station was installed in a special carriage of a passenger train on the ústí nad labem – teplice – chomutov railway line. the apparatus in the train received a message from the railway station at a distance of 7 km. according to the technical journals, this was the first successful transmission in a train in the lands of the habsburg monarchy [14].

fig. 6: connections at the receiving station

references

[1] this paper is based on a chapter on wireless telegraphy in my thesis on the german universal exhibition in ústí nad labem 1903, which was published in the year 2005: okurka, t.: všeobecná německá výstava v ústí nad labem 1903 [german universal exhibition in ústí nad labem 1903], ústí nad labem 2005.
[2] ferguson, e. s.: expositions of technology 1851–1900. in: technology in western civilization. the emergence of modern industrial society earliest times to 1900. kranzberg, m., pursell, c. w. (eds.), vol. 1, new york 1967, p. 725.
[3] for more about this school see: 20 jahre elektrotechnikum 1895–1915. ein gedenkblatt als beilage zum programm der anstalt, teplitz, 1915.
[4] 20 jahre elektrotechnikum, p. 3–4.
[5] aussiger tagblatt, 22. 6. 1903, p. 4; bohemia, 21. 6. 1903, p. 5.
[6] see many articles in: zeitschrift für post und telegraphie, vol. 10, 1903; other successful experiments in austria are described on the websites of the company telekom austria: www.aon.at.
[7] deutsche volkszeitung, 9. 7. 1903, p. 4.
[8] zeitschrift für post und telegraphie, 10. 7. 1903, p. 159–160.
[9] zeitschrift für elektrotechnik, 4. 10. 1903, p. 569–570.
[10] the electrician, 13. 11. 1903, p. 136.
[11] technický obzor, 16. 12. 1903, p. 337.
[12] mayer, d.: pohledy do minulosti elektrotechniky [views of the history of electrical engineering], české budějovice 1999, p. 249; nemrava, a.: sdělovací technika [communication engineering]. in: studie o technice v českých zemích 1800–1918 [essays on technology in the bohemian lands 1800–1918], jílek, f. (ed.), vol. 4, praha 1986, p. 236.
[13] technický obzor, 25. 2. 1903, p. 56–57.
[14] zeitschrift für post und telegraphie, 1. 12. 1903, p. 271–272; technický obzor, 16. 12. 1903, p. 337; elektrotechnische zeitschrift, 3. 12. 1903, p. 996.

tomáš okurka
e-mail: tomas.okurka@seznam.cz
institute of economic and social history
charles university
faculty of philosophy and arts
celetná 20
116 42 prague 1, czech republic

allocation and location of transport logistics centres

d. mocková

the facility allocation problem sets out to determine the optimal number of facilities to be opened. based on a multiple criteria evaluation, the optimal location of the facilities is usually solved subsequently. several considerations, e.g. technical parameters, costs and finance, must be taken into account. economic analysis is carried out on the basis of the specific instance of the problem. let us assume that the number of potentially located facilities is known. then the problem of the optimal location of a given number of facilities in a network is referred to as the facility location problem. the solution to the problem is a set of facilities optimally located in an area such that this area is fully covered by the required services that the facilities provide. an example of a real-life problem of this type is the location of logistics centres.

keywords: allocation tasks, set of criteria, multiple criteria evaluation, location tasks, heuristic method, genetic algorithm, fitness function.

1 allocation task

the problem of allocation tasks lies in selecting an optimal number of logistics centres to be located subsequently on the basis of a multiple criteria evaluation. the difficulty of multiple criteria evaluation tasks results not only from the number of evaluation criteria, but also from how these criteria are expressed and from the degree to which they are, by their nature, expressed in various units of measurement. it is not uncommon for there to be a mixed set of criteria, where some criteria are quantitative, i.e. expressed numerically, and others are of a qualitative nature (expressed in a verbal description).

decision making is a basic managerial activity, where a bad decision may be a key cause of a business failure. the importance of decision making depends directly on the level of resources (primarily financial) that are closely connected with the decision making. the process of selecting feasible options from a set of proposed options forms a decision-making process and is a part of a broader decision-making task, namely selecting the best option.

1.1 elements in the decision-making process

the key elements in the decision-making process include: the decision-making objective, the subject and the object of the decision, evaluation criteria, decision-making options and their outcomes, and states of the world.

1.1.1 decision-making objective(s)

we understand the decision-making objective in solving a decision-making problem as a certain state that we wish to attain by means of a solution to the decision-making problem. in our case, the only objective is a decision on the optimal number of logistics centres.

1.1.2 evaluation criteria

evaluation criteria are factors selected by the decision maker, serving for evaluating the advantageousness of individual decision-making options from the viewpoint of meeting the objectives of the decision-making problem that is being solved. the evaluation criteria are usually derived from the set objectives. selected evaluation criteria for allocation tasks:

• cost criterion
  1. one-off acquisition costs for a new logistics centre – direct material (equipping the depot with vehicles by purchase or leasing, with furniture, computers, mobile telephones, a fixed telephone line and other office equipment)
  2. monthly operating costs for a branch – direct wages, other direct costs, operating margin, administration margin, etc.
(basic wages, supplements and additional payments, bonuses and remunerations, operating expenses, depreciation charges, repair and maintenance fees, setting up a repair fund, transport and travel costs, contributions from wages, fuel costs, telephone charges, energy, insurance, fines, penalties, loan repayments, leasing)
  3. costs for providing the branch with the required supply of materials and spare parts – storage costs, funds tied up in stock
  4. environmental costs

the one-off costs and the capital tied up in supplies will grow along the curve with the growing number of depots; however, the operating costs will decrease as a result of smaller catchment areas.

• response times (speed) – by setting up another depot, the response period will be reduced due to the reduced size of the catchment areas. this will be reflected in the reduced average number of kilometres driven and, consequently, in decreased fuel costs.
• technology demands – equipping the depot with special vehicles, machinery and handling equipment.
• customer convenience.
• share of services in the public interest – fire-fighters, emergency services.
• energy requirements.
• geographical considerations.
• economic importance.
• social considerations – solution to unemployment.
• importance of the hub as a transit hub.
• importance of the hub from the viewpoint of resources – raw materials.
• importance of the hub from the viewpoint of customers.

1.1.3 decision-making subject

the decision-making subject (decision maker) is the person making the decision, i.e. the person selecting the option for implementation. the decision-making subject may be an individual or a group of people.

1.1.4 decision-making object

the decision-making object is, as a rule, understood as being the field for which the problem has been formulated, for which the objective of the solution has been set, and which the decision making concerns (a decision-making object may, for example, be to determine the reserve stocks of logistics centre warehouses, the equipment of logistics centres, financial provision for development, etc.).

1.1.5 decision-making options and their outcomes

the options for solving a problem represent, for the decision maker, possible courses of action that will lead to the solution of the problem or, as relevant, to the fulfillment of the objectives that have been set.
while many decision-making problems have given or known alternative solutions, there are many cases (especially complex decision-making problems) where the creation of options is time-consuming and requires a creative approach, demanding complex processing and searching for information. decision-making options are closely linked with their outcomes, which we can understand as being the anticipated impacts and effects of the options.

1.1.6 states of the world

states of the world (scenarios, risk situations) may be understood as future mutually exclusive situations that may occur following implementation of the option, and which influence the outcomes of the given option with regard to specific evaluation criteria.

1.2 solution methodology

• determining the decision-making object, subject and objective.
• determining the criteria for evaluating the options – information should be fully exploited in selecting the criteria. the primary consideration in setting the evaluation criteria is the objectives to be achieved by the solution to the decision-making problem. besides the objectives of the problem being solved, the selection of the evaluation criteria may also be supported by identifying the subjects whose interests, objectives, or needs may be affected by solving the problem or by choosing one of the options. in addition, searching for and clarifying the potential adverse impacts and effects of the options is also important. applying the above-mentioned criteria helps to eliminate at least some of the shortcomings that can arise in decision making.
• methods for setting criteria weightings – most methods for multiple criteria evaluation of options require first that weightings be set for the individual evaluation criteria, expressing numerically the importance of these criteria. the greater the importance of the criterion, the higher its weighting. in order to achieve comparability between the weightings of a set of criteria determined by different methods, these weightings are as a rule standardized so that the sum of the weightings is equal to one.
• generation of options – this is the most important stage in the decision-making problem, and the quality of the solution of the whole decision-making problem depends on it.
• evaluation of options and selection of the option intended for implementation – the final objective is to determine the option that best meets the objectives of the solution to the problem. the option intended for realization must be feasible. it is therefore necessary to exclude from the set of evaluated options those that are inadmissible. inadmissible options are those that: 1. do not meet some of the objectives of the solution to the decision-making problem, or 2. do not fulfil some of the limiting conditions.

1.3 multiple criteria decision making

multiple criteria decision making involves modeling decision-making situations containing a defined set of options and a set of criteria according to which the options will be evaluated. the outcome of the option evaluation process is that we can establish the preferential arrangement of the options, i.e. we can rank their overall advantageousness. the first place is occupied by the most advantageous option, i.e. the optimal option. determining the preferential ranking is in general a demanding process, the complexity of which grows with the increasing size of the set of options and with the increasing number of criteria. if a given criterion is of a quantitative nature, it is sufficient to rank the options by their descending or ascending values (according to whether this concerns a cost or a revenue type criterion). the complexity of multiple criteria evaluation of options is often dealt with by unjustified simplification of the task, where the number of evaluation criteria is reduced by neglecting less important criteria. a different, more acceptable approach to multiple criteria evaluation attempts to convert all the criteria into the same unit of measurement, which ensures that the individual criteria are enumerable, and thus that they can be converted to a single criterion. determining the preferential ranking of options often depends on the importance attributed by the decision maker to the individual evaluation criteria, i.e., it depends on the value hierarchy of the decision maker and his subjective appraisal. different decision makers may reach different preferential rankings of options.
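a common way to combine standardized weightings with mixed criteria is a simple weighted-sum evaluation over normalized criterion values. the sketch below is a minimal illustration of this idea; the criteria, weightings and scores are invented for the example and do not come from the study.

```python
def weighted_sum_ranking(options, weights, maximize):
    """rank options by a weighted sum of min-max normalized criterion values.
    the weightings are standardized to sum to one; cost-type criteria are inverted."""
    total = sum(weights)
    w = [x / total for x in weights]
    columns = list(zip(*(values for _, values in options)))  # values per criterion
    ranked = []
    for name, values in options:
        score = 0.0
        for j, v in enumerate(values):
            lo, hi = min(columns[j]), max(columns[j])
            norm = 0.5 if hi == lo else (v - lo) / (hi - lo)
            score += w[j] * (norm if maximize[j] else 1.0 - norm)
        ranked.append((name, round(score, 3)))
    return sorted(ranked, key=lambda item: -item[1])

# invented data: (monthly cost, response time [min], customer convenience [points])
options = [("site a", (120, 35, 7)), ("site b", (95, 50, 6)), ("site c", (140, 20, 9))]
print(weighted_sum_ranking(options, weights=[3, 2, 1], maximize=[False, False, True]))
```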
2 location tasks

common solution approaches are either exact methods or heuristics. the latter are preferred, and many approaches have been developed with the application of genetic algorithms. the advantage of this heuristics is that the elementary steps of the algorithm do not depend on the criterion, i.e. the algorithm works even though the criterion has been changed.

2.1 heuristics for locating transport logistics centres

2.1.1 methods for solving location tasks

the problems dealt with here belong to the category of combinatory tasks of discrete optimization. we can use two different approaches. the first approach comprises exact methods based on examining all possible options, determining the value of the criterion and selecting the optimal solution. this approach is time-consuming, and may be used for tasks that are not too extensive. however, such tasks are rare in practice. let us take 1000 nodes as the possible locations of logistics centres; the total number of possible locations is given by the formula

$$\binom{1000}{k} = \frac{1000!}{(1000 - k)! \, k!}, \qquad (1)$$

where k is the number of depots in the network.

another approach uses heuristic methods and procedures. the result is, as a rule, a suboptimal solution that may be significantly far from the optimal solution. there is, for example, a heuristic method that uses an iterative algorithm to determine the optimal location of k depots at the nodes of the network. the model is simplified (deterministic), with a known number of operating requirements, the quality criterion being the minimization of transport work or, as relevant, minimization of the time necessary for reaching each point in the network at which a logistics requirement may arise.

2.1.2 solving location tasks by means of genetic algorithms

a further heuristic method involves solving location tasks by means of genetic algorithms. the basic advantage of these algorithms is that they scan the area at several points at the same time, with concurrent information exchange between the scanned points. another advantage is that they find better solutions without needing to know the structure of the problem being solved. the algorithm, in solving a problem, works only with a string of ones and zeros and with the quality of such strings. the quality is discovered by using a decoding function. this function is the only mediator between the problem and the algorithm. the purpose of the algorithm is thus to obtain the best strings possible.

the whole algorithm is very simple. it is composed of only three operators – reproduction, crossover, and mutation. each of these operators uses random selection. the whole algorithm works by creating a new generation from an old generation. this means that after obtaining a new generation, via reproduction, crossover and mutation operations, it uses this generation as the basis for the next generation, i.e. it begins to run the procedure again. after many repetitions (e.g. one thousand), each intended to create an improved generation, the algorithm finds the best string obtained in the last generation. this string is declared the best string found.
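before turning to the operators in detail, it is worth quantifying the remark above about exact methods: formula (1) grows far too quickly for exhaustive enumeration. a quick evaluation for n = 1000 candidate nodes (python's math.comb computes the binomial coefficient exactly):

```python
import math

# the number of candidate solutions from formula (1) for n = 1000 nodes
for k in (3, 5, 10):
    print(f"k = {k}: {math.comb(1000, k):.3e} possible placements")
```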
the purpose of the algorithm is thus to obtain the best strings possible. the whole algorithm is very simple. it is composed of only three operators – reproduction, crossover, and mutation. each of these operators uses random selection. the whole algorithm works in by creating a new generation from an old generation. this means that after obtaining a new generation, via reproduction, crossover and mutation operations, it uses the same generation as the basis for the next generation, i.e. it begins to run the procedure again. after many repetitions to create an improved generation (e.g. one thousand repetitions), the algorithm finds the best string obtained in the last generation. this string is declared the best string found. representation of an individual, genetics operators an individual in our task is a selected subset of logistics centres, or depots (parameter k), of the total number of n nodes. one of the possible representations of the individual that presents itself is the natural display of a subset of depots, i.e. a field of integral numbers containing k elements; the numbers in the field move in the range 1 n. given the aspect of the nature of the task, the form of the mutation operator results quite clearly – one node in the subset of depots is randomly selected and replaced by a randomly selected node not present in the subset. from the viewpoint of implementing the mutation (as well as crossover) algorithm, it is more advantageous to select a representation that is commonly used for the implementation of the set in programming languages. this is the display of the characteristic function of the set. for the total number of n nodes, the individual is represented by a bit field containing n elements. if node i belongs to the selected subset of depots, the element (bit) in field i has the value 1, else 0. in the field, each individual will contain k ones. fig. 1 gives an example. in total we 32 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 50 no. 1/2010 1 1 110 000 65432 71 8 representation of a subset of depots ( )v1, v4, v5, v8 fig. 1: representation of an individual 1 0 0 1 1 0 0 1 1 2 3 4 5 6 7 8 (v1,v4,v5,v8) i=3 j=8 1 0 1 1 1 0 0 0 1 2 3 4 5 6 7 8 (v1,v3,v4,v5) 1 0 0 1 1 0 0 1 1 2 3 4 5 6 7 8 (v1,v4,v5,v8) i=1 j=4 � j=6 0 0 0 1 1 1 0 1 1 2 3 4 5 6 7 8 (v4,v5,v6,v8) fig. 2: principle of the mutation operation have n 8 nodes, designated as v1 to v8. let k 4. the subset of depots (v1, v4, v5, v8) will be represented as shown in fig. 1. the mutation operation was described briefly in the previous paragraph. from the subset, a node is selected randomly and is replaced by a node not present in the subset. the latter node is likewise selected on a random basis. in practice, mutation is realized in the following way (fig. 2): index i is randomly generated (naturally with a uniform distribution) into a field. this determines the allele (the position of the gene) that is to be changed. if the i gene (bit) is true (the node is in a subset), it is zeroed (the node is taken out of the subset), and if the i gene (bit) is zero (the node is not in the subset), it will be set to true (the node is entered in the subset). the second index j is generated in a random manner, and the allele of the second gene is determined. if the j bit has a value opposite to that originally held by the i bit, it will be inverted. the exchange of the two nodes is thus complete (fig. 2 – left). 
fitness

the quality criterion is the transport work. note that, in contrast to the usual convention in genetic algorithms (where a higher-quality individual receives a higher score), the raw criterion gives the better individuals a lower value. the selection algorithms presume that the probability of selecting the i-th individual into the next generation is proportional to its fitness value f_i, e.g.

$$p_i = \frac{f_i}{\sum_{j=1}^{n} f_j}. \qquad (2)$$

in order to minimize the transport work, the value dp was therefore converted linearly to the fitness value according to the following relation [1]:

$$fitness = 1 + \frac{(s - 1)(avg - dp)}{avg - best}, \qquad (3)$$

where avg is the average value of the transport work in a generation and best is the best value of the transport work in a generation. the parameter s (usually in the range 1.2–2) controls the selection pressure. in order that individuals having a transport work value above the average do not obtain a negative fitness value, s is alternatively selected, as per [1]:

$$s = 1 + \frac{avg - best}{worst - avg}. \qquad (4)$$

in this case the worst individuals have a zero fitness value for s. selection over the fitness values is then arranged by the roulette wheel method. with fitness transformed in this manner, the worst individuals in no case progress into the next generation, which may sometimes be a disadvantage (a bad individual may become good by mutation).

3 conclusion

allocation tasks need to be assessed and evaluated taking into account a considerable number of criteria, i.e. they are problems of a multiple criteria nature. in determining the optimal number of logistics centres it is necessary to take into account the required technical parameters of the centres, financial and cost considerations, etc. it is necessary to make an economic analysis based on the particular task assignment. allocation tasks are, by their nature, highly individual. we have therefore merely outlined the theoretical side of using multiple criteria decision making and the solution methodology.

location tasks serve for the distribution of logistics centres in networks. tasks of this type belong to the group of discrete optimization tasks. they may be solved by means of exact or heuristic methods. since these tasks offer an unmanageable number of candidate solutions, calculation using exact methods is abandoned and preference is given to compiling suitable heuristic methods, such as the application of genetic algorithms. such a heuristic has the huge advantage that the individual steps in the algorithm do not depend on a criterion, i.e. the algorithm still works in the event of a change to the criterion. in essence, the quality criterion is the greatest, but not the sole, problem in location tasks. this application was tested on a sample of approximately 2000 examples, and the quality of the solution was assessed by comparing it with the results from an exact method. the deviations of this heuristic are minimal, occurring in around 0.5 % of the total number of examples, whereby the deviating solution – by the fitness function value – is always the second best solution after the optimum.
3 Conclusion

Allocation tasks need to be assessed and evaluated taking into account a considerable number of criteria, i.e. they are problems of a multiple-criteria nature. In determining the optimal number of logistics centres it is necessary to take into account the required technical parameters of the centres, financial and cost considerations, etc., and to make an economic analysis based on the particular task assignment. Allocation tasks are, by their nature, highly individual. We have therefore merely outlined the theoretical side of using multiple-criteria decision making and the solution methodology.

Location tasks serve for distributing logistics centres in networks. Tasks of this type belong to the group of discrete optimization tasks. They may be solved by means of exact or heuristic methods. Since these tasks offer an unmanageable number of candidate solutions, calculation using exact methods is abandoned and preference is given to compiling suitable heuristic methods, such as the application of genetic algorithms. Such a heuristic has the great advantage that the individual steps of the algorithm do not depend on the quality criterion, i.e. the algorithm still works in the event of a change to the criterion; in essence, the quality criterion is the greatest, but not the sole, problem in location tasks. The application was tested on a sample of approximately 2000 examples, and the quality of the solution was assessed by comparing it with the results of an exact method. The deviations of this heuristic are minimal: in around 0.5% of the examples the fitness-function value deviates from the optimum, and in those cases the result is always the second-best solution.

4 References

[1] Mařík, V., Štěpánková, O., et al.: Artificial Intelligence (3, 4). Prague: Academia, 2001, 2003.
[2] Mocková, D.: Solution of Allocation-Location Tasks. Thesis, Prague: CTU – Faculty of Transportation Sciences, 2005.
[3] Volek, J.: Operational Research I. Pardubice, 2002.

Ing. Denisa Mocková, Ph.D., phone: +420 224 359 160, fax: +420 224 919 017, e-mail: mockova@fd.cvut.cz, Czech Technical University in Prague, Faculty of Transportation Sciences, Horská 3, 128 03 Prague 2, Czech Republic

Current exchanges for reducible higher spin modes on AdS
A. Fotopoulos, M. Tsulaia

Abstract: We show how to decompose a Lagrangian for reducible massless bosonic higher spin modes into the ones describing irreducible (Fronsdal) higher spin modes on a d-dimensional AdS space. Using this decomposition we construct a new nonabelian cubic interaction vertex for reducible higher spin modes and two scalars on AdS from the already known vertex which involves irreducible (Fronsdal) modes.

Keywords: gauge symmetry, AdS-CFT correspondence, string field theory.

Higher spin gauge theories (see [1]–[2] for recent reviews) are usually formulated either in frame-like [3]–[4] or metric-like [5]–[20] approaches. Recently, several interesting cubic vertices have been constructed in the metric-like approach [8, 9, 10, 11]. Bearing in mind a possible application of the reducible higher spin modes (described by the so-called "triplet" [20]) to string theory [21]–[22] and to the AdS/CFT correspondence [23], we consider the problem of the cubic interaction of a triplet on an AdS space. In particular, we study the cubic interaction of a triplet with two scalar fields. The main result of this paper is twofold. Firstly, we show that the procedure derived in [17] for decomposing the free Lagrangian for reducible massless bosonic higher spin modes in a flat space-time also works for an arbitrary-dimensional AdS space.
The second and more important result is that after this decomposition one can use the cubic vertex¹ of [16], which describes an interaction of irreducible (Fronsdal) higher spin modes with two scalars, to obtain an interaction vertex for reducible higher spin modes with two scalars. Obviously, this technique can be applied not only to the particular vertex given in [16], but also to the construction of more complicated interaction vertices on AdS, following the method given in [14]. The advantage of this approach is that the construction of interaction vertices for triplets on AdS is often technically complicated, due to repeated commutators between covariant derivatives; the double-tracelessness condition for irreducible higher spin modes makes the problem at hand considerably simpler.

Let us start from a free Lagrangian describing the propagation of reducible massless higher spin modes on a d-dimensional AdS space. It contains a field φ_{μ1,...,μs}(x) of rank s, a field C_{μ1,...,μs−1}(x) of rank s−1 and a field D_{μ1,...,μs−2}(x) of rank s−2, and has the form [13] (see also [2] for details of the construction)

$$\mathcal{L} = -\tfrac{1}{2}(\nabla_\mu \varphi)^2 + s\,\nabla\!\cdot\!\varphi\, C + s(s-1)\,\nabla\!\cdot\! C\, D + \tfrac{s(s-1)}{2}(\nabla_\mu D)^2 - \tfrac{s}{2} C^2 + \tfrac{s(s-1)}{2L^2}(\varphi')^2 - \tfrac{s(s-1)(s-2)(s-3)}{2L^2}(D')^2 - \tfrac{4s(s-1)}{L^2} D\varphi' - \tfrac{1}{2L^2}\big[(s-2)(d+s-3)-s\big]\varphi^2 + \tfrac{s(s-1)}{2L^2}\big[s(d+s-2)+6\big] D^2. \qquad (1)$$

The symbol ∇· means divergence, while ∇ is the symmetrized action of ∇_μ on a tensor. The symbol ′ means that we take the trace of a field. Multiplication of a tensor by the metric g implies symmetrized multiplication, i.e., if A is a vector A_μ we have gA = g_{(μν} A_{ρ)} = g_{μν} A_ρ + g_{μρ} A_ν + g_{νρ} A_μ. This Lagrangian is invariant under gauge transformations with a parameter λ_{μ1,...,μs−1}(x):

$$\delta\varphi = \nabla\lambda, \qquad \delta C = \Box\lambda + \frac{(s-1)(3-s-d)}{L^2}\,\lambda + \frac{2}{L^2}\, g\,\lambda', \qquad \delta D = \nabla\!\cdot\!\lambda. \qquad (2)$$

Let us note that the field C(x) has no kinetic term and can be eliminated via its own equations of motion, to obtain

$$\mathcal{L} = -\tfrac{1}{2}(\nabla_\mu \varphi)^2 + \tfrac{s}{2}(\nabla\!\cdot\!\varphi)^2 + s(s-1)\,\nabla\!\cdot\!\nabla\!\cdot\!\varphi\, D + s(s-1)(\nabla_\mu D)^2 + \tfrac{s(s-1)(s-2)}{2}(\nabla\!\cdot\! D)^2 + \tfrac{s(s-1)}{2L^2}(\varphi')^2 - \tfrac{s(s-1)(s-2)(s-3)}{2L^2}(D')^2 - \tfrac{4s(s-1)}{L^2} D\varphi' - \tfrac{1}{2L^2}\big[(s-2)(d+s-3)-s\big]\varphi^2 + \tfrac{s(s-1)}{2L^2}\big[s(d+s-2)+6\big] D^2. \qquad (3)$$

Now we would like to decompose this Lagrangian in terms of irreducible (Fronsdal) [6] modes, following the procedure given in [17] for a Minkowski space. Let us start with the simplest example of an s = 2 triplet, which contains the fields φ_{μν}(x), C_μ(x) and D(x), and make the ansatz

$$\varphi_{\mu\nu} = \psi_{\mu\nu} + \frac{1}{d-2}\, g_{\mu\nu}\,\psi, \qquad \varphi' - 2D = \psi. \qquad (4)$$

Inserting these expressions back into the Lagrangian (3), for s = 2 one obtains

$$\mathcal{L} = -\tfrac{1}{2}(\nabla_\mu \psi_{\rho\sigma})^2 + (\nabla^\nu \psi_{\nu\mu})^2 + \psi'\,\nabla^\mu\nabla^\nu \psi_{\mu\nu} + \tfrac{1}{2}(\nabla_\mu \psi')^2 - \tfrac{1}{2(d-2)}(\nabla_\mu \psi)^2 + \tfrac{1}{L^2}(\psi_{\mu\nu})^2 + \tfrac{d-3}{L^2(d-2)}\,\psi^2 + \tfrac{d-3}{2L^2}(\psi')^2. \qquad (5)$$

Therefore, the initial Lagrangian (3) has been decomposed into a sum of two Fronsdal Lagrangians: one for the s = 2 field ψ_{μν}, with the gauge transformation law δψ_{μν} = ∇_μ λ_ν + ∇_ν λ_μ, and one for a gauge-invariant scalar ψ.

Talk given at the XIXth International Colloquium on Integrable Systems and Quantum Symmetries, Prague, Czech Republic, June 17–19, 2010.

¹ Let us point out that the method given in [14] describes the construction of nonabelian cubic interaction vertices; see also [16] for some explicit nonabelian examples. A particular example of an abelian vertex given in [15] is in some sense a "degenerate" solution of the method, where however the abelian property is maintained in a nontrivial way, due to the structure of the ghost terms.
Let us describe this procedure for the spin-4 triplet, since in this case both a constraint on the parameter of gauge transformations and an off-shell constraint on the gauge field arise. Let us use the substitution [17]

$$\varphi_{(4)} = \psi_{(4)} + \frac{1}{d+2}\, g\,\psi_{(2)} + \frac{1}{d(d-2)}\,(g)^2\,\psi_{(0)},$$
$$D = \frac{1}{2}\Big[\,\psi'_{(4)} + \frac{2}{d+2}\,\psi_{(2)} + \frac{1}{d+2}\, g\,\psi'_{(2)} + \frac{2}{d(d-2)}\, g\,\psi_{(0)}\,\Big]. \qquad (6)$$

The field ψ_(4) is doubly traceless and transforms under the gauge transformations as

$$\delta\psi_{(4)} = \nabla\tilde\lambda, \qquad \tilde\lambda = \lambda - \frac{1}{d+2}\,\eta\,\lambda'. \qquad (7)$$

Inserting these expressions into the Lagrangian (3), one can see again that it decomposes into the sum of the Fronsdal modes with spins 4, 2 and 0, described by the fields ψ_(4), ψ_(2) and ψ_(0). One can further generalize this procedure to an arbitrary spin. In particular, take

$$\varphi = \sum_{k=0}^{[s/2]} \tilde\rho_k(d,s)\,(g)^k\,\psi_{(s-2k)},$$
$$D = \frac{1}{2}\sum_{k=0}^{[s/2]-1} \tilde\rho_k(d,s)\,(g)^k\,\psi'_{(s-2k)} + \sum_{k=1}^{[s/2]} \tilde\rho_k(d,s)\,(g)^{k-1}\,\psi_{(s-2k)} \qquad (8)$$

and

$$\tilde\lambda_{(s-1-2k)} = \sum_{q=0}^{[s/2]} \rho_q(d, s-2k-1)\,(g)^q\,\lambda^{[q+k]}_{(s-1)} \qquad (9)$$

with

$$\rho_q(d-2, s) = \frac{(-1)^q\,(d+2(s-q-3))!!}{(d+2(s-3))!!}, \qquad \tilde\rho_k(d,s) = \frac{(d+2(s-2k-2))!!}{(d+2(s-k-2))!!}, \qquad (10)$$

where [q+k] denotes the number of traces. Finally, one can show that the normalization factor for the propagator of each individual Fronsdal mode, i.e. the inverse of the prefactor of the (∇_μ ψ_(s−2k))² terms multiplied by 2, is

$$Q(s,k,d) = \frac{2^k\, k!\,(s-2k)!}{s!\,\tilde\rho_k(d,s)}. \qquad (11)$$

Now let us build a cubic interaction vertex of a higher spin triplet with two scalars on AdS. To this end, let us use the corresponding vertex for an individual Fronsdal mode [16],

$$\mathcal{L}^{00s}_{int} = \psi^{(s)}\cdot J_s + \Big[\frac{s-1}{6L^2}\big[2s^2 + (3d-4)s - 6\big] - \frac{s-2}{L^2}\Big]\,\psi'^{(s)}\cdot J_{s-2}, \qquad (12)$$

where

$$J^{1;2}_{s-2q} = \sum_{r=0}^{s-2q} C^r_{s-2q}\,(-1)^r\,(\nabla_{\mu_1}\ldots\nabla_{\mu_r}\phi_1)\cdot(\nabla_{\mu_{r+1}}\ldots\nabla_{\mu_{s-2q}}\phi_2). \qquad (13)$$

Therefore, multiplying the interaction vertices (12) by the appropriate factor (11) and adding them to the free Lagrangian (3), one finds the expression for a cubic Lagrangian describing the interaction of reducible higher spin modes with two scalars on AdS. Then one can perform a current–current exchange procedure following the lines of [18, 19, 17]. The coefficients (10) and the normalization (11) are straightforward to tabulate numerically; a small sketch follows.
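As an aside (not part of the original paper), the double-factorial coefficients of Eq. (10) and the normalization (11) can be evaluated in a few lines; for s = 4 the sketch below reproduces the coefficients 1, 1/(d+2) and 1/(d(d−2)) appearing in Eq. (6).

```python
from math import factorial

def dfact(n):
    """Double factorial n!!, with n!! = 1 for n <= 1."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result

def rho_tilde(k, d, s):
    """The coefficient of g^k psi_(s-2k) in Eq. (8), from Eq. (10)."""
    return dfact(d + 2 * (s - 2 * k - 2)) / dfact(d + 2 * (s - k - 2))

def q_norm(s, k, d):
    """Propagator normalization Q(s, k, d) of Eq. (11)."""
    return (2 ** k * factorial(k) * factorial(s - 2 * k)
            / (factorial(s) * rho_tilde(k, d, s)))

# Spin-4 check against Eq. (6), e.g. in d = 6:
# rho_tilde(0, 6, 4) = 1, rho_tilde(1, 6, 4) = 1/8  (= 1/(d+2)),
# rho_tilde(2, 6, 4) = 1/24 (= 1/(d(d-2)))
```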
Acknowledgement

It is a pleasure to thank A. P. Isaev, S. O. Krivonos and A. O. Sutulin for valuable discussions. The work of A. F. was supported by an INFN postdoctoral fellowship and partly supported by the Italian MIUR-PRIN contract 20075ATT78. The work of M. T. has been supported by an STFC rolling grant ST/G00062X/1.

References

[1] Vasiliev, M. A.: Fortsch. Phys. 52, 702 (2004) [arXiv:hep-th/0401177]. Bekaert, X., Cnockaert, S., Iazeolla, C., Vasiliev, M. A.: [arXiv:hep-th/0503128]. Sorokin, D.: AIP Conf. Proc. 767, 172 (2005) [arXiv:hep-th/0405069]. Bouatta, N., Compere, G., Sagnotti, A.: [arXiv:hep-th/0409068]. Bekaert, X., Buchbinder, I. L., Pashnev, A., Tsulaia, M.: Class. Quant. Grav. 21 (2004) S1457 [arXiv:hep-th/0312252]. Campoleoni, A.: arXiv:0910.3155. Francia, D.: J. Phys. Conf. Ser. 222, 012002 (2010) [arXiv:1001.3854]. Bekaert, X., Boulanger, N., Sundell, P.: [arXiv:1007.0435].
[2] Fotopoulos, A., Tsulaia, M.: Int. J. Mod. Phys. A 24, 1 (2009) [arXiv:0805.1346 [hep-th]].
[3] Fradkin, E. S., Vasiliev, M. A.: Nucl. Phys. B 291, 141 (1987); Annals Phys. 177, 63 (1987). Vasiliev, M. A.: Phys. Lett. B 243, 378 (1990); Phys. Lett. B 285, 225 (1992); Phys. Lett. B 567, 139 (2003) [arXiv:hep-th/0304049]; arXiv:0707.1085 [hep-th]; JHEP 0412 (2004) 046 [arXiv:hep-th/0404124]. Skvortsov, E. D., Vasiliev, M. A.: Nucl. Phys. B 756 (2006) 117 [arXiv:hep-th/0601095]. Sorokin, D. P., Vasiliev, M. A.: Nucl. Phys. B 809, 110 (2009) [arXiv:0807.0206 [hep-th]].
[4] Bandos, I. A., Lukierski, J., Sorokin, D. P.: Phys. Rev. D 61, 045002 (2000) [arXiv:hep-th/9904109]. Vasiliev, M. A.: Phys. Rev. D 66, 066006 (2002) [arXiv:hep-th/0106149].
[5] Fronsdal, C.: Phys. Rev. D 18, 3624 (1978). Berends, F. A., Burgers, G. J. H., van Dam, H.: Nucl. Phys. B 271, 429 (1986). Ouvry, S., Stern, J.: Phys. Lett. B 177, 335 (1986). Bengtsson, A. K. H., Bengtsson, I., Brink, L.: Nucl. Phys. B 227, 41 (1983); Nucl. Phys. B 227, 31 (1983). Hussain, F., Thompson, G., Jarvis, P. D.: Phys. Lett. B 216, 139 (1989).
[6] Fronsdal, C.: Phys. Rev. D 20, 848 (1979).
[7] Pashnev, A., Tsulaia, M. M.: Mod. Phys. Lett. A 12, 861 (1997) [arXiv:hep-th/9703010]. Buchbinder, I. L., Pashnev, A., Tsulaia, M.: Phys. Lett. B 523 (2001) 338 [arXiv:hep-th/0109067]; [arXiv:hep-th/0206026]. Buchbinder, I. L., Krykhtin, V. A., Pashnev, A.: Nucl. Phys. B 711, 367 (2005) [arXiv:hep-th/0410215]. Buchbinder, I. L., Krykhtin, V. A., Ryskina, L. L., Takata, H.: Phys. Lett. B 641 (2006) 386 [arXiv:hep-th/0603212]. Buchbinder, I. L., Galajinsky, A. V., Krykhtin, V. A.: [arXiv:hep-th/0702161]. Buchbinder, I. L., Krykhtin, V. A., Reshetnyak, A. A.: [arXiv:hep-th/0703049]. Buchbinder, I. L., Krykhtin, V. A., Lavrov, P. M.: Nucl. Phys. B 762 (2007) 344 [arXiv:hep-th/0608005]. Alfaro, J., Cambiaso, M.: [arXiv:0809.4298 [hep-th]]. Alkalaev, K. B., Grigoriev, M.: Nucl. Phys. B 835, 197 (2010) [arXiv:0910.2690]. Isaev, A. P., Krivonos, S. O., Ogievetsky, O. V.: J. Math. Phys. 49, 073512 (2008) [arXiv:0802.3781 [math-ph]].
[8] Metsaev, R. R.: [arXiv:hep-th/9810231]; Phys. Lett. B 419, 49 (1998) [arXiv:hep-th/9802097]; [arXiv:0712.3526 [hep-th]]; Nucl. Phys. B 759, 147 (2006) [arXiv:hep-th/0512342].
[9] Bekaert, X., Boulanger, N., Cnockaert, S.: JHEP 0601, 052 (2006) [arXiv:hep-th/0508048]. Boulanger, N., Leclercq, S., Cnockaert, S.: Phys. Rev. D 73 (2006) 065019 [arXiv:hep-th/0509118]. Boulanger, N., Leclercq, S.: JHEP 0611 (2006) 034 [arXiv:hep-th/0609221]. Boulanger, N., Leclercq, S., Sundell, P.: JHEP 0808, 056 (2008) [arXiv:0805.2764 [hep-th]].
[10] Zinoviev, Yu. M.: Nucl. Phys. B 785, 98 (2007) [arXiv:0704.1535 [hep-th]]; Class. Quant. Grav. 26, 035022 (2009) [arXiv:0805.2226 [hep-th]]; JHEP 0904, 035 (2009) [arXiv:0903.0262 [hep-th]]; [arXiv:1007.0158].
[11] Manvelyan, R., Mkrtchyan, K., Ruehl, W.: Nucl. Phys. B 836, 204 (2010) [arXiv:1003.2877]; [arXiv:1002.1358]; Nucl. Phys. B 826, 1 (2010) [arXiv:0903.0243 [hep-th]]. Manvelyan, R., Mkrtchyan, K.: Mod. Phys. Lett. A 25, 1333 (2010) [arXiv:0903.0058 [hep-th]].
[12] Bastianelli, F., Bonezzi, R.: JHEP 0903, 063 (2009) [arXiv:0901.2311 [hep-th]]. Bastianelli, F., Corradini, O., Latini, E.: JHEP 0811, 054 (2008) [arXiv:0810.0188 [hep-th]]. Corradini, O.: [arXiv:1006.4452]. Deser, S., Waldron, A.: Nucl. Phys. B 607, 577 (2001) [arXiv:hep-th/0103198]. Gover, A. R., Shaukat, A., Waldron, A.: arXiv:0812.3364 [hep-th]. Cherney, D., Latini, E., Waldron, A.: Phys. Lett. B 682, 472 (2010) [arXiv:0909.4578]. Marnelius, R.: [arXiv:0906.2084 [hep-th]].
[13] Sagnotti, A., Tsulaia, M.: Nucl. Phys. B 682 (2004) 83 [arXiv:hep-th/0311257].
[14] Buchbinder, I. L., Fotopoulos, A., Petkou, A. C., Tsulaia, M.: Phys. Rev. D 74 (2006) 105018 [arXiv:hep-th/0609082].
[15] Fotopoulos, A., Tsulaia, M.: Phys. Rev. D 76, 025014 (2007) [arXiv:0705.2939 [hep-th]].
[16] Fotopoulos, A., Irges, N., Petkou, A. C., Tsulaia, M.: JHEP 0710 (2007) 021 [arXiv:0708.1399 [hep-th]].
[17] Fotopoulos, A., Tsulaia, M.: JHEP 0910, 050 (2009) [arXiv:0907.4061 [hep-th]].
[18] Francia, D., Mourad, J., Sagnotti, A.: Nucl. Phys. B 773, 203 (2007) [arXiv:hep-th/0701163]; Nucl. Phys. B 804, 383 (2008) [arXiv:0803.3832 [hep-th]]. Sagnotti, A.: [arXiv:1002.3388].
[19] Bekaert, X., Joung, E., Mourad, J.: JHEP 0905, 126 (2009) [arXiv:0903.3338 [hep-th]].
[20] Francia, D., Sagnotti, A.: Class. Quant. Grav. 20 (2003) S473 [arXiv:hep-th/0212185]. Francia, D.: Phys. Lett. B 690, 90 (2010) [arXiv:1001.5003].
[21] Sagnotti, A., Taronna, M.: [arXiv:1006.5242]. Taronna, M.: [arXiv:1005.3061].
[22] Polyakov, D.: [arXiv:1005.5512], [arXiv:0910.5338].
[23] Giombi, S., Yin, X.: [arXiv:0912.3462], [arXiv:1004.3736].

Angelos Fotopoulos, e-mail: foto@to.infn.it, Department of Theoretical Physics, Turin University, Sezione di Torino, Via P. Giuria 1, I-10125 Torino, Italy

Mirian Tsulaia, e-mail: tsulaia@liv.ac.uk, Department of Mathematical Sciences, University of Liverpool, Liverpool, L69 7ZL, United Kingdom

Ensemble empirical mode decomposition: image data analysis with white-noise reflection
M. Kopecký

Abstract: During the last decade, Zhaohua Wu and Norden E. Huang announced a new improvement of the original empirical mode decomposition method (EMD). Ensemble empirical mode decomposition, abbreviated EEMD, represents a major improvement with great versatility and robustness in noisy data filtering. EEMD consists of sifting an ensemble of white-noise-added signals and treating the mean value as the final true result. This is due to the use of a finite, not infinitesimal, amplitude of white noise, which forces the ensemble to exhaust all possible solutions in the sifting process. These steps collate signals of different scale into a proper intrinsic mode function (IMF), dictated by the dyadic filter bank. As EEMD is a time-space analysis method, the added white noise is averaged out with a sufficient number of trials; the only persistent part that survives the averaging process is the signal component (the original data), which is then treated as the true and more physically meaningful answer. The main purpose of adding white noise was to provide a uniform reference frame in the time-frequency space; the added noise collates the portions of the signal of comparable scale in a single IMF. Image data taken as a time series is a non-stationary and nonlinear process, to which the newly proposed EEMD method can be fitted. This paper reviews the new approach of using EEMD and demonstrates its use on the example of image data analysis, making use of some advantages of the statistical characteristics of white noise. This approach helps to deal with omnipresent noise.

Keywords: empirical mode decomposition (EMD), ensemble empirical mode decomposition (EEMD), image analysis, sifting, noise-assisted data analysis (NADA).

1 Introduction

Empirical mode decomposition (EMD) has been proposed as an adaptive time-frequency data analysis method. It has proved effective in applications for extracting signals from data generated in noisy nonlinear and non-stationary processes [1, 2]. However, there are still several known unresolved difficulties with EMD. The first major weakness of the original EMD is the frequent occurrence of mode mixing: a single intrinsic mode function (IMF) can consist of oscillations of widely disparate scales, or a signal of a similar scale can reside in different IMF components. Mode mixing is often a consequence of an intermittent signal, as discussed by Huang et al. [2, 3, 4].
An intermittent signal can not only cause serious aliasing in the time-frequency distribution, but can also make the physical meaning of the individual IMFs seriously unclear. To alleviate this drawback, Huang proposed the intermittence test [3, 2, 4]. This test aimed to avoid several difficulties, but it caused its own issues, which were also illustrated by Huang and Wu. The first issue is the subjectively selected scale of the test; by this intervention, EMD ceases to be totally adaptive. The second issue concerns separable and definable data selection from timescales, which was discussed by Huang, Wu et al. [4]. To overcome the scale-separation issue without introducing a subjective intermittence test, Huang and Wu proposed a new noise-assisted data analysis (NADA) method, known as ensemble EMD (EEMD), which defines the true IMF components as the mean value of an ensemble of trials [2]. Each trial consists of the signal plus a white noise of finite amplitude. Adding white noise helps to better cover real cases, as is common in current papers on image data analysis [1], and has shown EMD to be an adaptive dyadic filter bank. In making this improvement, Wu and Huang were inspired by Flandrin and Gledhill and their research on adding white noise to data analysis, which helps to improve EMD results [5, 4]. The main principle of EEMD was defined simply: "The added white noise would populate the whole time-frequency space uniformly with the constituting components of different scales. When a signal is added to this uniformly distributed white background, the bits of signal of different scales are automatically projected onto proper scales of reference established by the white noise in the background" [4, p. 2]. This approach was adopted on the basis of the following observations:

1. A collection of white noise cancels itself in the ensemble mean if averaged in the time domain; therefore, only the signal can survive and persist in the final noise-added signal ensemble when averaged.
2. Finite, not infinitesimal, amplitude white noise is needed to force the ensemble to exhaust all possible solutions. Finite-magnitude noise makes the different-scale signals reside in the corresponding IMFs, dictated by the dyadic filter banks, and renders the resulting ensemble mean more meaningful.
3. The true and physically meaningful answer to EMD is not the one without noise; it is designated to be the ensemble mean of a large number of trials consisting of the noise-added signal.

The proposed EEMD method utilizes many important statistical properties of noise [4]. Observation 1 can be checked numerically with the short sketch that follows.
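The following minimal numerical sketch (not from the original paper) illustrates observation 1: the ensemble mean of independent white-noise realizations decays towards zero as the number of trials grows.

```python
import numpy as np

rng = np.random.default_rng(0)
n_points, n_trials = 1000, 500

# independent white-noise realizations, one per trial
noise = rng.normal(0.0, 1.0, size=(n_trials, n_points))

# the ensemble mean cancels the noise: its standard deviation
# shrinks roughly as 1/sqrt(n_trials), i.e. ~ 0.045 here
print(noise.mean(axis=0).std())
```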
The following sections describe a part of the research done by Huang and Wu on the relation between white noise and a real signal; image data is used in this paper. Section 2 provides a brief introduction to ensemble empirical mode decomposition and describes several details of the drawbacks associated with mode mixing. Section 3 describes the usefulness and capability of EEMD through examples in image data analysis and optical flow assignment.

2 Introduction to ensemble empirical mode decomposition over EMD

As an introduction to a more detailed description of the EEMD method, we begin with a short review of EMD [2, 6]. The EMD method is an adaptive method, with the decomposition based on and derived from the data. In the EMD approach, the data, a time series x(t), is decomposed in terms of IMFs, as described by Huang in [4]:

$$x(t) = \sum_{j=1}^{n} c_j + r_n,$$

where r_n is the residue of the original data x(t) and n is the number of steps for extracting IMFs. IMFs are simple oscillatory functions with varying amplitude and frequency. The extracted IMFs have the following properties:

1. Throughout the length of a single IMF, the number of extrema and the number of zero crossings must either be equal or differ at most by one (although these numbers can differ significantly for the original data set).
2. At any data location, the mean value of the envelope defined by the local maxima and the envelope defined by the local minima is zero.

Fig. 1: The sine wave signal with three crests used as an example in the introduction to EEMD (Fig. 2)

Fig. 2: The first step of the sifting process. Panel (a) is the input. Panel (b) identifies the local highs (red dots). Panel (c) plots the upper envelope (upper red dashed line), the lower envelope (lower red dashed line) and the input signal (blue line), and panel (d) is the difference between the input and the mean m_i(t) of the envelopes.

In common practice, EMD is implemented through a sifting process using only the local extrema [2, 6]. For any data r_{j−1}, the following procedure applies:

1. Identify all the local extrema (a combination of highs and lows) and connect all the local highs (lows) with a cubic spline as the upper (lower) envelope.
2. Obtain the first component h(t) = x(t) − m(t), where m(t) is the mean of the upper and lower envelopes.
3. Treat h(t) as the data, and repeat steps 1 and 2 as many times as required, until the envelopes are symmetric with respect to the zero mean under certain criteria. The final h(t) is designated as c_j.

The complete sifting process stops when the residue r_n becomes a monotone function from which no more IMFs can be extracted; the process is stopped using a stoppage criterion. Two types of stoppage criteria have commonly been used. The first type, used by Huang in 1998, is based on a Cauchy type of convergence test. The test requires the normalized squared difference between two successive sifting operations, defined as

$$SD_k = \frac{\sum_{t=0}^{T} |h_{k-1}(t) - h_k(t)|^2}{\sum_{t=0}^{T} h_{k-1}^2(t)}, \qquad (1)$$

to be small: if this squared difference SD_k is smaller than a desired number, the process is stopped. Setting up the right SD_k value is a very difficult task, because no generally accepted definition is available [2]. The second criterion is to set up a preselected S-number, which raises its own issue of how to select an appropriate number. These difficulties led the authors of [1] to develop a new approach to obtaining IMFs from the acquired signal (data), which is described and applied below.

Based on the research of Flandrin and Gledhill, Wu and Huang [4] used their results as a background for improving the definition of the EMD method. Bearing the definition of EMD in mind, they showed that EMD behaves as a dyadic filter bank when white noise populates the whole timescale or time-frequency space uniformly [4]. Their description postulates that the total number of IMFs in a data set is close to log2(N), with N the total number of data points. When the signal is not merged with pure noise, some scales can be missing, and the total number of IMFs can then be lower than the expected log2(N).
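A minimal sketch of one sifting step and of criterion (1), for illustration only: the envelope end effects and the extrema padding used in production EMD implementations are omitted here.

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.signal import argrelextrema

def sift_once(x):
    """One sifting step: h(t) = x(t) - m(t), where m(t) is the mean of the
    cubic-spline envelopes through the local maxima and minima."""
    t = np.arange(len(x))
    hi = argrelextrema(x, np.greater)[0]      # indices of local maxima
    lo = argrelextrema(x, np.less)[0]         # indices of local minima
    upper = CubicSpline(hi, x[hi])(t)
    lower = CubicSpline(lo, x[lo])(t)
    return x - 0.5 * (upper + lower)

def stoppage_sd(h_prev, h):
    """Normalized squared difference of Eq. (1): sift until SD_k is small."""
    return np.sum(np.abs(h_prev - h) ** 2) / np.sum(h_prev ** 2)
```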
Another issue is caused by mode mixing, as in the original EMD. An advantage of this approach is the knowledge of the expected number of IMFs, expressed through N.

Fig. 3: The intrinsic mode functions of the input displayed in Fig. 1(a)

2.1 A definition of mode mixing based on EMD

Mode mixing has been defined as any IMF consisting of oscillations of dramatically disparate scales. When mode mixing occurs, an IMF can cease to have a distinct physical meaning by itself, and the signal can suggest an invalid physical representation. This known drawback was mentioned by Huang in [2, 3, 4]. An example of the sifting process is presented in Fig. 2, and its decomposition is shown in Fig. 3, as illustrated by Huang and Wu [4]. The fundamental part of the data is a sine wave with unit amplitude. At the three middle crests of the sine wave, high-frequency intermittent oscillations with amplitudes close to 0.2 ride on the fundamental sine wave, as in panel (a) of Fig. 2. This signal is denoted the test signal. Panels (b) and (c) show the sifting process of identifying the highs (lows), as described above. Panel (d) displays the result, which is affected by visible mode mixing. This is because one envelope was a mixture of the envelopes of the fundamental sine wave and the intermittent signal, leading to a severely distorted envelope mean, as shown in Fig. 2 and Fig. 4; Fig. 4 shows the main issue, caused by the localization of the highs and lows. An unpleasant implication of mode mixing is the set of disadvantages of the original EMD method with the stoppage criterion, as described by Huang [1, 2, 4]. Mode mixing is also the main reason why the EMD algorithm is unstable (Fig. 3): any small perturbation may result in a new set of IMFs, as reported by Gledhill [4]. Obviously, the intermittence prevents EMD from extracting any signal with similar scales. EEMD [4], which is briefly described in the following sections, has been proposed to solve these problems.

Fig. 4: Test signal (red dashed line) and the locations of highs (blue circles) and lows (green circles)

2.2 Ensemble empirical mode decomposition

As shown for the previous test signal, the data (Fig. 2, Fig. 3) comprises a collection of the main signal and some noise. To improve the precision of measurements, the idea of the ensemble mean becomes a powerful approach: data are collected by separate observations, each of which contains a different white-noise realization [4]. To generalize this ensemble idea, the noise is added to the single data set x(t). The added white noise is treated as the random noise that would be encountered in the measurement process. Under such conditions, the i-th observation can be written as

$$x_i(t) = x(t) + w_i(t). \qquad (2)$$

In this way, a different realization of white noise w_i(t) is added even when there is only one observation (Eq. 2). We now make a brief review of the properties of EMD before proceeding to a more detailed description of the EEMD method:

1. EMD is an adaptive data analysis method based on local characteristics of the data, and it therefore captures nonlinear, nonstationary oscillations more effectively;
2. EMD is a dyadic filter bank for any Gaussian white-noise-only series;
3. when the data is intermittent, the dyadic property is often compromised in the original EMD, as shown in the example in Fig. 3;
4. adding noise to the data can provide a uniformly distributed reference scale, which enables EMD to repair the compromised dyadic property;
5. the corresponding IMFs of different white-noise series have no correlation with each other; therefore, the means m_i(t) of the corresponding IMFs of different white-noise series are likely to cancel each other.

Bearing in mind all these properties of the EMD method, the proposed EEMD has been developed in the following way:

1. add a white-noise series to the targeted data;
2. decompose the data with added white noise into IMFs;
3. repeat steps 1 and 2, each time with a different realization of the white-noise series;
4. perform steps 1 and 2 until the number of IMFs of the data set is close to log2(N), with N the total number of data points;
5. obtain the (ensemble) means of the corresponding IMFs of the decompositions as the final result.

Fig. 5: Test signal with white noise added (red dashed line) and the locations of its highs (blue circles) and lows (green circles)

Fig. 6: The input (top panel), its intrinsic mode functions (C1–6), and the trend (R). In panel C5, the original input is plotted as the bold dashed red line for purposes of comparison.

The main effect of decomposition using EEMD is that the added white-noise series cancel each other in the final mean of the corresponding IMFs. This means that the IMFs stay within the natural dyadic filter windows, which significantly reduces the chance of mode mixing and preserves the dyadic property. An example of the use of the EEMD method is illustrated in Fig. 6, in a similar sense as for the previous EMD in Fig. 2 and Fig. 3. Clearly, the fundamental signal C5 is represented almost perfectly, as are the intermittent signals if C2 and C3 are added together. The fact that the intermittent signal actually resides in two EEMD components is due to the average spectra of the neighboring IMFs of white noise overlapping, as revealed by Wu and Huang; it is then necessary to combine the two adjacent components into one IMF. The need for this type of adjustment is easily determined through an orthogonality check: whenever two IMF components become grossly non-orthogonal, we should consider combining the two components to form a single IMF component [4].

3 EEMD method implemented in image data analysis

The previous theoretical example shows the main concept of NADA, using the EEMD method as a tool. Based on the previous discussion, the question arises how the method could be used in image data analysis. Every frame of an image comprises a time sequence of data that is processed by the brain; our brain gives an appropriate meaning to this ensemble of data, and in this way synapses can work as a dyadic filter bank. Data can be, and mostly is, affected by all kinds of noise, for various reasons, such as short-sightedness or daylight: the observed object is not sharply displayed on the retina, and the data is received in a damaged way, which can affect its meaning or its values. In this section, we outline how EEMD can be used in analyzing such data.

An example of cube rotation is displayed in Fig. 7. The brightness of the pixels of two images of the same cube along the line at height 120 pixels forms the test signal. The size of the two images is 320×240 pixels in gray-scale color mode; gray-scale mode was chosen for the sake of simplicity, to make the mean value easier to interpret. In the first step, the two images were analyzed in the same way: the line values were taken and the IMFs were extracted. The standard deviation of the added white noise was set to 0.2, and the EEMD ensemble number was set to 50. A minimal sketch of this procedure is given below.
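The ensemble loop itself is only a few lines. The sketch below is illustrative and is not the author's implementation: emd_fn stands for any routine that returns a fixed number of IMFs for a 1-D signal (e.g. repeated sifting as in Section 2), and the file name cube1.png is hypothetical.

```python
import numpy as np

def eemd(x, emd_fn, noise_std=0.2, n_ensemble=50):
    """EEMD steps 1-5: average the IMFs of many noise-added copies."""
    n_imfs = int(np.log2(len(x)))            # expected number of IMFs
    acc = np.zeros((n_imfs, len(x)))
    rng = np.random.default_rng()
    for _ in range(n_ensemble):
        noisy = x + noise_std * rng.standard_normal(len(x))  # step 1
        acc += emd_fn(noisy, n_imfs)                         # step 2
    return acc / n_ensemble                  # step 5: the ensemble mean

# applying it to one image line, as in the cube experiment:
# from imageio.v3 import imread
# line = imread("cube1.png").astype(float)[120, :]  # brightness at y = 120
# imfs = eemd(line, emd_fn, noise_std=0.2, n_ensemble=50)
```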
Fig. 7: Cube rotation: the cube rotates from a misaligned position to the frontal position. The black arrow indicates the direction of rotation.

Fig. 8: Cube 1 – signal line

Fig. 9: Cube 2 – signal line

Fig. 10: Cube 1 – all IMF functions extracted by EEMD

In the Cube 1 trial, the input (the original brightness data) shows a visible jump around a width value of 100, which corresponds to the edge of the real cube (Fig. 7). Cube 1 is slightly dipped to the left of the center, as shown in the original image (Fig. 8). The next peak in Fig. 10, at a width of around 118 units, can represent the second cube edge, which is coherent with reality. The subsequent perturbation in the first IMF can reflect the footmark. The first main point here is the edge projection as a short perturbation with definable peaks over the zero value. The other IMFs can represent surface deviations. The last line is the residuum of the data; the residuum is mostly interpreted as a trend. The trend interpretation can also be used in our case, because the last line shows a background change caused by the box: the original data defines the camber in the real scope, and this jump is reproduced in the trend line. Only eight different IMFs were extracted, which corresponds to the EEMD basics for a finite data set explained above.

The data for Cube 2 (Fig. 9) is treated and displayed in a similar sense as for Cube 1. The image of Cube 2 is taken from the front, which is also visible in the original. The data jump and the additional pit are smaller; this is caused by the projection of Cube 2, which is logical. The surface texture is also displayed in the original figure. As described above, the EEMD method works with the IMF summary.

Fig. 11: Cube 2 – all IMF functions extracted by EEMD

Fig. 12: Comparison between the original data set and the IMF summary

All IMFs should have no mutual correlation and should be independent of the white-noise trials. This means that a summary of all IMFs should give us the original signal (Fig. 12). Finally, when we compare the two cubes, we find certain data shifts in the inputs and IMFs; these shifts are caused by the cube rotation from right to left, as indicated above (Fig. 7). There is no doubt that the two original figures are affected by different kinds of noise, introduced at capture time. EEMD enables us to extract data from images properly, even in different layers. The main intention in edge detection is to find appropriate differences over all image layers, which is enabled by the use of the EEMD method presented here. Another important point is the logically different image width under cube rotation.

Fig. 13 points out several areas where the cube edge can be projected onto specific IMFs. Possible edge fluctuations are marked in subplot C1. These fluctuations are displayed as second fluctuations, because of the main transition; the first and the last fluctuations directly indicate a background change to/from the cube. Now we see that we have two limited areas onto which everything can be projected. In C2, the IMFs still show similar behavior for both cubes, with a logical width shift. The underlying change is shown in IMF C3: the Cube 1 line is projected with no underlying fluctuations, and it corresponds to reality, with no additional cube edge.
In Cube 2, however, the line fluctuation reaches a smaller peak, and a kind of promontory appears; this can be explained as a cube surface projection. When we look back at the original images, we see the true reality (Fig. 8 and Fig. 9). We consider here that C3 corresponds to the surface of the object, where no inner edge fluctuation is displayed. This assertion is not proven and will be a topic for further study; here, the result can be taken as a valid conclusion. In our analysis of the Cube 1 signal, we found three edges; in the analysis of the Cube 2 signal we found only two edges, which corresponds to the front view. The retrieved result is thus interpreted as the cube rotation from the left-dipped state to the frontal view.

Fig. 13: Comparison of cube EEMDs (the colored circles indicate important edge-detection areas)

Fig. 14: Cube 1 – a comparison between the EMD and EEMD methods, and instability in the analysis

4 A comparison between the EMD and EEMD methods

Fig. 14 presents a comparison between the original EMD method and the new EEMD method, using image data analysis.

5 Conclusion

A major drawback of EMD is the instability of the method, see Fig. 14. The EMD method has a problem even with edge detection: edge fluctuations over the signal are more widely spread and are not as strict as in EEMD, and the EMD lines show a strong mixture of modes, where lines dramatically change direction without valid reasons. Any edge projected onto the retina is a width-limited area, which causes strict sensor stimulation; this assertion is closer to the EEMD behavior, where narrower fluctuations are depicted. In brief, results produced by the EMD method can be less valid than the EEMD lines, which exhibit much stricter behavior. This instability can have a dramatic effect on further studies of any signal, not necessarily from an image [1, 2, 4, 6]. The reason for the drawback is linked with the idea of a clean data signal, which does not correspond to real-world examples. The EEMD method uses a white-noise-added signal; this approach exhausts all possible solutions during the sifting process (Fig. 4, Fig. 5). The comparative test of the instability of the two methods displays the main benefit of using the EEMD method. The ability to divide the incoming signal data into more independent IMF functions and then analyze it gives power to the EEMD method, where the signal, not necessarily an image, can be interpreted and processed by layers. The natural independence of the IMFs provides an opportunity to treat any signal as a summary of its layers, and this could be helpful in future medical research. There is potential and motivation for improving the proposed method.

Acknowledgement

This work was supported by the Grant Agency of the Czech Technical University in Prague, grant No. 2B06023, and by CTU grant SGS OHK2-021/10.

References

[1] Kopecký, M.: Hilbert-Huang transformation and its application in optical flow evaluation. STC, 2009.
[2] Huang, N. E., Shen, S. S. P.: Hilbert-Huang transform and its application. Politecnico di Milano, Vol. 25. ISSN 0262-8856.
[3] Huang, N. E., Shen, Z., Long, S. R., Wu, M. C., Shih, E. H., Zheng, Q., Tung, C. C., Liu, H. H.: The empirical mode decomposition method and the Hilbert spectrum for non-stationary time series analysis. Proc. Roy. Soc. London 454A, 1998, p. 903–995.
[4] Wu, Z., Huang, N. E.: Ensemble empirical mode decomposition and noise-assisted data analysis method. World Scientific Publishing Company, Advances in Adaptive Data Analysis, Vol. 1, No. 1 (2009), p. 1–41.
[5] Flandrin, P., Goncalves, P., Rilling, G.: EMD equivalent filter banks, from interpretation to applications. In: Hilbert-Huang Transform: Introduction and Applications, eds. N. E. Huang and S. S. P. Shen. World Scientific, Singapore, 2005.
[6] Kokeš, J.: Okamžitá frekvence a Hilbert-Huangova transformace [Instantaneous frequency and the Hilbert-Huang transformation]. Sborník konference Inteligentní systémy pro praxi, Lázně Bohdaneč, 2006. ISBN 80-239-6535-2.

Miroslav Kopecký, e-mail: miroslav.kopecky@fs.cvut.cz, Department of Instrumentation and Control Engineering, Czech Technical University in Prague, Faculty of Mechanical Engineering, Technická 2, 166 27 Praha, Czech Republic

Creating a multi-axis machining postprocessor
Petr Vavruška
Department of Production Machines and Equipment, Faculty of Mechanical Engineering, Czech Technical University in Prague, Horská 3, 128 00 Prague 2, Czech Republic
Correspondence to: p.vavruska@rcmt.cvut.cz

Abstract: This paper focuses on the postprocessor creation process. When using standard commercially available postprocessors it is often very difficult to modify their internal source code, and it is a very complex process, in many cases even impossible, to implement newly developed functions. It is therefore very important to have a method for creating a postprocessor for any CAM system that can generate CL data (cutter location data) to a separate text file. The goal of our work is to verify the proposed method for creating a postprocessor; postprocessor functions for multi-axis machining are dealt with in this work. A file with CL data must be translated by the postprocessor into an NC program customized for a specific production machine and its control system. The postprocessor is therefore verified by applications for machining free-form surfaces of complex parts and by executing the generated NC programs on real machine tools. This is also presented here.

Keywords: postprocessor, NC program, interpolation.

1 Introduction

Most CAM (computer aided manufacturing) systems are able to export general data, called CL data (cutter location data), in a text file. The CL data file consists of information about the coordinate system, tools, etc. It is necessary to adjust this CL data file for a specific production machine: each production machine has a different structural design and a unique control system. A file with CL data is then translated by the postprocessor into an NC program that has been adapted to a specific production machine and its control system, see [3].

Figure 1: NC program creation process

Especially for multi-axis machining, it is necessary to include further features when generating the relevant NC programs, e.g. the locations of the rotary axes of the machine tool. However, when using standard commercially available postprocessors, it is often very difficult to modify the internal source code, and it is a very complex process, in many cases even impossible, to implement newly developed functions. It is therefore very important to be able to create a postprocessor for any CAM system that generates CL data to a separate text file. Subsequently, the postprocessor that is created must be verified by applications for machining free-form surfaces of complex parts, by executing the generated NC programs on a real machine tool.
2 Method for creating a postprocessor

The postprocessor falls within the chain shown in Figure 1, which illustrates the idea that the postprocessor is not necessarily a part of the CAM system, but can also be a stand-alone program; in essence, it is a compiler. The issue of compilers is taught e.g. at the Faculty of Electrical Engineering (CTU in Prague), and can be found in [2]. The CL data file contains all the information necessary for creating a specific NC program. Besides the coordinate system and tools mentioned above, a CL data file consists of information about the toolpath, the orientation of the tool, tool changes, the feed-rate value, the spindle speed value and the direction of rotation, the desired cooling, the type of interpolation, etc.

A method for creating a postprocessor for CL data analysis has been proposed. This method is based on using open-source or free software. The flex and bison tools generate source code in the C language for lexical and syntactic analyzers on the basis of lexical and syntactic analysis rules created by the programmer. The software is distributed under the General Public License (GPL) and forms part of the GNU project. Lexical and syntactic analysis is needed to decode the input text file (CL data). After this decoding process, the data is passed to the other postprocessor functions, where it is further processed according to the programmed algorithms. The other postprocessor functions (transformations, modality of words, block formatting, etc.) are written in C language source code as a separate text file. The final piece of software needed is a compiler of C source code, to obtain a binary file of the postprocessor; a compiler called Dev-C++ is used for this purpose, but of course a different C compiler can be used. This is summarized in Figure 2; a minimal illustration of the decoding stage follows.

Figure 2: Postprocessor creation method
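For illustration, the decoding stage can be mimicked in a few lines. The sketch below is hypothetical and much simpler than a flex/bison analyzer (and written in Python rather than generated C): it assumes APT-style CL records such as GOTO/x, y, z [, i, j, k], which is only one of the formats a real CAM system may emit.

```python
import re

GOTO = re.compile(r"GOTO\s*/\s*(.+)")   # motion record, APT-style

def parse_cl_record(line):
    """Return (tool reference point, tool-axis vector) for a GOTO record,
    or None for other record types (FEDRAT/, SPINDL/, LOADTL/, ...)."""
    m = GOTO.match(line.strip())
    if not m:
        return None
    values = [float(v) for v in m.group(1).split(",")]
    point = values[0:3]
    axis = values[3:6] if len(values) >= 6 else [0.0, 0.0, 1.0]
    return point, axis

# parse_cl_record("GOTO/12.500, -3.250, 40.000, 0.0, 0.0, 1.0")
# -> ([12.5, -3.25, 40.0], [0.0, 0.0, 1.0])
```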
3 Postprocessors for multi-axis milling

For more than three-axis machining, it is necessary to transform the coordinates of the tool reference point in each block of CL data. The transformation depends on the nomenclature of the actual machine tool. It is also necessary to consider the required machining principle: in multi-axis machining there are two basic machining principles, positioning (indexing) and continuous machining. In the case of positioning, the workpiece is set to the desired position using the rotary axes, and the cutting moves are realized using the linear axes only; the postprocessor then transforms the coordinates according to the same angular coordinates, and the limits of the rotary axes are checked only in the block setting the desired positions of the rotary axes. In continuous machining, by contrast, it is necessary to check the positions of the rotary axes of the machine tool in each block. In the case of four-axis machining, we can speak of a plane coordinate transformation (the coordinates of the linear axis that is parallel to the axis of rotation remain unchanged), and the axis of rotation also usually has no limits. Figure 3a shows the nomenclature of the Haas TM1 four-axis machine tool, which is available at the Department of Production Machines and Equipment (CTU in Prague, Faculty of Mechanical Engineering). Five-axis machining is a larger issue: in this case we must consider the spatial transformation, which is ambiguous.

In a CL data file, we can find positions of the tool that may be represented by two identical positions on the machine tool, see Figure 4a. We know only the position of the tool reference point and the orientation of the tool axis, and we look for the positions of all the machine tool axes; this is an application of inverse kinematics. Multi-axis machine tools are distinguished by their nomenclature (the location of the rotary axes), and hence transformation equations can be created. The transformation equations are usually created using transformation matrices, which can be found e.g. in [4]. Figure 3b shows the kinematics of a five-axis machining center, which is available at the Department of Production Machines and Equipment (CTU in Prague, Faculty of Mechanical Engineering). It is based on the MCV 1000 three-axis milling machine tool (made by Kovosvit MAS), supplemented by a rotary/tilting table made by Nikken. In this case, the two rotary axes are on the workpiece clamping device; this is known as the table-table concept. The transformation matrices for this machine can be found in [5].

Figure 3: Nomenclature of machine tools (a – four-axis Haas TM1, b – five-axis MAS MCV 1000)

Figure 4: Range of movement of the tilting axis (a – ambiguous positions, b – limits of the tilting axis)

Postprocessors for continuous multi-axis machining must include solutions for the limits of the rotary axes. It is usually possible to specify the limits of the machine tool axes, and during the transformation the postprocessor then checks the positions of all the axes of the machine tool. However, suppose the subsequent position of some rotary axis would exceed the limit of this axis, and in the next NC program block the postprocessor generates the positions of the alternative machine tool axes, without the tool retracting from the workpiece in order to change the positions of the machine tool axes. A collision might then occur between the tool and the workpiece, because a linear interpolation is programmed between the two consecutive blocks of the NC program. Other places in the NC program where this problem may occur are called stationary points, where the tool axis is parallel to the rotary axis C (in the case of the five-axis machine tool mentioned above). In fact, there is no straight line during multi-axis machining using linear interpolation: a spatial curve is created because of the existence of the rotary axes. For very short increments of the toolpath, the curve approaches a straight line, but when the angular increment is too high the curve becomes visible on the surface and the workpiece is devalued. However, if these critical points are dealt with using a proprietary algorithm, the impending destruction of the product is averted. The limits of tilting axis B of the MAS MCV 1000 machine tool are shown in Figure 4b as "lim 1" and "lim 2" ("-lim 2" is the mirrored "lim 2", an area of possibly ambiguous solutions for the position of the tilting axis). Figure 5 shows the difference between the coordinate characteristics of two NC programs, the first without using the algorithm for flipping the tilting axis over the stationary point (coordinate characteristics in Figure 5a), the second with this algorithm (coordinate characteristics in Figure 5b). The green box in Fig. 5b surrounds the area from the stationary point to the point where the limit of tilting axis B is reached. One plausible shape of this angle selection is sketched below.
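To make the ambiguity concrete, the following sketch computes candidate rotary-axis positions for a table-table machine from a unit tool-axis vector. It is a simplified illustration, not the equations of [4, 5]: the sign conventions and zero offsets depend on the actual machine nomenclature, and the B-axis limit values used here are placeholders, not those of the MAS MCV 1000.

```python
from math import acos, atan2, degrees

def candidate_angles(i, j, k):
    """Two (B, C) pairs orienting the tool along the unit vector (i, j, k):
    the inverse kinematics of a tilting/rotary table is ambiguous (Fig. 4a).
    Simplified convention; a real machine needs its own transformation."""
    b = degrees(acos(k))            # tilt magnitude: angle from the Z axis
    c = degrees(atan2(i, j))        # azimuth of the tool axis
    return (b, c), (-b, (c + 180.0) % 360.0 - 180.0)

def select_pose(solutions, b_min=-30.0, b_max=110.0):
    """Pick the first candidate whose tilt respects the B-axis limits
    ('lim 1', 'lim 2' in Fig. 4b); placeholder limit values."""
    for b, c in solutions:
        if b_min <= b <= b_max:
            return b, c
    return None                     # no valid pose: retract and reposition
```

When both candidates violate a limit, or when the toolpath crosses a stationary point (the tool axis parallel to the C axis), the postprocessor must insert the retract-and-approach sequence described below rather than interpolate linearly between the two poses.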
If the tool is at a safe distance from the workpiece, the machine tool axes can change their positions, and the machining process can continue after the tool approaches the workpiece again. The approach is implemented in such a way that the tool first moves at rapid speed to a position near the surface, and then moves at the working feed rate into the cut.

Figure 5: Usage of the algorithm for flipping over the stationary point (a – without, b – with)

Figure 6: Testing the four-axis milling postprocessor

The format of the NC program depends on the control system used in the machine tool. There is a common language for NC code, the ISO language (ISO code), but the producers of control systems keep bringing new features into their systems. In order to generate these functions correctly in the NC program, the postprocessor has to identify the areas in the CL data file where these functions can be used. The postprocessor can also be extended by non-standard functions derived from specific requirements in production, or from practices within the company where the postprocessor will be used. In our two cases it was necessary to adjust the NC program to the requirements of the Haas control system (in the case of the Haas TM1) and to the requirements of the Heidenhain iTNC 530 (in the case of the MAS MCV 1000). An adjusted ISO format had to be used for the Haas control system. The part (a blade) for testing the postprocessor functions is shown in Figure 6, together with a part of the NC program (the block with the tool change is marked in red, and the blocks with tool length and radius compensation are marked in green) and the real machining of the blade. Heidenhain control systems allow programming using the adjusted ISO language, and alongside it they also offer programming in their own language, called "dialogue", see [1]; this language is very different from the ISO code. Due to the machine operator's requirements, it was necessary to modify the postprocessor so that the format of the NC program corresponds to this language. A comparison between a CL data section and the corresponding section of an NC program is shown in Figure 7, where the blocks that correspond to each other are marked in blue and green, and the feed rates programmed through parameters are marked in red.

The final figures show a test of the postprocessor. The creation of the toolpaths (for roughing and for finishing) is shown in Figure 8, where the test part is an impeller. After this part was machined on a machine tool (see Figure 9a), the surfaces were measured using a Mitutoyo coordinate measurement machine, see Figure 9b. It can be stated with confidence that the postprocessor algorithms for multi-axis milling have been verified. A sketch of one block-formatting detail, the modality of words, follows.

Figure 7: CL data in comparison with the NC program (a – CL data, b – NC program)

Figure 8: Testing toolpath creation (a – roughing, b – finishing)

Figure 9: Finished blade (a – after machining, b – during the measurement)
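The modality of words mentioned in Section 2 is easy to illustrate: an ISO-style block emitter suppresses addresses whose value has not changed since the previous block. The sketch below is generic, not the actual postprocessor code; the address set and the block-numbering step are illustrative.

```python
def format_block(words, modal_state, number):
    """Emit one ISO-style NC block, suppressing modal words (G, F, S, ...)
    whose value is unchanged since the previous block."""
    out = [f"N{number}"]
    for address, value in words.items():       # e.g. {"G": "01", "X": "12.5"}
        if modal_state.get(address) != value:
            out.append(f"{address}{value}")
            modal_state[address] = value
    return " ".join(out)

state = {}
print(format_block({"G": "01", "X": "12.5", "F": "800"}, state, 10))
print(format_block({"G": "01", "X": "14.0", "F": "800"}, state, 20))
# -> N10 G01 X12.5 F800
# -> N20 X14.0        (G01 and F800 are modal and therefore omitted)
```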
4 Conclusion

This paper has summarized my findings from designing and verifying the multi-axis machining postprocessor creation method. The method is based on translating a text file with CL data: a CL data file is generated by the CAM system and is then translated by the appropriate postprocessor into an NC program. The appropriate postprocessor must be created for a specific CAM system, a particular machine tool and a control system. The multi-axis postprocessor method has been verified with the CATIA CAM system. An experiment machining real parts on CNC machine tools proved the functionality and applicability of the postprocessor algorithms. This postprocessor creation method can also easily be applied (after analysis of the actual format of the CL data and adaptation of the lexical and syntactical analyzers of the postprocessor) to other CAM systems.

Acknowledgement

The results have been obtained with the support of the 1M0507 Czech Ministry of Education, Youth and Sports grant "Research Center of Manufacturing Technology".

References

[1] iTNC 530, Příručka uživatele – programování podle DIN/ISO. Traunreut: Heidenhain, 2002. 463 p. (in Czech)
[2] Melichar, B., et al.: Konstrukce překladačů, I. a II. část. 1st edition, Prague: CTU, 1999. 636 p. ISBN 80-01-02028-2. (in Czech)
[3] Rybín, J.: Automatické řídicí systémy. 1st edition, Prague: CTU, 1991. 150 p. ISBN 80-01-00694-8. (in Czech)
[4] Stejskal, V., Valášek, M.: Kinematics and Dynamics of Machinery. 1st edition, New York: Marcel Dekker, Inc., 1996. 494 p. ISBN 0-8247-9731-0.
[5] Vavruška, P., Rybín, J.: Postprocessing and its applications on free-form surfaces. Research report No. V-08-088, CTU in Prague, Research Center of Manufacturing Technology, 2008. 54 p. (in Czech)

Structural variability of 3C 111 on parsec scales
C. Grossberger, M. Kadler, J. Wilms, C. Müller, T. Beuchert, E. Ros, R. Ojha, M. Aller, H. Aller, E. Angelakis, L. Fuhrmann, I. Nestoras, R. Schmidt, J. A. Zensus, T. P. Krichbaum, H. Ungerechts, A. Sievers, D. Riquelme

Abstract: We discuss the parsec-scale structural variability of the extragalactic jet 3C 111 related to a major radio flux-density outburst in 2007. The data analyzed were taken within the scope of the MOJAVE, UMRAO, and F-GAMMA programs, which monitor a large sample of the radio-brightest compact extragalactic jets with the VLBA, the University of Michigan 26 m, the Effelsberg 100 m, and the IRAM 30 m radio telescopes. The analysis of the VLBA data is performed by fitting Gaussian model components in the visibility domain. We associate the ejection of bright features in the radio jet with a major flux-density outburst in 2007. The evolution of these features suggests the formation of a leading component and multiple trailing components.

Keywords: galaxies: individual, 3C 111 – galaxies: active – galaxies: jets – galaxies: nuclei.

1 Introduction

Jets of active galactic nuclei (AGN) are among the most fascinating objects in the universe. From the time when the term "jet" was first introduced by [6] until today, it is still unclear how these jets are created and formed. A prime source for gaining insight into the physics of extragalactic jets is the broad-line radio galaxy 3C 111 (PKS B0415+379) at z = 0.0491¹. The object can be described with a classical FR II morphology [8], exhibiting two radio lobes with hot spots and a single-sided jet [17]. Untypically for radio galaxies, a small inclination angle of only 18° to our line of sight has been determined on parsec scales [14]. Moreover, 3C 111 has a blazar-like spectral energy distribution (SED) [11] and shows one of the brightest radio cores in the mm-cm wavelength regime of all FR II radio galaxies. Superluminal motion was detected in this radio galaxy by [10] and [20], making this source one of the first radio galaxies to exhibit this effect.
The EGRET source 3EG J0416+3650 has been associated with 3C 111 [21, 11], and γ-ray emission from 3C 111 has been confirmed by Fermi/LAT [1–3]. A major flux-density outburst in 1997 was investigated by [15] with 10 years of radio monitoring data (1995–2005). In addition, [7] and [23] report on a possible connection between the accretion disk and the jet of 3C 111. In this paper, a new major flux-density outburst from 2007 and the associated jet kinematics are discussed, using data from the Very Long Baseline Array (VLBA²), the University of Michigan Radio Astronomy Observatory (UMRAO³) and the F-GAMMA program.

2 Data analysis

The broad-line radio galaxy 3C 111 has been part of the VLBA 2 cm Survey program since its start in 1995 [16], and of its successor MOJAVE (Monitoring of Jets in Active Galactic Nuclei with VLBA Experiments, [18]) since 2002. Twenty-four epochs of data on this source were taken from 2006 to 2010 within the MOJAVE program. Phase and amplitude self-calibration, as well as hybrid mapping by deconvolution techniques, were performed as described by [16]. Utilizing the program Difmap [22], two-dimensional Gaussian components have been fitted in the (u, v)-plane to the fully calibrated data of each epoch. We refer to the inner ∼0.5 mas as the "core" region, which can usually be modeled with two Gaussian components. All models have been aligned by assuming the westernmost component to be stationary, and all component positions are measured with respect to it. Conservative errors of 15% are assumed for the flux densities of the model-fit components, accounting for absolute calibration uncertainties and formal model-fitting uncertainties [13].

Within the UMRAO radio flux-density monitoring program [4], more than three decades of single-dish flux-density data have been collected for 3C 111 at 4.8 GHz, 8.0 GHz, and 14.5 GHz⁴. In addition, 3C 111 has been observed monthly by the F-GAMMA program [9, 5] at multiple frequencies throughout the cm and mm bands since 2007.

¹ Assuming H₀ = 71 km s⁻¹ Mpc⁻¹, Ω_Λ = 0.73 and Ω_m = 0.27 (1 mas = 0.95 pc; 1 mas yr⁻¹ = 3.1 c).
² The National Radio Astronomy Observatory is a facility of the National Science Foundation operated under cooperative agreement by Associated Universities, Inc.
³ UMRAO has been supported by a series of grants from the NSF and NASA, and by funds from the University of Michigan.

Fig. 1: Long-term radio lightcurve of 3C 111 obtained by the UMRAO at 4.8 GHz (left). Short-term radio lightcurves (right) at 32.0 GHz, 42.0 GHz, 86.24 GHz and 142.33 GHz obtained by the F-GAMMA program.

3 Results

3.1 Lightcurves

Figure 1 (left) shows the long-term radio lightcurve of 3C 111 at 4.8 GHz. Since the start of the measurements, the source had been in a low activity state for almost two decades, with only minor activity, until a major outburst in 1996/1997 [15]. Starting in 2004, minor outbursts have been observed and the flux-density level has increased. A major flux-density outburst starts in early 2007 at high frequencies (see Figure 1, right), peaking at ∼2007.6, and is subsequently seen at lower frequencies. A secondary outburst starts in mid-2008 at high frequencies. The overall flux-density level at 4.8 GHz has been decreasing since the major outburst.

3.2 VLBA data overview

In this work, we focus on the time period 2006 through 2010, which contains 24 MOJAVE epochs with an average image rms noise of 0.2 mJy beam⁻¹ and a maximum of 0.4 mJy beam⁻¹.
the restoring beams of the different epochs are very similar, with an average of (0.84 × 0.58) mas at p.a. −8°. a peak flux density of 6.54 jy was measured in may 2008. figure 2 shows the evolution of the parsec scale jet of 3c 111 observed within the mojave vlba program at 15 ghz since may 2008 (excluding six epochs before the ejection of the primary component). a major bright feature appears in may 2008 and is travelling downstream. the source brightness distribution was modelled with the clean algorithm by [12] within difmap.

3.3 model fitting in figure 3, the radial distances to the stationary component in the core region are plotted for every component as a function of time. the identification throughout the epochs is based on a comparison of the positions and flux densities of these model components. this identification is preliminary and will be discussed in more detail in a forthcoming paper. in the context of this paper we focus on the components which can be associated with the major flux density outburst of 2007 (see figure 1). a linear regression fit has been performed to measure the component speeds and ejection dates (see figure 3). the ejection dates of a primary component and a secondary component are quasi identical (∼ 2007.6) within the errors and were found to coincide with the peak of the outburst at high frequencies (see figure 1, right). the components' flux density evolution (see figure 4) shows that the core region was extremely bright during the time of the outburst, but dropped significantly after ejecting the bright primary and secondary components. the determined apparent speed is 3.94 ± 0.19 c for the primary component and 2.80 ± 0.40 c for the secondary component. the primary component remains at a constant flux density of ∼ 1 jy until mid 2009. after that, the component splits into multiple parts: a new leading component with trailing components in its wake.

⁴ in this work, we consider only the 4.8 ghz umrao data. a multi-frequency long-term analysis of the light curve will be presented elsewhere (grossberger et al., in prep.)

fig. 2: naturally weighted clean images of 3c111 from 2008 to 2010. the minimum contours are set to 5 sigma of the rms noise and increase logarithmically by a factor of 2. all fitted model components are indicated with a circle or ellipse (size of the full-width half maximum of the gaussian function) enclosing a cross. the lines are the fitted evolutionary tracks of the leading components. the dashed line indicates the primary component (and the leading component after mid 2009). the dotted line shows the position of the core

the secondary component has a higher flux density than the primary at the beginning of its lifetime, which rapidly decays. this decay suggests that the component disappeared in mid 2009, though an identification with the first trailing component is possible based on position alone. the calculated apparent speed of the new leading component after mid 2009 is 4.53 ± 0.09 c, with the flux density decreasing. the first trailing component has an apparent speed of 3.32 ± 0.19 c and shows a constant flux density evolution. the second trailing component was first observed at 2010.53 with a flux density of ∼ 240 mjy and could be modeled until 2010.91 with a flux density of less than 100 mjy.
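the speeds and ejection dates quoted above follow from straight-line fits of component separation against epoch. as a minimal sketch of this step, the epochs and separations below are invented for illustration; only the conversion factor 1 mas yr^−1 = 3.1c is taken from the adopted cosmology (footnote 1).

import numpy as np

# invented example data: epochs [yr] and core separations [mas] of one
# component; the real inputs are the per-epoch model-fit positions
epoch = np.array([2008.4, 2008.9, 2009.3, 2009.8, 2010.3])
sep_mas = np.array([0.9, 1.5, 2.0, 2.6, 3.2])

# linear regression: separation(t) = mu * t + b, with mu in mas/yr
mu, b = np.polyfit(epoch, sep_mas, 1)

t_ej = -b / mu        # ejection date: epoch at which the separation is zero
beta_app = mu * 3.1   # apparent speed in units of c (1 mas/yr = 3.1 c)

print(f"ejection date ~{t_ej:.2f}, apparent speed ~{beta_app:.2f} c")

with these illustrative numbers the fit returns an ejection date of roughly 2007.7 and an apparent speed of roughly 3.8 c, i.e. values of the same order as those quoted for the 2007-primary component.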
fig. 3: separation of all model-fit components (left) and of only the components associated with the 2007-outburst (right) with respect to the core as a function of time. model-fit components which could not be identified in 5 or more epochs are marked with a black cross. the 2007-primary component is represented by triangles, the 2007-secondary component by crosses, the leading component by open rectangles, the first trailing component by open diamonds, the second trailing component by pluses and the 2008-component by stars. linear regression fits determine the trajectories of the components associated with the 2007-outburst

the flux density evolution of the second trailing component suggests that this component faded away and thus could not be modeled in epoch 2010.98. an association of this second trailing component with the 2008 component is possible based on the flux density evolution and position, but needs further investigation of the 2008-outburst. a similar behaviour with leading, secondary and trailing components has been seen in the evolution of the components associated with the outburst from 1997 by [15]. the components were interpreted as a forward and a backward shock, with the backward shock fading very fast. in this model, the plasma of the forward shock entered a region of rapidly decreasing external pressure, allowing it to expand into the jet ambient medium and accelerate. in the following, the plasma recollimated and trailing features were formed in the wake of the leading component [19].

4 summary in this paper, the ejection of new jet components on parsec scales was associated with a major flux density outburst of 3c 111 in 2007. it was shown that the major flux density outburst can be associated with the ejection of a primary jet component and a secondary component. the evolution of the leading component suggests a split into multiple components. the full multi-epoch kinematical analysis of the vlba jet of 3c 111 between 2006 and 2011 will be presented elsewhere (grossberger et al., in prep).

fig. 4: flux density evolution of the "core" region and of the components as a function of time. conservative errors of 15 % are assumed for the flux densities of the model-fit components. the component symbols are the same as in figure 3, with the addition of the "core" region, represented by stars. the black arrow at the bottom indicates the calculated and almost identical ejection date of the primary and secondary component based on the derived jet kinematics

references
[1] abdo, a. a., et al.: apj, 2010a, 715, 429.
[2] abdo, a. a., et al.: apjs, 2010b, 188, 405.
[3] abdo, a. a., et al.: apj, 2010c, 720, 912.
[4] aller, m. f., aller, h. d., hughes, p. a.: radio astronomy at the fringe, zensus, j. a., cohen, m. h., ros, e. (eds.), asp conf. ser., 2003, 300, 159.
[5] angelakis, e., et al.: 2008, arxiv:0809.3912.
[6] baade, w., minkowski, r.: apj, 1954, 119, 215.
[7] chatterjee, r., et al.: apj, 2011, 734, 43.
[8] fanaroff, b. l., riley, j. m.: mnras, 1974, 167, 31p.
[9] fuhrmann, l., et al.: the first glast symposium, 2007, 921, 249.
[10] götz, m. m. a., et al.: a&a, 1987, 176, 171.
[11] hartman, r. c., kadler, m., tueller, j.: apj, 2008, 688, 852.
[12] högbom, j. a.: a&as, 1974, 15, 417.
[13] homan, d. c., et al.: apj, 2002, 568, 99.
[14] jorstad, s. g., et al.: aj, 2005, 130, 1418.
[15] kadler, m., et al.: apj, 2008, 680, 867.
[16] kellermann, k. i., et al.: aj, 1998, 115, 1295.
[17] linfield, r., perley, r.: apj, 1984, 279, 60.
[18] lister, m. l., et al.: aj, 2009, 137, 3718.
[19] perucho, m., et al.: a&a, 2008, 489, l29.
[20] preuss, e., alef, w., kellermann, k. i.: the impact of vlbi on astrophysics and geophysics, in m. j. reid, j. m. moran (ed.) iau symposium, 1988, vol. 129, p. 105.
[21] sguera, v., et al.: a&a, 2005, 430, 107.
[22] shepherd, m. c.: astronomical data analysis software and systems vi, in g. hunt, h. payne (ed.) asp conf. ser., 1997, vol. 125, p. 77.
[23] tombesi, f., et al.: 2011, arxiv:1108.6095

c. grossberger, j. wilms, c. müller, t. beuchert: dr. remeis sternwarte & ecap, universität erlangen-nürnberg, sternwartstr. 7, 96049 bamberg, germany
m. kadler: institut für theoretische physik und astrophysik, universität würzburg, am hubland, 97074 würzburg, germany; dr. remeis sternwarte & ecap, universität erlangen-nürnberg, sternwartstr. 7, 96049 bamberg, germany
e. ros: departament d'astronomia i astrofísica, universitat de valència, 46100 burjassot, valència, spain; max-planck-institut für radioastronomie, auf dem hügel 69, 53121 bonn, germany
r. ojha: goddard space flight center, nasa, 8800 greenbelt rd., greenbelt, md 20771, usa
m. aller, h. aller: department of astronomy, university of michigan, ann arbor, mi 48109-1042, usa
e. angelakis, l. fuhrmann, i. nestoras, r. schmidt, j. a. zensus, t. p. krichbaum: max-planck-institut für radioastronomie, auf dem hügel 69, 53121 bonn, germany
h. ungerechts, a. sievers, d. riquelme: instituto de radio astronomía milimétrica, avenida divina pastora 7, local 20, 18012 granada, spain

global optima for size optimization benchmarks by branch and bound principles

adéla pospíšilová, matěj lepš

faculty of civil engineering, ctu in prague, thákurova 7, 166 29 prague, czech republic
corresponding author: adela.pospisilova@fsv.cvut.cz

abstract this paper searches for global optima for size optimization benchmarks utilizing a method based on branch and bound principles. the goal is to demonstrate the process of finding these global optima on the basis of two examples. a suitable parallelization strategy is used in order to minimize the computational demands. optima reported in the literature are compared with the optima obtained in this work.

keywords: benchmarks, discrete sizing optimization, branch and bound method, global optima, parallel programming.

1 introduction optimization and search methodologies have become very popular for making products more desirable. the shape of a structure, the amount of reinforcement, the cross-sections, sheet thicknesses, the design of a concrete mixture, and many other properties can be optimized. recently, many heuristic algorithms have been developed and tested on benchmarks in order to assess their performance. in order to compare different optimization methods, it is necessary to have reliable information on the global optima of the benchmarks. in the past, it was not possible to obtain these optima by exhaustive search approaches, due to the large computational demands. as computational power grows year by year, it now seems to be the right time to deal with this issue. this paper outlines a process for searching for the global optima of discrete sizing optimization benchmarks.
various optimization methods can be used for obtaining optima, e.g. gradient methods [1], heuristic methods, and evolutionary algorithms [2]. these methods do not guarantee that a global optimum is obtained, because only a portion of the space is explored. it is not always necessary to obtain the global optimum: a local optimum of good quality found within a short time can be considered a mark of a high-quality algorithm. however, without knowledge of the global optima of the selected benchmarks, it is not possible to make a reliable assessment of the performance of the methodology that is used. in our work, we look for global optima for some fundamental benchmarks, using a method based on branch and bound principles. this approach requires two values, called bounds, which delimit the searched space. a good estimate of these bounds reduces the searched space but still ensures that the global optima can be found. the optimization problem for the benchmarks presented in this paper is defined by an objective function that is easy to solve, by constraints with high computational demands, and by a searched space that is discrete and huge. the algorithm presented here can be used for obtaining the global optimum of problems similar to those discussed in this paper.

2 sizing optimization sizing optimization [3] is a type of structural optimization that deals with truss-like structures. these structures are defined by a fixed topology, material, loading, supports, and a set of cross-sections or, alternatively, minimum and maximum cross-sectional areas of the individual truss bars. the objective function is the weight of the structure or its volume. the objective function is linear and easy to solve. the constraints are maximum stresses and maximum displacements, respectively. these functions are non-convex in most cases, and it is more time-consuming to solve them than to solve the objective function. the goal is to find sections for a given structure that satisfy the prescribed constraints and have the lowest possible weight. a selection of cross-sections from a given database defines a discrete optimization problem, whereas variables chosen from given limits lead to a continuous case. the continuous optimization problem can be efficiently solved by mathematical programming methods, e.g. gradient-based methods, as will be shown below. when using discrete variables, no such option is available. thus we first give our attention to the discrete case.

3 discrete problem the goal is to find a combination of cross-sections from the given list of profiles that leads to the lowest possible weight, while still fulfilling the given constraints. we present two methods that are able to find global optima for this discrete optimization problem.

3.1 enumeration enumeration (also called brute force or exhaustive search) is the simplest method for obtaining a global optimum of the discrete optimization problem. here, it is necessary to compute the values of the objective function and the constraints for every combination of cross-sections from a given set. enumeration therefore poses very large computational demands. if there are n sections and k variables (i.e., truss bars or groups of bars), then n^k possible solutions exist, i.e., the problem grows exponentially with the growing number of variables. enumeration can therefore be applied only to small structures or for analysing the neighborhood of some local optima.
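as a rough illustration of this exhaustive scan, consider the minimal python sketch below. the weight function follows eq. (1) given later in the paper; the bar lengths and the constraint check are hypothetical stand-ins, since the real check requires a finite element solve for the stresses and displacements.

from itertools import product

RHO = 0.1   # material density [lb/in^3], aluminium as in the benchmarks

def weight(areas, lengths):
    """objective from eq. (1): m = rho * sum(a_i * l_i)"""
    return RHO * sum(a * l for a, l in zip(areas, lengths))

def constraints_ok(areas):
    # hypothetical stand-in: the paper evaluates stresses and displacements
    # with a linear finite element model and checks them against limits
    return True

def enumerate_optimum(sections, lengths):
    """scan all n^k combinations; k = number of bars (or bar groups)"""
    best, best_mass = None, float("inf")
    for combo in product(sections, repeat=len(lengths)):
        m = weight(combo, lengths)
        if m < best_mass and constraints_ok(combo):
            best, best_mass = combo, m
    return best, best_mass

# example: 10 areas and 5 bars -> 10^5 candidate solutions
sections = [0.01 * i for i in range(1, 11)]      # in^2
lengths = [10.0, 10.0, 10.0, 14.14, 14.14]       # in, illustrative geometry
print(enumerate_optimum(sections, lengths))

the nested product over k positions makes the exponential growth explicit: every added bar group multiplies the number of candidates by n.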
3.2 a method based on branch and bound principles the branch and bound method is another method for obtaining global optima. land and doig [4] invented this method for linear programming problems. later, it was modified for discrete problems and for mixed-discrete problems [5]. branch and bound methods are based on dividing the main problem into several subproblems, known as branches. to estimate which branches are to be evaluated, the existence of a lower and an upper bound is assumed, restricting the searched space. the lower bound can be obtained by any continuous optimization method, because the global optimum with discrete design variables will never provide a lower value of the objective function than the global optimum with continuous design variables. the upper bound can be obtained by any heuristic method, because a local optimum always has a greater or equal value of the objective function than the global optimum. since the constraints of the sizing optimization problem are more computationally demanding than the value of the objective function, they are calculated only for solutions that lie between the lower and upper bounds. if we obtain a subproblem with a value of the objective function outside the given bounds, the rest of the branch is not calculated, because the global optimum cannot be located there. the more accurate the estimates of the lower and upper bounds are, the narrower the searched space can be. it is thus advantageous to decrease the upper bound on the basis of already obtained objective function values. although this methodology is more efficient than enumeration, it is still time-consuming. however, it is possible to parallelize the algorithm to reduce the computation time. the main idea of the parallel version of the algorithm is presented in section 6.

4 continuous problem a continuous optimization problem is more complex than a discrete problem, because an infinite number of potential solutions exists in the space of real numbers. therefore, for non-convex problems, it cannot be guaranteed that the optimum that is found is the global optimum. however, powerful and well-established continuous optimization algorithms, such as mathematical programming methods, can be used for obtaining a potential global optimum. obtaining a potential global optimum with continuous variables is therefore less demanding than solving the optimization problem with discrete variables. the main disadvantage of this methodology is the uncertainty about the quality of the solution. this can be overcome in one of two ways:

• the branch and bound method expects that the lower bound has the same (or a lower) value of the objective function as the global optimum with continuous variables. since the global optimum of the continuous problem cannot in general be known, the true lower bound cannot be ensured. as a solution, the lower bound is set to its lowest potential minimum, i.e., without using any continuous optimization method. this process provides the real global optimum with discrete variables. in most cases, however, the searched space will be too enormous for the computation of all possible solutions in real time.

• other approaches do not fully guarantee the acquisition of the global optimum. nevertheless, the probability of obtaining the global optimum is acceptable. these approaches are based on estimating the global optimum with continuous variables and the value of its objective function. we use nonlinear programming as implemented in the matlab environment (e.g.
the fmincon function). this routine is executed several times from random initial points. if the obtained optima do not differ from each other and the results are comparable to the optima published in the available literature, the estimate is considered credible. if the obtained optima differ from each other, then it is not possible to use them as the lower bound. the first approach (without using a continuous optimization method) is then used, or the lower bound is estimated to be e.g. 20 % lower than the solution that is obtained.

figure 1: the 5-bar truss (four nodes and five bars in a 10 in × 10 in layout, loaded by forces of 3 kips, 5 kips and 1 kip)

all continuous optima for the problems mentioned below were consistent with published results. for the sake of certainty, the nonlinear programming method was launched a hundred times with different starting vectors, and the best solution was considered as the lower bound.

5 benchmarks

5.1 5-bar truss in order to test the branch and bound algorithm, it was necessary to have a representative example of a structure that was small enough for the computational demands and yet big enough for branching purposes. note that imperial units are used throughout the text, because our solutions will be compared with the optima published in the available literature in the same units. the structure in fig. 1 has four nodes and five truss bars, and is made from aluminium. the density of the material is 0.1 lb/in^3 and the young's modulus is equal to 10^4 ksi. the allowable stress is limited to ±60 ksi in each element, and the displacements are limited to ±0.06 in along the horizontal and vertical directions. the continuous variables can vary between the lower bound 0.01 in^2 and the upper bound 0.1 in^2. the function for nonlinear programming, fmincon [6], offers four variants of the optimization algorithm. in this paper, the active set method [7] is suitable for our continuous sizing optimization problem. it is based on changing inequalities into equalities, followed by a line-search algorithm leading to a quadratic subproblem. this procedure is repeated in a sequence which converges in the limit to a critical point [8]. a starting point, i.e., a design variable vector composed of cross-sectional areas, an objective function, constraints and lower and upper bounds on the variables are necessary as input at the beginning of the algorithm. the objective function is the weight of the structure

m = f(a) = ρ Σ_{i=1}^{n} a_i l_i,    (1)

where n = 5 is the number of truss bars, a_i is the cross-sectional area and l_i the length of element i, and ρ is the density of the material. the constraints are defined as inequalities:

max |σ_i| − 60 ≤ 0,    (2)
max |w_j| − 0.06 ≤ 0,    (3)

where max |σ_i| is the maximum absolute value of the stresses, j is the ordinal number of the independent displacements and max |w_j| is the maximum absolute value of the displacements. these values can be obtained by several methods. in this paper, geometrically and physically linear behaviour was assumed and the finite element method was used, see e.g. [9].

table 1: 5-bar truss optima. b&b — method based on branch and bound principles, e — enumeration.

variable  | units | discrete (e) | discrete (b&b) | continuous (fmincon)
a1        | in^2  | 0.05         | 0.05           | 0.0500
a2        | in^2  | 0.01         | 0.01           | 0.01
a3        | in^2  | 0.06         | 0.06           | 0.0471
a4        | in^2  | 0.02         | 0.02           | 0.0167
a5        | in^2  | 0.01         | 0.01           | 0.01
m         | lb    | 0.179        | 0.179          | 0.157
max |w_j| | in    | 0.059        | 0.059          | 0.06
max |σ_i| | ksi   | 59.371       | 59.371         | 60.006
w_lim     | in    | 0.06         | 0.06           | 0.06
σ_lim     | ksi   | 60           | 60             | 60
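the multi-start use of fmincon described above can be sketched as follows. this is only a schematic analogue in python/scipy (slsqp standing in for matlab's active set method), with a toy objective and a toy constraint in place of the truss weight and the fem-based stress/displacement checks.

import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
lb, ub = 0.01, 0.1                     # area bounds [in^2]
k = 5                                  # number of design variables

def objective(a):
    return np.sum(a)                   # toy stand-in for the structure weight

# toy stiffness-like constraint standing in for eqs. (2)-(3); "ineq" means fun >= 0
cons = [{"type": "ineq", "fun": lambda a: 300.0 - np.sum(1.0 / a)}]

best = None
for _ in range(100):                   # "launched a hundred times"
    x0 = rng.uniform(lb, ub, size=k)   # random starting vector
    res = minimize(objective, x0, method="SLSQP",
                   bounds=[(lb, ub)] * k, constraints=cons)
    if res.success and (best is None or res.fun < best.fun):
        best = res

print(best.x, best.fun)                # lower-bound estimate for the b&b search

keeping only the best of many local solutions is exactly the credibility test described in the text: if the restarts keep converging to the same value, that value is accepted as the lower bound.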
the results for the continuous optimization problem of the 5-bar truss appear in tab. 1. the objective function value of the optimum obtained with the active set method is later used as the lower bound for the branch and bound method. dealing with inequalities in the form of equalities is handled by a penalty-type approach, and therefore the exact fulfillment of the constraints cannot be ensured, see table 1. this discrepancy is not crucial, since only a lower bound is needed, not the global continuous optimum. an identical topology of the 5-bar truss is used for the discrete optimization version. the material properties and constraints are also identical.

figure 2: a flow chart of the algorithm (steps 1–6, described below)

the cross-sectional areas are chosen from the set {0.01, 0.02, . . . , 0.1} in^2. since the structure has five elements (k = 5) and there are 10 cross-sectional areas (n = 10), the number of all possible solutions is n^k = 10^5. therefore, the structure is small enough and the discrete global optimum can be obtained by the enumeration method. the results appear in tab. 1. the lower bound for the branch and bound method is set to the optimum objective function value obtained with continuous variables. the upper bound is set to the estimated weight of 0.23 lb, which is 25 % higher than the global optimum value of the objective function obtained by the enumeration. the space is searched systematically between these two bounds until the global optimum is found. a flow chart of the algorithm is depicted in fig. 2. the steps can be described as follows:

1. first, we have to decide which values will be used as the initial point. it is appropriate to begin with the smallest profiles and then increase them, since the objective function is linear with respect to the cross-section areas. from the programming point of view, it is easier to use integer variables that are the ordinal numbers of the given set of cross-sectional areas (set m). for example, the initial design variable vector is (1 1 1 1 1), which means that the first area (0.01 in^2) from the given set is assigned to each truss bar, according to the numbering of the trusses shown in fig. 1.

2. the value of the objective function is then calculated and compared with the lower bound mmin and the upper bound mmax. if the weight of the structure is less than mmin, the algorithm goes to step 3. if the weight of the structure is between mmin and mmax, the algorithm proceeds to step 4. if the weight of the structure is greater than mmax, step 5 is executed.

3. the value of the objective function is less than mmin. it is therefore necessary to find a combination of variables that corresponds to a weight larger than mmin. the last variable is raised to its maximum value, e.g. (1 1 1 1 10), and the value of the objective function is calculated and compared to mmin.

(a) if the value of the objective function is still less than mmin, the algorithm searches for a combination with a weight greater than the lower bound. this can be done as follows. the next-to-last variable is repeatedly raised by one, e.g. to (1 1 1 2 10).
if the value of the next-to-last variable reaches its maximum, it is decreased to its minimum and the third-from-the-end variable is raised by one. the algorithm goes to step 2 at the moment when all variables are set such that m > mmin.

(b) if the value of the objective function is greater than the lower bound, the last variable is decreased to its minimum (1 1 1 1 1) and is then increased one by one ((1 1 1 1 2), (1 1 1 1 3), etc.) until the weight is greater than mmin. once m > mmin, the algorithm goes to step 2.

4. the value of the objective function is greater than mmin and less than mmax. the global optimum is located somewhere in this subspace. therefore, the constraints are evaluated, i.e., the stresses and displacements are calculated.

(a) if the constraints are fulfilled, i.e., max |σ_i| ≤ 60 ksi and max |w_j| ≤ 0.06 in, the upper bound is updated to the actual objective function value, mmax = m. thus the upper bound is pushed down towards the global optimum and the searched space is reduced. the last variable is then increased by one. if this variable exceeds the maximum possible cross-sectional area value from the given set, e.g. (1 1 5 11 1), its value is set to the minimum possible value and the preceding variable is increased by one, i.e., (1 1 6 1 1). the algorithm goes to step 2.

(b) if the constraints are not fulfilled, the value of the last variable is increased by one. the algorithm continues with step 2.

5. the value of the objective function is greater than mmax. the value of the last variable is decreased to its minimum, the next-to-last variable is increased by one and the objective function value is calculated. if the variable exceeds its maximum possible value from the given set, the algorithm acts as in step 4a.

(a) if the objective function value m is lower than mmax, the algorithm goes to step 2.

(b) if the objective function value m is greater than mmax, the value of the third variable from the end increases by one and the value of the next-to-last variable is set to its minimum. the algorithm continues in this way until the objective function value is less than mmax. if there is no such combination of cross-sectional areas, the algorithm is terminated.

6. if all variable values are set to their maxima, the algorithm ends. (a compact sketch of this search loop is given below, after the discussion of fig. 3.)

figure 3: a pie chart of the distribution of the 5-bar truss problem solutions solved by the branch and bound method: 1.41 % of the potential solutions lie below the lower bound (step 3), 6.13 % between the lower and the upper bound (step 4), and 92.46 % above the upper bound (step 5)

fig. 3 shows the distribution of the 5-bar truss potential solutions. the dark grey part shows the number of potential solutions below the lower bound, where only the objective function values are calculated, i.e., step 3 of the algorithm. the grey part shows the number of potential solutions between the lower and the upper bound, where the values of the objective function and also the constraints are calculated (step 4). the global optimum is included in this subspace. the light grey part represents the number of potential solutions above the upper bound, where only the objective function values are calculated (step 5). it is obvious that there is no need to compute the constraints for more than 90 % of the potential solutions. since the evaluation of the constraints is very computationally demanding, this results in significant time savings.
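the following minimal python sketch captures the essence of steps 1–6 as a bound-filtered scan in lexicographic order. it omits the flow chart's explicit skipping moves (raising the last variable to its maximum, etc.), which only accelerate the traversal; weight() and constraints_ok() are the same hypothetical stand-ins as in the enumeration sketch.

from itertools import product

RHO = 0.1

def weight(areas, lengths):
    return RHO * sum(a * l for a, l in zip(areas, lengths))

def constraints_ok(areas):
    return True   # hypothetical stand-in for the fem stress/displacement check

def branch_and_bound(sections, lengths, m_min, m_max):
    # scan in lexicographic order, evaluating the expensive constraints only
    # for candidates whose weight lies between the bounds (step 4)
    best = None
    for combo in product(sections, repeat=len(lengths)):
        m = weight(combo, lengths)
        if m < m_min or m >= m_max:      # steps 3 and 5: objective value only
            continue
        if constraints_ok(combo):        # step 4: fem check inside the bounds
            m_max = m                    # step 4a: pull the upper bound down
            best = (combo, m)
    return best, m_max

# 5-bar example: bounds from the continuous optimum and the 25 % margin
sections = [0.01 * i for i in range(1, 11)]
lengths = [10.0, 10.0, 10.0, 14.14, 14.14]
print(branch_and_bound(sections, lengths, m_min=0.157, m_max=0.23))

note how each feasible candidate immediately tightens m_max, which is precisely the mechanism that shrinks the grey region of fig. 3 as the run progresses.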
the reason for the bigger number of potential solutions above the upper bound is that the optimum is placed relatively close to the lower bound in the searched subspace. tab. 1 presents the results for the continuous optimization problem along with the results for the discrete problem solved by the enumeration and by the method based on branch and bound principles. since the enumeration calculates the values of the objective function as well as the constraints for all potential solutions, it is not possible to miss the global optimum. the results obtained by both presented methods are identical, and this comparison serves as a verification of the branch and bound method.

figure 4: a 25-bar truss (nodes numbered 1–10; overall dimensions 75 in, 100 in and 200 in)

5.2 25-bar truss the 25-bar truss is one of the most widely-used benchmarks for size optimization. it was introduced by fox and schmit [10] in 1966. the structure has ten nodes and four supports (at nodes 7–10), see fig. 4; therefore there are 18 free displacements. the structure is symmetric, so the elements were gathered into eight groups, listed in tab. 2. the truss is made of aluminium, with density equal to 0.1 lb/in^3 and with the young's modulus equal to 10^4 ksi. the loading is defined in tab. 3. all cross-sectional areas are chosen from the given set: 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.8, 3.0, 3.2 and 3.4 in^2, see [11]. the continuous variables range from 0.1 in^2 to 3.4 in^2. the allowable stress is set to ±40 ksi in all truss bars, and the maximum allowable displacement is ±0.35 in at all nodes along the x, y and z directions. the results for the continuous case are shown in tab. 4, where they are compared with the results published in the available literature. the discrete case cannot be enumerated within a reasonable time, because the number of potential solutions is n^k = 30^8 = 6.561 · 10^11, where k is the number of element groups. for the discrete case, the lower bound was set to the value found with continuous optimization, and the upper bound was set to the worst available solution in the literature [12].

table 2: member grouping for the 25-bar truss.

group of bars | connectivities
a1 | 1-2
a2 | 1-4, 2-3, 1-5, 2-6
a3 | 2-5, 2-4, 1-3, 1-6
a4 | 3-6, 4-5
a5 | 3-4, 5-6
a6 | 3-10, 6-7, 4-9, 5-8
a7 | 3-8, 4-7, 6-9, 5-10
a8 | 3-7, 4-8, 5-9, 6-10

table 3: loadings for the 25-bar truss (kips).

node | fx  | fy    | fz
1    | 1.0 | −10.0 | −10.0
2    | 0   | −10.0 | −10.0
3    | 0.5 | 0     | 0
6    | 0.6 | 0     | 0

table 4: comparison of results for the 25-bar truss, continuous case.

variable  | unit | perez & behdinan [14] | this paper
a1        | in^2 | 0.1                   | 0.1
a2        | in^2 | 0.457                 | 0.421
a3        | in^2 | 3.4                   | 3.4
a4        | in^2 | 0.1                   | 0.1
a5        | in^2 | 1.937                 | 1.917
a6        | in^2 | 0.965                 | 0.966
a7        | in^2 | 0.442                 | 0.471
a8        | in^2 | 3.4                   | 3.4
m         | lb   | 483.84                | 483.82
max |σ_i| | ksi  | 6.15                  | 6.13
max |w_j| | in   | 0.35                  | 0.35
σ_lim     | ksi  | 40                    | 40
w_lim     | in   | 0.35                  | 0.35

figure 5: speed-up of the 25-bar truss problem solved by the parallel branch and bound method (speed-up versus the number of cores, 1 to 8, compared with linear speed-up)

6 parallelization the 25-bar truss is relatively computationally demanding. since the evaluations of the solutions are independent of each other (except for updating the upper bound mmax, as described in step 4a of the algorithm), the method can be run in a parallel way. nowadays, modern computers are equipped with several processor cores, and we can make use of this computational power.
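the next paragraph describes the authors' matlab spmd implementation. purely as an illustration of the same master-worker idea (pre-generated prefix "packs", workers scanning the remaining groups, the master collecting and redistributing mmax), here is a schematic python multiprocessing sketch; it is our assumption, not the authors' code, and it deliberately uses a tiny section set with stand-in weight/constraint functions so that it runs quickly.

from itertools import product
from multiprocessing import Pool

SECTIONS = [0.1 * i for i in range(1, 4)]    # tiny demo set; the paper uses 30 areas

def weight(combo):
    return sum(combo)                         # stand-in for eq. (1)

def constraints_ok(combo):
    return True                               # stand-in for the fem check

def scan_pack(args):
    """worker ("lab"): scan one pack of prefixes, return its best feasible mass."""
    pack, m_min, m_max = args
    for prefix in pack:
        for tail in product(SECTIONS, repeat=4):   # remaining 4 groups per lab
            combo = prefix + tail
            m = weight(combo)
            if m_min <= m < m_max and constraints_ok(combo):
                m_max = m                          # local incumbent update
    return m_max

if __name__ == "__main__":
    m_min, m_max = 0.0, float("inf")
    prefixes = list(product(SECTIONS, repeat=4))   # 4 groups generated in advance
    packs = [prefixes[i:i + 50] for i in range(0, len(prefixes), 50)]
    with Pool(8) as pool:                          # 8 labs
        args = ((p, m_min, m_max) for p in packs)  # m_max is read lazily per pack
        for result in pool.imap(scan_pack, args):
            m_max = min(m_max, result)             # master collects and reuses m_max
    print(m_max)

the lazily evaluated argument generator roughly mimics the paper's scheme of resending the smallest mmax to every lab at the start of each iteration; a shared-memory value would mimic it more faithfully at the cost of extra synchronization.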
the matlab environment offers several parallelization tools, but not all of them provide shared memory. this consideration is essential for our algorithm because of the updating of the upper bound. the appropriate method is the spmd method, i.e., the single program multiple data method, see e.g. [13] for more details. the spmd statement marks the block of code to be run simultaneously on multiple labs. as with the parfor loop method, the matlabpool open n command opens the required number of labs. data can be sent to another lab by the labsend(data, x) command, where x is the index of the receiving lab to which the data is sent. the data is received by the labreceive(y) command, where y is the index of the lab from which the data will come. it is appropriate to split the data at only one so-called master lab and to receive the data with the other so-called slaves. the master can process its own data as well. the main problem here is to estimate a proper amount of data for each lab: if the data is sent too often, the communication between the master and the slaves is too costly. in the 25-bar truss task, permutations with repetition are generated in advance for several groups of elements (e.g. four groups), and the remaining groups of elements (the four other groups) are generated in the branch and bound method independently at each lab. the generated combinations are assigned to the individual labs in advance, and then the algorithm continues in the same way as for the 5-bar truss task. the mmax values are collected at the end of each iteration. the smallest one is chosen as the new mmax and is resent to every lab as the initial value of mmax for the next iteration. if all data has been used, the smallest value of mmax is taken as the global optimum.

figure 6: a graph of the decreasing upper bound for the 25-bar truss problem (weight between 480 lb and 560 lb versus the iterations of the algorithm on a logarithmic scale)

for the parallel version of the algorithm, scaling is very important. if the algorithm scales badly, parallelization is not useful at all. ideally, we would like to achieve linear scaling, i.e., a speed-up of n on n cores. however, it is very hard to obtain linear scaling, e.g. because of the time spent on communication. fig. 5 shows a graph where the speed-up of the parallel algorithm is compared at 1 to 8 labs. (for the speed-up measurement it was not necessary to compute the whole task: some variables were fixed to prescribed values, here 2 out of the 8 variables, and the algorithm was run with this restriction.) the hp xeon z600 workstation with two intel xeon e5520 4-core processors (2.27 ghz) was used for the computations, within the matlab r2009a 64-bit debian gnu/linux environment.

7 conclusions fig. 6 shows a graph with the decreasing upper bound mmax for the 25-bar truss problem. the value of mmax is the best-so-far solution found during the whole algorithm run. it can be interpreted as the convergence of the objective function to the global optimum. all combinations generated in advance were divided into smaller blocks of data containing fifty combinations each, and these "packs" were sequentially sent to the eight labs. the number of all iterations was therefore 30^4/(8 · 50) = 2025. the global optimum was found in the 66th iteration. nevertheless, it should be pointed out that the number of iterations depends strictly on the data ordering, or on the starting point (minimum vs. maximum cross-sectional areas). since the task is to find the global optima, the whole subspace of potential solutions must be searched, and it is not possible to shorten the computation.
in tab. 5, we compare the optima obtained with the branch and bound method with those of the heuristic algorithms found in the literature. the result of the branch and bound method (b&b) is identical to the solution presented by kripka [15] using the simulated annealing method (sa). however, he did not search the whole subspace of possible solutions, so he could not be sure that the obtained optimum is the global one. to the best of the authors' knowledge, global optima for computationally demanding tasks such as the 25-bar truss problem have not been published yet. we hope that by publishing the algorithm as well as the value of the global optimum we will introduce a quality standard that will help to improve new optimization methods.

table 5: a comparison of results for the 25-bar truss, discrete case, from the literature and from our work. b&b — method based on branch and bound principles, sa — simulated annealing, ga — genetic algorithm, pso — particle swarm optimization. σ_lim = 40 ksi, w_lim = 0.35 in.

variable  | units | this paper (b&b, 2012) | kripka [15] (sa, 2004) | lemonge & barbosa [16] (ga, 2004) | li & liu [17] (pso, 2009) | wu & chow [11] (ga, 1995) | coello [18] (ga, 1994) | rajeev & krishnamoorthy [12] (ga, 1992)
a1        | in^2  | 0.1    | 0.1    | 0.1    | 0.1    | 0.1    | 1.5    | 0.1
a2        | in^2  | 0.4    | 0.4    | 0.3    | 0.3    | 0.5    | 0.7    | 1.8
a3        | in^2  | 3.4    | 3.4    | 3.4    | 3.4    | 3.4    | 3.4    | 2.3
a4        | in^2  | 0.1    | 0.1    | 0.1    | 0.1    | 0.1    | 0.7    | 0.2
a5        | in^2  | 2.2    | 2.2    | 2.1    | 2.1    | 1.5    | 0.4    | 0.1
a6        | in^2  | 1      | 1      | 1      | 1      | 0.9    | 0.7    | 0.8
a7        | in^2  | 0.4    | 0.4    | 0.5    | 0.5    | 0.6    | 1.5    | 1.8
a8        | in^2  | 3.4    | 3.4    | 3.4    | 3.4    | 3.4    | 3.2    | 3
m         | lb    | 484.33 | 484.33 | 484.85 | 484.85 | 486.29 | 539.78 | 546.01
max |σ_i| | ksi   | 6.20   | 6.20   | 6.11   | 6.11   | 6.01   | 6.66   | 6.77
max |w_j| | in    | 0.35   | 0.35   | 0.35   | 0.35   | 0.35   | 0.34   | 0.35

acknowledgements this work was supported by the grant agency of the czech technical university in prague, grants number sgs11/021/ohk1/1t/11 and number sgs12/027/ohk1/1t/11, the czech science foundation gacr, grant number p105/12/1146, and by the ministry of education, youth and sports, grant number msm 6840770003.

references
[1] shewchuk, j. r.: an introduction to the conjugate gradient method without the agonizing pain, school of computer science, carnegie mellon univ., pittsburgh, [online], 1994.
[2] eiben, a. e., smith, j. e.: introduction to evolutionary computing, berlin: springer, 2003.
[3] bendsoe, m. p., sigmund, o.: topology optimization: theory, methods and applications, berlin: springer, 2003.
[4] land, a. h., doig, a. g.: "an automatic method of solving discrete programming problems." econometrica, 28 (3), 1960, pp. 497–520.
[5] arora, j. s.: methods for discrete variable structural optimization. in: "recent advantages in optimal structural design". american society of civil engineers, 2002, p. 1–40.
[6] the mathworks, find minimum of constrained nonlinear multivariable function, matlab. [02/10/2011]. http://www.mathworks.com/
[7] bhatti, m. a.: practical optimization methods, new york: springer-verlag, 2000.
[8] the mathworks, constrained nonlinear optimization algorithms: optimization algorithms and examples. [02/10/2011]. http://www.mathworks.com/
[9] pospíšilová, a.: analysis of sizing optimization benchmarks, bachelor thesis, ctu in prague, 2010. http://klobouk.fsv.cvut.cz/~pospisilova/publications/ap_bp_2010.pdf
[10] fox, r. l., schmit, l. a. jr.: "advances in the integrated approach to structural synthesis." j spacecraft rockets, 3 (6), 1966, p. 858–866.
[11] wu, s.-j., chow, p.-t.: "steady-state genetic algorithms for discrete optimization of trusses." comput struct, 56 (6), 1995, p. 979–991.
[12] rajeev, s., krishnamoorthy, c. s.: "discrete optimization of structures using genetic algorithms." j struct eng, 118 (5), 1992, p. 1233–1250.
[13] the mathworks, executing simultaneously on multiple data sets: single program multiple data (spmd). [02/10/2011]. http://www.mathworks.com/
[14] perez, r. e., behdinan, k.: particle swarm optimization in structural design. in: "swarm intelligence, focus on ant and particle swarm optimization". itech education and publishing, 2007, p. 373–394.
[15] kripka, m.: "discrete optimization of trusses by simulated annealing." j braz soc mech sci eng, 26 (2), 2004, p. 170–173.
[16] lemonge, a. c. c., barbosa, h. j. c.: "an adaptive penalty scheme for genetic algorithms in structural optimization." int j numer meth eng, 59 (5), 2004, p. 703–736.
[17] li, l. j., huang, z. b., liu, f.: "a heuristic particle swarm optimization method for truss structures with discrete variables." comput struct, 87 (7-8), 2009, p. 435–443.
[18] coello, c. a. c.: discrete optimization of trusses using genetic algorithms. in: "expersys-94: expert systems applications and artificial intelligence". i.i.t.t. international, 1994, p. 331–336.

the atlas experiment entering into operation: overview, motivation and status of the project

p. jenni

the large hadron collider lhc at the european laboratory for particle physics cern near geneva will deliver particle collisions at the highest energy ever achieved in a laboratory. after more than 15 years of design and construction efforts, the lhc and its experiments are finally starting operation. besides the giant accelerator, which is installed in a ring tunnel 27 km in length about 100 m underground, the no less impressive and complex detectors are ready for data collection. a general overview of atlas, one of the key experiments, has been given. the lhc will allow atlas to study for the first time fundamental physical phenomena as they occurred very shortly after the big bang. atlas will address questions like: why do particles have a mass, what is the non-visible dark matter in the universe, are there more than four dimensions in nature, and what are the smallest building blocks of matter? the expectations for new discoveries are high, since physicists have for decades been eagerly awaiting this exploratory step into the unknown. the technically sophisticated and highly complex atlas detector has been developed and constructed through world-wide collaboration, and the czech teams have made remarkable contributions to the project since its conception some 20 years ago, very much so thanks to the constant strong support and encouragement provided by professor niederle. one important and most pleasant milestone in the history of atlas was the yearly overview week. the september 2003 overview took place in prague, and was opened with an address by professor niederle. some highlights of the construction and commissioning of the detector have been illustrated, as well as several examples of anticipated discovery signals. it would be artificial and almost impossible to attempt to summarize the huge amount of work done by the collaborating teams within just a few pages. the reader is therefore referred to the two following references. a comprehensive and detailed description of the construction of the atlas detector and its expected performance can be found in ref. [1], while ref. [2] is part of an overall lhc project publication that provides an account of the atlas experiment as a whole.
references
[1] aad, g., et al. (the atlas collaboration): 'the atlas experiment at the large hadron collider', jinst 3 (2008), s08003.
[2] jenni, p.: in: the large hadron collider: a marvel of technology, edited by l. evans, epfl press, 2009, p. 182–199. isbn 978-2-940222-34-2.

peter jenni, cern, former spokesperson of the atlas collaboration, european organization for nuclear research cern, ch-1211, genève 23, switzerland

spectroscopy of the stellar wind in the cygnus x-1 system

i. miškovičová, m. hanke, j. wilms, m. a. nowak, k. pottschmidt, n. s. schulz

abstract the x-ray luminosity of black holes is produced through the accretion of material from their companion stars. depending on the mass of the donor star, accretion of the material falling onto the black hole through the inner lagrange point of the system or accretion by the strong stellar wind can occur. cygnus x-1 is a high mass x-ray binary system, where the black hole is powered by accretion of the stellar wind of its supergiant companion star hde226868. as the companion is close to filling its roche lobe, the wind is not symmetric, but strongly focused towards the black hole. chandra-hetgs observations allow for an investigation of this focused stellar wind, which is essential to understand the physics of the accretion flow. we compare observations at the distinct orbital phases of 0.0, 0.2, 0.5 and 0.75. these correspond to different lines of sight towards the source, allowing us to probe the structure and the dynamics of the wind.

keywords: x-ray binary, cygnus x-1, stellar wind, accretion, spectroscopy.

1 introduction

1.1 stellar winds of o stars stellar winds of early-type (o or b) stars are driven by radiation pressure acting, through the copious absorption lines present in the ultraviolet part of the spectrum, on the material in the stellar atmosphere [4]. the winds are therefore very strong; common mass loss rates are ∼ 10^−6 m⊙/year. since the primaries of high-mass x-ray binaries are o or early b stars [5], which radiate in the uv, this radiation is strong enough to produce such a wind. according to simulations of line-driven winds, perturbations are present, and dense and cool inhomogeneities are created in the wind [6]. larger density, velocity, and temperature variations compress the gas further, creating "clumps". current knowledge about stellar winds assumes two disjunct components of o star winds: cool dense clumps and hot tenuous gas. sako et al. [16] showed that the observed spectra of x-ray binaries can only be explained as originating from an environment where the cool and dense clumps are embedded in the photoionized gas.

1.2 cygnus x-1 cygnus x-1 is a binary system where the x-ray source is a black hole [3, 19], and the ∼ 18 m⊙ [14], o9.7 iab type star hde 226868 is its companion [18]. stellar wind accretion plays the major role in the mass transfer process, because cyg x-1 belongs to the high-mass x-ray binaries (hmxb); this is in contrast to low-mass x-ray binaries (lmxb), where roche lobe overflow is more important and accretion proceeds through an accretion disk. there are strong tidal interactions in the system. moreover, the donor star fills ∼ 90 % of its roche volume [5, 9]. therefore the wind is not symmetric, but focused towards the black hole [7], such that the density and the mass loss rate are higher along the binary axis.
the fact that such a high percentage of the roche lobe is filled, however, means that we cannot exclude roche lobe overflow taking place as well.

1.3 hard and soft state of cygnus x-1 black hole binaries show two principal types of emission, called the hard and the soft state, which differ in the shape of the x-ray spectrum, the timing properties and the radio emission. cyg x-1 spends most of the time in the hard state, with a hard, exponentially cut-off power-law spectrum, strong short-term variability and steady radio emission [15, 20]. however, transitions between the hard and soft states are observed (fig. 1).

fig. 1: 1.5−12 kev light curve of cyg x-1 obtained from 1996 to 2010 by the rxte-asm. over time, transitions from hard to soft state occur. compared to the time spent in the low-luminosity hard state (blue, ≲ 40 c/s asm count rate), cyg x-1 spent less time in the high-luminosity soft state (red, ∼ 100 c/s asm count rate)

how exactly the wind properties differ between the states and what triggers the state transitions are questions remaining to be answered. one possibility is that the states correspond to different configurations of the accretion flow [17], which cause differences in the energy dissipation. another possibility is that changes of the wind properties themselves trigger the state transitions [10]. the mass transfer process in either case (hmxb or lmxb) provides extremely efficient energy release and produces luminosities of ∼ 10^37 erg/s in general. for such a luminosity, which is typical for cyg x-1, the x-ray source produces considerable feedback on the wind by photoionization of its nearby environment [2], which contributes to the complex wind structure.

2 observations and data analysis

2.1 observations and orbital coverage the spectra used for our analysis were obtained by the high energy transmission grating spectrometer (hetgs – hetg in combination with acis, the advanced ccd imaging spectrometer) on board the chandra observatory. chandra acis observations are performed in two different modes: timed exposure (te) mode and continuous clocking (cc) mode. in te mode, the ccd is exposed for some time and then its data are transferred to the frame store, which is read out during the next exposure. the readout time required for the full frame store is 3.2 s. in cc mode, the columns are read out continuously, which reduces the readout time to 3 ms [8]¹. when the source is very bright, more than one photon may reach the same pixel in one frame time (pile-up). these photons are misinterpreted as one single event with higher energy. cc mode is usually used to avoid pile-up. high-resolution spectra of persistently bright sources like cyg x-1 provide the unique possibility of probing the structure of the wind directly. however, this structure, and therefore also the properties of the wind (density, velocity, ionization state), change with different lines of sight, which correspond to different orbital phases. thus, a good coverage of the binary orbit is desirable. the observations (whether in the hard or soft state) which are currently available cover the part of the orbit around phase 0.0, between phases 0.7 and 0.2, and around phase 0.5, the latter only obtained in january 2010 (fig. 2a). we focus here on the comparison of observations obtained at the four distinct phases φ = 0.0, which is defined at the time of superior conjunction of the black hole (obsid 3814 and obsid 8525), 0.2 (obsid 9847), 0.5 (obsid 11044) and 0.75 (obsid 3815), see fig. 2c.
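for concreteness, orbital phases like those just listed follow from folding the observation times on the binary ephemeris. the sketch below assumes the well-known ∼ 5.6 d period of cyg x-1/hde 226868; the zero point t0 is a placeholder, not the ephemeris actually used in this work.

# schematic phase folding; phase 0 corresponds to superior conjunction
# of the black hole (the convention stated in section 2.1)
P = 5.6            # orbital period [d], approximate literature value
T0 = 50000.0       # placeholder epoch of superior conjunction [mjd]

def orbital_phase(t_mjd):
    return ((t_mjd - T0) / P) % 1.0

print(orbital_phase(54556.7))   # e.g. one day of the 2008 april campaign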
the φ ≈ 0.75 observation (obsid 3815) was obtained in cc mode, while all the others were obtained in te mode. this difference has no influence on our comparison: while the calibration of cc mode does not allow for an adequate modelling of the whole continuum shape, the local absorption lines, which are our primary interest, are not affected.

fig. 2: a) polar view of the cyg x-1 orbit and illustration of the observation coverage with chandra. before january 2010, there were 13 observations available, mostly covering the part of the orbit around phase φ ≈ 0. another observation (obsid 11044) was obtained at φ ≈ 0.5. a short observation (obsid 2742) was the only one at this phase before, but since it was obtained in te mode during the soft state of the source, it strongly suffered from pile-up. full lines (dashed lines) display te mode (cc mode) observations. changes from blue to red colour correspond to changes from hard to soft state. b) color-coded orbital phases corresponding to lines of sight towards cyg x-1, taking into account an inclination of ∼ 35°. c) highlighted observations at phases φ ≈ 0.0 (obsid 3814 and obsid 8525), 0.2 (obsid 9847), 0.5 (obsid 11044) and 0.75 (obsid 3815), which are analyzed and compared in this work

the phase 0.0 coverage is extremely important since, due to the inclination of ∼ 35° of the cyg x-1 orbital plane, it corresponds to looking through the densest part of the wind close to the stellar surface (fig. 2b). the distribution of x-ray dips with orbital phase peaks around phase 0.0 [1]. the observation around phase 0.5 provides a great opportunity to close a gap in defining the general picture of the wind structure. while all recent chandra observations have caught cyg x-1 in the hard state at ≲ 100 c/s (chandra count rate), comparable to the observation at φ ≈ 0 [12], the spectrum was softer and the flux was more than twice as high during the observation at φ ≈ 0.7. the light curves at φ ≈ 0 are modulated by strong and complex absorption dips, but dipping occurs already at φ ≈ 0.7 and has not ceased at φ ≈ 0.2, though the dip events seem to become shorter with distance from φ = 0. the light curve at φ ≈ 0.5 is totally free of dips, yielding 30 ks of remarkably constant flux.

¹ chandra x-ray center, the chandra proposers' observatory guide, 2009, http://cxc.harvard.edu/proposer/pog/

2.2 absorption dips according to the general assumption, absorption dips, during which the soft x-ray flux decreases sharply, originate from inhomogeneities – "clumps" – present in the wind, where the material is of higher density and lower temperature [4, 16]. according to the softness ratios in the color-color diagram (fig. 3), different stages of dipping can be classified.

fig. 3: all of these diagrams show a "soft softness ratio" on the x-axis and a "hard softness ratio" on the y-axis. dipping produces a clear track in the color-color diagrams: both colors harden towards the lower left corner, due to increased absorption. however, during extreme dips, the soft color becomes softer again, which is likely due to partial coverage

fig. 4: "dip" and "non-dip" spectra (6–7.5 å, covering lines of si xiv–vii, mg xii, al xiii and the mg xi series) from the observation at phase φ ≈ 0.7, shown for a comparison of the absorption lines present in different stages of dipping. the reduction in flux in the spectra is real and due to dips.
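the dip/non-dip separation by softness ratios described in section 2.2 can be sketched as follows; the synthetic band light curves and the cut values are illustrative assumptions, not the bands or thresholds actually used for fig. 3.

import numpy as np

rng = np.random.default_rng(1)
# hypothetical band light curves (counts/s) on a common time grid; the real
# ones would be extracted from the chandra event list in three energy bands
soft = rng.uniform(5, 10, 500)
med = rng.uniform(4, 9, 500)
hard = rng.uniform(2, 6, 500)

sr_soft = med / soft      # "soft softness ratio" (x-axis of fig. 3)
sr_hard = hard / med      # "hard softness ratio" (y-axis of fig. 3)

# non-dip bins populate the upper right corner of the color-color diagram;
# the cut values below are placeholders
nondip = (sr_soft > 0.8) & (sr_hard > 0.6)
print(f"non-dip fraction of the exposure: {nondip.mean():.2f}")

time bins flagged in this way are then accumulated into the separate "dip" and "non-dip" spectra that are compared in fig. 4.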
in the "non-dip" spectrum, only absorption lines of si xiv and si xiii are present, while in the "dip" spectrum the whole series of si xii–vii appears

figure 4 shows the spectrum from the observation at phase φ ≈ 0.7 (obsid 3815) split into "dip" and "non-dip" stages in the wavelength interval of the si region between 6 å and 7.5 å. while absorption lines of si xiv and si xiii are already present in the non-dip spectrum, the dip spectra contain additional strong absorption lines that can be identified with kα transitions of the lower ionized si xii–vii. the strength of the low-ionization lines increases with the degree of dipping, indicating that the latter is related to clumps of lower temperature. moreover, the clumps are of higher density than their surroundings. in 2008 our group organized a multi-satellite observational campaign with xmm-newton, chandra (obsid 8525 and obsid 9847), suzaku, rxte, integral and swift observing cyg x-1 simultaneously. the dips shortly after phase φ ≈ 0 were so strong that they were seen by all the instruments involved in the campaign, even rxte–pca or integral–isgri [13]. the light curve from xmm-newton is shown in fig. 5 [11]. where the dips occur in the light curve, the hydrogen column density, nh, increases strongly.

fig. 5: absorption dips and scattering of x-rays as seen with xmm-newton, epic-pn, during 2008 april (mjd − 54556). a) the light curve in the energy band 0.3–10 kev shows absorption dips, which are identical to the dips observed by chandra shortly after φ ≈ 0. b) the hydrogen column density of the neutral absorption model increases strongly when dips occur. c) the relative flux normalization constant, which is consistent with the scattering trough seen in hard x-rays [11]

as shown in the third panel, however, it is not only pure absorption that causes the dips. thomson scattering contributes during these times, causing longer time scale variations, also at hard x-rays. a possible explanation is the existence of dense and (nearly) neutral clumps (causing the sharp dips) embedded in ionized halos (causing the scattering) [11].

2.3 spectroscopy we separate the "non-dip" and the "dip" parts of the observations. the "non-dip" spectrum is extracted from the least absorbed phases, at the upper right corner of the color-color diagram (except for obsid 3815), and the spectroscopic results here refer to these "non-dip" phases. the highly photoionized wind is detected at φ ≈ 0 via numerous strong absorption lines at vrad ≈ 0 (fig. 6).

fig. 6: absorption and p cygni profiles of si xiv, mg xii and ne x. while the observations at φ ≈ 0 and φ ≈ 0.75 show clear absorption profiles, although redshifted for φ ≈ 0.75, the observation at φ ≈ 0.5 shows emission at vrad ≈ 0 and blueshifted absorption

the lack of appreciable doppler shifts can be explained by the wind flow being orthogonal to the line of sight. in contrast, the recent observation (obsid 11044) at φ ≈ 0.5 reveals, for the first time for cyg x-1, clear p cygni profiles with a strong emission component at a projected velocity vrad ≈ 0, while the weak absorption components occur at a blueshift of ≈ 500–1000 km/s. if we observe the same plasma in both cases, this indicates that the real velocity must be small, i.e., we are probing a dense, low-velocity wind close to the stellar surface.
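as a reminder of how such projected velocities follow from the measured line positions, the non-relativistic doppler relation v = c (λ_obs − λ_rest)/λ_rest is sufficient at these speeds. in the sketch below, the rest wavelength is the approximate si xiv lyman α value, while the observed centroid is invented for illustration.

C_KMS = 299_792.458   # speed of light [km/s]

lam_rest = 6.182      # si xiv ly-alpha rest wavelength [angstrom], approximate
lam_obs = 6.170       # invented measured line centroid [angstrom]

v_rad = C_KMS * (lam_obs - lam_rest) / lam_rest
print(f"v_rad ~ {v_rad:.0f} km/s")   # negative = blueshift, toward the observer

with these illustrative numbers the result is roughly −580 km/s, i.e. a blueshift of the same order as the 500–1000 km/s absorption components quoted above.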
the fact that the absorption line profiles measured at φ ≈ 0.75 are redshifted by ≈ 200–300 km/s indicates that the wind flow is not radial from the star, as a radial wind (i.e., one directed away from the star) would always give a blueshifted velocity when projected onto the line of sight at phases φ = 0.25–0.75.

3 summary the new chandra observation of cygnus x-1 at orbital phase 0.5, obtained in january 2010, allows us to compare observations at the four distinct orbital phases 0.0, 0.2, 0.5 and 0.75. with such a coverage, the full structure of the wind starts to reveal itself. at phase 0.0 we look through the densest part of the wind, as it is focused towards the black hole. the light curve is modulated by strong absorption dips. the flux decreases strongly during such dips, consistent with their being caused by dense and cool clumps of material embedded in the more ionized wind. while absorption lines of si xiv and si xiii are already present in the non-dip spectrum, in the dip spectra kα transitions of lower ionized si appear as well, and the strength of these lines increases with the degree of dipping. an especially interesting result is the totally flat light curve around phase 0.5. while dipping has started around phase 0.7, is strongest around 0.0 and is still present at 0.2, it has vanished at 0.5. we have therefore proposed the next observation, between phases 0.25 and 0.4, to investigate the transition between the dipping and non-dipping phases. the spectroscopic analysis showed another interesting result: in the spectrum at phase 0.5, clear p cygni profiles of lyman α transitions were observed for the first time for cyg x-1. we observe here strong emission components at a projected velocity vrad ≈ 0, in contrast to the pure absorption observed at phase 0.0. detailed modeling of the photoionization and the wind structure is in progress.

acknowledgement the research leading to these results was funded by the european community's seventh framework programme (fp7/2007-2013) under grant agreement number itn 215212 "black hole universe" and by the bundesministerium für wirtschaft und technologie under grant number dlr 50 or 0701.

references
[1] bałucińska-church, m., et al.: the distribution of x-ray dips with orbital phase in cygnus x-1, mnras, 311, 2000, p. 861–868.
[2] blondin, j. m.: the shadow wind in high-mass x-ray binaries, astrophys. j., 435, 1994, p. 756–766.
[3] bolton, c. t.: cygnus x-1 – dimensions of the system, nature, 240, 1972, p. 124.
[4] castor, j. i., abbott, d. c., klein, r. i.: radiation-driven winds in of stars, astrophys. j., 195, 1975, p. 157–174.
[5] conti, p. s.: stellar parameters of five early type companions of x-ray sources, astron. astrophys., 63, 1978, p. 225–235.
[6] feldmeier, a., puls, j., pauldrach, a. w. a.: a possible origin for x-rays from o stars, astron. astrophys., 322, 1997, p. 878–895.
[7] friend, d. b., castor, j. i.: radiation-driven winds in x-ray binaries, astrophys. j., 261, 1982, p. 293–300.
[8] garmire, g. p., et al.: advanced ccd imaging spectrometer (acis) instrument on the chandra x-ray observatory, in truemper, j. e., tananbaum, h. d. (ed) x-ray and gamma-ray telescopes and instruments for astronomy. proceedings of the spie, 4851, 2003, p. 28–44.
[9] gies, d. r., bolton, c. t.: the optical spectrum of hde 226868 = cygnus x-1. iii. a focused stellar wind model for he ii lambda 4686 emission, astrophys. j., 304, 1986, p. 389–393.
[11] hanke, m., et al.: a thorough look at the photoionized wind and absorption dips in the hde 226868 x-ray binary system, in the energetic cosmos: from suzaku to astro-h, tokyo: jaxa special publication, jaxa-sp-09-008e, 2010, p. 294–295.
[12] hanke, m., et al.: chandra x-ray spectroscopy of the focused wind in the cygnus x-1 system. i. the nondip spectrum in the low/hard state, astrophys. j., 690, 2009, p. 330–346.
[13] hanke, m., et al.: multi-satellite observations of cygnus x-1, in vii microquasar workshop: microquasars and beyond, pos (mqw7), 2008.
[14] herrero, a., et al.: fundamental parameters of galactic luminous ob stars. ii. a spectroscopic analysis of hde 226868 and the mass of cygnus x-1, astron. astrophys., 297, 1995, p. 556–566.
[15] pottschmidt, k., et al.: long term variability of cygnus x-1. i. x-ray spectral-temporal correlations in the hard state, astron. astrophys., 407, 2003, p. 1039–1058.
[16] sako, m., et al.: structure and dynamics of stellar winds in high-mass x-ray binaries, in branduardi-raymont, g. (ed.): high resolution x-ray spectroscopy with xmm-newton and chandra, mullard space science laboratory of university college london, holmbury st mary, dorking, surrey, uk, 2002.
[17] smith, d. m., heindl, w. a., swank, j. h.: two different long-term behaviors in black hole candidates: evidence for two accretion flows?, astrophys. j., 569, 2002, p. 362–380.
[18] walborn, n. r.: the spectrum of hde 226868 (cygnus x-1), astrophys. j. lett., 179, 1973, p. l123–l124.
[19] webster, b. l., murdin, p.: cygnus x-1 — a spectroscopic binary with a heavy companion?, nature, 235, 1972, p. 37–38.
[20] wilms, j., et al.: long term variability of cygnus x-1. iv. spectral evolution 1999–2004, astron. astrophys., 447, 2006, p. 245–261.

ivica miškovičová, manfred hanke, jörn wilms — dr. karl remeis-sternwarte, universität erlangen-nürnberg & ecap, sternwartstr. 7, 96049 bamberg, germany

michael a. nowak, norbert s. schulz — mit kavli institute for astrophysics and space research, ne80-6077, 77 mass. ave., cambridge, ma 02139, usa

katja pottschmidt — cresst and nasa goddard space flight center, astrophysics science division, code 661, greenbelt, md 20771, usa; center for space-science & technology, university of maryland baltimore county, 1000 hilltop circle, baltimore, md 21250, usa

connections of trapezoidal sheets under fire

p. kallerová, f. wald, z. sokol

this paper describes two different experiments on connections of trapezoidal sheets under elevated temperatures. the first experiments were tensile tests carried out on four sets of tests with screwed connections under ambient and elevated temperatures. one diameter of self-drilling screws and three different thicknesses of trapezoidal sheets were used. the applied screws were without washers, or with sealed or steel washers. the second experiment was performed in a laboratory furnace to check the catenary action of a thin-walled trapezoidal sheet. the basic theory tested in this experiment was that in the first phase of the fire the sheet behaves as a simply supported beam, while in the second phase the load bearing is transferred by a tension membrane. these experiments will be used to develop a design model of connections at high temperatures. high fire resistance of the trapezoidal sheet, dependent on suitable design of the screwed connection to the bearing structure, was confirmed. the experiment with the simple beam also confirmed catenary action.

keywords: screwed connection, self-drilling screw, trapezoidal sheet, elevated temperature, catenary action.

1 introduction

the behaviour of a steel structure in a fire situation differs from its behaviour at ambient temperature. the mechanical properties and the thermal expansion change with increasing temperature. in particular, the yield stress and the modulus of elasticity have a significant influence on the bearing capacity of steel members, especially thin-walled elements. a corrugated sheet is able to transfer the bending moments in the early phase of the fire. the thermal expansion of steel extends the sheet and results in increased deflection. at this stage, the bolted connection is loaded by forces induced by thermal expansion. at higher temperatures, the bending moment resistance is reduced and a major part of the load is transferred by the tension membrane. at this moment, the resistance and stiffness of the bolted connection have a significant influence on the sheet behaviour. the connections transfer the membrane force to the supports. the performance of the connection is also important in the cooling phase of the fire. the resistance of the connection is strongly influenced by the change in the mechanical properties of the corrugated sheet.

the increase in temperature leads to a decrease in the yield stress and the modulus of elasticity of thin-walled cold-formed steel members, and hence to a reduction in the load bearing capacity of the structure. however, the ultimate strength is slightly increased at moderately elevated temperatures: the maximum strength is reached at 250 °c, and the original value is recovered at about 350 °c. a further increase in temperature leads to a decrease in bearing capacity. for temperatures higher than 400 °c, the yield point is no longer visible on the force-deformation diagram. buckling of thin-walled elements is influenced by the reduced value of the modulus of elasticity.
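this temperature dependence is commonly expressed through reduction factors applied to the ambient yield stress and modulus of elasticity. the sketch below is illustrative only: the factors are indicative values in the spirit of the carbon-steel tables of en 1993-1-2, not data measured in these experiments, and the linear interpolation scheme is our assumption; only the ambient yield stress of 338 mpa for the 0.75 mm sheet comes from this paper.

```python
import numpy as np

# indicative reduction factors for carbon steel (en 1993-1-2-style values,
# quoted from the standard for illustration -- not measured in this paper)
temps = np.array([20.0, 400.0, 500.0, 600.0, 700.0])  # temperature [deg c]
k_yield = np.array([1.00, 1.00, 0.78, 0.47, 0.23])    # yield stress factor
k_modulus = np.array([1.00, 0.70, 0.60, 0.31, 0.13])  # elastic modulus factor

def reduced_properties(theta, fy=338.0, e=210e3):
    """linearly interpolated yield stress and modulus [mpa] at theta [deg c]."""
    return (fy * np.interp(theta, temps, k_yield),
            e * np.interp(theta, temps, k_modulus))

print(reduced_properties(550.0))  # yield stress drops to ~63 % of ambient here
```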
2 description of experiment and tested specimens

2.1 experiments with screwed connections

two sets of tests with screwed connections under ambient and elevated temperatures were carried out in the laboratory of the faculty of civil engineering of the czech technical university in prague. in these experiments, the mechanical properties of screwed connections under steady state conditions were determined. steady state tests (sst) are faster and simpler than transient state tests, and can be used for predicting behaviour in fire situations, where the temperature changes in time [1]. the experiments focused on the stiffness, resistance, deformation capacity and collapse mode of the connections during fire.

four sets of experiments were performed. in 2005, two sets of tests were carried out [2]. in set a, e-vs bohr 5-5.5×38 screws and a sealed washer 19 mm in diameter were used. in set b, the same screws were used, but the sealed washer was replaced by a steel washer 29 mm in diameter. the thickness of the trapezoidal sheet for sets a and b was 0.75 mm. the next two sets of experiments were carried out in 2007 [3]. the screwed connections in these tests were made with self-drilling carbon steel screws marked sd8-h15-5.5×25 (set c and set d). the test specimens were cut out from a trapezoidal sheet with nominal thickness 0.75 mm. in test set c, trapezoidal sheets with measured thickness 0.75 mm, width 75 mm and length 500 mm were tested. the specimens for test set d were cut from sheet of measured thickness 0.80 mm, width 50 mm and length 350 mm. the material properties of the trapezoidal sheet were obtained by material tests: for measured sheet thickness 0.75 mm, the yield stress was 338 mpa and the ultimate strength was 428 mpa; for measured sheet thickness 0.80 mm, the values were 327 mpa for the yield stress and 426 mpa for the ultimate strength.

in each set of tests, experiments were performed with two specimens at ambient temperature 20 °c and at constant elevated temperatures of 200 °c, 400 °c, 500 °c, 600 °c and 700 °c. steel sheets 10 mm in thickness simulated the bearing roof structure; they were anchored into the grips of the testing machine, see fig. 1, and the force from the testing machine was transferred to the specimens through these thicker sheets. the tested screwed joints were placed in the middle of the electric furnace. the specimens tested in 2006 were heated in an electric furnace with internal diameter 150 mm and height 300 mm. the temperature of the connection was measured by a thermocouple attached to the steel sheet close to the bolt.
the experiments from 2007 were carried out in a smaller furnace with one opening and internal dimensions 50×130×125 mm, see fig. 2. the thermocouple for measuring the temperature of the connection was located in a hole drilled in the screw head. the air temperature in each furnace and the temperature of the tested connection were each measured by one thermocouple. the opening of the electric furnace was filled with high-temperature-resistant glass, so the behaviour and the deformation of the tested connections could be observed during the fire tests. in the course of the experiments, photo documentation was taken at 5-second intervals; fig. 3 shows four photographs taken during one of the tests. the edges of the specimens were marked at 5 mm spacings for displacement measurement. a constant rate of movement was applied.

fig. 1: the test set-up (steel plates t = 10 mm fixed to the testing machine, test specimen t = 0.75 mm, screw sd8-h15-5.5×25, and a bolted connection with a standard m12 bolt)

fig. 2: electric furnace for the tests in 2007

fig. 3: the deformation of the connection (photographs at 1 min 10 s, 2 min 15 s, 4 min 15 s and 4 min 45 s)

2.2 experiment with a trapezoidal sheet and its catenary action

in 2007, the pavus laboratory in veselí nad lužnicí carried out a fire experiment to check the catenary effect of a thin-walled trapezoidal sheet. the specimen for the fire experiment was taken from a trapezoidal sheet with sheet thickness 0.75 mm and wave height 55 mm. this sheet was placed above a furnace with diesel burners. the specimen was fastened by self-drilling sd8-h15-5.5×25 screws to the bearing steel frame, which was made from heb 200 profiles. the inner dimension of this frame was 800×3 000 mm. in each lower wave of the trapezoidal sheet two self-drilling screws were used. thermal insulation protected the frame against the effect of high temperatures. four rectangular iron plates 30 mm in thickness, with dimensions 450×580 mm and weight 60 kg each, were used as the mechanical load. the total load on the tested specimen was 240 kg, which corresponds to 1 kn/m2.
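the quoted load intensity can be checked directly from the plate weights and the inner frame dimensions given above:

$$q = \frac{4 \times 60\,\mathrm{kg} \times 9.81\,\mathrm{m/s^2}}{0.8\,\mathrm{m} \times 3.0\,\mathrm{m}} \approx 981\,\mathrm{N/m^2} \approx 1\,\mathrm{kN/m^2}$$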
the iron plates were uniformly distributed on the trapezoidal sheet. the distance between the load and the edge of the specimen was 300 mm, and the distance between the plates was 200 mm. fig. 4 shows a view of the tested specimen before the fire experiment: the trapezoidal sheet, the mechanical load, the bearing frame and the thermal insulation of the whole specimen can be seen. the thermocouples and a vertical deflectometer in the middle of the span of the simple beam were located directly on the sheet. the thermocouples were placed at midspan and at the quarter point of the span of the trapezoidal sheet, three on the screws and three near the screws on the sheet. two thermocouples were used for measuring the gas temperature in the furnace; they were placed at a distance of 350 mm from the upper surface of the upper wave of the sheet.

fig. 4: the tested specimen before the fire experiment

3 results of experiments

3.1 experiments with screwed connections

the resistance of the connection with a sealed washer (set a) was limited by the bearing resistance of the thin sheet. the sealed washer has no influence on the behaviour, because the sealant burns away at higher temperatures. the stiffness of the connection with steel washers (set b) was much higher, and the resistance was almost twice that of the previous set. the thin sheet was deformed and accumulated in front of the washer; this was accompanied by the formation of two shear zones on both sides of the washer, see fig. 6. this failure mode is characterized by a deformation capacity larger than 30 mm. however, at temperatures higher than 500 °c, shear failure of the screw was observed, see fig. 7.

fig. 8 is a force-deformation diagram of the connections from set d, and the collapse mode for the connection from set c is shown in fig. 9. for all specimens with a measured sheet thickness of 0.75 mm, failure in bearing was reached when the trapezoidal sheet tore. two modes of failure were observed for sheet thickness 0.80 mm: for temperatures from 20 °c to 600 °c the sheet failed in bearing, whereas at a temperature of 700 °c the mode of failure was shear failure of the screw.

fig. 5: the catenary action on a simply supported beam

fig. 6: the collapse mode of the connection from set b, temperature 200 °c

fig. 7: force-displacement diagrams of the screwed connections from set b (20 °c to 700 °c)

fig. 8: force-displacement diagrams of the screwed connections from set d (20 °c to 700 °c)

fig. 9: collapse mode of the connection from set c, temperature 200 °c

in the initial phase of loading, elastic behaviour can be observed. the force then increases until the maximum bearing capacity is reached and tearing of the sheet occurs. in the next phase of the force-deformation diagram, a decrease in force can be observed. the increase and subsequent decrease in force, due to the accumulation of deformed sheet in front of the screw, can be seen in the diagram. the deformation capacities of the connections were high. due to the dimension limits of the electric furnace, all the experiments were terminated before ultimate failure, but after exhaustion of the residual bearing capacity.
the results of these experiments show how the temperature increase leads to a decrease in the bearing capacity of the connections. at a temperature of 550 °c, the bearing capacity of the connection is reduced to approximately one half of the bearing capacity at ambient temperature, and at a temperature of 700 °c the bearing capacity is less than 20 % of the bearing capacity at ambient temperature. for comparison, reductions of 45 % at 500 °c and 90 % at 700 °c are used in calculations of connections with bolts and nuts. the experiments thus confirm a greater reduction in the bearing resistance of the self-drilling screws in the initial phase of heating (up to a temperature of 550 °c), and a smaller reduction at higher temperatures. these two factors lead to unfavourable brittle failure of the connection in shear [4]. temperatures lower than 500 °c have no significant influence on the initial stiffness of the connection. the deformation capacity at higher temperatures is reduced by failure of the screw in shear; this mode of failure occurred only for the screwed connection with sheet thickness 0.80 mm at a temperature of 700 °c.

3.2 experiment on the trapezoidal sheet and its catenary action

the fire load was modelled by a multilinear fire curve simulating the fire load used for the fire tests in cardington. the use of similar fire curves is helpful for subsequent comparison of the results from different fire experiments. the maximum measured gas temperature in the furnace was 1 096 °c; this value was reached in the 55th minute, and the total length of the fire experiment was 2 hours. the temperature of the trapezoidal sheet above the support was 447 °c, which is about 58 % lower than the temperature of the trapezoidal sheet at midspan (1 084 °c), see fig. 10. in the case of an unprotected load-bearing structure, the temperature would be higher.

fig. 10: measured temperatures on the specimen (gas temperature, temperature of the sheet at midspan, temperature of the sheet near the screw, and temperature of the screw)

fig. 11: vertical deformations in the midspan of the trapezoidal sheet (calculation and experiment)

the behaviour of the trapezoidal sheet during the fire experiment was similar to the behaviour of a simple beam whose elongation is restrained. at the beginning of the fire test, the temperatures are low and the trapezoidal sheet is not yet deflected by the influence of the temperature. the sheet elongates in its plane, and the screws in the supports are loaded in shear. as the temperature increases, an extension occurs as a result of the thermal expansion of the material, and the yield stress and the modulus of elasticity decrease. as a result of these effects, the deflection increases. the increase in the deflection and the decrease in bending stiffness change the forces in the supports to tension, and the sheet starts to behave as a tensile membrane. this effect is known as catenary action; fig. 5 shows a scheme of the catenary action on a simply supported beam. the collapse of the structure depends on the bearing capacity of the connections and on the ability of the load bearing structure to carry the tensile forces.
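the order of magnitude of this behaviour can be sketched with two textbook relations; the numbers below are our illustrative estimates, not values from the test report. if the restrained thermal elongation $\Delta l = \alpha\, l\, \Delta\theta$ is taken up entirely by a parabolic sag $\delta$, the arc-length relation gives

$$\delta \approx \sqrt{\tfrac{3}{8}\, l\, \Delta l}, \qquad \Delta l = \alpha\, l\, \Delta\theta .$$

with $l = 3.0\,\mathrm{m}$, an assumed $\alpha \approx 1.4 \times 10^{-5}\,\mathrm{K^{-1}}$ and $\Delta\theta \approx 1\,060\,\mathrm{K}$, this yields $\delta \approx 0.22\,\mathrm{m}$, close to the measured 229 mm and the calculated 222 mm quoted below. in the membrane stage, the horizontal component of the anchor force under a uniform load $q$ follows from the parabolic catenary as

$$h \approx \frac{q\,l^2}{8\,\delta} \approx \frac{1\,\mathrm{kN/m^2} \times (3.0\,\mathrm{m})^2}{8 \times 0.22\,\mathrm{m}} \approx 5\,\mathrm{kN/m}$$

per metre of sheet width, which is the order of the tensile force the screwed connections must transfer to the supports.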
fig. 11 compares the deflection as a function of time calculated by equations with the values obtained during the experiment. the maximum deflections and the times at which they were reached are in good agreement: the maximum measured deflection of the trapezoidal sheet was 229 mm, and the calculated vertical deformation was 222 mm.

4 summary

the resistance of the connections in relation to temperature is shown in fig. 12. the resistance is reduced at higher temperatures; the reduction is small at temperatures up to 400 °c but significant at temperatures higher than 500 °c. the diameter of the washer or of the screw head has a significant influence on the resistance. the resistance of the screwed connection from set a is approximately 40 % lower than the resistance of the screwed connection from set d, while the resistance of the connection from set b is similar to that of the connection from set c. shear failure of the screw may lead to low deformation capacity at temperatures higher than 500 °c. these experiments will be used for developing a design model of connections at high temperatures.

fig. 12: resistance of the connections (test sets a–d, 20 °c to 700 °c)

the experiment with a trapezoidal sheet supported as a simple beam confirmed the catenary action. high fire resistance of the trapezoidal sheet, depending on suitable design of the screwed connection to the bearing structure, was also confirmed.

acknowledgment

this outcome has been achieved with financial support from the ministry of education, youth and sports, under project no. 1m0579.

references

[1] outinen, j.: mechanical properties of structural steels at high temperatures and after cooling down. espoo, 2007.
[2] sokol, z.: design of corrugated sheets exposed to fire, progress in steel, composite and aluminium structures, 2006.
[3] kallerová, p.: experiments with bolted connections – experiments at normal and elevated temperatures. research report, ctu in prague, 2007.
[4] wald, f., et al.: calculation of the fire resistance of structures. ctu in prague, 2005.

petra kallerová (e-mail: petra.kallerova@fsv.cvut.cz), františek wald, zdeněk sokol — department of steel and timber structures, czech technical university in prague, faculty of civil engineering, thákurova 7, 166 29 prague 6, czech republic
a taxonomy of technical animation

d. vaněček, j. jirsa

abstract

the age in which we are living nowadays is characterized by rapid innovation in the development of information and communication technologies (ict). this innovation has a significant influence on the education process. this article deals with computer animation in technical education. our aim is to present a taxonomy of educational animation. the paper includes practical examples of animation.

keywords: animation, taxonomy, didactic function.

utilization of animation

the use of animation as a didactic tool is connected with the following notions [1]:

• displaying a simulated model
• displaying phenomena that cannot be properly processed by other didactic tools
• equipment that cannot be demonstrated by some other model because of its size or hazardousness
• displaying technologies which are costly in terms of money, time, etc., or which are inaccessible

however, animation can also be misused, and it is necessary to beware of its potential shortcomings. some of them were summarized by skalková [2]. the primary problems are reduced contact with reality and reduced first-hand experience for students. young people may get the idea that they have learned everything through an animation, and that they do not need to come into contact with actual objects or phenomena. this is a fatal error that can have immense consequences. there can also be negative impacts on learners' health if they sit continuously in front of a computer screen.

the utilization of these modern means can also lead to several problems affecting teachers. firstly, the transfer of control over the learning process from the teacher to the student can lose some of the benefits that a learner can draw from the knowledge, influence and experience of a teacher. this may make learning inefficient and affect the student's orientation in the topic under discussion. in addition, the student's ability to gain information from a teacher about the relevance of a topic for other fields of study is dramatically reduced. we can also point out that teachers may be put under pressure by the need to keep up to date with innovations in education and in education technology.

classification of computer animation — taxonomy

computer animations designed for pedagogical purposes can be divided into a number of categories. from the didactic point of view, animation can be classified according to the student's activity:

• passive, where the student only observes what is presented to him/her.
• active, where the student participates actively in running and displaying the animation. the student can manipulate certain constituents of the animation.
if the student is not active enough, the animation does not run and becomes only a static picture.

according to didactic function:

• motivational — a motivational animation is usually applied by the teacher at the beginning of a class when giving an introduction to the topic under discussion. the goal is to get the students' attention and to engage them in the problem that is to be presented. an example of a motivational animation: electric power generation (figure 1).

fig. 1: electric power generation

• illustrative — these animations are designed to provide an insight into the patterns of the phenomena, technologies or processes that are to be presented. they aim to explain the phenomenon, technology or process under study in great detail, with emphasis on illustration, clearness and intelligibility. the student has a passive role as an observer, and creates basic notions about the topic problem on the basis of his/her observations. examples of illustrative animation: soldering components by the smd remelting process (figure 2); the principle of an extruder for cable production (figure 3); thermal treatment (direct bainite transformation) (figure 4); two-stroke and four-stroke engines (figure 5).

fig. 2: soldering components by the smd remelting process [3]

fig. 3: the principle of an extruder for cable production [3]

fig. 4: thermal treatment — direct bainite transformation [4]

fig. 5: two-stroke and four-stroke engines

here, it is important to note that the line between motivational animation and illustrative animation is not clear-cut. we can often encounter a combination of both types; in such cases we speak of illustrative-motivational animation.

fig. 6: presentation of snell's law [3]

• demonstrative-interactive — the student participates actively, and can affect the components of the animation (e.g. by changing the parameters of the phenomenon, technology, or process that is being presented). the animation is usually of a programmed, in most cases simplified, physical model which corresponds with the patterns of the real world. examples of demonstrative-interactive animation: presentation of snell's law (figure 6); laboratory devices for introducing impregnating processes (figure 7).

fig. 7: animation of laboratory devices for introducing impregnating processes [3]

• diagnostic — for practicing the subject or for examination, in which the student affects the animation and tests the knowledge that has been acquired. here, it is necessary for some form of feedback to be incorporated to inform the student about the correctness or incorrectness of the answer. dividing electronic components into correct categories, creating a correct network schema, creating the cycle of a heat pump, etc., can serve as examples. an example for gaining practice: the principle of the heat pump — a practical exercise on creating the cycle of a heat pump (figure 8).

fig. 8: creating the cycle of a heat pump [5]

fig. 9: characteristics of a silicon diode [6]

according to the way of presenting the topic, with regard to representation in space:

• two-dimensional (2d), where plane geometric figures are used to symbolize specific parts of the technology. an isometric view can also be used (so-called 2.5d) to emphasize the possible dimensional configuration of the symbols.
an example of two-dimensional (2d) animation: characteristics of a silicon diode (figure 9).

• three-dimensional (3d) — all objects in the animation are represented by three-dimensional depiction and are also displayed in three dimensions on the screen. some modeling tools are needed for creating the animation. an example of three-dimensional (3d) animation: the witness program — presentation of an industrial process (figure 10).

fig. 10: witness program — presentation of an industrial process

fig. 11: the third wave advantedge program — thermal stratification

according to availability:

• generally available, e.g. on the internet or on commercially available educational cd-roms
• restrictively available, only within specific educational facilities

it is obvious that these basic categories can be combined with each other. the classification could be further specified, e.g. on the basis of specific specialized subjects, but that would lie beyond the framework of this paper.

a simulation can be understood as a specific case of an animation. a simulation is a visualization of a specific mathematical model. the input parameters have to be set, and the mathematical model generates output data from the parameters. the input data is usually in the form of specific physical quantities (e.g. tension, temperature, pressure, force). the difference between a simulation and an animation lies in the incorporation of random phenomena which exist in the real world and affect the final behavior of the model. examples are a simulation of an industrial process, a proposal for interior lighting, a proposal for traffic operation, etc. an example of a simulation: simulation of a cutting process in the advantedge program — thermal stratification (figure 11).

conclusion

this paper has attempted to present a categorization, description and explanation of specific variants of animations for e-learning that teachers can encounter in their profession. the main emphasis has been on the teaching and learning aspects of the animations. for the sake of better understanding, the theoretical categorization was supplemented by specific practical examples. technological aspects of creating animations are dealt with in the first and the third parts of this three-part paper.

references

[1] vaněček, d.: informační a komunikační technologie ve vzdělávání. praha: čvut, 2008. isbn 978-80-01-04087-4.
[2] skalková, j.: obecná didaktika: vyučovací proces, učivo a jeho výběr, metody, organizační formy vyučování. praha: grada, 2007. isbn 978-80-247-1821-7.
[3] jirsa, j.: tvorba počítačové animace pro potřeby výuky technických předmětů, bp múvs čvut, praha, 2009, ved. práce d. vaněček.
[4] učeň, m., filípek, j.: izotermické tepelné zpracování, mendelnet'04 agro, ed. r. cerkal, p. ryzny, e. fryščáková, t. středa, brno: mendelova zemědělská a lesnická univerzita v brně, 2004.
[5] kříž, č.: tvorba animace pro e-learningový kurz, bp múvs čvut, praha, 2006, ved. práce d. vaněček.
[6] kropelnický, r.: využití animace při tvorbě e-learningového kurzu, bp múvs čvut, praha, 2007, ved. práce d. vaněček.

ing. paed. igip david vaněček, ph.d. (e-mail: david.vanecek@muvs.cvut.cz) — czech technical university in prague, masaryk institute of advanced studies, department of engineering pedagogy, horská 3, 128 00 praha 2, czech republic

ing. bc. jan jirsa (e-mail: jirsaj@fel.cvut.cz) — czech technical university in prague, faculty of electrical engineering, technická 2, 166 27 praha 6, czech republic
surface roughness and porosity of hydrated cement pastes

t. ficker, d. martišek, h. m. jennings

abstract

3d profile and roughness parameters were used to perform an extensive study of the fracture surfaces of hydrated cement pastes. one hundred and eight specimens were prepared with six different water-to-cement ratios. the surfaces of the fractured specimens are the subject of a study analysing the height irregularities and the roughness. the values of the 3d surface profile parameters and the roughness parameters of the fractured specimens were computed from digital replicas of surface reliefs assembled by means of confocal microscopy at three different magnifications: 5×, 20× and 50×. seventy-eight graphs were plotted to describe and analyze the dependences of the height and roughness irregularities on the water-to-cement ratio and on the porosity of the cement hydrates. the results showed unambiguously that the water-to-cement ratio, or equivalently the porosity of the specimens, has a decisive influence on the irregularities of the fracture surfaces of this material. the experimental results indicated the possibility that the porosity or the value of the water-to-cement ratio might be inferred from the height irregularities of the fracture surfaces. it was hypothesized that there may be a similarly strong correlation between porosity and surface irregularity for some other highly porous solids, and thus the same possibility to infer porosity from the surfaces of their fracture remnants.

keywords: roughness analysis, fracture surfaces, cement-based materials, confocal microscopy.

1 introduction

the surface features of fractured materials are valuable sources of information on the topological, structural and mechanical properties of the materials. however, in the research field of cementitious materials there have only been a restricted number of studies dealing with the surface features of fractured specimens. some early surface studies of hydrated cement materials focused on fractal properties [1,2], whereas others [3–5] investigated roughness numbers (rn) or similar surface characteristics [6–8]. in the rn studies [3–5] dealing with fracture surfaces, the authors used the so-called roughness numbers rn as a research tool; these proved to be correlated with energy-based mechanical quantities, e.g. toughness, but, unfortunately, they did not show a correlation with porosity-dependent quantities, e.g. compressive strength. in our preliminary study [9], it was illustrated that, in addition to the roughness numbers rn, there are very promising surface parameters of another kind, called the surface profile (sp) and surface roughness (sr) parameters, derived from three-dimensional (3d) digital replicas of fracture surfaces. it has been indicated [9] that these sp and sr parameters, in contrast to rn, might show a correlation with porosity and with porosity-based quantities such as compressive strength and, as such, they could become an important complement to rn characteristics. as has been highlighted previously [9], the striking difference between rn and sp/sr characteristics consists in their definitions: rn numbers quantify the increase in the area of fracture surfaces, i.e. they are overall area characteristics, whereas sp/sr parameters evaluate surface irregularities, i.e. they quantify height differences on fracture reliefs.
in our preliminary report [9], only three types of sp/sr parameters were tested, though there is a larger series of these 3d parameters. in addition, in the preliminary report [9] the tested specimens covered only four values of the water-to-cement ratio r, namely 0.4, 0.6, 0.8, and 1.0, though the reliability of the experimental conclusions is increased by including more values of r. last but not least, some vagueness may be added to the results when the porosity of the specimens is inferred from the r-values that are used, rather than from direct measurements. all these points have been improved in the present communication, which is based on a larger series of sp/sr parameters, on a more numerous spectrum of r-values, on direct measurements of porosity and, in addition, on a larger number of specimens. this makes the present results more reliable and more precise than the preliminary results [9].

the goal of the present communication is to provide clear experimental evidence showing porosity as a main factor controlling the irregularity of the fracture surfaces of porous materials. for this purpose, the behavior of the profile parameters sp and roughness parameters sr has been analyzed in terms of various porosity values.

fig. 1: scheme of surface irregularities

2 surface profile and roughness parameters

in this study, six different profile parameters (spp, spv, spz, spc, spa, spq) and seven different roughness parameters (srp, srv, srz, src, sra, srq, srzjis) are employed. this means that thirteen independent parameters are used as a measure of the irregularity of fracture surfaces. all these parameters are derived from 3d digital replicas (maps) of the surfaces. in our case, the digital maps have been created by means of confocal microscopy (for more details, see [9]). the map may be viewed as a three-dimensional matrix $[x_i, y_i, z_i]$ where the discrete function $z_i = f(x_i, y_i)$ approximates the fracture surface, which is in reality a more or less smooth wavy surface $z = f(x, y)$. sp parameters are derived from the measured map $f(x_i, y_i)$, whereas sr parameters are associated with a different function $f_r(x_i, y_i)$ formed from $f(x_i, y_i)$ by filtering out the wavy components possessing wavelengths longer than a critical value $\lambda_c$ (for more details see [9]). briefly, sp parameters characterize surfaces on all measured length scales, whereas sr characterizes those length scales that are shorter than $\lambda_c$. the algorithms for the computation of the sp and sr parameters are very similar; only their source functions differ. to determine sp, the starting function is given by $f(x_i, y_i)$, whereas for sr it is the filtered function $f_r(x_i, y_i)$. the computational algorithms are briefly explained in the following paragraphs with the aid of figure 1, which has been drawn for simplicity in two dimensions rather than in three dimensions.

parameters spp/srp — these parameters represent the total maxima on the graphs of the source functions $f(x_i, y_i)$/$f_r(x_i, y_i)$ (figure 1), i.e.

$$\mathrm{spp}/\mathrm{srp} \;\to\; \operatorname{total\,max}\{z_{pi}\} \qquad (1)$$

parameters spv/srv — inform us about the total minima on the graphs of the source functions $f(x_i, y_i)$/$f_r(x_i, y_i)$ (figure 1), i.e.

$$\mathrm{spv}/\mathrm{srv} \;\to\; \operatorname{total\,min}\{z_{vi}\} \qquad (2)$$

parameters spz/srz — are determined by the sum of the total maximum and the total minimum on the graphs of the source functions $f(x_i, y_i)$/$f_r(x_i, y_i)$ (figure 1), i.e.

$$\mathrm{spz}/\mathrm{srz} \;\to\; \operatorname{total\,max}\{z_{pi}\} + \operatorname{total\,min}\{z_{vi}\} \qquad (3)$$

parameters spc/src — represent the arithmetic averages of the sums of the local maxima and minima on the graphs of the source functions (figure 1), i.e.
$$\mathrm{spc}/\mathrm{src} \;\to\; \frac{1}{m}\sum_{i=1}^{m}\left(\operatorname{local\,max}\{z_{pi}\} + \operatorname{local\,min}\{z_{vi}\}\right) \qquad (4)$$

where $m$ is the overall number of local extremes (local maxima and local minima).

parameters spa/sra — are given as the mean heights of the graphs of the source functions $f(x_i, y_i)$/$f_r(x_i, y_i)$ (figure 1), i.e.

$$\mathrm{spa} = \frac{1}{l \cdot m}\iint_{(lm)} |f(x, y)|\,\mathrm{d}x\,\mathrm{d}y \qquad (5)$$

$$\mathrm{sra} = \frac{1}{l \cdot m}\iint_{(lm)} |f_r(x, y)|\,\mathrm{d}x\,\mathrm{d}y \qquad (6)$$

where $l \times m$ is the area of the vertical projection of $f(x, y)$/$f_r(x, y)$ onto the plane $xy$.

parameters spq/srq — provide us with the square roots of the means of the squares of the source functions $f(x_i, y_i)$/$f_r(x_i, y_i)$ — rms values (figure 1), i.e.

$$\mathrm{spq} = \sqrt{\frac{1}{l \cdot m}\iint_{(lm)} [f(x, y)]^2\,\mathrm{d}x\,\mathrm{d}y} \qquad (7)$$

$$\mathrm{srq} = \sqrt{\frac{1}{l \cdot m}\iint_{(lm)} [f_r(x, y)]^2\,\mathrm{d}x\,\mathrm{d}y} \qquad (8)$$

parameter srzjis — represents the ten-point arithmetic average of the five highest peaks and the five deepest valleys of the source function $f_r(x, y)$ (figure 1), i.e.

$$\mathrm{srzjis} \;\to\; \frac{1}{5}\sum_{i=1}^{5}\left(z_{pi}^{\max 5} + z_{vi}^{\min 5}\right) \qquad (9)$$

analyzing the functionality and the capability of the parameters sp/sr defined above, we can recognize their characteristic properties, which make them unique and distinguishable. for example, the first three parameters of both kinds, i.e. spp, spv, spz and srp, srv, srz, have only a local character, since each of them determines a value belonging to a single point of the surface (i.e. the local extreme) without taking into account the influence of other surface points. in other words, they select from the very numerous set of peaks, valleys and z-widths only one value, i.e. the extreme value. from the viewpoint of statistics, the representativeness of such values is very limited and may suffer from essential variability. in addition, the z-modifications of the sp or sr parameters may be expected partly to resemble the behavior of the p- or v-modifications of these parameters, since the z-modification is actually a combination of the p- and v-modifications. for all these reasons, the groups of parameters spp, spv, spz and srp, srv, srz should be considered as statistically less relevant than the remaining parameters spc, spa, spq, and src, sra, srq, srzjis, since these represent averaged quantities. nevertheless, a certain specificity may be observed with the parameter srzjis: although defined as an averaged quantity, it is not a complete average, since the averaging does not go over all the surface features but only over the five highest peaks and the five deepest valleys. due to this circumstance, srzjis also ranks among the statistically less reliable characteristics.

the parameters spc, spa, spq and src, sra, srq represent true averages. their c-components are the results of averaging over all the local z-widths, which consist of the p- and v-components of the given surface relief, and as such the c-components include properties of both the 'upper' and the 'lower' sides of the relief. this makes them rather prone to imitate the behavior of the p- and v-components. during the computation of the a- and q-components, the entire 'lower halves' of the surface reliefs are 'rotated' up, and in this way they are unified with the 'upper' parts of the reliefs. together they create a statistically indistinguishable unit which does not suffer so much from the 'two-side effect'. the a- and q-components seem to be statistically the most reliable components among all those employed in this study.
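the averaged parameters defined by eqs. (1)–(9) are straightforward to compute on a digital height map. the following is a minimal numpy sketch, not the olympus lext software actually used by the authors; the synthetic map and the moving-average high-pass filter standing in for the $\lambda_c$ cut-off are our own assumptions.

```python
import numpy as np
from scipy.ndimage import uniform_filter

# synthetic 1024 x 1024 height map standing in for a confocal digital
# replica of a fracture surface (heights in micrometres, zero mean plane)
rng = np.random.default_rng(0)
z = uniform_filter(rng.normal(scale=60.0, size=(1024, 1024)), size=8)

spp = z.max()                    # highest peak, eq. (1)
spv = -z.min()                   # deepest valley, depth taken positive, eq. (2)
spz = spp + spv                  # peak-to-valley height, eq. (3)
spa = np.abs(z).mean()           # mean absolute height, eq. (5)
spq = np.sqrt((z ** 2).mean())   # rms height, eq. (7)

# roughness map: remove wavy components longer than lambda_c ~ 100 px
zr = z - uniform_filter(z, size=100)
sra = np.abs(zr).mean()          # eq. (6)
srq = np.sqrt((zr ** 2).mean())  # eq. (8)
```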
however, there is a certain difference between the spa/spq and sra/srq parameters: the r-parameters are extracted from the roughness function $f_r(x_i, y_i)$, which lacks the larger wavelengths of the wavy profiles, and this helps spa and spq to be better representatives of the overall surface irregularity of fractured specimens.

finally, it should be mentioned that there are some differences between the surface parameters measured at different magnifications. at smaller magnifications, priority is given to the larger surface features of the reliefs, but the details are suppressed. this means that the values of all the parameters are shifted to larger quantities — they are set on scales of greater length. at greater magnifications the situation is the opposite: the details are accented but the larger relief features are missing, so that the values of all the parameters are shifted to smaller quantities — they are set on scales of shorter length. these properties make all the surface parameters scale-dependent quantities. the surface places (sites) on which measurements were performed are specified in section 3. all the mentioned properties of the profile parameters (sp) and roughness parameters (sr) can be observed in the graphs presented in section 4.

3 experimental arrangement

one hundred and eight specimens (2 cm × 2 cm × 16 cm) of hydrated ordinary portland cement paste with six water-to-cement ratios r (0.3, 0.4, 0.5, 0.6, 0.7, 0.8) were prepared (eighteen samples per r-value). the specimens were rotated during hydration to achieve better homogeneity. all specimens were stored for the whole hydration time at 100 % rh and 20 °c. after 60 days of hydration, the specimens were fractured in three-point bending and the fracture surfaces were immediately used for microscopic analysis. other parts of the specimens were used for porosity measurements and for further mechanical tests.

porosity was determined by the common weight-volume method. the wet specimens were weighed and their volume was measured. then they were subjected to a temperature of 105 °c for one week, until their weight stopped changing, and the dry specimens were weighed again.

the microscopic analysis was performed using an olympus lext 3100 confocal microscope. approximately 150 image sections were taken for each measured surface site, from the very bottom of the surface depressions (valleys) to the very top of the surface protrusions (peaks). the investigated area $l \times m$ = 1280 μm × 1280 μm (1024 pixels × 1024 pixels) was chosen in five different places on each fracture surface (in the center, and in four positions near the corners of the rectangular area), i.e. each plotted point in the graphs of the profile and roughness parameters corresponds to an average value composed of 90 measurements (18 samples × 5 surface measurements). each measurement was performed at three different magnifications, namely 5×, 20× and 50×, giving 270 measurements for each particular r-value. each site measurement amounts to about 150 optical sections (digital files), i.e. 40 500 files had to be processed to create the 270 digital maps per r-value. this resulted in 1620 digital maps for all r-values altogether (6 r-values × 270 maps for each r-value). these 1620 digital fracture surfaces were then subjected to 3d profile and roughness surface analyses using the olympus lext 3100 software, version 6.

fig. 2: the dependence of porosity p (measured by evaporable water content) on the original water-to-cement ratio r
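the weight-volume evaluation described above can be written out explicitly. this is a minimal sketch under assumptions of our own — porosity taken as the volume of evaporable water over the bulk specimen volume, with hypothetical masses — since the paper does not spell out the formula.

```python
def porosity(m_wet_g, m_dry_g, volume_cm3, rho_water_g_cm3=1.0):
    """porosity = evaporable-water volume / bulk specimen volume."""
    return (m_wet_g - m_dry_g) / (rho_water_g_cm3 * volume_cm3)

# a 2 cm x 2 cm x 16 cm prism has a bulk volume of 64 cm^3;
# the masses below are hypothetical, for illustration only
p = porosity(m_wet_g=128.0, m_dry_g=103.0, volume_cm3=64.0)
print(f"p = {p:.2f}")  # -> 0.39
```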
the critical wavelength $\lambda_c$ for filtering out wavy components of longer wavelength was set to 100 pixels, which is about 10 % of the reference length $l = 1024$ pixels. in this way an extensive statistical ensemble was created, providing a sufficiently reliable basis for making relevant conclusions.

4 results and discussion

prior to a discussion of the graphs of the profile and roughness parameters, it is necessary to recall some basic facts about the process of forming porosity in hydrated cement pastes. when cement is mixed with water, the hydration process combines some of the water into the c–s–h gel (in cement notation: c = cao, s = sio2, h = h2o), a main hydration product; the remaining water is either physically adsorbed in tiny gel pores or remains as free water in the capillary pores. when the water-to-cement ratio r = w/c is increased, the capillary water increases, and along with it the capillary space extends within the hydrated cement paste. the higher the ratio r, the larger the capillary space. naturally, this scheme holds only when all the free water is integrated into the paste. at extremely high ratios (r > 0.6), this is a problem because of sedimentation and segregation of cement grains. nevertheless, rotation of the specimens and added admixtures preserve sufficient homogeneity of the hydrated specimens.

it follows from the foregoing paragraph that the main factor controlling the capillary porosity of cement paste is the water-to-cement ratio, r, which is primarily reflected in the total porosity p. therefore, a strong dependence of porosity on the r-ratio, i.e. a strong correlation p(r), is expected. this well-known relationship is not surprising, and it also works with our specimens — see figure 2.

at first sight, the graph p(r) in figure 2 seems to be linear. however, some caution is necessary when inspecting functional behavior within a narrow interval: many graphs of non-linear functions seem to be almost linear in very narrow intervals, and it is necessary to observe the behavior in a wider interval. in our case, the porosity cannot exceed a value of 1, but it can approach this value for large r-ratios, i.e.

$$\lim_{r \to \infty} p(r) = 1 \qquad (10)$$

this requirement cannot be guaranteed by any common linear function, but it can be guaranteed by a non-linear function, e.g. by the exponentially saturating function $p(r) = 1 - \exp(-r/r_o)$ or by some type of hyperbolically growing function, to mention only some of the possible candidates. regardless of the type of the p(r) function, one fact is clear: with our specimens, p(r) in the interval $r \in (0.3, 0.8)$ is only slightly non-linear, which qualifies the linear approximation as a possible tentative candidate for p(r) in this interval.
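such a candidate function can be fitted in a few lines. the sketch below uses scipy's curve_fit on illustrative porosity readings — stand-ins of our own, since the measured data are only shown graphically in figure 2:

```python
import numpy as np
from scipy.optimize import curve_fit

def p_model(r, r0):
    # exponential candidate satisfying the limit (10): p -> 1 as r -> infinity
    return 1.0 - np.exp(-r / r0)

r = np.array([0.3, 0.4, 0.5, 0.6, 0.7, 0.8])        # water-to-cement ratios
p = np.array([0.28, 0.35, 0.42, 0.48, 0.53, 0.58])  # illustrative porosities

(r0,), _ = curve_fit(p_model, r, p, p0=[1.0])
print(f"fitted r0 = {r0:.2f}")  # ~0.9 for these illustrative values
```

over the narrow interval 0.3–0.8 such a curve is indeed close to a straight line, which is exactly the point made above.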
the series of graphs contains six graphs spp(r), spv(r), spz(r), spc(r), spa(r), spq(r), seven graphs srp(r), srv(r), srz(r), src(r), sra(r), srq(r), srzjis(r), six graphs spp(p), spv(p), spz(p), spc(p), spa(p), spq(p), and seven graphs srp(p), srv(p), srz(p), src(p), sra(p), srq(p), srzjis(p). all the graphs are repeated in threedifferentmagnifications: 5×, 20×and 50×. the graphs document in a very straightforward manner the strongdependence of both the profile parameters and the roughness parameters on thewaterto-cement ratio r, and also on porosity p . in addition, these dependences on r and p are very similar in shape within the investigated intervals, which is also a consequence of the almost linear behavior of p(r). all the graphs contain error bars that seem to be rather large. this is because they represent limiting statistical errors, i.e. intervals with 99.73 % confidence. in normal laboratory practice, statistical intervals with 50 % confidence are usually used and as such they would be 4.5× shorter. however, the limiting intervals are more instructive since they allowus to recognize other possible positions ofmeasuredpoints, and enable us to consider other possible shapes of the graphs. the next section discusses the plotted graphs in greater detail. 4.1 dependences on the water-to-cement ratio graphs of the dependences spp(r), spv(r), spz(r), spc(r), spa(r), spq(r) and srp(r), srv(r), srz(r), src(r), sra(r), srq(r), srzjis(r) are shown in the upper halves of figures 3–14 and in figure 15. when comparing these graphical results for different magnifications (5×, 20×, 50×), it is obvious that both the sp profile parameters and the sr roughness parameters change their numerical extent according to the magnification. for example, in figure 3 (magnification 5×) the numerical extent of the spp parameter is 280 μm, infigure 4 (magnification 20×) the value is 110 μm, and in figure 5 (magnification 50×) the value is only 62 μm. this is in full agreementwithwhatwasmentioned in section 2 about the scale-dependent properties of sp/sr parameters. an inspection of all the sp/sr graphswithin the framework of all the magnifications used here, 5×, 20×, 50×, results in the conclusion that the smallest statistical scatter of the measured values (not the error bars) can be found in the graphs associated with magnification20×. it is likely that thismagnification is set at the most favorable length scales characteristic for the studied fracture surfaces. at magnification 5x, the fine length scales are not included in the measurements. thus a larger statistical scatter can be observed at the side of the small water-tocement ratios, where finer fracture surfaces, i.e. finer length scales, are localized (see figures 3, 6, 9, 12 or 15). on the other hand, at magnification 50× the coarser length scales of the fracture surfaces are excluded. thus a larger scatter appears at the side of the higher water-to-cement ratios, since coarser surfaces (with larger length scales) are localized there (see figures 5, 11 or 15). intermediate magnification 20× is optimum for covering the characteristic length scales of the studied surfaces. thus it shows the smallest statistical scatter of the sp/sr parameter values. similarly,we candetermine someparameterswhosebehavior is almostunaffectedby statistical scatter. parameters spa and spq measured at magnification20× showalmost smoothbehavior,with no major scatter of their values. 
the parameters sra and srq are less representative than the parameters spa and spq due to their filtered large length scales. analyzing the mutual differences between the p-, v-, z-components and the c-, a-, q-components of both the sp and sr parameters, it is obvious that larger statistical scatter is most pronounced with the p-, v-, z-components (compare, e.g., figures 3 and 6). as was highlighted in section 2, this is because the p-, v-, z-components are not averaged over the fracture surface, while the c-, a-, q-components are true averages.

it is interesting to compare the behavior of the z-components with the behavior of the p-, v-components. for example, figure 5 shows that the spv parameter records the reduction in surface irregularity (the depth reduction of the deepest valley) at the high water-to-cement ratio 0.8, while the spp parameter shows no reduction, and spz — as a combination of the former two parameters — reports a clear reduction in surface irregularity at this point. naturally, this is a consequence of the definition of the spz parameter, which consists in the sum of spp and spv. moreover, spz partly influences spc for similar reasons. in figures 5 and 8, the drop in surface irregularities is clearly visible not only with spz but also with spc. the parameter spc actually represents spz averaged over the whole fracture surface, and in this sense they are mutually related. finally, noting that the shapes of the sp and sr graphs are similar (compare, e.g., figures 6 and 12), it is also noted that their characteristic length scales do not differ enough to modify their vertical arrangements.

fig. 3: 3d profile parameters spp, spv, spz as dependent on the water-to-cement ratio r and porosity p — magnification 5×

fig. 4: 3d profile parameters spp, spv, spz as dependent on the water-to-cement ratio r and porosity p — magnification 20×

fig. 5: 3d profile parameters spp, spv, spz as dependent on the water-to-cement ratio r and porosity p — magnification 50×

fig. 6: 3d profile parameters spc, spa, spq as dependent on the water-to-cement ratio r and porosity p — magnification 5×

fig. 7: 3d profile parameters spc, spa, spq as dependent on the water-to-cement ratio r and porosity p — magnification 20×

fig. 8: 3d profile parameters spc, spa, spq as dependent on the water-to-cement ratio r and porosity p — magnification 50×

fig. 9: 3d roughness parameters srp, srv, srz as dependent on the water-to-cement ratio r and porosity p — magnification 5×

4.2 dependences on porosity

the lower halves of figures 3–15 show the graphs of the dependences of the profile parameters spp(p), spv(p), spz(p), spc(p), spa(p), spq(p) and the roughness parameters srp(p), srv(p), srz(p), src(p), sra(p), srq(p), srzjis(p) on the porosity p. all of the above discussion of the water-to-cement ratio r in section 4.1 can be applied to the dependences on porosity p. this is because there is a strong correlation between these two quantities, as illustrated in figure 2. the unambiguous and almost linear correspondence between p and r of the studied specimens in the interval $r \in (0.3, 0.8)$ ensures an unambiguous, almost linear transition between the r-axes and the p-axes of the graphs in figures 3–15. this in turn guarantees almost identical shapes of the graphs, regardless of whether they are based on r-variables or on p-variables.
the same strong dependences of the surface irregularity parameters sp/sr on the r- or p-quantities, together with their identical graphical shapes, are convincing evidence of the governing roles of r and p in the irregularity of the fracture surfaces of highly porous hydrated cement pastes.

the influence of porosity on surface irregularity is not necessarily only a specific feature of porous cement pastes, but may be an inherent feature of other porous solid materials. the finding that surface irregularity is prevalently determined by porosity is in accordance with the observation of ponson and others [10,11]. they studied the roughness of the fracture surfaces of glass ceramics made of small glass beads sintered in bulk, with porosity that could be varied within a certain interval up to ∼ 30 %. they observed that the two-dimensional profile parameter [10] increased in value with increasing porosity. porosity seems to be a major factor governing the height irregularities of the fracture surfaces of porous solids. the roughness of the fracture remnants may be inferred from the porosity values, and conversely the porosity may be assessed from the surface roughness of the fracture remnants. unfortunately, there is no exact theory to support this close relation between porosity and surface irregularity; this task remains as a challenge for future research.

fig. 10: 3d roughness parameters srp, srv, srz as dependent on the water-to-cement ratio r and porosity p — magnification 20×

fig. 11: 3d roughness parameters srp, srv, srz as dependent on the water-to-cement ratio r and porosity p — magnification 50×

fig. 12: 3d roughness parameters src, sra, srq as dependent on the water-to-cement ratio r and porosity p — magnification 5×

fig. 13: 3d roughness parameters src, sra, srq as dependent on the water-to-cement ratio r and porosity p — magnification 20×

fig. 14: 3d roughness parameters src, sra, srq as dependent on the water-to-cement ratio r and porosity p — magnification 50×

fig. 15: 3d roughness parameter srzjis as dependent on the water-to-cement ratio r and porosity p — magnifications 5×, 20× and 50×

5 conclusion

an extensive study of the fracture surfaces of hydrated cement pastes has been performed using the 3d sp profile parameters and sr roughness parameters. thirteen 3d parameters of different kinds have been employed to describe and analyze the surface irregularities of 106 specimens of cement pastes prepared with 6 different water-to-cement ratios. each fracture surface has been tested on 5 different sites, so that each value of the 3d surface parameters belonging to the particular ratio r has been averaged over 80 measured values. this essential statistical relevancy has been associated with each experimental point on the plotted graphs, and has enabled us to specify our preliminary results [9] more precisely and reliably. the microscopic measurements were performed in triplicate at three different magnifications, 5×, 20× and 50×, resulting in 78 graphs describing the behavior of the surface irregularities of the fractured specimens in dependence on various water-to-cement ratios and porosities.
the 3d sp profile parameters and sr roughness parameters have proved to be capable of analyzing the geometric irregularities of the surfaces of hydrated cement pastes and of providing information on height differences, on morphological singularities and on some missing surface features, e.g. suppressed protrusions (peaks) or depressions (valleys).

the results achieved in the different magnifications have shown that the values of the 3d surface parameters sp/sr depend on the length scales; for this reason, their values are reduced when a larger magnification is used and expanded at small magnifications. it has been shown that spa and spq are the most reliable of all the studied parameters as regards the minimum statistical scatter of the processed values. the specific distribution of the length scales of the studied fracture surfaces of cement pastes has proved to be well treated at magnification 20×, at which the 3d profile parameters sp and roughness parameters sr provide the most stable values. naturally, this does not mean at all that the fracture surfaces of other materials, with differently distributed length scales of surface irregularities, will also prefer magnification 20×.

the present study has shown a close relation between the surface irregularities and the porosity of hydrated cement pastes. since porosity is influenced by the water-to-cement ratio, a close relation has also been found between the surface irregularities and the water-to-cement ratio. for this reason, the surface irregularities, quantified by the parameters sp/sr, show very similar analytical dependences both on porosity p and on water-to-cement ratio r. the initial value of the water-to-cement ratio used for mixing cement paste is one of the main factors that decides the future porosity of cement hydrates, and it is also influential for the surface irregularities of the fracture remnants of this material.

acknowledgement

this work was supported by the ministry of the czech republic under contract no. me09046 (kontakt).

references

[1] lange, d. a., jennings, h. m., shah, s. p.: a fractal approach to understanding cement paste microstructure, ceram. trans. 16 (1992) 347–363.
[2] issa, m. a., hammad, a. m.: fractal characterization of fracture surfaces in mortar, cem. concr. res. 23 (1993) 7–12.
[3] lange, d. a., jennings, h. m., shah, s. p.: analysis of surface-roughness using confocal microscopy, j. mater. sci. 28 (14) (1993) 3879–3884.
[4] lange, d. a., jennings, h. m., shah, s. p.: relationship between fracture surface roughness and fracture behavior of cement paste and mortar, j. am. ceram. soc. 76 (3) (1993) 589–597.
[5] zampini, d., jennings, h. m., shah, s. p.: characterization of the paste-aggregate interfacial transition zone surface-roughness and its relationship to the fracture-toughness of concrete, j. mater. sci. 30 (12) (1995) 3149–3154.
[6] lange, d. a., quyang, c., shah, s. p.: behavior of cement-based matrices reinforced by randomly dispersed microfibers, adv. cem. bas. mater. 3 (1) (1996) 20–30.
[7] abell, a. b., lange, d. a.: fracture mechanics modeling using images of fracture surfaces, 35 (31–32) (1997) 4025–4034.
[8] nichols, a. b., lange, d. a.: 3d surface image analysis for fracture modeling of cement-based materials, cem. concr. res. 36 (2006) 1098–1107.
[9] ficker, t., martišek, d., jennings, h. m.: roughness of fracture surfaces and compressive strength of hydrated cement pastes, cem. concr. res. 40 (2010) 947–955.
[10] ponson, l.: crack propagation in disordered materials; how to decipher fracture surfaces (ph.d. thesis, univ. paris, 2006) (http://pastel.paristech.org/2920/?).
[11] ponson, l., auradou, h., pessel, m., lazarus, v., hulin, j. p.: failure mechanisms and surface roughness statistics of fractured fontainebleau sandstone, phys. rev. e 76 (2007) 036108/1–036108/7.

prof. rndr. tomáš ficker, drsc.
phone: +420 541 147 661
e-mail: ficker.t@fce.vutbr.cz
department of physics, faculty of civil engineering
brno university of technology
veveří 95, 662 37 brno, czech republic

assoc. prof. paeddr. dalibor martišek, ph.d.
department of mathematics, faculty of mechanical engineering
brno university of technology
technická 2896/2, 616 00 brno, czech republic

adjunct professor hamlin m. jennings, ph.d.
department of civil and environmental engineering
massachusetts institute of technology
77 massachusetts avenue, cambridge, ma, 02139, u.s.a.

acta polytechnica vol. 52 no. 4/2012

advanced functions of a modern power source for gmaw welding of steel

ladislav kolařík1, marie kolaříková1, karel kovanda1, marek pantůček2, petr vondrouš1
1 ctu in prague, faculty of mechanical engineering, department of manufacturing technology, technická 4, 166 07 praha 6, czech republic
2 migatronic cz, a. s., tolstého 451, 415 03 teplice 3, czech republic
correspondence to: ladislav.kolarik@fs.cvut.cz

abstract

this paper evaluates the use of a modern welding power source equipped with advanced arc control functions. at the laboratory of welding technologies of ctu in prague we have focused on gmaw welding of steel using sigma galaxy, a modern welding power source produced by migatronic. sigma galaxy is equipped with functions called intelligent arc control and sequence repeat. according to the manufacturer, controlling the arc by these functions should significantly stabilize the welding process, lower the heat input and the deformation, and improve the weld quality. to evaluate the benefits of these functions completely, single-v butt welds were performed on s275j2 structural steel 10 mm in thickness (2 weld layers: root + capping layer) in the pf and pg positions. welding was monitored by the welding information system and was compared with standard gmaw welding. the results have shown that these "intelligent" functions offer significant advantages for steel welding, especially in vertical and overhead positions, because they lower the heat input and improve the weld metal control.

keywords: welding, gmaw, iac, sequence repeat.

1 introduction

the aim of this study is to investigate the main advantages of using a modern gmaw welding source equipped with special pulse control functions. the main goal is to measure and present the influence of pulse control, and to compare independent experimental results with the information advertised by the welding source manufacturer.

in present-day industrial welding practice, gmaw (gas metal arc welding) is the most widely used welding technology. as with any manufacturing technology, what industry requires from the gmaw welding process is increased efficiency, lower costs and greater welding speed, while achieving improved weld quality and not requiring highly qualified welders. other industrial requirements for the gmaw process are, e.g., overhead welding (pd, pe) and vertical welding (pf, pg), bridging wide root openings, and an increased melting rate.
another important trend is the application of gmaw in the domain of gtaw (gas tungsten arc welding), i.e. the welding of high-strength steels, non-ferrous materials, thin sheets and heterogeneous joints.

developments in modern microelectronics have led to rapid advances in welding sources. with the use of fast microelectronic circuits, the speed of welding process control and welding parameter adjustment has increased tremendously, and dynamic control over the arc and molten metal transfer has become possible. research and development carried out by manufacturers of welding sources focuses on rapid optimal control of the welding parameters during welding. modern welding sources are equipped with special control functions for the arc and molten metal transfer, focusing on two basic areas: the first area of focus is welding thin metal sheets (0.5–3 mm), and the second is high-productivity thick metal sheet welding (over 5 mm).

1. thin metal sheet welding: the top priority in thin sheet welding is to stabilize and lower the heat input, in order to reduce the risk of burn-through, reduce warping, improve melt pool control, and reduce spatter. this can mainly be achieved by stabilizing and controlling the arc. with this in mind, many welding source manufacturers have developed pulse control functions incorporated into their welding sources, e.g. cmt (fronius), stt (lincoln electric), cold arc (ewm), sat (esab) [1,7,8].

2. thick sheet welding: the top priority is to maximize the weld metal deposition rate while keeping the heat input and spatter low. welding source manufacturers have developed pulse control functions for sheets over 5 mm, e.g. force arc (ewm), power arc (migatronic), aristo superpulse (esab), tandem welding, t.i.m.e. (fronius) [6,9,10].

a sigma 400 galaxy migatronic welding source equipped with the iac (intelligent arc control) and sequence repeat functions was used for our study. a brief introduction to these functions, as advertised by migatronic, follows:

1.1 function iac — intelligent arc control

this function focuses on the welding of thin sheets and root welds, bridging wide and uneven weld gaps, and welding in the vertical downward pg position. it offers low spatter, high stability and low heat input. the iac function changes the standard short arc current and voltage in time: during molten metal drop separation, when short circuiting is finished, a low-value current pulse is used, which suppresses spatter to a significant extent [2,3].

1.2 function sequence repeat

this function combines two molten metal transfer modes periodically, e.g. a combination of short arc mode and pulsed transfer mode. the short arc mode has lower welding parameters (less heat input), and the pulsed transfer mode has higher parameters (higher heat input). this is advantageous for welding in the difficult positions pc, pd, pe, pf or pg. also, when applied to a v-joint butt weld, the welder can make a weave movement without dwelling over the beveled face; this dwell is usually needed in order to distribute the heat as necessary. a pulsed transfer is used when the torch is over the beveled faces (to assure side wall fusion), and a short arc is used when it is not (over the central part of the weld, to prevent excessive reinforcement or burn-through). continuous weave movement is therefore possible, making the weave movement easier for the welder [2,4]; a small numerical sketch of the resulting time-averaged heat input follows.
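since sequence repeat alternates two transfer modes with fixed period lengths, the effective heat input is simply the time-weighted average of the two phases. the following python sketch illustrates this; the phase values are taken from table 2 in section 3 (0.66 kJ/mm for 0.6 s of iac short arc, 1.90 kJ/mm for 0.3 s of pulsed transfer), while the function name is our own.

    def sequence_repeat_average(q1, t1, q2, t2):
        # time-weighted mean heat input of two alternating transfer phases
        return (q1 * t1 + q2 * t2) / (t1 + t2)

    # phase values from table 2: iac short arc 0.66 kj/mm for 0.6 s,
    # pulsed transfer 1.90 kj/mm for 0.3 s
    print(round(sequence_repeat_average(0.66, 0.6, 1.90, 0.3), 2))  # 1.07

the result, about 1.1 kJ/mm, matches the averaged value quoted in table 2 and is considerably lower than running the pulsed mode alone.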
2 experimental

robotic welding was carried out in the robotic cell of the welding laboratory of ctu in prague, equipped with a migatronic sigma 400 galaxy power source with the iac and sequence repeat functions. the improvements resulting from the use of the iac function and the sequence repeat function were evaluated by comparing samples a and b, welded in the difficult vertical positions pf and pg. sample a was welded using these functions, while sample b was welded without them, using typical welding parameters and the short arc transfer mode.

figure 1: weld sample, welding position pg

a sample 10 mm in thickness needed two layers: a root layer and a capping layer carried out with the weave function. in order to observe the evolution of u and i in time, the wis (welding information system) monitoring unit was connected to the welding source to record the time evolution of the welding parameters i, u, v, etc. the root weld was performed in the vertical downward pg position for both samples a and b, see figure 1. pg is considered a difficult position for root welds, because it does not offer good penetration and often suffers from insufficient root penetration. the capping layer was performed in the vertical upward position pf.

s275j2 structural steel 10 mm in thickness was used; its composition and basic properties are shown in table 1. ok autrod 12.56 wire, diameter 1 mm, supplied by esab, a typical filler material for these steels, was used. the base metal sheets were 200 × 80 mm in size and 10 mm in thickness, with a v-joint of groove angle 70° and a root opening of 3.2 mm. the shielding gas was m21, a mixture of 82 % ar + 18 % co2.

table 1: base metal composition and mechanical properties — s275j2 state according to din en 10025-2 [5]

  c:  max. 0.22 %    mn: max. 1.6 %     si: max. 0.55 %
  p:  max. 0.035 %   s:  max. 0.035 %   al: 0.01–0.06 %

  rm: 410–560 mpa    rp0.2: 275 mpa     a5: 21 %

3 results

the welding parameters used for samples a and b are shown in table 2. setting the optimum parameters for sample a was easier, and the first sample was already successful. creating a good weld without the advanced functions, with a standard short arc, required three b test samples, because of the difficult root welding.

3.1 root weld — setting the welding parameters

to create a sound weld, it was necessary to adjust the welding parameters, especially for sample b. the welding parameters were: welding speed 0.10 m·min−1, current 80 a, voltage 16 v, weave movement settings: frequency 2 hz, amplitude 1.5 mm, dwell 0.4 s.

sample a: the iac function was used with the parameters stated above. the weld had sufficient root penetration, and also sufficient penetration on both faces, see figure 4. the arc was stable, the molten pool was not dripping, and no spatter was found. the weld quality was high, without a defect.

sample b: when a current of 80 a was set on the welding source without the iac function, the welding process was very unstable. the arc was unstable, the heat input was insufficient, and the joint faces were not completely fused — only one joint face was melted. to stabilize the process and to fuse both joint faces, the current needed to be increased twice, to 100 a and then to 130 a. at 130 a the welding process was stabilized, but the root of the weld was concave in shape (welding defect: root concavity), as shown in figure 6. this is in concordance with the known fact that a short arc in the pg position is not suitable for forming a root weld; the pf position would be more suitable.
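the heat input values listed in table 2 below can be reproduced from the arc power and the travel speed. the sketch below assumes the usual gmaw relation q = k·u·i/v with a thermal efficiency factor of roughly k = 0.8 (our assumption: the paper does not state the factor explicitly, but this value reproduces the tabulated numbers well):

    def heat_input_kj_per_mm(current_a, voltage_v, speed_mm_s, k=0.8):
        # q = k * u * i / v, converted from j/mm to kj/mm
        # k = 0.8 is an assumed thermal efficiency, typical for gmaw
        return k * voltage_v * current_a / speed_mm_s / 1000.0

    # sample a root pass: 80 a, 15.5 v, 1.7 mm/s travel speed
    print(round(heat_input_kj_per_mm(80, 15.5, 1.7), 2))   # 0.58 (table 2: 0.59)
    # sample b3 root pass: 130 a, 18.5 v
    print(round(heat_input_kj_per_mm(130, 18.5, 1.7), 2))  # 1.13 (table 2: 1.16)

the roughly 60 % rise in heat input needed to stabilize the plain short arc (b3) compared with iac (a) is exactly what drives the larger haz reported in table 3.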
3.2 capping weld — setting the welding parameters

the capping welds were performed in the vertical upward pf position.

sample a: the sequence repeat function was used for the capping layer. it changed the metal transfer mode periodically from iac short arc to pulsed transfer, see figure 3. the settings were: iac short arc transfer at 85 a for a period of 0.6 s, pulsed arc transfer at 150 a for a period of 0.3 s. the results are shown in figure 5. during the pulsed arc phases (higher heat input), the torch heated up the weld faces; during the iac short arc phases (lower heat input), the torch was in between the faces.

sample b: to melt both joint faces well, the current needed to be increased to 145 a. the results are shown in figure 7.

the welding parameters that were set are shown in table 2. the measured time evolution of current and voltage for the root welds of sample a and sample b3 is shown in figure 2. this picture shows the difference between the standard short arc transfer and the iac short arc transfer.

table 2: welding parameters (welding speed 1.7 mm·s−1 for all passes)

sample | function, metal transfer | weld pass | current [a] | voltage [v] | wire feed [m·min−1] | heat input [kj·mm−1] | result
a | iac | root | 80 | 15.5 | 2.5 | 0.59 | ok, figure 4
a | sequence repeat, short arc | capping | 85 | 16 | 2.6 | 0.66 (0.6 s) | ok, figure 5
a | sequence repeat, pulsed | capping | 150 | 26.4 | 6.7 | 1.90 (0.3 s); avg 1.1 |
b1 | short arc | root | 80 | 16.3 | 2.5 | 0.63 | ng
b2 | short arc | root | 100 | 17 | 3.3 | 0.82 | ng
b3 | short arc | root | 130 | 18.5 | 4.9 | 1.16 | ok, figure 6
b3 | short arc | capping | 145 | 19.2 | 5.6 | 1.34 | ok, figure 7

figure 2: current, voltage time evolution for root weld — left: sample a – iac function 80 a, right: sample b – short arc transfer 130 a

figure 3: current, voltage time evolution for the sequence repeat weld

with the standard short arc transfer (right side of figure 2), when the metal drop touches the weld pool, the current rises to maximum values and the voltage drops to 0 v. during short circuiting there is no arc, and the current rises to the maximum value limited by the power source. using the iac function (left side), the voltage and current time evolution is distinctly different from the standard short arc: the current rises during the short circuit, but after a certain time the power source sharply reduces the voltage, as a result of which the current is also reduced. this sharp energy drop slows down the melt drop transfer, and the spatter is reduced significantly. the voltage and current are then raised to start the arc again.

table 3: measured weld geometry

sample | weld width [mm] | weld reinforcement [mm] | root width [mm] | root reinforcement [mm] | haz width at root [mm]
a | 11 | 1.9 | 4 | 0.3 | 14
b3 | 12 | 1.8 | 6 | −0.5 | 16

the weld geometries for samples a and b are shown in table 3. the lowest width of the haz at the root, 14 mm, was measured for sample a, welded with the iac function (80 a). sample b at the same current (80 a) had lack of penetration and lack of fusion; when the current was increased to 130 a to reach acceptable penetration and fusion, the haz size increased to 16 mm.

4 conclusion

the experiment affirmed some advantages of the iac function short arc over the standard short arc for welding in difficult positions (pg), because the root shape was improved. the iac function controls the voltage and current so that the heat input into the weld can be greatly reduced. the arc is much more stable, even at low arc parameters. it has been proved that the heat-affected zone is smaller and that spatter is suppressed.
the sequence repeat function is advantageous for filling and capping weld passes, because the total heat input is reduced by controlling the voltage and current evolution. a good weld was achieved in the pf position, and the heat-affected zone was smaller than for a standard short arc weld.

figure 4: sample a — root welded with iac, 80 a
figure 5: sample a — capping layer welded with sequence repeat, 85 a/150 a
figure 6: sample b3 — root weld, short arc, 130 a
figure 7: sample b3 — capping layer, short arc, 145 a

the development of electronics and programming has enabled the creation of special functions that can considerably improve the behavior of the arc and melt transfer, making welding stable at lower parameters. these functions certainly improve some aspects of the welding process, and they also facilitate the work of welders by making it easier to set the parameters and to perform welds in difficult positions.

acknowledgement

this research was supported by grant no. sgs ohk 2-038/10.

references

[1] kolařík, l. a kol.: gmaw svařování ocelových materiálů metodou force arc. techmat 2011. pardubice: univerzita pardubice, 2011, s. 184–189. isbn 978-80-7395-431-4.
[2] čsn en iso 6947. svařování a příbuzné procesy – polohy svařování. praha: český normalizační institut, listopad 2011.
[3] oddělení výzkumu a vývoje, migatronic a/s: intelligent arc control – proces pro snižování rozstřiku a vneseného tepla při zkratovém přenosu. svět svaru, 2011, vol. 15, no. 3, p. 12–14. issn 1214-4983.
[4] havelka, p.: vývoj svařování studeným obloukem. svět svaru, 2011, vol. 15, no. 1, p. 12. issn 1214-4983.
[5] furbacher, i., macek, k., seidl, j. a kol.: lexikon technických materiálů, sv. 1. praha: verlag dashöfer, 2001.
[6] fronius: rozšiřte si vědomosti [online]. 2009 [cit. 2004–9–13]. http://www.fronius.com/cps/rde/xchg/sid-77958394-23586247/fronius ceska republika/hs.xsl/29 104.htm
[7] lincoln electric: surface tension transfer [online]. 2006 [cit. 2012–03–29]. http://www.lincolnelectric.com/assets/en us/products/literature/nx220.pdf
[8] esab: swift arc transfer [online]. 2009 [cit. 2012–03–29]. http://products.esab.com/esabimages/swift art transfer final.pdf
[9] fronius: time and timetwin welding [online]. 2011 [cit. 2012–05–29]. http://www.fronius.com/cps/rde/xchg/fronius international/hs.xsl/79 9163 eng html.htm
[10] migatronic: improved quality and higher productivity from energy-dense powerarc technology for thick-plate welding [online]. 2010 [cit. 2012–03–29]. http://www.migatronic.com/default.aspx?m=4&i=115&pi=1&pr=0

acta polytechnica vol. 49 no. 1/2009

experimental analysis of concrete strength at high temperatures and after cooling

e. klingsch, a. frangi, m. fontana

abstract

in recent years, the cement industry has been criticized for emitting large amounts of carbon dioxide; hence it is developing environment-friendly cements, e.g., blended, supersulfated slag cement (ssc). this paper presents an experimental analysis of the compressive strength development of concrete made from blended cement, in comparison with ordinary cement, at high temperature. three different types of cement were used during these tests: an ordinary portland cement (cem i), a portland limestone cement (cem ii-a-ll) and a new, supersulfated slag cement (ssc). the compressive strength development over a full thermal cycle, including the cooling down phase, was investigated on concrete cylinders. it is shown that the ssc concrete specimens perform similarly to the ordinary cement specimens.

keywords: fire; concrete; blended cement; alkali-activated cement; concrete structures; tests at high temperatures; cooling down phase; hot strength; residual strength.

1 introduction

for some years, the cement industry has been investing in the development of new, environment-friendly blended cement products, e.g., supersulfated slag cement (ssc). this cement is mainly made out of blast furnace slag, a by-product of iron making; hence less energy is used during the manufacturing process and less carbon dioxide is produced than in the case of portland cement.

the thermal and mechanical properties of concrete change at elevated temperatures. this change in material properties influences the load-carrying and deformation behavior of concrete structures in case of fire. the rate of increase of temperature across the section of a concrete element is relatively slow, and the inner zones are protected against heat.
therefore, reinforced concrete structures with adequate structural detailing, e.g., minimum dimensions and cover thicknesses of the reinforcement, usually achieve satisfactory fire resistance without any additional fire protection [1]. however, after the fire has been extinguished, the heat penetration into the cross section may continue for hours, and is inverted during the cooling phase, leading to thermal stresses and cracks [2]. additionally, chemical reactions, e.g., the reformation of calcium hydroxide during the cooling phase, can widen micro cracks. a combination of these two phenomena may lead to a significant reduction in the compressive strength of concrete after a fire [3]. investigations carried out by felicetti and gambarova [4] find that the minimum strength is reached after the concrete has cooled down to normal temperature.

for the general application of concrete made of supersulfated slag cement, there is a lack of basic knowledge on the mechanical behaviour during and after a fire. a research project on the fire behaviour of concrete made of supersulfated slag cement is currently being carried out at the institute of structural engineering at eth zürich. the project aims at enlarging the theoretical and experimental data on the performance of concrete made of supersulfated slag cement during and after a fire. an extensive testing program using the ibk electric furnace is being conducted to study the temperature-dependent loss of strength of concrete and to develop temperature-dependent stress-strain relationships for concrete made of supersulfated and other types of cement over a complete temperature cycle (including the cooling phase). the stress-strain relationships can be used as material input parameters for finite-element analysis and for developing calculation models.

2 test set-up and testing procedure

the tests were performed using an electric-powered furnace that can reach a temperature of up to 1000 °c. the attainable heating rate at the concrete surface of cylindrical specimens 150 mm in diameter and 300 mm in length can be up to 4.5 k/min. the furnace consists of two u-shaped shells, allowing the test specimen to be placed in the middle of the oven in a metal cage to protect the furnace in case of concrete spalling. the concrete specimen is loaded with a hydraulic cylinder; at the contact zones, a thin layer of gypsum ensures even and centrical force transmission. the test set-up is shown in fig. 3.

all tests were carried out within a very short time-frame, and the specimens were kept in a controlled climate (20 °c / 50 %) to reduce the influences of concrete age and moisture content. the test specimens were heated up slowly to maximum temperatures of 300 °c, 500 °c and 700 °c. after maintaining the temperature level for two hours, the test specimens were cooled down linearly. the compressive strength at maximum temperature (hot strength) was measured, as well as the residual strength during the cooling down phase and after cooling down to the ambient temperature of 20 °c. the general thermal cycle is shown in fig. 1, and the corresponding target temperatures for the hot and residual strength are given in table 1. in order to minimize the temperature gradients in the cross section and the related thermal stresses, slow heating up and cooling down rates and a conditioning time of 2 hours at maximum temperature were chosen.
according to the test results presented in [3] and [4], the heating up rate can be as high as 5 k/min, while the cooling down rate should not exceed 1.0 k/min; higher values reduce the formation of calcium hydroxide, causing a higher residual strength which decreases within days after cooling down. a first simple finite element analysis was carried out to predict the maximum heat gradient during heating up and cooling down. in accordance with this first analysis, literature results and a preliminary test, the heating up rate at the surface was chosen as 1.5 k/min and the surface cooling down rate as 0.9 k/min.

during the entire heating cycle, the concrete was loaded with a maximum pressure of roughly 0.3 mpa, which is less than 0.75 % of the cold strength; hence the cylinders can be considered as unloaded, since the thermal expansion of the concrete is not affected by any load effects. the free expansion, internal damage caused by cracks, and debonding of the cement matrix lead to a lower hot and residual strength compared with loaded specimens [3,4]. a test series with loaded concrete specimens is planned.

at the end of the thermal cycle, when reaching the target temperature, the concrete strength was determined inside the furnace. the furnace was not opened; hence the concrete surface temperature remained constant. the load was applied deformation-controlled, with a constant speed of 0.005 mm/s. the stress-strain curve was continuously monitored, and the test was stopped manually post fracture. the concrete cold strength was determined in an analogous manner with the same deformation speed. after the testing procedure, the specimens were visually inspected with respect to crack formation and spalling. none of the tested concrete specimens showed loose or missing concrete parts caused by spalling. equalizing the contact zones of the specimens with a thin layer of gypsum led to proper results, since the fracture pattern was shaped like a truncated cone.

3 test specimens

the tests were conducted using cylindrical specimens (d = 150 mm / l = 300 mm) made of three different concrete mixtures: a supersulfated slag cement (ssc), a portland-limestone cement (cem ii-a-ll), and an ordinary portland cement (cem i). the mix proportions of the concrete were identical, apart from the cement used.
the cement content was always 300 kg/m3, with a w/c ratio of 0.55. common types of carbonate and siliceous aggregates up to 32 mm were used to prepare the test specimens. the fresh concrete density was 2420 kg/m3, decreasing to 2390 kg/m3 during conditioning in a 20 °c / 50 % atmosphere.

before concreting, a petrographic analysis of the aggregates was carried out. the gravel plant is located in the area of an end moraine, hence the main components are based on limestone gravel. the petrographic analysis shows that roughly 57 % of the aggregates consist of carbonate, while 38 % of the gravel components are siliceous.

the specimen diameter used in hot material testing is mostly smaller than 100 mm. such small-scale specimens are insufficient for grading curves with a maximum aggregate size of 32 mm. to ensure adequate compacting of the fresh concrete, cylindrical specimens 150 mm in diameter and 300 mm in height were used.

fig. 1: full thermal cycle (temperature-time program: heating up, conditioning for 2 h at maximum temperature, then cooling down; strength is tested at the hot stage, during cooling down after conditioning for 0.75 h, and at ambient temperature)

table 1: testing procedure

testing series | max. temperature | tests at target temperature (hot and residual strength)
1 | 300 °c | hot strength; 20 °c
2 | 500 °c | hot strength; 300 °c; 20 °c
3 | 700 °c | hot strength; 500 °c; 300 °c; 20 °c

fig. 2: thermocouple and formwork
fig. 3: concrete specimen in the furnace

the specimens were produced under laboratory conditions. the facilities were climate-controlled, with a relative humidity of 65 % and a temperature of 20 °c. a non-absorbent plastic cylinder was used as formwork. four thermocouples were placed into this formwork before concreting: two in the core, and two more at a depth of 30 mm, as shown in fig. 2. a distance of at least 30 mm was kept between the thermocouples, in order to minimize perturbation due to electric fields. the thermocouples were fixed to a 2 mm welding wire to ensure that they stayed in position during concreting and compacting. this set-up followed generally accepted guidelines [5].

the concrete was poured into the formwork in two stages and was compacted after each stage using a vibrating table. the striking times and storage conditions for the hardened concrete were respected according to [6]: the specimens were cured for three days after concreting, and were then placed either in water or in very humid conditions, i.e., a temperature of 20 °c and a relative humidity of at least 95 %, up to the age of 28 days from the time of concreting. according to [7], the specimens then have to be conditioned in a dry atmosphere for at least 90 days; the ambient temperature should not exceed 23 °c and the moisture level should be around 50 % rel. humidity. all cylinders were stored in a 20 °c / 50 % conditioning room, and the loss of weight of the concrete was monitored on a weekly basis until the specimen was tested in the furnace. at the time of testing, the specimens showed no significant loss in weight.

4 test program

table 2 shows the test program that was carried out. while steps one to four were carried out for all cement types, steps five and six were conducted on a few specimens only. the interim values at step four are determined during the cooling down cycle (see fig. 1 and the sketch below).
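to make the timing of these tests concrete, the thermal cycle of fig. 1 can be tabulated numerically from the rates given in section 2 (heating at 1.5 k/min, a 2 h hold at the maximum temperature, cooling at 0.9 k/min). the following python sketch is only an illustration; the function name, the sampling step and the generator structure are our own choices, not part of the test procedure:

    def thermal_cycle(t_max, t_ambient=20.0, heat_rate=1.5, cool_rate=0.9,
                      hold_min=120.0, step_min=1.0):
        # yields (time [min], surface temperature [deg c]) along the test cycle
        t, temp = 0.0, t_ambient
        while temp < t_max:                          # heating phase, 1.5 k/min
            yield t, temp
            t, temp = t + step_min, min(temp + heat_rate * step_min, t_max)
        for _ in range(int(hold_min / step_min)):    # 2 h conditioning at t_max
            yield t, temp
            t += step_min
        while temp > t_ambient:                      # cooling phase, 0.9 k/min
            yield t, temp
            t, temp = t + step_min, max(temp - cool_rate * step_min, t_ambient)
        yield t, temp

    points = list(thermal_cycle(500.0))              # the 500 deg c series
    print(points[-1][0] / 60.0)                      # total duration: about 16 h

for the 500 °c series, for example, the sketch gives a total cycle duration of roughly 16 hours, which makes clear why the interim residual-strength tests during cooling down are spread over many hours.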
at step 4 of table 2, the first temperature in the right-hand column indicates the maximum temperature, while the second temperature is the temperature for testing during cooling down.

table 2: testing program

step | measured category | temperature level
1 | cold strength | 20 °c
2 | hot strength | 300 °c; 500 °c; 700 °c
3 | residual strength, after the concrete cooled down to 20 °c | from the three hot-strength temperature levels of step 2
4 | interim values | 700 °c – 500 °c; 700 °c – 300 °c; 500 °c – 300 °c
5 | slow heating rate (on cem ii-a-ll cement only) | 500 °c – hot strength; 500 °c – 20 °c (residual strength after cooling down)
6 | magnetic resonance imaging (one cem i specimen only) | 300 °c – 20 °c

5 test results

the tested specimens showed no significant difference in strength at room temperature; the average coefficient of variation was less than 5 %. the cylindrical cold strength after 90 days was 33.1 mpa for the ssc concrete, 34.3 mpa for the cem ii-a-ll concrete and 40.3 mpa for the cem i concrete. during the heating cycle, the temperature at the concrete surface and at the core of the specimen was constantly monitored, and it was ensured that the average temperature gradient between the surface and the core during the heating cycle was never higher than 1 k/mm.

fig. 4: concrete specimen after the test

the measured stress-strain curves for the ssc concrete tests are shown in fig. 5; the picture on the left includes the cold-strength stress-strain curve, and the interim values during cooling down are also presented. it can be observed that at the hot stage the gradient of the curves is monotonic, while at the residual stage the gradient shows one inflexion point. this effect increases with increasing maximum temperature and lower cooling temperatures, respectively, and is due to the loss of bond between the aggregates and the cement matrix during cooling down. the stress-strain curves for the cem ii-a-ll and cem i cements were similar; the ssc showed higher strains at ultimate load.

figure 6 shows the reduction factors for the hot and residual strength of the specimens with the three different types of cement used. all results are normalized to the corresponding cold strength before testing (cold strength = 1.0). starting at a relative cold strength of 1.0 at 20 °c, the relative hot strength is given by the continuous line, with the dots corresponding to the test results at the temperature levels of 300 °c, 500 °c and 700 °c. after reaching the maximum temperature level, the loss in strength during and after cooling down is found by following the dotted line to the left, until reaching the residual strength at the ambient temperature of 20 °c after a full thermal cycle.

as shown in fig. 6, the specimens with ssc show a slightly lower hot strength in comparison with the cem ii-a-ll and cem i cements at the 300 °c and 500 °c temperature levels. at higher temperature levels, i.e. 700 °c, all cements show similar performance. while the loss in strength during cooling down for ssc was nearly linear, cem i and cem ii show a non-linear residual strength development, with increased losses in strength while cooling down from 300 °c to 20 °c. in general, the losses in strength from the hot stage to the residual stage at ambient temperature increase with increasing maximum temperature level. the concrete specimens cooled down from a maximum temperature of 300 °c had an average residual strength of 91 % of the hot strength.
at 500 °c the average residual strength was 72 %, and after cooling down from 700 °c the average residual strength was 69 %.

fig. 5: stress-strain relation for ssc concrete for different heating up levels: (a) max. temp. level of 300 °c (including cold strength), (b) max. temp. level of 500 °c, (c) max. temp. level of 700 °c

fig. 6: temperature-strength relation for three different cements during the heating up and cooling down phases

magnetic resonance imaging

using magnetic resonance imaging, the entire concrete cylinder was scanned slice by slice. higher densities of the scanned objects inside the concrete cylinder, e.g., aggregates, are shown in brighter greyscales; hence cracks, voids and air pores are shown as black lines or dots in the image. after heating one cem i specimen up to 300 °c, it was cooled down and inspected by magnetic resonance imaging. the main aim was to study the crack formation inside the specimen, and to investigate whether any loss of bond between the cement matrix and the aggregates had occurred. scans on ssc and cem ii-a-ll samples are planned. fig. 7 shows that the bond between the aggregates and the cement matrix became loose over the entire cross section; the arrows indicate the zones where the bond was affected.

fig. 7: magnetic resonance imaging

6 conclusions

the results of this first testing series on the hot and residual strength of concrete after a fire, including cooling, can be summarized as follows:
- the difference in strength between the hot stage and the residual stage after cooling down to ambient temperature is significant. the losses in residual strength during cooling down increase with higher temperatures.
- during the cooling down phase, a non-linear material behaviour is observed for cem i and cem ii.
- even after cooling down from a moderate temperature of 300 °c, debonding effects between the cement matrix and the aggregates could be observed by magnetic resonance imaging.
- supersulfated slag cement (ssc) does not perform much differently from ordinary portland cement.

acknowledgements

thanks to holcim group support ltd. for supplying the test specimens.

references

[1] en 1992-1-2:2004; eurocode 2: design of concrete structures; part 1-2: general rules – structural fire design.
[2] frangi, a., tesar, c., fontana, m.: tragwiderstand von betonbauteilen nach dem brand. bauphysik, vol. 28 (2006), p. 170–183.
[3] hertz, k. d.: concrete strength for fire safety design. magazine of concrete research, vol. 57 (2005), p. 445–453.
[4] fire design of concrete structures – structural behaviour and assessment. ceb-fip state-of-art report, bulletin vol. 46 (2008).
[5] abm-paper 2a by the german federal institute for materials research and testing, 1990.
[6] en 12390-2:2008; testing hardened concrete – part 2: making and curing specimens for strength tests.
[7] en 1363-1:1999; fire resistance tests – part 1: general requirements.
eike wolfram klingsch
e-mail: klingsch@ibk.baug.ethz.ch

andrea frangi
mario fontana

institute of structural engineering, steel, timber and composite structures
eth zürich, switzerland

acta polytechnica vol. 48 no. 6/2008

an intelligent system for structural analysis-based design improvements

m. novak, b. dolšak

abstract

the goal of the research work presented in this paper was to collect, organize, and write the knowledge and experience about structural analysis-based design improvements into a knowledge base for a consultative advisory intelligent decision support system. the prototype of the system presented proposes possible design changes that should be taken into consideration to improve the design candidate according to the results of a prior stress-strain or thermal analysis. the system can be applied either in the design of new products or as an educational tool.

keywords: computer-aided design, structural optimisation, knowledge based systems, decision support.

1 introduction

engineering design is a very complex, iterative process. physical and mathematical modelling, simulations and analyses are computationally intensive, but offer immense insight into a developing product. structural engineering analysis plays an important role in this process, as the results of such analysis are often used as basic optimization parameters to improve the design candidate being validated and analysed. the number of iterations/cycles needed to reach the final design solution depends directly on the quality of the initial design and the appropriateness of the subsequent design changes.

a wide range of computer aided design (cad) software is extensively applied in performing various design activities, such as modelling, kinematics, simulations, structural analysis, or just drawing technical documentation. nowadays, cad software can be so complex, and offers such an extensive assortment of different options, that one can easily be confused. this is the reason for the relatively low level of control over these systems; in many cases, cad software is used just as a "black box". this often leads to completely wrong conclusions, especially when young engineers facing complex design problems do not understand the basic theories, or their knowledge and experience are simply too limited. mainly because of the problems mentioned above, many professional engineers involved in the product development process share the opinion that the benefits of applying cad are below expectations.

it cannot be denied that modern cad tools provide a wide range of technical support for designers. however, these tools are unable to provide adequate help in the more creative parts of the design process involving complex reasoning [1], for example when a design candidate needs to be evaluated and further design decisions need to be made. thus, many design steps, including the analysis-based design improvement process, still depend mostly on the designer's knowledge and experience. the goal of our research work presented here was to increase the intelligence of existing cad tools [2] by collecting, organizing, and encoding this kind of knowledge and experience into a knowledge base for a consultative advisory intelligent decision support system. the prototype of the system proposes possible design changes that should be taken into consideration to improve the design candidate according to the results of a prior stress-strain or thermal analysis [3]. the system is called propose, and can be applied either in the process of designing new products or for educational purposes.

2 analysis-based design improvement

every proposed design should be verified during the embodiment phase of the design process. the purpose of engineering analysis in the design process is to simulate and verify the conditions in the structure as they will appear during its exploitation. if the structure does not satisfy the given criteria, it needs to be improved by applying certain design optimisation steps. several redesign steps are usually possible for design improvement; the selection of one or more redesign steps to be performed in a certain case depends on the user's requirements, possibilities and wishes.
the easiest design change is to select a different material. however, in many cases this cannot be done or cannot be financially justified. fig. 1 presents some basic ways to improve a design on the basis of analysis results. if the structure is over-dimensioned, it is "on the safe side", because it is stronger than the loading requires; however, if such a design solution is too heavy or too expensive, redesign is still justified. on the other hand, design changes are necessary for an under-dimensioned structure, which cannot bear the loadings and will fail during its exploitation.

nowadays, the results of structural analyses are usually very well presented. the analysing software is very helpful at this point, as it offers adequate computer-graphic support in terms of reasonably clear pictures showing the distribution of the computed values of the unknown parameters, such as stresses, deformations and temperatures, inside the body of the structure [4]. these values are then compared with the allowable limits, defined either by the selected material (stresses, temperatures) or by specific design requirements (deformations). however, the support provided by geometry-based cad systems is limited, because of the wide semantic gap between the expressive power of the geometry and the abstract features of the product. thus, substantial knowledge and experience are needed in order to understand the results of the analysis and to choose appropriate redesign actions [5]. the experience gained through many design iterations is of crucial importance.

3 intelligent design improvement process

many important characteristics of advanced computing applications are changing the way engineers interact with computers. new approaches based on artificial intelligence (ai) have earned acceptance in many fields of engineering and have started to emerge in commercial software. analysis-based design improvement is certainly an engineering task with great potential for the application of intelligent systems.

finite element analysis (fea) is one of the most extensively used numerical methods in the engineering product development process [6]. knowledge based engineering (kbe) techniques have been applied to fea for over twenty years to teach [7], advise [8], automate the fea pre-processing phase, mainly involving automatic mesh generation [9], and verify calculations [10]. however, the use of ai methods is almost absent in the post-processing phase and in the consequent design modification/improvement [11, 12]. many early examples present a rule-based approach to automating the optimisation of simple components or geometric shapes [13].
recently, optimisation procedures have become part of kbe systems for specific products [14, 15]. it is evident that kbe applications for analysis-based design improvement are quite scarce [11, 16], although the need for linking intelligent programs to structural analysis in design has been discussed in many research works, and some more specific ai techniques, like machine learning, have also proved to be serviceable in this particular domain [17]. in the last decade, research in this field has been concerned mainly with the integration of various software systems in such a way that the whole design process, including analysis, can be automated, again mostly for specific products [18, 19].

various software and hardware components are frequently required to perform both geometric modelling and engineering analysis. in this context, an independent intelligent advisory system for decision support within the analysis-based design improvement process can be applied more easily. moreover, using a qualitative description of the engineering analysis results, such a system can be more general and cover a wider range of application areas. intelligent interpretation of analysis results can be used to choose the most suitable design modifications [20]. thus, what is needed is a mechanism for extracting meaningful qualitative design information from simulation results and coupling this information to a design modification system as a higher level of representation [21].

4 propose – a consultative advisory intelligent system

in order to provide intelligent decision support for a designer performing analysis-based design improvement, we have developed a consultative intelligent system that proposes appropriate modifications to design parameters. development of the system was carried out in a sequence of steps. knowledge acquisition and development of the knowledge base were the first and most important: theoretical and practical knowledge about design and possible design improvements was investigated and collected. after that, an appropriate representation formalism for the acquired knowledge was defined and the knowledge base of the system was encoded. finally, we developed the shell of the system, named propose, consisting of the user interface and the inference engine suited to the existing knowledge base (fig. 2). the knowledge base and the shell of the system were encoded in prolog syntax; visual prolog version 5.2, developed by prolog development center a/s (2001), was used for this purpose.

fig. 1: some basic steps in the analysis-based design improvement process

4.1 knowledge base of the system

in the knowledge acquisition process we took advantage of all possible ways to acquire knowledge about design improvements, from a literature survey, including an examination of previously conducted engineering analyses, to interviews with selected human experts. this was not a straightforward task: many analysis reports contain confidential data and are thus unavailable for inclusion, and interviews and the examination of existing redesign examples are conditioned by the quality of cooperation with the available experts and can be time-consuming. the scope of such results is greatly limited by these considerations.
production rules were chosen as the most appropriate knowledge representation formalism, because they have a well-defined form which is transparent, modular and easy to understand. furthermore, the form of the actual rules used by human experts in the design process is quite similar to the form of production rules. each rule presents a list of recommended design changes that should be taken into consideration while dealing with a certain problem, subject to certain limits. the rules are generalised and do not refer exclusively to the examples that were used during the knowledge acquisition process; they can also be applied to any new problem whose limits match those at the head of the rule. in such a case, application of the appropriate rule results in a list of recommended design changes for dealing with the given problem. the designer, with his or her own knowledge and experience, should choose and apply one or more design changes that are possible, reasonable, and maximally effective for each particular case. some pictorial examples have been added to the system as additional help for the user, to enhance understanding of the proposed design changes and to assist in making a suitable decision.

the present version of the propose knowledge base comprises 314 various types of rules and facts (fig. 2) that are necessary for the system to be functional. for example, several rules are needed just to define the status of the structure (not stiff enough, under-dimensioned, over-dimensioned or satisfactory). the status of the structure depends on the type of engineering analysis, the parameters being analysed and the deviations between the computed and allowable values. finally, the need for redesign is defined considering the status of the structure, the scale of the proposed changes (significant or minor) and the justification for redesign.

from a technical point of view, the most important rules in the knowledge base are those defining redesign recommendations. let us present an example of such a rule for advising a designer how to reduce the local stress gradients around a hole in a plate, which is a quite frequent structural engineering design problem. fig. 3 shows a tensile loaded "infinite" plate with a circular hole (a) and three different design improvement options for reducing the potential high stresses around the hole (b, c, d). by generalising the geometric appearance of the hole, the following rule for design improvement recommendations can be defined:

if   the stresses are too high
and  the structure is a 3d "infinite" plate
and  the critical area is around the hole
then reduce the size of the hole
     make a chamfer on the edge of the hole
     change the circular hole into an elliptical hole (fig. 3b)
     change the circular hole into a round ended slot (fig. 3c)
     add smaller relief holes in the line of loads on both sides of the hole (fig. 3d)
     change the hole geometry to decrease the stress concentration factor (k)

in prolog, this rule is encoded as:

actions([],
    ["reduce the size of the hole",
     "make a chamfer on the edge of the hole",
     "change the circular hole into elliptical hole",
     "change the circular hole into round ended slot",
     "add smaller relief holes in line of loads on both sides of the hole",
     "change the hole geometry to decrease stress concentration factor (k)"],
    [], l) :-
        stresses(high),
        area_description(one, around_hole),
        additional_question(" is this a case of a hole in an 'infinite' plate? ", answer),
        answer = "yes",
        l = ["structure type is 3d", "stresses are high",
             "critical area is one region around a hole",
             "this is a case of a hole in an 'infinite' plate"].

fig. 2: basic architecture of the propose system
fig. 3: reducing local stresses around a circular hole in a tensile loaded "infinite" plate

4.2 shell of the system

the shell of the propose system was encoded in prolog. prolog was chosen because of its built-in features, such as rule-based programming, pattern matching and backtracking, which are excellent tools for developing intelligent systems [22]. our work concentrated on a declarative presentation of the knowledge, using the data-driven reasoning that is built into prolog; however, some control procedures were also added to the inference engine of the system to adjust its performance to the real-life design process. for the user interface, our goal was to simulate communication between the designer–beginner and the designer–expert. the user interface enables the user to input the data, informs the user about the results, offers help and presents information about the inference process. as can be seen in fig. 2, the user interface has many features, including help; this enables efficient and user-friendly communication, although the system is run as a simple monolithic console application.
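the condition–action pattern of the rule shown in section 4.1 can also be sketched outside prolog, as data plus a small matcher. the following python fragment is only an illustrative re-expression; the dictionary keys and function names are ours, not part of the propose system:

    # a production rule as data: conditions to satisfy and actions to recommend
    HOLE_RULE = {
        "conditions": {"stresses": "high",
                       "structure": "3d infinite plate",
                       "critical_area": "around hole"},
        "actions": ["reduce the size of the hole",
                    "make a chamfer on the edge of the hole",
                    "change the circular hole into an elliptical hole",
                    "change the circular hole into a round ended slot",
                    "add smaller relief holes in line of loads on both sides of the hole",
                    "change the hole geometry to decrease stress concentration factor (k)"]
    }

    def fire_rule(rule, facts):
        # the rule fires only if every condition matches the known facts
        if all(facts.get(k) == v for k, v in rule["conditions"].items()):
            return rule["actions"]
        return []

    facts = {"stresses": "high", "structure": "3d infinite plate",
             "critical_area": "around hole"}
    for action in fire_rule(HOLE_RULE, facts):
        print(action)

prolog provides this matching, together with backtracking over many such rules, natively, which is precisely the reason given above for choosing it.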
due to the simple architecture of the system, the response times are very modest (within a few seconds). it is, however, evident that propose is a prototype which is still the subject of research and, as such, cannot be compared with commercial software.

5 application of the propose system

application of the propose system is based on interactive communication between the user and the system, aiming to define the status of a structure and to generate a list of redesign proposals, if applicable (fig. 4). it is reasonable to use the propose system when the results of the analysis are available and also reliable. the system offers some basic guidelines to help the user clarify whether, for example, the fea results are reliable and can serve as basic parameters for verifying the suitability of the design. however, validation and determination of the reliability of the fea results should form part of the analysis itself; the aim of the system presented here is only to emphasize the importance of the reliability of the results and to present some guidelines as a kind of help when considering it.

after the availability and reliability of the analysis results have been confirmed, the user has to state the type of analysis that was performed prior to the propose application. the current version of the system can deal with stress-strain or thermal analyses. in the next step, the user needs to know the allowable values for the structure being analysed, and a qualitative description of the deviation between the computed maximum values of the stresses, deformations or temperatures and the allowable limits needs to be defined. considering the range of differences between the actual values and the allowable limits, the system defines the status of the design candidate (satisfactory, not stiff enough, under- or over-dimensioned) and what kind of changes need to be made (significant, minor or none); a minimal sketch of this status decision follows.

fig. 4: diagram representation of the propose application
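the status decision just described (mapping the qualitative deviation between computed and allowable values onto a status and a scale of changes) can be illustrated with a few lines of python. the thresholds used below are invented placeholders for illustration only; the actual ranges are encoded in the propose knowledge base and are not published here:

    def design_status(computed, allowable, tolerance=0.05):
        # classify one analysed parameter; 'under-dimensioned' corresponds to
        # 'not stiff enough' when the parameter is a deformation
        ratio = computed / allowable
        if ratio > 1.0:
            scale = "significant" if ratio > 1.0 + tolerance else "minor"
            return "under-dimensioned", scale
        if ratio < 1.0 - tolerance:
            return "over-dimensioned", "minor"   # redesign only if justified
        return "satisfactory", "none"

    print(design_status(180.0, 160.0))  # ('under-dimensioned', 'significant')
    print(design_status(90.0, 160.0))   # ('over-dimensioned', 'minor')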
in cases where design changes are advised, any improvement of the design candidate should be justified. if the structure is not stiff enough or if it is under-dimensioned, design changes are necessary and the system itself classifies them as justified. on the other hand, if the structure is over-dimensioned, the user/designer has to decide whether design changes are justified or not; as in many other cases, the user can obtain some help from the system when making this decision.

in order to present some recommendations for design improvement, the critical area where the computed values exceed the allowable limits needs to be defined. for the time being, the system can deal with two types of structure: beams and general three-dimensional structures. an abstract description of the critical area is supported by a list of predefined features, e.g., around a hole, at a notch, in a corner. a critical area should be described as generally as possible, to cover the majority of problems that may occur in practice. presently, the number of predefined geometric features is relatively small; however, by answering some additional questions, a critical area can be described in a more detailed manner.

for each problem described, the system searches for redesign proposals in the knowledge base and presents them on the screen. as can be seen in fig. 5, the system is able to provide three types of proposals: the first referring to material changes, the second to geometry changes, and the last referring to loads. as already mentioned, the user can obtain an insight into the inference process or obtain more information about certain redesign proposals; an example of a pictorial explanation of a redesign recommendation is presented in fig. 6.

fig. 5: the result of a propose application
fig. 6: pictorial explanation of a redesign proposal

6 conclusions

the propose system is a knowledge-based module to be applied within the computer-aided design optimisation cycle, to increase the intelligence of existing cad tools and to enable more intelligent and less experience-dependent design performance [23]. research on analysis-based design improvement has dealt mostly with the pre-processing phase of the analysis, while much less attention has been paid to the post-processing phase, which is well known as the other bottleneck in the analysis process. in order to fill this gap, a prototype of an intelligent advisory system was developed to assist the designer in the post-processing phase of structural analysis by proposing possible design changes that should be taken into consideration to improve the design according to the results of a prior analysis.

the propose system offers help and advice on how to solve design problems in abstractly described critical areas of the structure after a stress-strain or thermal analysis. the architecture of the system, based on production rules, enables it to be expanded relatively easily with additional rules, for example for a more specific description of a problem, for other types of engineering analyses, or for a deeper, multi-physics understanding of the recommended design changes.

when using the propose system, a designer has to answer some questions stated by the system in order to present the results of the engineering analysis qualitatively, with emphasis on the critical area that needs to be optimised.
the propose system is not intended only for optimising new products in practice, but also for use in design education. in fact, the system is currently used in the design education process at the faculty of mechanical engineering, university of maribor, slovenia. overall experience with the operation of the system has been very positive and encouraging. the important feature of the system, the ability to explain the inference process, is especially welcome for students, as it enables them not only to select appropriate further design steps, but also to acquire some new knowledge.
references
[1] mili, f. et al.: knowledge modelling for design decisions. artificial intelligence in engineering, vol. 15 (2001), p. 153–164.
[2] mcmahon, c., browne, j.: cadcam: principles, practice and manufacturing management. prentice hall, 2nd edition, 1999.
[3] dolšak, b., novak, m., jezernik, a.: intelligent design optimisation based on the results of finite element analysis. the international journal of advanced manufacturing technology, vol. 21 (2003), p. 391–396.
[4] tichánek, r. et al.: structural stress analysis of an engine cylinder head. acta polytechnica, vol. 45 (2005), no. 3, p. 43–48.
[5] ong, y. s., keane, a. j.: a domain knowledge based search advisor for design problem solving environments. engineering applications of artificial intelligence, vol. 15 (2002), p. 105–116.
[6] zienkiewicz, o. c. et al.: the finite element method: its basis and fundamentals. oxford: elsevier butterworth-heinemann, 6th edition, 2005.
[7] forde, b., stiemer, s.: esa: expert structural analysis for engineers. computers and structures, vol. 29 (1988), p. 171–174.
[8] turkiyyah, g. m., fenves, s. j.: knowledge-based assistance for finite-element modelling. intelligent systems, vol. 11 (1996), no. 3, p. 23–32.
[9] dolšak, b.: finite element mesh design expert system. knowledge-based systems, vol. 15 (2002), p. 315–322.
[10] breitfeld, t., kroplin, b.: an expert system for the verification of finite-element calculations. in: proc. 4th int. symposium on assessment of software tools (sast'96), ieee computer society, 1996, p. 18–23.
[11] smith, l., midha, p.: a knowledge-based system for optimum and concurrent design and manufacture by powder metallurgy technology. int. journal of prod. res., vol. 7 (1999), no. 1, p. 125–137.
[12] burczynski, t. et al.: optimization and defect identification using distributed evolutionary algorithms. engineering applications of artificial intelligence, vol. 17 (2004), p. 337–344.
[13] jozwiak, s.: artificial intelligence: the means for developing more effective structural optimisation programs. in: proc. int. conf. on comp. mechanics (iccm 86), 1986, p. 103–108.
[14] chapman, c. b., pinfold, m.: the application of a knowledge based engineering approach to the rapid design and analysis of an automotive structure. advances in engineering software, vol. 32 (2001), p. 903–912.
[15] ríos, j. et al.: kbe application for the design and manufacture of hsm fixtures. acta polytechnica, vol. 45 (2005), no. 3, p. 17–24.
[16] pilani, r. et al.: a hybrid intelligent systems approach for die design in sheet metal forming. international journal of advanced manufacturing technology, vol. 16 (2000), p. 370–375.
[17] dolšak, b., bratko, i., jezernik, a.: knowledge base for finite element mesh design learned by inductive logic programming. artificial intelligence for engineering design, analysis and manufacturing, vol. 12 (1998), p. 95–106.
[18] custom product engineering during the sales cycle. pittsburgh, usa: ansys user group conference, 2000.
[19] kowalski, z. et al.: an expert system for computer aided design of ship systems automation. expert systems with applications, vol. 20 (2001), p. 261–266.
[20] armstrong, c., bradle, b.: design optimisation by incremental modification of model topology. in: proc. 8th int. meshing roundtable, sandia national laboratories, 1999, p. 293–298.
[21] sahu, k., grosse, i.: concurrent iterative design and the integration of finite element analysis results. engineering with computers, vol. 10 (1994), p. 245–257.
[22] bratko, i.: prolog: programming for artificial intelligence. addison-wesley, 3rd edition, 2000.
[23] dolšak, b., novak, m., kaljun, j.: intelligent support for a computer aided design optimization cycle. acta polytechnica, vol. 46 (2006), no. 5, p. 15–20.
marina novak, ph.d. / phone: +386 2 220 7692 / e-mail: marina.novak@uni-mb.si
bojan dolšak, ph.d. / phone: +386 2 220 7691, fax: +386 2 220 7994 / e-mail: dolsak@uni-mb.si
laboratory for intelligent cad systems, university of maribor, faculty of mechanical engineering, smetanova 17, si-2000 maribor, slovenia, http://licads.fs.uni-mb.si/
acta polytechnica vol. 50 no. 4/2010
conductivity measurements of silver pastes
m. dirix, o. koch
abstract
the development of three-dimensional printed circuit boards requires research on new materials which can easily be deformed. conducting pastes are well suited for deformation even after they are applied to the dielectric carrier. this paper deals with measurements of the electrical conductivity of these conducting pastes. two different conductivity measurement techniques are explained and carried out. the resulting measurements give an overview of the conductivity of several measured samples.
keywords: conductivity, conducting pastes, 3d circuits.
1 introduction
current printed circuit board (pcb) production is still mostly based on the well-known concept of laminating and photo-etching. during the laminating process, the whole surface of the dielectric carrier is covered with a conducting material such as copper. then the conducting material is covered with a photosensitive polymer and exposed to a light projection of the desired circuit. after washing, only the desired circuit surfaces are covered with the polymer. using a highly corrosive etching fluid, the conducting material not covered by the polymer is removed, leaving only the circuit. the disadvantages of this method are, on the one hand, the use of conducting material that will be discarded, which is costly. on the other hand, the laminating process uses high pressure to bond the conductor with the dielectric carrier. this bonding can therefore only be applied to hard flat surfaces, such as standard dielectric carriers. these flat surface pcbs are ill-suited for deformation, which limits their application in the growing market for 3d shaped circuit boards. one way to solve this problem is to use molded interconnect device materials in combination with laser direct structuring (lds) [1]. in mid-lds, the dielectric carrier is preformed into the desired shape.
then the surface of the dielectric carrier is covered with an organic metal complex. the organic metal complex can be activated using a laser beam. the activated surface is roughened, and the organic metal complex is separated into metal atoms and organic ligands. this makes the activated surface suited for copper coating with a strong grip. after cleaning, the copper coating is then built up on the activated areas using a current-free copper bath.
another approach is to use thin flexible dielectric carriers together with a moldable conductor, such as conducting pastes. in this approach, the conductor is first applied to the surface of the dielectric carrier using a printing technique like that used for ink printers. the shape of the applied circuit has to take into account the desired form of the circuit after deforming. the dielectric carrier together with the conductor are then deformed into the desired shape. the advantages clearly lie in the straightforward processing steps. most conducting pastes consist of conducting material dissolved in an epoxy, polyamide or acrylate adhesive. after applying the paste, the material is heated in order to thoroughly bond the conductor to the carrier and remove the solvent from the conducting structure. the resulting conducting structure is studied for its conductivity and usability for high-frequency pcb designs. this paper presents two methods for measuring the electrical conductivity of such conducting pastes.
2 microstrip t-resonator measurement structure
the microstrip t-resonator is a two-port resonator which consists of a 50 ω microstrip line with a parallel quarter wavelength resonating line [2]. figure 1 shows the layout of the t-resonator used here. the open-ended quarter wavelength transmission line stub resonates at odd integer multiples of the quarter wavelength frequency. the first resonance can be determined by calculating the length of the quarter wavelength stub, as follows
l_el = n·c / (4·f·sqrt(εreff))   (1)
where n is the order of the resonance (here n = 1), c is the speed of light, f is frequency, and εreff is the effective dielectric constant of the dielectric carrier that is used. the line length of the quarter wavelength stub is calculated to have the primary resonance at 500 mhz. in order to get the first resonance as accurately as possible at 500 mhz, corrections must be applied for open-end and t-junction effects. these effects were compensated with the use of agilent ads commercial software. the accurate designing makes it possible to have data points at the desired frequencies.
fig. 1: basic layout of the t-resonator measurement structure (quarter wavelength stub, port 1, port 2)
the t-resonator operates like a notch filter with a resonant null at the resonance frequencies. a network analyser (nwa) is used to measure the transmission of the two-port network. from the measurement it is possible to determine the loaded quality factor ql by finding the resonance frequencies and the corresponding 3 db bandwidths bw3db. the loaded quality factor is then calculated as
ql = f / bw3db   (2)
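as a short numerical illustration of eqs. (1) and (2), a sketch only; the effective permittivity and bandwidth values below are assumed for illustration, they are not taken from the measurements:

```python
import math

C0 = 299_792_458.0  # speed of light [m/s]

def stub_length(f, eps_reff, n=1):
    """eq. (1): quarter-wavelength stub length l_el = n*c/(4*f*sqrt(eps_reff))."""
    return n * C0 / (4.0 * f * math.sqrt(eps_reff))

def loaded_q(f_res, bw_3db):
    """eq. (2): loaded quality factor ql = f/bw3db."""
    return f_res / bw_3db

print(stub_length(500e6, 1.9))   # stub length in metres for an assumed eps_reff
print(loaded_q(500e6, 8e6))      # ql for an assumed 8 mhz 3 db bandwidth
```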
the loaded quality factor is influenced by the 50 ω test set of the nwa. in order to compensate, the unloaded quality factor q0 has to be extracted
q0 = ql / sqrt(1 − 2·10^(−la/10))   (3)
where la is the insertion loss in db at the resonance. the unloaded quality factor q0 comprises the three main loss effects, namely conductor, dielectric and radiation losses [3]:
1/q0 = 1/qc + 1/qd + 1/qr   (4)
where qc is the quality factor due to conductor losses, which we want to derive, qd is the quality factor due to dielectric losses, and qr is the quality factor due to radiation losses. the t-resonator is first implemented using a known conductor, copper, and a reference measurement is carried out. for copper, the attenuation due to conductor loss can be obtained with [4]
αc = rs / (z0·w)   (5)
where rs = sqrt(ω·μ0/(2σ)) is the surface resistivity of the conductor. using (5), the quality factor due to conductor losses can be calculated as
qc(copper) = β / (2α)   (6)
now it is possible to derive the sum quality factor of both other losses from the measurement with
1/qref(copper) = 1/q0 − 1/qc   (7)
it is assumed here that the losses due to the dielectric carrier and radiation are equal for the reference measurement with the copper conductor and for the measurements carried out using the conducting pastes, if the same dielectric carrier is used and the dimensions of the resonator are equal. the surface resistivity rs and further the electrical conductivity σ can be derived by calculating equations (7), (6) and (5) backwards for the measured sample. there is limited accuracy, due to the relatively small quality factor and the assumption that the dielectric and radiation losses are equal for the copper and for the samples. only assumptions can be made about the global location of the electrical conductivity.
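a sketch of this back-calculation chain, eqs. (3), (7), (6), (5); all numerical values in the usage lines (line width, effective permittivity, measured q values) are assumed for illustration and are not the paper's data:

```python
import math

MU0 = 4e-7 * math.pi   # vacuum permeability [h/m]
C0 = 299_792_458.0     # speed of light [m/s]

def unloaded_q(ql, la_db):
    """eq. (3): unloaded q from the loaded q and the insertion loss at resonance."""
    return ql / math.sqrt(1.0 - 2.0 * 10.0 ** (-la_db / 10.0))

def qc_from_sigma(f, sigma, z0, w, eps_reff):
    """eqs. (5)-(6): conductor-loss q for a conductor of known conductivity."""
    rs = math.sqrt(2.0 * math.pi * f * MU0 / (2.0 * sigma))  # surface resistivity
    alpha = rs / (z0 * w)                                    # eq. (5)
    beta = 2.0 * math.pi * f * math.sqrt(eps_reff) / C0      # phase constant
    return beta / (2.0 * alpha)                              # eq. (6)

def sigma_from_measurement(f, q0_sample, q_ref, z0, w, eps_reff):
    """invert eqs. (7), (6) and (5) for the conducting-paste sample."""
    qc = 1.0 / (1.0 / q0_sample - 1.0 / q_ref)               # eq. (7) rearranged
    beta = 2.0 * math.pi * f * math.sqrt(eps_reff) / C0
    rs = beta / (2.0 * qc) * z0 * w                          # eqs. (6), (5) inverted
    return 2.0 * math.pi * f * MU0 / (2.0 * rs ** 2)         # sigma from rs

# reference measurement with the copper resonator (assumed values):
q0_cu = unloaded_q(ql=48.0, la_db=26.0)
q_ref = 1.0 / (1.0 / q0_cu - 1.0 / qc_from_sigma(5e8, 5.8e7, 50.0, 1.5e-3, 1.9))
# paste resonator measured under the same conditions (assumed values):
print(sigma_from_measurement(5e8, unloaded_q(30.0, 20.0), q_ref, 50.0, 1.5e-3, 1.9))
```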
3 surface resistance measurements by means of a dielectric resonator
the methods described in this section are based on the findings of j. krupka [5]. the measurements are performed using a dielectric resonator manufactured by qwed. in a completely closed resonator, the unloaded quality factor comprises only the quality factor due to conductor losses qc and the quality factor due to dielectric losses qd. the unloaded quality of a resonator which is completely filled with a dielectric material can then be described as
1/q0 = rss/asup + rsm/amet + pe·tanδ   (8)
where asup and amet are geometrical factors for the conducting surfaces of the resonator, the former for the surface of the sample and the latter for the lateral metal parts of the resonator. rss and rsm are the corresponding surface resistance values for sample and metal. pe is the fraction of the electrical energy stored in the dielectric material, and tanδ is its dielectric loss tangent value. although equation (8) is valid for any of the resonator modes, only the te0ρφ modes are considered for the conductivity measurements. the te0ρφ modes are chosen because they have axial symmetry, and the ohmic contact between the surface of the sample and the lateral metal conductor has no influence on the quality factor of the resonance. this lack of influence results in high reproducibility for experiments using the same pair of samples. for the measurements, the te010, te011 and te012 modes are used to determine the electrical conductivity of the sample. in order to eliminate the surface resistance of the lateral metal conductor in (8), first a measurement is made using samples of the same material as the metal. using the measured unloaded quality factor as a reference q0ref, it is possible to write (8) as
rss = asup/q0 − (asup² / (asup + amet))·(1/q0ref) − pe·tanδ / (1/asup + 1/amet)   (9)
with equation (9) it is now possible to determine the surface resistance of the sample, and from this to derive the electrical conductivity. the accuracy of the measurement depends on measuring the unloaded quality factor, the uncertainty of the dielectric loss tangent value and the values of the geometrical factors. the former is resolved by adjusting both the coupling loops to have a maximal transmission s21 of −40 db, which results in a maximum error for q0 of 1 %. both the other values are provided by the manufacturer. the accuracy of the measurement using the dielectric resonator relies on the assumption that the conductor deposit is bulk. the deposited conductor is only assumed to be bulk if its thickness is at least 3δ, where δ = 1/sqrt(π·f·κ·μ) [6]. if the deposited conductor is thinner, the electromagnetic field can in part penetrate through the conductor into the carrier. the dielectric losses of the carrier will then have an effect on the quality factor of the measurement. for samples with a minimum of 3δ, the maximum accuracy, taking into account the error of q0, is 2 % for the surface resistance and 5 % for the electrical conductivity.
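a sketch of the surface-resistance evaluation of eq. (9) and of the 3δ bulk criterion; the geometrical factors and loss values would come from the manufacturer, so the numbers used here are assumptions:

```python
import math

MU0 = 4e-7 * math.pi  # vacuum permeability [h/m]

def skin_depth(f, sigma, mu_r=1.0):
    """skin depth delta = 1/sqrt(pi*f*kappa*mu), kappa being the conductivity."""
    return 1.0 / math.sqrt(math.pi * f * sigma * mu_r * MU0)

def sample_surface_resistance(q0, q0_ref, a_sup, a_met, pe_tand):
    """eq. (9): surface resistance of the sample from the sample and the
    reference measurement; pe_tand is the product pe*tan(delta)."""
    return (a_sup / q0
            - (a_sup ** 2 / (a_sup + a_met)) / q0_ref
            - pe_tand / (1.0 / a_sup + 1.0 / a_met))

def conductivity_from_rs(f, rs):
    """invert rs = sqrt(pi*f*mu0/sigma) for the electrical conductivity."""
    return math.pi * f * MU0 / rs ** 2

# bulk criterion for a paste deposit at 1.6 ghz with sigma ~ 5e6 s/m:
print(3.0 * skin_depth(1.6e9, 5e6))   # ~1.7e-5 m, i.e. above a 10 um deposit
```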
4 measurements
the t-resonator structure is implemented using a rogers rt/duroid 5880 laminate. figure 2 shows the resulting resonator structure, a copper structure on the left and a structure with a sample conducting paste on the right. the reference sample in copper was created using a photo-etching technique. in order to check for connectivity errors, the t-resonator sample structures were created once using a copper conductor with a quarter wavelength stub printed on top using the conducting paste, and in the other case both the transmission line and the quarter wavelength stub were printed using the conducting paste. for these measurements it has to be taken into account that the laminate was possibly unsuited for the preparation and for the heating steps required for applying the conducting pastes, resulting in higher inaccuracy of the measurements. figure 3 presents the resulting transmission measurements. two resonant points can be seen, one at approximately 500 mhz and one at 1.5 ghz. several samples with different pastes were then measured, and the results can be found in table 1.
fig. 2: the t-resonator structure
fig. 3: transmission measurement of the t-resonator structure
table 1: electrical conductivity measurements using the t-resonator structure
sample   σ [s/m] @500 mhz   σ [s/m] @1.5 ghz
01       3e6                3e6
03       2e6                3e6
07       2e6                3e6
08       3e6                4e6
the dielectric resonator uses samples with surface dimensions of approximately 9.5 cm times 9.5 cm on both the top side and the bottom side of the resonator. using the dielectric resonator, the copper samples are first measured as a reference. the copper samples are laminated surfaces of rogers rt/duroid, 35 μm in thickness. the actual samples are squares of conducting pastes printed on a dielectric carrier with an approximate thickness of 10 μm, which was the current depositing limit. this is less than the formerly defined 3δ that is needed for the assumption of a bulk conductor. however, considering the high reproducibility of the measurements, and taking into account that all samples use the same carrier and have the same deposited thickness, it is still possible to make a qualitative evaluation of the differences between the measured samples. the resulting conductivity measurements are shown in table 2.
table 2: electrical conductivity measurements using the dielectric resonator
sample     σ [s/m] @1.6 ghz   σ [s/m] @2.1 ghz   σ [s/m] @2.7 ghz
copper     5.53e7             5.35e7             5.65e7
07120902   5.89e6             5.61e6             5.43e6
07120910   5.35e6             4.98e6             5.50e6
07120913   4.42e6             4.35e6             4.48e6
07120914   4.45e6             4.47e6             4.73e6
07120915   4.40e6             4.63e6             4.62e6
07120916   4.35e6             4.68e6             4.63e6
07120917   5.50e6             5.72e6             5.67e6
07120918   5.50e6             5.50e6             5.50e6
5 conclusion
a t-resonator structure for measuring electrical conductivity has been realised for 500 mhz and 1.5 ghz. further, a dielectric resonator has been evaluated and acquired for accurate measurements at 1.6 ghz, 2.1 ghz and 2.7 ghz. taking into account the accuracy of the measurements it is possible to make a qualitative evaluation of the different conducting pastes. the electrical conductivities of the sample conducting pastes were measured using the t-resonator and the dielectric resonator. the conductivity of the conducting pastes has been shown to be approximately 10 % of the electrical conductivity of copper. although the conductor losses for the conducting pastes will be larger than in the case of copper, they are still usable for designing high-frequency circuits.
acknowledgement
the research described in this paper was supervised by prof. d. heberling, institute of high frequency technology, rwth aachen university.
references
[1] orlob, c., kornek, d., preihs, s., rolfes, i.: comparison of methods for broadband electromagnetic characterisation of molded interconnect device materials. advances in radio science, 2009, vol. 7, p. 11–15.
[2] lätti, k. p., heinola, j. m., kettunen, m., ström, j. p., silventoinen, p.: a novel strip line test method for relative permittivity and dissipation factor of printed circuit board substrates. ursi/ieee xxix convention on radio science, 2004, p. 71–74.
[3] amey, d. i., curilla, j. p.: microwave properties of ceramic materials. du pont electronics, 1991, p. 267–272.
[4] pozar, d. m.: microwave engineering. 3rd ed. wiley, 2005.
[5] krupka, j., klinger, m., kuhn, m., baranyak, a., stiller, m.: surface resistance measurements of hts films by means of sapphire dielectric resonators. ieee transactions on applied superconductivity, 1993, vol. 3, no. 3, p. 3043–3047.
[6] krupka, j., derzakowski, k., zychowicz, t., givot, b. l., egbert, w. c., david, m. m.: measurements of the surface resistance and conductivity of thin conductive films at frequencies near 1 ghz employing the dielectric resonator technique. personal communication.
about the authors
marc dirix was born in geleen, the netherlands, in 1980. he received his dipl.-ing. degree in electrical engineering from rwth aachen university, germany, in 2009. currently, he is a research assistant at the institute of high frequency technology at rwth aachen university, where he is working towards a doctoral degree (phd). his research interests include measurement of high-frequency properties of new materials.
olivier koch was born in dortmund, germany, in 1977. he received his dipl.-ing. degree in electrical engineering from rwth aachen university, germany, in 2004. currently, he is a research assistant at the institute of high frequency technology at rwth aachen university, where he is working towards a doctoral degree (phd). his research interests include power amplifiers and mimo systems.
m. dirix, o. koch / e-mail: marc@ihf.rwth-aachen.de, koch@ihf.rwth-aachen.de / institute of high frequency technology, rwth aachen university, melatener strasse 25, 52074 aachen, germany
acta polytechnica vol. 50 no. 2/2010
flock growth kinetics for flocculation in an agitated tank
r. šulc, o. svačina
abstract
flock growth kinetics was investigated in a baffled tank agitated by a rushton turbine at mixing intensity 40 w/m3 and kaolin concentration 0.44 g/l. the tests were carried out with a model wastewater (a suspension of tap water and kaolin). the model wastewater was flocculated with organic sokoflok 56a flocculant (solution 0.1 % wt.). the flock size and flock shape were investigated by image analysis. a simple semiempirical generalized correlation for flock growth kinetics was proposed, and was used for data treatment. the flock shape was characterized by fractal dimension df2. using the statistical hypothesis test, the fractal dimension was found to be independent of flocculation time, and the value df2 = 1.442 ± 0.125 was determined as the average value for the given conditions.
keywords: flocculation, flock growth, flock size, mixing, rushton turbine, kaolin slurry.
1 introduction
flocculation is one of the most important operations in solid–liquid separation processes in water supply and wastewater treatment. the purpose of flocculation is to transform fine particles into coarse aggregates – flocks that will eventually settle – in order to achieve efficient separation. the properties of separated particles have a major effect on the separation process and on separation efficiency in a solid–liquid system. the solid particles in a common solid–liquid system are compact and are regular in shape, and their size does not change during the process. the size of these particles is usually sufficiently described by the diameter or by some equivalent diameter. the flocks that are generated are often porous and are irregular in shape, which complicates the flocculation process design. flock properties such as size, density and porosity affect the separation process and its efficiency. the aim of this work is to propose a simple semiempirical generalized correlation for flock growth kinetics and to verify the proposed model experimentally.
2 generalized correlation for flock growth kinetics
turbidity measurement has been used and recommended for flocculation performance assessment in routine control in industry. flocculation efficiency has frequently been expressed as the rate of turbidity removal:
z*e(tf) = ze(tf)/z0 = (z0 − zr(tf))/z0 = 1 − z*r(tf)   (1)
where z*e is turbidity removal degree, z*r is residual turbidity degree, z0 is turbidity of a suspension before the beginning of flocculation, ze is eliminated turbidity due to flocculation, zr is residual turbidity of clarified water after flock separation, and tf is flocculation time. šulc [1] found that flocculation kinetics expressed as the dependence of the residual turbidity rate on the flocculation time can be expressed by a simple formula, taking into account flock breaking:
z*r = azr*·log²(t*f) + bzr*·log(t*f) + czr*   (2)
where z*r is residual turbidity degree, t*f is dimensionless flocculation time, and azr*, bzr*, czr* are the model parameters. for flocculation taking place in an agitated tank, šulc and ditl [2] recommend dimensionless flocculation time t*f given by:
t*f = n·tf   (3)
where tf is flocculation time, and n is impeller rotational speed.
the proposed definition of dimensionless flocculation time takes this into account due to the characteristic time choice. the chosen characteristic time, tchar ∝ 1/n, is proportional to the number of liquid passages through an impeller. based on eq. (2), šulc [1] and šulc and ditl [3] have proposed a generalized correlation for flocculation kinetics in an agitated tank that takes into account flock breaking, as follows:
δz*r = a*zr*·(δ[ntf]*log)²   (4)
where
δz*r = (z*r − z*rmin)/z*rmin   (5)
δ[ntf]*log = (log(ntf) − log([ntf]min))/log([ntf]min)   (6)
where z*rmin is the minimal residual turbidity degree achieved at time [ntf]min, [ntf]min is the dimensionless flocculation time in which z*rmin can be achieved, a*zr* is the residual turbidity shift coefficient, tf is the flocculation time, and n is impeller rotational speed.
according to lambert's law, the turbidity depends on the cross-sectional area of the flock σ and flock concentration np, as follows:
zr ∝ np·σ   (7)
assuming that the cross-sectional area of flock σ is proportional to flock size df
σ ∝ df²   (8)
and that particle/flock mass conservation must be fulfilled at any time, flock concentration np can be expressed as follows:
np·df³ ∝ const.   (9)
using eqs. (7), (8) and (9), the dependence of flock size on flocculation time can be given by a simple formula taking into account flock breaking:
(1/df) = af·log²(ntf) + bf·log(ntf) + cf   (10)
where (1/df) is reciprocal flock size, ntf is dimensionless flocculation time, and af, bf, cf are the model parameters. assuming that the minimum residual turbidity degree corresponds to the maximum flock size, the dimensionless model variable δz*r can be rewritten as follows:
δz*r = (z*r − z*rmin)/z*rmin = (zr − zrmin)/zrmin ∝ (nf·df² − nfmax·dfmax²)/(nfmax·dfmax²) = ((1/df) − (1/dfmax))/(1/dfmax)   (11)
where df is flock size and dfmax is maximal flock size. then the generalized correlation for flock growth kinetics that takes into account flock breaking can be derived as follows:
δ(1/df)* = a*f·(δ[ntf]*log)²   (12)
rewritten
dfmax/df = 1 + a*f·(δ[ntf]*log)²   (13)
where
δ(1/df)* = ((1/df) − (1/dfmax))/(1/dfmax)   (14)
a*f = bf² / (4·af·cf − bf²)   (15)
where dfmax is the maximum flock size reached at time [ntf]max, [ntf]max is the dimensionless flocculation time in which dfmax can be achieved, δ[ntf]*log is the variable defined by eq. (6), and af, bf, cf are parameters of eq. (10). the generalized correlation parameters dfmax, [ntf]max and a*f depend generally on the flocculation process conditions, e.g. mixing intensity and flocculant dosage.
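a sketch of how eqs. (10) and (12)–(15) can be fitted to measured flock sizes: a quadratic regression of 1/df on log10(n·tf), whose vertex gives [ntf]max and dfmax; the demonstration data below are assumed, not the paper's measurements:

```python
import numpy as np

def fit_flock_growth(ntf, df):
    """fit eq. (10), (1/df) = af*x**2 + bf*x + cf with x = log10(n*tf),
    and derive the generalized-correlation parameters of eqs. (12)-(15)."""
    x = np.log10(ntf)
    af, bf, cf = np.polyfit(x, 1.0 / np.asarray(df, dtype=float), 2)
    ntf_max = 10.0 ** (-bf / (2.0 * af))          # vertex of the parabola
    df_max = 1.0 / (cf - bf ** 2 / (4.0 * af))    # maximum flock size
    a_star = bf ** 2 / (4.0 * af * cf - bf ** 2)  # eq. (15)
    return ntf_max, df_max, a_star

# assumed demonstration data (dimensionless time, flock size in mm):
ntf = np.array([720, 1200, 1680, 2400, 3600])
df = np.array([1.10, 1.35, 1.40, 1.25, 1.05])
print(fit_flock_growth(ntf, df))
```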
3 model evaluation
the proposed generalized correlation was tested on the published experimental data by kilander et al. [4], who investigated the local flock size distributions in square tanks of different sizes (5, 7.3, 28 and 560 l). the areas investigated in the 7.3 and 28 l tanks were the upper left corner (noted uc), the lower left corner (noted lc), directly over the impeller (noted oi) and directly under the impeller (noted ui). in the 5 and 560 l tanks, the areas lc, oi and ui were investigated. a suspension of buffered water and kaolinite clay was used as the model flocculation system. the tanks were agitated by a lightnin a310 hydrofoil impeller. the 5, 7.3 and 28 l tanks were operated at two specific energy inputs, 1.72 w/m3 and 2.55 w/m3, corresponding to average gradient velocity 41.2 and 50.4 s−1, respectively. the 560 l tank was operated only at specific energy input 2.55 w/m3. no model of flock growth kinetics was applied for data interpretation.
the data in the uc area for the 7.3 l and 28 l tanks at specific power input 1.72 w/m3 were analyzed. a comparison of the experimental data and the generalized correlation is depicted in fig. 1. the generalized correlation parameters fitted for the measured data are presented in table 1.
fig. 1: generalized correlation δ(1/df)* = f(δ[ntf]*log) – kilander et al. [4]
table 1: generalized correlation parameters fitted – data of kilander et al. [4]
v (l)   εv (w/m3)   n (rev/min)   [ntf]max (–)   tfmax (min)   dfmax (mm)   a*f (–)   iyx (–)   δrmax/δrave (%) *1
7.3     1.72        199           6429           32.3          0.2314       39.108    0.9985    1.4/3.2 *2
28      1.72        148           5217           35.3          0.1699       18.998    0.9978    1.6/4 *2
*1 relative error of flock size df: maximum/average absolute value.
*2 data for tf = 2.22 min and tf = 3.89 min were excluded.
4 experimental
the flock growth kinetics was investigated in a baffled tank agitated by a rushton turbine at mixing intensity 40 w/m3 and kaolin concentration 0.44 g/l. the tests were carried out on the kaolin slurry model wastewater. the model wastewater was flocculated with organic sokoflok 56a flocculant (solution 0.1 % wt.). the flock size and its shape were investigated by image analysis. the proposed simple semiempirical generalized correlation for flock growth kinetics was used for data treatment. the fractal dimension of the flocks was also determined.
4.1 experimental apparatus
the flocculation experiments were conducted in a fully baffled cylindrical vessel of diameter d = 150 mm with a flat bottom and 4 baffles per 90°, filled to height h = d by a kaolin slurry model wastewater (tap water + kaolin particles). the vessel was agitated by a rushton turbine (rt) of diameter d = 60 mm that was placed at an off-bottom clearance of h2/d = 0.85. baffle width b/d was 0.1. the impeller motor and the cole parmer servodyne model 50000-25 speed control unit were used in our experiments. the impeller speed was set up and the impeller power input value was calculated using the impeller power characteristics. the agitated vessel dimensions are shown in fig. 2.
image analysis technique
the flock size was determined using a non-intrusive optical method. the method is based on an analysis of the images obtained by a digital camera in a plane illuminated by a laser light. the method consists of three steps: 1. illumination of a plane in the tank with a laser light sheet (sometimes called a laser knife) in order to visualize the flocks, 2. a record of the images of the flocks, using a camera, 3. processing the images captured by image analysis software. the illuminated plane is usually vertical and a camera is placed horizontally (e.g. kilander et al. [4], kysela and ditl [5, 6]). kysela and ditl [5, 6] used this technique for flocculation kinetics observation. they found that the application of this method is limited by the optical properties of the system: the required transparency limits the maximum particle concentration in the system. therefore we do not observe the flock size during flocculation but we observe it during sedimentation; thus the limitation should be overcome. therefore the laser illuminated plane is horizontal and perpendicular to the impeller axis. the scheme of the experimental apparatus for image analysis is shown in fig. 2.
fig. 2: schema of the experimental apparatus for image analysis
the agitated vessel was placed in an optical box (a water-filled rectangular box). the optical box reduces laser beam dispersion, and thus it improves the optical properties of the measuring system. the camera with the objective and the laser diode are placed on the laboratory support stand. the technical parameters are presented in table 2.
table 2: technical parameters
item                         specification
laser diode                  nt 57113, 30 mw, wavelength 635 nm (red light), edmund optics, germany
diode optics                 optical projection head nt54-186, projection head line, edmund optics, germany
camera                       colour cmos camera silicon video si-sv9t001c, epix inc., usa
camera optics                objective 12vm1040asir 10–40 mm, tamron inc., japan
image processing card        pixci si pci image capture board (so-called grabbing card), epix inc., usa
camera control software      xcap, epix inc., usa
operation software           linux centos version 5.2, linux kernel 2.6
software for image analysis  sigmascan pro 5.0
4.2 experimental procedure
the maximum flock sizes formed during flocculation were measured for various flocculation times at mixing intensity ε = 40 w/m3, constant flocculant dosage df = 2 ml/l and initial kaolin concentration 440 mg/l. the flock sizes were measured during sedimentation. the experimental parameters are summarized in table 3.
table 3: experimental conditions (kaolin concentration cc0 = 0.44 g/l)
parameter      value
εv (w/m3)      40
n (rev/min)    180
tf (min)       4; 6.66; 9.33; 13.33; 20
ntf (–)        720, 1200, 1680, 2400, 3600
df (ml/l)      2
no. of data    5
the experimental procedure was as follows:
1. calibration and experimental apparatus setting: before each flocculation experiment the calibration grid was placed in an illuminated plane and the camera was focused manually onto this calibration grid. then the image of the calibration grid was recorded. for camera resolution 800 × 800 pixels, the scale 1 pixel ∝ 45 μm was found for our images. this corresponds to a scanned area of 35 mm × 35 mm (approx. 6 % of the cross-section area of the tank). the scanned area was located in the middle of one quarter of the vessel, between the vessel wall and the impeller.
2. model wastewater preparation: kaolin slurry (a suspension of water and kaolin particles (18 672 kaolin powder finest, riedel-de haën)) was used as a model system. the solid fraction of kaolin was 440 mg/l.
3. flocculation: the model wastewater was flocculated by the organic sokoflok 56a polymer flocculant (medium anionity, 0.1 % wt. aqueous solution; flocculant weight per flocculant solution volume mf/vf = 1 mg/ml; sokoflok ltd., czech republic). the experimental conditions are specified in table 3. the flocculation was initiated by adding flocculant into the agitated vessel, and the flocculation time measurement was started.
4. image acquisition: after impeller shutdown, the flocks began to settle. during sedimentation, the images of flocks passing through the illuminated plane, having 10-bit depth, were captured at frame rate 10 s−1, exposure 5 ms, and gain 35 db. image capturing started 20 s after impeller shutdown and took 120 s. finally, 1200 images were obtained for the flocculation experiment and some were stored on a hard disk in 24-bit jpg format.
5. image analysis: the images were analyzed using sigmascan software and its pre-defined filters (filter no. 8 for removing one-pixel points, filter no. 10 for removal of objects touching an image border, and filter no. 11 for filling an empty space in identified objects caused by capture error) and our macros (svačina [7] in detail).
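the image analysis returns a projected flock area in pixels; a sketch of the conversion to an equivalent diameter using the calibration scale from step 1 (the circle-equivalent definition d_eq = sqrt(4·a/π) is the usual convention and is assumed here, as is the example pixel count):

```python
import math

PIXEL_SIZE_UM = 45.0   # calibration from step 1: 1 pixel ~ 45 um

def equivalent_diameter_mm(area_px):
    """equivalent flock diameter from the projected area in pixels,
    d_eq = sqrt(4*a/pi), converted to millimetres."""
    area_mm2 = area_px * (PIXEL_SIZE_UM / 1000.0) ** 2
    return math.sqrt(4.0 * area_mm2 / math.pi)

print(equivalent_diameter_mm(760))   # an assumed largest-flock area -> ~1.4 mm
```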
4.3 experimental data evaluation
from the images that were captured, the largest flock was identified and its projected area was determined for the given flocculation time. the equivalent diameter calculated according to the flock area, plotted in dependence on the flocculation time for a given flocculant dosage and mixing intensity, is shown in fig. 3. when flocculation time increases, the flock size increases up to the maximum value, due to primary aggregation, and then decreases due to flock breaking.
fig. 3: experimental data – maximum flock size vs. flocculation time – dfeq = f(tf)
generalized correlation δ(1/dfeq)* = f(δ[ntf]*log)
the dependence of calculated equivalent diameter dfeq on flocculation time was fitted according to the generalized correlation (12). the generalized correlation parameters are presented in table 4. a comparison of the experimental data and the generalized correlation is depicted in fig. 4.
table 4: generalized correlation δ(1/dfeq)* = f(δ[ntf]*log): parameters fitted
εv (w/m3)   n (rev/min)   [ntf]max (–)   tfmax (min)   dfeqmax (mm)   a*f (–)   iyx (–)   δrave/δrmax (%) *1
40          180           1440           8             1.4062         16.373    0.8952    2.5/4.4 *2
*1 relative error of equivalent flock size dfeq: average/maximum absolute value.
*2 flock size for ntf = 3600 was excluded from the evaluation.
fig. 4: generalized correlation δ(1/dfeq)* = f(δ[ntf]*log)
fractal dimension
the flocks generated are often porous and have an irregular shape, which complicates the design of the flocculation process. fractal geometry is a method that can be used for describing the properties of irregular objects. some flock properties such as shape, density and sedimentation velocity can be described on the basis of fractal geometry. a fractal dimension of the 3rd order, df3, evaluated from the flock mass, was usually determined. since in our case we measured the projected area of the flocks, the fractal dimension of the 2nd order, df2, was used for flock shape characterization. the relation among projected area a, characteristic length scale lchar and fractal dimension df2 is given by:
a = c·lchar^df2   (16)
the largest flocks determined in the images were used for fractal dimension estimation. the maximum flock size was used as a characteristic length scale. the fractal dimension df2 was determined for each flocculation time. for illustration, the dependence of the projected area on the maximum flock size is shown in fig. 5 for dimensionless flocculation time ntf = 1680. the fractal dimension df2 plotted in dependence on flocculation time for a given flocculant dosage and mixing intensity is shown in fig. 6.
fig. 5: fractal dimension determination – example for ntf = 1680
fig. 6: fractal dimension vs. dimensionless flocculation time – df2 = f(ntf)
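a sketch of the fractal-dimension estimate: fitting eq. (16) as a straight line in log–log coordinates, whose slope is df2 (the demonstration data are assumed):

```python
import numpy as np

def fractal_dimension(areas, lengths):
    """fit eq. (16), a = c * lchar**df2, as a line in log-log space;
    the slope is the 2nd-order fractal dimension df2 (c = 10**intercept)."""
    slope, intercept = np.polyfit(np.log10(lengths), np.log10(areas), 1)
    return slope

# assumed demonstration data (max flock sizes in mm, projected areas in mm^2):
lengths = np.array([0.6, 0.9, 1.2, 1.5, 1.9])
areas = 0.8 * lengths ** 1.44
print(fractal_dimension(areas, lengths))   # -> ~1.44
```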
the effect of flocculation time on fractal dimension df2 was tested by hypothesis testing. the statistical method of hypothesis testing can estimate whether the differences between the parameter values predicted (e.g. by some proposed theory) and the parameter values evaluated from the measured data are negligible. in this case, we assumed the dependence of fractal dimension df2 on dimensionless flocculation time, described by the simple power law df2 = b·(ntf)^β, and the difference between the predicted exponent βpred and the evaluated exponent βcalc was tested. the hypothesis test characteristic is given as t = (βcalc − βpred)/sβ, where sβ is the standard error of parameter βcalc. if the calculated |t| value is less than the critical value of the t-distribution for (m − 2) degrees of freedom and significance level α, the difference between βcalc and βpred is statistically negligible (statisticians state: "the hypothesis cannot be rejected"). the hypothesis test result and parameter β evaluated from the data are presented in table 5. the fractal dimension was found to be independent of the flocculation time, and the value df2 = 1.442 ± 0.125 was determined as an average value.
table 5: fractal dimension – hypothesis testing
εv (w/m3)   m (–)   t(m−2),α=0.05   βcalc (for df2 = b·(ntf)^β)   |t|    hypothesis df2 = b·(ntf)^0, βpred = 0
40          5       3.1825          0.045                         0.97   cannot be rejected (yes)
note: the absolute value of the test characteristic t is presented. the critical value is the t-distribution for (m − 2) degrees of freedom and significance level α = 0.05.
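a sketch of the same hypothesis test, assuming scipy is available; for m = 5 points the two-sided critical value t(3), α = 0.05 returned below is 3.1825, as in table 5 (the df2 values in the usage lines are assumed):

```python
import numpy as np
from scipy import stats

def beta_is_negligible(ntf, df2, beta_pred=0.0, alpha=0.05):
    """fit df2 = b*(n*tf)**beta as a line in log-log space and test whether
    the fitted exponent differs from beta_pred; |t| below the critical value
    means the difference is statistically negligible."""
    res = stats.linregress(np.log10(ntf), np.log10(df2))
    t = (res.slope - beta_pred) / res.stderr
    t_crit = stats.t.ppf(1.0 - alpha / 2.0, len(ntf) - 2)
    return abs(t) < t_crit, abs(t), t_crit

ntf = np.array([720, 1200, 1680, 2400, 3600])
df2 = np.array([1.38, 1.52, 1.41, 1.47, 1.43])   # assumed demonstration values
print(beta_is_negligible(ntf, df2))
```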
5 conclusions
the following results have been obtained:
• a simple semiempirical generalized correlation for flock growth kinetics has been proposed.
• the generalized correlation has been successfully tested and verified using published data from kilander et al. (2007). the generalized correlation parameters are presented in table 1. a comparison of the experimental data and the generalized correlation is depicted in fig. 1.
• the flock growth kinetics was investigated in a baffled tank agitated by a rushton turbine at mixing intensity 40 w/m3 and kaolin concentration 0.44 g/l. the flock size and flock shape were investigated by image analysis.
• the tests were carried out on the kaolin slurry model wastewater. the model wastewater was flocculated with organic sokoflok 56a flocculant (solution 0.1 % wt.).
• the largest flock was identified from the images, and its projected area was determined for a given flocculation time. the calculated equivalent diameter plotted in dependence on flocculation time for the given flocculant dosage and mixing intensity is shown in fig. 3.
• the flock size increases with increasing flocculation time up to a maximum value due to primary aggregation, and then decreases due to flock breaking.
• the proposed simple semiempirical generalized correlation for flock growth kinetics was used for data treatment. the model parameters are presented in table 4.
• the fractal dimension df2 was determined for each flocculation time on the basis of the experimental data. using the statistical hypothesis test, the fractal dimension was found independent of flocculation time, and the value df2 = 1.442 ± 0.125 was determined as the average value.
acknowledgement
this research has been supported by the grant agency of the czech republic, project no. 101/07/p456 "intensification of flocculation in wastewater treatment".
references
[1] šulc, r.: flocculation in a turbulent stirred vessel. phd thesis. czech technical university, faculty of mechanical engineering, 2003 (in czech).
[2] šulc, r., ditl, p.: effect of mixing on flocculation kinetics. in: proceedings of 14th international congress of chemical and process engineering chisa 2000 (cd-rom), prague, 2000.
[3] šulc, r., ditl, p.: the effect of flocculant dosage on the flocculation kinetics of kaolin slurry in a vessel agitated by rushton turbine at mixing intensity 168 w/m3 and kaolin concentration 0.58 g/l. czasopismo techniczne – seria: mechanika, vol. 105 (2008), pp. 341–349.
[4] kilander, j., blomström, s., rasmuson, a.: scale-up behaviour in stirred square flocculation tanks. chem. eng. sci., vol. 62 (2007), pp. 1606–1618.
[5] kysela, b., ditl, p.: the measurement of floc size and flocculation kinetics by the laser knife and digital camera. in: proceedings of 16th international congress of chemical and process engineering chisa 2004 (cd-rom), prague, 2004, isbn 80-86059-40-5.
[6] kysela, b., ditl, p.: floc size measurement by the laser knife and digital camera in an agitated vessel. inzynieria chemiczna i procesowa, vol. 26 (2005), pp. 461–468.
[7] svačina, o.: application of image analysis for flocculation process monitoring. diploma project. czech technical university, faculty of mechanical engineering, 2009 (in czech).
ing. radek šulc, ph.d. / phone: +420 224 352 558, fax: +420 224 310 292 / e-mail: radek.sulc@fs.cvut.cz / department of process engineering, faculty of mechanical engineering, czech technical university in prague, technická 4, 166 07 prague, czech republic
notation
a – projected flock area, mm2
a*f – coefficient; model parameter (12), –
a*zr* – residual turbidity shift; model parameter (4), –
dfeq – equivalent flock diameter according to flock area, mm
df – flock size, mm
dfmax – maximum flock size; model parameter (12), mm
d – tank diameter, m
df – flocculant dosage, ml/l
df2 – fractal dimension of 2nd order, –
iyx – correlation index, –
lchar – characteristic length scale, mm
n – impeller rotational speed, rpm
[ntf]min – model parameter (4), –
[ntf]max – model parameter (12), –
t – hypothesis test characteristic, –
tf – flocculation time, minute
t(m−2),α – critical value of t-distribution for (m − 2) degrees of freedom and significance level α, –
tsed – sedimentation time, minute
z0 – turbidity before flocculation, fau
zr – residual turbidity after flocculation, fau
z*e – turbidity removal degree, –
z*r – residual turbidity degree, –
z*rmin – model parameter (4), –
greek letters
δr – relative error, %; δr = 100 · (experimental value − regression value)/regression value
δ(1/df)* – variable, –
δ[ntf]*log – variable, –
δz*r – variable, –
εv – specific impeller power input (per volume unit), w/m3
indices
* – dimensionless; ave – average; max – maximum; min – minimum
acta polytechnica vol. 48 no. 5/2008
development of real estate market in the czech republic
l. jilemnická, v. berka, e. hromada
the article deals with analysis of the current situation on the real estate market in the czech republic. software eval, which continually collects, examines and evaluates advertised quotations of real estates, was used for mapping and evaluation of the real estate market development. the article provides the professional public with a detailed view of the time progress of quotations and tenancy of flat units in dependence on the significant parameters of properties and locality.
keywords: real estate market, property market value, statistics.
1 introduction
the faculty of civil engineering of the czech technical university in prague is developing special software for systematic monitoring and analysis of the development of advertised quotations for properties. the software has been in operation since september 2007, and in that time it has collected over 650 000 offers for sale and tenancy of flats, family houses and real estate lands [1]. since the launch of the software and the start-up of data collection the research team has been working on improving the analytic tools and specifying inquiries for enlarging the quotation database. currently the research work focuses on improving the quality of the software in the field of data filtration. each recorded quotation is assessed in terms of the objectivity and correctness of the presented information. it is compared with older quotations, advertised properties are repeatedly searched, the completeness of the presented information is assessed, etc. about 270 potential faults, instances of deliberately misrepresented information and manipulative practices are verified in total for each quotation. if any small discrepancy is found, the assessed quotation is discarded from the evaluation [2].
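the individual checks of the eval software are not published, so only the discard-on-first-discrepancy logic described above can be sketched; the check names and rules below are hypothetical:

```python
# hypothetical sketch of the strict record filtering described above; the
# actual eval software and its ~270 checks are not public, so the checks
# shown here are illustrative only.
def is_acceptable(quotation, checks):
    """keep a quotation only if every single check passes."""
    return all(check(quotation) for check in checks)

def has_price(q):
    return q.get("price_czk", 0) > 0

def has_floorage(q):
    return q.get("floorage_m2", 0) > 0

checks = [has_price, has_floorage]
print(is_acceptable({"price_czk": 2_500_000, "floorage_m2": 65}, checks))
```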
2 results of research
in the first quarter of 2008, the prices of energy, foods and services rose very considerably in the czech republic. at the same time, a slight rise in the cost of loans (particularly mortgage loans) was recorded. as against the same period in 2007, the interest rates increased by 0.6 % on average. during the first quarter of 2008, the banks lent out a total of 25.8 billion crowns. this is 8.5 % less than in the first quarter of 2007. nevertheless, the availability of credit products continues to be good, and this has a significant influence on the development of the market values of residential properties. the slow growth in interest rates slightly increases the tendency to sell older flats. at the same time, quotations per square metre of residential property floorage have moderately risen. during the last 9 months market prices have stagnated or shown a slight rise. there continues to be a quite considerable difference (about 25 %) between the market price of an older directly owned flat and the market price of an older cooperative flat (see fig. 1).
fig. 1: average quotation per square metre of older flat units according to type of ownership (1+kk to 4+1 category, throughout the czech republic)
some examples of the categories of flats and their marking used in this paper are given in table 1.
table 1: categories of flats commonly used in the czech republic
1+kk = one room and kitchen corner
2+kk = two rooms and kitchen corner
3+kk = three rooms and kitchen corner
1+1 = one room and kitchen
2+1 = two rooms and kitchen
3+1 = three rooms and kitchen
it follows from a comparison of the frequency of records in the eval software database that the type of flats most offered for sale are flats in the 3+1/3+kk category (see fig. 2). the increasing number of recorded offers in time testifies to the growing offer of individually owned flats. a similar trend can also be observed in the tenancy of older flats. the types of flats most offered for rent are flats in the 2+1/2+kk and 3+1/3+kk category (see fig. 3).
fig. 2: number of offers for sale of older flat units recorded by eval software in individual months throughout the czech republic. this is not a cumulative number of records.
quotations per square metre of flat floorage differ according to the type of flat, its total size, disposition, locality and many other factors. it is clear from fig. 4 that prices per square metre of flat floorage in the biggest flats show an unstable falling trend over the last 9 months. the trend for other flats is slightly rising. this may be due to a slight increase in mortgage interest rates and a decrease in mortgages for financing the purchase of the most expensive category of flats.
fig. 3: number of offers for tenancy of older flat units recorded by eval software in individual months throughout the czech republic. this is not a cumulative number of records.
for the lower categories of flats (with a lower purchase price) various forms of financing are still well available. for this reason, there is no substantial limitation of prices at the lower categories of flats. if this hypothesis is confirmed, we can assume a future fall in prices per square metre in the 5+kk and 5+1 categories. in the future period, a slight drop in prices per square metre for the 4+kk and 4+1 categories may occur, and then (probably after a period of time) also in the 3+kk and 3+1.
the most expensive types of flats per square metre of floorage are flats in the 5+kk and 5+1 categories. mostly these are luxurious, duplex and spacious roof flats with terraces located in good localities. by contrast, the second most expensive are flats in the lowest 1+kk, 1+1 and 2+kk, 2+1 category. these are so-called "starter" flats for young couples, individuals and maybe foreigners, often purchasing their first housing on mortgage to live in on a temporary basis. the next category is 4+kk, 4+1 flats, followed finally by the 3+kk, 3+1 category. this sequence of comparative average prices per square metre of floorage is also influenced by the structural and material characteristics of other flat units. for example, small flats (in the 1+kk to 2+1 category) are most frequently offered in older brick housing developments, where the price is usually higher than, for instance, flats in the 3+kk, 3+1 category, which are most frequently available in prefabricated developments. fig. 5 shows the material characteristics of flats on offer, and also types of ownership.
fig. 4: average quotations per square metre of older flat units according to category of flat throughout the czech republic
fig. 5: average quotations per square metre of older flat units according to material characteristics and type of ownership (1+kk to 4+1 category, throughout the czech republic)
for cheaper flats of lower standard, if the total floorage is larger the market price per square metre will be lower. from the purely economic point of view, the most cost-effective purchase is of a 3+kk/3+1 flat with a floorage of about 65 m2. the price per square metre for bigger flats of a higher standard is proportionally higher. in contrast, the price per square metre for small (starter) flats is disproportionately high for the standard and level of dwelling that is offered (see fig. 6). the czech real estate market reveals considerable regional disparities in the financial affordability of dwellings [3]. markedly higher prices than in the other regions are observed in the prague region, where quotations are approximately twice as high as in other regions (see fig. 7). in terms of average gross monthly earnings, prague is the region with the lowest affordability of dwellings.
3 conclusion
in recent years we have witnessed a significant rise in market prices for all categories of residential properties. this development has been caused particularly by the availability of mortgages for a relatively broad group of people. however, during recent months this trend has stopped, particularly in the case of bigger flats (see fig. 7).
quotations per square metre of floorage for these flats are falling slightly. at the same time, we can observe a slight growth in offers for sale of older flats. this may be due to many factors, e.g. seasonal fluctuations or accidental features. however, the sequence of changes of trends indicates that these changes may be due to a common basic cause – the growth of mortgage rates and a gradual decrease in the revenues from rented properties.
fig. 6: dependence of the average quotation per square metre in an older flat unit on flat floorage and material characteristics (1+kk to 5+1 category, throughout the czech republic)
fig. 7: average quotations per square metre of older flat units in individual ownership according to regions and material characteristics (1+kk to 4+1 category, may 2008)
acknowledgment
this paper originated as part of a research project at czech technical university in prague, faculty of civil engineering, on management of sustainable development of the life cycle of buildings, building enterprises and territories (msm: 6840770006), financed by the ministry of education, youth and sports of the czech republic.
references
[1] hromada, e.: tržní ceny starších bytů v čr. stavebnictví, march 2008, no. 3, p. 18–20.
[2] hromada, e.: analýza vývoje tržních nájmů starších bytů. stavebnictví, may 2008, no. 5, p. 70–72.
[3] ort, p.: oceňování nemovitostí na tržních principech. prague: bankovní institut vysoká škola, 2007.
rndr. libuše jilemnická, csc. / phone: +420 224 354 657 / e-mail: jilemnic@fsv.cvut.cz / department of languages
ing. vilém berka / phone: +420 224 353 720 / e-mail: vilem.berka@fsv.cvut.cz
ing. eduard hromada, ph.d. / phone: +420 224 353 720, fax: +420 224 355 439 / e-mail: eduard.hromada@fsv.cvut.cz / dept. of construction management and economics, czech technical university in prague, faculty of civil engineering, thákurova 7, 166 29 prague 6, czech republic
acta polytechnica vol. 51 no. 2/2011
the puzzle of v407 cygni
r. hudec
abstract
we discuss the nature and recent observations (optical and gamma-ray) of the symbiotic binary/mira variable v407 cygni. in addition we discuss another object of a similar category, namely the variable star fy aql, with a possible association with an eruptive gamma-ray source.
keywords: variable stars, mira variables, symbiotic stars, gamma-rays, gamma-ray transients.
1 introduction
v407 cyg is a symbiotic binary harboring a mira variable, of 745 day pulsation and a possible orbital period of 43 years, at a distance of 2.5/3.0 kpc and a reddening of e(b − v) = 0.5/0.6 (munari et al. 1990). miras of such a long pulsation period are generally oh/ir sources, with a thick dust envelope which prevents direct observation of the central star. in addition to a possible previous one in 1936 (when the object was noted for the first time by hoffmeister 1949), v407 cyg has recently been discovered in outburst by nishiyama and kabashima (2010, cbet 2199) on 2010 mar 10.8 ut. spectroscopic confirmation was first provided by munari et al. (2010, cbet 2204) on 2010 mar. 13.1 ut. they noted the emergence of the spectrum of a he/n nova that overwhelmed the absorption spectrum of the mira. numerous and strong emission lines were observed, which belonged to two distinct groups.
the first group, composed of sharp profiles with even narrower central absorptions, originated from the ionized slow wind of the mira. the second group, characterized by much broader profiles for helium, nitrogen and hydrogen lines, originated from the nova fast expanding ejecta. such a scenario was highly reminiscent of the recurrent nova rs oph. the outburst of v407 cyg has since then been detected also in gamma-rays (atel 2487) and at radio wavelengths (atel 2506, atel 2511, atel 2514, atel 2536), and observed in the infrared (cbet 2210).
2 history
the variable star v407 cygni was discovered by cuno hoffmeister at sonneberg on astronomical astrograph plates in 1949 as a nova-like variable (hoffmeister 1949). the object was first investigated in more detail (on astronomical sky patrol plates) by ludwig meinunger at the sonneberg observatory in 1966 (meinunger, 1966). the star exhibited a strange outburst in 1936, followed by mira-like light variations with a period of 745 days, perhaps related to a thermonuclear event in 1936.
3 spectroscopy
the spectral appearance of v407 in outburst is highly peculiar. the spectrum is completely different from those ever recorded for this object and other symbiotic mira variables in outburst. the white dwarf companion to the mira variable experiences an outburst similar to that of classical novae, and its ejecta move in the circumstellar environment already filled by the ionized wind of the mira (cbet 2204). it is worth mentioning that the li abundance is certainly anomalous, about a factor of 100 higher than normal. this has been interpreted by several studies (cited in shore et al. 2011 and munari et al. 2011) to mean that the red giant is likely an agb star or, at least, massive.
4 v407 cyg as a gamma-ray transient
fermi lat detection of a new galactic plane gamma-ray transient in the cygnus region, fermi j2102+4542, and its possible association with v407 cyg was reported in atel 2487 by c. c. cheung (nrc/nrl), d. donato (nasa gsfc), e. wallace (u. washington), r. corbet (nasa gsfc), g. dubus (u. grenoble), k. sokolovsky (mpifr), h. takahashi (hiroshima u.), and by abdo et al. (2010) on behalf of the fermi large area telescope collaboration. the large area telescope (lat), on board the fermi gamma-ray space telescope, detected a transient gamma-ray source in the galactic plane: fermi j2102+4542. preliminary analysis of the fermi-lat data indicates that on the 13th and 14th of march 2010, the source was detected with a more than 100 mev flux of (1.0 ± 0.3) × 10−6 ph cm−2·s−1 and (1.4 ± 0.4) × 10−6 ph cm−2·s−1, respectively (statistical only); the corresponding significances on these days are 8 sigma and 6 sigma. a systematic uncertainty of 30 % should be added to this number. there is no previously reported gamma-ray source at this location.
fig. 1: the historical light curve of v407 cyg (munari et al., 1990)
fig. 2: integral ibis observation of the v407 cyg position (bassani, 2010) shows the new transient to be located between two persistent hard x-ray sources, with no confirmation of the transient detection
fig. 3: v407 cyg optical ccd observations prior to and after the 2010 optical flare (courtesy of aavso http://www.aavso.org)
fig. 4: v407 cyg optical ccd observations prior to the 2010 optical flare (courtesy of meduza, lubos brat, czech republic)
within the 95 % confidence error circle radius of 0.12 deg (statistical only) is the symbiotic star v407 cyg, with a reported optical outburst beginning approximately 2 days earlier (cbet 2199) than the onset of the gamma-ray activity detected by lat. swift/xrt observations triggered on the optical outburst of v407 cyg and performed on 2010 march 13th and 15th resulted in 2.4–2.6 sigma (0.3–10 kev) detections of an x-ray source coincident with the position of the star in each of the two 960-sec exposures. the gamma-rays could have been caused by shock-driven acceleration of the ejected material and its capture by strong magnetic fields within the system. this is the first hard gamma-ray emission detected from this category of objects.

5 the case of fy aql

other mira/symbiotic stars with a possible gamma-ray eruption/association have also been known, but without firm confirmation. one of these candidates is the variable star fy aql, located inside the error box of grb 19790331. an optical study of the field of fy aql/grb 19790331 was published by hudec in 1989, presenting results from an optical investigation of the small error box of this gamma-ray burst source, with consideration given to the light changes and the optical behavior of the variable fy aql located inside the box. these results confirm that fy aql is indeed a mira variable with short-period flares. this mira/symbiotic system had a temporary reflecting nebula, indicating a possible eruption in the past. the object position is in a crowded part of the milky way in aquila. during a standard catalog search of the box, laros et al. (1985) found that the mira variable star fy aql is inside the region. subsequent studies have found that the mira star has short-duration flares and a surrounding reflection nebula, while the nebula has subsequently disappeared. details concerning fy aql are given in hartmann and pogge (1987), hudec (1989), schaefer et al. (1987), hudec (1987), schaefer (1990), schaefer (1991), and irwin and zytkow (1994).

6 conclusions

the symbiotic mira variable star v407 cygni is very probably coincident with a gamma-ray transient detected by fermi. there are also other objects in this category that may be sources of high-energy gamma-ray emission, such as fy aql. obviously, the category of mira symbiotic variables with (temporary) reflecting nebulas is worth further study to establish the physics behind the possible gamma-ray activity. the spectroscopic properties during the optical flare indicate the possibility of detecting such events by the rp/bp photometers/ultra-low resolution spectrographs on the esa gaia satellite. these findings also confirm the importance of long-term optical monitoring by ccd telescopes and cameras, and also of the historical long-term optical coverage provided by astronomical plate collections.

acknowledgement

i acknowledge grants 205/08/1207 and 102/09/0997 provided by the grant agency of the czech republic, and the msmt kontakt project me09027. the ccd observations of v407 prior to the 2010 flare were provided by the meduza collaboration/lubos brat, czech republic.

references

[1] abdo, a. a., et al.: science, vol. 329, issue 5993, pp. 817–821, 2010.
[2] bassani, a., et al.: atel 2498, 2010.
[3] hartmann, d., pogge, r. w.: apj, 318, 363, 1987.
[4] hoffmeister, c.: vvs 1, 295, 1949.
[5] nishiyama, kabashima: cbet 2199, 2010.
[6] irwin, m., zytkow, a.: apj, 433, l81, 1994.
[7] laros, j. g., et al.: apj, 290, 728, 1985.
[8] munari, u., et al.: cbet 2204, 2010.
[9] munari, u., et al.: monthly notices of the royal astronomical society: letters, vol. 410, issue 1, pp. l52–l56, 2011.
[10] cheung, c. c., et al.: atel 2487, 2010.
[11] hudec, r.: ibvs, 3121, 1, 1987.
[12] hudec, r.: bulletin of the astronomical institutes of czechoslovakia, vol. 40, no. 4, p. 261–266, aug. 1989.
[13] meinunger, l.: mvs sonneberg, 3, 111, 1966.
[14] munari, u., et al.: mnras, 242, 653, 1990.
[15] schaefer, b. e.: apj, 364, 590, 1990.
[16] schaefer, b. e.: apj, 366, l39, 1991.
[17] shugarov, s. y., tatarnikova, a. a., kolotilov, e. a., shenavrin, v. i., yudin, b. f.: v407 cyg as a member of a new subclass of symbiotic stars. baltic astronomy, vol. 16, p. 23–25, 1997.
[18] shore, n. s., et al.: astronomy & astrophysics, vol. 527, id. a98, 2011.

rené hudec, e-mail: rhudec@asu.cas.cz

astronomical institute, academy of sciences of the czech republic, cz-251 65 ondřejov, czech republic

czech technical university in prague, faculty of electrical engineering, technická 2, cz-166 27 prague, czech republic

automatic compensation of workpiece positioning tolerances for precise laser welding

n. c. stache, t. p. nguyen

abstract: precise laser welding plays a fundamental role in the production of high-tech goods, particularly in precision engineering. in this working field, precise adjustment and compensation of the positioning tolerances of the parts to be welded with respect to the laser beam is of paramount importance. this procedure mostly requires tedious and error-prone manual adjustment, which additionally results in a sharp increase in production costs. we therefore developed a system which automates and thus accelerates this procedure significantly. to this end, the welding machine is equipped with a camera to acquire high-resolution images of the parts to be welded. in addition, a software framework is developed which enables precise automatic position detection of these parts and adjusts the position of the welding contour correspondingly. as a result, the machine is rapidly prepared for welding, and it is much more flexible in adapting to unknown parts. this paper describes the entire concept of extending a conventional welding machine with means for image acquisition and position estimation. in addition to this description, the algorithms, the results of an evaluation of position estimation, and a final welding result are presented.

keywords: video-based position estimation, automatic laser adjustment, compensation of positioning tolerances, precise welding, robust fitting, contour point detection.

1 introduction

despite significant advances in micro-technology over recent decades, precise and cost-efficient welding still poses a big challenge. usually, tedious and error-prone manual interaction is required to adjust currently available welding machines to precisely join two parts with a laser beam. this adjustment is additionally exacerbated if flexible parts or parts with loosely fixed counterpieces (see fig. 1) are to be welded. in these cases each part has to be adjusted individually or clamped precisely, which leads to an increase in production costs. one of the major applications of laser welding is the production of medical devices, e.g. welding the housing for cardiac pacemakers or endoscopes. other applications are in environmental engineering, e.g. water filters that are capable of filtering harmful microorganisms out of drinking water (see fig. 1). to provide these high-tech goods for emerging markets, a production technology is needed which meets the requirements concerning accuracy and production costs. we therefore aim to simplify the production process by equipping an existing assembly system with sensors and intelligence. to avoid an excessive increase in costs, we try to make efficient use of system-immanent components. more precisely: firstly, we enable the machine to acquire high-resolution images of the workpieces with a single camera; secondly, we enable the positions of the workpiece to be estimated automatically; and finally, we adapt the position of the laser to possible offsets between nominal and detected positions. the paper is organized as follows: first, the hardware setup for welding and image acquisition is described. second, the concept for position estimation and for transforming the welding contours is introduced. finally, an evaluation of position estimation and a typical welding result are presented. the paper concludes with a summary and conclusions.

2 hardware setup

to perform laser welding with an assembly machine, one of the most important problems concerns the positioning of the focused laser beam on the object to be welded. the solutions to this problem are highly diverse, ranging from moving workpieces via translation stages to static workpieces with fiber lasers moved by robots. we chose a constellation which is close to the latter.
the workpiece is immobile, and the laser adapts to the position of the workpiece. to achieve high dynamics of the assembly machine, a so-called scanner is employed instead of a robotic arm. this scanner consists of two orthogonal turnable mirrors, which are capable of precisely positioning the laser beam in the plane of the workpieces (see fig. 2). corresponding to the reduced moving mass of the mirrors in comparison with a robotic arm, the dynamics and thus the throughput rates of the welded parts can be significantly increased.

fig. 1: prototype of a clear water filter: (a) filter membrane, (b) filter body with deepening, (c) membrane loosely positioned within the deepening. to prevent it from warping, the membrane has to be welded exactly along its edge.

in addition to laser positioning, this setup can be beneficially used for high-resolution image acquisition with a standard camera. to this end, we employ specially designed beam splitter optics, which allow the optical path of the laser through the scanner to be shared with that of the camera (see fig. 3). thus, the camera field-of-view (fov) can be moved within the plane of the workpiece just like the laser. with this system we acquire high-resolution image tiles and the corresponding information about the fov positions in order to stitch the tiles into an overall image. in contrast to systems which use static cameras, we can hence create an overall picture with much greater resolution or size. high resolution is required because the size of the workpieces is approximately 6 orders of magnitude higher than the sought accuracy of position detection. since the diameter of the welding spot is approximately 300 μm, an offset of ±50 μm is in most cases tolerable.
3 structure of the software framework

in order to implement the adaptation of the welding laser to the position of the workpiece, several steps have to be carried out, as follows:

1. image acquisition and stitching.
2. feature extraction.
3. feature-based position estimation and computation of the offset to the nominal position.
4. transformation of the welding contour.
5. welding.

first, as already mentioned, image tiles have to be acquired and stitched into an overall image. since the scanner knows the positions of these tiles with sub-pixel accuracy, we can omit time-consuming matching and position estimation steps and directly stitch the tiles at the correct positions. however, to avoid artifacts at the image junctions due to different brightnesses, single-band blending is used, which smoothly blends one image into the adjacent image (a sketch of this blending step is given below).
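as an illustration of this stitching step, the following minimal numpy sketch blends two horizontally adjacent tiles whose relative position is known exactly. it is not the authors' implementation; the function name and the simple linear single-band weighting are our own assumptions:

```python
import numpy as np

def blend_tiles(left, right, overlap):
    """single-band blending of two horizontally adjacent grayscale tiles
    whose relative position is known exactly from the scanner: inside the
    overlap region the left tile is faded linearly into the right one, so
    that different brightnesses do not produce a visible junction."""
    h, w = left.shape
    out = np.zeros((h, w + right.shape[1] - overlap), dtype=float)
    out[:, :w - overlap] = left[:, :w - overlap]       # left-only region
    out[:, w:] = right[:, overlap:]                    # right-only region
    alpha = np.linspace(1.0, 0.0, overlap)             # blending weights
    out[:, w - overlap:w] = (alpha * left[:, w - overlap:]
                             + (1.0 - alpha) * right[:, :overlap])
    return out
```

the same weighting can be applied at vertical junctions; since the tile positions come directly from the scanner, no matching or registration step is needed.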
the second step is to define and extract the features to be used for identifying the position of the workpiece. these features have to be defined once by the user, since it is difficult to automatically omit features which are not relevant for the subsequent position estimation. the filter part in fig. 1 provides a good example of this: the aim is to weld the loosely positioned membrane along its rim to the body of the filter. therefore, it is essential to use only features of the membrane, e.g. its edge, instead of features of the body of the filter, which would hamper the correct detection of the membrane's position. a further aspect that motivates the proposed feature-based position estimation is related to the third step in the item list. here, the difficulty is to obtain nominal values for the positions, which are later used to determine the offsets and to transform the welding contour correspondingly. since these nominal values are usually available from cad data (computer aided design), it is meaningful to use a procedure that is "compatible" with this data. as the appearance of surfaces is still only coarsely modeled in present-day cad programs, and is often far from reality, only edges and their geometric representations appear to be suitable for being matched with real images. thus, the nominal values for lines, circles, arcs of circles, corners and ellipses can be extracted from cad data without excessive effort.

fig. 2: scheme of a galvanometer-driven scanner [4]

fig. 3: scanner-based welding and image acquisition system. the beam splitter optics comprises a beam splitter mirror and a dichroit. the dichroit is a wavelength-dependent mirror, which, in our case, reflects the nd:yag laser for welding and lets the illumination rays for image acquisition pass through.

fig. 4: image tiles and stitched panorama of the clear water filter, fig. 1

4 feature extraction

the feature extraction algorithm is required to achieve high robustness, high accuracy and a high throughput rate. since the user is required to explicitly define the features to be selected for position estimation, for the reasons given above, it is not too time-consuming to also select a region of interest (roi) that is big enough to cover the position tolerances. within the roi, the desired feature is extracted robustly and with sub-pixel accuracy, as described in greater detail below. the detection of only one feature per roi may be seen as a restriction, but it inherently avoids ambiguity of feature detection and matching within one roi. in addition, it has not shown itself to be adverse in practice.

after defining the roi and selecting one of five possible features – line, circle, arc of a circle, ellipse, or corner – the next step is robust and sub-pixel accurate extraction. since the size of the overall image can exceed 10 million pixels, the roi size may still be about 1 megapixel or more. well-known algorithms such as the canny edge detector [2] in combination with the hough transform [7] for line or circle fitting are too costly, since they are not perfectly adapted to the problem at hand and they evaluate the entire roi. to obtain the required detection throughput, a further speedup is needed. to this end, we have developed an approach which is time-efficient and robust. this approach consists of two steps: estimating the positions of contour points, and fitting a chosen model to the points determined in this way. this approach can be used for robust estimation of the parameters of the mentioned features, except for the corner, which is determined using the susan corner detector [12].

4.1 contour point estimation

to estimate the contour points, a predetermined number of line profiles is evaluated. these profiles are roughly positioned orthogonally and equally spaced to the sought contour when the roi is defined. thus, the sought contour is sampled by considering only the pixels of the line profiles (and their neighborhoods), instead of evaluating the entire roi. to make a robust estimate of the contour points within these profiles, we applied two different models.

4.1.1 step edge model

this model consists of a set of step edges with predefined edge positions, which are correlated with the line profiles [10]. the resulting position of the contour point is determined by evaluating which step edge model corresponds to the maximal correlation value. thus, the position of the step edge in this model defines the position of the sought contour point. since the models always cover entire profiles and, consequently, all pixels are considered, a high degree of robustness is achieved – outliers due to noise are canceled out. in addition, the step edge approach is well tailored to the problem, since it finds the one sought contour point for each profile. the step edge approach is most suitable for robustly detecting the transition between two adjacent areas of slightly different gray values, even in noisy images.

4.1.2 sobel edge model

the sobel edge model utilizes the well-known filter kernel from the sobel operator [1]. here, it is assumed that an edge is defined as a sharp transition orthogonal to the edge and a small transition in the direction of the edge. this model is transferred to the sobel filter kernel, which consists of gaussian smoothing combined with a derivative in the orthogonal direction. to apply this model to contour point detection, we first rotate the filter kernel so that the derivative is computed in the direction of the profile and the smoothing is computed orthogonally to the profile. then, this rotated filter is applied to each pixel of the profile and its neighborhood. this procedure yields an optimal response to edges orthogonal to the profiles. a simplification of this procedure would be to compute the dot product of the gradient and the unit vector pointing in the direction of the profile. however, this procedure would lack the directional smoothing of the rotated sobel kernel, and would therefore yield a lower degree of robustness. the sobel edge model is suitable for detecting edges when the step edge model cannot be applied because the constraint of large adjacent homogeneous areas is violated.
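to make the step edge model concrete, here is a minimal numpy sketch (our own illustration, not the authors' code) that places an ideal step edge at every candidate position of one extracted line profile and returns the position of maximal normalized correlation:

```python
import numpy as np

def step_edge_position(profile):
    """locate the contour point on one line profile: an ideal step edge is
    placed at every candidate position, each template is correlated with
    the normalized profile, and the position of maximal correlation wins."""
    profile = np.asarray(profile, dtype=float)
    n = len(profile)
    p = (profile - profile.mean()) / (profile.std() + 1e-12)
    best_pos, best_corr = 0, -np.inf
    for k in range(1, n - 1):                        # candidate edge positions
        step = np.where(np.arange(n) < k, -1.0, 1.0)
        step = (step - step.mean()) / step.std()     # normalize the template
        corr = abs(np.dot(p, step)) / n
        if corr > best_corr:
            best_pos, best_corr = k, corr
    return best_pos
```

a sub-pixel refinement, e.g. by interpolating the correlation values around the maximum, could be added on top of this integer-position search.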
4.2 model fitting

since the detected contour points may not be free from outliers, it is reasonable to use robust methods for fitting the geometric models to the points. to this end, the ransac algorithm (random sample consensus) [3, 11] is used. this algorithm is able to eliminate outliers and hence use only the points that are assumed to be undisturbed for fitting. for this we need mathematical definitions of the geometric primitives to be fitted and a measure for assessing the distance of each contour point to the given model. a description of all four models (circle, circle arc, ellipse, line) would go beyond the scope of this paper. we therefore elaborate on the example of circle detection, since the basic steps are similar to the algorithms for the other models. the algebraic model of a circle in the plane can be expressed as [5]

$$a\,\mathbf{p}^t\mathbf{p} + \mathbf{b}^t\mathbf{p} + c = 0 \qquad (1)$$

with $a \neq 0$, $\mathbf{p} = (x, y)^t$ and $\mathbf{b} = (b_1, b_2)^t$, $\mathbf{p}, \mathbf{b} \in \mathbb{R}^2$. the center of the circle $\mathbf{c}$ and the radius $r$ can be determined from these parameters as

$$\mathbf{c} = \left( -\frac{b_1}{2a},\ -\frac{b_2}{2a} \right)^t, \qquad (2)$$

$$r = \sqrt{\frac{b_1^2 + b_2^2}{4a^2} - \frac{c}{a}}. \qquad (3)$$

the insertion of $n \geq 3$ contour points $\mathbf{p}_i = (x_i, y_i)^t$ ($i = 1, \dots, n$) in (1) results in an overdetermined linear system of equations, which is

$$b\,\mathbf{u} = \mathbf{0} \qquad (4)$$

with

$$b = \begin{pmatrix} x_1^2 + y_1^2 & x_1 & y_1 & 1 \\ \vdots & \vdots & \vdots & \vdots \\ x_n^2 + y_n^2 & x_n & y_n & 1 \end{pmatrix}, \qquad \mathbf{u} = \begin{pmatrix} a \\ b_1 \\ b_2 \\ c \end{pmatrix}.$$

$b$ is the matrix of the contour points and $\mathbf{u}$ is the vector of the sought circle parameters. thus, the problem to be minimized can be formulated as

$$q(\mathbf{u}) = \|b\,\mathbf{u}\|^2 = \mathbf{u}^t b^t b\, \mathbf{u} \to \min, \qquad (5)$$

with the constraint $\|\mathbf{u}\|^2 = 1$ to avoid obtaining the trivial solution. this can be reformulated as

$$q(\mathbf{v}_i) = \mathbf{v}_i^t b^t b\, \mathbf{v}_i = \lambda_i \to \min, \qquad (6)$$

where the eigenvector $\mathbf{v}_{\mathrm{opt}}$ of $b^t b$ which belongs to the smallest eigenvalue $\lambda_4$ equals the sought parameter vector $\mathbf{u}$. thus, the circle parameters $\mathbf{c}$ and $r$ can be determined. as already mentioned, the detected contour points may be subject to gross outliers, which can corrupt the fitting result and thus deteriorate the accuracy of the position estimation. therefore, we utilize the ransac algorithm for robust outlier elimination. in the case of circle fitting, the procedure is as follows [3]:

1. select a set of at least three determined contour points randomly,
2. construct a circle through these (e.g. with least squares circle fitting),
3. count the number of points which lie within a predetermined error tolerance of the circle (so-called inliers),
4. if the number of inliers is greater than some predefined threshold, do least squares circle fitting (as explained above) for all inliers; else repeat the process, i.e. start again at 1., until a maximum number of trials is reached.

with this procedure, a circle can be correctly fitted even in challenging images. the use of line-shaped profiles for contour point detection instead of entire images or rois permits a speedup, which is required for executing one detection and welding cycle per second (detection with a standard pc, 2.6 ghz, dual-core, c++ implementation). in addition, sub-pixel accuracy is accomplished by means of least squares fitting to the ransac inliers.
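the following numpy sketch illustrates the algebraic least-squares fit of eqs. (1)–(6) together with the ransac loop described above. it is a minimal sketch under our own assumptions (function names, tolerance and trial counts are hypothetical), not the authors' c++ implementation:

```python
import numpy as np

def fit_circle_algebraic(pts):
    """least-squares circle through pts (an n x 2 array) via the algebraic
    model a(x^2 + y^2) + b1 x + b2 y + c = 0 of eq. (1): the parameter
    vector is the right singular vector of b for the smallest singular
    value, i.e. the eigenvector of b^t b for the smallest eigenvalue."""
    x, y = pts[:, 0], pts[:, 1]
    b_mat = np.column_stack([x**2 + y**2, x, y, np.ones_like(x)])
    a, b1, b2, c = np.linalg.svd(b_mat)[2][-1]
    center = np.array([-b1, -b2]) / (2.0 * a)                  # eq. (2)
    radius = np.sqrt((b1**2 + b2**2) / (4.0 * a**2) - c / a)   # eq. (3)
    return center, radius

def ransac_circle(pts, tol=1.0, min_inliers=20, trials=500, seed=0):
    """ransac loop of section 4.2: sample three points, fit, count inliers
    within tol, and refit by least squares to the largest inlier set."""
    rng = np.random.default_rng(seed)
    best_inliers = None
    for _ in range(trials):
        sample = pts[rng.choice(len(pts), size=3, replace=False)]
        center, radius = fit_circle_algebraic(sample)
        if not np.isfinite(radius):            # e.g. nearly collinear sample
            continue
        resid = np.abs(np.linalg.norm(pts - center, axis=1) - radius)
        inliers = pts[resid < tol]
        if len(inliers) >= min_inliers and (
                best_inliers is None or len(inliers) > len(best_inliers)):
            best_inliers = inliers
    return None if best_inliers is None else fit_circle_algebraic(best_inliers)
```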
5 offset computation and correction

in the case of extracting a single feature, e.g. a circle, it is easy to determine an offset between the detected value and a given nominal value. since the system is restricted to the correction of translation and rotation, one circle only contributes to translation estimation. thus, the correction of the welding contour is a simple shift in its coordinates. in the case of two or more detected circles with different center positions, a rotation may additionally be determined. to avoid approaches such as chamfer matching [6], which require costly iterative optimization for position estimation, we used a direct algebraic and model-oriented way. in contrast to (local) optimization approaches, we thus avoid missing the global optimum, which represents the sought position. in order to apply our direct algebraic approach, which uses the ransac algorithm and the similarity transform [9], the different features first have to be brought to a uniform level. to this end, the descriptions of the features are replaced by (center) points and angles. thus, e.g. an ellipse is represented by its centroid and an angle, a circle is represented by its center, and so on. in the next step, the ransac algorithm is used to eliminate outliers. then, at first, a rotation is computed by considering only the mentioned points and their corresponding nominal values (angles are considered later). hence, we use the similarity transform in order to obtain a linear set of equations and thus avoid sine and cosine terms. the structure of the linear equation set is

$$\begin{pmatrix} u_{x1} \\ u_{y1} \\ \vdots \\ u_{xm} \\ u_{ym} \end{pmatrix} = \begin{pmatrix} v_{x1} & -v_{y1} & 1 & 0 \\ v_{y1} & v_{x1} & 0 & 1 \\ \vdots & \vdots & \vdots & \vdots \\ v_{xm} & -v_{ym} & 1 & 0 \\ v_{ym} & v_{xm} & 0 & 1 \end{pmatrix} \begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix}, \qquad (7)$$

where $\mathbf{u}_i = (u_{xi}, u_{yi})^t$ and $\mathbf{v}_i = (v_{xi}, v_{yi})^t$, $i = 1, \dots, m$, build a set of $m$ point pairs, which are composed of the detected points, such as the centers of the circles, and their corresponding nominal values. the parameters $a$, $b$, $c$, $d$ form the matrix of the similarity transform and encode the sought rotation and a scaling. the rotation angle $\varphi$ can be computed by determining the pseudoinverse of the system matrix, e.g. via singular value decomposition, and computing

$$\varphi = \arctan2(b, a). \qquad (8)$$

between this value and the angles possibly determined directly by the features (e.g. the angles of lines and ellipses), the weighted average is computed. after the computation of this result, the points of the feature are rotated inversely, and the translation is determined as the mean of the distances between the point pairs. hence, the sought rotation and the offset are determined robustly and without iterative optimization steps. finally, the welding contour has to be transformed correspondingly. this is straightforwardly implemented as a multiplication of the welding coordinates with an adapted transformation matrix.
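a minimal numpy sketch of this offset computation is given below (our own illustration; the direction convention nominal = t(detected) and the function name are assumptions, and the ransac outlier elimination and the averaging with directly measured feature angles are omitted for brevity):

```python
import numpy as np

def rotation_and_offset(detected, nominal):
    """rotation angle and translation between detected feature points and
    their nominal (cad) positions via the similarity transform of eqs.
    (7)-(8); detected and nominal are (m x 2) arrays of corresponding
    points."""
    m = len(detected)
    a_mat = np.zeros((2 * m, 4))
    a_mat[0::2] = np.column_stack([detected[:, 0], -detected[:, 1],
                                   np.ones(m), np.zeros(m)])
    a_mat[1::2] = np.column_stack([detected[:, 1], detected[:, 0],
                                   np.zeros(m), np.ones(m)])
    rhs = nominal.reshape(-1)                    # (u_x1, u_y1, ..., u_ym)
    a, b, _, _ = np.linalg.lstsq(a_mat, rhs, rcond=None)[0]  # pseudoinverse
    angle = np.arctan2(b, a)                                 # eq. (8)
    rot = np.array([[np.cos(angle), -np.sin(angle)],
                    [np.sin(angle),  np.cos(angle)]])
    # rotate the detected points and take the mean residual as the offset
    offset = (nominal - detected @ rot.T).mean(axis=0)
    return angle, offset
```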
6 experimental results

to evaluate the accuracy of the position estimation, images of real workpieces were acquired using a galvanometer scanner and stitched (see fig. 5). these images were synthetically rotated and shifted with sub-pixel accuracy. hence, a reliable ground truth is created. within these images, the proposed algorithms for feature extraction and offset estimation are applied. the results obtained with the images in fig. 5 are presented in the boxplot [8] in fig. 6.

fig. 5: stitched images used for evaluating the accuracy in position estimation

despite the challenging real-world images, the range of deviations for circle detection is smaller than 2 pixels, which in our case is less than 13 μm. in the case of the arc of a circle, the same image is used, but only a quarter of a circle is employed for position detection. this is more time-efficient, since only a quarter of the overall image has to be acquired, but the accuracy is significantly decreased. in position estimation with two orthogonally intersecting lines, the worst case is a deviation of 9 pixels, which corresponds to 60 μm. however, the image used for line detection depicts an smd chip (surface mounted device), where the lines are fitted to the rows of pins. the rows are fragmented due to the gaps between the pins. this complicates the task of fitting a line with such high accuracy. the image of the ellipse is affected by strong artifacts which hamper higher accuracy. in addition, the ellipse has the highest number of degrees of freedom. since the number of contour points used for fitting is constant, the relative number (related to the degrees of freedom) for the ellipse is lower. this causes a decay in accuracy. the results of the algorithm for offset computation are used for rotating and shifting the welding contour correspondingly. the final aim is to execute the welding process at the correct position. in order to obtain preliminary evidence, three of the filter membranes in fig. 1 were welded at position variations of about 1 mm. the final position deviation of the overall system did not exceed 30 μm. an example of successful welding obtained with our system is presented in fig. 7.

fig. 6: boxplot illustrating the deviations of 20 measurements from ground truth ("circle x" e.g. denotes the deviations in the x-direction when a circle is used for position estimation; "lines rot." denotes the deviation of the determined rotation from ground truth, in degrees). the lines in the boxes mark the medians, the boxes encompass the two inner quartiles of the results, while the "whiskers" and crosses mark the two remaining outer quartiles.

fig. 7: welding result: filter part and its weld seam

7 summary, conclusions

this paper addresses the problem of tedious and error-prone adjustment of present-day laser welding machines. especially for high-tech goods in medical or environmental applications, high precision is required at moderate production costs. to this end, a conventional welding machine has been equipped with the capability to react automatically to varying positions of the parts to be welded. in this way, costly manual adjustment can be reduced to a minimum, and high-tech goods could become more attractive for emerging markets. the proposed procedure can be subdivided into the steps of image acquisition, stitching, feature extraction, position estimation, transformation of the welding contour and, finally, successful welding. these steps have been described and evaluated in detail. the evaluation results show that high standards of accuracy can be achieved by the entire system.

acknowledgments

the authors would like to thank professor dr.-ing. til aach, institute of imaging and computer vision, rwth aachen university, and dr.-ing. alexander olowinsky, fraunhofer institute for laser technology, who supervised this project. the authors would also like to thank dipl.-ing.
jens gedicke, fraunhofer institute for laser technology, who, among others, was involved in creating the hardware setup, for his kind support. this work is funded by the "intakt" collaborative research project in the "innonet" program, funded by the german federal ministry of economics and technology (bmwi) with vdi/vde-it. the authors gratefully acknowledge this support as well as the active cooperation of all project partners.

references

[1] jähne, b.: digitale bildverarbeitung. springer verlag, berlin, heidelberg, 6th edition, 2005.
[2] canny, j.: a computational approach to edge detection. ieee trans. pattern analysis and machine intelligence, vol. 8 (1986), p. 679–714.
[3] fischler, m. a., bolles, r. c.: random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. commun. acm, vol. 24 (1981), no. 6, p. 381–395.
[4] furlong, b., motakef, s.: scanning lenses and systems. news from cvi melles griot, november 28, 2007.
[5] gander, w., golub, g. h., strebel, r.: least-squares fitting of circles and ellipses. in numerical analysis (in honour of jean meinguet), 1996, p. 63–84.
[6] borgefors, g.: hierarchical chamfer matching: a parametric edge matching algorithm. ieee transactions on pattern analysis and machine intelligence, vol. 10 (1988), no. 6, p. 849–865, november 1988.
[7] hough, p. v. c., arbor, a.: method and means for recognizing complex patterns. united states patent and trademark office, 3069654, 1962.
[8] mcgill, r., tukey, j. w., larsen, w. a.: variations of box plots. the american statistician, vol. 32 (1978), p. 12–16, feb. 1978.
[9] werman, m., weinshall, d.: similarity and affine distance between 2d point sets. in proceedings of the 12th iapr international conference on pattern recognition, vol. 1 (1994), p. 723–725, jerusalem, israel, october 1994. ieee.
[10] stache, n. c., zimmer, h., gedicke, j., olowinsky, a., aach, t.: robust high-speed melt pool measurements for laser welding with sputter detection capability. in dagm07: 29th annual symposium of the german association for pattern recognition, heidelberg, september 2007. springer verlag.
[11] hartley, r., zisserman, a.: multiple view geometry in computer vision. cambridge university press, 2000.
[12] smith, s. m., brady, j. m.: susan – a new approach to low level image processing. technical report tr95sms1c, oxford university, chertsey, surrey, uk, 1995.

nicolaj c. stache, e-mail: nicolaj.stache@lfb.rwth-aachen.de

thi porn nguyen

institute of imaging & computer vision, rwth aachen university, d-52056 aachen, germany

biomass gasification — primary methods for eliminating tar

martin lisý, marek baláš, jiří moskalík, otakar štelcl

brno university of technology, faculty of mechanical engineering, institute of power engineering, technická 2896/2, 616 69 brno, czech republic

correspondence to: lisy@fme.vutbr.cz

abstract: this paper deals with primary methods for reducing tar in biomass gasification, namely by feeding a natural catalyst into a fluidized bed. the method is verified using an experimental pilot plant.

keywords: tar reduction, dolomite, catalytic reduction.

1 introduction

energy recovery from biomass is one of the most widespread and most promising ways of including renewable energy sources in the energy mix of the czech republic.
though the optimistic forecasts from the turn of the millennium concerning the growth of the share of res in power generation in the czech republic remain unfulfilled, a host of energy technologies based on biomass have become quite advanced. one option besides conventional combustion technologies is thermal gasification of biomass, and possibly also of wastes. gasification of biomass and wastes has developed progressively not only in the czech republic but, above all, in other countries. this technology offers a wide variety of options. there are two basic types of gasifiers: gasifiers with a fixed bed and gasifiers with a fluidized bed. the characteristics of the two types and the differences between them can be found in a number of publications [1]. these, however, are not the subject of the present paper. the ways of using the produced gas form a separate topic. the ways in which the gas can be used are determined by its quality; in other words, the requirements placed on the quality of the gas have a critical influence on the requirements concerning the methods used for gas cleaning.

2 the process of gasification, and the quality of the gas that is produced

thermochemical gasification is a complex process of thermal and chemical conversion of organic matter into a low heating value gas (co, h₂, ch₄, co₂, n₂, h₂o) consisting of a host of reactions. the process proceeds in the temperature range 750 °c to 1 000 °c. apart from the above-mentioned combustible and inert constituents, the fuel gas that is produced also contains smaller or larger amounts of impurities. the composition and amounts of the impurities are determined by a whole host of factors (fuel composition, type of reactor, conditions in the reactor, etc.), and this has a significant impact on the methods subsequently employed for gas cleaning and use. the basic contaminants contained in the gas from gasification include solid particles, alkali and nitrogen compounds, sulphur and chlorine compounds and, above all, tar. any gas coming from biomass gasification contains at least minimal amounts of tar, and this causes serious problems in its use. tar is not a crucial problem for gas turbines, because the temperatures are high and the tar is combusted in the chamber. if the tar remains in the form of fumes, there are no limits on the amounts [2], but if there is condensation, the maximum amount of tar ranges from 0 ppm to 0.5 ppm, according to different authors [1]. if the gas is to be used in combustion engines, the limit is most frequently set in the interval between 10 mg·mₙ⁻³ and 100 mg·mₙ⁻³. recently published works most frequently agree that the maximum amount of tar in the gas should be 50 mg·mₙ⁻³ [3]. engine makers usually do not give concrete values, but they make gas purity conditional on a zero content of tar in the condensate.

3 tar reduction methods

tar production in wood gasification is much greater than in coal or peat gasification, and this tar, as a rule, consists of heavier and more stable aromatic substances [3]. these may partly react, producing gas black that clogs filters and valves, which is a problem particularly in the case of biomass gasification. this means that technologies developed for tar reduction in coal gasification need not necessarily apply to biomass gasification.
much research is therefore being carried out on reducing tar formation in biomass gasification, or, alternatively, on removing the tar effectively so that the fuel gas produced can be used in combustion engines. measures aimed at reducing the tar content in gas can be classified according to various criteria, but the main measures can be categorized as primary or secondary. as it is not possible in most gasifiers to prevent tar formation via primary measures, it is necessary to remove tar using gas-filtering lines, i.e. so-called secondary measures. there are several ways and methods of removing tar from gas. the most widely used methods involve the use of catalysts (natural and artificial) and wet methods (gas scrubbing). primary measures, however, are applied within the reactor. they have the potential to boost overall energy conversion efficiency while at the same time reducing the need to remove tar from the gas outside the reactor. this can reduce investment costs and operating costs. generally, two processes are applied: thermal decomposition with partial oxidation, and catalytic methods.

4 catalytic reduction of tar formation in gasifiers

this method is based on the capacity of catalysts to decompose hydrocarbons into carbon monoxide, hydrogen, and lower hydrocarbons. in practice, two types of catalysts may be considered, one based on lime and dolomite, and the other based on nickel. in the case of a primary measure, the catalyst is fed directly into the fluidized bed. using a suitable catalyst, the tar content can be reduced, and in addition the concentrations of undesirable compounds of sulphur and chlorine in the gas can also be reduced. intense abrasion of adsorbent particles takes place in the fluidized bed, and small particles emerge with a sizeable surface and a considerable adsorption activity. in experiments carried out at various workplaces, the following materials have been used most frequently: dolomite, magnesite, limestone, and quartz sand. the most suitable material seems to be dolomite, i.e. calcium magnesium carbonate, which is affordable and widely available. among the available materials, dolomite shows the best ratio between effectiveness and abrasion resistance. the optimum temperature conditions range within the interval 800 °c to 900 °c, and the residence time is somewhere in the region of 0.3 s to 0.8 s. the activity of both dolomite and limestone grows as the proportion of ca/mg increases [4]. the properties of various materials referred to as dolomite vary considerably according to the site of origin. this then has a crucial impact on their tar reduction activity. the catalytic activity of dolomite is considerably boosted after calcination, i.e. after the conversion of the carbonates to oxides, which is accompanied by the release of co₂. catalysts may be applied directly to the bed in their solid state, or by wet spraying of the feedstock [5]. if the reactor operating parameters are set appropriately, i.e. taking into account the temperatures and the proportions of fuels and catalysts, higher contents of h₂ and co are obtained in the gas, and this also slightly raises the heating value of the gas [6]. this method has a negative impact on the stability and the start-up of the gasification process.
5 experimental plant

the biofluid 100 atmospheric gasifier that is in operation in the laboratories of the institute of power engineering (ipe) of the faculty of mechanical engineering of the brno university of technology was used for the experimental part of our study. this is a facility with a stationary fluidized bed that can be run in both gasification and combustion modes of operation. the feedstock is supplied from a fuel storage bin equipped with a rake, and is fed by a worm conveyor with a frequency convertor to the reactor. air compressed by a blower is brought as primary air under the reactor grate and, in addition, at two other height levels as secondary and tertiary air. the produced fuel gas is rid of fly ash using a cyclone. the resulting gas is then burned using a burner equipped with a natural-gas-fired stabilizing burner with its own air supply. ash from the reactor is discharged into a container placed under the grate. an electric air heater is installed at the blower outlet in order to be able to analyze the effect of air preheating; however, only the primary air is heated. a detailed description of the facility is given, e.g., in [4].

6 initial experimental conditions

the basic parameters that had to be set included the dolomite particle size, based on the volume and the velocity of the moving gas, and also the mass of the fluidized bed in the reactor in the course of its operation. as it was not feasible to establish the amount of fuel in the fluidized bed by way of a theoretical computation, it was necessary, before the series of experiments, to design and build a sampling line and to establish this amount experimentally. the feedstock was fed at 18 kg/hr, and the entire content of the reactor, which amounted to 600 ml, or 100 g, was removed. second, it was necessary to calculate the grain size of the input material to ensure proper fluidization in the reactor. for this, the highest and lowest (design) velocities of some 1 m·s⁻¹ to 1.6 m·s⁻¹ were used. this estimate is based on a calculation with the following assumptions: reactor geometry, temperature variations along its height, flow of primary air at standard reactor operation (25 m³·h⁻¹ at normal conditions), and an increment in gas volume along the height of the fluidized bed (linear, 0 to 18.8 m³·h⁻¹). to achieve catalyst particle fluidization in the bed, the velocity of 1.6 m·s⁻¹ was deemed to be the velocity of particle release from the fluidized bed. for this velocity, with the known parameters of the gas and the reactor, the smallest particle diameter is 0.35 mm (calcined dolomite was considered, as it is expected that calcination will occur in the reactor). for the velocity of 1.6 m·s⁻¹ and with the given assumptions, the largest diameter of the particle was set to 2.7 mm (non-calcined dolomite was considered, as it is fed into the reactor in its natural form). during its residence in the bed, abrasion and calcination occur at the same time. the diminishing particle is assumed to leave the area upon reaching a size of 0.35 mm. to make the constantly abraded particle stay in the bed as long as possible, and also due to the deviations of the simplified calculation, it was decided to use particles with a grain size of 1 mm to 1.5 mm. the remaining quantity that had to be specified before launching the experiments was the quantity of catalyst to be applied. on the basis of a literature search and the calculations that were made, the amount of dolomite to be used was set within the interval 0.027 to 0.035 kg of dolomite per kg of fuel.
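as a quick consistency check (our own illustration, not part of the paper; the helper function is hypothetical), the mass of one dolomite batch for a chosen dosing interval follows from simple arithmetic and can be compared with the batch values listed in table 1 below:

```python
def dolomite_batch_mass(feed_rate_kg_h, ratio_dol_per_fuel, interval_min):
    """mass of one dolomite batch such that the time-averaged dosing
    matches the chosen dolomite-to-fuel mass ratio."""
    fuel_per_interval = feed_rate_kg_h * interval_min / 60.0
    return ratio_dol_per_fuel * fuel_per_interval

# 18 kg/hr of feedstock, ratio 0.035 kg_dol/kg_fuel, a batch every 15 min:
print(dolomite_batch_mass(18.0, 0.035, 15.0))   # ~0.16 kg per batch
```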
7 the course and results of the experiments

tests were carried out during which the impact of adding dolomite to the fluidized bed was examined. samples of gas and tars were first taken during each "clean" biomass gasification experiment. catalyst was then added on an intermittent basis, and reference samples were taken afterwards. the samples were stored in glass and metallic sample containers. a gas composition analysis was carried out directly in the laboratory of ipe, using a perkin elmer gas chromatograph (gc) with a packed column and a tcd detector. tar sampling was performed behind a cyclone at the outlet from the insulated jacket of the reactor. the class of hydrocarbons with a boiling point exceeding that of phenol was defined as "tar". an hp6890 gas chromatograph with an hp5973 mass spectrometer, made by agilent (usa), was used to establish the gas tar content and its composition.

table 1: properties and quantities of feedstock and dolomite batches

      | feedstock | feedstock humidity [%] | dolomite grain size [mm] | dolomite batch [g] | interval [min] | dolomite total [kg] | feedstock quantity [kg/hr] | proportion dol./feed. [kg_dol/kg_feed]
no. 1 | shavings  | 7  | 1–1.5 | 200 | 15 | 1.2 | 21.6 | 0.034
no. 2 | shavings  | 7  | 1–1.5 | 150 | 15 | 1.5 | 18   | 0.036
no. 3 | chips     | 15 | 0.5–1 | 35  | 5  | 1.3 | 15.5 | 0.030
no. 4 | chips     | 18 | 0.5–1 | 60  | 5  | 2.7 | 16   | 0.045

table 2: temperature conditions during the experiments (average temperatures in the reactor in the course of the experiments, in °c)

      |                | reactor bottom | reactor centre | freeboard
no. 1 | dolomite-free  | 821 | 795 | 731
      | dolomite-doped | 810 | 770 | 730
no. 2 | dolomite-free  | 853 | 813 | 712
      | dolomite-doped | 845 | 795 | 712
no. 3 | dolomite-free  | 850 | 821 | 719
      | dolomite-doped | 835 | 800 | 735
no. 4 | dolomite-free  | 849 | 827 | 730
      | dolomite-doped | 829 | 818 | 727

figure 1: comparison of tar reduction after dolomite feeding

the conditions and the results of the experiments are summarized in the tables above. this was a standard setting of the gasification process for gasifying woody biomass. when no dolomite was fed into the reactor, the temperature course and the gas composition remained stable. following the addition of dolomite, a drop in the reactor temperature and a growth in the co₂ content of the gas were observed. this is clearly due to the release of co₂ upon partial or full dolomite calcination. the degree of calcination is mainly dependent on the temperature of the fluidized bed and on the content of co₂. another reason behind the temperature drop is the onset of the reducing endothermic reactions of the tar.

8 evaluation of the results

the impact of adding dolomite to the fluidized bed of the gasifier on the amounts of tar produced is clearly shown in figure 1. the application of dolomite with grain size 1 mm to 1.5 mm into the reactor at bed temperatures between 770 °c and 810 °c did not result in any drop in tar production. it was therefore decided to raise the temperature to the interval between 800 °c and 840 °c. the chart shows that a moderate drop in tar content in the gas was achieved, but that this drop was insignificant. subsequent experiments were therefore carried out at this higher temperature level.
on the basis of the course of the experiment, and in particular the changes in temperature and in the gas composition (co, co₂) quantified using an on-line analyzer, it was concluded that the selected dolomite grain size was too big: the dolomite rests on the grate and in the bottom layer, and this prevents the necessary contact with the gas. a smaller grain size was therefore chosen for the subsequent experiments. to curb fly ash, we opted for adding smaller amounts of dolomite at shorter intervals. figure 1 shows an obvious positive impact, with a drop in the amount of tar by about one quarter of the original amount. to boost this effect, it was decided to increase the amount of catalyst by about 50 %, and this resulted in eliminating the escape of finer particles of the catalyst. the results show that with this setting the amount of tar produced dropped by about 60 %. no significant variation in the makeup of the individual gas constituents was found following the doping of the reactor with dolomite. only a minor drop in the co and h₂ contents was encountered, together with slightly elevated concentrations of co₂ released upon dolomite calcination. the heating value of the gas that was produced ranged in the interval from 4.8 to 5.6 mj·mₙ⁻³.

9 conclusions

no fundamental changes in the composition and the content of tars in the gas were found following the application of dolomite with grain sizes of 1 mm to 1.5 mm. these particles were probably left lying on the grate, or were fluidized in the bottom section of the reactor, and therefore failed to come into contact with the moving gas. the use of a smaller grain size, combined with the application of a large initial batch, led to a tar content reduction of some 25 per cent after dolomite was applied to the bed. the dolomite bed was significantly denser, and lightweight particles became fluidized at a greater height, as is evidenced by the temperature increase in the freeboard. optimum results were then achieved by increasing the amount of dolomite to 0.045 kg_dol/kg_fuel in experiment no. 4. a shortcoming of this way of reducing tar in this particular experimental facility was the low freeboard temperature and the stationary fluidized bed, which hampered the circulation of catalyst particles within the reactor. this was the main reason for raising the proportion of catalyst. however, it seems that the addition of dolomite to the fluidized bed may be an effective primary measure for cutting down the tar content in fuel gas. the assumption that it is not feasible to remove all tar from the gas by way of primary measures has been confirmed; nevertheless, a 60 % drop in tar content brings a significant saving in investment and operating costs for secondary gas cleaning.

acknowledgement

this paper came into being thanks to the assistance of the specific research project fsi-s-11-7 "selected components of trigeneration processes", provided by the faculty of mechanical engineering, brno university of technology.

references

[1] najser, j.: zplyňování dřeva pro kogeneraci. disertační práce, tu-všb ostrava, 2009.
[2] milne, t. a., evans, r. j., abatzoglou, n.: biomass gasifier "tars": their nature, formation and conversion. nrel/tp-570-25357, colorado, usa, 1998.
[3] skoblia, s.: úprava složení plynu ze zplyňování biomasy. disertační práce, praha: všcht, 2004.
[4] lisý, m.: čištění energoplynu z biomasy v katalytickém vysokoteplotním filtru. disertační práce, vut fsi v brně, 2009.
[5] sutton, d., et al.: review of literature on catalysts for biomass gasification. fuel processing technology, 2001, 73, 155–173.
[6] perez, p., aznar, m. p., caballero, a. j., gil, j., martín, j. a., corella, j.: hot gas cleaning and upgrading with a calcined dolomite located downstream a biomass fluidized bed gasifier operating with steam-oxygen mixtures. energy fuels, 1997, 11, 1194.

the diamond lemma and the pbw property in quantum algebras

m. havlíček, s. pošta

abstract: the use of the diamond lemma for proving various facts about the center of the algebra u′q(so3) is demonstrated. the approach presented here is successful in other cases of quantum algebras and superalgebras.

keywords: quantum algebra, pbw property, center.

1 introduction

a fundamental issue when examining quantum groups or similar structures is to explore and classify all representations. usually one encounters two main cases, which are of different difficulty. when the deformation parameter, in quantum groups typically denoted by q, is not a root of unity, i.e. when qⁿ ≠ 1 for all positive integers n, the representation theory is "the same" as in the classical case, that is, in the case of enveloping algebras of classical lie algebras. when q is a root of unity, it is not an easy task to classify finite dimensional representations even in the lowest dimensions. various preparatory steps must be taken before one can investigate representations of the considered algebra. important supporting knowledge which helps in making such a classification is detailed information about the center of the considered algebra, because in an irreducible representation, operators which belong to the center of the algebra (casimir operators) are represented by a scalar multiple of the unit operator.

the harish-chandra homomorphism (1951) is an isomorphism which maps the center z(u(g)) of the universal enveloping algebra u(g) of a semisimple lie algebra g to the elements s(h)^w of the symmetric algebra s(h) of a cartan subalgebra h ⊂ g that are invariant under the corresponding weyl group w. let r be the rank of g, which is the dimension of the cartan subalgebra h. coxeter observed that s(h)^w is a polynomial algebra in r variables. therefore, the center of the universal enveloping algebra of a semisimple lie algebra is a free polynomial ring. any casimir operator is an arbitrary polynomial in the basic algebra invariants. the number and degrees of these fundamental invariants are shown in table 1.

table 1: degrees of the fundamental invariants

a_r | 2, 3, 4, ..., r + 1
b_r | 2, 4, 6, ..., 2r
c_r | 2, 4, 6, ..., 2r
d_r | r; 2, 4, 6, ..., 2r − 2
e_6 | 2, 5, 6, 8, 9, 12
e_7 | 2, 6, 8, 10, 12, 14, 18
e_8 | 2, 8, 12, 14, 18, 20, 24, 30
f_4 | 2, 6, 8, 12
g_2 | 2, 6

in the case of standard drinfeld-jimbo quantum groups, we have an analogy to the non-deformed case. when the deformation parameter q is not a root of unity, a modified version of the harish-chandra homomorphism exists, and the center is again isomorphic to the ring of polynomials in the fundamental invariants (see [1]). for example, in the simplest case of the algebra uq(sl2), generated by four letters (generators) denoted e, f, k, k⁻¹ and the relations

$$kk^{-1} = k^{-1}k = 1, \qquad kek^{-1} = q^2 e, \qquad kfk^{-1} = q^{-2} f,$$
$$[e, f] \equiv ef - fe = [k]_q \equiv \frac{k - k^{-1}}{q - q^{-1}},$$

the center is generated by the casimir element

$$c_q = ef + \frac{kq^{-1} + k^{-1}q}{(q - q^{-1})^2} \qquad (1)$$

(for the standard proof of this fact, see [2], theorem 45).
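as a quick sanity check (our own addition, not part of the original text), one can verify directly from the defining relations that c_q commutes with k:

```latex
% k conjugates e and f by reciprocal factors, so the product ef is invariant:
\[
  k\,(ef)\,k^{-1} = (k e k^{-1})(k f k^{-1}) = (q^{2} e)(q^{-2} f) = ef ,
\]
% and the remaining summand of c_q is a function of k alone, hence
\[
  [\,c_q,\, k\,] = 0 .
\]
```

commutation of c_q with e and f follows from a similar, slightly longer computation using [e, f] = (k − k⁻¹)/(q − q⁻¹).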
when q is a primitive root of unity, say qⁿ = 1, n > 1, with qᵐ ≠ 1 for m < n, the situation is much more difficult. the center is typically much larger, and the central elements satisfy nontrivial polynomial relations [3]. in the case of uq(sl2), there are four more additional elements in the center, namely eᵖ, fᵖ, kᵖ, k⁻ᵖ, where p = n if n is odd and p = n/2 if n is even. these five elements (together with (1)) are no longer algebraically independent. one can show by induction that

$$\prod_{j=0}^{p-1}\left( c_q - (q - q^{-1})^{-2}\,\bigl(kq^{j+1} + k^{-1}q^{-j-1}\bigr) \right) = e^p f^p,$$

which implies

$$c_q^p + \gamma_1 c_q^{p-1} + \dots + \gamma_{p-1} c_q + (-1)^p (q - q^{-1})^{-2p}\,\bigl(k^p - k^{-p}\bigr) = e^p f^p,$$

where the γᵢ are certain complex coefficients.

quantum groups are not the only kind of quantum deformation. there also exist other, nonstandard deformations. for example, the q-deformation u′q(so3) of the universal enveloping algebra u(so3), which does not coincide with the drinfeld-jimbo quantum algebra uq(so3), is constructed without using the cartan subalgebra and roots, by deforming serre-type relations directly. we simply substitute 2 → [2]_q = q + q⁻¹ in the cubic defining relations of u(so3). as a result we obtain a complex associative algebra with unity generated by elements i21, i32 satisfying the relations

$$i_{21}^2 i_{32} - (q + q^{-1})\, i_{21} i_{32} i_{21} + i_{32} i_{21}^2 = -i_{32},$$
$$i_{21} i_{32}^2 - (q + q^{-1})\, i_{32} i_{21} i_{32} + i_{32}^2 i_{21} = -i_{21}.$$

it can be shown that this is isomorphic to an algebra generated by three generators i1, i2, i3 and the relations [5]

$$q^{1/2} i_1 i_2 - q^{-1/2} i_2 i_1 = i_3,$$
$$q^{1/2} i_2 i_3 - q^{-1/2} i_3 i_2 = i_1, \qquad (2)$$
$$q^{1/2} i_3 i_1 - q^{-1/2} i_1 i_3 = i_2.$$

one can quickly find the following casimir element, which belongs to the center of this algebra:

$$c = q^2 i_1^2 + i_2^2 + q^2 i_3^2 - (q^{5/2} - q^{1/2})\, i_1 i_2 i_3. \qquad (3)$$

similarly as in the case of ordinary hopf quantum groups, one can expect that when q is not a root of unity, this element generates the center of the algebra u′q(so3). however, there is no analogue of the harish-chandra homomorphism here, so one must prove this fact by other methods. as shown below, these methods are useful even in the more complicated case when q is a root of unity.

2 the diamond lemma

in 1978, g. m. bergman recalled a rather deep and forgotten result of newman from graph theory, often called the diamond lemma. he showed its usefulness for other fields of mathematics also, namely for the theory of associative algebras. the original newman formulation was as follows (see [6]). let g be an oriented graph. now suppose that

1) the oriented graph g has the descending chain condition, that is, all positively oriented paths in g terminate; in other words, there are no cycles in the graph.

2) whenever two edges, e and e′, proceed from one vertex a of g, there exist positively oriented paths p, p′ in g leading from the endpoints b, b′ of these edges to a common vertex c. (this is often called the "confluence" or "diamond" condition, see fig. 1.)

fig. 1: diamond condition

then one can show that every connected component c of g has a unique minimal vertex m_c. this means that every maximal positively oriented path beginning at a point of c will terminate at m_c.

let us now describe the version of the diamond lemma in the theory of associative algebras. let r be an associative algebra with unity over the complex numbers, given by a finite set of generators x and a set of relations s = {w_σ = f_σ | σ ∈ σ}, where w_σ is a monomial (a product of a finite number of generators from x) and f_σ is a complex linear combination of monomials. let us have a partial ordering < defined on monomials which satisfies three conditions: it is invariant with respect to multiplication, i.e.
for all monomials a, b, b′, c we have b < b′ ⟹ abc < ab′c; it is s-compatible (i.e. f_σ is a linear combination of monomials that are less than w_σ, for each σ ∈ σ); and it fulfils the dcc (the descending chain condition, i.e. the nonexistence of an infinite sequence x₁ > x₂ > …). furthermore, let all monomials which can be written as a product abc, where ab = w_σ and bc = w_τ, b ≠ 1, σ, τ ∈ σ, or in the form abc = w_σ and b = w_τ, σ ≠ τ (such monomials are called ambiguities), be reducible by the relations in s to a common value. then the diamond lemma states that the irreducible monomials (i.e. completely reduced ones, to which no relation from s can be applied) form a basis of the algebra r.

the diamond lemma can be used effectively for deciding various kinds of problems. a typical problem, as mentioned in [6], is as follows. let us have an algebra with generators a, b, c and the relations

a² = a, b² = b, c² = c, (4)
(a + b + c)² = a + b + c. (5)

now the problem is to answer the question whether it follows from these relations that ab = 0. the second relation (5) can be rewritten as

cb = −ab − ba − ac − ca − bc. (6)

now we test whether (4), (5) and (6) imply a unique canonical form of the elements of the considered algebra. we must examine the following ambiguities: a³, b³, c³, cb², c²b. the first three are trivial to reduce, and the fourth, reduced in two possible ways, gives us the following:

c(bb) = cb = −ab − ba − ac − ca − bc

and

(cb)b = (−ab − ba − ac − ca − bc)b = −ab² − bab − acb − cab − bcb = −ab − bab − a(−ab − ba − ac − ca − bc) − cab − b(−ab − ba − ac − ca − bc) = −ab − bab + a²b + aba + a²c + aca + abc − cab + bab + b²a + bac + bca + b²c = −ab − bab + ab + aba + ac + aca + abc − cab + bab + ba + bac + bca + bc = aba + ac + aca + abc − cab + ba + bac + bca + bc.

so we have

−ab − ba − ac − ca − bc = aba + ac + aca + abc − cab + ba + bac + bca + bc.

this equality can be rewritten as

cab = aba + 2ac + aca + abc + 2ba + bac + bca + 2bc + ab + ca. (7)

reducing the fifth ambiguity c(cb) = (cc)b leads to the same relation. now what happens if we add (7) to the list of relations? the ambiguities cb² and c²b now reduce automatically to a common value, but two new ambiguities of higher degree arise: c²ab and cab². we therefore test again: we have

(cc)ab = cab = aba + 2ac + aca + abc + 2ba + bac + bca + 2bc + ab + ca,

and, after some computation,

c(cab) = … = aba + 2ac + aca + abc + 2ba + bac + bca + 2bc + ab + ca.

the second ambiguity reduces to a common value, too. the basis of the considered algebra consists of all words in the letters a, b, c not containing the substrings a², b², c², cb and cab. the word ab is therefore irreducible, hence nonzero.

3 pbw property

one of the simple consequences of the diamond lemma is the poincaré–birkhoff–witt property of universal enveloping algebras, and its analogy in the case of quantum deformations. as a simple example, consider the enveloping algebra u(sl2) of the lie algebra sl2, which is given by three generators e, f, h satisfying the relations

[e, f] = ef − fe = h, [h, e] = 2e, [h, f] = −2f. (8)

we define a total ordering ≺ of the generators as in the alphabet, i.e. e ≺ f ≺ h. a partial ordering among monomials is defined in such a way that x < y when the length of x (the number of letters in the product x) is less than the length of y, or when x is a permutation of the letters of y but has a lower number of inversions (the monomial x = x₁…x_s has an inversion (i, j), 1 ≤ i, j ≤ s, when i < j and x_i ≻ x_j). one can easily see that this partial ordering is compatible with the relations (8), which we present in the more suitable form of "rewriting rules":

fe → ef − h,
he → eh + 2e, (9)
hf → fh − 2f.

one can also easily check that the ordering fulfils the dcc. a simple computation gives us

(hf)e = (fh − 2f)e = fhe − 2fe = f(eh + 2e) − 2fe = feh = (ef − h)h = efh − h²,

h(fe) = h(ef − h) = hef − h² = (eh + 2e)f − h² = ehf + 2ef − h² = e(fh − 2f) + 2ef − h² = efh − 2ef + 2ef − h² = efh − h².
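this reduction procedure is easy to mechanize. the following short python sketch (our own illustration; the representation of monomials as strings is an assumption) applies the rewriting rules (9) repeatedly and confirms that the ambiguity hfe has the unique normal form efh − h²:

```python
from collections import defaultdict

# rewriting rules (9) for u(sl2) in the ordering e < f < h:
#   fe -> ef - h,   he -> eh + 2e,   hf -> fh - 2f
RULES = {
    "fe": {"ef": 1, "h": -1},
    "he": {"eh": 1, "e": 2},
    "hf": {"fh": 1, "f": -2},
}

def reduce_poly(poly):
    """repeatedly apply the leftmost applicable rule to every monomial
    until none matches; poly maps monomial strings to coefficients."""
    while True:
        out, changed = defaultdict(int), False
        for mono, coeff in poly.items():
            for i in range(len(mono) - 1):
                if mono[i:i + 2] in RULES:
                    for rhs, c in RULES[mono[i:i + 2]].items():
                        out[mono[:i] + rhs + mono[i + 2:]] += coeff * c
                    changed = True
                    break
            else:                      # no rule matched: mono is irreducible
                out[mono] += coeff
        poly = {m: c for m, c in out.items() if c != 0}
        if not changed:
            return poly

# the ambiguity hfe reduces to a unique normal form efh - h^2:
print(reduce_poly({"hfe": 1}))   # {'efh': 1, 'hh': -1}
```

note that the loop terminates precisely because the ordering fulfils the dcc: every rule replaces a monomial by strictly smaller ones.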
3 pbw property

one of the simple consequences of the diamond lemma is the poincaré-birkhoff-witt property of universal enveloping algebras and its analogy in the case of quantum deformations. as a simple example, let us take the enveloping algebra $U(sl_2)$ of the lie algebra $sl_2$, which is given by three generators $e$, $f$, $h$ satisfying the relations

$$[e,f] = ef - fe = h,\qquad [h,e] = 2e,\qquad [h,f] = -2f. \quad (8)$$

we define the total ordering $\prec$ of the generators as in the alphabet, i.e. $e \prec f \prec h$. a partial ordering among monomials is defined in such a way that $x < y$ when the length of $x$ (the number of letters in the product $x$) is less than the length of $y$, or when $x$ is a permutation of the letters of $y$ but has a lower number of inversions (the monomial $x = x_1 \dots x_s$ has an inversion $(i,j)$, $1 \le i,j \le s$, when $i < j$ and $x_i \succ x_j$).

one can easily see that this partial ordering is compatible with relations (8), which we present in the more suitable form of "rewriting rules":

$$fe \to ef - h,\qquad he \to eh + 2e,\qquad hf \to fh - 2f. \quad (9)$$

one can also easily check that the ordering fulfils the dcc. a simple computation gives us

$$(hf)e = (fh - 2f)e = fhe - 2fe = f(eh + 2e) - 2fe = feh = (ef - h)h = efh - h^2,$$
$$h(fe) = h(ef - h) = hef - h^2 = (eh + 2e)f - h^2 = ehf + 2ef - h^2 = e(fh - 2f) + 2ef - h^2 = efh - 2ef + 2ef - h^2 = efh - h^2.$$

we see that the ambiguity $hfe$ is reduced to a common value. one can easily list all irreducible monomials: these are precisely $e^\alpha f^\beta h^\gamma$, $\alpha, \beta, \gamma \ge 0$. the diamond lemma states that these monomials form a basis of the algebra $U(sl_2)$ (as stated by the well known pbw theorem).

4 center of u(sl2)

as was said in the introduction, there is a standard way to explore the structure of the center of the algebra $U(sl_2)$. let us now find this structure using the diamond lemma. this procedure can then be generalized to other algebras where standard tools like the harish-chandra homomorphism cannot be used.

the problem is to find all elements $x$ of $U(sl_2)$ for which we have $[x,a] \equiv xa - ax = 0$ for all $a \in U(sl_2)$. this condition is clearly equivalent to

$$[x,e] = [x,f] = [x,h] = 0, \quad (10)$$

that is, one can restrict oneself to commutations with the generators only. let us take a general element of the form

$$x = \sum_{i,j,k=0}^{n} \alpha_{i,j,k}\, e^i f^j h^k$$

for some small values of $n$, compute the commutators $[x,e]$, $[x,f]$ and $[x,h]$, and solve the system of linear equations (10) for the coefficients $\alpha_{i,j,k}$. for $n = 1$ we get nothing (the trivial zero solution only). for $n = 2$ we get any scalar multiple of

$$x = h^2 + 4ef - 2h \equiv c.$$

for higher $n$ we get elements of the form $\alpha c + \beta c^2$, then $\alpha c + \beta c^2 + \gamma c^3$, etc., where $\alpha, \beta, \gamma, \dots$ are arbitrary complex coefficients. of course this leads to the hypothesis that any element $x$ which commutes with $e$, $f$ and $h$ is of the form $p(c)$, where $p$ is an arbitrary complex polynomial.

the proof is based on a change of the original basis $\{e^\alpha f^\beta h^\gamma\}$. we add to the generators $e$, $f$ and $h$ another letter $c$, and to the rewriting rules (9) we add the following:

$$ef \to \tfrac{1}{4}(c + 2h - h^2),\qquad ec \to ce,\qquad fc \to cf,\qquad hc \to ch.$$

the basis now consists of the irreducible elements, i.e.

$$\{c^j e^k h^m \,|\, j,k,m \ge 0\} \cup \{c^j f^l h^m \,|\, j,l,m \ge 0\}.$$

we now take a general element as a linear combination

$$x = \sum_{j,k,m=0}^{n} \beta_{j,k,m}\, c^j e^k h^m + \sum_{j,l,m=0}^{n} \gamma_{j,l,m}\, c^j f^l h^m.$$

when computing the commutators $[x,e]$, $[x,f]$ and $[x,h]$ one can make use of the fact that, for example, $[c^j e^k h^m, e] = c^j [e^k h^m, e] = c^j e^k [h^m, e]$, etc. it is also clear (as opposed to the original case) what is sufficient to show: one must show that the coefficients $\beta_{j,k,m} = 0$ for $(k,m) \neq (0,0)$, and similarly for the $\gamma$'s. this can be seen from (10).
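as an aside (our python sketch, mirroring the engine given in section 2, with the helpers repeated so the snippet runs on its own), the commutation condition (10) for $c$ can be machine-checked by reducing the commutators to normal form with the rules (9):

```python
from collections import defaultdict

def add(x, y, s=1):
    out = defaultdict(int, x)
    for w, c in y.items():
        out[w] += s * c
    return {w: c for w, c in out.items() if c != 0}

def mul(x, y):
    """product of two linear combinations: concatenation of words."""
    out = defaultdict(int)
    for w1, c1 in x.items():
        for w2, c2 in y.items():
            out[w1 + w2] += c1 * c2
    return dict(out)

def normal_form(elem, rules):
    while True:
        hit = next(((w, l) for w in list(elem) for l in rules if l in w), None)
        if hit is None:
            return elem
        w, l = hit
        i, coeff = w.find(l), elem.pop(w)
        repl = {w[:i] + r + w[i + len(l):]: c for r, c in rules[l].items()}
        elem = add(elem, repl, coeff)

RULES = {"fe": {"ef": 1, "h": -1},   # rewriting rules (9)
         "he": {"eh": 1, "e": 2},
         "hf": {"fh": 1, "f": -2}}

C = {"hh": 1, "ef": 4, "h": -2}      # the casimir element c = h^2 + 4ef - 2h
for g in ("e", "f", "h"):
    bracket = add(mul(C, {g: 1}), mul({g: 1}, C), -1)   # [c, g] = cg - gc
    assert normal_form(bracket, RULES) == {}            # each bracket reduces to zero
```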
5 center of u′q(so3)

let us apply the process introduced above to the case of the nonstandard deformation $U'_q(so_3)$. first, let us assume $q^n \neq 1$ for all $n$. using the first relation of (2) as a definition of $I_3$ and substituting it into the second and third relations, we get two cubic relations

$$I_2 I_1^2 - (q+q^{-1})\,I_1 I_2 I_1 + I_1^2 I_2 = -I_2,$$
$$I_2^2 I_1 - (q+q^{-1})\,I_2 I_1 I_2 + I_1 I_2^2 = -I_1.$$

it can be shown that $U'_q(so_3)$ is isomorphic to the algebra with the generators $I_1$, $I_2$ satisfying the two above relations. the casimir element (3) can be rewritten in the form (we omit a scalar factor $q$)

$$C = (q+q^{-1})(I_1^2 + I_1^2 I_2^2 + I_2^2) + I_2 I_1 I_2 I_1 - [3]_q\, I_1 I_2 I_1 I_2.$$

we now construct a different basis of the algebra $U'_q(so_3)$. we deal with the following rewriting rules:

$$I_2 I_1^2 \to (q+q^{-1})\,I_1 I_2 I_1 - I_1^2 I_2 - I_2,$$
$$I_2^2 I_1 \to (q+q^{-1})\,I_2 I_1 I_2 - I_1 I_2^2 - I_1,$$
$$I_2 I_1 I_2 I_1 \to C + [3]_q\, I_1 I_2 I_1 I_2 - (q+q^{-1})(I_1^2 + I_1^2 I_2^2 + I_2^2),$$
$$I_2 C \to C I_2,\qquad I_1 C \to C I_1.$$

it can easily be seen that precisely the elements

$$C^\gamma I_1^\alpha (I_2 I_1)^k I_2^\beta,\qquad \gamma, \alpha, \beta \ge 0,\ k \in \{0,1\},$$

are irreducible, thus forming a new linear basis of the algebra. now it is sufficient to take an arbitrary element $x$ of the form

$$x = \sum_{\alpha,\beta,k} p_{\alpha,\beta,k}(C)\, I_1^\alpha (I_2 I_1)^k I_2^\beta,$$

where the $p_{\alpha,\beta,k}$ are arbitrary complex polynomials of one variable, commute $x$ with $I_1$, $I_2$, and by virtue of its equality to zero show that all polynomials $p_{\alpha,\beta,k} = 0$, with the only exception $p_{0,0,0}$.

when $q$ is a primitive root of unity, $q^n = 1$, the center of the algebra does not consist of the ordinary casimir element (3) only; there are three more elements, having the form

$$C_{n1} = \sum_{j=0}^{\lfloor (n-1)/2 \rfloor} \binom{n-j}{j} \frac{n}{n-j} \left(\frac{i}{q-q^{-1}}\right)^{2j} I_1^{\,n-2j}$$

(and the same polynomials in $I_2$ and $I_3$, denoted by $C_{n2}$, $C_{n3}$). it is not an easy task to show that these elements really belong to the center of the algebra (see [4]). after some transformation one can see that $C_{nj}$, $j = 1,2,3$, are actually chebyshev polynomials.

it turns out that these four casimir elements are no longer polynomially independent. when one wants to prove this fact, it may come in handy to have the explicit polynomial dependence for small values of $n$. however, it turns out that it is practically impossible to obtain the relation between the casimir elements by "brute force", even for the simplest cases. for example, for $n = 3$ one can show that

$$C^3 - qC^2 - C_{31}^2 - C_{32}^2 - C_{33}^2 + 3(q + q^{1/2})\,C_{31}C_{32}C_{33} = 0. \quad (11)$$

note that $C^3$ is of degree nine, so it is quite complicated even to prove the given explicit relation. the diamond lemma can be quite useful for obtaining relations of type (11) directly. we must only construct a suitable set of rewriting rules and, with its help, a new advantageous basis of the algebra. first we start with the ordinary rewriting rules coming from the commutation relations, namely

$$I_2 I_1 \to q\,I_1 I_2 - q^{1/2} I_3,\qquad I_3 I_2 \to q\,I_2 I_3 - q^{1/2} I_1,\qquad I_3 I_1 \to q^{-1} I_1 I_3 + q^{-1/2} I_2,$$
$$I_1 C \to C I_1,\qquad I_2 C \to C I_2,\qquad I_3 C \to C I_3.$$

the powers of the generators can be reduced using the casimir elements $C_{nk}$; we therefore add the relations

$$I_k^n \to C_{nk} - \sum_{j=1}^{\lfloor (n-1)/2 \rfloor} \binom{n-j}{j} \frac{n}{n-j} \left(\frac{i}{q-q^{-1}}\right)^{2j} I_k^{\,n-2j},\qquad k = 1,2,3.$$

next we must ensure that the casimir element $C$ does not appear too many times. this is ensured by the relation

$$aCb \to a\,\left(q^2 I_1^2 + I_2^2 + q^2 I_3^2 - (q^{5/2} - q^{1/2})\,I_1 I_2 I_3\right)\,b,$$

where $a$, $b$ are arbitrary monomials, and it comes into play only if the count of the letters $I_k$ ($k = 1,2,3$) and $C$ in the monomial $aCb$ is greater than or equal to $n$. the last rule (or, better to say, set of rules), which we do not present explicitly, transforms $C^j w \to C^{j+1} \tilde{w}$, where $w$ is any product of the generators $I_k$, $k = 1,2,3$, with the property that every generator $I_1$, $I_2$, $I_3$ is present in the product. using the commutation relations, $w$ is transformed to the form $I_1 I_2 I_3 w_1$, and the product $I_1 I_2 I_3$ is then converted to $C$ using the rule

$$I_1 I_2 I_3 \to (q^{5/2} - q^{1/2})^{-1}\left(q^2 I_1^2 + I_2^2 + q^2 I_3^2 - C\right).$$

these transformation rules now lead to a basis containing the following elements:

$$C_{n1}^\alpha C_{n2}^\beta C_{n3}^\gamma C^\delta I_1^k I_2^l I_3^m,\qquad \alpha, \beta, \gamma \ge 0,\quad 0 \le \delta \le n-1,\quad 0 \le k,l,m \le n-1-\delta,\quad klm = 0.$$

if we want to find the explicit relation between the casimir elements, we simply express $C^n$ in the basis specified (using a finite number of rewriting rules reducing $C^n$ to canonical form).
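for concreteness (our sketch, not from the paper), the integer coefficients in the closed form for $C_{n1}$ quoted above are easy to tabulate:

```python
from fractions import Fraction
from math import comb

def cn1_coefficients(n):
    """map: power of i_1 -> coefficient of (i/(q - q^{-1}))^{2j} i_1^{n-2j}
    in the closed form for c_{n1} quoted above."""
    return {n - 2 * j: comb(n - j, j) * Fraction(n, n - j)
            for j in range((n - 1) // 2 + 1)}

# cn1_coefficients(3) == {3: 1, 1: 3}
# cn1_coefficients(5) == {5: 1, 3: 5, 1: 5}   (chebyshev-type integer coefficients)
```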
in general, one can show that for odd $n$ the elements $C$, $C_{n1}$, $C_{n2}$, $C_{n3}$ fulfil the relation

$$C_{n1}^2 + C_{n2}^2 + C_{n3}^2 + (q-q^{-1})^n\, C_{n1}C_{n2}C_{n3} = \prod_{k=0}^{n-1}\left(C + q\,[k]_q\,[k+1]_q\right).$$

the relation for even $n$, as well as the proof of this, can be found in [7].

acknowledgement

we acknowledge financial support from grant 201/10/1509 of the czech science foundation and msm6840770039 of the ministry of education, youth, and sports of the czech republic.

references

[1] lusztig, g.: introduction to quantum groups, birkhäuser, boston, 1993.
[2] klimyk, a. u., schmüdgen, k.: quantum groups and their representations, springer, berlin, 1997.
[3] tange, r.: the centre of quantum sln at a root of unity, j. algebra 301 (2006), 425–445.
[4] havlíček, m., pošta, s.: on the classification of irreducible finite-dimensional representations of u′q(so3) algebra, j. math. phys. 42 (2001), 472–500.
[5] havlíček, m., klimyk, a. u., pošta, s.: representations of the cyclically symmetric q-deformed algebra soq(3), j. math. phys. 40 (1999), 2135–2161.
[6] bergman, g. m.: the diamond lemma for ring theory, adv. math. 29 (1978), 178–218.
[7] havlíček, m., pošta, s.: center of algebra u′q(so3), submitted to j. math. phys.

prof. ing. miloslav havlíček, drsc.
e-mail: miloslav.havlicek@fjfi.cvut.cz
doc. ing. severin pošta, ph.d.
e-mail: severin.posta@fjfi.cvut.cz
department of mathematics
fnspe, czech technical university in prague
trojanova 13, cz-120 00 prague

search for a correlation between radio giant pulses and vhe photons of the crab pulsar

n. lewandowska, d. elsäesser, k. mannheim

abstract
the crab pulsar is a unique source of pulsar radio emission. its regular pulse structure is visible over the entire electromagnetic spectrum from radio to gev ranges. among the regular pulses, radio giant pulses (gps) are known as a special form of pulsar radio emission. although the crab pulsar was discovered by its gps, their origin and emission mechanisms are currently not understood. within the framework of this report we give a review of radio gps and present a new idea on how to examine the characteristics of this as yet not understood kind of pulsar emission.

keywords: neutron star, pulsar, crab pulsar, regular pulses, giant pulses.

1 introduction

embedded in the supernova remnant sn 1054, the crab pulsar (psr b0531+21) is currently the only pulsar known with a pulsed emission structure seen over the entire electromagnetic spectrum. it consists of a main pulse (p1) and an interpulse (p2) occurring at the rotational phases of 70 and ≈ 110 degrees, respectively [1]. apart from these two regular pulses, further pulsed structures are visible at various frequency ranges. the precursor, for example, is a component which has only been observed from about 300 to 600 mhz. at higher frequencies, two additional high frequency components known as hfc 1 and hfc 2 are visible from ≈ 4000 to 8000 mhz, simultaneously with a phase shift of the interpulse by about 10 degrees [1]. while the origin of p1 and p2 can be partly described by current pulsar theories, the origin of the precursor and hfc components still remains a mystery. in addition to this regular pulsed structure, the crab pulsar is also a known source of radio gps. due to their properties, gps are an unusual and exotic form of pulsar radio emission.
they have been observed in a wide frequency range from 23 mhz [5] to 15.1 ghz [6,7] at the phases of p1 and p2, and distinguish themselves by flux densities that are higher by a factor of $5 \times 10^5$ and pulse widths of 1–2 microseconds [8]. several studies also confirm the occurrence of gps at both hfc components [9,6]. nevertheless, none have been verified at the phase of the precursor. the emission of gps therefore seems to be phase bounded.

differences between gps occurring at p1 and p2 were discovered by observations with the arecibo single dish telescope (http://www.naic.edu/) [11]. these observations indicate that the time and frequency patterns of gps at p1 are different from those at p2 at frequencies above 4 ghz. while giant main pulses (gmps) consist of narrow-band pulses of nanosecond duration, giant interpulses (gips) reveal narrow emission bands with durations in the microsecond range. these differences in gps at p1 and p2 possibly arise from different emission mechanisms underlying their development. additionally, they contradict all previous emission theories of the crab pulsar that assume a similar development process of both pulses.

theoretical aspects of gps have been broadly investigated [15,16,4,17,18]. nevertheless, in spite of over 40 years since the detection of the crab pulsar by its gps [2], their possible origin and emission mechanism is still not understood. the only current model based on observational data refers to gips above 4 ghz. the lyutikov model [12] reconstructs the emission bands of gips under the assumption of a higher particle density on closed magnetic field lines, in contrast with the standard goldreich-julian model [13]. according to [14], this density is highest near the last closed magnetic field line, at which a lorentz beam develops due to magnetic reconnection events. while it moves along the closed field line, it dissipates by curvature radiation. furthermore, the lyutikov model also predicts the occurrence of γ-radiation together with radio gps, and provides the motivation for simultaneous observations at γ-wavelengths. currently, no universal model is available for crab radio gps, since the lyutikov model is only applicable to gips above 4 ghz, where the p2 component changes its position by 10 degrees [1].

with apparently sporadic, short pulses of this kind, the crab pulsar, together with its twin pulsar psr b0540-69 in the large magellanic cloud (lmc), belongs to a small group of 11 pulsars which are known to emit radio gps. this group also consists of ordinary and millisecond pulsars (msps) [19], and it was thought that a common feature of them could be a high magnetic field at the light cylinder. however, no uniform accordance in all 11 pulsars could be found, and this makes the gp phenomenon a still enigmatic feature of pulsar radio emission.

2 multiwavelength observations

the incentive of multiwavelength (mwl) observations is to deduce the central emission mechanism underlying the gp phenomenon. currently it is assumed that radio gps could be caused by coherent emission, by pair production processes or by changes in the beaming direction. to shed more light on this topic, several mwl observation campaigns were carried out. radio gps and γ-ray photons observed simultaneously with the green bank 43 m telescope and osse were examined in [10], but no correlation could be verified. a weak correlation between radio gps and optical photons was verified by shearer et al.
[20], who observed the crab pulsar simultaneously with the westerbork synthesis radio telescope (wsrt) and with the triffid optical photometer. they observed an increase in the optical flux by ≈ 3 % during the occurrence of radio gps, which proves that an additional non-coherent emission process accounts for the gp emission. to examine the possible mwl occurrence of gps to a wider extent, further mwl studies at radio, optical and also γ-wavelengths are necessary in order to see if pair production processes, for example, are involved in the gp emission.

2.1 fermi lat

arguing that the observations in [10] were based on insufficient sensitivity, several observation campaigns of the crab pulsar were carried out with the gbt and fermi lat (http://www-glast.stanford.edu/) to examine the assumptions of the lyutikov model [21–23]. with a collection area of 0.8 m², a total of 77 fermi photons were detected in more than 10 hours of observations in the energy range between 100 mev and 5 gev, simultaneously with over 210 000 radio gps at a frequency of 8.9 ghz (figure 1) [22,23]. in each case, a search was made for a correlation between the gp rate and single fermi photons, in addition to a change in the γ-ray flux around single gps. however, bilous et al. conclude that with 95 % probability the energy flux in a 30 ms time window is not higher than 6 times the average flux, which suggests that coherent emission is the responsible mechanism for gips (giant interpulses, see the introduction). one of their conclusions refers to the possible existence of a correlation at very high energies ≥ 100 gev, at which a decisive number of photons for a correlation analysis cannot be provided in a reasonable time span by fermi lat [21].

fig. 1: distribution of gp peak flux density and energy of γ-ray photons detected by fermi over the pulsar rotational phase [bilous et al. (2010)]

2.2 cherenkov telescopes

one key question resulting from the observations with fermi lat is whether the correlation does not exist at higher energies. at this point, imaging air cherenkov telescopes (iacts) with a general sensitivity > 60 gev are essential. telescopes of this kind observe γ-rays indirectly through the detection of extensive air showers. when a γ-ray reaches the atmosphere of the earth, it strikes one of its molecules and produces a cascade of secondary particles. these secondary particles produce cherenkov radiation, moving nearly at the speed of light at a height of about 10–20 km in the atmosphere. the cherenkov light is emitted in the form of a cone around the direction of the primary particle (figure 2). these brief flashes of cherenkov radiation are imaged by iacts. with a primary mirror 17 m in diameter in each case, the major atmospheric gamma imaging cherenkov (magic, http://magic.mppmu.mpg.de/) telescopes are currently the biggest iacts worldwide. among other iacts, for example cangaroo iii, h.e.s.s. and veritas, the magic telescopes were the first to detect the pulsed emission of the crab pulsar at an energy threshold of 25 gev, provided by a special trigger system [24]. their large mirrors enable the detection of single vhe photons on short time scales. due to the short widths of radio gps, an accurate timing system down to at least microseconds is needed for a correlation analysis with vhe photons. employing the global positioning system, magic provides a time stamp with an accuracy of 200 ns for each single vhe photon, and permits the user to decide whether it arose simultaneously with a radio gp.
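in its simplest form, such a coincidence analysis is a windowed nearest-neighbour search between the two time series. a minimal sketch (python; the window width, the uniform scrambling null model and all names are our assumptions, not part of this report):

```python
import numpy as np

def coincidence_count(gp_times, photon_times, window=1e-5):
    """count vhe photons within +-window seconds of the nearest radio gp;
    both inputs are barycentred arrival times in seconds."""
    gp = np.sort(np.asarray(gp_times, dtype=float))
    ph = np.asarray(photon_times, dtype=float)
    idx = np.searchsorted(gp, ph)
    left = gp[np.clip(idx - 1, 0, gp.size - 1)]
    right = gp[np.clip(idx, 0, gp.size - 1)]
    return int(np.sum(np.minimum(np.abs(ph - left), np.abs(ph - right)) <= window))

def chance_level(gp_times, photon_times, span, window=1e-5, trials=1000, seed=0):
    """estimate the chance coincidence count by scattering the photon times
    uniformly over the observation span (a simple monte-carlo null hypothesis)."""
    rng = np.random.default_rng(seed)
    counts = [coincidence_count(gp_times,
                                rng.uniform(0.0, span, len(photon_times)),
                                window)
              for _ in range(trials)]
    return float(np.mean(counts)), float(np.std(counts))
```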
thus magic affords a unique opportunity to search for vhe photons coinciding with gps at a sensitivity exceeding previous studies using e.g. fermi lat, and to test various models dealing with the possible generation of crab gps [15,16,4,17,18,12].

fig. 2: illustration of the air shower technique (http://icc.ub.edu/gp oa.php)

references

[1] moffett, d. a., hankins, t. h.: astrophysical journal, 1996, vol. 468, p. 779.
[2] staelin, d. h., reifenstein, e. c., iii: science, 1968, vol. 162 (3861), p. 1481–1483.
[3] soglasnov, v.: proceedings of the 363rd we-heraeus seminar on neutron stars and pulsars 40 years after the discovery, 2007, mpe-report 291, p. 68.
[4] hankins, t. h.: nature, 2003, vol. 422 (6928), p. 141–143.
[5] popov, m. v., kuzmin, a. d., ulyanov, o. m., deshpande, a. a., ershov, a. a., kondratiev, v. i., kostyuk, s. v., losovsky, b. ya., soglasnov, v. a., zakharenko, v. v.: on the present and future of pulsar astronomy, 26th meeting of the iau, joint discussion 2, 2006.
[6] hankins, t. h.: proceedings of the 177th colloquium of the iau held in bonn, 2000, vol. 202, p. 165.
[7] jessner, a., popov, m. v., kondratiev, v. i., kovalev, y. y., graham, d., zensus, a., soglasnov, v. a., bilous, a. v., moshkina, o. a.: astronomy and astrophysics, 2010, vol. 524, id. a60.
[8] kuzmin, a. d.: astrophysics and space science, 2007, vol. 308, 1–4, p. 563–56.
[9] jessner, a., slowikowska, a., klein, b., lesch, h., jaroschek, c. h., kanbach, g., hankins, t. h.: advances in space research, 2005, vol. 35, 6, p. 1166–1171.
[10] lundgren, s. c., cordes, j. m., ulmer, m., matz, s. m., lomatch, s., foster, r. s., hankins, t.: astrophysical journal, 1995, vol. 435, p. 433.
[11] hankins, t. h., eilek, j. a.: astrophysical journal, 2007, vol. 670, 1, p. 693–701.
[12] lyutikov, m.: monthly notices of the royal astronomical society, 2007, vol. 381, 3, p. 1190–1196.
[13] goldreich, p., julian, w. h.: astrophysical journal, 1969, vol. 157, p. 869.
[14] gruzinov, a.: physical review letters, 2005, vol. 94, 2, id. 021101.
[15] mikhailovskii, a. b., onishchenko, o. g., smolyakov, a. i.: soviet astr. lett. (tr: pisma), 1985, vol. 11, no. 2, p. 78.
[16] weatherall, j. c.: astrophysical journal, 2001, vol. 559, 1, p. 196–200.
[17] petrova, s. a.: astronomy and astrophysics, 2004, vol. 424, p. 227–236.
[18] istomin, y. n.: astronomical society of the pacific, 2004, p. 369.
[19] slowikowska, a., jessner, a., kanbach, g., klein, b.: proceedings of the 363rd we-heraeus seminar on neutron stars and pulsars 40 years after the discovery, 2007, mpe-report 291, p. 64.
[20] shearer, a., stappers, b., o'connor, p., golden, a., strom, r., redfern, m., ryan, o.: science, 2003, vol. 301, 5632, p. 493–495.
[21] bilous, a. v., kondratiev, v. i., mclaughlin, m. a., mickaliger, m., lorimer, d. r., ransom, s. m., lyutikov, m., stappers, b., langston, g. i.: 2009 fermi symposium, econf proceedings c091122.
[22] bilous, a. v., kondratiev, v. i., mclaughlin, m. a., ransom, s. m., lyutikov, m., mickaliger, m., stappers, b., langston, g. i.: proceedings of the iskaf2010 science meeting, june 10–14, 2010.
[23] bilous, a. v., kondratiev, v. i., mclaughlin, m. a., ransom, s. m., lyutikov, m., mickaliger, m., langston, g. i.: astrophysical journal, 2011, vol. 728, 110.
[24] aliu, e., et al.: science, 2008, vol. 322, 5905, p. 1221.

natalia lewandowska
dominik elsäesser
karl mannheim
university of würzburg
application of wavelet transform for image denoising of spatially and time variable astronomical imaging systems

m. blažek, e. anisimova, p. páta

abstract
we report on our efforts to formulate algorithms for image signal processing with the spatially and time variant point-spread function (psf) and inhomogeneous noise of real imaging systems. in this paper we focus on the application of the wavelet transform for denoising of astronomical images taken under complicated conditions. these conditions influence above all the accuracy of the measurements and the new-source detection ability. our aim is to test the usefulness of the wavelet transform (as the standard image processing technique) for astronomical purposes.

keywords: point-spread function, image processing, iraf, daophot, wide-field, noise reduction, wavelet, astronomy.

1 introduction

our research reflects the need for effective data processing from systems with a complicated psf or an important role of noise, like wide-field cameras or precise astronomical telescopes (e.g. maia [1], bart, d50 [2] and bootes [3]). a large amount of data is discarded because of big errors during data reduction (e.g. information found on the edges of the image from fish-eye type cameras). standard photometrical packages (e.g. iraf [4]) include algorithms for psf extraction dependent linearly or quadratically on the position in the image [5]. however, this does not change the analytic part of the psf, only the fitting residua occurring in the lookup table (the experimental part of the constructed psf) depending on the image cursor, where the convolution proceeds. this model, however, does not reshape the stars at the edges of wide-field images (an example of such an asymmetric psf is shown in figure 1), although it is useful in crowded fields, as shown in figure 2.

fig. 1: example of the asymmetric stellar point-spread function from the edge of a wide-field camera

the formulation of new space-dependent noise models (other than a simple darkframe) and automatic variant psf construction is above all useful for wide-field detectors. with the new algorithms, higher precision (sensitivity) and lower data littering can be obtained. all-sky cameras for gamma-ray burst or meteor detections require optoelectronic analysis specific for each system because of the different dependence of the parameters. automatic data reduction usually does not include a time variant psf because of the slow changes of the external conditions (camera cooling, atmosphere, direction of observation), which become important during long and precise measurements. the parameters of psf photometry (in comparison to aperture photometry) have to be acquired manually. algorithms for evaluating them automatically on series of images will enable faster and more precise photometry if we assume similar and slowly changing conditions of the observations. typically, one-by-one image analysis does not expect the dependence of the psf parameters on the previous images. aberration modelling using zernike polynomials [6] can provide better results for space-variant systems, while the fourier or wavelet transform can provide guidance for time-variant analysis of the psf.

2 wavelet transform for astronomic image processing

the discrete wavelet transform [7] is widely used in image processing to get rid of noise, while it is not commonly used in astronomy, as higher uncertainties can occur. we tested the application of this method on stellar field data taken by the robotic telescope 'd50' at the ondrejov observatory (cz).
fig. 2: methods of psf construction from an over-crowded stellar field (ngc 6791), significant coma present. the second column represents look-up tables, i.e. additive corrections from the analytic function to the observed empirical psf [5]. i. row – analytic function only; ii. row – constant psf over the image, containing both the analytic function and corrections; iii. row – look-up tables depending linearly on position; iv. row – look-up tables depending quadratically on position

the wavelet transform was performed with matlab software (http://www.mathworks.com/), while iraf software (http://iraf.noao.edu/) was used for astronomical processing with the daophot package [5]. we focused on 3 specific targets:

• denoising
• stellar magnitude changes
• the influence of the wavelet transform on the efficiency of the new object detection algorithms

five specific wavelet families were used for the tests: daubechies, biorthogonal, reverse biorthogonal, symlets and coiflets, with the decomposition level up to 3 [8]. those families were chosen to cover both orthogonal and biorthogonal wavelet shapes and the most used wavelet families in standard image signal processing. only soft thresholding [9] was applied, due to its better peak signal to noise ratio for image denoising, and no thresholding was done to the approximation coefficients of the wavelet dyadic decomposition. the test image is shown in figure 3. the efficiency of the wavelet transform on the test image for each wavelet was checked using the daophot routines [5] under the iraf software package and with standard aperture photometry algorithms.

fig. 3: test image for wavelet transform

2.1 results of the denoising

we observed changes of the standard deviation of the sky (background) value depending strongly on the decomposition level and the intensities of the nearby stars. the first level of decomposition shows only small variations among the wavelets that were used, lowering the noise dispersion by 40 %–50 %. even greater efficiency can be gained around faint stars for the second decomposition level, with higher differences in the influence between the wavelets. for the third decomposition level, the influence of the denoising on the astronomical image was even negative for several wavelets (e.g. most of the daubechies and reverse biorthogonal wavelets) because of the strong artifacts of the transform (figure 4). all wavelet denoising results can be seen in table 1. the first and second column of the table identify the wavelet that was used. the first column for each decomposition level shows the mean percentage lowering (effectivity $E$) of the background noise deviation $\sigma$ around stars according to the equation

$$E = 1 - \frac{\sigma_a}{\sigma_b}, \quad (1)$$

where the subscripts $b$ and $a$ represent the background noise deviation before the application of the wavelet transform and after the transform, respectively. the second and third column for each decomposition level in table 1 show the maximum $\delta m_{max}$ and mean $\delta m$ errors of the photometric magnitudes after the wavelet transform. the signal-to-noise ratio of the stars selected for the statistics is from 150 to 3 100.
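for orientation, the core of such a denoising test is only a few lines of code. a minimal sketch (python with the pywavelets package, whereas the authors used matlab; the k-sigma soft-threshold rule and the parameter defaults are our assumptions, not the paper's exact settings):

```python
import numpy as np
import pywt

def denoise(image, wavelet="db2", level=2, k=3.0):
    """soft-threshold the detail coefficients of a 2-d dyadic wavelet
    decomposition, leaving the approximation untouched (as in the text)."""
    coeffs = pywt.wavedec2(image, wavelet, level=level)
    # noise scale from the finest diagonal detail band (donoho's estimator)
    sigma = np.median(np.abs(coeffs[-1][-1])) / 0.6745
    out = [coeffs[0]]  # approximation coefficients are not thresholded
    for bands in coeffs[1:]:
        out.append(tuple(pywt.threshold(b, k * sigma, mode="soft")
                         for b in bands))
    return pywt.waverec2(out, wavelet)

def efficiency(sigma_b, sigma_a):
    """background-noise reduction e = 1 - sigma_a / sigma_b, equation (1)."""
    return 1.0 - sigma_a / sigma_b
```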
2.2 magnitude changes

most of the photometrical magnitudes were unchanged or changed only slightly during the wavelet transform, and all of those changes were hidden inside the 3-σ band of the counted instrumental magnitudes. from this point of view, the wavelet transform can therefore be considered as a low-loss modification of scientific images with a low influence on the photometrical information. detailed errors of the magnitude changes can be seen in table 1.

2.3 detection of new sources

the same detection algorithms were used for the transformed images as for the original images, both for the search for new object candidates and for the pairing with their counterparts in the astronomical catalogues. however, no new sources were found using the daophot routines [5]. the influence of the wavelets on the point-spread function of the image was unfortunately too strong to get any new information from the noise level. no advantages for the new source detection algorithms were therefore obtained from the wavelet denoising. future research is therefore suggested.

fig. 4: effect of the wavelet transform on (a) the original stellar field, (b) the daubechies 3 wavelet, third decomposition level, (c) the reverse biorthogonal 3.3 wavelet, third decomposition level

table 1: efficiency e [%] of the denoising of the background deviation of the stellar surroundings, eq. (1), and maximum δm_max [mag] and mean δm [mag] error of photometry after application of the given wavelet transform; the three column groups correspond to decomposition levels 1, 2 and 3

wavelet                  |  e    δm_max  δm     |  e    δm_max  δm     |  e     δm_max  δm
daubechies 1             |  47   0.026   0.010  |  71   0.052   0.020  |  79    0.098   0.031
daubechies 2             |  47   0.014   0.004  |  73   0.030   0.012  |  55    0.067   0.026
daubechies 3             |  47   0.005   0.002  |  69   0.046   0.014  |  35    0.102   0.030
daubechies 4             |  48   0.008   0.002  |  60   0.037   0.010  |  24    0.092   0.029
daubechies 5             |  48   0.012   0.002  |  57   0.020   0.006  |  22    0.061   0.026
daubechies 6             |  48   0.012   0.003  |  43   0.021   0.008  |  1     0.094   0.034
daubechies 7             |  46   0.011   0.003  |  39   0.039   0.007  |  4     0.114   0.032
daubechies 8             |  44   0.008   0.002  |  49   0.013   0.005  |  10    0.090   0.034
daubechies 9             |  45   0.008   0.002  |  34   0.021   0.007  |  −7    0.118   0.037
daubechies 10            |  45   0.014   0.002  |  28   0.036   0.008  |  −10   0.122   0.039
biorthogonal 1.1         |  47   0.026   0.010  |  71   0.052   0.020  |  79    0.098   0.031
biorthogonal 1.3         |  45   0.007   0.002  |  72   0.017   0.006  |  58    0.061   0.016
biorthogonal 1.5         |  43   0.004   0.001  |  70   0.012   0.002  |  51    0.044   0.012
biorthogonal 2.2         |  48   0.010   0.004  |  69   0.060   0.016  |  71    0.110   0.023
biorthogonal 2.4         |  48   0.006   0.001  |  70   0.025   0.009  |  66    0.051   0.017
biorthogonal 2.6         |  48   0.008   0.001  |  69   0.032   0.007  |  54    0.055   0.016
biorthogonal 2.8         |  48   0.009   0.002  |  71   0.014   0.004  |  56    0.040   0.012
biorthogonal 3.1         |  34   0.012   0.005  |  45   0.076   0.016  |  38    0.102   0.018
biorthogonal 3.3         |  45   0.008   0.002  |  67   0.025   0.007  |  58    0.028   0.008
biorthogonal 3.5         |  48   0.010   0.002  |  67   0.037   0.008  |  53    0.070   0.012
biorthogonal 3.7         |  48   0.011   0.002  |  68   0.015   0.005  |  –     –       –
biorthogonal 3.9         |  48   0.012   0.002  |  61   0.018   0.005  |  47    0.033   0.009
biorthogonal 4.4         |  48   0.012   0.002  |  68   0.025   0.009  |  37    0.077   0.026
biorthogonal 5.5         |  44   0.009   0.001  |  64   0.018   0.006  |  −15   0.080   0.026
biorthogonal 6.8         |  47   0.014   0.003  |  65   0.018   0.006  |  11    0.064   0.024
coiflets 1               |  48   0.010   0.003  |  71   0.053   0.014  |  61    0.099   0.026
coiflets 2               |  48   0.010   0.002  |  66   0.043   0.009  |  26    0.083   0.027
coiflets 3               |  47   0.013   0.003  |  55   0.039   0.007  |  15    0.099   0.025
coiflets 4               |  47   0.014   0.003  |  42   0.041   0.007  |  1     0.090   0.028
coiflets 5               |  47   0.014   0.003  |  30   0.041   0.007  |  −10   0.105   0.027
symlets 2                |  47   0.014   0.004  |  73   0.030   0.012  |  55    0.067   0.026
symlets 3                |  47   0.005   0.002  |  69   0.046   0.014  |  35    0.102   0.030
symlets 4                |  47   0.008   0.002  |  66   0.046   0.011  |  31    0.086   0.026
symlets 5                |  45   0.007   0.002  |  62   0.045   0.010  |  23    0.119   0.028
symlets 6                |  47   0.011   0.003  |  61   0.015   0.005  |  18    0.060   0.023
symlets 7                |  46   0.015   0.002  |  51   0.165   0.009  |  –     0.068   0.028
symlets 8                |  47   0.012   0.003  |  43   0.045   0.008  |  2     0.099   0.029
reverse biorthogonal 1.1 |  47   0.026   0.010  |  71   0.052   0.020  |  79    0.098   0.031
reverse biorthogonal 1.3 |  44   0.015   0.004  |  68   0.054   0.016  |  45    0.118   0.031
reverse biorthogonal 1.5 |  43   0.006   0.002  |  65   0.034   0.007  |  23    0.106   0.021
reverse biorthogonal 2.2 |  51   0.009   0.002  |  66   0.044   0.010  |  28    0.087   0.020
reverse biorthogonal 2.4 |  49   0.007   0.002  |  67   0.018   0.007  |  11    0.057   0.020
reverse biorthogonal 2.6 |  47   0.010   0.002  |  57   0.035   0.008  |  14    0.067   0.022
reverse biorthogonal 2.8 |  47   0.011   0.002  |  47   0.017   0.005  |  8     0.063   0.019
reverse biorthogonal 3.1 |  34   0.006   0.002  |  11   0.019   0.004  |  −115  0.039   0.009
reverse biorthogonal 3.3 |  45   0.004   0.002  |  32   0.015   0.004  |  −85   0.030   0.010
reverse biorthogonal 3.5 |  47   0.006   0.002  |  36   0.020   0.006  |  −53   0.057   0.014
reverse biorthogonal 3.7 |  48   0.009   0.002  |  33   0.015   0.005  |  −9    0.043   0.011
reverse biorthogonal 3.9 |  48   0.010   0.002  |  28   0.019   0.006  |  −9    0.055   0.016
reverse biorthogonal 4.4 |  48   0.009   0.002  |  67   0.015   0.006  |  24    0.046   0.019
reverse biorthogonal 5.5 |  48   0.013   0.003  |  64   0.010   0.003  |  12    0.063   0.023
3 conclusion

space-dependent psf and noise models are necessary for precise and complete data reduction from imaging systems with wide-field lenses. we therefore plan to measure the optoelectronic characteristics and describe the geometric distortion of all-sky detectors in order to test and formulate new algorithms of variant noise and psf. the advantage of the denoising algorithms should namely be new discoveries of objects hidden in the noise. although the background noise distribution was positively modified with most of the wavelet transforms, no advantage for the new source detection algorithms has been found yet. in our future effort we would also like to focus on the detection algorithms for the transformed astronomical images, to obtain photometric methods possibly useful in practice.

acknowledgement

this project was supported by the czech technical university in prague grant sgs cvut 2010 – ohk3-066/10.

references

[1] vítek, s., koten, p., páta, p., fliegel, k.: double-station automatic video observation of the meteors, advances in astronomy 2010, article id 943145, 4 pages (2010).
[2] nekola, m. et al.: robotic telescopes for high energy astrophysics in ondřejov, experimental astronomy, vol. 28, 2010, issue 1, p. 79–85.
[3] castro-tirado, a. j. et al.: the burst observer and optical transient exploring system (bootes), a&a suppl., vol. 138, 1999, p. 583–585.
[4] massey, p.: a user's guide to ccd reductions with iraf, noao (1997).
[5] stetson, p. b.: daophot – a computer program for crowded-field stellar photometry, astronomical society of the pacific, publications, vol. 99, 1987, p. 191–222.
[6] řeřábek, m., páta, p., koten, p.: processing of the astronomical image data obtained from uwfc optical systems, image reconstruction from incomplete data v, proc. spie 7076, 70760l (2008).
[7] mallat, s.: a theory for multiresolution signal decomposition: the wavelet representation, ieee pattern anal. and machine intell., vol. 11, no. 7, 1989, pp. 674–693.
[8] daubechies, i.: ten lectures on wavelets, cbms-nsf regional conference series in applied mathematics, lectures delivered at the cbms conference on wavelets, university of lowell, mass., june 1990, philadelphia: society for industrial and applied mathematics – siam (1992).
[9] donoho, d. l.: de-noising by soft-thresholding, ieee trans. on inf. theory, vol. 41, 3, 1995, pp. 613–627.

martin blažek
e-mail: blazem10@fel.cvut.cz
elena anisimova
petr páta
czech technical university in prague
faculty of electrical engineering
technická 2, prague 6

integration of autonomous uavs into multi-agent simulation

martin selecký (1), tomáš meiser (2)
(1) dept. of cybernetics, faculty of electrical engineering, czech technical university, technická 2, 166 27 prague, czech republic
(2) dept. of computer science, faculty of electrical engineering, czech technical university, technická 2, 166 27 prague, czech republic
corresponding author: martin.selecky@agents.fel.cvut.cz

abstract
in recent years, unmanned aerial vehicles (uavs) have attracted much attention both in the research field and in the field of commercial deployment.
researchers recently started to study problems and opportunities connected with the usage, deployment and operation of teams of multiple autonomous uavs. these multi-uav scenarios are by their nature well suited to be modelled and simulated as multi-agent systems. in this paper we present solutions to the problems that we had to deal with in the process of integrating two hardware uavs into an existing multi-agent simulation system with additional virtual uavs, resulting in a mixed reality system where hardware uavs and virtual uavs can co-exist, coordinate their flight and cooperate on common tasks. the hardware uavs are capable of on-board planning and reasoning, and can cooperate and coordinate their movement with one another, and also with the virtual uavs.

keywords: unmanned autonomous vehicles, multi-agent systems, deployment.

1 introduction

the use of uavs is growing nowadays thanks to the low cost of deploying and maintaining them, and the possibility of operating them in areas inaccessible or dangerous for human pilots. to manage more sophisticated tasks, such as area surveillance and monitoring or multiple target tracking, teams of multiple uavs should be deployed. this requires more complex control, coordination and cooperation mechanisms. these mechanisms have already been studied and developed for virtual uavs as a part of multi-agent systems, e.g. in the agentfly system [8] or the mas proposed by baxter and horn in [1], which would be very well suited for application in real hardware uavs.

a further significant challenge in working with hardware air vehicles is that the design/test cycle is considerably longer than for ground robots or virtual entities. flight experiments involve a greater level of risk, since even minor errors can lead to serious crashes. for this reason mixed reality simulations, which allow real hardware uavs and simulated virtual uavs to co-exist in one environment, are helpful and necessary in the development process.

the principles of multi-agent systems have been used in [2], [3] and [6] to issue commands to hardware uavs. in these works, however, central planning mechanisms are used, and the uavs only follow precomputed plans.

figure 1: agentfly multi-agent system.

bürkle et al. [4] presented the deployment of control mechanisms for teams of hardware vtol micro uavs. these uav teams are capable of on-board planning and some cooperation on mutual tasks, but they do not coordinate their flight in terms of collision avoidance. the system also does not allow the coexistence of hardware uavs together with simulated uavs.

we integrated two hardware fixed wing uavs into the agentfly multi-agent system [8], which is used for simulating uavs and air traffic, allows complex coordination and cooperation of agents, and provides collision avoidance mechanisms. we modified the system to allow both hardware and virtual uavs to act and interfere in a common mixed reality environment, and we equipped the hardware uavs with a gumstix computer to enable on-board planning, reasoning and communication with other hardware or software uavs.

figure 2: design of a simulated virtual uav.

this paper is organized as follows.
first, we will briefly describe the agentfly system in section 2, and the uavs and their computational and communication equipment in section 3. section 4 shows the modifications made to the agentfly system and hardware equipment for deploying the uavs, and in section 5 we describe the results of field experiments. section 6 concludes the paper.

2 agentfly multi-agent system

agentfly is a java-based multi-agent system designed for the simulation of air traffic and uav missions (see figure 1). it is built on top of the a-globe multi-agent platform [7], and it provides the simulated entities with a communication framework, trajectory planning, and collision avoidance and cooperation mechanisms.

each simulated aircraft in the agentfly system is composed of two software agents — the pilot agent and the plane agent. the pilot agent is responsible for the uav's reasoning — it calls the trajectory planner and handles the communication with other uavs for the purposes of cooperation and collision prevention. the plane agent provides the pilot agent with an interface for high-level plane control, and executes the flight plans. it also simulates the aircraft's on-board instruments, e.g. radar, gps and clock.

the system provides simulated aircraft with mechanisms for peer-to-peer collision avoidance — either a cooperative mechanism, where the uavs exchange their flight plans and negotiate about avoidance manoeuvres, or a non-cooperative mechanism, where the uavs cannot exchange their plans (e.g. because of incompatible protocols) or do not want to exchange them (for example hostile airplanes) and the collision situation is solved by flight trajectory prediction.

figure 3: unicorn uav by procerus technologies.

figure 4: gumstix overo fire on-board computer.

the flight plans consist of a sequence of flight manoeuvres — straight flight, left/right turn, up/down turn. they represent so-called dubins curves [5], which are based on the fact that any two positions (points with directions) in space can be connected by a sequence of a turn of a given radius, a straight segment and another turn. each manoeuvre is represented by the starting point, turn angle or acceleration, speed and duration; a minimal sketch of such a plan structure is given below. during the transition process from software simulation to mixed reality we replaced two of these virtual aircraft with hardware commercial off-the-shelf fixed wing uavs by procerus technologies.
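a minimal sketch of this manoeuvre-based plan representation (python; the field names are ours and only illustrate the structure, not agentfly's actual java api):

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Manoeuvre:
    """one segment of a manoeuvre-based flight plan (field names hypothetical)."""
    start: Tuple[float, float, float]  # starting point (x, y, z) in metres
    turn_angle: float                  # signed heading change in radians; 0 = straight
    acceleration: float                # along-track acceleration in m/s^2
    speed: float                       # entry airspeed in m/s
    duration: float                    # segment duration in seconds

# a dubins-style plan: turn -- straight segment -- turn
FlightPlan = List[Manoeuvre]
plan: FlightPlan = [
    Manoeuvre((0.0, 0.0, 100.0), 1.2, 0.0, 15.0, 8.0),
    Manoeuvre((95.0, 60.0, 100.0), 0.0, 0.0, 15.0, 20.0),
    Manoeuvre((350.0, 210.0, 100.0), -0.8, 0.0, 15.0, 6.0),
]
```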
3 hardware uav platform

we selected the procerus uav (figure 3) because of its relatively low cost, built-in cameras and sensors, ease of hardware integration and easy repairs, and the included kestrel autopilot. the kestrel autopilot is capable of waypoint navigation, thus removing the necessity for low-level uav control, and it provides telemetry and gps info about the aircraft via an rf modem.

to provide computational capacity for on-board planning and communication, we equipped the uav with a gumstix overo fire computer on module (com) (figure 4). this computer is based on the arm cortex-a8 architecture with 512 mb ram running at 720 mhz, with linux os and embedded java to enable running of the modified agentfly multi-agent system.

figure 5: scheme of the hardware uav design.

figure 5 shows the block design of the deployed hardware uav. the kestrel autopilot is connected to the sensors and actuators to provide low-level uav control. by the connected 869 mhz microhard modem, the autopilot distributes telemetry and gps info to the ground station, and receives control commands in cases when manual control is required or when the autonomous on-board planning and control algorithms fail. the autopilot is also connected to the gumstix overo com by an rs-232 line. by this line it distributes the telemetry and gps data to the on-board planner, and receives standard control commands — mainly a sequence of waypoints — in return. communication with other hardware or virtual aircraft is carried out by an additional 2.4 ghz xbee modem. the two modems have their counterparts on the ground station pc to pass the telemetry and communication to the simulation, and vice versa.

4 modifications made to the multi-agent system

the software integration of the hardware uavs into the multi-agent system is depicted in figure 6. it can be seen that the hardware uav entity consists of an on-board part responsible for planning, flight execution and other reasoning, and a part located on the ground station pc responsible for visualization and for the exchange of position and telemetry information (the block in the figure denoted as 'radar') between the real and virtual entities.

figure 6: design of the modified mas for mixed hardware and virtual uav simulations.

in order to integrate the hardware uav, some parts of the simulation and planning software had to be modified:

• waypoint navigation of the kestrel autopilot was not compatible with agentfly's manoeuvre-like flight plan structure.
• wind in the real environment significantly influenced the precision of plan execution.
• position uncertainty was caused by errors of sensors and actuators or by environmental effects.
• unreliable communication and low bandwidths of the rf modems caused problems in the collision avoidance and cooperative mechanisms.
• the simulation time needed to be synchronized, otherwise it caused problems in cooperative collision avoidance negotiations.

we will now describe the necessary changes that had to be applied to deal with these problems.

4.1 waypoint navigation

as was stated above, the kestrel autopilot provides the operator with a waypoint navigation capability. the waypoints are uploaded as gps coordinates, so we had to sample the uav's plan, which is represented as a list of flight manoeuvres. the manoeuvres are sampled according to the acceleration or turn angle — the more rapid the change in speed or angle, the denser the sampling (see figure 7).

figure 7: flight manoeuvre sampling. the larger the turn angle, the denser the sampling.
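before turning to the coordinate transformation, here is a minimal sketch of such angle-dependent sampling for a single planar manoeuvre (python; ours, with a hypothetical density constant, not the deployed implementation):

```python
import math
from typing import List, Tuple

def sample_manoeuvre(x0: float, y0: float, heading: float, speed: float,
                     duration: float, turn_angle: float,
                     pts_per_rad: float = 8.0) -> List[Tuple[float, float]]:
    """sample one constant-speed manoeuvre into waypoints; the number of
    samples grows with the turn angle (pts_per_rad is a tuning constant)."""
    n = max(2, math.ceil(pts_per_rad * abs(turn_angle)))
    pts = []
    for i in range(n + 1):
        t = duration * i / n
        if abs(turn_angle) < 1e-9:          # straight flight
            pts.append((x0 + speed * t * math.cos(heading),
                        y0 + speed * t * math.sin(heading)))
        else:                                # circular arc of radius v*T/angle
            r = speed * duration / turn_angle
            phi = turn_angle * i / n         # heading change accumulated so far
            pts.append((x0 + r * (math.sin(heading + phi) - math.sin(heading)),
                        y0 - r * (math.cos(heading + phi) - math.cos(heading))))
    return pts

# a gentle turn needs few waypoints, a sharp one many:
assert len(sample_manoeuvre(0, 0, 0, 15, 5, 0.2)) < len(sample_manoeuvre(0, 0, 0, 15, 5, 2.0))
```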
this linearisation causes errors that are less than 10 cm in measurements of distances, and less than 8 m in measurements of altitudes at a distance of 10 km from the origin and 500 m higher. 4.2 planning in wind wind has a significant effect on the precision of uav plan execution. apart from gusts of wind that drift the aircraft and cause position uncertainty, see the next section, stable winds especially influence the minimal turn radius of the uav — one of the main parameters of the trajectory planner. an aircraft flying against the wind is capable of turning with a much smaller turn radius than an aircraft flying with the wind. in these conditions, the original planner that was based on planning with dubins curves (i.e. straight segments and turns with constant a radius) created trajectories could not be followed in presence of wind. we modified the planner to use trochoidal curves. the trajectories can then be constructed as dubins curves in a coordinate system that moves in the same direction and at the same speed as the wind (see figure 9), and can be followed much more precisely, even in strong wind. 4.3 position uncertainty because of errors of the sensors and actuators and effects of the environment, e.g. gusts of wind, the aircraft’s position estimation is not and cannot be precise. for effective coordination and control of uavs, this uncertainty must be modelled and the o x zy x z y ogs oe lon lat figure 8: linearisation of the coordinate system. wind (a) |wind||velocity| = 0.1 wind (b) |wind||velocity| = 0.3 figure 9: effect of wind on the flight trajectory. planning algorithms must take the uncertainty into account. we distinguish two types of errors — ∆t for timerelated errors and ∆d for distance-related errors (see figure 10). the distance-related error is the deviation of the uav from the planned spatial trajectory. it can be caused by sudden wind gusts, wind changes, high airspeeds or imprecise autopilot control when the aircraft is unable to follow the trajectory correctly. the time-related error is then the deviation of the uav from the time plan. this is caused by imprecise autopilot velocity control, by accumulated small delays that emerge when the autopilot repairs small spatial deviations from the plan, or by high wind speeds when it is difficult to keep the desired velocity of the aircraft. to handle these uncertainties, we use the so-called safety zone around the uav. generally, the worse the flight plan execution performance, the bigger the safety zone needs to be in order to keep the plane far apart from obstacles or other planes. when specifying the dimensions of the safety zone we distinguish two different safety ranges — safety time range st , and safety distance range sd (see figure 11). the safety time range is given by the maximum allowed time-related error ∆t , and the safety distance range is given by the maximum allowed distance-related error ∆d. when the uav gets outside any of these ranges, trajectory replanning is 96 acta polytechnica vol. 52 no. 5/2012 figure 10: two possible error types in plan following∆t for time-related error, and ∆d for distance-related error. figure 11: two different safety ranges — safety time range st , and safety distance range sd. scheduled. the safety time range is bigger than the safety distance range, because it is generally more difficult for the aircraft to keep up with the plan in the time domain. 
the safety zone is used during the planning procedure so that it is wrapped along the planned trajectory and it cannot cross any obstacle, no-flight zone or other planned trajectory in time and space. 4.4 unreliable communication communication is one of the most crucial features of any multi-agent system. in the agentfly system, all collision avoidance and cooperative negotiations, e.g. plan and position exchange and coordination commands, are conducted by message exchange. in a simulation the messages are transferred by the reliable tcp/ip protocol, but in a real environment uavs use rf modems with limited bandwidth, with possible interference to their signal from other rf devices, and with significant signal attenuation with distance. the largest portion of the communication bandwidth is used by plans exchange during cooperative collision avoidance. figure 13 shows the required bandwidth for worst case collision avoidance negotiation in superconflict scenarios, where all the uavs are flying against each other with one collision point in the middle of them (see figure 12). we presume the initial distance of the uavs to be 1.5km, with figure 12: superconflict collision avoidance scenario. figure 13: required bandwidth for collision avoidance negotiation in superconflict. the requirement that the conflict is to be resolved 10 s before the collision point. apart from plan exchange, additional bandwidth required is for manual safety control, telemetry broadcasts and communication management control. this needs 10–30 kbps of additional bandwidth. commercial modems that are capable of communication at a distance of at least 1.5 km have maximum rf bandwidth around 115 kbps, which with the additional and safety bandwidth is not enough to handle even 4 uav superconflict scenarios. moreover, operating the modems at full speed requires a significant portion of cpu time. for that reason we decided to use two independent rf modems operated by kesterl autopilot and gumstix com on-board cpu units, as shown in figure 5. 4.5 simulation time collision avoidance algorithms and other cooperation and coordination mechanisms need to have synchro97 acta polytechnica vol. 52 no. 5/2012 (a) (b) figure 14: the effects of planning with (a) dubins curves and (b) trochoidal curves. nized time to work properly. there are two ways to synchronize simulation times. because all autopilot modules and also the ground station pc are connected to gps modules, we can synchronize the clocks from the time information contained in gps updates. this method can give very precise time information, but we have found out that kestrel autopilot sometimes provides wrong time information in gps updates that need to be recognized and taken out from the measurements. the second way is to pass the ground station’s simulation time during a registration process of the hardware uavs to the system and then start to measure time from that moment. in this method there is a problem in the information transfer time — it can take up to one second to transfer the registration data along with the simulation time, and this can have a significant effect on time synchronization. however we decided to use this approach because the synchronization error is smaller than the time errors from the gps updates. 5 experiments we performed several field tests to check and verify the modifications described above. 
the safety zone is used during the planning procedure: it is wrapped along the planned trajectory, and it must not cross any obstacle, no-flight zone or other planned trajectory in time and space.

4.4 unreliable communication

communication is one of the most crucial features of any multi-agent system. in the agentfly system, all collision avoidance and cooperative negotiations, e.g. plan and position exchanges and coordination commands, are conducted by message exchange. in a simulation the messages are transferred by the reliable tcp/ip protocol, but in a real environment the uavs use rf modems with limited bandwidth, with possible interference to their signal from other rf devices, and with significant signal attenuation with distance.

the largest portion of the communication bandwidth is used by plan exchanges during cooperative collision avoidance. figure 13 shows the required bandwidth for worst-case collision avoidance negotiation in superconflict scenarios, where all the uavs fly against each other with one collision point in the middle (see figure 12). we presume the initial distance of the uavs to be 1.5 km, with the requirement that the conflict is to be resolved 10 s before the collision point.

figure 12: superconflict collision avoidance scenario.

figure 13: required bandwidth for collision avoidance negotiation in a superconflict.

apart from the plan exchange, additional bandwidth is required for manual safety control, telemetry broadcasts and communication management control. this needs 10–30 kbps of additional bandwidth. commercial modems that are capable of communication at a distance of at least 1.5 km have a maximum rf bandwidth of around 115 kbps, which, together with the additional and safety bandwidth, is not enough to handle even 4-uav superconflict scenarios. moreover, operating the modems at full speed requires a significant portion of cpu time. for that reason we decided to use two independent rf modems, operated by the kestrel autopilot and by the gumstix com on-board cpu unit, as shown in figure 5.

4.5 simulation time

collision avoidance algorithms and other cooperation and coordination mechanisms need to have synchronized time to work properly. there are two ways to synchronize the simulation times. because all autopilot modules and also the ground station pc are connected to gps modules, we can synchronize the clocks from the time information contained in the gps updates. this method can give very precise time information, but we have found out that the kestrel autopilot sometimes provides wrong time information in the gps updates, which needs to be recognized and taken out from the measurements. the second way is to pass the ground station's simulation time during the registration process of the hardware uavs to the system, and then start to measure time from that moment. in this method there is a problem in the information transfer time — it can take up to one second to transfer the registration data along with the simulation time, and this can have a significant effect on time synchronization. however, we decided to use this approach, because the synchronization error is smaller than the time errors from the gps updates.

5 experiments

we performed several field tests to check and verify the modifications described above. first, we compared the plan execution precision in cases with plans found by the original planning algorithm and with plans found by our modified algorithm for planning in wind. the tests were conducted with approximately 5 m/s wind, the uav's airspeed was 15 m/s, and the scenario consisted of three waypoints placed as shown in figure 14. the blue line in figure 14 is the ideal planned trajectory that starts at the first waypoint (wp1), then proceeds via wp2 and wp3 and ends again at wp1. the green line is the real recorded trajectory of the uav, and the black arrows emphasize the different turn radii in the individual cases. it can be seen that the trajectories created with dubins curves with a constant minimal turn radius could not be followed at some points. on the other hand, the trajectories created with trochoidal curves that were adapted to the wind strength and direction were followed much more precisely.

figure 14: the effects of planning with (a) dubins curves and (b) trochoidal curves.

another experiment that we performed was a mixed reality collision avoidance test. we prepared a scenario with one hardware uav and one virtual uav flying against each other. the purpose of this experiment was to test the functionality of the collision avoidance mechanism between the real and virtual aircraft, and also to test the sufficiency of the rf bandwidth required for the communication. the experiment was successful: the uavs needed less than 10 s to solve the collision situation, which with airspeeds of 15 m/s corresponds to a 300 m distance flown from the beginning of the negotiation. in the future we would like to perform experiments with two hardware uavs, later adding one and more virtual uavs in order to study the scalability of the problem.

6 conclusion

in this paper we have presented problems and solutions connected with integrating hardware fixed wing uavs into an existing multi-agent system for uav simulation. we are now able to verify the functionality of the collision avoidance and cooperative mechanisms in a real environment, and also to find possible bottlenecks and limitations, e.g. the maximum available rf bandwidths for communication, and the limited computational capacity of the on-board computers. our work is a first step toward full deployment of the agentfly system on real hardware uavs that could operate independently in coordinated teams during tactical missions. in our future work, we would like to extend the tactical cooperation possibilities of uavs and also to deploy autonomous vtol quadcopters next to the fixed wings and provide them with mechanisms for mutual flight coordination and cooperation.

acknowledgements

the project described in this paper was supervised by ing. m. rollo, phd., fee ctu in prague, and was supported by czech ministry of defence grant ovcvut2010001.

references

[1] j. w. baxter, g. s. horn. controlling teams of uninhabited air vehicles. in proceedings of the fourth international joint conference on autonomous agents and multiagent systems, acm, 2005, pp. 27–33.
[2] j. w. baxter, g. s. horn, d. p. leivers. fly-by-agent: controlling a pool of uavs via a multi-agent system. knowledge-based systems 21(3):232–237, 2008.
[3] r. w. beard, t. w. mclain, d. b. nelson et al. decentralized cooperative aerial surveillance using fixed-wing miniature uavs. proceedings of the ieee 94(7):1306–1324, 2006.
[4] a. bürkle, f. segon, m. kollmann. towards autonomous micro uav swarms. in journal of intelligent & robotic systems, springer, 2011, pp. 1–15.
[5] l. e. dubins. on curves of minimal length with a constraint on average curvature, and with prescribed initial and terminal positions and tangents. american journal of mathematics 79(3):497–516, 1957.
[6] p. scerri, t. von gonten, g. fudge et al. transitioning multiagent technology to uav applications. in proceedings of the 7th international joint conference on autonomous agents and multiagent systems: industrial track, international foundation for autonomous agents and multiagent systems, 2008, pp. 89–96.
[7] d. šišlák, m. rehák, m. pechouček. a-globe: multi-agent platform with advanced simulation and visualization support. in web intelligence, ieee computer society, 2005, pp. 805–806.
[8] d. šišlák, p. volf, š. kopřiva et al. agentfly: a multi-agent airspace test-bed. in proceedings of the 7th international conference on autonomous agents and multi-agent systems (aamas 2008), may 2008.

synthesis of room impulse responses for variable source characteristics

m. kunkemöller, p. dietrich, m. pollow

abstract
every acoustic source, e.g. a speaker, a musical instrument or a loudspeaker, generally has a frequency dependent characteristic radiation pattern, which is preeminent at higher frequencies. room acoustic measurements nowadays only account for omnidirectional source characteristics. this motivates a measurement method that is capable of obtaining room impulse responses for these specific radiation patterns by using a superposition approach of several measurements with technically well-defined sound sources. we propose a method based on measurements with a 12-channel independently driven dodecahedron loudspeaker array rotated by an automatically controlled turntable. radiation patterns can be efficiently described with the use of a spherical harmonics representation. we propose a method that uses this representation both for the spherical loudspeaker array used for the measurements and for the target radiation pattern to be used for the synthesis. we show validating results for a deterministic test sound source inside a small lecture hall.

keywords: spherical harmonics, adjustable source directivity, room impulse response, linear room acoustics.

1 introduction

in order to determine room acoustic parameters, e.g. reverberation time, clarity index or even binaural parameters (iacc), room impulse responses are measured with an omni-directional sound source, as required by the iso 3382 standard. these sound sources in general consist of several single loudspeaker chassis placed on a spherical array and excited in a coherent way with the exact same signal. the measured impulse responses in the room under test entirely describe the linear behavior for the exact combination of sound source position and microphone positions. this can be used afterwards to make the acoustic situation in this particular room audible (the concept of auralization), but lacking the characteristic effect of the source radiation pattern.

methods were therefore developed to drive the single loudspeaker chassis of these compact spherical loudspeaker arrays with individual signals in order to directly approximate certain radiation patterns of target sound sources [7,8,10]. this directly implies measuring the room impulse responses with approximated radiation patterns, e.g. of a speaker or an instrument, even though only a technical source is present during the measurements.
If we are interested in synthesizing the sound of various sources, it becomes obvious that the measurement time rises with each target source and, of course, all target sources have to be specified and measured in advance. This motivates a novel measurement and synthesis method that allows us to measure universal sets of room impulse responses that can be used to synthesize arbitrary radiation patterns after the measurement has been completed. The number of available loudspeaker chassis, and therefore the number of different basis radiation patterns, can be artificially increased by using several rotation and tilting angles of the loudspeaker array. The proposed method requires that the acoustic transfer characteristics of a room can be assumed linear and time-invariant, so that a superposition approach applies; hence, the reasonable limits of this assumption will be studied and discussed. The proposed synthesis method is based on a description of radiation patterns in the spherical harmonic domain. This enables us to model the radiation pattern of the source to be approximated, as well as that of the spherical loudspeaker array used for the measurements, on the same basis.

2 Method

The proposed method can be divided into two parts, measurement and synthesis, which can also be entirely separated from each other. Both parts use the same calculus, and the inverse problem is also formulated in the same way. The spherical harmonics representation is used throughout.

2.1 Measurement of room impulse responses

The core of the measurement consists of well-known impulse response measurements of linear time-invariant (LTI) systems. This assumption holds for most acoustical systems within certain limits. A detailed overview of these methods can be found in [6]. For each loudspeaker chassis of the array, the impulse response h(t) or its frequency representation H(ω) (the two representations have a fixed mathematical relationship and are used as synonyms in this work) is obtained by using exponentially swept sines (chirps, sweeps) as excitation signals and proper deconvolution techniques for the signal recorded by the microphones [2]. The chosen signal is advantageous for the given task because it allows non-linear behavior to be detected. Furthermore, we employ a time-saving approach that uses interleaved excitation signals, allowing several loudspeaker chassis to run at the same time while still permitting a perfect separation of the responses, as proposed by Majdak et al. [4]. For each of the M orientation angles of the loudspeaker array, the N impulse responses are measured, one for each loudspeaker chassis, leading to a set of L = M · N room impulse responses

h_l(t) \ \text{or} \ H_l(\omega), \qquad l = 1, \dots, L. \qquad (1)

Each response corresponds to a different radiation pattern. Figure 1 illustrates the method schematically for an array of three loudspeakers: the impulse responses of each driver are measured in two orientations and are subsequently superposed.

Fig. 1: Superposition of several differently oriented loudspeaker chassis

2.2 Synthesis of target responses

In order to approximate the target radiation pattern by the spherical loudspeaker array, complex and frequency-dependent weighting factors w_l are determined to obtain the room impulse response h_T(t), or the transfer function H_T(ω), of the approximated target radiation pattern by superposition:

H_T(\omega) \approx \sum_{l=1}^{L} H_l(\omega)\, w_l. \qquad (2)

The superposition approach is only applicable if the room can be considered an LTI system.
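Equation (2) is the entire synthesis step: once the weights are known, the target response is a weighted sum of the measured responses. The following is a minimal NumPy sketch of this superposition, our illustration rather than the ITA-Toolbox implementation; all data below are synthetic placeholders.

```python
import numpy as np

# Synthetic placeholder data: impulse responses h_l(t) of L basis patterns
# (chassis x orientations), measured at one microphone position.
L, T = 24, 8192                 # number of responses, samples per response
rng = np.random.default_rng(0)
h = rng.standard_normal((L, T)) * np.exp(-np.arange(T) / 2000.0)

# Frequency-domain representation H_l(omega) of each response.
H = np.fft.rfft(h, axis=1)      # shape (L, T//2 + 1)

# Complex, frequency-dependent weights w_l (placeholder values here;
# Section 2.2 determines them from the target radiation pattern).
w = rng.standard_normal(H.shape) + 1j * rng.standard_normal(H.shape)

# Superposition of eq. (2): H_T(omega) ~ sum_l H_l(omega) * w_l,
# valid only under the LTI assumption discussed in the text.
H_T = (w * H).sum(axis=0)
h_T = np.fft.irfft(H_T, n=T)    # synthesized target impulse response
```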
Linearity is in general not problematic for air-borne sound paths in a room at moderate sound pressures, as in the case of standard room acoustic measurements. However, time variances become problematic if the room changes significantly during a measurement session. Time variances in rooms are caused by temperature shifts, changes in humidity, or light winds and air circulation. In order to detect such variances, which lead to errors in the subsequent synthesis, a concept is used as described in Section 4.1.

The radiation characteristics of an acoustic source can be described by the directivity factor Γ [5]:

\Gamma(\theta, \varphi) := \frac{p(r, \theta, \varphi)}{p(r, \theta_0, \varphi_0)}. \qquad (3)

This gives the complex factor between the pressure p at a reference radiation angle (θ₀, φ₀) and the sound pressure in any direction (θ, φ); (r, θ, φ) are the radius, the vertical angle and the horizontal angle of the common spherical coordinate system. In general, p and Γ are complex and frequency dependent, but for better readability they are used without subscripts in the following. Since the directivity can be regarded as a function which depends only on the radiation angle (θ, φ) and which is continuously integrable, it can be represented by a set of spherical harmonic coefficients Γ̂_{n,m}, as shown by Williams [9]:

\Gamma(\theta, \varphi) = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \hat{\Gamma}_{n,m}\, Y_n^m(\theta, \varphi), \qquad (4)

where the Γ̂_{n,m} are frequency-dependent, complex-valued spherical harmonic coefficients, and the Y_n^m are spherical harmonic base functions, which can be defined as

Y_n^m(\theta, \varphi) = \sqrt{\frac{2n+1}{4\pi}\,\frac{(n-m)!}{(n+m)!}}\; P_n^m(\cos\theta)\, e^{im\varphi}. \qquad (5)

The indices n and m denote the spatial periodicity of the function Y_n^m(θ, φ); they are called the order n ∈ N₀ and the degree m ∈ Z, −n ≤ m ≤ n. P_n^m(µ) is an associated Legendre polynomial of the first kind. A detailed work explaining the characterization of acoustic sources and radiation patterns with spherical harmonics is given by Zotter [10]. The radiation pattern of a real source has a finite roughness over its surface; therefore its characterization in the spherical domain can be limited to a maximum order n_max, and the spherical harmonic coefficients can be summarized in a column vector [10]:

\hat{\Gamma} = \left[ \hat{\Gamma}_{n,m} \right], \qquad (6)

where 0 ≤ n ≤ n_max and −n ≤ m ≤ n. Each of the above-mentioned L impulse responses measured with the spherical loudspeaker array corresponds to a certain source radiation pattern, which can also be written as such a vector d̂_l. Let Γ̂_T be the radiation pattern of the target to be synthesized; then we can formulate, by analogy with equation (2),

\hat{\Gamma}_T \approx \sum_{l=1}^{L} \hat{d}_l\, w_l. \qquad (7)

The vectors d̂_l can be summarized in a matrix characterizing the radiation patterns of the entire extended array:

\hat{D} = \left[ \hat{d}_1 \dots \hat{d}_L \right]. \qquad (8)

Hence equation (7) can be extended to a matrix formulation:

\hat{\Gamma}_T \approx \hat{D} \cdot W, \qquad W = [w_1, \dots, w_L]^T. \qquad (9)

For the optimum weighting vector W we require

\| \hat{D} \cdot W - \hat{\Gamma}_T \| \longrightarrow \min, \qquad (10)

leading to an inverse problem that can be solved using the Moore–Penrose pseudoinverse D̂⁺ [3]:

W = (\hat{D}^H \hat{D})^{-1} \hat{D}^H \cdot \hat{\Gamma}_T =: \hat{D}^{+} \cdot \hat{\Gamma}_T. \qquad (11)

All quantities in equations (2) and (11) are measured quantities and are therefore subject to noise.
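A minimal numerical sketch of the weight computation in equation (11), using NumPy; this is our illustration, not the authors' MATLAB/ITA-Toolbox code, and the matrix D and target vector gamma_t are random placeholders. The second, regularized variant anticipates equation (12) introduced just below.

```python
import numpy as np

rng = np.random.default_rng(1)
L = 24        # number of measured responses (chassis x orientations)
K = 16        # number of spherical harmonic coefficients, (n_max + 1)^2
nu = 1e-3     # Tikhonov parameter (assumed value)

# Hypothetical radiation-pattern matrix D (one column per basis pattern)
# and target pattern gamma_t, both complex, for a single frequency bin.
D = rng.standard_normal((K, L)) + 1j * rng.standard_normal((K, L))
gamma_t = rng.standard_normal(K) + 1j * rng.standard_normal(K)

# Plain Moore-Penrose solution of eq. (11): w = pinv(D) @ gamma_t.
w_pinv = np.linalg.pinv(D) @ gamma_t

# Tikhonov-regularized solution, eq. (12):
# w = (D^H D + nu I)^(-1) D^H gamma_t.
DH = D.conj().T
w_tik = np.linalg.solve(DH @ D + nu * np.eye(L), DH @ gamma_t)

# Residuals || D w - gamma_t || of both syntheses.
print(np.linalg.norm(D @ w_pinv - gamma_t),
      np.linalg.norm(D @ w_tik - gamma_t))
```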
In order to suppress the influence of these measurement uncertainties on the synthesis result, the range of possible solutions is limited by Tikhonov regularization [3]:

W = (\hat{D}^H \hat{D} + I \cdot \nu)^{-1} \hat{D}^H \cdot \hat{\Gamma}_T =: \hat{D}^{\oplus} \cdot \hat{\Gamma}_T, \qquad (12)

where I is the unit matrix of dimension L × L and ν is the so-called Tikhonov parameter.

3 Implementation and instrumentation

The measurement methods were implemented using MATLAB and the ITA-Toolbox. This toolbox is developed at the Institute of Technical Acoustics and provides various tools for acoustic measurements and post-processing. Hence, the calculus for the inverse problem and the synthesis was also implemented in MATLAB. A calibrated instrumentation setup was used, consisting of the following elements (the numbers correspond to the numbers in Figure 2).

Fig. 2: Instrumentation setup

• Measurement PC (1): (Windows XP, 32-bit) ASIO driver, MATLAB and ITA-Toolbox.
• D/A and A/D converters (2, 6): RME Multiface II connected via RME HDSP, and Behringer ADA8000 Pro8 connected via ADAT to the Multiface II.
• Power amplifier (3): two 8-channel Stageline IMG STA-1508.
• Microphones (4): radiation pattern: Brüel & Kjær; room impulse responses: Sennheiser KE4 electret condenser.
• Signal conditioning (5): Behringer ADA8000 Pro8, preamplification and phantom power supply for the condenser microphones.

Our spherical loudspeaker array consists of a mid-tone dodecahedron loudspeaker developed at the Institute of Technical Acoustics. The single loudspeaker chassis can be driven independently, and the radiation pattern of each chassis was measured under free-field conditions in the anechoic chamber with a controlled measurement scan unit. The results were transformed to the spherical harmonics domain. This loudspeaker array was used along with a computerized turntable to allow arbitrary horizontal orientation of the array in the room for the measurements, as shown in Figure 3. The array was inclined at an angle so that the elevation angles of the single chassis were equally distributed.

Fig. 3: Dodecahedron array, devices for tilting and rotating

The superposition method based on the Moore–Penrose pseudoinverse D̂^⊕ of the radiation patterns of the array was introduced in Section 2.2. The error of this inversion is a good measure of the quality that can be expected from the method in subsequent calculations with reasonable target responses. In the ideal case, the residual matrix

\hat{R} = \left| \hat{I} - \hat{D}^{\oplus} \cdot \hat{D} \right| \qquad (13)

would be the zero matrix. The energy of its columns ε̂_{n,m} corresponds to the error that arises when synthesizing the individual spherical harmonics Y_n^m. Figure 4 shows this error over frequency (only the maximum error over all degrees within one order is shown).

Fig. 4: Error of synthesizing spherical harmonics (mean square error of the synthesis in dB, plotted over frequency in Hz and spherical harmonic order)

As can be seen, the possible order of the spherical harmonics for synthesized target sources rises with frequency, and the error of synthesis rises as well. The low number of reproducible orders at low frequencies can be explained by the fact that the single loudspeakers do not themselves have a pronounced radiation pattern at low frequencies. The synthesis error is caused by the limited resolution of the orientation angles in the vertical direction of the single chassis, as this angle could not be adjusted automatically with the given measurement setup.

4 Experimental results

In order to evaluate the proposed method, a comparative measurement was conducted.
The room chosen for the measurements was an easily accessible lecture hall at the Institute of Technical Acoustics with a mean reverberation time of approx. 0.9 seconds at mid frequencies. Two main measurements were conducted in this room: one measurement with the spherical loudspeaker array, and another with a loudspeaker of a certain target radiation pattern, which was also used as the target response for the synthesis. Figure 5 illustrates the measurement setup inside the lecture hall. The upper picture shows the spherical loudspeaker array on the left side, and the bottom picture shows the target loudspeaker in the same position in the room. Additionally, the reference dodecahedron loudspeaker (right side) was used in a fixed position, as can also be seen in the pictures.

4.1 Detection of time variances

As mentioned above, measurements with a reference loudspeaker are conducted for each orientation angle of the spherical loudspeaker array. The results of the reference loudspeaker are used for a correlation analysis of the impulse responses in the time domain against a mean impulse response. Figure 6 shows this correlation coefficient. The dotted line marks the time when the room was briefly entered to replace the array by the target loudspeaker. It is obvious that the acoustic behavior of the room changes over time. At the beginning, after the personnel have left the room, the changes are greater than at the end. This can be explained by the fact that the room still needs some time to completely settle down after objects have been moved. Measurements are chosen at measurement times where the time variances are low. In this case we chose 100 measurements as input for the ongoing synthesis.

Fig. 5: Measurement setup for the comparative measurements

Fig. 6: Correlation analysis to detect time variances (correlation coefficient vs. index of measurement position, evaluated at 500 Hz, 1 kHz, 2 kHz and 4 kHz)

4.2 Results

The upper image in Figure 7 shows the measurement with the real source and the synthesis result in the time domain, and the lower image zooms into the range of the first reflections of the room impulse response. The results look very similar in this representation. The frequency domain is plotted in Figure 8. The results show good agreement in the range from 300 Hz to 1.5 kHz. The chosen cut-off frequency limits the result at the low end. The deviation above 1.5 kHz grows with frequency, which corresponds to the results shown in Figure 4.

Fig. 7: Measured and synthesized impulse response (time domain)

Fig. 8: Measured and synthesized impulse response (frequency domain)

5 Conclusion

We have proposed a measurement method for a special set of room impulse responses, together with a post-processing synthesis step that yields room impulse responses of arbitrary target radiation patterns. In addition, an approach was introduced to fully separate measurement and synthesis by transforming the measurement results into a universal representation. Since the method assumes linear time-invariant systems, this assumption was studied, and a measure to quantify and monitor time variances was used, based on measurements with a reference loudspeaker in a fixed position.
The method was validated in a small lecture hall using a 12-channel dodecahedron spherical loudspeaker array with automatically adjustable orientation angles to virtually increase the number of drivers, and therefore the number of different radiation patterns. The results obtained by the synthesis of the proposed method were compared to measurements with the source that was also used as the target for the synthesis. With the given measurement setup, the frequency range is limited towards higher frequencies at around 3 kHz. As the main idea of this work was to develop such a measurement method, there are still some limitations to overcome in future work. In order to cover the entire audible frequency range from 20 Hz to 20 kHz, two modifications seem reasonable: the spherical array should be replaced by a high-tone version for the higher frequency range, and the vertical resolution of the spherical array needs to be increased.

Acknowledgement

Research described in this paper was supervised by Prof. Dr. Michael Vorländer. The authors would like to thank the electrical and mechanical workshop at the Institute of Technical Acoustics for their support with the special measurement devices.

References

[1] Dietrich, P., Masiero, B., Mueller-Trapet, M., Pollow, M., Scharrer, R.: MATLAB toolbox for the comprehension of acoustic measurement and signal processing. DAGA, 2010.

[2] Farina, A.: Advancements in impulse response measurements by sine sweeps. AES 122nd Convention, Vienna, Austria, 2007.

[3] Kress, R.: Inverse Problems. Institute for Numerical and Applied Mathematics, TU Göttingen.

[4] Majdak, P., Balazs, P., Laback, B.: Multiple exponential sweep method for fast measurement of head-related transfer functions. J. Audio Eng. Soc., 2007.

[5] Mechel, F. P.: Formulas of Acoustics. Springer, 2002.

[6] Müller, S., Massarani, P.: Transfer-function measurement with sweeps. Journal of the Audio Engineering Society, 2001.

[7] Pollow, M., Behler, G.: Variable directivity for Platonic sound sources based on spherical harmonics optimization. Acta Acustica, 2009.

[8] Warusfel, O., Derogis, P., Caussé, R.: Radiation synthesis with digitally controlled loudspeakers. AES, 1997.

[9] Williams, G.: Fourier Acoustics: Sound Radiation and Nearfield Acoustical Holography. Naval Research Laboratory, Washington, D.C.

[10] Zotter, F.: Analysis and Synthesis of Sound Radiation with Spherical Arrays. Dissertation, Institut für Elektronische Akustik, Graz, 2009.

About the authors

Martin Kunkemöller received his diploma degree in electrical engineering and information technology at RWTH Aachen, Germany, in 2011. He was employed as a research assistant at the Institute of Technical Acoustics of RWTH Aachen until April 2011, focusing on developing and evaluating the proposed method. He is a member of the German Acoustical Association (DEGA).

Pascal Dietrich received his diploma degree from the Faculty of Electrical Engineering and Computer Science at RWTH Aachen University in 2006. In 2007 he joined the Institute of Technical Acoustics as a PhD student and researcher in the field of electroacoustics, transfer-path analysis and structure-borne sound prediction, always with a focus on measurement and modeling uncertainty. He is a member of the Audio Engineering Society (AES) and the German Acoustical Association (DEGA).

Martin Pollow received his diploma degree in electrical engineering from RWTH Aachen University, Germany, in 2007.
He is currently employed as a researcher and enrolled as a PhD student at the Institute of Technical Acoustics of RWTH Aachen University. His research encompasses complex sound fields, airborne sound radiation, array systems and analytical models. He is a member of the German Acoustical Association (DEGA).

Martin Kunkemöller
Pascal Dietrich
Martin Pollow
E-mail: martin.kunkemoeller@akustik.rwth-aachen.de
Institute of Technical Acoustics
RWTH Aachen University, Germany

The Influence of Heart and Lung Dynamics on the Impedance Cardiogram — A Simulative Analysis

M. Ulbrich, A. Schauermann, S. Leonhardt

Abstract

Impedance cardiography (ICG) is a simple and cheap method for acquiring hemodynamic parameters. Unfortunately, not all physiological influences on the ICG signal have yet been identified. In this work, the influence of heart and lung dynamics is analyzed using a simplified model of the human thorax with high temporal resolution. Simulations are conducted using the finite integration technique (FIT) with a temporal resolution of 103 Hz. It is shown that changes in heart volume as well as conductivity changes of the lung have a high impact on the ICG signal if analyzed separately. Considering the sum signal of both physiological sources, it can be shown that they compensate each other and thus do not contribute to the signal. This finding supports Kubicek's model.

Keywords: impedance cardiography, bioimpedance, simulation, finite integration technique, high temporal resolution, signal source analysis.

1 Introduction

One of the most common causes of death in Western Europe is chronic heart failure (CHF). Measures of the severity of this cardiovascular disease are hemodynamic parameters such as stroke volume (SV). Until now, the gold standard for measuring these parameters has been the thermodilution technique, which utilizes a pulmonary artery catheter. However, the risks of estimating CO via catheters include infections, sepsis and arrhythmias, as well as increased morbidity and mortality. An alternative method for assessing hemodynamic parameters non-invasively and cost-effectively is ICG. Currently, ICG is not commonly used as a diagnostic method, because it is not considered to be valid [1]. One reason is the inaccuracy of the technology itself concerning SV calculations. Another possible reason is that the processes in the human body during ICG measurements are widely unknown. One way to analyze where the current paths run and which tissue contributes significantly to the measurement result is to use computer simulations employing FIT. Other researchers have already examined multiple sources of the ICG signal using various approaches: some works are based on simple geometries [2], others on real anatomical data, such as MRI data [3]. Since controversial results have been obtained, the influence of two particular sources on the impedance cardiogram will be analyzed in this work and judged according to the value of their contribution.

2 Basics

The basics of the measurement technique and of the simulation technique are explained in this section.

2.1 Bioimpedance

For bioimpedance measurements, two outer electrodes are used to inject a small alternating current into the human body, and the voltage is measured by two inner electrodes to calculate the complex impedance. If a frequency spectrum between 5 kHz and 1 MHz is used to measure the bioimpedance at each frequency, the method is called bioimpedance spectroscopy (BIS).
This is generally the most interesting frequency range for diagnosis, since physiological and pathophysiological processes lead to changes in body impedances with high dynamics. BIS is commonly used to assess the body composition of humans. If only one frequency of this spectrum is used to measure the bioimpedance continuously, the method is called impedance cardiography (ICG). Using ICG, time-dependent hemodynamic parameters can be extracted from the measured impedance curve (see Figure 1). The derivative of the impedance change ΔZ is the ICG signal, whose maximum is used for calculating the stroke volume. The SV measured by ICG according to Bernstein and Sramek can be described by the following equation:

SV = \delta \cdot \frac{(0.17)^3}{4.2} \cdot \left| \frac{dZ}{dt} \right|_{\max} \cdot \frac{T_e}{Z_0} \qquad (1)

Here the factor δ is the actual weight divided by the ideal weight, T_e is the left ventricular ejection time (LVET), and Z_0 is the thoracic base impedance [4]. Since ICG usually operates at one fixed frequency between 20 and 100 kHz, only one continuous point on a complex frequency locus plot is obtained by ICG measurements.

Fig. 1: Typical ICG wave

2.2 Finite element simulation

Finite element simulations may be used to describe complex structures by subdividing them into simple finite elements. These elements can be triangles or rectangles for 2D problems, and tetrahedrons, pentahedrons, pyramids or hexahedrons for 3D problems. Discretization using hexahedrons is shown in Figure 2 (middle). Physical values are assigned to the edges, faces and volumes of these substructures. This assignment is illustrated in the right part of the figure, which shows two dual grids representing the functional principle of the FIT. Here, electric voltages (e) and magnetic voltages (h) are assigned to the edges, while the magnetic fluxes (b) and electric fluxes (j, d) are allocated on the faces of the grids. Hence, a system of equations, the Maxwell grid equations, has to be solved for the whole calculation domain, describing each cell:

C\,\mathbf{e} = -\frac{\partial \mathbf{b}}{\partial t}, \qquad \tilde{C}\,\mathbf{h} = \frac{\partial \mathbf{d}}{\partial t} + \mathbf{j}, \qquad (2)

\tilde{S}\,\mathbf{d} = q, \qquad S\,\mathbf{b} = 0. \qquad (3)

Note that the matrices C and C̃ correspond to the curl operator, and the matrices S and S̃ correspond to the divergence operator [5]. While the differential form of Maxwell's equations is solved in FEM simulations, the integral form is used for FIT simulations, so that the material matrices are diagonal for FIT, while the FEM material matrices are non-diagonal. These matrices contain information about conductivity, permittivity and permeability, and define the relations between voltages and fluxes [6].

Fig. 2: Finite integration technique

3 Methods

Classical ICG analyzes the impedance of the thorax, approximating its volume by one outer cylinder with a conductivity representing a mixture of tissues, containing another cylinder representing the aorta. This is of course an assumption which leads to modeling errors. As a result, the task is to find other models to improve the reliability of ICG measurements. In addition, it has to be clarified which physiological sources contribute to the ICG signal and lead to its characteristic points, shown in Figure 1. This task will be accomplished by simulations using FIT and an anatomical dataset of a male human as a basis for a simplified dynamic model. The dataset is based on the Visible Human Project© dataset from the National Library of Medicine in Maryland [7].
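Before turning to the dynamic model, equation (1) can be read off numerically. The following minimal sketch is our illustration only; all input values are hypothetical, and note that in much of the ICG literature the factor 0.17 multiplies the body height H before cubing, whereas the code follows the formula exactly as printed above.

```python
# Stroke volume from ICG characteristics via eq. (1) (Bernstein/Sramek).
# All input values below are hypothetical illustration values.
delta = 1.05        # actual weight / ideal weight
dZdt_max = 1.2      # |dZ/dt|_max in Ohm/s
T_e = 0.3           # left ventricular ejection time (LVET) in s
Z_0 = 25.0          # thoracic base impedance in Ohm

# Volume term (0.17)^3 / 4.2 exactly as printed in eq. (1), in m^3;
# variants in the literature use (0.17 * H)^3 / 4.2 with body height H.
vept = 0.17 ** 3 / 4.2

sv = delta * vept * dZdt_max * T_e / Z_0     # stroke volume in m^3
print(f"SV = {sv * 1e6:.1f} ml")             # 1 m^3 = 1e6 ml
```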
Since this dataset contains no information about dynamics, a new model had to be created using simple geometries, such as frustums, spheres and cylinders, in order to reduce the simulation time. The thorax of the Visible Human male, together with the created simple thorax, is shown in Figure 3.

Fig. 3: Visible Human dataset and simplified model

For each expansion step a new model had to be created, since at every point in time the discretization of the simulated volume has to be recalculated due to the altered thorax geometry. The left ventricular volume change obtained from MRI data has been used to alter the radius of a sphere representing the heart, assuming a stroke volume of 80 ml [8]. The radius varies between 38.5 mm and 33.8 mm and mimics the contraction of the heart. The conductivity change due to lung perfusion has been implemented by assuming a maximum conductivity change of 10% [9]. Based on conductivity changes measured with electrical impedance tomographic spectroscopy (EITS), a dataset comprising 103 points in time has been created using MATLAB [10]. The calculation frequency has been set to 100 kHz. Using a discretization density of 50 units, each model comprises 1.5 million tetrahedrons. Besides this global discretization, a finer local discretization has been used for shape-changing organs in order to obtain accurate results for small geometry changes. Conductivity (σ) and permittivity (εr) values for every tissue have been implemented using the data from Gabriel et al. (see Table 1).

Table 1: Permittivity and conductivity values for 100 kHz [11]

Tissue    | σ [S/m]   | εr
----------|-----------|--------
Blood     | 0.70292   | 5120
Myocard   | 0.21511   | 9845.8
Bone      | 0.020791  | 227.64
Fat       | 0.024414  | 92.885
Muscle    | 0.36185   | 8089.2
Abdomen   | 0.2       | 4000
Lung      | 0.10735   | 2581.3

Rebuilding all organs situated in the abdomen with the chosen procedure would have been too complex for the desired model. In addition, the impedance change of these organs plays a subordinate role for the ICG signal during a heartbeat, so the simplification has been made that the abdominal part of the model is filled with a uniform "tissue mixture" created by Boolean operations; the average permittivity and conductivity of all abdominal organs has then been assigned to this tissue. For simplicity, ring electrodes have been used for current injection; it is of course possible in principle to use standard spot electrodes in the simulation. The time resolution comprises 103 points in time within a heartbeat for both physiological sources. For validation purposes, the sum of the dynamic impedance measured at 100 kHz, including the volume change of the aorta, has been compared to measured data of a male human, and the results show excellent agreement (r = 0.94) using a correction factor [12]. This data was acquired using the NICCOMO™ device from medis Germany, Ilmenau. Thus, the model itself has been considered suitable for this kind of simulation.

4 Results

The model created for the simulation is shown in Figure 4. It consists of all important tissues (muscle, fat, bone, heart, lung, abdominal tissue, important blood vessels). The current density results of a simulated measurement are presented in a transversal section by arrows flowing through the thorax from abdomen to neck. We can observe visually that a high current density persists in muscle tissue, the aorta and the heart, whereas a low current flows through bone and fat tissue. It is also no surprise that there is a high current density in the current injection areas.
Fig. 4: Simulation model and current density results

All simulations have been run on a personal computer with a 64-bit operating system, an Intel® Xeon® 5240 processor with 2 cores, and 24 GB RAM. Every simulation step consumed 2 GB RAM and lasted 5 minutes using one core of the processor. Impedance changes have been computed for all points in time and both dynamic sources, so that in total 206 simulations have been conducted (see Figure 5).

Fig. 5: Simulation results of one heartbeat

This plot shows the results of the lung perfusion simulations and the heartbeat simulations, as well as the sum signal of both results. The ripple of the heart impedance change can be explained by the changing geometry of the model due to the heart volume change: for every new model geometry a new mesh has to be created, which can lead to other edges or faces of the grid contributing to the results obtained by the postprocessor. Moreover, an impedance change of approx. 0.35 Ω can be observed for both dynamic sources. In addition, both results show nearly the same temporal behavior, so that the resulting sum curve reveals no significant impedance change.

5 Discussion

The task of this work was to analyze the influence of lung perfusion and heartbeat on ICG measurements. First, simulations with high temporal resolution have been conducted using conductivity changes of the lung and geometry changes of the heart as the physiological dynamic sources of the ICG signal. In addition, physiological data and a simple model based on anatomical data have been used to produce realistic results. Thus, a resource-saving and accurate way to reproduce an impedance measurement has been implemented. Second, it has been shown that lung perfusion and heartbeat each have a high influence on the ICG signal, because both signals have a maximum impedance change of 0.35 Ω, which is even higher than the usual impedance change of the thorax. Third, the two effects cancel each other because they show a very similar temporal behavior. To sum up, it has been shown that although there are other big physiological influences altering the conductivity, and thus the impedance, of the human thorax, the original Kubicek model does not really make false assumptions by taking the aorta as the major contributor to the ICG signal and neglecting other sources. This model of course has the potential to be improved: more static sources could be added, such as the rib cage, kidneys and liver, and more dynamic sources should be taken into account, including erythrocyte orientation, respiration and further components of the cardiovascular system. In addition, the influence of pathologies on the ICG signal should be analyzed in order to explain effects when measuring patients.

Acknowledgement

The research described in this paper was supervised by Prof. Steffen Leonhardt, RWTH Aachen University of Technology, and was supported by Jens Mühlsteff, Philips Research Eindhoven. This work contributes to the European Union "HeartCycle" project.

References

[1] Cotter, G.: Impedance cardiography revisited. Physiological Measurement, 2006, Vol. 27, p. 817–827.

[2] Kosicki, J.: Contributions to the impedance cardiogram waveform. Annals of Biomedical Engineering, 1986, Vol. 14, p. 67–80.

[3] Patterson, R. P.: Sources of the thoracic cardiogenic electrical impedance signal as determined by a model. Medical & Biological Engineering & Computing, 1985, Vol. 23, p. 411–417.
[4] Van De Water, J.: Impedance cardiography — the next vital sign technology? Chest, 2003, Vol. 123, p. 2028–2033.

[5] Clemens, M.: Discrete electromagnetism with the finite integration technique. Progress in Electromagnetics Research, 2001, p. 65–87.

[6] Demenko, A.: On the equivalence of finite element and finite integration formulations. 17th Conference on the Computation of Electromagnetic Fields, 2009, p. 677–678.

[7] National Library of Medicine: The Visible Human Project. [Online]. Available: http://www.nlm.nih.gov/research/visible/

[8] Feng, W.: A dual propagation contours technique for semi-automated assessment of systolic and diastolic cardiac function by CMR. Journal of Cardiovascular Magnetic Resonance, 2009, Vol. 11.

[9] Brown, B. H.: Simultaneous display of lung ventilation and perfusion on a real-time EIT system. Engineering in Medicine and Biology Society, Proceedings of the Annual International Conference of the IEEE, 1992, Vol. 5, p. 1710–1711.

[10] Zhao, T.: Modelling of cardiac-related changes in lung resistivity measured with EITS. Physiological Measurement, 1996, Vol. 17, p. 227–234.

[11] Gabriel, C.: The dielectric properties of biological tissues. Physics in Medicine and Biology, 1996, Vol. 41, p. 2231–2249.

[12] Ulbrich, M.: Simulation of continuous spectroscopic bioimpedance measurements for impedance cardiography. IEEE International Workshop on Impedance Spectroscopy, 2010.

About the authors

Mark Ulbrich was born in 1980 in Bonn, Germany. He studied electrical engineering at RWTH Aachen University of Technology and specialized in medical engineering. In 2009 he started working as a PhD student at the Chair for Medical Information Technology, RWTH Aachen University. His research focuses on the implementation of a textile-integrated bioimpedance measuring system which allows the monitoring of hemodynamic parameters at home.

Alexander Schauermann was born in 1983 in Frunse (Kirghizia). He started his studies in electrical engineering at RWTH Aachen University in 2005. After obtaining his bachelor degree in 2007, he focused his research interest on microelectronics and medical engineering during his master studies.

Mark Ulbrich
E-mail: ulbrich@hia.rwth-aachen.de
Alexander Schauermann
E-mail: alexander.schauermann@rwth-aachen.de
Steffen Leonhardt
Philips Chair for Medical Information Technology
RWTH Aachen University
Pauwelsstrasse 20, 52074 Aachen, Germany

Energetic Approach to Large Strain Gradient Crystal Plasticity

Jan Kratochvíl¹, Martin Kružík²,¹

¹ Czech Technical University, Faculty of Civil Engineering, Thákurova 7, 166 29 Prague, Czech Republic
² Institute of Information Theory and Automation of the ASCR, Pod vodárenskou věží 4, 182 08 Prague, Czech Republic

Corresponding author: kruzik@utia.cas.cz

Abstract

We formulate a problem of the evolution of elasto-plastic materials subjected to external loads in the framework of large deformations and multiplicative plasticity. We focus on a spontaneous inhomogenization interpreted as a structuralization process. Our model includes gradients of the plastic strain and of the hardening variables, which provide a relevant length scale for the model. A simple computational experiment interpreted as a hint of a deformation substructure formation is included.

Keywords: energetic solution, crystal plasticity, gradient plasticity, slip system.
1 Introduction

The elastic-plastic behavior of crystalline materials poses a challenge for mathematical analysis on the microscopic, the mesoscopic, and the macroscopic scales. Here, we study a rate-independent model arising in crystal plasticity. A common and successful approach to the analysis of crystalline materials is by means of energy minimization; see e.g. Ortiz & Repetto [30]. This is manifested for various elastic crystals, even for those with the potential of undergoing phase transitions. The applicability of variational methods has been broadened to include rate-independent evolution. Typically, these models are characterized by energy minimization of a functional including macroscopic quantities, such as the macroscopic deformation gradient, as well as a dissipation functional. In order to introduce a physically relevant length scale into our problem we assume, following earlier works of Bažant & Jirásek [3], Dillon & Kratochvíl [9], Gurtin and Gurtin & Anand [15, 16], Mainik & Mielke [22] and others, that our energy functional depends also on the gradient of the plastic tensor. The gradient term models nonlocal effects caused by short-range interactions among dislocations. It is not clear, however, which function of the gradient should be used. For further details we refer to Kratochvíl & Sedláček [18], and to Bakó & Groma [1] for attempts to derive it from statistical physics, which reveal the complexity of the problem. A related approach to non-local models in damage and plasticity was undertaken in Bažant & Jirásek [3]; see also [8, 10, 12, 23].

In this paper, we formulate the so-called energetic solution due to Mielke et al. [28] for our problem. This concept of a solution is based on two requirements. First, as a consequence of the conservation law for linear momentum, all work put into the system by external forces or boundary conditions is spent on increasing the stored energy or is dissipated. Second, the formulation must satisfy the second law of thermodynamics, which in the present mechanical framework takes the form of a dissipation inequality. The latter requirement enters the framework as the assumption of the existence of a nonnegative convex potential of dissipative forces. As a consequence, the imposed deformation evolves in such a way that the sum of the stored and dissipated energies is always minimized. The main advantage of this approach is that it allows us to exploit the theory of the modern calculus of variations, and it suggests a numerical approach to the problem. To expose the essence of the mathematical structure of the energetic approach, we first analyze a proto-model, called here a material with internal variables. It freely follows the exposition of Francfort & Mielke [11], and we recall it here to motivate the notion of the energetic solution. In a second step, the framework is applied to elasto-plastic materials by specification of some internal variables. One of the main results is that the described energetic approach can be identified with crystal plasticity with strain gradients in the version formulated by Gurtin [15]. Gurtin's model is formulated in the mathematical language of differential equations; from the point of view of the numerical solution of a boundary value problem of crystal plasticity, the energetic formulation is more convenient.
Our results are closely related to [22], where the authors proved the existence of energetic solutions to strain gradient plasticity with a polyconvex energy density, allowed even to take the value +∞ (see [2]), and to Giacomini & Lussardi [13], where the linear elastoplasticity framework is considered. Here we allow for a finite quasiconvex stored energy density and large deformations. This is motivated by relaxation theory in the calculus of variations, where the effective macroscopic energy density is the quasiconvexification of the microscopic energy density. Thus, our results may be applied to the plasticity of materials with developing microstructures, as in shape memory alloys [4, 26], for instance. We refer the interested reader to [19] for a model describing cyclic plasticity in these materials. Another related paper is Carstensen et al. [6], where the authors use the energetic approach to plasticity without strain gradients.

In what follows, Ω ⊂ Rⁿ is an open bounded domain, and L^β(Ω; Rⁿ), 1 ≤ β < +∞, denotes the usual Lebesgue space of mappings Ω → Rⁿ whose modulus is integrable with the power β, while L^∞(Ω; Rⁿ) is the space of measurable and essentially bounded mappings Ω → Rⁿ. Further, W^{1,β}(Ω; Rⁿ) standardly denotes the space of mappings which live in L^β(Ω; Rⁿ) and whose gradients belong to L^β(Ω; R^{n×n}). If f : Rⁿ → R is convex but possibly nonsmooth, we define its subdifferential at a point x₀ ∈ Rⁿ as the set of all v ∈ Rⁿ such that f(x) ≥ f(x₀) + v·(x − x₀) for all x ∈ Rⁿ. The subdifferential of f will be denoted ∂_sub f, and its elements will be called subgradients of f at x₀. (For example, for f(x) = |x| one has ∂_sub f(0) = [−1, 1]; one-dimensional elastic domains of exactly this type appear below.)

2 Materials with internal variables

Consider a material whose elastic properties depend on internal variables z ∈ Z ⊂ R^m. The stored energy density is then W = W(F_e, z), where F_e ∈ R^{n×n} is the elastic strain. We are interested in the rate-independent evolution of the material. To this end, we assume the existence of a nonnegative convex potential δ = δ(ż) of dissipative forces, where ż denotes the time derivative of z. In order to ensure rate-independence, δ must be positively one-homogeneous, i.e., δ(αż) = αδ(ż) for all α > 0. Finally, we define for z ∈ Z a thermodynamic force

Q := -\frac{\partial}{\partial z} W(F_e, z). \qquad (1)

The evolution rule is introduced in the form

Q(t) \in \partial_{\mathrm{sub}} \delta(\dot z(t)), \qquad (2)

where ∂_sub δ is the subdifferential of δ. Hence, there is ω(t) ∈ ∂_sub δ(ż(t)) such that Q(t) = ω(t). The maximal monotonicity of the subdifferential implies that for all θ ∈ ∂_sub δ(ξ) we have

\langle \omega(t) - \theta,\ \dot z(t) - \xi \rangle \ge 0. \qquad (3)

Remark 2.1. In particular, taking ξ = 0 and realizing that the one-homogeneity of δ yields δ(ż) = ⟨ω, ż⟩ for all ω ∈ ∂_sub δ(ż), we get

\delta(\dot z(t)) = \langle \omega(t), \dot z(t) \rangle = \langle Q(t), \dot z(t) \rangle \ge \langle \theta, \dot z(t) \rangle \qquad (4)

for all θ ∈ ∂_sub δ(0). Inequality (4) expresses the so-called maximum dissipation principle (see e.g. Hill [17] or Simo [31]), which says that the thermodynamic forces "available" in the so-called elastic domain ∂_sub δ(0) are not strong enough to overcome frictional forces.

In what follows, Ω ⊂ Rⁿ is a bounded Lipschitz domain representing the so-called reference configuration, ν is the outer unit normal to ∂Ω, and Γ₀, Γ₁ ⊂ ∂Ω are disjoint. The deformation will be denoted y : Ω → Rⁿ. The evolution of the system will be controlled by external forces. Let f(t) : Ω → Rⁿ be the (volume) density of the external body forces and g(t) : Γ₁ ⊂ ∂Ω → Rⁿ the (surface) density of the surface forces.
The equilibrium equations governing the mechanical behavior of the system are:

-\mathrm{div}\left( \frac{\partial}{\partial \nabla y} W(\nabla y(t), z(t)) \right) = f(t) \quad \text{in } \Omega, \qquad (5)

y(t, x) = y_0(x) \quad \text{on } \Gamma_0, \qquad (6)

\frac{\partial}{\partial \nabla y} W(\nabla y(t), z(t))\, \nu(x) = g(t, x) \quad \text{on } \Gamma_1. \qquad (7)

The full system characterizing the proto-model consists of (5)–(7) supplemented by (2):

-\frac{\partial}{\partial z} W(\nabla y(t), z(t)) \in \partial\delta(\dot z(t)), \qquad z(0) = z_0, \quad z \in Z, \qquad (8)

where z₀ ∈ Z is an initial condition for the internal variable. In order to regularize the problem we may add the gradient of the internal variable, i.e., for ω ≥ 1 and ε > 0 we replace the stored energy density by

W(\nabla y, z) + \frac{\varepsilon}{\omega} |\nabla z|^{\omega}. \qquad (9)

The evolution rule then changes to

\varepsilon\, \mathrm{div}\left( |\nabla z(t)|^{\omega-2} \nabla z(t) \right) - \frac{\partial}{\partial z} W(\nabla y(t), z(t)) \in \partial\delta(\dot z(t)), \qquad z(0) = z_0, \quad z \in Z, \qquad (10)

so that the thermodynamic force becomes Q(t) := ε div(|∇z(t)|^{ω−2}∇z(t)) − (∂/∂z) W(∇y(t), z(t)). The potential energy of the system can be written (with ϱ := ε/ω)

I(t, y(t), z(t)) := \int_\Omega W(\nabla y(t), z(t))\, dx + \varrho \int_\Omega |\nabla z(t)|^{\omega}\, dx - L(t, y(t)), \qquad (11)

where the work done by the external forces is

L(t, y(t)) := \int_\Omega f(t) \cdot y(t)\, dx + \int_{\Gamma_1} g(t) \cdot y(t)\, dS, \qquad (12)

and the following energy balance is satisfied:

\frac{d}{dt} I(t, y(t), z(t)) = \dot L(t, y(t)) - \frac{d}{dt} \mathrm{Diss}(z; [0, t]), \qquad (13)

where Diss(z; [0, t]) := ∫₀ᵗ ∫_Ω δ(ż(s)) dx ds. Hence, integration with respect to time gives

I(t, y(t), z(t)) + \mathrm{Diss}(z; [0, t]) = I(0, y(0), z(0)) + \int_0^t \dot L(s, y(s))\, ds.

We can also consider a more general form of δ which depends also on (x, z), i.e. δ := δ(x, z, ż). Typically, however, we do not have enough smoothness of the internal variable to compute the time derivative on the right-hand side of (13). Following Mielke [24], we define the dissipation distance between two values of the internal variables z₀, z₁ ∈ Z as

D(x, z_0, z_1) := \inf_z \left\{ \int_0^1 \delta(x, z(s), \dot z(s))\, ds \,;\ z(0) = z_0,\ z(1) = z_1 \right\}, \qquad (14)

where z ∈ C¹([0, 1]; Z), and set

\mathcal{D}(z_1, z_2) = \int_\Omega D(x, z_1(x), z_2(x))\, dx, \qquad (15)

where z₁, z₂ ∈ 𝒵 := {z : Ω → R^m ; z(x) ∈ Z a.e. in Ω}. We assume that 𝒵 is equipped with strong and weak topologies which define the notions of convergence used below. Following [11, 22], we impose the following assumptions on 𝒟:

(i) weak lower semicontinuity:

\mathcal{D}(z, \tilde z) \le \liminf_{k \to \infty} \mathcal{D}(z_k, \tilde z_k) \qquad (16)

whenever z_k ⇀ z and z̃_k ⇀ z̃;

(ii) positivity: if {z_k} ⊂ 𝒵 is bounded and at the same time

\min\{\mathcal{D}(z_k, z), \mathcal{D}(z, z_k)\} \to 0, \quad \text{then } z_k \rightharpoonup z. \qquad (17)

2.1 Energetic solution

Suppose that we look for the time evolution of y(t) ∈ Y ⊂ {y : Ω → Rⁿ} and z(t) ∈ 𝒵 during the time interval [0, T]. The following two properties are the key ingredients of the so-called energetic solution due to Mielke and Theil [27, 28].

(i) Stability inequality: for all t ∈ [0, T], z̃ ∈ 𝒵, ỹ ∈ Y:

I(t, y(t), z(t)) \le I(t, \tilde y, \tilde z) + \mathcal{D}(z(t), \tilde z). \qquad (18)

(ii) Energy balance: for all t, 0 ≤ t ≤ T,

I(t, y(t), z(t)) + \mathrm{Var}(\mathcal{D}, z; [0, t]) = I(0, y(0), z(0)) + \int_0^t \dot L(\xi, y(\xi))\, d\xi, \qquad (19)

where

\mathrm{Var}(\mathcal{D}, z; [s, t]) := \sup \left\{ \sum_{i=1}^{N} \mathcal{D}(z(t_i), z(t_{i-1})) \,;\ \{t_i\} \text{ a partition of } [s, t] \right\}.

Definition 2.2. The mapping t ↦ (y(t), z(t)) ∈ Y × 𝒵 is an energetic solution of the problem (I, δ, L) if the stability inequality and the energy balance are satisfied.

Remark 2.3. Note that the stability inequality (i) can be written in the form: for all t ∈ [0, T], z̃ ∈ 𝒵, ỹ ∈ Y,

I(t, y(t), z(t)) + \mathcal{D}(z(t), z(t)) \le I(t, \tilde y, \tilde z) + \mathcal{D}(z(t), \tilde z),

which means that (y(t), z(t)) always minimizes (ỹ, z̃) ↦ I(t, ỹ, z̃) + 𝒟(z(t), z̃). Hence the equilibrium configurations are characterized by energetic minima. Contrary to elasticity theory, the minimized quantity is not only the overall elastic energy described by I; the dissipated energy is added. It is convenient to put Q := Y × 𝒵 and to set q := (y, z).
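Remark 2.3 translates directly into a time-incremental scheme: at each load step the new state minimizes the current energy plus the dissipation distance from the previous state. The following zero-dimensional toy sketch is our illustration, not the paper's finite element code; it uses a scalar internal variable z, the quadratic stored energy W(F, z) = (F − z)²/2, and the dissipation distance D(z₀, z₁) = σ|z₁ − z₀| with a hypothetical threshold σ and loading programme.

```python
import numpy as np
from scipy.optimize import minimize_scalar

sigma = 0.5                                  # dissipation threshold (hypothetical)
W = lambda F, z: 0.5 * (F - z) ** 2          # stored energy density
D = lambda z0, z1: sigma * abs(z1 - z0)      # dissipation distance

z = 0.0
for F in np.linspace(0.0, 2.0, 21):          # monotone loading programme
    # Incremental problem: z_k minimizes W(F_k, z) + D(z_{k-1}, z).
    res = minimize_scalar(lambda zz: W(F, zz) + D(z, zz),
                          bounds=(-5.0, 5.0), method='bounded')
    z = res.x
    print(f"F = {F:4.1f}  z = {z:6.3f}  stress = {F - z:6.3f}")
```

In this toy example the printed stress F − z saturates at the threshold σ once the material yields, i.e., the scheme reproduces elastic-perfectly-plastic behavior; the finite element computation reported below solves an analogous incremental problem with many unknowns and hardening.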
Moreover, we define the set of stable states at time t as

S(t) := \{ q \in Q \,;\ \forall \tilde q \in Q : I(t, q) \le I(t, \tilde q) + \mathcal{D}(q, \tilde q) \} \qquad (20)

and

S_{[0,T]} := \bigcup_{t \in [0,T]} \{t\} \times S(t). \qquad (21)

Moreover, a sequence {(t_k, q_k)}_{k∈N} is called stable if q_k ∈ S(t_k).

3 Applications to elasto-plasticity

Now we apply the energetic approach to an elasto-plastic problem.

3.1 Problem statement

In what follows, y : Ω → Rⁿ is a deformation of a body Ω ⊂ Rⁿ (in a fixed reference configuration) with the deformation gradient F = ∇y. In particular, y covers both elastic and plastic deformation. We consider the multiplicative split F = F_e F_p into an elastic part F_e and an irreversible plastic part F_p, which belongs to SL(n) := {A ∈ R^{n×n} ; det A = 1}. The so-called plastic strain F_p and the vector p ∈ R^m of hardening variables are internal variables influencing elasticity. In other words, z(x) = (F_p(x), p(x)) ∈ SL(n) × R^m for almost all x ∈ Ω. The energy functional I takes the form

I(t, y(t), z(t)) := \int_\Omega W(x, \nabla y\, F_p^{-1}, F_p, \nabla F_p, p, \nabla p)\, dx - L(t, y(t)), \qquad (22)

with L given by (12). In order to ease the notation we omit the dependence of W on x; the theory developed in this paper may, however, include non-homogeneous W. In what follows we suppose that y ∈ Y := {y ∈ W^{1,d}(Ω; Rⁿ) ; y = y₀ on Γ₀}, where Γ₀ ⊂ ∂Ω has positive surface measure, and that Γ₀ ∩ Γ₁ = ∅. Further,

\mathcal{Z} := \{ (F_p, p) \in W^{1,\beta}(\Omega; \mathbb{R}^{n \times n}) \times W^{1,\omega}(\Omega; \mathbb{R}^m) \,;\ F_p(x) \in SL(n) \text{ for a.e. } x \in \Omega \}.

As q = (y, z), it will be advantageous, and will cause no confusion, to write 𝒟 as a function of q, i.e., 𝒟(q₁, q₂) := 𝒟(z₁, z₂) if q₁ = (y₁, z₁) and q₂ = (y₂, z₂). Similarly, we may write I in terms of q = (y, z) as

I(t, q(t)) = \int_\Omega W(x, \nabla y\, F_p^{-1}, F_p, \nabla F_p, p, \nabla p)\, dx - L(t, q(t)),

where, obviously, L(t, q(t)) := L(t, y(t)). In this situation Q = (Q₁, Q₂) are the conjugate plastic stress and the conjugate hardening force, respectively:

Q_1 = \mathrm{div}\left( \frac{\partial W}{\partial \nabla F_p} \right) - \frac{\partial W}{\partial F_p}, \qquad Q_2 = \mathrm{div}\left( \frac{\partial W}{\partial \nabla p} \right) - \frac{\partial W}{\partial p}, \qquad (23)

with W evaluated at (∇y F_p^{-1}, F_p, ∇F_p, p, ∇p). The elastic domain is defined as

\mathcal{Q}(x, z) = \partial^{\mathrm{sub}}_{\dot z}\, \delta(x, z, 0). \qquad (24)

Remark 3.1. The principle of maximal dissipation asserts that

Q_1 : \dot F_p + Q_2 \cdot \dot p \qquad (25)

is maximal over (Q₁, Q₂) ∈ 𝒬(x, z) when Ḟ_p and ṗ are kept fixed. This means that for all (A, b) ∈ 𝒬(x, z)

Q_1 : \dot F_p + Q_2 \cdot \dot p \ge A : \dot F_p + b \cdot \dot p. \qquad (26)

Finally, we conclude with an example covered by our approach.

Example 3.2 (simple shear carried by a single slip). Consider a single slip system defined by two orthonormal vectors a, b ∈ R³ such that a is the glide direction and b is the slip-plane normal. Further suppose that we have a particular case of a so-called separable material where

W(x, F_e, z) = W_1(F_e) + \alpha |F_p|^2 + \frac{\varepsilon}{2} |\nabla F_p|^2,

with F_p(t) = I + γ(t) a ⊗ b, where γ is the plastic slip. The slip system is generally not fixed in the reference configuration: the slip-plane normal b̃ in the reference configuration has the form b̃ = (F_p)ᵀ b. In this special case, however, b̃ = b, so the slip-plane normal is kept constant during the process [15]. Owing to the special form of F_p we may identify z := (γ, p), because F_p depends only on γ. Choose the dissipation metric

\delta(z, \dot z) = \delta(\gamma, p, \dot\gamma, \dot p) = \begin{cases} p|\dot\gamma| & \text{if } \dot p \ge h|\dot\gamma|, \\ +\infty & \text{otherwise}, \end{cases}

where h is the so-called hardening function. The evolution rule reads

\varepsilon \Delta\gamma \in \partial^{\mathrm{sub}}\big( p|\dot\gamma| \big).

The elastic domain is ∂^sub δ(γ, p, 0, 0) = [−p, p] if ṗ ≥ h|γ̇| and (−∞, ∞) otherwise. The boundary of the elastic domain, ±p − εΔγ = 0, defines the yield surface.
Thus, the energetic approach recovers Gurtin's calculations on shear bands in single slip; see Gurtin (2000). The dissipation distance is

D(\gamma_1, p_1, \gamma_2, p_2) = \begin{cases} p_2 |\gamma_2 - \gamma_1| & \text{if } p_2 - p_1 \ge h|\gamma_2 - \gamma_1|, \\ +\infty & \text{otherwise}. \end{cases}

If h > 0, the optimal choice is p₂ = p₁ + h|γ₂ − γ₁|, which gives D(γ₁, p₁, γ₂, p₂) = p₁|γ₂ − γ₁| + h(γ₂ − γ₁)².

The following finite element computation illustrates the performance of our model. We take n = 2, Ω = (0, 1) × (0, 3), t ∈ [0, 1]. We apply a zero Dirichlet boundary condition on (0, 1) × {0} and u(t, x) = −0.6t for x ∈ (0, 1) × {3}. As to the material properties, we consider a homogeneous isotropic material with a Young modulus of 200 GPa, Poisson's ratio 0.3, a = (−1, 1)/√2, b = (1, 1)/√2 and W₁(e) = (λ/2)(tr e)² + µ|e|², where λ and µ are the Lamé constants corresponding to the chosen Young modulus and Poisson's ratio. The other model constants were set as follows: ε = 40 GPa·m², α = 2 MPa, and h = 0.2 MPa. The initial condition was γ = 0 (no plastic deformation), with initial yield stress p = 200 MPa. Note that the constant defining the energy stored in defects, i.e. α, is much smaller than the elastic constants of the material. The specimen was discretized using piecewise affine finite elements for the displacement as well as for F_p. The sequence of resulting minimization problems was solved by a Fortran 77 routine described in [5], which is based on a quasi-Newton method and is designed to solve large-scale nonlinear optimization problems with box constraints. We first observe the development of 45-degree bands in the vicinity of the boundary where the Dirichlet boundary conditions are applied. At the final stage, a large plastic deformation appears in the middle of the specimen (see e.g. [20] for similar calculations). The spontaneously formed inhomogeneity provides a hint of a deformation structuralization process. The characteristic length scale introduced through the higher gradients guarantees an intrinsic size of the inhomogeneity independent of the FEM mesh size.

Figure 1: Elasto-plastic deformation of a two-dimensional specimen at four time instants. The darker the shade of gray, the larger the plastic deformation |γ|. The loading increases from left to right.

Acknowledgements

This work was supported by GAČR through projects P107/12/0121 and P201/10/0357, and by the research project VZ-MŠMT 6840770003.

References

[1] Bakó, B., Groma, I.: Stochastic approach for modeling dislocation patterning. Phys. Rev. B, 60, 1999, 122–127.

[2] Ball, J. M.: Convexity conditions and existence theorems in nonlinear elasticity. Arch. Ration. Mech. Anal., 63, 1977, 337–403.

[3] Bažant, Z. P., Jirásek, M.: Nonlocal integral formulation of plasticity and damage: a survey of progress. J. Engrg. Mech., 128, 2002, 1119–1149.

[4] Bhattacharya, K.: Microstructure of Martensite. Why It Forms and How It Gives Rise to the Shape-Memory Effect. Oxford: Oxford University Press, 2003.

[5] Byrd, R. H., Lu, P., Nocedal, J., Zhu, C.: A limited memory algorithm for bound constrained optimization problems. SIAM J. Scientific Computing, 16, 1995, 1190–1208.

[6] Carstensen, C., Hackl, K., Mielke, A.: Nonconvex potentials and microstructures in finite-strain plasticity. Proc. Roy. Soc. Lond. A, 458, 2002, 299–317.

[7] Dacorogna, B.: Direct Methods in the Calculus of Variations. 2nd ed. New York: Springer, 2008.
[8] Dal Maso, G., Francfort, G., Toader, R.: A model of quasistatic crack growth of brittle fractures: existence and approximation results. Arch. Ration. Mech. Anal., 176, 2005, 165–225.

[9] Dillon, O. W., Kratochvíl, J.: A strain gradient theory of plasticity. Int. J. Solids Struct., 6, 1970, 1513–1533.

[10] Fleck, N., Hutchinson, J. W.: A phenomenological theory for strain gradient effects in plasticity. J. Mech. Phys. Solids, 41, 1993, 1825–1857.

[11] Francfort, G., Mielke, A.: Existence results for a class of rate-independent material models with nonconvex elastic energies. J. Reine Angew. Math., 595, 2006, 55–91.

[12] Frémond, M.: Non-Smooth Thermomechanics. Berlin: Springer, 2002.

[13] Giacomini, A., Lussardi, L.: Quasistatic evolution for a model in strain gradient plasticity. SIAM J. Math. Anal., 40, 2008, 1201–1245.

[14] Groma, I.: Link between the microscopic and mesoscopic length-scale description of the collective behaviour of dislocations. Phys. Rev. B, 56, 1997, 5807.

[15] Gurtin, M. E.: On the plasticity of single crystals: free energy, microforces, plastic-strain gradients. J. Mech. Phys. Solids, 48, 2000, 989–1036.

[16] Gurtin, M. E., Anand, L.: A theory of strain-gradient plasticity for isotropic, plastically irrotational materials. I. Small deformations. J. Mech. Phys. Solids, 53, 2005, 1624–1649.

[17] Hill, R.: A variational principle of maximum plastic work in classical plasticity. Q. J. Mech. Appl. Math., 1, 1948, 18–28.

[18] Kratochvíl, J., Sedláček, R.: Statistical foundation of continuum dislocation plasticity. Phys. Rev. B, 77, 2008, 134102.

[19] Kružík, M., Zimmer, J.: A model of shape memory alloys accounting for plasticity. IMA J. Appl. Math., 76, 2010, 193–216.

[20] Kuroda, M., Tvergaard, V.: A finite deformation theory of higher-order gradient crystal plasticity. J. Mech. Phys. Solids, 56, 2008, 2573–2585.

[21] Mainik, A., Mielke, A.: Existence results for energetic models for rate-independent systems. Calc. Var. Partial Differential Equations, 22, 2005, 73–99.

[22] Mainik, A., Mielke, A.: Global existence for rate-independent gradient plasticity at finite strain. J. Nonlinear Sci., 19, 2009, 221–248.

[23] Maugin, G. A.: The Thermomechanics of Plasticity and Fracture. Cambridge: Cambridge University Press, 1992.

[24] Mielke, A.: Energetic formulation of multiplicative elasto-plasticity using dissipation distances. Cont. Mech. Thermodyn., 15, 2002, 351–382.

[25] Mielke, A.: Evolution of rate-independent systems. In: Evolutionary Equations II, Handb. Differ. Equ., pp. 461–559. Amsterdam: Elsevier/North-Holland, 2005.

[26] Mielke, A., Roubíček, T.: A rate-independent model for inelastic behavior of shape-memory alloys. Multiscale Model. Simul., 1, 2003, 571–597.

[27] Mielke, A., Theil, F.: A mathematical model for rate-independent phase transformations with hysteresis. In: Models of Continuum Mechanics in Analysis and Engineering (eds. H.-D. Alder, R. Balean, R. Farwig). Aachen: Shaker Verlag, 1999, pp. 117–129.

[28] Mielke, A., Theil, F., Levitas, V. I.: A variational formulation of rate-independent phase transformations using an extremum principle. Arch. Ration. Mech. Anal., 162, 2002, 137–177.

[29] Mühlhaus, H.-B., Aifantis, E. C.: A variational principle for gradient plasticity. Int. J. Solids Structures, 28, 1991, 845–857.

[30] Ortiz, M., Repetto, E. A.: Nonconvex energy minimization and dislocation structures in ductile single crystals. J. Mech. Phys. Solids, 47, 1999, 397–462.
[31] Simo, J.: A framework for finite strain elastoplasticity based on maximum plastic dissipation and multiplicative decomposition. Part I: Continuum formulation. Comp. Meth. Appl. Mech. Engrg., 66, 1988, 199–219. Part II: Computational aspects. Comp. Meth. Appl. Mech. Engrg., 68, 1988, 1–31.

[32] Tsagrakis, I., Aifantis, E. C.: Recent developments in gradient plasticity. J. Engrg. Mater. Tech., 124, 2002, 352–357.

Gauge Symmetry and Howe Duality in 4D Conformal Field Theory Models

I. Todorov

To Jiří Niederle on the occasion of his 70th birthday

Abstract

It is known that there are no local scalar Lie fields in more than two dimensions. Bilocal fields, however, which naturally arise in conformal operator product expansions, do generate infinite Lie algebras. It is demonstrated that these Lie algebras of local observables admit (highly reducible) unitary positive energy representations in a Fock space. The multiplicity of their irreducible components is governed by a compact gauge group. The mutually commuting observable algebra and gauge group form a dual pair in the sense of Howe. In a theory of local scalar fields of conformal dimension two in four space-time dimensions the associated dual pairs are constructed and classified. The paper reviews joint work of B. Bakalov, N. M. Nikolov, K.-H. Rehren and the author.

1 Introduction

We review results of [32, 33, 34, 35, 36] and [2, 3] on 4D conformal field theory (CFT) models, which can be summed up as follows. The requirement of global conformal invariance (GCI) in compactified Minkowski space, together with the Wightman axioms [41], implies the Huygens principle (eq. (3.6) below) and rationality of correlation functions [32]. A class of 4D GCI quantum field theory models gives rise to a (reducible) Fock space representation of a pair consisting of an infinite dimensional Lie algebra L and a compact Lie group U commuting with it. The state space F splits into a direct sum of irreducible L × U modules, so that each irreducible representation (IR) of L appears with a multiplicity equal to the dimension of an associated IR of U. The pair (L, U) interconnects two independent developments: (i) it appears as a reductive dual pair [16, 17] within (a central extension of) an infinite dimensional symplectic Lie algebra; (ii) it provides a representation theoretic realization of the Doplicher-Haag-Roberts (DHR) theory of superselection sectors and compact gauge groups [8, 14]. I shall first briefly recall Howe's and DHR's theories; then (in Sect. 2) I will explain how some 2D CFT techniques can be extended to four space-time dimensions (in spite of persistent doubts that this is at all possible). After these preliminaries we shall proceed with our survey of 4D CFT models and the associated infinite dimensional Lie algebras which relate the two independent developments.

1.1 Reductive dual pairs

The notion of a (reductive) dual pair was introduced by Roger Howe in an influential preprint of the 1970s that was eventually published in [17]. It was previewed in two earlier papers of Howe, [15, 16], highlighting the role of the Heisenberg group and the applications of dual pairs to physics. For Howe a dual pair, the counterpart for groups and for Lie algebras of the mutual commutants of von Neumann algebras [14], is a (highly structured) concept that plays a unifying role in such widely different topics as Weil's metaplectic group approach [44] to θ-functions and automorphic forms (an important chapter in number theory) and, in physics, the quantum mechanical Heisenberg group along with the description of massless particles in terms of the ladder representations of U(2,2) [31], among others. Howe begins in [16] with a 2n-dimensional real symplectic manifold W = V + V′, where V is spanned by n symbols a_i, i = 1, …, n, called annihilation operators, and V′ is spanned by their conjugates, the creation operators a*_i, satisfying the canonical commutation relations (CCR)

[a_i, a_j] = 0 = [a^*_i, a^*_j], \qquad [a_i, a^*_j] = \delta_{ij}. \qquad (1.1)

The commutator of two elements of the real vector space W being a real number, it defines a (non-degenerate, skew-symmetric) bilinear form on W which vanishes on V and on V′ separately, and for which V′ appears as the dual space to V (the space of linear functionals on V). The real symplectic Lie algebra sp(2n, R), spanned by the antihermitean quadratic combinations of the a_i and a*_j, acts by commutators on W, preserving its reality and the above bilinear form. This action extends to the Fock space F, the (unitary, irreducible) representation of the CCR. It is, however, only exponentiated to the double cover of Sp(2n, R), the metaplectic group Mp(2n) (which is not a matrix group, i.e., it has no faithful finite-dimensional representation; we can view its Fock space, called by Howe [16] the oscillator representation, as the defining one). Two subgroups G and G′ of Mp(2n) are said to form a (reductive) dual pair if they act reductively on F (that is automatic for
for howe a dual pair, the counterpart for groups and for lie algebras of the mutual commutants of von neumann algebras ([14]), is a (highly structured) concept that plays a unifying role in such widely different topics as weil's metaplectic group approach [44] to θ functions and automorphic forms (an important chapter in number theory) and the quantum mechanical heisenberg group along with the description of massless particles in terms of the ladder representations of u(2,2) [31], among others (in physics). howe begins in [16] with a 2n-dimensional real symplectic manifold $w = v + v'$, where v is spanned by n symbols $a_i$, $i = 1, \ldots, n$, called annihilation operators, and v' is spanned by their conjugates, the creation operators $a_i^*$, satisfying the canonical commutation relations (ccr)

$[a_i, a_j] = 0 = [a_i^*, a_j^*]\,, \quad [a_i, a_j^*] = \delta_{ij}\,.$ (1.1)

the commutator of two elements of the real vector space w, being a real number, defines a (nondegenerate, skew-symmetric) bilinear form on it which vanishes on v and on v' separately and for which v' appears as the dual space to v (the space of linear functionals on v). the real symplectic lie algebra sp(2n, r), spanned by antihermitean quadratic combinations of the $a_i$ and $a_j^*$, acts by commutators on w, preserving its reality and the above bilinear form. this action extends to the fock space f (unitary, irreducible) representation of the ccr. it is, however, only exponentiated to the double cover of sp(2n, r), the metaplectic group mp(2n) (which is not a matrix group – i.e., it has no faithful finite-dimensional representation; we can view its fock space representation, called by howe [16] the oscillator representation, as the defining one). two subgroups g and g' of mp(2n) are said to form a (reductive) dual pair if they act reductively on f (that is automatic for a unitary representation like the one considered here) and each of them is the full centralizer of the other in mp(2n). the oscillator representation of mp(2n) displays a minimality property [19] that keeps attracting the attention of both physicists and mathematicians – see e.g. [25, 12, 26].

1.2 local observables determine a compact gauge group

observables (unlike charge carrying fields) are left invariant by (global) gauge transformations. this is, in fact, part of the definition of a gauge symmetry or a superselection rule as explained by wick, wightman and wigner [45]. it required the non-trivial vision of rudolf haag to predict in the 1960s that a local net of observable algebras should determine the compact gauge group that governs the structure of its superselection sectors (for a review and references to the original work – see [14]). it took over 20 years and the courage and dedication of haag's (then) young collaborators, doplicher and roberts [8], to carry out this program to completion. they proved that all superselection sectors of a local qft a with a mass gap are contained in the vacuum representation of a canonically associated (graded local) field extension e, and they are in a one-to-one correspondence with the unitary irreducible representations (ir) of a compact gauge group g of internal symmetries of e, so that a consists of the fixed points of e under g. the pair (a, g) in e provides a general realization of a dual pair in a local quantum theory.

2 how do 2d cft methods work in higher dimensions?

a number of reasons are given why 2-dimensional conformal field theory is, in a way, exceptional, so that extending its methods to higher dimensions appears to be hopeless.
1. the 2d conformal group is infinite dimensional: it is the direct product of the diffeomorphism groups of the left and right (compactified) light rays. (in the euclidean picture it is the group of analytic and antianalytic conformal mappings.) by contrast, for d > 2, according to the liouville theorem, the quantum mechanical conformal group in d space-time dimensions is finite (in fact, (d+1)(d+2)/2)-dimensional: it is (a covering of) the spin group spin(d, 2).

2. the representation theory of affine kac-moody algebras [20] and of the virasoro algebra [23] plays a crucial role in constructing soluble 2d models of (rational) cft. there are, on the other hand, no local lie fields in higher dimensions: after an inconclusive attempt by robinson [39] (criticized in [28]) this was proven for scalar fields by baumann [4].

3. the light cone in two dimensions is the direct product of two light rays. this geometric fact is the basis of splitting 2d variables into right- and left-movers' chiral variables. no such splitting seems to be available in higher dimensions.

4. there are chiral algebras in 2d cft whose local currents satisfy the axioms of vertex algebras (footnote 1) and have rational correlation functions. it was believed for a long time that they have no physically interesting higher dimensional cft analogue.

5. furthermore, the chiral currents in a 2d cft on a torus have elliptic correlation functions [46], the 1-point function of the stress energy tensor appearing as a modular form (these can also be interpreted as finite temperature correlation functions and a thermal energy mean value on the riemann sphere). again, there seemed to be no good reason to expect higher dimensional analogues of these attractive properties.

we shall argue that each of the listed features of 2d cft does have, when properly understood, a higher dimensional counterpart.

1. the presence of a conformal anomaly (a non-zero virasoro central charge c) tells us that the infinite conformal symmetry in 1+1 dimension is, in fact, broken. what is actually used in 2d cft are the (conformal) operator product expansions (opes), which can be derived for any d and allow one to extend the notion of a primary field (for instance with respect to the stress-energy tensor).

2. for d = 4, infinite dimensional lie algebras are generated by bifields $v_{ij}(x_1, x_2)$ which naturally arise in the ope of a (finite) set of (say, hermitean, scalar) local fields $\phi_i$ of dimension d (> 1):

$(x_{12}^2)^d\, \phi_i(x_1)\, \phi_j(x_2) = n_{ij} + x_{12}^2\, v_{ij}(x_1, x_2) + o((x_{12}^2)^2)\,, \quad x_{12} = x_1 - x_2\,, \quad x^2 = \mathbf{x}^2 - (x^0)^2\,,$ (2.2)

$n_{ij} = n_{ji} \in r$, where the $v_{ij}$ are defined as (infinite) sums of ope contributions of (twist two) conserved local tensor currents (and the real symmetric matrix $(n_{ij})$ is positive definite). we say more on this in what follows (reviewing results of [33, 34, 35, 36, 2, 3]).

3. we shall exhibit a factorization of higher dimensional intervals by using the following parametrization of the conformally compactified space-time ([43, 42, 37, 38]):

$\bar{m} = \left\{ z_\alpha = e^{it} u_\alpha\,,\ \alpha = 1, \ldots, d\,;\ t, u_\alpha \in r\,;\ u^2 = \sum_{\alpha=1}^{d} u_\alpha^2 = 1 \right\} = \frac{s^{d-1} \times s^1}{\{1, -1\}}\,.$ (2.3)

the real interval between two points $z_1 = e^{it_1} u_1$, $z_2 = e^{it_2} u_2$ is given by:

$z_{12}^2\, (z_1^2 z_2^2)^{-1/2} = 2(\cos t_{12} - \cos\alpha) = -4 \sin t_+ \sin t_-\,, \quad z_{12} = z_1 - z_2\,,$ (2.4)

$t_\pm = \frac{1}{2}(t_{12} \pm \alpha)\,, \quad u_1 \cdot u_2 = \cos\alpha\,, \quad t_{12} = t_1 - t_2\,.$ (2.5)

footnote 1: as a mathematical subject, vertex algebras were anticipated by i. frenkel and v. kac [11] and introduced by r. borcherds [5]; for reviews and further references see e.g. [21], [10].
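the factorization (2.4)–(2.5) is easy to check numerically. the following sketch is an illustration added here rather than part of the original paper; numpy, the dimension d = 4 and the random test points are our own choices. the branch $e^{-i(t_1+t_2)}$ for $(z_1^2 z_2^2)^{-1/2}$ is fixed by $z_k^2 = e^{2it_k}$, which follows from $u^2 = 1$.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4  # space-time dimension (our choice for the test)

def random_point():
    # a point z = e^{it} u of the parametrization (2.3): u is a real unit d-vector
    t = rng.uniform(-np.pi, np.pi)
    u = rng.normal(size=d)
    u /= np.linalg.norm(u)
    return t, u

t1, u1 = random_point()
t2, u2 = random_point()
z1 = np.exp(1j * t1) * u1
z2 = np.exp(1j * t2) * u2

z12_sq = np.sum((z1 - z2) ** 2)       # complex euclidean square, no conjugation
lhs = z12_sq * np.exp(-1j * (t1 + t2))  # times (z1^2 z2^2)^{-1/2}

t12 = t1 - t2
alpha = np.arccos(np.clip(u1 @ u2, -1.0, 1.0))
t_plus, t_minus = 0.5 * (t12 + alpha), 0.5 * (t12 - alpha)

print(np.allclose(lhs, 2 * (np.cos(t12) - np.cos(alpha))))      # True
print(np.allclose(lhs, -4 * np.sin(t_plus) * np.sin(t_minus)))  # True
```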
thus t+ and t− are the compact picture counterparts of "left" and "right" chiral variables (see [38]). the factorization of 2d cross ratios into chiral parts again has a higher dimensional analogue [7]:

$s := \frac{x_{12}^2\, x_{34}^2}{x_{13}^2\, x_{24}^2} = u_+ u_-\,, \quad t := \frac{x_{14}^2\, x_{23}^2}{x_{13}^2\, x_{24}^2} = (1 - u_+)(1 - u_-)\,, \quad x_{ij} = x_i - x_j\,,$ (2.6)

which yields a separation of variables in the d'alembert equation (cf. remark 3.1). one should, in fact, be able to derive the factorization (2.6) from (2.4).

4. it turns out that the requirement of global conformal invariance (gci) in minkowski space together with the standard wightman axioms of local commutativity and energy positivity entails the rationality of correlation functions in any even number of space-time dimensions [32]. indeed, gci and local commutativity of bose fields (for space-like separations of the arguments) imply the huygens principle and, in fact, the strong (algebraic) locality condition

$(x_{12}^2)^n\, [\phi_i(x_1), \phi_j(x_2)] = 0$ (2.7)

for n sufficiently large, a condition only consistent with the theory of free fields for an even number of space-time dimensions. it is this huygens locality condition which allows the introduction of higher dimensional vertex algebras [37, 38, 1].

5. local gci fields have elliptic thermal correlation functions with respect to the (differences of) conformal time variables in any even number of space-time dimensions; the corresponding energy mean values in a gibbs (kms) state (see e.g. [14]) are expressed as linear combinations of modular forms [38].

the rest of the paper is organized as follows. in sect. 3 we reproduce the general form of the 4-point function of the bifield v and the leading term in its conformal partial wave expansion. the case of a theory of scalar fields of dimension d = 2 is singled out, in which the bifields (and the unit operator) close a commutator algebra. in sect. 4 we classify the arising infinite dimensional lie algebras l in terms of the three real division rings f = r, c, h. in sect. 5 we formulate the main result of [2] and [3] on the fock space representations of the lie algebra l(f) coupled to the (dual, in the sense of howe [16]) compact gauge group u(n, f), where n is the central charge of l.

3 four-point functions and conformal partial wave expansions

the conformal bifields v(x1, x2) of dimension (1,1) which arise in the ope (2.2) (as sums of integrals of conserved tensor currents) satisfy the d'alembert equation in each argument [34]; we shall call them harmonic bifields. their correlation functions depend on the dimension d of the local scalar fields φ. for d = 1 one is actually dealing with the theory of a free massless field. we shall, therefore, assume d > 1. a basis $\{f_{\nu i}\,,\ \nu = 0, 1, \ldots, d-2\,,\ i = 1, 2\}$ of invariant amplitudes f(s, t) such that

$\langle 0\,|\, v_1(x_1, x_2)\, v_2(x_3, x_4)\,|\, 0 \rangle = \frac{1}{\rho_{13}\, \rho_{24}}\, f(s, t)\,, \quad \rho_{ij} = x_{ij}^2 + i0\, x_{ij}^0\,, \quad x^2 = \mathbf{x}^2 - (x^0)^2\,,$ (3.1)

is given by

$(u_+ - u_-)\, f_{\nu 1}(s, t) = \frac{u_+^{\nu+1}}{(1 - u_+)^{\nu+1}} - \frac{u_-^{\nu+1}}{(1 - u_-)^{\nu+1}}\,, \quad (u_+ - u_-)\, f_{\nu 2}(s, t) = (-1)^\nu\, \left( u_+^{\nu+1} - u_-^{\nu+1} \right)\,, \quad \nu = 0, 1, \ldots, d-2\,,$ (3.2)

where u± are the "chiral variables" (2.6);

$f_{01} = \frac{1}{t}\,,\ f_{02} = 1\,; \quad f_{11} = \frac{1 - s - t}{t^2}\,,\ f_{12} = t - s - 1\,; \quad f_{21} = \frac{(1 - t)^2 - s(2 - t) + s^2}{t^3}\,, \quad f_{\nu 2}(s, t) = \frac{1}{t}\, f_{\nu 1}\!\left( \frac{s}{t}, \frac{1}{t} \right)\,,$ (3.3)

the $f_{\nu i}$, i = 1, 2, corresponding to single pole terms [36] in the 4-point correlation functions $w_{\nu i}(x_1, \ldots, x_4) = f_{\nu i}(s, t)/(\rho_{13}\, \rho_{24})$:

$w_{01} = \frac{1}{\rho_{14}\, \rho_{23}}\,, \quad w_{02} = \frac{1}{\rho_{13}\, \rho_{24}}\,;$
$w_{11} = \frac{\rho_{13}\rho_{24} - \rho_{14}\rho_{23} - \rho_{12}\rho_{34}}{\rho_{14}^2\, \rho_{23}^2}\,, \quad w_{12} = \frac{\rho_{14}\rho_{23} - \rho_{13}\rho_{24} - \rho_{12}\rho_{34}}{\rho_{13}^2\, \rho_{24}^2}\,;$

$w_{21} = \frac{(\rho_{13}\rho_{24} - \rho_{14}\rho_{23})^2 - \rho_{12}\rho_{34}\,(2\rho_{13}\rho_{24} - \rho_{14}\rho_{23}) + \rho_{12}^2\rho_{34}^2}{\rho_{14}^3\, \rho_{23}^3}\,, \quad w_{22} = \frac{(\rho_{14}\rho_{23} - \rho_{13}\rho_{24})^2 - \rho_{12}\rho_{34}\,(2\rho_{14}\rho_{23} - \rho_{13}\rho_{24}) + \rho_{12}^2\rho_{34}^2}{\rho_{13}^3\, \rho_{24}^3}\,.$ (3.4)

we have $w_{\nu 2} = p_{34}\, w_{\nu 1}\ (= p_{12}\, w_{\nu 1})$, where $p_{ij}$ stands for the substitution of the arguments $x_i$ and $x_j$. clearly, for $x_1 = x_2$ (or s = 0, t = 1) only the amplitudes $f_{0i}$ contribute to the 4-point function (3.1). it has been demonstrated in [35] that the lowest angular momentum (ℓ) contribution to $f_{\nu i}$ corresponds to ℓ = ν. the corresponding ope of the bifield v starts with a local scalar field φ of dimension d = 2 for ν = 0; with a conserved current $j_\mu$ (of d = 3) for ν = 1; with the stress energy tensor $t_{\lambda\mu}$ for ν = 2. indeed, the amplitude $f_{\nu 1}$ admits an expansion in twist two (footnote 2) conformal partial waves $\beta_\ell(s, t)$ [6] starting with (for a derivation see [35], appendix b)

$\beta_\nu(s, t) = \frac{g_{\nu+1}(u_+) - g_{\nu+1}(u_-)}{u_+ - u_-}\,, \quad g_\mu(u) = u^\mu\, f(\mu, \mu; 2\mu; u)\,.$ (3.5)

footnote 2: the twist of a symmetric traceless tensor is defined as the difference between its dimension and its rank. all conserved symmetric tensors in 4d have twist two.

remark 3.1 eqs. (3.2), (3.5) provide examples of solutions of the d'alembert equation in any of the arguments $x_i$, i = 1, 2, 3, 4. in fact, the general conformal covariant (of dimension 1 in each argument) such solution has the form of the right hand side of (3.1) with

$f(s, t) = \frac{f(u_+) - f(u_-)}{u_+ - u_-}\,.$ (3.6)

remark 3.2 we note that albeit each individual conformal partial wave is a transcendental function (like (3.5)), the sum of all such twist two contributions is the rational function $f_{\nu 1}(s, t)$. it can be deduced from the analysis of 4-point functions that the commutator algebra of a set of harmonic bifields generated by the ope of scalar fields of dimension d can only close on the v's and the unit operator for d = 2. in this case the bifields v are proven, in addition, to be huygens bilocal [36].

remark 3.3 in general, irreducible positive energy representations of the (connected) conformal group are labeled by triples (d; j1, j2) including the dimension d and the lorentz weight (j1, j2) (2ji ∈ n), [29]. it turns out that for d = 3 there is a spin-tensor bifield of weight ((3/2; 1/2, 0), (3/2; 0, 1/2)) whose commutator algebra does close; for d = 4 there is a conformal tensor bifield of weight ((2; 1, 0), (2; 0, 1)) with this property. these bifields may be termed left-handed: they are analogues of chiral 2d currents; a set of bifields invariant under space reflections would also involve their right-handed counterparts (of weights ((3/2; 0, 1/2), (3/2; 1/2, 0)) and ((2; 0, 1), (2; 1, 0)), respectively).

4 infinite dimensional lie algebras and real division rings

our starting point is the following result of [36].

proposition 4.1. the harmonic bilocal fields v arising in the opes of a (finite) set of local hermitean scalar fields of dimension d = 2 can be labeled by the elements m of a unital algebra $m \subset mat(l, r)$ of real matrices closed under transposition, $m \to {}^t m$, in such a way that the following commutation relations (cr) hold:

$[v_{m_1}(x_1, x_2),\, v_{m_2}(x_3, x_4)] = \delta_{13}\, v_{{}^t m_1 m_2}(x_2, x_4) + \delta_{24}\, v_{m_1 {}^t m_2}(x_1, x_3) + \delta_{23}\, v_{m_1 m_2}(x_1, x_4) + \delta_{14}\, v_{m_2 m_1}(x_3, x_2) + tr(m_1 m_2)\, \delta_{12,34} + tr({}^t m_1 m_2)\, \delta_{12,43}\,;$ (4.1)

here $\delta_{ij}$ is the free field commutator, $\delta_{ij} := \delta^+_{ij} - \delta^+_{ji}$, and $\delta_{12,ij} = \delta^+_{1i}\, \delta^+_{2j} - \delta^+_{i1}\, \delta^+_{j2}$, where $\delta^+_{ij} = \delta^+(x_i - x_j)$ is the 2-point wightman function of a free massless scalar field.

we call the set of bilocal fields closed under the cr (4.1) a lie system. the types of lie systems are determined by the corresponding t-algebras – i.e., real associative matrix algebras m closed under transposition.
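the closed forms (3.3) can be checked by machine against the chiral representation (3.2) and against the substitution identity $f_{\nu 2}(s, t) = t^{-1} f_{\nu 1}(s/t, 1/t)$. the following sympy sketch is our illustration, not part of the original paper:

```python
import sympy as sp

up, um, s, t = sp.symbols('u_p u_m s t')

# closed forms (3.3)
f = {
    (0, 1): 1 / t,                   (0, 2): sp.Integer(1),
    (1, 1): (1 - s - t) / t**2,      (1, 2): t - s - 1,
    (2, 1): ((1 - t)**2 - s * (2 - t) + s**2) / t**3,
}
chiral = {s: up * um, t: (1 - up) * (1 - um)}  # eq. (2.6)

for (nu, i), expr in f.items():
    lhs = sp.simplify(((up - um) * expr).subs(chiral))
    if i == 1:   # eq. (3.2), first line
        rhs = (up / (1 - up))**(nu + 1) - (um / (1 - um))**(nu + 1)
    else:        # eq. (3.2), second line
        rhs = (-1)**nu * (up**(nu + 1) - um**(nu + 1))
    assert sp.simplify(lhs - rhs) == 0, (nu, i)

# the substitution identity in (3.3): f_{nu,2}(s,t) = f_{nu,1}(s/t, 1/t) / t
for nu in (0, 1):
    mapped = f[(nu, 1)].subs({s: s / t, t: 1 / t}, simultaneous=True) / t
    assert sp.simplify(mapped - f[(nu, 2)]) == 0, nu

print("eqs. (3.2) and (3.3) are mutually consistent")
```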
we first observe that each such m can be equipped with a frobenius inner product

$\langle m_1, m_2 \rangle = tr({}^t m_1 m_2) = \sum_{ij} (m_1)_{ij}\, (m_2)_{ij}\,,$ (4.2)

which is symmetric, positive definite, and has the property $\langle m_1 m_2, m_3 \rangle = \langle m_1, m_3\, {}^t m_2 \rangle$. this implies that for every right ideal $i \subset m$ its orthogonal complement is again a right ideal, while its transposed ${}^t i$ is a left ideal. therefore, m is a semisimple algebra, so that every module over m is a direct sum of irreducible modules. let now m be irreducible. it then follows from schur's lemma (whose real version [27] is richer but less popular than the complex one) that its commutant m′ in mat(l, r) coincides with one of the three real division rings (or not necessarily commutative fields): the fields of real and complex numbers r and c, and the noncommutative division ring h of quaternions. in each case the lie algebra of bilocal fields is a central extension of an infinite dimensional lie algebra that admits a discrete series of highest weight representations (footnote 3). it was proven, first in the theory of a single scalar field φ (of dimension two) [33], and eventually for an arbitrary set of such fields [36], that the bilocal fields $v_m$ can be written as linear combinations of normal products of free massless scalar fields $\varphi_i(x)$:

$v_m(x_1, x_2) = \sum_{i,j=1}^{l} m^{ij}\, {:}\varphi_i(x_1)\, \varphi_j(x_2){:}\,.$ (4.3)

for each of the above types of lie systems $v_m$ has a canonical form, namely

r: $v(x_1, x_2) = \sum_{i=1}^{n} {:}\varphi_i(x_1)\, \varphi_i(x_2){:}\,,$ c: $w(x_1, x_2) = \sum_{j=1}^{n} {:}\varphi_j^*(x_1)\, \varphi_j(x_2){:}\,,$ h: $y(x_1, x_2) = \sum_{m=1}^{n} {:}\varphi_m^+(x_1)\, \varphi_m(x_2){:}\,,$ (4.4)

where the $\varphi_i$ are real, the $\varphi_j$ are complex, and the $\varphi_m$ are quaternionic valued fields (corresponding to (4.3) with l = n, 2n, and 4n, respectively). we shall denote the associated infinite dimensional lie algebra by l(f), f = r, c, or h.

remark 4.1 we note that the quaternions (represented by 4×4 real matrices) appear both in the definition of y – i.e., of the matrix algebra m – and of its commutant m′, the two mutually commuting sets of imaginary quaternionic units $\ell_i$ and $r_j$ corresponding to the splitting of the lie algebra so(4) of real skew-symmetric 4×4 matrices into a direct sum of "a left and a right" so(3) lie subalgebra:

$\ell_1 = \sigma_3 \otimes \epsilon\,, \quad \ell_2 = \epsilon \otimes 1\,, \quad \ell_3 = \ell_1 \ell_2 = \sigma_1 \otimes \epsilon\,, \quad (\ell_j)_{\alpha\beta} = \delta_{\alpha 0}\, \delta_{j\beta} - \delta_{\alpha j}\, \delta_{0\beta} - \varepsilon_{0 j \alpha \beta}\,, \quad \alpha, \beta = 0, 1, 2, 3\,,\ j = 1, 2, 3\,;$

$r_1 = \epsilon \otimes \sigma_3\,, \quad r_2 = 1 \otimes \epsilon\,, \quad r_3 = r_1 r_2 = \epsilon \otimes \sigma_1\,,$ (4.5)

where the $\sigma_k$ are the pauli matrices, $\epsilon = i\sigma_2$, and $\varepsilon_{\mu\nu\alpha\beta}$ is the totally antisymmetric levi-civita tensor normalized by $\varepsilon_{0123} = 1$. we have

$y(x_1, x_2) = v_0(x_1, x_2)\, 1 + v_1(x_1, x_2)\, \ell_1 + v_2(x_1, x_2)\, \ell_2 + v_3(x_1, x_2)\, \ell_3 = y(x_2, x_1)^+ \quad (\ell_i^+ = -\ell_i\,,\ [\ell_i, r_j] = 0)\,;$

$v_\kappa(x_1, x_2) = \sum_{m=1}^{n} {:}\varphi_m^\alpha(x_1)\, (\ell_\kappa)_{\alpha\beta}\, \varphi_m^\beta(x_2){:}\,, \quad \ell_0 = 1\,.$ (4.6)

in order to determine the lie algebra corresponding to the cr (4.1) in each of the three cases (4.5), we choose a discrete basis and specify the topology of the resulting infinite matrix algebra in such a way that the generators of the conformal lie algebra (most importantly, the conformal hamiltonian h) belong to it. the basis, say $(x_{mn})$ where m, n are multi-indices, corresponds to the expansion [42] of a free massless scalar field φ in creation and annihilation operators of fixed energy states

$\varphi(z) = \sum_{\ell=0}^{\infty}\ \sum_{\mu=1}^{(\ell+1)^2} \left( (z^2)^{-\ell-1}\, \varphi_{\ell+1, \mu} + \varphi_{-\ell-1, \mu} \right) h_{\ell\mu}(z)\,,$ (4.7)

where $(h_{\ell\mu}(z)\,,\ \mu = 1, \ldots, (\ell+1)^2)$ form a basis of homogeneous harmonic polynomials of degree ℓ in the complex 4-vector z (of the parametrization (2.3) of m̄).
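the matrix relations (4.5) of remark 4.1 are elementary to verify. the sketch below is our illustration (numpy kronecker products, not part of the original paper); it confirms that the ℓ's and r's are two mutually commuting triples of imaginary quaternionic units:

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]])    # pauli sigma_1
s3 = np.array([[1, 0], [0, -1]])   # pauli sigma_3
eps = np.array([[0, 1], [-1, 0]])  # epsilon = i*sigma_2, a real matrix
one = np.eye(2)
I4 = np.eye(4)

l1, l2 = np.kron(s3, eps), np.kron(eps, one)
l3 = l1 @ l2
r1, r2 = np.kron(eps, s3), np.kron(one, eps)
r3 = r1 @ r2

for m in (l1, l2, l3, r1, r2, r3):
    assert np.array_equal(m @ m, -I4)        # imaginary units: each squares to -1
assert np.array_equal(l3, np.kron(s1, eps))  # l3 = sigma_1 (x) epsilon, as in (4.5)
assert np.array_equal(r3, np.kron(eps, s1))  # r3 = epsilon (x) sigma_1, as in (4.5)
for li in (l1, l2, l3):
    for rj in (r1, r2, r3):
        assert np.array_equal(li @ rj, rj @ li)  # the left and right so(3)'s commute

print("relations (4.5) verified")
```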
the generators of the conformal lie algebra su(2,2) are expressed as infinite sums in $x_{mn}$ with a finite number of diagonals (cf. appendix b to [2]). the requirement $su(2,2) \subset l$ thus restricts the topology of l, implying that the last (c-number) term in (4.1) gives rise to a non-trivial central extension of l. the analysis of [2], [3] yields the following

proposition 4.2 the lie algebras l(f), f = r, c, h, are 1-parameter central extensions of appropriate completions of the following inductive limits of matrix algebras:

r: $sp(\infty, r) = \lim_{n \to \infty} sp(2n, r)\,,$ c: $u(\infty, \infty) = \lim_{n \to \infty} u(n, n)\,,$ h: $so^*(4\infty) = \lim_{n \to \infty} so^*(4n)\,.$ (4.8)

in the free field realization (4.4) the suitably normalized central charge coincides with the positive integer n.

footnote 3: finite dimensional simple lie groups g with this property have been extensively studied by mathematicians (for a review and references – see [9]); for an extension to the infinite dimensional case – see [40]. if z is the centre of g and k is a closed maximal subgroup of g such that k/z is compact, then g is characterized by the property that (g, k) is a hermitean symmetric pair. such groups give rise to simple space-time symmetries in the sense of [30] (see also earlier work – in particular by günaydin – cited there).

5 fock space representation of the dual pair l(f) × u(n, f)

to summarize the discussion of the last section: there are three infinite dimensional irreducible lie algebras l(f) that are generated in a theory of gci scalar fields of dimension d = 2 and correspond to the three real division rings f (proposition 4.2). for an integer central charge n they admit a free field realization of type (4.3) and a fock space representation with (compact) gauge group u(n, f):

$u(n, r) = o(n)\,, \quad u(n, c) = u(n)\,, \quad u(n, h) = sp(2n)\ (= usp(2n))\,.$ (5.1)

it is remarkable that this result holds in general.

theorem 5.1 (i) in any unitary irreducible positive energy representation (uiper) of l(f) the central charge n is a positive integer. (ii) all uipers of l(f) are realized (with multiplicities) in the fock space f of $n \cdot \dim_r f$ free hermitean massless scalar fields. (iii) the ground states of equivalent uipers in f form irreducible representations of the gauge group u(n, f) (5.1). this establishes a one-to-one correspondence between the uipers of l(f) occurring in the fock space and the irreducible representations of u(n, f).

the proof of this theorem for f = r, c is given in [2] (the proof of (i) is already contained in [33]); the proof for f = h is given in [3].

remark 5.1 theorem 5.1 is also valid – and its proof becomes technically simpler – for a 2-dimensional chiral theory (in which the local fields are functions of a single complex variable). for f = c the representation theory of the resulting infinite dimensional lie algebra u(∞, ∞) is then essentially equivalent to that of the vertex algebra $w_{1+\infty}$ studied in [22] (see the introduction to [2] for a more precise comparison).

theorem 5.1 provides a link between two parallel developments, one in the study of highest weight modules of reductive lie groups (and of related dual pairs – see sect. 1.1) [24, 18, 9, 40] (and [16, 17]), the other in the work of haag-doplicher-roberts [14, 8] on the theory of (global) gauge groups and superselection sectors – see sect. 1.2. (they both originate – in the talks of irving segal and rudolf haag, respectively – at the same lille 1957 conference on mathematical problems in quantum field theory). albeit the settings are not equivalent, the results match.
the observable algebra (in our case, the commutator algebra generated by the set of bilocal fields $v_m$) determines the (compact) gauge group and the structure of the superselection sectors of the theory. (for a more careful comparison between the two approaches – see sections 1 and 4 of [2].) the infinite dimensional lie algebra l(f) and the compact gauge group u(n, f) appear as a rather special (limit-) case of a dual pair in the sense of howe [16], [17]. it would be interesting to explore whether other (inequivalent) pairs would appear in the study of commutator algebras of (spin)tensor bifields (discussed in remark 3.3) and of their supersymmetric extension (e.g. a limit as m, n → ∞ of the series of lie superalgebras osp(4m*|2n) studied in [13]).

acknowledgement

it is a pleasure to thank my coauthors bojko bakalov, nikolay m. nikolov and karl-henning rehren: all results (reported in sects. 3–5) of this paper have been obtained in collaboration with them. i thank cestmir burdik for inviting me to talk at the meeting "selected topics in mathematical and particle physics", prague, 5–7 may 2009, dedicated to the 70th birthday of jiri niederle. i also acknowledge partial support from the bulgarian national council for scientific research under contracts ph-1406 and do-02-257.

references

[1] bakalov, b., nikolov, n. m.: jacobi identities for vertex algebras in higher dimensions, j. math. phys. 47 (2006) 053505 (30 pp.); math-ph/0601012.
[2] bakalov, b., nikolov, n. m., rehren, k.-h., todorov, i.: unitary positive energy representations of scalar bilocal quantum fields, commun. math. phys. 271 (2007) 223–246; math-ph/0604069.
[3] bakalov, b., nikolov, n. m., rehren, k.-h., todorov, i.: infinite dimensional lie algebras of 4d conformal quantum field theory, j. phys. a: math. theor. 41 (2008) 194002; arxiv:0701.0627 [hep-th].
[4] baumann, k.: there are no scalar lie fields in three or more dimensional space-time, commun. math. phys. 47 (1976) 69–74.
[5] borcherds, r.: vertex algebras, kac-moody algebras and the monster, proc. natl. acad. sci. usa 83 (1986) 3068–3071.
[6] dobrev, v. k., mack, g., petkova, v. b., petrova, s. g., todorov, i. t.: harmonic analysis on the n-dimensional lorentz group and its application to conformal quantum field theory, lecture notes in physics 63, springer, berlin 1977.
[7] dolan, f. a., osborn, h.: conformal four point functions and operator product expansion, nucl. phys. b599 (2001) 459–496.
[8] doplicher, s., roberts, j.: why there is a field algebra with a compact gauge group describing the superselection structure in particle physics, commun. math. phys. 131 (1991) 51–107.
[9] enright, t., howe, r., wallach, n.: a classification of unitary highest weight modules, in: representation theory of reductive groups, progress in mathematics 40, birkhäuser, basel 1983, pp. 97–143.
[10] frenkel, e., ben-zvi, d.: vertex algebras and algebraic curves, mathematical surveys and monographs 88, ams 2001; second ed. 2004.
[11] frenkel, i. b., kac, v. g.: basic representations of affine lie algebras and dual resonance models, inventiones math. 62 (1980) 23–66.
[12] günaydin, m., pavlyk, o.: a unified approach to the minimal unitary realizations of noncompact groups and supergroups, jhep 0609 (2006) 050; hep-th/0604077v2.
[13] günaydin, m., scalise, r.: unitary lowest weight representations of the non-compact supergroup osp(2m*|2n), j. math. phys. 32 (1991) 599–606.
[14] haag, r.: local quantum physics: fields, particles, algebras, springer, berlin 1992, 412 p.
[15] howe, r.: on the role of the heisenberg group in harmonic analysis, bull. amer. math. soc. 3:2 (1980) 821–843.
[16] howe, r.: dual pairs in physics: harmonic oscillators, photons, electrons, and singletons, in: applications of group theory in physics and mathematical physics, m. flato, p. sally, g. zuckerman (eds.), lectures in applied mathematics 21, amer. math. soc., providence, r.i. 1985, pp. 179–206.
[17] howe, r.: remarks on classical invariant theory, trans. amer. math. soc. 313 (1989) 539–570; transcending classical invariant theory, j. amer. math. soc. 2:3 (1989) 535–552.
[18] jakobsen, h. p.: the last possible place of unitarity for certain highest weight modules, math. ann. 256 (1981) 439–447.
[19] joseph, a.: minimal realizations and spectrum generating algebras, commun. math. phys. 36 (1974) 325–338; the minimal orbit in a simple lie algebra and its associated maximal ideal, ann. sci. école normale sup., série 4, 9 (1976) 1–29.
[20] kac, v. g.: infinite dimensional lie algebras, cambridge univ. press, cambridge 1990.
[21] kac, v.: vertex algebras for beginners, 2nd ed., ams, providence, r.i. 1998.
[22] kac, v., radul, a.: representation theory of the vertex algebra w1+∞, transform. groups 1 (1996) 41–70.
[23] kac, v. g., raina, a. k.: highest weight representations of infinite dimensional lie algebras, adv. series in math. phys. 2, world scientific, singapore 1987.
[24] kashiwara, m., vergne, m.: on the segal-shale-weil representations and harmonic polynomials, invent. math. 44 (1978) 1–47.
[25] kazhdan, d., pioline, b., waldron, a.: minimal representations, spherical vectors and exceptional theta series, commun. math. phys. 226 (2002) 1–40; hep-th/0107222.
[26] kobayashi, t., mano, g.: the schrödinger model for the minimal representation of the indefinite orthogonal group o(p, q), arxiv:0712.1769v2 [math.rt], july 2008 (167+iv pp.).
[27] lang, s.: algebra, third revised edition, graduate texts in mathematics 211, springer, n.y. 2002.
[28] lowenstein, j. h.: the existence of scalar lie fields, commun. math. phys. 6 (1967) 49–60.
[29] mack, g.: all unitary representations of the conformal group su(2,2) with positive energy, commun. math. phys. 55 (1977) 1–28.
[30] mack, g., de riese, m.: simple symmetries: generalizing conformal field theory, j. math. phys. 48 (2007) 052304, 1–21; hep-th/0410277.
[31] mack, g., todorov, i.: irreducibility of the ladder representations of u(2,2) when restricted to the poincaré subgroup, j. math. phys. 10 (1969) 2078–2085.
[32] nikolov, n. m., todorov, i. t.: rationality of conformally invariant local correlation functions on compactified minkowski space, commun. math. phys. 218 (2001) 417–436; hep-th/0009004.
[33] nikolov, n. m., stanev, ya. s., todorov, i. t.: four dimensional cft models with rational correlation functions, j. phys. a 35 (2002) 2985–3007; hep-th/0110230.
[34] nikolov, n. m., stanev, ya. s., todorov, i. t.: globally conformal invariant gauge field theory with rational correlation functions, nucl. phys. b670 (2003) 373–400; hep-th/0305200.
[35] nikolov, n. m., rehren, k.-h., todorov, i. t.: partial wave expansion and wightman positivity in conformal field theory, nucl. phys. b722 (2005) 266–296; hep-th/0504146.
[36] nikolov, n. m., rehren, k.-h., todorov, i.: harmonic bilocal fields generated by globally conformal invariant scalar fields, commun. math. phys. 279 (2008) 225–250; arxiv:0711.0628 [hep-th].
[37] nikolov, n. m.: vertex algebras in higher dimensions and globally conformal invariant quantum field theory, commun. math. phys. 253 (2005) 283–322; hep-th/0307235.
[38] nikolov, n. m., todorov, i. t.: elliptic thermal correlation functions and modular forms in a globally conformal invariant qft, rev. math. phys. 17 (2005) 613–667; hep-th/0403191.
[39] robinson, d. w.: on a soluble model of relativistic field theory, physics lett. 9 (1964) 189–190.
[40] schmidt, m. u.: lowest weight representations of some infinite dimensional groups on fock spaces, acta appl. math. 18 (1990) 59–84.
[41] streater, r. f., wightman, a. s.: pct, spin and statistics, and all that, w. a. benjamin, reading 1964; princeton university press, princeton 2000.
[42] todorov, i. t.: infinite-dimensional lie algebras in conformal qft models, in: a. o. barut, h. d. doebner (eds.), conformal groups and related symmetries. physical results and mathematical background, pp. 387–443, lecture notes in physics 261, springer, berlin 1986.
[43] uhlmann, a.: the closure of minkowski space, acta phys. pol. 24 (1963) 295–296.
[44] weil, a.: sur certains groupes d'opérateurs unitaires, acta math. 111 (1964) 143–211; sur la formule de siegel dans la théorie des groupes classiques, ibid. 113 (1965) 1–87.
[45] wick, g. c., wightman, a. s., wigner, e. p.: the intrinsic parity of elementary particles, phys. rev. 88 (1952) 101–105.
[46] zhu, y.: modular invariance of characters of vertex operator algebras, j. amer. math. soc. 9:1 (1996) 237–302.

ivan todorov, e-mail: todorov@inrne.bas.bg, institute for nuclear research and nuclear energy, tsarigradsko chaussee 72, bg-1784 sofia, bulgaria

acta polytechnica vol. 52 no. 4/2012

energy accumulation by hydrogen technologies

jiřina čermáková 1,2, aleš doucek 1,2, lukáš polák 1,2
1 nuclear research institute, department of hydrogen technologies, husinec – řež 130, 250 68 řež, czech republic
2 institute of chemical technology, department of gas, coke and air protection, technická 5, 168 48 praha, czech republic
correspondence to: cea@ujv.cz

abstract

photovoltaic power plants as a renewable energy source have been receiving rapidly growing attention in the czech republic and in the other eu countries. this rapid development of photovoltaic sources is having a negative effect on the electricity power system control, because they depend on the weather conditions and provide a variable and unreliable supply of electric power. one way to reduce this effect is by accumulating electricity in hydrogen. the aim of this paper is to introduce hydrogen as a tool for regulating photovoltaic energy in island mode. a configuration has been designed for connecting households with the photovoltaic hybrid system, and a simulation model has been made in order to check the validity of this system. the simulation results provide energy flows and have been used for optimal sizing of real devices. an appropriate system can deliver energy in a stand-alone installation.

keywords: hydrogen, accumulation, photovoltaic.

1 introduction

the main reasons for higher utilization of renewable electricity sources are the fluctuation of crude oil prices and the limited supply of fossil resources, and also global warming, local pollution and contamination, geopolitical pressure, and the growth in power consumption. photovoltaic (pv) power plants are one type of renewable electricity source. photovoltaic power plants produce practically zero emissions, but the amount of electricity that is produced depends on the sunlight falling on the earth's surface.
pv power plants provide a variable and unreliable supply of electric power over time, and this has a negative impact on the operation of the electric power system [1]. there are two ways to overcome the variability in the output of photovoltaic power plants. one is by transposing consumption into the time when energy is available, while the other way is accumulation. the only way to accumulate electricity is by transforming it into another type of energy, e.g. into hydrogen. hydrogen as an energy carrier enables photovoltaic energy produced in times of excess power in the grid to be stored, and then supplied to the grid when it is required, i.e., during peak periods in the daily load curve [2]. another possible way to use the hydrogen that is produced is as a fuel for vehicles. hydrogen-fuelled vehicles have several advantages over vehicles equipped with petrol or diesel engines, especially ecological advantages.

2 solar energy storage by hydrogen

for obvious reasons, the production of electricity from pv panels is linked to the weather conditions, and is variable in the course of a day, a month, and a year. it is also unpredictable. an example of the daily pv power profile is shown in figure 1. this load profile does not correspond closely with the energy requirements of an average household, which are represented in figure 2. energy requirements are unstable, and they differ between weekdays and weekends. this discordance between energy production and energy utilization can be dealt with by hydrogen accumulation. the configuration of the proposed pv layout and hydrogen equipment is shown in figure 3. it consists of the following components: pv panels, an electrolyzer, a fuel cell (fc), a hydrogen storage tank, and a battery [5]. the components represent the current state of the art, and their parameters are presented in table 1. the photovoltaic power plant contains 56 polycrystalline pv panels with a total area of about 90 m². a pem (proton exchange membrane) electrolyzer made by the hogen® company is used. the electrolyzer is a fully integrated system that includes a power supply, a hydrogen gas dryer and a heat exchanger, and it generates hydrogen (0–1 m³/hr) at 13 bars. the hydrogen is stored in a low pressure tank. a pem fuel cell made by fronius is used as the fuel cell.

figure 1: load profile of electrical energy generated from pv panels, providing a power output of 12 kwp [3]
figure 2: average household electricity use over the day for detached houses [4]
figure 3: configuration layout

table 1: parameters of the devices
device – characteristic – parameters
photovoltaic power plant – polycrystalline pv cell – 12 kwp (230 w)
battery capacity – li-ion battery – 45 ah (2.2 kwh)
electrolyzer power – pem electrolyzer – 6.3 kw
fuel cell – pem cell – 4 kw
h2 storage volume – low pressure tank – 10 kg (5–15 bar)
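a quick ideal-gas estimate makes the tank rating in table 1 plausible. the sketch below is our illustration, not part of the original paper; the 10 m³ geometric volume is the sizing quoted later in the text, and room temperature is an assumption:

```python
R = 8.314          # J/(mol*K), universal gas constant
M_H2 = 2.016e-3    # kg/mol, molar mass of hydrogen
V = 10.0           # m^3, tank volume (taken from the sizing quoted later in the text)
T = 298.0          # K, room temperature (assumption)

def h2_mass(p_bar):
    """ideal-gas hydrogen mass in the tank at pressure p_bar [bar]."""
    n = (p_bar * 1e5) * V / (R * T)   # moles, from pV = nRT
    return n * M_H2                   # kg

usable = h2_mass(15) - h2_mass(5)     # mass released when cycling from 15 down to 5 bar
print(f"usable hydrogen: {usable:.1f} kg")  # about 8 kg, of the order of the 10 kg rating
```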
the remaining part of the power that is produced is stored in a li-ion battery, which enables a rapid response to fluctuations in renewable energy generation, as it releases power directly in response to a request from end devices. a simulation model has been made in order to check the validity of the connection of the household with the photovoltaic hybrid system. the model, based on realistic photovoltaic production over the course of half a year (1. 4.–29. 9. 2011), and the consumption of a household, enabled the energy flow to be observed. the model is intended for a long-term evaluation of energy flow, and batteries are therefore not taken into consideration. the simulation results have been used for optimal sizing of each device, and will be verified by an experimental system. the model was developed using matlab/simulink, and worked on the assumption that the photovoltaic hybrid system covers the entire energy consumption of the household, and in addition, it permits the sale of excess energy to the local electricity grid. the following input quantities are used: • immediate power of the photovoltaic plant [kwe] • immediate energy consumption of a household [kwe]. the output data includes • immediate fuel cell power [kwe] • immediate hydrogen consumption [kg/hr] • immediate electrolyzer power [kwe] • immediate hydrogen production [kg/hr]. 3 results the simulation model addresses the optimum design of an integrated power system, when weather variations co-exist with varying efficiency in the performance of the electrolyzer and the fuel cell. the aim is to meet the power demands for a targeted application under a variable load schedule. the results obtained from the simulation model are shown in figures 4 and 5. the data shows that the energy produced from the pv panels first covers the household consumption, and then the rest of energy is used by the electrolyzer, until the pressure in the hydrogen storage tank reaches a pre-specified pressure limit. when the pressure in the storage tank is below this limit, hydrogen production is initiated in the electrolyzer in order to fill the tanks. when the pressure in the tank reaches the pre-specified limit, the excess of energy is sold to the local grid. the hydrogen production is limited to the pressure limit and the maximum power of the electrolyzer (7 kw). the energy deficit that is generated from lack of energy from the pv panels is compensated by the use of hydrogen stored in the fuel cell, which produces electrical energy again. figure 4: results obtained from the simulation model 41 acta polytechnica vol. 52 no. 4/2012 figure 5: results obtained from the simulation model on the basis of an analysis of the energy flow patterns was made the following characteristic situations: • daily ordinary disproportion of the energy production from pv panels and the household energy consumption; basically higher energy production around midday, and lack of energy production in the evening. this disproportion is covered by using hydrogen from the storage tank. figures 4 and 5 show that there is a requirement for about 0.3 kg of h2 per day. • longer-term lack of energy production due to unfavourable weather conditions; basically this period is not longer than 7–10 days. in this case, the required amount of stored h2 is higher, and is assessed at 2 kg of h2. this amount of h2 proceeds from realistic pv panel power data, which was measured in the time period from 1. 4.–29. 9. 2011. 
we estimate that the required amount of h2 will increase to 4 kg h2 in the winter period. irrespective of our results, the experimental facility constructed at the nuclear research institute is designed with a storage tank for 10 m3 of h2 under a working pressure of 5–15 bar. this volume is equal to approximately 10 kg of usable hydrogen. this relatively large volume of hydrogen allows good experimental flexibility. the parameters of the power electrolyzer and the fuel cell remain at the same level, 7 kw for the power electrolyzer and 4 kw for the fuel cell. 4 conclusion this paper has described a hybrid res-based system, consisting of pv panels, an electrolyzer, a fuel cell and a hydrogen storage tank. in order the check the validity of this system, a simulation model was made, which is based on realistic photovoltaic production over a period of half a year. the results show two major discrepancies between energy production and energy utilization: the daily ordinary disproportion between energy is covered by 0.3 kg of h2, and the long-term lack of energy is covered by 2 kg of h2. the results indicate that this hybrid res-based system can deliver energy in a stand-alone installation. acknowledgement this work received financial support from mpo tip fr-ti2/442. references [1] giannakoudis, g., papadopoulos, a. i., seferlis, p., voutetakis, s.: optimum design and operation under uncertainty of power systems using renewable energy sources and hydrogen storage. in international journal of hydrogen energy, 2010, vol. 35, p. 872–891. [2] moldř́ık, p., cválek, r.: akumulace energie z fotovoltaiky do vod́ıku. in elektrorevue, 2011. issn 1213-1539. 42 acta polytechnica vol. 52 no. 4/2012 [3] doucek, a.: návrh technického řešeńı pro výrobu vod́ıku ze solárńı energie. studie č. újv-fr-ti2442-2012-3, 2011, újv řež. [4] widén, j., lundh, m., vassileva, i., dahlquist, e., ellega, k., wackelga, e.: constructing load profiles for household electricity and hot water from time-use data — modelling approach and validation. in energy and buildings, 2009, vol. 41, p. 753–768. [5] lagorsea, j., simoesm, m. g., miraouia, a., costergc, p.: energy cost analysis of a solarhydrogen hybrid energy system for stand-alone applications. in international journal of hydrogen energy, 2008, vol. 33, p. 2 871–2879. 43 ap09_2.vp 1 introduction efficiency is a more and more critical issue when designing transmitters, both to save energy costs at the base station and to reduce the power consumption of mobile terminals to increase battery life. efficiency boosting techniques tend to rely on switch-mode amplifiers (sma). conventional amplifier classes like a, ab, which offer high linearity at the cost of efficiency are avoided. the non-linearities caused by these amplifiers are compensated using envelope elimination and restoration (eer)[2] or vector addition (linc) [1] techniques or both (clier) [5]. for implementation of a clier amplifier architecture a class e amplifier is well suited both for the linc and for the eer part. linc consists of two non-linear amplifiers. the input signal, containing both amplitude a(t) and phase �(t) is transformed into two solely phase modulated signals with constant envelopes. both non-linear amplified signals are vector combined at the output, resulting in an amplified replica of the original time varying envelope signal. additionally, the supply of the amplifiers is modulated with a slowly varying signal. 
this reduces the dynamics of the linc signal and thereby increases the efficiency of the entire system. this paper focuses on the implementation of the class e amplifiers.

2 theory

for lossless operation of the class e amplifier there are two criteria that have to be met: first, zero voltage across the transistor terminals when the switch closes [4],

$u_s(\omega t)\big|_{\omega t = \omega t_c} = 0\,,$ (1)

and, second, no voltage building across the transistor terminals while the switch is closed,

$\frac{d\, u_s(\omega t)}{d\, \omega t}\bigg|_{\omega t = \omega t_c} = 0\,,$ (2)

where $\omega t_c$ denotes the instant at which the switch closes. to obtain this non-overlapping between voltage and current, a complex load impedance is imposed on the switch output terminals, which results in a phase shift between the current and the voltage of the output signal. fig. 1 shows a typical class e circuit. the transistor is ideally replaced with a switch. the load network consists of a parallel capacitor c and an inductance l in series with a tuned output network l0c0 and load resistance r. when using a real transistor, the capacitor c can comprise the intrinsic transistor capacitance cds and an external capacitance. an rf choke at the supply voltage enforces a dc supply current.

fig. 1: class e circuit with lumped elements

the output tuning network l0c0 forces the current through the load resistance r to be a sinusoidal function of the angular time ωt and phase shift φ. the current can be described as

$i_r(\omega t) = i_r\, \sin(\omega t + \varphi)\,.$ (3)

because the rf choke enforces a dc current $i_{dc}$, the difference between the output and dc currents flows into the switch-capacitance network. initially, when the switch s is closed, the switch current is zero, $i_s(\omega t = 0) = 0$. with (3), the dc current can then be described as

$i_{dc} = -i_r\, \sin(\varphi)\,.$ (4)

while the switch s is closed, the capacitor c has zero voltage across it and consequently all current flows through s for ωt ≥ 0. the current flowing through the switch s can be described as

$i_s(\omega t) = i_{dc} + i_r\, \sin(\omega t + \varphi) = i_r\, [\sin(\omega t + \varphi) - \sin(\varphi)]\,.$ (5)

at the time $\omega t = \omega t_0 + 2\pi k$ the switch s opens, and the difference current $i_c(\omega t \geq \omega t_0) = i_s(\omega t \geq \omega t_0)$ flows into capacitor c, which starts charging. the build-up of the voltage $u_s(\omega t)$ across the switch s is determined by the charging of capacitor c, which can be described as

$u_s(\omega t) = \frac{1}{\omega c} \int_{\omega t_0}^{\omega t} i_c(\omega t')\, d(\omega t')\,.$ (6)
as can be seen, there is no overlap between switch current and voltage waveforms, resulting in 100 % switch efficiency. a common design approach for high frequency class e pas is to design a lumped element amplifier [4] and then replace each component with an equivalent transmission line. this however makes the design and optimization both tedious and time-consuming because only the slightest change in a transmission line parameter will change the load network radically, losing class e operating conditions. a better approach is found in [3]. consider fig. 3. class e operation in this circuit depends on the load network providing a phase angle � � 49.05 deg. this complex output impedance, using tan � � x r, can be written as z r j f nf ne e � � � � � � � ( . ) , . 1 1152 1 at at 0 0 (9) the optimum value for the load resistance re at the fundamental frequency f0 is given by r f ce s � 1 34 225 0. . (10) by choosing a moderate to high impedance for transmission lines ztl1 and ztl2 and defined output load impedance rl, circuit parameters can be further obtained by keeping the absolute value of the reflection parameter on both sides of tl1 equal � �e g� . the total admittance combining the load admittance gl and transmission lines tl2 and tl3 can be written as y g jbg l� � , (11) where b is b y g y ge tl l tl l e � � � � � � � � 2 2 2 2 1 1 1 ( ) ( ) (12) because the load resistance is real, the imaginary part of yg has to comprise both transmission lines tl2 and tl3. with the already chosen parameter ztl2, ztl 3 can be obtained with z b z tl tl 3 2 1 1 30� � � tan( deg) . (13) with �e being the reflection factor at the transistor output terminals �e e tl e tl z z z z � � � 1 1 (14) and �g the reflection factor at the end of transmission line tl1 �g g tl g tl y z y z � � � 1 1 1 1 (15) the electrical length of ztl1 can be calculated with l j tl e g1 4� � � � � � � � � ln � � . (16) because this implementation only provides a high impedance at the second and third harmonics across the transistor output terminals, output current and voltage waveforms are not completely separated resulting in less than 100 % maximum efficiency. hence the transmissionline circuit should only be used for applications where it is not possible to use © czech technical university publishing house http://ctn.cvut.cz/ap/ 91 acta polytechnica vol. 49 no. 2–3/2009 0 100 200 300 0 1 2 3 4 �t fig. 2: switch current (light) and voltage (dark) transient fig. 3: class e circuit with transmission lines lumped elements, or the insertion losses are greater than the efficiency restraints due to harmonic termination problems. 2.1 input network to meet the demands that class e imposes on the used transistor for this project a gaas hemt transistor of type cree cgh4010f is chosen. this transistor allows for a maximum collector voltage of 120 v . the input impedance for this transistor is 4�j 4 � at 2.4 ghz which needs to be adapted to the 50 � output of the pre-amplifier. while this can be easily be achieved using a � 4 transmission line limiting bandwidth, a binomial input filter is chosen. the main advantage of using a binomial matching transformer is that the passband response is maximally flat near the design frequency. the order n also determines the number of sections in the transformer [7]. 2.2 simulation and measurement results in this section, the class e amplifier structure is modeled and simulated in the agilent advanced design system (ads). then simulation results are compared to measurement of the implemented amplifier. 
the cree cgh4010f gan hemt has an output capacitance cgs of typically 1.3 pf [6]. using eqs. (10) and (9) the required load impedance ze can be calculated. further following the advice of [3] a medium to high characteristic impedance has been chosen for tl1 � 75 � and tl2 � 50 �. further parameters can easily be attained using equations presented in section 2. first, the circuit output network is simulated using two real ports, to review the location of the second and third harmonic in a smith chart. as already mentioned, the output network should impose a 40.5 deg phase shift on the transistor output terminals for the target frequency and impose a high impedance at the second and third harmonic. fig. 4 shows the frequency response of the output network as seen by the transistor output terminals. fig. 5 shows the frequency response of the amplifier. in red the measurement, and in blue the simulated values. remarkable for this amplifier is that the measurement results are well above the simulation results. in addition, the real amplifier performs best with 28 dbm input power, while the simulation is carried out with 30 dbm input power. these differences can be explained given that the simulation model of the transistor is optimized for class ab operation. the output frequency response of the pre-amplifier is shown in turquoise. fig. 6 shows the resulting amplifier efficiency. fig. 7 shows that the two amplifiers, yellow and blue, are put together on a single circuit board for implementation in 92 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 49 no. 2–3/2009 fig. 4: smithchart of the output load network fig. 5: frequency response of the measured and simulated amplifier, and pre-amplifier using a 20 v dc supply voltage fig. 6: power added efficiency the clier system. in the following figures the two amplifiers can be compared. figs. 8 and 9 show that the differences between the two amplifiers are small. this makes them ideally suited for further implementation in the linc part of the clier amplifier architecture, as a linc transmitter requirers a small phase and amplitude imbalance between the two paths. figure 10 depicts the linearity between the dc supply voltage and the output voltage. as can be seen good linearity is delivered up to 20 v dc. the non-linearity in the higher voltage range must be compensated using pre-distorting or with the linc part of the architecture. 3 conclusion a class e power amplifier has been presented and its efficiency simulated and measured. its suitability for a clier transmitter has been demonstrated. references [1] cox, d. c.: linear amplification with nonlinear components. ieee-tc, vol. 41 (1974), no. 4, p. 1942–1945. [2] kahn, l. r.: signal-sideband transmission by envelope elimination and restoration. radioengineering, vol. 40 (1952), p. 803–806. [3] negra, r., fadhel, m., ghannouchi, m., bachtold, w.: study and design optimization of multiharmonic transmission-line load networks for class-e and class-fk-band mmic power amplifiers. ieee transactions on microwave theory and techniques, 207, vol. 55, no. 6, p. 1390–1394. [4] raab, f. h.: idealized operation of the class e tuned power amplifier. ieee transactions on circuits and systems, vol. 24 (1977), no. 12, p. 725–735. [5] rembold, b., koch, o.: increasing power amplifier efficiency by combining linc and eer, proceedings of the 12th international student conference on electrical engineering poster 2008.ue (czech republic), 2008. 
[6] cree inc.: cgh40010 datasheet, 2007, rev. 1.5, preliminary.
[7] pozar, d. m.: microwave engineering. john wiley & sons, inc., 2005.

marc dirix, e-mail: marc@ihf.rwth-aachen.de; olivier koch, e-mail: koch@ihf.rwth-aachen.de; institute of high frequency technology, rwth aachen university, melatener strasse 25, 52074 aachen, germany

fig. 7: the two linc class e amplifiers on a circuit board
fig. 8: frequency response of the blue and the yellow amplifier with 10, 15, 20 and 25 v dc supply voltage
fig. 9: phase difference of the two output signals b21 − y21
fig. 10: comparison of the output voltage of the yellow (light) and the blue (dark) amplifier
erlangen programme at large 3.2: ladder operators in hypercomplex mechanics
v. v. kisil

abstract
we revise the construction of creation/annihilation operators in quantum mechanics based on the representation theory of the heisenberg and symplectic groups. besides the standard harmonic oscillator (the elliptic case) we similarly treat the repulsive oscillator (hyperbolic case) and the free particle (the parabolic case). the respective hypercomplex numbers turn out to be handy on this occasion. this provides a further illustration of the similarity and correspondence principle.

keywords: heisenberg group, kirillov's method of orbits, geometric quantisation, quantum mechanics, classical mechanics, planck constant, dual numbers, double numbers, hypercomplex, jet spaces, hyperbolic mechanics, interference, fock-segal-bargmann representation, schrödinger representation, dynamics equation, harmonic and unharmonic oscillator, contextual probability, symplectic group, metaplectic representation, shale-weil representation.

1 introduction
harmonic oscillators are treated in most textbooks on quantum mechanics. this is efficiently done through creation/annihilation (ladder) operators [9, 3]. the underlying structure is the representation theory of the heisenberg and symplectic groups [28, § vi.2], [34, § 8.2], [12, 8]. it is also known that quantum mechanics and field theory can benefit from the introduction of clifford algebra-valued group representations [20, 5, 4, 10].
the dynamics of a harmonic oscillator generates a symplectic transformation of the phase space of the elliptic type. the respective parabolic and hyperbolic counterparts are also of interest [37, § 3.8], [35]. as we will see, they are naturally connected with the respective hypercomplex numbers.
to make this correspondence explicit we recall that the symplectic group sp(2) [8, § 1.2] consists of 2 × 2 matrices with real entries and unit determinant. it is isomorphic to the group $sl_2(\mathbb{r})$ [28, 13, 30] and provides linear symplectomorphisms of the two-dimensional phase space. it has three types of non-isomorphic one-dimensional subgroups, represented by:
$$k = \left\{ \begin{pmatrix} \cos t & \sin t \\ -\sin t & \cos t \end{pmatrix} = \exp\begin{pmatrix} 0 & t \\ -t & 0 \end{pmatrix},\ t \in (-\pi, \pi] \right\}, \qquad (1)$$
$$n = \left\{ \begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix} = \exp\begin{pmatrix} 0 & t \\ 0 & 0 \end{pmatrix},\ t \in \mathbb{r} \right\}, \qquad (2)$$
$$a = \left\{ \begin{pmatrix} e^t & 0 \\ 0 & e^{-t} \end{pmatrix} = \exp\begin{pmatrix} t & 0 \\ 0 & -t \end{pmatrix},\ t \in \mathbb{r} \right\}. \qquad (3)$$
we will refer to them as the elliptic, parabolic and hyperbolic subgroups, respectively.
on the other hand, there are three non-isomorphic types of commutative, associative two-dimensional algebras, known as complex, dual and double numbers [38, app. c], [29, § 5]. they are represented by expressions $x + \iota y$, where $\iota$ stands for one of the hypercomplex units $i$, $\varepsilon$ or $j$ with the properties $i^2 = -1$, $\varepsilon^2 = 0$, $j^2 = 1$. these units can also be labelled as elliptic, parabolic and hyperbolic.
in an earlier paper [25] we considered representations of the heisenberg group which are induced by hypercomplex characters of its centre. the elliptic case (complex numbers) describes the traditional framework of quantum mechanics, of course. double-valued representations, with the imaginary unit $j^2 = 1$, are a natural source of the hyperbolic quantum mechanics that has been under development for some time [14, 15, 17, 16, 18]. the representation acts on a krein space with an indefinite inner product [2]. this has aroused significant recent interest in connection with pt-symmetric quantum mechanics [10]. however, our approach is different from the classical treatment of krein spaces: we use the hyperbolic unit $j$ and build the hyperbolic analytic function theory on its own basis [21, 27]. in the traditional approach, the indefinite metric is mapped to a definite inner product through auxiliary operators.
the representation with values in dual numbers provides a convenient description of classical mechanics. to this end we do not take any sort of semiclassical limit; rather, the nilpotency of the imaginary unit ($\varepsilon^2 = 0$) performs the task. this removes the vicious necessity of considering the planck constant tending to zero. mixing this with complex numbers, we get a convenient tool for modelling the interaction between quantum and classical systems [22, 24]. our construction [25] provides three different types of dynamics and also generates the respective rules for addition of probabilities.
in this paper we analyse the three types of dynamics produced by the transformations (1–3) from the symplectic group sp(2) by means of ladder operators. as a result we obtain further illustrations of the following:
principle (similarity and correspondence) [23, principle 29]
1. subgroups k, n and a play a similar role in the structure of the group sp(2) and its representations.
2. the subgroups shall be swapped simultaneously with the respective replacement of the hypercomplex unit $\iota$.
here the two parts are interrelated: without a swap of imaginary units there can be no similarity between different subgroups. in this paper we work with the simplest case of a particle with only one degree of freedom. higher dimensions and the respective group of symplectomorphisms sp(2n) may require consideration of clifford algebras [32].
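before proceeding, a minimal computational sketch (our own illustration, not part of the paper) may help make the correspondence concrete: the three hypercomplex units are modelled by standard 2 × 2 real matrices (these matrix choices are our assumption, though conventional), and the subgroup matrices (1–3) are recovered by exponentiating their generators.

```python
# a minimal sketch (our illustration, not the paper's code): verify the
# hypercomplex units i, eps, j via 2x2 matrix models, and check that the
# exponentials of the generators reproduce the subgroups (1)-(3).
import sympy as sp

t = sp.symbols('t', real=True)
one = sp.eye(2)

# conventional matrix models of the elliptic, parabolic, hyperbolic units
i_unit = sp.Matrix([[0, -1], [1, 0]])   # i^2 = -1
eps    = sp.Matrix([[0,  1], [0, 0]])   # eps^2 = 0
j_unit = sp.Matrix([[0,  1], [1, 0]])   # j^2 = +1
assert i_unit**2 == -one and eps**2 == sp.zeros(2) and j_unit**2 == one

# generators of the one-parameter subgroups k, n, a
gen_k = sp.Matrix([[0, t], [-t, 0]])
gen_n = sp.Matrix([[0, t], [0, 0]])
gen_a = sp.Matrix([[t, 0], [0, -t]])

print(gen_k.exp().applyfunc(lambda e: sp.simplify(e.rewrite(sp.cos))))
# -> [[cos(t), sin(t)], [-sin(t), cos(t)]]
print(gen_n.exp())   # -> [[1, t], [0, 1]]
print(gen_a.exp())   # -> [[exp(t), 0], [0, exp(-t)]]
```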
2 heisenberg group and its automorphisms
let $(s, x, y)$, where $s, x, y \in \mathbb{r}$, be an element of the one-dimensional heisenberg group $h^1$ [8, 12]. consideration of the general case of $h^n$ will be similar, but is beyond the scope of the present paper. the group law on $h^1$ is given as follows:
$$(s, x, y) \cdot (s', x', y') = \left( s + s' + \tfrac{1}{2}\,\omega(x, y; x', y'),\ x + x',\ y + y' \right), \qquad (4)$$
where the non-commutativity is due to $\omega$, the symplectic form on $\mathbb{r}^{2n}$ [1, § 37]:
$$\omega(x, y; x', y') = x y' - x' y. \qquad (5)$$
the heisenberg group is a non-commutative lie group. the left shifts
$$\lambda(g) : f(g') \mapsto f(g^{-1} g') \qquad (6)$$
act as a representation of $h^1$ on a certain linear space of functions. for example, the action on $l_2(h, dg)$ with respect to the haar measure $dg = ds\,dx\,dy$ is the left regular representation, which is unitary.
the lie algebra $\mathfrak{h}_1$ of $h^1$ is spanned by the left-(right-)invariant vector fields
$$s^{l(r)} = \pm\partial_s, \quad x^{l(r)} = \pm\partial_x - \tfrac{1}{2}\, y\,\partial_s, \quad y^{l(r)} = \pm\partial_y + \tfrac{1}{2}\, x\,\partial_s \qquad (7)$$
on $h^1$ with the heisenberg commutator relation
$$[x^{l(r)}, y^{l(r)}] = s^{l(r)} \qquad (8)$$
and all other commutators vanishing. we will sometimes omit the superscript $l$ for a left-invariant field.
the group of outer automorphisms of $h^1$, which trivially acts on the centre of $h^1$, is the symplectic group sp(2) defined in the previous section. it is the group of symmetries of the symplectic form $\omega$ [8, thm. 1.22], [11, p. 830]. the symplectic group is isomorphic to $sl_2(\mathbb{r})$ [28], [34, ch. 8]. the explicit action of sp(2) on the heisenberg group is:
$$g : h = (s, x, y) \mapsto g(h) = (s, x', y'), \qquad (9)$$
where $g = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in sp(2)$ and $\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}$.
the shale-weil theorem [8, § 4.2], [11, p. 830] states that any representation $\rho_{\hbar}$ of the heisenberg group generates a unitary oscillator (or metaplectic) representation $\rho^{sw}_{\hbar}$ of $\widetilde{sp}(2)$, the two-fold cover of the symplectic group [8, thm. 4.58]. we can consider the semidirect product $g = h^1 \rtimes \widetilde{sp}(2)$ with the standard group law:
$$(h, g) * (h', g') = (h * g(h'),\ g * g'),$$
where $h, h' \in h^1$, $g, g' \in \widetilde{sp}(2)$, the stars denote the respective group operations, and the action $g(h')$ is defined as the composition of the projection map $\widetilde{sp}(2) \to sp(2)$ and the action (9). this group is sometimes called the schrödinger group, and it is known as the maximal kinematical invariance group of both the free schrödinger equation and the quantum harmonic oscillator [31]. this group is of interest not only in quantum mechanics but also in optics [36, 35].
consider the lie algebra $sp_2$ of the group sp(2). pick the following basis in $sp_2$ [34, § 8.1]:
$$a = \frac{1}{2}\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}, \quad b = \frac{1}{2}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad z = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}.$$
the commutation relations between these elements are:
$$[z, a] = 2b, \quad [z, b] = -2a, \quad [a, b] = -\tfrac{1}{2}\, z. \qquad (10)$$
the vectors $z$, $b + z/2$ and $-a$ are generators of the one-parameter subgroups k, n and a (1–3), respectively. furthermore, we can consider the basis $\{s, x, y, a, b, z\}$ of the lie algebra $g$ of the lie group $g = h^1 \rtimes \widetilde{sp}(2)$. all non-zero commutators besides those already listed in (8) and (10) are:
$$[a, x] = \tfrac{1}{2}\, x, \quad [b, x] = -\tfrac{1}{2}\, y, \quad [z, x] = y; \qquad (11)$$
$$[a, y] = -\tfrac{1}{2}\, y, \quad [b, y] = -\tfrac{1}{2}\, x, \quad [z, y] = -x. \qquad (12)$$
the shale-weil theorem allows us to expand any representation $\rho_{\hbar}$ of the heisenberg group to the representation $\tilde{\rho}_{\hbar} = \rho_{\hbar} \oplus \rho^{sw}_{\hbar}$ of the group $g$.
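as a quick sanity check of (10), the following sketch (our own, not from the paper) verifies the commutation relations with the 2 × 2 matrix basis just given:

```python
# a minimal sketch verifying the sp_2 commutation relations (10)
# with the matrix basis a, b, z given above.
import sympy as sp

a = sp.Rational(1, 2) * sp.Matrix([[-1, 0], [0, 1]])
b = sp.Rational(1, 2) * sp.Matrix([[0, 1], [1, 0]])
z = sp.Matrix([[0, 1], [-1, 0]])

def comm(p, q):
    """matrix commutator [p, q] = p q - q p."""
    return p * q - q * p

assert comm(z, a) == 2 * b     # [z, a] = 2b
assert comm(z, b) == -2 * a    # [z, b] = -2a
assert comm(a, b) == -z / 2    # [a, b] = -z/2
print("commutation relations (10) verified")
```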
example 1. let $\rho_{\hbar}$ be the schrödinger representation [8, § 1.3] of $h^1$ in $l_2(\mathbb{r})$, that is [25, (3.5)]:
$$[\rho_{\chi}(s, x, y) f](q) = e^{2\pi i \hbar (s - x y/2) + 2\pi i x q}\, f(q - \hbar y). \qquad (13)$$
thus the action of the derived representation on the lie algebra $\mathfrak{h}_1$ is:
$$\rho_{\hbar}(x) = 2\pi i q, \quad \rho_{\hbar}(y) = -\hbar\frac{d}{dq}, \quad \rho_{\hbar}(s) = 2\pi i \hbar I. \qquad (14)$$
then the associated shale-weil representation of sp(2) in $l_2(\mathbb{r})$ has the derived action, cf. [35, (2.2)], [8, § 4.3]:
$$\rho^{sw}_{\hbar}(a) = -\frac{q}{2}\frac{d}{dq} - \frac{1}{4}, \quad \rho^{sw}_{\hbar}(b) = -\frac{\hbar i}{8\pi}\frac{d^2}{dq^2} - \frac{\pi i q^2}{2\hbar}, \quad \rho^{sw}_{\hbar}(z) = \frac{\hbar i}{4\pi}\frac{d^2}{dq^2} - \frac{\pi i q^2}{\hbar}. \qquad (15)$$
we can verify the commutators (8) and (10–12) for the operators (14–15). it is also obvious that in this representation the following algebraic relations hold:
$$\rho^{sw}_{\hbar}(a) = \frac{i}{4\pi\hbar}\left(\rho_{\hbar}(x)\rho_{\hbar}(y) - \tfrac{1}{2}\rho_{\hbar}(s)\right) = \frac{i}{8\pi\hbar}\left(\rho_{\hbar}(x)\rho_{\hbar}(y) + \rho_{\hbar}(y)\rho_{\hbar}(x)\right), \qquad (16)$$
$$\rho^{sw}_{\hbar}(b) = \frac{i}{8\pi\hbar}\left(\rho_{\hbar}(x)^2 - \rho_{\hbar}(y)^2\right), \qquad (17)$$
$$\rho^{sw}_{\hbar}(z) = \frac{i}{4\pi\hbar}\left(\rho_{\hbar}(x)^2 + \rho_{\hbar}(y)^2\right). \qquad (18)$$
thus it is common in quantum optics to call $g$ a lie algebra with quadratic generators, see [9, § 2.2.4].
note that $\rho^{sw}_{\hbar}(z)$ is the hamiltonian of the harmonic oscillator (up to a factor). then we can consider $\rho^{sw}_{\hbar}(b)$ as the hamiltonian of a repulsive (hyperbolic) oscillator. the operator $\rho^{sw}_{\hbar}(b - z/2) = \frac{\hbar i}{4\pi}\frac{d^2}{dq^2}$ is the parabolic analog. a graphical representation of all three transformations is given in figure 1, and a further discussion of these hamiltonians can be found in [37, § 3.8].
an important observation, which is often missed, is that the three linear symplectic transformations are unitary rotations in the corresponding hypercomplex algebra. this means that the symplectomorphisms generated by the operators $z$, $b - z/2$, $b$ within time $t$ coincide with the multiplication of the hypercomplex number $q + \iota p$ by $e^{\iota t}$ [23, § 3], which is just another illustration of the similarity and correspondence principle.

fig. 1: three types (elliptic, parabolic and hyperbolic) of linear symplectic transformations on the plane

example 2. there are many advantages in considering representations of the heisenberg group on the phase space [12, § 1.7], [8, § 1.6], [6]. a convenient expression for the fock-segal-bargmann (fsb) representation on the phase space is [25, (3.2)]:
$$[\rho_f(s, x, y) f](q, p) = e^{-2\pi i (\hbar s + q x + p y)}\, f\!\left(q - \frac{\hbar}{2}\, y,\ p + \frac{\hbar}{2}\, x\right). \qquad (19)$$
then the derived representation of $\mathfrak{h}_1$ is:
$$\rho_f(x) = -2\pi i q + \frac{\hbar}{2}\partial_p, \quad \rho_f(y) = -2\pi i p - \frac{\hbar}{2}\partial_q, \quad \rho_f(s) = -2\pi i \hbar I. \qquad (20)$$
this produces the derived form of the shale-weil representation:
$$\rho^{sw}_f(a) = \tfrac{1}{2}(q\partial_q - p\partial_p), \quad \rho^{sw}_f(b) = -\tfrac{1}{2}(p\partial_q + q\partial_p), \quad \rho^{sw}_f(z) = p\partial_q - q\partial_p. \qquad (21)$$
note that this representation does not contain the parameter $\hbar$, unlike the equivalent representation (15). thus the fsb model explicitly shows the equivalence of $\rho^{sw}_{\hbar_1}$ and $\rho^{sw}_{\hbar_2}$ if $\hbar_1\hbar_2 > 0$ [8, thm. 4.57]. as we will also see below, the fsb-type representations in hypercomplex numbers produce almost the same shale-weil representations.
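before moving on to ladder operators, here is a small symbolic check (our own sketch, not from the paper) of the quadratic relation (18) for the schrödinger representation (14)–(15), obtained by applying both sides to a generic test function:

```python
# a minimal sketch: check relation (18),
# rho_sw(Z) = i/(4 pi hbar) (rho(X)^2 + rho(Y)^2),
# in the schroedinger representation (14)-(15), on a generic f(q).
import sympy as sp

q = sp.symbols('q', real=True)
hbar = sp.symbols('hbar', positive=True)
f = sp.Function('f')(q)

X = lambda u: 2 * sp.pi * sp.I * q * u                 # rho(X) from (14)
Y = lambda u: -hbar * sp.diff(u, q)                    # rho(Y) from (14)
Z_sw = lambda u: (hbar * sp.I / (4 * sp.pi)) * sp.diff(u, q, 2) \
                 - (sp.pi * sp.I * q**2 / hbar) * u    # rho_sw(Z) from (15)

lhs = Z_sw(f)
rhs = sp.I / (4 * sp.pi * hbar) * (X(X(f)) + Y(Y(f)))
assert sp.simplify(lhs - rhs) == 0
print("relation (18) holds")
```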
3 ladder operators in quantum mechanics
let $\rho$ be a representation of the group $g = h^1 \rtimes \widetilde{sp}(2)$ in a space $v$. consider the derived representation of the lie algebra $g$ [28, § vi.1] and denote $\tilde{x} = \rho(x)$ for $x \in g$. to see the structure of the representation $\rho$ we can decompose the space $v$ into eigenspaces of the operator $\tilde{x}$ for some $x \in g$. the canonical example is the taylor series in complex analysis.
we are going to consider three cases corresponding to the three non-isomorphic subgroups (1–3) of sp(2), starting from the compact case. let $h = z$ be a generator of the compact subgroup k. the corresponding symplectomorphisms (9) of the phase space are given by orthogonal rotations with matrices $\begin{pmatrix} \cos t & \sin t \\ -\sin t & \cos t \end{pmatrix}$. the shale-weil representation (15) coincides with the hamiltonian of the harmonic oscillator. since this is a double cover of a compact group, the corresponding eigenspaces $\tilde{z} v_k = i k\, v_k$ are parametrised by a half-integer $k \in \mathbb{z}/2$. explicitly, for a half-integer $k$:
$$v_k(q) = h_{k + \frac{1}{2}}\!\left(\sqrt{\frac{2\pi}{\hbar}}\, q\right) e^{-\frac{\pi}{\hbar} q^2}, \qquad (22)$$
where $h_k$ is the hermite polynomial [8, § 1.7], [7, 8.2(9)].
from the point of view of quantum mechanics and of representation theory (which may be the same), it is beneficial to introduce the ladder operators $l^{\pm}$, known as creation/annihilation operators in quantum mechanics [8, p. 49] or raising/lowering operators in representation theory [28, § vi.2], [34, § 8.2], [3]. they are defined by the following commutation relations:
$$[\tilde{z}, l^{\pm}] = \lambda_{\pm} l^{\pm}. \qquad (23)$$
in other words, $l^{\pm}$ are eigenvectors of the operators $\mathrm{ad}\,z$ of the adjoint representation of $g$ [28, § vi.2].
remark 1. the existence of such ladder operators follows from the general properties of lie algebras if the hamiltonian belongs to a cartan subalgebra. this is the case for the vectors $z$ and $b$, which are the only two non-isomorphic types of cartan subalgebras in $sl_2$. however, the third case considered in this paper, the parabolic vector $b + z/2$, does not belong to a cartan subalgebra, yet a sort of ladder operators is still possible with dual number coefficients. moreover, for the hyperbolic vector $b$, besides the standard ladder operators an additional pair with double number coefficients will also be described.
from the commutators (23) we deduce that if $v_k$ is an eigenvector of $\tilde{z}$ then $l^+ v_k$ is an eigenvector as well:
$$\tilde{z}(l^+ v_k) = (l^+ \tilde{z} + \lambda_+ l^+) v_k = l^+(\tilde{z} v_k) + \lambda_+ l^+ v_k = i k\, l^+ v_k + \lambda_+ l^+ v_k = (i k + \lambda_+)\, l^+ v_k. \qquad (24)$$
thus the action of the ladder operators on the respective eigenspaces $v_k$ can be visualised as a chain of eigenspaces, with $l^+$ and $l^-$ shifting between neighbouring eigenvalues. (25)
there are two ways to search for ladder operators: in the (complexified) lie algebras $\mathfrak{h}_1$ and $sp_2$. we will consider them in sequence.

3.1 ladder operators from the heisenberg group
assuming $l^+ = a\tilde{x} + b\tilde{y}$, we obtain from the relations (11–12) and (23) the linear equations with unknowns $a$ and $b$:
$$a = \lambda_+ b, \quad -b = \lambda_+ a.$$
the equations have a solution if and only if $\lambda_+^2 + 1 = 0$, and the raising/lowering operators are $l^{\pm} = \tilde{x} \mp i\tilde{y}$.
remark 2. here we have an interesting asymmetric response: due to the structure of the semidirect product $h^1 \rtimes \widetilde{sp}(2)$ it is the symplectic group which acts on $h^1$, not vice versa. however, the heisenberg group has a weak action in the opposite direction: it shifts eigenfunctions of sp(2).
in the schrödinger representation (14) the ladder operators are
$$\rho_{\hbar}(l^{\pm}) = 2\pi i q \pm i\hbar\frac{d}{dq}. \qquad (26)$$
the standard treatment of the harmonic oscillator in quantum mechanics, which can be found in many textbooks, e.g. [8, § 1.7], [9, § 2.2.3], is as follows. the vector $v_{-1/2}(q) = e^{-\pi q^2/\hbar}$ is an eigenvector of $\tilde{z}$ with the eigenvalue $-\frac{i}{2}$. in addition, $v_{-1/2}$ is annihilated by $l^+$. thus the chain (25) terminates to the right, and the complete set of eigenvectors of the harmonic oscillator hamiltonian is given by $(l^-)^k v_{-1/2}$ with $k = 0, 1, 2, \ldots$
we can make a wavelet transform generated by the heisenberg group with the mother wavelet $v_{-1/2}$, and the image will be the fock-segal-bargmann (fsb) space [12], [8, § 1.6]. since $v_{-1/2}$ is the null solution of $l^+ = \tilde{x} - i\tilde{y}$, then by the general result [26, cor. 24] the image of the wavelet transform consists of null solutions of the corresponding linear combination of the lie derivatives (7):
$$d = x^r - i y^r = (\partial_x + i\partial_y) - \pi\hbar(x - i y), \qquad (27)$$
which turns out to be the cauchy-riemann equation on a weighted fsb-type space.
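as a consistency check (our own sketch, not from the paper), one can verify the commutation relation (23) with $\lambda_{\pm} = \pm i$ for the operators (26) against the hamiltonian $\rho^{sw}_{\hbar}(z)$ from (15):

```python
# a minimal sketch: verify [Z~, L+] = i L+ for the schroedinger-representation
# operators (15) and (26), applied to a generic test function f(q).
import sympy as sp

q = sp.symbols('q', real=True)
hbar = sp.symbols('hbar', positive=True)
f = sp.Function('f')(q)

Z_sw   = lambda u: (hbar * sp.I / (4 * sp.pi)) * sp.diff(u, q, 2) \
                   - (sp.pi * sp.I * q**2 / hbar) * u
L_plus = lambda u: 2 * sp.pi * sp.I * q * u + sp.I * hbar * sp.diff(u, q)

commutator = Z_sw(L_plus(f)) - L_plus(Z_sw(f))
assert sp.simplify(commutator - sp.I * L_plus(f)) == 0
print("[Z~, L+] = i L+ verified")
```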
3.2 symplectic ladder operators
we can also look for ladder operators within the lie algebra $sp_2$, see [23, § 8]. assuming $l_2^+ = a\tilde{a} + b\tilde{b} + c\tilde{z}$, we obtain from the relations (10) and the defining condition (23) the linear equations with unknowns $a$, $b$ and $c$:
$$c = 0, \quad 2a = \lambda_+ b, \quad -2b = \lambda_+ a.$$
the equations have a solution if and only if $\lambda_+^2 + 4 = 0$, and the raising/lowering operators are $l_2^{\pm} = \pm i\tilde{a} + \tilde{b}$. in the shale-weil representation (15) they turn out to be:
$$l_2^{\pm} = \pm i\left(\frac{q}{2}\frac{d}{dq} + \frac{1}{4}\right) - \frac{\hbar i}{8\pi}\frac{d^2}{dq^2} - \frac{\pi i q^2}{2\hbar} = -\frac{i}{8\pi\hbar}\left(\mp 2\pi q + \hbar\frac{d}{dq}\right)^2. \qquad (28)$$
since this time $\lambda_+ = 2i$, the ladder operators $l_2^{\pm}$ produce a shift on the diagram (25) twice as big as that of the operators $l^{\pm}$ from the heisenberg group. after all, this is not surprising, since from the explicit representations (26) and (28) we get:
$$l_2^{\pm} = -\frac{i}{8\pi\hbar}(l^{\pm})^2.$$

4 ladder operators for the hyperbolic subgroup
consider the case of the hamiltonian $h = 2b$, which is a repulsive (hyperbolic) harmonic oscillator [37, § 3.8]. the corresponding one-dimensional subgroup of symplectomorphisms produces hyperbolic rotations of the phase space. the eigenvectors $v_{\nu}$ of the operator
$$\rho^{sw}_{\hbar}(2b)\, v_{\nu} = -i\left(\frac{\hbar}{4\pi}\frac{d^2}{dq^2} + \frac{\pi q^2}{\hbar}\right) v_{\nu} = i\nu\, v_{\nu}$$
are weber-hermite (or parabolic cylinder) functions $v_{\nu} = d_{\nu - \frac{1}{2}}\!\left(\pm 2 e^{i\frac{\pi}{4}}\sqrt{\frac{\pi}{\hbar}}\, q\right)$, see [7, § 8.2], [33] for fundamentals of weber-hermite functions and [35] for further illustrations and applications in optics.
the corresponding one-parameter group is not compact, and the eigenvalues of the operator $2\tilde{b}$ are not restricted by any integrality condition, but the raising/lowering operators are still important [13, § ii.1], [30, § 1.1]. we again seek solutions in the two subalgebras $\mathfrak{h}_1$ and $sp_2$ separately. however, additional options will be provided by a choice of the number system: either complex or double.

4.1 complex ladder operators
assuming $l_h^+ = a\tilde{x} + b\tilde{y}$, we obtain from the commutators (11–12) the linear equations:
$$-a = \lambda_+ b, \quad -b = \lambda_+ a. \qquad (29)$$
the equations have a solution if and only if $\lambda_+^2 - 1 = 0$. taking the real roots $\lambda = \pm 1$, we find that the raising/lowering operators are $l_h^{\pm} = \tilde{x} \mp \tilde{y}$. in the schrödinger representation (14) the ladder operators are
$$l_h^{\pm} = 2\pi i q \pm \hbar\frac{d}{dq}. \qquad (30)$$
the null solutions $v_{\pm\frac{1}{2}}(q) = e^{\pm\frac{\pi i}{\hbar} q^2}$ of the operators $\rho_{\hbar}(l^{\pm})$ are also eigenvectors of the hamiltonian $\rho^{sw}_{\hbar}(2b)$ with the eigenvalues $\pm\frac{1}{2}$. however, the important distinction from the elliptic case is that they are no longer square-integrable on the real line.
we can also look for ladder operators within $sp_2$, that is, in the form $l_{2h}^+ = a\tilde{a} + b\tilde{b} + c\tilde{z}$ for the commutator $[2\tilde{b}, l_h^+] = \lambda l_h^+$. we get the system:
$$4c = \lambda a, \quad b = 0, \quad a = \lambda c.$$
a solution again exists if and only if $\lambda^2 = 4$. within complex numbers we get only the values $\lambda = \pm 2$, with the ladder operators $l_{2h}^{\pm} = \pm 2\tilde{a} + \tilde{z}/2$, see [13, § ii.1], [30, § 1.1]. each indecomposable $\mathfrak{h}_1$- or $sp_2$-module is formed by a one-dimensional chain of eigenvalues with a transitive action of the ladder operators $l_h^{\pm}$ or $l_{2h}^{\pm}$, respectively. and we again have a quadratic relation between the ladder operators:
$$l_{2h}^{\pm} = \frac{i}{4\pi\hbar}(l_h^{\pm})^2.$$

4.2 double ladder operators
there are extra possibilities in the context of hyperbolic quantum mechanics [17, 16, 18]. here we use the representation of $h^1$ induced by a hyperbolic character $e^{jht} = \cosh(ht) + j\sinh(ht)$, see [25, (4.5)], and obtain the hyperbolic representation of $h^1$, cf. (13):
$$[\rho^j_h(s', x', y')\hat{f}](q) = e^{jh(s' - x'y'/2) + j x' q}\, \hat{f}(q - h y'). \qquad (31)$$
the corresponding derived representation is
$$\rho^j_h(x) = j q, \quad \rho^j_h(y) = -h\frac{d}{dq}, \quad \rho^j_h(s) = j h I. \qquad (32)$$
then the associated shale-weil derived representation of $sp_2$ in the schwartz space $\mathcal{s}(\mathbb{r})$ is, cf. (15):
$$\rho^{sw}_h(a) = -\frac{q}{2}\frac{d}{dq} - \frac{1}{4}, \quad \rho^{sw}_h(b) = \frac{j h}{4}\frac{d^2}{dq^2} - \frac{j q^2}{4h}, \quad \rho^{sw}_h(z) = -\frac{j h}{2}\frac{d^2}{dq^2} - \frac{j q^2}{2h}. \qquad (33)$$
note that $\rho^{sw}_h(b)$ now generates a usual harmonic oscillator, not a repulsive one like $\rho^{sw}_{\hbar}(b)$ in (15). however, the expressions in the quadratic algebra are still the same (up to a factor), cf. (16–18):
$$\rho^{sw}_h(a) = -\frac{j}{2h}\left(\rho^j_h(x)\rho^j_h(y) - \tfrac{1}{2}\rho^j_h(s)\right) = -\frac{j}{4h}\left(\rho^j_h(x)\rho^j_h(y) + \rho^j_h(y)\rho^j_h(x)\right), \qquad (34)$$
$$\rho^{sw}_h(b) = \frac{j}{4h}\left(\rho^j_h(x)^2 - \rho^j_h(y)^2\right), \qquad (35)$$
$$\rho^{sw}_h(z) = -\frac{j}{2h}\left(\rho^j_h(x)^2 + \rho^j_h(y)^2\right). \qquad (36)$$
this is due to the principle of similarity and correspondence: we can swap the operators $z$ and $b$ with a simultaneous replacement of the hypercomplex units $i$ and $j$.
the eigenspace of the operator $2\rho^{sw}_h(b)$ with an eigenvalue $j\nu$ is spanned by the weber-hermite functions $d_{-\nu - \frac{1}{2}}\!\left(\pm\sqrt{\frac{2}{h}}\, x\right)$, see [7, § 8.2]. the functions $d_{\nu}$ are generalisations of the hermite functions (22).
the compatibility condition for a ladder operator within the lie algebra $\mathfrak{h}_1$ will be (29) as before, since it depends only on the commutators (11–12). thus we still have the set of ladder operators corresponding to the values $\lambda = \pm 1$:
$$l_h^{\pm} = \tilde{x} \mp \tilde{y} = j q \pm h\frac{d}{dq}.$$
admitting double numbers, we have an extra way to satisfy $\lambda^2 = 1$ in (29), with the values $\lambda = \pm j$. then there is an additional pair of hyperbolic ladder operators, which are identical (up to factors) to (26):
$$l_j^{\pm} = \tilde{x} \mp j\tilde{y} = j q \pm j h\frac{d}{dq}.$$
the pairs $l_h^{\pm}$ and $l_j^{\pm}$ shift eigenvectors in "orthogonal" directions, changing their eigenvalues by $\pm 1$ and $\pm j$. therefore an indecomposable $sp_2$-module can be parametrised by a two-dimensional lattice of eigenvalues in double numbers, see table 1.
the following functions
$$v^{\pm h}_{\frac{1}{2}}(q) = e^{\mp j q^2/(2h)} = \cosh\frac{q^2}{2h} \mp j\sinh\frac{q^2}{2h}, \qquad v^{\pm j}_{\frac{1}{2}}(q) = e^{\mp q^2/(2h)}$$
are null solutions of the operators $l_h^{\pm}$ and $l_j^{\pm}$, respectively. they are also eigenvectors of $2\rho^{sw}_h(b)$ with the eigenvalues $\mp\frac{j}{2}$ and $\mp\frac{1}{2}$, respectively. if these functions are used as mother wavelets for the wavelet transforms generated by the heisenberg group, then the image space will consist of the null solutions of the following differential operators, see [26, cor. 24]:
$$d_h = x^r - y^r = (\partial_x - \partial_y) + \frac{h}{2}(x + y), \qquad d_j = x^r - j y^r = (\partial_x + j\partial_y) - \frac{h}{2}(x - j y),$$
for $v^{\pm h}_{\frac{1}{2}}$ and $v^{\pm j}_{\frac{1}{2}}$, respectively. this is again in line with the classical result (27). however, annihilation of an eigenvector by a ladder operator does not mean that that part of the 2d lattice becomes void, since it can be reached via alternative routes. instead of multiplication by zero, as happens in the elliptic case, a half-plane of eigenvalues will be multiplied by the divisors of zero $1 \pm j$.

table 1: the action of the hyperbolic ladder operators on a 2d lattice of eigenspaces. operators $l_h^{\pm}$ move the eigenvalues by 1, making shifts in the horizontal direction. operators $l_j^{\pm}$ change the eigenvalues by $j$, shown as vertical shifts

we can also search for ladder operators within the algebra $sp_2$, and admitting double numbers we will again find two sets of them [23, § 3]:
$$l_{2h}^{\pm} = \pm\tilde{a} + \tilde{z}/2 = \mp\frac{q}{2}\frac{d}{dq} \mp \frac{1}{4} - \frac{j h}{4}\frac{d^2}{dq^2} - \frac{j q^2}{4h} = -\frac{j}{4h}(l_h^{\pm})^2,$$
$$l_{2j}^{\pm} = \pm j\tilde{a} + \tilde{z}/2 = \mp\frac{j q}{2}\frac{d}{dq} \mp \frac{j}{4} - \frac{j h}{4}\frac{d^2}{dq^2} - \frac{j q^2}{4h} = -\frac{j}{4h}(l_j^{\pm})^2.$$
again, these operators $l_{2h}^{\pm}$ and $l_{2j}^{\pm}$ produce double shifts in the orthogonal directions on the same two-dimensional lattice in table 1.
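the hyperbolic rotations underlying table 1 are easy to experiment with numerically; below is a small sketch (our own, not from the paper) of double (split-complex) number arithmetic, showing $e^{jt} = \cosh t + j\sinh t$ and the divisors of zero $1 \pm j$ mentioned above:

```python
# a minimal sketch of double (split-complex) numbers a + b j with j^2 = +1,
# illustrating e^{jt} = cosh t + j sinh t and the zero divisors 1 +/- j.
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Double:
    re: float  # coefficient of 1
    hy: float  # coefficient of j

    def __mul__(self, other):
        # (a + b j)(c + d j) = (ac + bd) + (ad + bc) j, using j^2 = +1
        return Double(self.re * other.re + self.hy * other.hy,
                      self.re * other.hy + self.hy * other.re)

def exp_j(t):
    """hyperbolic 'rotation' e^{jt}."""
    return Double(math.cosh(t), math.sinh(t))

t = 0.7
u = exp_j(t)
print(u)                              # cosh(0.7) + j sinh(0.7)
print(u * exp_j(-t))                  # ~ 1 + 0 j: rotations compose
print(Double(1, 1) * Double(1, -1))   # 0 + 0 j: (1+j)(1-j) are zero divisors
```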
5 ladder operators for the nilpotent subgroup
finally, we look for ladder operators for the hamiltonian $\tilde{b} + \tilde{z}/2$ or, equivalently, $-\tilde{b} + \tilde{z}/2$. it can be identified with a free particle [37, § 3.8].
we can look for ladder operators in the representation (14–15) within the lie algebra $\mathfrak{h}_1$ in the form $l^{\pm}_{\varepsilon} = a\tilde{x} + b\tilde{y}$. this is possible if and only if
$$-b = \lambda a, \quad 0 = \lambda b. \qquad (37)$$
the compatibility condition $\lambda^2 = 0$ implies $\lambda = 0$ within complex numbers. however, such a "ladder" operator produces only the zero shift on the eigenvectors, cf. (24).
another possibility appears if we consider the representation of the heisenberg group induced by dual-valued characters. on the configuration space such a representation is [25, (4.11)]:
$$[\rho^{\varepsilon}_{\chi}(s, x, y) f](q) = e^{2\pi i x q}\left(\left(1 - \varepsilon h\left(s - \tfrac{1}{2} x y\right)\right) f(q) + \frac{\varepsilon h y}{2\pi i}\, f'(q)\right). \qquad (38)$$
the corresponding derived representation of $\mathfrak{h}_1$ is
$$\rho^p_h(x) = 2\pi i q, \quad \rho^p_h(y) = \frac{\varepsilon h}{2\pi i}\frac{d}{dq}, \quad \rho^p_h(s) = -\varepsilon h I. \qquad (39)$$
however, the shale-weil extension generated by this representation is inconvenient. it is better to consider the fsb-type parabolic representation [25, (4.9)] on the phase space induced by the same dual-valued character, cf. (19):
$$[\rho^{\varepsilon}_h(s, x, y) f](q, p) = e^{-2\pi i (x q + y p)}\left(f(q, p) + \varepsilon h\left(s f(q, p) + \frac{y}{4\pi i} f'_q(q, p) - \frac{x}{4\pi i} f'_p(q, p)\right)\right). \qquad (40)$$
then the derived representation of $\mathfrak{h}_1$ is:
$$\rho^p_h(x) = -2\pi i q - \frac{\varepsilon h}{4\pi i}\partial_p, \quad \rho^p_h(y) = -2\pi i p + \frac{\varepsilon h}{4\pi i}\partial_q, \quad \rho^p_h(s) = \varepsilon h I. \qquad (41)$$
an advantage of the fsb representation is that the derived form of the parabolic shale-weil representation coincides with the elliptic one (21).
eigenfunctions with the eigenvalue $\mu$ of the parabolic hamiltonian $\tilde{b} + \tilde{z}/2 = q\partial_p$ have the form
$$v_{\mu}(q, p) = e^{\mu p/q} f(q), \qquad (42)$$
with an arbitrary function $f(q)$.
the linear equations defining the corresponding ladder operator $l^{\pm}_{\varepsilon} = a\tilde{x} + b\tilde{y}$ in the algebra $\mathfrak{h}_1$ are (37). the compatibility condition $\lambda^2 = 0$ again implies $\lambda = 0$ within complex numbers. admitting dual numbers, we have the additional values $\lambda = \pm\varepsilon\lambda_1$ with $\lambda_1 \in \mathbb{c}$, with the corresponding ladder operators
$$l^{\pm}_{\varepsilon} = \tilde{x} \mp \varepsilon\lambda_1\tilde{y} = -2\pi i q - \frac{\varepsilon h}{4\pi i}\partial_p \pm 2\pi\varepsilon\lambda_1 i p = -2\pi i q + \varepsilon i\left(\pm 2\pi\lambda_1 p + \frac{h}{4\pi}\partial_p\right).$$
for the eigenvalue $\mu = \mu_0 + \varepsilon\mu_1$ with $\mu_0, \mu_1 \in \mathbb{c}$, the eigenfunction (42) can be rewritten as
$$v_{\mu}(q, p) = e^{\mu p/q} f(q) = e^{\mu_0 p/q}\left(1 + \varepsilon\mu_1\frac{p}{q}\right) f(q) \qquad (43)$$
due to the nilpotency of $\varepsilon$. then the ladder action of $l^{\pm}_{\varepsilon}$ is $\mu_0 + \varepsilon\mu_1 \mapsto \mu_0 + \varepsilon(\mu_1 \pm \lambda_1)$. therefore, these operators are suitable for building $sp_2$-modules with a one-dimensional chain of eigenvalues.
finally, consider the ladder operator for the same element $b + z/2$ within the lie algebra $sp_2$. according to the above procedure we get the equations:
$$-b + 2c = \lambda a, \quad a = \lambda b, \quad \frac{a}{2} = \lambda c,$$
which can again be resolved if and only if $\lambda^2 = 0$. there is only the complex root $\lambda = 0$, with the corresponding operators $l^{\pm}_p = \tilde{b} + \tilde{z}/2$, which do not affect the eigenvalues. however, the dual number roots $\lambda = \pm\varepsilon\lambda_2$ with $\lambda_2 \in \mathbb{c}$ lead to the operators
$$l^{\pm}_{\varepsilon} = \pm\varepsilon\lambda_2\tilde{a} + \tilde{b} + \tilde{z}/2 = \pm\frac{\varepsilon\lambda_2}{2}(q\partial_q - p\partial_p) + q\partial_p.$$
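the eigenfunction claim (42) can be checked in one line; here is a small symbolic sketch (our own illustration, not from the paper):

```python
# a minimal sketch: check that v_mu(q, p) = exp(mu p / q) f(q) satisfies
# (B~ + Z~/2) v = q d/dp v = mu v, i.e. eq. (42).
import sympy as sp

q, p, mu = sp.symbols('q p mu')
f = sp.Function('f')(q)

v = sp.exp(mu * p / q) * f
assert sp.simplify(q * sp.diff(v, p) - mu * v) == 0
print("(42) verified: q d/dp v = mu v")
```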
6 conclusions: similarity and correspondence
we wish to summarise our findings. firstly, the appearance of hypercomplex numbers in ladder operators for $h^1$ follows exactly the same pattern as was already noted for $sp_2$ [23, rem. 32]:
• the introduction of complex numbers is a necessity for the existence of ladder operators in the elliptic case;
• in the parabolic case, we need dual numbers to make ladder operators useful;
• in the hyperbolic case, double numbers are required neither for the existence nor for the usability of ladder operators, but they do provide an enhancement.
in the spirit of the similarity and correspondence principle we have the following extension of prop. 33 from [23]:
proposition. let a vector $h \in sp_2$ generate the subgroup k, n' or a', that is, $h = z$, $b + z/2$, or $2b$, respectively. let $\iota$ be the respective hypercomplex unit. then the ladder operators $l^{\pm}$ satisfying the commutation relation
$$[h, l^{\pm}] = \pm\iota l^{\pm}$$
are given by:
1. within the lie algebra $\mathfrak{h}_1$: $l^{\pm} = \tilde{x} \mp \iota\tilde{y}$.
2. within the lie algebra $sp_2$: $l_2^{\pm} = \pm\iota\tilde{a} + \tilde{e}$. here $e \in sp_2$ is a linear combination of $b$ and $z$ with the properties:
• $e = [a, h]$.
• $h = [a, e]$.
• the killing form $k(h, e)$ [19, § 6.2] vanishes.
any of the above properties defines the vector $e \in \mathrm{span}\{b, z\}$ up to a real constant factor.
it is worth continuing this investigation and describing in detail the hyperbolic and parabolic versions of fsb spaces.

acknowledgement
i am grateful to the anonymous referees for their helpful remarks.

references
[1] arnol'd, v. i.: mathematical methods of classical mechanics. graduate texts in mathematics, vol. 60, new york: springer-verlag, 1991. translated from the 1974 russian original by k. vogtmann and a. weinstein, corrected reprint of the second (1989) edition.
[2] azizov, t. ja., iohvidov, i. s.: linear operators in hilbert spaces with g-metric, uspehi mat. nauk 26 (1971), no. 4 (160), 43–92.
[3] boyer, ch. p., miller, w., jr.: a classification of second-order raising operators for hamiltonians in two variables, j. mathematical phys. 15 (1974), 1484–1489.
[4] cnops, j., kisil, v. v.: monogenic functions and representations of nilpotent lie groups in quantum mechanics, math. methods appl. sci. 22 (1999), no. 4, 353–373.
[5] constales, d., faustino, n., kraußhar, r.: fock spaces, landau operators and the time-harmonic maxwell equations, journal of physics a: mathematical and theoretical 44 (2011), no. 13, 135303.
[6] de gosson, m. a.: spectral properties of a class of generalized landau operators, comm. partial differential equations 33 (2008), no. 10–12, 2096–2104.
[7] erdélyi, a., magnus, w., oberhettinger, f., tricomi, f. g.: higher transcendental functions. vol. ii. melbourne, fla.: robert e. krieger publishing co. inc., 1981. based on notes left by harry bateman, reprint of the 1953 original.
[8] folland, g. b.: harmonic analysis in phase space. annals of mathematics studies, vol. 122, princeton, nj: princeton university press, 1989.
[9] gazeau, j.-p.: coherent states in quantum physics. wiley-vch verlag, 2009.
[10] günther, u., kuzhel, s.: pt-symmetry, cartan decompositions, lie triple systems and krein space-related clifford algebras, journal of physics a: mathematical and theoretical 43 (2010), no. 39, 392002.
[11] howe, r.: on the role of the heisenberg group in harmonic analysis, bull. amer. math. soc. (n.s.) 3 (1980), no. 2, 821–843.
[12] howe, r.: quantum mechanics and partial differential equations, j. funct. anal. 38 (1980), no. 2, 188–254.
[13] howe, r., tan, e. ch.: non-abelian harmonic analysis: applications of sl(2,r). universitext, new york: springer-verlag, 1992.
[14] hudson, r.: generalised translation-invariant mechanics. d. phil. thesis, oxford: bodleian library, 1966.
[15] hudson, r.: translation invariant phase space mechanics, quantum theory: reconsideration of foundations – 2, 2004, pp. 301–314.
[16] khrennikov, a. yu.: hyperbolic quantum mechanics, dokl. akad. nauk 402 (2005), no. 2, 170–172.
[17] khrennikov, a.: hyperbolic quantum mechanics, adv. appl. clifford algebr. 13 (2003), no. 1, 1–9 (english).
[18] khrennikov, a.: hyperbolic quantization, adv. appl. clifford algebr. 18 (2008), no. 3–4, 843–852.
[19] kirillov, a. a.: elements of the theory of representations. berlin: springer-verlag, 1976. translated from the russian by edwin hewitt, grundlehren der mathematischen wissenschaften, band 220.
[20] kisil, v. v.: clifford valued convolution operator algebras on the heisenberg group. a quantum field theory model. clifford algebras and their applications in mathematical physics, proceedings of the third international conference held in deinze, 1993, pp. 287–294.
[21] kisil, v. v.: analysis in r1,1 or the principal function theory, complex variables theory appl. 40 (1999), no. 2, 93–118.
[22] kisil, v. v.: a quantum-classical bracket from p-mechanics, europhys. lett. 72 (2005), no. 6, 873–879.
[23] kisil, v. v.: erlangen program at large – 2 1/2: induced representations and hypercomplex numbers, submitted (2009).
[24] kisil, v. v.: computation and dynamics: classical and quantum, aip conference proceedings 1232 (2010), no. 1, 306–312.
[25] kisil, v. v.: erlangen programme at large 3.1: hypercomplex representations of the heisenberg group and mechanics, submitted (2010).
[26] kisil, v. v.: covariant transform, journal of physics: conference series 284 (2011), no. 1, pp. 9.
[27] kisil, v. v.: erlangen programme at large: an overview, in rogosin, s. v., koroleva, a. a. (eds.): advances in applied analysis. imperial college press, 2011, pp. 1–65.
[28] lang, s.: sl2(r). graduate texts in mathematics, vol. 105, new york: springer-verlag, 1985.
[29] lavrent'ev, m. a., shabat, b. v.: problemy gidrodinamiki i ih matematicheskie modeli [problems of hydrodynamics and their mathematical models] (in russian), moscow: second ed., izdat. "nauka", 1977.
[30] mazorchuk, v.: lectures on sl2-modules. world scientific, 2009.
[31] niederer, u.: the maximal kinematical invariance group of the free schrödinger equation, helv. phys. acta 45 (1972/1973), no. 5, pp. 802–810.
[32] porteous, i. r.: clifford algebras and the classical groups. cambridge studies in advanced mathematics, vol. 50, cambridge: cambridge university press, 1995.
[33] srivastava, h. m., tuan, v. k., yakubovich, s. b.: the cherry transform and its relationship with a singular sturm-liouville problem, q. j. math. 51 (2000), no. 3, 371–383.
[34] taylor, m. e.: noncommutative harmonic analysis. mathematical surveys and monographs, vol. 22, providence, ri: american mathematical society, 1986.
[35] torre, a.: a note on the general solution of the paraxial wave equation: a lie algebra view, journal of optics a: pure and applied optics 10 (2008), no. 5, 055006 (14 pp).
[36] torre, a.: linear and quadratic exponential modulation of the solutions of the paraxial wave equation, journal of optics a: pure and applied optics 12 (2010), no. 3, 035701 (11 pp).
[37] wulfman, c. e.: dynamical symmetry. world scientific, 2010.
[38] yaglom, i. m.: a simple non-euclidean geometry and its physical basis. new york: springer-verlag, 1979. an elementary account of galilean geometry and the galilean principle of relativity, heidelberg science library, translated from the russian by abe shenitzer, with the editorial assistance of basil gordon.

vladimir v. kisil
e-mail: kisilv@maths.leeds.ac.uk
http://www.maths.leeds.ac.uk/~kisilv/
school of mathematics, university of leeds, leeds ls2 9jt, uk

3d surface-based detection of pleural thickenings
p. faltin, k. chaisaowong

abstract
pleuramesothelioma is a malignant tumor of the pleura.
it evolves from pleural thickenings, which are a typical long-term effect of asbestos exposure. a diagnosis is performed by examining ct scans acquired from the patient's lung. the analysis of the image data is a very time-consuming task and is subject to strong inter- and intra-reader variances. to objectivize the analysis and to speed up the diagnosis, a fully automatic system is developed. an essential step in this process is to identify potential thickenings. in this paper we describe the complete system in brief, and then take a closer look at thickening detection. a ct-slice-based approach is presented here. it is extended by using 3d knowledge of the lung surface which could scarcely have been acquired visually.
keywords: pleuramesothelioma, pleural thickenings, computer-assisted diagnosis, convex hull, morphological erosion, ct.

1 introduction
asbestos was widely used until the 1990s because of its useful properties. today it is known to be a carcinogenous substance for humans. fibres 3–5 μm in length can get into the pulmonary alveolus and cannot be decomposed by the body. the fibers are typically accumulated on the lung boundary. the long persistence and the consequent continuous penetration of the pleura can result in pleural thickenings and hence pleuramesothelioma. a long latency of 14–72 years leads to an expected peak in diagnosed cases in 2018 [1].
high-risk patients undergo regular medical checkups, which also include the acquisition of 3d ct image data. the ct scan typically contains approximately 700 image slices, which have to be inspected by a medical expert. this task requires much time and concentration, and is subject to strong inter- and intra-reader variances [2]. in order to support the medical experts and objectivize the diagnosis, we have developed a fully automatic system. it automatically detects, analyses and measures thickenings using ct data from one point of time. the system also comprises a follow-up assessment, in which thickenings recorded at two different points in time are compared regarding their thickness and volume.
the whole process consists of several steps. in the first step, two-step supervised range-constrained otsu thresholding [3] is used to segment the lung tissue. the next step is thickening detection and extraction, which is presented in this paper. for exact measurements of volume and thickness, the thickenings are classified as pleural plaques or as diffuse pleural thickenings. a thin plate spline interpolation [4] is applied to reconstruct the correct thickening expansion, depending on the classification. to perform the follow-up assessment, a rigid registration [5] between the image data of two different points in time is performed. applying this transformation, the thickenings are matched with regard to their position and the ct number distribution in hounsfield units [6]. in the final step, the growth rate of the matched thickenings is calculated.

fig. 1: an example of typical pleural thickening

2 methods
the presented method is oriented towards the human perception. as shown in figure 1, thickenings can mainly be recognized by an indention of the lung surface, visualized in red, compared to the green healthy lung boundary. the fundamental step in the thickening detection presented here is to individually model a virtual healthy lung mask $\hat{s}$ on the lattice
$$r = \{(x, y, z) : 1 \le x \le X,\ 1 \le y \le Y,\ 1 \le z \le Z\},$$
where $X, Y, Z$ are the dimensions of the ct image.
if a voxel $i \in r$ is located inside the lung, $s(i) = 1$ applies, and if it is located outside the lung, $s(i) = 0$ applies.
an older approach [6] generates the healthy lung model by analyzing the ct data in the transverse planes (here: fixed $z$) slice by slice. the surrounding slices are only considered for the postprocessing. thickenings which are mainly visible from a sagittal plane (here: fixed $x$) or from a coronal plane (here: fixed $y$), as shown in figure 2, are not reliably detected. to overcome this uncertainty, the new method models a 3d healthy lung to detect potential thickenings.
the difference between thickenings and a healthy lung surface can mainly be described by an indention of the detected lung surface in the direction of the lung center. mathematically expressed, let $\{p_1, p_2\} \subset r_l$ be any two points of the thickening boundary, with $r_l = \{i : i \in r;\ s(i) = 1\}$. the direct connection $d = \{t p_1 + (1 - t) p_2 : t \in [0, 1]\}$ is not fully covered by the lung mask, hence $\exists d \in d : s(d) = 0$. this definition is similar to the definition of a convex hull, for which $\forall d \in d : s(d) = 1$ holds for all points $\{p_1, p_2\}$ of the lung mask. in a closed form, a convex hull of the lung model is given by
$$\mathrm{conv}(r_l) = \left\{\sum_{i \in r_l} a_i \cdot i : \sum_{i \in r_l} a_i = 1;\ a_i \ge 0\right\}. \qquad (1)$$
using this knowledge, a healthy lung is modeled by generating a convex hull of the actual lung mask. thickenings are normally limited to a small surface area, while other anatomical structures, e.g. ribs, have a more global effect on the shape of the lung. a convex hull of the whole lung mask not only removes thickenings, but yields an anatomically strongly altered lung model. a concept called a sliding convex hull is therefore introduced to model the healthy lung. instead of computing the convex hull of the complete lung $\mathrm{conv}(r_l)$, the healthy lung $\hat{r}_l$ is modeled by calculating the convex hull for each slice using the $n$ surrounding slices in each direction. in a closed form, this can be expressed by
$$\hat{r}_l = \bigcup_{\zeta=1}^{Z} \mathrm{conv}(r_{l,\zeta}) \cap \{i = (x, y, z) : i \in r;\ z = \zeta\}, \qquad (2)$$
where $r_{l,\zeta} = \{i = (x, y, z) : i \in r_l;\ |z - \zeta| \le n\}$ are the surrounding slices. the parameter $n$ controls the spatial extension of the considered thickenings. the use of $n = 0$ reduces the approach to a two-dimensional slice-by-slice analysis, and $n = Z$ is identical with computing a convex hull of the whole lung. the results of applying the sliding convex hull with different values of $n$ are shown in figure 4. the lung mask image is provided by the mapping
$$\hat{s}(i) = \begin{cases} 1, & \text{if } i \in \hat{r}_l \\ 0, & \text{else.} \end{cases} \qquad (3)$$
the principle is visualized in figure 3, with the used range of slices $(z - n) \ldots (z + n)$ shown in blue and the actual slice $z$ shown in red.

fig. 2: example of a pleural thickening not clearly visible in the transverse plane, but visible in the sagittal and coronal plane
fig. 3: the sliding convex hull principle
fig. 4: results for different parameter settings of n (n = 0, n = 20, n = Z)

the differential image $\hat{s}(i) \wedge \neg s(i)$, $i \in r$, includes all detectable thickenings. in addition, it contains other anatomical indentions, as shown overlaid in figure 5, e.g. the ribs, the bronchial tree and the lobar fissure. these structures usually consist of huge segments. the removal of large segments would also discard all thickenings located in these anatomical regions. the thickenings, however, express their particular nature by almost never appearing on the bronchial tree and the lobar fissure [7].
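the following sketch (our own illustration, not the authors' implementation) makes the definitions (2)–(3) and the differential image concrete; it uses a brute-force point-in-hull test via a delaunay triangulation of the window voxels, and all function names are our own:

```python
# a minimal sketch of the sliding convex hull (2)-(3) and the differential
# image: for every slice z, the convex hull of the lung voxels in the window
# z-n .. z+n is intersected with slice z.
import numpy as np
from scipy.spatial import Delaunay

def sliding_convex_hull(mask, n):
    """mask: 3d boolean lung mask s, indexed [x, y, z]; returns s_hat."""
    nx, ny, nz = mask.shape
    s_hat = np.zeros_like(mask)
    gx, gy = np.mgrid[0:nx, 0:ny]          # voxel grid of one slice
    for z in range(nz):
        lo = max(0, z - n)
        window = mask[:, :, lo:min(nz, z + n + 1)]
        pts = np.argwhere(window)          # (x, y, local z) of lung voxels
        pts[:, 2] += lo                    # back to global z
        if len(pts) < 4:                   # need a non-degenerate 3d hull
            continue
        try:
            tri = Delaunay(pts)
        except Exception:                  # coplanar/degenerate windows
            continue
        slice_pts = np.column_stack([gx.ravel(), gy.ravel(),
                                     np.full(gx.size, z)])
        inside = tri.find_simplex(slice_pts) >= 0
        s_hat[:, :, z] = inside.reshape(nx, ny)
    return s_hat

def differential_image(mask, n):
    """candidate indentions: s_hat AND NOT s."""
    return sliding_convex_hull(mask, n) & ~mask
```

this brute-force test is far slower than the quickhull-plus-voxelization pipeline the authors describe in section 3; it is meant only to make the definitions concrete.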
all other regions of the lung surface contain only relatively flat anatomical structures like the ribs, the backbone and the lobar fissures. the undesired detection of these flat regions can be suppressed by virtually shrinking the healthy lung model, using the morphological erosion operation $\ominus$. the shrunken model is given by
$$\hat{s}_e = \hat{s} \ominus p_e, \qquad (4)$$
where $p_e$ is a sphere with the radius $(e + 0.5)$ used as the structure element. the differences
$$\delta s_e(i) = \hat{s}_e(i) \wedge \neg s(i), \quad i \in r, \qquad (5)$$
using this model contain only structures indented by more than $e$ voxels. the parameter $e$ is called the erosion level.

fig. 5: an example of the overlaid difference between the modeled healthy lung mask and the actual lung mask
fig. 6: detected differences for different erosion levels in different colors

on each erosion level, flat anatomical structures with an indention of $e$ or less are removed. only large structures or thickenings with a higher indention remain. the large structures can be filtered out, as described below, and therefore only thickenings remain. in figure 6 the differences between the eroded healthy lung model and the actual lung are visualized in different colors. each voxel has a color depending on the highest erosion level at which it is contained in the differential image.
the resulting segments have to be classified as thickenings or as normal anatomical structures. this is realized by filtering the segments depending on their size and on their ct number. thickenings are usually from 5 mm³ to 2000 mm³ in volume, hence segments with more than 10000 mm³ can safely be classified as anatomical structures. while fat typically has a ct number in the range from −220 hu to −50 hu, pleural mesothelioma is a tumorous connective tissue similar to muscle, whose hounsfield units range between 20 and 60 hu. therefore only segments including more than 10 % of tissue with 10 hu or higher are classified as thickenings. after filtering the differences $\delta s_e(i)$, the final result of the thickening detection is achieved by merging the detected thickenings from each erosion level, using
$$\delta s(i) = \bigcup_{e=1}^{E} \mathrm{filter}(\delta s_e(i)), \qquad (6)$$
where $E$ is the maximum expected indention depth of normal anatomical structures.
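to show how (4)–(6) fit together, here is a minimal sketch (our own, not the authors' code) using scipy; the thresholds (10000 mm³, 10 hu, 10 %) are the ones stated above, while the function and parameter names are our own:

```python
# a minimal sketch of the erosion-level detection (4)-(6): erode the
# healthy-lung model with a spherical structure element, difference against
# the actual mask, filter segments by volume and hounsfield criteria, and
# merge over all erosion levels.
import numpy as np
from scipy import ndimage

def sphere(radius):
    """boolean ball of the given radius (structure element p_e)."""
    r = int(np.ceil(radius))
    zz, yy, xx = np.mgrid[-r:r + 1, -r:r + 1, -r:r + 1]
    return xx**2 + yy**2 + zz**2 <= radius**2

def detect_thickenings(s, s_hat, hu, voxel_volume_mm3, e_max):
    """s: lung mask, s_hat: healthy-lung model, hu: ct numbers in
    hounsfield units; returns the merged detection mask of eq. (6)."""
    result = np.zeros_like(s)
    for e in range(1, e_max + 1):
        s_e = ndimage.binary_erosion(s_hat, structure=sphere(e + 0.5))  # (4)
        delta = s_e & ~s                                                # (5)
        labels, n_seg = ndimage.label(delta)
        for seg in range(1, n_seg + 1):
            voxels = labels == seg
            volume = voxels.sum() * voxel_volume_mm3
            tumour_like = (hu[voxels] >= 10).mean()  # fraction >= 10 hu
            # thresholds from the text: discard huge segments, keep
            # segments with more than 10 % tumour-like tissue
            if volume <= 10000 and tumour_like > 0.10:
                result |= voxels                                        # (6)
    return result
```

note that this sketch recomputes each level-$e$ erosion from scratch; section 3 below describes the cheaper approximation by iterated level-1 erosions.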
3 notes on implementation
in this section we focus on the computationally intensive parts of the implementation. a complex part is the determination of the convex hull, which is performed for each slice. only the surface of the lung mask contains points potentially contained in the subset defining the convex hull. thus, a simple binary edge detector is applied to the lung mask $s$ to reduce the number of considered voxels that are used to calculate the convex hull. the calculation itself is performed using the quickhull algorithm for convex hulls [8].
generating the healthy lung model requires fast voxelization of the triangles $g = \{(i, j, k) : i, j, k \in \hat{r}_l\}$ describing the convex hull. to reduce the number of triangles $g'_{\zeta}$ considered in the slice $\zeta$, only triangles whose vertices $i = (x_i, y_i, z_i)$, $j = (x_j, y_j, z_j)$, $k = (x_k, y_k, z_k)$ all lie in this slice ($z_i = z_j = z_k = \zeta$) or which intersect the slice ($(z_i < \zeta \vee z_j < \zeta \vee z_k < \zeta) \wedge (z_i > \zeta \vee z_j > \zeta \vee z_k > \zeta)$) are regarded. voxelization is realized by recursively splitting the triangles until their edge length is below 0.5 voxels. the resulting triangle edges cover all voxels of the lung surface. the resulting algorithm is highly efficient for parallel computing.
another computationally intensive task is the erosion of the lung model, which depends on the size of the structuring element. to reduce the size of this structuring element, the erosion of level $e$ is approximated by an erosion of level 1 applied to the result of the erosion from level $(e - 1)$.

4 results
the experiments on thickening detection are carried out using image data from four different patients. to evaluate the results, we measure both the total detected volume of all thickenings and the number of detected thickenings for each patient. the results are plotted in figure 7 for the volume of the thickenings and in figure 8 for the number of thickenings. both metrics are measured depending on the parameter $n$, which describes the size of the sliding convex hull.

fig. 7: total detected thickening volume depending on the size of the sliding convex hull
fig. 8: detected number of thickenings depending on the size of the sliding convex hull

5 discussion
neither an exact number of thickenings nor the thickening volume could be determined, because of the inter- and intra-reader variances. therefore only an interpretational evaluation is performed. obviously, the detected number of thickenings and the volume both fluctuate for $n < 4$ and tend to a constant value for higher values of $n$. unstable results occur in the range where the method has a more 2d-like behavior. this behavior results in an inappropriate connectivity of the detected segments in the $z$ direction. thus, on the one hand, large anatomical structures could not be detected and filtered out and, on the other hand, related segments could not be assigned to the same thickening. another obvious effect is the increasing volume for increasing $n$, which can be explained by a growing convex hull. this is caused by the increasing effect of bulges outside the lung surface for increasing $n$.

6 summary and future work
we have presented a slice-based approach for detecting pleural thickenings, including 3d information on the surrounding lung surface. the detected structures are automatically refined and filtered using stepwise erosion of a modelled healthy lung. though this approach leads to an optically plausible result for the shape of the extracted thickenings, the method still focuses on detection. the exact boundaries of the thickenings need to be refined in further processing steps. for example, the rear of the thickenings is currently interpolated only by applying a convex hull, which does not consider the surrounding lung curvature. approaches like [9] promise a more accurate interpolation of the rear of the thickenings.

acknowledgement
the project is supported by german social accident insurance (dguv), project number ff-fb0148. we would like to thank prof. dr.-ing. til aach, head of the institute of imaging and computer vision, rwth aachen university, germany, for supervising this project, and also prof. dr. med. thomas kraus, director of the institute and out-patient clinic of occupational medicine, university hospital aachen, germany, for supervising and also for providing us with medical image data, background knowledge and evaluations.

references
[1] pistolesi, m., rusthoven, j.: malignant pleural mesothelioma: update, current management, and newer therapeutic strategies, chest, 2004, vol. 126, no. 4, pp. 1318–1329.
[2] carl, t. m. i.: interreadervarianz bei der hrct- und cxr-befundung in einer längsschnittstudie bei ehemals asbeststaubexponierten personen. ph.d. thesis, medizinische fakultät, rwth aachen, 2004.
[3] chaisaowong, k., knepper, a., kraus, t., aach, t.: application of supervised range-constrained thresholding to extract lung pleura for automated detection of pleural thickenings from thoracic ct images. proc. spie medical imaging: computer-aided diagnosis, san diego, ca, 2007, vol. 6514, p. 65143m.
[4] saekor, n., roongruangsorakarn, s., chaisaowong, k., kraus, t., aach, t.: 3d modeling of detected pleural thickenings through thin plate spline interpolation. in electrical engineering/electronics, computer, telecommunications and information technology, ecti-con, 2009, vol. 02, pp. 1106–1109.
[5] el-baz, a., gimel'farb, g., falk, r., el-ghar, m. a.: a novel approach for automatic follow-up of detected lung nodules. icip, san antonio, tx, 2007, vol. 5, pp. v-501–v-504.
[6] chaisaowong, k., bross, b., knepper, a., kraus, t., aach, t.: detection and follow-up assessment of pleural thickenings from 3d ct data. ecti-con, 2008, vol. i, pp. 489–492.
[7] chaisaowong, k., jäger, p., vogel, s., knepper, a., kraus, t., aach, t.: computer-assisted diagnosis for early stage pleural mesothelioma: towards automated detection and quantitative assessment of pleural thickenings from thoracic ct images. methods of information in medicine, 2007, vol. 46(3), pp. 324–331.
[8] barber, c. b., dobkin, d. p., huhdanpaa, h.: the quickhull algorithm for convex hulls. acm transactions on mathematical software, 1996, vol. 22, no. 4, pp. 469–483.
[9] liepa, p.: filling holes in meshes. proceedings of the 2003 eurographics/acm siggraph symposium on geometry processing, aire-la-ville, switzerland, 2003, sgp '03, pp. 200–205.

about the authors
peter faltin was born in 1983 in cologne, germany. he studied computer engineering at rwth aachen university (germany) and received his dipl.-ing. in 2010. currently, he is a phd student at the institute of imaging & computer vision at rwth aachen university, where he is working on the "computer-aided early diagnosis of pleuramesothelioma" project. his research focuses on medical image processing, signal processing and computer vision.
kraisorn chaisaowong was born in thailand. after his diplom-ingenieur studies (electrical engineering and information technology) in biomedical engineering at the institute of biomedical engineering, karlsruhe institute of technology (germany), and a research stay at the nora eccles harrison cardiovascular research and training institute, university of utah (usa), he has been employed as a university lecturer at the king mongkut's university of technology north bangkok (thailand). currently, he is a phd student at the institute of imaging & computer vision, rwth aachen university, where he is responsible for the "computer-aided early diagnosis of pleuramesothelioma" project.

peter faltin
e-mail: peter.faltin@lfb.rwth-aachen.de
institute of imaging and computer vision, rwth aachen university, germany
kraisorn chaisaowong
e-mail: kraisorn.chaisaowong@lfb.rwth-aachen.de
http://www.lfb.rwth-aachen.de/
institute of imaging and computer vision, rwth aachen university, germany
king mongkut's university of technology north bangkok, thailand

exceptional points for nonlinear schrödinger equations describing bose-einstein condensates of ultracold atomic gases
g. wunner, h. cartarius, p. köberle, j. main, s. rau

abstract
the coalescence of two eigenfunctions with the same energy eigenvalue is not possible in hermitian hamiltonians.
it is, however, a phenomenon well known from non-hermitian quantum mechanics. it can appear, e.g., for resonances in open systems, with complex energy eigenvalues. if two eigenvalues of a quantum mechanical system which depends on two or more parameters pass through such a branch point singularity at a critical set of parameters, the point in the parameter space is called an exceptional point. we will demonstrate that exceptional points occur not only for non-hermitian hamiltonians but also in the nonlinear schrödinger equations which describe bose-einstein condensates, i.e., the gross-pitaevskii equation for condensates with a short-range contact interaction, and with additional long-range interactions. typically, in these condensates the exceptional points are also found to be bifurcation points in parameter space. for condensates with a gravity-like interaction between the atoms, these findings can be confirmed in an analytical way.

1 introduction
in 1924, satyendra nath bose and albert einstein predicted that when the thermal de broglie wavelength becomes of the order of the interparticle distance, bosons begin to "condense" into their ground state. all bosons have the same energy and quantum characteristics, similar to the way all photons in a laser share the same quantum state. superfluidity of $^{4}$he atoms below a temperature of approximately $t = 2.17$ k was the first experimental realisation of such a macroscopic quantum state, in 1937 [1]. it took, however, until 1995 for bose-einstein condensation to be observed in dilute atomic vapours of alkali atoms, namely, by wieman et al. [2] in $^{87}$rb and by ketterle et al. [3] in $^{23}$na (in both cases the numbers of nucleons and electrons add up to an even number, to form a boson). in these experiments, advanced optical techniques (capture in a magneto-optical trap and application of laser cooling and evaporative cooling) had to be employed to cool the atoms to temperatures very near to absolute zero (170 nk in the rubidium experiment and 2 μk in the sodium experiment). meanwhile bose-einstein condensates of many more alkaline and alkaline earth atoms have been produced (for a review see, e.g., ref. [4]).
alkalis, however, are not ideal bose gases: at small distances, the atoms still interact via the short-range van der waals force, whose action can be simulated by the short-range isotropic s-wave scattering contact potential
$$v_s(\mathbf{r}, \mathbf{r}') = \frac{4\pi a\hbar^2}{m}\,\delta(\mathbf{r} - \mathbf{r}'),$$
where $a$ is the s-wave scattering length. the latter can be tuned, using feshbach resonances of hyperfine levels and changing the applied magnetic field, from positive values (repulsive interaction) to negative values (attractive interaction), until the interaction becomes so attractive that the condensates collapse.
in this paper we will be concerned with bose-einstein condensates where an additional long-range atom-atom interaction is active. an example is bose-einstein condensation of $^{52}$cr atoms (griesmeier et al. [5], koch et al. [6]), with a large magnetic moment of $\mu = 6\mu_b$, so that the atoms interact via the dipole-dipole interaction over large distances. in these condensates the possibility of tuning the relative strengths of the short-range interaction and the long-range interaction has opened the way to a variety of new phenomena (see, e.g., the review article by lahaye et al. [7]). furthermore, the stability of these condensates depends strongly on the trap geometry. the atoms are aligned by an external magnetic field in the $z$ direction.
if the confinement in the $z$ direction is stronger than perpendicular to it (if the aspect ratio $\lambda = \omega_z/\omega_\varrho$ of the trapping frequencies is greater than one), the condensate assumes an oblate shape and the dipole-dipole interaction acts predominantly repulsively, while for $\lambda < 1$ the shape is prolate, the atoms are arranged in the favoured head-to-tail configuration, and the interaction acts attractively.
as an alternative system, o'dell et al. [8] have proposed bose-einstein condensation of neutral atoms with an electromagnetically induced attractive long-range $1/r$ interaction. in their proposal, 6 "triads" of intense off-resonant laser beams average out the $1/r^3$ interactions in the near-zone limit of the retarded dipole-dipole interaction of the atoms produced by the irradiation, while the weaker $1/r$ interaction is retained. the novel feature of these condensates is that the atoms can be self-trapped, without an external trap. on the theoretical side, the advantage is that for self-trapping analytical variational calculations are possible, which can serve as a guide for investigations of more complex situations and of condensates with other interatomic interactions.
a theoretical description of condensates of weakly interacting bosons at ultracold temperatures can be performed within a mean-field picture by the gross-pitaevskii equation. this equation can be considered as the hartree equation for the single-particle orbital $\psi(\mathbf{r})$ which all atoms occupy. it is a nonlinear schrödinger equation, and therefore phenomena not present in linear quantum mechanics appear. for example, the superposition principle is no longer applicable to the equation. for bose-einstein condensates with a gravity-like long-range interaction $v_u(\mathbf{r}, \mathbf{r}') = -u/|\mathbf{r} - \mathbf{r}'|$ ("monopolar condensates", $u$ describes the strength of the interaction) the gross-pitaevskii equation reads
$$\left\{ -\Delta + \gamma^2 r^2 + 8\pi n\frac{a}{a_u}|\psi(\mathbf{r})|^2 - 2n\int\frac{|\psi(\mathbf{r}')|^2}{|\mathbf{r} - \mathbf{r}'|}\, d^3 r' \right\}\psi(\mathbf{r}) = i\hbar\frac{\partial\psi(\mathbf{r})}{\partial t}. \qquad (1)$$
here $\gamma = \hbar\omega_0/e_u$ is the trapping frequency measured in the time scale given by the "rydberg" energy $e_u$ associated with the strength of the $1/r$ interaction, $n$ is the particle number, and $a/a_u$ the scattering length in units of the "bohr" radius $a_u$. it can be shown [9] that the equation effectively depends only on two physical parameters, viz. the particle-number-scaled quantities $n^2 a/a_u$ and $\gamma/n^2$.
for condensates with dipole-dipole interactions ("dipolar condensates") in which the atoms are spin-polarized in the $z$ direction, the gross-pitaevskii equation reads
$$\left\{ -\Delta + \gamma_\varrho^2\varrho^2 + \gamma_z^2 z^2 + 8\pi n\frac{a}{a_d}|\psi(\mathbf{r})|^2 + n\int\frac{|\psi(\mathbf{r}')|^2\,(1 - 3\cos^2\vartheta')}{|\mathbf{r} - \mathbf{r}'|^3}\, d^3 r' \right\}\psi(\mathbf{r}) = i\hbar\frac{\partial\psi(\mathbf{r})}{\partial t}, \qquad (2)$$
where we have used as units of length, energy and frequency the quantities $a_d = \mu_0\mu^2 m/(2\pi\hbar^2)$ ($\mu$ is the magnetic moment), $e_d = \hbar^2/(2 m a_d^2)$, and $\omega_d = e_d/\hbar$, respectively. again it can be shown [10] that the equation effectively depends only on the three scaled parameters $n^2\bar{\gamma}$, $\lambda$, $a/a_d$ (with $\bar{\gamma} = \gamma_\varrho^{2/3}\gamma_z^{1/3}$ and $\lambda = \gamma_z/\gamma_\varrho$). equations (1) and (2) are the starting point for the investigations below.

2 monopolar condensates
2.1 variational results
to find stationary solutions of the gross-pitaevskii equation (1) we can enter into the corresponding mean-field energy functional
$$e[\psi] = n\int d^3 r\, \psi^*(\mathbf{r})\left( -\Delta_r + \gamma^2 r^2 + 4\pi n\frac{a}{a_u}|\psi(\mathbf{r})|^2 - n\int d^3 r'\, \frac{|\psi(\mathbf{r}')|^2}{|\mathbf{r} - \mathbf{r}'|} \right)\psi(\mathbf{r})$$
with the gaussian variational ansatz
$$\psi(\mathbf{r}) = a\, e^{-k^2 r^2/2}, \quad a = \left(k/\sqrt{\pi}\right)^{3/2}.$$
all integrals, including the gravity-like interaction integral, can be evaluated analytically.
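as a small illustration of that claim (our own sketch, not from the paper), sympy reproduces two of the gaussian integrals in closed form: the normalization of the ansatz and the harmonic-trap expectation value:

```python
# a minimal sketch: evaluate two gaussian-ansatz integrals analytically with
# sympy -- the normalization and <gamma^2 r^2> -- as examples of the
# integrals entering the variational energy.
import sympy as sp

r, k, gamma = sp.symbols('r k gamma', positive=True)

psi2 = (k / sp.sqrt(sp.pi))**3 * sp.exp(-k**2 * r**2)   # |psi(r)|^2

norm = sp.integrate(4 * sp.pi * r**2 * psi2, (r, 0, sp.oo))
trap = sp.integrate(4 * sp.pi * r**2 * gamma**2 * r**2 * psi2, (r, 0, sp.oo))

print(sp.simplify(norm))   # -> 1
print(sp.simplify(trap))   # -> 3*gamma**2/(2*k**2)
```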
Requiring the mean-field energy functional to become stationary with respect to a variation of the width leads to a quintic equation for k, which, however, reduces to a simple quadratic equation in the special case γ = 0, i.e., for a self-trapped condensate. The two solutions read [11, 13]

$$k_\pm = \frac{1}{2}\sqrt{\frac{\pi}{2}}\;\frac{1}{N a/a_u}\left(\pm\sqrt{1+\frac{8}{3\pi}\,N^2\frac{a}{a_u}}\;-\;1\right), \tag{3}$$

where the plus sign corresponds to a local minimum (the stable ground state of the condensate) and the minus sign to a local maximum (a collectively excited state of the condensate) of the energy. From (3) it can be seen that these solutions exist only for N²a/a_u ≥ −3π/8 (see also the numerical sketch below).

Accurate numerical stationary solutions of (1) can be determined by direct outward integration, starting with suitable initial conditions at r = 0 [9]. Figure 1 compares the variational and numerically exact results for the mean-field energy and the chemical potential. It can be seen that the simple Gaussian ansatz describes the overall behaviour qualitatively well: both in the variational and in the numerical calculations two solutions are born in a tangent bifurcation at a critical negative value of the scattering strength, with the critical value being only slightly underestimated by the Gaussian ansatz (N²a/a_u = −3π/8 = −1.1780 variationally, and −1.0251 numerically). We note that a similar tangent bifurcation behaviour is also present for non-vanishing trapping potentials [9].

At the bifurcation point, the energies and the wave functions $\psi_{k_+}$ and $\psi_{k_-}$ are identical, which is characteristic of the exceptional points known to appear in non-Hermitian quantum mechanics. To confirm that the bifurcation point is an exceptional point, we have to traverse a circle around it: if the two eigenvalues permute after the full circle, we have an exceptional point. This requires a two-dimensional parameter space, and thus an extension of the scattering length to complex values [11, 13].

Fig. 1: Comparison of variational and numerically accurate results for the mean-field energy (left) and chemical potential (right) of a self-trapped monopolar condensate as a function of the particle-number-scaled scattering length.

Fig. 2: Paths of the real and imaginary parts of the mean-field energy and the chemical potential in the variational calculation with a Gaussian-type orbital for self-trapped monopolar condensates, when a circle of radius 10⁻⁸ is traversed around the bifurcation point N²a/a_u = −3π/8.

We demonstrate this first for the analytical solution obtained with the Gaussian ansatz for self-trapped condensates. Figure 2 shows the resulting paths of the values of the mean-field energy and the chemical potential in the complex plane when a small circle $r\,e^{\mathrm{i}\varphi}$, φ = 0…2π, with radius r = 10⁻⁸ is traversed around the critical scattering length N²a/a_u = −3π/8. A surprising result is that while for the chemical potential a clear permutation of the two solutions is visible, the two mean-field energy values do not permute after a full circle. This can be explained by looking at the analytical fractional power series expansion of the mean-field energy around the critical value [11],

$$\tilde{E}_\pm(\varphi) = -\frac{4}{9\pi} + 0\cdot\sqrt{r}\,e^{\mathrm{i}\varphi/2} + \frac{32}{27\pi^2}\,\sqrt{r}^{\,2}\,e^{\mathrm{i}\varphi} \pm \left(\frac{4}{9\pi}-\frac{32}{9\pi^2}\right)\sqrt{r}^{\,3}\,e^{(3/2)\mathrm{i}\varphi} + \mathcal{O}\!\left(\sqrt{r}^{\,4}\right).$$

Evidently the first-order term with the phase factor $e^{\mathrm{i}\varphi/2}$ vanishes; the lowest non-vanishing order has the phase factor $e^{\mathrm{i}\varphi}$, which does not lead to a permutation of the energies, and it is only the third-order term which can yield the permutation.
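Stepping back to the explicit solution (3): the sketch below (ours; it works in particle-number-scaled units, treating $u \equiv N^2 a/a_u$ as the single control parameter and setting N = 1) evaluates both branches and shows them coalescing at the bifurcation point u = −3π/8, the point whose complex unfolding was just expanded:

```python
import numpy as np

def k_branches(u):
    """Variational widths k_+- of eq. (3); u = N^2 a/a_u, scaled units, N = 1."""
    disc = 1.0 + 8.0*u/(3.0*np.pi)
    if disc < 0.0:
        return None                  # no stationary solutions below u = -3*pi/8
    root = np.sqrt(disc)
    pref = 0.5*np.sqrt(np.pi/2.0)/u
    return pref*(root - 1.0), pref*(-root - 1.0)    # (k_+, k_-)

u_crit = -3.0*np.pi/8.0              # = -1.1780...
for u in (-0.5, -1.0, u_crit + 1e-12, u_crit - 1e-3):
    res = k_branches(u)
    print(f"u = {u:+.4f} ->", "no solution" if res is None
          else f"k+ = {res[0]:.6f}, k- = {res[1]:.6f}")
# at u -> u_crit the two branches merge, the tangent-bifurcation signature
```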
By contrast, in the fractional power series expansion of the chemical potential,

$$\tilde{\mu}_\pm(\varphi) = -\frac{20}{9\pi} \pm \frac{8}{3\pi}\,\sqrt{r}\,e^{\mathrm{i}\varphi/2} - \left(\frac{4}{3\pi}+\frac{128}{27\pi^2}\right)\sqrt{r}^{\,2}\,e^{\mathrm{i}\varphi} \pm \left(\frac{8}{9\pi}-\frac{64}{9\pi^2}\right)\sqrt{r}^{\,3}\,e^{(3/2)\mathrm{i}\varphi} + \mathcal{O}\!\left(\sqrt{r}^{\,4}\right),$$

the first-order term does not vanish and leads to a permutation of the chemical potential values. Thus the permutation of the eigenvalues appears already for a small radius, in the same way as it occurs for exceptional points in simple non-Hermitian linear model systems.

Fig. 3: Same as Figure 2, but for a larger radius of r = 10⁻³.

Fig. 4: Same as Figure 2, but for a still larger radius of r = 10⁻¹.

For increasing radius, the higher-order terms in the expansions become more and more important, and the permutation of the mean-field energy values after a full circle becomes recognizable, as can be seen in Figure 3. The permutation behaviour of the chemical potential solutions is not altered, as is expected from the expansion. For the still larger radius r = 10⁻¹, shown in Figure 4, the higher-order terms lead to a deformation of the circle of the mean-field energy (as in linear model systems). The shape of the chemical potential circle does not change, but the spacing between the points is no longer uniform.

For the Gaussian ansatz, analytical expansions can also be performed for circles around the bifurcation point with large radii, r ≫ 1. The expression for the mean-field energy reads [11]

$$\tilde{E}_\pm(\varphi) = \pm\left(\frac{\pi}{16}-\frac{1}{2}\right)\frac{e^{-\mathrm{i}\varphi/2}}{\sqrt{r}} + \frac{1}{2}\,\frac{e^{-\mathrm{i}\varphi}}{\sqrt{r}^{\,2}} \pm\left(\frac{3\pi^2}{64}-\frac{3\pi}{8}\right)\frac{e^{-(3/2)\mathrm{i}\varphi}}{\sqrt{r}^{\,3}} + \mathcal{O}\!\left(\frac{1}{\sqrt{r}^{\,4}}\right),$$

while for the chemical potential we obtain

$$\tilde{\mu}_\pm(\varphi) = \pm\left(\frac{\pi}{8}-1\right)\frac{e^{-\mathrm{i}\varphi/2}}{\sqrt{r}} + \left(1-\frac{3\pi}{16}\right)\frac{e^{-\mathrm{i}\varphi}}{\sqrt{r}^{\,2}} \pm\left(\frac{3\pi^2}{32}-\frac{3\pi}{8}\right)\frac{e^{-(3/2)\mathrm{i}\varphi}}{\sqrt{r}^{\,3}} + \mathcal{O}\!\left(\frac{1}{\sqrt{r}^{\,4}}\right).$$

From these expressions it is evident that a clear semicircle structure should appear for large radii, as can be seen in Figure 5 for r = 10⁸.

2.2 Numerical analysis of the bifurcation point for self-trapped condensates

For accurate numerical solutions of the Gross-Pitaevskii equation a similar investigation of the exceptional point behaviour can be carried out. The task again is to traverse a circle in the complex scattering length plane around the numerically accurate bifurcation point at N²a/a_u = −1.0251, and to solve the equation numerically for each complex scattering length on the circle. This yields not only complex values for the chemical potential and the mean-field energy but also complex wave functions.

Fig. 5: Same as Figure 2, but for the very large radius r = 10⁸.

Fig. 6: Comparison of variational and numerically accurate results proving the branch-point singularity at the bifurcation point of the Gross-Pitaevskii equation for self-trapped Bose-Einstein condensates with an attractive 1/r long-range interaction. Panels (a) and (b) show the circles of radius r = 10⁻³ in the complex scattering length plane around the critical values, (c) and (d) the reactions of the chemical potential values as the circles are traversed, and (e) and (f) those of the mean-field energies (variational results: left column, accurate numerical results: right column). (Adapted from [11])
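Before turning to the numerical continuation, the permutation statements derived from the two small-radius expansions above can be checked directly. The sketch below (ours; both series are truncated at the quoted orders) traverses the circle and compares the size of each loop with its end-to-start mismatch: a relative gap of order one means the branch ends on its partner (permutation), a tiny gap means the loop closes on itself, reproducing the behaviour of Figures 2 and 3:

```python
import numpy as np

def E_tilde(sign, r, phi):
    """Truncated small-radius expansion of the mean-field energy near the EP."""
    s = np.sqrt(r) * np.exp(0.5j * phi)          # sqrt(r) e^{i phi/2}
    return (-4/(9*np.pi) + 32/(27*np.pi**2) * s**2
            + sign * (4/(9*np.pi) - 32/(9*np.pi**2)) * s**3)

def mu_tilde(sign, r, phi):
    """Truncated small-radius expansion of the chemical potential near the EP."""
    s = np.sqrt(r) * np.exp(0.5j * phi)
    return (-20/(9*np.pi) + sign * 8/(3*np.pi) * s
            - (4/(3*np.pi) + 128/(27*np.pi**2)) * s**2
            + sign * (8/(9*np.pi) - 64/(9*np.pi**2)) * s**3)

phi = np.linspace(0.0, 2*np.pi, 500)
for r in (1e-8, 1e-3):
    for f, name in ((E_tilde, "energy"), (mu_tilde, "chem. potential")):
        path = f(+1, r, phi)
        diameter = np.ptp(path.real) + np.ptp(path.imag)   # size of the loop
        gap = abs(path[-1] - path[0])                      # end-to-start gap
        print(f"r = {r:.0e}  {name:15s} relative end-gap = {gap/diameter:.2e}")
# At r = 1e-8 the energy loop is closed to ~1e-4 relative accuracy (no visible
# permutation; its swap is third order), while the chemical potential gap is of
# order one. At r = 1e-3 the energy permutation starts to become recognizable.
```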
The problem is that the Gross-Pitaevskii equation contains the square modulus of the wave function ψ and is therefore a non-analytic function of ψ. We use the following procedure for complex wave functions. Any complex wave function can be written as

$$\psi(r) = e^{\alpha(r) + \mathrm{i}\beta(r)}, \tag{4}$$

where the real functions α(r) and β(r) determine the amplitude and the phase of the wave function, respectively. The complex conjugate and the square modulus of ψ(r) read

$$\psi^*(r) = e^{\alpha(r) - \mathrm{i}\beta(r)}, \qquad |\psi(r)|^2 = e^{2\alpha(r)}. \tag{5}$$

The Gross-Pitaevskii system can then be transformed into two coupled nonlinear differential equations for the real functions α(r) and β(r), now without any complex conjugate or square modulus. These equations can be continued analytically by allowing for complex-valued functions α(r) and β(r). This means that $\psi^*(r) = e^{\alpha(r)-\mathrm{i}\beta(r)}$ and $|\psi(r)|^2 = e^{2\alpha(r)}$, without complex conjugation of α(r) and β(r), are formally used for the calculation of ψ* and |ψ(r)|² (a toy demonstration of this prescription is given below). As a consequence, the square modulus of ψ can become complex, leading to a complex absorbing potential in the Gross-Pitaevskii equation.

Figure 6 shows a comparison between the results of the variational and the numerically accurate calculation. The first row shows the circles of radius r = 10⁻³ around the respective critical values of the scattering length where the bifurcation appears. The second row compares the results for the chemical potential, and the third row those for the mean-field energy, as the circles in the complex scattering length plane are traversed. Obviously the permutation behaviour is identical in both approaches, proving that the bifurcation point is indeed an exceptional point also in the full numerical solution of the nonlinear Schrödinger equation which describes the Bose-Einstein condensation of ultracold atomic gases with a long-range attractive 1/r interaction.

3 Dipolar condensates

To find stationary solutions of the Gross-Pitaevskii equation (2) of dipolar gases, we could of course again extremize the mean-field energy functional for a given variational ansatz. A more sophisticated way is to exploit the time-dependent variational principle [12],

$$\left\|\,\mathrm{i}\phi(t) - H\psi(t)\,\right\|^2 \stackrel{!}{=} \min \quad \text{with respect to } \phi, \tag{6}$$

and afterwards to set $\dot\psi \equiv \phi$. If a complex parametrization of the trial wave function, $\psi(t) = \chi(\boldsymbol{\lambda}(t))$, is assumed, the variation leads to equations of motion for these parameters $\boldsymbol{\lambda}(t)$ [13]:

$$\left\langle\frac{\partial\psi}{\partial\boldsymbol{\lambda}}\,\Big|\,\mathrm{i}\dot\psi - H\psi\right\rangle = 0 \quad\Leftrightarrow\quad K\dot{\boldsymbol{\lambda}} = -\mathrm{i}\,\mathbf{h}, \quad\text{with}\quad K = \left\langle\frac{\partial\psi}{\partial\boldsymbol{\lambda}}\,\Big|\,\frac{\partial\psi}{\partial\boldsymbol{\lambda}}\right\rangle,\quad \mathbf{h} = \left\langle\frac{\partial\psi}{\partial\boldsymbol{\lambda}}\,\Big|\,H\,\Big|\,\psi\right\rangle. \tag{7}$$

3.1 Variational results

We assume an axisymmetric trapping potential and take the time-dependent Gaussian trial wave function

$$\psi(\rho, z, t) = e^{\mathrm{i}\left(A_\rho\rho^2 + A_z z^2 + \gamma\right)}, \qquad A_\rho = A_\rho(t),\; A_z = A_z(t),\; \gamma = \gamma(t), \tag{8}$$

where all three parameters are complex functions of time. The equations of motion that follow from the time-dependent variational principle for the real and imaginary parts of the variational parameters read

$$\begin{aligned} \dot{A}^r_\rho &= -4\left((A^r_\rho)^2 - (A^i_\rho)^2\right) + f_\rho(A^i_\rho, A^i_z, \gamma^i), & \dot{A}^i_\rho &= -8 A^r_\rho A^i_\rho,\\ \dot{A}^r_z &= -4\left((A^r_z)^2 - (A^i_z)^2\right) + f_z(A^i_\rho, A^i_z, \gamma^i), & \dot{A}^i_z &= -8 A^r_z A^i_z,\\ \dot{\gamma}^r &= -4A^i_\rho - 2A^i_z + f_\gamma(A^i_\rho, A^i_z, \gamma^i), & \dot{\gamma}^i &= 4A^r_\rho + 2A^r_z, \end{aligned}$$

and are solved with the initial values $A^r_\rho = 0$, $A^i_\rho > 0$, $A^r_z = 0$, $A^i_z > 0$ and

$$\gamma^i = \frac{1}{2}\,\ln\frac{\pi^{3/2}}{2\sqrt{2}\,A^i_\rho\sqrt{A^i_z}}. \tag{9}$$

This leaves four coupled ordinary differential equations for $\dot A^r_\rho$, $\dot A^i_\rho$, $\dot A^r_z$, $\dot A^i_z$. Variational stationary solutions of the Gross-Pitaevskii equation (2) are then obtained by searching for the fixed points of these differential equations. As in the monopolar case, two stationary points are found: one stable, representing the ground state, and one unstable, representing a collectively excited state.
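Returning for a moment to the analytic-continuation prescription of eqs. (4)-(5): it is mechanical enough to be shown in a few lines. The toy sketch below is ours; the specific α and β are arbitrary test functions, not condensate solutions. It contrasts the ordinary square modulus with the formally continued one, which becomes complex as soon as α does:

```python
import numpy as np

def sq_modulus_continued(alpha, beta):
    """|psi|^2 := exp(2*alpha), with NO complex conjugation of alpha, beta.

    For real alpha this coincides with abs(psi)**2; for complex alpha it is
    the analytic continuation used in the text and may itself be complex.
    """
    return np.exp(2.0 * alpha)

r = np.linspace(0.0, 3.0, 5)
alpha_real = -r**2            # ordinary (real) amplitude function
beta_real = 0.3 * r           # ordinary (real) phase function

psi = np.exp(alpha_real + 1j*beta_real)
print(np.allclose(np.abs(psi)**2, sq_modulus_continued(alpha_real, beta_real)))
# -> True: both definitions agree while alpha and beta are real

alpha_cplx = -r**2 * (1.0 + 0.05j)   # continued to complex alpha
print(sq_modulus_continued(alpha_cplx, beta_real))
# -> complex values: the "square modulus" turns into the complex,
#    absorbing-potential-like quantity described in the text
```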
Figure 7 shows the results for the chemical potential and the mean-field energy as functions of the scattering length for different aspect ratios. It is evident that again the two stationary solutions are born at a critical scattering length in a tangent bifurcation, and that below the critical scattering length no stationary solutions exist.

Are the bifurcation points found in the variational calculations with an axisymmetric Gaussian wave function also exceptional points? To check this, an extension of the scattering length to complex values is again necessary. In contrast to the self-trapped monopolar case, analytical investigations are no longer possible, and the analysis has to be carried out numerically.

Fig. 7: Chemical potential (left) and mean-field energy (right) as functions of the scattering length for an axisymmetric variational Gaussian ansatz for dipolar condensates, for N²γ̄ = 3.4 × 10⁴ and different aspect ratios.

Fig. 8: Chemical potentials (middle) and mean-field energies (right) for the two stationary solutions that emerge in a tangent bifurcation, when a circle with radius r = 10⁻³ around the critical scattering length a_crit is traversed in the complex extended parameter plane (left), for N²γ̄ = 3.4 × 10⁴ and two different aspect ratios. The permutation of the quantities after one cycle proves that the bifurcation points are exceptional points.

Figure 8 shows the behaviour of the chemical potential and the mean-field energy when the scattering lengths corresponding to the bifurcation points in Figure 7, for the aspect ratios λ = 1 and λ = 6, are circled in the complex plane with a radius r = 10⁻³. It is evident that both the values of the chemical potentials and those of the mean-field energies of the two solutions permute as one full cycle is traversed, an unambiguous sign that the bifurcation points found in the solution of the nonlinear Schrödinger equation describing the Bose-Einstein condensation of ultracold atomic gases with a long-range dipole-dipole interaction are also exceptional points.

3.2 Numerical results

More sophisticated variational calculations for dipolar condensates can be performed using the method of coupled Gaussian wave packets. For the Gross-Pitaevskii equation with dipolar interaction the method has been shown [14, 15] to be a full-fledged alternative to numerical computations on a grid with the method of imaginary time evolution. In contrast to the latter, the method allows us to find both stable and unstable solutions. The idea is to take as trial functions superpositions of N different Gaussians centered at the origin,

$$\psi(\mathbf{r}) = \sum_{k=1}^{N} e^{\mathrm{i}\left(A^k_x(t)\,x^2 + A^k_y(t)\,y^2 + A^k_z(t)\,z^2 + \gamma^k(t)\right)} \equiv \sum_{k=1}^{N} g^k\!\left(\mathbf{y}^k(t), \mathbf{r}\right),$$

where the $A^k(t)$ are complex width parameter functions (3N of them) and the $\gamma^k(t)$ fix the weights and phases of the individual Gaussians (N of them).

Fig. 9: Mean-field energy of a dipolar condensate for (particle-number-scaled) trap frequencies N²γ_z = 25 200 and N²γ_ρ = 3 600 as a function of the scattering length in units of a_d. In the variational calculation with one Gaussian a stable ground state (g_{N=1}) and an unstable excited state (e_{N=1}) emerge in a tangent bifurcation. In the calculation with coupled Gaussians two unstable states emerge (labeled u_coupled), of which the lower one turns into a stable ground state (g_coupled) in the region of the small rectangle, in a pitchfork bifurcation. (Adapted from [14])
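As a deliberately minimal illustration of this ansatz, the sketch below (ours; the parameter values are arbitrary placeholders, not fitted condensate parameters) evaluates a superposition of two such Gaussians on a grid. The imaginary parts of the width parameters must be positive for the trial function to be normalizable:

```python
import numpy as np

def coupled_gaussians(x, y, z, params):
    """psi(r) = sum_k exp(i(Ax_k x^2 + Ay_k y^2 + Az_k z^2 + gamma_k)).

    params: list of tuples (Ax, Ay, Az, gamma), all complex;
    Im(Ax), Im(Ay), Im(Az) > 0 so that each term decays at large |r|.
    """
    psi = np.zeros(np.broadcast(x, y, z).shape, dtype=complex)
    for ax, ay, az, gam in params:
        psi += np.exp(1j * (ax*x**2 + ay*y**2 + az*z**2 + gam))
    return psi

# two hypothetical Gaussians: a broad one and a narrow one
params = [(0.1 + 0.5j, 0.1 + 0.5j, 0.0 + 1.2j, 0.0 + 0.3j),
          (0.0 + 2.0j, 0.0 + 2.0j, 0.2 + 3.0j, 0.1 + 1.1j)]

grid = np.linspace(-3.0, 3.0, 65)
x, y, z = np.meshgrid(grid, grid, grid, indexing="ij")
psi = coupled_gaussians(x, y, z, params)
norm = np.sum(np.abs(psi)**2) * (grid[1] - grid[0])**3   # crude quadrature
print("norm of the trial function:", norm)
```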
Equations of motion for the variational parameters again follow from the time-dependent variational principle. Figure 9 shows the results for a dipolar condensate as a function of the scattering length, for trap frequencies N²γ_z = 25 200 and N²γ_ρ = 3 600. In the calculation with five Gaussians the results from a grid calculation are well reproduced, and the bifurcation of the mean-field energy is confirmed. Again the variational calculation with a single Gaussian underestimates the critical scattering length at which the bifurcation appears. However, a more complicated bifurcation scenario evolves: as the scattering length is decreased, the stable ground-state solution turns unstable already before the tangent bifurcation point is reached, indicating a pitchfork bifurcation in the region of the small rectangle shown in the figure. Neither the tangent nor the pitchfork bifurcation point has been conclusively analyzed so far as to its properties as an exceptional point, but the analysis should reveal new features, in particular when the tangent bifurcation point is encircled with a radius that also encompasses the pitchfork bifurcation.

4 Summary

We have demonstrated that "nonlinear versions" of exceptional points appear in bifurcating solutions of the (extended) Gross-Pitaevskii equations describing the Bose-Einstein condensation of ultracold atomic gases with attractive 1/r interaction and with dipole-dipole interaction. The identification as exceptional points required a complex extension of the scattering length. In summary, Bose-Einstein condensates near the collapse point can be regarded as realizations of physical systems close to exceptional points.

References

[1] Kapitza, P., Allen, J. F., Misener, A. D.: Nature 141 (1938), 74, 75.
[2] Anderson, M. H., Ensher, J. R., Matthews, M. R., Wieman, C. E., Cornell, E. A.: Science 269 (1995), 198.
[3] Davis, K. B., Mewes, M. O., Andrews, M. R., van Druten, N. J., Durfee, D. S., Kurn, D. M., Ketterle, W.: Phys. Rev. Lett. 75 (1995), 3969.
[4] Ueda, M.: Fundamentals and New Frontiers of Bose-Einstein Condensation, World Scientific Publishing, 2010.
[5] Griesmaier, A., Werner, J., Hensler, S., Stuhler, J., Pfau, T.: Phys. Rev. Lett. 94 (2005), 160401.
[6] Koch, T., Lahaye, T., Metz, J., Fröhlich, B., Griesmaier, A., Pfau, T.: Nature Physics 4 (2008), 218.
[7] Lahaye, T., Menotti, C., Santos, L., Lewenstein, M., Pfau, T.: Rep. Prog. Phys. 72 (2009), 126401.
[8] O'Dell, D., Giovanazzi, S., Kurizki, G., Akulin, V. M.: Phys. Rev. Lett. 84 (2000), 5687.
[9] Papadopoulos, I., Wagner, P., Wunner, G., Main, J.: Phys. Rev. A 76 (2007), 053604.
[10] Köberle, P., Cartarius, H., Fabčič, T., Main, J., Wunner, G.: New J. Phys. 11 (2009), 023017.
[11] Cartarius, H., Main, J., Wunner, G.: Phys. Rev. A 77 (2008), 013618.
[12] McLachlan, A. D.: Mol. Phys. 8 (1964), 39.
[13] Cartarius, H., Fabčič, T., Main, J., Wunner, G.: Phys. Rev. A 78 (2008), 013615.
[14] Rau, S., Main, J., Köberle, P., Wunner, G.: Phys. Rev. A 81 (2010), 031605(R).
[15] Rau, S., Main, J., Köberle, P., Wunner, G.: Phys. Rev. A 82 (2010), 046201.

Günter Wunner
E-mail: wunner@itp1.uni-stuttgart.de
H. Cartarius, P. Köberle, J. Main, S. Rau
Institute for Theoretical Physics 1, University of Stuttgart, 70550 Stuttgart, Germany
Acceleration of Sparse Matrix-Vector Multiplication by Region Traversal

I. Šimeček

Abstract: Sparse matrix-vector multiplication (shortly SpM×V) is one of the most common subroutines in numerical linear algebra. The problem is that the memory access patterns during SpM×V are irregular, and utilization of the cache can suffer from low spatial or temporal locality. Approaches to improve the performance of SpM×V are based on matrix reordering and register blocking. These matrix transformations are designed to handle randomly occurring dense blocks in a sparse matrix, and their efficiency depends strongly on the presence of suitable blocks. The overhead of reorganizing a matrix from one format to another is often of the order of tens of executions of SpM×V. For this reason, such a reorganization pays off only if the same matrix A is multiplied by multiple different vectors, e.g., in iterative linear solvers. This paper introduces an unusual approach to accelerate SpM×V. This approach can be combined with other acceleration approaches and consists of three steps: 1) dividing matrix A into non-empty regions, 2) choosing an efficient way to traverse these regions (in other words, choosing an efficient ordering of the partial multiplications), 3) choosing the optimal type of storage for each region. All three steps are tightly coupled. The first step divides the whole matrix into smaller parts (regions) that can fit in the cache. The second step improves the locality during multiplication due to better utilization of distant references. The last step maximizes the machine computation performance of the partial multiplication for each region. In this paper, we describe aspects of these 3 steps in more detail (including fast and time-inexpensive algorithms for all steps). Our measurements prove that our approach gives a significant speedup for almost all matrices arising from various technical areas.

Keywords: cache hierarchy, sparse matrix-vector multiplication, region traversal.

Introduction

There are several formats for storing sparse matrices. They have been designed mainly for SpM×V. SpM×V with the most common format, the compressed sparse rows (shortly CSR) format (see Section 1.2.2), suffers from low performance due to indirect addressing. Several studies have been published on increasing the efficiency of SpM×V [1, 2, 3]. There are some formats, such as register blocking [4], that eliminate indirect addressing during SpM×V; vector instructions can then be used. These formats are suitable only for matrices with a known structure of nonzero elements. The processor cache memory parameters must also be taken into account. The overhead of matrix reorganization from one format to another is often of the order of tens of executions of SpM×V; such a reorganization pays off only if the same matrix A is multiplied by multiple different vectors, e.g., in iterative linear solvers. Many authors (such as [5]) have overlooked the overhead of matrix transformation and have designed time-expensive matrix storage optimizations. In contrast to them, we try to find a near-optimal matrix storage format that maximizes the performance of SpM×V with respect to the matrix transformation overhead and the cache parameters.

1 Terminology and notation

1.1 The cache model

The cache model that we consider here corresponds to the structure of the L1 and L2 caches in the Intel x86 architecture. An s-way set-associative cache consists of h sets, and one set consists of s independent blocks (called lines in Intel terminology). Let $C_s$ denote the size of the data part of the cache in bytes, and let $B_s$ denote the cache block size in bytes. Then $C_s = s \cdot B_s \cdot h$. Let $S_D$ denote the size of the type double and $S_I$ the size of the type integer, in bytes.

Fig. 1: Description of cache parameters.

We consider two types of cache misses: (1) compulsory misses (sometimes called intrinsic or cold misses), which occur if the required memory block is not in the cache because it is being accessed for the first time, and (2) thrashing misses (also called cross-interference, conflict, or capacity misses), which occur if the required memory block is not in the cache even though it was previously loaded, because it was replaced prematurely from the cache for capacity reasons.
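To make the cache-model notation concrete, here is a small sketch (ours, using the L1 parameters quoted later in Section 3: $C_s$ = 8 KB, s = 4, $B_s$ = 64 B) that recovers the number of sets h from $C_s = s \cdot B_s \cdot h$ and shows how an address maps to a set:

```python
S_D, S_I = 8, 4          # sizeof(double), sizeof(int) in bytes

def cache_sets(c_s, s, b_s):
    """Number of sets h of an s-way set-associative cache: C_s = s * B_s * h."""
    assert c_s % (s * b_s) == 0, "cache size must be a multiple of s * B_s"
    return c_s // (s * b_s)

# L1 data cache used in the measurements: 8 KB, 4-way, 64-byte lines
h = cache_sets(8 * 1024, 4, 64)
print("h =", h)                              # -> 32 sets, as quoted in Section 3

def set_index(addr, b_s, h):
    """Set to which a (physical) byte address maps in this cache model."""
    return (addr // b_s) % h

print("set of address 0x2A40:", set_index(0x2A40, 64, h))
```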
1.2 Common sparse matrix formats

In the following text, we assume that A is a real sparse matrix of order n, and that x, y are vectors of size n. We consider the operation of SpM×V, $A x = y$. Let $n_z$ denote the total number of nonzero elements in A.

1.2.1 Coordinate (XY) format

In this simplest sparse format, a matrix A is represented by 3 linear arrays A, xpos, and ypos. Array A stores the nonzero elements of A, and arrays xpos and ypos contain the x-positions and y-positions, respectively, of the nonzero elements. The storage complexity of the XY format is $n_z(S_D + 2S_I)$, so the XY format can be space-inefficient.

Locality during SpM×V in the XY format. During one SpM×V in the XY format, every element of
- arrays A, xpos, ypos is read only once;
- output array y is written repeatedly;
- input array x is read repeatedly.

In comparison to the CSR format (see Section 1.2.2), this causes too many compulsory misses for regular matrices.

1.2.2 Compressed sparse row (CSR) format

This is the most common format for storing sparse matrices. A matrix A stored in the CSR format is represented by 3 linear arrays A, adr, and ci. Array A stores the nonzero elements of A, array adr[1, …, n] contains the indices of the initial nonzero elements of the rows of A, and array ci contains the column indices of the nonzero elements of A. Hence, the first nonzero element of row j is stored at index adr[j] in array A. The storage complexity of the CSR format is $n_z(S_D + S_I) + n S_I$. Every regular matrix must contain at least one element in every row; hence $n \le n_z$. From the storage complexities of the two formats, we can conclude that for every regular matrix the CSR format is at least as space-efficient as the XY format.

Locality during SpM×V in the CSR format. During one SpM×V in the CSR format, every element of
- arrays A, ci, adr is read only once;
- output array y is written only once;
- input array x is read repeatedly.

Some reads (writes) can conflict with other reads (writes), and thrashing misses may occur. The CSR format can be inefficient, due to many thrashing misses, for some types of matrices:
- matrices with a large number of nonzero elements per row (relative to the cache size),
- matrices with a large bandwidth (relative to the cache size).

Fig. 2: An example of a sparse matrix A in a) dense format, b) coordinate (XY) format.

Fig. 3: An example of a sparse matrix A in a) dense format, b) compressed sparse row format.
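The two formats and the SpM×V kernel itself are compact enough to sketch directly. The following sketch (ours; it uses 0-based indexing and an adr array of length n+1, whereas the text above indexes adr from 1) builds the CSR arrays from a coordinate-format triple and performs $y = Ax$ with the usual indirect access through ci:

```python
import numpy as np

def xy_to_csr(n, a_vals, xpos, ypos):
    """Convert coordinate (XY) storage to CSR arrays (0-based, adr of length n+1)."""
    order = np.lexsort((ypos, xpos))            # sort nonzeros row by row
    a = np.asarray(a_vals, dtype=float)[order]
    ci = np.asarray(ypos, dtype=int)[order]     # column indices
    rows = np.asarray(xpos, dtype=int)[order]
    adr = np.zeros(n + 1, dtype=int)
    np.add.at(adr, rows + 1, 1)                 # count nonzeros per row
    adr = np.cumsum(adr)                        # row j occupies adr[j]..adr[j+1]-1
    return a, adr, ci

def spmv_csr(n, a, adr, ci, x):
    """y = A*x; the read x[ci[k]] is the indirect access that hurts locality."""
    y = np.zeros(n)
    for j in range(n):
        s = 0.0
        for k in range(adr[j], adr[j + 1]):
            s += a[k] * x[ci[k]]
        y[j] = s
    return y

# tiny example: a 3x3 matrix with 4 nonzeros, given in XY format
a, adr, ci = xy_to_csr(3, [4.0, 1.0, 2.0, 3.0], [2, 0, 0, 1], [2, 0, 2, 1])
print(spmv_csr(3, a, adr, ci, np.array([1.0, 1.0, 1.0])))   # -> [3. 3. 4.]
```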
2 Increasing locality in SpM×V

2.1 Usual approach

Sparse matrices often contain dense submatrices (called blocks), so various blocking formats have been designed to accelerate matrix operations (mainly SpM×V). Compared to the CSR format, the aim of these formats is to consume less memory and to allow better use of registers. The effect of indirect addressing is also reduced, and vector (SIMD) instructions can be used.

2.2 Our approach

In general, there is only one way to improve the locality in SpM×V: to make the multiplication more cache-sensitive. In a typical numerical algorithm, the cache hit ratio is high if the cache can hold the data accessed in the innermost loop of the algorithm. Memory accesses that do not satisfy this assumption are called distant references. In this paper, we propose and explore a new method for increasing the locality of SpM×V, consisting of 3 steps:

1) Divide matrix A into disjoint nonempty rectangular submatrices (regions).
2) Choose a suitable storage format for the regions.
3) Choose a good traversal of these regions during SpM×V.

2.2.1 Dividing the matrix into regions

We divide matrix A into disjoint nonempty rectangular regions so that the cache can hold all data accessed during a partial SpM×V within each region. The whole matrix A is represented by two data structures:

- R-aux: auxiliary data of the regions:
  - r-x, r-y: position of the beginning of the region,
  - r-s: number of nonzero elements inside the region,
  - r-addr: pointer into R-data.
- R-data: data inside the regions; it contains the values and locations of the nonzero elements.

In this paper, we assume for simplicity that the regions are square submatrices of the same side size r, containing at least one nonzero element. An optimal value of r is hard to predict, because it depends on the locations of the nonzero elements and on the cache sizes. The value of the region side size r must be chosen to address the following trade-off:
- greater regions lead to a smaller region storage overhead,
- smaller regions lead to higher cache locality, but the computational complexity of an optimal traversal algorithm (see later) grows dramatically with the number of regions.

In this paper, we leave this problem open and choose r = 240, so that the coordinates of the elements within a region can be represented in one byte. (A partitioning sketch is given below, after Section 2.2.2.)

2.2.2 Choosing a suitable storage format for the regions

The second important decision is to choose an efficient format for the R-data structure. Regions are viewed as submatrices; therefore, we can use common sparse matrix formats for storing the nonzero elements inside them. We can exploit the fact that an element lies inside a region and express all coordinates relative to the region. All coordinates can then be stored using smaller data types (char or byte) instead of long int. This leads to a compressed representation of the input matrix. We call these new formats R-based. In the following text, we will consider only the R-CSR and R-XY formats:
- R-XY, derived from the XY storage format,
- R-CSR, derived from the CSR storage format.

Each region is stored either in the R-XY or in the R-CSR format, so as to minimize compulsory misses (see Section 1.2.2). Let ñ be the number of nonzero elements within a region. If ñ ≥ r, then the R-CSR format is chosen; otherwise the R-XY format is chosen. One possible drawback of these formats is that on some architectures (including Intel x86 and compatible CPUs), data memory transfers with smaller data types are very slow.
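A minimal sketch (ours) of the partitioning and the per-region format choice described in Sections 2.2.1-2.2.2, assuming the ñ ≥ r rule as reconstructed above; coordinates inside a region are reduced modulo r so that they fit in one byte:

```python
from collections import defaultdict

R = 240   # region side size: local coordinates 0..239 fit in one byte

def build_regions(coo):
    """coo: list of (row, col, value). Returns {(rx, ry): region dict}."""
    buckets = defaultdict(list)
    for i, j, v in coo:
        buckets[(i // R, j // R)].append((i % R, j % R, v))   # local coordinates
    regions = {}
    for (rx, ry), elems in buckets.items():
        n_loc = len(elems)                       # the "n tilde" of the text
        regions[(rx, ry)] = {
            "r-x": rx * R, "r-y": ry * R,        # region origin in the matrix
            "r-s": n_loc,
            "format": "R-CSR" if n_loc >= R else "R-XY",
            "elems": elems,                      # stands in for R-data
        }
    return regions

coo = [(0, 1, 1.0), (5, 700, 2.0), (250, 260, 3.0), (251, 262, 4.0)]
for key, reg in sorted(build_regions(coo).items()):
    print(key, reg["format"], "nnz =", reg["r-s"])
```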
2.2.3 Choosing a good traversal of the regions

For the performance of a program, a high cache hit ratio (high locality) is crucial, but better locality in SpM×V is only one step toward higher performance. Choosing a good traversal is equivalent to finding a good ordering of the regions (of the partial multiplications). We can simply say that a traversal is good if it leads to a fast SpM×V; we now discuss this obvious criterion more formally, with an in-depth analysis. Minimizing the number of cache misses (denoted by L) is equivalent to the problem of optimal indexing of the regions (denoted by V). This is an NP-complete problem, so finding an optimal solution is hard. We try to keep our approach efficient, so we have considered only a number of standard region traversals:
- row-wise traversal (see Fig. 4a)),
- snake-wise row traversal (see Fig. 4b)),
- column-wise traversal (see Fig. 4c)),
- snake-wise column traversal (see Fig. 4d)),
- diagonal traversal (see Fig. 4e)),
- reversal diagonal traversal (see Fig. 4f)),

and two recursive region traversals:
- Z-Morton traversal (see Fig. 5a); a bit-interleaving sketch follows the figure captions below),
- C-Morton traversal (see Fig. 5b)).

A very important question is how to measure the performance of SpM×V, and the effect of the various traversal strategies. We have used 3 approaches:

1. Real execution time measurements.
   - The results are platform dependent.
   - They can be influenced by other OS processes.
   - All HW and SW aspects are taken into account.
2. Estimation of the number of cache misses using a SW cache analyzer (for example [6], which uses a cache model as described in Section 1.1).
   - It cannot be influenced by other OS processes.
   - This solution takes only the cache memory effect into account.
3. Estimation of the number of cache misses by simulation algorithms (see Section 2.3.1).
   - This is a fast way.
   - The simulation algorithms are slightly inexact, due to their initial assumptions.
   - This solution takes only the cache memory effect into account.

Fig. 4: The most "typical" region traversals.

Fig. 5: Recursive region traversals.
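For concreteness, here is what the recursive Z-Morton ordering of Fig. 5a amounts to in code. The sketch is ours, not from the paper; sorting regions by the interleaved-bit key visits each quadrant of the region grid completely before moving on to the next one:

```python
def z_morton_key(rx, ry, bits=16):
    """Interleave the bits of the region coordinates (rx, ry).

    Sorting regions by this key yields the recursive Z-Morton order:
    each quadrant of the region grid is finished before the next begins.
    """
    key = 0
    for b in range(bits):
        key |= ((rx >> b) & 1) << (2*b + 1) | ((ry >> b) & 1) << (2*b)
    return key

# order the regions of a 4x4 region grid
cells = [(i, j) for i in range(4) for j in range(4)]
for rank, (i, j) in enumerate(sorted(cells, key=lambda c: z_morton_key(*c))):
    print(f"{rank:2d} -> region ({i},{j})")
```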
A cache behavior simulation algorithm

We have designed a cache behavior simulation algorithm (CBSA) that predicts the number of cache misses during an execution of SpM×V. The number of cache misses depends on the cache-block replacement strategy; we assume the LRU replacement strategy as the most common case. For other cache-block replacement schemes, similar simulation algorithms can be derived. CBSA is an even more abstract model than CS [6]. It uses a queue Q for modeling the cache memory. For a chosen traversal ordering π and a chosen cache line replacement strategy, it allows us to estimate the number of cache misses during the execution of SpM×V. CBSA assumes a fully associative cache memory, omits cache spatial locality, and counts only the load operations on parts of the arrays x and y, because only these arrays are reused.

Definition of region traversing

- Let G = (V, E), |V| = m, be a subgraph of a 2D mesh (n×n) such that every vertex $v_i \in V$ has coordinates $X(v_i)$, $Y(v_i)$ and a value $l(v_i)$. (Each vertex represents a region of the matrix containing at least one nonzero element; the value $l(v_i)$ is the number of nonzero elements inside this region.)
- π is a permutation of [1…m] defining a traversal of G, so that the i-th traversed region is $v_{\pi(i)}$.
- Let L(π) be the number of cache misses if G is traversed according to π.
- The optimal traversal is a permutation π with minimal L(π).

The algorithm CBSA for the LRU replacement strategy:

(1)  L := 0;
(2)  for i := 1 to m do
(3)      u := v_π(i);
(4)      x_temp := ['x', X(u)];
(5)      y_temp := ['y', Y(u)];
(6)      if x_temp ∉ Q then L := L + l(u);
(7)      else remove x_temp from Q;
(8)      store x_temp into Q;
(9)      if y_temp ∉ Q then L := L + min(r, l(u));
(10)     else remove y_temp from Q;
(11)     store y_temp into Q;
(12)     while size of Q > C_s do dequeue(Q);

The result is stored in the variable L.

Comments on the algorithm:
- Two operations on the queue appear:
  - remove x from Q: removes element x from the queue Q irrespective of its position;
  - dequeue(Q): removes the element at the front of the queue Q.
- In code lines (4) and (5), the additional flags 'x' and 'y' serve to distinguish memory operations on arrays x and y.
- In code line (9), cache misses are simulated: a part of array x or y must be loaded, because it is new or because it was replaced prematurely from the cache for capacity reasons. The number of cache misses is equal to r (for the R-CSR format) or l(u) (for the R-XY format).
- In code lines (7) and (10), refreshments of the cache lines are described.
- In code line (12), the queue size is decreased to the cache size; elements are freed in the order given by the LRU strategy.

3 Evaluation of the results

All results were measured on a Pentium Celeron 2.4 GHz with 512 MB @ 266 MHz, running OS Windows XP, with the following cache parameters: the L1 cache is a data cache with $B_s$ = 64, $C_s$ = 8 KB, s = 4, h = 32, and LRU replacement strategy; the L2 cache is unified, with $B_s$ = 64, $C_s$ = 128 KB, s = 4, h = 1024, and LRU strategy. Also, $S_D$ = 8 B and $S_I$ = 4 B.

SW: Microsoft Visual C++ 6.0 Enterprise Edition and the Intel compiler version 7.1, with the switches -O3 -Ow -Qpc64 -G6 -QxK -Qipo -Qsfalign16 -Zp16. All cache events were monitored by the Intel VTune Performance Analyzer 7.0 and verified by the cache analyzer [6].

Fig. 6: Example of the CBS algorithm. A dark box means 2 misses, a light box means 1 miss, a white box means no miss. Traversing the matrix in row-wise traversal causes 30 misses.

Fig. 7: Example of the CBS algorithm. A dark box means 2 misses, a light box means 1 miss, a white box means no miss. Traversing the matrix in snake-like row-wise traversal.

Fig. 8: Example of the CBS algorithm. A dark box means 2 misses, a light box means 1 miss, a white box means no miss. Traversing the matrix in area-filling traversal causes 14 misses.
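For readers who prefer executable code, here is a direct Python transcription of the CBSA pseudocode above. The sketch is ours; as a simplification it counts the queue capacity in entries rather than bytes, which is in the spirit of the "initial assumptions" that make such simulations slightly inexact:

```python
from collections import OrderedDict

def cbsa_lru(traversal, l, r, capacity):
    """Cache behavior simulation algorithm (CBSA) for the LRU strategy.

    traversal : region coordinates (X(u), Y(u)) in the order v_pi(1..m)
    l         : dict mapping (X(u), Y(u)) -> number of nonzeros l(u)
    r         : region side size; min(r, l(u)) equals r for R-CSR regions
                (l(u) >= r) and l(u) for R-XY regions (l(u) < r)
    capacity  : maximum number of entries kept in the LRU queue Q
    """
    q = OrderedDict()                        # models the cache in LRU order
    misses = 0                               # the variable L of the pseudocode
    for xu, yu in traversal:
        x_tmp, y_tmp = ("x", xu), ("y", yu)  # lines (4)-(5): tagged stripes
        if x_tmp not in q:                   # line (6): miss on array x
            misses += l[(xu, yu)]
        else:                                # line (7): refresh
            q.pop(x_tmp)
        q[x_tmp] = True                      # line (8)
        if y_tmp not in q:                   # line (9): miss on array y
            misses += min(r, l[(xu, yu)])
        else:                                # line (10)
            q.pop(y_tmp)
        q[y_tmp] = True                      # line (11)
        while len(q) > capacity:             # line (12): shrink to cache size
            q.popitem(last=False)            # dequeue(Q): evict the LRU entry
    return misses

# toy example: two regions in the same region-row reuse the same y stripe
l = {(0, 0): 5, (1, 0): 3}
print(cbsa_lru([(0, 0), (1, 0)], l, r=240, capacity=4))   # -> 5 + 5 + 3 = 13
```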
3.1 Test data

We have 3 sources of test data:
- We wrote a gen program that generates symmetric, positive definite matrices produced by the discretization of two elliptic partial differential equations with the Dirichlet boundary condition on rectangular grids. For testing purposes, we used 4 matrices:
  1. matrix A is a random banded sparse matrix with n = 2×10⁴ and n_z = 8.7×10⁴;
  2. matrix B is filled by randomly located short (l < 10) diagonal blocks, with n = 2×10⁴ and n_z = 3×10⁵;
  3. matrix C is given by discretization on a 100×100 rectangular grid, with n = 2×10⁴ and n_z = 2.6×10⁵;
  4. matrix D is given by discretization on a 10×1000 rectangular grid, with n = 2×10⁴ and n_z = 2.4×10⁵.
- We used 41 real matrices from various technical areas from MatrixMarket and the Harwell sparse matrix test collection.
- We wrote a syn program that generates random sparse matrices with the parameters 1000 ≤ n ≤ 10000 and 1000 ≤ n_z ≤ 500000.

3.2 Performance metrics

3.2.1 Compression rate

We define the compression rate as the ratio between the space complexity of the R-based format and the space complexity of the CSR format. The decision whether a given region should be stored in the R-XY format or in the R-CSR format can be made statically (see Section 2.2.2). From Fig. 9 we can conclude that the average compression rate is about 74 %. The only exception is matrices with an average number of nonzero elements per row close to 1; for such matrices the compression rate varies from 63 % (for small matrices) to 74 % (for large matrices). This variability of the compression rate is caused by the overhead of the array adr in the CSR format. (A storage-size sketch is given after the figure captions below.)

3.2.2 Accuracy of CBSA

We define the accuracy of CBSA as the ratio between the number of cache misses predicted by CBSA and the number of cache misses measured by the SW cache analyzer. Fig. 10 illustrates that the average accuracy of CBSA is 93 % for the L1 cache and 95 % for the L2 cache.

3.2.3 Performance

We define performance as the ratio between the number of FPU operations during SpM×V and the execution time of SpM×V. If the cache is flushed before the measurement and the multiplication is done only once, the CSR format and the R-based format achieve almost the same performance. This is due to the fact that during one multiplication the impact of cache re-use is negligible. Figs. 11 and 12 illustrate the performance of SpM×V with the CSR format and the R-based format, respectively; in this case the multiplication is done repeatedly (1000 times). There is a performance gap between small matrices that fit in the cache and large matrices that do not.

Fig. 9: The compression rate.

Fig. 10: Accuracy of CBSA.

Fig. 11: Performance of SpM×V with the CSR format.

Fig. 12: Performance of SpM×V with the R-based format.
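Returning to the compression rate of Section 3.2.1, the ratio can be estimated from the storage complexities alone. The sketch below (ours) compares the CSR size $n_z(S_D+S_I)+nS_I$ with an R-based size in which each local coordinate costs one byte; the per-region auxiliary fields (r-x, r-y, r-s, r-addr) and the R-CSR row-start array are counted with assumed 4-byte entries, which is our assumption, not a figure from the paper:

```python
S_D, S_I, S_BYTE = 8, 4, 1       # sizes of double, int, and a one-byte coordinate
R = 240                          # region side size
AUX = 4 * 4                      # assumed: r-x, r-y, r-s, r-addr at 4 B each

def csr_bytes(n, nz):
    return nz * (S_D + S_I) + n * S_I

def rbased_bytes(region_sizes):
    """region_sizes: number of nonzeros of every nonempty region."""
    total = len(region_sizes) * AUX
    for nn in region_sizes:
        if nn >= R:              # R-CSR: values + 1-byte columns + row starts
            total += nn * (S_D + S_BYTE) + R * S_I   # assumed int row starts
        else:                    # R-XY: values + two 1-byte local coordinates
            total += nn * (S_D + 2 * S_BYTE)
    return total

# illustrative input: n = 20000, nz = 260000 spread over 1300 regions
n, nz = 20_000, 260_000
regions = [200] * 1300
print("compression rate = %.2f" % (rbased_bytes(regions) / csr_bytes(n, nz)))
```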
Fig. 13: The speedup measured by real execution time measurements.

Fig. 14: The speedup estimated by the SW cache analyzer.

Fig. 15: The speedup estimated by CBSA.

Fig. 16: Payoff iterations.

3.2.4 Speedup

We define speedup as the ratio between the time of SpM×V with the CSR format and the time of SpM×V with the R-based format. Figs. 13, 14 and 15 show the speedups of SpM×V on real matrices.
- A slowdown of about 20 % was obtained only for small matrices that fit in the cache. This expected result follows from the fact that for these matrices the R-based format does not improve cache utilization, and the matrix transformation destroys the naturally occurring locality.
- On the other hand, a speedup of about 20 % was achieved for all large matrices (those that do not fit in the cache). This speedup is due to better cache utilization, for two main reasons:
  - a lower number of compulsory misses, due to the compressed representation of the matrix,
  - a lower number of thrashing misses, due to the region traversal.

The big peaks in the plotted data occur for matrices "on the edge": these are large enough not to fit in the cache in the CSR format, but small enough to fit in the cache in the R-based format.

3.2.5 Payoff iterations

The transformation into another format always carries some overhead, so a useful question is whether the transformation pays off in real cases. To answer this question, we define the parameter payoff as the ratio between the matrix transformation time and the difference between the time of SpM×V with the CSR format and the time of SpM×V with the R-based format. This denotes the number of executions of SpM×V needed to amortize the overhead of the matrix format transformation (a small computational sketch follows at the end of this section). From Fig. 16 we can conclude that the payoff is approximately 40 iterations for large matrices that do not fit into the cache.

3.2.6 Impact of the traversal on the performance

Another important question is whether, and in what way, the region traversal has a significant effect on the performance of SpM×V. Fig. 17 illustrates the importance of choosing a good traversal for the various strategies of Fig. 4: the difference in SpM×V performance was up to 100 % for small matrices and about 20 % for larger matrices. We can conclude that the choice of traversal has a significant effect on the performance.

Fig. 17: The difference between the worst and best region traversal.
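The payoff metric of Section 3.2.5 is a one-line computation once the three times are known. A small sketch (ours; the timing values are made-up placeholders chosen to land on the ~40 iterations quoted above):

```python
def payoff_iterations(t_transform, t_csr, t_rbased):
    """Number of SpM×V executions needed to amortize the format transformation."""
    gain = t_csr - t_rbased            # time saved per multiplication
    if gain <= 0.0:
        return float("inf")            # the transformation never pays off
    return t_transform / gain

# hypothetical timings (seconds) for one large matrix
print(payoff_iterations(t_transform=0.80, t_csr=0.10, t_rbased=0.08))  # -> 40.0
```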
4 Conclusions

In this paper, we have presented an unusual approach to accelerating sparse matrix-vector multiplication. This approach consists of 3 steps. First, we divide matrix A into nonempty disjoint square regions. Second, we choose a suitable storage format for each region; we introduce the new sparse-matrix R-based formats (R-XY and R-CSR) for additional acceleration of SpM×V. These new formats cause far fewer compulsory misses. Third, we present a methodology for finding a good region traversal. We compare the measured and predicted values. For all types of sufficiently large matrices arising from various technical disciplines, our approach gives a significant speedup. The matrix transformation is relatively fast and straightforward, and our measurements prove that it pays off after a relatively small number of SpM×V executions.

Acknowledgments

This research has been supported by the Czech Ministry of Education, Youth and Sport under research program MSM6840770014.

References

[1] Im, E.: Optimizing the Performance of Sparse Matrix-Vector Multiplication. Dissertation thesis, University of California at Berkeley, 2001.
[2] Mellor-Crummey, J., Garvin, J.: Optimizing sparse matrix vector product computations using unroll and jam. International Journal of High Performance Computing Applications, Vol. 18 (2004), No. 2, p. 225–236.
[3] Vuduc, R., Demmel, J. W., Yelick, K. A., Kamil, S., Nishtala, R., Lee, B.: Performance optimizations and bounds for sparse matrix-vector multiply. In Proceedings of Supercomputing 2002, Baltimore, MD, USA, November 2002.
[4] Tvrdík, P., Šimeček, I.: A new diagonal blocking format and model of cache behavior for sparse matrices. In Parallel Processing and Applied Mathematics, Vol. 3911 of Lecture Notes in Computer Science, p. 164–171, Poznan, Poland, 2006, Springer.
[5] White, J., Sadayappan, P.: On improving the performance of sparse matrix-vector multiplication. In Proceedings of the 4th International Conference on High Performance Computing (HiPC '97), p. 578–587. IEEE Computer Society, 1997.
[6] Tvrdík, P., Šimeček, I.: Software cache analyzer. In Proceedings of CTU Workshop, Vol. 9, p. 180–181, Prague, Czech Republic, Mar. 2005.

Ing. Ivan Šimeček
Phone: +420 224 357 268
E-mail: xsimecek@fel.cvut.cz
Department of Computer Science, Faculty of Electrical Engineering,
Czech Technical University in Prague, Technická 2, 166 27 Prague 6, Czech Republic
Rectifiable PT-Symmetric Quantum Toboggans with Two Branch Points

M. Znojil

Abstract: Certain complex-contour (a.k.a. quantum-toboggan) generalizations of Schrödinger's bound-state problem are reviewed and studied in detail. Our key message is that the practical numerical solution of these atypical eigenvalue problems may be perceivably facilitated via an appropriate complex change of variables which maps their multi-sheeted complex domain of definition onto a suitable single-sheeted complex plane.

Keywords: quantum bound-state models, wave functions with branch points, complex-contour coordinates, PT-symmetry, tobogganic Hamiltonians, winding descriptors, single-sheet maps, Sturm-Schrödinger equations.

1 Introduction

The one-dimensional Schrödinger equation for bound states,

$$-\frac{\hbar^2}{2m}\,\frac{\mathrm{d}^2}{\mathrm{d}x^2}\,\psi_n(x) + V(x)\,\psi_n(x) = E_n\,\psi_n(x), \qquad \psi_n(x) \in L^2(\mathbb{R}), \tag{1}$$

is one of the most friendly phenomenological models in quantum mechanics [1]. For virtually all of the reasonable phenomenological confining potentials V(x), the numerical treatment of this eigenvalue problem remains entirely routine.
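As a baseline for the complex-contour generalizations that follow, the routine character of eq. (1) on the real line can be seen in a few lines. The sketch below (ours) diagonalizes the finite-difference Hamiltonian of the harmonic oscillator in units ħ = 2m = 1, where the exact eigenvalues are E_n = 2n + 1:

```python
import numpy as np

# finite-difference solution of  -psi'' + V(x) psi = E psi   (hbar = 2m = 1)
L, n_pts = 10.0, 2000
x = np.linspace(-L, L, n_pts)
h = x[1] - x[0]

V = x**2                                     # harmonic-oscillator potential
main = 2.0/h**2 + V                          # diagonal of -d^2/dx^2 + V
off = -1.0/h**2 * np.ones(n_pts - 1)         # off-diagonals of -d^2/dx^2

E = np.linalg.eigvalsh(np.diag(main) + np.diag(off, 1) + np.diag(off, -1))
print(E[:5])                                 # -> approximately [1, 3, 5, 7, 9]
```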
During certain recent numerical experiments [2] it became clear that many standard (e.g., Runge-Kutta [3]) computational methods may still encounter new challenges when one follows the advice of Bender and Turbiner [4], of Buslaev and Grecchi [5], of Bender et al. [6], or of Znojil [7], and replaces the most common real line of coordinates x ∈ ℝ in the ordinary differential equation (1) by some less trivial complex contour of x ∈ C(s), which may be conveniently parametrized, whenever necessary, by a suitable real pseudocoordinate s ∈ ℝ:

$$-\frac{\hbar^2}{2m}\,\frac{\mathrm{d}^2}{\mathrm{d}x^2}\,\psi_n(x) + V(x)\,\psi_n(x) = E_n\,\psi_n(x), \qquad \psi_n(x) \in L^2(C). \tag{2}$$

Temporarily, the scepticism was suppressed by Weideman [8], who showed that many standard numerical algorithms may be reconfirmed to yield reliable results even for many specific analytic samples of complex interactions V(x) giving real spectra via eq. (2). Unfortunately, the scepticism re-emerged when we proposed, in Ref. [7], the study of so-called quantum toboggans, characterized by the relaxation of the most common tacit assumption that the above-mentioned integration contours C(s) must always lie within a single complex plane R₀ equipped with suitable cuts. Subsequently, the emergence of certain numerical difficulties accompanying the evaluation of the spectra of quantum toboggans was reported by Bíla [9] and by Wessels [10]. Their empirical detection of instabilities in their numerical results may be recognized as one of the key motivations of our present considerations.

2 Illustrative tobogganic Schrödinger equations

2.1 Assumptions

Whenever the complex integration contour C(s) used in eq. (2) becomes topologically nontrivial (cf. Figures 1-4 for an illustration), it may be interpreted as connecting several sheets of the Riemann surface R^(multisheeted) supporting the general solution ψ^(general)(x) of the underlying complex ordinary differential equation. It is well known that these solutions ψ^(general)(x) are non-unique (i.e., two-parametric; cf. [9]). From the point of view of physics this means that they may be restricted by suitable (typically, asymptotic [4, 5]) boundary conditions (cf. also Ref. [7]). In what follows we shall assume that

(a1) these general solutions ψ^(general)(x) live on unbounded contours called "tobogganic", with the name coined and the details explained in Ref. [7];

(a2) our particular choice of the tobogganic contours C(s) = C^(tobogganic)(s) ∈ R^(multisheeted) will be specified by a certain multiindex ϱ, so that C^(tobogganic)(s) ≡ C^(ϱ)(s);

(a3) for the sake of brevity our attention may be restricted to tobogganic models where the multiindices ϱ are nontrivial but still not too complicated.

Fig. 1: The central segment of the typical PT-symmetric double-circle tobogganic curve of x ∈ C^(LR)(s) with winding parameter κ = 3 in eq. (10). This curve is obtained as the image of the straight line of z ∈ C^(0)(s) at ε = 0.250.

Fig. 2: An alternative version of the double-circle curve of Figure 1, obtained at the "almost maximal" ε = ε^(critical) − 0.0005 (note that ε^(critical) ≈ 0.34062502).

Fig. 3: The extreme version of the double-circle curve C^(LR)(s) at ε slightly below ε^(critical).

Fig. 4: The change of topology at ε slightly above ε^(critical), when eq. (10) starts producing the single-circle tobogganic curves C^(RL)(s) at κ = 3.
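Before any winding is introduced, the simplest complex contour is the straight line $C^{(0)}(s) \equiv s - \mathrm{i}\varepsilon$ of eq. (6) below; since dx/ds = 1 along it, eq. (2) becomes an ordinary differential equation in the real pseudocoordinate s with a complex coordinate inserted into the potential. A minimal sketch of ours (in units ħ = 2m = 1, using the imaginary cubic oscillator V(x) = ix³ as a standard PT-symmetric example, cf. [6]) integrates the wave function along such a line with a fixed trial energy; scanning E for minima of the endpoint value is a crude shooting method, and it is precisely this kind of procedure that becomes fragile for genuinely tobogganic contours:

```python
import numpy as np

def integrate_along_line(E, eps, s_max=6.0, n=12000):
    """Integrate psi'' = (V(x) - E) psi along x(s) = s - i*eps (RK4 in s)."""
    V = lambda x: 1j * x**3                  # PT-symmetric cubic example
    def rhs(s, u):                           # u = (psi, psi')
        x = s - 1j * eps
        return np.array([u[1], (V(x) - E) * u[0]])
    h = 2.0 * s_max / n
    s = -s_max
    u = np.array([1e-12 + 0j, 1e-12 + 0j])   # tiny "decaying" start values
    for _ in range(n):
        k1 = rhs(s, u)
        k2 = rhs(s + h/2, u + h/2 * k1)
        k3 = rhs(s + h/2, u + h/2 * k2)
        k4 = rhs(s + h, u + h * k3)
        u = u + h/6 * (k1 + 2*k2 + 2*k3 + k4)
        s += h
    return u[0]                              # psi at the right end of the line

# |psi(s_max)| explodes unless E lies close to an eigenvalue
print(abs(integrate_along_line(E=1.156, eps=0.25)))
```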
For this reason we shall study just the subclass of tobogganic models

$$-\frac{\hbar^2}{2m}\,\frac{\mathrm{d}^2}{\mathrm{d}x^2}\,\psi_n(x) + V^{(2)}_{(j)}(x)\,\psi_n(x) = E_n\,\psi_n(x), \qquad \psi_n(x) \in L^2\!\left(C^{(\varrho)}\right), \tag{3}$$

containing, typically, the potentials

$$V^{(2)}_{(1)}(x) = V^{(\mathrm{HO})}(x) = x^2 + \left[\frac{f}{(x-1)^2} + \frac{f}{(x+1)^2}\right], \qquad 0 < f \ll 1, \tag{4}$$

or

$$V^{(2)}_{(2)}(x) = V^{(\mathrm{ICO})}(x) = \mathrm{i}x^3 + \left[\frac{g}{(x-1)^2} + \frac{g}{(x+1)^2}\right], \qquad 0 < g \ll 1, \tag{5}$$

with two strong singularities inducing branch points in the wave functions. In this manner we shall have to deal with the two branch points $x^{(\mathrm{BP})}_{(\pm)} = \pm 1$ in ψ^(general)(x). In the language of mathematics, the obvious topological structure of the corresponding multi-sheeted Riemann surface R^(multisheeted) will be "punctured" at $x^{(\mathrm{BP})}_{(\pm)} = \pm 1$. In the vicinity of these two "spikes" we shall assume the generic, "logarithmic" [11] structure of R^(multisheeted).

2.2 Winding descriptors ϱ

The multiindex ϱ will be called a "winding descriptor" in what follows. It will be used here in the form introduced in Ref. [12], where each curve C^(ϱ)(s) is assumed to move from its "left asymptotics" (where s ≪ −1) to a point which lies below one of the branch points $x^{(\mathrm{BP})}_{(\pm)} = \pm 1$. During the further increase of s one simply selects one of the following four alternative possibilities:

- one moves counterclockwise around the left branch point $x^{(\mathrm{BP})}_{(-)}$ (this move is represented by the letter L in the "word" ϱ);
- one moves counterclockwise around the right branch point $x^{(\mathrm{BP})}_{(+)}$ (represented by the letter R);
- one moves clockwise around the left branch point $x^{(\mathrm{BP})}_{(-)}$ (represented by the letter Q, or by the symbol L⁻¹ ≡ Q);
- one moves clockwise around the right branch point $x^{(\mathrm{BP})}_{(+)}$ (represented by the letter P, or by the symbol R⁻¹ ≡ P).

In this manner we may compose the moves and characterize each contour by a word ϱ composed of a sequence of letters of the four-letter alphabet R, L, Q and P. Once we add the requirement of PT-symmetry (i.e., of the left-right symmetry of the contours), we arrive at a sequence of eligible words ϱ of even length 2N. At N = 0 we may assign the empty symbol ϱ = ∅ (or ϱ = 0) to the one-parametric family of the straight lines of Ref. [5],

$$C^{(0)}(s) \equiv s - \mathrm{i}\varepsilon, \qquad \varepsilon > 0. \tag{6}$$

Thus, one encounters precisely four possible arrangements of the descriptor,

$$\varrho \in \left\{\, LR,\; L^{-1}R^{-1},\; RL,\; R^{-1}L^{-1} \,\right\}, \qquad N = 1, \tag{7}$$

in the first nontrivial case. In the more complicated cases, where N > 1, it makes sense to re-express the requirement of PT-symmetry in the form of the string decomposition ϱ = ω ∪ ω^T, where the superscript T marks an ad hoc transposition, i.e., the reverse reading accompanied by the L ↔ R interchange of symbols. Thus, besides the illustrative eq. (7), we may immediately complement the first nontrivial list ω ∈ {L, L⁻¹, R, R⁻¹}, N = 1, by its N = 2 descendant

$$\omega \in \left\{\, LL,\; LR,\; RL,\; RR,\; L^{-1}R,\; R^{-1}L,\; LR^{-1},\; RL^{-1},\; L^{-1}L^{-1},\; L^{-1}R^{-1},\; R^{-1}L^{-1},\; R^{-1}R^{-1} \,\right\}, \tag{8}$$

etc. The four "missing" words LL⁻¹, L⁻¹L, RR⁻¹ and R⁻¹R had to be omitted as trivial here, because the corresponding windings cancel each other [12].

3 Rectifications

3.1 Formula

The core of our present message lies in the idea that the non-tobogganic straight lines (6) may be mapped onto their specific (called "rectifiable") tobogganic descendants. For this purpose one may use the following closed-form recipe of Ref. [12]:

$$\mathcal{M}: \left( z \in C^{(0)}(s) \right) \;\to\; \left( x \in C^{(\varrho)}(s) \right), \tag{9}$$

where one defines

$$x = -\mathrm{i}\,\sqrt{(1 - z^2)^{\kappa} - 1}\,. \tag{10}$$

This formula guarantees the PT-symmetry of the resulting contour as well as the stability of the position of our pair of branch points.
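A small sketch (ours) of the map (10). The only subtlety is the branch of the square root: the principal branch alone would jump across its cut, so the sign of the root is chosen to keep the curve continuous in s. With κ = 3 and ε = 0.250 this reproduces, up to the overall sign of the root, the double-circle curve described in the caption of Figure 1:

```python
import numpy as np

def tobogganic_contour(kappa, eps, s_max=3.0, n=4001):
    """Image x(s) of the line z = s - i*eps under eq. (10), branch-tracked."""
    s = np.linspace(-s_max, s_max, n)
    z = s - 1j * eps
    w = (1.0 - z**2)**kappa - 1.0
    x = -1j * np.sqrt(w)                 # principal branch: may jump...
    for k in range(1, n):                # ...so flip the sign to stay continuous
        if abs(x[k] - x[k-1]) > abs(-x[k] - x[k-1]):
            x[k:] = -x[k:]
    return x

x = tobogganic_contour(kappa=3, eps=0.250)
# the two protected branch points of eq. (10) lie at x = +-1:
print("closest approach to +1:", np.min(np.abs(x - 1.0)))
print("closest approach to -1:", np.min(np.abs(x + 1.0)))
# plotting x.real against x.imag displays the double-circle knot of Fig. 1
```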
Another consequence of this choice is that the negative imaginary axis of z = −i|z| is mapped onto itself. Some purely numerical features of the mapping (10) may also be checked via the freely available software of Ref. [13]. On this empirical basis we shall require the exponent κ to be chosen as an odd positive integer, κ = 2M + 1, M = 1, 2, …. In this case the asymptotics of the resulting nontrivial tobogganic contours (with M ≠ 0) will still parallel the κ = 1 real line C^(0)(s) in the leading-order approximation.

3.2 Sequences of critical points

An inspection of Figures 2 and 3, and a comparison with Figures 4 and 5, reveals that one should expect the emergence of sudden changes of the winding descriptors ϱ during a smooth variation of the shift ε > 0 of the initial straight line of z introduced via eq. (6). Formally we may set ϱ = ϱ(ε) and mark the set of the corresponding points of change of ϱ(ε) by the sub- and superscript in $\varepsilon^{(\mathrm{critical})}_j$. A quantitative analysis of these critical points is not difficult, since it is perceptibly simplified by the graphical insight gained via Figures 2-4 and via their appropriately selected more complicated descendants. Trial-and-error constructions enable us to formulate (and, subsequently, to prove) the very useful hypothesis that the transition between different descriptors ϱ(ε) always proceeds via the same mechanism. Its essence is the confluence and "flip" of the curve, for any j = 1, 2, …, M, at $\varepsilon = \varepsilon^{(\mathrm{critical})}_j$. At this point two specific branches of the curve C^(ϱ)(s) touch and reconnect, in the manner sampled by the transition from Figure 2 to Figure 4. The key characteristic of this flip is that it takes place at the origin, so that we can determine the point $x^{(\mathrm{critical})}_j = 0$, which carries an obvious geometric meaning mediated by the complex mapping (10). Thus, the vanishing $x^{(\mathrm{critical})}_j = 0$ is to be perceived as the image of some doublet of $z = z^{(\mathrm{critical})}_j$ or, due to the left-right symmetry of the picture, as the image of a symmetric pair of pseudocoordinates $s^{(\mathrm{critical})}_j = \pm\left|s^{(\mathrm{critical})}_j\right|$.

At any κ = 2M + 1 the latter observations, combined with eqs. (6) and (10), reduce the condition x = 0 to the elementary relation

$$1 = \left\{\, 1 + \left[\mathrm{i}(s - \mathrm{i}\varepsilon)\right]^2 \,\right\}^{\kappa}, \tag{11}$$

which may be analyzed in the equivalent form of the following 2M + 1 independent relations,

$$e^{2\pi \mathrm{i} m/(2M+1)} = 1 + (\mathrm{i}s + \varepsilon)^2 = 1 + \varepsilon^2 - s^2 + 2\mathrm{i}s\varepsilon. \tag{12}$$

These relations, numbered by m = 0, ±1, …, ±M, may be further simplified via the two elementary trigonometric, real and non-negative constants a and b such that

$$1 - e^{2\pi \mathrm{i} m/(2M+1)} = a \pm \mathrm{i}b.$$

In terms of these constants we separate eq. (12) into its real and imaginary parts, yielding the pair of relations

$$s^2 - \varepsilon^2 - a = 0, \qquad 2s\varepsilon = b. \tag{13}$$

As long as ε > 0, we may restrict our attention to non-negative s and eliminate s = b/(2ε). The remaining quadratic equation $b^2/(2\varepsilon)^2 - \varepsilon^2 - a = 0$ finally leads to the following unique solution of the problem:

$$\varepsilon = \frac{1}{\sqrt{2}}\,\sqrt{-a + \sqrt{a^2 + b^2}}\,. \tag{14}$$

This formula perfectly confirms the validity and precision of our illustrative graphical constructions.
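Equation (14) is easy to check against the critical values quoted in the figure captions. The sketch below (ours) evaluates a and b from $1 - e^{2\pi \mathrm{i} m/(2M+1)}$ and recovers ε^(critical) ≈ 0.34062502 for κ = 3 (M = m = 1), as well as the value ε^(critical)₁ ≈ 0.21574990 used in Figures 6 and 7 for κ = 5, where the index m = 2 gives the smallest root:

```python
import numpy as np

def eps_critical(M, m):
    """Critical shift from eq. (14) for kappa = 2M + 1 and root index m."""
    c = 1.0 - np.exp(2j * np.pi * m / (2*M + 1))   # = a +- i b
    a, b = c.real, abs(c.imag)
    return np.sqrt(-a + np.sqrt(a*a + b*b)) / np.sqrt(2.0)

print("kappa = 3:", eps_critical(M=1, m=1))        # -> 0.34062502...
print("kappa = 5:", sorted(eps_critical(M=2, m=m) for m in (1, 2)))
# -> [0.21574990..., 0.49223...]; the smaller value is the first critical
#    shift of the kappa = 5 contours shown in Figures 6 and 7
```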
4 Samples of contours of complex coordinates

For the most elementary toboggans characterized by a single branch point, the winding descriptor $\varrho$ becomes trivial, because it is formed by words in a one-letter alphabet. This means that all the information about the windings degenerates just to the length of the word $\varrho$, represented by an (arbitrary) integer [14]. Obviously, these models would be too trivial from our present point of view. In the opposite direction one could also contemplate tobogganic models where a larger number of branch points would have to be taken into account. An interesting series of exactly solvable models of this form may be found, e.g., in Ref. [15]. Naturally, the study of all of these far-reaching generalizations would still proceed along the lines which are tested here on the first nontrivial family, characterized by the presence of the mere two branch points in $\psi(x)$.

From the pedagogical point of view, the merits of the two-branch-point scenario comprise not only the simplicity of the formulae (cf., e.g., Eq. (10) in the preceding section) but also the feasibility and transparency of the graphical presentation of the integration contours $\mathcal{C}^{(\varrho)}$ of the tobogganic Schrödinger equations. This assertion may easily be supported by a few explicit illustrative pictures.

4.1 Rectifiable tobogganic contours with $\kappa = 3$

The change of variables (10) generating the rectifiable tobogganic Schrödinger equations must be implemented with due care, because the knot-shaped curves $\mathcal{C}^{(\varrho)}(s)$ may happen to run quite close to the points of singularities at certain values of $s$. This is well illustrated by Figure 1 or, even better, by Figure 6. At the same time, all our figures clearly show that one can control the proximity to the singularities by means of the choice of the shift $\varepsilon$ of the (conventionally chosen) straight line of the auxiliary variable $z \in \mathcal{C}^{(0)}$ given by Eq. (6).

Once we fix the distance $\varepsilon$ of the complex line $\mathcal{C}^{(0)}$ from the real line $\mathbb{R}$, we may still vary the odd integer $\kappa$. Vice versa, even at the smallest $\kappa = 3$ the recipe enables us to generate certain mutually non-equivalent tobogganic contours $\mathcal{C}^{(\varrho)}(s)$ in an $\varepsilon$-dependent manner. This confirms the existence of discontinuities. Their emergence and form are best illustrated by the pair of Figures 3 and 4.

Fig. 6: The quadruple-circle tobogganic curve of $x \in \mathcal{C}^{(LLRR)}(s)$. With winding parameter $\kappa = 5$ in Eq. (10), this sample is obtained at $\varepsilon = \varepsilon^{(critical)}_1 - 0.0005$, i.e., just slightly below the first critical value $\varepsilon^{(critical)}_1 \sim 0.21574990$

Fig. 7: The topologically different, triple-circle curve $\mathcal{C}^{(RRLL)}(s)$ obtained at $\kappa = 5$ and $\varepsilon = \varepsilon^{(critical)}_1 + 0.0005$

We may conclude that, in general, one has to deal here with a very high sensitivity of the results to the precision of the numerical input or to the precision of the computer arithmetics. This confirms the expectations expressed in our older paper [12], where we emphasized that the descriptor $\varrho$ cannot be read off without a nontrivial, detailed analysis of the mapping $\mathcal{M}$.

4.2 Rectifiable tobogganic contours with $\kappa \geq 5$

Once we select the next odd integer, $\kappa = 5$, in Eq. (10), the study of the knot-shaped structure of the resulting integration contours $\mathcal{C}^{(\varrho)}(s)$ becomes even more involved, because in the generic case sampled by Figure 6 the size of the internal loops proves unexpectedly small in comparison. As a consequence, their very existence may in principle escape our attention. Thus, one might even mistakenly perceive the curve of Figure 6 as an inessential deformation of the curves in Figures 1 or 2. Naturally, not all of the features of our tobogganic integration contours will change during the transition from $\kappa = 3$ to $\kappa = 5$.
In particular, the partial parallelism between Figures 2 and 6 survives as the similar global-shape partial parallelism between Figures 4 (where $\kappa = 3$) and 7 (where $\kappa = 5$). Moreover, a certain local-shape partial parallelism may also be found between Figure 2 (where the two upwards-oriented loops almost touch at $\kappa = 3$) and Figure 8 (where the two downwards-oriented "inner" loops almost touch at $\kappa = 5$). The latter parallels seem to sample a certain more general mechanism, since Figure 4 also finds its replica inside the upper part of Figure 9, etc. Obviously, the next-step transition from $\kappa = 5$ to $\kappa = 7$ (etc.) may also be expected to proceed along similar lines.

Fig. 8: The other extreme, triple-circled $\kappa = 5$ curve $\mathcal{C}^{(RRLL)}(s)$ as emerging at $\varepsilon = \varepsilon^{(critical)}_2 - 0.005$, i.e., close to the second boundary $\varepsilon^{(critical)}_2 \sim 0.49223343$

Fig. 9: The twice-circling tobogganic $\kappa = 5$ curve $\mathcal{C}^{(RLRL)}(s)$ as emerging slightly above the second critical shift-parameter, viz., at $\varepsilon = \varepsilon^{(critical)}_2 + 0.005$

Table 1: Transition parameters for $\kappa = 2M + 1$ with $M = 1, 2, \ldots, 6$; $\varepsilon^{(critical)}_{(m)}$ is the critical shift in $\mathcal{C}^{(0)}(s)$, $|s^{(critical)}_{(m)}|$ the pseudocoordinate and $\varphi^{(critical)}_{(m)}$ the angle

M  m  b       eps_critical              |s_critical|  phi_critical
1  1  0.8660  0.34062501931660664017    1.2712        0.2618
2  1  0.9510  0.49223342986833679823    0.96606       0.4712
2  2  0.5878  0.21574989943840034163    1.3622        0.1571
3  1  0.7818  0.49560936234793313854    0.78876       0.5610
3  2  0.9749  0.41300244005317039597    1.1803        0.3366
3  3  0.4339  0.15634410200136762402    1.3876        0.1122
4  1  0.6428  0.47438630343334929661    0.67749       0.6109
4  2  0.9848  0.47917814904271720218    1.0276        0.4363
4  3  0.8660  0.34062501931660664017    1.2712        0.2618
4  4  0.3420  0.12231697600600608108    1.3981        0.08727
5  1  0.5406  0.44984366535166445772    0.60092       0.6426
5  2  0.9096  0.49834558687374848153    0.91265       0.4998
5  3  0.9898  0.42964189183273983152    1.1519        0.3570
5  4  0.7557  0.28670826353957054964    1.3180        0.2142
5  5  0.2817  0.10037407570525388131    1.4034        0.071400
6  1  0.4647  0.42666576745054519911    0.54460       0.6646
6  2  0.8230  0.49875399287559237235    0.82504       0.5437
6  3  0.9927  0.47264256935707423545    1.0502        0.4229
6  4  0.9350  0.38168235795277279438    1.2249        0.3021
6  5  0.6631  0.24649719795540125795    1.3451        0.1812
6  6  0.2393  0.085076232785825555735   1.4065        0.06042

For the computer-assisted drawing of the graphical representation of the curves $\mathcal{C}^{(\varrho)}$, the formulae of paragraph 3.2 should be recalled as the source of the most useful information about the critical parameters. The extended-precision values of the underlying coordinates of the points of instability are needed in such an application. Their $M \leq 6$ sample is listed here in Table 1.

On this basis we may summarize that at a generic $\kappa$ the variation (i.e., in all of our examples, the growth) of the shift $\varepsilon$ makes certain subspirals of the contours $\mathcal{C}^{(\varrho)}$ larger, moving closer and closer to each other. In this context our Table 1 could, in principle, serve as a certain systematic guide towards a less intuitive classification of our present graphical pictures characterizing the transitions between different winding descriptors $\varrho$ and, hence, between the topologically non-equivalent rectifiable tobogganic contours $\mathcal{C}^{(\varrho)}$. During such phase-transition-like processes [4] the value of $\varepsilon$ crosses a critical point beyond which the asymptotics of the contours change. As a consequence, also the spectra of the underlying tobogganic quantum bound-state Hamiltonians will, in general, be changed [16].
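Combining the two sketches above (again our own illustration), the flip across a critical shift, cf. Figures 6 and 7, can be reproduced directly; for $\kappa = 5$ the first boundary is the $m = 2$ entry of Table 1:

```python
eps1 = critical_shifts(2)[1][2]          # kappa = 5: eps_1 ~ 0.21574990
for eps in (eps1 - 0.0005, eps1 + 0.0005):
    x = tobogganic_contour(kappa=5, eps=eps, s_max=3.0)
    plt.plot(x.real, x.imag, label="eps = %.6f" % eps)
plt.legend(); plt.xlabel("Re x"); plt.ylabel("Im x"); plt.show()
```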
5 Conclusions

We have confirmed the viability of an innovated, "tobogganic" version of PT-symmetric quantum mechanics of bound states in models where the general solutions of the underlying ordinary differential Schrödinger equation exhibit two branch-point singularities located, conveniently, at $x^{(BP)} = \pm 1$.

In particular, we have clarified that many topologically complicated complex integration contours which spiral around the branch points $x^{(BP)}$ in various ways may be rectified. This means that one can apply an elementary change of variables $z(s) \to x(s)$ and replace the complicated original tobogganic quantum bound-state problem by an equivalent, simplified differential equation defined along the straight line of complex pseudocoordinates $z = s - i\varepsilon$.

In detail, a few illustrative rectifications have been described, where we have succeeded in assigning the different winding descriptors $\varrho$ to the tobogganic contours controlled solely by the variation of the "initial" complex shift $\varepsilon$. An interesting supplementary result of our present considerations may be seen in the constructive demonstration of the feasibility of an explicit description of these transitions between topologically non-equivalent quantum toboggans characterized by non-equivalent winding descriptors $\varrho$. Nevertheless, a full understanding of these structures remains an open problem recommended for deeper analysis in the nearest future.

In summary, we have to emphasize that our present rectification-mediated reconstruction of the ordinary-differential-equation representation of quantum toboggans can be perceived as an important step towards their rigorous mathematical analysis and, in particular, towards an extension of the existing rigorous proofs of the reality/observability of the energy spectra to these promising innovative phenomenological models.

Acknowledgement

Support from the Institutional Research Plan AV0Z10480505 and from the MŠMT Doppler Institute project LC06002 is acknowledged.

References

[1] Flügge, S.: Practical Quantum Mechanics I, II. Berlin, Springer, 1971.
[2] Znojil, M.: Experiments in PT-symmetric quantum mechanics, Czech. J. Phys. Vol. 54 (2004), p. 151–156 (quant-ph/0309100v2).
[3] Znojil, M.: One-dimensional Schrödinger equation and its "exact" representation on a discrete lattice, Phys. Lett. Vol. A 223 (1996), p. 411–416.
[4] Bender, C. M., Turbiner, A. V.: Analytic continuation of eigenvalue problems, Phys. Lett. Vol. A 173 (1993), p. 442–445.
[5] Buslaev, V., Grecchi, V.: Equivalence of unstable anharmonic oscillators and double wells, J. Phys. A: Math. Gen. Vol. 26 (1993), p. 5541–5549.
[6] Bender, C. M., Boettcher, S.: Real spectra in non-Hermitian Hamiltonians having PT symmetry, Phys. Rev. Lett. Vol. 80 (1998), p. 5243–5246; Bender, C. M., Boettcher, S., Meisinger, P. N.: PT-symmetric quantum mechanics, J. Math. Phys. Vol. 40 (1999), p. 2201.
[7] Znojil, M.: PT-symmetric quantum toboggans, Phys. Lett. Vol. A 342 (2005), p. 36–47.
[8] Weideman, J. A. C.: Spectral differentiation matrices for the numerical solutions of Schrödinger equation, J. Phys. A: Math. Gen. Vol. 39 (2006), p. 10229–10238.
[9] Bíla, H.: Non-Hermitian operators in quantum physics (PhD thesis supervised by M. Znojil). Prague, Charles University, 2008; Bíla, H.: Pramana, J. Phys. Vol. 73 (2009), p. 307–314.
[10] Wessels, G. J. C.: A numerical and analytical investigation into non-Hermitian Hamiltonians (Master-degree thesis supervised by H. B. Geyer and J. A. C. Weideman). Stellenbosch, University of Stellenbosch, 2008.
[11] See, e.g., "Branch point" in http://eom.springer.de.
[12] Znojil, M.: Quantum toboggans with two branch points, Phys. Lett. Vol. A 372 (2008), p. 584–590 (arXiv: 0708.0087).
[13] Novotný, J.: http://demonstrations.wolfram.com/TheQuantumTobogganicPaths.
[14] Znojil, M.: Spiked potentials and quantum toboggans, J. Phys. A: Math. Gen. Vol. 39 (2006), p. 13325–13336 (quant-ph/0606166v2); Znojil, M.: Identification of observables in quantum toboggans, J. Phys. A: Math. Theor. Vol. 41 (2008), p. 215304 (arXiv: 0803.0403); Znojil, M.: Quantum toboggans: models exhibiting a multisheeted PT symmetry, J. Phys.: Conference Series, Vol. 128 (2008), p. 012046; Znojil, M., Geyer, H. B.: Sturm-Schroedinger equations: formula for metric, Pramana J. Phys. Vol. 73 (2009), p. 299–306 (arXiv: 0904.2293).
[15] Sinha, A., Roy, P.: Generation of exactly solvable non-Hermitian potentials with real energies, Czech. J. Phys. Vol. 54 (2004), p. 129–138.
[16] Znojil, M.: Topology-controlled spectra of imaginary cubic oscillators in the large-L approach, Phys. Lett. Vol. A 374 (2010), p. 807–812 (arXiv: 0912.1176v1).

Prom. fyz. Miloslav Znojil, DrSc.
E-mail: znojil@ujf.cas.cz
Nuclear Physics Institute ASCR
250 68 Řež, Czech Republic

Acta Polytechnica Vol. 51 No. 6/2011

Photometric analysis of the Pi of the Sky data

M. Siudek, K. Malek, L. Mankiewicz, R. Opiela, M. Sokolowski, A. F. Żarnecki

Abstract

A database containing star measurements from the period 2006–2009, taken by the Pi of the Sky detector located in the Las Campanas Observatory in Chile, contains more than 2 billion measurements of almost 17 million objects. All measurements are available on the Pi of the Sky web site through a dedicated interface, which also allows users to download selected data. Accurate analysis of Pi of the Sky data is a real challenge, because of a number of factors that can influence the measurements. Possible sources of errors in our measurements include: reading the chip with the shutter open, strong and varying sky background, passing planets or planetoids, and clouds and hot pixels. In order to facilitate the analysis of variable stars we have developed a system of dedicated filters to remove bad measurements or frames. The spectral sensitivity of the detector is taken into account by appropriate corrections based on the spectral type of reference stars. This process is illustrated by an analysis of the BG Ind system, where we have been able to reduce the systematic uncertainty to about 0.05 magnitudo.

Keywords: gamma ray burst (GRB), prompt optical emissions, optical flashes, nova stars, variable stars, robotic telescopes, photometry.

1 Introduction

Pi of the Sky [4] is a robotic telescope aimed at monitoring a large part of the sky with a good range and time resolution. The detector was designed for observations of astrophysical phenomena characterized by short timescales, especially for prompt optical counterparts of gamma ray bursts (GRBs). The Pi of the Sky apparatus design and observational strategy make this detector useful also for searching for nova and supernova stars and for monitoring interesting objects such as blazars, AGNs and variable stars.

The full Pi of the Sky system will consist of 2 sites separated by a distance of about 100 km, allowing flashes from satellites and other near-Earth objects to be rejected. Each site will consist of 12 custom-designed CCD cameras placed on specially designed equatorial mounts (4 cameras per mount). The full system is now under construction.
Necessary tests before constructing the final version were performed with a prototype consisting of 2 custom-designed cameras placed on an equatorial mount. The prototype operated at the Las Campanas Observatory in Chile from June 2004 till the end of 2009. Cameras working in coincidence observed a 20° × 20° field of view with a time resolution of 10 seconds. Each camera is equipped with Canon lenses f = 85 mm, d = f/1.2, which allows observation of objects to about 12m (about 13.5m for 20 coadded frames). During the 6 years of work the prototype gathered a large amount of data valuable for astronomical research, e.g. for identifying and cataloging many different types of variable stars [2].

2 Pi of the Sky databases

All data gathered by the Pi of the Sky detector are stored in publicly accessible databases (Pi of the Sky home page: http://grb.fuw.edu.pl/pi/). Currently three databases are available. The first database, covering the period from June 2004 till May 2005, contains almost 800 million measurements for about 4.5 million objects. The second database, from May 2006–November 2007, is a subset of the third database, which contains data gathered from May 2006–April 2009. The third database is the biggest, and includes about 2.16 billion measurements for about 16.7 million objects.

In order to ensure easy access to our measurements, we have developed a user-friendly web interface [1]. It allows stars to be selected according to their type, magnitudo, coordinates, etc., displaying their light curves and other properties. It is also possible to download large packets of light curves of multiple stars.

3 Improving photometry

3.1 System of dedicated filters

The data acquired during the Pi of the Sky observations is reduced by a fully automatic pipeline, and only the light curves of stars are stored in the database. In many aspects of analysis, e.g. the identification of variable stars, it is important to select only data with high measurement precision. In order to fulfill this requirement we have developed a system of data quality cuts. The system of dedicated filters allows the rejection of measurements, or even of whole frames, affected by detector imperfections or observation conditions. Measurements that are placed near the border of the frame, or that are infected by hot pixels, by a bright background caused by an open shutter or Moon halo, or by a planet or planetoid passage, can easily be excluded from further analysis. Filters allowing the removal of measurements infected by one of these effects are available on the web interface.

Fig. 1: Precision dispersion of star brightness measurements from standard photometry for 200 s exposures (20 coadded frames) from the Pi of the Sky prototype at the Las Campanas Observatory in Chile. Large dispersion (left) is mainly caused by false measurements. After the application of quality cuts (right), the photometry accuracy improves significantly

Applying the quality cuts can significantly improve the photometry accuracy (see Figure 1). For stars with range 7–10m, an average photometry uncertainty sigma of about 0.018–0.024m has been achieved.
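In pipeline terms, such cuts are simple boolean masks over per-measurement flags. The sketch below is our illustration of the idea for a light-curve table held in pandas; the flag names are hypothetical placeholders, not the actual database schema:

```python
import pandas as pd

def apply_quality_cuts(measurements: pd.DataFrame) -> pd.DataFrame:
    """Keep only measurements that none of the dedicated filters reject.
    Column names are illustrative placeholders for pipeline flags."""
    good = (
        ~measurements["near_frame_border"]
        & ~measurements["hot_pixel"]
        & ~measurements["bright_background"]   # open shutter, Moon halo
        & ~measurements["object_passage"]      # planet / planetoid nearby
    )
    return measurements[good]
```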
3.2 Approximate color calibration algorithm

The Pi of the Sky detector is characterized by a relatively wide spectral sensitivity, due to the fact that the observations are performed only with an IR-UV-cut filter. The average wavelength is about 585 nm and corresponds to a V filter, which is also used as a reference in the photometry corrections. It turned out that the detector response is correlated with the star's spectral type: the average magnitudo measured by Pi of the Sky ($m_{PI}$) is shifted with respect to the catalogue magnitudo, the shift depending on the difference of the catalogue magnitudes given by B−V or J−K. Approximating this dependence with a linear function enables the measurement of each star to be corrected. After applying the correction, the distribution of the average magnitudo shift for the reference stars becomes significantly narrower (an example of the reference stars for the BG Ind variable [3] is shown in Figure 2).

Fig. 2: Distribution of the average magnitudo shift for reference stars after standard photometry (red) and after spectral correction (blue)

Further conditions, including the maximal accepted shift, the RMS of $m_{coor}$ and the number of measurements, are used to evaluate additional photometry corrections (see Figure 3).

Fig. 3: For calculating the photometry corrections, only the best reference stars were used (blue), after rejecting stars with a magnitudo shift ($m_{coor} - V$) bigger than 0.2, with $RMS_{coor}$ bigger than 0.07, and with a number of measurements smaller than 100 (red)

A significant improvement of the measurement precision is also achieved when the photometry correction is not calculated as a simple average over all selected reference stars, but when a quadratic dependence of the correction on the reference star position in the sky is fitted for each frame. The χ² distribution can be used to select the measurements with the most precise photometry (see Figure 4). The effect of the photometry correction with a χ² cut on the reconstructed BG Ind light curve is shown in Figure 5. Applying the new algorithm improved the photometry quality, and an uncertainty sigma of the order of 0.013m was obtained [3].

We also applied the photometry correction to other stars, with as good results as in the case of the BG Ind variable. The spectral correction and the additional χ² cut allow the selection of only the measurements with the highest precision. With the example of the Cepheid variables T Cru and GH Car we can see that the light curve is significantly improved, and that the measurements flagged as those with potentially worse precision do indeed tend to stand outside the light curve (see Figures 6 and 7).

Fig. 4: χ² distribution. For about 20 % of the calculated frames, χ² is greater than 0.058. This information can be used to select measurements with the most precise photometry

Fig. 5: Uncorrected light curve for the BG Ind variable (left) and after spectral corrections with the correction quality cut (right)

Fig. 6: Uncorrected light curve for the T Cru variable (left) and after spectral corrections with the correction quality cut (right)

Fig. 7: Uncorrected light curve for the GH Car variable (left) and after spectral corrections with the correction quality cut (right)
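Both correction steps are ordinary least-squares fits. The following sketch (ours, with hypothetical argument names, not the collaboration's code) first fits the magnitudo shift of the reference stars as a linear function of the catalogue colour, then fits a per-frame quadratic dependence of the residual shift on the star position:

```python
import numpy as np

def spectral_correction(m_pi, catalog_v, b_minus_v):
    """Linear colour calibration: fit (m_PI - V) vs. B-V on the
    reference stars and return colour-corrected magnitudes."""
    shift = np.asarray(m_pi) - np.asarray(catalog_v)
    slope, intercept = np.polyfit(b_minus_v, shift, deg=1)
    return np.asarray(m_pi) - (slope * np.asarray(b_minus_v) + intercept)

def frame_correction(x, y, residual_shift):
    """Per-frame correction: quadratic dependence of the residual
    shift on the sky position (x, y), fitted by least squares; the
    summed squared residual plays the role of the chi^2 quality cut."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    A = np.column_stack([np.ones_like(x), x, y, x * y, x**2, y**2])
    coeff, chi2, _, _ = np.linalg.lstsq(A, residual_shift, rcond=None)
    return coeff, (chi2[0] if chi2.size else 0.0)
```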
4 Summary

During the period 2006–2009 the prototype observed almost 17 million objects and collected over 2 billion measurements. The data acquired by Pi of the Sky is publicly available through a user-friendly web interface. In order to facilitate the selection of measurements with high precision, we have developed a system of dedicated filters to remove measurements affected by detector imperfections or observation conditions. When an approximate color calibration algorithm based on the spectral type of reference stars is used, an uncertainty sigma of the order of 0.013m can be obtained.

Acknowledgement

We are very grateful to G. Pojmanski for access to the ASAS dome and for sharing his experience with us. We would like to thank the staff of the Las Campanas Observatory, the San Pedro de Atacama Observatory and the INTA El Arenosillo test centre in Mazagón near Huelva for their help during the installation and maintenance of our detector. This work was financed by the Polish Ministry of Science and Higher Education in 2009–2011 as a research project.

References

[1] Biskup, M., et al.: Web interface for star databases of the Pi of the Sky experiment, Proceedings of SPIE, Vol. 6937, 2007.
[2] Majczyna, A., et al.: Pi of the Sky catalogue of the variable stars from 2006–2007 data, Proceedings of SPIE, Vol. 7745, 2010.
[3] Rozyczka, M., Malek, K., et al.: Absolute properties of BG Ind — a bright F3 system just leaving the main sequence, Monthly Notices of the Royal Astronomical Society, 2011, 414, 2479–2485.
[4] Siudek, M., et al.: Pi of the Sky telescopes in Spain and Chile, in these proceedings.

Malgorzata Siudek, Katarzyna Malek, Lech Mankiewicz, Rafal Opiela
Center for Theoretical Physics of the Polish Academy of Sciences
Al. Lotnikow 32/46, 02-668 Warsaw, Poland

Marcin Sokolowski
The Andrzej Soltan Institute for Nuclear Studies
Hoza 69, 00-681 Warsaw, Poland

Aleksander F. Żarnecki
Faculty of Physics, University of Warsaw
Hoza 69, 00-681 Warsaw, Poland

Acta Polytechnica Vol. 52 No. 3/2012

Measurement of O2 in the combustion chamber of a pulverized coal boiler

Břetislav Janeba, Michal Kolovratník, Ondřej Bartoš
CTU in Prague, Faculty of Mechanical Engineering, Department of Energy Engineering, Technická 4, 166 07 Prague 6, Czech Republic
Correspondence to: bretislav.janeba@fs.cvut.cz

Abstract

Operational measurements of the O2 concentration in the combustion chamber of a pulverized coal boiler are not yet common practice. Operators are generally satisfied with measuring the O2 concentration in the second pass of the boiler, usually behind the economizer, where a flue gas sample is extracted for analysis in a classical analyzer. A disadvantage of this approach is that there is a very weak relation between the measured value and the conditions in specific locations in the fireplace, e.g. the function of the individual burners and the combustion process as a whole. A new extraction line was developed for measuring the O2 concentration in the combustion chamber. A planar lambda probe is used in this approach. The extraction line is designed to get outputs that can be used directly for the diagnosis or management of the combustion in the boiler.

Keywords: O2 measurement, lambda probe, pulverized coal boiler.

1 Introduction

The aim of this study was to obtain more detailed information about the combustion process in pulverized coal boilers. The volumetric concentration of O2 in the combustion chamber provides important information about the burn-out ratio, the combustion quality, and the space distribution of the flame. The simultaneous application of several classical O2 analyzers for making measurements in multiple places in the combustion chamber is too expensive for ordinary industrial operation. Another disadvantage of classical types of O2 analyzers is that a flue gas sample cooler needs to be used, which makes the entire measurement system complicated and delays the delivery of the acquired information. A new analyzer device has been developed for making simple and affordable measurements in the combustion chamber of a pulverized coal boiler [1]. It consists of an extraction line, an oxygen sensor, a suction pump, and a periodic cleaning system.

A wide-range λ-sensor is used as the O2 measurement element. This is a type of sensor that is widely used in the automotive industry; mass production reduces the price of the whole analyzer. The flue gas is drawn by the pump directly through the extraction line of the analyzer, without using the cooler, and then the flue gas passes by the λ-sensor. The sensor measures the O2 concentration directly, i.e. in the wet flue gas. Under the assumption that the combustion process is complete, the air excess ratio can be determined by Eq. (1),

$$ \alpha = \frac{21 + \omega^{wet}_{O_2} \cdot \nu \left( \dfrac{V^{wet}_{flue\,gas,\,min}}{V^{wet}_{air,\,min}} - 1 \right)}{21 - \omega^{wet}_{O_2} \cdot \nu}, \qquad (1) $$

where ω denotes concentration and V denotes volume. The ratio ν is defined as

$$ \nu = \frac{V^{wet}_{air,\,min}}{V^{dry}_{air,\,min}}. \qquad (2) $$

The ratio $V^{wet}_{flue\,gas,\,min} / V^{wet}_{air,\,min}$ varies from 1.1 to 1.25, and has to be taken into consideration.
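Equation (1) is a one-line computation once the two volume ratios of the fuel are known. A minimal sketch (ours; the default parameter values are illustrative placeholders, not values from the paper):

```python
def air_excess_ratio(o2_wet_pct, nu=1.05, flue_to_air=1.15):
    """Air excess ratio alpha from Eq. (1).

    o2_wet_pct  -- O2 volume concentration in the wet flue gas, in %
    nu          -- V_air_wet,min / V_air_dry,min from Eq. (2)
    flue_to_air -- V_fluegas_wet,min / V_air_wet,min (1.1 to 1.25)
    """
    return (21.0 + o2_wet_pct * nu * (flue_to_air - 1.0)) / \
           (21.0 - o2_wet_pct * nu)

print(air_excess_ratio(3.0))   # e.g. 3 % O2 in wet flue gas -> alpha ~ 1.2
```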
2 Description of the analyzer

The measurement sensor is a Bosch LSU 4.9 wide-range (planar) sensor. This sensor [2, 3] is based on the principle of two ZrO2 electrochemical cells. The first cell measures the O2 concentration in the testing volume. The second cell is the inverse electrochemical cell, and it removes (or adds) oxygen ions to hold the air excess ratio in this volume equal to one. The oxygen concentration in the flue gases is determined from the electric current flowing through the second electrochemical cell. An advantage of this sensor is that it can determine the air excess ratio in a wide range of values, and it can identify sub-stoichiometric combustion as well.

The extraction line of the analyzer is equipped with periodic blow-through cleaning by compressed air.

Figure 1: The extraction line and the head of the λ-sensor

The sensor itself is protected by a high-capacity stainless steel filter. Without this cleaning, ash and coal powder can block the extraction pipe within a few hours. The temperature along the entire extraction line is controlled to avoid unfavorable condensation. The sample transport within the analyzer is executed by a chemically resistant air pump. The present design of the analyzer with the extraction line (Figure 1) is the result of development work based on test measurements performed at the Mělník 1 power plant. The analyzer itself is installed in an isolated, temperature-controlled box. The aim was to develop a compact structure for the analyzer. Several different structural solutions were abandoned because they did not provide the anticipated properties. A former manner of sample transport was provided by an air-driven ejector; the tests in the power plant revealed the problem of the ejector getting blocked by a mixture of ash and condensed liquids from the flue gas. Another disadvantage was the high consumption of compressed air.

The extraction pipe inserted in the combustion chamber is made from high-temperature resistant Kanthal material. The velocity of the flue gas in this pipe should be slow, in order to take advantage of the gravitational settling of the prevailing part of the ash. This settled powder is blown away by the periodic cleaning. The compressed air for cleaning is stored in a 5-litre pressure vessel. The cleaning period of 10 s every hour is activated by opening an electromagnetic valve. The valve is governed by a programmable logic controller (PLC).
3 Calibration and test measurements

Several tests were performed in the laboratory with the extraction pipe and the LSU 4.9 sensor. The sensor has its own calibration function between the measured electric current and the volumetric O2 concentration. The purpose of the laboratory tests was to find the properties of our extraction line with the sensor. We tested the absolute value of the O2 concentration, and the time response to a change in concentration. For the reference measurement we used a paramagnetic analyzer PMA12 (M&C). Both analyzers measured the same gas sample (a mixture of N2 and air) at the same time. The uncertainty detected during the tests was lower than 2 % of the measurement range for the expected operational measurements. This uncertainty is acceptable for this analyzer. The time delay is in the order of seconds, depending on the pump performance and the optimum flue gas velocity in the extraction pipe. For O2 concentrations higher than 15 %, the time response is longer, but this condition is, in general, outside the desired operational condition in the combustion chamber.

More attention was paid to the tests in the power plant. In a laboratory, it is difficult to simulate the same conditions as in a combustion chamber, especially the high temperature of the dewpoint and the high concentration of solid particles in the flue gas. The measurements were mostly carried out at the Mělník 1 power plant. These measurements focused on the reliability of the entire analyzer and on tests of the periodic cleaning of the extraction pipe. The heating system of the analyzer was also tested.

Figure 2: Scheme of the extraction line and the analyzer: 1 – air pump, 2 – connection port, 3 – electromagnetic valve, 4 – air vessel, 5 – connection to the compressed air line, 6 – PID-controlled heater, 7 – λ-sensor, 8 – isolated box, 9 – filter

Figure 3: The inlet to the extraction pipe after a one-day test

Figure 3 shows the inlet part of the extraction pipe after a one-day test. This test showed the good efficiency of the periodic cleaning. Without cleaning, the tip of the extraction line became blocked by ash. After this measurement, no trace of unfavorable condensation inside the analyzer was detected. Figure 4 presents an example of the measured data from a pulverized coal boiler at the Mělník 1 power plant. After 3 minutes, the analyzer was placed in the boiler. In the 16th minute, the cleaning event by the compressed air blow-through can be seen.
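The acceptance criterion of the laboratory comparison can be written down in a few lines. A sketch (ours; the sample values and the 21 % span are illustrative assumptions):

```python
import numpy as np

def worst_deviation_pct(o2_lambda, o2_reference, span_pct=21.0):
    """Worst-case deviation between the lambda-sensor readings and the
    paramagnetic reference, in % of the measurement range."""
    diff = np.abs(np.asarray(o2_lambda) - np.asarray(o2_reference))
    return 100.0 * diff.max() / span_pct

# paired readings of the same N2/air mixture from both analyzers
print(worst_deviation_pct([4.02, 10.1, 15.3], [3.95, 10.0, 15.6]) < 2.0)
```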
4 Results

When the final design of the analyzer was settled, a second analyzer was manufactured. For the verification of the analyzers, the opportunity was used to measure the off-design operation of boiler K6 at Mělník 1, in collaboration with the I&C Energo company. During these tests, the data from the analyzers was compared with the operational measurement of the O2 concentration in the second pass of the boiler. This verification measurement was considered successful, because the new analyzers follow all changes to the boiler setting. Further verification tests will be made, and operational experience with the analyzer will be gained.

Figure 4: Acquired data example of the O2 concentration

Figure 5: Acquired data example from the left and right side of the boiler, measured by the boiler control system and by the tested analyzers

An example of the measurements is presented in Figure 5. The label CVUT denotes the new analyzer, and EME denotes the operational measurements of the boiler. L and R refer to the left and right side of the boiler. In the 18th minute of this test, the amount of air in the right side of the boiler increased rapidly, while the left side remained stable. The operational measurement is placed behind the combustion chamber, and its information is blurred by the mixing of the flue gas behind the combustion chamber.

5 Conclusion

A new O2 analyzer was developed, tested and used for practical measurements. The main advantage of the analyzer is that it provides immediate information about the O2 concentration at the place of measurement. This information can be used in the boiler control system to optimize some imbalances in the combustion chamber. The new analyzer was developed for pulverized coal boilers, but it can also be used for other types of boilers. The analyzer costs less than classical analyzers, which raises the question of the possibility of using the same analyzer or a similar analyzer to control boilers with lower performance. Biomass combustion systems also offer a broad field of potential applications. If the combustion process is not finished, the measured value provides cumulative information about the amount of as yet unburned fuel and the excess air ratio. These two parameters can be distinguished only with additional information about the combustion. We expect that the analyzer will be able to provide accurate results only in the region where combustion is finished. The present development focuses on reliability enhancement, especially on the external side of the extraction pipe. The outer surface can be covered by melted ash; this layer can grow until the weight of the stuck ash bends the extraction pipe. Periodic mechanical cleaning of the outer surface is currently under development.

Acknowledgement

This work has been supported by grant TIP FR-TI1/539 of the Ministry of Industry and Trade of the Czech Republic and by the I&C Energo company.

References

[1] Janeba, B., Kolovratník, M., Bartoš, O.: Měřicí technika pro výkonnější řízení procesů spalování, část 2. Zpráva FS ČVUT v Praze, Ústav energetiky, Z-575/2011, 2011.
[2] Janeba, B., Kolovratník, M., Bartoš, O.: Měřicí technika pro výkonnější řízení procesů spalování, 1. Zpráva FS ČVUT v Praze, Ústav energetiky, Z-571/2010, 2010.
[3] Shuk, P., Bailey, E., Guth, U.: Zirconia oxygen sensor for process application: state-of-the-art, Sensors and Transducers Journal, 2008.

Acta Polytechnica Vol. 52 No. 1/2012

Plasma flow and temperature in a gliding reactor with different electrode configurations

J. Sláma

Abstract

This paper deals with the shape of the plasma flow depending on the electrode form of a gliding discharge plasma-chemical reactor, and with the temperature distribution along the direction of the plasma flow in one specific electrode form. The shape of the electrodes and their mutual position have a significant influence on the design of a gliding discharge reactor and its applications. It is crucial to know the temperature distribution for the design of the reactor chamber and for the application of the discharge. Three configurations with model shapes of wire electrodes were therefore tested (low-divergent, circular, high-divergent), and the plasma flow was described. The experiments were performed in air at atmospheric pressure and at room temperature. In order to map the reactive plasma region of the flow, we investigated the visible spectral lines that were emitted. The gas temperature was measured using an infrared camera.

Keywords: gliding discharge, spectra, shape of electrodes, plasma reactor, plasma flow, temperature.
1 Introduction

In recent years there have been many physical and technological applications of the gliding discharge as "an auto-oscillating phenomenon developing between at least two electrodes that are immersed in a laminar or turbulent gas flow" [1]. The behaviour of a gliding discharge depends on its configuration and parameters, e.g. the discharge voltage and current, the injected working gas flow rate, and the shape of the electrodes. In this paper we have focused on the emitted spectra and on the plasma flow temperature, which were observed while changing two parameters. The spectra were measured for three model shapes of electrodes (low-divergent, circular and high-divergent). Motivated by the utilization of the high-divergent electrode configuration, the temperature of the gliding discharge plasma flow was measured for the high-divergent configuration and for three working gas flow rates (10, 20 and 30 slm).

2 Experimental

The apparatus (Figure 1) used in our experiments consisted of a plasma chemical reactor containing electrodes (copper wires 1 mm in diameter shaped into the required form — Figure 2), a nozzle, and the gas distribution system (flowmeter, reduction valve, piping and air compressor). The electrodes were connected to the high voltage power source (U = 8 kV, f = 50 Hz). The interelectrode distance at the point of initial discharge breakdown was 3 mm. The reduction valve held the gas flow at average values of Q = 10, 20 and 30 slm. The experiment was performed at a temperature of about 21 °C, an atmospheric pressure of 101 kPa and an air humidity of about 30 %.

Fig. 1: Plasma chemical reactor, gas distribution system and spectroscope

Fig. 2: Shape of the electrodes: (a) – low-divergent, (b) – circular, (c) – high-divergent

2.1 Electrodes

Three wire electrode configurations were tested with model electrode shapes — "low-divergent", "circular" and "high-divergent" (Figure 2). Two 130 mm long straight electrodes represented the low-divergent configuration, the circular electrodes were formed by a pair of ring-shaped electrodes 22 mm in radius, and the high-divergent configuration was two sharp triangular right-angle-shaped electrodes. The model shapes of the electrodes are represented by both the low-divergent and the high-divergent configurations.

The photos in Figure 3 were taken using a Nikon D90 camera and an AF-S Nikkor 18–200 mm objective. The exposure was a 1 s shutter speed at aperture f/13. ISO 200 was used for the photography of the low-divergent configuration, and ISO 160 was used for the photography of the circular and high-divergent configurations.

Fig. 3: Plasma flows in the low-divergent (a), circular (b) and high-divergent (c) electrode configurations, Q = 20 slm (pictures not taken on the same scale)

2.2 Temperature and visible spectra

Temperature. A basic analysis was performed of the temperatures in the gliding discharge. We used a Guide M8 infrared camera (Figure 4) for scanning the observed space. The resolution of the microbolometer array was 160×120 pixels, with a spectral range from 8 µm to 14 µm, a temperature range from −20 °C to 250 °C, and a sensitivity ≤ 0.1 K [2]. For a better visualization of the temperature distribution in the gliding discharge plasma flow we used a set of "optical fibre probes", due to the low emissivity and the low mass density of the plasma.

Fig. 4: Guide M8 infrared camera for measuring thermal emission
Fig. 5: Horizontal projection of the experimental setup with the SAD500 Avantes spectrometer and the Guide M8 infrared camera (see also Figure 1)

The optical fibre probes were made of 50 µm fibre (taken from the GUMT206 fibre optic cable (8 fibres, 50/125 µm) [3]). The probes were placed in a straight line in the xy-plane (Figures 5 and 1), with a mutual distance of 5 mm in the positive z-axis direction — all probes were perpendicular to the plasma flow and were heated by the circumfluent plasma. The Guide M8 infrared camera was placed as shown in Figure 5. The optical fibre probes were scanned by the infrared camera at an angle of approx. 45°. The distance from the camera to the probes was about 270 mm.

Spectra. An SAD500 Avantes spectrometer with a fibre for the transmission of the light from the discharge was used for measuring the visible spectra. The spectral range of the spectrometer was from 190 nm to 861 nm. The resolution of the spectrometer was 0.328 nm. The spectrometer fibre was set perpendicular to the z-axis, in the direction of the y-axis (Figure 5). To scan the whole axial area of the plasma flow, the fibre was moved axially in the z-axis direction.

3 Interpretation

The spectra in the visible wavelength range were taken in all three configurations. As the plasma region was relatively small, the temperature was measured indirectly along the z-axis only in the high-divergent electrode configuration.

3.1 Spectra

The ionized flow regions were registered by the visible spectral line intensities. Each intensity value in the graphs in Figures 6, 7 and 8 represents the average value of 7 measurements. Based on the NIST database [4], we found four spectral lines — λa, λb, λc and λd — typical for low temperature plasma in air:

λa = 463.061 nm, N III; 2s2p(³P°)4p − 2s2p(³P°)5s,
λb = 463.885 nm, O II; 2s²2p²(³P)3s − 2s²2p²(³P)3p,
λc = 500.113 nm, N II; 2s²2p(²P°)3p − 2s²2p(²P°)3d,
λd = 567.602 nm, N II; 2s²2p(²P°)3s − 2s²2p(²P°)3p.
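Assigning measured intensity maxima to this short line list is a simple nearest-wavelength lookup. A sketch (ours; the peak criterion and the tolerance of roughly one resolution element are our illustrative choices):

```python
import numpy as np

LINES_NM = {463.061: "N III", 463.885: "O II",
            500.113: "N II (3p-3d)", 567.602: "N II (3s-3p)"}

def identify_lines(wl_nm, intensity, tol_nm=0.33, n_sigma=3.0):
    """Report which of the four reference lines stand out of the
    spectrum by more than n_sigma above the mean intensity."""
    wl, I = np.asarray(wl_nm), np.asarray(intensity)
    threshold = I.mean() + n_sigma * I.std()
    hits = []
    for ref, species in LINES_NM.items():
        window = np.abs(wl - ref) <= tol_nm
        if window.any() and I[window].max() > threshold:
            hits.append((ref, species, float(I[window].max())))
    return hits
```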
• in the high-divergent configuration the plasma outreached the interelectrode region by about 10 mm (figures 3(c), 8 and 9). a sharp transition was found between the ionized and neutral parts of the flow (z ∼ 20 mm on figure 9). fig. 9: intensity distribution along the z-axis in highdivergent electrode configuration, q =20slm 3.2 temperature motivated by actual utilization of the high-divergent electrode configuration, we measured the temperature distribution in the interelectrode region along the z-axis in this configuration. to visualize the temperature in the infrared camera more easily we scanned the interelectrode space with an added set of optical fiber probes (figure 5). from the overall infrared picture (figure 10) we selected the required cross-section (green line) which included the scanned “probes” heated by the gas flow. fig. 10: infrared picture from the guide m8 infrared camera 63 acta polytechnica vol. 52 no. 1/2012 five infrared pictures were taken (figure 10) by the guide m8 for each gas flow q = 10, 20 and 30 slm. then the temperatures of the probes were extracted and the distribution containing the mean values was plotted for the gas flows (figure 11). fig. 11: temperature distribution along the z-axis for gas flows q = 10, 20 and 30slm (high-divergent configuration) as shown in the graph (figure 11), the gliding discharge axial temperature distribution depended on the gas flow. with increasing gas flow values q, the temperature of the probe decreased. the maxima of the curves in figure 11 were approximately at the end of the discharge region, e. g. z ∼ 20 mm. 4 summary the connection between three basic electrode profiles (low-divergent, circular, high-divergent) and the appropriate plasma flow shapes in the gliding discharge reactor has been studied. the experiments were carried out in the air at room temperature and atmospheric pressure. the most ionized region was identified by the visible spectral lines of the observed plasma, and was located in the ignition area. in the high-divergent configuration, the temperature was measured indirectly along the perpendicular axis for three gas flow values. the hottest region was found at the end of the plasma region, and its temperature depended on the gas flow, i. e. with increased flow rate the overall temperature dropped. our results can be helpful for finding suitable gliding reactor electrode shapes, i. e. the shapes of the plasma flow in the reactor, and in technological applications of the discharge (e. g. decompositions [5], surface treatment, etc.). acknowledgement research described in the paper was supported by the czech technical university in prague grant no. sgs10/266/ohk3/3t/13 and by the surfacetreat, a.s., czech republic. references [1] bo, y. et al: the dependence of gliding arc gas discharge characteristics on reactor geometrical configuration. plasma chem plasma process (2007) 27:691700, springer science+business media, llc 2007. [2] reseller webpage, online: march 10th 2011, http://www.guide-infrared.cz/termokamery-rucni/ termokamera-guide-m8. [3] datasheet, online: march 10th 2011, http://www.anixter.com/axecom/axedoclib.nsf/ (unid)/de885588defc8ce1862575a200611206/ $file/fibre.pdf. [4] nist: atomic spectra database lines form, online: 1st april 2010, http://physics.nist.gov/physrefdata/asd/ lines form.html. [5] krawczyk, k., ulejczyk, b.: decomposition of chloromethanes in gliding discharges. plasma chemistry and plasma processing, vol. 23, no. 2, june 2003, p. 265–281. 
about the author jan sláma was born in jablonec nad nisou in 1982. he was awarded his bachelor degree in 2007 and his master degree in 2010 from fee ctu in prague. he started his phd studies on plasma physics at the department of physics, fee ctu in prague in 2010. he designed a plasma-chemical reactor for adbd applications for his bachelor project, and implemented a high voltage middle frequency power source for adbd reactors for his master project. he is currently working on characterizing gliding discharge and applying it in various technical fields. jan sláma e-mail: slamajan@fel.cvut.cz department of physics faculty of electrical engineering czech technical university in prague technická 2, 166 27 praha, czech republic 64 ap09_2.vp 1 introduction national and european energy policies aim at an increasing contribution from renewable sources. the european commission’s target for this contribution is 20 % by 2020. for germany, with its already existing high amount of onshore wind turbines, the most probable growth is seen in wind energy at offshore locations and repowering of existing onshore sites. a major obstacle for a further increasing contribution from these renewable sources lies in their intermittent nature and thus in their undetermined availability. reported durations of unavailability of power produced in onshore wind turbines can reach up to some weeks under special weather conditions, and are not locally restricted. today, conventional power plants are needed in order to adapt the electricity generation at each moment to the existing load and to cover the periods of unavailability of wind and solar energy. as a consequence, conventional thermal power plants have to provide an increasing amount of control power and thus frequently operate at inefficient partial load or have to shut down. in this situation the operation of energy storage systems can help to allow a higher amount of renewable energy sources (res) by stabilizing the grid and finally decreasing the consumption of fossil fuels. as a result, the emissions of conventional fossil fired thermal power plants can be reduced and the remaining thermal power plants can operate more continuously and close to their best efficiency. in principle, pumped hydro storage plants are particularly suitable for the supply of control and reserve power. however, their current capacity cannot readily be expanded significantly because of geographic constraints. it is beyond all question to cover the long periods of unavailability of wind energy by pumped hydro plants of the conventional type. having in mind the huge amount of needed storage capacity, other possibilities for energy storage systems such as compressed air energy storage (caes) or the possibilities of hydrogen production and storage have to be investigated. economically, these storage systems generally compete with fast acting conventional generation technologies, e.g. gas turbines. at the present time, it seems very difficult to justify the need for large scale energy storage from a purely economic point of view. however, it could be argued that external effects such as the ecological and overall national socio-economic benefits of storage systems (e.g. reduction of co2-emissions or the possibility to reduce dependency on energy imports) should also be taken into account. the high prices paid today for primary, secondary and minute reserve already indicate an economic potential for storage systems. 
2 Characterisation of energy storage systems

Energy storage systems can be characterized by a set of parameters. For the selection of a suitable energy storage system it is necessary to know, as exactly as possible, the characteristics of the typical duty cycles, their frequency and the required response time to full power. In this context the load-following capability and the power-flow reversal (e.g. the change from charging to discharging) may also become important for certain applications. This data defines the energy throughput and the number of cycles per time unit.

The efficiency of storage systems can be described by their cycle losses and stand-by losses (including the power conversion system and auxiliaries). It is necessary to note that for many storage technologies the efficiency depends on the duty cycle. For all electrochemical systems, the efficiency decreases with an increasing charge/discharge power rate.

Concerning the investment costs, a distinction has to be made between power-dependent costs and capacity-dependent costs. Regarding the latter, attention has to be paid to the fact that a difference generally exists between the gross energy content and the usable energy content. For compressed-air storage systems based on caverns, a limitation of the usable energy content also exists, depending on the duty cycles and their frequency. The depth of discharge (DOD) may have an influence on the lifetime, e.g. for batteries. For the various main components of a storage system, different lifetimes generally have to be considered, as well as reliability and maintenance aspects.
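This parameter set maps naturally onto a small data structure. A sketch (ours; the field names and derived quantities are illustrative, not a taxonomy from the paper):

```python
from dataclasses import dataclass

@dataclass
class StorageSystem:
    power_mw: float           # charge/discharge power
    gross_energy_mwh: float   # gross energy content
    usable_dod: float         # usable depth of discharge, 0..1
    efficiency: float         # round-trip, duty-cycle dependent
    cycles_per_day: float
    component_life_y: float   # shortest main-component lifetime

    @property
    def usable_energy_mwh(self) -> float:
        # gross vs. usable energy content, as discussed above
        return self.gross_energy_mwh * self.usable_dod

    @property
    def annual_throughput_mwh(self) -> float:
        return self.usable_energy_mwh * self.cycles_per_day * 365.0
```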
In addition to the electrical parameters, other characteristics like weight, footprint, volume, transport logistics, auxiliaries and environmental aspects are also decision-making factors.

3 Storage technologies

Storage technologies can be distinguished according to Tab. 1 into three major groups with different features. The main characteristics and features of the different storage technologies are discussed below.

Table 1: Overview of storage applications and suited storage technologies

X-large scale — response time > 15 min; typical discharge times: days–weeks; storage technologies: hydrogen storage systems; suited applications: reserve power compensating for long-lasting unavailability of wind energy.

Large scale — response time < 15 min; typical discharge times: hours–weeks; storage technologies: compressed air storage (CAES), hydrogen storage systems, pumped hydro; suited applications: secondary reserve, minute reserve, load-leveling.

Medium scale — response time 1–30 s (1) / 15 min (2); typical discharge times: minutes–hours; storage technologies: batteries (Li-ion, lead-acid, NiCd), high-temperature batteries, zinc-bromine batteries, redox-flow batteries; suited applications: primary reserve (1), secondary & minute reserve (2), load-leveling, peak shaving.

There is over 90 GW of pumped hydro storage in operation worldwide, which is about 3 % of the global power generation capacity. Pumped storage plants are characterized by long construction times and high capital expenditure. Pumped storage is the most widespread energy storage system in use on power grids, and is commercially available from many manufacturers. However, a large percentage of the pumped hydroelectric potential has already been developed in North America and Europe. Typical applications of pumped hydro power plants are secondary and minute reserve, peak shaving or load-leveling, and they have black-start capabilities. The typical efficiency range is 65–80 %, strongly dependent on the site. Depending on the size of the lake, typical discharge durations realized today are in the range of a few hours. The power range is 10 MW to 1 GW, and the time to full power is in the order of 90 s [1, 2].

Compressed air energy storage (CAES) has similar properties to pumped hydro, but with different geographic restrictions, as cavern leaching needs suitable salt deposits in the underground. Existing "diabatic" CAES plants can be thought of as gas turbine plants where the compression process and the expansion process are temporally decoupled: excess electricity is used to drive turbo-compressors that fill underground caverns with compressed air (cooled down to ~50 °C). At times of peak load, compressed air is drawn from the cavern, then heated in gas burners and expanded in a modified gas turbine. As a drawback, the cycle still depends on fossil fuel and also has a limited cycle efficiency, due to the waste heat emerging from the compression process. An improved implementation makes use of the gas turbine waste heat with the help of a recuperator in the fuel gas path; however, the system-inherent limitation of the efficiency remains. Adiabatic CAES, a novel concept, seeks to overcome these disadvantages by re-incorporating the otherwise lost compression heat into the expansion process, and thus to provide a locally emission-free storage technology with high storage efficiency. It thus needs heat storage as a central element of the plant. So far, adiabatic CAES is a subject of research, while diabatic CAES technology is commercially available from several manufacturers. The multitude of suitable sites for CAES is a beneficial condition for a broad introduction on the market. A typical pressure is in the range of 6–10 MPa. For daily cycling, the usable pressure swing has to be limited to approx. 2 MPa. The round-trip efficiency is in the range of 42–54 % for a diabatic CAES, and up to 70 % for an adiabatic CAES. Depending on the size of the cavern, discharge durations from a few hours up to a few days are possible. The time to full power is in the range of 15 minutes, which is sufficient for providing minute reserve [3].

Hydrogen can be produced from power by high-pressure electrolysers (pressures between 3 and probably 20 MPa). Various electrolyser technologies are under development.
For efficient storage, hydrogen has to be compressed further before being stored in underground salt caverns at a pressure of up to 20 MPa and above. As charging and discharging are slow, a pressure swing of about 2/3 can be realized. For high power levels, the most efficient conversion back to electricity can be achieved in combined cycle power plants; in the lower power range, fuel cells can be applied. Round-trip efficiencies are expected to be in the range of 35–40 %. The power range is 10 kW to 1 GW, and the ramp-up time is about 1 to 5 minutes. The achievable energy density of compressed hydrogen is more than one order of magnitude higher than that of compressed air. Storage of compressed hydrogen in salt caverns is relatively cheap, which qualifies this technology especially for the long-term storage of bulk energy to be reused during long-lasting unavailability of wind energy. It is currently the only technology with a technical potential for single storage systems in the 100 GWh range [4, 5].

The lead-acid battery is one of the oldest battery technologies and today still the most used secondary battery technology worldwide. Lead-acid batteries are commercially available from many manufacturers all over the world, and there are also large scale installations in the 10 kW to 50 MW class which have been in operation over the past 25 years. Their biggest advantage is the low cost compared to other storage systems. However, lead-acid batteries for large stationary applications are not produced in large quantities of one cell type in fully automated production lines, as they are used today for automotive batteries, to achieve sufficient economy-of-scale effects. Therefore, especially for large scale applications, there is still room for cost reductions and also for technical improvements with regard to lifetime. The typical efficiency range is 80–85 %. The lead-acid battery is a high energy technology rather than a high power technology, which means typical discharge durations in the range of 1 hour and more. Discharge durations below 15 min are possible but generally make no sense. The available capacity decreases significantly with increasing discharge current rates. Lead-acid batteries suffer especially from their short lifetime, usually limited to a few thousand cycles and depending strongly on the DOD [6].

Nickel-cadmium batteries are a very successful battery product from a technical point of view, and this is the only battery technology that still features a good power capability at temperatures in the range of −20 to −40 °C. Large battery systems built from NiCd batteries are in operation, similar to those for lead-acid batteries.
the lead-acid battery is one of the oldest battery technologies and is today still the most used secondary battery technology worldwide. lead-acid batteries are commercially available from many manufacturers all over the world, and there are also large-scale installations in the 10 kw to 50 mw class which have been put into operation over the past 25 years. their biggest advantage is their low cost compared to other storage systems. however, lead-acid batteries for large stationary applications are not produced in large quantities of one cell type on fully automated production lines, as is the case today for automotive batteries, so sufficient economies of scale are not achieved. therefore, especially for large-scale applications, there is still room for cost reductions and also for technical improvements with regard to lifetime. the typical efficiency range is 80–85 %. the lead-acid battery is more a high-energy than a high-power technology, which means typical discharge durations in the range of 1 hour and more. discharge durations below 15 min are possible but generally make no sense, as the available capacity decreases significantly with increasing discharge current. lead-acid batteries suffer especially from their short lifetime, usually limited to a few thousand cycles and depending strongly on the depth of discharge (dod) [6].

nickel-cadmium (nicd) batteries are a very successful battery product from a technical point of view, and this is the only battery technology that still features good power capability at temperatures in the range of −20 to −40 °c. large battery systems built from nicd batteries are in operation, similar to those for lead-acid batteries. the specific costs per capacity are significantly higher than for lead-acid batteries, but they can provide a long cycle lifetime of more than 10,000 cycles at 80 % dod. a small number of manufacturers worldwide deliver industrial nicd batteries. several technology variants are available, optimized for specific requirements (lifetime, cycle number and power requirement). the use of cadmium is critical and, therefore, this technology is on the inspection list of the eu preceding a possible prohibition, which can be averted only as long as no alternative storage technologies are available. the efficiency of nicd batteries is about 70 %, due to the low nominal voltage of the basic cells [7].

lithium-ion batteries have become the most important storage technology for portable applications (e.g. laptops, cell phones) within a few years, owing to their high gravimetric energy density. in stationary applications, too, they could be an interesting option because of their high power capability; recent developments underline the future potential of this technology. discharge times of 15 minutes or less are possible. unlike other secondary battery technologies, a large variety of material combinations is available for lithium-ion batteries. on the one hand, this makes it difficult to give general statements on performance; on the other hand, it gives room for numerous companies and players in the field and for strong competition among companies and technologies. worldwide investments in r&d of more than 500 million us$ per year are being spent to bring the technology forward for portable, stationary and mobile (hybrid, plug-in hybrid and full electric vehicle) applications. the efficiency is 90 to 95 %, and the gravimetric energy density is superior to all other commercial rechargeable batteries in the capacity range of kwh and above [6, 8].

sodium-nickel-chloride batteries (nanicl, also called zebra batteries) and sodium-sulphur batteries (nas) have a solid electrolyte instead of the liquid electrolyte used in other batteries. to achieve sufficient ion conductivity and to keep the active masses in a fluid state, an operating temperature of 270–350 °c is necessary. when the battery is cooled down, charging or discharging is no longer possible, and there is a danger of cracks in the ceramic electrolyte because of mechanical stresses. with daily utilization, the temperature of the battery can be maintained by its own reaction heat, given appropriately dimensioned thermal insulation; these batteries therefore qualify for applications with daily cycling. in japan, nas batteries are commercially available and are used in many stationary applications such as load-leveling, sometimes including emergency power supply and ups. nanicl batteries are commercially available in europe, mainly for mobile applications such as full electric vehicles. the efficiency is 70–80 %. high-temperature batteries are typical high-energy batteries [6, 9].

in redox-flow batteries, the active material consists of salts dissolved in two different fluid electrolytes. the electrolytes are stored in tanks and are pumped, when needed, into a central reaction unit (stack) for the charge or discharge process. similar to fuel cells, the stack undergoes no physical or chemical change during the charge/discharge process. the size of the tanks determines the energy capacity of the battery; the stack determines the power.
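the tank/stack decoupling translates directly into a sizing rule. a hypothetical sketch, where the electrolyte energy density and the stack power density are assumed, order-of-magnitude values, not data from this paper:

# decoupled sizing of a redox-flow battery: the tanks set the energy, the
# stack sets the power.
ELECTROLYTE_WH_PER_L = 25.0   # assumed, all-vanadium, order of magnitude
STACK_W_PER_M2 = 1000.0       # assumed net power density of stack area

def size_redox_flow(energy_kwh, power_kw):
    tank_volume_l = energy_kwh * 1000.0 / ELECTROLYTE_WH_PER_L
    stack_area_m2 = power_kw * 1000.0 / STACK_W_PER_M2
    return tank_volume_l, stack_area_m2

if __name__ == "__main__":
    # a 4 h system: 100 kW / 400 kWh
    tank_l, stack_m2 = size_redox_flow(400.0, 100.0)
    print(f"tank volume : {tank_l:9.0f} l (both tanks together)")
    print(f"stack area  : {stack_m2:9.0f} m^2")

doubling the discharge time only doubles the (cheap) tank volume, which is why the technology scales well towards high-energy applications.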
in principle, this battery technology is very well suited for large-scale applications, because bigger tanks can be constructed very easily and effectively. important combinations of salts under investigation are, e.g., nabr–na2s4/na2s2–nabr3 (regenesys), fe/cr, br2/cr, or vanadium. among these, the all-vanadium redox-flow battery is particularly interesting, because the same material is used for both electrodes. these are meanwhile commercially available from two or three manufacturers. in japan, installations have already been built and operated for years in load-leveling applications. other material combinations are still at the research stage. the system efficiency (including the energy consumption of pumps, etc.) of most systems is in the range of 60 to 75 %. redox-flow batteries are typical high-energy batteries and are less suited for power applications with discharge times below one hour [10].

zinc/bromine batteries are similar to redox-flow batteries. the battery consists of a zinc negative electrode and a bromine positive electrode separated by a microporous separator. an aqueous zinc/bromide solution is circulated through the two compartments of the cell from two separate reservoirs. zinc is deposited in the stack during operation; this is a major difference from redox-flow batteries, where the stacks themselves remain unchanged during charging and discharging. the zinc/bromine battery can be repeatedly fully discharged without any damage to the battery. it is predominantly made of low-cost, recyclable plastics and is manufactured with techniques suitable for mass production at low production costs. today only one company is actively developing and supplying zinc/bromine batteries, and the zinc-bromine energy storage system is now in the first stages of commercialization; there are already some stationary installations. the performance of zinc-bromine batteries is similar to that of redox-flow batteries; the efficiency at high rates is lower [11].

4 economic assessment and discussion

when evaluating the most suited storage technologies, it is necessary to define the boundary conditions precisely in terms of power, energy, response time and capital costs in order to achieve comparable results. three classes of storage applications are discussed in this paper with regard to suited storage technologies and cost estimations:
a) long-term storage (500 mw, 100 gwh, 200 h full load, ~1.5 cycles per month)
b) load-leveling (1 gw, 8 gwh, 8 h full load, 1 cycle per day)
c) peak-shaving at distribution level (10 mw, 40 mwh, 4 h full load, 2 cycles per day)
a life cycle cost (lcc) analysis has been performed for the different storage technologies with respect to these three application classes. each application is defined by the required charge/discharge power, the necessary energy content (resulting in the effective gross energy content of the storage), the number of cycles per day, and the required overall system lifetime. if a storage technology is not able to achieve the required system lifetime, the storage system or the relevant subsystems are assumed to be replaced, and the costs are accounted accordingly. the lcc takes into account the investment costs for the storage medium itself, including the necessary auxiliaries and the power interfaces for charging and discharging, resulting in corresponding capital costs.
the lifetime – for some technologies depending on the cycle depth – is also taken into account, as well as the costs of buying electrical energy to compensate the overall losses. all calculations are based on a capital cost rate of 8 %/year. for caes systems, only the new adiabatic technology is taken into account. in the following, the cost figures are given as capital costs related to the electricity output of the systems. the width of each cost band can be interpreted as "state of the art" (high value) and "achievable costs" expected in 5 to 10 years, based on known technology and mass production (low value). data from the literature, studies and expert knowledge was used for the different technologies [7]. batteries have a significant cost reduction potential in large-scale, fully automated production.

for the class a application (fig. 1), hydrogen storage can benefit from low volume-related costs due to its very high energy density compared to caes.

fig. 1: comparison of storage systems for long-term storage, class a (costs in €ct/kwh, today and in more than 10 years; pumped hydro depending on location)
fig. 2: comparison of storage systems for load-leveling, class b (costs in €ct/kwh, today and in more than 10 years; pumped hydro depending on location)
fig. 3: comparison of storage systems for peak-shaving at distribution level, class c (costs in €ct/kwh, today and in 5 to 10 years, for lead-acid, nicd, lithium-ion, redox-flow (vanadium), zinc-bromine, and nas and nanicl (zebra) high-temperature batteries)

pumped hydro storage systems could be used as long-term storage at lower costs, but the technical potential of appropriate sites is very limited, whereas salt caverns for hydrogen storage could be made available in sufficient quantity in suited regions. class b is covered today by pumped hydro power plants, which are still the most economic solution (fig. 2). however, increasing the number of pumped hydro systems is limited by the lack of geographically suited sites and by decreasing public acceptance. compressed air stored underground in large salt caverns appears to be an economically interesting and technically feasible alternative; especially the new adiabatic caes concepts show big advantages with regard to efficiency and ecological aspects. battery technologies can generally be used for this application as well, centralized or decentralized. here, the lowest costs are expected to be in the range of 8 to 12 €ct/kwh. although this is well above the costs shown in fig. 2, it has to be taken into account that all battery systems can also deliver primary reserve due to their very fast response times of less than 10 ms (see also table 1).

for storage systems of class c (fig. 3), for medium voltage applications in the context of smart grids supporting micro-grid or virtual power plant concepts, various electrochemical battery storage systems are competing. under best-case assumptions, the nas technology is ahead at the moment; in japan, commercial applications with storage capacities of up to 50 mwh have been in operation for several years. the conventional lead-acid technology is still one of the most economic systems, although it suffers today from a very limited lifetime. with regard to battery technologies, other pros and cons also have to be discussed which cannot be expressed in terms of cost effectiveness [12].
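the lcc scheme described above can be condensed into a few lines. a simplified sketch for the class c boundary conditions (10 mw, 40 mwh, 2 cycles per day), with assumed, illustrative battery cost figures that are not the study's data; replacements are annualized here without discounting, which the full method presumably handles more carefully:

import math

def annuity_factor(rate, years):
    return rate * (1 + rate) ** years / ((1 + rate) ** years - 1)

def lcc_ct_per_kwh(capex_eur_per_kwh, cycle_life, efficiency,
                   energy_kwh=40_000, cycles_per_day=2, system_years=20,
                   rate=0.08, electricity_ct_per_kwh=5.0):
    """capital and loss costs per kwh of electricity output, in euro-cents."""
    cycles_total = cycles_per_day * 365 * system_years
    n_installs = math.ceil(cycles_total / cycle_life)   # initial system + replacements
    annual_capex_eur = (n_installs * capex_eur_per_kwh * energy_kwh
                        * annuity_factor(rate, system_years))
    annual_output_kwh = energy_kwh * cycles_per_day * 365
    loss_ct = annual_output_kwh * (1.0 / efficiency - 1.0) * electricity_ct_per_kwh
    return (annual_capex_eur * 100.0 + loss_ct) / annual_output_kwh

if __name__ == "__main__":
    # assumed example figures (capex in eur/kwh, cycle life, efficiency):
    print(f"lead-acid : {lcc_ct_per_kwh(150.0, 2500, 0.82):.1f} ct/kWh")
    print(f"li-ion    : {lcc_ct_per_kwh(300.0, 5000, 0.92):.1f} ct/kWh")

the example shows the dominant mechanism: a short cycle life forces replacements, which can make a nominally cheap technology as expensive per output kwh as a costlier, longer-lived one.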
5 vehicle-to-grid concepts

plug-in hybrids (phev), including the vehicle-to-grid concept (v2g), have come to the attention of several major players in the energy and mobility sector just in the past two years. phevs with an electric driving range of 30 to 60 km, based on 5 to 10 kwh of battery storage, have become a realistic option. electric motors, power controls, etc. are currently under development or already on the market. lithium-ion batteries show very promising results in the lab with regard to cycle lifetimes, and the cost projections are good as well. if plug-in hybrids are connected to the grid with bidirectional converters (e.g. 3 to 5 kw power), positive and negative control power is available. a single car accounts for 3 to 5 kw and 10 kwh. 10 % of the cars in germany would already offer 3 to 5 gw for 8 hours, which is in the order of all pumped hydro storage systems available in germany; 100 % of the cars in germany could serve the full electric load for more than 3 hours. an additional advantage is the uniform distribution throughout the grid. this offers remarkable options for the control and stability of the power grid.

the concept is very interesting for multiple reasons. car manufacturers get an option for individual mobility beyond the pure gasoline car. the average driving distance per car in germany is approx. 36 km/day, so a battery with 10 kwh would cover a significant part of the driving from electricity. on the other hand, hybrid cars still have an internal combustion engine, and therefore the overall performance and range of the cars is not limited by the electrical drive. utilities get an additional market for electricity and at the same time a technical solution for the problems which may arise from fluctuating power generation from wind and solar, at least on a time scale of 24 hours; significantly increased shares of such electricity production become a realistic technical option based on the additional storage capacities. in addition, economic analyses show that installed storage power from plug-in hybrids can be cheaper for utilities than pumped hydro systems. car owners save money when replacing gasoline by electricity, and the utilities pay for the usage of the storage systems; this can balance the additional costs of the batteries. moreover, the owners do not have to limit themselves with regard to the comfort of their automobile. the environment, and thus society, benefits from a significant co2 reduction due to replacing gasoline by electricity from co2-free power generators (renewables, coal/gas with co2 sequestration, nuclear). this will also extend the availability of cheap oil. the concept can work from a battery point of view, because the expected lifetime of lithium-ion batteries, in the range of 5,000 cycles, is sufficient for 8 years of daily electric driving plus an additional daily cycle for the grid. storage systems in the grid are typically paid for being available in case of emergency (primary and secondary reserve); in fact they are needed very seldom, and the real number of additional cycles will be quite small [13].
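the fleet aggregation argument can be checked by back-of-envelope arithmetic. a sketch, where the german fleet size (~41 million passenger cars) is an assumed round figure; per-car power and energy follow the text:

# back-of-envelope check of the fleet aggregation figures above.
N_CARS = 41e6            # assumed fleet size, round figure
SHARE = 0.10             # fraction of cars participating
P_PER_CAR_KW = (3.0, 5.0)
E_PER_CAR_KWH = 10.0

def fleet_figures(grid_power_gw):
    n = N_CARS * SHARE
    p_lo, p_hi = (n * p / 1e6 for p in P_PER_CAR_KW)   # GW
    energy_gwh = n * E_PER_CAR_KWH / 1e6
    hours = energy_gwh / grid_power_gw
    return p_lo, p_hi, energy_gwh, hours

if __name__ == "__main__":
    p_lo, p_hi, e_gwh, h = fleet_figures(grid_power_gw=5.0)
    print(f"connectable power : {p_lo:.0f}-{p_hi:.0f} GW")
    print(f"stored energy     : {e_gwh:.0f} GWh -> {h:.1f} h at 5 GW")

with these assumptions, the connectable power (roughly 12–20 gw) comfortably exceeds the 3–5 gw actually requested, and the stored energy of about 41 gwh yields the quoted 8 hours at 5 gw.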
6 summary

in future power supply systems, characterized by an increasing share of renewable energy resources, the broad use of energy storage systems is a prerequisite for coping with the fluctuating nature of wind and solar energy. for long-term storage, with the aim of covering also long-lasting periods of up to a few weeks without wind, only hydrogen stored in salt caverns is a technically feasible solution, due to its low specific storage costs.

for mid-term storage, providing control and reserve power, pumped hydro plants compete with adiabatic caes. although pumped hydro shows economic advantages, the construction of new plants seems to be difficult due to the lack of geographically suited sites and decreasing public acceptance, whereas a sufficient number of underground salt caverns could be made available. salt caverns built for caes applications might be converted into hydrogen storage systems if necessary.

for load-leveling applications at the distribution level, various battery technologies compete. for the time being, the lead-acid technology is still one of the most economic systems, and further progress in improving its lifetime seems to be possible. new battery technologies have to be investigated further in order to arrive at more reliable numbers with regard to investment and operating costs, as well as lifetime. at the present time, high-temperature batteries are the most interesting technology. lithium-ion and redox-flow technologies are comparable with regard to their potential. however, the lithium-ion technology is the only one which can be considered for plug-in hybrid or full electric vehicles, and with an increasing number of such vehicles, together with a suitable control system, these cars may represent a huge storage system that can also be used for grid purposes (vehicle-to-grid).

acknowledgments

the research described in the paper was supervised by prof. dr. dirk uwe sauer, head of the electrochemical energy conversion and storage systems group at the institute for power electronics and electrical drives (isea), rwth aachen university.

references

[1] giesecke, j., mosonyi, e.: wasserkraftanlagen – planung, bau und betrieb. berlin, heidelberg: springer, 1997.
[2] heyder, b.: energy storage systems in the electricity network. ecpe seminar: energy storage technologies, aachen, germany, 2007.
[3] wietschel, m., hasenauer, u., vicens, n. j., klobasa, m., seydel, p.: ein vergleich unterschiedlicher speichermedien für überschüssigen windstrom. zeitschrift für energiewirtschaft, 2/2006.
[4] ipa energy consulting: hydrogen technology systems. report to the scottish executive, edinburgh, scotland, 2005.
[5] lipman, t. e., ramos, r., kammen, d. m.: an assessment of battery and hydrogen energy storage systems integrated with wind energy resources in california. report by the university of california, berkeley, for the california energy commission, usa, 2005.
[6] handschin, e., styczynski, z.: power system application of the modern battery storage. magdeburger forum zur elektrotechnik, magdeburg, germany, 2004.
[7] leonhard, w., buenger, u., crotogino, f., gatzen, ch., glaunsinger, w., huebner, s., kleinmaier, m., koenemund, m., landinger, h., lebioda, t., sauer, d. u., weber, h., wenzel, a., wolf, e., woyke, w., zunft, s.: etg study on energy storage: energiespeicher in stromversorgungssystemen mit hohem anteil erneuerbarer energieträger. power engineering society within vde (etg), germany, 2008, www.vde.com/vde/fachgesellschaften/etg.
[8] schoenung, s. m., hassenzahl, w. v.: long- vs. short-term energy storage technologies analysis: a life-cycle cost study. study by sandia national laboratories for the doe energy storage systems program, livermore, california, usa, 2003.
[9] galloway, r. c., dustmann, c.-h.: zebra battery – material cost, availability and recycling. evs20, long beach, california, usa, 2003.
[10] sauer, d. u., jossen, a.: advances in redox flow batteries. first international renewable energy storage conference (ires i), gelsenkirchen, germany, 2006.
[11] jonshagen, b.: the zinc bromine battery for renewable energy storage. first international renewable energy storage conference (ires i), gelsenkirchen, germany, 2006.
[12] kleinmaier, m., buenger, u., crotogino, f., gatzen, ch., glaunsinger, w., huebner, s., koenemund, m., landinger, h., lebioda, t., leonhard, w., sauer, d. u., weber, h., wenzel, a., wolf, e., woyke, w., zunft, s.: energy storage for improved operation of future energy supply systems. cigre session 2008 (international council on large electric systems), paris, france, 2008, paper c6-301.
[13] sauer, d. u., blank, t., kowal, j., magnor, d.: energy storage technologies for grids with high penetration of renewable energies and for grid connected pv systems. 23rd european photovoltaic solar energy conference, valencia, spain, 2008.

heide meiwes
e-mail: mw@isea.rwth-aachen.de
electrochemical energy conversion and storage systems group
institute for power electronics and electrical drives (isea)
rwth aachen university
jägerstrasse 17/19, 52066 aachen, germany
properties of starch based foams made by thermal pressure forming

j. štancl, j. skočilas, j. šesták, r. žitný

abstract

packaging materials based on expanded polystyrene can be substituted by biodegradable foam, manufactured by direct or indirect electrical heating of a potato starch suspension in a closed mold. this paper deals with an experimental evaluation of selected properties of potato starch and starch foam related to this technology: the density, specific heat capacity and specific electrical conductivity of a water suspension of potato starch within the temperature range up to 100 °c and mass fractions from 5 to 65 %. the changes of electric conductivity and heat capacity were observed during direct ohmic heating of a starch suspension between electrodes in a closed cell (feeding voltage 100 v, frequency 50 hz). the specific electric conductivity increases with temperature, with the exception of the gelatinization region at 60 to 70 °c, and decreases with increasing concentration of starch (the temperature and concentration dependencies were approximated using a lorentz equation). direct ohmic heating is restricted by a significant decrease in effective electrical conductivity above a temperature of 100 °c, when evaporated steam worsens the contact with the electrodes. experiments show that when direct ohmic heating is not combined with indirect contact heating, only 20 % of the water can be evaporated from the manufactured samples and the starch foam is not fully formed; this is manifested by only a slight expansion of the heated sample. only indirect contact heating from the walls of the mold, with a wall temperature above 180 °c, forms a fixed porous structure (expansion of about 300 %) and a crust, ensuring suitable mechanical and thermal insulation properties of the manufactured product. the effective thermal conductivity of the foamed product (sandwich plates with a porous core and a compact crust) was determined by the heated wire method, while the porosity of the foam and the thickness of the crust were evaluated by image analysis of colored cross sections of manufactured samples. while the porosity is almost constant, the thickness of the crust is approximately proportional to the thickness of the plate.

keywords: starch, starch suspension, starch foam, direct ohmic heating, indirect heating, electric conductivity of a starch suspension.

1 introduction

thermally or pressure formed starch based materials can be used in the packaging industry for manufacturing dishes, cups and containers, as a substitute for petroleum based plastics, especially expanded polystyrene. packaging materials manufactured from starch compost easily in nature and are biodegradable.
these products are manufactured by extraction, extrusion, high pressure compression (explosion), see glenn et al. [1], or by heating starch suspensions in closed molds. this process is similar to baking a dough, but at a higher pressure: it includes not only one-phase heating of a liquid suspension, but also the formation of steam bubbles, accompanied by sample expansion and the formation of a crust. mathematical modeling of the process requires knowledge of the density, heat capacity, thermal conductivity and enthalpy of evaporation; see zanoni et al. [2], wang and hayakawa [3], maroulis et al. [4]. sorption isotherms for potato starches were presented by lind and rask [5] and cha et al. [6], and some rheological properties of the liquid matrix can be found in schwarzberg [7], wang [8] and lagarrigue [9]. it should be noted that the data presented in these references is restricted to a narrow range of concentrations and temperatures.

a disadvantage of the process is the long heating time, which can be reduced by combining classical contact heating from hot walls with volumetric heating, for example by using microwave or direct ohmic heating. the aim of this paper is to evaluate the most important parameter for direct ohmic heating, which is the effective electric conductivity of the processed sample. the overall conductivity is affected by the specific electric conductivity of the starch suspension and by the contact resistance at the electrodes of the heating apparatus. little information is available on these properties for starch based systems, see wang and sastry [8]; their work is restricted to low concentrations of potato starch (up to 0.2, with the electric conductivity of the water suspension artificially increased by nacl) and temperatures below 90 °c.

the functional properties of the manufactured products are determined by their mechanical properties (strength), see glenn et al. [1], stability and resistance (permeability and absorptiveness), and thermal insulation properties. the most important characteristics affecting these properties are the effective thermal conductivity, the porosity and the thickness of the crust. scanning electron micrographs of baked starch foams are presented by shogren et al. [10], and some selected properties of the crust are discussed by zanoni [11].

2 thermodynamic parameters of the starch suspension

the potato starch used in the experiments with thermal pressure forming reported by skočilas et al. [12], and also for evaluating the electrical properties of starch suspensions presented in this paper, is commercially available solamyl and naturamyl powder, produced by natura, czech republic. the composition of the powder and the basic thermodynamic parameters of the water suspension are summarized in table 1. it should be noted that the data presented in table 1 is not sufficient for a full description of starch under the typical conditions of thermal forming, where temperatures change from room temperature up to 180 °c and the moisture content decreases from approximately 50 % almost to zero. some properties change abruptly during gelatinization at about 70 °c, and significant changes in rheological properties occur at the glass transition temperature of about 180 °c, affecting the formation of the crust.
it is important to note that starch and starch-water systems are not homogeneous materials: starch powder in fact consists of starch grains of typically 50 μm in size (the distribution for potato starch spans grains from 10 to 100 μm). this is important for the formation of steam bubbles, because the grains in a water suspension play the role of nucleation centers.

3 direct ohmic heating and the electric properties of a potato starch suspension

the process time necessary for heating a starch suspension inside a closed mold can be reduced by volumetric heating, for example making use of an electric current passing through a sample compressed between two electrodes (in reality, volumetric heating has to be combined with indirect contact heating, because only then is it possible to create a crust, ensuring the desirable mechanical properties of the product). accelerated ohmic heating is possible only if there are ions passing an electric charge through the sample. their presence and mobility are manifested by a positive specific electrical conductivity σ of the heated material, depending upon the composition of the starch suspension and the temperature.
the specific electrical conductivity σ is easily measurable in liquids, but, for example during gelatinization, which is accompanied by volumetric changes, standard conductivity probes cannot be applied. it should also be taken into account that standard instruments apply rather high feeding frequencies, so the results need not be directly applicable to industrial processes at much lower frequencies (50–60 hz). different instruments were therefore used for low concentrations of naturamyl potato starch mixed with tap water (mass fraction of starch from 0.05 up to 0.35) and for high concentrations (mass fraction from 0.55 to 0.65), when the mixture has the consistency of a viscous paste. the experimental setup is shown in fig. 1.

table 1: thermodynamic properties of the potato starch suspension

potato starch powder:
property | units | value / correlation | source
dry matter content of powder (mass fraction) | % | >80 | manufacturer
ash (mass fraction in dry material) | % | <0.5 | manufacturer
nitrogen compounds (in dry material) | % | <0.1 | manufacturer
so2 (mass fraction in dry material) | % | <5·10⁻⁵ | manufacturer
density of pure potato starch ρs (x – relative moisture) | kg·m⁻³ | 1440 + 740 x − 1940 x² | maroulis [4]
thermal conductivity λ as a function of temperature | w·m⁻¹·k⁻¹ | 0.0976 + 0.00167 t, t = (25–70) °c | lind and rask [5]
specific heat capacity cs as a function of temperature (t in °c) | j·kg⁻¹·k⁻¹ | 1136 + 3.56 t | wang [3]
specific heat capacity cs as a function of moisture | j·kg⁻¹·k⁻¹ | 1420 + 4340 x | schwarzberg [7]

water suspension:
property | units | value / correlation | source
ph | – | 5–7 | this work
iodine reaction / color | – | dark blue | this work
density ρ as a function of mass fraction of starch ξ (solamyl) | kg·m⁻³ | 995.66 + 403.89 ξ, 0.1 ≤ ξ ≤ 0.4, t = 22 °c | pycnometric method, 12 experiments
thermal conductivity λ | w·m⁻¹·k⁻¹ | 0.4168 + 0.00066 t − 0.00000036 x³ + 0.0000095 t x, t = (80–120) °c, x = 40–71 | wang [3]
sorption isotherm (oswin) | – | x = a[aw/(1 − aw)]^0.38, ln a = −0.0071 t + 4.5, t = (30–90) °c | lind and rask [5]
sorption isotherm | – | x = exp(−0.000273/emc^2.418), t = (120–160) °c, emc – equilibrium moisture content on dry basis | cha [6]
enthalpy of evaporation | kj·kg⁻¹ | 2050, 2256 | wu [13], fan [14]
flow behavior index n | – | 0.2–0.6 | parker [15], fan [14], lagarrigue [9]
coefficient of consistency k | pa·sⁿ | 0.49 (melt) | fan [14]

in the low concentration range, two cylindrical graphite electrodes (see fig. 1) were submerged into a homogenized starch suspension, and the electric current i, together with a small applied voltage u (frequency 50 hz), was recorded by a powermeter (lmg 95, zimmer electronic systems). the specific electrical conductivity was evaluated from

$$\sigma = C\,\frac{I}{U},\qquad(1)$$

where c is a calibration constant, evaluated for the fixed geometry of the cylindrical electrodes by comparison with results obtained by a wtw conductometer with a tetracon 625 four-electrode probe. this comparison was performed with water and with very low concentrations of starch at temperatures under 60 °c. the calibration constant c was then applied to a series of steady state experiments with the graphite electrodes at temperatures of (20–90) °c and for mass fractions of starch of 0.05, 0.15, 0.25 and 0.35. the results show a steady decrease in conductivity σ with increasing mass fraction of starch, and a monotonous increase in σ with temperature.
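evaluating eq. (1) from the recorded readings is straightforward. a sketch with invented sample readings and an assumed cell constant (in the experiments, the real constant comes from the calibration against the reference conductometer described above):

def conductivity(i_amp, u_volt, c_per_m):
    """eq. (1): sigma in S/m from current, voltage and cell constant C (1/m)."""
    return c_per_m * i_amp / u_volt

if __name__ == "__main__":
    C = 25.0  # assumed cell constant of the graphite-electrode pair, 1/m
    for i, u in [(0.012, 10.0), (0.020, 10.0)]:   # invented readings
        print(f"I = {i*1e3:5.1f} mA, U = {u:4.1f} V -> "
              f"sigma = {conductivity(i, u, C):.4f} S/m")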
the experimental data (values of conductivity from 32 experiments) can be approximated with acceptable accuracy by a lorentz formula as a function of temperature and mass fraction,

$$\sigma(t,\xi) = \frac{\sigma_0}{\left[1+\left(\dfrac{t-t_1}{t_2}\right)^{2}\right]\left[1+\left(\dfrac{\xi-\xi_1}{\xi_2}\right)^{2}\right]},\qquad(2)$$

where the parameters σ0, t1, t2, ξ1, ξ2, shown in the first row of table 2, were identified by regression analysis using sigmaplot software. the accuracy of formula (2) can be estimated at about 10 %, compared with the data for the electric conductivity of tap water presented by metaxas [16].

fig. 1: experimental setup for measuring the specific electrical conductivity of a starch suspension
fig. 2: electric conductivity (experimental data and eq. (2)) for mass fractions of potato starch ξ = 0.55, 0.6, 0.65

table 2: coefficients of the lorentz relationship (2)

ξ range | t range (°c) | σ0 (s·m⁻¹) | ξ1 | ξ2 | t1 (°c) | t2 (°c)
0.05–0.35 | 20–90 | 0.087 | 0.0783 | 0.4122 | 92.84 | 63.84
0.55–0.65 | 25–100 | 64.18 | 0.4257 | 0.00254 | 93.64 | 94.17

results for higher mass fractions of starch (0.55 to 0.65) were obtained from experiments in a closed cell, corresponding to the technology of thermal forming, with a cylindrical sample of starch paste located between two planar stainless steel electrodes (fed by a higher voltage, typically 100 v, at 50 hz). a cross section of this ohmic cell is shown in the right part of fig. 1. a constant initial mass of starch paste (50 g) was used in all experiments, and the corresponding contact surface s was calculated from the density (see table 1) and the distance of the electrodes h = 10 mm, giving s = 0.00411 m² at ξ = 0.55 and s = 0.00397 m² at ξ = 0.65. the effective electric conductivity was evaluated using eq. (1) with the calibration constant c = h/s, as follows from the assumption that the intensity of the electric field (= u/h) inside the heated sample is uniform. the electric conductivity was evaluated continuously during heating at an approximately constant voltage u, recorded together with the electric current i by the lmg-95 powermeter. at the same time, the continuous increase in temperature at the center of the sample was recorded by a t-type thermocouple, electrically separated using an insulated amplifier. the thermometer enabled not only assignment of a temperature to the corresponding electric conductivity, but also estimation of the specific heat capacity from a simplified enthalpy balance. the resulting specific electrical conductivities are presented in fig. 2.

a characteristic feature of the temperature courses is a temporary decrease in electric conductivity (a typical s-shape) at around (60–80) °c, which manifests starch gelatinization. the temporary decrease in electric conductivity during gelatinization is also manifested by peaks of electric power during continuous heating (fig. 3). the experimental data was approximated by eq. (2) with the parameters in the second row of table 2, see fig. 2. it is obvious that the local conductivity decrease cannot be described by this simple correlation.
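the correlation (2) with the table 2 coefficients is easy to evaluate. a sketch assuming the two-factor lorentzian form given above; the outputs illustrate the reported trends (decrease with ξ, increase with t up to about 93 °c):

def sigma_lorentz(t_c, xi, s0, xi1, xi2, t1, t2):
    """eq. (2): specific electrical conductivity in S/m."""
    return s0 / ((1.0 + ((t_c - t1) / t2) ** 2)
                 * (1.0 + ((xi - xi1) / xi2) ** 2))

LOW = dict(s0=0.087, xi1=0.0783, xi2=0.4122, t1=92.84, t2=63.84)    # xi 0.05-0.35
HIGH = dict(s0=64.18, xi1=0.4257, xi2=0.00254, t1=93.64, t2=94.17)  # xi 0.55-0.65

if __name__ == "__main__":
    print(f"xi=0.05, 20 C : {sigma_lorentz(20.0, 0.05, **LOW):.4f} S/m")
    print(f"xi=0.35, 90 C : {sigma_lorentz(90.0, 0.35, **LOW):.4f} S/m")
    print(f"xi=0.55, 90 C : {sigma_lorentz(90.0, 0.55, **HIGH):.4f} S/m")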
the experiments carried out in the ohmic heating cell are in fact feasibility tests of the assumed technology of combined (direct and indirect) ohmic heating. the tests revealed the limits of direct ohmic heating, represented by a significant reduction in effective electric conductivity due to the formation of an insulating steam cushion. the effect of the insulating steam was confirmed by experiments in which the power supply was switched off temporarily within the evaporation phase (to let the steam escape) and then switched on again: after switching the power supply on, the electric power returns almost to the original high value (slightly lower; the decreasing peak values of the electric power correspond to the decreasing moisture of the gel). the reduction in delivered electric power during evaporation of the water is clearly shown in fig. 3. comparing the mass of the samples at the beginning and at the end of the heating process gives the amount of evaporated water. these values, together with the specific heat capacity estimated from the recorded time profiles of the temperature, are summarized in table 3.

table 3: summary of the experiments on direct ohmic heating of a starch suspension

ξ | ratio of evaporated water, measured (%) | specific heat capacity, median 25–60 °c (j·kg⁻¹·k⁻¹) | process time from 25 °c up to 90 °c (s) | process time for indirect heating (wall temperature 180 °c), estimated from the model of skočilas [12] (s)
0.55 | 18.67 | 1815 ± 235 | 65 | 70
0.60 | 11.90 | 1690 ± 230 | 141 | 68
0.65 | 7.31 | 1562 ± 231 | 378 | 66

some preliminary conclusions can be derived from the table: direct ohmic heating can accelerate only the first stage of heating, and cannot be used alone, without simultaneous heating by the hot wall of the mold. a product manufactured using only direct ohmic heating has properties different from those induced by indirect heating in a closed mold: such a product is compact, without pores, hard and tough. similar conclusions were obtained from experiments with microwave heating of a potato starch suspension in an ohmic cell inserted into a microwave oven (the cell is made from plastic, so only the electrodes had to be removed). although a slightly greater amount of water was evaporated, the mechanical properties of the products were unsuitable, due to the absence of a crust, and the process could not be completed due to problems with local overheating (burnout).

fig. 3: time course of temperature and electric power, corresponding to a constant feeding voltage u = 100 v, initial mass 50 g, mass fraction of starch ξ = 0.55
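the specific heat estimates in table 3 follow from the simplified enthalpy balance of the cell, m·cp·(dT/dt) = u·i, with heat losses neglected. a sketch with an invented snapshot of readings:

def cp_from_heating(u_volt, i_amp, mass_kg, dT_dt):
    """specific heat capacity in J/(kg K) from electric power and heating rate."""
    return u_volt * i_amp / (mass_kg * dT_dt)

if __name__ == "__main__":
    # assumed snapshot: 100 V feeding, 1.6 A, 50 g sample heating at 1.8 K/s
    print(f"cp ~ {cp_from_heating(100.0, 1.6, 0.050, 1.8):.0f} J/(kg K)")

with this invented snapshot the estimate comes out near 1800 j/(kg k), i.e. in the range that table 3 reports for ξ = 0.55.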
4 properties of manufactured sandwich plates

thermal processing of potato starch (and the related material properties) is one thing; another is the properties of the manufactured products, foam sandwich plates with a crust. the most important parameter is the porosity of the foam, and it must be known not only for modeling the production process (expansion inside the mold): porosity determines all the mechanical properties and the thermal insulation capacity of the products. the higher the porosity, the lower the thermal conductivity (the better the insulation); the mechanical properties deteriorate, but this can be compensated by a higher strength of the crust. there are several practical questions that should be answered. can we assume that the porosity is uniform throughout the sample, even if it expands inside the mold several times? what factors affect the porosity: the thickness of the plate, the initial mass fraction of starch, the temperature of the mold? is it possible to determine the porosity of the product from a thermal conductivity measurement, and vice versa? similar questions concern the thickness of the crust: is it independent of the thickness of the manufactured sandwich plate, etc.? some of these questions are discussed in this section, which analyses the properties of sandwich plates manufactured from naturamyl potato starch (mass fraction of starch ξ = 0.5) in a mold with electrically heated walls (maintained at a constant temperature of 180 °c). a schematic representation of the apparatus is shown in fig. 4.

fig. 4: indirectly heated mold (h = 4 and 8 mm, cavity 200 × 270 mm)

the manufactured products were plates corresponding to the dimensions of the cavity, 200 mm × 270 mm × h, where the thickness h was adjusted to 4 or 8 mm. the volume of the starch suspension poured onto the bottom of the mold at the beginning of the experiment was approximately 3 times less than the volume of the cavity (the corresponding porosity of the manufactured plate should accordingly be ε = 2/3, but this is only an approximation, because a part of the expanded foam escapes through the venting slits and, moreover, the density of the suspension differs from the density of thermally processed starch).

4.1 thermal conductivity

the thermal conductivity of the samples was measured using a kemtherm qtm-3 thermal conductivity meter (kyoto electronics). a probe in the form of a thin, ohmically heated metallic band is applied to the surface of the sample, and the thermal conductivity λ is evaluated from the recorded time course of the band temperature (in principle a transient heated wire method). since the qtm-3 instrument requires sufficiently thick samples, at least three plates had to be stacked for measurements on the thin plates (h = 4 mm).

table 4: thermal conductivity of the processed sandwich plates (thicknesses h = 4 and 8 mm), λ in w·m⁻¹·k⁻¹

measurement no. | h = 8 mm, boundary | h = 8 mm, center | h = 4 mm, boundary | h = 4 mm, center
1 | 0.0693 | 0.0774 | 0.0696 | 0.0700
2 | 0.0693 | 0.0773 | 0.0708 | 0.0692
3 | 0.0694 | 0.0766 | 0.0667 | 0.0690
4 | 0.0688 | 0.0767 | 0.0691 | 0.0696
5 | 0.0691 | 0.0761 | 0.0685 | 0.0698
mean λ | 0.069 ± 0.00024 | 0.077 ± 0.00054 | 0.069 ± 0.0015 | 0.070 ± 0.00041
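the qtm-3 evaluation itself is proprietary, but the transient heated-wire principle underlying it uses the standard working equation λ = q·ln(t2/t1)/[4π(T2 − T1)], where q is the line power of the heated band. a sketch with invented readings; this is the textbook relation, not necessarily the instrument's exact algorithm:

import math

def hot_wire_lambda(q_w_per_m, t1_s, t2_s, T1_c, T2_c):
    """thermal conductivity in W/(m K) from the linear ln(t) temperature rise."""
    return q_w_per_m * math.log(t2_s / t1_s) / (4.0 * math.pi * (T2_c - T1_c))

if __name__ == "__main__":
    # assumed: 2 W/m line power, band warms from 25.0 to 29.1 C between 10 s and 60 s
    lam = hot_wire_lambda(2.0, 10.0, 60.0, 25.0, 29.1)
    print(f"lambda ~ {lam:.3f} W/(m K)")   # ~0.070, the order of table 4

the invented readings were chosen so that the result lands near the 0.07 w·m⁻¹·k⁻¹ values of table 4.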
assuming a uniform arrangement of spherical pores in a cubic grid, thermal conductivity �f can be expressed by � � � � � � � � � �f s f a s f � � � � � � � � � � � � 1 3 4 1 3 8 2 3 4 2 12 3 3 � � � � a s a s � � � � � � � � � � � � � � � � � � 2 , (3) where �s is the thermal conductivity of the starch matrix and �a is the conductivity of the air inside the pores. this relationship follows from the analytical solution of the temperature field around a single sphere (�a), which is surrounded by an infinite medium having conductivity �s. therefore the result is valid only for relatively small porosities, �f <0.5. relationship (3) can be approximated by a linear asymptote, corresponding to the case of two parallel resistors (parallel channels of air and starch). � � � � � f s f a s � � � � � 1 . (4) using eq. (4), the following expression for the mean thermal conductivity � can be derived (summing the thermal resistances of compact crust and foam) � � � � � � � s s f f a s � � � � � � � � � � � � � � h h h h h h * * * * * ( )1 1 1 1 1 * * , 1 1 1� � � � � � � � �h a s (5) where h* is the relative thickness of an idealized compact crust without pores, and � �� �f ( )*1 h is the mean porosity of the whole sandwich plate. eq. (5) can be used for an evaluation of porosity from the measured mean thermal conductivity of plate �, or conversely for an evaluation of the thermal conductivity of the starch matrix �s from measured porosity �. a problem is that the accuracy of eq. (3, 4) decreases with increasing porosity. results for porosity � greater than 0.7 (and this is the typical porosity of manufactured sandwiches) are unreliable. eq. (5) was applied only for an evaluation of thermal conductivity �s from the results obtained with mechanically compressed sandwiches (see next section). these plates were characterized by reduced porosity � � 0.54, thermal conductivity of sample � � 0.09 w�m�1�k�1 (� a � 0.025 w�m �1�k�1), giving � s � 0.167 for h*= 0 and � s � 0.175 w�m �1�k�1 for h* � 0.1. 50 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 48 no. 6/2008 0 0.1 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0.2 0 0.1 0.3 0.4 0.5 0.6 0.7 0.80.2 � � f s / � �a s/ 0.2� � �a s/ 0.4� �f –[ ] fig. 5: temperature conductivity of foam as a function of the porosity of foam (dotted lines are linear asymptotes) 4.2 porosity and crust porosity was determined in two ways, first by compressing the sandwich plates in a hydraulic press, giving the relationship between the load (acting force) and the reduced thickness of the sample. at a sufficiently great force, the thickness became almost independent of the load, and it can be assumed that the compacted sandwich had approximately zero porosity. comparing the final volume of the compacted sandwich with the initial volume, the porosity can be evaluated. when the cross-section of the hydraulic press clamps was large (rectangular contact surface 60 × 30 mm and the tested sandwich plates exceeded this surface), it was assumed that the lateral expansion is small and therefore the volume reduction was calculated only from the displacement of the clamps. the results presented in table 6 agree quite well with previously obtained results for a lower mass fraction of starch � � 0.4, when mean porosity was also 0.7, žitný [17]. the thermal conductivity of the compacted plates was measured, and is compared with the thermal conductivity of the foamed plates in tab. 4. 
4.2 porosity and crust

the porosity was determined in two ways. first, the sandwich plates were compressed in a hydraulic press, giving the relationship between the load (acting force) and the reduced thickness of the sample. at a sufficiently large force, the thickness becomes almost independent of the load, and it can be assumed that the compacted sandwich has approximately zero porosity; comparing the final volume of the compacted sandwich with the initial volume, the porosity can be evaluated. as the cross-section of the hydraulic press clamps was large (rectangular contact surface 60 × 30 mm, and the tested sandwich plates exceeded this surface), it was assumed that the lateral expansion is small, and the volume reduction was therefore calculated only from the displacement of the clamps. the results presented in table 6 agree quite well with previously obtained results for a lower mass fraction of starch, ξ = 0.4, when the mean porosity was also 0.7, žitný [17]. the thermal conductivity of the compacted plates was measured and is compared with the thermal conductivity of the foamed plates in tab. 4. this experimental technique has a drawback caused by a slight relaxation of the sample after removal of the load (for example, an initial thickness h = 8 mm was reduced to 2.38 mm under compression and relaxed to 3.63 mm when unloaded); the thermal conductivities could be measured only on these relaxed samples, not on the compacted ones.

the second technique applied for evaluating the porosity and the crust was image analysis of photographs of cross-sections of the plates. the cross sections were prepared from cuts polished to a perfect plane with sandpaper and colored with black ink, so that the structure just at the plane of the cut is accentuated. the digital photographs were processed by nis-elements br 3.0 software. thresholding was carried out in a standard way, by selecting a suitable cubic subset of points in the rgb space; the selected mask was applied to all processed photographs. an example of such a processed photograph is shown in fig. 6.

fig. 6: bitmap of a cross section, h = 4 mm

fig. 6 is a bitmap with 1408 pixels in the x-direction (along the plate; the corresponding physical size is l = 27.74 mm) and 256 pixels in the y-direction (the transversal profile for a plate thickness of h = 4 mm). white pixels represent the starch matrix, black pixels empty space. histograms (the counts of black pixels, ch(x) and cl(y), summed along the columns and rows of the bitmap) characterize the porosity variation across the thickness (cl(y)) and along the plate (ch(x)). this bitmap illustrates some undesirable effects, for example non-uniform illumination (the right side is overexposed) and a small depression of the upper surface; in such a case, the bitmaps have to be corrected manually by excluding the distorted columns from the statistical evaluation (in this case, columns 400 to 600 and above 1300 were excluded). the mean porosity was evaluated simply as the relative number of black pixels in the bitmap. the crust is characterized by increased values of cl(y) at the bottom and at the top, but the interface is not sharp, and a suitable algorithm must be designed to capture the interface.
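the porosity and the two histograms described above are one-liners once the thresholded bitmap is available. a sketch (python with numpy assumed), run on a synthetic random bitmap standing in for the photographs:

import numpy as np

def texture_stats(bitmap):
    """bitmap: 2-d uint8 array, 1 = void (black pixel), 0 = starch matrix."""
    porosity = bitmap.mean()     # relative number of void pixels
    c_h = bitmap.sum(axis=0)     # counts summed along columns -> ch(x)
    c_l = bitmap.sum(axis=1)     # counts summed along rows    -> cl(y)
    return porosity, c_h, c_l

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    demo = (rng.random((256, 1408)) < 0.71).astype(np.uint8)  # ~71 % voids
    porosity, c_h, c_l = texture_stats(demo)
    print(f"mean porosity ~ {porosity:.2f}")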
a better characteristic of the layers from point of view of thermal resistance is therefore reduced thickness of an equivalent compact layer. the thickness of this compact layer is evaluated from the requirement that the amount of starch (counts of white pixels) is the same as in the “geometrically” determined porous crust yc. this size of the pores in the foam was only estimated by analyzing rows of bitmaps in the central region, with the exception of crust layers. contiguous parts (horizontal sections of black pixels) were considered as a measure of the corresponding pore diameter and were evaluated statistically, giving a rough estimate of the mean and maximum diameter. all textural properties (porosity and thickness of crust) were evaluated by image analysis of 25 samples for h=4 mm and 24 samples for h=8mm. each sample was represented by two photographs (two lateral cross sections) and the results are shown in tab. 6. the values in parentheses were evaluated from identical bitmaps, but using a quite different processing algorithm, by hulan [18]. differences are caused for example by the fact that hulan also evaluated the overexposed parts of the bitmap (the effect of non-uniform illumination was not corrected). however, these differences do not invalidate the basic conclusions: the porosity is almost independent of the plate thickness, unlike the thickness of the crust. the thickness of the crust is approximately proportional to the thickness of the sandwich plate. the fact that the thickness of the crust at the upper surface is less than the thickness of the crust on the bottom indicates some technological imperfections: the bottom part was heated slightly longer (it takes some time to close the upper lid of the mold) and also the thermal contact with the upper plate was not so good as on the bottom. 52 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 48 no. 6/2008 c c y y y c y y y yi i c i i c i 1 2 � � � � �� � �l l( ) tanh ( ) ( ) tanh( ) tanh( � � y n y y y y c i c i c ) tanh ( ) tanh( ) � � �� � �2 2 , (7a) c n c y y y c y y y n y i i c i i c 2 2 � � � �� � �l l( ) tanh( ) ( ) tanh( ) tanh ( � �i c i cy y y� � �� �) tanh( ) 2 , (7b) table 5: thickness of crust, dotted lines (yc) evaluated from eq. (8). examples of evaluation 5 discussion experiments carried out with direct ohmic heating and microwave heating of a concentrated suspension of potato starch indicate that volumetric heating alone is not able to evaporate a sufficient amount of water, due to increased electrical resistance (in the case of direct ohmic heating), or due to local overheating (in the case of microwaves). the electrical properties of a starch suspension were therefore evaluated only in a reduced temperature range up to 100 °c. volumetric heating can be used only as an auxiliary tool during the initial process stages, and must be combined with indirect heating from the walls. another disadvantage of volumetric heating is the absence of a crust, which is important for the mechanical and stability properties of the final products. on the other hand the technology for baking plates, containers, etc., from potato-starch water suspensions (+additives like fibres, caco3) that is applied in practice, consists of two stages, glenn et al. [1]: the first stage is preparation of the starch suspension, preheating and forming the gel, and then the gelatinized starch is baked to the final form. 
it seems that direct ohmic heating is more suitable for the first stage, and the evaluated electrical properties provide a sufficient background for the design. experiments carried out with manufactured potato starch foams confirm that the porosity and thermal conductivity of sandwich plates prepared by standard indirect heating is almost independent of the thickness of the plates. values of porosity evaluated by quite independent techniques (by image analysis and by compression tests) are in accordance with the results obtained by thermal conductivity measurements. it seems that thermal conductivity measurements could be a fast way to check the uniformity and of the current porosity values of manufactured products. however, it is somewhat surprising that the measured values � are not affected by the crust, which seems to be the only parameter depending on the plate thickness. image analysis of the cross section reveals that the thickness of the crust is approximately proportional to the thickness of the manufactured plates (of course, it is difficult to reach a more specific conclusion, because only plates h � 4 and 8 mm were analyzed). most of the questions formulated in section 4 still remain: it is not known how porosity and the crust are affected by the composition of suspension and by the wall temperature. list of symbols c specific heat capacity (j�kg�1�k�1) cl counts of pixels in a row of bitmap (-) ch counts of pixels in a column of bitmap (-) c1, c2 parameters of regression function c calibration constant dmean, dmax mean and maximum pore diameter (mm) h distance of electrodes (m) h sample thickness (m) h* dimensionless thickness of crust i electrical current (a) k consistency coefficient (pa�sn) n flow index behavior (-) n number of pixels (-) s area (m2) t time (s) t temperature (°c) t1, t2 parameters in eq. (2) (°c) u voltage (v) x relative moisture (kg water / kg solid) y distance from surface (m) � porosity of sandwich plate (-) �f porosity of foam in the core of sandwich plate (-) � specific electrical conductivity (s�m�1) � thermal conductivity of porous layer (w�m�1�k�1) © czech technical university publishing house http://ctn.cvut.cz/ap/ 53 acta polytechnica vol. 48 no. 6/2008 h � 4 mm h � 8 mm overall porosity (including crust) � (porosity from compression tests) 0.69 0.7 � (porosity by image analysis) 0.71 � 0.05 (0.68 +) 0.74 � 0.05 (0.70+) relative thickness of crust identified from eq. (8) htop/h 0.08 � 0.04 (0.12 +) 0.13 � 0.08 (0.12+) hbottom/h 0.15 � 0.04 (0.15 +) 0.12 � 0.02 (0.16+) 1 � h h (relative thickness of porous core) 0.77 � 0.05 (0.73+) 0.76 � 0.08 (0.72+) h*top (compacted crust/h) 0.04 � 0.01 0.05 � 0.02 h*bottom (compacted crust/h) 0.11 � 0.03 0.09 � 0.02 diameter of pore dmean [mm] dmax [mm] 2.26 � 0.41 3.52 � 1.04 table 6: porosity and thickness of the crust (values+ reported by hulan [19]) �a thermal conductivity of air (w�m �1�k�1) �f thermal conductivity of foam (w�m �1�k�1) �s thermal conductivity of starch matrix (w�m�1�k�1) � density (kg�m�3) �s density of starch (kg�m �3) � mass fraction of starch (-) �1, �2 parameters in eq. (2) (-) acknowledgment this research has been supported by the grant agency of the czech republic, project 101/06/0535. references [1] glenn, g. m., orts, w. j., nobes, g. a. r.: starch, fiber and caco3 effects on the physical properties of foams made by a baking process. industrial crops and products, vol. 14 (2001), p. 201–212. 
[2] zanoni, b., smaldone, d., schiraldi, a.: starch gelatinization in chemically leavened bread baking. journal of food science, vol. 56 (1991), no. 6, p. 1702.
[3] wang, j., hayakawa, k.: thermal conductivities of starch gels at high temperatures influenced by moisture. journal of food science, vol. 58 (1993), no. 2.
[4] maroulis, z. b., drouzas, a. e., saravacos, g. d.: modelling of thermal conductivity of granular starches. journal of food engineering, vol. 11 (1990), p. 255–271.
[5] lind, i., rask, c.: sorption isotherms of mixed meat, dough and bread crust. journal of food engineering, vol. 14 (1991), p. 303–315.
[6] cha, j. y., chung, d. s., seib, p. a., flores, r. a., hanna, m. a.: physical properties of starch-based foams as affected by extrusion temperature and moisture content. industrial crops and products, vol. 14 (2001), p. 23–30.
[7] schwarzberg, h. g., wu, j. p. c., nussinovitch, a., mugerwa, j.: modelling deformation and flow during vapor-induced puffing. journal of food engineering, vol. 25 (1995), p. 329–372.
[8] wang, w. ch., sastry, s. k.: starch gelatinization in ohmic heating. journal of food engineering, vol. 34 (1997), p. 225–242.
[9] lagarrigue, s., alvarez, g.: the rheology of starch dispersion at high temperatures and high shear rates: a review. journal of food engineering, vol. 50 (2000), p. 189–202.
[10] shogren, r. l., lawton, j. w., tiefenbacher, k. f.: baked starch foams: starch modifications and additives improve process parameters, structure and properties. industrial crops and products, vol. 16 (2002), p. 69–79.
[11] zanoni, b., peri, c., pierucci, s.: a study of the bread baking process. i: a phenomenological model. journal of food engineering, vol. 19 (1993), p. 389–398.
[12] skočilas, j., žitný, r.: thermal pressure forming of food materials. in: 17th international congress of chemical and process engineering [cd-rom]. prague: czech society of chemical engineering, 2006, p. 1–9. isbn 80-86059-45-6.
[13] wu, y., irudayaraj, j.: analysis of heat, mass and pressure transfer in starch based food systems. journal of food engineering, vol. 29 (1996), p. 399–414.
[14] fan, j., mitchell, j. r., blanshard, j. m. v.: a computer simulation of the dynamics of bubble growth and shrinkage during extrudate expansion. journal of food engineering, vol. 23 (1994), p. 337–356.
[15] parker, r., ollett, a. l., lai-fook, r. a., smith, a. c.: the rheology of food melts and its application in extrusion processing. in: rheology of food, biological and pharmaceutical materials, r. e. carter (ed.). london: elsevier, 1989.
[16] metaxas, a. c.: foundations of electroheat. chichester: wiley, 1996. isbn 0-471-95644-9.
[17] žitný, r.: waffle baking, part viii, experiments. research report, ctu fme, prague, 1999.
[18] hulan, m.: image analysis of a porous layer (in czech). internal report, ctu fme, 2008.

ing. jaromír štancl
phone: +420 224 352 719
fax: +420 224 310 292
e-mail: jaromir.stancl@fs.cvut.cz

ing. jan skočilas
phone: +420 224 352 719
fax: +420 224 310 292
e-mail: jan.skocilas@fs.cvut.cz

prof. ing. jiří šesták, drsc.
phone: +420 224 352 547
fax: +420 224 310 29
e-mail: jiri.sestak@fs.cvut.cz

prof. ing. rudolf žitný, csc.
phone: +420 224 352 555
fax: +420 224 310 292
e-mail: rudolf.zitny@fs.cvut.cz

department of process engineering
czech technical university in prague
faculty of mechanical engineering
technická 4
166 07 prague 6, czech republic
contributions to the study of dynamic absorbers, a case study
monica bălcău1, angela pleşa1, dan opruţa1
1 technical university of cluj-napoca, automotive engineering department, cluj-napoca, b-dul muncii 103–105
correspondence to: monica.balcau@auto.utcluj.ro

abstract
dynamic absorbers are used to reduce torsional vibrations. this paper studies the effect of a dynamic absorber attached to a mechanical system formed of three reduced masses which are acted on by one, two or three order x harmonics of a disruptive force.
keywords: dynamic absorber, torsional vibrations, reduced masses.

1 introduction
this paper studies the effect of dynamic absorbers on reducing torsional vibrations. like the inertial effects, the thermal processes in the cylinders of internal combustion engines produce forces and torsional moments that vary nonlinearly. these forces, applied to the crankshaft and to the engine block, produce translational and rotational oscillations of the engine block and torsional oscillations in the crankshaft. they have to be reduced, because they cause noise and vibrations in an engine. in order to calculate the torsional vibrations of a complex elastic system of this kind, the system must first be turned into a simpler equivalent dynamic system, formed of an elastic linear shaft of negligible mass loaded with circular reduced masses obtained by reducing the mobile gears. this study focuses on three mechanical systems formed of three reduced masses, which receive a pendulum dynamic absorber. these masses are acted on by one, two or three order x harmonics, resulting from the decomposition into a fourier series of the disruptive force with periodic variation. the dynamic absorber is attached at the end of the mechanical system.

2 a mechanical system acted on by one harmonic
the first case deals with a mechanical system formed of three reduced masses with a dynamic absorber attached according to figure 1. the dynamic absorber is placed at the end of the mechanical system. the reduced mass m3 is acted on by an order x harmonic of a disruptive force presenting a periodic variation, marked $p_x \cos(\omega_x t - \varepsilon)$. the dynamic absorber is replaced by an equivalent reduced mechanical system formed of the reduced mass m4 and the segment of the reduced crankshaft with an elastic constant c34. the elastic constants of the segments of crankshafts between two consecutive reduced masses and the mechanical axial moments of the reduced masses in relation to the symmetrical geometry axis of the shaft must be chosen in such a way that the kinetic energy and the potential energy of the real vibrating system (formed by the crankshaft and its mobile gears) are equal to the kinetic energy and the potential energy of the reduced vibrating system.

figure 1: mechanical system

this case starts from the differential equation system governing the torsional vibrations created by the mechanical system presented in figure 2:

$$m_1 \frac{d^2 a_1}{dt^2} + c_{12}(a_1 - a_2) = 0$$
$$m_2 \frac{d^2 a_2}{dt^2} + c_{12}(a_2 - a_1) + c_{23}(a_2 - a_3) = 0$$
$$m_3 \frac{d^2 a_3}{dt^2} + c_{23}(a_3 - a_2) + c_{34}(a_3 - a_4) = p_x \cos(\omega_x t - \varepsilon) \quad (1)$$
$$m_4 \frac{d^2 a_4}{dt^2} + c_{34}(a_4 - a_3) = 0$$

figure 2: equivalent mechanical system

the relative linear elongations (displacements) $a_i$, measured on the reduced circle of radius r (of the vibrations of the four reduced masses), are expressed in relation to the linear amplitudes $A_i$:

$$a_i = A_i \cos(\omega_x t - \varepsilon) \quad (i = 1,\dots,4) \quad (2)$$

in order for the mechanical system formed of the reduced mass m4 and the shaft segment with elastic constant c34 to be dynamically equivalent to the dynamic absorber attached to the reduced mass m3 (in other words, for this mass m3 to be subjected to the same torque as the dynamic absorber), it is necessary and sufficient that:

$$m_4 = m\,\frac{(L+l)^2}{r^2}\,, \qquad c_{34} = m\,\frac{(L+l)^2}{r^2}\,\frac{L}{l}\,\omega_0^2 \quad (3)$$

the square value of the order x harmonic is expressed by:

$$\omega_x^2 = x^2 \omega_0^2 \quad (4)$$

where x represents the order of the harmonic and $\omega_0$ the angular speed of the shaft. taking into consideration equations (2), (3), (4) and introducing the differential equation system (1), we obtain an algebraic system of equations. by solving this system of equations we get the four determinants. if the dynamic absorber is built in such a way that:

$$\frac{L}{l} = x^2 \quad (5)$$

the expressions of the four determinants become:

$$\delta_1 = -\frac{p_x}{m_3}\,\frac{c_{12}}{m_1}\,\frac{c_{23}}{m_2}\,\omega_0^2\left(x^2 - \frac{L}{l}\right)$$
$$\delta_2 = \frac{p_x}{m_3}\,\frac{c_{23}}{m_2}\,\omega_0^2\left(x^2\omega_0^2 - \frac{c_{12}}{m_1}\right)\left(x^2 - \frac{L}{l}\right)$$
$$\delta_3 = -\frac{p_x}{m_3}\,\omega_0^2\left(x^2\omega_0^2 - \frac{c_{12}}{m_1}\right)\left(x^2\omega_0^2 - \frac{c_{23}}{m_2} - \frac{c_{12}}{m_2}\right)\left(x^2 - \frac{L}{l}\right) \quad (6)$$
$$\delta_4 = \frac{p_x}{m_3}\,\omega_0^2\left(x^2\omega_0^2 - \frac{c_{12}}{m_1}\right)\left(x^2\omega_0^2 - \frac{c_{23}}{m_2} - \frac{c_{12}}{m_2}\right)\frac{L}{l}\,\omega_0^2$$

an analysis of the four determinants shows that the only mass that executes torsional vibrations is mass m4, which is not part of the reduced crankshaft. the other reduced masses do not execute torsional vibrations. so:

$$A_1 = 0\,, \quad A_2 = 0\,, \quad A_3 = 0\,, \quad A_4 \neq 0 \quad (7)$$

3 a mechanical system acted on by two order x harmonics
this case deals with a mechanical system composed of three reduced masses, which receives a dynamic absorber placed at the end of the system presented in figure 3. the reduced masses m2, m3 are acted on by two order x harmonics of the disruptive force presenting a periodic variation, marked $p_x \cos(\omega_x t - \varepsilon)$.

figure 3: mechanical system
figure 4: equivalent mechanical system

the reduced mass m4, together with the shaft segment with elastic constant c34, forms a dynamic mechanical system equivalent to the dynamic absorber, which applies the same torque to mass m3 as the absorber (figure 4). as in the previous case, the dynamic absorber is replaced by an equivalent dynamic system formed of the reduced mass m4 and the reduced crankshaft segment having elastic constant c34 (figure 4). the study starts from the differential equation system that governs the torsional vibrations performed by the mechanical system represented in figure 4:

$$m_1 \frac{d^2 a_1}{dt^2} + c_{12}(a_1 - a_2) = 0$$
$$m_2 \frac{d^2 a_2}{dt^2} + c_{12}(a_2 - a_1) + c_{23}(a_2 - a_3) = p_x \cos(\omega_x t - \varepsilon)$$
$$m_3 \frac{d^2 a_3}{dt^2} + c_{23}(a_3 - a_2) + c_{34}(a_3 - a_4) = p_x \cos(\omega_x t - \varepsilon) \quad (8)$$
$$m_4 \frac{d^2 a_4}{dt^2} + c_{34}(a_4 - a_3) = 0$$

relation (2) gives the elongation expressions $a_i$ related to the amplitudes recorded on radius r on the reduction circles. taking into account expressions (2), (3), (4) and introducing the differential system of equations (8), we obtain a system of algebraic equations. by solving this system, we obtain expressions for the five determinants.
$$\delta = \frac{1}{m_3}\,m\,\frac{(L+l)^2}{r^2}\left(\frac{L}{l}\,\omega_0^2\right)^2\left[\left(x^2\omega_0^2 - \frac{c_{12}}{m_1}\right)\left(x^2\omega_0^2 - \frac{c_{12}}{m_2} - \frac{c_{23}}{m_2}\right) - \frac{c_{12}}{m_1}\,\frac{c_{12}}{m_2}\right]$$
$$\delta_1 = -\frac{p_x}{m_2}\,\omega_0^4\,\frac{c_{12}}{m_1}\,\frac{1}{m_3}\,m\,\frac{(L+l)^2}{r^2}\left(\frac{L}{l}\right)^2$$
$$\delta_2 = -\frac{p_x}{m_2}\,\omega_0^4\left(x^2\omega_0^2 - \frac{c_{12}}{m_1}\right)\frac{1}{m_3}\,m\,\frac{(L+l)^2}{r^2}\left(\frac{L}{l}\right)^2$$
$$\delta_3 = 0 \quad (9)$$
$$\delta_4 = -\frac{p_x}{m_2}\,\frac{c_{23}}{m_3}\,\frac{L}{l}\,\omega_0^2\left(x^2\omega_0^2 - \frac{c_{12}}{m_1}\right) + \frac{p_x}{m_3}\left[\left(x^2\omega_0^2 - \frac{c_{12}}{m_1}\right)\left(x^2\omega_0^2 - \frac{c_{12}}{m_2} - \frac{c_{23}}{m_2}\right) - \frac{c_{12}}{m_1}\,\frac{c_{12}}{m_2}\right]\frac{L}{l}\,\omega_0^2$$

the analysis of the expressions (9) shows that the reduced masses m1, m2 and m4 — born from a reduction operation — perform torsional vibrations. the only mass that does not perform any torsional vibrations is mass m3, which is the mass that has the dynamic absorber attached.

$$A_1 \neq 0\,, \quad A_2 \neq 0\,, \quad A_3 = 0\,, \quad A_4 \neq 0 \quad (10)$$

4 the mechanical system acted on by three order x harmonics
this case investigates a mechanical system composed of three reduced masses which receives a dynamic absorber placed at the end of the system (figure 5). the reduced masses m1, m2, m3 are acted on by three order x harmonics of disruptive forces presenting a periodic variation, marked $p_x \cos(\omega_x t - \varepsilon)$.

figure 5: mechanical system
figure 6: equivalent mechanical system

just as in the previous paragraph, the dynamic absorber is replaced by an equivalent dynamic system formed of the reduced mass m4 and the reduced shaft segment having the elastic constant c34 (figure 6). the differential equations governing the vibratory movements of the mechanical system represented in figure 6 are:

$$m_1 \frac{d^2 a_1}{dt^2} + c_{12}(a_1 - a_2) = p_x \cos(\omega_x t - \varepsilon)$$
$$m_2 \frac{d^2 a_2}{dt^2} + c_{12}(a_2 - a_1) + c_{23}(a_2 - a_3) = p_x \cos(\omega_x t - \varepsilon)$$
$$m_3 \frac{d^2 a_3}{dt^2} + c_{23}(a_3 - a_2) + c_{34}(a_3 - a_4) = p_x \cos(\omega_x t - \varepsilon) \quad (11)$$
$$m_4 \frac{d^2 a_4}{dt^2} + c_{34}(a_4 - a_3) = 0$$

the expressions of elongations $a_i$ according to the amplitudes recorded on the radius r reduction circles are given by relation (2). taking into account expressions (2), (3), (4) and introducing the differential system of equations (11), we obtain an algebraic system of equations. solving this system provides the four determinants.

$$\delta = -\frac{1}{m_3}\,m\,\frac{(L+l)^2}{r^2}\left(\frac{L}{l}\,\omega_0^2\right)^2\left[\left(x^2\omega_0^2 - \frac{c_{12}}{m_1}\right)\left(x^2\omega_0^2 - \frac{c_{12}}{m_2} - \frac{c_{23}}{m_2}\right) - \frac{c_{12}}{m_1}\,\frac{c_{12}}{m_2}\right]$$
$$\delta_1 = \frac{p_x}{m_1}\left(x^2\omega_0^2 - \frac{c_{12}}{m_2} - \frac{c_{23}}{m_2}\right)\frac{1}{m_3}\,m\,\frac{(L+l)^2}{r^2}\left(\frac{L}{l}\,\omega_0^2\right)^2 - \frac{p_x}{m_2}\,\frac{c_{12}}{m_1}\,\frac{1}{m_3}\,m\,\frac{(L+l)^2}{r^2}\left(\frac{L}{l}\,\omega_0^2\right)^2$$
$$\delta_2 = -\frac{p_x}{m_1}\,\frac{c_{12}}{m_2}\,\frac{1}{m_3}\,m\,\frac{(L+l)^2}{r^2}\left(\frac{L}{l}\,\omega_0^2\right)^2 + \frac{p_x}{m_2}\,\frac{1}{m_3}\,m\,\frac{(L+l)^2}{r^2}\left(\frac{L}{l}\,\omega_0^2\right)^2$$
$$\delta_3 = 0 \quad (12)$$

we observe that:

$$A_1 \neq 0\,, \quad A_2 \neq 0\,, \quad A_3 = 0 \quad (13)$$

this means that the only mass that does not execute torsional oscillations is mass m3, which received a dynamic absorber. since mass m4 is not a result of reducing the crankshaft, it was no longer useful when calculating the amplitude vibration for this mass.

5 conclusion
the three cases discussed here offer us the following information:
– in the case of the mechanical system formed of three reduced masses acted on by a single order x harmonic of the disruptive force, if the dynamic absorber is built according to relation (5), the three reduced masses m1, m2 and m3 do not perform any torsional vibrations, irrespective of the value of the angular speed.
– in the case of the mechanical systems formed of three reduced masses acted on by two and three order x harmonics of the disruptive force, respectively, if the dynamic absorber is built according to relation (5), the only mass which does not execute torsional vibrations is the mass which receives the dynamic absorber.

references
[1] haddow, a. g., shaw, s. w.: centrifugal pendulum vibration absorbers: an experimental and theoretical investigation. nonlinear dynamics, 34, 293–307, 2003.
[2] ripianu, a., crăciun, i.: the dynamic and strength calculus of straight and crank shafts. transilvania press publishing house, cluj, 1999.
[3] kraemer, o.: drehschwingungsrechnung — berechnung der eigenschwingungszahlen. technische hochschule karlsruhe, lehrstuhl für kolbenmaschinen und getriebelehre, 1960.
[4] pop, a. f., gligor, r. m., bălcău, m.: analysing of vibrations measurements upon hand-arm system and results comparison with theoretical model. 3rd european conference on mechanism science (eucomes 2010), september 14–18, 2010, cluj-napoca, romania. mechanisms and machine science, springer, vol. 5, p. 277–284. isbn 978-90-481-9688-3.

the simulation of the influence of water remnants on a hot rolled plate after cooling
radek zahradník1
1 faculty of mechanical engineering, heat transfer and fluid flow laboratory, technická 2896/2, 616 69 brno, czech republic
correspondence to: zahradnik@lptap.fme.vutr.cz

abstract
in situations when a sheet metal plate of large dimensions is rolled, water remnants from cooling can be observed on the upper side of the plate. this paper focuses on deformations of a hot rolled sheet metal plate that are caused by water remnants after cooling. a transient finite element simulation was used to describe shape deformations of the cross profile of a metal sheet. the finite element model is fully parametric for easy simulation of multiple cases. the results from previous work were used for the boundary conditions.
keywords: water remnants, fea, simulation, hot plate.

1 introduction
cooling is an integral part of the hot rolling process. in most cases, sprayed water is used to cool down the rolled product. water remnants can be present on the upper side of the rolled plate after it passes through the cooling section (figure 1). these remnants will eventually evaporate (after 60 seconds), but before that they significantly cool down the rolled plate. this additional unwanted cooling can deform the plate. this problem is observed on the basis of a real case scenario carried out for an international client of the heat transfer and fluid flow laboratory. this paper focuses on a numerical calculation (finite element analysis) of rolled plate deformations, and the goal is to find whether the water remnants have a significant influence on the shape of the rolled plate.

2 finite element analysis
the simulation of the influence of the water remnants is divided into two separate analyses: a transient thermal analysis, and a multiple static structural analysis. the temperature field over time is calculated in the thermal analysis. this temperature field is used to calculate the structural deformation of the rolled plate caused by the non-uniform temperature distribution. the finite element model makes use of the symmetry of the simulated task. the model is a section cut from a half of a rolled plate (figure 2). no support is used for the plate, so the finite element model (fem) includes no support.

figure 1: remnants on the top side of a rolled plate after passing through the cooling section. although the plate is moving, the water remains on the plate until it evaporates
figure 2: front view of fem at the top of the figure; side view of fem on the left bottom part of the figure; detail of the fem mesh (element type – solid90) in the middle part of the figure; initial temperature distribution in bottom right part of the figure
figure 3: calculated displacement structure with/without gravitation on the left/right of the figure
figure 4: the thermal field of the rolled plate after the water remnants evaporate (on the left side of the figure) with a closer view of the center of fem (on the right of the figure). displacement scaling: true scale – 100 : 1
figure 5: calculated displacement structure after the water remnants evaporate. displacement scaling: true scale – 100 : 1

the initial temperature field was recorded in a hot strip mill on the plate during the rolling process. the details cannot be divulged because of a non-disclosure agreement with the customer. there is a parabolic temperature distribution from the core temperature of 675 °c to the surface temperature of 650 °c (figure 2). the boundary conditions are the heat transfer coefficient (htc) values on the external surfaces of the fem in the thermal analysis. the htc values are 400 w·m−2·k−1 for the surface where there is contact between the water and the plate, and 55 w·m−2·k−1 for the remaining surfaces. these values were obtained from previous work done in the heat transfer and fluid flow laboratory. the boundary conditions applied in the structural analysis are the temperature field, the gravitation and the constraints which represent symmetry. the material model is a temperature-dependent bilinear model.

3 results
the thermal contraction is not strong enough to bend the rolled plate, as shown on the right in figure 3. the gravitation force is almost two times higher than the bending moment produced by the contraction. the surface temperature beneath the water remnants drops to 458 °c. this is an almost 200 °c decrease from the starting value (figure 4). figure 5 shows the displacement structure after the water remnants evaporate (60 seconds). the range of deformation is ±1 mm. no negative displacement in the y-axis is possible in the real case scenario, because of the plate support, but we can see that the plate deformation is small in comparison with the size of the plate.

4 conclusion
the cooling capacity of water remnants is high enough to produce a non-uniform temperature field in a thin rolled plate (figure 4). the water remnants significantly cool down the rolled plate, which leads to contraction. this contraction acts as a bending moment (figure 3). the weight of a rolled plate is sufficient to prevent considerable lifting of its edges (figure 3). the shape of the plate is just slightly deformed. the maximum displacement does not exceed 1 mm, and the influence of the water remnants is insignificant.

acknowledgement
the paper has been supported by an internal grant of the brno university of technology for specific research and development no. fsi-s-11-20.

references
[1] kotrbacek, p., horsky, j., raudensky, m., pohanka, m.: experimental study of heat transfer in hot rolling. revue de metallurgie, 2005, p. 42–43. isbn 2-911212-05-3.
[2] raudensky, m., bohacek, j.: leidenfrost phenomena at hot sprayed surface. in: the 7th eci international conference on boiling heat transfer 2009, 3–7 may 2009, florianopolis, brazil.
[3] pohanka, m., bellerova, h., raudensky, m.: experimental technique for heat transfer measurements on fast moving sprayed surfaces. journal of astm international, vol. 6, no. 4, 2009, p. 1–9.

topics on n-ary algebraic structures
j. a. de azcárraga, j. m. izquierdo

abstract
we review the basic definitions and properties of two types of n-ary structures, the generalized lie algebras (gla) and the filippov (≡ n-lie) algebras (fa), as well as those of their poisson counterparts, the generalized poisson (gps) and nambu-poisson (n-p) structures. we describe the filippov algebra cohomology complexes relevant for the central extensions and infinitesimal deformations of fas. it is seen that semisimple fas do not admit central extensions and, moreover, that they are rigid. this extends whitehead's lemma to all n ≥ 2, n = 2 being the original lie algebra case. some comments on n-leibniz algebras are also made.

1 introduction
the jacobi identity (ji) for lie algebras g, $[x,[y,z]] + [y,[z,x]] + [z,[x,y]] = 0$, may be looked at in two ways. first, one may see it as a consequence of the associativity of the composition of generators in the lie bracket. secondly, it may be viewed as the statement that the adjoint map is a derivation of the lie algebra, $\mathrm{ad}_x[y,z] = [\mathrm{ad}_x y, z] + [y, \mathrm{ad}_x z]$. a natural problem is to consider n-ary generalizations, i.e. to look for the possible characteristic identities that an n-ary bracket,
$$(x_1,\dots,x_n) \in g \times \dots \times g \ \mapsto\ [x_1,\dots,x_n] \in g\,, \quad (1.1)$$
antisymmetric in its arguments (this may be relaxed; see the last section), may satisfy. when n > 2, two generalizations of the ji suggest themselves. these are:

(a) higher order lie algebras or generalized lie algebras (gla) g, proposed independently in [1, 2, 3] and [4, 5, 6, 7]. their bracket is defined by the full antisymmetrization
$$[x_{i_1},\dots,x_{i_n}] := \sum_{\sigma\in s_n} (-1)^{\pi(\sigma)}\, x_{i_{\sigma(1)}} \cdots x_{i_{\sigma(n)}}\,. \quad (1.2)$$
for n even, this definition implies the generalized jacobi identity (gji)
$$\sum_{\sigma\in s_{2n-1}} (-1)^{\pi(\sigma)} \left[ [x_{i_{\sigma(1)}},\dots,x_{i_{\sigma(n)}}], x_{i_{\sigma(n+1)}},\dots,x_{i_{\sigma(2n-1)}} \right] = 0 \quad (1.3)$$
which follows from the associativity of the products in (1.2) (for n odd, the r.h.s. is $n!\,(n-1)!\,[x_{i_1},\dots,x_{i_{2n-1}}]$ rather than zero). given a basis of g, the bracket may be written as $[x_{i_1},\dots,x_{i_{2p}}] = \omega_{i_1\dots i_{2p}}{}^{j} x_j$, where the $\omega_{i_1\dots i_{2p}}{}^{j}$ are the structure constants of the gla.

(b) n-lie or filippov algebras (fa) g. the characteristic identity that generalizes the n = 2 ji is the filippov identity (fi) [8]
$$[x_1,\dots,x_{n-1},[y_1,\dots,y_n]] = \sum_{a=1}^{n} [y_1,\dots,y_{a-1},[x_1,\dots,x_{n-1},y_a],y_{a+1},\dots,y_n]\,. \quad (1.4)$$
if we introduce fundamental objects $x = (x_1,\dots,x_{n-1})$, antisymmetric in their (n−1) entries and acting on g as
$$x \cdot z \equiv \mathrm{ad}_x z := [x_1,\dots,x_{n-1},z] \quad \forall z \in g\,, \quad (1.5)$$
then the fi just expresses that $\mathrm{ad}_x$ is a derivation of the bracket,
$$\mathrm{ad}_x [y_1,\dots,y_n] = \sum_{a=1}^{n} [y_1,\dots,\mathrm{ad}_x y_a,\dots,y_n]\,. \quad (1.6)$$
given a basis, a fa may be defined through its structure constants,
$$[x_{a_1} \dots x_{a_n}] = f_{a_1\dots a_n}{}^{d}\, x_d\,, \quad (1.7)$$
and the fi is written as
$$f_{b_1\dots b_n}{}^{l}\, f_{a_1\dots a_{n-1} l}{}^{s} = \sum_{k=1}^{n} f_{a_1\dots a_{n-1} b_k}{}^{l}\, f_{b_1\dots b_{k-1}\, l\, b_{k+1}\dots b_n}{}^{s}\,. \quad (1.8)$$

2 some definitions and properties of fa
the definitions of ideals, solvable ideals and semisimple algebras can be extended to the n > 2 case as follows [9]. a subalgebra i of g is an ideal of g if
$$[x_1,\dots,x_{n-1},z] \subset i \quad \forall x_1,\dots,x_{n-1} \in g\,,\ \forall z \in i\,. \quad (2.9)$$
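the fi can be verified numerically for a concrete fa. the following sketch, illustrative only, checks the form (1.4) on all basis vectors of the four-dimensional euclidean 3-lie algebra whose bracket is given by the levi-civita symbol (introduced as $a_4$ in section 3.2 below).

```python
# numerical check of the filippov identity (1.4) for the euclidean
# 3-lie algebra a4 of section 3.2: structure constants = levi-civita symbol.
import itertools
import numpy as np

dim = 4  # the 3-lie algebra a4 lives on a 4-dimensional euclidean space
eps = np.zeros((dim,) * dim)
for p in itertools.permutations(range(dim)):
    inversions = sum(p[i] > p[j] for i in range(dim) for j in range(i + 1, dim))
    eps[p] = (-1) ** inversions  # totally antisymmetric levi-civita symbol

def bracket(u, v, w):
    # [u, v, w]^d = eps_{abc}^d u^a v^b w^c, cf. eq. (3.20) up to normalization
    return np.einsum('abcd,a,b,c->d', eps, u, v, w)

e = np.eye(dim)  # basis vectors e_1 ... e_4
max_residual = 0.0
for i1, i2, j1, j2, j3 in itertools.product(range(dim), repeat=5):
    x1, x2 = e[i1], e[i2]
    y1, y2, y3 = e[j1], e[j2], e[j3]
    lhs = bracket(x1, x2, bracket(y1, y2, y3))   # fi (1.4), left-hand side
    rhs = (bracket(bracket(x1, x2, y1), y2, y3)
           + bracket(y1, bracket(x1, x2, y2), y3)
           + bracket(y1, y2, bracket(x1, x2, y3)))
    max_residual = max(max_residual, float(np.abs(lhs - rhs).max()))
print(max_residual)  # 0.0: the filippov identity holds for this bracket
```

a zero maximal residual confirms that the ε-bracket satisfies the fi, as stated for these algebras in section 3.2.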
(to appear in the proceedings of the meeting "selected topics in mathematical and particle physics", may 5–7, 2009 (niederlefest), held in prague on the occasion of the 70th birthday of professor j. niederle.)

an ideal i is (n-)solvable if the series
$$i^{(0)} := i\,, \quad i^{(1)} := [i^{(0)},\dots,i^{(0)}]\,,\ \dots\,,\ i^{(s)} := [i^{(s-1)},\dots,i^{(s-1)}]\,,\ \dots \quad (2.10)$$
ends. a fa is then semisimple if it does not have solvable ideals, and simple if $[g,\dots,g] \neq \{0\}$ and it does not contain non-trivial ideals. there is also a cartan-like criterion for semisimplicity [10]. namely, a fa is semisimple if
$$k(x,y) = k(x_1,\dots,x_{n-1},y_1,\dots,y_{n-1}) := \mathrm{tr}\,(\mathrm{ad}_x\,\mathrm{ad}_y) \quad (2.11)$$
is non-degenerate in the sense that
$$k(z, g, \overset{n-2}{\dots}, g,\ g, \overset{n-1}{\dots}, g) = 0\ \Rightarrow\ z = 0\,. \quad (2.12)$$
it can also be shown [11] that a semisimple fa is the sum of simple ideals,
$$g = \bigoplus_{s=1}^{k} g_s = g_{(1)} \oplus \dots \oplus g_{(k)}\,. \quad (2.13)$$
the derivations of a fa g generate a lie algebra. to see it, introduce first the composition of fundamental objects,
$$x \cdot y := \sum_{a=1}^{n-1} (y_1,\dots,y_{a-1},[x_1,\dots,x_{n-1},y_a],y_{a+1},\dots,y_{n-1})\,, \quad (2.14)$$
which reflects that x acts as a derivation. it is then seen that the fi implies that
$$x \cdot (y \cdot z) - y \cdot (x \cdot z) = (x \cdot y) \cdot z\,, \quad \forall x,y,z \in \wedge^{n-1} g\,, \quad (2.15)$$
$$\mathrm{ad}_x\,\mathrm{ad}_y\, z - \mathrm{ad}_y\,\mathrm{ad}_x\, z = \mathrm{ad}_{x\cdot y}\, z\,, \quad \forall x,y \in \wedge^{n-1} g\,,\ \forall z \in g\,, \quad (2.16)$$
which means that $\mathrm{ad}_x \in \mathrm{end}\,g$ satisfies $\mathrm{ad}_{x\cdot y} = -\mathrm{ad}_{y\cdot x}$. these two identities show that the inner derivations $\mathrm{ad}_x$ associated with the fundamental objects x generate (the ad map is not necessarily injective) an ordinary lie algebra, the lie algebra associated with the fa g.

an important type of fas, because of its relevance in physical applications where a scalar product is usually needed (as in the bagger-lambert-gustavsson model in m-theory), is the class of metric filippov algebras. these are endowed with a metric $\langle\,,\rangle$ on g, $\langle y,z\rangle = g_{ab}\, y^a z^b$, $\forall y,z \in g$, that is invariant, i.e.,
$$x \cdot \langle y,z\rangle = \langle x\cdot y, z\rangle + \langle y, x\cdot z\rangle = \langle [x_1,\dots,x_{n-1},y], z\rangle + \langle y, [x_1,\dots,x_{n-1},z]\rangle = 0\,. \quad (2.17)$$
this means that the structure constants with all indices down, $f_{a_1\dots a_{n-1}bc}$, are completely antisymmetric, since the invariance of g above implies $f_{a_1\dots a_{n-1}b}{}^{l} g_{lc} + f_{a_1\dots a_{n-1}c}{}^{l} g_{bl} = 0$. the $f_{a_1\dots a_{n+1}}$ define a skew-symmetric invariant tensor under the action of x, since the fi implies
$$\sum_{i=1}^{n+1} f_{a_1\dots a_{n-1} b_i}{}^{l}\, f_{b_1\dots b_{i-1}\, l\, b_{i+1}\dots b_{n+1}} = 0 \quad \text{or} \quad l_x . f = 0\,. \quad (2.18)$$

3 examples of n-ary structures
3.1 examples of glas
let n = 2p. we look for structure constants $\omega_{i_1\dots i_{2p}}{}^{j}$ that satisfy the gji (1.3), i.e., such that
$$\omega_{[j_1\dots j_{2p}}{}^{l}\, \omega_{j_{2p+1}\dots j_{4p-1}]l}{}^{s} = 0\,. \quad (3.19)$$
it turns out [3, 2] that, given a simple compact lie algebra, the coordinates of the (odd) cocycles for the lie algebra cohomology satisfy the gji (1.3). these provide the structure constants of an infinity of glas, with brackets with $n = 2(m_i - 1)$ entries (where $i = 1,\dots,l$ and l is the rank of the algebra), according to the table below:

g      dim g                  orders m_i of invariants (and casimirs)    orders 2m_i − 1 of g-cocycles
a_l    (l+1)² − 1 [l ≥ 1]     2, 3, ..., l+1                             3, 5, ..., 2l+1
b_l    l(2l+1) [l ≥ 2]        2, 4, ..., 2l                              3, 7, ..., 4l−1
c_l    l(2l+1) [l ≥ 3]        2, 4, ..., 2l                              3, 7, ..., 4l−1
d_l    l(2l−1) [l ≥ 4]        2, 4, ..., 2l−2, l                         3, 7, ..., 4l−5, 2l−1
g_2    14                     2, 6                                       3, 11
f_4    52                     2, 6, 8, 12                                3, 11, 15, 23
e_6    78                     2, 5, 6, 8, 9, 12                          3, 9, 11, 15, 17, 23
e_7    133                    2, 6, 8, 10, 12, 14, 18                    3, 11, 15, 19, 23, 27, 35
e_8    248                    2, 8, 12, 14, 18, 20, 24, 30               3, 15, 23, 27, 35, 39, 47, 59
3.2 examples of fas
an important example of finite filippov algebras is provided by the real euclidean simple n-lie algebras $a_{n+1}$, defined on a euclidean (n+1)-dimensional vector space. let us fix a basis $\{e_i\}$ ($i = 1,\dots,n+1$). the basic commutators are given by
$$[e_1 \dots \hat{e}_i \dots e_{n+1}] = (-1)^{n+1} e_i \quad \text{or} \quad [e_{i_1} \dots e_{i_n}] = (-1)^n \sum_{i=1}^{n+1} \epsilon_{i_1\dots i_n}{}^{i}\, e_i\,. \quad (3.20)$$
there are also infinite-dimensional fas that generalize the ordinary poisson algebra by means of the bracket of n functions $f^i = f^i(x^1, x^2, \dots, x^n)$ defined by
$$[f^1, f^2, \dots, f^n] := \epsilon^{i_1\dots i_n}_{1\,\dots\,n}\, \partial_{i_1} f^1 \cdots \partial_{i_n} f^n = \left|\frac{\partial(f^1, f^2, \dots, f^n)}{\partial(x^1, x^2, \dots, x^n)}\right|\,, \quad (3.21)$$
considered by nambu [12], specially for n = 3. the commutators in (3.20) and the above jacobian n-bracket satisfy the fi, which can be checked by using the schouten identities technique. all these examples are also metric fas.

4 n-ary poisson generalizations
both glas and fas have n-ary poisson structure counterparts. these satisfy the associated gji and fi characteristic identities, to which leibniz's rule is added.

4.1 generalized poisson structures (gps)
the generalized poisson structures [2] (gps, n even) are defined by brackets $\{f_1,\dots,f_n\}$, where the $f_i$, $i = 1,\dots,n$, are functions on a manifold. they are skew-symmetric,
$$\{f_1,\dots,f_i,\dots,f_j,\dots,f_n\} = -\{f_1,\dots,f_j,\dots,f_i,\dots,f_n\}\,, \quad (4.22)$$
satisfy the leibniz identity,
$$\{f_1,\dots,f_{n-1}, gh\} = g\{f_1,\dots,f_{n-1},h\} + \{f_1,\dots,f_{n-1},g\}h\,, \quad (4.23)$$
and the characteristic identity of the glas, the gji (1.3),
$$\sum_{\sigma\in s_{4s-1}} (-1)^{\pi(\sigma)} \{f_{\sigma(1)},\dots,f_{\sigma(2s-1)}, \{f_{\sigma(2s)},\dots,f_{\sigma(4s-1)}\}\} = 0\,. \quad (4.24)$$
as with ordinary poisson structures, there are linear gps given in terms of the coordinates of the odd cocycles of the g in the table of sec. 3.1. they are given by the multivector
$$\lambda = \frac{1}{(2m-2)!}\, \omega_{i_1\dots i_{2m-2}}{}^{\sigma}\, x_\sigma\, \partial^{i_1} \wedge \dots \wedge \partial^{i_{2m-2}} \quad (4.25)$$
since, as may be checked [2], $\lambda$ has zero schouten-nijenhuis bracket with itself, $[\lambda,\lambda]_{sn} = 0$, which corresponds to the gji. all glas associated with a simple algebra define linear gps.

4.2 nambu-poisson structures (n-p)
these are defined by relations (4.22) and (4.23), but now the characteristic identity is the fi,
$$\{f_1,\dots,f_{n-1},\{g_1,\dots,g_n\}\} = \{\{f_1,\dots,f_{n-1},g_1\},g_2,\dots,g_n\} + \{g_1,\{f_1,\dots,f_{n-1},g_2\},g_3,\dots,g_n\} + \dots + \{g_1,\dots,g_{n-1},\{f_1,\dots,f_{n-1},g_n\}\}\,. \quad (4.26)$$
the filippov identity for the (nambu) jacobians of n functions was first written by filippov [8], and by sahoo and valsakumar [13] and takhtajan [14] (who called it the fundamental identity) in the context of nambu mechanics [12]. physically, the fi is a consistency condition for the time evolution [13, 14], given in terms of (n − 1) 'hamiltonian' functions that correspond to the $\mathrm{ad}_x$ derivations of a fa. every even n-p structure is also a gps, but the converse does not hold. the question of the quantization of nambu-poisson mechanics has been the subject of a vast literature; it is probably fair to say that it remains a problem (for n > 2!), aggravated by the fact that there are not so many physical examples of n-p mechanical systems to be quantized. we shall just refer here to [15, 16, 17], from which the earlier literature can be traced.

5 lie algebra cohomology, extensions and deformations
given a lie algebra g, the p-cochains of the lie algebra cohomology are p-antisymmetric, v-valued maps (where v is a g-module),
$$\omega^p : g \times \overset{p}{\dots} \times g \to v\,, \qquad \omega^a = \frac{1}{p!}\, \omega^a_{i_1\dots i_p}\, \omega^{i_1} \wedge \dots \wedge \omega^{i_p}\,, \quad (5.27)$$
where $\{\omega^i\}$ is a basis of the coalgebra $g^*$.
the coboundary operator (for the left action) $s : \omega^p \in c^p(g,v) \mapsto (s\omega^p) \in c^{p+1}(g,v)$, $s^2 = 0$, is given by
$$(s\omega^p)^a(x_1,\dots,x_{p+1}) := \sum_{i=1}^{p+1} (-1)^{i+1}\, \rho(x_i)^a{}_b\, \omega^{p\,b}(x_1,\dots,\hat{x}_i,\dots,x_{p+1}) + \sum_{\substack{j,k=1 \\ j<k}}^{p+1} (-1)^{j+k}\, \omega^{p\,a}([x_j,x_k],x_1,\dots,\hat{x}_j,\dots,\hat{x}_k,\dots,x_{p+1})\,. \quad (5.28)$$

acknowledgement
this work has been partially supported by the research grants fis2008-01980 and fis2009-09002 from the spanish micinn, and va013c05 from the junta de castilla y león (spain).

references
[1] de azcárraga, j. a., perelomov, a., pérez bueno, j. c.: new generalized poisson structures. j. phys. a29, l151–l157 (1996), arxiv:q-alg/9601007.
[2] de azcárraga, j. a., perelomov, a. m., pérez bueno, j. c.: the schouten-nijenhuis bracket, cohomology and generalized poisson structures. j. phys. a29, 7993–8010 (1996), arxiv:hep-th/9605067.
[3] de azcárraga, j. a., pérez-bueno, j. c.: higher-order simple lie algebras. commun. math. phys. 184, 669–681 (1997), arxiv:hep-th/9605213.
[4] hanlon, p., wachs, h.: on lie k-algebras. adv. in math. 113, 206–236 (1995).
[5] gnedbaye, v.: les algèbres k-aires et leurs opérads. c. r. acad. sci. paris, série i, 321, 147–152 (1995).
[6] loday, j.-l.: la renaissance des opérades. sem. bourbaki 792, 47–54 (1994–1995).
[7] michor, p. w., vinogradov, a. m.: n-ary lie and associative algebras. rend. sem. mat. univ. pol. torino 53, 373–392 (1996).
[8] filippov, v.: n-lie algebras. sibirsk. mat. zh. 26 (1985), 126–140, 191 (english translation: siberian math. j. 26, 879–891 (1985)).
[9] kasymov, s. m.: theory of n-lie algebras. algebra i logika 26 (1987), no. 3, 277–297 (english translation: algebra and logic 26, 155–166 (1988)).
[10] kasymov, s. m.: on analogues of cartan criteria for n-lie algebras. algebra i logika 34 (1995), no. 3, 274–287, 363.
[11] ling, w. x.: on the structure of n-lie algebras. phd thesis, siegen, 1993.
[12] nambu, y.: generalized hamiltonian dynamics. phys. rev. d7, 2405–2414 (1973).
[13] sahoo, d., valsakumar, m. c.: nambu mechanics and its quantization. phys. rev. a46, 4410–4412 (1992).
[14] takhtajan, l.: a higher order analog of the chevalley-eilenberg complex and the deformation theory of lie algebras. st. petersburg math. j. 6, 429–437 (1995).
[15] de azcárraga, j. a., izquierdo, j. m., pérez bueno, j. c.: on the higher-order generalizations of poisson structures. j. phys. a30, l607–l616 (1997), hep-th/9703019.
[16] awata, h., li, m., minic, d., yoneya, t.: on the quantization of nambu brackets. jhep 02, 013 (2001), hep-th/9906248.
[17] curtright, t., zachos, c.: classical and quantum nambu mechanics. phys. rev. d68, 085001 (2003), hep-th/0212267.
[18] gerstenhaber, m.: on the deformation of rings and algebras. annals math. 79, 59–103 (1964).
[19] gautheron, p.: some remarks concerning nambu mechanics. lett. math. phys. 37, 103–116 (1996).
[20] rotkiewicz, m.: cohomology ring of n-lie algebras. extracta math. 20, 219–232 (2005).
[21] de azcárraga, j. a., izquierdo, j. m.: cohomology of filippov algebras and an analogue of whitehead's lemma. j. phys. conf. ser. 175, 012001 (2009), arxiv:0905.3083 [math-ph].
[22] loday, j.-l.: une version non-commutative des algèbres de lie. l'ens. math. 39, 269–293 (1993).
[23] daletskii, y. l., takhtajan, l.: leibniz and lie algebra structures for nambu algebra. lett. math. phys. 39, 127–141 (1997).
[24] casas, j. m., loday, j.-l., pirashvili, t.: leibniz n-algebras. forum math. 14, 189–207 (2002).
[25] loday, j.-l., pirashvili, t.: universal enveloping algebras of leibniz algebras and (co)homology. math.
annalen 296, 139–158 (1993).
[26] fialowski, a., mandal, a.: leibniz algebra deformations of a lie algebra. j. math. phys. 49, 093511 (2008), arxiv:0802.1263 [math.kt].
[27] figueroa-o'farrill, j. m.: three lectures on 3-algebras. arxiv:0812.2865 [hep-th].
[28] de azcárraga, j. a., izquierdo, j. m.: on leibniz deformations and rigidity of simple n-lie algebras. to be published.

j. a. de azcárraga
departamento de física teórica and ific (csic-uveg)
univ. de valencia
46100-burjassot (valencia), spain

j. m. izquierdo
departamento de física teórica
universidad de valladolid
47011-valladolid, spain

agent-based simulation of the maritime domain
o. vaněk

abstract
in this paper, a multi-agent based simulation platform is introduced that focuses on legitimate and illegitimate aspects of maritime traffic, mainly on intercontinental transport through piracy-afflicted areas. the extensible architecture presented here comprises several modules controlling the simulation and the life-cycle of the agents, analyzing the simulation output and visualizing the entire simulated domain. the simulation control module is initialized by various configuration scenarios to simulate various real-world situations, such as a pirate ambush, coordinated transit through a transport corridor, or coastal fishing and local traffic. the environmental model provides a rich set of inputs for agents that use the geo-spatial data and the vessel operational characteristics for their reasoning. the agent behavior model based on finite state machines together with planning algorithms allows complex expression of agent behavior, so the resulting simulation output can serve as a substitution for real-world data from the maritime domain.
keywords: multi-agent system, simulation, maritime traffic.

1 introduction
the maritime domain is a complex environment consisting of many independent, though often intensively interacting entities that have their own goals and intentions. the aim of this research is to simulate not only legitimate maritime traffic, such as intercontinental transportation, coastal fishing and recreational traffic, but also illegitimate aspects, such as illegal fishing, waste dumping and maritime piracy. in this paper, an agent-based simulation of maritime traffic is presented; more specifically, the design of an architecture of the simulation platform is described (section 2) and the simulation configuration and control flow are explained (section 4). a model of the maritime environment (section 5) with several types of independent agents striving to achieve their goals (section 6) is visualized using the google earth1 platform (section 7). the simulation is used to provide a noise-free source of maritime data with the desired spatio-temporal resolution to algorithms analyzing the situation, and it serves for developing and prototyping agent-based techniques for understanding, detecting, anticipating and eventually preventing piracy and, possibly, other categories of maritime crime. because of the recent surge in maritime piracy, insurance rates have increased more than 10-fold for vessels transiting known pirate waters in recent years, and the overall costs of piracy in the pacific ocean and the indian ocean alone were estimated at us$ 15 billion in 2006 and have continued to rise [7]. it is therefore appropriate to develop a set of techniques and algorithms that are able to reduce this threat.
a description of these algorithms is beyond the scope of this paper, but it can be found in [5]. although agent-based techniques have been successfully applied in other traffic and transportation domains and problems (see e.g. [4, 8]), this is – to the best of our knowledge – the first integrated attempt at employing agent-based concepts and techniques in the domain of maritime transport security.

2 architecture overview
the main objective in designing the simulation platform was modularity and extensibility; this objective has been met by employing a loosely coupled architecture with clearly defined data and control flows. a diagram depicting the main modules and data flows is shown in figure 1. as can be seen, the simulation platform consists of several modules that can be arranged into the following groups:

fig. 1: main modules and data flows. the lines represent the main data flows, the dashed line represents the temporal synchronization controlled by the simulation loop

1 http://earth.google.com

• simulation control – these modules control the initialization and execution of the simulation and all related processes (see section 4).
• data sources – these modules provide the platform with the off-line data required for initializing and running of the simulation. for a description of the necessary data sources, see section 3.
• simulation – containing modules responsible for representing and operating the simulation model, both the simulated vessels and the simulated environment in which they operate.
• analysis – containing modules analysing the data coming from the simulation and a module emulating the imperfect aspects of the real world (noisiness and incompleteness of data). a detailed description of algorithms and modules in this category is beyond the scope of this paper.
• planning and coordination – containing modules responsible for more complex coordination and planning beyond the basic vessel behavior implemented as part of the simulation. two different planners have so far been implemented. a geo-spatial planner plans the route of the vessel taking into account geographical obstacles (coastline, shallows, etc.) and a planner using the results of a game-theory formulation of the relation between the transport vessel and the agent (the formulation and results are briefly described in section 6.3).
• presentation – containing modules responsible for presenting the output of the simulation and the analytical modules using the google earth kml2 format.

3 data sources
as an essential feature, the platform incorporates several categories of real-world data and enables integrated analysis. general geographical data comprises general information about the geography of the environment, in particular shorelines, ports and shallow waters. this data is supplied directly by google (earth); it is used primarily for general vessel navigation and for providing the background geographical context in the user front-end. operational geographical data comprises geographical information specific to the operation of the simulated vessel types, in particular the location of the main piracy hubs [1], piracy zones, fishing zones and transit corridors. the operational geographical data governs the operation of individual vessel categories (see section 6). vessel attributes describe vessel operational attributes such as vessel type, length, tonnage, max speed, etc. this data is extracted from vessel tracking servers, e.g.
aislive [2], and is used to provide realistic parameters for simulated vessels. activity data comprises higher-level information about maritime activity. this data is typically provided by organizations observing the situation in relevant regions. the maritime security centre, horn of africa (mschoa) [3] provides up-to-date information on piracy incidents and alerts in the gulf of aden and off the somali coast, including guidelines on how to proceed when traversing these areas. this data is used for pirate behavior modeling, and also for the route planning of transport vessels.

4 simulation control
the simulation is executed from a script with a predefined configuration. the loading of the data and parameters and the creation of the needed data structures is carried out in the scenario configurator module. the simulation time flow is synchronous and discrete, i.e. the simulation time is measured in steps. in each step, a sequence of synchronized actions is executed.

4.1 scenario configurator
the scenario configurator handles the initialization of all modules and all agents present in the simulation. the configuration of the simulation (e.g. the number of vessels, the files containing vessel and gis data, etc.) as well as the parameters of each of the modules is specified in a single groovy3 script file4. the main advantage of this approach is that the script can be both embedded into a pre-compiled package and kept separate (in the source code form) to be able to run different scenarios or to modify the parameters of the simulation without recompilation. the modularity of our architecture enables us easily to initialize only a small subset of analyzers, or to initialize a simulation without support for kml-based visualization when running non-interactive batch experiments and logging the results for later postprocessing.

4.2 time flow
the main parameters governing the flow of the simulation are the simulation step size and the simulation step delay. the simulation step size parameter specifies how much time one step takes, measured in simulation time (i.e. not related to the real-world wall clock time5). the simulation step delay parameter specifies how much time a simulation thread sleeps after each step (to leave some computing time to other components, in particular the presentation and analysis components). this parameter is measured in wall clock time. the simulation duration parameter specifies the total simulation time of a particular scenario. the simulator module holds information about the current simulation step and the total simulation time of the simulation run. this time is measured in simulation-time seconds (i.e. the maximal granularity of the simulation time is one second). the simulation time is related to date-anchored time by the environment module, where the environment module is initialized to a specific calendar time (e.g. 24th of december, 2009). in each time step, the environment module updates the simulation time so that the other modules can derive additional context information relating to the given simulation date.

2 keyhole markup language (kml) is an xml-based language schema for expressing geographic annotation and visualization
3 http://groovy.codehaus.org/
4 the nature of scenario description is more algorithmic than declarative; it is therefore more convenient to describe the scenario configuration using a script than using a markup language (such as xml).
5 wall clock time is the human perception of the passage of time from the start to the completion of a task, i.e. the elapsed real-world time as determined by a chronometer such as a wristwatch or wall clock.
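the three time parameters just described can be illustrated with a minimal driver loop. the sketch below is an assumption-laden illustration: the module names (env, agent_container, analyzers) and their methods are hypothetical, not the platform's actual api, and the in-step actions anticipate the three-step cycle detailed in section 4.3 below.

```python
# sketch of the synchronous time flow of section 4.2 (hypothetical api)
import time

def run_simulation(env, agent_container, analyzers,
                   step_size_s=1,      # simulation seconds per step
                   step_delay_s=0.05,  # wall-clock sleep after each step
                   duration_s=3600):   # total simulated time of the scenario
    sim_time = 0
    while sim_time < duration_s:
        sim_time += step_size_s
        env.update_time(sim_time)                    # environment updates its clock
        events = agent_container.step(step_size_s)   # agents spend the time slice
        for analyzer in analyzers:                   # new data goes to subscribed analyzers
            analyzer.process(events)
        time.sleep(step_delay_s)  # leave cpu time to presentation/analysis threads
```

note how the loop separates simulation time (sim_time, advanced by step_size_s) from wall clock time (the sleep after each step), mirroring the two parameters above.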
the amount of time in each time step (given by the simulation step size parameter) affects the simulation granularity, and the temporal resolution of the output data obtained from the simulation. if the simulation step size is 1 s, the granularity is the finest, and we can focus on vessel behavior in detail (e.g. vessel interaction in ports etc.). if the simulation step size is set to 1 hour, the simulation runs fast, and we can roughly estimate the future positions of the vessels and events that may happen in the future.

4.3 simulation cycle
in each simulation step, the following sequence of actions is performed:
1. the environment module updates its simulation time.
2. the agent container is notified and it sequentially sends the information about the new step to each agent. each agent spends the amount of time on performing a part of its plan (see section 6 for details). generated events are sent to the agent container, which distributes the events to all relevant listeners and to the environment, which records the event.
3. if – as a result of actions of the agents – new data become available, this is sent to the subscribed analyzers for processing and analysis.
the synchronous nature of the simulation control, together with seed-based randomization, guarantees the determinism and thus the repeatability of the simulation. deadlocks cannot occur, as access to resources is sequential; this also enables the use of unsynchronized data structures, which are faster to work with.

5 environment model
the state of the environment is maintained by two data modules – the gis data provider and the vessel data provider – and a variable storing the current simulation time/date. the environment module is a central access point for the other modules to the environmental data, including data about the vessels. the temporal information is represented in standard time and date units. the main units of spatial information are nautical miles; speed is measured in nautical miles per hour or per second.

gis data model
the gis data provider stores the geographical data and provides access to this information. the following data is represented and is available for analysis and display:
• somali harbors – a list of somali harbors with name, location, a description and approximate capacity for pirate vessels. this data is used for pirate vessel initialization and placement. the original file is a kml file.
• fishing zones – a list of possible fishing zones around the somali coast. the fishing area is represented by a polygon. this data is used for initialization of fishing vessels around the somali coast. currently, artificial zones are used; this can later be replaced by a list of real fishing areas if available.
• piracy zones – a list of piracy zones around the somali coast. each pirate vessel chooses a zone in which it is active. the zone selected is the closest one to the pirate's base harbor. currently, artificial zones are used; this can later be replaced by a list of real piracy zones if available.
• irtc corridor – data about the international recommended transport corridor6. this data is used by the transport vessels to plan their trajectory.

6 as defined in http://asianyachting.com/news/piratecorridor.htm

vessel data model
the vessel data provider provides data about the ships in the simulation. each record is about one vessel and is identified with a unique identifier, vessel id (this id is equal to the ship's call sign when real-world ship parameters are used).
vessel-related data can be categorized into two groups:
• static vessel information – static information stays the same throughout the simulation and cannot be modified. this information can be viewed as a database table where the rows represent individual vessels and the columns represent the attributes of each vessel. the data is stored in an sql database for easy access.
• vessel trace information – a dynamic part of vessel information, storing the actual position of the ship as well as the previous positions in the form of time-annotated location records.

6 agent vessel model
the agents reside in the agent container, which distributes events gathered from the agents or from the environment. each agent controls one or more vessels. the plans for each vessel are either created prior to the simulation (e.g. for transport vessels) or, typically, are generated dynamically during the simulation run (e.g. for pirate vessels).

6.1 vessel types
the platform can simulate the simultaneous activity of a large number (thousands) of the following categories of vessels. long-range transport vessels are large- to very large-size vessels transporting cargo over long distances (typically intercontinental); these are the vessels most often targeted by pirates. short-range transport vessels are small- to medium-size vessels which transport passengers and/or cargo close to the shore or across the gulf of aden. these ships are local, and are not attacked by pirates. fishing vessels are small- to medium-size vessels which go fishing within designated fishing zones; fishing vessels launch from their home harbors and return back after the fishing is completed. these vessels can potentially be attacked by a pirate, but currently only local fishing vessels, which cannot be attacked, are simulated. pirate vessels are small- to medium-size vessels operating within designated piracy zones and seeking to attack a long-range transport vessel. the pirate control module supports several strategies, some of which can employ multiple vessels.

table 1 summarizes the main parameters of each type of vessel agent:

table 1: main parameters of different types of vessel agents
vessel type                    parameters
long-range transport vessel    start destination, goal destination
short-range transport vessel   start destination, goal destination
fishing vessel                 home port, fishing zone
pirate vessel                  home port, area of control, (targeted ship)

note that, in general, a vessel agent can control more than one vessel (e.g. a mothership pirate vessel agent controlling several small boats and a hijacked transport ship). the behavioral models for individual categories of vessels have been manually synthesized on the basis of information about real strategies, which have been obtained from several sources including the imb piracy reporting centre7 and the maritime terrorism research center8.

6.2 executable behavior representation
vessel agent behavior is implemented using finite state machines (fsm). agent fsms consist of states that represent an agent's principal mental states. transitions between the states are defined by unconditional transitions or by conditional transitions conditioned by external events.
implementation-wise, the simulator allocates a time slice to the agent and the agent delegates the quantum to the fsm. the current state may either use the whole time slice and stay in the current state, or it can utilize only part of the time slice and delegate the rest of the time to a following state or states. an example of a pirate fsm is depicted in figure 2.

fig. 2: finite state machine of the pirate vessel agent

6.3 path planning
a modular route planning architecture has been developed, allowing us to combine general shortest-route point-to-point planners with specialized planners for specific areas. a more detailed description of the operation of both planners can be found in [5] and [6].

general shortest-route navigation
the basic route planner finds a shortest route between two locations on the earth's surface considering vessel operational characteristics and environmental constraints, including the minimum allowed distance to shore, which can differ between regions. the planner is based on the a* algorithm adapted for a spherical environment and polygonal obstacles.

gulf of aden transit planner
factors other than route length need to be considered when planning a route through the gulf of aden. two specialized planners have therefore been developed for this purpose. the simple corridor planner navigates the ship through the international recommended transport corridor and mimics the current practice in the area. several alternative route planners through this area have also been implemented. a risk-minimizing route planner uses a pre-generated risk map to avoid high-risk areas. a game-theory planner solves a two-player zero-sum game between the transport ship and the attacker to find an optimal path through the gulf. a detailed description can be found in [5].

7 visualization
the user front-end provides a geo-based visualization of the various outputs provided by the platform, using google earth, a kml-capable viewer.

fig. 3: google earth-based front-end showing vessels, their past trajectories and an output from the dynamic incident risk analyzer over the gulf of aden

the google earth-based front-end allows us to interactively visualize the various outputs of the testbed, both the static data described in section 3 and the dynamically generated output of the advanced analysis and planning modules. a screenshot of the front-end is given in figure 3. in addition to ergonomic navigation and 3d camera control, the main advantage of the front-end is the ability to present structured data on varying levels of detail. the layer-based interface allows us to select different layers of information and compose an information picture with the aspects and the level of detail fit for the specific user's needs. to leverage the layer-based concept, the testbed output is organized into multiple information layers. the simulation itself provides a number of layers; each of the analysis and planning tools then also adds its own layer. the integration with google earth is provided via dynamically constituted kml files served by an http server running inside the platform. the kml files are read into the google earth application using its http data link feature and automatically refreshed. this way, dynamic data can be displayed (such as a moving vessel), though for performance reasons, the refresh rate is limited to about once a second.

7 http://www.icc-ccs.org/index.php?option=com content&view=article&id=30
8 http://www.maritimeterrorism.com/
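the time-sliced fsm execution of section 6.2 can be sketched as follows. this is a minimal illustration only: the state names (loosely based on the pirate fsm of fig. 2), the fixed per-state durations and the event set are hypothetical, not the platform's actual implementation.

```python
class PirateAgentFSM:
    """illustrative time-sliced fsm: a state consumes part of the allocated
    slice and may delegate the remainder to the next state (section 6.2)."""

    def __init__(self):
        self.state = "cruise_to_zone"

    def step(self, time_slice_s: float, events: set):
        remaining = time_slice_s
        while remaining > 0:
            remaining = self._run_state(remaining, events)

    def _run_state(self, t: float, events: set) -> float:
        if self.state == "cruise_to_zone":
            used = min(t, 30.0)              # sail toward the piracy zone
            if "zone_reached" in events:     # conditional transition on event
                self.state = "wait_for_target"
            return t - used                  # delegate any unused time onward
        if self.state == "wait_for_target":
            if "target_spotted" in events:
                self.state = "attack"
                return t                     # hand the whole rest of the slice on
            return 0.0                       # idling consumes the remaining time
        if self.state == "attack":
            return 0.0                       # attack uses the full slice
        return 0.0

agent = PirateAgentFSM()
agent.step(60.0, {"zone_reached"})
print(agent.state)  # wait_for_target
```

the key design point mirrored here is that a state returns the unused portion of its time slice, so a single simulation step can carry the agent through several state transitions.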
acknowledgement
the research described in this paper was supervised by prof. v. mařík, fee ctu in prague, and supported by office for naval research project no. n00014-09-1-0537 and by the czech ministry of education, youth and sports under research programme no. msm6840770038: decision making and control for manufacturing iii.

references
[1] somalia pirate attacks. bbs google earth forum, http://bbs.keyhole.com/ubb/ubbthreads.php?ubb=showflat&number=1242871%&site id=1, april 2008.
[2] aislive. http://www.aislive.com/, 2009.
[3] the maritime security centre, horn of africa (mschoa). http://www.mschoa.eu, 2009.
[4] glashenko, a., ivaschenko, a., rzevski, g., skobelev, p.: multiagent real time scheduling system for taxi companies. in: proc. of 8th int. conf. on autonomous agents and multi-agent systems (aamas 2009), budapest, hungary, 2009.
[5] jakob, m., vaněk, o., urban, š., benda, p., pěchouček, m.: adversarial modeling and reasoning in the maritime domain, year 1 report. technical report, agent technology center, department of cybernetics, fee, ctu prague, 2009.
[6] jakob, m., vaněk, o., urban, š., benda, p., pěchouček, m.: agentc: agent-based testbed for adversarial modeling and reasoning in the maritime domain. in: proc. of 9th int. conf. on autonomous agents and multiagent systems (aamas 2010) – demo track, may 2010.
[7] nincic, d.: maritime piracy in africa: the humanitarian dimension. african security review, vol. 18(3), p. 2–16, 2009.
[8] pěchouček, m., šišlák, d.: agent-based approach to free-flight planning, control, and simulation. ieee intelligent systems, vol. 24(1), p. 14–17, jan./feb. 2009.

about the author
ondřej vaněk graduated in technical cybernetics from fee, ctu in 2008, and is now a phd student at the agent technology center at the department of cybernetics, fee, ctu. his main research is focused on machine learning in multi-agent systems, cooperative and non-cooperative game theory, and coordination and cooperation in multi-agent systems.

ondřej vaněk
e-mail: vanek@agents.felk.cvut.cz
dept. of cybernetics, fee
czech technical university in prague
technická 2, 166 27 praha, czech republic

fmea and fta analyses of the adhesive joining process using electrically conductive adhesives
e. povolotskaya, p. mach

abstract
this paper introduces a formulation of appropriate risk estimation methods that can be used for improving processes in the electronics area. two risk assessment methods have been chosen with regard to the specifics of adhesive joining based on electrically conductive adhesives. the paper provides a combination of a failure mode and effect analysis (fmea) and fault tree analysis (fta) for optimizing the joining process. typical features and failures of the process are identified. critical operations are found and actions for avoiding failures in these operations are proposed. a fault tree has been applied to the process in order to get more precise information about the steps and operations in the process, and the relations between these operations. the fault tree identifies potential failures of the process. then the effects of the failures have been estimated by the failure mode and effect analysis method. all major differences between failure mode and effect analysis and fault tree analysis are defined, and there is a discussion about how the two techniques complement each other to achieve more efficient results.
keywords: failure mode and effect analysis, fault tree analysis, adhesive joining, electrically conductive adhesives.

1 introduction

electrically conductive adhesives (ecas) are becoming increasingly important in the electronics industry. these materials are used in two main areas of electronics packaging: in mounting heat-sensitive components such as lcds, and in mounting ultrafine-pitch electronic packages [1]. ecas create a permanent electrical and mechanical connection between the pad and the component lead. adhesives on an epoxy basis filled with silver conductive particles are mainly used. the curing temperature is lower than the soldering temperature of lead-free solders. the electrical conductivity depends on the concentration of conductive particles in the resin, and on the shape and the material of these particles [2]. an appropriate surface pretreatment of the joined parts and detailed control of the filler through an analysis of the grains are necessary to achieve good electrical, mechanical and thermal properties of adhesive joints [3]. to achieve parameters of adhesive joints that are comparable with soldered joints, it is necessary to optimize the process of adhesive joining. the electrical resistivity of adhesives, their electrical noise and the nonlinearity of the current vs. voltage characteristic are higher than these parameters for lead-free solders. the mechanical properties and climate resistivity of ecas are also worse than those of solders [4]. there are many parameters that influence the quality of adhesive joints in the process of adhesive joining. optimization of this process requires the use of proper quality control tools such as failure mode and effect analysis (fmea) and fault tree analysis (fta). these analyses make an examination of the critical process parameters possible.

fmea is a technique for analyzing the occurrence of process failures and their effect on the result of a process [7]. currently, fmea is a widely used method for risk assessment in industrial processes. this method is primarily adapted for material and equipment failures. there are four basic types of fmea: process fmea, system fmea, design fmea and service fmea. fmea produces risk priority numbers (rpns) as outcomes. rpns are obtained for each failure that can occur in a process. an rpn is the product of the severity, occurrence and detection numbers for each failure. the severity of a failure shows the level of seriousness of the failure. the occurrence number represents how often the failure occurs, and the detection number indicates the level of visibility of the failure. the purpose of fmea is to reduce the rpn by reducing one, two or all three numbers in order to improve the process and ensure the non-appearance of such errors subsequently. the appearance of the failure can be reduced by improving the technical documentation requirements in the process to eliminate the causes of failures or reduce their frequency. detection can be reduced by offering new or improved assessment methods or by offering additional equipment for detection. several examples of fmea implementation for industrial processes are presented in [12–15].

a fault tree forms the basis for logical-probabilistic models of system failure causality, failures of its elements and other events or impacts [8]. this method is based on sequences and combinations of disturbances and faults [9]. thus it is a multilevel structure or diagram of causal relationships [10].
fta provides a common vision of the process and its components, and of how these components are related. this makes it easy to identify the defects arising in the process. it also provides a way of proposing step-by-step improvements to prevent defects and errors and of making a trouble-proof process. several examples of a successful combination of the fault tree analysis and failure mode and effect analysis methods applied to industrial processes are shown in [10, 16–19]. this paper presents the use of fta and fmea for optimizing the joining process when electrically conductive adhesives (eca) are used.

2 theoretical background

2.1 basic risk analysis methods

generally, risk is the possibility of the occurrence of certain undesirable events that initiate various types of failures. risk analysis is used to find the causes of failures and to prevent the occurrence of these failures in the future. the results of risk analysis can be used for process optimization. risk analysis is divided into two complementary types:

1. qualitative.
2. quantitative.

the task of qualitative analysis is to identify the risk areas in a process, the types of risks, and the factors causing the risks. this is done in various ways, for example by an expert, by brainstorming and so on. quantitative analysis enables the level of effect to be quantified for each type of risk. basic methods for risk analysis are as follows:

1. analogies.
2. expert methods.
3. statistical methods.
4. modeling, etc.

the analogy approach is focused on an examination of analogies among data obtained from a range of sources. expert methods are used to collect the opinions of qualified specialists. a statistical approach to risk analysis uses various types of statistical methods to process data that has been obtained experimentally. simulation is based on calculating various types of models and on testing these models in various situations. the following are some of the most commonly used risk analysis methodologies [5]:

1. structured what-if technique (swift).
2. fault tree analysis (fta).
3. event tree analysis (eta).
4. failure modes and effects analysis (fmea).

two expert methods for risk analysis — fault tree analysis (fta) and failure mode and effect analysis (fmea) — were used for an analysis of a conductive adhesive joining process [22].

2.2 fault tree analysis

fta is a very powerful systematic method which is widely used for estimating process quality. starting from the top event, the fault-tree method uses boolean algebra and logical modeling to make a graphical representation of the relations among the various failure events at different levels of the process (figure 1) [21].

fig. 1: typical fault tree

this technique uses deductive logic. it enables the root causes of the failure events of a process to be found. this type of logic helps to establish a clear and detailed scheme of the relationships between steps or events in the process that can affect their quality. the contribution of the fault tree is as follows:

• it allows potentially failing parts of the process to be seen in detail.
• it helps identify failures deductively.
• it enables a qualitative or quantitative analysis of the process to be made.
• the method can focus on individual parts of the process, and can extract specific failures.
• it clearly represents the behavior of the process.

the main advantage of the fault tree (in comparison with other methods) is that the analysis is limited to identifying only those events of the process which lead to a specific process failure.
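the and/or gate logic of a fault tree such as figure 1 can be sketched in a few lines; the event names below are illustrative and do not come from the adhesive process itself:

```python
# minimal sketch of fault-tree gate evaluation with boolean logic;
# the basic events and tree shape are invented for illustration.

def AND(*events):  # the gate output occurs only if all inputs occur
    return all(events)

def OR(*events):   # the gate output occurs if any input occurs
    return any(events)

# basic events: True means the event occurred
improper_lead_finish = True
improper_pad_finish = False
inappropriate_curing = False

# intermediate events feed the top event of the tree
high_resistance = OR(improper_lead_finish, improper_pad_finish)
low_mech_resistivity = inappropriate_curing
faulty_joint = OR(high_resistance, low_mech_resistivity)  # top event
print(faulty_joint)  # True
```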
the disadvantages of fault trees are as follows:

• implementation of the method requires considerable input, because more process detail leads to a geometric increase in the analyzed area, and the number of influencing events grows correspondingly.
• a fault tree is a boolean logic diagram, which shows only two states: working and failed.
• it is difficult to estimate a state of partial failure of the process parts, because use of the method generally indicates that the process is either in good condition or in a faulty state.
• it requires a reliability specialist with deep knowledge of the process.

2.3 failure mode and effect analysis

the fmea method is applied in addition to the fta technique. failure mode and effects analysis (fmea) is a widely used analytical tool. it is especially useful in connection with reliability, maintainability and safety analyses. the main goals of the technique are to determine:

• the possible failures (defects) of the process, their causes and consequences;
• the criticality of the effects on the process (s), the probability of occurrence of the defects (o) and their detectability (d);
• a generalized assessment of the functionality of the process — the calculation of the rpn.

ten-point or five-point rating scales are often used for the occurrence, detection and severity numbers. a rule of thumb is usually used for the risk priority number: a serious look has to be taken at rpns higher than 125 when a ten-point scale is used [3]. a special team is set up to conduct fmea. the values of s, o, d and rpn are determined by expert estimates [22]. fmea of the production process covers the stage of technical preparation of the equipment and materials for the process to be started; it ends before the direct work begins [22].
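the rpn bookkeeping on the scales just described is easy to make explicit; the sketch below reuses the s, o, d values that appear later in table 3, and the 125 threshold follows the rule of thumb cited above:

```python
# minimal sketch of rpn computation on ten-point scales; the records
# reuse table 3's values, the 125 threshold is the cited rule of thumb.

def rpn(severity, occurrence, detection):
    """risk priority number: the product of s, o and d."""
    return severity * occurrence * detection

failures = {
    "fcu3: improper surface finish of a lead": (8, 6, 4),
    "fcu5: inappropriate curing":              (8, 3, 3),
    "fcu6: inappropriate storing":             (8, 3, 2),
}

for name, (s, o, d) in failures.items():
    score = rpn(s, o, d)
    flag = "action required" if score > 125 else "acceptable"
    print(f"{name}: rpn = {score} ({flag})")
# fcu3 yields 192 and is flagged, matching the worst rows of table 3
```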
3 experimental part

before the risk analysis is started, it is necessary to define the main steps in the process. a flow-chart of the process of electrically conductive adhesive joining is shown in figure 2. failures of adhesive joints are mostly connected with their mechanical and/or electrical properties. the structure of the total resistance of an adhesive joint is shown in figure 3. here r1 represents the resistance between the component lead and the adhesive, r2 represents the resistance of the adhesive, and r3 represents the resistance between the pad and the adhesive.

fig. 2: flow-chart for a joining process based on eca
fig. 3: total resistance of an adhesive joint

table 1: adhesive joining characteristics (the values are valid for joining components of dimension type 1206)

| | typical value | failure value |
|---|---|---|
| resistance (r) | 20 mΩ | ≥ 40 mΩ |
| nonlinearity (u) | 10 μV | ≥ 25 μV |

the parts of the assembly process which influence the values of these resistances are examined and analyzed for the risk of the occurrence of potential failures. a deductive approach (fta) and an inductive approach (fmea) are reviewed. the first step in the examination of adhesive joining using fault tree analysis is to identify the main undesirable events. to define such an event, it is necessary to define still-acceptable values for the joint resistance, the nonlinearity of the current vs. voltage characteristic of the joint, the shear strength, the tensile strength, etc. typical and failure values of electrical adhesive joining characteristics, such as the joint resistance and the nonlinearity of the current vs. voltage characteristic, are shown in table 1. these events become the top event of a fault tree.

table 2: influence of basic fault events on the properties of the adhesive joint — the basic events fcu1–fcu9 (improper material of a lead; improper material of a pad; improper surface finish of a lead; improper surface finish of a pad; inappropriate curing; inappropriate storing; inappropriate type of resin; inappropriate concentration of filler particles; inappropriate viscosity) are marked against four failure modes of a faulty eca joint: low mechanical resistivity, high electrical resistance, high nonlinearity of the adhesive, and high noise.

fig. 4: fta for joints formed by eca

the second step is to identify the events directly related to the top event. this is a repeatable process and can be continued until we reach the basic events that cause the top event. fault tree analysis (fta) is generally performed using a logical structure of AND and OR gates. in the case of the joining process based on electrically conductive adhesive (eca), each basic faulty event, alone, can cause one or more failures of an adhesive joint, so instead of grouping the events under gates, we used a tabular representation of the fta (table 2). as a result of applying the fta method to adhesive joining, we found the weakest parts of the process. to obtain a better understanding of the failures, we applied fta to each type of typical joint failure. the fault trees are presented in figures 4, 5 and 6. variance of the electrical resistance sometimes appears in joints of this type. it can be caused by an improper surface finish of the pad and the component lead, by faulty placing of a component, or by using an adhesive with faulty consistency. figure 7 shows a more detailed representation of the joint.

fig. 5: fta for joints formed by eca
fig. 6: fta for joints formed by eca
fig. 7: fta for joints formed by eca

when the fta has been performed, an inductive method, failure mode and effect analysis (fmea), is applied to the joining process in order to analyze the significance of the various types of failures. with the help of fmea, a potential failure mode in the process is analyzed to define its effect on the result of the process, and each potential failure mode is classified according to severity. in the process considered here, we used process fmea in an original functional approach [20]. in this approach, each step in the process performs a number of events which can be determined as outputs. the outputs are listed and analyzed. in our approach, we used the list of failures, i.e. the events which cause the top event, defined during the fta analysis, and for each undesirable event we define:

• the basic causes of the failure,
• the specific features of the process,
• the severity number of the failure (s): an assessment of the seriousness of the effects of the failure on the process,
• the occurrence number of the failure (o): an assessment of the likelihood that the failure will occur,
• the detection number of the failure (d): an assessment of whether current control methods detect the causes of the failure at an appropriate level,
• the risk priority number (rpn): the product of the detection, occurrence and severity numbers; this is used to set priorities for failures on process levels and to establish what requires additional quality planning,
• corrective actions,
• checks on corrective actions.
part of the output of this analysis for the faulty eca joint failure mode is shown in table 3.

table 3: part of the failure mode and effect analysis table for joints formed by eca (fmea processor: department of electrotechnology; date of fmea: september 2011; object of fmea: assembly technology; processed area: adhesive joining; fmea status: running; potential failure mode for all rows: faulty eca joint; the sheet closes with fields for approval and concurring signatures)

| fta code | function | potential failure effects | s | potential causes | o | current controls | d | rpn | actions recommended |
|---|---|---|---|---|---|---|---|---|---|
| fcu3 | preparation of component surfaces | high electrical resistance | 8 | improper surface finish of a lead | 6 | visual control | 4 | 192 | careful cleaning of the lead surface |
| fcu2 | preparation of component surfaces | high electrical resistance | 8 | improper surface finish of a pad | 6 | visual control | 4 | 192 | careful cleaning of the pad surface |
| fcu5 | curing | low mechanical resistivity | 8 | inappropriate curing | 3 | measuring of a temperature profile | 3 | 72 | careful setting up of the curing profile |
| fcu6 | storing of adhesive | low mechanical resistivity | 8 | inappropriate storing | 3 | control of storing | 2 | 48 | tracking of storing |

4 conclusion

the outcome of this approach is a reliability analysis realized through the interaction of the fmea and fta reliability tools. each of these risk analysis methods has advantages which enable the technological process to be investigated and help to observe the process more clearly from different points of view. the fmea method in general is a library of all possible potential failures and their consequences, whereas fta enables a detailed analysis of the logical and temporal relationships that lead to the failure taken as the top of the tree. the application of these two methods to the process, complementing each other, provides deeper information than applying the methods separately. as a consequence of this approach, more efficient results have been achieved. the most significant steps in forming high-quality adhesive joints are the preparation of the pad and the lead surfaces. preventive measures to avoid failures have been proposed.

references

[1] john, h. l., wong, c. p., ning, c. l., ricky lee, s. w.: electronics manufacturing with lead-free, halogen-free and conductive-adhesive materials. new york: mcgraw-hill, 2003.
[2] duraj, a.: using of electrically conductive adhesives for bonding in electrical engineering. prague: czech technical university in prague, 2001.
[3] charles, a., happer, m., miller, m.: electronic packaging, microelectronics, and interconnection dictionary. new york: mcgraw-hill, 1993.
[4] ageeva, n. d., vinakovskaya, y. u., lifanov, v. n.: electrotechnical material engineering. vladivostok: dvgtu, 2006.
[5] vose, d.: risk analysis: a quantitative guide. 3rd edition. england: john wiley and sons, ltd, 2008.
[6] stamatis, d. h.: failure mode and effect analysis: fmea from theory to execution. 2nd ed. wisconsin: asq quality press, 2003.
[7] mcdermott, r. e., mikulak, r. j., beauregard, m. r.: the basics of fmea. 2nd edition. new york: taylor & francis group, llc, 2009.
[8] bilal, m. a.: risk analysis in engineering and economics. florida: crc press llc, 2003.
[9] ics 03.100.01, czech technical standard iso 31000:2009. risk management — principles and guidelines, 2010.
[10] queiroz, s. r., álvares, a. j.: fmea and fta analysis for application of the reliability centered maintenance methodology: case study on hydraulic turbines. symposium series in mechatronics, 3, 2008, p. 803–812.
[11] tichy, m.: risk control. analysis and management. prague: c. h. beck, 2006.
[12] cassanelli, g., mura, g., cesaretti, f., vanzi, m., fantini, f.: reliability predictions in electronic industrial applications. microelectronics reliability, 45, 2005, p. 1321–1326.
[13] douglas, w. b., patrick, w. k., carl, s. b., michael, j. r.: electronic prognostics — a case study using global positioning. microelectronics reliability, 47, 2007, p. 1874–1881.
[14] abdul-nour, g., beaudoin, h., ouellet, p., rochette, r., lambert, s.: a reliability based maintenance policy; a case study. computers and industrial engineering, 35 (3–4), 1998, p. 591–594.
[15] scipioni, a., saccarola, g., centazzo, a., arena, f.: fmea methodology design, implementation and integration with haccp system in a food company. food control, 13, 2002, p. 495–501.
[16] han-xiong, l., ming, j. z.: a hybrid approach for identification of root causes and reliability improvement of a die bonding process — a case study. reliability engineering and system safety, 64, 1999, p. 43–48.
[17] ortmeier, f., reif, w., schellhorn, g.: formal safety analysis of a radio-based railroad crossing using deductive cause-consequence analysis (dcca). report d-86135. augsburg: lehrstuhl für softwaretechnik und programmiersprachen, universität augsburg, 2005.
[18] gasanelli, g., fantini, f., serra, g., sgatti, s.: reliability in automotive electronics; a case study applied to diesel engine control. microelectronics reliability, 43, 2003, p. 1411–1416.
[19] moser, t., melik-merkumians, m., zoitl, a.: ontology-based fault diagnosis for industrial control applications. report 11662353. bilbao: vienna university of technology, 2010.
[20] vesely, w. e., goldberg, f. f., roberts, n. h., haasl, d. f.: fault tree handbook. washington, d.c.: mail stop ssop, 1981.
[21] ching-torng, l., mao-jiun, j. w.: hybrid fault tree analysis using fuzzy sets. reliability engineering and system safety, 58, 1997, p. 205–213.
[22] kane, m. m., ivanov, b. v., koreshkov, v. n., skchirtladze, a. g.: systems, methods and tools of quality management. saint petersburg: piter press, 2009.

evgenia povolotskaya
pavel mach
e-mail: povolevg@fel.cvut.cz
czech technical university in prague

option derivatives in electricity hedging

p. pavlatka

abstract

despite the high volatility of electricity prices, there is still little demand for electricity power options, and the liquidity of these power derivatives on the power exchanges is quite low. one of the reasons is the uncertainty about how to evaluate these electricity options and about finding the right fair value of this product. hedging of electricity is associated mainly with products such as futures and forwards.
however, due to new trends in electricity trading and hedging, it is also useful to think more about options and the principles for working with them when hedging various portfolio positions and counterparties. we quite often encounter a situation where we need a perfect hedge for our customer's (an end user consuming electricity) portfolio, or we have to evaluate the volumetric risk (the inability of a customer to predict consumption), which is very similar to selling options. this is the moment to compare the effects of using options or futures to hedge these open positions. from a practical viewpoint, black-scholes prices appear to be the best available and the simplest method for evaluating option premiums, but there are some limitations that we have to consider.

keywords: option derivatives, electricity hedging, evaluation models, electricity prices.

1 key features of electricity prices

some years ago, the electricity market was vertically integrated, and the prices of this commodity were fully regulated by state-owned authorities worldwide. these regulated prices had to reflect the cost of electricity generation, transmission and distribution. the prices were therefore determined by well-known factors, and changed only rarely. since deregulation of the electricity market, prices have been determined according to the economic rule of supply and demand. many countries have established electricity pools, where the bids of electricity sellers are matched with the purchase orders of end users. these pools trade with long-term products and also with short-term products; the differences are only in liquidity and volatility. this deregulation has fully supported trading activities on the derivatives markets, which allow trading with financial electricity contracts as derivatives, where electricity is the underlying asset. the relatively high volatility of electric power and the important specifics of this commodity have forced many market players to manage price risk professionally. hedging market risks is a well-known way to eliminate the risk of price changes, but there are also weak points, which are associated with the specific features of electricity. due to the obvious specific features of electricity, e.g. its unique non-storability, electricity prices are more likely to be driven by spot supply and demand, which is inelastic. any shock in consumption or production may give rise to price jumps [4].

2 hedging

as the electricity market becomes deregulated and more competitive, changes in supply and demand are increasingly translated into price volatility and fluctuations. another very important driver has been the financial crises, which have shown us the impact of financial derivatives traded also on behalf of electricity contracts. most of the volatility of the fluctuations in supply and demand is visible on the daily spot market, where the price is mainly influenced by inelastic demand and short-term supply. figure 1 shows the increasing volatility of spot prices in recent years.

fig. 1: spot prices of electricity in €/mwh (source: http://www.eex.com)

most derivatives, however, are not typically used to hedge risks connected with daily price volatility. they are used to hedge risks associated with trend fluctuations and seasonal price volatility. market participants therefore often use annual and monthly derivatives. in a competitive electricity market, daily fluctuations in electricity prices will therefore be the most dramatic driver of price volatility.
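since volatility is also the key input of the option formulas discussed later, a minimal sketch of how it can be estimated from a daily spot series may help; the price series below is invented for illustration, not eex data:

```python
# minimal sketch of annualized volatility estimation from daily spot prices;
# the series is illustrative only, and ~250 trading days per year is assumed.
import math

prices = [43.1, 44.8, 41.9, 45.2, 47.0, 44.3, 46.1]  # eur/mwh, daily

returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]
mean = sum(returns) / len(returns)
var = sum((r - mean) ** 2 for r in returns) / (len(returns) - 1)
sigma_annual = math.sqrt(var) * math.sqrt(250)

print(f"annualized volatility: {sigma_annual:.1%}")
```

note that a yearly futures product is far less volatile than the daily spot series, which is consistent with the ~16 % volatility used for the cal2011 contract later in the paper.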
there are two different approaches, depending on the type of market participant. generators, as entities owning power plants, have a natural "long" electricity position, and the value of this position increases and decreases with the price of power. when power prices increase, the value of the electricity produced increases, and when power prices decrease, the value of the produced commodity decreases. an electricity consumer is naturally "short" and, in the opposite way, consumers benefit when prices go down and suffer a loss when prices increase. price volatility introduces new risks for generators, consumers and traders (brokers). in a competitive electricity market, generators will have to sell some of the electric power that they produce in volatile spot markets, and will bear the risk if the spot prices are lower than the generation costs. in the case of consumers, we have to consider higher seasonal and hourly price variability. it is obvious that this uncertainty can make it more difficult to assess and manage a customer's long-term financial position. electricity futures and other power derivatives help electricity generators and end consumers to hedge price risks (market risk) in a competitive electricity market. futures contracts are legally binding and negotiable contracts that call for future delivery of electricity. in many cases, physical delivery does not take place, and the futures contract is closed by buying or selling a futures contract on the delivery date. other power derivatives include options, price swaps, and otc forward contracts. power derivatives such as futures and options are traded on an exchange, where participants are required to deposit margins to cover all potential losses due to the credit risk of the counterparty and the market risk of an open position. other hedging instruments, such as forwards, are traded bilaterally "over-the-counter". recently we have seen how expensive membership of a power exchange is, and we can compare the costs of financial capital for margins with paid option premiums.

2.1 short-term or long-term hedging

it is relatively difficult to define a strict boundary between long-term hedging and short-term hedging. we could define short-term hedging in terms of the most distant maturity of a monthly contract traded on the power exchange. this could be between 6 and 12 months. the most important risks associated with short-term hedging are cash-flow problems. in the first case, a cash-flow problem emerges from an insufficient initial and variation margin for mtm ("mark-to-market"). the result is that the intended hedging transaction becomes a speculative position after a margin call that we are not able to pay. in the second case, we can consider an unhedged price risk, which results from inadequate hedging of open positions. this case occurs very often and is associated with volumetric risk. most electricity consumption depends on short-term conditions, and there are not enough strict plans or "take or pay" contracts to motivate the end customer to consume in line with the contracted volume. gains and losses from hedging activities that occur in the futures market when a hedge is undertaken must be viewed as part of the electricity price that the market participant provides to its customers. the same approach has to be taken in the case of option premiums. sometimes a market player takes a profit in the futures market and loses in the spot market, and sometimes the reverse situation occurs.
it is clear that hedging profits and losses must be treated simply as part of the cost of purchasing energy. with an imperfect hedge, the market player could earn less on his futures position than he loses between his fixed-price contract and the spot market, or he could earn more. there is no clear line to distinguish long-term from short-term hedging. the cash-flow risk increases exponentially due to margin calls as the maturity of the long-term hedge increases. we consider that the increase in risk is faster than linear, for two reasons. firstly, the price volatility increases approximately in proportion to the square root of the length of the hedge, and secondly, the amount being hedged is generally proportional to the length of the hedge, because the market player will be hedging a constant volume over time. the primary risk associated with long-term hedging is again associated with margin-call risk. now we can compare forward and futures contracts. a key difference between forward and futures contracts is in cash settlement, which is performed by a clearing bank in the case of futures. a buyer or seller of a futures contract will have to realize short-term losses or gains as the futures price changes. this cash settlement is performed daily. in the case of a forward contract, profit and loss is realized only at maturity, and there is no cash-flow problem due to the payment of a variation margin. there is another, more important, specific consideration which could make forward dealing less interesting for smaller business units, and that is the credit-risk exposure of an electricity seller. in the case of futures, this credit risk and also the market risk are solved by mtm (daily cash settlement) clearing. it is obvious that the money lost on the future is entirely regained from the added profit on the fixed-price contract that was sold at the start of this example. if the loss is quite large, it may be impossible for the hedging market participants to raise the cash margins necessary to meet the variation margin requirement. in this case, the clearing bank has the right to liquidate all open positions of the counterparties. hedging over longer periods puts traders at risk of extremely large margin calls. the consequence is that long-term hedging requires significant financial resources to meet variation margin requirements.
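the daily mark-to-market mechanism described above can be sketched in a few lines; the position size and the daily settlement prices below are invented for illustration:

```python
# minimal sketch of daily mark-to-market settlement on a short futures hedge;
# prices and position size are illustrative, not market data.

position_mwh = -8760                        # seller hedging 1 mw for a year
closing_prices = [40.0, 41.5, 43.2, 42.1]   # eur/mwh daily settlement prices

margin_account = 0.0
for prev, curr in zip(closing_prices, closing_prices[1:]):
    variation = position_mwh * (curr - prev)  # daily gain or loss
    margin_account += variation
    if variation < 0:
        print(f"margin call: deposit {-variation:,.0f} eur")

print(f"cumulative settlement: {margin_account:,.0f} eur")
# a rising market forces the short hedger to fund large variation margins,
# even though the loss is offset by the fixed-price contract at maturity
```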
2.2 options

in 1996, nymex introduced options for electricity. there are two types of options for electricity: a put option ("floor") and a call option ("cap"). in the first case, the buyer of an electricity put option pays a premium for the right, but not the obligation, to sell electricity at a specified price, the strike or exercise price, at a specified exercise time. end users use call options to place a maximum (cap) price that they will pay for the commodity at a specified exercise time. market participants often use combinations of calls and puts to ensure a particular price range. generators often use put options to guarantee a minimum price for the produced electricity in conjunction with the physical sale of electricity. with this product, a generator can benefit from increases in commodity prices, but avoids the risk of lower prices. consider that the futures contract price is €43/mwh and that, based on a profit analysis, the generator would like to receive at least this amount. the generator therefore purchases a put option for €2/mwh. if the price of electricity increases, the generator will sell electricity into the spot market and receive the higher spot price (see figure 2). if the price of electricity falls, the generator will sell electricity to the option holder for €43/mwh, or he will sell his option at its exercise value, €43/mwh, on or before its expiration date.

fig. 2: put option of an electricity producer

a consumer (end user) deals with the opposite problem. in the case of hedging, he would use a call option to avoid the risk of higher prices, while retaining the ability to participate in potentially lower prices. let us assume that the futures contract price is €40/mwh and that the consumer would like to pay no more than this price. in this case, the customer will buy a call option, say for €2/mwh, which the end user has to pay in advance. if the price of electricity falls, the consumer will buy electricity in the spot market. if the price goes up, the end user will buy electricity from the option holder for €40/mwh, or he will sell his call option for its exercise value, €40/mwh, on or before its expiration date.
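the two worked examples above translate into simple payoff arithmetic; the strikes and premiums follow the text, while the spot outcomes are invented:

```python
# minimal sketch of the cap and floor payoffs from the two worked examples;
# strikes and premiums follow the text, the spot outcomes are illustrative.

def consumer_cost(spot, strike=40.0, premium=2.0):
    """effective purchase price with a call option ("cap")."""
    return min(spot, strike) + premium

def generator_revenue(spot, strike=43.0, premium=2.0):
    """effective selling price with a put option ("floor")."""
    return max(spot, strike) - premium

for spot in (35.0, 40.0, 48.0):
    print(f"spot {spot:5.1f}: consumer pays {consumer_cost(spot):5.1f}, "
          f"generator receives {generator_revenue(spot):5.1f}")
# the consumer never pays more than 42 (= 40 + 2) per mwh,
# the generator never receives less than 41 (= 43 - 2) per mwh
```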
3 option evaluation

market models for evaluating derivatives and options work mainly with storable commodities. the non-storability of electricity implies a breakdown of the relationship between the spot price and the forward price (eydeland and geman 1988). the second problem is with the convenience yield, which is important for many commodities. the convenience yield is a numeric adjustment to the cost of carry in the non-arbitrage pricing formula for forward prices. in the case of the money market we can consider the following situation. let $F_{t,T}$ be the forward price of an asset with initial price $S_t$ and maturity $T$. we suppose that $r$ is the continuously compounded risk-free interest rate. then the non-arbitrage pricing formula for the forward price is

$$ F_{t,T} = S_t\, e^{r(T-t)} \qquad (1) $$

however, this relationship does not exist in most commodity markets, partly because of the inability of investors and speculators to short the underlying asset. instead, there is a correction to the forward pricing formula given by the convenience yield $c$, hence

$$ F_{t,T} = S_t\, e^{(r-c)(T-t)} \qquad (2) $$

the convenience yield exists because owners of the asset may obtain a benefit from physically holding this asset as inventory to maturity. these benefits include the ability to profit from temporary shortages. anyone who owns inventory has the choice between consumption today and investment for the future, and a rational investor will choose the outcome that is best. when inventories are high, this suggests an anticipated relatively low scarcity of the commodity today versus some time in the future; otherwise the investor would sell his stocks, and the forward price $F_{t,T}$ of the asset should be higher than the current spot price $S_t$. this tells us that $r - c > 0$. the reasoning becomes interesting in the case of low inventories, when we expect that the scarcity now is greater than it will be in the future. the investor then wants to borrow inventory from the future but is unable to. we expect future prices to be lower than today, hence $F_{t,T} < S_t$, which implies that $r - c < 0$. the concept of the convenience yield was introduced by kaldor (1939) and working (1949) for agricultural commodities; it represented the benefit of holding the commodity as opposed to a forward contract. the concept of the convenience yield does not make sense in the case of electricity, because there is no available method to store electric power, and therefore we cannot weigh the benefit from storing the commodity against the storage costs.

the first, very important, characteristic of electricity prices is a mean reversion toward a level representing the marginal cost of electricity production, which can be constant, periodic, or periodic with some trend. in the case of electricity, we have to expect the mean to revert to a deterministic periodic trend driven by seasonal effects. the second specific driver of electricity prices is the existence of temporary imbalances of supply and demand in the network, which cause random price moves around the average trend. we are not able to predict this effect. a third feature is the jump character of electricity prices (spikes), because shocks in power supply and demand cannot be smoothed away by inventories. as was mentioned above, the convenience yield attached to a commodity can be interpreted as a continuous dividend payment made to the owner of the commodity. we can then suppose that the price of the underlying asset is driven by a geometric brownian motion and use merton's (1973) formula for pricing options (3). this formula provides the price of a plain vanilla call option written on a commodity with price $S$:

$$ C(t) = S(t)\,e^{-y(T-t)}N(d_1) - K\,e^{-r(T-t)}N(d_2), \qquad (3) $$

where

$$ d_1 = \frac{\ln\!\left(\dfrac{S(t)\,e^{-y(T-t)}}{K\,e^{-r(T-t)}}\right) + \dfrac{1}{2}\sigma^2(T-t)}{\sigma\sqrt{T-t}} \qquad (4) $$

$$ d_2 = d_1 - \sigma\sqrt{T-t} \qquad (5) $$

as was mentioned above, the main difficulties in valuing power options are due to the fact that this commodity cannot be stored, and the associated problem with the convenience yield. the results of using formula (3) are very similar to the prices at the eex pool. all inputs and results are given in table 1.

fig. 3: market data of a traded option (call/put, source: http://www.eex.com)

table 1: inputs and results of a comparison of eex market data (contract cal2011, futures price 47.58 €/mwh) with an evaluation of the black–scholes formula (3) at a volatility of approx. 16 %; all premiums in €/mwh

| strike price | eex call | eex put | b–s call | b–s put |
|---|---|---|---|---|
| 49 | 2.47 | 3.88 | 2.12 | 3.51 |
| 50 | 2.11 | 4.51 | 1.76 | 4.13 |
| 51 | 1.80 | 5.19 | 1.45 | 4.80 |
| 53 | 1.30 | 6.67 | 0.96 | 6.28 |
| 55 | 0.92 | 8.28 | 0.62 | 7.90 |
| 57 | 0.65 | 9.99 | 0.39 | 9.63 |
| 58 | 0.55 | 10.87 | 0.30 | 10.53 |
| 59 | 0.46 | 11.77 | 0.24 | 11.44 |
| 64 | 0.18 | 16.45 | 0.06 | 16.17 |

fig. 4: comparison of market data (eex) with the results of the b-s formula (3)

the primary category of traded electricity options includes calendar-year contracts and monthly physical options, which are also traded at the eex pool. call options allow the buyer to receive power at the strike price. these options are relatively liquid; see figure 3 for the pricing of a set of options divided by strike price. the results of using the b-s formula (3) are relevant in this case and, after a comparison with the market pricing of the options (see figure 4), they are very similar. the second category of options comprises daily options, or options for a block of hours. these (asian-type) options are specified for a given period of time and can be exercised every day during this period. it is obvious that daily options are very difficult to manage and are not liquid. it is therefore important to work further with pricing models associated with the jump characteristics and mean reversion of electricity prices.
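a minimal sketch of formula (3) is given below. the paper only states the futures price (47.58 €/mwh) and the volatility (~16 %); the time to maturity and the zero rate and convenience yield used here are illustrative assumptions (pricing directly off the futures price with negligible discounting), not values taken from the paper:

```python
# minimal sketch of merton's formula (3)-(5); T, r and y are illustrative
# assumptions - the paper only states the futures price and ~16 % volatility.
from math import log, sqrt, exp, erf

def norm_cdf(x):
    """standard normal cumulative distribution N(x)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def call_price(S, K, sigma, T, r=0.0, y=0.0):
    """plain vanilla call on a commodity, formula (3)."""
    d1 = (log(S * exp(-y * T) / (K * exp(-r * T))) + 0.5 * sigma**2 * T) \
         / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * exp(-y * T) * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)

# cal2011 example from table 1: strike 49, volatility ~16 %
print(round(call_price(S=47.58, K=49.0, sigma=0.16, T=0.8), 2))
# prints roughly 2.10, close to the 2.12 eur/mwh reported in table 1
```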
4 conclusion

liberalization of the energy industry requires adaptation of risk-management techniques. pricing or selling derivative products poses new challenges for market participants. the non-storability of electrical power and the inability to hold short positions in electricity spot prices have limited the utilization of techniques from financial mathematics. in the case of european-type derivatives in energy markets, namely options (call, put and collar), it turns out that these derivative prices are related to the standard black-scholes option pricing formula. the most important parameter for calculating the fair value is volatility. these findings are positive, since the standard tool for pricing option derivatives can also be applied in power markets without losing the influence of specific market features.

acknowledgement

the research described in this paper was supervised by prof. o. starý, fee ctu in prague.

references

[1] borovkova, s., geman, h.: analysis and modeling of electricity futures prices. studies in nonlinear dynamics & econometrics, vol. 10, issue 3, article 6, the berkeley electronic press, 2006.
[2] hull, j. c.: options, futures, and other derivatives. 6th edition, prentice hall, 2006, isbn 0-13-149908-4.
[3] redl, c.: modeling electricity futures. energy economics group, vienna university.
[4] stoft, s., belden, t., goldman, c., pickle, s.: primer on electricity futures and other derivatives. university of california, 1998.

about the author

pavel pavlatka was born in ceske budejovice in 1982. he was awarded a master's degree in february 2008. he is currently a doctoral student at the department of economics, management and humanities, fee, ctu in prague.

pavel pavlatka
e-mail: pavlap1@fel.cvut.cz
dept. of economics, management and humanities
czech technical university
zikova 4, 166 29 praha 6, czech republic

five-dimensional n = 4 supersymmetric mechanics

s. bellucci, s. krivonos, a. sutulin

abstract

we perform an su(2) hamiltonian reduction in the bosonic sector of the su(2)-invariant action for two free (4,4,0) supermultiplets. as a result, we get five-dimensional n = 4 supersymmetric mechanics describing the motion of an isospin-carrying particle interacting with a yang monopole. some possible generalizations of the action to the cases of systems with a more general bosonic action, constructed with the help of ordinary and twisted n = 4 hypermultiplets, are considered.

keywords: supersymmetric mechanics, hamiltonian reduction, non-abelian gauge fields.

1 introduction

the supersymmetric mechanics describing the motion of an isospin particle in background non-abelian gauge fields has attracted a lot of attention in the last few years [1, 2, 3, 4, 5, 6, 7, 8, 9], especially due to its close relation with higher-dimensional hall effects and their extensions [10], as well as with the supersymmetric versions of various hopf maps (see e.g. [1]). the key point of any possible construction is to find a proper realization for the semidynamical isospin variables, which have to be invented for the description of monopole-type interactions in lagrangian mechanics. in supersymmetric systems these isospin variables should belong to some supermultiplet, and the main question is what to do with the additional fermionic components accompanying the isospin variables.
in [3], fermions of such a kind, together with the isospin variables, span an auxiliary (4,4,0) multiplet with a wess-zumino type action possessing an extra u(1) gauge symmetry¹. in this framework, an off-shell lagrangian formulation was constructed, with the harmonic superspace approach [11, 12], for a particular class of four-dimensional [3] and three-dimensional [5] n = 4 mechanics with a self-dual non-abelian background. the same idea of coupling with an auxiliary semidynamical supermultiplet has also been elaborated in [6] within the standard n = 4 superspace framework, and then applied to the construction of lagrangian and hamiltonian formulations of the n = 4 supersymmetric systems describing the motion of isospin particles on three- [7] and four-dimensional [8] conformally flat manifolds, carrying the non-abelian fields of the wu-yang monopole and the bpst instanton, respectively. in both these approaches the additional fermions were completely auxiliary, and they were expressed through the physical ones on the mass shell.

another approach, based on the direct use of su(2) reduction, first considered at the lagrangian level in the purely bosonic case in [1], has been used in the supersymmetric case in [9]. the key idea of this approach is to perform a direct su(2) hamiltonian reduction in the bosonic sector of the n = 4 supersymmetric system with the general su(2)-invariant action for a self-coupled (4,4,0) supermultiplet. no auxiliary superfields are needed within such an approach, and the procedure itself is remarkably simple and automatically successful. as concerns the interaction with the non-abelian background, the system considered in [9] was not too illuminating, due to its small number (only one) of physical bosons. in the present letter, we extend the construction of [9] to the case of the n = 4 supersymmetric system with five (and, in a special case, four) physical bosonic components. it is not, therefore, strange that the arising non-abelian background coincides with the field of a yang monopole (a bpst instanton field in the four-dimensional case). a very important preliminary step, discussed in detail in section 2, is to pass to new bosonic and fermionic variables, which are inert under the su(2) group over which we perform the reduction. thus, the su(2) group rotates only three bosonic components, which enter the action through su(2)-invariant currents. just these bosonic fields become the isospin variables which the background field couples to. due to the commutativity of n = 4 supersymmetry with the reduction su(2) group, it survives upon reduction. in section 3 we consider some possible generalizations, which include a system with a more general bosonic action, a four-dimensional system which still includes eight fermionic components, and a variant of five-dimensional n = 4 mechanics constructed with the help of ordinary and twisted n = 4 hypermultiplets. finally, in the conclusion we discuss some unsolved problems and possible extensions of the present construction.

¹ note that the first implementation of this idea was proposed in [4].

2 (8b,8f) → (5b,8f) reduction and the yang monopole

as the first nontrivial example of su(2) reduction in n = 4 supersymmetric mechanics, we consider the reduction from the eight-dimensional bosonic manifold to the five-dimensional one.
to start with, let us choose our basic n = 4 superfields to be the two quartets of real n = 4 superfields $Q^{i\hat\alpha}_a$ (with $i,\hat\alpha,a = 1,2$), defined in the n = 4 superspace $\mathbb{R}^{(1|4)} = (t, \theta_{ia})$ and subjected to the constraints

$$ D^{(ia}Q^{j)\hat\alpha}_a = 0, \qquad \big(Q^{i\hat\alpha}_a\big)^\dagger = Q_{i\hat\alpha a}, \qquad (2.1) $$

where the corresponding covariant derivatives have the form

$$ D^{ia} = \frac{\partial}{\partial\theta_{ia}} + i\,\theta^{ia}\partial_t, \qquad \big\{D^{ia}, D^{jb}\big\} = 2i\,\epsilon^{ij}\epsilon^{ab}\partial_t. \qquad (2.2) $$

these constrained superfields describe the ordinary n = 4 hypermultiplet with four bosonic and four fermionic variables off shell [12, 13, 14, 15, 16, 17]. the most general action for the $Q^{i\hat\alpha}_a$ superfields is constructed by integrating an arbitrary superfunction $F(Q^{i\hat\alpha}_a)$ over the whole n = 4 superspace. here, we restrict ourselves to the simplest prepotential of the form²

$$ F(Q^{i\hat\alpha}_a) = Q^{i\hat\alpha}_a Q_{i\hat\alpha a} \;\longrightarrow\; S = \int dt\, d^4\theta\; Q^{i\hat\alpha}_a Q_{i\hat\alpha a}. \qquad (2.3) $$

² we used the following definition of the superspace measure: $d^4\theta \equiv -\frac{1}{96}\,D^{ia}D_{ib}D^{bj}D_{ja}$.

the rationale for this selection is, first of all, its manifest invariance under the su(2) transformations acting on the "$\hat\alpha$" index of $Q^{i\hat\alpha}$. this is the symmetry over which we are going to perform the su(2) reduction. secondly, just this form of the prepotential guarantees so(5) symmetry in the bosonic sector after reduction. in terms of components, the action (2.3) reads

$$ S = \int dt\left[\dot q^{i\hat\alpha}_a \dot q_{i\hat\alpha a} - \frac{i}{8}\,\dot\psi^{a\hat\alpha}_b \psi_{a\hat\alpha b}\right] \qquad (2.4) $$

where the bosonic and fermionic components are defined as

$$ q^{i\hat\alpha}_a = Q^{i\hat\alpha}_a\big|, \qquad \psi^{a\hat\alpha}_b = D^{ia}Q^{\hat\alpha}_{ib}\big|, \qquad (2.5) $$

and, as usual, $(\ldots)|$ denotes the $\theta_{ia} = 0$ limit. thus, from the beginning we have just the sum of two independent non-interacting (4,4,0) supermultiplets. to proceed further, we introduce the following bosonic fields $q^{i\alpha}_a$ and fermionic fields $\psi^{a\alpha}_b$:

$$ q^{i\alpha}_a \equiv q^{i}_{a\hat\alpha}\, g^{\alpha\hat\alpha}, \qquad \psi^{a\alpha}_b \equiv \psi^{a}_{\hat\alpha b}\, g^{\alpha\hat\alpha}, \qquad (2.6) $$

where the bosonic variables $g^{\alpha\hat\alpha}$, subjected to $g^{\alpha\hat\alpha}g_{\alpha\hat\alpha} = 2$, are chosen as

$$ g^{1\hat 1} = \frac{e^{-\frac{i}{2}\phi}}{\sqrt{1+\lambda\bar\lambda}}\,\lambda, \quad g^{2\hat 1} = -\frac{e^{-\frac{i}{2}\phi}}{\sqrt{1+\lambda\bar\lambda}}, \quad g^{2\hat 2} = \big(g^{1\hat 1}\big)^\dagger, \quad g^{1\hat 2} = -\big(g^{2\hat 1}\big)^\dagger. \qquad (2.7) $$

the variables $g^{\alpha\hat\alpha}$ play the role of a bridge relating the two different su(2) groups realized on the indices $\alpha$ and $\hat\alpha$, respectively. in terms of the variables given above, the action (2.4) acquires the form

$$ S = \int dt\left[\dot q^{i\alpha}_a \dot q_{i\alpha a} - 2 q^{i\alpha}_a \dot q^{\beta}_{ia} J_{\alpha\beta} + \frac{q^{i\alpha}_a q_{i\alpha a}}{2}\, J^{\beta\gamma}J_{\beta\gamma} - \frac{i}{8}\,\dot\psi^{a\alpha}_b \psi_{a\alpha b} + \frac{i}{8}\,\psi^{a\alpha}_b \psi^{\beta}_{ab} J_{\alpha\beta}\right], \qquad (2.8) $$

where

$$ J^{\alpha\beta} = J^{\beta\alpha} = g^{\alpha\hat\alpha}\,\dot g^{\beta}{}_{\hat\alpha}. \qquad (2.9) $$

as follows from (2.6), the variables $q^{i\alpha}_a$ and $\psi^{a\alpha}_b$, which clearly contain five independent bosonic and eight fermionic components, are inert under the su(2) rotations acting on the $\hat\alpha$ indices. under these su(2) rotations, realized now only on the $g^{\alpha\hat\alpha}$ variables in the standard way,

$$ \delta g^{\alpha\hat\alpha} = \gamma^{(\hat\alpha\hat\beta)}\, g^{\alpha}{}_{\hat\beta}, \qquad (2.10) $$

the fields $(\phi, \lambda, \bar\lambda)$ of (2.7) transform as [17]

$$ \delta\lambda = \gamma^{\hat1\hat1} e^{i\phi}(1+\lambda\bar\lambda), \quad \delta\bar\lambda = \gamma^{\hat2\hat2} e^{-i\phi}(1+\lambda\bar\lambda), \quad \delta\phi = -2i\gamma^{\hat1\hat2} + i\gamma^{\hat2\hat2} e^{-i\phi}\bar\lambda - i\gamma^{\hat1\hat1} e^{i\phi}\lambda. \qquad (2.11) $$

it is easy to check that the currents $J^{\alpha\beta}$ of (2.9), expressed in terms of the fields $(\phi, \lambda, \bar\lambda)$,

$$ J^{11} = -\frac{\dot\lambda - i\lambda\dot\phi}{1+\lambda\bar\lambda}, \qquad J^{22} = -\frac{\dot{\bar\lambda} + i\bar\lambda\dot\phi}{1+\lambda\bar\lambda}, \qquad J^{12} = -i\,\frac{1-\lambda\bar\lambda}{1+\lambda\bar\lambda}\,\dot\phi - \frac{\dot\lambda\bar\lambda - \lambda\dot{\bar\lambda}}{1+\lambda\bar\lambda}, \qquad (2.12) $$

are invariant under (2.11). hence, the action (2.8) is invariant under the transformations (2.10). next, we introduce the standard poisson brackets for the bosonic fields,

$$ \{\pi, \lambda\} = 1, \qquad \{\bar\pi, \bar\lambda\} = 1, \qquad \{p_\phi, \phi\} = 1, \qquad (2.13) $$

so that the generators of the transformations (2.11),

$$ I_\phi = p_\phi, \qquad I = e^{i\phi}\big[(1+\lambda\bar\lambda)\,\pi - i\bar\lambda\, p_\phi\big], \qquad \bar I = e^{-i\phi}\big[(1+\lambda\bar\lambda)\,\bar\pi + i\lambda\, p_\phi\big], \qquad (2.14) $$

will be the noether constants of motion for the action (2.8). to perform the reduction over this su(2) group, we fix the noether constants as (cf. [1])

$$ I_\phi = m \quad \text{and} \quad I = \bar I = 0, \qquad (2.15) $$

which yields

$$ p_\phi = m \quad \text{and} \quad \pi = \frac{i m \bar\lambda}{1+\lambda\bar\lambda}, \quad \bar\pi = -\frac{i m \lambda}{1+\lambda\bar\lambda}. \qquad (2.16) $$
performing a routh transformation over the variables $(\lambda, \bar\lambda, \phi)$, we reduce the action (2.8) to

$$ \tilde S = S - \int dt\,\big\{\pi\dot\lambda + \bar\pi\dot{\bar\lambda} + p_\phi\dot\phi\big\} \qquad (2.17) $$

and substitute the expressions (2.16) into $\tilde S$. at the final step, we have to choose a proper parametrization for the bosonic components $q^{i\alpha}_a$ of (2.6), taking into account that they contain only five independent variables. following [1], we choose these variables as

$$ q^{i\alpha}_1 = \frac{1}{2}\,\epsilon^{i\alpha}\sqrt{r+z_5}, \qquad q^{i\alpha}_2 = \frac{1}{\sqrt{2(r+z_5)}}\left(x^{(i\alpha)} - \frac{1}{\sqrt2}\,\epsilon^{i\alpha} z_4\right), \qquad (2.18) $$

where

$$ x^{12} = \frac{i}{\sqrt2}\,z_3, \quad x^{11} = \frac{1}{\sqrt2}\,(z_1 + i z_2), \quad x^{22} = \frac{1}{\sqrt2}\,(z_1 - i z_2), \quad r^2 = \sum_{M=1}^{5} z_M z_M, \qquad (2.19) $$

and now the five independent fields are $z_M$. slightly lengthy but straightforward calculations lead to

$$ S_{red} = \int dt\left[\frac{1}{4r}\,\dot z_M\dot z_M - \frac{i}{8}\,\dot\psi^{a\alpha}_b\psi_{a\alpha b} + \frac{i}{4r}\,H^{\alpha\beta}V_{\alpha\beta} + \frac{1}{128 r}\,H^{\alpha\beta}H_{\alpha\beta} - \frac{m^2}{r} - \frac{m}{4r}\,v^\alpha\bar v^\beta H_{\alpha\beta} - \frac{4im}{r}\,v^\alpha\bar v^\beta V_{\alpha\beta} + im\big(\dot v^\alpha\bar v_\alpha - v^\alpha\dot{\bar v}_\alpha\big)\right]. \qquad (2.20) $$

here

$$ H^{\alpha\beta} = \psi^{a\alpha}_b\psi^{\beta}_{ab}, \qquad v^\alpha = g^{\alpha\hat1}, \qquad \bar v^\alpha = g^{\alpha\hat2}, \qquad v^\alpha\bar v_\alpha = 1, \qquad (2.21) $$

and to ensure that the reduction constraints (2.16) are satisfied, we added lagrange multiplier terms (the last two terms in (2.20)). finally, the variables $V^{\alpha\beta}$ in the action (2.20) are defined in a rather symmetric way to be

$$ V^{\alpha\beta} = \frac{1}{2}\left(q^{i\alpha}_a\dot q^{\beta}_{ia} + q^{i\beta}_a\dot q^{\alpha}_{ia}\right). \qquad (2.22) $$

to clarify the relation of these variables to the potential of the yang monopole, one has to introduce the following isospin currents (which will form an su(2) algebra upon quantization):

$$ T^i = v_\alpha\,\big(\sigma^i\big)^{\alpha}{}_{\beta}\,\bar v^\beta, \qquad i = 1,2,3. \qquad (2.23) $$

now, the $(v^\alpha\bar v^\beta)$-dependent terms in the action (2.20) can be rewritten as

$$ -\frac{m}{4r}\,v^\alpha\bar v^\beta H_{\alpha\beta} - \frac{4im}{r}\,v^\alpha\bar v^\beta V_{\alpha\beta} = m\,T^i\left(\frac{1}{r(r+z_5)}\,\eta^i_{\mu\nu}\, z_\mu\dot z_\nu + \frac{1}{8r}\,H^i\right), \qquad \mu,\nu = 1,2,3,4, \qquad (2.24) $$

where

$$ \eta^i_{\mu\nu} = \delta^i_\mu\delta_{\nu 4} - \delta^i_\nu\delta_{\mu 4} + \epsilon^i_{\mu\nu 4} \qquad (2.25) $$

is the self-dual 't hooft symbol and the fermionic spin currents are introduced as

$$ H^i = H^{\alpha}{}_{\beta}\,\big(\sigma^i\big)^{\beta}{}_{\alpha}. \qquad (2.26) $$

thus we conclude that the action (2.20) describes an n = 4 supersymmetric five-dimensional isospin particle moving in the field of the yang monopole

$$ \mathcal{A}_\mu = -\frac{1}{r(r+z_5)}\,\eta^i_{\mu\nu}\, z_\nu\, T^i. \qquad (2.27) $$

we stress that the su(2) reduction algebra, realized in (2.11), commutes with all (super)symmetries of the action (2.4). therefore, all symmetry properties of the theory are preserved in our reduction, and the final action (2.20) represents the n = 4 supersymmetric extension of the system presented in [1]. with this, we have completed the classical description of n = 4 five-dimensional supersymmetric mechanics describing the isospin particle interacting with a yang monopole.
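for completeness, the su(2) algebra which, as noted after (2.23), the isospin currents form upon quantization can be stated explicitly; the normalization of the commutator below is the standard su(2) one and is an assumption here, since the text does not spell it out:

```latex
% su(2) algebra formed by the isospin currents (2.23) upon quantization;
% \sigma^i are the pauli matrices, the normalization is the standard one
T^{i} = v_{\alpha}\,(\sigma^{i})^{\alpha}{}_{\beta}\,\bar v^{\beta},
\qquad
\big[\,\widehat T^{i},\, \widehat T^{j}\,\big]
      = i\,\epsilon^{ijk}\,\widehat T^{k},
\qquad i,j,k = 1,2,3 .
```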
next, we analyze some possible extensions of the present system, together with some interesting special cases.

3 generalizations and cases of special interest

let us consider more general systems with a more complicated structure in the bosonic sector. in what follows we will concentrate on the bosonic sector only, while the full supersymmetric action could easily be reconstructed, if needed.

3.1 so(4) invariant systems

our first example is the most general system which still possesses so(4) symmetry upon su(2) reduction. it is specified by a prepotential $F$ (2.3) depending on two scalars $X$ and $Y$:

$$ F = F(X, Y), \qquad X = Q^{i\hat\alpha}_1 Q_{1\,i\hat\alpha}, \qquad Y = Q^{i\hat\alpha}_2 Q_{2\,i\hat\alpha}. \qquad (3.1) $$

such a system is invariant under the su(2) transformations realized on the "hatted" indices $\hat\alpha$, and thus the su(2) reduction discussed in section 2 goes through in the same manner. in addition, the full su(2) × su(2) symmetry realized on the superfield $Q^{i\hat\alpha}_2$ will survive the reduction. so we expect the final system to possess so(4) symmetry. the bosonic sector of the system with prepotential (3.1) is described by the action

$$ S = \int dt\left[\Big(F_X + \tfrac{1}{2}X F_{XX}\Big)\dot q^{i\hat\alpha}_1\dot q_{1\,i\hat\alpha} + \Big(F_Y + \tfrac{1}{2}Y F_{YY}\Big)\dot q^{i\hat\alpha}_2\dot q_{2\,i\hat\alpha} + 2 F_{XY}\, q^{j\hat\beta}_2 q_{1\,j\hat\alpha}\,\dot q_{2\,i\hat\beta}\,\dot q^{i\hat\alpha}_1\right]. \qquad (3.2) $$

even with such a simple prepotential, the bosonic action (3.2) after reduction has a rather complicated form. a further, still meaningful, simplification can be achieved with the prepotential

$$ F = F(X,Y) = F_1(X) + F_2(Y), \qquad (3.3) $$

where $F_1(X)$ and $F_2(Y)$ are arbitrary functions of $X$ and $Y$, respectively. with such a prepotential the third term in the action (3.2) disappears and the action acquires a readable form. with our notations (2.18), (2.19), the reduced action reads

$$ S = \int dt\bigg[\frac{h_x h_y}{2\big((h_x-h_y)z_5 + (h_x+h_y)r\big)}\,\dot z_\mu\dot z_\mu + \frac{(h_x-h_y)^2}{8 r^2\big((h_x-h_y)z_5 + (h_x+h_y)r\big)}\,(z_\mu\dot z_\mu)^2 + \frac{h_x-h_y}{4r^2}\,(z_\mu\dot z_\mu)\,\dot z_5 + \frac{1}{8}\Big(\frac{h_x-h_y}{r^2}\,z_5 + \frac{h_x+h_y}{r}\Big)\dot z_5^2 + im\big(\dot v^\alpha\bar v_\alpha - v^\alpha\dot{\bar v}_\alpha\big) - \frac{2m^2}{(h_x+h_y)r + (h_x-h_y)z_5} - \frac{8im\, h_y}{(h_x+h_y)r + (h_x-h_y)z_5}\,v^\alpha\bar v^\beta V_{\alpha\beta}\bigg], \qquad (3.4) $$

where

$$ h_x = F_1'(x) + \frac12\,x F_1''(x), \qquad h_y = F_2'(y) + \frac12\,y F_2''(y), \qquad (3.5) $$

and

$$ x = \frac12\,(r + z_5), \qquad y = \frac12\,(r - z_5). \qquad (3.6) $$

let us stress that the unique possibility to have an so(5)-invariant bosonic sector is to choose $h_x = h_y = \text{const}$. this is just the case considered in section 2. with arbitrary potentials $h_x$ and $h_y$ we have a more general system with the action (3.4), describing the motion of an n = 4 supersymmetric particle in five dimensions, interacting with a yang monopole and some specific potential.

3.2 non-linear supermultiplet

it has been known for a long time that in some special cases one can reduce the action for hypermultiplets to an action containing one fewer physical bosonic component — the action of a so-called non-linear supermultiplet [12, 17, 18]. the main idea of such a reduction is to replace the time derivative of the "radial" bosonic component of the hypermultiplet, $\log(q^{ia}q_{ia})$, by an auxiliary component $B$ without breaking n = 4 supersymmetry [19]:

$$ \frac{d}{dt}\log(q^{ia}q_{ia}) \;\rightarrow\; B. \qquad (3.7) $$

clearly, to perform such a replacement in some action, the "radial" bosonic component has to enter this action only through its time derivative. this condition strictly constrains the variety of possible hypermultiplet actions in which this reduction works. for performing the reduction from a hypermultiplet to the non-linear one, the parametrization (2.18) is not very useful. instead, we choose the following parametrization for the independent components of the two hypermultiplets $q^{i\alpha}_1$ and $q^{i\alpha}_2$:

$$ q^{i\alpha}_1 = \frac{1}{\sqrt2}\,\epsilon^{i\alpha} e^{\frac12 u}, \qquad q^{i\alpha}_2 = x^{(i\alpha)} - \frac{1}{\sqrt2}\,\epsilon^{i\alpha} z_4, \qquad (3.8) $$

where

$$ x^{12} = \frac{i}{\sqrt2}\,z_3, \qquad x^{11} = \frac{1}{\sqrt2}\,(z_1 + i z_2), \qquad x^{22} = \frac{1}{\sqrt2}\,(z_1 - i z_2). \qquad (3.9) $$

thus, the five independent components are $u$ and $z_\mu$ ($\mu = 1,\ldots,4$), and

$$ x = q_1^2 = e^u, \qquad y = q_2^2 = \sum_{\mu=1}^{4} z_\mu z_\mu \equiv r_4^2. \qquad (3.10) $$

with this parametrization the action (3.4) acquires the form

$$ S = \int dt\left[\frac{g_1 g_2 e^u}{e^u g_1 + g_2 r_4^2}\,\dot z_\mu\dot z_\mu + \frac{g_2^2}{e^u g_1 + g_2 r_4^2}\,(z_\mu\dot z_\mu)^2 + \frac14\, g_1 e^u \dot u^2 + im\big(\dot v^\alpha\bar v_\alpha - v^\alpha\dot{\bar v}_\alpha\big) - \frac{m^2}{e^u g_1 + g_2 r_4^2} - \frac{4im\, g_2}{e^u g_1 + g_2 r_4^2}\,v^\alpha\bar v^\beta V_{\alpha\beta}\right], \qquad (3.11) $$

where

$$ g_1 = g_1(u) = F_1'(x) + \frac12\,x F_1''(x), \qquad g_2 = g_2(r_4) = F_2'(y) + \frac12\,y F_2''(y). \qquad (3.12) $$

if we choose $g_1 = e^{-u}$, the "radial" bosonic component $u$ enters the action (3.11) only through the kinetic term $\sim \dot u^2$. thus, performing the replacement (3.7) and excluding the auxiliary field $B$ by its equation of motion, we finish with the action

$$ S = \int dt\left[\frac{g_2}{1+g_2 r_4^2}\Big(\dot z_\mu\dot z_\mu + g_2\,(z_\mu\dot z_\mu)^2\Big) + im\big(\dot v^\alpha\bar v_\alpha - v^\alpha\dot{\bar v}_\alpha\big) - \frac{m^2}{1+g_2 r_4^2} - \frac{4im\, g_2}{1+g_2 r_4^2}\,v^\alpha\bar v^\beta V_{\alpha\beta}\right]. \qquad (3.13) $$
the action (3.13) describes the motion of an isospin particle on a four-manifold with so(4) isometry, carrying the non-abelian field of a bpst instanton and some special potential. our action is rather similar to those recently constructed in [3, 8, 2], but it contains twice as many physical fermions.

3.3 ordinary and twisted hypermultiplets

one more way to generalize the results of the previous section is to consider simultaneously the ordinary hypermultiplet $Q^{j\hat\alpha}$ obeying (2.1) together with a twisted hypermultiplet $\mathcal{V}^{a\hat\alpha}$ — a quartet of n = 4 superfields subjected to the constraints [17]

$$ D^{i(a}\mathcal{V}^{b)\hat\alpha} = 0, \qquad \big(\mathcal{V}^{a\hat\alpha}\big)^\dagger = \mathcal{V}_{a\hat\alpha}. \qquad (3.14) $$

the most general system which is explicitly invariant under the su(2) transformations realized on the "hatted" indices is defined, similarly to (3.1), by the superspace action depending on two scalars $X$, $Y$:

$$ S = \int dt\, d^4\theta\; F(X,Y), \qquad X = Q^{i\hat\alpha}Q_{i\hat\alpha}, \qquad Y = \mathcal{V}^{a\hat\alpha}\mathcal{V}_{a\hat\alpha}. \qquad (3.15) $$

the bosonic sector of the action (3.15) is rather simple:

$$ S = \int dt\left[\Big(F_X + \tfrac12 X F_{XX}\Big)\dot q^{i\hat\alpha}\dot q_{i\hat\alpha} - \Big(F_Y + \tfrac12 Y F_{YY}\Big)\dot v^{a\hat\alpha}\dot v_{a\hat\alpha}\right]. \qquad (3.16) $$

thus, we see that the term which caused the most complicated structure of the action with two hypermultiplets disappears in the case of ordinary and twisted hypermultiplets. clearly, the bosonic action after su(2) reduction will have the same form (3.4), but with

$$ h_x = F_X + \frac12\,x F_{XX}, \qquad h_y = -\Big(F_Y + \frac12\,y F_{YY}\Big). \qquad (3.17) $$

here $F = F(x, y)$ is still a function of two variables $x$ and $y$. the most symmetric situation again corresponds to the choice

$$ h_x = h_y \equiv h(x,y) \qquad (3.18) $$

with the action

$$ S = \int dt\left[\frac{h}{4r}\,\dot z_M\dot z_M + im\big(\dot v^\alpha\bar v_\alpha - v^\alpha\dot{\bar v}_\alpha\big) - \frac{m^2}{h\, r} - \frac{4im}{r}\,v^\alpha\bar v^\beta V_{\alpha\beta}\right]. \qquad (3.19) $$

unfortunately, due to the definitions (3.17), (3.18), the metric $h(x,y)$ cannot be chosen fully arbitrarily. for example, looking for an so(5)-invariant model with $h = h(x+y)$, we could find only two solutions³:

$$ h_1 = \text{const.}, \qquad h_2 = 1/(x+y)^3. \qquad (3.20) $$

both solutions describe a cone-like geometry in the bosonic sector, while the most interesting case of the sphere s⁵ cannot be treated within the present approach. finally, we would like to draw attention to the fact that with $h = \text{const.}$ the bosonic sectors of the systems with two hypermultiplets and with one ordinary and one twisted hypermultiplet coincide. this is just one more justification for the claim that "almost free" systems can be supersymmetrized in various ways.

³ the same metric has been considered in [20].

4 conclusion

in this paper, starting with the non-interacting system of two n = 4 hypermultiplets, we performed a reduction over the su(2) group which commutes with supersymmetry. the resulting system describes the motion of an isospin-carrying particle on a conformally flat five-dimensional manifold in the non-abelian field of a yang monopole and in some scalar potential. the most important step of this construction is passing to new bosonic and fermionic variables which are inert under the su(2) group over which we perform the reduction. thus, the su(2) group rotates only three bosonic components, which enter the action through su(2)-invariant currents. just these bosonic fields become the isospin variables which the background field couples to. due to the commutativity of n = 4 supersymmetry with the reduction su(2) group, it survives upon reduction.
we considered some possible generalizations of the action: systems with a more general bosonic action, a four-dimensional system which still includes eight fermionic components, and a variant of five-dimensional n = 4 mechanics constructed with the help of ordinary and twisted n = 4 hypermultiplets.

the main advantage of the proposed approach is its applicability to any system which possesses su(2) invariance. if, in addition, this su(2) commutes with supersymmetry, then the resulting system will automatically be supersymmetric. possible direct applications of our construction include a reduction in the case of systems with non-linear n = 4 supermultiplets [21], systems with more than two (non-linear) hypermultiplets, systems with bigger supersymmetry, say n = 8, etc. however, the most important case, which is still missing within our approach, is the construction of the n = 4 supersymmetric particle on the sphere s⁵ in the field of a yang monopole. unfortunately, the use of standard linear hypermultiplets makes the solution of this task impossible, because the resulting bosonic manifolds have a different structure (conical geometry) and cannot include s⁵.

acknowledgement

we thank armen nersessian and francesco toppan for useful discussions. this work was partially supported by grants rfbr-09-02-01209 and 09-02-91349, by volkswagen foundation grant i/84 496, as well as by erc advanced grant no. 226455, "supersymmetry, quantum gravity and gauge fields" (superfields).

references

[1] gonzales, m., kuznetsova, z., nersessian, a., toppan, f., yeghikyan, v.: second hopf map and supersymmetric mechanics with yang monopole, phys. rev. d 80 (2009) 025022, arxiv:0902.2682 [hep-th].
[2] konyushikhin, m., smilga, a.: self-duality and supersymmetry, phys. lett. b 689 (2010) 95, arxiv:0910.5162 [hep-th].
[3] ivanov, e. a., konyushikhin, m. a., smilga, a. v.: sqm with non-abelian self-dual fields: harmonic superspace description, jhep 1005 (2010) 033, arxiv:0912.3289 [hep-th].
[4] fedoruk, s., ivanov, e., lechtenfeld, o.: supersymmetric calogero models by gauging, phys. rev. d 79 (2009) 105015, arxiv:0812.4276 [hep-th].
[5] ivanov, e., konyushikhin, m.: n = 4, 3d supersymmetric quantum mechanics in non-abelian monopole background, phys. rev. d 82 (2010) 085014, arxiv:1004.4597 [hep-th].
[6] bellucci, s., krivonos, s.: potentials in n = 4 superconformal mechanics, phys. rev. d 80 (2009) 065022, arxiv:0905.4633 [hep-th].
[7] bellucci, s., krivonos, s., sutulin, a.: three dimensional n = 4 supersymmetric mechanics with wu-yang monopole, phys. rev. d 81 (2010) 105026, arxiv:0911.3257 [hep-th].
[8] krivonos, s., lechtenfeld, o., sutulin, a.: n = 4 supersymmetry and the bpst instanton, phys. rev. d 81 (2010) 085021, arxiv:1001.2659 [hep-th].
[9] krivonos, s., lechtenfeld, o.: su(2) reduction in n = 4 supersymmetric mechanics, phys. rev. d 80 (2009) 045019, arxiv:0906.2469 [hep-th].
[10] zhang, s. c., hu, j. p.: a four dimensional generalization of the quantum hall effect, science 294 (2001) 823, arxiv:cond-mat/0110572; bernevig, b. a., hu, j. p., toumbas, n., zhang, s. c.: the eight dimensional quantum hall effect and the octonions, phys. rev. lett. 91 (2003) 236803, arxiv:cond-mat/0306045; karabali, d., nair, v. p.: quantum hall effect in higher dimensions, nucl. phys. b 641 (2002) 533, arxiv:hep-th/0203264; hasebe, k.: hyperbolic supersymmetric quantum hall effect, phys. rev. d 78 (2008) 125024, arxiv:0809.4885 [hep-th].
[11] galperin, a. s., ivanov, e. a., ogievetsky, v. i., sokatchev, e.:
harmonic superspace, cambridge, uk: univ. press (2001), 306 pp.
[12] ivanov, e., lechtenfeld, o.: n = 4 supersymmetric mechanics in harmonic superspace, jhep 0309 (2003) 073, arxiv:hep-th/0307111.
[13] coles, r. a., papadopoulos, g.: the geometry of the one-dimensional supersymmetric non-linear sigma models, class. quant. grav. 7 (1990) 427.
[14] gibbons, g. w., papadopoulos, g., stelle, k. s.: hkt and okt geometries on soliton black hole moduli spaces, nucl. phys. b 508 (1997) 623, arxiv:hep-th/9706207.
[15] hellerman, s., polchinski, j.: supersymmetric quantum mechanics from light cone quantization, in shifman, m. a. (ed.), the many faces of the superworld, arxiv:hep-th/9908202.
[16] hull, c. m.: the geometry of supersymmetric quantum mechanics, arxiv:hep-th/9910028.
[17] ivanov, e., krivonos, s., lechtenfeld, o.: n = 4, d = 1 supermultiplets from nonlinear realizations of d(2,1;α), class. quant. grav. 21 (2004) 1031, arxiv:hep-th/0310299.
[18] bellucci, s., krivonos, s.: geometry of n = 4, d = 1 nonlinear supermultiplet, phys. rev. d 74 (2006) 125024, arxiv:hep-th/0611104; bellucci, s., krivonos, s., ohanyan, v.: n = 4 supersymmetric micz-kepler systems on s3, phys. rev. d 76 (2007) 105023, arxiv:0706.1469 [hep-th].
[19] bellucci, s., krivonos, s., marrani, a., orazi, e.: "root" action for n = 4 supersymmetric mechanics theories, phys. rev. d 73 (2006) 025011, arxiv:hep-th/0511249.
[20] ivanov, e., lechtenfeld, o., sutulin, a.: hierarchy of n = 8 mechanics models, nucl. phys. b 790 (2008) 493, arxiv:0705.3064 [hep-th].
[21] bellucci, s., krivonos, s., lechtenfeld, o., shcherbakov, a.: superfield formulation of nonlinear n = 4 supermultiplets, phys. rev. d 77 (2008) 045026, arxiv:0710.3832 [hep-th].

stefano bellucci, infn – frascati national laboratories, via e. fermi 40, 00044 frascati, italy

sergey krivonos, bogoliubov laboratory of theoretical physics, jinr, 141980 dubna, russia

anton sutulin, e-mail: sutulin@theor.jinr.ru, bogoliubov laboratory of theoretical physics, jinr, 141980 dubna, russia

a study of the properties of electrical insulation oils and of the components of natural oils

milan spohner, dept. of physics, brno university of technology, technická 10, 616 00 brno, czech republic, corresponding author: xspohn00@stud.feec.vutbr.cz

abstract: this paper presents a study of the electrical and non-electrical properties of insulating oils. for the correct choice of an electrical insulation oil, it is necessary to know its density, dynamic viscosity, dielectric constant, loss number and conductivity, and the effects of various exposure factors. this paper deals with the mathematical and physical principles needed for studying and making correct measurements of the dynamic viscosity, density and electrical properties of insulation oils. rheological properties were measured using an a&d sv-10 vibratory viscometer and an analytical balance with a density determination kit, which operates on the principle of archimedes' law. dielectric properties were measured using an agilent 4980a lcr meter connected with the agilent 16452a test fixture for dielectric liquids.

keywords: conductivity, density, dielectric constant, fame, insulating oil, lauric acid, loss number, methyl ester, midel, oleic acid, rapeseed oil, stearic acid, transformer oil, vegetable oil, vibratory viscometer, viscosity.

1 introduction

mineral oil is used as a dielectric in transformers, and also in cables and capacitors.
the advantage of mineral oils over other types of oils is their high resistance to aging. a transformer oil must have low viscosity. when selecting the oil, it is also important to consider the construction of the transformer into which it will later be filled. it is necessary to determine the long-term stability of the oil and whether it has suitable rheological properties, since these can affect the required heat dissipation (cooling) and the leakage or loss of oil from the transformer. for all types of mineral and non-mineral transformer oils, we need to know in particular the following parameters: dynamic and kinematic viscosity, density, flash point, pour point, breakdown voltage, relative permittivity, loss number, and oxidative stability. the most important parameters of transformer oils include long-term stability of the parameters that affect the reliability of the equipment, and especially the costs of operating transformers in distribution systems. fig. 1 shows the fatty acid content in selected oils.

natural esters, and primarily rapeseed oil, were previously considered unsuitable, especially due to their low oxidative stability. oilseed crops grown commercially in farming are suitable for the production of renewable oils. liquids made from these seeds are composed of triglycerides. a triglyceride is a molecule of glycerol associated with three molecules of fatty acids. unsaturated fatty acids in the liquid exhibit lower oxidative stability and lower dynamic viscosity values [3]. tab. 1 presents the most frequently used parameters of mineral, synthetic and natural oils. values are defined for pentaerythritol tetraoleate, oleic acid methyl ester (methyl oleate) and a synthetic ester (midel 7131).

1.1 rheological properties

important rheological properties for oils used in electrical engineering are viscosity and density, which affect the behavior of the liquid in dependence on changes in operating temperature. if these parameters are not within the required tolerances, poor lubrication of moving parts and loss or leakage of liquid from the transformer may result.

density measurements based on the archimedes principle take into account the gravity and buoyancy forces acting on a plunger (a calibrated glass weight) with a defined volume v_pl. in order to calculate the density, it is necessary to know the weight of the calibrated plunger in air (m_a) and immersed in the liquid (m_l). the density of the liquid can be calculated according to the equation:

$$ \varrho=\frac{m_a-m_l}{v_{pl}}, \qquad (1) $$

where the density ϱ is given in g cm⁻³; the weight of the plunger is determined in air (suspended on the hanger, by simple weighing) and again after immersion in the liquid. fig. 2 shows the plunger (calibrated weight) during the measurement, weighed hanging in air and immersed in the liquid.

figure 1: comparison of the fat content found in natural oils [7].

table 1: parameters of different kinds of oils (mineral, acid methyl ester, synthetic oil) [6].

oil            density [29°c]   kinematic viscosity [27°c]   flash point   pour point   dielectric
               (kg m−3)         (cst)                        (°c)          (°c)         constant
peto           910              128                          255           −12          3.1
methyl oleate  875              9                            188           −6           3.2
midel 7131     950              53                           233           −20          3.2

figure 2: set for density measurement: a) the plunger weighed in the air, b) the calibrated plunger suspended on a hanger and immersed in the liquid for the measurement [5].
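a minimal sketch of the density calculation (1), assuming masses in grams and the plunger volume in cm³; the function name and the sample values are illustrative only:

import_needed = None  # standard library only

def density_archimedes(m_air_g, m_liquid_g, v_plunger_cm3):
    """density of a liquid from plunger weighings, eq. (1): rho = (m_a - m_l) / v_pl."""
    # the apparent weight loss of the plunger equals the mass of displaced liquid
    return (m_air_g - m_liquid_g) / v_plunger_cm3  # g / cm^3

# illustrative numbers only: a 10 cm^3 plunger losing 9.5 g of apparent weight
rho = density_archimedes(25.0, 15.5, 10.0)
print(f"density = {rho:.3f} g/cm^3")  # -> 0.950, roughly the midel 7131 value in tab. 1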
1.2 dielectric properties

for the correct selection of electrical insulation liquids, it is necessary to know the following dielectric properties: dielectric constant and loss number. another relevant parameter is the influence of temperature on changes in these parameters. oils are tested in laboratory conditions using various accelerated aging tests in order to monitor changes in the parameters over time. the changes are mainly influenced by the oxidative stability of the oil.

figure 3: rlcg meter and electrode system for measuring the liquid samples [1, 2].

the conductivity of a liquid insulating material is influenced by the concentration of free carriers of electric charge, which may be due to ionization of neutral molecules, dissociation of the molecules of the liquid itself and of dissolved impurities, emission of electrons from the cathode in a strong electric field, and thermal excitation of electrons. the temperature dependence of the mobility of the free charge carriers causes a strong dependence of the conductivity on the temperature of the insulating liquid. technically pure liquids have conductivities of the order of 10⁻¹¹ to 10⁻¹² s m⁻¹. the mobility of the free carriers is affected by the viscosity, which is a rheological parameter. the relationship between conductivity and dynamic viscosity can be described by walden's law [4]. fig. 3 shows the lcr meter and electrode system used for the liquid samples.

figure 4: comparison of the catalog density values for midel 7131 oil, using two measurement methods.

figure 5: dependence of density on inverse temperature for various oils.

2 experimental

2.1 rheological properties

to examine the differences in the density of mineral, synthetic and natural oils, the following samples were used: a synthetic ester (midel 7131), natural oils (rapeseed oil and rapeseed oil methyl ester (fame)) and one mineral oil, renolin eltec t. the experiment measured density values in dependence on temperature, using two methods. the first method involved measuring the density using a standard glass densimeter, while the second method used a radwag analytical balance with a density determination kit to determine the density according to equation (1). these two methods are compared in fig. 4 with the manufacturer's datasheet values for the synthetic oil midel 7131. fig. 5 shows the density in dependence on inverse temperature over the interval from 270 to 350 k. the lowest values were for the mineral oil and the rapeseed oil methyl ester.

an a&d sv-10 vibration viscometer with a maximum range of 10 pa s was used for the dynamic viscosity measurements. to ensure reproducibility, the measurements were performed in the agilent vee pro software automation environment. the values obtained are shown in fig. 6, plotted against one thousand times the inverse temperature.

figure 6: dependence of viscosity on inverse temperature for various oils.

figure 7: frequency dependence of the dielectric constant for methyl oleate, methyl laurate, fame and rapeseed oil.

2.2 dielectric properties

for the study of dielectric properties, samples of fame and rapeseed oil were selected for a comparison with methyl esters of two fatty acids found in natural oils. measurements were performed using the agilent 4980 lcr meter and the agilent 16452a electrode system for liquid samples. the highest relative permittivity value was 3.24 for fame, and the lowest value, 2.94, was for methyl laurate (fig. 7).
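a minimal sketch of how the relative permittivity follows from the test-fixture readings, assuming the simple parallel-plate relation ε_r ≈ c_liquid / c_empty; the correction factors described in the 16452a manual [1] are omitted here, and all names and values are illustrative:

def relative_permittivity(c_liquid_pf, c_empty_pf):
    """estimate epsilon_r of a liquid from the fixture capacitance with and without the sample."""
    # for an ideal parallel-plate fixture the capacitance scales linearly with epsilon_r
    return c_liquid_pf / c_empty_pf

# illustrative numbers only: an empty-fixture capacitance of 50 pf
print(relative_permittivity(162.0, 50.0))  # -> 3.24, the value reported above for fame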
fig. 8 shows the dependence of the relative permittivity on the logarithm of frequency at various temperatures. the temperature was set with the agilent 16452a electrode system immersed in a medingen kt-20 cryostat. the loss number varies only negligibly for rapeseed oil and fame. the loss numbers for the methyl laurate and methyl oleate samples decreased with increasing frequency and grew with temperature. the dependence of the loss number on the logarithm of frequency for the methyl oleate sample is shown at various temperatures in fig. 9.

figure 8: frequency dependence of the dielectric constant for methyl oleate.

figure 9: frequency dependence of the loss number for methyl oleate.

2.3 activation energy of oils

conduction in insulating liquids in an external electric field may be caused by ions, but a state can also occur in which an ion is captured by a molecule, forming a single unit. the effect of temperature can cause the separation of the ions and molecules, adding the energy needed to overcome the potential barrier given by the activation energy e_a. in order to calculate the activation energy, we can use the temperature dependence of the ionic conductivity according to the equation:

$$ \sigma=\sigma_0\,e^{-\frac{e_a}{kt}}. \qquad (2) $$

the highest activation energy value of 30.3 mj kmol⁻¹ was calculated for midel 7131 oil, and the lowest value, 14.8 mj kmol⁻¹, was calculated for fame. the second table compares the activation energies of the methyl esters of three acids found in rapeseed and other oils. methyl oleate constitutes approximately 60 % of the composition of rapeseed oil, and its activation energy was determined as 22.8 mj kmol⁻¹. the activation energy of rapeseed oil was 26.3 mj kmol⁻¹. the difference is a result of the differing ratios of fatty acids contained in natural oils.

3 conclusion

fig. 10 shows the high conductivity values of rapeseed oil. the highest conductivity, 15 ns m⁻¹ at a frequency of 1 mhz, is shown by rapeseed oil. the conductivity values of the other samples are given for a frequency of 10 khz. fame had the second highest conductivity, 4 ns m⁻¹. the methyl esters had the lowest conductivities: methyl laurate 0.087 ns m⁻¹ and methyl oleate 0.097 ns m⁻¹. the conductivity values for the methyl esters were unusable at frequencies above 30 khz, because the values are close to the resolution threshold of the agilent 4980 lcr meter, i.e. loss numbers lower than 0.00001 (–).

figure 10: logarithmic frequency dependence of conductivity for methyl esters, fame and rapeseed oil.

table 2: calculated values of the activation energy of different types of oils.

oil                    activation energy
midel 7131             30.3 mj kmol−1
rapeseed oil           26.3 mj kmol−1
luko oil silik m 350   14.6 mj kmol−1
fame                   14.8 mj kmol−1
renolin eltec t        24.1 mj kmol−1

table 3: calculated values of the activation energy of fatty acids.

acid              activation energy
methyl laurate    30.2 mj kmol−1
methyl oleate     22.8 mj kmol−1
methyl stearate   8.7 mj kmol−1

acknowledgements

this research has been supported by the internal grant agency of the brno university of technology within the framework of research project bd18102 fekt-s-11-11 on "diagnostics of material defects", and by the grant agency of the czech republic within the framework of research project gd 102/09/h074 "diagnostics of material defects using the latest defectoscopic methods".

references

[1] agilent technologies. agilent 16452a liquid test fixture: operation and service manual. http://cp.literature.agilent.com/litweb/pdf/16452-90000.pdf, 2000, [2011-05-22].
[2] laboratoře uete. měřící přístroje. http://laboratore.uete.feec.vutbr.cz/?id=merici-pristroje&str=1&device=4, 2011, [2011-04-26].
[3] c. p. mcshane. vegetable-oil-based dielectric coolants. industry applications magazine 8: 34–41, 2002.
[4] v. mentlík. dielektrické prvky a systémy. ben — technická literatura, praha, 2006.
[5] m. spohner. diagnostika perspektivních elektroizolačních kapalin. vysoké učení technické v brně, fakulta elektrotechniky a komunikačních technologií, 2011.
[6] p. thomas, s. sridhar, k. r. krishnaswamy. synthesis and evaluation of oleic acid esters as dielectric liquids. in international symposium on electrical insulation, montreal, canada, 1996, pp. 565–568.
[7] comparison of dietary fats. http://www.canolainfo.org/quadrant/media/downloads/pdfs/ditfatpadfinal.pdf, [2012-02-28]

the optical transient search in the bamberg southern sky survey: preliminary results

r. hudec, f. kopel, p. krapp, u. heber, w. cayé

abstract: a large fraction of gamma-ray bursts temporarily emit optical light, i.e. optical afterglows and optical transients. so far, optical transients have only been detected after a related gamma-ray satellite detection. however, taking into account their optical magnitudes at maximum light, these objects should be detectable in various historical and recent optical surveys, including the photographic sky patrol. here we report on an extended study based on blink-comparison of 5004 bamberg observatory southern sky patrol plates performed within a student high school project (jugend forscht).

keywords: gamma-ray bursts, optical transients, sky surveys, photographic sky plate archives.

1 introduction

it is known that a substantial fraction of gamma-ray bursts (grbs) temporarily emit (typically for minutes to weeks) optical light, namely optical afterglows (oa) and optical transients (ot). so far, these phenomena have only been detected after a related gamma-ray satellite detection. however, these objects should be detectable in various historical and recent optical surveys, including the photographic sky patrol, as they may be brighter than the limits of these surveys. in addition to these triggers, one can also expect to detect in optical surveys the as yet hypothetical orphan afterglows, which may be observed in optical light but not in gamma-rays due to different opening jet angles (e.g. hudec 2001). in this paper, we report on the preliminary results of a study based on an extended blink-comparison of the bamberg observatory southern sky patrol plates performed within a german high school student project (jugend forscht).

the bamberg observatory (bavaria, germany) dates back to 1899. the observatory has belonged to erlangen-nürnberg university since 1962. the observatory was deeply involved in variable star research in the past. the bamberg observatory photographic sky surveys (hudec, 1999) were used to deliver observation data for these studies. the bamberg plate archive contains 40 000 plates from northern surveys (18 000) and southern surveys (22 000); the relevant time periods are 1928–1939 (north) and 1963–1976 (south). the southern patrol was taken from the boyden observatory observing station in south africa for variable star research (figure 1). the northern patrol was performed in bamberg directly. the work in variable star research in bamberg in the past focused mostly on discoveries and on classifying new variable stars.
the archive is nowadays located in a separate building on the observatory campus. the instrumentation available at the bamberg observatory for astronomical plate analyses includes an epson expression 1640 xl flatbed scanner for plate digitization, two plate microscopes, as well as a zeiss blinkmicroscope (figure 2). the zeiss blinkmicroscope was used in the past for very extended and long-term searches for new variable stars. the measurement principle is based on blink comparison of the left and right plates (taken at various time epochs). about 1700 new variable stars have been detected this way at the bamberg observatory (designated as bamberg variables, bv). the machine is still operational at the bamberg observatory, and it was used in our work for re-detection of the objects on the plates.

fig. 1: (left) the cluster of sky patrol photographic cameras operated in the boyden observatory, south africa, by the bamberg observatory. (right) the bamberg zeiss blinkmicroscope and the southern sky patrol plates

fig. 2: overall statistics of all 6 measurement books that have been investigated

2 the method

the search for new and variable objects was based on the blinkmicroscope method. blinkmicroscope analyses of numerous pairs of selected high-quality sky patrol plates were conducted in the past under the direction of prof. w. strohmeier, former director of the bamberg observatory, and his collaborators, mainly r. knigge and h. abt. they investigated more than 2 500 southern sky survey plate pairs, where one plate pair represented about 5 hours of work at the blinkmicroscope. each plate covers 13 × 13 square degrees, represents about 1 hr of exposure time, and has a limiting magnitude of about 15 (for southern plates), i.e. enough to detect the brighter ots and oas of grbs. in addition, numerous (almost 1 000) northern sky survey plates were investigated in a similar way (though they have not yet been included in our study). due to the large field of view (fov) of the plates (the northern plates are 35 × 35 degrees), this was one of the major sky survey programmes of its time. we have re-analysed the ot candidates recorded in 6 measurement books for southern sky surveys, with emphasis on a detailed investigation of non-classified objects suspected to occur only once (i.e. visible only on one plate and below the limit of other plates).

3 preliminary results and discussion

the original records of the whole project (southern sky patrol plates) led to the detection of a total of 8 040 variable objects. out of these, 2 766 were identified as bamberg variable stars (some twice — the total number of these objects is 1 700), 4 791 were identified as other known variable stars, 45 as planets, 8 as asteroids, 82 as plate faults, 56 as known objects of other types, and 292 remained non-classified (see also figure 2). in our recent study, we focused on objects not previously classified, with emphasis on possible ot candidates. we have identified a total of 189 suspicious objects (possible ot candidates according to notes in the measurement logs) found in the 6 measurement logs, corresponding to a total of 5 004 fully investigated southern sky patrol plates. we have searched for objects which were detected only on one of several (or many) plates. we found that 86 of these are reasonable ot candidates (not emulsion defects, visible only once, no gcvs object at the position).
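the blink principle at the heart of this search — alternating two registered views of the same field so that anything that changed appears to flash — carries over directly to the scanned plates used in the re-analysis. a minimal sketch of a digital equivalent, assuming two already-aligned plate scans of equal size as numpy arrays; the threshold is illustrative only:

import numpy as np

def blink_candidates(scan_a, scan_b, threshold=0.25):
    """flag pixels that differ strongly between two aligned plate scans (float images)."""
    # normalize both scans so that differing sky backgrounds do not trigger detections
    a = (scan_a - scan_a.mean()) / scan_a.std()
    b = (scan_b - scan_b.mean()) / scan_b.std()
    diff = np.abs(a - b)
    # a transient shows up as a compact group of high-difference pixels
    return np.argwhere(diff > threshold * diff.max())

# usage: candidates = blink_candidates(plate_epoch_1, plate_epoch_2)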
we note that an analogous study by bedient (2003) indicated that out of 24 ot candidates identified by ross (1929), 6 are asteroids misclassified as suspected variable stars. hence one may expect that a similar fraction of our ot candidates may be asteroid images. the relevant identification study of our sample for asteroids is in progress.

the analysed datasets involved a total of 5 004 southern sky patrol plates (always blinked as a pair of plates, i.e. 2 502 blink comparisons), each plate covering 13 × 13 = 169 square degrees with a 60 min exposure time. this represents a total of 845 676 square degrees (i.e. 21.14 full sky spheres) monitored for 60 minutes, i.e. almost a full day of full sky sphere coverage. it is obvious that although the statistical expectation that there will be a real grb ot candidate in our sample is low, it is not negligible. the estimated observed grb rate from the grbm is about 1.3/day (guidorzi, 2011). however, the intrinsic grb rate is higher, as the estimated rate is influenced by instrument sensitivity, and one can expect that there was at least one grb inside the investigated plate sample.

fig. 3: example of an ot candidate found by blink-comparison of two plates. left: a typical plate with no object at the position, right: the ot candidate indicated by an arrow. this object was originally misclassified as a possible emulsion defect, but the recent study with advanced software confirms that the image is star-like

fig. 4: an example of advanced methods used to verify the images that were found. a detailed study of the ot profile and of normal star profiles (as 3d plots) can easily recognize and exclude emulsion defects

fig. 5: the method used to identify new (unknown) variable stars among the ot sample. left: the two bamberg southern sky survey plates, where the object is visible on the left plate and invisible on the right plate. the dss image used for comparison is on the right and shows a star at the ot position

fig. 6: 3d analyses used to analyse the profiles of the ot candidates. the image on the left corresponds to the image on the left in figure 4; the image on the right to the image in the middle in figure 5

in the near future we plan a further detailed analysis of selected ot candidates, using computers (scanned data) to confirm their reality (figures 4–6). we will also continue classifying all ot candidates, including a comparison with astronomical databases (simbad, variable stars, asteroids, etc.). the goal is to attempt to detect ots related to historical grbs, including orphan afterglows. we plan to study the relevant positions of the best candidates for possible host galaxies. we note that the ot candidates noticed by past bamberg investigators are almost free from emulsion defects, as the bamberg astronomers were very experienced with that work.

4 conclusion

numerous (86) ot candidates (objects visible only once, not obvious plate defects) were detected in a very extended photographic southern sky monitoring program directed by prof. w. strohmeier (former bamberg observatory director) and his collaborators at the bamberg observatory, and were re-analysed by us. their positions were given in the measurement logs. we have re-analysed these logs and re-detected and investigated the relevant ot candidates. the study continues with a detailed ot classification. the candidate objects were scanned and investigated by advanced computer programs.
this study won 2nd place in the jugend forscht (youth research) high school competition in oberfranken/bavaria in 2011. fabian and pedro are (now) 16-year-old students. the project was proposed by rene hudec and supervised by rene hudec and uli heber.

acknowledgement

rh acknowledges grants 102/09/0997 and 205/08/1207 from the grant agency of the czech republic, and msmt project me09027.

references

[1] hudec, r.: astrophysics with astronomical plate archives, in exploring the cosmic frontier: astrophysical instruments for the 21st century. eso astrophysics symposia, european southern observatory series. edited by andrei p. lobanov, j. anton zensus, catherine cesarsky, phillip j. diamond. series editor: bruno leibundgut, eso. berlin – heidelberg, germany: springer-verlag, 2007, p. 79.
[2] hudec, r.: on the feasibility of independent detections of optical afterglows of grbs. in gamma-ray bursts in the afterglow era, eso astrophysics symposia. berlin – heidelberg: springer, 2001.
[3] bedient, j. r.: ibvs, 5478, 1–4, 2003.
[4] hudec, r.: an introduction to the world's large plate archives, acta historica astronomiae, vol. 6, p. 28–40, 1999.
[5] hudec, r.: superoutburst of ot j213806.6+261957, the astronomer's telegram, 2619, 2010.
[6] hudec, r., šimon, v.: identification and investigation of high energy sources on astronomical archival plates, x-ray astronomy 2009: present status, multi-wavelength approach and future perspectives: proceedings of the international conference. aip conference proceedings, vol. 1248, p. 161–162, 2010.
[7] tsvetkov, m., et al.: proc. iv serbian-bulgarian astronomical conference, publ. astron. soc. rudjer boskovic, no. 5, 303–308, 2005.
[8] ross, f. e.: aj, 39, 140, 1929.
[9] guidorzi, c.: phd thesis. http://www.fe.infn.it/~guidorzi/doktorthese/node1.html, 2011.

rené hudec, astronomical institute, academy of sciences of the czech republic, ondřejov, czech republic; czech technical university in prague, faculty of electrical engineering, prague, czech republic

fabian kopel, pedro krapp, walter cayé, dientzenhofer gymnasium, bamberg, germany

ulrich heber, dr. remeis observatory, university of erlangen, bamberg, germany

simulation environment for artificial creatures

k. kohout, p. nahodil

abstract: our research is focused on the design and simulation of artificial creatures – animates. this topic has been addressed in our research group for the last decade. the designed animates have been greatly inspired by sciences such as ethology, biology and psychology. several agent architectures have been proposed and tested in recent years. we started to face the problem of comparing various architectures in order to benchmark them. intelligence is embodied in our agent, and it needs an environment to be placed into. we have solved both these problems by first proposing and then implementing a common simulation environment in which these agents can run and compete. the main contribution of this paper is to offer a description of this designed simulation environment. it has been named the world of artificial life (wal).

keywords: anticipation, alife, hybrid architecture, behavior, animate, artificial creatures.

1 introduction

facing the need for a simulation environment, we investigated whether we could use some of the existing environments. we found that there are many freely available simulation platforms. we analyzed them with the aim of finding a platform capable of running the required simulations, but none of those analyzed was fully able to satisfy our needs. we also found that they are mainly focused on just one specific domain of artificial life. almost none of them are capable of simulating artificial life on a more general level, and just a handful of them also provide a set of analytical tools to evaluate the simulation in a broader context. we therefore decided to design our own simulation environment. the main goal was to develop a simulator with a high level of modularity, simple enough to be usable by anyone interested in alife research. special attention was given to the possibility of making an analysis, either during the simulation or after the simulation, from the saved data. the visualization modules involve not only displaying the simulated agent world but are targeted at efficient analysis of agents' behavior. visualization can provide both a simplified and an attractive view in order to present the simulation to a broader or non-technical audience.
this simulator has helped us to focus on the topic under study (whatever it was) while abstracting from the implementation details of the environment itself.

2 designed abstract architecture

our requirements for this platform were as follows:

- the ability to simulate various phenomena of artificial life, from cellular automata, boids, biomorphs, ant colonies etc., up to complex and socially behaving agents.
- the ability to export simulation inner data so that it can be used for parameter visualization.
- variability of simulation with easy modifiability and step-by-step run.
- interesting visualization of the agent world, in order to present the simulation to a wider or non-technical audience.
- meaningful and helpful visualization of parameters in time.
- modularity
- easy extendibility
- interoperability.

in order to implement such a task, a general abstract architecture was proposed and named wal abstract architecture, wala2 in short. it should provide a general guide or instructions on how an application for simulation of artificial life should be defined and implemented with care for high interoperability and modularity between various implementations. this design was not the work of one person, but the result of tight cooperation and many discussions among all mrg members [1]. while designing this abstract architecture we kept in mind that our implementation can be superseded in the future by better ones, but if they stick with the philosophy and recommendations of the abstract architecture, i.e. if the interfaces remain the same, the agents will be transferable with no or only minor reprogramming and redesign. another goal of wala2 is to ensure that agents can also run and compete on other implementations of the same architecture. the aim is to ensure that when different agent architectures and approaches are used, the results can be evaluated in the basic environment, or that agents' behavior can be tested in a different environment than the one the agent was designed for. we can observe if the agent can adapt to new circumstances and survive. wala2 itself defines the modular block architecture of the platform, as shown in fig. 1.

fig. 1: block scheme of wal abstract architecture

this architecture was designed to enable easy parameterization of the simulation and distributivity of its parts (the body and mind can be separated and even run on different computation units).
one of the benefits is that the environment is divided into layers. this is not layered architecture in the agent design but in the environment design. this decomposition of the environment leads to simpler and more comprehensive simulation and also gives an opportunity to describe more complex environments. 2.1 platform – engine the core part of the simulation environment will be referred to as the engine or platform. it is the basic unit and it controls the run of the simulation on the program level. this means that it synchronizes the whole application – it gives impulses at the start and end of each step. it contains an interface for modules. there are two components of the environment: the layers and the agents. in one simulation step, the engine asks all layers to evaluate the actions of all agents and perform appropriate environmental changes. the distribution of evaluation to the layers means distribution of simulation control. each layer can run in a different computation thread (on a multiprocessor unit they might also run on different processors). the main data structure where the parameters of all layers and agents are stored is also maintained by the engine. this data can be viewed or modified by the agent or even by external modules. it is important to distinguish between the control part of the engine, which interacts mostly with the operating system (graphical interface, loading and saving configuration, user interaction etc.) and the part providing and simulating the virtual world for the agents. the first is done by the engine described above. the second function is described below, and is handled by the layers. 2.2 engine interface the interface between the simulation environment and its program surrounding (e.g., visualization, analysis tool or parameterization) is an important part of the application. this is what makes wal modular and distributable. from the point of view of the processing speed of the data, it is suitable to exchange information in binary format. it is also possible to use text based formats such as xml. the textual format is in principle highly redundant (but descriptive) and its processing can be slow. still, it can be used for offline analysis. the engine contains all its data, the data of the layers and agents in an inner tree-based data structure. it can provide all of this data or just part of it to the external modules. each connected external module can ask for data. the inner data representation is not defined in the abstract architecture. it can be implemented in various ways and it does not matter as long as the interface for exchanging this data remains the same. this interface should work in both directions: for exporting the data to be read by the external modules, as well as for receiving updates of the data structure from the modules. the running simulation can also be stopped at any moment. thanks to the single data structure it can be saved at each step, hence the simulation can even be traced back to a certain point in history and run again to observe if any change of behavior will occur (emerge) given the same starting conditions. change of simulation parameters should be available while the simulation is running. an agent is understood to be any object in simulation either virtually alive (creature, predator) or virtually non-living (trees, food, water, rocks). sensors and effectors of the agent are their interface with the virtual world and therefore they are part of the environment and layers. 
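to make the engine's role concrete, here is a minimal sketch of the step loop, the shared data tree and the snapshotting described above; it is an illustration only — the class and method names (Engine, layer.evaluate, rewind, ...) are invented for this sketch and are not the actual wal api:

import copy

class Engine:
    """synchronizes one simulation step across all layers and keeps the shared data tree."""
    def __init__(self, layers, agents):
        self.layers, self.agents = layers, agents
        self.data = {"step": 0, "layers": {}, "agents": {}}   # the single shared structure
        self.history = []                                     # per-step snapshots enable backward tracing

    def step(self):
        self.history.append(copy.deepcopy(self.data))         # save the state before the step
        for layer in self.layers:                             # shown sequentially; layers could be
            layer.evaluate(self.agents, self.data)            # dispatched to separate threads instead
        self.data["step"] += 1

    def rewind(self, step):
        """restore a saved snapshot so an interesting situation can be replayed."""
        self.data = copy.deepcopy(self.history[step])
        del self.history[step + 1:]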
the agent mind (decision control) is not part of the environment and can be remote.

2.3 simulation world in layers

layers are the part of the application directly interacting with the agent via its sensors and effectors. layers define the virtual world in which the agents live. they separate operations which would otherwise be controlled by the engine. a layer is a logically separable part of the environment which can be used standalone and which, when combined with others, defines the environment as a whole. basically, by using layers we segregate the simulation of physical laws. as an example, we can have a physical layer taking care of agents' positions and collisions. we can add a thermal layer taking care of the propagation of heat in the virtual world. all agents influenced by a particular layer or having an influence in this particular layer must be registered with it. this ensures that the layer has full information for computing the next step. the layer will evaluate agents' actions in each step, and through their sensors it will provide them with the new state of the environment. according to the executed actions the layer will modify its own values and then will provide new sensory data to the agents. the layer must have the ability to register and deregister an agent and also to fill the agents' sensors. this means that it must have an interface for communicating with sensors. for example, the thermal layer should have the ability to provide information for sensors of temperature. to sum this up: in each step the layer must read agents' executed actions, validate them (whether they are possible or executable), modify the environment and provide new sensory data.

we have defined two types of layers. there is the point layer, where the value can be evaluated directly, or it can be obtained immediately from the layer's data. the gradient layer is where the value at the point of interest cannot be evaluated just from the current information but the history of the value must also be taken into account. the physical body of the agent and its sensors and effectors are also part of the environment, so it is necessary to interpret them in it. the layer must have information about how much space the body, sensors and effectors take. this could enable a more complex agent to be built from basic blocks. the potential of layers has been used in several works, where the implementations have had up to seven different layers [5]. another advantage of layers is that they can solve communication between agents in the sense of distribution of the signal. we can implement an acoustic layer to propagate sound based on physical laws. this solves the problem of transporting the message, but not the understanding and context of the message. to sum up layers in a single simple sentence: they implement various physical laws.

2.4 human interface and analysis

everything that has been described above is just an algorithm with no human interface. visualization of the designed world can be an attractive and also a useful tool. for this purpose, an external visualization module or an internal (default) module can be used. internal visualization is meant for debugging and observing the simulation by the creator (fig. 2). the external module can be used to present this simulation to a wider audience (fig. 3).
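to make the layer contract of section 2.3 concrete, a minimal sketch of registration plus a per-step evaluate, with a toy thermal "point layer"; again all names, and the falloff model, are invented for illustration and are not the actual wal api:

class Layer:
    """a logically separable piece of the environment implementing one physical law."""
    def __init__(self):
        self.registered = set()

    def register(self, agent):
        self.registered.add(agent)

    def deregister(self, agent):
        self.registered.discard(agent)

    def evaluate(self, agents, data):
        raise NotImplementedError

class ThermalLayer(Layer):
    """toy 'point layer': the temperature at a position is evaluated directly."""
    def __init__(self, ambient=20.0):
        super().__init__()
        self.ambient = ambient
        self.sources = []                        # (position, temperature) pairs

    def temperature_at(self, pos):
        t = self.ambient
        for src_pos, src_t in self.sources:      # crude 1/(1+distance) falloff, illustration only
            t += (src_t - self.ambient) / (1.0 + abs(pos - src_pos))
        return t

    def evaluate(self, agents, data):
        for agent in self.registered:            # fill the matching sensor of each registered agent
            agent.sensors["temperature"] = self.temperature_at(agent.position)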
on-line or off-line tools for analysis of changes in agent attributes in time are also supported. the proposed environment is compatible with the visualization tool called vat. it can be used to observe agent parameters at any time of the simulation. we will not go into much detail about the parameter visualization problem; information about this can be found in [4]. using the third dimension for data visualization makes the analysis more comprehensive, and computer graphics has various methods for visualizing even more than three dimensions. this is why 3d analytical tools are strongly supported. they offer many advantages, because the value of a parameter can be mapped to shape, height, width or length, etc. another advantage is the possibility to use sensitivity analysis. analytical tools provide an offline or online evaluation of the simulation together with fast orientation in complex situations. they also enable backward analysis of an interesting simulation, and they can be used for observing the relations between sensory inputs and executed actions (i.e. what action was triggered when there was a specific sensory input, and vice versa).

2.5 influencing the simulation

parameterization provides the ability to alter the simulation, either as an initial setup of the simulation or as a direct change to the simulation at runtime. this means changing the agent or layer parameters while the simulation is in progress. for example, you can set a new target for the agent, decrease the temperature at a particular spot or even create a new object (agent, food, etc.). this should provide the ability to run longer simulations. user intervention via parameterization can be used for example in a learning process (imitation, conditioned and unconditioned reflexes, etc.). moreover, combined with visual analytical tools it is possible to observe the interesting moments of the simulation and change the scenario to see how the agents will adapt to this change. pausing and resuming the simulation and tracing step by step (even backwards) are also a part of parameterization. backward run is a key element that had been missing until now in simulations. for observing emergent behavior, it is useful when we can trace the simulation back to some interesting point and run it again in order to observe if the situation will end exactly as it did before or if it will differ.

2.6 agent of wal

wala2 separates the body of the agent (the physical representation of the agent) from the mind (the control mechanism). the agent's body is part of the environment and therefore it is covered here in the environment architecture. the mind of an agent, on the other hand, communicates with the body through data from sensors. please note that even the agent's own state has to be observed by sensors. this internal state covers the state of the agent's sensors and effectors (some of these may be damaged or partially malfunctioning) and the vegetative system block. the data from the sensors is sent to the mind of the agent, where it is processed. how the data is processed is not a subject of this abstract architecture; several approaches to agent mind design can be found in [2, 5, 6]. finally, the mind evaluates the situation and selects the action or actions for execution. the body tries to perform these actions using its effectors.

fig. 2: example of internal visualization

fig. 3: example of external 3d visualization

fig. 4: wal agent decomposition
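the body–mind split just described amounts to a sense–decide–act cycle. a minimal sketch follows, with all names invented for illustration — the actual wal interfaces may differ:

class SimpleAgent:
    """body holds sensors and position; the mind only sees sensor data and returns actions."""
    def __init__(self, mind, position=0.0):
        self.mind = mind                 # decision control, possibly remote
        self.position = position
        self.sensors = {}                # filled by the layers each step

    def step(self):
        actions = self.mind(self.sensors)        # mind: dict of readings -> list of actions
        for name, value in actions:              # body tries to execute; layers validate the effect
            if name == "move":
                self.position += value

def thermotaxis_mind(sensors):
    """toy mind: walk away from heat, otherwise stay put."""
    if sensors.get("temperature", 20.0) > 30.0:
        return [("move", -1.0)]
    return []

# usage: agent = SimpleAgent(thermotaxis_mind); layers fill agent.sensors, then agent.step()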
the layer mentioned above will then provide feedback about the effect of these actions by setting new sensory data. the agent's body is part of the environment and as such it has physical properties such as position, shape, temperature, etc. the decomposition and the interfaces of the agent are shown in fig. 4.

2.6.1 agent's sensors

sensors are the agent's senses. they are used to perceive its surroundings and also its internal state. the richness of the information about the surrounding world that the agent can obtain depends just on the type and number of sensors. layers should know all types of possible sensors in order to be able to fill them with data. this means that creating a new layer necessarily also requires the creation of adequate sensors and, conversely, when adding a new sensor it is also necessary to alter the layers so that they are able to fill it. sensors are the part of the environment that contains exact data (a numerical value). it is not always desirable to provide this crisp data to the agent's mind. creatures in nature are also not able to perceive, for example, "that the object is 35.56 meters from them". rather, they are able to perceive relative distance (closer/further) or imprecise data (close/mid range/far). they can attempt to estimate the value based on their experience (it might be 30 to 40 meters). to simulate this kind of perception in an agent's world we want to implement fuzzy information rather than crisp values. this can be done by filtering the exact floating point value to a fuzzy value after it has been obtained by the sensor.

2.6.2 agent's effectors

effectors enable an agent to interact with its surroundings. in our simulated world we use simplified effectors. for example, we use effectors of motion which can move agents in a certain direction at a certain speed. of course we could go into greater detail and implement effectors such as a leg or a wheel, but this would distract us (by solving inverse and forward kinematics tasks, friction, etc.) from our subject of research – behavior. we do not require this level of detail, but we do not run away from it; the possibility to implement it is open. in the field of effectors there is space for improvement. instead of using effectors as part of the environment and controlling their action, we implemented them only as the action result. in the movement example above, we move the agent from one position to another, instead of sending a signal to the agent's locomotion system.

2.7 communication

communication between the agent body and the layers is internal communication, so there is no need for explicit data sending. this communication can be done via the internal data structure, which is in this case the shared medium. the communication between body and mind is in general done via messages. it can be done even remotely, via various media (for example over tcp/ip, see fig. 5). communication on the agent level means sending a message from one agent to another, or to a group of agents. here we would like to let the layers decide who receives the information and who does not. we are trying to reflect real-world behavior, where information is carried via various media and can be received by various entities based on their sensor capabilities. when one agent wants to send a message (tell something) to another agent, it will use its effectors and a certain medium of communication (an acoustic wave, for example).
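a minimal sketch of how such a communication layer could deliver a signal to every agent within range — an illustration of the idea only, not the wal implementation; the attenuation model and the agent fields (hearing, inbox) are invented for this sketch:

class AcousticLayer:
    """delivers a message to every registered agent within the signal's range."""
    def __init__(self):
        self.registered = []

    def emit(self, sender, text, loudness=10.0):
        for agent in self.registered:
            if agent is sender:
                continue
            distance = abs(agent.position - sender.position)
            if distance <= loudness * agent.hearing:   # range depends on the sensor capability
                # every agent in range receives it, addressee or not; it may
                # ignore the message or exploit the overheard information
                agent.inbox.append((sender, text))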
the information can then be received not only by the addressee but also by another agent who is within the range of the signal (even if it did not request this information). the interesting thing about this is that an agent can disregard unwanted information, or make use of it for its own purposes.

3 simulations and results

we mentioned above that there are two components needed to run a simulation. the first of these is the environment described here, while the second is the control of the agent itself (the agent mind). several agent behavior control architectures have been introduced. the basis for the agent architecture was designed by d. kadleček [2, 3]. this agent architecture was redesigned for the wal environment by k. kohout in [1]. several simulations, including the lotka-volterra system (also known as the predator-prey system, see fig. 6), and several task-oriented scenarios were tested. in one of the simulations a task was given to the agent. this means that the agent, in addition to assuring its own survival, should complete a task. in our case the task was to deliver messages. this simulation tested whether the thresholds were set correctly. if the agent's "need" to fulfill the task was low, it focused almost only on its own survival (lazy agents). if the need was high, the agent was busy with its task and fulfilled its survival needs only when necessary (hardworking agent). this simulation showed that various creature or human qualities can be reproduced in agents.

3.1 case study

in section 2.4 we mentioned the usefulness of external modules, namely for simulation analysis. we wanted to test this on a case study performed while redesigning the agent for the wal environment. the above-mentioned vat tool was used for the analysis. the simulation scenario involved a single agent which was intended to move an object between two places. the agent had enough food and water to satisfy its needs.

fig. 5: block scheme of communication to an external client

fig. 6: lotka-volterra simulation results

fig. 7 shows the visualized data from this simulation. on the left there is a 3d mesh; on the right is a detail of the values in the 60th step. even a brief look at the mesh indicates that there is something wrong with the simulation. almost all the parameters are zero (the mesh is flat). this means that the agent is not hungry or thirsty; neither is it tired or sleepy. but we implemented and designed all these features. the reason for this could be a data export failure, a mistake in the implementation of the inner agent vegetative block (the part taking care of the "chemicals" in the agent's body) or a bad initial configuration of the agent. because we had run the simulation previously and the vegetative block had worked properly, there is no problem with the implementation itself. a brief check of the configuration showed that an excessively high value had been set in the time function for increasing/decreasing the chemicals. fig. 8 shows the mesh after correction of the configuration mistake. the values now change with time as they should.

4 conclusion

in this paper, we have described the simulation environment architecture for artificial creatures. it was used by several agent architectures. the first agent architecture tested in this environment was described in section 3 above.
this agent architecture was superseded by several other architectures, namely lemming, designed by l. foltýn [5], acs, proposed by m. mach [6], and animatsim, introduced by a. svrček [7]. these architectures focused on different topics or different approaches to agent learning. animatsim focuses on reinforcement learning; lemming uses the tdidt algorithm to create knowledge about the environment and to reason about it; acs uses hidden markov models and reinforcement learning. they have one thing in common: they were designed in various programming languages (c, java, matlab) but took the wal architecture into account and are capable of running in the wal environment.

references

[1] kohout, k.: simulation of animates' behavior. diploma thesis. prague: czech technical university in prague, faculty of electrical engineering, department of cybernetics, 2004.
[2] kadleček, d., nahodil, p.: new hybrid architecture in artificial life simulation. in: lecture notes in artificial intelligence no. 2159, berlin: springer verlag, 2001, p. 143–146.
[3] kadleček, d.: simulation of an agent – mobot in a virtual environment. diploma thesis. prague: czech technical university in prague, faculty of electrical engineering, department of cybernetics, 2001.
[4] kadleček, d., řehoř, d., nahodil, p., slavík, p., kohout, k.: transparent visualization of multi-agent systems. in: proceedings of 4th international carpathian control conference, may 26–29, 2003, vysoké tatry, p. 723–726, isbn 80-7099-509-2.
[5] foltýn, l.: realization of intelligent agents architecture for artificial life domain. diploma thesis. prague: czech technical university in prague, faculty of electrical engineering, department of cybernetics, 2005.
[6] mach, m.: data mining knowledge mechanism of an environment based on behavior and functionality of its partial objects. diploma thesis. prague: czech technical university in prague, faculty of electrical engineering, department of cybernetics, 2005.
[7] svrček, a.: selection and evaluation of robots-animates behavior. diploma thesis. prague: czech technical university in prague, faculty of electrical engineering, department of cybernetics, 2005.

ing. karel kohout, phone: +420 224 357 350, e-mail: kohoutk@fel.cvut.cz

doc. ing. pavel nahodil, csc., phone: +420 224 357 353, fax: +420 224 353 677, e-mail: nahodil@felk.cvut.cz

department of cybernetics, faculty of electrical engineering, czech technical university in prague, karlovo náměstí 13, 121 35 prague, czech republic

fig. 7: use case – simulation analysis

fig. 8: use case – simulation analysis, correct setup

surface morphology of porous cementitious materials subjected to fast dynamic fractures

t. ficker

abstract: this paper presents a study of the surface height irregularities of cement pastes subjected to fast dynamic fractures. the height irregularities are quantified by the values of three-dimensional profile parameters. the studied dynamical irregularities show a similar analytical behavior to those obtained by static fractures.

keywords: 3d profile analysis, fracture surfaces, cement-based materials, confocal microscopy.

1 introduction

we have recently published several studies [1–3] on the surface morphology of fractured specimens made from hydrated cement pastes.
this material shows a high value of porosity, which has been proved to be an influential factor governing the irregularities of fracture surfaces. for practical surface analyses of porous materials, it would be valuable to know whether the surface height irregularities are also influenced by the method of fracture. for this purpose we performed a large series of experiments with cement pastes. this material was chosen because its porosity can easily be controlled within a broad interval by means of the water-to-cement ratio r = w/c. correct knowledge of the dependence of surface roughness on the fracturing method may be useful for further surface studies of fractured porous materials.

2 experimental arrangement

ordinary portland cement cem 42,5 i r-sc of domestic provenance was used to create 144 specimens of hydrated pastes with six different water-to-cement ratios r (0.3, 0.4, 0.5, 0.6, 0.7, 0.8). the specimens were rotated during hydration to achieve better homogeneity. all specimens were stored for the whole time of hydration at 100 % rh and 20 °c. after 90 days of hydration the specimens were fractured both in the static regime (three-point bending tests) and also in the dynamic regime (impulse fractures caused by a chisel and a heavy hammer). the fracture surfaces were then immediately used for microscopic analysis.

the three-dimensional profile parameter h_a was used to characterize the roughness of the fracture surfaces of the hydrated cement pastes. in fact, h_a represents the averaged 'absolute' height of the fracture relief z = f(x, y)

$$ h_a=\frac{1}{l\cdot m}\iint\limits_{(l\,m)}|f(x,y)|\,dx\,dy \qquad (1) $$

where l × m is the area of the vertical projection of the three-dimensional fracture profile f(x, y) into the plane xy. the parameter h_a has great statistical relevancy, since it is a global averaged characteristic covering the entire tested surface l × m.

fig. 1: 3d confocal relief of fractured cement paste

the three-dimensional profiles f(x, y) were created using an olympus lext 3100 confocal microscope. one of these profiles is shown in figure 1. the profiles are formed by software that processed a series of optical sections created by the confocal microscope at various heights of the fracture surfaces. approximately 100 image sections were taken for each measured surface site, starting from the very bottom of the surface depressions (valleys) and proceeding to the very top of the surface protrusions (peaks). the investigated area l × m = 1280 μm × 1280 μm (1024 pixels × 1024 pixels) was chosen in five different places of each fracture surface (in the center, and in four positions near the corners of the rectangular area), i.e. each plotted point on the graphs of the profile parameters corresponds to an average value composed of 60 measurements (12 samples × 5 surface measurements). each measurement was performed for magnification 20×.

3 results and discussion

it is well known that hydrated cement is a composite material consisting of several solid hydrated products and pore spaces (see the photo in figure 2). porosity (p) influences most of the mechanical properties of this material, and it is therefore not surprising that the surface irregularity (h_a) was also found [1–3] to be among the dependent properties. since porosity to a large extent determines compressive strength, a strong functional relation between the compressive strength σ_c and the surface irregularity of the fracture surfaces was also revealed

$$ \sigma_c(h_a)=\sigma_o\left(\frac{h_o}{h_a-H_o}\right)^{\rho} \qquad (2) $$

where σ_o, h_o, H_o and ρ are fitting parameters.
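a minimal sketch of both computations — the profile parameter of eq. (1) from a discretized height map, and a least-squares fit of eq. (2) — assuming numpy/scipy; the data points are synthetic and purely illustrative, and the parameter names mirror the equations:

import numpy as np
from scipy.optimize import curve_fit

def h_a(height_map):
    """eq. (1): mean absolute height of the relief over the projected area."""
    return np.mean(np.abs(height_map))   # the double integral divided by l*m, for gridded data

def sigma_c(ha, sigma_o, h_o, H_o, rho):
    """eq. (2): compressive strength as a function of the profile parameter h_a."""
    return sigma_o * (h_o / (ha - H_o)) ** rho

# illustrative synthetic data points (h_a in micrometers, sigma_c in mpa)
ha_data = np.array([20.0, 30.0, 45.0, 60.0, 80.0])
sc_data = np.array([95.0, 55.0, 30.0, 20.0, 13.0])
params, _ = curve_fit(sigma_c, ha_data, sc_data, p0=[50.0, 20.0, 5.0, 1.0])
print(dict(zip(["sigma_o", "h_o", "H_o", "rho"], params)))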
Fig. 2: A photograph of the surface of a hydrated cement paste
Fig. 3: Compressive strength in dependence on the height irregularities of a surface formed in a dynamic fracture

In the present study, relation (2) is tested with specimens fractured dynamically using a sharp chisel and a heavy hammer that simulate an impulsive load. Figure 3 shows the resulting graph σc(Ha). The dynamical strength values in Figure 3 were evaluated on the basis of the static values by adding corrections corresponding to dynamic processes [4]. As shown in Figure 3, the experimental points are fitted well by function (2), which means that this function may describe a universal behavior of the compressive strength and the height irregularities of the surfaces formed both by static fracture processes [3] and by dynamic fracture processes.

4 Conclusion
The experiments have proved similar behavior of the height surface irregularities formed by static and dynamic fractures. The fast fracture process accomplished with the wedge-shaped chisel generates very similar graphs σc(Ha) to those obtained in slow fracture processes. These results have been achieved using three-dimensional profile parameters Ha evaluated on the basis of a reconstruction of the confocal surface. The graphs σc(Ha) seem to be convenient candidates for calibration curves. The properties of the surface irregularities of fractured cement pastes presented here may also be useful for morphological and structural studies of other porous materials.

Acknowledgement
This work was supported by the Ministry of Education, Youth and Sports of the Czech Republic under contract No. ME09046 (KONTAKT).

References
[1] Ficker, T., Martišek, D., Jennings, H. M.: Roughness of fracture surfaces and compressive strength of hydrated cement pastes. Cem. Concr. Res. 40 (2010), 947–955.
[2] Ficker, T., Martišek, D., Jennings, H. M.: Surface roughness and porosity of hydrated cement pastes. Acta Polytechnica 51 (2011), No. 3, 7–20.
[3] Ficker, T.: Fracture surfaces of porous materials. Acta Polytechnica 51 (2011), No. 3, 21–24.
[4] Ficker, T.: Quasi-static compressive strength of cement-based materials. Cem. Concr. Res. 41 (2011), 129–132.

Prof. RNDr. Tomáš Ficker, DrSc., phone: +420 541 147 661, e-mail: ficker.t@fce.vutbr.cz
Department of Physics, Faculty of Civil Engineering, Brno University of Technology, Veveří 95, 662 37 Brno, Czech Republic

The Amount of Regenerated Heat inside the Regenerator of a Stirling Engine
J. Škorpík

Abstract: The paper deals with analytical computation of the regenerated heat inside the regenerator of a Stirling engine. The total sum of the regenerated heat is constructed as a function of the crank angle in the case of Schmidt's idealization.
Keywords: Stirling engine, regenerator, regenerated heat, Schmidt's idealization.

1 Introduction
Stirling's engine (Fig. 1) is a volumetric engine, in which work is done by changing the volume, the pressure and the temperature of the working gas. The working gas is moved by pistons on the hot and cold side through the regenerator. The motion of the pistons is regular, and the pistons are mechanically connected, frequently by a crank shaft. The power output is taken through the crank shaft of the engine; one cycle of the Stirling engine is one turn of the crank shaft. The regeneration of heat inside the regenerator is perfect according to Schmidt's idealization. If the working gas flows from the hot side to the cold side, it is cooled in the regenerator from temperature t_h to temperature t_c, and heat is saved to the matrix of the regenerator during this cooling. If the working gas flows from the cold side to the hot side, it is heated from t_c to t_h inside the regenerator, and the heat is taken back from the matrix of the regenerator during the heating. This process is called regeneration of the heat.
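An order-of-magnitude illustration of why this regeneration matters: for helium cooled from t_h to t_c in the regenerator, the heat handed over to the matrix per pass can be estimated from the ideal-gas c_p. The sketch below (Python) uses the V-160 data quoted later in this paper; it is an illustration only, not part of the author's derivation.

kappa, r = 1.67, 2077.22            # helium: adiabatic index, gas constant
T_h, T_c = 900.0, 330.0             # hot- and cold-side temperatures [K]
c_p = kappa / (kappa - 1.0) * r     # ideal-gas c_p [J/(kg K)]
q_pass = c_p * (T_h - T_c)          # heat exchanged with the matrix per pass
print(f"c_p = {c_p:.0f} J/(kg K), q = {q_pass / 1e6:.2f} MJ/kg")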
The temperature of the working gas on the hot side and on the cold side is not constant during the cycle in a real Stirling engine (Fig. 1). The amount of regenerated heat inside the regenerator of the Stirling engine is much greater than the heat put into the engine (Urieli and Berchowitz – simple adiabatic analysis [1]). However, no analytical method for computing the amount of regenerated heat inside the regenerator has existed until now. The amount of regenerated heat has a fundamental effect on the thermal efficiency, the performance and the dimensions of the engine.

Schmidt's idealization [2] is a comparative cycle of an engine with external heat transfer (the Stirling engine), in which isothermal processes take place in all volumes of the engine (Fig. 3). Schmidt's idealization has so far been the only analytical computing method for the thermodynamic design of a Stirling engine. The heat flows, the work output and the thermal efficiency of the cycle without wastage of energy are computed on the basis of the assumptions of Schmidt's idealization, which are stated below.

2 Notation
c_p  specific heat capacity of the working gas at constant pressure  [J·kg⁻¹·K⁻¹]
i    enthalpy of the working gas  [J]
m    total amount of working gas in the engine  [kg]
p    pressure  [Pa]
q    energy balance (heat)  [J]
r    individual gas constant of the working gas  [J·kg⁻¹·K⁻¹]
t    absolute temperature  [K]
v    volume  [m³]
α    phase angle of the hot-side volume variation to the cold-side volume variation  [rad]
η    thermal efficiency  [–]
φ    crank angle  [rad]
κ    adiabatic index  [–]
τ    temperature ratio  [–]
Subscripts: c cold side; car Carnot; cc cylinder volume of the cold side; d dead (dead volume); h hot side; hc cylinder volume of the hot side; max maximum; min minimum; r regenerator; reg regenerated.

Fig. 1: Scheme of a Stirling engine with dead volumes and the progress of the temperature of the working gas over the crank angle φ (hot side, regenerator, cold side)

3 The amount of regenerated heat inside the regenerator
The amount of regenerated heat inside the regenerator of the Stirling engine can be found from an energy balance of the cycle. This energy balance can be applied to the Stirling engine (Fig. 1) under the following assumptions:
1) The working gas is an ideal gas.
2) There is no pressure loss, and the pressure is the same in the whole volume.
3) The engine is totally sealed.
4) There is no heat transfer between the matrix of the regenerator and the structure of the engine.
5) Steady-state conditions are assumed for the overall operation of the engine, so that the pressures, temperatures, etc. are subject to cyclic variations only.

The energy balance on part 1–2 of the cycle in the T–s chart (Fig. 2) is described by the following equation:

$$q_{1\text{-}2} = \int_{s_1}^{s_2} \mathrm{d}q = q_{h,1\text{-}2} + q_{c,1\text{-}2} + q_{r,1\text{-}2} \qquad (1)$$
The thermal flows and the entropy are functions of the crank angle, so equation (1) can be rewritten as

$$q_{1\text{-}2} = \int_{\varphi_1}^{\varphi_2} \mathrm{d}q = q_{h,1\text{-}2} + q_{c,1\text{-}2} + q_{r,1\text{-}2} \qquad (2)$$

Fig. 2: General thermal cycle in the T–s chart

The energy balance of the working gas inside the regenerator on part 1–2 of the cycle therefore follows from equation (2):

$$q_{r,1\text{-}2} = q_{1\text{-}2} - q_{h,1\text{-}2} - q_{c,1\text{-}2} \qquad (3)$$

The energy balance of the working gas on the hot side of the engine on part 1–2 of the cycle is

$$q_{h,1\text{-}2} = \big[i_h(\varphi)\big]_{\varphi_1}^{\varphi_2} + \int_{\varphi_1}^{\varphi_2} p(\varphi)\,\mathrm{d}v_h(\varphi) \qquad (4)$$

and the energy balance of the working gas on the cold side, with the enthalpy i_c = c_p m_c t_c, is

$$q_{c,1\text{-}2} = \big[i_c(\varphi)\big]_{\varphi_1}^{\varphi_2} + \int_{\varphi_1}^{\varphi_2} p(\varphi)\,\mathrm{d}v_c(\varphi) \qquad (5)$$

For the energy balance of the whole cycle on part 1–2 we use the first law of thermodynamics:

$$q_{1\text{-}2} = c_v\big[m\,t(\varphi)\big]_{\varphi_1}^{\varphi_2} + \int_{\varphi_1}^{\varphi_2} p(\varphi)\,\mathrm{d}v(\varphi) \qquad (6)$$

The energy balance of the working gas inside the regenerator on part 1–2 of the cycle is found by substituting equations (4), (5) and (6) into equation (3):

$$q_{r,1\text{-}2} = c_v\big[m\,t(\varphi)\big]_{\varphi_1}^{\varphi_2} + \int_{\varphi_1}^{\varphi_2} p\,\mathrm{d}v - \big[i_h\big]_{\varphi_1}^{\varphi_2} - \int_{\varphi_1}^{\varphi_2} p\,\mathrm{d}v_h - \big[i_c\big]_{\varphi_1}^{\varphi_2} - \int_{\varphi_1}^{\varphi_2} p\,\mathrm{d}v_c \qquad (7)$$

where the mean temperature of the working gas in the engine follows from the state equation,

$$t(\varphi) = \frac{p(\varphi)\,v(\varphi)}{r\,m} \qquad (8)$$

and the total volume is the sum of all working volumes of the engine,

$$v(\varphi) = v_h(\varphi) + v_r + v_c(\varphi) \qquad (9)$$

Substituting equations (8) and (9) into equation (7), using c_v/r = 1/(κ−1) and noting that the regenerator volume v_r is constant (so that the work term over v_r vanishes), gives

$$q_{r,1\text{-}2} = \frac{1}{\kappa-1}\big[p(\varphi)\,v(\varphi)\big]_{\varphi_1}^{\varphi_2} - \big[i_h(\varphi)\big]_{\varphi_1}^{\varphi_2} - \big[i_c(\varphi)\big]_{\varphi_1}^{\varphi_2} \qquad (10)$$

This equation is the energy balance of the regenerator on part 1–2 of the cycle. The total sum of the regenerated heat inside the regenerator from the state φ₁ = 0 follows from equation (10):

$$q_{r,0\text{-}\varphi} = \frac{1}{\kappa-1}\big[p(\varphi)\,v(\varphi) - p(0)\,v(0)\big] - \big[i_h(\varphi) - i_h(0)\big] - \big[i_c(\varphi) - i_c(0)\big] \qquad (11)$$

The pressure p(φ) and the volume of the engine v(φ) for a specific crank angle are calculated or measured. Fig. 4 describes the progress of the total sum of the regenerated heat inside the regenerator (11) as a function of the crank angle. From equations (4), (5) and (11) we can calculate the heat input to the engine, the heat output from the engine and the regenerated heat inside the regenerator on part 1–2 of the cycle. The difference between the maximum and the minimum value of the function q_{r,0-φ} is the amount of regenerated heat inside the regenerator in the cycle:

$$q_{reg} = q_{r,0\text{-}\varphi,\max} - q_{r,0\text{-}\varphi,\min} \qquad (12)$$

The amount of regenerated heat can be computed exactly if the change of enthalpy of the working gas on the hot side and the cold side, Δi_h(φ) and Δi_c(φ), is known (more about this problem in [3], [4]). If the thermodynamic processes on the hot side and the cold side are isothermal, the enthalpy of the working gas does not change (Δi_h(φ) = Δi_c(φ) = 0) and we can use equation (13). The thermodynamic processes are not isothermal in real Stirling engines.
These changes of the temperature (enthalpy) of the working gas on the hot side and the cold side cannot be computed without measurement. However, the amount of regenerated heat inside the regenerator can be computed from equation (13) under the assumption Δi_h(φ) = Δi_c(φ) = 0:

$$q_{r,0\text{-}\varphi} = \frac{1}{\kappa-1}\big[p(\varphi)\,v(\varphi) - p(0)\,v(0)\big] \qquad (13)$$

The value of the regenerated heat computed from equation (13) is greater than the value computed from equation (11); a regenerator designed by equation (13) is therefore larger. The derivation of equation (13) was obtained using differential equations by Prof. Urieli, but only for the conditions t_h = const., t_c = const.

4 A calculation of the amount of regenerated heat inside the regenerator for a Stirling engine fulfilling the assumptions of Schmidt's idealization
From equation (11) or (13) we can compute the amount of regenerated heat inside the regenerator of the Stirling engine if the computed or measured functions p(φ) and v(φ) are known. This section presents a method for calculating the amount of regenerated heat inside the regenerator in the cycle of a Stirling engine fulfilling the assumptions of Schmidt's idealization. Schmidt's idealization can be applied to the engine in Fig. 3 under the following assumptions:
1) The temperature of the working gas which flows from the regenerator to the hot side is t_h, and the temperature of the working gas which flows from the regenerator to the cold side is t_c.
2) There is no pressure loss, and the pressure is the same throughout the volume.
3) The working gas is an ideal gas.
4) The engine is totally sealed.
5) The motion of the pistons is sinusoidal.
6) The temperature of the working gas on the hot side is constant and equal to t_h, and the temperature of the working gas on the cold side is constant and equal to t_c. The mean temperature of the working gas in the regenerator is constant and equal to t_r.
7) Heat enters the engine only through the walls on the hot and cold sides of the engine.
8) There is perfect regeneration.

The summary equations of Schmidt's idealization are developed from these assumptions [2]. Pressure:

$$p(\varphi) = \frac{2\,r\,t_h\,m}{v_{hc,max}\big[a + b\cos(\varphi - \beta)\big]} \qquad (14)$$

where the auxiliary quantities a and b and the angle β combine the cylinder and dead volumes, the temperatures t_h and t_c, the log-mean regenerator temperature t_r = (t_h − t_c)/ln(t_h/t_c), the volume ratio k = (v_cc,max/v_hc,max)(t_h/t_c) and the phase angle α; their full expressions, including the arctan relations for β, are given in [2].

Volume of the hot side:

$$v_h(\varphi) = v_{hc}(\varphi) + v_{hd} \qquad (15)$$

Volume of the cold side:

$$v_c(\varphi) = v_{cc}(\varphi) + v_{cd} \qquad (16)$$

Heat input to the engine:

$$q_h = r\,t_h\,m\,\frac{k\,b\,\sin\beta}{a + \sqrt{a^2 - b^2}} \qquad (17)$$

Heat output from the engine:

$$q_c = r\,t_h\,m\,\frac{k\,b\,\sin(\beta - \alpha)}{a + \sqrt{a^2 - b^2}} \qquad (18)$$

Thermal efficiency of the cycle:

$$\eta = \frac{a}{q_h} = 1 - \frac{t_c}{t_h} = \eta_{car} \qquad (19)$$

where a in equation (19) denotes the work of the cycle. As is evident from equation (19), an engine fulfilling the assumptions of Schmidt's idealization has a thermal efficiency equal to the Carnot efficiency for the temperature ratio t_h/t_c. The temperature of the working gas on the hot and cold sides of the engine is constant; therefore the enthalpy of the working gas on these sides does not change (Δi_h(φ) = Δi_c(φ) = 0) and we can use equation (13).
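A minimal numerical sketch of this calculation is given below (Python; not the author's code). It evaluates equation (13) with the pressure taken from the standard isothermal Schmidt mass balance p = r·m/(v_h/t_h + v_r/t_r + v_c/t_c) rather than from the closed form (14), assumes sinusoidal piston motion, and uses the United Stirling V-160 data quoted in the caption of Fig. 4; the cold swept volume, which is not listed there, is assumed equal to the hot one.

import numpy as np

kappa, r = 1.67, 2077.22                  # helium
T_h, T_c = 900.0, 330.0                   # [K]
T_r = (T_h - T_c) / np.log(T_h / T_c)     # log-mean regenerator temperature
V_hc, V_cc = 160e-6, 160e-6               # swept volumes [m^3]; V_cc assumed
V_hd, V_cd, V_r = 191e-6, 110e-6, 28e-6   # dead volumes [m^3]
alpha = np.radians(105.0)                 # phase angle

phi = np.linspace(0.0, 2.0 * np.pi, 3601)
V_h = V_hd + 0.5 * V_hc * (1.0 + np.cos(phi))          # eq. (15), sinusoidal
V_c = V_cd + 0.5 * V_cc * (1.0 + np.cos(phi - alpha))  # eq. (16), sinusoidal
V = V_h + V_r + V_c                                    # eq. (9)

denom = V_h / T_h + V_r / T_r + V_c / T_c
m = 15e6 * np.mean(denom) / r             # mass chosen so the mean p is ~15 MPa
p = r * m / denom                         # isothermal Schmidt pressure

q_r = (p * V - p[0] * V[0]) / (kappa - 1.0)   # eq. (13)
Q_reg = q_r.max() - q_r.min()                 # eq. (12)
print(f"Q_reg ~ {Q_reg:.0f} J per cycle")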
Fig. 3: Simplified scheme of the Stirling engine fulfilling the assumptions of Schmidt's idealization, and the progress of the temperature of the working gas in the volumes of the engine (t_h = const., t_c = const., t_r)

Fig. 4: Progress of the total sum of the regenerated heat (13) as a function of the crank angle for a Stirling engine fulfilling the assumptions of Schmidt's idealization. (a) – q_{r,0-φ} diagram for the Stirling engine United Stirling V-160 (working gas helium, κ = 1.67, r = 2077.22 J/(kg·K), t_h = 900 K, t_c = 330 K, p_mean = 15 MPa, v_hc,max = 160 cm³, v_hd = 191 cm³, v_cd = 110 cm³, v_r = 28 cm³, α = 105°). (b) – q_{r,0-φ} diagram for the same Stirling engine, but with v_hd = v_cd = v_r = 0 cm³. q_reg is the amount of regenerated heat inside the regenerator in case (a). Other results of these cycles are shown in Table 1.

5 Conclusion
The amount of regenerated heat inside the regenerator of a Stirling engine can be computed analytically if the functions p(φ) and v(φ) in equation (13) are known. The progress of the total sum of the regenerated heat q_{r,0-φ} as a function of the crank angle for a Stirling engine fulfilling the assumptions of Schmidt's idealization is shown in Fig. 4a; this figure also shows the amount of regenerated heat inside the regenerator, q_reg. From equations (17) and (12) we can compute the ratio of the regenerated heat to the heat input to the engine. This ratio is 4.94 (for an engine with the technical specifications in Fig. 4), and it corresponds perfectly with the ratio computed by Urieli and Berchowitz [1]. The larger this ratio is, the greater the impact of the efficiency of regeneration on the heat input to the engine. The progress of the total sum of the regenerated heat q_{r,0-φ} for the Stirling engine fulfilling the assumptions of Schmidt's idealization without dead volumes is shown in Fig. 4b. The ratio of the regenerated heat to the heat input to the engine (1.59) is smaller than in the previous case. Thus, if the dead volumes in the engine decrease, the influence of the regenerated heat inside the regenerator on the thermal efficiency of the cycle also decreases.

References
[1] Urieli, I.: Stirling Cycle Engine Analysis. Bristol: A. Hilger, 1984. ISBN 0-85274-435-8.
[2] Walker, G.: Stirling Engines. Oxford University Press, 1980. ISBN 80-214-2029-4.
[3] Škorpík, J.: Příspěvek k návrhu Stirlingova motoru. VUT v Brně, edice PhD Thesis, 2008. ISBN 978-80-214-3763-0.
[4] Škorpík, J.: Energetická bilance oběhu Stirlingova motoru. [online] www.oei.fme.vutbr.cz/jskorpik/190.html

Ing. Jiří Škorpík, Ph.D., phone: +420 541 142 575, fax: +420 541 143 345, e-mail: skorpik@fme.vutbr.cz
Department of Power Engineering, Brno University of Technology, Faculty of Mechanical Engineering, Technická 2896/2, 616 69 Brno, Czech Republic

Table 1: Other results of the Stirling engine cycle fulfilling the assumptions of Schmidt's idealization for the parameters presented in the legend of Fig. 4

Parameter        (a)       (b)
q_h [J]          969.1     3129.1
q_c [J]          -355.4    -1147.4
a (work) [J]     613.8     1981.8
q_reg [J]        4784.1    4963.8
q_reg/q_h [-]    4.94      1.59
η [-]            0.63      0.63
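The internal consistency of Table 1 is easy to verify: the ratios and the efficiency quoted in the conclusion follow directly from the tabulated energies. A short check (Python, plain arithmetic on the published numbers):

cases = {"a": (969.1, -355.4, 613.8, 4784.1),
         "b": (3129.1, -1147.4, 1981.8, 4963.8)}   # (q_h, q_c, work a, q_reg)
for name, (q_h, q_c, work, q_reg) in cases.items():
    print(f"({name}) q_reg/q_h = {q_reg / q_h:.2f}, "
          f"eta = {work / q_h:.2f}, q_h + q_c = {q_h + q_c:.1f} J")
# -> ratios 4.94 and 1.59; eta = 0.63 in both cases; the work equals q_h + q_c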
Metallurgical Joining of Magnesium Alloys by the FSW Process
Tomáš Kupec¹, Ivana Hlavačová², Milan Turňa¹
¹ Slovak University of Technology, Faculty of Materials Science & Technology, Department of Welding, J. Bottu 25, 917 24 Trnava, Slovakia
² University of Žilina, Faculty of Mechanical Engineering, Univerzitná 1, 010 26 Žilina, Slovakia
Correspondence to: tomas.kupec@stuba.sk

Abstract: This paper deals with welding AZ31 Mg alloy by FSW (friction stir welding) technology. The welds were fabricated with new equipment supplied from China for VUZ-PI Bratislava (Welding Research Institute, Industrial Institute). Welding parameters and conditions were proposed and tested. The joint quality was assessed by optical microscopy and microhardness measurements. The fabricated joints were sound, apart from minor inhomogeneities (cracks). It is considered that after certain adaptations of the welding parameters, and perhaps also of the welding tool, this equipment will be capable of producing welded joints of excellent quality that can compete with any fusion welding technology, including concentrated power sources.
Keywords: welding, magnesium alloy, friction stir welding, quality control.

1 Introduction
Magnesium alloys are applied mainly in the aviation, space, railway and shipbuilding industries, and also in general engineering [1]. Magnesium alloys are among the lightest structural materials. One of the advantages of Mg alloys is therefore that they reduce the weight of vehicles, which leads to a significant reduction in fuel consumption and in emissions [6].

The principle of this technology is that a rotating tool with a specially designed pin on a shoulder is pressed into the gap between the welded materials, and is subsequently led along the weld line. Considerable heat is thus generated by the friction and the plastic strain of the materials. The temperatures occurring in the weld zone usually attain 80 to 90 % of the melting point of the given metal [4]. The principle of this technology is shown schematically in Figure 1. The tool used in welding by the FSW process must be sufficiently strong, tough and resistant to wear during the welding process [3]. It must additionally have good oxidation resistance and a low coefficient of thermal conductivity, in order to minimize the thermal losses. Special steels and/or composite materials are selected for the pin material. Most tools have a concave shoulder, which acts as a release volume for the material displaced by the pin. The tool prevents material from being forced out at the sides of the shoulder, and the desirable forging of material under the tool is also preserved [2].

Figure 1: Schematic representation of the FSW process [5]

2 Experimental
An Mg alloy welding experiment was carried out in cooperation with WRI-II SR Bratislava, which has available newly-installed FSW equipment from China, see Figure 2. Specimens of AZ31 Mg alloy, 100 × 55 × 10 mm in dimensions, were welded in the experiments. The chemical composition of this alloy is presented in Table 1. A weld of the AZ31 Mg alloy is shown in Figure 3.

Equipment parameters:
Equipment type: FSW – LM-060
Welding speed: max. 1500 mm/min
Compressive force: max. 12 t
Tool revolutions: 1500 rpm
Axis strokes: x: 2 m, y: 1 m, z: 0.4 m
Power supply: 3 AC 380 V, 50 Hz
Current: 115 A
Rated power: 60 kW
Figure 2: FSW – LM-060 welding equipment

Table 1: Chemical composition of AZ31 alloy [wt. %]
Al 2.5–3.5 | Zn 0.6–1.4 | Mn 0.2–0.6 | Si < 0.05 | Cu < 0.008 | Ni < 0.002 | Fe < 0.008 | Ca < 0.02 | other < 0.03 | Mg balance

Welding parameters for the Mg alloy: tool revolutions 800 rpm, welding speed 100 mm/min, inclination angle of the tool 3°, compressive force 3.5 t.

Figure 3: Weld of the Mg alloy fabricated by the FSW process

The samples for the metallographic study of the microstructure were prepared by a standard procedure: cold embedding of the sample into epoxy resin, grinding with emery papers of gradually finer granularity, and mechanical polishing with a diamond paste of 3 μm to 1 μm grain size. The final polishing was performed with an OPS-type emulsion. The samples were etched by dipping in 1 % Nital for 20 s. The microstructure of the base metal (Figure 4) is composed of a δ solid solution (a solid solution of Al in Mg) and an intermediary γ phase (Mg17Al12). When the Mg alloy was heated during welding and subsequently cooled down, crystallisation of eutectic cells occurred in the vicinity of the γ phase (Figure 5).

Figure 4: Microstructure of the base metal (AZ31 alloy), etched in 1 % Nital
Figure 5: Microstructure of the weld metal, etched in 1 % Nital
Figure 6: Boundary zone between the weld metal and the base metal (precipitated γ phase)
Figure 7: Boundary between the base metal and the weld metal

The transition zone between the base metal and the weld metal is characterized by a visible boundary (Figure 6). Precipitation of the γ phase occurred on the weld metal–base metal boundary, see Figure 6. The base metals made a major contribution to the heat removal, since the weld metal cooled down in the direction from the weld boundary towards the welded base metals. A network of hot-stress microcracks was probably formed on the weld surface.

The course of the microhardness was measured by a Vickers test using a Zwick/Roell microhardness tester, applying a 500 g load on the cross section of the weld joint for 10 s. Microhardness measurements typically show a great scatter of the measured values, since only a very small volume of material is involved. This means that the hardness value is considerably affected by microstructural factors (grain boundaries, inclusions, phase distribution etc.). The distribution of the indents in the hardness measurement is shown in Figure 8, and the course of the microhardness is shown in Figure 9. We may conclude that the microhardness values of the weld metal, the heat affected zone (HAZ) and the base metal do not differ significantly.
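Because individual indents scatter strongly, the comparison between zones is best made on zone means. A minimal sketch of such a summary is shown below (Python); the HV values are hypothetical placeholders, not the measured data of Figures 8–9.

import numpy as np

zones = {                       # hypothetical HV0.5 indent values per zone
    "base metal": [58, 61, 60, 57, 62, 59],
    "HAZ":        [60, 63, 58, 61, 59, 62],
    "weld metal": [62, 60, 64, 59, 61, 60],
}
for zone, hv in zones.items():
    hv = np.asarray(hv, dtype=float)
    # ddof=1: sample standard deviation over the indents of one zone
    print(f"{zone}: mean = {hv.mean():.1f} HV, s = {hv.std(ddof=1):.1f} HV")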
3 Conclusions
This paper has presented a brief survey of the metallurgy of welding Mg alloys using the FSW process. The welding was performed on Chinese-produced FSW – LM-060 equipment, which is at present installed at the Welding Research Institute (Industrial Institute) of the Slovak Republic with its seat in Bratislava. AZ31 Mg alloy was used; its chemical composition is given in Table 1. The selected welding parameters and conditions have not yet been fully optimized, since defects such as microcracks were observed in several places. It will therefore be necessary to adapt the welding parameters, and perhaps also the shape and inclination of the welding tool, in future experiments. A metallographic assessment of the welded joints revealed the actual phases of the Mg alloy. The measured microhardness values in the base metal, the HAZ and the weld metal did not differ considerably. It can be supposed that this technology will be very suitable for high-quality welding of Mg alloys.

Figure 8: Distribution of the indents for the microhardness measurement across the weld metal–base metal boundary
Figure 9: Course of the microhardness in a weld joint across the weld metal–base metal boundary

Acknowledgement
This paper reports on work carried out with support from the Grant Agency VEGA MŠVVŠ SR, in the framework of project No. 1/2594/12 Study of metallurgical joining and other technological processes of ecologically friendly technologies.

References
[1] Friction stir welding. Technical handbook. ESAB, 2011.
[2] Hrivňák, I.: Zváranie a zvariteľnosť materiálov (Welding and weldability of materials). Trnava: STU, 2011. ISBN 978-80-227-3167-6.
[3] Accessible at: http://www.twi.co.uk/technologies/welding-coating-and-materialprocessing/friction-stir-welding/
[4] Mishra, R. S., Mahoney, M. W.: Friction Stir Welding and Processing. USA: ASM International, 2007, 360 p. ISBN 978-0-87170-840-3.
[5] FrictionStirLink. Accessible at: http://www.frictionstirlink.com/desc.html
[6] Kainer, K. U.: Magnesium. In: Proceedings of the 7th International Conference on Mg Alloys and Their Applications. Weinheim: Wiley-VCH Verlag GmbH & Co. KGaA, 2007.
[7] Turňa, M.: Solid state special welding methods. Lectures, IIW postgraduate study. Prague, 2011.

Application of Induction Heating for Brazing Parts of Solar Collectors
Kristína Demianová¹, Milan Ožvold¹, Milan Turňa¹
¹ Slovak University of Technology, Faculty of Materials Science and Technology, J. Bottu 25, 917 24 Trnava, Slovakia
Correspondence to: kristina.demianova@gmail.com

Abstract: This paper reports on the application of induction heating for brazing parts of solar collectors made of Al alloys. The tube–flange joint is a part of the collecting pipe of a solar collector. The main task was to design an induction coil for this type of joint, and to select the optimum brazing parameters. Brazing was performed with AlSi12 brazing alloy, and corrosive and non-corrosive flux types were applied. The optimum brazing parameters were determined on the basis of testing the fabricated brazed joints by visual inspection, by leakage tests, and by macro- and micro-analysis of the joint boundary. The following conditions can be considered the best for brazing the Al materials: power 2.69 kW, brazing time 24 s, flux Brazetec F32/80.
Keywords: aluminium alloy, brazing, induction heating.

1 Introduction
Induction heating is widely used nowadays in many industrial applications, e.g. in material melting processes, heat treatment, hot forming, and also for welding and brazing metallic materials. Modern induction heating provides reliable, repeatable, non-contact and energy-efficient heat in a minimal amount of time, without a flame. Solid-state systems are capable of heating very small areas within precise production tolerances, without disturbing the individual metallurgical characteristics.
For larger-volume applications and/or quality-dependent processes, parts can be brazed with induction under a controlled atmosphere without flux or any additional cleaning steps [2, 3]. The advantages of induction heating over conventional heating in furnaces or with other heat sources are power efficiency and friendliness to the living environment. The main aims of this paper are to report on the design of an induction coil for induction brazing of parts of solar collectors, to determine the optimum brazing parameters, and to select suitable base materials, a suitable brazing alloy and a suitable flux. This work is linked with our previous study [1].

2 Experimental
The brazing experiments were performed using HRF 15 equipment produced for high-frequency induction heating. For this technology, the design of a suitable induction coil is very important. Experience acquired from previous research formed the basis for the design of the induction coil and also for the heating of the brazed parts (metallic materials). The coil and its basic dimensions are shown in Figure 1.

Figure 1: Dimensions of the components: a) inductor, b) brazed tubes

Cu and its alloys are the most widely used material for fabricating the collecting pipe of solar collectors. However, Al is nowadays becoming an important substitute material for this application, owing to its unique utility properties: it has good corrosion resistance, it is lighter and cheaper than copper, and it is 100 % recyclable. The selection of materials for the parts of the collecting pipe therefore focused on suitable Al alloys. The flange was fabricated from Al alloy type AW 6082, and the tube from alloy type AW 3000. The chemical composition of these materials is presented in Tables 1 and 2.

Table 1: Chemical composition of AW 6082 alloy [wt. %]
Si 0.7–1.3 | Fe 0.5 | Cu 0.1 | Mn 0.4–1.0 | Mg 0.6–1.2 | Cr 0.25 | Zn 0.2 | Ti 0.1 | Al balance

Table 2: Chemical composition of AW 3000 alloy [wt. %]
Si 0.05–0.15 | Fe 0.06–0.35 | Cu max. 0.1 | Mn 0.3–0.6 | Mg 0.02–0.20 | Zn 0.05–0.3 | Cr max. 0.25 | Al balance

Table 3: Chemical composition of brazing alloy type Al 104 [wt. %]
Si 11–13 | Fe 0.6 | Cu max. 0.3 | Mn 0.15 | Mg 0.1 | Zn 0.2 | Ti max. 0.15 | Al balance

Figure 2: Position of the joined parts in the inductor

Brazing alloy type Al 104 (Table 3) in the form of a wire was used for fabricating the flange–tube joint. A tubular solder of type AlSi12 filled with Nocolok flux was also used. The following fluxes were used during the brazing process: Brazetec F30/70 (FL 10), Brazetec F32/80 (FL 20) and Nocolok. Prior to brazing proper, the brazed areas were cleaned of surface impurities and oxides, and were subsequently degreased. An appropriate amount of flux was then deposited on the brazed surfaces (in the case of the brazing alloy, in the form of a wire), and these were positioned on the axis of the induction coil, see Figure 2. The process parameters during brazing were set on the basis of a visual inspection at a stable frequency of 360 kHz. External defects of the joints were observed, e.g. insufficient filling, porosity damaging the surface, local melting, and rough joint surfaces. Other important considerations were the precise positioning of the joined parts in the inductor and the correct selection of the power in relation to the brazing duration. Figure 3a shows the P1 joint, which was fabricated with the application of FL 10 flux and AlSi12 brazing alloy at an output power of 2.26 kW.
The brazing time was 35 s. Melting of the flange and incomplete filling of the clearance with the brazing alloy were observed. In the case of joint P2 (Figure 3b), a higher power (2.9 kW) was selected, which led to local melting of the flange at a brazing time of 18 s. In the case of joints P3 and P4 (Figures 3c, d), which were fabricated at 2.6 kW power and a brazing time of 25 s, insufficient filling with the brazing alloy was also observed when FL 10 flux was applied and then removed from the joint; the brazing alloy wetted just the parent metal of the tube. Experience obtained when fabricating the previous joints showed that it is suitable to use a power of around 2.6 kW to prevent the flange from melting down.

Figure 3: Brazed joints fabricated with various parameters: a) 2.26 kW, 35 s, FL 10; b) 2.9 kW, 18 s, FL 10; c) 2.6 kW, 25 s, FL 10; d) 2.6 kW, 25 s, FL 10

Nocolok-type flux (Figure 4a) was applied for fabricating joint P5; the brazing time was 26 s. After removing the residual flux, we observed porosity damaging the surface and local melting of the parent material. FL 20 flux was used for brazing joint P6; the brazing time was 28 s at an output power of 2.6 kW. After the residual flux was removed from the joint, local melting of the parent metal was visible (Figure 4b). With a slight power increase to 2.69 kW, joint P7 was fabricated at a time of 24 s. When the flux residues were removed, it was obvious (Figure 4c) that at this power setting and flux selection there was neither melting of the flange nor any formation of porosity damaging the surface.

Figure 4: Brazed joints fabricated with various parameters: a) 2.6 kW, 26 s, Nocolok; b) 2.6 kW, 28 s, FL 20; c) 2.69 kW, 24 s, FL 20

The joint macrostructure was observed on specimen P6 (Figure 5). The brazing alloy sufficiently filled the required joint clearance. Isolated pores were observed in the capillary gap. The detailed view of the microstructure shows grain growth in the flange edge zone.

Figure 5: Macrostructure of the brazed joint of specimen P6, with a detail of the microstructure (2.6 kW, 26 s, FL 20)
Figure 6: Macrostructure of the brazed joint of specimen P7, with a detail of the microstructure (2.69 kW, 24 s, FL 20)

Figure 6 shows the macrostructure of joint P7. When the microstructure in the flange edge zone was inspected, no grain growth was observed. This was due to the power increase during brazing and thus also due to the suitable shortening of the brazing time. Isolated voids (shrinkage holes) occurred in the brazing alloy microstructure. The quality of the joints was evaluated by light microscopy and also by a leakage test. The leakage test was performed at the Thermosolar company, using the Thermosolar leakage test equipment at a pressure of 1 MPa. All the joints that were tested passed the test.
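The parameter study above can be summarized compactly. The sketch below (Python) merely recaps the trial values quoted in the text (it is not a dataset from the paper) and screens for the joints with no defect noted in the visual inspection:

trials = [  # (joint, power [kW], time [s], flux, defect observed)
    ("P1", 2.26, 35, "FL 10",   "flange melting, incomplete filling"),
    ("P2", 2.90, 18, "FL 10",   "local melting of the flange"),
    ("P3", 2.60, 25, "FL 10",   "insufficient filling"),
    ("P4", 2.60, 25, "FL 10",   "insufficient filling"),
    ("P5", 2.60, 26, "Nocolok", "surface porosity, local melting"),
    ("P6", 2.60, 28, "FL 20",   "local melting of the parent metal"),
    ("P7", 2.69, 24, "FL 20",   None),
]
for joint, power, time, flux, defect in trials:
    if defect is None:
        print(f"{joint}: {power} kW, {time} s, {flux} -> defect-free")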
3 Discussion
The following conclusions may be drawn from the results. Precise positioning of the brazed parts with respect to the inductor was very important when fabricating the joints, in order to prevent local melting of the flange edges. Melting of the flange edges was also promoted by low power and, consequently, by a longer brazing time. It can be stated that the brazing time depends to a certain extent on the efficiency of each flux type and on its working temperature. It was found that the Brazetec F30/70 (FL 10) flux type was not suitable for brazing the AlSi1MgMn alloy that the flange was made of: when this flux was applied, the brazing alloy wetted only the surface of the AlMn1 tube. The Nocolok flux type caused porosity on the brazing alloy surface. The Brazetec F32/80 (FL 20) flux type can be considered the best; joint P7 was fabricated with this flux at 2.69 kW power for a duration of 24 s.

4 Conclusions
Ever-increasing demands for solar collectors, together with growing competition, require ongoing technological progress and innovative elements in order to maintain competitive advantages, product attractiveness and, last but not least, cost savings. Innovative technology for joining the parts of the collecting pipe of collectors is therefore desirable. The aim of our work was to develop a progressive technology for joining the parts of solar collectors as a replacement for obsolete flame brazing technology. It was of equal importance to find a suitable alternative to the copper that the collecting pipe of the solar collectors was made of. On the basis of a literature study, brazing with high-frequency induction heating of aluminium alloys with a eutectic AlSi12-type brazing alloy was selected. When brazing aluminium alloys, it is of great importance to select a suitable flux. A series of brazed joints was fabricated with three types of fluxes (Nocolok, Brazetec F32/80 and Brazetec F30/70) and the AlSi12 brazing alloy type. The quality of the fabricated joints was assessed by visual inspection, a leakage test, and optical microscopy. On the basis of our results, it can be concluded that the brazing technology with induction heating that has been developed is suitable for joining Al alloys. In future work, we will carry out corrosion, thermodynamic and fatigue tests on the joints.

Acknowledgement
This work was carried out with support from GA VEGA MŠ VVŠ SR and SAV, projects No. 1/2594/12 and No. 1/1000/09.

References
[1] Demianová, K., Behúlová, M., Ožvold, M., Turňa, M., Sahul, M.: Brazing of aluminium tubes using induction heating. Advanced Materials Research, Vols. 463–464, 2012, p. 1405–1409.
[2] Rapoport, E., Pleshivtseva, Y.: Optimal Control of Induction Heating Processes. Taylor & Francis Group, LLC, USA, 2007. ISBN-10 0-8493-3754-2.
[3] Rudnev, V., et al.: Handbook of Induction Heating. New York: Marcel Dekker, 2003. ISBN 0-8247-0848-2.

Mixing Suspensions in Tall Vessels with a Draught Tube
J. Brož, F. Rieger

Abstract: A tall vessel with a telescopic draught tube is proposed for mixing suspensions. The paper presents the relations for calculating the agitator power consumption and the speed necessary to keep a particle in suspension.
Keywords: mixing equipment, draught tube, suspension.

1 Introduction
Large tall vessels with a draught tube (shown in Fig. 1) are used for mixing suspensions, especially when high homogeneity is desirable. A short shaft and a small ground area are advantages of this configuration.

2 Modifications to a draught tube for particle suspension
The main disadvantage of this arrangement is that particle suspension is difficult after mixing has been interrupted. As shown in Fig. 2, in the case of small particles the speed for initiating particle suspension n_p is significantly greater than the speed necessary to keep a particle in suspension n_k. Operation at high speed requires high power consumption.
Fig. 1: Mixing vessel with a draught tube (1 vessel, 2 draught tube, 3 impeller, 4 baffles; pitched six-blade turbine CVS 69 1020 with 45° blades, h/D = 0.2)
Fig. 2: Comparison of the impeller speeds n_k and n_p [min⁻¹] for particle diameters d_p = 1.28 and 0.15 mm (fitted curves: n_k = 1269.8 c_v^0.1026 and n_p = 2346.8 c_v^0.2066 for d_p = 0.15 mm; n_k = 4532.8 c_v^0.3166 and n_p = 4627.1 c_v^0.3101 for d_p = 1.28 mm)

To overcome this difficulty, the telescopic withdrawable draught tube shown in Fig. 3 has been proposed as a utility model [1]. For suspension of the settled particles, the draught tube is withdrawn to the upper position, with a small gap between the sediment and the draught tube (Fig. 4a). The high liquid velocity in the gap causes particle suspension (Fig. 4b). The draught tube can then be lowered gradually as the particles are suspended and the bed of settled particles decreases.

Fig. 3: Vessel with a telescopic draught tube: a) operating position, b) upper position
Fig. 4: Draught tube above the sediment layer

3 Experimental
We need to know the power consumption and the critical agitator speed in order to design mixing equipment. Measurements were carried out on the model mixing equipment shown in Fig. 5, with the dimensions presented in Table 1. Water suspensions of five fractions of glass ballotini with mean diameters from 0.15 to 1.28 mm and a volumetric content of particles c_v in the range from 0.025 to 0.45 were used in the measurements. The power consumption was measured by a tensometric pick-up with a GMV amplifier made by Lorenz Messtechnik at the agitator speeds necessary to initiate (n_p) and maintain (n_k) particle suspension. The critical speed of the agitator was determined visually and measured photoelectrically.

Table 1: Dimensions of the model mixing equipment [mm]
D 300 | dt 72 | d1 120 | dm 81 | d 65 | H 600 | ht 14 | h2 444 | t 8 | L 540 | l 54

Fig. 5: Vessel with a telescopic draught tube (direction of flow indicated)

4 Experimental results
The results of the power consumption measurements were processed into the form of the dependence of the power number Po on the volumetric content c_v for various particle diameters, shown in Fig. 6. This figure shows that the power number is independent of particle content and size. The power number at the speed necessary to initiate particle suspension, Po_p, does not differ significantly from the power number at the speed necessary to keep a particle in suspension, Po_k, and its mean value is Po = 1.63 ± 0.07.

Fig. 6: Power number Po in dependence on the mean particle volumetric content c_v: a) Po_k = 1.73 ± 0.08, b) Po_p = 1.54 ± 0.11

Measurements of the critical agitator speed necessary to keep a particle in suspension, n_k, have been presented in [2]. The results were processed in the form of the dimensionless equation

$$Fr' = a \left( \frac{d_p}{D} \right)^{\gamma} \qquad (1)$$

from which the critical agitator speed can be calculated. The coefficients γ and a depend on the particle content and can be determined from the following relations:

$$\gamma = 0.47 + 2.26\,c_v \qquad (2)$$

$$a = 40.07 \exp(17.62\,c_v) \qquad (3)$$
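A design calculation based on correlation (1)–(3) can be sketched as follows (Python; not the authors' code). The agitator and vessel diameters are taken from Table 1; the density values are illustrative figures for glass ballotini in water, since the paper does not quote them:

import math

def critical_speed(c_v, d_p, D=0.300, d=0.065, rho=1000.0, drho=1500.0):
    """Speed n_k [1/s] needed to keep particles in suspension, eqs. (1)-(3)."""
    gamma = 0.47 + 2.26 * c_v            # equation (2)
    a = 40.07 * math.exp(17.62 * c_v)    # equation (3)
    fr = a * (d_p / D) ** gamma          # equation (1): modified Froude number
    return math.sqrt(fr * 9.81 * drho / (rho * d))   # invert Fr' = n^2 d rho / (g drho)

n_k = critical_speed(c_v=0.10, d_p=0.15e-3)
print(f"n_k = {n_k:.1f} 1/s = {60.0 * n_k:.0f} 1/min")

For c_v = 0.10 and d_p = 0.15 mm this gives roughly 980 min⁻¹, in line with the fitted curve for the small particles in Fig. 2.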
List of symbols
c_v  mean volumetric concentration of particles
d    agitator diameter
d_p  particle diameter
D    vessel diameter
Fr'  modified Froude number, Fr' = n_k² d ρ / (g Δρ)
n    agitator speed
P    power
Po   power number, Po = P / (ρ_s n³ d⁵)
ρ    density of the liquid
ρ_s  density of the suspension
Δρ   solid–liquid density difference

Acknowledgement
This work was supported by research project MSM6840770035 of the Ministry of Education of the Czech Republic.

References
[1] Utility model proposal PUV 2009-20921.
[2] Brož, J., Rieger, F.: Czasopismo Techniczne, Vol. 105 (M/2008), No. 6, p. 29–36.

Jiří Brož; Prof. Ing. František Rieger, DrSc., phone: +420 224 352 548, e-mail: frantisek.rieger@fs.cvut.cz
Department of Process Engineering, Czech Technical University in Prague, Faculty of Mechanical Engineering, Technická 4, 166 07 Prague 6, Czech Republic

Ultracapacitors Utilization for Automotive Applications
Z. Pfof, P. Vaculík

Abstract: This paper describes the basic properties of ultracapacitors and a converter for ultracapacitors applying zero-voltage switching (ZVS). Because of the very high efficiency of the ultracapacitor, the efficiency of the converter for ultracapacitors also has to be high; otherwise, the converter reduces the efficiency of the whole drive unit. Further, the paper describes the drive unit concept for the CityEl electric vehicle, with the use of ultracapacitors in cooperation with a fuel cell. This cooperation with ultracapacitors is useful for a supply unit such as a fuel cell, which cannot deliver peak power in dynamic conditions while maintaining its nominal efficiency; this poses no problem for ultracapacitors. There is also a description of the basic principles of soft switching using zero-voltage and zero-current switching, together with a comparison of the power losses between hard and soft switching.
Keywords: ultracapacitor, converter for ultracapacitors, soft switching, zero-voltage switching, fuel cell.

1 Introduction
The ultracapacitor is a capacitor with large capacitance (up to 5000 farads) and high efficiency (up to 98 %). This leads to the idea of using ultracapacitors as an alternative source to batteries, or for extracting higher efficiency from existing power sources, e.g. fuel cells. The ultracapacitor also has other advantages: it is capable of very fast charges and discharges, millions of cycles without degradation, extremely low internal resistance (ESR), high output power, etc. Ultracapacitors cannot yet be used as a primary power source in automotive applications. On the other hand, they can be a good choice as a secondary power source that works as an electric power storage system. They are able to deliver the peak power demanded during acceleration, and they can be used for storing regenerative braking energy.

BOOSTCAP® ultracapacitors, which are available at our Department of Electronics, are produced by Maxwell Technologies with a capacitance of 3000 F and a voltage of 2.7 V DC. Connecting seventeen of these ultracapacitors in series creates an ultracapacitor battery pack (Fig. 1) with a capacitance of 176 F and a voltage of 46 V DC.

Fig. 1: Ultracapacitor battery pack

A fuel cell or a LiFePO4 battery pack can be used as a primary power source. Fuel cells are still under development, and their parameters will improve in the future; above all, the price of fuel cells is very high. In addition, it is necessary to maintain the temperature and pressure of the media within the operating values. Fuel cells have other disadvantages as well, but when the main disadvantages are removed, fuel cells will be better than ordinary batteries for supplying electric vehicles.
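The pack figures quoted above follow from the plain series-capacitor relations; the check below (Python) also gives the stored energy, which is not stated in the text:

N, C_cell, V_cell = 17, 3000.0, 2.7      # series string of BOOSTCAP cells

C_pack = C_cell / N                      # series capacitance
V_pack = N * V_cell                      # stack voltage
E_pack = 0.5 * C_pack * V_pack ** 2      # stored energy [J]
print(f"C = {C_pack:.0f} F, V = {V_pack:.1f} V, E = {E_pack / 1e3:.0f} kJ")
# -> C = 176 F, V = 45.9 V, E = 186 kJ (about 52 Wh)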
2 Soft-switching
There are two techniques for soft switching: zero-voltage switching (ZVS) and zero-current switching (ZCS). Resonance on a semiconductor switch, provided by a resonant circuit, is used to achieve zero-voltage or zero-current switching conditions. The resonant circuit comprises an inductor Lr and a capacitor Cr; a semiconductor switch with a resonant circuit forms a so-called resonant switch.

To achieve ZVS, capacitor Cr is connected in parallel with semiconductor switch VT. A zero-voltage resonant switch is shown in Fig. 2. The resonant circuit is used to shape the switch voltage waveform during the off time in order to eliminate the turn-on loss: the device is turned on at zero voltage, with the capacitor discharged or the bypass diode conducting. During the turn-off time, the capacitor slows down the build-up of the switch voltage. ZVS is suitable for high-frequency operation. The output of the converters can be regulated by variable on-time or switching-frequency control.

Fig. 2: Schematic diagram of a zero-voltage resonant switch

The MOSFET transistor is the most suitable semiconductor device for ZVS. The MOSFET is an appropriate device for high-frequency applications, but it has two main limitations: the internal output parasitic capacitance and the internal diode. The internal diode has slow dynamic characteristics at turn-off. For a transistor turned on with ZVS, the internal diode current decreases slowly, so the internal diode does not need fast dynamic characteristics. The second limitation also does not apply to ZVS: if the transistor is turned on at zero voltage, the internal parasitic output capacitance does not discharge into the switch.

To achieve ZCS, inductor Lr is connected in series with semiconductor switch VT. A zero-current resonant switch is shown in Fig. 3. The resonant circuit is used to shape the switch current waveform during the conduction time in order to eliminate the turn-off loss, since zero current flows through the switch and the inductor. During the turn-on time, the inductor slows down the build-up of the switch current, thereby reducing the turn-on loss. ZCS is suitable for power devices with a large tail current in the turn-off process. The output of the converters can be regulated by variable off-time control. When selecting a semiconductor device for ZCS, an IGBT transistor is most suitable: its turn-on power loss is lower, because the IGBT has a considerably lower output capacitance than the MOSFET, but its turn-off power loss is greater owing to the current tail. ZCS eliminates this turn-off power loss, because the turn-off process runs at zero current. The power losses are thus mainly determined by the conduction power losses.

Power losses of hard switching vs. soft switching
Hard switching is very widely used in power inverters, whereas soft switching is used only in high-frequency applications. In the ideal case, the switching losses of soft switching are zero, because either the current or the voltage is zero. In real cases the switching losses are not zero, but they are distinctly lower than in the case of hard switching. Equation (1) shows the switching-loss ratio between soft switching and hard switching:

$$\frac{W_{(soft)}}{W_{(hard)}} = \frac{u_{sat}}{u_{cc}} \qquad (1)$$
Equation (1) implies that the switching losses for soft switching are lower than those for hard switching in the ratio of the saturation voltage u_sat to the supply voltage u_cc; moreover, the maximum voltage on the device during soft switching is the saturation voltage. For a better illustration, Fig. 4 shows the waveforms of hard switching and soft switching.

Fig. 3: Schematic diagram of a zero-current resonant switch
Fig. 4: Switching power losses of hard switching (top waveform) and soft switching (bottom waveform)

3 A converter for ultracapacitors
The first concept of a converter for ultracapacitors was designed at the Department of Electronics as a hard-switched converter consisting of a buck converter, ultracapacitors and a two-quadrant boost converter. On the input side, the buck converter charges the ultracapacitors from the power supply. On the output side, the two-quadrant boost converter increases the output voltage to 60 V DC. The measurements on the ultracapacitor unit show that the use of ultracapacitors decreases the battery loading caused by the DC motor's starting power peaks; such power peaks are required for accelerating the DC motor with maximum torque. The converter for ultracapacitors is shown in Fig. 5.

Fig. 5: First concept of a converter for ultracapacitors

As was mentioned above, ultracapacitors are highly efficient (up to 98 %). Therefore, the converter for ultracapacitors also has to have high efficiency; otherwise, the converter will reduce the efficiency of the whole drive unit. This can be achieved by soft switching. Zero-voltage switching is a suitable soft-switching technique for a converter for ultracapacitors. ZVS also provides an effective solution for suppressing the converter's EMI while preserving a high switching frequency.

The converter configuration is shown in Fig. 6. The buck/boost converter is on the input side. This input converter can work in two modes: buck or boost mode. When the ultracapacitors are being charged from the power supply created by the fuel cell, the converter works as a buck converter. For the buck converter, we use only the top transistor VT1, and via its duty cycle we can control the charge rate of the ultracapacitors and the value of the current that flows directly to the load. In buck mode, the maximum voltage of the ultracapacitors is limited to the voltage of the power supply; the input converter can therefore also work as a boost converter. Both transistors VT1 and VT2 are used in boost mode, and via their duty cycles the ultracapacitor voltage can be raised up to the maximum voltage of the ultracapacitor battery pack.

Fig. 6: Block diagram of a converter for ultracapacitors with ZVS

The two-quadrant boost converter is on the output side. This converter increases the output voltage to the desired value, which is the nominal voltage of the electric motor. The output converter can work in two quadrants: it can supply power to the load, or it can regenerate power from the load back to the ultracapacitors. This converter utilizes both transistors to preserve the resonance; only the transistor duty cycle designates the direction of the power flow.
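Two quick figures for this converter follow from textbook relations (Python; the 2 V saturation voltage is an illustrative assumption, and the ideal boost relation ignores losses):

U_sat, U_cc = 2.0, 60.0
print(f"W_soft/W_hard = {U_sat / U_cc:.3f}")   # equation (1)

V_in, V_out = 46.0, 60.0                       # pack voltage -> 60 V DC bus
duty = 1.0 - V_in / V_out                      # ideal boost: V_out = V_in / (1 - D)
print(f"ideal boost duty cycle D = {duty:.2f}")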
Simulation results
In this section we present the simulation results for each converter. The simulation was performed in OrCAD 16.0. All the simulation results show the current and voltage on the resonant switch and the total power loss on the semiconductor switch. Fig. 7 shows the simulation result for the input buck/boost converter in boost mode.

Fig. 7: Simulation results for the input buck/boost converter with ZVS; boost mode

In buck mode, the bottom transistor VT2 is closed and the charge rate of the ultracapacitors is controlled via the duty cycle of the top transistor VT1. This mode is not as interesting as the boost mode, and its simulation results are therefore not shown. In boost mode, the top transistor VT1 must be open for the whole period, while the bottom transistor VT2 is switched so as to achieve the boost effect and, at the same time, resonance on the resonant switch. The inductor current IL1 rises when transistor VT2 is open. The current that flows through transistor VT2 produces a power loss on this transistor. Fig. 7 shows that the turn-on power loss is almost zero, and that the conduction power loss rises gradually.

Figs. 8 and 9 show the simulation results for the output two-quadrant boost converter: Fig. 8 for the supply mode, and Fig. 9 for the regeneration mode, in which the power flows from the load back to the ultracapacitors. In both modes, transistors VT3 and VT4 are switched so as to preserve the resonance. To deliver power to the load, the duty cycle of transistor VT4 must be greater than the duty cycle of transistor VT3, whereas for regenerating power back to the ultracapacitors it is vice versa. Again, Figs. 8 and 9 show that the turn-on power loss is almost zero and the conduction power loss rises gradually.

Fig. 8: Simulation results for the output two-quadrant boost converter with ZVS; supply mode
Fig. 9: Simulation results for the output two-quadrant boost converter with ZVS; regeneration mode

4 Concept of the drive unit
Our Department of Electronics is developing a drive unit for the CityEl electric vehicle, which is a small one-seat city vehicle. At present, the CityEl has a drive unit with a DC motor supplied from a traction battery over a hard-switched converter for the ultracapacitor. Fig. 10 shows a new concept that will be supplied from a fuel cell (primary power source) and ultracapacitors (secondary power source), which will deliver peak power to a synchronous motor with a permanent magnet, or will store the energy from regenerative braking. The drive unit will comprise a three-phase inverter for controlling the synchronous motor by DTC. Our concept of a drive unit for the CityEl electric vehicle will have a Nexa fuel cell with an output power of 1.1 kW and an output voltage of 25 V DC. Two fuel cells can deliver an output power of 2.2 kW, which will be sufficient output power for the weight (above 280 kg) of the CityEl vehicle.

The converter for ultracapacitors has another purpose in the drive unit: to increase the voltage from the fuel cell for the synchronous motor. The motor requires a voltage from 80 V to 125 V AC.

Acknowledgments
The research described in this paper was supervised by Prof. Ing. Petr Chlebiš, CSc., Department of Electronics, VŠB-Technical University of Ostrava, and supported by GAČR project 102/08/0775: New structures and control algorithms of mobile hybrid systems.
Zdeněk Pfof, e-mail: zdenek.pfof@vsb.cz; Petr Vaculík, e-mail: petr.vaculik@vsb.cz
Department of Electronics, VŠB-Technical University of Ostrava, FEECS, 17. listopadu 15, 708 33 Ostrava-Poruba, Czech Republic

Fig. 10: Concept of the drive unit for the CityEl electric vehicle

Practical Implementation of Animation for Students of Pedagogical Studies at MIAS CTU in Prague
D. Vaněček, J. Jirsa

Abstract: This paper shows computer animation as a teaching and learning instrument in technical education. Our aim is to show good practice in creating computer animations. The paper includes an example which can serve as a practical guide for teachers of technical subjects.
Keywords: computer animation, education, e-learning.

Introduction
The main goal of this paper is to present the didactic aspects of producing computer animations for use in education, to categorize the types of animations and to provide examples. A secondary goal is to show an approach to creating an educational animation by means of a simple example. The authors prefer a systematic approach to the creation of interactive components, which they consider to be an effective way of achieving the desired results. The first part of the paper introduces the concept and explains each step; specific examples are shown in the second part.

Creating educational animation: design approach
In the same way that we need to prepare before teaching a class, we need to have a method and to think things out before we can create educational animations. If we are in too great a hurry, thinking only about creating an animation of some kind, the outcome is not likely to be truly effective. Trial-and-error usually leads to unpleasantly inconsistent results. We should therefore study proven methods, and approach the general process of creating an animation as a project divided into several stages. The project can be broken down into the following basic phases:
• analysis
• concept
• realization
• piloting
• completion
• testing
Note that these phases are very similar to the principles and approaches used in any systematic human activity.

Analytical phase
In the preparation stage, we usually consider the concept and display of the problem that is to be presented. We decide which parts of the topic would be suitable for an animated display. In other words, we consider how the subject should be presented in order for students to understand it easily and be able to remember the important facts. The course description is usually a useful starting point. We ask ourselves fundamental questions: whether the animation should be just demonstrative, or whether it should be interactive with direct student participation. A demonstrative animation enables the learner just to observe the issue under discussion, and on this basis to form an idea and acquire knowledge. We also consider the choice of an appropriate technology: whether it is available, or whether it is first necessary to acquire it and learn how to use it. At this stage it is worth investigating whether something similar does not already exist. In this way, we can avoid making unnecessary efforts.
Nowadays, a great number of ready-made animations can be found on the internet in various forms; time spent on a search is definitely not wasted.

Concept

In the concept phase, our ideas should gain a more precise form. We should set the basic forms of the parts of the animation and consider specific animated features, their properties and their placement in accordance with didactic principles. On the basis of the outputs of our analysis, we should choose an appropriate software product for creating the animations [1]. The concept phase includes setting a detailed timetable for creating the animation. It is also necessary to decide on the process, and on who will create a specific animated situation, when and how. This proven approach is especially suitable when dealing with a broad subject, which usually requires time-consuming processing. We should also consider when and how to incorporate the animation into a class or lecture (e.g., use diagnostic animations for recapitulation). Of course, if we want to present an animation during a lecture, it is necessary to ensure that the classroom has appropriate equipment, which may not always be easy.

Realization

The realization stage involves actually creating the animation. With the selected software tool we create specific objects, models and scenes according to our requirements and needs. Even at this stage it is necessary to take into account the properties and capabilities of the software product. Depending on the possibilities, we can choose either to create the animation gradually from beginning to end, or to create individual parts separately and then assemble them into the final product.

Piloting

Before using the animation in the classroom, it is advisable to do some so-called piloting, i.e. initial testing of the product. Piloting is a notion drawn from the preparation of examinations — see [2]. Animations are submitted for criticism to a colleague (or several colleagues) who did not participate in creating the animation. Another option is to test the animation with the help of senior students who have already encountered the problem being presented and are thus able to share their impressions and opinions. After piloting, deficiencies that had not been picked up, or that had been overlooked, can be removed.

Completion

The completion stage involves publishing the animation, which can be done in various ways. In most cases, the product is published on the internet via webpages integrated into an electronic learning management system (LMS) run by the school. This method is convenient if we want the product to be accessible only to a certain number of students with an access password, and not to the general public. If no e-learning system is available, we can create a simple webpage and publish the animation there. Many tools are available that can export an animation into a form that is relatively easily incorporated into webpages.

Testing

The last step should be to test our work. Initial testing is usually conducted with colleagues, who can point out major deficiencies from the outset. After making corrections, we proceed to piloting, where we test the project on a group of students. After piloting the pages with animations, we proceed to the evaluation stage, which takes the form of a discussion or a questionnaire. On the basis of the results, we again make appropriate adjustments and changes. Now it might seem that everything is ready; however, this would be a wrong assumption.
We must realize that as the world around us changes, the students also change. Thus we are faced with the task of re-checking the suitability of the animations we have created, in both professional and pedagogical terms — see [3].

The process of creating an animation

In this second part of the paper, we present a specific example of creating an educational animation. For illustrative purposes, we present a simple animation of the operating principle of a paternoster elevator. This animation might be used in mechanical engineering classes for high-school students. We follow the steps listed in part one, with minor simplifications.

Analysis

Let us consider the following model situation: we are teaching a course on engineering and machinery at a technical high school with an engineering or electro-technical specialization. The audience that the animation is designed for consists of second and third grade students. The students are proficient with computer technology and the internet, and have access to a school computer network. They can also log into the school intranet via a web interface. If some students do not have an internet connection at home, the animation can be copied to a CD-ROM and given to them.

Before the era of e-learning, we used only textbooks with pictures and photos as didactic tools, together with the regular blackboard and chalk, and for a few topics we had a videotape of a program from a public media broadcast. We also used to enrich the class with samples of machine parts, e.g. bearings, cog-wheels, etc. In order to enhance students' education and increase their interest in the topic, we decided to use an animation of a complicated piece of machinery. A check showed that no educational CD-ROM meeting our expectations is available on the market, and an internet search did not reveal any sources where the topic is presented using animations. We therefore decided to fill the void and create an animation of our own.

Fig. 1: Sketch for the intended animation

We can use ordinary computer technology owned by the school to create the animation. We will assume that the school has bought the Adobe Flash software — see [4]. We chose this product on the basis of our previous positive experience.

Concept

Let us consider a simple animation that reflects the essential features of the functioning of the paternoster lift, without going into much detail. The constituents of the animation will be relatively simple symbols, so that students can understand easily and will later be able to explain the principle and draw a scheme of the lift — see Figure 1. If we find out later (e.g. during piloting) that the animation could be more detailed, it will not be a problem to update it and go into greater detail. For a simple representation of this kind, it would not be advisable to attempt a three-dimensional animation, because the work would be much more time-consuming. We would like to create the animation in a relatively short time and with minimum effort. This is why we chose Adobe Flash: its controls are similar to the controls of other graphic tools widely used on computers. In addition, it is not difficult to understand the elementary principles of creating an animation, and a fairly computer-literate person should be able to master them. Moreover, Flash contains several tutorials through which even a beginner can explore the possibilities and options.
In an animation we distinguish between static components, which do not change during the animation, and dynamic components, which show certain changes, e.g.:

• motion (motion tween)
• shift
• rotation
• curved motion
• shape change (shape tween)
• color change
• a combination of the previous options

Adobe Flash supports this distinction between static and dynamic symbols, and offers the user three basic types of symbols [4]:

1. for static symbols, the graphic type, which can contain both graphics drawn by the user and inserted raster images and photos,
2. for dynamic symbols, the movie clip type,
3. for creating buttons, the button type.

In our case, we will use only the first two types, because we want to create a demonstrative animation without any interaction with the user. Our symbols from Figure 1 can thus be divided according to the following table:

Table 1: Choice of type for individual symbols

Symbol          Type of symbol in Adobe Flash
rotor           graphic
animated rotor  movie clip
cabin           graphic
cables          without type (line) / graphic

For clarification, the symbol of the animated rotor contains the symbol for the rotor itself; the movie clip type is therefore usually composed of individual static symbols of the graphic type. We can also see that the cables do not have to be represented by a symbol, but can be drawn completely freely with a line.

The next step in our consideration of the design concept for the paternoster animation is to allocate the individual symbols to sub-layers. This is a logical separation of symbols with different meanings; however, there are also graphic reasons for this separation. We usually use it when we need to enforce the correct stacking order of the symbols on the screen, or for animations in which the symbols are supposed to overlie one another in a specific way. Adobe Flash fully supports layers, and even offers several different types of layers. Four layers of the following types are sufficient for our animation:

• two layers of the normal type, for the animated rotor symbols and the cables,
• a layer of the guide type, for the invisible line on which the cabin will move,
• a layer of the guided type, for the cabin, which will move along the line from the previous layer.

The names and the order of the individual layers are described in the following section.

Implementation

The implementation is the process of creation, in which we gradually create the sub-components in Adobe Flash and then compose them into the final animation. The repetitive motif of the rotor can be used to our advantage, since we need to draw and animate it only once and can then use it four times. For better orientation, we can follow the sketch presented above. In the previous section, we indicated the types of symbols appropriate for our components and allocated the symbols to layers so that they make sense and correspond with our ideas. The creation of the paternoster elevator is thus divided into the following implementation steps (a sketch of the same decomposition in another tool follows the list):

1. customize the IDE (integrated development environment)
2. draw the symbol for the rotor
3. create the animated rotor — see Figure 2
4. draw the symbol for the cabin — see Figure 3
5. arrange the symbols into layers and distribute them on the canvas
6. draw the guiding cables — see Figure 4
7. create the line for the movement of the cabin
8. create the animation of the cabin on the line
9. export the animation into a format displayable on a webpage — see Figure 5
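The paper implements these steps in Adobe Flash. For readers without Flash, the same static/dynamic decomposition (fixed cables, rotating rotors, a cabin following a guide path) can be sketched in Python with matplotlib's animation module. This is a hypothetical analogue, not the authors' implementation; all names and geometry in it are made up.

```python
# A minimal Python/matplotlib analogue of the paternoster animation:
# static cables, four rotating rotors, one cabin following a guide loop.
# Hypothetical illustration only -- the paper itself uses Adobe Flash.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation

rotors = [(-1, 2), (1, 2), (-1, -2), (1, -2)]    # rotor centres
R = 0.4                                          # rotor radius

fig, ax = plt.subplots()
ax.set_xlim(-3, 3)
ax.set_ylim(-4, 4)
ax.set_aspect("equal")

# static layer: the guiding cables (two vertical lines)
for x in (-1, 1):
    ax.plot([x, x], [-2, 2], "k-", lw=1)

# dynamic layers: one spoke per rotor, plus the cabin
spokes = [ax.plot([], [], "b-", lw=2)[0] for _ in rotors]
cabin, = ax.plot([], [], "rs", ms=12)

def guide_path(t):
    """Cabin position: up the left cable, then down the right one
    (the crossover at top and bottom is instantaneous for simplicity)."""
    s = (t % 1.0) * 2                   # 0..2, one full loop per unit time
    if s < 1.0:
        return -1.0, -2.0 + 4.0 * s     # going up the left cable
    return 1.0, 2.0 - 4.0 * (s - 1.0)   # going down the right cable

def update(frame):
    phi = 2 * np.pi * frame / 50        # rotor angle
    for spoke, (cx, cy) in zip(spokes, rotors):
        spoke.set_data([cx, cx + R * np.cos(phi)], [cy, cy + R * np.sin(phi)])
    x, y = guide_path(frame / 100)
    cabin.set_data([x], [y])
    return spokes + [cabin]

anim = FuncAnimation(fig, update, frames=200, interval=40, blit=True)
plt.show()
```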
Fig. 2: Animation of the rotor
Fig. 3: Motif of the cabin

All these steps are carried out in the Adobe Flash IDE; in this way, no other software tools are needed.

Piloting

At this stage, the animation is complete. Next, we should carry out initial testing and present the results of our work to someone who is unbiased but knows the topic and can offer objective criticism. We therefore ask a colleague to evaluate a demonstration of the animation. It is also good to approach some senior students, show them the animation, and ask for their informal but important opinions. In both cases, we should consider their suggestions and make any necessary changes. The opinions will probably not be the same; however, both groups of evaluators are important, and neither should be omitted.

Fig. 4: Drawing of the guiding cables
Fig. 5: Example of an export into an appropriate format

Completion

The work on the animation project is completed as explained above. We modify our Flash presentation for publication on the internet by saving it in an appropriate format. With the permission of the school network administrator, we place it on the school intranet, where it is accessible to the students. If a student is unable to connect to the network from home (which will become highly improbable in future years), we can also save the animation on a CD-ROM, which can additionally serve as a backup.

Testing

The last stage is to test our animation and find out how useful the students consider it. In this step, the targeted group of students discusses the animation or responds to a questionnaire, letting us know whether it was useful and whether it helped them to understand the topic better. After the evaluation we can make further adjustments. As mentioned above, we must not assume that our task is now over: after some time we should revise all the animations we have created and consider their validity and purpose. In this way we can ensure that we keep up with the latest didactic and professional standards.

Conclusion

The goal of this paper was to acquaint readers from the professional public with the topic of creating educational animations. This problem has not been dealt with theoretically before (the authors were unsuccessful in their efforts to find any relevant references). We have noted that many animations are being created without taking didactic considerations sufficiently into account; in our opinion, this is not a satisfactory approach. We have therefore attempted to bring some theoretical order into the issue of making animations for e-learning. This could form a basis for further research on the theoretical foundations of creating educational animations. The paper ended with a practical implementation of our theoretical principles. The topic is extensive, and it is obvious that the set of three papers presented here cannot deal exhaustively with the whole topic. With the availability of user-friendly software, educational animations are now widely used, and will be used even more in the future. They deserve much greater attention than they have received until now.

References

[1] Jirsa, J.: Tvorba počítačové animace pro potřeby výuky technických předmětů. BP MÚVS ČVUT, Praha, 2009, ved. práce D. Vaněček.
[2] Svoboda, E., Bečková, V., Švercl, J.: Kapitoly z didaktiky odborných předmětů. 1. vyd. Praha: Vydavatelství ČVUT, 2004. 156 s. ISBN 80-01-02928-X.
[3] Drahovzal, J., Kilián, O., Kouhoutek, R.: Didaktika odborných předmětů. Brno: Paido, 1997. 156 s. ISBN 80-85931-35-4.
[4] Adobe Flash Professional [online]. 2009 [cit. 2009-04-20]. Dostupný z WWW: http://www.adobe.com/products/flash/

Ing. paed. IGIP David Vaněček, Ph.D., e-mail: david.vanecek@muvs.cvut.cz
Czech Technical University in Prague, Masaryk Institute of Advanced Studies
Department of Engineering Pedagogy, Horská 3, 128 00 Praha 2, Czech Republic

Ing. Bc. Jan Jirsa, e-mail: jirsaj@fel.cvut.cz
Czech Technical University in Prague, Faculty of Electrical Engineering
Technická 2, 166 27 Praha 6, Czech Republic

Acta Polytechnica Vol. 51 No. 2/2011

Advanced processing of images obtained from wide-field astronomical optical systems

M. Řeřábek, P. Páta

Abstract

The principal aim of this paper is to present a general view of the special optical systems used for acquiring astronomical image data, commonly referred to as WFC or UWFC (ultra wide-field camera), and of their transfer characteristics. UWFC image data analysis is very difficult in general, not least because these systems have so-called space variant (SV) properties. Images obtained from UWFC systems are usually distorted by a wide range of optical aberrations and distortions. The influence of the optical aberrations increases towards the margins of the field of view. These aberrations distort the point spread function of the optical system and rapidly reduce the accuracy of the measurements. This paper deals with simulation and modelling of the UWFC optical systems used in astronomy and their transfer characteristics.

Keywords: PSF, UWFC, Zernike polynomial, aberration, space variant, deconvolution algorithm.

1 Introduction

The properties of UWFC astronomical systems, together with the specific visual content of astronomical images, make the evaluation of the acquired image data complicated. These systems suffer from many different kinds of optical aberrations, which have a negative impact on the image quality and on the transfer characteristics of the imaging system. Therefore, for precise astronomical measurements (astrometry, photometry) over the entire field of view (FOV), it is very important to understand how the optical aberrations affect the transfer characteristics of the optical system. Another question that arises is how the astronomical measurements depend on the optical aberrations. A definition of the accurate point spread function (PSF) at any point in the FOV of the optical system could help us to restore the original image. Optical aberration models for linear space invariant and variant (LSI/LSV) systems are therefore outlined in this paper. These models, based on Zernike polynomials, serve as suitable tools for estimating and fitting the wavefront aberration of a real optical system, and give us an idea of the intrinsic PSF. When the PSF model of a real acquisition system is known, we can use it to restore the original image or to evaluate the astronomical measurements precisely. Two experiments are presented in this paper. The first describes PSF model retrieval when the original image data is available. The second focuses on using the PSF model in known deconvolution algorithms with which we can restore the original image.
2 Astronomical image data processing and acquisition

Most astronomical image data is provided by automatic robotic astronomical systems. The main idea of these automatic astronomical optical systems is long-term image data collection. The data is acquired with specific characteristics, notably focal length, spectral bands and sensor parameters. According to the focal length used, we can distinguish image data acquired in the primary focus of a big telescope (deep sky), with a wide-field lens, or with a "fish-eye" lens for all-sky imaging. Astronomical images contain specific visual content: an astronomical image usually consists of the dark background of the starry sky together with bright points and clumps, which represent stars and galaxies. Image data can be divided into four groups [7]: flat field, dark frame, light image and deep-sky light image. Before processing and evaluating astronomical images, it is necessary to correct them; dark frame compensation and flat field compensation are two of the most frequently used pre-processing methods. Real data from double-station video observation of meteors [10, 11] is used for our simulations and modelling.

UWFC image data analysis is very difficult in general. There are many different kinds of optical aberrations and distortions in these systems. Moreover, the objects in ultra wide-field images are very small (a few pixels per object dimension). The influence of the optical aberrations increases towards the margins of the FOV (high angular distance from the optical axis of the system). These aberrations distort the PSF of the optical system and rapidly reduce the accuracy of the measurements. The optical aberrations depend on spatial position, which affects the transfer characteristics of the optical systems and makes them spatially variant.

Fig. 1: Block diagram of image processing in a video observation system

3 Stellar object profile

There are two common functions for fitting the profiles of stars — Gaussian and Moffat [7, 11]. The aim is to match a star's profile as well as possible with the Gaussian or Moffat profile, and then to store the parameters of the fit. An ideal star would be imaged as a small "dot" — the PSF (point spread function). Unfortunately, because of many different distortions (see below), the dot is blurred, and the star's profile is close to the Gaussian function; the more it is blurred, the worse the PSF of the whole imaging system. If the system is linear and space-invariant, the PSF of all stars will be the same. The centre of a star usually has a profile closer to the Gaussian function, while the more distant parts of a star are closer to the Moffat profile. Hence the ideal fitting function combines the Gaussian and Moffat profiles. It is obvious that the centre of the star is fitted very well, whereas the more distant parts are fitted only poorly.
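As an illustration of the profile fitting described above, the sketch below fits both a Gaussian and a Moffat model to a synthetic radial star profile with scipy. It is a minimal sketch of the general technique, not the authors' pipeline; the synthetic parameters are arbitrary.

```python
# Minimal sketch: fitting Gaussian and Moffat models to a radial star
# profile. Illustrative only; the synthetic data below is arbitrary.
import numpy as np
from scipy.optimize import curve_fit

def gaussian(r, i0, sigma):
    return i0 * np.exp(-r**2 / (2.0 * sigma**2))

def moffat(r, i0, alpha, beta):
    return i0 * (1.0 + (r / alpha) ** 2) ** (-beta)

# synthetic "observed" profile: Moffat-like wings plus noise
rng = np.random.default_rng(0)
r = np.linspace(0.0, 7.0, 80)                   # radius [pixels]
obs = moffat(r, 100.0, 2.0, 2.5) + rng.normal(0.0, 1.0, r.size)

pg, _ = curve_fit(gaussian, r, obs, p0=[100.0, 2.0])
pm, _ = curve_fit(moffat, r, obs, p0=[100.0, 2.0, 2.5])

for name, model, p in [("Gaussian", gaussian, pg), ("Moffat", moffat, pm)]:
    rms = np.sqrt(np.mean((obs - model(r, *p)) ** 2))
    print(f"{name:8s} params={np.round(p, 2)}  rms={rms:.2f}")
```

The RMS of the Gaussian fit is dominated by the wings, which is exactly the behaviour the text describes: the core fits well, the distant parts poorly.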
Figure 1 presents a simplified block diagram of the various factors affecting the final image and stellar profile. For our purposes we assume the first two blocks, treated as a black box; in our experiments we do not take into account a model of atmospheric turbulence. The other blocks introduce disruptive effects which bring unwanted artefacts into the PSF. One of the most troublesome effects is caused by recording and compression: in order to improve the subjective image quality, video tape recording and even JPEG compression use a sharpening mechanism, which results in rapid transients at the edge of the PSF. These parts of the electro-optical transfer system are not considered in our simulations and modelling.

4 Optical aberration modelling

Optical aberration can be described using a so-called wave aberration function. The wave aberration function [1, 5] is defined as the distance (in optical path length) from the reference sphere to the wavefront in the exit pupil, measured along the ray. Zernike polynomials are used for describing high-order wavefront aberrations with high precision. Zernike polynomials are normally expressed in polar coordinates (ρ, θ) in the exit pupil, where 0 ≤ ρ ≤ 1 and 0 ≤ θ ≤ 2π. A Zernike polynomial consists of two factors, the normalization factor N_n^m and the radial polynomial R_n^{|m|}(ρ). Zernike polynomials are defined as [1, 4]

Z_n^m(\rho,\theta) =
\begin{cases}
N_n^m\, R_n^{|m|}(\rho)\cos(m\theta), & m \ge 0,\\[4pt]
-N_n^m\, R_n^{|m|}(\rho)\sin(m\theta), & m < 0,
\end{cases}
\qquad 0 \le \rho \le 1,\ 0 \le \theta \le 2\pi.   (1)

For a given n, the number m can take the values −n, −n+2, −n+4, …, n. The normalization factor N_n^m and the radial polynomial R_n^{|m|}(ρ) can be expressed as

N_n^m = \sqrt{\frac{2(n+1)}{1+\delta_{m0}}}, \qquad
R_n^{|m|}(\rho) = \sum_{s=0}^{(n-|m|)/2}
\frac{(-1)^s\,(n-s)!}{s!\,[0.5(n+|m|)-s]!\,[0.5(n-|m|)-s]!}\;\rho^{\,n-2s},   (2)

where δ_{m0} = 1 for m = 0 and δ_{m0} = 0 for m ≠ 0. The wavefront aberration function may be expressed in Zernike polynomials [1, 5, 7] as

W(\rho,\theta) = \sum_{n}^{k}\sum_{m=-n}^{n} W_n^m\, Z_n^m(\rho,\theta)
= \sum_{n}^{k}\left\{\sum_{m=-n}^{-1} W_n^m \left(-N_n^m R_n^{|m|}(\rho)\sin(m\theta)\right)
+ \sum_{m=0}^{n} W_n^m \left(N_n^m R_n^{|m|}(\rho)\cos(m\theta)\right)\right\},   (3)

where k is the polynomial order of the expansion and W_n^m is the coefficient of the Z_n^m mode in the expansion, i.e. it equals the RMS wavefront error for that mode.

Fig. 2: Visualization of coma and astigmatism. a) The testing image. b) SV image distorted by coma. c) SV image distorted by astigmatism

5 Effect of wavefront aberration on the PSF

Optical systems with all aberrations compensated are called diffraction limited. The influence of aberrations on image quality can be expressed as the wavefront error at the exit pupil, and their effect on the transfer characteristics as the change in the PSF size and shape. Changes in the size and shape of the PSF blur the image. When an imaging system is diffraction limited, the PSF consists of the Fraunhofer diffraction pattern of the exit pupil [2]. The relation between the object and the image of a space invariant optical system can be expressed by a convolution in the spatial domain (the object irradiance distribution is convolved with the impulse response) [2]. The PSF of the LSI optical imaging system can be expressed as

\mathrm{PSF}(u,v,\delta,\varphi) =
\left|\,\mathcal{FT}\!\left\{P(x,y)\,
\exp\!\left(-\mathrm{i}\,\frac{2\pi}{\lambda}\,W(\rho,\theta,\delta,\varphi)\right)\right\}\right|^{2},   (4)

where P(x, y) defines the shape, size and transmission of the exit pupil, the exponential term accounts for the phase deviation of the wavefront from a reference sphere, and (δ, φ) are the polar coordinates at the object plane — these coordinates equal (0, 0) all the time if we consider an LSI optical system. If we assume a spatially variant system with anisotropic PSF, then the aberration wavefront is different for each source point; see Figure 2. A real image is always obtained from a spatially variant optical system.
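A minimal numerical sketch of Eqs. (1), (2) and (4): build a low-order Zernike wavefront over a circular pupil and obtain the PSF as the squared modulus of the Fourier transform of the pupil function. It is an illustrative implementation of the stated formulas, not the authors' code; the chosen coefficients and grid size are arbitrary.

```python
# Minimal sketch of Eqs. (1)-(2) and (4): Zernike wavefront -> PSF via FFT.
import numpy as np
from math import factorial

def zernike(n, m, rho, theta):
    """Zernike polynomial Z_n^m on the unit disk (Eqs. 1-2)."""
    norm = np.sqrt(2.0 * (n + 1) / (1.0 + (m == 0)))
    radial = np.zeros_like(rho)
    for s in range((n - abs(m)) // 2 + 1):
        c = ((-1) ** s * factorial(n - s)
             / (factorial(s)
                * factorial((n + abs(m)) // 2 - s)
                * factorial((n - abs(m)) // 2 - s)))
        radial += c * rho ** (n - 2 * s)
    ang = np.cos(m * theta) if m >= 0 else -np.sin(m * theta)
    return norm * radial * ang

N = 256
x = np.linspace(-1.0, 1.0, N)
X, Y = np.meshgrid(x, x)
rho, theta = np.hypot(X, Y), np.arctan2(Y, X)
pupil = (rho <= 1.0).astype(float)            # P(x, y): clear circular pupil

# wavefront W in waves: some coma (n=3, m=1) and astigmatism (n=2, m=2)
W = (0.1 * zernike(3, 1, rho, theta) + 0.05 * zernike(2, 2, rho, theta)) * pupil

# Eq. (4): PSF = |FT{ P exp(-i 2*pi/lambda W) }|^2, with W given in waves
field = pupil * np.exp(-2j * np.pi * W)
psf = np.abs(np.fft.fftshift(np.fft.fft2(field))) ** 2
psf /= psf.sum()
print("peak (Strehl-like) value:", psf.max())
```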
An example of an image acquired from a real SV optical system is shown in Figure 3a, where we can also see the influence of aberrations in the middle of the image and at the edge. The PSF of this optical system differs between the middle of the image and the edge, due to optical aberrations. The PSFs at the two positions are compared in Figures 3b and 3c.

Fig. 3: a) Input image data from the system at the Ondřejov base. b) The object profile near the optical axis. c) The object profile at the edge of the FOV

Fig. 4: Ellipticity (deformation of an ideal PSF) of the profiles of stars as a function of the angular distance from the middle of an all-sky image

Fig. 5: a) Dependence of the astrometric error (in pixels) on the wavefront aberration error (in fractions of λ). b) RMS error (in levels) versus wavefront aberration error (in fractions of λ)

6 Effect of wavefront aberration on astrometric measurements

Unlike aperture photometry, precise astrometry demands high-quality images. Even slight distortion may cause inaccurate determination of the position or movement of a stellar object. The error depends on the magnitude of the measured star — the greater the magnitude, the greater the error can be. Even a quite small compression rate may cause a wrong detection, especially for faint stars. If the system is space-variant, the PSFs of the stars in an image differ. The profiles of the stars are not circular but rather elliptical, especially in the case of inferior lenses and at greater distances from the middle of the image. The ellipticity is quantified by σx and σy (assuming a Gaussian profile), where σx and σy are the distances from the centre of the Gaussian at which it falls to exp(−0.5) of its central value; σx and σy are the major and minor half-axes of the ellipse, perpendicular to each other. The ellipticity in the radial direction can grow significantly with the angular distance from the middle of the image, especially for wide-angle images; see Figure 4.

All of the measurements in this section were done in IRAF (Image Reduction and Analysis Facility) software [13]. This software system enables the user to measure, reduce and analyse astronomical data and to write her/his own scripts. The program is open source; further information can be found at http://iraf.noao.edu. SkyCat can be used for visualization of images and for access to catalogues, for example http://archive.eso.org/skycat.

The effect of coma wavefront aberration on the precision of astrometric measurements (e.g. the position of objects) is demonstrated in Figure 5a. We can see that the coma aberration has no influence on the precision of astrometric measurements for wavefront aberration errors of less than λ/50. The measurement error grows with the wavefront aberration error, and for wavefront aberration errors greater than λ/10 it is no longer acceptable. The second graph shows the effect of coma wavefront aberration on the RMSE of the distorted image in signal levels; see Figure 5b. This effect is insignificant for wavefront aberration errors of less than λ/100. The RMSE increases further with the wavefront aberration error; for wavefront aberration errors greater than λ/10, the increase in RMSE exceeds 30 %, which is also no longer acceptable. Note that only the coma aberration, in an optical shift-invariant system, is assumed here.
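The ellipticity quantified by σx and σy above can be estimated from second-order image moments of a star cutout. The following is a minimal moment-based sketch on a synthetic elliptical Gaussian, meant only to illustrate the quantity discussed; it is not the procedure used in the paper.

```python
# Minimal sketch: estimate sigma_x, sigma_y of a star image from its
# second-order intensity moments (synthetic elliptical Gaussian).
import numpy as np

n = 31
y, x = np.mgrid[0:n, 0:n]
sx_true, sy_true, x0, y0 = 2.5, 1.5, 15.0, 15.0
img = np.exp(-((x - x0) ** 2 / (2 * sx_true**2)
               + (y - y0) ** 2 / (2 * sy_true**2)))

w = img / img.sum()                       # intensity weights
xc, yc = (w * x).sum(), (w * y).sum()     # centroid
var_x = (w * (x - xc) ** 2).sum()         # second central moments
var_y = (w * (y - yc) ** 2).sum()
sx, sy = np.sqrt(var_x), np.sqrt(var_y)

ellipticity = 1.0 - min(sx, sy) / max(sx, sy)
print(f"sigma_x={sx:.2f}, sigma_y={sy:.2f}, ellipticity={ellipticity:.2f}")
```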
Fig. 6: a) Dependence of the brightness of stars on the distance from the centre of the FOV. b) Dependence of the magnitude error of stars on the distance from the centre of the FOV

7 Experiments and results

7.1 Estimation of aberration

If we want to use a deconvolution algorithm for precise restoration in an SV UWFC system, we need to know what the PSF looks like. One approach is to find the PSF of the optical system from empirical image data. Here we describe the procedure for obtaining the PSF model from real image data. Our experiments involve analysing real image data and comparing it with the modelled image data. Real data from the double-station video observation project is used for our experiments. The stars vary not only in position, but also in brightness and shape. The space variant properties are represented in Figures 6a and 6b, which show how the magnitude and the magnitude error vary with distance from the centre of the FOV. In principle, the brightness of the stars decreases with that distance.

Within the double-station video observation project we obtain a sequence in uncompressed AVI format from the observation system. A star which changes its position from the edge to the middle of the image, or vice versa, is suitable for our purposes. First, several frames from each sequence form the uncorrected testing image, which is determined as the median of the frames used. In the next step, we pre-process the image using flat field and dark frame correction; we can also apply noise and background removal. The next step is to isolate the star at discrete times as it changes its position from the middle of the FOV to the edge. We therefore cut out the stars as sub-images of 15 by 15 pixels. Then the real stellar profile is compared with the model profile [12]. A generalized block diagram of the algorithm that estimates the optical aberrations of the real system is shown in Figure 7. The input parameters are the wavefront aberration (i.e. the Zernike coefficients) and the PSF computed according to Equation (4). The real object profile is then compared with the suggested model — both have the same FWHM [7]. The suitability of the resulting model PSF is evaluated on the basis of the RMS error between the modelled PSF and the real object profile. The result for the chosen star is presented in Figure 8. As can be seen, this model assumes only two varying aberrations (coma and astigmatism) with a constant defocus value.

Fig. 7: Generalized block diagram of the estimation algorithm

Fig. 8: Results of the estimation. a) Model of the star profile. b) Original of the chosen star. c) Differential image for the estimated and real star profiles. d) Dependence of the RMS error on the coma and astigmatism wavefront aberration errors
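A minimal sketch of the estimation loop behind Figure 8d: scan a grid of coma and astigmatism coefficients, generate a model PSF for each pair via the pupil-FFT construction of Eq. (4), and keep the pair that minimizes the RMS error against the observed profile. Illustrative only; the grid ranges, normalization and the synthetic "observation" are assumptions, not the authors' settings.

```python
# Minimal sketch: grid search over coma/astigmatism wavefront coefficients,
# minimizing the RMS error between a model PSF and an "observed" profile.
import numpy as np

def model_psf(w_coma, w_astig, rho, theta, pupil):
    # low-order Zernike modes written out explicitly:
    #   coma  Z_3^1 = sqrt(8) (3 rho^3 - 2 rho) cos(theta)
    #   astig Z_2^2 = sqrt(6) rho^2 cos(2 theta)
    coma = np.sqrt(8.0) * (3 * rho**3 - 2 * rho) * np.cos(theta)
    astig = np.sqrt(6.0) * rho**2 * np.cos(2 * theta)
    W = (w_coma * coma + w_astig * astig) * pupil      # wavefront [waves]
    field = pupil * np.exp(-2j * np.pi * W)
    psf = np.abs(np.fft.fftshift(np.fft.fft2(field))) ** 2
    return psf / psf.sum()

N = 128
x = np.linspace(-1.0, 1.0, N)
X, Y = np.meshgrid(x, x)
rho, theta = np.hypot(X, Y), np.arctan2(Y, X)
pupil = (rho <= 1.0).astype(float)

# stand-in for the measured star cutout: a PSF with "true" coefficients
observed = model_psf(0.12, 0.07, rho, theta, pupil)

best = (np.inf, None)
for wc in np.linspace(0.0, 0.2, 21):          # coma grid [waves]
    for wa in np.linspace(0.0, 0.2, 21):      # astigmatism grid [waves]
        rms = np.sqrt(np.mean((model_psf(wc, wa, rho, theta, pupil)
                               - observed) ** 2))
        if rms < best[0]:
            best = (rms, (wc, wa))
print("best (coma, astigmatism):", best[1], " rms:", best[0])
```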
7.2 Model of spatially variant deconvolution

Fig. 9: a) Model of the PSI optical system. b) Source testing image. c) Image passed through the SV optical system with coma aberration only

Fig. 10: a) Image restored using the Wiener deconvolution algorithm. b) Deconvolution using the Lucy–Richardson algorithm. c) Deconvolution using the maximum likelihood algorithm

The principal difficulty in spatially variant (SV) systems is that the Fourier approach can no longer be used for restoring the original image [6]. If we want to use the Fourier method for the deconvolution process, we need to split the original image; the transfer characteristics of each part are then described by a unique PSF. Such a system is referred to as a partially space invariant (PSI) system [8], and we use it in our experiments. This system has a parametric PSF — for each value of the parameter (in our case the coordinate of the object at the object plane), the PSF takes a different size and shape according to the aberrations that are present. Thus we obtain a number of PSFs, one for each region referred to as an isoplanatic patch [8], within which the system is approximately invariant. To describe the imaging system fully, the impulse response appropriate to each isoplanatic patch must be specified. The wavefront aberration function for the LSV optical system can be described as

W(\rho,\theta,\delta,\varphi) = \sum_{n}^{k}\sum_{m=-n}^{n}
W_n^m(\delta,\varphi)\, Z_n^m(\rho,\varphi-\theta),   (5)

where W_n^m(δ, φ) is the RMS wavefront error for aberration mode (m, n) and object polar coordinate (δ, φ). Using the partially space invariant system allows us to describe the transfer characteristics in the individual patches by Fourier methods, according to Equation (4). Thus we obtain a number of space invariant PSFs, one for each isoplanatic patch, and we can again use the Fourier approach for image deconvolution. In our experiments the optical system is divided into 40 isoplanatic patches; see Figure 9. Three deconvolution algorithms [3, 4] — Wiener, Lucy–Richardson and maximum likelihood — are used for restoring the original image; see Figure 10.
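To make the patchwise idea concrete, the sketch below runs a plain Richardson–Lucy iteration independently on image tiles, each with its own PSF, as a crude stand-in for isoplanatic patches. It is a minimal sketch under simplifying assumptions (hard tile boundaries, no blending, a toy PSF-vs-position rule), not the authors' implementation.

```python
# Minimal sketch: patchwise Richardson-Lucy deconvolution, one PSF per tile
# (a crude stand-in for isoplanatic patches; no overlap/blending).
import numpy as np
from scipy.signal import fftconvolve

def richardson_lucy(image, psf, n_iter=25):
    est = np.full_like(image, image.mean())
    psf_mirror = psf[::-1, ::-1]
    for _ in range(n_iter):
        conv = fftconvolve(est, psf, mode="same")
        est *= fftconvolve(image / np.maximum(conv, 1e-12),
                           psf_mirror, mode="same")
    return est

def gaussian_psf(sigma, size=15):
    ax = np.arange(size) - size // 2
    g = np.exp(-(ax[:, None] ** 2 + ax[None, :] ** 2) / (2 * sigma**2))
    return g / g.sum()

rng = np.random.default_rng(1)
blurred = rng.random((128, 128))          # stand-in for an observed frame

tiles = 2                                 # 2 x 2 = 4 patches here (paper: 40)
h = blurred.shape[0] // tiles
restored = np.zeros_like(blurred)
for i in range(tiles):
    for j in range(tiles):
        # PSF width grows away from the optical axis (toy assumption)
        sigma = 1.0 + 0.8 * max(i, j)
        sl = (slice(i * h, (i + 1) * h), slice(j * h, (j + 1) * h))
        restored[sl] = richardson_lucy(blurred[sl], gaussian_psf(sigma))
print("done; restored shape:", restored.shape)
```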
8 Conclusions

This paper has discussed ways of improving astronomical measurements. The first approach is to use the novel star profile model for restoring the original, undistorted image, which is then suitable for precise astrometry using ordinary fitting models (Gaussian, Moffat). The second approach is to use a different evaluation and detection algorithm: this involves creating a new PSF fitting model based, e.g., on Zernike polynomials and using it for detection and astrometric measurements. The situation is complicated by the fact that all wide-field astronomical observational systems are SV. Therefore, the shape of a stellar object varies across the entire FOV, and Gaussian or Moffat fitting procedures are useless for precise astrometric measurements, especially towards the margins of the FOV. An experiment for estimating the optical aberrations of real optical systems was implemented; the model PSF was compared with the real object profile by means of the RMS error. Obviously, the model should have more input parameters, especially more varying optical aberrations. SV deconvolution was addressed, and a model of the partially space variant optical system was implemented. Only the "brute force" method was used for restoring the test images, and the results of various deconvolution algorithms were demonstrated.

Acknowledgement

This work has been supported by grants no. GA205/09/1302 "Study of sporadic meteors and weak meteor showers using automatic video intensifier cameras" and GA P102/10/1320 "Research and modelling of advanced methods of image quality evaluation" of the Grant Agency of the Czech Republic.

References

[1] Born, M., Wolf, E.: Principles of Optics. 2nd ed. London: Pergamon Press, 1993.
[2] Goodman, J. W.: Introduction to Fourier Optics. Boston: McGraw-Hill, 1996.
[3] Campisi, P., Egiazarian, K.: Blind Image Deconvolution: Theory and Applications. CRC, 2007.
[4] Janson, P. A.: Deconvolution of Images and Spectra. 2nd ed. London: Academic Press, 1997.
[5] Welford, W. T.: Aberration of the Symmetrical Optical System. London: Academic Press, 1974.
[6] Shannon, R. R., Wyant, J. C.: Applied Optics and Optical Engineering. London: Academic Press, 1992.
[7] Starck, J., Murtagh, F.: Astronomical Image and Data Analysis. Berlin: Springer Verlag, 2002.
[8] Trussel, H. J., Hunt, B. R.: IEEE Trans. Acoustic Speech Sig. Proc., 608, 157, 1978.
[9] Maeda, P. Y.: Zernike Polynomials and Their Use in Describing the Wavefront Aberrations of the Human Eye. 2003.
[10] Hudec, R., Spurny, M., Krizek, P., Páta, P., Řeřábek, M.: Low-cost optical all-sky monitor for detection of bright OTs of GRBs. In: Gamma-Ray Burst: Sixth Huntsville Symposium. Austin: American Institute of Physics, 2009, p. 215–217. ISBN 978-0-7354-0670-4.
[11] Řeřábek, M., Páta, P., Koten, P.: Processing of the astronomical image data obtained from UWFC optical systems. In: Proceedings of SPIE — 7076 — Image Reconstruction from Incomplete Data V. Washington: SPIE, 2008. ISBN 978-0-8194-7296-0.
[12] Řeřábek, M., Páta, P.: Advanced processing of astronomical images obtained from optical systems with ultra wide field of view. In: Applications of Digital Image Processing XXXII. Bellingham (Washington State): SPIE, 2009, p. 1–4. ISBN 978-0-8194-7733-0.
[13] IRAF — Image Reduction and Analysis Facility, http://iraf.noao.edu/, 2007.

M. Řeřábek, P. Páta, e-mail: rerabem@fel.cvut.cz
Czech Technical University in Prague, Faculty of Electrical Engineering
Department of Radio Engineering, Technická 2, 166 27 Prague 6, Czech Republic

Acta Polytechnica Vol. 48 No. 5/2008

Optical properties of erbium and erbium/ytterbium doped polymethylmethacrylate

V. Prajzler, V. Jeřábek, O. Lyutakov, I. Hüttel, J. Špirková, V. Machovič, J. Oswald, D. Chvostová, J. Zavadil

Abstract

In this paper we report on the fabrication and properties of Er³⁺ and Er³⁺/Yb³⁺ doped polymethylmethacrylate (PMMA) layers. The reported layers were fabricated by spin coating on silicon or on quartz substrates. Infrared spectroscopy was used for an investigation of the O–H stretching vibration. Measurements were made of the transmission spectra in the wavelength ranges from 350 to 700 nm for the Er³⁺ doped samples and from 900 to 1040 nm for the Yb³⁺ doped samples. The refractive indices were investigated in the spectral range from 300 to 1100 nm using optical ellipsometry, and the photoluminescence spectra were measured in the infrared region.

Keywords: polymer, polymethylmethacrylate, erbium, ytterbium, optical properties.

Table 1: The rare earth elements and some of their properties

Atomic number  Element            Electron configuration (RE³⁺)  Ground term (RE³⁺)
58             cerium – Ce        4f¹5s²5p⁶                      ²F₅/₂
59             praseodymium – Pr  4f²5s²5p⁶                      ³H₄
60             neodymium – Nd     4f³5s²5p⁶                      ⁴I₉/₂
61             promethium – Pm    4f⁴5s²5p⁶                      ⁵I₄
62             samarium – Sm      4f⁵5s²5p⁶                      ⁶H₅/₂
63             europium – Eu      4f⁶5s²5p⁶                      ⁷F₀
64             gadolinium – Gd    4f⁷5s²5p⁶                      ⁸S₇/₂
65             terbium – Tb       4f⁸5s²5p⁶                      ⁷F₆
66             dysprosium – Dy    4f⁹5s²5p⁶                      ⁶H₁₅/₂
67             holmium – Ho       4f¹⁰5s²5p⁶                     ⁵I₈
68             erbium – Er        4f¹¹5s²5p⁶                     ⁴I₁₅/₂
69             thulium – Tm       4f¹²5s²5p⁶                     ³H₆
70             ytterbium – Yb     4f¹³5s²5p⁶                     ²F₇/₂

1 Introduction

In recent years, photonics materials containing rare earth (RE) ions have attracted much attention for their potential applications in full-color displays, optical sources and laser systems such as optical amplifiers [1–6]. A list of the RE elements with some of their basic properties is shown in Table 1. Most research has focused on RE ions which can emit in the visible region. Steckl et al. reported in [7] on the properties of GaN layers doped with Eu³⁺, Er³⁺ and Tm³⁺ ions. They obtained (for the first time) photoemission from higher excited RE states in GaN covering the entire visible spectrum: light emission in the green (from Er at 537/558 nm), red (Pr at 650 nm and Eu at 621 nm) and blue (Tm at 477 nm) spectral regions. A second major field of study deals with RE ions which can emit in the infrared region; for these purposes, the RE ions are most often studied for telecommunications systems. For telecommunication systems operating at 1300 nm, investigations are made of RE ions such as Nd³⁺, Pr³⁺ and Dy³⁺ [8–11]. For telecommunications applications at 1530 nm, investigations are made of Er³⁺ and Tm³⁺ ions [12–14]. In the last decade there have also been investigations of sensitizers to produce more efficient RE doped sources. The most often used sensitizer is Yb³⁺ for Er³⁺-doped optical amplifiers [15, 16]. In addition to Yb³⁺, other RE ions are nowadays examined as sensitizers, such as Ho³⁺ for Tm³⁺ [17, 18] or Ho³⁺ for Yb³⁺ [19] doped photonics materials, etc. [20]. Optical materials such as semiconductors, glasses and optical crystals doped with RE ions are the conventional materials for achieving lasing action.
Recently there has been considerable interest in the development of new photonics materials such as polymers [21, 22], which have better properties and a lower price. In this paper, we present the fabrication and properties of Er and Er/Yb doped polymer layers. As the polymer material we chose polymethylmethacrylate (PMMA), due to its low optical absorption, simple synthesis and low cost; these characteristics make it a suitable host material for RE ions [23, 24]. Er³⁺ ions were chosen because they now play a key role in long-distance optical communication systems. Yb³⁺ co-doping was applied because it was previously shown that the addition of ytterbium ions increases the intensity of the luminescence at 1530 nm [16].

2 Experiment

Small pieces of PMMA (Goodfellow) were left to dissolve in chloroform for a few days before being used in the fabrication of the PMMA layers. The PMMA layers were fabricated by spin coating on silicon substrates, or the polymer was poured into a bottomless mold placed on a quartz substrate and left to dry. For RE doping, anhydrous ErCl₃ and YbCl₃, or ErF₃ and YbF₃, or erbium(III) tris(2,2,6,6-tetramethyl-3,5-heptanedionate) (Sigma-Aldrich) and ytterbium(III) tris(2,2,6,6-tetramethyl-3,5-heptanedionate) (Goodfellow) were dissolved in C₅H₉NO or C₂H₆OS (Sigma-Aldrich). The layers were fabricated in such a way that the content of erbium in the solutions, which were then added to the polymer, varied from 1.0 at. % to 20.0 at. %. The samples containing 1.0 at. % of erbium were co-doped with ytterbium (Er³⁺/Yb³⁺ samples) in amounts varying from 1.0 at. % to 20.0 at. %. The molecular structure of PMMA is shown in Fig. 1; the molecular structure of ErCl₃ is shown in Fig. 2a, that of ErF₃ in Fig. 2b, and those of the erbium(III) and ytterbium(III) tris(2,2,6,6-tetramethyl-3,5-heptanedionate) complexes in Fig. 2c.
3 Results and discussion

The samples were characterized by infrared spectroscopy (FT-IR) using a Bruker IFS 66/v FTIR spectrometer equipped with a broadband MCT detector; 128 interferograms were co-added at a resolution of 4 cm⁻¹ (Happ–Genzel apodization). Fig. 3 displays the FT-IR spectra of PMMA layers doped with Er³⁺ ions (ErCl₃) in the wavenumber range from 4000 to 2600 cm⁻¹.

Fig. 1: Molecular structure of PMMA
Fig. 2: Molecular structure of a) ErCl₃, b) ErF₃ and c) the erbium(III) and ytterbium(III) complexes
Fig. 3: Infrared spectra of PMMA samples doped with Er³⁺ using ErCl₃

The three strong broad bands occurring at 2994 cm⁻¹, 2953 cm⁻¹ and 2843 cm⁻¹ correspond to the aliphatic C–H bands. These bands are assigned to the stretching vibrations of CH₃ and CH₂, and indicate a high content of hydrogen-rich CHₓ. The absorption band at 3349 cm⁻¹ corresponds to the O–H stretching vibrations of the PMMA layers. Fig. 3 also shows that increasing the Er³⁺ content increased the intensity of the O–H vibrations. This can be explained by the fact that ErCl₃, being a very hygroscopic substance, not only dopes the polymer samples but also brings in a certain amount of water. It is a well-known fact that the presence of O–H groups in a matrix containing rare earth ions unfortunately causes problems by hindering emission in the infrared region.

The transmission measurements were performed using a UV-VIS-NIR spectrometer (UV-3600, Shimadzu) in the spectral range from 350 to 700 nm. The transmission spectra of the Er³⁺ doped polymer (erbium(III)) in this range are shown in Fig. 4. In the samples containing 10.0 at. % of Er³⁺, two bands appeared, attributed to the transitions ⁴G₁₁/₂ (377 nm) and ²H₁₁/₂ (519 nm). In the samples containing 20.0 at. % of Er³⁺, one more band appeared, ⁴F₉/₂ (650 nm). We did not observe the bands ²G₇/₂ (355 nm), ²G₉/₂ (363 nm), ²H₉/₂ (405 nm), ⁴F₃/₂ (441 nm), ⁴F₅/₂ (448 nm), ⁴F₇/₂ (485 nm) and ⁴S₃/₂ (539 nm). The same results were obtained for samples doped with ErCl₃ and ErF₃ solutions.

The transmission spectra of the Er³⁺ (1.0 at. %) doped polymers co-doped with Yb³⁺ ions using the erbium(III) and ytterbium(III) complexes (from 1.0 at. % to 20.0 at. %) in the spectral range from 900 to 1040 nm are shown in Fig. 5. The sample containing 20.0 at. % of Yb³⁺ ions shows the typical Yb³⁺ ²F₅/₂ transition with a maximum at 977 nm.

Fig. 4: Transmission spectra of the Er³⁺ doped PMMA using erbium(III)
Fig. 5: Transmission spectra of Er³⁺ (1.0 at. %) doped PMMA co-doped with Yb³⁺ using the erbium(III) and ytterbium(III) complexes

The samples with a lower concentration have a weaker ²F₅/₂ transition, and the samples with a concentration of 1.0 at. % Yb³⁺ show no visible Yb³⁺ (²F₅/₂) transition.

The refractive indices were measured using variable angle spectroscopic ellipsometry (VASE, J.A. Woollam & Co.) working in rotating analyzer mode. The measurements were carried out in the spectral range from 300 to 1100 nm. Fig. 6a shows the dependence of the refractive indices of the Er³⁺ doped PMMA (erbium(III)), and Fig. 6b shows the dependence of the refractive indices of the 1.0 at. % Er³⁺ (erbium(III)) doped PMMA with Yb³⁺ co-doping (ytterbium(III)). It is obvious that increasing the content of Er³⁺ and Er³⁺ + Yb³⁺ increases the refractive indices of the material. The refractive index value is also a matter of the polarizability of the ions present in the material [26]. Polarizability (or ion deformation) is understood as a function of the size of the ions — the larger the size, the larger the polarizability, and vice versa. The presence of larger cations in the thin polymer layer (in this case the rather large Er³⁺ and/or Yb³⁺) usually raises the refractive index. The results are not surprising, but what is important is the exact refractive index value (depending, of course, on the wavelength) of the deposited material.

Fig. 6: Wavelength dependence of the refractive indices of a) the Er³⁺ and b) the Er³⁺+Yb³⁺ doped PMMA layers using erbium(III) and ytterbium(III)
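A wavelength dependence of n such as the one just discussed is often summarized with a simple Cauchy dispersion law, n(λ) = A + B/λ² + C/λ⁴. The sketch below fits such a law to a few made-up (λ, n) points standing in for ellipsometry output; the data points and resulting coefficients are purely illustrative, not the paper's measurements.

```python
# Minimal sketch: fit a Cauchy dispersion law n(lambda) = A + B/l^2 + C/l^4
# to refractive-index points (made-up stand-ins for ellipsometry output).
import numpy as np
from scipy.optimize import curve_fit

def cauchy(lam_nm, a, b, c):
    lam_um = lam_nm / 1000.0          # Cauchy coefficients in micrometres
    return a + b / lam_um**2 + c / lam_um**4

lam = np.array([400.0, 500.0, 600.0, 800.0, 1000.0])    # wavelength [nm]
n_meas = np.array([1.510, 1.498, 1.492, 1.486, 1.484])  # illustrative values

(a, b, c), _ = curve_fit(cauchy, lam, n_meas, p0=[1.48, 0.004, 0.0])
print(f"A={a:.4f}, B={b:.5f} um^2, C={c:.6f} um^4")
print("n(633 nm) =", round(cauchy(633.0, a, b, c), 4))
```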
18 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 48 no. 5/2008 1450 1500 1550 1600 1650 0 0. 0 2. 0 4. 0 6. 0 8. 1 0. 0 0. 0 2. 0 4. 0 6. 0 8. 1 0. 0 0. 0 2. 0 4. 0 6. 0 8. 1 0. p l in te n s it y (a .u .) wavelength (nm) 4 i 13/2 4 i 15/2 a) 1% er 10% er 20% er 1450 1500 1550 1600 1650 0 0. 0 2. 0 4. 0 6. 0 8. 1 0. 0 0. 0 2. 0 4. 0 6. 0 8. 1 0. 0 0. 0 2. 0 4. 0 6. 0 8. 1 0. b) p l in te n s it y (a .u .) wavelength (nm) 4 i 13/2 4 i 15/2 10% er 5% er 1% er fig. 7: photoluminescence spectra of er3� doped pmma layers a) (erbium(iii)), b) erf3� ( �ex � 980 nm with eex � 500 mw; room temperature) � the content of er3� and er3� � yb3� ions had a significant effect on the transmission spectra. we observed three bands corresponding to the er3� ions (4g11/2 – 377 nm, 2h11/2 – 519 nm, 4f9/2 – 650 nm and one band corresponding to the yb3� ions (2f5/2 – 977 nm). these bands we observed in the samples doped with a higher er and yb concentration, and they almost disappeared in the background in the case of samples with a low er and yb concentration. � the refractive indices were investigated by spectroscopic ellipsometry and we found that increasing the content of the er3� and yb3� ions increases the refractive indices of the material. � the er3� doped pmma samples exhibited a typical emission at 1530 nm, due to the er3� intra-4f 4i13/2 � 4i15/2 only at samples with higher content of er3� ions. the highest emission intensity was found in sample (erf3) containing 10.0 at. % erbium. it was also found that the addi© czech technical university publishing house http://ctn.cvut.cz/ap/ 19 acta polytechnica vol. 48 no. 5/2008 1450 1500 1550 1600 1650 0 0. 0 2. 0 4. 0 6. 0 8. 1 0. 0 0. 0 2. 0 4. 0 6. 0 8. 1 0. 0 0. 0 2. 0 4. 0 6. 0 8. 1.0 p l in te n s it y (a .u .) wavelength (nm) 5% yb 10% yb 4 i 13/2 4 i 15/2 a) 20% yb 1450 1500 1550 1600 1650 0 0. 0 2. 0 4. 0 6. 0 8. 1 0. 0 0. 0 2. 0 4. 0 6. 0 8. 1 0. 0 0. 0 2. 0 4. 0 6. 0 8. 1 0. b) p l in te n s it y (a .u .) wavelength (nm) 10% yb 5% yb 1% yb 4 i 13/2 4 i 15/2 fig. 8: photoluminescence spectra of er3��yb3� doped pmma layers (1.0 at. % er) a) (erbium(iii), ytterbium(iii)), b) erf3�, ybf3� ��ex � 980 nm with eex � 500 mw; room temperature) tion of ytterbium did not substantially affect the 1530 nm luminescence. acknowledgments our research has been supported by the grant agency of the czech republic under grant number 102/06/0424 and research program msm6840770014 of the czech technical university in prague. references [1] steckl, a. j., zavada, j. m.: optoelectronic properties and applications of rare-earth-doped gan. mrs bulletin, vol. 24 (1999), issue 9, p. 33–38. [2] steckl, a. j., heikenfeld, j., lee, d. s., garter, m.: multiple color capability from rare earth-doped gallium nitride. materials science and engineering b-solid state materials for advanced technology, vol. 81 (2001), issue 1–3, p. 97–101. [3] kenyon, a. j.: recent developments in rare-earth doped materials for optoelectronics. progress in quantum electronics, vol. 26 (2002), no. 4–5, p. 225–284. [4] van den hoven, g. n., snoeks, e., polman, a., vanuffelen, j. w. m., oei, y. s., smit, m. k.: photoluminesce characterization of er-implanted al2o3 films. applied physics letters, vol. 62 (1993), no. 24, p. 3065–3067. [5] kik, p. g., polman, a.: erbium doped optical-waveguide amplifiers on silicon. mrs bulletin, vol. 23 (1998), issue 4, p. 48–54. [6] chiasera, a., tosello, c., moser, e., montagna, m., belli, r., goncalves, r. r., righini, g. 
[6] Chiasera, A., Tosello, C., Moser, E., Montagna, M., Belli, R., Goncalves, R. R., Righini, G. C., Pelli, S., Chiappini, A., Zampedri, L., Ferrari, M.: Er³⁺/Yb³⁺-activated silica-titania planar waveguides for EDPWAs fabricated by RF-sputtering. Journal of Non-Crystalline Solids, Vol. 322 (2003), Issue 1–3, p. 289–294.
[7] Steckl, A. J., Heikenfeld, J. C., Lee, D. S., Garter, M. J., Baker, C. C., Wang, Y. Q., Jones, R.: Rare-earth-doped GaN: growth, properties, and fabrication of electroluminescent devices. IEEE Journal of Selected Topics in Quantum Electronics, Vol. 8 (2002), No. 4, p. 749–766.
[8] Wang, J. S., Vogel, E. M., Snitzer, E., Jackel, J. L., da Silva, V. L., Silberberg, Y.: 1.3 µm emission of neodymium and praseodymium in tellurite-based glasses. Journal of Non-Crystalline Solids, Vol. 178 (1994), p. 109–113.
[9] Man, S. Q., Pun, E. Y. B., Chung, P. S.: Tellurite glasses for 1.3 µm optical amplifiers. Optics Communications, Vol. 168 (1999), Issues 5–6, p. 369–373.
[10] Samson, B. N., Neto, J. A. M., Laming, R. I., Hewak, D. W.: Dysprosium doped Ga:La:S glass for an efficient optical fiber amplifier operating at 1.3 µm. Electronics Letters, Vol. 30 (1994), Issue 19, p. 1617–1619.
[11] Machewirth, D. P., Wei, K., Krasteva, V., Datta, R., Snitzer, E., Sigel, G. H.: Optical characterization of Pr³⁺ and Dy³⁺ doped chalcogenide glasses. Journal of Non-Crystalline Solids, Vol. 213 (1997), p. 295–303.
[12] Yan, Y. C., Faber, A. J., de Waal, H., Kik, P. G., Polman, A.: Erbium-doped phosphate glass waveguide on silicon with 4.1 dB/cm gain at 1.535 µm. Applied Physics Letters, Vol. 71 (1997), Issue 20, p. 2922–2924.
[13] Shmulovich, J., Wong, A., Wong, Y. H., Becker, P. C., Bruce, A. J., Adar, R.: Er³⁺ glass waveguide amplifier at 1.5 µm on silicon. Electronics Letters, Vol. 28 (1992), Issue 13, p. 1181–1182.
[14] Kasamatsu, T., Yano, Y., Sekita, H.: 1.50 µm-band gain-shifted thulium-doped fiber amplifier with 1.05 and 1.56 µm dual-wavelength pumping. Optics Letters, Vol. 24 (1999), Issue 23, p. 1684–1686.
[15] Chryssou, C. E., Di Pasquale, F., Pitt, C. W.: Improved gain performance in Yb³⁺-sensitized Er³⁺-doped alumina (Al₂O₃) channel optical waveguide amplifiers. Journal of Lightwave Technology, Vol. 19 (2001), Issue 3, p. 345–349.
[16] Strohhofer, C., Polman, A.: Relationship between gain and Yb³⁺ concentration in Er³⁺–Yb³⁺ doped waveguide amplifiers. Journal of Applied Physics, Vol. 90 (2001), No. 9, p. 4314–4320.
[17] Bourdet, G. L., Muller, R. A.: Tm,Ho:YLF microchip laser under Ti:sapphire and diode pumping. Applied Physics B: Lasers and Optics, Vol. 70 (2000), Issue 3, p. 345–349.
[18] Sani, E., Toncelli, A., Tonelli, M., Coluccelli, N., Galzerano, G., Laporta, P.: Comparative analysis of Tm–Ho:KYF₄ laser crystals. Applied Physics B: Lasers and Optics, Vol. 81 (2005), Issue 6, p. 847–851.
[19] Li, J., Wang, J. Y., Tan, H., Cheng, X. F., Song, F., Zhang, H. J., Zhao, S. R.: Growth and optical properties of Ho,Yb:YAl₃(BO₃)₄ crystal. Journal of Crystal Growth, Vol. 256 (2003), Issue 3–4, p. 324–327.
[20] Digonnet, J. F. M.: Rare-Earth-Doped Fiber Lasers and Amplifiers. Stanford, California: Marcel Dekker Inc., 1993.
[21] Eldada, L., Shacklette, L. W.: Advances in polymer integrated optics. IEEE Journal of Selected Topics in Quantum Electronics, Vol. 6 (2000), No. 1, p. 54–68.
[22] Wong, W. H., Liu, K. K., Chan, K. S., Pun, E. Y. B.: Polymer devices for photonics applications. Journal of Crystal Growth, Vol. 288 (2006), Issue 1, p. 100–104.
[23] Liang, H., Zheng, Z. Q., Zhang, Q. J., Ming, H., Chen, B., Xu, J., Zhao, H.: Radiative properties of Eu(DBM)₃Phen-doped poly(methyl methacrylate). Journal of Materials Research, Vol. 18 (2003), Issue 8, p. 1895–1899.
[24] Sosa, R., Flores, M., Rodriguez, R., Munoz, A.: Optical properties and Judd–Ofelt intensity parameters of Eu³⁺ in PMMA:PAAc copolymer samples. Revista Mexicana de Física, Vol. 46 (2003), Issue 6, p. 519–524.

Ing. Václav Prajzler, Ph.D., e-mail: xprajzlv@feld.cvut.cz
Ing. Vítězslav Jeřábek, CSc., e-mail: jerabek@fel.cvut.cz
Department of Microelectronics, Faculty of Electrical Engineering
Czech Technical University in Prague, Technická 2, 166 27 Prague, Czech Republic

Mgr. Oleksei Lyutakov
Doc. Ing. Ivan Hüttel, DrSc., e-mail: ivan.huttel@vscht.cz
RNDr. Jarmila Špirková, CSc., e-mail: jarmila.spirkova@vscht.cz
Ing. Vladimír Machovič, CSc.
Institute of Chemical Technology, Technická 5, 166 27 Prague, Czech Republic

Ing. Jiří Oswald, CSc., e-mail: oswald@fzu.cz
RNDr. Dagmar Chvostová, e-mail: chvostov@fzu.cz
Institute of Physics, Academy of Sciences of the Czech Republic, v. v. i.
Cukrovarnická 10/112, 162 53 Prague 6, Czech Republic

RNDr. Jiří Zavadil, CSc., e-mail: zavadil@ufe.cz
Institute of Photonics and Electronics, Academy of Sciences of the Czech Republic, v. v. i.
Chaberská 57, 182 51 Prague, Czech Republic

Acta Polytechnica Vol. 51 No. 6/2011

Monitoring PSR B1509–58 with RXTE: spectral analysis 1996–2010

E. Litzinger, K. Pottschmidt, J. Wilms, S. Suchy, R. E. Rothschild, I. Kreykenbohm

Abstract

We present an analysis of the X-ray spectra of the young, Crab-like pulsar PSR B1509–58 (pulse period P ∼ 151 ms) observed by RXTE over 14 years since the beginning of the mission in 1996. The uniform dataset is especially well suited for studying the stability of the spectral parameters over time, as well as for determining pulse phase resolved spectral parameters with high significance. The phase averaged spectra as well as the phase resolved spectra can be well described by an absorbed power law.

Keywords: pulsars: individual (PSR B1509–58) — stars: neutron — X-rays: stars.

1 Introduction

The pulsar PSR B1509–58 was discovered in Einstein X-ray Observatory data from 1979 and 1980 [5]. The pulsar is associated with the supernova remnant G320.4–1.2 (MSH 15–52) in the constellation Circinus. PSR B1509–58 has been established as one of only a few known Crab-like sources, i.e., a young pulsar powering a synchrotron nebula [6]. PSR B1509–58's nebula is considerably larger, its surface brightness is lower, and its pulse period of P ∼ 151 ms is slower than that of the Crab. Due to a very high spin-down rate of Ṗ ∼ 1.5 × 10⁻¹² s s⁻¹, however, the characteristic age P/2Ṗ of PSR B1509–58 is ∼ 1.6 × 10³ yr (e.g., [7]), i.e., comparable to that of the Crab (∼ 1.3 × 10³ yr). In the following we analyse the phase averaged spectra (Sec. 2) and the phase resolved spectra (Sec. 5), based on calculated ephemerides (Sec. 3), for the three major RXTE calibration epochs 3–5. Pulse profiles for the epochs are presented in Sec. 4. A short summary of the results and the implications for further spectral analysis with RXTE are given in Sec. 6.
6.

2 phase averaged spectra
from approximately monthly monitoring observations of psr b1509–58, time averaged pca (pcu2 top layer, [1,2]) and hexte (clusters a and b, [4]) spectra were created by averaging the individual monitoring spectra over the major instrument calibration epochs, i.e. between mjd 50188 and 51259 (epoch 3), between mjd 51259 and 51677 (epoch 4), and from mjd 51677 onward (epoch 5). figure 1 shows the epoch averaged counts spectra for epoch 5. some spectral parameters and their uncertainties for the three fits are given in table 1. no systematic uncertainties have been added to the spectra.

fig. 1: (a) pcu2 top layer and hexte counts spectra obtained by accumulating all suitable monitoring spectra of epoch 5, and the best simultaneous fit model (absorbed power law with an iron line), (b) best fit residuals

table 1: best fit parameters for the phase averaged pcu2 spectra of calibration epochs 3–5

| parameter | epoch 3 | epoch 4 | epoch 5 |
| γ | 2.022±0.001 | 2.021±0.001 | 2.026±0.001 |
| a_γ [10⁻² kev⁻¹ cm⁻² s⁻¹] | 7.48±0.84 | 7.48±1.01 | 7.33±0.12 |
| n_h [10²² cm⁻²] | 0.37±0.01 | 0.39±0.02 | 0.58±0.02 |
| e_fe [kev] | 6.65 (+0.01/−0.16) | 6.50 (+0.16/−0.03) | 6.50±0.03 |
| c_hexte | 0.65±0.05 | 0.76±0.04 | 0.85±0.02 |
| χ²_red/dof | 1.09/48 | 0.58/42 | 2.16/94 |
| f(4–10 kev) [10⁻¹¹ erg cm⁻² s⁻¹] | 10.25 | 10.34 | 9.85 |
| f_unabs(4–10 kev) [10⁻¹¹ erg cm⁻² s⁻¹] | 10.57 | 10.57 | 9.93 |
| f(10–20 kev) [10⁻¹¹ erg cm⁻² s⁻¹] | 7.76 | 7.76 | 7.20 |
| f(20–200 kev) [10⁻¹¹ erg cm⁻² s⁻¹] | 25.07 | 25.15 | 22.36 |

the spectra are modeled by an absorbed power law. in addition, we found clear indications of a narrow iron kα line, which is included as a gaussian component added to the power law. because of the long monitoring time of epoch 5 (ten years), systematic features are visible in the spectra. for the pca there are strong residuals around 5 kev, related to the three xenon l edges. around 9.3 kev a broad negative residual is visible. we model it with an additional negative gaussian line at this energy with a flux of −3 × 10⁻⁵ cm⁻² s⁻¹. we speculate that this residual might be related to imperfect modeling of the copper kα emission line at 8.04 kev. events from the americium calibration source of the pca at 33 kev are also visible. they were likewise modeled with a negative gaussian line, with a flux of −1.4 × 10⁻⁴ cm⁻² s⁻¹.

3 pulse period ephemeris
the pulse phase resolved analysis for the pca is based on high time-resolution goodxenon event mode data, filtered for pcu2 top layer events. ephemerides for psr b1509–58 were calculated from the pulse frequencies of each observation. the reference epoch was set to t0 (mjd) = 52921.0, the averaged time of the monitoring (see figure 2, table 2). with this result, barycentered pulse phase- and energy-resolved source count rates (pha2 files) were created using a modified version of the ftool fasebin [3].

fig. 2: frequencies of each observation, calculated by epoch folding the barycentered goodxenon lightcurve, vs. time, with the best fit of a polynomial of quartic grade. a linear decline is visible

table 2: values for the pulse frequency and its first three derivatives

| ν [s⁻¹] | ν̇ [10⁻¹¹ s⁻²] | ν̈ [10⁻²¹ s⁻³] | d³ν/dt³ [10⁻²⁹ s⁻⁴] |
| 6.61032±1×10⁻⁶ | −6.6965±0.002 | 1.18±0.24 | 1.77±0.36 |
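the ephemeris above is a plain taylor expansion of the pulse phase around the reference epoch. as a quick illustration, here is a minimal python sketch (not the fasebin tooling used by the authors) that evaluates the pulse phase from the table 2 coefficients; the sanity check recovers the p ∼ 151 ms period quoted in the introduction.

import numpy as np

# pulse frequency and its derivatives at the reference epoch (table 2)
nu   = 6.61032      # s^-1
nud  = -6.6965e-11  # s^-2
nudd = 1.18e-21     # s^-3
nu3  = 1.77e-29     # s^-4 (third derivative)

def pulse_phase(t):
    """fractional pulse phase at time t (seconds since the reference epoch),
    from phi = nu*t + nud*t^2/2 + nudd*t^3/6 + nu3*t^4/24."""
    phi = nu*t + nud*t**2/2 + nudd*t**3/6 + nu3*t**4/24
    return phi % 1.0

print(1.0/nu)                                   # ~0.1513 s, i.e. p ~ 151 ms
print(pulse_phase(np.array([0.0, 1.0, 86400.0])))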
4 pulse profiles
pulse profiles in the energy range 3–43 kev for the three major epochs are shown in figure 3. the peak was centered on phase 1.0 by shifting the individual pulse profiles and adding them up. the decline in rate between the epochs is an instrumental effect and is accounted for in the calibration of the pcu2 top layer. the larger error bars in epoch 4 are due to its shorter duration and therefore fewer observations (30 in epoch 3, 13 in epoch 4, 213 in epoch 5). a clear division into peak (φ = 0.88–1.25) and off-peak (φ = 0.44–0.75) is possible.

fig. 3: pulse profiles in the energy range 3–43 kev of psr b1509–58 for the three major epochs 3, 4 and 5

5 phase resolved spectra
all pulsed (i.e., peak minus off-peak) and unpulsed (regular background) spectra were summed to obtain averaged spectra for epochs 3–5, respectively. to the averaged phase resolved spectra, as to the phase averaged spectra before, an absorbed power law was fitted (energy range 3–20 kev). for epoch 5, the residuals of the pulsed emission improved on including a cutoff (see figures 4 and 5). the pulsed spectra showed no indication of an iron line at 6.4 kev. the line is part of the unpulsed spectra and hence of the pwn of psr b1509–58. this is also the explanation for the different values of n_h: in the pulsed emission we see the beam of the pulsar and the galactic extinction, while in the unpulsed emission we see the surrounding pwn, without the pulsar, and the galactic extinction. therefore the value for the latter is smaller than for the pulsed emission. the best fit parameters for epoch 5 are shown in table 3.

fig. 4: pcu2 top layer counts spectra of the pulsed emission of epoch 5. residuals are shown for an absorbed power law and after including a cutoff
fig. 5: the same as figure 4, but for the unpulsed emission

table 3: best fit parameters for the pulsed and unpulsed emission for epoch 5 (the second pulsed column includes the cutoff)

| parameter | e5 pulsed | e5 pulsed (cutoff) | e5 unpulsed |
| γ | 1.37±0.01 | 1.27±0.01 | 2.27±0.01 |
| e_cut [kev] | – | 116±8 | – |
| n_h [10²² cm⁻²] | 2.40±0.10 | 1.97±0.11 | 1.34±0.02 |
| e_fe [kev] | – | – | 6.57 (+0.01/−0.07) |
| f(4–10 kev) [10⁻¹¹ erg cm⁻² s⁻¹] | 4.28±0.02 | 4.29±0.02 | 7.99±0.01 |
| f_unabs(4–10 kev) [10⁻¹¹ erg cm⁻² s⁻¹] | 4.57±0.02 | 4.52±0.02 | 8.30±0.01 |
| f(10–20 kev) [10⁻¹¹ erg cm⁻² s⁻¹] | 5.56±0.02 | 5.57±0.02 | 4.90±0.01 |
| χ²_red | 1.09 | 0.91 | 1.26 |

6 summary and conclusions
we could describe the spectra well with an absorbed power law, with a gaussian line for the phase averaged and the unpulsed spectra and without the gaussian for the pulsed spectra. for the pulsed spectrum of epoch 5 a cutoff improves the fit, while for the other epochs and for the unpulsed emission it has no effect. no significant changes between the values of the different epochs were seen for the averaged, pulsed and unpulsed emission. long observations show systematic effects from the instruments and are therefore well suited to characterizing calibration effects. as forthcoming work we intend to add hexte spectra to the epoch averaged phase resolved analysis.

references
[1] jahoda, k., swank, j. h., giles, a. b., et al.: proc. euv, x-ray, and gamma-ray instrumentation for astronomy vii, 1996, vol. 2808, 59.
[2] jahoda, k., markwardt, c. b., radeva, y., et al.: apjs, 163, 401, 2006.
[3] kreykenbohm, i., coburn, w., wilms, j., et al.: a&a, 395, 129, 2002.
[4] rothschild, r. e., blanco, p. r., gruber, d. e., et al.: apj, 496, 538, 1998.
[5] seward, f. d., harnden, jr., f. r.: apj, 256, l45, 1982.
[6] seward, f. d., harnden, jr., f. r., szymkowiak, a., et al.: apj, 281, 650, 1984.
[7] zhang, l., cheng, k. s.: a&a, 363, 575, 2000.

e. litzinger, dr. remeis-observatory/ecap, fau, bamberg, germany
k. pottschmidt, cresst/nasa-gsfc, greenbelt, md, usa; umbc, baltimore, md, usa
j. wilms, dr. remeis-observatory/ecap, fau, bamberg, germany
s. suchy, cass/ucsd, la jolla, ca, usa
r. e. rothschild, cass/ucsd, la jolla, ca, usa
i. kreykenbohm, dr. remeis-observatory/ecap, fau, bamberg, germany
on the intrinsic simplicity of spectral variability of grbs

a. chernenko

abstract
in this paper we present a multi-scale correlation analysis (msca) of the light curves of gamma-ray bursts recorded in different energy ranges. this analysis allows us to identify time intervals where the emission variability can be reduced to a single physical parameter and can therefore be robustly attributed to a single physical emitter. the properties of these intervals can then be investigated separately, and the spectral properties of individual emitters can be analysed. the signatures of hidden dynamical relations between individual emitters are also discussed.

keywords: gamma-ray bursts, variability, multi-scale, correlations, random matrices, eigenvalues.

1 introduction: dimension of spectral variability
relativistically expanding sources produce radiation that manifests a rich spectral evolution and diverse light curve structures. even if the individual emitting regions have a similar and simple geometry and rest frame spectra, the spectral and temporal evolution within any period of an observer's time will be very diverse, since many individual emitters contribute to this period at different viewing angles and lorentz factors, and at different stages of development. in reality, the physical properties and even the physical nature of the individual emitters within a single source can also be quite diverse. in this situation we believe that the very first stage of spectroscopy should be model independent, and should give a clear understanding of the following:
1. the dimension of variability for the entire period of the observations: how many spectral parameters are in fact independently variable during the period of the observations?
2. segmentation into low-dimension periods: can the entire period of the observations be split into sub-periods for each of which the internally measured dimension of variability is considerably lower than the dimension for the entire period?
3. hidden dynamics: is there evidence that the emission properties at any moment of the observations depend on its properties at earlier moment(s) of time? if such signatures are found, can the dimension of variability be further decreased by taking the hidden dynamics into account?

in this paper i will focus on the first two items on this list.

2 measuring the dimension of variability by correlating light curves
correlation analysis of emission light curves recorded in different energy ranges (wavelengths) is a natural and model-independent way of estimating the dimension of spectral variability. the rank r_cm of the corresponding correlation matrix cm is exactly its numerical measure. figure 1 presents the light curves of grb 911118. these light curves, measured in p = 8 energy ranges by the batse/compton large area detectors with a time resolution of 0.016 s, produce a data matrix with dimension p × n = 8 × 1000. figure 2 presents the corresponding correlation matrix. the spectrum of its sorted eigenvalues ev(i) (i = 1, ..., 8) is also presented. it is clear that there are 2 non-zero eigenvalues, and therefore the rank of the correlation matrix is r_cm = 2. in this case, we say that the spectral variability is two-dimensional. this original result was first obtained by chernenko and mitrofanov in [1] and was interpreted as evidence for two non-correlated emission components related to physically distinct emission sources.
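to make the rank diagnostic concrete, here is a minimal numpy sketch of the construction just described; the two-component mixture is hypothetical stand-in data, not the batse records, and with real, noisy counts the significance of the eigenvalues must be assessed as in section 3 below.

import numpy as np

# toy stand-in for the 8-channel light curves: two independent source
# components mixed into p = 8 energy channels
rng = np.random.default_rng(0)
n, p = 1000, 8
s = rng.standard_normal((2, n))        # two independent "emitters"
mix = rng.standard_normal((p, 2))      # hypothetical channel responses
data = mix @ s                         # p x n data matrix

# correlation matrix of the channels and its sorted eigenvalue spectrum
cm = np.corrcoef(data)
ev = np.sort(np.linalg.eigvalsh(cm))[::-1]
print(ev / ev[0])   # only the first two entries are significantly non-zero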
later, similar results were obtained in a number of papers, e.g. [2]. there is also an alternative interpretation involving a single emission component with two independently variable parameters [3]. the next logical step in the investigation, as was noted in the introduction, is segmentation, i.e. to look at the grb through time windows of different widths and positions in order to find sub-intervals where the spectral variability is just one-dimensional, i.e. where the covariance matrix has just a single non-zero eigenvalue. we call this approach multi-scale correlation analysis, and will present its results later in this paper.

fig. 1: gamma-ray light curves of gamma-ray burst gb 911118 detected by the batse/compton experiment, as measured in 8 broad energy channels between 37 kev and approx. 2 mev with a time resolution of 0.016 seconds. the amplitudes of the light curves are arbitrarily scaled to emphasize their shapes

fig. 2: the correlation matrix calculated for the 8 light curves of gb 911118 presented in figure 1 is shown graphically in the left panel. in the right panel, the spectrum of its sorted eigenvalues ev(i) is presented. the eigenvalues are normalized by the largest, ev(1). only the two largest eigenvalues are evidently non-zero, so the rank of the correlation matrix is r_cm = 2

but before that, two important statements should be put forward:
• as our goal is to study sub-intervals of the grb time history on multiple time scales, down to the shortest ones where covariance can be calculated (i.e. n → 3), we need to deal with cases when n < p and the covariance matrix is p − n + 1 times singular.
• in the real world, time series are "contaminated" by poisson noise. therefore, to determine the number of eigenvalues which are significantly larger than zero, we need to know how to propagate poisson noise from the raw time series to noise in the eigenvalues.

3 statistical properties of eigenvalues of correlation matrices for poisson noise time series of arbitrary duration
let us consider a data matrix with dimensions n × p, where p is the number of time series and n is the number of measurements (the duration) of each time series. let us assume that each of the time series is a stationary poisson process with parameter λ. then, without loss of generality, we can transform the poisson process by means of the anscombe formula [4] into a nearly gaussian process with standard deviation σ = 1:

\[ a: x \Rightarrow 2\sqrt{x + \tfrac{3}{8}} \qquad (1) \]

the correlation matrices for gaussian time series with σ = 1 are known in random matrix theory as wishart matrices. analytical results on their eigenvalues are basically limited to the quasi-continuous density distribution of the eigenvalues in the asymptotic case p ≤ n → ∞, which is given by the marcenko-pastur density function [5]. analytical results on the distribution of individual eigenvalues are basically limited to the density distribution of the largest eigenvalue, which is described by the tracy-widom law [7]. although this law was found to be reasonably accurate for n and p as small as 5 [6], for our investigation this is not sufficient, since i) we need to know the statistical properties of the smaller eigenvalues in order to determine the dimension of spectral variability, and ii) the tracy-widom law holds for n ≥ p, which poses an unnecessary restriction on our analysis.
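a quick numerical check of the variance-stabilizing property of the anscombe transform (1); the rate λ = 25 is an arbitrary choice.

import numpy as np

rng = np.random.default_rng(1)
lam = 25.0                         # hypothetical poisson rate
x = rng.poisson(lam, size=100_000)

# anscombe transform (1): a: x -> 2*sqrt(x + 3/8)
a = 2.0*np.sqrt(x + 3.0/8.0)
print(x.std())   # ~sqrt(lam) = 5, i.e. it depends on the rate
print(a.std())   # ~1, nearly independent of the rate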
to sum up, in the case of finite n, and especially when n ≲ p ≪ ∞ and the correlation matrix becomes singular, the existing analytical results are insufficient. therefore, we performed a number of monte-carlo simulations to derive the sample distributions of the eigenvalues for data matrices with different n: [4, 8, ..., 1024] and p: [4, 8, 12, 16, 32, 64]. for each realization of the data matrix d we computed, by means of svd, the spectrum of eigenvalues. for each pair of n and p, the number of such random realizations was 10⁷. in figure 3 we present the probability distributions of the non-singular eigenvalues for some values of n and p. the goal of this monte-carlo experiment was to estimate the one-sided confidence limits ev_i^p of the sorted eigenvalues ev_i, i: [1, n], for a typical set of α values: α: [0.1, 10⁻², 10⁻³, 10⁻⁴, 10⁻⁵]. in figure 4 these confidence limits are presented as functions of n.

fig. 3: probability distributions of the individual sorted non-singular eigenvalues of sample correlation matrices for poisson noise time series, presented for some values of the number of time series p and lengths n. the leftmost panels correspond to cases n < p, when the correlation matrices are p − n + 1 times singular and only the n − 1 largest eigenvalues are non-zero. the rightmost panels correspond to asymptotic cases p ≪ n → ∞; in this case the eigenvalues are densely and symmetrically grouped around one, as expected from the marcenko-pastur density function [5]

fig. 4: upper confidence limits for the individual sorted non-singular eigenvalues of sample correlation matrices, as derived from the corresponding probability distributions presented in figure 3. the solid lines correspond to p = 0.5, representing therefore the most probable values of the eigenvalues, while the two dotted lines in each plot correspond to p = 10⁻³ and p = 10⁻⁵
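the following sketch reproduces the monte-carlo procedure on a much smaller scale (a single (n, p) pair and far fewer than the 10⁷ realizations used above); the sample quantiles of the sorted eigenvalues give the one-sided confidence limits directly.

import numpy as np

rng = np.random.default_rng(2)
n, p, nreal, lam = 16, 8, 20_000, 20.0   # scaled-down grid, hypothetical rate

evs = np.empty((nreal, p))
for k in range(nreal):
    x = rng.poisson(lam, size=(p, n)).astype(float)
    a = 2.0*np.sqrt(x + 3.0/8.0)          # anscombe transform (1)
    cm = np.corrcoef(a)                   # p x p sample correlation matrix
    evs[k] = np.sort(np.linalg.eigvalsh(cm))[::-1]

# one-sided upper limits for each sorted eigenvalue
for alpha in (0.1, 1e-3):
    print(alpha, np.quantile(evs, 1.0 - alpha, axis=0))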
4 the quest for one-dimensional spectral variability
again, mathematically speaking, one-dimensional spectral variability during some time interval [t1, t2] takes place if a correlation matrix built from the multichannel time histories over this time interval has only one non-zero eigenvalue. in a real case, when a poisson background is present, there will be no zero eigenvalues. replacing "one non-zero eigenvalue" by "one significant eigenvalue" would not be correct, because if the s/n ratio is low, then even if the first few eigenvalues are quite comparable, only the first one could eventually be significant. thus, for the purposes of this analysis the following definition of one-dimensionality will be used:
1. the first two eigenvalues are significant with a predefined significance level α, and
2. their ratio ev(1/2) = ev(1)/ev(2) is greater than a predefined threshold.

none of the batse gamma-ray bursts that we have studied manifested one-dimensional spectral variability during its entire duration [3]. therefore, following the idea proposed in the introduction, we attempted segmentation, i.e. an analysis of sub-periods, to find out whether one-dimensional variability exists on shorter time scales somewhere during a gamma-ray burst.

4.1 segmentation: looking for intervals of one-dimensional spectral variability
gamma-ray bursts may last from a fraction of a second to hundreds of seconds, manifesting complicated temporal structures on different time scales. to identify periods of one-dimensional spectral variability in such a complex phenomenon we have, in principle, to use a truly redundant multi-scale approach. in the case of multi-scale correlation analysis this means that the correlation matrices should be sampled for all possible time windows [t_i, t_j], where i, j: [1..n], n being the total number of time intervals for a given grb. more precisely, the procedure of multi-scale correlation analysis is as follows:
• for a moment t and a time scale τ, the correlation matrix cm(t, τ) of size n × n is calculated from the light curves in the n energy channels over the time interval [t − τ/2, t + τ/2];
• the two largest eigenvalues ev(1, t, τ) and ev(2, t, τ) of cm(t, τ) are calculated;
• if both ev(1, t, τ) and ev(2, t, τ) are significant, then the ratio ev(1/2, t, τ) = ev(1)/ev(2) is calculated as a local measure of the variability dimension at moment t. if ev(1/2) ≫ 1, we conclude that the dimension of variability at moment t and time scale τ is equal to 1.

for the time scale τ = 1 s the result of this procedure is illustrated in figure 5. the procedure is then performed for a number of time scales, and for each moment t the scale τ_max which maximizes ev(1/2, t, τ) is determined:

\[ ev(1/2, t)_{\max} = \max_{\tau}\, ev(1/2, t, \tau) \qquad (2) \]

peaks of ev(1/2, t)_max mark time intervals [t − τ_max/2, t + τ_max/2] where one-dimensional variability is most prominent. in figure 6 the time histories of ev(1/2, t)_max and τ_max(t) are presented together with the original multi-channel light curves of grb 911118. the time history of τ_max(t) is stepwise because, for computational reasons, the calculations were performed for a limited number of logarithmically spaced values: τ = 0.064 · 2ⁿ s, where n = 0, ..., 7.

fig. 5: segmentation of the gb 911118 light curves into sub-intervals with one-dimensional variability, demonstrated using multi-scale correlation analysis on the single time scale τ = 1 s. the upper panel presents the time histories of the two largest eigenvalues ev(1, t) and ev(2, t). the ratio ev(1/2, t) for the same time scale τ = 1 s is presented in the middle panel. the peaks of this ratio mark the center points of sub-intervals with one-dimensional variability. the bottom panel shows, by means of dashed lines, the three most pronounced sub-intervals with one-dimensional variability in the original multi-channel light curves of gb 911118

fig. 6: segmentation of the gb 911118 light curves into sub-intervals with one-dimensional variability on multiple time scales using multi-scale correlation analysis. peaks of the ev(1/2, t)_max curve presented in the upper panel mark the middle points of these intervals. the corresponding time scale τ_max, presented in the middle panel, gives the width of these intervals. this is illustrated in the bottom panel, which shows, by means of dashed lines, the original multi-channel light curves of gb 911118 for 3 arbitrarily selected peaks of ev(1/2, t)_max
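a schematic implementation of the scan described by the bulleted procedure above; the function names are ours, the significance test of the eigenvalues against the section 3 limits is omitted for brevity, and "channels" is assumed to be an n-channel count-rate array binned at 0.016 s.

import numpy as np

def ev_ratio(channels, i0, i1):
    """largest-to-second eigenvalue ratio of the channel correlation
    matrix over the sample window [i0, i1)."""
    cm = np.corrcoef(channels[:, i0:i1])
    ev = np.sort(np.linalg.eigvalsh(cm))[::-1]
    return ev[0] / ev[1]

def msca(channels, dt=0.016, scales=0.064*2.0**np.arange(8)):
    """scan all moments and scales; return ev(1/2, t)_max and tau_max(t)."""
    n_ch, n = channels.shape
    best, tau_best = np.zeros(n), np.zeros(n)
    for tau in scales:
        w = max(3, int(round(tau / dt)))        # window length in samples
        for c in range(w//2, n - w//2):
            r = ev_ratio(channels, c - w//2, c + w//2)
            if r > best[c]:
                best[c], tau_best[c] = r, tau
    return best, tau_best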
4.2 the nature of the second and higher dimensions and hidden dynamics of spectral variability
figure 5 shows that there are sub-intervals of gb 911118 (e.g. around t = 3 s and t = 7 s) where the spectral variability is not one-dimensional at any time scale. there are two alternatives: i) the emission during such time intervals is of a different, two-dimensional, nature, or ii) emission with one-dimensional spectral variability is persistent along the entire grb, but there is an additional component, also with one-dimensional variability, which emerges in various parts of the light curve and then becomes negligible. in principle, a possible physical connection between emitters within one source should manifest itself via a correlation between their light curves. however, in the case of relativistic expansion these light curves would likely be shifted in time and energy in the observer's frame. this could add extra dimensions to otherwise low-dimensional spectral variability. to deal with this problem, we should allow lags between the individual light curves and look for the cm with the lowest rank r_cm over the continuum of lags. such an analysis was done for the entire duration of gb 911118 [8], and we found a combination of lags that makes r_cm ∼ 1 for the entire burst.

5 conclusions
we can see that the spectral variability in grbs is at most two-dimensional, and it may be possible to reduce it even to one dimension by introducing time lags between the light curves in different energy channels. this brings us to the hypothesis of a single emission source within a grb with a single dominating variable physical parameter and some kind of "echoes" shifted in time and energy for geometrical reasons. a natural physical framework for modeling this behavior would be an optically thick expanding photosphere. however, this picture is in contradiction with models that involve more than one emission mechanism, e.g. ryde et al. [9].

references
[1] chernenko, a., mitrofanov, i.: mnras, 274, 361–368 (1995).
[2] bagoly, z., et al.: a&a, 493, issue 1, 51–54 (2009).
[3] chernenko, a.: in the proceedings of the iau 8th asian-pacific regional meeting, edited by s. ikeuchi, j. hearnshaw, and t. hanawa, the astronomical society of japan, 2002, pp. 321–322.
[4] anscombe, f. j.: biometrika, 35, 246–254 (1948).
[5] marcenko, v. a., pastur, l. a.: math. ussr-sb. 1 507–1 536 (1967).
[6] johnstone, i. m.: ann. statist., volume 29, number 2 (2001), 295–327.
[7] tracy, c. a., widom, h.: comm. math. phys., 177, 727–754.
[8] chernenko, a.: in the proceedings of xxièmes rencontres de blois "windows on the universe", june 21–26, 2009, blois, france (2009).
[9] ryde, f., et al.: apj lett., 625, l95–l98 (2005).

anton chernenko
e-mail: anton@cgrsmx.iki.rssi.ru
space research institute
117 997, 84/32 profsoyuznaya str, moscow, russia

fracture surfaces of porous materials

t. ficker

abstract
a three-dimensional absolute profile parameter was used to characterize the height irregularities of the fracture surfaces of cement pastes. the dependence of these irregularities on porosity was studied and its non-linear character was proved. an analytical form of the detected non-linearity was suggested and then experimentally tested. the surface irregularities manifest scale-invariance properties.

keywords: roughness analysis, fracture surfaces, cement-based materials, confocal microscopy.

1 introduction
the morphology of fracture surfaces has been studied for a long time to reveal the details of fracture processes. however, in the research field of cementitious materials there have been only a restricted number of studies that deal with the surface features of fractured specimens.
some of the early surface studies of hydrated cement materials were focused on fractal properties [1,2], whereas others [3–5] investigated roughness numbers (rn) or similar surface characteristics [6–9]. when dealing with the roughness of the fracture surfaces of porous materials like cementitious materials, an important question arises, namely, what type of relationship there is between surface roughness and porosity. recently, ponson et al. [10,11] have pointed to a close relationship between these two quantities. the authors studied the roughness of the fracture surfaces of glass ceramics made of small glass beads sintered in bulk, with porosity p that could be varied in the interval (0, 0.3). they observed that the roughness of the fracture surfaces increases linearly with increasing porosity (see their figure 3.3 in [10]). it would be valuable to know whether such linear behavior is a general property of all porous materials, or only a specific feature of glass ceramics. for this reason we performed a large series of experiments with cement pastes. this material was chosen because its porosity can be easily controlled within a broad porosity interval by means of the water-to-cement ratio r = w/c. correct knowledge of the functional dependence of roughness on porosity may be useful for further surface studies of fractured porous materials.

2 experimental arrangement
ordinary portland cement cem 42,5 i r-sc of domestic provenance was used to create 108 specimens of hydrated pastes with six different water-to-cement ratios r (0.3, 0.4, 0.5, 0.6, 0.7, 0.8). the specimens were rotated during hydration to achieve better homogeneity. all specimens were stored for the whole time of hydration at 100 % rh and 20 °c. after 60 days of hydration the specimens were fractured in three-point bending tests and the fracture surfaces were immediately used for microscopic analysis. other parts of the specimens were used for porosity measurements and for further mechanical tests. porosity was determined by the common weight-volume method: the wet specimens were weighed and their volume was measured; then they were subjected to 105 °c for one week until their weight no longer changed, and the dry specimens were weighed again.

the 3d profile parameter h_a was used to characterize the roughness of the fracture surfaces of the hydrated cement pastes. in fact, h_a represents the averaged 'absolute' height of the fracture relief z = f(x, y):

\[ h_a = \frac{1}{l \cdot m} \iint_{(lm)} |f(x, y)|\, \mathrm{d}x\, \mathrm{d}y \qquad (1) \]

where l × m is the area of the vertical projection of the 3d fracture profile f(x, y) onto the plane xy. the parameter h_a has great statistical relevancy, since it is a global averaged characteristic covering the entire tested surface l × m. the 3d profiles f(x, y) were created using the olympus lext 3100 confocal microscope. one of these profiles is shown in figure 1. the profiles are formed by software that processes a series of optical sections created by the confocal microscope at various heights of the fracture surfaces. approximately 150 image sections were taken for each measured surface site, starting from the very bottom of the surface depressions (valleys) and proceeding to the very top of the surface protrusions (peaks).

fig. 1: 3d confocal relief of fractured cement paste
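in discretized form, eq. (1) is simply the mean absolute height over the sampled grid, since the pixel area cancels against l × m. a minimal sketch follows; the synthetic relief is a hypothetical stand-in for a confocal height map, and referencing the heights to the mean plane is our assumption (the text does not state the reference plane of f(x, y)).

import numpy as np

def h_a(z):
    """discretized eq. (1): the grid spacing cancels, leaving the mean
    absolute height over the sampled l x m field; heights are taken
    relative to the mean plane of the relief (our assumption)."""
    return np.abs(z - z.mean()).mean()

# hypothetical 1024 x 1024 relief over a 1280 um x 1280 um field
rng = np.random.default_rng(3)
z = rng.normal(0.0, 25.0, size=(1024, 1024))  # heights in micrometers
print(h_a(z))   # ~25*sqrt(2/pi) ~ 20 um for this gaussian stand-in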
the investigated area l × m = 1280 μm × 1280 μm (1024 × 1024 pixels) was chosen in five different places on each fracture surface (in the center, and in four positions near the corners of the rectangular area), i.e. each plotted point on the graphs of the profile parameters corresponds to an average value composed of 90 measurements (18 samples × 5 surface measurements). each measurement was performed for three different magnifications, namely 5×, 20× and 50×, giving 270 measurements for each particular r-value. since each site measurement amounts to about 150 optical sections (digital files), 40 500 files had to be processed to create the 270 digital maps per r-value. this resulted in 1 620 digital maps for all r-values altogether (6 r-values × 270 maps per r-value). these 1 620 digital fracture surfaces were then subjected to 3d profile surface analysis. in this way an extensive statistical ensemble was created to provide a sufficiently reliable basis for drawing relevant conclusions.

3 results and discussion
figure 2 presents the three dependences of h_a(p) obtained at the three magnifications 5×, 20× and 50×. their graphs do not manifest linear behavior. all three graphs show very similar shapes, which are well described by non-linear functions (curves in figure 2) that can be expressed as follows:

\[ h_a(p) = \frac{H_o}{(p_o - p)^{\beta}} + h_o \qquad (2) \]

this is in fact a power-law function pointing to fractal-like properties. relation (2) contains four positive fitting parameters, H_o, p_o, β and h_o, whose meanings can be explained on the basis of asymptotic patterns. firstly, p_o must always be greater than the variable p, otherwise function (2) would be a decreasing function, and this would contradict the experimental data in figure 2. this means that p_o is the limiting value of porosity p when the water-to-cement ratio r goes to infinity:

\[ p_o = \lim_{r \to \infty} p = 1 \qquad (3) \]

assuming p_o = 1 and p → 0 (a material with 'zero porosity'), the limiting roughness of the corresponding fracture surfaces reads

\[ \lim_{\substack{p \to 0 \\ p_o = 1}} h_a(p) = H_o + h_o \qquad (4) \]

fig. 2: the dependence h_a(p) between the roughness of the fracture surfaces and the porosity of the cement paste
fig. 3: normalized scale-invariant data h_a/max(h_a) in dependence on porosity p

from (4) it follows that the sum H_o + h_o represents the height irregularity at 'zero porosity'. since different magnifications provide different resolutions, 'zero porosity' will be determined differently for different magnifications, and thus the sum H_o + h_o will also vary with magnification. the asymptotic form of h_a(p) when the material contains a very small but non-zero porosity, e.g. 0 < p < 0.30, can also be useful for further consideration. in this case, the taylor expansion of (2) may be utilized:

\[ h_a(p) \approx \frac{H_o}{p_o^{\beta}} \left( 1 + \beta \frac{p}{p_o} \right) + h_o = a\,p + b \qquad (5) \]

\[ a = \frac{\beta H_o}{p_o^{1+\beta}}, \qquad b = \frac{H_o}{p_o^{\beta}} + h_o. \qquad (6) \]

it is obvious from (5) that at sufficiently small porosities the roughness of the fracture surfaces can be approximated by a linear function h_a(p) ≈ a·p + b, which is in agreement with the observation of ponson [10], whose experiments with glass ceramics (p < 0.3) indicated such behavior (figure 3.3 in ref. [10]). our data in figure 2 do not contain a sufficient number of experimental points in the region of small porosities (p < 0.35), within which linear behavior can be observed. however, the clearly non-linear behavior of the function h_a(p) at higher porosities is a real fact, illustrated by all the graphs in figure 2. the interval p ∈ (0.35, 0.50) is characterized by an abrupt non-linear increase, and at p > 0.50 rapid growth unmistakably continues further.
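a hedged sketch of fitting relation (2) with scipy; the (p, h_a) points below are hypothetical stand-ins for the measured data of figure 2, and the bound p_o > max(p) implements the requirement stated after (2). the last two lines evaluate the small-porosity linearization (5)–(6).

import numpy as np
from scipy.optimize import curve_fit

def model(p, Ho, po, beta, ho):
    """eq. (2): h_a(p) = Ho/(po - p)**beta + ho"""
    return Ho/(po - p)**beta + ho

# hypothetical (porosity, roughness) points standing in for figure 2
p  = np.array([0.30, 0.38, 0.45, 0.52, 0.58, 0.63])
ha = np.array([21.0, 24.0, 31.0, 45.0, 62.0, 85.0])   # micrometers

# keep po above the largest measured porosity, all parameters positive
popt, _ = curve_fit(model, p, ha, p0=(5.0, 1.0, 2.0, 15.0),
                    bounds=([0, 0.65, 0, 0], [np.inf, 2.0, 10, np.inf]))
Ho, po, beta, ho = popt
print(popt)

a = beta*Ho/po**(1 + beta)      # eq. (6)
b = Ho/po**beta + ho
print(a, b)                     # small-p linear approximation a*p + b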
with our specimens, these three intervals of porosity, p < 0.35, p ∈ (0.35, 0.50) and p > 0.50, correspond to the following water-to-cement ratios: r1 < 0.4, r2 ∈ (0.4, 0.6) and r3 > 0.6, respectively. it is well known that at sufficiently small water-to-cement ratios (r1 < 0.4), gel porosity dominates over capillary porosity in hydrated cement pastes. within the specimens mixed with intermediate r-values, r2 ∈ (0.4, 0.6), capillary porosity starts to assume a governing role, and at still higher r-values, r3 > 0.6, the porosity of the specimens is predominantly formed by capillary porosity. the overall non-linear surface irregularity (2) of cementitious materials seems to be a result of the interplay between gel and capillary porosities. as soon as the onset of capillary porosity appears, the height irregularities of the fracture surfaces increase abruptly, as can be seen in all the graphs in figure 2, and still more straightforwardly in the two graphs in figure 3, where the boundary between the gel and the capillary porosities can be recognized at p ≈ 0.3. this effect can be characterized as a crossover to larger scale porosities.

it is interesting to note that the shapes of the graphs of the functions h_a(p) shown in figure 2 are mutually very similar, regardless of the magnification used. in this connection, the question of their scale invariance arises. this means that the dependence h_a(p) can be described by the same functional type (2), but with the fitting parameters H_o, p_o, β, h_o adapted to the results of the particular magnification. however, when these three functions are normalized by the highest measured values max(h_a), a unified function results (figure 3a). using the pattern of the unified function, the linear behavior of the surface roughness at small porosities can be illustrated straightforwardly (see figure 3b). the graph of the unified function shown in figure 3a represents the results of all three magnifications used here (5×, 20× and 50×), and it simultaneously manifests the scale invariant properties of the height irregularities of the fracture surfaces.

4 conclusion
the experiments have proved a close relationship between the roughness (height irregularities) of fracture surfaces and the porosity of materials. the functional relation of these two quantities is generally non-linear and may be described by the power-law function (2). when normalized, this function becomes independent of the magnification that is used. the non-linear behavior of h_a(p) with cement pastes is the result of the influence of gel and capillary porosities. the region of small gel porosities is characterized by a moderate (almost linear) increase in surface roughness h_a(p), whereas the region of capillary porosity is a domain with an abrupt increase in this quantity. the presented properties of the surface roughness of fractured cement pastes may also be useful for morphological and structural studies of other porous materials.

acknowledgement
this work was supported by the ministry of the czech republic under contract no. me09046 (kontakt).

references
[1] lange, d. a., jennings, h. m., shah, s. p.: a fractal approach to understanding cement paste microstructure, ceram. trans. 16 (1992) 347–363.
[2] issa, m. a., hammad, a. m.: fractal characterization of fracture surfaces in mortar, cem. concr. res. 23 (1993) 7–12.
[3] lange, d. a., jennings, h. m., shah, s. p.: analysis of surface-roughness using confocal microscopy, j. mater. sci.
28 (14) (1993) 3879–3884.
[4] lange, d. a., jennings, h. m., shah, s. p.: relationship between fracture surface roughness and fracture behavior of cement paste and mortar, j. am. ceram. soc. 76 (3) (1993) 589–597.
[5] zampini, d., jennings, h. m., shah, s. p.: characterization of the paste-aggregate interfacial transition zone surface-roughness and its relationship to the fracture-toughness of concrete, j. mater. sci. 30 (12) (1995) 3149–3154.
[6] lange, d. a., quyang, c., shah, s. p.: behavior of cement-based matrices reinforced by randomly dispersed microfibers, adv. cem. bas. mater. 3 (1) (1996) 20–30.
[7] abell, a. b., lange, d. a.: fracture mechanics modeling using images of fracture surfaces, int. j. solids structures 35 (31–32) (1997) 4025–4034.
[8] nichols, a. b., lange, d. a.: 3d surface image analysis for fracture modeling of cement-based materials, cem. concr. res. 36 (2006) 1098–1107.
[9] ficker, t., martišek, d., jennings, h. m.: roughness of fracture surfaces and compressive strength of hydrated cement pastes, cem. concr. res. 40 (2010) 947–955.
[10] ponson, l.: crack propagation in disordered materials; how to decipher fracture surfaces, annales de physique 32 (2007) 1–120.
[11] ponson, l., auradou, h., pessel, m., lazarus, v., hulin, j. p.: failure mechanisms and surface roughness statistics of fractured fontainebleau sandstone, phys. rev. e 76 (2007) 036108/1–036108/7.

prof. rndr. tomáš ficker, drsc.
phone: +420 541 147 661
e-mail: ficker.t@fce.vutbr.cz
department of physics
faculty of civil engineering
brno university of technology
veveří 95, 662 37 brno, czech republic

modularity of pressing tools for screw press producing solid biofuels

miloš matúš, peter križan
slovak university of technology in bratislava, faculty of mechanical engineering, usetm, nam. slobody 17, 812 31 bratislava, slovak republic
correspondence to: milos.matus@stuba.sk

abstract
this paper focuses on the development of the newly-patented structure of a screw briquetting machine for compacting biomass into a solid biofuel. the design of the machine is based on the results of a comprehensive study of the complicated process of biomass compaction. the patented structure meets two main goals: the elimination of axial forces, leading to an increased lifetime of the bearings, and a new modular design of the pressing chamber and tools, with their geometry based on the application of a mathematical model.

keywords: biomass, screw press, compaction process.

1 introduction
briquetting biomass by means of screw pressing is a progressive technology in the production of solid biofuels. screw presses offer great advantages in producing pressings of high quality, and they represent the most current and promising technology for compacting biomass to produce briquettes. research to develop this technology further is a logical step. increasing the tool life will reduce operating costs, and will lead to lower prices for the final product, making biomass use a more viable application. compacting biomass, as a live material, is a very difficult process, and it is necessary to control the important construction parameters for different kinds of biomass. the newly-patented structure of a screw briquetting machine was designed in our department for this purpose.

2 new design of a screw press
research results for the parameters influencing the process of biomass compaction were applied in developing the design of a new screw press.
the design that has been developed enables the technological and structural parameters of the compaction process to be controlled in order to achieve high quality output with various input factors. the requirements for the development of the structure were as follows:
• eliminate the axial load on the bearings, and thus increase their life,
• optimize the tools in terms of shape and material properties to increase the efficiency of the compaction process and to increase their lifetime,
• achieve high modularity of the machine,
• enable all important parameters of the compaction process to be managed and controlled,
• ensure that the pressing screw and the pressing nozzle can be exchanged rapidly,
• produce pressings of various shapes and sizes.

the comprehensive design of the new screw press is shown in figure 1. it is a double chamber, two-sided design, allowing quality production of pressings from various organic materials thanks to the control options for each significant parameter of the compaction process, e.g. compacting pressure, pressing temperature, pressing speed, cooling intensity of the pressings, rapid exchange of worn tools, and required changes of tool geometry. the structure is also equipped with sensors that provide feedback on the compaction process. this screw machine is a universal machine for producing solid high-grade biofuels from a variety of raw materials. the machine consists of one common main drive, a special spindle bearing that captures the axial workload and defines the exact position of the pressing screw in the pressing chamber, two identical pressing chambers with tools, two filling systems, and two cooling channels.

single screw extruders are characterized by the very short lifetime of their thrust bearings, or by their large dimensions. the main objective of the two-sided design of the press is to eliminate the axial workload resulting from pressing the material, and thus progressively increase the lifetime of the bearings. the bearings are loaded only with the difference of the axial pressing forces caused by asymmetric filling of the material into the pressing chambers. the whole workload is transmitted along the axis of the machine through the outer flange system (tensile stress) and the continuous solid spindle with pressing screws on its ends (compressive stress). this system eliminates the need for a massive frame to transfer the load, or for anchoring the machine. the construction of the machine is designed for secure transmission of an axial load up to 520 kn (the pressing machine is able to develop 265 mpa for a briquette diameter of 50 mm). the modularity of the double chamber machine enables a single chamber machine to be created very easily and quickly by removing a part of the press (the whole side away from the drive) without any other modifications. the single chamber design is used especially for optimization experiments and measurements, because it allows the full operating load to be measured.

figure 1: newly-patented design of a screw press
figure 2: pressing chamber with tools (1 – feeding screw, 2 – pressing screw, 3 – pressing chamber, 4 – nozzles)
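as a quick consistency check of the quoted figures, the axial force follows from the pressing pressure and the briquette cross-section, f = p · πd²/4:

import math

p = 265e6        # maximum pressing pressure [pa]
d = 0.050        # briquette diameter [m]
area = math.pi * d**2 / 4
print(p * area)  # ~5.2e5 n, i.e. the quoted 520 kn axial load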
3 design of the pressing chamber
the pressing chamber must be strong enough to withstand the internal pressure during pressing. it consists of the body of the pressing chamber, the feed screw, the pressing screw and the individual nozzles (figure 2). high material and geometric requirements are placed on the tools inside the pressing chamber. the material requirements include high abrasion resistance, toughness and thermal stability. the geometrical requirements are complicated, and vary according to the type of raw material. the basic geometrical requirement is to ensure an increase in material pressure during compaction. in addition, the geometry of the tool must generate axial movement of the material to ensure the continuity of the compacting process. the pressing chamber is coated by heating devices to control the pressing temperature, which is the most important parameter in the biomass compaction process. the chosen design of the heating system provides direct measurement and control of the pressing temperature, up to 350 °c. during optimization of the compacting process it is possible to change the shape and size of the pressing by simply and quickly changing the tool (screws and pressing nozzles), the inner diameter of the pressing chamber, the length of the pressing chamber, the combination of tool materials, the taper of the pressing chamber, etc. the structural parameters of the pressing tools are optimized experimentally, i.e. over the whole compaction process for different types of raw materials.

4 design of the pressing screw
the geometry of the pressing screw (figure 3) ensures a high degree of material compaction in the pressing chamber and compression of the material through the nozzle, thus achieving a compact briquette of high density, strength and surface quality. the movement of the material, the compression, the rate of wear and the stress distribution depend primarily on the chosen geometry of the screw. it is therefore extremely important to pay close attention to the design of the screw geometry.

figure 3: monolithic pressing screw (1 – working thread part of the screw, 2 – tip)

we obtained a mathematical model (1), describing the dependence of the compacting pressure on the geometry of the screw, by deriving the theory for the speed and power relations in the screw:

\[ p = p_0 \cdot e^{\,a \cdot \frac{l}{D}} \qquad (1) \]

this mathematical model is very important in designing the geometry of the pressing screw (p – working pressure, p_0 – initial pressure, a – constant of proportionality, l – active length of the screw, D – outer diameter of the screw). the relationship shows that the pressure along the screw profile is exponentially dependent on the length of the screw. the constant of proportionality (a) depends on the geometry of the screw profile, on the friction between the material and the nozzle (f_p), and on the friction between the material and the screw (f_z). the condition of a sharp pressure increase requires that the friction coefficient (f_p) be as large as possible, while the coefficient (f_z) should be as small as possible. the coefficient (f_z) is strongly influenced by the surface quality of the screw; the goal is to achieve minimum surface roughness. the coefficient (f_p) can be increased by increasing the nozzle roughness or by machining grooves into the surface of the nozzles in the direction of the screw axis. grooving not only increases the friction, but also prevents rotation of the material, invoking so-called axial block flow.
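a small numerical illustration of model (1); the initial pressure is the value quoted in figure 7, while the thread pitch and the three proportionality constants are hypothetical, chosen only to show how strongly the constant a drives the exponential pressure rise along the screw.

import numpy as np

p0 = 1.05e6       # initial pressure [pa] (the value used in figure 7)
D = 0.070         # outer screw diameter [m]
pitch = 0.040     # hypothetical thread pitch [m]
a = {"variant a": 0.6, "variant b": 0.8, "variant c": 1.0}  # hypothetical

for name, ai in a.items():
    l = pitch * np.arange(1, 6)        # active length after 1..5 threads
    p = p0 * np.exp(ai * l / D)        # model (1)
    print(name, np.round(p / 1e6, 2))  # working pressure in mpa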
the application of this theory will be demonstrated on three geometric variants of pressing screws (figures 4, 5, 6). for simplicity in comparing the different variants, the input parameters are the same for each screw:
• outer diameter of the screw (D = 70 mm)
• inner diameter of the screw (d = 40 mm)
• cross-sectional area of the thread (s = 325 mm²)
• width of the screw guide surface (e = 5 mm)
• number of threads (z = 3)

figure 4: screw section of variant a
figure 5: screw section of variant b
figure 6: screw section of variant c
figure 7: relationship between pressure and the number of screw threads at a constant initial pressure (p_0 = 1.05 mpa)

figure 7 shows that the geometry of variant a produces the smallest increase in working pressure. on the other hand, the geometry of variant c causes the working pressure to increase the most (with respect to the number of threads). for each variant the thread area is constant; only the pitch of the screw changes in accordance with the screw profile. for the condition of a sharp increase in working pressure, the thread geometry of variant c is the most suitable. however, it cannot be definitively held that this thread profile is optimal in all respects; it is certainly necessary to verify the assumption made above also against other considerations: strength, technological factors, and cost.

the pressing screw is the most stressed component in the machine, with the highest level of wear. it is subject to high pressure, abrasion and heat. the critical parts of the screw are the tip (figure 3) and the first 1.5 revolutions of the thread, as shown by the workload distribution on the tool. the degree of load and tool wear also depends on the raw material. further research and optimization is therefore under preparation for several sets of tools made from a variety of special steels, coated tools, and tools with hard-ground threads. pressing screws are designed as monolithic (figure 3) as well as folded (figures 8, 9), the latter of which can have a rotating tip to reduce friction. a folded screw can have each part made of a different material, which reduces costs. the modularity of the tool enables optimization of the compaction efficiency for different types of raw materials at the lowest tool cost. a worn pressing screw can be replaced, or the raw material changed, very rapidly thanks to the use of a supermagnet; it is only necessary to insert the screw.

figure 8: folded pressing screw with a rotating tip
figure 9: folded pressing screw with a fixed tip
figure 10: pelleting nozzle
figure 11: pressings made by a single machine

5 pressing nozzle
the individual nozzles within the pressing chamber must copy the shape of the thread of the screws and prevent rotation of the material, while simultaneously allowing it to move axially. the nozzle geometry is determined in such a way that it copies the thread from the pressing chamber to its tip, while gradually transforming the material into the desired product shape, i.e. a briquette. the nozzles are highly stressed by the compressive pressure, by heat and, most importantly, by abrasion. the surface of the material must therefore be very hard and resistant to abrasion, while internally the material must be relatively ductile. several pressing nozzle geometries were developed within the research project, with different lengths, diameters and tapers. these nozzles, like the screws, are made of different materials so that they can be used for different types of raw materials, and they can be exchanged very quickly and easily thanks to the cassette system. the pressing nozzles determine the size and shape of the resulting pressings by their shape and design. the pressing nozzle in the chamber can be exchanged so as to produce several pressings at the same time (figures 10, 11).
in this way, the same machine can produce briquettes or pellets by simply changing the pressing chamber. by making a simple change, it is possible to produce pressings of various shapes: cylindrical, elliptical, polygonal, etc.

6 modularity of tools in the pressing chamber
the modularity of the pressing chamber (table 1) of the screw press enables the compaction process for different types of raw materials to be optimized easily and quickly by exchanging the tools. the optimization criteria are: quality of production, process efficiency, energy costs, tool costs, and tool lifetime. a major advantage is that pressings of various shapes and sizes can be made using a single machine.

table 1: scheme of the pressing chamber modularity of the screw press

| modularity | material | geometry | construction | technologies |
| pressing screw | steel; coating; hard-grinding | length of screw; diameter of screw; taper of screw; profile of thread | monolithic; folded: a) fixed tip, b) rotating tip | briquetting; pelleting |
| pressing nozzles | steel; coating; hard-grinding | length of nozzle; diameter of nozzle; taper of nozzle; different sections of pressings (circle, ring, ellipse, polygon, etc.); shape of pressings; number of pressings | monolithic; folded; briquetting nozzle (one pressing); pelleting nozzle (more than one pressing) | briquetting; pelleting |

7 conclusion
research on biomass compaction indicated a need for a compacting machine with a modular design, in which all significant parameters of the compaction process can be controlled. the aim of this paper has been to present a newly-patented screw press design that satisfies all requirements for modularity and control of the parameters. it enables the process to be optimized for different types of raw materials, and high quality production to be achieved. the results of an experimental study of the compacting process led to the engineering design of a production machine, tailor-made to the customer's requirements, that is able to minimize the costs for investment, energy and operation. the design of the screw press is unique in its modularity and high reliability. six patents and utility models protecting the intellectual property rights of the authors have been taken out for this screw press.

acknowledgement
this paper is an outcome of the project "development of progressive biomass compacting technology and production of prototype and high-productive tools" (itms project code: 26240220017), on the basis of support funding from the european regional development fund's research and development operational programme.

maxwell-chern-simons models: their symmetries, exact solutions and non-relativistic limits

j. niederle, a. g. nikitin, o. kuriksha

abstract
two maxwell-chern-simons (mcs) models in (1+3)-dimensional space-time are discussed and families of their exact solutions are found. in contrast to the carroll-field-jackiw (cfj) model [2], these systems are relativistically invariant and include the cfj model as a particular sector.
using the inönü-wigner contraction, a galilei-invariant non-relativistic limit of the systems is found, which makes it possible to find a galilean formulation of the cfj model.

1 introduction
there are three motivations for the present paper. first, we search for four-dimensional formulations of maxwell-chern-simons models [1]. secondly, we look for relativistic- and galilei-invariant versions of the carroll-field-jackiw electrodynamics [2]. thirdly, we construct a relativistic counterpart of the galilei-invariant equations for vector fields proposed in our paper [3].

there are two symmetries of maxwell's electrodynamics that have dominated all fundamental physical theories, namely, gauge and lorentz invariance. they provide physical principles that guide the invention of models describing fundamental phenomena. in the first place, the properties of electromagnetic radiation (in a natural setting and in accelerators) are described by lorentz-invariant dynamics. on the other hand, gauge invariance is possible for massless fields only, and so it is validated by the stringent limits on the photon mass m_γ. possible breaking of lorentz and gauge invariance has been tested within theoretical frameworks with symmetry breaking parameters. the violation of gauge invariance was tested in the framework of two modifications of the maxwell theory. in the first of them, the lagrangian of the free e.m. field, \( l_{em} = -\frac{1}{4} f_{\mu\nu} f^{\mu\nu} \), is modified to

\[ l_{em} \to l_{em} + \frac{m_\gamma^2}{2}\, a_\nu a^\nu \]

where \( f_{\mu\nu} = \partial_\mu a_\nu - \partial_\nu a_\mu \) is the tensor of the e.m. field and a^ν is the four-vector of the photon field. this field a_ν is massive, and so the gauge invariance is lost. the breaking of gauge invariance caused by the presence of the massive term m_γ² has been tested with geomagnetic and galactic magnetic data. as a result, the limits on the parameter m_γ have been obtained in the form m_γ ≤ 3 · 10⁻²⁴ and m_γ ≤ 3 · 10⁻³⁶, correspondingly.

the other modification of the maxwell lagrangian was proposed by carroll, field and jackiw:

\[ l_{em} \to l = l_{em} + l_{cs} \qquad (1) \]

where l_cs is a four-dimensional version of the chern-simons term:

\[ l_{cs} = \frac{1}{4}\, \varepsilon_{\mu\nu\rho\sigma}\, p^{\mu} a^{\nu} f^{\rho\sigma}. \qquad (2) \]

here p_μ is a constant vector which causes the violation of lorentz invariance. the cfj model presents a rather elegant and convenient way of testing possible violations of lorentz invariance, which accounts for its large impact. but this model has a disadvantage of principle, namely, the breaking of lorentz invariance "by hand", and the additional constants p_μ have no physical meaning. in addition, this model is invariant with respect to neither the lorentz nor the poincaré group, i.e. it does not satisfy any relativity principle accepted in physics. in the following sections we discuss two dynamical versions of the cfj model which contain the ordinary cfj model as a particular sector. one of them is very similar to axion electrodynamics in which, however, the axion rest mass is zero. the other includes axion electrodynamics as a limiting case corresponding to a small coupling constant.

2 the mcs model in (1+3)-dimensional minkowski space
let us start with the following lagrangian:

\[ l = -\frac{1}{4} f_{\mu\nu} f^{\mu\nu} - \frac{1}{2} f_{\mu} f^{\mu} - \frac{\kappa}{4}\, \varepsilon_{\mu\nu\rho\sigma}\, f^{\mu} a^{\nu} f^{\rho\sigma} + e\, j_{\mu} a^{\mu} + q\, j_4 a^4. \qquad (3) \]

here the greek indices run over the values 0, 1, 2, 3, and a^ν and f^{μν} = ∂^μ a^ν − ∂^ν a^μ are the four-vector potential and the tensor of the electromagnetic field, respectively. in addition, lagrangian (3) includes a scalar potential a^4 and its derivatives \( f^{\mu} = \frac{\partial a^4}{\partial x_\mu} \), as well as a scalar current j_4. lagrangian (3) has the following nice properties.
• it is transparently invariant w.r.t. lorentz transformations and shifts of the independent variables. moreover, since j_4 is a scalar, it is possible to introduce an additional charge q, which is not necessarily equal to the electric charge e but is in general a coupling constant corresponding to some interaction that is not necessarily purely electromagnetic.
• the corresponding energy-momentum tensor does not depend on the parameter κ and so is not affected by the term \( -\frac{\kappa}{4}\varepsilon_{\mu\nu\rho\sigma} f^{\mu} a^{\nu} f^{\rho\sigma} \). more precisely, this tensor has the following form:

\[ t^{00} = \frac{1}{2}\left( \mathbf{e}^2 + \mathbf{b}^2 + f_0^2 + \mathbf{f}^2 \right), \qquad t^{0a} = \frac{1}{2}\left( \varepsilon^{abc} e_b b_c + f^0 f^a \right), \qquad (4) \]

where e, b and f are three-vectors whose components are e_a = f_{0a}, b_a = ½ ε_{abc} f_{bc} and f_a.
• lagrangian (3) includes both the field components f^{μν}, f^ν and the potentials a^μ. but in spite of the explicit dependence on the potentials, the lagrangian admits the gauge transformations a^μ → a^μ + ∂^μ φ, since under them it changes only by a mere surface term: \( l \to l + \partial_\mu\!\left( \varphi\, \varepsilon^{\mu\nu\rho\sigma} f_{\nu\rho} f_{\sigma} \right) \). in addition, this lagrangian is not affected by the change a^4 → a^4 + c, where c is a constant.
• in the non-relativistic approximation, lagrangian (3) reduces to the galilei-invariant lagrangian for the irreducible galilean field discussed in [3].

3 field equations
let us consider the euler-lagrange equations which correspond to lagrangian (3):

\[ \partial_\nu f^{\mu\nu} + \kappa\, f_\nu \tilde f^{\mu\nu} = e\, j^{\mu}, \qquad (5) \]
\[ \partial_\nu f^{\nu} + \frac{\kappa}{2}\, f_{\lambda\nu} \tilde f^{\lambda\nu} = q\, j_4. \qquad (6) \]

here \( \tilde f^{\mu\nu} = \frac{1}{2}\varepsilon^{\mu\nu\rho\sigma} f_{\rho\sigma} \) is the dual tensor of the electromagnetic field f^{μν}. the field variables \( \tilde f^{\mu\nu} \) and f^μ satisfy the conditions

\[ \partial_\nu \tilde f^{\mu\nu} = 0, \qquad \partial_\nu f_\mu - \partial_\mu f_\nu = 0, \qquad (7) \]

in accordance with their definitions as derivatives of the potentials. equations (5)–(7) involve ten field variables, i.e., the six-component tensor f^{μν} and the four-vector f^μ. all these variables are dynamical and of equal value. if q = 0, then equations (5)–(7) reduce to the field equations of axion electrodynamics [9] with zero axion mass.

first, let us note that equations (5)–(7) cover the system of equations proposed by carroll, field and jackiw [2]. indeed, the system (5)–(7) is compatible with the additional condition

\[ \partial_\nu f_\mu = 0. \qquad (8) \]

if condition (8) is imposed, then f_μ = p_μ, where the p_μ are constants. substituting this solution into (5), we obtain the cfj equations. concerning our additional equation (6), for constant p_μ it reduces to a definition of j_4. note that equations (5) with variable (i.e., non-constant) p_μ were discussed in [7] in the framework of the premetric approach [8]. however, f^μ is treated there as an external axion field, while in our model it is a dynamical variable satisfying the evolution equation (6) and the constraints (7).
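the first of conditions (7) is the usual bianchi identity and can be checked symbolically. the sketch below verifies, for an arbitrary smooth potential a_μ, that ε^{μνρσ} ∂_ν ∂_ρ a_σ vanishes; raising indices with the minkowski metric only introduces signs, so this is equivalent to ∂_ν f̃^{μν} = 0.

import itertools
import sympy as sp

t, x, y, z = sp.symbols('t x y z')
xs = (t, x, y, z)
a = [sp.Function(f'a{m}')(*xs) for m in range(4)]   # arbitrary potential

# bianchi identity: eps^{mu nu rho sigma} d_nu d_rho a_sigma = 0, since
# the symbol is antisymmetric while the two derivatives commute
for mu in range(4):
    expr = sum(sp.LeviCivita(mu, n, r, s) * sp.diff(a[s], xs[r], xs[n])
               for n, r, s in itertools.product(range(4), repeat=3))
    print(mu, sp.simplify(expr))   # prints 0 for every mu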
The related field equations for the antisymmetric tensor $F^{\mu\nu}$ are

$$\partial_\nu F^{\mu\nu} + \kappa F_\nu\tilde F^{\mu\nu} = e\,j^\mu, \qquad (12)$$
$$\partial_\nu\tilde F^{\mu\nu} - \kappa F_\nu F^{\mu\nu} = 0, \qquad (13)$$

where $F^\mu = \partial^\mu A^4$ and $A^4$ is a scalar potential satisfying the nonlinear equation

$$\partial_\nu\partial^\nu A^4 = \kappa\left(F_{\mu\nu}F^{\mu\nu}\sin(\kappa A^4) - F_{\mu\nu}\tilde F^{\mu\nu}\cos(\kappa A^4)\right). \qquad (14)$$

It is easy to verify that tensor (10) satisfies the continuity equation $\partial_\mu T_1^{\mu\nu} = 0$, provided $F^{\mu\nu}$ solves the field equations (12)–(13) with zero current $j^\mu = 0$.

Like (5)–(7), equations (12)–(14) admit a Lagrangian formulation. To construct the related Lagrangian we express $F^{\mu\nu}$ via the potential $A = (A^0, A^1, A^2, A^3, A^4)$ in a non-linear fashion:

$$F^{\mu\nu} = (\partial^\mu A^\nu - \partial^\nu A^\mu)\cos(\kappa A^4) - \frac{1}{2}\varepsilon^{\mu\nu}{}_{\lambda\sigma}(\partial^\lambda A^\sigma - \partial^\sigma A^\lambda)\sin(\kappa A^4). \qquad (15)$$

The ansatz (15) converts equation (13) into an identity, which plays the role of the Bianchi identity in our model. Note that this identity is essentially nonlinear.

Using definition (15), we can write a Lagrangian for the system (12)–(14):

$$L = \frac{1}{4}\left(F_{\mu\nu}F^{\mu\nu}\cos(\kappa A^4) + F_{\mu\nu}\tilde F^{\mu\nu}\sin(\kappa A^4)\right) + \frac{1}{2}F_\mu F^\mu. \qquad (16)$$

Varying (16) with respect to $A^\mu$ and $A^4$, one obtains equations (12) and (14), respectively.

For small values of the parameter $\kappa$ and bounded $A^4$ it is possible to expand Lagrangian (16) and the related equations (12)–(14) in a power series in $\kappa$. Neglecting the terms whose order in $\kappa$ is higher than one, we obtain Lagrangian (3). In other words, the model considered in Sections 2 and 3 can be treated as a first approximation of the model based on Lagrangian (16).

5 Continuous and discrete symmetries

Equations (5)–(7) (or (24)) are transparently invariant with respect to the Poincaré group. Nevertheless, we examined them using the tools of Lie analysis and found their maximal invariance group. We will not present the details of this routine procedure, whose algorithm can be found in [10], but only formulate its result: equations (5)–(7) are invariant with respect to the 11-parameter extended Poincaré group $\tilde P(1,3)$, whose infinitesimal generators are

$$P_\mu = \partial_\mu, \qquad J_{\mu\nu} = x_\mu\partial_\nu - x_\nu\partial_\mu + S_{\mu\nu}, \qquad D = x_\mu\partial^\mu - I - j_\mu\partial_{j_\mu}. \qquad (17)$$

Here $I = F_\mu\partial_{F_\mu} + F_{\mu\nu}\partial_{F_{\mu\nu}}$ is an operator which acts on the field variables as the unit operator, and the $S_{\mu\nu}$ are generators of the Lorentz group acting on the field variables and currents:

$$S_{ab} = K_{ab} - K_{ba}, \qquad K_{ab} = F_a\partial_{F_b} + E_a\partial_{E_b} + H_a\partial_{H_b} + j_a\partial_{j_b}, \quad a, b \neq 0,$$
$$S_{0a} = F_0\partial_{F_a} + F_a\partial_{F_0} + \varepsilon_{abc}\left(E_b\partial_{H_c} - H_b\partial_{E_c}\right). \qquad (18)$$

Thus, in contrast to the CFJ model, the considered MCS model in (1+3)-dimensional Minkowski space is invariant with respect to the extended Poincaré group. Note that the additional condition (8) is also invariant with respect to this group; violation of Lorentz invariance takes place only after fixing particular constant solutions $F_\mu = p_\mu$.

Equations (24) are also invariant with respect to the discrete transformations of space reflection P and time inversion T. Moreover, $F^\mu$ is a pseudovector, and so the potential $A^4$ is a pseudoscalar.

In an analogous way we have found the maximal Lie group and the discrete symmetries admitted by the system (12)–(14). It turns out that the symmetry of this system is completely analogous to the symmetry of equations (5)–(7). Namely, system (12)–(14) is invariant with respect to the extended Poincaré group $\tilde P(1,3)$, whose generators are given in equations (17), and admits the discrete symmetry transformations P, C and T.

6 Non-relativistic limit

The correct definition of the non-relativistic limit is by no means a simple problem in general, and in the case of theories of massless fields in particular; see, for example, [12].
A necessary (but not sufficient) condition for obtaining a consistent non-relativistic limit of a relativistic theory is to take care that the limiting theory be in agreement with the principle of Galilean relativity [11].

To find a non-relativistic limit of equations (24) we use the Inönü-Wigner contraction [13], which guarantees the Galilean symmetry of the limiting theory. Namely, we start with the representation of the Poincaré algebra which is realized on the set of solutions of equations (24), and contract it to a representation of the Galilei algebra. Then, using the contraction matrix, we find the Galilean limits of Lagrangian (3) and of system (24).

The tensor field $F^{\mu\nu}$ and the vector field $F^\mu$ transform in accordance with the representation

$$[D(0,1) \oplus D(1,0)] \oplus D(1/2,1/2) \qquad (19)$$

of the Lorentz group. Contractions of this representation (and also of all irreducible representations involved in the direct sum (19)) to indecomposable representations of the homogeneous Galilei group HG(1,3) were discussed in [14] and [15].

The contraction of (19) to the indecomposable representation of HG(1,3) reduces to the following procedure. First, let us represent the field variables as a ten-component vector

$$\psi = \mathrm{column}\left(F_{01}, F_{02}, F_{03}, F_{23}, F_{31}, F_{12}, F_1, F_2, F_3, F_0\right). \qquad (20)$$

Then the Lorentz generators (18) act on $\psi$ as the following matrices:

$$S_{ab} = \varepsilon_{abc}\begin{pmatrix} S_c & 0 & 0 & 0 \\ 0 & S_c & 0 & 0 \\ 0 & 0 & S_c & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}; \qquad S_{0a} = \begin{pmatrix} 0 & -S_a & 0 & 0 \\ S_a & 0 & 0 & 0 \\ 0 & 0 & 0 & K_a^\dagger \\ 0 & 0 & -K_a & 0 \end{pmatrix}. \qquad (21)$$

Here the $S_a$ are spin-one matrices whose elements are $(S_a)_{bc} = i\varepsilon_{abc}$, the $K_a$ are $1\times3$ matrices of the form

$$K_1 = (i, 0, 0), \qquad K_2 = (0, i, 0), \qquad K_3 = (0, 0, i), \qquad (22)$$

and 0 denotes a zero matrix of the appropriate dimension.

The Inönü-Wigner contraction consists of a transformation to a new basis, $S_{ab} \to S_{ab}$, $S_{0a} \to \varepsilon S_{0a}$, followed by a similarity transformation of the basis elements, $S_{\mu\nu} \to S'_{\mu\nu} = V S_{\mu\nu} V^{-1}$, with a matrix $V$ depending on the contraction parameter $\varepsilon$. Moreover, $V$ should depend on $\varepsilon$ in a tricky way, such that all the transformed generators $S'_{ab}$ and $\varepsilon S'_{0a}$ are kept non-trivial and non-singular when $\varepsilon \to 0$ [13]. In accordance with [14, 15], the representation (19) can be contracted either to an indecomposable representation of HG(1,3) or to a direct sum of such representations. To obtain an indecomposable representation, the contraction matrix $V$ has to be chosen in the following form (in the same block notation as (21)):

$$V = \begin{pmatrix} \frac{\varepsilon}{2} & 0 & \frac{\varepsilon}{2} & 0 \\ 0 & I & 0 & 0 \\ -\varepsilon^{-1} & 0 & \varepsilon^{-1} & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad V^{-1} = \begin{pmatrix} \varepsilon^{-1} & 0 & -\frac{\varepsilon}{2} & 0 \\ 0 & I & 0 & 0 \\ \varepsilon^{-1} & 0 & \frac{\varepsilon}{2} & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \qquad (23)$$

To apply the contraction procedure to the field equations (5)–(7), we first write them in vector notation:

$$\partial_0\mathbf{E} - \nabla\times\mathbf{B} = \kappa(F_0\mathbf{B} - \mathbf{F}\times\mathbf{E}) + e\mathbf{j}, \qquad (24)$$
$$\nabla\cdot\mathbf{E} = \kappa\,\mathbf{F}\cdot\mathbf{B} + e\,j_0, \qquad (25)$$
$$\partial_0\mathbf{B} + \nabla\times\mathbf{E} = 0, \qquad (26)$$
$$\nabla\cdot\mathbf{B} = 0, \qquad (27)$$
$$\partial_0F_0 - \nabla\cdot\mathbf{F} = -\kappa\,\mathbf{E}\cdot\mathbf{B} + q\,j_4, \qquad (28)$$
$$\partial_0\mathbf{F} - \nabla F_0 = 0, \qquad (29)$$
$$\nabla\times\mathbf{F} = 0. \qquad (30)$$

Taking half-sums and half-differences of the pairs of equations (25) and (28), (24) and (29), (26) and (30), we come to a system equivalent to (24)–(30):

$$\partial_0F_0 - \nabla\cdot(\mathbf{F}-\mathbf{E}) = \kappa(\mathbf{F}-\mathbf{E})\cdot\mathbf{B} + e\left(j_0 + \frac{q}{e}j_4\right),$$
$$\partial_0F_0 - \nabla\cdot(\mathbf{F}+\mathbf{E}) = -\kappa(\mathbf{F}+\mathbf{E})\cdot\mathbf{B} - e\left(j_0 - \frac{q}{e}j_4\right),$$
$$\partial_0(\mathbf{E}+\mathbf{F}) - \nabla\times\mathbf{B} - \nabla F_0 = \kappa\left(F_0\mathbf{B} - \tfrac{1}{2}(\mathbf{F}-\mathbf{E})\times(\mathbf{F}+\mathbf{E})\right) + e\mathbf{j}, \qquad (31)$$
$$\partial_0(\mathbf{E}-\mathbf{F}) - \nabla\times\mathbf{B} + \nabla F_0 = \kappa\left(F_0\mathbf{B} - \tfrac{1}{2}(\mathbf{F}-\mathbf{E})\times(\mathbf{F}+\mathbf{E})\right) + e\mathbf{j},$$
$$\partial_0\mathbf{B} + \nabla\times(\mathbf{E}+\mathbf{F}) = 0, \qquad \partial_0\mathbf{B} + \nabla\times(\mathbf{E}-\mathbf{F}) = 0, \qquad \nabla\cdot\mathbf{B} = 0.$$

Defining $\psi' = \mathrm{column}(\mathbf{F}', \mathbf{B}', \mathbf{E}', F'_0) = V\psi$, we obtain from (20) and (23):

$$\mathbf{E}+\mathbf{F} = 2\varepsilon^{-1}\mathbf{F}', \quad \mathbf{B}' = \mathbf{B}, \quad \mathbf{F}-\mathbf{E} = \varepsilon\mathbf{E}', \quad F_0 = F'_0,$$
$$2\varepsilon^{-1}j'_4 = \frac{q}{e}j_4 - j_0, \quad \varepsilon j'_0 = \frac{q}{e}j_4 + j_0, \quad \mathbf{j}' = \mathbf{j}. \qquad (32)$$
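As a consistency check on the contraction machinery, the following sympy sketch builds the $10\times10$ generators from (21)–(22) and the matrix (23), and verifies that $VV^{-1}=1$, that the rotation generators commute with $V$, and that $\varepsilon S'_{0a}$ stays finite and non-trivial as $\varepsilon \to 0$, as the Inönü-Wigner procedure requires. The block ordering of (20) is assumed; this is an illustrative check, not part of the original derivation.

```python
# Check of the contraction matrix (23) acting on the generators (21)-(22).
import sympy as sp

eps = sp.symbols('epsilon', positive=True)
I3, Z3 = sp.eye(3), sp.zeros(3, 3)
c0, r0 = sp.zeros(3, 1), sp.zeros(1, 3)
one, zero = sp.Matrix([[1]]), sp.Matrix([[0]])

def spin1(a):
    # (S_a)_{bc} = i*eps_{abc}, the spin-one matrices of (21)
    return sp.Matrix(3, 3, lambda b, c: sp.I * sp.LeviCivita(a, b, c))

def K(a):
    # the 1x3 matrices K_a of (22)
    row = sp.zeros(1, 3)
    row[a] = sp.I
    return row

def explicit(rows):
    return sp.BlockMatrix(rows).as_explicit()

V = explicit([[eps/2*I3, Z3, eps/2*I3, c0], [Z3, I3, Z3, c0],
              [-I3/eps, Z3, I3/eps, c0], [r0, r0, r0, one]])
Vinv = explicit([[I3/eps, Z3, -eps/2*I3, c0], [Z3, I3, Z3, c0],
                 [I3/eps, Z3, eps/2*I3, c0], [r0, r0, r0, one]])
assert sp.simplify(V * Vinv) == sp.eye(10)

# rotations, e.g. S_12 = diag(S_3, S_3, S_3, 0), are untouched by V
S12 = explicit([[spin1(2), Z3, Z3, c0], [Z3, spin1(2), Z3, c0],
                [Z3, Z3, spin1(2), c0], [r0, r0, r0, zero]])
assert sp.simplify(V * S12 * Vinv) == S12

for a in range(3):
    S0a = explicit([[Z3, -spin1(a), Z3, c0], [spin1(a), Z3, Z3, c0],
                    [Z3, Z3, Z3, K(a).T], [r0, r0, -K(a), zero]])
    boosted = sp.expand(eps * V * S0a * Vinv)          # eps * S'_{0a}
    lim = boosted.applyfunc(lambda t: sp.limit(t, eps, 0))
    assert lim != sp.zeros(10, 10)        # non-trivial in the limit ...
    assert not lim.has(sp.oo, sp.zoo)     # ... and non-singular
print("contraction checks passed")
```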
Substituting (32) into (31), equating the terms with the lowest powers of $\varepsilon$, and taking into account that the relativistic variable $x_0$ is related to the non-relativistic time $t$ as $x_0 = ct = \varepsilon^{-1}t$, we obtain the following system:

$$\partial_tF'_0 - \nabla\cdot\mathbf{E}' + \kappa\,\mathbf{B}'\cdot\mathbf{E}' = e\,j'_0,$$
$$\partial_t\mathbf{F}' + \nabla\times\mathbf{B}' + \kappa\left(F'_0\mathbf{B}' + \mathbf{F}'\times\mathbf{E}'\right) = e\,\mathbf{j}',$$
$$\nabla\cdot\mathbf{F}' + \kappa\,\mathbf{F}'\cdot\mathbf{B}' = e\,j'_4, \qquad (33)$$
$$\partial_t\mathbf{B}' + \nabla\times\mathbf{E}' = 0, \qquad \nabla\cdot\mathbf{B}' = 0, \qquad \nabla\times\mathbf{F}' = 0, \qquad \partial_t\mathbf{F}' = \nabla F'_0.$$

The system of equations (33) coincides with the Galilei-invariant system for the indecomposable ten-component field deduced in [3]. Like the corresponding relativistic equations (5), (6), system (33) admits a Lagrangian formulation. The related Lagrangian can be obtained from (3) using the contraction procedure, and has the following form:

$$L = \frac{1}{2}\left(F_0'^2 - \mathbf{B}'^2\right) - \mathbf{E}'\cdot\mathbf{F}' + \kappa\left(A'_0\,\mathbf{B}'\cdot\mathbf{F}' - \mathbf{A}'\cdot(\mathbf{B}'F'_0 + \mathbf{F}'\times\mathbf{E}')\right) - e\left(A'_0j'_4 + A'_4j'_0 - \mathbf{j}'\cdot\mathbf{A}'\right). \qquad (34)$$

Just this system (33) and Lagrangian (34) represent the Galilean limit of the model discussed in Sections 2 and 3. Exactly this Lagrangian was found in [3], starting from the Galilei invariance condition.

With the additional Galilei-invariant constraints $\mathbf{F}' = 0$, $j'_4 = 0$, $F'_0 = p_0 = \mathrm{const}$, system (33) reduces to the form

$$\frac{\partial\mathbf{B}'}{\partial t} + \nabla\times\mathbf{E}' = 0, \qquad \nabla\cdot\mathbf{B}' = 0,$$
$$\nabla\cdot\mathbf{E}' + \kappa\,\mathbf{B}'\cdot\mathbf{E}' = e\,j'_0, \qquad \nabla\times\mathbf{B}' + \kappa p_0\mathbf{B}' = e\,\mathbf{j}'. \qquad (35)$$

Equations (35) represent a Galilei-invariant version of the CFJ model with time-like $p_\mu$.

7 Exact solutions

One of the important applications of Lie symmetries of partial differential equations (PDEs) is the construction of their exact solutions. The Lie algorithm for constructing exact solutions of differential equations has been known for more than 120 years; see, e.g., [10]. Various applications of this algorithm to relativistic systems can be found in [16].

In this section we present some group solutions of the relativistic system (5)–(7). Since the maximal Lie symmetry of this system has been found and presented in Section 5, it is possible to find its exact solutions using the following algorithm:

1. Find all non-equivalent three-dimensional subalgebras of the Lie algebra of the group $\tilde P(1,3)$, whose generators are given by formulae (17).
2. Find the invariants of the related three-parameter Lie groups.
3. Choose new variables in such a way that eight of them coincide with these invariants. As a result, we obtain a system of ordinary differential equations.
4. Solve (if possible) the obtained systems of ordinary differential equations and reconstruct the related solutions of the original system.

The first step of the algorithm reduces to using the classification results for the subalgebras of the algebra p(1,3), which can be found in [17]. All the remaining steps are rather cumbersome but algorithmic, and it is possible to find all exact solutions of the systems (5)–(7) and (12)–(14) which can be obtained via the Lie reduction procedure. Here we present only two examples of such solutions; the complete list can be found in [18].

Let us start with the subalgebra of $\tilde P(1,3)$ spanned by the basis elements $\langle J_{12}, P_1, P_2\rangle$, which are given explicitly by equations (17) and (18). There is only one invariant of the related group depending on $x_0$, $x_1$, $x_2$ and $x_3$, namely $\omega = x_1^2 + x_2^2$. In addition, there are seven invariants depending on both the space-time and the field variables, which we denote as $\varphi_1, \varphi_2, \ldots, \varphi_7$. They are supposed to be functions of $\omega$ such that

$$B_1 = \varphi_1\cos\omega - \varphi_2\sin\omega, \quad B_2 = \varphi_1\sin\omega + \varphi_2\cos\omega,$$
$$E_1 = \varphi_3\cos\omega - \varphi_4\sin\omega, \quad E_2 = \varphi_3\sin\omega + \varphi_4\cos\omega,$$
$$B_3 = \varphi_5, \quad E_3 = \varphi_6, \quad A^4 = \varphi_7. \qquad (36)$$
Substituting (36) into (5)–(7), we reduce it to a system of ordinary differential equations which turns out to be integrable. Moreover, its general solution depends on six arbitrary parameters. A particular solution has the following form:

$$B_1 = -c_1x_2/\omega, \quad B_2 = c_1x_1/\omega, \quad B_3 = c_3A^4, \qquad (37)$$
$$E_1 = c_2x_1/\omega, \quad E_2 = c_2x_2/\omega, \quad E_3 = c_3, \qquad (38)$$
$$F_0 = F_3 = 0, \quad F_\alpha = \nabla_\alpha A^4, \quad \alpha = 1, 2, \qquad (39)$$

where $A^4 = c_4J_0(c_3\sqrt{\kappa\omega}) + c_5Y_0(c_3\sqrt{\kappa\omega})$, $J_0$ and $Y_0$ are the Bessel functions of the first and second kind, respectively, and $c_1, \ldots, c_5$ are arbitrary parameters.

It is interesting to note that the functions (37) and (38) also solve the standard (linear) Maxwell equations with bounded currents. Namely, the electric field (38) coincides with the field of an infinite straight charged line lying along the third coordinate axis, supplemented by the constant electric field $E_3 = c_3$. The magnetic field (37) is a superposition of the field of a straight line current directed along the third coordinate axis and the field $B_3 = c_3A^4$ generated by the current whose components are

$$j_1 = -\frac{x_2}{\sqrt{\omega}}A^4{}', \qquad j_2 = \frac{x_1}{\sqrt{\omega}}A^4{}', \qquad j_3 = 0, \qquad \text{where } A^4{}' = \frac{\partial A^4}{\partial\omega}.$$

Let us present one more exact solution of the system (5)–(7):

$$B_1 = c\,\frac{x_2}{\omega^{3/2}}, \quad B_2 = -c\,\frac{x_1}{\omega^{3/2}}, \quad B_3 = 0,$$
$$E_1 = c\,\frac{x_1}{\omega^{3/2}}, \quad E_2 = c\,\frac{x_2}{\omega^{3/2}}, \quad E_3 = 0, \qquad (40)$$
$$F_1 = \frac{x_2}{\omega}, \quad F_2 = -\frac{x_1}{\omega}, \quad F_3 = F_0 = 0, \quad j_\mu = 0, \quad \mu = 0, \ldots, 4.$$

In contrast with (37)–(39), the vectors $\mathbf{B}$ and $\mathbf{E}$ defined in (40) do not solve the linear Maxwell equations with bounded currents. However, they solve the system of nonlinear equations (5)–(7) for $\omega > 0$. A complete set of reductions and exact solutions for equations (5)–(7) can be found in [20].
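The Bessel-function part of the solution can be checked numerically. Under the reading that the constant under the square root in (39) is the coupling $\kappa$, $A^4(\omega)$ must satisfy the radial equation $4\omega A'' + 4A' + \kappa c_3^2 A = 0$, which is the two-dimensional Laplacian acting on a function of $\omega = x_1^2 + x_2^2$. A minimal scipy sketch with arbitrary test constants:

```python
# Check that A4 = c4*J0(c3*sqrt(kappa*omega)) + c5*Y0(...) solves the reduced ODE.
import numpy as np
from scipy.special import j0, j1, y0, y1

kappa, c3, c4, c5 = 0.7, 1.3, 0.4, 0.2   # arbitrary test values
omega = np.linspace(0.5, 20.0, 4001)
u = c3 * np.sqrt(kappa * omega)

A = c4 * j0(u) + c5 * y0(u)
# dA/domega analytically, using J0' = -J1, Y0' = -Y1 and du/domega = u/(2*omega)
dA = -(c4 * j1(u) + c5 * y1(u)) * u / (2 * omega)
d2A = np.gradient(dA, omega)             # finite-difference second derivative

residual = 4 * omega * d2A + 4 * dA + kappa * c3**2 * A
print(np.abs(residual).max())            # ~0 up to finite-difference error
```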
8 Conclusion

We have discussed two MCS models in (1+3)-dimensional space-time. One of them is presented in Sections 2 and 3. It generalizes axion electrodynamics to a theory with a five-component current. The other model includes axion electrodynamics as a limiting case corresponding to a small coupling constant. This model includes a non-linear version of the Bianchi identity. The other specific feature of this model is the existence of two conserved parts of the energy density, one of which corresponds to the tensor of the electromagnetic field, while the other is formed by the additional four-vector field $F^\mu$.

In contrast to the CFJ model, our models are relativistically invariant, and they include this model as a particular sector corresponding to constant solutions for the vector field $F^\mu$. Both models have a good non-relativistic limit, coinciding with the Galilei-invariant system discussed in [3]. To obtain this limit we apply the Inönü-Wigner contraction, which makes it possible to find a Galilean formulation of the CFJ model.

Using the classical Lie approach, we find continuous symmetries of our models and construct multiparameter families of their exact solutions. Two of these solutions are presented in Section 7, while the complete list of group solutions can be found in the preprint [18]. Note that solutions (37)–(39) and (40) give rise to new exactly solvable models for Dirac fermions. One such model is presented in [19].

References

[1] Schonfeld, J.: Nucl. Phys. B 185, 157, 1981; Deser, S., Jackiw, R., Templeton, S.: Ann. Phys. 140, 372, 1982.
[2] Carroll, S. M., Field, G. B., Jackiw, R.: Phys. Rev. D 41, 1231, 1990.
[3] Niederle, J., Nikitin, A. G.: J. Phys. A: Math. Theor. 42, 105207, 2009.
[4] Chern, S. S., Simons, J.: Annals Math. 99, 48, 1974.
[5] Horváthy, P. A.: Lectures on (Abelian) Chern-Simons vortices. arXiv:0704.3220.
[6] Hariton, A. J., Lehnert, R.: Phys. Lett. A 367, 11, 2007.
[7] Itin, Y.: Phys. Rev. D 70, 025012, 2004; Itin, Y.: Wave propagation in axion electrodynamics. arXiv:0706.2991.
[8] Obukhov, Y. N., Hehl, F. W.: Phys. Lett. B 458, 466, 1999; Obukhov, Y. N., Rubilar, G. F.: Phys. Rev. D 66, 2002.
[9] Wilczek, F.: Phys. Rev. Lett. 58, 1799, 1987.
[10] Olver, P.: Applications of Lie Groups to Differential Equations. Springer-Verlag, N.Y., 1986.
[11] Le Bellac, M., Lévy-Leblond, J.-M.: Nuovo Cimento B 14, 217, 1973.
[12] Holland, P., Brown, H. R.: Studies in History and Philosophy of Science 34, 161, 2003.
[13] Inönü, E., Wigner, E. P.: Proc. Nat. Acad. Sci. U.S. 39, 510, 1953.
[14] de Montigny, M., Niederle, J., Nikitin, A. G.: J. Phys. A: Math. Theor. 39, 1, 2006.
[15] Niederle, J., Nikitin, A. G.: Czech. J. Phys. 56, 1243, 2006.
[16] Fushchich, W. I., Nikitin, A. G.: Symmetries of Equations of Quantum Mechanics. New York: Allerton Press, 1994.
[17] Fushchich, W. I., Barannyk, A. F., Barannyk, L. F.: Subgroup Structure of Lie Groups. Kiev: Naukova Dumka, 1993.
[18] Kuriksha, O., Nikitin, A. G.: arXiv preprint arXiv:1002.0064, 2010.
[19] Ferraro, E., Messina, N., Nikitin, A. G.: Phys. Rev. A 81, 042108, 2010, arXiv:0909.5543.
[20] Kuriksha, O.: Group analysis of (1+3)-dimensional Maxwell-Chern-Simons models and their exact solutions. arXiv:0911.3220.

J. Niederle, e-mail: niederle@fzu.cz, Institute of Physics of the Academy of Sciences of the Czech Republic, Na Slovance 2, 182 21 Prague, Czech Republic

A. G. Nikitin, e-mail: nikitin@imath.kiev.ua, Institute of Mathematics, National Academy of Sciences of Ukraine, 3 Tereshchenkivs'ka Street, Kyiv-4, Ukraine, 01601

O. Kuriksha, Institute of Mathematics, National Academy of Sciences of Ukraine, 3 Tereshchenkivs'ka Street, Kyiv-4, Ukraine, 01601

Validity of the one-dimensional limp model for porous media

O. Doutres, N. Dauchez, J.-M. Genevaux, O. Dazel

Abstract: A straightforward criterion for determining the validity of the limp model for porous materials is addressed here. The limp model is an "equivalent fluid" model which gives a better description of porous behaviour than the well-known "rigid frame" model. It is derived from the poroelastic Biot model, assuming that the frame has no bulk stiffness. A criterion is proposed for identifying the porous materials for which the limp model can be used. It relies on a new parameter, the frame stiffness influence FSI, based on porous material properties. The critical values of FSI under which the limp model can be used are determined using 1D analytical modelling for a specific boundary set: radiation of a vibrating plate covered by a porous layer.

Keywords: porous media, fibrous, limp model, Biot model, criterion.

1 Introduction

In recent years, poroelastic numerical models using the finite element method have been widely developed to improve the acoustic efficiency of porous materials used in the aeronautic and automotive industries. Classical methods use the Biot theory [1, 2] to account for the displacements of both the solid and the fluid phase. To model three-dimensional applications, six or four degrees of freedom per node are required, depending on the chosen formulation [5, 6]. These numerical methods enable us to predict the structural and fluid coupling induced by the poroelastic medium without any kinematic or geometrical assumptions. However, for large finite element models, these methods can require significant computational time.

To overcome this limitation, one can consider that the porous layer behaves like a dissipative fluid. Two porous one-wave formulations can be found: (i) the rigid frame model assumes that the solid phase remains motionless [2], and (ii) the limp model assumes that the stiffness of the solid phase is zero but takes into account its inertial effects [8, 9, 10, 11, 12]. Because the motion of the solid phase is considered in the limp model, this model is to be preferred for most applications, e.g. in means of transport (cars, trains, aircraft), where the porous layers are bonded to vibrating plates. However, the model is valid only as long as the frame flexibility of the porous material has little influence on the vibroacoustic response of the system.

In a preceding paper [11], a criterion was proposed for identifying the porous materials and the frequency bands for which the limp model can be used according to the boundary conditions applied to the layer.
The identification process is based on a parameter, the frame stiffness influence (FSI), determined from the properties of the porous material. This parameter, developed from the Biot theory [1, 2], quantifies the intrinsic influence of the solid-borne wave [2] on the displacement of the interstitial fluid, and is frequency dependent. In that study, the parameter FSI was compared to critical values obtained for different boundary conditions and porous thicknesses to give an estimate of the frequency bands for which the limp model can be used.

In the present paper, the identification process is more straightforward and gives a first estimate of the accuracy of the limp model over the whole frequency range. It is based on a frequency-independent parameter FSIr derived from FSI. Critical values of FSIr above which the limp model cannot be used are determined for porous materials from 1 to 5 cm in thickness and for a specific boundary condition set (see Fig. 3): the sound radiation of a porous layer backed by a vibrating wall.

2 Porous material modeling

2.1 Biot theory

According to the Biot theory, three waves propagate in a porous medium: two compressional waves and a shear wave. In this work the applications are one-dimensional, and only the two compressional waves are considered. The motion of the poroelastic medium is described by the macroscopic displacements of the solid and fluid phases, denoted $u^s$ and $u^f$, respectively. Assuming a harmonic time dependence, the equations of motion can be written in the following form [11]:

$$-\omega^2\left(\tilde\rho_{11}\tilde u^s + \tilde\rho_{12}\tilde u^f\right) = \hat P\,\frac{\partial^2\tilde u^s}{\partial x^2} + \tilde Q\,\frac{\partial^2\tilde u^f}{\partial x^2}, \qquad (1)$$
$$-\omega^2\left(\tilde\rho_{12}\tilde u^s + \tilde\rho_{22}\tilde u^f\right) = \tilde Q\,\frac{\partial^2\tilde u^s}{\partial x^2} + \tilde R\,\frac{\partial^2\tilde u^f}{\partial x^2}, \qquad (2)$$

with, in compact notation,

$$\tilde\rho = \begin{pmatrix}\tilde\rho_{11} & \tilde\rho_{12}\\ \tilde\rho_{12} & \tilde\rho_{22}\end{pmatrix}, \qquad \tilde K = \begin{pmatrix}\hat P & \tilde Q\\ \tilde Q & \tilde R\end{pmatrix}. \qquad (3)$$

The tilde symbol indicates that the associated physical property is complex and frequency-dependent. The inertial coefficients $\tilde\rho_{11}$ and $\tilde\rho_{22}$ are the modified Biot densities of the solid and fluid phase, respectively. The inertial coefficient $\tilde\rho_{12}$ accounts for the interaction between the inertial forces of the solid and fluid phases, together with the viscous dissipation. In eqs. (1, 2), $\hat P$ is the bulk modulus of the frame in vacuum,

$$\hat P = \frac{E\,(1+j\eta)(1-\nu)}{(1+\nu)(1-2\nu)}, \qquad (4)$$

with $E$ the Young modulus, $\eta$ the loss factor and $\nu$ the Poisson ratio of the frame; $\tilde R$ is the bulk modulus of the fluid phase, $\tilde Q$ quantifies the potential coupling between the two phases, and $\phi$ is the porosity.
In the geometry considered here, the displacement of each phase is due to the propagation of two compressional waves traveling in both directions. They can be written in the form

$$u^s(x) = x_1 + x_2, \qquad (5)$$
$$u^f(x) = \mu_1 x_1 + \mu_2 x_2, \qquad (6)$$

where $x_i = s_i\cos(\delta_i x) + d_i\sin(\delta_i x)$ is the contribution of each compressional wave $i = 1, 2$, and $s_i$ and $d_i$ are set by the boundary conditions. These waves are characterized by a complex wave number $\delta_i$ ($i = 1, 2$) and a displacement ratio $\mu_i$. This ratio indicates in which medium each wave mainly propagates. Here, the wave with the subscript $i = 1$ propagates mainly in the fluid phase and is referred to as the airborne wave. The wave with the subscript $i = 2$ propagates mainly in the solid phase and is referred to as the frame-borne wave.

2.2 Limp assumption

The limp model is derived from the Biot theory. It is based on the assumption that the frame has no bulk stiffness [8, 9, 10, 11, 12]: $\hat P = 0$. It is naturally associated with soft materials like cotton and glass wool. This model describes the propagation of one compressional wave in a medium that has the bulk modulus of the air in the pores and the density of the air modified by the inertial effect of the solid phase and its interaction with the fluid phase. Hence, by imposing the assumption $\hat P = 0$ in eq. (1), we get a simple relation between the displacements of the solid and the fluid phase. Then, substituting the solid displacement into eq. (2) gives the propagation equation for $u^f$:

$$\tilde K_f\,\frac{\partial^2\tilde u^f}{\partial x^2} + \omega^2\tilde\rho_{limp}\,\tilde u^f = 0, \qquad (7)$$

with $\tilde K_f$ the bulk modulus of the air in the pores and $\tilde\rho_{limp}$ the modified density of the air. Expressions for these coefficients can be found in references [11, 12].

3 Influence of frame stiffness

The aim of this section is to propose a parameter, based on the properties of the porous material, which quantifies the influence of the frame stiffness on the porous behaviour. This parameter is called FSI, for frame stiffness influence.

3.1 Development of the frequency dependent parameter FSI

The limp model can be used when the contribution of the frame-borne wave is negligible in the considered application. This approximation implies, in the expressions of the solid and fluid displacements (eq. (5, 6)), that:

• the contribution of the airborne wave $x_1$ is large compared to the contribution of the frame-borne wave $x_2$; this condition depends mainly on the boundary conditions, and one configuration will be presented in Section 4 to set critical values of the FSI parameter;

• considering the fluid motion (eq. (6)), the displacement ratio $\mu_1$ associated with the airborne wave is large compared to the displacement ratio $\mu_2$ associated with the frame-borne wave: $|\mu_2/\mu_1| \ll 1$; this condition is independent of the boundary conditions and will be used to build the FSI parameter.

Hence, the FSI parameter is based on the assumption that the limp model can be used when the frame-borne wave contribution is negligible in the considered application. The associated condition, $|\mu_2/\mu_1| \ll 1$, can be written in terms of a frequency-dependent parameter, FSI, expressed as a ratio of two characteristic wave numbers [11]:

$$FSI = \left|\frac{\tilde\delta_{limp}^2}{\tilde\delta_c^2}\right| = \left|\frac{\tilde\rho_{limp}\,\hat P}{\tilde\rho_c\,\tilde K_f}\right|. \qquad (8)$$

$\tilde\delta_{limp} = \omega\sqrt{\tilde\rho_{limp}/\tilde K_f}$ is the wave number derived from the limp model, and $\tilde\delta_c = \omega\sqrt{\tilde\rho_c/\hat P}$ is the wave number of a wave, called the c wave, that propagates in a medium which has the bulk modulus of the frame in a vacuum and the density of the frame in a fluid; neglecting the air density against that of the frame, this density reduces to

$$\tilde\rho_c \simeq \rho_1 - j\,\frac{\sigma\phi^2}{\omega}, \qquad (9)$$

with $\rho_1$ the mass density of the porous material, $\sigma$ its airflow resistivity and $\phi$ its porosity. Fig. 2 presents the FSI of the two characteristic materials B and C [11].
Material B is a high-density fibrous material, and material C is a polymer foam with a stiff skeleton and high airflow resistivity. The properties of these materials, presented in Table 1, were measured in our laboratory.

Fig. 1: One-dimensional porous modeling

Table 1: Measured properties of materials B and C

| Porous material | B | C |
| Airflow resistivity: σ (kNs/m⁴) | 23 | 57 |
| Porosity: φ | 0.95 | 0.97 |
| Tortuosity: α∞ | 1 | 1.54 |
| Viscous length: Λ (μm) | 54.1 | 24.6 |
| Thermal length: Λ′ (μm) | 162.3 | 73.8 |
| Frame density: ρ₁ (kg/m³) | 58 | 46 |
| Young's modulus at 5 Hz: E (kPa) | 17 | 214 |
| Structural loss factor at 5 Hz: η | 0.1 | 0.115 |
| Poisson's ratio: ν | 0 | 0.3 |

Fig. 2: FSI of materials B and C

Fig. 2 shows that the FSI parameter has a bell shape whose amplitude increases with the bulk modulus of the porous skeleton. The maximum amplitude occurs at the decoupling frequency defined by Zwikker and Kosten [13]:

$$f_{zk} = \frac{\sigma\phi^2}{2\pi\rho_1}. \qquad (10)$$

This is the frequency below which the viscous forces on the material exceed the inertial forces per unit volume. It is generally used as the critical frequency above which an acoustic wave propagating in the fluid phase does not exert a sufficient force to generate vibrations in the solid phase.

3.2 A simplified frequency independent parameter FSIr

The main objective of this paper is to propose a straightforward identification process which is easier to carry out than the one presented in ref. [11]. The criterion proposed here consists in comparing a frequency-independent parameter, which characterizes the frame influence, with a critical value. This frequency-independent parameter is set as the maximum value of FSI, to ensure the uniqueness of the solution over the whole frequency range. As mentioned previously, it can be approached from the densities of the limp wave and of the c wave expressed at the decoupling frequency. Assuming that the density of air $\rho_f$ is negligible compared with that of the porous material $\rho_1$, these densities are given by

$$\tilde\rho_c(f_{zk}) \simeq \rho_1(1 - j), \qquad (11)$$
$$\tilde\rho_{limp}(f_{zk}) \simeq \frac{\rho_1}{2}(1 - j). \qquad (12)$$

Hence, the modulus of the maximum of FSI, reached at $f_{zk}$, is given by

$$FSI_r = |FSI(f_{zk})| = \frac{|\hat P|}{2P_0}, \qquad (13)$$

where $P_0$ is the low-frequency bulk modulus of the air in the pores (the atmospheric pressure). FSIr is then easy to calculate: it requires measurement of the bulk modulus of the skeleton $\hat P$ and of the porosity $\phi$. The two parameters FSIr and $f_{zk}$ are given in Table 2 for materials B and C.

Table 2: Simplified FSI parameter of materials B and C

| Material | B | C |
| f_zk (Hz) | 57 | 186 |
| FSIr at f_zk | 8.42·10⁻² | 1.43 |
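As a quick sanity check, the following sketch recomputes $f_{zk}$ and FSIr for materials B and C from the data of Table 1, under the low-frequency reading $\tilde K_f(f_{zk}) \approx P_0$ adopted in eq. (13) above; the results reproduce Table 2.

```python
# Cross-check of f_zk (10) and FSIr (13) for materials B and C of Table 1.
# P_hat is the in-vacuo longitudinal modulus of eq. (4); K_f at f_zk is taken
# as the static bulk modulus of air P0 = 101.325 kPa (an assumption consistent
# with the isothermal low-frequency limit).
import numpy as np

P0 = 101.325e3   # Pa

materials = {  # sigma [Ns/m^4], phi, rho1 [kg/m^3], E [Pa], eta, nu
    "B": (23.0e3, 0.95, 58.0, 17e3, 0.100, 0.0),
    "C": (57.0e3, 0.97, 46.0, 214e3, 0.115, 0.3),
}

for name, (sigma, phi, rho1, E, eta, nu) in materials.items():
    f_zk = sigma * phi**2 / (2 * np.pi * rho1)                         # eq. (10)
    P_hat = E * (1 + 1j * eta) * (1 - nu) / ((1 + nu) * (1 - 2 * nu))  # eq. (4)
    FSIr = abs(P_hat) / (2 * P0)                                       # eq. (13)
    print(f"material {name}: f_zk = {f_zk:.0f} Hz, FSIr = {FSIr:.3g}")

# material B: f_zk = 57 Hz, FSIr = 0.0843  (Table 2: 57 Hz, 8.42e-2)
# material C: f_zk = 186 Hz, FSIr = 1.43   (Table 2: 186 Hz, 1.43)
```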
4 Determination of critical FSI values

The previous section introduced the simple parameter FSIr based on the physical properties of the material. The next step is to identify, for a specific boundary condition set, the critical values of FSI under which the limp model can be used instead of the Biot model. These critical values are determined from the difference between limp and Biot simulations carried out for a wide range of acoustic materials; hence, the critical FSI value is independent of the tested material.

The chosen configuration is presented in Fig. 3: the porous layer is excited by a vibrating plate at $x = L$ and radiates into an infinite half-space at $x = 0$. This configuration corresponds to trim panels, car roofs or airplane floors.

Fig. 3: Sound radiation of a porous layer backed by a vibrating wall

The radiation efficiency factor $\sigma_r$, defined as the ratio of the radiated acoustic power $\Pi_a$ to the vibratory power of the piston $\Pi_v$, is used as the vibroacoustic response:

$$\sigma_r = \frac{\Pi_a}{\Pi_v} = \frac{\mathrm{Re}\left\{p(0)\,v^*(0)\right\}}{\rho_f c_f\,|v_w|^2}.$$

A vibrating surface area of 1 m² is considered here. The boundary conditions associated with this configuration are [14]: continuity of stress and of total flow at $x = 0$; at $x = L$, the velocity of the fluid and the velocity of the frame are both equal to the wall velocity,

$$j\omega\,\tilde u^s(L) = j\omega\,\tilde u^f(L) = v_w.$$

The vibroacoustic response is derived using the transfer matrix method (TMM) [2]. This method assumes that the multilayer has infinite lateral dimensions, and uses a representation of plane wave propagation in the different media in terms of transfer matrices. To ensure a one-dimensional representation, the multilayer is excited by plane waves at normal incidence. The porous layer is simulated using either the Biot model or the limp model presented in Section 2.

Fig. 4 shows the Biot and limp simulations of the radiation efficiency of materials B and C, 2 cm in thickness. For both materials, an increase of the radiation efficiency is observed around the first λ/4 resonance frequency of the frame: around 200 Hz for material B and 1000 Hz for material C.
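For illustration, a minimal sketch of the solution procedure follows: a single equivalent-fluid layer between the vibrating wall and the radiation half-space, evaluated with a standard fluid-layer transfer matrix. The effective density and bulk modulus below are placeholder constants (with the $e^{j\omega t}$ sign convention used above), not the frequency-dependent limp expressions of [11, 12], so the sketch only shows the structure of the TMM computation of $\sigma_r$, not material data.

```python
# Minimal 1D TMM sketch of the radiation configuration of Fig. 3.
import numpy as np

rho0, c0 = 1.21, 343.0   # air density and sound speed
L = 0.02                  # layer thickness, m

def sigma_r(f, rho_eq, K_eq, vw=1.0):
    w = 2 * np.pi * f
    delta = w * np.sqrt(rho_eq / K_eq)   # wave number in the layer
    Zc = np.sqrt(rho_eq * K_eq)          # characteristic impedance
    # fluid-layer transfer matrix relating (p, v) at x = L to (p, v) at x = 0
    T = np.array([[np.cos(delta * L), 1j * Zc * np.sin(delta * L)],
                  [1j * np.sin(delta * L) / Zc, np.cos(delta * L)]])
    # radiation condition at x = 0: p(0) = rho0*c0*v(0); wall condition: v(L) = vw
    v0 = vw / (T[1, 0] * rho0 * c0 + T[1, 1])
    p0 = rho0 * c0 * v0
    return np.real(p0 * np.conj(v0)) / (rho0 * c0 * abs(vw)**2)

# placeholder equivalent-fluid properties (hypothetical values):
print(sigma_r(500.0, rho_eq=60.0 - 40.0j, K_eq=1.2e5))
```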
To determine the critical FSI value, the difference between the two models is measured by the absolute value of the difference of the two responses, $\Delta = |\sigma_r(\mathrm{Biot}) - \sigma_r(\mathrm{limp})|$. The maximum accepted difference between the two models is set to 3 dB, which corresponds to a classical industrial demand. In order to determine a critical FSI value independent of the tested material, the difference between the two simulations is plotted as a function of the frequency-dependent parameter FSI for a wide variety of porous materials (256 simulated materials). The critical FSI value corresponds to the minimum FSI value for which the model difference exceeds the maximum acceptable value of 3 dB [11]. The abacus given in Fig. 5 presents the minimum critical FSI values determined for 5 different porous thicknesses. For a given material, the limp model can be used if its FSIr lies below the critical value (white area of the abacus), and the Biot model should be preferred if FSIr exceeds the critical value (gray area of the abacus).

5 Discussion and conclusion

A straightforward method is proposed to determine whether the limp model can be used over the whole frequency range (1–10000 Hz). The procedure is as follows:

• Two properties of the porous material, $\hat P$ and $\phi$, have to be measured (see Table 1 for materials B and C).
• The parameter FSIr is evaluated using eq. (13).
• The critical values of FSI are read from Fig. 5 according to the thickness of the porous layer.
• FSIr is finally compared to the critical values: the limp model can be used over the whole frequency range if FSIr is below the critical FSI value.

In the case of material C, FSIr is equal to 1.4 (see Table 2), which is above the critical FSI values of the radiation configuration for all thicknesses: the Biot model should be preferred for all layer thicknesses. The FSIr of material B is equal to 8.4·10⁻² (see Table 2), which is below the critical FSI values of the radiation configuration for all thicknesses: the limp model can be used for all porous thicknesses. These predictions agree with the simulations presented in Fig. 4. Note that for material B, the increase of the radiation efficiency induced by the frame motion does not exceed the maximum accepted difference of 3 dB between the Biot and limp models.

The proposed method is easy to carry out and makes it possible to estimate whether the one-dimensional limp model can be used instead of the complete Biot model, without performing any numerical simulation of the configuration or any experimental study. Note that the use of the limp model can be particularly interesting in order to decrease the computational time of large finite element calculations which include porous materials. The criterion method has been presented here for the radiation efficiency of a plate covered by a porous layer of different thicknesses. It has been shown that the prediction of the materials for which the limp model can be used is in close agreement with 1D simulations.

Fig. 4: Radiation efficiency simulated with the Biot model (solid line) and the limp model (circles): (a) material B, (b) material C

Fig. 5: Evolution of the critical FSI value as a function of the porous thickness

Acknowledgments

This study was supported in the framework of the CREDO research project co-funded by the European Commission.

References

[1] Biot, M. A.: The theory of propagation of elastic waves in a fluid-saturated porous solid. I. Low frequency range; II. Higher frequency range. J. Acoust. Soc. Am. Vol. 28 (1956), p. 168–191.
[2] Allard, J. F.: Propagation of Sound in Porous Media: Modelling Sound Absorbing Materials. London: Elsevier Applied Science, 1993.
[3] Panneton, R., Atalla, N.: An efficient scheme for solving the three-dimensional elasticity problem in acoustics. J. Acoust. Soc. Am. Vol. 101 (1998), No. 6, p. 3287–3298.
[4] Hörlin, N. E., Nordström, M., Göransson, P.: A 3-D hierarchical FE formulation of Biot's equations for elasto-acoustic modeling of porous media. J. Sound Vib. Vol. 254 (2001), No. 4, p. 633–652.
[5] Atalla, N., Panneton, R., Debergue, P.: A mixed displacement-pressure formulation for poroelastic materials. J. Acoust. Soc. Am. Vol. 104 (1998), No. 4, p. 1444–1452.
[6] Dauchez, N., Sahraoui, S., Atalla, N.: Convergence of poroelastic finite elements based on Biot displacement formulation. J. Acoust. Soc. Am. Vol. 109 (2001), No. 1, p. 33–40.
[7] Rigobert, S., Atalla, N., Sgard, F.: Investigation of the convergence of the mixed displacement pressure formulation for three-dimensional poroelastic materials using hierarchical elements. J. Acoust. Soc. Am. Vol. 114 (2003), No. 5, p. 2607–2617.
[8] Beranek, L. L.: Acoustical properties of homogeneous, isotropic rigid tiles and flexible blankets. J. Acoust. Soc. Am. Vol. 19 (1947), No. 4, p. 556–568.
[9] Ingard, K. U.: Notes on Sound Absorption Technology. Noise Control Foundation, New York, 1994.
[10] Dazel, O., Brouard, B., Depollier, C., Griffiths, S.: An alternative Biot's displacement formulation for porous materials. J. Acoust. Soc. Am. Vol. 121 (2007), No. 6, p. 3509–3516.
[11] Doutres, O., Dauchez, N., Genevaux, J. M., Dazel, O.: Validity of the limp model for porous materials: A criterion based on Biot theory. J. Acoust. Soc. Am. Vol. 122 (2007), No. 4, p. 2038–2048.
[12] Panneton, R.: Comments on the limp frame equivalent fluid model for porous media. J. Acoust. Soc. Am. Vol. 122 (2007), No. 6, p. EL217–222.
[13] Zwikker, C., Kosten, C. W.: Sound Absorption Materials. New York: Elsevier Applied Science, 1949.
[14] Doutres, O., Dauchez, N., Genevaux, J. M.: Porous layer impedance applied to a moving wall: Application to the radiation of a covered piston. J. Acoust. Soc. Am. Vol. 121 (2007), p. 206–213.

Olivier Doutres, Nicolas Dauchez (e-mail: nicolas.dauchez@univ-lemans.fr), Jean-Michel Genevaux, Olivier Dazel
LAUM, CNRS, Université du Maine, Av. O. Messiaen, 72095 Le Mans, France

A thermogravimetric study of the behaviour of biomass blends during combustion

Ivo Jiříček¹, Pavla Rudasová², Tereza Žemlová¹
¹ Power Engineering Department, Institute of Chemical Technology, Prague, Technická 5, Prague 6, Czech Republic
² Škoda Power, s. r. o., Tylova 1/57, 301 28 Pilsen, Czech Republic
Correspondence to: jiriceki@vscht.cz

Abstract: The ignition and combustion behavior of biomass and biomass blends under typical heating conditions were investigated. Thermogravimetric analyses were performed on stalk and woody biomass, alone and blended with various additive weight ratios. The combustion process was enhanced by adding oxygen to the primary air. This led to shorter devolatilization/pyrolysis and char burnout stages, which both took place at lower temperatures than in air alone. The results of the ignition study of stalk biomass show a decrease in ignition temperature as the particle size decreases. This indicates homogeneous ignition, where the volatiles burn in the gas phase, preventing oxygen from reaching the particle surface. The behavior of biomass fuels in the burning process was analyzed, and the effects of heat production and additive type were investigated. Mixing with additives is a method for modifying biofuel and obtaining a more continuous heat release process. Differential scanning calorimetric-thermogravimetric (DSC-TGA) analysis revealed that when the additive is added to biomass, the volatilization rate is modified, the heat release is affected, and the combustion residue is reduced at the same final combustion temperature.

Keywords: straw, lignin, peat, charcoal, combustion behavior, biomass blends, thermogravimetry.

1 Introduction

Biomass, such as straw, grasses and wood, is used in various forms for energy production. Many technologies for biomass utilization have been studied in the last two decades, e.g. (co)-combustion, pyrolysis, gasification and liquefaction. These technologies are in various stages of development, whereby combustion is the most developed and most frequently applied. Biofuel products are sometimes mixed with other biomass, semi-fossil peat, fossil coal and catalyst to achieve better control of the burning process [1].

Until recently, there have been few studies on the co-firing of biomass blends for energy generation [2]. It is anticipated that blending low-grade biomass with higher-quality biomass will reduce flame stability problems, and will also minimize corrosion effects due to the deposited ash containing low melting point salts. The present work was undertaken to determine whether blending different biomass fuels influences the combustion performance, and whether the addition of a specific char additive can modify the burning velocity in the burnout stage and unify the thermal properties. Non-isothermal thermogravimetry was applied to determine the combustion characteristics of six samples, namely wheat straw, rape straw, flax straw (leftover after scutching), pulp-mill lignin, garden peat, and hardwood charcoal.
2 Materials and experiments

All the samples originated from eastern EU countries. The initial samples were milled and sieved, and seven particle fractions between 0.08 mm < d < 2 mm in diameter were used for the ignition study. For the combustion study, the sample particle size was in the range of 0.15–0.25 mm. This size enables uniform packing on the sample pan. Proximate and ultimate analyses of these biomass samples were made using standard procedures, i.e. the ASTM D 3172-89 method and the TGA method [3]. The gross calorific value (GCV) was determined as per ASTM D2015-66. The results are given in Table 1.

Thermogravimetric tests were performed in the TA Instruments Q600 simultaneous differential scanning calorimetry-thermogravimetry (DSC-TGA) apparatus. The weight precision of the instrument is 0.1 μg. Small samples weighing about 5 mg were placed in an open platinum sample pan, and uniform packing of the samples was ascertained. The samples were heated to 1000 °C at a constant heating rate of 10 °C/min, under a constant air-oxygen flow rate of 120 ml/min (air flow rate of 100 ml/min and high-purity oxygen flow rate of 20 ml/min) through the sample chamber.

The thermograms were analyzed to determine the relevant combustion parameters. (dw/dt)max indicates the maximum reactivity attained in terms of rate of weight loss (%/min) at the DTG (first derivative of the TG curve) peak temperatures; (dw/dt)mean is the average rate of weight loss. To determine the ignition temperature, two points on the TG curve were first identified. One is the point at which a vertical line from the sharp DTG peak (dw/dt)max crosses the TG curve. The other is the point at which devolatilization begins. Two tangents drawn to the TG curve at these points intersect at the ignition temperature (Ti). The burnout temperature (Tbo) was obtained in a similar way.

3 Results and discussion

The effect of particle size on ignition temperature was investigated on wheat straw and flax straw samples. Other samples are being tested, and the results will be reported later. The ignition temperature Ti of the seven particle fractions was found to follow a linear trend between 0.08 mm < d < 2 mm. Linear regressions gave the following relationships (d in mm):

wheat straw: Ti (°C) = 241.9 + 2.731·d,
flax straw: Ti (°C) = 265.3 + 2.427·d.

These values imply that wheat straw ignites at lower temperatures than flax straw. Although the proximate analysis results differ considerably, the ignition temperatures of the biomass samples changed only within a narrow range, see Table 2. Generally, the volatile matter, the flammability of the volatiles, and the transport from the particle determine whether ignition of an isolated particle occurs heterogeneously or homogeneously (gas ignition). According to thermal explosion theory, heterogeneous ignition is characterized by a decrease in ignition temperature as the particle size increases [1]. Our data show a decrease in ignition temperature as the particle size decreases, which indicates homogeneous ignition. Volatiles seem to evolve early in the combustion sequence. After ignition, they burn in the gas phase, preventing oxygen from reaching the particle surface. When a volatile hydrocarbon is burned, a large number of different oxygen radicals are involved in various radical chain reactions.
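Evaluated at the mean particle size of the combustion study (0.15–0.25 mm, i.e. d of about 0.2 mm), the two regressions recover the Ti values of Table 2:

```python
# Ignition-temperature regressions evaluated at d = 0.2 mm (d in mm).
t_i_wheat = lambda d: 241.9 + 2.731 * d
t_i_flax = lambda d: 265.3 + 2.427 * d

print(f"{t_i_wheat(0.2):.1f}")  # 242.4 C, Table 2: 243 C
print(f"{t_i_flax(0.2):.1f}")   # 265.8 C, Table 2: 266 C
```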
The combustion study showed that biomass degradation takes place in two steps: between 180 °C and 370 °C, volatiles are released and burned (devolatilization/pyrolysis step), and at 370–490 °C char combustion takes place (oxidation step). Biomass can be divided into three categories according to the heat release process. Most biomass falls into the first category, which is characterized by a major heat release in the devolatilization/pyrolysis step. Wheat straw and rape straw are typical representatives of this category, where the predominant form of combustion is gas-phase oxidation of the volatile species, see Figure 1. Peat and pulp-mill lignin fall into the second category, characterized by the highest heat release in the char oxidation step. The third category is reserved for biomass chars.

Table 1: Analyses of the samples in wt.% dry base, and lower heating value (LHV)

| Fuel | Volatile matter (wt.%) | Ash (wt.%) | C (wt.%) | K (wt.%) | Cl (wt.%) | S (wt.%) | LHV (MJ/kg) |
| Wheat straw | 81.0 | 6.6 | 41.6 | 2.36 | 0.11 | 0.10 | 15.5 |
| Rape straw | 79.5 | 3.5 | 40.0 | 2.50 | 0.20 | 0.20 | 14.8 |
| Flax straw | 78.6 | 3.6 | 45.2 | n.a. | 0.06 | 0.04 | 19.1 |
| Pulp-mill lignin | 63.6 | 5.3 | 55.7 | 0.01 | 0.074 | 0.73 | 21.0 |
| Peat | 65.0 | 2.8 | 60.0 | 2.86 | 0.17 | 0.20 | 23.0 |
| Charcoal | 20.8 | 0.6 | 86.0 | 0.43 | 0.001 | 0.01 | 33.0 |

Table 2: Ignition temperature Ti, burnout temperature Tbo and combustion characteristic factor CCF for biomass fuels of particle size 0.15–0.25 mm

| Fuel | Ti (°C) | Tbo (°C) | CCF (·10⁻⁷) |
| Wheat straw | 243 | 447 | 1.97 |
| Rape straw | 247 | 448 | 1.98 |
| Flax straw | 266 | 440 | 2.22 |
| Lignin | 257 | 462 | 1.19 |
| Peat | 251 | 445 | 1.40 |
| Charcoal | 425 | 513 | 2.95 |
Figure 1: Heat release process in biomass fuels
Figure 2: Heat release process in straw-lignin blends

Blending seems advantageous if samples from different categories are chosen. For example, a 20 % or 40 % addition of pulp-mill lignin to wheat straw modifies the volatilization rate and affects the oxidation. The heat release is more balanced and more continuous, see Figure 2. The released heat can be calculated from the DSC curves as

$$Q = \beta\int_{ignition}^{\infty}(T - T_i)\,\mathrm{d}t = \beta\,S = \beta\,\Delta w\,h, \qquad (1)$$

where β is the heat transfer constant from the sample to the metal wall, S is the area under the DSC curve, Δw is the average width of the heat release peak, and h is the exothermic peak height. Neglecting the difference in the β constant between these biomass species, the area under the DSC curve reflects the heat release. However, the value of this heat is lower than the gross calorific value, because of the loss from incompletely burned volatiles.

The effect of the additive on the heat release process of charcoal and its blends is shown in Figure 3. The additive contains a catalyst that enables oxygen transport. The charcoal used in the study seems to have some semi-char content, which upon heating degrades up to a temperature of about 500 °C. After this, the final char oxidizing step takes place. If mixed with a suitable ratio of additive, for example below 20 %, the heat release can be more continuous and the burnout temperature decreases; thus, the combustion efficiency is increased. Wood charcoal combustion is a solid-phase reaction. A heterogeneous reaction involving carbon is usually slower than a gas-phase reaction, and this seems to be the reason for the higher burnout temperatures (Tbo) that are observed. The difference in burnout temperature between charcoal and the other biomass samples was in the range of 51–73 °C, see Table 2. The additive ignites at a temperature of 440 °C and is burned out at a temperature of 508 °C, which is 5 °C lower than the burnout temperature of charcoal. The additive blends investigated here were able to lower the burnout temperature of charcoal by up to 4 °C.

Figure 3: Heat release process in charcoal-additive blends

A parameter called the combustion characteristic factor, CCF [4], can be used as a criterion for fuel combustion performance, defined as

$$CCF = \frac{\left(\frac{\mathrm{d}w}{\mathrm{d}t}\right)_{max}\cdot\left(\frac{\mathrm{d}w}{\mathrm{d}t}\right)_{mean}}{T_i^2\cdot T_{bo}}, \qquad (2)$$

where (dw/dt)max and (dw/dt)mean are the maximum and average burning velocity (%/min), and Ti and Tbo are the ignition temperature and the burnout temperature (K). This factor, which includes the ease of ignition, the firing velocity and the burnout temperature, is a comprehensive parameter, used here to compare the combustion performance of biomass fuels. CCF values were calculated for several biomass fuels and are listed in Table 2. With the exception of pulp-mill lignin and peat, the values are near to or greater than 2·10⁻⁷ for all other biomass fuels, indicating their good general burning performance. The CCF values for the blends were found to fall between the values of the original fuels. The additive CCF value was found to be the highest. The fuels with the highest CCF values, e.g. charcoal (CCF = 2.95·10⁻⁷) and the additive (CCF = 3.23·10⁻⁷), suggest that they may be advantageously used for blending with other biomass.
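The CCF computation of eq. (2) is easily scripted. Since the paper does not tabulate (dw/dt)max and (dw/dt)mean, the rates below are hypothetical placeholders chosen only to show the order of magnitude; the temperature data are those of wheat straw in Table 2.

```python
# Combustion characteristic factor, eq. (2). Temperatures are converted to
# kelvin; burning velocities are in %/min. The rate values in the example are
# illustrative placeholders, not measured data.
def ccf(dwdt_max, dwdt_mean, t_i_celsius, t_bo_celsius):
    t_i = t_i_celsius + 273.15    # ignition temperature, K
    t_bo = t_bo_celsius + 273.15  # burnout temperature, K
    return dwdt_max * dwdt_mean / (t_i**2 * t_bo)

# wheat straw (T_i = 243 C, T_bo = 447 C) with hypothetical rates:
print(f"{ccf(12.0, 3.1, 243, 447):.2e}")  # ~1.9e-07, same order as Table 2
```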
4 Conclusion

In this study, thermogravimetric experiments were performed on a number of biomass species intended for use as fuels. Some quantitative characteristics of the ignition, devolatilization/pyrolysis, char burning and burnout stages are listed and compared. Combustion of wheat straw showed a longer transition stage between volatilization and char burning, so mixing wheat straw with a lignin additive is a method for modifying the biofuel and obtaining a more continuous heat release process. When an additive is added to charcoal, the heat release is affected and the burnout temperature decreases; thus, the combustion efficiency increases. The comprehensive parameter CCF for the straw biomass fuels and charcoal in this project is near to or greater than 2·10⁻⁷, indicating good combustion performance. In further work, we intend to study additives for intensifying heterogeneous charcoal oxidation, which is the slowest step in the overall wood combustion process.

Acknowledgement

Financial support for specific university research in MSMT project No. 21/2010 is gratefully acknowledged.

References

[1] Sami, M., Annamalai, K., Wooldridge, M.: Co-firing of coal and biomass fuel blends. Progress in Energy and Combustion Science, 2001, Vol. 27, p. 171–214.
[2] van Loo, S., Koppejan, J.: The Handbook of Biomass Combustion and Co-firing. Earthscan, 2008, p. 22–38. ISBN 978-1-84407-249-1.
[3] Jiříček, I., Žemlová, T., Macák, J., Janda, V., Viana, M.: Paliva, 2009, Vol. 1, p. 19–22.
[4] Nie, Q. H., Sun, S. Z., Li, Z. Q.: Thermogravimetric analysis of the combustion characteristics of brown coal blends. Combustion Science and Technology, 2001, Vol. 7, p. 71–76.

Fire resistance of axially loaded slender concrete filled steel tubular columns: development of a three-dimensional numerical model and comparison with Eurocode 4

A. Espinós, A. Hospitaler, M. L. Romero

In recent years, concrete filled tubular (CFT) columns have become popular among designers and structural engineers, due to a series of highly appreciated advantages: high load-bearing capacity, high seismic resistance, attractive appearance, reduced column footing, fast construction technology and high fire resistance without external protection. In a fire, the degradation of the material properties causes CFT columns to become highly nonlinear and inelastic, which makes it quite difficult to predict their failure. In fact, it is still not possible for analytical methods to predict with enough accuracy the behaviour of columns of this kind when exposed to fire. Numerical models are therefore widely sought. Many numerical simulations have been carried out worldwide, without obtaining satisfactory results. This work proposes a three-dimensional numerical model for studying the actual fire behaviour of columns of this kind. The model was validated by comparing the simulation results with fire resistance tests carried out by other researchers, as well as with the predictions of the Eurocode 4 simplified calculation model.

Keywords: fire resistance, concrete filled steel tubular columns, finite element analysis.

1 Introduction

Filling hollow steel columns with concrete is an interesting way to improve their fire resistance [7]. The temperature at the surface of a hollow structural section without external protection increases quickly during the development of a fire. However, if the steel tube is filled with concrete, while the steel section gradually loses its resistance and rigidity, the load is transferred to the concrete core. The concrete core heats up more slowly, thus increasing the fire resistance of the column. Besides its structural function, the steel tube acts as a radiation shield for the concrete core. This, combined with a steam layer at the steel-concrete boundary, leads to a lower temperature rise in the concrete core compared with exposed reinforced concrete structures [7].

During a fire, the temperature distribution in the cross-section of a CFT column is not uniform: steel and concrete have very different thermal conductivities, which generates a behaviour characterized by noticeable heating transients and high temperature differentials across the cross-section. Due to these differentials, CFT columns can achieve high fire resistance times without external fire protection [7]. However, it is necessary to resort to numerical models in order to make an accurate prediction of these temperature profiles along the fire exposure time [8], [9]. In this work, the ABAQUS finite element analysis package [1] was employed to model the behaviour of slender axially loaded CFT columns exposed to fire. With this software, a sequentially coupled nonlinear thermal-stress analysis was conducted. The results of the simulations were compared with a series of fire resistance tests available in the literature [11], as well as with the predictions of the Eurocode 4 [6] simplified calculation model.

2 Development of the numerical model

2.1 Finite element mesh

A three-dimensional numerical model was developed in ABAQUS [1], with variable parameters such as the length of the column (L), the external diameter (D), the thickness of the steel tube (t) and the thermal and mechanical material properties. It consisted of two parts: the concrete core and the steel tube. Due to the symmetry of the geometry and boundary conditions, only a quarter of the column was modelled. The three-dimensional eight-node solid element C3D8RT was used to mesh the model. It is an eight-node thermally coupled brick, with trilinear displacement and temperature, reduced integration and hourglass control. The mesh density was controlled to have a maximum element size of 2 cm, which proved to be sufficient to predict with enough accuracy the thermal and mechanical behaviour of CFT columns under fire.

2.2 Material properties

The numerical model took into account the temperature-dependent thermal and mechanical properties of the materials. For concrete, Lie's model [12] was employed, as it proved to be the one that best predicted the behaviour of the concrete infill in CFT columns, according to Hong & Varma [9]. The mechanical model implemented in ABAQUS employed the hyperbolic Drucker-Prager yield surface. The thermal properties of concrete at elevated temperatures were taken from EN 1992-1-2 [4]. For steel, the temperature-dependent thermal and mechanical properties recommended in EN 1993-1-2 [5] were adopted. The isotropic multiaxial plasticity model with the von Mises yield surface was employed.
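As an illustration of the temperature-dependent steel model, the sketch below interpolates the effective yield strength reduction factors of Table 3.1 of EN 1993-1-2 [5]. The tabulated k_y values are transcribed from the standard and should be verified against it before any use.

```python
# EN 1993-1-2 effective yield strength reduction for steel (Table 3.1 values,
# linear interpolation between tabulated temperatures).
import numpy as np

THETA = np.array([20, 100, 200, 300, 400, 500, 600, 700,
                  800, 900, 1000, 1100, 1200], dtype=float)   # deg C
K_Y = np.array([1.00, 1.00, 1.00, 1.00, 1.00, 0.78, 0.47, 0.23,
                0.11, 0.06, 0.04, 0.02, 0.00])

def fy_theta(fy20, theta):
    """Effective yield strength at temperature theta (deg C)."""
    return fy20 * np.interp(theta, THETA, K_Y)

# e.g. the steel of specimen 1 (fy = 401.93 N/mm^2) at 600 deg C:
print(fy_theta(401.93, 600.0))   # -> about 188.9 N/mm^2
```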
The values of the thermal expansion coefficient for concrete and steel recommended by Hong and Varma [9] were employed: $\alpha_s = 12\cdot10^{-6}$ °C⁻¹, $\alpha_c = 6\cdot10^{-6}$ °C⁻¹. The moisture content of the concrete infill was not modelled in this research, which lies on the safe side.

2.3 Thermal analysis

For the thermal analysis, the standard ISO-834 [10] fire curve was applied to the exposed surface of the CFT column model as a thermal load. The thermal contact at the steel-concrete boundary was modelled by employing the "gap conductance" and "gap radiation" options. For the governing parameters of the heat transfer problem, the values recommended in EN 1991-1-2 [3] were adopted.

3 Validation of the numerical model

The three-dimensional numerical model was validated by comparing the simulations with experimental fire resistance tests [11] and with the EC4 simplified calculation model [6].

3.1 Comparison with experimental results

The numerical model was employed to predict the standard fire behaviour of the series of CFT column specimens listed in Table 1. These specimens were tested at the NRCC, and their results were published by Lie & Caron [11]. All the specimens tested were circular, filled with siliceous aggregate concrete and subjected to a concentric compression load. Their total length was 3810 mm, although only the central 3048 mm were directly exposed to fire. Because of the loading conditions, all the tests were assumed to be fix-ended.

Table 1: List of CFT columns analyzed, from the NRCC research report [11]

| Column specimen | D (mm) | t (mm) | fy (N/mm²) | fck (N/mm²) | N (kN) | μ = N/N_pl,Rd | FRR (min) |
| 1 | 141 | 6.5 | 401.93 | 28.62 | 131 | 8.90 % | 57 |
| 2 | 168 | 4.8 | 346.98 | 28.62 | 218 | 15.37 % | 56 |
| 3 | 219 | 4.8 | 322.06 | 24.34 | 492 | 26.19 % | 80 |
| 4 | 219 | 4.8 | 322.06 | 24.34 | 384 | 20.44 % | 102 |
| 5 | 219 | 8.2 | 367.43 | 24.34 | 525 | 18.88 % | 82 |
| 6 | 273 | 5.6 | 412.79 | 26.34 | 574 | 17.08 % | 112 |
| 7 | 273 | 5.6 | 412.79 | 26.34 | 525 | 15.63 % | 133 |
| 8 | 273 | 5.6 | 412.79 | 26.34 | 1000 | 29.76 % | 70 |
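The load level μ = N/N_pl,Rd in Table 1 can be cross-checked from the section geometry, assuming the room-temperature squash load N_pl,Rd = A_s·f_y + A_c·f_ck without partial safety factors (an assumption consistent with the tabulated values):

```python
# Cross-check of the load level column of Table 1.
import math

def load_level(D, t, fy, fck, N):
    A_s = math.pi / 4 * (D**2 - (D - 2 * t)**2)   # steel tube area, mm^2
    A_c = math.pi / 4 * (D - 2 * t)**2            # concrete core area, mm^2
    return N * 1e3 / (A_s * fy + A_c * fck)       # N in kN, stresses in MPa

# specimen 1: D = 141 mm, t = 6.5 mm, fy = 401.93 MPa, fck = 28.62 MPa, N = 131 kN
print(f"{load_level(141, 6.5, 401.93, 28.62, 131):.2%}")  # -> 8.90 %, as in Table 1
```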
Fig. 1: Comparison between calculated and measured axial displacement, for test No. 4

For each simulation, the axial displacement at the top of the column versus the fire exposure time was registered, and this curve was compared with the curve obtained in the fire resistance test [11]. Fig. 1 shows an example of the comparison of the two curves for one of the specimens studied. From these curves, the fire resistance rating (FRR) was obtained for each of the specimens under study. The failure criteria from EN 1363-1 [2] were adopted. This standard establishes that the failure time is given by the more restrictive of the following two limits: maximum axial displacement, and maximum axial displacement velocity. By applying these criteria, the values in Table 2 were obtained.

Table 2: Predicted and measured FRR and maximum axial displacement (δmax)

| Column specimen | FRR test (min) | FRR simulation (min) | FRR test/simulation | δmax test (mm) | δmax simulation (mm) | δmax test/simulation |
| 1 | 57 | 72 | 0.79 | 24.09 | 24.35 | 0.99 |
| 2 | 56 | 75 | 0.75 | 20.48 | 19.25 | 1.06 |
| 3 | 80 | 74 | 1.08 | 18.13 | 12.36 | 1.47 |
| 4 | 102 | 97 | 1.05 | 18.77 | 16.23 | 1.16 |
| 5 | 82 | 68 | 1.21 | 20.36 | 19.30 | 1.05 |
| 6 | 112 | 126 | 0.89 | 16.40 | 17.71 | 0.93 |
| 7 | 133 | 137 | 0.97 | 19.67 | 18.61 | 1.06 |
| 8 | 70 | 70 | 1.00 | 5.51 | 10.35 | 0.53 |
| Average | | | 0.97 | | | 1.03 |
| Standard deviation | | | 0.15 | | | 0.26 |
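A sketch of how the EN 1363-1 [2] criteria can be applied to a computed displacement curve is given below. The limits used, C = h/100 mm for the axial contraction and 3h/1000 mm/min for its rate (h the initial column height in mm), follow the usual reading of the standard for vertical load-bearing members, but should be treated as assumptions to verify against the standard itself.

```python
# Extracting the FRR from a displacement-time curve with EN 1363-1-type limits.
import numpy as np

def fire_resistance_rating(time_min, contraction_mm, h_mm):
    """Return the first time at which either limit is exceeded (None if never)."""
    c_lim = h_mm / 100.0            # maximum axial contraction, mm (assumed)
    rate_lim = 3.0 * h_mm / 1000.0  # maximum contraction rate, mm/min (assumed)
    rate = np.gradient(contraction_mm, time_min)
    failed = (contraction_mm > c_lim) | (rate > rate_lim)
    return time_min[np.argmax(failed)] if failed.any() else None

# usage with a synthetic runaway curve for a 3810 mm column:
t = np.linspace(0.0, 120.0, 1201)
delta = 0.05 * t + np.exp((t - 100.0) / 4.0)  # slow creep, then runaway
print(fire_resistance_rating(t, delta, 3810.0))  # fails at ~115 min
```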
by applying these criteria, the values in table 2 were obtained. as shown in fig. 2, most of the values obtained lie in the region of 15 % error, apart from two values, corresponding to column specimens no. 1 and no. 2, which have the smallest diameters.

the maximum axial displacement (δmax) was also obtained for each of the column specimens studied here. table 2 shows the calculated and measured values, which are plotted in fig. 3, where it can again be seen that most of the cases lie in the region of 15 % error, apart from specimens no. 3 and no. 8, corresponding to those with a higher loading level, over 20 % of the maximum load-bearing capacity of the column at room temperature.

table 2: predicted and measured frr and maximum axial displacement (δmax)

column               frr (min)            ξ = frr,test /    δmax (mm)            ξ = δmax,test /
specimen             test   simulation    frr,simulation    test    simulation   δmax,simulation
1                     57      72          0.79              24.09   24.35        0.99
2                     56      75          0.75              20.48   19.25        1.06
3                     80      74          1.08              18.13   12.36        1.47
4                    102      97          1.05              18.77   16.23        1.16
5                     82      68          1.21              20.36   19.30        1.05
6                    112     126          0.89              16.40   17.71        0.93
7                    133     137          0.97              19.67   18.61        1.06
8                     70      70          1.00               5.51   10.35        0.53
average                                   0.97                                   1.03
standard deviation                        0.15                                   0.26

fig. 2: comparison of the fire resistance ratings, calculated versus test results (±15 % error band)
fig. 3: comparison of the maximum axial displacement, calculated versus test results (±15 % error band)

3.2 comparison with the eurocode 4 simplified calculation model

in this section, the numerical model is compared with the predictions of the ec4 simplified calculation model [6], obtaining the results shown in table 3. it is seen in fig. 4 that the proposed numerical model gives a better prediction of the fire resistance rating, showing a very accurate trend, whereas the ec4 simplified model turns out to be excessively conservative. we must note that the ec4 simplified model takes into account neither the thermal expansion of the materials nor the air gap at the steel-concrete boundary, which lies on the safe side and gives a very conservative prediction. if we apply these simplifications to our numerical model, smaller values of the fire resistance ratings are obtained, very similar to those predicted by ec4, as shown in table 3. as can be seen in fig. 5, our predicted values reproduce the ec4 results quite well, except for the tests with fire resistance ratings around 120 minutes, where our numerical model provides more accurate results, producing a trend that is closer to reality.

table 3: comparison of the numerical model and ec4 predictions with the tests

column               frr (min)                                               ξ = frr,test / frr,calc
specimen             test   simulation   simulation (no exp.)   ec4          simulation   simulation (no exp.)   ec4
1                     57      72           49                    49          0.79         1.16                   1.16
2                     56      75           46                    46          0.75         1.22                   1.22
3                     80      74           52                    49          1.08         1.54                   1.63
4                    102      97           63                    61          1.05         1.62                   1.67
5                     82      68           52                    51          1.21         1.58                   1.61
6                    112     126          118                    91          0.89         0.95                   1.23
7                    133     137          126                    96          0.97         1.06                   1.39
8                     70      70           58                    56          1.00         1.21                   1.25
average                                                                      0.97         1.29                   1.39
standard deviation                                                           0.15         0.25                   0.21

fig. 4: comparison of frr, proposed numerical model, and ec4 model
fig. 5: comparison of frr, proposed model (without expansion), and ec4 model

4 summary and conclusions

a three-dimensional numerical model for axially loaded slender cft columns under fire has been presented. by means of this model, a prediction was made of the behaviour under standard fire conditions of eight column specimens previously tested by the nrcc research group [11]. the proposed numerical model showed better behaviour for columns with low slenderness and loading levels under 20 %. despite these two aspects, the model showed an accurate response when contrasted with the fire tests.

the study has also proved that the predictions of the ec4 simplified calculation model [6] can be reproduced with the proposed numerical model by eliminating the thermal expansion of the materials, which lies on the safe side. however, if the real behaviour of cft columns under fire is to be predicted, this factor must be taken into account, as it extends the failure time: the expansion of the steel tube produces an opposed axial strain in the early stages of heating, as well as an opening of the gap at the steel-concrete interface, which delays the heating of the concrete core and thus increases the fire resistance rating. the proposed numerical model proved to give better predictions than the ec4 simplified model, which turned out to be excessively conservative.

references

[1] abaqus. abaqus/standard version 6.6 user's manual: volumes i–iii. pawtucket, rhode island: hibbit, karlsson & sorenson, inc., 2005.
[2] cen (european committee for standardisation). en 1363-1: fire resistance tests. part 1: general requirements. brussels: cen, 1999.
[3] cen (european committee for standardisation). en 1991-1-2, eurocode 1: actions on structures, part 1.2: general actions – actions on structures exposed to fire. brussels: cen, 2002.
[4] cen (european committee for standardisation). en 1992-1-2, eurocode 2: design of concrete structures, part 1.2: general rules – structural fire design. brussels: cen, 2004.
[5] cen (european committee for standardisation). en 1993-1-2, eurocode 3: design of steel structures, part 1.2: general rules – structural fire design. brussels: cen, 2005.
[6] cen (european committee for standardisation). en 1994-1-2, eurocode 4: design of composite steel and concrete structures, part 1.2: general rules – structural fire design. brussels: cen, 2005.
[7] cidect. twilt, l., hass, r., klingsch, w., edwards, m., dutta, d.: design guide for structural hollow section columns exposed to fire. cidect (comité international pour le développement et l'etude de la construction tubulaire). cologne, germany: verlag tüv rheinland, 1996.
[8] ding, j., wang, y. c.: realistic modelling of thermal and structural behaviour of unprotected concrete filled tubular columns in fire. journal of constructional steel research, vol. 64 (2008), p. 1086–1102.
[9] hong, s., varma, a. h.: analytical modeling of the standard fire behavior of loaded cft columns. journal of constructional steel research, vol. 65 (2009), p. 54–69.
[10] iso (international standards organization). iso 834: fire resistance tests, elements of building construction. switzerland: international standards organisation, 1980.
[11] lie, t. t., caron, s. e.: fire resistance of circular hollow steel columns filled with siliceous aggregate concrete: test results. internal report no. 570. ottawa, canada: institute for research in construction, national research council of canada, 1988.
[12] lie, t. t.: fire resistance of circular steel columns filled with bar-reinforced concrete. journal of structural engineering-asce, vol. 120 (1994), no. 5, p. 1489–1509.

ana espinós
e-mail: aespinos@mes.upv.es
antonio hospitaler
manuel l. romero
instituto de ciencia y tecnología del hormigón, universidad politécnica de valencia

acta polytechnica vol. 51 no. 4/2011

dynamical symmetry breaking in r^n quantum gravity

a. t. kotvytskiy, d. v. kruchkov

abstract

we show that in the r^n gravitation model there is no dynamical symmetry breaking effect in the formalism of the schwinger-dyson equation (in flat background space-time). a general formula for the second variation of the gravitational action in the quantum corrections hμν is obtained (in arbitrary background metrics).

keywords: quantum gravity, schwinger-dyson equations formalism, dynamical symmetry breaking.

the study of dynamical mass generation and dynamical symmetry breaking in different external fields [1, 2], including the gravitational field [3–5], is an important step in studying fundamental interactions, which can either be considered in field theory, where an attempt at quantization can be made, or treated as an external interaction. this paper is devoted to the possibility of dynamical symmetry breaking in such gravitation models using the schwinger-dyson equation formalism [6–9]. the primary goal is a study of some properties of f(r)-gravity [10], where r is a scalar curvature. one of the most discussed variants of such models is r^n gravity [11, 12]. in particular, we are interested in a possible effect of dynamical chiral symmetry breaking in this model under graviton and fermion interaction, which leads to mass formation of fermions. this problem is important for several reasons. first, quantum corrections should be taken into consideration in different scenarios describing the evolution of the early universe. second, an understanding of the dynamical symmetry breaking mechanism is important for black hole physics. third, r^n gravity, as a modified theory of general relativity, is also interesting from the phenomenological point of view, the peculiarity of which appears in various cosmological models. in particular, it is necessary to introduce either dark energy or a quintessence with a much more exotic equation of state to explain the accelerated expansion of the observed universe (staying within the limits of relativity theory). to sum up, this is significant for a description of the future of the universe. however, if one modifies gravity in a proper way, it is also possible to obtain interesting cosmological dynamics without introducing new concepts.
here, we provide a comparative analysis of the above formalism both in the case of a flat background metric (minkowski space) and in the case of an arbitrary background. let us expand the four-dimensional space-time metric as follows:
$$\tilde g_{\mu\nu} = g_{\mu\nu} + h_{\mu\nu}, \tag{1}$$
where $\tilde g_{\mu\nu}$ is the perturbed metric, $g_{\mu\nu}$ is the background metric, and $h_{\mu\nu}$ are quantum corrections. this definition leads to the following results. in the case of flat background space-time in the r^n gravity model, the corrections are of order $o(h^n)$ and higher, which gives no possibility to obtain the necessary equations; we can speak about the absence of a symmetry breaking effect (at least in the schwinger-dyson equation formalism). however, if the background metric is curved, then corrections of lower order appear, specifically quadratic ones $o(h^2)$. they allow us to obtain the schwinger-dyson equations and to test them for possible dynamical symmetry breaking. an important conclusion is the following: if our real universe is described by a curved metric, then while constructing quantum gravity theory we should take into consideration the summands of all powers in r, as they provide the same degree in small quantum corrections h.

1 schwinger-dyson equations

one possible method for a dynamical symmetry breaking study is the schwinger-dyson equation formalism. since it is impossible to write a closed system of equations for all elements of the feynman diagrams, we have to use some approximations, which allow us to solve the schwinger-dyson equation and to find the form of the exact propagator. the exact propagators and the vertex part are connected by the integral relation
$$S^{-1} - S_0^{-1} = i\,\frac{\delta \Gamma_2}{\delta S}, \tag{2}$$
where $S$, $S_0$ are the exact and free fermion propagators, $\delta\Gamma_2/\delta S$ describes two-particle irreducible interaction diagrams, and $\Gamma_2$ is a part of the effective action
$$\Gamma[S] = -i\,\mathrm{Sp}\left(\log S^{-1} + S_0^{-1} S\right) + \Gamma_2[S]. \tag{3}$$
here we confine ourselves to the exact fermion propagator, which is written in the original form
$$S(p) = \frac{1}{A(p^2)\,p_\mu \gamma^\mu - B(p^2)}, \tag{4}$$
where $A(p^2)$, $B(p^2)$ are some unknown functions of the four-momentum $p$, and $\gamma^\mu$ are dirac matrices. then the schwinger-dyson equation for this propagator can be written as [13–15]
$$(A - 1)\,p_\mu\gamma^\mu - B = \int \frac{d^4 q}{(2\pi)^4 i}\;\Gamma^{\alpha\beta}(p,\,q-p)\,S'(q)\,\Gamma'^{\mu\nu}(q,\,p-q)\,G'_{\alpha\beta\mu\nu}(p-q), \tag{5}$$
where $\Gamma'$, $G'$ are an exact vertex function and an exact graviton propagator. the infinite set of sdes determining the exact fermion and boson green functions, as well as the full interaction vertex, can be solved only within some truncation. this means that only a subset of feynman graphs is taken into account, which naturally leads to the disappearance of the magical cancellation of gauge-dependent terms in the s-matrix expansion. for this reason, the most widespread ladder approximation gives gauge-dependent results, when the fermion green function is treated as exact only in the case when the boson propagator and the interaction vertex are taken to be free [14]. in this way, we can define the functions $A(p^2)$, $B(p^2)$ for quantum r^n gravity. the following important remark should also be made: we have to decide what kind of gravity action and gauge-fixing term should be used. it is convenient to impose the following condition:
$$S_{gf} = -\frac{\beta_1}{2M^2}\int d^4x\,\sqrt{-g}\,\left(\nabla_\lambda h^{\lambda\mu} - \beta_2\nabla^\mu h\right)\left(g_{\mu\nu}\nabla_\rho\nabla^\rho + \beta_3\nabla_\mu\nabla_\nu\right)\left(\nabla_\sigma h^{\sigma\nu} - \beta_2\nabla^\nu h\right), \tag{6}$$
where $\beta_1$, $\beta_2$, $\beta_3$ are arbitrary parameters.
then, the second variation of the full action, which is the sum of the gravitational field action and the gauge-fixing action, takes the form
$$\delta^2\left(S_g + S_{gf}\right) = \frac{1}{2M^2}\int d^4x\;h^{\mu\nu} H_{\mu\nu\rho\sigma}\,h^{\rho\sigma}. \tag{7}$$
the gravitational field propagator is then defined as the operator inverse to $H_{\mu\nu\rho\sigma}$, that is,
$$G^{\mu\nu\rho\sigma} = M^2\left(H^{-1}\right)^{\mu\nu\rho\sigma}. \tag{8}$$
let us include the gravitational field in our consideration.

2 flat background metric

here we single out the part of the action quadratic in the corrections h, in order to write down the schwinger-dyson equations. instead of (1), the perturbed metric will be written as
$$g_{\mu\nu} \simeq \eta_{\mu\nu} + h_{\mu\nu}, \tag{9}$$
where $\eta_{\mu\nu}$ is the minkowski metric (we choose the signature $(+,-,-,-)$). we consider the following action:
$$S_g = \frac{1}{2M^2}\int d^4x\,\sqrt{-g}\,R^n. \tag{10}$$
note that action (10) does not have to be the full action for the gravitational field; the einstein gravitation linear in the curvature, the λ-term, and other possible variants can also be included. however, a discussion of the effects caused by the form (10) is the main goal of this paper. in the case of a flat background metric, we have the following expansion (with accuracy up to the second order approximation):
$$\sqrt{-g} \simeq 1 + \frac12 h - \frac14 h_{\mu\nu}h^{\mu\nu} + \frac18 h^2, \tag{11}$$
where the raising and lowering of indices is done with the background metric $\eta_{\mu\nu}$, and $h = \eta^{\mu\nu}h_{\mu\nu}$. the riemann tensor is
$$R^\mu{}_{\nu\rho\sigma} = \partial_\rho \Gamma^\mu_{\nu\sigma} - \partial_\sigma \Gamma^\mu_{\nu\rho} + \Gamma^\mu_{\tau\rho}\Gamma^\tau_{\nu\sigma} - \Gamma^\mu_{\tau\sigma}\Gamma^\tau_{\nu\rho}, \tag{12}$$
while the ricci tensor is determined by the contraction of the riemann tensor over the first and the third indices. then, for the scalar curvature we get
$$R = g^{\mu\nu}R_{\mu\nu} \simeq \frac12\left(\eta^{\mu\nu} - h^{\mu\nu} + h^{\mu\rho}h^\nu{}_\rho\right)\left(\partial_\alpha\partial_\mu h^\alpha{}_\nu - \partial_\alpha\partial^\alpha h_{\mu\nu} - \partial_\mu\partial_\nu h + \partial_\alpha\partial_\nu h^\alpha{}_\mu + o(h^2)\right) = \partial_\alpha\partial^\nu h^\alpha{}_\nu - \partial_\alpha\partial^\alpha h + o(h^2). \tag{13}$$
this implies that, in the case of r^n gravitation, the expansion in h has the form
$$R^n \simeq \left(\partial_\alpha\partial^\nu h^\alpha{}_\nu - \partial_\alpha\partial^\alpha h + o(h^2)\right)^n \sim o(h^n), \tag{14}$$
that is, the smallest order has power n in the quantum corrections. physically this means that the graviton propagator is missing in such a gravitation model, and only n-particle vertex functions exist. therefore, this formalism is not suitable for the study of dynamical symmetry breaking (dsb) here; in other words, we can say that there is no dynamical symmetry breaking effect in this approximation. now we proceed to the quite different situation of non-flat background space-time.

3 curved background metric

let us choose the action for the gravitational field in the form
$$S_g = \frac{1}{2M^2}\int d^4x\,\sqrt{-\tilde g}\,\tilde R^n, \tag{15}$$
where the tilde denotes a perturbed metric, determined by (1). since the background metric is not the minkowski one, summands of all orders in h appear in all redefined constructions. the expressions for the christoffel symbols can be reduced to the form (with accuracy within the second order in the quantum corrections)
$$\tilde\Gamma^\mu_{\nu\rho} \simeq \Gamma^\mu_{\nu\rho} + \frac12\left(g^{\mu\gamma} - h^{\mu\gamma}\right)\left(\nabla_\rho h_{\gamma\nu} + \nabla_\nu h_{\gamma\rho} - \nabla_\gamma h_{\nu\rho}\right), \tag{16}$$
where $\Gamma^\mu_{\nu\rho}$ are the christoffel symbols calculated from the unperturbed metric $g_{\mu\nu}$, and $\nabla_\rho$ is the covariant derivative relative to the same metric. the complete expression for the riemann tensor (in this approximation) is
$$\tilde R^\mu{}_{\nu\rho\sigma} \simeq R^\mu{}_{\nu\rho\sigma} + \frac12\left(g^{\mu\tau} - h^{\mu\tau}\right)\left(h_{\tau\varepsilon}R^\varepsilon{}_{\nu\sigma\rho} + h_{\varepsilon\nu}R^\varepsilon{}_{\tau\sigma\rho} + \nabla_\rho\nabla_\nu h_{\tau\sigma} - \nabla_\rho\nabla_\tau h_{\nu\sigma} - \nabla_\sigma\nabla_\nu h_{\tau\rho} + \nabla_\sigma\nabla_\tau h_{\nu\rho}\right) + \frac14\,g^{\mu\gamma}g^{\tau\varepsilon}\left[\left(\nabla_\sigma h_{\varepsilon\nu} + \nabla_\nu h_{\sigma\varepsilon} - \nabla_\varepsilon h_{\nu\sigma}\right)\left(\nabla_\tau h_{\gamma\rho} - \nabla_\rho h_{\gamma\tau} - \nabla_\gamma h_{\tau\rho}\right) + \left(\nabla_\sigma h_{\tau\gamma} - \nabla_\tau h_{\sigma\gamma} + \nabla_\gamma h_{\tau\sigma}\right)\left(\nabla_\rho h_{\nu\varepsilon} + \nabla_\nu h_{\rho\varepsilon} - \nabla_\varepsilon h_{\nu\rho}\right)\right].$$
here, the raising and lowering of indices is done with the background metric. hence, we find the scalar curvature
$$\tilde R = \tilde g^{\nu\sigma}\tilde R^\rho{}_{\nu\rho\sigma} \simeq \left(g^{\nu\sigma} - h^{\nu\sigma} + h^{\nu\alpha}h^\sigma{}_\alpha\right)\tilde R^\rho{}_{\nu\rho\sigma}. \tag{17}$$
let us introduce the following symbols:
$$H_1 \equiv -R_{\nu\sigma}h^{\nu\sigma} + \nabla_\nu\nabla_\sigma h^{\nu\sigma} - \nabla_\nu\nabla^\nu h, \tag{18}$$
$$H_2 \equiv R_{\nu\sigma}h^{\nu\alpha}h^\sigma{}_\alpha - h^{\rho\tau}\left(\nabla_\rho\nabla_\sigma + \nabla_\sigma\nabla_\rho\right)h_\tau{}^\sigma - h^{\rho\tau}\nabla_\rho\nabla_\tau h + h^{\rho\tau}\nabla_\sigma\nabla^\sigma h_{\rho\tau} + \frac12\nabla_\sigma h^{\tau\sigma}\nabla_\tau h - \nabla_\sigma h^{\tau\sigma}\nabla^\gamma h_{\gamma\tau} - \frac14\nabla_\tau h\nabla^\tau h + \frac12\nabla^\tau h\nabla^\gamma h_{\gamma\tau} + \frac34\nabla_\sigma h^{\varepsilon\rho}\nabla^\sigma h_{\rho\varepsilon} - \frac12\nabla_\varepsilon h_{\rho\sigma}\nabla^\sigma h^{\rho\varepsilon}. \tag{19}$$
the expression for the scalar curvature is then
$$\tilde R \simeq R + H_1 + H_2, \tag{20}$$
where $H_1$ contains only the first power of the quantum corrections, and $H_2$ contains the second powers. then, the n-th power of the scalar curvature becomes
$$\tilde R^n \simeq R^n + nR^{n-1}\left(H_1 + H_2\right) + \frac12\,n(n-1)R^{n-2}H_1^2. \tag{21}$$
in the case of an arbitrary background space-time we should take into account the expansion
$$\det(\tilde g) \simeq \det(g) + h_{\mu\nu}K^{\mu\nu}(g) + h_{\mu\nu}h_{\alpha\beta}F^{\mu\nu\alpha\beta}(g), \tag{22}$$
where
$$K^{\mu\nu}(g) = \varepsilon^{\alpha\beta\rho\sigma}\left(\delta^\mu_0\delta^\nu_\alpha\,g_{1\beta}g_{2\rho}g_{3\sigma} + \delta^\mu_1\delta^\nu_\beta\,g_{0\alpha}g_{2\rho}g_{3\sigma} + \delta^\mu_2\delta^\nu_\rho\,g_{0\alpha}g_{1\beta}g_{3\sigma} + \delta^\mu_3\delta^\nu_\sigma\,g_{0\alpha}g_{1\beta}g_{2\rho}\right).$$
we underline the fact that if the background metric is the minkowski one, $\eta_{\mu\nu}$, then $K^{\mu\nu} = -\eta^{\mu\nu}$ and, with accuracy within the first order, we recover the well-known formula $-\tilde g \simeq 1 + h$. the expression for $F^{\mu\nu\alpha\beta}(g)$ is cumbersome, and we do not present it here. to consider the possibility of dynamical symmetry breaking, it is necessary to have an expression for the second variation of the gravitational action. let us take into consideration the following expansion:
$$\sqrt{a + x_1 + x_2} \simeq \sqrt{a} + \frac{x_1 + x_2}{2\sqrt{a}} - \frac{x_1^2}{8\sqrt{a^3}}, \tag{23}$$
where $x_1 + x_2 \ll a$, and also $(x_1)^2 \sim x_2$.
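expansion (23) is easy to check symbolically; a small sympy sketch, using a bookkeeping parameter ε with x1 of first and x2 of second order:

```python
import sympy as sp

a, eps = sp.symbols('a epsilon', positive=True)
x1, x2 = sp.symbols('x1 x2')

# bookkeeping: x1 is first order, x2 second order in the small parameter
expr = sp.sqrt(a + eps * x1 + eps**2 * x2)
series = sp.series(expr, eps, 0, 3).removeO()

claimed = sp.sqrt(a) + (eps * x1 + eps**2 * x2) / (2 * sp.sqrt(a)) \
          - (eps * x1)**2 / (8 * sp.sqrt(a)**3)
print(sp.simplify(series - claimed))   # -> 0, the two agree to second order
```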
then,
$$\sqrt{-\tilde g} = \sqrt{-g - h_{\mu\nu}K^{\mu\nu} - h_{\mu\nu}h_{\alpha\beta}F^{\mu\nu\alpha\beta}} \simeq \sqrt{-g} - \frac{h_{\mu\nu}K^{\mu\nu} + h_{\mu\nu}h_{\alpha\beta}F^{\mu\nu\alpha\beta}}{2\sqrt{-g}} - \frac{\left(h_{\mu\nu}K^{\mu\nu}\right)^2}{8\sqrt{-g^3}}. \tag{24}$$
thus, we obtain the final form of the second variation:
$$\delta^{(2)}S_g = \frac{1}{2M^2}\int d^4x\,\Bigg(\sqrt{-g}\left(nR^{n-1}H_2 + \frac12\,n(n-1)R^{n-2}H_1^2\right) - \frac{n}{2\sqrt{-g}}\,h_{\mu\nu}K^{\mu\nu}R^{n-1}H_1 - \frac{h_{\mu\nu}h_{\alpha\beta}F^{\mu\nu\alpha\beta}}{2\sqrt{-g}}\,R^n - \frac{\left(h_{\mu\nu}K^{\mu\nu}\right)^2}{8\sqrt{-g^3}}\,R^n\Bigg). \tag{25}$$
note that in the case of quadratic gravitation (n = 2) and a flat background (r = 0), instead of (25) we obtain
$$\delta^{(2)}S_g = \frac{1}{2M^2}\int d^4x\,\sqrt{-g}\,H_1^2, \tag{26}$$
which coincides with [13]. so, it has been shown that in the case of an arbitrary background metric, equation (25) holds for any n. hence, we obtain the propagator (8) and the schwinger-dyson equations (5). this indicates that the effect of dynamical symmetry breaking is possible in r^n gravity.

4 conclusions

some quantum properties of the r^n gravity model have been considered in this paper. a comparative analysis of two cases, a) a flat background space-time, and b) an arbitrary curved background, has been carried out. expanding this model of gravity in quantum corrections hμν, we found that in the first case the smallest order of the quantum corrections is n. this means that in the quantum theory of r^n gravity the graviton propagator (for n > 2) does not exist; there is only a vertex function of graviton-graviton interaction, which is not used in this formalism. thus, there is no schwinger-dyson equation (5) and, therefore, there is no effect to discuss. in case b, a general formula (25) for the second variation of the gravitational action in the quantum corrections hμν is obtained, which in the limit r → 0 coincides with the previously known results. it is determined that, in this formulation of the problem, when studying the effects of dynamical symmetry breaking, the terms of all powers of the scalar curvature should be considered in the action for the gravitational field, because they give exactly the same order in quantum fluctuations as the einstein action (n = 1). therefore, if we represent a full gravitational action in the form $l = \alpha_1 R^1 + \alpha_2 R^2 + \alpha_3 R^3 + \ldots$, where $\alpha_i$ are some factors of the necessary dimension, then in the case of quantum gravity each term will contribute to the propagator of a graviton ($\sim h^2$), and we cannot neglect any term. in fact, this means that the schwinger-dyson equations exist for any n and, hence, the effect of dynamical symmetry breaking is possible. we would like to note the analogy with the fierz-pauli model, in which all undesirable degrees of freedom in the flat metric background are cancelled, whereas in a curved background one undesirable degree of freedom (the boulware-deser mode) appears again in the spectrum [16].

references

[1] sitenko, yu. a.: polarization of the massless fermionic vacuum in the background of a singular magnetic vortex in 2+1 dimensional space-time, ukr. j. phys. 43 (1998), 1513–1525.
[2] sitenko, yu. a.: induced vacuum condensates in the background of a singular magnetic vortex in 2+1 dimensional space-time, phys. rev. d60 (1999), 125017.
[3] miransky, v. a.: dynamical symmetry breaking in quantum field theories, singapore: world scientific, 1993.
[4] inagaki, t., odintsov, s. d., shil'nov, yu. i.: dynamical symmetry breaking in the external gravitational and constant magnetic fields, modern phys. a. 12 (1999), 481–503.
[5] gitman, d. m., odintsov, s. d., shil'nov, yu. i.: chiral symmetry breaking in d = 3 njl model in external gravitational and magnetic fields, phys. rev. d54 (1996), 2968–2970.
[6] odintsov, s. d., shil'nov, yu. i.: schwinger-dyson equations in qed at non-zero temperature, modern phys. lett. a. 6 (1991), 707–710.
[7] bashir, a., diaz-cruz, j. l.: a study of schwinger-dyson equations for yukawa and wess-zumino models, j. phys. g25 (1999), 1797–1805.
[8] elizalde, e., odintsov, s. d., romeo, a., shil'nov, yu. i.: schwinger-dyson equations and chiral symmetry breaking in 2d induced gravity, mod. phys. lett. a10 (1995), 451–456.
[9] sauli, v., adam, j., jr., bicudo, p.: dynamical chiral symmetry breaking with minkowski space integral representations, phys. rev. d75 (2007), 087701.
[10] de felice, a., tsujikawa, s.: f(r) theories, living rev. relativity 13 (2010), 1–161, 3. http://www.livingreviews.org/lrr-2010-3
[11] kotvytskiy, a. t., kryuchkov, d. v.: the generalized equation of gravitation in type models, the journal of kharkiv national university, ser. nuclei, particles and fields 4(40) (2009), 29–33.
[12] bertolami, o., boehmer, c. g., harko, t., lobo, f.: extra force in f(r) modified theories of gravity, phys. rev. d75 (2007), 104016.
[13] shil'nov, yu. i., chitov, v. v., kotwicki, a. t.: schwinger-dyson equations and dynamical symmetry breaking in two-dimensional quantum gravity, russian phys. j. (izv. vuzov. fizika) 3 (1997), 40–44.
[14] shil'nov, yu. i., chitov, v. v., kotwicki, a. t.: chiral symmetry breaking in quantum r2-gravity, modern phys. lett. a. 12 (1997), no 34, 2599–2612.
[15] shil'nov, yu. i., chitov, v. v., kotwicki, a. t.: dynamical symmetry breaking in quadratic quantum gravity, nuclei physics 60 (1997), no 8, 1510–1517.
[16] rubakov, v. a., tinyakov, p. g.: infrared-modified gravities and massive gravitons, usp. fiz. nauk 51 (2008), 759–792.

a. t. kotvytskiy
d. v. kruchkov
e-mail: kotw@mail.ru
department of theoretical physics
v. n. karazin kharkov national university
svobody sq. 4, 61077 kharkov, ukraine

acta polytechnica vol. 51 no. 2/2011

robotic astronomy and the bootes network of robotic telescopes
a. j. castro-tirado, on behalf of the bootes collaboration

abstract

the burst observer and optical transient exploring system (bootes) started in 1998 as a spanish-czech collaboration project devoted to a study of optical emissions from gamma ray bursts (grbs) that occur in the universe. the first two bootes stations were located in spain, 240 km apart, and included medium size robotic telescopes with ccd cameras at the cassegrain focus as well as all-sky cameras. the first observing station (bootes-1) is located at esat (inta-cedea) in mazagón (huelva), and first light was obtained in july 1998. the second observing station (bootes-2) is located at la mayora (csic) in málaga and has been fully operating since july 2001. in 2009 bootes expanded abroad, with the third station (bootes-3) being installed in blenheim (south island, new zealand) as the result of a collaboration project with several institutions from the southern hemisphere. the fourth station (bootes-4) is on its way, to be deployed in 2011.

keywords: robotic astronomy, stellar astrophysics, variable stars.

1 introduction

robotic astronomical observatories were first developed in the 1960s by astronomers, after electromechanical interfaces to computers became common at observatories. nowadays there are more than 100 spread worldwide (fig. 1). see [1] for an overview. here are some important definitions in the field of robotic astronomical observatories¹:

automated scheduled telescope (robot): a telescope that performs pre-programmed observations without the immediate help of a remote observer (e.g. avoiding an astronomer moving the mount by hand).

remotely operated (remote) telescope (robot): a telescope system that performs remote observations following the request of an observer.

autonomous robot (observatory): a telescope that performs various remote observations and is able to adapt itself to changes during task execution without any kind of human assistance (e.g. weather monitoring; the system must not endanger humans!).

fig. 1: distribution of robotic telescopes in the world

¹following the consensus reached after one hour of discussion amongst the 80 participants who attended the "workshop in robotic autonomous observatories", held in málaga (spain) on may 18–21, 2009.

bootes (the burst observer and optical transient exploring system) started in 1998 as a spanish-czech collaboration project [2] devoted to the study of optical emissions from gamma ray bursts (grbs) that occur in the universe. nowadays it consists of 4 stations, three of them hosting 60 cm fast slewing robotic telescopes aimed at contributing significantly to various scientific fields.

2 the bootes network of robotic telescopes

2.1 bootes-1

the first robotic astronomical observatory in spain was placed in inta's estación de sondeos atmosféricos (esat) at the centro de experimentación de el arenosillo in mazagón (moguer, huelva). it has an extraordinary sky close to the atlantic ocean, with more than 300 clear nights a year, limited to the east by the doñana national park. for the first two years after 1998, bootes provided rapid follow-up observations for more than 40 grbs detected by batse aboard the cgro, until it was turned off in may 2000. it consisted of a 0.2 m schmidt-cassegrain reflector telescope (at f/10) with a ccd camera at the cassegrain focus, providing a 40′ × 30′ fov, and a couple of ccd cameras attached to the main optical tube providing a 16° × 11° fov.
since 2001, with the relocation of the existing enclosure 100 m away from the original site, and with the addition of a second enclosure (dubbed bootes-1b to distinguish it from bootes-1a, the old one), various setups have been accomplished, the current one, as of summer 2010, being as follows:

• a 0.3 m diameter schmidt-cassegrain reflector telescope (f/10) mounted on a paramount mount with a narrow field ccd camera (1528 × 1024 pix) attached to the main optical tube: 30′ × 20′ fov.
• a wide field ccd camera (4096 × 4096 pix) attached to a 400 mm f/2.8 lens providing a 5° × 5° fov.
• an all-sky ccd camera (casandra-1): 180° fov. see [3].

2.2 bootes-2

the bootes-2 robotic astronomical station was officially opened on 7 nov 2001 and is located at csic's estación experimental de la mayora in algarrobo costa (málaga). it is limited to the south by the mediterranean sea and to the north by the tejeda-almijara mountains nature park with maroma peak (2068 m a.s.l.). unlike the two domes of the bootes-1 station 200 km away, its dome is controlled by a hydraulic opening system operated automatically according to the existing weather conditions. bootes-2 at first hosted a 0.3 m schmidt-cassegrain reflector telescope (f/10), which was replaced in 2009 by a 0.6 m ritchey-chrétien fast slewing telescope, officially opened on 27 nov 2009. thus, the new configuration of the bootes-2 station has the following instruments:

• the telma (telescopio malaga) ritchey-chrétien reflector telescope (0.6 m, f/8, see fig. 2) with an emccd narrow field camera with various filters (clear, johnson r, sloan g'r'i' and ukirt z and y-band filters) providing a 10′ × 10′ fov.
• an all-sky camera (casandra-2) providing a 180° fov.

fig. 2: the telma 0.6 m telescope at the bootes-2 astronomical station

2.3 bootes-3

the bootes-3 robotic astronomical station was the first installation of the bootes network outside spain. it was officially opened on 27 feb 2009. it is located at vintage lane, blenheim (new zealand), one of the best places for observing the night sky in the southern hemisphere. similarly to the bootes-1 station, its dome is controlled by electric motors, operated automatically according to the existing weather conditions. bootes-3 hosts a 0.6 m ritchey-chrétien fast slewing telescope, dubbed the ya (yock-allen) telescope in honour of prof. phil a. yock and eng. bill h. allen, in recognition of their encouraging support. the bootes-3 station has the following instruments:

• the ya (yock-allen) ritchey-chrétien reflector telescope (0.6 m, f/8) with an emccd narrow field camera (clear, johnson b, sloan g'r'i' and ukirt z and y-band filters) providing a 10′ × 10′ fov.
• the all-sky camera (casandra-3) providing a 180° fov. see fig. 3.

fig. 3: the ya 0.6 m telescope in blenheim (new zealand) depicted against the centre of the milky way in an image recorded by casandra-3. the fourth station (bootes-4) will be deployed in 2011

3 bootes scientific goals

the bootes scientific goals are multifold, and are detailed below.

3.1 observation of the grb error box simultaneously to grb occurrence

although the first detected optical counterparts were not brighter than 19th mag a few hours after the burst, there have been several grbs for which the optical transient emission has been detected simultaneously to the gamma-ray event, with magnitudes in the range 5–10.
the faint transient emission detected a few hours after the event is a consequence of the expanding remnant that the grb leaves behind it. this provides information about the surrounding medium, but not about the central engine itself. the fast slewing 0.6 m bootes telescopes are producing important results in this field [4]. see fig. 4.

fig. 4: optical afterglow lightcurves of some grbs detected by bootes and rapidly imaged (within 1 min) after detection by scientific satellites

in this respect, coordinated observation of grbs in various filters is most essential, as only a few grbs have exceptionally bright optical counterparts. observers are of course interested in collecting as much data as possible, with the best possible resolution. one of the goals of the observers is to take spectra of the transient while it is bright enough, so that the transient redshift and other properties can be measured. using data taken with different filters, one can construct the spectral energy distribution of the event and estimate the object redshift. networked rts telescopes (like bootes) at favourable locations can simultaneously observe objects in different filters. the idea is to enable these telescopes to communicate with each other and provide simultaneous images in two or more filters. this system should balance the need to take some data with the possibility of taking data in multiple filters. this can be achieved by sending commands to take images in different filters when the system knows that it has at least some images of the event. this kind of decision is best made in a single component, the observation coordinator. the coordinator will be connected to two or more telescope nodes. it will collect information from gcn and from all connected nodes. a node will report to the coordinator when it receives a gcn notice, when it starts its observation, and as soon as it gets an image passed through the astrometry that contains the whole error area of the grb. it will also report when the transient detection software identifies a possible optical transient. when the coordinator receives messages about correct observation by two telescopes, it will decide which filter should be followed at which telescope, and will send out commands to carry out further observations. the coordinator will periodically revisit its observing policy, and send out commands to change filters accordingly. as the system is "running against the clock" for the first few minutes after the grb event, trying to capture the most interesting part of the transient light curve, it cannot wait for completion of the transient source analysis. in the case of two telescopes, the coordinator will command different filters as soon as it knows that both telescopes have acquired the relevant field. the current astrometry routines take a few seconds to run, and it is expected that observations with different filters can already have started within this time-frame.
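the decision rule described above is easy to sketch; the toy code below is purely illustrative (the class names, the filter list and the reporting interface are all hypothetical, not the actual bootes software):

```python
# hypothetical sketch of the coordinator's filter-assignment rule: once at
# least two nodes report that they have acquired the grb field, spread them
# over complementary filters; revisit the policy on every report.
from dataclasses import dataclass, field

FILTERS = ["clear", "r", "g'", "i'"]          # assumed priority order

@dataclass
class Node:
    name: str
    acquired: bool = False                    # field imaged + astrometry ok
    current_filter: str = "clear"

@dataclass
class Coordinator:
    nodes: list = field(default_factory=list)

    def on_report(self, node: Node, acquired: bool):
        node.acquired = acquired
        self.revisit_policy()

    def revisit_policy(self):
        ready = [n for n in self.nodes if n.acquired]
        if len(ready) < 2:
            return                            # keep taking 'clear' data
        # at least some data is now guaranteed: assign different filters
        for node, filt in zip(ready, FILTERS):
            if node.current_filter != filt:
                node.current_filter = filt    # a real system would send a command
                print(f"{node.name}: switch to filter {filt}")

coord = Coordinator(nodes=[Node("bootes-2"), Node("bootes-3")])
coord.on_report(coord.nodes[0], acquired=True)   # one node ready: no change
coord.on_report(coord.nodes[1], acquired=True)   # both ready: split filters
```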
3.2 the detection of optical flashes (ots) of cosmic origin

these events could be unrelated to grbs and could constitute a new type of astrophysical phenomenon (perhaps associated with qsos/agns). if some of them are related to grbs, the most recent grb models predict that there should be a large number of bursting sources in which only transient x-ray/optical emission should be observed, but no gamma-ray emission. the latter would be confined in a jet-like structure and pointing towards us only in a few cases.

3.3 monitoring a range of astronomical objects

these are astrophysical objects ranging from galactic sources, such as comets (fig. 5), cataclysmic variables, recurrent novae and compact x-ray binaries, to extragalactic sources, such as distant supernovae and bright active galactic nuclei. in the latter case, there are hints that sudden and rapid flares occur, though of smaller amplitude.

4 networking

one step further from grb observation is coordinated observation of targets — e.g. observation of variable stars for more than 12 hours (i.e. taking advantage of telescopes in different time zones). the observer should contact the coordinator, and either add a new target or select a predefined target which he/she wants to observe. the coordinator should list for the observer the telescopes which can observe the target of his/her choice, and propose filters and exposure times. the observer can then decide which telescopes are to be used, and the coordinator will send observation requests to the nodes and collect back information about observation progress. currently only observer-selected coordinated observations are envisioned. when that works properly, the observer can be replaced by network scheduling software.

fig. 5: the evolution of comet 17p/holmes following the october 2007 outburst, imaged on a nightly basis with the boo-2 telescope in spain. the fov is 10′ × 10′ in all frames

5 conclusions

robotic telescopes are opening a new field in astrophysics in terms of optimizing the observing time, with some of them able to provide pre-reduced data. the big advantage is that they can be placed in remote locations where living conditions are hostile for humans (antarctica now, the moon in the near future). bootes (http://www.iaa.es/bootes) is an example of such a telescope system. technological development in various fields is much involved, and some robotic astronomical observatories are moving towards intelligent robotic astronomical observatories. one immediate application of small/medium size robotic telescopes is in the study of grbs, which can be considered the most energetic phenomenon in the universe. in combination with space missions like integral, swift and fermi, they are used for triggering larger size instruments in order to perform more detailed studies of host galaxies and intervening material on the line of sight. these robotic astronomy observatories will provide a unique opportunity to unveil the high-z universe in years to come.

acknowledgement

we are very grateful for the support given by the space sciences and electronics technologies department at inta, through project ige 4900506, to the special action dif2001-4256-e supported by the technology and science ministry (mcyt), and also to the spanish research projects aya2001-1708, aya2002-0802, aya2004-01515, aya2007-6377 and aya2009-c14000-c03-01 granted by the spanish ministry of science and education and innovation and technology (with feder funding). the development of the bootes network has also been possible thanks to the support of junta de andalucía through excellence research projects p06-fqm-2192 and p07-tic-3094. the czech contribution is supported by the ministry of education and youth of the czech republic, projects es02 and es36.

references

[1] castro-tirado, a. j.: robotic autonomous observatories: a historical perspective, in advances in astronomy, issue on robotic astronomy (edited by a. j. castro-tirado et al.), article id. 570489, 2010. (http://www.hindawi.com/journals/aa/2010/570489.html)
[2] castro-tirado, a. j. et al.: the burst observer and optical transient exploring system (bootes), a&as 138, 583, 1999.
[3] castro-tirado, a. j. et al.: a very sensitive all-sky ccd camera for continuous recording of the night sky, in advanced software and control for astronomy ii (edited by bridger, alan; radziwill, nicole m.), proceedings of the spie, vol. 7019, 2008, pp. 70191v.
[4] jelínek, m. et al.: four years of real-time grb follow-up by bootes-1b (2005–2008), in advances in astronomy, special issue on robotic astronomy (edited by a. j. castro-tirado et al.), 2010, arxiv:1001.2147.

alberto j. castro-tirado
e-mail: ajct@iaa.es
iaa-csic, glorieta de la astronomía s/n
e-18008 granada, spain

acta polytechnica vol. 49 no. 2–3/2009

multi-condition training for unknown environment adaptation in robust asr under real conditions

j. rajnoha

abstract

automatic speech recognition (asr) systems frequently work in a noisy environment. as they are often trained on clean speech data, noise reduction or adaptation techniques are applied to decrease the influence of background disturbance even in the case of unknown conditions. speech data mixed with noise recordings from a particular environment are often used for the purposes of model adaptation. this paper analyses the improvement of recognition performance within such adaptation when multi-condition training data from a real environment is used for training the initial models. although the quality of such models can decrease with the presence of noise in the training material, they are assumed to include initial information about noise and consequently support the adaptation procedure. experimental results show significant improvement of the proposed training method in a robust asr task under unknown noisy conditions. a decrease by 29 % and 14 % in word error rate in comparison with clean speech training data was achieved for the non-adapted and adapted system, respectively.

keywords: speech recognition, environment adaptation, spectral subtraction, mllr, noisy background.

1 introduction

automatic speech recognition (asr) in a noisy environment has been a challenging issue in recent decades for many research centers, as the presence of noise significantly decreases the accuracy of asr systems. there are several approaches to compensating for the effect of unclean conditions, which can be combined together with more or less advantageous results. the first class of these methods is applied before acoustic modelling, in front-end signal preprocessing. the signal is standardly represented by auditory-based features, plps [1] or mfccs, to minimize the effect of speaker variability. then noise suppression methods, such as the most widely used spectral subtraction (ss) [2], wiener filtering, and minimum mean square error (mmse) estimation [3], are applied within front-end signal processing to minimize the background noise level in the analyzed speech. the second class involves approaches that take effect in the modelling phase. the models of speech and pause are typically trained on clean speech data to ensure high quality of the final models of speech. model adaptation transforms clean speech models to perform well in a noisy environment. several adaptation techniques use background noise, which is combined with the speech signal, e.g. in multi-environment models [4], or with acoustic models in parallel model combination (pmc) [5]. other techniques use noisy speech data to adapt acoustic models for particular background conditions, by simply retraining the clean speech models or by some transformation using maximum likelihood linear regression (mllr) [6] or maximum a posteriori (map) adaptation [7]. the latter two schemes are also used for speaker adaptation with only a small proportion of adaptation material. due to varying or unknown target background conditions, and due to the high costs of collecting speech data in a real environment, there is typically not enough data matching the recognition conditions available for the adaptation procedure. therefore a set of data for "almost matched" conditions is often used for training or model adaptation [4, 8]. in [9], clean speech was mixed additively with real noise from a car to get adaptation data. the final models were then adapted on these recordings by mllr and map, with a resulting improvement from 14.38 % to 5.73 %. the authors show the advantage of using noisy data for training speech models in a car environment using additive mixing of clean speech and noise. similarly, an additive noise approach outperformed the recognition results of a baseline system in different environmental conditions trained and tested on the aurora 2 database in [10]. unlike approaches using only additive noise, data recorded in real conditions is used in this paper.
the aim of our work is to analyse the influence of using multi-environment training material for robust speech recognition in an unknown environment. the recordings from the real world are important from the point of view of the real influence of noisy conditions: not only additive distortion but also convolutional distortion is taken into account. as shown e.g. in [11], joint usage of spectral subtraction and mllr adaptation seems to be a good framework for a recognition task under conditions with a high level of background noise. these techniques can be used for blind adaptation, and they are therefore also useful for unknown noise reduction. this paper describes the effect of multi-condition training in several phases of the noise reduction algorithm shown in fig. 1.

fig. 1: block scheme of the noise reduction algorithm (clean, multi-condition or unknown speech passes through ess; the baseline hmm is turned into the final hmm used by the decoder via single-pass retraining or mllr adaptation on the ess-processed training material)

2 noise compensation methods

the spectral subtraction technique is standardly used in front-end processing as a blind noise suppression method (see fig. 1). model parameters can subsequently be changed using single-pass retraining, as a simple approach for offline adaptation, or mllr, as a standard method which can be used for both offline and online adaptation.

2.1 spectral subtraction

spectral subtraction (ss) is a technique frequently used for suppressing the additive background noise component in the spectral domain, eliminating stationary noise or non-stationary noise with rather slow changes in characteristics. the characteristics of the noise are estimated from speech pauses found e.g. by a voice activity detector (vad), which can often be the limiting point of the algorithm. in our work, extended spectral subtraction (ess) [12] is used, which employs modified adaptive wiener filtering working without a vad.
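for illustration, the basic scheme can be sketched as follows; note this is plain power subtraction with a spectral floor, not the ess/adaptive wiener variant of [12], and the noise estimate simply assumes the first fraction of a second is speech-free instead of using a vad. the 25 ms window and 10 ms step match the front-end setup used later (table 1):

```python
import numpy as np
from scipy.signal import stft, istft

def spectral_subtraction(x, fs, noise_s=0.25, alpha=2.0, beta=0.01):
    """plain power spectral subtraction with a spectral floor; the noise
    psd is estimated from the first noise_s seconds of the signal, which
    are assumed to contain no speech."""
    nperseg, noverlap = int(0.025 * fs), int(0.015 * fs)   # 25 ms / 10 ms step
    f, t, X = stft(x, fs, nperseg=nperseg, noverlap=noverlap)
    n_frames = int(noise_s * fs / (nperseg - noverlap))
    noise_psd = np.mean(np.abs(X[:, :n_frames]) ** 2, axis=1, keepdims=True)
    power = np.abs(X) ** 2
    clean = np.maximum(power - alpha * noise_psd, beta * power)  # floor
    Y = np.sqrt(clean) * np.exp(1j * np.angle(X))                # keep noisy phase
    _, y = istft(Y, fs, nperseg=nperseg, noverlap=noverlap)
    return y[: len(x)]

fs = 16000
x = np.random.randn(fs)            # stand-in for a noisy recording
y = spectral_subtraction(x, fs)
```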
2.2 single-pass retraining

single-pass retraining of the models is often used when a large amount of data with matching environmental conditions is available and offline retraining of acoustic models can be performed. the parameters of the clean speech models are changed within one pass of the retraining procedure, which is performed on a set of recordings with a matching environmental background. such data will be called matching data in the following text. the disadvantage of this approach lies in the need for a sufficient amount of matching data for each model. this can be very difficult, mainly in the case of a specific environment. in addition, a large number of speakers is needed for speaker independent recognition tasks. for these reasons, single-pass retraining was used as a low-bound result for unsupervised speaker adaptation experiments with an increasing amount of adaptation data in [6].

2.3 mllr

as noted above, there is often not enough data available for the single-pass retraining procedure. mllr uses a small amount of adaptation material to estimate an affine linear transform $\mathbf{A}$, $\mathbf{b}$ of the model parameters, which is found by maximizing the likelihood of the adaptation data. based on our preliminary tests, we use only mllr of the mean vectors for our experiments; the other model parameters are left unchanged. the new mean vector is then given by
$$\boldsymbol{\mu}_{\mathrm{new}} = \mathbf{A}\,\boldsymbol{\mu}_{\mathrm{old}} + \mathbf{b}. \tag{1}$$
the same transform $\mathbf{A}$, $\mathbf{b}$ can be applied to the mean vectors of all models (global adaptation), or the models can be clustered on the basis of acoustic similarity into several classes, with separate transforms applied to the particular classes. this clustering can represent the different effect of background distortion on particular speech phones. the regression class approach also enables us to cluster the models according to the amount of adaptation data, to ensure sufficient quality of the transform. binary regression class tree clustering [13] is used in this work.
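applying eq. (1) is a single affine map per regression class; a minimal numpy sketch (the transform itself would be estimated from adaptation data by maximizing its likelihood, see [6]; here an arbitrary stand-in transform is used):

```python
import numpy as np

def apply_mllr_means(means, A, b):
    """apply the mllr mean transform of eq. (1), mu_new = A @ mu_old + b,
    to every gaussian mean in one regression class. means: (n_gauss, dim)."""
    return means @ A.T + b

rng = np.random.default_rng(0)
dim = 39                                  # 13 static (1 energy + 12 mfcc) + delta + acc
means = rng.standard_normal((8, dim))     # toy means of one regression class

# stand-in for an estimated transform: identity plus a small perturbation
A = np.eye(dim) + 0.01 * rng.standard_normal((dim, dim))
b = 0.1 * rng.standard_normal(dim)

adapted = apply_mllr_means(means, A, b)
print(adapted.shape)                      # (8, 39)
```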
3 experiments

the experiments were performed on a small vocabulary speaker independent (si) speech recognition task. a czech digit sequence recogniser based on hmms of monophones was used for this purpose.

3.1 front-end setup

front-end signal processing was carried out using the ctucopy parametrization tool [14]. this provides similar functionality to the htk hcopy tool [13] together with additional noise reduction algorithms, e.g. vad detection, spectral subtraction and lda rasta-like filtration. table 1 summarizes the overall setting of the recognition front-end.

table 1: front-end setup

segmentation window    25 ms, hamming window
segmentation step      10 ms
feature extraction     1 energy, 12 mfccs + delta + acc. coeffs
models                 hmms of monophones, 32 mixtures

3.2 databases

the czech speecon database [15] was used for training and testing, i.e. 16 khz data recorded in different environments using several types of microphones. the database involves utterances from almost 600 speakers with different content, e.g. phonetically rich sentences, digits, commands, etc. table 2 shows the division of the database in accordance with the various environmental conditions. the whole database (all) was divided on the basis of the type of recording environment (clean and noisy) or the estimated snr level (hisnr and losnr). subsets with a specific environment (office and car) were also created.

table 2: description of speecon subsets and average estimated snr

name     description                      snr [db]
                                          cs0     cs1
all      whole speecon database           24.03   18.26
office   very clean office recordings     26.91   19.88
car      recordings in a car              13.33    8.43
clean    clean environment subset         27.15   20.80
noisy    noisy environment subset         21.25   15.44
hisnr    high snr subset                  27.51   20.36
losnr    low snr subset                   13.76   12.07

each subset was divided into a training part and a testing part, taking into account a sufficient number of speakers for the si recognition task. training was performed on head-set microphone (cs0) data. only the subsets all and office were used for training, to simulate multi-condition training and clean data training, respectively. data from two different channels, a head-set microphone (cs0) and a hands-free set (cs1), was used for testing. the cs1 channel is assumed to capture a higher level of background noise, which is illustrated by the estimated snr values in table 2. each testing subset was divided, for retraining or adaptation purposes according to the content, into a testing subset, which involves digits only, and a subset with the rest of the testing set, called the matched set. as noted in sec. 2.3, the mllr adaptation technique can work with a low amount of adaptation data. subsets containing 20, 50, 100, 200, 500 and 1000 utterances were therefore created from each matched subset for comparison purposes. for the speaker-independent recognition task, each such subset involved as many speakers as possible; not fewer than 18 speakers were present in the final subsets. this number can be considered sufficient with regard to the number of speakers used in [9] (10–80) to get an improvement in a speaker-independent task. table 3 shows the average amount of adaptation data for the different limits of utterances.

table 3: average amount of speech data for limited adaptation subsets

utterances   20     50        100       200        500        1000       all
time         65 s   2.8 min   5.7 min   10.9 min   26.9 min   57.2 min   7.8 h

3.3 spectral subtraction in different conditions

when training the models on clean data, the presence of environmental distortion significantly decreases the recognition accuracy. as table 4 shows, using ess helps to suppress the influence of unclean conditions. although the results are worse for matching conditions (clean, cs0), the overall results give more than 8 % of wer enhancement.

table 4: wer [%] for different environmental conditions w/o and with ess in front-end processing; models trained on clean data

        all           office        car           clean         noisy         hisnr         losnr         avg    avg    avg
        cs0    cs1    cs0    cs1    cs0    cs1    cs0    cs1    cs0    cs1    cs0    cs1    cs0    cs1    all    cs0    cs1
no ss   8.32   14.72  3.47   8.14   7.03   35.17  4.00   11.07  13.09  18.79  7.03   10.20  11.88  17.65  12.18  7.83   16.53
ss      8.46   13.53  4.17   8.01   5.20   29.97  4.45   9.82   11.31  16.56  6.83   9.41   11.76  16.31  11.13  7.45   14.80
imp.    -1.68  8.08   -20.17 1.60   26.03  14.79  -11.25 11.29  13.60  11.87  2.84   7.75   1.01   7.59   8.66   4.82   10.48

a similar improvement was achieved for multi-condition training (table 5). although the unclean environment in the training phase decreases the quality of the resulting models, the overall contribution of using the multi-condition training database with ess against the case of clean training data (table 6) is almost 30 % of wer.

table 5: wer [%] for different environmental conditions w/o and with ess in front-end processing; models trained on multi-condition data

        all           office        car           clean         noisy         hisnr         losnr         avg    avg    avg
        cs0    cs1    cs0    cs1    cs0    cs1    cs0    cs1    cs0    cs1    cs0    cs1    cs0    cs1    all    cs0    cs1
no ss   7.77   10.65  3.87   9.61   3.36   8.26   4.79   11.19  10.60  10.95  6.68   10.00  10.37  12.81  8.64   6.78   10.50
ss      7.04   10.65  3.87   7.21   1.53   8.87   4.68   9.70   8.55   10.77  5.74   9.55   9.78   12.58  7.89   5.88   9.90
imp.    9.40   0.00   0.00   24.97  54.46  -7.38  2.30   13.32  19.34  1.64   14.07  4.50   5.69   1.80   8.59   13.17  5.63

table 6: wer [%] for clean (clean) and multi-condition (m-c) training data and relative improvement for multi-condition training against clean training – no retraining/adaptation

        all           office        car           clean         noisy         hisnr         losnr         avg    avg    avg
        cs0    cs1    cs0    cs1    cs0    cs1    cs0    cs1    cs0    cs1    cs0    cs1    cs0    cs1    all    cs0    cs1
clean   8.46   13.53  4.17   8.01   5.20   29.97  4.45   9.82   11.31  16.56  6.83   9.41   11.76  16.31  11.13  7.45   14.80
m-c     7.04   10.65  3.87   7.21   1.53   8.87   4.68   9.70   8.55   10.77  5.74   9.55   9.78   12.58  7.89   5.88   9.90
imp.    16.78  21.29  7.19   9.99   70.58  70.40  -5.17  1.22   24.40  34.96  15.96  -1.49  16.84  22.87  29.06  21.06  33.09
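the imp. rows in tables 4–8 are relative wer reductions of the second row with respect to the first; they can be reproduced as:

```python
def relative_improvement(wer_ref, wer_new):
    """relative wer reduction in percent, as reported in the 'imp.' rows."""
    return 100.0 * (wer_ref - wer_new) / wer_ref

# first column of table 6: clean training 8.46 % vs. m-c training 7.04 %
print(round(relative_improvement(8.46, 7.04), 2))   # -> 16.78
```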
3.4 single-pass retraining

all matching data for the particular testing subsets was used for single-pass retraining in the case of clean or multi-condition training data. with regard to the results in section 3.3, ess was used within front-end signal processing in the following experiments. table 7 shows that using multi-condition training data for training the initial models for single-pass retraining brings an improvement of over 22 % against the clean speech models. all available matching data was used in this experiment, which led to final sets of 2400 (car) – 11600 (all) utterances for retraining.

table 7: wer [%] for clean (clean) and multi-condition (m-c) training data and relative improvement for multi-condition training against clean training – single-pass retraining

        all           office        car           clean         noisy         hisnr         losnr         avg    avg    avg
        cs0    cs1    cs0    cs1    cs0    cs1    cs0    cs1    cs0    cs1    cs0    cs1    cs0    cs1    all    cs0    cs1
clean   7.40   10.05  3.74   6.94   3.36   14.37  5.02   8.11   8.90   12.82  6.24   8.22   11.24  12.46  8.49   6.56   10.42
m-c     6.67   8.32   3.47   6.01   2.75   5.20   4.91   7.42   7.48   7.57   5.54   7.92   9.84   9.03   6.58   5.81   7.35
imp.    9.86   17.21  7.22   13.40  18.15  63.81  2.19   8.51   15.96  40.95  11.22  3.65   12.46  27.53  22.50  11.42  29.46

3.5 mllr

single-pass retraining acts as a low-bound value for environment adaptation, as the amount of data for retraining is rather high. only a limited amount of adaptation material for particular conditions is available in a real system, and decreasing the proportion of data for the single-pass retraining procedure can lead to a significant decrease in recognition accuracy. mllr-based adaptation removes this disadvantage. as shown in fig. 2, even for a low amount of adaptation data the accuracy of an mllr-adapted system outperforms the baseline and single-pass results. section 3.2 describes the adaptation subsets with a limited number of utterances, which reduces the computational load of the adaptation procedure. the recognition tests were performed on each subset, and the results presented here show the value averaged over all these limited adaptation subsets.

fig. 2: wer for different amounts of training data for single-pass retraining and mllr adaptation

various settings of model clustering for regression tree-based adaptation according to section 2.3 were used within the experiments. global transformation and division into 2, 4, 8, 16 and 32 regression classes were used, and the case with the minimum achieved wer is reported in the following table. the recognition results in table 8 again show the improvement from using multi-condition training material for the initial models. only the case of very clean conditions (clean, cs0) brings a slight decrease in wer. the contribution is evident mainly for channel mismatch (cs1).

table 8: wer [%] for clean (clean) and multi-condition (m-c) training data and relative improvement for multi-condition training against clean training – mllr adaptation

        all           office        car           clean         noisy         hisnr         losnr         avg    avg    avg
        cs0    cs1    cs0    cs1    cs0    cs1    cs0    cs1    cs0    cs1    cs0    cs1    cs0    cs1    all    cs0    cs1
clean   7.82   8.78   3.81   5.70   2.86   5.05   4.45   7.10   10.18  10.75  6.84   7.24   10.66  10.52  7.43   6.81   8.01
m-c     7.18   7.44   3.54   5.36   2.40   3.01   4.53   6.68   8.46   7.70   5.99   7.20   9.53   8.83   6.38   6.10   6.66
imp.    8.29   15.21  7.05   5.88   16.11  40.37  -1.68  5.89   16.88  28.31  12.42  0.58   10.56  16.03  14.06  10.40  16.84
3.6 overall improvement

fig. 3 summarizes the contribution of using multi-condition training data for initial training in the particular phases of the noise reduction procedure. the use of multi-condition training data leads to a significant improvement in all phases of the system. the proposed noise reduction method led to an enhancement of wer by 48 %; the improvement achieved by multi-condition training brought a more than 14 % decrease in recognition error.

fig. 3: average wer in particular phases of noise reduction for clean and multi-condition training
as the training and testing data in our experiments come from the same source, future work will be oriented to higher mismatches in adaptation and recognition conditions. acknowledgements this research has been supported by grants gačr 102/08/0707 “speech recognition under real-world conditions”, gačr 102/08/h008 “analysis and modelling biomedical and speech signals”, and by research activity msm 6840770014 “perspective informative and communications technicalities research”. references [1] hermansky, h.: perceptual linear predictive (plp) analysis of speech. j. acoust. soc. am., vol. 87 (1990), no. 4, p. 1738–1752. [2] kang, g. s., fransen, l. j.: quality improvement of lpc-processed noisy speech by using spectral subtraction. ieee trans. on assp, vol. 37 (1989), no. 6, p. 939–942, june 1989. [3] ephraim, y., malah, d.: speech enhancement using a minimum mean-square error short-time spectral amplitude estimator. ieee trans. on acoustics, speech and signal processing, vol. assp-32 (1984), no. 6, december 1984. [4] ming, j., jancovic, p., hanna, p., stewart, d.: modeling the mixtures of known noise and unknown unexpected noise for robust speech recognition. european conference on speech communication and technology (eurospeech’2001), aalborg, denmark, september 2001, p. 579–582. [5] gales, m. j. f., young, s. j.: parallel model combination for speech recognition in noise. technical report cued/f-infeng/tr 135, cambridge, england, 1993. [6] leggetter, c. j., woodland, p. c.: maximum likelihood linear regression for speaker adaptation of continuous density hidden markov models. computer speech & language, vol. 9 (1995), no. 2, (april 1995), p. 171–185. [7] gauvain, j. l., lee, c. h.: maximum a posteriori estimation for multivariate gaussian mixture observations of markov chains. ieee transactions on speech and audio processing, vol. 2 (1994), no. 2, p. 291–298. [8] liao, y. f., fang, h. h., hsu, c. h.: eigen-mllr environment/speaker compensation for robust speech recognition. proceeding interspeech’08, brisbane, australia, september 2008, pp. 1249–1252 [9] bippus, r., fischer, a., stahl, v.: domain adaptation for robust automatic speech recognition in car environments. proc. eurospeech’99, budapest, hungary, 1999, p. 1943–1946. [10] ming, j. hou b.: speech recognition in unknown noisy conditions. chapter 11, in book robust speech recognition and understanding, m. grimm and k. kroschel (eds.), i-tech education and publishing, 2007, p. 175–186. [11] matassoni, m., omologo, m., santarelli, a., svaizer p.: on the joint use of noise reduction and mllr adaptation for in-car hands-free speech recognition. in proceedings of the ieee international conference on acoustics, speech, and signal processing (icassp’02), 2002, p. 289–292. [12] sovka, p., pollak, p., kybic, j.: extended spectral subtraction. proc. eusipco’96, trieste, italy, september 1996. [13] young, s. et al.: the htk book (for htk version 3.2.1), cambridge university engineering department, 2002. [14] fousek, p., pollak, p.: additive noise and channel distortion-robust parametrization tool – performance evaluation on aurora 2 & 3. proc. eurospeech’03, p. 1785–1788. [15] speecon project page, url: http://www.speechdat.org/speecon. josef rajnoha e-mail: rajnoj1@fel.cvut.cz department of circuit theory czech technical university in prague faculty of electrical engineering technická 2 166 27 praha, czech republic © czech technical university publishing house http://ctn.cvut.cz/ap/ 7 acta polytechnica vol. 49 no. 
acta polytechnica vol. 51 no. 6/2011

thermal forming of glass – experiment vs. simulation

l. sveda, m. landová, m. míka, l. pína, r. havlíková, v. maršíková

abstract

thermal forming is a technique for forming glass foils precisely into a desired shape. it is widely used in the automotive industry. it can also be used for shaping x-ray mirror substrates for space missions, as in our case. this paper presents the initial results of methods used for automatic data processing of in-situ measurements of the thermal shaping process and a comparison of measured and simulated values. it also briefly describes improvements of the overall experimental setup currently being made in order to obtain better and more precise results.

keywords: x-ray mirrors, comsol multiphysics, simulations, thermal forming, gravity forming, in-situ measurements, data processing.

1 thermal forming process for lightweight optics

thermal forming is a technique for shaping lightweight precise space x-ray mirrors that has been under development for several years. teams working on the topic include the nasa team led by dr. zhang [7] working on the former con-x project, an italian group [8], and the czech group consisting of people from ctu, ict, the astronomical institute and the rigaku company, working on the former xeus/ixo/athena mission [9]. the principle of the technique is rather straightforward. a precise form (a mandrel) is prepared. a glass sheet is then placed atop the mandrel (or between two mandrels) and a temperature profile is applied. since glass behaves like a viscous liquid above certain temperatures, forming can occur and the shape remains even after the sample cools down.
the method described in this paper is very similar to the method outlined above, except that no mandrel is used. the glass sheets are held only at their edges, and shaping at high temperatures takes place only due to gravity [6]. the principle is demonstrated in figure 1. a similar method is used for manufacturing car front windows, however with a lower precision requirement.

2 metrology

the shape of the formed glass sheet was measured by two principally different methods, which provided complementary information about the forming process. in order to acquire information about the dynamics of the forming process, a noncontact optical method was used. a scheme of the setup is shown in figure 1. the sample remains in its position inside the furnace while it is observed via a camera with an objective. figure 2 shows a typical image obtained by the camera. the typical period between two consecutive images was 5–10 minutes. a set of images with timestamps was obtained, and was further processed in order to obtain valid data.

fig. 1: schematic view of the experiment. the glass was placed atop the support only by its edges. the temperature was increased and gravity performed the actual forming. the maximal bending as a function of time and temperature was then measured and compared with the simulations

fig. 2: example of image processing of the in-situ optical measurement. the original image (upper, false colors) is processed in order to obtain reference points (second image), the reference line and the vertical line for glass position measurement are found (third, overlaid on the original image, false colors), and the position of the glass is finally found in this vertical line (fourth)

several matlab scripts were written in order to process the images automatically, if possible, or semiautomatically. the most important information for developing the forming process that can be obtained from measurements of this kind is the dynamics of the process. thus, the total bend of the sample at each time was measured. the algorithm first located two fixed reference points within each image. this is because the images can be shifted and inclined relative to each other, as the doors needed to be opened each time in order to take the image. then the center of the sample was found, where the maximal bend was expected. the relative position of the center with respect to the reference points (the total bend) was identified. this was done either automatically, when the lighting conditions were optimal, or semiautomatically with some help from the user. finally, the values were calibrated in order to obtain values in mm and not in pixels. a single step of the script is shown in figure 2. another semiautomatic script was written as well, able to process the image in order to determine the sample profile. it was used to detect the shape change of the sample between the end of the forming process and the end of the cooling process. as the method used only standard commercial equipment (camera, lenses, tripod, etc.) and the doors had to be opened each time, the precision of the method is low, typically 0.1–0.5 mm. it can therefore be used successfully only for a study of the process dynamics. the taylor hobson pgi plus contact profilometer was used to obtain more accurate data for shape fitting. this device is able to measure profiles up to 120 mm in length, with a sampling step down to 0.25 micron and with vertical resolution at the nanometer scale.
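the bend evaluation described above is essentially a reference-line geometry computation. a minimal sketch of that step (in python rather than the authors' matlab scripts; all coordinates and the reference spacing are purely hypothetical examples):

```python
import numpy as np

# total bend from one processed image: two reference points define the
# reference line; the bend is the perpendicular distance of the sample
# centre from this line, calibrated from pixels to mm via the known
# real-world distance between the reference points. all inputs are
# hypothetical example values, not data from the experiment.
def total_bend_mm(ref_a, ref_b, centre, ref_distance_mm):
    a, b, c = (np.asarray(p, dtype=float) for p in (ref_a, ref_b, centre))
    ab, ac = b - a, c - a
    cross = ab[0] * ac[1] - ab[1] * ac[0]          # 2-d cross product
    bend_px = abs(cross) / np.linalg.norm(ab)      # distance in pixels
    mm_per_px = ref_distance_mm / np.linalg.norm(ab)
    return bend_px * mm_per_px

print(total_bend_mm((120, 310), (820, 306), (470, 385), 100.0))
```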
three lines were measured on each sample in each of the orthogonal directions, two close to the edges and one close to the center of the sample. the data from the device was initially processed by the taylor hobson software, and was further exported and processed in matlab as well. the profiles were rotated in order to make them horizontal and concentric, as it is impossible to position the sample perfectly on the table. various curves were then fitted onto the data, including polynomials of order 2 and 4, a circle and a catenary curve.

3 simulations

computer simulation of the forming process is a complex task [1–4]. generally, a combination of a thermal model and a mechanical model has to be applied. the thermal model should contain spatial and temporal temperature profiles, as well as the thermal interaction of the furnace and the form with the sample. this can be either measured or simulated. mechanically, the sample can be modelled as a simple beam made of a viscoelastic material, where several boundary conditions need to be met. the gravitational force is easily applied as a volume force. the dynamic viscosity of the material as a function of the temperature at a given location has to be known. the proper sample/support interaction must be entered: either a rigid system, or a system where some movements with defined friction are allowed, etc. surface tension can be included. the simulation performed here used the comsol multiphysics software with the cfd module [5], but with several simplifications. the temperature of the sample was assumed to be homogeneous and identical to the temperature of the furnace. this seems to be unrealistic, but it is a good starting point. more work needs to be done to justify and modify this condition. rigid borders were applied, with no slip and no squeezing of the glass, and the sample lay freely on the support. uncertainty in the boundary definition is expected to produce differences in the bending speed of the order of 20 % [2]. most of the forming time was spent above the transformation temperature, thus a viscous newtonian liquid model was used, with the dynamic viscosity given by the curve in figure 3. the output of the comsol multiphysics simulations was only a velocity field as a function of time. the data was thus imported to matlab and integrated in order to obtain the actual profile.

fig. 3: dynamic viscosity as a function of the temperature used in the simulation, in the form of the vogel-fulcher-tammann equation [6]

4 processing experimental data

the experiments were performed with glass samples made from desag d263 glass. two different sample sizes were used: 75 × 25 × 0.75 mm3 and 100 × 100 × 0.4 mm3. the forming process, temperatures, measurements and fits by different curves, as well as a description of the equipment used in the experiment, are described in greater detail in [6]. the average forming speed as a function of the forming temperature for one of the samples is shown in figure 4, together with the simulated data.

fig. 4: average forming speed for a sample 100 × 100 × 0.4 mm3 in size as a function of temperature. the measured values are compared with the simulation

all the forming experiments were measured in-situ, and the results were processed by the noncontact method. the resulting data was then combined into the form of a unified plot where the change in shape (the bending) is a function of the forming time and the temperature (see figure 5).
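the viscosity curve of figure 3 has the vogel-fulcher-tammann form log10(eta) = a + b/(t - t0). a minimal sketch of evaluating such a curve; the coefficients below are illustrative placeholders only, not the d263 values from [6]:

```python
# evaluation of a vogel-fulcher-tammann (vft) viscosity curve of the kind
# used as simulation input (figure 3): log10(eta) = a + b / (t - t0).
# the coefficients are hypothetical; the actual values for desag d263
# glass come from [6] and are not reproduced here.
A, B, T0 = -2.6, 4000.0, 250.0   # hypothetical coefficients, t in deg c

def viscosity_pa_s(t_celsius: float) -> float:
    """dynamic viscosity [pa.s] from the vft equation."""
    return 10.0 ** (A + B / (t_celsius - T0))

for t in (550.0, 600.0, 650.0):
    print(f"{t:5.0f} c: eta = {viscosity_pa_s(t):.3e} pa.s")
```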
fig. 5: measured bending as a function of time and forming temperature for 100 × 100 × 0.4 mm3 samples

a simulation of all the experiments was performed, and the data was processed in the same way as the experimental data in order to provide a comparison (see figure 6). the overall shape and values are consistent, although not perfectly matching yet; see figure 4, which compares the average shaping speed as a function of temperature. detailed contact profilometer measurements show that, due to short openings of the doors in order to make in-situ images for noncontact measurements, there are temperature gradients that result in inhomogeneous forming. it can be seen that the shape is different at the edge closer to the doors. further, we have no actual information about the temperatures and the temperature gradients in the glass, which is very important for the simulation. it is expected that, together with the poorly defined support, this leads to most of the differences between the simulation and the experiment. inaccurate noncontact measurement is not important for average forming speed detection, as least-squares fitting is used.

fig. 6: simulated data based on the experiments from figure 5. more data points are used for interpolating the colored surface

5 expected improvements

several key points were identified during the experiments described above which affect the measurements and the forming process itself, and these are currently being upgraded. the shaping process is strongly affected by the opening of the doors for making images, thus a sapphire window in the doors will be used. in addition, the poorly defined support will be replaced by a stable and perfectly defined mechanical support. independent temperature measurements inside the furnace will be performed for greater precision. in order to increase the precision of the optical measurements, a camera at a fixed point relative to the furnace will be used. it will be equipped with a telecentric lens. the lighting will be adjusted to enable automatic data processing for all images.

6 conclusions

a general method for in-situ measurements of a thermal free fall glass forming process has been demonstrated, including automatic data processing. the data was used for improvements to a forming method suitable for space x-ray telescopes, and as an input for computer simulations of the process. the simulation and the experiments are consistent, but more work needs to be done. key points which need to be modified and corrected in future experiments have been identified, and experimental upgrades are currently underway.

acknowledgement

we acknowledge support from the grant agency of the academy of sciences of the czech republic, material and x-ray optical properties of formed silicon monocrystals project, grant number iaax01220701.

references

[1] starý, m.: gravitační tvarování skla i. – laboratorní měření. sklář a keramik, 59, 7–9, 2009, p. 150–154.

[2] starý, m.: gravitační tvarování skla ii. – numerická simulace. sklář a keramik, 59, 10–12, 2009, p. 213–217.

[3] chen, y.: thermal forming process for precision freeform optical mirrors and micro glass optics. phd thesis, the ohio state university, 2010.

[4] stokes, y. m.: very viscous flows driven by gravity. phd thesis, university of adelaide, 1998.

[5] http://www.comsol.com/

[6] landová, m.: thermal forming of glass and si foils for x-ray space telescopes. thesis, institute of chemical technology prague, 2011.
[7] zhang, w. w.: lightweight and high angular resolution x-ray optics for astronomy. proceedings of the spie, 8076, 2011, p. 807602.

[8] prosperio, l., et al.: thermal shaping of thin glass substrates for segmented grazing incidence active optics. proceedings of the spie, 7803, 2010, p. 78030k.

[9] hudec, r., et al.: advanced x-ray optics with si wafers and slumped glass. proceedings of the spie, 7437, 2009, p. 74370s.

libor sveda
ladislav pína
radka havlíková
ctu in prague, fnspe
břehova 7, 115 19 prague 1, czech republic

martina landová
martin míka
ict prague
technická 5, 166 28 prague 6-dejvice, czech republic

veronika maršíková
rigaku innovative technologies europe, s.r.o.
novodvorská 994, 142 21 prague 4, czech republic

acta polytechnica vol. 50 no. 4/2010

prague's sewerage system in the 1930s and the general sewerage project (1933–1936)

k. drnek

abstract

prague's sewerage system was built at the end of the era of the monarchy in the united town that prague was transformed into. the system was soon overloaded, and was not able to remove all the sewage produced by the citizens. to deal with this hygienic threat, the city council and the management of the wastewater services undertook several actions to build a new system or improve the existing system. the most ambitious and extensive measure was the general project carried out between 1933 and 1936. the project was invented to resolve the problem once and for all by introducing new ideas and to cut out the problem of placing a new sewage plant instead of the old one. for the present-day observer it also offers a range of spectacular and interesting ideas on urban wastewater treatment.

keywords: sewerage, sewage plant, máslo-douda project, řež, competition, projects, dorr, čistá půda, roztoky.

1 introduction to the problem

on january 1st, 1922, greater prague was formed. the city of prague and its surroundings were made into a single capital city. amongst other problems (administrative or technical), the city council was faced with the problem of providing adequate treatment for the sewage discharged by citizens in the whole city. at that time, the city was using the prewar prague system, which had only been extended and improved within the limits of prewar technology.

1.1 original sewerage

the sewerage system in prewar prague was successfully finished in 1906 after almost 20 years of construction. it covered 88.509 km2 of settled area, and all the sewage was drained to the central sewage plant in bubeneč, near the imperial island (císařský ostrov). this sewage plant used a mechanical system for treating the sewage. the wastewater was delivered to the plant, where it passed through the first line of filters, the coarse racks (česle). this procedure removed the coarse trash – wood, old cloth and remnants of food. the next phase removed the sand from the water. due to the dirty streets, the water was full of sand, and it would have been wasteful not to recycle the water. the grit chambers (lapač písku) were pools 34 m in length and 6 m in depth with a slightly sloping bottom. the water passed slowly through them, and the heavier sand sank to the bottom, from where it was collected and taken away for disposal. then the water passed through the fine racks and finally into the primary clarifiers (usazovací nádrž), where the final sludge was separated. the water was then discharged into the river, and the sludge into the sludge drying bed (in winter) or into a sludge barge (in summer).
1.2 limitations of the old sewerage system

the sewerage and the treatment plant were designed for only 500000 people, with 120 l of sewage per day and person. when the capital city of prague was established, the number of users and the area that was covered rose dramatically. in 1929, when the possibility of building a new system was raised, the area to be covered was about 172.104 km2 and there were more than 600000 users of the system. besides the late 19th century technology, which was not able to deal with the amount of sewage discharged by all the citizens, the sludge and the pollution of the river were major problems. the amount of the sludge gradually rose to a level of more than 250 m3 per day, and dealing with this huge amount was becoming problematic. the second problem was that the river was slowly losing its self-cleaning ability, and thus the old sewage plant was also losing its main pillar.

1.3 project in 1929

in 1929, first attempts were made to prevent the city falling into serious hygienic trouble. ing. e. máslo and ing. v. douda were asked to set up a project for a new sewage plant, fully compatible with the old canalization system. they submitted their proposal for authorization in 1929. both engineers set out to locate the sewage plant outside the city precincts, in order not to subject the citizens to the smell of the sludge and to be able to handle this valuable agricultural product better and more comfortably. the riverbank by the village of řež, about 11 km from the old sewage plant, was chosen as the best location. the new plant was designed to clean the water in the old way. the treatment was only mechanical, because the advanced biological methods were still controversial. mechanical methods were also much cheaper.

fig. 1: picture of the original "máslo-douda" project

while the douda-máslo project was passing through the authorization procedures, ing. zika, the chief of prague's sewers, introduced new projects to build sewage plants with biological purification either in řež or on imperial island. this dualism in planning came about because the planning of these complicated systems took years, and in the meantime the technology was advancing very rapidly. the originally proposed technology was outdated before the project reached the construction stage. zika's project introduced the problem of how to manage all the projects to meet the rising demands of the citizens for a clean river. the new sewage plant on imperial island was cheaper, but the authorization process was already running and some land in the neighborhood of řež had already been bought for the new sewage plant.

2 competition

as a result, the city council held a competition to resolve the question of the best location for a new sewage plant and to introduce new ideas. the competition was announced on may 2nd, 1933, and the deadline was march 15th, 1934. the competition was open only to citizens of czechoslovakia. a commission was nominated by the city council and comprised 17 members and 2 experts. during the standing time of the commission, 2 members died and others were nominated to replace them. the committee finally consisted of the following members: ing. b. bartošek, ing. o. cvrk, mudr. j. čančík, ing. v. douda, ing. a. ernest, ing. k. holinka, ing. v. krouza, ing. t. mrkvan, ing. a. nový, mudr. l. procházka, phdr. f. schulz, ing. arch. v. prokop, ing. dr. j. racek, arch. f. šimáček, ing. e. thoma, ing. f. topinka, ing. b. vondráček, ing. dr. v. vrbenský, ing. r. žižka.
the rules of the competition set several conditions:

• the average amount of sewage per day was 129 l in 1927 and 118 l in 1929. the máslo-douda project set a value of 160 l.
• the number of users in the period between 1940 and 1960 would rise from 1 mil. to 1.6 mil. in 1929 there were only 601000 users.
• the treated sewage should contain not more than 250 mg of dry sludge. mud taken within the cleaning process must not rot, sand from the sand pool must not contain more than 10 % of dry residue, and sludge from the sedimentation tank should not contain more than 91 % of water.
• the location of the new sewage plant must not offend hygiene and aesthetics. it could be located in a populated area, but the hygienic measures would then have to be more accurate and stricter.
• the new sewage plant was required to treat all sewage from the city.

each project was evaluated in terms of:

• location
• technical elaboration of the project
• hygiene
• system economy
• chemical processes
• agricultural interest
• mechanical processes
• influence on the river
• space for improvements
• cost

3 the projects

fifteen projects were submitted before the deadline. finally, only 13 were admitted and 2 were rejected because they did not comply with the conditions. however, they were so interesting that the committee also screened them. i have categorized the projects into four groups. i will deal with the three winning projects in greater detail. however, i will also consider projects that were assessed as non-competitive because they did not deal with all the problems and were not selected to carry out the project.

3.1 projects locating the plant on imperial island

projects located on imperial island, where the old treatment plant was, relied on modern treatment procedures without creating a bad smell or hygienic troubles, which would have prevented the construction of a sewage plant inside the city limits. the sludge, a major cause of bad smell, would be pumped to a place located somewhere outside the city. however, the island was not an uncontroversial location. first, the state regulation committee was planning to use the island as a well-located recreation area. second, the river in this place was already very dirty, and locating the plant there would cause even more pollution inside the city. third, the regulation of the river meant that there was not a sufficient stream there, and regular floods were a constant threat. four of the projects located the sewage plant on the island: "ostrov", "zdraví", "zdraví všem" and "čistý vzduch". only two of them were adjudged worthy to be bought – "ostrov" (15000 kč) and "zdraví všem" (10000 kč). all of them suffered from being located on the island. although this location was not banned, it prevented the projects from being selected for implementation. each of the four projects proposed the same treatment process. after the mechanical processes, traditionally a system of screens and grit chambers, there was a biological process in aeration tanks. the sludge was then left for some time in the final clarifiers and was finally transported through pipelines to sludge drying beds somewhere outside the city.

table 1: survey of the projects in the island group

project       creator                              price
ostrov        ing. j. staněk, ing. j. ledvinka,    91 mil. kč
              ing. g. novák, ing. v. maděra,
              ing. v. hoffmann
zdraví        l. bill a comp., dr. k. skorkovský   48.8 mil. kč
zdraví všem   ing. e. zejda                        290 mil. kč
čistý vzduch  lanna comp.                          436.5 mil. kč
3.2 projects in roztoky

the projects in this group located the new plant in roztoky. the group contains the projects "roztoky" and "praze ku zdaru". both projects involved both mechanical and biological water treatment. the system was almost the same as in the group of plants on the island; the differences were in the details. since the "roztoky" project won the third prize and will be discussed later, i will write only about the second project. "praze ku zdaru" was interesting in the way it planned to use the old sewage plant. it proposed coarse cleaning in bubeneč, and only water without garbage would be floated down to roztoky. the sludge disposal was also different: the sludge would be transported immediately by train and by barge. this was the main problem that prevented the project from winning. however, the project contained good ideas and was bought for 15000 kč.

table 2: survey of the projects in the roztoky group

project         creator         price
praze ku zdaru  ing. j. gregor  347 mil. kč

3.3 projects next to the labe

the third group contains projects planning to discharge their sewage into another river than the vltava: into the labe. these projects are "spád" and "druhá řeka". both of them planned to use long tunneled pipelines to get the sewage from prague to the sewage plants (16 km for "spád" and 16.7 km for "druhá řeka", about 14 km underground). both projects also proposed a natural biological treatment process, as biological ponds were planned next to the sewage plants. the mechanically treated water would be released into these natural pools and would be cleaned naturally. due to the locations, these projects had problems mostly with the incoming pipelines and with the river labe itself, which has a slower water speed than the vltava. in addition, it would have been really difficult and expensive to construct such long pipelines.

table 3: survey of the projects next to the labe

project     creator                         price
spád        ing. j. lanč                    157 mil. kč
druhá řeka  doc. e. snížek, ing. b. belada  258 mil. kč

3.4 projects with special treatment programmes

the projects in this group proposed some special way to dispose of the sewage. the "hygiena 3" project located the plant next to drahaňská rokle. after mechanical cleaning of the sewage, the sludge was to be coagulated by an electrolytic process on 27600 electrodes. however, this treatment method was considered too expensive. nevertheless, the ideas put forward in the project were so interesting that the city bought the project for 5000 kč. the "závlaha" project proposed three sewage plants, including the old plant in bubeneč, and also one subsidiary sewage plant, where preliminary filtration treatment would be done. then the sludge would be transported by pipeline (18.92 km long, the longest in all the projects) to the main plant near the village of veltrusy, where it would be deposited on drying fields. on these fields the sludge would undergo biological treatment. the project was too expensive, the cleaning process was not very effective, and a huge area of land would have had to be bought. however, there were some good points in the project and it was bought for 10000 kč. the last project, "úspora", was only a small and modest proposal to reconstruct the old sewage plant in bubeneč. however, it contributed several ideas on how to reconstruct the old facility effectively and cheaply. this project was assessed as appropriate and was bought for 15000 kč.

table 4: survey of the projects proposing special treatment

project    creator                            price
hygiena 3  ing. j. roth, ing. f. ballasko,    199 mil. kč
           ing. dr. j. bulíček
závlaha    prof. ing. j. zavadil              537 mil. kč
úspora     ing. j. staněk, ing. j. ledvinka,  15 mil. kč
           ing. g. novák, ing. v. maděra,
           ing. v. hoffmann
4 winning projects

three projects were considered good enough to be awarded full prizes and were bought for implementation in the wastewater treatment system for prague. these projects were: "dorr", "čistá půda" and "roztoky".

4.1 "dorr"

this project located the new sewage plant on the right bank of the river in podhoří, 2 km away from the existing sewage plant. the cleaning process was divided into a mechanical part and a biological part. the mechanical part consisted of coarse and fine racks, which were cleaned by hand and also mechanically. the sand pools were square with rounded corners, and the sand was collected with special dorr-system rakes (named after the well-known american company dorr, which specialized in sewer systems). the sedimentation tanks used the same system. the activated tanks used the relling-hausen system (a combination of shaking and aeration). the settlement process was almost the same as in the sedimentation tanks. for the digestion tanks, it was planned to use a new thermophilic methane fermentation system. the fermented sludge was then to be transported by pipeline to čimice. the "dorr" project was declared the best project in the competition. all members of the commission agreed that "dorr" had some of the best technical drawings and carefully made calculations. the "dorr" project was awarded 55000 kč.

fig. 2: picture of the winning "dorr" project

4.2 "čistá půda"

this project, designed by the lanna company, was located in the original place of the douda-máslo project, near the village of řež. a location near roztoky had also been considered, but this town refused permission to build a sewage plant in its neighborhood, as a local water plant was planned for this location. the incoming pipelines from the town were to take sewage from the whole city. the pipelines were to be 10.7 km in length (only 0.92 km underground). the sewage stream was to be natural, slightly accelerated by pumping water from the river. the preliminary treatment and filtration of the garbage was planned on a system of double racks, with 4 grit chambers and 4 skimming tanks. primary clarifiers were planned with the skimming bamag system (rakes moving in the vertical axis). aeration was to be performed either in aeration tanks or in tanks with a sprayed area. the digesters were also from the bamag system, like the primary clarifiers. they were planned with a moving ceiling to facilitate access to the sludge. at the end of the process, the sludge would be moved to sludge drying beds near the villages of drasty and tursk. the project was awarded second place and a prize of 45000 kč. it was adjudged the most precise and well planned, but it was also one of the most expensive projects, and therefore did not meet the financial requirements of the competition.

fig. 3: picture of the "čistá půda" winning project

4.3 "roztoky"

this was one of the group of projects which located their sewage plant in roztoky. the creators chose this location as the first acceptable place outside the city boundaries. this made the incoming pipelines shorter and cheaper – they were only 7.4 km in length. the project proposed preliminary treatment of the sewage in the old sewage plant in bubeneč, with a different filter system for the upper and lower areas of the city.
the pipelines joined up behind the plant and pumped the treated sewage into the primary clarifiers in roztoky. aeration was performed in aeration tanks using the hurd system. the digested sludge was to be moved to special drying beds near zdiby.

table 5: survey of the winning projects

project     creator                            price
dorr        ing. j. staněk, ing. j. ledvinka,  89 mil. kč
            ing. g. novák, ing. v. maděra,
            ing. v. hoffmann
čistá půda  lanna comp.                        480 mil. kč
roztoky     ing. j. staněk, ing. j. ledvinka,  193 mil. kč
            ing. g. novák, ing. v. maděra,
            ing. v. hoffmann

the project won third place and received a prize of 20000 kč. it was not considered as carefully made a project as the first two, and it did not take into account an appropriate amount of sludge. nevertheless, it was rated one of the best.

fig. 4: picture of the "roztoky" winning project

5 conclusion

the results of the competition were announced on may 22nd, 1935, and until 1936 there was a debate on the winning and losing projects. the winners of the competition have been discussed above. however, the final outcome was a surprise. although there were three winning projects, and another 6 projects were bought for future consideration, none of them was accepted as good enough to replace the original máslo-douda project, and none of them was successfully constructed. by the time the competition was completed, the project for a sewage plant in řež was licensed. the original goals of the competition, bringing in new ideas and designs for a new sewerage system for greater prague, were fulfilled, but they were never implemented. for the record, how did the máslo-douda project end up? it, too, was never completed.

acknowledgement

the research described in this paper was supervised by prof. i. jakubec, faculty of arts, charles university in prague.

about the author

kryštof drnek was born in prague on february 24th, 1985. after graduating from gymnasium českolipská in 2004, he started studying history at the faculty of arts, charles university in prague, where he graduated with a bachelor degree in 2007. since then he has been studying for a master's degree in economic history in the department of economic history at the faculty of arts, charles university.

kryštof drnek
e-mail: drnekk@gmail.com
dept. of economic and social history
charles university
náměstí jana palacha 2, 116 38 praha 1, czech republic

acta polytechnica vol. 49 no. 2–3/2009

age dependence of children's speech parameters

j. janda

this paper deals with the search for age-dependent parameters in children's speech. these parameters are compared in terms of age dependence, and their adequacy for recognizing the age of a speaker is presented, using discrimination analysis.

keywords: children's speech, age dependence, phonetic analysis.

1 introduction

an analysis of the relationship between acoustic-phonetic aspects of speech and the speaker's age may have numerous applications. this paper has been motivated by practical experience in the field of phoniatry and logopaedia.
when examining children's pathological speech, there is often an effort to answer the question "what age does this particular speech correspond to?", and therefore, for example, to estimate at what age a child's speech development stopped. chronological age is unambiguously given by the date of birth. logopaedic age is the age estimated on the basis of acoustic-phonetic aspects of human speech.

2 materials and methods

for the purposes of this survey, a database of children's speech was recorded. it consists of the speech of 193 children aged from three to twelve years. it contains the following words (in czech): babička, časopis, čokoláda, dědeček, kalhoty, kniha, košile, květina, květiny, maluje, mateřídouška, motovidlo, peníze, pohádka, pokémon, popelnice, radost, rukavice, různobarevný, silnice, škola, špička, televize, ticho, trumpeta, vlak, zelenina, zmrzlina.

2.1 frequency of the basic glottal tone f0

an analysis was made of separate vowels in the syllables /la/, /le/, /li/, /lo/, /lu/ from the words škola, košile, zmrzlina, letadlo and maluje, and then of complete vocal sections of the speech. the analysis was made using an autocorrelation method in the praat v. 5.0.15 program [6] with the following parameters: time step = 0.0, pitch floor = 100 hz and pitch ceiling = 600 hz. the resulting values were verified using the wavesurfer v. 1.8.5 program [7], and were manually modified, if applicable. the most frequent event was incorrect detection of f0, lower by one octave. in order to make the frequency intervals comply better with the perception of intonation intervals by human hearing, the f0 values were transferred to a semitone scale, with the beginning at 100 hz:

f_0(\mathrm{st}) = 12\,\frac{\ln\left(f_0(\mathrm{Hz})/100\right)}{\ln 2}.   (1)

for statistical confirmation of the age dependence of f0, a zero hypothesis h0 was taken into consideration, which denies such dependence. h0 can be rejected on the basis of the results of a t-test for correlated measurements:

t = \frac{\bar{x}_{f_0} - \bar{x}_{\mathrm{age}}}{s_d},   (2)

where

s_d = \sqrt{\frac{\sum_{i=1}^{n} (d_i - \bar{d})^2}{n(n-1)}}.   (3)

the variable d_i in this case means the difference between f0 and the age of speaker no. i. in our case, h0 can be rejected at the level p < 0.001, n = 193. the correlation power can be expressed using the pearson correlation coefficient:

r = \frac{\mathrm{cov}(x_{f_0}, x_{\mathrm{age}})}{s_{x_{f_0}}\, s_{x_{\mathrm{age}}}}.   (4)

for the age dependence of f0 for vowel /a/: r = 0.43, which is a satisfactory correlation of medium strength. for all vocal sections of speech: r = 0.41, with p < 0.001, n = 113. the f0 trend is shown in fig. 1.

fig. 1: f0 age dependence – complete speech (f0 in semitones vs. age in years)

2.2 f0 variance

the variance of the basic voice frequency is associated with the intonation range of a piece of speech. this parameter reflects the overall tunefulness and melodiousness of the speech typical of pre-school children. the f0 variance was analysed for all vocal speech sections, and showed a declining tendency with age. the correlation coefficient was r = -0.61 (p < 0.001, n = 113).
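a minimal sketch of eqs. (1) and (4): conversion of f0 to the semitone scale with origin at 100 hz, followed by the pearson correlation against age. the five (f0, age) pairs below are hypothetical illustrations, not data from the recorded database:

```python
import numpy as np

# eq. (1): f0 in semitones above 100 hz, and eq. (4): pearson correlation
# of the semitone values with age. the data points are hypothetical.
f0_hz = np.array([310.0, 285.0, 260.0, 240.0, 225.0])
age_years = np.array([4.0, 6.0, 8.0, 10.0, 12.0])

f0_st = 12.0 * np.log(f0_hz / 100.0) / np.log(2.0)   # eq. (1)
r = np.corrcoef(f0_st, age_years)[0, 1]              # eq. (4)
print(f"r = {r:.2f}")   # strongly negative for this toy series
```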
2.3 f1, f2 formants

formant frequencies correspond to the resonance frequencies of the vocal organ cavities [1]. they were estimated for particular vowels from an lpc (linear predictive coding) spectrum via the algorithm by burg [6]. age dependence was less evident for the formants than for f0. within all speech, f1 had a correlation coefficient of r = -0.25 (p < 0.001, n = 193) and f2 had r = -0.34 (p < 0.001, n = 113).

2.4 sibilant consonant characteristics

spectral centre of gravity: if the complex spectrum is given by S(f), where f is the frequency, the centre of gravity is given by

\int_0^{\infty} f\,|S(f)|^2\,\mathrm{d}f   (5)

divided by the energy

\int_0^{\infty} |S(f)|^2\,\mathrm{d}f.   (6)

thus, the centre of gravity is the average of f over the entire frequency domain, weighted by the power spectrum.

central spectral moment: the n-th central spectral moment is given by

\mu_n = \frac{\int_0^{\infty} (f - f_c)^n\,|S(f)|^2\,\mathrm{d}f}{\int_0^{\infty} |S(f)|^2\,\mathrm{d}f}.   (7)

thus, the n-th central moment is the average of (f - f_c)^n over the entire frequency domain, weighted by the power spectrum.

spectral standard deviation: the standard deviation of a spectrum is the square root of the second central moment of this spectrum.

skewness of a spectrum: the (normalized) skewness of a spectrum is the third central moment of this spectrum, divided by the 1.5 power of the second central moment. skewness is a measure of how greatly the shape of the spectrum below the centre of gravity differs from the shape above the mean frequency. for white noise, the skewness is zero.

kurtosis of a spectrum: the (normalized) kurtosis of a spectrum is the fourth central moment of this spectrum, divided by the square of the second central moment, minus 3. kurtosis is a measure of how greatly the shape of the spectrum around the centre of gravity differs from a gaussian shape. for white noise, the kurtosis is -6/5.

the above-mentioned spectral characteristics of sibilant consonants were measured for the consonants /s/, /ss/ and /cc/. significant features were especially the rise in the spectral centre of gravity (r = 0.45, p < 0.001, n = 193) (figs. 3, 4) and the reduction in spectral skewness (r = -0.47, p < 0.001, n = 193) (fig. 2) for consonant /s/ in the word "silnice".

fig. 2: age dependence of spectral skewness for consonant /s/

fig. 3: spectral centre of gravity shift (centre of gravity of /s/ [hz] vs. age)

fig. 4: spectral centre of gravity shift (power spectrum of the consonant /s/ at ages 4 and 11)
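the spectral statistics of eqs. (5)–(7) and the derived skewness and kurtosis can be sketched directly from a sampled power spectrum. a flat (white-noise) spectrum serves as a sanity check, since the definitions above predict zero skewness and kurtosis of -6/5:

```python
import numpy as np

# spectral centre of gravity, standard deviation, skewness and kurtosis
# following eqs. (5)-(7). a flat (white-noise) power spectrum is used as
# a check: skewness should be ~0 and kurtosis ~ -6/5.
f = np.linspace(0.0, 12000.0, 4096)   # frequency axis [hz]
power = np.ones_like(f)               # |s(f)|^2, flat spectrum

energy = np.trapz(power, f)                   # eq. (6)
f_c = np.trapz(f * power, f) / energy         # eqs. (5)/(6)

def mu(n: int) -> float:
    """n-th central spectral moment, eq. (7)."""
    return np.trapz((f - f_c) ** n * power, f) / energy

sd = np.sqrt(mu(2))               # spectral standard deviation
skew = mu(3) / mu(2) ** 1.5       # normalized skewness
kurt = mu(4) / mu(2) ** 2 - 3.0   # normalized (excess) kurtosis
print(f_c, sd, skew, kurt)        # -> 6000, ~3464, ~0, ~-1.2
```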
2.5 voice onset time

voice onset time (vot) [5] is the time duration between the release of a plosive and the beginning of vocal cord vibration (fig. 5). this period is measured in milliseconds (ms). vot measurements were performed on the syllable /ka/ from the word "babička". however, it was not possible to prove any age dependence of this parameter using the measured values at the level of p = 0.05.

fig. 5: voice onset time illustration, syllable /ka/ from the word "babička" (plosive release and vot interval marked)

2.6 speech rate

the speech rate was determined for particular talkers as the reciprocal value of the duration of the entire speech without pauses. age dependence was not proved for this parameter either.

3 results

3.1 overview of age-dependent parameters

the table below summarizes the examined phonetic characteristics. individual attributes are ordered according to the age-correlation rate (column r). the h0 column contains the significance level values at which it is theoretically possible to reject the zero hypothesis of an age-independent parameter. the parameters below the double line cannot be considered age-dependent at the significance level of 5 %.

feature                          r       h0
f0 variation                    -0.61    9.3e-13
spectral skewness /s/           -0.47    5.0e-12
spec. centre of gravity /s/      0.45    8.7e-11
f0 – whole discourse            -0.42    4.0e-06
f2 – whole discourse            -0.34    1.9e-04
spec. standard deviation /cc/   -0.30    1.4e-03
spec. standard deviation /s/    -0.21    3.2e-03
spec. standard deviation /ss/   -0.20    4.6e-03
spectral kurtosis /s/           -0.17    1.8e-02
spectral skewness /ss/          -0.14    4.9e-02
===============================================
spec. centre of gravity /ss/     0.11    1.4e-01
spec. centre of gravity /cc/    -0.12    1.9e-01
spectral kurtosis /cc/           0.11    2.6e-01
spectral skewness /cc/           0.10    2.8e-01
voice onset time /k-a/          -0.08    3.7e-01
spectral kurtosis /ss/           0.00    9.6e-01
speech rate                      0.00    9.8e-01

table 1: overview of age-dependent features

3.2 discrimination analysis

in this part, we will try to make use of the age-dependent parameters for a simple discrimination analysis. the data classification is based on acceptance or rejection of the hypothesis that the data pertains to a particular class. four classes were designated (0: 3–5 years, 1: 6–7 years, 2: 8–9 years, 3: 10–12 years). the discrimination function being maximized is as follows [2]:

g_i(\mathbf{d}) = \ln p(h_i) - \frac{1}{2}\ln|\mathbf{C}_i| - \frac{1}{2}(\mathbf{d} - \boldsymbol{\mu}_i)^{\mathrm{T}}\,\mathbf{C}_i^{-1}\,(\mathbf{d} - \boldsymbol{\mu}_i),   (8)

where C_i is the covariance matrix and \mu_i the mean value of class i; up to an additive constant, g_i(d) equals the logarithm of P(d|h_i)\,p(h_i), where P(d|h_i) is the probability of the d data on the assumption that hypothesis h_i applies. training was performed using the ransac method on the vectors of 16 phonetic parameters. the classification success rate is shown in fig. 6, and the percentage enumeration is shown in the confusion matrix (table 2).
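a minimal sketch of classification with the discriminant function (8). in the paper, the class means, covariance matrices and priors come from training (ransac) on the 16-dimensional phonetic feature vectors; the two-class, two-dimensional numbers below are purely hypothetical toys:

```python
import numpy as np

# classification by maximizing the discriminant function (8):
#   g_i(d) = ln p(h_i) - 0.5*ln|c_i| - 0.5*(d - mu_i)^t c_i^{-1} (d - mu_i)
def classify(d, priors, means, covs):
    scores = []
    for prior, mu, c in zip(priors, means, covs):
        diff = d - mu
        g = (np.log(prior)
             - 0.5 * np.log(np.linalg.det(c))
             - 0.5 * diff @ np.linalg.inv(c) @ diff)
        scores.append(g)
    return int(np.argmax(scores))   # index of the winning age class

# toy 2-d example with two age classes (hypothetical values)
means = [np.array([15.0, 2.0]), np.array([13.0, 0.5])]
covs = [np.eye(2), np.eye(2)]
print(classify(np.array([13.5, 0.8]), [0.5, 0.5], means, covs))  # -> 1
```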
fig. 6: age classification (top: age classes of children sequenced by age; bottom: the discriminated class for children sequenced by age)

actual \ predicted    0      1      2      3
0                    92.9    7.1    0.0    0.0
1                     0.0   97.4    2.6    0.0
2                    10.3   14.7   63.2   11.8
3                     0.0    8.9   13.3   77.8

table 2: confusion matrix

4 conclusion

the selected speech characteristics showed various intensities of age dependence. the characteristics based on the basic vocal frequency and some spectral properties of the consonant /s/ showed a correlation of about 0.5. in the end, it was shown that the selected speech attributes enable the training of a classifier which provides classification into age groups with a probability rate of ca 80 %. a similar classification method will be tested in the future on the speech of children with speech developmental defects.

acknowledgments

the research described in this paper was supervised by doc. ing. roman čmejla, csc., fee ctu in prague, and has been supported by the czech grant agency under grant gd102/08/h008 – analysis and modeling of biomedical and speech signals.

references

[1] psutka, j. et al.: mluvíme s počítačem česky. prague: academia, 2006.
[2] uhlíř, j., sovka, p. et al.: technologie hlasových komunikací. prague: ctu publishing house, 2007, isbn 978-80-01-03888-8.
[3] ohnesorg, k.: naše dítě se učí mluvit. prague: spn, 1976, isbn 80-04-25233-8.
[4] schötz, s.: acoustic analysis of adult speaker age. in: speaker classification i. heidelberg: springer-verlag, 2007.
[5] whiteside, s. p., marshall, j.: developmental trends in voice onset time: some evidence for sex differences. phonetica, vol. 58, no. 3, p. 196–210.
[6] boersma, p., weenink, d.: praat: doing phonetics by computer (version 4.3.14), 2005 [computer program]. http://www.praat.org/
[7] sjölander, k., beskow, j.: wavesurfer [computer program]. http://www.speech.kth.se/wavesurfer/

jan janda
e-mail: jandaj2@fel.cvut.cz
department of circuit theory
czech technical university in prague
faculty of electrical engineering
technická 2, 166 27 prague, czech republic
acta polytechnica vol. 51 no. 5/2011

limited angle torque motors having high torque density, used in accurate drive systems

r. obreja, i. r. edu

abstract

a torque motor is a special electric motor that is able to develop the highest possible torque in a certain volume. a torque motor usually has a pancake configuration, and is directly joined to a drive system (without a gear box). a limited angle torque motor is a torque motor that has no rotary electromagnetic field; in certain papers it is referred to as a linear electromagnet. the main intention of the authors of this paper is to present a means for analyzing and designing a limited angle torque motor only through the finite element method. users nowadays require very high-performance limited angle torque motors with high torque density.
it is therefore necessary to develop the highest possible torque in a relatively small volume. a way to design such motors is by using numerical methods based on the finite element method.

keywords: limited angle torque motor, finite element method, high performance, pancake, direct drive.

1 general considerations

our purpose in this paper is to present some aspects of special torque motors with high torque density. more exactly, our attention will focus on limited angle torque motors used in very accurate drive systems. a torque motor is a special electric motor that is usually directly joined to a drive system (without a gear box). it generally has a pancake structure, which can easily be adapted to the system configuration. a limited angle torque motor is a motor that does not have a rotary electromagnetic field. in certain papers it is referred to as a linear electromagnet. nowadays users require very special limited angle torque motors with a high torque density relative to the volume of the motor. the reason is that a performing drive system has to be as small as possible. the authors of this paper have developed very special solutions for motors of this kind. some aspects of these solutions will be presented here.

2 limited angle torque motors with a small air-gap versus limited angle torque motors with a large air-gap

a limited angle torque motor is usually a toroidal motor as regards the winding solution. in this way, the total air-gap, relative to the magnet, also includes the winding. the dimension of the air-gap is generally more than 1.5 mm. relative to the overall dimension of the motor, we can consider this to be a large air-gap. depending on the size of the motor and the internal structure, the value of the air-gap can be several millimeters (4–5 mm). for motors with the winding assembled in slots, the value of the air-gap may be 0.4 mm to 1.5 mm, depending on the size of the motor and the arrangements for total friction torque. this air-gap is small. we will now present some considerations on the small air-gap and large air-gap solutions.

2.1 a limited angle torque motor with a small air-gap

fig. 1: an example of a limited angle torque motor with a small air-gap
fig. 2: magnetic field distribution
fig. 3: torque versus electric angle diagram
fig. 4: flux density diagram in the air-gap, in the proximity of the permanent magnet

analyzing the figures above, it is found that the magnetic field generates 4 main poles, and also four other small poles in the neutral area, because the field lines may close in the neutral area of the poles directly between the magnets and the soft magnetic material of the rotor. a disadvantage is reflected in the shape of the torque diagram (the diagram becomes nonsymmetrical, as shown, as the current value increases).

2.2 a limited angle torque motor with a large air-gap

in the case of torque motors with a large air-gap, it is found that the magnetic field generates only the 4 main poles, very accurately, because the field lines may not close in the neutral area of the poles directly between the magnets and the soft magnetic material of the rotor, due to the dimensions of the air-gap. this advantage is reflected in the torque diagram, which is symmetrical.

fig. 5: an example of a limited angle torque motor with a large air-gap
fig. 6: magnetic field distribution
fig. 7: torque versus electric angle diagram
fig. 8: flux density diagram in the air-gap, in the proximity of the permanent magnet
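before turning to the detailed field analysis, a first-order feel for the torque constant can be obtained from the classical lorentz-force approximation for conductors lying in the air-gap flux. this is a hedged, textbook-style estimate, not the fem procedure used in this paper, and every number below is hypothetical:

```python
# first-order lorentz-force estimate of the torque constant of a limited
# angle torque motor. for n active conductors of length l at mean
# air-gap radius r in a radial flux density b, with two coil sides
# carrying the current i:
#   t = 2 * n * b * i * l * r   =>   kt = t / i = 2 * n * b * l * r
B = 0.45    # average air-gap flux density [t] (hypothetical)
N = 200     # active conductors per coil side (hypothetical)
L = 0.009   # active conductor length [m] (hypothetical)
R = 0.017   # mean air-gap radius [m] (hypothetical)

kt = 2 * N * B * L * R               # torque constant [n.m/a]
print(f"kt ~ {kt * 1e3:.1f} mnm/a")  # ~27.5 mnm/a for these numbers
```

such an estimate ignores saturation and the exact flux distribution, which is precisely why the authors rely on finite element analysis for the actual design.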
3 analysis of a limited angle torque motor with a large air-gap
section 2 presented general aspects of the two types of motors. as the motor with a large air-gap is more usual, it will now be analyzed in greater detail.
fig. 9: torque motor with a large air-gap in no load conditions — magnetic field distribution
fig. 10: tangential flux density diagram in the rotor hub, through the magnet axis
fig. 11: normal flux density diagram in the rotor hub, in the neutral area
fig. 12: tangential flux density diagram, on the magnet, through the magnet axis
fig. 13: normal flux density diagram, transversal on the magnet, in the base of the magnet area
fig. 14: normal flux density diagram, transversal on the magnet, in the base of the middle area
fig. 15: normal flux density diagram, transversal on the magnet, at the limit to the air-gap area
fig. 16: tangential flux density diagram in the air-gap area, through the magnet axis
fig. 17: normal flux density diagram in the air-gap area, on polar pitch
fig. 18: tangential flux density diagram in the winding area, through the magnet axis
fig. 19: normal flux density diagram in the winding area, on polar pitch
fig. 20: tangential flux density diagram in the stator core area, through the magnet axis
fig. 21: normal flux density diagram in the stator core area
fig. 22: torque motor with a large air-gap in load conditions at 1 a — magnetic field distribution
fig. 23: tangential flux density diagram in the rotor hub, through the magnet axis
fig. 24: normal flux density diagram in the rotor hub, in the neutral area
fig. 25: tangential flux density diagram, on the magnet, through the magnet axis
fig. 26: normal flux density diagram, transversal on the magnet, in the base of the magnet area
fig. 27: normal flux density diagram, transversal on the magnet, in the middle area
fig. 28: normal flux density diagram, transversal on the magnet, at the limit to the air-gap area
fig. 29: tangential flux density diagram in the air-gap area, through the magnet axis
fig. 30: normal flux density diagram in the air-gap area, on polar pitch
fig. 31: tangential flux density diagram in the winding area, through the magnet axis
fig. 32: normal flux density diagram in the winding area, on polar pitch
fig. 33: tangential flux density diagram in the stator core area, through the magnet axis
fig. 34: normal flux density diagram in the stator core area, through the magnet axis
fig. 35: normal flux density diagram in the stator core area
fig. 36: torque motor with a large air-gap in load conditions at 5 a — distribution of the magnetic field
fig. 37: tangential flux density diagram in the rotor hub, through the magnet axis
fig. 38: normal flux density diagram in the rotor hub, in the neutral area
fig. 39: tangential flux density diagram, on the magnet, through the magnet axis
fig. 40: normal flux density diagram, transversal on the magnet, in the base of the magnet area
fig. 41: normal flux density diagram, transversal on the magnet, in the middle area
fig. 42: normal flux density diagram, transversal on the magnet, at the limit to the air-gap area
fig. 43: tangential flux density diagram in the air-gap area, through the magnet axis
fig. 44: normal flux density diagram in the air-gap area, on polar pitch
fig. 45: tangential flux density diagram in the winding area, through the magnet axis
fig. 46: normal flux density diagram in the winding area, on polar pitch
fig. 47: tangential flux density diagram in the stator core area, through the magnet axis
fig. 48: normal flux density diagram in the stator core area, through the magnet axis
fig. 49: normal flux density diagram in the stator core area
the diagrams above present three different situations of a limited angle torque motor that can be met in applications: no-load conditions; a very low load, below 1 a, which is the most usual situation; and a 5 a load, which is considered to be a peak torque situation. it is extremely important to consider the magnetic field in any of the above situations, because the goal of the motor design is to put maximum power into a certain volume (maximum torque), while assuring that the motor operates properly in any working regime of interest. there are many areas where a designer controls the evolution of the magnetic field in various working regimes, but those mentioned above are the most important: to control the status of the dispersion field between the magnets; to control the level of the field and the influence of the winding field in the permanent magnet area; to control the level of saturation in the rotor hub area, as well as in the stator core area; to control the level and distribution of the field in the winding area.
4 conclusion
limited angle torque motors are important components used in various applications: control systems, automatic systems, drive systems, etc. most limited angle torque motors have a pancake configuration, which means two separate parts: a rotor and a stator. in this way, such motors can easily be adapted and assembled in the above systems. present day applications require limited angle torque motors with: the smallest possible relative size and dimensions; the highest possible torque density, which means the highest possible torque constant relative to a certain volume; a high insulation class, which means having the ability to work in harsh ambient conditions, at high current density. general classical formulas can be used to design a limited angle torque motor, but it is difficult to achieve the best results in this way. it is only by using numerical methods, based on finite element analysis, that the magnetic field can be controlled in any area of interest, ensuring that: any area is properly dimensioned; the total volume is used optimally; and the maximum torque constant is obtained. this way was used to analyze, design and produce many types of very well performing limited angle torque motors, with different sizes and electric parameters, with outer diameters from 15 mm to 300 mm, and also with a torque constant from 5 mnm/a to 5 nm/a. one of the most special limited angle torque motors has two poles and an angular excursion of ±40 degrees. with an outer diameter of 40 mm and a length of 9 mm, it was necessary to have 20 mnm/a as the torque constant, but 75 mnm at 4.5 a. earlier motors had worked properly only up to 2.5 a. at a higher current, saturation caused faster degradation of the torque constant. only using numerical methods was it possible to design a magnetic circuit and a winding distribution that would meet the application requirements.
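the saturation effect mentioned in the conclusion (a nominal torque constant of about 20 mnm/a that degrades at higher currents) can be caricatured numerically. the soft-knee model below and its parameters kt and t_sat are assumptions for illustration only; they are not the authors' measured characteristic, which was obtained by finite element analysis.

```python
import math

# toy model: torque rises linearly with current (constant kt) until the
# magnetic circuit saturates; an exponential knee caps the useful torque.
# kt and t_sat are illustrative values only, not fitted data.
def torque_mnm(i_amp, kt=20.0, t_sat=80.0):
    """torque [mnm] for current i_amp [a] with a soft saturation knee."""
    return t_sat * (1.0 - math.exp(-kt * i_amp / t_sat))

for i in (1.0, 2.5, 4.5):
    t = torque_mnm(i)
    print(f"i = {i:.1f} a -> t ~ {t:5.1f} mnm, effective kt ~ {t / i:4.1f} mnm/a")
```

the point of the caricature is the last column: the effective torque constant falls with increasing current once the iron saturates, which is exactly the behaviour the authors report for the earlier motors above 2.5 a.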
acknowledgement
this work was supported by cncsis-uefiscdi, project pn ii-ru, no. 1/28.07.2010, “high-precision strap-down inertial navigators, based on the connection and adaptive integration of the nano and micro inertial sensors in low cost networks, with a high degree of redundance”, code te-102/2010.
about the authors
radu obreja was born on december 22, 1983. he received his bachelor’s degree and master’s degree in electrical engineering from the faculty of electrical engineering, politehnica university of bucharest, romania. at present he is a phd student at the military technical academy. his research interests are in special electric machines, inertial navigation systems, low-cost navigation sensors and integration technologies.
ioana raluca edu was born on december 4, 1984. she received her bachelor’s degree and master’s degree in electrical engineering from the faculty of electrical engineering, politehnica university of bucharest, romania. at present she is a phd student at the faculty of electrical engineering. her research interests are in inertial navigation systems, low-cost navigation sensors and integration technologies.
radu obreja
e-mail: radu@sistemeuroteh.ro
military technical academy
bucharest, romania
ioana raluca edu
e-mail: edu ioana raluca@yahoo.com
politehnica university of bucharest
romania

acta polytechnica vol. 52 no. 2/2012
utilization of image intensifiers in astronomy
s. vítek, k. fliegel, p. páta, p. koten
abstract
in this paper we present the properties of image intensifiers, used together with fast tv cameras for astronomical purposes within the maia project (meteor automatic imager and analyser, primarily focused on observing meteoric events with high time resolution). the main objective of our paper is to evaluate the suitability of these devices for astronomical purposes in terms of noise, temporal and spectral analysis.
keywords: image intensifier, astronomy, meteors.
1 introduction
an interesting technique (that has become relatively inexpensive in recent years) for increasing the time resolution of any astronomical instrument is the use of a modern ccd camera with a fast frame rate of 50 or more frames per second. this type of system has a significant deficiency in terms of reduced sensitivity; the solution may be to use an image intensifier. this paper describes our experience with a device of this kind.
the maia astronomical instrument [2] uses a second generation mullard xx1332 image intensifier. the tube assembly of this 50/40 mm inverter (typical gain 30 000 to 60 000 lm/lm, resolution 30 lp/mm) is designed to be incorporated in night vision devices, in particular in tank driving periscopes. this leads to some properties which significantly define the limits and the possibilities of using a device of this kind in astronomy.
2 electrooptical characteristics
the instrument is equipped with an input aperture lens (pentax smc fa 1.4/50 mm) and an inner camera lens (pentax h1214-m 1.4/12 mm) [4]. the spectral transparency of the two lenses (input and camera), the spectral sensitivity of the image intensifier, the spectral sensitivity of the camera, the spectrum of the light at the output of the image intensifier, the spatial resolution of the input lens and the image intensifier, and also the spatial resolution of the whole system are among the most important parameters tested and presented in this paper.
fig. 1: the normalized gain of the system (measured at 650 nm) describes the automatic gain control as a nonlinearity in the image intensifier (a); the overall mtf of the system, including the camera (solid line), and the partial mtf of the image intensifier with the lens (dashed line) (b)
fig. 2: relative spectral response of the camera with (or without) a lens, and the relative power of the light at the output screen of the image intensifier (a); relative overall spectral sensitivity of the system for different digital levels b in the output image (b = 1 white, b = 0 black) (b)
fig. 3: temporal changes in stellar object flux
2.1 spectral response
the spectral response was measured independently for all parts of the system [3]. the experimental setup consisted of the lot-oriel collimated halogen light source, the lot-oriel omni 150 computer controlled monochromator, the expander to get even illumination of the image sensor, and the avantes avaspec-3648 fiber optic spectrometer. the measurement results are shown in figure 2.
2.2 spatial resolution
the mtf was measured using a test chart according to iso 12233. this chart can be used to evaluate mtf with two different approaches, utilizing a slanted edge (an approx. 5° slightly slanted black bar used to measure the horizontal or vertical spatial frequency response), or a line square wave sweep with the spatial frequency range 100–1000 lw/ph (line widths per picture height). in our case, slanted edges were used to determine the spatial frequency response — see figure 1(b).
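the slanted-edge method referred to above boils down to differentiating an oversampled edge-spread function and taking the magnitude of its fourier transform. the sketch below demonstrates this core step on a synthetic edge; the tanh edge shape and its blur width are arbitrary assumptions of the illustration (no edge-angle estimation, binning or windowing refinements), not properties of the maia optics.

```python
import numpy as np

# simplified slanted-edge mtf estimate: esf -> lsf -> |fft| (iso 12233 core idea)
x = np.linspace(-2.0, 2.0, 1024)          # oversampled position axis [arb. units]
esf = 0.5 * (1.0 + np.tanh(x / 0.05))     # synthetic edge-spread function (assumed blur)

lsf = np.gradient(esf, x)                 # line-spread function = derivative of esf
lsf /= lsf.sum()                          # normalize so that mtf(0) = 1

mtf = np.abs(np.fft.rfft(lsf))
freq = np.fft.rfftfreq(x.size, d=x[1] - x[0])

# report the frequency where the contrast drops to 50 % (the usual mtf50 figure)
mtf50 = freq[np.argmax(mtf < 0.5)]
print(f"mtf50 ~ {mtf50:.2f} cycles per unit length")
```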
3 noise performance
the noise analysis based on the acquisition of testing video sequences in various light conditions is described in [5]. we chose the generalized laplacian model (glm) for heavy-tailed noise.
4 temporal analysis
figure 3 shows the temporal changes in stellar object flux over 100 frames of a video sequence. the changes are mainly due to the automatic gain control. this fact puts greater demands on image calibration. we have proposed adaptive flat-fielding for any light conditions.
5 conclusions
the biggest disadvantage of the image intensifier described here is the built-in automatic gain control. as is shown in figure 1(a), the gain decreases rapidly with increasing input power, i.e. if any bright stellar object appears in the field of view. however, the image intensifier is not a bottleneck of the maia device — all measured parameters of the image intensifier are far better than the parameters of the ccd camera used for the project.
acknowledgement
this work has been supported by grant no. 205/09/1302 “study of sporadic meteors and weak meteor showers using automatic video intensifier cameras” of the grant agency of the czech republic. we would also like to acknowledge grant no. 102/09/0997 from the grant agency of the czech republic.
references
[1] koten, p.: software for processing of meteor video records. proceedings of the asteroids, comets, meteors 2002 conference, 197, 2002.
[2] vítek, s., koten, p., páta, p., fliegel, k.: double-station automatic video observation of the meteors. advances in astronomy, 2010 (article id 943145), 4 pages (2010).
[3] fliegel, k., havlín, j.: imaging photometer with a non-professional digital camera. proc. spie 7443, 74431q, 2009.
[4] fliegel, k., švihlík, p., páta, p., vítek, s., koten, p.: meteor automatic imager and analyzer: current status and preprocessing of image data. applications of digital image processing xxxiv, proc. spie, 2011.
[5] švihlík, p., fliegel, k., koten, p., vítek, s., páta, p.: noise analysis of maia system and possible noise suppression. radioengineering, vol. 20, pp. 110–117, 2011.
stanislav vítek
karel fliegel
petr páta
faculty of electrical engineering
czech technical university
pavel koten
astronomical institute of the academy of sciences of the czech republic

acta polytechnica vol. 52 no. 6/2012
physical and chemical aspects of the nucleation of cement-based materials
pavel demo 1,2, alexey sveshnikov 1,2, šárka hošková 1, david ladman 1, petra tichá 1
1 institute of physics, academy of sciences of the czech republic, cukrovarnická 10, 162 53 praha 6, czech republic
2 czech technical university in prague, faculty of civil engineering, thákurova 7, 166 29 praha 6, czech republic
corresponding author: demo@fzu.cz
abstract
a theoretical model of the nucleation of portlandite is proposed, and the critical size of a portlandite cluster and the energy barrier of nucleation are determined. the steady state nucleation rate and the time lag of the nucleation of portlandite are estimated for a pure solution of ca(oh)2 in water. possible connections with the corresponding properties for cement paste are discussed. a new method is developed for experimentally determining the concentration of ca2+ ions during the initial stage of hydration of a cement paste. the time dependence of ca2+ ions is measured for various water-to-cement ratio values. the results are discussed from the point of view of existing models of the induction period.
keywords: cement paste, induction period, calcium concentration, nucleation rate, time lag.
1 introduction
cementitious materials are one of the most widely used types of materials in modern civil engineering. concrete has many advantages, such as very high compressive strength, good resistance to corrosion and fire, and good acoustic damping.
workability occupies a special place among these properties of concrete. workability refers to the ability of a fresh cement paste to fill the desired shape without reducing the quality of the concrete. high workability of cement paste over a period of several hours significantly reduces construction costs, and enables both easier shaping of the material and pre-casting of concrete that can then be delivered to the construction site. having in mind the great importance of the workability of cement paste, extensive research has been carried out in order to understand the underlying mechanism and to study the possibility of extending the workability period without a significant impact on the properties of the final material. despite the efforts that have been made, there are still many unanswered questions due to the very complex nature of the physico-chemical transformations taking place in the hydrating cement paste.
the hydration process of cement paste can be divided into a number of main stages. the first stage is the initial wetting of the cement particles and rapid dissolution of a small amount of the cement minerals, especially the calcium aluminates and the alkali sulfates. this stage is accompanied by significant heat production and lasts only a few minutes. during this stage the hydrating tricalcium silicate also releases a large number of calcium ions into the solution, due to the very high intrinsic solubility of calcium. after the initial stage of exothermic reactions, the cement paste “falls asleep”, and for the next few hours practically no changes are observed in the state of the paste. the duration of this so-called dormant (or induction) period directly determines the workability, since the strength of the paste begins to grow steadily after the dormant period finishes and the concrete begins to set. thus, it is important to understand the reasons for the existence of the induction period and possible ways to influence its duration without a negative outcome.
several hypotheses have been proposed in order to explain the induction period. taylor has grouped them into four main classes [9]:
1. the initial nuclei of c-s-h and ch are somehow poisoned by sio2 and cannot grow until the solution becomes supersaturated [8].
2. the membrane hypothesis assumes that during the first stage of paste hydration a semipermeable membrane forms around the particles, thus separating the solution into an inner part and an outer part. the membrane prevents the ions dissolved in the inner solution from reaching the main volume of the solution. according to this hypothesis, the induction period ends when the osmotic pressure inside the membrane grows high enough to break it [1, 6].
3. the protective layer hypothesis is similar to the membrane hypothesis, but supposes that the semipermeable layer is formed directly on the surface of the grains. as in the case of the membrane hypothesis, the induction period ends when the protective layer is somehow removed or made permeable.
4. the nucleation hypothesis attributes the existence of the induction period to the very mechanism of the formation and growth of the nuclei.
the growing particles of the new phase have to overcome a certain energetical barrier by means of fluctuations, and this cannot happen instantly. this hypothesis explains the induction period as an intrinsic time lag of the nucleation process [3, 10].
in our paper we analyze a nucleation hypothesis of the induction period. the hydrating cement paste is an extremely complex system. for our analysis, we concentrated our attention on one of the main phases in the hydrating paste — portlandite ca(oh)2. we measured the concentration of ca2+ in a hydrating cement paste, and combined this with theoretical modelling of the nucleation of a water solution of calcium hydroxide to estimate the nucleation time lag of portlandite formation.
2 experimental
materials. in our experiments we used an ordinary portland cement (opc) cem i 42.5r, mokrá, czech republic. the oxide and mineralogical composition of the cement is given in tables 1 and 2.

table 1: oxide composition of cem i 42.5r, mokrá, czech republic
oxide          cao     sio2    al2o3   fe2o3   mgo    so3    na2o   k2o    mno
comp. [wt.%]   63.77   20.51   4.74    3.3     1.05   3.07   0.15   0.95   0.09

table 2: phase composition of cem i 42.5r, mokrá, czech republic (bogue)
mineral        c3s     c2s     c3a     c4af    c2f    so3-bound cao
comp. [wt.%]   58.34   14.8    6.81    10.31   0      2.149

chemicals. all chemical agents applied are of purity p.a.: 0.05 m edta (disodium ethylenediaminetetraacetate), 0.05 m mgso4, schwarzenbach buffer, eriochrome black t.
method. the concentration of ca2+ ions at an early stage of hydration of the cement paste was measured. the method for determining free calcium ions [5] is essentially based on the use of a chelatometric titration method, namely determining the concentration of ca2+ ions by disodium ethylenediaminetetraacetate (edta, chelaton iii), a basic volumetric (titration) method in analytical chemistry. it is well known that chelatometric determination utilizes the fact that edta forms very strong compounds (chelates) with certain metals belonging to the ii, iii and iv group. calcium is one of these metals. however, in our method for determining calcium ions in the first two hours of setting of the cement paste, edta plays a more complex role. during the first two hours after the cement is mixed with water, the reactions run very quickly and the release of ca2+ can be very rapid. the reaction is exothermic, as shown by the calorimetric curves (see figure 1). in order to measure the time dependence of the concentration of the calcium ions in the hydrating cement pastes, it is necessary to stop the reaction at a chosen instant. the hydration reaction was stopped with the use of edta, which is simultaneously a strong retardant for the reaction of cement with water. thus this acid serves both as the retardant and as the volumetric reagent for calcium ions. to improve the accuracy of the method, we used indirect titration, i.e. the amount of excess edta was determined by reaction with magnesium ions, and the ca2+ concentration was calculated from the difference. the concentration of ca2+ ions as a function of time was determined for various water-to-cement ratios (w/c = 0.35; 0.40; 0.45; 0.50; 0.60), and the results are presented in figure 2. all the curves have a peak at the very beginning of the hydration process, which corresponds to the first stage of intense exothermic reactions. afterwards the concentration drops. this decrease in free ion concentration can be explained as the formation of some protective layer or membrane. when the osmotic pressure exceeds a certain threshold, the membrane bursts and releases the ca2+ ions held within, leading to a later increase in concentration. at this instant, the nucleation mechanism kicks in, and portlandite nuclei begin to form.
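the arithmetic behind the indirect titration described above is simple enough to state explicitly. in the sketch below, the ca2+ content is the difference between the total edta added and the excess edta found by the back-titration with magnesium ions, assuming the usual 1:1 chelate stoichiometry; the function name and all volumes are hypothetical placeholders, not values from the experiments.

```python
# hedged sketch of the indirect (back-)titration arithmetic:
# the total edta added binds ca2+; the unreacted excess is titrated with
# mg2+ (1:1 stoichiometry in both chelates), so
#   n(ca2+) = c_edta * v_edta - c_mg * v_mg
def ca_concentration_mol_l(c_edta, v_edta_ml, c_mg, v_mg_ml, v_sample_ml):
    """ca2+ concentration [mol/l] from an edta back-titration (illustrative)."""
    n_ca = c_edta * v_edta_ml / 1000.0 - c_mg * v_mg_ml / 1000.0  # [mol]
    return n_ca / (v_sample_ml / 1000.0)

# hypothetical example with 0.05 m solutions, as in the chemicals list:
print(ca_concentration_mol_l(0.05, 20.0, 0.05, 12.5, 10.0))  # -> 0.0375 mol/l
```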
figure 1: initial stage of the hydration process in hcp of portland cement, mokrá (w/c = 0.40; 0.45), measured using a setaram c 80 calorimeter (heat flow [mw/g] vs. time [min])
from the point of view of nucleation theory, the ca2+ ions serve as monomers for future portlandite clusters. these monomers are absorbed in the nucleation process, leading to a new decrease in calcium ion concentration and the establishment of the second peak on the curves in figure 2.
3 model
according to nucleation theory, the initial stage of the process of a first order phase transition consists of two steps: nucleation and growth. before the first particles of the new, stable phase can start to grow steadily, they have to appear somehow in the metastable phase, and this process is not easy for them. although the resulting phase is more favorable from the energetical point of view, the high surface-to-volume ratio of small particles combined with positive excess interface energy makes them unstable. thus, nucleation theory distinguishes nucleation itself, when the embryos of the stable phase are stochastically produced in the volume of the metastable phase by means of fluctuations, and growth, when the nuclei increase in size deterministically. depending on the system under consideration, the establishment of a steady growth rate can take a significant period of time, and this is called the time-lag of nucleation [11].
portlandite clusters grow from a solution containing ca2+ and oh− ions. we assume that hydroxyl ions are present in abundance, because they are not only a product of ca(oh)2 dissociation, but also a product of dissociation of water itself. thus, the formation of portlandite clusters is controlled by the supersaturation of calcium ions:
$$s = \frac{c_{\mathrm{act}}}{c_{\mathrm{eq}}}, \tag{1}$$
where $c_{\mathrm{act}}$ represents the actual concentration of ca2+ ions within the solution and $c_{\mathrm{eq}}$ is their equilibrium concentration. the gibbs free energy of portlandite cluster formation is
$$\Delta g = \Delta g_v + \Delta g_s, \tag{2}$$
where $\Delta g_v$ stands for the volume term, and $\Delta g_s$ is the surface term. the volume term is negative, because the solid phase as a whole is stable and energetically more favorable than the solution. conversely, the surface term is positive and reflects the general tendency of a system to eliminate internal interfaces and become homogeneous. the exact interplay between the volume and the surface terms in (2) depends on the shape of the growing cluster. it has been observed [2] that small crystals of portlandite are shaped as hexagonal plates (see figure 3). the fact that the radius of crystal $r$ is much larger than its height $h$ suggests that the surface tension in this case is strongly anisotropic. denoting the surface tension on the corresponding planes as $\sigma_r$ and $\sigma_h$, we obtain
$$\Delta g = -kt\,\frac{3\sqrt{3}\,r^2 h}{2v}\,\ln s + 3\sqrt{3}\,r^2\sigma_r + 6\,r h\,\sigma_h, \tag{3}$$
where $k = 1.38\times 10^{-23}$ j/k is the boltzmann constant, $t$ stands for absolute temperature, and $v = 1.65\times 10^{-28}$ m$^3$ is the volume of a unit cell of portlandite, which is hexagonal with basis cell dimensions a = 0.3585 nm and c = 0.4895 nm.
figure 2: concentration of ca2+ ions in hcp vs. time, related to the total mass of hcp (t = 25 °c)
figure 3: the assumed shape of a portlandite cluster (radius r, height h) and the corresponding surface energies σr and σh
under given supersaturation s, expression (3) is a function of two variables, r and h. this expression describes the energetical landscape on which growing clusters have to find the optimal way to the valley, representing the stable solid phase.
the optimal way leads through a saddle point of the energetical surface (3), and corresponds to the best thermodynamical approximation to the shape of a cluster. if we take kinetic aspects of cluster growth into account, it is possible that the dynamically changing shape of a growing crystal does not pass exactly through the saddle point of the energetical landscape. however, this deviation is negligible in most situations, so in the following text we assume that the shape of a critical cluster is determined from the pair of equations
$$\left.\frac{\partial \Delta g}{\partial r}\right|_{r_c,h_c} = 0, \qquad \left.\frac{\partial \Delta g}{\partial h}\right|_{r_c,h_c} = 0. \tag{4}$$
solution of the system (4) yields
$$r_c = \frac{4\sigma_h v}{\sqrt{3}\,kt \ln s}, \qquad h_c = \frac{4\sigma_r v}{kt \ln s}. \tag{5}$$
clusters which are smaller than the critical cluster tend to dissolve, while larger clusters can grow freely. this last conclusion is valid for an open system, but in a closed system the situation is more complicated. in a closed system, the concentration of monomers is not constant, as they are consumed by the growing clusters. thus, supersaturation s is constantly decreasing, which leads to an increase in the critical sizes $r_c$ and $h_c$. it may even be possible that in later stages of the nucleation process, the critical size grows faster than the front of the distribution function of the cluster. then the clusters that have already originated and are overcritical may become undercritical again and dissolve. this effect of depletion of ca2+ monomers due to their consumption by growing portlandite is shown in figure 2 as a decrease in concentration after it achieves the second maximum. however, during the whole process the ratio of the radius of the critical platelet to its height remains constant and independent of supersaturation s:
$$\frac{r_c}{h_c} = \frac{\sigma_h}{\sqrt{3}\,\sigma_r}. \tag{6}$$
figure 4: dependence of the steady state nucleation rate of portlandite (arb. units) on the supersaturation s of ca2+ ions
since $r_c \gg h_c$, lateral surface tension $\sigma_h$ must be much larger than basal $\sigma_r$. when a macroscopic measurement of the surface tension of portlandite is performed, it yields the surface energy value, averaged over the surface of the whole cluster. it can easily be shown from (6) that the contribution of the basal surface energy to the total energy of the cluster is four times smaller than the corresponding contribution of the lateral surface energy. consequently, macroscopic measurements of the surface energy of portlandite provide values for $\sigma_h$. after substituting (5) into (3), we obtain an expression for the energy barrier of nucleation:
$$\Delta g = \frac{16\sqrt{3}\,\sigma_h^2 \sigma_r v^2}{(kt \ln s)^2}. \tag{7}$$
if we take the values for calcium hydroxide supersaturation from [4], $s \in (1.75, 3)$, then it is possible to obtain the following estimations for the critical size of a portlandite cluster at t = 298 k:
$$r_c \in (5.1\ \mathrm{nm}, 10\ \mathrm{nm}), \qquad h_c \in (1\ \mathrm{nm}, 2\ \mathrm{nm}). \tag{8}$$
the corresponding energy barrier
$$\Delta g \in (7.2\ \mathrm{ev}, 26.5\ \mathrm{ev}). \tag{9}$$
however, these estimations were performed using the equilibrium concentration of ca2+ ions based on the solubility curve of calcium hydroxide in water. data for the calcium hydroxide solution in the cement paste is not available. we can only assume that the nucleation of portlandite is the dominant process during the induction period and, consequently, the equilibrium concentration of ca2+ does not differ significantly from the pure solution. it is still an open question, though, how to relate the water-to-cement ratio of the cement paste to supersaturation of the solution with respect to portlandite.
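the estimates (8) and (9) are easy to reproduce from equations (5) and (7). since the surface-energy values used in the paper are not quoted, the σh and σr in the sketch below are back-of-envelope assumptions tuned so that the critical sizes land in the quoted range; the resulting barrier then comes out close to, but not exactly at, the interval (9).

```python
import numpy as np

k = 1.38e-23     # boltzmann constant [j/k]
t = 298.0        # temperature [k]
v = 1.65e-28     # unit-cell volume of portlandite [m^3]
ev = 1.602e-19   # joules per ev

# assumed surface energies [j/m^2]; placeholders, not measured portlandite data
sig_h, sig_r = 0.060, 0.0069

for s in (1.75, 3.0):
    kt_lns = k * t * np.log(s)
    rc = 4 * sig_h * v / (np.sqrt(3) * kt_lns)                    # eq. (5)
    hc = 4 * sig_r * v / kt_lns                                   # eq. (5)
    dg = 16 * np.sqrt(3) * sig_h**2 * sig_r * v**2 / kt_lns**2    # eq. (7)
    print(f"s = {s:4.2f}: rc ~ {rc * 1e9:4.1f} nm, hc ~ {hc * 1e9:4.2f} nm, "
          f"barrier ~ {dg / ev:4.1f} ev")
```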
despite the relatively small variations in the critical size and the energy barrier, the steady state nucleation rate can be quite sensitive to supersaturation and, thus, to the water-to-cement ratio. this is explained by the fact that the nucleation rate (i.e. the number of new supercritical clusters appearing per unit volume of the system per unit time) is exponentially dependent on the energy barrier of nucleation:
$$i \sim \exp\left(-\frac{\Delta g_c}{kt}\right). \tag{10}$$
a detailed knowledge of the kinetics of cluster growth is necessary for calculating the exact proportionality coefficient in (10). nevertheless, relation (10) enables us to estimate the relative change in the nucleation rate under normal conditions (see figure 4):
$$\frac{i_{\max}}{i_{\min}} \simeq 80. \tag{11}$$
figure 5: time lag of the nucleation of portlandite (arb. units) in dependence on supersaturation s of ca2+ ions
equation (10) gives the nucleation rate in the steady state. however, as discussed above, due to the stochastic nature of the nucleation process, the nucleation rate does not achieve its steady state value immediately after the creation of supersaturation in the system. the corresponding time lag of nucleation can be estimated as [7]
$$\tau = \frac{1}{2.8\,z^2 r}, \tag{12}$$
where z is the so-called zeldovich factor, which describes the curvature of the energetical landscape (3) at its saddle point, and r stands for the rate at which monomers are incorporated into critical-size clusters. the zeldovich factor z is equal to
$$z = \sqrt{\frac{-1}{2\pi kt}\,\frac{\partial^2 \Delta g_c}{\partial n^2}}, \tag{13}$$
where n is the number of monomers inside the cluster. figure 5 shows the dependence of the time lag of portlandite nucleation on supersaturation s.
4 conclusion
we have proposed a new method for experimentally determining the concentration of ca2+ ions in hydrating cement paste during the first few hours after mixing cement with water. the method is based on a standard chelatometric titration method, but uses edta both as the retardant of the hydration process and as the volumetric reagent for calcium ions. the evolution of calcium ion concentration has been studied by this method for several w/c ratios. the resulting curves (see figure 2) exhibit in general a bimodal behavior. the existence of the first maximum is attributed to the release of ca2+ ions during the initial fast process of hydration of c-s-h, followed by the formation of some sort of protective layer around the cement grains. during the induction period, the concentration of calcium ions slowly grows due to the gradual hydration process. however, at later times the concentration goes down again because of the consumption of ca2+ by nucleating and growing portlandite clusters. homogeneous nucleation theory is applied to portlandite cluster formation. we derive expressions for critical size and the energy barrier. unfortunately, no data is available on the equilibrium concentration of ca2+ ions in a cement paste. we estimate the critical size and the energy barrier of nucleation on the basis of the solubility curve of portlandite in water. next, the expressions for the steady state nucleation rate and the time lag are obtained. the time lag of nucleation is strongly dependent on supersaturation (see figure 5). in the future, we plan to use our theoretical and experimental results to find the equilibrium concentration of ca2+ ions in a cement paste, with a corresponding recalculation of the critical size and the energy barrier.
if we assume that the major part of the induction period is due to the nucleation time lag, the comparison of the experimental data for the dependence of the duration of the induction period on the water-to-cement ratio, together with figure 5, can provide information about the supersaturation of portlandite in a cement paste. in combination with data about the concentration of ca2+ ions (figure 2), it may be possible to calculate the solubility curve of calcium hydroxide in a cement paste.
acknowledgements
this study was supported by the ministry of education, youth and sport of the czech republic, project no. cez msm 6840770003 and grant sgs10/125/ohk1/2t/11.
references
[1] p. brown, j. pommersheim, g. frohnsdorff. a kinetic model for the hydration of tricalcium silicate. cement and concrete research 15(1):35–41, 1985.
[2] e. gallucci, k. scrivener. crystallisation of calcium hydroxide in early age model and ordinary cementitious systems. cement and concrete research 37(4):492–501, 2007.
[3] s. garrault-gauffinet, a. nonat. experimental investigation of calcium silicate hydrate (c-s-h) nucleation. journal of crystal growth 200(3–4):565–574, 1999.
[4] m. gartner et al. chap. 3 of structure and performance of cements (eds. j. bensted, p. barnes). spon press, 2001.
[5] š. hošková, p. tichá, p. demo. determination of ca2+ ions at early stage of hydrating cement paste. ceramics-silikáty 53(2):76–80, 2009.
[6] h. jennings, p. pratt. an experimental argument for the existence of a protective membrane surrounding portland cement during the induction period. cement and concrete research 9(4):501–506, 1979.
[7] i. kanne-dannetschek, d. stauffer. quantitative theory for time lag in nucleation. journal of aerosol science 12(2):105–108, 1981.
[8] s. preece, j. billingham, a. king. on the initial stages of cement hydration. journal of engineering mathematics 40:43–58, 2001.
[9] h. taylor. cement chemistry. 2nd edition, thomas telford, 2004.
[10] j. thomas, h. jennings, j. chen. influence of nucleation seeding on the hydration mechanisms of tricalcium silicate and cement. journal of physical chemistry c 113:4327–4334, 2009.
[11] h. vehkämäki. classical nucleation theory in multicomponent systems. springer, 2006.

acta polytechnica vol. 50 no. 3/2010
time-dependent and/or nonlocal representations of hilbert spaces in quantum theory
m. znojil
abstract
a few recent innovations of the applicability of standard textbook quantum theory are reviewed. the three-hilbert-space formulation of the theory (known from the interacting boson models in nuclear physics) is discussed in its slightly broadened four-hilbert-space update. among applications involving several new scattering and bound-state problems the central role is played by models using apparently non-hermitian (often called “crypto-hermitian”) hamiltonians with real spectra. the formalism (originally inspired by the topical need for a mathematically consistent description of tobogganic quantum models) is shown to admit even certain unusual nonlocal and/or “moving-frame” representations h(s) of the standard physical hilbert space of wave functions.
keywords: quantum theory, cryptohermitian operators of observables, stable bound states, unitary scattering, quantum toboggans, supersymmetry, time-dependent models.
1 introduction
the fourier transformation $f: \psi(x) \to \tilde\psi(p)$ of wave functions converts the differential kinetic-energy operator $k \sim d^2/dx^2$ into a trivial multiplication by a number, $\tilde{k} = f\,k\,f^{-1} \sim p^2$.
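the diagonalizing action of the fourier transformation on the kinetic-energy operator can be verified numerically in a few lines. the sketch below uses the discrete fourier transform on a periodic grid (an assumption of this illustration, not of the text) and checks that multiplication of the transform by (ik)^2 reproduces the analytic second derivative of a gaussian test function.

```python
import numpy as np

# numerical check that the fourier transform diagonalizes k ~ d^2/dx^2:
# differentiating psi(x) twice equals multiplying its transform by (i k)^2.
n, l = 256, 20.0
x = np.linspace(-l / 2, l / 2, n, endpoint=False)
k = 2 * np.pi * np.fft.fftfreq(n, d=l / n)       # momentum grid

psi = np.exp(-x**2)                               # smooth, effectively periodic state
d2_spectral = np.fft.ifft((1j * k)**2 * np.fft.fft(psi)).real
d2_exact = (4 * x**2 - 2) * np.exp(-x**2)         # analytic second derivative

print(np.max(np.abs(d2_spectral - d2_exact)))     # tiny: close to machine precision
```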
this means that for certain quantum systems the fourier transformation offers a simplification of the solution of the schrödinger equation. the generalized, nonunitary (often called dyson) mappings ω play the same simplifying role in the context of nuclear physics [1]. in our present brief review paper we intend to recall and discuss very recent progress and, mainly, a few of our own results in this direction. our text will be more or less self-contained, though the limitations imposed on its length will force us to skip all remarks on the history of the subject as well as on references and on a broader context. fortunately, interested readers may very easily get acquainted with these aspects of the new theory in several very thorough and extensive reviews [2] and also in our own recent compact review [3] and/or in our two-years-old short paper [4].
in section 2 we shall start our discussion from bound-state models characterized by the loss of observability of complexified coordinates. in the generic dynamical scenario where the riemann surface of the wave functions can be assumed multisheeted, we shall define certain monodromy-sensitive models called quantum toboggans. our selection of their sample applications will cover innovative models possessing several branch points in the complex x-plane and/or exhibiting supersymmetry. section 3 will offer information about the specific cryptohermitian approach to bound-state models characterized by the manifest time-dependence of their operators of observables (cf. paragraph 3.1) or by the presence of a fundamental length in the theory (cf. paragraph 3.2). the two possible mechanisms of a return to unitarity in the models of scattering by complex potentials will be described in section 4. via concrete examples we shall emphasize the beneficial role of a “smearing” of phenomenological potentials and the necessity of an appropriate redefinition of the effective mass in certain regimes. section 5 contains a few concluding remarks. for the sake of completeness, a few technical remarks concerning the role of the dyson mapping in the abstract formulation of quantum theory as well as in some of its concrete applications will be added in the form of three appendices.
2 quantum theories working with quadruplets of alternative hilbert spaces
within the cryptohermitian approach, a new category of models of bound states appeared, a few years ago, under the name of quantum toboggans [5]. their introduction extended the class of integration paths of complexified “coordinates” x = q(s) in the standard schrödinger equations to certain topologically nontrivial complex trajectories. the hamiltonians $h^{(t)} = p^2 + v^{(t)}(x)$ containing analytic potentials $v^{(t)}(x)$ with singularities (the superscripts (t) stand here for “tobogganic”) were connected with the generalized complex asymptotic boundary conditions and specified as operating in a suitable hilbert space h(t) of wave functions in which the hamiltonian itself is manifestly non-hermitian. practical phenomenological use of any cryptohermitian quantum model requires, firstly, a sufficiently persuasive demonstration of the reality of its spectrum and, secondly, the availability of at least one metric operator θ = θ(h) (cf. appendices a–c for its definition). usually, both of these conditions are nontrivial, so that any form of the solvability of the model is particularly helpful. vice versa, once the hamiltonian h proves solvable in hilbert space h(t), we may rely upon the availability of the closed solutions of the underlying schrödinger equations and on the related specific spectral representations of the necessary operators (cf. [6, 7] for more details).
the topological nontriviality of the tobogganic paths of coordinates running over several riemann sheets of wave functions happened to lead to severe complications in the numerical attempts to compute the spectra. this difficulty becomes almost insurmountable when the wave functions describing quantum toboggans happen to possess two or more branch points (cf. [8] for an illustrative example). for these reasons it is recommended to rectify the tobogganic integration paths via a suitable change of variables in a preparatory step [9]. our tobogganic schroedinger equations then acquire the generalized eigenvalue-problem form $h\psi = e\,w\,\psi$ of the so-called sturm–schroedinger equations, with the rectified hamiltonian $h \neq h^\dagger$ and with a nontrivial weight operator $w \neq w^\dagger \neq i$. both of these operators are defined in another, transformed, “more friendly” hilbert space h(f), of course [10].
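the sturm–schroedinger structure $h\psi = e\,w\,\psi$ with a manifestly non-hermitian pair (h, w) but a purely real generalized spectrum can be imitated by a finite-dimensional toy model: a hermitian-definite pair (h0, w0) transformed by an invertible, non-unitary "dyson-like" map keeps its real spectrum. the matrices below are random illustrations, not a discretization of any tobogganic hamiltonian.

```python
import numpy as np
from scipy.linalg import eig, eigh

# toy generalized eigenproblem h psi = e w psi with non-hermitian h, w
rng = np.random.default_rng(1)
n = 5
a = rng.standard_normal((n, n)); h0 = a + a.T                   # hermitian h0
b = rng.standard_normal((n, n)); w0 = b @ b.T + n * np.eye(n)   # positive definite w0
omega = np.eye(n) + 0.3 * rng.standard_normal((n, n))           # assumed invertible, non-unitary

inv = np.linalg.inv(omega)
h, w = inv @ h0 @ omega, inv @ w0 @ omega                       # manifestly non-hermitian pair

e_ref = eigh(h0, w0, eigvals_only=True)          # real reference spectrum
e_gen = np.sort(eig(h, w, right=False).real)     # imaginary parts are numerically zero
print(np.allclose(np.sort(e_ref), e_gen))        # -> True
```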
2.1 supersymmetric quantum toboggans
the introduction of the cryptohermitian and tobogganic models proved useful in the context of supersymmetry (susy). a sample of papers devoted to this subject is referenced in [2]. the easiest case (called supersymmetric quantum mechanics) uses just the hamiltonian and the two charge operators generating the susy algebra,
$$h = \begin{pmatrix} h^{(-)} & 0 \\ 0 & h^{(+)} \end{pmatrix} = \begin{pmatrix} ba & 0 \\ 0 & ab \end{pmatrix}, \qquad q = \begin{pmatrix} 0 & 0 \\ a & 0 \end{pmatrix}, \qquad \tilde q = \begin{pmatrix} 0 & b \\ 0 & 0 \end{pmatrix}.$$
for the solvable model of ref. [11] the energy spectrum (composed of four families $e_n = a(n), \ldots, d(n)$) is displayed in figure 1. at γ = −1/2 the singularity vanishes and the (up to the ground state) doubly degenerate susy spectrum becomes strictly equidistant. the imposition of supersymmetry has been extended to quantum toboggans in [12]. both the components of the super-hamiltonian were defined along topologically nontrivial complex curves which connect several riemann sheets of the wave function. the new feature of this generalized model lies in the non-uniqueness of the map t between “tobogganic” partner curves. as a consequence, we must redefine the creation- and annihilation-like operators as follows,
$$a = -t\,\frac{d}{dx} + t\,w^{(-)}(x), \qquad b = \frac{d}{dx}\,t^{-1} + w^{(-)}(x)\,t^{-1}.$$
in contrast to the non-tobogganic cases, the hermitian-conjugation operator t even ceases to be involutory (i.e., $t \neq t^{-1}$, cf. paper [12] for more details).
fig. 1: spectrum of the singular supersymmetric harmonic oscillator (energy families a(n)–d(n) as functions of the parameter γ)
2.2 four-hilbert-space quantum mechanics
in a way explained in our papers [7], the tobogganic quantum systems with real energies generated by their apparently non-hermitian hamiltonians may be assigned the entirely standard and consistent probabilistic interpretation. for this purpose the initial hilbert space h(t) is replaced by another, friendly hilbert space h(f) in which the above-mentioned sturm–schroedinger equations $h\psi = e\,w\,\psi$ have to be solved. this forces us to generalize the three-hilbert-space scheme of paper [3] [cf. also appendices a and b and figure 2] and to use the following four-hilbert-space pattern of mappings:
tobogganic space h(t): analytic multivalued ψ[q(s)], multisheeted paths q(n)(s); h = h†, w = w†; dynamics via topology; physics in h(p)
↓ rectification (the change of variables)
feasibility in h(f): h ≠ h†, w ≠ w†; sturm–schrödinger eqs.
→ hermitization (metric is introduced)
standard space h(s): h = h‡, w = w‡; ad hoc metric θ ≠ i
↑ equivalence (the unitary mapping) with the physics in h(p)
the analyticity of the original wave function ψ[q(s)] along the given tobogganic integration path with parameter s ∈ (−∞, ∞) is assumed. the rectification transition between hilbert spaces h(t) and h(f) is tractable as an equivalence transformation under this assumption [10]. in the subsequent sequence of maps f → s and f → p one simply follows the old three-hilbert-space pattern of appendix c [3] in which just the nontrivial weight operators w and/or w are added and appear in the respective generalized sturm–schrödinger equations. marginally, let us add that various, suitably modified spectral representations of the eligible metric operators may be used, say, in the form derived in [7]. the purely kinematical and exactly solvable topologically nontrivial “quantum knot” example of ref. [13] can also be recalled here as an exactly solvable illustration in which the confining role of the traditional potential is fully simulated by the mere topologically nontrivial shape of the complex integration path.
3 bound-state theories working with the triplets of alternative hilbert spaces
3.1 quantum models admitting the time-dependence of their cryptohermitian hamiltonians
in our review [3] of the three-hilbert-space (3hs) formalism we issued a warning that some of the consequences of the enhanced flexibility of the language and definitions may sound like new paradoxes. for illustration, let us mention just that in the 3hs approach the generator $h^{(gen)} = h^{(gen)}(t)$ of the time-evolution of wave functions is allowed to be different from the hamiltonian operator h = h(t) of the system in question [14]. the key to the disentanglement of the similar puzzles is easily found in the explicit specification of the hilbert space in which we define the hermitian conjugation. we showed in [14] that the use of the full triplet of spaces of figure 2 becomes unavoidable whenever our cryptohermitian observables are assumed time-dependent, because their variations may and must be matched by the time-dependence of the representation of the physical ad hoc hilbert space h(s). its nontrivial inner product is capable of playing the role of a “moving frame” image of the original physical hilbert space h(p). although our “true” hamiltonian (i.e., operator h(t) in h(p)) is the generator of the time evolution in h(p), the time-evolution of the wave functions in h(s) is controlled not only by the “dynamical” influence of h = h(t) itself but also by the “kinematical” influence of the time-dependence of the “rotating” dyson mapping ω = ω(t). thus, the existence of any other given and manifestly time-dependent observable o(t) in h(p) will leave its trace in dyson map ω(t), i.e., in metric θ(t), i.e., in the time-dependence of the “moving frame” hilbert space h(s). this circumstance implies the existence of two pullbacks of the evolution law from h(p) to h(s), with the recipe $|\varphi(t)\rangle = \omega^{-1}(t)\,|\varphi(t)\succ$ being clearly different from the complementary recipe $\langle\langle \varphi(t)| = \prec\varphi(t)|\,\omega(t)$.
the same dyson mapping leads to the two different evolution operators, viz., to the evolution law for kets,
$$|\varphi(t)\rangle = u_r(t)\,|\varphi(0)\rangle, \qquad u_r(t) = \omega^{-1}(t)\,u(t)\,\omega(0),$$
and to the different evolution law for brabras,
$$|\varphi(t)\rangle\rangle = u_l^\dagger(t)\,|\varphi(0)\rangle\rangle, \qquad u_l^\dagger(t) = \omega^\dagger(t)\,u(t)\,\left[\omega^{-1}(0)\right]^\dagger.$$
we have no space here for the detailed reproduction of the whole flow of this argument as presented in [14]. its final outcome is the definition of the common time-evolution generator
$$h^{(gen)}(t) = h(t) - \mathrm{i}\,\omega^{-1}(t)\,\dot\omega(t)$$
entering the final doublet of time-dependent schrödinger equations
$$\mathrm{i}\,\partial_t |\phi(t)\rangle = h^{(gen)}(t)\,|\phi(t)\rangle, \tag{1}$$
$$\mathrm{i}\,\partial_t |\phi(t)\rangle\rangle = h^{(gen)}(t)\,|\phi(t)\rangle\rangle. \tag{2}$$
this ultimately clarifies the artificial character and redundancy of mostafazadeh’s conjecture [15] of quasistationarity, i.e., of the requirement of time-independence of the inner products and of the metric, i.e., ipso facto, of hilbert space h(s).
3.2 systems admitting a controllable nonlocality
in a way emphasized by jones [16], the direct observability of coordinates x is lost for the majority of the parity-times-time-reversal-symmetric (or, briefly, pt-symmetric) quantum hamiltonians. in the context of scattering, this forced us to admit a non-locality of the potentials in [17]. fortunately, in the context of bound states the loss of the observability of coordinates is much less restrictive, since we do not need to prepare any asymptotically free states. the admissible hilbert-space metrics θ may be chosen as moderately non-local, acquiring, in the simplest theoretical scenario as proposed in our paper [18], the form of a short-ranged kernel in a double-integral normalization or in the inner products of the wave functions. the standard dirac delta-function kernel is simply reobtained in the zero-range limit. in refs. [17, 19] we proposed several bound-state toy models exhibiting, in a confined-motion dynamical regime, various forms of an explicit control of the measure θ of their dynamically generated non-locality. the exact solvability of some of these models even allowed us to assign each hamiltonian the complete menu of its hermitizing metrics $\theta = \theta_\theta$ distinguished by their optional fundamental lengths θ ∈ (0, ∞). in this setting the local metrics reappear at θ = 0, while certain standard hermitizations only appeared there as infinitely long-ranged, with θ = ∞.
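the interplay of the dyson map ω, the metric θ = ω†ω and a non-hermitian h with real spectrum, which is the backbone of the constructions in this section and in appendices a–c, can be checked on a random finite-dimensional example. the sketch below is a generic numerical illustration of quasi-hermiticity, not a model of any particular hamiltonian from the text.

```python
import numpy as np

# finite-dimensional illustration of cryptohermiticity: h = omega^{-1} h0 omega
# with hermitian h0 and invertible (non-unitary) omega has a real spectrum and
# satisfies the quasi-hermiticity condition theta h = h^dagger theta, theta = omega^dagger omega.
rng = np.random.default_rng(42)
n = 4
a = rng.standard_normal((n, n))
h0 = a + a.T                                             # hermitian isospectral partner
omega = np.eye(n) + 0.4 * rng.standard_normal((n, n))    # dyson-like map, not unitary

h = np.linalg.inv(omega) @ h0 @ omega                    # manifestly non-hermitian h
theta = omega.T @ omega                                  # positive-definite metric

print(np.allclose(np.linalg.eigvals(h).imag, 0.0, atol=1e-10))  # real spectrum
print(np.allclose(theta @ h, h.T @ theta))                      # theta h = h^dagger theta
```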
4 scattering theories using pairs of hilbert spaces h(p) ≠ h(f)
in our last illustrative application of the 3hs formalism, let us select just two non-equivalent hilbert spaces h(f,s) and turn to scattering theory, where one assumes that the coordinate is certainly measurable/measured at large distances. this means that we may employ the operators in coordinate representation and accept only such models where the metric operator remains asymptotically proportional to the delta function, $\langle x|\theta|x'\rangle \sim \delta(x - x')$ at $|x| \gg 1$ and $|x'| \gg 1$. a few concrete models of this type were described in refs. [17, 19] using minimally nonlocal, “smeared” point interactions of various types (which were, in the latter case, multi-centered). the use of nonperturbative discretization techniques rendered possible the construction of the (incidentally, unique) metric θ compatible with the required asymptotic locality. the resulting physical picture of scattering was unitary and fully compatible with our intuitive expectations. in our last paper [20] the scope of the theory has further been extended to the generalized scattering models, where the matrix elements $\langle x|\theta|x'\rangle$ of the metric were allowed to be operator-valued.
a slightly different approach to scattering has been initiated in paper [21], where we studied the analytic and “realistic” coulombic cryptohermitian potentials defined along u-shaped complex trajectories circumventing the origin in the complex x plane from below. unfortunately, this model was unstable with respect to perturbations. a few years later we clarified, in paper [22], that a very convenient stabilization of the model may be based on a minus-sign choice of the bare mass in the schrödinger equation. very soon afterwards we also revealed that the scattering by the amended hamiltonian is unitary [23]. the transmission and reflection coefficients were evaluated in closed analytic form, exhibiting the coincidence of the bound-state energies with the poles of the transmission coefficients. thus, after a moderate modification, a number of observations forming the analytic theory of the s-matrix has been found transferrable to the cryptohermitian quantum theory.
5 conclusions
one of the paradoxes characterizing quantum theory may be seen in the contrast between its stable status in experiments (where, typically, its first principles are appreciated as unexpectedly robust [24]) and its fragile status in the mathematical context, where virtually all of its rigorous formulations are steadily being found, for this or that reason, not entirely satisfactory [25]. in fact, at least a part of this apparent conflict is just a pseudoconflict. its roots can be traced back to various purely conceptual misunderstandings. in our present review we emphasized that within the comparatively narrow framework of quantum theory using cryptohermitian representations of observables the majority of these misunderstandings can be clarified, mostly via a careful use of an adequate notation. the core of our present message can be seen in the unified outline of the resolution of the internet-mediated debate (cf. [3] for references) in which the admissibility and consistent tractability of the manifestly time-dependent cryptohermitian observables has been questioned. it is now clear that the reduction of the scope of the theory to the mere quasistationary systems as proposed by mostafazadeh [15] is unfounded. this bound-state-related message can be seen accompanied by the clarification of a return to unitarity in the models of scattering mediated by cryptohermitian interactions. the currently valid conclusion is that it makes sense to combine the complexification of the short-range interactions with our making them at least slightly nonlocal. we have seen that, in parallel, also the metric can be required to exhibit a certain limited degree of nonlocality.
new questions emerge in this context. this means that in spite of all the recent rapid progress the current intensive development of the cryptohermitian quantum theory is still fairly far from its completion.
appendix a: hilbert space in our present notation
in our review paper [3] we explained that one of the most natural formulations of the abstract quantum theory should follow the ideas of scholtz et al [1] by constructing the three parallel representatives of any given wave function living in the three separate hilbert spaces. we argued that the use of the three-hilbert-space (3hs) formulation of quantum theory seems best capable of clarifying a few paradoxes emerging in connection with the concept of hermiticity and encountered in the recent literature. we emphasized in [3] that many quantum hamiltonians with real spectra, characterized by their authors as manifestly nonhermitian, should and must be re-classified as hermitian. in this sense we fully accepted the dictum of standard textbooks on quantum theory and complemented the corresponding postulates just by a few explanatory comments. in a brief summary of this argument let us recall that the states ψ of a (say, one-dimensional) quantum system are often assumed represented by normalized elements of the simplest physical and computation-friendly concrete hilbert space l2(r). this is already just a specific assumption with restrictive consequences. thus, in a more ambitious picture of a general quantum system each state ψ should only be perceived as an element |ψ⟩ of an abstract vector space v. the equally abstract dual vector space v′ of linear functionals over v may be bigger, v′ ⊃ v. in the most common selfdual case with v′ = v one speaks about the hilbert space h(f) := (v, v′) where the superscript (f) stands, say, for (user-)friendly or “feasible”. in many standard formulations of the first principles of quantum theory the well known dirac bra-ket notation is used, with |ψ⟩ ∈ v and ⟨ψ| ∈ v′ for a fixed or “favored” hilbert space h(f). at the same time, this choice of the notation does not exclude a transition (say, ω) to some other vector and hilbert spaces denoting, e.g., ω|ψ⟩ := |ψ≻ ∈ w and using here the slightly deformed, spiked ket symbols [3].
appendix b: dyson mapping ω as a nonunitary generalization of the fourier transformation f
in the context of nuclear physics the use of the single, favored hilbert space h(f) is rather restrictive. for example, in the context of the so called interacting boson model and in the way inspired by the well known advantages of the use of the usual unitary fourier transformation $f = [f^\dagger]^{-1}$, nuclear physicists discovered that their constructive purposes may be much better served by a suitable generalized, manifestly non-unitary (often called dyson) invertible mapping ω. more details may be found in paper [1], where the operators ω were described as mediating the transition from a friendly bosonic vector space v into another, fermionic and “physical” vector space w. the deepened mathematical differences between “bosonic” (i.e., simpler) v and fermionic (i.e., complicated, computationally much less accessible) w weaken the parallelism between ω and f, since the latter operator merely switches between the so called coordinate- and momentum-representations of ψs lying in the same hilbert space l2(r). this encouraged us to propose, in [3], visual identification of the bras and kets in one-to-one correspondence to the space in which they live, with |ψ⟩ ∈ v while ω|ψ⟩ := |ψ≻ ∈ w. for duals (i.e., bra-vectors) we recommended the same notation, with ⟨ψ| ∈ v′ while ⟨ψ|ω† := ≺ψ| ∈ w′.
appendix c: the connection between dyson map ω and metric θ
in the notation of appendix b one represents the same state ψ in two non-equivalent hilbert spaces, viz., in the friendly f-space h(f) := (v, v′) and in the physical p-space h(p) := (w, w′) (characterized by the “spiked” kets and bras). the latter space is, by construction, manifestly non-equivalent to the former one since, by definition, we have, for overlaps, $\prec\psi_a|\psi_b\succ = \langle\psi_a|\omega^\dagger\omega|\psi_b\rangle \neq \langle\psi_a|\psi_b\rangle$.
fig. 2: the same physics is predicted in h(p) and in h(s) while, presumably, the calculations are all performed in h(f) (scheme: the initial, auxiliary, unphysical f-space (filial role) is mapped by the dyson map onto the usual, inaccessible p-space (paternal role), and by the change of metric onto the final, constructive, predictive s-space (spiritual role); h(s) and h(p) are related by (unitary) equivalence)
According to our review [3], the demonstration of unitary non-equivalence between H(F) and H(P) can easily be converted into a proof of unitary equivalence between H(P) and another, third, standardized Hilbert space H(S) := (V, V″). Indeed, we are free to introduce a redefined vector space of linear functionals V″ such that the equivalence will be achieved. For the latter purpose it is sufficient to introduce the special duals ⟨⟨ψ| ∈ V″, denoted by the new, "brabra" Dirac-inspired symbol. In terms of a given Dyson operator Ω we may define these brabras, for the sake of definiteness, by the formula ⟨⟨ψ_a| = ⟨ψ_a|Θ of Ref. [6], where we abbreviated Θ = Ω†Ω. In [1] the new operator Θ has been called the metric. It defines the inner products in the "second auxiliary" (i.e., in its nuclear-physics exemplification, second bosonic) Hilbert space H(S), which is, by construction, unitarily equivalent to the original physical H(P). The whole 3HS scheme is given a compact graphical form in Figure 2.

Acknowledgement

This work has been supported by MŠMT Doppler Institute project LC06002, by Institutional Research Plan AV0Z10480505 and by GAČR grant 202/07/1307.

References

[1] Scholtz, F. G., Geyer, H. B., Hahne, F. J. W.: Quasi-Hermitian operators in quantum mechanics and the variational principle, Ann. Phys., Vol. 213 (1992), p. 74–101.

[2] Dorey, P., Dunning, C., Tateo, R.: The ODE/IM correspondence, J. Phys. A: Math. Theor., Vol. 40 (2007), p. R205–R283 (arXiv:hep-th/0703066); Bender, C. M.: Making sense of non-Hermitian Hamiltonians, Rep. Prog. Phys., Vol. 70 (2007), p. 947–1018 (arXiv:hep-th/0703096); Mostafazadeh, A.: Pseudo-Hermitian quantum mechanics, arXiv:0810.5643.

[3] Znojil, M.: Three-Hilbert-space formulation of quantum mechanics, SIGMA, Vol. 5 (2009), 001, 19 pages (arXiv:0901.0700).

[4] Znojil, M.: PT-symmetric quantum chain models, Acta Polytechnica, Vol. 47 (2007), p. 9–14.

[5] Znojil, M.: PT-symmetric quantum toboggans, Phys. Lett. A, Vol. 342 (2005), p. 36–47.

[6] Znojil, M.: On the role of normalization factors and pseudometric in crypto-Hermitian quantum models, SIGMA, Vol. 4 (2008), 001, 9 pages (arXiv:0710.4432).

[7] Znojil, M.: Identification of observables in quantum toboggans, J. Phys. A: Math. Theor., Vol. 41 (2008), 215304 (arXiv:0803.0403); Znojil, M., Geyer, H. B.: Sturm-Schroedinger equations: formula for metric, Pramana J. Phys., Vol. 73 (2009), p. 299–306 (arXiv:0904.2293).

[8] Znojil, M.: Quantum toboggans with two branch points, Phys. Lett. A, Vol. 372 (2008), p. 584–590 (arXiv:0708.0087).

[9] Novotný, J.: http://demonstrations.wolfram.com/thequantumtobogganicpaths

[10] Znojil, M.: Quantum toboggans: models exhibiting a multisheeted PT symmetry, J. Phys.: Conference Series, Vol. 128 (2008), 012046.

[11] Znojil, M.: Non-Hermitian SUSY and singular, PT-symmetrized oscillators, J. Phys. A: Math. Gen., Vol. 35 (2002), p. 2341–2352 (arXiv:hep-th/0201056).

[12] Znojil, M., Jakubský, V.: Supersymmetric quantum mechanics living on topologically nontrivial Riemann surfaces, Pramana J. Phys., Vol. 73 (2009), p. 397–404.

[13] Znojil, M.: Quantum knots, Phys. Lett. A, Vol. 372 (2008), p. 3591–3596.

[14] Znojil, M.: Time-dependent version of cryptohermitian quantum theory, Phys. Rev. D, Vol. 78 (2008), 085003 (arXiv:0809.2874).

[15] Mostafazadeh, A.: Time-dependent pseudo-Hermitian Hamiltonians defining a unitary quantum system and uniqueness of the metric operator, Phys. Lett. B, Vol. 650 (2007), p. 208–212.
[16] Jones, H. F.: Scattering from localized non-Hermitian potentials, Phys. Rev. D, Vol. 76 (2007), 125003.

[17] Znojil, M.: Scattering theory with localized non-Hermiticities, Phys. Rev. D, Vol. 78 (2008), 025026 (arXiv:0805.2800).

[18] Znojil, M.: Fundamental length in quantum theories with PT-symmetric Hamiltonians, Phys. Rev. D, Vol. 80 (2009), 045022 (13 pages, arXiv:0907.2677).

[19] Znojil, M.: Discrete PT-symmetric models of scattering, J. Phys. A: Math. Theor., Vol. 41 (2008), 292002 (arXiv:0806.2019); Znojil, M.: Scattering theory using smeared non-Hermitian potentials, Phys. Rev. D, Vol. 80 (2009), 045009 (12 pages, arXiv:0903.1007).

[20] Znojil, M.: Cryptohermitian picture of scattering using quasilocal metric operators, SIGMA, Vol. 5 (2009), 085 (21 pages, arXiv:0908.4045).

[21] Znojil, M., Lévai, G.: The Coulomb – harmonic oscillator correspondence in PT symmetric quantum mechanics, Phys. Lett. A, Vol. 271 (2000), p. 327–333.

[22] Znojil, M., Siegl, P., Lévai, G.: Asymptotically vanishing PT-symmetric potentials and negative-mass Schroedinger equations, Phys. Lett. A, Vol. 373 (2009), p. 1921–1924.

[23] Lévai, G., Siegl, P., Znojil, M.: Scattering in the PT-symmetric Coulomb potential, J. Phys. A: Math. Theor., Vol. 42 (2009), 295201 (9 pp, arXiv:0906.2092).

[24] Styer, D. F. et al.: Nine formulations of quantum mechanics, Amer. J. Phys., Vol. 70 (2002), p. 288–297.

[25] Strocchi, F.: An Introduction to the Mathematical Structure of Quantum Mechanics, Singapore: World Scientific, 2008 (2nd edition).

[26] Rosinger, E. E.: Deficient mathematical models of quantum theory, arXiv:quant-ph/0510138, and Mathematics and the trouble with physics, how deep we have to go?, arXiv:0707.1163.

Miloslav Znojil, DrSc.
Phone: +420 266 173 286
E-mail: znojil@ujf.cas.cz
Nuclear Physics Institute
Academy of Science of the Czech Republic
250 68 Řež, Czech Republic

Acta Polytechnica Vol. 50 No. 6/2010

Microstructure of the transitional area of the connection of a high-temperature Ni-based brazing alloy and stainless steel AISI 321 (X6CrNiTi 18–10)

R. Augustin, R. Koleňák

Abstract

This paper presents a detailed examination of the structure of the transitional area between a brazing alloy and the parent material, the dimensions of the diffusion zones that are created, and the influence on them of a change in the brazing parameters. Connections between Ni-based brazing alloys (Ni 102) with a small content of B and AISI 321 stainless steel (X6CrNiTi 18–10) were created in a vacuum (10⁻² Pa) at various brazing temperatures and for various holding times at the brazing temperature. Various specimens were tested. First, the brazing alloys were wetted and the dependence of the wetting on the brazing parameters was assessed. Then a chemical microanalysis was made of the interface between the brazing alloy and the parent material. The individual diffusion zones were identified on pictures from a light microscope and REM, and their dimensions, together with their dependence on the brazing parameters, were determined.

Keywords: Ni 102 brazing alloy, AISI 321, wetting, chemical microanalysis, diffusion zones.

1 Introduction

High-temperature brazing of stainless steels is a specific method for connecting materials that can create high-quality, demountable connections without local thermal impact on the connected material.
High-temperature brazing can be applied where it is not desirable to use welding, because of the thermal impact or because of the complexity of the connections that are to be created [1]. Nickel-based brazing alloys are the most suitable for high-temperature brazing of stainless steel. When parts made of stainless steel are connected by a high-temperature nickel brazing alloy, the metallurgical connection has characteristics similar to those of a welded connection. However, in contrast to welding, there is no melting of the parent material (PM), due to the significantly lower melting point of the brazing alloys [2].

A high-vacuum atmosphere in high-temperature brazing prevents the brazing alloy and the PM interacting with the ambient atmosphere, so it is not necessary to use fluxes with this method. At the same time, the vacuum atmosphere has no effect on the physical characteristics of the PM. Brazing in a vacuum complies with the latest world trends in machine technology, and is therefore also the most favoured high-temperature brazing method [3].

High-temperature brazing of stainless steels with nickel-based brazing alloys in a vacuum has already been used for a long time in practical applications. It is therefore desirable to know the influence of the basic brazing parameters on the characteristics of the brazing alloy and the connection that is formed. It is then possible to optimize the initial parameters in such a way that the connections meet the requirements made on them [4].

2 Materials and methods

Austenitic stainless steel AISI 321 (X6CrNiTi 18–10, DIN 1.4541) was selected for the purposes of the experiment. High-alloyed austenitic steels cannot be refined, so they can also be brazed with slow cooling. The exact chemical composition of the 17246 steel (the ČSN designation of AISI 321) is shown in Table 1 [5]. Two brazing alloys of similar chemical composition containing B were selected from the series of high-temperature nickel-based brazing alloys, see Tab. 2. According to the STN EN 1044 standard, they both fall under code Ni 102.

Table 1: Chemical composition of AISI 321

  Fe        Cr        Ni       Mn       Si       C        W        Ti       Mo
  67.00 %   19.30 %   8.12 %   1.27 %   0.41 %   0.02 %   0.63 %   0.36 %   0.06 %

Table 2: Chemical composition of brazing alloy Ni 102

  Brazing alloy   Ni       Cr      Si      B       C        Fe
  Ni 102-01       83.5 %   6.5 %   4.5 %   3.0 %   0.05 %   4.0 %
  Ni 102-02       82.0 %   7.0 %   4.0 %   2.0 %   0.15 %   4.5 %

The brazing parameters shown in Tab. 3 were selected with reference to the objective of the paper, in order to be able to assess the influence of specific variables. The specimens were heat treated in a PZ 810 vacuum furnace with heating rate 26.88 °C/min and cooling rate 2.15 °C/min at 10⁻¹ to 10⁻² Pa.

Table 3: Experimental parameters

  Specimen no.   Brazing alloy   Brazing temperature [°C]   Brazing time [min]
  1              Ni 102-01       1200                       10
  1              Ni 102-02       1200                       10
  2              Ni 102-01       1050                       30
  2              Ni 102-02       1050                       30
  3              Ni 102-01       1100                       120
  3              Ni 102-02       1050                       120

The finished specimens were divided using a cutting disc, and after marking they were sealed with bakelite on a Buehler SimpliMet 1000 device. Then they were ground with grinding paper of 240, 600 and 1200 grit and polished with diamond pastes of 9, 6 and 3 μm grit on a semi-automatic Buehler Beta machine. Finally, the specimens were polished using colloidal silica 2 μm (MasterMet).
To make the structure visible, an etching agent was used (10 ml H₂SO₄, 10 ml HNO₃, 20 ml HF, and 50 ml H₂O). The etching time was about 20 seconds. Specimens prepared in this way could be scanned and analyzed by a Neophot 30 light microscope. When measuring the diffusion zones, 11 vertical parallel lines were marked out in each picture, and on each of them points defining the boundaries of the given zone were drawn in. The resulting values are the arithmetic mean of the 11 measurements. Line profiling and element distribution mapping were measured using an ARL SEMQ device.

3 Results

3.1 An evaluation of the wetting of the brazing alloys

First of all, the wetting was measured on the specimens. Angle α was determined on the pictures created by the light microscope, using graphics software. An example of these measurements is shown in Figs. 1, 2 and 3, where the changes in this angle as a result of different parameters can also be recognized. The results of the measurements are summarized in Tab. 4. In the figures, the annotations mark the tangent to the surface of the PM (σ_SL), the tangent to the surface of the brazing alloy (σ_LV), and the wetting angle α.

Fig. 1: Ni 102-02, α ∼ 3.62° (1200 °C/10 min.)

Fig. 2: Ni 102-02, α ∼ 5.63° (1050 °C/30 min.)

Fig. 3: Ni 102-02, α ∼ 1.82° (1050 °C/120 min.)

Table 4: Values of the wetting angles of the examined brazing alloys at the given parameters

  Brazing alloy   Holding time [min]   Brazing temperature [°C]   Contact angle of wetting [°]
  Ni 102-01       10                   1200                       1.54 ± 0.5
  Ni 102-02       10                   1200                       2.34 ± 0.5
  Ni 102-01       30                   1050                       3.40 ± 0.5
  Ni 102-02       30                   1050                       5.14 ± 0.5
  Ni 102-01       120                  1100                       1.45 ± 0.5
  Ni 102-02       120                  1050                       1.50 ± 0.5

3.2 Chemical microanalysis

A global chemical microanalysis was then performed on the specimens. The objective was to determine the proportion of the basic components of the examined materials in the mutual diffusion and the creation of the connection. The same diffusion reactions took place for all parameters, but they are most visible at brazing temperature 1100 °C and 120 min. holding time at this temperature, see Fig. 4 and Fig. 5.

Fig. 4: Cr concentration in the transitional area Ni 102 – AISI 321

The zone of Cr diffusion from the PM into the brazing alloy can be observed (white points left of the interface) in Fig. 4, where the interface is indicated. Places with the highest Cr content are white. Cr also diffuses along the grain boundaries from the brazing alloy into the PM (visible grain boundaries right of the interface). This is probably caused by B: Cr with B creates CrB₂ borides, and B diffuses very quickly from the brazing alloy into the PM along the grain boundaries.

Fig. 5 shows the concentration of Fe in the PM and in the transitional area. The zone of Fe diffusion from the PM into the brazing alloy, similarly as for Cr, can be observed. However, the dark areas along the grain boundaries of the PM indicate that the concentration of Fe in these places is significantly reduced, and this is also confirmed by linear microanalysis, see Fig. 6.

Fig. 5: Fe concentration in the transitional area Ni 102 – AISI 321

Fig. 6: Linear microanalysis of the transitional area Ni 102-01 – AISI 321 (1100 °C/120 min.)

The map of the elements Ni and Si shows no higher participation in the creation of the brazed connection at any parameters.

3.3 Measurements of the diffusion zones

The pictures from the light microscope show the separate diffusion zones in the interface between the brazing alloy and the PM, see Fig. 7.
Specifically: the PM solubility zone in the brazing alloy, the diffusion zone of the brazing alloy in the PM, the diffusion on the boundaries of the grains, and the zone with a visible change in the microstructure. By measuring these zones for various parameters, it was possible to assess their dependence on the parameters. The measurement results are shown in Tab. 5.

Fig. 7: Diffusion zones in the interface Ni 102 – AISI 321

The figures show examples of pictures from the light microscope on which the specific zones were measured. Fig. 8 shows that the solubility zone of the PM and the diffusion zone of the brazing alloy into the PM cannot be distinguished with the given parameters; they were therefore measured as one whole. Fig. 9 shows that these zones are clearly distinguishable after the parameters have been changed.

Fig. 8: Interface of Ni 102-01 and AISI 321 (1200 °C/10 min.)

Fig. 9: Interface of Ni 102-01 and AISI 321 (1050 °C/30 min.)

With a longer holding time at brazing temperature, the PM solubility zone and the diffusion zone of the brazing alloy in the PM can be clearly distinguished, even when the brazing temperature is 100 °C lower, and a significant growth in the width of the specific zones can also be seen, see Fig. 10.

Table 5: Recorded dimensions of diffusion zones with various parameters (at 1200 °C/10 min. the first two zones could not be distinguished and were measured as one whole)

  Brazing alloy   Holding time [min]   Temperature [°C]   PM solubility [μm]   Diffusion into PM [μm]   Grain-boundary diffusion [μm]   Changed structure [μm]
  Ni 102-01       10                   1200               23 (combined)        –                        93                              38
  Ni 102-02       10                   1200               23 (combined)        –                        109                             49
  Ni 102-01       30                   1050               9                    16                       101                             50
  Ni 102-02       30                   1050               5                    18                       194                             72
  Ni 102-01       120                  1100               23                   50                       203                             78
  Ni 102-02       120                  1050               13                   24                       332                             132

Fig. 10: Interface of Ni 102-01 and AISI 321 (1100 °C/120 min.)

4 Discussion

Brazing alloys Ni 102 on stainless steel AISI 321 were distinguished by very good wetting at all brazing parameters. It is known that the angle of wetting decreases with rising temperature and holding time at brazing temperature. However, it was found that temperature has a much greater influence on wetting than the holding time at brazing temperature. It was also found that a slightly increased content of B in brazing alloy Ni 102-01, in comparison to Ni 102-02, caused better wetting of the surface of the PM.

A chemical microanalysis showed that the solubility zone of the PM in the brazing alloy, and also the interface between the brazing alloy and the PM, are created to a significant extent only by Fe and Cr, and the results indicate that the proportion of Ni and Si is significantly lower. Interstitial atoms of B diffuse most quickly into the PM along the grain boundaries, because they are not soluble in the solid solution of Ni. B creates the CrB₂ phase with Cr. These areas are characterized by a higher concentration of Cr; the concentration of Fe in these places is significantly reduced.

By analysing the diffusion zones, it was found that, with a short holding time, even at a higher brazing temperature it was not possible to distinguish the solubility zone of the PM into the brazing alloy from the diffusion zone of the brazing alloy into the PM. For the same brazing parameters, brazing alloy Ni 102-02 achieved deeper diffusion of the brazing alloy into the PM, whereas better penetration of the PM into the brazing alloy occurred with brazing alloy Ni 102-01. The influence of brazing temperature on the diffusion depth appeared to be approximately the same as the influence of the holding time at brazing temperature. Even a slight change in the brazing temperature has a big influence on the diffusion of the brazing alloy into the PM.
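The zone widths in Tab. 5 are, as described in Section 2, arithmetic means over the 11 vertical measurement lines per micrograph. A minimal sketch of that bookkeeping, with invented boundary coordinates standing in for real micrograph readings, might look as follows; reporting the spread alongside the mean would also indicate how uniform a zone is.

```python
import numpy as np

# Hypothetical boundary positions (in micrometres) of one diffusion zone,
# read off the 11 vertical lines marked on a single micrograph.
inner = np.array([102, 99, 105, 101, 98, 103, 100, 104, 102, 97, 101])
outer = np.array([125, 121, 130, 124, 119, 127, 123, 129, 126, 118, 124])

widths = outer - inner          # zone width along each measurement line
print(f"zone width: {widths.mean():.1f} +/- {widths.std(ddof=1):.1f} um")
```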
5 Conclusion

From the measured values of wetting shown in Tab. 4, it can be stated that:

• The wetting of both brazing alloys, Ni 102-01 and Ni 102-02, is excellent, or even perfect, for all parameters.
• Although the two brazing alloys fall under code Ni 102 according to their chemical composition, they have different wetting values.
• The better wetting of brazing alloy Ni 102-01 is probably due to its higher content of B.
• The angle of wetting becomes smaller with rising brazing temperature and longer holding time at this temperature; the proportion is therefore inverse.
• According to the measured values, the influence of the brazing temperature on the wetting is more distinct than the effect of the holding time at brazing temperature.

As a result of the chemical microanalysis of the specimens:

• The zone of solubility of the PM in the brazing alloy, and the interface between the brazing alloy and the PM, are created to a significant extent only by Fe and Cr. The results show that the proportion of Ni and Si is significantly lower.
• Interstitial atoms of B diffuse most quickly into the PM along the grain boundaries, because they are not soluble in the solid solution of Ni, where they form a phase with Cr. These areas are characterized by a higher concentration of Cr; the concentration of Fe in these places is significantly reduced.

On the basis of the measured diffusion zone values shown in Tab. 5, the following conclusions can be drawn:

• With a holding time of 10 minutes at brazing temperature 1200 °C, it was not possible to distinguish the solubility zone of the PM into the brazing alloy from the diffusion zone of the brazing alloy into the PM on the interface of either of the alloys.
• Using the same brazing parameters, brazing alloy Ni 102-02 achieved deeper diffusion of the brazing alloy into the PM and, conversely, there was better penetration of the PM into the brazing alloy in the case of brazing alloy Ni 102-01.
• The influence of brazing temperature on the diffusion depth is approximately the same as the influence of the holding time at brazing temperature.
• A change or drop in brazing temperature, even a relatively small one (50 °C), has a great influence on the diffusion of the brazing alloy into the PM.

Acknowledgement

This paper was prepared with support from the VEGA 1/0381/08 project, Research of the influence of physical metallurgical aspects of high-temperature brazing on the structure of connections of metal and ceramic materials, and from the APVT 20-010804 project, Development of lead-free soft active solder and research of material solderability of metal and ceramic materials with the use of ultrasonic activation.

References

[1] Available on: http://www.pva-lwt-gmbh.de/englisch/sites/te vakuum verfah.php

[2] Ruža, V., Koleňák, R., Jasenák, J.: Spájkovanie vo vákuu. Trnava, SZS, 2005.

[3] Available on: http://www.aws.org/w/a/wj/2004/10/030/index.html

[4] Available on: http://cdsweb.cern.ch/record/1151308?ln=sk

[5] Ruža, V., Koleňák, R.: Spájkovanie materiálov. Bratislava, STU, 2007.

Ing. Robert Augustin
Doc. Ing. Roman Koleňák, PhD.
Phone: +421 949 358 111
Faculty of Materials Science and Technology in Trnava
Pavilon T, Bottova 23, 917 24 Trnava, Slovak Republic

Acta Polytechnica Vol. 51 No. 1/2011

Infinitesimal algebraic skeletons for a (2+1)-dimensional Toda type system

M. Palese, E. Winterroth
Abstract

A tower for a (2+1)-dimensional Toda type system is constructed in terms of a series expansion of operators which can be interpreted as generalized Bessel coefficients; the result is formulated as an analog of the Baker–Campbell–Hausdorff formula. We tackle the problem of the construction of infinitesimal algebraic skeletons for such a tower and discuss some open problems arising along our approach.

Keywords: Toda type system, integrability, infinitesimal skeleton, tower, Cartan connection.

1 Introduction

Nonlinear models, and in particular Toda type systems, play a role in a variety of physical phenomena. As is well known, the problem of their integrability is far from being trivial. It is nowadays well recognized that the algebraic properties of nonlinear systems are relevant from the point of view of integrability. A huge scientific production within this topic has developed in both discrete and continuous, as well as classical and quantistic, models. It is nevertheless important not to forget the origin of this interest: for a nonlinear system, it lies in the concept of integrability as of having 'enough' conservation laws to exhaustively describe the dynamics (an idea which originates in the inverse of the Noether theorem II in the calculus of variations). Historically, the algebraic-geometric approach is based on the requirement for the existence of conservation laws, which leads to the existence of symmetries (in terms of algebraic structures).

In this light, Wahlquist and Estabrook [15, 5] proposed a technique for systematically deriving what they called a 'prolongation structure' in terms of a set of 'pseudopotentials' related with the existence of an infinite set of associated conservation laws, and they also conjectured that the structure was 'open', i.e. not a set of structure relations of a finite-dimensional Lie group. Since then, 'open' Lie algebras have been extensively studied in order to distinguish them from freely generated infinite-dimensional Lie algebras. In their approach, conservation laws are written in terms of 'prolongation' forms, and integrability is intended as an integrability condition for a 'prolonged' differential ideal. Attempting a description of symmetries in terms of Lie algebras implies the appearance of a homogeneous space, and thus the interpretation of prolongation forms as Cartan–Ehresmann connections.

It should be stressed that the unknowns are both conservation laws and symmetries, and it is clear that the main point in this is how to realize the form of the conservation laws and thus the explicit expression of the prolongation forms. Different formulations of the prolongation ideal bring both different algebraic structures (symmetries) and corresponding conservation laws: of course, the structure with which prolongation forms are postulated can produce Lie algebras or more general algebraic structures. We use the algebraic properties of Toda type systems as a 'laboratory' to explicate an algebraic-geometric interpretation of the abovementioned 'prolongation' procedure in terms of towers with infinitesimal algebraic skeletons [9].

Consider the (2+1)-dimensional system, a continuous (or long-wave) approximation of a spatially two-dimensional Toda lattice [14]:

u_xx + u_yy + (e^u)_zz = 0 , (1)

where u = u(x, y, z) is a real field, x, y, z are real local coordinates (if we want, z playing the rôle of a 'time') and the subscripts denote partial derivatives.
It can be seen as the limit for γ → ∞ of the more general model u_xx + u_yy + [(1 + u/γ)^γ − 1]_zz = 0, covering (for γ different from 0, 1) various continuous approximations of lattice models, among them the Fermi–Pasta–Ulam model (γ = 3). It appears in differential geometry: Kaehler metrics [8]; in mathematical and theoretical physics (see, e.g., Newman and Penrose as well as [12]); in the theory of Hamiltonian systems; in general relativity: heavenly spaces (real, self-dual, Euclidean Einstein spaces with one rotational Killing symmetry, [12, 4]); in the large N limit of the sl(N) Toda lattice [11] (from the constrained Wess–Zumino–Novikov–Witten model): extended conformal symmetries (2D CFT) and reductions of the four-dimensional theory of gravitational instantons; in string theory and statistical mechanics. It can be seen as the particular case with d = 1 of the so-called 2d-dimensional Toda-type systems [13], obtained from a 'continuum Lie algebra' by means of a zero curvature representation, u_{ww̄} = K(e^u) (in our particular case w = x + iy and K is the differential operator K = ∂²/∂z²). In particular, it has been studied in the context of symmetry reductions [1, 6], and a (1+1)-dimensional version in the context of prolongation structures, which only partially led to results [2].

(This work was supported by the University of Torino through the 2009 research project 'Conservation laws in classical and quantum gravity'.)

2 Towers with skeletons for Toda type systems

The notion of an (infinitesimal) algebraic skeleton is an abstraction of some algebraic aspects of homogeneous spaces. Let then V denote a finite-dimensional vector space. An algebraic skeleton on V is a triple (E, G, ρ), with G a (possibly infinite-dimensional) Lie group, E = V ⊕ g, g the Lie algebra of G, and ρ a representation of G on E (infinitesimally of g on E) such that ρ(g)X = Ad(g)X, for g ∈ G, X ∈ g.

Let Z be a manifold of type V (i.e., ∀z ∈ Z, T_zZ ≃ V). We say that a principal fibre bundle P(Z, G) provided with an absolute parallelism ω on P is a tower on Z with skeleton (E, G, ρ) if ω takes values in E and satisfies: R_g*ω = ρ(g)⁻¹ω, for g ∈ G; ω(Ã) = A, for A ∈ g. Here R_g denotes the right translation and Ã the fundamental vector field induced on P from A. In general, the absolute parallelism does not define a Lie algebra homomorphism.

Let g be a Lie algebra and k be a Lie subalgebra of g. Let K be a Lie group with Lie algebra k and let P(Z, K) be a principal fibre bundle with structure group K over a manifold Z, as above. A Cartan connection in P of type (g, K) is a 1-form ω on P with values in g satisfying the following conditions:

– ω|T_uP : T_uP → g is an isomorphism ∀u ∈ P;
– R_g*ω = Ad(g)⁻¹ω for g ∈ K;
– ω(Ã) = A for A ∈ k.

A Cartan connection (P, Z, K, ω) of type (g, K) is a tower on Z. Remark that since, a priori, the prolongation algebra does not close into a Lie algebra, the starting point for the prolongation procedure is only a tower with an absolute parallelism, and not a Cartan connection. Thus, in principle, Estabrook–Wahlquist prolongation forms are absolute parallelism forms. The corresponding open Lie algebra structure can be provided with the structure of an infinitesimal algebraic skeleton on a suitable space. First we have to prove that a finite-dimensional space V and a Lie algebra g exist satisfying the definition of a skeleton, i.e. in particular that a suitable representation ρ can be defined.
The representation is obtained by means of an integrability condition for the absolute parallelism of a tower on a manifold Z (of type V), with skeleton (E, V, g). Note that if E has in addition the structure of a Lie algebra, this is exactly a Cartan connection of type (E, g); in fact, the spectral linear problem is nothing but the construction of a Cartan connection from this absolute parallelism.

As an example, let us now introduce on a manifold with local coordinates (x, y, z, u, p, q, r) the closed differential ideal defined by the set of 3-forms:

θ_1 = du ∧ dx ∧ dy − r dx ∧ dy ∧ dz ,
θ_2 = du ∧ dy ∧ dz − p dx ∧ dy ∧ dz ,
θ_3 = du ∧ dx ∧ dz + q dx ∧ dy ∧ dz ,
θ_4 = dp ∧ dy ∧ dz − dq ∧ dx ∧ dz + e^u dr ∧ dx ∧ dy + e^u r² dx ∧ dy ∧ dz .

It is easy to verify that on every integral submanifold defined by u = u(x, y, z), p = u_x, q = u_y, r = u_z, with dx ∧ dy ∧ dz ≠ 0, the above ideal is equivalent to the Toda system under study. In terms of absolute parallelism forms, the 2-forms generating the associated conservation laws can be defined as follows:

ω^k = H^k(u, u_x, u_y, u_z; ξ^m) dx ∧ dy + F^k(u, u_x, u_y, u_z; ξ^m) dx ∧ dz + G^k(u, u_x, u_y, u_z; ξ^m) dy ∧ dz + A^k_m dξ^m ∧ dx + B^k_m dξ^m ∧ dz + dξ^k ∧ dy ,

where ξ = {ξ^m}, k, m = 1, 2, . . . , N (N arbitrary), the ξ^m are the pseudopotentials (coordinates in the space V), H^k, F^k and G^k are functions to be determined, and A^k_m and B^k_m denote the elements of two N × N constant regular matrices. In fact, we remark that ω^k = θ^k_m ∧ ω^m, where θ^k_m = −Ā^k_m dx − B̄^k_m dy − C̄^k_m dz, and the absolute parallelism forms are given by¹

ω^m = dξ̄^m + F̄^m dx + Ḡ^m dy + H̄^m dz .

The integrability condition for the ideal generated by the forms θ_j and ω^k finally yields H^k = e^u u_z L^k(ξ^m) + P^k(u, ξ^m), F^k = −u_y L^k(ξ^m) + N^k(ξ^m), G^k = u_x L^k(ξ^m) + M^k(u, ξ^m), where L^k, P^k, N^k, M^k are functions of integration. As a consequence, the desired representation for the skeleton is provided by the following equations (we omit the indices for simplicity):

P_u = e^u [L, M] , M_u = −[L, P] , [M, P] = 0 . (2)

We will consider L, P, M as regular operators, so that the Lie brackets can be interpreted as commutators.

¹ F^k = C̄^k_m F̄^m − Ā^k_m H̄^m, G^k = C̄^k_m Ḡ^m − B̄^k_m H̄^m, H^k = B̄^k_m F̄^m − Ā^k_m Ḡ^m, ξ^k = C̄^k_m ξ̄^m.
these expansions together with [m, p ] = 0 provide the desired representation and at the same time define a tower with absolute parallelism. themainproblemwith this tower (which is somehow the most general one) is that it is a non trivial task to characterize explicitly its algebraic skeleton by means of the representation provided by the relations [m, p ] = 0. on the other hand, it is well known that the toda equation can be solved by the inverse scattering transform [7]. however, the associated linear spectral problem was never derived from an infinitesimal algebraic skeleton and in particular as the construction of a cartan connection from a tower with algebraic skeleton; thus it would be important to derive both thetoda systemand related spectral problem(s) (i.e. conservation laws and symmetries) starting from a tower with an algebraic skeleton. in this perspective, particular solutions of the correspondingestabrook-wahlquistprolongation problem can assume a relevant role: they correspond to particular choices for the absolute parallelism and can provide us explicit representations of the prolongation skeleton. 2.1 skeletons if we look for operators p(u, ξ) and m(u, ξ) depending on u only through the exponential function, i.e. p(u, ξ) = eup̄(ξ), m(u, ξ) = m(eu, ξ), the prolongation equations can now be written as: pu = eu[l, m] = ∂p ∂eu eu, mu = −[l, p ] = ∂m ∂eu eu; on the other hand, we have ∂p ∂eu = p̄(ξ) = [l(ξ), m(eu;ξ)], ∂m ∂eu = −[l(ξ), p̄(ξ)]. from the second equation, we get m(eu;ξ)= −eu[l(ξ), p̄(ξ)]+ m̄(ξ) and thus p̄(ξ)= −eu[l(ξ), [l(ξ), p̄(ξ)]]+ [l(ξ), m̄(ξ)]. we see then that we are able to obtain commutation relations: p̄(ξ) = [l(ξ), m̄(ξ)], [l(ξ), [l(ξ), p̄(ξ)]] = 0. there are additional relations determined by the third prolongation equation [−eu[l(ξ), p̄(ξ)] + m̄(ξ), eup̄(ξ)] = 0, so that we have [[l, p̄], p̄ ] = 0, [m̄ , p̄ ] = 0. for the sake of convenience we put l = x1, m̄ = x2, p̄ = x3, [x1, x3] = x4 and we then have the following prolongation closed lie algebra: [x1, x2] = x3 , [x1, x3] = x4 [x1, x4] = [x2, x3] = [x2, x4] = [x3, x4] = 0 , note that if x4 = μx2 we obtain a quotient lie algebra corresponding to the euclidean group in the plane and we get a cartan connection. suppose now that p(u, ξ) = lnup̄(ξ), m(u, ξ) = m(eu, ξ). we derive then pu = e u[l, m] = d(lnup̄(ξ)) deu eu = 1 u p̄(ξ), mu = ∂m ∂eu eu = −[l, p ] = −[l, lnup̄(ξ)]; so that ∂m ∂eu = − lnu eu [l, p̄(ξ)], from which we get m(eu, ξ) = −(lnu − 1)u[l(ξ), p̄(ξ)]+ m̄(ξ), and p(u, ξ) = ueu lnu[l, m]. from [p, m] = 0 we get, for u �= 0,1 (which are trivial solutions of the toda system), [[l, m], m] = 0; on the other hand substituting the above expression for m we get [[l, m̄], m̄] = 0 , 56 acta polytechnica vol. 51 no. 1/2011 [[l, [l, p̄]], m̄]+ [[l, m̄], [l, p̄]] = 0 , [[l, [l, p̄]], [l, p̄]] = 0 . by putting again for the sake of convenience l = x1, m̄ = x2, p̄ = x3, then we get the following infinitesimal algebraic skeletonwith the structure of an open lie algebra: [x1, x2] = x4 , [x1, x3] = x5 , [x4, x5] = [x2, x7] , [x3, x4] = [x2, x5] , [x1, x4] = x6 , [x1, x5] = x7 , [x2, x3] = x8 , [x1, x8] = [x2, x4] = [x2, x6] = [x3, x7] = 0 , . . . weobserve that by the homomorphism x4 = x5 =0 and x8 = νx3 we get a closed lie algebra (which is different from the lie algebra corresponding to the euclidean group in the plane obtained above): [x1, x2] = 0 , [x1, x3] = 0 , [x2, x3] = νx3 . 
which, by means of a suitable realization, can also provide us with a different cartan connection (thus a different spectral problem and different conservation laws); on the other hand, we can find a closed lie algebra bymeans of the followinghomomorphism x4 = x2 and x5 = x3, and then we have [x1, x2] = x2 , [x1, x3] = x3 , [x4, x5] = [x2, x3] = x8 =0 , [x3, x4] = [x3, x2] = −[x2, x3] = [x2, x5] = [x2, x3] , and we also deduce that x6 = x4 = x2, x7 = x3 and that [x1, x8] = [x2, x4] = [x2, x6] = [x3, x7] = 0 are all identically satisfied. it is easy to see that the two different cases above are both given by the homomorphism given by requiring x4 = λx2 and x5 = μx3. it is easy to realize that μ = −λ must old and there are the two cases λ =0 with x8 = νx3 giving the first case, and x8 = 0 with λ = 1 giving the second case, respectively. for any λ �= 0 we have a closed lie algebra depending on the parameter λ: [x1, x2] = λx2 , [x1, x3] = −λx3 , [x2, x3] = 0 . furthermore, by putting in the prolongation skeleton x4 = x2 and x5 = −x3 it is possible to realize the prolongation skeleton as akač-moodylie algebra of the type [hi, hj] = 0 , [hi, x+j] = κij x+j , [hi, x−j] = −κij x−j , [x+i, x−j] = δij hi , where we put [x2, x3] = x8, [x8, x2] = x9, [x8, x3] = x10, [x8, x9] = x11 [x8, x10] = x12 and {x1, x13, . . .} = hi, {x8, . . .} = hj, {x2, x9, x11, . . .} = x+i, {x3, x10, x12, . . .} = x−j. we also put [x8, x11] = x11, [x8, x12] = −x12, and so on. we then also have [x8, x13] = 0 and it is easy to realize that [x1, x9] = x9, [x1, x10] = −x10, [x1, x11] = x11, [x1, x12] = −x12, and so on; thus characterizing the cartan matrix κij. it would be of interest to study the relation of skeletons with generalization of continuum lie algebras to the case when the local algebra does not generate g(e;k, s) as a whole, where g(e;k, s) are saveliev’s continuum lie algebras and they are defined as follows. let e be a vector space parametrizing lie algebras gi, i = 0,+1, −1, ĝ ≡ g−1 ⊕ g0 ⊕ g+1, such that [x0(φ), x0(ψ)] = 0, [x+1(φ), x−1(ψ)] = x0(s(φ, ψ)), [x0(φ), x+1(ψ)] = x+1(k(φ, ψ)), [x0(φ), x−1(ψ)] = −x−1(k(φ, ψ)), with k, s bilinear maps e × e → e satisfying conditions equivalent to the jacobi identity. take g ′(e;k, s) as the lie algebra freely generated by a local part ĝ and then the quotient g(e;k, s) = g ′(e;k, s)/j, j the largest homogeneous ideal having a trivial intersection with g0. in fact, such an algebrabecomes thekač-moodyalgebra abovewhen e = cn, k = cartan matrix k, s = i. the relation with the virasoro algebra without a central charge could be also considered in this light. this topic will be the object of further investigations. references [1] alfinito, e., soliani, g., solombrino, l.: the symmetry structure of the heavenly equation, lett. math. phys. 41, 379–389 (1997). [2] alfinito, e., causo, m. s., profilo, g., soliani, g.: a class of nonlinear wave equations containing the continuous toda case, j. phys. a 31 (9) (1998) 2173–2189. [3] baker, h. f.: alternant and continuous group, proc. london math. soc. (2), 3, 24–47 (1904); campbell, j. e.: on a law of combination of operators, proc. london math. soc. 29, 14–32 (1898); hausdorff, f.: the symbolic exponential formula in group theory, ber. verh.sächs. gess. wiss. leipzig. math.-phys. kl. 58, 19–48 (1906). [4] boyer, c., finley, j. d.: killing vectors in selfdual, euclidean einstein spaces, j. math. phys. 23, 1126–1128 (1982). [5] estabrook, f. b., wahlquist, h. d.: prolongation structures of nonlinear evolution equations. ii, j. math. 
phys. 17, 1293–1297 (1976). 57 acta polytechnica vol. 51 no. 1/2011 [6] grassi, v., leo, r. a., soliani, g., solombrino, l.: continuous approximation of binomial lattices, internat. j. modern phys. a 14 (15) (1999) 2357–2384. [7] kodama, y.: solutions of the dispersionless toda equation, phys. lett. a 147 (8–9) (1990) 477–482. [8] lebrun, c.: explicit self-dual metrics on cp2# . . .#cp2,, j. diff. geom. 34 (1) (1991) 223–253. [9] morimoto, t.: geometric structures on filteredmanifolds,hokkaidomath. j.22(3) (1993) 263–347. [10] palese, m., leo, r. a., soliani, g.: the prolongation problem for the heavenly equation; in recent developments in general relativity (bari, 1998) springer italia, milan (2000) 337–344. [11] park, q-han: extended conformal symmetries in real heavens, phys. lett. 236b, 429–432 (1990). [12] plebanski, j. f.: some solutions of complexeinstein equations, j. math. phys. 16, 2395–2402 (1975). [13] saveliev, m. v.: integro-differential nonlinear equations and continual lie algebras, comm. math. phys. 121 (2)(1989) 283–290; saveliev,m. v.: on the integrability problem of a continuous toda system, theoret. and math. phys. (1992) 92 (3) 1024–1031 (1993); razumov, a. v., saveliev, m. v.: multidimensional systems of toda type, theoret. and math. phys. (1997) 112 (2) 999–1022 (1998). [14] toda,m.: theory of nonlinear lattices,springer series in solid-state sciences 20 springerverlag, berlin-new york (1981). [15] whalquist, h. d., estabrook, f. b.: prolongation structures of nonlinear evolution equations, j. math. phys. 16, 1–7 (1975). [16] watson, g. n.: a treatise on the theory of bessel functions, cambridge university press, cambridge (1952). marcella palese e-mail: marcella.palese@unito.it department of mathematics university of torino via c. alberto 10, 10123 torino, italy ekkehartwinterroth e-mail: ekkehart.winterroth@unito.it department of mathematics university of torino via c. alberto 10, 10123 torino, italy 58 ap-5-10.dvi acta polytechnica vol. 50 no. 5/2010 aharonov-bohm effect and the supersymmetry of identical anyons v. jakubský abstract we briefly review the relation between the aharonov-bohm effect and the dynamical realization of anyons. we show how the particular symmetries of theaharonov-bohmmodel give rise to the (nonlinear) supersymmetry of the two-body system of identical anyons. keywords: nonlinear supersymmetry, aharonov-bohm effect, anyons. 1 aharonov-bohm effect more than fifty years ago, aharonov and bohm argued in their seminal paper [1] that the fundamental quantity in a description of the quantum system is the electromagnetic potential and not the electromagnetic field. they proposed an experiment in which two beams of electrons are guided around a thin solenoid that is shielded completely from the electrons. despite the absence of the magnetic field outside the solenoid, the wave functions are affected by the non-vanishing electromagnetic potential and acquire an additional phase-factor which is manifested in the altered interference of the beams. the so-called aharonov-bohm (ab) effect has been observed experimentally [2] and has found its application in numerous fields of physics. in the present article, we will review its relation to anyons, twodimensional particles of exotic statistics. we will present the recent results on the nonlocal symmetries of theab systemand their relation to the supersymmetry of two-body anyon models. let us consider a spin−1/2particlewhich ismoving in a plane. 
the plane is punctured perpendicularly in the origin by an infinitely thin solenoid. the solenoid is impenetrable for the particle. hence, the origin is effectively removed fromthe spacewhere the particle lives. the pauli hamiltonian of the system acquires the following simple form 1 h = 1 2m ∑ j=1,2 p2j − eh̄ 2mc b3 σ3 , (1.1) where pj = −ih̄∂j − e c aj, b3 = ∂1a2 − ∂2a1. the non-vanishing electromagnetic potential in the symmetric gauge reads �a = φ 2π ( − x2 x21 + x 2 2 , x1 x21 + x 2 2 ,0 ) = (1.2) φ 2πr (−sinϕ , cosϕ ,0) , where x1 = rcosϕ, x2 = rsinϕ, −π < ϕ ≤ π, and φ is the flux of the singular magnetic field, b3 = φδ 2(x1, x2). as we will work mostly in polar coordinates, let us present the explicit form of the hamiltonian in this coordinate system hα = −∂2r − 1 r ∂r + 1 r2 (−i∂ϕ + α)2 + α 1 r δ(r)σ3,(1.3) α = 1 2π φ . hereweused the identity δ2(x1, x2)= 1 πr δ(r) for the two dimensional dirac delta function2. to specify the system uniquely, we have to determine the domain of thehamiltonian. we require the operator (1.3) toacton2π-periodic functionsψ(r, ϕ), i.e. ψ(r, ϕ +2π) = ψ(r, ϕ). using the expansion in partial waves, we can write ψ(r, ϕ)= ∑ j eijϕfj(r). (1.4) the functions fj(r) should be locally squareintegrable (i.e. fj(r) should be square integrable on any finite interval). the partial waves fj(r) are regular at the originup to the exception specifiedby the following boundary condition lim r→0+ ψ ∼ ( (1+ eiγ)2−αγ(1 − α)r−1+αe−iϕ (1 − eiγ)2−1+αγ(α)r−α ) (1.5) where parameter γ can acquire two discrete values 0 and π. the boundary condition (1.5) is related to the self-adjoint extensions of the hamiltonian. let us note that the boundary condition (1.5) just fixes two self-adjoint extensions (one for γ =0, the second one for γ = π) of the formal operator hα that are 1we set m = 1/2, h̄ = c = −e = 1 from now on. 2in fact, thedirac delta term in the hamiltonian is quite formal. it can be omittedwhen the domain of hα is specified correctly. 46 acta polytechnica vol. 50 no. 5/2010 compatible with the existence of n = 2 supersymmetry, see [3]. to keep our presentation as simple as possible, we fix from now on γ =0. (1.6) wemodify theactualnotation to indicate thedomain of the hamiltonian, i.e. wewill write hα → h0α. for readers who are eager for a more extensive analysis of the problem we recommend [3] for reference. the hamiltonian h0α commutes with the angular momentum operator j = −i∂ϕ +α and the spin projection s3 = 1 2 σ3. hence, one can find the vectors |e, l, s〉 such that h0α|e, l, s〉 = e|e, l, s〉, j |e, l, s〉 = (l + α)|e, l, s〉, (1.7) s3|e, l, s〉 = s|e, l, s〉. we define the following two additional integrals of motion q = σ1p1 + σ2p2 = q+σ+ + q−σ− , q± = −ie∓iϕ ( ∂r ± 1 r (−i∂ϕ + α) ) , (1.8) σ± = 1 2 (σ1 ± iσ2) and q̃ = p11+ irσ3p2, rrr = r, (1.9) rϕr = ϕ + π, where 1 is a unit matrix and p1 + irp2 = q+π+ + q−π− , (1.10) π± = 1 2 (1 ± r) . we can make a qualitative analysis of how these operators act on the wave functions |e, l, s〉 just by observing their explicit form. for instance, we have q|e, l,1/2〉 ∼ |e, l +1, −1/2〉, (1.11) q̃|e,2l, s〉 ∼ |e,2l − 2s, s〉. hence, neither q nor q̃ commutes with the angular momentum j and the parity r. however, the operator q̃ preserves spin of the wave functions, i.e. [q̃, s3] = 0. the operators q and q̃ are related by nonlocal unitary transformation, see [4]. in addition, we can define w = q q̃ = q̃ q. (1.12) this operatoraltersboth theangularmomentumand the spin of the wave functions. 
The explicit action of Q, Q̃ and W on the kets |E, l, s⟩ is illustrated in Fig. 1.

Fig. 1: The action of the operator W (thick dotted arrows) on the states |E, 2l, 1/2⟩ and |E, 2l+1, 1/2⟩ as a sequential action of Q̃ (solid arrows) and Q (thin dotted arrows). Black squares represent the eigenstates |E, l, s⟩ with the corresponding values of l and s.

2 Anyons

Quantum theory has classified particles into two disjoint families: there are bosons with integer spin and fermions with half-integer spin. The wave functions of indistinguishable bosons or fermions reflect the specific statistical properties of the particles. When we exchange two bosons, the wave function remains the same. When we exchange two fermions, the corresponding wave function changes the sign. The wave functions respect either Bose-Einstein or Fermi-Dirac statistics in this way. However, when one makes a quantum system two-dimensional, there emerges an alternative to the classification. As predicted by Wilczek [5], there can exist exotic particles in two-dimensional space that are called anyons. Anyons interpolate between bosons and fermions in the sense that when we exchange two of them in the system, the associated wave function acquires a multiplicative phase-factor of unit amplitude but distinct from ±1. The prediction of these particles is physically relevant for various condensed matter systems where the dynamics is effectively two-dimensional.

Wilczek proposed a simple dynamical realization of anyons with the use of "composite" particles. Let us explain the idea on a simple model of two identical particles [6]. Take either two bosons or two fermions. Then, glue each of the particles together with a magnetic vortex, i.e. with infinitely thin solenoids of the same magnetic flux α. As a result, we get two identical composite particles. Each particle can "see" just the potential generated by the solenoid of the other particle. The Hamiltonian corresponding to this two-body system has the following form:

H_any = 2 Σ_{i=1,2} (p⃗_i − A⃗_i(r⃗))² , (2.1)

where p⃗_i = −i∂/∂x⃗_i, r⃗ = x⃗_1 − x⃗_2 and the index i ∈ {1, 2} labels the individual particles. The potential

A^k_1(r⃗) = −A^k_2(r⃗) = (α/2) ε_{kl} r_l / r⃗ ² (2.2)

encodes the "statistical" interaction of the particles. In this sense, we call A⃗_i the statistical potential. When we write the Hamiltonian in center-of-the-mass coordinates, the relative motion of the particles is governed by the effective Hamiltonian

H_rel = −∂_r² − (1/r)∂_r + (1/r²)(−i∂_ϕ + α)² , (2.3)

where r is the distance between the particles and ϕ measures their relative angle. The Hamiltonian (2.3) manifests the relation between the two-body model of identical anyons and the AB system: formally, it coincides with H⁰_α up to the irrelevant Dirac delta term. However, its domain of definition is quite different. When anyons (composite particles) are composed of bosons, the wave function has to be invariant under the substitution ϕ → ϕ + π, which corresponds to the exchange of the particles. When anyons are composed of fermions, the wave function has to change the sign after the substitution. Hence, the wave functions are of two types,

ψ_α(r, ϕ) = Σ_l e^{ilϕ} f_{α,l}(r) , l ∈ 2Z for anyons based on bosons, l ∈ 2Z + 1 for anyons based on fermions. (2.4)

We shall explain how the considered model describes anyons as the interpolation between bosons and fermions. We can transform the system by a unitary mapping U = e^{iϕα} and describe the system of two identical anyons alternatively by the Hamiltonian H̃_rel = U H_rel U⁻¹ = −∂_r² − (1/r)∂_r + (−i∂_ϕ)²/r². It coincides with the energy operator of the free motion. The simplicity of the Hamiltonian is traded for the additional gauge factor that appears in the wave functions, ψ̃_α(r, ϕ) = U ψ_α(r, ϕ) = e^{iϕα} Σ_l e^{ilϕ} f_{α,l}(r). The wave functions ψ̃_α(r, ϕ) acquire the phase e^{iπα} after the substitution ϕ → ϕ + π and, hence, interpolate between the values corresponding to Bose-Einstein and Fermi-Dirac statistics.
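The interpolation can be made concrete with a few lines of arithmetic: under the exchange ϕ → ϕ + π, the gauge-transformed wave function (built from even modes only, cf. (2.4)) picks up exactly the factor e^{iπα}. A minimal sketch, with an invented mode content and a sample value of α:

```python
import numpy as np

alpha = 0.37                            # sample statistical parameter
phi = np.linspace(0.0, 2 * np.pi, 7)    # sample relative angles

# Bosonic-based anyon wave function at fixed r: e^{i alpha phi} times a
# pi-periodic part containing only even angular modes, cf. (2.4).
modes = {0: 1.0, 2: 0.5, -2: 0.25}
part = lambda a: sum(c * np.exp(1j * l * a) for l, c in modes.items())

psi = np.exp(1j * alpha * phi) * part(phi)
psi_exchanged = np.exp(1j * alpha * (phi + np.pi)) * part(phi + np.pi)

# Exchanging the two anyons multiplies the wave function by e^{i pi alpha}.
print(np.allclose(psi_exchanged / psi, np.exp(1j * np.pi * alpha)))   # True
```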
We are ready to reconsider the AB system and its symmetries in the framework of identical anyons. The Hamiltonian H⁰_α can be rewritten as a direct sum over subsystems with a fixed value of the spin S_3 and the parity R. It is convenient to use a notation that reflects the decomposition of the wave functions into these subspaces, Ψ̃ = (ψ_{ς_+π_+}, ψ_{ς_+π_-}, ψ_{ς_-π_+}, ψ_{ς_-π_-})ᵀ, where ς_± = (1 ± σ_3)/2 and π_± = (1 ± R)/2. In this formalism, the Hamiltonian reads

H^{γ=0}_α = diag (H⁰_{α,+}, H⁰_{α,-}, H^{AB}_{α,+}, H^{AB}_{α,-}) , H⁰_{α,±} = H⁰_α ς_+ π_± , H^{AB}_{α,±} = H⁰_α ς_- π_± . (2.5)

Let us make a few comments on the elements of (2.5). Consider H^{AB}_{α,+} in more detail first. It acts on the wave functions that are periodic in π. Hence, it can be interpreted as the Hamiltonian of the relative motion of two identical anyons based on bosons. Its wave functions are regular at r → 0, which can be interpreted as a consequence of a hard-core interaction between the anyons. It is worth noting that the system represented by H^{AB}_{α,+} coincides with the system represented by H⁰_{α,+}. Indeed, the Hamiltonians coincide not only formally but in their domains as well (there are no singular wave functions in their domains, see (1.5)). Hence, we can write H⁰_{α,+} = H^{AB}_{α,+}. The operators H^{AB}_{α,-} and H⁰_{α,-} describe the systems of two identical anyons based on fermions. The operator H^{AB}_{α,-} prescribes a hard-core interaction between the anyons. By contrast, the system described by H⁰_{α,-} allows singular wave functions. This can be understood as a consequence of a nontrivial contact interaction between the composite particles.

The integrals of motion Q, Q̃ and W shall be rewritten in the 4 × 4-matrix formalism. They read explicitly

Q = [ 0    0    0    q_+
      0    0    q_+  0
      0    q_-  0    0
      q_-  0    0    0  ] ,

Q̃ = [ 0    q_-  0    0
      q_+  0    0    0
      0    0    0    q_+
      0    0    q_-  0  ] ,

W = [ 0        0        q_+q_-   0
      0        0        0        q_+²
      q_-q_+   0        0        0
      0        q_-²     0        0   ] , (2.6)

where q_± was defined in (1.8). Substituting (2.5) and (2.6) into the relations [Q, H^γ_α] = 0, [Q̃, H^γ_α] = 0 and [W, H^γ_α] = 0, we get the following set of independent intertwining relations:

H⁰_+ q_- = q_- H⁰_- , q_+ H⁰_+ = H⁰_- q_+ , (2.7)
H⁰_+ q_+ = q_+ H^{AB}_- , q_- H⁰_+ = H^{AB}_- q_- , (2.8)
H^{AB}_- q_-² = q_-² H⁰_- , q_+² H^{AB}_- = H⁰_- q_+² . (2.9)

Let us focus on the first set (2.7). The relations can be rewritten as

[Q^{(1)}_a , H^{(1)}] = 0 , (2.10)

where we used the operators

H^{(1)} = [ H⁰_+  0
            0     H⁰_- ] , Q^{(1)}_1 = [ 0    q_-
                                         q_+  0  ] , Q^{(1)}_2 = i [ 0     −q_-
                                                                     q_+   0   ] . (2.11)

The operators (2.11) close for N = 2 supersymmetry³. Indeed, they satisfy the anticommutation relation

{Q^{(1)}_a , Q^{(1)}_b} = 2δ_{a,b} H^{(1)} , a, b = 1, 2 . (2.13)

Hence, the operator H^{(1)} can be understood as the super-extended Hamiltonian of the two-body anyonic systems. The system represented by H⁰_+ is based on bosons (the wave functions are π-periodic), the other system (represented by H⁰_-) is based on fermions with a nontrivial contact interaction. The supercharges Q^{(1)}_a provide the mapping between these two systems. They exchange the bosons with fermions within the composite particles. Besides, they switch on (off) the nontrivial contact interaction between the anyons.
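The anticommutator (2.13) can be verified directly on the partial waves: applying q_+ and then q_- to f(r)e^{ilϕ} reproduces the radial part of the Hamiltonian, so each supercharge squares to H^{(1)}. A small symbolic check, using our own encoding of the operators (1.8) and valid away from r = 0:

```python
import sympy as sp

r, alpha = sp.symbols('r alpha', positive=True)
l = sp.symbols('l', integer=True)
f = sp.Function('f')(r)

# Action of q_± of (1.8) on a partial wave f(r) e^{i l phi}: the angular
# factor shifts l -> l -/+ 1, and the radial part transforms as below.
q_plus  = lambda g, m: (-sp.I * (sp.diff(g, r) + (m + alpha) * g / r), m - 1)
q_minus = lambda g, m: (-sp.I * (sp.diff(g, r) - (m + alpha) * g / r), m + 1)

# Radial part of the AB Hamiltonian on the l-th partial wave.
H_rad = lambda g, m: (-sp.diff(g, r, 2) - sp.diff(g, r) / r
                      + (m + alpha)**2 * g / r**2)

g1, m1 = q_plus(f, l)        # first supercharge factor
g2, m2 = q_minus(g1, m1)     # ... and back: q_- q_+ should equal H

print(sp.simplify(g2 - H_rad(f, l)))   # 0  (and m2 == l, as it must be)
```

Running the same two-step composition twice, with q_±² in place of q_±, likewise confirms the nonlinear relation (2.17) below, in which the square of the Hamiltonian appears.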
The relations (2.8) can be analyzed in the same vein, giving rise to the N = 2 supersymmetric system of the pair of two-body anyonic models. For the sake of completeness, we present the corresponding operators and the algebraic relations of the superalgebra:

H^{(2)} = [ H⁰_+  0
            0     H^{AB}_- ] , Q^{(2)}_1 = [ 0    q_+
                                             q_-  0  ] , Q^{(2)}_2 = i [ 0     −q_+
                                                                         q_-   0   ] , (2.14)

[Q^{(2)}_a , H^{(2)}] = 0 , {Q^{(2)}_a , Q^{(2)}_b} = 2δ_{a,b} H^{(2)} , a, b = 1, 2 . (2.15)

The only difference appears in the contact interaction between the anyons. This time, the hard-core interaction appears in both systems (neither H⁰_+ nor H^{AB}_- has singular wave functions in its domain).

A qualitatively different situation occurs in the last case (2.9). The intertwining relations define the N = 2 nonlinear supersymmetry⁴ represented by the operators

H^{(3)} = [ H⁰_-  0
            0     H^{AB}_- ] , Q^{(3)}_1 = [ 0     q_+²
                                             q_-²  0   ] , Q^{(3)}_2 = i [ 0      −q_+²
                                                                           q_-²   0    ] . (2.16)

They satisfy the following relations:

[Q^{(3)}_a , H^{(3)}] = 0 , {Q^{(3)}_a , Q^{(3)}_b} = 2δ_{ab} (H^{(3)})² , a, b = 1, 2 . (2.17)

The supercharges Q^{(3)}_a alter the contact interaction between the anyons (hard-core in H^{AB}_- to nontrivial in H⁰_- and vice versa) but do not alter the nature of the composite particles.

³ Let us suppose that we have a quantum mechanical system described by a Hamiltonian H. There are N additional observables, represented by the operators Q_a, a ∈ {1, . . . , N}. It is said that the system has supersymmetry as long as the operators Q_a, together with the Hamiltonian, satisfy the algebraic relations

{Q_a, Q_b} ∼ H δ_{ab} . (2.12)

If this is the case, the operators Q_a are called supercharges. As a direct consequence of (2.12), they satisfy the relations Q_j² ∼ H, [Q_j, H] = 0.

⁴ The system has nonlinear supersymmetry when the supercharges Q_a, a ∈ {1, . . . , N}, satisfy the generalized anticommutation relation [7] {Q_a, Q_b} = δ_{ab} f(H), where f(H) is a function of the Hamiltonian H. Usually, f(H) is considered to be a higher-order polynomial.

3 Comments

In this paper, we have utilized the intimate relation between the Aharonov-Bohm model and the dynamical realization of anyons in order to construct three different N = 2 supersymmetric systems of identical anyons. The origin of the supersymmetry can be attributed to the symmetries Q, Q̃ and W of the spin-1/2 particle in the field of a magnetic vortex.
s.: supersymmetries of the spin−1/2 particle in the field of magnetic vortex, and anyons, arxiv:1003.1434 [hep-th]; correa, f., falomir, h., jakubský, v., plyushchay,m. s.: hidden superconformal symmetry of spinless aharonov-bohm system, j. phys. a 43, 075202 (2010). [4] jakubský, v., nieto, l. m., plyushchay, m. s.: the origin of the hidden supersymmetry, arxiv:1004.5489 [hep-th]. [5] wilczek, f.: magnetic flux, angular momentum, and statistics, phys. rev. lett. 48, 1144 (1982); quantum mechanics of fractional spin particles, phys. rev. lett. 49, 957 (1982). [6] wilczek, f.: fractional statistics and anyon superconductivity, world scientific, singapore (1990); khare,a.: fractional statistics andquantumtheory, world scientific, singapore (1997). [7] andrianov,a. a., ioffe, m. v., spiridonov,v. p.: higher derivative supersymmetry and thewitten index,phys. lett. a 174, 273 (1993), [arxiv:hepth/9303005]. [8] correa, f., jakubský, v., plyushchay, m. s.: aharonov-bohmeffect onads2 andnonlinear supersymmetry of reflectionless poschl-teller system, annals phys. 324, 1078 (2009), [arxiv:0809.2854 [hep-th]]. ing. vít jakubský, ph.d. e-mail: jakubsky@ujf.cas.cz nuclear physics institute of the ascr, v. v. i. řež 130, 250 68 řež, czech republic 50 ap-6-10.dvi acta polytechnica vol. 50 no. 6/2010 a morphing technique applied to lung motions in radiotherapy: preliminary results r. laurent, j. henriet, r. gschwind, l. makovicka abstract organ motion leads to dosimetric uncertainties during a patient’s treatment. much work has been done to quantify the dosimetric effects of lung movement during radiation treatment. there is a particular need for a good description and prediction of organ motion. to describe lung motion more precisely, we have examined the possibility of using a computer technique: a morphing algorithm. morphing is an iterative method which consists of blending one image into another image. to evaluate the use of morphing, four dimensions computed tomography (4dct) acquisition of a patientwas performed. the lungswere automatically segmented for different phases, andmorphingwas performed using the end-inspiration and the end-expiration phase scans only. intermediate morphing files were compared with 4dct intermediate images. the results showed good agreement between morphing images and 4dct images: fewer than 2 % of the 512 by 256 voxels were wrongly classified as belonging/not belonging to a lung section. this paper presents preliminary results, and our morphing algorithm needs improvement. we can infer that morphing offers considerable advantages in terms of radiation protection of the patient during the diagnosis phase, handling of artifacts, definition of organ contours and description of organ motion. keywords: organ motion, morphing, 4dct. 1 introduction organ motion leads to dosimetric uncertainties during a patient’s treatment. much work has been done to quantify the dosimetric effects of lung movement during radiation treatment. in particular, important work is being done on describing and predicting organ motion. four dimensions computed tomography (4dct) can easily be performed using varian’s real-timepositionmanagementsystem (rpm)with slices acquired throughout the respiratory cycle. a reflective marker is fixed on a box positioned on the chest of the patient. the process is rather complicated, because deformation does not follow the same speed and amplitude at every point of a given organ. 
Furthermore, in the lungs, hysteresis must be considered: the motion in expiration and in inspiration is not symmetrical [44, 45]. We did not account for hysteresis in this preliminary study. We had to find a mathematical definition to describe the movement. We chose to use the morphing technique, because it permits both evaluation and interpolation of the movement. Morphing is a computer-science technique which consists of blending one image into another. Turning an image at the beginning of inspiration into an image at the end of the inspiration may therefore provide a precise description of lung motion. The intermediate files obtained with morphing were very close to the 4DCT intermediate images. One of the most important advantages of morphing over 4DCT is the fact that the organ position can be precisely known with less exposure to radiation. For this preliminary study, we chose to study the motions of several lungs. We also wish to stress that these results do not yet account for all aspects of organ motion. Nevertheless, the results are promising and encourage further, more complex studies. Our morphing program has to be further refined before it can be offered for routine use in radiation treatment.

The first section of this paper begins with a brief review of organ motion in radiotherapy, followed by an overview of the simulation of lung motion, as well as an overview of the morphing algorithms and tools used in medicine. The second section of the paper presents the material and method used for image acquisition, and also the morphing algorithm. Finally, the results are presented and discussed in the third section.

1.1 Organ motions in radiotherapy

Conformal radiotherapy shapes the treatment fields to conform to the target geometry in three dimensions. However, the use of a single 3DCT (three-dimensional computed tomography) data set for treatment planning assumes that the single CT represents the mean positions of the target organ and of any organs at risk (OAR). In reality, the received absorbed dose may differ from the planned absorbed dose: either the dose coverage of the targeted volume is insufficient, or there is an over-dosage of the surrounding normal tissue. The motion of the tumour volume is traditionally accounted for by the use of margins according to International Commission on Radiation Units and Measurements (ICRU) reports 50 and 62 [14, 15]. The margin used for treatment between the gross tumour volume (GTV) and the planning target volume (PTV) is divided into two parts:
• an internal margin (IM), to account for clinical target volume (CTV) size and shape variations,
• a setup margin (SM), to account for uncertainties in patient positioning and beam alignment.
The SM is determined either empirically or by measuring the magnitude of organ motion during respiration. Many organ motion studies are available in the literature. Generally, they relate to the thoracic or abdominal organs [15], and have used various motion observation techniques: an ultrasound scanner [11], digital fluoroscopy [33], a gamma scintillation camera [43], or a magnetic method [1]. The purposes of these observations are multiple: to improve image acquisition [11, 43], to investigate field margins [33], or to examine effects on organs at risk [1]. Not all imaging modalities are appropriate to all tumour sites. Sixel et al. [33] recommend CT scanner use. Low et al.
[18] studied the motion of a pulmonary tumour with the scanner modality and observed that, between the two breathing extremes, one part of the tumour followed a rotational displacement while the other part followed a stretching displacement. They concluded that a simple one-dimensional rigid-motion model of this tumour would be unable to accurately reflect its motion during the patient's respiratory cycle. Other authors [4, 19] have noted that tumours move in translation and simultaneously distort their shape. Many models and studies exist concerning organ motion prediction. For example, Schweikard et al. [32] studied organ deformation from a mechanical point of view (tissue elasticity) and created a mathematical model, and Heath et al. [13] considered the deformation of a voxel. We decided to explore the possibility of using morphing to simulate organ motion.

1.2 Simulation of lung motions

The aim of simulating lung motion is to minimize the radiation exposure due to 4DCT and to help trace the accurate motion of the lung. This knowledge of motion may allow reduced margins to be used in treatment planning systems (TPS) in radiotherapy. Many works exist on the simulation of lung motions, each with its own method. Villard et al. [41, 42] developed a method based on the laws of the mechanics of continuous media, with resolution by finite elements. This method uses patient data and biomechanical parameters to generate a customized simulation of lung motion. Boldea et al. [5, 6] estimated lung deformation with an algorithm of non-rigid registration. The implemented algorithm blends one image into another without dimension restriction: the objects represented in the two images may not be represented according to the same dimensions. The two techniques were compared, and data similar to a 4DCT was created. This method was used for all respiratory motions [48]. Santhanam et al. [30] simulated the motions of a lung surface with a tumour, in order to have a model working in real time. The model takes as its input a subject-specific 4DCT of the lungs and computes a deformable lung surface model by estimating the deformation properties of the surface model using an inverse dynamics approach. However precise the above techniques may be, they are too cumbersome for clinical application, as they are based on complicated computations.

1.3 Morphing for medicine

This section summarizes the applications of morphing to medicine. The purpose of morphing is to blend one image into another; both of the images represent the same object and use the same structure [29]. Morphing presents many potential uses in medicine. J. Talairach and P. Tournoux [36] presented a way to use morphing in order to create a universal cerebral atlas. Others, e.g. J. R. Moringlane [22], designed morphing algorithms to locate precise points in the brain. Other work has compared and merged data from different patients, or has drawn up statistics of distortions. P. M. Thompson and A. W. Toga used morphing to study Alzheimer's disease [37, 39]. E. Stindel used morphing to simulate the evolution of trauma to the patella [34]. B. P. Bergeron et al. [3] developed applications for education based on morphing that are able to generate variations in medical images. K. Penska et al. [25] also used morphing for education and diagnosis, by turning radiographic images into a movie that demonstrates the healing process of a humerus fracture. Other examples and approaches were reported by J. Montagner et al. [21]. As presented in Fig. 1, J. B. A. Maintz and M. A.
Viergever distinguished four main families of transformations [20] (a small numerical illustration is given at the end of this section):
1. rigid transformations are simple translations and rotations;
2. affine transformations retain the parallelism of image lines;
3. projection transformations retain image lines, but not necessarily their parallelism;
4. non-linear transformations turn lines into curves.

Fig. 1: the four families of transformations through morphing

Non-linear transformations are usually applied for medical issues. There are three categories of methods for this kind of transformation [23]:

(i) Intensity-based methods take a global approach over the different grey levels of the images. These methods are very complex: they are based on the minimisation of an energy which describes the degree of similarity between the source and the target morphing images. J.-P. Thirion [38] presented a kind of regulation using iterative screening: during this regulation, he drew the outlines of the object in a first image, and then let the second image filter through the first one. M. Bro-Nielsen and G. Gramkow adopted the same approach [7], adding a linear elasticity parameter. C. Davatzikos [10] went further, using the elastic properties of a material. G. E. Christensen added speediness (a fluid model) to elasticity [8] and defined constraints using parametric equations. Other, multiscale approaches were presented by H. Lester and S. R. Arridge [17]; these implement a multiple decomposition either of the images or of the distortions. The former consists in solving the transformation of one version of the image into another version and repeating it until the end; the latter, unusually, consists in considering successive distortions with a higher and higher degree of freedom.

(ii) Feature-based methods propose less complex algorithms and faster calculations. These methods are based on geometrical transformations [47]. Various geometrical methods have been implemented, such as the automatic, semi-automatic and manual search of critical points [26, 40]. The search for crest lines, curve points and shape curves are also geometrical methods that have been implemented [35]. Methods based on the use of models which describe areas are geometrical methods as well: these models allow one to roughly extract the most important anatomical structures, and then to refine them [28]. A. L. Didier et al. [12] and D. Sarrut et al. [31] simulated lung motion using geometrical transformations based on continuous mechanical laws, solved by the finite element method. This approach takes into account the influence of the organs around the lungs that are responsible for their motion and for limiting their motion: the ribs, the diaphragm, all their associated muscles and the pleura. H. Atoui et al. [2] also implemented a feature-based algorithm to construct missing intermediate lung slices.

(iii) Hybrid methods have also been implemented, by L. Collins and A. C. Evans [9].
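The first two families can be illustrated numerically. The sketch below (Python with NumPy; an illustrative example of ours, not code from the cited works) applies a rigid and an affine transformation to a toy 2D contour, and approximates a non-linear transformation with a smooth sinusoidal displacement field:

import numpy as np

# Contour points of a toy organ outline (a circle), one point per column.
theta = np.linspace(0.0, 2.0 * np.pi, 100, endpoint=False)
points = np.vstack([50.0 + 10.0 * np.cos(theta), 50.0 + 10.0 * np.sin(theta)])

# 1. Rigid: a 10-degree rotation plus a translation (lengths are preserved).
a = np.deg2rad(10.0)
rot = np.array([[np.cos(a), -np.sin(a)], [np.sin(a), np.cos(a)]])
rigid = rot @ points + np.array([[3.0], [-2.0]])

# 2. Affine: anisotropic scaling and shear (parallel lines stay parallel).
affine = np.array([[1.2, 0.3], [0.0, 0.9]]) @ points + np.array([[3.0], [-2.0]])

# 4. Non-linear: a smooth displacement field turns straight lines into curves.
nonlinear = points + 2.0 * np.sin(points[::-1] / 5.0)

print(rigid.shape, affine.shape, nonlinear.shape)

A projection transformation would replace the 2 × 2 matrix by a full homography acting on homogeneous coordinates; it is omitted here for brevity.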
2 Material and method

This section presents the materials and methods that were used. In the first part, we describe the way the patient data was acquired; in the second part, we discuss the morphing algorithm that has been implemented.

2.1 Acquisition of patient data

With the respiratory-gating radiotherapy technique, various tools were developed to register the patient's breathing pattern. Many techniques exist, only two of which are used by the hospital team with which we collaborate: the Real-time Position Management system (RPM, Varian Medical Systems, Palo Alto, CA) and the spirometer system. RPM consists of placing a block with two reflective markers on the patient's abdomen, in front of a video camera which uses an infrared signal to register the motion of this block. This technique is used in 4D-PET (four-dimensional positron emission tomography) and 4DCT-scan acquisitions [24, 27], often in free-breathing mode. The other technique is the spirometer system, which is often used in breath-hold techniques. The patient's air flow circulates through the mouth in a spirometer [48]; a nose-clip prevents the patient from nasal breathing.

A patient was set up on the 4DCT (GE LightSpeed 16-slice scanner, GE Medical Systems, Waukesha, WI) lying in the treatment position, suspended in an alpha cradle, his arms folded above his head. The patient's data was registered in the free-breathing cine mode: the scans were acquired at each table location to ensure a complete sampling of data for one respiratory cycle. The scanning technique used 120 kVp and 400 mA with slices 2.5 mm in thickness. This process was repeated until the entire thorax was scanned (160 couch positions). The RPM was used to obtain the retrospective 4DCT. These 4DCT images were sorted according to the respiratory pattern. Each 3DCT scan represents 3D anatomic information at a certain phase. In this study, the 4DCT scans were sorted into ten equal periods. As represented in Fig. 2, the 0 % phase corresponds to the end of inspiration and the 50 % phase to the end of expiration.

Fig. 2: respiratory cycle and selection process achieved by 4DCT

Fig. 2 describes the principles of an acquisition performed during one patient's breathing cycle [33]. The respiratory signal and the data are acquired at the same time, in order to be correctly sorted. Each 3D reconstruction corresponds to one of the six different phases of the respiratory cycle, and the entire motion is constituted from the set of 3D sorted data. The obtained slices are numbered relatively: slice 0 is the sagittal slice in the middle of the patient's lung, positive slices are toward the lung apex, and the negative ones are toward the diaphragm. It is impossible to obtain one scan that represents each slice at exactly 0 %, 10 %, 20 %, 30 %, 40 % and 50 % expiration. Consequently, a tolerance value is introduced in order to use one scan for each phase, so the accurate moment does not always correspond to the phase in which it is used. The Advantage4D software (by GE Medical Systems) is in charge of this retro-classification. Table 1 shows the correspondences between the accurate moments and the phases in which they are sorted by Advantage4D. As observed in Table 1, for slices 75, 0, −50 and −100, the same CT scans are used in the 0 % and 10 % phases, leading to artifacts in the final 3D scan reconstruction. For example, Table 1 shows that slice number 75 was obtained at precisely 14 % expiration and is used to construct two 3D images: the image of the 0 % expiration phase and also the image of the 10 % expiration phase.

Table 1: accurate respiratory moments used for the 0 %, 10 %, 20 %, 30 %, 40 % and 50 % phases

Slice number | 0 % phase | 10 % phase | 20 % phase | 30 % phase | 40 % phase | 50 % phase
          75 |      14 % |       14 % |       22 % |       30 % |       38 % |       53 %
           0 |      16 % |       16 % |       16 % |       33 % |       41 % |       50 %
         −50 |      17 % |       17 % |       17 % |       33 % |       41 % |       49 %
        −100 |       2 % |        2 % |       34 % |       34 % |       43 % |       52 %

Morphing was performed for several couch positions on the first 2D image (classified in the 0 % phase) and the final image (classified in the 50 % phase). The intermediate files obtained by morphing were compared to the intermediate images obtained with 4DCT. All the compared images and intermediate files have the same number of voxels (512 by 256), and the voxels of all those files have the same dimensions (28 mm by 38 mm by 50 mm) and Cartesian coordinates. Thus, to compare one file with another, we counted the number of voxels which did not have the same value in both files.
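This comparison criterion amounts to counting the voxels on which two binary masks disagree. A minimal sketch (Python with NumPy; the masks below are randomly generated stand-ins for the real morphing and 4DCT files):

import numpy as np

def fraction_misclassified(mask_a, mask_b):
    """Fraction of voxels whose inside/outside-lung label differs."""
    assert mask_a.shape == mask_b.shape
    return np.count_nonzero(mask_a != mask_b) / mask_a.size

# Two hypothetical 512 x 256 binary lung masks (morphing vs. 4DCT).
rng = np.random.default_rng(0)
morphed = rng.integers(0, 2, size=(512, 256), dtype=np.uint8)
ct_phase = morphed.copy()
ct_phase[:5, :5] ^= 1  # flip a few voxels to mimic a small disagreement
print(f"{100 * fraction_misclassified(morphed, ct_phase):.2f} % differing voxels")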
3 Morphing

This section describes the morphing algorithm that we designed and implemented. Whereas the other models presented are predictive (cf. Section 1.3), our model is an interpolation model based on morphing. The algorithm is presented in Fig. 3 and is transcribed below in Python (operating on binary masks; the DICOM handling of the original is omitted). It has three input parameters: the source and target files, containing the organ contour at the beginning and at the end of the respiration cycle, respectively, and an integer which regulates the deformation, as explained below. There is one output parameter: the array of intermediate files generated at each step. 4DCT generates image files; then, as explained in Section 2.1, the Advantage4D software sorts the files; next, the AdvantageSim software (by GE Medical Systems) detects and stores the target organ contours.

As a first step, the algorithm reads the file corresponding to the initial image of the organ (at the beginning of expiration). Actually, the only information used is whether each voxel is inside or outside the lung, so the values that the algorithm deals with are '0' and '1', and the contour of the organ is a set of voxels. When a contour voxel v_source is found, each voxel located around it takes the value that it has in the final image (at the end of expiration): '0' if it is outside the organ at the end, '1' otherwise. Fig. 4 illustrates the voxels that are taken into account around v_source (the black voxel). This step is repeated until the final image of the organ slice is obtained.

# Morphing algorithm of Fig. 3, recast from the original pseudocode as
# Python over binary masks (1 = inside the organ); DICOM I/O is omitted.
import numpy as np
from scipy.ndimage import binary_erosion, distance_transform_edt

def contour(mask):
    # A contour voxel is inside the organ but has an outside neighbour.
    return mask & ~binary_erosion(mask)

def morphing(source_file, target_file, def_default):
    current = source_file.astype(bool)
    target = target_file.astype(bool)
    interm_files = [current.copy()]
    # Distance from every voxel to the nearest voxel of the target contour.
    dist_to_target = distance_transform_edt(~contour(target))
    while not np.array_equal(current, target):
        contour_voxels = np.argwhere(contour(current))
        if contour_voxels.size == 0:
            break
        # Mean distance of the current contour from the final organ contour.
        d = dist_to_target[tuple(contour_voxels.T)].mean()
        step = current.copy()
        for v_source in contour_voxels:
            # Distance between v_source and the final organ contour.
            d_source = dist_to_target[tuple(v_source)]
            dist = float(def_default)
            if 0.0 < d < d_source:
                # Deform faster where the contour is far from its final position.
                dist = dist * d_source / d
            r = max(1, int(round(dist)))
            # Voxels taken into account around v_source: a cube of half-side r.
            lo = np.maximum(v_source - r, 0)
            hi = np.minimum(v_source + r + 1, np.array(current.shape))
            region = tuple(slice(a, b) for a, b in zip(lo, hi))
            # Each voxel around v_source takes the value it has in the target.
            step[region] = target[region]
        if np.array_equal(step, current):
            break  # no progress; avoid an infinite loop on degenerate inputs
        current = step
        interm_files.append(current.copy())
    return interm_files

Fig. 3: the morphing algorithm
Fig. 4: voxels that are taken into account at each step of the transformation
This treatment is applied to each voxel of the organ contour of the considered image. The voxels of the initial image progressively take the '0–1' values of the voxels of the final image. The result of this loop is an intermediate file which is not yet the final file. Thus, the loop is repeated until the intermediate file obtained is exactly equal to the final file; the intermediate file obtained is considered as the new initial file during the next step. The intermediate files obtained, placed one after another in chronological order, create a motion, and our purpose is to make this motion close to the real lung motions of the patients.

The very first version of this algorithm took into account only the grey-coloured voxels of Fig. 4. In fact, the obtained contours were very jagged, whereas in reality the contours are more rounded. For this reason, the algorithm now takes into account both the grey-coloured and the green-coloured voxels around each v_source. This algorithm implements a feature-based method (cf. Section 1.3): indeed, it is based on organ contour detection and its geometrical transformation. There is a kinetic regulation to determine the number of voxels taken into account around v_source. This method allows us to apply the transformation at different distances from v_source: the voxels just touching it (8 voxels), the voxels that are 2 voxels away from v_source, or the voxels n voxels away from v_source. More precisely, the mean distance d_source between v_source and the voxels of the organ contour of the target image is calculated (by superimposing the two files). v_source is not unique: the same mean distance is calculated for all the voxels of the contour. The algorithm then obtains d, which is the mean of these mean distances. Each calculated d_source is compared to d. If d_source is less than d, the number of voxels transformed is the number defined as the input parameter; otherwise, this input parameter is weighted by d_source/d. Thus, the deformation applied is larger if v_source is far from the final position of the organ contours. Furthermore, since these distances vary at each step of the algorithm, the deformation is faster during the construction of the first intermediate files than during the last ones. Even though this version is very simple (2 grey levels and 2 dimensions), we wanted to validate our approach before improving it (multiple grey levels, 4 dimensions).

In order to validate our approach on a qualitative level, we compared the iterative organ contours obtained by morphing to the intermediate organ contours drawn by Advantage4D. We applied this technique to a patient and to different lung slices. In the next part of this paper, we present and analyze the results obtained.

4 Results

We selected four lung slices. We analyzed the scanned images from 4DCT and compared them to the images obtained by morphing. The first part of this section presents the results, the second part is a discussion of the advantages of morphing in radiotherapy, and the limitations of this algorithm and technique are presented in the last part.

4.1 Morphing analysis

Figs. 5 and 6 display the results obtained by 4DCT and morphing. The results are superimposed in Fig. 5: black lines represent the organ contours according to 4DCT and white lines represent the organ shapes according to the morphing algorithm. For slice 75 of the left lung, there were five 4DCT scans.
We applied morphing as if the first and last scans (white shapes) were the lung images at maximal inspiration and maximal expiration, respectively. In that case, there are fewer than 2 % differing voxels at each step (each slice is composed of 512 by 256 voxels). As shown in Fig. 5, similar results are obtained with slice 0. The largest difference is obtained with the right-lung slice −100: 2 % differing voxels between the morphing intermediate images and the 4DCT scans for one respiratory moment (34 %).

Fig. 5: superimposition of the images obtained with morphing (white shapes) and 4DCT (black lines); the percentages are the accurate moments of each phase (Table 1)
Fig. 6: right lung images superimposed, 4DCT in white, morphing in red; phase 10 % (a), phase 20 % (b), phase 40 % (c), slice −37.5 mm

In Fig. 6, the lung obtained by morphing (in red) is superimposed over the lung image obtained with 4DCT. The differences between the images are acceptable. Indeed, these results tend to prove that the morphing algorithm that we have designed both describes and predicts lung motion. We now have to account for the rhythm of the motion in morphing. Lung motion is relatively simple, but this may not be the case for other organs.

4.2 Benefits of morphing

Even if this version of our morphing implementation requires improvement, the present study suggests that applying the morphing technique in radiotherapy is promising, and that it presents several considerable advantages. The most important benefit is that it protects the patient from radiation. Indeed, the combination of 4DCT and morphing requires only two scans for each organ slice, instead of six or more scans per organ slice with the use of 4DCT alone. Morphing will not totally replace 4DCT scans, but it may considerably decrease the number of required scans. The second advantage is that it improves contour detection and description. Indeed, morphing enables the creation of an entire organ image at a precise moment t, even if there is no scan at t. Furthermore, morphing reduces the number of artifacts, since artifacts appear when no scan is available at t for a specific slice. Therefore, morphing may be used in order to decrease the number of artifacts: considering lung slice −100 of Table 1, for example, instead of using a 2 % scan in the 10 % phase, morphing could extrapolate a scan for the 10 % phase from the 2 % scan and one other scan. Finally, the description of the entire range of motion of an organ will be more precise. This will considerably benefit synchronization during the treatment phase.

5 Conclusions

These preliminary results concerning the use of morphing for organ deformation analysis are promising. The maximum morphing deviation is only 2 %. The lack of 4DCT scan patient data was highlighted; to overcome this limitation, morphing is a possible interpolation method. It can give intermediate organ contours in order to supplement 4DCT scan sorting. This will reduce the internal margin for the target volume and will enable the patient to be treated with gated radiotherapy [16]. It will then be required to apply morphing in 3D. Our goal is to further refine the algorithm. Step by step, we will study ways to distinguish transformation speediness from organ wall elasticity, and then to simulate the movement of the entire target organ. A further step will involve representing the deformations of other organs and also taking into account the motion of organs at risk.
Indeed, some organs may not move linearly; there are accelerations and slowdowns. In addition, organ walls are not uniformly elastic. In order to have a global vision, it is necessary to apply separate speeds and distortions to simulate the motions of a group of organs.

Acknowledgement

The authors acknowledge financial support from LCC (Ligue Contre le Cancer), STIC (Soutien Technique aux Innovations Coûteuses) Gating, Région Franche-Comté, Cancéropôle Grand-Est, CAPM and Dr. R. Hamlaoui (CHU Besançon) for organ delineations. We would like to thank Dr. G. Hruby for his help with translation and proof-reading.

References

[1] Andrä, W., Danan, H., Eitner, K.: A novel magnetic method for examination of bowel mobility. Med. Phys., 2005, 32, 2942–2944.
[2] Atoui, H., Miguet, S., Sarrut, D.: A fast morphing-based interpolation for medical images: application to conformal radiotherapy. Image Anal. Stereol., 2006, 25, 95–103.
[3] Bergeron, B. P., Sato, L., Rouse, R. L.: Morphing as a means of generating variation in visual medical teaching materials. Comput. Biol. Med., 1994, 24(1), 11–18.
[4] Berson, A. M., Emery, R., Rodriguez, L., Richards, G. M., Ng, T., Sanghavi, S., Barsa, J.: Clinical experience using gated radiation therapy: comparison of free-breathing and breath-hold techniques. Int. J. Radiation Oncology Biol. Phys., 2004, 60, 419–426.
[5] Boldea, V., Sarrut, D., Clippe, S.: Lung deformation estimation with non-rigid registration for radiotherapy treatment. MICCAI 2003, Montreal (Canada), 2878, 770–777.
[6] Boldea, V., Sarrut, D., Sharp, G. C., Jiang, S. H., Choi, N. C.: Study of motion in a 4D CT using deformable registration. Int. J. Radiation Oncology Biol. Phys., 2005, 63, 499–500.
[7] Bro-Nielsen, M., Gramkow, G.: Fast fluid registration of medical images. Fourth International Conference on Visualisation in Biomedical Computing, VBC'96, Hamburg, Germany, 1996, 267–276.
[8] Christensen, G. E.: Bayesian framework for image registration using eigenfunction. In: Toga, A. W. (ed.), Brain Warping, Academic Press, 1999, 5, 85–100.
[9] Collins, L., Evans, A. C.: ANIMAL: automatic nonlinear image matching and anatomical labeling. In: Toga, A. W. (ed.), Brain Warping, Academic Press, 1999, 8, 133–142.
[10] Davatzikos, C.: Spatial transformation and registration of brain images using elastically deformable models. Computer Vision and Image Understanding, special issue on medical imaging, 1997, 66/2, 207–222.
[11] Davies, S. C., Hill, A. L., Holmes, R. B., et al.: Ultrasound quantification of respiratory organ motion in the upper abdomen. Br. J. Radiol., 1994, 67, 1096–1102.
[12] Didier, A. L., Villard, P. F., Bayle, J. Y., Beuve, M., Shariat, B.: Breathing thorax simulation based on pleura physiology and rib kinematics. Information Visualisation MediVis, IEEE ed., Zurich, Switzerland, 2007, 35–40.
[13] Heath, E., Seuntjens, J.: A direct voxel tracking method for four-dimensional Monte Carlo dose calculations in deforming anatomy. Medical Physics, 2006, 33/2, 434–445.
[14] ICRU Report 62: International Commission on Radiation Units and Measurements. Prescribing, recording and reporting photon beam therapy, supplement to ICRU Report 50, 1999.
[15] ICRU Report 50: International Commission on Radiation Units and Measurements. Prescribing, recording and reporting photon beam therapy, 1993.
[16] Kubo, H. D., Len, P. M., Minohara, S., Mostafavi, H.: Breathing-synchronized radiotherapy program at the University of California Davis Cancer Center. Med. Phys., 2000, 27, 346–353.
[17] Lester, H., Arridge, S.
R.: A survey of hierarchical non-linear medical image registration. Pattern Recognition, 1999, 32, 129–149.
[18] Low, D. A., Nystrom, M., Kalinin, E., et al.: A method for the reconstruction of four-dimensional synchronized CT scans acquired during free breathing. Med. Phys., 2003, 30, 1254–1263.
[19] Mah, D., Hanley, J., Rosenzweig, K. E., et al.: Technical aspects of the deep inspiration breath-hold technique in the treatment of thoracic cancer. Int. J. Radiation Oncology Biol. Phys., 2000, 48, 1175–1185.
[20] Maintz, J. B. A., Viergever, M. A.: A survey of medical image registration. Medical Image Analysis, 1998, 2/1, 1–36.
[21] Montagner, J., Barra, V., Boire, J. Y.: A geometrical approach of multiresolution management in the fusion of digital images. 1st IEEE Visual Information Expert Workshop, Paris, France, 2006.
[22] Moringlane, J. R.: Coordonnées polaires pour le système stéréotaxique de Talairach. Neurochirurgie, 1986, 32, 452–454.
[23] Musse, O., Heitz, F., Armspach, J. P.: Fast deformable matching of 3D images over multiscale nested subspaces. Application to atlas-based MRI segmentation. Pattern Recognition, 2003, 36/8, 1881–1899.
[24] Nehmeh, S. A., Erdi, Y. E.: Four-dimensional (4D) PET/CT imaging of the thorax. Med. Phys., 2004, 31, 3179–3186.
[25] Penska, K., Folio, L., Bunger, R.: Medical applications of digital image morphing. Journal of Digital Imaging, 2007, 20(3), 279–283.
[26] Rangarajan, A., Chui, H., Duncan, J. S.: Rigid point feature registration using mutual information. Medical Image Analysis, 1999, 3/4, 425–440.
[27] Rietzel, E., Pan, T., Chen, G. T. Y.: Four-dimensional computed tomography: image formation and clinical protocol. Med. Phys., 2005, 32, 874–889.
[28] Rizzo, G., Scifo, P., Gilardi, M. C., Bettinardi, V., Grassi, F., Cerutti, S., Fazio, F.: Matching a computerized brain atlas to multimodal medical images. NeuroImage, 1996, 6, 59–69.
[29] Salomon, M., Heitz, F., Perrin, G. R., Armspach, J. P.: A massively parallel approach to deformable matching of 3D medical images via stochastic differential equations. Parallel Computing, Elsevier, 2005, 31, 45–71.
[30] Santhanam, A. P., Willoughby, T., Shah, A., Meeks, S., Rolland, J. P., Kupelian, P.: Real-time simulation of 4D lung tumor radiotherapy using a breathing model. Proceedings of the 11th International Conference on Medical Image Computing and Computer-Assisted Intervention, 2008, 11 (Pt 2), 710–717.
[31] Sarrut, D., Delhay, B., Villard, P. F., Boldea, V., Beuve, M., Clarysse, P.: A comparison framework for breathing motion estimation methods from 4D imaging. IEEE Transactions on Medical Imaging, 2007, 26(12), 1636–1648.
[32] Schweikard, A., Berlinger, K., Roth, M., Sauer, O., Vences, L.: Volumetric deformation model for motion compensation in radiotherapy. Medical Image Computing and Computer-Assisted Intervention, MICCAI 2004, Saint-Malo, France, 2004, 925–932.
[33] Sixel, K. E., Ruschin, M., Tirona, R., et al.: Digital fluoroscopy to quantify lung tumor motion: potential for patient-specific planning target volumes. Int. J. Radiation Oncology Biol. Phys., 2003, 57, 717–723.
[34] Stindel, E.: Bone morphing – 3D morphological data for total knee arthroplasty. Annual Conference of the British Society for Computer Aided Orthopaedic Surgery, London, UK, 2006.
[35] Subsol, G., Thirion, J. P., Ayache, N.: Construction automatique d'atlas anatomiques morphométriques à partir d'images médicales tridimensionnelles: application à un atlas du crâne. Medical Image Analysis, 1996, 2/1, 1–36.
[36] Talairach, J., Tournoux, P.: Co-planar stereotaxic atlas of the human brain. 3-dimensional proportional system: an approach to cerebral imaging. Thieme Verlag, 1988.
[37] Thompson, P. M., Toga, A. W.: Warping strategies for intersubject registration. In: Bankman, I. N. (ed.), Handbook of Medical Imaging, Processing and Analysis, Academic Press, 2000, 36, 569–601.
[38] Thirion, J. P.: Diffusing models and applications. In: Toga, A. W. (ed.), Brain Warping, Academic Press, 1999, 9, 143–155.
[39] Thompson, P. M., MacDonald, D., Mega, M. S., Holmes, C. J., Evans, A. C., Toga, A. W.: Detection and mapping of abnormal brain structure with a probabilistic atlas of cortical surfaces. Journal of Computer Assisted Tomography, 1997, 21/4, 567–581.
[40] Vérard, L., Allain, P., Travère, J. M., Baron, J. C., Bloyet, D.: Fully automatic identification of AC and PC landmarks on brain MRI using scene analysis. IEEE Transactions on Medical Imaging, 1997, 16/5, 610–616.
[41] Villard, P. F., Beuve, M., Shariat, B., Baudet, V., Jaillet, F.: Simulation of lung behaviour with finite elements: influence of biomechanical parameters. IEEE Conference on Information Visualization, London (GB), 2005, 9–14.
[42] Villard, P. F., Beuve, M., Shariat, B., Baudet, V., Jaillet, F.: Lung mesh generation to simulate breathing motion with a finite element method. IEEE Conference on Information Visualization, London (GB), 2004, 194–199.
[43] Weiss, P. H., Baker, J. M., Potchen, E. J.: Assessment of hepatic respiratory excursion. J. Nucl. Med., 1972, 13, 758–759.
[44] Wolthaus, J. W., Schneider, C., Sonke, J. J., van Herk, M., Belderbos, J. S. A., Rossi, M. M. G., Lebesque, J. V., Damen, E. M. F.: Mid-ventilation CT scan construction from four-dimensional respiration-correlated CT scans for radiotherapy planning of lung cancer patients. Int. J. Radiation Oncology Biol. Phys., 2006, 65(5), 1560–1571.
[45] Wolthaus, J. W., Sonke, J. J., van Herk, M., Damen, E. M.: Reconstruction of a time-averaged midposition CT scan for radiotherapy planning of lung cancer patients using deformable registration. Med. Phys., 2008, 35(9), 3998–4011.
[46] Yang, D., Lu, W., Low, D. A., Deasy, J. O., Hope, A. J., Naqa, I. E.: 4D-CT motion estimation using deformable image registration and 5D respiratory motion modeling. Med. Phys., 2008, 35(10), 4577–4590.
[47] Zanetti, E. M., Crupi, V., Bignardi, C., Calderale, P. M.: Radiograph-based femur morphing method. Med. Biol. Eng. Comput., 2005, 43(2), 181–188.
[48] Zhang, T., Keller, H., O'Brien, M. J., Mackie, T. R., Paliwal, B.: Application of the spirometer in respiratory gated radiotherapy. Med. Phys., 2003, 30, 3165–3171.

Dr. Robert Laurent, Julien Henriet, Régine Gschwind, Prof. Ing. Libor Makovicka, DrSc.
E-mail: julien.henriet@pu-pm.univ-fcomte.fr
IRMA/ENISYS/FEMTO-ST, UMR 6174 CNRS, Montbéliard, France

Acta Polytechnica Vol. 52 No. 5/2012

Energy Estimation of Distributed Phase-Shift Beamforming

Viktor Černý
Dept. of Computer Science and Engineering, Faculty of Electrical Engineering, Czech Technical University, Technická 2, 166 27 Praha, Czech Republic
Corresponding author: cernyvi2@fel.cvut.cz

Abstract

Ad-hoc and sensor networks are composed of small, low-power devices which can connect among themselves without the help of an infrastructure. Research in this area has been both extensive and intensive, and is still very far from exhaustion.
Our work in this area is aimed at developing a new type of communication between groups of modules, capable of connecting clusters (groups) of devices which are separated by distances greater than the maximum transmission range of the devices themselves, without the help of relays or signal repeaters. In this paper we study the energy requirements for bidirectional communication between two clusters separated by a distance greater than the maximum transmission range of the modules, in the classic way (with the use of repeaters or relays) and by applying distributed phase-shift beamforming.

Keywords: ad-hoc networks, beamforming, phase-shift, distributed.

1 Introduction

Phase-shift beamforming is a method extensively used in radio networks in which devices are equipped with multiple antennas. Each antenna element is injected with a signal slightly shifted in phase (delayed) with respect to the others; because the signals travel slightly different distances towards the receiver, for a favourable phase combination the signals meet at the receiver in phase and thus have an energy higher than each of the signals taken separately. Such antenna systems (or antenna arrays) have been named "smart antennas" because, based on this method, the radiation pattern of the transmitter antenna array can be modified (making its signal stronger or weaker in desired directions) without physically orienting the device, just by modifying the phase of the signals fed to each antenna element.

Figure 1: different radiation patterns corresponding to different phase combinations of the signals injected into the elements of an antenna array (panels a–d).

The case studied in this paper is that of modules which are not equipped with antenna arrays: each device has only one single antenna. In order to be able to use the method of phase-shift beamforming, multiple devices have to share the same information to be sent and be able to synchronise the transmitted signals precisely. Such a mechanism has already been proposed in [6], [7], [8] and [1]; however, these papers ignore the fact that crystals are not perfect, and the phase synchronisation between the modules is lost due to the clock being slightly faster at one module than at another. Our aim is briefly to present a method for synchronizing the transmitted signals, simpler than the method in [3], and to evaluate the energy requirements for bidirectional communication between two clusters of modules (separated by a distance greater than their maximal transmission range), both in the method proposed here and in the classic way of installing repeaters (relays) between the two clusters.

This paper is structured as follows: the second section presents the distributed phase-shift beamforming mechanism and the proposed device; the third section is a theoretical approach towards the required energy estimations; the fourth section contains the experiments and the results; and the paper is concluded in the fifth section, where future work is also proposed.

Figure 2: (a) a network divided into two clusters; (b) the two clusters can be connected by installing repeaters; (c) the two clusters can be connected by distributed phase-shift beamforming.

2 Distributed phase-shift beamforming

There are four types of radio communications from the channel perspective: SISO, SIMO/MOSI, MISO/SOMI and MIMO. SISO (single input, single output) is the classic type: one antenna at the transmitter, one at the receiver.
SIMO/MOSI (single input, multiple output) works on the basis of signal processing at the receiver: the transmitter is equipped with only one antenna and the receiver with multiple antennas. Noise affects the antennas at the receiver slightly differently and, with the use of signal processing, the useful signal can be separated from the noise. MISO (multiple input, single output) describes communications where the transmitter has multiple antennas, regulated by different initial phases, such that the signals coming from these antennas superimpose at the receiver antenna, thus creating a stronger signal. MIMO (multiple input, multiple output) combines the properties of SIMO and MISO; to present this technique here would be both space-consuming and beyond the scope of the paper.

To exemplify the properties of beamforming, let us consider four antennas A1, A2, A3 and A4 symmetrically placed at coordinates (x, y): A1(−λ/2, 0), A2(λ/2, 0), A3(0, −λ/2) and A4(0, λ/2), acting as an antenna array. Each antenna is assumed to have a perfectly circular radiation pattern in the horizontal plane, as in Fig. 1a (λ is the wavelength of the signal). If all the antennas are fed identical, in-phase signals, the radiation pattern of the whole array becomes as in Fig. 1b. If the signal injected into antenna A4 is dephased by −90° or by +180°, taking the other three identical remaining signals as a reference, the radiation pattern of the array becomes as depicted in Fig. 1c and Fig. 1d, respectively. In conclusion, we can manipulate the radiation pattern of a transmitter equipped with an antenna array by altering the initial phases of the signals injected into the element antennas composing the array; a small numerical sketch follows.
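The following sketch (Python with NumPy; an illustration of the standard array-factor computation, not code from the paper) evaluates the far-field magnitude of this four-element array for the phase combinations discussed above:

import numpy as np

def array_factor(positions, phases, angles, wavelength=1.0):
    """|sum of unit-amplitude signals| radiated by isotropic elements."""
    k = 2.0 * np.pi / wavelength
    # Unit vectors towards the far-field observation angles (horizontal plane).
    u = np.stack([np.cos(angles), np.sin(angles)], axis=1)
    # Path-length phase of each element plus its injected phase offset.
    phase = k * (u @ positions.T) + phases  # shape: (n_angles, n_elements)
    return np.abs(np.exp(1j * phase).sum(axis=1))

lam = 1.0
pos = np.array([[-lam / 2, 0], [lam / 2, 0], [0, -lam / 2], [0, lam / 2]])
angles = np.linspace(0.0, 2.0 * np.pi, 360, endpoint=False)

for offsets in ([0, 0, 0, 0], [0, 0, 0, -np.pi / 2], [0, 0, 0, np.pi]):
    af = array_factor(pos, np.array(offsets, dtype=float), angles)
    print(offsets, "-> max |AF| =", round(af.max(), 2),
          "at", int(np.degrees(angles[af.argmax()])), "deg")

Changing the offset of A4 visibly redistributes the lobes, in the spirit of Figs. 1c and 1d.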
Let us now consider an ad-hoc network where all the modules are randomly placed over a 2D surface and each module is equipped with only one single antenna. Let us also assume that, due to the random placement of the modules, two clusters are formed, without the possibility of direct communication between them, as in Fig. 2a. There are three classic ways to interconnect the devices: by increasing the transmission power (this drains the batteries faster and can be limited by the hardware of the transceiver), by using higher-gain antennas (sometimes unsuitable because it leads to physically bigger antennas), or by placing repeaters between the two clusters, as in Fig. 2b. We are currently developing a fourth way for the clusters to interconnect: the modules of one cluster beamform their signal, thus creating a virtual transmitter whose power is sufficient to transmit directly to the other cluster, as in Fig. 2c.

If we assume the network presented in Fig. 2b, where the network is formed by two disjoint clusters K1 and K2 connected by a relay set R, the conditions for a connected network are: R ≠ ∅, R ∩ K1 ≠ ∅ and R ∩ K2 ≠ ∅. Assuming a source module s ∈ K1 and a destination module d ∈ K2, connected by a "best route" discovered by a routing algorithm of some sort, as in Fig. 2b, then for each bidirectional communication between s and d, in order to have reliable communication, any repeater r ∈ R has to perform four steps:
1. receive the data packet from the previous module;
2. transmit the data packet to the next module;
3. receive the acknowledgement from the next module;
4. transmit the acknowledgement to the previous module.
At the same time, the end devices perform only two operations each: s transmits a data packet and receives an ACK, while d receives a data packet and sends an ACK. In conclusion, because the number of operations performed by each relay is greater than the number of operations performed by each end device, we can expect the batteries to be exhausted at the relays sooner than at the end devices. In order to prevent this situation, we can install batteries with higher capacities at the relays, or we can increase the number of relay modules. There are, however, scenarios in which this improvement is impossible: for example, when there is a river in the area where the relay modules should be installed. In such a case, the clusters remain permanently disconnected.

By applying distributed phase-shift beamforming, s will broadcast the data packet to its neighbours, and then the whole group of modules will transmit the data packet directly to d in a synchronized manner. In order to reply, d will broadcast its ACK locally to its own neighbours, and then the whole group of modules will transmit the ACK in a synchronized manner directly to s, as in Fig. 2c. It is beyond the scope of this paper to present how the different modules know by how much to delay (dephase) their signals in order to superimpose correctly at the receiver; the cluster discovery mechanism has been studied, and the results were published in [5]. We can assume that, in order to send a message from s to d, each module m ∈ K1 has a precise phase delay Δφ_m which must be used so that cluster K1 can transmit to d properly; similarly for cluster K2, in order to have communication from d back to s.

The difficulty with this mechanism consists in maintaining the constant phase delay between the different modules: the oscillator crystals are not identical, and even minute imperfections, over long periods of time, lead to the loss of the precise phase delays. One solution might be to repeat the phase discovery algorithm presented in [5]; however, this is not feasible due to its exponential complexity. The solution that we have employed requires module s to be capable of transmitting simultaneously on two frequencies: f_D, on which data is sent, and f_S, a plain sine wave used for synchronization. The synchronization frequency f_S is chosen t times higher than the data frequency f_D, where t is a constant. Thus, module s is responsible both for broadcasting information to its neighbours and for maintaining the synchronization of its neighbours during the beamformed transmission to d. Each neighbour m of s receives the synchronization signal f_S; through a phase-locked loop it locks its own data signal f_D to it and then adds the corresponding phase delay Δφ_m. In this manner, the signals coming to d from all the modules in K1 meet in phase and thus their power increases, assuring transmission beyond the maximum range of any individual module taken separately.
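A sketch of how the delays Δφ_m can be chosen in this geometric picture (Python with NumPy; an idealized model that assumes the module and receiver positions are known, unlike the discovery algorithm of [5], and that ignores clock drift; the wavelength value is hypothetical):

import numpy as np

def phase_delays(module_xy, receiver_xy, wavelength):
    """Delay for each module so that all carriers arrive in phase at d."""
    dists = np.linalg.norm(module_xy - receiver_xy, axis=1)
    # Compensate the differing path lengths, modulo one wavelength.
    return (2.0 * np.pi * (dists - dists.min()) / wavelength) % (2.0 * np.pi)

rng = np.random.default_rng(1)
cluster = rng.uniform(0.0, 10.0, size=(4, 2))   # 4 modules in a 10 x 10 m cluster
receiver = np.array([1000.0, 5.0])              # distant destination module
lam = 0.7                                        # hypothetical carrier wavelength, m

dphi = phase_delays(cluster, receiver, lam)
d = np.linalg.norm(cluster - receiver, axis=1)
# Received field: each module contributes exp(i(dphi - k d_m)), attenuated as 1/d_m.
k = 2.0 * np.pi / lam
field = np.exp(1j * (dphi - k * d)) / d
print("coherent |sum|  :", abs(field.sum()))
print("incoherent bound:", (1.0 / d).sum())  # equal when perfectly in phase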
The internal architecture of the module is presented in Fig. 3; it is currently patent pending, under patent application reg. no. 2011-25254 / 2011-785, 2011. The components are the following:
1. Power supply: provides electric power to the active components.
2. Internal logic unit: performs the main tasks of the module (initiates the phase search, dictates the regime based on the channel access methods, computes the phase increment, etc.).
3. Carrier generator: generates two in-phase, synchronised sine waves; one wave has frequency f_D and is the data carrier, the other has frequency f_S = t·f_D and is the synchronisation signal. This can be achieved, for example, by a crystal capable of generating f_S followed by a frequency divider which generates f_D from f_S by dividing it by t > 1.
4. Transceiver: using a modulation technique, it is responsible for sending and receiving information on the data wave.
5. Phase synchroniser: in essence a phase-locked loop (PLL) capable of synchronising two waves.
6. Phase delayer: delays the phase of a signal by a given amount.
7. Mode switch: commutes the module from one regime (or mode of operation) to another.
8. Antenna: used for transmitting and receiving; it must allow frequencies f_D and f_S to pass through.

Figure 3: device capable of distributed phase-shift beamforming. The full functioning of the device is described in the patent application form [4].

3 Energy estimation

Case A. Let us start with classic bidirectional store-and-forward communication between two radio modules m1 and m2, separated by distance d in free space, as in Fig. 4a. For a classic bidirectional communication the minimal number of packets is 4: DATA (m1 → m2), ACK (m2 → m1), DATA (m2 → m1) and ACK (m1 → m2). We can assume that the size of the ACK packet is k times the size of the DATA packet, where 0 < k < 1, and we note sizeof(ACK) = k · sizeof(DATA). From the laws of electromagnetic propagation in free space (FSPL) it can be seen that the received power decreases with the distance squared. For the energy estimation, because the methods are compared with each other, it is safe to assume (in order to simplify the calculus) that the antennas have no gain. In the same manner, we consider the receiving operation as totally passive (no energy required). In this case, according to FSPL, we can write P_r ∼ P_t/d²: the received power is proportional to the transmitted power and inversely proportional to the squared distance. Thus, the energy E required to transmit a data packet over distance d satisfies

E ∼ d² · sizeof(DATA),   (1)

and we can choose a constant c (dependent on the antenna gains, apertures, the minimal required power at the receiver, the propagation environment and sizeof(DATA)) to balance this relation:

E(DATA) = c·d².   (2)

In this case E(DATA) = c·d² = E_D and E(ACK) = k·c·d² = E_A = k·E_D. The total energy required for the minimal bidirectional communication becomes

E_tot = 2(E_D + E_A) = 2c·d²(1 + k).   (3)

The energy needed per single module is

E_mod = E_D + E_A = c·d²(1 + k).   (4)

Case B. Let us now consider the same case, though now between m1 and m2 there is a repeater r, placed at distances d1 and d2 from m1 and m2, respectively, as in Fig. 4b. For duplex store-and-forward communication the minimal number of packets is 8: DATA(m1 → r), DATA(r → m2), ACK(m2 → r), ACK(r → m1), DATA(m2 → r), DATA(r → m1), ACK(m1 → r) and ACK(r → m2). In order to send one data packet from m1 to m2 through r, the total energy required will be, according to (2),

E = E_1 + E_2 = c·d1² + c·d2² = c(d1² + (d − d1)²).   (5)

Differentiating with respect to d1 gives dE/dd1 ∝ 2d1 − 2(d − d1), which vanishes at d1 = d/2, so E is minimal when d1 = d2 = d/2: the most energy-efficient placement of a repeater is in the middle of the distance between the modules.
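A quick numerical confirmation of the midpoint optimum (Python; the constant c only scales the curve and drops out of the minimisation):

import numpy as np

d = 1000.0
d1 = np.linspace(0.0, d, 10001)
energy = d1**2 + (d - d1)**2        # proportional to E = c(d1^2 + d2^2)
print("optimal d1 =", d1[energy.argmin()], "m")  # -> 500.0, i.e. d/2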
In this case we obtain E(DATA) = c(d/2)² = c·d²/4 = E_D/4 and, in the same manner, E(ACK) = k·c(d/2)² = E_A/4 = k·E_D/4. The total energy needed is

E_tot = 4(E(DATA) + E(ACK)) = 4(E_D/4 + k·E_D/4) = c·d²(1 + k).   (6)

The energy drain per end-module is given by

E_mod = E(DATA) + E(ACK) = (1/4)·c·d²(1 + k),   (7)

and for the relay:

E_r = 2(E(DATA) + E(ACK)) = (1/2)·c·d²(1 + k).   (8)

Figure 4: different module placements. (a) two modules; (b) two modules and one repeater; (c) two modules and N relays; (d) m modules per cluster; (e) m modules per cluster and one relay; (f) m modules per cluster and N relays; (g) m modules per cluster, phase-shift beamforming.

It can be seen that even in this most effective case the need at the router is twice the need at any end-device; thus the router will exhaust its battery twice as fast as any of the modules. However, if the router can be given a battery of twice the capacity of any end-module, this will compensate the higher power need.

Case C. If between the devices there are N relays (used in a round-robin fashion), as in Fig. 4c, then the average power consumption per end device stays the same as in (7),

E_mod = E(DATA) + E(ACK) = (1/4)·c·d²(1 + k),   (9)

and per repeater:

E_r = (1/N) · 2(E(DATA) + E(ACK)) = (1/(2N))·c·d²(1 + k).   (10)

The total energy need remains the same as in (6). Thus, installing more repeaters can compensate the rapid battery exhaustion seen in case B.

Case D. Assuming the same topology as in case A (Fig. 4a), though with n data packets per end-device, we obtain E_tot = 2n·c·d²(1 + k), and the energy drain per end-module is E_mod = n·c·d²(1 + k).

Case E. Assuming the same topology as in case B (Fig. 4b), though with n data packets per end-device, we obtain

E_tot = n·c·d²(1 + k),   (11)
E_mod = (n/4)·c·d²(1 + k),   (12)
E_r = (n/2)·c·d²(1 + k).   (13)

Case F. Assuming the same topology as in case C (Fig. 4c), though with n data packets per end-device, we obtain

E_tot = n·c·d²(1 + k),   (14)
E_mod = (n/4)·c·d²(1 + k),   (15)
E_r = (n/(2N))·c·d²(1 + k).   (16)

Case G. Considering now m modules per cluster, each generating n data packets, as in Fig. 4d, then

E_tot = 2mn·c·d²(1 + k),   (17)
E_mod = n·c·d²(1 + k).   (18)
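For convenience, the formulas of the relayed cases can be collected into a single helper; a sketch (Python; the numeric values of c, d and k below are arbitrary illustrations, with k = 12/128 echoing the packet sizes used later in the experiments):

def relay_case(c, d, k, n_packets=1, m=1, n_relays=0):
    """Total, end-module and per-relay energies for cases A-I."""
    e_exchange = c * d**2 * (1 + k)          # one DATA+ACK pair, direct
    if n_relays == 0:                        # cases A, D, G
        return {"total": 2 * m * n_packets * e_exchange,
                "end_module": n_packets * e_exchange,
                "relay": None}
    # Relays sit mid-way and are used round-robin (cases B, C, E, F, H, I).
    return {"total": m * n_packets * e_exchange,
            "end_module": n_packets * e_exchange / 4,
            "relay": m * n_packets * e_exchange / (2 * n_relays)}

print(relay_case(c=1e-6, d=1000, k=12 / 128))              # case A
print(relay_case(c=1e-6, d=1000, k=12 / 128, n_relays=1))  # case B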
the steps required for 1-packet two-way communication between modules a1 and b1 are: 1. module a1 broadcasts data in cluster a on frequency fd, fig. 5a; 2. cluster a is synchronised by a1 on frequency fs and in one step transmits data to module b1, fig. 5b; 3. module b1 broadcasts an acknowledgement in cluster b on frequency fd, fig. 5c; 4. cluster b is synchronised by b1 on frequency fs and in one step transmits the acknowledgement to module a1, fig. 5d; 5. module b1 broadcasts data in cluster b on frequency fd, fig. 5e; 6. cluster b is synchronised by b1 on frequency fs and in one step transmits data to module a1, fig. 5f; 7. module a1 broadcasts the acknowledgement in cluster a on frequency fd, fig. 5g; 8. cluster a is synchronised by a1 on frequency fs and in one step transmits the acknowledgement to module b1, fig. 5h. the energy requirements for each step are: step 1 — module a1 broadcasts data in cluster a on frequency fd, fig. 5a: • at a1: e1 = ebd = sed; • at other modules: 0. step 2 — cluster a is synchronised by a1 on frequency fs and in one step transmits data to module b1, fig. 5b: • at each ai: e2 = e(data)m = cd2 m ; • a1 has also to synchronise the cluster: e3 = esd = tebd, where t is a constant; • at modules belonging to cluster b: 0. step 3 — module b1 broadcasts the acknowledgement in cluster b on frequency fd, fig. 5c: • at b1: e4 = eba = sea; • at other modules: 0. step 4 — cluster b is synchronised by b1 on the frequency fs and in one step transmits the acknowledgement to module a1, fig. 5d: • at each bi: e5 = e(ack)m = kcd2 m ; • b1 has also to synchronise the cluster: e6 = esa = teba, where t is a constant; • at modules belonging to cluster a: 0. step 5 — module b1 broadcasts data in cluster b on frequency fd, fig. 5e: • at b1: e7 = ebd = sed; • at other modules: 0. step 6 — cluster b is synchronised by b1 on frequency fs and in one step transmits data to module a1, fig. 5f: • at each bi: e8 = e(data)m = cd2 m ; • b1 also has to synchronise the cluster: e9 = esd = tebd, where t is a constant; • at modules belonging to cluster a: 0. step 7 — module a1 broadcasts the acknowledgement in cluster a on frequency fd, fig. 5g: 31 acta polytechnica vol. 52 no. 5/2012 a1 local data bcast. beamformed data transmission a → b1. b1 local ack bcast. beamformed ack transmission b → a1. b1 local data bcast. beamformed data transmission b → a1. a1 local ack bcast. beamformed ack transmission a → b1. figure 5: steps of a beamformed bidirectional communication. • at a1: e10 = eba = sea; • at other modules: 0. step 8 — cluster a is synchronised by a1 on frequency fs and in one step transmits the acknowledgement to module b1, fig. 5h: • at each ai: e11 = e(ack)m = kcd2 m ; • a1 also has to synchronise the cluster: e12 = esa = teba, where t is a constant; • at modules belonging to cluster b: 0. the total energy required by the whole network becomes the sum of e1 to e12, where some of the terms have to be multiplied by m (the number of nodes): etot = 2 ( ebd + m cd2 m + tebd + eba + mk cd2 m + teba ) . (25) we have previously seen that ed ∼ d2 and we noted ed = cd2. in the same manner, because the distances between modules belonging to the same cluster have the same order with cluster size d, we can write that ebd ∼ d2 and ebd = cd2. thus, by replacing these values into (25) we obtain: etot = 2 ( cd2+cd2+tcd2+kcd2+kcd2+tkcd2 ) = 2c(1 + k) ( (1 + t)d2 + d2 ) . 
(26) energy need per device: for module a1: ea1 = e1 + e2 + e3 + e10 + e11 + e12: ea1 = cd 2 + tcd2 + kcd2 + tkcd2 + cd2 m (1 + k) = c(1 + k) ( (1 + t)d2 + d2 m ) ; (27) for module b1: eb1 = e4 + e5 + e6 + e7 + e8 + e9: eb1 = c(1 + k) ( (1 + t)d2 + d2 m ) = ea1 ; (28) for the other modules in cluster a: eca = e2 + e11: eca = cd2 m (1 + k); (29) for the other modules in cluster b: ecb = e5 + e8: ecb = cd2 m (1 + k) = eca. (30) 32 acta polytechnica vol. 52 no. 5/2012 in conclusion: • the total energy needed per network: etot = 2c(1 + k) ( (1 + t)d2 + d2 ) . (31) • the energy for devices which are data sources and sync sources: eactive = c(1 + k) ( (1 + t)d2 + d2 m ) . (32) • the energy for devices which only help transmission: epassive = cd2 m (1 + k). (33) case k. let us take the same topology as in case j, but this time each device generates n data packets. in this case, the average energy per device can be calculated in the following manner: • we start by counting the data packets in the network: 2mn. • in one cluster, each device is active for n packets (its own packets) and passive for the remaining (m−1)n packets. in total, each cluster is responsible for mn packets (half of the total number of packets). thus: the total energy per network becomes: etot = 2cmn(1 + k) ( (1 + t)d2 + d2 ) ; (34) the energy per device is emod = neactive + (m − 1)nepassive: emod = nc(1 + k) ( (1 + t)d2 + d2 m ) . (35) final conclusions. let us compare the total energy requirements in cases k (general beamforming) and i (general relay): etot,k etot,i = 2cmn(1 + k) ( (1 + t)d2 + d2 ) mncd2(1 + k) = 4 ( 1 + (1 + t) ( d d )2) . (36) if d � d, then etot,k etot,i → 2, meaning that the overall energy consumption beamforming presented in case k is twice as inefficient as relay transmission of the same order. let us now compare the energy requirements of end devices in cases k (general beamforming) and i (general relay): emod,k emod,i = nc(1 + k) ( (1 + t)d2 + d 2 m ) + (m − 1)ncd 2 m (1 + k) 1 4ncd 2(1 + k) = 4 ( 1 + (1 + t) ( d d )2) . (37) if d � d, then emod,k etot,i → 4. meaning that the enddevice energy consumption beamforming presented in case k is four times as inefficient as relay transmission of the same order. finally, let us now compare the energy requirement of end devices in case k (general beamforming) and the energy requirement of relays in case i (general relay): emod,k er,i = nc(1 + k) ( (1 + t)d2 + d 2 m ) + (m − 1)ncd 2 m (1 + k) mn 2n cd 2(1 + k) = 2n m ( 1 + (1 + t) ( d d )2) . (38) if d � d, then emod,k etot,i → 2n m . in this case, if m = n (the number of end-devices is equal to the number of relays), the energy consumption in the case of beamforming per device is twice the energy consumption on a relay. proposition. in a clustered network, relaying is at limit twice more efficient for equivalent networks, per total power consumption and per end device. in environments in which relays cannot be installed or if the graph cut (of the network) contains a number of relays less than half of the number of end-devices, then the network connected through phase-shift beamforming will survive more than relaying proof. the proof has been already provided in a constructive way in this section. of course, the estimation here is an approximation, because we have ignored the small distance differences inside and outside the clusters. this however gives a better understanding of the quantitative improvements brought by phase-shift beamforming and in this way give a maximum theoretical limit. 
Figure 6: The relay module position is varied along the line uniting the centres of the two clusters.
Figure 7: Packets delivered between clusters in both directions. The vertical axis represents the number of packets, and the horizontal axis represents the distance between the relay and the centre of cluster A (in metres).

4 Experiments and results

The simulations presented in this section are based on the following assumptions:

• a free-space propagation model (no obstacles for the waves, no reflections);
• a battery capacity of 200 mWh for each device (larger capacities are also permitted, but the simulation then lasts longer);
• two clusters, each containing 4 modules;
• square cluster areas;
• random placement of the modules in the cluster area;
• Pmin = −75 dBm (minimal power);
• data rate DR = 115200 bps;
• antenna gain: 1.5 dBi, isotropic (for the results presented here, isotropic antennas were used to show the contribution of the distributed phase-shift beamforming separately; anisotropic antennas can also be simulated);
• data packet size: 128 B (the size of a ZigBee packet);
• acknowledgement size: 12 B;
• synchronization constant t = 2;
• random traffic generated from each cluster to the other cluster (random source module and random destination module, the source and destination belonging to different clusters);
• each presented result is the value averaged over 10 simulations;
• in the results, only the number of delivered packets is presented, because the time intervals are proportional to the number of packets.

Figure 8: 1 to 4 relay nodes in between the clusters.
Figure 9: Packets delivered between clusters in both directions. The vertical axis represents the number of packets, and the horizontal axis is the number of the test performed.

4.1 Experiment 1

Two clusters of 10 × 10 m, containing 4 modules each, are placed 1000 m from each other, with a single relay module in between. The relay is moved along the line which passes through the centres of the two clusters, from the proximity of the first cluster to the proximity of the second cluster (Fig. 6). The goal is to determine the number of packets delivered and the time interval during which the network remains functional. The result of this experiment is presented in Fig. 7, and it shows that the best result (the longest uptime) is achieved when the relay is placed in the middle of the distance between the two clusters. In all the tested cases, the clusters eventually become disconnected due to exhaustion of the relay battery. The fact that the optimum is reached when the relay sits in the middle of the distance has been shown theoretically; the results match the prediction.

4.2 Experiment 2

Two clusters of 10 × 10 m, containing 4 modules each, are placed 1000 m from each other, with 1 to 4 relay modules in between, used in a round-robin fashion (Fig. 8). The goal is again to determine the number of packets delivered and the network uptime. The results depicted in Fig. 9 show a dramatic improvement as the number of relay modules increases. In all the cases, the clusters become disconnected due to exhaustion of the relay batteries.
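The free-space assumption above can be made concrete with a Friis link-budget check. The sketch below estimates the transmit power a single node would need in order to reach the −75 dBm threshold at 1000 m with the 1.5 dBi antennas from the assumption list; the 2.4 GHz carrier is our assumption (the paper fixes only the ZigBee packet size, not the band).

```python
import math

def fspl_db(distance_m, freq_hz):
    """Free-space path loss in dB (Friis), matching the propagation model above."""
    return (20 * math.log10(distance_m) + 20 * math.log10(freq_hz)
            + 20 * math.log10(4 * math.pi / 3e8))

P_RX_MIN_DBM = -75.0   # minimal receive power, from the assumption list
GAIN_DBI = 1.5         # antenna gain, from the assumption list
FREQ_HZ = 2.4e9        # assumed carrier frequency (not stated in the paper)

loss = fspl_db(1000.0, FREQ_HZ)
p_tx_dbm = P_RX_MIN_DBM + loss - 2 * GAIN_DBI
print(f"required single-node tx power at 1 km: {p_tx_dbm:.1f} dBm")
print(f"= {10 ** (p_tx_dbm / 10):.0f} mW")   # roughly 160 mW under these assumptions
```

A power of this order drains a 200 mWh battery quickly, which is consistent with the relay battery being the first resource exhausted in experiments 1 and 2.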
This leads to one conclusion: if relays can be installed, they are the preferred method.

4.3 Experiment 3

Two clusters of 10 × 10 m, containing 4 modules each, are placed 450 m from each other. Cluster B is then moved along the axis that connects the centres of the two clusters, away from cluster A, up to 2000 m (Fig. 10). There are no relay nodes, only phase-shift beamformed communication. The goal is to determine the number of packets delivered and the time interval during which the network keeps functioning.

Figure 10: Cluster B moves away from cluster A.
Figure 11: Packets delivered between clusters in both directions. The vertical axis represents the number of packets, and the horizontal axis represents the distance between the clusters (in metres).

The results are depicted in Fig. 11, which shows how distance affects the overall number of delivered packets. The disconnection is caused by exhaustion of the batteries of either the source node or the destination node. What is really interesting in this graph (Fig. 11) is that at a distance of 1000 m between the clusters, the number of acknowledged packets is approximately 3.7 million. By comparison, the best-case scenario for a single relay is just below one million. This proves that phase-shift beamforming can outperform classic relaying by a factor of about 3 if the number of relays is very low.

4.4 Experiment 4

Two clusters containing 4 modules each are placed 2000 m from each other. The size of each cluster is increased from the initial 10 × 10 m, while its centre remains at the same coordinates. This tests the stability of phase-shift beamformed communication between clusters (Fig. 12). The results are depicted in Fig. 13, and show that the number of delivered packets is almost constant at around one million, which corresponds to the result obtained in experiment 3 for the same distance between the clusters.

Figure 12: Increasing the cluster sizes.
Figure 13: The dependency of the number of delivered packets on the cluster size. On the vertical axis, the number of delivered packets; on the horizontal axis, the size of the clusters in metres.

5 Conclusions and future work

The work presented in this paper aimed to determine the advantages and disadvantages of using distributed phase-shift beamforming in comparison with classic repeater techniques for interconnecting two distant clusters of radio devices. In this work, certain energy consumptions were ignored, e.g. for the data processing before a packet is transmitted or after it is received. However, the expected power needs are constant for each received or processed data packet, depending only on the size of the respective packet. Because the results are presented both in absolute form and in relative form (beamforming against a classic repeater), the influence of these constant power terms on the relative results is minimal.

The previous section has shown that, by comparing the worst result achievable through distributed phase-shift beamforming with the best result obtainable by installing repeaters, the technique presented in this paper is about three times better than the classic method. However, in cases where installing a repeater is an option, the repeater remains the preferred method.

The power of this method is due to the fact that it scales with the number of transmitters in each cluster. This paper presents only the case of four modules per cluster, because the algorithm that finds the optimal phase combinations [5] has (for the moment) exponential complexity, and in order to achieve results within a reasonable period of time and with reasonable power consumption, the number of devices must be limited.
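The exponential complexity mentioned above is easy to see in a brute-force version of the phase search: with q quantized phase steps and m transmitters there are q^m combinations to test. The sketch below is a toy one-dimensional version of such a search; the 8-level quantization and the 0.125 m wavelength (2.4 GHz) are assumptions, and this is not the algorithm of [5], only an illustration of why m must stay small.

```python
import math, cmath, itertools

def best_phases(positions, target, wavelength, levels=8):
    """Brute-force search over quantized phase offsets for len(positions)
    transmitters, maximising the coherent field amplitude at `target`.
    Cost grows as levels**m, the exponential complexity noted above."""
    k = 2 * math.pi / wavelength
    steps = [2 * math.pi * i / levels for i in range(levels)]
    best_amp, best_combo = -1.0, None
    for combo in itertools.product(steps, repeat=len(positions)):
        field = sum(cmath.exp(1j * (phi - k * abs(target - p)))
                    for phi, p in zip(combo, positions))
        if abs(field) > best_amp:
            best_amp, best_combo = abs(field), combo
    return best_amp, best_combo

# four modules spread over a 10 m cluster, receiver 1000 m away (1-d geometry)
amp, _ = best_phases([0.0, 3.1, 6.2, 9.5], 1000.0, wavelength=0.125)
print(amp)   # approaches 4.0, the coherent gain of m = 4 transmitters
```

With 8 levels and 4 modules the search already needs 4096 evaluations; doubling the cluster to 8 modules raises this to about 16.8 million, which is the scaling limit the conclusion refers to.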
Work is in progress on a faster cluster discovery algorithm for this technology.

A final note on the work presented in this paper: intra-cluster communications have been ignored entirely (with the exception of broadcasts, which are necessary for inter-cluster communication). The reason is to avoid creating interference between intra- and inter-cluster communications. Inter-cluster communication requires much greater resources and must be better protected against interference. There are two ways to achieve this protection: by dividing the time interval into separate time zones, or by using a new technique that we propose. From the point of view of time separation, there are two orthogonal time chunks for communication: an inter-cluster chunk and an intra-cluster chunk; it can be said that this paper details the inter-cluster chunk and ignores the other one. However, our team is confident that there is another solution: vCSMA/CA (virtual carrier sense multiple access with collision avoidance). This is work in progress, and it builds on the fact that the problem of interference has already been solved for individual devices by RTS/CTS mechanisms or by busy-tone mechanisms (BTMA, busy tone multiple access [2]). In our case, instead of individual transmitters we have virtual transmitters.

Acknowledgements

The research reported in this paper has been supported by the Ministry of Education, Youth and Sports of the Czech Republic under research program MSM 6840770014, and by the Grant Agency of the Czech Technical University in Prague, grant No. SGS11/158/OHK3/3T/13.

References

[1] H. Aghvami, M. Dohler, J. Dominguez. Link capacity analysis for virtual antenna arrays. In IEEE Vehicular Technology Conference, 2002.
[2] D. Kivanc-Tureli, P. Nehaben, U. Tureli. Effective channel utilization using the RI-BTMA protocol. In Military Communications Conference, MILCOM 2007. IEEE, 2007, pp. 1–7.
[3] U. Madhow, R. Mudumbai, G. Barriac. On the feasibility of distributed beamforming in wireless networks. IEEE Transactions on Wireless Communications 6(4), 2007.
[4] A. Moucha, V. Cerny, J. Kubr. Distributed system for beamforming. Patent application, reg. no. 2011-25254 / 2011-785, 2011.
[5] A. Moucha, J. Gattermayer. Cluster discovery in phase-shift beamformed ad-hoc and sensor networks. In Proceedings of the International Wireless Communications and Mobile Computing Conference (IWCMC), IEEE, 2011.
[6] H. Ochiai, P. Mitran, H. V. Poor, et al. Collaborative beamforming for distributed wireless ad hoc sensor networks. IEEE Transactions on Signal Processing 53:4110–4124, 2005.
[7] S. Servetto, A. S. Hu. Optimal detection for a distributed transmission array. In IEEE International Symposium on Information Theory, 2003.
[8] G. Wornell, J. Laneman. Distributed space-time-coded protocols for exploiting co-operative diversity in wireless networks. IEEE Transactions on Information Theory 49(10):2415–2425, 2003.

Optimization of fuzzy logic controller for supervisory power system stabilizers

Y. A. Al-Turki, A.-F. Attia, H. F. Soliman

Abstract

This paper presents a powerful supervisory power system stabilizer (PSS) using an adaptive fuzzy logic controller driven by an adaptive fuzzy set (AFS). The system under study consists of two synchronous generators, each fitted with a PSS, which are connected via double transmission lines. Different types of PSS-controller techniques are considered.
The proposed genetic adaptive fuzzy logic controller (GAFLC)-PSS, using 25 rules, is compared with a static fuzzy logic controller (SFLC) driven by a fixed fuzzy set (FFS), which has 49 rules. Both fuzzy logic controller (FLC) algorithms use the speed error and its rate of change as the input vector. The adaptive FLC algorithm uses a genetic algorithm to tune the parameters of the fuzzy set of each PSS. The FLCs are simulated and tested when the system is subjected to different disturbances over a wide range of operating points. The proposed GAFLC using AFS reduced the computational time of the FLC, as the number of rules is reduced from 49 to 25. In addition, the proposed adaptive FLC driven by a genetic algorithm also reduced the complexity of the fuzzy model, while achieving a good dynamic response of the system under study.

Keywords: fuzzy logic controller, adaptive fuzzy set (AFS), fixed fuzzy set (FFS), genetic algorithm.

1 Introduction

Researchers usually employ the simple one-machine infinite-bus system to study a novel or modified control technique. The analysis of this simplified model is only indicative of generator behaviour when connected to a rigid system; it cannot provide complete information about generator behaviour when connected to an oscillating system of comparable size. This can be achieved by replacing the infinite bus by another synchronous generator. In this case, the mutual influence between the two machines depends not only on the relative sizes of the two machines, but also on their parameters and the initial working conditions [1].

The main stability-indicating factor in the two-machine system is the instantaneous variation of the angle between their rotors, which must be convergent for "synchronous" and "steady state" operation. If the system is subjected to various disturbances, e.g. a change in load, a sudden transient short circuit, or some other abnormal condition, the machines will be able to remain synchronized if the angle between the rotors does not grow in an increasing manner or does not "theoretically" exceed the stability limit [2,3]. In general, a study of a two-machine system is acknowledged to represent a large power system concentrated in two distinct areas, connected by a tie-line or by a short transmission line.

2 Power system structure and modeling

The system under study is shown in Figure 1. It consists of two synchronous generators, connected together by two short parallel transmission lines. The generators feed local loads at their terminal bus bars. Each generator is equipped with an automatic voltage regulator (AVR) as the main excitation control; the PSS also supports the excitation control of each generator. Each synchronous generator is represented by a third-order model comprising three mathematical equations: two electromechanical and one electromagnetic. A mathematical model of each generator may be written as follows [4,5]:

\[ \dot\omega = \frac{1}{M}\left(T_m - T_e - T_d\right), \qquad (1) \]
\[ \dot\delta = \omega_b(\omega - 1), \qquad (2) \]
\[ \dot E'_q = \frac{1}{T'_{do}}\left(E_{fd} - E'_q - (x_d - x'_d)\,i_d\right). \qquad (3) \]

The electric output power is given by the following equation:

\[ T_e \approx P_e = \frac{E'_q V_t}{x'_d}\sin\delta + \frac{V_t^2\,(x'_d - x_q)}{2\,x'_d\,x_q}\sin 2\delta, \qquad \bar E'_q = \bar V_t + j x'_d\,\bar i_d + j x_q (j\bar i_q), \qquad (4) \]

where ω is the mechanical angular speed, M is the inertia constant, and Tm, Te and Td are the mechanical, electrical and damping torques, respectively. The symbol δ defines the power angle, and ωb is the base angular speed. E′q is the voltage behind the transient quadrature axis.
T′do is the field-winding open-circuit time constant (sec). Efd defines the internal excitation voltage of the machine, while xd and x′d are the synchronous and transient direct-axis reactances, respectively, of the synchronous machine. Vt is the terminal voltage of the machine. The dot denotes the first time derivative of the variable.

Fig. 1: The two-machine system under study.

The mathematical model of the AVR and the exciter of each machine is given by [5]:

\[ \dot E_{fd} = \frac{K_a}{T_a}\left(V_{ref} - V_t + u_{PSS}\right) - \frac{E_{fd}}{T_a}, \qquad (5) \]

where

\[ V_t = \sqrt{V_d^2 + V_q^2}, \qquad V_d = \frac{x_q E \sin\delta}{x_e + x_q}, \qquad V_q = \sqrt{V_t^2 - V_d^2}. \]

In the above equation, Vref is the reference terminal voltage, uPSS is the output of the power system stabilizer, Ta is the exciter time constant, and Vd and Vq are the direct- and quadrature-axis components of the terminal voltage. For the conventional lead-lag PSS (CPSS), the following transfer function is considered during the simulation phase of the system under study:

\[ u_{PSS} = -\frac{K_i}{K_a}\left(\frac{sT_q}{1+sT_q}\cdot\frac{1+sT_1}{1+sT_2}\right)\dot\delta, \qquad (6) \]

where Ki and Ka are constants, Tq is the time constant to be compensated, while T1 and T2 are the time constants of the lead-lag compensating network. More details on the CPSS can be found in references [5,6]. The values of these parameters and the controller gains are given in Appendix A. The simulation study of the two machines is intended to determine their behaviour in response to disturbances of the driving torque and terminal voltage of each generator. A numerical sketch of this machine model is given below.
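The model of equations (1)-(5) can be integrated directly. The sketch below is a toy single-machine integration using the machine 1 data from Appendix A and a 10% torque step; the constant terminal voltage, the 50 Hz base frequency and the simple expression used for the d-axis current are our simplifying assumptions, whereas the paper solves the full two-machine network with the PSS in the loop.

```python
import numpy as np

xd, xdp, xq = 1.2, 0.19, 0.743          # machine 1 reactances (appendix A)
H, Tdo, Dd  = 4.63, 7.76, 2.0           # inertia, T'do, damping
Ka, Ta      = 400.0, 0.02               # AVR gain and time constant
wb, Vt      = 2 * np.pi * 50, 1.05      # base speed (50 Hz assumed), terminal voltage
M = 2 * H                                # per-unit inertia constant

def pe(eqp, delta):
    """Electrical power, eq. (4)."""
    return (eqp * Vt / xdp) * np.sin(delta) \
        + Vt**2 * (xdp - xq) / (2 * xdp * xq) * np.sin(2 * delta)

delta, w, eqp = 0.5, 1.0, 1.0
i_d  = (eqp - Vt * np.cos(delta)) / xdp  # assumed SMIB-style d-axis current
efd  = eqp + (xd - xdp) * i_d            # field voltage consistent with eq. (3)
Vref = Vt + efd / Ka                     # chosen so the AVR starts in equilibrium
Tm   = pe(eqp, delta)

dt, hist = 1e-3, []
for k in range(int(10 / dt)):
    if k == int(1 / dt):
        Tm *= 1.10                       # 10 % step in mechanical torque, as in 5.2
    Te  = pe(eqp, delta)
    i_d = (eqp - Vt * np.cos(delta)) / xdp
    w     += dt * (Tm - Te - Dd * (w - 1.0)) / M          # eq. (1)
    delta += dt * wb * (w - 1.0)                          # eq. (2)
    eqp   += dt * (efd - eqp - (xd - xdp) * i_d) / Tdo    # eq. (3)
    efd   += dt * ((Ka / Ta) * (Vref - Vt) - efd / Ta)    # eq. (5), u_PSS = 0 here
    hist.append((k * dt, delta, w))
```

The logged δ(t) shows the damped rotor swing that the PSS controllers compared in section 5 are designed to suppress faster.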
3 Fuzzy logic controller

Fuzzy control systems are rule-based systems: a set of fuzzy rules represents the FLC mechanism for adjusting the effect of certain system stimuli. The aim of fuzzy control systems is thus to replace a skilled human operator with a fuzzy rule-based system. The FLC also provides an algorithm which can convert a linguistic control strategy, based on expert knowledge, into an automatic control strategy. Figure 2 depicts the basic configuration of the FLC: it consists of a fuzzification interface, a knowledge base, decision-making logic, and a defuzzification interface [7].

Fig. 2: Generic structure of the fuzzy logic controller.
Fig. 3: Membership functions (MFs) of the speed deviation for the SFLC.
Fig. 4: Membership functions of the speed deviation change for the SFLC.

a. Global input variables

The fuzzy input vector of each SFLC-PSS of each generator consists of two variables: the generator speed deviation, Δω, and the speed deviation change, Δω′. Two static fuzzy controllers are designed: one with seven linguistic variables, using fixed fuzzy sets, for each input variable, as shown in Figures 3 and 4; the second SFLC has five linguistic variables, using fixed fuzzy sets, for each input variable, indicated by the solid lines in Figures 6 and 7. The fuzzy sets for the output variable, based on seven fuzzy sets, are shown in Figure 5, and those based on five fuzzy sets are shown in Figure 8, indicated by the solid lines. The linguistic variables used are: PL (positive large), PM (positive medium), PS (positive small), Z (zero), NS (negative small), NM (negative medium) and NL (negative large), as indicated in Tables 1-2.

Table 1: Look-up table relating the input and output variables in fuzzy set form, for the seven fuzzy sets of the SFLC (rows: speed deviation change Δω′; columns: speed deviation Δω):

        NL   NM   NS   Z    PS   PM   PL
  NL    NL   NL   NL   NL   NM   PS   Z
  NM    NL   NM   NM   NM   NS   Z    PS
  NS    NL   NM   NS   NS   Z    PS   PM
  Z     NL   NM   NS   Z    PS   PM   PL
  PS    NM   NS   Z    PS   PS   PM   PL
  PM    NS   Z    PS   PM   PM   PL   PL
  PL    Z    PS   PM   PL   PL   PL   PL

Table 2: Look-up table relating the input and output variables in fuzzy set form, for the five fuzzy sets of the GAFLC (rows: Δω′; columns: Δω):

        NL   NS   Z    PS   PL
  NL    NL   NL   NL   NS   Z
  NS    NL   NL   NS   Z    PS
  Z     NL   NS   Z    PS   PL
  PS    NS   Z    PS   PL   PL
  PL    Z    PS   PL   PL   PL

The fuzzy input vector of each GAFLC-PSS of each generator consists of the same variables used in the SFLC, with five linguistic variables using adaptive fuzzy sets. Only five linguistic variables (LV) are used for each of the input variables, as shown in Figures 6 and 7, respectively; the output-variable fuzzy set is shown in Figure 8. In these figures, the fuzzy sets of the related variables used with the SFLC are indicated by the solid lines, while the dotted lines represent the simulation results of the fuzzy sets when using the GAFLC. Figure 9 shows the fuzzy surface of the rules. The LVs used are PL (positive large), PS (positive small), Z (zero), NS (negative small) and NL (negative large), as indicated in Table 2.

Fig. 5: Membership functions of the stabilizing signal for the SFLC.
Fig. 6: Membership functions (MFs) of the speed deviation; SFLC: solid lines, GAFLC: dotted lines.
Fig. 7: Membership functions of the speed deviation change; SFLC: solid lines, GAFLC: dotted lines.
Fig. 8: Membership functions of the stabilizing signal; SFLC: solid lines, GAFLC: dotted lines.
Fig. 9: Rule surface viewer for the SFLC and GAFLC controllers.

b. Defuzzification method

The minimum-of-maximum-value method is used to calculate the output from the fuzzy rules; this output is usually represented by a polyhedron map. The defuzzification stage is executed in two steps. First, the minimum membership is selected from the memberships of the two input variables (Δω and Δω′) in the related fuzzy sets of each rule; this minimum membership is used to rescale the output of the rule, and the maximum over the rules is then taken to give the final polyhedron map. Finally, the centroid (centre of area) is used to compute the crisp fuzzy output, which represents the defuzzification stage [7-9].

4 Genetic algorithm for optimizing fuzzy controllers

The adaptive fuzzy logic controller (GAFLC), using an adaptive fuzzy set based on a genetic algorithm, has the same inputs and output as the static fuzzy logic controller (SFLC) [10]. However, the GAFLC uses five fuzzy sets for the inputs and the output variable; the full rule base thus contains 25 rules. The SFLC is defined as an FLC using a fixed fuzzy set structure, as shown in Figures 3-5 for the case of seven fuzzy sets; for five fuzzy sets, it is indicated by the solid lines in Figures 6-8. The rules have the general form given by the following statement:

IF vector (Δω) is NS AND change in vector (Δω′) is Z, THEN the stabilizing signal is NS,

where the membership functions (MFi) are defined as MFj ∈ {NB, NS, Z, PS, PB}, as in the static fuzzy case. The output space likewise has 5 different fuzzy sets. To accommodate changes in the operating conditions, the adaptation algorithm changes the parameters of the input and output fuzzy sets. A minimal sketch of this 25-rule inference is given below.
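As an illustration of the 25-rule controller described above, the sketch below implements Table 2 with Gaussian membership functions, min-max (Mamdani) inference and centre-of-area defuzzification. The centres and the common width on a normalised universe are assumed example values; in the GAFLC these are exactly the parameters that the genetic algorithm tunes.

```python
import numpy as np

LABELS = ["NL", "NS", "Z", "PS", "PL"]
CENTERS = dict(zip(LABELS, [-1.0, -0.5, 0.0, 0.5, 1.0]))  # assumed centres
SIGMA = 0.25                                              # assumed common width

RULES = {  # table 2: RULES[dw][dw_dot] -> output label (the table is symmetric)
    "NL": {"NL": "NL", "NS": "NL", "Z": "NL", "PS": "NS", "PL": "Z"},
    "NS": {"NL": "NL", "NS": "NL", "Z": "NS", "PS": "Z",  "PL": "PS"},
    "Z":  {"NL": "NL", "NS": "NS", "Z": "Z",  "PS": "PS", "PL": "PL"},
    "PS": {"NL": "NS", "NS": "Z",  "Z": "PS", "PS": "PL", "PL": "PL"},
    "PL": {"NL": "Z",  "NS": "PS", "Z": "PL", "PS": "PL", "PL": "PL"},
}

def mu(x, label):
    """Gaussian membership value, eq. (B.1)."""
    return np.exp(-(x - CENTERS[label]) ** 2 / (2 * SIGMA ** 2))

def stabilising_signal(dw, dw_dot):
    """Mamdani min-max inference over the 25 rules, centroid defuzzification."""
    xs = np.linspace(-1.5, 1.5, 301)
    agg = np.zeros_like(xs)
    for a in LABELS:
        for b in LABELS:
            strength = min(mu(dw, a), mu(dw_dot, b))   # rule firing strength
            agg = np.maximum(agg, np.minimum(strength, mu(xs, RULES[a][b])))
    return float((agg * xs).sum() / agg.sum())         # centre of area

print(stabilising_signal(0.3, -0.1))
```

The same loop with the 7x7 table of Table 1 fires 49 rules per sampling interval, which is the computational cost that the 25-rule GAFLC avoids.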
The membership function parameters of the FLCs are optimized on the basis of the adapted genetic algorithm with adjusting population size (AGAPOP) [12]. The simulation results using the GAFLC controller are denoted by the dotted lines in Figures 6-8; this will be described later. AGAPOP is used to calculate the optimum values of the fuzzy set parameters based on the best dynamic performance and the domain search of the parameters [11]. The objective function used in the AGAPOP technique is given by f = 1/(1 + J), where J is the cost function to be minimized. AGAPOP uses its operators and functions to find the values of the fuzzy set parameters of the FL controllers that achieve a better dynamic performance of the overall system. These parameter values lead to the optimum control actions for which the system reaches the desired values, while improving the percentage of overshoot (P.O.S.), the rise time and the oscillations. The main aspect of the AGAPOP approach is to optimize the fuzzy set parameters of the FL controllers. The flowchart of the AGAPOP optimization process is shown in Figure 10 [12].

Fig. 10: Flowchart of the AGAPOP approach for optimizing MFs.
Fig. 11: Roulette-wheel selection scheme.

a. Representation of fuzzy set parameters in the GA

The fuzzy set parameters of the FL controllers are formulated using the AGAPOP approach [12] and are represented in a chromosome. The fuzzy set parameters of the FL controllers are initially started from the static fuzzy set parameter values. The intervals of acceptable values for each fuzzy-set shape-forming parameter (Δc = [cmin, cmax] and Δσ = [σmin, σmax] for a Gaussian) are determined on the basis of 2nd-order fuzzy sets for all fuzzy sets, as explained in Appendix B. The Gaussian shape is chosen in order to show how the parameters of the fuzzy sets are formulated and coded in the chromosomes. The minimum performance criterion J is [8]:

\[ J = \int_0^t \left( \alpha_1 |e(t)| + \beta_1 |e'(t)| + \gamma_1 |e''(t)| \right) \mathrm{d}t, \qquad (7) \]

where e(t) is equal to the average error of Δω1 and Δω2, and the parameters α1, β1 and γ1 are weighting coefficients.

b. Coding of fuzzy set parameters

The coded parameters are arranged on the basis of their constraints to form a chromosome of the population. The binary representation is the coded form of the parameters, with a chromosome length equal to the sum of the bits of all parameters. Tables 3 and 4 show the coded parameters of the FLCs for machines 1 and 2, respectively.

c. Selection function

Selection usually applies some selection pressure by favouring individuals with better fitness. After procreation, the suitable population consists, for example, of L chromosomes, which are all initially randomized [12,14,16]. Each chromosome is evaluated and associated with a fitness value, and the current population undergoes the reproduction process to create the next population, as shown in Figure 11. The chance on the roulette wheel is adaptive and is given as \(p_l / \sum p_l\), with

\[ p_l = \frac{1}{J_l}, \qquad l \in \{1, \ldots, L\}, \qquad (8) \]

where Jl is the model performance encoded in the chromosome, measured in the terms used in equation (7).

d. Crossover and mutation operators

The mating pool is formed and crossover is applied; the mutation operation is then applied, following the AGAPOP approach [12]. Finally, the overall fitness of the population is improved. The procedure is repeated until the termination condition, the maximum allowable number of generations, is reached. A sketch of this selection-crossover-mutation loop follows.
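A minimal sketch of the loop just described is given below. The chromosome layout (30 genes: a centre and a width for each of the 5 sets of the 3 variables) follows Tables 3-4, using a real-valued encoding for brevity rather than the binary coding of the paper; the cost function here is only a placeholder, since the real J of equation (7) comes from simulating the two-machine system. Fitness is taken as f = 1/(1 + J), as stated above.

```python
import random

N_SETS, N_VARS = 5, 3            # five fuzzy sets for dw, dw' and the output
GENES = 2 * N_SETS * N_VARS      # (c, sigma) pairs -> 30 parameters (tables 3-4)

def random_chromosome():
    # even genes: centres, odd genes: widths (assumed ranges)
    return [random.uniform(-1, 1) if g % 2 == 0 else random.uniform(0.05, 0.5)
            for g in range(GENES)]

def cost(ch):
    # placeholder for J, eq. (7); the real value needs the system simulation
    return sum(c * c for c in ch)

def roulette(pop, costs):
    fitness = [1.0 / (1.0 + j) for j in costs]   # f = 1/(1+J)
    return [list(c) for c in random.choices(pop, weights=fitness, k=len(pop))]

pop = [random_chromosome() for _ in range(20)]
for generation in range(50):
    pop = roulette(pop, [cost(ch) for ch in pop])
    for ch in pop:                               # one-point crossover + mutation
        cut = random.randrange(GENES)
        ch[cut:] = random.choice(pop)[cut:]
        if random.random() < 0.1:
            ch[random.randrange(GENES)] *= random.uniform(0.9, 1.1)

best = min(pop, key=cost)
```

In the AGAPOP variant the population size and the crossover/mutation probabilities are themselves adapted during the run [12,14], which this fixed-rate sketch omits.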
This procedure is shown in the flowchart given in Figure 10.

Table 3: Coded parameters of the GAFLC for machine 1. The chromosome (30 parameters) consists of a sub-chromosome for each input (Δω1: c1, σ1, ..., c5, σ5; Δω′1: c1, σ1, ..., c5, σ5) and a sub-chromosome for the output (stabilizing signal 1: c1, σ1, ..., c5, σ5), i.e. 2 × 5 parameters per variable.

Table 4: Coded parameters of the GAFLC for machine 2, structured identically (Δω2, Δω′2 and stabilizing signal 2; 2 × 5 parameters per variable, 30 parameters in total).

5 Simulation results and discussion

5.1 Dynamic performance due to sudden load variation

The system data given in Appendix A are used to test the proposed algorithm. Different simulation computations have been performed, and results were obtained for the two generators equipped with a PSS driven by an FLC based on adaptive fuzzy sets. The simulation programs cover a wide range of operating conditions: light, medium and heavy load. Light load is represented by assuming both synchronous generators normally loaded and delivering 0.4 per unit (pu) active power (Pe) and 0.2 pu reactive power (Qe). The medium operating point is considered when both generators normally deliver Pe and Qe equal to 0.65 and 0.45 pu, respectively. For the heavy-load case, Pe and Qe of the two generators equal 0.9 and 0.4 pu, respectively.

5.2 Mechanical torque disturbance

a. Light load conditions

The first case was studied with both synchronous generators loaded at Pe = 0.4 and Qe = 0.2. Generator 1 (Gen-1) was subjected to a 10% step increase in the reference mechanical torque; the torque was then returned to the initial condition. Figures 12a, b, c show the angular displacement between the rotors of the two machines (δ), in radians, and the speed deviations Δω, in rad/sec, of Gen-1 and Gen-2, respectively. These figures include the simulation results for the system under study when equipped with the various PSS controllers: the conventional PSS; the PSS-SFLC using seven static fuzzy sets, with 49 rules in total; the PSS-SFLC using five static fuzzy sets, with 25 rules in total; and the PSS genetic adaptive fuzzy logic controller (GAFLC) using five adaptive fuzzy sets, with 25 rules in total. It should be noted that the PSS-SFLC using seven fuzzy sets provides a better dynamic performance than the PSS-SFLC with five fuzzy sets. The main drawback of the PSS-SFLC using seven fuzzy sets, however, is the large computation time for 49 rules in every sampling interval, compared with the time required for 25 rules of the five-set FLC. Meanwhile, the PSS-GAFLC almost coincides with the PSS-FLC with seven fuzzy sets. Tables 1 and 2 show the rules of the static and adaptive fuzzy controllers.

Fig. 12: Dynamic response of a synchronous generator equipped with SFLC-PSS, GAFLC-PSS and CPSS under a light load; Gen-1 is subjected to a step increase/decrease in Tm. (a) Angular displacement between the two machine rotors; (b) speed change of generator 1; (c) speed change of generator 2.
The dynamic response shown in Figure 12 depicts the superiority of the GAFLC compared with the other controllers, except the PSS-FLC using seven fuzzy sets. The rise time, settling time and damping coefficient of the overall system are better than with the PSS-FLC using a static FS with 25 rules. The simulation results also show that the GAFLC has a lower percentage overshoot than the CPSS. Figures 6 to 8 show the normalized membership functions (MFs) before and after training with the AGAPOP algorithm for the input and output variables of the fuzzy controller.

Fig. 13: Dynamic response of the synchronous generator equipped with SFLC-PSS, GAFLC-PSS and CPSS under a medium load; Gen-1 is subjected to a step increase/decrease in Tm. (a) Angular displacement between the two machine rotors; (b) speed change of generator 1; (c) speed change of generator 2.

Fig. 14: Dynamic response of the synchronous generator equipped with SFLC-PSS, GAFLC-PSS and CPSS under a heavy load; Gen-1 is subjected to a step increase/decrease in Tm. (a) Angular displacement between the two machine rotors; (b) speed change of generator 1; (c) speed change of generator 2.

b. Medium load conditions

The second case studied is when each generator is loaded with Pe = 0.65 pu, Qe = 0.45 pu and is subjected to the same torque disturbance as in case a. Figures 13a, b, c show the simulation results for this case, including the power-angle displacement between the two rotors (δ), in radians, and the speed deviations Δω, in rad/sec, of Gen-1 and Gen-2, respectively.

c. Heavy load conditions

The third case studied is when each generator is loaded with Pe = 0.9 pu, Qe = 0.4 pu and is subjected to the same torque disturbance as in case a. Figures 14a, b, c show the corresponding simulation results.

6 Conclusion

This paper has presented a new fuzzy logic control power system stabilizer for the supervisory power system stabilizers of a two-machine system. The adaptive fuzzy set is introduced and tested through a simulation program. The proposed adaptive fuzzy controller driven by a genetic algorithm improves the settling time and the rise time, and decreases the damping coefficient of the system under study. The simulation results show the superiority of the adaptive fuzzy controller driven by a genetic algorithm in comparison with the other controllers. The results also show the effectiveness of the proposed GAFLC with an adaptive fuzzy set scheme as a promising technique. The specification of the parameter constraints related to the input/output reference fuzzy sets is based on 2nd-order fuzzy sets. The problem of constrained nonlinear optimization is solved on the basis of a genetic algorithm with variable crossover and mutation probability rates. The proposed GAFLC using AFS also reduced the computational time of the FLC, as the number of rules is reduced from 49 to 25. In addition, the proposed adaptive FLC technique driven by a genetic algorithm reduced the complexity of the fuzzy model.

Appendix A

All parameters and data are given in per-unit values.

Machine 1 parameters: Pe = 0.8, Qe = 0.6, Vt = 1.05, xd = 1.2, x′d = 0.19, xq = 0.743, H = 4.63, T′do = 7.76, D = 2, ξ = 0.3.
Machine 2 parameters: Pe = 0.75, Qe = 0.55, Vt = 1.0, xd = 1.15, x′d = 0.13, xq = 0.643, H = 3.63, T′do = 7.00, D = 1.8, ξ = 0.27.
Local load data: load 1 (connected to machine 1): G1 = 0.449, B1 = 0.262; load 2 (connected to machine 2): G2 = 0.249, B2 = 0.221.
Line data: rT.L = 0.034, xT.L = 0.997.
AVR data (machines 1 and 2): Ka1 = 400, Ka2 = 370, Ta1 = 0.02, Ta2 = 0.015.

Appendix B: Determining the constraints of Gaussian MFs

The membership function μ(x) of a fuzzy set is frequently approximated by a Gaussian. A Gaussian shape is formed by two parameters, the centre c and the width σ, as in formula (B.1):

\[ \mu_{g1}(x; c_j, \sigma_j) = e^{-\frac{(x-c_j)^2}{2\sigma_j^2}}. \qquad (B.1) \]

The idea of a 2nd-order fuzzy set was introduced by Melikhov to obtain a boundary of the Gaussian shape of the membership function [13]. The 2nd-order fuzzy set of a given MF(x) is the area between d+ and d−, where d+ and d− are the upper and lower crisp boundaries of the 2nd-order fuzzy set, respectively, as shown in Figure B.1. The expressions for determining the crisp boundaries are (B.2) and (B.3):

\[ d_j^+(x_i) = \min\left(1,\; MF_j(x_i) + \delta\right), \qquad (B.2) \]
\[ d_j^-(x_i) = \max\left(0,\; MF_j(x_i) - \delta\right). \qquad (B.3) \]

Formulas (B.2) and (B.3) are based on the assumptions that the height of the slice of the 2nd-order fuzzy region bounded by d+ and d− at a point x is equal to 2δ, where δ ∈ [0, 0.3679], and that these boundaries are equidistant from MF(x). To obtain the ranges of the shape-forming parameters of the MFs, these 2nd-order fuzzy sets are taken as the MF search spaces: all MFs with acceptable parameters should therefore lie inside the area. In the general case, the intervals of acceptable values for every MF shape-forming parameter (e.g. Δc = [c11, c22] and Δσ = [σ11, σ22] for a Gaussian) may be determined by solving formulas (B.1), (B.2) and (B.3). In practice, this may be done approximately, considering d+ and d− as soft constraints. For example, c11 and c22 for the Gaussian may be found as the maximum root and the minimum root of the equation d+ = 1, which can easily be calculated; this equation is based on the assumption that a fuzzy set represented by the Gaussian must have a point where it is absolutely true. σ11 and σ22 can easily be found from the following four equations:

\[ \mu_{g1}((c+\sigma); c, \sigma) + \delta = \mu_{g1}((c+\sigma); c, \sigma_{22}); \quad \mu_{g1}((c+\sigma); c, \sigma) - \delta = \mu_{g1}((c+\sigma); c, \sigma_{11}); \qquad (B.4) \]
\[ \mu_{g1}((c-\sigma); c, \sigma) + \delta = \mu_{g1}((c-\sigma); c, \sigma_{22}); \quad \mu_{g1}((c-\sigma); c, \sigma) - \delta = \mu_{g1}((c-\sigma); c, \sigma_{11}); \qquad (B.5) \]

where σ11 is chosen as the minimum and σ22 as the maximum of the roots. These equations are based on the assumption that an acceptable Gaussian with [σ11, σ22] should cross the 2nd-order fuzzy region slices at the points x = (c ± σ). There are two options for finding the constraints of the Gaussian parameters. First, the constraints may be treated as hard constraints: the lower and upper bounds of the centre of the Gaussian membership function, cmin and cmax, are then chosen inside the values of c11 and c22, to satisfy the search-space constraint conditions of the 2nd-order fuzzy sets, as shown in Figure B.2, and the lower and upper bounds of the width of the Gaussian membership function, σmin and σmax, are set equal to σ11 and σ22, respectively, as shown in Figure B.1. The second option is to consider these constraints as soft constraints, i.e. [cmin, cmax] equal to [c11, c22] and [σmin, σmax] equal to [σ11, σ22].

Fig. B.1: Upper and lower boundaries of the width σ, using a 2nd-order fuzzy set.
Fig. B.2: Upper and lower boundaries of the centre c, using a 2nd-order fuzzy set.
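Equations (B.4)-(B.5), together with the roots of d+ = 1 for the centre, can in fact be solved in closed form, since μg1(c ± σ; c, σ) = e^(−1/2). A small sketch follows; the example MF (c = 0, σ = 0.25, δ = 0.2) is an assumed illustration.

```python
import math

def gaussian_bounds(c, sigma, delta):
    """Parameter intervals for a Gaussian MF from its 2nd-order fuzzy set,
    solving (B.4)-(B.5) and d+ = 1 analytically; delta in (0, 0.3679]."""
    mu_at_sigma = math.exp(-0.5)                       # value of (B.1) at x = c + sigma
    s11 = sigma / math.sqrt(-2 * math.log(mu_at_sigma - delta))  # narrowest width
    s22 = sigma / math.sqrt(-2 * math.log(mu_at_sigma + delta))  # widest width
    half = sigma * math.sqrt(-2 * math.log(1 - delta))           # roots of d+ = 1
    return (c - half, c + half), (s11, s22)

(c11, c22), (s11, s22) = gaussian_bounds(0.0, 0.25, 0.2)
print(c11, c22)   # centre interval [c11, c22]
print(s11, s22)   # width interval [sigma11, sigma22]
```

For δ = 0.2 this gives a centre interval of roughly ±0.67σ around c and a width interval of roughly [0.75σ, 1.53σ], which is the search space handed to the genetic algorithm of section 4.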
Acknowledgement

The authors gratefully acknowledge support from the Deanship for Scientific Research, King Abdulaziz University, through funding project No. 4-021-430.

References

[1] Kimbark, E. W.: Power System Stability: Elements of Stability Calculations. Vol. 1, eighth printing, April 1967.
[2] Venikov, V.: Transient Processes in Electrical Power Systems. Moscow: Mir Publishers, 1977.
[3] DeMello, F. P., Concordia, C.: Concepts of synchronous machine stability as affected by excitation control. IEEE Trans. on Power Apparatus and Systems, Vol. PAS-88 (4), p. 316-329, 1969.
[4] Yu, Yao-Nan: Electric Power System Dynamics. New York: Academic Press, 1983.
[5] Anderson, P. M., Fouad, A. A.: Power System Control and Stability. New York: IEEE Press, 1994.
[6] Kothari, M. L., Nanda, J., Bhattacharya, K.: Discrete mode power system stabilizers. IEE Proceedings Part C, 1993, 140 (6), p. 523-531.
[7] Lee, C. C.: Fuzzy logic control systems: fuzzy logic controller, part I. IEEE Trans. Syst. Man, Cybernetics, Vol. 20, p. 404-418, Mar./April 1990.
[8] El-Metwally, K. A., Malik, O. P.: A fuzzy logic power system stabilizer. IEE Proc. Generation, Transmission and Distribution, Vol. 145, No. 3, 1995, p. 277-281.
[9] Soliman, H. F.: Adaptive fuzzy logic controller for supervisory power system stabilizers of a two-machine system. Scientific Bulletin of Ain Shams Univ., June 2007, Egypt.
[10] Soliman, H. F., Attia, A.-F., Hellal, M., Badr, M. A. L.: Power system stabilizer driven by an adaptive fuzzy set for better dynamic performance. Acta Polytechnica, Czech Technical University in Prague, Vol. 46, No. 2/2006.
[11] Attia, A.-F., Soliman, H.: An efficient genetic algorithm for tuning a PD controller of an electric drive for an astronomical telescope. Scientific Bulletin of Ain Shams University, Faculty of Engineering, Part II, Issue No. 37/2, June 30, 2002.
[12] Attia, A.-F., Mahmoud, E., Shahin, H. I., Osman, A. M.: A modified genetic algorithm for precise determination of the geometrical orbital elements of binary stars. International Journal of New Astronomy, 14, 2009, p. 285-293.
[13] Melikhov, A., Miagkikh, V., Topchy, P.: On optimization of fuzzy and neuro-fuzzy systems by means of adaptive genetic search. Proc. of GA+SE'96 IC, Gursuf, Ukraine, 1996.
[14] Srinivas, M., Patnaik, L. M.: Adaptive probabilities of crossover and mutation in genetic algorithms. IEEE Trans. Systems, Man, and Cybernetics, Vol. 24, No. 4/1994, p. 656-667.
[15] Srinivas, M., Patnaik, L. M.: Genetic search: analysis using fitness moments. IEEE Trans. on Knowledge and Data Engineering, Vol. 8, No. 1/1996.
[16] Attia, A.-F.: Genetic Algorithms for Optimizing Fuzzy and Neuro-Fuzzy Systems. Praha, Czech Republic: CVUT, p. 107, 2002.

Yusuf A. Al-Turki, e-mail: yaturki@yahoo.com, Elect. & Computer Eng. Dept., Faculty of Eng., King Abdulaziz University, P.O. Box 80230, Jeddah 21589, Saudi Arabia

Abdel-Fattah Attia, e-mail: attiaa1@yahoo.com, National Research Institute of Astronomy and Geophysics, Helwan, Cairo, Egypt; Deanship of Scientific Research, King Abdulaziz University, P.O. Box 80230, Jeddah 21589, Saudi Arabia

Hussien F. Soliman, e-mail: faried.off@gmail.com, Electrical Power & Machines Dept., Faculty of Engineering, Ain Shams Univ., Cairo, Egypt

Impacts of new EU and Czech environmental legislation on heat and electricity prices of combined heat and power sources in the Czech Republic

J. Vecka

Abstract

In my economic model I calculate the impact of the new EU ETS directive, the Industrial Emissions Directive and the new Air Protection Law on future heat and electricity prices of combined heat and power sources.
I find that there will be a significant increase in heat and electricity prices, especially because of the implementation of the new so-called benchmark tools for allocating allowances. The main problem of large heat producers in this respect is the loss of competitiveness on the heat market due to emerging stricter environmental legislation, which is not applied to their competitors on the heat market (smaller heat sources). There is also a lack of clarity about the modalities for allocating free allowances, and about the future development of the whole carbon market (the future European allowance price).

Keywords: district heating (DH), combined heat and power (CHP), benchmark, EU Emissions Trading Scheme (EU ETS), Industrial Emissions Directive (IED), European allowance (EUA), climate-energy package (Package), Climate Change Committee (CCC), Member State (MS).

1 New EU legislation

1.1 Emission trading

The climate-energy package (Package) was adopted in June 2009. It consists of 4 parts. The main part is Directive 2009/29/EC (source [1]), amending the existing Directive 2003/87/EC (source [2]) establishing a scheme for greenhouse gas emission allowance trading within the EU, the so-called EU Emission Trading Scheme (EU ETS). Directive 2003/87/EC had already been amended in 2004 by Directive 2004/101/EC. The scope of the new Directive 2009/29/EC includes, inter alia, the EU greenhouse gas target to decrease GHG emissions by 20% by 2020. The final text was adopted after long debates, and it contains many terms that need to be further defined by the relevant authorities. This task falls to the so-called Climate Change Committee (CCC), which was established by Directive 2003/87/EC; the CCC acts as an implementing body for all EU ETS directives (2003/87/EC, 2004/101/EC and 2009/29/EC).

The most important aspect of Directive 2009/29/EC for all installations in the EU ETS is the new allocation tool, auctioning of allowances, which should serve as the universal approach for distributing European allowances (EUA) from 2013 onwards. Auctioning means that EUAs will no longer be distributed to producers free of charge (as they are now); producers will have to purchase them in open auctions. There are several exceptions to this rule:

• Free allocation will be given to sectors endangered by so-called "carbon leakage", meaning sectors like steel or lime production, which could be moved to countries outside the EU because of higher costs. This rule is not applicable to DH sources.
• A transitional free allocation will be given for the modernization of electricity generation: fulfilling at least 1 of 3 criteria given by Directive 2009/29/EC, a Member State can ask for a partial free allocation of EUAs for electricity producers. The market value of the free EUAs has to be used for retrofitting and upgrading the infrastructure and for clean technologies.
• A free allocation will be given to district heating and also to high-efficiency cogeneration, as defined by Directive 2004/8/EC on the promotion of cogeneration, for economically justifiable demand, in respect of the production of heating or cooling.

In December 2010, a Commission decision on determining transitional Union-wide rules for the harmonised free allocation of emission allowances (source [3]) was adopted within the CCC body. This decision introduces new rules for adjusting the allocation of free allowances in respect of heat delivered to private households.
Unfortunately, it is still unclear how this new tool will be implemented, and for this reason I have tried to cover all possible outcomes of these EU processes.

1.2 Industrial Emissions Directive (IED)

The Industrial Emissions Directive 2010/75/EU (source [4]) was adopted after long negotiations in December 2010 as a recast of the previous directive on integrated pollution prevention and control (the so-called IPPC). This new directive merges 6 directives in the field of pollution control and in effect integrates environmental management. The purpose of integrated prevention is to focus on the impact of industrial installations on all aspects of the environment, including soil and ground water. The directive introduces new, ambitious emission limit values for combustion plants (listed in Table 1). These new limits are derived from best available technology (BAT) levels for each technology.

Table 1: Emission limit values for combustion plants, in mg/Nm³ (a value shown once in the source table spans the adjacent fuel columns):

fuel: hard coal or lignite
  50-100 MW:  SO2 400;  NOx 300/450;  dust 30
  101-300 MW: SO2 250;  NOx 200;      dust 25
  > 300 MW:   SO2 200;  NOx 200;      dust 20
fuel: liquid fuels
  50-100 MW:  SO2 350;  NOx 450;      dust 30
  101-300 MW: SO2 250;  NOx 200;      dust 25
  > 300 MW:   SO2 200;  NOx 150;      dust 20
fuel: biomass
  50-100 MW:  SO2 200;  NOx 300;      dust 30
  101-300 MW: SO2 200;  NOx 250;      dust 20
  > 300 MW:   SO2 200;  NOx 200;      dust 20
fuel: gaseous fuels
  all sizes:  SO2 35;   NOx 100;      dust 5

Member States can use various derogation tools from these emission limit values:

• "transitional national plans" for large combustion plants, with individual emission limits until 30 June 2020;
• an exception for district heating plants (installations up to 200 MW thermal output) until 31 December 2022;
• a limited-lifetime derogation for sources in operation for no more than 17500 operating hours, starting from 1 January 2016 and ending no later than 31 December 2023.

In my economic model I use the exception for district heating plants.

2 New Czech legislation: Air Protection Law

In the Czech Republic there is a new proposal on the government's agenda for a complex amendment of Act No. 86/2002 Coll., on air protection. This proposal includes a new version of the pollution fees for all sources (listed in Table 2, which compares the current fees with the proposed ones), with a vast increase, by about a factor of 10, until 2022. There is huge opposition from industry stakeholders to these new fees, because in the context of the IED (with strict emission limits at BAT levels) there is no additional economic incentive for producers to aim for even lower emission levels. The definition of BAT itself means that there is no technological possibility (or at most a very narrow one) to go further. Consequently, the pollution fees would only become a new pollution tax.

Table 2: Pollution fees, current and proposed, in CZK per ton:

pollutant          dust    SOx    NOx    VOC
current            3000    1000   800    2000
2012-2016          4200    1350   1100   2700
2017               6300    2100   1700   4200
2018               8400    2800   2200   5600
2019               10500   3500   2800   7000
2020               12600   4200   3300   8400
2021               14700   4900   3900   9800
2022 and further   29400   9800   7800   19600

3 Future of heat prices in the Czech Republic after 2012

In respect of emission trading, I have focused on the third exception (free allocation of allowances for district heating), which is crucial for my economic model. According to the text of Directive 2009/29/EC, there should be a free allocation for heat producers. The rules of this free allocation are presented in the Decision (source [3], as mentioned above), but the detailed modalities have to be discussed within the CCC bodies.
The benchmark value, which is the ratio between GHG emissions and heat production, was set at the level of a natural gas source with 90% heat production efficiency; this leads to 62.3 allowances per TJ of heat delivered to consumers. There was significant opposition to this proposal, mainly from the new MSs, which are strongly dependent on coal-fired DH systems. The old MSs were neutral or in favour of the proposal, because their heat systems mainly use natural gas as fuel (see Figure 1). Fortunately, owing to the organized pressure from the new MSs, the Commission has proposed a new tool to improve the free allocation for DH systems in respect of heat delivered to private households (see the description below).

Fig. 1: Heat production fuel mix in the EU in 2008 (source [5]).

In terms of the IED, it is necessary to implement all possible derogation tools for local sources. The new emission limits were correctly set at BAT levels; regulators should, however, bear in mind the local circumstances: local fuel sources, the huge improvement in air quality within the last two decades, and the energy security of the Czech Republic (the "cleanest" fuel, natural gas, is imported via a 4500 km long transit gas pipeline from the Yamburg gas fields). In terms of the new pollution fees, the national authority should take into account that going below BAT is not economically and technically possible, and that the pollution fees would therefore become a "tax". There is no necessity to introduce a new "pollution tax"; the IED forms a sufficiently deep and demanding framework for cleaner production of energy.

4 Description of the economic model

I have created an economic model for calculating the impact of the implementation of the new EU and Czech environmental legislation on future heat prices. I used the following approach:

• The model calculates the influence caused by emission trading, the IED and the pollution fees.
• The model can be applied only to installations which fall under the IED (thermal input of 50 MW or higher); smaller sources will not be affected by all the new EU and Czech legislation.
• The model assumes combined heat and power generation.
• Certain inputs were set by expert estimation (e.g. efficiency of coal boilers, grid losses, etc.).
• A basic presumption is that heat and electricity production in the period 2013-2027 will be the same (or without significant changes) as the average production during the period 2005-2008, which is the base period for historical data according to the Decision to Directive 2009/29/EC (source [3]).
• I have calculated the simple influence on the energy price (per 1 GJ of energy produced) for the whole Czech Republic, based on the fuel source, in two scenarios. The real impact on energy prices has to take into account the fuel mix used for energy generation in real CHP sources.

4.1 CO2 emission factors

I have used the CO2 emission factors from the Ministry of Industry and Trade web site (listed in Table 3).

Table 3: CO2 emission factors

fuel           t CO2/MWh of fuel calorific value   t CO2/GJ of fuel calorific value
brown coal     0.360                               0.100
hard coal      0.330                               0.092
liquid fuels   0.260                               0.072
natural gas    0.200                               0.056

4.2 Main indicators

I have determined the values of the main indicators through expert estimation (see Table 4).

Table 4: Main indicators

indicator                    value in %
coal boiler efficiency       85
gas boiler efficiency        90
fuel oil boiler efficiency   87
heat production efficiency   95
grid losses                  13

There are several presumptions behind these figures:
• The boiler efficiencies hold for ideal operating circumstances (installed capacity, high base load, etc.).
• The heat production efficiency holds for modern technology, but it can vary greatly across the district heating (DH) sector.
• The grid losses hold for hot-water grids; the figure will be higher for steam grids (approx. 5% higher).

All these presumptions are made in the interest of the objectivity of the model outcomes. There are significant differences among the installations in the DH sector, so there are no "correct values" in this respect.

4.3 Benchmarks

According to the text of the Decision to Directive 2009/29/EC, the allocation of free allowances will be determined by so-called "benchmarks". A benchmark is a fixed ratio between GHG emissions and a unit of production (in the case of the district heating sector, 1 GJ of heat). Benchmarks will be used for free EUA allocation as follows:

• In 2013 there will be a free EUA allocation of 80% of the benchmark value, with a linear decrease to 30% in 2020.
• In 2027 there should be no free EUA allocation.

The benchmark value was set within the Decision [3] on the basis of the 10% best installations, using natural gas as fuel with 90% boiler efficiency. The final value of the so-called heat benchmark is 62.3 kg CO2/GJ of produced heat.

4.4 Free allocation for heat to private households

Free allocation for heat delivered to private households is a new tool introduced by the Decision [3], the so-called household rule. This tool provides for an increase in the free allocation for DH systems according to their emissions related to the production of heat exported to private households from 1 January 2005 to 31 December 2008. This means that the free allocation for heat for private households will be adjusted by the difference between the historical emissions related to heat for households and the allocation according to the benchmark. However, this application of historical emissions is lowered each year, starting from 90% in 2014. Heat for other customers will be allocated only according to the benchmark (as described above). The detailed rules of this tool have not yet been approved, and there are still many modalities to be developed; there are about four possible interpretations of the household rule.

4.5 Derogation for electricity producers

Free allocation in respect of the production of electricity is enabled by the text of Directive 2009/29/EC. This allocation is possible mainly for the new MSs. The CCC body adopted the decision on the relevant part of Directive 2009/29/EC in November 2010 (source [6]). Unfortunately, this decision was very short and narrow, and left a major part of this allocation tool unclear. In recent months, the Commission has tried to adopt a communication on the implementing measures of this free allocation, with very restrictive conditions and requirements for producers. This situation can be seen as an almost clear attempt to breach the subsidiarity rules of the EU, because the modalities of this allocation tool should rest on the shoulders of the CCC bodies. Fortunately, there was opposition to this communication even within the Commission itself. The derogation rules and their applicability for electricity producers have therefore not yet been finalized.

4.6 IED implementation

Implementation of the IED will involve significant investment in the technology of existing sources to lower the emissions of pollutants (especially NOx and SOx). In my model, I assume that the derogation rule for DH systems will be used. Concerning the fulfilment of the emission limits given by the IED, sources should invest in the following technology:

• lignite/hard coal source: DeSOx and DeNOx technology; dust is managed at the emission limits by the current technology systems (or could be managed by a minor adjustment of the system); total investment: CZK 2 bln, three years before the emission limits are applied (e.g. in 2019 in order to meet the emission limits in 2023);
• liquid fuel source: DeSOx and DeNOx technology; dust/solid residues are covered by quality management of the fuel used (high-quality heating oil); total investment: CZK 1.5 bln, three years before the emission limits are applied;
• gaseous fuel source: DeNOx technology; dust/solid residues and SOx are not applicable, being covered by quality management of the fuel used (mainly natural gas); total investment: CZK 0.75 bln, three years before the emission limits are applied.

4.7 Future CO2 price

There is a lack of clarity with respect to the future CO2 price (the future price of the EU allowance). According to various EC studies, and according to the opinion of the Ministry of Environment of the Czech Republic, the future price of the EUA will be in the range of EUR 20-30. However, I have also used the "opinion" of the carbon market itself, which estimates the future EUA price at around EUR 16 (the average price for buying EUAs with delivery in 2013-2015).

4.8 Scenarios

I have constructed two possible implementation scenarios of the described environmental legislation. Each of these scenarios is evaluated with two carbon price values, as the carbon price is the most important parameter.
concerning the fulfillment of emission limits given by ied, sources should invest in the following technology: • lignite/hardcoal source—desox, denox technology,dust ismanagedatemission limits in current technology systems (could be managed by minor adjustment of the system) total investments: czk2bln three years before emission limits are applied (e.g. in 2019 in order to meet emission limits in 2023). • liquid fuel source— desox, denox technology, dust/solid residues is covered by quality management of the fuel that is used (high quality heating oil) total investments: czk 1.5 bln three years before emission limits are applied. • gaseous fuel source — denox technology, dust/solid residues and sox is not applicable, covered by quality management of the fuel that is used (mainly natural gas) total investments: 0.75 blnczk three years before emission limits are applied. 4.7 future co2 price there is lack of clarity in respect of the future co2 price (future price of the eu allowance). according to various ec studies, and according to the opinion of the ministry of environment of the czech republic, the future price of eua will be in range of eur 20–30. however i have also used the “opinion” of the carbonmarket itself, which estimates the futureeua price at aroundeur 16 (this is the average price for buying eua with delivery 2013–2015). 4.8 scenarios i have constructed two possible implementation scenarios of the described environmental legislation. each of these scenarios has two carbon price values (as the carbon price is the most important parameter). 114 acta polytechnica vol. 51 no. 5/2011 scenario 1 — strictest implementation emission trading—nohousehold rule, no derogation for electricity producers, just free allocation according to the benchmark. ied — without any derogation for district heating, full application from 1 january 2016. air protection — highest pollution fees with no applicable fixation at lower levels (current proposal for a complex amendment to act no. 86/2002 coll., on air protection). scenario 2 — pragmatic implementation emission trading — household rule (most probable interpretation, 60 % of heat is delivered to households), derogation for electricity producers (most probable application with benchmark according to the proposal for the national plan by the ministry of environment proposal (source [7])). ied—with a derogation for district heating, full application from 1 january 2023. airprotection—pollution feesfixedat2012–2016 levels (meaning an increase of about 40 % of current fees). 5 model outcomes the following tables showthe outcomes frommyeconomic model. the listed figures reflect the impact on energy prices after the implementation of benchmarks on heat. 
Table 5: Impact on energy prices (CZK per 1 GJ) based on the fuel used, for scenario 1 and an EUA price of EUR 16

year   lignite   hard coal   liquid   natural gas
2012   0.47      0.47        0.18     0.18
2013   42.32     37.81       24.56    13.62
2014   47.34     42.83       28.78    16.65
2015   49.07     44.55       30.52    18.42
2016   50.50     45.99       32.11    20.03
2017   52.25     47.73       33.79    21.72
2018   53.93     49.41       35.40    23.36
2019   55.57     51.05       36.98    24.96
2020   57.14     52.62       38.49    26.49
2021   58.15     53.64       39.43    27.44
2022   59.85     55.33       40.58    28.60
2023   60.67     56.16       41.41    29.45
2024   61.47     56.95       42.21    30.26
2025   62.23     57.71       42.99    31.04
2026   62.96     58.44       43.72    31.79
2027   63.66     59.14       44.43    32.50

Table 6: Impact on energy prices (CZK per 1 GJ) based on the fuel used, for scenario 1 and an EUA price of EUR 30

year   lignite   hard coal   liquid   natural gas
2012   0.47      0.47        0.18     0.18
2013   64.59     57.53       37.31    20.70
2014   70.61     63.55       42.55    24.75
2015   73.30     66.24       45.27    27.51
2016   75.68     68.62       47.80    30.08
2017   78.33     71.27       50.40    32.71
2018   80.89     73.83       52.90    35.25
2019   83.38     76.32       55.33    37.71
2020   85.78     78.72       57.67    40.08
2021   87.29     80.23       59.12    41.55
2022   89.46     82.41       60.76    43.21
2023   90.76     83.70       62.06    44.53
2024   92.00     84.94       63.31    45.80
2025   93.19     86.13       64.52    47.02
2026   94.33     87.27       65.67    48.18
2027   95.43     88.37       66.78    49.30

Table 7: Impact on energy prices (CZK per 1 GJ) based on the fuel used, for scenario 2 and an EUA price of EUR 16

year   lignite   hard coal   liquid   natural gas
2012   0.47      0.47        0.18     0.18
2013   14.74     12.04       4.52     −1.16
2014   21.69     18.64       10.16    3.67
2015   28.09     24.72       15.39    7.79
2016   33.95     30.31       20.24    11.21
2017   39.29     35.43       24.39    14.58
2018   44.15     40.09       27.74    17.89
2019   48.10     43.59       31.04    21.15
2020   53.65     49.13       35.99    25.21
2021   57.78     53.27       39.32    27.33
2022   58.64     54.12       40.19    28.21
2023   59.23     54.72       40.92    28.96
2024   60.03     55.51       41.73    29.77
2025   60.79     56.27       42.50    30.55
2026   61.52     57.01       43.23    31.30
2027   62.22     57.71       43.94    32.01

Table 8: Impact on energy prices (CZK per 1 GJ) based on the fuel used, for scenario 2 and an EUA price of EUR 30

year   lignite   hard coal   liquid   natural gas
2012   0.47      0.47        0.18     0.18
2013   22.76     18.55       6.97     −1.92
2014   33.62     28.86       15.77    5.63
2015   43.62     38.37       23.95    12.07
2016   52.78     47.10       31.52    17.41
2017   61.13     55.09       38.01    22.67
2018   68.72     62.38       43.25    27.85
2019   74.90     67.84       48.40    32.95
2020   82.29     75.23       55.18    38.81
2021   86.92     79.86       59.01    41.44
2022   88.26     81.20       60.37    42.81
2023   89.32     82.26       61.57    44.04
2024   90.56     83.50       62.82    45.31
2025   91.75     84.69       64.03    46.53
2026   92.89     85.84       65.18    47.70
2027   93.99     86.93       66.29    48.81

All listed figures are in CZK per 1 GJ of energy supplied: in the case of heat, the impact on the price for customers per 1 GJ of heat; in the case of electricity, the impact on the price of 1 GJ of electricity supplied to the electricity grid. The major difference between the two scenarios is in the first years, where scenario 1 models a severe price increase. Scenario 2 offers much more flexibility for producers through a gradual increase in energy prices.

Fig. 2: Impact on energy prices in the different scenarios for a future EUA price of EUR 16.

6 Summary

As presented in the figures above, future energy prices of CHP sources under the EU ETS and IED will be heavily influenced mainly by the implementation of Directive 2009/29/EC, which introduces a new tool for allocating free allowances. So-called benchmarks will be used for all EU ETS installations in the district heating sector.
estimating the future eu allowance price is also very problematic. the european commission expects an eua price of around eur 30, while the carbon market itself suggests around eur 16 (the average price of eua with delivery after 2013). there are still many unclear modalities concerning the free allocation of allowances after 2013.
implementation of the ied (new emission limits) and the new pollutant fees will not have major impacts on energy prices themselves, but they could be seen as a reason for a fuel switch or a closure.
as described by my model, there are several ways by which the ultimate target in terms of lowering emissions could be attained. however, the chosen path to the target could mean "price shocks" in the event of strict application, or a gradual price increase in the event of a pragmatic approach.
implementation of the new environmental legislation will lead to an increase in the energy prices of chp sources. in the case of heat prices, there will be no direct impact on the costs or revenues of these companies, because of the heat price structure (regulated by the energy regulatory office). the most severe impact in this respect is the loss of competitiveness of heat producers in the eu ets. customers in the czech republic do not care much about the environmental background of heat production – their main concern is the total price of heating. the main competitors on the heat market (local heat sources below the eu ets thresholds) are in a much better position in this respect. they are not influenced by the eu ets, the ied, pollution fees or an ecological tax (in the case of local boiler houses). the new environmental legislation is thus shown to distort competition on the heat market. a new "carbon tax" for sources outside the eu ets needs to be established as soon as possible to take this issue into account. in the case of electricity prices, implementing the environmental legislation will involve a loss of profit for producers (especially for producers from coal sources).
acknowledgement
the research described in this paper was supervised by doc. ing. jaroslav knápek, csc., fee ctu in prague.
references
[1] directive 2009/29/ec of the european parliament and of the council of 23 april 2009 amending directive 2003/87/ec so as to improve and extend the greenhouse gas emission allowance trading scheme of the community.
[2] directive 2003/87/ec of the european parliament and of the council of 13 october 2003 establishing a scheme for greenhouse gas emission allowance trading within the community and amending council directive 96/61/ec.
[3] commission decision on determining transitional union-wide rules for the harmonised free allocation of emission allowances pursuant to article 10a of directive 2003/87/ec.
[4] directive 2010/75/eu of the european parliament and of the council of 24 november 2010 on industrial emissions (integrated pollution prevention and control).
[5] eurostat: combined heat and power (chp) in the eu, turkey, and norway – 2008 data. european union, brussels, 2010, http://www.edsdestatis.de/de/downloads/sif/qa 10 007.pdf
[6] commission decision on guidance on the methodology to transitionally allocate free emission allowances to installations in respect of electricity production pursuant to article 10c(3) of directive 2003/87/ec.
[7] ministry of environment: application of the czech republic for allocation of free allowances for investments in retrofitting and upgrading of the infrastructure and clean technologies, and national investment plan, november 2010.
about the author
jiří vecka, born in 1982, is a graduate of fee ctu in prague. he is currently a ph.d. student at the dept. of economy, management and humanities, fee ctu in prague, working in the association for district heating of the czech republic (adh cr), an interest group of legal entities and entrepreneurs in the field of heat supply (91 members, approx. 87.8 pj of heat produced in 2009), dealing with environmental issues – emission trading, ied, air quality, etc.
jiří vecka
e-mail: vecka.jiri@centrum.cz
dept. of economics, management and humanities
czech technical university
technická 2, 166 27 prague, czech republic
acta polytechnica vol. 52 no. 4/2012
a flexible adjustment and control system for hydraulic machines
daniel banyai, lucian marcu
technical university of cluj-napoca, department of mechanical engineering, b-dul muncii, nr. 103–105, cluj-napoca, romania
correspondence to: daniel.banyai@termo.utcluj.ro
abstract
due to the advantages of hydraulic systems with variable displacement, it was necessary to design a control system that can adjust the pressure, flow, power or a combination of these features, and that can be easily integrated into the pump body without changing its mechanical construction. the objective of this work was to study the dynamic behavior of this electro-hydraulic control system. to achieve these objectives, the adjusting system was first analyzed by numerical simulations, and then a stand was constructed for testing the performance of these adjustable pumps. it was shown that this control system is superior to existing systems.
keywords: adjustable pump, control system, dynamic behavior.
1 introduction
this paper presents a study of high power drives, namely hydraulic systems, pursuing a high degree of automation, minimum power consumption, and adaptability to a large range of industrial applications and perturbations. the systems should be flexible.
essential trends in the construction of present-day non-hydraulic machines are toward flexibility and automation. the aim is to increase the level of intelligence of the machines and their adaptation to possible disturbances [2].
variable displacement pumps allow easy control of system parameters (pressure, flow, power, or combinations of these parameters). their technical characteristics make them the best option for most applications, from machine tools to mobile devices [2].
companies with a tradition in manufacturing pumps and motors with axial pistons and variable displacement (rexroth, bosch, vickers, parker) have been producing this type of machinery for highly automated systems since the 1980s, but there is little data in the literature on constructive solutions. the biggest producers of hydraulic machines are opting for mechano-hydraulic control structures that can be used in circuits for regulating pressure, flow and power independently of each other, with each control parameter requiring a different type of constructive control structure [7].
2 system description
the purpose of this paper is to report on optimizations of the adjustment structure of hydraulic parameters, their control in the system to which they belong, and tests on the assembled pump-adjustment system. figure 1 presents a schematic diagram of an automatic control system proposed for implementation in a research program [1, 2].
the system contains the following components:
1 – a variable displacement pump with axial pistons;
2 – a linear hydraulic motor for changing the angular position of the piston block holder, in order to modify the flow of the pump;
3 – a proportional directional valve that controls the position of the linear motor;
4 – pressure sensors;
5 – a diaphragm for measuring the flow rate of the pump;
6 – electronic circuits that calculate the pressure drop across the diaphragm and from it determine the flow; the hydraulic power generated by the pump is then obtained from the signal of a pressure sensor and the signal that represents the flow;
7 – an electronic comparator, designed to find the error between the programmed value and the actual value of the adjusted parameter (pressure, flow, power);
8 – an electronic controller, used to compensate the errors and give the command signal for the proportional valve;
9 – switches whose state determines the control structure;
10 – a fixed displacement pump, which provides the necessary flow for positioning the hydraulic motor; this flow can be taken from the flow of the adjustable pump, in which case the auxiliary pump is no longer required;
11 – a relief valve, which protects the system from exceeding the permissible pressure in the hydraulic components.
thus, without a change in pump construction, this system can be integrated into any control circuit for adjustable hydraulic machines, by simply actuating an electrical switch.
figure 1: electro-hydraulic control system for variable displacement machines
the mathematical model of the control system studied in this paper is based on differential equations that take into account nonlinear influences, such as the dependence of the leakage flow on pressure, saturation of the flow, limiting of the pressure, etc., in order to minimize the deviations between the behavior of the real system and the modeled system. the following differential equations are used to model the system [6]:
• equations of state (isothermal transformation) for discrete volumes: (a), (b);
• the equation of continuity: (c), (d);
• equations describing the behavior of an ideal proportional electromagnet: (e);
• equations describing the behavior of an ideal controller: (f);
• mechanical equilibrium equations: (g).
the general form of these equations was taken from the specialized literature [3, 6], and they were adapted to the model investigated in this paper, resulting in the system of equations described in (1), where the notations have the following meanings:
ṗ_a – temporal derivative of the pressure function p_a in the large chamber of the linear hydraulic motor;
ṗ_b – temporal derivative of the pressure function p_b in the small chamber of the linear hydraulic motor;
p_a – pressure in the large chamber of the linear hydraulic motor;
p_b – pressure in the small chamber of the linear hydraulic motor;
p_t – tank pressure (p_t = 0);
p_c – pressure between the auxiliary pump and the control valve;
p_s – load pressure;
e_u – elasticity modulus;
v_a – volume of oil under pressure p_a;
v_b – volume of oil under pressure p_b;
v_t – (dead) volume in the supply circuit of the linear motor (volume of the connecting pipes);
q_a – flow rate that enters or is discharged from the large chamber of the positioning hydraulic motor;
q_b – flow rate that enters or is discharged from the small chamber of the positioning hydraulic motor;
A – piston area (rodless side);
x_m – linear position of the hydraulic motor;
c_lg – leakage flow coefficient dependent on speed;
c_lp – leakage flow coefficient dependent on pressure;
α – piston surface ratio;
α_q – flow rate coefficient;
d_v – proportional valve diameter;
x_v – linear position of the proportional valve;
ρ – oil density;
t_v – time constant of the control valve;
k_v – gain of the control valve;
u – voltage generated by the pid controller;
u_ref – command voltage (command value for pressure, flow or power);
u_r – voltage generated by the sensors;
k_p – proportional gain, a tuning parameter of the pid controller;
t_d – derivative gain, a tuning parameter of the pid controller;
t_i – integral gain, a tuning parameter of the pid controller;
m_p – linear motor piston mass;
f_am – preload force of the spring in the linear motor;
k_m – spring stiffness;
c_3, c_4 – viscous damping coefficients;
m – reduced mass of the piston and rod of the variable pump;
ω – angular velocity of the piston holder of the pump;
r – placement radius of the pump's pistons;
a – tilting radius of the variable pump;
d – diameter of the pump's pistons.

ṗ_a = e_u/(v_a + v_t) · [q_a − A·ẋ_m − c_lg·ẋ_m + c_lp·(p_a − p_b)]   (a)
ṗ_b = e_u/(v_b + v_t) · [−q_b + α·A·ẋ_m − c_lg·ẋ_m + c_lp·(p_a − p_b)]   (b)
q_a = α_q·d_v·π·x_v·√(2/ρ·(p_c − p_a)) for x_v ≥ 0;  q_a = α_q·d_v·π·x_v·√(2/ρ·(p_a − p_t)) for x_v < 0   (c)
q_b = α_q·d_v·π·x_v·√(2/ρ·(p_b − p_t)) for x_v ≥ 0;  q_b = α_q·d_v·π·x_v·√(2/ρ·(p_c − p_b)) for x_v < 0   (d)
t_v·ẋ_v + x_v = k_v·u   (e)
u = k_p·(u_ref − u_r) + t_d·(u̇_ref − u̇_r) + (1/t_i)·∫(u_ref − u_r)·dt   (f)
m_p·ẍ_m = A·p_a − α·A·p_b + f_am + k_m·x_m − (c_3 + c_4)·ẋ_m + (m·ω²·r²/a²)·x_m − (π·d²·r/(4a))·p_s   (g)
(1)

figure 2: pressure control
figure 3: flow control
figure 4: power control
3 numerical research
a study of the dynamic behavior of the electro-hydraulic control system for adjustable hydraulic pumps involves the use of concrete values for the physical and geometrical quantities involved in the model. an f316-type axial piston pump with variable displacement, made in romania by ump, was chosen [10]. the mathematical model was simulated in the matlab simulink programming environment [11]. the pid controller was tuned using the ziegler-nichols method [13]: k_p = 0.3; 1/t_i = 10; k_d = 0. the behavior of the system was analyzed, e.g. the response to the step command for pressure, flow and power.
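before looking at the step responses, it may help to see how a system like (1) is integrated in code. the sketch below applies the forward euler method to a deliberately reduced model: the valve lag (e) and a damped mass for the positioning motor, with the pressure dynamics lumped into a single flow-to-force gain. all constants are illustrative assumptions, not the f316 pump data [10]:

#include <cstdio>

// minimal sketch: euler integration of a reduced form of system (1).
// valve lag (e):  tv * dxv/dt + xv = kv * u
// motor, reduced from (g): mp * d2xm/dt2 = kq*xv - (c3+c4)*dxm/dt - km*xm
// every constant below is an assumed, illustrative value.
int main() {
    const double tv = 0.01, kv = 1.0;   // valve time constant [s] and gain
    const double mp = 0.5, km = 2.0e4;  // piston mass [kg], spring [n/m]
    const double c34 = 400.0;           // lumped damping [n s/m]
    const double kq = 5.0e3;            // lumped flow-to-force gain [n/m]
    const double dt = 1.0e-4;           // integration time step [s]

    double xv = 0.0, xm = 0.0, vm = 0.0; // valve pos., motor pos., motor vel.
    const double u = 1.0;                // step command from the controller

    for (int k = 0; k < 20000; ++k) {    // simulate 2 s
        double dxv = (kv * u - xv) / tv;                  // eq. (e)
        double am = (kq * xv - c34 * vm - km * xm) / mp;  // reduced eq. (g)
        xv += dt * dxv;
        vm += dt * am;
        xm += dt * vm;
        if (k % 2000 == 0)
            printf("t=%.2f s  xv=%.4f  xm=%.5f m\n", k * dt, xv, xm);
    }
    return 0;
}

the full simulink model of course keeps the pressure states (a)–(d) explicit; the sketch only illustrates the integration scheme and the two-time-scale structure (fast valve, slower motor).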
when adjusting the pressure, the control step representing the input signal corresponds to a variation in the load pressure from 0 to 200 bar (figure 2). when setting the flow, the control step representing the input signal corresponds to a variation in flow from 0 to 30 l/min (figure 3). in the power adjustment version, the step control representing the input signal corresponds to a change in power from 0 to 5 kw (figure 4).
the simulations indicated that the system can be controlled with the same controller values for any particular control type. the dynamic behavior of the system is strongly influenced by the proportionality constant k_p: the natural frequency of the system increases with this constant, and the system remains damped. the integral constant t_i improves the behavior of the steady state, canceling the stationary error. when the value of this constant is increased, it is observed that the damping of the system decreases significantly, bringing it to the limit of instability. with a derivative gain k_d, even at low levels, the damping of the system decreases, bringing it to the limit of instability.
4 experimental research
in order to achieve the objectives of our study, a stand was designed and constructed that was capable of testing the performance of adjustable pumps working under pressure, flow or power control, or a combination of these features. the overall picture and a schematic representation of the experimental plant are presented in figure 5.
figure 5: the stand
the main sub-assemblies of the plant are:
1. the pump, with an electro-hydraulic adjustment system;
2. the hydraulic energy source for the control supply [8];
3. a specific sensor;
4. the electricity supply and control system for the proportional distributor, the sensors and the distributor that simulates the load [5, 12];
5. the acquisition and control hardware system [4];
6. the command and control interface [9];
7. the load simulator.
after tuning the controller, the following constants were determined that offer good dynamic behavior for the system regardless of the chosen control type: k_p = 0.35; 1/t_i = 12; k_d = 0, values very close to those obtained in the simulations, which validates the mathematical model.
an important objective of the research was to study the behavior of the pid controller with various control structures. the dynamic behavior of the control structure was investigated by recording the step response of the signal which gives the command for pressure, flow and power. figures 6 to 8 present in graphical form the data for the set of tests carried out on the stand for the three major types of control.
figure 6: system response to a pressure command of 80 bar, and 20 bar
figure 7: system response to a flow command of 28 lpm and 12 lpm
figure 8: system response to a power command of 3 kw and 0.1 kw
the dynamical behavior of the system is evaluated from the time diagrams (simulated and real), taking into account the following:
• stability: the model and the real system are well damped.
• rapidity: the response time is 0.2 seconds for the model and 0.4 seconds for the real system; the difference is caused by the uncertainty factors that are approximated in the mathematical model.
• precision: the stationary error is below 2 %.
the control system has satisfactory behavior for any adjustment structure.
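the controller law (f) translates directly into a discrete implementation. the sketch below uses the experimentally tuned gains quoted above (k_p = 0.35, 1/t_i = 12, k_d = 0); the sampling rate, the saturation limits and the first-order plant are assumptions made only for illustration:

#include <cstdio>

// minimal discrete form of controller equation (f):
// u = kp*e + td*de/dt + (1/ti)*integral(e dt), here with td = 0.
// gains follow the tuned values from the text; everything else is assumed.
struct Pid {
    double kp, tiInv, td;
    double integral = 0.0, prevErr = 0.0;
    double step(double err, double dt) {
        integral += err * dt;
        double deriv = (err - prevErr) / dt;
        prevErr = err;
        double u = kp * err + tiInv * integral + td * deriv;
        if (u > 10.0) u = 10.0;   // assumed +/-10 v command saturation
        if (u < -10.0) u = -10.0;
        return u;
    }
};

int main() {
    Pid pid{0.35, 12.0, 0.0};     // kp = 0.35, 1/ti = 12, kd = 0
    const double dt = 0.001;      // assumed 1 khz sampling
    double y = 0.0;               // measured parameter (pressure/flow/power)
    const double ref = 1.0;       // normalized step command

    for (int k = 0; k <= 600; ++k) {
        double u = pid.step(ref - y, dt);
        y += dt * (u - y) / 0.05; // assumed first-order plant, tau = 50 ms
        if (k % 100 == 0)
            printf("t=%.1f s  y=%.3f\n", k * dt, y);
    }
    return 0;
}

because the same gains serve all three control structures, switching between pressure, flow and power control only changes which sensor signal feeds the error input.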
5 conclusions
an analysis of the results indicates that there are no significant differences between the mathematical model and the real system, so the model can be used for developing new hydraulic machines equipped with this kind of command and control system, which offers the following advantages:
• reduced complexity of the control circuit construction;
• no use of special, expensive equipment;
• the mechanical structure of the pump is not affected when another hydraulic parameter needs to be controlled;
• the control scheme is integrated with the piloting and tracking mechanism into a compact, small-size system.
it has been shown that, unlike with conventional techniques, the three important hydraulic parameters can be adjusted with the same pump, the same controller and only two pressure sensors, by simple electrical switching. axial piston machines with variable displacement and an electro-hydraulic control system allow complete automation, compatible with the operating cyclograms of complex equipment, by interfacing with a plc or a process computer.
references
[1] banyai, d., vaida, l.: electro-hydraulic control system for variable displacement machines. 12th international conference on automation in production planning and manufacturing, p. 50–58, zilina, slovakia, 2011.
[2] banyai, d.: new methods in synthesis of hydraulic machines with variable displacement and electro-hydraulic adjustment. phd thesis, technical university of cluj-napoca, romania, 2011.
[3] he, q. h., hao, p., zhang, d., zhang, h. t.: modeling and control of hydraulic excavator's arm. journal of central south university of technology, vol. 13, no. 4, 2006, p. 422–427.
[4] năşcuţiu, l., banyai, d., marcu, i. l.: control system for hydraulic transmissions specific to wind machines. dtmm 2010, buletinul institutului politehnic din iaşi, iaşi, romania, 2010. issn 1244-7863.
[5] năşcuţiu, l., banyai, d., opruţa, d.: measurement of hydraulic parameters. dtmm 2010, buletinul institutului politehnic din iaşi, iaşi, romania, 2010. issn 1244-7863.
[6] opruta, d., vaida, l.: dinamica fluidelor. ed. mediamira, cluj-napoca, romania, 2004. isbn 973-713-044-8.
[7] vaida, l.: proportional control for adjustable pumps. phd thesis, technical university of cluj-napoca, romania, 1999.
[8] http://www.boschrexroth.com
[9] http://www.dspace.de
[10] http://www.hidraulica-ph.ro
[11] http://www.mathworks.com
[12] http://www.wika.com
[13] the control handbook. crc press catalog number 8570, 1996. isbn 0-8493-8375.
acta polytechnica vol. 50 no. 2/2010
fpu-supported running error analysis
t. zahradnický, r. lórencz
abstract
a-posteriori forward rounding error analyses tend to give sharper error estimates than a-priori ones, as they use actual data quantities. one such a-posteriori analysis – running error analysis – uses expressions consisting of two parts: one generates the error, and the other propagates input errors to the output. this paper suggests replacing the error generating term with an fpu-extracted rounding error estimate, which produces a sharper error bound.
keywords: analysis of algorithms, a-posteriori error estimates, running error analysis, floating point, numerical stability.
1 introduction
rounding error analyses are used to discover the stability of numerical algorithms and provide information on whether the computed result is valid. they are typically performed in the forward or backward direction, and either a-priori or a-posteriori.
backward error analysis [6] treats all operations as if they were exact but with perturbed data, and we ask for which input data we have solved our problem. using backward error analysis is preferred, as each algorithm which is backward stable is automatically numerically stable [5], while this does not hold for forward error analyses. forward error analyses, on the other hand, treat operations as sources of errors, and tell us the size of the solution that our input data corresponds to.
since error analyses performed a-priori can often be difficult if the algorithm being analyzed is complex, a-posteriori analyses may provide a more feasible alternative (msc 2000 classification: 65g20, 65g50). the objective of an a-priori analysis is to find the worst case bound for an algorithm, if it exists, while a-posteriori analyses calculate the error bound concurrently with the evaluation of the result and work with actual data quantities, providing a tighter error bound.
this paper presents a replacement for the error generating term in error bounds calculated with forward a-posteriori error estimates (also known as running error analysis) to obtain sharper error bounds by exploiting the behavior of the computer fpu. we expect the calculation to be portable (conforming strictly to [3]) and assume an intel x86 compatible fpu which supports double extended precision (80-bit) floating point registers, which are essential for this method.
2 running error analysis and error bounds
running error analysis [6, 7] is a type of a-posteriori forward error analysis. the algorithm being analyzed is extended to calculate a partial error bound alongside the normal calculation. as the algorithm proceeds, bounds are accumulated, making a total error bound estimate. from here on, a binary double precision floating point arithmetic with a guard digit and round-to-nearest rounding mode is assumed.
definition 1. let f_t ⊂ q be a binary floating point set with a precision of t, where q denotes the rational number set.
definition 2. let y = ±m · 2^(e−t) ∈ f_t be a binary floating point number, where m stands for the mantissa, e for the exponent, and t for the precision. let ex(y) = e.
lemma 3. let x̂ = fl(x) be the floating point representation of an exact number x, which is obtained by rounding x to the nearest element in f_t. the rounding process can be described as
x̂ = fl(x) = x(1 + δ), |δ| ≤ u_t,   (1)
where u_t = 2^(−t) is the unit roundoff in precision t. to conserve some space, and for clarity, u without a subscript means u_53.
assumption 4 (standard model). we assume that the operations ♦ ∈ {+, −, ·, /} and the square root follow the standard model [5] and are evaluated with an error no greater than u:
ŝ_♦ = fl(â ♦ b̂) = (â ♦ b̂)/(1 + δ_♦), |δ_♦| ≤ u,
ŝ_√ = fl(√â) = √â/(1 + δ_√), |δ_√| ≤ u,   (2)
where we use ŝ as the computed result of an operation. the standard model will also be used in the form ŝ_♦ = fl(â ♦ b̂) = (â ♦ b̂)·(1 + δ_♦), where useful. table 1 summarizes the error bounds for the basic operations and the square root, with products of errors neglected. we also assume that the following holds: ŝ = s/(1 + δ_s) = s + σ, â = a/(1 + δ_a) = a + α, and b̂ = b/(1 + δ_b) = b + β, where α, β, and σ stand for absolute (static) errors.
table 1: error bound estimates for basic operations

operation                  error bound
addition and subtraction   |σ_±| ≤ u|â ± b̂| + α + β
multiplication             |σ_·| ≤ u|â b̂| + α|b̂| + β|â|
division                   |σ_/| ≤ u|â/b̂| + (α|b̂| + β|â|)/b̂²
square root                |σ_√| ≤ u|√â| + α/(2|√â|)

the right-hand side of each error bound in tab. 1 contains an error generating term u|·|, while the rest of each expression simply propagates the input error(s) to the output. when we substitute u = 2^(−53) into u|ŝ|, multiplying by u reduces the exponent by 53,
u|ŝ| = 1.xx...xx · 2^(ex(ŝ)−2t)  (with 52 x bits),   (3)
leaving the mantissa unchanged if u|ŝ| is normal. the roundoff unit is just the leading 1 in (3), while the x-es are generally nonzero. using u|ŝ| as a rounding error estimate has two problems:
1. the calculated error bound u|ŝ| can be up to two times larger (u Σ_{i=0}^{t−1} ŝ_i 2^(−i) ≲ 2u), assuming that ŝ_i refers to the i-th bit of ŝ;
2. the error generating term always generates an error, even in the case when no physical error was committed.
the expression u|ŝ| therefore tends to give a higher error bound estimate than we would need. however, we can revise this expression if we are interested in obtaining a tighter error bound estimate, and we can do so by exploiting the fpu.
3 analysis
many computers use a processor with the intel x86 architecture [4], which features an fpu with 8 double extended precision floating point registers. once a number is loaded into the fpu, regardless of its precision, it gets automatically converted into the double extended precision format. further calculations are performed in this format, but rounded to the precision specified in the floating point control word register (fcw). the default settings of the fcw specify double extended precision, round to the nearest rounding mode, and mask out all floating point exceptions. if necessary, they can be changed by storing the control word with fstcw, modifying it, and loading it back. once the result gets stored back into a memory location, it is rounded to (or extended, if the fcw specified single precision) a defined precision, depending on the type of the store instruction.
3.1 obtaining a tighter bound from the fpu
when rounding to the nearest element of f_t, the relative error is no worse than u_t, and we can extract the static error from the fpu by subtracting the computed value in double extended precision from its rounded value. this idea looks similar to compensated summation [5], which, in contrast to this paper, uses an entirely software approach. subtracting the values provides an 11-bit estimate of the real rounding error, which is no greater than u. the entire idea can be written as:
theorem 5. let s̄ be the result of an operation performed in double extended precision and let ŝ be the result in double precision, which we obtain by rounding s̄ to 53 bits by rounding to the nearest as defined by the ieee 754 standard. the absolute rounding error of an operation in floating point can rather be calculated as δ = |ŝ − s̄|, where |δ| ≤ u|ŝ|.
proof. we can extract 64-bit mantissa intermediate results (s̄) from the fpu before they get rounded to 53 bits (ŝ), and calculate δ = |ŝ − s̄|. the quantity s̄ is rounded either down to f_1 ∈ f_53 or up to f_2 ∈ f_53, where f_1, f_2 ∈ f_53 satisfy |f_2 − f_1|/|f_1| = 2u, s̄ ∈ f_64 ∧ s̄ ∈ [f_1, f_2], and ŝ = round_{64→53}(s̄) ⇒ ŝ ∈ {f_1, f_2}, where
ŝ = round_{64→53}(s̄) = f_2 if |s̄ − f_1|/|f_1| > u;  f_1 or f_2 (see the footnote below) if |s̄ − f_1|/|f_1| = u;  f_1 if |s̄ − f_1|/|f_1| < u.   (4)
the following figure shows the principle:
fig. 1: rounding a double extended precision number to double precision (f_1 and f_2 are 2u apart; values below the midpoint round down, values above round up, ties round to even)
in all cases, the relative error can be computed in the same way as |ŝ − s̄|/|ŝ| ≤ u and is always bounded by u. □
(footnote: rounding to even as in [3] applies here.)
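where inline assembly is undesirable, the subtraction from theorem 5 can be sketched portably: on x86 platforms where long double maps to the 80-bit double extended format, δ = |ŝ − s̄| is obtainable directly in c++. the fragment below is such an illustration, not the fdouble implementation described later; it assumes a compiler and platform where ldbl_mant_dig == 64:

#include <cfloat>
#include <cmath>
#include <cstdio>

int main() {
    // assumes long double is the x86 80-bit double extended format
    // (LDBL_MANT_DIG == 64); otherwise the demonstration degenerates.
    printf("long double mantissa bits: %d\n", LDBL_MANT_DIG);

    const double a = 1.0, b = 3.0;
    long double sExt = (long double)a / b;  // s-bar: extended precision result
    double sDbl = (double)sExt;             // s-hat: rounded to 53 bits

    long double delta = fabsl((long double)sDbl - sExt); // |s-hat - s-bar|
    double u = ldexp(1.0, -53);                          // unit roundoff 2^-53
    double bound = u * fabs(sDbl);                       // classical u*|s-hat|

    printf("delta     = %.3Le\n", delta);
    printf("u*|s-hat| = %.3e\n", bound);
    printf("delta <= u|s-hat| ? %s\n", (double)delta <= bound ? "yes" : "no");
    return 0;
}

on a typical run, delta is visibly smaller than u|ŝ|, which is exactly the gap that corollary 6 below exploits.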
note 1. theorem 5 is proved for positive numbers, but it can be proved for negatives too.
corollary 6. the expression δ = |ŝ − s̄| ≤ u|ŝ| provides an 11-bit estimate of the static rounding error, and since |δ| ≤ u|ŝ|, it can be safely substituted for all occurrences of u|ŝ| in tab. 1, providing a tighter error bound.
3.2 implementation highlights
our approach is implemented in the c++ language as a class fdouble with gcc at&t inline assembly language fpu statements [9]. the fdouble class overloads most operators and some standard functions (currently only the logarithm, power and exponential functions, absolute value, remainder and the square root are available), which are beyond the scope of this paper, providing a handy replacement for the double data type. the transition from the double data type to fdouble is performed just by changing the name of the data type; the rest is handled by the class. as an illustration of how the class works, the source of the plus operator fdouble::operator+ and its assembly portion op_plus, performing the fpu-supported running error analysis with the terms above, follows:

fdouble fdouble::operator+(fdouble const &b) const
{
    fdouble r;
    op_plus(r.d, r.e, this->d, this->e, b.d, b.e);
    return r;
}

the fdouble::operator+ method calls the op_plus function, which contains the assembly inline and is common to all other fdouble::operator+ overloads:

inline void op_plus(double& result, double& result_err,
                    register const double a, register const double a_err,
                    register const double b, register const double b_err)
{
    __asm__ __volatile__(
        "fldl %1\n\t"  // 1. load b
        "fldl %2\n\t"  // 2. load a
        "faddp\n\t"    // 3. calc a+b
        "fstl %3\n\t"  // 4. store rounded result
        "fldl %3\n\t"  // 5. load rounded result
        "fsubp\n\t"    // 6. calc difference
        "fabs\n\t"     // 7. calc its absolute value
        "fldl %4\n\t"  // 8. load a_err
        "faddp\n\t"    // 9. calc diff + a_err
        "fldl %5\n\t"  // 10. load b_err
        "faddp"        // 11. calc diff + a_err + b_err
        : "=t"(result_err) // 12. put result in result_err
        : "m"(b),          // 13. %1 = b
          "m"(a),          // 14. %2 = a
          "m"(result),     // 15. %3 = result
          "m"(a_err),      // 16. %4 = a_err
          "m"(b_err)       // 17. %5 = b_err
    );
}

op_plus first loads both double precision values a and b (lines 1 and 2) onto the floating point stack, and the fpu extends them to double extended precision. the sum is then calculated (line 3), popping a from the fpu stack and replacing b by the sum. since the result is still in double extended precision, it has to be stored back into memory to obtain a result rounded as required by the standard, which demands rounding the result (ŝ) after each operation (line 4); the rounded result (ŝ) is then pushed back onto the floating point stack (line 5). ŝ − s̄ is calculated (line 6) and its absolute value is determined (line 7). the errors are propagated according to tab. 1 (lines 8 through 11), and the requested absolute error is found at the top of the fpu stack (line 12). lines 13 through 17 specify the input operand constraints for the __asm__ directive. other operators and functions are implemented in a similar way. traditional running error analysis is implemented alike, and provides the class radouble.
the source code for radouble::operator+ follows:

radouble radouble::operator+(radouble const &b) const
{
    radouble r;
    r.d = d + b.d;
    r.e = fabs(r.d) * roundoff_unit + e + b.e;
    return r;
}

here we can see that no assembly language inlines are necessary, but special compiler flags must be provided in order to obtain the same results as with the fdouble class. these gcc flags are: -ffloat-store, which ensures that the result of each floating point operation is stored back into memory and rounded to the defined precision (required by ieee 754), and -mfpmath=387, which assures that the floating point unit is used. without the second flag, the optimizer could use sse instructions and provide less accurate results. readers interested in obtaining the complete source code for all classes should refer to [9], but cite this paper.
3.3 complexity and implementation notes
traditional running error analysis needs to calculate u|ŝ|, and four x86 floating point instructions are necessary (this approach works only with the fpu, not sse2/sse3, as sse does not support double extended precision and we would be unable to calculate |ŝ − s̄|). these are: an fld instruction to load u onto the floating point stack, fmul to calculate uŝ, fabs to obtain |uŝ| = u|ŝ|, and fstl to store the result back into memory and round it to double precision.
our approach substitutes |ŝ − s̄| for u|ŝ|, and a typical evaluation requires an fldl instruction to load s̄ onto the floating point stack, fsub to calculate ŝ − s̄, fabs to obtain its absolute value |ŝ − s̄|, and fstl to store the result into memory while rounding it to double precision. according to the instruction latency tables [1], the fsub instruction latency is 3 cycles, while the fmul latency is 5. the following table compares the two approaches from the latency point of view, showing that there is a speed enhancement:

table 2: comparing the fpu instructions necessary to evaluate the absolute error

traditional approach      our approach
instruction  latency      instruction  latency
fldl         3            fldl         3
fmul         5            fsub         3
fabs         1            fabs         1
fstl         3            fstl         3
total        12           total        10

our approach not only provides a better error bound estimate, but also does so at a higher speed, and we should expect about a 20 per cent speedup.
3.4 case study
the case study compares the presented approach to traditional running error analysis, and also compares it to a forward error bound determined a-priori, on the following mathematical identity:
y_∞ = log 2 = Σ_{k=1}^{∞} (−1)^(k−1)/k.   (5)
the c++ and mathematica [8] software packages are used to verify the results. a rounding error analysis of (5) for a double precision floating point arithmetic with n addends is given under the following assumptions:
1. the first two addends (1 and 1/2) of the sum are exact, as we use a binary floating point arithmetic;
2. (−1)^(k−1) is evaluated as 1 when k is odd and −1 otherwise, rather than using the pow function from libm, thus the result is always exact;
3. all operations follow the standard model.
the computed ŷ_n is expressed as
ŷ_n = [···[(1 − 1/2)(1 + δ_1) + (1/3)(1 + δ_2)](1 + δ_3) − ···](1 + δ_{2n−3})   (6)
    = 1(1 + θ_n) − (1/2)(1 + θ_n) + (1/3)(1 + θ_n) − ··· + ((−1)^(n−1)/n)(1 + θ_2)   (7)
    ≤ (1 − 1/2 + 1/3 − 1/4 + − ··· + (−1)^(n−1)/n)(1 + θ_n) = (1 + θ_n) Σ_{k=1}^{n} (−1)^(k−1)/k,   (8)
where |δ_i| ≤ u, Π_{i=1}^{n} (1 + δ_i)^(ρ_i) = 1 + θ_n with ρ_i = ±1, and |θ_n| ≤ nu/(1 − nu) = γ_n, with nu < 1 [5]. the backward error is immediately visible from equation (7), in which we can consider the sum as an exact sum of data entries perturbed by a relative value certainly bounded by γ_n. the forward a-priori error bound is then calculated as
|ŷ_n − y_n| ≤ γ_n Σ_{k=1}^{n} (−1)^(k−1)/k,   (9)
and represents the worst case error estimate, which can be far from the true error bound. the following section demonstrates this statement.
3.4.1 results
results are provided for two summation orders. the first case performs the summation in decreasing order of magnitude, where a poor, insufficiently accurate result quickly becomes visible. figure 2 depicts this scenario, where n stands for the number of iterations and the y-axis shows the base-10 logarithm of the order of the error.
fig. 2: calculated value of ŷ_n and its error bounds (forward sum direction); axes: base-10 logarithm of the error vs. log_2 n; curves: result error, forward error bound, running error bound, our approach
the first curve presents the absolute error, that is log_10(|ŷ_n − log 2|), and it gets smaller with each further accumulation of the sum. the rounding error bounds go against this error from bottom to top, and are represented by the remaining three curves. from top to bottom: the first of them presents the worst case, that is, the a-priori forward error bound obtained from the right-hand side of (9). the second stands for the a-posteriori running error bound; we can see that it is slightly better than the a-priori bound. the last curve presents our approach, which accounts for the real rounding errors, and that is the reason why the first two iterations have no error (cf. the points of our approach at the top). we can also observe that if we use the a-priori bound, we will have to terminate the accumulation somewhere after 2^26 iterations; after 2^27 iterations with the running error bound; and after approximately 2^28 iterations with our approach. calculating more iterations beyond the bound does not make sense in this case, because each accumulant is smaller than the error of the result.
fig. 3: calculated value of ŷ_n and its error bounds (reverse sum direction); axes: base-10 logarithm of the error vs. log_2 n; curves: result error, forward error bound, running error bound, our approach
the second case performs the sum in the reverse order, i.e. going from the numbers of smallest magnitude towards greater numbers (see figure 3). note that this scenario demonstrates the claim that an error bound obtained a-priori (i.e. the forward error bound) is often too pessimistic and presents the worst case. using it would lead to a premature termination of the sum evaluation. one more thing is demonstrated: we should accumulate numbers in increasing order of magnitude. if we do not, numbers of small magnitude start to contribute less and less to a proportionally huge sum value, and sooner or later further additions do not change the value of the sum. due to this fact, the harmonic series Σ_{k=1}^{∞} 1/k = ∞ converges in finite arithmetics, and has a finite sum depending on the type of arithmetic.
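the termination criterion discussed above can be made concrete with a few lines of c++: accumulate series (5) while carrying the traditional running error bound for addition from tab. 1, and stop once the next addend drops below the accumulated bound. this is a plain reimplementation of the idea, not the radouble class itself, and the iteration cap is an arbitrary assumption:

#include <cmath>
#include <cstdio>

int main() {
    const double u = ldexp(1.0, -53);  // unit roundoff for double, 2^-53
    double sum = 0.0, bound = 0.0;     // running value and running error bound

    long long k = 1;
    for (; k <= (1LL << 30); ++k) {    // arbitrary safety cap
        double term = ((k & 1) ? 1.0 : -1.0) / (double)k; // sign exact (assumption 2)
        double termErr = u * fabs(term);                  // error of the division
        sum += term;
        // table 1, addition: |sigma| <= u|a+b| + alpha + beta
        bound = u * fabs(sum) + bound + termErr;
        if (fabs(term) < bound) break; // further addends fall below the error level
    }
    printf("stopped after k = %lld terms (log2 k ~ %.1f)\n", k, log2((double)k));
    printf("sum   = %.17g\n", sum);
    printf("log 2 = %.17g\n", log(2.0));
    printf("bound = %.3e, true error = %.3e\n", bound, fabs(sum - log(2.0)));
    return 0;
}

with double precision this loop stops near k ≈ 2^27, in line with the running error bound estimate quoted above; replacing the u|ŝ| term by the fpu-extracted δ pushes the stopping point further out.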
the following table presents selected portions of the evaluation time for both a-posteriori error analyses, including the original computation. in addition, the table shows the run time of the objdouble class, which wraps the double data type and is used, after subtracting the double column, to determine the amount of the c++ object overhead. the results were obtained with the posix getrusage api, which provides, besides other information, the amount of used time in seconds and microseconds since the start of the process. for measuring purposes, these are internally converted into milliseconds, and a difference of two millisecond values performs a measurement without interference from other processes and the operating system. each value was measured 50 times and the results were statistically evaluated [2].
when we compare the run time of traditional running error analysis to a simple evaluation in double precision, we find that object-oriented running error analysis with the radouble class is approximately 2.81 times slower. this slowdown includes the c++ overhead, consisting of object construction and operator calls. the overhead was measured with the objdouble class, and the results show that the evaluation with the objdouble class took 2 times longer than with the double data type. our approach is only 2.35 times slower than the original approach with the double data type, and it is 1.19 times faster than the traditional approach. this speedup is what we expected from tab. 2.

table 3: run times for reverse sum evaluations of selected n operations with four data types: the double data type with no error analysis; the radouble data type (object-oriented traditional running error analysis); the fdouble data type (fpu-supported running error analysis); and the objdouble class (which wraps the double data type and is used to measure the c++ overhead). all values are rounded to whole milliseconds.

log2 n   double   radouble   fdouble   objdouble
16            1          3         3           2
20            8         50        42          36
24          285        803       672         573
28        4 568     12 849    10 777       9 178
32       73 120    205 726   172 538     146 847

4 conclusions
the traditional running error analysis approach uses the u|ŝ| term to get a rounding error estimate and, as we have shown, this estimate can be up to two times larger than the actual rounding error. moreover, this term always generates an error, regardless of whether an error was physically committed.
by exploiting the floating point unit's behavior on the intel x86 platform, we are able to obtain an 11-bit error estimate by subtracting the rounded result ŝ from its not yet rounded equivalent s̄; when divided by ŝ, this estimate is always less than or equal to the unit roundoff u. our approach is very similar to the traditional approach; it also needs four fpu instructions, but it replaces a multiplication by a subtraction, which saves 2 cpu cycles per evaluation, obtaining almost a 20 per cent speedup over traditional running error analysis.
we encourage the use of running error analysis in all iterative tasks where critical cancellation can occur, such as during the evaluation of a numeric derivative. as we have seen in the case study, an error bound determined a-priori can be far from the actual error and, especially with the provided classes, running error analysis provides a very quick and feasible replacement. this, however, costs 2.85 times the evaluation time; when replaced by fpu-supported running error analysis, the cost is 2.35 while providing a yet tighter bound. with a tighter bound, we are able to calculate more iterations and be sure that the result is still valid.
acknowledgement
this research has been partially supported by the ministry of education, youth, and sport of the czech republic, under research program msm 6840770014, and by the czech science foundation as project no. 201/06/1039.
references
[1] fog, a.: instruction tables: lists of instruction latencies, throughputs and micro-operation breakdowns for intel and amd cpus. copenhagen university college of engineering; available online as a part of software optimization resources, http://www.agner.org/optimize/, 2008.
[2] bevington, p., robinson, d. k.: data reduction and error analysis for the physical sciences. mcgraw-hill science/engineering/math, 2002.
[3] ieee computer society standards committee, working group of the microprocessor standards subcommittee, and american national standards institute: ieee standard for binary floating-point arithmetic, ansi/ieee std 754-1985. ieee computer society press, 1985.
[4] intel corporation: intel 64 and ia-32 architectures software developer's manual, vol. 3a, system programming guide, part 1, 11/2007.
[5] higham, n. j.: accuracy and stability of numerical algorithms. 2nd edition, society for industrial and applied mathematics, 2002.
[6] wilkinson, j. h.: the state of the art in error analysis. nag newsletter, vol. 2/85, 1985.
[7] wilkinson, j. h.: error analysis revisited. ima bulletin, vol. 22, no. 11/12, pp. 192–200, 1986.
[8] wolfram research, inc.: mathematica 7, http://www.wolfram.com/, 2008.
[9] zahradnický, t.: fdouble class, a double data type replacement c++ class with the fpu-supported running error analysis. accessible online at http://service.felk.cvut.cz/anc/zahradt/fdouble.tar.gz, 2007.
ing. tomáš zahradnický was born on 9th march 1979 in prague, czech republic. in 2003 he graduated (msc) from the department of computer science and engineering of the faculty of electrical engineering at the czech technical university in prague. since 2004 he has been a postgraduate student at the same department, where he became an assistant professor in 2007. since 2009 he has been an assistant professor at the department of computer systems at the faculty of information technologies at the czech technical university in prague. his scientific research focuses on system optimization and on parameter extraction.
doc. ing. róbert lórencz, csc. was born on 10th august 1957 in prešov, slovak republic. in 1981 he graduated (msc) from the faculty of electrical engineering at the czech technical university in prague. he received his ph.d. degree in 1990 from the slovak academy of sciences. in 1998 he became an assistant professor at the department of computer science and engineering at the faculty of electrical engineering at the czech technical university in prague. in 2005 he defended his habilitation thesis and became an associate professor at the same department. since 2009 he has worked as an associate professor at the department of computer systems at the faculty of information technologies at the czech technical university in prague. his scientific research focuses on arithmetics for numerical algorithms, residual arithmetic, error-free computation and cryptography.
ing. tomáš zahradnický
doc. ing. róbert lórencz, csc.
e-mail: zahradt|lorencz@fit.cvut.cz
department of computer systems
faculty of information technologies
czech technical university in prague
kolejní 550/2, 160 00 prague, czech republic
acta polytechnica vol. 52 no. 6/2012
synthesis of mechanisms by methods of nonlinear dynamics
michael valášek, zbyněk šika
department of mechanics, biomechanics and mechatronics, faculty of mechanical engineering, czech technical university in prague, technická 4, 16607 praha 6, czech republic
corresponding author: michael.valasek@fs.cvut.cz
abstract
this paper deals with a new method for the parametric kinematic synthesis of mechanisms. the traditional synthesis procedure based on collocation, correction and optimization suffers from the local minima of objective functions, usually due to the local unassembled configurations which must be overcome. the new method uses time varying values of the synthesized dimensions of the mechanism, as if the mechanism had elastic links and guidances. the time varying dimensions form the basis for an accompanying nonlinear dynamical dissipative system, and the synthesis is transformed into the time evolution of this accompanying dynamical system. its dissipativity guarantees the termination of the synthesis. the synthesis always covers the parametric kinematic synthesis, but it can be advantageously extended into the optimization of any further criteria. the main advantage of the method described here for dealing with mechanism synthesis is that it overcomes the unassembled configurations of the synthesized mechanisms, enables any further synthesis criteria to be introduced, and terminates due to the dissipation of the accompanying dynamical system.
keywords: synthesis of mechanisms; time varying dimensions; evolution of dissipative systems; multi-objective optimization; dexterity; workspace; built-up space.
1 introduction
like other engineering problems, the parametric kinematic synthesis of mechanisms has profited from computational methods, e.g. [1]. traditional methods are described specifically for particular types of mechanisms [2–5]. general iterative procedures based on various optimization methods [1, 6–11] have been developed recently. the currently used methods seem to be sufficiently powerful and able to find solutions for most problems in mechanism synthesis. however, all these methods suffer from two related problems. the first problem is that the dimensions of the mechanism that are being optimized do not allow the mechanism to be assembled in all the positions required for the desired motion. the second problem is that if some mechanism synthesis iteration fails for a certain parameter because of a constraint and/or an assembly violation, the whole knowledge from this iteration is lost.
a solution to the first problem has been proposed with the use of time-varying dimensions during the dimension iteration process for the mechanism [12]. by allowing the dimensions of the system that are treated as the design variables to vary during the motion of the mechanism, it is possible to guarantee that the system can be assembled in all configurations. this leads to a variation of each dimension during the cycle of the mechanism. the synthesis problem is then solved by attempting to minimize the deviation from the mean value for all the design variables during the cycle. a solution to the second problem is proposed in [13], where the non-assembly positions are used for the synthesis. however, this approach suffers from a slow iteration process with unclear termination. this problem is overcome by the new method described in this paper. this new approach has been described in [14–17].
our paper formulates the method in a specialized way for a very large but restricted class of synthesis problems of mechanisms. this enables the description to be made in a more precise, systematic and algorithmic way. in addition, two interesting examples are included that have not previously been fully described [17].
2 general formulation of the method
2.1 traditional vector method
the initial assumption is that the mechanism to be synthesized can be analyzed by the vector method (for 2d problems e.g. [18], for 3d problems e.g. [19]). this is the only reduction in generality compared to the formulation in [14–17]. let us further describe the general procedure, without loss of generality, just for 2d.
the mechanism to be synthesized is described by the vector method, which leads to a description of the mechanism by vector polygons with vector vertices v_i (i = 1, . . . , n) (a simple case is shown in fig. 1) and vectors b_i (i = 1, . . . , n_v, n_v ≥ n), with parameters the lengths b_i and the angles β_i of the vectors (fig. 2).
figure 1: simple vector polygon
figure 2: vector parameters
these parameters include both the coordinates (variable parameters from b_i and β_i) and the dimensions (constant parameters from b_i and β_i) of the synthesized mechanism. the parameters b_i and β_i can therefore be split into variable coordinates s_k and constant parameters p_j (j = 1, . . . , m). this is the traditional formulation of mechanism synthesis, where the dimensions being synthesized are constant values. the fundamental objective functions are typically constructed on a network of mechanism positions within the desired workspace. let index r denote a general position of the mechanism, r = 1, 2, . . . , n, where n is the number of such representative positions, and the corresponding coordinates are s_{j,r}. some of the coordinates s_{j,r} are prescribed for particular positions r = 1, 2, . . . , n. the traditional synthesis method is a search for constant parameters p_j (j = 1, . . . , m) such that the closure conditions of the vector polygons are fulfilled in all r (r = 1, 2, . . . , n) positions [18, 19].
2.2 new method
the new method is based on variation of the mechanism dimensions. they are no longer constant, and the time varying parameters p_{j,r} can vary between the positions r (r = 1, 2, . . . , n) of the mechanism. during the synthesis process it is therefore admitted that
p_{1,1} ≠ p_{1,2} ≠ ··· ≠ p_{1,n},
p_{2,1} ≠ p_{2,2} ≠ ··· ≠ p_{2,n},
...
p_{m,1} ≠ p_{m,2} ≠ ··· ≠ p_{m,n}   (1)
and the synthesis goal is to reach equality of all parameters at the end of the synthesis:
p_{1,1} ≅ p_{1,2} ≅ ··· ≅ p_{1,n},
p_{2,1} ≅ p_{2,2} ≅ ··· ≅ p_{2,n},
...
p_{m,1} ≅ p_{m,2} ≅ ··· ≅ p_{m,n}   (2)
however, new coordinates are used by the new method. they are the cartesian coordinates x_{vi,r}, y_{vi,r} of the polygon vector vertices v_i (i = 1, . . . , n), which are variable. the varying values of the parameters p_{j,r} that correspond to the dimensions being synthesized, and that are constant in the traditional vector method, can be determined from the positions of the vertices v_i in each position r.
if the distance v_i v_{i+1} corresponds to a constant length dimension of the synthesized mechanism, then its time varying value is computed in each position r and at each time as
p_{i,r} = b_{i,r} = √((x_{v_{i+1},r} − x_{v_i,r})² + (y_{v_{i+1},r} − y_{v_i,r})²)   (3)
and if the angle of v_i v_{i+1} with respect to the frame corresponds to a constant angular dimension of the synthesized mechanism, then its time varying value is computed in each position r and at each time as
p_{i,r} = β_{i,r} = atan((y_{v_{i+1},r} − y_{v_i,r})/(x_{v_{i+1},r} − x_{v_i,r})).   (4)
the new coordinates x_{v_i,r}, y_{v_i,r} and the parameters p_{j,r} are time varying during the synthesis, and constant for the synthesized mechanism after the synthesis with time varying values. the new coordinates x_{v_i,r}, y_{v_i,r} are the coordinates of the accompanying nonlinear dynamical dissipative system, which is described by the lagrange equations. its kinetic energy is
e_k = ½ Σ_{k=1}^{n} Σ_{r=1}^{n} m_k (ẋ²_{k,r} + ẏ²_{k,r}),   (5)
where m_k are artificially introduced masses, and its potential energy is
e_p = ½ Σ_{k=1}^{n} Σ_{r=1}^{n} k_k Σ_{i=1}^{n} (p_{k,r} − p_{k,i})²,   (6)
where k_k are artificially introduced stiffnesses. the potential energy describes the excitation of the new dynamic system whenever the parameters p_{k,r} are not equal to each other between the positions r = 1, . . . , n. the dissipation is introduced by the rayleigh function
d = ½ Σ_{k=1}^{n} Σ_{r=1}^{n} b_k (ẋ²_{k,r} + ẏ²_{k,r}),   (7)
where b_k are artificially introduced damping coefficients. dissipation guarantees the removal of energy from the new dynamic system, and thus brings the system into equilibrium.
figure 3: four-bar mechanism
the synthesis process is now transformed into the evolution of the accompanying nonlinear dissipative dynamical system. the system has the coordinates x_{v_i,r}, y_{v_i,r}, the kinetic energy (5), the potential energy (6), where the variables are described by formulas (3)–(4) as functions of the coordinates x_{v_i,r}, y_{v_i,r}, the rayleigh function (7), and the initial conditions x_{v_i,r}(0), y_{v_i,r}(0) as estimations of the positions of the mechanism described by the vector polygon vertices i = 1, . . . , n in the particular positions r = 1, . . . , n. it is supposed that this new accompanying system reaches its equilibrium, given by
e_k = 0, e_p = 0,   (8)
where e_k = 0 results from (5) in
ẋ_{k,r} = 0, ẏ_{k,r} = 0,   (9)
and e_p = 0 results from (6) in
p_{k,r} = p_{k,i}.   (10)
these final values p_k = p_{k,r} are the synthesized parameters (dimensions) describing the synthesized mechanism [14–17].
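the evolution described by (5)–(7) can be illustrated with a deliberately small c++ model: one length parameter p_r = √(x_r² + y_r²) measured in three positions with prescribed y_r and free x_r, coupling springs in the spirit of (6), and dampers in the spirit of (7). the masses, stiffness, damping and time step below are arbitrary illustrative values:

#include <cmath>
#include <cstdio>

// toy accompanying dissipative system, following (5)-(7):
// one link length p_r = sqrt(x_r^2 + y_r^2) per position r, y_r prescribed,
// x_r free; spring forces pull the p_r together, dampers remove energy.
// all numeric constants are illustrative assumptions.
int main() {
    const int n = 3;
    const double y[n] = {0.4, 0.7, 1.0};   // prescribed coordinates per position
    double x[n] = {1.5, 0.9, 0.2};         // initial estimates of free coordinates
    double v[n] = {0.0, 0.0, 0.0};         // velocities
    const double m = 1.0, k = 50.0, b = 8.0, dt = 1.0e-3;

    for (int step = 0; step < 20000; ++step) {
        double p[n];
        for (int r = 0; r < n; ++r) p[r] = sqrt(x[r] * x[r] + y[r] * y[r]);
        for (int r = 0; r < n; ++r) {
            double f = 0.0;                 // spring force from the other positions
            for (int j = 0; j < n; ++j) f -= k * (p[r] - p[j]);
            double a = (f * (x[r] / p[r]) - b * v[r]) / m; // project on x, damp
            v[r] += dt * a;
            x[r] += dt * v[r];
        }
        if (step % 5000 == 0)
            printf("step %5d: p = %.4f %.4f %.4f\n", step, p[0], p[1], p[2]);
    }
    return 0;
}

the printed p values converge to a common length, which is the equilibrium condition (10) in miniature: the time varying dimension has become a constant, synthesized one.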
3 extension of the method
the method described here deals just with the parametric positional synthesis of a mechanism. the synthesis is usually more complicated, and it can generally be described as the minimization/maximization of a set of further objective functions
min cf_ℓ(s_1, s_2, . . . , s_n, p_1, p_2, . . . , p_m), ℓ = 1, 2, . . . , n_cf.   (11)
these objective functions are taken into consideration by extending the potential energy (6) by new terms,
e_p = ··· + ½ Σ_{ℓ=1}^{n_cf} q_ℓ (cf_ℓ − cf_{d,ℓ})²,   (12)
where q_ℓ are chosen positive constants and cf_{d,ℓ} are the desired values of the objective functions cf_ℓ. then the equilibrium (8) of the new accompanying dynamic system also optimizes the objective functions (11) [15–17]. the artificially introduced parameters m_k, k_k, b_k and q_ℓ can be chosen as arbitrary positive numbers, but their values influence the dynamics of the synthesis.
figure 4: associated dissipative system of the four-bar mechanism
4 planar example
the main disadvantages of general optimization methods combined with traditional kinematical synthesis are a high computational cost, extreme growth of the computational complexity with the number of optimized parameters, and an inability to find the solution even though it exists. the following example shows the main advantage of the evolution of the associated dissipative system over general optimization methods, e.g. genetic algorithms. the comparison of the methods is focused on the synthesis of the kinematical system (a four-bar mechanism) from [12]. this mechanism was chosen because it is relatively simple and it simultaneously has 9 dimensional parameters. moreover, the example can easily be extended with other optimized parameters, e.g. the angles of the crank.
the original classical optimization formulation of the example (fig. 3) is as follows. the kinematical system has 9 + n dimensions to be synthesized, where n denotes the number of desired positions of the end effector. these dimensions are the length of the crank a, the length of the coupler c, the length of the follower b, and the lengths of the two rods d and e that are connected with the end effector. the other set of optimized dimensions is the x and y positions of the fixed part of the crank (x_a, y_a) and of the follower (x_b, y_b). the last set of parameters to be optimized is the positive increments of the angles β_{1i} of the crank,
β_{1i} = β_{11} + Σ_{j=2}^{i} β_{1j}.   (13)
figure 5: evolution of the coordinates of a four-bar mechanism
the point m of the mechanism should pass through the given positions m_i (i = 1, 2, . . . , n) on the given trajectory. the coordinates of the mechanism are β_1, β_2 and β_3. the system constraints are then
a² = (x_{di} − x_{ai})² + (y_{di} − y_{ai})²,
b² = (x_{bi} − x_{ci})² + (y_{bi} − y_{ci})²,
c² = (x_{ci} − x_{di})² + (y_{ci} − y_{di})²,
d² = (x_{mi} − x_{di})² + (y_{mi} − y_{di})²,
e² = (x_{mi} − x_{ci})² + (y_{mi} − y_{ci})².   (14)
the system constraints are then

$$a^2 = (x_{di} - x_{ai})^2 + (y_{di} - y_{ai})^2,\quad b^2 = (x_{bi} - x_{ci})^2 + (y_{bi} - y_{ci})^2,\quad c^2 = (x_{ci} - x_{di})^2 + (y_{ci} - y_{di})^2,\quad d^2 = (x_{mi} - x_{di})^2 + (y_{mi} - y_{di})^2,\quad e^2 = (x_{mi} - x_{ci})^2 + (y_{mi} - y_{ci})^2. \qquad (14)$$

the optimization task is defined as

$$cf = q_1\sum_{i=1}^{n}(x_{mi} - x'_{mi})^2 + q_2\sum_{i=1}^{n}(y_{mi} - y'_{mi})^2 \rightarrow \min \qquad (15)$$

and the set of constraints of the optimization task is

$$\beta_{1(i+1)} > \beta_{1i}, \qquad (16)$$

where the optimization parameters are a, b, c, d, e, x_a, y_a, x_b, y_b and β_{1i} (i = 1, 2, ..., n). the parameters q_i again denote the penalization coefficients.

the associated dynamical dissipative system consists of n subsystems for the individual positions of point m (fig. 4). the masses m_a, m_b, m_d, m_e are introduced in the points a_i, b_i, d_i, e_i. the interactions between the subsystems are ensured by forces of a linear spring nature. a nonzero force acts on the relevant masses whenever the corresponding dimension differs between subsystems i and j (i, j = 1, 2, ..., n). the stabilization of the whole system is ensured by damper elements between the masses and the inertial frame.

figure 4: associated dissipative system of the four-bar mechanism

the constraints of the associated system were changed into the form

$$a_i^2 = (x_{di} - x_{ai})^2 + (y_{di} - y_{ai})^2,\quad b_i^2 = (x_{bi} - x_{ci})^2 + (y_{bi} - y_{ci})^2,\quad c_i^2 = (x_{ci} - x_{di})^2 + (y_{ci} - y_{di})^2,\quad d_i^2 = (x_{mi} - x_{di})^2 + (y_{mi} - y_{di})^2,\quad e_i^2 = (x_{mi} - x_{ci})^2 + (y_{mi} - y_{ci})^2,\quad x_{mi} = \text{const. (prescribed)},\quad y_{mi} = \text{const. (prescribed)}. \qquad (17)$$

the forces that act in the dynamical system are as follows:

$$f_{xai} = \sum_{j=1}^{n} k_{xa}(x_{ai} - x_{aj}),\quad f_{yai} = \sum_{j=1}^{n} k_{ya}(y_{ai} - y_{aj}),\quad f_{xbi} = \sum_{j=1}^{n} k_{xb}(x_{bi} - x_{bj}),\quad f_{ybi} = \sum_{j=1}^{n} k_{yb}(y_{bi} - y_{bj}),\quad f_{ai} = \sum_{j=1}^{n} k_{a}(a_{i} - a_{j}),\quad f_{bi} = \sum_{j=1}^{n} k_{b}(b_{i} - b_{j}),\quad f_{ci} = \sum_{j=1}^{n} k_{c}(c_{i} - c_{j}),\quad f_{di} = \sum_{j=1}^{n} k_{d}(d_{i} - d_{j}),\quad f_{ei} = \sum_{j=1}^{n} k_{e}(e_{i} - e_{j}),\quad m_{ai} = k_{mi}(\beta_{i-1} - \beta_{i}), \qquad (18)$$

where for each mechanism position i the equations take into account all the forces connecting it with the other positions j of the mechanism. therefore the system forms altogether 2n equations. the coefficient $k_{mi}$ takes a nonzero constant value only if $\beta_{1(i+1)} < \beta_{1i}$.
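the double sums in (18) can be evaluated in closed vectorized form, since $\sum_j k\,(p_i - p_j) = k\,(n\,p_i - \sum_j p_j)$; the short sketch below (with made-up sample lengths) also shows that these internal forces sum to zero, so they only redistribute the dimension values, pulling them together.

```python
import numpy as np

def equalizing_forces(p, k):
    """pairwise spring forces f_i = sum_j k (p_i - p_j) of (18), vectorized."""
    return k * (p.size * p - p.sum())

a = np.array([1.00, 1.10, 0.95, 1.05])   # link length a_i in the n subsystems
f = equalizing_forces(a, k=2.0)
print(f, f.sum())                        # the forces sum to zero
```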
the final dynamical equations for the mass particles in points a, b, c and d, together with the algebraic equations that must be fulfilled, are as follows:

$$m_a\ddot{x}_{ai} = -\sum_{j=1}^{n} k_{xa}(x_{ai}-x_{aj}) + \sum_{j=1}^{n} k_{a}(a_i-a_j)\cos\beta_{1i} - b_{xa}\dot{x}_{ai},$$
$$m_a\ddot{y}_{ai} = -\sum_{j=1}^{n} k_{ya}(y_{ai}-y_{aj}) + \sum_{j=1}^{n} k_{a}(a_i-a_j)\sin\beta_{1i} - b_{ya}\dot{y}_{ai},$$
$$m_b\ddot{x}_{bi} = -\sum_{j=1}^{n} k_{xb}(x_{bi}-x_{bj}) - \sum_{j=1}^{n} k_{b}(b_i-b_j)\cos\beta_{3i} - b_{xb}\dot{x}_{bi},$$
$$m_b\ddot{y}_{bi} = -\sum_{j=1}^{n} k_{yb}(y_{bi}-y_{bj}) - \sum_{j=1}^{n} k_{b}(b_i-b_j)\sin\beta_{3i} - b_{yb}\dot{y}_{bi},$$
$$m_c\ddot{x}_{ci} = \sum_{j=1}^{n} k_{b}(b_i-b_j)\cos\beta_{3i} - \sum_{j=1}^{n} k_{c}(c_i-c_j)\cos\beta_{2i} + \sum_{j=1}^{n} k_{e}(e_i-e_j)\cos\beta_{5i} - b_{xc}\dot{x}_{ci}, \qquad (19)$$
$$m_c\ddot{y}_{ci} = \sum_{j=1}^{n} k_{b}(b_i-b_j)\sin\beta_{3i} - \sum_{j=1}^{n} k_{c}(c_i-c_j)\sin\beta_{2i} + \sum_{j=1}^{n} k_{e}(e_i-e_j)\sin\beta_{5i} - b_{yc}\dot{y}_{ci},$$
$$m_d\ddot{x}_{di} = -\sum_{j=1}^{n} k_{a}(a_i-a_j)\cos\beta_{1i} + \sum_{j=1}^{n} k_{c}(c_i-c_j)\cos\beta_{2i} + \sum_{j=1}^{n} k_{d}(d_i-d_j)\cos\beta_{4i} - b_{xd}\dot{x}_{di},$$
$$m_d\ddot{y}_{di} = -\sum_{j=1}^{n} k_{a}(a_i-a_j)\sin\beta_{1i} + \sum_{j=1}^{n} k_{c}(c_i-c_j)\sin\beta_{2i} + \sum_{j=1}^{n} k_{d}(d_i-d_j)\sin\beta_{4i} - b_{yd}\dot{y}_{di},$$

where the mechanism dimensions a_i, b_i, c_i, d_i and e_i are evaluated from the coordinates with the help of the constraint equations formulated in (17). the simulation started from some selected initial positions x_a, y_a, x_b, y_b, x_c, y_c, x_d and y_d, as was described in general in section 2 above. the initial positions also determine the initial dimensions a, b, c, d and e of the mechanism.

the results of the simulation are presented in the following figures. fig. 5 and fig. 6 show the history of the system coordinates. fig. 7 presents the desired trajectories and the resulting trajectories.

figure 5: evolution of the coordinates of a four-bar mechanism
figure 6: evolution of the dimensions of a four-bar mechanism
figure 7: evolution of the trajectories of a four-bar mechanism by the dissipative system

the system coordinates (x_{ai}, y_{ai}, x_{bi}, y_{bi}, a_i, b_i, c_i, d_i, e_i, i = 1, 2, ..., n) for all the subsystems (for all the positions) come to rest at the equilibrium values. the desired trajectory was reached using one reactivation of the dynamic process. all the coordinates for all the subsystems also come to rest at the equilibrium values. these equilibrium values can be interpreted as the searched parameters of the mechanism.

the example was also simulated using genetic algorithms. the boundary conditions for each optimized dimensional parameter were set according to the interval between the desired and the resulting value of the simulation performed by the dissipative system. this interval was extended by 50 % on both sides, and was used as the boundary condition for the particular optimized parameter. the simulation result is presented in fig. 8: the desired trajectory was not found using this method.

figure 8: evolution of the trajectories of a four-bar mechanism using a genetic algorithm

5 spatial example

a further example is the synthesis of a 3d (rssr) four-bar mechanism [17]. the original classical optimization formulation is presented in fig. 9. it consists of two skew mechanism axes. the first mechanism axis is identical with the coordinate y axis. the second axis is shifted to level z_a and rotated by the angle β_a round the z axis; this means that the vector of the axis of the second mechanism still lies in the plane xy. there is a sleeve on the axis of each mechanism, a lever on each sleeve, and spherical linkages on the other side of the levers; the spherical linkages are connected together by a pitman. the overall mechanism is naturally described by the height z_a of point a, the length of the ℓ_a axis, the angle of rotation β_a, the length of the lever ℓ_ab, the length of the pitman ℓ_bd, the length of the lever ℓ_cd and the length of the axis y_c = y_d.

let us reformulate the description of the mechanism for conciseness, and in order to avoid complicated equations in the following formulation. the mechanism can then be described by the positions x_a, y_a, z_a of the point a, by the lengths of the levers ℓ_ab and ℓ_cd, by the length of the pitman ℓ_bd and the length of the y_d axis. taking the relevant constraints into account, the representation described here is the minimal representation of the mechanism.

the constraints that describe the mechanism are as follows. the first constraint ensures that points a and b reflect the angle ψ for each subsystem. the following constraint ensures that points c and d reflect the angle ϕ for each configuration of the mechanism. the third constraint describes the given length of the lever ℓ_ab, and ensures that it is constant across the configurations. the last constraint ensures that the vector [x_a, y_a, 0] is perpendicular to the vector ba ([x_b, y_b, z_b] − [x_a, y_a, z_a]) for each configuration (subsystem). the constraints are

$$\cos\psi = \frac{z_a - z_b}{\sqrt{(x_a-x_b)^2 + (y_a-y_b)^2 + (z_a-z_b)^2}},\quad \cos\varphi = \frac{x_d}{\sqrt{x_d^2 + z_d^2}},\quad \ell_{ab} = \sqrt{(x_a-x_b)^2 + (y_a-y_b)^2 + (z_a-z_b)^2},\quad 0 = x_a(x_b - x_a) + y_a(y_b - y_a). \qquad (20)$$

synthesis of the transmission of the rssr mechanism is a very good example in the sense that in this case the described representation can at the same time serve as the set of synthesized parameters.
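a compact way to read (20) is as a residual vector that vanishes for every consistent configuration. the following sketch (a hypothetical helper, not from the paper) could be used either inside a root-finder or to build stiff-spring penalties for the associated system:

```python
import numpy as np

def rssr_residuals(A, B, D, psi, phi, l_ab):
    """residuals of the constraints (20); all zero for a consistent configuration."""
    xa, ya, za = A
    xb, yb, zb = B
    xd, yd, zd = D
    ab = np.linalg.norm(A - B)
    return np.array([
        np.cos(psi) - (za - zb) / ab,          # lever ab reflects the angle psi
        np.cos(phi) - xd / np.hypot(xd, zd),   # lever cd reflects the angle phi
        l_ab - ab,                             # given constant lever length
        xa * (xb - xa) + ya * (yb - ya),       # [xa, ya, 0] perpendicular to ba
    ])
```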
figure 9: rssr spatial mechanism

because it is a synthesis of a transmission with two levers, a global solution of this example exists trivially, and it consists in lengths of the levers equal to zero. however, a solution of this kind is useless for the practical usability of the synthesized mechanism. let us therefore fix the length of the lever ℓ_ab equal to a nonzero constant. the rest of the parameters can then be synthesized; they are once more: x_a, y_a, z_a, ℓ_bd, ℓ_cd, y_d.

the task of transmission synthesis is to find dimensions of the mechanism that fulfil some transmission requirements. the requirement in this example is that the vector of angles of the lever ϕ should correspond with the vector of angles of the lever ψ. this means that for ϕ = ϕ₁ it holds that ψ = ψ₁, for ϕ = ϕ₂ that ψ = ψ₂, etc., for constant dimensions of the mechanism. to sum up, the optimization task is as follows:

$$f = \sum_{i=1}^{n}(\varphi_i - \varphi'_i)^2 \rightarrow \min, \qquad (21)$$

where the optimization parameters are x_a, y_a, z_a, ℓ_bd, ℓ_cd and y_d, and where ψ_i is given. the parameters q_i again denote the penalization coefficients.

the associated dynamical dissipative system consists of n subsystems for the individual required positions of the transmission mechanism. the masses m_ai, m_bi, m_di are introduced at the points a_i, b_i, d_i and act in the coordinates x, y, z. the interactions between the subsystems are ensured by forces of a linear spring nature. a nonzero force acts on the relevant masses whenever the corresponding dimension differs between subsystems i and j (i, j = 1, 2, ..., n). stabilization of the whole system is ensured by the damper elements between the masses and the inertial frame, according to the sky-hook idea [20, 21]. the idea of the transformation is presented in fig. 10.

in spite of the simple formulation of this example, using only three mass points, 3d examples with difficult constraints drive us to formulate the dynamical equations by means of lagrange equations of mixed type. the greatest difference that occurs in comparison with simple planar structures is the difficulty of formulating the dynamical system; with simple planar mechanisms, the dynamical equations in direct form can best be formulated using the newton equations. the dynamical equations can be defined according to the formulation of the constraints (20).
for the sake of conciseness, let us define the lengths of the lever ℓ_cd and the pitman ℓ_bd, and formulate the x, y, z projections of the unit vectors of [x_d, y_d, z_d] − [x_b, y_b, z_b] and [x_d, y_d, z_d] − [0, y_c, 0]. the lengths are

$$\ell_{dc} = \sqrt{(x_d - 0)^2 + (y_d - y_c)^2 + (z_d - 0)^2},\quad \ell_{bd} = \sqrt{(x_d - x_b)^2 + (y_d - y_b)^2 + (z_d - z_b)^2}, \qquad (22)$$

and the vectors are

$$vec_{x,db} = \frac{x_d - x_b}{\ell_{bd}},\quad vec_{y,db} = \frac{y_d - y_b}{\ell_{bd}},\quad vec_{z,db} = \frac{z_d - z_b}{\ell_{bd}},\quad vec_{x,dc} = \frac{x_d}{\ell_{dc}},\quad vec_{y,dc} = 0,\quad vec_{z,dc} = \frac{z_d}{\ell_{dc}}. \qquad (23)$$

fig. 10: associated system of the rssr mechanism

the lengths (22) and the vectors (23) help in formulating the forces that act in the dynamical system. in addition, the constraints (20) have been changed into the form

$$\cos\psi = \frac{z_b - z_a}{\sqrt{(x_b-x_a)^2 + (y_b-y_a)^2 + (z_b-z_a)^2}},\quad \cos\varphi = \frac{x_d}{\sqrt{x_d^2 + z_d^2}},\quad \ell_{ab} = \sqrt{(x_b-x_a)^2 + (y_b-y_a)^2 + (z_b-z_a)^2},\quad 0 = x_a(x_b-x_a) + y_a(y_b-y_a),\quad \varphi_i = \text{const.},\quad \psi_i = \text{const.} \qquad (24)$$

in order to describe the general transmission synthesis of this mechanism, let us take i = 1, 2, ..., n required transmissions and thus n required subsystems of the dynamical system. the dynamical equations and the constraints are identical for each subsystem. the forces that act in dynamical subsystem i are

$$f_{xai} = \sum_{j=1}^{n} k(x_{ai} - x_{aj}),\quad f_{yai} = \sum_{j=1}^{n} k(y_{ai} - y_{aj}),\quad f_{zai} = \sum_{j=1}^{n} k(z_{ai} - z_{aj}),\quad f_{\ell dbi} = \sum_{j=1}^{n} k(\ell_{dbi} - \ell_{dbj}),\quad f_{\ell dci} = \sum_{j=1}^{n} k(\ell_{dci} - \ell_{dcj}),\quad f_{ydi} = \sum_{j=1}^{n} k(y_{di} - y_{dj}), \qquad (25)$$
$$b_{xai} = b\dot{x}_{ai},\quad b_{yai} = b\dot{y}_{ai},\quad b_{zai} = b\dot{z}_{ai},\quad b_{xbi} = b\dot{x}_{bi},\quad b_{ybi} = b\dot{y}_{bi},\quad b_{zbi} = b\dot{z}_{bi},\quad b_{xdi} = b\dot{x}_{di},\quad b_{ydi} = b\dot{y}_{di},\quad b_{zdi} = b\dot{z}_{di}.$$
this means that for each mechanism position i the equations take into account all the forces connecting it with the other positions j of the mechanism. therefore the system forms altogether 2n equations. the mechanism dimensions are evaluated from the coordinates as follows:

$$\ell_{dbi} = \sqrt{(x_{di}-x_{bi})^2 + (y_{di}-y_{bi})^2 + (z_{di}-z_{bi})^2},\quad \ell_{di} = \sqrt{x_{di}^2 + z_{di}^2}. \qquad (26)$$

the final dynamical equations for the mass particles in points a, b and d for dynamical subsystem i, together with the algebraic equations, are as follows:

$$m_a\ddot{x}_{ai} = -f_{xai} - b_{xai},\quad m_a\ddot{y}_{ai} = -f_{yai} - b_{yai},\quad m_a\ddot{z}_{ai} = -f_{zai} - b_{zai},$$
$$m_b\ddot{x}_{bi} = f_{\ell dbi}\,vec_{x,db} - b_{xbi},\quad m_b\ddot{y}_{bi} = f_{\ell dbi}\,vec_{y,db} - b_{ybi},\quad m_b\ddot{z}_{bi} = f_{\ell dbi}\,vec_{z,db} - b_{zbi}, \qquad (27)$$
$$m_d\ddot{x}_{di} = -f_{\ell dbi}\,vec_{x,db} - f_{\ell dci}\,vec_{x,dc} - b_{xdi},\quad m_d\ddot{y}_{di} = -f_{\ell dbi}\,vec_{y,db} - f_{ydi} - b_{ydi},\quad m_d\ddot{z}_{di} = -f_{\ell dbi}\,vec_{z,db} - f_{\ell dci}\,vec_{z,dc} - b_{zdi},$$
$$\varphi_i = \text{const.},\quad \psi_i = \text{const.},$$

where the integrated coordinates are $\ddot{x}_{ai}, \ddot{y}_{ai}, \ddot{z}_{ai}, \ddot{x}_{bi}, \ddot{y}_{bi}, \ddot{z}_{bi}, \ddot{x}_{di}, \ddot{y}_{di}, \ddot{z}_{di}$.

the simulation started from some randomly generated initial positions x_a, y_a, z_a, x_b, y_b, z_b, x_d, y_d, z_d. the initial positions also determine the initial dimensions x_a, y_a, z_a, ℓ_bd, ℓ_dc, y_d of the mechanism.

it is necessary to make a short note on the implementation. all the systems simulated so far were easy to formulate, and all the previous dynamical systems were thus formulated by newton equations. because this system is already quite complex, lagrange equations of mixed type were chosen, instead of the newton equations, as the optimal formulation tool. the formulation was thus really simplified; however, the great disadvantage of this tool is the instability of the constraints.

the results of the simulation are presented in the following figures. the system coordinates x_{ai}, y_{ai}, z_{ai}, ℓ_{bdi}, ℓ_{dci}, y_{di} (i = 1, 2, ..., 8) for all the subsystems come to rest at the equilibrium values (fig. 11). these equilibrium values can be interpreted as the searched parameters of the mechanism.

figure 11: dynamical response of the dimensions of the rssr mechanism

the evolution of the constraints is presented in fig. 12. the red lines show the fulfilled constraints; the other lines represent the simulated values of the mechanism. the upper part is dedicated to the given angles ψ and ϕ. the simulated values of the angles come to rest at the given equilibrium very soon after the system starts. the lower left picture shows the constraint that ensures the length of the lever ℓ_ab; it can be seen that the given value had been set equal to one, and here, too, the simulated value came to rest at the required value immediately after the system started. the fourth history presents the perpendicularity of the vectors [x_a, y_a, 0] and ba; this graph, too, shows how soon after the start the constraint condition is fulfilled. it is clear from all the pictures showing the evolution of the constraint conditions that the required constraints are fulfilled.

figure 12: fulfilling the constraints of the rssr mechanism

the evolution of the whole structure of the spatial four-bar mechanism is presented in fig. 13. the simulation started from the random positions of the structure marked as the initial structure and finished in the final structure of the mechanism. the final image shows that the corresponding dimensions are equal.

figure 13: evolution of the structure of the rssr mechanism

6 conclusion

this paper has described a new method for solving the parametric kinematical synthesis of mechanisms. the robustness and the speed of the synthesis procedure have been demonstrated; the robustness in particular is very valuable.
acknowledgements

the authors appreciate the kind support of the grant msm6840770003 "algorithms for computer simulation and application in engineering".

references

[1] haug, e. j.: computer aided analysis and optimization of mechanical system dynamics. nato asi, vol. f9, springer-verlag, berlin, 1984.
[2] hartenberg, r. s., denavit, j.: kinematic synthesis of linkages. mcgraw-hill, 1964.
[3] hall, a. s., jr.: kinematics and linkage design. balt publisher, west lafayette, in, 1966.
[4] sandor, g. n., erdman, a. g.: advanced mechanism design: analysis and synthesis. prentice hall, 1984.
[5] erdman, a. g., sandor, g. n.: mechanism design, analysis and synthesis. prentice-hall, englewood cliffs, nj, 1991.
[6] hansen, m. r.: a general procedure for dimensional synthesis of mechanisms. mechanism design and synthesis 46, 1992, pp. 67–71.
[7] bruns, t.: design of planar, kinematic, rigid body mechanisms. master's thesis, university of illinois at urbana-champaign, urbana, il, usa, 1992.
[8] hansen, j. m., tortorelli, d. a.: an efficient method for synthesis of mechanisms. in: d. beste and w. schiehlen (eds): optimization of mechanical systems, kluwer, dordrecht, 1995, pp. 129–138.
[9] hansen, m. r., hansen, j. m.: an efficient method for synthesis of planar multibody systems including shapes of bodies as design variables. multibody system dynamics 2, 1998, pp. 115–143.
[10] paradis, m. j., willmert, k. d.: optimal mechanism design using the gauss constrained method. journal of mechanisms, transmissions, automation in design 105, 1983, pp. 187–196.
[11] minnaar, r. j. et al.: on non-assembly in the optimal dimensional synthesis of planar mechanisms. structural optimization 21, 2001, pp. 345–354.
[12] hansen, j. m.: synthesis of mechanisms using time-varying dimensions. multibody system dynamics, vol. 7/1, 2002, pp. 127–144.
[13] jensen, o. f., hansen, j. m.: dimensional synthesis of spatial mechanisms and the problem of non-assembly. multibody system dynamics, vol. 15/2, 2006, pp. 107–133.
[14] valášek, m., šika, z.: mechanism synthesis as control problems. in: proc. of workshop on interaction and feedbacks 2004, edited by i. zolotarev, prague, 2004, pp. 187–192.
[15] grešl, m. et al.: synthesis of dexterity measure of mechanisms by evolution of dissipative system. applied and computational mechanics, 1 (2007), 2, pp. 461–468.
[16] šika, z. et al.: synthesis of mechanisms using evolution of associated dissipative systems. multibody system dynamics, 28 (2012), 4, pp. 419–440.
[17] grešl, m.: synthesis and multiobjective optimization of mechanisms using evolution of associated dissipative system. ph.d. thesis, fme ctu in prague, prague, 2010.
[18] valášek, m.: multibody dynamics without analytical mechanics. in: j. a. c. ambrosio (ed.): proc. of int. conf. on advances in computational multibody dynamics, ist idmec, lisbon, 2003, cd rom, pp. 1–14.
[19] stejskal, v., valášek, m.: kinematics and dynamics of machinery. marcel dekker, new york, 1996.
[20] karnopp, d., crosby, m., harwood, r. a.: vibration control using semi-active force generators. journal of engineering for industry, no. 96, 1974, pp. 619–626.
[21] valášek, m., kortüm, w., šika, z., magdolen, l., vaculín, o.: development of semi-active road-friendly truck suspensions. control engineering practice, vol. 6, 1998, pp. 735–744.
acta polytechnica vol. 48 no. 6/2008

the pareto principle in datamining: an above-average fencing algorithm

k. macek

abstract: this paper formulates a new datamining problem: which subset of input space has the relatively highest output where the minimal size of this subset is given. this can be useful where usual datamining methods fail because of error distribution asymmetry. the paper provides a novel algorithm for this datamining problem, and compares it with clustering of above-average individuals.

keywords: art, pareto principle, insurance risk.

1 introduction

in some cases, usual methods of supervised learning are not able to provide satisfactory results. this may occur in data with asymmetrically distributed error, which is typical in insurance. in order to manage the asymmetry in the sense of the law of large numbers, this paper offers a new algorithm, which constructs a predictor not for points, but for sets. we will show an algorithm for finding sets of units with above-average outputs.

let $x$ be a set, and let $\mu$, $\nu$ be measures over it. the pareto principle arises if there is a set $p \subset x$ where

$$p(p, x) = \frac{\nu(p)/\mu(p)}{\nu(x)/\mu(x)} \gg 1 \qquad (1)$$

and $r(p) = \mu(p)/\mu(x) > 0$. let $\mu$ stand for volume, $\nu$ for production, $p$ for productivity, $r$ for proportion. typically, the pareto principle is considered as a rule that 20 % of elements "produces" 80 % or more of the output. (this principle was discovered by vilfredo pareto while assessing the welfare distribution in the uk at the end of the 19th century. his ideas were systematically described, applied and extended by max lorenz [7].) in this case, $r(p) = 0.2$ and $p(p, x) = 4$.

managerial science often works with the pareto diagram [1]: $x$ is discrete, $x = \{x_1, x_2, \dots, x_n\}$, the elements are ordered by their production $\nu(x_i)$, and the production is drawn in a chart. in statistics, the pareto principle is represented by the continuous pareto distribution,

$$\mathrm{pr}(x > x) = \left(\frac{x_m}{x}\right)^k \qquad (2)$$

for all $x \ge x_m$, where $x_m$ is the (necessarily positive) minimum possible value of $x$, and $k$ is a positive parameter. the pareto distribution has positive skewness $\frac{2(1+k)}{k-3}\sqrt{\frac{k-2}{k}}$ (for $k > 3$), which means that the below-average subset is bigger than the above-average complement. this occurs in many real-life situations: the median citizen has a below-average salary, the median driver causes below-average claims, etc.

let $\xi : x \to \mathbb{r}^n$ be the attributes of $x$. (attributes can be considered as columns in a data table, i.e. the $n$-tuple from the $i$-th row. the mapping $\xi$ can also involve preprocessing. if $x$ is $\mathbb{r}^k$ for some $k \in \mathbb{n}$, the mapping $\xi$ may also be the identity.) the problem of prediction consists in constructing a mapping $\hat{y}$ so that $\sum_{i \in d} |\hat{y}(x_i) - y_i| \approx 0$. however, the construction of the mapping $\hat{y}$ may be difficult if the pareto principle arises. a small subset of high productivity (called outliers) corrupts the usual assumptions. usual datamining techniques propose removing this set and working only with the rest. however, in the case of the pareto principle, the small set is very interesting. it is not adequate to speak about outliers, because such data is relevant and obvious. therefore, we are dealing with a more humble result, i.e. with finding the set $p$ defined in (1). the formulated problem, i.e. finding

$$\arg\max_{p \subset x} p(p, x) \quad \text{under the condition} \quad r(p) \ge r_0, \qquad (3)$$

is new, and has not been found in the current literature.
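a quick numerical illustration of the quantities in (1)–(3) (sample code with arbitrary parameters, not from the paper): drawing from a pareto distribution whose index is tuned to the classical 80/20 split, the top 20 % of elements carry roughly 80 % of the output, so p(p, x) ≈ 4 at r(p) = 0.2.

```python
import numpy as np

rng = np.random.default_rng(1)
# pareto distribution with x_m = 1; index log(5)/log(4) ~ 1.16 gives the 80/20 rule
y = rng.pareto(np.log(5) / np.log(4), 100_000) + 1.0

top = np.argsort(y)[::-1][: int(0.2 * y.size)]   # candidate set p: the top 20 %
r = top.size / y.size                            # proportion r(p)
p = y[top].mean() / y.mean()                     # productivity p(p, x), eq. (1)
print(f"r(p) = {r:.2f}, p(p, x) = {p:.2f}, output share = {y[top].sum() / y.sum():.2f}")
```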
however, many other topics are related to it. first, clustering methods [12] can be employed: creating clusters of above-average individuals, the set $p$ will be defined as these clusters. it is important to define the border of the clusters somehow. if these clusters are well found, they can be employed for a more precise approximation [6]. another approach is to attempt to find a prediction mapping $\hat{y}$, where the set $p$ is afterwards defined at some level of this mapping. rbf neural networks [5] provide an example where the approach of rough sets is employed. finally, effective $p$ can also be detected by data envelopment analysis [2]. however, none of these methods – as the research in the bibliography shows – has been applied explicitly to the problem of above-average subsets.

2 the fencing algorithm

the following algorithm is the first attempt to solve problem (3). it offers the construction of $\hat{p} \subset x$ with above-average production, i.e. with high $p$. the space $\mathbb{r}^n$ will be considered as $x$. the set $p$ is represented by a union of intervals (from points to hyperboxes) represented by means of complement coding. (complement coding is a concept applied in art and artmap neural networks, e.g. [3]; however, the objective of the fencing algorithm is different: while art and artmap work iteratively, the fencing algorithm must often go through the entire training set.) this algorithm works only with finite data sets $d$, namely with $(x_i, y_i)$ pairs. therefore, for all subsets $m$ of $d$, the volume $\mu$ is defined as the count, $\mu(m) = |m|$, and the production as the sum of the production of the particular items, $\nu(m) = \sum_{i \in m} y_i$. other ways to construct such intervals may be considered.

the following algorithm uses fencing. fencing is a heuristic approach which anticipates that areas of higher average production are located between mutually close points with high production. the algorithm attempts to build a rectangular fence around the area of above-average production, as shown in fig. 1.

2.1 measuring and data preprocessing

the mapping $\xi$ is necessary. this mapping involves measuring and data preprocessing. the simplest way is to transform binary attributes into real attributes by 0–1 coding. categorical attributes are transformed into several binary attributes. it is very useful to reduce the input vector dimension, e.g. by principal components analysis [11]. let us define $x_i = \xi(i)$ and the vector $x^{max} = (x_1^{max}, x_2^{max}, \dots, x_n^{max})$ so that $x_j^{max} \ge x_j^i$ for all $i \in d$. the vector $x^{max}$ is used for the complement coding of $x$.

2.2 data splitting

the data set $d$ is divided randomly into three subsets: the base subset $b$, the training subset $t$ and the validation subset $v$. the sets $b$ and $t$ are used for constructing the predictor $\hat{y}$, whereby their sizes satisfy $\mu(b) \ll \mu(t)$, say $10\,\mu(b) \le \mu(t)$. the size of $v$ is chosen with respect to cross validation [8].

2.3 starting set of intervals

the starting set of intervals is defined as $r^o = \{\mathrm{cc}(x_i);\ i \in b^o\}$, whereby

$$b^o = \left\{ i \in b;\ y_i \ge k\,\frac{\nu(b)}{\mu(b)} \right\},$$

$\mathrm{cc}$ is the complement coding $\mathrm{cc} : \mathbb{r}^n \to \mathbb{r}^{2n}$, and $k$ is a parameter. the definition of $b^o$ ensures that $p(b^o, b) \ge k$.

2.4 interval expansion

two intervals $r^1$, $r^2$ can be expanded as follows: $r_i^{new} = \min(r_i^1, r_i^2)$, $i = 1, 2, \dots, 2n$. the construction of $\hat{p}$ consists in the iterative expansion of intervals. the process starts with $r^o$. two intervals are expanded only if the new interval satisfies $p(i(r^{new}), t) \ge q_t$, where $q_t$ is a parameter that sinks linearly during the process from $p_0$ to $p_1$.

in order to choose the intervals to be expanded, a heuristic is used. two intervals $i(r^a)$, $i(r^b)$ are suitable for expansion if $p(i(r^a))$ and $p(i(r^b))$ are high, and if $r^a$ and $r^b$ are close. let us define the suitability as

$$v_{a,b} = \frac{p(i(r^a), t)\ p(i(r^b), t)}{\varrho(r^a, r^b)}, \qquad (4)$$

where $\varrho$ is a metric. the first version of the algorithm worked with the hamming distance [6], but other metrics can also be applied. in each step, a pair of intervals is tested for expansion. the pair is selected partly randomly, as follows: the pair with the highest suitability (probability 0.8), the pair with the lowest suitability (0.1), or a random pair (0.1). if a pair is expanded, its suitability is recalculated with respect to all other intervals.

fig. 1: fencing: expanding the square, a new fence (dashed) is recommended for the rectangle that is close and has high productivity
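the interval arithmetic of sections 2.3–2.4 is compact in code. the following sketch (illustrative helpers, with an l1 distance standing in for the metric ρ) shows complement coding, the min-based expansion, box membership, and the suitability (4):

```python
import numpy as np

def cc(x, x_max):
    """complement coding cc : R^n -> R^(2n); a point becomes a degenerate box."""
    return np.concatenate([x, x_max - x])

def expand(r1, r2):
    """expansion r_new_i = min(r1_i, r2_i): the bounding box of the two boxes."""
    return np.minimum(r1, r2)

def covers(r, x, x_max):
    """True if the box coded by r contains the point x."""
    return bool(np.all(cc(x, x_max) >= r))

def productivity(r, X, y, x_max):
    """p(I(r), T): average production inside the box relative to the overall average."""
    inside = np.array([covers(r, xi, x_max) for xi in X])
    return y[inside].mean() / y.mean() if inside.any() else 0.0

def suitability(ra, rb, X, y, x_max):
    """heuristic suitability (4), with an l1 metric as a stand-in for rho."""
    rho = max(np.abs(ra - rb).sum(), 1e-12)
    return productivity(ra, X, y, x_max) * productivity(rb, X, y, x_max) / rho
```

seed intervals are simply cc(x_i) for i ∈ b^o; expanding two of them yields the smallest hyperbox containing both.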
2.5 termination

the algorithm terminates after all pairs of intervals have been tested and none can be expanded. afterwards, unexpanded intervals (i.e. points) are deleted. because intervals may overlap, their conjunction may have a lower $p$ than the average $p$ of all intervals. therefore, only the intervals with the highest $p$ are considered as results, so that their conjunction has a high enough $p$ (e.g. higher than a given threshold).

2.6 validation

finally, the results are validated with respect to the validation set: $r_v = r(\hat{p}, v)$ and $p_v = p(\hat{p}, v)$ are calculated. these values can be considered as the quality of the algorithm.

3 results

the fencing algorithm has been applied successfully on data on 18 177 insurance claims related to traffic accidents in the czech republic in 2003–2005. categorical and numerical attributes were transformed into binary attributes; there was a total of 135 binary attributes. the considered attributes and their transformation are summarized in table 1. the fencing algorithm has been implemented in matlab as a set of simple scripts. it should be mentioned that this particular data set inspired the author to invent the fencing algorithm, after attempts to build a regression model had failed. generalized linear models [4], which are typical in insurance mathematics, and multilayer perceptrons [8] did not provide sufficient results, as shown in table 2. the data set was split into 10 subsets and one subset was always tested. the logarithm of the total costs was taken as the output variable. the mean absolute error remains very high (the prediction and the reality differ over twentyfold on average!).

table 2: mean absolute error of machine learning for different validation sets

validation set #   1     2     3     4     5     6     7     8     9     10
glm                3.43  3.37  3.27  3.33  3.30  3.22  3.35  3.27  3.43  3.41
mlp                3.35  3.29  3.24  3.30  3.23  3.10  3.27  3.28  3.44  3.34

the first experiments showed that the 135-dimensional space is too sparse and that there are many intervals that cannot be expanded further. therefore, the dimension was reduced by selecting 36 attributes describing the region of the claimant, the road type, and the cause of the accident. after further unsuccessful experiments with $p_1 = 4$ and $p_1 = 2$, it was necessary to set $p_1 = 1.5$. then 9 intervals were found; however, their conjunction had $p = 1.19$ only. therefore, only the 3 best intervals were selected, with $p = 1.48$ and $r = 0.32$.
table 1: transformation of observed values into input and output variables

accident-related information:
  hour                         numerical     24 input binary variables
  day                          numerical      7 input binary variables
  month                        numerical     12 input binary variables
  year                         numerical      6 input binary variables
  district                     categorical   not used
  municipality size            numerical      6 input binary variables
  cause                        categorical   11 input binary variables
  road type                    categorical   10 input binary variables
  tariff group                 categorical   25 input binary variables
  car make                     categorical   not used

information about the causing person:
  age                          numerical      8 input binary variables
  sex                          categorical    3 input binary variables
  district                     categorical   not used
  municipality size            numerical      6 input binary variables
  region                       categorical   15 input binary variables
  accident at place of abode   binary         1 input binary variable

claim costs (paid + additionally expected, both numerical): 1 output numerical variable

3.1 comparison

the problem (3) formulated here is novel, and the fencing algorithm is the only solution so far. however, for a simple comparison, a clustering-based method was involved, which can be described briefly as follows (a code sketch of this method is given after table 3):

1. building above-average clusters from the training data: the best 20 % of records were extracted and clustered via the k-means algorithm. for each cluster, the diameter was calculated as the maximum distance between the center and the records belonging to it.
2. finding above-average records in the testing data: for each record, we test whether there is a cluster whose center is closer to the record than c times the diameter of the cluster. the parameter c is set up so that the required level of r is satisfied. thus $\hat{p}$ is defined and r ensured.
3. calculation of p: from the data obtained, p is calculated.

table 3 shows the results achieved by this method, and compares them with the fencing algorithm: the alternative method based on a known algorithm provides weaker results. however, the goal of this paper was not to test the proposed fencing algorithm, but to show that this algorithm is able to solve the problem formulated above in (3). more experiments with the k-means based approach might provide better results.

table 3: comparison of the fencing algorithm with the k-means based approach

method                                                       r (fixed)   p
means based approach (6 means)                               0.32        0.96
means based approach (20 means)                              0.32        0.94
fencing algorithm (raw results)                              0.48        1.19
fencing algorithm (results after best interval selection)    0.32        1.48
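a minimal sketch of this comparison method, assuming euclidean features and using scikit-learn's k-means (the function and variable names are illustrative, not the author's matlab scripts):

```python
import numpy as np
from sklearn.cluster import KMeans

def fit_above_average_clusters(X_train, y_train, n_clusters=6, top=0.2):
    """step 1: cluster the best 20 % of training records (ranked by output y)."""
    idx = np.argsort(y_train)[::-1][: int(top * len(y_train))]
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(X_train[idx])
    centers = km.cluster_centers_
    diam = np.array([
        np.linalg.norm(X_train[idx][km.labels_ == j] - centers[j], axis=1).max()
        for j in range(n_clusters)          # assumes no cluster ends up empty
    ])
    return centers, diam

def select_records(X_test, centers, diam, c=1.0):
    """step 2: a record belongs to p if some center is within c * diameter of it."""
    d = np.linalg.norm(X_test[:, None, :] - centers[None, :, :], axis=2)
    return (d <= c * diam[None, :]).any(axis=1)

def productivity(y_test, mask):
    """step 3: p = average output inside p relative to the overall average."""
    return y_test[mask].mean() / y_test.mean()
```

in practice c would be tuned (e.g. by bisection) until the fraction of selected records matches the required r.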
4 discussion and further work

the fencing algorithm can be modified so that the suitability $v_{a,b}$ is calculated in another way; it should reward an increase in $p$ in both intervals and a decrease in the distance between $r^{(a)}$ and $r^{(b)}$. the randomized selection rule can also be modified. whenever a pair of intervals is tested, the whole training set $t$ is gone through; this is probably the achilles heel of the method, because the size of $t$ is usually very large. therefore, a more detailed examination of the complexity and the design of more suitable data structures are desirable. the basic idea of constructing an above-average subset can be evolved in many ways. the subset need not be a union of intervals; it may consist of simplexes. the set need not be crisp; it may be fuzzy. or the subset can be given in an algebraic form and detected by genetic programming or other optimization methods, such as ant colony optimization [10]. the fencing algorithm will be compared with these other approaches in terms of complexity and effectiveness on more data sets. a systematic examination of relevant preprocessing methods is also desirable. finally, the algorithm could be modified to work not on data, but on an estimated probability function, e.g. in the form of copulas [9], which are more appropriate for asymmetric distributions.

5 conclusion

the fencing algorithm is a novel heuristic method for finding a subset with above-average production. the main idea of the algorithm is to join intervals with high production and a small mutual distance. the fencing algorithm has been successfully applied to insurance data. further work has been discussed above.

references

[1] akpolat, h.: six sigma in transactional and service environments. gower, burlington, vt, 2004.
[2] andersen, p., petersen, n. c.: a procedure for ranking efficient units in data envelopment analysis. manage. sci., vol. 39 (1993), no. 10, p. 1261–1264.
[3] dagher, i.: l-p fuzzy artmap neural network architecture. soft comput., vol. 10 (2006), no. 8, p. 649–656.
[4] de jong, p., heller, g. z.: generalized linear models for insurance data. cambridge university press, 2008.
šmíd rock burst mechanics: insight from physical and mathematical modelling 38 j. vacek, j. vacek, j. chocholoušová properties of starch based foams made by thermal pressure forming 45 j. štancl, j. skoèilas, j. šesták, r. žitný the pareto principle in datamining: an above-average fencing algorithm 55 k. macek wykresx.eps acta polytechnica vol. 51 no. 1/2011 quantum moment map and invariant integration theory on quantum spaces o. osuna castro, e. wagner abstract it is shown that, on the one hand, quantum moment maps give rise to examples for the operator-theoretic approach to invariant integration theory developed by k.-d. kürsten and the second author, and that, on the other hand, the operator-theoretic approach to invariant integration theory is more general since it also applies to examples without a well-defined quantum moment map. keywords: quantum spaces, invariant integration theory, quantum moment map. 1 introduction a noncommutative analogue of an (infinitesimal) group action on a topological space is described by the action of a hopf algebra on a noncommutative function algebra. in this setting, a generalization of the classical haar measure is given by an invariant integral, that is, a positive linear functional with certain invariance properties. usually the noncommutative function algebra is generated by a finite set of generators which are considered as coordinate functions on the quantum space. as in the classical case, one does not expect that polynomials in the coordinate functions on locally compact quantum spaces are integrable. this leads to theproblemthatonehas to associatealgebrasof integrable (anddifferentiable) functions to the noncommutative polynomial algebra in an appropriate way. in the algebraic approach (see e.g. [2, 7]), one associates function algebras by imposing commutation relations with the generators and defines the invariant integral by jackson-type integrals. a more rigorous method was developed by kürsten and the second author in [3], based on hilbert space representations and (unbounded) operatoralgebras. theadvantageof thismethodbecomes apparent in the examples of [5], where the algebraic approach would fail. the first step of the operator-theoretic approach is to express the action of a hopf *-algebra u on a *-algebra a by algebraic relations of hilbert space operators. it should be noted that the operators describing the action do not have to satisfy the commutation relations of u. on the other hand, any joint representation of u and a on the same hilbert space allows one to equip a with a u-action, given by the formulas of the adjoint action, provided that a is invariant under these algebraic expressions [6]. this will be automatically the case if there is a *-homomorphism from u into a. then one only has to consider *-representations of a on a hilbert space and the methods from [3] will apply without restrictions. in [2], korogodsky called a *-homomorphism from u into a intertwining the (adjoint) action a “quantum moment map”. the aim of this paper is to show that, on the one hand, quantum moment maps give rise to examples for the operator-theoretic approach to invariant integration theory. on the other hand, we demonstrate that the operator-theoretic approach to invariant integration theory is more general, since it also applies to caseswhere the operators describing the action do not satisfy the commutation relations of u and hence do not define a quantum moment map. 
2 operator-theoretic approach to invariant integration theory

for details on quantum groups and related notions, we refer the reader to [1]. let $u$ be a hopf *-algebra with hopf structure $\delta$, $\varepsilon$ and $s$, where $\delta : u \to u \otimes u$ and $\varepsilon : u \to \mathbb{c}$ are *-homomorphisms and $s : u \to u$ is an anti-homomorphism satisfying certain conditions. we will use sweedler-heinemann notation and write $\delta(f) = f_{(1)} \otimes f_{(2)}$. a *-algebra $x$ is called a left $u$-module *-algebra if there is a left $u$-action $\triangleright$ on $x$ such that

$$f \triangleright (xy) = (f_{(1)} \triangleright x)(f_{(2)} \triangleright y), \quad (f \triangleright x)^* = s(f)^* \triangleright x^*, \quad x, y \in x,\ f \in u. \qquad (1)$$

for unital algebras, one also requires $f \triangleright 1 = \varepsilon(f)1$. by an invariant integral we mean a positive linear functional $h$ on $x$ satisfying

$$h(f \triangleright x) = \varepsilon(f)\,h(x), \quad x \in x,\ f \in u. \qquad (2)$$

given a dense linear subspace $\mathcal{d}$ of a hilbert space $\mathcal{h}$, consider the *-algebra $l^+(\mathcal{d}) := \{ x \in \mathrm{end}(\mathcal{d});\ \mathcal{d} \subset \mathcal{d}(x^*),\ x^*\mathcal{d} \subset \mathcal{d} \}$ with involution $x \mapsto x^*\!\upharpoonright\!\mathcal{d}$. an (unbounded) *-representation of $x$ is a *-homomorphism $\pi : x \to l^+(\mathcal{d})$. if for each $f \in u$ there exists a finite number of operators $l_i, r_i \in l^+(\mathcal{d})$ such that

$$\pi(f \triangleright x) = \sum_i l_i\,\pi(x)\,r_i, \quad x \in x, \qquad (3)$$

then we say that we have an operator expansion of the action. obviously, it suffices to know the operators $l_i$, $r_i$ for a set of generators of $u$. let $a$ denote the *-subalgebra of $l^+(\mathcal{d})$ generated by $\pi(x)$ and the operators $l_i$, $r_i$ for a set of generators of $u$. set

$$s(a) := \{ t \in l^+(\mathcal{d});\ \bar{t}\mathcal{h} \subset \mathcal{d},\ \bar{t}^*\mathcal{h} \subset \mathcal{d},\ atb \in l_1(\mathcal{h})\ \text{for all}\ a, b \in a \}, \qquad (4)$$

where the bar denotes the closure of closeable operators on $\mathcal{d}$, and $l_1(\mathcal{h})$ is the schatten class of trace class operators on $\mathcal{h}$. the *-algebra $s(a)$ will be considered as an algebra of differentiable functions which vanish sufficiently rapidly at "infinity". if the operators from the operator expansion satisfy convenient commutation relations (but not necessarily the defining relations of $u$), then the $u$-action can be extended to $s(a)$. in favorable cases, one can define an invariant integral by a weighted trace on $s(a)$, where the weight is easily guessed from the operator expansion of the action by analogy to the well-known quantum trace (see [3, 5]).

a hopf algebra $u$ always acts on itself by the (left) adjoint action

$$\mathrm{ad}_l(f)(x) := f_{(1)}\, x\, s(f_{(2)}), \quad f, x \in u. \qquad (5)$$

in [2], l. i. korogodsky defined a quantum moment map as a *-homomorphism $\rho : u \to x$ such that $\rho(\mathrm{ad}_l(f)(x)) = f \triangleright \rho(x)$ for all $f, x \in u$. then any *-representation $\pi : x \to l^+(\mathcal{d})$ leads to a *-representation $\pi \circ \rho : u \to l^+(\mathcal{d})$, and it follows easily from the hopf algebra structure of $u$ that

$$\mathrm{ad}_l(f)(x) := \pi(\rho(f_{(1)}))\, x\, \pi(\rho(s(f_{(2)}))), \quad f \in u,\ x \in l^+(\mathcal{d}), \qquad (6)$$

defines a left $u$-action on $l^+(\mathcal{d})$ turning it into a $u$-module *-algebra. moreover, the algebra $s(a)$ is invariant under this action. suppose furthermore that $u$ denotes the quantized universal enveloping algebra of a semisimple lie algebra. then there exists a distinguished element $\gamma$ in $u$ such that $\gamma f = s^2(f)\gamma$ for all $f \in u$. by the definition of $s(a)$, the traces $\mathrm{tr}(\pi(\rho(\gamma))f)$ are well-defined and we can state the following theorem.

theorem 1. let $\rho : u \to x$ be a quantum moment map and $\pi : x \to l^+(\mathcal{d})$ a *-representation such that $\pm\pi(\rho(\gamma))$ is a non-negative selfadjoint operator. then

$$h(f) := \pm\mathrm{tr}(\pi(\rho(\gamma))\,f), \quad f \in s(a), \qquad (7)$$

defines an invariant integral on the $u$-module *-algebra $s(a)$.

proof. the invariance of $h$ follows from the same formulas as in the proof of the invariance of the quantum trace in [1, section 7.1.6], by applying the trace property $\mathrm{tr}(af) = \mathrm{tr}(fa)$, which continues to hold for all $f \in s(a)$ and $a \in a$, see [3]. □
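for the reader's convenience, the computation behind theorem 1 can be sketched as follows (a reconstruction of the standard quantum-trace argument, not a quotation from [1] or [3]; $\pi(\rho(\cdot))$ is suppressed, so $\gamma$, $f_{(1)}$ and $s(f_{(2)})$ stand for their images on $\mathcal{d}$):

$$h(f \triangleright x) = \mathrm{tr}\bigl(\gamma\, f_{(1)}\, x\, s(f_{(2)})\bigr) = \mathrm{tr}\bigl(s(f_{(2)})\,\gamma\, f_{(1)}\, x\bigr) = \mathrm{tr}\bigl(\gamma\, s^{-1}(f_{(2)})\, f_{(1)}\, x\bigr) = \varepsilon(f)\,\mathrm{tr}(\gamma\, x) = \varepsilon(f)\, h(x),$$

where the second equality is the trace property, the third uses $g\gamma = \gamma\, s^{-2}(g)$ (equivalent to $\gamma f = s^2(f)\gamma$) with $g = s(f_{(2)})$, and the fourth uses the antipode identity $s^{-1}(f_{(2)})\, f_{(1)} = \varepsilon(f)1$.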
3 example: a quantum hyperboloid

let $q \in (0,1)$ and $s \in [-1,1)$. following [2], we define the two-sheet quantum hyperboloid $x := \mathcal{o}_q(x_{s,1})$ (after a slight reparametrization) as the *-algebra generated by $y$, $y^*$ and $x = x^*$ with commutation relations

$$yx = q^2 xy, \quad xy^* = q^2 y^* x, \quad y^* y = (q^{-2}x - s)(q^{-2}x - 1), \quad yy^* = (x - s)(x - 1).$$

the hopf *-algebra $u := u_q(\mathrm{su}_{1,1})$ is generated by $e$, $f$, $k$ and its inverse $k^{-1}$ with relations

$$ke = q^2 ek, \quad fk = q^2 kf, \quad ef - fe = (q - q^{-1})^{-1}(k - k^{-1}),$$

with hopf structure

$$\delta(e) = e \otimes 1 + k \otimes e, \quad \delta(f) = f \otimes k^{-1} + 1 \otimes f, \quad \delta(k) = k \otimes k,$$
$$\varepsilon(e) = \varepsilon(f) = 0, \quad \varepsilon(k) = 1, \quad s(e) = -k^{-1}e, \quad s(f) = -fk, \quad s(k) = k^{-1},$$

and with involution $k^* = k$, $e^* = -kf$. the quantum hyperboloid $x$ becomes a $u$-module *-algebra with the action defined by

$$k \triangleright y = q^2 y, \quad e \triangleright y = 0, \quad f \triangleright y = q^{1/2}((1 + q^{-2})x - (1 + s)),$$
$$k \triangleright x = x, \quad e \triangleright x = q^{1/2} y, \quad f \triangleright x = q^{5/2} y^*,$$
$$k \triangleright y^* = q^{-2} y^*, \quad e \triangleright y^* = q^{-3/2}((1 + q^{-2})x - (1 + s)), \quad f \triangleright y^* = 0.$$

let $i$ be an at most countable index set, $\mathcal{h}_0$ a hilbert space, and $\mathcal{h} = \oplus_{i \in i}\, \mathcal{h}_0$. we denote by $\eta_i$ the vector of $\mathcal{h}$ which has the element $\eta \in \mathcal{h}_0$ as its $i$-th component and zero otherwise. it is understood that $\eta_i = 0$ whenever $i \notin i$. let $u$ be a unitary operator on $\mathcal{h}_0$, and let $a$ and $b$ be selfadjoint operators on the hilbert space $\mathcal{h}_0$ such that $\mathrm{spec}(a) \subset [q^2, 1]$, $\mathrm{spec}(b) \subset [q^2, s]$, $q^2$ is not an eigenvalue of $a$ and $b$, and $s$ is not an eigenvalue of $b$. set $\lambda_n := \sqrt{(q^{2n} - s)(q^{2n} - 1)}$ and $\lambda_n(t) := \sqrt{(q^{2n}t - s)(q^{2n}t - 1)}$. then a list of non-equivalent *-representations of $x$ is given by the following formulas (suppressing the letter $\pi$ of the representation):

$s \in [-1,1)$: $x\eta_n = q^{-2n}\eta_n$, $y\eta_n = \lambda_{-(n+1)}\eta_{n+1}$ on $\mathcal{h} = \oplus_{n \in \mathbb{n}_0}\mathcal{h}_0$.
$s \in [0,1)$: $x\eta_n = -q^{2(n+1)}a\,\eta_n$, $y\eta_n = \lambda_n(-a)\,\eta_{n-1}$ on $\mathcal{h} = \oplus_{n \in \mathbb{z}}\mathcal{h}_0$.
$s \in (0,1)$: $x\eta_n = q^{2(n+1)}s\,\eta_n$, $y\eta_n = \lambda_n(s)\,\eta_{n-1}$ on $\mathcal{h} = \oplus_{n \in \mathbb{n}_0}\mathcal{h}_0$; $x = 0$, $y = su$ on $\mathcal{h}_0$.
$s \in (q^2,1)$: $x\eta_n = q^{-2n}s\,\eta_n$, $y\eta_n = \lambda_{-(n+1)}(s)\,\eta_{n+1}$ on $\mathcal{h} = \oplus_{n \in \mathbb{n}_0}\mathcal{h}_0$; $x\eta_n = q^{2(n+1)}\eta_n$, $y\eta_n = \lambda_n\eta_{n-1}$ on $\mathcal{h} = \oplus_{n \in \mathbb{n}_0}\mathcal{h}_0$; $x\eta_n = q^{2(n+1)}b\,\eta_n$, $y\eta_n = \lambda_n(b)\,\eta_{n-1}$ on $\mathcal{h} = \oplus_{n \in \mathbb{z}}\mathcal{h}_0$.
$s = q^2$: $x = q^2$, $y = 0$ on $\mathcal{h}_0$.
$s = 0$: $x = y = 0$ on $\mathcal{h}_0$.
$s \in [-1,0)$: $x\eta_n = q^{-2n}s\,\eta_n$, $y\eta_n = \lambda_{-(n+1)}(s)\,\eta_{n+1}$ on $\mathcal{h} = \oplus_{n \in \mathbb{n}_0}\mathcal{h}_0$.

the domain $\mathcal{d}$ of the representation can be chosen, for instance, to be the linear span of the $\eta_n$'s. if one imposes some well-behavedness conditions, for instance that $\bar{x}$ is selfadjoint and that $y f(\bar{x}) \subset f(\bar{x}) y$ for all bounded measurable functions $f$ (with respect to the spectral measure of $\bar{x}$), then this list is complete in the sense that each well-behaved representation is a direct sum of representations from the above list. a single representation is irreducible if and only if $\mathcal{h}_0 = \mathbb{c}$. in this case $a$, $b$ and $u$ become complex numbers such that $a \in (q^2, 1]$, $b \in (q^2, s)$ and $|u| = 1$. for the proof of these claims, see [4].
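a quick numerical sanity check of the first representation family is straightforward on a truncated matrix model (the values of q, s and the truncation size n below are arbitrary; the relation for y*y is checked away from the truncation boundary, where the cut-off makes it fail by construction):

```python
import numpy as np

q, s, N = 0.7, -0.5, 15                    # arbitrary sample values; N = truncation
X = np.diag(q ** (-2.0 * np.arange(N)))    # x eta_n = q^{-2n} eta_n
Y = np.zeros((N, N))
for m in range(N - 1):
    lam = np.sqrt((q ** (-2 * (m + 1)) - s) * (q ** (-2 * (m + 1)) - 1.0))
    Y[m + 1, m] = lam                      # y eta_m = lambda_{-(m+1)} eta_{m+1}

I = np.eye(N)
print(np.allclose(Y @ X, q ** 2 * (X @ Y)))           # yx = q^2 xy
print(np.allclose(Y @ Y.T, (X - s * I) @ (X - I)))    # yy* = (x - s)(x - 1)
lhs = (Y.T @ Y)[:-1, :-1]                              # y*y, away from the boundary
rhs = ((X / q**2 - s * I) @ (X / q**2 - I))[:-1, :-1]
print(np.allclose(lhs, rhs))
```

all three checks print true, matching the defining relations of the quantum hyperboloid.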
given a *-representation such that $x$ is invertible in $l^+(\mathcal{d})$, set

$$e := q^{-1/2}(q - q^{-1})^{-1} x^{-1} y, \quad f := -q^{1/2}(q - q^{-1})^{-1} y^*, \quad k := q\,x^{-1}.$$

direct computations show that

$$k \triangleright z = kzk^{-1}, \quad e \triangleright z = ez - kzk^{-1}e, \quad f \triangleright z = fzk - zfk \qquad (8)$$

for $z = x, y, y^*$. using the relations

$$ke = q^2 ek, \quad fk = q^2 kf, \quad ef - fe = (q - q^{-1})^{-1}(sk - k^{-1}), \qquad (9)$$

one easily proves that (8) defines a $u$-action on $l^+(\mathcal{d})$ turning it into a left $u$-module *-algebra. with $a$ being the *-subalgebra of $l^+(\mathcal{d})$ generated by $y$, $y^*$, $x$ and $x^{-1}$, the *-algebra $s(a)$ defined in (4) becomes a left $u$-module *-subalgebra of $l^+(\mathcal{d})$. since the traces of elements from $s(a)$ are well-defined, we can state the following proposition.

proposition 2. if $\pm x$ is a non-negative selfadjoint operator, then $h(f) := \pm\mathrm{tr}(k^{-1}f)$ defines an invariant integral on $s(a)$.

proof. the invariance follows from the trace property $\mathrm{tr}(af) = \mathrm{tr}(fa)$ for all $f \in s(a)$ and $a \in a$. as an example, we show the invariance with respect to $e$:

$$h(e \triangleright z) = \pm\mathrm{tr}(k^{-1}ez - zk^{-1}e) = \pm\mathrm{tr}(k^{-1}ez - k^{-1}ez) = 0 = \varepsilon(e)\,h(z).$$

the positivity of $h$ is clear by the positivity of $\pm k^{-1} = \pm q^{-1}x$. □

note that equation (8) is invariant under the rescaling $k \mapsto tk$ and $f \mapsto t^{-1}f$. if $t \in \mathbb{r} \setminus \{0\}$, the rescaling does not affect the involution, i.e., we have $k^* = k$ and $e^* = -kf$. from (9), it follows that

$$\rho(k) = s^{1/2}k, \quad \rho(e) = e, \quad \rho(f) = s^{-1/2}f$$

defines a moment map $\rho : u \to a$ if and only if $s \in (0,1)$. in this situation, proposition 2 is an immediate consequence of theorem 1 together with the formula of the quantum trace. however, we emphasize that proposition 2 holds for all $s \in [-1,0)$, even if the operators $k$, $e$ and $f$ do not satisfy the defining relations of $u_q(\mathrm{su}_{1,1})$. this shows that the operator-theoretic approach to invariant integration theory is more general than the method based on a quantum moment map. we also would like to point out that our approach works for all representations from the above list where $x \neq 0$, even for those where $x$ has a continuous spectrum, whereas in the algebraic approach one usually considers functions in $x$ which are supported on a discrete set [2, 7].

references

[1] klimyk, a. u., schmüdgen, k.: quantum groups and their representations. berlin: springer-verlag, 1997.
[2] korogodsky, l. i.: representations of quantum algebras arising from non-compact quantum groups: quantum orbit method and super-tensor products. ph.d. thesis, massachusetts institute of technology, dept. of math., 1996. complimentary series representations and quantum orbit method. arxiv:q-alg/9708026v1.
[3] kürsten, k.-d., wagner, e.: an operator-theoretic approach to invariant integrals on quantum homogeneous su_{n,1}-spaces. publ. res. inst. math. sci. 43 (1), 2007, p. 1–37.
[4] lucio peña, p. c., osuna castro, o., wagner, e.: invariant integration theory on the quantum hyperboloid, in preparation.
[5] osuna castro, o., wagner, e.: an operator-theoretic approach to invariant integrals on quantum homogeneous sl(n+1, r)-spaces. arxiv:math.qa/0904.0669v1.
[6] schmüdgen, k., wagner, e.: hilbert space representations of cross product algebras. j. funct. anal. 200 (2), 2003, p. 451–493.
[7] shklyarov, d. l., sinel'shchikov, s. d., vaksman, l. l.: integral representations of functions in the quantum disk. i. (russian) mat. fiz. anal. geom. 4 (3), 1997, p. 286–308. quantum matrix balls: differential and integral calculi. arxiv:math.qa/9905035.

osvaldo osuna castro, e-mail: osvaldo@ifm.umich.mx, department of physics and mathematics, university of michoacan, morelia, mexico
elmar wagner, e-mail: elmar@ifm.umich.mx, department of physics and mathematics, university of michoacan, morelia, mexico

acta polytechnica vol. 50 no. 4/2010

voice activity detection for speech enhancement applications

e. verteletskaya, k. sakhnov

abstract: this paper describes a study of noise-robust voice activity detection (vad) utilizing the periodicity of the signal, the full band signal energy and the high band to low band signal energy ratio. conventional vads are sensitive to a variably noisy environment, especially with low snr, and also result in cutting off unvoiced regions of speech as well as random oscillation of the output vad decisions. to overcome these problems, the proposed algorithm first identifies voiced regions of speech and then differentiates unvoiced regions from silence or background noise using the energy ratio and the total signal energy.
the performance of the proposed vad algorithm is tested on real speech signals. comparisons confirm that the proposed vad algorithm outperforms conventional vad algorithms, especially in the presence of background noise.

keywords: voice activity detection, periodicity measurement, voiced/unvoiced classification, speech analysis.

1 introduction

an important problem in speech processing applications is the determination of active speech periods within a given audio signal. speech can be characterized as a discontinuous signal, since information is carried only when someone is speaking. the regions where voice information exists are referred to as 'voice-active' segments, and the pauses between talking are called 'voice-inactive' or 'silence' segments. the decision on the class to which an audio segment belongs is based on an observation vector, commonly referred to as a 'feature' vector. one or many different features may serve as the input to a decision rule that assigns the audio segment to one of these two classes. an algorithm employed to detect the presence or absence of speech is referred to as a voice activity detector (vad).

vad is an important component of speech processing techniques such as speech enhancement, speech coding, and automatic speech recognition. in speech enhancement applications, for example in spectral subtractive type noise reduction algorithms, vad is used for noise estimation, which is then used in the noise reduction process. speech/silence detection is necessary in order to determine the frames of noisy speech that contain noise only. speech pauses or noise-only frames are essential to allow the noise estimate to be updated, thereby making the estimation more accurate. in speech coding, the purpose is to encode the input audio signal in such a way that the overall transferred data rate is reduced. since information is only carried when someone is speaking, clearly knowing when this occurs can greatly aid in data reduction. another example is speech recognition. in this case, a clear indication of active speech periods is critical: false detection of active speech periods will have a direct degradation effect on the recognition algorithm. other examples include audio conferencing, echo cancellation, voip applications, cellular radio systems (gsm and cdma based) [1] and hands-free telephony [2].

generating an accurate indication of the presence or absence of speech is generally difficult, especially when the speech signal is corrupted by background noise or by unwanted impulse noise. voice activity detection performance trade-offs are made by maximizing the detection rate of active speech while minimizing the false detection rate of inactive segments. various techniques for vad have been proposed [3, 4, 5, 6, 7]. in the early vad algorithms, short-time energy, zero-crossing rate and linear prediction coefficients were among the features commonly used in the detection process [3]. cepstral coefficients [4], spectral entropy [5], a least-square periodicity measure [6], and wavelet transform coefficients [7] are examples of recently proposed vad features. signal energy remains one of the basic components of the feature vector, and most of the standardized algorithms use signal energy together with other parameters to make a decision. for voice activity detection, the proposed algorithm utilizes the total signal energy, which is compared with a dynamically calculated threshold.
besides the total energy measure, the algorithm is supplemented by a signal periodicity measure and a high-frequency to low-frequency signal energy ratio for more accurate decisions on voice presence.

2 voice activity detection principle

the basic principle of a vad device is that it extracts measured features or quantities from the input signal and then compares these values with thresholds, usually extracted from noise-only periods. voice activity (vad = 1) is declared if the measured values exceed the thresholds; otherwise, there is no speech activity, and silence or noise (vad = 0) is present. a general block diagram of a vad design is shown in fig. 1. vad design involves extracting acoustic features that can appropriately indicate the probability of target speech signals existing in the observed signals. based on these acoustic features, the latter part decides whether the target speech signals are present in the observed signals, using a computed, well-adjusted threshold value. most vad algorithms output a binary decision on a frame-by-frame basis, where a frame of the input signal is a short unit of time, 5–40 ms in length. the accuracy and reliability of a vad algorithm depend heavily on the decision thresholds. adapting the threshold value helps to track time-varying changes in the acoustic environments, and hence provides a more reliable voice detection result.

fig. 1: block diagram of a basic vad design

2.1 vad algorithms based on energy thresholding

in energy-based vad, the energy of the signal is compared with a threshold that depends on the noise level. speech is detected when the energy estimate lies above the threshold:

if $(e_j > k \cdot e_r)$, where $k > 1$, the frame is active; else the frame is inactive. (1)

in this rule, $e_r$ represents the energy of the noise frames, while $k \cdot e_r$ is the threshold used in the decision-making. having a scaling factor $k$ allows a safe band for adapting $e_r$ and, therefore, adapting the threshold. different energy-based vads differ in the way the thresholds are updated. the simplest energy-based method, the linear energy-based detector (led), was first described in [8]. the rule for updating the threshold value was specified as

$$e_r^{new} = (1 - p) \cdot e_r^{old} + p \cdot e_{silence}. \qquad (2)$$

here, $e_r^{new}$ is the updated value of the threshold reference, $e_r^{old}$ is the previous one, and $e_{silence}$ is the energy of the most recent unvoiced frame. the reference $e_r$ is updated as a convex combination of the old threshold and the current noise update. the parameter $p$ is constant ($0 < p < 1$).

2.2 energy of a frame

the most common way to calculate the full-band energy of a speech signal is the short-time energy calculation. if $x(i)$ is the $i$-th sample of speech and $n$ is the number of samples in a frame, then the short-time energy of the $j$-th frame of the speech signal can be represented as

$$e_j = \frac{1}{n} \sum_{i=(j-1)n+1}^{jn} x^2(i). \qquad (3)$$

another common way to calculate the energy of a speech signal is the root mean square energy (rmse), which is the square root of the short-time energy (3):

$$e_j = \left[\frac{1}{n} \sum_{i=(j-1)n+1}^{jn} x^2(i)\right]^{1/2}. \qquad (4)$$

fig. 2 shows that the power estimate of a speech signal exhibits distinct peaks and valleys. while the peaks correspond to speech activity, the valleys can be used to obtain a noise power estimate. rmse is more appropriate for thresholding, because it displays the valleys in greater detail.

fig. 2: short-time vs. root mean square energy
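a minimal sketch of the led scheme of eqs. (1)–(4) follows; the frame length, the constants k and p, and the assumption that the first frame is noise only are illustrative choices, not values from the paper:

```python
import numpy as np

def frame_rmse(x, n):
    """root mean square energy per frame, eq. (4)."""
    frames = x[: (len(x) // n) * n].reshape(-1, n)
    return np.sqrt((frames ** 2).mean(axis=1))

def led_vad(x, n=256, k=2.0, p=0.2):
    """linear energy-based detector: decision (1) with threshold update (2)."""
    e = frame_rmse(x, n)
    e_r = e[0]                          # assume the first frame is noise only
    vad = np.zeros(e.size, dtype=int)
    for j, e_j in enumerate(e):
        if e_j > k * e_r:
            vad[j] = 1                  # active speech frame
        else:
            e_r = (1 - p) * e_r + p * e_j   # convex combination update, eq. (2)
    return vad
```

note that e_r is only updated on frames classified as inactive, so the noise reference tracks the valleys of the energy contour.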
fig. 3: logic flowchart of the proposed vad

3 the proposed voice activity detector

for voice/silence detection, the proposed algorithm uses a periodicity measure of the signal, as well as the high-frequency to low-frequency signal energy ratio and a full-band energy computation. a simplified flowchart of the whole algorithm is given in fig. 3.

3.1 feature extraction

signal periodicity $c$ is determined by estimating the pitch period of the signal. to reduce the computational complexity, the input signal is first center-clipped [9]; the normalized autocorrelation function $R(\tau)$ given by (5) is then used for pitch estimation:

$R(\tau) = \frac{\sum_{n=0}^{N-m-1} x(n)\, x(n+\tau)}{\sqrt{\sum_{n=0}^{N-m-1} x^2(n+\tau)}}, \quad T_{min} \le \tau \le T_{max}$ (5)

where $x(n)$, $n = 0, 1, \ldots, N$ is the input signal frame. the autocorrelation function is calculated for values of the lag $\tau$ from $T_{min}$ to $T_{max}$, the lower and upper limits of the pitch period, respectively. the pitch period of a voiced frame is equal to the value of $\tau$ that maximizes the normalized autocorrelation function, and the periodicity $c$ of the frame is given by the maximum value of $R(\tau)$.

the total voice band energy $E_f$ is computed for the voice band frequency range from 0 hz to 4 khz and is given by (4). the computation of the threshold for the total voice band energy is based on the energy levels $E_{min}$ and $E_{max}$ obtained from the sequence of incoming frames. these values are stored in memory and the threshold is calculated as

$Threshold = (1 - \lambda) \cdot E_{max} + \lambda \cdot E_{min}$ (6)

$\lambda = \frac{E_{max} - E_{min}}{E_{max}}$ (7)

here $\lambda$ is a scaling factor controlling the estimation process. the voice detector performs reliably when $\lambda$ is in the range [0.950, …, 0.999]. the value of $\lambda$ cannot be the same for different types of signals, so it must be set up properly; computing the scaling factor $\lambda$ by (7) makes the threshold independent of, and resistant to, a variable background environment.

fig. 4: threshold computation for total band signal energy

the energy ratio $ER$ is computed as the ratio of the energy above 2 khz to the energy below 2 khz in the input voice band signal. to obtain the high-frequency signal, the input signal is passed through a high-pass filter with a cut-off frequency of 2 khz. the high-frequency to low-frequency energy ratio is calculated as

$ER = E_h / (E_f - E_h)$ (8)

where $E_f$ and $E_h$ are the full-band and high-band signal energies, respectively, calculated by (4) and expressed in db.

fig. 5: detailed flowchart of the proposed vad
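a minimal python sketch of the feature computations in this section: the normalized autocorrelation of eq. (5), the adaptive threshold of eqs. (6)–(7) and the energy ratio of eq. (8). the pitch-lag bounds and the use of linear (non-db) energies in the ratio are assumptions for illustration.

```python
import numpy as np

def periodicity(frame, t_min=20, t_max=160):
    """periodicity c = max of r(tau), eq. (5), mirroring the paper's
    normalization; t_min/t_max bound the pitch period in samples
    (illustrative values for 8 khz speech)."""
    n = len(frame)
    best = 0.0
    for tau in range(t_min, t_max + 1):
        prod = np.dot(frame[:n - tau], frame[tau:])
        denom = np.sqrt(np.sum(frame[tau:] ** 2))
        if denom > 0.0:
            best = max(best, prod / denom)
    return best

def energy_threshold(e_min, e_max):
    """eqs. (6)-(7): lambda adapts the threshold to the observed
    energy range between e_min and e_max."""
    lam = (e_max - e_min) / e_max
    return (1.0 - lam) * e_max + lam * e_min

def energy_ratio(e_f, e_h):
    """eq. (8): high- to low-frequency energy ratio, computed here
    with linear energies (the paper works with db values)."""
    return e_h / (e_f - e_h)
```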
3.2 thresholding and the hang-over algorithm

after feature extraction, the parameters are compared with several thresholds to generate an initial vad decision ($IVAD$) (see fig. 5). after the thresholds have been compared to determine the value of $IVAD$, a final output decision is made according to the lower part of the algorithm flowchart. the output decision $FVAD$ is performed anew for each value of $IVAD$ produced by the threshold comparison. the final output decision involves a smoothing hang-over algorithm, which ensures that the detection of either the presence or the absence of speech lasts for at least a minimum period of time and does not oscillate on and off. upon start-up of the vad, the hang-over flag $HVAD$ and the final vad flag $FVAD$ are initialized to zero. the output decision block checks whether the received $IVAD$ value is one. if so, speech has been detected, and the output decision sets $HVAD$ and $FVAD$ to one. if the value of $IVAD$ is found to be zero, speech has not been detected; in that case the output decision checks whether $HVAD$ was set to one in the previous frame. if so, it checks whether the smoothed energy value $E_{fs}$ less the value of $E_{min}$ is greater than 8 db. if it is, holdover is indicated, and the output decision maintains $FVAD$ set to one, even though speech has not been detected.
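the hang-over logic just described fits in a few lines; this python sketch follows the flag handling of section 3.2 with the 8 db holdover margin from the text (the per-frame stream representation is an assumption).

```python
def hangover_decision(ivad_stream, efs_stream, emin_stream, margin_db=8.0):
    """final output decision with hang-over smoothing (section 3.2).
    keeps fvad = 1 after speech ends while the smoothed energy efs
    stays more than margin_db above the minimum energy emin."""
    hvad = 0
    fvad_out = []
    for ivad, efs, emin in zip(ivad_stream, efs_stream, emin_stream):
        if ivad == 1:                 # speech detected this frame
            hvad = 1
            fvad_out.append(1)
        elif hvad == 1 and (efs - emin) > margin_db:
            fvad_out.append(1)        # holdover: maintain fvad = 1
        else:
            hvad = 0
            fvad_out.append(0)
    return fvad_out
```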
4 experimental results

the matlab environment was used to test the algorithms on thirty speech signals from the czech speech database. the test templates varied in loudness, speech continuity, background noise and accent. both male and female speech in the czech language were used for the experiments. fig. 6 shows the voice/silence classification results of the proposed vad algorithm. the performance of the algorithm is compared with that of the led algorithm [8]. the comparison is performed on real clean speech and on speech degraded by additive noise. it is clear from the figures that the proposed vad outperforms the led algorithm in the extent of misdetection. in contrast to the led algorithm, the proposed vad correctly detects unvoiced speech regions, and it is able to detect the beginnings and ends of active speech segments accurately even in noisy speech signals.

fig. 6: performance comparison of vad algorithms: (a) led algorithm, clean speech; (b) proposed algorithm, clean speech; (c) led algorithm, noisy speech (snr = 5 db); (d) proposed algorithm, noisy speech (snr = 5 db)

5 conclusion

this paper has presented voice activity detection algorithms employed to detect the presence/absence of speech components in an audio signal. an alternative vad based on periodicity detection and the high-frequency to low-frequency signal energy ratio has been presented. the aim of the paper was to show the principle of the proposed vad algorithm and to compare it with the known linear energy-based detector (led). the results consistently show the superiority of the proposed vad scheme over the led algorithm. the algorithm has low computational complexity and can easily be integrated into speech coders and other speech enhancement systems.

acknowledgement

the research described in this paper was supervised by prof. ing. b. simak, csc., fel ctu in prague, and was supported by czech technical university grant sgs no. ohk3-108/10 and by the ministry of education, youth and sports of the czech republic under research program msm 6840770014.

references

[1] etsi ts 126 094 v3.0.0 (2000-01), 3g ts 26.094 version 3.0.0 release 1999, universal mobile telecommunications system (umts); mandatory speech codec speech processing functions; amr speech codec; voice activity detector (vad), 2000.
[2] benyassine, a., shlomot, e., su, h.-y.: itu-t recommendation g.729 annex b: a silence compression scheme for use with g.729 optimized for v.70 digital simultaneous voice and data applications, ieee commun. mag., 1997, vol. 35, p. 64–73.
[3] atal, b. s., rabiner, l. r.: a pattern recognition approach to voiced-unvoiced-silence classification with applications to speech recognition, ieee trans. acoustics, speech, signal processing, vol. 24, p. 201–212, june 1976.
[4] haigh, j. a., mason, j. s.: robust voice activity detection using cepstral features, in proc. of ieee region 10 annual conf. speech and image technologies for computing and telecommunications, (beijing), p. 321–324, oct. 1993.
[5] mcclellan, s. a., gibson, j. d.: spectral entropy: an alternative indicator for rate allocation, in ieee int. conf. on acoustics, speech, signal processing, (adelaide, australia), p. 201–204, apr. 1994.
[6] tucker, r.: voice activity detection using a periodicity measure, iee proc.–i, vol. 139, p. 377–380, aug. 1992.
[7] stegmann, j., schroder, g.: robust voice-activity detection based on the wavelet transform, in proc. ieee workshop on speech coding for telecommunications, (pocono manor, pn), p. 99–100, sept. 1997.
[8] pollak, p., sovka, p., uhlir, j.: noise suppression system for a car, proc. of the third european conference on speech, communication and technology – eurospeech'93, (berlin, germany), p. 1073–1076, sept. 1993.
[9] verteletskaya, e., šimák, b.: performance evaluation of pitch detection algorithms. access server [online]. 2009, vol. 7, no. 200906, p. 0001. issn 1214-9675.

about the authors

ekaterina verteletskaya was born in uzbekistan. she was awarded an msc degree in telecommunication and radio engineering from the czech technical university in prague in 2008. she is currently a phd student at the department of telecommunication engineering of ctu in prague. her current activities are in the area of digital signal processing, focused on speech coding algorithms for mobile communications.

kirill sakhnov was born in uzbekistan. he was awarded an msc degree from the czech technical university in prague in 2008. he is currently a phd student at the department of telecommunication engineering of ctu in prague. his current activities are in the area of adaptive digital signal processing, focused on problems of acoustic and network echo cancellation in telecommunication devices.

ekaterina verteletskaya, kirill sakhnov
e-mail: verteeka@fel.cvut.cz, sakhnkir@fel.cvut.cz
czech technical university in prague, technická 2, 166 27 praha, czech republic

the depfet mini-matrix particle detector

j. scheirich

abstract

the depfet is a new type of active pixel particle detector. a mosfet is integrated in each pixel, providing the first amplification stage of the readout electronics. excellent noise parameters are obtained with this layout. the depfet detector will be integrated as an inner detector in the belle ii and ilc experiments. a flexible measuring system with a wide control cycle range and minimal noise was designed for testing small detector prototypes. noise of 60 electrons of equivalent input charge was achieved during the first measurements with the system.

keywords: depfet, belle ii, ilc, silicon pixel detector.

1 introduction

accelerator physics experiments use linear or cyclic accelerators to accelerate charged particles. the particles are collided at an interaction point surrounded by various types of particle detectors that track the newly originated particles. the pixel semiconductor detector nearest to the interaction point is called an inner detector. the depfet collaboration is an international organization developing a new type of pixel semiconductor inner detector for the international linear collider and belle ii, an upgrade of the belle experiment in japan. depfet is an abbreviation of 'depleted field effect transistor'. a new system has been developed for measuring and characterizing small samples of the depfet detector. this system enables precise charge measurements and a flexible, high-resolution configuration of steering signals.
2 the depfet detector

2.1 active pixel structure

the depfet detector itself consists of a high-resistivity depleted n-substrate and two p-regions, creating a pnp-sandwich structure (p front-side implantation, n-substrate, p rear-side). the n-substrate is depleted sidewards; the principle of sideward depletion [1, 3] is shown in fig. 1. the n-substrate (bulk) is depleted from both sides by applying negative voltages to both p-implantations with respect to the bulk. the minimum of the electron potential then lies in a plane parallel to the front surface, at a depth $x_{min}$ given by [1]

$x_{min} = \frac{d}{2} + \frac{\varepsilon_s}{q N_D d} (V_D - V_U)$ (1)

where $x_{min}$ is the depth of the potential minimum in the detector substrate, $q$ is the elementary charge, $N_D$ is the doping concentration of the substrate, $d$ is the total wafer thickness, $\varepsilon_s$ is the dielectric constant of the semiconductor, and $V_D$, $V_U$ are the voltages applied to the rear and front sides. if $V_U = V_D$, the potential minimum lies in the middle of the wafer. asymmetric voltages are applied in the depfet pixel to shift the electron potential minimum close to the front surface, where the mosfet is located. additional n-implants hinder lateral electron diffusion, so the electrons are concentrated in a small region under the mosfet channel. this region is called the internal gate.

fig. 1: the principle of sideward depletion

impinging radiation generates electron-hole pairs in the depleted n-substrate (bulk); the holes drift to the rear-side contact, and the electrons are trapped in the internal gate. the internal gate is located directly under the mosfet channel, below the external gate contact, so the charge stored in the internal gate affects the mosfet channel. for a fixed drain-to-source voltage $V_{DS}$ and a constant external gate voltage $V_{GS}$, the drain current $I_D$ is proportional to the charge stored in the internal gate. the amplification $g_q$ is given by the change of the transistor current $\delta I_D$ due to the collected charge $\delta Q$ [1]:

$g_q = \left. \frac{\delta I_D}{\delta Q} \right|_{V_{GS}, V_{DS}}$ (2)

an amplification of 300–600 pa/e⁻ is obtained for mini-matrices with an effective channel length of 4 μm.

fig. 2: the depfet pixel cross-section
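a small python sketch evaluating eq. (1); the silicon relative permittivity is an assumed material constant, and the example values in the check are illustrative only.

```python
Q = 1.602e-19      # elementary charge [C]
EPS0 = 8.854e-12   # vacuum permittivity [F/m]

def x_min(d, n_d, v_d, v_u, eps_r=11.7):
    """depth of the electron potential minimum, eq. (1).
    d: wafer thickness [m]; n_d: substrate doping [m^-3];
    v_d, v_u: rear- and front-side voltages [V];
    eps_r = 11.7 assumes a silicon substrate."""
    eps_s = eps_r * EPS0
    return d / 2.0 + eps_s / (Q * n_d * d) * (v_d - v_u)

# symmetric biasing leaves the minimum in the middle of the wafer:
assert x_min(450e-6, 1e18, -20.0, -20.0) == 450e-6 / 2
```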
2.2 clearing process

when a new charge collection is needed, the internal gate must be emptied. for clearing out the internal gate, there is a clear contact next to the mosfet transistor. fig. 3 shows a cleargate cross-section; a detailed description of the clearing process is given in [4, 5] and [6]. the electrons are extracted from the internal gate by applying a high positive voltage to the clear contact, which causes the electrons to drift to the clear contact, where they are taken away. to prevent losses during charge accumulation, the n+-region below the clear contact is surrounded by a p-well. the n+-region provides an ohmic contact to the clear electrode, and together with the p-well it forms a reverse-biased pn junction that represents a potential barrier for the electrons in the internal gate. when the voltage applied to the clear electrode is high enough, the depleted region in the p-well is able to pass through the p-well and touch the p-well boundary (the punch-through effect [2]). at this moment there is no barrier for the electrons in the internal gate, and they are extracted.

fig. 3: the cleargate cross-section

in order to control the potential barrier between the internal gate and the clear contact, an additional mos structure, the cleargate, is added. if the cleargate is on a positive potential during the clear process, it helps to form an n-channel in the p-well; and whereas the n-channel is situated at the surface, the punch-through effect is also effective in the depths of the internal gate.

3 measuring system

3.1 conception

the mini-matrix measuring system is able to measure and characterize a small (3.5 × 3.5 mm) prototype of a depfet (see fig. 4). the small sensor has 4 × 12 active pixels, allowing studies of the depfet structure behavior and of the processes during operation of the sensor. the mini-matrix readout setup allows a precise collected-charge measurement in each pixel with low noise, measurement of charge sharing among multiple pixels, clustering, charge-loss measurement, trimming of the steering voltage values, and timing of the driving signals.

fig. 4: photo of the depfet mini-matrix

the system is made of commercial and custom-made blocks: a pc with an 8-channel 14-bit 125 msps pci data acquisition card, an fpga control card, a current readout and a switching circuit. the custom-made current readout circuit consists of 8 low-noise trans-impedance readout amplifiers, and the switching circuit contains 12 individual analog switches that are necessary for controlling the gate and clear electrodes of the depfet mini-matrix sensor.

fig. 5: conception of the measuring system

the switching circuit can perform gate voltage timing with a resolution of 7.5 ns, and the voltage for pedestal current subtraction is reconfigurable. the measuring system is controlled and configured by the pc. all 8 channels are digitized in parallel by 14-bit adcs at 125 msps per channel, with a frame readout time of 26 μs.
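as a back-of-envelope check of these digitizer figures, the 26 μs frame at 125 msps corresponds to 3250 samples per channel per frame; the sketch below only restates the quoted specifications.

```python
channels = 8
bits_per_sample = 14
f_s = 125e6        # 125 MSps per channel
t_frame = 26e-6    # frame readout time [s]

samples_per_channel_per_frame = f_s * t_frame          # 3250.0
aggregate_bit_rate = channels * bits_per_sample * f_s  # 1.4e10 bit/s
print(samples_per_channel_per_frame, aggregate_bit_rate / 1e9, "Gbit/s")
```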
the external gate electrodes are used for addressing the rows of the matrix; the clear electrode is used to clear one row of pixels.

3.2 processing the signals from the detector

the readout signals from the depfet detector are current-based. the source of the mosfet pixel is kept at ground potential and the mosfet's drain is at −5 v. the output signal has two components: a constant pedestal current and a signal current proportional to the charge in the internal gate. the pedestal current is subtracted at the analog level at the inputs of the low-noise readout amplifiers. the drain signal currents are read out by 8 amplifiers in parallel and digitized by a gage octopus data acquisition card with 8 14-bit/125 msps inputs.

fig. 6: the stream of 8 drain signal currents

fig. 6 shows a typical digitized stream of an illuminated depfet matrix. the signal stream is recorded and acquired by the data acquisition system (daq).

3.3 low-noise current readout amplifiers

the system has 8 parallel current readout amplifiers, each consisting of 3 parts: a tia, a non-inverting operational amplifier (el2126), and a fully differential output buffer (ad8139), which can be configured as differential or single-ended. the measured input-referred equivalent noise current of the readout amplifiers is 18 na rms. the noise is reduced to 8.1 na rms by using a 4-sample averaging method, and to 3.2 na rms by 10-sample averaging; averaging over 90 samples is used in the daq system. the noise contribution of the amplifiers is then less than 3.6 na, which is 1 adu of the 14-bit digitizing system.

fig. 7: scheme of the current readout amplifier
fig. 8: photo of the current readout amplifier board

3.4 depfet matrix steering

the mini-matrix depfet sensor requires 6 external gate channels and 6 clear switching channels, which are controlled by the fpga card and configured via a pc program with a time resolution of 7.5 ns. three additional trigger channels are available for adc card synchronization.

fig. 9: scheme of the double-throw switch

fig. 9 shows a one-channel scheme of the switch. each channel consists of 1/4 of an adg1434 analog switch and an adum1100 galvanic separator insulating the digital control inputs of the fpga card. the readout frame scheme is very flexible and can be configured to fit the current requirements. the fpga generates the driving pulses according to the configuration software. fig. 10 shows the structure of the whole matrix readout frame configured in the control program, and fig. 11 shows the real control channel scope plots. the frame length is approx. 26 μs. the gate-on signals overlap by approx. 100 ns for each matrix row to maintain a continuous drain current flow; this prevents saturation of the amplifiers.

fig. 10: sequencer configuration software
fig. 11: scope plots of the external gate channel (above) and the clear channel (below)

3.5 daq control monitor

fig. 12 displays the results of the daq control monitor. the frame is triggered by a hardware trigger generated by the sequencer. the whole frame is recorded, and the signals for each row are software-triggered. samples are taken before and after the clear signal; the evaluated areas are indicated by blue stripes in the 'full monitor with soft triggers' top-left window in fig. 12. the complete data set can be saved in ascii format in adu, hexadecimal or mv float formats. the top middle window is the signal histogram, and the top right window indicates the distance between the hardware triggers. the bottom row of windows contains, from the left: the real acquired signals in mv, the histogram of the evaluated areas, and the evaluated parts of the signals. the daq control monitor helps to set the software triggers.

fig. 12: the daq control monitor

a visualization tool also forms part of the daq control monitor. the left window in fig. 13 shows the real pixel layout and the signal response in adu units of the 14-bit adc; 1 adu corresponds approximately to a charge of 6 electrons. the noise of each pixel is indicated in the right window. the noise of the matrix in the dark expresses the total system noise; the total noise of all pixels is less than 10 adu, which is a charge of 60 electrons.

fig. 13: the matrix signal response and noise visualization
fig. 14: a photo of the measuring system
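the adu-to-charge conversion and the frame rate quoted above (and again in the conclusions below) follow directly from these numbers; a trivial sketch, where the 6 e⁻/adu gain is the approximate value given in the text.

```python
ELECTRONS_PER_ADU = 6          # approximate gain of the 14-bit system

def noise_in_electrons(noise_adu):
    return noise_adu * ELECTRONS_PER_ADU

print(noise_in_electrons(10))  # < 10 adu total noise -> < 60 electrons

t_frame = 26e-6                # frame length [s]
print(1.0 / t_frame)           # ~38.5 kHz frame frequency, cf. section 4
```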
4 conclusions

a measuring system has been designed for the depfet mini-matrix particle detector. the 4 × 12 pixel matrix of the detector can be read out repeatedly with a frame frequency of 38 khz and a noise lower than 60 electrons. the steering pulses can be configured with a high resolution of 7.5 ns. these parameters are sufficient to start testing and characterizing the prototypes of the depfet particle detector. the development of a second measuring system has already started; noise below 20 electrons should be achieved with the new system.

acknowledgement

the research reported on in this paper has been supervised by dr. p. kodyš, ipnp cuni in prague and prof. m. husák, fee ctu in prague. the research has been supported by the czech grant agency under grant no. p203/10/0777 "development of the pixel semiconductor detector depfet for new particle experiments" and by the grant agency of the czech technical university in prague, grant no. sgs10/075/ohk3/1t/13 "testing and characterization of a mini-matrix depfet particle detector".

references

[1] trimpl, m.: design of a current based readout chip and development of a depfet pixel prototype system for the ilc vertex detector. phd thesis, bonn university, 2005.
[2] chu, j. l.: thermionic injection and space-charge-limited current in reach-through p+np+ structures. journal of applied physics, 1972.
[3] niculae, a. s.: development of a low noise analog readout for a depfet pixel detector. phd thesis, siegen university, 2003.
[4] andricek, l., fischer, p., heinzinger, k., et al.: the mos-type depfet pixel sensor for the ilc environment. nuclear instruments and methods in physics research. elsevier science, 2003.
[5] gartner, k., richter, r.: depfet sensor design using an experimental 3d device simulator. nuclear instruments and methods in physics research. elsevier science, 2006.
[6] sandoe, c., andricek, l., fischer, p., et al.: clear performance of linear depfet devices. nuclear instruments and methods in physics research. elsevier science, 2006.

about the author

ján scheirich was born in 1983. he was awarded a bachelor degree in electronics and telecommunication in 2006 and a master degree in electronics and photonics in 2008 from fee ctu in prague. he attended the cern summer school in 2007. he is now working towards his phd at the department of microelectronics at the czech technical university and at the institute of particle and nuclear physics at charles university in prague. he is a member of the depfet collaboration, the belle ii collaboration, and the atlas experiment at cern.

ján scheirich
e-mail: jan.scheirich@cern.ch
dept. of microelectronics, faculty of electrical engineering, czech technical university, technická 2, 166 27 praha, czech republic
institute of particle and nuclear physics, charles university, v holešovičkách 2, praha, czech republic

an investigation of the erosion wear of pitched blade impellers in a solid-liquid suspension

t. jirout, i. fořt

this paper reports on a study of the erosion wear mechanism of the blades of pitched blade impellers in a solid-liquid suspension, in order to determine the effect of the impeller speed n as well as of the concentration and size of the solid particles on the wear rate. a four-blade pitched blade impeller (pitch angle α = 30°), pumping downwards, was investigated in a pilot-plant fully baffled agitated vessel with a water suspension of corundum. the results of the experiments show that the erosion wear rate of the impeller blades is proportional to n^2.7 and that the rate exhibits a monotonous dependence (increase) with increasing size of the particles. however, the erosion rate of the pitched blade impeller reaches a maximum at a certain concentration, and above this value it decreases as the proportion of solid particles increases. all results of the investigation are valid under a turbulent flow regime of the agitated batch.

keywords: pitched blade impeller, erosion wear, solid-liquid suspension.

1 introduction

in all areas of particulate technology where solid particles are handled, structures coming into contact with particles exhibit wear. a major constraint of high-intensity agitation is the possibility of erosion wear developing on the impeller blades due to the presence of solid particles in the liquid [5, 6]. in some applications this wear can be so severe as to limit the life of a component, while in others it may be negligible [2]. all particles cause some wear, but in general the harder they are, the more severe the wear will be [3]. the materials used in plants differ in their susceptibility to erosive wear and in the mechanism by which such wear occurs. the erosion of a pitched blade impeller caused by particles of higher hardness (e.g. corundum or sand) can be described by an analytical approximation, in exponential form, of the profile of the leading edge of the worn blade (fig. 1):

$\bar h(\bar r) = 1 - c \exp\left[k(1 - \bar r)\right]$ (1)

where the dimensionless transversal coordinate along the width of the blade is

$\bar h = \frac{y(r)}{h}$ (2)

and the dimensionless longitudinal (radial) coordinate along the radius of the blade is

$\bar r = \frac{2r}{d}$ (3)

the parameters h and d characterize the blade width and the diameter of the impeller, respectively. the values of the parameters of eq. (1) – the wear rate constant k and the geometric parameter of the worn blade c – were calculated by the least squares method from the experimentally determined profile of the worn blade.

fig. 1: radial profile of the leading edge of the worn blade of a pitched blade impeller
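the least-squares evaluation of c and k can be carried out by linearizing eq. (1): taking the logarithm of 1 − h̄ = c·exp[k(1 − r̄)] gives a straight line in (1 − r̄). a python sketch under that assumption:

```python
import numpy as np

def fit_worn_blade(r_bar, h_bar):
    """fit the blade profile of eq. (1), h = 1 - c*exp(k*(1 - r)),
    by linear least squares on log(1 - h) vs. (1 - r).
    returns the geometric parameter c and the wear rate
    constant k (k < 0 for the profiles measured here)."""
    x = 1.0 - np.asarray(r_bar, dtype=float)
    y = np.log(1.0 - np.asarray(h_bar, dtype=float))
    k, log_c = np.polyfit(x, y, 1)   # slope = k, intercept = ln(c)
    return np.exp(log_c), k
```

in the experiments described below, each blade profile was digitized into at least 15 such points, and the resulting c and k were averaged over the four blades.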
while the wear rate constant exhibits a monotonous dependence both on the hardness of the solid particles and on the pitch angle α [1, 2], the geometric parameter of the worn blade depends on the pitch angle and, in linear form, on time. a recent investigation [2] shows that the latter parameter decreases hyperbolically with increasing blade hardness. all the investigations mentioned were carried out at the same scale of pilot-plant mixing equipment (vessel diameter t = 300 mm). this study attempts to extend our knowledge of the influence of the parameters of the mixing process, and also of the characteristics of the solid-liquid suspension, on the erosion wear of the blades of pitched blade impellers, i.e. to determine the effect of the concentration and size of the solid particles on both parameters of eq. (1), and finally to observe the effect of the impeller speed n.

2 experimental setup

a pilot-plant mixing vessel made from stainless steel was used (fig. 2), with water as the working liquid (density ρ_l = 1000 kg/m³, dynamic viscosity μ = 1 mpa·s) and particles of corundum (see table 1). pitched blade impellers with four adjustable inclined plane blades made from construction steel (pitch angle α = 30°), pumping downwards, were investigated in a fully baffled, flat-bottomed cylindrical agitated vessel (vessel diameter t = 300 mm, four baffles of width b = 30 mm, impeller diameter d = 100 mm, impeller off-bottom clearance c = 100 mm). the impeller speed was held constant at n = 900 min⁻¹ during the investigation of the influence of the suspension characteristics (see table 1), and three levels of this quantity (900 min⁻¹, 1050 min⁻¹ and 1200 min⁻¹) were selected for determining the dependence of the wear rate on the impeller speed (for average particle size d_p = 0.29 mm and volumetric particle concentration c_v = 5 %). the impeller speed was held within an accuracy of ±1 %, and the lowest level of this quantity corresponded, for all investigated values of d_p and c_v, to complete homogeneity of the suspension under a turbulent flow regime of the agitated batch.

fig. 2: geometry of the pilot-plant mixing vessel; t = 300 mm, h/t = 1, d/t = 1/3, c/d = 1, b/t = 1/10

fig. 3: design of a pitched blade impeller with four inclined plane blades; d = 100 mm, d₀ = 20 mm, h = 20 mm, s = 1.00 ± 0.05 mm, α = 30°

table 1: survey of the water-corundum suspensions used in the experiments

indication of particle grain | particle density ρ_s [kg·m⁻³] | average particle diameter d_p [mm] | average volumetric particle concentration in suspension c_v [%]
corundum 120 | 3930 | 0.15 | 5, 7.5, 10
corundum 90 | 3940 | 0.21 | 2.5, 5
corundum 70 | 3940 | 0.29 | 2.5, 5
corundum 60 | 3970 | 0.34 | 2.5
the preliminary experiments were made visually in a perspex mixing vessel under the same conditions as the erosion wear experiments. it follows from the results that, for all considered sizes and concentrations of the corundum particles, 90 % homogeneity of the suspension was reached at an impeller speed of n = 700 min⁻¹.

3 experiments

during the experiments, the shape of the blade profile was determined from magnified copies of the worn impeller blades scanned to a pc (magnification ratio 2:1). the parameters of the blade profile for a given time of the erosion process were determined from each curve of the four individual worn impeller blades. the selected time interval from the very beginning of each experiment was chosen so as not to exceed the moment when the impeller diameter began to shorten. the values of the parameters of eq. (1) – the wear rate constant k and the geometric parameter of the worn blade c – were then calculated by the least squares method from the experimentally found profile of each worn blade at the given time interval t of the erosion process. each curve was calculated from at least 15 points (h̄, r̄), with a regression coefficient better than r = 0.970 (see the example in fig. 4). the resulting values of the parameters k and c were the averages calculated from all the individual values of these parameters for each blade. it can be mentioned that the chosen shape of the regression curve h̄ = f(r̄) fits the experimental data best among other possible two-parameter equations (e.g. an arbitrary power function or a second-power parabola).

after the investigation of the shape of the worn blade, the weight of the blade was measured. all four blades were weighed on a scale with an accuracy of ±5 mg, and the weight of the blade m related to its initial weight m₀ (the relative weight) was calculated at a given time (period) of the erosion process. the average value of the weight of the blade was calculated as the mean of all the measured weights of the four individual blades m₀ⱼ or mⱼ:

$\bar m_0 = \frac{1}{4}\sum_{j=1}^{4} m_{0j}$, resp. $\bar m = \frac{1}{4}\sum_{j=1}^{4} m_j$ (4)

in this way the dependence of the quantity m/m₀ was obtained. at the same time, the change in the shape of the particles was observed during the erosion process. microscopic snapshots of the corundum particles were made before and after the process, and their size distributions were then compared. no change appeared on their surface (their edges did not become rounded and the corners did not disappear) after the experimental period came to an end (see the example in table 2), and their size distribution was also unchanged.

4 results and discussion

4.1 impeller speed vs. erosion rate

fig. 5 illustrates the time dependences of the relative weight of a worn impeller blade m/m₀ at three selected levels of impeller speed n:

$m/m_0 = 1 - c_{m,n} \cdot t$ (5)
fig. 5 also shows the values of the parameter c_{m,n} for all levels of impeller speed, calculated from the experimental data by the least squares method. the values increase with increasing impeller speed. this dependence can be described in power form:

$c_{m,n} \sim n^{2.7}$, r = 0.998 (6)

fig. 4: example of the evaluation of the shape of a worn impeller blade (points – experimental values, curve – calculated regression); the fitted curves of 1 − h̄ vs. 1 − r̄ for the four individual blades are 1 − h̄ = 0.0873 e^{−5.71(1−r̄)} (r = 0.980), 0.0836 e^{−6.10(1−r̄)} (r = 0.966), 0.0814 e^{−5.86(1−r̄)} (r = 0.968) and 0.0943 e^{−5.51(1−r̄)} (r = 0.979)

fig. 6 illustrates the time dependence of the wear rate constant k, and fig. 7 depicts the time dependence of the geometric parameter of the worn blade c at the three impeller speed levels. while the parameter k oscillates around a certain value throughout the erosion process, the parameter c increases with time. therefore the average value of the parameter k_av over the period in which it was determined, irrespective of the impeller speed, can be considered constant:

$k_{av} = -3.8 \pm 0.15$ (7)

the worn blade geometric parameter c increases linearly with the duration of the erosion process t:

$c = c_n \cdot t$ (8)

fig. 7 shows the values of the parameter c_n for all levels of impeller speed, calculated from the experimental data by the least squares method. the values increase with increasing impeller speed, and this dependence can be described in power form:

$c_n \sim n^{2.4}$, r = 0.939 (9)

table 2: microscopic snapshots of corundum particles (corundum 120, 90, 70, 60) at the start and at the end of the erosion process experiments

these results confirm that the wear rate of a pitched blade impeller depends significantly on the impeller speed. this dependence is expressed in the overall relationship m/m₀ = f(t), while the wear rate constant k does not exhibit any change within the tested impeller speed interval. the power in both dependences, m/m₀ = f(n) and c_n = f(n), exceeds two, so the wear does not depend only on the square of the velocity of the solid particles in the suspension, i.e. not only on their kinetic energy [4]. for metals, the value of the exponent at n can be considered to lie within the interval 2.3–3 [3]. it should be pointed out that these correlations are valid for the given relative impeller diameter d/t = 1/3 and pitch angle α = 30°.
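the exponents in eqs. (6) and (9) are slopes of log–log regressions of the fitted time-slopes against the impeller speed; a minimal sketch of that fit (the function only, since the measured data are not reproduced here):

```python
import numpy as np

def power_law_exponent(n_values, c_values):
    """slope of log(c) vs. log(n); applied to c_m,n it should
    return ~2.7 (eq. 6), applied to c_n ~2.4 (eq. 9)."""
    slope, _intercept = np.polyfit(np.log(n_values), np.log(c_values), 1)
    return slope
```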
fig. 5: time dependence of the relative weight of the impeller blade for different levels of impeller speed: points – experimental values, line – calculated linear regression
fig. 6: time dependence of the wear rate constant for different levels of impeller speed
fig. 7: time dependence of the geometric parameter of the worn blade for different levels of impeller speed

4.2 suspension characteristics vs. erosion rate

figs. 8 and 9 illustrate the time dependences of the relative weight of a worn impeller blade m/m₀ for the two investigated levels of average volumetric particle concentration c_v, in each case at three levels of average particle diameter d_p. these dependences can be expressed for all tested conditions in the linear form

$m/m_0 = 1 - c_{m,d} \cdot t$ (10)

it follows from the two figures that the value of the parameter c_{m,d} increases with increasing average particle diameter. this dependence can be expressed in power form for both levels of the average volumetric concentration:

$c_{m,d} = 0.0347 \, d_p^{2.43}$, r = 0.999 (c_v = 2.5 %) (11)

$c_{m,d} = 0.0117 \, d_p^{1.717}$, r = 0.899 (c_v = 5 %) (12)

from eqs. (11) and (12) we can conclude that the erosion wear rate of a pitched blade impeller exhibits a steeper dependence on time at the lower of the two concentrations investigated here.

fig. 8: time dependence of the relative weight of the impeller blade for different levels of average particle diameter d_p (c_v = 2.5 %): points – experimental values, line – calculated linear regression
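eqs. (11) and (12) can be wrapped into a small helper for predicting the time slope of the relative blade weight; the restriction to the two measured concentrations is deliberate, since the correlations are reported only for c_v = 2.5 % and 5 %.

```python
def c_md(dp_mm, cv_percent):
    """slope c_m,d of m/m0 = 1 - c_m,d * t, eqs. (11)-(12);
    dp_mm is the average particle diameter in mm."""
    if cv_percent == 2.5:
        return 0.0347 * dp_mm ** 2.43    # eq. (11), r = 0.999
    if cv_percent == 5.0:
        return 0.0117 * dp_mm ** 1.717   # eq. (12), r = 0.899
    raise ValueError("correlation reported only for cv = 2.5 % and 5 %")
```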
it follows from this table that parameter kav varies in its absolute value within the limits 3.95–4.42. therefore, we can assume that this parameter is independent of both the size and the concentration of the solid particles in a suspension, with its average value � � �kst 4 10 0 20. . . (14) when we compare the values kav (eq. 7) and kst (eq. 14), we can conclude that they show no significant difference within their variation. it follows from figs. 12, 13 and 14 that the geometric parameter of the worn blade increases linearly with the duration of the erosion process t c c tt� . (15) © czech technical university publishing house http://ctn.cvut.cz/ap/ 23 acta polytechnica vol. 48 no. 4/2008 average volumetric particle concentration cv 5.0 7.5 10.0 parameter cm,c in eq. (13) [h�1] 0.0013 0.0016 0.0013 table 3: dependence of parameter cm,c on the average volumetric particle concentration (dp � 015. mm) 0 1 2 3 4 5 6 0 20 40 60 80 100 120 t [h] -k [] d cp v� �0.29 mm, 2.5 % d cp v� �0.21 mm, 2.5 % d cp v� �0.15 mm, %5 d cp v� �15 , 10 %0. mm d cp v� �, 5 %0.29 mm dp � 0.21 mm, 5 %cv � dp � 0.15 mm, .5 %c 7v � dp � mm0.34 , 2.5 %cv � fig. 11: time dependence of the wear rate constant for different properties of the suspension average particle diameter dp [mm] average volumetric particle concentration cv [%] kav [-] 0.15 5 �4.09 7.5 �4.08 10 �4.06 0.21 2.5 �4.01 5 �4.14 0.29 2.5 �4.42 5 �4.14 0.34 2.5 �3.95 table 4: mean time values of the wear rate constant kav in accordance with eqs. (11) and (12), the power relation between parameter ct and the average particle diameter is c dt p� 0 0717 2 34. . , r cv� �0 952 2 5. ( . %) (16) and c dt p� 0 0614 176. . , r cv� �0 936 5. ( %) . (17) eqs. (16) and (17) confirm that the erosion wear rate of a pitched blade impeller depends significantly on the diameter of the solid particles in a suspension. similarly as for the relative weight of the blade (eq. 13), parameter ct reaches a maximum value within the interval of the average volumetric particle concentrations investigated here (see table 5). that is, at a higher concentration than cv � 7 5. %(dp � 015. mm), 24 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 48 no. 4/2008 0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0 20 40 60 80 100 120 t [h] c [] dp � 0.34 mm dp � 0.29 mm dp � 0.21 mm fig. 12: time dependence of the geometric parameter of the worn blade for different levels of average particle diameter dp (cv � 2 5. %) t [h] 0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0 20 40 60 80 100 c [] dp � 290. mm dp � 0.21 mm dp � 0.15 mm fig. 13: time dependence of the geometric parameter of the worn blade for different levels of average particle diameter dp (cv � 5 %) particles of corundum affect each other, with a simultaneous reduction in the interactions between particles and impeller blades. 5 conclusions a two-parameter equation describing the shape of a worn blade during the erosion process of a pitched blade impeller in a solid-liquid suspension of higher hardness was investigated. it follows from the results of the experiments that the erosion wear rate is proportional to the 2.7th power of the impeller speed and that it exhibits a monotonous dependence (increase) with increasing size of the particles. however, the erosion rate of the pitched blade impeller reaches a maximum at a certain concentration, and above this value it decreases as the proportion of solid particles in the agitated batch increases. 
it follows from figs. 12, 13 and 14 that the geometric parameter of the worn blade increases linearly with the duration of the erosion process:

$c = c_t \cdot t$ (15)

in accordance with eqs. (11) and (12), the power relation between the parameter c_t and the average particle diameter is

$c_t = 0.0717 \, d_p^{2.34}$, r = 0.952 (c_v = 2.5 %) (16)

and

$c_t = 0.0614 \, d_p^{1.76}$, r = 0.936 (c_v = 5 %) (17)

eqs. (16) and (17) confirm that the erosion wear rate of a pitched blade impeller depends significantly on the diameter of the solid particles in the suspension. similarly as for the relative weight of the blade (eq. 13), the parameter c_t reaches a maximum value within the interval of the average volumetric particle concentrations investigated here (see table 5). that is, at a concentration higher than c_v = 7.5 % (d_p = 0.15 mm), the particles of corundum affect each other, with a simultaneous reduction in the interactions between the particles and the impeller blades.

fig. 12: time dependence of the geometric parameter of the worn blade for different levels of average particle diameter d_p (c_v = 2.5 %)
fig. 13: time dependence of the geometric parameter of the worn blade for different levels of average particle diameter d_p (c_v = 5 %)
fig. 14: time dependence of the geometric parameter of the worn blade for different levels of average volumetric particle concentration c_v (d_p = 0.15 mm)

5 conclusions

a two-parameter equation describing the shape of a worn blade during the erosion process of a pitched blade impeller in a solid-liquid suspension of higher hardness was investigated. it follows from the results of the experiments that the erosion wear rate is proportional to the 2.7th power of the impeller speed and that it exhibits a monotonous dependence (increase) with increasing size of the particles. however, the erosion rate of the pitched blade impeller reaches a maximum at a certain concentration, and above this value it decreases as the proportion of solid particles in the agitated batch increases.

list of symbols

b – baffle width, m
c – off-bottom impeller clearance, m
c – geometric parameter of the worn blade
c_m – constant in eqs. (5), (10) and (13), h⁻¹
c_t – constant in eq. (15), h⁻¹
c_v – average volumetric concentration of solid particles, m³·m⁻³
d – impeller diameter, m
d_o – hub diameter, m
d_p – average diameter of solid particles, m
h – height of liquid from the bottom of the vessel, m
h̄ – dimensionless transversal coordinate of the profile of the worn blade
h – width of the impeller blade, m
k – wear rate constant
m – weight of the impeller blade, kg
n – impeller speed, s⁻¹
r – regression coefficient
r̄ – dimensionless longitudinal (radial) coordinate along the radius of the impeller blade
r – longitudinal (radial) coordinate along the radius of the impeller blade, m
s – thickness of the impeller blade, m
t – vessel diameter, m
t – time, h
y – transversal coordinate along the width of the blade, m

greek symbols

α – pitch angle of the blade, deg
ρ_l – density of the liquid, kg·m⁻³
μ – dynamic viscosity, pa·s

indices

av – average value
j – summation index
n – related to the impeller speed
d – related to the diameter of the particles
c – related to the concentration of solid particles
t – related to time
o – initial value

references

[1] fořt, i., ambros, f., medek, j.: study of wear of pitched blade impellers, acta polytechnica, vol. 40 (2000), no. 5–6, p. 11–14.
[2] fořt, i., jirout, t., cejp, j., čuprová, d., rieger, f.: study of erosion wear of pitched blade impeller in a solid liquid suspension, inzynieria chemiczna i procesowa, vol. 26 (2005), p. 437–450.
[3] hutchings, i. m.: wear by particulates, chem. eng. sci., vol. 42 (1987), p. 869–878.
[4] suchánek, j.: mechanisms of erosion wear and their meaning for optimum choice of a metal material in practical applications (in czech), proceedings of scientific lectures of czech technical university in prague (editor: k. macek), no. 9 (2006), prague (czech republic), 26 p.
[5] wu, j., nguyen, l., graham, l., zhu, y., kilpatrick, t., davis, j.: minimising impeller slurry wear through multilayer paint modelling, can. j. chem. eng., vol. 83 (2005), p. 835–842.
[6] wu, j., graham, l. j., noui-mehidi, n.: intensification of mixing, j. chem. eng. japan, vol. 40 (2007), no. 11, p. 890–895.

doc. ing. tomáš jirout, ph.d., phone: +420 224 352 681, fax: +420 224 310 292, e-mail: tomas.jirout@fs.cvut.cz
doc. ing. ivan fořt, drsc., phone: +420 224 352 713, fax: +420 224 310 292, e-mail: ivan.fort@fs.cvut.cz
department of process engineering, czech technical university in prague, faculty of mechanical engineering, technická 4, 166 07 prague 6, czech republic
influence of various process parameters on the density of sintered aluminium alloys

mateusz laska¹, jan kazior¹
¹ cracow university of technology, ul. warszawska 24, 31-155 krakow
correspondence to: mgrmlaska@gmail.com

abstract

this paper presents the results of density measurements carried out on alumix sintered parts. ecka alumix aluminium powders were used because of their wide application in the powder metallurgy industry. the compacts were produced using a wide range of compaction pressures for three different chemical compositions, and were then sintered under a pure dry nitrogen atmosphere at three different temperatures. the heating and cooling rates were the same throughout the entire test. the results showed that the green density increases with compaction pressure, but that the sintered density is independent of the green density (compaction pressure) at each sintering temperature.

keywords: alumix, sintering, powder metallurgy, density.

1 introduction

pm aluminium alloys have growing potential in the automotive industry, because the desired properties of structural components can be achieved at relatively low cost. in recent times, considerable efforts have been made to develop al-zn-mg-cu sintered alloys. the alloying elements were introduced as elemental powders, or in the form of master alloys, since prealloyed powders are incompressible and cause some technological problems during sintering. however, developments in aluminium-silicon alloys remain an area of intense interest. in principle, silicon additions are made to aluminium casting alloys in order to increase the fluidity of the molten alloy.
the addition of silicon to aluminium alloys in pm technology offers some advantages in the production cycle over casting, in particular in its ability to produce hypereutectic alloys with relatively fine silicon particles, which can for example provide wear-resistant products. with the use of proper sintering parameters, densities of almost 99 % of the theoretical density can be achieved [1–5]. the powders used in this paper were produced by ecka granules, under the designation ecka alumix; ea231, ea321 and ea431d were used. the purpose of this research work was to study the influence of various process parameters on the densification behavior of various aluminium alloy powders.

2 experimental procedure

the powders used in this study were ea231, ea321 and ea431d, supplied by ecka. the characteristics of the powders are summarized in table 1. the powder mixtures were uniaxially pressed in steel dies at 450, 500, 550 and 600 mpa compaction pressure to obtain cylindrical samples 20 mm in diameter and 5 mm in height. the sintering process was carried out at temperatures of 580, 590 and 600 °c for 30 minutes, under pure dry nitrogen. the heating and cooling rates were set at a constant level of 5 °c/min. the densities of the green compacts were determined from the mass and the dimensions of the compacts, while the densities of the sintered compacts were determined using the archimedes principle. the theoretical density (td) was calculated using the simplified additive function

$TD = \frac{100}{\frac{p_1}{d_1} + \frac{p_2}{d_2} + \cdots + \frac{p_x}{d_x}}$

where td is the theoretical density, p_x is the mass percentage of the respective element, and d_x is the density of the respective ingredient in elementary form. a netzsch 402c dilatometer was used for determining the dimensional changes during sintering, and 15 × 5 × 5 mm prismatic specimens were sintered under the same conditions as the cylindrical specimens.

table 1: chemical compositions and theoretical densities

material | cu (wt. %) | mg (wt. %) | si (wt. %) | zn (wt. %) | wax (wt. %) | al (wt. %) | td (g/cm³)
ea231 | 2.5 | 0.5 | 14 | – | 1.5 | balance | 2.68
ea321 | 0.2 | 1 | 0.5 | – | 1.5 | balance | 2.69
ea431 | 1.5 | 2.5 | – | 5.5 | 1 | balance | 2.79

3 results and discussion

figure 1 presents density measurements as a function of compaction pressure. it is evident that the green density increases with compaction pressure. however, for the same compaction pressure the green density differs, due to the differences in chemical composition, in particular in the amount of silicon and copper in the powders under study here. figure 2 presents the relationship between sintered density and compaction pressure for different sintering temperatures. it was observed that while the green density increases with compaction pressure, the sintered density is independent of the green density (compaction pressure) at each sintering temperature. additionally, a densification factor was defined for all sintered specimens, using the formula

$DF = \frac{SD - GD}{TD - GD}$ (1)

where df is the densification factor, sd is the sintered density, gd is the green density, and td is the theoretical density. a negative densification factor indicates expansion, while a positive value represents shrinkage. the relationship between the densification factor and the compaction pressure is presented in figure 3.
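a python sketch of the two density quantities used in this paper: the simplified additive theoretical density and the densification factor of eq. (1). the (weight percentage, density) pair representation of a composition is an assumption for illustration.

```python
def theoretical_density(components):
    """simplified additive rule from section 2:
    td = 100 / sum(p_i / d_i), with p_i in wt.% and d_i in g/cm^3.
    components: iterable of (p_i, d_i) pairs covering 100 wt.%."""
    return 100.0 / sum(p / d for p, d in components)

def densification_factor(sd, gd, td):
    """eq. (1): negative values indicate expansion (swelling),
    positive values indicate shrinkage."""
    return (sd - gd) / (td - gd)
```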
the fact that both powders containing silicon (ea231 and ea321) have a densification factor significantly lower (samples expanded) than the ea431 powder containing zinc (shrinkage after sintering) leads to the conclusion that the two different groups of powders behave in different ways during sintering. figure 1: influence of compaction pressure on the green densities of compacts it is interesting to note that, in principle, the trend of the densification factor is lower for higher compaction pressures. this does not apply, however, for the sample of ea431 compacted at 600 mpa and sintered at 590 ◦c, and this phenomenon will be a topic for further study. figure 4 presents a typical dilatometric curve for ea431. dilatometry indicates that samples undergo significant dimensional changes during different parts of the sintering cycle. the specimens expand rapidly just before isothermal sintering and then shrink during isothermal sintering and cooling. the peak of expansion could be related to penetration of the liquid phase by the interparticle capillaries, which forces the sample apart. the shrinkage during sintering is a result of densification mechanisms, and during cooling it is a result of thermal contraction. figure 2: sintered density to theoretical density ratio as a function of compaction pressure for: (a) ea231, (b) ea321, (c) ea431 the negative values of the densification factor curves of the ea231 and ea321 powders therefore correspond with the observed changes in the geometry of the samples and the swelling of the sintered parts. general observations of the surface showed that the faces of samples ea231 and ea321 were rough, unlike the smooth surfaces of the ea431 specimens. further investigations are necessary for a better understanding of all the dimensional changes and the whole densification mechanism as a function of the chemical composition of aluminium alloys. 94 acta polytechnica vol. 52 no. 4/2012 figure 3: densification factor as a function of compaction pressure for: (a) ea231, (b) ea321, (c) ea431 4 conclusion an evaluation of the influence of compaction pressure and sintering temperature on the density of different sintered aluminium alloys shows that these parameters do not affect the densification process of the alloy powders under study here. the ea231 and ea321 powders have significantly lower densification rates than ea431. in all cases, the ea231 and ea321 parts had lower density than their green compacts. figure 4: dilatometric curve as a function of process time and temperature for ea431, compaction pressure 600 mpa this can be partially explained by the swelling of the samples. the ea431 powder showed not only a positive densification factor, but also the smoothest surface of all the investigated specimens. acknowledgement this research was supported by project european funds portal — innovative economy poig no. 01.01.02-00-015/09-00 references [1] martin, j. m., castro, f.: liquid phase sintering of p/m aluminium alloys: effect of processing conditions. in journal of materials processing technology, 2003, vol. 143–144, p. 814–821. issn 0924-0136. [2] mondolfo, l.f.: aluminium alloys: structure and properties. london : butterworths, 1976. [3] asm specialty handbook, aluminium and aluminium alloys, materials park, oh, usa, 1993. [4] greasley, a., shi, h. y.: powd. metall. 36 (1993) 288. [5] jatkar, a. d., sawtell, r. r.: proceedings of the international conference on p/m aerospace materials, paper 15, lausanne, 1991. 
prerequisites for increasing the axle load on railway tracks in the czech republic

m. lidmila, l. horníček, h. krejčiříková, p. tyc

abstract: this paper deals with problems of increasing the axle load on czech railways (čd) tracks to 250 kn or 300 kn, respectively. the results of a numerical analysis of the effects of increased axle loads on the track bed structure were verified by experimental measurements carried out on track bed construction models in an experimental box on a 1:1 scale. the results of the research are applicable for routine use on čd.

keywords: axle load, permanent way, track bed, model measurement, substructure.

1 introduction

in 2005 and 2006, as part of a grant-funded project of the ministry of transport of the czech republic, the department of railway structures of the faculty of civil engineering, ctu in prague, was engaged in research on project no. 1f52h/052/130 "methodology of transitional parameters for the construction of the substructure on tracks of the conventional trans-european system". the objective of this research project was to elaborate structural layout principles for the permanent way and the substructure that would enable an increase in axle loads to 250 kn and 300 kn. the passage of railway carriages on the rail tracks in the czech republic depends basically on the load-bearing capacity of the bridge structures; however, problems of the load-bearing capacity of bridge structures exposed to an increased axle load did not form part of the specifications for the research to be carried out in the framework of this ministry of transport project. in order to fulfil the research objective, a detailed numerical analysis of the effect of an increased axle load on the permanent way and track bed structure was performed, complemented with experimental measurements on four track bed models. the effect of an increased axle load was determined by measuring the deflection of the sleeper and of the track bed structure.

2 materials and methods

2.1 numerical analysis of the effects of an increased axle load on track bed structures

the first step was to assess the rail load-bearing capacity using the methodology specified in instruction čd s3 "permanent way". this proved that with an increase of:

1. axle load to 250 kn and 300 kn, the requirements are met by rail shapes of s46, uic 60 and r65 type in spliced rail, even in the case of poor track bed quality (loading capacity coefficient c = 20–50 mpa·m⁻¹);

2. axle load to 250 kn, the requirements are met by uic rails in long-welded rail, even in the case of poor track bed quality (loading capacity coefficient c = 20–50 mpa·m⁻¹); for loading by 300 kn, the requirements are met only by track beds with loading capacity coefficients of 100 mpa·m⁻¹ and more;

3. axle load to 250 kn, s49 rails may be used only for vertical rail head wear of up to 20 mm in track beds with loading capacity coefficients of 100 mpa·m⁻¹ and more; for loading by 300 kn, the use of the s49 rail shape in long-welded rail is not advisable.

fig. 1: track bed deflection pattern (deflection in mm versus depth below the base of the sleeper in m) due to permanent way loading with axle loads of 22.5 t – 25.0 t – 30.0 t (gravel, e_def = 100 mpa)
the use of concrete sleepers in the permanent way construction depends on the design of the sleeper structure. the b 91 s sleepers in use on čd corridor tracks are dimensioned for loading by an axle force of 250 kn; for loading by 300 kn, an assessment of the sleeper reinforcement would be necessary. the effect of an increased axle load on the track bed structure was determined by means of a numerical analysis, calculating the deflections and stresses using the finite-element method. in calculations of deflections of longitudinal structures, simpler two-dimensional models are commonly used; this simplification is valid on the assumption that the load is constant along the longitudinal axis of the structure, or at least acts over a sufficient length. however, this assumption is not fulfilled here, because the load changes over 7 consecutive sleepers with a spacing of 600 mm between their axes. for this reason, three-dimensional models of a railway track segment were created in the plaxis tunnel 3d software. in the next phase of the work it turned out to be necessary to carry out a parametric study, because the calculations with three-dimensional models were excessively time-consuming; these computations were made with software developed by a.s. stavební geologie – geotechnika praha. fig. 1 shows the track bed deflection pattern for permanent way loading with axle loads of 22.5 t – 25.0 t – 30.0 t, and fig. 2 displays the corresponding stress pattern in the track bed. fig. 1 implies that when the weight acting on the axle is increased from 22.5 t to 25.0 t, the substructure subgrade, i.e. the level at a depth of 35 cm below the sleeper loading area, is exposed to a 16 % increase in deflection; when the weight is increased from 22.5 t to 30.0 t, the substructure subgrade shows a 52 % increase in deflection.

2.2 model measurements in an experimental box

the effect of an increased axle load on the deflection of the track bed was determined on four models of track bed structures on a 1:1 scale, placed in an experimental box with inside dimensions of 2100 × 990 × 800 mm (fig. 3).

fig. 2: stress pattern in the track bed (stress in kpa versus depth in m) due to permanent way loading with axle loads of 22.5 t – 25.0 t – 30.0 t (gravel, e_def = 100 mpa)

table 1: specification of track bed construction models and their characteristics

model specification  granulated gravel layer thickness (mm)  ballast bed thickness (mm)
15sd e10             150                                      350
30sd e10             300                                      350
15sd e25             150                                      350
30sd e25             300                                      350

the main characteristics of the track bed model structures are specified in table 1. the model designations encode the granulated gravel thickness in cm and the value of the modulus of deformation of the subgrade, which was simulated by rubber plates (e.g. model 15sd e10 denotes a model with a granulated gravel layer 15 cm in thickness and rubber plates with a load-bearing capacity e_r = 10.9 mpa). the individual models were selected so as to characterize different load-bearing capacities (moduli of deformation) of the substructure subgrade e_pl (see table 2).
the track bed model structures of type 2 were composed of one half of a sleeper mounted on a ballast bed 350 mm in thickness. the ballast bed was laid on a layer of granulated gravel 150 mm or 300 mm in thickness, and this structural layer of granulated gravel was placed on rubber plates laid at the bottom of the experimental box. the rubber plates simulated a subgrade with a load-bearing capacity of 10.9 mpa or 25.0 mpa (see table 2). the structural layer of granulated gravel, graded 0/32 mm, was compacted with a special manual vibratory compacting unit with a compacting area of 174 × 174 mm. the granulated gravel was compacted in layers 150 mm in thickness, and the time for compacting one layer was set to 30 minutes to achieve thorough compaction over the whole surface of the layer. the compaction of the granulated gravel was checked by measuring the modulus of deformation; the values of the moduli of deformation are given in table 2.

fig. 3: basic dimensions of the experimental box: a) side view, b) cross section, c) ground plan

table 2: values of the moduli of deformation on the structural layers of the model constructions

model specification  e_r (mpa)  e_pl (mpa)  e_pp (mpa)
15sd e10             10.9       26.1        63.8
30sd e10             10.9       53.3        79.4
15sd e25             25.0       50.1        90.0
30sd e25             25.0       82.9        103.9

on top of the structural layer of granulated gravel, a ballast bed of gravel graded 32/63 mm with a thickness of 350 mm was laid. the ballast bed structure was divided into two layers, each approx. 175 mm in thickness; each layer of gravel was again compacted for 30 minutes, uniformly over the whole surface of the experimental box. the total ballast bed thickness under the sleeper loading area after compaction reached the value of 350 mm specified for concrete sleepers. the measured values of the moduli of deformation e_pp are given in table 2. the ballast bed surface served for mounting one half of an instrumented concrete sleeper b 91 s/1 with a piece of rail of uic 60 type. the arrangement of model 15sd e10 in the experimental box, showing the position of the rigid circular plate during the plate load tests, is displayed in fig. 4. in order to measure the deflection, the track bed was fitted with bar displacement indicators with a loading area of 100 × 100 mm and a protected measurement axis. the deflection was measured at depths of 175 mm and 350 mm under the sleeper loading area; at each depth, four displacement indicators were placed at identical positions. the deflection values were read by mechanical and digital indicators with a precision of 0.01 mm. the location of the displacement indicators is shown in fig. 5. the measurements further included the sleeper deflection due to the rail load. for an axle load of 22.5 t the maximum force acting on the rail was calculated as p = 50 kn, for an axle load of 25.0 t as p = 60 kn, and for an axle load of 30.0 t as p = 75 kn. the deflection measurements in the experimental box are illustrated in fig. 6. the sleeper deflection and the deflection values at the individual track bed depths were measured following 30× repeated loading with the force p, acting in three measurement cycles composed of loading and unloading phases. an example of the deflection results measured on model 30sd e25 is given in table 3.
in assessing the rail head loading process, it must be considered that by increasing the rail head load from p = 50 kn to 60 kn or 75 kn, respectively, the load amounts to 120 % of the reference value at p = 60 kn and to 150 % at p = 75 kn.

3 results

the deflection measurements on model 30sd e25, displayed in table 3, provide the following results:

1. the deflection measurement method applying simple displacement indicators was suitable for determining the deflection values at various track bed depths, and the displacement indicators were sensitive enough to register the deflection changes due to the changing rail head load.

2. a comparison of the deflection values for p = 50 kn at a depth of 175 mm with those for p = 60 kn or 75 kn, respectively, shows that the increase in the deflection values is 118 % or 143 %, respectively, which roughly corresponds to the increased load acting on the rail head.

3. a comparison of the deflection values for p = 50 kn at a depth of 350 mm with those for p = 60 kn or 75 kn, respectively, shows that the increase in the deflection values is 117 % or 150 %, respectively, which again roughly corresponds to the increased load acting on the rail head.

4. the deflection measurement results obtained on the models are in close accordance with the results of the numerical analysis. similar results were obtained for the other track bed structure models.

fig. 4: model 15sd e10 in the experimental box: a) longitudinal section, b) cross section
fig. 5: arrangement and position of the individual displacement indicators: a) ground plan, b) cross section
fig. 6: deflection measurement due to rail head load

table 3: results of deflection measurements on track bed model 30sd e25

rail head loading (kn)  mean deflection under sleeper loading area (mm): at 175 mm / at 350 mm  deflection increase (%): at 175 mm / at 350 mm
50                      0.28 / 0.12                                                              100 / 100
60                      0.33 / 0.14                                                              118 / 117
75                      0.40 / 0.18                                                              143 / 150
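the percentage columns of table 3 are plain ratios against the p = 50 kn reference, so they are easy to re-derive; the following python sketch reproduces them from the measured mean deflections (only the numbers already printed in table 3 are used).

```python
# reproduce the "deflection increase" columns of table 3: each value is the
# mean deflection normalised by the one measured at the 50 kn reference load.
deflections = {          # load (kn): (deflection at 175 mm, at 350 mm), in mm
    50: (0.28, 0.12),
    60: (0.33, 0.14),
    75: (0.40, 0.18),
}
ref175, ref350 = deflections[50]
for load, (d175, d350) in deflections.items():
    print(f"{load} kn: load {100 * load / 50:.0f} %, "
          f"deflection {100 * d175 / ref175:.0f} % (175 mm), "
          f"{100 * d350 / ref350:.0f} % (350 mm)")
# -> 118/117 % at 60 kn and 143/150 % at 75 kn, roughly tracking the load
```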
3.1 passage of carriages with an axle load of 22.5 t on railway tracks in the czech republic

the passage of carriages with axle loads of 22.5 t (the present-day maximum axle load) on railway tracks in the czech republic is regulated by instruction čd s66 "basic instruction for spatial passage and transition of carriages on railway tracks in the czech republic". the permitted axle load of 22.5 t corresponds to track load classes d3 and d4; the allowable load per linear meter is 7.2 t·m⁻¹ for track load class d3 and 8.0 t·m⁻¹ for track load class d4. instruction čd s66 enumerates the individual railway track sections, specifying the track load class permitted. the technical condition of the railway tracks in the czech republic is specified in the "outline of property management in sždc s.o. for 2004", ref. no. 559/05/gř-013, may 2005. out of the total track length of 9505.140 km, an axle load of 22.5 t is permitted only on 3454.785 km, i.e. on 36.34 %. on all-national corridor tracks with a total length of 1460.068 km, an axle load of 22.5 t (track load class d4) is permitted on 1359.612 km, i.e. on 93.11 %; the axle load permitted on the remaining corridor tracks is 20.0 t. on other all-national tracks with a total length of 4872.816 km, the axle load of 22.5 t is permitted only on 1836.812 km, i.e. 37.69 %; on the prevailing length of these tracks, a permitted axle load of 20.0 t applies, namely to 2843.108 km, i.e. 58.43 %. on regional tracks with a total length of 3170.640 km, an axle load of 22.5 t is permitted on only 8.14 % of the length, while the permitted axle load of 20.0 t applies to 2051.487 km, i.e. to 64.70 %; only the remaining regional tracks have permitted axle loads of 16.0 t to 18.0 t. an analysis of the passage of carriages on the railway tracks in the czech republic shows that the corridor tracks, which have recently been modernized or optimized (with the permanent way and substructure reconstructed to achieve the so-called optimum condition), are suited for assessing a potential increase in axle load. other railway tracks are not suitable in this respect, as upgrading them would be too expensive.

3.2 assessment of a potential increase in axle load to 25.0 t on corridor tracks

the existing corridor tracks in the railway network of the czech republic are in considerably better condition than the other all-national and regional tracks, as they have gone through gradual modernization or optimization. their technical condition shows the following characteristics:

1. the permanent way on corridor tracks includes: rail shapes of uic 60 or r 65 type; resilient fixation without baseplates; concrete sleepers for resilient fixation without baseplates (the b 91 s type sleeper is dimensioned for axle loads of 25.0 t and a speed of 160 km·h⁻¹); a ballast bed 350 mm in thickness under the sleeper loading area; and a sleeper spacing of 600 mm. the permanent way on corridor tracks thus complies with regulation uic-kodex 724 e "gleisbewehrung für 25 tonnen (250 kn) auf schotteroberbau" of may 2005.

2. the substructure is composed of the stable track body of the former all-national railways, whose track bed was reconstructed during modernization or optimization to comply with the minimum required load-bearing capacity of the substructure subgrade, e_pl = 50 mpa. when considering a potential increase in axle loads from 22.5 t to 25.0 t, the load increase acting on the track bed amounts to 11.1 %; for this reason, it would also be appropriate to consider an increase in the minimum load-bearing capacity value to e_pl = 50 × 1.111 = 55.5 mpa. we may presume that, owing to the modernization or optimization during which the track bed load-bearing capacity was increased by applying non-cemented layers or layers of stabilized soil, an increased substructure subgrade loading may be allowed.

3. on corridor tracks with modernized permanent way and substructure constructions, railway operation with the axle load increased to 25.0 t may be allowed.

3.3 assessment of a potential increase in axle load to 30.0 t on corridor tracks

an analysis of the technical condition of corridor tracks provides the following conclusions for a potential increase in axle loads to 30.0 t on corridor tracks:

1. the design parameters of the permanent way construction for an axle load of 30.0 t have not yet been defined. the permanent way structure would have to be improved by designing a new concrete sleeper for loading by 30.0 t and potentially by increasing the ballast bed thickness under the sleeper loading area.

2. the substructure would have to be designed for a track bed load increased by 50 % (i.e. min. e_pl = 50 × 1.5 = 75 mpa). in order to reveal the current condition of the substructure, a detailed geotechnical survey of the track bed would be necessary, together with a reconstruction design.

3. an increase in axle loads to 30.0 t on corridor tracks is therefore presently not feasible.
4 discussion and summary

the numerical analysis of the effect of axle loads increased to 25.0 t or 30.0 t, together with the experimental measurements on track bed models, leads to the following conclusions:

1. an axle load increase to 25.0 t is possible only on corridor tracks with a current axle load of 22.5 t; the load-bearing capacity of the substructure subgrade must be increased by at least 11 %.

2. an axle load increase to 30.0 t on railway tracks in the czech republic is presently not feasible.

5 acknowledgment

this paper was written within research project no. 1f52h/052/130, funded by the ministry of transport of the czech republic.

references

[1] krejčiříková, h., tyc, p., lidmila, m., horníček, l., voříšek, p., et al.: methodology of transitional parameters for the construction of substructure on tracks of the conventional trans-european system. research reports of project 1f52h/052/130, faculty of civil engineering, ctu in prague, department of railway structures, 2005 and 2006.
[2] instruction čd s4: substructure, 1997 (in force since 1 july 1998).
[3] instruction čd s3: permanent way.
[4] instruction čd s66: basic instruction for spatial passage and transition of carriages on railway tracks in the czech republic.
[5] outline of property management in sždc s.o. for 2004, ref. no. 559/05/gř-013, may 2005.
[6] uic-kodex 724 e: gleisbewehrung für 25 tonnen (250 kn) auf schotteroberbau, 1. ausgabe, mai 2005.

ing. martin lidmila, ph.d., e-mail: lidmila@fsv.cvut.cz
ing. leoš horníček, ph.d., e-mail: hornicek@fsv.cvut.cz
doc. ing. hana krejčiříková, csc., phone: +420 224 354 756, e-mail: krejcirikova@fsv.cvut.cz
prof. ing. petr tyc, drsc., e-mail: petr.tyc@fsv.cvut.cz
department of railway structures, czech technical university in prague, faculty of civil engineering, thákurova 7, 166 29 praha 6, czech republic

self heating of an atomic force microscope

o. kučera

abstract: atomic force microscopy (afm) is a sensitive technique susceptible to unwanted influences, such as thermal noise, vibrational noise, etc. although tools that protect the afm against external noise have been developed and are widely used, there are still many sources of inherent noise. one of them is self-heating of the apparatus. this paper deals with the self-heating of an afm that uses an optical lever; this phenomenon is shown to be substantial, in particular after activation of the microscope. its influence on the intrinsic contact noise of the afm is also examined.

keywords: atomic force microscopy, noise.

1 introduction

atomic force microscopy (afm) [1, 2] senses the interaction forces between a sharp probing tip (with a radius of curvature from a few nanometres to tens of nanometres) and the surface of the sample. the principle of afm is shown in fig. 1. the probing tip is attached to the free end of a cantilever-type spring (about tens or hundreds of micrometres in length). the forces between the probing tip and the sample (of an order of magnitude of 10⁻¹³ to 10⁻⁴ n) deflect the cantilever, which is monitored while the sample is being moved under the tip. the topography of the sample is subsequently reconstructed on the basis of the recorded deflection signal; another reconstruction method uses the signal originating in the feedback loop, which is used to keep the deflection of the cantilever constant. afm is also widely used as a non-imaging technique [3] enabling measurements of the local mechanical properties of a sample, such as young's modulus, etc. afm supports 3 modes of operation, which can be derived from the force-distance curve of the sample-tip couple (see fig. 2). the contact regime (sometimes referred to as c-afm), which is the regime used in this paper, operates in the range of the repulsive forces. in the false engage mode, which is neither an imaging regime nor a force measurement regime, the cantilever is kept unloaded in the air.
most afm systems use an optical lever (sometimes referred to as laser beam deflection) to detect the deflection of the cantilever [4, 5] (see fig. 1). the laser beam from a laser diode is reflected from the free end of the cantilever onto the position sensitive photodetector (pspd). the pspd consists mostly of four photodiodes forming a quad (see the encircled detail in fig. 1). the differences of the corresponding voltages (induced by the reflected laser beam) on the diodes represent signals used to calculate the deflection and the torsion of the cantilever; these signals are termed the vertical (deflection) and horizontal (torsion) voltages.

fig. 1: the principle of afm (laser diode, cantilever with tip, sample on the piezo scanner, position sensitive photodetector and feedback loop)
fig. 2: force–distance curve f(z) (arbitrary units) with the typical shape of the lennard–jones two-body potential

since an afm is capable of imaging the sample topography on an atomic scale, or of sensing forces as weak as 10⁻¹³ n, the issue of noise influencing the resolution and the sensitivity of an afm has become vitally important [6]. on the one hand, external noise, such as acoustic and electromagnetic noise, fluctuations of temperature, vibrations of the building in which the microscope is housed, etc., affects the apparatus. on the other hand, the afm itself generates endogenous noise from several sources; this includes the effects of the electronics, self-heating and thermal expansion of the afm components, inherent thermally induced oscillations of the cantilever, etc., which influence the resolution and the sensitivity as well. this study demonstrates how the pspd signal evolves in time after activation of the system, and how the intrinsic contact noise of the afm is affected.

2 materials and methods

a veeco multimode iv scanning probe microscope with a veeco np20 d cantilever was used in the false-engage and contact regimes. the microscope was housed in a special temperature-stabilised room (24 °c) on the basement floor and placed on a vibration-isolated table. temperature stabilisation was provided by a standard air-conditioning unit (ac); the ac had been running for more than one day at the beginning of the experiment, so the room was assumed to be temperature-stabilised. the pspd vertical and horizontal voltages were recorded in the false-engage mode for 200 minutes after activation of the afm. to reveal the effect of the air-conditioning on these voltages, the ac was turned off from time t = 120 min till t = 180 min. the intrinsic contact noise was measured as follows [6]: the probing tip was brought into contact with the sample (a metal washer) and the primary feedback loop was turned off.
the intrinsic contact noise signal was observed by recording the deflection of the cantilever (the pspd vertical voltage) with the scan area set to 0 nm; this means that the tip was not moving laterally across the surface of the sample. the observed signal was sampled with a sampling frequency of 60.621 khz for 8.6486 s.

3 results and discussion

as shown in fig. 3, both the horizontal and the vertical voltages begin to decrease directly after activation of the afm at time t = 0. the exponential decrease of the voltage vanished at approx. t = 30 min. until the air-conditioning was turned off at time t = 120 min, both voltages remained in a stationary state and oscillated slightly; the reason for these oscillations is not clear, but it probably stems from imperceptible temperature changes caused by the air-conditioning cycle. at t = 120 min, as mentioned above, the ac was turned off, and both voltages started to decrease at the same time. the decrease was reversed by restarting the ac at t = 185 min. this proves the effect of temperature changes on the vertical and horizontal voltages of the pspd. since the temperature of the environment was not changing significantly at the beginning of the experiment (up to t = 120 min), the exponential downward phase of the pspd voltage in this period must be due to self-heating of the afm. two explanations for this phenomenon are suggested: firstly, the laser beam could have heated the pspd; secondly, the laser beam could have heated the cantilever, which thus changed its geometry and deformability.

fig. 3: influence of self-heating of the afm on the voltage of the deflection detector versus time (vertical and horizontal pspd voltages; the intervals with the air-conditioning on and off are marked)

the intrinsic contact noise (icn) that was observed includes three types of components. firstly, pure contact noise with a small amplitude of the order of one resolution quantum of the a/d converter of the afm; this noise contains the electrical noise of the pspd, the electromechanical noise of the piezoscanner, and the thermally induced mechanical noise of the sample-tip-cantilever system. secondly, external noise is also included, expressed as fluctuations of the general drift, which is the third component. as noted above, the correlation between the pspd voltage and the ac status proved that there is a thermal cause of the pspd voltage fluctuations. the same temperature changes are the most probable source of the general drift of the icn; however, the mechanism could be somewhat different, because the effect of using the piezoscanner must be taken into account. note that the difference of the mean voltages in the two cases (fig. 3 and fig. 4, respectively) is caused by the different afm regime used and by the bias voltage applied in c-afm. we may thus conclude that the pure contact noise is superimposed on a drift of the pspd signal, which seems likely to be caused by temperature changes. this feature of icn measurements, and of afm measurements generally, is not widely known, and it may lead to an incorrect determination of noise characteristics such as the rms value of the noise. it can also have a negative influence on the vertical resolution of the afm, because the pspd voltage is used to calculate the topography of the sample; in particular, precise force measurements can be totally devalued due to self-heating of the afm.

fig. 4: an example of pure contact noise superimposed on the drift of the pspd voltage (u(t) in v versus t in s). sudden fluctuations are caused by external noise such as acoustic noise, etc.
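a practical consequence of the last paragraph is that a meaningful rms of the intrinsic contact noise should only be computed after the slow drift has been removed. the sketch below illustrates one conventional way of doing this (linear detrending) on a synthetic record with the sampling parameters quoted in section 2; the drift slope and noise level are invented for illustration and are not the measured values.

```python
# rms of a drifting noise record, before and after removing the drift.
# the signal is synthetic: white noise plus a slow linear drift, sampled
# like the icn record in the paper (60.621 khz for 8.6486 s).
import numpy as np
from scipy.signal import detrend

fs = 60_621.0                                  # sampling frequency (hz)
t = np.arange(int(fs * 8.6486)) / fs
rng = np.random.default_rng(0)
drift = 0.01 * t                               # invented thermal drift (v)
noise = 1e-3 * rng.standard_normal(t.size)     # invented contact noise (v)
u = 0.65 + drift + noise                       # pspd vertical voltage (v)

raw_rms = np.std(u)                            # dominated by the drift
true_rms = np.std(detrend(u))                  # close to the real noise level
print(f"rms with drift: {raw_rms * 1e3:.2f} mv, detrended: {true_rms * 1e3:.2f} mv")
```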
4 conclusion

a major problem of afm measurements (especially on an atomic scale) is the influence of noise of any kind. this study has shown that the noise characteristics of an afm, such as the rms value of the icn, are strongly influenced by self-heating when an optical lever is used.

references

[1] binnig, g., quate, c., gerber, ch.: atomic force microscope. physical review letters, vol. 56 (1986), p. 930–933.
[2] meyer, e., heinzelmann, h.: scanning force microscopy. in: scanning tunneling microscopy ii (editors: weisendanger, r., güntherodt, h.-j.). berlin – heidelberg – new york: springer, 1992, 1995 (second edition).
[3] butt, h., cappella, b., kappl, m.: force measurements with the atomic force microscope: technique, interpretation and applications. surface science reports, vol. 59 (2005), no. 1–6, p. 1–152.
[4] meyer, g., amer, n.: novel optical approach to atomic force microscopy. applied physics letters, vol. 53 (1988), p. 1045–1047.
[5] alexander, s. et al.: an atomic-resolution atomic-force microscope implemented using an optical lever. journal of applied physics, vol. 65 (1989), p. 164–167.
[6] han, w., lindsay, s.: intrinsic contact noise: a figure of merit for identifying high resolution afms. application note, agilent technologies, 2007.
[7] wilkening, g., koenders, l.: nanoscale calibration standards and methods. weinheim: wiley-vch verlag gmbh & co. kgaa, 2005.
[8] kučera, o.: oscillation states on yeast cell membranes. diploma thesis, praha: czech technical university in prague, 2008.

ing. ondřej kučera, phone: +420 224 352 820, e-mail: kucerao@ufe.cz
department of circuit theory, czech technical university in prague, faculty of electrical engineering, technická 2, 166 27 prague 6, czech republic
institute of photonics and electronics, academy of sciences of the czech republic, chaberská 57, 182 51 prague 8, czech republic

the horizons of observability in pt-symmetric four-site quantum lattices

m. znojil

abstract: one of the key merits of pt-symmetric (i.e., parity times time reversal symmetric) quantum hamiltonians h lies in the existence of a horizon of the stability of the system. mathematically speaking, this horizon is formed by the boundary of the domain d(h) ⊂ r^d of the (real) coupling strengths for which the spectrum of energies is real and non-degenerate, i.e., in principle, observable. it is shown here that even in the elementary circular four-site quantum lattices with d = 2 or d = 3 the domain of hidden hermiticity d(h) proves multiply connected, i.e., topologically nontrivial.

keywords: hidden hermiticity, spectra and exceptional points, horizons, discrete schrödinger operators, quantum graphs, loops, four-site lattices, connectedness, strong-coupling anomalies.

1 introduction

one of the most interesting formulations of the standard and robust dictum of quantum mechanics emerged in connection with the acceptance of the so-called pt-symmetric operators of observables, where p means parity while t represents time reversal (cf. review papers [1–3] for an exhaustive discussion).
one of the main reasons for the recent rebirth of interest in this paradigm may be seen, paradoxically, in its impact on classical experimental optics [4]. the latter experimental activities (i.e., basically, the emergence of a few successful classical-physics simulations of quantum effects) re-attracted attention to the innovative theory. we may mention, pars pro toto, paper [5], which offered an exhaustive constructive classification of all of the pt-symmetric quantum hamiltonians h defined in finite-dimensional hilbert spaces of dimensions n = 2 and n = 3.

fig. 1: graphical symbol for the straight-line open-end four-site lattice. the numbers in the small circles (= sites) are the unperturbed energies, while the letters b and c near the nearest-neighbor-interaction lines represent the (real) couplings

inside the most elementary n ≤ 3 family of models, no real surprises and no spectral irregularities have been encountered. in contrast, in ref. [6] we found that certain anomalies certainly emerge at n = 8. in the present brief continuation of these developments we intend to show that the simplest models exhibiting similar irregularities in their spectra already occur, unexpectedly, at the next hilbert-space dimension, n = 4.

2 four-site quantum-lattice models

2.1 the exactly solvable straight-line case

in refs. [7] the successful tractability of more-than-three-dimensional hamiltonian matrices resulted from a drastic simplification of their structure: we merely admitted their tridiagonal versions. in the language of physics this corresponds to a picture in which the system lives on an n-site straight-line lattice endowed with mere nearest-neighbor interactions. at n = 4 this is schematically depicted in figure 1; the small circles represent the sites, while their connecting lines symbolize the interactions. the left-right symmetric straight-line lattice of figure 1 (i.e., that of refs. [7]) is assigned the hamiltonian given in the form of the two-parametric real matrix

$$ H = H^{(4)}(b,c) = \begin{bmatrix} -3 & b & 0 & 0 \\ -b & -1 & c & 0 \\ 0 & -c & 1 & b \\ 0 & 0 & -b & 3 \end{bmatrix}. \qquad (1) $$

a quantitative analysis of these models is more or less trivial even at larger n > 4; the curious reader may find many details, say, in review paper [8].

2.2 pt-symmetric circular lattices and their simplest four-site example

once we replace figure 1 by its circular version in figure 2, we may immediately interpret the new diagram as representing a new n = 4 quantum model, given by the following three-parametric four-by-four matrix form of the hamiltonian studied in ref. [9],

$$ H = H^{(4)}(a,b,c) = \begin{bmatrix} -3 & b & 0 & -a \\ -b & -1 & c & 0 \\ 0 & -c & 1 & b \\ a & 0 & -b & 3 \end{bmatrix}. \qquad (2) $$

in the new model, with one more coupling connecting the "upper two" sites, the method for constructing the boundary ∂d(h) does not change. at any number n of sites along the (circular) lattice, the reality property of the spectrum of energies remains tractable by standard mathematical techniques; a few representative samples may be found in ref. [10]. interested readers may search for a broader mathematical context in refs. [11, 12].

fig. 2: the circular four-site lattice
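before turning to the closed-form conditions below, the spectrum of hamiltonian (2) can be probed directly: a short numpy sketch builds h(4)(a, b, c), diagonalises it, and reports whether all four eigenvalues are real and non-degenerate. the sample parameter values and the tolerance are arbitrary choices of this illustration.

```python
# numerical probe of the reality of the spectrum of the circular-lattice
# hamiltonian (2); the matrix is real but non-symmetric, so complex
# eigenvalue pairs can occur outside the domain d(h).
import numpy as np

def h4(a, b, c):
    return np.array([[-3.0,  b,   0.0, -a ],
                     [-b,  -1.0,  c,   0.0],
                     [ 0.0, -c,   1.0,  b ],
                     [ a,   0.0, -b,   3.0]])

def real_and_simple(a, b, c, tol=1e-9):
    ev = np.linalg.eigvals(h4(a, b, c))
    if np.any(np.abs(ev.imag) > tol):
        return False, ev                      # complex pairs: horizon crossed
    e = np.sort(ev.real)
    return bool(np.min(np.diff(e)) > tol), e  # also check non-degeneracy

print(real_and_simple(0.0, 0.5, 0.5))   # weak couplings: four real levels
print(real_and_simple(0.0, 2.5, 2.5))   # strong couplings: complex spectrum
```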
once we restrict our attention just to our special toy model h(4)(a, b, c), it proves sufficient to recall the entirely elementary considerations of ref. [10]. this leads to the conclusion that the spectrum of energies is real and nondegenerate if and only if the triplet of parameters (a, b, c) lies inside the domain

$$ D(H) := \{\, (a,b,c) \in \mathbb{R}^3 \mid W(a,b,c) > 0,\ Q(a,b,c) > 0,\ P(a,b,c) > 0 \,\}, \qquad (3) $$

where

$$ W(a,b,c) = \left(8 + c^2 - a^2\right)^2 - 4\left[\,16 - (a+c)^2\,\right] b^2, \qquad (4) $$

$$ Q(a,b,c) = \left[\,(a+3)(c-1) - b^2\,\right]\left[\,(a-3)(c+1) - b^2\,\right] \qquad (5) $$

and

$$ P(a,b,c) = 10 - a^2 - 2b^2 - c^2. \qquad (6) $$

in other words, for couplings moving towards the two-dimensional surfaces of d(h) from inside, the quadruplets of real bound-state energies behave in an easily understandable manner. the reason is that we may rewrite the secular equation in the form s(s, a, b, c) = 0, where the energies e± = ±√s emerge in pairs and where

$$ s(s,a,b,c) := s^2 + \left(-10 + a^2 + 2b^2 + c^2\right) s + 9 + 6b^2 - 9c^2 + b^4 - 2ab^2c - a^2 + a^2c^2. $$

this recipe generates the two auxiliary roots

$$ 4\,s^{(\pm)} = 20 - 2a^2 - 4b^2 - 2c^2 \pm 2\sqrt{W(a,b,c)}, $$

where we already know the function of eq. (4) in its expanded form,

$$ W(a,b,c) = 64 + 16c^2 - 64b^2 - 16a^2 + c^4 + 4b^2c^2 - 2a^2c^2 + 4a^2b^2 + a^4 + 8ab^2c. $$

in the spirit of the general results of ref. [9] we may summarize that:

1. whenever w(a,b,c) → 0⁺, the two pairs of energies approach the two distinct values e± = ±√((10 − a² − 2b² − c²)/2), representing the two limiting doubly-degenerate energies;

2. whenever q(a,b,c) → 0⁺, just two of the energies move to zero, while the other two energies do not vanish in general, e₀,₃ = ±√(10 − a² − 2b² − c²);

3. for p(a,b,c) → 0⁺ we must expect that all four real energies vanish simultaneously.

3 two-parametric simplified versions of the circular four-site lattice

3.1 the case of a = 0

naturally, in the no-upper-interaction limit lim_{a→0} h(4)(a, b, c) = h(4)(b, c) we return to the elementary straight-line model of figure 1. for our present purposes it is then sufficient to recall that the main features of such a simplified model were described in refs. [7]. in particular, we know that at n = 4 the spectrum of energies remains real and nondegenerate inside the innermost star-shaped domain d(h) ⊂ r², shown in figure 3 as lying inside an auxiliary circumscribed ellipse. the boundary ∂d(h) (i.e., the physical horizon of the system in question) is composed of four hyperbola-shaped curves. the key features of this example (like the triple intersections of the boundaries, etc.) generalize, mutatis mutandis, to the family of similar models at all dimensions n < ∞ [8].

fig. 3: the graphical determination of the innermost star-shaped domain d(h) assigned to the quantum lattice of figure 1 in refs. [7] (b–c plane)

we are now prepared to replace the elementary and transparent graphical determination of the star-shaped domain d(h) assigned to the straight-line quantum lattice, displayed in figure 3, by its much more complicated a ≠ 0 analogue. at a freely variable a, the knowledge of the a = 0 section may serve us, and will still serve us, as a very useful independent test of our forthcoming observations and conclusions.
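the three inequalities of eqs. (3)–(6) translate directly into a membership test, which can be cross-checked against the eigenvalue probe sketched earlier; the two tests should agree up to numerical tolerance. the sample points are again illustrative choices.

```python
# membership test for the domain d(h) of eq. (3), built from the
# polynomials w, q, p of eqs. (4)-(6).
def W(a, b, c): return (8 + c**2 - a**2)**2 - 4 * (16 - (a + c)**2) * b**2
def Q(a, b, c): return ((a + 3)*(c - 1) - b**2) * ((a - 3)*(c + 1) - b**2)
def P(a, b, c): return 10 - a**2 - 2*b**2 - c**2

def in_domain(a, b, c):
    return W(a, b, c) > 0 and Q(a, b, c) > 0 and P(a, b, c) > 0

print(in_domain(0.0, 0.5, 0.5))    # True: matches the real spectrum above
print(in_domain(0.0, 2.5, 2.5))    # False: p < 0, complex spectrum
print(in_domain(2.9, 0.05, 0.0))   # True: a point in one of the small
                                   # detached "fish-tail" regions below
```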
3.2 the case of b = 0

our four-site toy model h(4)(a, b, c) degenerates at b = 0 to the trivial non-interacting composition (i.e., the direct sum) of two n = 2 models. for this reason the b = 0 limiting case should be considered exceptional. in this two-dimensional special case, even the general definition of the domain d(h) is slightly misleading. indeed, figure 4, which displays the three sets of boundaries (viz., the two hyperbolas w(a, 0, c) = 0, the four straight lines q(a, 0, c) = 0 and the single circle p(a, 0, c) = 0, respectively), should not be taken too literally. one of the boundaries (viz., the doublet of hyperbolas w(a, 0, c) = 0) describes in fact a sign-non-changing (i.e., reality-of-energies non-changing) curve of the doubly degenerate (and, hence, irrelevant and removable) zeros of the function w(a, 0, c) = (8 + c² − a²)². this means that at b = 0 the domain d(h) is strictly rectangular and strictly simply connected. in this context, one of the key messages of our present study is the surprising discovery of the loss of both of these properties in the general case with a freely variable coupling strength b. incidentally, the multinomial w(a, b_spec, c) becomes factorizable also at b_spec = 1,

$$ W(a,1,c) = (a + c)\left(a^3 - a^2 c - 12a - a c^2 + 20c + c^3\right). $$

this is an artifact which does not carry any immediate physical meaning; its manifestation is of a purely geometrical character, and it will only be briefly mentioned later.

3.3 the case of c = 0

in the third (and last) preparatory step, let us discuss the vanishing-coupling special case c = 0, in which

$$ W(a,b,0) = \left(8 - a^2\right)^2 - 4\left(16 - a^2\right) b^2, \qquad (7) $$

$$ Q(a,b,0) = \left(3 + b^2 + a\right)\left(3 + b^2 - a\right) \qquad (8) $$

and

$$ P(a,b,0) = 10 - a^2 - 2b^2. \qquad (9) $$

the detailed study of precisely this special case reveals the possibility of the emergence of a topological nontriviality in the general case; the detailed form of such a c = 0 hint may be seen in figure 5 and in its magnified version, figure 6. since the condition q(a, b, 0) > 0 degenerates to the elementary constraint 3 + b² > a > −3 − b², just the left and right small horizontal-parabolic segments (with their extremes at b = 0 and |a_max| = 3) should be cut out of the elliptic domain p(a, b, 0) > 0 as inadmissible, because q(a, b, 0) < 0 there.

fig. 4: the degenerate case of the simply connected rectangular domain d(h) at b = 0 (a–c plane)
fig. 5: the triply connected nature of the triple-overlap domain d(h) for the quantum lattice of figure 2 at c = 0 (i.e., in the no-central-coupling extreme; a–b plane)

as long as a² < 16, the remaining constraint w(a, b, 0) > 0 acquires the form

$$ b^2 < \frac{\left(8 - a^2\right)^2}{4\left(16 - a^2\right)} \qquad (10) $$

of a geometric limitation of the admissible range of b by two intersecting, or rather broken and touching, curves. the nonnegative function w(a, 0, 0) = (8 − a²)² vanishes solely at a² = 8.

fig. 6: same as figure 5 (detail)
fig. 7: the deformation of figure 6 at c = 0.95

the allowed region decays into three disconnected open sets (cf. figures 5–7). the big one is formed by the eye-shaped vicinity of the origin, with its extremes at the points (a, b)± = (±√8, 0). the other two smaller open sets are fish-tail-shaped; in the pictures these two domains are easily spotted as containing the respective b = 0 intervals of |a| ∈ (√8, 3). in this sense they may be expected to support a perturbatively inaccessible "strong-coupling" dynamical regime. an additional indication of the suspected emergence of topological as well as dynamical nontrivialities is offered by figure 7, where the same separation of a strong-coupling piece of the domain d(h) is shown to survive up to the very extreme of c ≈ 1.
4 the domain of cryptohermiticity in the full-fledged three-parametric dynamical scenario

4.1 the auxiliary domains and their boundaries

the domain d(h) of parameters for which the hamiltonian h(4)(a, b, c) generates unitary evolution is defined as the intersection of the triplet of domains d(p), d(q) and d(w) in r³. let us now leave all three parameters a, b and c freely variable and recall that:

• the domain d(p) is defined by the inequality p(a, b, c) = 10 − a² − 2b² − c² > 0. it is compact, so that we may restrict our attention to the intervals b² < 5, a² < 10 and c² < 10. at any fixed b² < 5, the section of this first auxiliary domain coincides with the interior of a central circle in the a–c plane with radius r = √(10 − 2b²);

• the allowed interior of the triply connected domain d(q) is defined by the inequality q(a, b, c) = [(a + 3)(c − 1) − b²][(a − 3)(c + 1) − b²] > 0. in the a–c plane, the boundaries of this domain are two hyperbolas, sampled at b² = 1 in figure 8;

fig. 8: boundaries q(a, b, c) = 0 and the allowed and forbidden parts of the a–c plane, as sampled at b² = 1

• the third auxiliary domain d(w) is defined by the inequality w(a, b, c) = (8 + c² − a²)² − 4[16 − (a + c)²] b² > 0. the description of this domain is slightly less trivial. its interior covers all of the exterior of the strip, i.e., the region where |a + c| > 4. the interior of this strip may then be reparametrized,

$$ c - a = 2\tau(c,a), \quad \tau \in (-\infty, \infty), \qquad c + a = 4\sin\varphi(c,a), \quad \varphi \in (-\pi/2, \pi/2), $$

making the rest of the domain d(w) determined by the elementary inequality

$$ |b| < \frac{\left|\,1 + \tau(c,a)\,\sin\varphi(c,a)\,\right|}{\cos\varphi(c,a)}. \qquad (11) $$

this means that within the restricted range of τ(c, a) ∈ (−√10, √10), the growth of |b| → ∞ must be compensated by the decrease of cos φ(c, a) → 0, i.e., by the convergence c → ±4 − a. this makes the strip-restricted part of the domain d(w) very small, but it grows with the decrease of |b| from a sufficiently large initial value.
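the fixed-b sections of d(h) examined in what follows (figures 9–18) are easy to reproduce numerically: one scans the a–c plane on a grid and keeps the points where all of p, q and w are positive. the sketch below prints a coarse character map of such a section; the helper functions repeat eqs. (4)–(6) so that the snippet stands alone, and the grid size and plotted b values are arbitrary choices.

```python
# coarse character map of a fixed-b section of d(h) in the a-c plane:
# '#' marks grid points where all of p, q, w of eqs. (4)-(6) are positive.
import numpy as np

def W(a, b, c): return (8 + c**2 - a**2)**2 - 4 * (16 - (a + c)**2) * b**2
def Q(a, b, c): return ((a + 3)*(c - 1) - b**2) * ((a - 3)*(c + 1) - b**2)
def P(a, b, c): return 10 - a**2 - 2*b**2 - c**2

def section(b, n=61, lim=3.3):
    grid = np.linspace(-lim, lim, n)
    return "\n".join(
        "".join("#" if (P(a, b, c) > 0 and Q(a, b, c) > 0 and W(a, b, c) > 0)
                else "." for a in grid)
        for c in grid[::-1])          # c increases upwards

print(section(b=1.01))   # two narrow, almost touching components (cf. fig. 12)
print(section(b=0.10))   # the triply connected pattern (cf. fig. 18)
```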
4.2 the boundaries of the cryptohermiticity domain

the study of the overlaps of the three auxiliary domains d(p), d(q) and d(w) may be started at the maximal admissible plane b = b(p) = √5, which touches the boundary ∂d(p) at a = c = 0. this point still lies outside the domains d(w) and d(h), since w(0, √5, 0) = 64 − 320 < 0. in a search for the first touch between the b-plane and the boundary ∂d(h), we must diminish b and move into the interior of d(p). in the first illustrative example, at b = √5 − 1/100, our fig. 9 displays the motion of the triplet of boundaries ∂d(p), ∂d(q) and ∂d(w) projected onto the real a–c plane. this picture shows that the corresponding section of the first domain d(p) becomes nonempty; still, it occupies just the interior of a very small circle c(b) = ∂d(p)|_{b = fixed} centered at the origin. the interior of the second domain d(q) is perceivably bigger since, in the manner indicated by fig. 8 above, it occupies the large domain between the two outermost, b-dependent hyperbolic curves h₁,₂(b) ⊂ ∂d(q)|_{b = fixed}. the overlap d(h) itself remains empty, because the third domain d(w) is localized behind the two remaining, less trivially parametrized curves g₁,₂(b) ⊂ ∂d(w)|_{b = fixed}. during the subsequent decrease of b, sampled in figure 9, the two curves g_j(b) and h_j(b) (with the same subscript j = 1 or j = 2) get closer to each other while the internal circle c(b) grows.

at each j, and at the same value of b, both curves g_j and h_j touch the circle c(b). at a still smaller b = √5 − 1/2 = 1.736067977 they already move inside, sharing their two separate intersections with the circle; this situation is illustrated in figure 10. the first two non-empty components of the physical domain d(h) form during the further decrease of b. due to the fact that the two triple-intersection points between c(b), g_j and h_j move apart at each j, these two components remain disconnected, extremely narrow and eye-shaped; these "eyes" look "almost closed" and "slowly opening" with the further decrease of b. graphically, the generic situation is illustrated by figures 11 and 12.

fig. 9: b = √5 − 1/100 = 2.226067977
fig. 10: b = √5 − 1/2 = 1.736067977
fig. 11: b = √5 − 1 = 1.236067977
fig. 12: b = 1.01
fig. 13: b = 1
fig. 14: b = 0.999
fig. 15: b = 0.6
fig. 16: b = 0.4

the next qualitative change of the pattern occurs at the above-mentioned special value b = 1, at which we touch the saddle of the surface ∂d(w). slightly before this happens, we encounter the situation depicted in figure 12, where the two separate subdomains of the physical domain d(h)|_{b = fixed} already almost touch. next they do touch (cf. figure 13) and, subsequently, get connected (cf. figure 14). surprisingly enough, below the saddle point b = 1 the topological surprises are still not at an end. there is no real news even at b = 0.6 (cf. figure 15); nevertheless, in the latter picture we must already pay attention to the two subdomains with the maximal a². having selected just the right end of the (symmetric) picture, at positive a ≈ 3, we reveal the emergence of a tendency towards a new intersection between the (hitherto safely external and non-interfering) second branches of the q(a, b, c)-related hyperbolas and the back-bending boundaries ∂d(w)|_{b = fixed}. at b = 0.4, for example, these curves get very close to each other but still do not intersect, also staying outside the central circular domain d(p) (cf. figure 16). a change in the pattern is finally achieved slightly below b = 0.4, at the moment when both of the new intersection candidates touch the circle c(b) = ∂d(p)|_{b = fixed} in a single point. subsequently, this point splits into a pair of triple intersections. the further decrease of |b| forms the pattern which is sampled in figure 17 at b = 0.2 and in figure 18 at b = 0.1.

fig. 17: b = 0.2
fig. 18: a return to the triply connected d(h) at b = 0.1
now we may return to the limiting pattern of figure 4, where we witness the abrupt change of the topology caused by the final confluence of the straight and backbanding branch of the boundary ∂d(w)|b=f ixed in the limit b → 0. obviously, in a way confirmed by the complementary results of section 3.3, such a confluence of the boundaries only occurs in the limit, so that the theedimensional version of the open set d(h) is ultimately confirmed to be triply connected. 5 summary in the history of pure mathematics the specification of the horizons ∂d(h) (called, often, “discriminant surfaces” in this context) has been perceived as a challenging and rather difficult problem even in its first “unsolvable” case characterized, in our present notation, by the hilbert-space dimensions n = 5 [11]. in certain mathematically natural directions real progress is of amazingly recent date [12]. remarkable parallel developments also occurred in several applied-mathematics oriented studies paying attention to the natural presence of more symmetries in the hamiltonian [7] and/or to the introduction of more observable quantities within a given phenomenological quantum model [13]. in a constructive, more pragmatic setting as sampled by our recent paper [6], we restricted our attention to the topological problem of horizons. the most obvious motivation for such an effort has been given by the fact that the disconnectedness of domain d(h) immediately requires the transition from its traditional perturbation-theory descriptions (with a recommended recent compact sample given in [14]) to non-perturbative methods, or to strong-coupling perturbation techniques [15]. in ref. [6] the parallel and less formal motivation has been emphasized to lie in a systematic search for the possible physical origin of the dynamical anomalies in a kinematical nontriviality of the topology of phase space. the conclusions of our present paper are encouraging. firstly we have demonstrated that for many purposes it may be sufficient to use the matrices with a not too large n . secondly, we have shown an increase in the feasibility describing models h with additional symmetries. at the first nontrivial hilbertspace dimension n = 4 we encountered, for example, the decrease of the minimal necessary number of parameters to d = 3 or even to d = 2 . thirdly, we clarified that once we work with a tridiagonal n by n hamiltonian h (n) 0 complemented by a computationally suitable specific perturbation, the existence of the disconnected subdomains in d(h) opens direct access to the strong-coupling dynamical regime. fourthly, on the mathematical side, we are now able to recommend the use of auxiliary symmetries in the hamiltonians (e.g., of the ones of ref. [16]). in such cases, the algebraic secular equations pertaining to the model often happen to factorize, leading to polynomial equations of perceivably lower orders. the latter fact rendered our toy model easily solvable. last but not least, it seems worth emphasizing that on the background given by refs. [6, 17] it took some time for us to imagine that the anomalies of spectra could also be sought at dimensions as small as n = 4. in this sense the message of our present study is encouraging. several specific spectral irregularities as observed at n = 8 in ref. [6] were found also for matrices with the dimension as low as n = 4. 
our model reconfirmed the hypothesis of a very close, topology-related connection between the loop-shaping of the lattices (i.e., presumably, the betti numbers in the continuous limit) and the existence of strong-coupling dynamical anomalies in the spectra of the energy levels.

appendix a. the three-hilbert-space formulation of quantum mechanics

in section 2 of ref. [5], one of the most compact introductions to the abstract formalism of pt-symmetric quantum mechanics (ptsqm) is given. thus, we may shorten the introductory discussion and restrict ourselves to a few key comments on the general theoretical framework. in such a compression, the ptsqm formalism may be characterized as a version of entirely standard quantum mechanics in which, in principle, the system in question is defined in a certain prohibitively complicated physical hilbert space of states h^(p), where the superscript may be read as abbreviating "prohibited" as well as "physical" [3]. typical illustrative realistic examples may be sought in the physics of heavy nuclei, where the corresponding fermionic states are truly extremely complicated. in the latter exemplification, the first half of the ptsqm recipe lies in the transition to a suitable, unitarily equivalent hilbert space, h^(p) → h^(s), where the superscript "(s)" may stand for "suitable" or "simpler" [3]. in the above-mentioned realistic-system illustration, for example, the new space h^(s) coincided with that of a suitable "interacting boson model" (ibm). in the warmly recommended review paper of this field [18] it has been emphasized that the requirement of unitary equivalence between the two hilbert spaces h^(p) and h^(s) may only be achieved in two ways: either the corresponding boson-fermion-like mapping ω between these two hilbert spaces (known, in this context, as the dyson mapping) remains unitary, and the mathematical simplification of the problem remains inessential, or it is admitted to be non-unitary, a less restrictive option which may enable us to achieve a really significant simplification, say, of the computational determination of the spectra.

what remains to be performed and explained now is the second half of the general ptsqm recipe. its essence lies in weakening the most common unitarity requirement imposed upon the dyson mapping, ω† = ω⁻¹, to the mere quasi-unitarity requirement

$$ \Omega^\dagger = \Theta\,\Omega^{-1}. $$

the symbol θ ≠ i represents here the so-called metric operator, which defines the inner product in the hilbert space h^(s). more details using the present notation may be found in [3]; just a few of them have to be recalled here. firstly, the main source of the purely technical simplification of efficient numerical calculations (say, of the spectra of energies) is to be seen in the introduction of a third, purely auxiliary hilbert space h^(f), where the superscript "(f)" combines the meaning of "friendlier" with "falsified" [3]. by definition, the two hilbert spaces h^(s) and h^(f) coincide as mathematical vector spaces ("of ket vectors" in the dirac terminology); we only replace the nontrivial metric θ^(s) ≡ ω†ω of the former space by its trivial simplification θ^(f) ≡ i in the latter hilbert space. as an immediate consequence, the latter space acquires the status of an auxiliary, manifestly unphysical space which does not carry any immediate physical information or probabilistic interpretation of its trivial though, at the same time, maximally mathematically friendly inner products.
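the quasi-unitarity relation above implies the quasi-hermiticity property h†θ = θh for the representative h = ω⁻¹hω of a hermitian hamiltonian, with the metric θ = ω†ω; a few lines of numpy verify this algebra mechanically for a randomly chosen invertible ω. everything in the sketch is random and purely illustrative, not a model from the paper.

```python
# numerical check of quasi-hermiticity: for a hermitian big-h and any
# invertible dyson-type map omega, the image h = omega^{-1} H omega
# satisfies h^dag theta = theta h with the metric theta = omega^dag omega.
import numpy as np

rng = np.random.default_rng(1)
n = 4
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
H = (A + A.conj().T) / 2                       # hermitian hamiltonian in h^(p)
Omega = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

h = np.linalg.inv(Omega) @ H @ Omega           # non-hermitian representative
Theta = Omega.conj().T @ Omega                 # positive-definite metric

print(np.allclose(h.conj().T @ Theta, Theta @ h))       # True
print(np.allclose(np.sort(np.linalg.eigvals(h).real),
                  np.linalg.eigvalsh(H)))                # same real spectrum
```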
appendix b. the role of pt-symmetry

in its most widely accepted final form, described in ref. [1], the ptsqm recipe complements the latter general scheme by another assumption. it may be given the mathematical form of the introduction of a second auxiliary, manifestly unphysical vector space k^(p) which is, by definition, not even a hilbert space: this fourth vector space is assumed to be endowed with the formal structure of a krein space [19]. ref. [14] may be consulted for more details; here, let us only remind the reader that the symbol p in the superscript carries a double meaning, combining the mathematical role of the indefinite metric p (which in fact defines the krein space) with an input physical interpretation (usually as the operator of parity). in addition, the theoretical pattern

$$ \mathcal{H}^{(s)} \ \leftrightarrow\ \mathcal{K}^{(p)} \ \leftrightarrow\ \mathcal{H}^{(f)} \qquad (12) $$

is complemented by the requirement that there exists a "charge" operator c such that the (by assumption nontrivial, sophisticated) metric θ^(s) ≠ i which defines the inner product in the second hilbert space h^(s) coincides with the product of the two above-mentioned operators,

$$ \Theta^{(s)} = \mathcal{P}\,\mathcal{C}. \qquad (13) $$

the contrast between the feasibility of the n ≤ 3 constructions presented in ref. [5] and the discouraging complexity and incompleteness of the next-step n = 4 constructions, as performed in paper [17] and in its sequels [7], was also thoroughly discussed in our review [8]. in our present text we do not deviate from the notation and conventions accepted in this review. we pay attention solely to the class of models where the n-dimensional matrix of parity is unique and given, in advance, in the diagonal alternating-sign form

$$ \mathcal{P} = \mathrm{diag}\,(1,\, -1,\, 1,\, -1,\, \dots,\, \mp 1). \qquad (14) $$

in parallel, we made use of the time-reversal operator t of the form presented, e.g., in ref. [5], mediating just transposition plus complex conjugation of vectors and/or matrices. we should add that once we work with real vectors and matrices, we are even allowed to perceive t as a mere transposition.

acknowledgement

work supported by gačr grant nr. p203/11/1433, by mšmt "doppler institute" project nr. lc06002, and by the inst. res. plan av0z10480505.

references

[1] bender, c. m.: making sense of non-hermitian hamiltonians. rep. prog. phys., vol. 70 (2007), p. 947–1018.
[2] dorey, p., dunning, c., tateo, r.: the ode/im correspondence. j. phys. a: math. theor., vol. 40 (2007), p. r205–r283. davies, e. b.: linear operators and their spectra. cambridge: cambridge university press, 2007. mostafazadeh, a.: pseudo-hermitian quantum mechanics.
[3] znojil, m.: three-hilbert-space formulation of quantum mechanics. sigma, vol. 5 (2009), 001, 19 pages.
[4] rüter, c. e., makris, k. g., el-ganainy, r., christodoulides, d. n., segev, m., kip, d.: observation of parity-time symmetry in optics. nat. phys., vol. 6 (2010), p. 192–195. kottos, t.: optical physics: broken symmetry makes light work. nat. phys., vol. 6 (2010), p. 166–167.
[5] wang, q.-h., chia, s.-z., zhang, j.-h.: pt symmetry as a generalization of hermiticity. j. phys. a: math. theor., vol. 43 (2010), p. 295301.
[6] znojil, m.: anomalous real spectra of non-hermitian quantum graphs in the strong-coupling regime. j. phys. a: math. theor., vol. 43 (2010), p. 335303.
[7] znojil, m.: tridiagonal pt-symmetric n-by-n hamiltonians and a fine-tuning of their observability domains in the strongly non-hermitian regime. j. phys. a: math. theor., vol. 40 (2007), p. 13131–13148.
acknowledgement

work supported by gačr grant nr. p203/11/1433, by mšmt "doppler institute" project nr. lc06002, and by the inst. res. plan av0z10480505.

references

[1] bender, c. m.: making sense of non-hermitian hamiltonians, rep. prog. phys. vol. 70 (2007), p. 947–1018.
[2] dorey, p., dunning, c., tateo, r.: the ode/im correspondence, j. phys. a: math. theor. vol. 40 (2007), p. r205–r283. davies, e. b.: linear operators and their spectra. cambridge: cambridge university press, 2007. mostafazadeh, a.: pseudo-hermitian quantum mechanics.
[3] znojil, m.: three-hilbert-space formulation of quantum mechanics, sigma, vol. 5 (2009), 001, 19 pages.
[4] rüter, c. e., makris, k. g., el-ganainy, r., christodoulides, d. n., segev, m., kip, d.: observation of parity–time symmetry in optics, nat. phys. vol. 6 (2010), p. 192–195. kottos, t.: optical physics: broken symmetry makes light work, nat. phys. vol. 6 (2010), p. 166–167.
[5] wang, q.-h., chia, s.-z., zhang, j.-h.: pt symmetry as a generalization of hermiticity, j. phys. a: math. theor. vol. 43 (2010), 295301.
[6] znojil, m.: anomalous real spectra of non-hermitian quantum graphs in strong-coupling regime, j. phys. a: math. theor. vol. 43 (2010), 335303.
[7] znojil, m.: tridiagonal pt-symmetric n by n hamiltonians and a fine-tuning of their observability domains in the strongly non-hermitian regime, j. phys. a: math. theor. vol. 40 (2007), p. 13131–13148. znojil, m.: conditional observability versus self-duality in a schematic model, j. phys. a: math. theor. vol. 41 (2008), 304027. znojil, m.: a return to observability near exceptional points in a schematic pt-symmetric model, phys. lett. b vol. 647 (2007), p. 225–230. znojil, m.: conditional observability, phys. lett. b vol. 650 (2007), p. 440–446.
[8] znojil, m.: pt-symmetric quantum chain models, acta polytechnica, vol. 47 (2007), p. 9–14.
[9] znojil, m.: novel recurrent approach to the generalized su-schrieffer-heeger hamiltonians, phys. rev. b vol. 40 (1989), p. 12468–12475.
[10] znojil, m.: horizons of stability, j. phys. a: math. theor. vol. 41 (2008), 244027.
[11] top, j., weitenberg, e.: models of discriminant surfaces, bull. ams vol. 48 (2011), p. 85–90.
[12] wagner, d. g.: multivariate stable polynomials: theory and application, bull. ams vol. 48 (2011), p. 53–84.
[13] znojil, m.: coupled-channel version of pt-symmetric square well, j. phys. a: math. gen. vol. 39 (2006), p. 441–455.
[14] langer, h., tretter, ch.: a krein space approach to pt-symmetry, czech. j. phys. vol. 54 (2004), p. 1113.
[15] caliceti, e., graffi, s., maioli, m.: perturbation theory of odd anharmonic oscillators, commun. math. phys. vol. 75 (1980), p. 51–66. fernández, f. m., guardiola, r., ros, j., znojil, m.: strong-coupling expansions for the pt-symmetric oscillators v(r) = a ix + b(ix)^2 + c(ix)^3, j. phys. a: math. gen. vol. 31 (1998), p. 10105–10112.
[16] znojil, m.: maximal couplings in pt-symmetric chain-models with the real spectrum of energies, j. phys. a: math. theor. vol. 40 (2007), p. 4863–4875.
[17] znojil, m.: determination of the domain of the admissible matrix elements in the four-dimensional pt-symmetric anharmonic model, phys. lett. a vol. 367 (2007), p. 300–306.
[18] scholtz, f. g., geyer, h. b., hahne, f. j. w.: quasi-hermitian operators in quantum mechanics and the variational principle, ann. phys. vol. 213 (1992), p. 74–101.
[19] nagy, k. l.: state vector spaces with indefinite metric in quantum field theory. budapest: akademiai kiado, 1966. gohberg, i. c., krein, m. g.: introduction to the theory of linear nonselfadjoint operators. providence: american mathematical society, 1969.

miloslav znojil
e-mail: znojil@ujf.cas.cz
nuclear physics institute ascr
250 68 řež, czech republic

a study of parameters setting of the stadzt

václav turoň

dept. of circuit theory, czech technical university, technická 2, 166 27 praha, czech republic
corresponding author: turonvac@fel.cvut.cz

abstract this paper deals with the new time-frequency short-time approximated discrete zolotarev transform (stadzt), which is based on symmetrical zolotarev polynomials. due to the special properties of these polynomials, stadzt can be used for spectral analysis of stationary and non-stationary signals with better time and frequency resolution than the widely used short-time fourier transform (stft). this paper describes the parameters of stadzt that have the main influence on its properties and behaviour. the selected parameters include the shape and length of the segmentation window, and the segmentation overlap. because stadzt is very similar to stft, the paper includes a comparison of the spectral analysis of a non-stationary signal created by stadzt and by stft with various settings of the parameters.

keywords: spectral analysis, zolotarev transform, fourier transform, spectrogram.
1 introduction

spectral analysis is a complex field of signal processing which usually deals with transforming the signal between the time domain and the frequency domain. one of the main aims of this analysis is to detect and observe signal information which is not easily analysed in the time domain. many methods are utilized for this purpose, e.g. the short-time fourier transform (stft), the wavelet transform (wt), the hilbert-huang transform (hht) and the novel short-time approximated discrete zolotarev transform (stadzt). all of these methods have certain advantages and disadvantages. stft is widely used because it is effectively evaluated by the fast fourier transform (fft), and because of the intuitive interpretation of the results in the form of spectrograms, which give the signal energy in the time-frequency domain [1]. wt is based on the correlation of an analysed signal and a wavelet function which is scaled and time dilated [3]. the outcome of wt is a scalogram, which represents the signal energy in the time-scale domain; the interpretation of a scalogram is not as clear as the interpretation of a spectrogram. hht exploits the decomposition of the signal into the sum of sub-signals called intrinsic mode functions (imf) by the process of empirical mode decomposition (emd). due to emd, the complex envelope of the signal can be evaluated by the hilbert transform (ht) of the imf with higher accuracy than by the application of ht to the raw signal [4]. the main disadvantage of hht lies in the non-intuitive representation of its results. the stadzt transform is very similar to stft, the difference being that stadzt is based on selective zolotarev polynomials, which improve the time and frequency resolution [9]. the result of stadzt shows the signal energy in the time-frequency domain, as does the spectrogram created by stft.

all methods have the same difficulty: the parameters must be set and kept for the analysis of the whole signal. this issue is not limiting for the analysis of a stationary signal, because the signal parameters do not change in time. difficulties arise when the input signal is non-stationary, because many methods must adapt their parameters to obtain a result that can be used for further analysis. these requirements have led to the development of adaptive methods, e.g. stft with varying segment lengths based on various principles, e.g. minimum energy of spectral leakage (mesp) [6], or the katkovnik algorithm, which utilizes the actual frequency of the analysed signal [7].

this paper focuses on setting the parameters for stadzt. this is a new method in the field of spectral analysis, and no study has yet been performed that systematically describes its parameters. the next part of the paper briefly introduces the fundamental principle of stadzt. the main part describes the parameters and discusses their influence on the time and frequency resolution of stadzt. the well-known stft serves as a benchmark for greater clarity of the examples that are presented.
2 approximated discrete zolotarev transform

the approximated discrete zolotarev transform (adzt) is a new time-frequency method that was specially designed by radim špetík [9] for spectral analysis of non-stationary signals. the transform is based on symmetrical zolotarev polynomials that create its basis functions. these polynomials can be expressed as a weighted sum of chebyshev polynomials of the first and second kind, and can be written using cosine and sine notation, respectively [13]. thus, the basis function can be defined as [9]

zexp(\ell, i2\pi t) = zcos(\ell, 2\pi t) + i\,zsin(\ell, 2\pi t) = \sum_{\mu=-\ell}^{\ell} a'_{2\mu}(\kappa) \cos(2\pi\mu t) + i \sum_{\mu=-\ell}^{\ell} b'_{2\mu-1}(\kappa) \sin(2\pi\mu t) = \sum_{\mu=-\ell}^{\ell} c'_{2\mu}(\kappa) \exp(i2\pi\mu t),   (1)

where \ell denotes the required degree of the polynomials and \kappa signifies the selectivity, which is closely related to the height of the central lobes. a stable recursive algorithm for computing the coefficients a'_{2\mu}, b'_{2\mu} and c'_{2\mu} is given in [13]. from the spectral point of view, the basis functions can be separated into the stationary s(k) and non-stationary n(k) parts

s_z(k) = v_k s(k) - (1 - v_k) n(k),   (2)

where v_k is a weighting factor [5]. because the evaluation of the coefficients of the zolotarev polynomials a'_{2\mu}, b'_{2\mu} is rather tedious, work [9] suggested an algorithm based on the minimization of the non-stationary part of the zolotarev polynomials. it can be construed as a filtering or rearrangement of the fourier spectra according to

s_z = Z \cdot s,   (3)

where s_z and s correspond to the coefficients of the zolotarev and fourier spectra, respectively. the matrix Z contains the coefficients of selective zolotarev polynomials c'_{2\mu} which are optimally chosen in compliance with the analysed signal [9]. the time-frequency analysis of the signal is realized by the stadzt, which is evaluated in a similar way as the stft; however, instead of the exponential function the zolotarev basis zexp is used. thus the zologram can be defined as [10]

S_z(\ell, n) = \sum_{m=-\infty}^{\infty} s(m)\, w(m-n)\, zexp(\ell, i2\pi n),   (4)

where w(m) must be a window of finite length resulting in signal segmentation. thus all parameters of stadzt are the same as the parameters of stft. the following part of the paper deals with the impact of parameter settings on the time and frequency resolution of stadzt.

figure 1: analysed signal, composed of the sum of two harmonic waves and a unit impulse.

3 transform parameters

the right choice of parameters has the main impact on the time and frequency resolution of the transform that is used — stft or stadzt. there are three parameters which closely relate to this issue — window shape, window length and segment overlap. the following part of the paper provides a short description of these selected parameters and shows their influence on the spectral analysis of a non-stationary signal. the analysed signal is composed of the sum of two harmonic waves s_1 = \cos(2\pi n k_1 / N) and s_2 = \cos(2\pi n k_2 / N), where parameter n \in (1, N), N = 1200, k_1 = 75 and k_2 = 190, and a unit impulse which is situated at the center of the signal (see figure 1).
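as a minimal, self-contained baseline, the test signal above and a plain stft spectrogram can be generated in a few lines of matlab. this is our own sketch, not code from [9]; the impulse amplitude is an assumption chosen only to resemble figure 1, and the window length and overlap are the values used later in figure 2.

N = 1200; n = 1:N;
k1 = 75; k2 = 190;
s = cos(2*pi*n*k1/N) + cos(2*pi*n*k2/N);    % two harmonic waves
s(N/2) = s(N/2) + 8;                        % impulse at the centre (amplitude assumed)
win = 256; step = 1;                        % 255-sample overlap, as in fig. 2
K = floor((N - win)/step) + 1;
S = zeros(win, K);
for k = 1:K
    seg = s((k-1)*step + (1:win)).';        % rectangular segmentation window
    S(:,k) = abs(fft(seg)).^2;              % one spectrogram column
end
imagesc(10*log10(S(1:win/2,:))); axis xy    % power in db, positive frequencies
xlabel('time index'); ylabel('k');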
3.1 window shape

the main difficulty with the widely used discrete fourier transform (dft) is the presence of spectral leakage, which limits the frequency resolution of stft. the spectral leakage arises when the orthogonality between the analysed signal and the basis function (the complex exponential in this case) is violated. from another point of view, the leakage originates from the shortening of the signal to a finite length by the weighting function or segmentation window, respectively. as a consequence of the shortening, the spectrum of the analysed signal is given as the result of the convolution of the spectral coefficients of the signal and the weighting function [11], [2]. the measure of leakage depends on the following factors:

• the shape of the segmentation window;
• the ratio of the segment length to the period of the signal.

the first parameter of interest is the shape of the segmentation window, because it has a direct influence on the frequency resolution of stft. a segmentation window with a smoothly rising and falling edge suppresses the spectral leakage more effectively than a window with sharp edges. when a smooth window is applied to the analysed signal, the discontinuities at the boundary of the periodic extension are reduced. thus the frequency information of the main signal components is more distinguishable from the other signal components. the most frequently used windows are the hamming, hann and blackman windows. more information about the application of segmentation windows can be found in [11], [2].

figure 2: stadzt zolograms created by the various window shapes: a) rectangular, b) hamming, c) hann, and d) blackman. stft spectrograms created by the various window shapes: e) rectangular, f) hamming, g) hann, and h) blackman.

figure 2 contains a comparison of the stft transform and the stadzt transform with various shapes of the segmentation window. the other parameters are the same for both transforms: the window length is 256 samples and the segment overlap is 255 samples. as is shown in figures 2e–h, the spectral leakage of stft is suppressed by using a smooth window — the best suppression is achieved by the blackman window because the side lobe level is 58 db down [2]. however, the best frequency resolution for stadzt is obtained for a rectangular window (see figure 2a). when some other window shape is applied, the interference between the window and the adzt time selective basis deteriorates the performance of the adzt algorithm, which evaluates the optimal coefficients of the zolotarev basis according to the actual segment of the signal (see figures 2a–d).
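the leakage trade-off described above can be visualized directly. the sketch below is our illustration (the window functions assume the signal processing toolbox): it plots the zero-padded magnitude spectrum of the windowed k1-component for the four window shapes of figure 2; the smooth windows pay for their low side lobes with a wider main lobe.

N = 1200; n = (1:N).'; k1 = 75;
s = cos(2*pi*n*k1/N);
w = {ones(256,1), hamming(256), hann(256), blackman(256)};
names = {'rectangular','hamming','hann','blackman'};
for j = 1:4
    X = abs(fft(s(1:256).*w{j}, 4096));      % zero-padded for a smooth curve
    plot(20*log10(X(1:2048)/max(X))); hold on
end
legend(names); xlabel('frequency bin'); ylabel('magnitude [db]');
% the blackman side lobes sit roughly 58 db below the main lobe, as quoted from [2]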
3.2 window length

the next parameter that has a notable effect on the frequency and time resolution is the length of the segmentation window. the spectral resolution of dft is defined as

\Delta f = \beta \frac{f_s}{N},   (5)

where f_s is the sampling frequency, N is the length of the signal, and \beta reflects the equivalent noise bandwidth of the window shape that is used [2]. it is apparent that the frequency resolution increases with a prolonged segment. however, it is limited by the heisenberg-gabor uncertainty principle, which states that it is not possible to get the best frequency resolution without losing the time resolution, and vice versa [1]. for this reason, a compromise between time and frequency resolution must be set for stft.

figure 3: stadzt zolograms created by the various window lengths: a) 64 samples, b) 128 samples, c) 256 samples, and d) 512 samples. stft spectrograms created by the various window lengths: e) 64 samples, f) 128 samples, g) 256 samples, and h) 512 samples.

this property is demonstrated by the spectrograms created by stft with increasing window length (see figures 3e–h). the best time resolution is obtained for the shortest window length: all sudden changes of the signal are located with high accuracy in time, but the frequency resolution is rather poor (see figure 3e). when the longest window is applied, the frequency resolution is increased but the time localization of the signal deteriorates (see figure 3h). the frequency resolution of stadzt is given by the window length just as for stft, but the time resolution is not constant for all frequencies. the reason for this behaviour is that high frequencies can be better analysed than low frequencies by high-order zolotarev polynomials [10]. one advantage of stadzt is that the time resolution and the frequency resolution are independent from each other in a certain manner, because the optimal coefficients of the symmetrical zolotarev polynomials are set by the adzt algorithm. figures 3a–d illustrate the good time resolution of stadzt, which is not significantly corrupted by increasing window length; only the frequency resolution is ameliorated.

it is worth noting that all spectrograms in figures 3 and 4 are created by stft using a rectangular window. the first reason for this choice is that adzt achieves the best results for a rectangular window (see section 3.1). the second reason is that the principle of adzt can be understood as a filtering of the fourier spectra, so when we use a rectangular window, we get an original spectrogram which is then processed by stadzt.
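equation (5) can be read off numerically for the window lengths of figure 3. in the sketch below (our illustration), the sampling frequency is normalized to 1, and the \beta values are the approximate equivalent noise bandwidths tabulated in [2] for the four window shapes.

fs   = 1;                          % normalized sampling frequency (assumption)
Nwin = [64 128 256 512];           % window lengths of fig. 3
beta = [1.00 1.36 1.50 1.73];      % approx. enbw: rect, hamming, hann, blackman [2]
df   = beta.' * (fs ./ Nwin)       % 4x4 table: rows = windows, cols = lengths
% doubling the window halves delta-f but, by the heisenberg-gabor
% principle, also halves the time localization of the stft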
3.3 segment overlap

the last described parameter is the overlap of the signal segments. if non-overlapping segments are used, stft will lose information about the signal near the segment boundaries because of the window shape that is applied [2]. stft therefore usually uses an overlap of 50 % or 75 %, which enables the whole signal to be analysed without significant loss of information. another advantage is that it avoids the extra effort of computing the fourier spectra for segments with an overlap of 99 %.

the evaluation of the optimal zolotarev coefficients Z is based on minimizing the non-stationary part of the zolotarev polynomials represented by the spectral coefficients of the complex fourier spectra (2) [5]. as a consequence, the adzt algorithm is very sensitive to the phase of the input signal. the best time resolution is obtained for an overlap of 99 %, which approximately relates to a segmentation step of 1 sample in this case (see figure 4). this property is shown in figures 4a–d, which illustrate the zolograms created by stadzt with decreasing overlap. the worst case for time resolution is achieved for an overlap of 50 %, when the depicted zologram does not contain information about the unit impulse (see figure 4d). the relation between overlap and segmentation step is summarized in the sketch below.

figure 4: stadzt zolograms created by a window length of 128 samples and various overlaps: a) 127 samples ∼ 99 % overlap, b) 120 samples ∼ 93 % overlap, c) 96 samples ∼ 75 % overlap, and d) 64 samples ∼ 50 % overlap. stft spectrograms created by a window length of 128 samples and various overlaps: e) 127 samples ∼ 99 % overlap, f) 120 samples ∼ 93 % overlap, g) 96 samples ∼ 75 % overlap, and h) 64 samples ∼ 50 % overlap.
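a short matlab illustration of the four cases of figure 4 (our own arithmetic, nothing more): for a 128-sample window, the overlap values translate into segmentation steps of 1, 8, 32 and 64 samples, i.e. roughly 99, 94, 75 and 50 percent overlap, and into very different numbers of segments for the N = 1200 test signal.

win     = 128;
overlap = [127 120 96 64];               % overlaps of figs. 4a-d
step    = win - overlap;                 % segmentation step: 1, 8, 32, 64 samples
ratio   = 100*overlap/win;               % overlap in percent
nseg    = floor((1200 - win)./step) + 1; % number of segments of the test signal
disp([overlap; step; ratio; nseg])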
4 conclusion

this paper has presented a study of the parameters which affect the time-frequency analysis created by the new stadzt transform. the parameters discussed here include the shape and the length of the segmentation window, and the segment overlap. these parameters are crucial for stft, where a compromise between the time resolution and the frequency resolution must be set. however, stadzt is not so sensitive to these parameters, because the coefficients of the basis functions created by the selective zolotarev polynomials are adjusted for the analysed signal. due to these basis functions, the best frequency resolution is acquired for a rectangular window, which is not good for analysis by stft. in the case of spectral leakage reduction by using a window function, the resulting stft spectrogram contains less information that can be used for adjusting the adzt basis functions. as a consequence, the stadzt zologram (see figures 2b–d) is worse than the zologram using a rectangular window (see figure 2a). the next benefit of the zolotarev basis is that the time resolution of stadzt does not depend strictly on the window length. a greater window length means more information in the stft spectrogram, which can be used for optimal adjustment of the zolotarev basis function. hence, stadzt is able to obtain good time and frequency resolution simultaneously (see figure 3d). a substantial feature of stadzt is that the time resolution is not constant over the whole frequency range; thus the time instants of signal changes are localized with higher accuracy for high frequencies than for low frequencies. more information about zolotarev polynomials and their application can be found on the website [12].

acknowledgements

the research presented in this paper was supervised by prof. p. sovka, fee ctu in prague, and by prof. m. vlček, fts ctu in prague. the author wishes to thank the grant agency of the czech republic for supporting this research through project p-102/11/1795: novel selective transforms for non-stationary signal processing.

references

[1] r. e. crochiere. a weighted overlap-add method of short-time fourier analysis/synthesis. ieee transactions on acoustics, speech, and signal processing, 28(1):99–102, 1980.
[2] f. j. harris. on the use of windows for harmonic analysis with the discrete fourier transform. proceedings of the ieee 66(1):51–83, 1978.
[3] c. heil, d. f. walnut, i. daubechies. fundamental papers in wavelet theory. princeton university press, princeton, 2006.
[4] n. e. huang, z. shen, s. r. long et al. the empirical mode decomposition and the hilbert spectrum for nonlinear and non-stationary time series analysis. proc. r. soc. london 454:903–995, 1998.
[5] j. janik, v. turon, p. sovka et al. a way to a new multi-spectral transform. in 11th wseas international conference on signal processing, computational geometry and artificial vision, florence, 2011.
[6] a. lukin, j. todd. adaptive time-frequency resolution for analysis and processing of audio. in audio engineering society 120th convention, paris, 2006.
[7] v. katkovnik, l. stankovic. periodogram with varying and data-driven window length. signal processing 67(3):345–358, 1998.
[8] a. v. oppenheim, r. w. schafer, j. r. buck. discrete-time signal processing. 2nd ed., prentice-hall, new york, 1999.
[9] r. spetik. the discrete zolotarev transform. fee ctu in prague, prague, 2009.
[10] v. turon, j. janik, p. sovka et al. study of adzt properties for spectral analysis. in 11th wseas international conference on signal processing, computational geometry and artificial vision, florence, 2011.
[11] j. uhlir, p. sovka, r. cmejla. uvod do cislicoveho zpracovani signalu. fee ctu in prague, prague, 2003.
[12] m. vlcek et al. novel selective transforms for non-stationary processing. http://amber.feld.cvut.cz/selective%20transforms [2012-09-25].
[13] m. vlcek, r. unbehauen. zolotarev polynomials and optimal fir filters. ieee transactions on signal processing 47(3):717–730, 1999.

first steps into practical engineering for freshman students using matlab and lego mindstorms robots

a. behrens, l. atorf, r. schwann, j. ballé, t. herold, a. telle

abstract besides lectures on basic theoretical topics, contemporary teaching and learning concepts for first semester students give more and more consideration to practically motivated courses. in this context, a new first-year introductory course in practical engineering has been established in the first semester curriculum of electrical engineering at rwth aachen university, germany. based on a threefold learning concept, programming skills in matlab are taught to 309 students within a full-time block course laboratory. the students are encouraged to transfer known mathematical basics to program algorithms and real-world applications performed by 100 lego mindstorms robots. a new matlab toolbox and twofold project tasks have been developed for this purpose by a small team of supervisors. the students are supervised by over 60 tutors at 23 institutes, and are encouraged to create their own robotics applications. we describe how the laboratory motivates the students to act and think like engineers and to solve real-world issues with limited resources. the evaluation results show that the proposed practical course concept successfully boosts students' motivation, advances their programming skills, and encourages the peer learning process.

keywords: first semester students, freshman course, practical engineering, introduction to programming, matlab, lego mindstorms, robotics laboratory, project-based learning.

1 introduction

practical first semester undergraduate courses in electrical engineering have become more important in modern learning concepts, since they can strongly boost the motivation of first-year students and give insights into practical methods and basic engineering concepts [2]. against this background and within the scope of the new bachelor of science curriculum, instituted by the european bologna process [1], the faculty of electrical engineering and information technology at rwth aachen university, germany established the first semester student laboratory "matlab meets lego mindstorms" in the 2007–2008 academic year. the lab focuses on three elements: mathematical methods and digital signal processing fundamentals, programming basics of the matlab® software from mathworks [7], and practical engineering with lego® mindstorms® nxt robots [8].

freshman introduction courses in practical engineering overcome the traditional twofold teaching structure, which first presents theoretical foundations and only later confronts the more mature student with real-world issues and engineering approaches. this curriculum usually provides only low student motivation, and thus the students can encounter difficulties in handling complex situations in their following career [3]. to achieve a better learning process, the essential element can be stated as "no matter how many concepts you teach, no matter how deep you go, no matter how tough your exams are, the only lessons that will remain in the students minds are those that touch them" [4]. thus it is very important to motivate first semester students by confronting them with practical situations, in which they "feel like engineers" right from the beginning of their studies [2]. in addition to mathematical and electrical engineering skills, the ability to work with computer programming software is also a required skill for today's electrical engineers [5]. for this purpose, a new self-motivated first semester practical laboratory, called "matlab meets lego mindstorms", has been developed.
the paper is organized as follows: section 2 describes the learning targets of the project and the applied software and hardware. the project structure and management, executed by a small core team of supervisors, is presented in section 3. section 4 introduces a novel software toolbox, which has been developed in matlab. the students' practical activities, separated into basic exercises and major tasks, are described in section 5. evaluation results are given in section 6. in section 7, some future perspectives are presented, and section 8 highlights the main conclusions about this practical course.

2 learning targets

the undergraduate laboratory "matlab meets lego mindstorms" is motivated by three objectives: mathematical methods, matlab, and practical engineering, as illustrated in fig. 1.

fig. 1: key objectives of the laboratory

the first learning target is characterized by applying and broadening the students' basic mathematical knowledge that has been imparted by the first semester lecture in "mathematical methods in electrical engineering". the content of this course includes an introduction to digital signal processing and system theory [6] with topics like complex numbers, sampling of functions, the fourier transform, polynomials, vector spaces and matrix operations. based on these theoretical foundations, the students are inspired to link theoretical methods to practical issues provided by the last two objectives of the project. the second objective represents the acquisition of programming skills in matlab and the transfer of mathematical methods to programming algorithms. experience has shown that the most efficient learning process can be achieved by unassisted writing of program code. therefore the students are encouraged to develop their own code right from the beginning.
the introduction of matlab as a text-based programming software in this lab is motivated by its direct and intuitive access to vectors and matrices. moreover, this software is also widely used in industry and in other courses at rwth aachen university. the third learning target promotes practical engineering on real-world applications and challenging tasks by using programmable and remote-controlled lego mindstorms nxt robots [8]. the standard nxt education kit provides four different sensors (touch, sound, light and ultrasonic, which is used for measuring distances) and motors, illustrated in fig. 2.

fig. 2: lego mindstorms nxt hardware

the advantage of using this robotic system is that, on the one hand, complex robots can be easily built with standard lego bricks, while on the other hand application constraints are given by the limited brick types and the low-budget sensor accuracy. limited project resources are essential for simulating authentic engineering tasks and real-world problems. in addition, the lab supports a peer learning process, in which the students acquire soft skills like collaboration, team work and leadership [5, 10]. overall, the laboratory assists freshmen in handling practical issues in the manner of engineers. the student groups have to build mindstorms robots and develop algorithms in matlab; thus, the name of the project is derived straightforwardly as "matlab meets lego mindstorms".

3 project structure and management

the project first took place in december 2007. it was performed as an eight-day full-time (six hours per day) block course. thus, the students could focus entirely on the project while other first semester lectures paused. 309 freshmen participated in the lab and were guided by more than 60 supervisors at 23 institutes of the faculty. a total of more than 150 computer workplaces were provided and 100 lego mindstorms nxt robotics sets were bought. each student group of four shared one robot kit and two computers. the project management and the development of the lab exercises were organized by a small core team of supervisors, who are members of five different departments of the faculty. within a four-month period the team determined the required resources, developed a system environment, created lab exercises and demonstrations, and trained 60 supervisors to achieve the proposed learning targets.

since the focus of the lab is on mathematical methods, digital signal processing, and mechatronic aspects using matlab, the nxt robots have to be controlled directly via matlab. to provide remote control and mobility for autonomous robots, the wireless bluetooth® communication interface of the novel lego mindstorms nxt series with its original firmware was chosen. based on the communication protocol, the core team developed a new matlab toolbox, which provides a direct interface between matlab and nxt robots. in three-weekly meetings, the software design was defined and the contents of the new lab exercises were discussed and developed. eventually the practical exercises were documented and introduced to the other supervisors of the faculty.

4 rwth – mindstorms nxt toolbox

despite the wide choice of commercial and free nxt software for different programming languages [11], no implementation is applicable for the desired direct computer-robot communication via matlab without additional software. for this reason, a novel software toolbox, called "rwth – mindstorms nxt toolbox", was developed and the lego mindstorms bluetooth communication protocol [9] was adapted to matlab functions.
with these functions, all nxt standard sensors and servo motors can be controlled, and various other system features, like reading the battery level or playing a tone, are implemented. besides these direct commands, the toolbox is organized in a structure of four command layers, as shown in fig. 3. it provides additional high-level and control functions for more intuitive and comfortable usability. towards this end the low-level functions and direct commands are encapsulated, and the user interface is adapted to the common matlab toolbox usage and handling.

fig. 3: rwth – mindstorms nxt toolbox – command layers

fig. 4 shows an example of reading the current measurement value of the ultrasonic sensor:

>> openultrasonic(sensor_1);
>> distance = getultrasonic(sensor_1)

distance =

    36

>>

fig. 4: matlab example code using high-level functions for sensor reading

these high-level functions avoid time-consuming programming, especially for beginners. in addition, detailed documentation and program examples are embedded in the toolbox. the remote-control concept reduces the implementation of autonomous robots to three main steps: constructing a robot, developing a matlab program, and interactive program execution. there is no need for an intermediate step in which the program is compiled and downloaded onto the embedded robot processor. therefore a direct debugging and monitoring mode is supported, which allows the students to validate their program code simultaneously with its processing.

a drawback of this design is related to the characteristics of the wireless bluetooth interface. in general, the data rate of bluetooth is limited to 1–3 mbit/s, and data errors and packet losses can occur randomly during the data transmission. however, the major bottleneck is caused by the bluecore™ chip of the nxt hardware, which penalizes switching from receive-mode to transmit-mode with an additional 30 ms latency [9]. due to these constraints, real-time regulation with a short response time is not possible via the nxt bluetooth communication protocol.

the rwth – mindstorms nxt toolbox for matlab supports windows and linux platforms and is published as a free open source product. it is subject to the gnu general public license (gpl) [13] and can be downloaded from [14]. in general, the open source software concept motivates people to participate in project development and founds international interacting communities [12]. in addition, the freeware toolbox enables other universities and schools to introduce the software to their practical robotics courses for free and without any obligation.
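building on the fig. 4 functions, a typical usage pattern is a sensor polling loop. the sketch below is hypothetical: only openultrasonic and getultrasonic are documented above (sensor_1 follows fig. 4), while stopmotors() is a placeholder for whatever motor command the installed toolbox version actually provides.

openultrasonic(sensor_1);            % open the ultrasonic channel, as in fig. 4
for k = 1:1000                       % poll for at most ~1000 readings
    d = getultrasonic(sensor_1);     % distance reading in cm
    if d < 20                        % obstacle closer than 20 cm?
        stopmotors();                % hypothetical motor-stop helper
        break
    end
    pause(0.05);                     % respect the ~30 ms bluetooth latency
end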
5 practical activities

the core team developed nine practical activities, which were performed by the students in two project parts. the first part included six sequential basic exercises, each of which addressed different nxt sensors or actuators. these were developed to teach the students, formed into teams of two, programming principles in matlab, applying the rwth – mindstorms nxt toolbox and the bluetooth interface, and measuring the nxt hardware characteristics. in the second part of the lab the student groups could choose from three major and more complex tasks, but were also free to develop their own experiments and challenges in order to enhance creative ideas, robot construction and individual problem solving methods. therefore student groups of four had to develop individual robots and algorithms to solve the major tasks in competition against other groups. in this process the limitations of the lego mindstorms hardware, e.g., slight variations between sensors and motors of the same type, had to be taken into account. not every behavior of the robot was easily foreseeable in all situations. for instance, the driving distance covered by a robot vehicle showed variations, since it depends on the traction on tiling or on carpet, and had to be considered in the robot design and the control algorithms. it was also necessary to handle the limitations of the bluetooth channel, which were noticeable in random transmission delays. all given tasks were designed by the core team with respect to the hardware and bluetooth constraints, and were documented and organized in modular subtasks to establish clear requirements and targets for each subtask. at the end of the lab, each student group presented their individual robots and algorithms to the other participants. an overview of the lab time schedule is given in fig. 5.

fig. 5: lab time schedule (in days)

5.1 basic exercises

each of the six basic exercises focused on a key aspect of the three learning targets, shown in fig. 1. in the first experiment the students constructed individual robots and tested the sensors with the nxt firmware. in the next two exercises the bluetooth interface was introduced, and the touch and sound sensors were controlled via matlab using the rwth – mindstorms nxt toolbox. in addition, simple programming structures like loops, if-conditions and functions were taught. the fourth experiment was addressed to the nxt servo motors. after taking measurements of the internal rotation sensor and working with different gear transmission ratios, a machine for visualizing complex numbers was built and programmed. for this, the students had to implement an algorithm which reads two complex numbers, takes a mathematical operation from a user input dialog, calculates the result, plots the complex phases in a diagram and also displays them using the mechanical pointers of the lego machine, as shown in fig. 6. the fifth experiment focused on light sensor measurements and matlab gui design. finally, the last basic exercise was performed by programming an explorer robot, which scans the local environment using the ultrasonic sensor, plots a 360-degree distance profile and drives autonomously through an open gate of an enclosure, illustrated in fig. 7. to fulfill this challenge the students had to process the data with simple filter operations, detect signal edges and locate the correct target angle; a sketch of this processing chain follows below.
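the following matlab sketch is our own reconstruction of that processing chain, not the students' code; the synthetic profile (one distance sample per degree, an enclosure wall at 100 cm with a gate between 200° and 240°) is an assumption used only for illustration.

d360 = 100*ones(1,360);                   % synthetic enclosure wall at 100 cm
d360(200:240) = 300;                      % open gate: distances jump to 300 cm
sm    = conv(d360, ones(1,5)/5, 'same');  % simple moving-average filter
edges = diff(sm);                         % signal edges of the profile
[~, up]   = max(edges);                   % rising edge: start of the gate
[~, down] = min(edges);                   % falling edge: end of the gate
target = round((up + down)/2)             % heading angle through the gate, in degrees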
5.2 major practical tasks

the major practical tasks were designed to confront the students with more complex challenges, give them the opportunity to work together as a team of four, and to manage the assigned tasks on their own. the students could choose among three assigned challenges, or were able to define their own creative application. the three documented tasks included an obstacle course robot which exits and maps a maze, a 3-d robot arm that grabs and sorts colored balls (fig. 8), and a 2-d robot arm which acts as an image scanner. besides the competition aspects and the individual robot constructions, all tasks focused on data processing and programming with matlab, e.g., data visualization, mapping, gui interaction and graphical display.

the 2-d scanner application was given by a two-joint robot arm. students who chose this experiment had to develop a time-efficient scan algorithm. here the dependency between the rotations of the two joints and the current grid position of the light sensor had to be taken into account, as illustrated in fig. 9. then challenges like the transformation between polar and cartesian coordinates, registration, image interpolation and visualization were addressed in subtasks. the scanning process is illustrated in fig. 10. the figure shows the original image, a sampled image and the interpolation result of the test picture, called lena. despite the hardware limitations the interpolated image provides a good result.

fig. 6: exercise 4: complex phase machine (left: matlab plot of complex phases, right: mechanical simulation displayed by the lego pointers)
fig. 7: exercise 6: explorer robot scans the local environment (left: 360-degree distance profile, right: enclosure with an open gate)
fig. 8: major practical tasks (1st row: obstacle course robot with route mapping, 2nd row: robot arm with monitored status)
fig. 9: 2-d scanner with rotation angles of the joints
fig. 10: 2-d scan (left: original image, middle: sampled image, right: interpolation)
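the geometry behind figs. 9 and 10 can be sketched in a few lines of matlab (our illustration; the arm segment lengths r1, r2, the joint angle ranges and the stand-in "light readings" are all hypothetical): samples taken along the arcs of the two joints are converted to cartesian coordinates and interpolated onto a regular grid.

r1 = 10; r2 = 6;                          % hypothetical arm segment lengths
[a1, a2] = meshgrid(linspace(0,pi/2,60), linspace(-pi/3,pi/3,60));
x = r1*cos(a1) + r2*cos(a1 + a2);         % joint angles -> sensor position
y = r1*sin(a1) + r2*sin(a1 + a2);
v = sin(x) + cos(y);                      % stand-in for light-sensor readings
[xg, yg] = meshgrid(linspace(min(x(:)),max(x(:)),200), ...
                    linspace(min(y(:)),max(y(:)),200));
img = griddata(x(:), y(:), v(:), xg, yg, 'linear');  % image interpolation
imagesc(img); axis image                  % reconstructed image, cf. fig. 10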
5.3 presentation

on the last day of the lab each student group presented their results and their individual robot constructions to the other participants in a 15–20 minute presentation. the features and behavior of the robots were shown in live demos, and the developed matlab algorithms were discussed. for the presentation, common software like openoffice and microsoft® powerpoint was used.

6 evaluation and results

after the lab, 175 students participated in an anonymous online evaluation. according to the evaluation results, the lab fulfilled the goal of introducing matlab as a software for solving real-world problems based on mathematical methods and principles. as shown in fig. 11, 48 % of the students rated their matlab programming skills as "average" at the beginning of the lab, whereas 48 % evaluated their improved skills as "excellent" and 44 % as "good" after the project. the overall rating also shows an "excellent" or "good" validation of 89 %. in addition, almost every participant would join the lab again and approved the usage of mindstorms applications also in other practical courses.

fig. 11: evaluation results (a: excellent, b: good, c: average, d: below average, e: inadequate)

besides the motivation rate of fig. 11, the students' motivation was expressed by their group dynamics and by their strong intention to compete with other groups by spending additional time on refining the robot algorithms after the official hours. the impact of peer learning within the groups of four was also observable, since the students shared resources and engaged in debates and critical reflections. the responses to the question "identify positive aspects of the lab", such as "confrontation with problems which are not mentioned in the theory", "theoretical foundations could be linked to practical problems, which led to a higher understanding level" and "practical application with matlab", also showed that the proposed objectives were fulfilled. the students' creativity and the variety of individual application ideas and robot constructions exceeded our expectations. the evaluation showed that 45 % of the interviewed persons defined and implemented their own practical major tasks. thus new and creative robots, like a 3-d environment scanner with target recognition, a bottle finder with a fill-level scanner, a vehicle with roadside recognition, a bowling robot, an autonomous parking vehicle, a sorting machine, a bar code and bar disk scanner, a morse encoder/decoder (shown in fig. 12), etc., were designed.

fig. 12: morse encoder and decoder machine

while many students argued that working with the hardware limitations was challenging, some participants criticized the occurrence of major bluetooth latencies and the complex motor control, which limited the implementations of their applications. the limited number of mindstorms kits was also criticized, especially during the basic exercises, in which one kit was shared by two teams of two students. referring to this, fig. 11 shows an "average" rating. overall, the laboratory work clearly showed that freshman students were highly motivated and encouraged to transfer mathematical principles to robotics while broadening their programming knowledge in matlab at the same time. social skills like working in a team, managing the assigned tasks within a given time frame and cooperating with others to attain mutually beneficial goals were also supported by the laboratory concept. in addition, students who showed only minor creativity and productivity were encouraged to work on the predetermined and documented applications.

7 perspectives

based on this first experience and the evaluation results, some adjustments will be made for the next semester. the number of mindstorms nxt education sets will be doubled in order to accelerate the execution time of the basic exercises and provide additional time for the major practical tasks. the rwth – mindstorms nxt toolbox for matlab will support a usb interface to lower the probability of major packet losses using the bluetooth channel, and additional high-level motor control functions will be implemented. due to the great success of the project, this laboratory may also be integrated into the curriculum for industrial engineering.

8 conclusions

this paper has presented an introductory course for freshmen in practical engineering. we have described a threefold learning concept which confirms mathematical basics, fosters matlab programming skills, and introduces students to real engineering problems, motivated by lego mindstorms robots. a novel matlab toolbox, which provides direct computer-robot communication, has been presented and has been made publicly available as an open source project. the sequential execution of basic and major practical tasks evoked a high student motivation, expressed by a wide range of creative and individual robotics applications and matlab solving algorithms. the inherent hardware limitations were challenging for most participants and led to intensive group dynamics and team work. the block course concept fulfilled the demanded practical engineering and matlab programming goals, which were also expressed by the students' final project presentations. extending this learning concept to other practical courses with interdisciplinary groups could broaden the students' professional skills and help their future careers.
project web page

further images, descriptions, and videos of the students' robots and inventions are available at the web page http://www.lfb.rwth-aachen.de/mindstorms. the contact to the developer community and the free download of the "rwth – mindstorms nxt toolbox" for matlab can be found at the project web page http://www.mindstorms.rwth-aachen.de.

acknowledgments

the authors would like to thank prof. dr.-ing. til aach, institute of imaging & computer vision, and prof. dr.-ing. tobias g. noll, chair of electrical engineering and computer systems, rwth aachen university, who supervised this project. the authors would also like to thank rainer schnitzler and achim knepper, who gave support to the project development and organization. finally, the authors gratefully acknowledge the work of marian walter and axel cordes, who tested and verified the toolbox and practical exercises.

references

[1] european ministers of education: bologna declaration, june 1999. [online]. available: http://www.ond.vlaanderen.be/hogeronderwijs/bologna/documents/
[2] vallim, m., farines, j.-m., cury, j.: practicing engineering in a freshman introductory course, ieee trans. edu., vol. 49 (2006), no. 1, p. 74–79, feb. 2006.
[3] director, s., khosla, p., rohrer, r., rutenbar, r.: reengineering the curriculum: design and analysis of a new undergraduate electrical and computer engineering degree at carnegie mellon university. proc. ieee, vol. 83 (1995), no. 9, p. 1246–1269, sep. 1995.
[4] saint-nom, r., jacoby, d.: building the first steps into sp research. in proc. ieee int. conference on acoustics, speech, and signal processing (icassp), vol. 5 (2005), p. 545–548.
[5] hissey, t.: education and careers 2000. enhanced skills for engineers. proc. ieee, vol. 88 (2000), no. 8, p. 1367–1370, aug. 2000.
[6] mcclellan, j., schafer, r., yoder, m.: signal processing first. prentice-hall, nov. 2002.
[7] the mathworks, matlab®. [online]. available: http://www.mathworks.com.
[8] lego mindstorms®. [online]. available: http://www.mindstorms.com.
[9] lego® mindstorms®: bluetooth developer kit. [online]. available: http://mindstorms.lego.com/overview/nxtreme.aspx.
[10] mcgoldrick, c., huggard, m.: peer learning with lego mindstorms. in proc. 34th frontiers in education, oct. 2004, p. s2f24–29.
[11] patterson-mcneill, h., binkerd, c. l.: resources for using lego mindstorms. j. comput. small coll., vol. 16 (2001), no. 3, p. 48–55.
[12] hars, a., ou, s.: working for free? – motivations of participating in open source projects. proc. 34th int. conf. on system sciences, p. 1–9, jan. 2001.
[13] gnu general public license. [online]. available: http://www.gnu.org/licenses/licenses.html.
[14] rwth aachen university: rwth – mindstorms nxt toolbox for matlab. [online]. available: http://www.mindstorms.rwth-aachen.de.

alexander behrens (behrens@lfb.rwth-aachen.de), linus atorf (atorf@lfb.rwth-aachen.de): institute of imaging & computer vision
robert schwann (schwann@eecs.rwth-aachen.de): chair of electrical engineering and computer systems
johannes ballé (balle@ient.rwth-aachen.de): institute of communications engineering
thomas herold (herold@iem.rwth-aachen.de): institute of electrical machines
aulis telle (telle@ind.rwth-aachen.de): institute of communication systems and data processing
rwth aachen university, templergraben 55, 52062 aachen, germany
a new fast silicon photomultiplier photometer

f. meddi, f. ambrosino, c. rossi, r. nesci, s. sclavi, a. ruggieri, s. sestito, i. bruni, r. gualandi

abstract the crab pulsar is one of the most intensively studied x-ray/optical objects, but up to now only a small number of research groups have based their photometers on sipm technology. in early february 2011, the crab pulsar signal was observed with our photometer prototype. with low-cost instrumentation, the results of the analysis are very significant: the processed data acquired on the crab pulsar gave both a good light curve and a good power spectrum, in comparison with the data analysis results of other, more expensive photometer instrumentation.

keywords: silicon photomultiplier detector (sipm), photometer, fast variability, pulsar.

1 introduction

astronomical sources with fast variability are basically of three kinds: pulsars, interactive binaries and pulsating stars. many of these objects are also x-ray and gamma-ray sources, and it is of great interest to study them because several orbiting x-ray and gamma-ray observatories are presently operative. the timescale variabilities range from hours to thousandths of seconds; the amplitude variations in the optical band range from 100 % (pulsars) down to 0.1 % (o subdwarfs). for fast time scales, the only detectors available in the optical band used to be classical photomultipliers. in recent times, a new class of detectors, silicon photomultipliers (sipm), has been developed. their astronomical use remains to be explored in detail. we have built a prototype of a rapid astronomical photometer, based on sipm detectors commercially available from the well-known hamamatsu company [1]. in this work, we report our first astronomical results.

2 technical description

astronomical photometers based on sipm technology are presently used by a very limited number of research groups: the optima team [2] at the max planck institute mpe, and the aqueye team [3] at padua university. typical characteristics of these detectors are the short response time (20 ns), segmentation into cells of linear size from 0.025 mm to 0.1 mm, and photon detection efficiency (pde) up to 75 % at 450 nm. for details see figure 1, where the code s10362-11-050u refers to the internal sensor present inside each mppc (multi-pixel photon counter) module that we used.

fig. 1: the blue curve shows the photon detection efficiency of our mppc modules

our system comprises three mppc modules, by hamamatsu, with an active area of 1 × 1 mm2 and a pixel size of 50 × 50 μm2. one detector is used to observe the target, a second detector is used for the sky level nearby, and a third one is used to observe a reference star. the light from the telescope arrives at each detector through a plastic optical fiber (600 μm in diameter). to reduce the electronic noise, the detectors are kept inside a commercial freezer, which cools two of them to about −8.5 °c and the third detector to about −6.0 °c. the fastest acquisition rate allowed by the software provided with the detectors by hamamatsu is 1 ms; we have nearly halved this to 0.55 ms using a dedicated electronic system named "p3e", which stands for pulsar pulse period extractor, developed at the physics department of la sapienza university. the speed limit is at present given by the data recording device (sd card), but we are working to improve this limit. figure 2 shows a block diagram of our electronic chain.
fig. 2: block diagram of the electronic chain mounted on the telescope

the universal time of the data acquisition system is given by a commercial gps unit, the antenna of which is located outside the dome. the gps unit provides an information string (coordinates and timing via the serial interface) and also a pps (pulse per second) signal. the pps signal either arrives at an i/o (input/output) bit of a microcontroller unit, where it is processed to obtain one pulse at the beginning of the measurement and another pulse at the end of the acquisition (the "gated pps"), or it is distributed unmodified to each p3e unit (the "not-gated pps"). the gated pps is sent to the system to drive two leds providing an optical timing marker. the not-gated pps is used by each p3e to start the internal finite state machine, developed using an fpga (field programmable gate array), to count the discriminated signal generated by the mppc module. the p3e processed data is sent to another microcontroller unit, which interfaces a mass storage unit via an sd card (fat 32 formatted) in order to be readable by a pc. the mechanical interface was made partly in our department and partly at the loiano observatory.

we made some preliminary trials both on the vallinfreda 50 cm newtonian telescope [4] and on the loiano 152 cm cassegrain telescope [5] to check the overall efficiency and linearity of the instrument response with stars of a given magnitude. in figure 3, the upper line refers to the loiano telescope and the lower line refers to the vallinfreda telescope.

fig. 3: magnitude computed by a pogson's law-like formula (number of detected photons from target minus sky background) as a function of a known magnitude (mag v)

we selected the loiano telescope for our photometer, because it is provided with a special focal plane arrangement which allows several instruments to be mounted simultaneously. a simple flip mirror enables them to be fed alternately. two further separate probes on the focal plane feed the guiding camera and an auxiliary camera. the target is pointed with the main ccd instrument (bfosc) of the telescope, permanently mounted on-axis. the flip mirror can redirect the light of the target to the first of our detectors through an optical fiber, without changing the focus position. the sky signal is recorded by a second optical fiber located at a distance of 17 mm from the first one. the third optical fiber is positioned in the place of the auxiliary camera and can look at a reference star using the independent probe on the focal plane. we determined the position of a source on the ccd detector of bfosc when it is centered on the sipm sensor, so we can point a source with bfosc and then flip the mirror to get the signal on the sensor itself.

faint sources (16 mag) can be observed with 1 ms integration time and with a signal-to-noise ratio (s/n) ∼ 1 with this telescope. the calibration of the number of photons detected by our photometer was obtained by comparing the convolution integral of the absolute flux, derived from stars in the jacoby catalog, with the sipm pde and with the transmittance of the johnson b and v filters, respectively. figure 4 reports the expected sensitivity in magnitude (delta mv) as a function of visual magnitude (mv), varying the mppc integration gate length from 1 ms up to 10 s.

fig. 4: magnitude variation sensitivity (delta mv) as a function of a given magnitude (mv), for various gate time durations
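the pogson's law-like calibration behind fig. 3 reduces, per detector, to a logarithmic relation between the background-subtracted count rate and the instrumental magnitude. the matlab sketch below is our own illustration; the zero point zp and the count values are placeholders, not the paper's fitted numbers, since the zero point must be fitted per telescope and filter.

zp          = 25.0;                    % hypothetical photometric zero point
counts_star = 1.8e5;                   % detected photons from the target fibre
counts_sky  = 6.0e4;                   % photons from the nearby sky fibre
mag = zp - 2.5*log10(counts_star - counts_sky)   % pogson-like instrumental magnitude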
fig. 5: power spectra of the crab pulsar signal detected by mppc0 (upper) and p3e0 (lower)

fig. 6: crab pulsar light curves folded by "efold" for mppc0 (upper) and p3e0 (lower)

3 observational test: the crab pulsar

on february 5, 2011 we observed the crab pulsar for 3300 seconds with 0.55 ms (p3e0) and 1 ms (mppc0) sampling in good photometric conditions (seeing ∼ 1.5 arcsec). the fourier power spectra for both mppc0 and p3e0 show the typical crab pulsar characteristics (figure 5). refined data analysis was performed using the xronos software from heasarc [6] with the following steps:

a) correction for the motion of the earth to reduce the data to the sun barycentre with "earth2sun";
b) the best fitting period was then searched with "efsearch", finding a result in agreement (within 3 μs) with the radio ephemeris from jodrell bank (p = 0.033652394 s) [7];
c) the folded light curve was computed with "efold" and is reported in figure 6.

the flux ratio between the primary and secondary pulse is in fair agreement with the literature (e.g. [8]).
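the epoch-folding step c) is easy to emulate. the matlab sketch below is a minimal stand-in in the spirit of "efold", not the heasarc code itself; it assumes barycentre-corrected time stamps and uses a random placeholder data stream (60 s only, to keep the example light) binned by pulse phase with the jodrell bank period.

P     = 0.033652394;                   % crab period [s], from [7]
nbin  = 50;                            % phase bins of the folded light curve
t      = 0:5.5e-4:60;                  % 0.55 ms sampling (p3e0), 60 s of data
counts = rand(size(t));                % placeholder for the real count stream
phase  = mod(t, P)/P;                  % pulse phase in [0, 1)
idx    = min(floor(phase*nbin) + 1, nbin);
folded = accumarray(idx.', counts.') ./ accumarray(idx.', 1);  % mean counts per phase bin
plot((0.5:nbin)/nbin, folded)          % folded light curve, cf. fig. 6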
the optimization is carried out with regard to an overall power plant conception for which the decrease in efficiency due to co2 separation is quantified.

keywords: ccs, oxyfuel, oxyfuel boiler.

1 introduction

ecological requirements on greenhouse gas emissions have led to the development of co2 capture and storage techniques for power facilities. one of these methods is oxyfuel, which is based on fuel combustion with oxygen. substituting air by oxygen fundamentally changes not only the combustion conditions but also the flue gas convection and heat transfer in the boiler. these changes have a significant impact on boiler design.

2 impact of oxygen combustion on boiler design

the main benefit of hydrocarbon fuel combustion with oxygen is that the flue gas consists mostly of carbon dioxide and water vapor. dry, co2-rich flue gas can be obtained by condensing the water vapor, which simplifies and reduces the cost of co2 separation for subsequent storage. large-scale production of oxygen is required for the process. for energetic and economic reasons, oxygen with 95 % purity is extracted from the air by cryogenic distillation. another consequence of combustion with oxygen instead of air is that the combustion takes place at a high temperature with 1/3 of the flue gas flow rate. the flue gas has a different composition, and therefore different physical and chemical properties. these modifications impose new requirements on oxyfuel boiler design, which can be summarized as follows:

1. flue gas recycling needs to be employed in order to lower the combustion temperature and raise the flue gas flow rate. according to generally presented recommendations, backed up by our own calculations, a 70 % flue gas recycling ratio is considered (i.e. approximately twice the amount of flue gas that forms).

2. the boiler will lack an air preheater. the temperature of the exhaust flue gas will be fairly high, in accordance with the feed water temperature. in order to lower the exhaust temperature, we consider the possibility of employing a flue gas feed water preheater in parallel with the regenerative steam feed water heaters.

3. the boiler has to be perfectly leak-proof. false air suction has to be reduced to an absolute minimum. the reason for this is the radical effect of false air suction on increasing the amount of flue gas and on lowering the co2 concentration. false air suction needs to be eliminated, even in auxiliary equipment operating at sub-atmospheric pressure, e.g. coal mills and electrostatic precipitators. this requirement cannot be met in practical applications for widely-used beater wheel mills, where the coal is simultaneously dried by the hot flue gas. an external coal drying method is recommended in order to separate coal drying from the boiler altogether. a wta method (german abbreviation standing for fluidized-bed drying with internal waste heat utilization) is proposed for coal drying. in this way, the efficiency of the power block could be raised by several percentage points.

4. the size of the heating surfaces has to be optimized in order to comply with the power output and steam properties requirements, taking into account the modified heat exchange proportions due especially to the lower flue gas flow and the different substance properties.

3 task procedure

the goal of the task was to elaborate a steam boiler study for consideration in a particular oxyfuel technology application in the conditions of the czech republic.
the prunéřov ii (epr ii) power plant was selected as a suitable candidate for applying the model. a complex reconstruction is under preparation at the moment, and the disposition offers favorable technology layout conditions. the preparation process for the comprehensive reconstruction is in an advanced stage, and the main technological components, including the boiler, have already been designed. changes to the boiler design, required for oxygen coal combustion, have been proposed and quantified using thermal calculations. the task has been divided into four phases:

1. the epr ii boiler model was created using the project documentation of the new boiler as a reference case of air combustion.
2. a general boiler modeling software was modified for oxygen combustion.
3. oxygen combustion with no major changes to the boiler design was calculated — current boiler reconstruction.
4. boiler design optimization for oxygen combustion was carried out — new equipment design.

4 reference boiler and model description

new once-through two-pass benson boilers with steam reheating will be installed in the boiler room. this layout best complies with the specific boiler room area and the new boiler placement requirements in the existing supporting structure. the boiler parameters are:

superheated steam: p = 18.262 mpa, t = 575 °c, m = 183.44 kg/s (660.384 t/hr)
reheated steam: t = 580 °c
cold reheat steam: p = 3.947 mpa, t = 352.0 °c, m = 164.826 kg/s (593.376 t/hr)
feed water: p = 23.362 mpa, t = 250.9 °c, m = 660.384 t/hr

the boilers are designed for combusting coal with the following properties:

lower heating value: 9.75 mj/kg
water content (raw): 31 % (mass)
ash (dry): 41 % (mass)
sulfur (dry): 3.0 % (mass)

the new boiler is shown in figure 1.

figure 1: epr ii 250 mwe boiler — air combustion reference case

a computation model of the reference boiler was created by inserting the design data into the general steam boiler static operation mode simulation application ffb, created at the faculty of mechanical engineering at ctu in prague. the model setup consists of the following steps:

• separate the boiler into balance volumes.
• enter the connection order of water and steam, flue gas, and air.
• define the heat transfer from flue gas to water/steam/air.
• specify the geometric characteristics of the furnace and the heating surfaces.

the model was initially tuned for air combustion, so that the operation characteristics matched its design calculation in rated operation mode. this constitutes the reference case model for further modifications and comparison.

5 modifying the model for oxygen combustion

the general model created for air combustion was modified for oxygen combustion conditions. a fundamental change had to be made to the stoichiometric calculation of the oxidizer and the flue gas volume. a different method had to be used to balance the false air suction and flue gas recycling. on the basis of background research and preliminary computations, the following operating conditions were set for oxygen combustion:

1. oxygen with 95 % purity will be available. the remaining 5 % is assumed to be a nitrogen-argon mixture.
2. the combustion will operate with a 7 % oxidizer excess.
3. 5 % false air suction is assumed in the furnace (qualified as excess air).
4. the recycled flue gas will be extracted downstream from the flue gas filters (upstream from fgd). the recycling rate lies between 66 % and 70 % of the total flue gas flow.
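two of the operating conditions above can be checked with a back-of-envelope sketch: a recycling fraction r of the total flue gas flow returns r/(1−r) times the freshly formed flue gas, and even a small false air suction visibly dilutes the co2 concentration. the flue gas composition used below is an assumed illustrative value, not a result of the study.

```python
def recycle_to_formed_ratio(r):
    # recycled / freshly formed flue gas for a recycle fraction r of total flow
    return r / (1.0 - r)

for r in (0.66, 0.70):
    print(f"r = {r:.2f}: recycled = {recycle_to_formed_ratio(r):.2f} x formed flue gas")

# dilution of dry-basis CO2 by false air suction (assumed composition)
co2_dry = 0.90          # assumed CO2 fraction of dry flue gas without leakage
false_air = 0.05        # false air as a fraction of the flue gas flow
co2_diluted = co2_dry / (1.0 + false_air)
print(f"CO2 drops from {co2_dry:.0%} to {co2_diluted:.1%} of dry flue gas")
```

for r = 0.70 the recycled stream is 2.33 times the formed flue gas, consistent with the "approximately twice" rule of thumb quoted in section 2.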
another significant design change is the addition of external coal drying, utilizing the wta method. this measure substantially reduces false air suction into the boiler. it is assumed that the coal will be dried to 12 % water content. this will raise its lower heating value, lower the flue gas flow rate, and increase the efficiency of the boiler.

6 boiler recalculation for oxygen combustion

the modified reference boiler model was used for recalculating the oxygen operation mode. in the first stage, the boiler dimensions and the layout and size of the heating surfaces were left unchanged. in this way, the transition from the current boiler operating with air to oxygen combustion was simulated. only the flue gas recycling was optimized. the calculations showed that it is not possible to match the temperature and flue gas flow rate conditions for both air and oxygen combustion only by regulating the flue gas recycling. the amount of recycled flue gas was set to 2.21 times the oxidizer flow rate, which represents 67 % of the total flue gas flow rate through the recycling extraction point. in this case, the flue gas temperature downstream from the combustion chamber is the same as in the conventional boiler. however, the flue gas flow rate is 22.6 % lower, which leads to a lower flue gas flow velocity through the convective heating surfaces, and therefore their power output is lowered. the superheaters are most affected by this change, since both have a convective characteristic, which leads to reheated steam underheating by 30 °c even with full biflux utilization.

figure 2: air boiler modified for oxygen combustion — enlarged reheater

the absence of an air heater causes a very high flue gas temperature at the outlet of the boiler downstream from the economizer (310 °c). flue gas recycling may help to some extent to increase the reheated steam temperature, at the cost of a further increase in the flue gas outlet temperature, thus downgrading the efficiency of the boiler. on the basis of these results, it can be stated that it is impossible to change the operation mode of the current boiler from air to oxygen simply by employing flue gas recycling while complying with the required steam properties, especially for the reheated steam. the proposed solution to this problem was to enlarge the surface of the steam reheater. a fourth reheater bundle with the same design as the other three was placed in the free space above the first bundle, and one loop of the output reheater was added into the transition pass. the heating surface of the reheater increased by approximately one quarter. the boiler with these modifications is shown in figure 2. the reheater enlargement allowed for a higher reheated steam temperature, though it is still 5 °c lower than the nominal temperature. enlarging the reheater has no significant impact on the outlet flue gas temperature. the calculations indicate that it is not a simple matter to reconstruct the current pulverized coal-fired boiler from air to oxygen combustion. it would be necessary to utilize flue gas recycling and to modify the heating surfaces, in particular the reheater and probably also the economizer. the additional necessary modifications to the boiler, e.g. sealing the air leakage, adjusting the airway, and replacing selected surface materials and burners, mean that converting the boiler to oxygen combustion might prove to be simply too complicated.
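the sensitivity of the convective superheaters to the 22.6 % lower flue gas flow can be illustrated with a minimal sketch assuming a dittus-boelter-type scaling of the gas-side heat transfer coefficient with velocity (h ∝ v^0.8); this is a textbook approximation, not the thermal calculation actually used in the study.

```python
def convective_power_ratio(flow_ratio, exponent=0.8):
    """Relative gas-side heat transfer coefficient of a convective surface
    when the flue gas flow (and hence velocity) changes, assuming a
    Dittus-Boelter-type scaling h ~ v**0.8; other gas properties are held
    constant for simplicity."""
    return flow_ratio ** exponent

# 22.6 % lower flue gas flow rate, as computed for the unmodified boiler
ratio = convective_power_ratio(1.0 - 0.226)
print(f"gas-side heat transfer coefficient falls to {ratio:.1%} of reference")
```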
it is worth considering replacing the current boiler with a new boiler that has been optimized for oxygen combustion.

figure 3: boiler optimized for combustion with oxygen

7 boiler optimization for combustion with oxygen

the analysis above shows that it is appropriate to modify and optimize the conventional air combustion boiler design for combustion with oxygen. despite intensive utilization of flue gas recycling, the temperature in the furnace will be higher, and the flue gas flow rate will be lower for oxygen combustion than for air combustion. it is therefore desirable to reduce the size of the radiant heating surfaces, especially the evaporator, and to increase the size of the convective heating surfaces or to thicken the pipe distribution, if this can be done without fouling. the following design modifications were implemented to optimize the boiler:

1. the size of the evaporator was reduced. the horizontal section of the furnace was scaled down from 14.97 × 14.97 m to 12 × 12 m. the boiler was thus narrowed by approx. 3 meters. the height of the boiler and the side dimension of the second pass remain the same.
2. the number of plates of the superheater in the upper part of the furnace has been preserved, but their span has been reduced to 1.09 m.
3. the pipe span of the output superheater remains unchanged; its surface has been reduced by 19 %.
4. a third loop has been added to the output reheater, and the transverse span of its pipes has been reduced to 188 mm. the number of parallel pipes has thus been lowered by 10 %, and the heating surface has been enlarged by 16 %.
5. a fourth bundle has been added to the input reheater, and the pipe span has been reduced to 144 mm, which reduces the diameter of the hanging pipes to 32 mm. the surface is 8 % larger than the surface of the reference boiler.
6. the economizer has been extended by two bundles, and its surface has been increased by 44 %.

figure 4: thermal cycle optimized for oxyfuel technology

table 1: heating surface change

                       reference [m2]   oxyfuel [m2]   difference [%]
economizer             8 823            12 715         144.1
evaporator             2 149            1 715          79.8
platen superheater 1   357              329            92.2
platen superheater 2   450              422            93.8
output superheater     1 909            1 552          81.3
input reheater         7 482            8 077          108.0
output reheater        3 095            3 608          116.6

the flue gas recycling ratio remains at the previous value. the furnace was reduced in size so that the flue gas temperature at its outlet remains at approximately the same value as for the reference boiler. this, along with maintaining the same span of the output superheater pipes, should prevent slagging. the heating bundles downstream from the input reheater have a lower pipe span in order to achieve a higher flue gas velocity. the possibility of reducing the lateral span of the heating bundles in the second pass is limited by the diameter of the hanging pipes, which has been reduced from 38 mm to 32 mm. in spite of all efforts, it was not possible to achieve the same flue gas velocity through the heating surfaces as in the reference case; however, the difference is quite small. the final sizes of the heating surfaces are presented in table 1. the results of the thermal calculation show that the superheated steam temperature requirement can be met by increasing the size of the superheater, and the flue gas outlet temperature can be lowered to 284 °c by increasing the size of the economizer.
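the limit encountered at the economizer can be sketched with the usual counter-flow relation q = u·a·Δtlm: as the cold-end temperature difference shrinks, the required surface grows progressively. the hot-end difference of 80 k below is an assumed illustrative value, not a figure from the study.

```python
import math

def lmtd(dt_hot_end, dt_cold_end):
    # log-mean temperature difference of a counter-flow heat exchanger
    if dt_hot_end == dt_cold_end:
        return dt_hot_end
    return (dt_hot_end - dt_cold_end) / math.log(dt_hot_end / dt_cold_end)

# illustrative: fixed duty Q and coefficient U, hot-end difference held at 80 K;
# required surface A = Q / (U * LMTD), relative to a 33 K cold-end difference
ref = lmtd(80.0, 33.0)
for dt_cold in (33.0, 20.0, 10.0, 5.0):
    print(f"cold end {dt_cold:4.0f} K -> relative surface {ref / lmtd(80.0, dt_cold):.2f}")
```

halving the cold-end difference from 33 k to about 10 k already requires roughly 60 % more surface in this toy model, which is the "progressive increase" referred to below.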
it is difficult to achieve a lower flue gas outlet temperature, due to the low temperature difference, equal to 33 °c, at the cold end of the economizer. further lowering of the flue gas temperature would mean a progressive increase in the size of the economizer, for which there is not enough space, and undesirable evaporation could occur at smaller loads. the higher outlet flue gas temperature has no significantly degrading influence on boiler efficiency, since the flue gas flow rate is about a third of the flow rate for the reference case.

8 ways of cooling the outlet flue gas

some earlier studies assume that the flue gas will be cooled by heating the oxidizer. in our opinion, this is a complicated solution in which valuable oxidizer can be lost, leaking into the flue gas due to low temperature corrosion, because the acid dew point of the flue gas is around 170 °c. for this reason, an alternative solution featuring flue gas cooling by feed water within the scope of regenerative preheating was taken into consideration for optimizing the thermal cycle. the solution is shown in figure 4. in this way, it is possible to cool the outlet flue gas down to 200 °c. although this measure lowers the thermal cycle efficiency, the total contribution to the net unit efficiency is positive, amounting to 1.5 percentage points. a further attempt to lower the outlet flue gas temperature by lowering the feed water temperature, achieved by reducing the number of high pressure regenerative heaters, proved to be less effective.

9 conclusion

this paper has presented the latest results of a study on a pulverized coal steam boiler for oxyfuel technology, elaborated within the scope of research project tip no. fr-ti1/379, supported by the czech ministry of industry and trade. the goal is to draw attention to the differences and difficulties in boiler design for combustion with oxygen, and to derive an optimized oxyfuel boiler solution from its air combustion variant. the results indicate that a transition to this technology would require a substantial range of modifications to the existing boilers, which leads us to the recommendation that full-scale replacement of the boiler is a better option. the design of the burners and the selection of the materials are challenges that are beyond the scope of this paper.

acknowledgement

this paper uses findings resulting from work done on tip research project no. fr-ti1/379, which was supported by the czech ministry of industry and trade.
acta polytechnica vol. 51 no. 2/2011

cosmology with gamma-ray bursts using k-correction

a. kovács, z. bagoly, l. g. balázs, i. horváth, p. veres

abstract

in the case of gamma-ray bursts with measured redshift, we can calculate the k-correction to get the fluence and energy that were actually produced in the comoving system of the grb. to achieve this we have to use well-fitted parameters of the grb spectrum, available in the gcn database. the output of the calculations is the comoving isotropic energy eiso, but this is not the endpoint: this data can be useful for estimating the ωm parameter of the universe and for making a grb hubble diagram using amati's relation.

keywords: k-correction, gamma-ray burst, cosmology, hubble diagram, density parameter of matter.

1 introduction

several papers present how to make k-corrections (e.g. [1]). we will only note here the principles and the most important considerations. a typical grb spectrum has three main available parameters: the peak energy (epeak) and the low- and high-energy spectral indices (α, β). these are well fitted with the band function [2] and the cutoff power-law function (hereafter cpl). the motivation of the k-correction is related to the fact that the energy distribution measured with gamma-ray detectors placed on satellites is not equal to the energy that the burst released in its comoving system. the goal is to correct the redshifted grb spectrum. we will now show the main steps of this method and the meaning of k.

2 theory of the k-correction

by definition, the fluence is the integral of the band/cpl function over the energy range in which the detector is responsive (see equation 1):

$$S_{[a,b]} = \int_a^b \Phi(E)\,\mathrm{d}E \qquad (1)$$

if the redshift is taken into account, the formula for the bolometric energy has to be modified:

$$E_{[E_1,E_2]} = \frac{4\pi d_L^2}{1+z}\, S_{[e_{\min},e_{\max}]} \cdot \frac{S_{[E_1/(1+z),\,E_2/(1+z)]}}{S_{[e_{\min},e_{\max}]}} \qquad (2)$$

where e1 = 1 kev and e2 = 10 000 kev, conventionally. the detectors do not measure the fluence between [e1/(1 + z), e2/(1 + z)]; each of them detects in an interval [emin, emax], so the fluence has to be corrected, as can be seen in equations (2) and (3). hereafter e[e1,e2] is named eiso.

$$E_{[E_1,E_2]} = \frac{4\pi d_L^2}{1+z}\, S_{[e_{\min},e_{\max}]} \cdot k \qquad (3)$$

now we can see the meaning of k: it is the factor that multiplies the fluence, and therefore the energy. the most necessary parameters are also clear. the gamma-ray burst coordinate network (gcn) provides a good database for the calculations. we only have to search for grbs with measured redshift and decide whether the fit of the spectrum contains a peak energy parameter or not. this is important, because it is needed when working with amati's relation [3]. we finally found 72 grb samples, most of which are konus-wind data; some data is from other sources. of course, the data of one grb sample is never mixed between instruments.
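a minimal numerical sketch of equations (1)–(3) for a cpl spectrum is given below; the detector band, the spectral parameters and the redshift are illustrative assumptions standing in for the gcn fit values.

```python
import numpy as np
from scipy.integrate import quad

def cpl(e, alpha, e_peak):
    # cutoff power-law photon spectrum Phi(E); E0 follows from Epeak
    e0 = e_peak / (2.0 + alpha)
    return e ** alpha * np.exp(-e / e0)

def fluence_shape(e_lo, e_hi, alpha, e_peak):
    # integral of E * Phi(E), proportional to the energy fluence in [e_lo, e_hi]
    return quad(lambda e: e * cpl(e, alpha, e_peak), e_lo, e_hi)[0]

def k_factor(z, alpha, e_peak, det=(20.0, 2000.0), bol=(1.0, 1.0e4)):
    # k of equation (3): the bolometric band blueshifted into the detector
    # frame, divided by the observed detector-band fluence (normalizations cancel)
    num = fluence_shape(bol[0] / (1 + z), bol[1] / (1 + z), alpha, e_peak)
    den = fluence_shape(det[0], det[1], alpha, e_peak)
    return num / den

# illustrative values: a Konus-Wind-like band, alpha = -1, Epeak = 300 keV, z = 1
print(k_factor(z=1.0, alpha=-1.0, e_peak=300.0))
```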
it is important to say some words about the error calculations. we used a somewhat unusual method: because each measured parameter in the gcn database has an error, we generated random gaussian distributions whose mean value was the measured number and whose variance was the error. the k-corrections were made with these numbers, including those from the tails of the curves, so we finally obtained a distribution for every grb energy that has a mean value and a variance. we thus identified the error with the variance in each case. with this procedure, the δeiso errors are approximately one order of magnitude smaller than the eiso energies.

3 results of the k-corrections

we can clearly see from figure 1 that the most frequent k value is approximately 1.2. figure 2 shows the distribution of the isotropic energies. these energies are used when we want to test the amati relation [4], which states that there is a correlation between epeak(1 + z) (epeak in kev) and log10(eiso) (see figure 3). there are some outlier points: these are short grbs (def.: t90 < 2 s), which do not follow this relation. the stars refer to points using the cpl function, the squares to points from the band function. our group wanted to test the trimodality of the grb distribution. as mentioned above, short grbs do not follow the amati relation, but the others can also be separated into groups in the amati plane. figure 4 shows these datapoints. there is an overlap between the intermediate (2 s < t90 < 10 s) and the long type groups (marked ∗), so the separation cannot be seen in our data. with more bursts in the future it may become visible.

fig. 1: distribution of k
fig. 2: distribution of log10(eiso), where eiso is measured in ergs
fig. 3: visualization of amati's relation
fig. 4: the intermediate and the long grb groups

4 cosmological applications

we will now present the most interesting results that can be derived from the dataset: the hubble diagram and an estimation of ωm. the first result is based on two important findings: the accelerating expansion of the universe from supernova cosmology project data [5] and earlier results using grbs [6]. our method was very simple, as we just wanted to make estimations. we were curious whether the time-dilatation effect is seen or not. let us note the steps in this procedure: first of all, consider that the amati relation is real and fit a straight line to the datapoints. after this, we can calculate dl using the eiso values related to the two parameters of the straight line. finally, we assume that the points lie exactly on the line. in this way we can get the luminosity distance in a different way, as written below in equation (4):

$$d_L = \left( \frac{E_{\mathrm{iso}}\,(1+z)}{4\pi k\, S_{[e_{\min},e_{\max}]}} \right)^{1/2} \qquad (4)$$

the last step in this operation is to put the derived dl into the well-known distance modulus formula (see equation 5):

$$\mu = 5 \log_{10} d_L - 5 \qquad (5)$$

and the final result is figure 5.

fig. 5: the grb hubble diagram
fig. 6: a cosmological probe: 1/ρ as a function of ωm

it is seen that the points are mostly above the theoretical curve, so it is clear that time dilatation causes the same effect as in the case of supernovae. the final result is the most important one: an estimation of ωm, the density parameter of matter.
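before turning to the ωm estimate, the monte carlo error propagation described above can be sketched as follows, reusing the k_factor function from the previous sketch; the parameter values, the sample size and the gaussian width are again illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

def mc_error(measured, err, func, n=1000):
    """Propagate a quoted measurement error by drawing Gaussian realizations
    of the parameter and collecting the distribution of the derived quantity."""
    draws = rng.normal(measured, err, size=n)
    values = np.array([func(x) for x in draws])
    return values.mean(), values.std()

# illustrative: propagate an assumed Epeak error into the k factor of the
# previous sketch (alpha and z held fixed for brevity); k_factor must be in scope
mean_k, sigma_k = mc_error(300.0, 30.0, lambda ep: k_factor(1.0, -1.0, ep))
print(f"k = {mean_k:.3f} +/- {sigma_k:.3f}")
```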
when the k-corrections were calculated, we had to give some input parameters, for example the quantity mentioned above, because dl depends on the cosmology (see equation 6). it is clear that if we change these parameters, amati's relation will also change.

$$d_L(z, \Omega_M, \Omega_\Lambda, H_0) = \frac{c\,(1+z)}{H_0} \int_0^z \left[ (1+z')^2 (1+\Omega_M z') - z'(2+z')\,\Omega_\Lambda \right]^{-1/2} \mathrm{d}z' \qquad (6)$$

we can ask under what conditions we can get the highest correlation on the amati plane. to find the answer, the calculations were redone for many values of ωm, considering that ωtotal = 1, i.e. a flat universe, although other models could be tested, too. we wanted to represent the correlation simply with the correlation coefficient ρ (see figure 6). this has a maximum when the correlation is highest, but the results of some earlier papers [7] use methods where there is a minimum in the same case, and we therefore used the 1/ρ form in figure 6. we can see that the minimum is at 0.2, which would be the optimal ωm for obtaining the highest correlation.

5 conclusions

our method, which is based on the k-correction, is a useful tool for cosmology. however, our estimation of ωm is less than the value obtainable from the wmap data [8], which may be the most precise data that is available. our results are approximately the same as the results of earlier grb studies, although we have used different data.

acknowledgement

the project is supported by the european union, co-financed by the european social fund (grant agreement no. tamop 4.2.1./b-09/1/kmr-2010-0003), and in part through otka k077795, otka/nkth a08-77719, and a08-77815 (z.b.) grants.

references
[1] bloom, j. et al.: the prompt energy release of gamma-ray bursts using a cosmological k-correction, the astronomical journal, 2001, vol. 121, issue 6, pp. 2879–2888.
[2] band, d. et al.: batse observations of gamma-ray burst spectra. i – spectral diversity, the astrophysical journal, vol. 413, pp. 281–292.
[3] amati, l. et al.: intrinsic spectra and energetics of bepposax gamma-ray bursts with known redshifts, astronomy and astrophysics, vol. 390, pp. 81–89.
[4] amati, l.: the ep,i–eiso correlation in gamma-ray bursts: updated observational status, re-analysis and main implications, monthly notices of the royal astronomical society, 2006, vol. 372, issue 6, pp. 233–245.
[5] perlmutter, s. et al.: measurements of omega and lambda from 42 high-redshift supernovae, the astrophysical journal, vol. 517, issue 2, pp. 565–586.
[6] schaefer, b. e.: the hubble diagram to redshift > 6 from 69 gamma-ray bursts, the astrophysical journal, 2007, vol. 660, issue 1, pp. 16–46.
[7] amati, l. et al.: measuring the cosmological parameters with the ep,i–eiso correlation of gamma-ray bursts, monthly notices of the royal astronomical society, 2008, vol. 391, issue 2, pp. 577–584.
[8] komatsu, e. et al.: five-year wilkinson microwave anisotropy probe observations: cosmological interpretation, the astrophysical journal supplement, vol. 180, issue 2, pp. 330–376.

andrás kovács, dept. of physics of complex systems, eötvös university, h-1117 budapest, pázmány p. s. 1/a, hungary
zsolt bagoly, dept. of physics of complex systems, eötvös university, h-1117 budapest, pázmány p. s. 1/a, hungary
lajos g. balázs, konkoly observatory, h-1525 budapest, pob 67, hungary
istván horváth, dept. of physics, bolyai military university, budapest, pob 15, h-1581, hungary
péter veres, dept. of physics of complex systems, eötvös university, h-1117 budapest, pázmány p. s. 1/a, hungary
dept. of physics, bolyai military university, budapest, pob 15, h-1581, hungary

acta polytechnica vol. 52 no. 1/2012

ultra low dispersion spectroscopy with gaia and astronomical slitless surveys

r. hudec, l. hudec, m. klíma

abstract

the ultra-low dispersion spectroscopy to be applied in the esa gaia space observatory and the ground-based objective-prism plate surveys represent a similar type of astrophysical data. although the dispersion in plate surveys is usually larger than in the gaia blue and red photometers (bp/rp), the spectral resolutions differ by a factor of 2–3 only, since the resolution in ground-based spectra is seeing-limited. we argue that some of the algorithms developed for digitized objective-prism plates can also be applied to the gaia spectra. at the same time, the plate results confirm the feasibility of observing strong emission lines with gaia rp/bp.

keywords: astronomical spectroscopy, low dispersion spectroscopy, slitless spectroscopy, gaia.

1 introduction

gaia is a primary esa astrometry mission, to be launched in 2013. the satellite payload consists of a single integrated instrument, the design of which is characterised by a dual telescope concept with a common structure and a common focal plane (http://sci.esa.int/gaia/, 2011). both telescopes are based on a three-mirror anastigmat design. beam combination is achieved in image space with a small beam combiner. silicon-carbide ultra-stable material is used for the mirrors and the telescope structure. there will be a large common focal plane with an array of 106 ccds. the large focal plane also includes areas dedicated to the spacecraft's metrology and alignment measurements. three instrument functions/modes are designed: (i) an astrometric mode for accurate measurements, even in densely populated sky regions of up to 3 million stars per square degree, (ii) a photometric mode based on low-resolution, dispersive spectrophotometry using the blue and red photometers (bp and rp) for continuous spectra in the 330–1000 nm band for astrophysics and for chromaticity calibration of the astrometry (jordi and carrasco, 2007), and (iii) a spectroscopic (rvs) mode for high resolution, with a grating, covering a narrow band: 847–874 nm. the expected limiting magnitude is 20 in photometric mode.

in this contribution, we discuss the data expected to be provided by the bp/rp photometers, and show that they will represent ultra-low dispersion spectra which can be used in various astrophysical projects. we compare this data with analogous data provided by plate surveys.

2 gaia photometers and simulations

in this paper we focus on the "photometric mode" rp/bp. use of the dispersive element (prism) generates ultra-low dispersion spectra. one disperser, called bp for blue photometer, operates in the 330–660 nm wavelength range; the other, called rp for red photometer, covers the 650–1000 nm wavelength range. the dispersion is higher at short wavelengths, and ranges from 4 to 32 nm/pixel for bp and from 7 to 15 nm/pixel for rp. it should be noted, however, that the photometric ccds are located at the edge of the focal plane, where the quality of the images is more sensitive to aberrations than the astrometric images (straizys et al., 2010). the bp and rp spectra will be binned on-chip in the across-scan direction; no along-scan binning is foreseen. rp and bp will be able to reach object densities on the sky of at least 750 000 objects deg−2. the resulting complex images can be simulated by the gibis simulator (figure 1).
gibis is a pixel-level simulator of the gaia mission intended to simulate how the gaia instruments will observe the sky, using realistic simulations of the astronomical sources and of the instrumental properties. it is a branch of the global gaia simulator (gaiasimu) under development within gaia coordination unit 2: data simulations.

fig. 1: left: a prism astronomical survey plate with a 1.8 degree prism (case/pari); right: an rp image simulated by the gaia gibis simulator (visualization in ds9). this simulated image illustrates the image wings mentioned by straizys et al., 2006

fig. 2: image analyses of objects on low dispersion spectroscopic plates of the la paz german bolivia southern sky survey. digitised (using a usb microscope) star image (left), the spectrum (2d) plot (centre), and the 3d plot (right). the 3d plot allows some details to be studied that are not included in the 2d plot, such as the image distortion (image wings) caused by the optics that are used

3 ultra low dispersion spectral plate databases

low dispersion spectroscopy (lds) astrophysics was developed and performed at numerous observatories between ca 1910 and 1980. in our project we have analysed the oldest lds plates at the carnegie observatories in pasadena, ca, usa. these lds plates were taken in 1909 with excellent quality, albeit a limited fov. mostly lds with schmidt telescopes (plates with an objective prism) was used for various projects, e.g. qso, emission line and h-alpha surveys, star classification, etc., though some lds surveys were performed with refractors. this technique was, however, little used after 1980. the plate databases were previously mostly evaluated only by manual methods, hence the application of advanced computer methods to this data can yield many new (and probably unexpected) results. some of these surveys are listed below (the dispersion data is given in the next section, and we also note that many other similar surveys exist):

(1) schmidt sonneberg camera. sky survey (selected fields) with a 50/70 cm schmidt telescope. no online access yet, but the scans can be provided upon request (http://www.stw.tu-ilmenau.de/).

(2) bolivia expedition spectral plates. these plates offer homogeneous but not full coverage of the southern sky (90 southern kapteyn's selected areas nos. 116–206 were covered with plates representing 10 × 10 degrees each, hence 9 000 square degrees in total) with spectral and direct plates, directed by the potsdam observatory. the plates are stored at the sonneberg observatory (http://www.stw.tu-ilmenau.de/) and were taken between 1926–1928; in total about 70 000 prism spectra were estimated and published in the potsdam publications, see becker (1929) and following papers. see figure 2 for an example of this type of lds data.

(3) hamburg quasar survey. a wide-angle objective prism survey searching for quasars with b < 17.5 in the northern sky. the survey plates were taken with the former hamburg schmidt telescope, located at calar alto/spain since 1980. online access (http://www.hs.uni-hamburg.de/de/for/exg/sur/index.html).

(4) byurakan survey. the digitized first byurakan survey (dfbs) is the digitized version of the first byurakan survey (fbs).
it is the largest spectroscopic database in the world, providing low dispersion spectra for 20 000 000 objects on 1 139 fbs fields (17 056 deg2). online access (http://byurakan.phys.uniroma1.it/). sky coverage: dec > −15 deg, all ra (except the milky way). the prism spectral plates were taken with the 1 m schmidt telescope. limiting magnitude: 17.5 in v. spectral range: 340–690 nm; spectral resolution 5 nm.

fig. 3: lds spectra of two starburst galaxies on pari institute plates

fig. 4: examples of objects with prominent spectral features on lds spectra, pari institute plates. left: a qso; right: the wn8h star v378 vel. the image on the right is an example of lds with higher spectral resolution. on these plates numerous emission and absorption lines are visible, and various algorithms are to be developed and applied

(5) spectral survey plates in the astronomical photographic data archive (apda), located at the pisgah astronomical research institute (pari), usa, e.g. the case qso survey (http://www.pari.edu/library). telescope: 61/91 cm burrell schmidt at kitt peak, 1.8 deg prism, plate fov: 5 × 5 degrees, limiting b magnitude: 18, emulsion: iiiaj baked, spectral range: 330 nm to 530 nm (figures 3, 4).

(6) karl henize h-alpha plate collection (located since 2010 at pari) — the michigan-mount wilson southern h-alpha survey (henize, 1954). a newly (in 2010) re-discovered, highly valuable plate collection: 290 high quality plates, 15 × 15 inches, taken in 1950–1952 in south africa by a dedicated telescope by karl henize. telescope aperture d = 25 cm, dispersion 45 nm/mm at h-alpha, various filters used (henize, 1954).

the spectral dispersion of gaia bp/rp was predetermined, with no opportunity for the scientific community to influence the requirements. an important question is whether the dispersion of these devices is sufficient to detect and to study bright spectral features/emission lines. this question was answered e.g. by the extended work of the us astrophysicist and nasa astronaut karl henize, who spent a large part of his scientific career on low dispersion spectroscopy with an objective prism. we have found the original low dispersion spectral plates that he took about 60 years ago in south africa and we have analysed them extensively. we found and investigated these plates (probably the complete henize collection) at the pari (pisgah astronomical research institute) institute, nc, usa. the plates show numerous examples of objects with very prominent and very wide emission lines, which he found in a very extended, time-consuming and laborious project.

4 ultra low dispersion images by gaia rp/bp: algorithms and a comparison with plate surveys

the algorithms for automated analyses of digitised spectral plates were developed by computer science students (e.g. hudec, 2007). the main goals are as follows: automated classification of spectral classes, searches for spectral variability (both continuum and lines), searches for objects with specific spectra, correlation of spectral and light changes, searches for transients, and application to gaia. the archival spectral plates taken with the objective prism offer the possibility to simulate the gaia low dispersion spectra and related procedures, such as searches for spectral variability and variability analyses based on spectro-photometry. we focus on sets of spectral plates of the same sky region covering long time intervals with good sampling; this enables simulation of the gaia bp/rp outputs.
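one way such a simulation could proceed is sketched below: a digitized plate spectrum is degraded to a bp/rp-like resolution by gaussian convolution. the resolutions (5 nm for the plate, 18 nm for gaia, cf. the comparison in the next paragraphs) and the toy spectrum are assumptions for illustration only.

```python
import numpy as np

def degrade_resolution(wave_nm, flux, fwhm_in=5.0, fwhm_out=18.0):
    """Smooth a plate spectrum (FWHM ~ fwhm_in) to a BP/RP-like resolution
    (FWHM ~ fwhm_out) with a Gaussian kernel of the quadrature difference."""
    step = wave_nm[1] - wave_nm[0]
    sigma = np.sqrt(fwhm_out**2 - fwhm_in**2) / 2.3548 / step   # in pixels
    half = int(4 * sigma)
    x = np.arange(-half, half + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    kernel /= kernel.sum()
    return np.convolve(flux, kernel, mode="same")

# toy spectrum: flat continuum plus one emission line at 486 nm
wave = np.arange(330.0, 660.0, 1.0)
flux = 1.0 + 5.0 * np.exp(-0.5 * ((wave - 486.0) / 2.0) ** 2)
print(degrade_resolution(wave, flux).max())   # line is strongly diluted
```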
the main task is automatic classification of stellar objective prism spectra on digitised plates — a simulation and a feasibility study for the low dispersion gaia spectra. the algorithms that have been developed and tested include the application of novel approaches and techniques, with emphasis on neural networks for automated recognition of the spectral types of stars, comparing them with atlas spectra. this technique differs from techniques discussed before (e.g. christlieb et al., 2002, or hagen et al., 1995). for the future, we plan to continue developing innovative dedicated image processing methods and to continue our participation in data extraction and evaluation by providing expertise in high-level image processing, with a focus on solving problems of data processing and data extraction emerging from the peculiar way in which gaia functions. the expertise available at the department of radioelectronics of the ctu faculty of electrical engineering will be further used and developed in this direction. as illustrated in figure 1, the gaia bp/rp and lds astronomical plates represent similar databases. the motivation for studies comparing these two databases is as follows: (1) a comparison of the simulated gaia bp/rp images with those obtained from digitized schmidt spectral plates (both using dispersive elements) for 8 selected test fields, and (2) a feasibility study for applying the algorithms developed for the plates to gaia.

dispersion is an important parameter, and is discussed below: (1) gaia bp: 4–32 nm/pixel, i.e. 400–3 200 nm/mm; 9 nm/pixel, i.e. 900 nm/mm at hγ; rp: 7–15 nm/pixel, i.e. 700–1 500 nm/mm. psf fwhm ∼ 2 px, i.e. the spectral resolution is ∼ 18 nm. (2) schmidt sonneberg plates (typical mean value): the dispersion for the 7 deg prism is 10 nm/mm at hγ, and 23 nm/mm at hγ for the 3 deg prism. (3) bolivia expedition plates: 9 nm/mm, with a calibration spectrum. (4) hamburg qso survey: 1.7 deg prism, 139 nm/mm at hγ, spectral resolution of 4.5 nm at hγ. (5) byurakan survey: 1.5 deg prism, 180 nm/mm at hγ, resolution 5 nm at hγ. (6) pari prism dispersion: 150–340 nm at 450 nm. we see that the gaia bp/rp dispersion is ∼ 5 to 10 times less than the dispersion of a typical digitised spectral prism plate, and the spectral resolution of gaia is ∼ 3 to 4 times less than that of the plates. note that for plates the spectral resolution is seeing-limited, hence the quoted values represent the best values; on plates affected by less than superior seeing, the spectral resolution is only ∼ 2 times better than that of gaia bp/rp.

5 astrophysics with gaia rp/bp spectro-photometry and lds

in our opinion, the major strength of gaia for many scientific fields will be in spectro-photometry, as the low dispersion spectra may be transferred to numerous well-defined color filters. as an example, the optical afterglows (oas) of gamma-ray bursts (grbs) are known to exhibit quite specific color indices, distinguishing them from other types of astrophysical objects (simon et al., 2001 and 2004a, 2004b); hence a reliable classification of the oas of grbs will be possible, in principle, using this method. the colors of microquasars may serve as another example: they display blue colors, with the individual objects forming a diagonal trend. this method can be used even for optically faint, and hence distant, objects. the gaia bp/rp lds will also provide direct valuable inputs for various fields of recent astrophysics.
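the spectro-photometric idea can be sketched by cutting synthetic rectangular passbands out of a low dispersion spectrum and forming a color index; the band limits and the power-law toy continuum below are hypothetical choices, not gaia passbands.

```python
import numpy as np

def synthetic_mag(wave_nm, flux, band):
    # mean flux in a rectangular passband, expressed as a magnitude
    lo, hi = band
    mask = (wave_nm >= lo) & (wave_nm < hi)
    return -2.5 * np.log10(flux[mask].mean())

def color_index(wave_nm, flux, blue=(380.0, 480.0), red=(550.0, 650.0)):
    # hypothetical rectangular "blue" and "red" bands cut from a BP-like spectrum
    return synthetic_mag(wave_nm, flux, blue) - synthetic_mag(wave_nm, flux, red)

# toy power-law (synchrotron-like) continuum, f_lambda ~ lambda**-1.5
wave = np.arange(330.0, 660.0, 1.0)
flux = (wave / 500.0) ** -1.5
print(f"blue - red = {color_index(wave, flux):+.2f} mag")
```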
figure 5 illustrates one of the examples, namely the value of lds for analyses of the oas of grbs. the emphasis is not only on the lds spectral continuum profile (reflecting the synchrotron radiation) but also on a study of wide redshifted lyman alpha breaks. the gaia data will be supported by ground-based optical data, with emphasis on robotic telescopes. this is part of the sub-workpackage supplementary optical observation in the workpackage specific object studies in the framework of the cu7 unit of gaia. while this support will focus on supplementary photometry, we have also developed and tested methods involving fast response lds (figure 6). this has scientific justification, as recently the lds of the oas of grbs are mostly delayed by 1–10 hours (fynbo et al., 2009), hence they represent the afterglow optical emission, not the prompt optical emission.

fig. 5: examples of lds of oas of grbs with strong intervening absorbers (fynbo et al., 2009). evidently the wide redshifted lyman alpha break will be observable by gaia rp, and as a consequence there will be a possibility to study highly redshifted objects with gaia rp up to z ∼ 7

fig. 6: the bootes wfs (with prism) constructed by the mechanical shop, ondrejov astronomical institute (left), with the direct image (centre) and prism image (right). the direct vision prism is mounted on a 0.3 m f/10 telescope, fov 43′ × 28′, dispersion ∼ 4 å/pixel at 4000 å, ∼ 30 å/pixel at 5500 å, and ∼ 100 å/pixel at 8000 å, limiting magnitude 13.5 in 30 s (spectrograph mode)

the correct color indices, however, cannot be calculated without careful decontamination of the bp/rp spectra (straizys et al., 2006, straizys et al., 2010). the energy redistribution effect in the gaia bp and rp spectra arising from contamination by the wings of the image profiles was mentioned and investigated by straizys et al. (2006), montegriffo et al. (2007), and montegriffo (2009). according to these researchers, the gaia spectra may be used for classifying stars either after applying contamination corrections or by using standard calibration stars with known physical parameters observed with the gaia spectrophotometers. in the latter case, there is no way to calculate the real spectral energy distributions, magnitudes, color indices, color excesses or other photometric quantities. the classification has to be made by matching the observed pseudo-energy distributions of the target and the standard stars, or by using pattern recognition algorithms (template matching) over the whole spectrum to estimate the astrophysical parameters of the stars. in addition, gaia may be useful in the study of strong spectral time variations. it is known that certain types of variable stars (vs), such as miras, cepheids, and a few cases of other stars, mostly peculiar variables, exhibit large variations in their spectral types. this field is, however, little exploited, as these studies used to be very laborious (plates were mostly visually inspected) and limited, and hence no review of the spectral variability among vs exists. the esa gaia is expected to deliver data to fill this gap.

6 recent results

recently, we have digitised the full collection of henize plates (the southern mt wilson-michigan h-alpha sky survey) and have found and analysed the northern mt wilson-michigan h-alpha sky survey plates deposited at the carnegie observatories in pasadena, ca, usa. selected plates from this collection have been digitized.
the southern sky la paz bolivia expedition survey was recently fully scanned and deposited at the sonneberg observatory, germany. in addition, lds plates located at various observatories (e.g. sonneberg schmidt, kpno, lick, mt hamilton, etc.) were investigated, and some of them have been digitized. a technique for on-site plate scanning (using a transportable digitization device) was developed and tested. this proved to be essential, as many of the large plate collections do not have any suitable plate scanner. for the lds plates deposited at pari, an extended literature and in situ plate search was carried out in order to correlate the literature records and the plates, and also to re-discover and re-investigate various objects with prominent spectral lines using modern methods, with emphasis on objects described many years ago in the literature. some examples are illustrated in figures 3 and 4. advanced investigation and visualization methods were also exploited, with examples shown in figure 2. algorithms for lds data analyses, including neural networks, were developed and tested, with emphasis on automated star spectral type recognition. the parameters from various lds plate projects were compared with those of gaia bp/rp. in addition, various evaluation and visualization techniques have been developed and tested. the potential of gaia bp/rp for astrophysical research was investigated, with emphasis on objects with prominent colors and prominent (and variable) spectral features.

7 conclusion

the esa gaia satellite will provide ultra-low dispersion spectra from bp and rp, representing a new challenge for astrophysicists and for computer science. the nearest analogy is digitized prism spectral plates: the sonneberg, pari, hamburg and byurakan surveys. these digitised surveys can be used for simulation and for tests of the gaia algorithms and gaia data. some algorithms have already been tested. some types of variable stars are known to exhibit large spectral type changes — however, this field is little exploited, and more discoveries can be expected with the gaia data, as gaia will allow us to investigate the spectral behavior of huge numbers of objects over a period of 5 years with good sampling for spectroscopy. however, the data must first be decontaminated to be scientifically applicable, as discussed above. variability studies based on low dispersion spectra are expected to provide unique novel data, and can use the algorithms recently developed for automatic analyses of digitized spectral schmidt plates. these variability studies may use either gaia bp/rp data, or scanned plate data, or both; then a time coverage up to and exceeding 100 years can be achieved. the oldest lds plates that we have identified in our project are stored at the carnegie observatories, pasadena, ca, usa, and were taken in 1909.

acknowledgement

czech participation in the esa gaia project is supported by pecs project 98058. the scientific part of the study is related to grants 205/08/1207 and 102/09/0997, provided by the grant agency of the czech republic. some aspects of the project described here are a natural continuation of czech participation in esa integral (esa pecs 98023). the analyses of digitised low dispersion spectral plates are supported by msmt kontakt project me09027.

references
[1] becker, f.: spektral-durchmusterung der kapteyn-eichfelder des südhimmels. i. pol und zone −75 deg. potsdam publ., 27, 1, 1929.
[2] henize, k. g.: the michigan-mount wilson southern hα survey. astronomical journal, 59, 325, 1954.
[3] hudec, l.: algorithms for spectral classification of stars. bsc. thesis, prague: charles university, 2007.
[4] jordi, c., carrasco, j. m.: the future of photometric, spectrophotometric and polarimetric standardization, asp conference series, 364, 215, 2007.
[5] montegriffo, p., et al.: a model for the absolute photometric calibration of gaia bp and rp spectra. i. basic concepts. gaia-c5-tn-oabo-pmn-001-1, 2007.
[6] montegriffo, p.: a model for the absolute photometric calibration of gaia bp and rp spectra. ii. removing the lsf smearing. gaia-c5-tn-oabo-pmn-002, 2009.
[7] simon, v., hudec, r., pizzichini, g., masetti, n.: a & a, 377, 450, 2001.
[8] simon, v., hudec, r., pizzichini, g., masetti, n.: gamma-ray bursts: 30 years of discovery: gamma-ray burst symposium, aip conference proceedings, 727, 487, 2004a.
[9] simon, v., hudec, r., pizzichini, g.: a & a, 427, 901, 2004b.
[10] straizys, v., et al.: baltic astronomy, 15, 449, 2006.
[11] christlieb, n., wisotzki, l., grasshoff, g.: a & a, 391, 397, 2002.
[12] hagen, h.-j., et al.: a & as, 111, 195, 1995.
[13] fynbo, j., et al.: apjss, 285, 526, 2009.
[14] straižys, v., lazauskaitė, r., zdanavičius, k.: baltic astronomy, 19, 181, 2010.

rené hudec, e-mail: rhudec@asu.cas.cz, astronomical institute, academy of sciences of the czech republic, cz-25165 ondřejov, czech republic; faculty of electrical engineering, czech technical university in prague, technická 2, cz-16627 prague, czech republic
lukáš hudec, miloš klíma, faculty of electrical engineering, czech technical university in prague, technická 2, cz-16627 prague, czech republic

acta polytechnica vol. 48 no. 5/2008

comparison of temperature loadings of bridge girders

j. římal, d. šindler

abstract: this paper compares the effect of temperature changes on the superstructure of bridges, above all the effect of non-uniform temperature. loadings according to standards čsn 73 6203, env 1991-1-5 and din 1072 are compared here. the paper gives a short summary of temperature loading according to each standard and compares the bending moments arising from these temperature loadings on a superstructure formed by a continuous girder — a steel-concrete box girder with a composite concrete slab. with respect to the variety of design processes, the comparison is made without any coefficient of loading, combination or material.

keywords: temperature loading, temperature gradients, bridge constructions, temperature difference components, deformation, temperature distribution, maximum moments from temperature loading.

fig. 1: border bridge on the d8 motorway in summer and in winter

1 introduction

this paper will compare the effects of temperature changes in bridge girders, above all the effect of non-uniform temperature distributions. the loadings recommended by standards čsn 73 6203, env 1991-2-5 and din 1072 will be compared here. due to the variety of design processes, the comparison will be made without any coefficient of loading, combination or material.

2 summary of loading, according to three standards

2.1 loading according to čsn 73 6203

when designing bridge structures, this standard considers two basic effects: a) standard temperature changes of the structure as a whole (equal change in temperature), and b) unequal temperature changes or temperature changes of parts of a structure.

2.1.1 standard temperature changes of the structure as a whole (uniform temperature component)

the standard prescribes equal temperature changes for each type of structure. if this value cannot be set in any other way, the upper and lower boundary temperatures are used. the values of these temperatures are shown in table 1. the value tf = 10 °c can be used as the initial temperature for most structures.

table 1: boundary temperatures for bridge structures (°c), covering steel, concrete, steel-concrete composite and concrete-concrete composite superstructures, structures without sunshine at all times, fully hidden steel, concrete with a covering layer higher than 0.5 m, and decks with a rail bed; tmax: 50, 35, 45, 30, 40, as concrete, 35, 30; tmin: −35, −20, −35, −20, −25, −35, −15

2.1.2 unequal temperature changes (temperature difference component)

unequal temperature changes are given as a difference of temperatures, the temperature gradient between two points on the surfaces of the structural member. if this is not known, the models presented in the standard are used. these models approximate the temperature changes depending on the structure type. bridges with a span of less than 50 m can be designed according to a simplified loading with a linear temperature gradient.
2.2 loading according to env 1991-2-5

this standard groups structures for temperature loading into three types: type 1 – steel deck on steel girders; type 2 – concrete deck on steel girders; type 3 – concrete slab structure or concrete deck on concrete girders. the temperature loading is divided into: a) a uniform temperature component, b) temperature difference components, i.e. the vertical component and the horizontal component of the temperature variations, and c) differences in temperature between different structural elements.

2.2.1 uniform temperature component

the uniform temperature component depends on the extreme temperatures. the minimum and maximum temperatures that a bridge will reach can be determined by applying the chart shown in fig. 2. the shade temperatures (tmin, tmax) for the site are derived from national isotherm maps. according to the nad, the temperatures for the czech republic are tmin = −24 °c and tmax = +37 °c.

fig. 2: correlation between shade air temperature and structure temperature

2.2.2 temperature difference component

the temperature difference component means that the upper surface of the bridge deck will be exposed to maximum heating (top surface warmer) or to maximum cooling (bottom surface warmer) temperature variation. as in the case of čsn 73 6203, env applies a linear temperature gradient for some structures and a nonlinear gradient for others. the linear temperature difference values are shown in table 2. the values given in the table are based on a surfacing depth of 50 mm for roads and railways. figures with temperature gradients and values are given in the standard.

table 2: values of linear temperature differences

deck type                        top warmer than bottom   bottom warmer than top
                                 Δtm,heat (°c)            Δtm,cool (°c)
type 1: steel deck               18                       13
type 2: composite deck           15                       18
type 3: concrete deck
  – concrete box                 10                       5
  – concrete beam                15                       8
  – concrete slab                15                       8

2.2.3 differences in temperature between different structural elements

these effects should be taken into account for structures where the difference in the uniform temperature component between the different element types may lead to adverse load effects. the recommended temperature values are: 15 °c between main structural elements, and 10 °c and 20 °c for light and dark colors, respectively, between suspension/stay cables and deck.
2.3 loading according to din 1072

this standard divides the temperature loading into three groups, as does env 1991-2-5: a) a uniform temperature component, b) temperature difference components, and c) differences in temperature (temperature jump).

2.3.1 uniform temperature component

to obtain the uniform temperature, we apply the basic temperature t = +10 °c. for each type of supporting structure the values are given as follows: steel bridges ±35 °c, composite bridges ±35 °c, concrete bridges +20 °c / −30 °c. for bridges with a construction depth of more than 0.7 m and for backfilled structures, the temperature values can be reduced by 5 °c.

2.3.2 temperature difference components

the temperature difference components are given as a linear gradient in the vertical direction of the bridge girder. the temperature difference values between the surfaces are shown in table 3.

table 3: temperature difference values (°c)

                     upper surface warmer              bottom surface warmer
                     structural      service           structural      service
                     conditions      conditions        conditions      conditions
steel bridges        15              10                5               5
composite bridges    8               10                7               7
concrete bridges     10              7                 3.5             3.5

2.3.3 differences in temperature

differences in temperature refer to the fact that different parts of a structure (e.g. arch and deck) can have different temperatures. the temperature difference value for a concrete-concrete composite member is ±5 °c, while for other kinds of composite members it is ±15 °c.

3 comparison of loadings

all the standards compared here divide the loading of the bridge structure by temperature into at least two basic effects — the uniform temperature component and the temperature difference components. the env 1991-2-5 standard takes into account loading by a temperature gradient in the vertical direction, and also loading by a temperature gradient in the horizontal direction; the latter is used for complicated structural arrangements, where these loadings produce considerable effects. for the uniform temperature, each of the standards has its own technique for obtaining the temperature differences, but the final temperature values do not differ greatly. for the temperature differences, the standards generally take a nonlinear temperature gradient in the vertical direction. for simple structures, according to the čsn and env standards, a simplified linear gradient can be used. the din standard uses this simplified linear gradient for all bridge structures.

4 analysis on a real bridge structure

for a comparative analysis, a bridge on the d8 motorway, segment 0807, so 217 – the border bridge, has been chosen. the bridge crosses the border with germany, which is formed by a deep valley and the border brook.

4.1 description of the border bridge

the structural system of the bridge is a continuous girder. it is supported by nine supports: two abutments and seven piers. at this point, the motorway has a constant curvature with radius r = 1750 m; there is a constant slope of 0.5 % in the vertical direction of the bridge.

fig. 3: view of the bridge before completion
fig. 4: longitudinal section with cross section locations
fig. 5: sample cross section

the bridge structure is made as a steel-concrete box girder with a composite concrete slab. each direction of the motorway has one bridge girder with its own piers (figs. 3 and 5). the abutments are shared by both girders. the width of one structure is 14.5 m and the bridge depth is 3.65 m. the length of the bridge is approx. 430 m.
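before the standard-specific load cases are evaluated below, the mechanism by which a temperature gradient produces moments can be sketched with the classic relation for a fully restrained member: curvature κ = αΔt/h and restraint moment m = e·i·κ. the section constants below are illustrative assumptions, not the border bridge design data (only the 3.65 m depth is taken from the text), and the real analysis uses the beam model of fig. 4.

```python
def restraint_moment(e_mod, inertia, alpha, delta_t, depth):
    """Fully restrained bending moment from a linear temperature gradient:
    curvature kappa = alpha * dT / h, moment M = E * I * kappa."""
    return e_mod * inertia * alpha * delta_t / depth

# illustrative section values (assumed, not the bridge's design data)
M = restraint_moment(e_mod=35e9,      # Pa, concrete-dominated composite
                     inertia=12.0,    # m^4
                     alpha=1.0e-5,    # 1/K
                     delta_t=10.0,    # K, linear gradient over the depth
                     depth=3.65)      # m, bridge depth from the text
print(f"M = {M / 1e6:.1f} MNm")       # of the order of 10 MNm
```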
The abutments are the same for both girders. The width of one structure is 14.5 m and the bridge depth is 3.65 m. The length of the bridge is approx. 430 m.

Fig. 3: View of the bridge before completion
Fig. 4: Longitudinal section with cross-section locations
Fig. 5: Sample cross section

The continuous girder has spans of (58.40 + 73 + 73 + 73 + 73 + 58.40) m, and it was launched into its final position from the Czech abutment. Only the steel box with the launching nose (length approximately 30 m) was launched; the deflection at the nose end during launching was about 1 m. For casting the deck, removable formwork was used. Each casting step took 11 days. Six form travelers, each 25 m in length, were used. In order to eliminate cracks above the supports of the main girder, these parts were the last to be cast.

4.2 Analysis

As the bridge structure has the form of a continuous beam supported by fixed bearings on the two middle piers, the uniform temperature component causes axial displacements and produces a normal force accompanied by negligible bending moments only. Therefore this bending effect is not taken into consideration here; the uniform temperature values are only listed in the following sections. The analysis of non-uniform temperature changes is performed on the beam model shown in Fig. 4, which also shows the positions of the characteristic cross-sections on the beam axis.

4.2.1 Solution according to ČSN 73 6203

Uniform temperature component: for composite bridges, the boundary temperatures according to Table 1 are Tmax = +40 °C and Tmin = −25 °C. Using the reference temperature T0 = 10 °C (the temperature when the bridge girder was placed on the bearings), the uniform temperature components are as follows:

ΔTmax = Tmax − T0 = 40 − 10 = +30 °C
ΔTmin = Tmin − T0 = −25 − 10 = −35 °C

Temperature difference component: because the bridge span exceeds 50 m, it is not possible to use the simplified linear temperature gradient. The nonlinear temperature gradient shown in Fig. 6 must be used.

4.2.2 Solution according to ENV 1991-2-5

Uniform temperature component: according to ENV 1991-2-5 this bridge belongs to structure type 2. For the maximum and minimum air temperatures in the Czech Republic, Tmax = +37 °C and Tmin = −24 °C, the following temperature values for bridges are taken from the chart in Fig. 2:

Tmax = +37 °C → Te,max = +45 °C
Tmin = −24 °C → Te,min = −20 °C

Using the reference temperature T0 = 10 °C (the temperature when the bridge girder was placed on the bearings), the uniform temperature components are as follows:

ΔTN,pos = Te,max − T0 = 45 − 10 = +35 °C
ΔTN,neg = Te,min − T0 = −20 − 10 = −30 °C

Temperature difference components: because a composite bridge girder is not a simple structure with acceptable details, it is necessary to apply a nonlinear temperature gradient. The temperature gradient in the vertical direction of the superstructure is shown in Fig. 7.

4.2.3 Solution according to DIN 1072

Uniform temperature component: for composite bridges, the uniform temperature is calculated from the reference temperature T = +10 °C with a change of ±35 °C.

Temperature difference components: for loading with a temperature gradient according to the DIN standard, only a linear temperature gradient is used. This is shown in Fig. 8.
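The uniform temperature components worked out in sections 4.2.1–4.2.3 reduce to simple differences against the reference temperature. The following minimal Python sketch only collects those worked values; the dictionary layout and helper names are ours, not part of any standard.

```python
# Illustrative recomputation of the uniform temperature components from
# sections 4.2.1-4.2.3 (composite girder, reference temperature 10 deg C).
T0 = 10.0  # reference temperature [deg C], girder placed on the bearings

boundary = {
    "CSN 73 6203":  {"T_max": +40.0, "T_min": -25.0},  # Table 1, composite
    "ENV 1991-2-5": {"T_max": +45.0, "T_min": -20.0},  # T_e,max / T_e,min, Fig. 2
}

for std, t in boundary.items():
    dT_pos = t["T_max"] - T0   # heating component
    dT_neg = t["T_min"] - T0   # cooling component
    print(f"{std}: dT_pos = {dT_pos:+.0f} C, dT_neg = {dT_neg:+.0f} C")

# DIN 1072 works directly with a change about the basic temperature:
print(f"DIN 1072: dT = +/-35 C about T = {T0:+.0f} C")
```

Running it reproduces +30/−35 °C for ČSN and +35/−30 °C for ENV, as derived above.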
Fig. 6: Temperature gradient in the vertical direction, according to ČSN 73 6203
Fig. 7: Temperature gradient in the vertical direction, according to ENV 1991-2-5
Fig. 8: Temperature gradient in the vertical direction, according to DIN 1072

4.3 Comparison of results

The temperature gradient figures show that, according to the ČSN and ENV standards, which have almost the same temperature distribution, there will be only a one-sided effect, and consequently only a positive or a negative moment. The calculation confirmed this hypothesis, and the temperature difference component caused only positive moments. By contrast, the DIN standard, as its temperature gradient suggests, causes both positive and negative moments. The calculated moments are shown in Figs. 9 and 10. According to the DIN standard, loading with cooling will cause only negative moments, whereas the other two standards produce positive moments (Fig. 9). The DIN standard produces a minimal moment of −11 MNm, in contrast to a zero moment due to loading according to the ČSN and ENV standards.

Fig. 9: Minimum moments from temperature loading
Fig. 10: Maximum moments from temperature loading

If we compare the maximal moments caused by the heating difference component (Fig. 10), no great differences in values appear. Although the temperature gradients given by the ČSN and ENV standards are significantly different, loading according to them causes almost the same moments; the difference between the moments is approx. 1 %. Loading according to ENV produces about 15 % lower values, but design according to ENV uses many more coefficients than the compared standards, so the resulting effect of loading according to this standard could be the same or worse.

5 Conclusion

The comparison of the standards took into account only basic values, without applying any coefficients (factors for actions or combinations). The final effects of loading with maximum heating in the temperature gradient according to the ENV standard can, in some combinations, be higher than according to the other standards, though the effect is lower without the coefficients. Without investigating other types of structures we cannot say whether this difference is generally observable, or whether it is valid only for this type of structure. In addition, it would be necessary to make extensive measurements of temperature gradients on real structures in order to ascertain which standard determines the true values, or which is closest to the truth. It is necessary to investigate this problem on other structural types. First of all, the theoretical considerations and calculations according to the standards must be confronted with on-site experimental measurements of the temperature fields on bridge structures. Temperature fields and temperature gradients should be measured during diurnal (24-hour) and annual cycles. By evaluating these cycles it would be possible to learn whether the extreme measured effects greatly exceed the values given by the standards, and to determine how often they are exceeded. It would then be possible to assess how precisely and how reliably the individual standards prescribe the temperature gradients for bridge design.

Acknowledgment

This project is being conducted with the participation of Ph.D. student Ing. Jana Zaoralová, Ing. Simona Rohrböcková, and undergraduate student Jakub Římal.
This research has been supported by the Grant Agency of the Czech Republic under grant No. 103/06/0815 and by research project MSM 6840770005.

References
[1] ČSN 73 6203 Zatížení mostů (including Amendments a and b)
[2] ENV 1991-2-5 Eurocode 1: Actions on structures – Part 1-5: General actions – Thermal actions
[3] DIN 1072 Straßen- und Wegbrücken – Lastannahmen
[4] Římal, J.: Charles Bridge in Prague – Measurement of temperature fields. International Journal for Restoration of Buildings and Monuments, Freiburg, Vol. 9 (2003), No. 2, p. 585–602.
[5] Římal, J.: Charles Bridge in Prague – Measurement of temperature fields. International Journal for Restoration of Buildings and Monuments, Freiburg, Vol. 10 (2004), No. 3, p. 237–250.

Prof. RNDr. Jaroslav Římal, DrSc.
Phone: +420 224 354 702
Email: rimal@fsv.cvut.cz

Ing. Daniel Šindler
Phone: +420 224 354 624
Email: daniel.sindler@fsv.cvut.cz

Czech Technical University in Prague
Faculty of Civil Engineering
Thákurova 7
166 29 Prague, Czech Republic

The Precision Simulation of the First Generation Matrix Converter

M. Bednář

Abstract: This paper describes the simulation of the first generation matrix converter that was realized in the lab. The simulation was developed in response to the need for a diagnostic tool. The program is needed in order to debug the implemented control algorithms. The simulator provides an environment for testing the generation of switching pulses without the risk of damaging the hardware, and it offers great potential for quicker development of new switching algorithms. A description of the simulated system is also included.

Keywords: matrix converter, real-time system, simulation, energy conversion.

1 Introduction

The matrix converter is a system for energy conversion. From the viewpoint of the classification of power electronics systems, it is a type of direct frequency converter. The converter is able to control machines supplied with alternating current (asynchronous or synchronous), but it can produce a higher frequency than the best-known direct converter – the cycloconverter. A highly sophisticated control algorithm can be implemented to control the machines. In this respect it is comparable with the classic indirect frequency converter, but it has its own advantages and disadvantages.

The main advantage over the indirect converter is the absence of a DC link and the attendant bulky passive elements. As a consequence, a matrix converter can have reduced dimensions. This property can be put to use in space-demanding applications, e.g. traction or flight systems. Another possible use of the matrix converter is in connection with medium-frequency transformers. Further advantages are connected with EMC problems: the power factor of the drawn energy can be controlled by this type of converter, and, finally, it can draw a sinusoidal current from the distribution network. Like every technical solution, the matrix converter also has some disadvantages. These include the large number of semiconductor elements, which need protection from destruction. It is also sensitive to fluctuations in the input voltage.

2 Description of the real system

A description of the realized system is necessary for understanding the simulation parameters. The following describes the first generation matrix converter realized at the Department of Electric Drives and Traction. Fig. 1 shows the real system configuration. The matrix converter supplies a three-phase squirrel-cage asynchronous machine. A DC machine is used as the load; this machine is supplied by a four-quadrant DC converter. The system is equipped with a speed sensor.

2.1 Power electronic configuration

The basic element of the matrix converter is a bidirectional switch. A possible bidirectional switch configuration is shown in Fig. 2. The configuration of the enlarged compact matrix converter is a typical 3×3 switching pattern of 18 IGBTs, as shown in Fig. 3. Compact Eupec IGBT modules FS150R17KE3 are used; the configuration of these modules is adapted so that they can be employed in the matrix converter system.
Each module contains 3 bidirectional switches. All the system parts are dimensioned for a permanent current stress of 30 A and for a voltage level of 400 V.

Fig. 1: Configuration of the real system
Fig. 2: Bidirectional switch configuration
Fig. 3: Switching pattern of the 18 IGBTs

2.2 The control system

The whole system is adapted to comply with the requirements for time-precise pulse generation, simultaneous control algorithm execution and communication with the user. Two independent personal computers (called the target PC and the host PC) are used to serve these tasks. A specially designed rack with a controller card is used to generate the 18 control pulses. The card is based on an FPGA circuit. A description of the internal structure of the FPGA is important for the simulation quality.

2.2.1 The host PC

The host PC is the part of the converter nearest to the user. It looks like an ordinary PC, and it hosts the monitor interface application, which enables the parameters of the converter to be set up. The parameters are passed to the second PC through the LPT and COM interfaces.

2.2.2 The target PC

The execution of the control algorithm is the highest-priority task of this computer. The computer works in hard real-time mode, and all system interrupts are disabled. The only way to change the program parameters is through the host PC.

2.2.3 The control rack

The rack is a box which contains the control and measurement cards, connected to the system bus, which runs through the rack and the target PC.

Pulse generation

Two identical cards are used for pulse generation purposes; the first is called the master, and the second the slave. Each card has 12 optical outputs, which represent one switching word in the regulation program. One FPGA circuit is located on each card. It is programmed similarly to the high-speed output circuits on some digital signal processors. Each FPGA circuit is able to generate 16 switching words during the regulation period of 150 µs. The internal structure of the FPGA is shown in Fig. 4. The 16 switching words (called tim_switch_word) are organized in a table, and 16 switching times (in the range between 0 and 150 µs) are assigned to them. The internal timer value is compared with the time value in the table, and the corresponding switching word is generated. This switching word stays on the output until it is replaced by another word. Not all 16 switching times have to be used in each period; the number of switching words actually used is called tim_switch_used. All possible states have been analyzed and considered for simulation purposes. The other inputs relevant for the simulation are the system feedback inputs: six analog channels are converted to digital information. These must be taken into account in the simulation design.
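To make the table-driven pulse generation concrete, here is a minimal Python sketch of the mechanism just described. It is our own illustration, not the firmware: the helper name generate_period is hypothetical, and the tick constants follow the timing figures discussed in the next subsection (150 µs period, 1/20 µs timer step).

```python
# A minimal sketch (not the actual firmware) of the table-driven pulse
# generation: up to 16 switching words with assigned switching times inside
# one 150 us regulation period, sampled with a 1/20 us timer step.
TICKS_PER_PERIOD = 3000          # 150 us at 1/20 us per tick

def generate_period(switch_words, switch_times, used):
    """Replay one regulation period.

    switch_words -- table of up to 16 12-bit words (tim_switch_word)
    switch_times -- switching times in timer ticks, ascending, 0..3000
    used         -- number of table entries actually used (tim_switch_used)
    Returns the output word sampled at every timer tick.
    """
    output = []
    word = 0                     # the output holds its value until changed
    next_entry = 0
    for tick in range(TICKS_PER_PERIOD):
        while next_entry < used and switch_times[next_entry] <= tick:
            word = switch_words[next_entry]   # comparator fires: new word
            next_entry += 1
        output.append(word)
    return output

# Example: three switching words inside one period (values are arbitrary).
out = generate_period([0b000000111000, 0b000111000000, 0b111000000000],
                      [0, 1000, 2000], used=3)
print(out[0], out[999], out[1000], out[2999])
```

The sample run shows the output word holding its value between table entries, which is exactly the behaviour the simulation has to reproduce.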
Program timing

Several operations must be executed in parallel during program execution, and it is absolutely necessary to take the real system timing into consideration during the simulation design. The timing of the simulation and of the real system must be identical; otherwise the simulation would lose its validity. The system timing is shown in Fig. 5. The main program loop takes 150 µs, which means a switching frequency of the power transistors of about 6.6 kHz. The internal timer step is 1/20 µs, which is the basic time unit for the whole system; one program loop therefore takes 3000 basic time units, so 12-bit wide time information is needed in the switching table.

2.2.4 The control software

The target PC's software includes the whole firmware for initialization, the AD conversion service, communication and, of course, the application program. The application program is the most important part for simulation purposes. It is necessary to ensure full compatibility and portability of the source code between the simulation and the real system; basic modules and strict rules had to be established.

3 The simulation

3.1 The requirements

The most important requirement is source code compatibility between the simulation and the real system. The simulation must return an adequate response to enable software debugging and error finding. The adequate response must be ensured not only for a correct algorithm, but also for error states.

Fig. 4: FPGA structure
Fig. 5: System timing
Fig. 6: The whole model view. The basic blocks are shown in this picture. The main control block is on the left, with the switching pulses on its output. These pulses are monitored by the scope blocks and by the automatic error detection block, which can be seen in the bottom right corner (called the monitor). They feed into the PLECS block located in the middle part of the figure. The monitoring block, where the electrical magnitudes are shown, is on the right.
Fig. 7: The interface between the real system and the portable simulation program, and the pulse generation: an operation which has to be applied to the output of the real program block situated on the left side. The block called time→index is very important for the time synchronization of the whole model.
Fig. 8: The power part simulation block in PLECS
Fig. 9: Simulation outputs. Some examples of the simulation results are shown here. The first and second graphs show the input magnitudes (voltage and current); the third and fourth graphs show the output voltage and current. This figure shows the input filter oscillation.

The simulation must be optimized from the viewpoint of time consumption. Automatic error state identification is required: it is useful if the simulation can identify switching word errors that could damage the real system.

3.2 The environment

The simulation has been developed for the Matlab Simulink environment, and the PLECS module simulating electric circuits has been used. The source code is written in the C language. There are two different levels: the first one is the model of the lab; the second level includes the environment for source code transfer between the real system and the simulation.
An interface between the source code and Matlab has been developed, using the MEX function system provided by Matlab. Another possible interface, called the S-function, was rejected because of its slow response. The model configuration is shown in Figs. 6–8.

4 Results

Many simulations have been run and some useful results have been obtained. Several control algorithms have been tested (four-step commutation, two-step commutation, over-modulation), and some of the experience acquired has been used in developing the second generation of matrix converters.

Miroslav Bednář
E-mail: bednam2@.fel.cvut.cz
Dept. of Electric Drives and Traction
Czech Technical University
Faculty of Electrical Engineering
Technická 2
166 27 Praha, Czech Republic

Fig. 10: The one-phase simulation result: input voltage, input current, output voltage, output current

Rational Approximation to the Solutions of Two-Point Boundary Value Problems

P. Amore, F. M. Fernández

Abstract: We propose a method for the treatment of two-point boundary value problems given by nonlinear ordinary differential equations. The approach leads to sequences of roots of Hankel determinants that converge rapidly towards the unknown parameter of the problem. We treat several problems of physical interest: the field equation determining the vortex profile in a Ginzburg-Landau effective theory, the fixed-point equation for Wilson's exact renormalization group, a suitably modified Wegner-Houghton fixed-point equation in the local potential approximation, and a Riccati equation. We consider two models where the approach does not apply, in order to show the limitations of our Padé-Hankel approach.

Keywords: nonlinear differential equations, Ginzburg-Landau, Wilson's renormalization, Wegner-Houghton, Riccati equation, Padé-Hankel method.

1 Introduction

Some time ago Fernández et al [1–9] developed a method for the accurate calculation of eigenfunctions and eigenvalues for bound states and resonances of the Schrödinger equation. This approach is based on the Taylor expansion of a regularized logarithmic derivative of the eigenfunction. The physical eigenvalue is given by a sequence of roots of Hankel determinants constructed from the coefficients of that series. The merits of this approach, called the Riccati-Padé method, are its great convergence rate in most cases and the fact that the same equation applies to bound states and resonances. Besides, in some cases it yields upper and lower bounds to the eigenvalues [1]. The logarithmic derivative satisfies a Riccati equation, and one may wonder if the method applies to other nonlinear ordinary differential equations. The purpose of this paper is to investigate whether a kind of Padé-Hankel method may be useful for two-point boundary value problems given by nonlinear ordinary differential equations.
In section 2 we outline the method, in section 3 we apply it to several problems of physical interest, and in section 4 we discuss the relative merits of the approach.

2 Method

It is our purpose to propose a method for the treatment of two-point boundary value problems. We suppose that the solution f(x) of a nonlinear ordinary differential equation can be expanded as

f(x) = x^α ∑_{j=0}^{∞} f_j x^{βj}    (1)

about x = 0, where α and β are real numbers, and β > 0. We also assume that we can calculate sufficiently many coefficients f_j in terms of one of them, which should be determined by the boundary condition at the other point, for example at infinity. We show several illustrative examples in section 3.

We try a rational approximation to x^{−α} f(x) of the form

[M, N](z) = (∑_{j=0}^{M} a_j z^j) / (∑_{j=0}^{N} b_j z^j),    (2)

where z = x^β. The Taylor expansion of the usual Padé approximant reproduces M + N + 1 coefficients of the series (1) [11]; but in the present case we require that the rational approximation (2) gives us one more coefficient, that is to say, M + N + 2. If M = N + d, N = 1, 2, ..., d = 0, 1, ..., this requirement leads to the equation [1–9]

H_D^d = |f_{i+j+d+1}|_{i,j=0,1,...,N} = 0,    (3)

where D = N + 1 = 2, 3, ... is the dimension of the Hankel determinant H_D^d. In general, equation (3) exhibits many roots, and one expects to find a sequence, for D = 2, 3, ... and fixed d, that converges towards the required value of the unknown coefficient. From now on we call it the Hankel sequence for short. If such a convergent sequence is monotonically increasing or decreasing, we assume that it yields a lower or upper bound, respectively. Such bounds have been proved rigorously for some eigenvalue problems [1].

3 Examples

In order to test the performance of the Padé-Hankel method, in this section we consider the examples treated by Boisseau et al [10] by means of a most interesting algebraic approach. We first consider the field equation determining the vortex profile in a Ginzburg-Landau effective theory [10] (and references therein)

f''(r) + (1/r) f'(r) + (1 − n²/r²) f(r) − f(r)³ = 0,  r > 0.    (4)

The solution f(r) satisfies the expansion (1) with x = r, α = n = 1, 2, ..., and β = 2. If we substitute this series into the differential equation and solve for the coefficients f_j, we obtain them in terms of the only unknown f_0, which is determined by the boundary condition at infinity: f(r → ∞) = 1 [10] (and references therein). The coefficients f_j, and therefore the Hankel determinant H_D^d, are polynomial functions of f_0. For example, for n = 1 we have

f_1 = −f_0/8,  f_2 = f_0/192 + f_0³/24,  f_3 = −f_0/9216 − 5f_0³/576,  ...    (5)

Table 1: Convergence of the Hankel sequences for the connection parameter of the global vortex for n = 1

D    d = 0                  d = 1
2    0.595                  0.578
3    0.584                  0.582 9
4    0.583 24               0.583 15
5    0.583 20               0.583 183
6    0.583 192              0.583 187
7    0.583 190              0.583 189 0
8    0.583 189 7            0.583 189 3
9    0.583 189 54           0.583 189 46
10   0.583 189 52           0.583 189 48
11   0.583 189 51           0.583 189 491
12   0.583 189 498          0.583 189 494
13   0.583 189 496 4        0.583 189 495 3
14   0.583 189 496 1        0.583 189 495 6
15   0.583 189 496 0        0.583 189 495 7
16   0.583 189 495 90       0.583 189 495 83
17   0.583 189 495 88       0.583 189 495 84
18   0.583 189 495 867      0.583 189 495 854
19   0.583 189 495 864      0.583 189 495 857
20   0.583 189 495 862      0.583 189 495 859 1
21   0.583 189 495 860 9    0.583 189 495 859 8
22   0.583 189 495 860 7    0.583 189 495 860 1
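As a concreteness check on (5) and on the d = 0 column of Table 1, the following minimal Python sketch (our own illustration; sympy is assumed available, and the helper names are hypothetical) generates the coefficients and solves H_D^0 = 0. The recursion in the comments is our own derivation from equation (4) for n = 1, not spelled out in the paper.

```python
# A minimal sketch of the Pade-Hankel root search for the vortex problem
# (n = 1). Substituting f(r) = sum_j f_j r^(2j+1) into eq. (4) gives
#   4 (j+1)(j+2) f_{j+1} = sum_{a+b+c = j-1} f_a f_b f_c  -  f_j,
# which reproduces the coefficients (5).
import sympy as sp

f0 = sp.Symbol('f0')

def vortex_coeffs(n_terms):
    f = [f0]
    for j in range(n_terms - 1):
        cubic = sum(f[a] * f[b] * f[j - 1 - a - b]
                    for a in range(j) for b in range(j - a))
        f.append(sp.expand((cubic - f[j]) / (4 * (j + 1) * (j + 2))))
    return f

def hankel_root(D, d=0, guess=0.6):
    N = D - 1
    f = vortex_coeffs(2 * N + d + 2)        # highest index needed: 2N + d + 1
    H = sp.Matrix(D, D, lambda i, j: f[i + j + d + 1])
    return sp.nsolve(H.det(), f0, guess)    # root of H_D^d near the guess

for D in (2, 3, 4):
    print(D, hankel_root(D))                # approaches 0.583 189 495 ...
```

The first three determinant roots already reproduce the leading digits of the d = 0 column of Table 1.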
Tables 1 and 2 show two Hankel sequences, with d = 0 and d = 1, that converge rapidly towards the result of the accurate shooting method [10] for n = 1 and n = 2, respectively. We appreciate that in the case n = 1 the sequences with d = 0 and d = 1 give upper and lower bounds, respectively, that tightly bracket the exact value of the unknown parameter of the theory:

0.583 189 495 860 60 < f_0 < 0.583 189 495 860 61.

On the other hand, the appropriate Hankel sequences are oscillatory when n ≥ 2, and their rate of convergence decreases with n. Table 3 shows the best estimates of f_0 for n = 2, 3, 4.

Table 2: Convergence of the Hankel sequences for the connection parameter of the global vortex for n = 2

D    d = 0                   d = 1
3    0.156                   0.151
4    0.152 8                 0.154
5    0.153 10                0.153 0
6    0.153 09                0.153 11
7    0.153 098               0.153 10
8    0.153 099 7             0.153 10
9    0.153 099 1             0.153 099
10   0.153 099 14            0.153 098 9
11   0.153 099 12            0.153 099 095
12   0.153 099 17            0.153 099 091
13   0.153 099 105           0.153 099 097
14   0.153 099 102 1         0.153 099 11
15   0.153 099 102 72        0.153 099 102
16   0.153 099 102 697       0.153 099 103
17   0.153 099 102 782       0.153 099 102 92
18   0.153 099 103 124       0.153 099 102 93
19   0.153 099 102 857       0.153 099 102 89
20   0.153 099 102 864       0.153 099 102 78
21   0.153 099 102 861 36    0.153 099 102 860
22   0.153 099 102 861 42    0.153 099 102 858

Table 3: Best estimates of the connection parameters of the global vortex for n = 2, 3, 4 by means of Hankel sequences with D ≤ Dmax

n    Dmax    f_0
2    21      0.153 099 102 86
3    21      0.026 183 420 7
4    26      0.003 327 173 4

Our second example is the fixed-point equation for Wilson's exact renormalization group [10] (and references therein)

2 f''(x) − 4 f(x) f'(x) − 5x f'(x) + f(x) = 0,  x > 0.    (6)

The solution to this equation can be expanded as in equation (1) with α = 1 and β = 2. The first coefficients are

f_1 = f_0/3 + f_0²/3,  f_2 = 7f_0/60 + f_0²/4 + 2f_0³/15,  ...    (7)

For large values of x the physical solution should behave as f(x) = a x^{1/5} + a²/(5 x^{3/5}) + ... The Hankel sequences with d = 0 and d = 1 converge towards the numerical result [10] (and references therein) from above and below, respectively. Figure 1 displays the great rate of convergence of these sequences as δ = |f_0(D, d = 0) − f_0(D, d = 1)|, D = 2, 3, ..., from which we obtain the accurate bounds

−1.228 598 202 437 021 924 38 < f_0 < −1.228 598 202 437 021 924 37.

Fig. 1: δ = |f_0(D, d = 0) − f_0(D, d = 1)| for Wilson's renormalization (δ falls from 10^0 to below 10^{−21} as D grows to about 25)

The third example comes from a suitably modified Wegner-Houghton fixed-point equation in the local potential approximation [10] (and references therein)

2 f''(x) + [1 + f'(x)] [5 f(x) − x f'(x)] = 0,  x > 0.    (8)

The solution satisfies the series (1) with α = 1 and β = 2, and the first coefficients are

f_1 = −f_0/3 − f_0²/3,  f_2 = f_0/60 + 2f_0²/15 + 7f_0³/60,  ...    (9)

On the other hand, the acceptable solution should behave as f(x) = a x^5 − 4/(3x) + ... when x ≫ 1. Table 4 shows Hankel sequences with d = 0 and d = 1 that clearly converge towards the numerical value of f_0 [10] (and references therein).
Table 4: Convergence of the Hankel sequences for the Wegner-Houghton connection parameter

D    d = 0                    d = 1
3    −0.301 365 209 2         −0.419 012 931 2
4    −0.540 511 282 4         −0.469 645 717 0
5    −0.455 201 249 3         −0.460 479 692 6
6    −0.462 452 597 9         −0.461 693 582 1
7    −0.461 375 992 6         −0.461 509 171 7
8    −0.461 557 112 9         −0.461 537 339 3
9    −0.461 530 376 7         −0.461 533 153 5
10   −0.461 534 297 5         −0.461 533 816 5
11   −0.461 533 614 7         −0.461 533 704 3
12   −0.461 533 735 7         −0.461 533 722 7
13   −0.461 533 717 3         −0.461 533 719 6
14   −0.461 533 720 7         −0.461 533 720 2
15   −0.461 533 720 0         −0.461 533 720 1
16   −0.461 533 720 13        −0.461 533 720 119
17   −0.461 533 720 113       −0.461 533 720 115 7
18   −0.461 533 720 116 8     −0.461 533 720 116 3
19   −0.461 533 720 116 1     −0.461 533 720 116 2
20   −0.461 533 720 116 2

We have also applied our approach to the ordinary differential equation for the spherically symmetric skyrmion field [10] (and references therein), but we could not obtain convergent Hankel sequences. We do not yet know the reason for the failure of the method in this case.

The present approach has earlier proved suitable for the treatment of the Riccati equation derived from the Schrödinger equation [1–9]. Consider, for example, the following Riccati equation

f'(x) − f(x)² + x² = 0,  x > 0.    (10)

The solution can be expanded as in equation (1) with α = 0 and β = 1; the first coefficients are

f_1 = f_0²,  f_2 = f_0³,  f_3 = −1/3 + f_0⁴,  ...

There is a critical value f_{0c} of f(0) = f_0 such that f(x) ~ −x at large x if f(0) < f_{0c}, f(x) develops a singular point if f(0) > f_{0c}, and f(x) ~ x at large x if f(0) = f_{0c}. The present Padé-Hankel method yields the value of f_{0c} with remarkable accuracy, as shown in Table 5. The rate of convergence of the Hankel sequence for this problem is considerably greater than for the preceding ones. If we substitute f(x) = −y'(x)/y(x) into equation (10), then the function y(x) satisfies the Schrödinger equation for a harmonic oscillator with zero energy on the half line, y''(x) − x² y(x) = 0, and the problem solved above is equivalent to finding the logarithmic derivative at the origin, y'(0)/y(0), such that y(x) behaves as exp(−x²/2) at infinity. Obviously, any approach for linear differential equations is suitable for this problem.

Table 5: Convergence of the Hankel sequences with d = 0 for the Riccati equation

D    f_0
4    0.676 2
5    0.675 970
6    0.675 978 5
7    0.675 978 23
8    0.675 978 240 3
9    0.675 978 240 059
10   0.675 978 240 067 5
11   0.675 978 240 067 277
12   0.675 978 240 067 285 0
13   0.675 978 240 067 284 722
14   0.675 978 240 067 284 729
15   0.675 978 240 067 284 728 99
16   0.675 978 240 067 284 729 00
17   0.675 978 240 067 284 729 00

Table 6: Convergence of the Hankel sequences with d = 4 for the Thomas-Fermi equation

D    2 f_2
10   −1.588 070 9
11   −1.588 070 6
12   −1.588 071 03
13   −1.588 071 024
14   −1.588 071 022 7
15   −1.588 071 022 64
16   −1.588 071 022 609
17   −1.588 071 022 609
18   −1.588 071 022 611 6
19   −1.588 071 022 611 5
20   −1.588 071 022 611 39
21   −1.588 071 022 611 38
22   −1.588 071 022 611 37
23   −1.588 071 022 611 37
24   −1.588 071 022 611 375 6
25   −1.588 071 022 611 375 37
26   −1.588 071 022 611 375 32
27   −1.588 071 022 611 375 315 4
28   −1.588 071 022 611 375 315 2
29   −1.588 071 022 611 375 315 4
30   −1.588 071 022 611 375 313 7

Finally, we consider two examples discussed by Bender et al [12]; the first of them is the instanton equation

f''(x) + f(x) − f(x)³ = 0    (11)

with the boundary conditions f(0) = 0, f(∞) = 1.
The solution to this equation is f(x) = tanh(x/√2). The expansion of f(x) is a particular case of equation (1) with α = 1 and β = 2, its first coefficients being

f_1 = −f_0/6,  f_2 = f_0 (6f_0² + 1)/120,  f_3 = −f_0 (66f_0² + 1)/5040,  ...    (12)

where f_0 = f'(0) is the unknown. The Hankel sequences with d = 0 and d = 1 converge rapidly, giving upper and lower bounds, respectively, to the exact result f_0 = 1/√2.

The second example is the well-known Blasius equation [12]

2 y'''(x) + y(x) y''(x) = 0    (13)

with the boundary conditions y(0) = y'(0) = 0, y'(∞) = 1. The expansion of the solution in a Taylor series about x = 0 is a particular case of equation (1) with α = 2 and β = 3; its first coefficients are

f_1 = −f_0²/60,  f_2 = 11 f_0³/20160,  ...    (14)

Since, in general, f_j ∝ f_0^{j+1}, the only root of the Hankel determinants is f_0 = 0, which leads to the trivial solution y(x) ≡ 0. We thus see another case where the Padé-Hankel method does not apply.

4 Conclusions

We have presented a simple method for the treatment of two-point boundary value problems. If there is a suitable series for the solution about one point, we construct a Hankel matrix with the expansion coefficients and obtain the physical value of the undetermined coefficient from the roots of a sequence of determinants. The value of this coefficient given by a convergent Hankel sequence is exactly the one that produces the correct asymptotic behaviour at the other point. We cannot prove this assumption rigorously, but it seems that if there is a convergent sequence, it yields the correct answer. Moreover, in some cases the Hankel sequences produce upper and lower bounds bracketing the exact result tightly.

The present Padé-Hankel approach is not as general as the one proposed by Boisseau et al [10], as we have already seen that the former apparently does not apply to the skyrmion problem or to the Blasius equation [12]. However, our procedure is much simpler and more straightforward, and it may be a suitable alternative for treating problems of this kind. Besides, if our approach converges, it yields remarkably accurate results, as shown in the examples above.

References
[1] Fernández, F. M., Ma, Q., Tipping, R. H.: Tight upper and lower bounds for energy eigenvalues of the Schrödinger equation. Phys. Rev. A, 39 (4), 1989, p. 1605–1609.
[2] Fernández, F. M.: Strong coupling expansion for anharmonic oscillators and perturbed Coulomb potentials. Phys. Lett. A, 166, 1992, p. 173–176.
[3] Fernández, F. M., Guardiola, R.: Accurate eigenvalues and eigenfunctions for quantum-mechanical anharmonic oscillators. J. Phys. A, 26, 1993, p. 7169–7180.
[4] Fernández, F. M.: Resonances for a perturbed Coulomb potential. Phys. Lett. A, 203, 1995, p. 275–278.
[5] Fernández, F. M.: Direct calculation of accurate Siegert eigenvalues. J. Phys. A, 28, 1995, p. 4043–4051.
[6] Fernández, F. M.: Alternative treatment of separable quantum-mechanical models: the hydrogen molecular ion. J. Chem. Phys., 103 (15), 1995, p. 6581–6585.
[7] Fernández, F. M.: Quantization condition for bound and quasibound states. J. Phys. A, 29, 1996, p. 3167–3177.
[8] Fernández, F. M.: Direct calculation of Stark resonances in hydrogen. Phys. Rev. A, 54 (2), 1996, p. 1206–1209.
[9] Fernández, F. M.: Tunnel resonances for one-dimensional barriers. Chem. Phys. Lett., 281, 1997, p. 337–342.
[10] Boisseau, B., Forgács, P., Giacomini, H.: An analytical approximation scheme to two-point boundary value problems of ordinary differential equations. J. Phys. A, 40, 2007, p. F215–F221.
[11] Bender, C. M., Orszag, S. A.: Advanced Mathematical Methods for Scientists and Engineers. New York, McGraw-Hill, 1978.
[12] Bender, C. M., Pelster, A., Weissbach, F.: Boundary-layer theory, strong-coupling series, and large-order behavior. J. Math. Phys., 43 (8), 2002, p. 4202–4220.

Paolo Amore
E-mail: paolo@cgic.ucol.mx
Facultad de Ciencias, Universidad de Colima
Bernal Díaz del Castillo 340, Colima, Colima, Mexico

Francisco M. Fernández
E-mail: fernande@quimica.unlp.edu.ar
INIFTA (CONICET, UNLP), Diag. 113 y 64 S/N
Sucursal 4, Casilla de Correo 16, 1900 La Plata, Argentina

Introduction to Modeling of Buying Decisions

O. Grünwald

Abstract: Buying decision models of customers, used to adjust the competitiveness of organizations, have been a challenge for the marketing disciplines for several generations. This topic has been explored by researchers and academics in past years, and quite an extensive theoretical base exists, with a number of approaches for dealing with this challenge. This paper presents some approaches for creating a customer decision model, and provides experimental results from an electronic investigation intended to build the Kano model, to prove an ability to understand the modeling principle, and to find out the interpretation of the examined demand in a specific market segment involving students of a technical university. The last section of the paper contains a brief introduction to choice-based modeling with choice-based conjoint analysis (CBC), which was tailored for modeling purchasing decisions.

Keywords: customer behaviors, Kano model, choice-based conjoint analysis.

1 Models of customer behaviors

Modeling purchasing decisions is derived from general behavioral theory, according to which customer decisions are influenced primarily by cultural factors, social factors and personal factors [1]. Buying decision models seek the best mathematical way to emulate the reality of markets; statistical surveys to obtain the input data are usually performed on a sample of the population. Models for understanding customer purchasing behavior are usually based on the stimulus-response principle. The influences in the buying decision process are shown in Figure 1, where the final decision is shaped by the incentives of competitive bids and by the personal characteristics of the buyers.

Fig. 1: Influences of external and personal factors in the buying decision process

During a marketing interview, respondents receive incentives with specific information that they understand. The incentive is then transformed into a perception, which further enters the human consciousness. A psychological process, together with the personal characteristics of the respondent, forms the resulting purchasing decision from the perception. Kotler describes four key processes constituting the final decision: the motivation process, the perception process, the learning process and the memory process. Marketers consider the motivating factors responsible for the purchase decision to be particularly important. The motivational process affecting customers' decisions is described in the literature by three theories: Freudian theory, Maslow theory and Herzberg theory.

Freudian theory assumes that the forces constituting human behavior are for the most part unconscious.
People do not fully understand their own motivations for making decisions, such as the choice of a specific bundle of product features. To identify and track the motivation of people, the laddering technique [2] is used, which is in accordance with the means-end chain theory. It is employed to model elements of three types: attributes, consequences and basic values. The technique monitors and analyzes the connections and consequences between these elements, which can be represented by a summary table with counts of connections; the dominant connections can be represented in a tree diagram. The result then represents the dominant perceptual orientations (ways of thinking) across consumers with respect to the product category, and can be represented by a hierarchical value map (HVM). This is essentially a tree diagram with a hierarchical structure of linkages or associations across levels of abstraction.

Maslow theory classifies the needs of customers into five hierarchical levels. According to this theory, people are driven by the needs that they want to meet at a given moment. The hierarchy of needs in the following list reflects the priority sequence from the highest to the lowest level, according to the traditional pyramid scheme:

1. self-actualization needs
2. esteem needs
3. social needs
4. safety needs
5. physiological needs.

The needs belonging to the fifth and lowest, physiological level, e.g. the need to drink, are the basal needs ensuring the vital functions of human life, and they must be satisfied before the needs from the higher levels can be fulfilled. The highest category, the self-actualization level, contains needs such as self-deployment and self-realization [1].

Herzberg theory deals with two types of factors. The first type is the group of dissatisfiers. These factors cause dissatisfaction among customers, who want to avoid them; although these properties do not sell a product, they can certainly cause a customer not to buy it. The other type of factors is the group of satisfiers, which in turn are sought by the customers and ensure customer satisfaction in proportion to their performance. In a competitive environment, these properties allow the product to be differentiated from other products designed to meet similar needs in a specific customer segment.

2 Kano model

The Kano model (Kano, 1984) is designed for marketing surveys. It extends Herzberg's two-factor theory to three factors classifying the needs of customers. The first category of needs in the model comprises the mandatory must-be requirements. If the product or service does not meet these requirements, customers will be very dissatisfied, because the product cannot fulfill its purpose well without them. On the other hand, when these requirements meet the customers' needs, they are perceived as self-evident and they do not cause any customer satisfaction, as in Herzberg's theory.

The second type is the class of one-dimensional needs. Increasing their compliance in a product proportionally increases the customer satisfaction gained from the experience of using it. These requirements are usually of a parametric type (for example, performance, consumption or durability).

The third type is the class containing attractive needs. These requirements have the greatest influence on how satisfied customers will be with a product. Customers are mostly unable to specify attractive needs by themselves, because these requirements are often unconscious.
A higher level of their compliance also increases customer satisfaction more than proportionately; however, if attractive needs are not met, customers will not feel any dissatisfaction. Needs of this type are significant for a differentiated and positioned product.

Fig. 2: Kano model, with three categories of customer needs (must-be, one-dimensional and attractive)

The Kano model makes it possible to divide customer needs into a mandatory category, a one-dimensional category and an attractive category of requirements. A product manager may determine which requirements will ensure the highest customer satisfaction, and will better understand the product offerings. He can find out the actual fulfillment of the requirements (the horizontal dimension in Figure 2) of an already established product, can determine the importance of each individual product feature, and can decide which features to prioritize and which to ignore if it is not possible to meet all of them due to budget constraints. He can define the requirements that have the highest potential to increase the level of product performance, or discover completely new requirements demanded by customers.

2.1 Design of the Kano model

The first step in the design is to recognize which requirements it is appropriate to employ in the model. In direct questioning on motives and shopping requirements, customers usually express their conscious desires; this direct method cannot reveal the unconscious motives involved in purchasing decisions which really lead to customer satisfaction when using the product. In order to determine the requirements indirectly, it is appropriate to conduct interviews with the following questions [5]:

1. What are the associations the customer makes when using product XY?
2. What problems/defects/complaints does the customer associate with using product XY?
3. What criteria does the customer take into account when buying product XY?
4. What features or services do customers expect? What would customers change in product XY?

The first type of question is often answered unclearly, but we can detect attitudes towards the product. Answers to the second type of question define new requirements for the product that were previously unknown to the manufacturer. The third type of response will involve mainly one-dimensional (performance-related) requirements. The fourth type of question will evoke requirements that are already known, and that consumers are aware of expecting, but that have not yet been realized in the product.

To prove the Kano model, we designed an interview with 80 respondents involved in the survey. The questions concerned the respondents' satisfaction with their studies at a technical university. The main requirements identified for the survey are listed in the appendix.

2.2 Concept of a Kano questionnaire

A Kano questionnaire usually consists of three types of questions, adopted according to the purpose of the inquiry. In order to classify the requirements into categories, the questionnaire includes paired questions targeted at the function/dysfunction of each requirement. Further questions asking about the dimension of current compliance are added (in our interview, the number of these questions varies for each requirement, in order to target different subparts); the results are plotted along the horizontal axis of Figure 2. The third question type is intended to obtain the importance level of the requirements, which enables us to prioritize the features for their further integration into the developing services related to the studies.

A paired question is composed for each product property included in the survey: two questions asking about two hypothetical situations, considering first the case that the requirement is satisfied by the product, and second the situation when the requirement is not met. For each question in the pair, respondents can select one of five standardized responses; all paired questions have the same standardized options.

Positive: How do you assess it if you have the latest information and you can use a consolidated information system?
◦ It is excellent ◦ It must be that way ◦ I do not care ◦ I can live with it ◦ It is a big problem

Negative: How do you assess it if you do not have the latest information and you cannot use a consolidated information system?
◦ It is excellent ◦ It must be that way ◦ I do not care ◦ I can live with it ◦ It is a big problem
the third question type is intended to obtain the importance level of the requirements, which will enable us to prioritize the features for their further integration into developing services related to the studies. a paired-type question is composed for each product property included for consideration in the survey as two questions asking for two hypothetical situations that consider the case firstly that a requirement is satisfiedbytheproductandsecondly for the situation when the requirement is not met. for each question in the pair, respondents can select one of five standardized responses. all paired questions have the same standardized options. positive how do you assess whether you have the latest information and whether you can use a consolidated information system? ◦ it is excellent ◦ it must be that way ◦ i do not care ◦ i can live with it ◦ it is a big problem negative how do you assess whether you do not have the latest information and you cannot use a consolidated information system? ◦ it is excellent ◦ it must be that way ◦ i do not care ◦ i can live with it ◦ it is a big problem the retrieved paired responses for each requirement are then classified into 6 different classes [4] according to the selected combinations. table 1: the classification matrix for combinations of answers collected from the paired questions requirements dysfunctional → ↓ functional like must be neutral live with dislike like q a a a o must be r i i i m neutral r i i i m live with r i i i m dislike r r r r q in table 1, the rows and the columns determine the class of requirement based on the combination of paired responses. categorya is for attractive requirements, category o for one-dimensional requirements, and categorym for must be requirements. category i means that the response combination is indifferent, and the respondent has not provided decisive informationabout the requirement in a combinationof responses. categoryr signals that an increasing level of the related requirement decreases the customer satisfaction. category q is for unclear responses with a combinationwithout anunderstandablemeaning, for pairs of positive and negative responses in contradiction to each other. requirements with dominant q categoryshouldbe interviewedagain. acombination of pair answers belonging to the q category and to the r category should not occur very often, otherwise this result means improper product features or invalid responses from the respondents. performance questions presented in the interview together with the paired question type are about how a product currentlymeets the claims of respondents, and they assume that the respondents have previous experience of using the product. the answers are expressed on rating scales. in the interview we used scales with six degrees (1–7) plus one option for the case that the respondent is not affected by the request. we implemented different numbers of these questions under each individual requirement, as it was necessary to ask about specific sub-areas and related details. the questions that we used are listed in the appendix as a second level of the list. an example of a question to find out how the specific requirement currentlymeets the customers’ (students’) requirements is shown below. 47 acta polytechnica vol. 51 no. 5/2011 how satisfied are you with opportunities to obtain current information related to your studies? ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ does not affect very very me dissatisfied satisfied importance questions were the third type of question in the interview. 
Importance questions were the third type of question in the interview. The respondents evaluated each single requirement again on a scale of 1 to 7, and were asked to reflect the degree of importance of each single demand in accordance with their perception.

2.3 Survey of student satisfaction at CTU

The survey to build the Kano model was conducted electronically with a group of 60 students who had previously been acquainted with the principle of the model. The respondents answered the questionnaire without any assistance, on their own computers, after receiving the instructions and an invitation to participate in the interview by an e-mail message. Each requirement in the interview obtained 30 responses. The total set of 20 requirements was divided into two variants of the questionnaire, each with 10 steps plus 1 step for the importance questions. This partition reduced the interviewing time and prevented an overload of the respondents during the interview, at the expense of a smaller number of responses. The variants of the questionnaire were alternated after each new respondent; in this way, each question in the interview was answered by approximately one half of the total number of respondents. We investigated the classification of the requirements, the degree of their current performance, and their importance for the respondents.

The interpretation of the results includes, for each requirement, a specification of the frequencies of the specific classes in descending order. Some requirements were not given an entirely clear category, so the category with the highest frequency was considered dominant. The categories in the other places in the order should also be taken into account when interpreting the results, considering the distributions of the frequencies of the other classes. Some of the answers were dominant in the attractive category while the requirement simultaneously received almost the same number of must-be answers, with no great difference between them; in such a case the absence of the requirement in the service will reduce satisfaction and at the same time dissatisfaction will emerge.

We also calculated the coefficient of satisfaction [6], the CS coefficient, as shown in the following formulas:

S = (A + O) / (A + O + I + M)    (1)

where S denotes the coefficient of satisfaction and A, O, I, M are the frequencies of classifications into the categories;

D = (O + M) / ((−1) × (A + O + I + M))    (2)

where D denotes the coefficient of dissatisfaction;

CS = S + D = (A − M) / (A + O + I + M)    (3)

where CS is the total coefficient of satisfaction, which emphasizes the extremes of satisfaction. If we simply averaged the responses for each requirement, the result would be compensated towards the mean values. The CS coefficient instead cancels the one-dimensional responses in the numerator, so the +/− sign of the result reflects whether the requirement is predominantly attractive or must-be, and the magnitude of the coefficient expresses the weight of the A and M responses in the total number of responses. Through its components S and D, the CS coefficient reflects how strongly the product requirement affects satisfaction, or dissatisfaction, if the requirement is or is not met.
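A short sketch of equations (1)–(3) in Python follows; the class frequencies in it are invented for illustration, and a real survey would take them from the tallied classifications of one requirement.

```python
# A short sketch of equations (1)-(3) for one requirement.
counts = {"A": 9, "O": 7, "I": 8, "M": 6}   # hypothetical class frequencies

total = counts["A"] + counts["O"] + counts["I"] + counts["M"]
S = (counts["A"] + counts["O"]) / total         # eq. (1), satisfaction
D = (counts["O"] + counts["M"]) / (-1 * total)  # eq. (2), dissatisfaction
CS = S + D                                      # eq. (3), equals (A - M) / total

print(f"S = {S:.2f}, D = {D:.2f}, CS = {CS:.2f}")
assert abs(CS - (counts["A"] - counts["M"]) / total) < 1e-12
```

The assertion checks the cancellation of the one-dimensional responses: the O counts drop out of the sum S + D, leaving (A − M)/total.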
Fig. 3: Positioning of students' requirements in a two-dimensional system (satisfaction and dissatisfaction effects)

Figure 3 shows the rate of satisfaction in two dimensions: S as the satisfaction coefficient and D as the dissatisfaction coefficient. The requirements placed high in the coordinate system indicate a high degree of satisfaction if the requirement is met, and a location to the right means a high degree of dissatisfaction when it is not fulfilled. The figure shows that if requirement No. 13 – use of the latest technology for teaching and opportunities to use your own notebook – is not met, respondents will not feel strong dissatisfaction, but they will feel great satisfaction if they have access to good wireless connections and software support. In contrast, requirement No. 6 – content accordance of the field of study with world trends – will cause high dissatisfaction if it is not met, and no significant satisfaction if it is fulfilled.

In our questionnaire, each requirement step consisted of a paired question and a variable number of differently aimed second-type questions on the current level of performance. The level of compliance was expressed by the respondents on a rating scale. When the responses were averaged, all the requirements came out slightly above the middle of the range, and from this result it was very difficult to determine the actual performance, despite the compensation feature. The CP coefficient of performance was therefore calculated as an additional indicator, in order to provide an overview:

CP = (∑_{i=3}^{7} f_i − ∑_{i=1}^{5} f_i) / ∑_{i=1}^{7} f_i    (4)

where CP is the coefficient of performance, focused on the extreme values in order to avoid a compensated result, and f_i denotes the frequency with which the respondents chose level i on the seven-grade rating scale.

The results from the questionnaire are shown in Figure 4. The labels a, b, c refer to the second-type questions: a is the first performance question belonging under a specific requirement, b is the second question, and c is the third question (three questions in a set was the maximum number in our questionnaire). The requirements are numbered from 1 to 20. The classification into categories (M, A, O, I) is derived from the superior requirement, and is thus identical for all variants of all related performance questions (which have the same number).

Fig. 4: Performance questions focused on how the requirements meet the needs. Performance is represented by the CP coefficient, together with a classification of the three most dominant requirement categories

The performance of a requirement is shown by the value of the CP coefficient, which subtracts the number of responses on the lower part of the scale (1–5) from the number of responses on the upper part (3–7) and divides this difference by the overall number of assessments. If the resulting coefficient is negative, the requirement can be perceived as insufficiently met. Each requirement is also displayed with the classification of the three most numerous groups; the classes are listed in descending order of the frequencies gained. The combinations indicate the approximate structures of the class distributions and also the nature of the requirements.
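The CP values quoted in the examples that follow come from a simple tally; a minimal Python sketch (the helper name cp and the frequencies are our own illustration):

```python
# A small sketch of the CP coefficient of performance from eq. (4): the
# extreme grades 1-2 and 6-7 dominate, since the middle grades 3-5 appear
# in both sums and cancel out.
def cp(freq):
    """freq[i] = number of respondents choosing grade i+1 on the 1-7 scale."""
    upper = sum(freq[2:7])   # grades 3..7
    lower = sum(freq[0:5])   # grades 1..5
    return (upper - lower) / sum(freq)

# Hypothetical frequencies for one performance question:
print(cp([10, 8, 5, 4, 3, 2, 1]))   # negative: insufficiently met
```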
For example, requirement No. 16a from the appendix – How satisfied are you with the links between teaching and practical applications? – is perceived as attractive (A), but some of the respondents considered it a must-be (M), and its coefficient of performance CP is −0.2, which indicates a lack of compliance. Question c of requirement No. 9 – How satisfied are you with the spaces for parking? – is perceived as showing very poor performance, and is classified as attractive (A) and consequently as a must-be (M). The number of evaluations here was only 43 %, due to the frequent choice of 'does not affect me', which means that many of the students are not concerned with parking needs. If the indifferent class is dominant, the next class in the order becomes decisive. Requirement No. 20a – How satisfied are you with the possibility of continuing for a master's degree after graduating as a bachelor of STM? – is a case of a dominant indifferent class with the must-be class second in the order; therefore, if the requirement is not met it will evoke dissatisfaction among the students, while if it is met, it will be regarded as self-evident and will not evoke any improvement in student satisfaction. Requirement No. 2 – helpful approach of teachers, mainly question b: How satisfied are you with the study materials? – with a significant contribution of must-be evaluations, also needs to be addressed.

3 Choice-based conjoint analysis

Conjoint analysis is a universal tool which can be used in market research to determine a marketing strategy. It belongs to the group of multivariate statistical methods designed for the most exact market surveys, and was originally derived from the fields of mathematical psychology and psychometrics. In contrast to compositional approaches, where each specific incentive attribute is assessed separately, the conjoint approach is based on a global assessment of the incentives representing products. Respondents evaluate a set of concepts in a questionnaire, and assign their preferences to them. Each product is a concept characterized by a specific combination of parameters, or by the characteristic levels of important attributes, and respondents must make trade-offs between the properties when considering concept preferences, because they are evaluating complex objects, each of which represents a specific product as a whole. This way of questioning is more similar to real buying behavior.

In compositional approaches, product properties are evaluated without the context of the simultaneous presence of the other attributes. Respondents do not need to make any compromises during the evaluation, for example due to a limited financial budget, and they do not take into account, say, the combination of the manufacturer's brand with the other properties. They do not evaluate the utility of a product, but only its separate features, and the resulting preference characteristic is often significantly distorted.

Choice-based models emerge from McFadden's economic theory [7]. According to this theory, individuals and homogeneous groups of individuals produce market behavior generated by maximizing preferences, which may include a random component due to fluctuations in perceptions and attitudes and also due to the action of other immeasurable factors. Demographic, economic and social variables also shape the resulting preferences.

Choice-based conjoint analysis (CBC) [8] is a specific approach from the group of conjoint methods. It is derived from random utility theory (RUT), which is an aspect of more general behavioral theory. This approach focuses directly on the process of customers' purchasing decisions. A discrete-choice questionnaire also maintains hypothetical products, as in the traditional conjoint approach, but here respondents always choose only one concept per question; each question is formed as a subset of the total set of hypothetical concepts. In marketing and economics, the most important random utility model is the discrete-choice model (the choice of one option at a time).
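To make the random-utility idea concrete, here is a minimal Python sketch of multinomial logit (MNL) choice probabilities, one of the CBC analysis approaches named below. The part-worth utilities are invented for illustration; a real CBC study estimates them from the observed choices.

```python
# A minimal sketch of random-utility choice under a multinomial logit (MNL)
# model: each concept's deterministic utility is the sum of the part-worths
# of its attribute levels, and the random (Gumbel) component of the utility
# yields softmax choice probabilities.
import math

part_worths = {                      # utility contribution of each level
    ("brand", "A"): 0.6, ("brand", "B"): 0.0,
    ("price", "low"): 0.8, ("price", "high"): -0.8,
}

def utility(concept):
    """Deterministic utility: sum of the part-worths of the concept's levels."""
    return sum(part_worths[(attr, level)] for attr, level in concept.items())

def choice_probabilities(concepts):
    """MNL choice shares for one choice task."""
    expu = [math.exp(utility(c)) for c in concepts]
    total = sum(expu)
    return [e / total for e in expu]

task = [  # one choice task: the respondent picks one concept
    {"brand": "A", "price": "high"},
    {"brand": "B", "price": "low"},
]
print(choice_probabilities(task))   # the cheaper brand-B concept wins here
```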
the data can be analysed using the count method, which simply expresses the number of cases when the selected concepts included a given level, as a percentage of the total number of occurrences of that level in the questionnaire. other approaches for analyzing choice-based data are multinomial logit (mnl), multinomial probit (mnp), latent classes and mnl hierarchical bayes (hb). the output of such a method is usually the probability that a particular type of respondent will buy a specified product (a combination of product features), or it determines the preferences of a market segment.

4 conclusion

this article has presented issues of buying decision modeling in the context of the original motivation theories from psychology. an electronic survey was conducted among respondents belonging to a group of students at a technical university. a kano model was constructed for the identified requirements of students, which were established during the initial stage, and the model was analyzed. this allowed us to classify the requirements into the typical categories (attractive, one-dimensional, and must-be) affecting the satisfaction and dissatisfaction of customers in different ways. it helped us to understand how the requirements work to meet students’ needs. the current performances and importance¹ of the requirements were also measured. it was possible to deduce from the results strong and weak aspects in the services related to education, and to define potential improvements and new services that will have the greatest benefit for students.

a weak point of compositional methods lies in obtaining responses on individual attributes of a product or a service independently of other attributes. products are offered in the market as bundles of attributes. compositional methods are not able to capture the effect of combinations of attributes and are therefore often inaccurate. conjoint analysis deals successfully with this issue. its choice-based approach, tailored for modeling purchasing decisions, was introduced in section 3.

appendix²

¹ the measurement of requirement importance is not shown in this paper.
² the annex contains the list of requirements used in the survey to conclude the kano model. the first level in the list retains the identified requirements; the second level contains the questions used to investigate current performances.

1. meals, housing and social facilities
a. how satisfied are you with the food?
b. how satisfied are you with the accommodation?
c. how satisfied are you with the sanitary facilities (toilets, etc.)?
2. helpful approach of teachers
a. how satisfied are you with the helpfulness of the teachers?
b. how satisfied are you with the study materials?
3. good organization of lessons, optimal schedule and the need to move from place to place
a. how satisfied are you with the arrangement of your schedule?
4. better opportunities through the image of the university
a. how satisfied are you with the offers of employment for graduates from the faculty?
5. top equipment at the faculty for teaching or research work
a. how satisfied are you with the support facilities used in teaching at the faculty?
6. content accordance of the field of study with world trends
a. how satisfied are you with the appeal of your field of study?
7. greater freedom in choice of curriculum
a. how satisfied are you with the current offer of courses?
8. opportunity to gain a recognized certificate for language skills
a. how satisfied are you with the provisions for learning a foreign language?
9. spaces available in the university for various purposes (sports, study halls for team discussions, etc.)
a. how satisfied are you with the spaces for self-study?
b. how satisfied are you with the sports facilities?
c. how satisfied are you with the spaces for parking?
10. opportunities for scientific cooperation with the department during the studies
a. how satisfied are you with the opportunities to participate in projects carried out by the departments?
11. availability of current information, and opportunities to use a uniform and comprehensive is
a. how satisfied are you with the opportunities to obtain current information related to your studies?
12. comfortable work with the kos system and good interaction with the staff of the study department
a. how satisfied are you with the use of the kos system?
b. how satisfied are you with office hours at the study department?
c. how satisfied are you with the behavior of the study department staff?
13. use of the latest technology for teaching and opportunities to use your own notebook
a. how satisfied are you with the technical aids used in your classes?
b. how satisfied are you with opportunities to use your own laptop in your classes?
14. compliance of the curriculum with your ideas about the field that you study
a. how satisfied are you with the range of required courses?
b. how satisfied are you with the content of the individual courses?
15. opportunities to study abroad
a. how satisfied are you with the offers for studying abroad?
16. the interrelation between teaching and practice
a. how satisfied are you with the links between teaching and practical applications?
17. the image and positive evaluation of the faculty in the world, its general recognition and reputation
a. how satisfied are you with the reputation of the university among the public?
18. opportunities to choose interdisciplinary fields of study (a combination of fields)
a. how satisfied are you with the offer of courses from other fields of study?
19. providing a wide range of student discounts
a. how satisfied are you with the offer of student discounts?
20. the existence of an opportunity to continue to a master’s degree focused on the field of software technology and management (stm)
a. how satisfied are you with the possibility of continuing to study for a master’s degree after graduating as a bachelor of stm?

references

[1] kotler, p., keller, k.: marketing management. 13th ed. pearson prentice hall, 2009.
[2] reynolds, t. j., gutman, j.: laddering theory, method, analysis, and interpretation. in: understanding consumer decision making – the means-end approach to marketing and advertising strategy, p. 25–62.
[3] bilgili, b., ünal, s.: kano model application for classifying the requirements of university students. mibes, 2008, p. 31–45.
[4] sauerwein, e., bailom, f., matzler, k., hinterhuber, h.: the kano model: how to delight your customers. international working seminar on production economics, 1996, vol. 1, p. 313–327.
[5] shiba, s., graham, a., walden, d., lee, t. h., stata, r.: a new american tqm: four practical revolutions in management. productivity press, portland, ore., 1993.
[6] berger, c.: kano’s methods for understanding customer-defined quality. center for quality of management journal, 1993, vol. 2, no. 4, p. 3–36.
[7] mcfadden, d.: the choice theory approach to market research. marketing science, 1986, vol. 5, no. 4, p. 275–297.
[8] orme, b.: the cbc system for choice-based conjoint analysis. sawtooth software, inc., 2008, p. 1–27.
about the author

ondřej grünwald was born in karlovy vary in 1978. he received his ing. degree in 2008 from the faculty of electrical engineering of the czech technical university in prague, specializing in economics and management in electrical engineering. currently he is a ph.d. student at the department of economics, management and humanities at the faculty of electrical engineering of ctu in prague. his interests include the application of mathematical and statistical methods in marketing, mainly conjoint analyses and related techniques for marketing research.

ondřej grünwald
e-mail: grunwond@fel.cvut.cz
dept. of economics, management and humanities
czech technical university
technická 2, 166 27 praha, czech republic

acta polytechnica vol. 52 no. 3/2012

resistance spot welding of dissimilar steels

ladislav kolařík¹, miroslav sahul², marie kolaříková¹, martin sahul³, milan turňa², michal felix⁴
¹ czech technical university in prague, department of manufacturing technology, technická 4, 166 07 praha, czech republic
² slovak university of technology, department of welding, paulínska 16, 917 24 trnava, slovak republic
³ slovak university of technology, department of materials engineering, paulínska 16, 917 24 trnava, slovak republic
⁴ fronius česká republika, s. r. o., dolnoměcholupská 1535/14, 102 00 praha, czech republic
correspondence to: ladislav.kolarik@fs.cvut.cz

abstract

this paper presents an analysis of the properties of resistance spot welds between low carbon steel and austenitic crni stainless steel. the thickness of the welded dissimilar materials was 2 mm. a deltaspot welding gun with a process tape was used for welding the dissimilar steels. resistance spot welds were produced with various welding parameters (welding currents ranging from 7 to 8 ka). light microscopy, microhardness measurements across the welded joints, and edx analysis were used to evaluate the quality of the resistance spot welds. the results confirm the applicability of deltaspot welding for this combination of materials.

keywords: resistance spot welding, welded joint, low carbon steel, austenitic stainless steel.

1 introduction

several problems arise when welding dissimilar steels, related mainly to the different physical, chemical and mechanical properties of the welded materials. austenitic stainless steel and low carbon steel possess a good combination of mechanical properties, formability, weldability and resistance to corrosion. this combination of steels is extensively used in the power generation industry [1,2], and research on welding it is carried out at various research institutions [3,4]. the aim of our paper is to analyze and compare the properties of resistance spot welds of low carbon steel and austenitic stainless steel. optical microscopy, microhardness measurements and edx analysis were used to analyse the properties of the spot welded joints.

2 experimental

the welded materials used for resistance spot welding were dc 01 low carbon steel (lcs) and aisi 304 austenitic stainless steel (ass). the welded steels were 2 mm thick, and both metals were delivered in the cold-rolled state. the chemical composition of the steels is given in tables 1 and 2. it was necessary to take various properties of the welded metals into consideration, e.g. thermal conductivity and electrical resistivity. the thermal conductivity of ass and lcs is about 16.2 w·m⁻¹·k⁻¹ and 52 w·m⁻¹·k⁻¹, respectively. the differences in the properties of the metals resulted in an asymmetrical weld nugget.
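a rough way to see why the nugget grows asymmetrically is resistance heating: per unit volume, joule heat at a given current density scales with the local resistivity ($q \propto \rho j^2 t$), so the sheet with the higher resistivity generates more heat. the back-of-the-envelope python snippet below uses the resistivity values quoted in the next paragraph; it is an illustration only, not a calculation from the paper.

```python
# illustrative estimate (not from the paper): at equal current density,
# volumetric joule heating is proportional to the electrical resistivity,
# so the stainless sheet heats more and the nugget is biased toward it.
rho_ass = 72.0   # aisi 304 austenitic stainless steel, micro-ohm * cm
rho_lcs = 14.2   # dc 01 low carbon steel, micro-ohm * cm

print(f"heating ratio ass/lcs ~ {rho_ass / rho_lcs:.1f}x")  # ~5.1x
```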
the electrical resistivity of ass is approximately 72 μω·cm, while the electrical resistivity of lcs is 14.2 μω·cm [6].

table 1: chemical composition of dc 01 low carbon steel (main alloying elements, wt. %)

c ≤ 0.12 | mn ≤ 0.60 | p ≤ 0.045 | s ≤ 0.045

table 2: chemical composition of aisi 304 austenitic stainless steel (main alloying elements, wt. %)

c ≤ 0.07 | cr 17.0 to 19.5 | ni 8.0 to 10.5 | mn ≤ 2.00 | si ≤ 0.045 | p ≤ 0.045 | s ≤ 0.015 | n ≤ 0.11

the resistance spot welds were produced at the fronius technology centre in prague, in cooperation with the faculty of mechanical engineering, czech technical university in prague. the deltaspot system by fronius was used for producing the resistance spot welds. deltaspot is a welding gun (figure 1) with a running process tape (types pt 3200-crni and pt 1200-steel) that protects the electrodes (r+100 and r+500, ∅ 16 mm) from wear and from deposits of sheet coatings. the welding current used in the experiment ranged from 7 to 8 ka; the other parameters were kept constant. the welding parameters are given in table 3.

figure 1: deltaspot x-gun (left) and a detail of the process tapes (right) [5]

table 3: parameters for welding dissimilar steels

sample code | welding current [ka] | current cycle duration [ms] | force [kn] | force cycle duration [ms]
1 | 8.0 | 370 | 4.0/3.4 | 310/480
2 | 7.5 | 370 | 4.0/3.4 | 310/480
3 | 7.0 | 370 | 4.0/3.4 | 310/480

figure 2: example of a welding cycle — dependence of welding current and force on time

figure 3: verification of the weld nugget size (left) and the macrostructure of a selected dissimilar resistance spot weld (iw = 7.5 ka)

figure 4: haz – dc 01 steel interface

the quality of the welded joints was proved by peel testing (en iso 10 447), see figure 3 (left), and the size of the weld nugget was measured. sample 2 was chosen for further analysis. light microscopy (for analysing the macro- and microstructures of the welds), microhardness measurements across the welded joints, and edx analysis were used to analyse the properties of the resistance spot welds that were produced. the macrostructure of the welded joint produced using a welding current of 7.5 ka is documented in figure 3. as the macrostructure suggests, the welded joint is asymmetrical: the fusion zone on the stainless steel side is larger than the fusion zone on the low carbon steel side. similar results were also observed in welds produced with welding currents of 7 and 8 ka. the heat-affected zone (haz) of the dc 01 steel is broader due to the higher thermal conductivity of the low carbon steel sheet. on the basis of the macrostructure analysis, it can be stated that the higher the welding current, the larger the fusion zone. the microstructure of the low carbon steel is fully ferritic. grain refining was observed in the low-temperature haz of the carbon steel (figure 4), where some amounts of pearlite were also present; grain coarsening occurred in the high-temperature haz of the carbon steel.

figure 5: sem image of a dissimilar steels weld nugget

figure 6: sem image of base materials and a welded joint

the structure of the weld nugget obtained by scanning electron microscopy is presented in figure 5. a jeol 7600 f scanning electron microscope with a field emission gun, fitted with an x-max 50 mm² edx detector, was used to obtain the images and for edx analysis. the observation parameters were: accelerating voltage 20 kv, probe current 2 na and working distance 15 mm.
a more detailed view of the welded joint produced with a welding current of 8.0 ka is shown in figure 6.

figure 7: course of microhardness across a welded joint (iw = 7.5 ka)

vickers microhardness measurements across the weld nugget were performed on each sample. the individual indents were made horizontally, in two rows (in the aisi 304 steel and in the dc 01 steel, respectively): the first row was made in the austenitic stainless steel across the weld nugget, and the other in the low carbon steel through the haz and the weld nugget. the distance between the indents in the weld metal was 500 μm; the distance between the indents in the base metals and the haz was 200 μm. the load used in the experimental measurements was 100 g, acting over a period of 10 s. the hardness measurement results are given in figure 7. edx analysis was used for a more detailed study of the welded joint — base metal interface. the area examined by edx analysis, and also the line profiles of the cr, ni, mn and fe elements, are given in figure 8.

3 conclusion

the properties of resistance spot welds of dissimilar steels have been studied, and the influence of the welding parameters on the weld metal size has been evaluated. the size of the weld metal increases with increasing welding current. the haz of the low carbon steel sheet was broader than the haz of the austenitic stainless steel; this is related to the higher thermal conductivity of dc 01 steel. the structure of the low carbon steel is fully ferritic, while aisi 304 steel consists of austenitic grains with the presence of twins. grain refining of the ferrite grains was observed in the low-temperature haz of the carbon steel. the hardness increased in the fusion zone: in the low carbon steel, the increase was from a value of 131 hv0.1, measured in the base metal, to a hardness of 367.9 hv0.1 (measured in the weld metal), and a similarly perceptible increase was attained in the stainless steel sheet, where the hardness measured in aisi 304 steel increased from a value of 186.9 hv0.1 to a value of 359.9 hv0.1. a more detailed study of the welded joint interface was performed by edx analysis. an increase in iron content in the direction from the weld metal towards the dc 01 steel was observed; however, a decrease in cr, mn and ni from the weld metal towards the dc 01 steel was also recorded.

figure 8: a) area studied using edx analysis, b) line profiles of cr, ni, mn and fe across the weld metal — dc 01 steel interface (iw = 7.5 ka)

acknowledgement

the paper was prepared with support from the grant agency of the ministry of education, science, research and sport of the slovak republic and the slovak academy of sciences within project no. 1/2594/12. the research was financed by the czech ministry of education, youth and sport within the framework of project sgs cvut 2010 — ohk2-038/10.

references

[1] arivazhagan, n., singh, s., prakash, s., reddy, g. m.: investigation on aisi 304 austenitic stainless steel to aisi 4140 low alloy steel dissimilar joints by gas tungsten arc, electron beam and friction welding. in: materials and design, 2011, elsevier, p. 3036–3050. issn 0261-3069.
[2] krishnaprasad, k., prakash, r. v.: fatigue crack growth behavior in dissimilar metal weldment of stainless steel and carbon steel. in: engineering and technology, 2009, p. 873–879.
[3] marashi, p., pouranvari, m., amirabdollahian, s., abedi, a., goodarzi, m.: microstructure and failure behavior of dissimilar resistance spot welds between low carbon galvanized and austenitic stainless steels.
in: materials science and engineering, 2008, elsevier, p. 175–180. issn 0921-5093.
[4] torkamany, m. j., sabbaghzadeh, j., hamedi, m. j.: effect of laser welding mode on the microstructure and mechanical performance of dissimilar laser spot welds between low carbon and austenitic stainless steels. in: materials and design, 2012, elsevier, p. 666–672. issn 0261-3069.
[5] deltaspot resistance welding. in: fronius [online]. 2011 [cit. 2012–03–30]. available on: http://www.fronius.com/new/deltaspot
[6] jamasri, ilman, m. n., soekrisno, r., triyono: corrosion fatigue behavior of resistance spot welded dissimilar metal welds between carbon steel and austenitic stainless steel with different thickness. in: procedia engineering, 2011, elsevier, p. 649–654. issn 1877-7058.

acta polytechnica vol. 50 no. 3/2010

on multiple m2-brane model(s) and its n = 8 superspace formulation(s)

i. a. bandos

abstract

we give a brief review of the bagger-lambert-gustavsson (blg) model, with emphasis on its version invariant under the volume preserving diffeomorphisms (sdiff3) symmetry. we describe the on-shell superfield formulation of this sdiff3 blg model in standard n = 8, d = 3 superspace, as well as its superfield action in the pure spinor n = 8 superspace. we also briefly address the aharony-bergman-jafferis-maldacena (abjm/abj) model invariant under su(m)k × su(n)−k gauge symmetry, and discuss the possible form of their n = 6 and, for the case of chern-simons level k = 1,2, n = 8 superfield equations.

1 introduction

in the fall of 2007, motivated by the search for a low-energy description of the multiple m2-brane system, bagger, lambert and gustavsson [1,2,3] proposed an n = 8 supersymmetric superconformal d = 3 model based on the filippov three-algebra [4] instead of a lie algebra.

1.1 3-algebras

lie algebras are defined with the use of antisymmetric brackets $[X, Y] = -[Y, X]$ of two elements, $X = \sum_a X^a t_a$ and $Y = \sum_a Y^a t_a$, called lie brackets or the commutator. the brackets of two lie algebra generators, $[t_a, t_b] = f_{ab}{}^c t_c$, are characterized by antisymmetric structure constants $f_{ab}{}^c = f_{[ab]}{}^c$ which obey the jacobi identity

$$ f_{[ab}{}^d f_{c]d}{}^e = 0 \;\Leftrightarrow\; [t_a,[t_b,t_c]] + [t_c,[t_a,t_b]] + [t_b,[t_c,t_a]] = 0\,. $$

in contrast, the general filippov 3-algebra is defined by 3-brackets

$$ \{t_a, t_b, t_c\} = f_{abc}{}^d\, t_d\,, \qquad f_{abc}{}^d = f_{[abc]}{}^d\,, \qquad (1) $$

which are antisymmetric and obey the so-called ‘fundamental identity’

$$ \{t_a, t_b, \{t_{c_1}, t_{c_2}, t_{c_3}\}\} = 3\,\{\{t_a, t_b, t_{[c_1}\}, t_{c_2}, t_{c_3]}\}\,. \qquad (2) $$

to write an action for some 3-algebra valued field theory, one needs as well to introduce an invariant inner product, or metric,

$$ h_{ab} = \langle t_a, t_b \rangle\,. \qquad (3) $$

then for the metric 3-algebra the structure constants obey $f_{abcd} := f_{abc}{}^e h_{ed} = f_{[abcd]}$. an example of an infinite-dimensional 3-algebra is defined by the nambu brackets (nb) [5] of functions on a 3-dimensional manifold $M_3$,

$$ \{\phi, \xi, \omega\} = \varepsilon^{ijk}\, \partial_i\phi\, \partial_j\xi\, \partial_k\omega\,, \qquad \partial_i := \partial/\partial y^i\,, \quad i = 1,2,3\,. \qquad (4) $$

here $y^i = (y^1, y^2, y^3)$ are local coordinates on $M_3$; $\phi = \phi(y)$, $\xi = \xi(y)$ and $\omega = \omega(y)$ are functions on $M_3$, and $\varepsilon^{ijk}$ is the levi-civita symbol (it is convenient to define the nb using a constant scalar density $e$ [6], but this is not important for our present discussion, and we simplify the notation by setting $e = 1$). these brackets are invariant with respect to the volume preserving diffeomorphisms of $M_3$, which we call sdiff3 transformations. in practical applications one needs to assume compactness of $M_3$; for our discussion here it is sufficient to assume that $M_3$ has the topology of the sphere $S^3$.
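since the nambu bracket (4) is just the jacobian determinant of three functions with respect to $(y^1, y^2, y^3)$, it is easy to experiment with symbolically. the following sympy sketch (an illustration only, with arbitrary sample functions) computes the bracket and confirms its total antisymmetry.

```python
import sympy as sp

# minimal sketch of the nambu bracket (4): {f, g, h} = eps^{ijk} d_i f d_j g d_k h,
# i.e. the jacobian determinant of (f, g, h) with respect to (y1, y2, y3).
y = sp.symbols("y1 y2 y3")

def nambu(f, g, h):
    jac = sp.Matrix([[sp.diff(w, yi) for yi in y] for w in (f, g, h)])
    return sp.simplify(jac.det())

f, g, h = y[0]**2, y[1] * y[2], sp.sin(y[0]) + y[2]
assert sp.simplify(nambu(f, g, h) + nambu(g, f, h)) == 0  # total antisymmetry
print(nambu(y[0], y[1], y[2]))                            # prints 1: {y1, y2, y3} = 1
```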
another example of a 3-algebra, which was present already in the first paper by bagger and lambert [1], is $\mathcal{A}_4$, realized by generators $t_a$, $a = 1,2,3,4$, obeying

$$ \{t_a, t_b, t_c\} = \varepsilon_{abcd}\, t_d\,, \qquad a,b,c,d = 1,2,3,4\,. \qquad (5) $$

these are related to the 6 generators $m_{ab}$ of so(4) as the euclidean d = 4 dirac matrices are related to the spin(4) = su(2) × su(2) generators: $t_a \leftrightarrow \gamma_a$, $m_{ab} \leftrightarrow \frac{1}{2}\gamma_{ab} := \frac{1}{4}(\gamma_a\gamma_b - \gamma_b\gamma_a)$. more general types of 3-algebras, with not completely antisymmetric structure constants, were discussed e.g. in [7], [8] and [9]. in particular, as was shown in [8], the aharony-bergman-jafferis-maldacena (abjm) model [10] is based on a particular ‘hermitian 3-algebra’, the 3-brackets of which can be defined on two $m \times n$ (complex) matrices $Z^i$, $Z^j$ and an $n \times m$ (complex) matrix $Z^\dagger_k$ by [8]

$$ [Z^i, Z^j; Z^\dagger_k]_{m\times n} = Z^i Z^\dagger_k Z^j - Z^j Z^\dagger_k Z^i\,. \qquad (6) $$

(contribution to selected topics in mathematical and particle physics, prague 2009; procs. of the conference on the occasion of prof. jiri niederle’s 70th birthday.)

1.2 blg action

the blg model on a general 3-algebra is described in terms of an octet of 3-algebra valued scalar fields in the vector representation of so(8), $\phi^I(x) = \phi^{Ia}(x)\, t_a$, an octet of 3-algebra valued spinor fields in a spinor (say, s-spinor) representation of so(8), $\psi_\alpha{}^A(x) = \psi_\alpha{}^{Aa}(x)\, t_a$, and the vector gauge field $A^{ab}_\mu$ in the bi-fundamental representation of the 3-algebra. the blg lagrangian reads

$$ \mathcal{L}_{BLG} = \mathrm{Tr}\Big[ -\tfrac{1}{2}\,|D\phi|^2 - \tfrac{g^2}{12}\,\{\phi^I, \phi^J, \phi^K\}^2 - \tfrac{i}{2}\,\bar\psi\gamma^\mu D_\mu\psi + \tfrac{ig}{4}\,\{\phi^I, \phi^J, \bar\psi\}\,\rho^{IJ}\psi \Big] + \tfrac{1}{2g}\,\mathcal{L}_{CS}\,, \quad I = 1,\dots,8\,, \qquad (7) $$

where $g$ is a real dimensionless parameter and $\mathcal{L}_{CS}$ is the chern-simons (cs) term for the gauge potential $A_{\mu\,b}{}^a = A^{cd}_\mu f_{dcb}{}^a$, which is also used to define the covariant derivatives of the scalar and spinor fields. the spin(8) indices are suppressed in (7); $\rho^I := \rho^I_{A\dot B}$ are the 8 × 8 spin(8) ‘sigma’ matrices (clebsch-gordan coefficients relating the vector 8v and the two spinor, 8s and 8c, representations of so(8)). these obey $\rho^I\tilde\rho^J + \rho^J\tilde\rho^I = 2\delta^{IJ}\,\mathbb{1}$ with their transpose $\tilde\rho^I := \tilde\rho^{I\,\dot A B}$; notice that $\rho^{IJ} := (\rho^{[I}\tilde\rho^{J]})_A{}^B$ and $\tilde\rho^{IJ} := (\tilde\rho^{[I}\rho^{J]})_{\dot A}{}^{\dot B}$ are antisymmetric in their spinor indices.

this model possesses n = 8 supersymmetry and superconformal symmetries, the set of which includes 8 special conformal supersymmetries; hence the total number of supersymmetry parameters is 2 × 8 + 2 × 8 = 32. this coincides with the number of supersymmetries possessed by the m2-brane [11], and the conformal symmetry was expected for the infrared fixed point (low-energy approximation) of the multiple m2-brane system [12]. thus, the action (7) was expected to play for the multiple m2-brane system the same rôle as is played by the u(n) sym action for the multiple dp-brane system [13] (with n dp-branes). however, if this were the case, the number of generators of the filippov 3-algebra would be related somehow to the number of m2-branes composing the system whose low-energy limit is described by the action (7). this expectation conflicts with the relatively poor structure of the set of finite-dimensional filippov 3-algebras with positive-definite metric (3): this set was proved to contain only the direct sums of $\mathcal{A}_4$ and trivial one-dimensional 3-algebras (see [14,15] as well as [16] and refs. therein). a very useful rôle in the search for a resolution of this paradox was played by the analysis of van raamsdonk [17], who reformulated the $\mathcal{A}_4$ blg model in matrix notation.
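the fundamental identity is easy to verify numerically for $\mathcal{A}_4$, whose structure constants are just the levi-civita symbol of eq. (5). the standalone numpy sketch below (not code from the paper) checks it in the equivalent ‘leibniz’ form $\{a,b,\{x,y,z\}\} = \{\{a,b,x\},y,z\} + \{x,\{a,b,y\},z\} + \{x,y,\{a,b,z\}\}$.

```python
import numpy as np
from itertools import permutations

def parity(p):
    """sign of a permutation of 0..n-1, computed by sorting with swaps."""
    p, sign = list(p), 1
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j], sign = p[j], p[i], -sign
    return sign

# structure constants of a4, eq. (5): f_{abc}^d = eps_{abcd}
f = np.zeros((4, 4, 4, 4))
for p in permutations(range(4)):
    f[p] = parity(p)

# {t_a,t_b,{t_x,t_y,t_z}} versus the sum of brackets acting on each slot in turn
lhs = np.einsum('xyzw,abwv->abxyzv', f, f)
rhs = (np.einsum('abxw,wyzv->abxyzv', f, f)
       + np.einsum('abyw,xwzv->abxyzv', f, f)
       + np.einsum('abzw,xywv->abxyzv', f, f))
assert np.allclose(lhs, rhs)
print("fundamental identity holds for a4")
```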
this was used by aharony, bergman, jafferis and maldacena [10] to formulate an su(n)k × su(n)−k and then [26] an su(m)k × su(n)−k gauge invariant cs-plus-matter model, which is believed to describe the low-energy multiple m2-brane dynamics. the subscript k denotes the so-called cs level, that is, the integer coefficient in front of the cs term in the action of the cs-plus-matter models. in the dual description of the abjm model by m-theory on ads4 × s7/zk [10], the same integer k characterizes the quotient of the 7-sphere. the abjm/abj model possesses only n = 6 manifest supersymmetries, which is natural for k > 2, as the ads4 × s7/zk backgrounds with k > 2 preserve only 24 of the 32 m-theory supersymmetries. the nonperturbative restoration of n = 8 supersymmetry for the k = 1,2 cases was conjectured already in [10]. recently this enhancement of supersymmetry was studied in [9], where its relation to some special ‘identities’ (which we propose to call gr-identities, or gustavsson-rey identities), conjectured to hold due to the properties of monopole operators specific to k = 1,2, is proposed. we shortly discuss the abjm/abj model in the concluding part of this paper.

1.3 nb blg action

coming back to the 3-algebra blg models, we notice that inside their set there are clear candidates for the n → ∞ limit of the multiple m2-brane system, which one can view as describing possible ‘condensates’ of coincident planar m2-branes. these are the blg theories in which the filippov 3-algebra is realized by the nambu bracket (4) of functions defined on some 3-manifold $M_3$. this model was conjectured [18,19] to be related to the m5-brane [20,21,22] wrapped over $M_3$ (see [6] and the recent [23] for further study of this proposal), and was put in the general context of sdiff3 gauge theories in [24]. it is described in terms of a spin(8) 8v-plet of real scalar fields $\phi^I$ ($I = 1,\dots,8$) and a spin(8) 8s-plet of majorana anticommuting sl(2,r) spinor fields $\psi^A$ ($A = 1,\dots,8$), both on the cartesian product of 3-dimensional minkowski spacetime with some 3-dimensional closed manifold without boundary, $M_3$. these fields transform as scalars with respect to sdiff3: $\delta_\xi\phi = -\xi^i\partial_i\phi$, $\delta_\xi\psi = -\xi^i\partial_i\psi$, where $\xi^i = \xi^i(y)$ is a divergenceless sdiff3 parameter. the action of this nambu bracket realization of the bagger-lambert-gustavsson model (nb blg model) is

$$ \mathcal{L}_{NB\,BLG} = \oint d^3y \Big[ -\tfrac{1}{2}\, e\, |D\phi|^2 - \tfrac{i}{2}\, e\, \bar\psi\gamma^\mu D_\mu\psi + \tfrac{ig}{4}\,\varepsilon^{ijk}\,\partial_i\phi^I\,\partial_j\phi^J \big(\partial_k\bar\psi\,\rho^{IJ}\psi\big) - \tfrac{g^2}{12}\, e\,\{\phi^I, \phi^J, \phi^K\}^2 \Big] + \tfrac{1}{2g}\,\mathcal{L}_{CS}\,. \qquad (8) $$

in (8) the trace $\mathrm{Tr}$ of (7) is replaced by the integral $\oint d^3y$ over $M_3$, and $\mathcal{L}_{CS}$ is the cs-like term involving the sdiff3 gauge potential $S^i$ and the gauge pre-potential $A_i$ [24]. the gauge potential $S^i = dx^\mu S^i_\mu$ transforms under the local sdiff3 with $\xi^i = \xi^i(x,y)$ as $\delta_\xi S^i = d\xi^i - \xi^j\partial_j S^i + S^j\partial_j\xi^i$ and is used to construct the sdiff3-covariant derivatives of the scalar and spinor fields,

$$ D\phi = d\phi + S^i\partial_i\phi\,, \qquad D\psi = d\psi + S^i\partial_i\psi\,. \qquad (9) $$

as the gauge field takes values in the lie algebra of the lie group of gauge symmetries, and this is associated with volume preserving diffeomorphisms whose infinitesimal parameter is a divergenceless three-vector $\xi^i(x,y)$, $\partial_i\xi^i = 0$, the sdiff3 gauge field $S^i = dx^\mu S^i_\mu(x,y)$ obeys

$$ \partial_i S^i \equiv 0 \;\Leftrightarrow\; \partial_i S^i_\mu \equiv 0\,, \qquad (10) $$

which implies the possibility to express it, at least locally, in terms of the gauge pre-potential one-form $A_i = dx^\mu A_{\mu\,i}(x)$,

$$ S^i = \varepsilon^{ijk}\partial_j A_k \;\Leftrightarrow\; S^i_\mu = \varepsilon^{ijk}\partial_j A_{\mu\,k}\,. \qquad (11) $$

also the covariant field strength

$$ F^i = dS^i + S^j\partial_j S^i = \tfrac{1}{2}\, dx^\mu \wedge dx^\nu\, F^i_{\nu\mu} \qquad (12) $$
satisfies the additional identity

$$ \partial_i F^i \equiv 0 \;\Leftrightarrow\; \partial_i F^i_{\mu\nu} \equiv 0 \qquad (13) $$

and can be expressed (locally) in terms of the pre-field strength,

$$ F^i = \varepsilon^{ijk}\partial_j G_k \;\Leftrightarrow\; F^i_{\mu\nu} = \varepsilon^{ijk}\partial_j G_{\mu\nu\,k}\,, \qquad (14) $$

$$ G_i = dA_i + S^j\partial_{[j} A_{i]} = \tfrac{1}{2}\, dx^\mu \wedge dx^\nu\, G_{\nu\mu\,i}\,. \qquad (15) $$

the cs-like term in (8) is expressed through the gauge potential and pre-potential by

$$ \mathcal{L}_{CS} = \oint d^3y\, \epsilon^{\mu\nu\rho} \Big[ (\partial_\mu S^i_\nu)\, A_{\rho\,i} - \tfrac{1}{3}\,\epsilon_{ijk}\, S^i_\mu S^j_\nu S^k_\rho \Big]\,, \qquad (16) $$

or, in terms of differential forms, by $\mathcal{L}_{CS} = \oint d^3y \big[ dS^i \wedge A_i - \tfrac{1}{3}\,\epsilon_{ijk}\, S^i \wedge S^j \wedge S^k \big]$. the formal exterior derivative of $\mathcal{L}_{CS}$ can be expressed through the field strength and pre-field strength by

$$ d\mathcal{L}_{CS} = \oint d^3y\, F^i \wedge G_i\,. \qquad (17) $$

the lagrangian density (8) varies into a total spacetime derivative under the following infinitesimal supersymmetry transformations with an 8c-plet of constant anticommuting spinor parameters $\epsilon^\alpha_{\dot A}$ ($\dot A = 1,\dots,8$):

$$ \delta\phi^I = i\,\bar\epsilon\,\tilde\rho^I\psi\,, \qquad \delta A_{\mu\,i} = -ig\,\big(\bar\epsilon\,\gamma_\mu\tilde\rho^I\psi\big)\,\partial_i\phi^I\,, \qquad \delta\psi = \Big[ \gamma^\mu\rho^I D_\mu\phi^I - \tfrac{g}{6}\,\{\phi^I, \phi^J, \phi^K\}\,\rho^{IJK} \Big]\,\epsilon\,. \qquad (18) $$

the blg equations of motion are

$$ D^\mu D_\mu\phi^I = \tfrac{ig}{2}\,\varepsilon^{ijk}\,\partial_i\phi^J\,\partial_j\bar\psi\,\rho^{IJ}\,\partial_k\psi - \tfrac{g^2}{2}\,\big\{\{\phi^I, \phi^J, \phi^K\}, \phi^J, \phi^K\big\}\,, $$
$$ \gamma^\mu D_\mu\psi = -\tfrac{g}{2}\,\rho^{IJ}\,\{\phi^I, \phi^J, \psi\}\,, \qquad (19) $$
$$ F^i_{\mu\nu} = -g\,\epsilon_{\mu\nu\rho}\,\varepsilon^{ijk} \Big[ \partial_j\phi^I\, D^\rho\partial_k\phi^I - \tfrac{i}{2}\,\partial_j\bar\psi\,\gamma^\rho\partial_k\psi \Big]\,. $$

2 nb blg in n = 8 superfields

the nb blg equations of motion can be obtained from a set of superfield equations in n = 8 superspace [30]. we will review this approach in this section. let us introduce an 8v-plet of scalar, and sdiff3-scalar, superfields $\Phi^I$, the lowest component of which (also denoted by $\phi^I$) may be identified with the blg scalar fields, and impose on it the following superembedding-like equation [30]¹

$$ \mathcal{D}_{\alpha\dot A}\Phi^I = i\,\tilde\rho^I_{\dot A B}\,\Psi_\alpha{}^B\,. \qquad (20) $$

the sdiff3-covariant spinorial derivatives on n = 8 superspace, entering (20),

$$ \mathcal{D}_{\alpha\dot A} = D_{\alpha\dot A} + \varsigma_{\alpha\dot A}{}^i\,\partial_i\,, \qquad (21) $$

are constrained to obey the following algebra [30]:

$$ [\mathcal{D}_{\alpha\dot A}, \mathcal{D}_{\beta\dot B}]_+ = 2i\,\delta_{\dot A\dot B}\,(C\gamma^\mu)_{\alpha\beta}\,\mathcal{D}_\mu + 2i\,\epsilon_{\alpha\beta}\, W_{\dot A\dot B}{}^i\,\partial_i\,, \qquad (22) $$

where $\mathcal{D}_\mu = \partial_\mu + i S^i_\mu\partial_i$ is the 3-vector covariant derivative, which obeys

$$ [\mathcal{D}_{\alpha\dot A}, \mathcal{D}_\mu] = F_{\alpha\dot A\,\mu}{}^i\,\partial_i\,, \qquad [\mathcal{D}_\mu, \mathcal{D}_\nu] = F_{\mu\nu}{}^i\,\partial_i\,. \qquad (23) $$

eqs. (22), (23) are equivalent to the ricci identity $\mathcal{D}\mathcal{D} = F^i\partial_i$ for the covariant exterior derivative $\mathcal{D} := d + S^i\partial_i = e^{\alpha\dot A}\mathcal{D}_{\alpha\dot A} + e^\mu\mathcal{D}_\mu$, plus the constraint $F_{\alpha\dot A\,\beta\dot B}{}^i = 2i\,C_{\alpha\beta}\,W_{\dot A\dot B}{}^i$. the basic sdiff3 gauge superfield strength $W_{\dot A\dot B}{}^i$ is antisymmetric in its c-spinor indices (this is to say, $W_{\dot A\dot B}{}^i$ is in the 28 of so(8)); it is also divergence-free, so

$$ W_{\dot A\dot B}{}^i = -W_{\dot B\dot A}{}^i\,, \qquad \partial_i W_{\dot A\dot B}{}^i = 0\,. \qquad (24) $$

using the bianchi identity $\mathcal{D}F^i = 0$, one finds that

$$ F_{\alpha\dot A\,\mu}{}^i = i\,(\gamma_\mu W_{\dot A}{}^i)_\alpha\,, \qquad W_{\alpha\dot B}{}^i := \tfrac{i}{7}\,\mathcal{D}_{\alpha\dot A} W_{\dot A\dot B}{}^i\,, \qquad F_{\mu\nu}{}^i = \tfrac{1}{16}\,\epsilon_{\mu\nu\rho}\,\mathcal{D}_{\dot A}\gamma^\rho W_{\dot A}{}^i\,, \qquad (25) $$

and that

$$ \mathcal{D}_{\alpha(\dot A} W_{\dot B)\dot C}{}^i = i\,W_{\alpha\dot D}{}^i \big( \delta_{\dot D(\dot A}\,\delta_{\dot B)\dot C} - \delta_{\dot D\dot C}\,\delta_{\dot A\dot B} \big)\,, \qquad (26) $$

$$ \mathcal{D}_{\dot A\alpha} W_{\beta\dot B}{}^i = (C\gamma^\mu)_{\alpha\beta} \big( \mathcal{D}_\mu W_{\dot A\dot B}{}^i - 4\,\delta_{\dot A\dot B}\, W_\mu{}^i \big)\,. \qquad (27) $$

we see that the sdiff field strength supermultiplet includes a scalar 28 ($W_{\dot A\dot B}{}^i$), a spinor 8c ($W_{\alpha\dot A}{}^i$) and a singlet divergence-free vector ($W^{\mu\,i} = \mathcal{D}_{\dot A}\gamma^\mu W_{\dot A}{}^i$). there are many other independent components, but these become dependent on-shell, as far as we are searching for a description of a chern-simons (cs) rather than a yang-mills theory. the relevant super-chern-simons (super-cs) superfield equation in the absence of ‘matter’ supermultiplets is obviously $W_{\dot A\dot B}{}^i = 0$, since this sets to zero all sdiff3 field strengths; in particular it implies $F^i_{\mu\nu} = 0$. in the presence of matter, the super-cs equation may get a nonvanishing right-hand side.

¹ the name comes from the observation that (20) can be obtained from the superembedding equation for a single m2-brane [25] by first linearizing with respect to the dynamical fields in the static gauge, and then covariantizing the result with respect to sdiff3.
indeed, acting on the superembedding-like equation (20) with an sdiff3-covariant spinor derivative, and making use of the anticommutation relation (22), one finds that

$$ \mathcal{D}_{\alpha[\dot A}\,\tilde\rho^I_{\dot B]C}\,\Psi^{\alpha C} = 2\, W_{\dot A\dot B}{}^i\,\partial_i\Phi^I\,, $$

which is solved by the ‘super-cs’ equation [30]

$$ W_{\dot A\dot B}{}^i = 2g\,\varepsilon^{ijk}\,\partial_i\Phi^I\,\partial_j\Phi^J\,\tilde\rho^{IJ}_{\dot A\dot B}\,. \qquad (28) $$

it was shown in [30] that the two n = 8 superfield equations (20) and (28) imply the nambu-bracket blg equations (19).

3 nb blg in pure-spinor superspace

an n = 8 superfield action for the abstract blg model, i.e. for the blg model based on a finite-dimensional 3-algebra, which in practical terms implies $\mathcal{A}_4$ or the direct sum of several $\mathcal{A}_4$ and trivial 3-algebras, was proposed by cederwall [28]. its generalization to the case of the nb blg model invariant under the infinite-dimensional sdiff3 gauge symmetry, constructed in [24], will be reviewed in this section. the pure-spinor superspace of [28] is parametrized by the standard n = 8, d = 3 superspace coordinates $(x^\mu, \theta^\alpha_{\dot A})$ together with additional pure spinor coordinates $\lambda^\alpha_{\dot A}$. these are described by an 8c-plet of complex commuting d = 3 spinors satisfying the ‘pure spinor’ constraint

$$ \lambda\gamma^\mu\lambda := \lambda^\alpha_{\dot A}\,\gamma^\mu_{\alpha\beta}\,\lambda^\beta_{\dot A} = 0\,. \qquad (29) $$

this is a variant of the d = 10 pure-spinor superspace first proposed by howe [31] (see [32] for an earlier attempt to use pure spinors in the sym and supergravity context). from a more general perspective, the approach of [28] can be considered as a realization of the harmonic superspace programme of [33] (although one cannot state that the algebra of all the symmetries of the superfield action of [28] closes off shell, i.e. without the use of the equations of motion). the d = 10 pure spinors are also the central element of the berkovits approach to the covariant description of the quantum superstring [34]. in this approach the pure spinors are considered to be the ghosts of a local fermionic gauge symmetry related to the κ-symmetry of the standard green-schwarz formulation. this ‘ghost nature’ may be considered as a justification for the assumption (made in [28,24] and here) that the pure-spinor superfields are analytic functions of λ that can be expanded as a taylor series in powers of λ. to discuss the blg model, we allow all the pure spinor superfields to depend also on the local coordinates $y^i$ of the auxiliary compact 3-dimensional manifold $M_3$. following [28], we define the brst-type operator (cf. [34])

$$ Q := \lambda^\alpha_{\dot A}\,\mathcal{D}_{\alpha\dot A}\,, \qquad (30) $$

which satisfies $Q^2 \equiv 0$ as a consequence of the pure spinor constraint (29). we now introduce the 8v-plet of complex scalar n = 8 ‘matter’ superfields $\Phi^I$, with the sdiff3 transformation

$$ \delta\Phi^I = \xi^i\partial_i\Phi^I \qquad (31) $$

characterized by the commuting $M_3$-vector parameter $\xi^i = \xi^i(y)$. we allow these superfields to be complex because they may depend on the complex pure spinor λ but, to make contact with the spacetime blg model, we assume that the leading term in their decomposition in a power series in the complex λ,

$$ \Phi^I = \phi^I + \mathcal{O}(\lambda)\,, \qquad (32) $$

is given by a real 8v-plet of ‘standard’ n = 8 scalar superfields, like the basic objects of sec. 2. let us consider the (complex and anticommuting) lagrangian density

$$ L^0_{mat} = \tfrac{1}{2}\, M_{IJ} \oint d^3y\, e\, \Phi^I Q\Phi^J\,, \qquad (33) $$

where $M_{IJ} = \lambda^\alpha_{\dot A}\,\tilde\rho^{IJ}_{\dot A\dot B}\,\lambda_{\alpha\dot B}$ is one of the two nonvanishing analytic pure spinor bilinears

$$ M_{IJ} := \lambda^\alpha\tilde\rho^{IJ}\lambda_\alpha\,, \qquad N^\mu_{IJKL} := \lambda\gamma^\mu\tilde\rho^{IJKL}\lambda\,. \qquad (34) $$

it is important that, due to (29), these obey the identities (see [24] for a detailed proof)

$$ M_{IJ}\,\tilde\rho^J\lambda \equiv 0\,, \qquad M_{[IJ}\, M_{KL]} = 0\,, \qquad N_{PQ[IJ} \cdot N_{KL]PQ} \equiv 0\,. \qquad (35) $$

to construct the n = 8 supersymmetric action with the use of the lagrangian (33), one needs to specify an adequate superspace integration measure.
we refer to [29] for details on such a measure, which has the crucial property of allowing us to discard brst-exact terms when varying with respect to $\Phi^I$. then, as a consequence of this and also of the identities (35), the action is invariant under the gauge symmetries $\delta\Phi^I = \lambda^\alpha_{\dot A}\tilde\rho^I_{\dot A B}\,\zeta_\alpha{}^B + Q K^I$ for arbitrary pure-spinor-superfield parameters $\zeta_\alpha$ and $K^I$. the variation with respect to $\Phi^I$ yields the superfield equation

$$ M_{IJ}\, Q\Phi^J = 0\,, \qquad (36) $$

which implies, as a consequence of the pure-spinor identities, that

$$ Q\Phi^I = \lambda\tilde\rho^I\Theta \qquad (37) $$

for some 8s-plet of complex spinor superfields $\Theta^{\alpha A}$. the first nontrivial (∼ λ) term in the λ-expansion of this equation is precisely the free-field limit of the on-shell superspace constraint (20), $D_{\alpha\dot A}\Phi^I = i\tilde\rho^I_{\dot A B}\Psi_\alpha{}^B$, with $\Psi = \Theta|_{\lambda=0}$.² in the light of the results of sec. 2, this implies that the free-field (g → 0) limit of the nb blg field equations (19) can be obtained from the pure spinor superspace action (33).

now, as the free-field limit is reproduced, to construct the pure spinor superspace description of the nb blg system we need to describe its gauge field (chern-simons) sector and to use it to gauge the sdiff3 invariance. to this end, we introduce an $M_3$-vector-valued complex anticommuting scalar $\Psi^i$ with the sdiff3 gauge transformations

$$ \delta\Psi^i = Q\xi^i + \Psi^j\partial_j\xi^i - \xi^j\partial_j\Psi^i\,, \qquad \partial_i\xi^i = 0\,, \qquad (38) $$

involving the commuting $M_3$-vector parameter $\xi^i = \xi^i(x,\theta,\lambda;y^j)$ and its derivatives. in the present context, $\Psi^i$ will play the role of the sdiff3 gauge potential. we require that $\partial_i\Psi^i = 0$, so that, locally on $M_3$,

$$ \Psi^i = \varepsilon^{ijk}\partial_j\pi_k\,, \qquad (39) $$

where $\pi_i$ is the complex anticommuting, and spacetime scalar, pre-gauge potential of this formalism. using $\Psi^i$ we can define an sdiff3-covariant extension of $Q\Phi^I$ by

$$ \mathcal{Q}\Phi^I := Q\Phi^I + \Psi^i\partial_i\Phi^I \qquad (40) $$

and construct the generalization of (33) invariant under the local sdiff3 symmetry (31), (38):

$$ L_{mat} = \tfrac{1}{2}\, M_{IJ} \oint d^3y\, e\, \Phi^I \mathcal{Q}\Phi^J\,, \qquad M_{IJ} = \lambda\,\tilde\rho^{IJ}\lambda\,. \qquad (41) $$

next we have to construct the (complex and fermionic) lagrangian density $\mathcal{L}_{CS}$ describing the (chern-simons) dynamics of the gauge potential $\Psi^i$. to this end we introduce the field-strength superfield

$$ \mathcal{F}^i := Q\Psi^i + \Psi^j\partial_j\Psi^i = \varepsilon^{ijk}\partial_j\mathcal{G}_k\,, \qquad (42) $$

where the last equality is valid locally on $M_3$ and

$$ \mathcal{G}_i := Q\pi_i + \Psi^j\partial_j\pi_i \qquad (43) $$

is the pre-field-strength superfield of this formalism. both $\mathcal{F}^i$ and $\mathcal{G}_i$ are sdiff3 covariant, so $\mathcal{F}^i\mathcal{G}_i$ is an sdiff3 scalar. furthermore, the integral of this density over $M_3$ is q-exact, in the sense that

$$ \int d^3y\, e\, \mathcal{F}^i\mathcal{G}_i = Q\,\mathcal{L}_{CS}\,, \qquad (44) $$

where

$$ \mathcal{L}_{CS} = \int d^3y\, e\, \Big( \pi_i\, Q\Psi^i - \tfrac{1}{3}\,\epsilon_{ijk}\,\Psi^i\Psi^j\Psi^k \Big) \qquad (45) $$

is the complex and anticommuting cs-type lagrangian density [24] which can be used, together with $L_{mat}$ of (41), to construct the candidate lagrangian density of the nb blg model,

$$ L = L_{mat} - \tfrac{1}{g}\,\mathcal{L}_{CS}\,. \qquad (46) $$

the $\pi_i$ equation of motion of this combined lagrangian is

$$ \mathcal{F}^i = \tfrac{g}{2e}\, M_{IJ}\,\varepsilon^{ijk}\,\partial_j\Phi^I\,\partial_k\Phi^J\,. \qquad (47) $$

at this stage it is important to assume that $\Psi^i$ has ‘ghost number one’ [28], which means that it is a power series in λ with vanishing zeroth-order term (and similarly for its pre-potential $\pi_i$); in other words,

$$ \Psi^i = \lambda^\alpha_{\dot A}\,\varsigma^i_{\alpha\dot A}\,, \qquad (48) $$

where $\varsigma^i$ is an $M_3$-vector-valued 8c-plet of arbitrary anticommuting spinors. its zeroth component in the λ-expansion is the fermionic sdiff3 potential introduced, with the same symbol, in (21). with this ‘ghost number’ assumption, (47) produces at the lowest nontrivial order (∼ λ²) the superspace constraints (22) for the ‘ghost number zero’ contribution $\varsigma^i|_{\lambda=0}$ to the pure spinor superfield $\varsigma^i$ in (48), accompanied by the super-cs equation (28) for the field strength $W_{\dot A\dot B}$ constructed from this potential.
a heuristic justification of the assumption (48), so crucial for obtaining the correct super-cs equations, can be found in the fact that with this form of $\Psi^i$ the covariantized brst operator in (40) does not contain a contribution of ghost number zero, i.e. it has the form of (30), $Q = \lambda^\alpha_{\dot A}\,\mathcal{D}_{\alpha\dot A}$, but with the sdiff3-covariant grassmann derivative $\mathcal{D}_{\alpha\dot A} = D_{\alpha\dot A} + \xi^i_{\alpha\dot A}\,\partial_i$.²

varying the interacting action with respect to $\Phi^I$ results in the sdiff3 gauge invariant generalization of eq. (36),

$$ M_{IJ}\,\mathcal{Q}\Phi^J = 0\,, \qquad (49) $$

which contains, as the first nontrivial (∼ λ³) term in the λ-expansion, precisely the superembedding-like equation (20) with $\Psi = \Theta|_{\lambda=0}$. we have now shown, following [24], how the on-shell n = 8 superfield formulation of sec. 2, and hence all the blg field equations (19), may be extracted from the equations of motion derived from the pure spinor superspace action (46). of course, the field content and equations of motion should be analyzed at all higher orders in the λ-expansion. to this end, one must take into account the existence of the additional gauge invariance [28,29]

$$ \delta\Phi^I = \bar\lambda\tilde\rho^I\zeta + (Q + \Psi^j\partial_j)\,K^I\,, \qquad \delta\pi_i = K^I M_{IJ}\,\partial_i\Phi^J\,, \qquad (50) $$

for arbitrary pure-spinor-superfield parameters $\zeta_\alpha$ and $K^I$. what one can certainly state, even without a detailed analysis of these symmetries, is that, if additional fields are present inside the pure spinor superfields of the model (46), they are decoupled from the blg fields, in the sense that they do not enter the equations of motion of the blg fields which are obtained from the pure spinor superspace equations. this allowed us [24], following the terminology of [28], to call (46) the n = 8 superfield action for the nb blg model.

² notice that the above-mentioned gauge symmetry $\delta\Phi^I = \lambda^\alpha_{\dot A}\tilde\rho^I_{\dot A B}\zeta_\alpha{}^B$ of the action (33) contributes to $\delta(Q\Phi^I)$ terms of at least second order in λ. then the induced transformation of the pure spinor superfield $\Theta^{\alpha A}$ in (37) is of first order in λ, so that $\Psi^{\alpha A} = \Theta^{\alpha A}|_{\lambda=0}$, entering the superembedding-like equation (20), is inert under those transformations.

4 remarks on the abjm/abj model

the n = 6 pure spinor superspace action for the abjm model [10], invariant under su(n)k × su(n)−k gauge symmetry, was proposed in [29].³ one can extract the standard (not pure spinor) n = 6 superspace equations by varying the action of [29] and fixing its gauge symmetries. it is also instructive (and probably simpler) to develop independently the on-shell n = 6 superspace formalism for the abjm as well as for the abj [26] model invariant under su(m)k × su(n)−k symmetry [37]. for any value of the cs level k, the starting point of the on-shell n = 6 superfield formalism could be the following (superembedding-like) superspace equation for the complex m × n matrix superfield $Z^i$ [37]⁴

$$ D^I_\alpha Z^i = \tilde\gamma^{I\,ij}\,\Psi_{\alpha j}\,, \qquad I = 1,2,\dots,6\,, \quad i,j = 1,2,3,4\,. \qquad (51) $$

here $\tilde\gamma^{I\,ij} = \tfrac{1}{2}\,\epsilon^{ijkl}\gamma^I_{kl} = -(\gamma^I_{ij})^*$ and $\gamma^I_{ij} = -\gamma^I_{ji}$ are so(6) clebsch-gordan coefficients (generalized pauli matrices), which obey $\gamma^I\tilde\gamma^J + \gamma^J\tilde\gamma^I = \delta^{IJ}$. the matrix superfield $Z^i$ carries the (m, n̄) representation of the su(m) × su(n) gauge group. its hermitian conjugate $Z^\dagger_i$ is an n × m matrix carrying the (m̄, n) representation and obeying $D^I_\alpha Z^\dagger_i = \gamma^I_{ij}\,\Psi^{\dagger j}_\alpha$. note that, although in the original abjm model [10] m = n, the n × n matrix superfields $Z^i$ and $Z^\dagger_i$ carry different representations of su(n) × su(n): (n, n̄) and (n̄, n), respectively. here we speak in terms of the case with m ≠ n, which is terminologically simpler, but all our arguments clearly also apply for m = n.
the grassmann spinorial covariant derivatives $D^I_\alpha$ in (51) include the gauge group su(m) × su(n) connection and obey the algebra

$$ \{D^I_\alpha, D^J_\beta\} = i\,\gamma^a_{\alpha\beta}\,\delta^{IJ} D_a + i\,\epsilon_{\alpha\beta}\, W^{IJ}\,. \qquad (52) $$

this algebra involves the 15-plet of basic field strength superfields $W^{IJ} = -W^{JI}$, which can be expressed through the matter superfields by the following n = 6 super-cs equations [37]:

$$ W^{IJ}_{su(m)} = i\, Z^i Z^\dagger_j\,(\gamma^{IJ})_i{}^j\,, \qquad W^{IJ}_{su(n)} = i\, Z^\dagger_j Z^i\,(\gamma^{IJ})_i{}^j\,. \qquad (53) $$

here $W^{IJ}_{su(m)}$ and $W^{IJ}_{su(n)}$ are the basic field strengths corresponding to the su(m) and su(n) subgroups of the gauge group su(m)k × su(n)−k. one can check that the consistency conditions for eqs. (51) and (53) are satisfied if the matter superfield obeys the superfield equation of motion

$$ \gamma^J_{ij}\, D^{\beta(I} D^{J)}_\beta Z^j + 4i\,\gamma^J_{ij}\,[Z^j, Z^k; Z^\dagger_k] + 4i\,\gamma^J_{jk}\,[Z^j, Z^k; Z^\dagger_i] = 0\,, \qquad (54) $$

where $[Z^j, Z^k; Z^\dagger_k]$ are the hermitian 3-brackets (6). this superfield equation implies, in particular, the fermionic equations of motion [37]

$$ \gamma^a_{\alpha\beta}\, D_a\Psi^\beta_i = -\tfrac{2}{3}\,[\Psi_{\alpha j}, Z^j; Z^\dagger_i] + \tfrac{1}{6}\,[\Psi_{\alpha i}, Z^j; Z^\dagger_j] + \tfrac{1}{2}\,\epsilon_{ijkl}\,[Z^j, Z^k; \Psi^{\dagger l}_\alpha]\,. \qquad (55) $$

we refer to [37] for further details on the n = 6 superspace formalism of the abjm/abj model, including the explicit form of the bosonic equations of motion.

³ note the existence of the off-shell n = 3 superfield formalism for the abjm model [35], which was used to develop the quantum calculation technique in [36].
⁴ here and below we use latin symbols from the middle of the alphabet, i, j, . . ., to denote the four-valued su(4) index, i, j, . . . = 1,2,3,4; we hope that this will not produce confusion with the real 3-valued vector indices of m3 (see secs. 1.3, 2 and 3), as far as we do not use the latter in the present discussion.

searching for an n = 8 superfield formulation of the abjm/abj models with cs levels k = 1,2, it is natural to assume that the universal n = 6 sector is present as a part of the n = 8 superspace formalism and, to describe the two additional fermionic directions of n = 8 superspace, to introduce, in addition to the six $D^I_\alpha$, one complex spinor grassmann derivative $D_\alpha$ and its conjugate $(D_\alpha)^\dagger = -\bar D_\alpha$ obeying

$$ \{D_\alpha, \bar D_\beta\} = i\,\gamma^a_{\alpha\beta} D_a + i\,\epsilon_{\alpha\beta}\, W\,, \qquad (56) $$

$$ \{D_\alpha, D_\beta\} = 0\,, \quad \{\bar D_\alpha, \bar D_\beta\} = 0\,, \quad \{D_\alpha, D^J_\beta\} = i\,\epsilon_{\alpha\beta}\, W^J\,, \quad \{\bar D_\alpha, D^J_\beta\} = i\,\epsilon_{\alpha\beta}\,\bar W^J\,. \qquad (57) $$

the structure of the additional n = 2 supersymmetries proposed in [9] suggests imposing on the basic n = 8 superfields the chirality condition in the new fermionic directions [37],

$$ \bar D_\alpha Z^i = 0\,, \qquad D_\alpha Z^\dagger_i = 0\,. \qquad (58) $$

while the natural candidate for the super-cs equation for the so(6) scalar superfield strength w is

$$ W = Z^i Z^\dagger_i\,, \qquad (59) $$

to write a possibly consistent super-cs equation for the 6 complex field strengths $W^J$, which have to be chiral, $D_\alpha W^J = 0 = \bar D_\alpha\bar W^J$, to provide the consistency of the constraints (56), (57) and (52),

$$ \bar W^J_{su(m)} \propto Z^i\,\gamma^J_{ij}\,\tilde Z^j\,, \qquad W^J_{su(m)} \propto \tilde Z^\dagger_i\,\tilde\gamma^{J\,ij}\, Z^\dagger_j\,, \qquad (60) $$

one needs to involve “non-abjm superfields”, the leading components of which are the “non-abjm fields” of [9]. these are the n × m matrix $\tilde Z^i$ and the m × n matrix $\tilde Z^\dagger_i$, which obey

$$ \bar D_\alpha\tilde Z^i = 0\,, \qquad D_\alpha\tilde Z^\dagger_i = 0\,, \qquad (61) $$

and must be related to the abjm superfields $Z^i$, $Z^\dagger_i$ by using suitable monopole operators (converting the (m̄, n) representation into (m, n̄)), which exist for the cases of cs levels k = 1,2 only [9]. according to [9], the existence of these monopole operators is reflected by ‘identities’ between the hermitian three-brackets (6) of the abjm and non-abjm (super)fields. the set of these ‘gr-identities’ includes

$$ [(\dots), \tilde Z^\dagger_i; \tilde Z^i] = -[(\dots), Z^i; Z^\dagger_i]\,. \qquad (62) $$
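the hermitian 3-bracket (6), which appears throughout this section, is straightforward to check numerically. the following numpy sketch (an illustration, not code from the paper) verifies its antisymmetry in the first two slots on random rectangular matrices of the m × n shape used above.

```python
import numpy as np

# illustrative check of the hermitian 3-bracket (6):
# [z_i, z_j; zbar_k] = z_i zbar_k z_j - z_j zbar_k z_i
rng = np.random.default_rng(0)
m, n = 3, 5
zi = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
zj = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
zk = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))

def bracket(a, b, c):
    cd = c.conj().T                 # the dagger of an m x n matrix is n x m
    return a @ cd @ b - b @ cd @ a  # eq. (6)

assert np.allclose(bracket(zi, zj, zk), -bracket(zj, zi, zk))
print("3-bracket is antisymmetric in its first two arguments")
```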
the consistency of the system of n = 8 superfield equations (51)–(60), and the set of gr-identities necessary for it, are presently under investigation [37].

acknowledgement

the author thanks josé de azcárraga, warren siegel, dmitri sorokin, paul townsend and linus wulff for useful discussions. this work was partially supported by the spanish micinn under the project fis2008-1980, and by the ukrainian national academy of sciences and russian rffi grant 38/50–2008.

notice added: after this manuscript was finished, a paper [38] devoted to n = 8 superspace formulations of d = 3 gauge theories appeared on the net. it contains a detailed description of the on-shell n = 8 superspace formulation of the blg model for finite-dimensional three-algebras, similar to the formulation of the sdiff3 invariant nambu bracket blg model in [30], and of its derivation starting from the gauge theory constraints and bianchi identities. the component field content of the sym model defined by the constraints (22), and its finite-3-algebra counterpart, is also discussed there.

references

[1] bagger, j., lambert, n.: gauge symmetry and supersymmetry of multiple m2-branes, phys. rev. d77 (2008) 065008 [arxiv:0711.0955 [hep-th]].
[2] bagger, j., lambert, n.: comments on multiple m2-branes, jhep 0802 (2008) 105 [arxiv:0712.3738 [hep-th]].
[3] gustavsson, a.: algebraic structures on parallel m2-branes, arxiv:0709.1260 [hep-th]; selfdual strings and loop space nahm equations, jhep 0804, 083 (2008) [arxiv:0802.3456 [hep-th]].
[4] filippov, v. t.: n-lie algebras, sib. mat. zh., 26, no. 6, 126–140 (1985).
[5] nambu, y.: generalized hamiltonian dynamics, phys. rev. d 7 (1973) 2405.
[6] bandos, i. a., townsend, p. k.: light-cone m5 and multiple m2-branes, class. quant. grav. 25 (2008) 245003 [arxiv:0806.4777 [hep-th]].
[7] cherkis, s., saemann, c.: multiple m2-branes and generalized 3-lie algebras, phys. rev. d 78 (2008) 066019 [arxiv:0807.0808 [hep-th]].
[8] bagger, j., lambert, n.: three-algebras and n=6 chern-simons gauge theories, phys. rev. d79, 025002 (2009) [arxiv:0807.0163 [hep-th]].
[9] gustavsson, a., rey, s. j.: enhanced n = 8 supersymmetry of abjm theory on r(8) and r(8)/z(2), arxiv:0906.3568 [hep-th].
[10] aharony, o., bergman, o., jafferis, d. l., maldacena, j.: n = 6 superconformal chern-simons-matter theories, m2-branes and their gravity duals, arxiv:0806.1218 [hep-th].
[11] bergshoeff, e., sezgin, e., townsend, p. k.: supermembranes and eleven-dimensional supergravity, phys. lett. b189 (1987) 75.
[12] bandres, m. a., lipstein, a. e., schwarz, j. h.: n = 8 superconformal chern–simons theories, jhep 0805 (2008) 025 [arxiv:0803.3242 [hep-th]].
[13] witten, e.: bound states of strings and p-branes, nucl. phys. b460, 335 (1996) [hep-th/9510135].
[14] papadopoulos, g.: m2-branes, 3-lie algebras and plucker relations, jhep 0805 (2008) 054 [arxiv:0804.2662 [hep-th]]; on the structure of k-lie algebras, class. quant. grav. 25 (2008) 142002 [arxiv:0804.3567 [hep-th]].
[15] gauntlett, j. p., gutowski, j. b.: constraining maximally supersymmetric membrane actions, arxiv:0804.3078 [hep-th].
[16] de azcarraga, j. a., izquierdo, j. m.: cohomology of filippov algebras and an analogue of whitehead’s lemma, j. phys. conf. ser. 175, 012001 (2009) [arxiv:0905.3083 [math-ph]].
[17] van raamsdonk, m.: comments on the bagger-lambert theory and multiple m2-branes, jhep 0805 (2008) 105 [arxiv:0803.3803 [hep-th]].
[18] ho, p. m., matsuo, y.: m5 from m2, jhep 0806 (2008) 105 [arxiv:0804.3629 [hep-th]].
[19] ho, p. m., imamura, y., matsuo, y., shiba, s.: m5-brane in three-form flux and multiple m2-branes, jhep 0808 (2008) 014 [arxiv:0805.2898 [hep-th]].
[20] howe, p. s., sezgin, e.: d = 11, p = 5, phys. lett. b 394 (1997) 62 [arxiv:hep-th/9611008].
[21] bandos, i. a., lechner, k., nurmagambetov, a., pasti, p., sorokin, d. p., tonin, m.: covariant action for the super-five-brane of m-theory, phys. rev. lett. 78 (1997) 4332 [arxiv:hep-th/9701149].
[22] aganagic, m., park, j., schwarz, j. h., popescu, c.: world-volume action of the m-theory five-brane, nucl. phys. b496, 191–214 (1997) [arxiv:hep-th/9701166].
[23] pasti, p., samsonov, i., sorokin, d., tonin, m.: blg-motivated lagrangian formulation for the chiral two-form gauge field in d = 6 and m5-branes, phys. rev. d80, 086008 (2009) [arxiv:0907.4596 [hep-th]].
[24] bandos, i. a., townsend, p. k.: sdiff gauge theory and the m2 condensate, jhep 0902, 013 (2009) [arxiv:0808.1583 [hep-th]].
[25] bandos, i. a., sorokin, d. p., tonin, m., pasti, p., volkov, d. v.: superstrings and supermembranes in the doubly supersymmetric geometrical approach, nucl. phys. b 446 (1995) 79 [arxiv:hep-th/9501113].
[26] aharony, o., bergman, o., jafferis, d. l.: fractional m2-branes, jhep 0811, 043 (2008) [arxiv:0807.4924 [hep-th]].
[27] kwon, o. k., oh, p., sohn, j.: notes on supersymmetry enhancement of abjm theory, jhep 0908, 093 (2009) [arxiv:0906.4333 [hep-th]].
[28] cederwall, m.: n = 8 superfield formulation of the bagger-lambert-gustavsson model, jhep 0809 (2008) 116 [arxiv:0808.3242 [hep-th]].
[29] cederwall, m.: superfield actions for n=8 and n=6 conformal theories in three dimensions, jhep 0810 (2008) 070 [arxiv:0809.0318 [hep-th]].
[30] bandos, i. a.: nb blg model in n=8 superfields, phys. lett. b 669 (2008) 193 [arxiv:0808.3568 [hep-th]].
[31] howe, p. s.: pure spinors lines in superspace and ten-dimensional supersymmetric theories, phys. lett. b 258 (1991) 141 [addendum ibid. b 259 (1991) 511].
[32] nilsson, b. e. w.: pure spinors as auxiliary fields in the ten-dimensional supersymmetric yang-mills theory, class. quant. grav. 3, l41 (1986).
[33] galperin, a., ivanov, e., kalitsyn, s., ogievetsky, v., sokatchev, e.: unconstrained n = 2 matter, yang-mills and supergravity theories in harmonic superspace, class. quant. grav. 1, 469 (1984).
[34] berkovits, n.: super-poincare covariant quantization of the superstring, jhep 0004, 018 (2000) [arxiv:hep-th/0001035]; bedoya, o. a., berkovits, n.: ggi lectures on the pure spinor formalism of the superstring, arxiv:0910.2254 [hep-th], and refs. therein.
[35] buchbinder, i. l., ivanov, e. a., lechtenfeld, o., pletnev, n. g., samsonov, i. b., zupnik, b. m.: abjm models in n = 3 harmonic superspace, jhep 0903, 096 (2009) [arxiv:0811.4774 [hep-th]].
[36] buchbinder, i. l., ivanov, e. a., lechtenfeld, o., pletnev, n. g., samsonov, i. b., zupnik, b. m.: quantum n = 3, d = 3 chern-simons matter theories in harmonic superspace, jhep 0910, 075 (2009) [arxiv:0909.2970 [hep-th]].
[37] bandos, i. a., de azcarraga, j. a.: abjm in n = 6 and n = 8 superspaces, paper in preparation.
[38] samtleben, h., wimmer, r.: n = 8 superspace constraints for three-dimensional gauge theories, jhep 1002, 070 (2010) [arxiv:0912.1358 [hep-th]].

igor a. bandos
e-mail: igor_bandos@ehu.es
ikerbasque, the basque foundation for science, and department of theoretical physics, university of the basque country, p.o. box 644, 48080 bilbao, spain

acta polytechnica vol. 51 no. 2/2011
astronomical image compression techniques based on acc and klt coder

j. schindler, p. páta, m. klíma, k. fliegel

abstract

this paper deals with the compression of image data in applications in astronomy. astronomical images have typical specific properties — high grayscale bit depth, size, noise occurrence and special processing algorithms. they belong to the class of scientific images, and their processing and compression is quite different from the classical approach of multimedia image processing. the database of images from bootes (burst observer and optical transient exploring system) has been chosen as a source of the testing signal. bootes is a czech-spanish robotic telescope for observing agn (active galactic nuclei) and for searching for the optical transients of grbs (gamma ray bursts). this paper discusses an approach based on an analysis of the statistical properties of the image data. a comparison of two irrelevancy reduction methods is presented from a scientific (astrometric and photometric) point of view. the first method is based on a statistical approach, using the karhunen-loève transform (klt) with uniform quantization in the spectral domain. the second technique is derived from wavelet decomposition with adaptive selection of the prediction coefficients used. finally, a comparison of three redundancy reduction methods is discussed: the multimedia format jpeg2000 and hcompress, which was designed especially for astronomical images, are compared with the new astronomical context coder (acc) based on adaptive median regression.

keywords: astronomical image compression, karhunen-loève transform, dark frame compression, lossless compression algorithm, astronomical context coder (acc), jpeg 2000.

1 introduction

this paper deals with scientific image data compression. the data for the analysis was collected during work on the international (czech-spanish-italian) bootes experiment (burst observer and optical transient exploring system) [2]. bootes has been in service since 1998 as the first spanish robotic telescope for sky observation [4]. this system is one of three similar systems in full operation in the world, and has three main stations. the first one is located in southern spain (in mazagon, near huelva), and has been in full operation since july 1998; the first version of the system was completed in july 2001. the main aim of the project is to observe extragalactic objects and to detect new optical transients (ot) of gamma ray burst (grb) sources. bootes has been operated in very close cooperation with satellite observations of the gamma and roentgen universe by the integral satellite. integral is an orbital astrophysics laboratory of the european space agency (esa), and it has been in space since november 2002.

due to the limited capacity of storage media, an efficient data compression algorithm has to be applied. lossless compression algorithms are often used in scientific applications, but their efficiency is limited. the maximum achieved compression ratio depends above all on the data type and on the amount of image signal entropy. the usual dictionary or entropy lossless algorithms are run length encoding (rle), lempel-ziv-welch (lzw), huffman or arithmetic coding. the typical compression ratios of these lossless algorithms are from 1 : 1.1 to 1 : 5 for astronomical images [1]. the second approach involves the use of compression techniques characterized by decorrelated parameters.
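for orientation, the decorrelation at the heart of the klt approach can be sketched in a few lines of python: the transform basis is the eigenbasis of the covariance matrix estimated from image blocks, and the resulting spectral coefficients are then uniformly quantized. the image, block size and quantization step below are hypothetical placeholders, not values from the paper.

```python
import numpy as np

# minimal klt (karhunen-loeve transform) decorrelation sketch on 8x8 blocks
rng = np.random.default_rng(1)
image = rng.integers(0, 2**16, size=(1024, 1536)).astype(np.float64)

B = 8                                               # block side (hypothetical)
blocks = (image.reshape(1024 // B, B, 1536 // B, B)
               .swapaxes(1, 2).reshape(-1, B * B))  # rows = flattened blocks
blocks -= blocks.mean(axis=0)                       # remove the mean
cov = np.cov(blocks, rowvar=False)                  # (64, 64) block covariance
eigvals, basis = np.linalg.eigh(cov)                # klt basis = eigenvectors
coeffs = blocks @ basis                             # decorrelated coefficients

# uniform quantization in the spectral domain (step q is a free parameter)
q = 64.0
quantized = np.round(coeffs / q)
```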
typical examples of this option are the jpeg and jpeg2000 standards, but data impairment has to be taken into account in the case of lossy coding. it is necessary to consider whether algorithms optimized for multimedia applications and human vision are suitable for compressing scientific image data. astronomical image data stored in archives is often accessed later to perform new studies, comparisons and measurements. it is not possible to fix the set of investigation methods which may be applied to an astronomical image in the future, and it is therefore not possible to determine in advance an admissible loss of image information during the compression process. the best way to guarantee maximally accurate and reliable results from post-processing an astronomical image is to preserve the image without any change or loss of information. for this reason, lossless compression techniques are often preferred in this area.

recent lossy and lossless still image compression formats are powerful tools for compressing all kinds of common images (pictures, text, schemes, etc.). the performance of a compression algorithm generally depends on its ability to anticipate the image function of the processed image. in other words, a compression algorithm, in order to be successful, has to take the fullest advantage of the coded image properties. astronomical data forms a special class of images that have general image properties, and also some specific characteristics. if a new coder is able to make correct use of knowledge of these special properties, this will lead to superior performance on this specific class of images, at least in terms of the compression ratio. applying special compression algorithms based on the specific properties of the wavelet, fractal or karhunen-loève transforms [9] seems to be a better solution for astronomical image data compression.

fig. 1: correction image data from the bootes project (1024 × 1536 × 16 bits). a) image for correcting the non-uniform sensitivity of the whole detection system — flat field (ff), b) map of the dark current of the ccd sensor — dark image (di)

fig. 2: input image data from the bootes project (1024 × 1536 × 16 bits). a) image from a wide-field camera: m7 and the milky way with many objects (size smaller than 10 pixels), b) the m42 nebula with a satellite trail; image obtained from the deep sky camera

2 astronomical images

the data coder has been optimized for four image types:

• an image for correcting the non-uniform sensitivity of the whole detection system — flat field (ff) (see figure 1a). note the shadow of a dust particle in the left part of the center of the image.
• a map of the dark current of the ccd sensor — dark frame (df) (see figure 1b). note the bad ccd column in the right part of the image.
• light images (li) from wide and ultra-wide field cameras (equivalent focal length shorter than 100 mm) (see figure 2a). the size of the objects (especially stars) does not exceed 10 square pixels.
• light images with high spatial resolution — deep sky images (dsli) (see figure 2b).

light and flat field images are not corrected with the map of dark current, and isolated hot pixels are noticeable in these images. these artifacts are close to an uncorrelated signal, and are difficult to compress. these test images come from our deimos [3] database, which is available as open source (http://www.deimos-project.eu/). this image database covers a broad range of image content, from scientific image data in astronomy to multimedia [6].
3 lossless astronomical image compression

3.1 jpeg2000

the core part of the jpeg2000 standard [8] also enables lossless compression. for the lossless mode, the reversible color transformation and the reversible wavelet transform can be used to decorrelate the input data in terms of the color components and the spatial dependencies. these transformations convert input integer data into integer results. the reversible color transformation and the reversible 5/3 wavelet filter can also be used for lossy coding. thanks to the sophisticated jpeg2000 format structure, it is then very simple to work with the quality or resolution progression, from a lossy image overview up to lossless maximum resolution image data. the roi (region of interest) technique also provides the most accurate data for a specific part of an image with reasonable bandwidth requirements. although the compression performance of the reversible transformations is limited in the lossy case, they show results that are almost comparable with the irreversible transformations dedicated to lossy compression.

3.2 hcompress

hcompress was developed at the space telescope science institute (stsci, baltimore), and is commonly used to distribute archived images from the digital sky surveys dss1 and dss2. this compression format is based on the haar transform (2 × 2 pixels). the computation is extremely fast, since the haar transform does not require any multiplication. wavelet coefficients are linearly quantized, quad-tree coded on bitplanes, and the statistical redundancy is reduced by the huffman code. besides lossy coding, this compression format also enables lossless compression, since the haar wavelet transform is reversible. a definition of this format can be found in [14]. a sketch of the elementary 2 × 2 haar step is given below.
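the following sketch computes the four haar coefficients of each 2 × 2 block using additions and subtractions only, and inverts them exactly in integer arithmetic (the division by four is exact by construction). it is a simplified illustration of the transform stage only; the quantization, quad-tree and huffman stages of hcompress, and its exact coefficient normalization, are omitted.

```python
import numpy as np

def haar2x2_forward(img):
    """one level of the 2x2 haar transform using only additions/subtractions."""
    a = img[0::2, 0::2].astype(np.int32)
    b = img[0::2, 1::2].astype(np.int32)
    c = img[1::2, 0::2].astype(np.int32)
    d = img[1::2, 1::2].astype(np.int32)
    low  = a + b + c + d          # smoothed image (scaled by 4)
    horz = a - b + c - d          # horizontal detail
    vert = a + b - c - d          # vertical detail
    diag = a - b - c + d          # diagonal detail
    return low, horz, vert, diag

def haar2x2_inverse(low, horz, vert, diag):
    """exact integer inverse of the forward step (lossless reconstruction)."""
    a = (low + horz + vert + diag) // 4
    b = (low - horz + vert - diag) // 4
    c = (low + horz - vert - diag) // 4
    d = (low - horz - vert + diag) // 4
    out = np.empty((2 * low.shape[0], 2 * low.shape[1]), dtype=np.int32)
    out[0::2, 0::2], out[0::2, 1::2] = a, b
    out[1::2, 0::2], out[1::2, 1::2] = c, d
    return out

img = np.random.randint(0, 2**16, size=(4, 4))
assert np.array_equal(haar2x2_inverse(*haar2x2_forward(img)), img)
```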
3.3 ccsds-ldc or rice algorithm

the consultative committee for space data systems published in 1997 a recommendation standard for lossless data compression based on a modified rice algorithm [15]. ldc stands for lossless data compression. this coding should exhibit better results than jpeg-ls under the same conditions. the original rice algorithm can be found in [16].

3.4 acc coder

acc stands for astronomical context compression. this format is currently under development in the radio engineering department of fee of ctu in prague, and is being designed especially for astronomical images, focusing on their specific characteristics. however, it can also be applied to general raster images. acc consists of the following main parts:

• background estimation
• successive spatial decomposition
• context computation based on noise evaluation
• context-based pixel estimation, using linear regression
• rle and arithmetic coding

background estimation is the first part of the coding process, and it is important. it is based on tiled median computation and subsequent filtering. the estimated background is extracted from the original image data, and the background-free image is further processed. this background separation improves the coding performance of the following methods.

the background-free image is then decomposed in several steps. in each step, a different set of pixels from the input pixel array is coded. each pixel is coded just once, so the sets of pixels are disjoint. this spatial decomposition is optimized for the specific astronomical image data, where many singularities in the image function are expected. the decomposition scheme differs from the wavelet dyadic decomposition, where the input pixel array is processed in a successive pyramidal way.

astronomical images are usually contaminated by a significant noise level. the key to the acc algorithm is to measure the local noise characteristics and to differentiate the input image pixels into incompressible noise and significant data. according to the significance of the local data, the context is computed and assigned to each coded pixel. pixels having the same context are then coded together. a sketch of the tiled-median background estimation step is given below.
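the following sketch illustrates the tiled-median background estimation described above: the image is split into tiles, each tile contributes its median as a local background level, and the smoothed coarse map is interpolated back to full resolution and subtracted. the tile size, the smoothing filter and the interpolation order are our illustrative choices, not the actual acc implementation.

```python
import numpy as np
from scipy.ndimage import uniform_filter, zoom

def estimate_background(img, tile=64):
    """tiled median background estimate (illustrative sketch of the acc idea)."""
    h, w = img.shape
    ny, nx = h // tile, w // tile
    # median of each tile x tile cell as the local background level
    coarse = np.array([[np.median(img[y*tile:(y+1)*tile, x*tile:(x+1)*tile])
                        for x in range(nx)] for y in range(ny)])
    coarse = uniform_filter(coarse, size=3)          # subsequent filtering
    return zoom(coarse, (h / coarse.shape[0], w / coarse.shape[1]), order=1)

img = np.random.normal(1000, 10, (256, 256))         # flat sky + noise
img[100:104, 100:104] += 5000                        # a bright "star"
residual = img - estimate_background(img)            # background-free image
print(round(residual.mean(), 2))                     # close to zero
```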
4 achievable compression ratios

the performance of the compression formats presented above was measured and compared in terms of the lossless compression ratio that was achieved. the measurement was made on three image sets, each set representing a different astronomical image type. in the first set, there were 26 deep sky astronomical images. the second and the third set contained 22 correction dark frames and 5 correction flat fields, respectively. all tested files were 1 536 × 1 024 single-component images with 16 bit/pixel depth.

fig. 3: noise bits versus the lossless compression ratio

table 1: average lossless compression ratio

image set     hcompress   jpeg2000   rice   acc
deep sky      1.77        1.85       0.51   1.96
dark frames   2.69        2.85       1.52   5.63
flat field    1.71        1.74       0.49   1.74

figure 3 shows the measured compression ratio versus the gaussian noise equivalent bits. all image sets are included. the gaussian noise bits are computed by the fpack utility [12]. among the standard coders, the jpeg2000 standard achieved slightly better results than hcompress. its main benefit is the mq entropy coder. however, the static 5/3 dwt filter is not optimal in many cases. for example, the haar wavelet used in hcompress produces fewer high-amplitude coefficients in the case of isolated singularities, which are common in astronomical images. the results show clearly that the acc coder exhibits very good results on all tested images. it shows superior compression ratios in almost all test cases. the strength of the context-based estimation optimization can be exploited particularly in the dark frames test set, where the average improvement of this novel method compared to the other algorithms was particularly evident. the dark frames usually include much less gaussian-like noise, which enables better theoretical compression ratios than, e.g., the deep sky images.

5 lossy astronomical image compression

the algorithms required by astronomers are lossless, but their efficiency is limited: they do not offer a higher compression ratio than 5 : 1 [10]. is this enough? inadequate results can be enhanced by the use of lossy algorithms. they provide a much better compression ratio, up to 100–200 : 1 for specific kinds of images. however, they also lead to increased errors in the reconstructed images. jpeg and jpeg2000 are the most widely known lossy compression standards. they are preferred by graphics and web users. however, their usage for astronomical data compression is not optimal. these standards are optimized for human vision (i.e. perception based) and for so-called multimedia applications. we are searching for optimal compression algorithms with the following characteristics:

• highly efficient, with a good compression ratio
• lossless, or lossy with known and optimized reconstruction defects
• a fast decompression algorithm – e.g. the coder and decoder of an archive machine can be nonsymmetrical

scientific data is not processed by the human eye; sophisticated algorithms are usually used instead, and they are sensitive to other parameters than the eye. the mean square error is usually used for estimating the quality of an approximated image signal. a special compression technique is therefore studied in this paper. we can compare it with algorithms based on the unique properties of wavelets and fractals as alternative coding methods [13]. the technique described in this paper applies a lossy coder to the spectral coefficients of the karhunen-loève transform [11, 5], and it seems to be a better solution.

5.1 distortion measurement of lossy coders

the measurement confirms the possibility of arranging the coder blocks to produce an accepted error and a sophisticated data stream. first, the principal spectral components are important for a preview of the image and for background function estimation, together with sensitivity correction. next, the components can be used for searching objects and for high-precision astrometric and photometric measurements with profile fitting. suboptimal klt decomposition has been found to be very suitable for astronomical data compression. the klt coding of correction sensitivity images (so-called flat fields) can be performed up to 100 : 1, according to the image characteristics [9]. the light images are very well reconstructed for compression ratios of about 30–60 : 1 (see figure 4). a comparison of the impact of the wavelet, dct and kl transforms on the deep sky images is shown in figure 5. the dark frames are a map of the thermally generated charge in the ccd structure. they are very difficult to code, due to their noise and their very stochastic character. application of the designed klt provides an insignificant result: the maximal accepted error of the reconstructed images corresponds to a compression ratio of about 5 : 1. the lossless variant of the klt coder is recommended for these images. figure 4 shows a comparison of the mean error of the object position for the karhunen-loève coder and the adaptive wavelet transform, based on the jpeg 2000 standard.

fig. 4: error of the astrometry position measurement for the suboptimal karhunen-loève expansion and the adaptive wavelet transform

fig. 5: comparison of the impact of irrelevancy reduction for the adaptive wavelet algorithm jpeg 2000 (a – left), dct (jpeg) (b – central), and dklt (c – right) based coders. detail of objects, stars and a satellite trail in fig. 2b)

a minimal sketch of the block-based klt coding idea is given below.
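the following sketch illustrates the core of block-based klt (pca) coding: the eigenvectors of the block covariance matrix form the transform basis, and irrelevancy is reduced by keeping only the strongest spectral components. the block size and the number of retained components are illustrative assumptions; this is a toy demonstration of the principle, not the authors' suboptimal klt coder.

```python
import numpy as np

def klt_code(img, block=8, keep=16):
    """toy klt (pca) coder: project blocks onto the eigenvectors of their
    covariance matrix and keep only the `keep` strongest components."""
    h, w = (s - s % block for s in img.shape)
    blocks = (img[:h, :w].reshape(h // block, block, w // block, block)
                         .swapaxes(1, 2).reshape(-1, block * block))
    mean = blocks.mean(axis=0)
    x = blocks - mean
    cov = x.T @ x / len(x)
    eigval, eigvec = np.linalg.eigh(cov)       # eigenvalues in ascending order
    basis = eigvec[:, -keep:]                  # principal components
    coeff = x @ basis                          # spectral coefficients to quantize
    recon = coeff @ basis.T + mean             # lossy reconstruction
    mse = np.mean((recon - blocks) ** 2)       # distortion measure (see 5.1)
    return coeff, basis, mse

img = np.random.normal(1000, 50, (64, 64))
coeff, basis, mse = klt_code(img)
print(f"kept {basis.shape[1]}/{basis.shape[0]} components, mse = {mse:.1f}")
```

the computational cost of the eigenvector calculation is exactly the disadvantage named in the conclusion below, which is why a suboptimal (fixed-basis) klt is attractive in practice.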
6 conclusion

the lossy compression technique described here can be considered as a good alternative to known compression algorithms (jpeg and jpeg2000). the disadvantage of the klt-based coder is its extensive computational requirements, due to the need to calculate the eigenvectors of the covariance matrix. it can be improved by using the suboptimal klt coder. further improvement of the technique can be achieved by sophisticated filtering methods and suitable image data organization. the lossless acc (astronomical context coder) has been designed and optimized for specific astronomical data properties. the proposed new compression method is based on noise estimation and pixel contextual modelling using median regression. for a given context, this pixel estimation is optimal in the sense of the estimation error sum.

acknowledgement

this work has been supported by grant no. p102/10/1320 "research and modeling of advanced methods of image quality evaluation" of the grant agency of the czech republic, and by research project msm 6840770014 "research of perspective information and communication technologies" of msmt of the czech republic.

references

[1] bernas, m., páta, p., hudec, r., rezek, t.: lossless and lossy compression of images from the omc experiment of integral project, astrophysical letters and communications, gordon and breach science publishers, amsterdam, 2000, 429–432.
[2] jelínek, m., castro-tirado, a. j., de ugarte postigo, a., kubánek, p., guziy, s., gorosabel, j., cunnife, r., vítek, s., reglero, v., sabau-graziati, l.: four years of real-time grb followup by bootes-1b, advances in astronomy, vol. 2010, 2010.
[3] deimos, database of images: open source, http://www.deimos-project.eu/.
[4] de ugarte postigo, a., mateo sanguino, t. j., castro cerón, j. m., páta, p., bernas, m., et al.: recent developments in the bootes experiment, in aip conference proceedings 662, cambridge: massachusetts institute of technology, 2003, 553–555.
[5] effros, m., feng, f., zeger, k.: suboptimality of the karhunen-loève transform for transform coding, ieee transactions on information theory, vol. 50, aug. 2004.
[6] fliegel, k., klíma, m., páta, p.: new open source image database for testing and optimization of image processing algorithms, in optics, photonics, and digital technologies for multimedia applications, spie proceedings, vol. 7723, 2010.
[7] hudec, r., soldán, j., hudcová, v., bernas, m., páta, p., hroch, f., castro-tirado, a. j., mas-hesse, j. m., gimenez, a.: blazar monitoring towards the third millennium, torino, 1999, 131–133.
[8] iso/iec 15444-1:2000: jpeg2000 image coding system (core coding system), [online], 2000, http://www.jpeg.org/fcd15444-1.htm.
[9] páta, p.: compression of astronomical images based on the karhunen-loeve transform, in proceedings of the eighth iasted international conference on signal and image processing, anaheim: acta press, 2006, p. 133–138.
[10] páta, p., bernas, m.: properties of karhunen-loeve expansion of astronomical images in comparison with other integral transforms, in gamma ray bursts, aip conference proceedings, woodbury: american institute of physics, 2000, 882–886.
[11] páta, p., hanzlík, p., schindler, j., vítek, s.: influence of lossy compression techniques on processing precision of astronomical images, 6th ieee isspit conference, athens, greece, 2005.
[12] pence, w. d., seaman, r., white, r. l.: fpack fits image compression utility, [online], 2010, http://heasarc.gsfc.nasa.gov/fitsio/fpack/fpackguide.pdf.
[13] starck, j. l., murtagh, f., louys, m.: ccma conference on data and information fusion, grenada, 1997.
[14] white, r., postman, m., lattanzi, m.: digitized optical sky survey, kluwer, pp. 167–175, 1992.
[15] ccsds: lossless data compression, recommendation for space data system standards, ccsds, vol. 121.0-b-1, 1997.
[16] rice, r. f., yeh, p.-s., miller, w.: algorithms for a very high speed universal noiseless coding module, jpl publication 91-1, jet propulsion laboratory, pasadena, ca, 1991.

jaromír schindler, petr páta, miloš klíma, karel fliegel
department of radioengineering, faculty of electrical engineering
czech technical university in prague
technická 2, prague, czech republic
energy recovery from contaminated biomass

jiří moskalík, jan škvařil, otakar štelcl, marek baláš, martin lisý
brno university of technology, faculty of mechanical engineering, energy institute, technická 2896/2, 616 69 brno, czech republic
correspondence to: ymoska03@stud.fme.vutbr.cz

abstract

this study focuses on thermal gasification methods for contaminated biomass in an atmospheric fluidized bed, especially biomass contaminated by undesirable substances in its primary use. for the experiments, chipboard waste was chosen as a representative sample of contaminated biomass. in the experiments, samples of gas and tar were taken for a better description of the process of gasifying chipboard waste. the gas and tar samples also provide information about the properties of the gas that is produced.

keywords: gasification in a fluidized bed, contaminated biomass, thermal disposal of waste.

1 introduction

the growth in energy consumption has led to interest in fuels other than conventional fossil fuels. biomass is an option for reducing the use of primary energy sources. an advantage of biomass is that it can be transformed directly into liquid and gaseous fuels [1]. if these fuels achieve certain quality parameters, e.g. purity or a satisfactory heat value, they can be used in a more suitable way. in recent times, there have been considerable advances in recovering energy from biomass, even in large-scale energy production. a well-known example has been the interventions made by the state in support of feed-in tariffs for power coming from renewable sources. a situation developed where, due to the major impact on the market, this type of fuel became "scarce goods", particularly for larger consumers. consumers then begin to look around for some other type of fuel. these criteria for biomass production are also met by certain non-toxic wastes that can summarily be referred to as contaminated biomass. contaminated biomass includes materials such as wastes from agricultural production, and wastes from the furniture industry. energy recovery from contaminated biomass can be regarded from two angles, one of which focuses on energy production, while the other focuses above all on waste disposal. legislative problems of the gasification of contaminated biomass are beyond the scope of this paper, but would form a topic for a separate paper by a different kind of specialist.

2 basic types of contaminated biomass

the main hindrance to recovering energy from contaminated biomass is its elevated content of undesirable substances. biomass is usually contaminated in its primary, non-energy use. contaminants may vary significantly, according to the primary use and the origin of the biomass. basically, contaminated biomass can be classified into a small number of basic groups, as follows:

• agricultural production wastes [2]
• construction industry wastes
• furniture industry wastes
• sludge from waste water treatment plants
• wastes from paper production and cellulose processing
• wastes from the textile industry – biological components and plant residues used in textile production (flax, cotton, hemp, etc.)

each group has its specific features, according to the type of substances the biomass has been exposed to, or treated with, in its primary use. while wastes from agricultural production tend to have elevated levels of nitrates, construction industry wastes have frequently been treated with protective agents, bonding primers, paints, etc. [3].
3 experimental fuel

since the material is readily available, our study focuses on wastes from furniture manufacture. these wastes include materials contaminated by being processed with chemicals (e.g. bonds and binders, glues and adhesives, lacquers, and also biocidal products). a typical representative of wastes from furniture manufacture is waste chipboard. this waste contains a whole gamut of additives to boost the resistance of the wood. waste chipboard has a suitable consistency and is quite widely available. chipboard was chosen for the experiments using the biofluid 100 gasifier. for operating reasons, the chipboard has to be crushed so that it can be fed by a screw conveyor into the gasifier. an analysis of the elemental composition and the basic properties of the experimental fuel was carried out by an accredited laboratory. the results of the analysis are summarized in the following tables.

4 analysis of the experimental fuel sample

there are only very small quantities of fluorides in crushed chipboard (284 mg/kg in dry matter, i.e. a mass fraction of only 0.000284). fluorides, together with chlorine, are the problematic halogen compounds in the fuel sample. halogens are present here in relatively small quantities; however, what also matters is the compounds in which they are chemically bound. at relatively low gasification temperatures, some compounds may fail to decompose completely. undesirable compounds may only undergo a transformation, giving rise to substances harmful to the environment [5,6]. compounds of chlorine are likely to be the most troublesome substances in this respect. in a fluidized bed, however, undesirable production of harmful substances is expected to be minimized owing to the large contact area [2]. the wood and chipboard used in furniture production are very often treated with biocidal products. these products are used to prolong the life of the material, and to prevent the development of mould and material degradation. biocidal products often contain chlorine, and their molecular structure often resembles the molecular structure of dioxins or furans. furans and dioxins rank among the most toxic substances of all [7].

5 energy properties of chipboard

from the energy point of view, the heat value is probably the most essential property of fuels. chipboard consists mainly of pieces of wood that have been treated to meet the requirements of furniture-making (e.g. low humidity). all kinds of glues and resins are considerable components of chipboard and also, in most cases, have good heat values. the experimentally verified heat value of chipboard is relatively high (see table 2); a simple cross-check of these values is sketched after the tables.

table 1: results of the initial ultimate analysis of the experimental fuel [3,4]

                          obtained sample   waterless sample   combustible
                          [%]               [%]                [%]
gross water               4.08              –                  –
residual water            7.15              –                  –
water total               11.23             –                  –
ash content at 550 °c     1.02              1.15               –
combustible               87.75             98.85              100
volatile matter           70.35             79.25              80.17
fixed carbon              17.40             19.60              19.83
ultimate analysis:
hydrogen h                5.65              6.36               6.43
carbon c                  42.59             47.98              48.54
nitrogen n                3.64              4.10               4.15
oxygen o                  35.84             40.37              40.84
sulphur total             0.04              0.05               –
sulphur volatile          0.03              0.04               0.04
sulphur in ash            0.10              0.01               –
cl total                  –                 0.048              –
fluorides                 284 mg/kg in dry matter

table 2: energy parameters of the test specimen of "fuel" [4]

                          obtained sample   waterless sample   combustible
                          [kj/kg]           [kj/kg]            [kj/kg]
combustion heat           17 601            19 828             20 059
heat value                16 068            18 433             18 647
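as a quick plausibility check of table 2, the heat value (lower heating value) can be related to the combustion heat (higher heating value) through the latent heat of the water formed from the fuel hydrogen and moisture. the relation and the constants below (2 441 kj/kg for the latent heat, 8.94 kg of water per kg of hydrogen) are standard textbook values assumed by us, not taken from the paper.

```python
LATENT_HEAT = 2441.0   # kj/kg, heat of vaporization of water at 25 degrees c (assumed)

def lower_heating_value(hhv, hydrogen, moisture):
    """approximate heat value [kj/kg] from combustion heat [kj/kg] and the
    hydrogen and moisture mass fractions of the fuel."""
    return hhv - LATENT_HEAT * (8.94 * hydrogen + moisture)

# values taken from tables 1 and 2 (as-obtained and waterless samples)
print(lower_heating_value(17601, 0.0565, 0.1123))  # ~16 094 vs 16 068 measured
print(lower_heating_value(19828, 0.0636, 0.0))     # ~18 440 vs 18 433 measured
```

both estimates agree with the measured heat values in table 2 to within a fraction of a percent, which supports the consistency of the laboratory analysis.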
6 chipboard ash

the composition of ash has a significant impact on its properties, and consequently also on the ways in which the fuel can be utilized in various technologies. this is particularly the case for new, untested fuels. typical problems are with ash sintering, which is dependent on the composition of the ash. here, the most important elements are silicon, sodium and potassium, because the oxides of these elements have a big influence on the sintering temperature of the ash. chemical analyses were made of chipboard ash in order to illustrate fully the effect of the ash on the equipment. the analyses were performed in an accredited laboratory. the ash was obtained by controlled annealing of the material at 550 °c. the values of the content of the individual constituents are given in table 3.

table 3: chemical analysis of ash from crushed furniture [3,4]

compound   [%]      element   [mg/kg ash]
sio2       15.30    pb        223
fe2o3      3.60     cd        less than 10
mno        –        cu        484
al2o3      7.28     hg        less than 10
tio2       25.80    mn        12 500
cao        19.00    cr        170
mgo        4.35     ni        107
na2o       1.83     zn        2 900
k2o        8.90     cl        0.46
so3        2.77
p2o5       2.34

7 experimental measurements

test measurements were made on the biofluid 100 experimental unit in 2010 and 2011. a more detailed description of our experimental unit can be found in earlier studies published by our institute. the main purpose of the experimental measurement was to establish the potential for thermal gasification of crushed furniture chipboard. as this is not a conventional fuel, the initial measurements aimed mainly at finding and verifying a suitable gasification method for this material. the tests focused on the gasification process proper, to see whether there are any process limitations [3].

figure 1: the biofluid 100 experimental gasification unit [8]

in the early stage of the first measurements, difficulties arose with stabilization of the gasification temperature. these difficulties were successfully resolved in the course of the experiment. the probable cause was the supply of excessive amounts of primary air, and the relatively fine consistency of the fuel, which was probably released from the fluidized bed. in other words, no stable fluidized bed could properly form. nevertheless, the first experiments showed that chipboard can be gasified using biofluid. however, it is necessary to take into account the consistency of the fuel, and changes in the control of the screw conveyor must be made gently.

figure 2: important temperatures in the gasifier, and an indication of the time course of the gas and tar samplings (t101 – temperature on the fire grid, t102 – temperature at the beginning of the fluidized bed, t103 – temperature on top of the fluidized bed, t107 – temperature at the outlet from the gasifier)

figure 3: measurements of the pressure loss of the fluidized bed are used to describe the stability of the fluidized bed and the gasification process
to form an idea, figure 2 shows the temperature course of one of the experiments in relation to the frequency values of the screw conveyor. it is clear that the initially unstable course of gasification was successfully stabilized. it was only after an attendant's intervention in the screw conveyor frequency that temperature fluctuations occurred in the gasifier. another measured variable is the pressure loss of the fluidized bed; a change in this value shows the stability of the gasification process during the experiment (figure 3).

the next task was to conduct chipboard gasification at temperatures ranging from 760 °c to 830 °c, and to assess the impact of the gasification temperature on the composition of the resultant gas. after the unit had been heated to the required operating temperature, samples of tar and gas were taken. table 4 summarizes the volume concentration values of the individual constituents of the produced gas mixture.

table 4: composition of the gas mixture produced during the experiments

sample   temperature   co2     h2     co      ch4    n2      ethane   misc    sum total
time     [°c]          [%]     [%]    [%]     [%]    [%]     [%]      [%]     [%]
11:04    760           16.80   6.38   13.33   5.07   57.81   0.480    0.130   100
11:40    760           17.39   6.54   12.19   4.14   59.28   0.350    0.110   100
12:07    760           17.58   7.25   13.79   4.15   56.83   0.280    0.110   100
12:04    780           17.33   7.63   14.36   3.64   56.76   0.160    0.110   100
12:42    770           17.57   7.09   12.45   2.97   59.69   0.120    0.110   100
13:07    780           18.15   6.54   10.51   2.05   62.04   0.660    0.040   100
13:43    800           16.80   6.19   11.27   1.93   63.07   0.670    0.060   100
14:18    805           17.22   7.44   12.43   2.20   59.90   0.760    0.040   100
14:53    805           15.81   7.09   13.54   2.87   59.74   0.880    0.070   100

the samples are indicative of the production of a relatively stable mixture of gases. it was found that the gas composition values and the tar content values in relation to temperature correspond with the tar content in the gasification of conventional wood chips. however, deposits of compounds that have not yet been closely examined were left on the walls of the sample containers following the sampling. table 5 summarizes some of the tar analysis results for the purposes of comparison. only a small part of the total tar analysis is shown, as a single analysis produces a large number of values.

table 5: summarized results from the tar analysis for comparison

gasification temperature                              770 °c    800 °c
volume of gas [l]                                     156.0     156.0
volume of acetone [ml]                                156.3     161.8
benzene [mg/m3]                                       3 453.7   3 645.5
toluene [mg/m3]                                       1 619.0   1 413.3
m+p+o-xylene + ethylbenzene + phenylethyne [mg/m3]    603.4     954.3
styrene [mg/m3]                                       725.0     627.1
c3-benzene sum [mg/m3]                                1 013.3   847.3
btx sum [mg/m3]                                       7 414.4   7 487.6
oxygenous sum [mg/m3]                                 2 854.2   1 353.6
other substances (tar) [mg/m3]                        398.4     452.0
sum of tar (without btx) [mg/m3]                      5 779     4 730
tar by tar protocol [mg/m3]                           9 740     8 572

a rough estimate of the heating value of the produced gas, based on table 4, is sketched below.
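as a rough, illustrative estimate (not a computation from the paper), the volumetric heating value of the produced gas can be obtained by weighting the combustible components of table 4 with their standard lower heating values. the component values used below (co ≈ 12.6, h2 ≈ 10.8, ch4 ≈ 35.9, c2h6 ≈ 64.4 mj/m3) are textbook constants assumed by us.

```python
# lower heating values of the combustible components [mj/m3], textbook values
LHV = {"co": 12.63, "h2": 10.78, "ch4": 35.88, "ethane": 64.35}

def gas_lhv(composition):
    """heating value of a gas mixture [mj/m3] from volume fractions given in %."""
    return sum(LHV[gas] * composition.get(gas, 0.0) / 100.0 for gas in LHV)

# the 11:04 sample from table 4 (760 degrees c)
sample = {"co2": 16.80, "h2": 6.38, "co": 13.33, "ch4": 5.07,
          "n2": 57.81, "ethane": 0.48, "misc": 0.13}
print(f"{gas_lhv(sample):.2f} mj/m3")   # roughly 4.5 mj/m3
```

a value of around 4–5 mj/m3 is in the range commonly reported for low-calorific producer gas from air-blown fluidized-bed gasification, dominated as it is by nitrogen.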
8 conclusions

the results of our experiments have shown that it is feasible to gasify crushed chipboard. what is important, however, is stepwise and unhurried control of the screw conveyor operation, to avoid clogging. it was also noted that the temperature response of chipboard to the screw conveyor frequency is much slower than in the case of fuel wood chips. the fluctuating pressure loss of the fluidized bed testifies to poorer stability of the gasification process proper. the fluctuations may originate from a lack of homogeneity of the tested fuel. the crushed furniture contained a relatively high volume of fine fraction, which sank to the bottom of the feedstock container during handling, and was therefore the first to enter the gasifier.

the experiments show that the optimum temperature for chipboard gasification is somewhere around 800 °c. in gasification at temperatures in excess of 820 °c, fuel caking occurs in the fluidized bed, and the fuel also solidifies above the grate. at temperatures below 770 °c, there is a growing tar content in the gas that is produced. a comparatively high concentration of undesirable compounds that are harmful to health was assumed, due to the high content of additives used in chipboard manufacture. however, the gas that is produced is just an intermediate product of this technology. the resultant concentrations of harmful substances should only be quantified after the gas has been combusted. to carry on with this research, biofluid has been equipped with a special combustion chamber, in which the gas will be used as a fuel. in the course of the follow-up research, emission measurements will be carried out only with the outlet flue gas.

acknowledgement

this research was realized with support from the faculty of mechanical engineering, brno university of technology, in project fsi-j-10-40 thermal liquidation of contaminated biomass.

references

[1] baláš, m., lisý, m.: water-steam influence on biomass gasification process, acta metallurgica slovaca, 11, 1, 2005, 14–21, (in czech). issn 1335-1532.
[2] werther, j., saenger, m., hartge, e.-u., ogada, t.: combustion of agricultural residues, progress in energy and combustion science, 26, pergamon, 2000, 1–27.
[3] moskalík, j., škvařil, j., štelcl, o., baláš, m., lisý, m.: energy usage of contaminated biomass, acta metallurgica slovaca conference, vol. 2, no. 1, 2011, p. 145–151. issn 1338-1660.
[4] kotlánová, a.: research paper of fuel testing – sample of fuel, testing laboratory, tüv nord czech, 2010.
[5] tame, n. w., dlugogorski, b. z., kennedy, e. m.: formation of dioxins and furans during combustion of treated wood, elsevier science, progress in energy and combustion science, 33, 2007, 384–408.
[6] vamvuka, d., karouki, e., sfakiotakis, s.: gasification of waste biomass chars by carbon dioxide via thermogravimetry. part i: effect of mineral matter, fuel, elsevier science, 2010.
[7] schtowitz, b., brandt, g., grafner, f., schlumpf, e.: dioxin emissions from wood combustion, chemosphere, vol. 29, elsevier science, 1994, 2005–2013.
[8] baláš, m., lisý, m., kohout, p., ochrana, l., skoblia, s.: nickel catalysts gas cleaning, acta metallurgica slovaca, 13, 3, 2007, 18–26, (in czech). issn 1335-1532.

statistical distributions of electron avalanches and streamers

t. ficker

abstract

a new theoretical concept of fractal multiplication of electron avalanches has resulted in forming a generalized distribution function whose multiparameter character has been subjected to detailed discussion.

keywords: fractal multiplication of electron avalanches, furry and pareto statistics, fractal dimension.

1 introduction

so far it has been believed [1] that electron populations n of avalanches crossing a discharge gap d in a homogeneous electric field are governed by furry statistics [2–3]

$$w_0(n,d) = \frac{1}{\bar n}\left(1-\frac{1}{\bar n}\right)^{n-1} \approx \frac{1}{\bar n}\,\exp\left(-\frac{n}{\bar n}\right), \qquad \bar n = \exp\left(\int_0^d \alpha(x)\,\mathrm{d}x\right), \qquad (1)$$

where w_0 and α are the probability density function and the first townsend ionization coefficient, respectively. the validity of this law used to be accepted [1] independently of the size of the avalanches, i.e., regardless of the magnitude of their mean electron content n̄, in spite of the fact that there were strong indications [1], [4–5] showing different statistical behavior. especially with highly populated avalanches (n̄ ≈ 10^5), clear deviations from the furry law (1) were often observed [4–5].
it has been illustrated many times [6–12] that the population statistics of such highly populated avalanches and streamers obey pareto (fractal) statistics

$$w_1(n,D) = A\,n^{-(1+D)}, \qquad (2)$$

where A is a constant and D is the so-called fractal dimension [13]. convincing examples of the different statistical behavior of highly and lowly populated avalanches can be found in the earlier work of richter [5]. he measured population statistics in ether under discharge conditions that were favourable for creating a mixture of pre-streamer and streamer avalanches, with a majority of the latter. such highly populated avalanches (n̄ ≈ 10^8) provided distribution functions with a very deep bending in the semilogarithmic co-ordinate system, where the population statistics of avalanches should show linear behavior in accordance with the furry law. when analyzing richter's curves in the bilogarithmic system, one can easily recognize two neighboring different regions (see fig. 1). there is a longer linear part and a shorter non-linear (bent) part. the bending of the latter corresponds to the exponential furry behavior typical for less populated avalanches, whereas the linear part represents the pareto behavior characteristic for highly populated avalanches and streamers. this figure clearly illustrates that the exponential and power fitting functions in the furry and pareto regions, respectively, represent a good choice among possible analytical candidates, since both fits follow the experimental data well.

another instructive example showing how well the pareto power function represents the experimental data is given in fig. 2. in this case, avalanches were detected across a resistance as short voltage pulses with random heights u. (the resistance (R = 100 kΩ) was connected in series to the discharge gap (c), so that the two components formed a classical rc-circuit – for more details see [14].) since the voltage pulses u were not calibrated against the number of electrons n, the resultant distribution curves w_1 are dependent on u instead of n. assuming linear proportionality u = c·n, the curve w_1(u) = c_0 u^{-(1+D)} will preserve the same shape as w_1(n), i.e. they will both possess the same slope −(1 + D).

recently a new statistical pattern has been developed [14]

$$w(n,D) = G\sum_{j=0}^{J}\frac{\bar k^{\,j}}{\bar n_{d,j}}\left(1-\frac{1}{\bar n_{d,j}}\right)^{n-1} \approx G\sum_{j=0}^{J}\frac{\bar k^{\,j}}{\bar n_{d,j}}\exp\left(-\frac{n}{\bar n_{d,j}}\right), \qquad \bar n_{d,j} = \exp\big(\alpha(d-j\ell)\big) = \frac{n_d}{\bar n^{\,j}}, \qquad (3)$$

$$w(n,D) = G\,f(n,D), \qquad j = 0, 1, \ldots, J, \qquad J = \frac{d}{\ell}-1, \qquad D = \frac{\ln \bar k}{\ln \bar n}. \qquad (4)$$

pattern (3) unifies both furry (1) and pareto (2) statistics into a single analytical form. for example, if J = 0, the furry distribution results from (3), whereas for J > 0 a superposition of furry/exponential functions creates pareto behavior, i.e. the linear section on the graph w(n, D) plotted in bilogarithmic co-ordinates, as shown in fig. 1. in addition, a rigorous mathematical proof was presented in ref. [14], showing equivalence between statistical forms (2) and (3). a short numerical illustration of this superposition follows.
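as a quick numerical illustration of how the superposition (3) produces the pareto section, the sketch below sums a few furry terms with geometrically scaled means and recovers the slope −(1 + D) in log-log co-ordinates. the parameter values are arbitrary illustrative choices, not fitted data.

```python
import numpy as np

def w(n, k_bar, n_bar, n_d, J):
    """generalized distribution of eq. (3), unnormalized (G omitted):
    a superposition of furry/exponential terms, one per avalanche generation."""
    total = np.zeros_like(n, dtype=float)
    for j in range(J + 1):
        n_dj = n_d / n_bar**j            # mean population of generation j
        total += k_bar**j / n_dj * np.exp(-n / n_dj)
    return total

n = np.logspace(0, 6, 400)
k_bar, n_bar, n_d, J = 2.0, 10.0, 1e6, 6    # illustrative parameters
y = w(n, k_bar, n_bar, n_d, J)

# slope of the linear (pareto) section in log-log co-ordinates:
mask = (n > 1e1) & (n < 1e5)
s = np.polyfit(np.log10(n[mask]), np.log10(y[mask]), 1)[0]
print(f"fitted slope = {s:.2f}, i.e. D = -(s + 1) = {-(s + 1):.2f}")
print(f"expected D = ln(k)/ln(n) = {np.log(k_bar) / np.log(n_bar):.2f}")
```

the fitted exponent agrees with D = ln(k̄)/ln(n̄) from eq. (4), which is the numerical counterpart of the equivalence proof mentioned above.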
fig. 1: population statistics of a mixture of lowly and highly populated avalanches. statistical relevancy: furry fit R² = 0.9975 and χ² = 0.00018, pareto fit R² = 0.998 and χ² = 0.0001. data taken from [5]

fig. 2: avalanche statistics (voltage pulses) in air at normal laboratory conditions. after [14]

in order to explain the meanings of the parameters G, k̄, n̄ and n_d used in the generalized analytical form (3), it is necessary to make a brief description of the scenario of fractal avalanche multiplication (fig. 3). let us consider a parent avalanche started at the cathode. after crossing a critical distance ℓ, it may gather a certain number of electrons n̄ = exp(αℓ) (α is the first townsend ionization coefficient), and the uv-radiation associated with collisional ionization may initiate k̄ new displaced avalanches (k̄ is the so-called multiplicity). the displaced avalanches continue on their own independent tracks, and after passing a distance ℓ they may (or may not) generate k̄² new displaced avalanches of the second generation. in general, the j-th generation of displaced avalanches contains k̄^j avalanches, and the highest generation J is limited by the length d of the discharge gap, the critical distance ℓ and the value of α, i.e. J = d/ℓ − 1. the populations of displaced avalanches of the j-th generation that reach the anode can easily be expressed: n̄_{d,j} = exp(α(d − jℓ)). the term n_d = n̄_{d,0} is the population of the parent avalanche (j = 0). each generation has its own distribution function

$$w_{d,j}(n) = \bar k^{\,j}\,\frac{1}{\bar n_{d,j}}\left(1-\frac{1}{\bar n_{d,j}}\right)^{n-1}$$

and their sum over all generations (j = 0, 1, …, J) leads to statistical pattern (3) – more details can be found in ref. [14]. the symbol G represents the normalization constant. since the process of fractal avalanche multiplication is highly stochastic, the average values k̄, n̄, n_d and ℓ are used.

creation of additional smaller avalanches inside the discharge gap is a rather different process from that of branching with streamers. while multiplication of avalanches results in creating a set of separate avalanches, streamer branching leads to a connected network of plasma channels. a common item of these two processes may be photoionization. photoionization is the "driving force" necessary for streamer propagation, but in the case of avalanches it represents one of the possible creation mechanisms. however, streamer branching, as described by recent research papers [15–18], seems to be a complex process. kulikovski [15] interpreted streamer branching as an instability that transforms the non-standard streamer into a number of standard streamers. pancheshnyi [16] described the effects of streamer branching on the basis of background ionization and photoionization. arrayas et al. [17] described the splitting at the streamer tip as a laplacian instability. montijn, ebert and hundsdorfer [18] compared this instability to the branching instability of fluid interfaces in viscous fingering. there are many other works that numerically simulate the propagation and branching of streamers. nevertheless, to our knowledge there is no work that numerically simulates fractal multiplication of avalanches as a basis for explaining the pareto behavior of avalanche population statistics, though existing numerical models of streamer growth could determine these statistics, including avalanche multiplication.
the present paper focuses on a recently published statistical model [14] concerning populations of electron avalanches that undergo fractal multiplication within the discharge gap. the discussion focuses on the model parameters and their connections with the physical processes underlying the phenomenon of fractal avalanche multiplication. the discussion explains in detail the role of each particular parameter, and provides a deeper insight into the model operation.

fig. 3: scheme of fractal avalanche multiplication. after [14]

2 discussion of model parameters

the probability density function (3) has the character of a fitting function containing four parameters J, k̄, n̄ and n_d which, together with the dimension D, deserve further discussion and explanation.

2.1 parameter n_d

this parameter represents the average number of electrons n_d = exp(αd) in the parent avalanche. for a given discharge gap, gas content and physical conditions, the value of n_d is a fixed number. however, from the viewpoint of fractal theory n_d cannot be considered as a fixed reference population scale that will "anchor" all remaining avalanche components (displaced avalanches), since each generation (j) of displaced avalanches has its own scale (n_d / n̄^j) and all these population scales are as important as n_d. thus, the statistical set of all avalanches is a mixture of many mutually different but equally important population scales, none of which stands for a unique, basic scale defining a reference. this is one of the basic properties of all fractal objects. now that the principle of the lost reference scale has been mentioned, it is clear that it makes no sense to change only the parameter n_d, with the others fixed, to model the transition from furry statistics to pareto statistics. the only item that will be influenced by changing n_d is the position of the linear section on the graph log(w) versus log(n). but if a complete set of avalanches is considered, i.e. n ∈ (0, ∞), n_d → ∞ (for d → ∞), the linear section will be infinitely long and there will be no need to change n_d. in such a case the parameter n_d will lose its analytical role, which consists in shifting the linear section along the graph when an incomplete avalanche set is fitted. according to the adopted statistical concept, it is not the value of n_d that governs the transition between furry and pareto statistics. the high values of n_d that can be observed when such a transition occurs seem to be only an accompanying effect (not a primary effect).

2.2 parameter n̄

the value of this parameter is intimately connected with the critical distance ℓ at which the primary avalanche may generate the first displaced avalanche, i.e. n̄ = exp(αℓ). in other words: the average distance ℓ is necessary for the primary avalanche to assemble a certain number of electrons n̄ that are capable of generating sufficient uv radiation (formation of photon sources) to facilitate the creation of displaced avalanches. it is assumed that the effective ionization length λ [15], [19], which is passed by photons prior to their absorption (photoionization events), is independent of ℓ. (the photoionization process is assumed to be effective in the case of molecules that have been excited to their higher energy states in previous collisions with electrons. the different values of the first ionization potentials of n2 and o2 molecules in air make this process still easier.)
although the values of these two parameters can be arbitrarily different, both processes, i.e. the appearance of the critical population n̄ and the appearance of the first displaced avalanche, are almost synchronous, due to the very high speed of the photons. however, it should be mentioned that not all photons are capable of performing photoionization, not every photoionization terminates by starting new avalanches, and, of course, not all newly-created avalanches propagate independently of the parent avalanche (some of them may be integrated into the body of the parent avalanche). the quantity n̄ actually represents a measure of the capability of parent avalanches to generate displaced smaller avalanches by means of the complete photoionization process. from this viewpoint, it is clear that the number of electrons adequate for this purpose must be higher than one, i.e. n̄ > 1.

2.3 parameter J

this parameter determines the extension of the fractal region (the linear section of the graph log(w) versus log(n)). if the whole region is measurable by the experimental device that is used, parameter J can be estimated accurately and represents the number of avalanche generations – i.e. "humps" [14] on the graph – that are capable of covering all of the measured linear region. this is quite easy to realize, e.g., heuristically (trying various numbers of generations). in such a case, the value J provides the right number of displaced avalanche generations and is equal to its upper limit J_max = d/ℓ − 1. however, experimental devices are sometimes not capable of measuring the whole extent of the possible data. they usually measure in some restricted interval (measuring window). the result is a population distribution restricted to an extent narrower than the real one, and parameter J, when fitted to the length of such a linear section, will not represent the exact number of avalanche generations but, instead, will be smaller, J < J_max = d/ℓ − 1. this is the case for the statistics in fig. 4. inserting the length d = 0.7 mm of the discharge gap used and the value ℓ ≈ 57.56 μm into the formula for the upper limit J_max = d/ℓ − 1, one can obtain eleven generations, J_max = 11, but the fitting to the measured linear section in fig. 4 gives J = 7 < J_max, which seems to be a consequence of constrained measurements. therefore, in both cases – in constrained and unconstrained measurements – the quickest way to determine J is heuristic testing.
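the arithmetic of the upper limit quoted above can be checked in a couple of lines, using the values from the text:

```python
d = 0.7e-3        # discharge gap [m]
ell = 57.56e-6    # critical distance [m]
j_max = d / ell - 1
print(j_max)      # ~11.16, i.e. eleven complete generations of displaced avalanches
```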
however, statistical pattern (3) has been proposed as a fitting pattern, and all its parameters, including k and d, can be determined either by using an optimizing procedure or heuristically. for example, if the whole distribution is measured, the entire linear section is available, from which jmax can be determined. as soon as jmax is known, the parameter � �n d j� �exp ( )max� 1 is available. parameter d (fractal dimension) can be estimated from the experimental data by fitting the linear section of the graph by a straight line – a regression line possessing slope s – i.e., d s� � �( )1 . then parameter k can be estimated by using k n d� . 2.5 parameter d although the fractal dimension d does not explicitly occur in the statistical pattern (3), it does have a fundamental meaning. for this reason it will be convenient to discuss its physical interpretation and properties in connection with the fractal production of displaced avalanches. the value of d is defined (4) by the values of parameters k and n, i.e. d k n� ln ( ) ln ( ) where k �1 and n �1. a fractal dimension, like a topological dimension, must be represented by a positive number, and because splitting of displaced avalanches occurs in the discharge gap, which is a three-dimensional euclidean space, the fractal dimension d cannot be larger than three, i.e. 0 3$ $d . an increase in the 44 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 50 no. 1/2010 fig. 4: fractal pattern fit of population statistics (voltage pulses – data taken from fig. 2). the quantity ud � 10 1 555. mv is a voltage scale corresponding to the average voltage pulse created by the largest avalanches detected. the quantity g is not a normalization constant but only indicates the shift between unnormalized data and the unnormalized distribution f n d( , ). after [14]. d-value is possible only if k increases or n decreases, or if both these changes act simultaneously. an increase in k means that the “splitting” is more effective (higher multiplicity means splitting into a larger number of displaced avalanches). a decrease of n (at fixed �) means a smaller critical distance �, and, therefore, a larger number ( jmax) of avalanche generations ( )maxj d� �� 1 , which implies a numerous set of displaced avalanches. thus, large d means that the discharge is accompanied by an abundant swarm of displaced avalanches, which can be interpreted as a tendency to delocalize the discharge over a larger portion of the inter-electrode space. in short, a higher d-value means a higher discharge delocalization and, conversely, a lower d-value indicates a more localized discharge with sporadic appearance of displaced avalanches. since the effect of discharge localization/delocalization is undoubtedly limited, among other things by the electric field e used in experiments, it will be no surprise that such a field dependence d(e) has been observed previously [9], because of the acting space charges of the parent avalanches, i.e. due to the dependence d e( ( ))� . the variability of parameter d when going from less populated to highly populated avalanches is well observable by comparing figs. 4 and 5. less populated avalanches, whose statistics are given in fig. 4, generate lower values of dimension d and also smaller n and k, in comparison with big avalanches with a prevalence of streamers (fig. 5). this means that streamer-like avalanches split more easily into side avalanches. 
3 fitting procedure

function (3), like any other multiparameter function, must be handled carefully when performing a fitting procedure. it is the starting values of the fitting parameters that have an essential influence on the results of an optimizing procedure. an inappropriate set of starting values may lead to final values that satisfy the mathematical conditions but may be physically completely unacceptable. to avoid such a failure, a proper choice of input values is necessary. in the case of function (3), there are several useful aids for establishing a proper choice.

firstly, the value of n_d should be estimated directly from the statistical data rather than from the relation n_d = exp(αd), especially when the measuring device provides only a narrow acquisition range. it is suggested to estimate the n_d value as the horizontal asymptote of the 'lowest hump' on the graph w(n, D). if only a linear section (without any 'humps') is available, the 'corresponding asymptote' can be simulated by using w(n, d) = (1/n̄_d) exp(−n/n̄_d). however, if only a restricted measuring range is available, such a value may not have the meaning of the populations of the largest parent avalanches. instead, it may correspond to the populations of some displaced avalanches. however, from the viewpoint of the fitting procedure it will be a right choice in either case.

fig. 5: fractal population statistics of a mixture of lowly and highly populated avalanches. data taken from fig. 1. g is a normalization constant

the second parameter, D, can also be reliably estimated prior to starting the optimizing procedure. linear regression of either the linear section or the 'humpy section' will provide a slope s and, thus, the value D = −(s + 1). parameter D determines the ratio ln(k̄)/ln(n̄), which may assist in estimating k̄ and n̄ – the value of k̄ does not usually exceed 4. as soon as the estimate of n_d, k̄ and n̄ has been made, one can try to draw the graph w(n, D) using only the first two terms (j = 0 and j = 1) of sum (3), and compare the result with the experimental data. after a re-adjustment of k̄ and n̄, further terms (j > 1) of sum (3) can be added until a full cover of the linear section of the measured data is accomplished. if necessary, final corrections of some parameters are recommended in order to ensure a good accord with the experimental data. only such 'pre-processed' values can be successfully used as proper input data for the chosen optimizing (fitting) procedure. if the scenario described above is followed, it will be easier to find results that may satisfy both the mathematical and physical conditions.
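a possible shape of the 'pre-process, then optimize' workflow of this section is sketched below with a standard least-squares routine; the synthetic data, the starting values and the bounds are our illustrative assumptions, not the authors' actual procedure.

```python
import numpy as np
from scipy.optimize import curve_fit

def make_model(J):
    """unnormalized pattern (3) with a fixed number of generations J,
    evaluated in log10 co-ordinates (the space in which the data is linear)."""
    def model(log_n, log_G, k_bar, n_bar, n_d):
        n = 10.0**log_n
        terms = [k_bar**j * n_bar**j / n_d * np.exp(-n * n_bar**j / n_d)
                 for j in range(J + 1)]
        return log_G + np.log10(np.sum(terms, axis=0) + 1e-300)
    return model

# synthetic 'measured' linear section with slope -1.3, standing in for real data
log_n = np.linspace(1.0, 5.0, 60)
data = -1.3 * log_n + 0.05 * np.random.default_rng(0).normal(size=60)

# pre-processed starting values in the spirit of section 3:
# D from the slope, k_bar <= 4, n_d from the lowest hump
p0 = [0.0, 2.0, 10.0, 1e6]
bounds = ([-5.0, 1.0, 1.5, 1e2], [5.0, 4.0, 1e3, 1e9])
popt, _ = curve_fit(make_model(J=6), log_n, data, p0=p0, bounds=bounds)
print(dict(zip(["log_G", "k_bar", "n_bar", "n_d"], np.round(popt, 3))))
```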
4 transition between furry and pareto statistics

the proposed statistical concept (3) is based on creating displaced avalanches, formed most probably by photoionization during the initial stages of the collision process in the parent avalanches, if a sufficiently high electric field (and high E/p) is present. the term "sufficiently high electric field" refers to such a field as is capable of ensuring the production of highly populated avalanches (for example, in air at normal atmospheric conditions n̄ ≈ 10^5). in such a case, displaced avalanches may accompany the highly populated parent avalanches that terminate either without converting into streamers or as regular streamers that may or may not undergo channel branching. therefore, the condition for the transition from furry statistics to pareto statistics of electron populations is a sufficiently strong electric field facilitating the creation of displaced avalanches.

there is also a simple mathematical condition governing the transition between these two statistics. furry's distribution can be expected if ℓ ≥ d. as a consequence of this condition, one can find J_max = d/ℓ − 1 ≤ 0 and, in addition, n̄ = exp(αℓ) → exp(αd). this means that the parent avalanche will reach the anode without starting displaced avalanches and, thus, k̄ loses its sense, and also D cannot be rigorously defined. the condition J_max ≤ 0 excludes the presence of displaced avalanches and, on the other hand, ensures the participation solely of parent avalanches. the parent avalanches cross the whole discharge gap and form furry statistics (1) with an average population n̄ = exp(αd), which represents a fixed non-fractal reference scale.

5 conclusion

instead of simple photoionization that acts solely within the primary (parent) avalanche, a new concept of displaced avalanche splitting has been proposed that allows for photoionization going beyond the parent avalanche channel, and for creating new, smaller, independent avalanches. the new displaced avalanches modify the overall population distribution of avalanches and cause a transition from furry to pareto statistics. such a transition may occur especially when the critical distance ℓ for initiating displaced avalanches is essentially smaller than the discharge gap itself, ℓ ≪ d. furry and pareto statistics can be unified into a single generalized analytical pattern that is capable of following the experimental data faithfully in both statistical regimes (fig. 5). the main limitation of the pattern consists in its restriction to homogeneous or quasi-homogeneous background electric fields.

acknowledgments

this work has been supported by grant no. 202/07/1207 of the grant agency of the czech republic.

references

[1] raether, h.: electron avalanches and breakdown in gases. london: butterworths, 1964.
[2] furry, w. h.: phys. rev., vol. 52 (1937), p. 569.
[3] wijsman, r.: phys. rev., vol. 75 (1949), p. 833.
[4] frommhold, l.: zeitschrift für physik, vol. 150 (1958), p. 172.
[5] richter, k.: zeitschrift für physik, vol. 158 (1960), p. 312.
[6] ficker, t.: j. appl. phys., vol. 78 (1995), p. 5289.
[7] ficker, t., macur, j., kliment, m., filip, s., pazdera, l.: j. el. eng., vol. 51 (2000), p. 240.
[8] ficker, t., macur, j., pazdera, l., kliment, m., filip, s.: ieee trans. diel. el. insul., vol. 8 (2001), p. 220.
[9] ficker, t.: ieee trans. diel. el. insul., vol. 10 (2003), p. 689.
[10] ficker, t.: ieee trans. diel. el. insul., vol. 10 (2003), p. 700.
[11] ficker, t., macur, j., kapička, v.: czech. j. phys., vol. 53 (2003), p. 509.
[12] ficker, t.: ieee trans. diel. el. insul., vol. 11 (2004), p. 136.
[13] mandelbrot, b. b.: the fractal geometry of nature. new york: freeman, 1983.
[14] ficker, t.: j. phys. d: appl. phys., vol. 40 (2007), p. 7720.
[15] kulikovski, a. a.: j. phys. d: appl. phys., vol. 33 (2000), p. 1514.
[16] pancheshnyi, s.: plasma sour. sci. technol., vol. 14 (2005), p. 645.
the pre-vault can overlap the preceding pre-vault by 2.5 m to 0.5 m (the excavation advance length is selected according to the geological conditions that are encountered, from 2.5 to 4.5 m). full-face excavation then takes place under the protection of the pre-prepared primary lining. the tunnel excavation was predominantly in plastic clays and claystones, and the maximum thickness of the quaternary deposits (gravels and sands) was about 6 m. the area was also affected by previous undocumented mining activities. the very complicated geological conditions caused many difficulties, resulting in a significant collapse in 2003. the collapse occurred when about 860 m of the primary lining of the tunnel had been completed. about 77 m of primary lining was destroyed (chain effect of pre-vaults) and a further 44 m of the tunnel was filled with collapsed material. excavation ceased for several months directly after the collapse.

proposed tunnel recovery

a decision was made to separate the collapsed area into 9 m-long sections using 16 m-wide transverse pile walls constructed from the surface. the walls were formed from piles 1.18 m in diameter, and the walls reached 3 m below the tunnel profile. the collapsed tunnel was separated into 7 sections in the longitudinal direction. a shaft had to be constructed in the area of a buried perforex machine (figure 1). the sprayed concrete lining (scl) method was used for re-excavating the collapsed area. the primary lining was designed as sprayed concrete, reinforced by lattice girders and a mesh. the tunnel face had to be excavated in several stages. the proposed excavation method had to be properly statically evaluated prior to application, and all support measures had to be optimised.

fig. 1: longitudinal cross-section, including separation of the collapsed area using pile walls

table 1: input geotechnical parameters (the abbreviations of the layers correspond with figure 2)

geotechnical unit | γ (kn/m3) | c (kpa) | φ (°) | edef (mpa) | ν
quaternary deposits qd | 19.2 | 11.5 | 18 | 17 | 0.30
strongly weathered claystone swc | 19.2 | 11.0 | 10 | 19 | 0.40
collapsed material cm | 19.2 | 11.0 | 8 | 19 | 0.40
weathered claystone wc | 19.5 | 17.0 | 19 | 19 | 0.40
claystone a ca | 19.5 | 36.0 | 19 | 32 | 0.40
claystone b cb | 19.5 | 40.0 | 20 | 35 | 0.38
claystone c cc | 19.5 | 45.0 | 25 | 50 | 0.38
coal seam cs | 19.5 | 30.0 | 25 | 60 | 0.30

fig. 2: 2d model finite element mesh (the abbreviations of the layers correspond with table 1)

the calculations were generated using the finite element method (fem). due to the complexity of the problem, ordinary 2d calculations were supplemented by 3d calculations to verify some 3d effects (e.g. the impact of tunnel separation by pile walls).

original calculations

the initial static calculations for designing the primary lining and the excavation sequence were generated using 2d fem (plane strain model) [2], and the rock mass was modelled using a linear elasto-plastic mohr-coulomb model (figure 2). rib software was used for the calculations [4]. the primary tunnel lining was evaluated using interaction curves produced by fine software [5]. the input parameters were derived from a supplementary site investigation conducted after the collapse. the initial input parameters for each geotechnical unit are summarised in table 1. the value used for the coefficient of lateral pressure at rest was 0.8.
the collapsed ground was quite heterogeneous (a mixture of cohesive and non-cohesive soils), and the selected mean values were rather conservative. the static calculations modelled the excavation and support installation in several stages (top heading excavation, top heading lining, bench excavation, bench lining, invert excavation and invert lining closure), and the model included two types of sprayed concrete: three-day-old green sprayed concrete, and sprayed concrete with its final parameters. the lining thickness was 35 cm. the top heading lining was expected to be regularly closed by a temporary invert, which is a crucial measure for achieving equilibrium in geological conditions of this type. geometry also plays a very important role in minimizing the bending moments (smaller eccentricity) in the lining, and the lining geometry was optimised in this way. the calculated maximal axial forces were in the range 1500 kn to 2450 kn, depending on the stage of excavation. the calculations confirmed that the maximum deformations of the primary lining should not exceed 50 mm, and monitoring during construction generally confirmed these expectations. the shape of the temporary top heading invert was designed as a compromise between the optimum static profile and the space requirement for the machinery. the shape of the permanent invert was more appropriate from the static point of view, as no compromises were required. the tunnel lining evaluation confirmed that its capacity was sufficient.

verification calculations

3d calculations were generated using plaxis 3d tunnel software [6]. the major aim of this modelling was to evaluate the impact of the pile walls on the excavation and the lining. the model was prepared for analysing the conditions input into the 2d calculations (location of geotechnical units, input parameters, tunnel lining, etc.). the 3d model was 127 m in height, 90 m in width, and 97 m in length (figure 3). the model comprised half of the tunnel, and used symmetry. first, the top heading construction was modelled (figure 4) in several steps to simulate top heading excavation. each advance was modelled in two phases (excavation and tunnel lining application). second, the bench and invert construction was modelled in several steps (figure 5). again, each advance was modelled in two phases (excavation and tunnel lining application). one model was generated with pile walls (figure 6); the second was generated without them.

fig. 3: 3d model finite element mesh (the geotechnical layers and parameters correspond with the 2d model)

fig. 4: modelling of the top heading construction

fig. 5: modelling of the bench and invert construction

fig. 6: details of the 3d model with pile walls

fig. 7: top heading tunnel lining deformations (impact of pile walls)

impact of pile walls: the model included pile walls spaced at 9 m, with a thickness of 1 m (figure 6). the pile walls were modelled as a linear-elastic material, and they were separated into two parts to simulate the real structure (see figure 6): a. lower part (in the tunnel area) filled with concrete, with the properties e = 25 gpa, ν = 0.2; b. upper part (above the tunnel) filled with suspension, with the properties e = 10 gpa, ν = 0.2. two calculations were generated: with walls and without walls. the results showed the stiffening effect of the pile walls: the construction of the pile walls means a reduction in deformations (figure 7) and bending moments of about 50 %.
the differences between the 2d results and the 3d results (see table 2) were caused by the original estimation of the relaxation. the choice of low relaxation (i.e. a fast ring closure assumption) in the 2d calculations was determined mainly by taking a conservative approach to the primary lining analysis (to obtain higher axial forces).

impact of bench and invert excavation: a further purpose of the 3d calculations was to evaluate the effect of bench and invert excavation on the top heading lining performance (i.e. when the tunnel invert should be closed). the invert was modelled to be closed after 2 m (figure 8), 4 m, and 8 m advances. the calculations showed that the values of the internal forces in the top heading lining are not a significant problem. a more significant problem would be the deformations, which would double in the case of 8 m advances. the next problem was the forces in the longitudinal direction and the shear forces in the lining close to the walls (figure 9). thus, a maximum advance of 4 m was recommended for bench and invert excavation.

fig. 8: bench and invert construction modelling with 2 m advances

fig. 9: concentration of shear forces in the tunnel lining at the interface of the pile walls and the ground

top heading face stability: calculations of the top heading face stability were also generated. the bench and invert excavation was expected to be separated by at least one pile wall in order to have a minimal effect on the stability of the top heading face. again the calculation was performed in several stages to simulate the tunnel construction procedure (installation of pile walls, followed by several excavations and installations of lining). the tunnel face stability was calculated when the tunnel face was 2 m behind the pile wall and 1 m of the excavation was not supported by the tunnel lining. the safety factor was calculated using the phi-c reduction procedure (an option available in plaxis for computing safety factors). in the phi-c reduction approach, the strength parameters tan φ and c of the ground are successively reduced until failure occurs. the resulting safety factor is the ratio of the initial and final shear strength parameters. the calculations showed a safety factor of 1.1, which indicated problems with the top heading face stability. however, the calculation did not include the designed support measures (support wedge, micropile umbrellas and jet grouting columns, further sequencing of the face, etc.). the results showed a favourable effect of the pile walls in limiting the propagation of the deformations (figure 10).

fig. 10: propagation of the top heading face deformations
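the phi-c reduction idea described above can be illustrated with a short numerical sketch. this is not the plaxis implementation: it is a minimal illustration in python, assuming a hypothetical limit-state check `is_stable` (in practice supplied by the fe solver) and the usual definition of the safety factor as the reduction applied to tan φ and c at failure:

```python
import math

def phi_c_reduction(phi_deg, c_kpa, is_stable, step=0.01, max_factor=5.0):
    """sketch of the phi-c strength reduction procedure: divide tan(phi)
    and c by a growing factor until the (externally supplied) stability
    check fails; the factor at failure is the safety factor."""
    factor = 1.0
    while factor < max_factor:
        phi_red = math.degrees(math.atan(math.tan(math.radians(phi_deg)) / factor))
        c_red = c_kpa / factor
        if not is_stable(phi_red, c_red):
            return factor  # resulting safety factor
        factor += step
    return max_factor

# illustrative use with a purely hypothetical toy stability criterion,
# using the collapsed material parameters of table 1 (phi = 8 deg, c = 11 kpa):
fos = phi_c_reduction(
    phi_deg=8.0, c_kpa=11.0,
    is_stable=lambda phi, c: math.tan(math.radians(phi)) + c / 100.0 > 0.2)
print(f"safety factor ~ {fos:.2f}")
```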
construction experience

there was significant anxiety about the ground behaviour prior to the start of excavation, as the area had been significantly disrupted by the previous collapse (the area in and above the tunnel profile). cores were therefore drilled from the tunnel face prior to excavation of each section between the pile walls, and a decision was made on ground improvement and support measures based on the results of drilling. in chamber 1 (see figure 1), horizontal jet grouting columns were installed into the face to increase the face stability. this measure was also used in chamber 3 (see figure 1). the tunnel profile was regularly protected by micropile umbrellas (figure 11); the micropiles were embedded into the pile walls at both ends. some attempts were made to embed the micropiles into the horizontal jet grouting columns (to increase their stiffness), but like the jet grouting columns drilled into the face, this approach was discontinued after the third section.

fig. 11: tunnel construction under micropile umbrellas

all excavations were done with 1 m advances. the excavated profile was supported by wire meshes, lattice girders and sprayed concrete. the face stability was regularly increased by a support wedge (ground left in the centre of the excavated profile). in addition, a flash coat of sprayed concrete (several centimetres) was immediately applied on the face and tunnel perimeter after the excavation. the top heading face was sometimes excavated and sprayed in several steps (in cases of local instability). all these measures helped significantly to increase the tunnel face stability, and the calculated low tunnel face stability was confirmed during excavation by several local failures. in addition, the temporary top heading invert was closed regularly. originally it was closed in 2 m or 3 m steps, but later these sections were extended to 4 m. bench and invert excavation was carried out more than 9 m behind the top heading face (the length of one chamber). the excavation started at the end of february 2006 and was completed without major problems at the beginning of august 2006.

results of monitoring

the ground deformations were recorded by ordinary geotechnical monitoring. the sprayed concrete lining was monitored by convergence monitoring, with three points on the top heading and two points on the bottom. the convergence cross-sections were located in the centre of all pile walls and also between the pile walls (generally one or two intermediate monitoring profiles between two pile walls). the surface settlement was monitored on the top of all pile walls; some intermediate points at ground level between the walls were also monitored. the maximum surface settlement monitored above the tunnel was 28 mm (the area above chamber 2). 2d modelling predicted a surface settlement of 40 mm, and 3d modelling with pile walls predicted 20 mm. all convergences generally remained below 40 mm, but the monitoring results showed significant scatter of the tunnel lining deformations (table 2), mainly due to the very difficult heterogeneous ground conditions. in chamber 2, a vault settlement of 93 mm was monitored. this high deformation was caused by local problems (tunnel lining cracking), which did not affect the overall stability of the tunnel. 2d modelling predicted a vault settlement of 50 mm, while 3d modelling with pile walls predicted a vault settlement of 25 mm. the results are compared with the monitoring results in table 2. naturally, even 3d modelling could not reflect all aspects of the excavation. when the modelling results are compared with the actually measured deformation values, some differences become obvious, but they are not too great in this particular case. the differences are mainly caused by factors which could not be properly included in the models (heterogeneous ground, timing and quality of the support measures, quality of the grouting, etc.).

conclusion

the brezno tunnel had to be excavated in very complicated geological conditions. these ground conditions were significantly worsened by the collapse of a long section of the tunnel lining. the design of the excavation procedure and of appropriate support measures for re-excavation of the collapsed tunnel was not a straightforward task. static calculations of the tunnel re-excavation were provided using the 2d finite element method (rib software).
further calculations for evaluating the rock mass behaviour in the collapsed area were provided using plaxis fem software. 2d calculations were used to provide sensitivity studies [3], and 3d modelling assisted the evaluation of the tunnel face stability (impact of the pile walls, ground improvement, etc.). the results of the modelling were compared with the monitoring results, and the construction experience (technical problems, performance of various support measures, etc.) has also been briefly described.

2d and 3d modelling were used to evaluate the ground and tunnel behaviour prior to re-excavation. the modelling provided very useful information prior to the start of construction. it led to optimisation of the tunnel shape and the excavation sequence. it indicated tunnel face stability problems, which had to be addressed by various measures. it also anticipated quite well some of the problems which subsequently appeared during the excavation (low stability of the excavation face, concentration of stress between the unclosed and closed linings, the positive influence of the temporary invert and the dividing pile walls, etc.).

table 2: comparison of monitored and calculated tunnel lining crown settlement

tunnel chainage (m) | monitoring (mm) | 2d model (mm) | 3d model (mm)
2004 | 5 | 50 | 25
2007 | 19 | 50 | 25
2012 | 21 | 50 | 25
2019 | 20 | 50 | 22
2025 | 20 | 50 | 25
2027 | 27 | 50 | 25
2031 | 33 | 50 | 22
2034 | 37 | 50 | 25
2036 | 50 | 50 | 25
2040 | 93 | 50 | 22
2043 | 38 | 50 | 25
2048 | 39 | 50 | 22
2052 | 40 | 50 | 25
2057 | 40 | 50 | 22
2061 | 40 | 50 | 25
2066 | 31 | 50 | 22
2070 | 24 | 50 | 25
2075 | 26 | 50 | 22
2079 | 11 | 50 | 25
2081 | 17 | 50 | 24

regarding the excavation itself, a flexible approach to the construction work on the part of the contractor was essential. in some cases, it was necessary to respond to the properties and behaviour of the ground in a very flexible manner; it was impossible to optimise all aspects of the excavation in the planning phase. the excavation procedure was reasonably well optimised during construction, so the collapse recovery was completed without any further significant difficulties. the brezno tunnel construction was successfully completed, and the tunnel was opened for traffic in april 2007.

acknowledgement

financial support from research grant tacr ta01011816 is gratefully acknowledged.

references

[1] hilar, m., heřt, j., smida, m.: soft ground excavation of the březno tunnel. proceedings of the world tunnelling congress, agra, 2008.
[2] hilar, m., john, v.: recovery of a collapsed section of the brezno tunnel. tunel, vol. 16, 3/2007.
[3] barták, j., hilar, m., pruška, j.: numerical modelling of the underground structures. acta polytechnica, vol. 42, no. 1/2002.
[4] rib: software for civil engineers. http://www.rib.cz
[5] fine: civil engineering software. http://www.fine.cz
[6] plaxis: software for geotechnical engineers. http://www.plaxis.nl

doc. ing. matouš hilar, msc., phd., ceng., mice, phone: +420 604 862 686, e-mail: hilar@d2-consult.cz, d2 consult prague, s.r.o., zelený pruh 95/97 (kuta), 140 00 praha 4, czech republic; department of geotechnics, faculty of civil engineering, czech technical university in prague, thákurova 7, 166 29 praha 6, czech republic
an application of the individual channel analysis and design approach to control of a two-input two-output coupled-tanks system

david j. murray-smith

school of engineering, rankine building, university of glasgow, glasgow g12 8at, scotland, united kingdom
correspondence to: david.murray-smith@glasgow.ac.uk

abstract

frequency-domain methods have provided an established approach to the analysis and design of single-loop feedback control systems in many application areas for many years. individual channel analysis and design (icad) is a more recent development that allows neo-classical frequency-domain analysis and design methods to be applied to multi-input multi-output control problems. this paper provides a case study illustrating the use of the icad methodology for an application involving liquid-level control for a system based on two coupled tanks. the complete nonlinear dynamic model of the plant is presented for a case involving two input flows of liquid and two output variables, which are the depths of liquid in the two tanks. linear continuous proportional plus integral controllers are designed on the basis of linearised plant models to meet a given set of performance specifications for this two-input two-output multivariable control system, and a computer simulation of the nonlinear model and the controllers is then used to demonstrate that the overall closed-loop performance meets the given requirements. the resulting system has been implemented in hardware, and the paper includes experimental results which demonstrate good agreement with simulation predictions. the performance is satisfactory in terms of steady-state behaviour, transient responses, interaction between the controlled variables, disturbance rejection and robustness to changes within the plant. further simulation results, some of which involve investigations that could not be carried out in a readily repeatable fashion by experimental testing, support the conclusion that this neo-classical icad framework can provide additional insight within the analysis and design processes for multi-input multi-output feedback control systems.

keywords: multivariable, feedback, control, coupled tanks, frequency domain, nonlinear.

1 introduction

problems of liquid level control arise in many industries. common examples include control of levels in blending and reaction vessels within chemical processes. this paper describes the application of the individual channel analysis and design (icad) approach to the development, implementation and testing of conventional diagonal controllers for a system involving liquid levels in a pair of coupled tanks [1]. the objective of the paper is to provide a detailed case study to illustrate use of the icad methodology and to demonstrate some of the benefits of this neo-classical frequency-domain approach to problems involving multivariable control. the coupled-tanks equipment around which the control system is designed has two inputs, which are external liquid flow-rates into each of the tanks. the two outputs are the resulting levels of liquid in the tanks. there is only one outflow, and this is from the second tank. figure 1 is a schematic diagram of this system.

figure 1: schematic diagram of the coupled-tanks equipment

established frequency-domain methods, such as bode/nyquist analysis, are of central importance as a basis for design tools for single-input single-output feedback control systems in many application areas.
the success of the frequency-domain approach is due, in part, to the graphical nature of these techniques, which provides transparency and flexibility in satisfying design specifications in the presence of practical constraints. the extension of classical methods of analysis and design to multivariable systems involving more than one input and more than one output can introduce difficulties. it may still be possible, in cases where cross-coupling is not strong, to design the control system using approaches involving one loop at a time. however, in cases where dynamic interactions between loops are significant, more skill and experience is necessary in order to produce a successful design, and a process of tuning using trial-and-error methods may be needed.

the individual channel analysis and design (icad) methodology was developed in the early 1990s and allows frequency-domain methods to be applied to the problems of analysis and design of multi-input multi-output feedback control systems (see, e.g., [2–6]). this approach allows an m-input m-output feedback control problem to be split into m single-input single-output problems without loss of structural information. each controlled output is paired with a specific reference input to form what is termed a "channel". the icad approach makes direct use of the customer performance specification for the different channels to provide a framework within which classical single-input single-output control engineering concepts can be extended to multi-input multi-output cases involving significant levels of cross-coupling. traditional methods for applications involving single-input single-output (siso) systems can thus be applied to multi-input multi-output (mimo) problems. this includes the use of open-loop system information in the form of nyquist and bode plots for system analysis and design, along with simple measures of robustness, such as gain and phase margins. it must be emphasized that the icad approach is not, itself, a design method, but should be viewed as a framework through which useful insight may be gained about the dynamics of the plant and the characteristics of the complete controlled system. performance can be assessed for any chosen form of linear controller (which may be designed using any suitable approach), and limitations of a design can be investigated. thus, compared with some other approaches to multivariable control, it can be based on traditional analysis and design methods familiar to all control system designers. each channel has its own customer-defined performance specifications, and these may be expressed in a simple way in terms of siso requirements.

2 the coupled-tanks system

the two-input coupled-tanks laboratory system of figure 1 consists of a container (of volume 6 litres), with a central partition which divides it into two separate tanks. coupling between these tanks is provided by a number of holes of various diameters positioned near the base of the partition. the strength of coupling may be adjusted through the insertion of plugs into one or more of these holes. the system is equipped with a drain tap which is under manual control, and this allows the output flow rate from one of the tanks to be adjusted. both tanks have inflows from electrically driven variable-speed pumps and are equipped with sensors that can detect the level of liquid and provide a proportional electrical output voltage.
the hardware is based around a single-input commercial product intended for teaching applications (tecquipment ltd) [1]. this had a flow input only to tank 1, and was modified at the university of glasgow through the addition of the second pump to provide the inflow to tank 2. the original resistive level sensors have been replaced by more accurate differential-pressure based depth sensors. the derivation of a detailed nonlinear model of the system may be found in [7] and in a more recent conference paper [8], which also includes a very brief account of the application of the icad approach to the design of a control system for this process. the model is based on the application of the principle of conservation of mass to the liquid within each tank. bernoulli's equations provide the basis for determining the flow from one tank to the other and from the second tank to the external environment. this leads to the following pair of equations, involving the variables shown in figure 1 (upper-case A1, A2 denote the tank cross-sectional areas, lower-case a1, a2 the orifice areas):

A1 dh1/dt = qi1 − cd1 a1 √(2g(h1 − h2)) (1)

A2 dh2/dt = qi2 + cd1 a1 √(2g(h1 − h2)) − cd2 a2 √(2g(h2 − h3)) (2)

these equations describe the dynamics of the coupled-tanks system, in nonlinear form, for all cases for which the level in tank 2 is below that in tank 1. it is, of course, possible to derive a similar set of nonlinear equations to describe the system for cases involving a liquid level in tank 2 which is greater than the level in tank 1. parameter values for the laboratory system are as follows:

cross-sectional area of the tanks, A1 = A2 = 9.7 × 10⁻³ m²
cross-sectional area of the orifice between tank 1 and tank 2, a1 = 3.956 × 10⁻⁵ m²
cross-sectional area of the orifice representing the outlet drain of tank 2, a2 = 3.85 × 10⁻⁵ m²
height of the outlet drain above the base of tank 2, h3 = 0.03 m
gravitational constant, g = 9.81 m s⁻²
maximum flow rates for the inputs to tank 1 and tank 2, qi1 max = qi2 max = 5 × 10⁻⁵ m³ s⁻¹

the minimum flow rates for these inputs are zero, since the pumps are not reversible and thus negative inputs are not possible. the maximum levels of liquid in tank 1 and tank 2 are h1 max = h2 max = 0.3 m. the minimum level possible in each tank is 0.03 m, which corresponds to the height of the outlet drain. the values of the discharge coefficients cd1 and cd2 have to be determined empirically, and the values obtained depend on the system operating point. the value used for cd1 in the design calculations was 0.63, and the value used for cd2 was 0.58. in addition to the above parameters, electrical signals in the system are related to the variables of the model (as described by equations (1) and (2)) through the following two parameters:

pump flow-rate calibration constant, gp = 7.2 × 10⁻⁶ m³ s⁻¹ v⁻¹
liquid depth sensor calibration constant, gd = 33.33 v m⁻¹

it should be noted that no dynamic representation is included for the two pumps and the associated electrical drives, as it was found through hardware testing that they show a very fast response to changes of electrical input compared with the level changes within the tanks themselves. time constants associated with the pumps were therefore neglected for the purposes of the control system design.
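as an illustration of how equations (1) and (2) and the parameter values above can be exercised, the following is a minimal simulation sketch in python (not the matlab/simulink implementation used later in the paper); the max(0, ...) guards, the choice of inputs and the forward-euler integration scheme are choices made here for brevity:

```python
import math

# parameter values from the text (si units)
A1 = A2 = 9.7e-3            # tank cross-sectional areas (m^2)
a1, a2 = 3.956e-5, 3.85e-5  # orifice cross-sectional areas (m^2)
h3 = 0.03                   # height of the outlet drain (m)
g = 9.81                    # gravitational constant (m/s^2)
cd1, cd2 = 0.63, 0.58       # discharge coefficients

def derivatives(h1, h2, qi1, qi2):
    """right-hand sides of equations (1) and (2), valid for h1 >= h2 >= h3."""
    q12 = cd1 * a1 * math.sqrt(2.0 * g * max(h1 - h2, 0.0))  # inter-tank flow
    q23 = cd2 * a2 * math.sqrt(2.0 * g * max(h2 - h3, 0.0))  # outlet flow
    return (qi1 - q12) / A1, (qi2 + q12 - q23) / A2

# simple forward-euler integration from an arbitrary initial state
h1, h2 = 0.115, 0.071
dt = 0.1
for _ in range(int(600 / dt)):                # 600 s of simulated time
    dh1, dh2 = derivatives(h1, h2, qi1=2.5e-5, qi2=1.0e-5)
    h1, h2 = h1 + dt * dh1, h2 + dt * dh2
print(f"levels after 600 s: h1 = {h1:.3f} m, h2 = {h2:.3f} m")
```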
for the preliminary stages of the design it is also appropriate to consider a linearised model, within which the variables represent small variations of the system variables about steady-state values:

h̃1(t) = h̄1 − h1(t) (3)
h̃2(t) = h̄2 − h2(t) (4)
q̃i1(t) = q̄i1 − qi1(t) (5)
q̃i2(t) = q̄i2 − qi2(t) (6)
q̃23(t) = q̄23 − q23(t) (7)

in equations (3)–(7) the variables that have a horizontal bar above them denote values at the chosen operating point, which is normally defined by a steady-state condition. if equations (1) and (2) are re-arranged in the standard state-space form, we get a pair of nonlinear equations:

dh1/dt = f1(h1, h2, qi1) (8)
dh2/dt = f2(h1, h2, h3, qi2) (9)

then, since the level h3 may be assumed constant, linearisation produces the standard linear state-space model (matrices are written row-wise here, with rows separated by semicolons):

[dh̃1/dt; dh̃2/dt] = [∂f1/∂h1, ∂f1/∂h2; ∂f2/∂h1, ∂f2/∂h2] [h̃1; h̃2] + [∂f1/∂qi1, ∂f1/∂qi2; ∂f2/∂qi1, ∂f2/∂qi2] [q̃i1; q̃i2] (10)

in equation (10) all the partial derivatives must be evaluated at the operating point (h̄1, h̄2, q̄i1, q̄i2). the resulting linearised equation, after evaluation of the partial derivatives, has the form:

[dh̃1/dt; dh̃2/dt] = [−α1/A1, α1/A1; α1/A2, −(α1 + α2)/A2] [h̃1; h̃2] + [1/A1, 0; 0, 1/A2] [q̃i1; q̃i2] (11)

where

α1 = (cd1 a1 / 2) √(2g/(h̄1 − h̄2)) (12)
α2 = (cd2 a2 / 2) √(2g/(h̄2 − h3)) (13)

the individual block transfer functions that describe the plant in figure 1 may then be derived from equation (11). with the common denominator

d(s) = 1 + [(A1α1 + A1α2 + A2α1)/(α1α2)] s + [A1A2/(α1α2)] s²

they are as follows:

g11(s) = [(α1 + α2)/(α1α2)] [1 + sA2/(α1 + α2)] / d(s) (14)
g21(s) = (1/α1) / d(s) (15)
g12(s) = (1/α2) / d(s) (16)
g22(s) = (1/α2) (1 + sA1/α1) / d(s) (17)
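a short numerical sketch of this linearisation step follows (python, using the operating point adopted later in the paper, h̄1 = 0.115 m and h̄2 = 0.071 m); it evaluates α1 and α2 from equations (12) and (13) and builds the matrices of equation (11):

```python
import math

# parameters as defined in section 2 (si units)
A1 = A2 = 9.7e-3
a1, a2 = 3.956e-5, 3.85e-5
h3, g = 0.03, 9.81
cd1, cd2 = 0.63, 0.58

# operating point used for the design (from section 4.2)
h1_bar, h2_bar = 0.115, 0.071

# equations (12) and (13)
alpha1 = 0.5 * cd1 * a1 * math.sqrt(2.0 * g / (h1_bar - h2_bar))
alpha2 = 0.5 * cd2 * a2 * math.sqrt(2.0 * g / (h2_bar - h3))

# state and input matrices of the linearised model, equation (11)
A = [[-alpha1 / A1, alpha1 / A1],
     [alpha1 / A2, -(alpha1 + alpha2) / A2]]
B = [[1.0 / A1, 0.0],
     [0.0, 1.0 / A2]]

print(f"alpha1 = {alpha1:.3e}, alpha2 = {alpha2:.3e}")
print("A =", A)
```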
if we consider the forward signal transmission from the reference signal r1 to the associated output y1, it may be seen that the signal follows two pathways. one path involves a direct link through the block g11 and the other is through the blocks g21, g12 and a block involving k2 and its associated feedback loop through g22. this diagram may be simplified to give the structure shown in figure 3 for the channel c1. from considerations of symmetry, the channel c2 may be handled in the same way to produce the simplified block diagram of figure 4. figure 3: block diagram for channel 1. (adapted from a diagram in [2]) these block diagrams can be used to show that, ignoring the disturbance signal, each channel can be described using a single-input single-output transfer function: c1 = k1g11(1 − γh2) (18) and c2 = k2g22(1 − γh1) (19) where γ = g12g21 g11g22 (20) h2 = k2g22 1 + k2g22 (21) and h1 = k1g11 1 + k1g11 (22) figure 4: block diagram for channel 2. (adapted from a diagram in [2]) in equations (18)–(22) and in figures 3 and 4, the effects of coupling are represented as additive disturbance terms at the outputs of each channel and this does not involve any loss of information. however, it must be emphasised that icad is not a single-loop design method since loop interactions are preserved. it can be shown (e.g., [2, 4]) that, for robustness to parameter uncertainties of the closed-loop system stability, the nyquist plots of (1−γh1) and (1−γh2) must not lie close to the origin of the polar plane at frequencies near to or below the open-loop gain crossover frequency. hence, if the corresponding plots of γh1(jω) or γh2(jω) come close to the point (1, 0) in 124 acta polytechnica vol. 52 no. 4/2012 the polar plane the conventional siso gain margins for the effective transfer functions of c1 and c2 do not provide robust measures of stability. in such a case it may not be appropriate to attempt to use the icad approach unless some form of pre-compensator is introduced to modify the plant characteristics in an appropriate way [5]. it can also be stated [4] that the quantities h1(jω) and h2(jω) have magnitude values that are generally close to one below the gain crossover frequency, and the quantity γ(jω), which is termed the multivariable structure function, provides a measure of the strength of any inter-loop coupling in the system and can indicate whether or not this is benign. for the case of a two-input two-output system there is only one multivariable structure function. however, in general, for systems having a larger number of input-output pairs there will be more than one structure function. in all cases it can be stated that when a multivariable structure function is small the interaction effects are small. in the two-input two-output case, if the multivariable structure function is small over the complete frequency range of interest, the two channels behave, more or less, as two independent loops. on the other hand, if the multivariable structure function is shown to have a large magnitude, at a frequency within the range that is important for the application being considered, the loop interactions become significant [2]. if the multivariable structure function has an appropriate form, as discussed above, frequency response information for each channel can be used in the analysis of the nominal system in exactly the same way as for the analysis of a conventional feedback loop in a siso control system application. 
however, the multivariable structure function provides additional information about potential interactions and the stability robustness of the closed-loop system. it is important to note that, for successful application of the icad approach to a two-channel system, the closed-loop bandwidth specification for one channel must not be too similar to the equivalent specification for the second channel. if this is not the case the problem of the transfer function of one channel depending on the controller of the other channel becomes a significant obstacle in the processes of analysis and design [2]. 4 design of the controller for the coupled-tanks system using icad for the coupled-tanks system, it may be shown that the multivariable structure function is given by: γ(s) = g12g21 g11g22 = α2 α1+α2 (1 + sa1 α1 )(1 + s a2 α1+α2 ) (23) the expressions for h1(s) and h2(s) for this system may also be derived directly, from equations (21) and (22). 4.1 design requirements the specifications for the closed-loop system were based on equivalent requirements for a siso version of the coupled-tanks system, for which considerable previous design experience had been accumulated. the requirements for the two-input two-output case were as follows: a) zero steady-state errors in the liquid levels in both tanks. b) a maximum overshoot of 30 % in liquid levels. c) a damping factor of at least 0.7 which corresponds, approximately, to a phase margin of at least 70 degrees for both channels. d) the gain cross-over frequency should be at least 0.05 rad/s for both channels. this value is based on previous experience with the design of pid controllers for the siso case for control of the liquid level in tank 2 (with controlled input flow to tank 1 only), e) for successful application of the icad design methodology, it is important to ensure that the polar plots of the multivariable structure functions γ(jω), γh1(jω) and γh2(jω) (in terms of the magnitude and phase values at different frequencies over the frequency range of significance) never approach the point (1, 0). 4.2 an outline of the design process the requirements outlined above provide a basis for design using the icad methodology for the linearised plant model, for selected operating conditions. design, in this case, has involved the use of matlab r© software and has led to continuous and digital controllers involving proportional plus integral controller structures for each channel. the design process was carried out for parameter values of the linearised model which correspond to an operating point in the lower half of the depth range in both tanks (h1 = 0.115 m and h2 = 0.071 m). this is a typical operating point for the system under open-loop conditions. the values used for the two discharge coefficients are those given in section 2. the first step in the design process involves establishing that the gain cross-over frequency of the open-loop transmittance of one channel will be significantly different from the gain cross-over frequency 125 acta polytechnica vol. 52 no. 4/2012 of the other. in this case it was decided, on the basis of physical reasoning, that the gain cross-over frequency of channel 1 should be higher than that for channel 2. from the design requirements, this latter value should be chosen to be at least 0.05 rad/s, so a value of at least 0.5 rad/s was required for the gain crossover frequency of channel 1. 
the next step involves evaluation of the magnitude and phase of the multivariable structure function over the range of frequencies that are of importance for the intended application. figure 5 is a typical plot of the multivariable structure function in polar form, showing the magnitude and phase of γ(jω) for the complete range of relevant frequencies, and it is clear that the resulting plot involves small values of magnitude and does not come close to the (1, 0) point. this is satisfactory for the operating point considered, but similar plots should be considered for a range of different operating conditions.

figure 5: plot of the multivariable structure function in polar form for the coupled-tanks system showing the magnitude and phase of γ(jω) for the complete range of relevant frequencies for one operating point

next, it is necessary to design the controller k2(s), since the requirements in terms of gain cross-over frequency for channel 2 are less demanding than for channel 1. equation (19) shows that the equation for channel 2 involves the transfer function h1(s) and the known multivariable structure function γ(s). the first step is to assume either that h1(s) = 0 or that h1(s) = 1 and to design the controller k2(s) initially on that basis [2]. in this application it was assumed that h1(s) = 0, but an initial assumption that h1(s) = 1 would have been equally appropriate. the transfer function for g22(s) given in equation (17) has a magnitude at low frequencies of 1/α2, and at high frequencies the magnitude decreases in an approximately linear fashion at −20 db per decade. in order to meet the design requirement of zero steady-state closed-loop error, this suggests the use of a proportional plus integral type of controller of the form:

k(s) = β1 (1 + β2s)/s (24)

where β1 and β2 are constants. this controller will produce infinite gain at zero frequency and thus eliminate any steady-state error in the closed-loop system for this channel. the choice of parameter values for the controller involves, initially, the selection of a gain factor β1 to give a suitable gain cross-over frequency which is at least the required minimum of 0.05 rad/s. the integral action is then considered, and the frequency ω = 1/β2 is chosen to be sufficiently smaller than the gain cross-over frequency to ensure that the overall phase margin is not influenced to any large extent. application of this procedure gives the following controller:

k2(s) = 0.56 (1 + 8.929s)/s (25)

through the use of simulation, closed-loop step responses can be examined (usually on the basis of linearised models), and further adjustments can be made to the values of these controller parameters if this is judged to be necessary. after obtaining that first approximation to k2(s), an initial single-input single-output design can be carried out for the controller k1(s) on the basis of h2(s), which is now available (from equation (21)). the procedure followed is essentially the same as for channel 2, but with the higher value of gain cross-over frequency that is required for channel 1. the proportional plus integral controller resulting initially from this process has the form:

k1(s) = 4.676 (1 + 5.988s)/s (26)

having found an initial k1(s), this transfer function can then be used to determine h1(s) by substitution into equation (22). the resulting gain and phase margins must then be checked and adjustments made to k1(s), if necessary. the process may have to be repeated once or twice.
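the gain-selection step for k2(s) can be sketched numerically. the fragment below (python) is not the procedure the paper actually followed: it assumes, as simplifications made here, that β1 is scaled so that the loop gain is unity at a chosen cross-over frequency, that the pi corner 1/β2 is placed at half that frequency, and that the pump and sensor calibration constants gp and gd from section 2 appear in the loop so that the controller works in volts. the resulting numbers land near, but not exactly on, the values of equations (25) and (28):

```python
import math

A1 = A2 = 9.7e-3
alpha1 = 0.5 * 0.63 * 3.956e-5 * math.sqrt(2 * 9.81 / (0.115 - 0.071))
alpha2 = 0.5 * 0.58 * 3.85e-5 * math.sqrt(2 * 9.81 / (0.071 - 0.03))
gp, gd = 7.2e-6, 33.33          # pump and depth-sensor calibration constants

def g22(s):
    """plant transfer function of equation (17)."""
    d = 1 + (A1*alpha1 + A1*alpha2 + A2*alpha1)/(alpha1*alpha2)*s \
          + (A1*A2)/(alpha1*alpha2)*s**2
    return (1.0/alpha2) * (1 + s*A1/alpha1) / d

def pi_from_crossover(gfun, wc):
    """choose k(s) = b1*(1 + b2*s)/s so that |k * gp * g * gd| = 1 at s = jwc,
    with the pi corner 1/b2 placed at wc/2 (an illustrative choice)."""
    b2 = 2.0 / wc
    s = 1j * wc
    k_unscaled = (1 + b2 * s) / s
    b1 = 1.0 / abs(k_unscaled * gp * gfun(s) * gd)   # unity loop gain at wc
    return b1, b2

b1, b2 = pi_from_crossover(g22, wc=0.2)
print(f"channel 2 pi sketch: k2(s) = {b1:.2f}(1 + {b2:.1f}s)/s")
```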
then, using the revised controller transfer function for channel 1, the design can be completed for channel 2 using a similar iterative procedure. final checks must then be made on both channels to compare the gain cross-over frequencies with the design specifications and to check that the gain and phase margins are all satisfactory. this process also involves re-examination of the nyquist plots of the multivariable structure functions γ(jω), γh1(jω) and γh2(jω) to ensure that none of them approaches the point (1, 0), and thus to establish that the gain and phase margins are valid measures of robustness [2]. following the application of the above procedures, the optimised controller transfer functions were as follows:

k1(s) = 5.0 (1 + 6.2s)/s (27)

k2(s) = 0.56 (1 + 10.0s)/s (28)

figure 6: bode diagram showing magnitude (db) and phase (deg) for channel 1 with the compensation provided by the controller transfer function of equation (27). the gain cross-over frequency is indicated by the vertical line at a frequency of about 0.8 rad/s

figure 7: bode diagram showing magnitude (db) and phase (deg) for channel 2 with the compensation provided by the controller transfer function of equation (28). the gain cross-over frequency is indicated by the vertical line at a frequency of about 0.2 rad/s

figures 6 and 7 show the open-loop bode plots for channels 1 and 2, respectively, for these optimised controllers. from these bode plots it may be seen that the gain cross-over frequencies for channel 1 and channel 2 are approximately 0.8 rad/s and 0.2 rad/s, respectively. the corresponding phase margins are more than the required value of 70 degrees in both cases. from the gain cross-over frequencies it is clear that the speed of response for channel 2 is likely to be about four times slower than for channel 1, which is consistent with the specifications. discrete equivalents of these continuous controllers have been found, and an icad-based control system has been implemented with a digital controller using a general-purpose computer equipped with analogue-to-digital and digital-to-analogue converters. however, all experimental results presented in this paper are for the continuous control case, where the controllers have been implemented using a small general-purpose electronic analogue computer equipped with comparators and switches that can provide limiting integrator action, if required, to avoid integrator saturation. for purposes of comparison, proportional plus integral controllers have also been designed empirically using the ziegler-nichols reaction curve method (see, e.g. [13]). this well-known approach to controller design is based on measurements obtained from simple open-loop tests on the plant. application of this approach gave the following controller transfer functions:

k1(s) = 6.98 (1 + 3.96s)/s (29)

k2(s) = 8.9 (1 + 3.3s)/s (30)

it should be noted that the two controller transfer functions found from the application of the ziegler-nichols approach (equations (29) and (30)) are much closer to one another in terms of parameter values than the two controllers found using the icad approach, as given in equations (27) and (28). this is because of the requirement within the icad methodology that the bandwidth values for the two channels should be significantly different.
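the cross-over frequencies and phase margins quoted above are straightforward to check numerically. the fragment below (python) scans the open-loop frequency response of each loop, here approximated by k_i(s) gp g_ii(s) gd; the inclusion of gp and gd in the loop is an assumption made here, and the coupling factor (1 − γh) is neglected, which the very small |γ| values at these frequencies make reasonable:

```python
import math
import numpy as np

A1 = A2 = 9.7e-3
alpha1 = 0.5 * 0.63 * 3.956e-5 * math.sqrt(2 * 9.81 / (0.115 - 0.071))
alpha2 = 0.5 * 0.58 * 3.85e-5 * math.sqrt(2 * 9.81 / (0.071 - 0.03))
gp, gd = 7.2e-6, 33.33          # pump and depth-sensor calibration constants

w = np.logspace(-3, 1, 2000)
s = 1j * w
d = 1 + (A1*alpha1 + A1*alpha2 + A2*alpha1)/(alpha1*alpha2)*s \
      + (A1*A2)/(alpha1*alpha2)*s**2
g11 = (alpha1 + alpha2)/(alpha1*alpha2)*(1 + s*A2/(alpha1 + alpha2))/d
g22 = (1.0/alpha2)*(1 + s*A1/alpha1)/d
k1 = 5.0*(1 + 6.2*s)/s          # equation (27)
k2 = 0.56*(1 + 10.0*s)/s        # equation (28)

for name, loop in (("channel 1", k1*gp*g11*gd), ("channel 2", k2*gp*g22*gd)):
    i = np.argmin(np.abs(np.abs(loop) - 1.0))     # gain cross-over index
    pm = 180.0 + math.degrees(np.angle(loop[i]))  # phase margin (deg)
    print(f"{name}: crossover ~ {w[i]:.2f} rad/s, phase margin ~ {pm:.0f} deg")
```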
5 results

extensive analysis and simulation studies have been performed using matlab® and simulink® to investigate the performance of the system, especially in terms of interactions between the two channels and the overall robustness of the control systems. in the case of the control systems derived using the icad approach, the performance of the controllers has also been the subject of detailed experimental investigation in the laboratory using the coupled-tanks system hardware. interactions between the two channels have been investigated both by experiment and through simulation. for the simulation studies the full nonlinear model has been used, with parameter values as given in section 2.

5.1 simulation results

figure 8 shows typical simulation results for a test in which simultaneous step changes are applied to the reference inputs determining the required levels in the two tanks, using the continuous controllers of equations (27) and (28). the resulting simulated changes in liquid levels in tanks 1 and 2 are shown by the upper and lower traces respectively. in the case of channel 1 the step change of reference imposed is from 199 mm to 228 mm, while for channel 2 the change is from 165 mm to 198 mm. this test involves input flow values for tank 1 and tank 2 which both reach their upper limits for this magnitude of demanded level change. it can be seen from these simulation results that, although the operating point considered is significantly different from the design point, the design requirements have been satisfied, and also that the response of tank 2 is slower than that of tank 1, as expected.

figure 8: simulation results found using the nonlinear model with the controllers designed using the icad approach for a test in which simultaneous step changes in required levels for tank 1 and tank 2 are applied at time t = 100 s. the vertical axis represents liquid level (m) while the horizontal axis represents time (s). the level in tank 1 is represented by the continuous line while the dashed line represents the level in tank 2
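a compact version of this kind of closed-loop test is sketched below in python: the nonlinear model of equations (1) and (2) is driven by discretised versions of the pi controllers of equations (27) and (28), with the pump flows clipped to [0, 5 × 10⁻⁵] m³/s as described in section 2. the voltage interface via gp and gd, the simple euler discretisation and the conditional-integration anti-windup (loosely mirroring the limiting integrator action of the analogue implementation mentioned in section 4.2) are all simplifications made here:

```python
import math

# plant parameters and calibration constants from section 2 (si units)
A1 = A2 = 9.7e-3; a1, a2 = 3.956e-5, 3.85e-5
h3, g = 0.03, 9.81; cd1, cd2 = 0.63, 0.58
gp, gd = 7.2e-6, 33.33
QMAX = 5e-5                                # pump flow limit (m^3/s)

h1, h2 = 0.199, 0.165                      # initial levels (m), as in figure 8
r1, r2 = 0.228, 0.198                      # demanded levels (m)
i1 = i2 = 0.0                              # pi integrator states (volts)
dt = 0.1
for _ in range(int(400 / dt)):
    e1, e2 = gd * (r1 - h1), gd * (r2 - h2)        # level errors in volts
    u1 = i1 + 5.0 * 6.2 * e1                       # k1(s) of equation (27)
    u2 = i2 + 0.56 * 10.0 * e2                     # k2(s) of equation (28)
    qi1 = min(max(gp * u1, 0.0), QMAX)             # pump saturation
    qi2 = min(max(gp * u2, 0.0), QMAX)
    if qi1 == gp * u1:                             # conditional integration as
        i1 += 5.0 * e1 * dt                        # a simple anti-windup step
    if qi2 == gp * u2:
        i2 += 0.56 * e2 * dt
    q12 = cd1 * a1 * math.sqrt(2 * g * max(h1 - h2, 0.0))   # equation (1)
    q23 = cd2 * a2 * math.sqrt(2 * g * max(h2 - h3, 0.0))   # equation (2)
    h1 += dt * (qi1 - q12) / A1
    h2 += dt * (qi2 + q12 - q23) / A2
print(f"levels after 400 s: h1 = {h1:.3f} m (target {r1}), "
      f"h2 = {h2:.3f} m (target {r2})")
```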
investigation, through simulation, of the interactions between the two channels has involved introducing a step change of the desired level in one channel while maintaining the original set level in the other. the upper set of simulation results presented in figure 9a shows the level of liquid in tank 1 following the application of a step change of reference for channel 1 at time t = 100 s, together with the record for the level in tank 2. in figure 9b the lower plot shows the liquid level in tank 2 following the application, at time t = 100 s, of a step change of reference for channel 2, while the upper trace shows the corresponding level in tank 1.

figure 9: a) simulated responses of levels (m) in tank 1 (continuous line) and in tank 2 (dashed line) versus time (s) when the reference level for channel 1 is changed; b) simulated responses of levels (m) in tank 1 (continuous line) and in tank 2 (dashed line) when the reference level for channel 2 is changed. both tests involved use of the nonlinear model with controllers designed using the icad approach. the horizontal axis represents time (s)

these results show that a transient disturbance occurs in the level of tank 2 when the set level of channel 1 is changed, but that a negligible transient is found in the level of tank 1 when the set level of channel 2 is altered by a similar amount. this difference is due to the different bandwidths of the two channels. results of an additional simulation test are shown in figure 10. this involves the simultaneous application of negative step changes of reference for both channels at time t = 100 s. the demanded changes lead, transiently, to a complete cut-off of the input flow to both tanks for a period of about 20 s at the time when the reference values are changed, as can be seen from the almost straight-line form of the negative-going responses in that part of the record.
these are, of course, also the controller transfer functions used in the simulation studies discussed in section 5.1. 129 acta polytechnica vol. 52 no. 4/2012 experimental results for the control system, when implemented with these controllers, are shown in figure 14 for the case involving two simultaneous changes of reference. the results are almost identical in terms of steady state performance to the simulated results of figure 8 for the same test conditions, and are very similar in terms of the settling time of the transients. as in the simulation results, the inputs both reach their limits during the transients. the main difference observed between the experimental results of figure 14 and the simulation results of figure 8 is that the transients found experimentally (especially for the level in tank 1) are more oscillatory than those found through simulation. similar findings have been obtained for equivalent tests at other operating points, and this suggests strongly that there are imperfections within the model of the two-tank system. exactly what the modelling errors might be is not, of course, clear from the information from these closed-loop system tests alone. although the results shown in figures 8 and 14 are for one specific operating condition, comparison of experimental and simulation results for a range of different conditions has shown good overall agreement. figure 14: experimental results for conditions equivalent to those of the simulation results of figure 8. the continuous line shows the measured liquid level (m) in tank 1 while the dashed line shows the measured level in tank 2. the horizontal axis represents time (s) the experimental investigation of interactions between channels produced results shown in figures 15a and 15b, which can be seen to correspond closely to the corresponding simulation results shown in figures 9a and 9b. experimental results for a test involving simultaneous large negative changes of reference value for both channels simultaneously are shown in figure 16. these results are very similar in character to the simulated results of figure 10. as in the simulation, the controlled flows for tank 1 and tank 2 reach limiting values (zero) during the transient period. a) b) figure 15: a) experimental results, equivalent to the simulation results of figure 9a, involving application of a step change of reference for channel 1 while the reference input for channel 2 is held constant. the record for liquid depth (m) in tank 1 is shown by a continuous line and for tank 2 by the dashed line. the horizontal axis represents time (s) b) experimental results, equivalent to the simulation results of figure 9, involving application of a step change of reference for channel 2 while the reference for channel 1 is held constant. the record for liquid depth (m) in tank 1 is shown by a continuous line and for tank 2 by the dashed line. the horizontal axis represents time (s) figure 16: experimental results showing liquid levels (m) for tank 1 (continuous line) and tank 2 (dashed line) for large negative step changes of reference. the horizontal axis represents time (s) 130 acta polytechnica vol. 52 no. 4/2012 figure 17: experimental results for a test involving the addition of a small volume of water to tank 1 (continuous trace) and to tank 2 (dashed line) in turn. the horizontal axis represents time (s) a) b) figure 18: a) results of a simulated test involving the addition of a small volume of water to tank 1 (continuous line). 
the level in tank 2 is shown by the dashed line. the horizontal axis represents time (s) b) results of a simulated test involving the addition of a small volume of water to tank 2 (dashed line). the level in tank 1 is shown by the continuous line. the horizontal axis represents time (s) the behavior of the control system when subjected to external disturbances is also of great practical importance. figure 17 shows some typical experimental results where external disturbances have been introduced by adding, in as short a period of time as possible, a disturbance in the form of a measured volume of water to each of the tanks in turn, with feedback control loops applied. the upper plot shows the level in tank 1 for a reference input of 227 mm, while the lower plot shows the level in tank 2 for a reference input of 198 mm. the disturbance inputs are applied by the rapid addition of water, from a beaker, to tank 1 and addition of a similar volume to tank 2 at about time t = 225 s. the effects of each of these disturbances on each channel are clearly visible in these records. the results show the distinctive actions of the two channels in countering the effects of the disturbance inputs. the levels for both channels return to their set values after the disturbances, with transients of acceptable magnitude and duration. as with the tests involving changes of reference, it is clear (as would be expected) that the speed of response to disturbances is influenced by the choice of bandwidths for the two channels. similar results have been obtained through simulation, but quantitative comparisons are difficult for this type of test because it is hard to reproduce the detailed time-course of the disturbance input within the simulation. figures 18a and 18b show typical simulation results for disturbance tests which are approximately equivalent to the experiments of figure 17. the experimental and simulation results are seen to be qualitatively consistent. 5.3 results of additional simulation-based investigations one interesting practical finding, which has been fully supported by simulation results, is that control of the level in tank 1 can only be achieved for conditions in which the demanded level in tank 1 is equal to or greater than that in tank 2. this is understandable in terms of the physics of the system since tank 1 has only one outlet (to the second tank), whereas tank 2 has two outlets (one to the first tank and the second through the drain pipe). if the demanded value of h2 is greater than the demanded level of h1, liquid will flow into tank 1 from tank 2 as well as from the input but no liquid will flow out. since the input flow cannot become negative, satisfactory control of the level in tank 1 is impossible in these conditions. typical experimental results are shown in figure 19 and these demonstrate, in this specific case, that a demanded level of 0.105 m in tank 1 cannot be achieved in combination with a larger demanded level (in this case 0.131 m) in tank 2. what 131 acta polytechnica vol. 52 no. 4/2012 happens in practice is that the final levels in both tanks become equal to the final demanded level in tank 2. figure 20 shows results obtained from simulation for a very similar set of conditions. simulation investigations have confirmed that the addition of a drain pipe to the first tank eliminates this problem and would allow independent control of the two levels for any combination of reference values. 
figure 19: levels (m) found in tank 1 (continuous line) and tank 2 (dashed line) for a demanded change of the reference for channel 1 from 0.082 m to 0.105 m and for channel 2 from 0.048 m to 0.131 m. the horizontal axis represents time (s)

figure 20: simulated results for a test which involves conditions very similar to those of the experiment of figure 19. in this case, following the step changes of reference inputs, the demanded level in tank 2 is again greater (0.131 m) than the demanded level in tank 1 (0.105 m). here the continuous line again represents the liquid level in tank 1 and the dashed line the level in tank 2. the horizontal axis represents time (s)

another area for further investigation through simulation relates to tests of robustness to changes within the plant. these have involved, for example, the introduction of sudden changes of the cross-sectional area of the outlet drain orifice, or of the cross-sectional area of the orifice responsible for the coupling between tank 1 and tank 2. experimental testing is straightforward in the case of the outlet from tank 2, for which the drain tap can be used, but investigation of changes of the inter-tank orifice area presents practical difficulties, since the variation of cross-sectional area normally requires the removal or insertion of a rubber bung for one of the three orifices in the partition that separates the two tanks. even partial closing and re-opening of the outlet drain tap is difficult to achieve manually in a precise and repeatable fashion. simulation can therefore be used to advantage to investigate the performance of the system for this type of change.

figure 21: results of an experiment involving partial closure and re-opening of the drain tap from tank 2. the action of closure occurs at about t = 60 s and the re-opening takes place at about t = 150 s. the continuous line shows the level (m) in tank 1 and the dashed line represents the level (m) in tank 2. the horizontal axis represents time (s)

figure 22: results from a simulation involving instantaneous changes of the cross-sectional area of the orifice representing the drain tap from tank 2. partial closure occurs at t = 100 s and the re-opening takes place at t = 300 s. the continuous line shows the level (m) in tank 1 and the dashed line represents the level (m) in tank 2

inevitably, the results of simulation tests differ slightly from tests carried out on the real system. typical experimental results showing the effects of changes of drain tap opening are given in figure 21. results from simulation, for instantaneous changes of the cross-sectional area of the orifice representing the drain tap and outlet pipe, are shown in figure 22, and these are broadly similar to the experimental findings of figure 21. both the simulation results and the experimental findings confirm that the robustness properties of the two-input two-output control system, as displayed in this test, are satisfactory. transients for tank 2 are significantly larger than for tank 1, as could be expected from the bandwidths of the two channels.

6 discussion and conclusions
the work reported in this paper illustrates the use of the icad analysis and design approach for a practical application that involves significant nonlinearities, in terms both of control input limits and of inherent nonlinearity of the plant model.
analysis of the two-input two-output system within the icad framework provides helpful insight which can be used in the design and implementation of the control system. comparisons between simulation and experimental results also provide useful information about the system performance and about limitations of the plant model. a previous journal paper [9] reporting the application of the icad methodology to the same system was concerned with issues of controller parameter tuning and did not consider the response to disturbances or address robustness issues.

it can be concluded that the coupled-tanks equipment provides a useful test-bed for investigation of issues of nonlinear system modelling, multivariable control system design and implementation. the availability of a comprehensive nonlinear model of the system, together with linearised representations appropriate for control system design, also makes this system suitable for the teaching of practical aspects of multi-input multi-output control system analysis and design using icad or other approaches. it is believed that the work reported in this paper could provide the basis for a useful case-study (most probably for use at postgraduate level) on the icad methodology. this could also illustrate the benefits of bringing together simulation and experimental testing within the processes of control system design and implementation.

differences between simulation results and experimental results are believed to relate mainly to limitations in the representation of the plant, and especially to the relationship used to describe the output flow from the second tank within the nonlinear model. this aspect of the coupled-tanks system model has been discussed in previous model validation studies for this system (e.g. [7,10,11]) and is the subject of ongoing investigations.

simulation results show that broadly satisfactory results can also be obtained for this plant with proportional plus integral controllers designed empirically using the ziegler–nichols reaction curve method. however, results found using that approach have given responses, for the types of tests described in this paper, that tend to be more oscillatory than those found using the icad methodology, and indicate some issues of closed-loop system robustness to changes of operating point, to variation of the magnitude of reference changes and to the magnitude of disturbances. claims that the icad approach can provide enhanced performance compared with other available design methods would be inappropriate on the basis of the limited results presented in this single application. however, it is believed that the icad methodology brings more physical insight to the design process for multi-input multi-output systems, and it must also be remembered that this approach is not restricted to one single form of controller.

acknowledgement
this paper is a modified and extended version of a paper [8] presented at eurosim 2010 (the 7th eurosim congress on modelling and simulation), which was organized by staff of the department of computer science and engineering, czech technical university in prague (ctu). the congress took place in prague during the period 6–10 september 2010. the author wishes to thank jin lin chiang who, during undergraduate project work at the university of glasgow [12], carried out experiments from which some of the results in this paper have been derived. the author also wishes to thank professor john o'reilly of the university of glasgow for many valuable discussions about icad methods.
references
[1] wellstead, p. e.: coupled tanks apparatus: manual. tecquipment ltd., uk, 1981.
[2] o'reilly, j., leithead, w. e.: multivariable control by 'individual channel design'. international journal of control, 54, 1991, p. 1–46.
[3] leithead, w. e., o'reilly, j.: performance issues in the individual channel design of 2-input 2-output systems. 1. structural issues. international journal of control, 54, 1991, p. 47–82.
[4] leithead, w. e., o'reilly, j.: performance issues in the individual channel design of 2-input 2-output systems. 2. robustness issues. international journal of control, 55, 1992, p. 3–47.
[5] o'reilly, j., leithead, w. e.: frequency-domain approaches to multivariable feedback control system design: an assessment by individual channel design for 2-input 2-output systems. control theory and advanced technology, 10, 1995, p. 1913–1940.
[6] kocijan, j.: bibliography on individual channel analysis and design framework. http://dsc.ijs.si/jus.kocijan/icad (accessed 1. 11. 2011).
[7] gong, m., murray-smith, d. j.: a practical exercise in simulation model validation. mathematical and computer modelling of dynamical systems, 4, 1998, p. 100–117.
[8] murray-smith, d. j.: the individual channel analysis and design method applied to control of a coupled-tanks system: simulation and experimental results. in: proceedings of the 7th eurosim congress on modelling and simulation (editors: m. šnorek, z. buk, m. čepek and j. drchal), prague: department of computer science and engineering, czech technical university in prague (ctu), 2010. (isbn 9788001045893).
[9] murray-smith, d. j., kocijan, j., gong, m.: a signal convolution method for estimation of controller parameter sensitivity functions for tuning of feedback control systems by an iterative process. control eng. practice, 11, 2003, p. 1087–1094.
[10] tan, k. c., li, y., murray-smith, d. j., sharman, k. c.: system identification and linearisation using genetic algorithms with simulated annealing. in: proc. conf. on genetic algorithms in eng. syst.: innovations and applications, 12–14 september 1995 (iee conference publication no. 414), london: iee, 1995, p. 164–189.
[11] gray, g. j., murray-smith, d. j., li, y., sharman, k. c., weinbrenner, t.: nonlinear model structure identification using genetic programming. control eng. practice, 6, 1998, p. 1341–1352.
[12] chiang, j. l.: control system for a two-input two-output liquid level system. project report 94-179, university of glasgow, department of electronics and electrical engineering, 1994.
[13] golten, j., verwer, a.: control system design and simulation. mcgraw-hill, london, 1991.

acta polytechnica vol. 50 no. 3/2010
gravitomagnetism
j. bičák

in the introduction to our talk, we explained a simple thought experiment to indicate that gravitomagnetism has to arise if some basic relations take place, such as the dependence of inertial mass on velocity given by the standard special-relativistic formula, the dependence of active gravity on the inertial mass, and lorentz invariance: the effect on a test mass between two uniformly moving parallel streams of linear distributions of masses was considered from the frame in which they both move with the same velocity but in opposite directions, and from the frame in which one of the streams is at rest (see, e.g., [1] for details).
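to fix ideas, the weak-field analogy behind this argument can be written, purely schematically, in gravitoelectromagnetic (gem) form. this summary is an editorial addition, not part of the talk, and the numerical prefactors, which depend on convention, are deliberately suppressed:

```latex
% editorial sketch: schematic gravitoelectromagnetic (gem) relations.
% convention-dependent numerical factors are deliberately omitted.
\[
  \vec{F} \;\propto\; m\!\left(\vec{E}_g + \vec{v}\times\vec{B}_g\right),
  \qquad
  \nabla\!\cdot\!\vec{E}_g \;\propto\; -\,G\rho,
  \qquad
  \nabla\!\times\!\vec{B}_g \;\propto\; -\,\frac{G}{c^{2}}\;\vec{j}_{\text{mass}} .
\]
```

a uniformly moving mass stream is a mass current and therefore sources a magnetic-type field B_g; it is the velocity-dependent force term that reconciles the descriptions in the two frames of the thought experiment.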
we then summarized the basic ideas and results of the recent experiment gravity probe b, in which four gyroscopes placed in a satellite orbiting the earth measure the gravitomagnetic field caused by earth's rotation. after a long period of preparation, observation and data analysis it has now been concluded that "the combined four-gyro result gives a statistical uncertainty of 14% (∼ 5 marcsec/yr) for frame-dragging" (in other words, for the presence of gravitomagnetic effects) – see http://einstein.stanford.edu/highlights/status1.html.

in the main part of the talk, we summarized our recent works with joseph katz from the hebrew university in jerusalem and donald lynden-bell from the institute of astronomy of the university of cambridge on mach's principle and gravitomagnetic effects in general relativity and cosmology. einstein was strongly influenced by mach's idea that the inertia of a particle here and now arises as a consequence of its interaction with other particles in the universe. what do we understand by "mach's principle" today? in our comprehensive work [2], summarizing a number of our preceding contributions to gravitomagnetic effects and mach's principle, we start out from the general formulation of mach's principle by hermann bondi in his cosmology [3]: local inertial frames are determined by the distribution of energy and momentum in the universe by suitable averages. in mathematical terms, we investigate the validity of such a formulation for the case of general linear perturbations of standard "background" models, i.e., of isotropic and homogeneous cosmological models described by the friedmann-lemaître-robertson-walker solutions of einstein's field equations. in particular, we focus on those of einstein's equations for linear perturbations which represent constraints on initial data. in suitable coordinates ("gauges"), these constraints are represented by elliptic equations which connect the distribution of matter and energy, described by an energy-momentum tensor of physical matter and fields, with the geometry described by the metric tensor and its derivatives. in these gauges, the local inertial frames are determined instantaneously by the distribution of matter and energy. one has to realize that the physical effects associated with mach's ideas, e.g. the "dragging of inertial frames" ("gravitomagnetic effects"), have a global character and require special coordinate systems, special gauges. as was noted by dieter brill in the discussion during the conference on various aspects of mach's principle (on the basis of which a very valuable book [4] arose), "mach's principle can show the way to give physical meaning to quantities which are usually considered as coordinate dependent."

in more recent papers [5, 6], we show that within general relativity, any general statement of mach's principle that attributes all dragging of inertial frames solely to the distribution of energy and momentum of matter as the origin of inertia is false: gravitomagnetic effects are also caused by gravitational waves. to show this, we investigate waves which do not depend on one spatial coordinate (say the "z-coordinate"). we find that there is an almost flat cylindrical region near the z-axis of a revolving gravitational wave pulse (which inevitably has no "physical" energy-momentum) and demonstrate that the inertial frame in the cylindrical interior rotates relative to the inertial frame at great distances.
our aim was to produce a nice clean example of the rotation of the inertial frame in an almost flat region surrounded by rotating gravitational waves. an extreme example of inertia due to gravitational waves alone is provided by gowdy's universe [7], a closed world that contains nothing except gravitational waves. one of our ultimate aims is to discuss the meaning of mach's principle in fully nonlinear general relativity, and particularly its application to such systems as gowdy's universe. another question of great interest is what bearing mach's principle has on the existence of dark energy.

this lecture, here briefly summarized, was dedicated to jiří niederle, whose broad interests and contributions in mathematical physics also include gravity theories.

references
[1] schutz, b.: gravity from the ground up (an introductory guide to gravity and general relativity), cambridge university press, cambridge 2003, xxvi+462 pages.
[2] bičák, j., katz, j., lynden-bell, d.: cosmological perturbation theory, instantaneous gauges and local inertial frames, phys. rev. d 76 (2007) 063501, 31 pages.
[3] bondi, h.: cosmology, cambridge university press, cambridge 1961, vii+182 pages.
[4] barbour, j., pfister, h. (editors): mach's principle: from newton's bucket to quantum gravity, birkhäuser, boston–basel–berlin 1995, vii+536 pages.
[5] bičák, j., katz, j., lynden-bell, d.: gravitational waves and dragging effects, class. quantum grav. 25 (2008) 165018, 19 pages.
[6] lynden-bell, d., bičák, j., katz, j.: inertial frame rotation induced by rotating gravitational waves, class. quantum grav. 25 (2008) 165017, 13 pages.
[7] gowdy, r. h.: gravitational waves in closed universes, phys. rev. lett. 27 (1971), p. 826–829.

jiří bičák
institute of theoretical physics, charles university
v holešovičkách 2, 180 00 prague 8

acta polytechnica vol. 52 no. 3/2012
use of grasses and mixtures of grasses for energy purposes
david andert (1), jan frydrych (2), ilona gerndtová (1)
(1) research institute of agricultural engineering, p.r.i., drnovská 507, 161 01 prague 6, czech republic
(2) oseva pro, hamerská, zubří
correspondence to: andert@vuzt.cz

abstract
as levels of agricultural productivity increase, there is also an increase in land area not utilized for food production. this area can be used for growing energy crops, including grasses. when land is set aside for grassing, or when the potential of perennial grasses is not utilized due to reductions in cattle herds, there is also an increased amount of grass that can be utilized for energy purposes. experiments were carried out on the principle of single-stage anaerobic digestion within the mesophilic range. during the experiments, we measured the cumulative production of biogas and its composition. the processed grass was disintegrated by pressing and cutting. this adaptation of the material resulted in increased biogas production. the optimum proportion of grass dry matter is from 35 to 50 % of the total d.m. the results of the experiments proved the suitability of grass phytomass as a material for biogas production.

keywords: biomass production, anaerobic digestion, biogas.

1 introduction
the search for new energy resources has become a worldwide phenomenon. due to increasing levels of agricultural productivity, there has also been an increase in land set aside without food production. grassland is of exceptional significance not only for forage production but also for non-production functions [1].
its important functions include: water management (rainfall retention); anti-erosion protection (against water and wind erosion); protection of the hydrosphere (root systems reduce underground water pollution); esthetics (grassland maintains the appearance of the landscape); and economic and social functions (generating jobs for people living in marginal areas). when arable land is put into the set-aside regime, the land needs to be maintained by cutting. increased economic pressure for profitable agriculture is another reason why the cultivated area has been reduced, particularly in marginal regions. it may be assumed that the trend in germany and austria will be followed in the czech republic, and there will be increased social pressure on landowners, especially in tourist regions, to ensure that all grasslands are regularly maintained [1].

2 material and methodology
in our experiment, we used agrostis gigantea (rožnovský), fescue kora, reed canary grass palaton, reed canary grass lera, reed canary grass chrifton, grassland mixture for wet conditions, grassland mixture for dry conditions, brome-grass tacit and arrhenatherum elatius rožnovský. the experiments were performed with variants without n fertilization and with an n dose of 50 kg/ha/a. the experimental crop was harvested six times in the course of a year. the biomass yield in the green and dry material was found, and also the dry matter content.

a laboratory workplace was built for producing biogas from special substrata. a set of fermenters was placed in a heated water bath. each fermenter had its own gas container to enable the biogas production quantity to be read. these small devices determine the biogas production and other properties of mixtures of energy-plant phytomass, slurry, fugate and neutralization agents. the aim of our experiment was to reduce the acidity of an organic substratum mixture processed under anaerobic conditions. an air lf analyzer was used to analyze the biogas that was generated. this device was further used for measuring the co2, ch4 and h2s contents. a pair of larger reactors for comparative methanogenesis measurements was also available. the mixtures tested with good results in small fermenters were then verified in larger laboratory fermenters.

3 results and discussion
3.1 yield characteristics of the investigated grass species
on 8th july 2010, the first harvest of energy grasses and grassland mixtures for biomass and seed production was gathered: fescue kora, reed canary grass (palaton, lera and chrifton), arrhenatherum elatius rožnovský and brome-grass tacit. on august 9, energy grasses and the grassland mixture for biomass production and agrostis gigantea rožnovský for seed production were harvested. on september 8, all grassland mixtures and grass species for biomass production were harvested. on october 18, energy grass species and grassland mixtures were harvested. the dry matter yield of the same species was up to 10 tons per hectare. the grassland mixture for wetter conditions produced the highest yield of green material (48.36 t/ha) and dry matter (9.77 t/ha), in the fertilized variant. for the grass species, the highest yield was for fescue kora, in green material (31.29 t/ha) and in dry matter (9.36 t/ha), in the fertilized variant. the second harvest of grass species and grassland mixtures was performed on july 4.
the dry matter content in the green material of the investigated grasses and mixtures increased to 29.89 %–40.86 %, according to species. the grassland mixture for wetter conditions produced the highest yield of green material (35.38 t/ha), in the fertilized variant. the highest dry matter yield was for brome-grass tacit (12.27 t/ha), in the fertilized variant. on july 8, a harvest of grasses and grassland mixtures for biomass production was performed, and some grasses for seed production were also harvested. the dry matter content in the green material was from 30.59 % to 42.64 %, according to the verified components. the grassland mixture for wetter conditions produced the highest yield of green material (35.52 t/ha), in the fertilized variant. the highest dry matter yield was for brome-grass tacit (12.48 t/ha), in the fertilized variant. the highest seed yield was recorded for brome-grass (2.812 t/ha), in the fertilized variant. other grass species produced a seed yield from 0.117 to 0.824 t/ha. the seed yields were almost identical for the fertilized variant and for the non-fertilized variant.

on august 9, grasses and grassland mixtures for biomass production were harvested, and the seed harvest for agrostis gigantea was performed. the dry matter content in the green material ranged from 36.35 % to 50.83 % for the investigated components. the highest yield of green material was reported for fescue (30.24 t/ha), in the fertilized variant. the highest dry matter yield was for fescue (13.68 t/ha), in the fertilized variant. on august 9, 2005 the seed harvest of agrostis gigantea was performed, with the yield in the non-fertilized variant amounting to 0.425 t/ha, and in the fertilized variant amounting to 0.495 t/ha.

figure 1: yield of harvested grasses (d.m.)
figure 2: biogas production for different blends

on september 8, the harvest of grasses and grassland mixtures for biomass production was performed. the dry matter content of the grasses and grassland mixtures ranged between 38.92 % and 63.21 %. the highest dry matter content was found for brome-grass tacit, 62.53 % in the non-fertilized variant and 63.21 % in the fertilized variant. the highest yield of green material (22.74 t/ha) was for reed canary grass palaton, in the fertilized variant. the highest yield of dry matter was also found for reed canary grass palaton (13.30 t/ha), in the fertilized variant.

the second cut of the investigated grassland mixtures and grasses in the first year took place on october 18th. the second cut was harvested on the plots where the first cut had been made on june 6. the dry matter content in the green material of the grassland mixtures and grass species ranged from 31.92 % to 39.59 %. the results for the green material yield were from 2.56 t/ha to 5.62 t/ha. the investigated grassland mixtures and grasses showed a minimum difference in green material yield between the fertilized variant and the non-fertilized variant. after the first cut, no further fertilization was carried out. the highest yield of green material was for reed canary grass lera, in the fertilized variant (5.62 t/ha). the highest yield of dry matter was found for reed canary grass palaton, in the fertilized variant (2.09 t/ha). the dry matter yield ranged from 1.03 t/ha to 2.09 t/ha in both the fertilized variant and the non-fertilized variant.

3.2 procedure for determining the biogas yield
the biogas production, and its chemical composition, from each type of substrate was investigated.
the two reactors enabled the fermentation blend composition to be optimised, the course of the process to be better controlled, and the effect of the operational temperature to be monitored. for inoculating the process of methanogenesis, we used a blend of fermented fugate from the rab třeboň biogas plant and fresh pig slurry, also from třeboň. identical conditions were set up for all the experiments. the fermenters operated at a temperature of 42 °c, i.e. within the thermophilic range. the dry matter weight percent of the initial blend of mixed substratum was between 4 % and 8 %. the resulting biogas production (in litres) was always related to a mass of 1 kg of the organic dry matter of the sample.

4 conclusions
the grassland mixtures and grass species that were included in the project revealed a different dry matter content in the green material, which increased mainly due to vegetation ageing and a later first harvest time. the highest dry matter content was found for plants harvested in september (for brome-grass in the fertilized variant, the dry matter content in the green material was 63.21 %). particular grassland mixtures and grass species also react differently in terms of dry matter yield and optimum harvest time for biomass and its utilization for energy purposes during the harvest year. the aim is to achieve the greatest possible dry matter yield. the reduction in dry matter yield for grassland harvested later in the summer and in autumn in the first cut is due to leaf fall and plant lodging (e.g. grassland mixtures or arrhenatherum elatius). on the basis of the preliminary results, it is recommended to harvest the grassland mixtures for wetter conditions and for dry conditions in june and july, with the possibility of using multiple cuts. for these mixtures, in particular, the high yield potential of green material in an early cut can be utilized.
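for orientation, the sketch below shows how a cumulative biogas volume from one of the small fermenters would be converted into the specific yields quoted in the next paragraph. all numbers in it are invented for illustration and are not measured values from this project.

```python
# illustrative normalization, as in section 3.2: cumulative biogas volume
# related to the organic dry matter (odm) of the sample. numbers invented.
sample_mass = 2.0        # fresh sample placed in the fermenter (kg)
dm_fraction = 0.40       # dry matter share of the fresh mass
odm_fraction = 0.90      # organic share of the dry matter
biogas_litres = 250.0    # cumulative volume read from the gas container

odm_kg = sample_mass * dm_fraction * odm_fraction
specific_yield = biogas_litres / odm_kg   # litres per kg odm
print(specific_yield)                     # ~347; note 1 l/kg odm = 1 m3/t odm
```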
brome-grass and reed canary grass, will be tested in blends. there is also a legislative problem with grass combustion, since a boiler is allowed to incinerate only approved fuels. until now, large boilers have been approved only for wood and straw combustion, and small boilers have been approved only for wood combustion.

acknowledgement
the results presented are part of grant project no. qi101c246 "use of the phytomass from permanent grassland and landscaping" supported by the national agency for agricultural research of the czech republic (nazv čr).

references
[1] andert, d., gerndtová, i., hanzlíková, i., frydrych, j., andertová, j.: wet grass utilization for energy. in: energetika a životní prostředí. moderní energetické technologie a obnovitelné zdroje 2006, ostrava: všb-tu, 2006, p. 86–89. isbn 80-248-1108-1.
[2] andert, d., sladký, v., abrham, z.: energy utilization of solid biomass. prague: vúzt, 2006, č. 7, 59 p. isbn 80-86884-19-8.
[3] mužík, o., abrham, z.: ekonomická a energetická efektivnost výroby biopaliv [economic and energy efficiency of biofuel production]. agritechscience [online], praha, [cit. 2011-12-27], 2011, roč. 5, č. 3, p. 1–4. issn 1802-8942.
[4] hutla, p., jevič, p.: vlastnosti topných briket z kombinovaných rostlinných materiálů [the properties of heating briquettes from combined plant materials]. agritechscience [online], praha, [cit. 2011-12-27], 2011, roč. 5, č. 3, p. 1–5. issn 1802-8942.

acta polytechnica vol. 49 no. 2–3/2009
working fluid quantity effect on magnetic field control of heat pipes
f. cingroš, t. hron

abstract
this paper deals with heat pipes controlled by a static magnetic field and with an important influencing factor – the quantity of working fluid. heat pipes are able to provide very effective heat transport. several standard regulation methods are commonly used for this purpose. in previous experiments implemented in our laboratory, we have observed the significant influence of a magnetic field on the heat conductance of the selected heat pipe. a special heat pipe was manufactured for this purpose, and pure oxygen was chosen as the working fluid due to its suitable magnetic properties. the heat pipe operation and the magnetic field control depend on various parameters. this paper is focused on the influence of the quantity of working fluid. some important results of our experiments are presented and discussed.

keywords: heat pipes, magnetic field, variable conductance, oxygen, cryogenic.

1 introduction
a heat pipe is a device that is able to transport heat over long distances with an insignificant temperature drop and with very high efficiency. heat is transferred by the cycle of a working fluid in a gaseous and liquid state inside the heat pipe, with phase changes – evaporation and condensation. the thermal conductance of a heat pipe is usually more than one hundred times higher than that of copper (depending on the type of heat pipe). heat pipes are commonly used for cooling and heat transport in electronic devices, technological processes and many other types of equipment. in some applications, heat transport controllability is required. several standard modifications are used for this purpose, e.g. the addition of a non-condensable gas into the heat pipe.

in previous experiments implemented in our laboratory, we have observed a significant influence of a static magnetic field on the heat transport in a gravitational heat pipe filled with pure oxygen as the working fluid. liquid oxygen flowing down the wall under gravity can be captured by a magnetic field. when the magnetic field intensity is high enough, most of the oxygen is trapped. the lower part of the heat pipe is then cut off from the working fluid and is effectively inactivated. we have observed that the heat flow is clearly disturbed by exposure to a static magnetic field with a magnetic induction over 0.4 t. without the magnetic field, the working fluid can circulate with no restrictions [1]. the heat pipe operation and the magnetic field control may depend on various parameters, e.g. some important properties and the quantity of the working fluid (esp. its thermal characteristics and magnetic susceptibility), and the construction and inner structure of the heat pipe (esp. a wick structure). of course, the properties of the magnetic field are also very important for successful heat transfer control.
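a rough order-of-magnitude check, added here for orientation rather than taken from the paper, makes the trapping just described plausible: the magnetic body force density on a paramagnetic liquid is roughly (χ/μ0)·B·∇B. with textbook values for liquid oxygen near its boiling point (labelled as assumptions in the sketch) and the field levels reported for this setup, that force exceeds gravity by more than an order of magnitude.

```python
# order-of-magnitude estimate (editorial, not from the paper): magnetic
# body force on liquid oxygen versus gravity. chi and rho are textbook
# values for lox near its boiling point; b and grad_b follow the field
# levels quoted in this paper (0.6 t magnets, gradients up to ~300 t/m).
mu0 = 4e-7 * 3.141592653589793   # vacuum permeability (t*m/a)
chi = 3.5e-3                     # volume susceptibility of liquid oxygen
rho = 1141.0                     # density of liquid oxygen (kg/m^3)
g = 9.81

b, grad_b = 0.6, 300.0           # flux density (t) and its gradient (t/m)
f_mag = chi / mu0 * b * grad_b   # magnetic force density (n/m^3)
f_grav = rho * g                 # gravitational force density (n/m^3)
# ~45: the field can plausibly hold the liquid against gravity; the exact
# value depends on the detailed field map of the magnets.
print(f_mag / f_grav)
```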
the influence of the quantity of working fluid in the selected type of heat pipe on the operation and on the magnetic field control has been ascertained in our experiments and will be discussed below. a cryogenic gravitational heat pipe with a special construction was manufactured for this purpose, and a testing device was also set up. oxygen was chosen as the working fluid, due to its suitable magnetic properties. the experimental installation is shown in fig. 1.

fig. 1: experimental installation

2 experimental
a series of identical experiments was performed for several different quantities of working fluid. the influence on the function and on the magnetic field control was observed for the gravitational heat pipe filled with pure oxygen (see fig. 2). the top part of the heat pipe was chilled by liquid nitrogen and the rest was exposed to room temperature (approx. 297 k). a static magnetic field was applied in the middle. the blocking efficiency of the magnetic field was determined by measuring the temperature changes above and below the magnetic field zone. we used k-type thermocouples calibrated by a pt-thermometer.

the heat pipe was made from a brass tube (length 470 mm, outside diameter 8 mm, wall thickness 1 mm). the ends of the tube were closed by brass plugs with a hole through them. copper capillaries (outside diameter 2 mm, wall thickness 0.5 mm) were connected through the brass plugs to both ends of the heat pipe. they connected the heat pipe with a filling device and with a manometer. the heat pipe was filled with pure oxygen from the pressure vessel through the filling capillary, and the filling pressure in the system was measured by the manometer. the relative permeability μr of gaseous oxygen is 1.00053, but for liquid oxygen μr ≈ 1.0035 (at the boiling point). this is high enough to make it possible to capture liquid oxygen by exposure to a magnetic field. the static magnetic field was generated by two special nd-fe-b permanent magnets (dimensions in millimeters: 40×20×10) with a magnetic circuit. the magnetic induction b was 0.6 t in the middle of the air-gap, and the magnetic field was approximately homogeneous in the active blocking zone. the magnetic field could alternatively have been generated by an electromagnet with specially shaped magnetic poles.
the magnetic induction b was adjustable in the range from 0 t to 1.3 t in the middle of the air-gap, and in that case the gradient of b increased to a value of 300 t/m at the edges of the poles. the experiment was performed for 3 different values of the filling pressure in the heat pipe system – 8.0 mpa, 4.5 mpa, 1.8 mpa (at normal temperature t = 297 k). accordingly, the volumes of liquid oxygen at liquid nitrogen temperature (t = 77 k) were 1.33 cm3, 0.75 cm3 and 0.30 cm3 (most of the working fluid is in the liquid state at that moment).

3 experimental results
the functional quality and the magnetic field control of the tested heat pipe were the main parameters observed for the three different quantities of the working fluid. the measured temperature characteristics are illustrated in the three figures, one for each value of the quantity of working fluid. the characteristic t1 represents the temperature in the lower part of the heat pipe, below the magnetic field zone; t2 represents the temperature in the upper part of the heat pipe, between the magnetic field zone and the liquid nitrogen bath (see fig. 2).

fig. 2: schema of the heat pipe system (dimensions are shown in millimeters)
fig. 3: temperature characteristics for the heat pipe with filling pressure 8 mpa under discontinuous magnetic field exposure b = 0.6 t

a magnetic field (b = 0.6 t) was active at the beginning of the experiments, then the magnetic field was deactivated for a time period, and finally it was activated again till the end of the experiment. the efficiency of the magnetic field barrier can be determined from the temperature difference t1 − t2. without exposure to a magnetic field the heat pipe can operate normally and the temperature difference is insignificant. as the disturbing effects increase, the temperature difference also increases. the heat pipe was able to transport heat in all observed cases, and this main function was satisfactory throughout the experiment and with all quantities of working fluid. the heat pipe became almost isothermal in a short time after deactivation of the magnetic field. there is no important difference between the characteristics for filling pressure 8.0 mpa (fig. 3) and 4.5 mpa (fig. 4). at 1.8 mpa (fig. 5) the isothermal temperature was higher. it is obvious that the heat transfer capability was lower in this case, and the liquid nitrogen bath was not able to cool the heat pipe as much as before.

now let us focus on the magnetic field control. the heat transport was clearly disturbed during the exposure to a magnetic field in all observed cases. the magnetic field (b = 0.6 t) was great enough to capture most of the working fluid in the magnetic field zone, and the lower part of the heat pipe was cut off and put out of operation. the temperature characteristics for filling pressure 8.0 mpa (fig. 3) and 4.5 mpa (fig. 4) are very similar.

fig. 4: temperature characteristics for the heat pipe with filling pressure 4.5 mpa under discontinuous magnetic field exposure b = 0.6 t
fig. 5: temperature characteristics for the heat pipe with filling pressure 1.8 mpa under discontinuous magnetic field exposure b = 0.6 t

after deactivation of the magnetic field the working cycle inside the heat pipe was refreshed and temperature t1 immediately dropped.
the temperatures became isothermal in a moment. as soon as the magnetic field was reactivated, the temperature difference started to increase rapidly. at filling pressure 1.8 mpa (fig. 5) the dynamics of these temperature changes were slower than in the previous cases. the highest temperature difference was achieved at filling pressure 4.5 mpa – 167 k just before deactivation of the magnetic field, and 159 k in the stable state at the end of the measurement. at 8 mpa the difference was slightly lower – 165 k and 156 k, respectively. the lowest temperature difference was observed at pressure 1.8 mpa – 155 k and 137 k, respectively.

4 conclusion
the influence of the quantity of working fluid on the operation and on the magnetic field control of the mentioned heat pipe has been observed. a cryogenic gravitational heat pipe filled with pure oxygen was specially manufactured, and the testing device was set up for this purpose. the main function of a heat pipe – heat transport – was satisfactory throughout the experiment and with all quantities of working fluid. at filling pressure 1.8 mpa the heat transfer capability was lower than in the cases with a greater quantity of working fluid. the heat transport was clearly disturbed during exposure to the magnetic field in all observed cases. the highest temperature difference under exposure to a magnetic field was achieved at filling pressure 4.5 mpa. at 8 mpa the difference was slightly lower. the lowest temperature difference was observed at pressure 1.8 mpa, and the dynamics of the temperature changes were also slower than in the previous cases. from the set of three quantities of working fluid observed in the experiment, the value for filling pressure 4.5 mpa seems to be optimal. a greater quantity should not make major changes; reducing the quantity can cause the negative effects mentioned above.

acknowledgments
the research described in this paper has been supervised by doc. ing. j. kuba, csc., fee ctu in prague, and supported by research program no. msm 6840770012 "transdisciplinary research in the area of biomedical engineering ii" of the ctu in prague, sponsored by the ministry of education, youth and sports of the czech republic.

references
[1] cingroš, f., hron, t., kuba, j.: magnetic field control of cryogenic heat pipes. in: proceedings of the international conference isse 2009, brno (czech republic), 2009.
[2] kuba, j., hron, t., cingroš, f.: vliv magnetického pole na transport tepla. in: proceedings of the international conference diagnostika 2007, plzeň (czech republic), 2007.
[3] cingroš, f.: transport tepla tepelnými trubicemi. master's thesis at the dept. of electrotechnology, fee ctu in prague, prague (czech republic), 2007.
[4] kuba, j.: ascertaining of the effects of magnetic curtain. in: proceedings of the international conference diagnostika 2005, plzeň (czech republic), 2005.
[5] ueno, s., iwaki, s., tazume, k.: control of heat transport in heat pipes by magnetic field. j. appl. phys., vol. 69 (1991), no. 8, p. 4925–4927.

filip cingroš, e-mail: cingrf1@fel.cvut.cz
tomáš hron, e-mail: hront1@fel.cvut.cz
department of electrotechnology, czech technical university in prague, faculty of electrical engineering, technická 2, 166 27 praha, czech republic
acta polytechnica vol. 50 no. 4/2010
projective 3d-reconstruction of uncalibrated endoscopic images
p. faltin, a. behrens

abstract
the most common medical diagnostic method for urinary bladder cancer is cystoscopy. this inspection of the bladder is performed by a rigid endoscope, which is usually guided close to the bladder wall. this causes a very limited field of view; the difficulty of navigation is aggravated by the usage of angled endoscopes. these factors cause difficulties in orientation and visual control. to overcome this problem, this paper presents a method for extracting 3d information from uncalibrated endoscopic image sequences and for reconstructing the scene content. the method uses the surf-algorithm to extract features from the images and relates the images by advanced matching. to stabilize the matching, the epipolar geometry is extracted for each image pair using a modified ransac-algorithm. afterwards these matched point pairs are used to generate point triplets over three images and to describe the trifocal geometry. the 3d scene points are determined by applying triangulation to the matched image points. these points are then used to generate a projective 3d reconstruction of the scene, and provide the first step for further metric reconstructions.

keywords: 3d reconstruction, uncalibrated camera, epipolar geometry, trifocal geometry, bladder, cystoscopy, endoscopy.

1 introduction
with about 68810 new cases in 2008 in the united states [1], bladder cancer is a common disease of the urinary system. tumors are usually inspected and treated by endoscopic interventions. urological intervention of the bladder and urethra is also called cystoscopy. the cystoscope is inserted into the bladder through the urethra, which allows an inspection of the bladder wall. the inspection is usually performed close to the bladder wall, which is why the field of view is very limited. a possible way to improve the difficult orientation is, e.g., by using an image mosaicking algorithm [2] to provide a panoramic overview image, or by generating a 3d model of the bladder. this paper presents a method for extracting 3d information from an uncalibrated endoscopic image sequence, which is then used for a projective 3d bladder reconstruction. in further steps, this information can be used for auto calibration of the camera, which leads to the desired metric reconstruction. the paper is organized as follows: in section 2 the image preprocessing, the mathematical reconstruction and the reconstruction algorithms are described. in section 3 the evaluated image sequences and the results are presented. finally section 4 summarizes the results and gives prospects for future work.

2 reconstruction
2.1 imaging
the image sequences are acquired by a rigid video endoscope system, in this case an olympus evis exera ii platform. at the ocular of the cystoscope, a ccd camera is attached, which delivers the data to a workstation. to illuminate the organ, a light source is coupled into the cystoscope. to increase the field of view, endoscopes usually have a fish-eye optic. a typical setup is shown in fig. 1. the realtimeframe software framework [3] is used for real-time processing of the endoscopic data. this software allows very rapid prototyping of algorithms.

fig. 1: a schematic view of a rigid cystoscope

2.2 distortion correction
cystoscope optics produces a strongly radially distorted image, which has to be corrected to extract valid 3d information. to compensate for this distortion, the method of hartley and kang [4] is applied to each image in a preprocessing step. the radial distortion is modeled by the function

x_d = z + λ(r) · (x_u − z)   (1)

with distorted point x_d, center of distortion z, a radius-dependent function λ(r) and corrected point x_u. λ(r) is not based on a fixed model function but is determined dynamically, resulting in a very precise distortion correction. an example using the implementation from [8] is shown in fig. 2.

fig. 2: image distorted (left) by an endoscope, and image after correction (right)

2.3 feature detection
feature detection is accomplished by the surf-algorithm [5], which extracts and describes distinctive points in each image independent of their scale, position and rotation. to detect points of interest, a hessian matrix containing the approximated second-order partial derivatives of a gaussian function (fig. 3) is used. the extracted features are described by an analysis of the surrounding area via haar wavelets. the results are stored in the descriptor vectors of the features. a simple comparison of different features can be made by summing the absolute differences of these vectors.

fig. 3: box filters in the hessian matrix

2.4 2-view correspondences
a simple method for generating correspondences over two images is brute matching. during this process, each feature f1 from the first image is compared with every feature f2 from the second image, and the f2 which minimizes the difference between the feature vectors is chosen. a correspondence is identified if the difference is less than a given threshold. this process usually results in a high number of wrong correspondences.
the cystoscope is inserted into the bladder through the urethra, which allows an inspection of the bladderwall. the inspection isusuallyperformedclose to the bladder wall, which is why the field of view is very limited. a possible way to improve the difficult orientation is e.g. by using an image mosaicking algorithm [2] to provide a panoramic overview image, or by generating a 3d model of the bladder. this paper presents amethod for extracting 3d information from an uncalibrated endoscopic image sequence, which is then used for a projective 3d bladder reconstruction. in further steps, this information can be used for auto calibration of the camera, which leads to the desired metric reconstruction. the paper is organized as follows: in section 2 the image preprocessing, the mathematical reconstruction and the reconstruction algorithms are described. in section 3 the evaluated image sequences and the results are presented. finally section 4 summarizes the results and gives prospects for future work. 2 reconstruction 2.1 imaging the image sequences are acquired by a rigid video endoscope system, in this case an olympus evis exera ii platform. at the ocular of the cystoscope, a ccd camera is attached, which delivers the data to a workstation. to illuminate the organ, a light source is coupled into the cystoscope. to increase the field of view, endoscopes usually have a fish-eye optic. a typical setup is shown in fig. 1. the realtimeframe software framework [3] is used for real-time data processing of endoscopicdata. this software allowsa very rapid prototyping of algorithms. fig. 1: a schematic view of a rigid cystoscope 2.2 distortion correction cystoscope optics produces a strongly radial distorted image, which has to be corrected to extract valid 3d information. to compensate this distortion, the method ofhartley andkang [4] is applied to each image in a preprocessing step. the radial distortion is modeled by the function �xd = �z + λ(r) · (�xu − �z) (1) with distorted point �xd, center of distortion �z, a function depending on the radius λ(r) and corrected point �xu. λ(r) is not based on a fixed model function but is dynamically determined, resulting in very precise distortion correction. an example using the implementation from [8] is shown in fig. 2. 29 acta polytechnica vol. 50 no. 4/2010 fig. 2: image distorted (left) by an endoscope, and image after correction (right) 2.3 feature detection feature detection is accomplished by the surfalgorithm [5], which extracts and describes distinctive points in each image independent of its scale, position and rotation. to detect points of interest, a hessian matrix containing the approximated second order partial derivatives of a gaussian function from fig. 3 is used. the extracted features are describedby an analysis of the surrounding area via haar wavelets. the results are stored in the descriptor vectors of the features. a simple comparison of different features can bemade by summing the absolute summed differences of these vectors. fig. 3: box filters in hessian matrix 2.4 2-view correspondences a simple method for generating correspondences over two images is brute matching. during this process, each feature f1 from the first image is compared with every feature f2 from the second image, and the f2 which minimizes the difference between the feature vectors is chosen. a correspondence is identified, if the difference is less than a given threshold. this process usually results in a high number of wrong correspondences. 
thus, advanced matching is applied. in addition to forward matching, where the best f2 for each f1 is chosen, backwardmatching is applied by choosing the best f1 for each f2. a correspondence is identified only if these two matchings are equal. to improve the robustness, a slight restriction of the scale and orientation of the features by factor two, respectively 45◦, is also applied. this assumption is valid since the position of the endoscope does not change much between two sequential images in a real bladder inspection. the last check iswhether thedetectedbest correspondence is reliable by comparing its distance dbest with the distance d2ndbest from the second-best one via looking at their ratio dbest/d2ndbest > τ. 2.5 epipolar geometry epipolar geometry describes the setup of two cameras looking at the same scene fromdifferentpoints of view. while in this section only the basic fundamentals are described, more details can be found in [14]. an exemplary camera setup showing the camera centers �c and �c ′ is given in fig. 4. the 3d-point �x is projected onto the image planes, resulting in points �x and �x ′. the points �e and �e ′ are called epipoles, and they represent the projected camera centers on the image planes. the position, orientation andproperties of the two cameras are described by the fundamental matrix f. it has seven degrees of freedom and rank 2. only the intrinsic geometry is described, which is why the fundamental matrix is independent of the scene content. a central equation for understanding epipolar geometry is the epipolar relation �x ′ t ·f · �x =0 (2) which connects an image point from one image plane with its correspondingpoint on the other image plane. epipolar lines for each image point can be calculated using the fundamental matrix. this line passes through the position of the corresponding point on the other image plane. all these lines intersect in the epipoles. they can be calculated by �l ′ =f · �x or �l =ft · �x ′ . (3) a geometric interpretation of eq. 3 is visualized in fig. 5. fig. 4: epipolar geometry fig. 5: epipolar line thedifferent framesof the sequenceare interpreted as different views. this is correct if there is a camera movement between the frames and if the scene is stationary. 30 acta polytechnica vol. 50 no. 4/2010 the ransac-algorithm [6] is applied to estimate the fundamental matrix. this algorithm generates a random set of correspondences and calculates the fundamental matrix. using backprojection, each corresponding point is classified as an inlier or an outlier. to be classified as an inlier, the reprojection error of a point pair has to be smaller than a threshold. for an acceptable computation time, the sampsonapproximation [9] is used to determine the error. this process is repeated iteratively for other random samples. finally, the inliers of the fundamental matrix, whichyields to the largestnumberof inliers, are chosen to calculate the final fundamental matrix, and all outliers are eliminated. theransac-algorithmuses the 7-point algorithm to calculate the fundamental matrices. the final matrix is estimated using the 8-point algorithm [10], which can handle eight or more points andprovides a least squares solution. both algorithms use a system of equations constructed with eq. 2. 2.6 2-view camera matrices to perform a reconstruction at least two camera matrices are required. if the first camera is chosen with p= [i|�0] the second camera matrix is defined by p′ = [[�e ′]× ·f+�e ′ · �v t |λ · �e ′] . 
the fundamental matrix f and the epipole e′ are known, but the scalar λ and the vector v are unknown. correspondingly, there are four degrees of freedom in choosing the second camera. therefore a scene reconstruction based on eq. 4 is subject to a projective transformation compared to the original scene, as shown in [11]. fig. 6 shows an example. without any camera calibration or additional scene information, no metric reconstruction from two views is feasible.

fig. 6: reconstruction with projective transformation

2.7 3-view correspondences
it seems to be a straightforward process to connect two matched image pairs sharing one common image into an image triplet using the two-view correspondences. but in practice the surf-algorithm cannot detect identical features in all three images, because of view changes and noisy image data. additionally, not all correspondences can be identified. this results in situations where not every feature can be retrieved in all images, as visualized in fig. 7. as can easily be seen, only a small number of correspondences share the same corresponding point in image i+1, indicated by surrounding circles. to increase the number of correspondences over three images, an additional matching process from the first to the third view is performed. this step may induce new incorrect matches, which have to be considered. thus, an advanced ransac-algorithm is used to join the set of tracked correspondences and the set of directly matched correspondences, and to detect outliers. the set of tracked correspondences contains a high proportion of valid ones. to benefit from this fact, the ransac-algorithm fills its samples with a higher probability from the tracked set than from the directly matched set of correspondences. but even if the tracked correspondences are verified by epipolar matching, they should not be chosen unconditionally, because they could still be wrong, as fig. 8 shows. x′1 and x′2 are located on the epipolar line which corresponds to the point x1. this means the epipolar relation is fulfilled, and a correspondence between x1 and x′1 or between x1 and x′2 appears to be correct in the second view. only in the third view is it possible to identify the wrong correspondence between x1 and x″2.

fig. 7: three image correspondences
fig. 8: wrong correspondence detected in three views

2.8 trifocal geometry
the mathematical description of trifocal geometry uses tensor notation. a good introduction to this topic can be found in [7] and [14]. in this paper, tensors are written in calligraphic letters and the einstein notation is used.
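before turning to the trifocal case, the ransac estimation of f from section 2.5 can be sketched compactly. the sketch below is an illustrative simplification: it uses the 8-point algorithm inside the sampling loop (the paper uses the 7-point algorithm there) and omits hartley's coordinate normalization, which a serious implementation should include; the sampson error follows the standard formula.

```python
# schematic ransac estimation of the fundamental matrix (cf. section 2.5);
# x1, x2: matched pixel coordinates as (n, 2) arrays, n >= 8
import numpy as np

def eight_point(x1, x2):
    # least-squares solution of x2^T f x1 = 0 (unnormalized, for brevity)
    a = np.column_stack([
        x2[:, 0] * x1[:, 0], x2[:, 0] * x1[:, 1], x2[:, 0],
        x2[:, 1] * x1[:, 0], x2[:, 1] * x1[:, 1], x2[:, 1],
        x1[:, 0], x1[:, 1], np.ones(len(x1))])
    _, _, vt = np.linalg.svd(a)
    f = vt[-1].reshape(3, 3)
    u, s, vt = np.linalg.svd(f)                 # enforce rank 2
    return u @ np.diag([s[0], s[1], 0.0]) @ vt

def sampson_error(f, x1, x2):
    h1 = np.column_stack([x1, np.ones(len(x1))])
    h2 = np.column_stack([x2, np.ones(len(x2))])
    fx1 = h1 @ f.T                              # rows are f @ x1
    ftx2 = h2 @ f                               # rows are f.T @ x2
    num = np.sum(h2 * fx1, axis=1) ** 2
    den = fx1[:, 0]**2 + fx1[:, 1]**2 + ftx2[:, 0]**2 + ftx2[:, 1]**2
    return num / den

def ransac_f(x1, x2, iters=500, thresh=1.0, seed=0):
    rng = np.random.default_rng(seed)
    best = np.zeros(len(x1), dtype=bool)
    for _ in range(iters):
        idx = rng.choice(len(x1), 8, replace=False)
        f = eight_point(x1[idx], x2[idx])
        inliers = sampson_error(f, x1, x2) < thresh
        if inliers.sum() > best.sum():
            best = inliers
    # final least-squares estimate from all inliers of the best sample
    return eight_point(x1[best], x2[best]), best
```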
2.7 3-view correspondences

it seems to be a straightforward process to connect two matched image pairs that share one common image into an image triplet using the two-view correspondences. in practice, however, the surf algorithm cannot detect identical features in all three images, because of view changes and noisy image data; additionally, not all correspondences can be identified. this results in situations where not every feature can be retrieved in all images, as visualized in fig. 7: as can easily be seen, only a small number of correspondences share the same corresponding point in image $i+1$, indicated by surrounding circles. to increase the number of correspondences over three images, an additional matching process from the first to the third view is performed. this step may introduce new incorrect matches, which have to be taken into account. thus, an advanced ransac algorithm is used to join the set of tracked correspondences and the set of directly matched correspondences, and to detect outliers. the set of tracked correspondences contains a high proportion of valid ones; to benefit from this fact, the ransac algorithm draws its samples with a higher probability from the tracked set than from the directly matched set. but even though the tracked correspondences are verified by epipolar matching, they should not be trusted unconditionally, because they can still be wrong, as fig. 8 shows: $\vec x\,'_1$ and $\vec x\,'_2$ both lie on the epipolar line corresponding to the point $\vec x_1$. the epipolar relation is therefore fulfilled, and a correspondence between $\vec x_1$ and $\vec x\,'_1$ or between $\vec x_1$ and $\vec x\,'_2$ appears correct in the second view; only in the third view is it possible to identify the wrong correspondence $\vec x_1 \leftrightarrow \vec x\,''_2$.

fig. 7: three image correspondences
fig. 8: wrong correspondence detected in three views

2.8 trifocal geometry

the mathematical description of trifocal geometry uses tensor notation; a good introduction to this topic can be found in [7] and [14]. in this paper, tensors are written in calligraphic letters and the einstein summation convention is used. trifocal geometry describes the setup of three different views of the same scene. like epipolar geometry, trifocal geometry is intrinsic and thus independent of the scene content. a sample configuration is shown in fig. 8 with the camera centers $\vec c$, $\vec c\,'$ and $\vec c\,''$. the 3d point $\vec X_1$ is projected to the image points $\vec x_1$, $\vec x\,'_1$ and $\vec x\,''_1$. the epipoles $\vec e\,'$ and $\vec e\,''$ represent the projected camera center of the first camera on the image planes of the second and third views. the first camera is defined by $P = [I\,|\,\vec 0\,]$, the second camera is $P' = [A\,|\,\vec e\,']$ and the third camera is $P'' = [B\,|\,\vec e\,'']$. the properties and relations of these cameras are described by the trifocal tensor $\mathcal T$, a $3\times3\times3$ third-order tensor with 18 degrees of freedom. the reduction from 27 parameters to 18 degrees of freedom is caused by the internal constraint
$$\mathcal T_i^{\,jk} = p'^{\,j}_i\, p''^{\,k}_4 - p'^{\,j}_4\, p''^{\,k}_i \qquad (5)$$
with the camera matrices $P'$ and $P''$ written in tensor notation as $p'$ and $p''$. by analogy with the epipolar relation $\vec x\,'^{\,T}\cdot F\cdot\vec x = 0$ of two-view geometry, trifocal geometry yields
$$x^i x'^j x''^k\, \mathcal E_{jpr}\, \mathcal E_{kqs}\, \mathcal T^{\,pq}_i = 0_{rs}. \qquad (6)$$
the tensor $\mathcal E$ in eq. (6), called the levi-civita symbol, is the constant third-order $3\times3\times3$ tensor
$$\mathcal E_{rst} = \begin{cases} 0 & \text{if } r,s,t \text{ are not distinct} \\ +1 & \text{if } (r,s,t) \text{ is an even permutation} \\ -1 & \text{if } (r,s,t) \text{ is an odd permutation} \end{cases} \qquad (7)$$
with $r,s,t \in \{1,2,3\}$. the trifocal tensor can also be written in matrix notation using three $3\times3$ matrices
$$(T_i)_{jk} = \mathcal T^{\,jk}_i \quad \text{with } i,j,k \in \{1,2,3\}. \qquad (8)$$
this notation can be used to extract the fundamental matrices between two of the views from the trifocal tensor, using the equations
$$F_{21} = [\vec e\,']_{\times}\cdot[T_1, T_2, T_3]\cdot\vec e\,'' \qquad (9)$$
and
$$F_{31} = [\vec e\,'']_{\times}\cdot[T_1^T, T_2^T, T_3^T]\cdot\vec e\,'. \qquad (10)$$
here the notation $\vec a^{\,T}\cdot[M_1, M_2, M_3]\cdot\vec b$ represents the vector $(\vec a^{\,T} M_1 \vec b,\ \vec a^{\,T} M_2 \vec b,\ \vec a^{\,T} M_3 \vec b)$, and $[\vec x]_\times$ denotes the skew-symmetric matrix, which for a vector $\vec a$ is given by
$$[\vec a]_{\times} = \begin{pmatrix} 0 & -a_3 & a_2 \\ a_3 & 0 & -a_1 \\ -a_2 & a_1 & 0 \end{pmatrix}. \qquad (11)$$
the trifocal tensor is calculated by the normalized linear algorithm [14], including algebraic minimization. the basic idea is to solve a system of equations generated from eq. (6) and to enforce the inner constraints given by eq. (5).

2.9 triangulation

after calculating the camera matrices, the 3d points can be reconstructed using triangulation. the concept is that the projection lines through the camera centers $\vec c$ and $\vec c\,'$ and the image points $\vec x$ and $\vec x\,'$ intersect in the 3d point $\vec X$, as shown in fig. 5. to calculate $\vec X$, the equation
$$\begin{pmatrix} \bar x\,\vec p^{\,3T} - \vec p^{\,1T} \\ \bar y\,\vec p^{\,3T} - \vec p^{\,2T} \\ \bar x'\,\vec p\,'^{\,3T} - \vec p\,'^{\,1T} \\ \bar y'\,\vec p\,'^{\,3T} - \vec p\,'^{\,2T} \end{pmatrix}\cdot \vec X = \vec 0 \qquad (12)$$
is solved, where $\vec p^{\,iT}$ and $\vec p\,'^{\,iT}$ are the $i$-th row vectors of $P$ and $P'$. since subpixel positions can only be determined by interpolation, and additional distortion is induced by the camera system, the detected image points are noisy. as a result, the two projection lines generally do not meet in space but form two skew lines. to overcome this problem, the image points $\vec x$ and $\vec x\,'$ are adjusted to points $\bar{\vec x}$ and $\bar{\vec x}\,'$ that satisfy the epipolar relation exactly, while the sum of squared euclidean distances $d(\vec x, \bar{\vec x})^2 + d(\vec x\,', \bar{\vec x}\,')^2$ is minimized.
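eq. (12) is a homogeneous linear system; a minimal dlt-style sketch solves it per point via svd (the optimal pre-correction of $\vec x$, $\vec x\,'$ onto the epipolar constraint is omitted here):

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """linear triangulation of one 3d point from eq. (12).
    P1, P2: 3x4 camera matrices; x1, x2: (x, y) image points."""
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]                 # null vector = homogeneous 3d point
    return X[:3] / X[3]
```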
3 results

the four different endoscopic video sequences from fig. 9 are used to analyze the different steps of the algorithm.

fig. 9: test sequences. a) pisa, b) rome, c) train station, d) wooden house

fig. 10: correspondences for two views

boxplots are used to compare the data statistically over all frames of the sequences: 25 % and 75 % of all measured values lie below the lower and upper box borders, respectively, the line inside the box is the median, and the whiskers indicate the 5 % and 95 % percentiles. fig. 10 analyses the matching process for three views, as described in section 2.7. the number of correspondences is shown on the left side of the boxplots; on the right side, the related reprojection error is shown. the first row shows the results of the two-view process using the advanced matching from section 2.5, compared to the three-view process. analysing the "wooden house" sequence, it can be observed that the number of tracked correspondences (1–3), about 60, is significantly lower than the number of directly identified correspondences (1–2), which is about 125. the advanced matching between the first and third views (1–3) performs only slightly worse than the matching from the first to the second view (1–2). this can be explained by the larger temporal distance between the images, which leads to a larger variation of the image data. in the fourth row, the tracked and the newly matched correspondences are joined using the advanced ransac algorithm (1–3), which leads to a higher number of about 150 correspondences, compared to the matching between the first and second views. finally, only correspondences present in all three views (1–2–3), called triplets, are selected, which reduces the total number to about 90. the mean error of about 0.3 pixels is consistently low; only the tracked correspondences (1–3) show a slightly higher error and variation before the ransac algorithm is applied. this is caused by sporadic wrong correspondences, as described in section 2.7.

finally, the reprojection error from the estimated trifocal tensor is analyzed. for this step, two fundamental matrices are calculated from the trifocal tensor by eqs. (9) (1–2) and (10) (1–3). fig. 11 shows nearly the same subpixel error for all sequences: about 0.3 pixels (1–2) and about 0.6 pixels (1–3), respectively. compared to epipolar geometry, the temporal distance has a stronger impact here. since no ransac algorithm has yet been applied when estimating the trifocal tensor, outliers have a direct impact on the error value.

fig. 11: trifocal error

an exemplary reconstruction of the "pisa" sequence is shown in fig. 12. in the left image all detected features are shown on the image plane, and in the right image a 3d reconstruction from these points is shown; corresponding points have the same color. the reconstruction is compressed in the x-direction, which is caused by the projective transformation described in section 2.6.

fig. 12: image from sequence "pisa" and related reconstruction

4 summary and prospects

this paper presents a method for reconstructing 3d scene content from uncalibrated endoscopic sequences based on surf features. the different steps yield robust results by using the ransac algorithm in adapted forms. it has been shown that epipolar geometry and trifocal geometry can be extracted with high precision, so that subpixel-precise reconstruction is possible. an important application of trifocal geometry is the extraction of consistent camera matrices for the whole sequence by a linear method, as in [12] or [13]. subsequently, these cameras can be used for auto-calibration [14], which allows metric reconstruction.

acknowledgement

we would like to thank prof. dr.-ing. til aach, institute of imaging and computer vision, rwth aachen university, germany, for supervising this project. additionally, we would like to thank olympus & winter ibe gmbh for providing a video endoscope system.

references

[1] american cancer society: cancer facts & figures 2008. american cancer society, 2008.
[2] behrens, a., bommes, m., stehle, t., gross, s., leonhardt, s., aach, t.: a multi-threaded mosaicking algorithm for fast image composition of fluorescence bladder images. medical imaging 2010: visualization, image-guided procedures, and modeling, vol. 7625, san diego, ca, usa, 2010.
[3] gross, s., behrens, a., stehle, t.: rapid development of video processing algorithms with realtimeframe. conference book biomedica, liege, belgium, 2009, pp. 217–220.
[4] hartley, r., kang, s. b.: parameter-free radial distortion correction with center of distortion estimation. ieee transactions on pattern analysis and machine intelligence, 2007, vol. 29, no. 8, pp. 1309–1321.
[5] bay, h., ess, a., tuytelaars, t., gool, l. v.: speeded-up robust features (surf). computer vision and image understanding, 2008, vol. 110, no. 3, pp. 346–359.
[6] fischler, m. a., bolles, r. c.: random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. communications of the acm, 1981, vol. 24, no. 6, pp. 381–395.
[7] triggs, w.: the geometry of projective reconstruction i: matching constraints and the joint image. proc. international conference on computer vision, boston, ma, usa, 1995, pp. 338–343.
[8] stehle, t., truhn, d., aach, t., trautwein, c., tischendorf, j.: camera calibration for fish-eye lenses in endoscopy with an application to 3d reconstruction. proceedings ieee international symposium on biomedical imaging, washington, d.c., usa, 2007.
[9] sampson, p. d.: fitting conic sections to 'very scattered' data: an iterative refinement of the bookstein algorithm. computer graphics and image processing, 1982, vol. 18, no. 1, pp. 97–108.
[10] hartley, r.: in defense of the eight-point algorithm. ieee transactions on pattern analysis and machine intelligence, 1997, vol. 19, no. 6, pp. 580–593.
[11] pollefeys, m.: self-calibration and metric 3d reconstruction from uncalibrated image sequences. phd thesis, leuven, belgium, 1999.
[12] triggs, b.: linear projective reconstruction from matching tensors. image and vision computing, 1997, vol. 15, issue 8, pp. 617–625.
[13] avidan, s., shashua, a.: threading fundamental matrices. ieee transactions on pattern analysis and machine intelligence, 2001, vol. 23, no. 1, pp. 73–77.
[14] hartley, r., zisserman, a.: multiple view geometry in computer vision. 2nd ed. cambridge university press, 2003.

about the authors

peter faltin was born in 1983 in cologne, germany. he studied computer engineering at rwth aachen university and received his dipl.-ing. in 2010. since then he has held a phd position at the institute of imaging & computer vision at rwth aachen university. his research focuses on medical image processing, video processing, signal processing and computer vision.

alexander behrens was born in bückeburg, germany in 1980. he received a dipl.-ing. degree in electrical engineering from the leibniz university of hannover, hannover, germany, in 2006. after receiving the degree, he worked as a research scientist at the university's institut für informationsverarbeitung. since 2007, he has been a ph.d. candidate at the institute of imaging and computer vision, rwth aachen university, aachen, germany. his research interests are in medical image processing, signal processing, pattern recognition, and computer vision.

peter faltin, alexander behrens
e-mail: peter.faltin@lfb.rwth-aachen.de, alexander.behrens@lfb.rwth-aachen.de
institute of imaging & computer vision
rwth aachen university
d-52056 aachen, germany

algebraic solutions in open string field theory – a lightning review

m. schnabl

abstract

in this short paper we review the basic ideas of string field theory, with an emphasis on recent developments. we show how, without many technicalities, one can look for analytic solutions to witten's open string field theory. this is an expanded version of a talk given by the author over the last year on a number of occasions (parts of this work were presented at the kavli institute for theoretical physics, the aspen center for physics, the simons center for geometry and physics and the yukawa institute for physics; we thank these institutes for their warm hospitality), and notably at the conference selected topics in mathematical and particle physics in honor of prof. jiří niederle's 70th birthday.
jǐŕı niederle’s 70th birthday. 1 what is string field theory? the traditional rulesof firstquantized string theoryallow one to compute on-shell perturbative amplitudes, but they tell us little about collective phenomena or non-perturbative effects. two most prominent examples of such are tachyon condensation (a close relative of the higgs mechanism) and instanton physics. string field theory is an attempt to turn string theory into some sort of field theory by writing a field theory action for each of the single string modes and combining them together with very particular interactions. perturbative quantization of this field theory yields all of the perturbative string theory, and one might hope that one day we could get a truly nonperturbative description of the theory. one of the most interesting applications of string field theory to date has been in studies of the classical backgrounds of string theory. traditionally, string theories aredefined tobe inone-to-onecorrespondence with worldsheet conformal field theories (cft’s). as such they correspond to the choice of infinitely many couplings in two dimensional worldsheet theory. the condition of vanishing beta functions for all of these couplings is equivalent to einstein or maxwell like equations for the classical backgrounds. given two cft’s, the two corresponding string theories look in general entirely different. for cft’s related by exactly marginal deformations, the two theories may bear some resemblance, but for theories related by relevant deformations it is very hard to see how one background can be a solution of a theory formulated around another background. this is in stark contrast to general relativity, where theeinstein-hilbert action does not depend on any particular background, but it allows for solutions describing very different geometries. one of the holy grails of string theory research is a manifestly background independent formulation of string theory. string field theory (sft) goes half-way towards this goal. it gives us a formulation which is background independent in form (not truly in essence) and which posseses a multitude of classical solutions representing different backgrounds. it is defined using the data of a single reference cft. it is analogous towriting theeinstein-hilbert action and substituting the metric gμν(x) with g ref μν(x)+ hμν(x). the fundamental reason for this difficulty is that what are the field theoretic degrees of freedom in string theory depends on the background, unlike in general relativity. following sen and zwiebach [1, 2], it is believed that the space of classical solutions of sft is in one-to-one correspondence (modulo gauge symmetries and perhaps dualities) with worldsheet cft’s. in this short paper we review the amazingly simple construction of a class of solutions that can be determined purely algebraically. these are just the first steps in a long programof constructing and classifying all solutions and relating them to somecft’s. in sections 5 and 6we add in a little bit of originalmaterial. borrowing a few theorems from the theory of distributions and the laplace transform we are able to shed novel light on what the space of allowed string fields should look like. this seemingly academic question is actually important for distinguishing gauge trivial and non-trivial solutions. 
2 précis of osft

one of the best understood string field theories is witten's covariant chern-simons type string field theory [3] for the open bosonic string. (there are many other string field theories, also for closed strings or superstrings; some theories have more than one description, often non-covariant, and some are non-polynomial.) as is well known, quantization of a single classical string is somewhat subtle, due to the reparametrization invariance of the worldsheet action. this gauge symmetry can be fixed in a number of ways. in the covariant quantization procedure one has to gauge-fix the worldsheet metric $h_{\alpha\beta}$ and introduce the worldsheet faddeev-popov ghost fields $b$ and $c$. the virasoro constraints $T_{\alpha\beta} = 0$ resulting from the gauge fixing can then be conveniently imposed using the brst formalism. (alternatively, as in the light-cone gauge, one could use the residual infinite-dimensional conformal symmetry to gauge-fix one of the embedding coordinates and solve the virasoro constraints algebraically.) the space of physical states of the string is then identified with the cohomology of the brst operator $Q_B$ acting on the hilbert space $\mathcal H_{BCFT}$ of the matter-plus-ghost boundary conformal field theory (bcft) determined by the string background. interestingly, and not for trivial reasons, this bcft is the most convenient starting point for string field theory.

the classical degrees of freedom of open string field theory are fields associated to quantum states of the first-quantized open string. it is very convenient to work with the extended space $\mathcal H_{BCFT}$, which contains not only physical states of the string but also various other states; interestingly, these turn into the auxiliary and ghost fields of string field theory. all the fields are neatly assembled into a string field
$$|\Psi\rangle = \sum_i \int d^{p+1}k\;\phi_i(k)\,|i,k\rangle, \qquad (2.1)$$
where the index $i$ runs over all states of the first-quantized string in a sector of momentum $k$. the dimensionality of the momentum is $p+1$, as appropriate for open strings ending on a d-$p$-brane. the coefficients $\phi_i(k)$ are momentum-space wave functions for particle-like excitations of the string, and would become field theory operators if we proceeded to second quantization. the action of string field theory can be written in the form
$$S = -\frac{1}{g_o^2}\left[\frac{1}{2}\langle\Psi * Q_B\Psi\rangle + \frac{1}{3}\langle\Psi*\Psi*\Psi\rangle\right], \qquad (2.2)$$
where $g_o$ is the open string coupling constant and $*$ is witten's star product. the action has an enormous gauge symmetry given by
$$\delta\Psi = Q_B\Lambda + \Psi*\Lambda - \Lambda*\Psi, \qquad (2.3)$$
where $\Lambda \in \mathcal H_{BCFT}$ (grassmann even), provided the star product is associative, $Q_B$ acts as a graded derivation, and $\langle\,\cdot\,\rangle$ has the properties of an integration. to summarize, the basic ingredients that one needs in order to write down witten's osft in a particular background are $\mathcal H_{BCFT}$, $*$, $Q_B$, $\langle\,\cdot\,\rangle$. for a more comprehensive review, the reader is referred to the excellent reviews [4, 5]. (older reviews are [6, 7]; a more recent development appears in [8].)
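for later reference, note how the equation of motion used in section 4 follows from this action. varying (2.2) and using the cyclicity of $\langle\,\cdot\,\rangle$ together with $\langle Q_B(\ldots)\rangle = 0$, one finds (schematically, with grading signs suppressed)
$$\delta S = -\frac{1}{g_o^2}\,\big\langle\, \delta\Psi * \big(Q_B\Psi + \Psi*\Psi\big)\big\rangle,$$
so stationarity under arbitrary variations $\delta\Psi$ yields $Q_B\Psi + \Psi*\Psi = 0$, which is eq. (4.7) below.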
3 demystifying the star product

the star product has always been one of the most difficult ingredients of string field theory to understand and to work with. it can be defined very intuitively using the schrödinger presentation of string wave functionals,
$$(\Psi_1 \star \Psi_2)[x(\sigma)] = \int [dx_{\mathrm{overlap}}]\; \Psi_1\!\left[\hat x(\sigma)\right]\Psi_2\!\left[\check x(\sigma)\right], \qquad (3.4)$$
where the hat and the check mean that the left and right halves of these functions, respectively, coincide with those of $x(\sigma)$. it took some years and many research papers to understand exactly whether this path integral makes sense. there is, however, a modern definition which makes many of the star product properties manifest.

let us describe string field theory states as linear combinations of surfaces with vertex operator insertions, such as in fig. 1. (in order to match witten's original definition, the $\sigma$ coordinate must run from right to left.) these represent the worldsheets of a single string evolving from the infinite past to the infinite future. a conformal transformation can be used to bring the surface to a canonical form, but this would act nontrivially on the in and out states. we will consider only shapes which have the future (upper) part in the canonical shape of a semi-infinite strip. by putting various vertex operators in the far future and evaluating the path integral over the surface, we can uniquely probe both the shape of the lower part of the surface and the vertex operators inserted there. to describe the star product, we take two states in the canonical form, cut off the probe strips (in light yellow) and glue the lower parts of the strips along the upper boundaries of the hatched regions. one gets again a state in the form of a surface with insertions, but the shape is different from the ones we started with. imagine now factorizing the path integral measure over the worldsheet fields into the hatched area and the rest of the surface. the path integral over the hatched region is performed first. since there are no vertex operators inserted there, its result can be replaced by an effective term in the worldsheet action, or equivalently by an insertion of some nonlocal line operator. it turns out that this operator can be written as $e^{-K}$, where $K$ is a line integral of the worldsheet energy-momentum tensor in some specific coordinates; the integration extends from the boundary to the midpoint of the worldsheet.

fig. 1: witten's star product is defined by gluing the respective worldsheets

the upshot is that star multiplication is isomorphic to operator multiplication. to see this more explicitly, consider two vertex operators $\phi_{1,2}$. the corresponding states $|\phi_{1,2}\rangle$ star-multiply as
$$|\phi_1\rangle * |\phi_2\rangle = |\phi_1 e^{-K}\phi_2\rangle. \qquad (3.5)$$
let us now introduce new (non-local) cft operators $\hat\phi = e^{K/2}\phi\, e^{K/2}$ and the associated states $|\hat\phi\rangle$. then clearly
$$|\hat\phi_1\rangle * |\hat\phi_2\rangle = |\widehat{\phi_1\phi_2}\rangle. \qquad (3.6)$$
therefore $\phi \to |\hat\phi\rangle$ is the claimed isomorphism. usually one does not think of the star product and the operator product as being the same thing; in particular, there are well-known short-distance singularities for local vertex operators at nearby points, whereas the star product is usually thought to be much more regular. in fact, thanks to the presence of the $e^{K/2}$ operators, we do get the same type of singularities when we try to star-multiply two $\hat\phi$ states as in the vertex operator algebra. the $\hat\phi$ states can be represented by surfaces that differ from those in fig. 1 in that the lower bluish part is missing and is replaced by the identification of the left and right parts of the base of the upper semi-infinite strip, with a local operator inserted at the midpoint. we say that such string states have no security strips.
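the statement (3.6) is a one-line consequence of (3.5) and the definition $\hat\phi = e^{K/2}\phi\, e^{K/2}$:
$$|\hat\phi_1\rangle * |\hat\phi_2\rangle = |\hat\phi_1\, e^{-K}\,\hat\phi_2\rangle = |e^{K/2}\phi_1\, e^{K/2} e^{-K} e^{K/2}\,\phi_2\, e^{K/2}\rangle = |e^{K/2}\phi_1\phi_2\, e^{K/2}\rangle = |\widehat{\phi_1\phi_2}\rangle .$$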
4 algebraic solutions to osft

to solve the classical equation of motion
$$Q_B\Psi + \Psi*\Psi = 0 \qquad (4.7)$$
one could try to restrict the huge star algebra to as small a subsector as possible. as we first want to study tachyon condensation, we should perhaps include the vertex operator of the zero-momentum tachyon, which is just the $c$-ghost. for the subalgebra to be nontrivial we should also include the non-local operator $K$. one can easily (but not necessarily) add an operator $B$, which is defined by the same type of integral as $K$, with the energy-momentum tensor replaced by the $b$-ghost. together, all these elements obey
$$c^2 = 0, \quad B^2 = 0, \quad \{c, B\} = 1, \qquad (4.8)$$
$$[K, B] = 0, \quad [K, c] = \partial c. \qquad (4.9)$$
the action of the exterior derivative is equally simple:
$$Q_B K = 0, \quad Q_B B = K, \quad Q_B c = cKc. \qquad (4.10)$$
$Q_B$ is not the only useful derivation. there is also one called $\mathcal L^-$, which, aside from the usual leibniz rule, obeys
$$\mathcal L^- c = -c, \quad \mathcal L^- B = B, \quad \mathcal L^- K = K. \qquad (4.11)$$
at a given ghost number, the derivation $\mathcal L^-$ counts the number of $K$'s and is bounded from below. one could therefore use it to solve the equation of motion order by order in $\mathcal L^-$ within the subalgebra generated by $K$, $B$, $c$. the simplest possible solution is
$$\Psi = \alpha c - cK. \qquad (4.12)$$
clearly $Q_B\Psi = \alpha\, cKc - cKcK = -\Psi*\Psi$.
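spelling out the 'clearly' using only (4.8)–(4.10) and $c^2 = 0$:
$$Q_B\Psi = \alpha\, Q_B c - (Q_B c)K = \alpha\, cKc - cKcK \qquad (\text{since } Q_B K = 0),$$
$$\Psi*\Psi = (\alpha c - cK)(\alpha c - cK) = -\alpha\, cKc + cKcK,$$
where the terms $\alpha^2 c^2$ and $\alpha\, c\cdot cK = \alpha\, c^2 K$ vanish; hence $Q_B\Psi + \Psi*\Psi = 0$.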
a more general solution was found by okawa [9], following [10] (see also the works by erler [11, 12]):
$$\Psi = F c \frac{KB}{1-F^2}\, cF, \qquad (4.13)$$
where $F = F(K)$ is an arbitrary function of $K$. proving that it obeys the equation of motion requires some straightforward, if slightly tedious, algebra. the solution can be formally written in the form
$$\Psi = (1 - FBcF)\; Q_B\!\left(\frac{1}{1 - FBcF}\right), \qquad (4.14)$$
which makes the proof of the equation of motion trivial. what is not so trivial is to distinguish a trivial pure-gauge solution from the nontrivial solutions. note that $Bc$ is a projector, in the sense that it squares to itself, and therefore
$$(1 - FBcF)^{-1} = 1 + \frac{F}{1-F^2}\,BcF. \qquad (4.15)$$
for a solution to be nontrivial, the factor $F/(1-F^2)$ must be ill defined, whereas the similar-looking factor appearing in the string field itself, $\tilde F \equiv K/(1-F^2)$, must be well defined. before we go into depth about what ill/well defined means, let us discuss another interesting property of these solutions. expanding string field theory around such a solution, one obtains a similar-looking theory with $Q_B$ replaced by
$$Q_\Psi = Q_B + \{\Psi,\,\cdot\,\}_*. \qquad (4.16)$$
the second term acts as a graded star-commutator with $\Psi$. the fluctuations around the vacuum are described by the cohomology of this operator. interestingly, one can find a homotopy operator $A$ which formally trivializes the cohomology [13],
$$A = \frac{1-F^2}{K}\,B, \qquad (4.17)$$
i.e. it obeys $\{Q_\Psi, A\} = 1$. therefore, formally, any $Q_\Psi$-closed state $\chi$ can be written as $Q_\Psi$-exact: $\chi = Q_\Psi(A\chi)$. the absence of nontrivial excitations around a given state $\Psi$ is a property expected by sen's conjectures [14] around the tachyon vacuum, but definitely not around a generic state. we thus find a condition analogous to the one above: $(1-F^2)/K$ should be well defined for the tachyon vacuum, but ill defined for the perturbative vacuum ($F = 0$). assuming that $F(K)$ is a well-defined string field, and adopting the simplifying assumption that $F$ is analytic around the origin, we find that
$$F(K) = a + bK + \ldots \qquad (4.18)$$
gives the tachyon vacuum if $a = 1$ and $b \neq 0$, and gives the trivial vacuum for $a \neq 1$. solutions with $a = 1$ and $b = 0$ might correspond to something more exotic, such as multiple brane solutions, but this has not yet been shown convincingly.

5 what constitutes a well defined string field?

this is still an open question, so we will rather ask the more specific question of when a function $F(K)$ constitutes a well-defined string field. even this question might not have a unique answer, as there are several possible definitions of what might constitute 'good' or 'bad', depending on the context. we define the set of geometric string field functions $F(K)$ to be those expressible as
$$F(K) = \int_0^\infty d\alpha\, f(\alpha)\, e^{-\alpha K}. \qquad (5.19)$$
(a much broader class of interesting non-geometric states has been considered recently by erler [16], inspired by rastelli [17].) the name geometric means that we consider superpositions of surfaces; recall that $e^{-\alpha K}$ represents a surface. for $\alpha \in \mathbb N_0$ it is the $\alpha$-th power of the $sl(2,\mathbb R)$ vacuum $|0\rangle$, and for generic $\alpha \ge 0$ one can find a frame (the so-called cylinder frame) in which the surface is a strip of width $\alpha$.

what space do we want $f(\alpha)$ to belong to? obviously a space of functions would be too restrictive, as one would have no hope of representing even the vacuum, corresponding to $F(K) = e^{-K}$. the theory of distributions, developed to a large extent by laurent schwartz more than sixty years ago, is exactly what we need. let us remind the reader of some of the useful spaces introduced in the beautiful treatise [15]. schwartz introduces the following spaces:
$$\begin{array}{ccccccccccccccc}
\mathcal D & \subset & \mathcal S & \subset & \mathcal D_{L^p} & \subset & \mathcal D_{L^q} & \subset & \dot{\mathcal B} & \subset & \mathcal B & \subset & \mathcal O_M & \subset & \mathcal E \\
\cap & & \cap & & \cap & & \cap & & \cap & & \cap & & \cap & & \cap \\
\mathcal E' & \subset & \mathcal O'_C & \subset & \mathcal D'_{L^p} & \subset & \mathcal D'_{L^q} & \subset & \dot{\mathcal B}' & \subset & \mathcal B' & \subset & \mathcal S' & \subset & \mathcal D'
\end{array}$$
where in both lines $1 \le p < q < \infty$. the first line denotes spaces of infinitely differentiable functions (in general on $\mathbb R^n$, but here we restrict to $\mathbb R$) which in addition satisfy (together with all their derivatives): $\mathcal D$ are of compact support; $\mathcal S$ is the space of fast-decaying schwartz functions; $\mathcal D_{L^p}$ must also belong to $L^p$; $\mathcal B \equiv \mathcal D_{L^\infty}$ are bounded; $\dot{\mathcal B}$ are both bounded and possess a finite limit at infinity; $\mathcal O_M$ cannot grow too fast; and finally $\mathcal E$ carries no restriction on growth. on the second line we have spaces of distributions, defined as continuous linear functionals on some function space from the first line (in the case of $\mathcal O'_C$ and $\mathcal B'$ with an additional restriction). for example, $\mathcal E'$ is dual to $\mathcal E$ and represents the space of distributions with compact support. $\mathcal O'_C$ are rapidly decaying distributions, i.e. those which, together with all their derivatives, remain bounded even after multiplication by $(1+x^2)^{k/2}$ for all $k \in \mathbb R$. the space $\mathcal D'_{L^1}$ is dual to $\dot{\mathcal B}$. $\mathcal D'_{L^p}$ for $p \in (1,\infty)$ are dual to $\mathcal D_{L^{p'}}$ with $p' = p/(p-1)$. a useful characterization of the $\mathcal D'_{L^p}$ spaces for all $1 \le p \le \infty$ is that they are finite sums of derivatives of functions from $L^p$ (hence $\mathcal D_{L^p} \subset L^p \subset \mathcal D'_{L^p}$), or, equivalently, that their convolution with any $a \in \mathcal D$ belongs to $L^p$. the space $\dot{\mathcal B}'$ is dual to $\mathcal D_{L^1}$, and its distributions are characterized by convergence towards infinity. whereas $\mathcal D$ is dense in $\dot{\mathcal B}'$, it is not dense in $\mathcal B' \equiv \mathcal D'_{L^\infty}$, the space of bounded distributions. $\mathcal S'$ is the well-known schwartz space of tempered distributions (those with at most polynomial growth), dual to $\mathcal S$, and finally $\mathcal D'$ is the biggest space, of all distributions, dual to $\mathcal D$.

after this little exposé, we are ready to answer the question of which space $f(\alpha)$ in (5.19) should belong to. let us define geometric states as those for which $f$ is a laplace-transformable distribution, i.e. $f \in \mathcal D'$, but such that there exists $\xi$ for which $e^{-\xi\alpha}f \in \mathcal S'$. we could perhaps have been more generous and kept only the condition $f \in \mathcal D'$, but such states would be even more meaningless from the string field perspective. what we need are not the most general geometric states, but more restricted ones. let us define the space of $L_0$-safe geometric string field functions $F(K)$ to be those for which $f \in \mathcal D'_{L^1}$.
this condition comes from considering the states $F(K)|I\rangle$ expanded in the virasoro basis. the coefficients are given by sums of integrals of the form $\int_0^\infty d\alpha\, f(\alpha)(\alpha+1)^{-n}$ for $n = 0, 2, 4, \ldots$; for these integrals to be absolutely convergent, we must have $f \in \mathcal D'_{L^1}$. this definition gives us a very nice surprise: since the star product of string fields $F_1(K)$ and $F_2(K)$ is just multiplication, in terms of their inverse laplace transforms $f_1$ and $f_2$ it is a convolution, $f_1 * f_2(x) = \int_0^x dy\, f_1(y) f_2(x-y)$ (defined in a more sophisticated way when the $f_i$ are genuine distributions). now, it is known that the space $\mathcal D'_{L^1}$ is closed under convolution. therefore the space of $L_0$-safe geometric string fields is closed under star multiplication!

we now proceed to define the space of $\mathcal L_0$-safe geometric string field functions $F(K)$ by the condition $f \in \mathcal O'_C$. this condition comes from considering the states $F(K)|I\rangle$ expanded in the basis of $\mathcal L_0$ eigenstates (see [10] for the definition), or equivalently from expanding $F(K)$ in $\mathcal L^-$ eigenstates (see [18]). we demand that $\int_0^\infty d\alpha\, f(\alpha)\,\alpha^n$ for $n \in \mathbb N_0$ be absolutely convergent; this forces $f \in \mathcal O'_C$. again, this space is closed under convolution, and hence the space of $\mathcal L_0$-safe geometric string fields is also closed under star multiplication!

both definitions of safe string fields can be recast in terms of properties of the function $F(z) = \int_0^\infty f(\alpha)e^{-\alpha z}\,d\alpha$. the string field function $F(K)$ is

1. geometric if and only if there exists $\xi$ such that for all $z$ with $\mathrm{re}\, z > \xi$, $F(z)$ is holomorphic and $|F(z)|$ is majorized (i.e. bounded) by a polynomial in $|z|$;

2. $L_0$-safe geometric if and only if $F(z)$ is holomorphic for all $z$ with $\mathrm{re}\, z > 0$ and $|F(z)|$ is majorized by a polynomial in $|z|$ for all $\mathrm{re}\, z \ge 0$;

3. $\mathcal L_0$-safe geometric if and only if $F(z)$ is holomorphic for all $z$ with $\mathrm{re}\, z > 0$, $|F(z)|$ is majorized by a polynomial in $|z|$ for all $\mathrm{re}\, z \ge 0$, and $F(z)$ can be extended to a $C^\infty$ function on the complex half-plane $\mathrm{re}\, z \ge 0$.

the proof of the first statement can be found in textbooks, and the latter two can be established by a slight modification. to end this mathematical discussion, let us now give a few examples. the string fields $(1+K)^p$ are both $L_0$- and $\mathcal L_0$-safe geometric, since the function $(1+z)^p$ is holomorphic for $\mathrm{re}\, z > -1$ and obeys all the above conditions. the inverse laplace transform $f \in \mathcal O'_C$ can be computed easily:
$$f(\alpha) = \begin{cases} \dfrac{1}{\Gamma(-p)}\,\alpha^{-p-1}e^{-\alpha}, & p < 0, \\[2mm] \left(1+\dfrac{d}{d\alpha}\right)^{[p]+1}\left[\dfrac{1}{\Gamma([p]+1-p)}\,\alpha^{[p]-p}e^{-\alpha}\right], & p > 0,\; p \notin \mathbb N, \\[2mm] \left(1+\dfrac{d}{d\alpha}\right)^{p}\delta(\alpha), & p \in \mathbb N_0. \end{cases} \qquad (5.20)$$
here $[p]$ denotes the integer part of $p$, but in fact any larger integer can be taken. for $p \in \mathbb N_0$ the inverse laplace transform actually belongs to the smaller space $\mathcal E'$ of distributions with compact support. note that for $p > 0$, $p \notin \mathbb N$, distribution theory takes beautiful care of the singularities that would be present if one thought of the inverse laplace transform as a function. had we considered the functions $(1+\gamma^{-1}K)^p$ with $\gamma \in \mathbb R^+$, the domain of holomorphicity would change, the maximal half-plane being $\mathrm{re}\, z > -\gamma$. the inverse laplace transform for these functions is $\gamma f(\gamma\alpha)$. the closer $\gamma$ is to zero, the slower the falloff of $f$. if $\gamma$ were taken negative, the inverse laplace transform would grow exponentially (definitely not what we want in osft), which would manifest itself as singularities of $F(z)$ for $\mathrm{re}\, z > 0$.
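for $p < 0$ the first line of (5.20) can be checked directly from the euler integral $\int_0^\infty \alpha^{s-1}e^{-a\alpha}\,d\alpha = \Gamma(s)\,a^{-s}$ (valid for $\mathrm{re}\,a > 0$), with $s = -p$ and $a = 1+z$:
$$\int_0^\infty d\alpha\;\frac{\alpha^{-p-1}e^{-\alpha}}{\Gamma(-p)}\, e^{-\alpha z} = \frac{\Gamma(-p)\,(1+z)^{p}}{\Gamma(-p)} = (1+z)^p, \qquad \mathrm{re}\, z > -1 .$$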
another example is the string field $1/\sqrt{1+K^2}$. it is geometric and $L_0$-finite, but neither $L_0$- nor $\mathcal L_0$-safe. its inverse laplace transform is the bessel function $J_0(\alpha)$, which belongs to the spaces $\mathcal D'_{L^p}$ for $p > 2$. the reason for the $L_0$-finiteness is cancellations due to the oscillatory behavior of the bessel function. finally, consider the string field $\sqrt{1+K^2}$. its inverse laplace transform is $\delta'(\alpha) + \frac12\big(J_0(\alpha) + J_2(\alpha)\big)$ and belongs to $\mathcal D'_{L^1}$ but not to $\mathcal O'_C$; correspondingly, it is $L_0$-safe but not $\mathcal L_0$-safe.

6 examples

let us go through some of the simplest examples of osft algebraic solutions of the form $\Psi = F c \frac{KB}{1-F^2}\, cF$ in more detail, and let us see what the generalities of the previous section tell us. we remind the reader of the definition $\tilde F \equiv K/(1-F^2)$, in terms of which $\Psi = F\, cB\tilde F c\, F$ and the homotopy operator is $A = \tilde F^{-1} B$.

• $F(K) = a$, $a \neq 1$. the solution simplifies to $\Psi = \frac{a^2}{1-a^2}\, Q_B(Bc)$. for this object both $F/(1-F^2) = a/(1-a^2)$ and $\tilde F = \frac{K}{1-a^2}$ are regular; the solution is therefore pure gauge and describes the perturbative vacuum. the would-be homotopy operator $A = (1-a^2)B/K$ is singular. this is so because the inverse laplace transform of $1/z$ is $1$ (when restricted to $\mathbb R^+$), which belongs to neither $\mathcal O'_C$ nor $\mathcal D'_{L^1}$.

• $F(K) = \sqrt{1-\beta K}$, $\beta \neq 0$. the solution is geometric only for $\beta < 0$, but formally, for all values, one obtains $\Psi = \sqrt{1-\beta K}\;\beta^{-1}c\;\sqrt{1-\beta K}$. this is nothing but a real form of the solution (4.12), with the identification $\beta = \alpha^{-1}$. for this solution both $\tilde F = \beta^{-1}$ and the homotopy operator $A = \beta B$ are very simple and belong to our $L_0$- and $\mathcal L_0$-safe spaces. thanks to the vanishing cohomology around this vacuum, it is believed to represent the tachyon vacuum (it can also be shown to be formally gauge equivalent to it), but we have not yet succeeded in computing its energy. the reason for the difficulty is that the string field is too identity-like and gives rise to divergences in the energy correlator; perhaps rephrasing the problem in terms of distribution theory could resolve this issue.

• $F(K) = e^{-K/2}$. the solution in this case is the first-discovered analytic solution for the tachyon vacuum [10]. there is, however, one subtlety with this solution. since
$$\tilde F = \frac{K}{1-e^{-K}} = \int_0^\infty d\alpha \left(\sum_{n=0}^\infty \delta'(\alpha-n)\right) e^{-\alpha K}, \qquad (6.21)$$
the inverse laplace transform of $\tilde F$ does not vanish for large $\alpha$. consequently $\tilde f \in \mathcal B'$, and it belongs to neither $\mathcal D'_{L^1}$ nor $\mathcal O'_C$. it is therefore an example of a geometric string field that is neither $L_0$- nor $\mathcal L_0$-safe, but is nevertheless $L_0$-finite. this is also manifested by the fact that $\tilde F(z)$ has poles on the imaginary axis. there are interesting consequences of this: truncating the sum $\sum_n K e^{-nK}$ at some finite value $n = N$, one is left with a remnant $\frac{K e^{-(N+1)K}}{1-e^{-K}}$ which still contributes significantly to certain observables, in particular to the energy. this is the origin of the so-called phantom term in the tachyon vacuum solution. the homotopy operator, on the other hand, is very well defined, being both $L_0$- and $\mathcal L_0$-safe.

• $F(K) = \frac{1}{\sqrt{1+K}}$. this is the tachyon vacuum solution found by erler and the author [19]:
$$\Psi = \frac{1}{\sqrt{1+K}}\, cB(1+K)c\, \frac{1}{\sqrt{1+K}}. \qquad (6.22)$$
the homotopy operator is simply $A = B/(1+K)$, which is perfectly regular. the inverse laplace transform of $\tilde F = 1+K$ is $\tilde f = \delta(\alpha) + \delta'(\alpha)$, which actually belongs to $\mathcal E'$, and we thus see no need for a phantom term. in fact, the energy of this solution can be computed very easily:
$$E = -S = \frac{1}{6}\langle\Psi, Q\Psi\rangle = \frac{1}{6}\left\langle \big(c + Q(Bc)\big)\frac{1}{1+K}\, c\partial c\, \frac{1}{1+K}\right\rangle \qquad (6.23)$$
$$= \frac{1}{6}\int_0^\infty dt_1 \int_0^\infty dt_2\; e^{-t_1 - t_2}\left\langle c\, e^{-t_1 K}\, c\partial c\, e^{-t_2 K}\right\rangle = -\frac{1}{6\pi^2}\int_0^\infty du\, u^3 e^{-u}\int_0^1 dv\,\sin^2(\pi v) = -\frac{1}{2\pi^2}, \qquad (6.24)$$
which is the correct value, minus the tension of the d-brane, in accordance with sen's conjecture [14].
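the final step in (6.24) uses only two elementary integrals,
$$\int_0^\infty du\, u^3 e^{-u} = \Gamma(4) = 6 \qquad \text{and} \qquad \int_0^1 dv\,\sin^2(\pi v) = \frac12,$$
so that $E = -\frac{1}{6\pi^2}\cdot 6\cdot\frac12 = -\frac{1}{2\pi^2}$.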
the last correlator that we had to evaluate is indeed very simple: two ghost insertions, $c$ and $c\partial c$, on the boundary of a semi-infinite cylinder of circumference $t_1 + t_2$, separated by a distance $t_1$.

acknowledgement

i would like to thank ted erler for collaborating on many of the topics discussed in this review, and for his insightful comments. i would also like to thank the kavli institute for theoretical physics, the aspen center for physics, the simons center for geometry and physics, the yukawa institute for physics, and apctp in pohang, korea, for their hospitality while i was working on parts of the present work. this research was supported by the euryi grant gacr eyi/07/e010 from eurohorc and esf.

references

[1] sen, a.: on the background independence of string field theory. nucl. phys. b 345 (1990) 551.
[2] sen, a., zwiebach, b.: a proof of local background independence of classical closed string field theory. nucl. phys. b 414 (1994) 649 [arxiv:hep-th/9307088].
[3] witten, e.: noncommutative geometry and string field theory. nucl. phys. b 268, 253 (1986).
[4] taylor, w., zwiebach, b.: d-branes, tachyons, and string field theory. arxiv:hep-th/0311017.
[5] sen, a.: tachyon dynamics in open string theory. arxiv:hep-th/0410103.
[6] thorn, c. b.: string field theory. phys. rept. 175, 1 (1989).
[7] bonora, l., maccaferri, c., mamone, d., salizzoni, m.: topics in string field theory. arxiv:hep-th/0304270.
[8] fuchs, e., kroyter, m.: analytical solutions of open string field theory. arxiv:0807.4722 [hep-th].
[9] okawa, y.: comments on schnabl's analytic solution for tachyon condensation in witten's open string field theory. jhep 0604 (2006) 055 [arxiv:hep-th/0603159].
[10] schnabl, m.: analytic solution for tachyon condensation in open string field theory. adv. theor. math. phys. 10 (2006) 433 [arxiv:hep-th/0511286].
[11] erler, t.: split string formalism and the closed string vacuum. jhep 0705 (2007) 083 [arxiv:hep-th/0611200].
[12] erler, t.: split string formalism and the closed string vacuum. ii. jhep 0705 (2007) 084 [arxiv:hep-th/0612050].
[13] ellwood, i., schnabl, m.: proof of vanishing cohomology at the tachyon vacuum. jhep 0702 (2007) 096 [arxiv:hep-th/0606142].
[14] sen, a.: universality of the tachyon potential. jhep 9912 (1999) 027 [arxiv:hep-th/9911116].
[15] schwartz, l.: théorie des distributions. hermann, paris, 1966.
[16] erler, t.: to appear.
[17] rastelli, l.: comments on the open string c*-algebra. talk at the simons center for geometry and physics workshop on string field theory, stony brook, march 23–27, 2009.
[18] okawa, y., rastelli, l., zwiebach, b.: analytic solutions for tachyon condensation with general projectors. arxiv:hep-th/0611110.
[19] erler, t., schnabl, m.: a simple analytic solution for tachyon condensation. jhep 0910 (2009) 066 [arxiv:0906.0979 [hep-th]].

martin schnabl
e-mail: schnabl.martin@gmail.com
institute of physics as cr
na slovance 2, prague 8, czech republic

pitched blade turbine efficiency at particle suspension

d. ceres, t. jirout, f. rieger

abstract

mixing suspensions is a very important hydraulic operation. the pitched six-blade turbine is a widely-used axial-flow impeller. this paper deals with the effect of relative impeller size and particle content on the efficiency of a pitched six-blade turbine at particle suspension. two pitched six-blade turbines were used in model measurements of the just-suspension impeller speed.
the ratios of the vessel to agitator diameter $D/d$ were 3 and 4.5. the measurements were carried out in a dish-bottomed vessel 300 mm in diameter. the just-suspension impeller speeds were measured using an electrochemical method, and were checked visually. a 2.5 % nacl water solution was used as the liquid phase, and glass particles with four equivalent diameters between 0.18 and 0.89 mm and volumetric concentrations from 2.5 % to 40 % were used as the solid phase. the criterion values $\pi_s = Po\sqrt{Fr'^3\,(d/D)^7}$ were calculated from the particle suspension and power consumption measurements. the dependence of $\pi_s$ on the particle content $c_v$ shows that larger agitators are more efficient at higher particle content.

keywords: pitched blade turbine, particle suspension, agitator efficiency.

1 introduction

mixing suspensions is a very important hydraulic operation. suspensions are frequently mixed when dispersions are prepared or homogenised, and in mass transfer operations between solid particles and a liquid, often accompanied by a chemical or biochemical reaction. it is estimated that about 60 % of mixing involves heterogeneous particulate solid-liquid systems.

fig. 1

numerous papers on particle suspension in agitated vessels have been published; reviews have been given by rieger and ditl [1] and, more recently, by kasat and pandit [2]. these authors show that axial-flow impellers are generally considered the most suitable agitators in such cases. the pitched six-blade turbine shown in fig. 1 is one of the most widely-used axial-flow impellers. the present paper deals with the effect of relative impeller size and particle content on the efficiency of a pitched six-blade turbine at particle suspension. this problem was also the topic of our earlier papers [3, 4], in which attention was focused on relative impeller size, and measurements were carried out for two particle contents only.

2 theoretical background

in order to design mixing apparatuses, it is important to know the reference state of just off-bottom particle suspension, which is often defined as the state at which no particle remains in contact with the vessel bottom for longer than a certain time. the impeller speed corresponding to this state is referred to as the critical (just-suspended) impeller speed $n_c$. on the basis of an inspection analysis of the equation of continuity, the navier-stokes equation and the equation expressing the balance of forces acting on a suspended particle, rieger and ditl [1] proposed the following relationship linking the modified froude number $Fr'$, the dimensionless particle diameter $d_p/d$ and the mean volumetric concentration of the solid phase $c_v$:
$$Fr' = \frac{n_c^2\, d\, \rho}{g\, \Delta\rho} = f\!\left(\frac{d_p}{d},\, c_v\right). \qquad (1)$$
this relation holds for geometrically similar mixing equipment and a turbulent regime. the results of critical (just-suspended) impeller speed measurements for a given solid-phase concentration $c_v$ can be correlated in the power form
$$Fr' = C\left(\frac{d_p}{d}\right)^{\gamma}. \qquad (2)$$
the values of the coefficient $C$ and the exponent $\gamma$ depend on the particle volumetric concentration $c_v$. a mathematical description of these dependencies was proposed by rieger [5, 6] in the form
$$C = a\,\exp(b\, c_v) \qquad (3)$$
and
$$\gamma = \alpha + \beta\, c_v. \qquad (4)$$
the dimensionless criterion
$$\pi_s = Po\sqrt{Fr'^3\,(d/D)^7} \qquad (5)$$
was proposed in [7] for comparing the agitator power consumption necessary for the suspension of solid particles.
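a small numeric sketch of this procedure (all data values below are invented for illustration; the paper's actual measurements are reported in [9]): eq. (2) becomes a straight line in log-log coordinates, so $C$ and $\gamma$ follow from linear regression, and eq. (5) then converts a measured power number into the suspension efficiency criterion.

```python
import numpy as np

# hypothetical just-suspended speeds n_c [1/s] for four particle sizes d_p [m]
d, D = 0.1, 0.3                  # agitator and vessel diameter [m] (D/d = 3)
rho, drho = 1018.0, 1480.0       # liquid density, solid-liquid difference [kg/m3] (assumed)
g = 9.81
d_p = np.array([0.18e-3, 0.36e-3, 0.56e-3, 0.89e-3])
n_c = np.array([10.2, 12.1, 13.5, 15.0])          # illustrative values only

Fr = n_c**2 * d * rho / (g * drho)                # eq. (1)
gamma, lnC = np.polyfit(np.log(d_p / d), np.log(Fr), 1)   # fit of eq. (2)
C = np.exp(lnC)

Po = 1.6                                          # power number (assumed)
pi_s = Po * np.sqrt(Fr**3 * (d / D)**7)           # eq. (5)
print(C, gamma, pi_s)
```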
3 experimental

two pitched six-blade turbines with a pitch angle of 45° and a blade width of $0.2\,d$ were used in model measurements of the just-suspension impeller speed. the ratios of the vessel to agitator diameter $D/d$ were 3 and 4.5. the measurements were carried out in a dish-bottomed vessel 300 mm in diameter. the height of the impellers above the vessel bottom was $0.5\,d$. the impellers were operated to pump the liquid down toward the bottom of the vessel. the vessels were equipped with four radial baffles of width $b = 0.1\,D$. the height of the liquid level was equal to the vessel diameter, $H = D$. the just-suspension impeller speeds were measured by an electrochemical method, described e.g. in [8], and were checked visually. a 2.5 % nacl water solution was used as the liquid, and glass particles with four equivalent diameters between 0.18 and 0.89 mm and volumetric concentrations from 2.5 % to 40 % were used as the solid phase.

4 results

the dependences of the coefficient $C$ and the exponent $\gamma$ on the particle volumetric concentration $c_v$ for both $D/d$ values were presented in [9]. the plot of the exponent $\gamma$ against the particle volumetric concentration $c_v$ in fig. 2 shows that $\gamma$ rises linearly with increasing $c_v$. the dependence of the coefficient $C$ on the particle concentration $c_v$, see fig. 3, shows that it can be approximated in semi-logarithmic coordinates by straight lines. this is in agreement with eqs. (3) and (4).

fig. 2
fig. 3

the values of the criterion $\pi_s$ were calculated from the results of the particle suspension measurements (eqs. (2–4)) and from the results of the power consumption measurements presented in [7, 10, 11]. the results are presented in figs. 4 and 5. fig. 4, for smaller particles, shows that at low particle content the smaller agitator needs less power for particle suspension, while the converse is true at higher particle content. fig. 5 shows that for larger particles at low particle content, the two agitators need practically the same power for particle suspension; at higher particle content, the larger agitator is again more advantageous.

fig. 4
fig. 5

we can conclude that larger agitators are more efficient at higher particle content. this is in agreement with the conclusions presented earlier [3, 4].

5 symbols

a, b: constants in eq. (3)
$c_v$: volumetric concentration of particles
$C$: coefficient in eq. (2)
$d$: agitator diameter
$d_p$: particle diameter
$D$: vessel diameter
$Fr'$: modified froude number defined by eq. (1)
$g$: gravitational acceleration
$n$: agitator speed
$n_c$: critical agitator speed
$Po$: power number, $Po = \dfrac{P}{\rho n^3 d^5}$
$\alpha$, $\beta$: constants in eq. (4)
$\gamma$: exponent in eq. (2)
$\pi_s$: dimensionless criterion defined by eq. (5)
$\rho$: liquid density
$\Delta\rho$: solid-liquid density difference

acknowledgement

this project was carried out with financial support from the ministry of industry and trade of the czech republic (project number fr-ti1/005).

references

[1] rieger, f., ditl, p.: chem. eng. sci., 49, 2219, 1994.
[2] kasat, g. r., pandit, a. b.: can. j. chem. eng., 83, 618, 2005.
[3] rieger, f., ditl, p.: zeszyty naukowe politechniki lódzkiej, inżynieria chemiczna i procesowa. lódž, politechnika lódzka, 1997, 181. issn 0137-2602.
[4] rieger, f., ditl, p.: the 4th international symposium on mixing in industrial processes. toulouse, progep-ismip 4, 458, 2001.
[5] rieger, f.: chem. eng. j., 79, 171, 2000.
[6] rieger, f.: chem. eng. proces., 41, 381, 2002.
[7] rieger, f.: proceedings of the vi. polish seminar on mixing, krakow, 1993, 79.
[8] jirout, t., moravec, j., rieger, f., sinevič, v., špidla, m., sobolík, v., tihon, j.: inż. chem. proc. (chemical and process engineering), 26, no. 3, 485, 2005.
[9] ceres, d., moravec, j., jirout, t., rieger, f.: inż. ap. chem., 49, nr. 1, 25, 2010.
[10] medek, j.: proceedings of the czech conference on mixing, brno, 1982, 127.
[11] ceres, d., moravec, j., jirout, t., rieger, f.: proceedings of chisa 2009.

ing. dorin ceres
doc. ing. tomáš jirout, ph.d.
prof. ing. františek rieger, drsc.
phone: +420 224 352 548
e-mail: frantisek.rieger@fs.cvut.cz
czech technical university in prague
faculty of mechanical engineering
department of process engineering
technická 4, 166 07 prague 6, czech republic

boron doped nanocrystalline diamond films for biosensing applications

v. petrák, j. krucký, m. vlčková

abstract

with the rise of antibiotic resistance in pathogenic bacteria, there is an increased demand for monitoring the functionality of bacterial membranes, the disruption of which can be induced by peptide-lipid interactions. in this work we attempt to construct and disrupt supported lipid membranes (slb) on boron doped nanocrystalline diamond (b-ncd). electrochemical impedance spectroscopy (eis) was used to study in situ the changes related to lipid membrane formation and to disruption by peptide-induced interactions. the observed impedance changes were minimal for oxidized b-ncd samples, but were still detectable in the low-frequency part of the spectra. the sensitivity for the detection of membrane formation and disruption was significantly higher for hydrogenated b-ncd surfaces. data modeling indicates large changes in the electrical charge when an electrical double layer is formed at the b-ncd/slb interface, governed by ion absorption. by contrast, for oxidized b-ncd surfaces these changes are negligible, indicating little or no change in the surface band-bending profile.

keywords: biosensor, nanocrystalline diamond, electrochemical impedance spectroscopy.

1 introduction

the increase in antibiotic resistance of pathogenic bacterial strains has spurred the development of novel antibiotics. one promising solution to these problems is the group of antibiotics based on antimicrobial peptides, which are an abundant and diverse group of molecules produced by many tissues and cell types in a variety of plant and animal species. their amino acid composition, amphipathicity, cationic charge and size allow them to attach to membrane bilayers and disrupt the membrane by forming pores [1]. they do not target specific molecular receptors on the microbial surface, but rather interact directly with microbial membranes, which they can rapidly permeabilize. the monitoring of the specific peptide-lipid interactions of antibiotic peptides, which affect the functionality of bacterial membranes, can play an important role in the research of new antibiotics [2].

supported lipid bilayers (slbs) are investigated as model systems of biological membranes. they are composed of a lipid bilayer adsorbed on the surface of a solid substrate. in past decades, lipid membranes on solid substrates have attracted considerable interest, from the point of view of both fundamental and applied science. these structures have been extensively used to study the structure and properties of native biological membranes and to investigate biological processes such as molecular recognition, enzymatic catalysis, membrane fusion and cell adhesion [3]. in addition, several applications based on lipid membranes have been developed, including the design of biosensors.
a well-established technique for the formation of slbs is the langmuir-blodgett technique, which is carried out by pulling a hydrophilic substrate through a lipid monolayer and sequentially pushing it horizontally through another lipid monolayer [4]. a second commonly employed technique for forming slbs is vesicle fusion, in which a supported bilayer is formed by the adsorption and fusion of vesicles from an aqueous suspension onto the solid substrate surface [5]. commonly used substrates for slbs are mica, fused silica and glass; other substrates, such as silicon, sio2, platinum and gold, have also been reported. in the case of diamond, slbs can be formed on an oxidized hydrophilic surface as well as on a hydrogenated hydrophobic surface [6]. diamond exhibits several special properties, such as good biocompatibility and a large electrochemical potential window; these properties make diamond particularly suitable for biosensing [7]. in this application, a boron doped nanocrystalline diamond (b-ncd) film serves as a solid support for slbs and as an active electrode for electrochemical impedance spectroscopy (eis) measurements of melittin-induced membrane disruption. eis has been used successfully for the detection of disruption by melittin on free-standing lipid bilayers as well as on slbs on gold surfaces. ang et al. were able to detect membrane disruption of slbs caused by magainin ii on optically transparent diamond [6]. the eis detection of membrane disruption by the antimicrobial peptide ll-37 has also been demonstrated. this work focuses on the effect of surface termination on the detection abilities of a b-ncd film.

in the present work we have constructed a simple sensor for detecting the disruption of slbs formed on a semi-metallically boron doped ncd electrode, which serves as the working electrode. the slbs are disrupted by the membrane-active peptide melittin. we report the results of eis measurements of membrane disruption on hydrogenated and oxidized surfaces, and discuss the influence of b-ncd surface termination on the sensitivity of the sensor.

2 experimental

planar sensor electrodes were prepared by microwave plasma-enhanced chemical vapor deposition (mw pe-cvd) from methane/hydrogen mixtures in an astex reactor, as described in [8]. the substrates were 2-inch silicon wafers (thickness 550 μm, crystalline orientation (100), p-type doped with boron, resistivity from 1 to 20 ω·cm), which were diced into samples 10 mm by 10 mm after deposition. the diamond layers had a typical thickness of 150 nm with an average grain size of 50 nm, as determined by x-ray diffraction and atomic force microscopy. to ensure good electrical conductivity of the diamond layer, the cvd deposition was performed with an admixture of trimethylboron to the ch4 gas, with a concentration ratio of 200 ppm b/c. the b-ncd samples served as the working electrode in our home-made set-up, which allows impedance read-out. prior to the measurements, the diamond samples were either hydrogenated in h2 plasma (50 torr, 800 °c, power 4000 w, duration 15 min) or oxidized by uv-ozone for 30 minutes. the resulting contact wetting angles were 95° ± 2° for the hydrogenated diamond and 14° ± 3° for the oxidized diamond. the b-ncd samples were cleaned in a mixture of h2so4/kno3 (2 : 1 wt%) heated to 250 °c for 10 minutes; the samples were then rinsed in deionized water and dried under a stream of nitrogen.
fig. 1: a schematic cross-section of the setup for eis measurements

the b-ncd samples were mounted on a copper backing contact, using an electrically conductive eutectic transfer tape. rubber o-rings (viton) with an inner diameter of 6 mm were pressed between the active electrode and the body of the sensor, forming a cell with a total inner volume of 160 μl. the cell was filled with 140 μl of 10 mm hepes buffer solution. a gold counter electrode 500 μm in diameter was immersed in the solution. the working and counter electrodes were connected to a 4194a impedance/gain-phase analyser (hewlett packard, usa) with shielded cables. after a stable signal was obtained at 25 °c, a 100 μm solution of dopc:dops (1 : 4) liposomes with negative charge was added. the membrane was formed by the vesicle fusion method; lipid membrane formation was completed within 30 minutes. for membrane disruption, 2 μm of the active amphipathic α-helical peptide melittin was added. for the impedance measurement, a 10 mv ac potential signal (u) was applied and the resulting ac current (i) was measured. every 15 seconds, a sweep of 50 frequencies ranging from 100 hz to 1 mhz was performed.

2.1 equivalent circuit

the equivalent circuit used for modeling the eis data has been used for diamond to model the processes in the sensor in several other applications [9]. the equivalent circuit, shown in figure 2, can be divided into three components. (1) the first component is the series resistance rs; this comprises the solution and electrode resistance between the gold and the b-ncd working electrode. (2) the second component is a parallel combination of a resistance r1 and a constant phase element q1, and corresponds to the double layer on the surface of the electrode. (3) the third component, which corresponds to the space-charge region in the b-ncd, also consists of a parallel combination of a resistance r2 and a constant phase element q2. the data were fitted to the model in zsimpwin (princeton applied research, usa). the quality of the fit is determined by the χ² test; if the result is below 1 · 10⁻³, it is a good assumption that the model used is correct.

fig. 2: the equivalent circuit used for modeling, divided into components. r represents a resistance; q is a constant phase element
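the impedance of this circuit is easy to write down explicitly. the following python sketch is ours, not part of the original work; the parameter values are merely of the order of the fits reported in section 3, and it generates the kind of model spectrum that zsimpwin fits to the measured data:

```python
import numpy as np

def Z_cpe(Q, n, omega):
    """impedance of a constant phase element: Z = 1 / (Q * (j*omega)^n)."""
    return 1.0 / (Q * (1j * omega) ** n)

def Z_model(omega, Rs, R1, Q1, n1, R2, Q2, n2):
    """series resistance plus two parallel R||CPE blocks (circuit of fig. 2)."""
    Z1 = 1.0 / (1.0 / R1 + 1.0 / Z_cpe(Q1, n1, omega))   # double layer
    Z2 = 1.0 / (1.0 / R2 + 1.0 / Z_cpe(Q2, n2, omega))   # space-charge region
    return Rs + Z1 + Z2

omega = 2 * np.pi * np.logspace(2, 6, 50)   # 100 Hz .. 1 MHz sweep
Z = Z_model(omega, Rs=99.0, R1=28e3, Q1=141e-9, n1=0.95,
            R2=7.3e3, Q2=1.35e-9, n2=0.95)  # n close to 1, as reported
# nyquist data for plotting: (Z.real, -Z.imag)
```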
3 results

the data series were fitted over the total measured frequency range from 100 hz to 1 mhz. the resistance of the solution was calculated from 8 measurements and found to be 99 ω ± 12 ω. the constant phase elements in the equivalent circuit showed values of n close to 1, indicating the capacitive character of these circuit elements.

3.1 slb on oxidized b-ncd

the lipid membrane was measured directly during its formation, and subsequently the melittin-induced disruption was measured by eis. the modeling showed significant changes in the equivalent circuit related to the formation of the lipid membrane. the nyquist plot in figure 3 consists of a semicircle in the higher-frequency part of the spectra, which represents the space-charge region, and a second semicircle at lower frequencies, which corresponds to the interface capacitance. the change in the absolute value of the impedance upon formation of the membrane on the surface was low; however, the change was detectable at low frequencies, as can be seen in the nyquist plot in figure 3. the absolute impedance value at a frequency of 255 hz rose from 1453 to 1540 ω.

fig. 3: nyquist plot showing the initial state prior to the addition of liposomes, the state after membrane formation, and the state after the addition of melittin. fits to the equivalent circuit are indicated with solid lines

after membrane formation, the free liposomes in the solution were flushed out with 1 mm hepes buffer. the changes in impedance were minimal over the entire frequency range, and the value at 255 hz remained at 1540 ω; the same holds for the equivalent circuit values, which remained almost unchanged. after the addition of melittin, the difference in the high-frequency part was minimal; however, a small change was observed in the impedance spectra: the absolute value of the impedance at 255 hz decreased from 1540 to 1461 ω. the maximum change of the absolute impedance value during the whole measurement was only 5 %, observed during the disruption.

3.2 slb on hydrogenated b-ncd

the first curve in figure 4 shows the initial state, when only hepes buffer was present, and the second curve shows the state after the addition of liposomes. an increase in the size of the semicircle corresponding to the molecular bilayer on the b-ncd surface can be seen in the nyquist plot. the modeling showed that the main detectable change in the equivalent circuit was a decrease in the resistance r1 from 55 to 28 kω, together with a change in the capacitance of the constant phase element q1 from 279 nf to 141 nf. in the r2 ‖ q2 branch, the capacitance of q2 changed from 0.45 nf to 1.38 nf. the absolute value of the impedance at a frequency of 4.9 khz rose from 4.4 to 7.6 kω, i.e. to about 175 % of its value prior to the addition.

fig. 4: nyquist plot representing the changes in impedance: the state before the addition of liposomes, after membrane formation, and after the addition of melittin. fits to the equivalent circuit are indicated with solid lines. the change after the addition of liposomes into the solution and after membrane disruption is clearly visible

redundant free liposomes in the solution were subsequently flushed out with 1 mm hepes buffer. the flush did not result in any significant change in the impedance characteristic; the absolute value of the impedance changed only from 7.6 to 7.5 kω, a 2 % change at 4.9 khz. the membrane-active peptide melittin was added after the system had stabilized. the absolute value of the impedance at a frequency of 5 khz decreased from 7.5 to 5.2 kω, i.e. to about 68 % of the value before the addition of melittin. data modeling using the equivalent circuit showed a change in the values of the r1 ‖ q1 elements: the resistance r1 increased from 28 to 41 kω and the capacitance of q1 increased from 141 nf to 356 nf. in contrast, the resistance r2 decreased from 7.3 to 5.1 kω, and the capacitance of q2 changed from 1.35 nf to 0.83 nf.

3.3 comparison of hydrogenated and oxidized surfaces

the main difference in sensitivity can be attributed to the difference between the hydrogenated and oxidized surfaces. in the case of a hydrogenated surface, the band bending is upwards, and the addition of a negatively charged slb at the surface leads to increased band bending. by contrast, in the case of an oxidized surface the band bending is downwards, and the addition of an slb should reduce the surface band bending.
however, the important fact is that we are working with b-ncd, in which free holes are present. when we add the negatively charged slb on a hydrogenated surface and increase the negative charge at the surface, the holes from the b-ncd diffuse to the b-ncd surface, leading to a large change in the impedance of the system. on the other hand, when we work with an oxidized surface there are no free electrons in b-ncd, and therefore the change in the surface band bending is limited. this is the main reason why the b-ncd sensor will work much more effectively with hydrogenated surfaces.

4 conclusions

an impedimetric characterization of membrane formation and disruption on hydrogenated and oxidized b-ncd surfaces has been carried out. for a hydrogenated surface, significant changes have been observed in the properties of the b-ncd/slb interface on interacting with the membrane-active peptides. data modeling indicated large changes in the electrical charge occurring at the diamond surface, and also the creation of an electric double layer at the b-ncd/slb interface, which is governed by ion absorption. by contrast, for an oxidized b-ncd surface these changes are negligible. this indicates that there are few or no changes to the surface band bending profile.

acknowledgement

the research described in this paper was supervised by prof. patrick wagner from hasselt university and prof. milos nesladek from the faculty of biomedical engineering, czech technical university in prague. financial support from the academy of sciences of the czech republic (grants kan200100801 & kan400480701), cost mp0901 — nanotp, msm6840770012 “transdisciplinary research in the field of biomedical engineering ii”, and ctu (grant no. ctu 10/811700) is gratefully acknowledged. the erasmus student exchange programme is also gratefully acknowledged.

references

[1] izadpanah, a., gallo, r. l.: antimicrobial peptides, journal of the american academy of dermatology, vol. 52, no. 3, p. 381–390.
[2] castellana, e. t., cremer, p. s.: solid supported lipid bilayers, surface science reports, vol. 61, no. 10, p. 429–444.
[3] dufrene, y. f., lee, g. u.: advances in the characterization of supported lipid films with the atomic force microscope, biochimica et biophysica acta – biomembranes, vol. 1509, no. 1–2, p. 14–41.
[4] solletti, j. m., botreau, m., sommer, f., brunat, w. l., kasas, s., duc, t. m., celio, m. r.: elaboration and characterization of phospholipid langmuir-blodgett films, langmuir, vol. 12, no. 22, p. 5379–5386.
[5] kalb, e., frey, s., tamm, l. k.: formation of supported planar bilayers by fusion of vesicles to supported phospholipid monolayers, biochimica et biophysica acta, vol. 1130, no. 2, p. 307–316.
[6] ang, p. k., loh, k. p., wohland, t., nesladek, m., van hove, e.: supported lipid bilayer on nanocrystalline diamond: dual optical and field-effect sensor for membrane disruption, advanced functional materials, vol. 19, p. 109–116.
[7] nebel, c. e., rezek, b., shin, d., uetsua, h., yang, n.: diamond and biology, journal of the royal society interface, vol. 4, p. 439–446.
[8] williams, o., nesladek, m., et al.: growth, electronic properties and applications of nanodiamond, diamond and related materials, vol. 17, no. 7–10, p. 1080–1088.
[9] vermeeren, v., daenen, m., grieten, l., et al.: diamond-based dna sensors: surface functionalization and read-out strategies, physica status solidi a, vol. 260, no. 3, p. 520 (2009).

about the authors

václav petrák was born in prague, czech republic.
he graduated with a master degree from the faculty of biomedical engineering of the czech technical university in prague in 2010. in 2008 he joined the department of functional materials at the institute of physics of the academy of sciences of the czech republic. in 2010 he worked for 5 months at the institute for material research in hasselt (belgium) during his erasmus student exchange. he is currently working on his doctoral thesis. his professional interests are nanocrystalline diamond, nanodiamond particles, biosensors, and also neural circuits and data analysis. his personal interests are family, traveling and photography.

jaroslav krucký was born in prague, czech republic. he graduated with a bachelor degree from the faculty of biomedical engineering of the czech technical university in prague in 2009. in 2009 he joined the department of functional materials at the institute of physics of the academy of sciences. he is currently working on methods for preparing nd thin films with selective growth. in the future, he is planning to work on diamond micro-electrode arrays for in vivo and in vitro neural measurements.

marie vlčková graduated with a bachelor degree from the faculty of biomedical engineering of the czech technical university in prague in 2010. she is continuing her studies at fbmi, and is currently working on optimization of the surface pretreatment of substrates for diamond cvd deposition. her professional interests are biomedical applications of diamond films.

václav petrák, e-mail: vaclav.petrak@fbmi.cvut.cz
jaroslav krucký
marie vlčková
department of biomedical technology, faculty of biomedical engineering, czech technical university, kladno, czech republic

ito-sadahiro numbers vs. parry numbers

z. masáková, e. pelantová

abstract

we consider a positional numeration system with a negative base, as introduced by ito and sadahiro. in particular, we focus on the algebraic properties of negative bases −β for which the corresponding dynamical system is sofic, which happens, according to ito and sadahiro, if and only if the (−β)-expansion of $-\frac{\beta}{\beta+1}$ is eventually periodic. we call such numbers β ito-sadahiro numbers, and we compare their properties with those of parry numbers, which occur in the same context for the rényi positive base numeration system.

keywords: numeration systems, negative base, pisot number, parry number.

1 introduction

the expansion of a real number in the positional number system with base β > 1, as defined by rényi [12], is closely related to the transformation $T : [0,1) \to [0,1)$ given by the prescription $T(x) := \beta x - \lfloor \beta x \rfloor$. every $x \in [0,1)$ is the sum of the infinite series
$$x = \sum_{i=1}^{\infty} \frac{x_i}{\beta^i}, \quad \text{where } x_i = \lfloor \beta T^{i-1}(x) \rfloor \qquad (1)$$
for i = 1, 2, 3, . . . directly from the definition of the transformation T we can derive that the ‘digits’ $x_i$ take values in the set $\{0, 1, 2, \ldots, \lceil \beta \rceil - 1\}$ for i = 1, 2, 3, . . . the expression of x in the form (1) is called the β-expansion of x. the number x is thus represented by the infinite word $d_\beta(x) = x_1 x_2 x_3 \ldots \in \mathcal{A}^{\mathbb{N}}$ over the alphabet $\mathcal{A} = \{0, 1, 2, \ldots, \lceil \beta \rceil - 1\}$. from the definition of the transformation T we can derive another important property, namely that the ordering on real numbers is carried over to the ordering of β-expansions. in particular, we have for $x, y \in [0,1)$ that
$$x \le y \iff d_\beta(x) \preceq d_\beta(y),$$
where $\preceq$ is the lexicographical order on $\mathcal{A}^{\mathbb{N}}$ (the ordering on the alphabet $\mathcal{A}$ is the usual one, $0 < 1 < 2 < \cdots < \lceil \beta \rceil - 1$).
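as an illustration of the transformation T, the following is a minimal sketch (not from the paper) that generates β-expansion digits by iterating $T(x) = \beta x - \lfloor \beta x \rfloor$; floating-point rounding limits how many digits are reliable.

```python
import math

def beta_expansion(x, beta, n_digits=20):
    """digits x_i = floor(beta * t^{i-1}(x)) of the beta-expansion of x in [0, 1),
    generated by iterating t(x) = beta*x - floor(beta*x)."""
    digits = []
    for _ in range(n_digits):
        y = beta * x
        d = math.floor(y)
        digits.append(d)
        x = y - d            # t(x)
    return digits

# golden mean base: d_beta(1 - eps) approaches the word (10)^omega
phi = (1 + 5 ** 0.5) / 2
print(beta_expansion(0.9999, phi, 12))   # [1, 0, 1, 0, ...]
```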
in [11], parry provided a criterion which decides whether or not an infinite word in $\mathcal{A}^{\mathbb{N}}$ is a β-expansion of some real number x. the criterion is formulated using the so-called infinite expansion of 1, denoted by $d^*_\beta(1)$, defined as a limit in the space $\mathcal{A}^{\mathbb{N}}$ equipped with the product topology, by
$$d^*_\beta(1) := \lim_{\varepsilon \to 0^+} d_\beta(1-\varepsilon).$$
according to parry, the string $x_1x_2x_3\ldots \in \mathcal{A}^{\mathbb{N}}$ represents the β-expansion of a number $x \in [0,1)$ if and only if
$$x_i x_{i+1} x_{i+2} \ldots \prec d^*_\beta(1) \qquad (2)$$
for every i = 1, 2, 3, . . . condition (2) ensures that the set $d_\beta = \{d_\beta(x) \mid x \in [0,1)\}$ is shift invariant, and so the closure of $d_\beta$ in $\mathcal{A}^{\mathbb{N}}$, denoted by $s_\beta$, is a subshift of the full shift $\mathcal{A}^{\mathbb{N}}$. the notion of β-expansion can naturally be extended to all non-negative real numbers: the expression of a positive real number y in the form
$$y = y_k\beta^k + y_{k-1}\beta^{k-1} + y_{k-2}\beta^{k-2} + \ldots, \qquad (3)$$
where $k \in \mathbb{Z}$ and $y_k y_{k-1} y_{k-2} \ldots \in d_\beta$, is called the β-expansion of y. real numbers y having vanishing digits $y_i$ for all i < 0 in the β-expansion of |y| are usually called β-integers, and the set of β-integers is denoted by $\mathbb{Z}_\beta$. the notion of β-integers was first considered in [3] as an aperiodic structure modeling non-crystallographic materials with long range order, called quasicrystals. numbers y with finitely many non-zero digits in the β-expansion of |y| form the set denoted by fin(β). the choice of the base β > 1 strongly influences the properties of β-expansions. it turns out that an important role among bases is played by those numbers β for which $d^*_\beta(1)$ is eventually periodic. parry himself called these bases beta-numbers; now these numbers are commonly called parry numbers. we can demonstrate the exceptional properties of parry numbers by two facts:

• the subshift $s_\beta$ is sofic if and only if β is a parry number [6].
• distances between consecutive β-integers take finitely many values if and only if β is a parry number [15].

recently, ito and sadahiro [5] suggested a study of positional numeration systems with a negative base −β, where β > 1. the representation of real numbers in such a system is defined using the transformation $T : [l_\beta, r_\beta) \to [l_\beta, r_\beta)$, where
$$l_\beta = -\frac{\beta}{\beta+1}, \qquad r_\beta = 1 + l_\beta = \frac{1}{1+\beta}, \qquad T(x) := -\beta x - \lfloor -\beta x - l_\beta \rfloor. \qquad (4)$$
every real $x \in I_\beta := [l_\beta, r_\beta)$ can be written as
$$x = \sum_{i=1}^{\infty} \frac{x_i}{(-\beta)^i}, \qquad (5)$$
where $x_i = \lfloor -\beta T^{i-1}(x) - l_\beta \rfloor$ for i = 1, 2, 3, . . . the above expression is called the (−β)-expansion of x. it can also be written as the infinite word $d_{-\beta}(x) = x_1x_2x_3\ldots$ we can easily show from (4) that the digits $x_i$, i ≥ 1, take values in the set $\mathcal{A} = \{0, 1, 2, \ldots, \lfloor \beta \rfloor\}$. in this case, the ordering on the set of infinite words over the alphabet $\mathcal{A}$ which corresponds to the ordering of real numbers is the so-called alternate ordering: we say that $x_1x_2x_3\ldots \prec_{alt} y_1y_2y_3\ldots$ if for the minimal index j such that $x_j \ne y_j$ it holds that $x_j(-1)^j < y_j(-1)^j$. in this notation, we can write for arbitrary $x, y \in I_\beta$ that
$$x \le y \iff d_{-\beta}(x) \preceq_{alt} d_{-\beta}(y).$$
in their paper, ito and sadahiro provided a criterion to decide whether an infinite word over $\mathcal{A}$ belongs to the set of (−β)-expansions, i.e. to the set $d_{-\beta} = \{d_{-\beta}(x) \mid x \in I_\beta\}$. this time, the criterion is given in terms of two infinite words, namely $d_{-\beta}(l_\beta)$ and
$$d^*_{-\beta}(r_\beta) := \lim_{\varepsilon \to 0^+} d_{-\beta}(r_\beta - \varepsilon).$$
these two infinite words are closely related: if $d_{-\beta}(l_\beta)$ is purely periodic with odd period length, i.e. $d_{-\beta}(l_\beta) = (d_1 d_2 \ldots d_{2k+1})^{\omega}$, then we have $d^*_{-\beta}(r_\beta) = \big(0\, d_1 d_2 \ldots (d_{2k+1}-1)\big)^{\omega}$ (as usual, the notation $w^{\omega}$ stands for the infinite repetition of the string w); in all other cases we have $d^*_{-\beta}(r_\beta) = 0\, d_{-\beta}(l_\beta)$.
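the (−β)-digits can be generated the same way as in the positive-base case, now iterating the transformation from (4). the sketch below is an illustration under the same floating-point caveat as before; the golden-mean check at the end is a computation of this sketch, not a statement from the paper.

```python
import math

def minus_beta_expansion(x, beta, n_digits=20):
    """digits x_i = floor(-beta * t^{i-1}(x) - l) of the (-beta)-expansion of
    x in [l, 1 + l), l = -beta/(beta + 1), via t(x) = -beta*x - floor(-beta*x - l)."""
    l = -beta / (beta + 1)
    assert l <= x < 1 + l, "x must lie in [l_beta, r_beta)"
    digits = []
    for _ in range(n_digits):
        y = -beta * x
        d = math.floor(y - l)
        digits.append(d)
        x = y - d            # t(x)
    return digits

# golden mean: the expansion of l_beta itself comes out as 1 0 0 0 ...
phi = (1 + 5 ** 0.5) / 2
print(minus_beta_expansion(-phi / (phi + 1), phi, 10))
```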
ito and sadahiro have shown that an infinite word $x_1x_2x_3\ldots$ represents a (−β)-expansion of some $x \in [l_\beta, r_\beta)$ if and only if for every i ≥ 1 it holds that
$$d_{-\beta}(l_\beta) \preceq_{alt} x_i x_{i+1} x_{i+2} \ldots \prec_{alt} d^*_{-\beta}(r_\beta). \qquad (6)$$
the above condition ensures that the set $d_{-\beta}$ of infinite words representing (−β)-expansions is shift invariant. in [5] it is shown that the closure of $d_{-\beta}$ defines a sofic system if and only if $d_{-\beta}(l_\beta)$ is eventually periodic. by analogy with the definition of parry numbers, we suggest that numbers β > 1 such that $d_{-\beta}(l_\beta)$ is eventually periodic be called ito-sadahiro numbers. the relation of the set of ito-sadahiro numbers to the set of parry numbers is not obvious. bassino [2] has shown that quadratic numbers, as well as cubic numbers which are not totally real, are parry if and only if they are pisot. for the same class of numbers, we prove in [10] that β is ito-sadahiro if and only if it is pisot. this means that the notions of parry numbers and ito-sadahiro numbers do not differ on the mentioned type of irrationals. this would support the hypothesis stated in the first version of this paper, namely that the set of parry numbers and the set of ito-sadahiro numbers coincide. however, during the refereeing process liao and steiner [9] found an example of a parry number which is not an ito-sadahiro number, and vice versa. the main results of this paper are formulated as theorems 4 and 7. theorem 4 gives a bound on the modulus of the conjugates of ito-sadahiro numbers; theorem 7 shows that periodicity of the (−β)-expansions of all numbers in the field $\mathbb{Q}(\beta)$ requires β to be a pisot or salem number. the statements which we prove, as well as the results of other authors that we recall, demonstrate similarities between the behaviour of β-expansions and (−β)-expansions. we also mention phenomena in which the two essentially differ.

2 preliminaries

let us first recall some number-theoretical notions. a complex number β is called an algebraic number if it is a root of a monic polynomial $x^n + a_{n-1}x^{n-1} + \ldots + a_1 x + a_0$ with rational coefficients $a_0, \ldots, a_{n-1} \in \mathbb{Q}$. a monic polynomial with rational coefficients and root β of minimal degree among all polynomials with the same properties is called the minimal polynomial of β, and its degree is called the degree of β. the roots of the minimal polynomial are called algebraic conjugates. if the minimal polynomial of β has integer coefficients, β is called an algebraic integer. an algebraic integer β > 1 is called a perron number if all its conjugates are in modulus strictly smaller than β. an algebraic integer β > 1 is called a pisot number if all its conjugates are in modulus strictly smaller than 1. an algebraic integer β > 1 is called a salem number if all its conjugates are in modulus smaller than or equal to 1 and β is not a pisot number. if β is an algebraic number of degree n, then the minimal subfield of the field of complex numbers containing β is denoted by $\mathbb{Q}(\beta)$ and is of the form
$$\mathbb{Q}(\beta) = \{c_0 + c_1\beta + \ldots + c_{n-1}\beta^{n-1} \mid c_i \in \mathbb{Q}\}.$$
if γ is a conjugate of an algebraic number β, then the fields $\mathbb{Q}(\beta)$ and $\mathbb{Q}(\gamma)$ are isomorphic. the corresponding isomorphism is given by
$$c_0 + c_1\beta + \ldots + c_{n-1}\beta^{n-1} \mapsto c_0 + c_1\gamma + \ldots + c_{n-1}\gamma^{n-1}.$$
in particular, this means that β is a root of some polynomial f with rational coefficients if and only if γ is a root of the same polynomial f.
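these classes can be checked numerically for a given polynomial. the sketch below is an illustration, not part of the paper; it assumes the polynomial has a real root greater than 1 and relies on floating-point root finding with small tolerances.

```python
import numpy as np

def classify_base(coeffs):
    """classify the largest real root beta > 1 of a monic integer polynomial
    (coefficients highest degree first) as pisot / salem / perron."""
    roots = np.roots(coeffs)
    real_gt1 = [r.real for r in roots if abs(r.imag) < 1e-12 and r.real > 1]
    beta = max(real_gt1)                      # assumes such a root exists
    others = [r for r in roots if abs(r - beta) > 1e-9]
    if all(abs(r) < 1 for r in others):
        return "pisot"
    if all(abs(r) <= 1 + 1e-12 for r in others):
        return "salem"
    if all(abs(r) < beta for r in others):
        return "perron"
    return "none"

print(classify_base([1, -1, -1]))     # x^2 - x - 1: golden mean -> pisot
print(classify_base([1, 0, -1, -1]))  # x^3 - x - 1: minimal pisot number -> pisot
```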
3 ito-sadahiro polynomial

from now on, we shall consider as bases of the numeration system only ito-sadahiro numbers, i.e. numbers β such that
$$d_{-\beta}(l_\beta) = d_1 \ldots d_m (d_{m+1} \ldots d_{m+p})^{\omega}. \qquad (7)$$
without loss of generality we shall assume that m ≥ 0, p ≥ 1 are the minimal values for which $d_{-\beta}(l_\beta)$ can be written in the above form. recall that $l_\beta = -\frac{\beta}{\beta+1}$. therefore (7) can be rewritten as
$$-\frac{\beta}{\beta+1} = \frac{d_1}{-\beta} + \ldots + \frac{d_m}{(-\beta)^m} + \left(\frac{d_{m+1}}{(-\beta)^{m+1}} + \ldots + \frac{d_{m+p}}{(-\beta)^{m+p}}\right)\sum_{i=0}^{\infty}\frac{1}{(-\beta)^{pi}},$$
and after rearrangement
$$0 = \frac{-\beta}{-\beta - 1} + \frac{d_1}{-\beta} + \ldots + \frac{d_m}{(-\beta)^m} + \frac{(-\beta)^p}{(-\beta)^p - 1}\left(\frac{d_{m+1}}{(-\beta)^{m+1}} + \ldots + \frac{d_{m+p}}{(-\beta)^{m+p}}\right).$$
multiplying by $(-\beta)^m\big((-\beta)^p - 1\big)$, we obtain the following lemma.

lemma 1. let β be an ito-sadahiro number and let $d_{-\beta}(l_\beta)$ be of the form (7). then β is a root of the polynomial
$$P(x) = (-x)^{m+1}\sum_{i=0}^{p-1}(-x)^i + \big((-x)^p - 1\big)\sum_{i=1}^{m} d_i(-x)^{m-i} + \sum_{i=m+1}^{m+p} d_i(-x)^{m+p-i}. \qquad (8)$$
such a polynomial is called the ito-sadahiro polynomial of β.

corollary 2. an ito-sadahiro number is an algebraic integer of degree smaller than or equal to m + p, where m, p are given by (7).

it is useful to mention that the ito-sadahiro polynomial is not necessarily irreducible over $\mathbb{Q}$. as an example one can take the minimal pisot number. for such β, we have $d_{-\beta}(l_\beta) = 100(1)^{\omega}$, and thus the ito-sadahiro polynomial is equal to $P(x) = x^4 - x^3 - x^2 + 1 = (x-1)(x^3 - x - 1)$, where $x^3 - x - 1$ is the minimal polynomial of β.

remark 3. note that for p = 1 and $d_{m+1} = 0$, we have $d_{-\beta}(l_\beta) = d_1 \ldots d_m 0^{\omega}$, and the ito-sadahiro polynomial of β is of the form
$$P(x) = (-x)^{m+1} + d_1(-x)^m + (d_2 - d_1)(-x)^{m-1} + \ldots + (d_m - d_{m-1})(-x) - d_m, \qquad (9)$$
and thus β is an algebraic integer of degree at most m + 1.
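lemma 1 is easy to check symbolically. the following sketch (using sympy, a choice made for this illustration) builds $P(x)$ directly from (8) and reproduces the minimal pisot example above.

```python
import sympy as sp

def ito_sadahiro_polynomial(digits, m, p):
    """p(x) from eq. (8); digits = [d_1, ..., d_{m+p}], preperiod m, period p."""
    x = sp.symbols('x')
    poly = (-x) ** (m + 1) * sum((-x) ** i for i in range(p))
    poly += ((-x) ** p - 1) * sum(digits[i - 1] * (-x) ** (m - i)
                                  for i in range(1, m + 1))
    poly += sum(digits[i - 1] * (-x) ** (m + p - i)
                for i in range(m + 1, m + p + 1))
    return sp.expand(poly)

# minimal pisot number: d_{-beta}(l_beta) = 100(1)^omega, i.e. m = 3, p = 1
print(ito_sadahiro_polynomial([1, 0, 0, 1], m=3, p=1))
# -> x**4 - x**3 - x**2 + 1 = (x - 1)*(x**3 - x - 1)
```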
theorem 4. let β be an ito-sadahiro number. all roots γ, γ ≠ β, of the ito-sadahiro polynomial (in particular all conjugates of β) satisfy |γ| < 2.

proof. since β is a root of its ito-sadahiro polynomial P, there must exist a polynomial Q such that P(x) = (x − β)Q(x). let us first determine Q and show that it is a monic polynomial with coefficients in modulus not exceeding 1. the coefficients $d_i$ in the polynomial P in the form (8) are the digits of the (−β)-expansion of $l_\beta$, and thus, by (5), they satisfy $d_i = \lfloor -\beta T^{i-1}(l_\beta) - l_\beta \rfloor$. relation (4) then implies $T^i(l_\beta) = -\beta T^{i-1}(l_\beta) - \lfloor -\beta T^{i-1}(l_\beta) - l_\beta \rfloor$, from which we have
$$d_i = -T^i(l_\beta) - \beta T^{i-1}(l_\beta).$$
for simplicity of notation in this proof, denote $t_i = T^i(l_\beta)$ for $i = 0, 1, \ldots, m+p$. substituting $d_i = -t_i - \beta t_{i-1}$ into (8), we obtain
$$P(x) = (-x)^{m+1}\sum_{i=0}^{p-1}(-x)^i + \big((-x)^p - 1\big)\sum_{i=1}^{m}(-t_i - \beta t_{i-1})(-x)^{m-i} + \sum_{i=m+1}^{m+p}(-t_i - \beta t_{i-1})(-x)^{m+p-i}$$
$$= (-x)^{m+1}\sum_{i=0}^{p-1}(-x)^i + \big((-x)^p - 1\big)(x-\beta)\sum_{i=2}^{m} t_{i-1}(-x)^{m-i} + (x-\beta)\sum_{i=1}^{p} t_{m+i-1}(-x)^{p-i} - \big((-x)^p - 1\big)\beta t_0(-x)^{m-1} + t_m - t_{m+p}. \qquad (10)$$
first realize that $t_m - t_{m+p} = 0$, since $d_{-\beta}(l_\beta)$ is eventually periodic with a preperiod of length m and a period of length p. as $t_0 = T^0(l_\beta) = -\frac{\beta}{\beta+1}$, we can derive that
$$(-x)^{m+1}\sum_{i=0}^{p-1}(-x)^i - \big((-x)^p - 1\big)\beta t_0(-x)^{m-1} = (-x)^{m-1}(x-\beta)(x-t_0)\sum_{i=0}^{p-1}(-x)^i.$$
putting this back into (10), we obtain that the desired polynomial Q defined by P(x) = (x − β)Q(x) is of the form
$$Q(x) = (-x)^{m-1}(x-t_0)\sum_{i=0}^{p-1}(-x)^i + \big((-x)^p - 1\big)\sum_{i=2}^{m} t_{i-1}(-x)^{m-i} + \sum_{i=1}^{p} t_{m+i-1}(-x)^{p-i},$$
which can be rewritten in another form, namely
$$Q(x) = -(-x)^{m+p-1} + \sum_{i=m}^{m+p-2}(t_{m+p-1-i} - t_0 - 1)(-x)^i + \sum_{i=0}^{m-1}(t_{m+p-1-i} - t_{m-1-i})(-x)^i. \qquad (11)$$
note that the coefficients at the individual powers of −x are of two types, namely
$$t_{m+p-1-i} - t_0 - 1 \in [-1, 0) \quad \text{and} \quad t_{m+p-1-i} - t_{m-1-i} \in (-1, 1).$$
in order to complete the proof, realize that every root γ, γ ≠ β, of the polynomial P satisfies Q(γ) = 0. we thus have
$$(-\gamma)^{m+p-1} = \sum_{i=m}^{m+p-2}(t_{m+p-1-i} - t_0 - 1)(-\gamma)^i + \sum_{i=0}^{m-1}(t_{m+p-1-i} - t_{m-1-i})(-\gamma)^i,$$
and hence
$$|\gamma|^{m+p-1} \le \sum_{i=0}^{m+p-2}|\gamma|^i = \frac{|\gamma|^{m+p-1} - 1}{|\gamma| - 1} < \frac{|\gamma|^{m+p-1}}{|\gamma| - 1}.$$
from this, we easily derive that |γ| < 2. as a consequence, we can easily deduce the relation between ito-sadahiro numbers greater than or equal to 2 and perron numbers.

corollary 5. every ito-sadahiro number β ≥ 2 is a perron number.

in a recent preprint [9], it is shown that ito-sadahiro numbers β < 2 are also perron numbers.

4 periodic expansions in the ito-sadahiro system

representations of numbers in the numeration system with a negative base have been studied from the point of view of dynamical systems by frougny and lai [7]. they have shown the following statement.

theorem 6. if β is a pisot number, then $d_{-\beta}(x)$ is eventually periodic for any $x \in I_\beta \cap \mathbb{Q}(\beta)$.

in particular, their result implies that every pisot number is an ito-sadahiro number. here, we show a ‘reversed’ statement.

theorem 7. if every $x \in I_\beta \cap \mathbb{Q}(\beta)$ has an eventually periodic (−β)-expansion, then β is either a pisot number or a salem number.

proof. first realize that since $l_\beta \in \mathbb{Q}(\beta)$, by assumption $d_{-\beta}(l_\beta)$ is eventually periodic, and thus β is an ito-sadahiro number. therefore, using corollary 2, β is an algebraic integer. it remains to show that all conjugates of β are in modulus smaller than or equal to 1. consider a real number x whose (−β)-expansion is of the form $d_{-\beta}(x) = x_1x_2x_3\ldots$ we now show that $x_1 = x_2 = \ldots = x_{k-1} = 0$ and $x_k \ne 0$ implies
$$|x| \ge \frac{1}{\beta^k(\beta+1)}. \qquad (12)$$
in order to see this, we estimate the series
$$|x| = \left|\frac{x_k}{(-\beta)^k} + \sum_{i=1}^{\infty}\frac{x_{k+i}}{(-\beta)^{k+i}}\right| \ge \frac{1}{\beta^k} - \frac{1}{\beta^k}\left|\sum_{i=1}^{\infty}\frac{x_{k+i}}{(-\beta)^i}\right|.$$
since the set $d_{-\beta}$ of all (−β)-expansions is shift invariant, the sum $\sum_{i=1}^{\infty}\frac{x_{k+i}}{(-\beta)^i}$ is a (−β)-expansion of some $y \in I_\beta$. therefore we can write
$$|x| \ge \frac{1}{\beta^k} - \frac{1}{\beta^k}|y| \ge \frac{1}{\beta^k} - \frac{1}{\beta^k}\frac{\beta}{\beta+1} = \frac{1}{\beta^k(\beta+1)}.$$
as β > 1, there exists $l \in \mathbb{N}$ such that
$$-\frac{\beta}{\beta+1} < \frac{1}{(-\beta)^{2l+1}}.$$
let $m \in \mathbb{N}$ satisfy m > 2l + 1. choose a rational number r such that
$$\frac{1}{(-\beta)^{2l+1}} < r < \frac{1}{(-\beta)^{2l+1}} + \frac{1}{\beta^m(\beta+1)}. \qquad (13)$$
according to the auxiliary statement (12), the (−β)-expansion of r must be of the form
$$r = \frac{1}{(-\beta)^{2l+1}} + \sum_{i=m+1}^{\infty}\frac{r_i}{(-\beta)^i}. \qquad (14)$$
as r is rational, by assumption the infinite word $r_{m+1}r_{m+2}\ldots$ is eventually periodic, and by summing a geometric series the sum $\sum_{i=m+1}^{\infty}\frac{r_i}{(-\beta)^i}$ can be rewritten as
$$\sum_{i=m+1}^{\infty}\frac{r_i}{(-\beta)^i} = c_0 + c_1\beta + \ldots + c_{n-1}\beta^{n-1} \in \mathbb{Q}(\beta),$$
where n is the degree of β. in order to prove the theorem by contradiction, assume that a conjugate γ ≠ β is in modulus greater than 1. by application of the isomorphism between $\mathbb{Q}(\beta)$ and $\mathbb{Q}(\gamma)$, we get
$$c_0 + c_1\gamma + \ldots + c_{n-1}\gamma^{n-1} = \sum_{i=m+1}^{\infty}\frac{r_i}{(-\gamma)^i},$$
and thus
$$r = \frac{1}{(-\gamma)^{2l+1}} + \sum_{i=m+1}^{\infty}\frac{r_i}{(-\gamma)^i}. \qquad (15)$$
subtracting (15) from (14), we obtain
$$0 < \left|\frac{1}{(-\beta)^{2l+1}} - \frac{1}{(-\gamma)^{2l+1}}\right| \le \sum_{i=m+1}^{\infty} r_i\left|(-\beta)^{-i} - (-\gamma)^{-i}\right| \le 2\lfloor\beta\rfloor\frac{\eta^{m+1}}{1-\eta}, \qquad (16)$$
where $\eta = \max\{|\beta|^{-1}, |\gamma|^{-1}\} < 1$. obviously, for any m > 2l + 1 we can find a rational r satisfying (13) and thus derive the inequality (16). however, the left-hand side of (16) is a fixed positive number, whereas the right-hand side decreases to zero with increasing m, which is a contradiction.
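theorem 4 can be illustrated numerically on the example from section 3: the roots of the ito-sadahiro polynomial $x^4 - x^3 - x^2 + 1$ of the minimal pisot number, other than β itself, all have modulus below 2 (here in fact below or equal to 1). the sketch below uses floating-point root finding and is illustrative only.

```python
import numpy as np

# roots of the ito-sadahiro polynomial x^4 - x^3 - x^2 + 1 of the minimal pisot number
roots = np.roots([1, -1, -1, 0, 1])
beta = max(r.real for r in roots if abs(r.imag) < 1e-9)
others = [r for r in roots if abs(r - beta) > 1e-9]
print(sorted(abs(r) for r in others))    # all moduli well below 2
print(all(abs(r) < 2 for r in others))   # True, consistent with theorem 4
```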
in order to stress the analogy of the ito-sadahiro numeration system with rényi β-expansions of numbers, recall that schmidt [13] has already shown that for a pisot number β, any $x \in [0,1) \cap \mathbb{Q}(\beta)$ has an eventually periodic β-expansion, and also, conversely, that every $x \in [0,1) \cap \mathbb{Q}(\beta)$ having an eventually periodic β-expansion forces β to be either a pisot number or a salem number. in fact, the proof of theorem 6 given by frougny and lai, as well as our proof of theorem 7, use the ideas presented in [13]. a special case of numbers with periodic (−β)-expansion is given by those numbers x for which the infinite word $d_{-\beta}(x)$ has the suffix $0^{\omega}$. we then say that the expansion $d_{-\beta}(x)$ is finite. an example of such a number is x = 0, with (−β)-expansion $d_{-\beta}(x) = 0^{\omega}$. as is shown in [10], if $\beta < \frac{1}{2}(1+\sqrt{5})$, then x = 0 is the only number with a finite (−β)-expansion. this property of the ito-sadahiro numeration system has no analogue in rényi β-expansions; for a positive base, the set of finite β-expansions is always dense in [0, 1). just as in the numeration system with a positive base, we can extend the definition of (−β)-expansions of x to all real numbers x, and define the notion of a (−β)-integer as a real number y such that
$$y = y_k(-\beta)^k + \ldots + y_1(-\beta) + y_0,$$
where $y_k \ldots y_1 y_0 0^{\omega}$ is the (−β)-expansion of some number in $I_\beta$. the set of (−β)-integers is denoted by $\mathbb{Z}_{-\beta}$. with this notation, we can write the set of all numbers with finite (−β)-expansions as
$$\mathrm{fin}(-\beta) = \bigcup_{k=0}^{\infty}\frac{1}{(-\beta)^k}\mathbb{Z}_{-\beta}.$$
it is not surprising that the arithmetical properties of β-expansions and (−β)-expansions depend on the choice of the base β. it can be shown that both $\mathbb{Z}_\beta$ and $\mathbb{Z}_{-\beta}$ are closed under addition and multiplication if and only if $\beta \in \mathbb{N}$. on the other hand, fin(β) and fin(−β) can have a ring structure even if β is not an integer. frougny and solomyak [8] have shown that if fin(β) is a ring, then β is a pisot number. a similar result is given in [10] for a negative base: fin(−β) being a ring implies that β is either a pisot number or a salem number. in [10] we also prove the conjecture of ito and sadahiro that in the case of a quadratic pisot base β the set fin(−β) is a ring if and only if the conjugate of β is negative.

5 comments and open questions

• every pisot number is a parry number and every parry number is a perron number, and neither of these statements can be reversed. the former is a consequence of the mentioned result of schmidt; the latter statement follows, for example, from the fact that every parry number has an associated canonical substitution $\varphi_\beta$, see [4]. the substitution is primitive, and its incidence matrix has β as its eigenvalue. the fixed point of $\varphi_\beta$ is an infinite word which codes the sequence of distances between consecutive β-integers.
• for the negative base numeration system, we can derive from theorem 6 that every pisot number is an ito-sadahiro number. from corollary 5 we know that an ito-sadahiro number β ≥ 2 is a perron number. based on our investigation, we conjecture that for any ito-sadahiro number $\beta \ge \frac{1}{2}(1+\sqrt{5})$, the sequence of distances between consecutive (−β)-integers can be coded by a fixed point of a ‘canonical’ substitution which is primitive and whose incidence matrix has β² as its dominant eigenvalue. thus we expect that every ito-sadahiro number $\beta \ge \frac{1}{2}(1+\sqrt{5})$ is also a perron number. in the case that $\beta < \frac{1}{2}(1+\sqrt{5})$, we have $\mathbb{Z}_{-\beta} = \{0\}$, and so the situation is not at all obvious.
• in [14], solomyak explicitly described the set of conjugates of all parry numbers. in particular, he has shown that this set is included in the complex disc of radius $\frac{1}{2}(1+\sqrt{5})$, and that this radius cannot be diminished. for his proof it was important that all conjugates of a parry number are roots of a polynomial with real coefficients in the interval [0, 1). in the proof of theorem 4 we show that conjugates of an ito-sadahiro number are roots of a polynomial (11) with coefficients in [−1, 1]. from this, we derive that the conjugates of ito-sadahiro numbers lie in a complex disc of radius ≤ 2. we do not know whether this value can be diminished.

acknowledgement

we acknowledge financial support from czech science foundation grant 201/09/0584 and from grants msm6840770039 and lc06002 of the ministry of education, youth, and sports of the czech republic.

references

[1] ambrož, p., dombek, d., masáková, z., pelantová, e.: numbers with integer expansion in the numeration system with negative base, preprint 2009, 13 pp. http://arxiv.org/abs/0912.4597
[2] bassino, f.: β-expansions for cubic pisot numbers, 5th latin american theoretical informatics symposium (latin’02), 2286 lncs, cancun, mexico, april 2002, pp. 141–152, springer-verlag.
[3] burdík, č., frougny, ch., gazeau, j. p., krejcar, r.: beta-integers as natural counting systems for quasicrystals, j. phys. a: math. gen. 31 (1998), 6449–6472.
[4] fabre, s.: substitutions et β-systèmes de numération, theoret. comput. sci. 137 (1995), 219–236.
[5] ito, s., sadahiro, t.: (−β)-expansions of real numbers, integers 9 (2009), 239–259.
[6] ito, s., takahashi, y.: markov subshifts and realization of β-expansions, j. math. soc. japan 26 (1974), 33–55.
[7] frougny, ch., lai, a. c.: negative bases and automata, discr. math. theor. comp. sci. 13, no. 1 (2011), 75–94.
[8] frougny, ch., solomyak, b.: finite β-expansions, ergodic theory dynamical systems 12 (1994), 713–723.
[9] liao, l., steiner, w.: dynamical properties of the negative beta transformation, preprint 2011, 18 pp. http://arxiv.org/abs/1101.2366
[10] masáková, z., pelantová, e., vávra, t.: arithmetics in number systems with negative base, theor. comp. sci. 412 (2011), 835–845.
[11] parry, w.: on the β-expansions of real numbers, acta math. acad. sci. hung. 11 (1960), 401–416.
[12] rényi, a.: representations for real numbers and their ergodic properties, acta math. acad. sci. hung. 8 (1957), 477–493.
[13] schmidt, k.: on periodic expansions of pisot numbers and salem numbers, bull. london math. soc. 12 (1980), 269–278.
[14] solomyak, b.: conjugates of beta-numbers and the zero-free domain for a class of analytic functions, proc. london math. soc. 68 (1994), 477–498.
[15] thurston, w. p.: groups, tilings, and finite state automata, ams colloquium lecture notes, american mathematical society, boulder, 1989.

zuzana masáková, e-mail: zuzana.masakova@fjfi.cvut.cz
edita pelantová, e-mail: edita.pelantova@fjfi.cvut.cz
department of mathematics, fnspe, czech technical university in prague, trojanova 13, 120 00 praha 2, czech republic

welding of aluminum using a pulsed nd:yag laser

m. nerádová, p. kovačócy

abstract

this paper deals with pulsed laser welding of aluminum using an nd:yag laser with wavelength 1.06 μm. technically pure aluminum (99.50 wt. %) was used as the welded material. eighteen welds (penetration passes) were fabricated in the experiment.
optical microscopy was used to assess the influence of changes in the parameters of the pulsed laser on the quality and geometry of the penetration passes in aluminum, and hardness measurements were made through the interface of the welds. the results show that the geometry of the penetration passes was influenced above all by the position of the beam focus.

keywords: welding parameters, laser welding, aluminum, weldability of aluminum.

1 introduction

laser beams can be applied in a versatile manner in a wide range of technical and non-technical fields. in laser welding of materials, various types of laser devices with various power ranges are used. the automotive industry is one of many areas where laser welding of aluminum is utilized [1]. aluminum and its alloys are characterized by low density, relatively high strength and high corrosion resistance [2, 3]. they are used as structural materials in various industrial fields. welding of aluminum and its alloys has its specific features. aluminum oxidizes strongly above its melting point. the oxide layer has a high melting point, and it does not melt in the welding process [4]. this layer has a strong ability to absorb gases and vapors, which then get into the weld metal. oxide particle layers may lead to the presence of oxide inclusions in the weld metal, which deteriorates the characteristics of the welded joints [5]. when welding aluminum, it is necessary to use a laser beam of higher intensity on the surface of the workpiece, due to the high reflectivity of aluminum [4]. given the high reflectivity of the radiation from the surface of aluminum, it is preferable to use an nd:yag laser. depending on the configuration and the geometry of the welds, additional materials are sometimes used for aluminum welding. when welding aluminum and its alloys, pores often form in the weld metal. the source of these pores is hydrogen, which has a relatively high solubility at the melting temperature of aluminum [5]. the high thermal conductivity and the high coefficient of expansion of aluminum give rise to major distortions in comparison with steel. the use of a highly concentrated laser beam for welding provides the preconditions for successfully addressing these problems. in order to obtain high-quality welded joints, it is particularly necessary to prepare the surface prior to laser welding. the oxide layer along the length of the surface has to be removed. this surface preparation minimizes the formation of defects in welding and the presence of pores and oxide inclusions in the weld metal [2]. when welding aluminum alloys, it is essential to protect the melt bath from oxidation with a shielding gas. the use of helium as the protective gas enables maximum penetration depth and a high-quality weld metal. the basis for achieving high-quality welded joints is the correct choice of the laser welding parameters. the microstructure of the weld metal in joints made by the laser beam with optimal welding parameters is significantly different from the microstructure of weld joints made with the use of metal arc welding. the weld metal has a fine dispersion structure without the presence of low-melting eutectics, while the dendrite dimensions are significantly smaller than in arc welding. the structural changes in the heat-affected area of laser welding take place in a volume 5 to 6 times smaller than in the case of arc welding. the grain in this area increases minimally. this structure is advantageous in terms of mechanical properties and good resistance to hot cracks [1].
2 experimental materials

technically pure aluminium was used as the experimental material. the dimensions of the test samples were 76 × 30 × 1 mm. the chemical composition of al 99.50 % is shown in table 1.

table 1: chemical composition of al 99.50 wt. %

chemical composition [wt. %]: al min. 99.50; fe max. 0.40; si max. 0.30; zn max. 0.07; cu max. 0.05; ti max. 0.05; others max. 0.03
mechanical characteristics: rm 60–100 mpa; ductility a5 min. 10–20 %; modulus of elasticity e 71 gpa

table 2: welding parameters

sample  f (mm)  τ (ms)  u (v)  e1 (j)
1.1     5.5     20      400    69.8
1.2     5.25    20      400    70.1
1.3     5.0     20      400    70.2
1.4     4.75    20      400    69.9
1.5     4.5     20      400    70.0
1.6     4.25    20      400    69.8
1.7     4.0     20      400    70.3
1.8     3.75    20      400    70.0
1.9     3.5     20      400    69.1
1.10    3.25    20      400    70.1
1.11    3.0     20      400    70.1
1.12    2.75    20      400    69.9
1.13    2.5     20      400    70.1
1.14    2.25    20      400    72.8
1.15    2.0     20      400    74.2
2.1     2.75    20      400    69.8
2.2     2.75    20      375    63
2.3     2.75    20      350    53.1

f (mm) – focal length, τ (ms) – pulse duration, u (v) – pump power, e1 (j) – pulse energy

2.1 procedure and parameters for welding

the experiment was executed at the international laser centre in bratislava. the experimental work was performed on a w50 laser welder, produced by solar laser systems, with wavelength 1.06 μm and maximum pulse energy 74.2 j. during the experiments, 18 penetration passes were carried out, in which we observed the influence of the focus position, the pump power and the pulse energy values on the geometry and integrity of the welds. laser welding was performed in a protective atmosphere of argon with a flow of 5 l/min. the welding parameters are shown in table 2.

2.2 assessment of penetration welds

optical microscopy was used for assessing the penetration welds and for measuring the hardness (hv) through the interface of the welds. figs. 1 to 4 document the macrostructures of penetration passes 1.2 and 1.5. it follows from observing the structures that in the case of samples 1.1–1.4 the material was not fully penetrated, due to low power density.

fig. 1: surface of penetration pass 1.5
fig. 2: root of penetration pass 1.5
fig. 3: penetration pass 1.2
fig. 4: penetration pass 1.5

the surface and the root of penetration pass 1.5 are distinguished by roundness without a split. the position of the focus in relation to the surface of the material has caused the whole thickness of the material to be penetrated. a slight depression can be seen on the surface of the material, fig. 4. the weld metal does not show any non-integrities or defects. the width of the surface of the penetration pass is 1.336 mm and the width of the root of the penetration pass is 0.872 mm. figs. 3 and 4 show differences in the character of the penetration. in fig. 3, the material was not melted through — this is known as conduction-mode welding. in this mode, a thin surface layer of material is melted down and the material below is then heated due to thermal conductivity. in fig. 4, the parameters used produce a keyhole sufficient to enable deeper penetration of the laser beam into the material.

fig. 5 documents the microstructure of the transition (heat-affected zone — melting boundary — weld metal) on penetration weld 1.5. fig. 6 shows the microstructure of the weld metal. grain (subgrain) boundaries were observed in the weld metal, with oxides distributed inside the grains.

fig. 5: microstructure of the heat-affected zone (al 99.50 %)
fig. 6: microstructure of the weld metal (al 99.50 %)

no non-integrities were present. a melting boundary can be observed between the heat-affected zone and the weld metal, and there is a smooth transition between them. the orientation of the dendrites can be observed in the weld metal. vickers microhardness tests were conducted on samples 1.5, 1.9 and 2.1. figs. 7 to 9 show the effect of some parameters on the geometry of the penetrations. ten measurements were made on each sample. fig. 10 shows that the highest microhardness values were obtained for sample 1.9. this may be due to the smaller volume of the melted material, quicker cooling of the material, and the formation of a finer structure. in welds 1.5, 1.9 and 2.1 the highest microhardness is in the parent material. this may be because the material had been cold-rolled. the hardness drops in the heat-affected zone due to thermal processing, and at the melting boundary and in the weld metal the hardness starts to rise again. the weld metal is characterized by an as-cast structure. the slight growth in the hardness of the weld metal may be due to its fine dispersion structure.

fig. 7: effect of pulse energy on the geometry of the penetration passes
fig. 8: dependence of the width and depth of penetration on the excitation voltage
fig. 9: influence of the focal length on the geometry of the penetration passes
fig. 10: graphic illustration of the course of microhardness on samples 1.5, 1.9, 2.1. a – parent material, b – heat-affected zone, c – melting boundary, d – weld metal

3 conclusion

based on the results obtained in the experiment, it can be stated that standoff distance values between 0–2 mm and between 4.75–5.5 mm are characterized by a low penetration weld form factor (conduction welding regime), which is more suitable for surface treatment of materials. a total of 10 samples were melted through the whole thickness of the material; the standoff distance in these cases varied in the range 2.5–4.5 mm. an excitation voltage of 400 v was used for the production of the penetrations. to make it possible to assess the impact of the excitation voltage on the geometry, penetration welding was also carried out at voltages of 350 v and 375 v. based on the measured dimensions, the influences of the voltage, pulse energy, pulse duration and standoff distance on the weld geometry were evaluated. based on the results of macro- and microstructural analysis, it can be considered that the most suitable parameters were those used for welding sample no. 1.5. the greatest microhardness values were measured in the base material, while a decrease in hardness was observed in the heat-affected zone.

acknowledgement

the paper was prepared with support from the vega project, no. 1/0842/09.

references

[1] turňa, m., kovačócy, p.: zváranie laserovým lúčom. bratislava, stu, 2003. isbn 80-227-1921-8.
[2] accessible on the internet: http://www.world-aluminium.org/?pg=107
[3] accessible on the internet: http://www.chemicool.com/elements/aluminum.html
[4] accessible on the internet: http://www.keytometals.com/article12.htm
[5] ghaini, f. m., et al.: the relation between liquation and solidification cracks in pulsed laser welding of 2024 aluminium alloy. accessible on the internet: http://www.sciencedirect.com

ing. martina nerádová
doc. dr. ing. pavel kovačócy, phone: +421 335 521 007
slovak university of technology in bratislava, faculty of materials science and technology in trnava, institute of production technologies, bottova 23, 917 24 trnava, slovak republic
stellar object detection using the wavelet transform

e. anisimova, p. páta, m. blažek

abstract

several algorithms are used nowadays for detecting stellar objects in astronomical images, for example in the daophot program package and in sextractor (software for source extraction). our team has become acquainted with the wavelet transform and its good localization properties. after studying the manuals for daophot and sextractor, and becoming familiar with the à trous algorithm used for calculating the wavelet transform, we set ourselves the task of implementing an algorithm for star detection on the basis of the wavelet transform. we focused on detecting stellar objects in complex fields, such as globular clusters and galaxies. this paper describes a stellar object detection algorithm based on the wavelet transform, and presents our results.

keywords: stellar object detection, wavelet transform.

1 introduction

the daophot program [1,2] and sextractor [3,4] calculate the estimated background value and perform thresholding of each pixel: if the pixel value is above a specified threshold and meets certain conditions, we consider it to be a light source; otherwise we assume that it is noise. problems arise in the case of faint stars, whose brightness is close to the ambient background. for this reason, they may not be properly detected. multiresolution methods of image analysis, e.g. the wavelet transform, have therefore gained ground. the main advantage of this transform is its ability to separate the light sources contained in an image according to their size, enabling us to analyze both large, bright objects and the small, faint stars in their neighborhood.

2 wavelet transform of a 2d image

to realize the wavelet transform of an image, it is necessary to convolve the image with a preselected wavelet, first by rows and then by columns. the result is an approximation of the original image and the details present in the horizontal, vertical and diagonal directions. with increasing decomposition level we extract larger details from the image. in order not to change the scale of the low-pass or high-pass wavelet filters, the image has to be subsampled by a factor of 2. therefore, when the wavelet transform is implemented by the mallat algorithm [5], each additional degree of decomposition produces an image with dimensions twice smaller than on the previous level. for detecting stellar objects, this is not a good property, because we need to have the same number of pixels on each scale in order to work with the same coordinates. therefore, the wavelet transform for astronomical purposes is realized by the à trous algorithm (“with holes”).

3 the à trous algorithm

wavelet transform implementation by the à trous algorithm involves convolving the input image with a 2d convolution kernel representing a two-dimensional scaling function, which imitates the stellar psf [6]. to imitate the subsampling process, we have to change the filter length for each next decomposition level in such a way that $2^j - 1$ zeros are inserted between the coefficients, where j is the decomposition level [6].

• during the first decomposition (we start from j = 0) we convolve the original image $s_0$ with an unmodified kernel $k_0$, and the result is the smoothed matrix $s_1$.
• subtracting $s_1$ from $s_0$, we get the wavelet coefficients for the first decomposition level, corresponding to the smallest details: $w_1 = s_0 - s_1$.
• set j = j + 1.
• expand the filter by $2^j - 1$ zeros.
• calculate the smoothed matrix $s_2 = s_1 \ast k_1$ and the wavelet coefficients for the second decomposition level: $w_2 = s_1 - s_2$, etc.

if we stop the algorithm here, the original image is the sum of $s_2$, $w_1$ and $w_2$ (figure 1).

fig. 1: the à trous algorithm [8]

4 detecting stellar objects

stellar objects are detected in the wavelet coefficients $w_1$, $w_2$, etc., representing the details contained in the original image. this means that stars with the narrowest radial profile are detected in $w_1$, and with increasing decomposition level flatter and more extensive objects will be found. the detection proceeds as follows (a code sketch is given below):

1. after calculating the à trous decomposition of the image, we determine the significance of the wavelet coefficients at each level. this is done via the noise level, or by estimating the level of the stellar background (we start as in the case of conventional algorithms). a robust estimate $\hat\sigma$ is obtained with a median measurement, which is highly insensitive to isolated outliers of potentially high amplitude [7]. this estimate is usually used to remove noise from an image, but we will use it to determine the threshold and to distinguish between significant coefficients belonging to stellar objects and insignificant coefficients that are part of the background.
2. the estimated noise standard deviation for each decomposition level is used for wavelet coefficient thresholding, in such a way that all the coefficients belonging to the interval $|w_j| \le 3\hat\sigma$ are set to zero.
3. among the non-zero coefficients we then look for the local maxima of the wavelet coefficients.
4. the coordinates of the local maxima can be considered stellar detections if there are non-zero wavelet coefficients at the same places in the next decomposition level. in this way we verify whether the detected objects have a star-like shape, i.e. we try to eliminate the detection of hot pixels and the detection of false centers, which could be detected due to imperfect background level estimation.

5 results

in this paper the proposed algorithm for stellar object detection is based on the wavelet transform. our team also carried out stellar detection by standard algorithms (used in sextractor, with subtraction of the discovered stars [8]) as well as by the wavelet transform. we monitored the total number of detected stars, the time required for performing the detection, the number of discovered stars corresponding to the optical catalog usno-a2.0 [9], and the percentage of the total number of stars in the catalog for a given image. the best results (number of real detected stars, and the time spent) were achieved using the algorithm based on the wavelet transform. the individual images were taken by the d50 telescope at the astronomical institute of the academy of sciences of the czech republic at ondrejov by the high energy astrophysics group. table 1 illustrates the number of detected light sources for the following stellar objects and their neighbourhood: grb 100902a (gamma-ray burst), m 5 (globular cluster), m 67 (open cluster) and m 51 (galaxy). table 2 shows the results compared with the usno-a2.0 catalog. for each image, the first line gives the number of detected stars with coordinates found in the catalog, the number of detected stars not found in the catalog, and the percentage of detected stars with coordinates found in the catalog out of the total number of objects in the catalog for the image (second row).
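as referenced in section 4, the following is a minimal, illustrative sketch of the à trous decomposition and the $3\hat\sigma$ detection scheme. the b3-spline kernel and the mad-based noise estimate are common choices in the literature but are assumptions here — the paper does not specify its kernel or its exact $\hat\sigma$ estimator.

```python
import numpy as np
from scipy.ndimage import convolve

# 1-d b3-spline filter; its separable outer product is a common scaling-function
# kernel for the a trous transform (an assumption of this sketch)
h = np.array([1, 4, 6, 4, 1]) / 16.0

def a_trous(image, levels):
    """return the final smoothed array and detail planes w_j = s_{j-1} - s_j,
    inserting 2**j - 1 zeros between filter taps at level j."""
    s = image.astype(float)
    details = []
    for j in range(levels):
        hj = np.zeros(4 * 2 ** j + 1)
        hj[:: 2 ** j] = h                      # expand the filter "with holes"
        kernel = np.outer(hj, hj)
        s_next = convolve(s, kernel, mode='mirror')
        details.append(s - s_next)             # w_{j+1}
        s = s_next
    return s, details

def detect(image, levels=3):
    """local maxima of significant positive coefficients, confirmed on the next scale."""
    _, w = a_trous(image, levels)
    sig = []
    for wj in w:
        # robust mad-type noise estimate (assumes a zero-median detail plane)
        sigma = np.median(np.abs(wj)) / 0.6745
        sig.append(np.where(np.abs(wj) > 3 * sigma, wj, 0.0))
    peaks = []
    for j in range(len(sig) - 1):
        for (y, x), v in np.ndenumerate(sig[j]):
            neighborhood = sig[j][max(0, y - 1):y + 2, max(0, x - 1):x + 2]
            if v > 0 and v == neighborhood.max() and sig[j + 1][y, x] != 0:
                peaks.append((j + 1, y, x))    # (detection level, row, column)
    return peaks
```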
figure 2 compares the number of detected light sources using conventional methods based on background estimation and using the wavelet transform.

table 1: number of detected light sources

image: standard algorithm / wavelet transform
grb 100902a: 402 / 513
m 5: 1031 / 1918
m 67: 368 / 499
m 51: 141 / 1191

table 2: number of discovered stars corresponding to the optical catalog usno-a2.0, and the percentage of the total number of stars in the catalog for a given image

image (total in catalog): standard algorithm — in catalog / outside catalog / [%]; wavelet transform — in catalog / outside catalog / [%]
grb 100902a (592): 367 / 35 / 62 %; 401 / 112 / 68 %
m 5 (5213): 823 / 208 / 16 %; 1023 / 895 / 20 %
m 67 (922): 355 / 13 / 38.5 %; 445 / 54 / 48 %
m 51 (93): 63 / 78 / 68 %; 88 / 1103 / 95 %

fig. 2: an example of stellar object detection for an image of open cluster m 67: using conventional methods based on estimation of the global (a) and local (b) background, and using the wavelet transform (c) [8]

6 conclusion

the best detection results were achieved using the method based on image analysis using the wavelet transform: the total number of discovered objects and the percentage of the number of stars in the catalog are the highest of all the tested methods for all images [8]. in addition, there is an interesting situation: in the simpler stellar fields (grb 100902a, m 67) we found more objects that are also listed in the catalog, while the situation was the opposite for the globular cluster and for galaxy m 51. this is due to the fact that stars in the middle of globular clusters or galaxies are not recorded in the catalog, so all light sources found in such places will be evaluated as not belonging to the catalog. conversely, there are many stars in the catalog away from the middle of a cluster that are almost indistinguishable from noise, or not visible at all, so they have not been detected.

acknowledgement

this paper has been supported by grant project sgs10/285/ohk3/3t/13.

references

[1] daophot – stellar photometry package [online]. web site of the software package. [cit. 2011–28–05]. available on the web: http://www.star.bris.ac.uk/mbt/daophot/
[2] stetson, p. b.: daophot: a computer program for crowded-field stellar photometry. publications of the astronomical society of the pacific, march 1987, 99, p. 191–222.
[3] bertin, e., arnouts, s.: sextractor: software for source extraction. astronomy and astrophysics supplement series, june 1996, 117, p. 393–404.
[4] sextractor – astronomical source extractor [online]. web site of the software package. [cit. 2011–28–05]. available on the web: http://sextractor.sourceforge.net/
[5] mallat, s. g.: a theory for multiresolution signal decomposition: the wavelet representation. ieee transactions on pattern analysis and machine intelligence, july 1989, vol. 11, no. 7.
[6] starck, j.-l., murtagh, f.: astronomical image and data analysis. springer-verlag berlin heidelberg, 2002. 289 p. isbn 3-540-42885-2.
[7] donoho, d. l., johnstone, i. m.: ideal spatial adaptation by wavelet shrinkage. biometrika, september 1994, vol. 81, p. 425–455.
[8] anisimova, e.: methods for analysing and processing of astronomical image data. prague, 2011. 77 p. thesis. czech technical university in prague, faculty of electrical engineering, department of radio engineering.
[9] usno – a2.0. a catalog of astrometric standards [online]. [cit. 2011–05–05]. available on the web: http://tdc-www.harvard.edu/software/catalogs/ua2.html
elena anisimova
petr páta
martin blažek
department of radio engineering, faculty of electrical engineering, czech technical university in prague

the problem of predecessors on spanning trees

v. s. poghosyan, v. b. priezzhev

abstract

we consider the equiprobable distribution of spanning trees on the square lattice. all bonds of each tree can be oriented uniquely with respect to an arbitrarily chosen site called the root. the problem of predecessors is to find the probability that a path along the oriented bonds passes sequentially through fixed sites i and j. the conformal field theory for the potts model predicts the fractal dimension of the path to be 5/4. using this result, we show that the probability in the predecessor problem for two sites separated by a large distance r decreases as $P(r) \sim r^{-3/4}$. if sites i and j are nearest neighbors on the square lattice, the probability $P(1) = 5/16$ can be found from the analytical theory developed for the sandpile model. the known equivalence between the loop-erased random walk (lerw) and the directed path on the spanning tree states that $P(1)$ is the probability for the lerw started at i to reach the neighboring site j. by analogy with the self-avoiding walk, $P(1)$ can be called the return probability. extensive monte-carlo simulations confirm the theoretical predictions.

keywords: loop erased random walk, spanning trees, kirchhoff theorem, abelian sandpile model.

1 introduction and main results

in graph theory, a spanning tree of a connected graph g is a connected subgraph of g containing all vertices of g and having no cycles. numerous applications of spanning trees began with kirchhoff’s seminal problem, solved in 1847, and then spread into many branches of mathematics and theoretical physics. in statistical mechanics, spanning trees are related to the potts model [1], the dimer model [2], the sandpile model [3] and many others. a relation between lattice models of statistical mechanics and spanning trees via the tutte polynomial has been established by fortuin and kasteleyn [4]. the kirchhoff theorem claims that the number of spanning trees of a connected graph g is a cofactor of the laplacian matrix $\Delta$ of the graph g. if one deletes any row and any column from $\Delta$, one obtains a matrix $\Delta^*$ which gives the number of spanning trees as $\det \Delta^*$. the determinantal structure allows easy calculation of local characteristics of the spanning trees, for instance, the average number of vertices with a given number of adjacent bonds. the characterization of non-local objects in the spanning tree is not so simple. one such object is the chemical path, defined as a path along two or more bonds of the tree. the fractal dimension of a long chemical path on the two-dimensional lattice has been calculated by means of a mapping of the spanning tree configurations onto the coulomb gas model. a closely related object is the loop-erased random walk (lerw) on the two-dimensional lattice [11], which was proven to be equivalent to the directed chemical path of the spanning tree of the same lattice [7, 12]. in this paper, we consider a problem arising in the theory of lerw and equally distributed spanning trees: given two lattice sites i and j, what is the probability that the lerw or the directed chemical path passes through i and j? if site i is passed first, we say that i is the predecessor of j, and we refer to this problem as the predecessor problem.
surprisingly, the problem has no exact solution in the general case. only two limiting cases are available: (a) if sites i and j are separated by a large distance r, the asymptotics of $P(r)$ can be found from known results on the fractal dimension of the chemical path; (b) if points i and j are nearest neighbors of the square lattice, the sought probability can be found from the theory of sandpiles [9] (see also [16]). the asymptotic behavior of $P(r)$ for large distance r follows directly from the definition of the fractal dimension. indeed, consider a large square lattice l and the set of uniformly distributed spanning trees on l. we assume that the root is situated at the boundary of l. consider a site i in the bulk of the lattice and some circular contour c of radius R with its center at i. let π be a directed chemical path from i to the root along the oriented bonds of a tree. all points of the subset of π inside c are descendants of i. in accordance with the definition of the fractal dimension of the directed path on the spanning tree, the number of descendants inside c is proportional to $R^{5/4}$ (see majumdar [7]). the probability that point i is the predecessor of a point j lying at distance r from i is the density of descendants $\rho(r)$. thus, we have
$$\int_1^R \rho(r)\, r\, \mathrm{d}r \sim R^{5/4}, \qquad (1.1)$$
from which we conclude that $\rho(r) \sim r^{-3/4}$. as was mentioned above, the problem of predecessors for an arbitrary disposition of two lattice points is not solved. in section 2 we concentrate on the particular problem of the probability $P(1)$ when points i and j are nearest neighbors on the square lattice. an essential element of the theory of sandpiles is the probability distribution of sites having 0, 1, 2 and 3 predecessors among their nearest neighbors. the corresponding probabilities are denoted by $x_0$, $x_1$, $x_2$ and $x_3$. having explicit expressions for these values, we obtain $P(1)$ as their combination and get an unexpectedly simple result, $P(1) = 5/16$. in section 3, we relate this result to the return probability of the lerw. section 4 contains the results of monte-carlo verifications.

2 the problem of predecessors for nearest neighbors

the spanning tree enumeration method, namely the kirchhoff theorem, has proved to be a powerful mathematical instrument for the investigation of various combinatorial problems in theoretical physics. in the last decade, it has been developed and adapted for calculating the height probabilities of the abelian sandpile model [5, 9, 16]. the abelian sandpile model is a stochastic dynamical system which describes the phenomenon of self-organized criticality. during its evolution the system falls into a subset of all possible states, called the subset of recurrent states. the problem is to calculate analytically various observable values in the recurrent state, such as the height probabilities $p_i$, i = 1, 2, 3, 4 at a fixed site, and the height correlations between distinct fixed sites [18]. it was shown that the calculation of the height probabilities in the abelian sandpile can be reduced to the calculation of $x_0$, $x_1$, $x_2$ and $x_3$ in the spanning tree model. the exact relation between these quantities is given by
$$p_1 = \frac{x_0}{4}; \quad p_2 = p_1 + \frac{x_1}{3}; \quad p_3 = p_2 + \frac{x_2}{2}; \quad p_4 = p_3 + x_3. \qquad (2.2)$$
majumdar and dhar [5] found the probability of height 1 in 1991, constructing the corresponding defect lattice for the situation when a site $i_0$ has no predecessors and calculating the determinant of the defect matrix $\Delta^*$. a technique for computing the numbers $x_1$, $x_2$, $x_3$ was devised in [9].
the results are (see also [16] for details)
$$x_0 = \frac{8(\pi-2)}{\pi^3}; \quad x_1 = \frac{3}{4} + \frac{48}{\pi^3} - \frac{15}{\pi^2} - \frac{3}{2\pi}; \quad x_2 = \frac{1}{4} - \frac{48}{\pi^3} + \frac{6}{\pi^2} + \frac{3}{\pi}; \quad x_3 = \frac{16}{\pi^3} + \frac{1}{\pi^2} - \frac{3}{2\pi}, \qquad (2.3)$$
which give
$$p_1 = \frac{2(\pi-2)}{\pi^3}; \quad p_2 = \frac{1}{4} + \frac{12}{\pi^3} - \frac{3}{\pi^2} - \frac{1}{2\pi}; \quad p_3 = \frac{3}{8} - \frac{12}{\pi^3} + \frac{1}{\pi}; \quad p_4 = \frac{3}{8} + \frac{4}{\pi^3} + \frac{1}{\pi^2} - \frac{1}{2\pi}. \qquad (2.4)$$

fig. 1: all possible situations in which a fixed vertex $i_0$ (the central vertex in the diagrams) has various nearest-neighbouring predecessors on the square lattice. black denotes vertices which are predecessors of $i_0$; white means that the corresponding vertex is not a predecessor of $i_0$. in the (k+1)-st row (k = 0, 1, 2, 3), situations with k predecessors are presented. the configurations for which the right neighboring vertex is a predecessor of $i_0$ are circled by a dashed line

now consider the problem of predecessors for nearest neighboring sites. first we fix a site $i_0$ in the bulk of the square lattice and denote its right nearest neighboring site by $j_0$. next consider the four cases when $i_0$ has exactly k nearest neighboring predecessors (k = 0, 1, 2, 3) (see fig. 1). for k = 0, the site $j_0$ trivially is not a predecessor of $i_0$. for k = 1, we have 1 of 4 equivalent situations in which $j_0$ is the predecessor of $i_0$. for k = 3, we have 3 of 4 equivalent situations in which $j_0$ is the predecessor. the crucial case is k = 2. here we have 6 situations, but not all of them are equivalent. on the other hand, we can select two groups of 4 and 2 elements (the first four and the last two in the third line of fig. 1) so that the elements in each group are equivalent. we are looking for situations where $j_0$ is a predecessor of $i_0$: there are 2 encircled elements from the first group and one from the second group which correspond to the desired situations. thus, if we take the linear combination of $x_1$, $x_2$ and $x_3$ with coefficients 1/4, 1/2 and 3/4 correspondingly, we get the desired probability $P(1)$ that $j_0$ is the predecessor of $i_0$:
$$P(1) = \frac{1}{4}x_1 + \frac{1}{2}x_2 + \frac{3}{4}x_3 = \frac{5}{16}. \qquad (2.5)$$

3 return probability for the loop erased random walk

consider a finite square lattice l with vertex set v and edge set e. given a path $p = [u_0, u_1, u_2, \ldots, u_n]$ in l, its loop-erasure $le(p) = [\gamma_0, \gamma_1, \gamma_2, \ldots, \gamma_m]$ is defined by chronologically removing loops from p. formally, this definition is as follows. we first set $\gamma_0 = u_0$. assuming $\gamma_0, \ldots, \gamma_k$ have been defined, we let $s_k = 1 + \max\{j : u_j = \gamma_k\}$. if $s_k = n + 1$, we stop and $le(p) = [\gamma_0, \gamma_1, \gamma_2, \ldots, \gamma_m]$ with m = k. otherwise, we let $\gamma_{k+1} = u_{s_k}$. note that the order in which we remove loops does matter, and it follows from the definition that we remove loops as they are created, following the path. a walk obtained after applying the loop-erasure to a simple random walk path is called a loop-erased random walk (lerw). since the infinite simple random walk on finite connected undirected graphs is recurrent, the infinite lerw is not defined. on the other hand, we can fix a subset $w \subset v$ of vertices and define the lerw from a fixed vertex $u_0$ to w. to do this, we take the path of a simple random walk started at $u_0$ and stopped upon hitting w, after which we apply the loop-erasure. wilson established an algorithm to generate uniform spanning trees which uses lerw [13]. it turns out to be extremely useful not only as a simulation tool, but also for theoretical analysis. it runs as follows.
3 Return probability for the loop-erased random walk

Consider a finite square lattice L with vertex set V and edge set E. Given a path P = [u0, u1, u2, ..., un] in L, its loop-erasure LE(P) = [γ0, γ1, γ2, ..., γm] is defined by chronologically removing loops from P. Formally, this definition is as follows. We first set γ0 = u0. Assuming γ0, ..., γk have been defined, we let s_k = 1 + max{j : u_j = γ_k}. If s_k = n + 1, we stop and LE(P) = [γ0, γ1, γ2, ..., γm] with m = k. Otherwise, we let γ_{k+1} = u_{s_k}. Note that the order in which we remove loops does matter, and it follows from the definition that we remove loops as they are created, following the path. A walk obtained by applying the loop-erasure to a simple random walk path is called a loop-erased random walk (LERW).

Since the simple random walk on a finite connected undirected graph is recurrent, the infinite LERW is not defined. On the other hand, we can fix a subset W ⊂ V of vertices and define the LERW from a fixed vertex u0 to W. To do this, we take the path of a simple random walk started at u0 and stopped upon hitting W, after which we apply the loop-erasure.

Wilson established an algorithm to generate uniform spanning trees which uses the LERW [13]. It turns out to be extremely useful not only as a simulation tool, but also for theoretical analysis. It runs as follows. Pick an arbitrary ordering V = {v0, v1, ..., vN} of the vertices in L. Let S0 = v0. Inductively, for i = 1, 2, ..., N, define a graph S_i to be the union of S_{i−1} and a (conditionally independent) LERW path from v_i to S_{i−1}. Note that if v_i ∈ S_{i−1}, then S_i = S_{i−1}. Then, regardless of the chosen order of the vertices, S_N is a uniform spanning tree (UST) on L with root v0. If we take as the initial condition S0 = W, with some W ⊂ V, then the generated structures will be spanning forests with the set of roots W. A spanning forest with a fixed set of roots can be considered as a spanning tree if we add an auxiliary vertex and join it to all the roots.

Now consider the Wilson algorithm on L with the set of boundary vertices ∂L and take S0 = ∂L. When the size of the lattice tends to infinity, the boundary effects vanish, so we can neglect the details of the boundary and will not distinguish between spanning forests and spanning trees. It follows from the Wilson algorithm that, for a fixed site i0, if we start a LERW from i0 stopped upon hitting the boundary ∂L, we generate a path γ of a spanning tree from i0 to the boundary (see also [7, 8]). All vertices on the path γ form the set of descendants of i0. So, if a fixed vertex j0 belongs to γ, then i0 is a predecessor of j0. Consequently, the probability P(j0 − i0) that i0 is a predecessor of j0 in a spanning tree taken at random (from the uniform distribution) on the large square lattice is equal to the probability that the LERW started from i0 passes through j0. In the particular case |j0 − i0| = 1, the probability P(1) = 5/16 calculated in the previous section is the return probability for the LERW.
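The correspondence just described is easy to exercise numerically: a loop-erased walk from the center of a box to its boundary should pass through a fixed nearest neighbor with probability approaching 5/16. The following is a small illustrative sketch of ours (the lattice size, trial count and helper names are our own choices, not taken from the paper); it also anticipates the Monte-Carlo procedure of the next section:

```python
import random

def loop_erase(path):
    """Chronological loop-erasure LE(P): erase each loop as it is created."""
    out, pos = [], {}
    for v in path:
        if v in pos:                      # v closes a loop -> erase it
            for u in out[pos[v] + 1:]:
                del pos[u]
            del out[pos[v] + 1:]
        else:
            pos[v] = len(out)
            out.append(v)
    return out

def lerw_hit_probability(N=25, j0=(1, 0), trials=20000):
    """Estimate the probability that a LERW from the origin to the boundary
    of a (2N+1) x (2N+1) box passes through the fixed vertex j0."""
    steps = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    hits = 0
    for _ in range(trials):
        x = y = 0
        path = [(0, 0)]
        while abs(x) < N and abs(y) < N:  # walk until the boundary is hit
            dx, dy = random.choice(steps)
            x, y = x + dx, y + dy
            path.append((x, y))
        if j0 in loop_erase(path):
            hits += 1
    return hits / trials

print(lerw_hit_probability())  # should be close to 5/16 = 0.3125 for large N
```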
4 Monte-Carlo simulations

Consider the finite (2N+1) × (2N+1) square lattice L_N. Denote its central vertex by i0 and assume that it is the origin of the coordinate system. We deliberately took an odd-odd lattice to provide the symmetry which guarantees the most efficient vanishing of the boundary effects for large lattices. After generating a large number of LERWs starting from i0, we get an approximation of the function P_N(j0 − i0) ≡ P_N(j0). Given a fixed j0, we can extrapolate P_N(j0) as N tends to infinity and get the asymptotic function P(j0). Assume that the Euclidean distance between the origin i0 and j0 is r, and the coordinates of j0 are (r cos φ, r sin φ). The Monte-Carlo simulations and Coulomb gas arguments show that the asymptotic behaviour of the function P(j0) for large r (r ≫ 1) does not depend on φ. So, for r ≫ 1 we have P(j0) ≃ P(r). Fig. 2 shows the behaviour of P(r) for various j0 on the log-log scale, obtained from Monte-Carlo simulations. From this result we conclude that P(r) decreases as a power function:

\[ P(r) \simeq \frac{c}{r^{\alpha}}, \qquad (4.6) \]

with α ≈ 0.751 and c ≈ 0.305. During the simulations, we took sizes up to N = 100 and 10^8 simulation runs. The obtained results are in agreement with α = 3/4, which follows from the Coulomb gas arguments mentioned above. The efficient Monte-Carlo algorithm allows the evaluation of the probabilities P(j0 − i0) for arbitrary finite j0, i0. At the same time, the analytical calculation of P(r) for r > 1 remains a difficult unsolved problem.

Fig. 2: The results of Monte-Carlo simulations for the probability P(r)

Acknowledgement

This work was supported by Russian RFBR grant No. 09-01-00271a, and by the Belgian Interuniversity Attraction Poles Program P6/02, through the NOSY (Nonlinear systems, stochastic processes and statistical mechanics) network. We would like to thank P. Ruelle for helpful discussions. The Monte-Carlo simulations were performed on the Armenian cluster for high performance computation (Armcluster, http://www.cluster.am).

References

[1] Wu, F. Y.: Rev. Mod. Phys. 54 (1982), 235–268.
[2] Fisher, M. E., Stephenson, J.: Phys. Rev. 132 (1963), 1411–1431.
[3] Bak, P., Tang, C., Wiesenfeld, K.: Phys. Rev. Lett. 59 (1987), 381; Dhar, D.: Phys. Rev. Lett. 64 (1990), 1613; Priezzhev, V. B., Ktitarev, D. V., Ivashkevich, E. V.: Phys. Rev. Lett. 76 (1996), 2093; Ivashkevich, E. V.: Phys. Rev. Lett. 76 (1996), 3368; Hu, C.-K., et al.: Phys. Rev. Lett. 85 (2000), 4048.
[4] Fortuin, C. M., Kasteleyn, P. W.: Physica 57, 4 (1972), 536–564.
[5] Majumdar, S. N., Dhar, D.: J. Phys. A: Math. Gen. 24 (1991), L357.
[6] Majumdar, S. N., Dhar, D.: Physica A 185 (1992), 129.
[7] Majumdar, S. N.: Phys. Rev. Lett. 68 (1992), 2329–2331.
[8] Lawler, G. F., Schramm, O., Werner, W.: Annals of Prob. 32, 1B (2004), 939–995.
[9] Priezzhev, V. B.: J. Stat. Phys. 74 (1994), 955.
[10] Mahieu, S., Ruelle, P.: Phys. Rev. E 64 (2001), 066130; Ruelle, P.: Phys. Lett. B 539 (2002), 172; Jeng, M.: Phys. Rev. E 69 (2004), 051302; Jeng, M.: Phys. Rev. E 71 (2005), 036153; Jeng, M.: Phys. Rev. E 71 (2005), 016140; Moghimi-Araghi, S., Rajabpour, M. A., Rouhani, S.: Nucl. Phys. B 718 (2005), 362; Ruelle, P.: J. Stat. Mech. (2007), P09013.
[11] Lawler, G. F.: Duke Math. J. 47, 3 (1980), 655–693.
[12] Kenyon, R.: Acta Math. 185, 2 (2000), 239–286.
[13] Wilson, D.: Proceedings of the Twenty-Eighth Annual ACM Symposium on the Theory of Computing, ACM (1996), 296–303.
[14] Piroux, G., Ruelle, P.: J. Stat. Mech. (2004), P10005.
[15] Piroux, G., Ruelle, P.: J. Phys. A: Math. Gen. 38 (2005), 1451.
[16] Piroux, G., Ruelle, P.: Phys. Lett. B 607 (2005), 188; Jeng, M., Piroux, G., Ruelle, P.: J. Stat. Mech. (2006), P10015.
[17] Izmailian, N. Sh., Priezzhev, V. B., Ruelle, P., Hu, C.-K.: Phys. Rev. Lett. 95 (2005), 260602; Izmailian, N. Sh., Priezzhev, V. B., Ruelle, P.: Symmetry, Integr. Geom.: Methods Appl. 3 (2007), 001.
[18] Poghosyan, V. S., Grigorev, S. Y., Priezzhev, V. B., Ruelle, P.: Phys. Lett. B 659 (2008), 768; J. Stat. Mech. (2010), P07025.
[19] Azimi-Tafreshi, N., Dashti-Naserabadi, H., Moghimi-Araghi, S., Ruelle, P.: J. Stat. Mech. (2010), P02004.
[20] Grigorev, S. Y., Poghosyan, V. S., Priezzhev, V. B.: J. Stat. Mech. (2009), P09008.

V. S. Poghosyan
E-mail: vahagn.poghosyan@uclouvain.be
Institute for Theoretical Physics
The Catholic University of Louvain
B-1348 Louvain-la-Neuve, Belgium

V. B. Priezzhev
E-mail: priezzvb@theor.jinr.ru
Bogoliubov Laboratory of Theoretical Physics
Joint Institute for Nuclear Research
141980 Dubna, Russia

Acta Polytechnica Vol. 52 No. 4/2012

A way to use waste heat to generate thermoelectric power

Marian Brázdil, Jiří Pospíšil

Brno University of Technology, Faculty of Mechanical Engineering, Energy Institute, Department of Power Engineering, Technická 2896/2, 616 69 Brno, Czech Republic
Correspondence to: brazdil@fme.vutbr.cz

Abstract

In recent years there has been rising interest in thermoelectric generation as a potential source of electric power using waste heat. This paper describes thermoelectric power generation from waste heat from biomass boilers, utilizing generators that can convert heat energy directly to electrical energy. General principles of thermoelectric conversion and future prospects of these applications are discussed.

Keywords: thermoelectricity, thermoelectric generator, Seebeck effect, waste heat.

1 Introduction

The efficiency of traditional power generating systems is not satisfactory. A huge amount of thermal energy is discharged into the atmosphere every day.
Not all of the waste heat is dispersed in the atmosphere without profit, but a major part of the energy bound in fossil fuels is not converted into electricity. For example, the total energy efficiency of a conventional thermal power plant using steam turbines is approximately 40 %; the best modern combined-cycle plants using a gas turbine and a steam turbine reach 50–60 %. In vehicles with gasoline-powered combustion engines the conversion efficiency is about 30 %, and diesel-powered combustion engines achieve about 40 % efficiency. Thermoelectric generators (TEG) have the potential to recover waste heat as effective energy and to make a major contribution to reducing fossil fuel consumption. As a consequence of lower energy consumption and higher total energy efficiency, TEGs can also help reduce CO2 and other greenhouse gas emissions.

2 General principles of thermoelectric conversion

Thermoelectric devices are semiconductor devices based on thermoelectric effects that can convert thermal energy directly into electricity. These solid-state devices use electrons as their working fluid. Thermoelectric effects can be used both for power generation and for electronic refrigeration. These effects are often explained by two different electrically conductive materials connected together. When a temperature gradient is established between the junctions of the materials, e.g. one junction is heated and the other cooled, as shown in Figure 1, a voltage (Seebeck voltage) is generated. The thermocouple that is created can be connected to a load to provide electric power. This phenomenon was discovered in 1821 by J. T. Seebeck, and is called the "Seebeck effect".

Figure 1: Seebeck effect

The generated voltage is directly proportional to the temperature gradient. The Seebeck coefficient α (μV/K) is the coefficient of proportionality and a material-specific parameter [1]:

\[ V_{AB} = \alpha_{AB}(T_1 - T_2) \quad [\mathrm{V}] \qquad (1) \]

Conversely, it is possible to convert electric energy into a temperature gradient. This complementary phenomenon, known as the Peltier effect, was discovered by C. A. Peltier in 1834. If a voltage is applied across a junction and a direct current flows in the circuit, a slight cooling or heating effect (depending on the direction of the current) occurs at the junction. This effect is extensively used for cooling.

A typical thermoelectric device is composed of a large number of semiconductor thermocouples, see Figure 2. The thermocouples consist of n-type and p-type semiconductor pellets connected together with a metal plate by soldering. They generate a Seebeck voltage of hundreds of μV/K.

Figure 2: Principle of thermoelectric power generation [2]

3 Thermoelectric materials

The highest thermoelectric conversion performance is achieved with heavily doped semiconductors. High electrical conductivity (σ), a large Seebeck coefficient (α) and low thermal conductivity (λ) are necessary in order to realize high-performance thermoelectric materials [3]. The potential to convert heat to electricity is quantified by the so-called thermoelectric figure-of-merit Z, which is defined as:

\[ Z = \frac{\alpha^2 \sigma}{\lambda} \quad [\mathrm{K^{-1}}] \qquad (2) \]

The figure-of-merit Z varies with temperature. As shown in Figure 3, the conversion efficiency is a function of the operating temperature difference. An increase in the temperature difference provides an increase in the heat available for conversion, so large temperature differences are desirable [3].
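To give a feel for the magnitudes involved, the sketch below evaluates Z from eq. (2) and the dimensionless ZT discussed next, for illustrative material constants of roughly the bismuth-telluride class. The numerical values are our own order-of-magnitude assumptions, not data from this paper:

```python
def figure_of_merit(alpha, sigma, lam):
    """Thermoelectric figure-of-merit Z = alpha^2 * sigma / lambda, eq. (2)."""
    return alpha**2 * sigma / lam

# Illustrative (assumed) constants of a Bi2Te3-like material:
alpha = 200e-6   # Seebeck coefficient [V/K]
sigma = 1.0e5    # electrical conductivity [S/m]
lam   = 1.5      # thermal conductivity [W/(m K)]

Z = figure_of_merit(alpha, sigma, lam)
T = 300.0                      # operating temperature [K]
print(f"Z  = {Z:.2e} 1/K")     # ~2.7e-03 1/K
print(f"ZT = {Z*T:.2f}")       # ~0.8, above the ZT > 0.5 threshold below
```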
It is more usual to specify a dimensionless figure-of-merit ZT, where T is the absolute temperature. Only materials which possess ZT > 0.5 are regarded as thermoelectric materials [3]. Established thermoelectric materials can be divided into groups depending on the temperature range of operation:

• low-temperature materials, up to around 450 K,
• middle-temperature materials, from 450 K up to around 850 K,
• high-temperature materials, from 850 K up to around 1300 K.

Alloys based on bismuth in combination with antimony, tellurium and selenium are low-temperature materials. Middle-temperature materials are based on lead telluride and its alloys. High-temperature materials are fabricated from silicon-germanium alloys [3].

4 Thermoelectric generators

A simple thermoelectric generator consists of a series of thermoelectric couples placed between two heat exchangers and a DC/DC converter. Heat from the heat source flows through the thermoelectric couples and is dissipated through the heat sink into the ambient air, as shown in Figure 4. Electric power is generated as long as a temperature differential is applied. A DC converter changes the output thermoelectric voltage to the voltage required by the load.

Figure 3: Dependence of conversion efficiency on a given thermoelectric material and the applied temperature difference [1, 2]

Figure 4: A simple thermoelectric generator

In general, the thermoelectric system offers several advantages: it has no moving parts or chemical substances, offers maintenance-free and silent operation, is reliable, and is not position dependent [1, 4]. On the other hand, the most significant disadvantage is the low energy efficiency of TEGs. In addition, the energy efficiency and the delivered output power are temperature-dependent.

In space exploration, radioisotope thermoelectric generators (RTGs) have often been used as a power source for deep space probes. Lower energy efficiency was not critical in these applications, because TEGs are suitable for operating in dangerous or inaccessible areas. In ordinary applications, TEGs supplied by combusted fossil fuels (oil or gas) are not cost competitive. Fossil fuels are expensive, so it is better to use waste heat as the energy source.

5 Waste heat from biomass boilers

The quality and the quantity of various types of waste heat differ. Thermoelectric generators are able to recover waste heat if the heat flux is sufficient. A high operating temperature results in high conversion efficiency. Modern small-scale biomass combustion systems (pellet boilers) achieve high efficiency and low emissions, and they offer a convenient form of residential heat supply. However, these fully automatic systems require auxiliary electric power for continual fuel supply, CO2-balanced heat production and distribution. These requirements result in dependence on the electricity grid. TEGs have the potential to deliver auxiliary power for self-sufficient operation of the combustion and heating system [5]. Waste heat from the furnace can be utilised by an integrated generator; in effect, the boiler becomes a micro-scale CHP unit. The thermoelectric generator of the boiler needs to be designed to recover the allowable waste heat output without dropping the flue-gas temperature below the condensation point. In this temperature region the conversion efficiency is quite low: maximum efficiencies commonly available for low-temperature thermoelectric materials are approximately 5 %.
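A rough energy balance shows what this efficiency implies at the scale of the prototype described in the next section. This is our own back-of-envelope estimate; the heat flow actually driven through the modules is an assumed quantity, not a measured one:

```python
# Back-of-envelope TEG output estimate (illustrative assumptions only)
eta_teg   = 0.05    # ~5 % conversion efficiency of low-temperature modules
q_modules = 2000.0  # assumed waste heat passing through the modules [W]

p_electric = eta_teg * q_modules
print(f"electrical output ~ {p_electric:.0f} W")  # ~100 W, the order of the
                                                  # prototype's nominal output
```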
6 Construction of a prototype

A prototype of the thermoelectric generator was designed and produced at the Energy Institute, Brno University of Technology. The generator was intended to verify the possibility of recovering waste heat from flue gas. A Verner A251.1 automatic biomass boiler served as the heat source. The scheme of the micro-scale CHP unit is shown in Figure 5, and the design of the thermoelectric generator is shown in Figure 6. The most important parameters of the unit are given in Table 1.

Table 1: Design specification of the prototype biomass thermoelectric boiler

Parameter                           Value
Rated heat input                    25 kW
Fuel heat input range               7.5–30 kW
Minimum return water temperature    60 °C
Hot surface temperature             220 °C
Nominal electrical power output     100 W_el
Fuel                                wood pellets

The prototype was constructed with industrially available low-temperature thermoelectric material which allows operating temperatures of up to 230 °C. Each thermoelectric module consists of 127 thermocouples made from bismuth telluride.

7 Application market

The inadequate efficiency of state-of-the-art thermoelectric materials prevents the utilization of mass waste heat. New materials and further increases in energy efficiency are expected in the future, and the time will come when thermoelectric technologies reach a level that can support a power generation market [6].

8 Conclusion

Thermoelectric generators can convert thermal energy directly into electricity. These solid-state devices are based on the Seebeck effect. A typical thermoelectric generator is composed of a large number of semiconductor thermocouples. The potential of a material to extract heat and convert it into electricity is quantified by the non-dimensional figure-of-merit ZT.

Figure 5: Scheme of the prototype biomass thermoelectric boiler

Figure 6: Design of the prototype thermoelectric generator

Modern automated combustion systems need to be connected to the electricity grid. Boilers with an integrated thermoelectric generator can utilize waste heat from the furnace. The CHP units thus created provide an independent source of electric energy and enable efficient use of fuel.

Acknowledgement

This work has been financially supported by the Brno University of Technology under grant BD13101004.

References

[1] Rowe, D. M.: CRC Handbook of Thermoelectrics. 1st ed. CRC Press, 1995. 701 p. ISBN 978-0849301469.
[2] Jing-Feng, L., Wei-Shu, L., Li-Dong, Z., Min, Z.: High-performance nanostructured thermoelectric materials, NPG Asia Mater. 2(4), 152–158, 2010. Published online 21 October 2010.
[3] Rowe, D. M. (ed.): Thermoelectrics Handbook: Macro to Nano. 1st ed. Boca Raton: CRC Press, 2006, 1014 p. ISBN 978-0849322648.
[4] Riffat, S. B., Ma, X.: Thermoelectrics: a review of present and potential applications, Applied Thermal Engineering, 23, 913–935, 2003.
[5] Höftberger, E., Moser, W., Aigenbauer, W., Friedl, G., Haslinger, W.: Grid autarchy of automated pellets combustion systems by the means of thermoelectric generators, Bioenergy2020+ [online]. 2010, 26 p. [cit. 2012–03–16]. http://www.bioenergy2020.eu/files/publications/pdf/i1-2481.pdf
[6] Kawamoto, H.: R&D trends in high efficiency thermoelectric conversion materials for waste heat recovery. Quarterly Review [online]. 2009, 30, [cit. 2012–03–16]. p. 54–69. http://www.nistep.go.jp/achiev/ftx/eng/stfc/stt030e/qr30pdf/sttqr3004.pdf
Acta Polytechnica Vol. 49 No. 1/2009

Calculation of a tunnel cross section subjected to fire with a new advanced transient concrete model for reinforced structures

U. Schneider, M. Schneider, J.-M. Franssen

Abstract: The paper presents the structural application of a new thermal induced strain model for concrete, the TIS-model. An advanced transient concrete model (ATCM) is applied with the material model of the TIS-model. The non-linear model comprises thermal strain, elastic strain, plastic strain and transient temperature strains, as well as load history modelling of restrained concrete structures subjected to fire. The calculations by finite element analysis (FEA) were done using the SAFIR structural code. The FEA software was substantially extended with respect to the material modelling in order to use the new TIS-model (a transient model considering thermally induced strain). The equations of the ATCM provide many capabilities, especially for considering irreversible effects of temperature on some material properties. By considering the load history during heating up, an increased load bearing capacity may be obtained due to the higher stiffness of the concrete. With this model, it is possible to apply the thermal-physical behaviour of material laws to the calculation of structures under extreme temperature conditions. A tunnel cross section designed and built by the cut-and-cover method is calculated with a tunnel fire curve. The results are compared with the results of a calculation with the model of Eurocode 2 (EC2-model). The effect of load history in highly loaded structures under fire load is investigated. A comparison of this model with the ordinary calculation system of Eurocode 2 (EC2) shows that a better evaluation of the safety level is achieved with the new model. This opens a space for optimizing concrete structure design under transient temperature conditions up to 1000 °C.

Keywords: material model, transient thermal strain, thermal creep, tunnel, concrete, fire.

1 Introduction

1.1 Background

Calculations to predict the deformation rate and load bearing capacity of concrete structures at high temperatures are often based on material models according to the model of Eurocode 2 (EC2-model). In Europe, most calculations of structures are based on this model. The model is very usable and provides a high level of safety for members under bending and standard fire test conditions. It has not been tested for natural fire conditions, which include decreasing temperature phases. The load bearing capacity of concrete structures can be optimized with models representing transient material behaviour; models which are approximated by transient data are more realistic. The following investigation describes the potential of using a new transient concrete model. This model considers thermally induced strain with external load or internal restraint load during heating up. For this model, a description of all components of concrete strain is needed, since the concrete behaviour is influenced by the transient temperature and load history. A material model for the calculation of siliceous concrete is given in [1]. This new model is based on the thermal-induced-strain model (TIS-model) and is called the advanced transient concrete model (ATCM). Transient conditions during the whole calculation routine are taken into account: the transient load and the real temperature development are considered. Generally, an ATCM can be used for all types of concrete; only some parameters have to be changed. This examination is based on ordinary concrete with siliceous aggregates.

The general calculation method is divided into thermal and mechanical analyses, which are normally nonlinear. Using this model, finite element analysis (FEA) is applied to the calculation [2]. In order to determine the time/temperature curves within the concrete, the thermal equation is solved with the inclusion of heat transfer through thermal analysis [3]. Mass transports can also be included, because during fire exposure many phase transitions of the cement stone matrix and aggregate appear [4, 5]. These thermally conditioned physico-chemical variables can have influences on the mechanical model [6, 7, 8]. The mechanical analysis is based on these results. There are numerous models available for determining the behavior of ordinary concrete at high temperatures [9, 10, 11]. In this regard, there is also a high dependency on the type of concrete, as studies on ultra-high performance concrete (UHPC) have shown [12, 13].

In a first step, the behaviour of small cylinders of siliceous concrete is calculated. The results are obtained using an ATCM, which determines all local stresses and mechanical strains considering the whole cross section. These results are based on measured results according to [14]. In addition, a calculation of restraint stresses is given. The FEA considers different material behaviour, which allows all results obtained with the new model to be compared with the results of calculations obtained with the EC2-model, which is widely used in Europe. The two concrete models, the EC2-model and the ATCM based on material properties according to the TIS-model (see equation (1)), show very different behaviour for deformation and restraint stresses during the calculation. The influence of the load during heating is essential.

The calculations with simple structures show a good approximation between calculation results and measured data [15]. The good adaptation of the new ATCM to measured data gives hope for a good adaptation in the calculation of complex structures. A cut-and-cover rectangular-shape reinforced concrete tunnel is calculated with the new model in the following sections.

2 Generals and calculation results with concrete models

2.1 General TIS-model

It is generally agreed that the total strain ε_tot comprises the following parts:

\[ \varepsilon_{tot} = \varepsilon_{el} + \varepsilon_{pl} + \varepsilon_{tr} + \varepsilon_{th}, \qquad (1) \]

where ε_tot is the total strain, ε_el the elastic strain, ε_pl the plastic strain, ε_tr the total transient creep strain and ε_th the thermal dilatation. It is therefore convenient to write for the pure mechanical strain:

\[ \varepsilon_{m} = \varepsilon_{el} + \varepsilon_{pl} + \varepsilon_{tr} = \varepsilon_{tot} - \varepsilon_{th}. \qquad (2) \]

During an isothermal creep test the following types of deformation occur, see Fig. 1. According to [17], in this case the term is called "load induced thermal strain". It consists of transient creep (transitional thermal creep and drying creep), basic creep and elastic strains. The shrinkage during the first heating is accounted for by the observed thermal strain (load 0 %).

Fig. 1: Deformations of concrete at ambient temperatures subjected to a constant compressive load, according to [16]

Fig. 2 shows a general evolution of the total strain for specimens under different constant loads during heating up, based on the TIS-model. The strong influence of the load during transient heating can be seen. The elastic strain at temperature T ≈ 20 °C is very small compared to the high deformation at high temperatures.

Fig. 2: Total strain at high temperatures as a function of load history (loads 0 %, 10 %, 30 %)
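The additive split in equations (1) and (2) is straightforward to express in code. The following sketch is our own illustration, with invented strain values; it shows how the pure mechanical strain is recovered from a measured total strain once the free thermal dilatation is known:

```python
def mechanical_strain(eps_tot, eps_th):
    """Eq. (2): pure mechanical strain = total strain - thermal dilatation."""
    return eps_tot - eps_th

# Illustrative (invented) values in per mille at some elevated temperature:
eps_tot = -4.0   # measured total strain of a loaded, heated specimen
eps_th  =  8.0   # free thermal dilatation of an unloaded companion specimen

eps_m = mechanical_strain(eps_tot, eps_th)   # elastic + plastic + transient creep
print(f"mechanical strain = {eps_m:.1f} per mille")  # -12.0
```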
It is concluded that the irreversible character of the main material properties must be incorporated in a calculation model to ensure a realistic representation of the behaviour of concrete.

2.2 Calculation of total strains with the ATCM and the EC2 method

2.2.1 Model parameters for the calculation of total strains with the EC2 and ATCM methods

The specimens are cylinders 80 mm in diameter and 300 mm in height. The heating rate is 2 K/min. The compressive strength at 20 °C is 38 MPa and the moisture content is w ≈ 2 %. The results are obtained from heated specimens under different stress-time relationships [14]. In the advanced transient concrete model (ATCM), the TIS-model is used. The FEA uses a model taken from Eurocode 2 with a stress-strain constitutive model with the minimum, recommended and maximum values of the peak stress strain. The minimum value of the peak stress strain (PSS) is not considered further, because its results lie on a very unfavourable side compared to the other models. Fig. 3 shows the different peak stress strain values. The concrete shows a different Young's modulus during heating: the higher the PSS, the smaller the Young's modulus. The practical relationship according to the measured data is not given in Eurocode 2. The stress-strain relationship in Eurocode 2 is also used for a normative temperature condition, according to ISO 834 (ISO fire curve).

Fig. 3: Stress-strain relationship subjected to fire, according to EC2 [18]

Fig. 4: Stress-time relationship with constantly increasing load

Fig. 5: Comparison of measured and calculated total strains under an applied load function according to Fig. 4
6 shows the evolution of stress as a function of time that has been considered, with a linear increase until 15000 seconds and a linear decrease thereafter. fig. 7 shows the results of the comparison. the ec2 model with the maximum value of pss and the fea with the atcm approximated very well, as did the result of the calculation with ec2-model considering the maximum value of pss. the calculation with the ec2 model with the recommended value of pss generally has more deformation than the other calculations. © czech technical university publishing house http://ctn.cvut.cz/ap/ 47 acta polytechnica vol. 49 no. 1/2009 stress-time-relationship 0 2 4 6 8 10 12 14 16 18 20 0 50 100 150 200 250 300 350 400 time [sec] c o m p re s s iv e s tr e s s [m p a ] fig. 6: stress-time relationship with continuously increasing load with continuous decreasing above 15000 seconds comparison between measurements and calculated data with different concrete models -4 -2 0 2 4 6 8 10 0 5000 10000 15000 20000 25000 time [sec] t o ta l d e fo rm a ti o n [‰ ] based on measured data atcm ec2-model with maximum value of the peak stress strain ec2-model with recommended value of the peak stress strain fig. 7: comparison of measured and calculated total strains under an applied load function according to fig. 6 the load function as a stress-time relationship with stepwise application of the load and stepwise unloading is given in fig. 8. a comparison between the different models is shown in fig. 9. the approximation between the two compared calculation methods with the atcm is comparatively good. however, a much higher difference between the total strains calculated with the atcm and the ec2 model with the maximum pss value is observed. the result of the calculation with the ec2 model with the recommended value of pss is significantly different from the calculations with the atcm and from the test results. fig. 10 shows the load function as a stress-time relationship with 3 increasing load steps and 3 decreasing load steps. fig. 11 shows a comparison between the different calculation models. the differences generally increase between the calculations with the ec2 model and atcm. the calculations with the atcm are a good approximation of the test results. the ec2 models, whatever value of pss is chosen, do not allow deformations to be calculated under a load function with a complex stress-time-relationship. for this calculation, atcm must be used. 48 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 49 no. 1/2009 stress-time-relationship 0 5 10 15 20 25 0 50 100 150 200 250 300 350 400 time [sec] c o m p re s s iv e s tr e s s [m p a ] fig. 8: stress-time relationship with a sudden increase of the load and a sudden decrease till the origin comparison between measurements and calculated data with different concrete models -12 -10 -8 -6 -4 -2 0 2 4 6 0 5000 10000 15000 20000 25000 time [sec] t o ta l d e fo rm a ti o n [‰ ] based on measured data atcm ec2-model with maximum value of the peak stress strain ec2-model with recommended value of the peak stress strain fig. 9: comparison of measured and calculated total strains under an applied load function according to fig. 8 2.3 calculation of restraint axial force of a specimen under restraint condition 2.3.1 model parameters calculation of restraint axial force under restraint condition the specimens are calculated with the atcm and the ec2 model under restraint conditions and with a heating rate of 2 k/min. 
the restraint deformation applied at the beginning of the calculation is kept constant during heating up. the specimens are cylinders 80 mm in diameter and 300 mm in height. the cube compressive strength of the siliceous concrete at 20 °c is 20 mpa and it has a moisture content of w � 2 %. 2.3.2 calculation results of restraint axial forces for a heated specimen which is fully restrained the following figures compare the results of the calculation with the atcm with measured data taken from [14]. fig. 12 shows the restraint axial forces during heating with a load factor of 0.3. the measured data is based on different storage conditions during curing. the curve of the atcm is below the data of 105 °c dried concrete specimen till 300 °c and near the standard cured concrete (w � 2 – 4 %). above a temperature of 300 °c, the curve of the atcm is close to the curve of the water stored specimen. the curve of atcm lies in the confidence interval of all curves. fig. 13 shows the ratio of restraint axial force di© czech technical university publishing house http://ctn.cvut.cz/ap/ 49 acta polytechnica vol. 49 no. 1/2009 stress-time-relationship 0 5 10 15 20 25 0 50 100 150 200 250 300 350 400 time [sec] c o m p re s s iv e s tr e s s [m p a ] fig. 10: stress-time relationship with 3 sudden increases in load and 3 sudden decreases till the origin comparison between measurements and calculated data with different concrete models -10 -8 -6 -4 -2 0 2 4 6 8 10 0 5000 10000 15000 20000 25000 time [sec] t o ta l d e fo rm a ti o n [‰ ] based on measured data atcm ec2-model with maximum value of the peak stress strain ec2-model with recommended value of the peak stress strain fig. 11: comparison of measured and calculated total strains under an applied load function according to fig. 10 vided by compressive strength. the figure compares the restraint axial forces under different load conditions. the ec2 model is a stress-strain constitutive model without considering the load factor, i.e. it does not yield a good simulation result for restraint. at a temperature less than 420 °c the different load conditions indicate different restraint axial forces. above 420 °c the curves are nearly identical. the higher the load level, the higher are the restraint axial forces. the lines of the calculation with the ec2 model do not give a good approximation to the results of the atcm. from the experimental result of fig. 12 we come to the conclusion that ec2 model simulations do not give a good approximation of the measured values. the restraint axial forces are significantly lower than the measured data. since the axial stress has a significant effect on the fire resistance of building elements according to [19], a realistic simulation is important for loaded structures. 2.4 calculation of a tunnel cross section 2.4.1 model of the calculation of a tunnel cross section in general, calculation methods have two separate arithmetic steps: a thermal analysis and a mechanical analysis. for further information, please see the references [17, 20, 21]. the calculation model was divided into the following parts of the structure, see fig. 14. 50 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 49 no. 1/2009 axial force during heating with load factor 0.3 compared to measured results 0 10 20 30 40 50 60 70 80 90 100 0 200 400 600 800 temperature [°c] r a ti o o f a x ia l fo rc e d iv id e d b y c o m p re s s iv e s tr e n g th atcm stored with water predried 105°c standard storing fig. 
12: restraint axial force during heating with a load factor 0.3 compared to measured results comparision between different load factors during heating under restraint condition 0 10 20 30 40 50 60 70 80 90 100 0 100 200 300 400 500 600 700 800 temperature [°c] r a ti o o f a x ia l fo rc e d iv id e d b y c o m p re s s iv e s tr e n g th atcm with 0% load atcm with 10% load atcm with 30% load ec2 with recomended value of the peak stress strain (0% load) fig. 13: comparison of restraint forces for different load factors � ground plate beam 01 = symmetric axis of the cross section at node 1 beam 12 = mid-point between beam 01 and beam 20 at node 20 beam 20 = corner between ground plate and wall at node 41 � wall beam 23 = corner between wall and ground plate at node 41 beam 36 = point of maximum bending moment at node 75 beam 49 = corner between ceiling and wall at node 97 � ceiling beam 49 = corner between wall and ceiling at node 97 beam 60 = mid-point between beam 49 and beam 71 at node 120 beam 71 = symmetric axis of the cross section at node 143 in the following example, a single-bay frame is calculated. it is a model of a tunnel taken from a research project, shown in fig. 14 [22]. the simulation calculates a tunnel cross section with an exposition of a hci curve [23]. derived from the hydrocarbon curve, the maximum temperature of the hci curve is 1300 °c instead of the 1100 °c standard hc curve. fig. 15 shown the time-temperature relationship. such fires may occur in accidents involving tank trucks [7, 24]. the arithmetic model is based on a section 1 meter in width [25]. general calculations utilize the semi-probabilistic concept of eurocode 1 [17, 26]. the bedding is considered with the help of a spring component under every beam element of the ground plate [27]. the material used here is ordinary siliceous concrete c25/30 and steel bst500. the heating is calculated for transient heating. before the structure is subjected to fire, the basic combination must be used to determine the amount of reinforcement that is to be used for comparison purposes during the fire exposure. it is assumed that no spalling occurs during the fire. 2.4.2 results of the calculation of a tunnel cross section figs. 16 to 17 show the results of the deformation with the ec2 model with the maximum pss value, and with the atcm. the various displacements demonstrate how the whole structure responds during heating. the stiffness of the system changes as a function of time [28, 29]. © czech technical university publishing house http://ctn.cvut.cz/ap/ 51 acta polytechnica vol. 49 no. 1/2009 fig. 14: principle sketch of the tunnel; according to [22] hcm fire curve 0 200 400 600 800 1000 1200 1400 0 20 40 60 80 100 120 140 160 180 200 time [min] t e m p e ra tu re [° c ] fig. 15: hydro carbon increased fire curve according to [7] most of the deformations show a lower deformation with atcm. only in node 1 is the deformation in y-axis slightly larger with atcm than with the ec2 model. these results show the effect of the higher load utilisation of the new model. without considering the load history, the influence of the load under temperature exposure is not sufficiently reflected in the calculation of the deformation of the structure. the next figures show the mechanical properties of the strucure with respect to the axial forces and the bending moments. figures 18 to 23 show a comparison between the mechanical results. 
the axial forces of the ground plate, the wall and the ceiling are generally higher according to simulations with the ec2 model compared to simulations with atcm. due to the lower deformation in atcm, lower axial forces occur. an insignificant difference between the two models is seen in the calculation of the bending moment. positive bending moments are lower with atcm than with the ec2 model. negative bending moments are higher with atcm than with the ec2-model. 3 discussion of the results to calculate the load bearing capacity and the behaviour of structures subjected to fire, new material equations for the most important material properties of ordinary concrete have been developed [1, 15]. this model was developed to supplement the existing concrete model of ec2 with respect to the transient thermal creep and the effect of the load history. with this new model we can consider the load history in all phases of thermal exposure. with this complex model, we can calculate the total strain, taking into account a wide range of variations of load history and temperatures. different parts of deformations are approximated with discrete equations inter52 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 49 no. 1/2009 ec2-model maximum value of peak stress strain dof1 -0.002 0 0.002 0.004 0.006 0.008 0.01 0.012 0.014 0.016 0 5000 10000 15000 20000 25000 30000 35000 time [sec] d is p la c e m e n t [m ] node 1 node 41 node 97 node 143 atcm dof1 -0.002 0 0.002 0.004 0.006 0.008 0.01 0.012 0.014 0.016 0 5000 10000 15000 20000 25000 time [sec] d is p la c e m e n t [m ] node 1 node 41 node 97 node 143 fig. 16: displacement in the x-axis in various nodes ec2-model maximum value of peak stress strain dof2 -0.12 -0.1 -0.08 -0.06 -0.04 -0.02 0 0 5000 10000 15000 20000 25000 30000 35000 time [sec] d is p la c e m e n t [m ] node 1 node 41 node 97 node 143 atcm dof2 -0.12 -0.1 -0.08 -0.06 -0.04 -0.02 0 0 5000 10000 15000 20000 25000 time [sec] d is p la c e m e n t [m ] node 1 node 41 node 97 node 143 fig. 17: displacement in the y-axis in various nodes ec2-model with maximum value of pss axial force of the ground plate -450000 -400000 -350000 -300000 -250000 -200000 -150000 -100000 -50000 0 0 5000 10000 15000 20000 25000 30000 35000 time [sec] a x ia l fo rc e [n ] beam 1 beam 12 beam 20 atcm axial force of the ground plate -450000 -400000 -350000 -300000 -250000 -200000 -150000 -100000 -50000 0 0 5000 10000 15000 20000 25000 30000 time [sec] a x ia l fo rc e [n ] beam 1 beam 12 beam 20 fig. 18: axial forces in various beams in the ground plate acting in the new concrete model. this technique can be used for realistic calculations of the behaviour of structures [30, 31, 32], especially in the case of restraint. by considering the load history during heating up, an increasing load bearing capacity due to higher stiffness of the concrete may be obtained in several cases. with this model, © czech technical university publishing house http://ctn.cvut.cz/ap/ 53 acta polytechnica vol. 49 no. 1/2009 ec2-model with maximum value of pss axial force of the wall -1200000 -1000000 -800000 -600000 -400000 -200000 0 0 5000 10000 15000 20000 25000 30000 35000 time [sec] a x ia l fo rc e [n ] beam 23 beam 36 beam 48 atcm axial force of the wall -1200000 -1000000 -800000 -600000 -400000 -200000 0 0 5000 10000 15000 20000 25000 30000 time [sec] a x ia l fo rc e [n ] beam 23 beam 36 beam 48 fig. 
19: axial forces in various beams in the wall ec2-model with maximum value of pss bending moment of the ground plate -1.00e+06 -5.00e+05 0.00e+00 5.00e+05 1.00e+06 1.50e+06 2.00e+06 0 5000 10000 15000 20000 25000 30000 35000 time [sec] b e n d in g m o m e n t [n m ] beam 1 beam 12 beam 20 atcm bending moment of the ground plate -1.00e+06 -5.00e+05 0.00e+00 5.00e+05 1.00e+06 1.50e+06 2.00e+06 0 5000 10000 15000 20000 25000 30000 time [sec] b e n d in g m o m e n t [n m ] beam 1 beam 12 beam 20 fig. 21: bending moments in various beams in the ground plate ec2-model with maximum value of pss axial force of the ceiling -900000 -800000 -700000 -600000 -500000 -400000 -300000 -200000 -100000 0 0 5000 10000 15000 20000 25000 30000 35000 time [sec] a x ia l fo rc e [n ] beam 49 beam 60 beam 71 atcm axial force of the ceiling -900000 -800000 -700000 -600000 -500000 -400000 -300000 -200000 -100000 0 0 5000 10000 15000 20000 25000 30000 time [sec] a x ia l fo rc e [n ] beam 49 beam 60 beam 71 fig. 20: axial forces in various beams in the ceiling ec2-model with maximum value of pss bending moment of the wall -2500000 -2000000 -1500000 -1000000 -500000 0 0 5000 10000 15000 20000 25000 30000 35000 time [sec] b e n d in g m o m e n t [n m ] beam 23 beam 36 beam 48 atcm bending moment of the wall -2500000 -2000000 -1500000 -1000000 -500000 0 0 5000 10000 15000 20000 25000 30000 time [sec] b e n d in g m o m e n t [n m ] beam 23 beam 36 beam 48 fig. 22: bending moments in various beams in the wall we can consider the thermal-physical behaviour of material properties for the calculation of reinforced concrete structures. application of this model, instead of the calculation system of ec2, will lead to a better evaluation of the safety level. this opens a space for optimizing reinforced concrete structures under temperature exposure. a calculation of a tunnel cross section of a cut-and-cover single bay frame was performed and presented above. lower deformations are calculated in all parts of the structures using the new advanced transient concrete model (atcm). due to this lower deformation, there is a lower axial force during heating. the results of the calculation of the bending moments show a lower moment on the inside of the tunnel surface and a higher bending moment outside of the tunnel, if we compare the results of atcm with those of the ec2 model. the differences between the calculations are very small. here we do not observe a significant difference in this structure when using the new model of concrete. 4 conclusion it has been shown that the recommended model of ec2 does not calculate realistic values of deformations of concrete structures under high temperature, when compared with the results of the advanced transient concrete model (atcm), which is based on measured data. a maximum value of peak stress strain is necessary for a relatively realistic description of the behaviour of the structure. for calculation of tunnels with concrete with siliceous aggregates, the ec2 model should be taken with the maximum value of the peak stress strain. for calculating a higher load bearing member, atcm should be applied. note that the full concrete behaviour is used in the structure only with the tis-model with the equations of atcm. calculation with atcm has a high potential for optimizing concrete structures, higher than the ec2 model. the reliability of the load bearing capacity is higher with atcm, because the deformations are lower than with the ec2 model. 
the calculated axial forces with atcm are with the ec2 model are close to each other. a potential is observed for more detailed calculations of complex structures. in the concept of structures it may be applied with lower safety factors, i.e. lower excess charges may be used in the design. references [1] schneider, u., schneider, m., franssen, j.-m.: consideration of nonlinear creep strain of siliceous concrete on calculation of mechanical strain under transient temperatures as a function of load history. proceedings of the fifth international conference – structures in fire sif 08, singapore 2008, p. 463–476. [2] franssen j.-m.: safir. a thermal/structural program modelling structures under fire. engineering journal, a.i.s.c., vol 42 (2005), no. 3, p. 143–158. [3] pesaveto, f. et al.: finite – element modelling of concrete subjected to high temperature. in: fib task group 4.3 fire design of concrete structures: what now? what next? milano 2004. [4] lang, e.: feuerbeton, schriftenreihe spezialbetone band 4, verlag bau + technik, 2001. [5] wolf, g.: untersuchung über das temperaturverhalten eines tunnelbetons mit spezieller gesteinskörnung. diplomarbeit, technische universität wien, 2004. [6] florian, a.: schädigung von beton bei tunnelbränden. diplomarbeit, universität innsbruck, 2002. [7] schneider, u., horvath, j.: brandschutz – praxis in tunnelbauten, bauwerk verlag gmbh, berlin, 2006. [8] debicki, g., langhcha, a.: mass transport through concrete walls subjected to high temperature and gas pressure. in: fib task group 4.3 fire design of concrete structures: what now? what next? milano 2004. [9] schneider, u.: verhalten von betonen bei hohen temperaturen; deutscher ausschuss für stahlbeton. berlin – münchen: verlag wilhelm ernst & sohn, 1982. [10] horvath, j., schneider, u.: behaviour of ordinary concrete at high temperatures. institut für baustofflehre, bauphysik und brandschutz, tu wien 2003. [11] schneider, u., morita, t., franssen, j.-m.: a concrete model considering the load history applied to centrally loaded columns under fire attack. in: fire safety science – proceedings of the fourth international symposium, ontario, 1994. [12] horvath, j.: beiträge zum brandverhalten von hochleistungsbetonen, technische universität wien 2003. [13] horvath, j., schneider, u., diedrichs, u.: brandverhalten von hochleistungsbetonen. institut für baustofflehre, bauphysik und brandschutz, tu wien 2004. 54 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 49 no. 1/2009 ec2-model with maximum value of pss bending moment of the ceiling -2500000 -2000000 -1500000 -1000000 -500000 0 500000 1000000 1500000 0 5000 10000 15000 20000 25000 time [sec] b e n d in g m o m e n t [n m ] beam 49 beam 60 beam 71 atcm bending moment of the ceiling -2500000 -2000000 -1500000 -1000000 -500000 0 500000 1000000 1500000 0 5000 10000 15000 20000 25000 30000 time [sec] b e n d in g m o m e n t [n m ] beam 49 beam 60 beam 71 fig. 23: bending moments in various beams in the ceiling [14] schneider, u.: ein beitrag zur frage des kriechens und der relaxation von beton unter hohen temperaturen. (habilitationsschrift) institut für baustoffe, massivbau und brandschutz, tu braunschweig, heft 42, braunschweig, 1979. [15] schneider, u., schneider, m., franssen, j.-m.: numerical evaluation of load induced thermal strain in restraint structures compared with an experimental study on reinforced concrete columns. 
Proceedings of the 11th International Conference and Exhibition, Fire and Materials 2009, 26–28 January 2009, Fisherman's Wharf, San Francisco, USA.
[16] Khoury, G. A., Grainger, B. N., Sullivan, P. J. E.: Transient thermal strain of concrete: literature review, conditions with specimens and behaviour of individual constituents. Magazine of Concrete Research, Vol. 37 (1985), No. 132.
[17] Schneider, U., Lebeda, C., Franssen, J.-M.: Baulicher Brandschutz, Berlin: Bauwerk Verlag GmbH, 2008.
[18] Eurocode 2: Design of concrete structures – Part 1-2: General rules – Structural fire design. 2004.
[19] Dwaikat, M. B., Kodur, V. K. R.: Effect of fire scenario, restraint conditions, and spalling on the behaviour of RC columns. Proceedings of the Fifth International Conference – Structures in Fire SiF'08, Singapore, 2008, p. 463–476.
[20] Franssen, J.-M.: Contributions à la modélisation des incendies dans les bâtiments et leurs effets sur les structures, Université de Liège, Belgium, 1998.
[21] Mason, J. E.: Heat transfer programs for the design of structures exposed to fire. University of Canterbury, Christchurch, 1999.
[22] Franssen, J.-M., Hanus, F., Dotreppe, J.-C.: Numerical evaluation of the fire behaviour of a concrete tunnel integrating the effects of spalling. In: Proceedings fib Workshop – Coimbra, November 2007.
[23] ÖVBB-Sachstandsbericht: Brandeinwirkungen – Straße, Eisenbahn, U-Bahn. ÖVBB-Arbeitskreis AA1, Verf.: Lemmerer, J. et al., Wien, Januar 2005.
[24] SFPE: The SFPE Handbook of Fire Protection Engineering. 2nd edition, Quincy, MA, USA: SFPE, 1995.
[25] Wittke, W., Wittke-Gattermann, P.: Tunnelstatik. In: Beton Kalender 2005: Fertigteile-Tunnelbauwerke, Berlin: Verlag Ernst & Sohn, 2005.
[26] Eurocode 1: Actions on structures. Part 1-1: General actions. Densities, self-weight, imposed loads for buildings. EN 1991-1-1:2002.
[27] Beutinger, P., Sawade, G.: Standsicherheit – Vorhersagemöglichkeit der Bodentragfähigkeit aus geotechnischer Sicht, Tiefbau Tagung Magdeburg, 2004.
[28] Feron, C.: The effect of the restraint conditions on the fire resistance of tunnel structures. In: fib Task Group 4.3, Fire Design of Concrete Structures: What now? What next? Milano, 2004.
[29] VÖZFI: Abschlussbericht – Praxisverhalten von erhöht brandbeständigem Innenschalen-Beton, Wien, 2003.
[30] Rotter, J. M., Usmani, A. S.: Thermal effects. In: Proceedings of the First International Workshop "Structures in Fire", Copenhagen, 19th and 20th June 2000.
[31] Harada, K.: Actual state of the codes on fire design in Japan. In: fib Task Group 4.3, Fire Design of Concrete Structures: What now? What next? Milano, 2004.
[32] Bailey, C. G., Toh, W. S.: Experimental behaviour of concrete floor slabs at ambient and elevated temperatures. In: Proceedings of the Fourth International Workshop "Structures in Fire", Aveiro, 2006.

Ulrich Schneider
E-mail: ulrich.schneider+e206@tuwien.ac.at
Martin Schneider
E-mail: e0527948@student.tuwien.ac.at
University of Technology Vienna
Karlsplatz 13/206, 1040 Wien, Austria

Jean-Marc Franssen
E-mail: jm.franssen@ulg.ac.be
University of Liège
1, Ch. des Chevreuils, 4000 Liège, Belgium

Acta Polytechnica Vol. 50 No. 4/2010

Influence of climatic cycles on properties of leadfree solders

M. Tučan, P. Žák

Abstract

This paper presents complex climatic tests performed in the laboratory of climatotechnology of the Department of Electrotechnology.
More specifically, it presents the results of climatic shocks administered to specimens of Sn-Pb and leadfree solders, and compares the results. While the experimental results are negative regarding the use of electrical resistance as a method for tracking degradation of the joints, the research has provided a large number of specimens that can be examined for structural changes in the solder after climatic stress.

Keywords: climatic tests, shocks, leadfree solder.

1 Leadfree challenges

Leadfree soldering, compulsory today for most of the electronics industry, brings many challenges in its implementation. Apart from problems such as tin corrosion and faster oxidation due to higher temperature, which appear in the production phase, there is the matter of the long-term behavior of leadfree solders. Classical Sn-Pb solder was used for decades, and a large body of knowledge was accumulated about its properties in various conditions and climates. Based on this knowledge, it is possible to make at least a rough prediction of the behavior of soldered joints in adverse conditions. However, we have only limited knowledge of the long-term properties of the technologies which have replaced Sn-Pb solders, be it leadfree soldering, low-temperature soldering or the use of electrically conductive adhesives. This lack of experience prevents the use of such technologies in many important fields, the most prominent of which are aerospace, military, medicine and many subsystems in the transportation industry [1].

2 Accelerated testing

To gain at least rough data on the properties of these new technologies in long-term use, there is generally just one solution: accelerated tests in adverse conditions. These tests usually make the conditions even worse than those expected in real use, in order to gain a margin of safety and to obtain results faster. Of course, any such test depends heavily on the assumptions that are made and on the factors that are simulated, as it is impossible to make complex tests that accurately simulate real-world use. Accelerated testing of electronic devices and components is specified by a wide range of standards. American military standards are especially useful, as they define a very wide set of operating conditions, test methods, etc.

2.1 Specimens

For the purposes of this experiment, several sets of experimental printed circuit boards were prepared. All boards utilized 0R0 SMD resistors (1206), but different mounting technologies were used: two types of ECAs (Amepox AX12 and Amepox AX20), two types of leadfree solders (Cobar XM 3S and Kester EM907) and one type of Sn-Pb solder (Cobar S62-GM5). Only the two Cobar solders were examined.

Fig. 1: Experimental circuit boards

2.2 Thermal cycles

The experiments performed by the authors were aimed at comparing leadfree solders and electrically conductive adhesives of various types under cyclic climatic stresses. Thanks to the available equipment, it was possible to perform three different sets of climatic shocks.

The first experiment was aimed at rapid changes of the outside temperature. The specimens suffered repeated transition shocks between 125 °C dry heat and −45 °C cold. The dwell time at each of these temperatures was 15 minutes – this setup ensured good heating/cooling of the soldered joints. Each specimen suffered 182 such shocks. The temperature profile of these changes is shown in Figure 2.

Fig. 2: Temperature profile of thermal shocks

Fig. 3: Resistance change depending on thermal shocks for Sn-Pb solder
Fig. 4: Resistance change depending on thermal shocks for leadfree solder

Figures 3 and 4 allow a comparison of Sn-Pb and leadfree solders subjected to this kind of fatigue. As shown in the graphs, the change in electrical resistance was minor in both cases, and it was impossible to determine any significant clear trend. Thermal cycling produced no significant structural changes, apart from significant surface oxidation of the soldered joints. In contrast to the ECA experiments, no apparent cracks formed.

2.3 Moist heat – dry heat cycles

In this subset of experiments, the specimens were subjected to cyclic shocks between moist heat (50 °C, 100 % humidity) and dry heat (125 °C). As shown in Figure 6, the increase in electrical resistance is again inconclusive for leadfree solder, and the same applies to Sn-Pb solder (Figure 5). Due to the nature of the test – it was necessary to dry the specimens before measuring – it was possible to obtain only a much more limited set of values.

Fig. 5: Resistance change depending on moisture-heat shocks for Sn-Pb solder

Fig. 6: Resistance change depending on moisture-heat shocks for leadfree solder

This type of climatic stress showed a significant increase in the extent and intensity of oxidation. As shown in Figure 7, the surface of the joint is covered with oxides. Unfortunately, as in 2.2, no cracks were observed in the joints.

Fig. 7: Oxidation of the surface of a soldered joint after moisture-heat cycling

2.4 Moist heat – cold cycles

The last of the settings for climatic shocks utilized moist heat (see above) and cold. As with the other experiments, Figures 8 and 9 show that the electrical resistance of the soldered joints did not react significantly to the climatic shocks, even in such adverse conditions. However, the surfaces of the joints again showed significant damage, as shown in Figure 10.

Fig. 8: Resistance change depending on moisture-cold shocks for Sn-Pb solder

Fig. 9: Resistance change depending on moisture-cold shocks for leadfree solder

The degradation of the surface of the solder, as shown in Figure 10, resembles tin pest. Tin pest is a phase transformation of solid tin caused by low temperatures – it starts to manifest itself at roughly 13 °C, and the lower the temperature, the worse the tin pest. Tin pest has again attracted attention with the widespread use of leadfree solders.

Fig. 10: Surface of leadfree solder after moist-cold cycles

Problems caused by tin pest have been observed on electronics such as cell phone base stations and other electronic devices located outdoors. One area where it struck unexpectedly was the ISAF mission in Afghanistan, where a number of armies (especially the US Army) used off-the-shelf portable computers manufactured with leadfree solders [2].

3 Conclusion

The conclusions drawn from this research are, unfortunately, mostly negative. It has been shown that electrical resistance cannot be used to determine the damage caused to soldered joints by climatic cycling, at least not to the extent that could realistically be performed in the laboratory belonging to the department. However, the results also show the possibility of a closer study of surface changes to the solder, especially aimed at low-temperature changes. The results from the round of experiments discussed in this article will be extended by obtaining quality cross-sections and examining them using electron microscopy.
References

[1] Mach, P., Skočil, V., Urbánek, J.: Montáž v elektronice: pouzdření aktivních součástek, plošné spoje. 1. vyd. Praha: Vydavatelství ČVUT, 2001. 440 s. ISBN 80-01-02392-3.
[2] Lasky, R. C.: Tin pest: A forgotten issue in leadfree soldering? 2004 SMTA International Conference Proceedings, Chicago, IL, Sept. 26–30, 2004, p. 838–840.

About the authors

Marek Tučan was born in Kladno, Czech Republic, in 1982. He graduated from the Faculty of Electrical Engineering in 2008, specializing in electrotechnology, and has remained faithful to his alma mater. Currently he is a postgraduate student under the supervision of doc. Jan Urbánek, and is conducting research on the properties of leadfree solders. He is fluent in English and French, and is able to handle short technical texts and tables in Russian, German and Polish. Among his wide range of hobbies, the most prominent are history and military studies, aviation and space technology.

Pavel Žák was born in 1983. In 2006 he graduated from the bachelor degree program in electrical engineering and entered the Information Technology – Power Engineering master degree program at FEE-CTU in Prague. Two years later he graduated with a master's degree in electrotechnology. At present, he is studying in a postgraduate doctoral degree program under the supervision of Assoc. Prof. Ivan Kudláček, Ph.D. He is fluent in French and English, and has a basic knowledge of German. His hobbies include electronics, photography, aviation, marine model making and fishkeeping. He works part-time in a private electrophysical laboratory.

Marek Tučan, Pavel Žák
E-mail: tucanm1@fel.cvut.cz, zakpavel@fel.cvut.cz
Dept. of Electrotechnology, Faculty of Electrical Engineering, Czech Technical University, Technická 2, 166 27 Praha, Czech Republic

A constellation space dimensionality reduced sub-optimal receiver for orthogonal STBC CPM modulation in a MIMO channel

M. Hekrdla

Abstract: We consider burst orthogonal space-time block coded (OSTBC) CPM modulation in a MIMO flat slow Rayleigh fading channel. The optimal receiver must process a multidimensional non-linear CPM signal on each antenna. This task imposes a high load on the receiver's computational performance and increases its complexity. We analytically derive a suboptimal receiver with a reduced number of front-end matched filters (MFs) corresponding to the CPM dimension. Our derivation is made fully in the constellation signal space, and the reduction is based on linear orthogonal projection to the optimal subspace. The optimality criterion is the standard space-time rank and determinant criterion. The optimal arbitrary-dimensional subspace search leads to an eigenvector solution. We present a condition on a sufficient subspace dimension and interpret the meaning of the corresponding eigenvalues. It is shown that the determinant and rank criterion for OSTBC CPM is equivalent to the uncoded CPM Euclidean distance criterion; hence the proposed receiver may also be practical for uncoded CPM and foremost in a serially concatenated (SC) CPM system. All the derivations are supported by suitable error simulations for binary 2REC h = 1/2, but the procedure is generally valid for any CPM variant. We consider OSTBC CPM in a Rayleigh fading AWGN channel and SC CPM in an AWGN channel.

Keywords: space-time coding (STC), continuous phase modulation (CPM), receiver complexity reduction, MIMO, burst orthogonal space-time block coded (OSTBC) CPM, serially concatenated CPM (SCCPM), mismatched detector.

1 Introduction

1.1 Background

Continuous phase modulation (CPM) is a transmitter-friendly digital modulation scheme typically used in wireless communication systems, e.g. Bluetooth and GSM. The constant-envelope property allows the use of a low-cost, power-efficient nonlinear amplifier, typically of class C, whereas phase continuity forces the spectrum to decrease steeply and leads to a spectrally efficient modulation with a narrow bandwidth. However, in the typical wireless fading communication channel, uncoded CPM suffers from the same performance degradation as any linear modulation format. Hence, in the past few years, there has been an effort to extend the concept of space-time coding (STC), originally developed for linear modulations, to CPM. Several ST CPM coding schemes have been proposed that can be classified as ST trellis codes [20], [19]. These schemes modify the phase trellis inherent to CPM, and therefore the existing CPM receiver schemes cannot be directly applied. Space-time block codes (STBC) [18] do not have these disadvantages. Design rules combining orthogonality and CPM cannot be applied directly, due to phase continuity violation; however, this can be accomplished with a burst-based approach [8]. Phase continuity among the bursts is ensured by a termination tail data sequence, which forces the modulator to the zero state at the end of each burst.

1.2 Goals

We propose an optimal reduced constellation subspace for burst OSTBC CPM, with the optimality criterion given by the standard space-time determinant and rank criterion.
We assume that the constellation basis fulfills orthogonality and the Nyquist condition, in order to simplify the metric computation of the receiver and to prepare the ground for possible iterative detection. The well-known Laurent expansion [4], including its tilted-form variant, does not satisfy this. Receivers that use shorter phase pulses than the transmitter are called mismatched receivers [10]; this type of receiver is also used here. It is shown that the optimal subspace is described in the constellation space by the eigenvectors of a special Hermitian matrix. This matrix is formed by all signal vector differences making up all minimal free paths of uncoded CPM, and depends only on the given CPM variant. The dimensionality of this subspace corresponds to the number of MFs practically implemented at each antenna of the receiver. The analytical derivation uses linear orthogonal subspace eigenvector projection in the constellation space. Suboptimal receivers are tested with binary 2REC h = 1/2 in a 2×1 MIMO Rayleigh fading channel and the Alamouti space-time block code. The procedure itself has general validity for any space-time block code from the orthogonal design (OD) [16] and for any CPM format used.

1.3 Prior art and related work

Analytical projection to a one-dimensional subspace has been discussed in [14]. The procedure was based on earlier work with numerical optimization [13] and on a definition of the optimality criterion [12]. A non-linear receiver preprocessor has been investigated in [11]. There has been a huge effort in CPM receiver complexity reduction; papers dealing with this topic include [10], [3], [9], [15], [6] and the references therein.
2 Definitions and system model

2.1 Tilted CPM definition

For mathematical convenience we adopt the tilted-phase CPM concept [7], with a time-invariant trellis and a possibly reduced number of modulator states and distinct signal waveforms. A stationary trellis diagram is attractive for a minimum free distance search, for the termination tail algorithm and for a constellation basis search. The modulated bandpass complex envelope signal is defined as

$$s(t,\mathbf{u}) = \sqrt{\tfrac{2\varepsilon_s}{T_s}}\; e^{\,j\left(2\pi f_1 t + \bar\psi(t,\mathbf{u})\right)} \qquad (1)$$

with shifted center carrier frequency $f_1 = f_0 - h(M_d-1)/(2T_s)$, where $T_s$, $M_d$, $\varepsilon_s$, $f_0$ are the symbol duration, alphabet size, symbol energy and original carrier frequency, respectively. The physically distinguishable phase, owing to the periodicity of the harmonic function, is taken as $\bar\psi = \psi \bmod 2\pi$. The information-carrying tilted phase is $\psi(t,\mathbf{u}) = \varphi(t,\mathbf{d}) + \pi h (M_d-1)\,t/T_s$, where $\varphi(t,\mathbf{d})$ is the original classic CPM phase. The new data symbols $u_n$ take $M_d$ different values, $u_n \in \{0, 1, \ldots, M_d-1\}$. The physical phase function is

$$\bar\psi(t+nT_s,\mathbf{u}) = \left[\,2\pi h v_n + 4\pi h \sum_{i=0}^{L-1} u_{n-i}\, q(t+iT_s) + w(t)\right] \bmod 2\pi;\quad 0 \le t < T_s \qquad (2)$$

with the data-independent, time-dependent part

$$w(t) = \pi h (M_d-1)\frac{t}{T_s} - 2\pi h (M_d-1) \sum_{i=1}^{L-1} q(t+iT_s) + (L-1)(M_d-1)\pi h;\quad 0 \le t < T_s. \qquad (3)$$

The modulation index is a rational irreducible number $h = k/p$, and the alphabet size $M_d$ is assumed to be strictly a power of 2. The frequency pulse $g(t)$, with normalized area $\int_{-\infty}^{\infty} g(\tau)\,\mathrm{d}\tau = \tfrac{1}{2}$, defines the phase function $q(t) = \int_{-\infty}^{t} g(\tau)\,\mathrm{d}\tau$. Frequently used pulses of correlation length $L$ are

$$g_{LREC}(t) = \begin{cases} \dfrac{1}{2LT_s}, & 0 \le t \le LT_s,\\[4pt] 0, & \text{else}, \end{cases} \qquad (4)$$

$$g_{LRC}(t) = \begin{cases} \dfrac{1}{2LT_s}\left(1 - \cos\dfrac{2\pi t}{LT_s}\right), & 0 \le t \le LT_s,\\[4pt] 0, & \text{else}. \end{cases} \qquad (5)$$

Modulator states are defined as $\sigma_n = [u_{n-1}, \ldots, u_{n-L+1}, v_n]$. The accumulated state $v_n = \left[\sum_{i=0}^{n-L} u_i\right] \bmod p$ takes only $p$ distinct values, and so the modulator output signal is one of $p M_d^{\,L}$ possible waveforms, and the number of modulator states is equal to $p M_d^{\,L-1}$. The state equation for the accumulated phase follows as $v_{n+1} = (v_n + u_{n-L+1}) \bmod p$.

2.2 Constellation space

We find an orthonormal Nyquist basis by applying the Gram–Schmidt procedure to the maximum number of linearly independent modulator outputs $\{s_j(t)\}_{j=1}^{n_s}$, masked by a rectangular function. The orthonormal bases $\{\varphi_i(t)\}_{i=1}^{n_s}$ are obtained according to the Gram–Schmidt recursive formula

$$\varphi_j(t) = \frac{s_j(t) - \sum_{i=1}^{j-1} \langle s_j(t), \varphi_i(t)\rangle\, \varphi_i(t)}{\left\| s_j(t) - \sum_{i=1}^{j-1} \langle s_j(t), \varphi_i(t)\rangle\, \varphi_i(t)\right\|}, \qquad (6)$$

where $\langle x(t), y(t)\rangle = \int_0^{T_s} x(t)\, y^*(t)\,\mathrm{d}t$ is the inner product operation. Equivalently, in matrix notation,

$$\mathbf{A} \begin{bmatrix} \varphi_1(t) \\ \varphi_2(t) \\ \vdots \\ \varphi_{n_s}(t) \end{bmatrix} = \begin{bmatrix} s_1(t) \\ s_2(t) \\ \vdots \\ s_{n_s}(t) \end{bmatrix}, \qquad (7)$$

where the matrix

$$\mathbf{A} = \begin{bmatrix} c_1 & 0 & \cdots & 0 \\ a_{21} & c_2 & & \vdots \\ \vdots & & \ddots & 0 \\ a_{n_s 1} & a_{n_s 2} & \cdots & c_{n_s} \end{bmatrix}, \qquad (8)$$

with $a_{ij} = \langle s_i(t), \varphi_j(t)\rangle$ and $c_j = \left\| s_j(t) - \sum_{i=1}^{j-1} \langle s_j(t), \varphi_i(t)\rangle\, \varphi_i(t)\right\|$. The modulation dimension is denoted $n_s$. The maximum number of linearly independent outputs can be obtained for $v_n = 0$, because the other functions are multiples of $\exp(j2\pi h v_n)$. This determines that the corresponding constellation vectors are directly equal to the individual rows of matrix $\mathbf{A}$, and the other constellation vectors are multiplied by $e^{j2\pi h v_n}$. Note that in the REC case the modulator output waveforms coincide (up to linear dependence) for data sequences that are permutations of each other, since the final shape of the output signal is given by the sum of the data sequence of length $L$. In that case, the dimension $n_s$ is equal to the number of all possible distinct summation results. The summation can take any value from $0$ to $L(M_d-1)$, hence

$$n_s^{REC} = L(M_d-1) + 1. \qquad (9)$$

Considering RC and other similarly piecewise linearly independent pulses, we cannot use the previous trick, and hence

$$n_s^{RC} = M_d^{\,L}. \qquad (10)$$
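The dimension count (9) can also be checked numerically. The sketch below (our own illustration; the paper provides no code) samples the tilted-phase waveforms of Eq. (2) for binary 2REC, h = 1/2, with v_n = 0, computes the rank of the stacked waveform matrix, and extracts an orthonormal basis of the same subspace via SVD (equivalent in span to the Gram–Schmidt basis of (6)).

```python
import numpy as np

Ts, L, Md, h = 1.0, 2, 2, 0.5           # binary 2REC, h = 1/2
N = 256                                  # samples per symbol interval
t = np.arange(N) * Ts / N

def q(tau):
    """Phase pulse of LREC: integral of g(t) = 1/(2*L*Ts) over [0, L*Ts]."""
    return np.clip(tau, 0.0, L * Ts) / (2 * L * Ts)

def w(tau):
    """Data-independent tilt term, Eq. (3)."""
    s = sum(q(tau + i * Ts) for i in range(1, L))
    return (np.pi * h * (Md - 1) * tau / Ts
            - 2 * np.pi * h * (Md - 1) * s
            + (L - 1) * (Md - 1) * np.pi * h)

# All modulator outputs over one symbol interval for v_n = 0, Eq. (2):
waveforms = []
for un in range(Md):
    for un1 in range(Md):
        psi = 4 * np.pi * h * (un * q(t) + un1 * q(t + Ts)) + w(t)
        waveforms.append(np.exp(1j * psi))
W = np.array(waveforms)

print(np.linalg.matrix_rank(W))          # -> 3 = L*(Md-1) + 1, as in Eq. (9)

# Orthonormal basis of the same subspace (SVD instead of Gram-Schmidt):
_, sv, Vh = np.linalg.svd(W)
basis = Vh[: int(np.sum(sv > 1e-9))]     # rows are orthonormal basis vectors
```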
2.3 Burst orthogonal space-time block coded CPM

Without loss of generality we consider the single full-rate Alamouti scheme (transmit antennas $n_t = 2$) as the most important representative of OSTBC. We consider a signal division corresponding to two data messages of $n_d/2$ symbols each, where $n_d$ denotes the number of data symbols in the message. Let us define $s_1(t) = s(t)$ for $0 \le t \le T_s n_d/2$ and zero otherwise, and similarly $s_2(t) = s(t + T_s n_d/2)$; $0 \le t \le T_s n_d/2$. The resulting space-time $n_t \times n_d T_s$ continuous-time signal

$$\mathbf{s}(t) = \begin{bmatrix} s_1(t) & s_2(t) \\ -s_2^*(t) & s_1^*(t) \end{bmatrix} \qquad (11)$$

can be described in the constellation space, with $s(t) \sim [\mathbf{s}_1, \mathbf{s}_2, \ldots, \mathbf{s}_{n_d}]$, $s_1(t) \sim [\mathbf{s}_1, \ldots, \mathbf{s}_{n_d/2}]$ and $s_2(t) \sim [\mathbf{s}_{n_d/2+1}, \ldots, \mathbf{s}_{n_d}]$, by the space-time codeword matrix [18]

$$\mathbf{S} = \begin{bmatrix} \mathbf{s}_1^{T} & \cdots & \mathbf{s}_{n_d/2}^{T} & -\mathbf{s}_{n_d/2+1}^{H} & \cdots & -\mathbf{s}_{n_d}^{H} \\ \mathbf{s}_{n_d/2+1}^{T} & \cdots & \mathbf{s}_{n_d}^{T} & \mathbf{s}_1^{H} & \cdots & \mathbf{s}_{n_d/2}^{H} \end{bmatrix}. \qquad (12)$$

The block structure would break the phase continuity that is essential to CPM. With long blocks, the continuity violation results only in a small spectrum broadening, but long blocks lead to long communication delays. We therefore use the termination tail technique, which forces the CPM modulator to a well-defined final state at the end of each block and preserves the phase continuity. In addition, it increases the minimal free distance, but it also introduces bits that carry no information [8]. The maximal termination tail length in symbols, obtained by choosing the tail symbols $u_n = u_{n+1} = \cdots$ so as to drive the accumulated value $v_n$ (together with the $L-1$ symbols still in the modulator memory) to zero, all sums taken modulo $p$, is

$$n_{tt} = \left\lceil \frac{p-1}{M_d-1} \right\rceil + L - 1, \qquad (13)$$

where $\lceil \cdot \rceil$ denotes rounding up to an integer.

2.4 Channel model

A MIMO system model in the constellation space describes the received signal at the $k$-th antenna, for $t \in (nT_s, (n+1)T_s)$, as

$$\mathbf{x}_n^k = \sum_{i=1}^{n_t} h_{ki}\, \mathbf{s}_n^i + \mathbf{w}_n^k. \qquad (14)$$

In the block-constant flat Rayleigh fading channel, the coefficients $h_{ki}$ are complex Gaussian zero-mean independent identically distributed (i.i.d.) random variables with unit variance. The additive white Gaussian noise (AWGN) is represented by the vector $\mathbf{w}$; it is a complex zero-mean Gaussian random variable with power $2N_0$ per dimension.

3 Optimal projection subspace

3.1 Rank and determinant criterion for STC in a flat Rayleigh fading channel

The design criteria for space-time coding in quasi-static fading were originally formulated to optimize the worst-case pairwise error probability (PWEP) [17]. Denote by $\Delta\mathbf{S} = \mathbf{S}(\mathbf{d}^{(i)}) - \mathbf{S}(\mathbf{d}^{(k)})$ a difference of two distinct codewords, and by $\mathbf{R} = \Delta\mathbf{S}\,\Delta\mathbf{S}^H$ the codeword distance matrix. The code design rules follow.

1. For the case $n_t n_r < 4$: (a) maximize the minimum rank $r$ of matrix $\mathbf{R}$, and (b) maximize the minimum product $\prod_i \lambda_i$ over all nonzero eigenvalues $\lambda_i$, over all pairs of distinct codeword differences $\Delta\mathbf{S}$.
2. For the case $n_t n_r \ge 4$: (a) make sure that the minimum rank $r$ is such that $r n_r \ge 4$, and (b) maximize the minimum trace $\sum_i \lambda_i$ of matrix $\mathbf{R}$ among all pairs of distinct codeword differences $\Delta\mathbf{S}$.

The minimum operation is over all distinct codeword pairs, and the maximization should be accomplished by a suitable STC.
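As a concrete reading of these rules, the following sketch (our own illustration, with hypothetical toy codewords; the function name is ours) evaluates the rank and the determinant/trace figure of merit for one pair of Alamouti codewords.

```python
import numpy as np

def st_criterion(S_i, S_k, nr=1):
    """Rank and determinant/trace figures for one codeword pair, following
    the design rules above. S_i, S_k: (nt x T) space-time codewords."""
    dS = S_i - S_k
    R = dS @ dS.conj().T                    # codeword distance matrix
    lam = np.linalg.eigvalsh(R)             # R is Hermitian PSD
    nz = lam[lam > 1e-12]
    r = len(nz)                             # rank
    figure = nz.prod() if r * nr < 4 else lam.sum()
    return r, figure

# Hypothetical toy example: two Alamouti codewords built from scalar symbols
# (rows = antennas, columns = time, as in Eq. (11)):
A = lambda a, b: np.array([[a, -np.conj(b)], [b, np.conj(a)]])
print(st_criterion(A(1 + 0j, -1 + 0j), A(1j, 1 + 1j)))
```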
The burst Alamouti coded CPM codeword difference matrix is, according to (12),

$$\Delta\mathbf{S} = \begin{bmatrix} \Delta\mathbf{s}_1^{T} & \cdots & \Delta\mathbf{s}_{n_d/2}^{T} & -\Delta\mathbf{s}_{n_d/2+1}^{H} & \cdots & -\Delta\mathbf{s}_{n_d}^{H} \\ \Delta\mathbf{s}_{n_d/2+1}^{T} & \cdots & \Delta\mathbf{s}_{n_d}^{T} & \Delta\mathbf{s}_1^{H} & \cdots & \Delta\mathbf{s}_{n_d/2}^{H} \end{bmatrix}. \qquad (15)$$

The codeword distance matrix then simplifies to a diagonal matrix

$$\mathbf{R} = \left(\sum_{i=1}^{n_d} \|\Delta\mathbf{s}_i\|^2\right) \mathbf{I}_{n_t}. \qquad (16)$$

We may expect this result, since OSTBC keeps orthogonality between the signals transmitted from the individual antennas. Hence all its eigenvalues are equal, $\lambda_1 = \cdots = \lambda_{n_t} = \lambda$, so this scheme has full rank and determinant equal to $\det(\mathbf{R}) = \prod_{i=1}^{n_t} \lambda_i = \lambda^{n_t}$. Note that the eigenvalues of the diagonal distance matrix $\mathbf{R}$ are equal to the Euclidean distance of uncoded CPM between $s^{(i)}(t)$ and $s^{(k)}(t)$:

$$\lambda = \sum_{i=1}^{n_d} \|\Delta\mathbf{s}_i\|^2 = d_{ik}^2. \qquad (17)$$

Both the rank-and-determinant and the rank-and-trace criterion are thus equivalent to maximization of the minimal free distance of uncoded CPM. Assuming that the minimal $d_{\min}^2$ has the strongest influence on the error rate performance at medium to high SNR, we have to find a subspace-reducing projection that preserves these minimal free distances. We may observe that the suboptimal receiver would then also be optimal for uncoded CPM.

3.2 Minimal energy error events

The minimal free distance for a general partial response ($L > 1$) CPM is commonly obtained numerically [2, 1]. Because we assume zero start and termination states, we may conclude that the free distances correspond to all trellis paths that split and merge and are otherwise equal. The set containing all signal differences forming all minimal free distances, $\Delta = \{\Delta\mathbf{s}_1, \ldots, \Delta\mathbf{s}_{n_\Delta}\}$, has $n_\Delta$ components. For example, binary 2REC $h = 1/2$ with dimension $n_s = 3$ has the free paths shown by the bold lines in Fig. 1. In the minimal path search it is sufficient to consider signals with $v_n = 0$, due to the CPM rotational symmetry.

3.3 Optimal linear orthogonal subspace projection

According to the STC code construction rules, we search for a projection subspace in which the energy of the minimal CPM path differences is maximal. In other words, we search for the best approximation of the minimal Euclidean differences by a lower-dimensional signal. The basic properties of linear projectors [5] are $\mathbf{P} = \mathbf{B}(\mathbf{B}^H\mathbf{B})^{-1}\mathbf{B}^H$, $\mathbf{P}^H = \mathbf{P}$, $\mathbf{P}^2 = \mathbf{P}$, where the matrix $\mathbf{B} = [\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_{n_p}]$ consists of a generally linearly independent basis of the projection subspace. Our task is to find such $\mathbf{B}$ that preserves all minimal free distances:

$$\hat{\mathbf{B}} = \arg\max_{\mathbf{B}} \sum_{i=1}^{n_\Delta} \|\mathbf{P}\,\Delta\mathbf{s}_i\|^2. \qquad (18)$$

We should note that a projection focusing on the minimal distances will also change the overall distance spectrum; after the projection, e.g., the second minimum might become lower than the first minimum, resulting in non-optimal performance. Hence, in general, the joint optimization problem $\hat{\mathbf{B}} = \arg\max_{\mathbf{B}} \min_{\Delta} \sum_{i} \|\mathbf{P}\,\Delta\mathbf{s}_i\|^2$ should be considered. For computational tractability, we assume that the minimal error event remains minimal after the projection, and we confirm this condition afterwards. The projection onto an $n_p$-dimensional subspace spanned by the orthonormal basis set $\{\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_{n_p}\}$ is defined by the projector $\mathbf{P} = \mathbf{B}\mathbf{B}^H$. The term in (18) simplifies as

$$\sum_{i=1}^{n_\Delta} \|\mathbf{P}\,\Delta\mathbf{s}_i\|^2 = \sum_{i=1}^{n_\Delta} \Delta\mathbf{s}_i^H \mathbf{P}\, \Delta\mathbf{s}_i = \sum_{j=1}^{n_p} \mathbf{b}_j^H \left(\sum_{i=1}^{n_\Delta} \Delta\mathbf{s}_i \Delta\mathbf{s}_i^H\right) \mathbf{b}_j.$$

The optimization problem is then

$$\hat{\mathbf{B}} = \arg\max_{\mathbf{B}} \sum_{j=1}^{n_p} \mathbf{b}_j^H \mathbf{Q}\,\mathbf{b}_j, \qquad (19)$$

where $\mathbf{Q}$ is the sum of all outer products,

$$\mathbf{Q} = \sum_{i=1}^{n_\Delta} \Delta\mathbf{s}_i\, \Delta\mathbf{s}_i^H. \qquad (20)$$
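As derived in the next steps, the maximization of (19) is achieved by the dominant eigenvectors of Q. The sketch below (our own illustration; the stand-in difference vectors are hypothetical random ones, whereas in the paper the actual Δs_i come from the minimal free paths of the chosen CPM) shows the computation end to end.

```python
import numpy as np

def optimal_projection_basis(deltas, n_p):
    """Eigenvector solution of (19)-(22): deltas is a list of constellation
    difference vectors forming the minimal free paths; returns the n_p
    dominant eigenvectors of Q plus the preserved / lost distance energy."""
    Q = sum(np.outer(d, d.conj()) for d in deltas)
    lam, V = np.linalg.eigh(Q)              # ascending eigenvalues; Q Hermitian
    lam, V = lam[::-1], V[:, ::-1]          # reorder descending
    B = V[:, :n_p]                          # optimal subspace basis, Eq. (22)
    return B, lam[:n_p].sum(), lam[n_p:].sum()   # kept vs. lost, Eq. (23)

# Hypothetical stand-in difference vectors for illustration only:
rng = np.random.default_rng(0)
deltas = [rng.standard_normal(3) + 1j * rng.standard_normal(3) for _ in range(5)]
B, kept, lost = optimal_projection_basis(deltas, n_p=2)
print(f"kept {kept:.3f}, lost {lost:.3f} of the total distance energy")
```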
The sum of $n_p$ real non-negative quadratic forms in (19) is maximized when each term is maximized. We now employ the spectral theorem, under the condition that $\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_{n_p}$ must form an orthonormal set.

Fig. 1: Binary 2REC $h = 1/2$ minimal free paths (thick lines)

The spectral theorem for quadratic forms [5] states that $\lambda_{\min} \le \mathbf{x}^H \mathbf{A}\,\mathbf{x} \le \lambda_{\max}$ for $\|\mathbf{x}\| = 1$ and $\mathbf{A}$ Hermitian, where $\lambda$ denotes an eigenvalue of $\mathbf{A}$. Here the conditions are fulfilled, since the matrix $\mathbf{Q} = \mathbf{Q}^H$ is Hermitian and $\|\mathbf{b}_j\|^2 = 1$. Thus

$$\mathbf{b}_j^H \mathbf{Q}\, \mathbf{b}_j \le \lambda_{\max}. \qquad (21)$$

For a subspace basis equal to unit eigenvectors $\mathbf{v}_i$ of $\mathbf{Q}$, the quadratic form is $\mathbf{v}_i^H \mathbf{Q}\, \mathbf{v}_i = \lambda_i\, \mathbf{v}_i^H \mathbf{v}_i = \lambda_i$, and similarly $\mathbf{v}_{\max}^H \mathbf{Q}\, \mathbf{v}_{\max} = \lambda_{\max}$. Considering (21), we have found the maximum. Assuming the eigenvalues are ordered decreasingly, $\lambda_1 \ge \cdots \ge \lambda_{n_s}$, the optimal projection basis is

$$\hat{\mathbf{B}} = [\mathbf{v}_1, \ldots, \mathbf{v}_{n_p}]. \qquad (22)$$

The minimal free distance after the projection is equal to $\sum_{i=1}^{n_p} \lambda_i$. Considering $n_p = n_s$, we conclude that the sum of all eigenvalues is equal to the total signal energy, $\sum_{i=1}^{n_\Delta} \|\Delta\mathbf{s}_i\|^2 = \sum_{i=1}^{n_s} \lambda_i$. Therefore, the error energy of the difference between a signal and its approximation is

$$\sum_{i=n_p+1}^{n_s} \lambda_i. \qquad (23)$$

As discussed above, having found all minimal free distances for zero accumulated state $v_n = 0$, we obtain the remaining minimal free paths simply by a shift of $e^{j2\pi h v_n}$. Comparing the matrix $\mathbf{Q}$, obtained from all signal differences forming all free distances, with the matrix $\tilde{\mathbf{Q}}$, which consists only of those with $v_n = 0$, we conclude that the eigenvectors are the same, because $\Delta\mathbf{s}\,\Delta\mathbf{s}^H = e^{j2\pi h v_n}\Delta\tilde{\mathbf{s}}\,\big(e^{j2\pi h v_n}\Delta\tilde{\mathbf{s}}\big)^H = \Delta\tilde{\mathbf{s}}\,\Delta\tilde{\mathbf{s}}^H$. For this reason, the eigenvectors of matrix $\mathbf{Q}$ are fully described by the free paths with $v_n = 0$, and so $n_\Delta$ depends only on $L$ and $M_d$ (it is not a function of $h$).

3.4 Sub-optimal receiver basis

It can be shown that the implemented subspace basis functions are obtained from the optimal basis in the constellation space (22) and the original orthonormal bases $\{\varphi_i(t)\}_{i=1}^{n_s}$ according to the formula

$$\hat\varphi_i(t) = \hat{\mathbf{v}}_i^{T} \begin{bmatrix} \varphi_1(t) \\ \vdots \\ \varphi_{n_s}(t) \end{bmatrix}, \quad i = 1, \ldots, n_p. \qquad (24)$$

4 Numerical results

For error performance verification we chose binary 2REC $h = 1/2$, with the free error events depicted in Fig. 1, and we assume a 2×1 MIMO Rayleigh fading channel with burst Alamouti STBC CPM. According to the notes at the end of Secs. 3.2 and 3.3, we need to investigate the trellis only for $v_n = 0$. There are 4 minimal free paths with distance $d_{\min}^2 \doteq 3.454$ that consist of the vectors $\Delta = \{\mathbf{d}_{12}, \mathbf{d}_{78}, \mathbf{d}_{23}, \mathbf{d}_{35}, \mathbf{d}_{46}\}$, where the constellation vector difference is $\mathbf{d}_{ik} = \mathbf{s}^{(i)} - \mathbf{s}^{(k)}$. The resulting matrix is $\mathbf{Q} = \mathbf{d}_{12}\mathbf{d}_{12}^H + \mathbf{d}_{78}\mathbf{d}_{78}^H + \mathbf{d}_{23}\mathbf{d}_{23}^H + \mathbf{d}_{35}\mathbf{d}_{35}^H + \mathbf{d}_{46}\mathbf{d}_{46}^H$. The corresponding eigenvalues are $\{\lambda_1, \lambda_2, \lambda_3\} = \{0.015, 0.363, 3.075\}$ (listed here in increasing order, so $\mathbf{v}_3$ is the dominant eigenvector). Hence the distance spectrum changes only negligibly (by 0.015) after the projection along $\{\mathbf{v}_2, \mathbf{v}_3\}$; see Fig. 2.
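As a quick arithmetic check of (23) with the eigenvalues just quoted (our own worked numbers, not from the paper):

$$\sum_{i=1}^{n_s} \lambda_i = 0.015 + 0.363 + 3.075 = 3.453,$$

so the projection onto $\{\mathbf{v}_2, \mathbf{v}_3\}$ discards only $0.015$, i.e. it retains $(3.453 - 0.015)/3.453 \approx 99.6\,\%$ of the minimal-free-path distance energy, while the one-dimensional projection onto $\mathbf{v}_3$ alone retains $3.075/3.453 \approx 89\,\%$. This is consistent with the negligible change of the distance spectrum noted above.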
Fig. 2: Binary 2REC $h = 1/2$ distance spectrum for error length 3, without projection and with projection to $[\mathbf{v}_2, \mathbf{v}_3]$

The abbreviations 'no', 'v3' and 'v2v3' denote the receiver without a dimensionality-reducing projection, the suboptimal receiver with projection to the subspace given by $\mathbf{v}_3$, and the one with projection to $\{\mathbf{v}_2, \mathbf{v}_3\}$, respectively. The error rate performance is shown in Fig. 3.

Fig. 3: Burst Alamouti STBC binary 2REC $h = 1/2$ error rate performance in a Rayleigh fading channel ($n_d = 16$)

An interesting simulation was also performed for a serially concatenated, bit-randomly-interleaved (5,7) convolutional code in the AWGN channel; see Fig. 4.

Fig. 4: SC binary 2REC $h = 1/2$ error rate performance in the AWGN channel ($n_d = 1024$)

5 Conclusions and discussion

We have analytically found a procedure for designing a suboptimal reduced-dimensionality receiver for burst OSTBC CPM, based on the standard space-time code construction rules. We observed in the derivations that this procedure can also be applied to a suboptimal receiver for uncoded CPM. The derivation was carried out fully in the constellation space and is based on linear orthogonal projection. The whole problem leads to an eigenvector solution, where the corresponding eigenvalues describe the distribution of the minimal free distance energy over the orthogonal eigenvector subspaces. The actual values of these eigenvalues for a given CPM describe the possible dimensionality reduction with little or no impact on performance. It has been shown that for the minimal free paths and the optimal eigenvector search it is sufficient to consider the zero accumulated phase, $v_n = 0$. The suggested procedure was simulated for a specific CPM variant, binary 2REC $h = 1/2$, with results corresponding to the analytical solution.

Acknowledgment

I wish to thank prof. Ing. Jan Sýkora, CSc., who gave me the opportunity to work on up-to-date research in cutting-edge technology, gave me full support and started my professional growth. I have enjoyed close collaboration with him as well as with the other members of the Digital Radio Communication (DiRaC) group.

References

[1] Aulin, T., Rydbeck, N., Sundberg, C.-E. W.: Continuous phase modulation, Part II: Partial response signaling. IEEE Transactions on Communications, vol. 29 (1981), no. 3, p. 210–225.
[2] Aulin, T., Sundberg, C.-E. W.: Continuous phase modulation, Part I: Full response signaling. IEEE Transactions on Communications, vol. 29 (1981), no. 3, p. 196–209.
[3] Huber, J., Liu, W.: An alternative approach to reduced complexity CPM receivers. IEEE Journal on Selected Areas in Communications, vol. 7 (1989), p. 1427–1436.
[4] Laurent, P. A.: Exact and approximate construction of digital phase modulations by superposition of amplitude modulated pulses (AMP). IEEE Transactions on Communications, vol. 34 (1986), no. 2, p. 150–160.
[5] Meyer, C. D.: Matrix Analysis and Applied Linear Algebra. SIAM, 2000.
[6] Moqvist, P., Aulin, T.: Orthogonalization by principal components applied to CPM. IEEE Transactions on Communications, vol. 51 (2003), p. 1838–1845.
[7] Rimoldi, B. E.: A decomposition approach to CPM. IEEE Transactions on Information Theory, vol. 34 (1988), no. 2, p. 260–270.
[8] Silvester, A. M., Schober, R., Lampe, L.: Burst-based orthogonal ST block coding for CPM. In Proc. IEEE Global Telecommunications Conference (GLOBECOM), 2005.
[9] Simmons, S. J.: Simplified coherent detection of CPM. IEEE Transactions on Communications, vol. 43 (1995), no. 2/3/4, p. 726–728.
[10] Svensson, A., Sundberg, C.-E., Aulin, T.: A class of reduced complexity Viterbi detectors for partial response continuous phase modulation. IEEE Transactions on Communications, vol. 32 (1984), p. 1079–1087.
[11] Sykora, J.: Information waveform manifold based preprocessing for nonlinear multichannel modulation in MIMO channel. In Proc. IEEE Global Telecommunications Conference (GLOBECOM), St. Louis, USA, November 2005, p. 1–6.
[12] Sykora, J.: Linear subspace projection and information waveform manifold based preprocessing for nonlinear multichannel modulation in MIMO channel. In Proc. Int. Conf. on Telecommunications (ICT), Cape Town, South Africa, May 2005. Invited paper, p. 1–6.
[13] Sykora, J.: Receiver constellation waveform subspace preprocessing for burst Alamouti block STC CPM modulation. In Proc. IEEE Wireless Communications and Networking Conference (WCNC), March 2007, p. 1–5.
[14] Sykora, J., Hekrdla, M.: Determinant criterion optimizing linear subspace projector for burst orthogonal STC CPM modulation in MIMO channel. IEEE VTC, Spring 2009.
[15] Tang, W., Shwedyk, E.: A quasi-optimum receiver for continuous phase modulation. IEEE Transactions on Communications, vol. 48 (2000), p. 1087–1090.
[16] Tarokh, V., Jafarkhani, H., Calderbank, A. R.: Space-time block codes from orthogonal designs. IEEE Transactions on Information Theory, vol. 45 (1999), no. 5, p. 1456–1467.
[17] Tarokh, V., Seshadri, N., Calderbank, A. R.: Space-time codes for high data rate wireless communication: Performance criterion and code construction. IEEE Transactions on Information Theory, vol. 44 (1998), no. 6, p. 744–765.
[18] Vucetic, B., Yuan, J.: Space-Time Coding. John Wiley & Sons, 2003.
[19] Zajic, A., Stuber, G.: A space-time code design for partial-response CPM: Diversity order and coding gain. IEEE ICC, 2007.
[20] Zhang, X., Fitz, M. P.: Space-time code design with continuous phase modulation. IEEE Journal on Selected Areas in Communications, vol. 21 (2003), no. 5, p. 783–792.

Miroslav Hekrdla
E-mail: hekrdm2@fel.cvut.cz
Department of Radioelectronics, Faculty of Electrical Engineering, Czech Technical University in Prague, Technická 2, 166 27 Prague 6, Czech Republic
Simulation of hyperthermic treatment using the matrix of stripline applicators

B. Vrbová, L. Víšek

Abstract: This paper describes the design of a microwave stripline applicator for hyperthermic treatment, and the design of an anatomically based biological model, which is a necessary part of hyperthermia treatment planning for computing the distribution of SAR. In this paper we compare the SAR distribution in a cylindrical homogeneous agar phantom (which has similar characteristics to biological tissue) and in an anatomically based biological model of the femur (developed from a computer tomography scan), using a matrix of two applicators of the same type.

Keywords: microwave thermotherapy, SAR, cancer treatment, anatomically based models, hyperthermia applicators.

1 Introduction

Thermotherapy is a method based on differences in the behavior of healthy tissue and tumor tissue at elevated temperatures. One of the most important methods of microwave thermotherapy is hyperthermia. This method works in a temperature interval between 41 and 45 °C, in which cancer cells are destroyed while healthy cells are able to survive up to 45 °C. If the temperature is increased above 45 °C, coagulation of healthy cells occurs. The self-protective mechanism of tumors more than 2 cm in diameter already fails at a temperature of 41 °C: the blood flow in tumor cells decreases with increasing temperature, and so the temperature in the tumor increases even more rapidly. Finally, the tumor tissue is destroyed. The duration of a single hyperthermic treatment should not exceed 50 minutes; the level of the hyperthermic dose depends on temperature and time [1].

A hyperthermic set consists of a high-power generator, thermal sensors and, especially, the applicator.
An electromagnetic wave is generated and the applicator delivers it to the biological tissue, where it is absorbed, because biological tissue is a lossy dielectric material. Applicators are mainly designed for a working frequency of 434 MHz, which is one of the ISM (industrial, scientific, medical) dedicated frequencies [2].

The distribution of the electromagnetic field in the biological tissue can be calculated using a simple, homogeneous model of tissue, which consists of only one type of tissue with defined electric parameters. However, real tissue is much more complicated: the electromagnetic field diffuses through several types of tissue, so simulations with anatomical models are used for more realistic and more accurate results. These models can also be used for hyperthermia treatment planning. Treatment planning is a very important procedure in hyperthermia. Before the patient is treated by hyperthermia, several calculations are performed to find out the distribution of the absorbed power in the particular case. The best type of applicator for the treatment is selected according to the location of the tumor. In cases where the tumor is located close to vital organs or other critical areas, calculations with anatomical models of patients are necessary.

2 Hyperthermia applicator

The basic parameters of a TEM wave depend mainly on the dielectric parameters of the medium through which it propagates (complex permittivity ε) and on the working frequency, but not on the cross-section dimensions or on the type of transmission line [3]. The applicator discussed here works at a frequency of 434 MHz and is made from a highly conductive material (copper). The sides of the applicator are made from acrylic glass. The TEM wave is transferred along a section of the microwave stripline transmission line with cross-section dimensions of 50 × 30 mm and length 80 mm. The horn section of the applicator is 80 mm in length, and the horn aperture is 120 × 80 mm (Fig. 1) [4].

Fig. 1: Model of a stripline applicator

The length of the applicator is equal to twice the wavelength, which depends on the relative permittivity of the biological tissue (εr = 78). To avoid reflection of the wave back to the generator, the applicator is filled with a suitable dielectric material; in our case distilled water, because about 70 % of biological tissue consists of water. A water bolus is inserted between the applicator and the tissue to improve the transmission of the electromagnetic energy into the tissue, which improves the thermal profile in the biological tissue.

Impedance matching is very important for preventing reflections of electromagnetic energy back to the generator, since reflected energy could destroy the high-power output generator. The basic condition for impedance matching is that the VSWR has to be lower than 2. The impedance matching of the applicator was simulated using an FDTD simulator (the SEMCAD X EM field simulator from SPEAG, Schmid & Partner Engineering AG, Switzerland [6]) in order to find the position and the length of the exciting probe. The reflection coefficient at the working frequency is s11 = −32.3 dB, which corresponds to VSWR = 1.05. The length of the exciting probe is 27 mm and its distance from the short-circuited plane is 15 mm.
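The quoted VSWR follows directly from the reflection coefficient via the standard transmission-line relations. A minimal sketch of the conversion (our own illustration, not code from the paper):

```python
def s11_db_to_vswr(s11_db: float) -> float:
    """Convert a return loss s11 [dB] to VSWR via |Gamma| = 10**(s11/20)."""
    gamma = 10 ** (s11_db / 20.0)            # magnitude of reflection coefficient
    return (1 + gamma) / (1 - gamma)

print(round(s11_db_to_vswr(-32.3), 2))       # -> 1.05, matching the simulated value
```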
3 Anatomically based biological model

An anatomically based biological model is essential for numerical dosimetry. The numerical model is usually developed from CT scans (Fig. 2).

Fig. 2: Computed tomography scan

Dosimetry is used in the design and application of hyperthermia applicators. Dosimetry quantifies the magnitude and distribution of the absorbed electromagnetic energy within biological objects exposed to electromagnetic fields. In RF, the dosimetric quantity, referred to as the specific absorption rate (SAR), is defined as the rate at which energy is absorbed per unit mass. SAR is determined not only by the incident electromagnetic waves but also by the electrical and geometric characteristics of the irradiated subject and nearby objects. It is related to the internal electric field strength as well as to the electric conductivity and the density of the tissues. It is therefore a suitable dosimetric parameter even when the mechanism is determined to be "athermal". SAR distributions are usually determined from calculations on human models. Rapid progress in computing has enabled high-level numerical dosimetry to be performed with the aid of high-resolution biological models. The finite-difference time-domain (FDTD) method is currently the most widely used numerical RF dosimetry method [5].

In order to develop a model for numerical dosimetry, the original gray-scale data must be interpreted into tissue types. This process is known as segmentation. Segmentation of medical images involves partitioning the data into contiguous regions representing individual anatomical objects. It is a prerequisite for further investigations in many computer-assisted medical applications, e.g. individual therapy planning and evaluation, diagnosis, simulation and image-guided surgery. Segmentation is a difficult task, because in most cases it is very hard to separate the object from the image background; this is due to the characteristics of the imaging process as well as the grey-value mappings of the objects themselves. The most common medical image acquisition modalities include CT (computer tomography) and MRI (magnetic resonance imaging). MRI or CT provides gray-scale image data in the form of many transverse slices, at a designated spacing, from the head to the feet of the biological body; the resolution in each slice is of the order of several millimeters. The gray-scale images are first rescaled to produce appropriate voxels. Each voxel in the images is then identified rigorously as belonging to one type of tissue. This is done by assigning to each voxel a red-green-blue code that identifies the discrete tissue type of that particular voxel; the process can be performed with commercial software [7]. All identified transverse images are then combined to obtain a three-dimensional numerical model. Fine adjustment is generally required to connect the slices smoothly in the three orthogonal planes (axial, sagittal and coronal).

DICOM CT scans for a femur model were obtained from Bulovka University Hospital. The femur model has a resolution of 2 mm, meaning a voxel size of 2 × 2 × 2 mm. Each voxel was assigned to one of 3 different tissue types, i.e. muscle, fat and bone (Fig. 3).

Fig. 3: Anatomical model of a femur

4 Matrix composition of two applicators

When tumours cover a large area of the human body, e.g. on the arm, leg, back or stomach, we can use applicators of bigger dimensions, or we can create a matrix of several smaller applicators. In our case, we chose a matrix composed of two applicators of the same type. In the first case we put the two applicators on a cylindrical agar phantom, next to each other (Fig. 4).
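Since SAR is determined by the internal field strength, the conductivity and the density, a per-voxel evaluation is straightforward. The sketch below is our own illustration using the standard relation SAR = σ|E|²/ρ; the mass densities are typical textbook values assumed for the example, not data from the paper.

```python
import numpy as np

# Minimal per-voxel SAR evaluation, SAR = sigma * |E|^2 / rho  [W/kg].
# Conductivities as in Table 1 below; densities are assumed textbook values.
TISSUES = {
    # name: (conductivity sigma [S/m], assumed density rho [kg/m^3])
    "muscle": (0.80, 1050.0),
    "fat":    (0.04,  920.0),
    "bone":   (0.09, 1900.0),
}

def sar(e_rms: np.ndarray, tissue_ids: np.ndarray) -> np.ndarray:
    """e_rms: RMS internal E-field magnitude per voxel [V/m];
    tissue_ids: per-voxel tissue name array of the same shape."""
    sigma = np.vectorize(lambda name: TISSUES[name][0])(tissue_ids)
    rho = np.vectorize(lambda name: TISSUES[name][1])(tissue_ids)
    return sigma * e_rms**2 / rho

# Toy 2x2 voxel patch:
E = np.array([[30.0, 10.0], [25.0, 5.0]])
ids = np.array([["muscle", "fat"], ["muscle", "bone"]])
print(sar(E, ids))   # low SAR in fat/bone, higher in muscle, as in Figs. 6-9
```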
Fig. 4: Matrix of two applicators on a cylindrical agar phantom

This homogeneous agar phantom represents a femur from a human body; its dielectric properties are shown in Table 1. The radius of the cylindrical agar phantom is 8 cm. In the second case, we placed the matrix of two applicators on the anatomically based biological model (Fig. 5), the dielectric parameters of which are also listed in Table 1.

Table 1: Dielectric properties at a frequency of 434 MHz [8]

Name          | Conductivity [S/m] | Relative permittivity
Agar          | 0.80               | 54.00
Cortical bone | 0.09               | 13.07
Muscle        | 0.80               | 56.86
Fat           | 0.04               | 5.56

Fig. 5: Matrix composition of two applicators on an anatomically based biological model

5 Results

The results of the SAR distribution simulations are shown in the following figures. The maximum of the SAR distribution is situated in the middle between the two applicators in the homogeneous agar phantom (Fig. 6).

Fig. 6: SAR distribution in the agar phantom in the longitudinal cutting plane

A comparison of the SAR distribution in the agar phantom and in the anatomical model shows that the SAR distributions reach their maximum at the same locations, but the shape of the SAR distribution in the anatomical phantom is influenced by the fat and bone (Fig. 7).

Fig. 7: SAR distribution in the anatomical model in the longitudinal cutting plane

Fat has a lower permittivity value than muscle, and therefore only a small part of the energy is absorbed in the fat; most of the energy passes on into the next layer, the muscle. Muscle behaves as a lossy environment, so the energy is absorbed there. Bones also affect the shape of the SAR distribution, as can be seen in the transversal section of the femur (Fig. 8).

Fig. 8: SAR distribution in the transversal cutting plane of the anatomical model

Bones, like fat, have a low permittivity value (they contain a small amount of water), but the SAR value there is almost equal to zero, because bone behaves as a practically lossless environment and most of the energy passes through the bones. Fig. 9 shows the shape of the SAR distribution in the transversal layer of the homogeneous agar phantom. The simulations of the SAR distribution show that microwave stripline applicators can be used to treat tumors located under the surface of the tissue. Simulation of the SAR distribution in an agar phantom is used for testing applicators. However, in cases of hyperthermic treatment planning where the tumor is located close to vital organs or other critical areas, the calculations must be made with an anatomical model of the patient.

Fig. 9: SAR distribution in the transversal cutting plane of an agar phantom
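The qualitative fat/muscle/bone contrast described above can be made quantitative with a standard plane-wave attenuation calculation from the Table 1 parameters. This back-of-the-envelope sketch is our own illustration and is not part of the paper's FDTD simulations.

```python
import numpy as np

F = 434e6                      # working frequency [Hz]
EPS0 = 8.854e-12
OMEGA = 2 * np.pi * F

def alpha_np_per_m(sigma, eps_r):
    """Plane-wave attenuation constant in a lossy medium:
    alpha = omega*sqrt(mu0*eps0*eps_r/2)*sqrt(sqrt(1+tan_d^2)-1)  [Np/m]."""
    tan_d = sigma / (OMEGA * EPS0 * eps_r)
    mu0 = 4e-7 * np.pi
    return OMEGA * np.sqrt(mu0 * EPS0 * eps_r / 2) * np.sqrt(np.sqrt(1 + tan_d**2) - 1)

for name, sigma, eps_r in [("muscle", 0.80, 56.86), ("fat", 0.04, 5.56),
                           ("cortical bone", 0.09, 13.07)]:
    a = alpha_np_per_m(sigma, eps_r)
    print(f"{name:14s} alpha = {a:5.1f} Np/m, 1/alpha = {100 / a:4.1f} cm")
```

With the Table 1 values, muscle attenuates roughly an order of magnitude more strongly than fat or bone, which is consistent with the energy being deposited mainly in the muscle layer.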
6 Conclusions

A set of several applicators can be used in clinical practice to treat tumors covering a large area of the human body. In the future, the treatment of tumors can be improved further by setting the amplitudes and phases of the individual applicators.

Acknowledgement

This research has been supported by the Grant Agency of the Czech Republic project "Non-standard application of physical fields – analogy, modelling, verification and simulation" (102/08/H081).

References

[1] Falk, H. M., Issels, R. D.: Hyperthermia in oncology. International Journal of Hyperthermia, vol. 17, no. 1, p. 1–18, 2001.
[2] Vrba, J.: Medical Applications of Microwaves. Prague (Czech Republic), 2007. ISBN 978-80-01-02705-9.
[3] Vrba, J.: Introduction to Microwave Technology. Prague (Czech Republic), 2007. ISBN 978-80-01-03670-9.
[4] Vrbová, B.: Microwave stripline applicator for local thermotherapy. Diploma thesis, Prague, 2009.
[5] Fujiwara, O., Wang, J.: Electromagnetics in Biology. Springer Japan, chapter on radio-frequency dosimetry and exposure systems, 2006, p. 223–225.
[6] Schmid & Partner Engineering AG, [online], URL: http://www.semcad.com/, 2009.
[7] 3D-Doctor, 3D imaging, modelling, rendering and measurement software, [online], URL: http://www.ablesw.com/3d-doctor, 2009.
[8] Gabriel, C.: Compilation of the dielectric properties of body tissues at RF and microwave frequencies. Brooks Air Force Technical Report, AL/OE-TR-1996-0037.

About the authors

Barbora Vrbová was born in Nové Zámky, Slovakia, on March 28, 1984. She received her MSc degree in biomedical engineering from the Czech Technical University in Prague in 2009. She works on radiometric methods for verifying the biological effects of EM fields.

Lukáš Víšek was born in Vysoké Mýto in 1982 and received his MSc degree from the Czech Technical University in Prague in February 2006. He is currently a postgraduate student at the Department of Electromagnetic Field at the Faculty of Electrical Engineering. His present work is on developing an exposure system for unrestrained small animals, which will serve for research on the non-thermal effects of electromagnetic fields, and on hyperthermic applicators.

Barbora Vrbová, Lukáš Víšek
E-mail: vrbovbar@fel.cvut.cz, visekluk@fel.cvut.cz
Dept. of Electromagnetic Field, Faculty of Electrical Engineering, Czech Technical University, Technická 2, 166 27 Praha, Czech Republic

Sharply orthocomplete effect algebras

M. Kalina, J. Paseka, Z. Riečanová

Abstract: Special types of effect algebras E, called sharply dominating and S-dominating, were introduced by S. Gudder in [7, 8]. We prove statements about connections between sharp orthocompleteness, sharp dominancy and completeness of E. Namely, we prove that in every sharply orthocomplete S-dominating effect algebra E, the set of sharp elements and the center of E are complete lattices, bifull in E. If an Archimedean atomic lattice effect algebra E is sharply orthocomplete, then it is complete.

Keywords: effect algebra, sharp element, central element, block, sharply dominating, S-dominating, sharply orthocomplete.

1 Introduction

An algebraic structure called an effect algebra was introduced by D. J. Foulis and M. K. Bennett (1994). The advantage of effect algebras is that they provide a mechanism for studying quantum effects; more generally, in non-classical probability theory their elements represent events that may be unsharp or pairwise non-compatible. Lattice effect algebras are, in some sense, the nearest common generalization of orthomodular lattices [13], which may include non-compatible pairs of elements, and of MV-algebras [3], which may include unsharp elements. More precisely, a lattice effect algebra E is an orthomodular lattice iff every element of E is sharp (i.e., x and "non x" are disjoint), and it is an MV-effect algebra iff every pair of elements of E is compatible. Moreover, in every lattice effect algebra E the set of sharp elements is an orthomodular lattice ([10]), and E is a union of its blocks (i.e., maximal subsets of pairwise compatible elements, which are MV-effect algebras; see [21]). Thus a lattice effect algebra E is a Boolean algebra iff every pair of elements is compatible and every element of E is sharp. However, a non-lattice ordered effect algebra E is so general that its set S(E) of sharp elements may form neither an orthomodular lattice nor any other regular algebraic structure.
S. Gudder (see [7, 8]) introduced special types of effect algebras E, called sharply dominating effect algebras, whose set S(E) of sharp elements forms an orthoalgebra, and so-called S-dominating effect algebras, whose set S(E) of sharp elements forms an orthomodular lattice. In [7], S. Gudder showed that the standard Hilbert space effect algebra E(H) of bounded operators on a Hilbert space H between the zero and identity operators (with the partially defined usual operation +) is S-dominating. Hence S-dominating effect algebras may be useful abstract models for sets of quantum effects in physical systems. We study these two special kinds of effect algebras. We show properties of some remarkable sub-effect algebras of such effect algebras E satisfying the condition that E is sharply orthocomplete; namely, properties of their blocks, their sets of sharp elements and their centers. It is worth noting that it was proved in [11] that there are even Archimedean atomic MV-effect algebras which are not sharply dominating, hence not S-dominating.

2 Basic definitions and some known facts

Definition 1 ([4]) A partial algebra (E; ⊕, 0, 1) is called an effect algebra if 0, 1 are two distinct elements and ⊕ is a partially defined binary operation on E which satisfies the following conditions for any x, y, z ∈ E:

(Ei) x ⊕ y = y ⊕ x if x ⊕ y is defined,
(Eii) (x ⊕ y) ⊕ z = x ⊕ (y ⊕ z) if one side is defined,
(Eiii) for every x ∈ E there exists a unique y ∈ E such that x ⊕ y = 1 (we put x′ = y),
(Eiv) if 1 ⊕ x is defined then x = 0.

We often denote the effect algebra (E; ⊕, 0, 1) briefly by E. On every effect algebra E, a partial order ≤ and a partial binary operation ⊖ can be introduced as follows: x ≤ y and y ⊖ x = z iff x ⊕ z is defined and x ⊕ z = y. If E with the defined partial order is a lattice (a complete lattice), then (E; ⊕, 0, 1) is called a lattice effect algebra (a complete lattice effect algebra).

(This paper is a contribution to the proceedings of the 6th microconference "Analytic and Algebraic Methods VI".)

Definition 2 Let E be an effect algebra. Then Q ⊆ E is called a sub-effect algebra of E if

(i) 1 ∈ Q,
(ii) if, out of elements x, y, z ∈ E with x ⊕ y = z, two are in Q, then x, y, z ∈ Q.

If E is a lattice effect algebra and Q is a sub-lattice and a sub-effect algebra of E, then Q is called a sub-lattice effect algebra of E. Note that a sub-effect algebra Q (sub-lattice effect algebra Q) of an effect algebra E (of a lattice effect algebra E), with the inherited operation ⊕, is an effect algebra (lattice effect algebra) in its own right.

For an element x of an effect algebra E we write ord(x) = ∞ if nx = x ⊕ x ⊕ ⋯ ⊕ x (n times) exists for every positive integer n, and we write ord(x) = n_x if n_x is the greatest positive integer such that n_x x exists in E. An effect algebra E is Archimedean if ord(x) < ∞ for all x ∈ E. A minimal nonzero element of an effect algebra E is called an atom, and E is called atomic if under every nonzero element of E there is an atom.

For a poset P and its subposet Q ⊆ P we denote, for all X ⊆ Q, by ⋁_Q X the join of the subset X in the poset Q, whenever it exists.

We say that a finite system F = (x_k)_{k=1}^n of not necessarily different elements of an effect algebra (E; ⊕, 0, 1) is orthogonal if x₁ ⊕ x₂ ⊕ ⋯ ⊕ x_n (written ⨁_{k=1}^n x_k or ⨁F) exists in E. Here we define x₁ ⊕ x₂ ⊕ ⋯ ⊕ x_n = (x₁ ⊕ x₂ ⊕ ⋯ ⊕ x_{n−1}) ⊕ x_n, supposing that ⨁_{k=1}^{n−1} x_k is defined and ⨁_{k=1}^{n−1} x_k ≤ x′_n. We also define ⨁∅ = 0.
An arbitrary system G = (x_κ)_{κ∈H} of not necessarily different elements of E is called orthogonal if ⨁K exists for every finite K ⊆ G. We say that for an orthogonal system G = (x_κ)_{κ∈H} the element ⨁G (more precisely ⨁_E G) exists iff ⋁{⨁K | K ⊆ G finite} exists in E, and then we put ⨁G = ⋁{⨁K | K ⊆ G finite}. (Here we write G₁ ⊆ G iff there is H₁ ⊆ H such that G₁ = (x_κ)_{κ∈H₁}.) We call an effect algebra E orthocomplete [9] if every orthogonal system G = (x_κ)_{κ∈H} of elements of E has the sum ⨁G. It is known that every orthocomplete Archimedean lattice effect algebra E is a complete lattice (see [22, Theorem 2.6]).

Recall that elements x, y of a lattice effect algebra E are called compatible (written x ↔ y) iff x ∨ y = x ⊕ (y ⊖ (x ∧ y)) (see [15]). P ⊆ E is a set of pairwise compatible elements if x ↔ y for all x, y ∈ P. M ⊆ E is called a block of E iff M is a maximal subset of pairwise compatible elements. Every block of a lattice effect algebra E is a sub-effect algebra and a sub-lattice of E, and E is a union of its blocks (see [21]). A lattice effect algebra with a unique block is called an MV-effect algebra. Every block of a lattice effect algebra is an MV-effect algebra in its own right. An element w of an effect algebra E is called sharp (see [7, 8]) if w ∧ w′ = 0.

Definition 3 ([7, 8]) An effect algebra E is called sharply dominating if for every x ∈ E there exists x̂ ∈ S(E) such that x̂ = ⋀_E {w ∈ S(E) | x ≤ w} = ⋀_{S(E)} {w ∈ S(E) | x ≤ w}.

Note that, clearly, E is sharply dominating iff for every x ∈ E there exists x̃ ∈ S(E) such that x̃ = ⋁_E {w ∈ S(E) | w ≤ x} = ⋁_{S(E)} {w ∈ S(E) | w ≤ x}. A sharply dominating effect algebra E is called S-dominating [8] if x ∧ w exists for every x ∈ E and every w ∈ S(E).

It is a well-known fact that in every S-dominating effect algebra E the subset S(E) = {w ∈ E | w ∧ w′ = 0} of sharp elements of E is a sub-effect algebra of E which is an orthomodular lattice (see [8, Theorem 2.6]). Moreover, if for D ⊆ S(E) the element ⋁_E D exists, then ⋁_E D ∈ S(E), hence ⋁_{S(E)} D = ⋁_E D. We say that S(E) is a full sublattice of E (see [10]).

Let G be a sub-effect algebra of an effect algebra E. We say that G is bifull in E if, for any D ⊆ G, the element ⋁_G D exists iff the element ⋁_E D exists, and then they are equal. Clearly, any bifull sub-effect algebra of E is full, but not conversely (see [12]).

The notion of a central element of an effect algebra E was introduced by Greechie, Foulis and Pulmannová [6]. An element c ∈ E is called central (see [18]) iff for every x ∈ E there exist x ∧ c and x ∧ c′, and x = (x ∧ c) ∨ (x ∧ c′). The center C(E) of E is the set of all central elements of E; moreover, C(E) is a Boolean algebra, see [6]. If E is a lattice effect algebra, then z ∈ E is central iff z ∧ z′ = 0 and z ↔ x for all x ∈ E, see [19]. Thus in a lattice effect algebra E, C(E) = B(E) ∩ S(E), where B(E) = ⋂{M ⊆ E | M is a block of E} is called the compatibility center of E.

An effect algebra E is called centrally dominating (see also [5] for the notion of a central cover) if for every x ∈ E there exists c_x ∈ C(E) such that c_x = ⋀_E {c ∈ C(E) | x ≤ c} = ⋀_{C(E)} {c ∈ C(E) | x ≤ c}.

An element a of a lattice L is called compact iff, for any D ⊆ L, a ≤ ⋁D implies a ≤ ⋁F for some finite F ⊆ D. A lattice L is called compactly generated iff every element of L is a join of compact elements.
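To make the definitions concrete, the following small brute-force sketch (our own toy example, not from the paper) encodes the four-element MV-effect-algebra chain 0 < a < 2a < 3a = 1 and computes its sharp and central elements; as expected, S(E) = C(E) = {0, 1}.

```python
# Toy effect algebra: the chain {0, a, 2a, 3a = 1}, encoded as {0, 1, 2, 3},
# with x (+) y defined iff x + y <= 3 (an MV-effect algebra, a single block).
ELEMS = range(4)
TOP = 3

def oplus(x, y):
    """Partial operation (+): defined iff the sum stays below the top."""
    return x + y if x + y <= TOP else None

def comp(x):
    """The unique y with x (+) y = 1, axiom (Eiii)."""
    return TOP - x

def meet(x, y):
    """Lattice meet; in a chain this is simply the minimum."""
    return min(x, y)

# Sharp elements: w with w /\ w' = 0.
sharp = [w for w in ELEMS if meet(w, comp(w)) == 0]
print(sharp)    # -> [0, 3]; a and 2a are unsharp, e.g. a /\ a' = a /\ 2a = a != 0

# Central elements: x = (x /\ c) v (x /\ c') for every x; for central c the
# join in the definition is the orthogonal sum, used as the test here.
central = [c for c in ELEMS
           if all(oplus(meet(x, c), meet(x, comp(c))) == x for x in ELEMS)]
print(central)  # -> [0, 3], i.e. C(E) = B(E) /\ S(E) = S(E) in this chain
```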
3 Sharply orthocomplete effect algebras

In an effect algebra E, the set S(E) = {x ∈ E | x ∧ x′ = 0} of sharp elements plays an important role. In some sense we can say that an effect algebra E is a "smeared set S(E)" of its sharp elements, while unsharp effects are important in studies of unsharp measurements [4, 2]. S. Gudder proved (see [8]) that in the standard Hilbert space effect algebra E(H) of bounded operators A on a Hilbert space H between the null and identity operators, endowed with the usual + defined iff A + B is in E(H), the set S(E(H)) of sharp elements forms the orthomodular lattice of projection operators on H. Further, in [8, Theorem 2.2] it was shown that in every sharply dominating effect algebra the set S(E) is a sub-effect algebra of E. Moreover, in [7, Theorem 2.6] it is proved that in every S-dominating effect algebra E the set S(E) is an orthomodular lattice. We are going to show that in this case S(E) is bifull in E.

Theorem 1 Let E be an S-dominating effect algebra. Then S(E) is bifull in E.

Proof. Let S ⊆ S(E).

(1) Assume that z = ⋁_{S(E)} S ∈ S(E) exists. Let us show that z is the least upper bound of S in E. Let y ∈ E be an upper bound of S. Then y ∧ z exists and it is an upper bound of S as well. Hence, for any s ∈ S, s ≤ y ∧ z. As E is sharply dominating, there exists a greatest sharp element (y ∧ z)~ ≤ y ∧ z. This yields that s ≤ (y ∧ z)~ ≤ y ∧ z for all s ∈ S, with (y ∧ z)~ ∈ S(E). Hence z ≤ (y ∧ z)~ ≤ y ∧ z ≤ z. Then z = y ∧ z ≤ y, i.e., z is really the least upper bound of S in E.

(2) Conversely, let z = ⋁_E S ∈ E exist. Let y ∈ S(E) be an upper bound of S in S(E). Then y ∧ z exists and it is again an upper bound of S. As in (1), we have that (y ∧ z)~ is the greatest sharp element under y ∧ z, and hence s ≤ (y ∧ z)~ ≤ y ∧ z ≤ z for all s ∈ S. This gives that z = (y ∧ z)~ ∈ S(E). Thus z = ⋁_{S(E)} S ∈ S(E).

Corollary 1 If E is a sharply dominating lattice effect algebra, then S(E) is bifull in E.

Definition 4 An effect algebra E is called sharply orthocomplete (centrally orthocomplete, see [5]) if for any system (x_κ)_{κ∈H} of elements of E for which there exists an orthogonal system (w_κ)_{κ∈H}, w_κ ∈ S(E), with x_κ ≤ w_κ, κ ∈ H (an orthogonal system (c_κ)_{κ∈H}, c_κ ∈ C(E), with x_κ ≤ c_κ, κ ∈ H), there exists ⨁{x_κ | κ ∈ H} = ⋁_E {⨁_E {x_κ | κ ∈ F} | F ⊆ H, F finite}.

Theorem 2 Let E be a sharply orthocomplete S-dominating effect algebra. Then

(i) S(E) is a complete orthomodular lattice, bifull in E.
(ii) C(E) is a complete Boolean algebra, bifull in E.
(iii) E is centrally dominating and centrally orthocomplete.
(iv) If C(E) is atomic, then ⋁_E {p ∈ C(E) | p an atom of C(E)} = 1.

Proof. (i): From [8, Theorem 2.6] we know that S(E) is an orthomodular lattice and a sub-lattice effect algebra of E. Let us show that S(E) is orthocomplete. Let S ⊆ S(E) be orthogonal. Then for every finite F ⊆ S we have ⨁_E F = ⋁_E F = ⋁_{S(E)} F ∈ S(E). Moreover, every s ∈ S is dominated by itself, s ≤ s with s ∈ S(E), so the orthogonal sharp system required in Definition 4 is S itself. Since S(E) is bifull in E by Theorem 1, and E is sharply orthocomplete, we have that ⨁_E S = ⋁_E S = ⋁_{S(E)} S ∈ S(E) exists. Since S(E) is an Archimedean lattice effect algebra, we obtain from [22, Theorem 2.6] that S(E) is complete.

(ii): As C(E) = {x ∈ E | y = (y ∧ x) ∨ (y ∧ x′) for every y ∈ E}, we obtain that 1 = x ∨ x′ for every x ∈ C(E), and by the de Morgan laws 0 = x ∧ x′ for every x ∈ C(E). Hence C(E) ⊆ S(E). It follows by (i) that, for any Q ⊆ C(E), there exists ⋁_{S(E)} Q = ⋁_E Q ∈ C(E), because C(E) is full in E; hence ⋁_{C(E)} Q = ⋁_E Q. By the de Morgan laws there exists ⋀_E Q = (⋁_E Q′)′, where evidently Q′ = {q′ ∈ E | q ∈ Q} ⊆ C(E). Hence ⋀_E Q ∈ C(E), which gives ⋀_{C(E)} Q = ⋀_E Q (see also [5]).
Using (ii), let us put c_x = ∧_{C(E)} {c ∈ C(E) | x ≤ c} ∈ C(E). Since C(E) is bifull in E, we have c_x = ∧_E {c ∈ C(E) | x ≤ c} (see again [5]). Since C(E) ⊆ S(E), we immediately obtain that E is centrally orthocomplete.

(iv): Since C(E) is an atomic Boolean algebra, we have ∨_{C(E)} {p ∈ C(E) | p an atom of C(E)} = 1. As C(E) is bifull in E, ∨_E {p ∈ C(E) | p an atom of C(E)} = ∨_{C(E)} {p ∈ C(E) | p an atom of C(E)} = 1.

4 Sharply orthocomplete lattice effect algebras

M. Kalina [12] has shown that even in an Archimedean atomic lattice effect algebra E with atomic center C(E), the join of the atoms of C(E) computed in E need not equal 1. The following examples and theorems show connections between sharp orthocompleteness, sharp dominancy and completeness of an effect algebra E, as well as the bifullness of S(E), C(E) and of atomic blocks in a lattice effect algebra E. It is worth noting that if S(E) = {0, 1}, then E is evidently S-dominating and sharply orthocomplete.

Example 1: A compactly generated sharply orthocomplete MV-effect algebra that is not complete. It is enough to take the Chang MV-effect algebra E = {0, a, 2a, 3a, …, (3a)′, (2a)′, a′, 1}, which is not Archimedean (hence it is not complete). It is compactly generated (every x ∈ E is compact) and obviously sharply orthocomplete (the center C(E) = S(E) is trivial), and hence sharply dominating.

Example 2: A sharply dominating Archimedean atomic lattice MV-effect algebra with complete and bifull set of sharp elements that is not sharply orthocomplete. Let E = ∏{{0_n, a_n, 1_n} | n = 1, 2, …} and let E₀ = {(x_n)_{n=1}^∞ ∈ E | x_k = a_k for at most finitely many k ∈ {1, 2, …}}. Then E₀ is a sub-lattice effect algebra of E (hence an MV-effect algebra), evidently sharply dominating, and it is not sharply orthocomplete (since it is not complete). S(E₀) = ∏{{0_n, 1_n} | n = 1, 2, …} is a complete Boolean algebra, and S(E₀) = C(E₀) is a bifull sub-lattice of E₀.

Lemma 1: Let E be a sharply orthocomplete Archimedean atomic MV-effect algebra. Then E is complete.

Proof: Let A ⊆ E be the set of all atoms of E. Then 1 = ∨_E {n_a a | a ∈ A} = ⊕_E {n_a a | a ∈ A}, where the elements n_a a ∈ C(E) = S(E) are atoms of C(E) for all a ∈ A. By [23, Theorem 3.1], E is isomorphic to a subdirect product of the family {[0, n_a a] | a ∈ A}. The corresponding lattice effect algebra embedding ϕ: E → ∏{[0, n_a a] | a ∈ A} is given by ϕ(x) = (x ∧ n_a a)_{a∈A}. Let us check that E is isomorphic to ∏{[0, n_a a] | a ∈ A}; it is enough to check that ϕ is onto. Let (x_a)_{a∈A} ∈ ∏{[0, n_a a] | a ∈ A}. Then (x_a)_{a∈A} is an orthogonal system and x_a = k_a a ≤ n_a a ∈ S(E) for all a ∈ A. Hence x = ⊕_E {x_a | a ∈ A} = ∨_E {k_a a | a ∈ A} ∈ E exists. Evidently, ϕ(x) = (x ∧ n_a a)_{a∈A} = (k_a a)_{a∈A} = (x_a)_{a∈A}.

Example 3: A sharply orthocomplete Archimedean MV-effect algebra that is not complete. If we omit in Lemma 1 the assumption of atomicity of E, it is enough to take the MV-effect algebra E = {f: [0,1] → [0,1] | f a continuous function}, which is a sub-lattice effect algebra of a direct product of copies of the standard MV-effect algebra of real numbers [0,1]. It is Archimedean, sharply orthocomplete (the center C(E) = S(E) = {0, 1} is trivial) and hence sharply dominating. Moreover, E is not complete.

It is well known that an Archimedean lattice effect algebra E is complete if and only if every block of E is complete (see [22, Theorem 2.7]). If moreover E is atomic, then E may have atomic as well as non-atomic blocks [1]. K. Mosná [16, Theorem 8] has proved that in this case E = ⋃{M ⊆ E | M an atomic block of E}.
Hence every non-atomic block of E is covered by atomic blocks. Moreover, many properties of Archimedean atomic lattice effect algebras, as well as of their non-atomic blocks, depend on the properties of their atomic blocks. Namely, the center C(E), the compatibility center B(E) and the set S(E) of sharp elements of an Archimedean atomic lattice effect algebra E can be expressed by set-theoretical operations on its atomic blocks. As follows, B(E) = ⋂{M ⊆ E | M an atomic block of E}, S(E) = ⋃{C(M) | M ⊆ E, M an atomic block of E} and C(E) = B(E) ∩ S(E) (see [16]).

For instance, an Archimedean atomic lattice effect algebra E is sharply dominating iff every atomic block of E is sharply dominating (see [11]). Moreover, we can prove the following:

Theorem 3: Let E be an Archimedean atomic lattice effect algebra. Then the following conditions are equivalent:
(i) E is complete.
(ii) Every atomic block of E is complete.
In this case every block of E is complete.

Proof: (i) ⇒ (ii): This is trivial, as every block M of E is a full sub-lattice effect algebra of E.
(ii) ⇒ (i): It is enough to show that E is orthocomplete; from [22, Theorem 2.6] we then get that E is complete. Let G ⊆ E be a ⊕-orthogonal system. Then, for every x ∈ G, there is a set A_x of atoms of E and positive integers k_a, a ∈ A_x, such that x = ⊕_E {k_a a | a ∈ A_x}. Moreover, for any finite F ⊆ G, ⋃{A_x | x ∈ F} is an orthogonal set of atoms. Hence A_G = ⋃{A_x | x ∈ G} is an orthogonal set of atoms of E, and there is a maximal orthogonal set A of atoms of E such that A_G ⊆ A. Therefore there is an atomic block M of E with A ⊆ M. By assumption ⊕_M G exists, and ⊕_M G = ⊕_E G, as M is bifull in E because E is Archimedean and atomic (see [17]).

Theorem 4: Let E be a sharply orthocomplete lattice effect algebra. Then
(i) S(E) is a complete orthomodular lattice bifull in E.
(ii) C(E) is a complete Boolean algebra bifull in E.
(iii) E is sharply dominating, centrally dominating and S-dominating.
(iv) If moreover E is Archimedean and atomic, then E is a complete lattice effect algebra.

Proof: (i), (iii): Let S ⊆ S(E) be orthogonal. Then, for any s ∈ S, s ≤ s with s ∈ S(E). Hence (since S(E) is full in E and E is sharply orthocomplete) ⊕_E S = ∨_E S = ∨_{S(E)} S ∈ S(E) exists. Since S(E) is an Archimedean lattice effect algebra, we obtain from [22, Theorem 2.6] that S(E) is complete. Moreover, let x ∈ E and let G = (w_κ)_{κ∈H}, w_κ ∈ S(E), κ ∈ H, be a maximal orthogonal system of mutually different elements such that w_x = ⊕_E {w_κ | κ ∈ H} ≤ x. Let us show that y ∈ S(E), y ≤ x ⇒ y ≤ w_x ∈ S(E). Clearly, w_x ∈ S(E). Assume that y ≰ w_x. Then w_x < y ∨ w_x ≤ x. Hence z = (y ∨ w_x) ⊖ w_x ≠ 0, and G ∪ {z} is an orthogonal system of mutually different elements such that y ∨ w_x = w_x ⊕ z = ⊕_E {w_κ | κ ∈ H} ⊕ z ≤ x, a contradiction with the maximality of G. Therefore y ≤ w_x, and E is sharply dominating, hence S-dominating, and from Theorem 2 we get that E is centrally dominating. From Theorem 1 we get that S(E) is bifull in E.
(ii): It follows from (i), (iii) and Theorem 2.
(iv): Assume now that E is a sharply orthocomplete Archimedean atomic lattice effect algebra. Then every atomic block M of E is a sharply orthocomplete Archimedean atomic MV-effect algebra, and hence a complete MV-effect algebra by Lemma 1. By Theorem 3, E is a complete lattice effect algebra.

Theorem 5: Let E be an atomic lattice effect algebra. Then the following conditions are equivalent:
(i) E is complete.
(ii) E is Archimedean and sharply orthocomplete.

Proof:
(i) ⇒ (ii): By [20, Theorem 3.3], any complete lattice effect algebra is Archimedean. Evidently, any complete lattice effect algebra is sharply orthocomplete.
(ii) ⇒ (i): It follows from Theorem 4, (iv).

Acknowledgement

The work of the first author was supported by the Slovak Research and Development Agency under contract No. APVV-0375-06 and by the VEGA grant agency, grant number 1/0373/08. The second author gratefully acknowledges financial support from the Ministry of Education of the Czech Republic under project MSM0021622409. The third author was supported by the Slovak Research and Development Agency under contract No. APVV-0071-06. The authors also thank the referee for reading very thoroughly and for improving the presentation of the paper.

References

[1] Beltrametti, E. G., Cassinelli, G.: The Logic of Quantum Mechanics. Addison-Wesley, Reading, MA, 1981.
[2] Busch, P., Lahti, P. J., Mittelstaedt, P.: The Quantum Theory of Measurement. Lecture Notes in Physics, New Series m: Monographs, Vol. 2, Springer-Verlag, Berlin, 1991.
[3] Chang, C. C.: Algebraic analysis of many valued logics. Trans. Amer. Math. Soc. 88 (1958), 467–490.
[4] Foulis, D. J., Bennett, M. K.: Effect algebras and unsharp quantum logics. Found. Phys. 24 (1994), 1331–1352.
[5] Foulis, D. J., Pulmannová, S.: Type-decomposition of an effect algebra. Found. Phys.
[6] Greechie, R. J., Foulis, D. J., Pulmannová, S.: The center of an effect algebra. Order 12 (1995), 91–106.
[7] Gudder, S. P.: Sharply dominating effect algebras. Tatra Mt. Math. Publ. 15 (1998), 23–30.
[8] Gudder, S. P.: S-dominating effect algebras. Internat. J. Theoret. Phys. 37 (1998), 915–923.
[9] Jenča, G., Pulmannová, S.: Orthocomplete effect algebras. Proc. Amer. Math. Soc. 131 (2003), 2663–2671.
[10] Jenča, G., Riečanová, Z.: On sharp elements in lattice ordered effect algebras. BUSEFAL 80 (1999), 24–29.
[11] Kalina, M., Olejček, V., Paseka, J., Riečanová, Z.: Sharply dominating MV-effect algebras. Internat. J. Theoret. Phys., doi: 10.1007/s10773-010-0338-x.
[12] Kalina, M.: On central atoms of Archimedean atomic lattice effect algebras. Kybernetika, accepted.
[13] Kalmbach, G.: Orthomodular Lattices. Mathematics and Its Applications, Vol. 453, Kluwer Academic Publishers, Dordrecht, 1998.
[14] Kôpka, F.: Compatibility in D-posets. Internat. J. Theoret. Phys. 34 (1995), 1525–1531.
[15] Kôpka, F., Chovanec, F.: Boolean D-posets. Internat. J. Theoret. Phys. 34 (1995), 1297–1302.
[16] Mosná, K.: Atomic lattice effect algebras and their sub-lattice effect algebras. J. Electrical Engineering 58 (2007), 7/S, 3–6.
[17] Paseka, J., Riečanová, Z.: The inheritance of BDE-property in sharply dominating lattice effect algebras and (o)-continuous states. Soft Comput., doi: 10.1007/s00500-010-0561-7.
[18] Riečanová, Z.: Compatibility and central elements in effect algebras. Tatra Mt. Math. Publ. 16 (1999), 151–158.
[19] Riečanová, Z.: Subalgebras, intervals and central elements of generalized effect algebras. Internat. J. Theoret. Phys. 38 (1999), 3209–3220.
[20] Riečanová, Z.: Archimedean and block-finite lattice effect algebras. Demonstratio Mathematica 33 (2000), 443–452.
[21] Riečanová, Z.: Generalization of blocks for D-lattices and lattice-ordered effect algebras. Internat. J. Theoret. Phys. 39 (2000), 231–237.
[22] Riečanová, Z.: Orthogonal sets in effect algebras. Demonstratio Math. 34 (2001), 525–532.
[23] Riečanová, Z.: Subdirect decompositions of lattice effect algebras. Internat. J. Theoret. Phys.
42 (2003), 1415–1433.

Doc. RNDr. Martin Kalina, Ph.D.
E-mail: kalina@math.sk
Department of Mathematics, Faculty of Civil Engineering
Slovak University of Technology
Radlinského 11, SK-813 68 Bratislava, Slovakia

Doc. RNDr. Jan Paseka, CSc.
E-mail: paseka@math.muni.cz
Department of Mathematics and Statistics, Faculty of Science
Masaryk University
Kotlářská 2, CZ-611 37 Brno, Czech Republic

Prof. RNDr. Zdenka Riečanová, Ph.D.
E-mail: zdena.riecanova@gmail.com
Department of Mathematics, Faculty of Electrical Engineering and Information Technology
Slovak University of Technology
Ilkovičova 3, SK-812 19 Bratislava, Slovak Republic

Influence of brick walls on the temperature distribution in steel columns in fire

António J. P. Moura Correia, Joao Paulo C. Rodrigues, Valdir Pignatta e Silva

Abstract: This paper reports on a study of steel columns embedded in walls in fire. Several fire resistance tests were carried out at the Laboratory of Testing Materials and Structures of the University of Coimbra, in Portugal. The temperatures registered at several points of the experimental models are compared with those obtained in numerical simulations carried out with the SuperTempcalc finite element program.

Keywords: fire, walls, steel, columns, resistance, numerical simulation, Eurocodes.

1 Introduction

The fire resistance of a steel column is strongly influenced by the conditions in which it is inserted in the building. Apart from other parameters, the contact of the column with the walls of the building has a great influence on its behaviour in fire. The walls, on one hand, have a favourable influence on the fire resistance of the steel columns, because they protect a large part of the lateral surface from heating. On the other hand, they have an unfavourable influence because they lead to differential heating of the cross-section. The design methods considered in Eurocode 3 Part 1.2 do not take this fact into account, and fire resistance is determined as if the heating were uniform [1].

This paper presents the results of fire resistance tests on steel columns embedded in walls, carried out at the Laboratory of Testing Materials and Structures of the University of Coimbra. The evolution of the temperatures registered in the experimental models is compared with the results obtained in numerical simulations performed with the SuperTempcalc FEM program (STC), developed by Y. Anderberg of Fire Safety Design, Lund, Sweden [2]. SuperTempcalc is a thermal finite element program that solves two-dimensional, non-linear, transient heat transfer differential equations, incorporating thermal properties which vary with temperature. The program allows the use of rectangular or triangular finite elements, in cylindrical or rectangular co-ordinates. Heat transfer by convection and radiation at the boundaries can be modelled as a function of time.

2 Experimental program

The aim of this study was to analyse the thermal behaviour of steel columns embedded in walls. Fire resistance tests were carried out with two different column cross-sections, two orientations of the inertia axis in relation to the fire, and two thicknesses of the building walls [3]. The columns had cross-sections of HEA160 and HEA200, steel class S355, and the walls, of different thicknesses, were made of bricks (Fig. 3). The bricks were laid using ordinary cement mortar. The columns in the test were placed at the center of a 3D restraining frame (Fig. 2a). This frame was composed of HEB200 columns 3 m in height and HEB200 beams with a 6 m span, steel class S355. Two brick walls were then built, one on each side of the column (Fig. 2b). This restraining frame was later used to perform fire resistance tests on columns with restrained thermal elongation, but in the tests presented in this paper the columns were not thermally restrained. The specimens were instrumented with type K thermocouples (chromel-alumel) at various positions of the cross-section of the columns and on the walls (Fig. 1).

Fig. 1: Specimen and position of the thermocouples (sections S1–S5 along the column; thermocouples T1–T6 on the cross-section).
The thermal action was applied on one side of the element only, in such a way as to permit an analysis of the thermal gradient produced through the wall and across the cross-section of the column. The evolution of the temperatures in the furnace followed the ISO 834 standard fire curve. The temperatures inside the furnace were measured by type K shielded probe thermocouples in the first four tests (cases 1 to 4 in Fig. 3); these were later exchanged for plate thermometers in the last four tests (cases 5 to 8 in Fig. 3). This change was made because a small delay in the heating of the furnace was observed in the first tests, and so the decision was taken to change the thermocouples that controlled the furnace.

Fig. 2: a) Construction of the test model, b) column embedded in the wall, c) lateral view of the experimental system.

3 Numerical modelling

The computational modelling was performed using the computer code SuperTempcalc (STC) – Temperature Calculation and Design v.5, developed by Y. Anderberg [2] for the two-dimensional thermal analysis of any type of cross-section exposed to heating. The thermal properties of the materials adopted in this work for the numerical analysis were those presented in Eurocode 3 [1] for steel, and in Eurocode 2 [4] for concrete, Parts 1.2. For the mortar covering the bricks, the properties recommended for concrete in Eurocode 2 Part 1.2 were adopted. The thermal properties adopted for the masonry were the same as the values adopted in the OZone computer program, developed at the University of Liège, i.e., thermal conductivity λ = 0.7 W/m°C, specific heat c = 840 J/kg°C and specific weight ρ = 1600 kg/m³ (specific heat × specific mass = 1 344 000 J/m³°C). The emissivity was ε = 0.7 for the steel profile, and also for the masonry and the mortar. The coefficient of heat transmission by convection on the face exposed to the fire was αc = 25 W/m²°C. For the non-exposed face, the values αc = 4 W/m²°C and ε = 0.7 were used. These values led to better results. The models were meshed in rectangular finite elements with sides of 4 mm × 5 mm. The STC computer code can draw isothermals and temperature fields for each instant of time, and can rapidly give the value of the temperature as a function of time. The cases studied are summarized in Fig. 3.

Fig. 3: Cases studied.
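For a concrete illustration of the thermal boundary condition described above, here is a minimal sketch of ours (not part of the STC code): it evaluates the ISO 834 gas temperature and the combined convective and radiative heat flux on the exposed face, using the coefficients quoted above. The Stefan-Boltzmann constant and a configuration factor of 1 are the only added assumptions.

```python
import math

SIGMA = 5.67e-8   # Stefan-Boltzmann constant (W/m^2 K^4)
EPS   = 0.7       # emissivity used above for steel, masonry and mortar
ALPHA = 25.0      # convection coefficient on the exposed face (W/m^2 degC)

def iso834(t_min):
    """ISO 834 standard fire curve: gas temperature (degC) at time t (minutes)."""
    return 20.0 + 345.0 * math.log10(8.0 * t_min + 1.0)

def exposed_flux(t_min, T_surf):
    """Net heat flux (W/m^2) on the exposed face: convection plus radiation,
    with a configuration factor of 1 (an assumption of this sketch)."""
    T_g = iso834(t_min)
    conv = ALPHA * (T_g - T_surf)
    rad = EPS * SIGMA * ((T_g + 273.15) ** 4 - (T_surf + 273.15) ** 4)
    return conv + rad

print(iso834(30.0))               # about 842 degC after 30 minutes
print(exposed_flux(30.0, 400.0))  # flux into a surface currently at 400 degC
```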
4 Comparisons – experimental vs numerical analysis

4.1 Furnace temperatures

The temperatures inside the furnace were very uniform in both series of tests (cases 1 to 4 and 5 to 8), but a small delay relative to the ISO 834 fire curve is observed in the first series of four tests (cases 1 to 4) (Fig. 4). As already explained, this delay may be related to the type of thermocouples used in the furnace in the first series of tests.

Fig. 4: Furnace curves, a) cases 1 to 4, b) cases 5 to 8.

4.2 Thermal gradients in the cross-sections

Fig. 5 shows the isothermals on the cross-section for cases 6 and 7. In case 6 the wall was 10 cm in thickness, and in case 7 it was 14 cm in thickness. The figure shows higher thermal gradients in the cross-section for case 6 than for case 7. The mechanical resistance of the steel profile is perhaps more affected in case 7.

Fig. 5: Isothermals in the cross-section, a) case 6, b) case 7.

4.3 Evolution of temperatures in the mid-height section of the steel columns

The temperatures in the experimental tests were measured at six points of five sections of the steel column (Fig. 1). The temperatures in the mid-height section of the columns were compared with those obtained in the numerical simulations for 60 min (Figs. 6 to 9). In these figures, th_1 stands for the thinner walls and th_2 for the thicker walls.

In the case of the web parallel to the wall surface, the temperature in the flange not exposed to the fire (thermocouples 4 and 6) is higher in the case of walls of smaller thickness (Figs. 6a and 8a). For HEA160 the difference is nearly 100 °C, both in the STC calculations and in the experimental tests (Fig. 6a). For HEA200 the difference is almost 75 °C in both the STC calculations and the experimental tests (Fig. 8a). On the face of the web exposed to the fire, the temperatures are higher for the thin walls than for the thick walls (thermocouple 3), with a very small difference between the STC simulations and the experimental tests (Fig. 8b).

In the case of the web perpendicular to the wall surface, the temperature in the exposed flange (thermocouple 5) is also higher for the thin wall than for the thick wall (Figs. 7a and 9a). For HEA160 the difference is approximately 50 °C in the STC simulations and almost the same in the experimental tests (Fig. 7a). For HEA200 the difference is about 100 °C in both analyses (Fig. 9a). Curiously, in the unexposed flange the temperatures are higher for the thick wall (thermocouple 6) in the experimental tests. For HEA160 the difference is about 100 °C in the experimental test (Fig. 7b), and for HEA200 the results are very close in both analyses (Fig. 9b).

5 Conclusions

For the cases with the web parallel to the wall surface, it was concluded that the thicker wall plays an important role in reducing the temperatures in the unexposed half of the flange and also in the web. For the cases with the web perpendicular to the wall surface a quite interesting result was observed: on the unexposed face of the flange the temperatures were slightly higher when the wall was thicker.
Conversely, on the exposed flange the temperatures were higher when the walls were thinner.

Acknowledgments

The authors gratefully acknowledge the support received from CBCA, Brazil, FCT-MCTES, Preceram S.A., Metalocardoso S.A. and A. Costa Cabral S.A., Portugal.

Fig. 6: Temperatures vs time for HEA160 with the web parallel to the wall (cases 1 and 5); a) thermocouple T4; b) thermocouple T5.

Fig. 7: Temperatures vs time for HEA160 with the web perpendicular to the wall (cases 2 and 6); a) thermocouple T5; b) thermocouple T6.

Fig. 8: Temperatures vs time for HEA200 with the web parallel to the wall (cases 3 and 7); a) thermocouple T6; b) thermocouple T3.
Fig. 9: Temperatures vs time for HEA200 with the web perpendicular to the wall (cases 4 and 8); a) thermocouple T5; b) thermocouple T6.

References

[1] Eurocode 3, ENV 1993-1-2, Design of steel structures, Part 1.2: General rules – Structural fire design. European Community (EC), Brussels, 2004.
[2] Anderberg, Y.: TCD 5.0 – User's Manual. Lund: Fire Safety Design, 1997.
[3] Correia, A. M., Rodrigues, J. P. C., Silva, V. P.: Studies on the fire behaviour of steel columns embedded on walls. 11th International Conference on Fire Science and Engineering – Interflam, London, 2007.
[4] Eurocode 2, ENV 1992-1-2, Design of concrete structures, Part 1.2: General rules – Structural fire design. European Community (EC), Brussels, December 2004.
[5] Eurocode 1, ENV 1991-1-2, Basis of design and actions on structures – Part 1-2: Actions on structures – Actions on structures exposed to fire. European Community (EC), Brussels, 2002.
[6] Silva, V. P., Correia, A. M., Rodrigues, J. P. C.: Simulation on fire behaviour of steel columns embedded on walls. XXXIII Jornadas Sudamericanas de Ingenieria Estructural, May 2008, Santiago, Chile.

António J. P. Moura Correia
E-mail: antonio.correia@estgoh.ipc.pt
Superior School of Technology and Management of Oliveira do Hospital
Faculty of Sciences and Technology of University of Coimbra, Portugal

Joao Paulo C. Rodrigues
Faculty of Sciences and Technology of University of Coimbra, Portugal

Valdir Pignatta e Silva
Polytechnic School of University of S. Paulo, Brazil

Micromechanical analysis of cement paste with carbon nanotubes

Vít Šmilauer, Petr Hlaváček, Pavel Padevět

Czech Technical University in Prague, Faculty of Civil Engineering, Department of Mechanics, Thákurova 7, 166 29 Prague 6, Czech Republic
Corresponding author: vit.smilauer@fsv.cvut.cz

Abstract: Carbon nanotubes (CNT) are an attractive reinforcement material for several composites, due to their inherently high strength and high modulus of elasticity. There are controversial results for cement paste with admixed CNT up to 500 µm in length: some results show an increase in flexural or compressive strength, while others show a decrease. Our experiments produced results that showed a small increase in fracture energy and tensile strength. Micromechanical simulations on a CNT-reinforced cement paste 50×50 µm proved that CNT clustering is the crucial factor for an increase in fracture energy and for an improvement in tensile strength.

Keywords: carbon nanotubes, cement hybrid material, micromechanics, fracture energy.

1 Introduction

Worldwide production of cement has increased significantly in recent years, due to the growing demand for concrete. In 2010, 3.3·10⁹ tons of cement were produced, three times the 1.1·10⁹ tons produced in 1990. For reasons of sustainability and profitability, secondary cementitious materials have been introduced into the binder, reducing the amount of Portland clinker in the cement. A further reduction can be made if the strength of the hydrated cement increases. Carbon nanofibers (CNF) and carbon nanotubes (CNT) are a feasible approach to the production of a high strength binder, as has been demonstrated for composites [3, 19] and specifically for cement pastes [2, 4, 5, 9, 12]. Theoretical and experimental studies have indicated that CNT exhibit a Young modulus from 180 to 588 GPa and a tensile strength in the range between 2000 and 6140 MPa [10].

Figure 1: CNT synthesized on the surface of a cement grain. The image is 115 µm in width. Image by K. Hruška, Institute of Physics ASCR, Prague.

Both CNT and CNF as admixtures are very difficult to distribute uniformly, even within a small volume of cement paste. Flocculation makes it impossible to obtain improved properties of the composite material unless a special treatment is applied. It is a challenging task to achieve good dispersion of CNT/CNF in a cement matrix. A simple powder mixing procedure leads to poor dispersal of the carbon nanomaterials in a cement paste. For this reason, Sanchez and Ince proposed adding silica fume to cement to increase the dispersiveness of CNF [15]. Dispersion of CNT/CNF in water using surfactants is another method, and is a widely-used way to introduce carbon nanomaterials into polymers [2]. However, most surfactants and additives interfere with the hydration reaction of cement, and usually prolong the cement setting and hardening times. Another way to enhance the dispersiveness of CNT/CNF is by functionalizing them.
The carboxyl or hydroxyl groups attached to the carbon surface, which form during oxidation treatment, can create bonds between the matrix and the carbon nanomaterials. Functionalized CNT have already been proven to influence the mechanical properties of hydrated cement paste [2, 5]. Another method uses CNT directly synthesized on the surface of cement grains [12], see Figure 1. Cement hybrid material (CHM) of this type can easily be blended with conventional cement, and this procedure guarantees dispersion of the CNT in the cement matrix. It has been reported that even a small CNT/cement mass fraction of 0.005 helped to increase flexural strength by 22 % [5].

Table 1: Summary of published data on strength gain when CNT is added to cement (P – paste, M – mortar).

CNT length (µm) | Type | w/c  | Compressive strength (MPa), plain / CNT | Flexural strength (MPa), plain / CNT | Ref.
< 10            | P    | 0.30 | 38.3 ± 8.2 / 61.8 ± 0.8                 | 2.3 ± 0.4 / 2.3 ± 0.2                | [2]
< 30            | P    | 0.30 | – / –                                   | 9.2 / 12.6                           | [4]
< ~10           | M    | 0.40 | 28.9 / 54.3                             | 6.0 / 8.23                           | [8]
< 500           | P    | 0.45 | 52.3 ± 0.7 / 62.1 ± 1.4                 | 6.7 ± 0.1 / 8.4 ± 0.2                | [5]
< 100           | P    | 0.50 | – / –                                   | 5.5 / 7.2                            | [9]
< ~5            | P    | –    | 49 / 56                                 | 16 / 8                               | [12]

Reports on the improvement of compressive and flexural strength when un/functionalized CNT is added to a cement paste are inconsistent across the literature. For example, a decrease of more than 30 % in compressive strength has been reported [2]. Table 1 shows the most optimistic claimed improvements achieved by adding certain amounts of CNT/CNF to cement paste or mortar, measured after 28 days of hardening. In addition, the data in Table 1 demonstrate that even CNT shorter than 10 µm can be effective in providing added strength. This raises the question of the reinforcing mechanism. To shed light on the mechanical effect of CNT, a series of experimental data was measured and micromechanical simulations were executed. The results are presented below.

2 Experiments

2.1 Materials

Ordinary Portland cement CEM I 42.5 R from Mokrá, Czech Republic, was used as the source material for all specimens. The specific Blaine surface was 306 m²/kg; the chemical composition of the relevant elements was CaO 65.6 %, SiO₂ 19.0 %, Fe₂O₃ 3.5 %, MgO 1.1 % by mass. Cement hybrid material (CHM) with CNT grown on the surface was synthesized by a group from Aalto University, Finland, under the leadership of L. Nasibulina, using the chemical vapor deposition method [13]. CNT growth occurred at about 600 °C in a fluidized bed reactor, where acetylene served as the main carbon source due to its low decomposition temperature and its affordable price. The CNT grown on the cement particles are approximately 30 nm in diameter and up to 3 µm in length [11], and the specific surface area of the CNT is about 10–20 m²/g. CHM has a density of 2.59 g/cm³ and contains 30 % of carbon by mass, which occupies 43 % of the volume.

Figure 2: Layout of the three-point bending test used for the fracture energy determination.

Table 2 summarizes the compositions of the five experimental batches, all with the same water/binder ratio of 0.35. The batches cover feasible mixes potentially usable in concrete engineering. 0.2 % of Glenium ACE 40 superplasticizer was added to the binder mass; the superplasticizer contains 65 % of water by mass, which was deducted from the total amount of water.

Table 2: Five batches of cement pastes with different amounts of CHM.

Designation (mass %)   | CEM I (g) | CHM (g) | Water (g) | Carbon/paste (vol %)
CEM 100 % + CHM 0 %    | 234.0     | 0       | 81.9      | 0
CEM 96.5 % + CHM 3.5 % | 225.81    | 8.19    | 81.9      | 0.88
CEM 93 % + CHM 7 %     | 217.62    | 16.38   | 81.9      | 1.75
CEM 86 % + CHM 14 %    | 201.24    | 32.76   | 81.9      | 3.47
CEM 70 % + CHM 30 %    | 163.8     | 70.2    | 81.9      | 7.31
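As a cross-check of the carbon/paste column of Table 2, the following short sketch of ours recomputes the carbon volume fractions from the batch masses, the CHM density of 2.59 g/cm³ and the 43 % carbon volume share of CHM quoted above. The cement density of 3.15 g/cm³ is a standard value assumed here, not stated in the paper.

```python
# Hedged check of Table 2 (assumes cement density 3.15 g/cm^3, a standard value).
rho_cem, rho_chm, rho_w = 3.15, 2.59, 1.00   # densities in g/cm^3
carbon_vol_in_chm = 0.43                     # carbon volume share of CHM (see text)

batches = [(234.0, 0.0, 81.9), (225.81, 8.19, 81.9), (217.62, 16.38, 81.9),
           (201.24, 32.76, 81.9), (163.8, 70.2, 81.9)]

for m_cem, m_chm, m_w in batches:
    v_paste = m_cem / rho_cem + m_chm / rho_chm + m_w / rho_w
    v_carbon = carbon_vol_in_chm * (m_chm / rho_chm)
    print(f"carbon/paste = {100 * v_carbon / v_paste:.2f} vol %")
# prints approximately 0.00, 0.87, 1.73, 3.43, 7.24 -- close to the
# 0, 0.88, 1.75, 3.47, 7.31 vol % listed in Table 2
```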
2.2 Casting and curing of the specimens

Hand stirring of each batch took three minutes, and consecutive vibration and form filling took an additional four minutes. Specimens 40 × 40 × 80 mm in size were covered and, after 24 hours, cured in a water bath at ambient temperature. At 30 days of hardening, the specimens were cut with a diamond saw into nine parts with approximate dimensions 13 × 13 × 80 mm. The production of such small-size specimens by cutting from larger bodies is more efficient than direct casting into small molds: the casting and vibration of a small amount of material was found to be ineffective, and the quality of the specimens (including surface defects and material inhomogeneity) is significantly worse than the quality attained by hand-cutting from larger bodies. In accordance with the RILEM standard for mechanical testing [17], notches were cut in the middle of the beams to 45 % of the depth for the subsequent fracture energy determination.

2.3 Determining the mechanical properties

The fracture energy, G_f, was determined according to the RILEM standard [17]; see Figure 2 for the experimental scheme. At least five notched beams from each batch were used to obtain statistical results. The three-point bending test under a displacement-controlled regime, performed on an MTS Alliance RT 30 electromechanical machine, gave access to the full load-displacement curve. The work of the external force P is calculated as

W_f = \int_0^{u_f} P \, \mathrm{d}u,   (1)

where u is the load-point displacement and u_f represents the final displacement, at which the load is equal to zero. The average (effective) fracture energy in the ligament, according to the RILEM standard, is defined as

G_f = \frac{W_f}{b \, (d - a_0)},   (2)

where b is the thickness of the beam, d is the beam depth, and a_0 is the depth of the notch. The support span L was set to 50 mm.

The compressive strength was determined from the two broken pieces of each beam. Thick metal pads 13 × 13 mm were placed on the top and bottom surfaces, and the peak force was recalculated to the compressive strength.
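Evaluating equations (1) and (2) amounts to a numerical quadrature of the measured load-displacement record divided by the ligament area. A minimal sketch of ours, with a synthetic softening curve standing in for real test data:

```python
# Hedged sketch (not the authors' code): fracture energy per RILEM TC50-FMC
# from a load-displacement record of a notched three-point bending test.
import numpy as np

# Specimen geometry (m) matching the ~13 x 13 mm beams with a 45 % notch
b, d, a0 = 0.013, 0.013, 0.45 * 0.013

# Hypothetical data: load P (N) vs. load-point displacement u (m)
u = np.linspace(0.0, 1.0e-3, 200)
P = 40.0 * (u / 2.0e-4) * np.exp(-u / 2.0e-4)   # synthetic softening curve

Wf = np.trapz(P, u)            # work of the external force, eq. (1)
Gf = Wf / (b * (d - a0))       # effective fracture energy in the ligament, eq. (2)
print(f"Wf = {Wf * 1e3:.2f} mJ, Gf = {Gf:.1f} N/m")
```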
2.4 Experimental results

Figure 3 shows the compressive strength of the hardened pastes at 30 days after casting. Replacing 3.5 % of the cement with CHM leads to a 25 % increase in compressive strength, in our case from an average value of 56 MPa to an average of 70 MPa. This paste contains only 0.88 % of CNT by volume. Increased CHM dosage leads to a slight decrease in compressive strength.

Figure 3: Compressive strength of the plain paste and of the pastes with admixed CHM (w/c = 0.35).

Figure 4 depicts the evolution of the fracture energy. The CHM samples exhibit an increase in fracture energy even at a small amount of replacement: replacing 3.5 % of the cement by CHM causes a 14 % increase in the fracture energy. It should be added that a variation in CHM content has consequences for workability, paste density and porosity. The results indicate that the classical CNT pull-out or crack-bridging mechanism does not occur. Microcrack shielding seems to be the most relevant mechanism operating in hardened cement paste reinforced by short CNT, giving only a slight increase in the fracture energy [7]. In contrast, PVA-reinforced cement-based composites with fibers several millimeters in length yield fracture energies higher by orders of magnitude [6]. This experimental finding is further supported by the micromechanical simulations, which show only a certain increase in the upper-bound fracture energy.

Figure 4: Fracture energy of the plain paste and of the pastes with admixed CHM (w/c = 0.35).

3 Micromechanical simulations

2D micromechanical simulations aimed to reproduce the fracture energy of the CNT-free and CNT-reinforced cement pastes and, in addition, to explore the role of clustering and of the CNT length. The CEMHYD3D cement hydration model generated the 3D microstructure of the hydrated cement paste, from which a 2D slice 50×50 µm was taken. The various chemical phases were reduced to three components: capillary porosity, clinker minerals and hydration products, the latter corresponding mainly to C-S-H. For 30 days of hydration, the volume fractions are 14.2 %, 11.6 % and 74.2 %, respectively. The CNT volume fraction was considered as 3.47 %, which corresponds to the composition CEM 86 % + CHM 14 %. The CNT fibers were introduced as 1D truss elements connecting particular nodes of the 2D quadrilateral elements. Each fiber is represented by one finite element without intermediate hanging nodes, in order to simplify the model.

An isotropic damage material model is assigned to all four components. A simple cohesive crack model with linear strain softening and Mazars' measure of strain is used:

\sigma = (1 - \omega) E \varepsilon = f_t \left( 1 - \frac{h \, \omega \, \varepsilon \, f_t}{2 G_f} \right),   (3)

\tilde{\varepsilon} = \sqrt{ \sum_{i=1}^{3} \langle \varepsilon_i \rangle^2 },   (4)

where σ is the normal stress transferred across the cohesive crack, f_t is the uniaxial tensile strength, h is the effective width of a finite element, ε is the normal strain across the crack, G_f is the mode-I fracture energy, and ω is the damage parameter. ⟨ε_i⟩ are the positive parts of the principal values of the strain tensor ε. Equation (3) leads to an explicit evaluation of the damage parameter ω.

Table 3 summarizes the elastic and fracture properties of the four components.

Table 3: Elastic and fracture properties of the components.

Component | E (GPa) | ν    | f_t (MPa) | G_f (N/m)
Porosity  | 0.2     | 0.02 | –         | –
Clinker   | 135     | 0.3  | 1800      | 118.5
Products  | 21.7    | 0.24 | 5.58      | 11.5
CNT       | 231     | 0.14 | 3000      | 200

The elastic modulus of the porosity was assigned a small finite value, since this speeds up convergence; it was checked that a further reduction of the modulus leads to negligible changes in the results. Clinker minerals are crystalline phases with high tensile strength and normally suffer no damage during loading; their fracture energy is recalculated from scratch-test results [18]. The properties of CNT are estimated on the basis of recent results [16]. The hydration products, mainly C-S-H, are taken as the low-density type of C-S-H [1]. The C-S-H tensile strength and fracture energy were deduced from the experimental data and a 2D simulation of the plain paste: these two parameters were identified by means of an inverse problem, taking the most reasonable pair of the two weakly dependent parameters, namely a flexural strength of 5.8 MPa, similar to [5, 8], and a fracture energy of 16.5 N/m as determined from our lab tests (see Figure 4).
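Equation (3) indeed gives the damage parameter explicitly: equating its two expressions for σ and solving yields ω = (Eε − f_t) / (Eε − hεf_t²/(2G_f)) for strains beyond the elastic limit ε₀ = f_t/E. A minimal sketch of ours, evaluated with the "products" row of Table 3; the element width h is an assumption matching the micrometer-scale pixel size of the mesh.

```python
def damage(eps, E, ft, Gf, h):
    """Damage parameter from eq. (3) with linear softening smeared over the
    effective element width h; returns 0 below the elastic limit."""
    eps0 = ft / E
    if eps <= eps0:
        return 0.0
    w = (E * eps - ft) / (E * eps - h * eps * ft**2 / (2.0 * Gf))
    return min(max(w, 0.0), 1.0)   # clamp: omega = 1 means a fully open crack

# Hydration products from Table 3: E = 21.7 GPa, ft = 5.58 MPa, Gf = 11.5 N/m;
# h = 5e-6 m is an assumed element width (not given in the paper).
E, ft, Gf, h = 21.7e9, 5.58e6, 11.5, 5.0e-6
for eps in (1e-4, 3e-4, 6e-4, 1e-3):
    print(f"eps = {eps:.0e}: omega = {damage(eps, E, ft, Gf, h):.3f}")
```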
Figure 5 depicts the microstructure images used in the micromechanical simulations. As mentioned above, the composition CEM 86 % + CHM 14 % was considered for all cases, with the exception of the plain paste. The cross-section of one CNT truss element was 50 nm times 1 µm, which in fact represents a bundle of CNT, not a single nanotube. The CNT length was randomized according to a uniform distribution with a length limit, and the orientation of the CNT fibers is also random. A Gaussian distribution of the CNT fibers within a cluster was used, with the cluster radius uniformly distributed up to 10 µm. The subfigures in Figure 5 show the CNT fibers distributed in the paste. Each pixel was assigned to a quadrilateral element with four nodes, so the mesh is uniform and easily generated. Each simulation leads to 4900 unknown displacements, contains 150 displacement increments, uses the modified Newton-Raphson solver, and runs for approximately 5 minutes in the OOFEM package [14].

Figure 5: 2D microstructures used in the simulations. The plain paste contains three phases (black: porosity, dark gray: clinker minerals, light gray: hydration products). The other subfigures show CNT fibers distributed within the plain paste, either in clusters or randomly.

Figure 6 elucidates the damage distribution in the microstructure with 5 clusters and CNT length up to 5 µm, before the peak load. Note that there is no paste damage inside the heavily CNT-reinforced zones.

Figure 6: Deformed microstructure with 5 clusters and CNT length up to 5 µm, before the peak load.

Table 4 summarizes the results of the micromechanical simulations. It is evident that the clustering of CNT is the most critical factor: clustering leads to crack formation around the strongly-reinforced zones, which renders the CNT reinforcement ineffective. The simulations show that a uniform distribution of 1 µm CNT increases G_f from 16.5 to 27.7 N/m, while clustering leads to a smaller increase even with CNT 5 µm in length.

Table 4: Elastic and fracture properties of the CNT-reinforced paste.

CNT clusters    | CNT length  | f_t (MPa) | G_f (N/m)
plain paste     | plain paste | 5.8       | 16.5
5               | ≤ 1 µm      | 5.9       | 15.5
5               | ≤ 5 µm      | 6.3       | 25.9
0 (no clusters) | ≤ 1 µm      | 6.3       | 27.7
0 (no clusters) | ≤ 5 µm      | 7.1       | 43.8
0 (no clusters) | ≤ 10 µm     | 7.3       | 55.8

Figure 7: Stress-strain diagrams for the loaded microstructures.

Figure 7 plots the stress-strain diagrams for the plain and the CNT-reinforced pastes. Note that there is not much increase in flexural strength, but there is a significant improvement in ductility, which is reflected as an increase in fracture energy.

Addition of CNT to cement paste, in any form, necessarily leads to clustering. According to the Powers model of cement hydration, the CNT-unreinforced volume corresponds to the initial volume of the unhydrated cement grains, where CNT cannot enter during the initial mixing. The initial volume fraction of the clinker corresponds to

f_{\mathrm{clinker}} = \frac{0.32}{w/c + 0.32},   (5)

which for w/c = 0.35 yields 48 %. This large volume remains unreinforced and prevents a uniform distribution of the CNT fibers. This explains why our experimental fracture energy is always lower than the micromechanical predictions with a uniform CNT distribution. Further simulations with variable CHM content support the hypothesis that CNT clustering lowers the fracture energy.
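A one-line check of equation (5), as a sketch of ours, evaluating the unreinforced initial clinker fraction for several water/cement ratios:

```python
def f_clinker(wc):
    """Initial clinker volume fraction per the Powers model, eq. (5)."""
    return 0.32 / (wc + 0.32)

for wc in (0.30, 0.35, 0.45, 0.50):
    print(f"w/c = {wc:.2f}: unreinforced fraction = {100 * f_clinker(wc):.0f} %")
# w/c = 0.35 gives about 48 %, the value quoted above
```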
Figure 8 shows that placing CNT only outside of the originally unhydrated cement grains corresponds well with the experimental results. The cement paste with 30 % CHM shows a discrepancy with the micromechanical simulation; the reason probably lies in its lower workability and in extensive CNT separation during mixing.

Figure 8: Fracture energy of pastes with variable CHM content (CNT length up to 3 µm). The micromechanical simulations with average CNT length up to 2.5 µm consider CNT placement either everywhere or only in areas outside of the originally unhydrated cement grains.

4 Conclusions

The role of CNT reinforcement in cement paste provides another option for improving the quasi-brittle behavior of cementitious materials. Experimental evidence shows a marginal improvement in compressive strength and fracture energy. As has been proven by the numerical simulations, the clustering of CNT in the paste microstructure is the crucial factor, due to the presence of unreacted cement grains in the initial mixture and the intermixing of CHM with ordinary cement. However, the micromechanical simulations show that a uniform distribution of CNT fibers only 1 µm in length may increase the fracture energy from 16.5 to 27.7 N/m, and the flexural strength by up to 10 %. Direct synthesis of CNT on the surface of cement particles solved the problem of flocculation of CNT added separately to the cement paste, but the problem of the initially unhydrated grains remained. The claimed twofold increase in the compressive strength of cement paste with added CNT [8] was not borne out by our experiments. Preliminary micromechanical simulations indicate that a real CNT reinforcement is ineffective when the tubes are shorter than an average cement grain; hence an effective CNT reinforcement needs to be at least approximately 15 µm in length. The appropriate length of CNT reinforcement will be a topic for further research.

Acknowledgements

We gratefully acknowledge financial support from the Ministry of Education and Youth of the Czech Republic under CEZ MSM 6840770003, and from the Czech Science Foundation GAČR under projects P104/12/0102 and 103/09/H078.

References

[1] G. Constantinides, F.-J. Ulm. The effect of two types of C-S-H on the elasticity of cement-based materials: results from nanoindentation
and micromechanical modeling. Cem Concr Res 34(1):67–80, 2004.
[2] A. Cwirzen, K. Habermehl-Cwirzen, V. Penttala. Surface decoration of carbon nanotubes and mechanical properties of cement/carbon nanotube composites. Advances in Cement Research 20(2):65–74, 2008.
[3] E. Hammel, X. Tang, M. Trampert, et al. Carbon nanofibers for composite applications. Carbon 42(5–6):1153–1158, 2004.
[4] Maria S. Konsta-Gdoutos, Zoi S. Metaxa, Surendra P. Shah. Multi-scale mechanical and fracture characteristics and early-age strain capacity of high performance carbon nanotube/cement nanocomposites. Cement and Concrete Composites 32(2):110–115, 2010.
[5] Geng Ying Li, Pei Ming Wang, Xiaohua Zhao. Mechanical behavior and microstructure of cement composites incorporating surface-treated multi-walled carbon nanotubes. Carbon 43(6):1239–1245, 2005.
[6] Victor C. Li, H. Horii, P. Kabele, et al. Repair and retrofit with engineered cementitious composites. Engineering Fracture Mechanics 65(2–3):317–334, 2000.
[7] Victor C. Li, Mohamed Maalej. Toughening in cement based composites. Part I: Cement, mortar, concrete. Cement and Concrete Composites 18(4):223–237, 1996.
[8] Péter Ludvig, José M. Calixto, Luiz O. Ladeira, Ivan C. P. Gaspar. Using converter dust to produce low cost cementitious composites by in situ carbon nanotube and nanofiber synthesis. Materials 4(3):575–584, 2011.
[9] Z. S. Metaxa, M. S. Konsta-Gdoutos, S. P. Shah. Mechanical properties and nanostructure of cement based materials reinforced with carbon nanofibers and PVA microfibers. ACI Special Publication 270: Advances in the Material Science of Concrete, SP-270-10, 270:115–124, 2010.
[10] P. Morgan. Carbon Fibers and Their Properties. CRC Press, 2005, 794–796.
[11] Prasantha R. Mudimela, Larisa I. Nasibulina, Albert G. Nasibulin, et al. Synthesis of carbon nanotubes and nanofibers on silica and cement matrix materials. Journal of Nanomaterials, 2009.
[12] Albert G. Nasibulin, Sergey D. Shandakov, Larisa I. Nasibulina, et al. A novel cement-based hybrid material. New Journal of Physics 11(2):023013, 2009.
[13] Larisa I. Nasibulina, Ilya V. Anoshkin, Sergey D. Shandakov, et al. Direct synthesis of carbon nanofibers on cement particles. Transportation Research Record: Journal of the Transportation Research Board 2(2142):96–101, 2010.
[14] B. Patzák, Z. Bittnar. Design of object oriented finite element code. Advances in Engineering Software 32(10–11):759–767, 2001.
[15] Florence Sanchez, Chantal Ince. Microstructure and macroscopic properties of hybrid carbon nanofiber/silica fume cement composites. Composites Science and Technology 69(7–8):1310–1318, 2009.
[16] V. Šmilauer, C. G. Hoover, Z. P. Bažant, et al. Multiscale simulation of fracture of braided composites via repetitive unit cells. Engineering Fracture Mechanics 78(6):901–918, 2011.
[17] RILEM TC50-FMC. Determination of the fracture energy of mortar and concrete by means of three-point bend tests on notched beams. Materials and Structures 18(4):285–290, 1985.
[18] F.-J. Ulm, R. Pellenq. Bottom-up materials science. In MIT-France Energy Forum, 2011.
[19] B. I. Yakobson, C. J. Brabec, J. Bernholc. Nanomechanics of carbon tubes: instabilities beyond linear response. Phys Rev Lett 76(14):2511–2515, 1996.

A note on the population statistics of electron avalanches and streamers

T. Ficker

Abstract: Ultraviolet tests focused on the population statistics of electron avalanches were not included in our earlier work [1], [2]. Now they have been performed, and it is shown that a recently derived statistical pattern fits all the measured data very well.

Keywords: population statistics, electron avalanches, fractal statistical pattern, ultraviolet data.

1 Introduction

Our recently published paper [1] presented a new generalized statistical pattern for the population distribution of electron avalanches and streamers. The pattern includes the Furry and Pareto distributions, which have been used for sparsely and highly populated avalanches, respectively; the generalized probability density function f(n), given as equation (1) of [1], interpolates between an exponential decay for small avalanche populations and a power-law tail for large ones. Its fitting parameters K, n̄ and j have a physical meaning that is explained in detail in [1]. The quantity D is the so-called fractal dimension, and its value (0 < D ≤ 3) is a "measure" of the spatial delocalization of the avalanche pulsating discharges in the gap between the electrodes.
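Since the full form of pattern (1) is given in [1], we illustrate only its two limiting regimes here with a small sketch of ours (the numerical values of n̄ and D are arbitrary assumptions): the Furry law, under which the density decays exponentially, and the Pareto law, whose log-log slope of −(1 + D) is one common way a fractal dimension is read off measured pulse statistics.

```python
# Illustrative sketch (assumptions of ours, not the paper's pattern (1)): the two
# limiting avalanche-population statistics that the generalized pattern includes.
import numpy as np

n_bar, D = 1.0e6, 1.8   # assumed mean population and fractal dimension (0 < D <= 3)

def furry(n):
    """Furry law for sparsely populated avalanches: f(n) ~ exp(-n/n_bar)."""
    return np.exp(-n / n_bar) / n_bar

def pareto(n, n_min=1.0e5):
    """Pareto law for highly populated avalanches: f(n) ~ n^-(1+D)."""
    return (D / n_min) * (n / n_min) ** (-(1.0 + D))

# On a log-log plot the Pareto tail is a straight line of slope -(1+D).
n = np.logspace(5, 8, 4)
print(furry(n))
print(pareto(n))
```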
Numerous tests have been performed to verify the functionality of pattern (1). The tests employed both electrical [1] and optical [2] data measured in our laboratory. All these types of tests have confirmed that the generalized pattern (1) can fit the data quite reliably. However, tests of one type were still missing, namely tests employing ultraviolet (UV) experimental data. The purpose of this short note is to present such tests.

2 Ultraviolet tests

UV pulses coming from avalanches as they cross the discharge gap provide a unique opportunity to count their repetition rates and intensities. A convenient instrument for this purpose is a photomultiplier working in the UV region. A method employing photomultipliers to register the ultraviolet radiation accompanying collisional ionization within avalanches allows us to analyze the electron components independently of the ion components. The experimental arrangement is shown in Fig. 1: the photomultiplier is located near the discharge gap, and its signal is amplified and transferred to the digitizer. The files of statistical data captured in the digitizer are processed off-line in a computer.

Fig. 1: Experimental arrangement for the capturing of UV pulses.

Recently, extensive measurements of "UV statistics" have been performed in our laboratory, and one typical example of these measurements is shown in Fig. 2. We observe very good behavior of the fractal pattern (1) when comparing it with the experimental data. The UV pulses were captured in the form of voltage pulses acquired across the resistance R = 100 kΩ. Since we did not calibrate the voltage pulses U against the number of electrons, the resulting distribution curves f(U) depend on U instead of n. Assuming linear proportionality U = const · n, the curves f(U) preserve the same shapes as f(n), i.e. both possess the same value of the fractal dimension D. The quantity G in Fig. 2 seemingly has a purpose similar to a normalization constant, i.e. W(U) = G · f(U), but instead of a normalization correction it represents a "shift" between the unnormalized data and the unnormalized distribution function f(U).

Fig. 2: Fractal pattern fitting the UV data.

3 Conclusion

The results of the analysis of the ultraviolet pulses have unmistakably shown that the fractal pattern (1) is a convenient approximation of the population statistics of electron avalanches. This conclusion is also in full agreement with the electrical [1] and optical [2] measurements performed earlier.

Acknowledgments

This work has been supported by the Grant Agency of the Czech Republic under grant No. 202/07/1207.

References

[1] Ficker, T.: Acta Polytechnica, Vol. 47 (2007), p. 31.
[2] Ficker, T.: IEEE Transactions on Dielectrics and Electrical Insulation, Vol. 11 (2004), p. 136.

Prof. RNDr. Tomáš Ficker, DrSc.
Phone: +420 541 147 661
E-mail: ficker.t@fce.vutbr.cz
Department of Physics, Faculty of Civil Engineering
University of Technology
Veveří 95, 662 37 Brno, Czech Republic

2005–2010 multiwavelength campaign of OJ287

M. Valtonen, A. Sillanpää

Abstract: The light curve of the quasar OJ287 extends from 1891 up to today without major gaps.
This is partly due to extensive studies of historical plate archives by Rene Hudec and associates, and partly due to several observing campaigns in recent times. Here we summarize the results of the 2005–2010 observing campaign, in which several hundred scientists and amateur astronomers took part. The main results are the following. (1) The 2005 October optical outburst came at the expected time, thus confirming the general relativistic precession in the binary black hole system, as originally proposed by Sillanpää et al. (1988). At the same time, this result disproved the model of a single black hole system with accretion disk oscillations, as well as several toy models of binaries without relativistic precession; in the latter models the main outburst would have come a year later. No particular activity was seen in OJ287 in 2006 October. (2) The nature of the radiation of the 2005 October outburst was expected to be bremsstrahlung from hot gas at a temperature of 3×10⁵ K. The cause of the outburst is a collision of the secondary with the accretion disk of the primary, which heats the gas to this temperature. This was confirmed by combined ground-based and ultraviolet observations using the XMM-Newton X-ray telescope. (3) A secondary outburst of the same nature was expected on 2007 September 13. Within the accuracy of the observations (about 6 hours), it started at the correct time. The prediction was thus accurate at the same level as the prediction of the return of Halley's comet in 1986. Due to the bremsstrahlung nature of the outburst, the radiation was unpolarised, as expected. (4) Further synchrotron outbursts were expected following the two bremsstrahlung outbursts; they came as scheduled between 2007 October and 2009 December. (5) Due to the effect of the secondary on the overall direction of the jet, the parsec-scale jet was expected to rotate in the sky by a large angle around 2009. This rotation has been seen in high-frequency radio observations. The OJ287 binary black hole system is currently our best laboratory for testing theories of gravitation. Using OJ287, the correctness of general relativity has now been demonstrated up to the second post-Newtonian order, higher than has been possible using binary pulsars.

Keywords: quasars: general – quasars: individual (OJ287) – BL Lacertae objects: individual (OJ287).

1 Introduction

OJ287 is one of the brightest AGN in the sky. Since it is also highly variable, it has become one of the favorite objects for both professional and amateur astronomers to follow. In addition, it lies close to the ecliptic, which means that its image has been recorded by chance on hundreds of photographic plates since 1891. In 1982 one of the authors (A.S.) put together the historical light curve of OJ287 based on published measurements. These were partly photometric measurements made since the discovery of OJ287 as an extragalactic object in 1968, and partly studies of photographic plate archives from the years prior to 1968, kept at various observatories, in particular at Harvard and at Sonneberg. There appeared to be a 12-year outburst cycle (see Figure 1), and moreover, it was obvious that the next cyclic outburst was due very shortly. This prediction was distributed to colleagues world wide, and indeed, OJ287 did not disappoint us but produced the expected event in the following January [1, 2]. Observations showed a sharp decline in the percentage polarization during the outburst maximum, indicating that the outburst was produced essentially by unpolarized light [3].
This is different from ordinary outbursts in OJ287, which are characterized by an increase in the percentage polarization at maximum light. In radio wavelengths the outbursts were found to follow the optical outbursts with a time delay of between 2 months and a year, depending on the observing frequency [4].

Fig. 1: The optical light curve of OJ287 from 1891 to 2010. The observations are taken from Ref. [22], complemented by unpublished data from R. Hudec and M. Basta.

Sillanpää et al. [5] suggested that OJ287 is a binary black hole system in which a smaller companion periodically perturbs the accretion disk of a massive primary black hole. They stated that the best way to verify this hypothesis was to study future outbursts and to show that the major axis of the binary system precesses as expected in general relativity. The next expected outburst was in 1994; it came as scheduled [6, 7]. At this point it became obvious to us that we are indeed dealing with a relativistic binary system, and it became necessary to develop the model in greater detail. In the binary model there should be two disk crossings per 12-yr orbital period. Thus the 1994 outburst should have an equal pair, whose timing was calculated to be at the beginning of 1995 October [8–10]. This prediction was also verified [11].

Figure 1 shows the optical light curve from 1891 to 2010. The main gap in the light curve is between the years 1893 and 1897, where observations from three consecutive years are missing; there are no major gaps since 1897. (Ref. [12] makes the peculiar statement that "there are about 10 year long gaps in the data"; no such gaps are seen even in their own historical light curve.)

Alternative explanations have also been put forward. Quasiperiodic oscillations in an accretion disk were suggested [13], and several binary toy models without relativistic precession have also been proposed [14–16]. The latter models all predicted the next main outburst of OJ287 in the autumn of 2006, while the precessing binary model gave a prediction one year earlier, at the beginning of 2005 October [17, 18]. The second major outburst was expected in late 2007 in all the binary models, while in the single black hole model there was no reason to expect a second major outburst. In the precessing binary model the date was given with high accuracy, the last prediction prior to the actual event being 2007 September 13 [19, 20]. In the single accretion disk model and in the non-precessing binary models, the radiation at these outbursts should have been polarized synchrotron radiation, while the precessing binary model predicted unpolarized bremsstrahlung radiation [9]. In addition, the precessing binary model predicted a series of further outbursts for the interval 2007–2010, expected to show up as an increased level of synchrotron radiation [17]. Also, the companion black hole should affect the disk of the primary in a predictable way, leading to a wobble of the jet [21]. In contrast, the non-precessing models predicted simultaneous brightening both in radio and in optical, at least for the second outburst [16], since in these models the disk impacts play no role or only a minor one, and flux enhancements are purely jet phenomena. With these predictions in mind, a multiwavelength campaign for observing OJ287 during 2005–2010 was set up, with one of the authors (A.S.) among the leaders.
2 five "smoking gun" results in the following, we describe five "smoking gun" observations which produced the results expected in the precessing binary black hole model, but which were surprising and unexpected in other theories. 2.1 timing the 2005 outburst the 2005 outburst was well covered by observations. the points in figure 2 are daily averages, 92 in all, formed from altogether 2329 observations in v-band and in r-band. the latter are transformed to v-band by adding 0.4 magnitudes to the r-band value. finally, the flux values are calculated in a standard way (see e.g. ref. [22]). fig. 2: the optical light curve of oj287 during the 2005 outburst. the data points are based on refs. [23–25]. the dashed line is the theoretical fit based on ref. [9]. according to ref. [10], the impact causing the 2005 outburst was expected 22.3 years after the impact of the 1983 outburst. in addition, in ref. [9] it is estimated that the 2005 outburst should be delayed after the impact. the 1983 outburst was also delayed, but not as much, the difference being 0.44 yr. the timing uncertainty was estimated to be ±0.1 yr. the rapid flux rise of the 1983 outburst started at 1983.00; thus the corresponding rapid flux rise of the 2005 outburst was expected at 1983.00 + 22.30 + 0.44 = 2005.74. actually, the outburst was one week late and did not begin until 2005.76 (ref. [23]), but the timing was still well within the stated error limits.
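in compact form, the prediction and its verification read (all numbers as quoted above):

$$
t_{2005} = 1983.00 + 22.30 + 0.44 = 2005.74 \,, \qquad
|t_{\mathrm{obs}} - t_{2005}| = |2005.76 - 2005.74| = 0.02~\mathrm{yr} < 0.1~\mathrm{yr} \,,
$$

i.e. the observed start of the flux rise lies comfortably inside the stated ±0.1 yr uncertainty.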
the comment in ref. [12] that "no one expected a major burst at this point" is rather strange, and fails to appreciate that any prediction has its associated error limits. only a few polarization measurements were carried out at that time, and unfortunately, even those happened during secondary flares. thus the polarization state of the primary outburst remains unknown from observations. (in contrast, ref. [12] states that "the whole burst was rather strongly polarized", based on two measurements among more than 2000 light curve points, an extraordinary extrapolation!) sillanpää et al. [5] stated that the binary system should show forward precession, so that the disk crossings should follow each other at shorter intervals than the orbital period. the required amount of precession is easily calculated, and it turns out to be 39.1° per period. this is so much higher than e.g. in binary pulsars (by a factor of 10^4) that we immediately realise the importance of oj287 in testing general relativity. we may also calculate the mass of the primary. its value, 1.84 × 10^10 solar masses, seemed rather high when it was first calculated, but subsequent work on black hole mass functions now places it among the fairly common upper mass range black holes (common in the same sense as o-type stars are common among main sequence stars, see refs. [26–29]). this mass value places oj287 right on the mean correlation of the black hole mass – host galaxy k-magnitude relation, with m_k ∼ −28.9 [30,31]. in ref. [12] it is claimed that oj287 lies significantly, slightly more than one standard deviation, off the mean correlation. based on this, the authors state that the measurement in ref. [30] "is most likely spurious". the reason for the one standard deviation offset in ref. [12] may be traced to an incorrect way of transforming optical magnitudes to k-magnitudes. the process of transforming the r-band measurement of the host galaxy magnitude [32–34] to the k-band is composed of several steps, each containing its associated uncertainties. one has to have a theory of the stellar composition of the host galaxy (not trivial for a merger) and of its (passive) cosmological evolution. one has to measure the neutral hydrogen column density of the host galaxy, and then transform it into the extinction in r-band. for the hydrogen column density there exists a measurement in ref. [35], albeit with large error bars, while for the extinction curves large variations from galaxy to galaxy have been found [36]. as a result of these large uncertainties, one can safely say that the r-band measurements of the magnitude of the host galaxy are consistent with the direct measurement in the k-band (which does not require the above mentioned transformations), and that the black hole mass – host galaxy k-magnitude correlation holds in oj287 within measurement errors. in any case, a displacement by one standard deviation from the mean correlation cannot be used as an argument for the correctness or otherwise of a single point in a correlation diagram. fig. 3: the optical – uv spectrum of oj287 during the 2005 outburst. data points are based on ref. [25]. the solid line is the bremsstrahlung fit, as predicted in ref. [9]. the observational points are corrected for the internal extinction in oj287, taken from ref. [35], and the assumed galactic extinction law [37]. the standard extinction in our galaxy is also taken into account. 2.2 nature of radiation at the 2005 outburst impact outbursts are expected to consist of bremsstrahlung radiation, and thus the optical polarization of oj287 should go down during them. as mentioned above, polarization information for the basic 2005 outburst is not available. however, bremsstrahlung may also be recognized by its spectrum, and this is the part of the campaign that was successfully carried out. we had xmm-newton observations both before the 2005 outburst (2005 april) and during the outburst (2005 november 3–4). fortunately, the november observation happened at a time when the source was at its basic outburst level, in between two secondary bursts. thus we would expect to see an additional pure bremsstrahlung spectrum above the underlying synchrotron power-law. a preliminary report of these observations has appeared in ref. [25], and a more detailed report is in preparation. in figure 3 we show the difference between the 2005 november flux and the 2005 april flux. the values have been corrected for the galactic extinction and for the extinction in the oj287 host galaxy. for the latter, we use the measurements in ref. [35] and the standard galactic extinction curve [37]. the solid line shows the bremsstrahlung spectrum at the expected temperature of 3 × 10^5 k. note that a raised synchrotron spectrum, as one might have expected in some other theories, would have a downward slope toward higher frequencies, and is entirely inconsistent with the observations. incidentally, the effect of the extinction in the host galaxy of oj287 is such that it causes an apparent spectral break in the normal (non-outburst) spectrum in the optical region, while the break really happens at the agn source somewhere in the uv [35].
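the spectral discrimination used in figure 3 is easy to illustrate numerically. the minimal sketch below is not the actual fit: it only contrasts the shape of an optically thin bremsstrahlung continuum at the quoted temperature with a falling power law, and the synchrotron index α = 1.5 is an invented, purely illustrative value.

```python
import numpy as np

K_B_EV = 8.617e-5            # boltzmann constant [ev/k]
T = 3e5                      # bremsstrahlung temperature from the text [k]

# representative optical-uv photon energies, roughly 2-10 ev
energy_ev = np.linspace(2.0, 10.0, 5)

# optically thin bremsstrahlung: f_nu ~ exp(-e / k_b t); since
# k_b t ~ 26 ev here, the spectrum is nearly flat across the band
f_brems = np.exp(-energy_ev / (K_B_EV * T))
f_brems /= f_brems[0]

# synchrotron power law f_nu ~ nu^-alpha (alpha is an assumed index):
# it falls steeply toward higher frequencies
alpha = 1.5
f_sync = (energy_ev / energy_ev[0]) ** (-alpha)

for e, fb, fs in zip(energy_ev, f_brems, f_sync):
    print(f"{e:5.1f} ev   brems {fb:5.2f}   sync {fs:5.2f}")
```

the bremsstrahlung curve drops by only ∼ 25 % from 2 ev to 10 ev, while the power law drops by an order of magnitude, which is why the flat observed difference spectrum in figure 3 points to bremsstrahlung.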
2.3 timing and nature of the 2007 september 13 outburst the 2007 september 13 outburst was an observational challenge, as the source was visible only for a short period of time in the morning sky just before sunrise. therefore a coordinated effort was made, starting with observations in japan, then moving to china, and finally to central and western europe. a crucial role was played by the not telescope in the canary islands and by the calar alto telescope in mainland spain, which were able to make polarization observations. the observed points with estimated error bars are given in figure 4, where the contributions by the participating observatories are shown by colour codes. fig. 4: the optical light curve of oj287 during the 2007 september outburst. the upper panel shows the measured magnitudes, while the lower panel shows the percentage polarization. data points are published in ref. [22]. a comparison of the two panels shows immediately that there were two kinds of outbursts in 2007 september. three outbursts were highly polarised, with a degree of polarization above 15 %, while the biggest outburst had polarization below 10 %. thus it is not difficult to decide which was the expected bremsstrahlung event. later in the year there were more highly polarized outbursts, but if we look at the light curve composed of low polarization states only (figure 5), the september 13 outburst clearly stands out. fig. 5: the optical light curve of oj287 during 2006–2008. only low polarization (less than 10 %) data are shown; they are based on ref. [24]. fig. 6: the optical light curve of oj287 during the 2007 september outburst. only low polarization (less than 10 %) data points are shown. the data points are based on ref. [22]. the dashed line is the theoretical fit based on ref. [9]. the arrow points to september 13.0, the predicted time of origin of the rapid flux rise. in figure 6 we look at the low polarization light curve in more detail around the september 13 event. a theoretical light curve is also drawn, and an arrow points to the expected moment of the beginning of the sharp flux rise. we see that the observed flux rise coincides within 6 hours with the expected time. the accuracy is about the same as that with which we were able to predict the return of halley's comet in 1986! 2.4 2007–2010 outbursts ref. [18] gave a detailed prediction of the whole light curve of oj287 during the campaign period (figure 7). in addition to the two impact outbursts, it was expected that the tidal forcing mechanism of sillanpää et al. [5] would raise the general level of activity of oj287, starting from the spring of 2007 and continuing until the spring of 2009. the detailed structure of minor bursts in figure 7 is immaterial, since it is due to poisson noise in a simulation with a finite number of disk particles. this prediction is best compared with the low-polarization light curve of figure 5. in general outline oj287 behaved just as expected, except that the optical flux declined fast in the spring of 2008, sooner than we would have thought. fig. 7: the predicted optical light curve of oj287 during 2000–2012. the data are based on ref. [18]. fig. 8: the observed optical flux of oj287 during 2004–2010 minus the prediction in ref. [18]. the scatter is 1.4 mjy, and the only significant deviation from the prediction occurs in the spring of 2008, which suggests an eclipse. at this point we may remind the reader that sillanpää et al. [5] also interpreted some sharp flux decreases as "eclipses". at these times the secondary may move across our line of sight, between us and the agn optical continuum source. one such event was predicted in 2008 [9], but it is not included in the light curve prediction of figure 7. however, if we take the differential of observed minus predicted flux (figure 8), the eclipse-like feature becomes quite obvious.
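the residual test behind figure 8 amounts to flagging epochs where the observed-minus-predicted flux drops well below the quoted 1.4 mjy scatter. a minimal sketch; the function name, the 3σ threshold and the sample arrays are hypothetical stand-ins, not the authors' pipeline:

```python
import numpy as np

def flag_eclipse(observed_mjy, predicted_mjy, scatter_mjy=1.4, n_sigma=3.0):
    """flag epochs whose observed-minus-predicted flux residual falls
    below -n_sigma times the scatter (1.4 mjy in the text), i.e.
    eclipse-like fading relative to the model light curve."""
    residual = np.asarray(observed_mjy) - np.asarray(predicted_mjy)
    return residual < -n_sigma * scatter_mjy

# toy example: a 6 mjy deficit at the third epoch is flagged
print(flag_eclipse([5.0, 4.8, 1.0, 5.2], [5.1, 5.0, 7.0, 5.0]))
# -> [False False  True False]
```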
the two previous "eclipses" in the same sequence occurred in 1989 [38] and in 1998 [39]. the astrophysical reason for the eclipses could be gravitational deflection of the jet stream by the secondary, extinction in gas clouds circling the secondary, or something else. figure 7 also shows a prominent outburst at the end of 2009. it, too, came as expected [40]. in our model the accretion disk is optically thick but geometrically thin; it possesses a strong magnetic field [41], and the disk is connected to the jet by magnetic field lines [42]. ref. [12] presents an entirely different model, which they then strongly criticise, and finally they try to make a case for quasi-periodic oscillations in an accretion disk of a single black hole. it is shown in ref. [40] that the probability that such a model would explain the good match between the theory and observations is less than one part in 10^8, not to mention that the other "smoking gun" observations also remain unexplained in such a model. actually, no evidence in favour of a single black hole model is presented in ref. [12], while the criticism of the binary model is misdirected and consists of a number of incorrect statements. 2.5 turning jet the accretion disk as a whole is also affected by the companion in its 12-year orbit. on the other hand, in our model the jet and the disk are connected. thus the jet direction should be strongly influenced by the companion. there are three periodicities that could be expected to show up: the 12 yr orbital cycle, the 120 yr precession cycle (or half of it, due to symmetry) and the kozai cycle [43], which also happens to be 120 yr. the 12 yr orbital cycle produces the tidal enhancements in the accretion flow, as postulated by sillanpää et al. [5], but in addition this enhancement can be stronger or weaker depending on where we are in the precession cycle. these two tidal effects pretty much explain the overall appearance of the light curve [18]. in addition, there is a modulation in the long term base emission level (unexplained by the tidal enhancement) which is in tune with the kozai cycle. this cycle may also appear in polarization data [44]. the jet orientation is delayed relative to the disk wobble. theoretically the delay should be of the order of ten years; fitting with the optical data gives the best fit with a 13 yr delay. the jet wobble shows up in observations in several ways. first, the mean angle of optical polarization varies. the binary model predicts, among other things, a quick change in the optical polarization angle by nearly 90° around 1995, which was observed [12]. in radio, we should see a similar rapid change in the position angle of the parsec scale jet. depending on the actual value of the delay in the radio jet orientation, the change could already be under way (figure 9), or it may be delayed by another 12 yr cycle (figure 10, ref. [45]). there are recent observations which suggest the first alternative [46], but the interpretation of these observations is not yet clear-cut. fig. 9: the observed position angle of the radio jet of oj287 (points) compared with the binary model, with a 3 year response time of the jet orientation changes. fig. 10: the observed position angle of the radio jet of oj287 (points) compared with the binary model, with a 14 year response time of the jet orientation changes.
there are also longer periods that are expected in the binary model: the period of the black hole spin (about 1300 yr, refs. [47,48]), which may show up in the structure of the megaparsec scale jet [49]. also the time scale of the binary settling in the nucleus of oj287 after a merger of two galaxies, about 10^8 yr [50], may be connected with the overall curvature of the megaparsec jet. on the shorter time scales, the half-period of the last stable orbit around the kerr black hole, ∼ 50 days, may also show up in observations [51]. 3 testing general relativity using the oj287 binary, we may test the idea that the central body is actually a black hole. one of the most important characteristics of a black hole is that it must satisfy the so-called no-hair theorem or theorems [52–57]. a practical test was suggested in refs. [58,59]. in this test the quadrupole moment q of the spinning body is measured. if the spin of the body is s and its mass is m, we determine the value of the parameter q in

$$Q = -q\,\frac{S^2}{M\,c^2} \,. \qquad (1)$$

for black holes q = 1; for neutron stars and other possible bosonic structures q > 2 (refs. [60,61]). we calculate the two-body orbit using the third post-newtonian (3pn) order orbital dynamics, which includes the leading order general relativistic, classical spin-orbit and radiation reaction effects (refs. [62–64]). the 3pn-accurate equations of motion can be written schematically as

$$\ddot{\mathbf{x}} \equiv \frac{\mathrm{d}^2\mathbf{x}}{\mathrm{d}t^2} = \ddot{\mathbf{x}}_0 + \ddot{\mathbf{x}}_{1\mathrm{PN}} + \ddot{\mathbf{x}}_{\mathrm{SO}} + \ddot{\mathbf{x}}_Q + \ddot{\mathbf{x}}_{2\mathrm{PN}} + \ddot{\mathbf{x}}_{2.5\mathrm{PN}} + \ddot{\mathbf{x}}_{3\mathrm{PN}} \,, \qquad (2)$$

where x = x₁ − x₂ stands for the center-of-mass relative separation vector between the black holes with masses m₁ and m₂, and ẍ₀ represents the newtonian acceleration, given by

$$\ddot{\mathbf{x}}_0 = -\frac{G\,M}{r^3}\,\mathbf{x} \,; \qquad M = m_1 + m_2 \,, \quad r = |\mathbf{x}| \,.$$

the pn contributions occurring at the conservative 1pn, 2pn, 3pn and the reactive 2.5pn orders, denoted by ẍ_1pn, ẍ_2pn, ẍ_3pn and ẍ_2.5pn respectively, are non-spin by nature, while ẍ_so is the spin-orbit term of order 1.5pn. the quadrupole-monopole interaction term ẍ_q, entering at the 2pn order, reads

$$\ddot{\mathbf{x}}_Q = -q\,\chi^2\,\frac{3\,G^3\,m_1^2\,M}{2\,c^4\,r^4}\left\{\left[5\,(\hat{\mathbf{n}}\cdot\hat{\mathbf{s}}_1)^2 - 1\right]\hat{\mathbf{n}} - 2\,(\hat{\mathbf{n}}\cdot\hat{\mathbf{s}}_1)\,\hat{\mathbf{s}}_1\right\} \,, \qquad (3)$$

where the parameter q, whose value is 1 in general relativity, is introduced to test the black hole 'no-hair' theorem. the kerr parameter χ and the unit vector ŝ₁ define the spin of the primary black hole by the relation s₁ = g m₁² χ ŝ₁/c, and χ is allowed to take values between 0 and 1 in general relativity. the unit vector n̂ is along the direction of x. in figure 11 we show the distribution of q-values allowed by "good" orbits. by "good" we mean an orbit which gives the correct timing of the 9 outbursts within the range of measurement accuracy. obviously, the range of timing at each of the 9 outbursts means that a set of solution orbits is possible. here we have used a representative set of such orbits. fig. 11: the distribution of the test parameter q among 598 solutions of the orbit. the result is consistent with general relativity (q = 1), and excludes the cases of no relativistic spin-orbit coupling (q = 0) and of a material body (i.e. not a black hole, q greater than 2). we note that the distribution peaks at q = 1, thus confirming the no-hair theorem. this is also the first test of general relativity that has been performed at higher than the 1.5 post-newtonian order. thus it forms a milestone in our study of the correct theory of gravitation.
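the quadrupole-monopole term (3) is simple enough to transcribe directly. the sketch below is a literal transcription in si units, not the authors' orbit integrator; the function name is hypothetical, and n̂ and ŝ₁ are assumed to be unit vectors supplied by the caller:

```python
import numpy as np

G = 6.674e-11        # gravitational constant [si]
C = 2.998e8          # speed of light [m/s]

def quadrupole_acceleration(q, chi, m1, m_total, r, n_hat, s1_hat):
    """quadrupole-monopole contribution to the relative acceleration,
    eq. (3): q is the no-hair test parameter (q = 1 for a kerr black
    hole), chi the kerr parameter, m1 the primary mass, m_total the
    total mass, r the separation, n_hat the unit separation vector and
    s1_hat the unit spin vector of the primary."""
    n_hat = np.asarray(n_hat, dtype=float)
    s1_hat = np.asarray(s1_hat, dtype=float)
    ns = n_hat @ s1_hat
    prefactor = -q * chi**2 * 3.0 * G**3 * m1**2 * m_total / (2.0 * C**4 * r**4)
    return prefactor * ((5.0 * ns**2 - 1.0) * n_hat - 2.0 * ns * s1_hat)
```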
4 conclusions prior to the 2005–2010 multiwavelength campaign there were several ideas about the nature of oj287. fortunately, these models made completely different predictions about the behaviour of oj287 during these years. one of the key differences was the timing of the first outburst: the precessing binary model, initially proposed by sillanpää et al. [5] with subsequent refinements [9,10,18], predicted the outburst in october 2005, the other models in october 2006. the result of a scientific enquiry is seldom as clear-cut as this: the outburst came within one week of the time expected in the precessing binary model, and its spectrum agreed with the bremsstrahlung spectrum at the predetermined temperature. the prediction for the second outburst turned out to be accurate within 6 hours, and the lack of polarization again strongly suggested the bremsstrahlung origin. all flux values predicted for this period turned out to be accurate with a standard scatter of 1.4 mjy, which is only ten percent of the variability range in the optical. the only exception occurred in 2008; however, this was the time when lower flux values were expected due to an "eclipse". we put "eclipse" in quotation marks, as the reason for the sudden fade at the time when the secondary passes through our line of sight is not known. the optical variability data specify the binary model except for the exact direction of the jet relative to our line of sight. however, the resolved parsec scale radio jet allows us to get a handle on this parameter, too. the remaining unknown is the delay between the wobble of the accretion disk, due to the effect of the secondary, and the reorientation of the jet in the sky. a major reorientation may already have started, or it may come after one orbital period, depending on the details of the jet/disk connection [65]. the success of the binary model has encouraged us to use it to test theories of gravitation. any theory which can be presented as newton's law plus post-newtonian terms may be studied, as different laws of gravitation produce different impact times on the accretion disk. we have used the post-newtonian terms of general relativity up to third order, and have found that the orbit solutions agree with the theory. our test parameter q should have the value of exactly 1, and indeed the possible solutions cluster around this value. the parameter values q = 0 and q = 2 can at present be rejected at the 4 standard deviation level. this is the first time that it has been possible to study general relativity at higher than the 1.5pn order [66]. acknowledgement we would like to thank all the participants in this campaign for the extraordinary amount of work dedicated to solving the riddle of oj287. in particular, we thank kari nilsson and stefano ciprini, who have put together and harmonized huge amounts of data. rene hudec has made an invaluable contribution in collecting historical data, tapio pursimo and leo takalo were responsible for the key polarization observations at not, and seppo mikkola, harry lehto and achamveedu gopakumar have been key persons in turning the data into a viable binary black hole model. references [1] haarala, s., korhonen, t., sillanpää, a., salonen, e.: iau circ. 3764, 3, 1983. [2] sillanpää, a., et al.: astron. & astrophys. 147, 67, 1985. [3] smith, p. s., balonek, t. j., elston, r., heckert, p. a.: astrophys. j. suppl. 64, 459, 1987. [4] valtaoja, l., sillanpää, a., valtaoja, e.: astron. & astrophys. 184, 57, 1987. [5] sillanpää, a., haarala, s., valtonen, m. j., sundelius, b., byrd, g. g.: astrophys. j. 325, 628, 1988. [6] fiorucci, m., et al.: iau circ. 6104, 2, 1994. [7] sillanpää, a., et al.: astron. & astrophys. 305, l17, 1996. [8] valtonen, m.
j.: workshop on two years of intensive monitoring of oj 287 and 3c 66a, proceedings of the meeting held in oxford, england, 11–14 september, 1995. tuorla observatory reports, informo no. 176, edited by leo o. takalo, tuorla observatory, university of turku, p. 64, 1996. [9] lehto, h. j., valtonen, m. j.: astrophys. j. 460, 207, 1996. [10] sundelius, b., wahde, m., lehto, h. j., valtonen, m. j.: blazar continuum variability, astronomical society of the pacific conference series 110, proceedings of an international workshop held at florida international university, miami, florida, usa, 4–7 february 1996, san francisco: astronomical society of the pacific, edited by h. richard miller, james r. webb, and john c. noble, p. 99, 1996. [11] sillanpää, a., et al.: astron. & astrophys. 315, l13, 1996. [12] villforth, c., et al.: mon. not. ras, 402, 2087, 2010. [13] igumenshchev, i. v., abramowicz, m. a.: mon. not. ras, 303, 309, 1999. [14] katz, j. i.: astrophys. j. 478, 527, 1997. [15] villata, m., raiteri, c. m., sillanpää, a., takalo, l. o.: mon. not. ras, 293, l13, 1998. [16] valtaoja, e., teräsranta, h., tornikoski, m., sillanpää, a., aller, m. f., aller, h. d., hughes, p. a.: astrophys. j. 531, 744, 2000. [17] kidger, m. r.: astron. j. 119, 2053, 2000. [18] sundelius, b., wahde, m., lehto, h. j., valtonen, m. j.: astrophys. j. 484, 180, 1997. [19] valtonen, m. j.: astrophys. j. 659, 1074, 2007. [20] valtonen, m. j.: the nuclear region, host galaxy and environment of active galaxies: proceedings of a meeting to celebrate the 60th birthday of deborah dultzin-hacyan, huatulco, mexico, 18–20 apr 2007 (eds. erika benítez, irene cruz-gonzález, yair krongold), revista mexicana de astronomía y astrofísica (serie de conferencias), vol. 32, 22, 2008. [21] valtonen, m. j., et al.: astrophys. j. 646, 36, 2006. [22] valtonen, m. j., et al.: nature, 452, 851, 2008. [23] valtonen, m., kidger, m., lehto, h., poyner, g.: astron. & astrophys. 477, 407, 2008. [24] valtonen, m. j., et al.: astrophys. j. 698, 781–785, 2009. [25] ciprini, s., et al.: mem. soc. astronomica italiana, 78, 741, 2007. [26] ghisellini, g., et al.: mon. not. ras, 399, l24, 2009. [27] sijacki, d., springel, v., haehnelt, m. g.: mon. not. ras, 400, 100, 2009. [28] kelly, b. c., et al.: astrophys. j. 719, 1315, 2010. [29] trakhtenbrot, b., netzer, h., lira, p., shemmer, o.: astrophys. j. 730, 7, 2011. [30] wright, s. c., mchardy, i. m., abraham, r. g.: mon. not. ras, 295, 799, 1998. [31] kormendy, j., bender, r.: nature, 469, 377, 2011. [32] hutchings, j. b.: astrophys. j. 320, 122, 1987. [33] heidt, j., et al.: astron. & astrophys. 352, l11, 1999. [34] pursimo, t., et al.: astron. & astrophys. 381, 810, 2002. [35] ghosh, k. k., soundararajaperumal, s.: astrophys. j. suppl. 100, 37, 1995. [36] falco, e. e., et al.: astrophys. j. 523, 617, 1999. [37] draine, b. t.: ann. rev. astron. & astrophys. 41, 241, 2003. [38] takalo, l. o., et al.: astron. & astrophys. suppl. 83, 459, 1990. [39] valtonen, m. j., lehto, h. j., pietilä, h.: astron. & astrophys. 342, l29, 1999. [40] valtonen, m. j., lehto, h. j., takalo, l. o., sillanpää, a.: astrophys. j. 729, 33, 2011. [41] sakimoto, p. j., coroniti, f. v.: astrophys. j. 247, 19, 1981. [42] turner, n. j., bodenheimer, p., rozyczka, m.: astrophys. j. 524, 129, 1999. [43] innanen, k. a., zheng, j. q., mikkola, s., valtonen, m. j.: astron. j. 113, 1915, 1997. [44] sillanpää, a.: astron. & astrophys. 247, 11, 1991. [45] valtonen, m.
j., savolainen, t., wiik, k.: jets at all scales, proceedings of the international astronomical union, iau symposium, vol. 275, 2011, p. 275. [46] agudo, i., et al.: astrophys. j. 726, l13, 2011. [47] valtonen, m. j., et al.: astrophys. j. 709, 725, 2010. [48] valtonen, m. j., et al.: celestial mechanics and dynamical astronomy, 106, 235, 2010. [49] marscher, a. p., jorstad, s. g.: astrophys. j. 729, 26, 2011. [50] iwasawa, m., an, s., matsubayashi, t., funato, y., makino, j.: astrophys. j. 731, l9, 2011. [51] wu, j., et al.: astron. j. 132, 1256, 2006. [52] israel, w.: phys. rev. 164, 1776, 1967. [53] israel, w.: commun. math. phys. 8, 245, 1968. [54] carter, b.: phys. rev. lett. 26, 331, 1970. [55] hawking, s. w.: phys. rev. lett. 26, 1344, 1971. [56] hawking, s. w.: commun. math. phys. 25, 152, 1972. [57] misner, c. w., thorne, k. s., wheeler, j. a.: gravitation, w. h. freeman & co, new york, 1973, p. 876. [58] thorne, k. s.: rev. mod. phys. 52, 299, 1980. [59] thorne, k. s., price, r. m., macdonald, d. a.: in black holes: the membrane paradigm, new haven: yale univ. press, 1986. [60] wex, n., kopeikin, s. m.: astrophys. j. 514, 388, 1999. [61] will, c. m.: astrophys. j. 674, l25, 2008. [62] barker, b. m., o'connell, r. f.: phys. rev. d, 12, 329, 1975. [63] damour, t.: c. r. acad. sci. paris 294 (ii), 1355, 1982. [64] kidder, l. e.: phys. rev. d, 52, 821, 1995. [65] valtonen, m. j., villforth, c., wiik, k.: mon. not. ras, in press, arxiv:1111.1539, 2011. [66] valtonen, m. j., mikkola, s., lehto, h. j., gopakumar, a., hudec, r., polednikova, j.: astrophys. j. 742, 22, 2011. mauri valtonen, helsinki institute of physics, university of helsinki; department of physics and astronomy, university of turku; finnish centre for astronomy with eso, university of turku. aimo sillanpää, department of physics and astronomy, university of turku. acta polytechnica vol. 52 no. 4/2012 fundamental research on hydraulic systems driven by alternating flow ioan-lucian marcu1, daniel-vasile banyai1 1 technical university of cluj-napoca, department of mechanical engineering, b-dul muncii nr. 103-105, cluj-napoca, romania correspondence to: lucian.marcu@termo.utcluj.ro abstract this paper presents a new approach to rotary hydraulic systems, and the functional principles of rotary hydraulic systems that can work using alternating flows. hydraulic transmissions using alternating flows are based on the bidirectional displacement of a predefined volume of fluid through the connection pipes between the alternating flow and pressure energy generator and the motor. the paper also presents some considerations regarding the basic calculation formulas, the design and testing principles for a hydraulic motor driven by alternating flow, and also a three-phase rotary hydraulic motor. keywords: hydraulic, alternating flow, star or delta connection. 1 general considerations all known methods of power transmission by a fluid, and their applications, are based on continuous pressure and flow circulation, achieved by the pump and collected by the final elements (actuators), the fluid being considered incompressible. in conventional hydraulic transmissions, the fluid performs a unidirectional motion between the energy converters in the power transmitting process. in hydraulic transmissions driven by alternating flow, the fluid executes an alternating periodic motion between the energy converters.
in the case of a hydraulic system in which every working volume of an alternating motor is connected independently, by a phase pipe, with the corresponding working volume of a generator, any modification of the volume of the generator will produce an alternating flow and pressure transmitted along the phase line to the motor. figure 1: principle schema of a rotary hydraulic system with alternating flow and pressure drive. generally, a hydraulic transmission driven by rotary alternating flow consists of a generator of alternating flows and pressures (g) and a motor (m). the connection between them is realized by a number of pipes equal to the number of phases (phase 1, phase 2 and phase 3), the pipes being filled with fluid at a certain (pre-established) pressure, figure 1. when the system is functioning, the pressure and the flow within each pipe vary sinusoidally around an average value [1]. for proper functioning, the average pressure must have the same value in every phase pipe and must remain constant in time. to obtain the correct functionality, we therefore either create from the beginning a pressure in each phase that is higher than the maximum value of the amplitude, or we let the pressure adjust itself during operation. the latter is achieved by using a series of rigorously calculated hydraulic resistances (rz1, rz2, rz3), which interconnect all the phases, and a hydraulic accumulator (ac) connected to them at point c, figure 1. the resistances must eliminate the maximum average pressure rise within one second, while the reduction of the flow amplitude in a phase must not exceed 1 %. the presence of the accumulator keeps the pressure in the connection approximately constant at all times; it is able to take over the oil surplus from thermal dilatation and at the same time to make up any oil losses. in the case of slow working cycles, as here, the correct transformation for the accumulator is the isothermal transformation, which, together with the known system pressure, leads us to the necessary volume of the hydraulic accumulator. this hypothesis was verified using an experimental stand consisting of a three-phase hydraulic transmission driven by an asynchronous alternating flow.
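the isothermal sizing mentioned above is ordinary boyle's-law bookkeeping. a minimal sketch under that assumption; this is a textbook formula, not the authors' actual design calculation, and the numerical values are invented for illustration:

```python
def accumulator_volume(delta_v, p0, p_min, p_max):
    """isothermal (boyle's law, p*v = const) sizing sketch: gas volume
    v0, precharged at pressure p0, that can absorb/release the oil
    volume delta_v while the system pressure stays between p_min and
    p_max. all pressures in the same (absolute) units."""
    return delta_v / (p0 / p_min - p0 / p_max)

# illustrative, hypothetical numbers: absorb 0.05 l of oil between
# 9 and 11 bar with an 8 bar precharge
print(accumulator_volume(0.05, 8.0, 9.0, 11.0))  # ~0.31 l of gas volume
```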
2 design principles of hydraulic energy converters driven by alternating flow as presented in figure 1, a hydraulic transmission driven by alternating flow is a closed circuit that does not require a tank for the working fluid. the generator was a modified axial piston pump (wobble-plate type) with three active pistons placed at angles of 120°, which allows variation of the plate angle, thus obtaining different strokes for the pistons. a hydraulic rotary motor driven by alternating flow was realized using the principles presented in figures 2a and 2b [4]. the simplest solution for the construction of a rotary hydraulic motor working with alternating flow is to use in the same assembly three different oscillating motors with a gear rack (see elements 1 and 3 in figures 2a and 2b). because each piston moves independently, corresponding to its phase pipe, each gear wheel must be able to act individually on the output shaft. the correct solution for this is to use an intermediary element, e.g. a drawn cup roller clutch, which provides a unidirectional rotation movement (see element 2 in figures 2a and 2b). in this construction, the three-phase generator provides an alternating flow only for the active stroke; the retraction stroke of the motor pistons (the idle stroke) must also be made using an alternating flow provided by the generator, taking into account that each piston movement of the generator has a phase shift of 120°. if this solution is taken into consideration, we obtain two separate ways of interconnecting the working volumes of the motor: star or delta. figures 2a and 2b show that each phase pipe (line) is connected individually to one working volume of each motor cylinder, in this way providing the active stroke of the pistons. the remaining working volumes (chambers) of the cylinders are interconnected. in this way, considering that the movement of each piston has a phase difference of 120°, the advance of one motor piston will generate a corresponding flow in the star connection, which provides the retraction stroke (the idling stroke) of the next two pistons. a characteristic of the star connection is that the sum of the alternating flows at the connection point is theoretically zero, figure 2a. in the case of the delta connection, each first working volume of a cylinder, which provides the active stroke, is connected with the second working volume of the next phase cylinder, providing in this way its retraction stroke (the idling stroke); the pressures in these chambers are considered, theoretically of course, to have the same values, figure 2b. figure 2: functioning principle of a rotary hydraulic motor driven by alternating flow: a) star interconnection, b) delta interconnection.
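the claim that the three alternating flows cancel at the star point is a plain identity for equal-amplitude sinusoids shifted by 120°, and is easy to check numerically (the 5 hz drive frequency is an arbitrary illustrative choice):

```python
import numpy as np

# three alternating flows of equal amplitude with 120-degree phase
# shifts, as delivered by the three-phase generator
t = np.linspace(0.0, 1.0, 1000)
omega = 2.0 * np.pi * 5.0          # assumed 5 hz drive, illustrative
q = [np.sin(omega * t + k * 2.0 * np.pi / 3.0) for k in range(3)]

# their sum at the star point vanishes at every instant
print(np.max(np.abs(q[0] + q[1] + q[2])))   # ~1e-15, i.e. zero
```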
according to these functioning principles, a prototype of a rotary hydraulic three-phase motor working with alternating flows was realized, figures 3a and 3b [2]. in principle, the designed rotary hydraulic motor consists of three associated hydraulic double end cylinders, with small dimensions, which act individually on an output shaft, thus obtaining a continuous rotational movement of the shaft. figure 3: the design of the three-phase rotary hydraulic motor driven by alternating flow. the hydraulic motor was designed in two versions. the compact model, figure 3a, has gear rack conversion mechanisms associated with each small hydraulic cylinder group. figure 3b presents an experimental model of the rotary hydraulic motor driven by alternating flow with a pulley and elastic steel sheet conversion mechanism. each designed version contains three double end hydraulic cylinders which individually actuate pulleys (or gears) having drawn cup roller clutches mounted inside; in this way the rotational motion and also the torque are transmitted unidirectionally to the output shaft of the motor. 3 basic calculation we assume that the governing equation for the instantaneous flow is

$$q_i = q_{a\,\mathrm{max}} \cdot \sin(\omega t + \varphi_0) \qquad (1)$$

and for the pressure:

$$p_i = p_{st} + p_{a\,\mathrm{max}} \cdot \sin(\omega t + \varphi_0) \qquad (2)$$

in which q_i is the instantaneous flow in [m³/s], q_a max is the alternating flow amplitude in [m³/s], p_i is the instantaneous pressure in [n/m²], p_st is the initial static pressure of the system in [n/m²], p_a max is the alternating pressure amplitude in [n/m²], ω is the angular frequency in [rad/s], t is the time in [s], and φ₀ is the initial phase angle in [rad]. making a complete analysis of a hydraulic system working with alternating flow requires taking into account the parameters that define the pressure drop along the line due to friction, inertia and elasticity (compressibility) of the fluid, and also the fluid leaks. the instantaneous pressure drop due to friction along the line is

$$\Delta p_{iR} = R_f \cdot q_{a\,\mathrm{max}} \cdot \sin(\omega t + \varphi_0) \qquad (3)$$

where R_f is the flow resistance, in [n·s/m⁵]. eq. (3) shows that the instantaneous flow is in phase with the pressure drop through the resistance. the pressure drop due to fluid inertia is

$$\Delta p_{iL} = \omega \cdot L \cdot q_{a\,\mathrm{max}} \cdot \cos(\omega t + \varphi_0) \qquad (4)$$

where L is the fluid inertia, in [n·s²/m⁵]. the phase vector of the inductive pressure drop is advanced by 90° in comparison with the instantaneous flow. if we consider the elasticity of the transmitting medium, the capacitive pressure drop is

$$\Delta p_{iC} = -\frac{q_{a\,\mathrm{max}}}{C_h \cdot \omega} \cdot \cos(\omega t + \varphi_0) \qquad (5)$$

where C_h is the hydraulic capacity, in [m⁵/n]. it can be noted that the vector of the capacitive pressure drop is 90° behind the instantaneous flow.
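equations (3)–(5) transcribe directly into code. the sketch below evaluates the three pressure-drop components along a phase line; the function name and the sample parameter values are invented for illustration, but the formulas are exactly (3)–(5):

```python
import numpy as np

def pressure_drops(t, q_amp, omega, phi0, r_f, l_in, c_h):
    """instantaneous pressure-drop components of eqs. (3)-(5), si units:
    friction (in phase with the flow), inertia (leading by 90 degrees)
    and capacitance (lagging by 90 degrees). r_f is the flow resistance,
    l_in the fluid inertia, c_h the hydraulic capacity."""
    dp_friction = r_f * q_amp * np.sin(omega * t + phi0)
    dp_inertia = omega * l_in * q_amp * np.cos(omega * t + phi0)
    dp_capacitive = -q_amp / (c_h * omega) * np.cos(omega * t + phi0)
    return dp_friction, dp_inertia, dp_capacitive

# illustrative (hypothetical) parameter values
t = np.linspace(0.0, 0.2, 5)
print(pressure_drops(t, q_amp=1e-4, omega=2 * np.pi * 5, phi0=0.0,
                     r_f=1e9, l_in=1e7, c_h=1e-12))
```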
4 testing principles and experimental results the functioning prototype was tested in the idling state and in the loaded state. the load was obtained by using a braking device connected to the output shaft and actuated by a small hydraulic cylinder. in this way, different values of the loading torque can be obtained. the preliminary experimental data, representing the functional parameters, were obtained taking into account the monitoring protocol and the disposition of the sensors, using a data acquisition board, figure 4. figure 4: pressures, motor piston stroke and angular speed evolution. in order to acquire the experimental data during the testing process we had to control the mechanical and hydraulic parameters precisely, using proximity sensors mounted on the rotation shafts, pressure sensors mounted at representative points of the hydraulic pipes, and displacement sensors mounted on the motor cylinder pistons [3]. a series of functional parameters of the hydraulic system driven by alternating flow was obtained directly, such as: the angular speed of the shafts (generator and motor); the pressures (on the generator, on the loaded motor, on the idle motor, in the phase line pipes and in the interconnection pipes); and the piston strokes of the motor. some of the diagrams were obtained indirectly, using recorded data referring to the transmitted torques and the phase shift between the alternating flow and the harmonic pressure, or the rotational positions of the generator and motor shafts. 5 conclusion the objective of this research was to study a new approach to hydraulic drives, in which the pressure and flow are not continuously transmitted between the energy converters (pumps and motors). the paper describes the construction principles of hydraulic systems driven by alternating flow. within these systems, the active stroke of the pistons is produced by the flow of pressurized fluid from the generator, while for the retraction stroke a supplementary connection to a pressure generator working in the opposite phase to the first is necessary. this means that the pistons can be interconnected in star or delta configuration. the experimental results, combined with the mathematical model of this system, demonstrate a way to adjust, during functioning, several input parameters (e.g. the initial static pressure and the angular speed of the generator), in order to obtain the anticipated output values of some parameters, or to respond when the system load changes. knowing the limits of this system, the next development step is automatic control of the main mechanical and hydraulic output parameters. references [1] constantinescu, g.: theory of sonics. bucureşti: published by the romanian academy, 1985 (in romanian). [2] marcu, i.-l.: researches and contributions regarding the improving of the acting systems functioning with pressure waves. phd thesis, cluj-napoca, 2004. [3] marcu, i.-l.: functional parameters monitoring for an alternating flow driven hydraulic system. in: ieee proceedings of international conference aqtr, cluj-napoca, 2008, tome iii, p. 239. isbn 978-1-4244-2576-1. [4] pop, i., et al.: sonics applications. experimental results. ed. performantica, iaşi, 2007 (in romanian). isbn 978-973-730-391-2. acta polytechnica vol. 48 no. 4/2008 influence of a point-to-plane dc negative corona discharge on gel surfaces y. klenko, v. scholtz abstract point-to-plane corona discharge is widely used for modifying polymer surfaces for biomedical applications and for sterilization and decontamination. this paper focuses on an experimental investigation of the influence of the single-point and multi-point corona discharge electric field on a gel surface. three types of gelatinous agar were used as the gel medium: blood agar, nutrient agar and endo agar. the gel surface modification was studied for various time periods and discharge currents. keywords: point-to-plane corona, electrohydrodynamic deformation, gel ion-conducting electrode. 1 introduction non-equilibrium atmospheric pressure discharges have recently been used in emerging novel applications such as the surface modification of polymers, the absorption or reflection of electromagnetic waves, biomedical treatments and plasma-aided combustion [1]. among these various interesting new applications, the interaction of atmospheric pressure "cold" plasmas with a biological medium promises to open new horizons in the medical and environmental fields. the main methods of sterilization, i.e., inactivation of living microorganisms, are based on thermal treatment (dry or moist heat), chemical treatment (e.g., eto, h2o2) or exposure to ionizing radiation (x-rays, gamma radiation, e-beams). these methods have specific drawbacks linked to their conditions of operation, e.g., the toxicity of the active agent used, the temperature conditions, or the size of the installation. clearly, plasma sources have some advantages, such as low temperature (at or near room temperature), no risk of arcing, operation at atmospheric pressure, portability, optional hand-held operation, etc. therefore low-temperature non-equilibrium plasmas are playing an increasing role in biomedical applications, and reliable, user-friendly sources need to be developed. a point-to-plane corona discharge at atmospheric pressure is one of the atmospheric pressure discharges applicable to studying the interactions of atmospheric ions with microorganisms such as bacteria and fungi [2–4]. the bactericidal or fungicidal effect of a corona discharge can be studied directly on a gel medium (agar), which is a gelatinous substance chiefly used as a culture medium for microbiological work. this technology minimizes the risk of accidental contamination of the samples, and makes the experiments relatively simple. corona discharges are usually realized in such a way that one of the electrodes has a small radius of curvature. this makes the electric field intensity very high close to this electrode in comparison with other points in the gap. electrons and positive ions are generated. if the gas contains an electronegative component, electrons can attach to it to produce negative ions. afterwards the ions drift to the corresponding electrode according to the particle polarity. their movement is braked by collisions with neutral atoms and molecules. this process can be classified as friction. ions drifting to the electrode with a large radius of curvature (further referred to as the plate electrode) can also make the neutral gas move. this important phenomenon is called electric wind [5].
due to the ionic wind the gel surface is deformed, which results in a change in the corona discharge geometry. the discharge current and voltage change during the experiments in dependence on the surface deformation and the gel medium. 2 experimental part fig. 1 shows the scheme of the apparatus used for generating a point-to-plane negative corona discharge at atmospheric pressure. fig. 1: schematic of an open-air corona exposure single-point set-up with a petri dish 90 mm in diameter. each type of gel electrode was placed in a standard petri dish (90 mm in diameter). the thickness of the gel layer electrode varied among the analyzed samples, the mean value being 8 mm. the following gel types were used: blood agar, nutrient agar, endo agar. all were prepared from defined fabric substrates. six samples of each type of agar were used; the dates of sample preparation and the storage times of the samples in a cooling box varied. the point corona electrode was realized by an intramuscular medicinal hollow needle with an outer diameter of 0.7 mm. the needle had a sharpened angle of 15 degrees. the position of the hollow needle was adjusted by a micrometric screw placed inside the needle holder. the position of the petri dishes could also be changed in two directions, perpendicular to the needle movement and perpendicular to each other. the position change of the hollow needle and the petri dish was measured with a precision of 0.01 mm. when the needle was used as an electrode in the discharge regime, it was connected to a high voltage supply. the same needle was used for detecting deformation in the measurement regime. the conductive connection between the tip of the hollow needle and the gel surface was checked using an ohmmeter. initially, the tip of the hollow needle was placed 6 mm above the gel surface, and the positions of the needle and the high voltage source set-up remained unchanged during corona treatment. in the deformation measurements the dimple depth was measured as the change in the distance between the electrodes: the needle was moved perpendicularly toward the gel surface, and the contact between the needle tip and the gel surface was detected by the ohmmeter. the exposition times varied (1, 2, 4, 8, 16 min). 3 results fig. 3 is a photograph of the gel surface deformations caused by the corona discharge; a matrix multi-point-to-plane corona discharge was used in this case, and the figure shows the gel deformation due to the multi-point system. due to their simplicity, the subsequent measurements were made with a single-needle-to-gel electrode system. fig. 4 shows the dependence of the dimple depth on the gel surface exposition time in a point-to-plane corona discharge.
during the measurements, the corona current decreased from its initial value (50 µa) according to the gel electrode deformation. the dependence of the dimple depth on the point-to-plane corona initial current is shown in fig. 5; the duration of the gel surface exposition in the corona discharge was 16 minutes. the diameter of the dimple at 0.3 mm depth was taken as a standard characteristic dimension of the area affected by the point-to-plane corona discharge. after the action of the corona, the needle was moved 6.3 mm towards the gel surface, and the standard diameter d was measured. fig. 6 presents the process of measuring the standard diameter. fig. 2: photograph illustrating the experimental set-up. fig. 3: photograph of a petri dish showing the gel surface deformation caused by the corona discharge exposition. fig. 4: dependence of dimple depth on exposition time at discharge current 50 µa. fig. 5: dependence of dimple depth on discharge current (exposition time 16 min). the dependence of the standard diameter on the exposition duration is shown in fig. 7; the initial corona current was 50 µa. the dependence of the standard diameter d on the initial corona current is shown in fig. 8; the exposition duration was 16 minutes. 4 conclusion we developed an apparatus that generates a dc negative point-to-plane corona discharge with various parameters in air at atmospheric pressure. in addition, we attempted to develop a method enabling a quantitative evaluation of the action of the discharge on an artificial bacterial culture. we also studied the influence of the electric field of the corona discharge on the substrate formed by the agar gel. electrohydrodynamic deformation of the substrate was observed: the gel surface was deformed by the ionic wind in the point-to-plane corona. the deformation of the gel surface was comparable to the distances between the electrodes used in sterilization experiments [2, 3]. the electrohydrodynamic deformation effect redistributed the corona current during the gel exposition to the corona. moreover, figs. 4–7 show that the gel surface deformation cannot be predicted for a specific gel type (although a relatively precise method of gel surface position determination was used). the variation of the deformation characteristic dimensions may be explained by the fact that the water content of the agar plates used here changed during storage, and consequently the gel electrode viscosity also changed. acknowledgments the research presented in this paper was supervised by doc. j. píchal, mudr. ing. v. kříha and prof. l. aubrecht, fee ctu in prague, and has been supported by gačr grant no. 202-03-h162 "advanced study in physics and chemistry of the plasma". references [1] laroussi, m., tendero, c., lu, x., alla, s., hynes, w. l.: inactivation of bacteria by the plasma pencil. plasma process polym. 2006, 3, p. 470–473. [2] scholtz, v., et al.: corona discharge: a simple method of its generation and study of its bacterial properties. cz. j. phys., vol. 56 (2006), suppl. b, p. 1333–1338. [3] scholtz, v., et al.: the study of bactericidal effects of corona discharge at atmospheric pressure. in: proceedings of the 33rd eps conference on plasma phys., rome (italy), 2006, vol. 301, p. 4.007. [4] sigmond, r.
s., kurdelova, b., kurdel, m.: action of corona discharges on bacteria and spores. cz. j. phys., vol. 49 (1999), no. 3, p. 405–420. [5] kurdel, m., morvova, m.: dc corona discharge influence on chemical composition in mixtures of natural gas with air and its combustion exhaust with air. cz. j. phys., vol. 47 (1997), no. 2, p. 205–215. yuliya klenko, e-mail: klenkj1@feld.cvut.cz; vladimir scholtz, e-mail: vscholtz@gmail.com; czech technical university, faculty of electrical engineering, technická 2, 166 27 praha, czech republic. fig. 6: the process of measuring standard diameter d. fig. 7: dependence of the standard diameter on exposition time (discharge current 50 µa). fig. 8: dependence of the standard diameter on discharge current (exposition time 16 min).
acta polytechnica vol. 51 no. 1/2011 bidifferential calculus, matrix sit and sine-gordon equations a. dimakis, n. kanning, f. müller-hoissen abstract we express a matrix version of the self-induced transparency (sit) equations in the bidifferential calculus framework. an infinite family of exact solutions is then obtained by application of a general result that generates exact solutions from solutions of a linear system of arbitrary matrix size. a side result is a solution formula for the sine-gordon equation. keywords: bidifferential calculus, integrable system, self-induced transparency, sine-gordon.
1 introduction the bidifferential calculus approach (see [1] and the references therein) aims to extract the essence of integrability aspects of integrable partial differential or difference equations (pddes) and to express them, and relations between them, in a universal way, i.e. resolved from specific examples. a powerful, though simple to prove, result [1, 2, 3] (see section 6) generates families of exact solutions from a matrix linear system. in the following we briefly recall the basic framework and then apply the latter result to a matrix generalization of the sit equations. 2 bidifferential calculus a graded algebra is an associative algebra Ω over ℂ with a direct sum decomposition

$$\Omega = \bigoplus_{r \ge 0} \Omega^r$$

into a subalgebra A := Ω⁰ and A-bimodules Ωʳ, such that Ωʳ Ωˢ ⊆ Ωʳ⁺ˢ. a bidifferential calculus (or bidifferential graded algebra) is a unital graded algebra Ω equipped with two (ℂ-linear) graded derivations d, d̄ : Ω → Ω of degree one (hence dΩʳ ⊆ Ωʳ⁺¹, d̄Ωʳ ⊆ Ωʳ⁺¹), with the properties

$$\mathrm{d}_z^2 = 0 \quad \forall z \in \mathbb{C} \,, \qquad \text{where } \mathrm{d}_z := \bar{\mathrm{d}} - z\,\mathrm{d} \,, \qquad (1)$$

and the graded leibniz rule $\mathrm{d}_z(\chi\,\chi') = (\mathrm{d}_z\chi)\,\chi' + (-1)^r\,\chi\,\mathrm{d}_z\chi'$, for all χ ∈ Ωʳ and χ′ ∈ Ω. 3 dressing a bidifferential calculus let (Ω, d, d̄) be a bidifferential calculus. replacing d_z in (1) by D_z := d̄ − A − z d with a 1-form A ∈ Ω¹ (in the expression for D_z to be regarded as a multiplication operator), the resulting condition D_z² = 0 (for all z ∈ ℂ) can be expressed as

$$\mathrm{d}A = 0 = \bar{\mathrm{d}}A - A\,A \,. \qquad (2)$$

(indeed, expanding D_z² in powers of z and using (1), one finds D_z² = (A A − d̄A) + z dA, acting by multiplication, so that D_z² = 0 for all z is equivalent to (2).) if (2) is equivalent to a pdde, we have a bidifferential calculus formulation for it. this requires that A depends on independent variables and the derivations d, d̄ involve differential or difference operators. several ways exist to reduce the two equations (2) to a single one: (1) we can solve the first of (2) by setting A = dφ. this converts the second of (2) into

$$\bar{\mathrm{d}}\mathrm{d}\phi = \mathrm{d}\phi\,\mathrm{d}\phi \,. \qquad (3)$$

(2) the second of (2) can be solved by setting A = (d̄g)g⁻¹. the first equation then reads

$$\mathrm{d}\big((\bar{\mathrm{d}}g)\,g^{-1}\big) = 0 \,. \qquad (4)$$

(3) more generally, setting A = [d̄g − (dg)δ]g⁻¹ with some δ ∈ A, we have d̄A − A A = (dA)gδg⁻¹ + (dg)(d̄δ − (dδ)δ)g⁻¹. as a consequence, if δ is chosen such that d̄δ = (dδ)δ, then the two equations (2) reduce to

$$\mathrm{d}\big([\bar{\mathrm{d}}g - (\mathrm{d}g)\delta]\,g^{-1}\big) = 0 \,. \qquad (5)$$

with the choice of a suitable bidifferential calculus, (3) and (4), or more generally (5), have been shown to reproduce quite a number of integrable pddes. this includes the self-dual yang-mills equation, in which case (3) and (4) correspond to well-known potential forms [1]. having found a bidifferential calculus in terms of which e.g. (3) is equivalent to a certain pdde, it is not in general guaranteed that (4) also represents a decent pdde. then the generalization (5) has a chance to work (cf. [1]). in such a case, the miura transformation

$$[\bar{\mathrm{d}}g - (\mathrm{d}g)\delta]\,g^{-1} = \mathrm{d}\phi \qquad (6)$$

is a hetero-bäcklund transformation relating solutions of the two pddes. bäcklund, darboux and binary darboux transformations can be understood in this general framework [1], and there is a construction of an infinite set of (generalized) conservation laws. exchanging d and d̄ leads to what is known in the literature as 'negative flows' [3].
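the fact that (6) maps solutions of (5) to solutions of (3) is a one-line computation, using A = dφ from (6), the identity stated before (5), and d̄δ = (dδ)δ:

$$
\bar{\mathrm{d}}\,\mathrm{d}\phi - \mathrm{d}\phi\,\mathrm{d}\phi
\;=\; \bar{\mathrm{d}}A - A\,A
\;=\; (\mathrm{d}A)\,g\,\delta\,g^{-1}
\;=\; \big(\mathrm{d}^2\phi\big)\,g\,\delta\,g^{-1}
\;=\; 0 \,,
$$

so φ defined by (6) indeed solves (3).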
in terms of coordinates $x, y$ of $\mathbb{R}^2$, a basis $\zeta_1, \zeta_2$ of $\bigwedge^1(\mathbb{C}^2)$, and a constant $n \times n$ matrix $J$, maps $\mathrm{d}$ and $\bar{\mathrm{d}}$ are defined as follows on $\mathcal{A}$,

$$\mathrm{d}f = \tfrac{1}{2}[J, f] \otimes \zeta_1 + f_y \otimes \zeta_2, \qquad \bar{\mathrm{d}}f = f_x \otimes \zeta_1 + \tfrac{1}{2}[J, f] \otimes \zeta_2$$

(see also [4]). they extend in an obvious way (with $\mathrm{d}\zeta_i = \bar{\mathrm{d}}\zeta_i = 0$) to $\Omega$ such that $(\Omega, \mathrm{d}, \bar{\mathrm{d}})$ becomes a bidifferential calculus. we find that (3) is equivalent to

$$\phi_{xy} = \tfrac{1}{2}\big[[J, \phi],\ \phi_y - \tfrac{1}{2}J\big]. \qquad (7)$$

let $n = 2m$ and $J = \mathrm{block\text{-}diag}(I, -I)$, where $I = I_m$ denotes the $m \times m$ identity matrix. decomposing $\phi$ into $m \times m$ blocks, and constraining it as follows,

$$\phi = \begin{pmatrix} p & q \\ q & -p \end{pmatrix}, \qquad (8)$$

(7) splits into the two equations

$$p_{xy} = (q^2)_y, \qquad q_{xy} = q - p_y q - q p_y. \qquad (9)$$

we refer to them as matrix-sit equations (see section 5), not purporting that they have a similar physical relevance as in the scalar case. the miura transformation (6) (with $\delta = 0$) now reads

$$g_x g^{-1} = \tfrac{1}{2}[J, \phi], \qquad \tfrac{1}{2}[J, g]\,g^{-1} = \phi_y. \qquad (10)$$

writing $g = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, with $m \times m$ matrices $a, b, c, d$, and assuming that $a$ and its schur complement $s(a) = d - c a^{-1} b$ are invertible (which implies that $g$ is invertible), (10) with (8) requires

$$b = -c a^{-1} d, \qquad a_x = -c_x a^{-1} c, \qquad d_x = -c_x a^{-1} c\, a^{-1} d. \qquad (11)$$

the last equation can be replaced by $d_x d^{-1} = a_x a^{-1}$. invertibility of $s(a)$ implies that $d$ and $I + r^2$ are invertible, where $r := c a^{-1}$. the conditions (11) are necessary in order that the miura transformation relates solutions of (9) to solutions of its 'dual'

$$(g_x g^{-1})_y = \tfrac{1}{4}\,[g J g^{-1}, J], \qquad (12)$$

obtained from (4). taking (11) into account, the miura transformation reads

$$q = -c_x a^{-1} = -r_x - r\, a_x a^{-1}, \qquad q_y = -r(I + r^2)^{-1}, \qquad p_y = I - (I + r^2)^{-1}. \qquad (13)$$

as a consequence, we have

$$q_y^2 + p_y^2 = p_y. \qquad (14)$$

furthermore, the second of (11) and the first of (13) imply $a_x a^{-1} = qr$. hence we obtain the system

$$r_x = -q - r q r, \qquad q_y = -r(I + r^2)^{-1}, \qquad (15)$$

which may be regarded as a matrix or 'noncommutative' generalization of the sine-gordon equation. there are various such generalizations in the literature. the first equation has the solution $q = -\sum_{k=0}^{\infty}(-1)^k r^k r_x r^k$, if the sum exists. alternatively, we can express this as $q = -(I + R_L R_R)^{-1}(r_x)$, where $R_L$ ($R_R$) denotes the map of left (right) multiplication by $r$. this can be used to eliminate $q$ from the second equation, resulting in

$$\big((I + R_L R_R)^{-1}(r_x)\big)_y = r\,(I + r^2)^{-1}. \qquad (16)$$

if $r = \tan(\theta/2)\,\Pi$ with a constant projection $\Pi$ (i.e. $\Pi^2 = \Pi$) and a function $\theta$, then (16) reduces to the sine-gordon equation

$$\theta_{xy} = \sin\theta. \qquad (17)$$

(15) can be obtained directly from (12) as follows, by setting

$$g = \begin{pmatrix} a & -c \\ c & a \end{pmatrix} = \begin{pmatrix} I & -r \\ r & I \end{pmatrix} a, \qquad \text{hence} \quad g^{-1} = a^{-1}\begin{pmatrix} I & r \\ -r & I \end{pmatrix}(I + r^2)^{-1}.$$

this leads to

$$\big((r_x r + r\rho r + \rho)(I + r^2)^{-1}\big)_y = 0, \qquad \big((r_x + r\rho - \rho r)(I + r^2)^{-1}\big)_y = r\,(I + r^2)^{-1},$$

where $\rho := a_x a^{-1}$. setting an integration 'constant' to zero, the first equation integrates to $\rho = -r_x r - r\rho r$. with its help, the second can be written as $(r_x + r\rho)_y = r(I + r^2)^{-1}$. since $q = -(ra)_x a^{-1} = -r_x - r\rho$, this is the second of (15). the first follows noting that $qr = \rho$.
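as a quick cross-check of this last reduction, the scalar case of (16) can be verified symbolically; the following snippet is an added illustration (not part of the original paper), with sympy doing the trigonometric bookkeeping.

```python
# Symbolic check (illustration only): in the scalar case the substitution
# r = tan(theta/2) turns (r_x/(1+r^2))_y = r/(1+r^2), i.e. equation (16),
# into the sine-Gordon equation theta_xy = sin(theta).
import sympy as sp

x, y = sp.symbols('x y')
theta = sp.Function('theta')(x, y)
r = sp.tan(theta / 2)

lhs = sp.diff(sp.diff(r, x) / (1 + r**2), y)   # left-hand side of scalar (16)
rhs = r / (1 + r**2)                           # right-hand side of scalar (16)

# the residual is proportional to theta_xy - sin(theta):
print(sp.simplify(lhs - rhs))
# -> Derivative(theta(x, y), x, y)/2 - sin(theta(x, y))/2  (up to rewriting)
```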
5 sharp line sit equations and sine-gordon

we consider the scalar case, i.e. $m = 1$. introducing $E = 2\sqrt{\alpha}\,q$ with a positive constant $\alpha$, $P = 2q_y$, $N = 2p_y - 1$, and new coordinates $z, t$ via $x = \sqrt{\alpha}(z - t)$ and $y = \sqrt{\alpha}\,z$, the system (9) is transformed into

$$P_t = E N, \qquad N_t = -E P,$$

and the relation between $E$ and $P$ takes the form $E_z + E_t = \alpha P$. these are the sharp line self-induced transparency (sit) equations [5, 6, 7]. we note that $P^2 + N^2$ is conserved. indeed, as a consequence of (14), we have $P^2 + N^2 = 1$. writing $P = -\sin\theta$ and $N = -\cos\theta$ reduces the first two equations to $E = \theta_t$. expressed in the coordinates $x, y$, the third then becomes the sine-gordon equation (17) (cf. [6]). as a consequence of the above relations, $q$ and $p$ depend as follows on $\theta$,

$$q = -\tfrac{1}{2}\theta_x, \qquad q_y = -\tfrac{1}{2}\sin\theta, \qquad p_y = \tfrac{1}{2}(1 - \cos\theta). \qquad (18)$$

these are precisely the equations that result from the miura transformation (10) (or from (13)), choosing

$$g = \begin{pmatrix} \cos\frac{\theta}{2} & -\sin\frac{\theta}{2} \\ \sin\frac{\theta}{2} & \cos\frac{\theta}{2} \end{pmatrix},$$

and (12) becomes the sine-gordon equation (17). the conditions (11) are identically satisfied as a consequence of the form of $g$.

6 a universal method of generating solutions from a matrix linear system

theorem 1 let $(\Omega, \mathrm{d}, \bar{\mathrm{d}})$ be a bidifferential calculus with $\Omega = \mathcal{A} \otimes \bigwedge(\mathbb{C}^2)$, where $\mathcal{A}$ is the algebra of matrices with entries in some algebra $\mathcal{B}$ (where the product of two matrices is defined to be zero if the sizes of the two matrices do not match). for fixed $N, N'$, let $X \in \mathrm{Mat}(N, N, \mathcal{B})$ and $Y \in \mathrm{Mat}(N', N, \mathcal{B})$ be solutions of the linear equations

$$\bar{\mathrm{d}}X = (\mathrm{d}X)P, \qquad \bar{\mathrm{d}}Y = (\mathrm{d}Y)P, \qquad RX - XP = -QY,$$

with $\mathrm{d}$-constant and $\bar{\mathrm{d}}$-constant matrices $P, R \in \mathrm{Mat}(N, N, \mathcal{B})$, and $Q = \tilde{V}\tilde{U}$, where $\tilde{U} \in \mathrm{Mat}(n, N', \mathcal{B})$ and $\tilde{V} \in \mathrm{Mat}(N, n, \mathcal{B})$ are $\mathrm{d}$- and $\bar{\mathrm{d}}$-constant. if $X$ is invertible, the $n \times n$ matrix variable

$$\phi = \tilde{U} Y X^{-1} \tilde{V} \in \mathrm{Mat}(n, n, \mathcal{B})$$

solves $\bar{\mathrm{d}}\phi = (\mathrm{d}\phi)\phi + \mathrm{d}\vartheta$ with $\vartheta = \tilde{U} Y X^{-1} R \tilde{V}$, hence (by application of $\mathrm{d}$) also (3). $\square$

there is a similar result for (5) [3]. the miura transformation is a corresponding bridge.

7 solutions of the matrix sit equations

from theorem 1 we can deduce the following result, using straightforward calculations [8], analogous to those in [2] (see also [3]).

proposition 2 let $S \in \mathrm{Mat}(N, N, \mathbb{C})$ be invertible, $U \in \mathrm{Mat}(m, N, \mathbb{C})$, $V \in \mathrm{Mat}(N, m, \mathbb{C})$, and $K \in \mathrm{Mat}(N, N, \mathbb{C})$ a solution of the sylvester equation

$$SK + KS = VU. \qquad (19)$$

then, with $\xi = e^{-Sx - S^{-1}y}$ and any $P_0 \in \mathrm{Mat}(m, m, \mathbb{C})$ (more generally $x$-dependent),

$$Q = U\xi\,(I_N + (K\xi)^2)^{-1}V, \qquad P = P_0 - U\xi K\xi\,(I_N + (K\xi)^2)^{-1}V \qquad (20)$$

(assuming the inverse exists) is a solution of (9). $\square$

if the matrix $S$ satisfies the spectrum condition

$$\sigma(S) \cap \sigma(-S) = \emptyset \qquad (21)$$

(where $\sigma(S)$ denotes the set of eigenvalues of $S$), then the sylvester equation (19) has a unique solution $K$ (for any choice of the matrices $U, V$), see e.g. [9].

by a lengthy calculation [8] one can verify directly that the solutions in proposition 2 satisfy (14). alternatively, one can show that these solutions actually determine solutions of the miura transformation (cf. [3]), and we have seen that (14) is a consequence.

there is a certain redundancy in the matrix data that determine the solutions (20) of (9). this can be narrowed down by observing that the following transformations leave (19) and (20) invariant (see also the nls case treated in [2]).

(1) similarity transformation with an invertible $M \in \mathrm{Mat}(N, N, \mathbb{C})$:
$$S \mapsto MSM^{-1}, \quad K \mapsto MKM^{-1}, \quad V \mapsto MV, \quad U \mapsto UM^{-1}.$$
as a consequence, we can choose $S$ in jordan normal form without restriction of generality.

(2) reparametrization transformation with invertible $A, B \in \mathrm{Mat}(N, N, \mathbb{C})$:
$$S \mapsto S, \quad K \mapsto B^{-1}KA^{-1}, \quad V \mapsto B^{-1}V, \quad U \mapsto UA^{-1}, \quad \xi \mapsto AB\xi.$$

(3) reflexion symmetry:
$$S \mapsto -S, \quad K \mapsto -K^{-1}, \quad V \mapsto K^{-1}V, \quad U \mapsto UK^{-1}, \quad P_0 \mapsto P_0 - UK^{-1}V.$$
this requires that $K$ is invertible. more generally, such a reflexion can be applied to any jordan block of $S$ and then changes the sign of its eigenvalue [8] (see also [10, 2]). the jordan normal form can be restored afterwards via a similarity transformation.
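proposition 2 is easy to exercise numerically; the following sketch is an added illustration (not code from the paper), using the size conventions reconstructed above with $m = 1$, $N = 2$ and the scalar 2-soliton data quoted in section 9.

```python
# Numerical sketch (illustration only) of proposition 2 for m = 1, N = 2:
# S = diag(1, 2), u_i = v_i = 1, the scalar 2-soliton data of section 9.
# scipy.linalg.solve_sylvester solves A X + X B = C, here S K + K S = V U.
import numpy as np
from scipy.linalg import solve_sylvester, expm

S = np.diag([1.0, 2.0])            # invertible, sigma(S) and sigma(-S) disjoint
U = np.array([[1.0, 1.0]])         # 1 x N
V = np.array([[1.0], [1.0]])       # N x 1
K = solve_sylvester(S, S, V @ U)   # Sylvester equation (19)
S_inv = np.linalg.inv(S)
I2 = np.eye(2)

def qp(x, y, P0=0.0):
    """Solution (20): Q = U xi (I + (K xi)^2)^{-1} V, and P analogously."""
    xi = expm(-S * x - S_inv * y)  # xi = exp(-S x - S^{-1} y)
    M = np.linalg.inv(I2 + (K @ xi) @ (K @ xi))
    return (U @ xi @ M @ V)[0, 0], P0 - (U @ xi @ K @ xi @ M @ V)[0, 0]

# spot-check p_xy = (q^2)_y, the first of (9), by central differences:
h, x0, y0 = 1e-4, 0.3, -0.2
q = lambda x, y: qp(x, y)[0]
p = lambda x, y: qp(x, y)[1]
p_xy = (p(x0+h, y0+h) - p(x0+h, y0-h) - p(x0-h, y0+h) + p(x0-h, y0-h)) / (4*h*h)
q2_y = (q(x0, y0+h)**2 - q(x0, y0-h)**2) / (2*h)
print(abs(p_xy - q2_y))            # tiny residual, consistent with (9)
```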
the following result is easily verified [8].

proposition 3 let $S, U, V$ be as in proposition 2 and $T \in \mathrm{Mat}(N, N, \mathbb{C})$ invertible.

(1) let $T$ be hermitian (i.e. $T^\dagger = T$) and such that $S^\dagger = TST^{-1}$, $U = V^\dagger T$. let $K$ be a solution of (19), which can then be chosen such that $K^\dagger = TKT^{-1}$. then $Q$ and $P$ given by (20) with $P_0^\dagger = P_0$ are both hermitian and thus solve the hermitian reduction of (9).

(2) let $\bar{T} = T^{-1}$ (where the bar means complex conjugation) and $\bar{S} = TST^{-1}$, $\bar{U} = UT^{-1}$ and $\bar{V} = TV$. let $K$ be a solution of (19), which can then be chosen such that $\bar{K} = TKT^{-1}$. then $Q$ and $P$ given by (20) with $\bar{P}_0 = P_0$ satisfy $\bar{Q} = Q$ and $\bar{P} = P$, and thus solve the complex conjugation reduction of (9). $\square$

8 rank one solutions

let $N = 1$. we write $S = s$, $U = u$, $V = v^{\mathsf{T}}$, $K = k$ (where $\mathsf{T}$ means the transpose) and $\xi = e^{-sx - s^{-1}y}$. then (19) yields $k = (v^{\mathsf{T}}u)/(2s)$. from (20) we obtain

$$Q = \frac{2sk\xi}{1 + (k\xi)^2}\,\Pi, \qquad P = \tilde{P}_0 + \frac{2s}{1 + (k\xi)^2}\,\Pi, \qquad \tilde{P}_0 := P_0 - 2s\,\Pi, \qquad \Pi := \frac{uv^{\mathsf{T}}}{v^{\mathsf{T}}u}.$$

the miura transformation (13) implies $R = -Q_y(I - P_y)^{-1}$, and we obtain

$$R = -\frac{2k\xi}{1 - (k\xi)^2}\,\Pi,$$

which is singular. but $\theta = -2\arctan\big(2k\xi/[1 - (k\xi)^2]\big)$ is the single kink solution of the sine-gordon equation (17).

9 solutions of the scalar (sharp line) sit equations

we rewrite $P$ in (20), where now $m = 1$, as follows,

$$P = P_0 - \mathrm{tr}\big((SK + KS)\,\xi K\xi\,(I_N + (K\xi)^2)^{-1}\big) = P_0 + \mathrm{tr}\big((I_N + (K\xi)^2)_x\,(I_N + (K\xi)^2)^{-1}\big) = P_0 + \big(\log\det(I_N + (K\xi)^2)\big)_x, \qquad (22)$$

using (19) and the identity $(\det M)_x = \mathrm{tr}(M_x M^{-1})\det M$ for an invertible matrix function $M$. $Q$ in (20) can be expressed as

$$Q = 2\,\mathrm{tr}\big(SK\xi\,(I_N + (K\xi)^2)^{-1}\big).$$

in particular, if $S$ is diagonal with eigenvalues $s_i$, $i = 1, \ldots, N$, and satisfies (21), then the solution $K$ of the sylvester equation (19), which now amounts to $\mathrm{rank}(SK + KS) = 1$, is the cauchy-type matrix with components $K_{ij} = v_i u_j/(s_i + s_j)$, where $u_i, v_i \in \mathbb{C}$. figs. 1 and 2 show plots of two examples from the above family of solutions.

fig. 1: a scalar 2-soliton solution with $S = \mathrm{diag}(1, 2)$ and $u_i = v_i = 1$

fig. 2: a scalar breather solution with $S = \mathrm{diag}(1 + i, 1 - i)$ and $u_i = v_i = 1$

10 a family of solutions of the real sine-gordon equation

via the miura transformation (18), proposition 2 determines a family of sine-gordon solutions (see also e.g. [6, 11, 12, 13, 14, 15, 16] for related results obtained by different methods).

proposition 4 let $S \in \mathrm{Mat}(N, N, \mathbb{C})$ be invertible and $K \in \mathrm{Mat}(N, N, \mathbb{C})$ such that $\mathrm{rank}(SK + KS) = 1$, $\det(I_N + (K\xi)^2) \in \mathbb{R}$ with $\xi = e^{-Sx - S^{-1}y}$, and $\mathrm{tr}\big(SK\xi\,(I_N + (K\xi)^2)^{-1}\big) \notin i\mathbb{R}$ (where $i$ is the imaginary unit). then

$$\theta = 4\arctan\left(\frac{\sqrt{\beta}}{1 + \sqrt{1 - \beta}}\right) \qquad \text{with} \quad \beta := \big(\log|\det(I_N + (K\xi)^2)|\big)_{xy} \qquad (23)$$

solves the sine-gordon equation $\theta_{xy} = \sin\theta$ in any open set of $\mathbb{R}^2$ where $\det(I_N + (K\xi)^2) \neq 0$.

proof: let $P$ be given by (22). due to the assumption $\det(I_N + (K\xi)^2) \in \mathbb{R}$, $P_y$ is real, hence (14) implies $|1 - 2P_y|^2 = 1 - 4Q_y^2$. it follows that $Q_y^2$ is real. since another of our assumptions excludes that $Q_y$ is imaginary, it follows that $|1 - 2P_y| \leq 1$. hence the equation $\cos\theta = 1 - 2P_y$ (second of (18)) has a real solution $\theta$. inserting expression (22) for $P$, we arrive at

$$\cos\theta = 1 - 2\big(\log\det(I_N + (K\xi)^2)\big)_{xy}.$$

moreover, (14) shows that $P_y \geq 0$ and thus $0 \leq P_y \leq 1$. using identities for the inverse trigonometric functions, we find (23), where $\beta = P_y$. $\square$

proposition 3 yields sufficient conditions on the matrix data for which the last two assumptions in proposition 4 are satisfied.
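formula (23) can likewise be probed numerically; the snippet below is an added illustration (not from the paper) that evaluates $\beta$ for the 2-soliton data of fig. 1 and confirms $\theta_{xy} \approx \sin\theta$ by finite differences.

```python
# Numerical sanity check (illustration only) of proposition 4 for the 2-soliton
# data of fig. 1: beta = (log|det(I + (K xi)^2)|)_{xy}, theta from (23),
# then theta_{xy} ~ sin(theta) via central finite differences.
import numpy as np

s = np.array([1.0, 2.0])                   # eigenvalues of S = diag(1, 2)
K = 1.0 / (s[:, None] + s[None, :])        # Cauchy matrix K_ij = 1/(s_i + s_j)

def logdet(x, y):
    xi = np.diag(np.exp(-s * x - y / s))   # xi = exp(-S x - S^{-1} y), S diagonal
    Kxi = K @ xi
    return np.log(abs(np.linalg.det(np.eye(2) + Kxi @ Kxi)))

def d_xy(f, x, y, h=1e-3):
    return (f(x+h, y+h) - f(x+h, y-h) - f(x-h, y+h) + f(x-h, y-h)) / (4*h*h)

def theta(x, y):
    b = d_xy(logdet, x, y)                 # beta; analytically beta = P_y in [0, 1]
    b = min(max(b, 0.0), 1.0)              # clip only guards finite-difference noise
    return 4.0 * np.arctan(np.sqrt(b) / (1.0 + np.sqrt(1.0 - b)))

x0, y0 = 0.4, -0.3
print(abs(d_xy(theta, x0, y0) - np.sin(theta(x0, y0))))   # small residual -> (17) holds
```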
references

[1] dimakis, a., müller-hoissen, f.: bidifferential graded algebras and integrable systems. discr. cont. dyn. systems suppl., 2009, 2009, p. 208–219.
[2] dimakis, a., müller-hoissen, f.: solutions of matrix nls systems and their discretizations: a unified treatment. inverse problems, 26, 2010, 095007.
[3] dimakis, a., müller-hoissen, f.: bidifferential calculus approach to akns hierarchies and their solutions. sigma, 6, 2010, 055.
[4] grisaru, m., penati, s.: an integrable noncommutative version of the sine-gordon system. nucl. phys. b, 655, 2003, p. 250–276.
[5] lamb, g.: analytical descriptions of ultrashort optical pulse propagation in a resonant medium. rev. mod. phys., 43, 1971, p. 99–124.
[6] caudrey, p., gibbon, j., eilbeck, j., bullough, r.: exact multi-soliton solutions of the self-induced transparency and sine-gordon equations. phys. rev. lett., 30, 1973, p. 237–238.
[7] bullough, r., caudrey, p., eilbeck, j., gibbon, j.: a general theory of self-induced transparency. opto-electronics, 6, 1974, p. 121–140.
[8] kanning, n.: integrable systeme in der allgemeinen relativitätstheorie: ein bidifferentialkalkül-zugang. diploma thesis. göttingen: university of göttingen, 2010.
[9] horn, r., johnson, c.: topics in matrix analysis. cambridge: cambridge univ. press, 1991.
[10] aktosun, t., busse, t., demontis, f., van der mee, c.: symmetries for exact solutions to the nonlinear schrödinger equation. j. phys. a: math. theor., 43, 2010, 025202.
[11] hirota, r.: exact solution of the sine-gordon equation for multiple collisions of solitons. j. phys. soc. japan, 33, 1972, p. 1459–1463.
[12] ablowitz, m., kaup, d., newell, a., segur, h.: method for solving the sine-gordon equation. phys. rev. lett., 30, 1973, p. 1262–1264.
[13] pöppe, c.: construction of solutions of the sine-gordon equation by means of fredholm determinants. physica d, 9, 1983, p. 103–139.
[14] zheng, w.: the sine-gordon equation and the trace method. j. phys. a: math. gen., 19, 1986, p. l485–l489.
[15] schiebold, c.: noncommutative akns system and multisoliton solutions to the matrix sine-gordon equation. discr. cont. dyn. systems suppl., 2009, 2009, p. 678–690.
[16] aktosun, t., demontis, f., van der mee, c.: exact solutions to the sine-gordon equation. arxiv:1003.2453, 2010.

aristophanes dimakis, e-mail: dimakis@aegean.gr, department of financial and management engineering, university of the aegean, 41 kountourioti str., gr-82100 chios, greece

nils kanning, e-mail: nils.kanning@ds.mpg.de, max-planck-institute for dynamics and self-organization, bunsenstrasse 10, d-37073 göttingen, germany

folkert müller-hoissen, e-mail: folkert.mueller-hoissen@ds.mpg.de, max-planck-institute for dynamics and self-organization, bunsenstrasse 10, d-37073 göttingen, germany

acta polytechnica vol. 51 no. 2/2011

x-ray transient sources (multifrequency laboratories): the case of the prototype a0535+26/hde 245770

f. giovannelli, l. sabau-graziati

abstract

the goal of this paper is to discuss the behaviour of the x-ray transient source a0535+26, which is considered, for historical reasons and for the huge amount of multifrequency data spread over a period of 35 years, as the prototype of this class of objects. transient sources are formed by a be star (the primary) and a neutron star x-ray pulsar (the secondary) and constitute a sub-class of x-ray binary systems. we will emphasize the discovery of low-energy indicators of high-energy processes. they are ubvri magnitudes and balmer lines of the optical companion. particular unusual activity of the primary star, usually at the periastron passage of the neutron star, indicates that an x-ray flare is drawing near. the shape and intensity of x-ray outbursts are dependent on the strength of the activity of the primary. we derive the optical orbital period of the system as 110.856 ± 0.02 days.
by using the optical flare of december 5, 1981 (hereafter 811205-e) that triggered the subsequent x-ray outburst of december 13, 1981, we derive the ephemeris of the system as jd_opt-outb = jd0(2,444,944) ± n(110.856 ± 0.02). thus the passage of the neutron star at the periastron occurs with a periodicity of 110.856 ± 0.02 days, and the different kinds of x-ray outbursts of a0535+26 (following the definitions reported in the review by giovannelli & sabau-graziati (1992)) occur just after ∼ 8 days. the delay between optical and x-ray outbursts is just the transit time of the material coming out from the optical companion to reach the neutron star x-ray pulsar. the occurrence of x-ray "normal outbursts", "anomalous outbursts" or "casual outbursts" is dependent on the activity of the be star: "quiet state: steady stellar wind", "excited state: stellar wind plus puffs of material", and "expulsion of a shell", respectively. in the latter case, the primary manifests a strong optical activity and the consequent strong x-ray outburst can occur in any orbital phase, with a preference at the periastron passage of the neutron star, because of its gravitational effects on the be star.

keywords: x-ray/be systems, x-ray pulsars, be stars, optical, spectroscopy, photometry, x-rays, individual: a0535+26 ≡ 1a 0535+26 ≡ 4u 0538+26 ≡ 1h 0536+263 ≡ 1rxs j053855.1+261843, individual: hde 245770 ≡ bd+26° 883 ≡ v 725 tau ≡ aavso 0532+26 ≡ sao 77348.

1 introduction

the trivial definition of x-ray binaries (xrbs) is that they are binary systems emitting x-rays. however, it has been largely demonstrated that x-ray binary systems emit energy in the ir, optical, uv, x-ray and gamma-ray energy ranges, and they sometimes also show valuable radio emission. they can be divided into different sub-classes:

• high mass x-ray binaries (hmxb), in which the optical companion is an early type giant or supergiant star and the collapsed object is a neutron star or a black hole. they are concentrated around the galactic plane. the mass transfer usually occurs via stellar wind; they show hard pulsed x-ray emission (from 0.069 to 1413 s) with kt ≥ 9 kev; typical x-ray luminosity ranges from 10^34 to 10^39 erg s^-1, and the ratio of x-ray to optical luminosity is ∼ 10^-3–10. hmxbs can be divided into two sub-classes:

  • hard transient x-ray sources (hxts), in which the neutron star is eccentrically (e ∼ 0.2–0.5) orbiting around a v–iii luminosity-class be star (porb > 10 days); they show strong variable pulsed hard x-ray emission (lxmax/lxmin > 100) with kt ≥ 17 kev;

  • permanent x-ray sources (pxs), in which the neutron star or black hole is circularly orbiting (e ∼ 0) around a giant or supergiant ob star (porb < 10 days); they show an almost steady permanent pulsed hard x-ray emission (lxmax/lxmin ≪ 100);

  • obscured sources, which display a huge amount of low energy absorption produced by the dense wind of the supergiant companion;

  • supergiant fast x-ray transients (sfxt), a new subclass of transients in which the formation of transient accretion discs could be partly responsible for the flaring activity in systems with narrow orbits.

• low mass x-ray binaries (lmxb), in which the optical companion is a low-mass late-type star and the collapsed object is a neutron star or a black hole (porb from 41 min to 11.2 days). they are concentrated around the galactic plane and especially in the galactic center. the mass transfer in these systems usually occurs via roche lobe overflow.
their emission in the soft x-ray range is usually not pulsed, with kt ≤ 9 kev. their x-ray luminosity ranges from 10^36 to 10^39 erg s^-1 and lx/lopt ∼ 10^2–10^4; many lmxbs show quasi-periodic oscillations (qpos) between 0.02 and 1 000 seconds, and a few of them also show pulsed x-ray emission, such as her x-1, 4u 1626-27 and gx 1+4;

• cataclysmic variables (cvs), in which the optical companion is a low-mass late-type star and the compact object is a white dwarf. the detected cvs are spread roughly around the solar system at a distance of 200–300 pc. the orbital periods range from tens of minutes to about ten hours. the mass transfer occurs either via roche lobe overflow or via accretion columns or in an intermediate way, depending on the value of the magnetic field. typical x-ray luminosity ranges from 10^32 to 10^34 erg s^-1. an updated review on cvs is that of giovannelli (2008);

• rs canum venaticorum (rs cvn) type systems, in which no compact objects are present and the two components are an f or g hotter star and a k star. typical x-ray luminosity ranges from 10^30 to 10^31 erg s^-1. in the current literature they are usually excluded from the class of x-ray binaries, since historically they were discovered as x-ray emitters only with the second generation of x-ray experiments.

for a general review of xrbs, see the paper by giovannelli & sabau-graziati (2004). xrbs are the best laboratory for the study of accreting processes thanks to their relatively high luminosity in a large part of the electromagnetic spectrum. for this reason, multifrequency observations are fundamental in understanding their morphology and the physics governing their behaviour. because of the strong interactions between the optical companion and the collapsed object, low and high energy processes are strictly related. it is often easier to perform observations of low energy processes (e.g. in the radio, near-infrared (nir) and optical bands), since the experiments are typically ground-based, unlike observations of high energy processes, for which experiments are typically space-based.

x-ray/be binaries are the most abundant group of massive x-ray binaries in the galaxy, with a total inferred number of between 10^3 and 10^4. those which occasionally flare up as transient x-ray/be systems are only the "tip" of this vast "iceberg" of systems (van den heuvel and rappaport, 1987). the mass loss processes are due to the rapid rotation of the be star, the stellar wind and, sporadically, to the expulsion of a casual quantity of matter essentially triggered by gravitational effects close to the periastron passage of the neutron star. the long orbital period (> 10 days) and the large eccentricity of the orbit (> 0.2), together with the transient hard x-ray behavior, are the main characteristics of these systems.

among the whole sample of galactic systems containing 114 x-ray pulsars (johnston, 2005), only few have been extensively studied. among these, the system a 0535+26/hde 245770 is the best known, thanks to concomitant favorable causes, which have enabled thirty-five years of coordinated multifrequency observations, most of them discussed by e.g. giovannelli & sabau-graziati (1992), burger et al. (1996), piccioni et al. (1999). accretion powered x-ray pulsars usually capture material from the optical companion via the stellar wind, since this primary star generally does not fill its roche lobe. however, in some specific conditions (e.g. the passage at the periastron of the neutron star) and in particular systems (e.g.
a 0535+26/hde 245770), the formation of a temporary accretion disk around the neutron star behind the shock front of the stellar wind is possible. this enhances the efficiency of the process of mass transfer from the primary star onto the secondary collapsed star, as discussed by giovannelli & ziolkowski (1990) and by giovannelli et al. (2007) in the case of a 0535+26. optical emission of hmxbs is dominated by the emission of the optical primary component, which is not, in general, strongly influenced by the presence of the x-ray source. the behavior of the primary stars can be understood in the classical (or almost) framework of the astrophysics of these objects, i.e. by studying their spectra, which will provide indications on mass, radius, and luminosity. the two groups of hmxbs differ because of the different origin of the mass loss process: in the first group, the mass loss process occurs via a strong stellar wind and/or because of an incipient roche lobe overflow; in the second group, the mass transfer is probably partially due to the rapid rotation of the primary star and partially due to a stellar wind and sporadically due to expulsions of a casual quantity of matter, essentially triggered by gravitational effects because of periastron passage, where the effect of the secondary collapsed star is more marked. a relationship between the orbital period of hmxbs and the spin period of x-ray pulsars is shown in fig. 1 (updated from giovannelli & sabau-graziati, 2001 and from corbet, 1984, 1986). we can recognize three kinds of systems, namely disk-fed, wind-fed [ppulse ∝ (p orb)4/7], and x-ray/be systems [ppulse ∝ (p orb)2]. 22 acta polytechnica vol. 51 no. 2/2011 fig. 1: spin period vs orbital period for x-ray pulsars. disk–fed systems are clearly separated by systems having as optical counterparts either ob stars or be stars (giovannelli & sabau-graziati, 2001) most of the systems having a be primary star are hard x-ray (kt > 10 kev) transient sources (hxts). they are concentrated on the galactic plane within a band of ∼ 3.9◦. the orbits are quite elliptic and the orbital periods are large (i.e. a 053866: e = 0.7, porb = 16.6 days (skinner et al., 1982); a 0535+26: e = 0.47 (finger, wilson & hagedon, 1994), p = 111.0 days (priedhorsky & terrell, 1983). the x-ray flux during the outburst phases is of the order 10–1 000 times greater than during quiescent phases. for this reason, on the contrary, the stars belonging to the first class which do not present such strong variations in x-ray emission can be named “standard” high mass x-ray binaries. in xray/be systems, the primary be star is relatively unevolved and is contained within its roche lobe. the strong outbursts occur almost periodically in time scales of the order of weeks or months. their duration is shorter than the quiescent phases. during xray outbursts, spin-up phenomena in several systems have been observed (i.e. a 0535+26 and 4u 1118-61 (rappaport & joss, 1981)). the observed spin-up rates during the outbursts are consistent with torsional accretion due to an accretion disk (e.g. ghosh, 1994). the formation of a temporary accretion disk around the collapsed object should therefore be possible during the outburst phases (e.g. giovannelli & ziolkowski, 1990). the number of x-ray pulsars increases slowly with time, thanks to new detections performed at various new generation observatories. there were 95 (giovannelli & sabau-graziati, 2000) in 2000 and the orbital periods were known only for about three dozen of them. 
they contain the group of permanent hmxbs and the group of transient hmxbs (x-ray/be systems), whose components are an x-ray pulsing neutron star — the secondary — and a giant or supergiant ob or a be star, respectively — the primary. moreover, some lowmass x-ray binaries (lmxbs) containing an xray pulsar, and some pulsars belonging to magellanic clouds were also contained in the sample of 95 systems. in 2005, there were 114 known xray pulsars (johnston, 2005). 47 x-ray pulsars have been detected in the smc (corbet et al., http://lheawww.gsfc.nasa.gov/users/corbet/pulsars/). recently, rajoelimanana, charles & udalski (2010) listed 49 optical counterparts of smc x-ray pulsars detected by macho and ogle. there are 20 systems with known pspin, while there are 23 systems with known porb, 6 of which are uncertain. in this paper we will discuss the history of the discovery of optical indicators of high energy emission in the prototype system a0535+26/hde 245770, updated to the march–april 2010 event, when strong optical activity occurred roughly 8 days before the x-ray outburst (caballero et al., 2010a) that was predicted by giovannelli et al. (2010). 2 the x-ray/be system a0535+26/hde 245770 a 0535+26/hde 245770 is a typical x-ray/be runaway system (van oijen 1989), whose components are a 104 s x-ray pulsar (rosenberg et al. 1975) and an o9.7 iiie star (giangrande et al. 1980). the x-ray 23 acta polytechnica vol. 51 no. 2/2011 fig. 2: x-ray flux versus time of a 0535+26. x-ray measurements are reported with red lines and asterisks, upper limits with green arrows, and predicted fluxes with light blue stars. periods of real detected x-ray outburst and optical measurements are also marked (giovannelli, 2005) pulsar is spinning up with a pulse period derivative ṗ/p = −4.9×10−4 yr−1 (coe et al. 1990), with alternating phases of spin-down during x-ray outbursts. the x-ray pulsar a0535+26 has a magnetic field of ∼ 1013 g (grove et al. 1995). the orbital period has been determined by many authors, using various methods: 111d.0±0d.4 (priedhorsky & terrell 1983), 111d.38±0d.11 (motch et al. 1991), and 110d.3±0d.3 (finger, wilson & hagedon 1994) from x-ray data, and 110d.0 ± 0d.5 (coe et al., 2006) from the increasing x-ray activity associated with the periastron passage of the neutron star. these values agree with the value reported by guarnieri et al. (1985) from longterm optical photoelectric data (109d.8 ± 2d.0). the orbital eccentricity of the system is 0.47 (finger, wilson & hagedon 1994). the terminal velocity of the stellar wind from the star with significant optical output is ∼ 630 km s−1 (giovannelli et al. 1982), and its mass loss rate is ∼ 10−8 m� yr−1 (giovannelli et al. 1984a; de loore et al. 1984). the x-ray pulsar is close to its equilibrium state (bisnovatyi-kogan 1991; li & van den heuvel 1996), which could be a reason why some of the expected xray outbursts have failed to occur. this was noted at the end of the 1980s, as discussed by giovannelli & zió�lkowski (1990, and references therein), or in the mid-nineties, as found by the rxte satellite (see http://space.mit.edu/xte). the o9.7 iiie companion does not normally fill its roche lobe (de loore et al. 1984), although a temporary accretion disc can be formed around the neutron star when it is close to periastron (giovannelli & zió�lkowski 1990; giovannelli et al., 2007). complete reviews of the a0535+26/hde 245770 system can be found in the papers by giovannelli et al. (1985), giovannelli & sabau-graziati (1992), and burger et al. 
(1996). 2.1 historical multifrequency observations of a 0535+26/hde 245770 the most studied hmxb system, for historical reasons and due to concomitant favourable causes, is the x-ray/be system a 0535+26/hde 245770. by means of long series of coordinated multifrequency measurements, very often simultaneously obtained, it was possible to: • identify the optical counterpart hde 245770 of the x-ray pulsar; • identify various x-ray outbursts triggered by different states of the optical companion and influenced by the orbital parameters of the system; • identify the presence of a temporary accretion disc around the neutron star at periastron. multifrequency observations of this system started soon after its discovery as an x-ray pulsar by the ariel-5 satellite (rosenberg et al., 1975). the error box of the x-ray source contains 11 stars up to 23rd magnitude; these include the 9 mag be star hde 245770. the a priori probability of finding a 9 mag star in such a field is 0.004, so that margon et al. (1977) suggested this star as a probable optical counterpart of a0535+26. but in order to really associate this star with the x-ray pulsar, it was necessary to find a clear signature proving that the 24 acta polytechnica vol. 51 no. 2/2011 two objects would belong to the same binary system. this happened thanks to a sudden insight of one of us (fg), who predicted the fourth x-ray outburst of a 0535+26 around mid december 1977. for this reason, giovannelli’s group was observing in optical hde 245770 around the predicted period for the xray outburst of a 0535+26. figure 2 shows the xray flux intensity of a 0535+26 as deduced by various measurements available at that time, with obvious meaning of the symbols used (giovannelli, 2005). fg’s intuition was sparked by looking at the rise of the x-ray flux (red line) and at the 24th may 1977 measurement (red asterisk): he assumed that the evident rise of the x-ray flux would have produced an outburst similar to the first one, which occurred in 1975. then with a simple extrapolation he predicted the fourth outburst, similar to the second: and this happened! optical photoelectric photometry of hde 245770 showed significant light enhancement of the star relative to the comparison star bd +26 876 between dec 17 and dec 21 (here after 771220-e) and successive fading up to jan. 6 (bartolini et al., 1978), whilst satellite sas-3 was detecting an x-ray flare (chartres & li, 1977). the observed enhancement of optical emission followed by the flare-up of the xray source gave a direct argument strongly supporting the identification of hde 245770 — later nicknamed flavia’ star by giovannelli & sabau-graziati (1992) — with a 0535+26. soon after, with spectra taken at the loiano 152 cm telescope with a boller & chivens 26767 grating spectrograph (831 grooves/mm ii-order grating: 39 å mm−1) onto kodak 103 ao plates, it was possible to classify hde 245770 as o9.7iiie star. this classification was so good that it survives even to the recent dispute attempts made with modern technology. the mass and radius of the star are 15 m� and 14 r�, respectively; the distance to the system is 1.8 ± 0.6 kpc (giangrande et al., 1980). uv spectra taken with the iue enabled the reddening of the system to be determined as e(bv)= 0.75 ± 0.05 mag, the rotational velocity of the o9.7iiie star (vrot sin i = 230±45 km s−1), the terminal velocity of the stellar wind (v∞ � 630 km s−1), the mass loss rate (ṁ ∼ 10−8 m� yr−1 in quiescence (giovannelli et al., 1982). 
during the october 1980 strong outburst, the mass loss rate was ṁ ∼ 7.7 × 10−7 m� yr−1 (de martino et al., 1989). 2.2 optical and ultraviolet indicators of x-ray outbursts in a0535+26/hde 245770 giovannelli et al. (1985) and guarnieri et al. (1985) found uv and optical features of hde 245770 as indicators of the x-ray activity of a 0535+26. they found experimentally that the x-ray flaring activity of a 0535+26 is preceded by modifications in the optical spectrum of the companion hde 245770, of about one week (≈ 0.6 × 106 s). this roughly corresponds to the transit time of puffs of material expelled by the be star at ∼ 300 km s−1 in order to reach the neutron star, being the dimensions of the orbit of ∼ 1.34 au (de loore et al., 1984). the hγ line seems to be the best indicator of the activity going on. moreover, enhancements of optical luminosity were observed on four occasions before x-ray outbursts (bartolini et al., 1983; maslennikov, 1986). narrow absorption components present in the blue wings of the si iv and c iv resonance lines indicate a variable mass loss superimposed on a steady wind of ∼ 10−8 m� yr−1 (giovannelli et al., 1984a; de loore et al., 1984). they are strong probes in studying the physics and dynamics of the mass transfer at periastron and the subsequent x-ray flaring. xray outbursts are triggered by the eccentric orbital motion (porb ∼= 111 days: ≈ 107 s). then, multifrequency monitoring of this system and of a number of other x-ray/be systems around the passage at the periastron could give a conclusive answer to many of the open problems on the be/x-ray transients. from two outburst decays with exactly the same shape, shown in fig. 3 (bartolini et al., 1983) it is possible to derive the optical period of the system (porb−opt = 110.856 ± 0.02 days. this value is in agreement with the many values available in the literature, but it has the advantage that it has been derived by two simple optical measurements. we will use such a period in the following in order to discuss the flaring activity of the system. fig. 3: decay from two outbursts of optical luminosity of hde 245770: the first on february 24, 1976, the second on december 20, 1977 (bartolini et al., 1983). 25 acta polytechnica vol. 51 no. 2/2011 fig. 4: c iv resonance line of flavia’ star with narrow absorption components marked with red arrows. the data comes from the iue short wavelength high resolution spectra. clockwise from the upper right panel, the spectra were taken by giovannelli on november 1, 1981 (x-ray quiescence), october 26, 1980 (x-ray outburst decay), january 23, 1984 (x-ray outburst), and november 10, 1983 (x-ray quiescence), respectively fig. 5: typical hα, hβ and hγ lines of flavia’ star before x-ray activity of the neutron star. the fluxes are normalized to the continuum. the left and right panels report spectra taken at haute provence and loiano observatories, respectively (giovannelli et al., 1984a) figure 4 shows the narrow absorption components in the c iv resonance line in the high resolution iue spectra of flavia’ star. the velocity of the “puffs” of material expelled by the star is ∼ 350±50 km s−1, roughly coincident with that measured in the optical spectra (∼ 300 ± 50 km s−1). then, the transit time from the primary to the secondary is about 7–8 days, being the dimension of the system of ∼ 1.34 au (giovannelli et al., 1984a; de loore et al., 1984; guarnieri et al., 1985b). figure 5 shows typical hα, hβ and hγ lines of flavia’ star before x-ray activity of the neutron star. 
the spectra were taken at the haute provence and loiano observatories (giovannelli et al., 1984a). figure 6 shows the optical outburst 811205-e (guarnieri et al., 1985b) — marked with a red line — which occurred 8 days before the x-ray outburst of december 13, 1981 (nagase et al., 1982) — marked with a blue line — and was just 13 cycles after 771220-e), when, through enhanced optical activity, bartolini et al. (1978) discovered the firm association between the x-ray pulsar and the o9.7 iiie star. around the strong x-ray outburst — october 9–24, 1980: jd 2444522 – 2444537 – (marked with the blue rectangle in fig. 6) scarce optical measurements are available. however, such a strong x-ray outburst occurred between the 9th (jd 2444500.6) and 10th (jd 2444611.4) cycle after 771220-e. this will be commented in the conclusions. we use 811205-e to give the optical ephemerides of the binary system that is in our opinion the most reliable. they are: jdopt−outb = jd0(2,444,944) ± n(110.856 ± 0.02) days. 26 acta polytechnica vol. 51 no. 2/2011 fig. 6: optical outburst ondecember 5, 1981 (811205-e) (guarnieri et al., 1985b) —markedwith a red line —occurred 8 days before the x-ray outburst of december 13, 1981 (nagase et al., 1982) — marked with a blue line — and was just 13 cycles after the 771220-e. around the big x-ray outburst – october 9–24, 1980: jd 2,444,522 – 2,444,537 – (markedwith the blue rectangle) scarce optical measurements are available figure 7 shows 1983–1986 x-ray outbursts of a0535+26. the long 1983 october 1–18 [(jd (5609–5626) + 2,440,000] x-ray outburst started just in the 6th orbital period after the 811205-e (or the 19th cycle after 771220-e). no clear optical measurements were available. however, looking at the v light curve (obtained with the observations at the 1.5 m loiano telescope), it is possible to note a small decay followed by a further increase, better detected by the byurakan telescope (see fig. 8) — this means that the o9.7 iiie star was experiencing a stronger mass loss process than the usual process — which could be responsible for the prolongation of the x-ray outburst. the 1984 february 1–4 [jd (5732–5735) + 2,440,000] x-ray outburst (measured with the astron satellite by giovannelli et al., 1984b; 1986a,b), probably started a few days earlier. indeed the two available measurements (february 1 and 4) clearly show a decay. the 20th cycle after the 771220 e is placed just at jd 5720. at this moment the v magnitude of the optical companion was just in a phase of decay. the following 1986 october 28–november 1 x-ray outburst, detected only twice on those dates, occurred again after the 29th cycle (jd 2,446,718) from the 771220e. also in this case, the v magnitude of the optical companion was in a decay phase after a relative maximum. these three latter events strongly support that the real orbital period of the system is 110.856 days, which scans the periastron passage of the neutron star, and the x-ray outbursts are triggered with a delay of ∼ 8 days — the transit time of the expelled material from the primary for reaching the secondary. 
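the ephemeris and the ∼ 8-day delay just quoted lend themselves to a compact numerical illustration; the sketch below is added here (it is not code from the paper), and uses only the constants stated in the text.

```python
# A minimal sketch (illustration only): the quoted ephemeris, with the ~8-day
# delay reproduced as the transit time of material expelled at ~300 km/s
# across the ~1.34 au orbit.
AU_KM = 1.495978707e8
JD0, P_ORB = 2444944.0, 110.856      # JD of 811205-E and the optical period [days]

transit_days = 1.34 * AU_KM / 300.0 / 86400.0
print(f"transit ~ {transit_days:.1f} days")   # ~7.7 days (~6.7e5 s), about a week

def cycle(n):
    """Optical (periastron) epoch of cycle n after 811205-E, and expected X-ray onset."""
    jd_opt = JD0 + n * P_ORB
    return jd_opt, jd_opt + transit_days

# cycle 6 reproduces the october 1983 event discussed in the text:
print(cycle(6))   # -> (2445609.1, 2445616.9); text: optical outburst JD 2,445,611,
                  #    x-ray maximum JD 2,445,619
```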
in the case of be star in “quiescence (steady stellar wind)”, the x-ray outburst has an intensity ≈ 0.1–0.5 crab (normal outbursts); in the case of “moderate activity (steady stellar wind plus puffs of material expelled)”, the intensity of the x-ray outburst is ≈ 0.5–1 crab (anomalous or noisy outbursts); in the case of the “expulsion of a shell” from the primary, preferably triggered near the periastron passage, a very strong x-ray outburst (> 1–∼ 4 crab) occurs and its duration is typically very long (≈ 10–20 days) (casual outbursts). fig. 7: 1983–1986 x-ray outbursts of a0535+26. the red lines indicate the 6th, 7th, and 16th optical cycles after the 811205-e. the blue rectangles and the blue lines indicate x-ray outbursts 27 acta polytechnica vol. 51 no. 2/2011 fig. 8: ubvri observational campaign of flavia’ star and x-ray observations with astron satellite between september 1983 and april 1984 (giovannelli et al., 1984b). blue stars represent x-ray measurements during two outbursts (giovannelli & sabau-graziati, 1992 and references therein). red lines indicate the 19th and 20th cycles after 771220-e figure 8 shows the ubvri observational campaign of flavia’ star at the stenberg, loiano, byurakan and tirgo telescopes and x-ray observations with the astron satellite between september 1983 and april 1984 (giovannelli et al., 1984b). the blue stars represent x-ray measurements during two outbursts (giovannelli & sabau-graziati, 1992 and references therein). the optical outburst occurred on october 3, 1983 (jd 2,445,611) — practically in the 19th cycle after the 771220-e (or the 6th cycle after 811205-e) — just 8 days before the x-ray maximum (jd 2,445,619). the successive x-ray outburst occurred just one cycle later [jd (5732–5735) + 2,440,000] with the probable maximum just at jd 5726; the 20th optical cycle after the 771220-e (or the 7th cycle after 811205-e) was at jd 5720 (also visible in fig. 7). figure 9 shows x-ray outbursts of a0535+26 in different epochs. red lines mark the time of the optical cycles after 811205-e. the x-ray activity starts roughly 8–12 days after the periastron passage of the neutron star. the strong x-ray outbursts, named casual, also occur preferably close to the periastron. finally, fig. 10 shows strong optical activity of flavia’ star detected at the loiano 1.5 m telescope through hα, hβ and hγ and he i lines (giovannelli et al., 2010). such activity predicted the incoming xray outburst of a0535+26 (caballero et al., 2010c), as shown in fig. 11. figure 11 shows the march–april 2010 x-ray outbursts of a0535+26. the x-ray activity starts roughly 8 days after the 93rd cycle from the 811205-e (caballero et al., 2010a). the strong optical activity of flavia’ star (giovannelli et al., 2010) was detected ∼ 10 days before the maximum of the x-ray outburst (caballero et al., 2010c). 3 discussion and conclusions we think we have clearly demonstrated that the optical orbital period of the a0535+26/flavia’ star system is porb−opt = 110, 856 ± 0.02 days. with this periodicity, the system scans the neutron star passage at the periastron with impressive precision. this passage is roughly 8 days before any kind of x-ray outbursts: normal, anomalous (or noisy) and casual outbursts, with some exceptions of the latter that can also occur outside this phase. an example of this could be october 9–24, 1980 (jd 2,444,522–2,444.537) a casual x-ray outburst apparently occurred between the 9th (jd 2,444,500.6) and 10th (jd 2,444,611.4) cycle after the 771220-e (see fig. 6). 
however, according to the available measures (giovannelli & sabau-graziati, 1992 and references therein), the x-ray outburst started 22 days after the periastron passage. however, the first x-ray measure gives an intensity of 1.1 crab, and the maximum of this strong outburst seems to be placed on october 10 28 acta polytechnica vol. 51 no. 2/2011 fig. 9: x-ray outbursts of a0535+26 in different epochs. red lines mark the time of the optical cycles after 811205-e. the x-ray activity starts roughly 8–12 days after the periastron passage of the neutron star. also in the case of strong x-ray outbursts, named casual, they occur preferably close to the periastron. upper left and right panels are adapted fromvioles et al. (1982), andgiovannelli &sabau-graziati (1992), respectively; middle left and right panels are adapted from finger, wilson & hagedon (1994), and caballero (2009), respectively; lower left and right panels are adapted from caballero et al. (2010b, and caballero et al. (2010c), respectively 29 acta polytechnica vol. 51 no. 2/2011 fig. 10: strong optical activity of flavia’ star detected at the loiano 1.5 m telescope through hα, hβ and hγ and he i lines. the fluxes are normalized to the continuum (giovannelli et al., 2010) fig. 11: march–april 2010 x-ray outbursts of a0535+26. the thin red line marks the time of the 93rd optical cycles after 811205-e. the bold red line marks the detected strong activity of flavia’ star (1.5 crab). the decay of the outburst extends until october 24 (0.4 crab) — the last measure available. then if we extrapolate the same shape of the decay to the rise of the outburst, we could have had the same intensity as on october 24 roughly on september 27 (jd 2,444,509). this date is just about 8 days after the 9th cycle after the 771220-e! this could support the idea that all the outbursts are triggered around the periastron passage, even those referred to as casual (strong). in conclusion, we have demonstrated that the optical ephemerides of the system can be given by jdopt−outb = jd0(2,444,944) ± n(110.856 ± 0.02) days. then the periastron passage of the neutron star is scanned every 110.856 days, and the x-ray outbursts are triggered starting from that moment and occur roughly after 8 days — the transit time of material expelled from the primary to reach the secondary. in the case of be star in “quiescence (steady stellar wind)”, the x-ray outburst has an intensity ≈ 0.1–0.5 crab (normal outbursts); in the case of “moderate activity (steady stellar wind plus puffs of material 30 acta polytechnica vol. 51 no. 2/2011 expelled)”, the intensity of the x-ray outburst is ≈ 0.5–1 crab (anomalous or noisy outbursts); in the case of the “expulsion of a shell” from the primary, preferably triggered near the periastron passage, a very strong x-ray outburst (> 1–∼ 4 crab) occurs and its duration is typically very long (≈ 10–20 days) (casual outbursts). therefore, continuous long-term monitoring of a0535+26/flavia’ star at least in optical and x-ray could definitively prove our conclusions. we believe that this behaviour of a0535+26/flavia’ star system is typical for all the x-ray/be systems for which we wish methodical multifrequency monitoring. in conclusion, with this paper we want to give a hint to the community to think again about the mechanisms responsible for outbursts in x-ray pulsars, and about new overall models of x-ray/be systems. acknowledgement we are pleased to thank the organizers of the karlovy vary 7th integral/bart workshop. 
one of us (fg) wishes to thank the loc for logistical support. this research has made use of nasa’s astrophysics data system. references [1] bartolini, c., guarnieri, a., piccioni, a., giangrande, a., giovannelli, f.: iau circ. no. 3 167, 1978. [2] bartolini, c., bianco, g., guarnieri, a., piccioni, a., giovannelli, f.: hvar obs. bull. 7(1), 1983, 159. [3] bisnovatyi-kogan, g. s. a & a, 245, 528, 1991. [4] burger, m., van dessel, e., giovannelli, f., sabau-graziati, l., bartolini, c., et al.: in multifrequency behaviour of high energy cosmic sources, f. giovannelli & l. sabau-graziati (eds.), mem. s.a.it. 67, 365, 1996. [5] caballero, i.: ph. d. thesis, university of tübingen, germany, 2009. [6] caballero, i., lebrun, f., rodriguez, j., soldi, s., mattana, f., et al.: atel, no. 2 496, 2010a. [7] caballero, i., kretschmar, p., pottschmidt, k., santangelo, a., wilms, j., et al.: aip conf. proc. 1 248, 147, 2010b. [8] caballero, i., santangelo, a., pottschmidt, k., klochkov, d., rodriguez, j., et al.: atel, no. 2 541, 2010c. [9] chartres, m., li, f.: iau circ. no. 3 154, 1977. [10] coe, m. j., carstairs, i. r., court, a. j., davies, s. r., dean, a. j., et al. mnras 243, 475, 1990. [11] coe, m. j., reig, p., mcbride, v. a., galache, j. l., fabregat, j.: mnras, 368, 447, 2006. [12] corbet, r. h. d.: a&a, 141, 91, 1984. [13] corbet, r. h. d.: mnras, 220, 1 047, 1986. [14] finger, m. h., wilson, r. b., hagedon, k. s.: iau circ. no. 5 931, 1994. [15] ghosh, p.: in the evolution of x-ray binaries, s. s. holt & c. s. day (eds.), aip conf. proc. 308, 439, 1994. [16] giangrande, a., giovannelli, f., bartolini, c., guarnieri, a., piccioni, a.: a & a suppl. ser. 40, 289, 1980. [17] giovannelli, f.: the impact of multifrequency observations in high energy astrophysics, ph.d. thesis, university of barcelona, spain, 2005. [18] giovannelli, f.: chja & a suppl. 8, 237, 2008. [19] giovannelli, f., de loore, c., bartolini, c., burger, m., ferrari-toniolo, m., et al.: in proc. of the third european iue conference, esa sp-176, 233, 1982. [20] giovannelli, f., ferrari-toniolo, m., persi, p., bartolini, c., guarnieri, et al.: in proc. fourth european iue conference, e. rolfe (ed.), esa sp-218, 439, 1984a. [21] giovannelli, f., ferrari-toniolo, m., persi, p., golynskaya, i. m., kurt, v. g., et al.: in xray astronomy ’84, m. oda and r. giacconi (eds.), institute of space and astronautical science, tokyo, 1984b, p. 205. [22] giovannelli, f., ferrari-toniolo, m., persi, p., golynskaya, i. m., kurt, v. g., et al.: in multifrequency behaviour of galactic accreting sources, f. giovannelli (ed.), edizioni scientifiche siderea, roma, 1985, p. 284. [23] giovannelli, f., kurt, v. g., sheffer, e. k.: iau circ. no. 4 284, 1986a. [24] giovannelli, f., van dessel, e. l., burger, m., de loore, c., bartolini, c., et al.: in new insights in astrophysics. eight years of uv astronomy with iue, esa-sp 263, 459, 1986b. [25] giovannelli, f., zió�lkowski, j.: aca, 40, 95, 1990. 31 acta polytechnica vol. 51 no. 2/2011 [26] giovannelli, f., sabau-graziati, l.: ssr, 59, 1, 1992. [27] giovannelli, f., sabau-graziati, l.: in the chemical evolution of the milky way: stars versus clusters, f. matteucci & f. giovannelli (eds.), kluwer publ. co., assl 255, 151, 2000a. [28] giovannelli, f., sabau-graziati, l.: ap & ss, 276, 67, 2001. [29] giovannelli, f., sabau-graziati, l.: ssr, 112, 1, 2004. [30] giovannelli, f., bernabei, s., rossi, c., sabaugraziati, l.: a & a, 475, 651, 2007. [31] giovannelli, f., gualandi, r., sabau-graziati, l.: atel, no. 
2497, 2010.
[32] grove, j. e., strickman, m. s., johnson, w. n., kurfess, j. d., kinzer, r. l., et al.: apjl, 438, l25, 1995.
[33] guarnieri, a., bartolini, c., piccioni, a., giovannelli, f.: in multifrequency behaviour of galactic accreting sources, f. giovannelli (ed.), edizioni scientifiche siderea, roma, p. 318, 1985a.
[34] guarnieri, a., bartolini, c., piccioni, a., giovannelli, f.: in multifrequency behaviour of galactic accreting sources, f. giovannelli (ed.), edizioni scientifiche siderea, roma, p. 310, 1985b.
[35] johnston, r.: 2005. http://www.johnstonsarchive.net/relativity/binpulstable.html
[36] li, x.-d., van den heuvel, e. p. j.: a & al, 314, l13, 1996.
[37] de loore, c., giovannelli, f., van dessel, e. l., bartolini, c., burger, m., et al.: a & a, 141, 279, 1984.
[38] margon, b., nelson, j., chanan, g., bowyer, s., thorstensen, j. r.: apj, 216, 811, 1977.
[39] de martino, d., waters, l. b. m. f., giovannelli, f., persi, p.: in two-topics in x-ray astronomy, e. rolfe (ed.), esa, sp-296, 519, 1990.
[40] maslennikov, k. l.: sov. astron. lett., 12, 191, 1986.
[41] motch, c., stella, l., janot-pacheco, e., mouchet, m.: apj, 369, 490, 1991.
[42] nagase, f., hayakawa, s., kunieda, h., makino, f., masai, k., et al.: apj, 263, 814, 1982.
[43] van oijen, j. g. j.: a & a, 217, 115, 1989.
[44] piccioni, a., bartolini, c., bernabei, s., galleti, s., giovannelli, f., guarnieri, a., sabau-graziati, l., valentini, g., villada, m.: in the be phenomenon in early-type stars, iau colloquium 175, myron a. smith & huib f. henrichs (eds.), asp conf. proc. vol. 214, 569, 2000.
[45] priedhorsky, w. c., terrell, j.: nature, 303, 681, 1983.
[46] rajoelimanana, a. f., charles, p. a., udalski, a.: 2010, arxiv:1012.4610.
[47] rappaport, s., joss, p. c.: in x-ray astronomy with the einstein satellite, r. giacconi (ed.), d. reidel publ. co., dordrecht, holland, 1981, p. 123.
[48] rosenberg, f. d., eyles, c. j., skinner, c. g., willmore, a. p.: nature, 256, 631, 1975.
[49] skinner, g. k.: nature, 297, 568, 1982.
[50] violes, f., niel, m., bui-van, a., vedrenne, g., chambon, g., et al.: apj, 263, 320, 1982.

franco giovannelli, e-mail: franco.giovannelli@iasf-roma.inaf.it, inaf – istituto di astrofisica spaziale e fisica cosmica – roma, area di ricerca di roma-2, via del fosso del cavaliere 100, i 00133 roma, italy

lola sabau-graziati, inta – dpt de programas espaciales y ciencias del espacio, ctra de ajalvir km 4 – e 28850 torrejón de ardóz, spain

acta polytechnica vol. 50 no. 2/2010

injection and combustion of rme with water emulsions in a diesel engine

j. cisek

abstract

this paper presents ways of using the fully-digitised triggerable avl videoscope 513d video system for analysing the injection and combustion inside a diesel engine cylinder fuelled by rme with water emulsions. the research objects were: standard diesel fuel, rapeseed methyl ester (rme) and rme – water emulsions. with the aid of a helical flow reactor, stable emulsions with a water fraction up to 30 % by weight were obtained, using an additive to prevent the water from separating out of the emulsion. an investigation was made of the effect of the emulsions on exhaust gas emissions (nox, co and hc), particulate matter emissions, smoke and the fuel consumption of a one-cylinder hd diesel engine with direct injection. additionally, the maximum cylinder pressure rise was calculated from the indicator diagram. the test engine was operated at a constant speed of 1600 rpm and 4 bar bmep load conditions.
the fuel injection and combustion processes were observed and analysed using endoscopes and a digital camera. the temperature distribution in the combustion chamber was analysed quantitatively using the two-colour method. the injection and combustion phenomena were described and compared. a way to reduce nox formation in the combustion chamber of diesel engines by adding water in the combustion zone was presented. evaporating water efficiently lowers the peak flame temperature and the temperature in the post-flame zone. for diesel engines, there is an exponential relationship between nox emissions and peak combustion temperatures. the energy needed to vaporize the water results in lower peak temperatures of the combusted gases, with a consequent reduction in nitrogen oxide formation. the experimental results show up to 50 % nox emission reduction with the use of 30 % water in an rme emulsion, with unchanged engine performance.

keywords: diesel engine, rme – water emulsions, injection, combustion, visualization.

1 introduction

among power-driving combustion engines, diesel engines are the most efficient, and their fuel consumption is 30 % lower than in petrol engines. however, one of the major problems with diesel engines concerns their exhaust gas emissions, which contain excessive amounts of toxic substances, such as nitrogen oxides nox and particulate matter pm. in the light of environmental legislation, attempts to reduce these toxic emissions remain a current issue. given the current state of the engineering knowledge, simultaneous reduction of emission levels of the two toxic substances through design modifications to diesel engines is not possible. recently new gas treatment methods have been developed, though the gas can be treated outside the engine only, after leaving the cylinder. a major disadvantage of these methods is their high costs, which preclude their widespread use. extensive research is being undertaken to investigate and modify the conditions for injection and combustion of diesel fuels to eliminate, or at least to vastly reduce, the zones of nox formation (high temperature and oxygen-rich zones) and the zones where conditions are ripe for soot formation. theoretical and experimental data available so far has revealed that nox emissions can be most effectively reduced when a carefully prepared fuel-water emulsion is injected into the cylinder. the pm emissions are reduced and, in certain conditions, fuel consumption can also be lowered [8]. however, one issue remains unsolved: how to prepare such an emulsion?

2 preparation of the rme-water emulsion

the experimental program uses a helical flow reactor incorporated in the supply system of the engine. the helical flow of fluids in a ring-shaped slit between coaxial cylinders, the inner one rotating, is interpreted as a combination of poiseuille's axial flow and couette's rotating flow. because of the stability loss, the flowing fluids begin to develop taylor vortices, which cause intensive mixing. the dispersion factors in the plane perpendicular to the axis tend to be high, whilst the axial dispersion is insignificant (as the flow is induced by piston motion). so far, research data on the dispersion of the two mutually immiscible liquids has revealed that the flow reactor can be used to produce an emulsion in a wide range of fluid flow rates and their actual proportions.
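the onset of taylor vortices in such an annular gap is governed by the taylor number; the sketch below illustrates the standard criterion with purely hypothetical reactor dimensions and fluid properties, since the paper does not quote them.

```python
# A hedged illustration (not from the paper; the reactor's dimensions are not given,
# so every value below is hypothetical): Taylor vortices appear in the annular gap
# once the Taylor number exceeds its critical value, ~1708 for a narrow gap with
# the outer cylinder at rest.
import math

r_i, r_o = 0.020, 0.023        # hypothetical inner/outer radii [m]
omega = 2.0 * math.pi * 50.0   # hypothetical inner-cylinder speed: 50 rev/s [rad/s]
nu = 30e-6                     # hypothetical kinematic viscosity of the mixture [m^2/s]

d = r_o - r_i
Ta = omega**2 * r_i * d**3 / nu**2     # one common definition of the Taylor number
print(f"Ta = {Ta:.0f}, critical ~ 1708 -> {'vortices' if Ta > 1708 else 'stable'}")
```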
fig. 1: rme – water emulsion (30 % of water – m/m)

the helical flow reactor was used to produce stable emulsions of rapeseed methyl ester (rme) and distilled water, the proportion of water being 10 % or 30 % by weight. a surface agent, rokafenol, was first added to the distilled water, in a volumetric proportion of 2 %, to ensure stability of the emulsion. the emulsion was prepared directly before fuelling the engine.

3 experimental setup and methodology

the experimental setup incorporates a one-cylinder hd diesel engine with a direct injection system sb3.1 (cylinder diameter d = 127 mm, stroke s = 146 mm) equipped with a 4-nozzle injector, with outlet nozzle diameter 0.34 mm. measurements were taken at constant speed (1600 rpm and 4 bar bmep load conditions), regardless of the fuel type. the engine was fuelled with: diesel fuel (sulphur content < 50 ppm), rapeseed methyl ester rme, and rme-water emulsions (in proportions: 10 % water – 90 % rme, and 30 % water – 70 % rme, by weight). the setup for the braking tests (see fig. 2) incorporates the measurement and control apparatus for measurements of the fast-changing pressures within the cylinder and in the injection installation (avl indimeter), a set of analysers to measure the concentrations of co, thc, nox (avl ceb ii), an installation to handle the pm emissions and the avl videoscope 513d video system for visualising the fuel injection and combustion processes. the system enables archiving of injection and combustion images, registered with a frequency approaching 0.1° of the crankshaft rotation angle [1]. the measurement was repeated 10 times for each shaft rotation angle, for the purposes of statistical treatment of the images (to define the probability of the occurrence of an injection and/or a flame). in addition, the two-colour method was applied [1] to determine the isotherm distribution in the diffusion flame as a function of the shaft rotation angle. subsequent stages of fuel injection, ignition and combustion were monitored for each fuel type. the effects of the fuel type on the energy-based parameters, toxicity, presence of smoke in the exhaust gas and the maximum rate of pressure rise in the cylinder are addressed below. of major interest are other parameters of the indicator diagrams, the heat release rates and the injection characteristics.

fig. 2: experimental setup

4 energy-based parameters of the diesel engine

during all the tests, the engine was operated at a constant rate of 1600 rpm, torque 60 nm (equivalent to 4 bar bmep load conditions and effective power n_e = 10 kw). of major interest was the temperature of the exhaust gas at the outlet channel. the indicator diagrams yield the retardation of ignition, the maximum rate of combustion pressure rise, and the maximum pressure in the cylinder. fuel consumption was measured by weight.

diesel fuel – standard diesel fuel, rme – rapeseed methyl ester, rme+10 % – emulsion 90 % rme+10 % water, rme+30 % – emulsion 70 % rme+30 % water
fig. 3: fuel consumption gh and specific fuel consumption ge at 1600 rpm and bmep=4 bar with different fuels

diesel fuel – standard diesel fuel, rme – rapeseed methyl ester, rme+10 % – emulsion 90 % rme+10 % water, rme+30 % – emulsion 70 % rme+30 % water
fig. 4: total efficiency at 1600 rpm and bmep=4 bar with different fuels

the plot in fig. 3 reveals a major increase in the hourly and specific consumption of rapeseed methyl esters (rme) and its water emulsions.
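a quick cross-check (added here, not part of the paper) shows that the quoted operating point is self-consistent, and illustrates how the total efficiency discussed below follows from the fuel consumption once the emulsion's heating value is taken as the mass-weighted value of its components; the 4.9 kg/h consumption is a hypothetical figure chosen only to make the arithmetic concrete.

```python
# A consistency sketch (illustration only): brake power from torque and from bmep,
# and the total efficiency eta_o = P_e / (m_dot * LHV) with a mass-weighted LHV
# for the emulsion. The 4.9 kg/h figure is hypothetical (fig. 3 values not quoted).
import math

N_RPM, TORQUE_NM = 1600.0, 60.0
BORE_M, STROKE_M, BMEP_PA = 0.127, 0.146, 4.0e5
LHV_RME = 37.5e6                                   # J/kg, quoted in the text

p_torque = 2.0 * math.pi * (N_RPM / 60.0) * TORQUE_NM        # P = 2*pi*n*T
v_disp = math.pi / 4.0 * BORE_M**2 * STROKE_M                # swept volume [m^3]
p_bmep = BMEP_PA * v_disp * N_RPM / (2.0 * 60.0)             # four-stroke: n/2 cycles/s
print(f"{p_torque/1e3:.1f} kW from torque, {p_bmep/1e3:.1f} kW from bmep")  # ~10 kW both

def total_efficiency(p_e_w, fuel_kg_per_h, water_frac):
    lhv = (1.0 - water_frac) * LHV_RME             # water adds no heating value
    return p_e_w / ((fuel_kg_per_h / 3600.0) * lhv)

# hypothetical 4.9 kg/h of the 30 %-water emulsion at 10 kW:
print(f"{total_efficiency(10e3, 4.9, 0.30):.1%}")  # ~28 %, the order reported for rme+30 %
```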
diesel fuel – standard diesel fuel, rme – rapeseed methyl ester, rme+10 % – emulsion 90 % rme + 10 % water, rme+30 % – emulsion 70 % rme + 30 % water

fig. 3: fuel consumption gh and specific fuel consumption ge at 1600 rpm and bmep = 4 bar with different fuels

fig. 4: total efficiency at 1600 rpm and bmep = 4 bar with different fuels (legend as in fig. 3)

the plot in fig. 3 reveals a major increase in the hourly and specific consumption of rapeseed methyl ester (rme) and its water emulsions. this increase is attributable to the fact that the calorific value of rme (37.5 mj/kg) is roughly 11 % lower than that of standard diesel fuel (42 mj/kg). in addition, the calorific value of the water emulsions of rme is lower by 10 % and 30 % than that of pure rme, since water has no calorific value at all. far smaller differences between the particular fuel types are revealed in the total efficiency plots ηo, as the differences in their calorific values are accounted for. the total efficiency for rme and diesel fuel is almost identical, approaching 29 %, whereas the use of the emulsion with 30 % water content reduces the total efficiency slightly, to 28 %. the next plot shows the temperature of the exhaust gas measured at the outlet channel. it is apparent that the use of a water emulsion causes a significant reduction in the exhaust gas temperature, which is a most favourable feature particularly at higher loads, when the thermal loading of the engine and turbocompressors becomes critical.

fig. 5: exhaust gas temperature at 1600 rpm and bmep = 4 bar with different fuels (legend as in fig. 3)

fig. 6: maximum rise of cylinder pressure at 1600 rpm and bmep = 4 bar with different fuels (legend as in fig. 3)

depending on the fuel type, the retardation of ignition also varies (10° of the shaft rotation angle for standard diesel fuel and 9° for rme). when the engine is fuelled with the 10 %- and 30 %-water emulsions of rme, the retardation of ignition reaches 11° and 14° of the shaft rotation angle, respectively. these findings are corroborated by images of the combustion chamber taken at the instant when the self-ignition process begins (see fig. 14 and fig. 15). the addition of water in the form of an emulsion increases the ignition delay and causes the combustion process to begin more rapidly, as evidenced by the differences in the maximum rate of pressure rise inside the cylinder, well apparent in fig. 6. the difference between rme and its 10 % water emulsion seems the most significant, and is accompanied by a slight increase in the maximum pressure of combustion pcmax, rising from 69 bar (standard diesel fuel and rme) to 71 bar (for the 30 % water emulsion of rme).

5 toxicity of the exhaust gas

measurements of the gaseous substances constituting the exhaust gas were taken with the avl ceb ii set of analysers, using a hot gas path. the oxygen content was measured with a pmd paramagnetic analyser, whilst an ndir non-dispersive infrared analyser was used to handle the carbon monoxide and carbon dioxide. the nox emissions were measured by a cld chemiluminescent analyser equipped with an no2/no converter, and the hydrocarbons hc with a heated flame ionisation analyser (hfid) on a hot gas path. the plots below show the proportions of toxic substances in the exhaust gas: nitrogen oxides nox, carbon monoxide co and incompletely combusted hydrocarbons hc (re-calculated in terms of propane concentration c3h8). the pm emissions were obtained by the gravimetric method, using a dilution tunnel providing partial gas flow (pfds) and partial sampling. the bosch smoke number was measured using an avl smokemeter.
diesel fuel – standard diesel fuel, rme – rapeseed methyl ester, rme+10 % – emulsion 90 % rme + 10 % water, rme+30 % – emulsion 70 % rme + 30 % water

fig. 7: nitrogen oxides (nox) concentration at 1600 rpm and bmep = 4 bar with different fuels

this plot shows the nox emissions in the exhaust gas. fuelling the engine with rme instead of standard diesel fuel causes a slight increase in nox emissions, whilst the use of the 10 % water emulsion of rme reduces the nox content by 30 %. when the 30 % emulsion is used, the nox concentrations are reduced by half, in relation to both pure rme and standard diesel fuel. it is worth mentioning that this is so even though the beginning of the combustion process is more rapid (fig. 6).

fig. 8: carbon monoxide concentration at 1600 rpm and bmep = 4 bar with different fuels (legend as in fig. 7)

fig. 9: total hydrocarbon concentration at 1600 rpm and bmep = 4 bar with different fuels (legend as in fig. 7)

nox emissions from engines fuelled by emulsions can be significantly reduced due to the reduction in the local peak combustion temperature (cf. fig. 18), mainly caused by evaporation of the water contained in the emulsions. this positive effect is attributable to the fact that water droplets injected in the form of an emulsion get directly into the combustion zone. in addition, dilution of the gases by water vapour inside the cylinder reduces the local oxygen concentration [8], which causes the thermal formation of nox to proceed at a lower rate. in accordance with the extended zeldovich model, the rate of thermal nox formation is associated with the atomic oxygen concentration (coming mainly from dissociation of atmospheric oxygen molecules), and increases exponentially with temperature and with the proportion of oxygen.
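as a rough numerical illustration of this exponential temperature dependence, the sketch below evaluates a commonly quoted arrhenius fit for the rate-limiting zeldovich step o + n2 → no + n. the rate constant is a textbook-style value assumed here for illustration; it is not a calibrated model of this engine.

```python
# temperature sensitivity of thermal no formation (extended zeldovich
# mechanism). the arrhenius fit below for o + n2 -> no + n is a commonly
# quoted textbook value (an assumption here), used only to show why a
# lower peak temperature strongly suppresses nox formation.

import math

def k1(T: float) -> float:
    """rate constant of o + n2 -> no + n [cm^3 mol^-1 s^-1] (assumed fit)."""
    return 1.8e14 * math.exp(-38370.0 / T)

for T in (1800.0, 2000.0, 2200.0, 2400.0):
    print(f"T = {T:.0f} K: k1 = {k1(T):.3e}, "
          f"relative to 1800 K: {k1(T) / k1(1800.0):.0f}x")
```

even this crude sketch shows a roughly two-orders-of-magnitude increase of the rate constant between 1800 k and 2400 k, which is the qualitative reason why evaporating water, by shaving the local temperature peaks, cuts the nox emissions so effectively.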
these two plots show the concentrations of carbon monoxide and uncombusted hydrocarbons hc in the exhaust gases. the use of pure rme and its water emulsions causes the co and hc emissions to increase considerably, but this is of little importance, as the emission levels are still relatively low and these toxic substances can be largely neutralised by incorporating catalysers in the exhaust system.

fig. 10: emission and specific emission of particulate matter at 1600 rpm and bmep = 4 bar with different fuels (legend as in fig. 7)

fig. 11: bosch smoke number at 1600 rpm and bmep = 4 bar with different fuels (legend as in fig. 7)

the plots summarise the bosch smoke number and the pm hourly and unit emission data. for an engine fuelled by emulsions, the pm emission level and the smoke number increase with increased water content, which may be attributable to a reduced rate of soot afterburning during the combustion process, caused by a significant reduction of the combustion temperature. measurements of the exhaust temperature seem to corroborate this view. a comparison of the exhaust temperature and smoke number plots reveals that the lower the temperature of the exhaust gas, the higher the smoke number, which may be attributed to excessive cooling of the combustion zone under relatively small loads.

6 visualisation of fuel injection and combustion

the images were registered with the use of the avl videoscope system for visualisation of fuel injection and combustion inside the cylinder. accordingly, the images were registered from the instant when fuel injection begins right through to the apparent end of the combustion process. the sampling step was equal to 1° of the shaft rotation angle. to enable image recording, the combustion chamber was lit with a stroboscope lamp. the measurement procedure was repeated 10 times for each shaft rotation angle, for the purposes of statistical treatment of the images (to define the probability of the occurrence of an injection and/or a flame). in addition, the two-colour method was applied [1] to determine the isotherm distribution in the diffusion flame as a function of the shaft rotation angle. the registered images of the fuel injection and combustion process, together with the measurement data (the rapidly-changing pressure inside the cylinder and the rise of the metering pin for each fuel type) were utilised to monitor the subsequent stages of fuel injection, ignition and combustion and to compute the fuel injection rate. figs. 12, 14 and 16 show the fuel injection and combustion processes registered at the shaft rotation angles corresponding to selected parameters of the working cycle of the engine: the maximum injection rate, the commencement of the ignition process (ignition timing) and the maximum pressure of combustion. figs. 13, 15 and 17 show the statistically-treated results of the 10 repeated measurements taken for each shaft rotation angle, corresponding to the maximum injection rate, the ignition timing and the maximum pressure of combustion. fig. 18 shows measurement data for the shaft rotation angle corresponding to the maximum pressure of combustion, together with results of image analysis supported by the dedicated thermovision software.

fig. 12: image taken inside the combustion chamber (one fuel jet) for the shaft rotation angle corresponding to the maximal injection rate αdqmax; on – standard diesel fuel, rme – rapeseed methyl ester, rme+10 % – emulsion 90 % rme + 10 % water, rme+30 % – emulsion 70 % rme + 30 % water

fig. 13: probability of the presence of injected fuel in the given region of the combustion zone for the shaft rotation angle corresponding to the maximum injection rate αdqmax (legend as in fig. 12)

the image in fig. 13 was obtained after statistical treatment of the 10 measurements collected for the shaft rotation angle corresponding to the maximum injection rate. in the regions marked in red, a fuel jet was registered during each repeated procedure; this should be interpreted as a 100 % likelihood of the occurrence of a fuel jet at that point. in the regions marked in black, no repeated imaging revealed a fuel jet; this can be interpreted as a 0 % likelihood. intermediate colours indicate that a fuel jet was revealed only in some of the repeated measurements. it is readily apparent that the shape and extent of the fuel jet change with an increased proportion of water in the emulsion, but that the repeatability of the injection process is impaired. a sketch of this statistical treatment follows.
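the statistical treatment itself reduces to a pixel-wise average over the repeated binary images. a minimal sketch, with dummy random data standing in for the real endoscope images, could look as follows.

```python
# pixel-wise probability of jet/flame occurrence from repeated imaging:
# 10 binary masks per crank angle are averaged into a probability map.
# random data below merely stands in for real endoscope images.

import numpy as np

rng = np.random.default_rng(0)
# 10 repetitions of a 64x64 binary jet mask (True = jet detected) -- dummy data
images = rng.random((10, 64, 64)) < 0.3

probability_map = images.mean(axis=0)   # 0.0 = never seen, 1.0 = seen in all 10
print(probability_map.min(), probability_map.max())
```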
fig. 14 shows the images taken inside the combustion chamber at the instant when ignition begins (first visible flames). it appears that the addition of water in the form of an emulsion retards the ignition timing, which is in line with the indicator diagrams.

fig. 14: image taken inside the combustion chamber (one fuel jet) for the shaft rotation angle corresponding to the ignition timing αps (legend as in fig. 12)

fig. 15: probability of the occurrence of injected fuel or a flame in the given region of the combustion zone for the shaft rotation angle corresponding to the ignition timing αps (legend as in fig. 12)

the image in fig. 15 was obtained after statistical treatment of the 10 measurements collected for the shaft rotation angle corresponding to the ignition timing. in the regions marked in red, a fuel jet or a flame was registered during each repeated procedure; this should be interpreted as a 100 % likelihood of the occurrence of a fuel jet or a flame at that point. in the regions marked in black, no repeated imaging revealed a fuel jet or a flame; this can be interpreted as a 0 % likelihood. intermediate colours indicate that a fuel jet or a flame was revealed only in some of the repeated measurements. the following drawings show the images and the statistically treated results of image analysis registered for the shaft rotation angle corresponding to the maximum pressure of combustion. this stage of the combustion process is of key importance, featuring the local peak temperatures in the post-flame zones, which in turn determine the amounts of nox being formed [8].

fig. 16: image taken inside the combustion chamber (one fuel jet) for the shaft rotation angle corresponding to the maximum pressure of combustion αpmax (legend as in fig. 12)

fig. 16 shows the combustion process in the same part of the chamber as in the previous images (figs. 12–14). during this stage of the process a diffusion flame is registered, and a portion of the piston bottom is revealed. the zone of the high-intensity flame is clearly greatest for pure rme, with standard diesel fuel being next in line. for the water emulsions, particularly the 30 % emulsion, the flame is less intense and occupies a smaller portion of the monitored chamber.

fig. 17: probability of the occurrence of a flame in the given region of the combustion chamber for the shaft rotation angle corresponding to the maximum pressure of combustion αpmax (legend as in fig. 12)

the shaft rotation angle at which the maximum combustion pressure is registered is similar for all tested fuel types, and the maximum pressures of combustion are also similar. the two-colour method was applied, supported by dedicated software, to determine the isotherm distribution in the diffusion flame as a function of the shaft rotation angle [1]. the diagrams below show the temperature distribution in the monitored section of the combustion chamber for the shaft rotation angle corresponding to the maximum pressure of combustion. the method is based on an analysis of the diffusion flame radiation spectrum, enabling us to find the temperature distribution inside the combustion chamber, provided that it exceeds 1800 k; a strongly simplified sketch of the underlying principle follows.
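under the wien approximation and a grey-body assumption, the ratio of the radiation intensities at two wavelengths can be inverted for the temperature; this is the core of the two-colour principle. the real procedure additionally models the soot emissivity (hottel-broughton), which is omitted here, and the chosen wavelengths are assumptions of the sketch.

```python
# simplified two-colour (ratio) pyrometry: recover flame temperature
# from the intensity ratio at two wavelengths, assuming a grey body and
# the wien approximation. soot emissivity modelling is omitted; the
# wavelengths are illustrative assumptions.

import math

C2 = 1.4388e-2                 # second radiation constant [m K]
lam1, lam2 = 550e-9, 700e-9    # observation wavelengths [m] -- assumed

def wien_intensity(lam: float, T: float) -> float:
    return lam**-5 * math.exp(-C2 / (lam * T))

def two_colour_temperature(ratio: float) -> float:
    """invert i(lam1, T) / i(lam2, T) = ratio for T."""
    return C2 * (1.0/lam2 - 1.0/lam1) / (math.log(ratio) - 5.0*math.log(lam2/lam1))

T_true = 2000.0
r = wien_intensity(lam1, T_true) / wien_intensity(lam2, T_true)
print(f"recovered T = {two_colour_temperature(r):.1f} K")   # -> 2000.0
```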
in the case considered here, the temperatures only slightly exceed 1800 k, because of the relatively low loading (bmep 4 bar). however, a comparison of the combustion chamber regions occupied by flame with temperatures in excess of 1800 k reveals that in engines fuelled with rme the maximum local temperatures are higher than in those fuelled with standard diesel fuel. the use of rme-water emulsions leads to a significant reduction in the maximum local temperatures: the larger the proportion of water, the more significant the reduction. accordingly, the nox concentration in the exhaust gases is significantly reduced, as shown in fig. 7. for the 30 % water emulsion, the nox emission level is reduced by nearly 50 % in relation to pure rme or standard diesel fuel.

fig. 18: temperature distribution in the flame (in excess of 1800 k) for the shaft rotation angle corresponding to the maximum pressure of combustion αpmax (legend as in fig. 12)

an analysis of the registered images featuring the injection and combustion phenomena leads us to the following conclusions:

1. fuel injection proceeds differently depending on the fuel type. for rme, a cloud of well-sprayed fuel is formed around the injected jet (figs. 12–13), which is then combusted in a large volume of the chamber. in the case of rme-water emulsions, the situation is entirely different: no straightforward relationship holds between the proportion of water and the extent and angle of the jet cone for a given emulsion type. larger proportions of water encourage a greater extent and greater angles of the jet cone, and the process becomes similar to that observed for pure rme. this is attributable to the fact that emulsions with a low water content (from several to around ten percent) exhibit a local extreme of viscosity. stable emulsions with 10 % water content displayed the highest viscosity by far, followed by the 30 % water emulsion, rme and standard diesel fuel. for fuels with a low water content, the high viscosity leads to a rapid increase in flow resistance as the fuel has to push through the spray nozzle openings; a further increase in the proportion of water reduces the kinematic viscosity of the fuel, and the injection process becomes similar to that observed for pure rme.

2. the zone occupied by the flame at the time of ignition is largest for rme and smallest for the 30 % water emulsion (figs. 14–15). this is associated both with the spraying spectrum on the macroscale and with the latent heat required for evaporation of the water contained in the emulsion. ignition becomes retarded with an increase in the proportion of water in the emulsion.

3. at the maximum combustion pressure (figs. 16–17), the phenomena inside the cylinder proceed differently for the different fuel types. when the engine is fuelled with rme, the maximum combustion temperatures are higher and, as a result, the nox emissions are higher than when standard diesel fuel is used.
the addition of water (to form rme-water emulsions) significantly reduces the "maximum maximorum" temperature of the working medium, both in terms of the duration of the process and of the cylinder volume covered by isotherms featuring high combustion temperatures. in consequence, the nox emissions in the exhaust gas are significantly reduced.

7 conclusions

the application of fuel-water emulsions to the fuelling of diesel engines reduces the amounts of toxic substances present in the exhaust gas. the selection of the key parameters of the emulsion requires extensive analyses, and the present study is a contribution to this body of research. research work is now underway in several research centres worldwide [2, 3, 4, 5, 6] and also in poland [10]. gaining a better insight into the processes, and searching for vital relationships between the properties of water-fuel emulsions and the energy-based, environmental and operational parameters of the diesel engine, will enable poland to offer a more substantial contribution to the ongoing research work.

references

[1] cisek, j.: options for the analysis of fuel injection using visual digitized methods. journal of middle european construction and design of cars, no. 3, september 2004.
[2] miyamoto, n., ogawa, h., wang, j.: significant nox reductions with direct water injection into the sub-chamber of an idi diesel engine. sae transactions, 1995, no. 950609.
[3] miyano, h., sungawa, t., tayama, k., nagae, y., yasueda, s.: the ship test for low-nox by stratified fuel-water injection system. 21st international congress on combustion engines, interlaken, cimac 1995.
[4] murayama, t., morishima, y., tsukahara, m., miyamoto, n.: experimental reduction of nox, smoke, and bsfc in a diesel engine using uniquely produced water (0–80 %) to fuel emulsion. sae transactions, 1978, no. 780224.
[5] velji, a., remmels, w., schmidt, r. m.: water to reduce nox emission in diesel engines — a basic study. 21st international congress on combustion engines, interlaken, cimac 1995.
[6] mello, j. p., mellor, a. m.: nox emission from direct injection diesel engines with water/steam dilution. sae transactions, 1999.
[7] merkisz, j., piaseczny, l.: wpływ zasilania emulsją paliwowo-wodną na toksyczność i wskaźniki pracy okrętowego średnioobrotowego silnika spalinowego. journal of kones 2001, no. 3–4, p. 294–303.
[8] banot, k.: wykorzystanie wody w strefie spalania oleju napędowego dla obniżenia emisji nox w silniku wysokoprężnym. konmot-autoprogres, czasopismo techniczne, wydawnictwo politechniki krakowskiej, 6-m/2004, p. 79–86, 2004.
[9] cisek, j.: research project no. 5 t12d 033 25. politechnika krakowska, kraków, 2008.
[10] szlachta, z., cisek, j.: research project no. 8 t12d 035 20. politechnika krakowska, kraków, 2008.

jerzy cisek, ph.d., m.e.
phone: +048 (12) 628 36 75
e-mail: jcisek@pk.edu.pl
cracow university of technology
institute of automobiles and internal combustion engines, division of diesel engines
31-864 kraków, al. jana pawła ii 37, poland
determination of uncertainties for correlated input quantities by the monte carlo method

marcel goliaš, rudolf palenčár
department of automation, measurement and applied informatics, faculty of mechanical engineering, slovak university of technology in bratislava, slovak republic
correspondence to: marcel.golias@stuba.sk

abstract

this paper presents the calculation of uncertainty by the propagation of distributions using the monte carlo method for a measurement model with one output quantity. the procedure is shown on the example of calculating the area of a rectangle whose sides are measured directly with the same caliper. the measurements are therefore correlated, and the uncertainties are calculated for three values of the correlation coefficient. the second part of the paper presents a validation of the law of propagation of uncertainty against the propagation of distributions by the monte carlo method.

keywords: uncertainty of measurement, monte carlo method, correlation.

1 introduction

the numerical method that implements the law of propagation of distributions, namely monte carlo simulation, is used especially in cases when linearization by a taylor series cannot be applied, when the probability distributions of the input quantities are asymmetric, or when it is difficult to determine the partial derivatives. annex 1 to gum (guide to the expression of uncertainty in measurement) [2] gives a general alternative procedure, consistent with gum [1], for the numerical evaluation of measurement uncertainty that is suitable for computer processing. the procedure applies to models having a single output quantity, where the values of the input quantities may be associated with any probability density function, including asymmetric ones. when calculating the uncertainty for correlated input quantities, it is necessary to determine the covariance matrix, the correlation coefficients and the joint probability density of the input quantities. the result of the monte carlo simulation is a 95 % coverage interval, an estimate and a standard uncertainty for the output quantity [2,3].

2 propagation of distributions by the monte carlo method

the monte carlo method provides a general approach for the numerical approximation of the distribution function gy(η) for the output quantity y = f(x), where the input quantities of the model are x = (x1, x2, . . . , xn)^t. the core of the approach is repeated sampling from the probability density functions of the input quantities xi [4,5]. the distribution function for the output quantity y obtained from the monte carlo simulation is defined as

gy(η) = ∫−∞^η gy(z) dz  (1)

and the probability density function is defined [3] as

gy(η) = ∫−∞^∞ · · · ∫−∞^∞ gx(ξ) δ(η − f(ξ)) dξn · · · dξ1  (2)

where δ is the dirac function and gxi(ξi), i = 1, . . . , n, are the probability density functions of the input quantities xi. the dependence of the input quantities can be expressed by the covariance matrix

σ = [ σ11 σ12 · · · σ1n ; σ21 σ22 · · · σ2n ; · · · ; σn1 σn2 · · · σnn ]  (3)

the monte carlo simulation as an implementation of the propagation of distributions can conveniently be stated as a step-by-step procedure:
• select the number m of monte carlo trials to be made,
• generate m samples of the input quantities,
• evaluate the model to give m values of the output quantity y,
• sort these m values of the output quantity into non-decreasing order,
• form an estimate of the output quantity and the associated standard uncertainty,
• form the shortest 95 % coverage interval for the output quantity [2,3].

3 determination of uncertainties by monte carlo

to validate the law of propagation of uncertainty against the monte carlo method, the area of a rectangle is determined by direct measurement of its sides with the same caliper, so the measurements are correlated. the sides of the rectangle have nominal lengths of 100 mm and 50 mm. according to the manufacturer's certificate, a measurement error of 0.01 mm is admissible. this error applies at 20 °c [5], and we neglect other effects. the measurement model is

p = a · b = (am + δam) · (bm + δbm)  (4)

where am is the arithmetic mean of the measured lengths of side a (mm), δam is the measurement error of side a (mm), bm is the arithmetic mean of the measured lengths of side b (mm), and δbm is the measurement error of side b (mm). the model for calculating the area of the rectangle from the input quantities, with gaussian probability density functions and one output quantity, is shown in figure 1. the uncertainty calculation according to the law of propagation of uncertainty is summarized in the balance-of-uncertainties table (table 1), which is used to compare the two methods for evaluating uncertainties. the input quantities used in the monte carlo simulation are shown in table 2. m = 10^7 monte carlo trials are used in the simulation. the input quantities are generated using a pseudo-random number generator, namely the 32-bit mersenne twister generator with a period of 2^19937 − 1. figure 2 shows the probability density functions and the frequency distributions (histograms) of the rectangle area obtained by the propagation of distributions by monte carlo simulation for the three correlation coefficients. the vertical lines define the 95 % coverage intervals for the three correlation coefficients, with estimates and expanded uncertainties. the width of the coverage interval increases with an increasing value of the correlation coefficient.

figure 1: propagation of the distributions of the input quantities for the measurement model

table 1: balance of uncertainties

input quantity xi | estimate xi [mm] | standard uncertainty u(xi) [mm] | distribution | sensitivity coefficient [mm] | contribution to the standard uncertainty ui(y) [mm²]
arithmetic mean of the measured values am | 100.097 | 0.0163 | normal | 50.096 | 0.82
measurement error δam | 0.000 | 0.010 | normal | 50.096 | 0.50
arithmetic mean of the measured values bm | 50.096 | 0.0164 | normal | 100.097 | 1.65
measurement error δbm | 0.000 | 0.010 | normal | 100.097 | 1.00
p (correlation coefficient r = 0) | 5014.46 | | | | 2.15
p (correlation coefficient r = 0.5) | 5014.46 | | | | 2.27
p (correlation coefficient r = 0.9) | 5014.46 | | | | 2.35

table 2: inputs to the monte carlo simulation

quantity | estimate | standard deviation | distribution
arithmetic mean of the measured values am | 100.097 mm | 0.0163 mm | normal
measurement error δam | 0.000 mm | 0.010 mm | normal
arithmetic mean of the measured values bm | 50.096 mm | 0.0164 mm | normal

figure 2: probability density functions of the rectangle area for the output quantity

figure 3 shows the distribution function, formed from the values of the measurement model sorted into non-decreasing order. vertical lines mark the endpoints of the probabilistically symmetric 95 % coverage interval for the three correlation coefficients. the standard uncertainty increases with the correlation coefficient.
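a minimal sketch of the procedure for the rectangle model, using the inputs of table 2, follows. where exactly the correlation r enters (here: between the two caliper error terms δam and δbm) is an assumption of the sketch, as is the reduced number of trials.

```python
# propagation of distributions for p = (am + dam) * (bm + dbm), with the
# gaussian inputs of table 2 and an assumed correlation r between the
# two caliper error terms.

import numpy as np

M = 10**6                      # trials (the paper uses 10**7)
r = 0.9                        # correlation coefficient
rng = np.random.default_rng(1)

am = rng.normal(100.097, 0.0163, M)
bm = rng.normal(50.096, 0.0164, M)

# correlated errors dam, dbm, each n(0, 0.010 mm), correlation r
cov = 0.010**2 * np.array([[1.0, r], [r, 1.0]])
dam, dbm = rng.multivariate_normal([0.0, 0.0], cov, M).T

p = (am + dam) * (bm + dbm)
p.sort()                                            # non-decreasing order
low, high = p[int(0.025 * M)], p[int(0.975 * M)]    # symmetric 95 % interval
print(f"y = {p.mean():.2f} mm^2, u(y) = {p.std(ddof=1):.2f} mm^2, "
      f"95 % interval = [{low:.2f}; {high:.2f}]")
```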
the estimates and combined uncertainties of the output quantity y obtained by the two methods of evaluating uncertainty are given in table 3 [2]. in order to validate the law of propagation of uncertainty against the monte carlo simulation, it is necessary to determine the numerical tolerance δ [3]:

u(y) = 2.15 mm² = 21 × 10⁻¹ mm², a = 21, r = −1, δ = (1/2) × 10⁻¹ mm² = 0.05 mm²  (5)

figure 3: the distribution functions of the rectangle area for the output quantity

table 3: validation of the law of propagation of uncertainty by monte carlo simulation

method | r | m | y [mm²] | u(y) [mm²] | 95 % coverage interval [mm²] | dlow, dhigh [mm²] | validated (δ = 0.05)
propagation of uncertainty | 0 | – | 5014.46 | 2.15 | [5010.15; 5018.77] | – | –
propagation of uncertainty | 0.5 | – | 5014.46 | 2.27 | [5009.92; 5019.00] | – | –
propagation of uncertainty | 0.9 | – | 5014.46 | 2.35 | [5009.76; 5019.16] | – | –
propagation of distributions (mcm) | 0 | 10⁷ | 5014.46 | 2.15 | [5010.25; 5018.67] | 0.1; 0.1 | no
propagation of distributions (mcm) | 0.5 | 10⁷ | 5014.46 | 2.64 | [5009.29; 5019.63] | 0.6; 0.6 | no
propagation of distributions (mcm) | 0.9 | 10⁷ | 5014.46 | 2.83 | [5008.91; 5020.02] | 0.8; 0.8 | no

4 conclusion

in this paper, the monte carlo method has been used to estimate the uncertainty for correlated input quantities in the direct measurement of the sides of a rectangle with the same caliper, for three different correlation coefficients [2]. the measurement result was also determined by the propagation of uncertainty in accordance with [5]. when calculating the uncertainties, it is necessary to take the correlation between the input quantities of the measurement model into account, because the correlation affects the resulting uncertainty. the calculations have shown that the use of the law of propagation of uncertainty is not acceptable for the model considered here, and that the higher terms of the taylor series should be taken into consideration.

acknowledgement

the authors wish to thank the slovak university of technology in bratislava and the vega grant agency, grant no. 1/0120/12, and apvv grant no. 0096-10, for their support.

references

[1] iso/iec guide 98-3:2008, uncertainty of measurement – part 3: guide to the expression of uncertainty in measurement (gum: 1995).
[2] jcgm 101:2008, evaluation of measurement data — supplement 1: guide to the expression of uncertainty in measurement — propagation of distributions using a monte carlo method (bipm).
[3] cox, m. g., siebert, b. r. l.: the use of a monte carlo method for evaluating uncertainty and expanded uncertainty. national physical laboratory, teddington, uk, metrologia, 43, 2006.
[4] fisher, g. s.: monte carlo. new york: springer verlag, 1996.
[5] chudý, v., palenčár, r., kureková, e., halaj, m.: meranie technických veličín. bratislava: vydavateľstvo stu, 1999.

a pitch detection algorithm for continuous speech signals using viterbi traceback with temporal forgetting

j. bartošek

abstract

this paper presents a pitch-detection algorithm (pda) for application to signals containing continuous speech. the core of the method is based on merged normalized forward-backward correlation (mnfbc) working in the time domain, with the ability to make basic voicing decisions. in addition, a viterbi traceback procedure is used for post-processing the mnfbc output, considering the three best fundamental frequency (f0) candidates in each step. this should make the final pitch contour smoother, and should also prevent octave errors. in computing the transition probabilities between f0 candidates, two major improvements were made over existing post-processing methods. firstly, we compare pitch distances in musical cent units.
secondly, temporal forgetting is applied in order to avoid penalizing pitch jumps after prosodic pauses of one speaker, or changes in pitch connected with turn-taking in dialogs. results computed on a pitch-reference database clearly show the benefit of the first improvement, but they have not yet proved any benefit of the temporal modification. we assume this happened only due to the nature of the reference corpus, which has a small amount of suprasegmental content.

keywords: pda, fundamental frequency, viterbi, temporal forgetting, mnfbc, speech processing.

1 introduction

almost every audible sound tends to have a fundamental frequency. this is the lowest frequency on which the signal is periodic, and we sense this frequency as the height (pitch) of the sound. human speech perception is partly based on intonation (changes of pitch), which is an aspect of prosody. thanks to this we can distinguish whether a person is making a statement or asking a question [1]. prosodic information also enables us to recognize the emotions of a speaker. a motivation for finding a precise and robust pda could be to track the intonation contour in continuous speech. this is a crucial step for the proper function of, e.g., a punctuation detector [2] or an emotion classifier of the speaker. several pitch-detection methods are known today. they can generally be divided according to the domain in which they operate (time, frequency, cepstrum, etc.); an overview of some basic methods can be found in [12]. the most widely used methods are probably the various forms of autocorrelation algorithms (time and frequency domain), based on the similarity of the signal to itself after some time lag (time domain) or on periodicity in the spectrum (frequency domain). amdf [5] (time domain), the cepstral method [4] (a modification of the spectral domain) and sub-harmonic summation (shs) [3] are well-described and widely used methods.

2 a description of pda using mnfbc

a critical aspect of pdas used for speech analysis is that there are fast transitions between articulated phonemes. for this reason, the best result for a speech signal (in contrast to a singing signal) is obtained by methods that work in the time domain. the pda that is used and improved in this paper emerges from the very complex pda described in detail in [9]. the core of the pitch-detection method is the merged normalized forward-backward correlation. equation (1) presents the basic correlation term that is used in equations (2) and (3) to compute the forward and backward correlations, where the constant maxper refers to the time period of the lowest detectable frequency. these two correlations combined together lead to the mnfbc function (4), which is then half-way rectified. combining the two opposed correlations into a final correlation should improve the precision for frames with problematic content in terms of the different nature of the beginning and ending parts (transitions).

⟨xw,k[n], xw,l[n]⟩ = σ_{n=0}^{2·maxper−1} xw[n+k] xw[n+l]  (1)

nfc[t] = ⟨xw,0[n], xw,t[n]⟩ / sqrt(⟨xw,0[n], xw,0[n]⟩ ⟨xw,t[n], xw,t[n]⟩)  (2)

nbc[t] = ⟨xw,2maxper[n], xw,2maxper−t[n]⟩ / sqrt(⟨xw,2maxper[n], xw,2maxper[n]⟩ ⟨xw,2maxper−t[n], xw,2maxper−t[n]⟩)  (3)

mnfbc[t] = (⟨xw,0[n], xw,0[n]⟩ (nfc′[t])² + ⟨xw,2maxper[n], xw,2maxper[n]⟩ (nbc′[t])²) / (⟨xw,0[n], xw,0[n]⟩ + ⟨xw,2maxper[n], xw,2maxper[n]⟩)  (4)

in contrast to [9], there is neither a signal pre-processing section in our algorithm nor a special block determining whether a segment of speech is voiced or unvoiced; this decision is made by thresholding the mnfbc value itself.
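a compact sketch of equations (1)–(4) for one analysis frame follows. the frame layout (a buffer of length 4·maxper) and the rectification of nfc and nbc before squaring are assumptions of this illustration, not a verbatim transcription of the implementation in [9].

```python
# merged normalized forward-backward correlation, eqs. (1)-(4).
# assumes an input buffer x of length >= 4 * max_per samples.

import numpy as np

def mnfbc(x: np.ndarray, min_per: int, max_per: int) -> np.ndarray:
    """return mnfbc[t] for lags t = 0..max_per (entries below min_per stay 0)."""
    L = 2 * max_per                        # correlation window length, eq. (1)

    def dot(k: int, l: int) -> float:      # <xw_k, xw_l>, eq. (1)
        return float(np.dot(x[k:k + L], x[l:l + L]))

    e0 = dot(0, 0)                         # forward reference energy
    eb = dot(L, L)                         # backward reference energy
    out = np.zeros(max_per + 1)
    for t in range(min_per, max_per + 1):
        nfc = dot(0, t) / np.sqrt(e0 * dot(t, t))                # eq. (2)
        nbc = dot(L, L - t) / np.sqrt(eb * dot(L - t, L - t))    # eq. (3)
        nfc, nbc = max(nfc, 0.0), max(nbc, 0.0)                  # half-way rectify
        out[t] = (e0 * nfc**2 + eb * nbc**2) / (e0 + eb)         # eq. (4)
    return out
```

the three highest local maxima of the returned function then serve as the f0 candidates (f0 = fs/t for lag t), and a frame whose best mnfbc value falls below the voicing threshold is marked unvoiced.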
there is also no special block ensuring correct pitch with a sub-harmonic summation (shs) function [3] to prevent halving errors; with the improvements suggested in this paper, this is not needed (see section 3 for details). the computational requirements are determined by the complexity of the correlation operation done in the time domain, which is n², where n is the length of the processed window in samples. this is worse than the complexity n log(n) of faster pdas operating in the frequency domain and using the fft. although real-time use in the final implementation is wanted, we assume that the algorithm will not be used in any time-critical or embedded system where computational complexity is a critical issue.

3 viterbi post-processing

post-processing using the viterbi algorithm [6] is applied to find the optimal track of the pitch. the power of the viterbi procedure lies in its ability to apply user-defined rules for comparing the candidates. however, for proper use of the algorithm some requirements have to be met. each candidate needs to have assigned its "emission" probability bk (the probability that candidate k is f0 for the current frame, without considering any history) and also its "transition" probability akl, which denotes how probable it is that candidate k in the current frame will be followed by candidate l in the next frame. having these context-independent values, we can gradually compute the values of the function δm,l, which gives the final probability of candidate l being f0 for frame m, considering the results from the previous frames. the function ψm,l designates the index of the most probable candidate in frame m − 1. the equations can then be expressed as (5) and (6):

δ[m, l] = max_k (δ[m−1, k] a[k, l]) b[l]  (5)

ψ[m, l] = argmax_k (δ[m−1, k] a[k, l])  (6)

the algorithm starts by assigning δ1,i = bi and ψ1,i = 0 for the first frame. in the current implementation, the three best candidates (the three highest peaks) come from the mnfbc function. note that these candidates often (but not always) correspond to the harmonic content of a speech signal [11]. this means that in most cases the candidate with the highest mnfbc value is really f0, and the two other highest values are harmonics of f0 (its natural multiples). however, there can also be cases when the harmonics are "stronger" than the fundamental; in this case, the viterbi procedure should prevent halving or doubling errors. the basic scheme of the algorithm is depicted in figure 1.

fig. 1: the trellis of the viterbi algorithm

the emission probability bk is implemented directly as the value of the mnfbc function for each f0 candidate. the transition probability can be computed according to [9] as a decreasing exponential of the frequency difference. this could work well for some range of low fundamental frequencies with a suitable multiplying constant for the difference. however, our perception of pitch is not linear with growing frequency, but logarithmic [11]. this means that the same difference in frequency causes a greater perceived pitch change in a lower frequency band than in a higher frequency band. for this reason, the difference in frequency in hz is converted in our algorithm to a difference in pitch in musical terms, i.e. semitones and cents. let the variable x be the difference of the frequencies of consecutive candidates converted to musical cents (100 cents = 1 semitone) according to equation (7).
the resulting transition probability function a(x) is then expressed as (8), and the function is visualized in figure 2. the multiplying constant 0.0012 was experimentally found to give the best results. overall results showing how this modification improves the precision of the algorithm can be found in section 4.

x = 1200 |log2(f1/f2)|  (7)

a(x) = e^(−0.0012x)  (8)

fig. 2: transition probability function depending only on the difference in cents

now let us consider a situation when a jump in the pitch of speech can occur at the border of neighbouring prosodic units [1] (with unvoiced segments between them, so that the first voiced segment of the new prosodic unit is, in terms of the viterbi algorithm, the next voiced segment after the last one of the preceding prosodic unit). the previous probability function will not allow the change to be immediately applied to the pitch track, and needs some time to "adopt" the new pitch level. most utterances take place within the range of a musical fourth (which is 5 semitones = 500 cents in terms of explicit musical distance). this is not the overall pitch range, but it is the common range that we use across prosodic units, sometimes referred to in the literature as the "pitch sigma". it is probable that the biggest jump in pitch of 5 semitones will not occur very often, and only on the boundaries of prosodic units. however, we permit this jump to be possible without any penalization after a long enough prosodic pause. for this reason, a difference of 500 cents is taken as a limit, and higher differences are penalized by a linear decrease. the behaviour of the temporal probability function of two variables (cent difference x and time t) can thus be expressed on the cent-difference interval x ∈ ⟨0, 500⟩ as

a(x, t) = e^(−0.0012x) + (1 − e^(−0.0012x)) t/tthr  (9)

and on the interval x ∈ ⟨500, 1200⟩ as

a(x, t) = e^(−0.0012x) + ((1200 − x)/700 − e^(−0.0012x)) t/tthr  (10)

where tthr is the time-forgetting threshold. when the prosodic pause length reaches this value, all pitch changes in the range of 500 cents have a transition probability of 1.

fig. 3: transition probability function depending on the difference in cents and on time
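equations (7)–(10) translate directly into a few lines of code; a sketch follows. saturating t at tthr and treating the pause length as an input are assumptions about details the text leaves open.

```python
# cent distance, eq. (7), and the temporal-forgetting transition
# probability, eqs. (8)-(10).

import math

def cents(f1: float, f2: float) -> float:
    """pitch distance in musical cents, eq. (7)."""
    return 1200.0 * abs(math.log2(f1 / f2))

def transition_prob(f1: float, f2: float, pause: float, t_thr: float = 2.0) -> float:
    """a(x, t) of eqs. (9)-(10); pause is the elapsed unvoiced time [s].
    saturating t at t_thr is an assumption of this sketch."""
    x = cents(f1, f2)
    t = min(pause, t_thr)
    base = math.exp(-0.0012 * x)                    # eq. (8)
    if x <= 500.0:
        return base + (1.0 - base) * t / t_thr                     # eq. (9)
    return base + ((1200.0 - x) / 700.0 - base) * t / t_thr        # eq. (10)

# after a 2 s pause, a jump of roughly a fourth (~500 cents) costs nothing:
print(transition_prob(100.0, 133.5, pause=2.0))    # ~1.0
print(transition_prob(100.0, 133.5, pause=0.0))    # ~0.55, penalized
```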
5/2011 table 1: channel 0 overall results pda ve ue ve+ue geh gel geh10 gel10 de he [%] [%] [%] [%] [%] [%] [%] [%] [%] acf freq 44.4 23.5 31.6 1.2 0.1 1.5 0.18 0.4 0.06 dfe 26.6 15.5 20.4 8.4 4.2 16.5 8.9 0.2 1.3 mnbfcv1 22 12.7 16.3 0.4 21.2 1.5 22.1 0.06 19.5 mnbfcv2 22 10.7 15 0.4 1.1 1.8 2.3 0.03 0.8 mnbfcv3 18.5 13.3 15.3 0.5 1.2 1.9 2.6 0.05 0.8 mnbfcv4 15.6 16.3 16 0.6 1.3 2 2.8 0.05 0.9 mnbfcv5 22 10.7 15 0.7 1.7 2.1 2.9 0.14 1 table 2: gross errors (gel+geh [%]) in 2/3 octave frequency bands on channel 0 pda 57–88 88–141 141–225 225–353 353–565 [hz] [hz] [hz] [hz] [hz] acf freq 92.7 5.2 0.6 1.1 17.7 dfe 26.4 12.1 12.3 13.0 53.4 mnbfcv1 2 1.9 28.3 49.7 73.7 mnbfcv2 1.1 0.7 1.7 2.2 32.1 mnbfcv3 1.7 0.9 2 2.3 33 mnbfcv4 2.3 1 2.3 2.7 34.4 mnbfcv5 2.2 1.6 2.8 2.9 32.1 rence) and doubling errors (de) were also brought in with a tolerance of 1 semitone range from half or double the referencef0. errorsof this kindarea special type of gross errors and often occur on real pda outputs for noisy signals or transitions fromvoiced to unvoiced speech elements. we may sometimes need to observe the errorsnot in the entire frequencyband but e.g. within 5 smaller frequency sub-bands individually (2/3 octave bands were used to cover the range from 60 to 560 hz). 4.3 results and discussion table 1 shows the overall results for the highest signal-to-noise (snr) ratio channel 0 of the reference database. mnbfcv1 is the basic variant with the voiced/unvoiced (v/uv) decision threshold set to value 0.5 and with the transition probability of the viterbi procedure computed from the direct frequency difference. mnbfcv2 improves the first variant with the conversion difference to cents. mnbfcv3 is almost the same as mnbfcv2, but has the v/uv threshold set to 0.45, whereas mnbfcv4 has the threshold value set to 0.4. the final mnbfcv5 involves adding the temporal domain to the transition probability function with the time forgetting threshold set to 2 seconds. table 2 presents a comparison of the precision over five distinct frequency bands. to compare our method with other widely used methods, we added the results for autocorrelation in the frequency domain (acf freq, a very goodmethod for tracking singing) and thedirect frequency estimation method (dfe) [8], which is currently used for evaluating parkinson’s disease at fee ctu in prague. the results show that mnfbc is better than dfe in v/uv detection and also in precision. the ve+ue parameter is the best for mnbfcv2, but we can achieve the bestve ratio formnbfcv4 (but with aworseue rate). the choice of variantdepends on the target application—whetherweneed tominimizevoicederrorsorunvoicederrors. for example, in the case of the planned punctuation detector we are trying tominimize the unvoiced error rate in order to obtain only confident f0 estimates. the results also show a big increase in precision with frequency difference (mnbfcv1) to cent conversion (mnbfcv2). progress can be seenmainly ingel and in the halving error rate. table 2 shows that most errors for mnbfcv1 occur in the highest band, where the dif11 acta polytechnica vol. 51 no. 5/2011 ferences from the current frequency aremuch greater in hz units than for lower bands. thus these transitions are evaluatedwith very lowprobability, leading to these errors. the table also shows that the acf method can provide the best results for the highest frequency band, but is very poor in the lowest band. 
mnbfcv5 with temporal forgetting could probably not show its strength on the reference corpus due to the lack of suprasegmental prosodic phrases (the corpus consists mainly of isolated words). in comparison with mnbfcv2, however, there is only a slightly higher geh rate. globally, mnfbc with the addition of the viterbi traceback procedure outperforms dfe on the close-talk channel 0. note that it has much lower geh and gel even for a lower ve. this is not easy to achieve for a pda, because a lower ve means that more uncertain segments (which other pdas with a higher ve have considered as unvoiced) pass to the computation of the precision of the f0 detection (gross errors). other results not presented in this paper have also shown a noticeable decrease in precision on channel 1 for the algorithm presented here. to get good results even in noisy environments, higher noise robustness is needed. this could be accomplished by adding a pre-processing stage with noise reduction (not implemented yet).

5 conclusion

we have described a pitch-detection algorithm purely based on merged normalized forward-backward correlation (mnfbc) with an advanced viterbi post-processing procedure for finding the most probable pitch track. the optimal range of the voicing threshold was found for the mnfbc function. the results confirm that computing the transition probabilities with the pitch difference measured in semitones significantly improves the gross error rates (especially frequency halving) over the case of a direct difference of frequencies. we have also tried to extend the transition probability function with a temporal dimension. this enhancement should lead to fewer errors occurring on the edges of prosodic pauses, but this has not been proven in the experiments performed on the pitch-reference database. this could be due to the very limited presence of suprasegmental prosodic pauses in the corpus. more experiments on suitable utterances need to be performed in order to evaluate this hypothesis.

acknowledgement

the research presented in this paper was supervised by ing. václav hanžl, fee ctu in prague. it has been supported by the czech grant agency under grant no. 102/08/0707 "speech recognition under real-world conditions" and by grant no. 102/08/h008 "analysis and modelling of biomedical and speech signals".

references

[1] palková, z.: fonetika a fonologie češtiny. praha: karolinum, 1994.
[2] kim, j., woodland, p. c.: the use of prosody in a combined system for punctuation generation and speech recognition. in proceedings of eurospeech (2001), 2757–2760.
[3] hermes, d. j.: measurement of pitch by subharmonic summation. j. acoust. soc. am. 83 (1988), 257–264.
[4] noll, a. m.: cepstrum pitch determination. j. acoust. soc. am. 41 (2) (august 1966), 293–309.
[5] ross, m. j., shaffer, h. l., cohen, a., freudberg, r., manley, h. j.: average magnitude difference function pitch extractor. ieee trans. acoust. speech signal process. assp-22 (5) (october 1974), 353–361.
[6] viterbi, a. j.: error bounds for convolutional codes and an asymptotically optimum decoding algorithm. ieee trans. inform. theory, vol. it-13 (april 1967), 260–269.
[7] bartošek, j.: pitch detection algorithm evaluation framework. 20th czech-german workshop on speech processing, prague (2010), 118–123.
[8] bořil, h., pollák, p.: direct time domain fundamental frequency estimation of speech in noisy conditions. in proceedings of eusipco 2004 (european signal processing conference, vol. 1) (2004), 1003–1006.
[9] kotnik, b., et al.: noise robust f0 determination and epoch-marking algorithms.
signal processing 89 (2009), 2555–2569.
[10] kotnik, b., höge, h., kacic, z.: evaluation of pitch detection algorithms in adverse conditions. proc. 3rd international conference on speech prosody, dresden, germany (2006), 149–152.
[11] syrový, v.: hudební akustika, 2nd ed. praha: hamu, 2008.
[12] uhlíř, j.: technologie hlasových komunikací. praha: čvut, 2007.

about the author

jan bartošek was born in 1984 in litoměřice (cz) and attended primary and secondary school there. he completed his bachelor degree (2007 — informatics and computer science) and his master degree (2009 — software engineering) at the department of computer science, faculty of electrical engineering, ctu in prague. he is now a student on the doctoral study programme at ctu prague, dept. of circuit theory, and is interested in voice and music technologies.

jan bartošek
e-mail: bartoj11@fel.cvut.cz
dept. of circuit theory
faculty of electrical engineering
czech technical university in prague
technická 2, 166 27 praha, czech republic

angular distribution of grbs

l. g. balázs, a. mészáros, i. horváth, z. bagoly, p. veres, g. tusnády

abstract

we studied the complete randomness of the angular distribution of batse gamma-ray bursts (grbs). based on their durations and peak fluxes, we divided the batse sample into 5 subsamples (short1, short2, intermediate, long1, long2) and studied the angular distributions separately. we used three methods to search for non-randomness in the subsamples: voronoi tesselation, minimal spanning tree, and multifractal spectra. to study any non-randomness in the subsamples we defined 13 test-variables (9 from the voronoi tesselation, 3 from the minimal spanning tree and one from the multifractal spectrum). we made monte carlo simulations taking into account batse's sky-exposure function. we tested the randomness by introducing squared euclidean distances in the parameter space of the test-variables. we recognized that the short1 and short2 groups deviate significantly (99.90 %, 99.98 %) from the fully random case in the distribution of the squared euclidean distances, but this is not true for the long samples. in the intermediate group, the squared euclidean distances also give a significant deviation (98.51 %).

1 introduction

recently, the cosmological origin of gamma-ray bursts (hereafter grbs) has been widely accepted [24,49,75]. assuming large-scale isotropy for the universe, one would also expect the same property for grbs. there is an increasing amount of evidence that grbs do not all form a physically homogeneous group [5,27–31,39,50]. hence, it is worth investigating whether the physically different subgroups also differ in their angular distributions. the authors have carried out several different tests in recent years [3,4,45,46] probing the intrinsic isotropy in the angular sky-distribution of the grbs collected in the batse catalog [44]. briefly summarizing the results of these studies, one may conclude:

a. the long subgroup (t90 > 10 s) seems to be distributed isotropically;
b. the intermediate subgroup (2 s ≤ t90 ≤ 10 s) is distributed anisotropically on the ≈ (96–97) % significance level;
c. for the short subgroup (t90 < 2 s) the assumption of isotropy is rejected only on the 92 % significance level;
d. the long and short subclasses, respectively, are distributed differently on the 99.3 % significance level.

(for a definition of the subclasses see [29–33,69].)
independently and by different tests, [43] confirmed results a., b. and c., with one essential difference: for the intermediate subclass a much higher — namely 99.89 % — significance level of anisotropy is claimed. again, the short subgroup is found to be "suspicious", but only the ≈ (85–95) % significance level is reached. the long subclass seems to be distributed isotropically. [42] found a significant angular correlation on the 2°–5° scale for grbs of duration t90 < 2 s. [64] reported a correlation between the locations of previously observed short bursts and the positions of galaxies in the local universe, indicating that between 10 and 25 per cent of short grbs originate at low redshifts (z < 0.025). it is a reasonable requirement to continue these tests using more sophisticated procedures in order to see whether the angular distribution of grbs is completely random, or whether it has some sort of regularity. this is the subject of this paper. new tests will be presented here; mainly the clarification of the short subgroup's behaviour is expected from these tests. in this paper, similarly to the previous studies, the intrinsic randomness is tested; this means that the non-uniform sky-exposure function of the batse instrument is eliminated. the paper is organized as follows. the three new tests are described in section 2. this section does not contain new results, but this minimal survey may be useful because the methods are not widely familiar. section 3 contains the statistical tests on the data. section 4 summarizes the results of the statistical tests, and section 5 presents the main conclusions of the paper. the paper is based on the results published in [68]; some preliminary cosmological implications are also discussed by [47,48].

2 mathematical overview

strictly from the mathematical point of view, considering grbs as a point process on the celestial sphere, the first property means that for a given area ω on the sky the probability p(ω) of observing a burst depends only on the size of ω, and not on its location on the sphere. the second property means that for two given disjunct areas ω1, ω2 on the sky the joint probability is given by p(ω1, ω2) = p(ω1)p(ω2), i.e. the probability of observing a burst in ω1 is fully independent of the probability of getting one in ω2. if both properties are fulfilled, the distribution is called completely random (for the astronomical context of spatial point processes, see [55]). there are several tests for checking the complete randomness of point patterns, but these procedures do not always give information on both properties simultaneously. the simplest test for isotropy is a comparison of the number counts of grbs in disjunct areas on the sky. in the case of isotropy, expanding p(ω) into a series of spherical harmonics, all the coefficients in the series, except the 0th order, should equal zero. this property can also be used for testing the isotropy (for testing the dipole and quadrupole moments, see [10]). before performing the isotropy tests, the uneven exposure function of batse has to be taken into account [44]; a toy version of such a test is sketched below.
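the toy sketch compares the dipole moment of the observed positions with its monte carlo distribution under full randomness. for brevity it draws the "observed" sample at random and ignores the exposure function, which a real test must fold into the simulated catalogs.

```python
# toy dipole isotropy test: compare |mean unit vector| of the sample
# with its distribution over fully random catalogs of the same size.
# the "observed" sample below is itself random (a stand-in for data).

import numpy as np

rng = np.random.default_rng(42)

def unit_vectors(n: int) -> np.ndarray:
    v = rng.normal(size=(n, 3))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

def dipole(v: np.ndarray) -> float:
    return float(np.linalg.norm(v.mean(axis=0)))

n = 261                                   # e.g. the size of the short1 sample
observed = dipole(unit_vectors(n))        # stand-in for the real grb positions
mc = np.array([dipole(unit_vectors(n)) for _ in range(2000)])
print(f"fraction of random catalogs with a larger dipole: {(mc > observed).mean():.3f}")
```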
we used three methods, voronoi tesselation, minimal spanning tree and multifractal spectra, to obtain statistical variables suitable for testing the complete randomness of the angular distribution of grbs.

2.1 voronoi tesselation (vt)

the voronoi diagram — also known as dirichlet tesselation or thiessen polygons — is a fundamental structure in computational geometry and arises naturally in many different applications [70,62]. generally, this diagram provides a partition of a point pattern ("point field", also "point process") according to its spatial structure, which can be used for analyzing the underlying point process. let us assume that there are n points (n ≫ 1) scattered on the surface of a sphere of unit radius; we say that a point field is given on the sphere. the voronoi cell [62] of a point is the region of the sphere surface consisting of the points which are closer to this given point than to any other point on the sphere. this cell forms a polygon on the sphere. each such cell has its area (a), given in steradians; perimeter (p), given by the length of the boundary (one great-circle arc of the boundary curve is also called a "chord"); number of vertices (nv), given by a positive integer; and inner angles (αi; i = 1, . . . , nv). this method is completely non-parametric, and may therefore be sensitive to various point-pattern structures in the different subclasses of grbs. the points on a sphere may be distributed completely randomly or non-randomly; a non-random distribution may have different characters (clustering, filaments, etc.; for a survey of these non-random behaviors see, e.g., [19]). the vt method is able both to detect non-randomness and also to describe its form (see [16,17,20,34,35,59,62,63,73,74,76] for the astronomical context).

2.2 minimal spanning tree (mst)

unlike vt, this method considers the distances (edges) among the points (vertices). clearly, there are n(n − 1)/2 distances among n points. a spanning tree is a system of lines connecting all the points without any loops. the minimal spanning tree (mst) is the system of connecting lines for which the sum of the lengths is minimal among all possible connections between the points [40,58]. in this paper, the spherical version of mst is used, following the original paper by prim [58]. the n − 1 separate connecting lines (edges) together define the minimal spanning tree. the statistics of the edge lengths and of the angles αmst between the edges at the vertices can be used for testing the randomness of the point pattern. the mst is widely used in cosmology for studying the statistical properties of galaxy samples [1,6–8,21,41].

fig. 1: application of voronoi tesselation to short grbs (short1 sample) in the 0.65 < p256 < 2.00 peak-flux range, in galactic coordinates. the peak flux is given in units of photons/(cm² s)

fig. 2: mst for the sample in figure 1

fig. 3: mfr spectra of the simulated (dot-dashed), long1 (dashed), short1 (dotted) and short2 (three-dot-dashed) samples. boxes represent the errors of the spectrum points derived from monte carlo simulations. note the shift of the maximum of the spectrum of the short1 sample towards higher values in comparison to α = 2, corresponding to the completely random 2d euclidean case

2.3 multifractal spectrum

let p(ε) denote the probability of finding a point in an area of radius ε. if p(ε) scales as εα (i.e. p(ε) ∝ εα), then α is called the local fractal dimension (e.g. α = 2 for a completely random process on the plane). in the case of a monofractal, α is independent of the position. a multifractal (mfr) on a point process can be defined as a unification of subsets of different (fractal) dimensions [52].
2.3 Multifractal spectrum

Let p(ε) denote the probability of finding a point in an area of radius ε. If p(ε) scales as ε^α (i.e. p(ε) ∝ ε^α), then α is called the local fractal dimension (e.g. α = 2 for a completely random process on the plane). In the case of a monofractal, α is independent of the position. A multifractal (MFR) on a point process can be defined as the union of subsets of different (fractal) dimensions [52].

One usually denotes by f(α) the fractal dimension of the subset of points at which the local fractal dimension lies in the interval [α, α + dα]. The contribution of these subsets to the whole pattern is not necessarily equally weighted; in practice, it depends on the relative abundances of the subsets. The functional relationship f(α) between the fractal dimension of the subsets and the corresponding local fractal dimension is called the MFR or Hausdorff spectrum.

In the vicinity of the i-th point (i = 1, 2, ..., N) one can measure from the neighbourhood structure a local dimension α_i ("Rényi dimension"). This measure approximates the dimension of the embedding subset, giving the possibility to construct the MFR spectrum, which characterizes the whole pattern (for more details see [52]). If the maximum of this convex spectrum is equal to the Euclidean dimension of the space, then in the classical sense the pattern is not a fractal, but the spectrum remains sensitive to the non-randomness of the point set. The concept of a multifractal can be successfully applied to astronomical problems [2,9,11–13,18,22,25,26,36,37,53,54,60,61,65–67].

Fig. 3: MFR spectra of simulated (dot-dashed), long1 (dashed), short1 (dotted) and short2 (three-dot-dashed) samples. Boxes represent the error of the spectrum points derived from Monte Carlo simulations. Note the shift of the maximum of the spectrum of the short1 sample towards higher values in comparison to α = 2, corresponding to the completely random 2D Euclidean case.
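A rough numerical handle on the local dimension α_i is a log-log regression of the neighbour counts around each point, since p(ε) ∝ ε^α implies that log p is linear in log ε. The sketch below illustrates only that single step under simplifying assumptions; the full construction of f(α) involves additional bookkeeping, see [52].

```python
import numpy as np

def local_dimensions(pts, radii):
    """Estimate a local scaling exponent alpha_i for each point on the sphere
    from the growth of neighbour counts with angular radius."""
    ang = np.arccos(np.clip(pts @ pts.T, -1.0, 1.0))
    log_r = np.log(radii)
    alphas = []
    for i in range(len(pts)):
        counts = [(ang[i] < r).sum() - 1 for r in radii]   # exclude point itself
        # slope of log N(<r) versus log r approximates the local dimension
        slope, _ = np.polyfit(log_r, np.log(np.maximum(counts, 1)), 1)
        alphas.append(slope)
    return np.array(alphas)

rng = np.random.default_rng(2)
pts = rng.normal(size=(406, 3))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)
alpha = local_dimensions(pts, radii=np.array([0.1, 0.2, 0.4, 0.8]))
print("local dimensions: mean %.2f (about 2 expected for a random pattern)" % alpha.mean())
```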
3 Evaluation of statistical tests

The three procedures outlined in Section 2 enable us to derive several stochastic quantities well suited for testing the non-randomness of the underlying point patterns.

3.1 Input data and sample definition

Until now, the most comprehensive all-sky survey of GRBs was done by the BATSE experiment on board the CGRO satellite in the period 1991–2000. In this period the experiment collected 2704 well-justified burst events, and the data are available in the current BATSE catalog [44].

Since there is increasing evidence ([31] and references therein) that the GRB population is actually a mixture of astrophysically different phenomena, we divided the GRBs into three groups: short (T90 < 2 s), intermediate (2 s ≤ T90 ≤ 10 s) and long (T90 > 10 s). To avoid problems with a changing detection threshold, we omitted GRBs having a peak flux p256 ≤ 0.65 photons cm⁻² s⁻¹. This truncation was proposed in [56]. The bursts may emerge at very different distances along the line of sight, and it may happen that the stochastic structure of the angular distribution depends on this. Therefore, we also made tests on the bursts with p256 < 2 photons cm⁻² s⁻¹ in the short and long populations, separately. Table 1 defines the 5 samples to be studied here. (Due to the small number of intermediate bursts, this subsample was not divided into faint and bright parts.)

Table 1: Tested samples of BATSE GRBs

  sample        duration [s]     peak flux [photons cm⁻² s⁻¹]   number of GRBs
  short1        T90 < 2          0.65 < p256 < 2                261
  short2        T90 < 2          0.65 < p256                    406
  intermediate  2 ≤ T90 ≤ 10     0.65 < p256                    253
  long1         T90 > 2          0.65 < p256 < 2                676
  long2         T90 > 10         0.65 < p256                    966

3.2 Definition of the test-variables

The randomness of the point field on the sphere can be tested with respect to various criteria. Since different non-random behaviors are sensitive to different types of criteria of non-randomness, it is not necessary that all possible tests using different measures reject the assumption of randomness. In the following, we define several test-variables which are sensitive to different stochastic properties of the underlying point pattern, as proposed by [72]. Any of the quantities characterizing the Voronoi cell, i.e. area A, perimeter P, number of vertices N_v, cell chord length C, and inner angles α_i, can be used as test-variables, and even some combinations of these quantities. We defined the following test-variables:

– cell area A;
– cell vertex (edge) number N_v;
– cell chord length C;
– inner angle α_i;
– round factor (RF) average, RF_av = 4πA/P²;
– round factor (RF) homogeneity, 1 − σ(RF_av)/RF_av;
– shape factor A/P²;
– modal factor σ(α_i)/N_v;
– the so-called "AD factor", defined as AD = 1 − (1 − σ(A)/⟨A⟩)⁻¹, where σ(A) is the dispersion and ⟨A⟩ is the average of A.

To characterize the stochastic properties of a point pattern, we use three quantities obtained from the MST:

– variance of the MST edge length, σ(L_MST);
– mean MST edge length, ⟨L_MST⟩;
– mean angle between edges, ᾱ_MST.

As for the multifractals, the only variable used here is the f(α) spectrum, which is a sensitive tool for testing the non-randomness of a point pattern. Throughout the definitions of these variables, the mean (average) and variance refer to the mean and variance of the respective elements of the Voronoi foam and the MST, respectively.

An important problem is to study the sensitivity (discriminating power) of the different parameters to the different kinds of regularity inherent in the point pattern. In the case of a fully regular mesh, e.g., A is constant, so AD = 0 and σ(α_i) = 0, and both increase towards a fully random distribution. In the case of a patchy pattern, the distribution of the areas of the Voronoi cells and the edge distribution of the MST become bimodal, reflecting the average area and edge length within and between the clusters, in comparison to the fully random case. In a filamentary distribution, the shapes of the cells become strongly distorted, which is reflected in an increase of the modal factor in comparison to the case of patches.

[71] investigated the power of Voronoi tesselation and the minimal spanning tree in discriminating between distributions having big and small clusters, full randomness, and hard cores (random distributions, but with the mutual distances of the points constrained by the size of the hard core). They concluded that the Voronoi roundness factor did not separate small clusters from hard-core distributions, and that the roundness factor homogeneity did not distinguish between small clusters and random distributions, nor between random and hard-core distributions. The MST has a very good discriminating power even in the case of hard-core distributions with close minimal interpoint distances. Since the sensitivity of the variables differs as the regularity properties of the underlying point patterns change, one may measure significant differences in one parameter but not in another, even in cases when these are otherwise correlated. This is not a trivial issue. In most cases, one needs extended numerical simulations to study the statistical significance of the different parameters.

3.3 Estimation of significance

To obtain the empirical distributions of the test-variables, we made 200 simulations for each of the five samples. The number of simulated points was identical to the number in the samples defined in Section 3.1. We generated the fully random catalogs by Monte Carlo (MC) simulations of fully random GRB celestial positions, taking into account the BATSE sky-exposure function [23,44].
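In outline, the empirical significance of a single test-variable is just the rank of the observed value within its simulated distribution. A minimal sketch of this step follows; it is illustrative only, and a toy declination-dependent weight stands in for the real BATSE exposure function of [44].

```python
import numpy as np

def sample_sky(n, rng, exposure=lambda dec: 1.0 + 0.2 * np.sin(dec)):
    """Random celestial positions drawn with a toy, declination-dependent
    exposure weight (a stand-in for the real BATSE exposure function)."""
    out = []
    while len(out) < n:
        ra = rng.uniform(0.0, 2.0 * np.pi)
        dec = np.arcsin(rng.uniform(-1.0, 1.0))       # isotropic in declination
        if rng.uniform(0.0, 1.2) < exposure(dec):     # rejection sampling
            out.append((ra, dec))
    return np.array(out)

def to_xyz(radec):
    ra, dec = radec[:, 0], radec[:, 1]
    return np.column_stack([np.cos(dec) * np.cos(ra),
                            np.cos(dec) * np.sin(ra),
                            np.sin(dec)])

def test_variable(pts):
    """Placeholder statistic: variance of the nearest-neighbour angular distance.
    Any of the 13 VT/MST/MFR variables would be plugged in here instead."""
    ang = np.arccos(np.clip(pts @ pts.T, -1, 1))
    np.fill_diagonal(ang, np.inf)
    return ang.min(axis=1).var()

rng = np.random.default_rng(3)
observed = test_variable(to_xyz(sample_sky(253, rng)))    # pretend data
sims = np.array([test_variable(to_xyz(sample_sky(253, rng))) for _ in range(200)])
significance = 100.0 * np.mean(sims < observed)           # one-sided rank in %
print(f"empirical significance: {significance:.2f}%")
```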
Table 2: Calculated significance levels (in %) for the 13 test-variables and the five samples. A calculated numerical significance greater than 95% is printed in bold face.

  name                        variable                  short1   short2   interm.   long1   long2
  cell area                   A                         36.82    29.85    94.53     79.60   82.59
  cell vertex (edge) number   N_v                       36.82    87.06    2.99      26.87   7.96
  cell chord length           C                         47.26    52.24    18.91     84.58   54.23
  inner angle                 α_i                       96.52    21.39    87.56     37.81   63.18
  RF average                  4πA/P²                    65.17    99.98    33.83     10.95   86.07
  RF homogeneity              1 − σ(RF_av)/RF_av        19.90    24.38    58.71     55.72   32.84
  shape factor                A/P²                      91.04    94.03    90.05     55.22   63.68
  modal factor                σ(α_i)/N_v                97.51    1.99     7.46      56.22   8.96
  AD factor                   1 − (1 − σ(A)/⟨A⟩)⁻¹      32.84    25.37    11.44     95.52   52.74
  MST variance                σ(L_MST)                  52.74    38.31    22.39     13.93   59.70
  MST average                 ⟨L_MST⟩                   97.51    7.46     89.05     56.72   8.96
  MST angle                   ᾱ_MST                     85.07    14.43    36.82     73.63   60.70
  MFR spectra                 f(α)                      95.52    96.02    98.01     73.63   36.32
  squared Euclidean distance                            99.90    99.98    98.51     93.03   36.81

Assuming that the point patterns obtained from the five samples defined in Table 1 are fully random, we calculated the probabilities for all the 13 test-variables selected in Section 3.2. Based on the simulated distributions, we computed the level of significance for all the 13 test-variables in all defined samples.

4 Discussion of the statistical results

4.1 Evaluation of the joint significance levels

We assigned to each MC simulated sample 13 values of the test-variables and, consequently, a point in the 13D parameter space. Completing 200 simulations for each of the subsamples, we get in this way a 13D sample representing the joint probability distribution of the 13 test-variables. Using the Euclidean distance of the points from the sample mean, we can get a stochastic variable characterizing the deviation of the simulated points from the mean occurring only by chance. An obvious choice would be the squared Euclidean distance. In the case of a Gaussian distribution with unit variances and without correlations, this would result in a χ² distribution with 13 degrees of freedom. The test-variables in our case are correlated and have different scales. Before computing squared Euclidean distances, we therefore transformed the test-variables into non-correlated ones with unit variances.

Due to the strong correlation between some of the test-variables, we may assume that the observed quantities can be represented by a smaller number of non-correlated variables. Factor analysis (FA) is a suitable way to represent the correlated observed variables by a smaller number of non-correlated variables. Since our test-variables are stochastically dependent, following [72] we attempted to represent them by fewer non-correlated hidden variables, assuming that

$$x_i = \sum_{j=1}^{k} a_{ij} f_j + s_i, \qquad i = 1, \dots, p, \quad k < p. \quad (1)$$

In the above equation x_i, f_j and s_i denote the test-variables (p = 13 in our case), the hidden variables and a noise term, respectively. Equation (1) represents the basic model of FA. The covariance matrix of the x_i variables has p(p + 1)/2 free elements. The number of free elements on the right-hand side of Eq. (1) cannot exceed this value [38], yielding for k the inequality

$$k \le \frac{2p + 1 - \sqrt{8p + 1}}{2}, \quad (2)$$

which gives k ≤ 8.377 in our case. Factor analysis is a common ingredient of professional statistical software packages (BMDP, SAS, S-PLUS, SPSS, etc., which are registered trademarks). The default solution for identifying the factor model is to perform principal component analysis (PCA). We kept as many factors as were meaningful with respect to Equation (2). Taking into account the constraint imposed by Equation (2), we retained 8 factors.
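The decorrelation-and-distance step translates almost directly into scikit-learn. The fragment below is a schematic reconstruction, not the original analysis: it whitens the 13 simulated test-variables, keeps 8 components, and measures each sample's squared Euclidean distance from the simulation mean. The random matrices are placeholders for the actual simulated and observed test-variable values.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(4)
X_sim = rng.normal(size=(200, 13))       # 200 simulations x 13 test-variables
x_obs = rng.normal(size=(1, 13))         # observed (BATSE) test-variables

# Whitened PCA: 8 non-correlated hidden variables with unit variance,
# k = 8 being the bound from inequality (2) for p = 13
pca = PCA(n_components=8, whiten=True).fit(X_sim)
F_sim, f_obs = pca.transform(X_sim), pca.transform(x_obs)

d2_sim = (F_sim ** 2).sum(axis=1)        # squared distances, cf. eq. (3) below
d2_obs = (f_obs ** 2).sum()
significance = 100.0 * np.mean(d2_sim < d2_obs)
print(f"d^2 = {d2_obs:.2f}, empirical significance {significance:.2f}%")
```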
In this way we projected the joint distribution of the test-variables in the 13D parameter space into an 8D parameter space defined by the non-correlated hidden variables f_i. The hidden variables f_j in Equation (1) are non-correlated and have zero means and unit standard deviations. Using these variables, we defined the following squared Euclidean distance from the sample mean:

$$d^2 = f_1^2 + f_2^2 + \dots + f_k^2 \qquad (k = 8 \text{ in our case}). \quad (3)$$

If the f_j variables had strictly Gaussian distributions, Equation (3) would define a χ² variable with k degrees of freedom.

Fig. 4: Distribution of the squared Euclidean distances of the simulated samples from the stochastic mean of the hidden variables (factors) f_i in the 8D parameter space. There are altogether 1000 simulated points. The full line marks a χ² distribution with 8 degrees of freedom, normalized to the sample size. The distances of the BATSE samples are also indicated. The departures of the samples "short1" and "short2" exceed all those of the simulated points. The probabilities that these deviations are non-random equal 99.9% and 99.98%, respectively.

4.2 Interpretations of the statistical results

Using the distribution of the squared Euclidean distances defined by Equation (3), one can get further information on whether a BATSE sample, represented by a point in the parameter space of the test-variables, deviates only by chance, or whether it differs significantly from the fully random case. In all categories (short1, short2, intermediate, long1, long2) we made 200 simulations, altogether 1000. We calculated the squared distances d² for all simulations and compared them with those of the BATSE samples of Table 1. Figure 4 shows a histogram of the simulated squared distances along with those of the BATSE samples. A full line represents a χ² distribution with k = 8 degrees of freedom. Figure 4 clearly shows that the departures of the samples short1 and short2 exceed all those of the simulated points. The probabilities that these deviations are non-random equal 99.9% and 99.98%, respectively.

The full randomness of the angular distribution of the long GRBs, in contrast to the regularity of the short and, to some extent, the intermediate ones, points towards differences in the angular distribution of their progenitors. The recent discovery of the afterglow of some short GRBs indicates that these events are associated with the old stellar population [24], probably accounted for by the mergers of compact binaries, in contrast to the long bursts resulting from the collapse of very massive stellar objects in young star-forming regions. The differences in progenitors also reflect the differences between the energy released by the short and long GRBs. As [33] showed, the redshift distributions of the different GRB classes are different. The average z of the short bursts is significantly smaller than that of the long ones (the average redshift of the intermediate bursts lies between them). Consequently, the short and long GRBs sample different volumes. The sampling volume of the short bursts is much smaller, and the irregularities in the distribution of the host galaxies play a more significant role.

Unfortunately, little can be said about the physical nature of the intermediate class. Statistical studies ([31] and the references therein) suggest the existence of this subgroup, at least from the purely statistical point of view.
A non-random sky distribution also occurs here. However, its physical origin is still fully open [31].

5 Summary and conclusions

We have made additional studies of the degree of randomness in the angular distribution of samples selected from the BATSE catalog. According to the T90 durations and the p256 peak fluxes of the GRBs in the catalog, we defined five groups: short1 (T90 < 2 s & 0.65 < p256 < 2), short2 (T90 < 2 s & 0.65 < p256), intermediate (2 s ≤ T90 ≤ 10 s & 0.65 < p256), long1 (T90 > 2 s & 0.65 < p256 < 2) and long2 (T90 > 10 s & 0.65 < p256). To characterize the statistical properties of the point patterns given by the samples, we defined 13 test-variables based on Voronoi tesselation (VT), the minimal spanning tree (MST) and multifractal spectra. For all five defined GRB samples we made 200 numerical simulations, assuming a fully random angular distribution and taking into account the BATSE exposure function. The numerical simulations enabled us to define the empirical probabilities for testing the null hypothesis, i.e. the assumption that the angular distributions of the BATSE samples are fully random.

Since we performed 13 single tests simultaneously on each subsample, the significance obtained by calculating it separately for each test cannot be treated as a true indication of deviation from the fully random case. In fact, some of the test-variables are strongly correlated. To concentrate the information on the non-randomness captured by the test-variables, we assumed that they can be represented as a linear combination of a smaller number of non-correlated hidden factors. We estimated the number of hidden factors as k = 8. Making use of the hidden factors, we computed the distribution of the squared Euclidean distances from the mean of the simulated variables. Comparing the distribution of the squared Euclidean distances of the simulated samples with the BATSE samples, we concluded that the short1 and short2 groups deviate significantly (99.90%, 99.98%) from full randomness, but this is not the case for the long samples. For the intermediate group, the squared Euclidean distance also gives a significant deviation (98.51%).

Acknowledgement

This study was supported by OTKA grant No. K077795, by Grant Agency of the Czech Republic grant No. P209/10/0734, and by Research Program MSM0021620860 of the Ministry of Education of the Czech Republic (A.M.).

References

[1] Adami, C., Mazure, A.: A&AS, 134, 393, 1999.
[2] Aschwanden, M. J., Parnell, C. E.: ApJ, 572, 1048, 2002.
[3] Balázs, L. G., Mészáros, A., Horváth, I.: A&A, 339, 1, 1998.
[4] Balázs, L. G., Mészáros, A., Horváth, I., Vavrek, R.: A&A Suppl., 138, 417, 1999.
[5] Balázs, L. G., Bagoly, Z., Horváth, I., Mészáros, A., Mészáros, P.: A&A, 401, 129, 2003.
[6] Barrow, J. D., Bhavsar, S. P., Sonoda, D. H.: MNRAS, 216, 17, 1985.
[7] Bhavsar, S. P., Lauer, D. A.: Examining the Big Bang and Diffuse Background Radiations, in Kafatos, M. C., Kondo, Y. (eds.), Proc. IAU Symp. 168, Dordrecht: Kluwer, p. 517, 1996.
[8] Bhavsar, S. P., Splinter, R. J.: MNRAS, 282, 1461, 1996.
[9] Bottorff, M., Ferland, G.: ApJ, 549, 118, 2001.
[10] Briggs, M.: ApJ, 407, 126, 1993.
[11] Casuso, E., Beckman, J.: PASJ, 54, 405, 2002.
[12] Célérier, M. N., Thieberger, R.: A&A, 367, 449, 2001.
[13] Chappell, D., Scalo, J.: ApJ, 551, 71, 2001.
[14] Cline, D. B., Matthey, C., Otwinowski, S.: Gamma-Ray Bursts: 5th Huntsville Symp., in Kippen, R. M., Mallozzi, R. S., Fishman, G. J. (eds.), AIP Conf. Proc., New York: Melville, p. 97, 2000.
[15] Cline, D. B., Matthey, C., Otwinowski, S.: ApJ, 527, 827, 2000.
[16] Coles, P., Barrow, J. D.: MNRAS, 244, 557, 1990.
[17] Coles, P.: Nature, 349, 288, 1991.
[18] Datta, S.: A&A, 401, 193, 2003.
[19] Diggle, P. J.: Statistical Analysis of Spatial Point Patterns. London: Academic Press, 1983.
[20] Doroshkevich, A. G., Gottlöber, S., Madsen, S.: A&A Suppl., 123, 495, 1997.
[21] Doroshkevich, A. G., Turchaninov, V.: in Banday, A. J., Zaroubi, S., Bartelmann, M. (eds.), Proc. 'Mining the Sky' MPA/ESO/MPE Workshop, Springer-Verlag, p. 283, 2001.
[22] Elmegreen, B. G.: ApJ, 564, 773, 2002.
[23] Fishman, G. J., Meegan, C. A., Wilson, R. B., et al.: ApJS, 92, 229, 1994.
[24] Fox, D. B., Frail, D. A., Price, P. A., et al.: Nature, 437, 845, 2005.
[25] Gaite, J., Manrubia, S. C.: MNRAS, 335, 977, 2002.
[26] Giraud, E.: ApJ, 544, L41, 2000.
[27] Hakkila, J., Haglin, D. J., Pendleton, G. N., et al.: ApJ, 538, 165, 2000.
[28] Hakkila, J., Giblin, T. W., Roiger, R. J., et al.: ApJ, 582, 320, 2003.
[29] Horváth, I.: ApJ, 508, 757, 1998.
[30] Horváth, I.: A&A, 392, 791, 2002.
[31] Horváth, I., Balázs, L. G., Bagoly, Z., Ryde, F., Mészáros, A.: A&A, 447, 23, 2006.
[32] Horváth, I., Balázs, L. G., Bagoly, Z., Veres, P.: A&A, 489, L1, 2008.
[33] Horváth, I., Bagoly, Z., Balázs, L. G., de Ugarte Postigo, A., Veres, P., Mészáros, A.: ApJ, 713, 552, 2010.
[34] Icke, V., van de Weygaert, R.: QJRAS, 32, 85, 1991.
[35] Ikeuchi, S., Turner, E. L.: MNRAS, 250, 519, 1991.
[36] Irwin, J. A., Widrow, L. M., English, J.: ApJ, 529, 77, 2000.
[37] Kawaguchi, T., Mineshige, S., Machida, M., Matsumoto, R., Shibata, K.: PASJ, 52, L1, 2000.
[38] Kendall, M. G., Stuart, A.: The Advanced Theory of Statistics. London – High Wycombe: Charles Griffin & Co. Ltd., 1973.
[39] Kouveliotou, C., et al.: ApJ, 413, L101, 1993.
[40] Kruskal, J. B.: Proc. Am. Math. Soc., 7, 48, 1956.
[41] Krzewina, L. G., Saslaw, W. C.: MNRAS, 278, 869, 1996.
[42] Magliocchetti, M., Ghirlanda, G., Celotti, A.: MNRAS, 343, 255, 2003.
[43] Litvin, V. F., Matveev, S. A., Mamedov, S. V., Orlov, V. V.: Pis'ma v Astron. Zhurnal, 27, 489, 2001.
[44] Meegan, C. A., et al.: BATSE Gamma-Ray Bursts Catalog, 2000. http://gammaray.msfc.nasa.gov/batse/grb/catalog
[45] Mészáros, A., Bagoly, Z., Vavrek, R.: A&A, 354, 1, 2000.
[46] Mészáros, A., Bagoly, Z., Horváth, I., Balázs, L. G., Vavrek, R.: ApJ, 539, 98, 2000.
[47] Mészáros, A., Balázs, L. G., Bagoly, Z., Veres, P.: Gamma-Ray Bursts: Sixth Huntsville Symposium, 20–23 October 2008, C. Meegan, N. Gehrels, C. Kouveliotou (eds.), AIP Conference Proceedings, 1133, 483, 2009.
[48] Mészáros, A., Balázs, L. G., Bagoly, Z., Veres, P.: Baltic Astronomy, 18, 293, 2009.
[49] Mészáros, P.: Rep. Prog. Phys., 69, 2259, 2006.
[50] Mukherjee, S., et al.: ApJ, 508, 314, 1998.
[51] Murtagh, F., Heck, A.: Multivariate Data Analysis, Astrophysics and Space Science Library. Dordrecht: Reidel, 1987.
[52] Paladin, G., Vulpiani, A.: Physics Reports, 156, 1, 1987.
[53] Pan, J., Coles, P.: MNRAS, 318, L51, 2000.
[54] Pan, J., Coles, P.: MNRAS, 330, 719, 2002.
[55] Pásztor, L., Tóth, L. V.: in Shaw, R. A., Payne, H. E., Hayes, J. J. E. (eds.), ADASS IV, ASP Conf. Ser. Vol. 77, p. 319, 1995.
[56] Pendleton, C. N., Paciesas, W. S., Briggs, M. S., et al.: ApJ, 489, 175, 1997.
[57] Press, W. H., Flannery, B. P., Teukolsky, S. A., Vetterling, W. T.: Numerical Recipes. Cambridge: Cambridge University Press, 1992.
[58] Prim, R. C.: Bell Syst. Techn. Journ., 36, 1389, 1957.
[59] Ramella, M., Boschin, W., Fadda, D., Nonino, M.: A&A, 368, 776, 2001.
[60] Selman, F. J., Melnick, J.: ApJ, 534, 703, 2000.
[61] Semelin, B., Combes, F.: A&A, 387, 98, 2002.
[62] Stoyan, D., Stoyan, H.: Fractals, Random Shapes and Point Fields. New York: Wiley J. & Sons, 1994.
[63] Subbarao, M. U., Szalay, A. S.: ApJ, 391, 483, 1992.
[64] Tanvir, N. R., Chapman, R., Levan, A. J., Priddey, R. S.: Nature, 438, 991, 2005.
[65] Tatekawa, T., Maeda, K.: ApJ, 547, 531, 2001.
[66] Tikhonov, A. V.: Astrophysics, 45, 79, 2002.
[67] Vavrek, R., Balázs, L. G., Epchtein, N.: in Montmerle, T., André, P. (eds.), 'From Darkness to Light', ASP Conf. Ser., Vol. 243, p. 149, 2001.
[68] Vavrek, R., Balázs, L. G., Mészáros, A., Horváth, I., Bagoly, Z.: MNRAS, 391, 1741, 2008.
[69] Veres, P., Bagoly, Z., Horváth, I., Mészáros, A., Balázs, L. G.: ApJ, 725, 1955, 2010.
[70] Voronoi, G.: J. Reine Angew. Math., 134, 198, 1908.
[71] Wallet, F., Dussert, C.: J. Theor. Biol., 187, 437, 1997.
[72] Wallet, F., Dussert, C.: Europhys. Lett., 42, 493, 1998.
[73] van de Weygaert, R.: A&A, 283, 361, 1994.
[74] Yahagi, H., Mori, M., Yoshii, Y.: ApJS, 124, 1, 1999.
[75] Zhang, B., Mészáros, P.: IJMPA, 19, 2385, 2004.
[76] Zaninetti, L.: A&AS, 109, 71, 1995.

L. G. Balázs, Konkoly Observatory, P. O. Box 67, H-1525 Budapest, Hungary
A. Mészáros, Charles University, Faculty of Mathematics and Physics, Astronomical Institute, V Holešovičkách 2, 180 00 Prague 8, Czech Republic
I. Horváth, Department of Physics, Bolyai Military University, P. O. Box 15, H-1581 Budapest, Hungary
Z. Bagoly, P. Veres, Department of Physics, Bolyai Military University, P. O. Box 15, H-1581 Budapest, Hungary; Laboratory for Information Technology, Eötvös University, Pázmány Péter sétány 1/A, H-1518 Budapest, Hungary
G. Tusnády, Rényi Alfréd Mathematical Institute, Budapest, Hungary

Simple numerical model of laminated glass beams

A. Zemanová, J. Zeman, M. Šejnoha

Abstract: This paper presents a simple finite element model aimed at efficient simulation of layered glass units. The approach is based on considering the independent kinematics of each layer, tied together via Lagrange multipliers. Validation and verification of the resulting model against independent data demonstrate its accuracy, showing its potential for generalization towards more complex problems.

Keywords: laminated glass beams, finite element method, Lagrange multipliers.

1 Introduction

Glass is the most frequently used transparent material in building envelopes. It is a fragile material, which fails in a brittle manner. For this reason, safety glasses are used in situations where there is a possibility of human impact or where the glass could fall if shattered. Laminated glass is a multi-layer material produced by bonding two or more layers of glass together with a plastic interlayer, typically made of polyvinyl butyral (PVB). The interlayer keeps the layers of glass bonded even when broken, and its high strength prevents the glass from breaking up into large sharp pieces. This produces a characteristic "spider web" cracking pattern when the impact is not powerful enough to completely pierce the glass. Multiple laminae and thicker glass decrease the stress level, thereby also increasing the load-carrying capacity of the structural member.

The focus of this study is on establishing a simple and versatile framework for the analysis of the mechanical behavior of laminated glass units. To keep the discussion compact, we restrict our attention to the linearly elastic response of layered glass beams in the small strain regime. The rest of the paper is organized as follows. Methods for the analysis of laminated glass beams are introduced in Section 2, together with a brief characterization of the proposed numerical model. The principles of the method are described in detail in Sections 3 and 4. In particular, the mechanical formulation of the model is shown in Section 3. Finite element discretization is presented in Section 4.
In Section 5, the proposed numerical technique is verified and validated against a reference analytical solution and publicly available experimental data. Finally, Section 6 concludes the paper and discusses future extensions of the method.

2 Brief overview of available methods

For a long time, the most frequent approach to the analysis of glass structural elements was based on empirical knowledge. Such relations are sufficient for the design of traditional window glasses. However, in modern architecture there has been a steadily growing demand in recent decades for transparent materials for large external walls and roof systems. Therefore, a detailed analysis of layered glass units is becoming increasingly important in order to ensure reliable and efficient design.

In general, the complex behavior of laminated glass can be considered as an intermediate state of two limiting cases [1]. In the first case, the structure is treated as an assembly of two independent glass plates without any interlayer (the lower bound on the stiffness and strength of a member), while in the second case, corresponding to the upper estimate of strength and stiffness, the glass unit is modeled as monolithic glass (one glass plate equal in thickness to the total thickness of the glass plates). Both elementary cases, however, fail to correctly capture the complex interaction among the individual layers, leading to non-optimal layer thickness designs.

Therefore, several alternative approaches to the analysis of layered glass structures have been proposed in the literature. These methods can be categorized into three basic groups:
– methods calibrated with respect to experimental measurements [2],
– analytical approaches [3, 4, 5],
– numerical models typically based on detailed finite element simulations [6, 7].

The applicability of analytical approaches to practical (usually large-scale) structures is far from straightforward. In particular, a closed-form solution of the resulting equations is possible only for very specific boundary conditions, and the equations therefore have to be solved by an appropriate numerical method. Moreover, the analytical approaches are rather difficult to generalize to beams with multiple layers. Therefore, it appears to be advantageous to formulate the problem directly in the discretized form, typically based on the finite element method (FEM). Nevertheless, we would like to avoid fully resolved two- or three-dimensional simulations, cf. [6, 7], which lead to unnecessarily expensive calculations.

In this paper, we propose a simple FEM model inspired by a specific class of refined plate theories [8, 9, 10]. In this framework, each layer is treated as a Timoshenko beam with independent kinematics. The interaction between the individual layers is captured by Lagrange multipliers (with the physical meaning of shear stresses), which result from the conditions of compatibility of the inter-layer displacements. Such a refined approach circumvents the limitation of similar models available in typical commercial FEM systems, which employ a single set of kinematic variables and average the mechanical response through the thickness of the beam, e.g. [11]. Unlike the proposed approach, the averaging operation is too coarse to correctly represent the inter-layer interactions; see Section 5 for a concrete example.
3 Mechanical model of laminated beams

As illustrated in Fig. 1, laminated glasses consist mostly of three layers. A local coordinate system is attached to each layer to allow for an efficient treatment of the independent kinematics. In the following text, a quantity a expressed in the local coordinate system associated with the i-th layer is denoted a^{(i)}, whereas a variable without an index represents a globally defined quantity, cf. Fig. 1.

Each layer is modeled using the Timoshenko beam theory supplemented with membrane effects. Hence, the following kinematic assumptions are adopted:
– the cross sections remain planar, but not necessarily perpendicular to the deformed beam axis,
– the vertical displacement does not vary along the height of the beam (when compared to the magnitude of the displacement).

Under these assumptions, the non-zero displacement components in each layer are parametrized as

$$u^{(i)}(x, z^{(i)}) = u^{(i)}(x, 0) + \varphi^{(i)}(x)\, z^{(i)}, \qquad w^{(i)}(x, z^{(i)}) = w(x),$$

where i = 1, 2, 3 and z^{(i)} is measured in the local coordinate system from the middle plane of the i-th layer. The inter-layer interaction is ensured via the continuity conditions specified on the interfaces between the layers in the form (i = 1, 2)

$$u^{(i)}\!\left(x, \tfrac{h^{(i)}}{2}\right) - u^{(i+1)}\!\left(x, -\tfrac{h^{(i+1)}}{2}\right) = 0. \quad (1)$$

The strain field in the i-th layer follows from the strain-displacement relations [12, 11]

$$\varepsilon_x^{(i)}(x, z^{(i)}) = \frac{\mathrm{d}u^{(i)}}{\mathrm{d}x}(x, 0) + z^{(i)} \frac{\mathrm{d}\varphi^{(i)}}{\mathrm{d}x}(x), \qquad \gamma_{xz}^{(i)}(x) = \varphi^{(i)}(x) + \frac{\mathrm{d}w}{\mathrm{d}x}(x), \quad (2)$$

which, when combined with the constitutive equations of each layer, expressed in terms of Young's modulus E and the shear modulus G,

$$\sigma_x^{(i)}(x, z^{(i)}) = E^{(i)} \varepsilon_x^{(i)}(x, z^{(i)}), \qquad \tau_{xz}^{(i)}(x) = G^{(i)} \gamma_{xz}^{(i)}(x),$$

yield the expressions for the internal forces:

$$N_x^{(i)}(x) = E^{(i)} A^{(i)} \frac{\mathrm{d}u^{(i)}}{\mathrm{d}x}(x, 0), \qquad V_{xz}^{(i)}(x) = k G^{(i)} A^{(i)} \left( \varphi^{(i)}(x) + \frac{\mathrm{d}w}{\mathrm{d}x}(x) \right), \qquad M_y^{(i)}(x) = E^{(i)} I^{(i)} \frac{\mathrm{d}\varphi^{(i)}}{\mathrm{d}x}(x),$$

where b and h^{(i)} are the width and height of the i-th layer (recall Fig. 1), and k = 5/6, A^{(i)} = b h^{(i)} and I^{(i)} = b (h^{(i)})³/12 stand for the shear correction factor, the cross-section area and the moment of inertia of the i-th layer, respectively.
To proceed, consider the weak form of the governing equations, written for the i-th layer (the subscripts •x and •z related to internal forces and kinematics-related quantities are omitted from now on, for the sake of brevity):

$$\int_0^L \frac{\mathrm{d}\delta u^{(i)}}{\mathrm{d}x}\, E^{(i)} A^{(i)} \frac{\mathrm{d}u^{(i)}}{\mathrm{d}x}\, \mathrm{d}x = \int_0^L \delta u^{(i)} f_x^{(i)}\, \mathrm{d}x + \left[ \delta u^{(i)} N^{(i)} \right]_0^L,$$

$$\int_0^L \frac{\mathrm{d}\delta w}{\mathrm{d}x}\, k G^{(i)} A^{(i)} \gamma^{(i)}\, \mathrm{d}x = \int_0^L \delta w\, f_z^{(i)}\, \mathrm{d}x + \left[ \delta w\, V^{(i)} \right]_0^L,$$

$$\int_0^L \frac{\mathrm{d}\delta \varphi^{(i)}}{\mathrm{d}x}\, E^{(i)} I^{(i)} \frac{\mathrm{d}\varphi^{(i)}}{\mathrm{d}x}\, \mathrm{d}x + \int_0^L \delta \varphi^{(i)}\, k G^{(i)} A^{(i)} \gamma^{(i)}\, \mathrm{d}x = \left[ \delta \varphi^{(i)} M^{(i)} \right]_0^L,$$

$$\int_0^L \delta \gamma^{(i)}\, k G^{(i)} A^{(i)} \left( \gamma^{(i)} - \varphi^{(i)} - \frac{\mathrm{d}w}{\mathrm{d}x} \right) \mathrm{d}x = 0,$$

to be satisfied for arbitrary admissible test fields δu^{(i)}, δφ^{(i)} and δw. In particular, the first three equations correspond to the equilibrium conditions written for the normal and shear forces and the bending moments, respectively. The last identity enforces the geometric relation (2) in integral form, thereby allowing the shear strain to be treated as an independent field in the discretization procedure discussed next. Further note that the continuity conditions (1) will be introduced directly into the discretized formulation, as explained in the following section.

Fig. 1: Kinematics of a laminated beam (three layers, glass/PVB/glass, with layer heights h^{(i)}, local axial displacements u^{(i)}, rotations φ^{(i)} and common deflection w).

4 Finite element discretization

To keep the discretization procedure transparent, it is assumed that each layer of the laminated beam is divided into an identical number of elements, leading to the discretization scheme illustrated in Fig. 2. Following the standard conforming finite element machinery, e.g. [12, 11], we express the searched and test displacement fields at the element level in the form

$$u_e^{(i)}(x) = \mathbf{N}_{e,u}^{(i)}(x)\, \mathbf{r}_{e,u}^{(i)}, \quad w_e(x) = \mathbf{N}_{e,w}(x)\, \mathbf{r}_{e,w}, \quad \varphi_e^{(i)}(x) = \mathbf{N}_{e,\varphi}^{(i)}(x)\, \mathbf{r}_{e,\varphi}^{(i)}, \quad \gamma_e^{(i)}(x) = \mathbf{N}_{e,\gamma}^{(i)}(x)\, \mathbf{r}_{e,\gamma}^{(i)},$$

and analogously for the corresponding test fields, where e denotes the element number, N_{e,•}^{(i)} is the associated matrix of basis functions and r_{e,•}^{(i)} is the matrix of nodal unknowns. In the actual implementation, the fields u^{(i)}, w_e and φ_e^{(i)}, as well as the corresponding test quantities, are assumed to be piecewise linear. To obtain a locking-free element, the shear strain γ_e^{(i)} is taken as constant and is eliminated using static condensation; see [12, 11] for additional details.

To simplify the further treatment, we consider the following partitioning of the stiffness matrix K and the right-hand side matrix R related to the e-th element and the i-th layer after static condensation:

$$\mathbf{K}_e^{(i)} = \begin{bmatrix} \mathbf{K}_{e}^{(i)} & \mathbf{K}_{ew}^{(i)} \\ \mathbf{K}_{we}^{(i)} & \mathbf{K}_{w}^{(i)} \end{bmatrix}, \qquad \mathbf{R}_e^{(i)} = \begin{bmatrix} \mathbf{R}_{e}^{(i)} \\ \mathbf{R}_{e,w}^{(i)} \end{bmatrix},$$

where $\mathbf{K}_{ew}^{(i)} = (\mathbf{K}_{we}^{(i)})^{\mathsf T}$ and

$$\mathbf{r}_e^{(i)} = \left[ u_{e,a}^{(i)},\ u_{e,b}^{(i)},\ \varphi_{e,a}^{(i)},\ \varphi_{e,b}^{(i)} \right]^{\mathsf T}, \qquad \mathbf{r}_{e}^{w} = \left[ w_{e,a},\ w_{e,b} \right]^{\mathsf T}.$$
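The condensed layer element is small enough to write out explicitly. The following Python sketch is our illustrative reconstruction, not the authors' MATLAB code: eliminating the constant shear strain from a two-node linear element is algebraically equivalent to one-point integration of the shear term, which is what the closed-form matrices below implement.

```python
import numpy as np

def layer_element(E, G, b, h, L, k=5/6):
    """Condensed stiffness blocks of one layer element.
    Ordering: r = [u_a, u_b, phi_a, phi_b], r_w = [w_a, w_b]."""
    A, I = b * h, b * h**3 / 12.0
    S = k * G * A                                                  # shear stiffness
    K = np.zeros((4, 4))
    K[:2, :2] = E * A / L * np.array([[1, -1], [-1, 1]])           # membrane
    K[2:, 2:] = (E * I / L * np.array([[1, -1], [-1, 1]])          # bending
                 + S * L / 4 * np.array([[1, 1], [1, 1]]))         # shear, 1-pt rule
    K_w = S / L * np.array([[1, -1], [-1, 1]])                     # w-w block
    K_ew = np.zeros((4, 2))
    K_ew[2:, :] = S / 2 * np.array([[-1, 1], [-1, 1]])             # phi-w coupling
    return K, K_ew, K_w

# Example: a 5 mm glass layer (E = 64.5 GPa, nu = 0.23) on a 5 cm element
K, K_ew, K_w = layer_element(E=64.5e9, G=64.5e9 / (2 * 1.23), b=0.1, h=0.005, L=0.05)
```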
Considering all three layers in Fig. 2 together gives the resulting system of governing equations in the form

$$\begin{bmatrix} \mathbf{K}_e^{(1)} & \mathbf{0} & \mathbf{0} & \mathbf{K}_{ew}^{(1)} & (\mathbf{E}_e^{(1)})^{\mathsf T} \\ \mathbf{0} & \mathbf{K}_e^{(2)} & \mathbf{0} & \mathbf{K}_{ew}^{(2)} & (\mathbf{E}_e^{(2)})^{\mathsf T} \\ \mathbf{0} & \mathbf{0} & \mathbf{K}_e^{(3)} & \mathbf{K}_{ew}^{(3)} & (\mathbf{E}_e^{(3)})^{\mathsf T} \\ \mathbf{K}_{we}^{(1)} & \mathbf{K}_{we}^{(2)} & \mathbf{K}_{we}^{(3)} & \mathbf{K}_w & \mathbf{0} \\ \mathbf{E}_e^{(1)} & \mathbf{E}_e^{(2)} & \mathbf{E}_e^{(3)} & \mathbf{0} & \mathbf{0} \end{bmatrix} \begin{bmatrix} \mathbf{r}_e^{(1)} \\ \mathbf{r}_e^{(2)} \\ \mathbf{r}_e^{(3)} \\ \mathbf{r}_e^{w} \\ \boldsymbol{\lambda}_e \end{bmatrix} = \begin{bmatrix} \mathbf{R}_e^{(1)} \\ \mathbf{R}_e^{(2)} \\ \mathbf{R}_e^{(3)} \\ \mathbf{R}_{e,w}^{(1)} + \mathbf{R}_{e,w}^{(2)} + \mathbf{R}_{e,w}^{(3)} \\ \mathbf{0} \end{bmatrix},$$

where the matrix λ_e stores the nodal values of the Lagrange multipliers associated with the compatibility constraint (1), K_w = Σ_i K_w^{(i)}, and the matrix E_e = [E_e^{(1)}, E_e^{(2)}, E_e^{(3)}], written column-wise over the nodal vectors r_e^{(1)}, r_e^{(2)}, r_e^{(3)} as

$$\mathbf{E}_e = \begin{bmatrix} 1 & 0 & \frac{h^{(1)}}{2} & 0 & -1 & 0 & \frac{h^{(2)}}{2} & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & \frac{h^{(1)}}{2} & 0 & -1 & 0 & \frac{h^{(2)}}{2} & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & \frac{h^{(2)}}{2} & 0 & -1 & 0 & \frac{h^{(3)}}{2} & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & \frac{h^{(2)}}{2} & 0 & -1 & 0 & \frac{h^{(3)}}{2} \end{bmatrix},$$

implements the tying conditions (1) at the two element nodes and the two inter-layer interfaces.

5 Verification and validation of the numerical model

To verify and validate the performance of the present approach, the previously described FEM model was implemented in the MATLAB® system and compared with the predictions of the analytical model and the experimental data for a three-point bending test on the simply supported laminated glass beam presented in [5]; see also Fig. 3. The width of the beam is b = 0.1 m, and the material data of the individual components of the structure are given in Table 1.

Fig. 2: Finite element discretization of the i-th layer (element e of length L_e with nodal unknowns u^{(i)}_{e,a}, w_{e,a}, φ^{(i)}_{e,a} and u^{(i)}_{e,b}, w_{e,b}, φ^{(i)}_{e,b}).

Fig. 3: Three-point bending setup for the simply supported beam (layers: glass 5 mm, PVB 0.38 mm, glass 5 mm; dimensions 10 cm + 40 cm + 40 cm + 10 cm; load F and strain gage at mid-span).

Table 1: Material data

                        glass      PVB
  Young's modulus, E    64.5 GPa   1.287 MPa
  Poisson's ratio, ν    0.23       0.4

The modulus of elasticity of the glass is slightly lower than the conventional values of 70–73 GPa reported in the literature, and is specific to the material employed in the experiment. Moreover, as the PVB layer shows viscoelastic and temperature-dependent behavior, the modulus of elasticity corresponds to an effective secant value related to the given load duration and a temperature of 22 °C.
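To make the tying step concrete, the sketch below assembles a layered beam from the element blocks of the previous fragment (it reuses layer_element defined there) and enforces the interface constraints with Lagrange multipliers. It is a schematic stand-in for the authors' MATLAB implementation; the geometry, load and boundary conditions are simplified assumptions (span between supports only, no overhangs).

```python
import numpy as np

n_el, L_span, b = 30, 0.8, 0.1
layers = [(64.5e9, 0.23, 0.005), (1.287e6, 0.40, 0.00038), (64.5e9, 0.23, 0.005)]
Le, n_nd = L_span / n_el, n_el + 1
# DOF layout: per layer i, u then phi (2*n_nd each); then shared w; then multipliers
off_u = [2 * n_nd * i for i in range(3)]
off_w = 6 * n_nd
n_dof = off_w + n_nd + 2 * n_nd              # two interface constraints per node

K = np.zeros((n_dof, n_dof))
for i, (E, nu, h) in enumerate(layers):
    Ke, Kew, Kw = layer_element(E, E / (2 * (1 + nu)), b, h, Le)
    for e in range(n_el):
        iu = [off_u[i] + e, off_u[i] + e + 1,
              off_u[i] + n_nd + e, off_u[i] + n_nd + e + 1]   # u_a,u_b,phi_a,phi_b
        iw = [off_w + e, off_w + e + 1]
        K[np.ix_(iu, iu)] += Ke
        K[np.ix_(iu, iw)] += Kew
        K[np.ix_(iw, iu)] += Kew.T
        K[np.ix_(iw, iw)] += Kw

# Tying conditions u^(i) + h^(i)/2 phi^(i) - u^(i+1) + h^(i+1)/2 phi^(i+1) = 0
row = off_w + n_nd
for i in range(2):
    hi, hj = layers[i][2], layers[i + 1][2]
    for n in range(n_nd):
        for col, val in [(off_u[i] + n, 1.0), (off_u[i] + n_nd + n, hi / 2),
                         (off_u[i + 1] + n, -1.0), (off_u[i + 1] + n_nd + n, hj / 2)]:
            K[row, col] = K[col, row] = val
        row += 1

f = np.zeros(n_dof)
f[off_w + n_nd // 2] = -50.0                 # mid-span point load of 50 N
fixed = [off_w, off_w + n_el, off_u[0]]      # simple supports + axial restraint
free = np.setdiff1d(np.arange(n_dof), fixed)
u = np.zeros(n_dof)
u[free] = np.linalg.solve(K[np.ix_(free, free)], f[free])
print("mid-span deflection [mm]:", 1e3 * u[off_w + n_nd // 2])
```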
such accuracy can be considered as sufficient from the practical point of view. to further confirm the predictive capacities of the proposed numerical scheme, a response corresponding to a proportionally increasing load was investigated. the results appear in tables 3 and 4. again, the method seems to be suf© czech technical university publishing house http://ctn.cvut.cz/ap/ 25 acta polytechnica vol. 48 no. 6/2008 model central deflection [mm] �exp [%] �an [%] laminated glass beam: thickness [mm] 5/0.38/5 (glass/pvb/glass) experiment 1.27 – �5.2 analytical model 1.34 5.5 – numerical model 1.34 5.5 0.0 adina® (multi-layered shell) 0.89 �30.2 �33.8 monolithic glass beam: thickness [mm] 10 (glass+glass) analytical model 0.99 �21.8 �25.9 two independent glass beams: thickness [mm] 5/5 (without any interlayer) analytical model 3.97 212.6 196.2 table 2: comparison of results for a simply supported beam (load 50 n) load [n] central deflection [mm] wexp wan �exp an [%] wnum �exp num [%] �an num [%] 50 1.27 1.34 5.51 1.34 5.51 0.00 100 2.55 2.69 5.49 2.68 5.10 �0.37 150 4.12 4.03 �2.18 4.02 �2.43 �0.25 200 5.57 5.38 �3.41 5.36 �3.77 �0.37 table 3: comparison of deflections for a simply supported beam load [n] maximum strain [×10-6] maximum stress [mpa] �an �num �an num [%] �an �num �an num [%] 50 112 114 1.79 7.23 7.34 1.52 100 224 228 1.79 14.45 14.68 1.59 150 336 341 1.49 21.68 22.02 1.57 200 448 455 1.56 28.9 29.36 1.59 table 4. comparison of stresses and strains for a simply supported beam ficiently accurate in the investigated range of loads when considering the values of deflections as well as the values of local stresses and strains. 6 conclusions as shown by the presented results, the proposed numerical method is well-suited for modeling laminated glass beams, mainly because of its low computational cost and its accurate representation of the behavior of the structural member. future improvements of the model will consider large deflections and the time-dependent response of the interlayer and will be reported separately. acknowledgment the support provided by gačr grant no. 106/07/1244 is gratefully acknowledged. references [1] vallabhan, c. v. g., minor, j. e., nagalla, s. r.: stress in layered glass units and monolithic glass plates, journal of structural engineering, vol. 113 (1987), p. 36–43. [2] norville, h. s., king, k. w., swofford, j. l.: behavior and strength of laminated glass, journal of engineering mechanics, vol. 124 (1998), p. 46–53. [3] vallabhan, c. v. g., das, y. c., magdi, m., asik, m., bailey, j. r.: analysis of laminated glass units, journal of structural engineering, vol. 113 (1993), p. 1572–1585. [4] asik, m. z.: laminated glass plates: revealing of nonlinear behavior, computers and structures, vol. 81 (2003), p. 2659–2671. [5] asik, m. z., tezcan, s.: a mathematical model for the behavior of laminated glass beams, computers and structures vol. 83 (2005), p. 1742–1753. [6] duser, a. v., jagota, a., bennison, s. j.: analysis of glass/polyvinyl butyral laminates subjected to uniform pressure, journal of engineering mechanics, vol. 125 (1999), p. 435–442. [7] ivanov, i. v.: analysis, modelling, and optimization of laminated glasses as plane beam, international journal of solids and structures, vol. 43 (2006), p. 6887–6907. [8] mau, s. t.: a refined laminated plate theory, journal of applied mechanics – transactions of the asme, vol. 40 (1973), no. 2, p. 606–607. [9] šejnoha, m.: micromechanical modeling of unidirectional fibrous composite plies and laminates, ph.d. 
[10] Matouš, K., Šejnoha, M., Šejnoha, J.: Energy based optimization of layered beam structures. Acta Polytechnica, Vol. 38 (1998), No. 2, p. 5–15.
[11] Bathe, K. J.: Finite Element Procedures. 2nd edition, Prentice Hall, 1996.
[12] Bittnar, Z., Šejnoha, J.: Numerical Methods in Structural Mechanics. ASCE Press and Thomas Telford Ltd, New York and London, 1996.

Ing. Alena Zemanová, e-mail: zemanova.alena@gmail.com
Doc. Ing. Jan Zeman, Ph.D., phone: +420 224 354 482, fax: +420 224 310 775, e-mail: zemanj@cml.fsv.cvut.cz, URL: http://mech.fsv.cvut.cz/~zemanj, Department of Mechanics
Prof. Ing. Michal Šejnoha, Ph.D., e-mail: sejnom@fsv.cvut.cz, Department of Mechanics, Centre for Integrated Design of Advanced Structures
Czech Technical University in Prague, Faculty of Civil Engineering, Thákurova 7, 166 29 Prague 6, Czech Republic

BOOTES observation of GRB 080603B

M. Jelínek, J. Gorosabel, A. J. Castro-Tirado, A. de Ugarte Postigo, S. Guziy, R. Cunniffe, P. Kubánek, M. Prouza, S. Vítek, R. Hudec, V. Reglero, L. Sabau-Graziati

Abstract: We report on multicolor photometry of the long GRB 080603B afterglow from BOOTES-1B and BOOTES-2. The optical afterglow has already been reported to present a break in the optical lightcurve at 0.12 ± 0.02 days after the trigger. We construct the lightcurve and the spectral energy distribution and discuss the nature of the afterglow.

Keywords: gamma-ray bursts, individual, GRB 080603B.

1 Introduction

GRB 080603B was a long gamma-ray burst detected on June 3, 2008, at 19:38:13 UT by Swift-BAT [14]. The burst was also detected by Konus-Wind [5] and INTEGRAL-SPI/ACS [18]. In X-rays, the afterglow was detected by Swift-XRT, providing a rapid and precise localization [13]. The optical afterglow was observed by several telescopes: ROTSE III [19], TAROT [9–11], TLS Tautenburg [8], RTT150 [24], the Liverpool Telescope [15], Xinglong EST [23], the 1.0 m telescope at CrAO [21, 20], the 1.5 m telescope of Sayan Observatory [12], and from Maidanak [6]. In the infrared it was observed by PAIRITEL [16]. Spectroscopy was obtained by the NOT [4] and the Hobby-Eberly Telescope [2], providing a redshift of z = 2.69. An upper limit on radio emission was set by the VLA [1].

2 Observations

At both BOOTES stations, the GRB happened during twilight, delaying the follow-up by ∼1 h. Despite the delay, the optical afterglow is well detected in the data from both telescopes.

The 60 cm telescope BOOTES-2/TELMA, in La Mayora, Málaga, Spain, started taking data at 20:29:19 UT, i.e. 51 minutes after the GRB trigger. A sequence of r′-band exposures was taken, and later, after confirming the detection of the optical transient, i′, g′ and Y band images were obtained. In the near-infrared Y band, despite 600 s of integration, the afterglow was not detected.

The 30 cm telescope BOOTES-1B, located in El Arenosillo, Huelva, Spain [7], obtained 368 unfiltered images, totalling more than 6 hours of integrated light until the end of the night. The images were combined to improve the signal-to-noise ratio, yielding 11 data points for the period between 1.2 and 5.2 hours after the GRB. One point has a large error due to clouds crossing the field of view.
The best-fit astrometric position of the afterglow, obtained from the weighted average of all available images from BOOTES-2, is α = 11:46:07.73, δ = +68:03:39.9 (J2000), about 1.6° SE of the star λ Dra. Photometry was done in the optimal aperture using IRAF/DAOPHOT. Calibration was performed against three SDSS (DR8) [3] stars. The stars are marked on the identification chart (Figure 3) and their brightnesses are given in Table 1. Our unfiltered, "clear", best-fit magnitude, clear = a1·g′ + a2·r′, used for the BOOTES-1B calibration, is also listed. For a summary of our observations, see Table 2.

Table 1: Calibration stars used

  id.   g′      r′      i′      clear
  1     18.00   17.50   17.32   17.52
  2     18.80   17.35   16.04   17.35
  3     19.88   18.42   17.09   18.47

3 Fitting the lightcurve

The lightcurve, as already shown by [24], exhibits a smooth transition between two decay slopes, α1 = −0.55 ± 0.16 and α2 = −1.23 ± 0.22. The break occurs at tb = 0.129 ± 0.016 days. There is no hint of chromatic evolution within the lightcurve, so all filters were scaled and fitted together with the r′ band. The fitting of the lightcurve was performed in log t / log f space, where the power-law functions typical for gamma-ray bursts show up as straight lines. We used a hyperbolic transition between the two slopes (a smoothly broken power law):

$$h(a, b) = \frac{1}{2}\left(a + b\sqrt{1 + \frac{a^2}{b^2}}\right),$$

$$m(t) = m_0 - 2.5\,\alpha_2 \log \frac{t}{t_b} + h\!\left(-2.5\,(\alpha_1 - \alpha_2) \log \frac{t}{t_b},\; g\right),$$

where α1 and α2 are the pre-break and post-break decay indices, tb is the break time, m0 is an absolute scaling parameter of the brightness, and g expresses the smoothness of the break.

Table 2: Optical photometric observations of the optical afterglow of GRB 080603B

  UT date of mid exp.   t − t0 [h]   tel.   filter   t_exp        mag     Δ mag
  Jun 3.855805          0.902        B-2    r′       3 × 120 s    17.46   0.07
  Jun 3.859348          0.987        B-2    r′       2 × 120 s    17.59   0.13
  Jun 3.862188          1.056        B-2    r′       2 × 120 s    17.31   0.05
  Jun 3.864311          1.107        B-2    r′       120 s        17.57   0.08
  Jun 3.865747          1.141        B-2    r′       120 s        17.30   0.07
  Jun 3.867151          1.175        B-2    r′       120 s        17.46   0.06
  Jun 3.868946          1.218        B-1B   clear    10 × 60 s    17.53   0.07
  Jun 3.870011          1.243        B-2    g′       3 × 120 s    18.29   0.04
  Jun 3.874248          1.345        B-2    g′       3 × 120 s    18.24   0.04
  Jun 3.876758          1.405        B-1B   clear    10 × 60 s    17.54   0.06
  Jun 3.879225          1.465        B-2    g′       4 × 120 s    18.14   0.03
  Jun 3.884248          1.585        B-2    r′       3 × 120 s    17.50   0.09
  Jun 3.884664          1.595        B-1B   clear    10 × 60 s    17.70   0.06
  Jun 3.889912          1.721        B-2    r′       3 × 120 s    17.70   0.15
  Jun 3.892654          1.787        B-1B   clear    10 × 60 s    17.75   0.06
  Jun 3.893455          1.806        B-2    r′       4 × 120 s    17.74   0.06
  Jun 3.899839          1.959        B-2    g′       5 × 120 s    18.42   0.19
  Jun 3.900620          1.978        B-1B   clear    10 × 60 s    17.79   0.06
  Jun 3.906961          2.130        B-2    g′       5 × 120 s    18.42   0.04
  Jun 3.908509          2.167        B-1B   clear    10 × 60 s    17.87   0.09
  Jun 3.914867          2.320        B-2    r′       4 × 120 s    18.15   0.13
  Jun 3.916482          2.359        B-1B   clear    10 × 60 s    17.91   0.11
  Jun 3.922694          2.508        B-2    i′       5 × 120 s    17.89   0.05
  Jun 3.931774          2.726        B-2    r′       7 × 120 s    18.01   0.06
  Jun 3.934988          2.803        B-1B   clear    35 × 60 s    18.30   0.32
  Jun 3.940845          2.943        B-2    i′       5 × 120 s    17.88   0.07
  Jun 3.947882          3.112        B-2    r′       5 × 120 s    18.12   0.08
  Jun 3.956941          3.330        B-1B   clear    20 × 60 s    18.45   0.07
  Jun 3.971736          3.685        B-1B   clear    21 × 60 s    18.38   0.06
  Jun 3.977109          3.814        B-2    r′       5 × 120 s    18.26   0.18
  Jun 4.006997          4.531        B-1B   clear    78 × 60 s    18.79   0.07
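This parametrization drops straight into SciPy. The fragment below is an illustrative sketch, not the original fitting code: the starting values are taken from the quoted best-fit parameters, and the arrays `t` and `mag` are placeholders standing in for the merged, filter-shifted photometry of Table 2 (times in days).

```python
import numpy as np
from scipy.optimize import curve_fit

def h(a, b):
    # hyperbolic transition: ~a for a >> 0, ~0 for a << 0, smoothness set by b
    return 0.5 * (a + b * np.sqrt(1.0 + a**2 / b**2))

def sbpl(t, m0, tb, a1, a2, g):
    """Smoothly broken power law in magnitudes."""
    x = np.log10(t / tb)
    return m0 - 2.5 * a2 * x + h(-2.5 * (a1 - a2) * x, g)

t = np.array([0.038, 0.06, 0.09, 0.13, 0.19])            # placeholder times [d]
mag = sbpl(t, 18.6, 0.129, -0.55, -1.23, 0.3)            # placeholder magnitudes
p0 = [18.6, 0.129, -0.55, -1.23, 0.3]                    # quoted best fit as start
popt, pcov = curve_fit(sbpl, t, mag, p0=p0)
print("tb = %.3f d, alpha1 = %.2f, alpha2 = %.2f" % (popt[1], popt[2], popt[3]))
```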
we constructed a spectral energy distribution (sed) by fitting the needed magnitude shift of the r-band lightcurve model to the photometric points from bootes, uvot [14] and pairitel [16] obtained in other filters. while the points from uvot are practically contemporaneous to bootes, pairitel observed rather later (0.32 days after the trigger), so the sed is therefore model-dependent in its infrared part. the synthetic ab magnitudes equivalent to t = 0.1 days are in table 3. 35 acta polytechnica vol. 52 no. 1/2012 16 16.5 17 17.5 18 18.5 19 19.5 20 40m 1h 2h 4h 8h 50 ujy 0.1 mjy 0.2 mjy 0.5 mjy 1 mjy a b m a g n it u d e fl u x d e n si ty time post trigger r’ gcn r-band g’+1 i’-1 c fig. 1: detail of the optical light curve of grb080603b showing the observations by bootes (filled symbols) and from the literature (empty symbols) 14 15 16 17 18 19 20 21 10s 30s 1m 2m 5m 10m 30m 1h 2h 4h 8h 1d 2d 10 ujy 20 ujy 50 ujy 0.1 mjy 0.2 mjy 0.5 mjy 1 mjy 2 mjy 5 mjy 10 mjy a b m a g n it u d e fl u x d e n si ty time post trigger r’ gcn r-band g’ i’ c fig. 2: overall view of the light curve of grb080603b fig. 3: the finding chart of the afterglow of grb080603b. combination of images taken by bootes-2 17 18 19 20 21 200 300 500 1000 2000 10 100 1000 a b m a g n it u d e fl u x d e n si ty [ u jy ] wavelength [nm] fig. 4: the spectral energy distribution of the afterglow in rest frame. the small arrow marks ly-α position for z =2.69 table 3: the spectral energy distribution in ab magnitudes equivalent to 0.1days after the trigger. († uvot, ‡ pairitel) filter mab ∆mab w† 20.98 0.56 u† 19.83 0.23 b† 19.22 0.14 g′ 18.57 0.07 r′ 17.88 0.05 i′ 17.81 0.09 j‡ 17.44 0.10 h‡ 17.19 0.10 k‡ 17.22 0.10 the sed shows a clear suppression of radiation above 4 500 å, i.e. a redshifted ly-α line. no radiation is detected above the lyman break at 3 365 å. a rather shallow power law with an index β = −0.53 ± 0.06 was found redwards from r′ band. the fit was performed using the e(b − v ) = 0.013 mag [22]. the strong suppression of light for wavelengths shorter than r′ band is likely due to the ly-α absorption within the host galaxy and ly-alpha line blanketing for z = 2.69. 4 discussion the values of α2 = −1.23±0.22 and β = −0.53±0.06 both point to a common electron distribution parameter p = 2.05 ± 0.20 (α = (3 ∗ p − 1)/4, β = (p− 1)/2) [17]. such a combination suggests a stellar wind profile expansion and a slow cooling regime. the pre-break decay rate α1 = −0.55 ± 0.16 remains unexplained by the standard fireball model. it is unlikely that the break at tb = 0.129 ± 0.016 would be a jet break. it is quite possible that the plateau is not really a straight power law, and that some late 36 acta polytechnica vol. 52 no. 1/2012 activity of the inner engine may be producing bumping of hydrodynamic origin. we note that the literature contains a number of observations suggesting a rapid decay by about one day after the grb. without having all the images, it is, however, impossible to decide whether this is a real physical effect or a zero-point mismatch. 5 conclusions the 0.6 m telescope bootes-2 in la mayora observed the optical afterglow of grb 080603b in three filters. the 0.3 m bootes-1b in el arenosillo observed the same optical afterglow without a filter. using the data we obtained at bootes and from the literature, we construct the lightcurve and broadband spectral energy distribution. 
our fit of the obained data privides the decay parameters α2 = 1.23 ± 0.22 and β = −0.53 ± 0.06, which suggest a slow cooling expansion into a stellar wind. acknowledgement we acknowledge the support of the spanish ministerio de ciencia y tecnoloǵıa through projects aya2008-03467/esp and aya2009-14000-c03-01/ esp, and junta de andalućıa through the excellence reseach project p06-fqm-219, and the gačr grants 205/08/1207 and 102/09/0997. we are also indebted to t. mateo-sanguino (uhu), j. a. adame, j. a. andreu, b. de la morena, j. torres (inta) and to r. fernández-munoz (eelm-csic), v. munozfernández and c. pérez del pulgar (uma) for their support. references [1] chandra, p., frail, d. a.: grb 080603b: pairitel infrared detection. gcn circular, 7827, 2008. [2] cucchiara, a., fox, d.: grb 080603b: hobbyeberly telescope redshift confirmation. gcn circular, 7815, 2008. [3] eisenstein, d. j., weinberg, d. h., agol, e., aihara, h., allende prieto, c., anderson, s. f., arns, j. a., aubourg, é., bailey, s., balbinot, e., et al.: sdss-iii: massive spectroscopic surveys of the distant universe, the milky way, and extra-solar planetary systems. aj, 142, 72, sept. 2011. [4] fynbo, j., quirion, p.-o., xu, d., malesani, d., thoene, c., hjorth, j., milvang-jensen, b., jakobson, p.: grb 080603b: not redshift. gcn circular, 7797, 2008. [5] golenetskii, s., aptekar, r., mazets, e., pal’shin, v., frederiks, d., cline, t.: konuswind observation of grb 080603b. gcn circular, 7812, 2008. [6] ibrahimov, m., karimov, p., rumyantsev, a., pozanenko, a.: grb 080603b: optical observations in mao. gcn circular, 7975, 2008. [7] jeĺınek, m., castro-tirado, a. j., de ugarte postigo, a., kubánek, p., guziy, s., gorosabel, j., cunniffe, r. vı́tek, s., hudec, r., reglero, v., sabau-graziati, l.: four years of real-time grb followup by bootes-1b (2005–2008). advances in astronomy, 2010, 432 172, 2010. [8] kann, d., laux, u., ertel, s.: grb 080603b: tls afterglow observation. gcn circular, 7823, 2008. [9] klotz, a., boer, m., atteia, j.: grb 080603b: tarot calern observatory detection of a plateau in the light curve. gcn circular, 7795, 2008. [10] klotz, a., boer, m., atteia, j.: grb 080603b: tarot calern observatory confirmation of slow optical decay. gcn circular, 7799, 2008. [11] klotz, a., boër, m., atteia, j., gendre, b.: early optical observations of gamma-ray bursts by the tarot telescopes: period 2001–2008. the astronomical journal, 2009. [12] klunko, e., pozanenko, a.: grb 080603b: optical observation. gcn circular, 7890, 2008. [13] mangano, v., la parola, b., sbarufatti, b.: grb 080603b: swift-xrt refined analysis. gcn circular, 7806, 2008. [14] mangano, v., parsons, a., sakamoto, t., la parola, v., kuin, n., barthelmy, s., burrows, d., roming, p., gehrels, n.: swift observation of grb 080603b. gcn report, 144, 2008. [15] melandri, a., gomboc, a., guidorzi, c., smith, r., steele, i., bersier, d., mundell, c., carter, d., kobayashi, s., burgdorf, m., bode, m., rol, e., o’brien, p., bannister, n., tanvir, n.: grb 080603b: liverpool telescope observations. gcn circular, 7813, 2008. [16] miller, a., bloom, j., perley, d.: grb 080603b: pairitel infrared detection. gcn circular, 7827, 2008. [17] piran, t.: the physics of gamma-ray bursts. reviews of modern physics, 76, 1 143–1 210, oct. 2004. 37 acta polytechnica vol. 52 no. 1/2012 [18] rau, a.: catalogue of spi-acs gamma-ray burst. http://www.mpe.mpg.de/gamma/science/ grb/1acsburst.html, 2012. 
[19] Rujopakarn, W., Guver, T., Smith, D.: GRB 080603B: ROTSE-III detection of optical counterpart. GCN Circular, 7792, 2008.
[20] Rumyantsev, A., Antoniuk, K., Pozanenko, A.: GRB 080603B: optical observations in CrAO. GCN Circular, 7974, 2008.
[21] Rumyantsev, V., Pozanenko, A.: GRB 080603B: optical observation. GCN Circular, 7869, 2008.
[22] Schlegel, D., Finkbeiner, D., Davis, M.: Maps of dust infrared emission for use in estimation of reddening and cosmic microwave background radiation foregrounds. ApJ, 500, 525, June 1998.
[23] Xin, L., Feng, Q., Zhai, M., Qiu, Y., Wei, J., Hu, J., Deng, J., Wang, J., Urata, Y., Zheng, W.: GRB 080603B: Xinglong EST observations. GCN Circular, 7814, 2008.
[24] Zhuchkov, R., Bikmaev, I., Sakhibullin, N., Khamitov, I., Eker, Z., Kiziloglu, U., Gogus, E., Burenin, R., Pavlinsky, M., Sunyaev, R.: GRB 080603B: RTT150 optical observations, break in light curves. GCN Circular, 7803, 2008.

Martin Jelínek, e-mail: mates@iaa.es, Instituto de Astrofísica de Andalucía (CSIC), Granada, Spain
Javier Gorosabel, Alberto J. Castro-Tirado, Antonio de Ugarte Postigo, Sergei Guziy, Ronan Cunniffe, Instituto de Astrofísica de Andalucía (CSIC), Granada, Spain
Petr Kubánek, Michael Prouza, Fyzikální ústav (FZÚ AV ČR), Praha, Czech Republic
Stanislav Vítek, Fakulta elektrotechnická, ČVUT v Praze, Czech Republic
René Hudec, Astronomický ústav Akademie věd (ASÚ AV ČR), Ondřejov, Czech Republic; Fakulta elektrotechnická, ČVUT v Praze, Czech Republic
Victor Reglero, Image Processing Laboratory, Universitat de València, Spain
Lola Sabau-Graziati, Instituto Nacional de Técnica Aeroespacial, Torrejón de Ardoz, Madrid, Spain

The asymptotic properties of turbulent solutions to the Navier-Stokes equations

Zdeněk Skalák
Faculty of Civil Engineering, Czech Technical University in Prague, Thákurova 7, 166 29 Prague, Czech Republic
Corresponding author: skalak@mat.fsv.cvut.cz

Abstract: In this paper we study the large time behavior of solutions to the Navier-Stokes equations. We present a brief survey of results concerning energy decay, and discuss a related phenomenon of the large time energy concentration in the frequency space occurring in any turbulent solution. This leads us to the study of solutions in the Besov spaces and to the proof that if we choose a suitable initial condition, then in some Besov spaces the energy of the associated solution does not decrease asymptotically to zero.

Keywords: Navier-Stokes equations, Besov spaces.

1 Introduction

We consider the Navier-Stokes equations for a viscous incompressible fluid which fills the whole three-dimensional space R³, in the absence of external forces:

$$\partial_t u + \nabla \cdot (u \otimes u) = \Delta u - \nabla p, \quad (1)$$
$$\nabla \cdot u = 0, \quad (2)$$
$$u(x, 0) = u_0(x). \quad (3)$$

Here u : R³ × [0,∞) → R³ denotes the unknown velocity field and p : R³ × [0,∞) → R is the unknown pressure; u₀ = u₀(x) = (u₀₁(x), u₀₂(x), u₀₃(x)) is a given initial velocity.

The mathematical theory of the Navier-Stokes equations has been developed since the pioneering work by Leray ([7]). Plenty of papers and books can now be found in the literature concerning various aspects of the theory, among them the famous problem (still unresolved) whether a solution of the Navier-Stokes equations with smooth data remains regular for all times or can develop a blow-up in finite time. In this paper we are interested in the large time behavior of the solutions, and we start with the following basic question: does the kinetic energy of the solutions decrease to zero as time t goes to infinity?
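The question is natural because of the formal energy balance. The display below is a standard supporting computation, added here for orientation and not taken from the paper: testing (1) with u and using (2) gives

```latex
% Formal energy identity: multiply (1) by u, integrate over R^3,
% and use \nabla\cdot u = 0 so that the convective and pressure terms vanish:
\frac{1}{2}\frac{\mathrm{d}}{\mathrm{d}t}\|u(t)\|_2^2
  = \int_{\mathbb{R}^3} u\cdot\Delta u \,\mathrm{d}x
  = -\|\nabla u(t)\|_2^2 \le 0 .
% Hence t \mapsto \|u(t)\|_2 is non-increasing; monotonicity alone, however,
% does not force the limit to be zero, which is why the question is nontrivial.
```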
This question was first raised by Leray in [7] in 1934, and the intuitive answer is positive, since we consider no external forces here. Indeed, the answer "yes" turns out to be correct, but many years passed between the formulation of the question and its partial solution by Kato in [6]. Having solved the basic problem, we can now investigate more detailed aspects of energy decay. In the second section, we discuss the rate of energy decay and present a short survey of the results. The third section is devoted to the phenomenon of large time energy concentration in solutions. It turns out that in every (turbulent) solution the energy concentrates for large times in frequencies forming an annulus or a ball in the frequency space. This phenomenon seems to be connected with the rate of energy decay discussed in the second section, and we present several results concerning the existence and the rate of the energy concentration.

The main results of this paper are presented in the fourth section, where we turn our attention to the behavior of solutions in Besov spaces. These spaces are defined by the use of the Fourier transform, and enable a study of the location of the energy in the whole spectrum of frequencies. This seems to be a suitable way to study the problems presented in the third section. We improve a result presented by Miyakawa in [8], and show that there exist some Besov spaces in which some solutions do not decrease asymptotically to zero, unlike the decrease to zero in the energy norm mentioned above. For the purposes of clarity, all the notation used in this paper, and also definitions of some basic mathematical terms, can be found in the appendix.

2 Rate of energy decay

As was mentioned in the introduction, the energy of every turbulent solution $u$ decreases asymptotically to zero, i.e. $\lim_{t\to\infty}\|u(t)\|_2 = 0$ (a precise simple proof can be found in [20]). A further logical step is to study the rate of energy decay, and to present some classes of initial conditions providing various rates of decay. Many results concerning this problem were proved by Schonbek (see, for example, [11], [12] and [13]). We mention here as an example a result proved in [12]: if the initial condition $u_0$ belongs to the space $L^1\cap L^2_\sigma$, then there exists a global weak solution of (1)–(3) and $c>0$ such that $\|u(t)\|_2 \le c(t+1)^{-1/4}$ for every $t\ge 0$.

A key paper was published by Wiegner in [20]. He showed that, roughly speaking, most of the solutions of the Navier-Stokes equations decrease at the same rate as the solutions of the so-called Stokes equations (the Navier-Stokes equations deprived of the nonlinear term) with the same initial conditions. More precisely, a turbulent solution with the initial condition $u_0$ decreases at the rate $(1+t)^{-\alpha}$ for some $\alpha\in(0,5/4]$ if $e^{t\Delta}u_0$ also decreases at that rate. Solutions with an even higher rate of decay were studied by Miyakawa and Schonbek in [9]. They proved the following result:

Theorem 1. Let $u_0\in L^2_\sigma$ and $\int |u_0(x)|(1+|x|)\,dx < \infty$. Let $u$ be a turbulent solution to the NSE with the initial condition $u_0$ such that $\|u(t)\|_2 \le c(1+t)^{-5/4}$. We set
$$b_{h,k} = \int x_h u_{0k}(x)\,dx \quad\text{and}\quad \lambda_{h,k} = \int_0^\infty\!\!\int (u_h u_k)(x,t)\,dx\,dt \quad (h,k = 1,2,3).$$
Then if $(b_{h,k})\equiv 0$ and if there exists $c\in\mathbb{R}$ such that $\lambda_{h,k} = c\,\delta_{h,k}$, then
$$\lim_{t\to\infty} t^{5/4}\|u(t)\|_2 = 0.$$
Conversely, if $(b_{h,k})\neq 0$ or $(\lambda_{h,k})$ is not scalar, then
$$\liminf_{t\to\infty} t^{5/4}\|u(t)\|_2 > 0.$$

Theorem 1 presents conditions under which a solution decreases at a higher rate than $(1+t)^{-5/4}$.
However, while it is simple to fulfill the condition $(b_{h,k})\equiv 0$ just by a suitable choice of the initial condition, it is difficult to verify the condition $\lambda_{h,k} = c\,\delta_{h,k}$, since it involves the solution itself. Thus, Theorem 1 neither ensures the existence of solutions decreasing at a rate quicker than $(1+t)^{-5/4}$ nor gives a method for possibly constructing such solutions. This problem was solved by Brandolese in [1]. He used the following concept of a symmetric solution:

Definition 1. A vector field $u_0 = (u_{01}, u_{02}, u_{03})$ from $\mathbb{R}^3$ to $\mathbb{R}^3$ is said to be symmetric if the following conditions are satisfied for all $j,k = 1,2,3$:
1. $u_{0j}$ is odd with respect to $x_j$ and even with respect to $x_k$, $j\neq k$.
2. $u_{01}(x) = u_{02}(\sigma x) = u_{03}(\sigma^2 x)$, where $\sigma$ is the cycle $\sigma(x_1,x_2,x_3) = (x_3,x_1,x_2)$.

A simple example of a symmetric and solenoidal vector field is given by
$$u_0(x_1,x_2,x_3) = \begin{pmatrix} x_1(x_3^2-x_2^2)\,e^{-|x|^2} \\ x_2(x_1^2-x_3^2)\,e^{-|x|^2} \\ x_3(x_2^2-x_1^2)\,e^{-|x|^2} \end{pmatrix}. \qquad (4)$$

Brandolese proved in [1] that if the initial condition $u_0$ is a solenoidal symmetric vector field, then there exists a solution which is symmetric for every time $t\ge 0$. Now the existence of a solution decreasing at the rate $o((1+t)^{-5/4})$, $t\to\infty$, is ensured: it suffices to take the initial condition (4). Indeed, the associated solution $u = u(t)$ is then symmetric for every time $t\ge 0$ and it is also possible to verify the validity of the condition $\lambda_{h,k} = c\,\delta_{h,k}$. Moreover, the initial condition (4) also satisfies $(b_{h,k})\equiv 0$, and the above conclusion follows from Theorem 1. Using symmetric initial conditions, Brandolese obtained solutions decreasing at the rate $(1+t)^{-9/4}$. Obtaining solutions which decrease at an even higher rate seems to be difficult, with only one exception: the existence of exponentially decreasing solutions is very well known, as was described in [10] and [17]. These solutions are very rare, since it is possible to prove that their initial conditions lie only on a thin manifold in the phase space. The direct construction of exponentially decreasing solutions in three-dimensional space is still an open problem.
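As a quick check, added here for the reader's convenience (it is not part of the original survey), the example (4) is indeed solenoidal: differentiating each component gives
$$\nabla\cdot u_0 = \big[(x_3^2-x_2^2) + (x_1^2-x_3^2) + (x_2^2-x_1^2)\big]e^{-|x|^2} - 2\big[x_1^2(x_3^2-x_2^2) + x_2^2(x_1^2-x_3^2) + x_3^2(x_2^2-x_1^2)\big]e^{-|x|^2} = 0,$$
since both brackets cancel term by term. The symmetry conditions of Definition 1 are verified equally directly.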
3 Concentration of energy

In this section, we present several results concerning the large time energy concentration which occurs in every (turbulent) solution: the energy of the solution concentrates in frequencies localized in an annulus (for solutions decreasing exponentially) or a ball (for solutions not decreasing exponentially) in the frequency space. The diameter of the annulus determines the rate of the exponential decay of the solution, and the annulus can be arbitrarily narrow. The ball is centered at the origin of the coordinates, and can have an arbitrarily small diameter (for the results presented in this section, see [15], [16], [17] and [18]). The following theorem and the ensuing remarks provide precise information.

Theorem 2. Let $u$ be a nonzero turbulent solution of (1)–(3). Then there exists $a\in[0,\infty)$ such that
$$\lim_{t\to\infty} \frac{\|(E_{a+\varepsilon}-E_{a-\varepsilon})u(t)\|_2}{\|u(t)\|_2} = 1 \qquad (5)$$
for every $\varepsilon>0$, where we put $E_{a-\varepsilon}=0$ if $a-\varepsilon<0$. The number $a$ can be explicitly computed as $a = \lim_{t\to\infty}\|A^{1/2}u(t)\|_2^2/\|u(t)\|_2^2$. Further,
$$a = \sup\{\lambda\ge 0;\ \lim_{t\to\infty}\|u(t)\|_2\,e^{\lambda t} = 0\},$$
which implies that the energy of $u$ decreases exponentially for $t\to\infty$ if and only if $a>0$. Finally, if $a>0$ and $\varepsilon>0$, then
$$\lim_{t\to\infty} e^{(a-\varepsilon)t}\|u(t)\|_2 = 0 \quad\text{and}\quad \lim_{t\to\infty} e^{(a+\varepsilon)t}\|u(t)\|_2 = \infty.$$

Remark 1. It is possible to show (see [5]) that for every $\lambda>0$
$$\mathcal{F}(E_\lambda u(t))(\xi) = \chi_{B_{\sqrt\lambda}(0)}(\xi)\,\mathcal{F}(u(t))(\xi),$$
where $\mathcal{F}$ denotes the Fourier transform and $\chi_{B_{\sqrt\lambda}(0)}$ is the characteristic function of $B_{\sqrt\lambda}(0) = \{x\in\mathbb{R}^3;\ \|x\|\le\sqrt\lambda\}$. Consequently, the equality (5) can be written in the form
$$\lim_{t\to\infty} \frac{\int_{B_{\sqrt{a+\varepsilon}}(0)\setminus B_{\sqrt{a-\varepsilon}}(0)} |\mathcal{F}(u(t))(\xi)|^2\,d\xi}{\int_{\mathbb{R}^3} |\mathcal{F}(u(t))(\xi)|^2\,d\xi} = 1$$
if $a>0$ and $\varepsilon\in(0,a)$, and
$$\lim_{t\to\infty} \frac{\int_{B_{\sqrt\varepsilon}(0)} |\mathcal{F}(u(t))(\xi)|^2\,d\xi}{\int_{\mathbb{R}^3} |\mathcal{F}(u(t))(\xi)|^2\,d\xi} = 1$$
if $a=0$ and $\varepsilon>0$.

The results from Theorem 2 and Remark 1 can be interpreted in the way that in every turbulent solution the frequencies outside the annulus or the ball disappear asymptotically. This result can be further strengthened in the following way:

Theorem 3. Let $\alpha\ge 0$. Then
$$\lim_{t\to\infty} \frac{\int_{K_{a,\varepsilon}^c} |\xi|^{4\alpha}\,|\mathcal{F}(u(t))(\xi)|^2\,d\xi}{\int_{\mathbb{R}^3} |\mathcal{F}(u(t))(\xi)|^2\,d\xi} = 0,$$
where $K_{a,\varepsilon}^c = \mathbb{R}^3\setminus K_{a,\varepsilon}$, $K_{a,\varepsilon} = B_{\sqrt{a+\varepsilon}}(0)\setminus B_{\sqrt{a-\varepsilon}}(0)$ if $a>0$, and $K_{a,\varepsilon} = B_{\sqrt\varepsilon}(0)$ if $a=0$.

Up to now we have discussed the phenomenon of large time energy concentration in the frequency space, which occurs in any turbulent solution. In Theorem 4, we present an example of a concrete class of initial conditions such that if $u_0$ belongs to this class and $u$ is a turbulent solution with $u(0)=u_0$, then the energy of the solution concentrates asymptotically in frequencies from an arbitrarily small ball in the frequency space centered at the origin of the coordinates. We also present two estimates of the rate of energy concentration (see [18]). For a description of the class mentioned in the previous paragraph we need the following definition.

Definition 2. Let $\alpha,\delta>0$, and let $m$ be a real number. We define
$$K^\delta_{m,\alpha} = \{v\in L^2_\sigma;\ |\mathcal{F}(v)(\xi)|\ge\alpha|\xi|^m\ \text{for all}\ |\xi|\le\delta\}.$$

Theorem 4. Let $\alpha,\delta>0$, $m>-3/2$, $p\in[1,2]$ and
$$\frac{3}{p}-\frac{3}{2} \le m+\frac{3}{2} < \min\Big(\frac{6}{p}-\frac{5}{2},\ \frac{5}{2}\Big).$$
Let further $u_0\in L^2_\sigma\cap K^\delta_{m,\alpha}\cap L^p$ and let $u$ be a turbulent solution of (1)–(3) with the initial condition $u_0$. If $q\ge 1/2$, then there exists $c>0$ dependent only on $\|u_0\|_2$, $\|u_0\|_p$, $\delta$, $m$, $\alpha$ and $q$ such that
$$1 - \frac{\|E_\lambda u(t)\|_2}{\|u(t)\|_2} \le \frac{c}{\lambda}\, t^{-1+(m-3/p+3)/(2q)} \qquad (6)$$
for every $\lambda>0$ and every $t\ge 1$. Let $\lambda_0>0$. Then there exists $c>0$ dependent only on $\|u_0\|_2$, $\|u_0\|_p$, $\delta$, $m$, $\alpha$ and $\lambda_0$ such that
$$1 - \frac{\|E_\lambda u(t)\|_2}{\|u(t)\|_2} \le \frac{c}{\lambda^2}\, t^{-3/p-1} \qquad (7)$$
for all $\lambda\ge\lambda_0$ and $t\ge 1$.

Inequalities (6) and (7) provide information about the concentration of the energy in low frequencies and about the rate of this concentration. Since the energy of the solutions described in Theorem 4 concentrates at low frequencies, the number $a$ from Theorem 2 is equal to zero and the energy of these solutions does not decrease exponentially. Let us mention here one open problem: to find a turbulent solution $u$ with an initial condition $u_0$ such that the number $a$ from Theorem 2 is positive; in other words, to find a solution with energy decreasing at the exponential rate $e^{-at}$, $t\to\infty$, and concentrating in frequencies from an arbitrarily narrow annulus with the middle diameter $a>0$.

4 Solutions in Besov spaces

We start this section with a definition of the homogeneous Besov spaces (see also [2]). Let $\mathcal{C}$ be the annulus $\{\xi\in\mathbb{R}^3;\ 3/4\le|\xi|\le 8/3\}$. There exist smooth radial functions $\chi$ and $\varphi$, supported in $B(0,4/3)$ and $\mathcal{C}$, respectively, with values in $[0,1]$, and such that
$$\chi(\xi) + \sum_{j\ge 0}\varphi(2^{-j}\xi) = 1 \quad \forall\,\xi\in\mathbb{R}^3,$$
$$\sum_{j\in\mathbb{Z}}\varphi(2^{-j}\xi) = 1 \quad \forall\,\xi\in\mathbb{R}^3\setminus\{0\},$$
$$\operatorname{supp}\varphi(2^{-j}\cdot)\cap\operatorname{supp}\varphi(2^{-j'}\cdot) = \emptyset \quad\text{if } |j-j'|\ge 2,$$
$$\operatorname{supp}\chi\cap\operatorname{supp}\varphi(2^{-j}\cdot) = \emptyset \quad\text{if } j\ge 1.$$
Let $h = \mathcal{F}^{-1}\varphi$. If $u\in\mathcal{S}'$ (the space of tempered distributions), then the homogeneous dyadic blocks are defined for $j\in\mathbb{Z}$ as
$$\Delta_j u = 2^{3j}\int_{\mathbb{R}^3} h(2^j y)\,u(x-y)\,dy.$$
The space of homogeneous distributions $\mathcal{S}'_h$ is defined in the following way: $u\in\mathcal{S}'$ belongs to $\mathcal{S}'_h$ if and only if $u = \sum_{j\in\mathbb{Z}}\Delta_j u$. We can now define the homogeneous Besov space $B^s_{p,\infty}$, $s\in\mathbb{R}$, $p\in[1,\infty]$. This space consists of those distributions from $\mathcal{S}'_h$ such that
$$\|u\|_{B^s_{p,\infty}} = \sup_{j\in\mathbb{Z}} 2^{js}\|\Delta_j u\|_p < \infty.$$

Suppose now that $u$ is a turbulent solution of (1)–(3) with an initial condition $u_0$. Then (see [19])
$$u(t) = e^{t\Delta}u_0 + \int_0^t e^{\Delta(t-s)}P_\sigma\nabla(u\otimes u(s))\,ds.$$
If we denote the integral from the previous equality as $w(t)$ and use the fact that the operator $P_\sigma\nabla$ is homogeneous of degree 1, we can derive
$$\|\Delta_j w(t)\|_1 \le \int_0^t c\,e^{-c(t-s)2^{2j}}\,2^j\,\|\Delta_j(u\otimes u(s))\|_1\,ds.$$
So, we have for every $t>0$
$$\|w(t)\|_{B^{-1}_{1,\infty}} = \sup_{j\in\mathbb{Z}} 2^{-j}\|\Delta_j w(t)\|_1 \le c\sup_{j\in\mathbb{Z}}\int_0^t e^{-c(t-s)2^{2j}}\|\Delta_j(u\otimes u(s))\|_1\,ds \le c\sup_{j\in\mathbb{Z}}\int_0^t e^{-c(t-s)2^{2j}}\|u(s)\|_2^2\,ds \le c\int_0^t \|u(s)\|_2^2\,ds < \infty.$$
It follows that $w(t)\in B^{-1}_{1,\infty}$ and so $w(t)\in B^{-5/2}_{2,\infty}$, since $B^{-1}_{1,\infty}$ is continuously embedded into $B^{-5/2}_{2,\infty}$ (as follows from the Bernstein inequalities, see [3]). Further, if the initial condition $u_0$ is from the space $B^{-5/2}_{2,\infty}$, then $e^{t\Delta}u_0$ is also from the same space. This means that $u(t)\in B^{-5/2}_{2,\infty}$ for every $t>0$, $\|e^{t\Delta}u_0\|_2$ decreases at the rate $(1+t)^{-5/4}$ (see [3]), and using the result from [20] mentioned in the second section we also have $\|u(t)\|_2 \le c_2(1+t)^{-5/4}$ for every $t\ge 0$.

Suppose now that the initial condition was chosen in such a way that
$$c_1(1+t)^{-5/4} \le \|u(t)\|_2 \qquad (8)$$
for some $c_1>0$ and every $t\ge 0$. It follows from [14] that $\|A^\alpha u(t)\|_2 \le c_3(1+t)^{-\alpha-5/4}$ for every $\alpha>0$ and every sufficiently large $t$. If $\mu(t) = c_4(1+t)^{-1}$, we get
$$c_3^2(1+t)^{-2\alpha-5/2}\,c_1^{-2}(1+t)^{5/2} \ge \frac{\|A^\alpha u(t)\|_2^2}{\|u(t)\|_2^2} \ge c_4^{2\alpha}(1+t)^{-2\alpha}\Big(1 - \frac{\|E_{\mu(t)}u(t)\|_2^2}{\|u(t)\|_2^2}\Big).$$
So, if $c_4$ is sufficiently large, then
$$1 - \frac{\|E_{\mu(t)}u(t)\|_2^2}{\|u(t)\|_2^2} \le c_4^{-2\alpha}c_3^2 c_1^{-2} < 1 \quad\text{and}\quad \|E_{\mu(t)}u(t)\|_2 \ge c\,\|u(t)\|_2 \qquad (9)$$
for every sufficiently large $t$ and some $c>0$.

We will now prove the existence of a constant $C$ such that $\liminf_{t\to\infty}\|u(t)\|_{B^{-5/2}_{2,\infty}} \ge C > 0$. We proceed by contradiction. Suppose that there exists a sequence $\{t_n\}_{n=1}^\infty$, $\lim_{n\to\infty}t_n = \infty$, such that $\lim_{n\to\infty}c(n) = 0$, where $c(n) = \|u(t_n)\|_{B^{-5/2}_{2,\infty}} = \sup_{j\in\mathbb{Z}} 2^{-5j/2}\|\Delta_j u(t_n)\|_2$. Then $\|\Delta_j u(t_n)\|_2^2 \le 2^{5j}c(n)^2$. Choose $j_0$ so that $2^{j_0}\sim(1+t_n)^{-1/2}$ and sum the last inequality over $j$ from $-\infty$ to $j_0$. We get
$$\sum_{j\le j_0}\|\Delta_j u(t_n)\|_2^2 \le \sum_{j\le j_0} 2^{5j}c(n)^2.$$
Due to the definition of $\mu$, (9) and the choice of $j_0$, the left-hand side is greater than $c\,\|u(t_n)\|_2^2$ for some $c>0$ independent of $n$. The right-hand side is smaller than $2c(n)^2 2^{5j_0} \sim 2c(n)^2(1+t_n)^{-5/2}$. We finally get $c\,\|u(t_n)\|_2 \le c(n)(1+t_n)^{-5/4}$ for every $n\in\mathbb{N}$, and this is in contradiction with (8). We sum up the result of this section in the following theorem.

Theorem 5. Let $u_0\in B^{-5/2}_{2,\infty}\cap L^2_\sigma$. Let $u$ be a turbulent solution of (1)–(3) with the initial condition $u_0$ and such that $\|u(t)\|_2 \ge c(1+t)^{-5/4}$ for some $c>0$ and all $t\ge 0$. Then there exist constants $c'$ and $c''$ such that
$$0 < c' \le \|u(t)\|_{B^{-5/2}_{2,\infty}} \le c'' \qquad (10)$$
for every $t\ge 0$.

We will now show that Theorem 5 improves the result presented by Miyakawa in [8]. Miyakawa studied turbulent solutions with initial conditions $u_0\in L^2_\sigma$ such that
$$\int (1+|x|)\,|u_0(x)|\,dx < \infty. \qquad (11)$$
He proved that
$$0 < c_0 \le \|u(t)\|_{B^{-1}_{1,\infty}} \le c_1 \qquad (12)$$
for large $t>0$ and some constants $c_0$ and $c_1$ if and only if
$$\Big(\int x_j u_{0m}(x)\,dx,\ \int_0^\infty\!\!\int (u_k u_l)(x,s)\,dx\,ds\Big) \neq (0,\ c\,\delta_{kl}). \qquad (13)$$
Moreover, it was proved by Miyakawa and Schonbek in [9] that (13) holds if and only if there exist constants $c_0$ and $c_1$ such that
$$0 < c_0 \le t^{5/4}\|u(t)\|_2 \le c_1 \qquad (14)$$
for large $t>0$.
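For the reader's convenience (this computation is added here; it is not in the original text), the embedding $B^{-1}_{1,\infty}\hookrightarrow B^{-5/2}_{2,\infty}$ invoked above follows directly from the Bernstein inequality $\|\Delta_j u\|_2 \le C\,2^{3j/2}\|\Delta_j u\|_1$, valid for functions with Fourier support in the annulus $2^j\mathcal{C}$:
$$\|u\|_{B^{-5/2}_{2,\infty}} = \sup_{j\in\mathbb{Z}} 2^{-5j/2}\|\Delta_j u\|_2 \le C\sup_{j\in\mathbb{Z}} 2^{-5j/2}\,2^{3j/2}\|\Delta_j u\|_1 = C\sup_{j\in\mathbb{Z}} 2^{-j}\|\Delta_j u\|_1 = C\,\|u\|_{B^{-1}_{1,\infty}}.$$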
Since the initial conditions satisfying (11) belong to the space $B^{-5/2}_{2,\infty}\cap L^2_\sigma$, it is clear that Theorem 5 generalizes the result by Miyakawa mentioned above: for initial conditions satisfying (11) and under condition (13) (resp. (14)), both results give lower estimates of $u(t)$, but while Miyakawa's estimate uses the space $B^{-1}_{1,\infty}$, in Theorem 5 we use the space $B^{-5/2}_{2,\infty}$. Since $B^{-1}_{1,\infty}$ is continuously embedded into $B^{-5/2}_{2,\infty}$, the result from Theorem 5 is stronger. Moreover, Theorem 5 also provides lower estimates for solutions with initial conditions not satisfying (11) (in this paper we have not dealt with their existence).

5 Appendix

The definitions and some basic properties of the following concepts can be found in [19]:
• $L^p$, $p\in[1,\infty]$, the Lebesgue space with the norm $\|\cdot\|_p$;
• $W^{k,p}$, $k\in\mathbb{N}$, $p\in[1,\infty]$, the Sobolev space with the norm $\|\cdot\|_{k,p}$;
• $C^\infty_{0,\sigma} = \{\varphi\in(C^\infty_0)^3;\ \nabla\cdot\varphi = 0\}$, the set of smooth solenoidal vector functions with compact support in $\mathbb{R}^3$;
• $L^2_\sigma$, resp. $W^{1,2}_{0,\sigma}$, the closure of $C^\infty_{0,\sigma}$ in $(L^2)^3$, resp. $(W^{1,2})^3$;
• $P_\sigma$, the orthogonal projection of $(L^2)^3$ onto $L^2_\sigma$;
• $A$, the Stokes operator in $L^2_\sigma$ defined as $Au = -P_\sigma\Delta u$ for every $u\in D(A) = W^{1,2}_{0,\sigma}\cap(W^{2,2})^3$; $A$ is a positive self-adjoint operator; in the case of the whole space, $Au = -\Delta u$;
• $\{E_\lambda;\ \lambda\ge 0\}$, the resolution of identity of $A$;
• $A^\mu$, $\mu\in\mathbb{R}$, the powers of $A$ with domains $D(A^\mu)$ and ranges $R(A^\mu)$;
• $\{e^{t\Delta};\ t\ge 0\}$, the semigroup generated by the Laplace operator.

Definition 3. If $u_0\in L^2_\sigma$, a measurable function $u$ defined on $\mathbb{R}^3\times(0,\infty)$ is called a global weak solution of (1)–(3) if $u\in L^\infty((0,\infty); L^2_\sigma)\cap L^2((0,T); W^{1,2}_{0,\sigma})$ for every $T>0$ and the integral relation
$$\int_0^\infty\big[-(u(t),\partial_t\phi(t)) + (\nabla u(t),\nabla\phi(t)) + (u(t)\cdot\nabla u(t),\phi(t))\big]\,dt = (u_0,\phi(0))$$
holds for all $\phi\in C^\infty_0([0,\infty); C^\infty_{0,\sigma})$.

Definition 4. A global weak solution $u$ satisfies the strong energy inequality if
$$\|u(t)\|_2^2 + 2\int_s^t \|\nabla u(\sigma)\|_2^2\,d\sigma \le \|u(s)\|_2^2$$
for $s=0$ and almost all $s>0$, and all $t\ge s$. A global weak solution satisfying the strong energy inequality is called turbulent.

Definition 5. Let $u_0\in D(A)$. A function $u\in C([0,\infty); D(A))\cap C^1((0,\infty); L^2_\sigma)$ is called a global strong solution of (1)–(3) if $u(0)=u_0$ and $du/dt + Au + P_\sigma(u\cdot\nabla u) = 0$ for every $t>0$.

If $u_0\in L^2_\sigma$, then there exists at least one turbulent solution of (1)–(3) (see [4]). Every turbulent solution becomes strong after some transient time (see [19], Chapter V). This means that there exists $t_0\ge 0$ such that $u\in C((t_0,\infty); D(A))\cap C^1((t_0,\infty); L^2_\sigma)$ and $du/dt + Au + P_\sigma(u\cdot\nabla u) = 0$ for every $t>t_0$.

6 Conclusion

In this paper we have presented a survey of some results on the large time decay of energy in turbulent solutions to the Navier-Stokes equations, and the related topic of large time energy concentration in the frequency space. In the fourth section we improved a result from [8] and showed that some Navier-Stokes flows do not decay asymptotically to zero when considered in suitable Besov spaces.

Acknowledgements

This work has been supported by the Ministry of Education of the Czech Republic, under project MSM 6840770003.

References

[1] Brandolese, L.: Asymptotic behavior of the energy and pointwise estimates for solutions to the Navier-Stokes equations. Rev. Mat. Iberoamericana, 20, 2004, 223–256.
[2] Bahouri, H., Chemin, J.-Y., Danchin, R.: Fourier Analysis and Nonlinear Partial Differential Equations. [Fundamental Principles of Mathematical Sciences 343], Heidelberg: Springer, 2011.
[3] Chemin, J.-Y.: Localization in Fourier space and Navier-Stokes system. Pubbl. Cent. Ric. Mat. Ennio Giorgi, Scuola Norm. Sup., vol. I, 53, 2004.
[4] Farwig, R., Kozono, H., Sohr, H.: An $L^q$-approach to Stokes and Navier-Stokes equations in general domains. Acta Math., 195, 2005, 21–53.
[5] Kajikiya, R., Miyakawa, T.: On $L^2$ decay of weak solutions of the Navier-Stokes equations in $\mathbb{R}^n$. Math. Z., 192, 1986, 135–148.
[6] Kato, T.: Strong $L^p$-solutions of the Navier-Stokes equations in $\mathbb{R}^m$, with applications to weak solutions. Math. Z., 187, 1984, 471–480.
[7] Leray, J.: Sur le mouvement d'un liquide visqueux emplissant l'espace. Acta Math., 63, 1934, 193–248.
[8] Miyakawa, T.: On upper and lower bounds of rates of decay for nonstationary Navier-Stokes flows in the whole space. Hiroshima Math. J., 2002, 431–462.
[9] Miyakawa, T., Schonbek, M.: On optimal decay rates for weak solutions to the Navier-Stokes equations in $\mathbb{R}^n$. Math. Bohem., 126, 2001, 443–455.
[10] Scarpellini, B.: Solutions of evolution equations of slow exponential decay. Analysis, 20, 2000, 255–283.
[11] Schonbek, M.: Asymptotic behavior of solutions to the three-dimensional Navier-Stokes equations. Indiana University Mathematics Journal, 41, 1992, 809–823.
[12] Schonbek, M.: Large time behavior of solutions to the Navier-Stokes equations. Comm. Partial Differential Equations, 11, 1986, 753–763.
[13] Schonbek, M.: Lower bounds of rates of decay for solutions to the Navier-Stokes equations. Journal of the American Mathematical Society, 4, 1991, 423–449.
[14] Schonbek, M., Wiegner, M.: On the decay of higher-order norms of the solutions of Navier-Stokes equations. Proceedings of the Royal Society of Edinburgh, 126A, 1996, 677–685.
[15] Skalák, Z.: Some aspects of the asymptotic dynamics of solutions of the homogeneous Navier-Stokes equations in general domains. J. Math. Fluid Mech., 12, 2010, 503–535.
[16] Skalák, Z.: On the asymptotic decay of higher-order norms of the solutions to the Navier-Stokes equations in $\mathbb{R}^3$. Discrete Contin. Dyn. Syst. Ser. S, 3, 2010, 361–370.
[17] Skalák, Z.: Large time behavior of energy in exponentially decreasing solutions of the Navier-Stokes equations. Nonlinear Analysis, 71, 2009, 593–603.
[18] Skalák, Z.: Solutions to the Navier-Stokes equations with large time energy concentration in low frequencies. ZAMM, 91, 2011, 733–742.
[19] Sohr, H.: The Navier-Stokes Equations, An Elementary Functional Analytic Approach. Basel, Boston, Berlin: Birkhäuser Verlag, 2001.
[20] Wiegner, M.: Decay results for weak solutions of the Navier-Stokes equations on $\mathbb{R}^n$. J. London Math. Soc., 35, 1987, 303–313.

A crazy question: can apparently brighter gamma-ray bursts be farther away?

A. Mészáros, J. Řípa, F. Ryde

Abstract: The cosmological relationships between observed and emitted quantities are determined for gamma-ray bursts (GRBs). The relationship shows that apparently fainter bursts need not, in general, lie at larger redshifts. This is possible when the luminosities (or emitted energies) in a sample of bursts increase faster than the dimming of the observed values with redshift. Four different samples of long bursts suggest that this is what really happens.

Keywords: cosmology – miscellaneous, gamma-ray bursts.

1 Introduction

The Burst and Transient Source Experiment (BATSE) instrument at the Compton Gamma Ray Observatory detected around 2700 GRBs (see http://www.batse.msfc.nasa.gov/batse/grb/catalog/), but only a few of these have directly measured redshifts from the optical afterglow (OA) observations [1, 2].
During the last couple of years the situation has improved, mainly due to the observations made by the Swift satellite (http://swift.gsfc.nasa.gov/docs/swift/swiftsc.html). There are already dozens of directly determined redshifts [3]. However, this sample is only a small fraction of the thousands of detected bursts. Besides the direct determination of redshifts from OAs, there are several indirect methods which utilize gamma-ray data alone. In these mainly statistical studies, a key role is played by the assumption that – on average – apparently fainter bursts should lie at smaller redshifts. The purpose of this paper is to show that this is not always the case. The paper is in essence a shortened version of [4].

2 Theoretical considerations

Let the observed peak-flux or fluence $P(z)$ of a GRB be given. This value is given by [5, 6]
$$P(z) = \frac{(1+z)^n\,\tilde{L}(z)}{4\pi d_L(z)^2},$$
where $n$ can be $0, 1, 2$, and where $z$ is the redshift, $d_L(z)$ is the luminosity distance, and $\tilde{L}(z)$ is either the isotropic peak-luminosity or the isotropic emitted energy. There are four possibilities [5, 6]:
• $P(z)$ is a fluence in photons/cm$^2$ units; $n = 2$; $\tilde{L}(z)$ is in photon units,
• $P(z)$ is a fluence in erg/cm$^2$ units; $n = 1$; $\tilde{L}(z)$ is in erg units,
• $P(z)$ is a peak-flux in photons/(cm$^2$s) units; $n = 1$; $\tilde{L}(z)$ is in photons/s units,
• $P(z)$ is a peak-flux in erg/(cm$^2$s) units; $n = 0$; $\tilde{L}(z)$ is in erg/s units.

In the GRB topic, $P(z)$ is usually measured in an energy interval of photons $E_1 < E < E_2$. In addition, it is a good approximation that all arriving photons from this interval are detected, but none from outside. Then $\tilde{L}(z)$ is from the energy interval $E_1(1+z) < E < E_2(1+z)$. It is a standard cosmological behaviour that for small $z$ ($z \ll 0.1$) $d_L(z)\propto z$, and for large redshifts
$$\lim_{z\to\infty} \frac{d_L(z)}{1+z} = \text{a finite positive number}$$
for any $H_0$, $\Omega_M$, $\Omega_\Lambda$ [7]. Hence, for fixed $P(z) = P_0$ and for a given value of $n\le 1$, $\tilde{L}(z)$ is a monotonically increasing function of the redshift, growing along with $(1+z)^{2-n}$. This means that $z_1 < z_2$ implies $\tilde{L}(z_1) < \tilde{L}(z_2)$. Expressing this result in other words: more distant and brighter sources may give the same observed value $P_0$. Now, if a source at $z_2$ has $\tilde{L} > \tilde{L}(z_2)$, its observed value $P'_{obs}$ will have $P'_{obs} > P_0$, i.e. it becomes apparently brighter despite its greater redshift than that of the source at $z_1$. The probability of the occurrence of this kind of inversion depends on $f(\tilde{L}|z)$, the conditional probability density of $\tilde{L}$ given $z$, and on the spatial density of the objects.

Assume now specially that $\tilde{L}(z)\propto(1+z)^q$ with $n+q > 2$ for $z\to\infty$. Then
$$\frac{dP(z)}{dz} > 0 \quad\text{for } z\to\infty.$$
Hence, if $\tilde{L}(z)$ increases fast enough ($n+q > 2$), then it is possible that on average $dP(z)/dz > 0$ can happen. This means that for small $z$, $P(z)$ always decreases as $z^{-2}$; but if $\tilde{L}(z)\propto(1+z)^q$ with $n+q > 2$ at higher redshifts, then $P(z)$ increases as $z^{n+q-2}$ for big $z$; again always. If this is the case, then the question naturally arises: where is $z_{turn}$, where $dP(z)/dz = 0$?

Fig. 1: Left panel: function Q(z) for Ω_M = 0.27 and Ω_Λ = 0.73. Right panel: dependence of z_turn on q for Ω_M = 0.27 and Ω_Λ = 0.73

Fig. 2: Distribution of the fluences (left panel) and peak-fluxes (right panel) of Swift GRBs with known redshifts. The medians separate the area into four quadrants. The objects in the upper right quadrant are brighter and have larger redshifts than those of the GRBs in the lower left quadrant
Mathematically this means that one has to search for the minimum of the function
$$Q(z) = \frac{(1+z)^{n+q}}{d_L(z)^2},$$
i.e., for the point where $dQ(z)/dz = 0$ holds. The results of numerical solutions for $z_{turn}$ are shown in Figure 1.

3 The samples

The expectation that more distant GRBs are on average apparently brighter for the observer can be verified on samples for which there are well-defined redshifts, as well as measured peak-fluxes and/or fluences. We discuss four samples here: BATSE GRBs with known redshifts (8 long GRBs), BATSE GRBs with pseudo-redshifts (13 long GRBs), the Swift sample (134 long GRBs) and the Fermi sample (6 long GRBs). For the Swift sample, the effect of inversion can easily be seen in the scatter plots of the [log fluence; z] and [log peak-flux; z] planes; see Figure 2. To demonstrate the effect of inversion, we marked the medians of the fluence and peak-flux with horizontal lines and the medians of the redshift with vertical dashed lines. The medians split the plotting area into four quadrants. The GRBs in the upper right quadrants are apparently brighter than those in the lower left quadrants, although their redshifts are larger. It is worth mentioning that the GRB with the greatest redshift in the sample has a higher fluence than 50 % of all the items in the sample.

Fig. 3: Distribution of the fluences (left panel) and peak-fluxes (right panel) of GRBs with known redshifts, where Fermi GRBs are denoted by asterisks, and BATSE GRBs with determined redshifts (pseudo-redshifts) are denoted by dots (circles). The medians separate the area into four quadrants. The objects in the upper right quadrant are brighter and have larger redshifts than those of the GRBs in the lower left quadrant

Fig. 4: Distribution of the fluences (left panel) and peak-fluxes (right panel) of Swift GRBs with known redshifts. In the left panel the curves denote the values of fluences for Ẽ_iso = Ẽ_o(1+z)^q (the three constants Ẽ_o in units of 10^51 erg: i. 0.1; ii. 1.0; iii. 10.0). In the right panel the curves denote the values of the peak-fluxes for L̃_iso = L̃_o(1+z)^q (the three constants L̃_o in units of 10^58 ph/s: i. 0.01; ii. 0.1; iii. 1.0)

The fluence (peak-flux) vs. redshift relationships of the Fermi and of the two BATSE samples are summarized in Figure 3. To demonstrate the inversion effect – similarly to the case of the Swift sample – the medians are also marked with dashed lines. Here it is quite evident that some of the distant bursts exceed in their observed fluences and peak-fluxes the values of those with smaller redshifts. Here again the GRBs in the upper right quadrants are apparently brighter than those in the lower left quadrants, although their redshifts are larger. Note that the upper right quadrants are even more populated than the lower right quadrants. In other words, the trend of increasing peak-flux (fluence) with redshift is really evident here.

Using the special assumption $\tilde{L}(z)\propto(1+z)^q$, the effect of inversion may also be illustrated distinctly in the Swift sample, as follows. In Figure 4 the fluences and peak-fluxes are plotted against the redshifts. Let the luminosity distances be calculated for $H_0 = 71$ km/(s Mpc), $\Omega_M = 0.27$ and $\Omega_\Lambda = 0.73$. Then the total emitted energy $\tilde{E}_{iso}$ and the peak luminosity $\tilde{L}_{iso}$ can be calculated with $n = 1$. In the figure, the curves of the fluences and peak-fluxes are shown after substituting $\tilde{L}_{iso} = \tilde{L}_o(1+z)^q$ and $\tilde{E}_{iso} = \tilde{E}_o(1+z)^q$, where $\tilde{L}_o$ and $\tilde{E}_o$ are constants, and $q = 0, 1, 2$. The inverse behaviour is quite obvious for $q > 1$ and roughly for $z > 2$.
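The following minimal C sketch (added for illustration; it is not the authors' code) reproduces this kind of computation: it locates z_turn by a grid search over Q(z) = (1+z)^(n+q)/d_L(z)^2 for the flat cosmology with Ω_M = 0.27 and Ω_Λ = 0.73 used above. The constant prefactor c/H_0 of the luminosity distance cancels in the minimization and is omitted; the exponents n and q are example values, and the grid bounds and step are arbitrary choices.

#include <stdio.h>
#include <math.h>

static double e_of_z(double z)           /* dimensionless Hubble function */
{
    return sqrt(0.27 * pow(1.0 + z, 3.0) + 0.73);
}

static double dl(double z)               /* d_L(z) up to the factor c/H0 */
{
    /* comoving distance integral by the trapezoidal rule */
    int i, steps = 1000;
    double dz = z / steps, sum = 0.0;
    for (i = 0; i < steps; i++) {
        double z1 = i * dz, z2 = (i + 1) * dz;
        sum += 0.5 * dz * (1.0 / e_of_z(z1) + 1.0 / e_of_z(z2));
    }
    return (1.0 + z) * sum;
}

int main(void)
{
    double n = 1.0, q = 2.0;             /* example exponents, n + q > 2 */
    double z, zbest = 0.0, qbest = 1e300;
    for (z = 0.01; z <= 20.0; z += 0.01) {
        double d = dl(z);
        double val = pow(1.0 + z, n + q) / (d * d);
        if (val < qbest) { qbest = val; zbest = z; }
    }
    printf("z_turn ~ %.2f for n+q = %.1f\n", zbest, n + q);
    return 0;
}

For small z the code sees Q(z) fall as z^(-2), and for large z it sees Q(z) grow as (1+z)^(n+q-2), so a unique interior minimum exists whenever n + q > 2, in line with the argument above.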
4 Conclusions

In all samples, both for the fluences and for the peak-fluxes, the "inverse" behaviour can happen. The apparently faintest GRBs need not also be the most distant. The question of the title can really be answered by a clear "yes, they can". It should be noted that no assumptions have been made in this paper about the models of the long GRBs. In addition, no cosmological parameters needed to be specified. The results of this paper can be summarized as follows:
1. A theoretical study of the z-dependence of the observed fluences and peak-fluxes of GRBs has shown that fainter bursts could well have smaller redshifts.
2. This is really fulfilled for the four different samples of long GRBs.
3. These results do not depend on the cosmological parameters and on the GRB models.
4. All this suggests that the estimations (see, for example, [8]) leading to a large fraction of BATSE bursts at z > 5 need not be correct.

Acknowledgement

We wish to thank Z. Bagoly, L. G. Balázs, I. Horváth, S. Larsson, P. Mészáros, D. Szécsi and P. Veres for useful remarks. This study was supported by OTKA grant K77795, by Grant Agency of the Czech Republic grant P209/10/0734, by Research Program MSM0021620860 of the Ministry of Education of the Czech Republic, and by the Swedish National Space Agency.

References

[1] Schaefer, B. E.: ApJ, 583, L67, 2003.
[2] Piran, T.: Rev. Mod. Phys., 76, 1143, 2004.
[3] Mészáros, P.: Rep. Progr. Phys., 69, 2259, 2006.
[4] Mészáros, A., Řípa, J., Ryde, F.: A&A, 529, A55, 2011.
[5] Mészáros, P., Mészáros, A.: ApJ, 449, 9, 1995.
[6] Mészáros, A., Mészáros, P.: ApJ, 466, 29, 1996.
[7] Weinberg, S.: Gravitation and Cosmology. New York: J. Wiley and Sons, 1972.
[8] Lin, J. R., Zhang, S. N., Li, T. P.: ApJ, 605, 819, 2004.

Attila Mészáros, Jakub Řípa: Charles University, Faculty of Mathematics and Physics, Astronomical Institute, V Holešovičkách 2, 180 00 Prague 8, Czech Republic. Felix Ryde: Department of Physics, Royal Institute of Technology, AlbaNova University Center, SE-106 91 Stockholm, Sweden.

Finite automata implementations considering CPU cache

J. Holub

Finite automata are mathematical models for finite state systems. The more general finite automaton is the nondeterministic finite automaton (NFA), which cannot be used directly. It is usually transformed to the deterministic finite automaton (DFA), which then runs in time $\mathcal{O}(n)$, where $n$ is the size of the input text. We present two main approaches to practical implementation of DFA considering CPU cache. The first approach (represented by table driven and hard coded implementations) is suitable for automata being run very frequently, typically having cycles. The other approach is suitable for a collection of automata from which various automata are retrieved and then run. This second kind of automata is expected to be cycle-free.

Keywords: deterministic finite automaton, CPU cache, implementation.

1 Introduction

The original formal study of finite state systems (neural nets) is from 1943 by McCulloch and Pitts [14]. In 1956 Kleene [13] modeled the neural nets of McCulloch and Pitts by finite automata. At that time similar models were presented by Huffman [12], Moore [17], and Mealy [15]. In 1959, Rabin and Scott introduced nondeterministic finite automata (NFA) in [21]. Finite automata theory is a well-developed theory. It deals with regular languages, regular expressions, regular grammars, NFAs, deterministic finite automata (DFAs), and various transformations among the previously listed formalisms. The final product of the theory towards practical implementation is a DFA. The DFA then runs theoretically in time $\mathcal{O}(n)$, where $n$ is the size of the input text. However, in practice we have to consider the CPU cache, which strongly influences the speed. The CPU has two levels of cache, displayed in Fig. 1. The level 1 (L1) cache is located on chip; it takes about 2–3 CPU cycles to access data in the L1 cache. The level 2 (L2) cache may be on chip or may be external; it has about a 10-cycle access time. A main memory access takes 150–200 cycles, and a hard disc drive access takes even $10^6$ times more time. Therefore it is obvious that the CPU cache significantly influences a DFA run. We cannot control the CPU cache use directly, but knowing the CPU cache strategies we can implement the DFA run in such a way that the CPU cache would most likely be used efficiently.

Fig. 1: Memory cache hierarchy (CPU core, L1 cache, L2 cache, RAM)
We distinguish two kinds of use of DFA, and for each of them we describe the most suitable implementation. In Section 2 we define the nondeterministic finite automaton and discuss its usage. Section 3 then describes general techniques for DFA implementation. They are mostly suitable for a DFA that is run most of the time; since a DFA has a finite set of states, this kind of DFA has to have cycles. Recent results on implementation using the CPU cache are discussed in Section 4. On the other hand, we may have a collection of DFAs, each representing some document (e.g., in the form of a complete index in the case of factor or suffix automata). Such a DFA is used only when properties of the corresponding document are examined. Such an automaton usually does not have cycles. There are different requirements for the implementation of such DFAs. Suitable implementations are described in Section 5.

2 Nondeterministic finite automaton

A nondeterministic finite automaton (NFA) is a quintuple $(Q, \Sigma, \delta, q_0, F)$, where $Q$ is a finite set of states, $\Sigma$ is a set of input symbols, $\delta$ is a mapping $Q\times(\Sigma\cup\{\varepsilon\})\to\mathcal{P}(Q)$, $q_0\in Q$ is an initial state, and $F\subseteq Q$ is a set of final states. A deterministic finite automaton (DFA) is a special case of an NFA, where $\delta$ is a mapping $Q\times\Sigma\to Q$. In the previous definition we talk about a completely defined DFA, where for each source state and each input symbol exactly one destination state is defined. However, there is also the partially defined DFA, where for each source state and each input symbol at most one destination state is defined. A partially defined DFA can be transformed to a completely defined DFA by introducing a new state (a so-called sink state), which has a self loop for each symbol of $\Sigma$ and into which all non-defined transitions of all states lead. There are also NFAs with more than one initial state. Such NFAs can be transformed to NFAs with one initial state by introducing a new initial state from which $\varepsilon$-transitions lead to all former initial states.

Fig. 2: A deterministic finite automaton (states 0–4 with transitions 0→1 on 'a', 0→4 on 'd', 1→2 on 'b', 1→3 on 'c', 2→3 on 'c', 2→4 on 'd', 3→4 on 'd'; state 4 is final)

An NFA accepts a given input string $w\in\Sigma^*$ if there exists a path (a sequence of transitions) from the initial state to a final state spelling $w$. The problem occurs when for a pair $(q,a)$, $q\in Q$, $a\in\Sigma$ (i.e., state $q$ of the NFA is active and $a$ is the current input symbol), there are more possibilities how to continue:
1. There is more than one transition labeled by $a$ outgoing from state $q$, that is $|\delta(q,a)| > 1$.
2. There is an $\varepsilon$-transition in addition to other transitions outgoing from the same state.

In such a case the NFA cannot decide, having only the knowledge of the current state and the current input symbol, which transition to take. Due to this nondeterminism the NFA cannot be used directly. There are two options:
1. We can transform the NFA to the equivalent DFA using the standard subset construction [21]. However, this may lead to an exponential increase in the number of states ($2^{|Q_{NFA}|}$ states, where $|Q_{NFA}|$ is the number of states of the original NFA). The resulting DFA then runs in linear time with respect to the size of the input text.
2. We can simulate the run of the NFA in a deterministic way. We can use the basic simulation method [7, 6], usable for any NFA. For an NFA with a regular structure (like in the exact and approximate pattern matching field) we can use the bit parallelism [16, 7, 6, 10] or dynamic programming [16, 8, 6] simulation methods, which improve the running time of the basic simulation method in this special case.

The simulation runs slower than the DFA, but the memory requirements are much smaller. Practical experiments were given in [11].

3 Deterministic finite automaton implementation

Further in the text we do not consider simulation techniques; we consider only the DFA. The DFA runs theoretically in time $\mathcal{O}(n)$, where $n$ is the size of the input text. There are two main techniques for the implementation of a DFA:
1. Table driven (TD): the mapping $\delta$ is implemented as a transition matrix of size $|Q|\times|\Sigma|$ (the transition table). The current state number is held in a variable $q_{curr}$, and the next state number is retrieved from the transition table from row $q_{curr}$ and column $a$, where $a$ is the current input symbol.
2. Hard coded (HC) [22]: the transition table $\delta$ is represented as programming language code. For each state there is a place starting with a state label. Then there is a sequence of conditional jumps, where based on the current input symbol the corresponding goto command to the destination state label is performed.

3.1 Table driven

An example of a TD implementation is shown in Fig. 3. For a partially defined DFA one has to either transform it to a completely defined DFA or handle the case when an undefined transition should be used. Obviously the TD implementation is very efficient for a completely defined DFA or DFAs with a non-sparse transition table. It can also be used very efficiently in programs where the DFA is constructed from a given input and then run: in such a case it can be easily stored into the transition matrix, and the code for the DFA run is then independent of the content of the transition matrix. The TD implementation is also very convenient for a hardware implementation, where the transition matrix is represented by a memory chip.

Fig. 3: Table driven implementation of DFA from Fig. 2

int dfa_td() {
    int state = 0, symbol;
    while ((symbol = getchar()) != EOF) {
        state = transition_table[state][symbol];
    }
    return is_final[state];
}

3.2 Hard coded

An example of an HC implementation is shown in Fig. 4. The implementation can work with a partially defined DFA in this case. The HC implementation may save some space when used for a partially defined DFA, where the transition matrix would be sparse. It cannot be used in programs where the DFA is constructed from the input: when the DFA is constructed, a hard coded part of the program has to be generated in a programming language, then compiled and executed. This technique would need calls of several programs (compiler, linker, the DFA program itself) and would be very inefficient.

Fig. 4: Hard coded implementation of DFA from Fig. 2

int dfa_hc() {
    int symbol;
state0:
    if ((symbol = getchar()) == EOF) return 0;
    switch (symbol) {
    case 'a': goto state1;
    case 'd': goto state4;
    default: return -1;
    }
state1:
    if ((symbol = getchar()) == EOF) return 0;
    switch (symbol) {
    case 'b': goto state2;
    case 'c': goto state3;
    default: return -1;
    }
state2:
    if ((symbol = getchar()) == EOF) return 0;
    switch (symbol) {
    case 'c': goto state3;
    case 'd': goto state4;
    default: return -1;
    }
state3:
    if ((symbol = getchar()) == EOF) return 0;
    switch (symbol) {
    case 'd': goto state4;
    default: return -1;
    }
state4:
    if ((symbol = getchar()) == EOF) return 1;
    return -1;
}
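The completion step mentioned in Section 3.1 (turning a partially defined DFA into a completely defined one by adding a sink state, as described in Section 2) is straightforward. Here is a minimal sketch; the helper name and the convention of marking undefined transitions with -1 are assumptions for illustration, not taken from the paper:

#define ALPHABET 256

/* the table must have room for states+1 rows;
   returns the index of the newly added sink state */
int add_sink_state(int table[][ALPHABET], int states)
{
    int sink = states, q, a;
    for (a = 0; a < ALPHABET; a++)
        table[sink][a] = sink;           /* self loop on every symbol */
    for (q = 0; q < states; q++)
        for (a = 0; a < ALPHABET; a++)
            if (table[q][a] < 0)         /* -1 marks an undefined transition */
                table[q][a] = sink;
    return sink;
}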
Note that we cannot use the recursive descent [1] approach from LL(k) top-down parsing, where each state could be represented by a function recursively calling the function representing the following state. In such a case the system stack would overflow, since the DFA would return from the function calls only at the end of the run; there would be as many nested function calls as the size of the input text. However, Ngassam's implementation [18] uses a function for each state, but the function (with the current input symbol given as a parameter) returns the index of the next state, and then the next state's function (with the next input symbol given as a parameter) is called.
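A minimal sketch of this function-per-state variant follows; it is assembled here for illustration from the description above and from the transitions of the DFA in Fig. 2, and is not code from the paper. Because each state function returns rather than recursing, the call stack depth stays constant:

#include <stdio.h>

static int state0(int c) { return c == 'a' ? 1 : (c == 'd' ? 4 : -1); }
static int state1(int c) { return c == 'b' ? 2 : (c == 'c' ? 3 : -1); }
static int state2(int c) { return c == 'c' ? 3 : (c == 'd' ? 4 : -1); }
static int state3(int c) { return c == 'd' ? 4 : -1; }
static int state4(int c) { return -1; }  /* final state, no transitions */

int dfa_run(void)
{
    int (*next[])(int) = { state0, state1, state2, state3, state4 };
    int state = 0, symbol;
    while ((symbol = getchar()) != EOF) {
        state = next[state](symbol);
        if (state < 0) return -1;        /* undefined transition */
    }
    return state == 4;                   /* state 4 is the only final state */
}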
4 DFA with cycles

TD and HC implementations (and their combination, called mixed-mode – MM) were heavily examined by Ngassam [20, 18]. His implementations use a data structure that most likely will be stored in the CPU cache. For each of the TD and HC implementations he developed three strategies to use the CPU cache efficiently: dynamic state allocation (DSA), state pre-ordering (SPO), and allocated virtual caching (AVC). The DSA strategy was suggested in [19] and was proved to outperform TD when a large-scale DFA is used to recognize very long strings that tend to repeatedly visit the same set of states. SPO relies on a degree of prior knowledge about the order in which states are likely to be visited at run-time. It was shown that the associated algorithm outperforms its TD counterpart no matter what kind of string is being processed. The AVC strategy reorders the transition table at run time and also leads to better performance when processing strings that visit a limited number of states. Ngassam's approach can be efficiently exploited in DFAs where some states are frequently visited (like in DFAs with cycles). In both the TD and HC variants of Ngassam's implementations the transition table is expected to have the same number of items in each row (i.e., each state having the same number of outgoing transitions), and a fixed-size structure is used for each row of the transition table. Therefore, for a sparse transition matrix the method is not very memory efficient.

5 Acyclic DFA

Another approach is used for acyclic DFAs. In these automata each state is visited just once during the DFA run. The suffix automaton and the factor automaton (automata recognizing all suffixes and factors of the given string, respectively) [3, 4] are of this kind. Given a pattern, they verify whether the pattern is a suffix or a factor of the original string in time linear in the length of the pattern, regardless of the size of the original string. An efficient implementation of the suffix automaton (also called DAWG – directed acyclic word graph) was created by Balík [2]. An implementation of the compact version of the suffix automaton, called the compact suffix automaton (also called compact DAWG), was presented by Crochemore and Holub in [9]. Both these implementations are very efficient in terms of memory used (about 1.1–5 bytes per input string symbol). The factor and suffix automata are usually built over whole texts, typically several megabytes long. Instead of storing the transition table as a matrix like in the TD implementation, the whole automaton is stored in a bit stream. The bit stream contains a sequence of states, each containing a list of all outgoing transitions (i.e., a sparse matrix representation).

The key feature of both implementations is a topological ordering of states. It ensures that we never go back in the bit stream when traversing the automaton. This minimizes main memory (or hard disc drive) accesses. Balík's implementation is focused on the smallest memory use. It uses some data compression techniques. It also exploits the fact that both factor and suffix automata are homogeneous automata [5], where each state has all incoming transitions labeled by the same symbol. Therefore the label of the incoming transitions is stored in the destination state, and an outgoing transition only points to the destination state, where the corresponding transition label is stored. On the other hand, Holub's implementation also considers the speed of traversing. Each state contains all outgoing transitions together with their transition labels, like in Fig. 5. (However, the DFA represented in Fig. 5 is neither a suffix nor a factor automaton.) It is not as memory efficient as Balík's implementation, but it reduces main memory (or hard disc drive) accesses. It exploits the locality of data – the principle used by the CPU cache. When a state is reached during the DFA run, the whole segment around the state is loaded into the CPU cache (from main memory or the hard disc drive). The decision which transition to take is then based only on the information in the segment (in the CPU cache), and no accesses to other segments (i.e., possible memory/HDD accesses) are needed, whereas in Balík's implementation one needs to access all the destination states to retrieve the transition labels of the corresponding transitions. Holub's implementation uses at most as many main memory/HDD accesses as there are states traversed.

Fig. 5: A sketch of the bitstream implementation of the DFA from Fig. 2 (state numbers and their transition lists laid out consecutively in one bit stream)
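To make the description above concrete, here is a minimal sketch of traversing such a sparse, record-per-state representation. The record layout is a simplification invented for this illustration; it is neither Balík's compressed format nor Holub's exact bit stream, but it shows the two properties the text emphasizes: each state's outgoing edges are stored contiguously (one segment per decision), and topologically ordered targets mean the traversal only moves forward:

#include <stddef.h>

struct edge  { unsigned char label; size_t target; };
struct state { int is_final; int degree; const struct edge *edges; };

/* returns 1 if the pattern is accepted, 0 otherwise */
int traverse(const struct state *states, size_t start,
             const unsigned char *pattern, size_t len)
{
    size_t cur = start, i;
    for (i = 0; i < len; i++) {
        const struct state *s = &states[cur];
        int j, found = 0;
        for (j = 0; j < s->degree; j++)      /* linear scan of one record */
            if (s->edges[j].label == pattern[i]) {
                cur = s->edges[j].target;    /* topological order: only forward */
                found = 1;
                break;
            }
        if (!found) return 0;                /* no such transition */
    }
    return states[cur].is_final;
}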
6 Conclusion

The paper presents two approaches to DFA implementation considering the CPU cache. The first approach is suitable for a DFA with cycles, where we expect some states to be visited frequently. HC and TD implementations for DFAs with a non-sparse transition table were discussed. The other approach is suitable for an acyclic DFA with a sparse transition table. This approach saves memory, but it runs slower than the previous one – instead of a direct transition table access (with coordinates given by the current state and the current input symbol), a linked list of outgoing transitions of a given state is linearly traversed. However, reducing the memory used for the transition table increases the probability that the next state is already in the CPU cache, which also increases the speed of the DFA run. The first approach is suitable for DFAs that are running all the time, like for example an anti-virus filter on a communication line. The second approach is suitable for a collection of DFAs from which one is selected and then run. That is, for example, the case of suffix or factor automata built over a collection of documents stored on a hard disk; the task is then, for a given pattern, to find all documents containing the pattern.

Acknowledgment

This research has been partially supported by the Ministry of Education, Youth and Sports under research program MSM 6840770014 and the Czech Science Foundation as project No. 201/06/1039.

References

[1] Aho, A. V., Sethi, R., Ullman, J. D.: Compilers – Principles, Techniques and Tools. Addison-Wesley, Reading, MA, 1986.
[2] Balík, M.: DAWG versus suffix array. In: J.-M. Champarnaud, D. Maurel (eds.): Implementation and Application of Automata, number 2608 in Lecture Notes in Computer Science, p. 233–238. Springer-Verlag, Heidelberg, 2003.
[3] Blumer, A., Blumer, J., Ehrenfeucht, A., Haussler, D., Chen, M. T., Seiferas, J.: The smallest automaton recognizing the subwords of a text. Theor. Comput. Sci., vol. 40 (1985), no. 1, p. 31–55.
[4] Blumer, A., Blumer, J., Ehrenfeucht, A., Haussler, D., McConnel, R.: Complete inverted files for efficient text retrieval and analysis. J. Assoc. Comput. Mach., vol. 34 (1987), no. 3, p. 578–595.
[5] Champarnaud, J.-M.: Subset construction complexity for homogeneous automata, position automata and ZPC-structures. Theor. Comput. Sci., vol. 267 (2001), no. 1–2, p. 17–34.
[6] Holub, J.: Simulation of Nondeterministic Finite Automata in Pattern Matching. Ph.D. thesis, Czech Technical University in Prague, Czech Republic, 2000.
[7] Holub, J.: Bit parallelism – NFA simulation. In: B. W. Watson, D. Wood (eds.): Implementation and Application of Automata, number 2494 in Lecture Notes in Computer Science, p. 149–160. Springer-Verlag, Heidelberg, 2002.
[8] Holub, J.: Dynamic programming – NFA simulation. In: J.-M. Champarnaud, D. Maurel (eds.): Implementation and Application of Automata, number 2608 in Lecture Notes in Computer Science, p. 295–300. Springer-Verlag, Heidelberg, 2003.
[9] Holub, J., Crochemore, M.: On the implementation of compact DAWG's. In: J.-M. Champarnaud, D. Maurel (eds.): Implementation and Application of Automata, number 2608 in Lecture Notes in Computer Science, p. 289–294. Springer-Verlag, Heidelberg, 2003.
[10] Holub, J., Iliopoulos, C. S., Melichar, B., Mouchard, L.: Distributed string matching using finite automata. In: R. Raman, J. Simpson (eds.): Proceedings of the 10th Australasian Workshop on Combinatorial Algorithms, p. 114–128, Perth, WA, Australia, 1999.
[11] Holub, J., Špiller, P.: Practical experiments with NFA simulation. In: L. Cleophas, B. W. Watson (eds.): Proceedings of the Eindhoven FASTAR Days 2004, TU Eindhoven, The Netherlands, 2004, p. 73–95.
[12] Huffman, D. A.: The synthesis of sequential switching circuits. J. Franklin Institute, vol. 257 (1954), p. 161–190, 275–303.
[13] Kleene, S. C.: Representation of events in nerve nets and finite automata. Automata Studies, 1956, p. 3–42.
[14] McCulloch, W. S., Pitts, W.: A logical calculus of the ideas immanent in nervous activity. Bull. Math. Biophysics, vol. 5 (1943), p. 115–133.
[15] Mealy, G. H.: A method for synthesizing sequential circuits. Bell System Technical J., vol. 34 (1955), no. 5, p. 1045–1079.
[16] Melichar, B.: Approximate string matching by finite automata. In: V. Hlaváč, R.
Šára (eds.): Computer Analysis of Images and Patterns, number 970 in Lecture Notes in Computer Science, p. 342–349. Springer-Verlag, Berlin, 1995.
[17] Moore, E. F.: Gedanken experiments on sequential machines. Automata Studies, 1956, p. 129–153.
[18] Ngassam, E. K.: Towards Cache Optimization in Finite Automata Implementations. Ph.D. thesis, University of Pretoria, South Africa, 2006.
[19] Ngassam, E. K., Kourie, D. G., Watson, B. W.: Reordering finite automata states for fast string recognition. In: J. Holub, M. Šimánek (eds.): Proceedings of the Prague Stringology Conference '05, Czech Technical University in Prague, Czech Republic, 2005, p. 69–80.
[20] Ngassam, E. K., Kourie, D. G., Watson, B. W.: On implementation and performance of table-driven DFA-based string processors. In: J. Holub, J. Žďárek (eds.): Proceedings of the Prague Stringology Conference '06, Czech Technical University in Prague, Czech Republic, 2006, p. 108–122.
[21] Rabin, M. O., Scott, D.: Finite automata and their decision problems. IBM J. Res. Dev., vol. 3 (1959), p. 114–125.
[22] Thompson, K.: Regular expression search algorithm. Commun. Assoc. Comput. Mach., vol. 11 (1968), p. 419–422.

Ing. Jan Holub, Ph.D. (e-mail: holub@fel.cvut.cz), Department of Computer Science and Engineering, Czech Technical University in Prague, Faculty of Electrical Engineering, Karlovo nám. 13, 121 35 Prague 2, Czech Republic.

Non-contact monitoring of heart and lung activity by magnetic induction measurement

M. Steffen, S. Leonhardt

In many clinical applications, the monitoring of heart and lung activity is of vital importance. State-of-the-art monitoring involves the use of electrodes or other contact-based sensors (electrocardiogram (ECG), impedance cardiography (ICG), pulse oximetry or equivalent). With the equipment that is used, side effects like skin irritation, difficult application or additional cabling may occur. In contrast, this paper describes a method for non-contact monitoring of heart and lung activity, which is solely based on magnetic induction. This method allows simultaneous monitoring of heart and lung activity, and has the potential of an integrated application in a personal healthcare scenario. To illustrate the performance, a simple test setup has been developed and the first results are presented here (some of which have been previously presented on the POSTER 2008 [10]).

Keywords: magnetic induction, non-contact, monitoring, heart, lung.

1 Introduction

Many first and second world countries are facing dramatic demographic changes. Mainly due to longer life expectancy and reduced natality, societies are growing older [1]. As ageing is accompanied by increasing costs for medical care, there are increasing efforts to develop smart integrated medical devices to support people with health problems in their everyday life, and especially in their domestic environment. Within the personal healthcare scenario, monitoring of heart and lung activity is of special interest. Continuous 24/7 monitoring techniques for an intelligent warning system, making use of a range of sensor information, are required for patients at risk (e.g. of post myocardial infarction). Special focus needs to be given to the time when patients are alone at home, and especially when they are sleeping. During this period, which covers roughly 1/3 of the day, neither the patient nor his relatives or neighbours can raise an alarm. The need for long-term monitoring is accompanied by further constraints: for example, movement should not be restrained, and long-term skin irritation must be avoided. At best, the measurements should not be noticeable. From a physical point of view, potential sensors for imperceptible mobile monitoring can be based on capacitive, magnetic or optical measurement techniques. An approach based on magnetic induction is presented here.

Fig. 1: Shapes of the German population pyramid changing in time (modified from [1])

2 Background

A healthy life is based on the fact that the heart pumps blood through the body, and breathing exchanges the air inside the lungs while changing the lung volume. The effect of these two mechanical actions is to shift well-conducting matter (blood) and poorly-conducting matter (air) inside the chest region. As shown by Tarjan [2], a measurement system based on magnetic induction can monitor these changing impedances in the human body.
From Maxwell's theory [3], it is principally known that:
$$\nabla\times E = -\frac{\partial B}{\partial t},$$
$$J = \sigma E,$$
$$\nabla\times H = J + \frac{\partial}{\partial t}(\varepsilon E),$$
where $\nabla$ is the nabla operator, $E$ denotes the electric field, $B$ denotes magnetic induction, and $J$ represents current density. As the re-induced field is related to the impedance distribution inside the body, the impedance changes directly affect the re-induced voltage in the whole area (see Fig. 2). If the frequency and the permittivity $\varepsilon$ are low, the displacement current can be neglected. Previous works have already indicated that this approach is promising for heart activity [4] and lung activity [5] monitoring. However, due to the superposition of the re-induced fields, simultaneous measurement of heart and lung activity is rather difficult. As a simple coil setup might not be sufficient in all cases (see [7]), a multi-coil setup has been developed and tested.

Fig. 2: Concept of magnetic induction measurements [9]. Excitation field penetrates the body. Eddy current is induced and the resulting re-induced field is measured.

3 Measurement setup

The structure of the resulting measurement setup is depicted in Fig. 3, and consists of several blocks:
1. A signal generator (direct digital synthesis – DDS) to provide all required frequencies.
2. Multiple parallel receivers in which the HF signals are down-converted into the baseband.
3. A measurement PC, which performs A/D conversion and all further signal-processing steps.
4. A sensor head, which holds the actual excitation and measurement coils.
5. A control and timing block, which generates all required reference clocks and controls the signal generator and receivers.
6. Some reference sensors, including a pulse plethysmograph (PPG), an airflow sensor and a DCF radio-clock module.

A sinusoidal signal from the signal generator is amplified and fed to the excitation coils, thus generating the excitation field. This field penetrates the body and induces an eddy voltage. This eddy voltage drives an eddy current, which is influenced by the excitation field as well as the tissue impedance. This eddy current leads to a re-induced magnetic field, which superimposes the excitation field. The resulting field is measured by the receiver coils. The voltage induced by the magnetic field into the receiver coils is pre-amplified and then fed to the amplifier and demodulator stage. Here it is mixed down to a 20–100 kHz baseband signal. This signal is sampled and A/D converted.

Fig. 3: Overview of the measurement hardware. Excitation frequencies are generated by DDS synthesizers and fed to the excitation coils. The resulting field penetrates the body and produces the re-induced field. This is picked up by the measurement coils, pre-amplified and demodulated in a combined hardware/software approach (see [9]).
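As a rough illustration of the software part of this demodulation chain, the following sketch mixes the sampled baseband signal with a cosine and sine at the known intermediate frequency and averages over one block, yielding the in-phase and quadrature components from which amplitude and phase follow. This is one standard way to do it, assumed here for illustration; it is not the authors' actual implementation, and f_if, fs and the block length are placeholders:

#include <math.h>

static const double PI = 3.14159265358979323846;

void demodulate(const double *x, int n, double fs, double f_if,
                double *amplitude, double *phase)
{
    double i_sum = 0.0, q_sum = 0.0;
    int k;
    for (k = 0; k < n; k++) {
        double w = 2.0 * PI * f_if * k / fs;
        i_sum += x[k] * cos(w);          /* in-phase component */
        q_sum += x[k] * sin(w);          /* quadrature component */
    }
    i_sum = 2.0 * i_sum / n;             /* block averaging acts as a low-pass */
    q_sum = 2.0 * q_sum / n;
    *amplitude = sqrt(i_sum * i_sum + q_sum * q_sum);
    *phase     = atan2(q_sum, i_sum);
}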
The further processing is carried out in software on the measurement PC. The hardware described above was used in combination with the coil array, which is composed of two excitation coils (lying) and three measurement coils (standing); this provides six independent channels of inductive coupling. Experimentally, the sensor head was placed beneath a sun lounger, on which a healthy adult volunteer was resting, as shown in Fig. 4.

Fig. 4: Picture of the measurement setup

4 Results

Using this setup, the magnetic induction of the previously mentioned six channels was simultaneously recorded, together with the signals of a pulse plethysmograph (Nellcor DS-100A fingerclip & self-made electronics) and an airflow sensor based on differential pressure measurements (General Electric flow resistance & Hoffrichter CPS 1B differential pressure sensor). For illustration, Fig. 5 shows all recorded signals during a selected period of time.

Fig. 5: Overview of all signals measured simultaneously (from top: induction channels 1–6 (mV), PPG, airflow (ml)). At 950 s the inductive signal and the heart reference show an artefact presumably due to body motion.

In this dataset it is especially interesting that the artefact shortly before t = 950 s seems to affect both the PPG and the magnetic induction. Especially its strong appearance in channel 1, which otherwise shows little heart-related content, makes it rather likely that this is a movement artefact and not an extrasystole. To find out whether monitoring in a bed is feasible in principle, measurements were undertaken with four different body positions:
1. lying on the chest
2. lying on the left side
3. lying on the back
4. lying on the right side

In order to keep this preliminary testing as simple as possible, no further disturbance, e.g. by motion artefacts, was introduced, and moving metal parts were removed from the measurement area. During the measurements no mobile phones were carried, though no negative effects on the measurement due to mobile phones have ever been experienced. Magnetic shielding of the measurement setup is not required. In all four positions, respiratory frequency monitoring was possible without any problem, as shown in Fig. 6 and Fig. 7, which show the four positions for 20 s each.

Fig. 6: Overview of the respiration-related signals measured from chest and back
Fig. 7: Overview of the respiration-related signals measured from left and right side
Fig. 8: Comparison of inductive measurement (ch. 4) vs. respiratory reference (volunteer lying on the chest)

From the experience in [8], it is likely that the difference between the induction signal and the reference signal is not due to errors in the inductive monitoring, but rather due to drift, saturation and leakage in the reference sensor. As expected, heart-activity monitoring is more demanding, though in position 1 (chest) band-pass filtering with cut-off frequencies of 0.75 Hz and 5 Hz was sufficient to derive the signals shown in Fig. 8 and Fig. 9, where the pulse plethysmograph and the magnetic induction bear strong similarity. Similar results were obtained when the volunteer was lying on the left side; as the distance to the centre of the heart is slightly larger, the resulting inductive signal is a bit weaker, as shown in Fig. 10.

Fig. 9: Comparison of inductive measurement (ch. 4) vs. heart reference (PPG) (volunteer lying on the chest)
Fig. 10: Comparison of inductive measurement (ch. 6) vs. heart reference (PPG) (volunteer lying on the left side)

For the other positions further processing was required. This especially utilizes the availability of multiple simultaneously recorded induction channels. In the case of the volunteer lying on his back, noise was reduced by subtracting channel 3 from channel 4. The resulting signal and its comparison to the reference signal are shown in Fig. 11 and Fig. 12. In the case of the volunteer lying on the right side, signal noise was reduced by combining two channels (ch. 4 and ch. 6), which leads to the signals depicted in Fig. 13 and Fig. 14. In all cases, the required scaling of the channels considered was performed manually. Automated channel selection and combination is under development.
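The processing steps just described (weighted channel subtraction or combination, then band-pass filtering to roughly the 0.75–5 Hz heart band) could look like the following minimal sketch. The first-order high-pass/low-pass cascade and all names are illustrative assumptions, not the authors' code; the weights w4 and w3 stand in for the manual scaling mentioned above:

#include <math.h>

static const double PI = 3.14159265358979323846;

void heart_band(const double *ch4, const double *ch3, double *out,
                int n, double fs, double w4, double w3)
{
    double dt = 1.0 / fs;
    double a_hp = 1.0 / (1.0 + 2.0 * PI * 0.75 * dt);                 /* ~0.75 Hz */
    double a_lp = (2.0 * PI * 5.0 * dt) / (1.0 + 2.0 * PI * 5.0 * dt); /* ~5 Hz */
    double hp = 0.0, lp = 0.0, prev = 0.0;
    int k;
    for (k = 0; k < n; k++) {
        double x = w4 * ch4[k] - w3 * ch3[k];   /* drift compensation */
        hp = a_hp * (hp + x - prev);            /* first-order high-pass */
        prev = x;
        lp += a_lp * (hp - lp);                 /* first-order low-pass */
        out[k] = lp;
    }
}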
from the experience in [8], it is likely that the difference between the induction signal and the reference signal is not due to errors in the inductive monitoring, but rather due to drift, saturation and leakage in the reference sensor. as expected, heart-activity monitoring is more demanding, though in position 1 (chest) band-pass filtering with cut-off frequencies of 0.75 hz and 5 hz was sufficient to derive the signals shown in fig. 8 and fig. 9, where the pulse plethysmograph and the magnetic induction bear strong similarity. similar results were obtained when the volunteer was lying on the left side; as the distance to the centre of the heart is slightly larger, the resulting inductive signal is a bit weaker, as shown in fig. 10. for the other positions further processing was required. this especially utilizes the availability of multiple simultaneously recorded induction channels.

fig. 9: comparison inductive meas. (ch. 4) vs. heart reference (ppg) (volunteer lying on the chest)
fig. 10: comparison inductive meas. (ch. 6) vs. heart reference (ppg) (volunteer lying on his left side)

in the case of the volunteer lying on his back, noise was reduced by subtracting channel 3 from channel 4. the resulting signal and its comparison to the reference signal are shown in fig. 11 and fig. 12. in the case of the volunteer lying on the right side, signal noise was reduced by combining two channels (ch. 4 and ch. 6), which leads to the signals depicted in fig. 13 and fig. 14. in all cases, the required scaling of the channels considered was performed manually. automated channel selection and combination is under development.

fig. 11: drift reduction / compensation (ch. 4 − ch. 3), result shown below (volunteer in back position)
fig. 12: comparison inductive meas. (drift compensation ch. 4 − ch. 3) vs. heart reference (ppg) (volunteer in back position)

5 conclusion

the elimination of electric contact between a medical device and the patient's body is a promising idea, and this paper presents one feasible approach to non-contact monitoring, based on magnetic induction. this approach is especially suitable for integration into a normal bed. though the measurements still required some manual post-processing, lung and heart related signal contents were detectable and distinguishable in all positions. further work will be done on automatic scaling and multivariate combination of the measured channels, and on taking into account motion artefacts and em interference. the problem of respiratory and heart rate detection will also be addressed.

fig. 13: noise reduction, combining ch. 4 and ch. 6, result shown below (volunteer lying on the right side)
fig. 14: comparison inductive meas. (combination ch. 4 + ch. 6) vs. heart reference (ppg) (volunteer lying on the right side)

acknowledgments

this work was supported in part by the aachen competence centre for biomedical engineering (akm) and the federal ministry for education and research (bmbf), germany. ongoing research is supported by the german research foundation (dfg, ste 1811/1-1).

references

[1] statistisches bundesamt (federal german office for statistics): bevölkerung deutschlands bis 2050 ("german population until 2050"). www.destatis.de, 2003.
[2] tarjan, p., mcfee, r.: electrodeless measurements of the effective resistivity of the human torso and head by magnetic induction. ieee trans. on biomedical engineering, 1968.
[3] maxwell, j. c.: lehrbuch der elektrizität und des magnetismus. 1st ed., berlin: springer, 1883.
[4] guardo, r., trudelle, s., adler, a., et al.: contactless recordings of cardiac related thoracic conductivity changes. proceedings of the engineering in medicine and biology conference, canada, 1995, p. 1581–1582.
[5] richter, a., adler, a.: eddy current based flexible sensor for contactless measurement of breathing. proceedings imtc 2005 instrumentation and measurement, ottawa, canada, 2005.
[6] steffen, m., leonhardt, s.: requirement estimation of simultaneous heart and lung activity monitoring with magnetic impedance tomography systems. wc on med. phys. and biomed. eng. 2006, korea.
[7] steffen, m., aleksandrowicz, a., leonhardt, s.: mobile non-contact monitoring of heart and lung activity. ieee trans. biomed. circ. and sys., vol. 1 (2007), no. 4.
[8] steffen, m., heimann, k., bernstein, n., gonzalo, d., leonhardt, s.: kontaktloses monitoring von vitalparametern bei neugeborenen ("contactless monitoring of vital parameters in newborns"). 41. jahrestagung der deutschen gesellschaft für biomedizinische technik im vde – bmt 2007, 2007.
[9] steffen, m., heimann, k., bernstein, n., leonhardt, s.: multichannel simultaneous magnetic induction measurement system (musimitos). phys. meas., vol. 29 (2008), no. 6, p. 291–306.
[10] steffen, m.: non-contact monitoring of heart and lung activity by magnetic induction measurement. student conference poster 2008, prague, 2008.

matthias steffen, e-mail: steffen@hia.rwth-aachen.de
steffen leonhardt
chair for medical information technology, rwth aachen university, pauwelsstr. 20, aachen, germany

memory hierarchy behavior study during the execution of recursive linear algebra library

i. šimeček

abstract: for good performance of every computer program, good cache and tlb utilization is crucial. in numerical linear algebra libraries (such as blas or lapack), good cache utilization is achieved by explicit loop restructuring (mainly loop blocking), but this requires difficult memory pattern behavior analysis. in this paper, we present the recursive implementation ("divide and conquer" approach) of some routines from numerical algebra libraries. this implementation leads to good cache and tlb utilization with no need to analyze the memory pattern behavior, due to the "natural" partition of data.

keywords: numerical linear algebra, code restructuring, loop unrolling, recursive implementation, memory hierarchy utilization.

1 introduction

in the following text, we assume that a, b and c are real matrices of order n. every modern cpu architecture uses a complex memory hierarchy scheme (for details see [6]); we assume the scheme depicted in fig. 1. in contrast to other authors, we also consider the tlb performance impact.

1.1 the cache model

the cache model we consider corresponds to the structure of the l1 and l2 (and optionally l3) caches in the intel x86 architecture. an s-way set-associative cache consists of h sets, and one set consists of s independent blocks (called lines in the intel terminology). let cs denote the size of the data part of a cache in bytes and bs denote the cache block size in bytes. then cs = s · bs · h. let sd denote the size of type double and si the size of type integer. we consider only two types of cache misses:
• compulsory misses (sometimes called intrinsic or cold) that occur when new data is loaded into empty cache blocks.
• thrashing misses (also called cross-interference, conflict, or capacity misses) that occur if useful data has been replaced prematurely from a cache block for capacity reasons.

1.2 the code restructuring

there are many compiler techniques for improving the performance (for details see [8, 9]). these compiler techniques are used in programs to maximize the cache hit ratio and improve alu and fpu utilization.

fig. 1: the memory hierarchy scheme
fig. 2: parameters of the cache model

2 linear codes

2.1 existing la libraries and their possible drawbacks

there exist many libraries for linear algebra (la), such as blas [5], lapack [2], intel mkl [1] etc.
they are tuned for the given architecture, but they also have some drawbacks:
• we must include the whole library in the final code, which can significantly increase the code size.
• licence problems can occur.
• some formats for storing sparse matrices are not supported.
• operations are "black boxes", so the user cannot manage the inner solution process.
• a combination of two or more operations with the same operands is solved separately (i.e. inefficiently).

2.2 linear code optimization

there are many routines for linear algebra. for demonstration purposes, we choose the following:
1. matrix-matrix multiplication (gemm in blas notation).
2. symmetric matrix-matrix multiplication (syrk in blas notation).
3. backward substitution (trsm in blas notation).
4. cholesky factorization (potrf in lapack notation).

the standard code of these routines has good performance, due to the high cache hit ratio, only for small orders of the matrix. when good performance is required for larger values, modifications must be made. in numerical algebra packages, this is achieved by explicit loop restructuring. this includes loop unrolling-and-jam, which increases the fpu pipeline utilization in the innermost loop, loop blocking (which is why we refer to blocked codes), and loop interchange to maximize the cache hit ratio (a code sketch of loop blocking is given after the comparison of the two approaches below). after applying these transformations, the codes are divided into two parts: the outer loops are "out-cache", and the inner loops are "in-cache". the codes have almost the same performance irrespective of the amount of data. but good cache utilization can also be achieved by the "divide-and-conquer" approach. the codes that use recursive-style formulations are referred to as recursive-based.

2.3 comparison of two approaches for the design of numerical la routines

2.3.1 blocked code programming

effective implementation in the blocked code programming style requires the following steps:
1. straightforward implementation.
2. design a coarse cache model of this code.
3. apply source code transformations (according to the architecture parameters).
4. refine the cache model.
5. derive optimal values of the program thresholds from the cache parameters.

2.3.2 recursive code programming

the motivating idea of recursive codes is to divide the matrices into disjoint submatrices of the same size (see [3, 4]). the resulting code has much better spatial and temporal locality. effective implementation in the recursive code programming style requires the following steps:
1. straightforward implementation.
2. apply source code transformations (according to the architecture parameters).
3. express routine(s) in the recursive style.
4. measure threshold value(s).
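to make the loop-blocking transformation mentioned in section 2.2 concrete, here is a minimal numpy sketch of a blocked gemm; the block size is an illustrative assumption (real blocked codes derive it from the cache parameters so that three tiles fit in the cache), and this is not the library code evaluated in this paper.

    import numpy as np

    def gemm_blocked(a, b, block=64):
        # c = a * b with loop blocking: each (block x block) tile of a, b and c
        # is reused while it is still resident in the cache
        n = a.shape[0]
        c = np.zeros((n, n))
        for ii in range(0, n, block):
            for kk in range(0, n, block):
                for jj in range(0, n, block):
                    c[ii:ii + block, jj:jj + block] += (
                        a[ii:ii + block, kk:kk + block] @ b[kk:kk + block, jj:jj + block])
        return c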
2.4 discussion

the main differences between blocked and recursive-based codes are:
• cache analysis of some la routines is very difficult, much more difficult than the recursive formulation of these routines.
• for optimal performance of blocked codes, the program may have different in-l1-cache loops and in-l2-cache loops (and in-l3-cache loops, respectively). in recursive-based codes, this is done automatically by the partitioning of the data.
• blocked codes lead to even better cache utilization, because the data is optimally divided to fit into the cache. in recursive codes, some part of the cache memory remains unused.
• the recursive style of expression is very easy to understand and results in error-free codes.
• recursive-based codes have additional stack management overhead in comparison to blocked codes.

3 detailed overview of the matrix-matrix multiplication procedure

3.1 matrix-matrix multiplication

we consider input real matrices a (i × k), b (k × j) and c (i × j). a standard sequential pseudocode for matrix-matrix multiplication c = a · b (abbreviated as gemm, assuming the ordering of loops: i, j, k) is as follows:

pseudocode gemm_std (* standard matrix-matrix multiplication *)
(1) for i = 1 to n do
(2)   for j = 1 to n do
(3)     sum = 0
(4)     for k = 1 to n do
(5)       sum = sum + a[i][k] · b[k][j];
(6)     c[i][j] = sum;

3.2 recursion in matrix-matrix multiplication

let us denote by c(i) the case when the final result (matrix c) is computed from the recursion by the i variable. consider the partitioning of the matrices a, b and c as follows; for better illustration, the partitioning is depicted in fig. 3.

c(i): splitting a and c by rows,
(c1; c2) = (a1; a2) · b, i.e. c1 = gemm(a1, b) and c2 = gemm(a2, b);

c(j): splitting b and c by columns,
(c1 | c2) = a · (b1 | b2), i.e. c1 = gemm(a, b1) and c2 = gemm(a, b2);

c(k): splitting a by columns and b by rows,
c = (a1 | a2) · (b1; b2) = a1 · b1 + a2 · b2, i.e. c = gemm(a1, b1) + gemm(a2, b2).

fig. 3: recursion in gemm

pseudocode gemm_rec (a, b, c, i, j, k)
(* recursive matrix-matrix multiplication; limit is an architecture dependent constant *)
(1) n = min(i, j, k);
(2) if (n ≤ limit) then
(3)   c = gemm_std(a, b);
(4) else
(5)   switch which of i, j, k is maximum
(6)     case i is maximum: (* recursion by the i variable *)
(7)       c1 = gemm_rec(a1, b); c2 = gemm_rec(a2, b); break;
(8)     case j is maximum: (* recursion by the j variable *)
(9)       c1 = gemm_rec(a, b1); c2 = gemm_rec(a, b2); break;
(10)    case k is maximum: (* recursion by the k variable *)
(11)      c = gemm_rec(a1, b1) + gemm_rec(a2, b2); break;

3.3 compulsory misses

during the execution of the gemm(a, b, c, i, j, k) routine, all elements of arrays a, b and c must be loaded into or written to the cache. so, the number of compulsory misses is

n = sd · (i·j + i·k + j·k) / bs.

a similar idea is also valid for tlb misses.

4 evaluation of the results

4.1 testing configuration

4.1.1 hw configuration

all results were measured on a pentium celeron m420 at 1.6 ghz, with 2 gb ram at 333 mhz and the following cache parameters: the l1 cache is a data cache with bs = 64, cs = 32 kb, s = 8, h = 64, and lru replacement strategy; the l2 cache is a unified cache with bs = 64, cs = 1 mb, s = 8, h = 2048, and lru strategy.

4.1.2 sw configuration

we have used the following sw:
• os windows xp professional sp3,
• microsoft visual studio 2003,
• intel compiler version 9.0 with switches: /o3 /og /oa /oy /ot /qpc64 /qxb /qipo /qsfalign16 /zp16
• intel mkl library 8.1.

4.1.3 test data

we have measured instances using square matrices with the order n in the range from 10 to 1000 (for cache analysis, up to 800) with step 10.

4.1.4 methodology of the measuring

• all cache events were monitored by the cache analyzer (ca) [7].
• for tlb miss ratio measuring, we also used the ca with the following parameters: h = 1, s = 128, bs = 4096. this means that we assume the tlb to be a fully-associative cache with its block size equal to the system page size.
• for performance measurements, the average of five measured values was taken as the result.
• the caches were flushed out before each measurement. in real measurements, they were flushed by reading a given amount of useless data; in measurements using the ca, they were flushed by a command of the ca.
• the codesize of the innermost loops is negligible in comparison to the size of the l2 cache, so we assume that the unified l2 cache is used only for data.
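as a runnable counterpart of the gemm_rec pseudocode of section 3.2, the following numpy sketch splits along the largest of the three dimensions until the subproblem is below a threshold; the threshold value is illustrative, and the base case stands in for the in-cache gemm_std.

    import numpy as np

    def gemm_rec(a, b, limit=128):
        # recursive gemm of section 3.2: split along the largest of i, j, k
        i, k = a.shape
        j = b.shape[1]
        if max(i, j, k) <= limit:
            return a @ b                        # in-cache base case (gemm_std)
        if i >= j and i >= k:                   # recursion by i: rows of a and c
            h = i // 2
            return np.vstack((gemm_rec(a[:h], b, limit), gemm_rec(a[h:], b, limit)))
        if j >= k:                              # recursion by j: columns of b and c
            h = j // 2
            return np.hstack((gemm_rec(a, b[:, :h], limit), gemm_rec(a, b[:, h:], limit)))
        h = k // 2                              # recursion by k: sum of partial products
        return gemm_rec(a[:, :h], b[:h, :], limit) + gemm_rec(a[:, h:], b[h:, :], limit)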
4.2 experimental results

4.2.1 l1 cache miss ratio

a comparison of the l1 cache miss rate for three different variants of the gemm procedure is illustrated in fig. 4. the peaks in this graph are caused by a resonance between the matrix element mapping function and the cache memory mapping function. both unrolled variants achieve better l1 utilization.

4.2.2 l2 cache miss ratio

a comparison of the l2 cache miss rate for three different variants of the gemm procedure is illustrated in fig. 5. we can conclude that if the size of the operands exceeds the data cache size (for n > 350), the l2 cache utilization drops significantly. both unrolled variants achieve better l2 utilization.

4.2.3 choosing the optimal value of the threshold

a comparison of cache utilization when three different threshold values were chosen is depicted in figs. 6 and 7. if the threshold is too large, the l1 cache miss ratio increases rapidly. the l2 cache miss ratio remains the same across the whole measured set. comparisons of cache miss rates for three different variants of the gemm procedure are illustrated in figs. 8 and 9. the recursive variant of the gemm procedure achieves much better cache utilization across the whole measured set. the peaks in this graph are caused by a resonance of the matrix element mapping function with the cache memory mapping function.

4.2.4 tlb cache miss ratio

a comparison of tlb utilization when three different threshold values were chosen is depicted in fig. 10. with the exception of one case (caused by a resonance of the matrix element mapping function for n = 510 with the tlb memory mapping function), the number of tlb misses is very low. a comparison of the tlb miss rates for three different variants of the gemm procedure is illustrated in fig. 11. the recursive variant of the gemm procedure achieves much better tlb utilization for larger matrices (n > 250).

4.2.5 performance of gemm

performance [mflops] = number of fpu operations / execution time [s]

fig. 4: l1 cache miss rate for different variants of the gemm procedure
fig. 5: l2 cache miss rate for different variants of the gemm procedure

for gemm this gives

performance_gemm [mflops] = 2 · n³ / execution time [s].

we compare the performance of five different variants:
• the standard variant (gemm_std),
• two unrolled variants,
• the recursive variant,
• the vendor implementation (intel mkl library).

the resulting performances of the different versions of the gemm procedure are illustrated in fig. 12. we can conclude that:
• the standard variant has very low performance.
• the two unrolled variants have reasonable performance until the sizes of the operands exceed the cache size; then the performance suffers from low memory hierarchy utilization.
• the recursive variant achieves very good performance irrespective of the matrix size. it was just slightly slower than the vendor implementation.

fig. 6: dependence of l1 cache utilization on the threshold value
fig. 7: dependence of l2 cache utilization on the threshold value
fig. 8: comparison between unrolled and recursive variants
fig. 9: comparison between unrolled and recursive variants

5 conclusions

in this paper, we have described the implementation of some important routines from la using a recursive approach, concentrating on the matrix-matrix multiplication routine. all tested routines achieve performance of about 80–90 % in comparison to the vendor mkl library. we can conclude that recursive codes achieve very good performance due to effective memory hierarchy utilization. unlike other (more complicated) methods, no difficult memory pattern behavior analysis is needed.

fig. 10: dependence of tlb utilization on the threshold value
fig. 11: comparison between unrolled and recursive variants
fig. 12: performance of different versions of the gemm procedure in double precision

acknowledgement

this research has been supported by mšmt under research program msm6840770014.

references

[1] intel® math kernel library 10.1 – overview.
[2] lapack – linear algebra package.
[3] carr, s., kennedy, k.: compiler blockability of numerical algorithms. in proceedings of the 1992 acm/ieee conference on supercomputing. ieee computer society press, 1992, p. 114–124.
[4] carr, s., lehoucq, r. b.: a compiler-blockable algorithm for qr decomposition. in proceedings of the eighth siam conference on parallel processing for scientific computing, san francisco, ca, february 1995.
[5] dongarra, j. j., croz, j. d., hammarling, s., duff, i.: a set of level 3 basic linear algebra subprograms. acm transactions on mathematical software, vol. 16 (1990), no. 1, p. 1–17.
[6] hennessy, j. l., patterson, d. a.: computer architecture, fourth edition: a quantitative approach. morgan kaufmann, 4th edition, 2006.
[7] tvrdík, p., šimeček, i.: software cache analyzer. in proceedings of ctu workshop, vol. 9, p. 180–181, prague, czech republic, mar. 2005.
[8] wadleigh, k. r., crawford, i. l.: software optimization for high performance computing. hewlett-packard professional books, 2000.
[9] wolfe, m.: high-performance compilers for parallel computing. addison-wesley, reading, massachusetts, usa, 1995.

ing. ivan šimeček, phone: +420 224 357 268, e-mail: xsimecek@fel.cvut.cz
department of computer science and engineering, czech technical university in prague, karlovo nám. 13, 121 25 prague 2, czech republic
a study of the interface of soldered joints of sninagti active solder with ito ceramics

m. provazník, r. koleňák

abstract: this paper presents an analysis of the solderability of ito ceramics (in2o3/sno2). the soft active solder sninagti was used for the experiments. the solder was activated by power ultrasound in air, without flux. an analysis of the interface of the phases between the solder and the ceramic was carried out in order to discover the ultrasonic impacts on the active metal and to identify the mechanism of the joint on the ceramic side.

keywords: ultrasonic activation, active solder, ito (indium-tin oxide) ceramics.

1 introduction

ceramic materials are increasingly used in technical practice, especially in the field of electrotechnology. there is enormous demand for conductive joining of ceramics with metals. soldering with the use of active solders is a current trend in this area. these solders contain an active element which reacts with the surface of the ceramic material. this enables the surface to be wetted and a reaction layer to be created. the solders have a very low wetting angle, enabling soldering at low temperatures, without flux and additional protection. the most widely used active metal is titanium. the reactive product transforms the surface energy of the ceramics and enables wetting by the solder. the active element moves from the whole solder volume to the two soldered materials. at the interface of the soldered joint, a reaction layer several μm in thickness is created, which contains the reaction products of the active elements and the substrate [1]. the solder is activated mechanically or with the use of a very high temperature acting on the active element. the mechanical activation is achieved by scraping, by spreading with a metal brush (soldering cu, al, or crni steel), by vibration, or by ultrasound above 20 khz (soldering ceramics and non-metallic materials). the working cycle of the mechanical activation is approximately 10 times shorter than that of high temperature activation, and does not require the application of a vacuum or a protective atmosphere.

2 experiments

a sample was made using the apparatus shown in fig. 1. the soldering process involves heating the soldered materials with the use of a hot plate to the soldering temperature in the range of 150–160 °c. the maximum soldering temperature is limited to 160 °c, above which the surface oxidation of the ito ceramics increases. heating by steps is chosen in order to achieve steady heating of both materials. a strap-shaped solder made by fast cooling technology is placed on a heated substrate, see fig. 2. the solder is activated by the titanium tip of the ultrasonic equipment after melt-down. the activation time was chosen in the interval from 1 to 5 seconds at one contact point. tab. 1 presents the parameters of the us equipment. the solder is applied to the second substrate in the same way. the two prepared parts are joined and softly pushed together. the presence of the active solder protects the high plasticity of the phase interface.

fig. 1: diagram of the apparatus for soldering by ultrasound
fig. 2: solder production by fast cooling technology [2]

table 1: us equipment parameters
output power – intermittent service [w]: max. 400
operative frequency [khz]: 40
input power [w]: max. 600
time adjustment range [s]: 0.1–9.9, in steps of 0.01
the soldered samples were fixed after splitting, and were prepared using a standard metallographic methodology. the samples were then analyzed using a light microscope and a scanning electron microscope. for further documentation of the joint creation mechanism, x-ray diffraction analysis was used, together with analysis of the concentration profiles of each element.

production principle of the active solder foil:
1. the molten alloy is extruded by fine pressure of an inert gas (argon, helium) through the rectangular slot of a jet placed near the cooling disk.
2. the rotating cooling disk gradually touches the molten solder. the disk creates a thin layer of solid alloy, which the roller takes away.

the solder foil, with dimensions 0.15 mm × 7 mm, was made by the sav physical institute, bratislava for experimental purposes. the solder was made from a foundry alloy by ready casting in an ingot mould.

3 experiment results

a differential thermal analysis (dta) was made on a sample of the active solder. the record of this analysis is shown in fig. 3. using the dta graphs we can specify the thermal areas where phase transformations occur. a temperature of 116 °c characterizes the onset of melting of the sn–in eutectic. a temperature of 156 °c is the melting temperature of neat indium.

fig. 3: dta of sninagti solder
fig. 4: microstructure of sninagti active solder [3]

fig. 4 presents the heterogeneous microstructure of the sninagti soft active solder in the molten state. a quantitative chemical analysis of the solder shown in fig. 4 is given in tab. 2. the dark grains, specified as positions a1 and a2, are expressively enriched with ti and ag. the other pallid grains (areas a3, a4) and the matrix (positions a5, a6) are explicitly composed of the in and sn elements.

table 2: quantitative analysis of sninagti solder [3]
      ti [at. %]   ag [at. %]   in [at. %]   sn [at. %]
a1    3.1          27.6         55.1         14.3
a2    1.9          29.3         53.0         15.8
a3    0            0.3          11.3         88.5
a4    0            0            12.1         87.9
a5    2.7          0            64.7         32.7
a6    0            0            71.4         28.7

in order to identify the phase composition of the solder after moulding, an x-ray diffraction analysis was made by the sav physical institute, bratislava, which identified the following phases: in3sn, insn4, ti6sn5, ag3sn, agin2, ti3ag [3]. we can identify the dark areas of the solder as the ti3ag phase, which demonstrates the high affinity of titanium to silver. fig. 5 shows the phase interface of the solder and the ito ceramics. the concentration profiles of the individual elements across the interface demonstrate that ti participates in the creation of the joint, but it is supposed that indium has the greatest effect on the creation of the joint.

fig. 5: the sninagti–ito interface, and the concentration profiles of the individual elements over the interface

a provable effect of power ultrasound is that the solder is able to fill the narrow spaces among the ceramic grains, proving that there is a high degree of ceramic wetting when soldering is performed. indium markedly supports the creation of the joint at this monitored interface. the experiments show clearly that the role of an active element in a solder does not always have to appear. in this case, the ti is indifferent (it does not create any provable phases): it remains fixed in the solder (mainly in the ti3ag phase) and does not participate markedly in creating the joint.
since indium has a high affinity to oxygen, we can assume that it combines in the soldering process with atmospheric oxygen, creating complex indium oxides, which provide an input into the reactions with the surface of the ito ceramics according to the simplified model shown in fig. 6.

fig. 6: simplified model of the creation of the joint between ito ceramics and sninagti solder [4]

4 our contribution to new findings

our contribution to work on this topic can be summarized as follows:
• a soft active sninagti solder has been designed and manufactured by methods of fast cooling,
• our analyses of the phase structure of the solder have identified the following phases: in3sn, insn4, ti6sn5, ag3sn, agin2, ti3ag,
• the sninagti solder wetted the ito ceramics with the use of power ultrasound,
• studies of the interfaces have shown that when a joint with ito ceramics is created, indium from the solder participates preferentially.

5 conclusion

the results of our experiments have verified the possibility of creating a soldered joint between a metal and an ito ceramic. the soldered joint is produced with the use of mechanical activation by power ultrasound in air, without flux. it has been proved that the sninagti solder reacts with the surface layers of the connected metals by creating reactive products of various thicknesses and qualities. this proves that the solid substrates dissolve in the solder and create diffusion joints. there is no evident diffusion area on the interface between the solder and the ito ceramics. the solder is able to leak into the spaces between the grains, and in this way a mechanical joint is created. indium plays a major role in creating the joint: indium creates oxide inputs that react with the ceramics and create a chemical bond.

acknowledgement

this paper has been prepared with support from the vega 1/0381/08 project (research into the influence of physical and metallurgical aspects of high temperature soldering upon the structure of metallic and ceramic materials' joints) and the apvt 20-010804 project (the development of a lead-free soft active solder and research into the solderability of metallic and ceramic materials using ultrasonic activation).

references

[1] chang, s. y., tsao, l. c., chiang, m. j., chuang, t. h., tung, c. n.: active soldering of indium tin oxide (ito) with cu in air using an sn3.5ag4ti(ce, ga) filler. journal of materials engineering and performance, 2003, no. 4, pp. 383–389.
[2] koleňák, r.: solderability of metal and ceramic materials by active solders. 1st ed. dresden: forschungszentrum dresden, 2008 (scientific monographs). isbn 978-3-941405-03-5.
[3] adamčíková, a.: study of ceramics wetting by lead-free solders. master's thesis, 2007.
[4] koleňák, r.: the development of a lead-free soft active solder and research into the solderability of metallic and ceramic materials using ultrasonic activation. final report of grant project apvt 20-010804, trnava, 2008.

ing. martin provazník, e-mail: martin.provaznik@stuba.sk
doc. ing. roman koleňák, phd., e-mail: roman.kolenak@stuba.sk
institute of production technologies, mtf stu, trnava, slovak republic

4u1909+07: a hidden pearl

i. kreykenbohm, f. fürst, l. barrágan, j. wilms, r. e. rothschild, s. suchy, k. pottschmidt

abstract: we present a detailed spectral and timing analysis of the high mass x-ray binary (hmxb) 4u1909+07 with integral and rxte.
4u1909+07 is a persistent accreting x-ray pulsar with a period of approximately 605 s. the period changes erratically, consistent with the random walk expected for a wind accreting system. integral detects the source with an average of 2.4 cps (corresponding to 15 mcrab), but it sometimes exhibits flaring activity up to 50 cps (i.e. 300 mcrab). the strongly energy dependent pulse profile shows a double peaked structure at low energies and only a single narrow peak at energies above 20 kev. the phase averaged spectrum is well described by a powerlaw modified at higher energies by an exponential cutoff and by photoelectric absorption at low energies. in addition, at 6.4 kev a strong iron fluorescence line and at lower energies a blackbody component are present. we performed phase resolved spectroscopy to study the pulse phase dependence of the spectral parameters: while most spectral parameters are constant within uncertainties, the blackbody normalization and the cutoff folding energy vary strongly with phase.

keywords: x-rays: stars – stars: flare – stars: pulsars: individual: 4u 1909+07 – stars: magnetic fields.

1 introduction

4u 1909+07 (also known as x1908+07) was discovered with the uhuru satellite as 3u 1912+07 [5, in the third uhuru catalog]. since then, the source has been detected with most x-ray instruments. the nature of the system was, however, not clear for 30 years. only in 2000, [24] found a stable 4.4 d period in rxte asm data, which was interpreted as the orbital period of a binary system. the photoelectric absorption was found to vary by a factor of at least 3 with orbital phase [11]. such a behavior can be well described by a spherical wind model and an inclination of 54° ≤ i ≤ 70°, depending on the parameters of the wind model. furthermore, 4u 1909+07 shows irregular flaring activity due to inhomogeneities in the wind of the donor star, which lead to different accretion rates. in 2005, [16] detected an ob star in the near infrared at the location of the x-ray source, thus confirming that the system is indeed a high mass x-ray binary (hmxb). they estimated the distance to the source to be 7 kpc [16]. using a pointed rxte-pca observation, [11] discovered a period of ∼605 s, explained as the pulse period of a rotating neutron star with an offset magnetic axis. [11] were thus able to refine the binary orbit parameters. they obtained m⋆ = 9–31 m☉ and r⋆ ≤ 22 r☉ for the mass and radius of the companion star, assuming the canonical mass of 1.4 m☉ for the neutron star. in this paper, we present the data and data reduction methods in sect. 2. in sect. 3 we analyze the pulse period evolution and the pulse profiles, and in sect. 4 we address the spectral properties of the source. we summarize and discuss our results in sect. 5.

2 data and observations

we used all publicly available data on 4u 1909+07 from the international gamma-ray astrophysics laboratory [26, integral] and the rossi x-ray timing explorer [7, rxte]. integral has three x-ray science instruments, which provide coverage from 3 kev up to 10 mev: the imager ibis/isgri [22, 20 kev to 800 kev] with moderate energy resolution and a large effective area, the spectrometer spi [23, 20 kev to 10 mev] with excellent spectral resolution suitable for the analysis of nuclear lines, and the x-ray monitor jem-x [12, 3 kev to 35 kev].
thanks to isgri's large field of view of almost 30° × 30°, more than 6 000 ksec of data on 4u 1909+07 exist, mostly from the integral core programme [25] and the grs 1915+105 monitoring [17]. however, 4u 1909+07 was almost never in the field of view of jem-x, as no observations were pointed directly at 4u 1909+07. among the detected sources in the field of view are prominent sources like grs 1915+105 or ss 433. in total, twelve sources with a significance > 6σ were detected in the field of view, with 4u 1909+07 being the fourth brightest source. in addition to the standard pipelines of the offline scientific analysis software (osa) version 7.0 for images and spectra, the ii_light tool distributed with osa was used to obtain light curves with a higher temporal resolution. we used a time resolution of 20 s to maintain an acceptable signal-to-noise ratio and to obtain a good enough time resolution to measure the pulse period. in addition to the integral observations, we also used the observations of 4u 1909+07 performed by rxte. both major instruments of rxte, the proportional counter array [6, pca], sensitive in the 2–60 kev energy band, and the high energy x-ray timing experiment [18, hexte], sensitive between 15–250 kev, were fully operational at that time. both instruments have a large effective area and a high temporal resolution, but no imaging capabilities. in total, 196 ksec of rxte data on 4u 1909+07 are available.

3 timing analysis

3.1 pulse period evolution

apart from the discovery of the pulsations based on rxte data in 2000 and 2004 [11], no other measurements of the pulse period of 4u 1909+07 are available. the extensive archival integral data are therefore optimal to track the evolution of the pulse period since 2003. we split the lightcurve into segments of between 300 ksec and 800 ksec length to determine the period accurately. we then used epoch folding [10] to determine the pulse period for each of these segments individually. 4u 1909+07 shows very strong pulse-to-pulse variations, such that the actual pulse cannot be seen by the naked eye (see, e.g., figure 1). only after folding 500–1 000 pulses does a stable pulse profile emerge, which allows for a reliable pulse period determination. the pulse profile itself, however, is remarkably stable on longer time scales. a similar effect is seen in vela x-1, although the overall luminosity is much higher in that source [20].

fig. 1: closeup of an isgri lightcurve of 4u1909+07 in the 20–40 kev band with 20 s time resolution (counts/sec versus time in sec since mjd 52704). the red line shows the folded pulse profile

the evolution of the pulse period since 2001 is shown in figure 2. the overall behavior is best described by a random walk model, but a spin-up trend between march 2003 and april 2006 and a spin-down trend from april 2006 to october 2007 could also describe the data. to confirm the random walk behavior, we implemented the algorithm proposed by [2]: this algorithm evaluates the relative change in the pulse period p over different time intervals δt between individual measurements. in the case of a perfect random walk, the result will be a straight line with a slope of 0.5 in the log δω − log δt space, where ω is the angular velocity: ω = 2π/p. figure 3 shows the corresponding diagram for 4u 1909+07. superimposed is a line with a slope of 0.5, shifted in the y-direction to fit the data. it is obvious that the line matches the data very well.
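a minimal sketch of this diagnostic is given below, assuming arrays of measurement epochs [mjd] and periods [s]; it fits one global slope to all measurement pairs, whereas the published algorithm averages within δt bins, so it is an illustration rather than the exact procedure of [2].

    import numpy as np

    def random_walk_slope(t_mjd, p_sec):
        # pairwise |delta omega| versus delta t; a random walk in omega
        # gives a slope near 0.5 in log-log space (cf. [2])
        t = np.asarray(t_mjd) * 86400.0           # epochs [s]
        omega = 2.0 * np.pi / np.asarray(p_sec)   # angular velocity [rad/s]
        i, j = np.triu_indices(len(t), k=1)
        dt, dw = np.abs(t[j] - t[i]), np.abs(omega[j] - omega[i])
        ok = (dt > 0) & (dw > 0)
        slope, offset = np.polyfit(np.log10(dt[ok]), np.log10(dw[ok]), 1)
        return slope

    # expect a value close to 0.5 for a random-walk-like period evolution
    # print(random_walk_slope(epochs_mjd, periods))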
the uncertainties in figure 3 include the uncertainties of the determined pulse periods and also the uneven and coarse sampling.

fig. 2: evolution of the pulse period of 4u1909+07 (period [s] versus year, 2001–2007). the historic rxte data points of [11] are shown as red crosses; our measurements obtained with integral are shown as stars
fig. 3: pulse period evolution in log δω − log δt space, as proposed by [2]

3.2 pulse profiles

as shown by [11], the 3.7–17 kev pulse profile shows two distinct peaks, with the second peak being slightly broader than the first peak and having a complex shape with two subpeaks at phases φ = 0.85 and φ = 0.1. to study the energy dependence of the pulse profile, we extracted pulse profiles from pca data in different energy bands, using a period of p = 604.685 s and 32 phase bins. thanks to the large effective area of the pca instrument, high quality pulse profiles could be extracted in 30 narrow energy bands. three energy bands are shown as an example in the upper three panels of figure 4. the pulse profiles show a smooth transition from a two peaked profile at low energies to a single peak profile at high energies. at energies below 5 kev the secondary peak is broader and stronger than the primary peak (figure 4a). with increasing energy, both peaks at first become equally strong and more clearly separated, and then the relative strength of the secondary peak declines further (figure 4b, c). to obtain the pulse profile at high energies, between 20 kev and 40 kev, we used integral data taken in 2004. we used the same epoch as for the pca analysis, but a period of p = 604.747 s, as determined by our analysis of the integral data. in this energy band the secondary peak does not exist anymore, while the primary pulse is clearly seen and has a sharp peak (figure 4d). the deep minimum around phase 0.3, however, is not energy dependent.

fig. 4: energy resolved pulse profiles with rxte pca (a–c: 3.3–4.1 kev, 9.1–10.3 kev, 14.9–16.2 kev) and integral isgri (d: 20–40 kev). the profiles are shown twice for clarity. note that the rxte and integral profiles are not strictly phase aligned

4 spectral analysis

4.1 phase averaged spectrum

the pulse phase averaged spectrum of 4u 1909+07 is best described by a powerlaw continuum attenuated by photoelectric absorption at low energies and an exponential turnover at high energies, similar to most accreting x-ray pulsars. [11] modeled the spectrum using bremsstrahlung, which describes the data equally well. the turnover at high energies is often modeled with the cutoffpl, highecut, or fdcut [21] models. the npex model [15] is more complex, as it involves a second powerlaw. furthermore, at 6.4 kev an iron fluorescence line is present. we discarded all observations between orbital phases 0.88 < φorb < 0.12, as the nh is dramatically increased during this part of the orbit [11]. we applied the typical continuum models (see above) to the rxte and integral data (see table 1). all models can describe the data almost equally well. in the case of the fdcut model, however, the cutoff energy is set to < 1 kev, thus effectively removing the cutoff. we therefore did not use the fdcut model any more for 4u 1909+07.
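for orientation, the sketch below evaluates a phenomenological continuum of the kind fitted here (absorbed cutoff power law plus blackbody and a gaussian iron line). it is not the fitting code used by the authors; the photoabsorption cross-section is a crude power-law stand-in for a tabulated one, and all parameter values are illustrative.

    import numpy as np

    def model_flux(e_kev, gamma=1.0, e_fold=15.0, k_pl=0.1,
                   kt=1.4, k_bb=0.05, e_fe=6.4, sig_fe=0.3, k_fe=1e-3, nh=5e22):
        # cutoff power law: e^-gamma * exp(-e / e_fold)
        pl = k_pl * e_kev ** (-gamma) * np.exp(-e_kev / e_fold)
        # blackbody photon-spectrum shape: e^2 / (exp(e/kt) - 1)
        bb = k_bb * e_kev ** 2 / np.expm1(e_kev / kt)
        # gaussian iron fluorescence line at 6.4 kev
        fe = k_fe * np.exp(-0.5 * ((e_kev - e_fe) / sig_fe) ** 2)
        # photoelectric absorption with a rough sigma ~ 2e-22 (e/kev)^-3 cm^2
        absorb = np.exp(-nh * 2e-22 * e_kev ** (-3.0))
        return absorb * (pl + bb + fe)

    # e = np.geomspace(3.0, 100.0, 200)   # kev grid spanning pca/hexte/isgri
    # flux = model_flux(e)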
independent of the applied continuum models, however, a soft excess below 10 kev is evident, which can be very well modeled using a blackbody with a temperature of kt ≈ 1.4 kev. figure 5 shows the spectrum and the best fit cutoffpl model.

table 1: fit parameters for various models of the phase averaged spectrum of 4u1909+07

model                     cutoffpl              cutoffpl+bbody          highecut
χ²red                     1.76                  1.01                    1.07
nh [10^22 atoms cm−2]     15.3 +0.6 −0.5        4.7 +1.6 −1.9           4.8 +0.9 −1.6
γ1                        1.63 +0.13 −0.02      0.96 +0.03 −0.06        1.37 +0.03 −0.08
fe σ [kev]                1.7 +0.1 −0.5         0.28 +0.15 −0.11        0.41 +0.07 −0.06
fe energy [kev]           6.4 +0.8 −0.2         6.40 +0.04 −0.06        6.39 +0.04 −0.03
fe norm                   (1.1 +0.1 −0.6)×10^−3 (0.57 +0.17 −0.11)×10^−3 (0.78 +0.07 −0.04)×10^−3

model                     highecut+bbody        npex                    npex+bbody
χ²red                     0.91                  1.44                    0.93
nh [10^22 atoms cm−2]     6.8 +2.0 −0.8         1.7 +0.7 −1.7           5 +1 −3
γ2                        1.32 ± 0.10           0.36 +0.05 −0.15 / −3.08 +0.13 −0.09   0.80 +0.06 −0.15 / −2.03
fe σ [kev]                0.2 ± 0.2             0.41 +0.07 −0.06        0.27 +0.12 −0.18
fe energy [kev]           6.39 +0.04 −0.05      6.37 +0.04 −0.03        6.40 ± 0.05
fe norm                   (5 +3 −1)×10^−4       (0.80 +0.07 −0.06)×10^−3 (0.6 +0.3 −0.1)×10^−3

fig. 5: a) combined rxte pca (red), hexte (blue) and integral isgri (green) spectrum, with the best fit cutoffpl model also shown. b) residuals of the best fit, showing no evident deviation

4.2 phase resolved spectra

since the integral isgri data did not provide high enough statistics for high resolution phase resolved spectroscopy, we only used the rxte data. we divided the pca and hexte data into 7 phase bins (see figure 6a). we applied a cutoffpl with an additional blackbody component to all phase bins. neither the photoelectric absorption nor the power law index γ varies with pulse phase. the cutoff folding energy, however, is highest in the primary peak, with ∼19 kev, while it is only ∼14 kev during the secondary peak, explaining why the primary peak is much stronger at higher energies. the blackbody component is strongly variable: its normalization changes by a factor of ∼3. the blackbody is strongest between the two peaks and weakest in the rise and maximum of the primary peak. the temperature kt of the blackbody, on the other hand, does not change with pulse phase. the width of the iron line σfe is consistent with a narrow line of zero width for most phase bins, except during the peak of the pulse, where it has a width of ∼0.4 kev.

5 discussion & conclusion

we have presented a detailed study with integral and rxte of 4u 1909+07. we have shown that the pulse period changes strongly with time. the evolution of the period is best described by a random walk. such a behavior has been seen in many other hmxbs, like vela x-1, and is a strong indicator of direct accretion from the stellar wind of the optical companion, without a persistent accretion disk [4]. an accretion disk would provide a more constant transfer of angular momentum, and thus a long-term spin-up or spin-down trend, as seen in other sources like 4u 1907+09 [3]. since the accreting neutron star is strongly magnetized (as indicated by the strong coherent pulsations), the accreted matter has to couple to the magnetic field, thus inhibiting the formation of an accretion disk close to the neutron star. the required b field strength is of the order of 10^12 g. such magnetic field strengths are commonly observed in neutron stars, especially pulsars.
in such a strong magnetic field, however, cyclotron resonant scattering features (crsfs) are expected to be present in the x-ray spectrum, as in the spectrum of vela x-1 [9] or mxb 0656−072 [13]. unfortunately, we found no evidence for a crsf in 4u 1909+07 and are thus unable to determine the strength of the magnetic field. however, the absence of crsfs in the x-ray spectrum does not rule out the presence of a strong magnetic field in 4u 1909+07, as crsfs can be filled up again by photon spawning and depend strongly on the geometry of the accretion column and on the viewing angle [19].

fig. 6: spectral parameters of the phase-resolved spectra. a) pulse profile in the 13.2–14.5 kev energy range; the shaded areas indicate the phase bins. b) photoelectric absorption column nh in units of 10^22 atoms cm−2, c) power law index γ, d) folding energy [kev], e) blackbody normalization (×10^3), f) blackbody temperature kt [kev], and g) width of the iron line σfe [kev]

the two distinct peaks in the pulse profile at low energies suggest that the neutron star is accreting on both magnetic poles. however, the dramatic change of the pulse shape with energy suggests that the physical conditions in the accretion columns above the two poles might be different. the exact geometry of the accretion column, however, depends strongly on the way the matter couples to the magnetic field and is highly uncertain [1, 14]. besides a simple filled funnel, other possible configurations exist, including a hollow or a partly hollow funnel and several small funnels. these models, however, have to be taken with a pinch of salt, as the behavior of relativistic plasma in magnetic fields with b ≈ 10^12 g is still only little understood. additionally, gravitational light bending must be taken into account when analyzing pulse profiles and their origin [8].

references

[1] basko, m. m., sunyaev, r. a.: mnras, 1976, 175, 395.
[2] de kool, m., anzer, u.: mnras, 1993, 262, 726.
[3] fritz, s., kreykenbohm, i., wilms, j., et al.: a & a, 2006, 458, 885.
[4] ghosh, p., lamb, f. k.: apj, 1979, 234, 296.
[5] giacconi, r., murray, s., gursky, h., et al.: apjs, 1974, 27, 37.
[6] jahoda, k., markwardt, c. b., radeva, y., et al.: apjs, 2006, 163, 401.
[7] jahoda, k., swank, j. h., giles, a. b., et al.: x-ray and gamma-ray instrumentation for astronomy vii, ed. o. h. w. siegmund, m. a. gummin, spie 2808, 1996, 59–70.
[8] kraus, u., zahn, c., weth, c., ruder, h.: apj, 2003, 590, 424.
[9] kreykenbohm, i., coburn, w., wilms, j., et al.: a & a, 2002, 395, 129.
[10] leahy, d. a., darbro, w., elsner, r. f., et al.: apj, 1983, 266, 160.
[11] levine, a. m., rappaport, s., remillard, r., savcheva, a.: apj, 2004, 617, 1284.
[12] lund, n., budtz-jørgensen, c., westergaard, n. j., et al.: a & a, 2003, 411, l231.
[13] mcbride, v. a., wilms, j., coe, m. j., et al.: a & a, 2006, 451, 267.
[14] meszaros, p.: space sci. rev., 1984, 38, 325.
[15] mihara, t.: phd thesis, riken, tokyo, 1995.
[16] morel, t., grosdidier, y.: mnras, 2005, 356, 665.
[17] rodriguez, j., bodaghee, a.: arxiv e-prints, 2008, 801.
[18] rothschild, r. e., blanco, p. r., gruber, d. e., et al.: apj, 1998, 496, 538.
[19] schönherr, g., wilms, j., kretschmar, p., et al.: a & a, 2007, 472, 353.
[20] staubert, r., kendziorra, e., pietsch, w., reppin, c., trümper, j., voges, w.: apj, 1980, 239, 1010.
[21] tanaka, y.: in radiation hydrodynamics in stars and compact objects, ed. d. mihalas, k.-h. a. winkler, iau coll. 89, heidelberg, 1986, 198.
[22] ubertini, p., lebrun, f., di cocco, g., et al.: a & a, 2003, 411, l131.
[23] vedrenne, g., roques, j.-p., schönfelder, v., et al.: a & a, 2003, 411, l63.
[24] wen, l., remillard, r. a., bradt, h. v.: apj, 2000, 532, 1119.
[25] winkler, c.: in exploring the gamma-ray universe, ed. a. gimenez, v. reglero, c. winkler, esa sp-459, alicante, 2001, 471.
[26] winkler, c., courvoisier, t. j.-l., di cocco, g., et al.: a & a, 2003, 411, l1.

ingo kreykenbohm, e-mail: ingo.kreykenbohm@sternwarte.uni-erlangen.de
felix fürst, laura barrágan, jörn wilms
dr. karl remeis-sternwarte bamberg, sternwartstrasse 7, 96049 bamberg, germany
erlangen centre for astroparticle physics (ecap), erwin-rommel-str. 1, 91058 erlangen, germany
richard e. rothschild, slawomir suchy
center for astrophysics and space sciences, university of california san diego, la jolla, ca 92093-0424, usa
katja pottschmidt
cresst, university of maryland baltimore county, 1000 hilltop circle, baltimore, md 21250, usa
nasa goddard space flight center, astrophysics science division, code 661, greenbelt, md 20771, usa

heat exchangers for condensation and evaporation applications operating in a low pressure atmosphere

petr kracík, jiří pospíšil, ladislav šnajdárek
brno university of technology, fme, department of power engineering, technická 2896/2, 616 69 brno, czech republic
correspondence to: kracik@fme.vutbr.cz

abstract: this paper presents a state-of-the-art study of the heat transfer process in liquid spraying heat exchangers placed in a vacuum chamber. the experimental case studied here describes the behavior of the falling film evaporation and condensation modes on horizontal tube bundles. the study aims to obtain the heat transfer coefficient and its correlations by means of a mathematical model.

keywords: liquid spraying, heat exchangers, vacuum chamber.

1 introduction

the use of cooling increases year by year, not only in the food, chemical and engineering industries, but especially in shopping centers, in office buildings and in homes. cold compressor cooling systems, which consume a lot of energy, are the most widely produced systems. an alternative way of cooling, which consumes approximately one-fifth of the energy, is to use absorption units. the absorption cycle is supplied directly by heat from gas burners, or indirectly by heat generated by power plants in the form of waste heat. the use of waste heat leads primarily to fuel saving. if the cold that is produced is used to increase thermal comfort in the summer season, the use of waste heat can spread heat production more equally throughout the year. this can result in a stable output of power units throughout the year, increasing the reliability and therefore the lifetime of power units. the heat exchanger is an essential component of the absorption cycle that provides optimum heat transfer. in this paper, the authors attempt to determine the surface heat transfer coefficient of a spray-water cooled exchanger located in a vacuum chamber. the vacuum is generated by a rotary vane vacuum pump. this basic research follows a study of the heat transfer coefficient on a tube bundle at atmospheric pressure, which is being carried out in the dept. of power engineering at the energy institute of the faculty of mechanical engineering of the technical university in brno.
2 description of the vacuum apparatus

the chamber is a cylindrical airtight vessel with three visors, in which a tube bundle is placed. the chamber is connected to three closed loops. two of these loops are designed for pressures up to 1.0 mpa, and they are used for transporting a cooling/heating fluid (depending on the type of experiment: condensation or evaporation). the third loop is used for transporting the liquid spray. each loop is equipped with a pump, a regulation valve, a flow meter and a plate heat exchanger. the plate heat exchanger can be connected to a boiler filled with hot water for heating the liquid, or with cold water in the case of cooling. each of the pressure loops is also equipped with an expansion vessel, which serves to compensate the pressure differences, and with a safety valve that releases the liquid when there is overpressure. for visual control, each of the loops is equipped with a pressure gauge and a thermometer. the temperature condition of the media in the individual loops is measured by thermocouples at the inlets or outlets of the container. for a more accurate approximation of the heat stratification in the tube bundle, there are four thermocouples: two for each loop that is not transporting the liquid spray. the mounting plate for the tube bundle is divided into two halves. each half has been drilled with different spacing of the holes, to provide large variability of the tube arrangements. the tube bundle is composed of copper tubes and sprinkler pipes, one over each vertical row of copper tubes. the copper tubes are 12.0 mm in diameter. the sprinkler pipe is a tube 940.0 mm in length, with 9.2 mm spacing of the sprinkling holes, which are 1.0 mm in diameter. the pressure in the chamber is measured by three pressure gauges. the first utilizes mercury, and is used for visual inspection. the second is a digital gauge, and it can measure the entire desired pressure range. the third is a digital vacuum gauge, with an absolute pressure range of 0 kpa to 25 kpa, which serves for accurate low-pressure measurements.

figure 1: diagram of the vacuum apparatus

3 mathematical description

under the conservation of energy, the heat balance between the pressurized loop (marked with suffix 'b') and the under-pressured loop (marked with suffix 'v') must be guaranteed:

q̇5 + q̇1 = q̇6 + q̇2 + δq̇56 + δq̇12 [w]   (1)
⇒ q̇b = q̇6 − q̇5 [w]   (2)
⇒ q̇v = q̇1 − q̇2 [w]   (3)

provided that:

t6 > t5 and t1 > t2 [°c]   (4)

the heat transferred through the copper tube wall is:

q̇s = q̇b + δq̇56 = q̇v − δq̇12 [w]   (5)

• derivation of the equations of heat transfer through a circular wall

the heat passage through the wall is composed of conduction, convection and also radiation. at lower temperatures the heat transmitted by radiation is negligible, and it is therefore not considered in the calculation. the calculation is based on newton's principle of heat transfer and the fourier heat conduction principle, provided there is a parallel arrangement of pipes. the walls do not contain internal sources, and the heat flow is in the radial direction relative to the tube. the heat passage is determined by substituting into the general equations and integrating them:

q̇s = 2 · π · n · l · (tout − tin) / [ 1/(hin · rin) + (1/λs) · ln(rout/rin) + 1/(hout · rout) ] [w]   (6)
figure 2: notation of the parameters for calculating the heat transfer

selected members of this equation express the heat transfer coefficient per unit length:

ks = 2 · π / [ 1/(hin · rin) + (1/λs) · ln(rout/rin) + 1/(hout · rout) ] [w · m−1 · k−1]   (7)

the heat exchangers do not have a balanced temperature gradient at the inlet and outlet of the refrigerant; it is therefore replaced by the mean-logarithmic temperature gradient. thus the final equation for the heat passage through a circular wall in the radial direction is:

q̇s = ks · n · l · δtln [w]   (8)

• calculation of the heat transfer on the outside of the tube

the heat transfer coefficient on the outside of the tube is derived from the heat passage equation:

hout = 1 / { 2 · π · rout · [ 1/ks − 1/(2 · π · hin · rin) − (1/(2 · π · λs)) · ln(rout/rin) ] } [w · m−2 · k−1]   (9)

equations (5) and (8) for the heat passage must hold simultaneously, which yields the heat transfer coefficient:

q̇s = q̇b + δq̇56 = ks · n · l · δtln [w]   (10)

ks = q̇s / (n · l · δtln) = v̇v · ρv(pv; t̄) · cp(pv; t̄) · (t1 − t2) / (n · l · δtln) [w · m−1 · k−1]   (11)

the investigated apparatus lies between loops 'b' and 'v', and the logarithmic temperature gradient is expressed as:

δtln = (δt1 − δt2) / ln(δt1/δt2) = [ (t1 − t6) − (t2 − t5) ] / ln[ (t1 − t6)/(t2 − t5) ] [k]   (12)

the last unknown in equation (9) is the coefficient of heat transfer inside the tube. it can be calculated using the nusselt number, which is defined as the ratio of the heat transfer by convection and by conduction [1, 2]:

nu = h · l / λ ⇒ hin = nud · λin / din [w · m−2 · k−1]   (13)

the following criteria are accepted for calculating the nusselt number:
1. the reynolds number for fully developed turbulent flow: re = win · din / ν ≥ 10 000 [–]   (14)
2. the prandtl number ranges between: 0.6 ≤ (pr = ν/a) ≤ 160 [–]   (15)
3. the ratio of the length of the pipe to its diameter: l/din ≥ 10 [–]   (16)

the nusselt number is stated as:

nud = 0.023 · red^0.8 · pr^n [–]   (17)

where the coefficient n is equal to 0.4 [–] in the case of heating of the fluid in the pipe, and to 0.3 [–] in the case of cooling of the fluid in the pipe. the fluid velocity is calculated from the volume flow as:

v̇b = sin · win ⇒ win = 4 · v̇b / (π · din²)   (18)

• comparison between the falling-film mode map and data in the literature

the reynolds number expressed for the jet-droplet mode [4] is:

re = 0.084 · ga_mod^0.302 [–]   (19)

the galileo number is stated as a reciprocal kapitza number:

ga_mod = 1/ka = ρ · σ³ / (μ⁴ · g) [–]   (20)

we compare the experimentally obtained values from the vacuum apparatus with the values obtained by jacobi and hu [4] using the criteria numbers (19), (20). this data was obtained from an atmospheric apparatus with a comparable configuration of the heat exchanger, as shown in figure 3.

figure 3: experimental results for the jet-droplet transition mode
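a minimal sketch of how equations (9), (11) and (12) turn the measured quantities into the outside heat transfer coefficient is given below; si units are assumed for the inputs, and the example values in the commented call are illustrative, not the paper's data.

    import numpy as np

    def mean_log_dt(t1, t2, t5, t6):
        # mean-logarithmic temperature gradient, eq. (12)
        d1, d2 = t1 - t6, t2 - t5
        return (d1 - d2) / np.log(d1 / d2)

    def h_out(vdot_v, rho_v, cp_v, t1, t2, t5, t6, n, l, h_in, r_in, r_out, lam_s):
        dt_ln = mean_log_dt(t1, t2, t5, t6)
        # heat transfer coefficient per unit tube length, eq. (11) [w/(m k)]
        ks = vdot_v * rho_v * cp_v * (t1 - t2) / (n * l * dt_ln)
        # invert the thermal-resistance chain for the outside coefficient, eq. (9)
        r = (1.0 / ks - 1.0 / (2 * np.pi * h_in * r_in)
             - np.log(r_out / r_in) / (2 * np.pi * lam_s))
        return 1.0 / (2 * np.pi * r_out * r)        # [w/(m^2 k)]

    # illustrative call: 6 copper tubes of 1 m, assumed water-side h_in
    # print(h_out(2.0e-4, 998.0, 4180.0, 40.0, 35.0, 20.0, 28.0,
    #             6, 1.0, 4000.0, 0.005, 0.006, 386.0))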
4 Evaluation of measured data

The measurements of temperature, pressure and the other variables that are required for calculating the coefficients in the equations above are burdened with inaccuracies. The measured values are always inaccurate, due to the environment, the selected measuring method, or the condition of the measuring devices; there is always some tolerance within which the final value is spread. Before evaluating the measured data, we therefore have to determine the limits from the calibration measurements.

The limits are determined by the quadratic sum of the type A uncertainty, which represents a statistical evaluation of the measured data set, and the type B uncertainty, which is determined by the instrument manufacturer or by calibration of the measuring devices. The uncertainty of the indirectly measured physical quantities derived from the measured data set is combined in a similar way.

The enclosed T-type thermocouples that were used, with an ungrounded end and with increased accuracy, were calibrated in an Omega CL1000 dry block probe calibrator, which controls the set temperature by PID regulation with an accuracy of 0.15 °C. On the basis of the calibration curve, we decided to use a single error for all thermocouples, consisting of the measured maximum or minimum peak (which reached a maximum statistical error of 0.006 °C) and the dry block probe calibrator errors. Figure 4 plots the calibration curves of the T-thermocouples used in the measuring loops with the factory inaccuracy

$$u_{B,T} = \pm 0.65 \ [^{\circ}\mathrm{C}] \tag{21}$$

Figure 4: Segment of the calibration curve of the thermocouples used

The vacuum (under-pressure) was measured using a TED 6 digital gauge with a pressure inaccuracy of 0.5 % of the measured absolute pressure range:

$$u_{B,p_V} = \pm 0.005 \cdot (0 \ldots 100) = \pm 0.50 \ [\mathrm{kPa}] \tag{22}$$

The flow rates in the loops, including the spraying loop, were measured using a FLOMAG3000 inductive flowmeter with an accuracy of

$$u_{B,V} = \pm 0.003 \cdot (0 \ldots 30.67) = \pm 0.092 \ [\mathrm{l \cdot min^{-1}}] \tag{23}$$

Figure 5 plots the investigated heat transfer coefficients on the outside of the tube. The measurements were performed for three levels of absolute pressure: 6.2 kPa, 19.2 kPa and 30.2 kPa. The measured points were fitted with the exponential curve that best described the increasing trend of the heat transfer coefficient with increasing flow rate; the correlation was lower, but still sufficient, at the 6.2 kPa pressure level.

Figure 5: Heat transfer coefficients on the outside of the tube

5 Conclusion

On the assembled vacuum apparatus, we measured the temperature, pressure and flow rate values from which the heat transfer coefficient was deduced. The heat transfer coefficient on the outside of the tube increases as the ambient pressure is reduced. This trend corresponds to the theory and to the current state of knowledge on spray heat exchangers. Work on the vacuum apparatus will continue with an investigation of the influence of the surface treatment and the design of the tubes on increasing the heat transfer coefficient.

Acknowledgement

This work has been financially supported by the Czech Science Foundation under grant P101/10/1669.

References

[1] Incropera, F. P., DeWitt, D. P., Bergman, T., Lavine, A.: Fundamentals of Heat and Mass Transfer, 6th edition, 2007, hardcover, 1024 p. ISBN 978-0-471-45728-2.
[2] Jícha, M.: Přenos tepla a látky. Brno: CERM, 2001, 160 p. ISBN 80-214-2029-4.
[3] Karpíšek, Z.: Matematika IV: Statistika a pravděpodobnost. 3rd edition, Brno: CERM, 2007, 170 p. ISBN 978-802-1433-809.
[4] Jacobi, A., Hu, X.: The intertube falling-film modes: transition, hysteresis, and effects on heat transfer. Urbana, IL, 1995.
Idealized compression ratio for a screw briquetting press

Peter Biath, Juraj Ondruška

Institute of Manufacturing Systems, Environmental Technology and Quality Management, STU Bratislava, Námestie slobody 7, 812 31 Bratislava

Correspondence to: peter.biath@stuba.sk

Abstract

This paper deals with issues in determining the ideal compression ratio for a screw briquetting press. First, the principles of operation and a basic description of the main parts of a screw briquetting press are introduced. The next section describes the pressing space, modelled by means of 3D software; the pressing space was created using a Boolean subtract function. The final section of the paper measures the partial volumes of the pressing chamber in CATIA V5 using its measuring function. The measured values are substituted into the formula for the compression ratio, and the resulting evaluations are presented in the diagram in the conclusion of this paper.

Keywords: screw briquetting press; compression ratio; calculation of compression ratio.

1 Introduction

Briquetting is the most widely-used waste compaction technology, and briquettes made by screw briquetting presses are of the highest quality. In this paper, we deal with the screw briquetting presses produced within the project "Developing progressive technology of biomass compaction, production of prototypes and highly productive tools", co-financed by the European Structural Funds. The research is complex, ranging from a study of elementary biomass particles to the production of a new design and structure for briquetting machines and briquetting machine tools. The purpose of the study is to design and verify a prototype for a new progressive briquetting machine design, and to develop highly productive tooling. The paper attempts to clarify the procedure for calculating the idealized compression ratio during compaction in our screw briquetting press. The results will clarify how the material changes in volume during compaction (although only in an idealized way), and will be of great benefit to further research and development of screw presses.

2 Description of a screw briquetting press

In developing the design of a new press, we apply research results for the parameters that influence the biomass compaction process. When managing the technology and setting the pressing parameters, it is important to achieve high-quality production under a variety of factors. The subsequent experiments and the optimization to be implemented on the machine will reflect real compaction technology in practice. The research focuses on the principle of screw presses, with a view to achieving the highest quality production using this principle. However, one big disadvantage of this type of machine is its short bearing and worm tool life, due to the high axial pressures.

Figure 1: Screw briquetting press

Figure 2: Basic parts of the new screw press (1 – drive, 2 – storage node for screw and spindle, 3 – pressing chamber, 4 – feeding system)

Figure 3: Pressing chamber (1, 2 – nozzles; 3 – pressing screw, 4 – feeding screw, 5 – collet, A, B, C, D, E – locations of compaction)

The basic structure of a briquetting press is shown in Figure 2. The design of the machine is double-chambered to eliminate sharp rises in axial force. The primary parts of the machine are its main drive, the storage node for the screw and spindle, the pressing chambers, the cooling channels and the two feeding systems.
The core of the machine is held on frames similar in dimensions to Euro pallets, for better manipulation during transportation. It is not necessary to anchor the machine during operation, because the frame of the machine does not transmit any workloads. The machine is developed to be modular, making it easy to change it from a double-chamber version to a single-chamber version simply by removing one part of the machine from the drive side, without any other modifications. The single-chamber structure is very helpful, especially in experiments and measurements, because we can then measure a wide range of operating parameters, including the entire workload, during operation.

3 Pressing space

The pressing space in compaction machines is formed by moving and stationary parts. In our case it comprises the pressing and feeding screws (positions 3, 4; Figure 3) and the nozzles (positions 1, 2; Figure 3). The pressing space in our equipment is quite complicated, because there are two screws (one for pressing and one for feeding) on each side. The first screw is for feeding (4); it transports material into the throat of the pressing chamber and feeds the second, "pressing" screw (3). The material is compacted even in the feeding screw, so we include it in the pressing space in order to determine the total pressing space of the machine.

The compaction process starts at position A, where the feeding screw moves the material from the hopper into its grooves. At this moment, the bulk material has the same density as the material in the hopper. However, when the material begins to enter the conical part of the feeding screw (B), its volume decreases and its density increases. The compacted material at the end of the feeding screw moves into the overloaded space of the pressing screw (C) without changing density, though only under ideal conditions. In this space we consider the density to be constant, because the first part of the pressing screw only feeds the material (the screw and the nozzle are cylindrical in shape, without any change into conical parts). In the subsequent space, the material is compacted again (D) by the taper of the nozzles and the change in the pressing screw geometry. The compacted material then moves to the last part of the pressing chamber (E). Here the pressing pressure is generated only by back-pressure, which is caused by the change in the collet (5) section. However, in this paper we pay attention only to the compression ratio created by screw compaction; thus the pressing space that we study ends at the last thread of the pressing screw. We should not forget that the output of this experiment is idealized: if we neglect the resistance and friction within the pressed material and the friction of the compacted material against the pressing chamber, then one revolution of the screw moves the material by the value of one thread pitch.

We used the CATIA V5 3D software to create the pressing space. We identified the components which form the pressing space: the hopper, the feed screw (4), the feed screw nozzle (1), the pressing screw (3), and the pressing screw nozzles (1, 2). In assembling the model, we created components (tubes) that form the theoretical fill of the pressing space in both screws.
The tube dimensions must be designed in such a way that the smallest diameter of the tube is smaller than the smallest diameter of the screws (the screw grooves), and the largest diameter of the tube is larger than the largest diameter of the individual nozzles (the longitudinal grooves in the nozzles). The dimensions of the tubes were designed with respect to the Boolean subtract operation, where we need to intersect the bodies and then subtract the intersections from one another. Figure 4 shows the pressing space that is created, with and without the screws. The pressing space is then divided into smaller parts, whose volumes we measure.

Figure 4: 3D model of the pressing space

4 Measuring the volume of the parts of the pressing space

To find the compression ratio, it is necessary to divide the model of the pressing space transversely into smaller parts, whose volumes are then measured. The model is divided transversely in steps of one quarter of the screw pitch. The feed screw is double-grooved with a pitch of 120 mm, and the pressing screw is single-grooved with a pitch of 31.8 mm. The dividing step is therefore 30 mm for the feed screw and 7.95 mm for the pressing screw. We do not list the individual measured values in the paper; they enter the diagram in the next section.

Figure 5: 3D model of the pressing space – volume parts

Figure 6: Diagram of the idealized compression ratio of a screw press

5 Measurement evaluation

The compression ratio is the ratio between the bulk material volume before compaction and the volume of the compacted material after or during compaction. The value of the compression ratio indicates how much the input volume decreases during the compacting process. The feeder compression ratio is the ratio of the given hopper material volume to the particular measured volumes in the individual feeding screw sections:

$$z_{pp,i} = \frac{V_{dp,max}}{V_{dp,i}}, \tag{1}$$

where $z_{pp,i}$ is the compression ratio in a given screw section of the feeder, $V_{dp,max}$ is the maximum input volume of pressed material in the feeder, and $V_{dp,i}$ is the volume of the measured part in the given screw section of the feeder. It is necessary to determine the compression ratio separately in the mechanically separate phases of the process. The resultant compression ratio in the pressing space of the pressing chamber is the product of the compression ratio in the feeder and the partial compression ratios of the pressing screw:

$$z_{pl,i} = \frac{V_{dl,max}}{V_{dl,i}} \cdot z_{pp,max}, \tag{2}$$

where $z_{pl,i}$ is the compression ratio in a given screw section of the pressing space, $V_{dl,max}$ is the maximum input volume of pressed material in the pressing space, $V_{dl,i}$ is the volume of the measured part in the given screw section of the pressing space, and $z_{pp,max}$ is the maximum compression ratio in the feeder.

By substituting the measured values into the formulas we find the character of the compression ratio, which is shown in the diagram. Figure 6 is divided into two parts: the first presents the compression ratio in the feeding space, and the second describes the compression ratio in the pressing space. The undulating character of the compression ratio is influenced by the changes in the geometry of the screws and nozzles.
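A worked example of formulas (1) and (2) is sketched below; the section volumes are illustrative placeholders only, not the values measured in CATIA V5.

```python
# Minimal sketch of equations (1) and (2); the section volumes below are
# illustrative placeholders, not the values measured in CATIA V5.

feeder_volumes = [95.0, 95.0, 82.0, 64.0, 51.0]   # V_dp,i [cm^3], 30 mm steps
press_volumes  = [24.0, 24.0, 19.5, 15.0, 11.2]   # V_dl,i [cm^3], 7.95 mm steps

# Eq. (1): feeder compression ratio in each section
V_dp_max = max(feeder_volumes)
z_pp = [V_dp_max / v for v in feeder_volumes]

# Eq. (2): pressing-space ratio, multiplied by the maximum feeder ratio
V_dl_max = max(press_volumes)
z_pp_max = max(z_pp)
z_pl = [(V_dl_max / v) * z_pp_max for v in press_volumes]

print("feeder ratios  :", [round(z, 2) for z in z_pp])
print("pressing ratios:", [round(z, 2) for z in z_pl])
```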
6 Conclusion

The values and the character of the compression ratio presented in this paper are idealized. In the real process they cannot be considered completely authoritative; it is not possible, for example, to determine the real compression by a direct calculation, because multi-axial compaction processes are very complex and difficult to describe theoretically. The behavior of the material during compaction is influenced by technological and design parameters. The results of this experiment will be compared with real experiments and measurements, and the results obtained will be used to create a basic picture of the processes inside the chambers during compaction on this screw press. Further research should determine the real compression ratio of this machine.

Acknowledgement

This paper reports on work done in the project "Developing of progressive technology of biomass compaction and production of prototypes and highly productive tools" (ITMS code of the project: 26240220017), supported by the Research and Development Operating Programme funded by the European Regional Development Fund.

Numerically optimized uniformly most powerful alphabets for hierarchical-decode-and-forward two-way relaying

M. Hekrdla

Abstract

We address the issue of the parametric performance of the hierarchical-decode-and-forward (HDF) strategy in a wireless 2-way relay channel. Promising HDF, representing the concept of wireless network coding, performs well with a pre-coding strategy that requires channel state information (CSI) on the transceiver side. Assuming the practical case when CSI is available only on the receiver side and the channel conditions do not allow adaptive strategies, the parametrization causes significant HDF performance degradation for some modulation alphabets. Alphabets that are robust to the parametrization (denoted uniformly most powerful (UMP)) have already been proposed, restricted to the class of non-linear multi-dimensional frequency modulations. In this work, we focus on the general design of unrestricted UMP alphabets. We formulate an optimization problem which is solved by standard non-linear convex constrained optimization algorithms, particularly by Nelder-Mead global optimization search, which is further refined by the local interior-points method.

Keywords: wireless network coding, hierarchical-decode-and-forward relaying, wireless two-way relay channel, modulation alphabet design, non-linear convex optimization.

1 Introduction

Cooperative wireless network protocols which utilize network coding in the physical layer have recently attracted strong research interest, due to the significant capacity gains that they offer; for citations see [1] and the references therein. The terms wireless network coding (WNC) or physical-layer network coding (PLNC) are usually used to stress the fact that operations similar to network coding (NC) [2] principles are carried out in the wireless domain and completely in the physical layer. Even in a wireless 2-way relay channel (2-WRC), which is the simplest cooperative network, WNC almost doubles the achievable communication sum-rates.
A 2-WRC consists of two terminals, A and B, bi-directionally communicating with a supporting relay node R in a half-duplex manner (the terminals cannot send and receive at the same time). In [3] and [4], the authors show that a two-stage denoise-and-forward (DNF)/hierarchical-decode-and-forward (HDF) strategy is very promising in a 2-WRC, since it can reliably operate outside the classical multiple-access (MAC) capacity region [5]. Instead of DNF, we feel that the more generic term HDF is better suited, especially for more complicated multi-source networks. HDF consists of a MAC stage, when A and B are simultaneously transmitting to the relay R, with only hierarchical (exclusively coded) data decoding, and a broadcast (BC) stage, when R broadcasts the previously decoded hierarchical data. The exclusive operation allows the terminals to decode the hierarchical data using their own data serving as complementary-side information (C-SI) [6].

Assuming canonical linear (one complex dimension; PSK, ASK, QAM) modulation alphabets, the HDF strategy suffers from unavoidable wireless channel parametrization in the practical case when only channel state information at the receiver side (CSIR) is available. The problem of parametrization can be suppressed by adaptive extended-cardinality network coding [7]; the adaptation requires moderate channel dynamics, and it is sensitive to channel estimation errors. According to [8], we are able to overcome the parametrization effects by proper multi-dimensional alphabet design. Paper [9] defined a class of uniformly most powerful (UMP) alphabets which are robust to the parametrization. It has been shown that the parametrization is not harmful when the terminals use binary alphabets, but non-binary linear alphabets always suffer from this effect. In that case, more dimensions need to be accommodated to meet the UMP condition. Paper [9] exploits the additional dimensions of non-linear multi-dimensional frequency alphabets to fulfill the UMP condition.

In this paper, we study the design of unrestricted multi-dimensional UMP alphabets. We reformulate the UMP alphabet design as a complex optimization problem (maximizing the minimal Euclidean distance constrained by the UMP condition and a unit mean symbol energy constraint), which is to be solved by standard non-linear convex constrained optimization algorithms, particularly by Nelder-Mead global optimization search, which is further refined by the local interior-points method [10]. The results are presented in Table 1, and the relevant error performance is presented in Sec. 4.

2 System model

2.1 Signal space description and used notation

Let $\mathcal{A}_t$ be the modulation alphabet at terminal $t \in \{A, B\}$. We distinguish two cases: a) both terminals use the same alphabet, $\mathcal{A}_A = \mathcal{A}_B$, or b) the alphabets are assumed to be unequal, $\mathcal{A}_A \neq \mathcal{A}_B$. We assume that both terminals have the same alphabet cardinality $M_d = |\mathcal{A}_A| = |\mathcal{A}_B|$, which is a power of two. Further, we do not consider channel coding, because it can be serially concatenated with the proposed alphabets without affecting the results [4, 7]. The modulation function $\mathcal{M}: \mathbb{Z}_{M_d} = \{0, 1, \ldots, M_d - 1\} \to \mathbb{C}^{N_s}$ is assumed to be a memoryless per-symbol mapper; however, the output constellation space signals are generally allowed to be multi-dimensional, with dimensionality $N_s$. Let the mapping correspond directly to the signal indexation, $\mathcal{M}(d_t) = \mathbf{s}_{d_t}$, where $d_t \in \mathbb{Z}_{M_d}$ denotes a data symbol from terminal $t \in \{A, B\}$ and $\mathbf{s}_{d_t}$ is a multi-dimensional constellation space signal vector.
2.2 HDF in a wireless two-way relay channel

The HDF strategy is a two-way relaying concept which consists of two independent stages: MAC and BC. Considering the half-duplex constraint, HDF has the minimum number of stages for two-way relaying. We assume an accurately time-synchronized system with full CSIR available (synchronization issues are beyond the scope of this paper), and the channel dynamics do not allow us any adaptation strategies. In the MAC stage, both A and B transmit to the relay in an interfering manner. The received signal is

$$\mathbf{x} = h_A \mathbf{s}_{d_A} + h_B \mathbf{s}_{d_B} + \mathbf{w}, \tag{1}$$

where $\mathbf{w}$ is AWGN with variance $2N_0$ per complex dimension, and the channel parameters $h_A$ and $h_B$ are complex Gaussian random variables with unit variance for flat Rayleigh fading, or have a Rician distributed envelope with a certain Rician factor for Rice fading. The Rician factor is defined as the power ratio between the stationary (line-of-sight) and scattered components. Maximum likelihood (ML) based processing at R decodes hierarchical data from the received signal (1) [4], where the hierarchical data is $d_{AB} = \chi(d_A, d_B)$; the exclusive operation is a function $\chi: \mathbb{Z}_{M_d}^2 \to \mathbb{Z}_{M_{d_{AB}}}$, where $M_{d_{AB}}$ denotes the cardinality of the exclusive symbol alphabet. The exclusive operation meets the exclusive law [7], which guarantees the existence of the invertible function $\chi^{-1}: (\mathbb{Z}_{M_d}, \mathbb{Z}_{M_{d_{AB}}}) \to \mathbb{Z}_{M_d}$. We assume the minimal-cardinality exclusive operation, $M_d = M_{d_{AB}}$, and based on [9] the exclusive operation is fixed to bitwise XOR: $\chi(d_A, d_B) = d_A \oplus d_B$. In the BC stage, R transmits the hierarchical data received previously in the MAC stage; the BC stage is similar to the standard broadcast channel. The terminals decode the desired data from the hierarchical data with the help of their own data and the invertible function, similarly as in NC concepts.

2.3 UMP alphabets

It was shown in [9] that the hierarchical minimal Euclidean distance

$$d^2_{min}(\alpha) = \min_{k \oplus l \neq m \oplus n} \|\mathbf{u}_{kl} - \mathbf{u}_{mn}\|^2 \tag{2}$$

mostly determines the ML error performance with CSIR [7], and it is parametrized by $\alpha$, where $\mathbf{u}_{ij} = \mathbf{s}_i + \alpha\mathbf{s}_j \in \mathbb{C}^{N_s}$ and $\alpha \in \mathbb{C}$ denote a hierarchical signal in the constellation space and a channel parameter, respectively. In the same paper, the authors propose alphabets and exclusive operations whose minimal distance reaches the upper bound

$$d^2_{min}(\alpha) \stackrel{!}{=} |\alpha|^2 \delta^2_{min}, \quad \forall \alpha \in \mathbb{C},\ |\alpha| \leq 1, \tag{3}$$

and which are robust w.r.t. the parametrization by $\alpha$; in other words, their performance remains parametric, but for all possible parameter values the performance is maximal. Such alphabets are therefore denoted uniformly most powerful (UMP) alphabets. The symbol $\delta^2_{min}$ stands for the minimal distance of the primary alphabet $\mathcal{A}$, $\delta^2_{min} = \min_{k \neq l} \|\mathbf{s}_k - \mathbf{s}_l\|^2$, $k, l \in \mathbb{Z}_{M_d}$. The initial definition of a UMP alphabet (3) is equivalent to the following condition [9]:

$$\|\mathbf{s}_k - \mathbf{s}_m\|^2 + |\alpha|^2\|\mathbf{s}_l - \mathbf{s}_n\|^2 - 2|\alpha|\,|\langle \mathbf{s}_k - \mathbf{s}_m, \mathbf{s}_l - \mathbf{s}_n\rangle| \geq |\alpha|^2\delta^2_{min}, \tag{4}$$

for $k \neq m$, $l \neq n$, $k \oplus l \neq m \oplus n$ and $\forall \alpha \in \mathbb{C}, |\alpha| \leq 1$.

3 Design of UMP alphabets

In contrast to [9], where UMP frequency-modulation alphabets were designed, we face the problem of general UMP alphabet design with a natural unit mean symbol energy constraint, $\bar{\varepsilon} = \frac{1}{M_d}\sum_{i=0}^{M_d-1}\|\mathbf{s}_i\|^2$. The optimization goal can be formulated as:

maximize $\delta^2_{min}$, subject to $\bar{\varepsilon} = 1$ and the UMP condition (4).  (5)

The proposed optimization search has two free parameters: the alphabet cardinality $M_d$ and the alphabet dimensionality $N_s$. The optimization problem (5) is a standard min-max problem, which is conveniently described as:

maximize $t$, subject to $t \leq \|\mathbf{s}_k - \mathbf{s}_l\|^2$ for $k \neq l$, $k, l \in \mathbb{Z}_{M_d}$, $\bar{\varepsilon} = 1$ and the UMP condition (4).  (6)
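A schematic numerical version of search (6) for the equal-alphabet case a) is sketched below. It is our illustration only, under two stated simplifications: the UMP condition (4) is enforced on a finite grid of |α| values, and SciPy's SLSQP solver stands in for the Nelder-Mead plus interior-points combination used in the text; several random starting points are needed in practice, as noted above.

```python
# Schematic version of search (6) for the equal-alphabet case a) (ours).
# Simplifications: the UMP condition (4) is enforced only on a finite grid
# of |alpha| values, and SciPy's SLSQP solver stands in for the Nelder-Mead
# plus interior-points combination used in the paper.
import numpy as np
from scipy.optimize import minimize

Md, Ns = 4, 2                           # alphabet cardinality / dimensionality
alphas = np.linspace(0.05, 1.0, 20)     # grid of |alpha| values in (0, 1]

def unpack(x):
    """x = [t, Re and Im parts of the Md x Ns signal matrix]."""
    t = x[0]
    S = x[1:1 + Md * Ns] + 1j * x[1 + Md * Ns:]
    return t, S.reshape(Md, Ns)

def inequalities(x):
    """All inequality constraints g(x) >= 0 of problem (6)."""
    t, S = unpack(x)
    g = [np.linalg.norm(S[k] - S[l])**2 - t        # t <= ||s_k - s_l||^2
         for k in range(Md) for l in range(Md) if k != l]
    for k in range(Md):                            # UMP condition (4)
        for m in range(Md):
            for l in range(Md):
                for n in range(Md):
                    if k != m and l != n and (k ^ l) != (m ^ n):
                        dkm, dln = S[k] - S[m], S[l] - S[n]
                        ip = abs(np.vdot(dkm, dln))
                        g += [np.linalg.norm(dkm)**2
                              + a**2 * np.linalg.norm(dln)**2
                              - 2 * a * ip - a**2 * t for a in alphas]
    return np.array(g)

def energy(x):
    """Equality constraint: unit mean symbol energy."""
    _, S = unpack(x)
    return np.mean(np.sum(np.abs(S)**2, axis=1)) - 1.0

rng = np.random.default_rng(0)
x0 = np.concatenate(([0.1], rng.standard_normal(2 * Md * Ns) / np.sqrt(Ns)))
res = minimize(lambda x: -x[0], x0, method="SLSQP",
               constraints=[{"type": "ineq", "fun": inequalities},
                            {"type": "eq", "fun": energy}],
               options={"maxiter": 500})
t_opt, S_opt = unpack(res.x)
print("achieved delta^2_min:", t_opt)
```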
The optimization problem (6) is recognized to be a convex quadratic continuous optimization problem, for which standard optimization techniques converge very well and with reasonable precision [10]. Specifically, we have used the Nelder-Mead global-search method with several different starting points of its random number generator, and the results are then refined by the local-search interior-points method. We do not use a direct analytical solution based on the Karush-Kuhn-Tucker conditions, due to the large number of constraint conditions (4). We distinguish two cases: a) both terminals use the same alphabet, $\mathcal{A}_A = \mathcal{A}_B$, and b) the terminals use unequal alphabets, $\mathcal{A}_A \neq \mathcal{A}_B$. Case b) needs to incorporate the fact that optimal alphabets should have both minimal distances of $\mathcal{A}_A$ and $\mathcal{A}_B$ greater than some threshold; therefore we redefine $\delta^2_{min} = \min\{\|\mathbf{s}_{d_A} - \mathbf{s}_{d'_A}\|^2, \|\mathbf{s}_{d_B} - \mathbf{s}_{d'_B}\|^2\}$, where $d_A \neq d'_A$, $d_B \neq d'_B$; $d_A, d'_A, d_B, d'_B \in \mathbb{Z}_{M_d}$, to reflect the additional requirement for unequal alphabets.

3.1 Resulting optimized UMP alphabets

The results are presented in Table 1. In the third column, the maximal attainable minimal distances $\delta^2_{min}$ not constrained by the UMP condition are listed as a reference, because they form a natural upper bound on the $\delta^2_{min}$ of UMP alphabets [9]. The 4th and 5th columns contain the obtained UMP alphabets.

Table 1: Numerically optimized UMP alphabets

Md    Ns    δ²min, not UMP    δ²min, A_A = A_B    δ²min, A_A ≠ A_B
2†    1     4                 4                   4
4     1‡    2                 0.4                 0.68
4†    2     2.6               2                   2.18
4     3     2.6               2.6                 2.6

The alphabets marked † have identical spectral efficiency $r = \log_2 M_d / N_s = 1$, where the extra dimensionality is taken as extra time-slots, and a fair mutual comparison by the minimal Euclidean distance can be made. Here we need to reflect that the modulations should have the same mean symbol energy per time-slot (i.e. the same mean power); thus in fact the case ($M_d = 4$, $N_s = 2$, $\mathcal{A}_A = \mathcal{A}_B$), called a 4-ary 2D-UMP alphabet, has $\delta^2_{min} = 4$, which is directly equal to BPSK, and the alphabet ($M_d = 4$, $N_s = 2$, $\mathcal{A}_A \neq \mathcal{A}_B$), called a 4-ary unequal 2D-UMP alphabet, then has $\delta^2_{min} = 4.36$, which is better than BPSK by $10\log_{10}\frac{4.36}{4} \simeq 0.37$ dB; see its hierarchical minimal distance in Figure 2 and the error performance comparison in Fig. 6.

The alphabets marked ‡, for both $\mathcal{A}_A = \mathcal{A}_B$ and $\mathcal{A}_A \neq \mathcal{A}_B$, fulfill the UMP condition only for the channel parameter $|\alpha| = 1$ (we call these weak-UMP alphabets), since 1D-UMP alphabets must be binary only [9]. The optimal alphabet for ($M_d = 4$, $N_s = 1$, $\mathcal{A}_A = \mathcal{A}_B$) is $\mathcal{A} = \{-2\sqrt{2/5}, -\sqrt{2/5}, \sqrt{2/5}, 2\sqrt{2/5}\}$; see the constellation in Figure 3 and its hierarchical minimal distance depicted in Figure 1. It surprisingly resembles the standard 4-ASK modulation $\mathcal{A} = \{-\frac{3}{\sqrt{5}}, -\frac{1}{\sqrt{5}}, \frac{1}{\sqrt{5}}, \frac{3}{\sqrt{5}}\}$.

Fig. 1: Parametric hierarchical minimal distance of the quaternary 1D weak-UMP alphabet with equal alphabets

Fig. 2: Parametric hierarchical minimal distance of the quaternary 2D-UMP alphabet with equal alphabets

Fig. 3: The constellation of the 4-ary 1D-UMP alphabet with equal alphabets resembles the standard 4-ASK modulation
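The reported values can be checked directly. The short script below is ours, for verification only: it confirms the unit mean symbol energy and δ²min = 0.4 of the quaternary 1D alphabet above (matching Table 1), and evaluates the hierarchical distance (2) with the bitwise-XOR map at |α| = 1, where the weak-UMP condition applies.

```python
# Verification (ours) of the reported 4-ary 1D weak-UMP alphabet: unit mean
# energy, delta^2_min, and the hierarchical distance (2) at alpha = 1.
import itertools
import numpy as np

s = np.sqrt(2/5) * np.array([-2.0, -1.0, 1.0, 2.0])   # the alphabet above

mean_energy = np.mean(np.abs(s)**2)
delta2_min = min(abs(s[k] - s[l])**2
                 for k in range(4) for l in range(4) if k != l)

def d2_min(alpha):
    """Hierarchical minimal distance (2) for the bitwise-XOR mapping."""
    return min(abs((s[k] + alpha*s[l]) - (s[m] + alpha*s[n]))**2
               for k, l, m, n in itertools.product(range(4), repeat=4)
               if (k ^ l) != (m ^ n))

print(mean_energy)        # -> 1.0
print(delta2_min)         # -> 0.4, as in Table 1
print(d2_min(1.0))        # equals |alpha|^2 * delta2_min at |alpha| = 1
```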
The other optimal case marked ‡, with ($M_d = 4$, $N_s = 1$) and $\mathcal{A}_A \neq \mathcal{A}_B$, astonishingly seems to correspond to scaled versions of two QPSK constellations, see Figure 4.

Fig. 4: The constellations of the quaternary 1D-UMP with unequal alphabets (alphabets $\mathcal{A}_A$ and $\mathcal{A}_B$, respectively) resemble scaled versions of standard QPSK constellations

4 Performance evaluation

In this section, we compare the symbol error rate (SER) of the quaternary 1D alphabets: QPSK, 4-ASK and the proposed 4-ary 1D-UMP alphabets with equal and unequal alphabets (marked ‡). The error performance in Figure 5 is in accordance with our expectations. In the Rayleigh fading channel the UMP condition is not so important; rather, the overall $\delta^2_{min}$ is determining. In contrast to Rayleigh fading, in the Rician fading channel (here with Rician factor $K = 10$ dB) it is preferable to use UMP alphabets even at the price of a lower $\delta^2_{min}$. A remarkable observation is that for sufficiently strong SNR the performance of the 4-ary 1D-UMP equal and unequal alphabets is almost identical.

Fig. 5: Symbol error rate of the quaternary alphabets with a single complex dimension (HDF in a 2-WRC; curves: QPSK, 4-ASK with modulo-sum and bitwise-XOR mappings, and the 4-ary 1D-UMP equal and unequal alphabets, each in Rayleigh fading and in Rice fading with K = 10 dB)

Bit throughput is the relevant performance measure for alphabets with the same spectral efficiency (marked †). In Figure 6 we compare BPSK with the quaternary 2D-UMP alphabets for the cases of equal and unequal alphabets. As expected, the quaternary unequal 2D-UMP alphabet outperforms BPSK by ∼ 0.5 dB. At the cost of a lower spectral efficiency, the 4-ary 3D-UMP alphabet has even better error performance.

Fig. 6: Bit throughput of the proposed alphabets with the HDF strategy in Rice fading with Rician factor K = 10 dB (curves: UMP-BPSK, 4-ary 3D-UMP, and 4-ary 2D-UMP equal and unequal alphabets)

5 Conclusions and discussion

We have applied canonical convex constrained non-linear optimization methods to the general unrestricted design of uniformly most powerful (UMP) alphabets, which are convenient for the hierarchical-decode-and-forward (HDF) strategy in a wireless 2-way relay channel with channel state information (CSI) only on the receiver side, where the channel conditions do not allow adaptive strategies. UMP alphabets, which are robust to the parametrization, had previously been proposed with a restriction to non-linear multi-dimensional frequency modulations. We have re-formulated the optimization problem and solved it by the Nelder-Mead global optimization method, which provided starting points for the interior-points method. The proposed non-binary linear (one-dimensional) alphabets outperform the canonical linear alphabets (PSK, ASK, QAM), and the proposed multi-dimensional alphabets outperform UMP-BPSK modulation. Utilization of the proposed alphabets with suitable mappings on the transitions of convolutional codes (similarly to the concept of trellis coded modulation), possibly further serially concatenated with an inner code forming a turbo coded system, would be preferable in a practical design.
We have supported the idea that unequal alphabets made as power-scaled versions of canonical linear modulations are well robust to the parametrization problem. This work may serve as a basis for multi-dimensional multiple-input multiple-output (MIMO) UMP alphabet design, which will be a topic for future work.

Acknowledgement

This work was supported by the FP7-ICT SAPHYRE project; by the Grant Agency of the Czech Republic, project 102/09/1624; and by the Grant Agency of the Czech Technical University in Prague, grant No. SGS10/287/OHK3/3T/13.

About the author

Miroslav Hekrdla received his M.Sc. degree in Electrical Engineering from the Czech Technical University in Prague, Czech Republic, in 2009. His research interests include nonlinear space-time modulation, coding and processing, relay-based wireless communication systems with distributed, cooperative and MIMO processing, and aspects of information theory and channel coding.

References

[1] Fu, S., Lu, K., Zhang, T., Qian, Y., Chen, H.-H.: Cooperative wireless networks based on physical layer network coding. Wireless Communications, IEEE, vol. 17, no. 6, 2010, pp. 86–95.
[2] Yeung, R. W., Li, S.-Y. R., Cai, N., Zhang, Z.: Network Coding Theory. Now Publishers, 2006.
[3] Popovski, P., Yomo, H.: Physical network coding in two-way wireless relay channels. In Proc. IEEE Internat. Conf. on Commun. (ICC), Jun. 2007, pp. 707–712.
[4] Sykora, J., Burr, A.: Hierarchical alphabet and parametric channel constrained capacity regions for HDF strategy in parametric wireless 2-WRC. In Proc. IEEE Wireless Commun. Network. Conf. (WCNC), (Sydney, Australia), Apr. 2010, pp. 1–6.
[5] Cover, T. M., Thomas, J. A.: Elements of Information Theory. John Wiley & Sons, 1991.
[6] Sykora, J., Burr, A.: Network coded modulation with partial side-information and hierarchical decode and forward relay sharing in multi-source wireless network. In Wireless Conference (EW), 2010 European, 2010, pp. 639–645.
[7] Koike-Akino, T., Popovski, P., Tarokh, V.: Optimized constellations for two-way wireless relaying with physical network coding. Selected Areas in Communications, IEEE Journal on, vol. 27, Jun. 2009, pp. 773–787.
[8] Uricar, T., Sykora, J.: Design criteria for hierarchical exclusive code with parameter-invariant decision regions for wireless 2-way relay channel. EURASIP Journal on Wireless Communications and Networking, vol. 2010, 2010, pp. 1–13. Article ID 921427.
[9] Hekrdla, M., Sykora, J.: Channel parameter invariant network coded FSK modulation for hierarchical decode and forward strategy in wireless 2-way relay channel. In COST 2100 MCM, (Aalborg, Denmark), June 2010, pp. 1–8. TD10-11087.
[10] Boyd, S., Vandenberghe, L.: Convex Optimization. New York, NY, USA: Cambridge University Press, 2004.

Miroslav Hekrdla
E-mail: miroslav.hekrdla@fel.cvut.cz
Department of Radio Engineering
Faculty of Electrical Engineering
Czech Technical University in Prague
Technická 2, 166 27 Prague, Czech Republic

Asymptotic power series of field correlators

I. Caprini, J. Fischer, I. Vrkoč

Abstract

We address the problem of the ambiguity of a function determined by an asymptotic perturbation expansion. Using a modified form of the Watson lemma recently proved elsewhere, we discuss a large class of functions determined by the same asymptotic power expansion and represented by various forms of integrals of the Laplace-Borel type along a general contour in the Borel complex plane. Some remarks on possible applications in QCD are made.
(Presented at the international conference "Selected Topics in Mathematical and Particle Physics", organized in honour of the 70th anniversary of Professor Jiří Niederle at New York University, Prague, 5–7 May 2009.)

1 Asymptotic perturbation expansions

Perturbation expansions are known to be divergent both in quantum electrodynamics and in quantum chromodynamics, as well as in many other physically interesting theories and models. In QED, divergence was proved by F. J. Dyson in 1952 (see [1]). His result has been revisited and reformulated by many authors ([2, 3]; see also the review in [4]). Dyson proposed to give the divergent series a mathematical meaning by interpreting it as a series asymptotic to $f(z)$, the sought function:

$$f(z) \sim \sum_{n=0}^{\infty} f_n z^n, \quad z \in S,\ z \to 0, \tag{1}$$

where $S$ is a point set having the origin as an accumulation point, $z$ being the perturbation parameter. To see how dramatically the philosophy of perturbation theory was changed by this step, let us first recall the definition of an asymptotic series.

Definition: Let $S$ be a region or point set having the origin as an accumulation point. The power series $\sum_{n=0}^{\infty} f_n z^n$ is said to be asymptotic to the function $f(z)$ as $z \to 0$ on $S$, and we write Eq. (1), if the set of functions $R_N(z)$,

$$R_N(z) = f(z) - \sum_{n=0}^{N} f_n z^n, \tag{2}$$

satisfies the condition

$$R_N(z) = o(z^N) \tag{3}$$

for all $N = 0, 1, 2, \ldots$, $z \to 0$ and $z \in S$.

Note that the asymptotic series is defined by a different limiting procedure than the Taylor one: taking $N$ fixed, one observes how $R_N(z)$ behaves for $z \to 0$, $z \in S$, the procedure being repeated for all integers $N \geq 0$. Convergence may be provable without knowing $f(z)$, but asymptoticity can be tested only if one knows both the $f_n$ and $f(z)$. By (1), $f(z)$ is not uniquely determined; there are many different functions having the same asymptotic series (1). The ambiguity of a function given by an asymptotic series is illustrated by the lemma of Watson.

2 Watson lemma

Consider the following integral

$$\Phi_{0,c}(\lambda) = \int_0^c e^{-\lambda x^{\alpha}}\, x^{\beta-1} f(x)\, \mathrm{d}x, \tag{4}$$

where $0 < c < \infty$ and $\alpha > 0$, $\beta > 0$. Let $f(x) \in C^{\infty}[0, c]$, with $f^{(k)}(0)$ defined as $\lim_{x \to 0^+} f^{(k)}(x)$. Let $\varepsilon$ be any number from the interval $0 < \varepsilon < \pi/2$.

Lemma 1 (G. N. Watson): If the above conditions are fulfilled, the asymptotic expansion

$$\Phi_{0,c}(\lambda) \sim \frac{1}{\alpha} \sum_{k=0}^{\infty} \lambda^{-\frac{k+\beta}{\alpha}}\, \Gamma\!\left(\frac{k+\beta}{\alpha}\right) \frac{f^{(k)}(0)}{k!} \tag{5}$$

holds for $\lambda \to \infty$, $\lambda \in S_{\varepsilon}$, where $S_{\varepsilon}$ is the angle

$$|\arg \lambda| \leq \frac{\pi}{2} - \varepsilon. \tag{6}$$

The expansion (5) can be differentiated with respect to $\lambda$ any number of times. For the proof, see for instance [5]. Let us add several remarks:

1) The angle $S_{\varepsilon}$ of validity of (5), (6) is independent of $\alpha$, $\beta$ and $c$.
2) Thanks to the factor $\Gamma\left(\frac{k+\beta}{\alpha}\right)$, the expansion coefficients in (5) grow faster with $k$ than those of the Taylor series for $f(x)$.
3) The expansion coefficients in (5) are independent of $c$. This illustrates the impossibility of a unique determination of a function from its asymptotic expansion.

In the next section we shall give a modification of the Watson lemma, which shows that under plausible assumptions the straight integration contour can be bent.
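The independence of $c$ in remark 3 can be observed numerically. The short check below is ours; $\alpha = \beta = 1$ and $f(x) = 1/(1+x)$ are arbitrary illustrative choices. It compares the integral (4) for two different endpoints $c$ with the truncated expansion (5).

```python
# Numerical check of the Watson lemma (ours): alpha = beta = 1 and
# f(x) = 1/(1+x) are arbitrary illustrative choices.
import math
from scipy.integrate import quad

def phi(lam, c):
    """Integral (4) with alpha = beta = 1 and f(x) = 1/(1+x)."""
    return quad(lambda x: math.exp(-lam * x) / (1.0 + x), 0.0, c)[0]

def series(lam, N):
    """Truncated expansion (5): here f^(k)(0)/k! = (-1)^k."""
    return sum((-1)**k * math.gamma(k + 1) * lam**(-(k + 1))
               for k in range(N + 1))

lam = 30.0
print(phi(lam, 1.0), phi(lam, 5.0))  # differ only at the exp(-lam*c) level
print(series(lam, 6))                # matches both, independently of c
```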
3 Modified Watson lemma

The modified Watson lemma we present below (and call Lemma 2') is a special case of Lemma 2, which we publish and prove in Ref. [6]. The special form given here is obtained from that given in [6] by setting $\alpha = \beta = 1$.

Let $G(r)$ be a continuous complex function of the form $G(r) = r\exp(\mathrm{i}g(r))$, where $g(r)$ is a real-valued function given on $0 \leq r < c$, with $0 < c \leq \infty$. Assume that the derivative $g'(r)$ is continuous on the interval $0 \leq r < c$ and that a constant $r_0 > 0$ exists such that

$$|g'(r)| \leq K_1 r^{\gamma_1}, \quad r_0 \leq r < c, \tag{7}$$

for a nonnegative $K_1$ and a real $\gamma_1$. Assume that a parameter $\varepsilon > 0$ exists such that the quantities

$$a = \inf_{r_0 \leq r < c} g(r), \qquad b = \sup_{r_0 \leq r < c} g(r) \tag{8}$$

satisfy

$$b - a < \pi - 2\varepsilon. \tag{9}$$

Let $f(u)$ be holomorphic on the disc $|u| \leq r_0$ and measurable on the curve $u = G(r)$. Assume that

$$|f(G(r))| \leq K_2 r^{\gamma_2}, \quad r_0 \leq r < c, \tag{10}$$

holds for a nonnegative $K_2$ and a real $\gamma_2$. Define the function $\Phi^{(G)}_{b,c}(\lambda)$ for $0 \leq b < c$ by

$$\Phi^{(G)}_{b,c}(\lambda) = \int_{r=b}^{c} e^{-\lambda G(r)}\, f(G(r))\, \mathrm{d}G(r) \tag{11}$$

(this integral exists, since we assume that $f(u)$ is measurable along the curve $u = G(r)$ and bounded by (10)).

Lemma 2': If the above assumptions are fulfilled, then the asymptotic expansion

$$\Phi^{(G)}_{0,c}(\lambda) \sim \sum_{k=0}^{\infty} \lambda^{-(k+1)}\, \Gamma(k+1)\, \frac{f^{(k)}(0)}{k!} \tag{12}$$

holds for $\lambda \to \infty$, $\lambda \in T_{\varepsilon}$, where

$$T_{\varepsilon} = \left\{\lambda : \lambda = |\lambda|\exp(\mathrm{i}\varphi),\ -\frac{\pi}{2} - a + \varepsilon < \varphi < \frac{\pi}{2} - b - \varepsilon\right\}. \tag{13}$$

We refer the reader to Ref. [6] for the proof of Lemma 2 and its discussion. The simplified version, Lemma 2', is given here to illustrate some special features of the general Lemma 2 and its possible applications. Let us add several remarks to Lemma 2':

1/ Lemma 2' implies Watson's lemma when the integration contour is chosen to have the special form of a segment of the real positive semiaxis, i.e. $g(r) \equiv 0$, and $f(r) \in C^{\infty}[0, c]$.

2/ Perturbation theory is obtained by setting $\lambda = 1/z$ in (11). Then the function

$$F^{(G)}_{0,c}(z) = \int_{r=0}^{c} e^{-G(r)/z}\, f(G(r))\, \mathrm{d}G(r) \tag{14}$$

has the asymptotic expansion

$$F^{(G)}_{0,c}(z) \sim \sum_{k=0}^{\infty} z^{k+1} f^{(k)}(0) \tag{15}$$

for $z \to 0$ and $z \in Z_{\varepsilon}$, where

$$Z_{\varepsilon} = \left\{z : z = |z|\exp(\mathrm{i}\chi),\ -\frac{\pi}{2} + b + \varepsilon < \chi < \frac{\pi}{2} + a - \varepsilon\right\}. \tag{16}$$

3/ The parameter $\varepsilon$ in (9) is limited by $0 < \varepsilon < \pi/2 - (b - a)/2$, but is otherwise arbitrary. Note, however, that the upper limit of $\varepsilon$ depends on $b - a$ and may be considerably less than $\pi/2$. This happens, for instance, if the integration contour is bent or meandering.

4/ The parametrization $G(r) = r\exp(\mathrm{i}g(r))$ does not include contours that cross a circle centred at $r = 0$, either touching it or intersecting it twice, so that the derivative $g'(r)$ either does not exist or is not bounded. In such cases, the parametrization has to be modified.

5/ Let us remark that the proof of Lemma 2 in Ref. [6] allows us to obtain remarkable correlations between the strength of the bounds on the remainder and the size of the angles within which the asymptotic expansion is valid. It follows from [6] that the bounds are proportional to

$$\frac{1}{(|\lambda| - 1)\sin\varepsilon}\, e^{-(|\lambda| - 1) r_0 \sin\varepsilon} \tag{17}$$

or to

$$C_N\, (|\lambda|\sin\varepsilon)^{-(N+2)}, \tag{18}$$

where $N$ is the truncation order and the $C_N$, $N = 0, 1, 2, \ldots$, are $\lambda$-independent positive numbers. The bounds decrease with increasing $\varepsilon$, the parameter which determines the angles $T_{\varepsilon}$ and $Z_{\varepsilon}$; see (13) and (16), respectively. As a consequence, the larger the angle of validity, the looser the bound, and vice versa.

4 Some applications to perturbative QCD

To discuss some applications of Lemma 2', we take the Adler function [7],

$$D(s) = -s\,\frac{\mathrm{d}\Pi(s)}{\mathrm{d}s} - 1, \tag{19}$$

where $\Pi(s)$ is the polarization amplitude defined in terms of the vector current products for light quarks. The Adler function $D(s)$ is real analytic in the $s$-plane, except for a cut along the timelike axis produced by unitarity [7, 8]. In perturbative QCD, any finite-order approximant has cuts along the timelike axis, while the renormalization-group improved expansion,

$$D(s) = d_1\,\alpha_s(s)/\pi + d_2\,(\alpha_s(s)/\pi)^2 + d_3\,(\alpha_s(s)/\pi)^3 + \ldots, \tag{20}$$

has, in addition, an unphysical singularity due to the Landau pole in the running coupling $\alpha_s(s)$. The expansion (20) is known to be divergent, the $d_n$ growing as $n!$ at large orders [9]–[12].
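The contour freedom asserted by Lemma 2' can also be seen numerically. In the sketch below (ours; $f(u) = 1/(1+u)$ and the ray angle 0.2 rad are arbitrary illustrative choices), the representation (14) is evaluated along a straight and a tilted ray; for small $z > 0$ the two values differ only by a contribution of order $\exp(-c\cos\theta/z)$, invisible to the expansion (15).

```python
# Numerical illustration of the contour freedom in (14) (ours):
# f(u) = 1/(1+u) and the ray angle 0.2 rad are arbitrary choices.
import cmath
import math
from scipy.integrate import quad

def F(z, theta, c=8.0):
    """Representation (14) with f(u) = 1/(1+u) along the ray
    G(r) = r*exp(i*theta); dG = exp(i*theta) dr."""
    phase = cmath.exp(1j * theta)
    integrand = lambda r: cmath.exp(-r * phase / z) / (1.0 + r * phase) * phase
    re = quad(lambda r: integrand(r).real, 0.0, c, limit=200)[0]
    im = quad(lambda r: integrand(r).imag, 0.0, c, limit=200)[0]
    return complex(re, im)

def series(z, N):
    """Truncated expansion (15): f^(k)(0) = (-1)^k * k! for this f."""
    return sum((-1)**k * math.factorial(k) * z**(k + 1) for k in range(N + 1))

z = 0.05
straight, tilted = F(z, 0.0), F(z, 0.2)
print("straight contour:", straight)
print("tilted contour  :", tilted)
# The true contour dependence is O(exp(-c*cos(theta)/z)), here far below
# the quadrature tolerance, so the two values agree within numerical
# accuracy, and both match the series to asymptotic accuracy:
print("difference      :", abs(straight - tilted))
print("series, N = 8   :", series(z, 8))
```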
4.1 On the high ambiguity of perturbative QCD

To discuss the implications of Lemma 2', we first define the Borel transform $B(u)$ by [11]

$$B(u) = \sum_{n \geq 0} b_n u^n, \quad b_n = \frac{d_{n+1}}{\beta_0^n\, n!}. \tag{21}$$

It is usually assumed that the series (21) converges on a disc of nonvanishing radius (this result was rigorously proved by David et al. [13] for the scalar $\varphi^4$ theory in four dimensions). This is what is required in Lemma 2' for the generalized Borel transform $f(G(r))$. If we assume that the series (20) is asymptotic, Lemma 2' implies a large freedom in recovering the true function from its coefficients. All the functions $D^{G}_{0,c}(s)$ of the form

$$D^{G}_{0,c}(s) = \frac{1}{\beta_0} \int_{r=0}^{c} e^{-\frac{G(r)}{\beta_0 a(s)}}\, B(G(r))\, \mathrm{d}G(r), \tag{22}$$

where $a(s) = \alpha_s(s)/\pi$, admit the asymptotic expansion

$$D^{G}_{0,c}(s) \sim \sum_{n=1}^{\infty} d_n\, (a(s))^n, \quad a(s) \to 0, \tag{23}$$

in a certain domain of the $s$-plane, which follows from (13) and the expression of the running coupling $a(s)$ given by the renormalization group. No function of the form $D^{G}_{0,c}(s)$, (22), can be a priori preferred when looking for the true Adler function. Contributing only to the exponentially suppressed remainder, neither the form or length of the contour, nor the values of $B(u)$ outside the convergence disc, can affect (23). The remainder to (23) is of the form $H\exp(-D/\beta_0 a(s)) \sim H\left(-\Lambda^2/s\right)^D$. The quantities $H$ and $D > 0$ depend on the contour and on $B(u)$ outside the disc, which can be chosen rather freely. As a consequence, (22) contains arbitrary power terms, to be added to (23).

4.2 Analyticity and optimal conformal mapping

In discussing the divergence of (20) and (21), the singularities of $D(s)$ in the $\alpha_s(s)$ plane and, respectively, those of $B(u)$ in the Borel plane are of importance. As for $B(u)$, some information about the location and nature of the singularities can be obtained from certain classes of Feynman diagrams (which can be summed, see [10]–[12]), and from general arguments based on renormalization theory [9, 14]. It follows that $B(u)$ has branch points along the rays $u \geq 2$ and $u \leq -1$ (IR and UV renormalons, respectively). Other (though nonperturbative) singularities, for $u \geq 4$, are produced by instanton-antiinstanton pairs. (Due to the singularities at $u > 0$, the series (20) is not Borel summable.) No other singularities of $B(u)$ in the Borel plane are known, however; it is usually assumed that $B(u)$ is holomorphic elsewhere.

To make full use of the analyticity of $B(u)$ in the whole region of assumed analyticity $\mathcal{B}$ (the $u$-plane cut along the rays above), we shall use the method of optimal conformal mapping [15]. Let $K$ be the disc of convergence of the series (21); clearly, $K \subset \mathcal{B}$. Then, evidently, the expansion (21) in powers of $u$ can be replaced by an expansion in powers of $w(u)$,

$$B(u) = \sum_{n \geq 0} c_n w^n, \tag{24}$$

where the function $w = w(u)$, with the property $w(0) = 0$, represents the conformal mapping of a region of $\mathcal{B}$ onto the disc $|w| < 1$, on which (24) converges. It can easily be seen that (24) has better convergence properties than (21) in this case: indeed, as was proved in [15] by using the Schwarz lemma, the larger the region mapped by $w(u)$ onto $|w| < 1$, the faster the large-order convergence rate of (24). If $w(u)$ maps the whole $\mathcal{B}$ onto the unit disc $|w| < 1$ in the $w$-plane, the mapping is called optimal. In this case, (24) converges everywhere on $\mathcal{B}$ and the convergence rate is the fastest [15]. The region of convergence of (24) then coincides with $\mathcal{B}$, the region of analyticity. In this way, the optimal conformal mapping can express analyticity in terms of convergence.
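For the doubly-cut plane just described, a standard choice in the literature (see, e.g., [16]) is $w(u) = \left(\sqrt{1+u} - \sqrt{1-u/2}\right)/\left(\sqrt{1+u} + \sqrt{1-u/2}\right)$, which maps the $u$-plane cut along $u \geq 2$ and $u \leq -1$ onto the unit disc with $w(0) = 0$. The short check below is ours and only verifies these mapping properties numerically.

```python
# Check (ours) of a standard optimal mapping of the u-plane cut along
# u >= 2 and u <= -1 onto the unit disc |w| < 1, with w(0) = 0.
import numpy as np

def w(u):
    """Conformal mapping of the doubly-cut Borel plane onto |w| < 1."""
    u = np.asarray(u, dtype=complex)
    return ((np.sqrt(1 + u) - np.sqrt(1 - u/2))
            / (np.sqrt(1 + u) + np.sqrt(1 - u/2)))

print(abs(w(0.0)))                               # 0: the origin is preserved
samples = [1.0 + 2.0j, -0.5, 1.9, -50j, 100 + 100j]
print([round(abs(w(u)), 4) for u in samples])    # all < 1: inside the disc
# Points on the cuts are mapped to the boundary |w| = 1:
print(abs(w(3.0 + 1e-12j)), abs(w(-2.0 + 1e-12j)))
```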
Inserting (24) into (22), we obtain an alternative asymptotic expansion:

$$D^{G}_{0,c}(s) = \frac{1}{\beta_0} \int_{r=0}^{c} e^{-\frac{G(r)}{\beta_0 a(s)}}\, \sum_{n \geq 0} c_n\, [w(G(r))]^n\, \mathrm{d}G(r). \tag{25}$$

Containing powers of the optimal conformal mapping $w(u)$ (which has the same location of singularities as the expanded function $B(u)$), this representation implements more information about the singularities of $B(u)$ than the series (21) in powers of $u$, even at finite orders. Thus, it is to be expected that even the finite-order approximants of (25) will provide a more precise description of the function searched for [16, 17].

4.3 Analyticity may easily get lost

We shall briefly mention an intriguing situation showing that careless manipulation with the integration contour may have a fateful impact on analyticity. In [18], two different integration contours in the $u$-plane were chosen for the summation of the so-called renormalon chains [10]: for $a(s) > 0$ and $a(s) < 0$, a ray parallel and close to the positive and, respectively, the negative semiaxis is chosen. As was expected and later proved [19], analyticity is lost with this choice, the summation being only piecewise analytic in $s$. On the other hand, as shown in [20, 21], the Borel summation with the principal value (PV) prescription of the same class of diagrams admits an analytic continuation in the $s$-plane, in agreement with analyticity except for a cut along a segment of the spacelike axis, related to the Landau pole.

5 Conclusion

In this paper we have discussed some special consequences of our general result published in [6], which is based on a modification of the Watson lemma. It follows that a perturbation series, if regarded as asymptotic, implies a huge ambiguity of the possible expanded functions having the same asymptotic expansion of the type (1). This mathematical fact is often ignored or overlooked in physical applications. Our contribution consists in the fact that we have specified a special subclass of such functions by Lemma 2 of Ref. [6]. Moreover, in the present paper, we have considered a special case of Lemma 2 (defined as Lemma 2' in Section 3 of this paper), which we discuss here in more detail due to its direct applicability to perturbative QCD. To find the true solution, additional information inputs are unavoidable. Applying the result to QCD, we conclude that the contour of the integral representing the QCD correlator can be chosen very freely. The same holds for the Borel transform $B(u)$ outside the convergence circle. We have kept our discussion on a general level, bearing in mind that little is known, in a rigorous framework, about the analytic properties of the QCD correlators in the Borel plane. If some specific properties are known or assumed, the integral representations will have additional analytic properties. Naturally, the results obtained in [6] may also be useful in other branches of physics where perturbation series are divergent.

Acknowledgement

One of us (I.C.) thanks Prof. J. Chýla and the Institute of Physics of the Czech Academy in Prague for hospitality. J. F. thanks Prof. P. Rączka and the Institute of Theoretical Physics of Warsaw University for hospitality. Supported by CNCSIS in the framework of Program IDEI, contract No. 464/2009, and by projects No. LA08015 of the Ministry of Education and AV0Z10100502 of the Academy of Sciences of the Czech Republic.

References

[1] Dyson, F. J.: Phys. Rev. 85, 631 (1952).
[2] Lautrup, B.: Phys. Lett. B69, 109 (1977); Lipatov, L. N.: Sov. Phys. JETP 45, 216 (1977); Parisi, G.: Phys. Lett. B76, 65 (1978); Mueller, A. H.: Nucl. Phys. B250, 327 (1985).
[3] 't Hooft, G.: In: The Whys of Subnuclear Physics, Proc. of the 15th Intern. School on Subnucl. Physics, Erice, 1977, ed. by A. Zichichi (Plenum Press, New York, 1979), 943.
[4] Fischer, J.: Int. J. Mod. Phys. A12, 3625 (1997).
[5] Jeffreys, H.: Asymptotic Approximations, Clarendon Press, Oxford, 1962; Dingle, R. B.: Asymptotic Expansions: Their Derivation and Interpretation, Academic Press, 1972; Fedoryuk, M. V.: Asymptotics, Integrals and Series (in Russian), Moscow, Nauka, 1987, 58.
[6] Caprini, I., Fischer, J., Vrkoč, I.: J. Phys. A42, 395403, 2009.
[7] Adler, S. L.: Phys. Rev. D10, 3714 (1974).
[8] Bogoliubov, N. N., Shirkov, D. V.: Introduction to the Theory of Quantized Fields, Interscience, 1959.
[9] Mueller, A. H.: In: QCD – Twenty Years Later, Aachen 1992, edited by P. Zerwas and H. A. Kastrup (World Scientific, Singapore, 1992).
[10] Beneke, M.: Nucl. Phys. B405, 424 (1993); Broadhurst, D. J.: Z. Phys. C58, 339 (1993).
[11] Neubert, M.: Nucl. Phys. B463, 511 (1996).
[12] Beneke, M.: Phys. Rept. 317, 1 (1999).
[13] David, F., Feldman, J., Rivasseau, V.: Comm. Math. Phys. 116, 215 (1988).
[14] Beneke, M., Braun, V. M., Kivel, N.: Phys. Lett. B404, 315 (1997).
[15] Ciulli, S., Fischer, J.: Nucl. Phys. 24, 465 (1961).
[16] Caprini, I., Fischer, J.: Phys. Rev. D60, 054014 (1999); Caprini, I., Fischer, J.: Phys. Rev. D62, 054007 (2000).
[17] Cvetič, G., Lee, T.: Phys. Rev. D64, 014030 (2001).
[18] Howe, D. M., Maxwell, C. J.: Phys. Rev. D70, 014002 (2004); Brooks, P. M., Maxwell, C. J.: Phys. Rev. D74, 065012 (2006).
[19] Caprini, I., Fischer, J.: Phys. Rev. D76, 018501 (2007).
[20] Caprini, I., Neubert, M.: JHEP 03, 007 (1999).
[21] Caprini, I., Fischer, J.: Phys. Rev. D71, 094017 (2005).

Irinel Caprini
National Institute of Physics and Nuclear Engineering Bucharest
POB MG-6, R-077125, Romania

Jan Fischer
Institute of Physics
Academy of Sciences of the Czech Republic
182 21 Prague 8, Czech Republic

Ivo Vrkoč
Mathematical Institute
Academy of Sciences of the Czech Republic
115 67 Prague 1, Czech Republic

Determining the position of head and shoulders in neurological practice with the use of cameras

P. Kutílek, J. Hozman

Abstract

The posture of the head and shoulders can be influenced negatively by many diseases of the nervous system and of the visual and vestibular systems. We have designed a system and a set of procedures for evaluating the inclination (roll), flexion (pitch) and rotation (yaw) of the head and the inclination (roll) and rotation (yaw) of the shoulders. A new computational algorithm allows non-invasive and non-contact head and shoulder position measurement using two cameras mounted opposite each other; the displacement of the optical axes of the cameras is also corrected.

Keywords: shoulder posture, head posture, neurology, camera calibration.

1 Introduction

The objective of our study was to develop a technique for precise head and shoulder posture measurement or, in other words, for measuring the native position of the head and shoulders in 3D space. The technique set out to determine the differences between the anatomical coordinate system of the head and shoulders and the physical coordinate system with an accuracy of two degrees for inclination, flexion/extension and rotation. No similar technique has previously been developed that can be widely and easily used in neurological clinical practice.
Nevertheless, the technique could have important applications, as there are many neurological disorders that affect the postural alignment of the head and shoulders. These can be divided into three main groups:

• Cervical blockages and diseases of the cervical spine often cause a wide range of positional abnormalities.
• Dystonic "movement disorders": an abnormal body segment position is typical for dystonia.
• Paralyses of the eye muscles also often cause a position that attempts to compensate for the insufficient function.

In many cases, the abnormalities of the head and shoulder position can be small and difficult to observe. In clinical practice, it has until now been possible to quantify only those deviations that are well visible. Although an accurate method for measuring head and shoulder postural alignment could contribute to the diagnosis of vestibular and some other disorders, this issue has not been systematically studied in the past. At the present time, the use of an orthopedic goniometer is the standard way to evaluate angles simply and rapidly in clinical practice. However, there are some limitations, especially in the case of head and shoulder posture measurement: due to the combination of three movement components, it is problematic to use a goniometer alone.

Fig. 1: Examples of head position abnormalities

Ferrario, V. F., et al., 1995 [1] developed a method based on television technology that was faster than conventional analysis. The subject's body and face were identified by 12 points. On the basis of an image analysis program, the specified angles were calculated after digitizing the recorded films. Galardi, G., et al., 2003 [2] developed an objective method for measuring posture using FASTRAK. FASTRAK is an electromagnetic system consisting of a stationary transmitter station and four sensors. The head position in space was reconstructed (based on the sensor signals) and was observed from the axial, sagittal and coronal planes.

In this paper, we describe our contribution and our proposed method for measuring head and shoulder position. The method is designed for use in neurology, to discover relationships between some neurological disorders and postural alignment. Pictures of the head, marked on the tragus and the outer eye canthus, and of the shoulders, marked on the acromion, are taken simultaneously by two digital cameras.

2 Methods

Hozman, J., et al., 2004 [3] proposed a method based on the application of three digital cameras with stands and appropriate image processing software. This non-invasive head position measurement method was designed for use in neurology, to discover relationships between some neurological disorders and postural alignment. The objective was to develop a technique for precise head posture measurement or, in other words, for measuring the native position of the head in 3D space. The technique aimed to determine the differences between the anatomical coordinate system and the physical coordinate system with an accuracy of one to two degrees for tilt and rotation. Pictures of the head, marked on the tragus and the outer eye canthus, are taken simultaneously by three digital cameras aligned by a laser beam. The head position was measured with a precision of 0.5° in three planes (rotation-yaw, flexion-pitch and inclination-roll) [3].
Fig. 2: Anatomical horizontal and anatomical axis

In our recent method for studying head position alone [5], two cameras are required for determining the head position. The rotation and inclination of the head are evaluated from the difference between the tragus coordinates in the left-profile and right-profile images. The coordinates of the left and right tragus (Figure 2) are evaluated by finding the centre of the rounded mark attached to the tragus. The images are captured simultaneously, using two cameras situated on the same optical axis, parallel with the frontal plane of the subject. It is a mathematically simple problem to determine the tilt in the sagittal plane (flexion/extension) of the head from the side shots (profile photographs). The flexion value is measured relatively, as the inclination of the connecting line between the tragus and the exterior eye corner. The angle between the anatomical and physical horizontal is determined by the angle between vector $\mathbf{v}$ (the horizontal vector), given by the camera position, and vector $\mathbf{u}$, which represents the coordinates of the points evaluated in the image. The angle is calculated as follows:

$$\theta = \arccos\frac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{u}| \cdot |\mathbf{v}|} \tag{1}$$

where $\mathbf{u} = (A_{1x}\,[\mathrm{px}] - A_{2x}\,[\mathrm{px}],\ A_{1y}\,[\mathrm{px}] - A_{2y}\,[\mathrm{px}])$ and $\mathbf{v} = (1, 0)$. $A_{1x}$ and $B_{1x}$ are the x-axis coordinates of the tragus in the right-profile and left-profile images, and $A_{2x}$ and $B_{2x}$ are the x-axis coordinates of the outer eye canthus in the right-profile and left-profile images. The coordinates are evaluated by finding the centres of the rounded marks attached to the tragus and the outer eye canthus of the patient. The extent of circumvolution (rotation) of the head is evaluated from the difference between the tragus coordinates in the left-profile and right-profile images (Figure 3). These images are captured simultaneously, using two cameras situated on the same optical axis, parallel with the frontal plane of the subject.

Fig. 3: Geometry used for measuring the head position

After evaluating the coordinates of the tragus in the captured images, the angle of head rotation is calculated as follows:

$$\varphi = \arcsin\frac{(A_{1x}\,[\mathrm{px}] - B_{1x}\,[\mathrm{px}]) \cdot const}{d_s\,[\mathrm{mm}]} \tag{2}$$

where

$$const = \frac{CCD\,[\mathrm{mm}] \cdot (d\,[\mathrm{mm}] - d_s\,[\mathrm{mm}])}{2 \cdot f\,[\mathrm{mm}] \cdot S\,[\mathrm{px}]}.$$

$A_{1x}$ and $B_{1x}$ are the x-axis coordinates of the tragus in the right-profile and left-profile images, $d_s$ is the diameter of the head, and $const$ is a constant converting the distance between the tragus coordinates from pixels to millimeters. The quantity $CCD$ is the width of the CCD sensor given by the camera's manufacturer, $d$ is the distance between the CCD sensors (cameras), $f$ is the focal length of the camera lens, and $S$ is the x-axis image size.

For evaluating the inclination, we applied the same method as was used for evaluating the rotation:

$$\Phi = \arcsin\frac{(A_{1y}\,[\mathrm{px}] - B_{1y}\,[\mathrm{px}]) \cdot const}{d_s\,[\mathrm{mm}]} \tag{3}$$

$A_{1y}$ and $B_{1y}$ are the y-axis coordinates of the tragus in the right-profile and left-profile images. The calculation of $const$ has to take into account the quantities modified for the y-axis, i.e. $CCD$ is then the height of the CCD sensor given by the camera's manufacturer, and $S$ is the y-axis image size.
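A compact sketch of the head-angle computations (1)–(3) is given below; the camera parameters and marker coordinates are illustrative placeholders only, not values from the clinical system.

```python
# Sketch of the head-angle formulas (1)-(3); the camera parameters and
# marker coordinates below are illustrative placeholders only.
import math

# Camera/geometry parameters (assumed values)
CCD_W, CCD_H = 4.8, 3.6      # CCD width/height [mm]
f = 16.0                     # focal length [mm]
S_X, S_Y = 1280, 960         # image size [px]
d = 2000.0                   # distance between the two cameras [mm]
d_s = 180.0                  # head diameter [mm]

# Marker coordinates [px]: A = right profile, B = left profile;
# index 1 = tragus, index 2 = outer eye canthus
A1, A2, B1 = (640.0, 480.0), (520.0, 470.0), (655.0, 492.0)

# Eq. (1): flexion/extension from the tragus-canthus line in one profile
u = (A1[0] - A2[0], A1[1] - A2[1])
theta = math.degrees(math.acos(u[0] / math.hypot(*u)))   # v = (1, 0)

# Eqs. (2), (3): rotation and inclination from the two profiles
const_x = CCD_W * (d - d_s) / (2 * f * S_X)   # px -> mm on the x-axis
const_y = CCD_H * (d - d_s) / (2 * f * S_Y)   # px -> mm on the y-axis
rotation = math.degrees(math.asin((A1[0] - B1[0]) * const_x / d_s))
inclination = math.degrees(math.asin((A1[1] - B1[1]) * const_y / d_s))
print(f"flexion {theta:.1f} deg, rotation {rotation:.1f} deg, "
      f"inclination {inclination:.1f} deg")
```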
fig. 4: geometry used for measuring the shoulder position

for evaluating shoulder position, we use a method similar to the method that we use for evaluating head position. we assume an approximation of the shoulder movement by a circular movement of the markers, so the formula for determining shoulder rotation is:

$\vartheta = \arcsin \frac{(a_{3x}\,[\mathrm{px}] - b_{3x}\,[\mathrm{px}]) \cdot \mathrm{const}}{d_r\,[\mathrm{mm}]}$ (4)

where $\mathrm{const} = \frac{ccd\,[\mathrm{mm}] \cdot (d\,[\mathrm{mm}] - d_r\,[\mathrm{mm}])}{2 \cdot f\,[\mathrm{mm}] \cdot s\,[\mathrm{px}]}$.

$a_{3x}$ and $b_{3x}$ are the x-axis coordinates of the acromion in the right-profile and left-profile images. a medical doctor indicates these anatomical points with a red mark for easy location of the anatomical points in the pictures. if the clinical investigation is carried out by an experienced medical doctor, it is not necessary to apply colored marks to the anatomical parts of the body before making an examination using our camera system. $d_r$ is the distance between the acromions, measured by a medical doctor before a clinical examination using our system. the value $\mathrm{const}$ is a constant converting the distance between the acromion coordinates from pixels to millimeters. the quantity $ccd$ is the width of the ccd sensor given by the camera's manufacturer, $d$ is the distance between the ccd sensors (cameras), $f$ is the focal length of the camera lens, and $s$ is the horizontal x-axis image size.

to evaluate the shoulder inclination we applied the same method as was used for evaluating the shoulder rotation:

$\zeta = \arcsin \frac{(a_{3y}\,[\mathrm{px}] - b_{3y}\,[\mathrm{px}]) \cdot \mathrm{const}}{d_r\,[\mathrm{mm}]}$ (5)

$a_{3y}$ and $b_{3y}$ are the vertical y-axis coordinates of the acromion in the right-profile and left-profile images. the calculation of $\mathrm{const}$ has to take into account the modified quantities for the vertical y-axis, i.e. $ccd$ is the height of the ccd sensor given by the camera's manufacturer, and $s$ is the vertical y-axis image size.

the software also determines the angular displacement of the head to the shoulders for rotation, using the formula

$\kappa = \vartheta - \varphi$ (6)

and the angular displacement of the head to the shoulders for inclination

$\lambda = \zeta - \phi$. (7)

in the way described above, based on identifying anatomical points with the use of cameras, we can avoid influencing patients while we are measuring the inclination (roll), flexion (pitch) and rotation (yaw) of the head and shoulders. unfortunately, the measurement accuracy is determined by the accuracy of the calibration of the cameras. problems of deviations of ccd sensors and deviation of optical axes can be excluded by special hardware or by computationally intensive software. in the earlier version of our system, a laser collimator was tested and used. when the cameras are on the same optical axis, the right position is signaled by an led (the laser beam is detected). a test version of our system currently uses software correction based on computational algorithms implemented in our software, or we can use, e.g., the camera calibration toolbox in matlab for identifying and correcting the position of the cameras. problems of deviations of ccd sensors and deviation of optical axes can also be excluded by scanning a correction mark on a transparent mask. in this way, we find the differences between the coordinates of this scanned point in the two frames. these differences represent the deviations that will be used for correcting the calculation. an easily adjusted formula for calculating, for example, the rotation is

$\varphi = \arcsin \frac{(a_{1x}\,[\mathrm{px}] - b_{1x}\,[\mathrm{px}] - k_x\,[\mathrm{px}]) \cdot \mathrm{const}}{d_s\,[\mathrm{mm}]}$, (8)

other assumptions are identical with the calculation without corrections.

fig. 5: displacement of optical axes
fig. 6: the two-arm stand with fixed cameras and laser collimators

for the angular deviation of optical axes, the correction is more difficult.
we have to apply specialized, computationally intensive software, e.g. the matlab camera calibration toolbox. this software enables accurate detection of the mutual positions of the optical axes by scanning the correction marks on a transparent mask/board. the software provides information on the mutual displacement and the mutual rotation of the optical axes of the cameras. with our proposed software, we find the values for the relative position of the two axes of the cameras, i.e. the rotation vector $[\omega, \xi, \psi]$ and the translation vector $[k_x, k_y, k_z]$. for calculating the position of the head and shoulders, one of the cameras must be selected as the main camera, and we determine the rotation, inclination and flexion of the head and shoulders in 3d space with respect to the coordinate system of the main camera. the physician must also take this state into account in the clinical examination. using the components of the rotation and translation vectors, the formula for determining head rotation is

$\varphi = \arcsin \left[ \frac{(a_{1x}\,[\mathrm{px}] - b_{1x}\,[\mathrm{px}]) \cdot \mathrm{const}}{d_s\,[\mathrm{mm}]} + \frac{k_x\,[\mathrm{mm}] + \frac{d\,[\mathrm{mm}] - d_s\,[\mathrm{mm}]}{2} \cdot \sin\xi}{d_s\,[\mathrm{mm}]} \right]$,

where $\xi$ is the angle of mutual rotation of the coordinate systems / the axes of the cameras, $d = k_z$ is the total distance between the cameras, purposely adjusted to this distance, and $k_x$ is the deviation of the optical axis due to the displacement of the cameras and the deviation of the centers of the ccd sensors. the formula for calculating the inclination is defined in a similar way:

$\phi = \arcsin \left[ \frac{(a_{1y}\,[\mathrm{px}] - b_{1y}\,[\mathrm{px}]) \cdot \mathrm{const}}{d_s\,[\mathrm{mm}]} + \frac{k_y\,[\mathrm{mm}] + \frac{d\,[\mathrm{mm}] - d_s\,[\mathrm{mm}]}{2} \cdot \sin\omega}{d_s\,[\mathrm{mm}]} \right]$

flexion/extension is calculated using the formula:

$\theta = \arccos \frac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{u}| \, |\mathbf{v}|} + \psi$,

where $\psi$ is the angle of mutual rotation of the coordinate systems of the cameras. the plus sign, i.e. addition of the mutual rotation and displacement of the cameras, determines the general relation of the values, but the values can be positive or negative depending on the direction of the mutual rotation and displacement of the cameras. other conditions are identical with the calculations for the position of the head and shoulders when the cameras are mounted and adjusted by a laser system which ensures that the optical axes of the cameras lie on the same axis. similarly, the characteristics for calculating the angles of inclination, rotation and flexion are identical.

fig. 7: angular deviation of the optical axes
fig. 8: flowchart of clinical measurements using our camera system

the cameras are opposite to each other, and we cannot take pictures of one pattern using two cameras. for this reason, we have to use a transparent plate with colored correction marks or a planar checkerboard [9] located at a distance $d/2$ between the cameras. the correction procedure is such that the measured mark is scanned on a transparent mask which is located at a distance $d/2$ between the cameras. we can make the correction for rotation and inclination by observing the vertical and horizontal components of the coordinates and their mutual deviations in the two images/photographs. the subsequent image processing is the same as common calibration procedures using a planar checkerboard [9]. the exact values of the displacements and rotations of the optical axes can be added to the calculation for correcting the displacements or to refine the angles (figure 8). it appears, however, that the angle correction software is time consuming and impractical for medical practice, and for this reason it is used only for correcting the displacement of the optical axes and the ccd sensors.
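as a minimal sketch of the corrected computation shown above, the function below extends equation (2) by the translation and rotation corrections; the correction values $k_x$ and $\xi$ are hypothetical calibration outputs of our own choosing, not values reported by the authors.

```python
import math

def corrected_rotation_deg(a1x_px, b1x_px, const, ds_mm, kx_mm, d_mm, xi_rad):
    # uncorrected term of equation (2), converted from pixels to millimeters
    base = (a1x_px - b1x_px) * const / ds_mm
    # correction for the displacement kx and the mutual rotation xi of the
    # optical axes, following the corrected rotation formula in the text
    corr = (kx_mm + 0.5 * (d_mm - ds_mm) * math.sin(xi_rad)) / ds_mm
    return math.degrees(math.asin(base + corr))

# hypothetical calibration output: 1.2 mm axis displacement, 0.3 deg rotation
print(corrected_rotation_deg(a1x_px=1510.0, b1x_px=1483.0, const=0.0946,
                             ds_mm=180.0, kx_mm=1.2, d_mm=2000.0,
                             xi_rad=math.radians(0.3)))
```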
3 results

our method was tested in clinical practice, and preliminary experiments indicate that the method functions effectively. a user interface has been created for easy control of the new system (figure 9). the first set of data was measured on 30 volunteers. the measured data show that a healthy subject holds his/her head aligned with the physical coordinate system within a range of ±5 degrees. statistical analyses of this sample show that all values (inclination, flexion, rotation) follow a normal distribution. our system, based on two identical digital cameras, is sufficiently accurate for determining the inclination, flexion and rotation of the head and shoulders in neurological practice. an advantage of the system is that it is easy to determine the angles between the anatomical horizontal and axis and the physical coordinate system defined by the position of the cameras.

fig. 9: user interface of the software during accuracy tests on a dummy placed on an accurately slewing stand

the two cameras are placed on both sides (lateral profiles) of the patient. this is a very important advantage for medical doctors, because they can make various examinations which need an open space in front of the face. a disadvantage of the system with cameras is that the error of the detected angle increases with increasing abnormalities of the head position / measured angles. the reason for this is the large deviation of the head position from the optimum location in the middle of the distance between the two cameras, which causes large differences in the distances between the ccd sensors (cameras) and the measured head and shoulders. our system is therefore designed for accurate identification of small abnormalities during the rehabilitation process and for cases when the abnormality is so slight that it cannot be determined by conventional goniometers. a second disadvantage of a system of cameras is the increasing error of the detected angle with increasing motion of the measured subject. we cannot measure the position/angles of fast-moving patients. however, there is an advantage when measuring small angles (of the head and shoulders), which are measured with very small error. this means that we can measure very small abnormalities of head and shoulder position.

4 conclusion

we have designed special calibration equipment and implemented procedures for evaluating measured data. the new analytical methods for the implemented procedures have been described in this paper. the new equipment and measurement method are designed for a very accurate evaluation of head and shoulder position in neurological practice. the system is cheaper than sophisticated systems using accelerometers and magnetometers. nevertheless, the greatest advantage of the proposed method is its non-invasive and non-contact way of measurement, without using any sensors and applying only cheap single-use markers. if the clinical investigation is performed by an experienced medical doctor, it is not necessary to apply colored marks to anatomical parts of the body. an advantage of our system over conventional systems such as zebris or sonosens is that it can measure a patient without the influence of mechanical elements on the patient's body segments. the system also allows direct detection of the anatomical axes of the patient's head and shoulders, which cannot be done using current systems. these ways of measuring head and shoulder posture could also be applied in other areas of engineering, medicine and science.
our system can be used anywhere to study the posture of a person.

acknowledgement

the work presented here has been carried out at the department of biomedical technology, faculty of biomedical engineering, czech technical university in prague in the framework of research program no. msm 6840770012 "transdisciplinary biomedical engineering research ii" of the czech technical university, sponsored by the ministry of education, youth and sports of the czech republic. the study has been supported by grant agency of the czech republic grant gacr 102/08/h018.

references

[1] ferrario, v., sforza, c., germann, d., dalloca, l., miani, a.: head posture and cephalometric analyses: an integrated photographic / radiographic technique. american journal of orthodontics & dentofacial orthopedics, vol. 106, 1994, pp. 257–264.
[2] ferrario, v., sforza, c., tartaglia, g., barbini, e., michielon, g.: new television technique for natural head and body posture analysis. cranio, vol. 13, 1995, pp. 247–255.
[3] cerny, r., strohm, k., hozman, j., stoklasa, j., sturm, d.: head in space – noninvasive measurement of head posture. the 11th danube symposium – international otorhinolaryngological congress, bled, 2006, pp. 39–42.
[4] hozman, j., zanchi, v., cerny, r., marsalek, p., szabo, z.: precise advanced head posture measurement. the 3rd wseas international conference on remote sensing (remote'07), wseas press, 2007, pp. 18–26.
[5] kutilek, p., hozman, j.: non-contact method for measurement of head posture by two cameras and calibration means. the 8th czech-slovak conference on trends in biomedical engineering, bratislava, 2009, pp. 51–54.
[6] harrison, a., wojtowicz, g.: clinical measurement of head and shoulder posture variables. the journal of orthopaedic & sports physical therapy (jospt), vol. 23, 1996, pp. 353–361.
[7] young, j. d.: head posture measurement. journal of pediatric ophthalmology and strabismus, vol. 25, no. mar–apr (2), 1988, pp. 86–89.
[8] hozman, j., sturm, d., stoklasa, j., cerny, r.: measurement of postural head alignment in neurological practice. the 3rd european medical and biological engineering conference – embec'05 [cd-rom], society of biomedical engineering and medical informatics of the czech medical association jep, vol. 11, prague, 2005, pp. 4229–4232.
[9] bouguet, j.: camera calibration toolbox for matlab, april 2002. available: http://www.vision.caltech.edu/bouguetj/calib doc (may 1, 2009).

ing. patrik kutílek, ph.d.
doc. ing. jiří hozman, ph.d.
phone: +420 312 608 302, +420 224 358 490
e-mail: kutilek@fbmi.cvut.cz
faculty of biomedical engineering
czech technical university in prague
sitna sq. 3105, kladno, czech republic

acta polytechnica vol. 51 no. 2/2011

searching for space debris elements with the "pi of the sky" system

m. sokołowski, m. należyty, a. majczyna, r. wawrzaszek, p. wajer

abstract

the main purpose of the "pi of the sky" system is to investigate short timescale astrophysical phenomena (particularly gamma-ray bursts, optical transients and variable stars). wide field, short exposures and full automation of the system, together with effective algorithms, give good prospects for effective identification of space debris elements. these objects can be a great danger for current and future space missions, and should be continuously monitored and cataloged.
algorithms for identifying optical transients (ot), designed for the "pi of the sky" experiment, enable moving objects like planes, satellites and space debris elements to be identified. the algorithm verifies each ot candidate against a database of known satellites and is also able to automatically self-identify moving objects not present in this database. the data collected by the prototype in the las campanas observatory enabled us to obtain a large sample of observations of moving objects. some of these objects were identified as high-orbit geostationary (geo) satellites, which shows that it is possible to observe even distant satellites with small aperture photo lenses. the analysis of the sample is still going on. the preliminary results and algorithms for automatic identification of moving objects will be described here.

keywords: space debris, robotic telescopes, satellite observations, satellite tracking, optical transients.

1 introduction

space debris consists of objects originally launched by humans, which now orbit the earth but are no longer in use. they can be upper stages of rockets, dead satellites, engine modules of geostationary satellites, remnants from satellite collisions, etc. collisions of these elements moving at a speed of several km/s can destroy active satellites and can in the worst case be dangerous for astronauts on missions (spacecraft or space stations). this problem was spectacularly demonstrated in reality on 12 february 2009, when the iridium 33 and cosmos 2251 satellites collided over northern siberia at relative speeds of about 11 km/s. the crash produced thousands of new pieces of space junk, which can be dangerous for other satellites. such elements should be cataloged and continuously monitored to enable satellites and spacecraft to avoid them.

most space debris elements populate low earth orbits (leo), but the number of objects near geostationary orbit (geo) is constantly growing. there are about 13 000 artificial objects in the earth's orbit, fewer than 800 of which are active satellites, while more than 12 000 are space debris elements larger than 10 cm. in principle, the only way to avoid collisions with potentially dangerous space debris elements (larger than 1 cm in size) is by manoeuvering spacecraft. however, in order to do this, space junk elements must be continuously monitored and cataloged.

efforts are being undertaken by national and international space agencies to discover and monitor space debris elements. the largest existing catalog of objects in the earth orbit is the space object catalog, provided by norad (north american aerospace defense command), which is based on a network of optical and radio telescopes of various apertures. systems like norad typically work in two modes. in wide-field mode they can discover new space debris elements, which are later precisely tracked by narrow-field instruments. long-term narrow-mode observations enable precise determination of orbital parameters. the orbital elements are stored in the two line elements (tle) format which, besides identifiers, consists of ten parameters fully describing the object's orbit [1].

there are also initiatives in europe to create a system similar to norad. one of the ideas is the space situational awareness program (ssa). the first stage of such a system is the european space surveillance system (esss) [2], which was proposed for automatic detection and identification of space debris pieces, and for determining orbital elements.
it will track objects from leo orbits and predict their movements. natural candidates to join such a system are robotic telescopes. these instruments work automatically with almost no human attention. they also perform automatic data analysis, which is expected in the esss. in many cases they run algorithms very similar to those required for discovering space debris. examples of european telescopes already tracking elements in the earth orbit are tarot (fov of 2° × 2°) and starbrook (fov of 10° × 6°) [2]. another component of this system will be wide-field optical systems like "pi of the sky". the purpose of this study is to verify the ability of the "pi of the sky" system to observe and automatically discover objects in the earth's orbit (satellites or space debris).

fig. 1: basic parameters of "pi of the sky" telescopes. the image on the left shows the prototype system installed in the las campanas observatory in chile between 2004 and 2010

parameter | prototype | final
focal length | 85 mm | 85 mm
focal ratio | 1.2 | 1.2
ccd sensor brand | fairchild | sta
ccd size | 2 048 × 2 048 px | 2 048 × 2 048 px
fov | 20° × 20° | 1.5–2 steradians
ccd pixel size | 15 × 15 μm | 15 × 15 μm
time of exposure | 10 s | 10 s
relative accuracy of shutter synchronization | 20 ms | 20 ms
limiting magnitude (for stationary objects) | 12m | 12m
average accuracy of astrometry | 10 arcsec | 10 arcsec
# of ccd cameras | 2 | 2 × 12 (goal 2 × 16)
# of mounts | 1 | 2 × 3 (goal 2 × 4)

2 "pi of the sky" system

the "pi of the sky" system [3, 4] was originally designed for observing short timescale optical events, particularly optical counterparts of gamma-ray bursts (grbs) and optical transients, and to monitor variable objects (stars, blazars etc.). the system is remotely controlled, fully automatic, to a large degree autonomous, and performs automatic data analysis. special algorithms were designed and developed in order to automatically discover optical transients and discover new objects in the sky [5, 6, 4]. the final version of the system is currently under construction.

the prototype system was installed in the las campanas observatory in chile (figure 1) between 2004 and 2010. the prototype consists of two cameras, which observe the same field in the sky and collect images in synchronized mode. besides its primary scientific goals, the data from the prototype has allowed us to study the potential of the system for discovering and identifying space debris. the final system will have a substantially larger fov (the final goal is 2 steradians), which will allow the system to look for new pieces of space debris more effectively (more information can be found in [3]). the data from the prototype verified the system capabilities, and helped us further develop the algorithms and software tools for space-debris-oriented analysis. figure 1 gives a summary of the parameters of the prototype and the full version of the system.

3 satellites in "pi of the sky" data

the algorithms designed for identifying short optical transients in the "pi of the sky" data are described in detail in [6]. the on-line algorithm analyses images while they are being collected by the camera, and looks for new objects appearing in the sky. in the first step, the algorithm finds objects which appear in the new image, but were not present in previous images. after this step, a list of flash candidates is created; then further cuts are applied in order to reject the background (mostly cosmic-ray hits, hot pixels, sky background, star fluctuations etc.).
one of the most important cuts at this stage is the coincidence of the two cameras, requiring the optical transient to be visible in both cameras. this cut eliminates cosmic rays striking one of the ccd chips. after the coincidence cut, a list of real flashes from the sky is obtained. most of them are due to moving objects, e.g. planes, satellites, and perhaps space debris elements. the primary goal of the algorithm is to identify ots from natural astrophysical sources, so such objects must be identified, flagged, excluded from the sample of ot candidates, and saved to log files. then another algorithm can simply be used to analyze the objects in detail and look for space debris elements.

in order to reject most of these events, databases of orbital elements in the two line elements (tle) format are retrieved from the internet (mainly from the norad website) every evening. they are combined into a single large database containing ≈ 13 000 orbital elements. for each image, the positions of all satellites in the database are calculated (using the predict package [7]), each flash candidate is verified and, if it is closer than $r_{sat}$ = 1 000 arc sec to any of the satellites, it is flagged and excluded from further ot analysis. this rejection radius was determined according to the distribution of angular distances from flashes to the closest satellite from the database, which is clearly peaked around zero (figure 2). the red dots on the plot represent the distance distribution for randomly generated flashes to the closest satellite from the catalog, and illustrate the size of the combinatorial background.

fig. 2: the image on the left shows the distribution of the distance from a flash event candidate to the closest satellite from the catalog, for events found by the coincidence algorithm during the night of 2007.05.26/27 (entries 9660, mean 8346, rms 4750). for comparison, distances from randomly generated flashes to the nearest satellite are shown as red dots — the combinatorial background is nicely reproduced. for the purposes of this analysis, the signal threshold was decreased to have more accidental coincidences. the image on the right shows the distribution of the distances from real data zoomed into the range [0, 3 000] arc sec only

the satellites identified according to the tle database are mostly leo and geo satellites. images of the low orbit satellite sl-12 are shown in figure 3. there are also many geostationary satellites in the sample (mostly large, with solar panels exceeding 10 m). however, these observations show that it is possible to observe even geostationary objects with the small aperture (70 mm) instrument. images of the geostationary satellite xm-4 blues are shown in figure 4.

fig. 3: five subsequent 10 s exposures showing the fast, low orbit satellite sl-12 observed by the "pi of the sky" prototype

fig. 4: five successive 10 s exposures showing the geostationary satellite xm-4 blues observed by the "pi of the sky" prototype. the satellite is a large object, with solar panels exceeding 10 m
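a minimal sketch of the rejection cut described above is given below; it assumes satellite positions have already been propagated to the image epoch (e.g. with an orbit-propagation tool such as the predict package mentioned in the text) and uses a plain great-circle distance. the function and variable names are ours, not those of the actual "pi of the sky" pipeline, and the sky positions are hypothetical.

```python
import math

R_SAT_ARCSEC = 1000.0  # rejection radius quoted in the text

def angular_distance_arcsec(ra1, dec1, ra2, dec2):
    # great-circle separation of two sky positions given in degrees
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    cos_d = (math.sin(dec1) * math.sin(dec2)
             + math.cos(dec1) * math.cos(dec2) * math.cos(ra1 - ra2))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_d)))) * 3600.0

def flag_satellite_matches(candidates, satellites):
    # flag every flash candidate lying closer than R_SAT_ARCSEC to any
    # cataloged satellite position computed for the same image
    flagged, kept = [], []
    for ra, dec in candidates:
        dmin = min(angular_distance_arcsec(ra, dec, sra, sdec)
                   for sra, sdec in satellites)
        (flagged if dmin < R_SAT_ARCSEC else kept).append((ra, dec, dmin))
    return flagged, kept

# hypothetical positions (degrees): one candidate near a satellite, one not
flagged, kept = flag_satellite_matches(
    candidates=[(151.20, -33.40), (150.00, -30.00)],
    satellites=[(151.25, -33.42), (120.00, 10.00)])
print(len(flagged), len(kept))
```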
the orbital element databases are not complete, and many satellites are not included in them. in order to identify them, event candidates are examined against track criteria. there are currently two procedures for track identification:

• normal track — tries to form a track out of events from different images; it finds slowly moving satellites (like the geostationary satellite in figure 4). if it is possible to fit a track to a set of events from different or the same images and the velocity of the object is approximately constant, all events matching the track are flagged as moving objects. an example of 160 normal tracks fitted during a single night is shown in figure 5.

• single camera tracks — identify tracks using events identified by a single camera. the tracking starts with events from a single image, and requires at least 5 events in a track. if a new single camera track is identified, it is matched to earlier tracks of the same type. if they match each other, the earlier track is extended by the new one and a single larger track is formed. this procedure rejects fast satellites (or planes) which produce "long line" signatures in a single image (figure 3). an example of 40 single camera tracks fitted during a single night is shown in figure 5.

fig. 5: tracks identified by the two algorithms described in the text during the same night. the image on the left shows 160 normal tracks and the image on the right presents 40 single camera tracks

in the initial version only straight lines were fitted. for many moving objects, however, the fov of 20° × 20° is large enough to observe significant curvature of the track. thus, in order to make the fit procedure more efficient, a parabolic curve is now also fitted. if the track consists of at least 20 events and $\chi^2_{par} / \chi^2_{line} < 2$, the parabolic fit parameters are chosen. examples of a straight line and parabolic tracks are shown in figure 6.

fig. 6: an example of a normal track with a fitted parabola (left image) and a single camera track with a fitted straight line (right image)
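the following sketch illustrates the line-versus-parabola selection described above with a least-squares fit; numpy's polynomial fitting stands in for the experiment's own fitting code, the selection rule is implemented literally as quoted in the text, and the track points are synthetic.

```python
import numpy as np

def fit_track(x, y, min_events=20, max_ratio=2.0):
    """fit a straight line and a parabola to track points (ccd pixel
    coordinates) and pick the model following the rule in the text:
    a parabola is kept for tracks of at least `min_events` points with
    chi2_par / chi2_line below `max_ratio`."""
    line = np.polyfit(x, y, 1)   # y = a*x + b
    par = np.polyfit(x, y, 2)    # y = a*x^2 + b*x + c
    chi2_line = float(np.sum((y - np.polyval(line, x)) ** 2))
    chi2_par = float(np.sum((y - np.polyval(par, x)) ** 2))
    if len(x) >= min_events and chi2_par / chi2_line < max_ratio:
        return "parabola", par
    return "line", line

# hypothetical curved track across the 2048 x 2048 px chip
x = np.linspace(100.0, 1900.0, 30)
y = 700.0 - 3.4e-5 * x**2 - 4.6e-5 * x + np.random.normal(0.0, 0.5, x.size)
print(fit_track(x, y)[0])
```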
the efficiency of identifying moving objects by the normal track procedure was studied by checking how many of the satellites identified in the tle database were later identified by the track procedure. the procedure is already very efficient: 99 % of objects that are moving objects according to the tle database were also identified by the normal track procedure. software tools were created in order to obtain the positions and times of all observations of a given tle-satellite. the observational data (right ascension, declination, time and some additional information) is saved to text files (figure 7). optionally, the data may be saved to the database in order to allow for fast searches of multi-night observations of satellites.

fig. 7: example of a text file containing satellite observational data

php scripts were developed to access the data via a web browser; for example, a script extracts all observations of a given satellite (from multiple nights). the data was later used to fit the orbital parameters. two programs were tested: findorb [8] and the custom developed orbfinder software. preliminary tests show that it is possible to fit orbital parameters using "pi of the sky" data if at least 0.7–1.0 % of the whole orbit is observed.

4 conclusions

the data from the prototype "pi of the sky" system was used to study the capability of the system to discover space debris elements. preliminary estimates show that the system is able to observe at least half of the objects from the norad database. it is even possible to observe large and bright satellites in the geostationary orbit. algorithms for automatic identification of moving objects were developed and tested, and they seem to be efficient. the analysis is still going on, but it seems that a wide field system like "pi of the sky" could be used successfully for survey tasks for discovering and calculating the orbits of large and easy targets.

acknowledgement

we are very grateful to g. pojmanski for giving access to the asas dome and for sharing his experience with us. we would like to thank the staff of the las campanas observatory for their help during installation of the apparatus. this work was financed by the polish ministry of science from 2005 until 2010 as a research project.

references

[1] website the source for space surveillance data, http://www.spacetrack.org/tle format.html
[2] klinkrad, h., et al.: europe's eyes on the skies — the proposal for a european space surveillance system, esa bulletin 133 (2008).
[3] żarnecki, a. f., et al.: improving photometry of the pi of the sky, this journal.
[4] małek, k., et al.: pi of the sky detector, advances in astronomy, volume 2010 (2010), article id 194946, 9 pages, http://www.hindawi.com/journals/aa/2010/194946.html
[5] sokolowski, m.: investigation of astrophysical phenomena in short time scales with pi of the sky apparatus, phd thesis, institute for nuclear studies, jan 2008, http://arxiv.org/abs/0810.1179
[6] sokołowski, m., et al.: automated detection of short optical transients of astrophysical origin in real time, advances in astronomy, vol. 2010 (2010), article id 463496, 11 pages, http://www.hindawi.com/journals/aa/2010/463496.html
[7] web page of the predict package, http://www.qsl.net/kd2bd/predict.html
[8] web page of the findorb package, http://www.projectpluto.com/find orb.htm

m. sokołowski
e-mail: msok@fuw.edu.pl
the andrzej soltan institute for nuclear studies
hoża 69, 00-681 warsaw, poland

m. należyty
university of warsaw astronomical observatory
al. ujazdowskie 4, 00-478 warsaw, poland

a. majczyna
the andrzej soltan institute for nuclear studies
hoża 69, 00-681 warsaw, poland

r. wawrzaszek
space research center of the polish academy of sciences
bartycka 18a, 00-716 warsaw, poland

p. wajer
space research center of the polish academy of sciences
bartycka 18a, 00-716 warsaw, poland

acta polytechnica vol. 51 no. 4/2011

coquaternionic quantum dynamics for two-level systems

d. c. brody, e. m. graefe

abstract

the dynamical aspects of a spin-$\frac{1}{2}$ particle in hermitian coquaternionic quantum theory are investigated. it is shown that the time evolution exhibits three different characteristics, depending on the values of the parameters of the hamiltonian.
when energy eigenvalues are real, the evolution is either isomorphic to that of a complex hermitian theory on a spherical state space, or else it remains unitary along an open orbit on a hyperbolic state space. when energy eigenvalues form a complex conjugate pair, the orbit of the time evolution closes again even though the state space is hyperbolic.

keywords: complexified mechanics, pt symmetry, hyperbolic geometry.

over the last decade or so there has been considerable interest in the study of complexified dynamical systems, both classically [1–6] and quantum mechanically [7–17]. for a classical system, its complex extension typically involves the use of complex phase-space variables: $(x, p) \to (x_0 + \mathrm{i}x_1, p_0 + \mathrm{i}p_1)$. hence the dimensionality of the phase space, i.e. the dynamical degrees of freedom, is doubled, and the hamiltonian $h(x, p)$ in general also becomes complex. for a quantum system, on the other hand, its complex extension typically involves the use of a hamiltonian that is not hermitian, whereas the dynamical degrees of freedom associated with the space of states — the quantum phase space variables — are kept real. however, a fully complexified quantum dynamics, analogous to its classical counterpart, can be formulated, where state space variables are also complexified [18, 19].

the present authors recently observed that there are two natural ways in which quantum dynamics can be extended into a fully complex domain [19], where both the hamiltonian and the state space are complexified. in short, one way is to let the state space variables and the hamiltonian be quaternion valued; the other is to let them be coquaternion valued. the former is related to the quaternionic quantum mechanics of finkelstein and others [20, 21], whereas the latter possesses spectral structures similar to those of the pt-symmetric quantum theory of bender and others [7–10].

the purpose of this paper is to work out in some detail the dynamics of an elementary quantum system of a spin-$\frac{1}{2}$ particle under a coquaternionic extension, in a manner analogous to the quaternionic case investigated elsewhere [22]. as illustrated in [19], a coquaternionic dynamical system arises from the extension of the real and the imaginary parts of the state vector in the complex-$j$ direction, where $j$ is the second coquaternionic 'imaginary' unit (described below). the general dynamics is governed by a coquaternionic hermitian hamiltonian, whose eigenvalues are either real or else appear as complex conjugate pairs. here we examine the evolution of the expectation values of the five pauli matrices generated by a generic $2 \times 2$ coquaternionic hermitian hamiltonian. we shall find that, depending on the values of the parameters appearing in the hamiltonian, the dynamics can be classified into three cases: (a) the eigenvalues of $\hat{h}$ are real and the dynamics is strongly unitary in the sense that the 'real part' of the dynamics on the reduced state space is indistinguishable from that generated by a standard complex hermitian hamiltonian; (b) the eigenvalues of $\hat{h}$ are real and the states evolve unitarily into infinity without forming closed orbits; and (c) the eigenvalues of $\hat{h}$ form a complex-conjugate pair but the dynamics remains weakly unitary in the sense that the real part of the dynamics, although generating closed orbits, no longer lies on the state space of a standard complex hermitian system.

interestingly, properties (b) and (c) are in some sense interchanged in a typical pt-symmetric hamiltonian, where the orbits of a spin-$\frac{1}{2}$ system are closed when eigenvalues are real and open otherwise. these characteristics are related to the three cases investigated recently by kisil [23] in a more general context of the heisenberg algebra, based on the use of: (i) a spherical imaginary unit $\mathrm{i}^2 = -1$; (ii) a parabolic imaginary unit $\mathrm{i}^2 = 0$; and (iii) a hyperbolic imaginary unit $\mathrm{i}^2 = 1$. the use of coquaternionic hermitian hamiltonians thus provides a concise way of visualising these different aspects of generalised quantum theory.

before we analyse the dynamics, let us begin by briefly reviewing some properties of coquaternions that are relevant to the ensuing discussion. coquaternions [24], perhaps more commonly known as split quaternions, satisfy the algebraic relation

$i^2 = -1, \quad j^2 = k^2 = ijk = +1$ (1)

and the skew-cyclic relation

$ij = -ji = k, \quad jk = -kj = -i, \quad ki = -ik = j$. (2)

the conjugate of a coquaternion $q = q_0 + i q_1 + j q_2 + k q_3$ is $\bar{q} = q_0 - i q_1 - j q_2 - k q_3$. it follows that the squared modulus of a coquaternion is indefinite: $\bar{q} q = q_0^2 + q_1^2 - q_2^2 - q_3^2$. unlike a quaternion, a coquaternion need not have an inverse $q^{-1} = \bar{q}/(\bar{q}q)$ if it is null, i.e. if $\bar{q}q = 0$. the polar decomposition of a coquaternion is thus more intricate than that of a quaternion. if a coquaternion $q$ has the property that $\bar{q}q > 0$ and its imaginary part also has a positive norm, so that $q_1^2 - q_2^2 - q_3^2 > 0$, then $q$ can be written in the form

$q = |q| \mathrm{e}^{i_q \theta_q} = |q| (\cos\theta_q + i_q \sin\theta_q)$, (3)

where

$i_q = \frac{i q_1 + j q_2 + k q_3}{\sqrt{q_1^2 - q_2^2 - q_3^2}}$ and $\theta_q = \tan^{-1}\!\left(\frac{\sqrt{q_1^2 - q_2^2 - q_3^2}}{q_0}\right)$. (4)

that a coquaternion with a 'time-like' imaginary part admits the representation (3) leads to the strongly unitary dynamics generated by a coquaternionic hermitian hamiltonian. on the other hand, if $\bar{q}q > 0$ but $q_1^2 - q_2^2 - q_3^2 < 0$, i.e. if the imaginary part of $q$ is 'space-like', then

$q = |q| \mathrm{e}^{i_q \theta_q} = |q| (\cosh\theta_q + i_q \sinh\theta_q)$, (5)

where

$i_q = \frac{i q_1 + j q_2 + k q_3}{\sqrt{-q_1^2 + q_2^2 + q_3^2}}$ and $\theta_q = \tanh^{-1}\!\left(\frac{\sqrt{-q_1^2 + q_2^2 + q_3^2}}{|q_0|}\right)$. (6)

if $\bar{q}q > 0$ and $q_1^2 - q_2^2 - q_3^2 = 0$, then $q = q_0(1 + i_q)$, where $i_q = q_0^{-1}(i q_1 + j q_2 + k q_3)$ is a null pure-imaginary coquaternion. finally, if $\bar{q}q < 0$, then we have

$q = |q| \mathrm{e}^{i_q \theta_q} = |q| (\sinh\theta_q + i_q \cosh\theta_q)$, (7)

where

$i_q = \frac{i q_1 + j q_2 + k q_3}{\sqrt{-q_1^2 + q_2^2 + q_3^2}}$ and $\theta_q = \tanh^{-1}\!\left(\frac{q_0}{\sqrt{-q_1^2 + q_2^2 + q_3^2}}\right)$. (8)

as indicated above, the fact that the polar decomposition of a coquaternion is represented either in terms of trigonometric functions or in terms of hyperbolic functions manifests itself in the intricate mixture of spherical and hyperbolic geometries associated with the state space of a spin-$\frac{1}{2}$ system, as we shall describe in what follows.

in the case of a coquaternionic matrix $\hat{h}$, its hermitian conjugate $\hat{h}^\dagger$ is defined in a manner identical to a complex matrix, i.e. $\hat{h}^\dagger$ is the coquaternionic conjugate of the transpose of $\hat{h}$. therefore, for a coquaternionic two-level system, a generic hermitian hamiltonian satisfying $\hat{h}^\dagger = \hat{h}$ can be expressed in the form

$\hat{h} = u_0 \mathbb{1} + \sum_{l=1}^{5} u_l \hat{\sigma}_l$, (9)

where $\{u_l\}_{l=0..5} \in \mathbb{R}$, and

$\hat{\sigma}_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \hat{\sigma}_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \hat{\sigma}_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad \hat{\sigma}_4 = \begin{pmatrix} 0 & -j \\ j & 0 \end{pmatrix}, \quad \hat{\sigma}_5 = \begin{pmatrix} 0 & -k \\ k & 0 \end{pmatrix}$ (10)

are the coquaternionic pauli matrices. the eigenvalues of the hamiltonian (9) are given by

$e_\pm = u_0 \pm \sqrt{u_1^2 + u_2^2 + u_3^2 - u_4^2 - u_5^2}$. (11)

thus, they are both real if $u_1^2 + u_2^2 + u_3^2 > u_4^2 + u_5^2$; otherwise they form a complex conjugate pair. this, of course, is a characteristic feature of a pt-symmetric hamiltonian.
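to make the algebra concrete, here is a small python sketch of coquaternion arithmetic — the product table (1)–(2), the indefinite squared modulus, the classification of the imaginary part used in the polar decomposition, and the reality condition (11); the class is our own illustration, not software from the paper.

```python
from dataclasses import dataclass

@dataclass
class Coquaternion:
    q0: float  # q = q0 + i q1 + j q2 + k q3
    q1: float
    q2: float
    q3: float

    def __mul__(self, o):
        # product implementing i^2 = -1, j^2 = k^2 = +1 and the
        # skew-cyclic relations ij = k, jk = -i, ki = j
        return Coquaternion(
            self.q0*o.q0 - self.q1*o.q1 + self.q2*o.q2 + self.q3*o.q3,
            self.q0*o.q1 + self.q1*o.q0 - self.q2*o.q3 + self.q3*o.q2,
            self.q0*o.q2 + self.q2*o.q0 - self.q1*o.q3 + self.q3*o.q1,
            self.q0*o.q3 + self.q3*o.q0 + self.q1*o.q2 - self.q2*o.q1)

    def conj(self):
        return Coquaternion(self.q0, -self.q1, -self.q2, -self.q3)

    def modulus2(self):
        # indefinite squared modulus q0^2 + q1^2 - q2^2 - q3^2
        return self.q0**2 + self.q1**2 - self.q2**2 - self.q3**2

    def imaginary_type(self):
        n = self.q1**2 - self.q2**2 - self.q3**2
        return "time-like" if n > 0 else ("null" if n == 0 else "space-like")

def eigenvalues_real(u1, u2, u3, u4, u5):
    # reality condition for the eigenvalues (11) of the hamiltonian (9)
    return u1**2 + u2**2 + u3**2 > u4**2 + u5**2

# check ij = k and the indefinite modulus of a null coquaternion
i = Coquaternion(0, 1, 0, 0); j = Coquaternion(0, 0, 1, 0)
print(i * j)                                 # Coquaternion(0, 0, 0, 1)
print(Coquaternion(1, 0, 1, 0).modulus2())   # 0 -> null, no inverse
print(eigenvalues_real(0.3, 1.0, 0.5, 0.4, 0.2))
```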
a unitary time evolution in a coquaternionic quantum theory is generated by a one-parameter family of unitary operators $\mathrm{e}^{-\hat{a}t}$, where $\hat{a}$ is skew-hermitian: $\hat{a}^\dagger = -\hat{a}$. as in the case of complex quantum theory, we would like to let the hamiltonian $\hat{h}$ be the generator of the dynamics. for this purpose, let us write

$\mathrm{i} = \frac{1}{\nu}(i u_2 + j u_4 + k u_5)$, (12)

where $\nu = \sqrt{u_2^2 - u_4^2 - u_5^2}$ if $u_4^2 + u_5^2 < u_2^2$, and $\nu = \sqrt{u_4^2 + u_5^2 - u_2^2}$ if $u_2^2 < u_4^2 + u_5^2$. then we set $\hat{a} = \mathrm{i}\hat{h}$, and the schrödinger equation in units $\hbar = 1$ is thus given by (cf. [22])

$|\dot{\psi}\rangle = -\mathrm{i}\hat{h}|\psi\rangle$. (13)

it is worth remarking that when $u_4^2 + u_5^2 < u_2^2$ we have $\mathrm{i}^2 = -1$, whereas when $u_2^2 < u_4^2 + u_5^2$ we have $\mathrm{i}^2 = +1$. in either case $\mathrm{i}\hat{h}$ is a skew-hermitian operator satisfying $(\mathrm{i}\hat{h})^\dagger = -\mathrm{i}\hat{h}$; thus $\mathrm{e}^{-\mathrm{i}\hat{h}t}$ formally generates a unitary time evolution that preserves the norm $\langle\psi|\psi\rangle = \bar{\psi}_1\psi_1 + \bar{\psi}_2\psi_2$, where $\bar{\psi}$ is the coquaternionic conjugate of $\psi$, so that $\langle\psi|$ represents the hermitian conjugate of $|\psi\rangle$. the conservation of the norm can be checked directly by use of the explicit form of the schrödinger equation in terms of the components of the state vector:

$\begin{pmatrix} \dot{\psi}_1 \\ \dot{\psi}_2 \end{pmatrix} = \begin{pmatrix} -(u_0 + u_3)\mathrm{i}\psi_1 - u_1\mathrm{i}\psi_2 - \nu\psi_2 \\ -(u_0 - u_3)\mathrm{i}\psi_2 - u_1\mathrm{i}\psi_1 + \nu\psi_1 \end{pmatrix}$. (14)

here we have assumed $u_4^2 + u_5^2 < u_2^2$, so that $\nu = \sqrt{u_2^2 - u_4^2 - u_5^2}$; if $u_2^2 < u_4^2 + u_5^2$, we have $\nu = \sqrt{u_4^2 + u_5^2 - u_2^2}$ and the sign of $\nu$ in (14) changes.

to investigate properties of the unitary dynamics generated by the hamiltonian (9), we shall derive the evolution equation satisfied by what one might call a 'coquaternionic bloch vector' $\vec{\sigma}$, whose components are given by

$\sigma_l = \frac{\langle\psi|\hat{\sigma}_l|\psi\rangle}{\langle\psi|\psi\rangle}, \quad l = 1, \ldots, 5$. (15)

by differentiating $\sigma_l$ in $t$ for each $l$ and using the dynamical equation (14), we deduce, after rearrangement of terms, the following set of generalised bloch equations:

$\frac{1}{2}\dot{\sigma}_1 = \nu\sigma_3 - \frac{u_3}{\nu}(u_2\sigma_2 + u_4\sigma_4 + u_5\sigma_5)$
$\frac{1}{2}\dot{\sigma}_2 = \frac{1}{\nu}(u_2 u_3\sigma_1 - u_1 u_2\sigma_3 + u_0 u_5\sigma_4 - u_0 u_4\sigma_5)$
$\frac{1}{2}\dot{\sigma}_3 = -\nu\sigma_1 + \frac{u_1}{\nu}(u_2\sigma_2 + u_4\sigma_4 + u_5\sigma_5)$ (16)
$\frac{1}{2}\dot{\sigma}_4 = \frac{1}{\nu}(-u_3 u_4\sigma_1 + u_0 u_5\sigma_2 + u_1 u_4\sigma_3 + u_0 u_2\sigma_5)$
$\frac{1}{2}\dot{\sigma}_5 = \frac{1}{\nu}(-u_3 u_5\sigma_1 - u_0 u_4\sigma_2 + u_1 u_5\sigma_3 - u_0 u_2\sigma_4)$,

where we have assumed $u_4^2 + u_5^2 < u_2^2$, so that $\nu = \sqrt{u_2^2 - u_4^2 - u_5^2}$. this is the region in the parameter space where the coquaternion appearing in the hamiltonian has a time-like imaginary part. note that these evolution equations preserve the condition

$\sigma_1^2 + \sigma_2^2 + \sigma_3^2 - \sigma_4^2 - \sigma_5^2 = 1$, (17)

which can be viewed as the defining equation for the hyperbolic state space of a coquaternionic two-level system.

let us now show how the dynamics can be reduced to three dimensions so as to provide a more intuitive understanding. for this purpose, we define the three reduced spin variables

$\sigma_x = \sigma_1, \quad \sigma_y = \frac{1}{\nu}(u_2\sigma_2 + u_4\sigma_4 + u_5\sigma_5), \quad \sigma_z = \sigma_3$. (18)

we can think of the space spanned by these reduced spin variables as representing the 'real part' of the state space (17). then a short calculation making use of (16) shows that

$\frac{1}{2}\dot{\sigma}_x = \nu\sigma_z - u_3\sigma_y$
$\frac{1}{2}\dot{\sigma}_y = u_3\sigma_x - u_1\sigma_z$ (19)
$\frac{1}{2}\dot{\sigma}_z = u_1\sigma_y - \nu\sigma_x$,

or, more concisely, $\dot{\vec{\sigma}} = 2\vec{b} \times \vec{\sigma}$, where $\vec{b} = (u_1, \nu, u_3)$. hence, although the state space of a coquaternionic spin-$\frac{1}{2}$ system is the hyperboloid (17), remarkably, in the region $u_4^2 + u_5^2 < u_2^2$ the reduced spin variables $\sigma_x, \sigma_y, \sigma_z$ defined by (18) obey the standard bloch equations (19). in particular, the reduced motions are confined to the two-sphere $s^2$:

$\sigma_x^2 + \sigma_y^2 + \sigma_z^2 = \mathrm{const.}$, (20)

where the value of the right side of (20) depends on the initial condition (but is positive and time independent).
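as a quick numerical illustration of (19) and the invariant (20), the sketch below integrates the reduced bloch equations with a fourth-order runge–kutta step; the parameter values are arbitrary choices of ours satisfying $u_4^2 + u_5^2 < u_2^2$, not values from the paper.

```python
import numpy as np

def bloch_rhs(sigma, b):
    # equation (19) in vector form: d(sigma)/dt = 2 b x sigma
    return 2.0 * np.cross(b, sigma)

def rk4_step(sigma, b, dt):
    k1 = bloch_rhs(sigma, b)
    k2 = bloch_rhs(sigma + 0.5 * dt * k1, b)
    k3 = bloch_rhs(sigma + 0.5 * dt * k2, b)
    k4 = bloch_rhs(sigma + dt * k3, b)
    return sigma + dt * (k1 + 2*k2 + 2*k3 + k4) / 6.0

# arbitrary parameters with u4^2 + u5^2 < u2^2 (time-like imaginary part)
u1, u2, u3, u4, u5 = 0.3, 1.0, 0.5, 0.4, 0.2
nu = np.sqrt(u2**2 - u4**2 - u5**2)
b = np.array([u1, nu, u3])

sigma = np.array([0.8, 0.0, 0.6])
for _ in range(2000):
    sigma = rk4_step(sigma, b, dt=1e-3)
# the squared radius (20) should stay at its initial value 1.0
print(np.dot(sigma, sigma))
```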
to put the matter differently, in the parameter region $u_4^2 + u_5^2 < u_2^2$ the dynamics on the reduced state space $s^2$ induced by a coquaternionic hermitian hamiltonian is indistinguishable from the conventional unitary dynamics generated by a complex hermitian hamiltonian. this corresponds to the situation in a pt-symmetric quantum theory whereby in some regions of the parameter space the hamiltonian is complex hermitian (e.g., a harmonic oscillator in the bender-boettcher hamiltonian family $h = p^2 + x^2(\mathrm{i}x)^\epsilon$ [7], or the six-parameter $2 \times 2$ matrix family in [25]). some examples of dynamical trajectories are sketched in figure 1.

fig. 1: (colour online) dynamical trajectories on the reduced state spaces. in the parameter region $u_2^2 > u_4^2 + u_5^2$ the reduced state space is just a two-sphere, upon which the dynamical equations (19) generate rabi oscillations (left figure). in the parameter region $u_2^2 < u_4^2 + u_5^2$ the reduced state space is a two-dimensional hyperboloid, and the dynamical equations (26) generate open trajectories on this hyperbolic state space if the energy eigenvalues are real (right figure). if the eigenvalues are complex, the open trajectories are rotated to form hyperbolic rabi oscillations

the evolution of the other dynamical variables $\sigma_2, \sigma_4, \sigma_5$ can be analysed as follows. recall that the dynamics (19) preserves the relation (20). thus, by subtracting (20) from (17) and rearranging the terms, we deduce that

$-(u_2\sigma_4 + u_4\sigma_2)^2 + (u_4\sigma_5 - u_5\sigma_4)^2 - (u_5\sigma_2 + u_2\sigma_5)^2 = \mathrm{const.}$ (21)

this shows that the evolution of the vector $(\sigma_2, \sigma_4, \sigma_5)$ is confined to a hyperbolic cylinder. it turns out that the time evolution of these 'hidden' dynamical variables $\sigma_2, \sigma_4, \sigma_5$ can also be represented in a form similar to the bloch equations if we transform the variables according to $\sigma_{y_1} = u_4\sigma_5 - u_5\sigma_4$, $\sigma_{y_2} = u_5\sigma_2 + u_2\sigma_5$, and $\sigma_{y_3} = u_2\sigma_4 + u_4\sigma_2$. in terms of these auxiliary variables we have

$\frac{1}{2}\dot{\sigma}_{y_1} = -\frac{u_0}{\nu}(u_5\sigma_{y_2} + u_4\sigma_{y_3})$
$\frac{1}{2}\dot{\sigma}_{y_2} = -\frac{u_0}{\nu}(u_2\sigma_{y_3} + u_5\sigma_{y_1})$ (22)
$\frac{1}{2}\dot{\sigma}_{y_3} = -\frac{u_0}{\nu}(u_4\sigma_{y_1} - u_2\sigma_{y_2})$.

it should be evident that these dynamics are confined to a hyperboloid:

$-\sigma_{y_1}^2 + \sigma_{y_2}^2 + \sigma_{y_3}^2 = \mathrm{const.}$ (23)

note, however, that when $u_0 = 0$ we have $\dot{\sigma}_{y_1} = \dot{\sigma}_{y_2} = \dot{\sigma}_{y_3} = 0$ from (22), while $\sigma_2, \sigma_4, \sigma_5$ are in general evolving in time. hence, in transforming the variables into $\sigma_{y_1}, \sigma_{y_2}, \sigma_{y_3}$, part of the information concerning the dynamics is lost. we see from (20) and (21) that on the 'imaginary part' of the state space the dynamics is endowed with hyperbolic characteristics, which nevertheless are not visible on the reduced state space, or the 'real part' of the state space $s^2$ spanned by $\sigma_x, \sigma_y, \sigma_z$.

when $u_2^2 = u_4^2 + u_5^2$, so that the imaginary part of the coquaternion appearing in the hamiltonian is null, a calculation shows that the reduced spin variables obey the following dynamical equations:

$\frac{1}{2}\dot{\sigma}_x = -u_3\sigma_y$
$\frac{1}{2}\dot{\sigma}_y = -u_3\sigma_x + u_1\sigma_z$ (24)
$\frac{1}{2}\dot{\sigma}_z = u_1\sigma_y$,

and preserve $\sigma_x^2 - \sigma_y^2 + \sigma_z^2$.

when $u_2^2 < u_4^2 + u_5^2$, so that the imaginary part of the coquaternion in the hamiltonian is space-like, the structure of the state space, as well as the dynamics, change, and they exhibit an interesting and nontrivial behaviour. the five-dimensional spin variables in this case evolve according to

$\frac{1}{2}\dot{\sigma}_1 = -\nu\sigma_3 - \frac{u_3}{\nu}(u_2\sigma_2 + u_4\sigma_4 + u_5\sigma_5)$
$\frac{1}{2}\dot{\sigma}_2 = \frac{1}{\nu}(u_2 u_3\sigma_1 - u_1 u_2\sigma_3 + u_0 u_5\sigma_4 - u_0 u_4\sigma_5)$
$\frac{1}{2}\dot{\sigma}_3 = \nu\sigma_1 + \frac{u_1}{\nu}(u_2\sigma_2 + u_4\sigma_4 + u_5\sigma_5)$ (25)
$\frac{1}{2}\dot{\sigma}_4 = \frac{1}{\nu}(-u_3 u_4\sigma_1 + u_0 u_5\sigma_2 + u_1 u_4\sigma_3 + u_0 u_2\sigma_5)$
$\frac{1}{2}\dot{\sigma}_5 = \frac{1}{\nu}(-u_3 u_5\sigma_1 - u_0 u_4\sigma_2 + u_1 u_5\sigma_3 - u_0 u_2\sigma_4)$,

where $\nu = \sqrt{u_4^2 + u_5^2 - u_2^2}$.
these evolution equations preserve the normalisation (17). however, in the region $u_2^2 < u_4^2 + u_5^2$ the reduced spin variables $\sigma_x, \sigma_y, \sigma_z$ defined by (18) no longer obey the standard bloch equations (19); instead, they satisfy

$\frac{1}{2}\dot{\sigma}_x = -\nu\sigma_z - u_3\sigma_y$
$\frac{1}{2}\dot{\sigma}_y = -u_3\sigma_x + u_1\sigma_z$ (26)
$\frac{1}{2}\dot{\sigma}_z = u_1\sigma_y + \nu\sigma_x$,

and preserve the relation

$\sigma_x^2 - \sigma_y^2 + \sigma_z^2 = \mathrm{const.}$ (27)

we thus see that at the level of the reduced spin variables in three dimensions, the state space changes from a two-sphere (20) to a hyperboloid (27) as the parameters $u_2, u_4, u_5$ appearing in the hamiltonian change. this transition corresponds to the transition from a complex hermitian hamiltonian into a pt-symmetric non-hermitian hamiltonian.

fig. 2: (colour online) conic sections and pt phase transition: changes of orbit structures. a projection of the orbits on the hyperboloid, for parameters just above the transition to complex energy eigenvalues, is shown on the left side. the orbits form circular sections. on the right side we plot orbits of hyperbolic rabi oscillations further into complex energy eigenvalues. the energy eigenvalues determine the angle between the axis of rotation and the axis of the hyperboloid. when eigenvalues are complex, the axis of rotation is within the hyperboloid, leading to closed orbits on the state space generated by circular sections. when the imaginary part of the coquaternion appearing in the hamiltonian is null, we have parabolic sections of the hyperboloid; whereas when the energy eigenvalues are real, the angle of the two axes is larger than π/4, and open orbits are generated by hyperbolic sections

since the energy eigenvalues can still be real even when $u_2^2 < u_4^2 + u_5^2$, we expect the dynamics to exhibit two distinct characteristics depending on whether the reality condition $u_1^2 + u_2^2 + u_3^2 > u_4^2 + u_5^2$ is satisfied. indeed, we find that on a hyperbolic state space, the orbits of the unitary dynamics associated with real energies are the ones that are open and run off to infinity. conversely, when the reality condition is violated, these open orbits are in effect wick rotated to generate closed orbits. these features can be identified by a closer inspection of the structure of the underlying state space, upon which the dynamical orbits lie. in particular, (26) shows that the dynamics generates a rotation around the axis $(u_1, \nu, u_3)$, whereas the state space (27) is a hyperboloid about the axis $(0, 1, 0)$. we have sketched in figure 2 dynamical orbits resulting from (26), indicating that there indeed is a transition from open to closed orbits as real eigenvalues turn into complex conjugate pairs. intuitively, one might have expected the opposite transition, since in a pt-symmetric model of a spin-$\frac{1}{2}$ system the renormalised bloch vectors on a spherical state space follow closed orbits when eigenvalues are real, whereas sinks and sources are created when eigenvalues become complex [15]. the apparent opposite behaviour seen here is presumably to do with the facts that the underlying state space is hyperbolic, not spherical, and that no renormalisation is performed here. in figure 2 we have sketched some dynamical trajectories when energy eigenvalues are complex. a projection of the dynamical orbits from the $\sigma_z$ axis (for the choice of parameters used in these plots) shows in which way the topology of the orbits is affected by the reality of the energy eigenvalues.
the evolutions of the other dynamical variables $\sigma_2, \sigma_4, \sigma_5$ are confined to the space characterised by the relation

$(u_2\sigma_4 + u_4\sigma_2)^2 - (u_4\sigma_5 - u_5\sigma_4)^2 + (u_5\sigma_2 + u_2\sigma_5)^2 = \mathrm{const.}$, (28)

instead of the relation (21) of the previous case. however, if we define, as before, the three auxiliary variables $\sigma_{y_1} = u_4\sigma_5 - u_5\sigma_4$, $\sigma_{y_2} = u_5\sigma_2 + u_2\sigma_5$, and $\sigma_{y_3} = u_2\sigma_4 + u_4\sigma_2$, then the dynamical equations satisfied by these variables are identical to those in (22), except, of course, that the definition of $\nu$ is different.

it is interesting to remark that when the imaginary part of the coquaternion appearing in the hamiltonian is space-like, the imaginary unit $\mathrm{i}$ has the characteristic of a 'double number' or a 'study number' introduced by clifford [26], that is, $\mathrm{i}^2 = 1$. quantum theories generated by such a number field (instead of the field of complex numbers) and other hyperbolic generalisations, as well as various issues that might arise from such generalisations, have been discussed by various authors (e.g., [27, 28]; see also [23] and references cited therein). the use of a coquaternionic hermitian hamiltonian thus captures the dynamical behaviours of different generalisations of quantum mechanics in a simple unified scheme.

acknowledgement

we thank the participants of the international conference on analytic and algebraic methods in physics vii, prague, 2011, for stimulating discussions. emg is supported by an imperial college junior research fellowship.

references

[1] bender, c. m., boettcher, s., meisinger, p. n.: pt-symmetric quantum mechanics, j. math. phys. 40, 1999, 2201.
[2] kaushal, r. s., korsch, h. j.: some remarks on complex hamiltonian systems, phys. lett. a 276, 2000, 47.
[3] mostafazadeh, a.: real description of classical hamiltonian dynamics generated by a complex potential, phys. lett. a 357, 2006, 177.
[4] bender, c. m., holm, d. d., hook, d. w.: complex trajectories of a simple pendulum, j. phys. a 40, 2007, f81.
[5] bender, c. m., hook, d. w., kooner, k. s.: classical particle in a complex elliptic pendulum, j. phys. a 43, 2010, 165 201.
[6] cavaglia, a., fring, a., bagchi, b.: pt-symmetry breaking in complex nonlinear wave equations and their deformations. j. phys. a 44, 2011, 325201, arxiv:1103.1832
[7] bender, c. m., boettcher, s.: real spectra in non-hermitian hamiltonians having pt symmetry, phys. rev. lett. 80, 1998, 5 243.
[8] lévai, g., znojil, m.: systematic search for pt-symmetric potentials with real energy spectra. j. phys. a 33, 2000, 7 165.
[9] bender, c. m., brody, d. c., jones, h. f.: complex extension of quantum mechanics, phys. rev. lett. 89, 2002, 270 401.
[10] mostafazadeh, a.: pseudo-hermiticity versus pt-symmetry iii: equivalence of pseudo-hermiticity and the presence of antilinear symmetries, j. math. phys. 43, 2002, 3 944.
[11] okołowicz, j., płoszajczak, m., rotter, i.: dynamics of quantum systems embedded in a continuum, phys. rep. 374, 2003, 271.
[12] jones, h. f., rivers, r. j.: which green functions does the path integral for quasi-hermitian hamiltonians represent? phys. lett. a 373, 2009, 3 304.
[13] dorey, p., dunning, c., lishman, a., tateo, r.: pt symmetry breaking and exceptional points for a class of inhomogeneous complex potentials, j. phys. a 42, 2009, 465 302.
[14] witten, e.: a new look at the path integral of quantum mechanics. in surveys in differential geometry xv. mrowka, t., yau, s.-t. (eds.) boston: international press, 2011.
[15] graefe, e. m., korsch, h. j., niederle, a.
e.: quantum-classical correspondence for a non-hermitian bose-hubbard dimer, phys. rev. a 82, 2010, 013 629.
[16] günther, u., kuzhel, s.: pt-symmetry, cartan decompositions, lie triple systems and krein space-related clifford algebras. j. phys. a 43, 2010, 39 002.
[17] moiseyev, n.: non-hermitian quantum mechanics, cambridge: cambridge university press, 2011.
[18] nesterov, a. i.: non-hermitian quantum systems and time-optimal quantum evolution. sigma, 5, 2009, 069.
[19] brody, d. c., graefe, e. m.: on complexified mechanics and coquaternions, j. phys. a 44, 2011, 072 001.
[20] finkelstein, d., jauch, j. m., schiminovich, s., speiser, d.: foundations of quaternion quantum mechanics, j. math. phys. 3, 1962, 207.
[21] adler, s. l.: quaternionic quantum mechanics and quantum fields. oxford: oxford university press, 1995.
[22] brody, d. c., graefe, e. m.: six-dimensional space-time from quaternionic quantum mechanics. 2011, arxiv:1105.3604.
[23] kisil, v. v.: erlangen programme at large 3.1: hypercomplex representations of the heisenberg group and mechanics. 2010, arxiv:1005.5057v2
[24] cockle, j.: on systems of algebra involving more than one imaginary; and on equations of the fifth degree, phil. magazine 35, 1849, 434.
[25] wang, q., chia, s., zhang, j.: pt symmetry as a generalisation of hermiticity. j. phys. a 43, 2010, 295 301.
[26] clifford, w. k.: applications of grassmann's extensive algebra. am. j. math. 1, 1878, 350.
[27] hudson, r. l.: generalised translation-invariant mechanics. d. phil. thesis, bodleian library, oxford, 1966.
[28] kocik, j.: duplex numbers, diffusion systems, and generalized quantum mechanics, int. j. theo. phys. 38, 1999, 2221.

dorje c. brody
mathematical sciences
brunel university
uxbridge ub8 3ph, uk

eva-maria graefe
department of mathematics
imperial college london
london sw7 2az, uk

acta polytechnica vol. 49 no. 2–3/2009

demand modelling in telecommunications: comparison of standard statistical methods and approaches based upon artificial intelligence methods including neural networks

m. chvalina

abstract

this article analyses the existing possibilities for using standard statistical methods and artificial intelligence methods for a short-term forecast and simulation of demand in the field of telecommunications. the most widespread methods are based on time series analysis. nowadays, approaches based on artificial intelligence methods, including neural networks, are booming. separate approaches will be used in the study of demand modelling in telecommunications, and the results of these models will be compared with actual guaranteed values. then we will examine the quality of neural network models.

keywords: demand, telecommunications, standard statistical methods, box-jenkins methodology, arima, artificial intelligence methods, neural network.

1 introduction

demand can be defined as the relation between price and the quantity of goods that buyers are willing to purchase. this correlation is displayed in relation to the global market by the sold product quantity at one time-point. if we focus on the telecommunication services sector, we can note the development of the sale of cell phones and internet extensions (adsl, isdn, gprs, wi-fi etc.). factors affecting the development of demand are, for example, technology, price, and considerations from the fields of psychology, sociology and economics. generally, demand can be considered as a time series for which a demand model can be defined and development trends can be predicted.

2 construction of the demand model

the demand model can be constructed using standard statistical methods, including decomposition of time series and the box-jenkins methodology, or by applying artificial intelligence methods, including for example neural networks.

2.1 decomposition of time series

the series $\{y_t,\ t = 1, \ldots, n\}$ is gradually decomposed into several components: trend, circular component, seasonal component and residual component (unsystematic component). this method is based on work with the systematic components of the time series. features of time series behaviour can be better observed in the separate components than in the undecomposed original time series. in this research study, from the field of standard statistical methods, exponential smoothing will be used for demand modelling.

exponential smoothing

the above defined time series will be written as $\{y_t,\ t = 1, \ldots, n\}$.
simple exponential smoothing is described in the recurrent form

$\hat{y}_t = \alpha y_t + (1 - \alpha)\hat{y}_{t-1}$,

where $\hat{y}_t$ is the exponential average at time $t$, $\hat{y}_{t-1}$ is the exponential average at time $t - 1$, and the value $\alpha$ is the smoothing constant from the interval $\alpha \in \langle 0; 1 \rangle$. the exponential average can be expressed on the basis of the recurrent form as:

$\hat{y}_t = \alpha y_t + (1 - \alpha)\hat{y}_{t-1} = \alpha y_t + (1 - \alpha)\left[\alpha y_{t-1} + (1 - \alpha)\hat{y}_{t-2}\right] = \alpha y_t + \alpha(1 - \alpha) y_{t-1} + (1 - \alpha)^2\left[\alpha y_{t-2} + (1 - \alpha)\hat{y}_{t-3}\right] = \ldots = \alpha \sum_{i=0}^{t-1} (1 - \alpha)^i y_{t-i} + (1 - \alpha)^t \hat{y}_0$.

brown's simple exponential smoothing

the time series $y_t$ is modelled as a stationary process of the form $y_t = \beta_0 + \varepsilon_t$, where $\beta_0$ is the mean value of the process and $\varepsilon_t$ are random values with the features of white noise. after applying exponential smoothing to the time series $y_t = \beta_0 + \varepsilon_t$ we obtain the relation

$\hat{y}_t = \alpha \sum_{i=0}^{\infty} (1 - \alpha)^i y_{t-i} = \beta_0 + \alpha \sum_{i=0}^{\infty} (1 - \alpha)^i \varepsilon_{t-i}$,

because $\alpha \sum_{i=0}^{\infty} (1 - \alpha)^i = 1$. for the mean value and dispersion this yields

$e(\hat{y}_t) = e(y_t) = \beta_0$, $\quad d(\hat{y}_t) = \frac{\alpha}{2 - \alpha}\,\sigma^2$.

brown's linear (double) exponential smoothing

alternatively, we can make a double application of the simple exponential smoothing method on the time series $y_t$, expressed in the form

$\hat{y}_t^{(2)} = \alpha \hat{y}_t^{(1)} + (1 - \alpha)\hat{y}_{t-1}^{(2)}$,

where $\hat{y}_t^{(1)}$ is the simply smoothed series.
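a minimal python sketch of the simple and double (brown's) exponential smoothing defined above follows; the sample series and the choice of $\alpha$ are illustrative only and do not come from the study's data.

```python
def exp_smooth(y, alpha, s0=None):
    # simple exponential smoothing: s_t = alpha*y_t + (1 - alpha)*s_{t-1}
    s = [y[0] if s0 is None else s0]
    for value in y[1:]:
        s.append(alpha * value + (1.0 - alpha) * s[-1])
    return s

def double_exp_smooth(y, alpha):
    # brown's double smoothing: smooth the smoothed series once more
    s1 = exp_smooth(y, alpha)
    s2 = exp_smooth(s1, alpha)
    return s1, s2

# illustrative demand-like series (arbitrary numbers)
y = [3.0, 3.4, 3.9, 4.1, 4.8, 5.5, 6.1, 6.4]
s1, s2 = double_exp_smooth(y, alpha=0.9)
print([round(v, 2) for v in s2])
```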
2.3 artificial intelligence methods

neural networks

the benefit of an artificial neural network lies in its ability to implement complex non-linear functions. neural systems execute a large number of operations in parallel and work without an explicit algorithm. their activity is based on a learning process, during which the neural network gradually adapts its calculation. in the course of the learning phase we do not have to deal with the problem of selecting the right function, because the neural network can make do with training examples alone. this is the main difference in comparison with the traditional approach (e.g. a comparison of traditional non-linear models and a back-propagation network). the function of a neurone (fig. 1) can be written as

$y = S\left(\sum_{i=1}^{n} w_i x_i - \vartheta\right),$

where $y$ is the output (neurone activity), $S$ is a transmission function (jump, linear or non-linear), $x_i$ is a neurone input ($n$ inputs in total), $w_i$ is the synaptic weight of input $i$, and $\vartheta$ describes a trigger level.

fig. 1: model of a neurone

the principle of back-propagation network learning

let us consider a neural network with $L$ layers, $l = 1, \ldots, L$, and let $v_i^l$ denote the output of the $i$-th neurone in the $l$-th layer; $v_i^0$ means $x_i$, i.e. the $i$-th input. the symbol $w_{ij}^l$ denotes the weight of the connection from $v_j^{l-1}$ to $v_i^l$. the algorithm can then be written in separate stages as follows:

1. initialize the weights with small random numbers.
2. insert sample $\vec{x}^p$ into the network input (layer $l = 0$), i.e. $v_k^0 = x_k^p$.
3. propagate the signal through the network: $v_i^l = g(h_i^l) = g\left(\sum_j w_{ij}^l v_j^{l-1}\right)$.
4. calculate delta for the output layer: $\delta_i^L = g'(h_i^L)\left(y_i^p - v_i^L\right)$.
5. calculate delta for the previous layers by error back-propagation: $\delta_i^{l-1} = g'(h_i^{l-1}) \sum_j w_{ji}^l \delta_j^l$.
6. change the weights according to the formula $\Delta w_{ij}^l = \eta\,\delta_i^l v_j^{l-1}$, $w_{ij}^{\mathrm{new}} = w_{ij}^{\mathrm{old}} + \Delta w_{ij}^l$.
7. if all samples have been submitted to the network, continue with phase 8; otherwise go back to phase 2.
8. if the network error compared to the selected criterion value is smaller, or the maximum number of steps has been exhausted, the learning process is completed; otherwise go back to phase 2.
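the eight phases can be condensed into a small numpy sketch for a network with one hidden layer. the sigmoid transmission function, the learning rate and the stopping threshold are illustrative assumptions of ours, not values from the study.

```python
import numpy as np

rng = np.random.default_rng(0)

def g(h):                     # sigmoid transmission function
    return 1.0 / (1.0 + np.exp(-h))

def g_prime(h):
    s = g(h)
    return s * (1.0 - s)

def train(samples, targets, hidden=5, eta=0.5, max_steps=10_000, eps=1e-3):
    n_in, n_out = samples.shape[1], targets.shape[1]
    w1 = rng.uniform(-0.5, 0.5, (hidden, n_in))   # phase 1: random weights
    w2 = rng.uniform(-0.5, 0.5, (n_out, hidden))
    for _ in range(max_steps):
        error = 0.0
        for x, y in zip(samples, targets):        # phase 2: present sample
            h1 = w1 @ x; v1 = g(h1)               # phase 3: forward pass
            h2 = w2 @ v1; v2 = g(h2)
            d2 = g_prime(h2) * (y - v2)           # phase 4: output delta
            d1 = g_prime(h1) * (w2.T @ d2)        # phase 5: back-propagation
            w2 += eta * np.outer(d2, v1)          # phase 6: weight change
            w1 += eta * np.outer(d1, x)
            error += np.sum((y - v2) ** 2)
        if error < eps:                           # phase 8: stopping criterion
            break
    return w1, w2

# usage example: learning the xor function
x = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([[0], [1], [1], [0]], dtype=float)
w1, w2 = train(x, t)
```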
2.4 choice of a relevant model

an appropriate model can be determined on the basis of: a) the graph of the time series, or its absolute or relative characteristics; b) interpolative criteria (the standard deviation of the residues, the coefficient of determination, the coefficient of autocorrelation of the residues, tests of parameters); c) extrapolative criteria (average characteristics of "ex post" forecast errors, forecast graphs).

average characteristics of residues

the mean squared error (dispersion):

$\mathrm{MSE} = \frac{1}{n}\sum_{t=1}^{n}\left(y_t - \hat{y}_t\right)^2 = \frac{1}{n}\sum_{t=1}^{n} a_t^2.$

the root mean squared error:

$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(y_t - \hat{y}_t\right)^2} = \sqrt{\frac{1}{n}\sum_{t=1}^{n} a_t^2}.$

the mean absolute error:

$\mathrm{MAE} = \frac{1}{n}\sum_{t=1}^{n}\left|y_t - \hat{y}_t\right| = \frac{1}{n}\sum_{t=1}^{n}\left|a_t\right|.$

the lower the values of these characteristics, the better the chosen model.
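the three characteristics are computed directly from the residues $a_t = y_t - \hat{y}_t$; a straightforward sketch:

```python
import numpy as np

def residues(y, y_hat):
    return np.asarray(y, dtype=float) - np.asarray(y_hat, dtype=float)

def mse(y, y_hat):
    return np.mean(residues(y, y_hat) ** 2)

def rmse(y, y_hat):
    return np.sqrt(mse(y, y_hat))

def mae(y, y_hat):
    return np.mean(np.abs(residues(y, y_hat)))
```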
3 demand modelling

the above-mentioned methods will be used for demand modelling of a narrowband mobile connection (gprs, hscsd). the gprs demand model (i.e. the number of households using a narrowband mobile internet connection) is based on the values from the czech statistical office (between 2003 and 2009) and on an estimate made by an expert.

fig. 2: arima demand modelling of gprs
fig. 3: residual autocorrelation for gprs (arima)
fig. 4: brown's linear exponential smoothing of gprs, with alpha = 0.9428
fig. 5: residual autocorrelation for gprs (brown's smoothing)
fig. 6: neural network (back-propagation) demand modelling
fig. 7: residual values for gprs

4 conclusions

in this paper, we have described some methods for demand modelling in telecommunications. within the research project on demand modelling in telecommunications, we obtained the following values for the average characteristics of the residues.

table 1: average characteristics of residues

        neural network   arima (1,1,1)   brown's exp. smoothing
rmse    0.96             1.31            1.53
mae     0.92             0.88            1.02

on the basis of the rmse and mae average characteristics of the residues, the demand model based on neural networks can be considered the best. however, a different result arises from an evaluation of prediction quality, because the best prediction value was obtained from the demand model based on brown's linear exponential smoothing (fig. 8).

fig. 8: comparison of the gprs internet connection used by households in the 2nd quarter of 2008 (values published on february 10th, 2009 by the czech statistical office) and the results predicted by the demand models for gprs

the prediction based on neural networks was also used with other demand courses in the field of telecommunications, for which more accurate results were obtained in terms of prediction quality. future research will focus on improving the prediction results and on a demand model based on the above-mentioned methods.

martin chvalina
e-mail: martin.chvalina@email.cz
department of economics, management and humanities
czech technical university in prague, faculty of electrical engineering
technická 2, 166 27 praha, czech republic
acta polytechnica vol. 50 no. 3/2010

the science policy of the joint institute for nuclear research

a. sissakian

dear colleagues, it is a great honor for me to participate in this conference, devoted to the 70th anniversary of the birth of the prominent czech scientist professor jiří niederle. following the title of my talk, i'll give you an introductory presentation about jinr's strategic development plans, starting with a very brief historical background. the joint institute for nuclear research (jinr) is located in the moscow region, in dubna, on the bank of the volga river (120 km to the north of moscow). the institute was established in march 1956 through a convention signed by the plenipotentiaries of the governments of the jinr member states. the institute was created in order to unify the intellectual and material potential of the member states for the study of the fundamental properties of matter. until the early 1990s, dubna was a centre that unified the efforts of leading research groups in the nuclear sciences from the socialist countries and the soviet union. a new development stage in the history of the institute started in 1992, when 18 independent states, including 9 republics of the former ussr, became its member states.
in addition, agreements at governmental level were signed for cooperation with 6 countries – germany, hungary, italy, the republic of south africa, serbia and, recently, egypt. it is well known that the future is based on the past, and on traditions. jinr has a huge foundation based on three "pillars":

• great experience in nuclear physics research and the world-wide recognized traditions of its scientific schools
• a large and unique park of basic facilities for fundamental and applied research
• the status of an international intergovernmental organization

over the more than 50 years of jinr's existence, first-class theoretical and experimental research programmes implemented at jinr have led to a significant scientific output in fundamental nuclear science. more than 40 discoveries in nuclear physics, particle physics and condensed matter physics have been made in the jinr laboratories. in recognition of the achievements of jinr's staff of researchers, in 1997 the international committee for pure and applied chemistry awarded the name "dubnium" to element 105 of the periodic table of elements. among the latest bright results, i would like to mention the pioneering investigations of the chemical properties of the superheavy elements. the main governing body of the institute is the committee of plenipotentiaries of the member states' governments. the scientific policy is established and co-ordinated by the scientific council, whose members are prominent and well-known scientists from the member states as well as from cern, germany, greece, france, italy, china, india and other countries. three programme advisory committees – for particle physics, nuclear physics and condensed matter physics – hold their meetings twice a year. there are seven laboratories at jinr, each being comparable to a large research institution in terms of the scope of scientific activities. we also have a university centre, as well as several functional subdivisions and workshops.

present scientific policy

the triad – fundamental research, innovation developments and the educational programme – forms the strategic policy line of the institute's development. i would like to note that, along with the current 7-year programme of the institute's development for the years 2003–2009 (we are now developing the next 7-year programme, for the years 2010–2016¹), we have elaborated the programme of jinr's strategic development (the so-called road map) for the next 10–12 years. this programme takes into account both the world tendencies in scientific developments and the interests of our member states. fundamental studies will remain the general direction of jinr's development. at the same time, special attention will be given to innovation activities, in particular to radiation medicine, nanotechnologies and others. the innovative projects will be developed in close cooperation with the public-private partnership in the framework of the special economic zone in dubna. the role of the educational programme will be further enhanced. the road map has determined three major research directions at jinr: high energy physics, nuclear physics and condensed matter physics. as i have already mentioned, the activities in these directions are implemented at our seven laboratories.

¹ approved in september 2009 by the jinr scientific council, and in november 2009 by the jinr committee of plenipotentiaries of the governments of the member states.
such disciplines as theory, networking and computing, as well as physics instrumentation and methods and the training of the young generation, are very important supporting activities for our basic research directions. one of the key points of the road map is that jinr should develop its role as a world leader in certain research domains. therefore, it is necessary to formulate the priority trends in the development of the institute. to achieve this ambitious goal, we understand clearly that the whole experimental infrastructure of the institute, and first of all our basic facilities, must be completely upgraded. such a programme will allow us to become a competitive scientific laboratory, attractive, first of all, for the member states and for the countries with which we have tight and long-standing cooperation activities, e.g. czechia, hungary, germany, france, italy and other partners. while updating the road map we clearly see that the research niches for jinr, offered by our home facilities, are the following: heavy ion physics at high energies and at low energies, and condensed matter physics using nuclear-physics methods, including radiobiology studies. now briefly about our large-scale facilities and our ambitious research projects. in heavy ion physics at high energies, the new flagship of the institute and the highest priority project is nica/mpd (nuclotron-based ion collider facility and multipurpose detector), with an energy of 4–11 gev in the nucleon-nucleon centre of mass. this project is aimed at the study of hot and dense strongly interacting qcd matter and the search for a possible manifestation of mixed phase formation in heavy ion collisions, as well as studies on spin physics. this energy range is of great interest and importance for understanding the evolution of the early universe in the first milliseconds after the big bang, and the formation of neutron stars as well. to achieve these physics goals on nica, we have to accomplish several large technical tasks. first of all, upgrading of the nuclotron itself, achieving its project parameters by 2010; this will be the first step towards the nica collider facility. the next two steps deal with the design and construction of the heavy ion collider itself and of the multipurpose detector. the key advantage of the nica project is the sufficiently high luminosity in the energy range of 4–11 gev in the nucleon-nucleon centre of mass (~10^27 cm^-2 s^-1 at 9 gev). the nica complex will comprise several accelerators and beam transfer channels. the design of the collider suggests two intersection points, where the detectors are to be located. one of them, the multipurpose detector (mpd), is dedicated to mixed phase problem studies. one of the options for another detector is a spin physics detector (spd).

table 1: nica general parameters

ring circumference                               251.52 m
bρ max                                           45.0 t·m
ion kinetic energy (au^79+)                      1.0–4.56 gev/u
dipole field (max)                               4.0 t
free space at interaction point (for detector)   9 m
beam crossing angle at interaction point         0
vacuum                                           10^-11 torr
luminosity per one interaction point             0.75–11 · 10^26 cm^-2 · s^-1

the central goal of the heavy ion experimental studies is to explore the phase diagram of nuclear matter. it is expected that it has at least two different phases, a hadronic and a quark-gluon phase, and that they are separated by a first order phase transition at high baryon densities. in the plane of temperature and baryon density, the first order phase transition results in a finite-width band of the coexistence region of these two phases, the so-called mixed phase.
we drew attention to this remarkable fact in the middle of 2005. exploration of the mixed phase region is one of the goals of the future nica/mpd and fair/cbm (compressed baryonic matter) experiments, together with the low energy scan programme at rhic and the na61 experiment at the sps. this region has not been adequately explored up to now. among our external research activities i would like to mention two experiments with jinr's most active participation. the first of these is the cern na48/2 experiment. very large statistics of high-performance data on charged kaon decays have been accumulated by this experiment at the cern sps. two major parameters of the scattering lengths – a0 and a2 – have been extracted with an unprecedented experimental precision of a few percent, allowing accurate tests of chiral perturbation theory predictions. this result was highlighted at cern as an experimental achievement in 2008. the other experiment deals with our activities at fermilab. jinr physicists, together with their american colleagues, performed in 2008 a physical analysis of the d0 experiment at the tevatron, which led to the first observation of the ωb baryon. this discovery was ranked among the ten most significant achievements in physics in 2008 by the american physical society, and we are again proud that jinr made a significant contribution to this analysis. jinr successfully completed all its obligations to the three lhc projects: atlas, alice and cms. the total contribution of jinr to these projects, including the lhc damper, amounted to approximately 25 million swiss francs. concerning jinr-cern collaboration, i'd like to note that, in accordance with our recent negotiations with the new cern director-general r. heuer, our collaboration will be of high priority for both centres. we are now preparing a renewed version of the general partnership agreement, to be signed at the beginning of 2010². as for nuclear physics (low and intermediate energy heavy ion physics), i'd like to mention that we are proud that in the last decade jinr has become one of the world's leading scientific centres in the synthesis of superheavy elements and in physical and chemical investigations of these nuclei. other priority topics in this domain are the structure and properties of neutron-rich light exotic nuclei, accelerator technology and broad applied research, including nanotechnology. over the past 7-year period, in a number of experiments performed with the use of the intense beam of 48ca and actinide targets, 5 new elements and 34 new superheavy isotopes have been synthesized. these experiments have been carried out in wide collaboration with scientific centres in russia and the usa. the chemical properties of elements 112–115 were established for the first time. in particular, it was shown that elements 112 and 114 are more volatile than their lighter homologues, due to relativistic effects in the electronic structure. so, a new branch of science is presently developing at jinr – the relativistic chemistry of superheavy elements. neutrino physics is a traditional research topic for our institute. one of the most fundamental studies in neutrino physics is the search for neutrinoless double beta decay. very impressive results have been obtained by the nemo-3 collaboration. at present, the limit for the neutrino mass established by this experiment is of the order of 1 ev, and it is planned in 2009–2010 to go down by a factor of two in this limit.
in 2011 the installation of the supernemo detector will start. this new detector will provide sensitivity for the search for neutrinoless double beta decay below 0.1 ev in the neutrino mass. as for condensed matter physics, i'd like to mention that jinr's basic facility here is the ibr-2 fast pulsed reactor. it is included in the european 20-year strategic research programme with neutrons. the reactor is now under an upgrade programme, and in 2010 the practically new ibr-2m reactor will start up its operation. the main priorities in condensed matter physics in the next 7-year period deal with reliable operation at design parameters of the new ibr-2m reactor, the establishment of a complex of modern neutron spectrometers, the realization of a full-scale cryogenic complex, as well as wide-ranging innovations including nanoscience. special attention will be given to further development of the spectrometer complex of the ibr-2m reactor. the first priority projects include the construction of the new dn-6 spectrometer for studies of microsamples, the grains multifunctional reflectometer, and modernization of the spectrometer complex for geophysical research. these facilities will significantly expand the area of world-level research at the ibr-2m reactor. we expect a broad international user policy on this machine, as in the past, or even broader. the radiobiology studies at jinr are aimed at obtaining fundamental results which will be principally important to general and cosmic radiobiology. starting from 2009, our laboratory of radiobiology has conducted its research programme under the scientific and methodological supervision of the section of biological sciences of the russian academy of sciences. it is important to take into consideration that in this field of our activities we have to employ the huge scientific potential of the joint institute for the progress of this trend at the interface of biology, medicine and physics. obviously, we have to expect innovation breakthroughs exactly in this area during the next evolution cycle of civilization. the educational programme plays an important role in jinr's activities. in 1991 we established the jinr university centre and in 1994, together with the russian academy of natural sciences, the dubna international university. this provides good opportunities for young scientists to get their postgraduate training, obtain phd degrees and receive advanced professional training using jinr's experimental facilities. the dubna international advanced school on theoretical physics came into operation in 2003. a few words about the innovation activities in our institute. in 2005 the russian government adopted a resolution on the establishment of six special economic zones (sez) in the territory of the russian federation. the status of a special economic zone of the technological and innovative type was granted to four towns: zelenograd, dubna, st. petersburg and tomsk. the main specializations of the dubna sez are related to nuclear physics and information technologies, including radiation medicine, safety systems and nanotechnologies. on 18 april 2008, president of the russian federation dmitry medvedev chaired a session of the presidium of the state council in dubna which discussed the issue "development of the national innovation system in the russian federation".

² the agreement was signed on 28 february 2010.
during his visit, president d. medvedev highly appreciated the results of jinr's basic research and underlined the role of science in the innovation process. the president noted the importance of the future realization of two large-scale projects proposed by jinr: the establishment in dubna of a centre for radiation medicine and of an international innovation centre for nanotechnology. i would like to note that over the decades jinr has played the role of a cluster centre for its member states, giving them an efficient way to be integrated into the european scientific community (for example cern, infn and others). we hope that the international innovation centre for nanotechnology, which we are now creating, will also serve the integration of european innovation activities. in particular, you know that in 2008 the european institute of innovation and technology was established by the european union, with its general headquarters in budapest, hungary. we intend to include our innovation centre in its knowledge and innovation communities (kic) as a first step in the integration process. i believe that our cooperation with the member states will further serve our mutual interests and, undoubtedly, will strengthen the scientific ties in both basic and applied research and will integrate our joint research into european science. concluding, let me say a few words about cooperation activities between jinr and czechia. the czech republic has always been one of the most active institute member states. it should be mentioned that czech scientists have served as jinr vice-directors, deputy directors of the jinr laboratories, and heads of departments and groups: v. votruba, v. petrzilka, c. simane, i. zvara and many others. taking the opportunity, i would like to cordially congratulate prof. cestmir simane on his forthcoming 90th birthday, and wish him good health and happiness. today jinr conducts cooperation with 9 czech scientific centres, 3 universities and the company vacuum-prague on 34 scientific topics in all jinr scientific trends. jinr's main partners in the czech republic are the nuclear physics institute in rez, charles university, the czech technical university, the institute of physics in prague, the company vacuum prague, and many others. a ceremonial meeting dedicated to the 50th anniversary of jinr was held on 24 january in prague, in the building of charles university, one of the oldest universities in europe. among the honorary guests present at the meeting were prime minister of the czech republic j. paroubek, plenipotentiary of the russian federation to czechia a. fedotov, as well as representatives of czech scientific, governmental and social organizations. prime minister j. paroubek congratulated the audience on the 50th anniversary of jinr and noted the great role of the dubna international physics centre in founding and developing nuclear physics in all jinr member states, including czechoslovakia and czechia. professor j. niederle has contributed greatly to our cooperation. an outstanding theoretician, he has for many years been promoting the development of our collaboration in theoretical physics. as scientific secretary of the presidium of the czech academy of sciences, he contributed much to the solution of strategic issues in our joint activities. please accept once more my heartiest congratulations on the occasion of professor jiří niederle's anniversary.
alexey sissakian
academician, director of jinr

acta polytechnica 53(1):54–57, 2013
© czech technical university in prague, 2013, available online at http://ctn.cvut.cz/ap/

study of silicon photomultipliers for the grips calorimeter module

alexei ulyanov*, lorraine hanlon, sheila mcbreen, suzanne foley
school of physics, university college dublin, ireland
* corresponding author: alexey.uliyanov@ucd.ie

abstract. grips is a proposed gamma-ray (200 kev to 80 mev) astronomy mission, which incorporates a pair-creation and compton scattering telescope, along with x-ray and infrared telescopes. it will carry out a sensitive all-sky scanning survey, investigating phenomena such as gamma-ray bursts, blazars and core collapse supernovae. the main telescope is composed of a si strip detector surrounded by a calorimeter with a fast scintillator material. we present the initial results of a study which considers the potential use of silicon photomultipliers in conjunction with the scintillator in the grips calorimeter module.

keywords: silicon photomultiplier; grips; gamma-ray; future.

1. introduction

grips (gamma-ray imaging, polarimetry and spectroscopy) [1, 2] is a proposed γ-ray astronomy mission, which will perform a sensitive all-sky scanning survey from 200 kev to 80 mev. with respect to previous missions (e.g. integral, comptel, egret), the sensitivity in this energy range will be improved by at least an order of magnitude. the grips mission will investigate γ-ray bursts and blazars, the mechanisms behind supernova explosions, nucleosynthesis and spallation, the origin of positrons in our galaxy, and the nature of radiation processes and particle acceleration in extreme cosmic sources, including pulsars and magnetars. as its primary instrument, grips will have a combined compton scattering and pair creation telescope called the gamma-ray monitor (grm). in addition, a separate satellite will carry two auxiliary telescopes, an x-ray monitor (xrm) and an infrared telescope (irt), which will help to accurately localise grb counterparts and measure their redshifts. similar to previous compton and pair creation telescopes, the grm design envisages two separate detectors: a silicon tracker, in which the initial compton scattering or pair conversion takes place, and a calorimeter, which absorbs and measures the energy of the secondaries. the calorimeter employs labr3 scintillator crystals to achieve an improved energy resolution (about 3 % at 1 mev) and faster response times compared to traditional scintillator materials such as nai, csi or bgo. for optimum angular resolution, the grm requires a finely segmented calorimeter with ~10^5 readout channels. this precludes the use of bulky, fragile, high-voltage photomultiplier tubes, traditionally employed for the readout of scintillator crystals. pin diodes are not suitable because they have no internal gain and produce signals with a low signal-to-noise ratio, resulting in poor energy resolution. the grm relies on the development of advanced semiconductor photodetectors such as silicon drift detectors (sdd). recent technological advances in the development of silicon photomultipliers (sipm) make them a promising option for the grm calorimeter readout. these detectors combine the high gain of traditional photomultipliers with low-voltage operation, robustness, low mass and a compact design typical for semiconductor devices.

figure 1: a selection of the silicon photomultipliers produced by sensl.
the purpose of this study is to evaluate the possibility of using novel sipms for the calorimeter readout in the grips mission and to optimise the design of a calorimeter module. this work is a collaboration between the space science group in university college dublin and sensl technologies ltd (http://www.sensl.com), an irish company that has developed the novel sipms. another irish company, acra control ltd., is involved in the design of a suitable payload interface unit for the instrument. we present here the results of an initial study, which has focused on the technical requirements of the grm calorimeter.

2. sipm developments by sensl

sensl have designed sipms to meet the needs of a variety of applications, from analytical instruments, hazard and threat detection to nuclear medicine and process monitoring. the current product range includes sensors of three sizes (1 mm, 3 mm and 6 mm) available in various packages (fig. 1), with 19096 microcells per 6 mm detector. in addition, a 4 × 4 array of sipm pixels is available. the array permits close packing on all four sides, allowing for a detection area that can be as large or small as required by the specific application. it is the first commercially available detector of its kind. larger 8 × 8 custom sipm arrays have also been produced.

figure 2: left: geometry and size of the grm detector used in the simulations. right: expected size and configuration of the grm and related electronics on a generic satellite bus.

3. grm model and simulation environment

we have carried out monte-carlo simulations to understand how the properties of different detector components influence the performance of the whole grm detector. for the simulations we used the megalib software package [3, 4]. the package includes tools for detector description and for the simulation of particle interactions with matter using the geant4 toolkit [5]. it also provides dedicated tools for event reconstruction and high-level analysis for compton and pair telescopes. the simulations performed in the scope of this project used the grm baseline model from [2], including the grm instrument and the support tube with the electronics box, but not the spacecraft (fig. 2). the tracker consists of 64 layers, each containing a mosaic of 8 × 8 double-sided si strip detectors of area 10 × 10 cm², each of those detectors having 96 strips per side. the layers are spaced at a distance of 5 mm. the calorimeter is made of labr3 prisms with a 5 × 5 mm² cross-section. the upper half of the calorimeter side walls features scintillators of 2 cm length, and the lower half has 4 cm thick walls. the side wall crystals are read out by one photodetector each. the bottom calorimeter is 8 cm thick and is read out at both ends of the crystals to achieve a depth resolution of the energy deposits. the whole detector is surrounded by a plastic scintillator counter that acts as an anticoincidence shield against charged particles. the background for the grips mission (equatorial low-earth orbit) is dominated by diffuse cosmic photons and albedo photons coming from the earth's atmosphere; these were the only background components included in the simulations.
to carry out cpu-intensive simulations we use the pascal ibm hpc cluster at ucd, which is a 720-core supercomputer running the red hat enterprise linux 5.2 operating system. the cluster has 28 nodes allocated to the school of physics; each node has 16 gb of ram and two xeon 3 ghz processors, yielding eight cores per node.

4. initial results

the calorimeter energy resolution is simulated in megalib by adding gaussian noise to the ideal energy deposits produced by particles in the sensitive volumes in the geant simulations. the influence of the calorimeter energy resolution on the overall grm sensitivity was investigated by multiplying the baseline energy resolution (a function of energy) by a scale factor. the resulting effect on the grm sensitivity is shown in figs. 3 and 4 for a variety of sources. while the calorimeter resolution has a strong impact on the narrow line source sensitivity, it has only a moderate influence on the continuum source sensitivity, mostly through a change in the grm angular resolution.

figure 3: on-axis point source sensitivity versus calorimeter energy resolution (in units of the baseline resolution) for a 511 kev narrow line source (10^6 s exposure) and a 1-second gamma-ray burst with a continuum spectrum (0.2 mev < e < 2 mev; α = −1.1, β = −2.3, ep = 0.3 mev). the scale factor of 1 corresponds to the baseline energy resolution (4.6 % at 662 kev).

figure 4: on-axis continuum point source sensitivity in a 10^6 s exposure for a source with an e^-2 spectrum (panels for δe = e = 1 mev and δe = e = 20 mev).

figure 5: on-axis continuum point source sensitivity for a 10^6 s exposure versus transverse crystal size (left) and versus calorimeter depth resolution (right); e^-2 spectrum, 0.5 mev < e < 1.5 mev. the sensitivity obtained without any depth resolution is shown by the red dashed line.

the calorimeter energy resolution becomes relatively unimportant at energies above ~10 mev, when pair creation dominates over compton scattering and the direction of the original photon is reconstructed by the tracker. in addition, incomplete energy absorption at these energies becomes more important than the intrinsic energy resolution of the calorimeter.
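for illustration, the scale-factor study can be mimicked outside megalib by smearing ideal deposits with gaussian noise. the 1/√e shape of the baseline resolution below is our assumption, normalised to the quoted 4.6 % at 662 kev; it is not the exact megalib parametrisation.

```python
import numpy as np

rng = np.random.default_rng(42)

def sigma_rel(e_kev, scale=1.0):
    """relative energy resolution, normalised so that sigma/E = 4.6 % at
    662 keV and multiplied by the study's scale factor; the 1/sqrt(E)
    shape is an assumption, not taken from the paper."""
    return scale * 0.046 * np.sqrt(662.0 / e_kev)

def smear(deposits_kev, scale=1.0):
    """add gaussian noise to ideal energy deposits, as megalib does."""
    sigma = sigma_rel(deposits_kev, scale) * deposits_kev
    return rng.normal(deposits_kev, sigma)

ideal = np.full(100_000, 662.0)        # a monoenergetic test line
for factor in (1, 2, 4, 8):           # scale factors as in figs. 3 and 4
    print(factor, np.std(smear(ideal, factor)) / 662.0)
```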
the angular resolution and sensitivity of the grm telescope depend on the calorimeter spatial resolution, which is achieved through the fine segmentation: a 5 × 5 mm² crystal size is envisaged in the baseline design. this level of segmentation follows the design of the mega prototype [6] and has not been optimised for the grm. we considered several larger crystal sizes in our simulations to understand how the calorimeter segmentation affects the grm performance. from the left plot in fig. 5 we conclude that a crystal size of ~1 × 1 cm² can be used to reduce the number of readout channels without a significant impact on the grm performance. in the baseline design, the 8 cm long crystals in the bottom calorimeter are read out at both ends to achieve a depth resolution of the energy deposits. we estimated the grm sensitivity with different values of the depth resolution and found that a depth resolution of σ = 1 cm, or better, gives a noticeable improvement in the instrument sensitivity (right plot in fig. 5). linearly polarised γ-rays preferentially compton scatter perpendicular to the incident polarisation vector, resulting in an azimuthal scatter angle distribution modulated relative to the distribution for unpolarised photons. the polarisation sensitivity of the grm estimated for a typical γ-ray burst (grb) spectrum is shown in fig. 6.

figure 6: on-axis grm polarisation sensitivity in different energy bands as a function of the grb fluence. the grb spectrum is modelled with the band function (α = −1.1, β = −2.3, ep = 0.3 mev). the polarisation sensitivity also depends on the amount of background, and consequently on the grb duration, which is taken to be 20 s for this analysis. the brightest grb detected by batse had a fluence of 7.3 · 10^-4 erg/cm² above 100 kev.

5. future work

our further work will be focused on:
• laboratory tests of the sipms supplied by sensl;
• a realistic software model of sipms, including thermal noise and the finite dynamic range of the sensors;
• optimisation of the calorimeter module design comprising labr3 crystals and a sipm array. two options will be considered to achieve a depth resolution in the thick bottom calorimeter: double readout of single crystals and longitudinal segmentation.

acknowledgements

we would like to thank andreas zoglauer, who helped us to get started with the grips simulations. this project is being supported under esa's strategic initiative ao/16418/10/nl/cbi.

references

[1] greiner, j. et al.: gamma-ray burst investigation via polarimetry and spectroscopy (grips), exp. astron., 23, 2009, p. 91.
[2] greiner, j. et al.: grips – gamma-ray imaging, polarimetry and spectroscopy, arxiv:1105.1265, 2011.
[3] zoglauer, a., andritschke, r., schopper, f.: megalib – the medium energy gamma-ray astronomy library, new astron. rev., 50, 7-8, 2006, p. 629.
[4] zoglauer, a., andritschke, r., boggs, s. et al.: megalib: simulation and data analysis for low-to-medium-energy gamma-ray telescopes, spie, 7011e, 2008, p. 101.
[5] agostinelli, s. et al.: geant4 – a simulation toolkit, nucl. instr. meth., a506, 2003, p. 250.
[6] kanbach, g. et al.: development and calibration of the tracking compton/pair telescope, nucl. instr. meth., a541, 2005, p. 310.

acta polytechnica vol. 52 no. 5/2012

model of customer buying behavior in the cz mobile telecommunication market

ondřej grünwald
dept. of economics, management and humanities, czech technical university, technická 2, 166 27 praha, czech republic
corresponding author: grunwond@fel.cvut.cz

abstract

the czech mobile telecommunication market constitutes an oligopoly of three operators, who have built a privileged position and effectively crush any competition.
the entry of a new operator has been considered by the government since the end of 2009. the new mobile operator should push down the prices of services, which are among the highest in europe, and also affect the development of new mobile services. this paper analyzes consumer behavior in the mobile telecommunication market. it reveals how different elements are considered by customers and what is important when choosing a mobile tariff. using conjoint analysis, we obtained empirical arguments about the preferences of customers in the czech republic. the analysis shows that the relatively high price of services greatly reduces the unsaturated demand in the mobile telecommunication market, and proves that the price is crucial in customer decision-making.

keywords: model of buying behavior, adaptive choice-based conjoint analysis, market research, market share, price sensitivity.

1 introduction

in present-day market research, the most popular methods for modeling customer buying behavior are discrete-choice methods, which include choice-based conjoint (cbc) [6]. these methods are favored mainly for their ability to mimic a real purchasing decision by a discrete choice better than traditional conjoint methods based on ranking or rating a set of product concepts, where customer preferences are usually expressed as rank orders or as values on quantitative scales. however, the cbc approach has been less effective than the classical conjoint approaches in the amount of information about preferences that is gathered during questioning [3]. a respondent in a conjoint survey first considers multiattribute product options within some set; in the discrete-choice case, as a result, the respondent chooses just the one concept that is most preferred. the advantage of the choice is that this kind of decision is intuitive for everyone, but several pieces of information concerning the preferential orders of the product concepts in the set are not recorded. it is therefore necessary to include more respondents in a cbc interview than in the traditional conjoint methods. in recent years, parameter estimation for individual respondents in cbc has become feasible using hierarchical bayesian (hb) methods. however, approaches for designing more effective discrete choice questionnaires are still a concern for market researchers. one method that improves the cbc experiment in terms of effective information gathering [1] is adaptive choice-based conjoint (acbc) [5].

2 conjoint analysis methodology and experimental design

conjoint methods originated in psychometrics and were adopted for market research. conjoint analysis is in principle based on a global assessment of product incentives described by a specific combination of product attributes. in a questionnaire, respondents make trade-offs among product properties while they consider the overall preference for a concept.

2.1 the adaptive choice-based conjoint approach

adaptive choice-based conjoint (acbc) is a sawtooth software system. it combines two earlier approaches, choice-based conjoint (cbc) and adaptive conjoint analysis (aca), and has adopted their favorable aspects for questioning. acbc allows the analyst to uncover the customer's preferences and to model the customer's buying behavior while the customer is forming a decision on a complex product or service. during an interview, this approach adjusts the offering of product concepts to make it as relevant as possible for each individual: for each respondent, a specific set of concepts focusing on their decision space is generated. the preference data is analyzed on the individual level, using an hb procedure. moreover, the acbc interview method is optimized for a larger number of attributes and levels than was possible in earlier cbc. it is necessary to carry out acbc questioning using a computer, and the interview is designed in three types of sections, see figure 1.
for each respondent, the specific set of concepts focusing on their decision space is generated. the preference data is analyzed on the individual level, using an hb procedure. moreover, the acbc interview method is optimized for a larger number of attributes and levels than was possible in earlier cbc. it is necessary to carry out acbc 42 acta polytechnica vol. 52 no. 5/2012 byo configure your preferred product screening generation of individual relevant set of concepts detection of must-have and must-avoid attribute levels (including noncompensatory aspect) choice choice of the best product variant in the set of relevant alternatives section 1 section 2 section 3 figure 1: three acbc questioning sections. questioning using a computer, and it is designed in three types of sections, see figure 1. the first section is byo (build your own). usually, all the attributes and levels included in the interview are presented there, but some of them can be omitted. the respondent chooses the most preferred level from each attribute. in this way, he will define the closest configuration to his ideal product concept. the screening section follows byo, where the set of respondent’s most relevant product concepts (generated as near-neighbors to byo) is presented. the algorithm for generating the near-orthogonal set based on byo selection [3] typically produce t (18 < t < 36) near concepts containing the full range of attribute levels included in the study. number t depends on the number of attributes and levels. the algorithm below is executed for each of the t concepts. step 1. randomly choose integer number ai between amin and amax, which determines how many attributes in c0 will be modified to get near to concept c1. step 2. randomly choose ai elements in c0 which will be modified. step 3. randomly choose levels for the attributes from step 2 that are not included in the byo concept. step 4. check that the chosen concept does not contain prohibited pair of levels and does not duplicate a previous respondent’s concept. in cases of duplication, or positive detection of a prohibited pair, discard the concept and go back to step 1. in the algorithm, c0 is a vector of the number of elements which is equal to the number of attributes in the byo concept. amin and amax are explicitly set during questionnaire design to determine the minimum and maximum number of attributes that can be modified from the byo concept. the set of generated concepts is presented in the task, usually with 4-6 concepts at once, and the respondent chooses one answer from the two options in each concept. this indicates whether a product of this type is acceptable to him to purchase as a customer. from these responses, we can derive for each respondent the none parameter quantifying a utility threshold within which the respondent is willing to buy. during the purchase decision, customers usually use the elimination method and reduce the set of options from among which the final product will be chosen. acbc incorporates cutoff rules taking must-have and must-avoid product features [5] into account. the must-have question asks for the attribute levels that were included in the concepts marked as possibility for purchase during the previous screening tasks. the must-avoid question asks in a similar way for the attribute levels that were not presented in any of the product concepts marked as a possible option. 
the set of generated concepts is presented in tasks, usually with 4–6 concepts at once, and the respondent chooses one of two options for each concept, indicating whether a product of this type is acceptable for him to purchase as a customer. from these responses, we can derive for each respondent the none parameter, quantifying a utility threshold above which the respondent is willing to buy. during the purchase decision, customers usually use the elimination method and reduce the set of options from among which the final product will be chosen. acbc incorporates cut-off rules taking must-have and must-avoid product features [5] into account. the must-have question asks for the attribute levels that were included in the concepts marked as a possibility for purchase during the previous screening tasks; the must-avoid question asks, in a similar way, for the attribute levels that were not presented in any of the product concepts marked as a possible option. compensatory models assume that the utility of a concept is $u = \sum_j \beta_j x_j$, where the sum runs over the $j$ attributes of the concept and $\beta_j$ is a part-worth parameter that denotes the weight of the utility of attribute level $x_j$. the cut-off questions identify the levels that cannot be compensated in the product by another level. the set of concepts for the upcoming tasks is adjusted so that the concepts containing a must-avoid level, or not containing the must-have levels, are replaced by new concepts satisfying the uncovered cut-off rules. the choice tournament section of acbc works in the sense of a discrete choice (one concept among other options within a choice task), as in cbc tasks. here, only the options chosen in the screening section as possible product concepts for purchase are presented to the respondent. in this way, each respondent is asked only about the narrow group of product options that are relevant to him (cbc includes the whole range of concepts included in the experiment). usually, this section presents around 3–5 concepts at once, and the concept chosen as the best option is offered again in the subsequent tasks, together with the other winners; this is repeated until only one concept is left. the process takes t/2 choice tasks for t concepts taken from the screening section. another special feature of acbc is the possibility to create a summed price attribute. the classical price attribute approach usually defines 3–5 levels as discrete points in the price range and combines them with the other attribute levels. this causes some of the concepts to present unrealistic prices for low-end and premium products (low-end products are shown with a high price level, and vice versa), and in such situations respondents may distribute their preferences in a distorted way. unlike the classical approaches, acbc composes the concept price as the sum of a common base price and the partial component prices assigned to the levels included in the concept. while the concept set for the interview is being generated, the prices in the concepts are generated with a predefined variation [5]. all the attributes in the concepts should be mutually independent, but some small correlation is acceptable in order to achieve highly realistic concepts. for sufficient independence of the price attribute, the prices in the concepts should be varied randomly, at least in the range from −30 % to +30 % around the sum of the static base price and the component prices of the concept.

2.2 estimation of part-worth utilities

the estimation can be made using an hb procedure [2, 4, 7], where we want to estimate the part-worths of each individual respondent in vector $\beta$, the mean value over all respondents in vector $\alpha$, and the variances and covariances of the respondents in matrix $C$. the hierarchical model consists of two levels. at the upper level, we assume that the vectors of the individual part-worths have a multivariate normal distribution,

$\beta \sim N(\alpha, C). \qquad (1)$

at the lower level, we assume a multinomial logit model for each respondent. the probability $p$ that respondent $i$ chooses a particular product alternative $n$ from set $m$ is

$p_{imn} = \frac{\exp(x_{mn}\beta_i)}{\sum_{j=1}^{N}\exp(x_{mj}\beta_i)}, \qquad (2)$

where $N$ is the number of concepts in set $m$ and $x_{mn}$ is the row vector of attribute levels of the $n$-th alternative from set $m$. the estimation algorithm can be described as:

init. based on the counts of the presence of the levels in the chosen concepts, divided by the counts of the presence of the levels in all concepts from the sets $m$, determine the initial estimate of vector $\beta$; set the elements of $\alpha$ to 0 and, for $C$, set unit variances and zero covariances.
step 1. based on the current estimates of $\beta$ and $C$, estimate vector $\alpha$ as the mean value of the distribution.
step 2. based on the current estimates of $\beta$ and $\alpha$, estimate the matrix of variances and covariances $C$.
step 3. based on the current estimates of $\alpha$ and $C$, estimate a new vector $\beta^{new}$ for each respondent $i$.

the algorithm is repeated in thousands of iterations, which can be divided into two groups. the first few thousand iterations are used to achieve convergence, and the subsequent iterations further refine the estimation; this second group contains the remaining few thousand iterations, which are saved for further analysis and the estimation of vectors $\beta$, $\alpha$ and matrix $C$.
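equation (2) can be sketched directly; the function below evaluates the multinomial logit choice probabilities for one choice set, with a max-subtraction added for numerical stability (our addition, not part of the paper).

```python
import numpy as np

def choice_probabilities(x_set, beta):
    """x_set: one row of attribute-level codes per concept in the set;
    beta: the respondent's part-worth vector. returns p_imn per concept."""
    u = x_set @ beta          # utilities x_mn * beta_i
    u = u - u.max()           # numerical stabilisation
    e = np.exp(u)
    return e / e.sum()
```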
based on counts of the presence of the levels in the chosen concepts divided by the count of the presence of the levels in all concepts from set m, determine the initial estimate of vector β, set the elements of α to 0 and, for c, set unit variance and zero covariance. step 1. based on the current estimations of β and c, estimate vector α as the mean value of the distribution. step 2. based on current estimations of β and α, estimate matrix of variance and covariance c. step 3. based on the current estimates of α a c, estimate new vector βnew for each respondent i. the algorithm is repeated in thousands of iterations which can be divided into two groups. the first few thousand iterations are used to achieve convergence, and the subsequent iterations further refine the estimation. the second group contains the remaining few thousand iterations, which are saved for further analysis and the estimation of vectors β, α and c. unlike conventional statistical approaches, the subsequent iterations do not converge to a single point estimate for each part-worth parameter, but after converging the estimates move randomly randomly in the subsequent figure 2: parameter estimates using hb routine. iterations which reflect some amount of uncertainty, see on figure 2. the part-worths are determined as a point estimate obtained by averaging the individual vectors β from the second group of iterations. subsequent values of βnew in each iteration are estimated by using the metropolis-hasting algorithm, which shapes the whole estimation process by a bayesian nature. we assume that βold and βnew represent the former and subsequent estimation where vector βnew was created by adding a small random variation to βold. by using the logit model, we can compute, from vector β, the probability of occurrence for the choice set of each respondent. li = m∑ m=1 n∑ n=1 yimn(pimn) (3) in accordance with the hierarchical model, individual vectors β have multinomial normal distribution with a mean value α and covariance matrix c. using the relative probability density function in expression 4, we can compute the probabilities pnew and pold that βold and βnew would have been drawn from that distribution. p(β1,β2, . . . ,βk) = e− 1 2 (β−α) tc−1(β−α) (4) finally, we can compute ratios r of the posterior probabilities which are in accordance with the bayesian rule: r = (lnew ×pnew) (lold ×pold) (5) if ratio r is greater than 1, a new estimate of βnew is accepted, because it has a higher posterior probability than in the previous step. if r is less than 1, vector βnew is accepted for the next iteration step with probability equal to r. 2.3 acbc experiment to investigate customer buying behavior we tested the acbc approach in a case study concerned with mobile operator tariffs in the czech re44 acta polytechnica vol. 52 no. 5/2012 operator calls sms new operator 50 minutes (czk 140) 20 sms (czk 20) vodafone 150 minutes (czk 300) 100 sms (czk 60) t-mobile 300 minutes (czk 450) 500 sms (czk 150) telefónica o2 500 minutes (czk 600) unlimited (czk 1500) data free numbers contract without data without lump sum (2 years) 150 mb 3 free numbers (czk 150) transferable credit 500 mb 6 free numbers (czk 300) rechargeable card 1 gb 3 gb price: summed pricing attribute at different prices from +50 % to −30 % table 1: attributes and configuration for the acbc study. public. the aim of our empirical study was: 1. to test the ability of the acbc method in terms of its applicability in a real scenario by using an electronic survey. 2. 
2.3 acbc experiment to investigate customer buying behavior

we tested the acbc approach in a case study concerned with mobile operator tariffs in the czech republic. the aim of our empirical study was:

1. to test the ability of the acbc method in terms of its applicability in a real scenario by using an electronic survey.
2. to evaluate the potential chances of a new gsm operator entering the czech market in terms of identifying a suitable product offer.
3. to determine the attributes of services offered by the mobile operators in terms of their utility and importance from the perspective of customers.
4. to determine the optimal telephone tariff (the combination of attribute levels with the highest utilities) for a new operator.
5. to estimate the demand for the services of the new operator and to determine customer needs and expectations.
6. to gain an insight into customer buying behavior in mobile telecommunications when deciding on a tariff.
7. to design a price sensitivity model.

the study included a total of 7 product attributes with a total of 23 levels, see table 1. (the attributes and their levels were determined by an analysis of unit prices for services offered by the operators on their websites in november 2011. the byo component prices were set to reflect the upcoming christmas event and were generally lower than the official offers; the operators have, however, also been broadly offering unofficial individual retention bids at much lower prices. the attribute operator was not included in the byo section.)

operator: new operator; vodafone; t-mobile; telefónica o2
calls: 50 minutes (czk 140); 150 minutes (czk 300); 300 minutes (czk 450); 500 minutes (czk 600); unlimited (czk 1500)
sms: 20 sms (czk 20); 100 sms (czk 60); 500 sms (czk 150)
data: without data; 150 mb; 500 mb; 1 gb; 3 gb
free numbers: without; 3 free numbers (czk 150); 6 free numbers (czk 300)
contract: lump sum (2 years); transferable credit; rechargeable card
price: summed pricing attribute at different prices from +50 % to −30 %
table 1: attributes and configuration for the acbc study.

the screener section was composed of 7 tasks, each consisting of 4 concepts. after 3 initial screening tasks, 3 must-avoid questions and 2 must-have questions about the levels were asked. in the choice tournament section, each task consisted of 3 concepts.

3 results of the empirical study

the experiment was started the week before christmas 2011. (the electronic interviews and the subsequent analysis of the data were performed using ssi web from sawtooth software.) a total of 816 individual questionnaires were initialized in byo during the experiment. a typical respondent was a student of a czech university, aged between 20 and 30. the number of fully completed experiments was 512, i.e. 63 %.

3.1 counting analysis of tariff choices

a counting analysis provides a good insight into the proportions of respondent decisions in each section of the experiment. figure 3 shows the frequency of the level choices within the attributes while setting up the ideal product in byo; the operator attribute was omitted in this section.

figure 3: frequencies of level choices in byo.

figure 3 also shows the most common initial configuration of the desired tariff as a result of trading the attribute levels against the associated component prices: the most configured tariff was 50 minutes, 100 sms, without data services, without free call numbers within the network, and lump payments for two years.
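the counting analysis itself is just a tally of level occurrences over the chosen concepts; a minimal sketch with hypothetical concept records (the dictionary format is illustrative, not the study's data format):

```python
from collections import Counter

# each chosen concept is recorded as a mapping attribute -> level (illustrative)
chosen = [
    {"calls": "50 minutes", "sms": "100 sms", "data": "without data"},
    {"calls": "50 minutes", "sms": "20 sms", "data": "150 mb"},
    {"calls": "150 minutes", "sms": "100 sms", "data": "without data"},
]

def level_frequencies(concepts, attribute):
    """frequency of each level of one attribute over the given concepts."""
    return Counter(c[attribute] for c in concepts)

for attribute in ("calls", "sms", "data"):
    print(attribute, dict(level_frequencies(chosen, attribute)))
```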
in the byo selection, respondents placed much emphasis on price: for the quantitative attributes they preferred the levels with the lowest prices. in the following screening section, where a set of tariffs generated as the nearest neighbors to the byo tariff was presented (each varying in 1–2 attributes), the respondents considered the variants as options for purchase. the largest group of respondents marked between 7 and 17 concepts as acceptable products. there were also respondents for whom no tariff was acceptable (a total of 27 respondents, representing 5.27 % of the whole group). figure 4 illustrates how many respondents marked how many concepts as their option for purchase.

figure 4: frequencies of the number of acceptable concepts.

in the screening section, two questions on must-have attribute levels and three questions on must-avoid attribute levels were included (non-compensatory aspects).
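applying the uncovered cut-off rules amounts to discarding (and replacing) every concept that contains a must-avoid level or misses a must-have level; a minimal sketch, with hypothetical rules and concepts:

```python
def satisfies_cutoffs(concept, must_have, must_avoid):
    """true if the concept contains every must-have level and no must-avoid level."""
    levels = set(concept.values())
    return must_have <= levels and not (must_avoid & levels)

must_have = {"100 sms"}                       # hypothetical rules of one respondent
must_avoid = {"rechargeable card", "3 gb"}

concepts = [
    {"sms": "100 sms", "contract": "lump sum (2 years)", "data": "150 mb"},
    {"sms": "20 sms", "contract": "lump sum (2 years)", "data": "150 mb"},
    {"sms": "100 sms", "contract": "rechargeable card", "data": "500 mb"},
]
acceptable = [c for c in concepts if satisfies_cutoffs(c, must_have, must_avoid)]
print(len(acceptable))    # only the first concept survives both rules
```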
figure 5 shows the frequency of the respondents' setting of must-have levels in a tariff. the graph shows that the acceptability of the tariffs was mostly conditioned by at least 150 mb of data services (28.32 %), 100 sms (16.41 %) and 150 minutes of voice services (13.67 %). the difference from the byo section was mainly due to the predefined variability of the prices, which leads to more favorable tariffs.

figure 5: frequencies of must-have levels.

figure 6 shows the frequencies of the second part of the cut-off rule, namely the must-avoid attribute levels of a tariff. the price level was clearly the most common choice as a must-avoid level: a large proportion of respondents (91.6 %) limited the tariffs acceptable to them to an average price of czk 635 per month. another significant part of the respondent group would not accept a tariff without data services (45.9 %). it is also interesting how many times the operator levels were marked as must-avoid, with telefónica o2 (6.84 %) and t-mobile (5.86 %) receiving the largest numbers of disapprovals. the small frequency of the operator attribute in the non-compensatory considerations is obvious here.

figure 6: frequencies of must-avoid levels.

figure 7 refers to the frequencies of the levels in the winning concepts of the final questionnaire section, where the respondents determined the winning concept by selecting the most appropriate concepts in the subsets created from the tariffs accepted in the screening section. the most frequent winner would be the new operator tariff with 50 minutes (52 %), 100 sms (50.8 %), no data services (31.88 %), no free numbers (78.62 %) and lump monthly payments for two years (48.91 %). the average tariff price would be czk 432.

figure 7: frequencies of levels in winning concepts.

figure 8: distribution of tariff prices in the screening sections.

figure 9: utility part-worth functions of the tariff attributes.

(zero-centered differences)
level | average utility | standard deviation
new operator (β1) | 3.1208 | 6.375
vodafone (β2) | 0.4692 | 6.0899
t-mobile (β3) | −1.2532 | 5.7165
telefónica o2 (β4) | −2.3367 | 5.4828
50 minutes (β5) | −38.3277 | 26.6757
150 minutes (β6) | −12.1474 | 11.4497
300 minutes (β7) | −2.274 | 13.2111
500 minutes (β8) | −6.6355 | 19.0811
unlimited (β9) | 59.3847 | 24.3779
20 sms (β10) | −11.6015 | 17.57
100 sms (β11) | 6.815 | 7.5969
500 sms (β12) | 4.7865 | 13.8801
without data (β13) | −44.509 | 34.6875
150 mb (β14) | −7.3311 | 14.0473
500 mb (β15) | 10.5313 | 12.6187
1 gb (β16) | 18.9618 | 14.9506
3 gb (β17) | 22.347 | 21.2323
without (β18) | −7.9633 | 11.4524
3 free numbers (β19) | 2.9537 | 8.5123
6 free numbers (β20) | 5.0095 | 6.4706
lump sum (2 years) (β21) | 6.5339 | 9.5482
transferable credit (β22) | −1.2356 | 6.1642
rechargeable card (β23) | −5.2982 | 11.9827
price: 100 (β24) | 216.2966 | 29.955
price: 400 (β25) | 133.2723 | 35.3982
price: 1500 (β26) | −157.3462 | 74.3023
price: 3200 (β27) | −192.2227 | 34.1939
none (β28) | 356.1672 | 170.5967
table 2: part-worths of the levels estimated using the hb procedure.

tariff | operator | calls | sms | data | contract | price
tm bav se* | t-mobile | 50 | 100 | 0 | lump sum | 228
tm bav se+* | t-mobile | 50 | 100 | 200 | lump sum | 366
tm kredit 300 | t-mobile | 60 | 20 | 0 | credit | 300
o2 pohoda* | o2 | 50 | 100 | 0 | lump sum | 180
o2 pohoda+* | o2 | 50 | 100 | 150 | lump sum | 280
o2 neon s | o2 | 80 | 20 | 0 | lump sum | 360
st. na míru* | vodafone | 50 | 20 | 0 | lump sum | 150
st. na míru+* | vodafone | 80 | 20 | 150 | lump sum | 240
karta na míru | vodafone | 80 | 20 | 0 | rech. card | 300
(+ with data, * student tariff)
table 3: simulation of the first scenario including student tariffs.
3.2 part-worth preference model

the parameters of the part-worth utilities were estimated by the hb procedure for each respondent on the individual level; interaction effects between attributes were not taken into consideration. given the distribution of the tariff prices generated in the screening section, see figure 8, the levels of the price attribute for the part-worth estimation were set to czk 100, czk 400, czk 1500 and czk 3200 to get a good approximation of the distribution of all generated prices, using linear interpolation of the estimated parameters. the average result of the parameter estimation using the iterative hb procedure is contained in table 2 and is presented graphically in figure 9, where each attribute level has one estimated parameter β. the utilities are zero-centered within each attribute (the part-worths of the levels of an attribute sum to 0), and the levels of the mutually independent attributes are interpolated using a linear function.

the range between the maximum and minimum part-worth levels within an attribute represents the relative attribute importance, computed as the utility range of the single attribute divided by the sum of all attribute ranges. the highest average importance is allocated to the price attribute (62 %), second place goes to the voice services (14.7 %), and then come the data services (11 %); these attributes have the most influence on the customer's buying decision. the slope of the line between the levels within an attribute specifies the size of the change in utility due to a change of the attribute level in the concept; for example, in voice services, the utility of the tariff increases most when the level is changed to unlimited calls.

the parameter "none" (average value 356), derived for each individual respondent from the answers in the screening section, represents the utility threshold at which a purchase is made by the respondent. this parameter was used for each respondent to determine the purchasing decision in the subsequent market simulation model.
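the relative importances quoted above follow directly from the ranges of the zero-centered part-worths; the sketch below reproduces the computation on a subset of the averages from table 2 (with only three of the seven attributes included, the percentages necessarily differ from the full-study figures):

```python
# average part-worths from table 2, grouped by attribute (subset only)
partworths = {
    "calls": [-38.3277, -12.1474, -2.274, -6.6355, 59.3847],
    "data": [-44.509, -7.3311, 10.5313, 18.9618, 22.347],
    "price": [216.2966, 133.2723, -157.3462, -192.2227],
}

ranges = {attr: max(v) - min(v) for attr, v in partworths.items()}
total = sum(ranges.values())
importance = {attr: 100.0 * r / total for attr, r in ranges.items()}
print(importance)    # the price attribute dominates, as in the study
```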
3.3 simulation model for market scenarios

the simulation was performed using the market simulator from sawtooth software with 9 tariffs in the scenario, see table 3. the first scenario is based on the web tariff offers of the operators, and primarily comprises a market of student tariffs, due to the composition of the group of target respondents. the student tariffs offer less expensive mobile services than the standard tariffs, and the preference data from the respondents reflect the high importance of the price attribute, as seen in the part-worth estimation results in table 2. at the same time, one standard tariff was added for each operator to obtain a preferential comparison of the utility of the non-student offers. (the standard tariffs are listed in the tables without the marks * and +.)

the result of the market share (preference share) simulation is listed in table 4 and in figure 10. the simulation is based on the first choice rule, which supposes that each respondent will buy the concept with the highest utility. despite the favorable prices offered in the student tariffs, nearly half of the respondents (49.32 %) would not purchase any tariff in this scenario. table 4 shows that o2 (27.93 %) received the largest share in relation to the other offers in this scenario, vodafone (16.67 %) had the second largest share, and t-mobile (6.08 %) took third place. student tariffs with data services gained the highest share in the simulation. standard operator tariffs, which are only slightly higher in price than the student tariffs, would take no share, see figure 10.

tariff | share [%] | std. err.
tm bav se* | 2.89 | 0.43
tm bav se+* | 3.13 | 0.42
tm kredit 300 | 0.06 | 0.02
o2 pohoda* | 13.02 | 1.16
o2 pohoda+* | 14.84 | 1.15
o2 neon s | 0.07 | 0.03
vf student na míru* | 5.99 | 0.73
vf student na míru+* | 10.5 | 0.96
vf karta na míru | 0.17 | 0.1
none | 49.32 | 1.71
(+ with data, * student tariff)
table 4: shares of the potential market of student and standard tariffs.

due to the practices of the mobile operators, who in many cases offer retention bids in an attempt to prevent customers from transferring to another operator, a new tariff based on a detected retention offer (operator: the new operator, calls: 150 minutes, sms: 120, data: 600 mb, contract: lump sum (2 years), price: czk 220) was added to the basic scenario to compute its share, see table 5. (the "new tariff" offer was configured based on a retention offer made by an existing operator.) the simulation results show that only a small proportion of the respondents (14.43 %) would not accept the offer of the new operator. if a new operator were to enter the market with this tariff, it would acquire the majority of the market (72.27 %), provided that the other operators did not change their offers.

tariff | share [%] | std. err.
tm bav se* | 0.97 | 0.19
tm bav se+* | 0.11 | 0.03
tm kredit 300 | 0.02 | 0.01
o2 pohoda* | 6.11 | 0.79
o2 pohoda+* | 0.74 | 0.17
o2 neon s | 0.02 | 0.01
vf student na míru* | 2.97 | 0.47
vf student na míru+* | 2.22 | 0.40
vf karta na míru | 0.13 | 0.10
new tariff | 72.27 | 1.47
none | 14.43 | 1.02
(+ with data, * student tariff)
table 5: the basic scenario with the new tariff based on the retention proposal.

figure 10: shares of the potential market of student and standard tariffs.

3.4 price sensitivity model

in the market simulator, we also conducted a price sensitivity test. for each tariff, we gradually changed the price levels while keeping the other tariffs at constant settings, and in each step we observed the change in the share of the tariff in the scenario. the results are shown in figure 11. the degree of price elasticity is usually calculated as

e = [(q2 − q1) / 0.5(q1 + q2)] / [(p2 − p1) / 0.5(p1 + p2)]. (6)

in figure 11, we estimate the average price elasticity e of the demand using a log-log regression, which is more accurate in this case, where there are several price points. at a price level of czk 100, the new tariff would take almost 90 % of the market. when the price increases, the share decreases rapidly, and when it exceeds a level of czk 600, the market share of the new tariff drops to less than 10 %. all the student tariffs at the lowest price level of czk 100 have a market share below 20 %; when the price is increased, their shares decrease rapidly due to the high price elasticity of the tariffs.

figure 11: price sensitivity slopes of the tariffs in the second simulation scenario (new tariff: e = −0.18).
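expression (6) is the usual arc (midpoint) elasticity; a worked sketch on two hypothetical points of a share-price curve:

```python
def arc_elasticity(q1, q2, p1, p2):
    """midpoint price elasticity of demand, expression (6)."""
    return ((q2 - q1) / (0.5 * (q1 + q2))) / ((p2 - p1) / (0.5 * (p1 + p2)))

# hypothetical share readings of one tariff at two price levels
print(arc_elasticity(q1=16.0, q2=4.0, p1=100.0, p2=300.0))   # -1.2: elastic demand
```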
taking into account the current market situation and the price elasticities, if all the operators were to reduce their prices, the t-mobile and o2 tariffs would gain a bigger additional share than the vodafone tariff. in this model, offering a suitable combination of mobile services is more important for the respondents' purchasing decision than the operator's brand.

conclusion

the aim of this paper was to analyze customer preferences for mobile telecommunications services. the study was based on real market information. using conjoint analysis, we provide a preference model based on empirical customer choice data, which confirms that conjoint analysis is a very helpful tool for quantifying the potential effects of specific aspects of operator tariffs. the market analysis in this study consists of a counting analysis, a part-worth analysis, a choice simulation model and a price sensitivity model. the first analysis provides a view of the extreme values of the elements in the chosen concepts. the analysis of the part-worth utilities is based on a logit model, which takes all the respondent answers into account for the parameter estimation; this model allows us to predict the buying behavior and the preference of any respondent for any possible combination of the attributes involved in the study. the subsequent simulation model was analyzed in two basic scenarios including student tariffs, and their market shares (their preferences) were estimated. finally, we compiled a price sensitivity model.

the key finding of the study is based on the analysis of prices. the findings confirmed that the price, which was also most often restricted by the respondents as a non-compensatory consideration, is the most important factor in the final purchasing decision made by customers, who are willing to significantly reduce the range of services for which they subscribe in return for a lower price. in the czech republic, the tariff prices are above the european average, and customers can easily compare the prices with offers abroad. this renders the current operators' offers of little interest, and customers often negotiate better prices through unofficial retention deals. the entry of a new operator into the market with favorable prices promoted in official offers would surely reduce the current high prices for services. however, the new operator will have to anticipate an aggressive pricing strategy of the three present-day operators in response to the new competition.

acknowledgements

the research described in this paper was supervised by doc. ing. věra vávrová, csc., fee ctu in prague and supported by sawtooth software under grant no. 1303744 and no. 1303744 b. the author would like to thank sawtooth software for the grant for software licences and doc. ing. věra vávrová, csc. for insightful and constructive comments.

references

[1] c. chapman, j. johnson, c. weidemann et al. cbc vs. acbc: comparing results with real product selection. in sawtooth software conference proceedings, sequim, wa, 2009.
[2] r. johnson. understanding hb: an intuitive approach. research paper series, sawtooth software, 2000.
[3] r. johnson, b. k. orme. a new approach to adaptive cbc. in sawtooth software conference proceedings, sequim, wa, 2007.
[4] t. otter. hb analysis for multi-format adaptive cbc. in sawtooth software conference proceedings, 2007, p. 111.
[5] sawtooth software inc. acbc technical paper, 2008. technical paper series.
[6] sawtooth software inc. the cbc system for choice-based conjoint analysis, 2008. technical paper series.
[7] sawtooth software inc. the cbc/hb system for hierarchical bayes estimation version 5.0, 2009. technical paper series.

research trends for pid controllers

antonio visioli
university of brescia, via branze 38, i-25123 brescia, italy
corresponding author: antonio.visioli@ing.unibs.it

abstract this paper analyses the most significant issues that have recently been addressed by researchers in the field of proportional-integral-derivative (pid) controllers. in particular, the most recent techniques proposed for tuning and designing pid-based control structures are briefly reviewed, together with methods for assessing their performance. finally, fractional-order and event-based pid controllers are presented among the most significant developments in the field.

keywords: pid controllers, design, tuning, control structures.

1 introduction

despite the new results in control theory that have been achieved by researchers year-by-year all over the world, proportional-integral-derivative (pid) controllers are still the most widely-used controllers in industry. this is because it is really very difficult to improve their cost/benefit ratio, which is obviously a major concern in industry. pid controllers are nevertheless still a very active field of research, because it is recognized that they are often poorly tuned in industrial applications, and the performance that is achieved can often be improved considerably by using a more effective design technique for the pid-based control system. a great impulse for research on pid controllers was provided by a workshop dedicated to them organized in terrassa (spain) in 2000, sponsored by the international federation of automatic control (ifac). the success of the meeting is witnessed not only by the number of participants but also by the large number of papers and books published in the last ten years specifically on this topic (see, e.g. [3, 15, 23, 45, 46, 52, 55]). in order to further outline the current state-of-the-art and the future perspectives from an academic and also from an industrial viewpoint, another ifac conference dedicated to pid controllers was held in brescia (italy) in march 2012.

this paper describes many significant results achieved recently in the field of pid controllers, and reviews in particular those related to tuning and designing pid-based control structures, determining the stabilizing region of the pid parameters, and performance assessment issues. finally, new control concepts applied to pid controllers, namely fractional-order and event-based pid controllers, are highlighted.

2 generalities

a pid controller is typically employed in a unity-feedback control system like that shown in figure 1, where p is the process, y is the process variable, r is the set-point signal and d is the load disturbance.

figure 1: the unity feedback control scheme.

in its basic form, the pid controller can be described by the following transfer function:

c(s) = k_p (1 + 1/(t_i s) + t_d s), (1)

where k_p is the proportional gain, t_i is the integral time constant and t_d is the derivative time constant. expression (1) is usually known as the ideal form; other forms (usually called the series and parallel forms) can also be employed [3]. in order to be effectively employed in practical cases, additional functionalities also have to be implemented. the most important ones can be summarized as follows (details can be found in [52]); a minimal discretized sketch incorporating them follows the list.
• the derivative action has to be filtered in order to make the controller proper and to filter the (high-frequency) measurement noise; in addition, the derivative action is often applied directly to the process variable instead of to the control error, in order to avoid the so-called derivative kick when a step signal is applied to the set-point. the derivative filter has to be taken into consideration in the overall design of the controller [13, 17].

• the set-point value for the proportional action can be weighted in order to obtain a two-degree-of-freedom controller, i.e. in order to reduce the overshoot in the set-point step response when the controller is tuned to increase the bandwidth of the system with the aim of increasing the load disturbance rejection performance. in this case a suitable choice of the value of the set-point weight (or the application of a more sophisticated technique [47]) can yield a significant increment of the control performance.

• suitable techniques (see, e.g. [48]) should be implemented in order to avoid the windup effect of the integral action, which has a detrimental effect when large set-point changes are applied.
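the following is a minimal discretized sketch of the ideal form (1) together with the three functionalities just listed: a first-order derivative filter acting on the process variable, a set-point weight b, and conditional-integration anti-windup. it is one possible textbook realization under these assumptions, not a reference implementation from the cited works.

```python
class Pid:
    """ideal-form pid of (1) with a derivative filter (ratio n), set-point
    weight b and conditional-integration anti-windup; backward-euler
    discretization with sampling period dt."""

    def __init__(self, kp, ti, td, n=10.0, b=1.0, umin=-1.0, umax=1.0, dt=0.01):
        self.kp, self.ti, self.td = kp, ti, td
        self.n, self.b = n, b
        self.umin, self.umax, self.dt = umin, umax, dt
        self.i = 0.0            # integral state
        self.d = 0.0            # filtered derivative state
        self.y_prev = None

    def update(self, r, y):
        if self.y_prev is None:
            self.y_prev = y
        # the derivative acts on -y, which avoids the kick on set-point steps
        tf = self.td / self.n   # derivative filter time constant
        self.d = (tf * self.d - self.kp * self.td * (y - self.y_prev)) / (tf + self.dt)
        self.y_prev = y
        u = self.kp * (self.b * r - y) + self.i + self.d
        # conditional integration: the integral is frozen while u saturates
        if self.umin < u < self.umax:
            self.i += self.kp * self.dt / self.ti * (r - y)
        return min(max(u, self.umin), self.umax)

pid = Pid(kp=2.0, ti=5.0, td=1.0)
print(pid.update(r=1.0, y=0.0))   # first control move after a unit set-point step
```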
3 tuning and automatic tuning techniques

many tuning rules for pid controllers have been proposed in the last century [23], and new rules have also been proposed recently (for self-regulating, integral and unstable processes). among these new rules, it is worth mentioning the simc tuning rules [9, 34], which have been proven to provide very good results despite their simplicity, and the amigo tuning rules, which are capable of giving high performance for a wide range of processes [2]. when a new technique is investigated, the following points should be considered. first, the parameters related to the additional functionalities mentioned in section 2 should be explicitly taken into account. other important issues, e.g. the robustness of the controller to modelling uncertainties, should also be considered [43]. finally, the availability of more and more advanced identification techniques calls for procedures to select the most suitable identification strategy for a given application and for a general model-based design strategy [8, 18]. this is clearly more relevant when automatic tuning techniques are implemented. obviously, each design methodology should yield an optimal pid controller, i.e. it should minimize some significant performance index (e.g. the integrated absolute error when a set-point or load disturbance step signal is applied) subject to constraints, usually represented by the robustness of the control system or by the control effort.

4 pid-based control structures

many different pid-based control structures have been considered as an effective means for obtaining high performance while retaining a simple implementation of the control system. in this context it is worth highlighting (in addition to the above-mentioned decoupling strategies) structures and tuning techniques for dead-time compensation schemes [22], for cascade [41] and ratio control systems [50, 51], and methods for designing feedforward control actions (both for the set-point following task [28, 49, 53] and for the load disturbance rejection task [11, 27, 44]). it should be stressed that the challenging issue in a pid-based control system is to achieve a suitable combination of the block scheme design and the tuning of the controller parameters.

5 pid control for mimo processes

as it is recognized that many processes are multivariable by their nature, the design of pid controllers for application in multi-input-multi-output (mimo) systems is also a very interesting and relevant topic. in particular, the tuning of decentralized or multivariable pid controllers poses new challenges because of the coupling effects in the process [35, 54]. further, effective decoupling strategies are needed, where the best trade-off between ease of implementation and obtained performance is achieved [7, 19, 21, 29].

6 stabilizing pid controllers

great advances have been made in recent years on the more theoretical issue of determining the complete set of stabilizing pid controllers for a given process [33]. for example, if a first-order-plus-dead-time (fopdt) process

p(s) = k e^(−l s) / (t s + 1) (2)

is considered, the stabilizing pid parameters can be computed by solving fairly simple equations (numerically) [32]. similar procedures can also be employed for integral processes [25]. it should be stressed that knowledge of the set of stabilizing controllers provides information related to the robustness and/or fragility of the controller, and can also be employed for implementing advanced tuning techniques.

7 performance assessment and retuning

in many practical cases, pid controllers are poorly tuned because of lack of time and lack of skill of the operator. as there are hundreds of control loops in large plants, it is almost impossible for operators to monitor each of them manually. it is therefore important to have automatic tools that are first able to assess the performance of a control system and, in the event that it is not satisfactory, to suggest a way to solve the problem (for example, if bad controller tuning is detected, new appropriate controller parameter values are determined). many performance assessment methodologies have been proposed in the literature and have been applied successfully in industrial settings [14]. they are generally divided into two categories: stochastic performance monitoring, in which the ability of the control system to cope with stochastic disturbances is of main concern (works that fall into this class mainly rely on the concept of minimum variance control), and deterministic performance monitoring, in which performances related to more traditional design specifications, e.g. set-point and load disturbance rejection step response parameters, are taken into account.

restricting the analysis to the tuning assessment of pid controllers, methods for determining the minimum variance pid controller have been proposed in [16, 42]. regarding deterministic performance monitoring, a practical approach has been proposed in [38, 39, 40]. it is based on a process parameter estimation procedure which uses the set-point step response (i.e. routine operating data). its rationale relies on the so-called "half rule", which states that the largest neglected (denominator) time constant is distributed evenly to the effective dead time and the smallest retained time constant. the performance that is obtained is then evaluated by comparing the integrated absolute error with the error that would have been obtained by applying the simc tuning rule, which is considered as a benchmark, i.e.

iae = 2al, (3)

where a is the amplitude of the set-point step signal. in particular, for a fopdt process (2), the gain is estimated as

k = a t_i / (k_p ∫_0^∞ e(t) dt). (4)
then, the sum of the time constant and the dead time can be determined as

l + t = lim_{t→+∞} (1/a) ∫_0^t e_u(v) dv, (5)

where

e_u(t) = k u(t) − y(t). (6)

because the apparent dead time l can be determined by considering the time interval from the application of the step signal to the set-point to the time instant when the process output attains 2 % of the new set-point value a (a suitable noise band can be employed in practical cases to cope with measurement noise), the value of t can easily be determined from (5). with knowledge of the process parameters, the performance can be assessed by considering the following performance index (see (3)):

j = 2al / ∫_0^∞ |e(t)| dt, (7)

and, in the event that the performance is not satisfactory (in principle it should be j = 1, but from a practical point of view j > 0.6 can be considered acceptable), the pid controller can be retuned by applying the simc tuning rule (or some other suitable rule).

for integral processes, the procedure is similar. the process model is

p(s) = k e^(−l s) / (s (t s + 1)) (8)

and the sum of the lags and the dead time of the process can be determined as

l + t = lim_{t→+∞} (1/a) ∫_0^t e_u(v) dv, (9)

where

e_u(t) := k ∫_0^t u(v) dv − y(t). (10)

the process gain k can be determined as

k = a t_i / (k_p ∫_0^∞ ∫_0^t e(v) dv dt). (11)

if the set-point following task is of concern and a pid controller is employed, the performance index becomes

j = 3.45al / ∫_0^∞ |e(t)| dt. (12)

it is worth noting that this methodology can also be applied, with appropriate modifications, when a load disturbance step response is available.
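for a self-regulating (fopdt) loop, the whole assessment reduces to a few integrals of routine set-point step-response data; a minimal numerical sketch of expressions (4), (5) and (7), assuming uniformly sampled signals expressed as deviations from the initial steady state (the arrays are illustrative placeholders):

```python
import numpy as np

def assess_fopdt(t, r, y, u, kp, ti):
    """estimate the process gain k and l + t from a set-point step, expressions
    (4)-(5), and return the performance index j of expression (7); signals are
    deviation variables sampled on the uniform grid t."""
    a = r[-1] - r[0]                       # amplitude of the set-point step
    e = r - y
    k = a * ti / (kp * np.trapz(e, t))     # expression (4)
    eu = k * u - y                         # expression (6)
    l_plus_t = np.trapz(eu, t) / a         # expression (5)
    # apparent dead time: first instant at which y reaches 2 % of the step
    l = t[np.argmax(y - y[0] >= 0.02 * a)]
    j = 2.0 * a * l / np.trapz(np.abs(e), t)
    return k, l_plus_t, l, j
```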
8 fractional-order pid controllers

a topic which continues to be the subject of many investigations is fractional-order pid (fopid) controllers, which can be considered a generalization of standard integer-order pid controllers, where the orders of integration and differentiation are not necessarily integer [20, 36]. the typical formulation of a fopid controller is

c(s) = k_p (1 + 1/(t_i s^λ) + t_d s^µ), (13)

where λ and µ are the noninteger orders of the integral and derivative terms, respectively. an alternative form (which includes the filter of the derivative term) is

c(s) = k_p · (t_i s^λ + 1)/(t_i s^λ) · (t_d s^µ + 1)/((t_d/n) s + 1). (14)

fopid controllers have the great advantage of providing more flexibility in the design, as the user can tune the orders of integration and differentiation in addition to the proportional gain and the integral and derivative time constants. this implies that the frequency response of the open-loop system can be shaped with more degrees of freedom, thus allowing the user to meet more control requirements. for example, the iso-damping property can be pursued, namely the capability of the control system to achieve the same phase margin (that is, the same overshoot in the set-point step response) independently of (moderate) variations of the process gain [4]. further, the minimum integrated absolute error when a step signal is applied to the set-point or to the load disturbance can be decreased with respect to standard pid controllers [24, 26]. however, fopid controllers are more difficult to implement (the fractional controller has to be approximated by a usually high-order integer-order controller) and, in spite of the theoretical results that have recently been achieved, there is still much work to be done before they are ready for widespread use in industrial settings. in particular, the effectiveness of the tuning rules that have been devised needs to be fully demonstrated, and the substitution of classical pid controllers with fopid controllers in control structures (see section 4) still has to be addressed, as has the presence of the additional functionalities mentioned above (e.g. set-point weight, anti-windup, feedforward action, etc.) which makes standard industrial controllers successful in practical applications.

9 event-based pid controllers

the recent introduction of wireless transmitters and wireless actuators in the process industry has motivated new interest in pid modifications that allow effective control using nonperiodic information updates. indeed, the underlying assumption in process control has always been that control is executed on a periodic basis, and that a new measurement value is available at each execution. however, it is well known that in some processes a small stationary control error or smooth oscillations of the process output around the set-point may not constitute hard design constraints, while a reduction in the information exchanged between the agents that take part in the control loop (sensors, controllers, actuators) can be one of the tightest requirements. in fact, when wireless sensors and actuators are involved, a reduction of the information flow implies a decrement of computing operations and transmissions, and thus a longer lifetime of the batteries. with these demands, one of the most convenient strategies is to use event-based sampling and control approaches. in particular, in order to minimize power consumption, a wireless transmitter may transmit a new measurement only if the measurement has changed by a significant amount or, in general, when a logical condition becomes true. in the process control field, the logical condition is usually a composition of boolean operations, where the variables are the signals (or functions of them, such as an estimate, the derivative, the integral, etc.) that the sensor receives from the process, or the control action produced by a controller [1, 31, 37].

in general, in event-based control strategies, the controller can be divided into four logical blocks, as shown in figure 2: the sensor unit (su), the control unit (cu), the actuator unit (au) and a governor (g).

figure 2: scheme of a generic event-based control strategy; the dashed arrows indicate the possibility of event-triggered data transmission.

the units and their tasks can be described as follows:

• the sensor unit is composed of the sensor and its on-board intelligence. its task is to measure the process output and to calculate the error between the measured signal and a constant set-point value received from the governor.

• the control unit implements the control algorithm, which determines the control action by taking into account the last received sampled error, and sends it to the actuator unit.

• the actuator unit receives the control action signal from the control unit and applies it to the actuator.

• the governor, which in practice can be implemented together with one of the previous blocks, receives the desired set-point value from a user interface or from a hierarchically higher controller, and sends it to the sensor unit.

these blocks can be implemented in a single machine or in two or more physical entities. in the latter case, the data has to be sent from one to the others over a network. it is clear that communication between two entities implies more effort than exchanging data within a single machine, especially when they are battery-powered.
for this reason, it is recommended to use event-triggered data exchange for all the signals sent between two machines, and normal time-driven sampling for the data elaborated within a single machine. this implies that the control system has to be designed to deal effectively with an asynchronous sampling rate, and this opens new challenges both from a theoretical viewpoint and from a practical viewpoint, as the timing of the events influences the system performance and limit cycles may arise. further, in addition to the pid gains, there are in general other parameters (threshold values) employed in the control algorithm that have to be tuned, thus making the overall control design more complex. for reasons such as these, in recent years event-based sampling and control techniques have received special attention from several research groups, and various effective methodologies have been proposed.

among the various solutions, it is worth mentioning the pidplus technique, which has already been applied successfully in industry [6]. it involves restructuring the pid controller to reflect the reset contribution for the expected process response since the last measurement update. the so-called symmetric send-on-delta (ssod) pi controller has also recently been presented [5]. the approach employed is based on quantization of the sampled signal by a multiple of a given parameter ∆, so that the relationship between the input and output of the event-generator block is symmetric with respect to the origin. in this context, necessary and sufficient conditions on the controller parameters for the existence of equilibrium points without limit cycles can be determined, and this can be exploited to devise effective tuning rules. it is also shown that the choice of the parameter ∆ does not influence the stability of the system, and it can therefore be selected just to handle the trade-off between reducing the desired number of events and reducing the steady-state error.
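the event generator at the heart of such a scheme is small; the following is a minimal sketch of a symmetric send-on-delta sampler in the spirit of [5]: the output is held at a multiple of ∆, and a new event fires only when the input has moved by ∆ from the last transmitted level. it illustrates the idea only, not the published algorithm or its tuning conditions.

```python
class Ssod:
    """symmetric send-on-delta sampler: the transmitted value is a multiple of
    delta, symmetric with respect to the origin; an event fires only when the
    input deviates from the last transmitted level by at least delta."""

    def __init__(self, delta):
        self.delta = delta
        self.level = 0            # last transmitted multiple of delta

    def sample(self, x):
        """returns (event_fired, transmitted_value)."""
        if abs(x - self.level * self.delta) >= self.delta:
            self.level = int(x / self.delta)   # truncation keeps the map symmetric
            return True, self.level * self.delta
        return False, self.level * self.delta

s = Ssod(delta=0.5)
for x in (0.1, 0.4, 0.6, 0.2, -0.7):
    print(s.sample(x))    # events fire only at 0.6 and -0.7
```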
10 cacsd tools

the availability of more and more sophisticated and high-performance software tools has favored the realization of more and more effective computer-aided control system design (cacsd) tools specifically designed for pid-based control systems [10, 12, 30]. cacsd tools surely contribute to the rapid dissemination of new methodologies among researchers and, most of all, make their applicability much easier in industry, as theoretical issues can be made transparent to the user. moreover, the user can often better understand the design procedure and the physical meaning of the design parameters. it is therefore recognized that each newly proposed design methodology should also be made available in a suitable software tool.

11 conclusions

despite the long history of pid controller research and application, there are still many open issues and challenges related to many different aspects of the overall industrial control system design. indeed, the need to keep improving the performance of control systems while keeping them as simple as possible, together with the need to satisfy technological requirements, calls for new methodologies to be applied in this field. a (surely not exhaustive) selection of relevant topics that are currently the subject of investigation has been presented here, highlighting in particular the requirements that newly-devised methodologies should satisfy.

acknowledgements

the author thanks manuel beschi, jesús chacón, sebastián dormido, fabrizio padula, aurelio piazzi, josé sánchez and massimiliano veronesi for collaborating with him in various research fields on pid controllers.

references

[1] k. e. årzèn. a simple event-based pid controller. in proceedings of the 14th ifac world congress. beijing, china, 1999.
[2] k. j. åström, t. hägglund. revisiting the ziegler–nichols step response method for pid control. journal of process control 14:635–650, 2004.
[3] k. j. åström, t. hägglund. advanced pid control. isa press, research triangle park, usa, 2006.
[4] r. s. barbosa, j. a. tenreiro machado, i. m. ferreira. tuning of pid controllers based on bode's ideal transfer function. nonlinear dynamics 38:305–321, 2004.
[5] m. beschi, s. dormido, j. sanchez, a. visioli. characterization of symmetric send-on-delta pi controllers. journal of process control, 2012.
[6] t. l. blevins. pid advances in industrial control. in proceedings of the ifac conference on advances in pid control. brescia, italy, 2012.
[7] j. garrido, f. vázquez, f. morilla. centralized multivariable control by simplified decoupling. journal of process control 22:1044–1062, 2012.
[8] e. grassi, k. tsakalis. pid controller tuning by frequency loop-shaping: application to diffusion furnace temperature control. ieee transactions on control systems technology 8:842–847, 2000.
[9] c. grimholt, s. skogestad. optimal pi-control and verification of the simc tuning rule. in proceedings of the ifac conference on advances in pid control. brescia, italy, 2012.
[10] j. l. guzman, k. j. åström, s. dormido, et al. interactive learning modules for pid control. ieee control systems magazine 28:118–134, 2008.
[11] j. l. guzmán, t. hägglund. simple tuning rules for feedforward compensators. journal of process control 21:92–102, 2011.
[12] j. l. guzmán, p. garcía, t. hägglund, et al. interactive tool for analysis of time-delay systems with dead-time compensators. control engineering practice 16:824–835, 2008.
[13] t. hägglund. signal filtering in pid control. in proceedings of the ifac conference on advances in pid control. brescia, italy, 2012.
[14] m. jelali. an overview of control performance assessment technology and industrial applications. control engineering practice 14:441–466, 2006.
[15] m. a. johnson, m. h. moradi (eds.). pid control: new identification and design methods. springer, london, uk, 2005.
[16] b.-s. ko, t. f. edgar. pid control performance assessment: the single-loop case. aiche journal 50:1211–1218, 2004.
[17] a. leva, m. maggio. a systematic way to extend ideal pid tuning rules to the real structure. journal of process control 21:130–136, 2011.
[18] a. leva, s. negro, a. v. papadopoulos. pi/pid autotuning with contextual model parametrisation. journal of process control 20:452–463, 2010.
[19] t. liu, w. zhang, f. gao. analytical decoupling control strategy using a unity feedback control structure for mimo processes with time delays. journal of process control 17:173–186, 2007.
[20] c. a. monje, b. m. vinagre, v. feliu, y. q. chen. tuning and auto-tuning of fractional order controllers for industry applications. control engineering practice 16:798–812, 2008.
[21] p. nordfeldt, t. hägglund. decoupler and pid controller design of tito systems. journal of process control 16:923–936, 2006.
[22] j. e. normey-rico, e. f. camacho. dead-time compensators: a survey. control engineering practice 16:407–428, 2008.
[23] a. o'dwyer. handbook of pi and pid tuning rules. imperial college press, 2006.
[24] f. padula, a. visioli. tuning rules for optimal pid and fractional-order pid controllers. journal of process control 21:69–81, 2011.
[25] f. padula, a. visioli. on the stabilizing pid controllers for integral processes. ieee transactions on automatic control 57:494–499, 2012.
[26] f. padula, a. visioli. optimal tuning rules for proportional-integral-derivative and fractional-order proportional-integral-derivative controllers for integral and unstable processes. iet control theory and applications 6:776–786, 2012.
[27] m. petersson, k. e. årzèn, t. hägglund. a comparison of two feedforward control structure assessment methods. international journal of adaptive control and signal processing 17:609–624, 2003.
[28] a. piazzi, a. visioli. a noncausal approach for pid control. journal of process control 16:831–843, 2006.
[29] s. piccagli, a. visioli. an optimal feedforward control design for the set-point following of mimo processes. journal of process control 19:978–984, 2009.
[30] pid controller laboratory. http://www.pidlab.com.
[31] j. sánchez, a. visioli, s. dormido. a two-degree-of-freedom pi controller based on events. journal of process control 21:639–651, 2011.
[32] g. j. silva, a. datta, s. p. bhattacharyya. new results on the synthesis of pid controllers. ieee transactions on automatic control 47:241–252, 2002.
[33] g. j. silva, a. datta, s. p. bhattacharyya. pid controllers for time-delay systems. birkhauser, boston, usa, 2005.
[34] s. skogestad. simple analytic rules for model reduction and pid controller tuning. journal of process control 13:291–309, 2003.
[35] s. tavakoli, i. griffin, p. j. fleming. tuning of decentralized pi (pid) controllers for tito processes. control engineering practice 14:1069–1080, 2006.
[36] d. valerio, j. sa da costa. introduction to the single-input, single-output fractional control. iet control theory and applications, pp. 1033–1057, 2011.
[37] v. vasyutynskyy, k. kabitzsh. event-based control: overview and generic model. in proceedings of the 8th ieee international workshop on factory communication systems, 2010.
[38] m. veronesi, a. visioli. performance assessment and retuning of pid controllers. industrial and engineering chemistry research 48:2616–2623, 2009.
[39] m. veronesi, a. visioli. an industrial application of a performance assessment and retuning technique for pi controllers. isa transactions 49:244–248, 2010.
[40] m. veronesi, a. visioli. performance assessment and retuning of pid controllers for integral processes. journal of process control 20:261–269, 2010.
[41] m. veronesi, a. visioli. a simultaneous closed-loop automatic tuning method for cascade controllers. iet control theory and applications 5:263–270, 2011.
[42] m. veronesi, a. visioli. global minimum-variance pid control. in proceedings of the ifac world congress, pp. 7891–7896. milan, italy, 2012.
[43] r. vilanova. imc based robust pid design: tuning guidelines and automatic tuning. journal of process control 18:61–70, 2008.
[44] r. vilanova, o. arrieta, p. ponsa. imc based feedforward controller framework for disturbance attenuation on uncertain systems. isa transactions 48:439–448, 2009.
[45] r. vilanova, a. visioli (eds.). pid control in the third millennium: lessons learned and new approaches. springer, london, uk, 2012.
[46] a. visioli, q.-c. zhong. control of integral processes with dead time. springer, london, uk, 2010.
[47] a. visioli. fuzzy logic based set-point weight tuning of pid controllers. ieee transactions on systems, man, and cybernetics, part a 29:587–592, 1999.
[48] a. visioli. modified anti-windup scheme for pid controllers. iee proceedings control theory and applications 150(1):49–54, 2003.
[49] a. visioli. a new design for a pid plus feedforward controller. journal of process control 14:455–461, 2004.
[50] a. visioli. design and tuning of a ratio controller. control engineering practice 13:485–497, 2005.
[51] a. visioli. a new ratio control architecture. industrial and engineering chemistry research 44:4617–4624, 2005.
[52] a. visioli. practical pid control. springer, london, uk, 2006.
[53] a. visioli, a. piazzi. improving set-point following performance of industrial controllers with a fast dynamic inversion algorithm. industrial and engineering chemistry research 42:1357–1362, 2003.
[54] q.-g. wang, z. ye, w. j. cai, c. c. hang. pid control for multivariable processes. springer, london, uk, 2008.
[55] c. c. yu. autotuning of pid controllers: a relay feedback approach. springer, london, uk, 2007.

optical detection of charged biomolecules: towards novel drug delivery systems

v. petráková

abstract this paper presents work done on developing optically-traceable intracellular nanodiamond sensors, where the photoluminescence can be changed by a biomolecular attachment/delivery event. their high biocompatibility, small size and stable luminescence from their color centers make nanodiamond (nd) particles an attractive alternative to molecular dyes for drug-delivery and cell-imaging applications. in our work, we study how surface modification of nd can change the color of nd luminescence (pl). this method can be used as a novel detection tool for remote monitoring of chemical processes in biological systems. recently, we showed that pl can be driven by atomic functionalization, leading to a change in the color of nd luminescence from red (oxidized nd) to orange (hydrogenated nd). in this work, we show how the pl of nd changes similarly when interacting with positively and negatively charged molecules. the effect is demonstrated on fluorinated nd, where the high dipole moment of the c-f bond is favorable for the formation of non-covalent bonds with charged molecules. we model this effect using electrical potential changes at the diamond surface. the final aim of the work is to develop a "smart" optically traceable drug carrier, where the delivery event is optically detectable.

keywords: nanodiamond, drug delivery, nitrogen-vacancy center, luminescence properties, charged molecules.

1 introduction

fluorescent cellular biomarkers play an essential role in biology and medicine for in-vitro and in-vivo imaging in living cells. luminescent nanodiamond (nd) has recently been suggested as a novel optical marker for cellular imaging [1, 2]. nd offers advantages over classical fluorescent markers used for in vivo and in vitro imaging in living cells: it offers cellular delivery combined with strong and stable photoluminescence (pl) originating from the nitrogen-vacancy (nv) or other lattice point defects.
nd is biocompatible, and its surface can easily be terminated with various groups and additionally functionalized with biomolecules [3, 4], making nd a suitable carrier for cellular targeting or drug delivery. in this work we show a method for changing the pl properties of nd in a biological environment by surface-termination-induced changes leading to the development of an electric field close to the nd surface. the nv center is observed in two charge states, negative and neutral, each with different pl properties [5]. a negatively charged nv center (nv−) emits around 637 nm and a neutral center (nv0) emits around 575 nm. under standard conditions, both states are observed, but the nv− related luminescence is more intense. a single nv center can exist in both states [6], depending on its surroundings.

in this work we demonstrate the influence of the presence of charged molecules (biopolymers) on changes in the occupation of the nv− and nv0 states in fluorinated nd, with nv− quenching upon interaction with positively charged polymers. the high dipole moment of f-terminated diamond attracts positively charged molecules, leading to the creation of a hole accumulation layer and therefore to a charge transfer from the diamond to the surrounding polymer, i.e. an upward surface band bending. we use these effects to induce changes in the occupation of the nv0/− centers lying in the band-bending zone. we demonstrate these effects on nv centers in high-pressure high-temperature nd of 20–50 nm in size, produced by irradiation and annealing. the results are supported by theoretical modeling of the density-of-states distribution for various surface interactions.

2 experimental

the effects presented in this work were studied on hpht type ib nd 40–50 nm in size, sourced from microdiamant ag, switzerland (size-selected from the commercial product msy 0–0.05 gaf), containing about 100–120 ppm nitrogen (figure 1).

fig. 1: afm picture of the nd particles.

a commercial solution of nd was lyophilized and heated in air at 425 °c for 5 h to remove any sp2-carbon-containing layer. the resulting pale grey powder was dispersed in water and deposited in the form of a thin film on a target backing (10 mg·cm−2) for ion implantation. the nd was then irradiated using an external proton beam produced by a u-120m isochronous cyclotron. the angle of the target backing with respect to the beam direction was 10°. the fluence of the delivered beam was 9.2 · 10^15 cm−2, the beam energy 5.4 mev and the beam current 0.6 μa. the irradiated nd was thermally annealed in a vacuum at 710 °c for 2 h to create nv centers by trapping the vacancies next to the nitrogen atoms. the nd was then oxidized in a mixture of concentrated h2so4-hno3 (9:1, vol/vol) at 80 °c for 7 days. the reaction mixture was diluted with deionized water, and the nd was separated by centrifugation and subsequently washed with 0.1 m naoh, 0.1 m hcl, and finally washed three times with water. the solution was lyophilized, providing highly fluorescent nd in the form of a stable colloidal monodispersion in water, as confirmed by afm and dls (final size 20–25 nm). the colloidal dispersion was stable after 2 months, with no sedimentation. after the measurement, the quartz plate with nd grains was exposed to microwave-excited hydrogen plasma for 30 minutes at a temperature of 500 °c and a pressure of 1 mbar to produce an h-terminated surface.
finally, the h-terminated samples were fluorinated by ionic fluorination in a mixture with an aqueous solution of hydrogen fluoride and fluorine gas; after saturation of the solution, the suspension reacts under pressure for 2 days. raman and pl (514 nm) spectra were recorded using a renishaw invia raman microscope; the spectra were taken at room temperature and were all normalized to the diamond raman peak. the afm measurements were performed in tapping mode (111 khz) with an ntegra prima nt-mdt system equipped with a soft ha nc etalon tip. the polymers chosen as positively charged molecules were poly(diallyldimethylammonium chloride) (pdadmac) and polyallylamine (paa.hcl); polyacrylic acid sodium salt (pana) and polystyrene sulfonic acid sodium salt (pssna) were used as negatively charged molecules.

3 results

3.1 comparison of pl on variously terminated nd

the pl of nv centers is sensitive to the surface termination, as shown in our previous work [7]; however, the effect of surface fluorination has not been described yet. figure 2a shows the pl spectra of fluorinated, oxidized and hydrogenated nd, where we clearly see the difference in the nv−/nv0 ratio: the nv− luminescence is most pronounced in fluorinated nd.

fig. 2: comparison of the pl of variously treated nd. a) pl spectra taken at 514 nm excitation in a liquid colloid solution, showing increased nv− related luminescence for fluorinated nd in comparison to oxidized and hydrogenated nd. b) the counted nv−/nv0 ratio of the treated nd; the ratio was counted from the nv− and nv0 zero phonon lines (zpl), the gray line in figure 2a.
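the ratio in figure 2b can be read off by comparing the spectrum around the two zero phonon lines; the sketch below uses a crude windowed mean in place of the zpl evaluation indicated in the figure, on a synthetic raman-normalized spectrum. the arrays and the window width are illustrative assumptions; only the zpl positions (575 nm and 637 nm) come from the text.

```python
import numpy as np

def zpl_ratio(wavelength_nm, intensity, zpl_nv0=575.0, zpl_nvm=637.0, window=2.0):
    """nv-/nv0 ratio from the mean intensity in a small window around each
    zero phonon line (a crude stand-in for fitting the zpls)."""
    def band(center):
        mask = np.abs(wavelength_nm - center) <= window
        return intensity[mask].mean()
    return band(zpl_nvm) / band(zpl_nv0)

# an illustrative synthetic spectrum: two gaussian zpl peaks on a flat background
wl = np.linspace(550.0, 750.0, 2001)
spectrum = (1.0 + 0.8 * np.exp(-(wl - 575.0) ** 2 / 8.0)
            + 1.6 * np.exp(-(wl - 637.0) ** 2 / 8.0))
print(zpl_ratio(wl, spectrum))    # > 1 when the nv- line dominates, as for f-nd
```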
3.2 interaction with charged polymers

in a mixture of nanoparticles with charged molecules, non-covalent bonds are always formed between the charged molecule and the nanoparticle. the strength of the bond differs, and depends on many factors, e.g. the ability to form hydrogen bonds or the size and polarity of the surface dipole moment of the nanoparticle. if the charged molecule is strongly attracted, charge transfer can occur, depending on the homo and lumo energetic levels of the nanoparticle and the chemical potential of the molecule.

fig. 3: the negative and positive electric field formed in the close surface proximity of nd after the interaction with charged polymers

fig. 4: the pl spectra of fluorinated nd interacting with the negative and positive electric field formed in the close surface proximity of nd after the interaction with charged polymers

four different types of polymers were chosen for this experiment (see experimental for details). the polymer size did not exceed the size of nd, favoring the creation of stronger bonds. the colloidal solution of nd was mixed with polymers at 10× higher concentration, enabling saturation of the surface with polymers. the interaction between the polymers and nd is schematically shown in figure 3. figure 4a shows the changes in the pl spectra of fluorinated nd when interacting with charged polymers. figure 4b shows the pl of the interaction with the same charged polymers, but using oxidized nd. the luminescence of the nv centers clearly decreased on interacting with positively charged molecules, while after adding negatively charged polymers the luminescence was restored to the original level. when comparing the effects observed on fluorinated nd with the effects on oxidized nd, we find that the effect is much stronger for fluorinated nd. this difference could be due to the different properties of the carbon–fluorine (c–f) and carbon–oxygen (c–o) bonds. the electron affinity of the c–f bond is much higher (1.45) than the affinity of the c–o bond (0.9). this leads to a stronger attraction of positively charged polymers to the fluorinated surface. moreover, concerning the hydrogen bond, a fluorinated surface can only be an acceptor of a hydrogen bond, whereas the oxidized surface (containing carboxyl, carbonyl, hydroxyl or lactone groups) can be both a donor and an acceptor of the hydrogen bond.

3.3 in-vivo imaging in chicken embryos

fluorinated nds were used for in-vivo luminescence imaging in chicken embryos. the results (figure 5) show the important fact that the intensity of the luminescence from fluorinated nd is strong enough to be detected in a commercially available confocal microscope used for standard luminescence imaging. the method is therefore suitable for optically traceable drug delivery systems.

fig. 5: luminescence confocal image of fluorinated nd in a chicken embryo. nd visible as green dots in the picture

4 modeling the effect

to explain the observed effect, we modeled the energetic balance near the surface by numerical solution of the poisson equation using the boltzmann distribution. for the depth (x) dependent total space charge density ρ(x) we can write
$$\rho(x) = e N_V \exp\left(-\frac{(E_F - E_V)_x}{kT}\right), \qquad (1)$$
where $N_V$ is the temperature-dependent effective density of states at the valence band maximum ($N_V = 2.7\cdot10^{19}\,\mathrm{cm^{-3}}$ at room temperature), $e$ is the elementary charge, $k$ is the boltzmann constant and $T$ is the thermodynamic temperature. $E_F$ and $E_V$ are the energetic levels of the fermi level and the valence band. the poisson equation reads
$$\frac{d^2 (E_F - E_V)_x}{dx^2} = \frac{e}{\varepsilon\varepsilon_0}\,\rho(x), \qquad (2)$$
where $\varepsilon\varepsilon_0$ is the permittivity of the material. the poisson equation was solved numerically with the boundary conditions
$$(E_F - E_V)_0 = kT \ln\left(\frac{N_V}{p_0}\right), \qquad (3)$$
$$p_0 = e\,(N_A \cdot \zeta), \qquad (4)$$
where $p_0$ is the total unscreened positive charge at $x = 0$ from (1), and $N_A$ is the density of the surface acceptors. the density of the surface acceptors was calculated by solving the boltzmann–poisson equation, taking into account the different electron affinity values described above. the electron affinity was introduced into the model as the parameter $\zeta$; in our calculations, $\zeta$ was set to 1.4 and 0.9. the results of the mathematical modeling are shown in figure 6.

fig. 6: dos calculations of surface band bending for two different situations: a) fluorinated nd, b) oxidized nd. the unoccupied (dark) state where luminescence of the nv centers cannot occur is larger for the fluorinated surface than for the oxidized surface

the modeling clearly explains the reduced effect observed for the oxidized surface.
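to illustrate the structure of the boundary value problem (1)–(4), the sketch below integrates its dimensionless reduction $v'' = e^{-v}$, where $v = (E_F - E_V)/kT$ and depth is measured in units of the screening length $\sqrt{\varepsilon\varepsilon_0 kT/(e^2 N_V)}$. the flat initial slope and the $N_V/N_A$ ratio are illustrative assumptions, not values from this work.

```python
import numpy as np

# Dimensionless sketch of the boltzmann-poisson problem (1)-(4):
# with v = (E_F - E_V)/kT and depth in units of the screening length,
# eqs. (1)-(2) reduce to v'' = exp(-v). The N_V/N_A ratio used for the
# boundary value (3)-(4) is an illustrative assumption.

def band_profile(zeta, nv_over_na=1e3, x_max=5.0, n=5000):
    """Integrate v'' = exp(-v) from the surface with v(0) = ln(N_V/p0)."""
    v = np.log(nv_over_na / zeta)     # boundary condition from eqs. (3)-(4)
    dv, h = 0.0, x_max / n            # flat slope at x = 0 (assumed)
    out = [v]
    for _ in range(n):
        dv += h * np.exp(-v)          # v'' = exp(-v)
        v += h * dv
        out.append(v)
    return np.array(out)

# larger zeta (c-f) pulls (E_F - E_V) at the surface lower, i.e. stronger
# hole accumulation, consistent with the comparison in figure 6:
for zeta, label in [(1.4, "c-f"), (0.9, "c-o")]:
    prof = band_profile(zeta)
    print(label, prof[0].round(2), prof[-1].round(2))
```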
5 conclusions

the luminescent properties of nv defects engineered in hpht nd interacting with charged molecules have been studied. it was found that the luminescence of nv centers is sensitive to the surface treatment. the nv luminescence fell significantly after hydrogenation of the surface and increased after fluorination, in comparison with the standardly used oxidized surface. additionally, it was found that the pl of fluorinated nd can be strongly influenced by the presence of charged molecules. this can be further used for in-vivo optical detection of charged molecules in cells/smart drug delivery systems. the observed effects have been explained by numerical modeling. the final result of this study was an in-vivo application of luminescent nd in a chicken embryo, showing the detectability of luminescent nd in a standard confocal microscope.

acknowledgement

special thanks to the author's supervisor, prof. miloš nesládek, hasselt university, belgium, for leading the research project, to petr cígler and miroslav ledvina for their help with the surface modifications, to františek fendrych and andrew taylor for their help concerning the properties and characteristics of diamond and for managing the research project, and finally to jan štursa and jan kučka for the opportunity to irradiate nd particles in the cyclotron at ujv řež. the author acknowledges financial support from the academy of sciences of the czech republic (grants kan200100801, kan301370701 & kan400480701), the european r&d projects (fp7 itn grant no. 238201 – matcon, no. 245122 dinamo and cost mp0901 – ld 11076 and ld 11078), and msm6840770012 "transdisciplinary research in the field of biomedical engineering ii".

references

[1] ho, d.: beyond the sparkle: the impact of nanodiamonds as biolabeling and therapeutic agents. acs nano, 2009, vol. 3, no. 12, p. 3825–3829.
[2] fu, c. c., lee, h. y., chen, k., lim, t. s., wu, h. y., lin, p. k., wei, p. k., tsao, p. h., chang, h. c., fann, w.: characterization and application of single fluorescent nanodiamonds as cellular biomarkers. proc. natl. acad. sci. usa, 2007, vol. 104, no. 3, p. 727–732.
[3] liu, k. k., cheng, c. l., chang, c. c., chao, j. i.: biocompatible and detectable carboxylated nanodiamond on human cell. nanotechnology, 2007, vol. 18, p. 325102.
[4] krueger, a.: the structure and reactivity of nanoscale diamond. j. materials and chemistry, 2008, vol. 18, no. 13, p. 1485–1492.
[5] davies, g., hamer, m. f.: optical studies of the 1.945 ev vibronic band in diamond. proc. r. soc. lond. a, 1976, vol. 348, p. 285–298.
[6] iakoubovskii, k., adriaenssens, g. j., nesladek, m.: photochromism of vacancy-related centres in diamond. j. phys.: cond. matter, 2000, vol. 12, p. 189–199.
[7] petráková, v., nesládek, m., et al.: luminescence of nanodiamond driven by atomic functionalization: towards novel detection principles. submitted to adv. func. mater., 2011.

about the author

vladimíra petráková was born in prague, czech republic. she graduated with a master degree from the faculty of biomedical engineering, czech technical university (fbe, ctu) in june 2009. in october 2009 she joined the laboratory of materials for nanosystems and biointerfaces at the institute of physics, academy of sciences. currently she is in the second year of her doctoral studies at fbe, ctu. she twice received the josef hlávka prize for the best students and graduates (2007, 2009), and received the dean's prize for an excellent bachelor thesis in 2007. her professional interests are luminescence centers in nanodiamond particles, high-resolution optical systems, surface chemistry, biosensors, and also neural circuits and data analysis. in 2010 she received a young investigator award for the best oral presentation at the european diamond conference in budapest, hungary, and also at the mrs fall meeting in boston, usa, for her work describing the mechanism and control of switching between the neutral and negative charge state of the nv center in diamond. other interests are family, backpacking and music.

vladimíra petráková
e-mail: vladimira.petrakova@fbmi.cvut.cz
faculty of biomedical engineering
czech technical university in prague
sitna sq.
3105, 272 01 kladno, czech republic
institute of physics as cr
na slovance 5, 185 00 prague 8, czech republic

acta polytechnica 53(2):123–126, 2013
© czech technical university in prague, 2013
available online at http://ctn.cvut.cz/ap/

growth and physical structure of amorphous boron carbide deposited by magnetron sputtering on a silicon substrate with a titanium interlayer

roberto caniello (a,∗), espedito vassallo (a), anna cremona (a), giovanni grosso (a), david dellasega (b), maurizio canetti (c), enrico miorin (d)

a cnr, institute of plasma physics 'p. caldirola', milan, italy
b department of energy, polytechnic of milan, milan, italy
c cnr, institute for macromolecular studies, milan, italy
d cnr, institute for energetics and interphases, padua, italy
∗ corresponding author: caniello@ifp.cnr.it

abstract. multilayer amorphous boron carbide coatings were produced by radiofrequency magnetron sputtering on silicon substrates. to improve the adhesion, titanium interlayers with different thicknesses were interposed between the substrate and the coating. above three hundred nanometers, the enhanced roughness of the titanium led to the growth of an amorphous boron carbide with a dense and continuous columnar structure, and no delamination effect was observed. correspondingly, the adhesion of the coating became three times stronger than in the case of a bare silicon substrate. the physical structure and microstructural properties of the coatings were investigated by means of scanning electron microscopy, atomic force microscopy and x-ray diffraction. the adhesion of the films was measured by a scratch tester.

keywords: boron carbide, magnetron sputtering, titanium, interlayer, scratch test.

1. introduction

boron carbide (b4c) is one of the most relevant materials because of its very interesting characteristics, such as high hardness, good electronic and tribological properties, and chemical and thermal stability [24, 18, 9]. at room temperature, boron carbide is the third hardest known material, and above 1100 °c it is the hardest one [22]. films a few microns thick show good performance on cutting tools [9] and can be used as mirrors with high reflectivity in the ultraviolet range [4]. boron-based coatings can also be very useful in neutron detection applications [17] due to their high neutron absorption cross section. several techniques, such as chemical vapor deposition, plasma enhanced chemical vapor deposition [1], hot filament chemical vapor deposition [6], ion beam assisted evaporation [7], and vacuum arc deposition technology [11], have been utilized to synthesize boron carbide films. magnetron sputtering [5] is one of the most widely used techniques for thin film deposition on an industrial scale, since it can be applied at low temperature and without dangerous gases. coatings produced by this technique often show internal stress induced by the deposition conditions. in order to reduce the stress of the deposited coatings, many suitable deposition recipes have been studied and optimized in terms of deposition parameters [9, 26, 8], and several methods such as post-process annealing have been explored [13]. in several cases, examples are reported in the literature of the possibility to control thin film stress by using a multilayer structure [14, 25, 23] or to increase the adhesion by interposing a metallic interlayer between the substrate and the coating [20, 12, 10, 19]. mechanical properties, stability over time and adhesion to the substrate are fundamental requirements for achieving useful coatings in many fields of application.
in this study we report on the structure of an amorphous boron carbide (a-b4c) coating prepared by radio frequency (rf) magnetron sputtering as a function of the thickness of a titanium interlayer (ti-i) deposited on a (100) silicon substrate at room temperature. we found that a-b4c films are unstable over time if deposited on bare silicon or in the presence of a ti-i thinner than 100–150 nm. in some cases the coatings start to delaminate and peel off a few hours after being taken out of the deposition chamber. as the ti-i thickness increases further, up to 400 nm, the films show very good stability and adhesion. emphasis is given to the relationship between the substrate roughness and the physical structure of the sputter-deposited a-b4c. the physical structure and microstructural properties of the a-b4c coating and the ti interlayers are determined by scanning electron microscopy (sem), atomic force microscopy (afm) and x-ray diffraction (xrd). a comparison between the adhesion of the a-b4c coating to bare silicon and to a 400 nm ti-coated silicon is reported, as measured by a cetr umt-2 scratch tester.

2. material and methods

amorphous boron carbide films and ti interlayers were prepared by rf (13.56 mhz) magnetron sputtering of b4c and ti targets, respectively. silicon substrates were ultrasonically cleaned with acetone and ethanol, and carefully placed on the grounded substrate holder kept at a distance of 7 cm from the rf powered electrode. the vacuum before deposition was less than 1 × 10^−4 pa, and the substrate temperature was monitored by using a type k thermocouple placed in contact with the sample. high purity ar (99.9 %) gas was introduced into the chamber through a mass flow controller, and a gate valve was used to adjust the pressure during the process. three samples of ti interlayer with thicknesses of 25 nm, 200 nm, and 400 nm were sputter-deposited on si substrates in the same experimental conditions for 4, 30 and 60 minutes, respectively, implying a 6.6 nm/min constant deposition rate. the rf power was fixed at 150 w and the plasma pressure at 1 pa. amorphous boron carbide was deposited on bare and ti-coated silicon substrates with a multilayer structure, which consists of four layers grown at two different pressures. for the first and third layers, the working pressure was fixed at 2 pa, while for the second and fourth layers a pressure of 0.8 pa was used. the morphological properties and physical structure of the films were investigated by atomic force microscopy (afm) and scanning electron microscopy (sem). afm measurements were made in air by a nano-rtm afm system (pacific nanotechnology, santa clara, ca, usa) operating in close contact mode. silicon conical tips of 10 nm radius mounted on silicon cantilevers of 1250 nm length, 42 n/m force constant and 320 khz resonance frequency were used. images were processed and analyzed by means of the nanorule+tm software provided by pacific nanotechnology. sem measurements were performed using a zeiss supra system with an accelerating voltage of 15 kv. the structural properties were studied by x-ray diffraction measurements performed with a wide-angle siemens d-500 diffractometer (waxd) equipped with a siemens fk 60-10 2000 w tube. the radiation was a monochromatized cu kα beam with wavelength λ = 0.15418 nm. the operating voltage and current were 40 kv and 40 ma, respectively.
the data were collected from 10° to 80° 2θ in 0.02° steps by means of a silicon multi-cathode detector vortex-ex (sii). scratch test measurements were made in compliance with the european standard uni en 1071-3-2005, using a cetr umt-2 tester equipped with a rockwell c standard spherical diamond indenter of 50 µm radius and a 400× optical microscope.

3. results and discussion

figure 1 shows sem cross-section micrographs of the boron carbide coatings grown on a bare si substrate and on different ti-i thicknesses. the cross-sectional view allows us to determine the deposited b–c coating thickness, which is about 0.5 micron (obtained by a sequential deposition of four layers), implying a deposition rate of about 0.9 nm/min.

figure 1. sem cross-section images of the boron carbide coatings deposited on (a) bare si, and on (b) 25 nm, (c) 200 nm, (d) 400 nm of ti interlayer thickness.

as expected, amorphous boron carbide starts to grow with columns (fig. 1a) much thinner than the ones deposited on a si substrate with a 25 nm ti interlayer. this can be clearly related to the high flatness of the si substrate (rrms less than 0.1 nm) and to the poor surface diffusion due to the high working pressure. at 25 nm ti-i thickness (fig. 1b), the coating exhibits both a fine columnar structure (layers 1 and 3) and a compact structure (layers 2 and 4). the interface separation of the four layers is quite clear. the four-layer structure becomes less pronounced as the ti interlayer thickness increases further up to 200 nm (fig. 1c): the columnar structure becomes more continuous and the interface separation less visible. as the ti-i thickness increases further (fig. 1d), the layered structure completely disappears and the coating assumes a continuous columnar structure. the coating structure remained columnar as the ti-i thickness increased above 400 nm. we have observed that the coatings without ti-i gradually start to delaminate after being taken out of the coating chamber and brought to atmospheric pressure at room temperature. these stresses can be caused by the large discrepancy between the lattice parameters of the (100) silicon plane and the structure of the growing coating. the delamination effect has also been found with ti-i thicknesses less than 100–150 nm. the introduction of a ti interlayer with a thickness above 200 nm produced stable b–c coatings. the effect of the ti interlayer is likely to decrease the internal stress at the substrate–film interface, thus smoothing the difference in lattice parameters between the thin film and the substrate. x-ray diffraction has been used to characterize the structure. the diffraction spectrum (fig. 2) of the boron carbide deposited on a 400 nm ti interlayer reveals the amorphous character of the coating. this result also applies to the other coatings with different ti-i thicknesses.

figure 2. xrd spectra of amorphous boron carbide deposited on 400 nm of ti interlayer (green line) and bare si (black line); for comparison, the spectrum of the ti interlayer (red line) is also reported.

the analysis shows, besides the reflection at an angle of 69.3° corresponding to the (100) si substrate, some crystallographic orientations of the hexagonal α-titanium phase [2, 21]. the main orientation is (002) at a 2θ angle of 38.5°; three other less intense peaks are visible.
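the (002) assignment can be cross-checked with bragg's law, $d = \lambda/(2\sin\theta)$, using the cu kα wavelength given above. a minimal sketch follows; the tabulated α-ti d-spacing quoted in the comment is an external reference value, not a value from this paper.

```python
import math

# Cross-check of the Ti (002) assignment via Bragg's law: d = lambda/(2 sin theta).
wavelength_nm = 0.15418   # Cu K-alpha, as used for the XRD measurements
two_theta_deg = 38.5      # main alpha-Ti orientation reported in fig. 2

theta = math.radians(two_theta_deg / 2.0)
d = wavelength_nm / (2.0 * math.sin(theta))
print(f"d(002) = {d:.4f} nm")   # ~0.234 nm, close to the tabulated alpha-Ti (002) spacing
```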
the columnar geometry depends on the substrate topology, because it results from the competition between the growth of the irregularities and the surface diffusion of atoms [16, 15]. the substrate roughness strongly influences the initial stage of the coating growth [3], and it also plays an important role in the evolution of the physical structure. surface roughness usually increases during the deposition, and in some cases [27] columnar structures gradually appear in sputtered thick films once a certain roughness value is reached. as shown in fig. 3, the roughness of the ti interlayer increases as a function of the thickness. correspondingly, an increase of the lateral dimension of the a-b4c columns is observed (fig. 1). furthermore, the interfaces between the layers become less clear until they disappear at a ti-i thickness of about 300 nm. another interesting feature is that above this thickness the samples (fig. 1d) show dense and continuous columns, independently of which working pressure has been used. as previously reported in the literature [3], if the surface adatom diffusion length is longer than the characteristic length of the irregularities, the roughness of the deposited coating is smoothed out and the coating becomes denser. we explain the growth features of our coatings by this notion. decreasing the working pressure when the deposition is switched from layer 1 to layer 2 (fig. 1a), the energy released by the particles at the surface will be higher; consequently, the increased surface adatom diffusion length will give rise to a denser a-b4c layer. with regard to sample d (fig. 1), the a-b4c coating started to grow with a large basal lateral dimension of the columns and, when the pressure was decreased, the growth proceeded with the same texture as the previous layer. we estimate that at this value of ti-i roughness, an equilibrium between the surface adatom diffusion length and the roughness length scale of the substrate was achieved. we emphasize that, during the deposition process, the temperature of the substrate was very low (about 320 k), indicating that thermally induced surface diffusivity can be neglected.

figure 3. afm images of ti-i interlayers of thickness a) 25 nm, b) 200 nm and c) 400 nm; the root mean square roughness is indicated with rrms.

in order to investigate the mechanical properties of the coatings, scratch test measurements were performed on a-b4c films grown on bare silicon and on 400 nm of ti-i thickness. the scratches were performed in progressive load scratch test (plst) mode, in which the applied normal load increases linearly with time. the sliding velocity of the indenter and the loading rate were fixed at 9.0 mm/min and 9.0 n/min, respectively. a comparison between the two scratch tests clearly shows an increase of the adhesion, which becomes three times stronger in the presence of ti-i, changing from 1.4 n (without ti-i) to 5.1 n (with ti-i). the destructive effect generated by the indenter while scratching along the two samples is shown in fig. 4.

figure 4. optical micrographs of the scratch track generated by the indenter on a-b4c deposited on bare and 400 nm ti-coated silicon substrates.

4. conclusions

boron carbide has been deposited by magnetron sputtering on silicon substrates with different titanium interlayer thicknesses. a ti thickness above 300 nm leads to the growth of a-b4c with a dense and continuous columnar structure. in this case, no delamination effects were found.
correspondingly, scratch test measurements show that the adhesion of the a-b4c coating becomes three times stronger. we attribute this result to the ti-i, which decreases the internal stress at the substrate–film interface. this result is connected to the enhanced roughness of the ti-i, which induces the growth of a dense and continuous columnar structure.

references

[1] a. annen, et al. structure of plasma-deposited amorphous hydrogenated boron-carbon thin films. thin solid films 312(1–2):147–155, 1998.
[2] j. c. avelar-batista, et al. x-ray diffraction analyses of titanium coatings produced by electron beam evaporation in neon and argon inert gases. journal of vacuum science and technology a 21(5):1702–1707, 2003.
[3] p. bai, et al. effect of substrate surface roughness on the columnar growth of cu films. journal of vacuum science and technology a 9(4):2113–2117, 1991.
[4] g. m. blumenstock, r. a. m. keski-kuha. ion beam deposited boron carbide coatings for the euv. applied optics 33(25):5962–5963, 1994.
[5] j. s. chapin. the planar magnetron (sputtering source). research development 25:37–40, 1974.
[6] s. v. deshpande, et al. filament activated chemical vapor deposition of boron carbide coatings. applied physics letters 65(14):1757–1759, 1994.
[7] r. gago, et al. boron-carbon-nitrogen compounds grown by ion beam assisted evaporation. thin solid films 373(1–2):277–281, 2000.
[8] z. han, g. li, j. tian, m. gu. microstructure and mechanical properties of boron carbide thin films. materials letters 57(4):899–903, 2002.
[9] t. hu, et al. structures and properties of disordered boron carbide coatings generated by magnetron sputtering. thin solid films 332(1–2):80–86, 1998.
[10] j.-h. huang, et al. effect of ti interlayer on the residual stress and texture development of tin thin films. surface and coatings technology 200(20–21):5937–5945, 2006.
[11] c. c. klepper, et al. amorphous boron coatings produced with vacuum arc deposition technology. journal of vacuum science and technology a 20(3):725–732, 2002.
[12] m. d. kriese, et al. effects of annealing and interlayers on the adhesion energy of copper thin films to sio2/si substrates. acta materialia 46(18):6623–6630, 1998.
[13] v. kulikovsky, et al. effect of air annealing on mechanical properties and structure of amorphous b4c. surface and coatings technology 205(205):4052–4057, 2011.
[14] n. kuratani, et al. internal stress in thin films prepared by ion beam and vapor deposition. surface and coatings technology 66(1–3):310–312, 1994.
[15] s. lichter, j. chen. model for columnar microstructure of thin solid films. physical review letters 56(13):1396–1399, 1986.
[16] a. mazor, et al. columnar growth in thin films. physical review letters 60(5):424–428, 1988.
[17] d. s. mcgregor, et al. new surface morphology for low stress thin-film-coated thermal neutron detectors. ieee transactions on nuclear science 49(4):1999–2004, 2002.
[18] e. pascual, e. martinez, j. esteve, a. lousa. boron carbide thin films deposited by tuned-substrate rf magnetron sputtering. diamond and related materials 8(2–5):402–405, 1999.
[19] k. a. pischow, et al. the influence of titanium interlayers on the adhesion of pvd tin coatings on oxidized stainless steel substrates. surface and coatings technology 58(3):163–172, 1993.
[20] s. w. russell, et al. enhanced adhesion of copper to dielectrics via titanium and chromium additions and sacrificial reactions. thin solid films 262(1–2):154–167, 1995.
[21] a. z. sadek, et al. anodization of ti thin film deposited on ito.
langmuir 25(1):509–514, 2009.
[22] s. ulrich, et al. subplantation effect in magnetron sputtered superhard boron carbide thin films. diamond and related materials 7(6):835–838, 1998.
[23] e. vassallo, et al. deposition of boron-carbon multilayer coatings by rf plasma sputtering. surface and coatings technology 214:59–62, 2013.
[24] h. werheit. thermoelectric properties of boron-rich solids and their possibilities of technical application. in ieee, international conference on thermoelectrics, pp. 159–163, 2006.
[25] d. l. windt. low-stress w/cr films for scalpel mask scattering layers. journal of vacuum science and technology b 17(4):1385–1389, 1999.
[26] m.-l. wu, et al. process-property relationship of boron carbide thin films by magnetron sputtering. thin solid films 449(1–2):120–124, 2004.
[27] w. xu, et al. suppressing the surface roughness and columnar growth of silicon nitride films. surface and coatings technology 135(2):274–278, 2001.

acta polytechnica vol. 51 no. 1/2011

path integral solution of pt-/non-pt-symmetric and non-hermitian hulthen potential

n. kandirmaz, r. sever

abstract

the wave functions and the energy spectrum of the pt-/non-pt-symmetric and non-hermitian hulthen potential, which is of exponential type, are obtained via the path integral. the path integral is constructed using parametric time and a point transformation.

keywords: pt-symmetry, coherent states, path integral, hulthen potential.

1 introduction

a suggestion by bender and boettcher on pt-symmetric quantum mechanics has put forward a point of view different from standard quantum mechanics. for a quantum mechanical system to have a real energy spectrum, the hamiltonian was assumed to have to be hermitian. bender and his co-workers showed that even if a hamiltonian is not hermitian, it can have a real energy spectrum [1]. pt-symmetric and non-hermitian potentials have been studied, using numerical and analytical techniques, to prove that they have a real energy spectrum. the energy spectrum corresponding to the wave functions has also been calculated [2–9]. in this work, we have used feynman's path integral method to obtain the energy spectrum and the wave functions of the pt-/non-pt-symmetric and non-hermitian exponential potential. the feynman path integral gives the kernel, which contains the transition amplitudes between the initial and final positions, and from which the energy-dependent green function can be constructed. a feynman path integral formalism for deriving the kernel of various potentials was developed in [10–16]. duru derived the wave functions and the energy spectrum of the wood-saxon potential for s-waves via the radial path integral. inomata obtained the energy spectrum and the normalized s-state eigenfunctions for the hulthen potential using the green function [11]. the kernel of the hulthen potential can be solved exactly, given the path integral for particle motion on the su(2) manifold s3 [10–12]. in sections 2 and 3, we derive the energy-dependent green function of the pt-/non-pt-symmetric and non-hermitian q-deformed hulthen potential, and we obtain the energy eigenvalues and the corresponding wave functions.

2 pt-symmetric and non-hermitian hulthen potential

the kernel of a point particle moving in the potential $V(x)$ in one dimension is represented by the following path integral:
$$K(x_b,t_b;x_a,t_a) = \int \frac{\mathcal{D}x\,\mathcal{D}p}{2\pi} \exp\left\{ i\int dt\left[ p\dot{x} - \frac{p^2}{2m} - V(x) \right]\right\}, \qquad (1)$$
where $\hbar = 1$.
the kernel expresses the probability amplitude of a particle moving to position $x_b$ at time $t_b$ from position $x_a$ at time $t_a$. the time interval can be divided into $N$ equal parts,
$$t_j - t_{j-1} = \frac{t_b - t_a}{N} = \varepsilon, \qquad j = 1,2,3,\ldots,N, \qquad (2)$$
and, taking $x_a$ as the initial position and $x_b$ as the final position, the kernel [11] can be written as
$$K(x_b,T;x_a,0) = \int_{-\infty}^{\infty} \prod_{i=1}^{N} dx_i \prod_{i=1}^{N+1} \frac{dp_i}{2\pi}\, \exp\left\{ i \sum_{i=1}^{N+1} \left[ p_i(x_i - x_{i-1}) - \varepsilon\left(\frac{p_i^2}{2m} + V(x_i)\right) \right] \right\}. \qquad (3)$$
the pt-symmetric and non-hermitian potential is
$$V(x) = -\frac{V_0\, e^{-ix/a}}{1 - q\, e^{-ix/a}}, \qquad (4)$$
which is obtained by taking $\frac{1}{a} \to \frac{i}{a}$ in the q-deformed hulthen potential [5]. we start by applying a point transformation to bring the hulthen potential into a solvable path integral form:
$$\frac{1}{1 - q\, e^{-ix/a}} = \sin^2\theta, \qquad p = \frac{i}{2a}\sin\theta\cos\theta\, p_\theta. \qquad (5)$$
because of this transformation, there is a jacobian contribution, and the kernel becomes
$$K(x_b,T;x_a,0) = \frac{q}{2a}\sin\theta_b\cos\theta_b \int \mathcal{D}\theta\,\mathcal{D}p_\theta\, \exp\left[ i\int dt\left( p_\theta\dot{\theta} + \frac{\sin^2\theta\cos^2\theta}{4a^2}\,\frac{p_\theta^2}{2\mu} - V_0\cos^2\theta \right)\right]. \qquad (6)$$
here, the kinetic energy term becomes positive. we define a new time parameter $s$ [12] to eliminate the factor $\frac{\sin^2\theta\cos^2\theta}{4a^2}$ in the kinetic energy term,
$$\frac{dt}{ds} = -\frac{4a^2}{\sin^2\theta\cos^2\theta}, \quad\text{or}\quad t = -4a^2\int \frac{ds'}{\sin^2\theta\cos^2\theta}. \qquad (7)$$
using the fourier transform of the $\delta$-function, we can write
$$1 = \int dS \int \frac{dE}{2\pi}\, \frac{4a^2}{\sin^2\theta_b\cos^2\theta_b}\, \exp\left[ -i\left( ET - \int ds\, \frac{4a^2 E}{\sin^2\theta\cos^2\theta} \right)\right], \qquad (8)$$
where $S = s_b - s_a$. the factor in front of the path integral arising from the jacobian can be symmetrized with respect to the points $a$ and $b$ as follows:
$$\frac{1}{\sin\theta_b\cos\theta_b} = \frac{2}{\sqrt{\sin 2\theta_a \sin 2\theta_b}}\, \exp\left( i\int_0^S ds\,(-i)\,\frac{\cos 2\theta}{\sin 2\theta}\,\dot{\theta} \right). \qquad (9)$$
thus eq. (6) becomes
$$K(x_b,x_a,T) = \int_0^\infty dS\, e^{iS/2\mu} \int_{-\infty}^{\infty} \frac{dE}{2\pi}\, e^{iET}\, 4iaq\, \sqrt{\sin 2\theta_a \sin 2\theta_b}\; K(\theta_b,\theta_a;S), \qquad (10)$$
where
$$K(\theta_b,\theta_a;S) = \int \mathcal{D}\theta\,\mathcal{D}p_\theta\, \exp\left\{ i\int_0^S ds\left[ p_\theta\dot{\theta} - \frac{p_\theta^2}{2\mu} - \frac{1}{2\mu}\left( \frac{k(k-1)}{\sin^2\theta} + \frac{\lambda(\lambda-1)}{\cos^2\theta} \right) - \frac{i\, p_\theta \cos 2\theta}{2\mu\sin 2\theta} \right]\right\} \qquad (11)$$
and $k$ and $\lambda$ are
$$k = \frac12\left[ 1 + \sqrt{32\mu a^2 (V_0 + E)} \right], \qquad \lambda = \frac12\left[ 1 + \sqrt{32\mu a^2 E} \right]. \qquad (12)$$
if the jacobian factor is symmetrized as in [11], the contributions to the kernel become
$$\dot{\theta}_j \longrightarrow \dot{\theta}_j \pm \frac{i\cos\theta_j}{2\mu\sin\theta_j}, \qquad (13)$$
so the problem is transformed into the path integral for the pöschl–teller potential, for which an exact solution is known [11]. $K(\theta_b,\theta_a;S)$ can be written as
$$K(\theta_b,\theta_a;S) = \int \mathcal{D}\theta\,\mathcal{D}p_\theta\, \exp\left\{ i\int_0^S ds\left[ p_\theta\dot{\theta} - \frac{p_\theta^2}{2\mu} - \frac{1}{2\mu}\left( \frac{k(k-1)}{\sin^2\theta} + \frac{\lambda(\lambda-1)}{\cos^2\theta} \right) \right]\right\}, \qquad (14)$$
and the kernel can be obtained in the form
$$K(\theta_b,\theta_a;S) = \sum_{n=0}^{\infty} \exp\left[ -i\frac{S}{2\mu}(k+\lambda+2n)^2 \right] \psi_n(\theta_a)\,\psi_n^*(\theta_b), \qquad (15)$$
where
$$\psi_n(\theta) = \sqrt{2(k+\lambda+2n)}\, \sqrt{ \frac{\Gamma(n+1)\,\Gamma(k+\lambda+n)}{\Gamma(\lambda+n+\frac12)\,\Gamma(k+n+\frac12)} }\; (\cos\theta)^{\lambda}(\sin\theta)^{k}\, P_n^{(k-1/2,\,\lambda-1/2)}\!\left( 1 - 2\sin^2\theta \right). \qquad (16)$$
integrating over $dS$, the green function for the hulthen potential can be obtained as
$$G(x_b,x_a;E) = 8\mu a q\, \sqrt{\sin 2\theta_a \sin 2\theta_b}\, \sum_{n=0}^{\infty} \int_{-\infty}^{\infty} \frac{dE}{2\pi}\, \frac{e^{iET}}{(k+\lambda+2n)^2 - 1}\, \psi_n(\theta_a)\,\psi_n^*(\theta_b). \qquad (17)$$
therefore, the kernel of the physical system can be rewritten as
$$K(x_b,x_a;T) = \sum_{n=0}^{\infty} e^{-iE_nT}\,\varphi_n(x_a)\,\varphi_n^*(x_b) = \sum_{n=0}^{\infty} \exp\left\{ -i\,\frac{\left[ (n+1)^2 + 2\mu a^2 V_0 \right]^2}{8\mu a q\,(n+1)^2}\, T \right\}\, \varphi_n(x_b)\,\varphi_n^*(x_a). \qquad (18)$$
integrating over $dE$, we get the energy eigenvalues
$$E_n = \frac{1}{8\mu a^2 (n+1)^2}\left[ 2\mu a^2 \frac{V_0}{q} - (n+1)^2 \right]^2, \qquad (19)$$
and the normalized wave functions in terms of jacobi polynomials,
$$\varphi_n(x) = \frac{1}{2\sqrt{2}}\,\sqrt{n+1}\,\sqrt{ 4(n+1)^2 - (\lambda_n - k_n)^2 }\, \sqrt{ \frac{\Gamma(n+1)\,\Gamma(k_n+\lambda_n+n)}{\Gamma(\lambda_n+n+\frac12)\,\Gamma(k_n+n+\frac12)} }\, \times \frac{\exp[(k_n-\frac12)\,x/2a]}{\left( 1 + e^{-x/a} \right)^{(k_n+\lambda_n-\frac12)}}\; P_n^{(k_n-1/2,\,\lambda_n-1/2)}\!\left( -\frac{1+e^{-ix/a}}{1-e^{-ix/a}} \right), \qquad (20)$$
where
$$k_n = \frac{1}{8\mu a^2 (n+1)^2}\left[ (n+1)^2 - 2\mu a^2\frac{V_0}{q} \right], \qquad \lambda_n = \frac12 + \frac{1}{n+1}\left[ (n+1)^2 + 2\mu a^2\frac{V_0}{q} \right]. \qquad (21)$$
here we see that the pt-symmetric and non-hermitian hulthen potential has a real energy spectrum.
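equation (19) is straightforward to evaluate numerically. the following minimal sketch prints the first few eigenvalues; the parameter values ($\mu$, $a$, $V_0$, $q$ in natural units with $\hbar = 1$) are illustrative assumptions, not values from the paper.

```python
# Evaluate the spectrum of eq. (19); illustrative parameter values only.
mu, a, V0, q = 1.0, 1.0, 2.0, 1.0   # natural units, hbar = 1 (assumed)

def E(n):
    """Energy eigenvalue of eq. (19) for quantum number n = 0, 1, 2, ..."""
    m = (n + 1) ** 2
    return (2.0 * mu * a**2 * V0 / q - m) ** 2 / (8.0 * mu * a**2 * m)

for n in range(4):
    print(n, E(n))   # real-valued spectrum, as stated in the text
```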
3 non-pt-symmetric and non-hermitian hulthen potential

the non-pt-symmetric and non-hermitian hulthen potential is determined by taking $\frac{1}{a} \to \frac{i}{a}$, $V_0 \to A + iB$ and $q \to iq$ as
$$V(x) = -\frac{iV_0\, e^{-ix/a}}{1 - iq\, e^{-ix/a}}. \qquad (22)$$
we follow the same steps to get the wave functions and the energy spectrum. with a suitable coordinate transformation the kernel is obtained as
$$K(x_b,T;x_a,0) = \frac{q}{2a}\sin\theta_b\cos\theta_b \int \mathcal{D}\theta\,\mathcal{D}p_\theta\, \exp\left[ i\int dt\left( p_\theta\dot{\theta} - \frac{\sin^2\theta\cos^2\theta}{4a^2}\,\frac{p_\theta^2}{2\mu} - V_0\cos^2\theta \right)\right]. \qquad (23)$$
if we follow the steps in sec. 2, we obtain the energy eigenvalues
$$E_n = \frac{1}{8\mu a^2 (n+1)^2}\left[ (n+1)^2 + 2\mu a^2\,\frac{(iA - B)}{q} \right]^2, \qquad (24)$$
and the normalized wave functions in terms of jacobi polynomials,
$$\varphi_n(x) = \frac{1}{2\sqrt{2}}\,\sqrt{n+1}\,\sqrt{ 4(n+1)^2 - (\lambda_n - k_n)^2 }\, \sqrt{ \frac{\Gamma(n+1)\,\Gamma(k_n+\lambda_n+n)}{\Gamma(\lambda_n+n+\frac12)\,\Gamma(k_n+n+\frac12)} }\, \times \frac{\exp[(k_n-\frac12)\,x/2a]}{\left( 1 + e^{x/a} \right)^{(k_n+\lambda_n-\frac12)}}\; P_n^{(k_n-1/2,\,\lambda_n-1/2)}\!\left( \frac{1-e^{-x/a}}{1+e^{-x/a}} \right), \qquad (25)$$
where $k_n$ and $\lambda_n$ are the same as in eq. (21). it is clear that the energy spectrum is real only if $\mathrm{re}(V_0) = 0$.

4 conclusion

we have calculated the energy eigenvalues and the corresponding wave functions for the pt-/non-pt-symmetric and non-hermitian deformed hulthen potential. we found that the pt-/non-pt-symmetric and non-hermitian forms of the potential have real energy spectra when the potential parameters are suitably restricted.

references

[1] bender, c. m.: reports on progress in physics 70(6), 947–1018 (2007); bender, c. m., darg, d. w.: journal of mathematical physics 48(4), 042703 (2007); bender, c. m., boettcher, s.: phys. rev. lett. 80, 5243 (1998); bender, c. m., boettcher, s., meisinger, p. n.: j. math. phys. 40, 2201 (1999); bender, c. m., dunne, g. v., meisinger, p. n.: phys. lett. a, 252, 272 (1999).
[2] yesiltas, o., simsek, m., sever, r., tezcan, c.: phys. scripta, t67, 472 (2003).
[3] berkdemir, c., berkdemir, a., sever, r.: phys. rev. c72, 027001 (2005).
[4] faridfathi, g., sever, r., aktas, m.: j. math. chem. 38, 533 (2005).
[5] egrifes, h., sever, r.: int. j. theo. phys. 46, 935 (2007).
[6] ikhdair, s. m., sever, r.: j. of math. chem. 42, 461 (2007).
[7] ikhdair, s. m., sever, r.: j. mod. phys. e 17, 1107 (2008).
[8] kandirmaz, n., sever, r.: chinese j. phys. 47, 47 (2009).
[9] kandirmaz, n., sever, r.: physica scripta, 87, 3, (2010).
[10] feynman, r., hibbs, a.: quantum mechanics and path integrals. mcgraw-hill, new york, (1965).
[11] duru, i. h.: phys. rev. d, 28, 2689 (1983); duru, i. h., kleinert, h.: phys. lett. b84, 185 (1979); fortschr. phys. 30, 401 (1982); duru, i. h.: phys. rev. d, 30, 2121 (1984); duru, i. h.: phys. lett. a, 119, 163 (1986).
[12] peak, d., inomata, a.: j. math. phys. 10, 1422 (1969).
[13] erkoc, s., sever, r.: phys. rev. d, 30, 2117 (1984).
[14] erkoc, s., sever, r.: phys. rev. a, 37, 2687 (1988).
[15] pak, n. k., sokmen, i.: phys. lett. 100a, 327 (1984).
[16] gradshteyn, i. s., ryzhik, i. m.: table of integrals, series, and products, 2nd ed., academic press, new york, (1981).

nalan kandirmaz
e-mail: nkandirmaz@mersin.edu.tr
mersin university, department of physics, mersin, turkey

ramazan sever
middle east technical university, department of physics, ankara, turkey

acta polytechnica vol. 52 no. 5/2012

simulation of a matrix converter fed drive with sliding mode control

jan bauer
dept. of electric drives and traction, czech technical university in prague, faculty of electrical engineering, technická 2, 166 27 praha, czech republic
corresponding author: bauerja2@fel.cvut.cz

abstract. induction machines are among the most widely used electrical-to-mechanical converters in electric drives.
their advantageous robustness and simplicity go hand-in-hand with complicated control. a converter with a suitable control algorithm is needed in order to draw maximum power and dynamics from the drive. in recent times, control methods based on dtc and on sliding mode have come to the forefront, due to their robustness and relative simplicity. in the field of power converters, new converter topologies with improved efficiency are emerging that push the operation limits of the drive. this paper focuses on the development of this kind of control strategy for an induction machine fed from a matrix converter.

keywords: matrix converter, induction machine model, sliding mode control.

1 introduction

the field of regulated ac drives is very broad [6, 10], but most of the drives are based on induction machines and frequency converters. indirect frequency converters are still the most widely used type, but they have several disadvantages. in order to push the limits of im drives and achieve maximum efficiency, research in this field is also investigating other converter topologies, e.g. multilevel inverters [10] and so-called all-silicon solutions [8]. matrix converters have also been attracting some interest. the matrix converter [3] does not include a passive accumulation element in the dc link, and uses nine bidirectional switches to transfer voltage from its input to its output. it has some attractive features in comparison with indirect frequency converters:
• the output frequency is almost unlimited; the only limit is the maximum switching frequency of the devices that are used.
• it has a relatively simple power circuit. an indirect frequency converter comprises a rectifier, an inverter, and a dc link with a passive accumulation element; a matrix converter has only nine bidirectional switches. they are mostly realized from discrete semiconductor modules, but several manufacturers already offer integrated compact igbt modules.
• the features of the matrix converter are the same as those of a vsi with an active front end on its input. however, there is a drawback: because of the absence of a dc link, the output voltage amplitude is limited to 86.6 % of the input voltage amplitude if we wish to maintain sinusoidal input currents.
• sinusoidal input and output currents.
• power factor regulation.
• work in all four quadrants.

figure 1: matrix converter.

the matrix converter got its name because its switches can be arranged into a two-dimensional matrix. the topology, consisting of nine bidirectional switches, is depicted in fig. 1. the modulation strategy and the control algorithm are very important parts of the converter's controller. widely used modulation strategies from the area of indirect frequency converters are generally not applicable because of the absence of a dc link. modulation strategies based on a virtual dc-link space vector modulation were therefore developed [2, 5, 4]. this approach helps to reduce the modulation complexity, and enables classical modulation strategies to be
as concerns the control algorithm, field-oriented control (foc) and the direct torque control (dtc) are the most widely used high performance induction motor control methods at the present time [1]. in foc, torque and flux are controlled through decoupled control of the stator current: the torque and flux-producing components are controlled separately. the current is controlled by fast inner control loops. the outputs of the current controllers are often led to a decoupler, where the coupling between the torque and flux producing current components is compensated. in addition, since it operates in field coordinates, it requires coordinate transformation between the stationary and rotating flux coordinates. all this makes foc a relatively complicated control method that puts high demands on precise generation of the reference voltage in order to provide a good current control. in dtc, however, the torque and flux are controlled directly by switching suitable voltage vectors. this is relatively close to sliding mode based control. let us turn to the sliding mode [7, 9]. in the case of many scientific topics, ideas and approaches, it is difficult to say exactly where a particular theory originated. some of the first attempts can be found in the work of irmgard lotz. she developed many ideas and applied them already in the second world war. her results were heavily reworked and improved in the former soviet union by stanislav v. emelyanov, and especially, later, by vadim i. utkin. nowadays, sliding mode theory is popular all over the world. there are numerous publication about it, and this control approach is applied in many areas of engineering. sliding mode theory originated from the systems that naturally included a kind of relay and in where its usage often cannot be omitted. in systems of this kind, classical theories are often unsatisfactory. however, there are also many areas and systems that do not include any kind of relay, and yet sliding mode control can be successfully implemented and used to provide robustness and a well-defined response to big changes in the target value of the controlled parameter. power electronics is a scientific field in which nonlinearity and switching is essential, due to technical circumstances that do not allow power switches to work in linear mode. power electronics and drives are therefore very suitable fields for using sliding mode control. in most cases, sliding mode is based on guiding the system state space vector exactly via suitable switching following the sliding surface. however, the state vector first has to reach the switching plane, and the state space vector will only then move more or less directly to the destination position. this approach can invoke a time delay when the controller is starting up, but it can be implemented easily using a simple analog comparator — i.e. using analog equipment. it can be proved whether this approach is stable for a particular task, and what its limits are. control action based on a change in target value can be divided into three main steps or phases: 1. drive system to a stable manifold (the reaching phase); 2. slide to equilibrium (the sliding phase); 3. maintain the target value (the maintenance phase). classical sliding mode control has its strong point in the sliding phase, where it is accompanied by robustness. a weak point is the initial reaching phase, and there is sometimes a problem of chatter in the maintenance phase. 
in real systems, this is caused by the limited switching frequency and by the particular set of really available, or natural, switching states. the induction machine is a fourth-order system, and there are two action inputs that are, when supplied from a voltage source inverter, bundled into eight vectors with predefined direction and value. we only select a vector from this set of vectors. the possible directions of the vector depend on the construction of the inverter. the amplitude is given by the dc link voltage, and it cannot be changed quickly, or cannot be changed at all (for a matrix converter, a change can be made in the virtual rectifier by adding a switching step). the switching frequency is limited to some hundreds of hz (for high-power drives) or to tens of khz (for smaller drives). the target point does not depend on the values of the state variables directly, but on a product of the state variables (the mechanical torque). we might attempt to define the criteria for the best control law as follows: the aim is to move from any current system state (defined by the system state variables) to any target system state in the shortest possible time. this can be simplified to the task of finding the shortest route to the target point, which is here defined by a line segment: there is a 1d space of system states that deliver the same mechanical torque value for the electric motor, and all of them are good candidates for the target state of the control algorithm. existence and stability can be evaluated by analyzing the state space description of the induction motor in order to find the limit boundaries, where the available action vectors can no longer produce the required movement in all directions. in other words, the boundary can be defined as the system state where the time derivative of one particular state variable is equal to zero, and at the same time the other available vectors deliver the same sign (the state value achieves its limit, and it can no longer be increased). by applying this line segment approach, we may be able to eliminate the sliding mode reaching phase, and if necessary we may obtain performance comparable to the sliding phase for a machine that is not yet excited, e.g. typically a machine that is just starting. of course, in the maintenance phase, both the classical sliding mode and the line segment approach may produce unwanted chatter. this can be overcome by additional arrangements, but this topic lies beyond the scope of this paper.

2 model of the matrix converter

since modulation is not the main topic of this paper, a simplified model is fully sufficient [3]. let us use the virtual dc-link concept (fig. 2). under this consideration, the matrix converter can be virtually divided into a virtual rectifier and a virtual inverter. this enables the application of classical modulation approaches, even in a direct converter. the inverter part can then be imagined as three bi-state switches; this means either $+u_{dc}$ or $-u_{dc}$ can appear on the output terminals. by contrast, the virtual rectifier can be imagined as two tri-state switches [2]. this guarantees that the direction of the input current space vector can be controlled independently of the selected output voltage vector. based on the power balance, and assuming zero losses in the converter, the transfer function of the converter can be defined as
$$u_{out} = m_u\, m_i\, u_{in}. \qquad (1)$$
when we employ space vector theory for the virtual rectifier part, we obtain formulas determining the duty ratios of the input current and output voltage vectors. the equation
$$\begin{pmatrix} d_{i\gamma} \\ d_{i\delta} \end{pmatrix} = \frac{\sqrt{6}}{3}\,\frac{|i_{in}|}{i_{p\alpha}} \begin{pmatrix} \sin\!\left(\frac{\pi}{3} - \theta_i\right) \\ \sin\theta_i \end{pmatrix} \qquad (2)$$
represents the input current. the situation is depicted in fig. 3 (left). this switching will cause the converter to consume an almost sinusoidal current, and the virtual dc-link voltage will be constant. the situation in the virtual inverter part is different, since in our case one of the six space vectors will be used without any additional modulation (the output modulation will be a product of the sliding mode control).

figure 2: virtual dc-link.

figure 3: synthesis of the input current vector and output voltage vector.

figure 4: virtual inverter output vectors.

independently of the input sinusoidal modulation, there is a free choice among the vectors in fig. 4. because the output voltage is switched without any modulation, its amplitude corresponds to the amplitude of the voltage in the virtual dc link. therefore, the only way to influence its amplitude is by modulating the input current. the current modulation index can be expressed assuming a power balance in the converter:
$$p_{dc} = p_{in}, \qquad (3)$$
$$u_{dc}\, i_{dc} = |u_{in}|\,|i_{in}|\cos\varphi. \qquad (4)$$
for the modulation index of the virtual inverter (fig. 3, right) we can write
$$m_u = \frac{|u_{out}|}{k\,\frac{\sqrt{3}}{2}\, u_{dc}} = \frac{3}{2}\,\frac{|u_{out}|}{u_{dc}}\,\frac{2}{\sqrt{3}} = \sqrt{3}\,\frac{|u_{out}|}{u_{dc}}, \qquad (5)$$
where $k = 2/3$ is a transformation coefficient that ensures a power-invariant transformation. the input modulation index of the virtual rectifier can be expressed with the help of fig. 3 as
$$m_i = \frac{|i_{in}|}{k\,\frac{\sqrt{3}}{2}\,\sqrt{3}\, i_{dc}} = \frac{|i_{in}|}{i_{dc}}. \qquad (6)$$
when we substitute (5) and (6) into (4), the final equation for calculating the input current modulation index $m_i$ can be formed:
$$m_i = \sqrt{3}\,\frac{|u_{out}|}{|u_{in}|}. \qquad (7)$$
from the equations of the machine we can easily calculate the voltage that is required to hold the machine in the desired stable state ($u_{out}$), and the input voltage is also known.
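as a quick numerical illustration of eqs. (5)–(7), the sketch below evaluates both modulation indices; the voltage values are illustrative assumptions. both indices must stay below 1, which is what bounds the achievable output voltage amplitude.

```python
import math

# Sketch of eqs. (5)-(7); the numeric voltage values are assumptions.

def inverter_index(u_out, u_dc):
    """Virtual inverter modulation index, eq. (5): m_u = sqrt(3)|u_out|/u_dc."""
    return math.sqrt(3.0) * abs(u_out) / u_dc

def rectifier_index(u_out, u_in):
    """Input current modulation index, eq. (7): m_i = sqrt(3)|u_out|/|u_in|."""
    return math.sqrt(3.0) * abs(u_out) / abs(u_in)

u_in, u_dc = 325.0, 560.0          # assumed input amplitude and virtual dc-link voltage
for u_out in (50.0, 100.0, 150.0):
    print(u_out, round(inverter_index(u_out, u_dc), 3),
          round(rectifier_index(u_out, u_in), 3))
```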
3 model of an asynchronous machine

in order to obtain maximum performance from the drive, a precise regulation algorithm is needed. these algorithms are mostly based on regulating the flux of the machine, which can scarcely be measured. machine parameters from the equivalent circuit and measured values are used to estimate the inner machine flux based on the machine equations. the final accuracy of the equivalent circuit, and thereby also the accuracy of the controller, depends on exact knowledge of the equivalent circuit values. a further important advantage of the equivalent circuit is that the behavior of the machine, including the stable operation area, the maximum achievable torque, etc., can be derived analytically from its mathematical description. this will be performed below. finally, the control algorithm for the electrical torque of the machine will be designed on the basis of this knowledge. the γ-equivalent circuit (fig. 5) will be used in this paper to describe the behavior of the machine. the advantage of this circuit is the simplification due to fusing the rotor and stator inductances into a single inductance on the rotor side, without loss of information. the induction machine is a system with four state variables. the following equation describes the derivative of both fluxes according to the state matrix $A$ and the vector of inputs $B$:
$$\dot{\psi}(t) = B(t) + A\,\psi(t). \qquad (8)$$
any combination of fluxes or currents can be used. the behavior of the machine can be derived analytically from (8). however, since information about the rotor flux cannot be obtained easily, only stator variables are used in this paper for modeling the im.

figure 5: equivalent circuit of im.

the system based on the stator variables taken from the γ-equivalent circuit is introduced in
$$\begin{pmatrix} \dot{\psi}_\mu(t) \\ \dot{i}_s(t) \end{pmatrix} = A_s \begin{pmatrix} \psi_\mu(t) \\ i_s(t) \end{pmatrix} + B_s(t), \qquad (9)$$
where $A_s$ is the main state space matrix and $B_s$ represents the influence of the action vectors; they are given by
$$A_s = \begin{pmatrix} 0 & \omega_b & -r_s & 0 \\ -\omega_b & 0 & 0 & -r_s \\ \sigma_2 & \dfrac{\omega\sigma_3}{l_\mu l_\sigma} & \sigma_1 & \omega_b - \omega \\ -\dfrac{\omega\sigma_3}{l_\mu l_\sigma} & \sigma_2 & \omega - \omega_b & -\sigma_1 \end{pmatrix}, \qquad (10)$$
where
$$\sigma_1 = -\frac{r_r l_\mu + r_s l_\mu + r_s l_\sigma}{l_\mu l_\sigma}, \qquad \sigma_2 = \frac{r_r}{l_\mu l_\sigma}, \qquad \sigma_3 = l_\mu + l_\sigma;$$
$$B_s(t) = \begin{pmatrix} e_{sa}(t) \\ e_{sb}(t) \\ \dfrac{e_{sa}(t)\,l_\mu + e_{sa}(t)\,l_\sigma}{l_\mu l_\sigma} \\ \dfrac{e_{sb}(t)\,l_\mu + e_{sb}(t)\,l_\sigma}{l_\mu l_\sigma} \end{pmatrix}. \qquad (11)$$
the last important missing equation is the expression for calculating the torque of the induction machine:
$$m_e = \frac{3}{2}\, p\,(\psi \times i) = \frac{3}{2}\, p\,(\psi_d i_q - \psi_q i_d). \qquad (12)$$
we can propose the first simplifying assumption that the coordinate system rotates with the speed of the flux. therefore $\omega_b = \omega_1$, $\psi_d = \psi$ and $\psi_q = 0$. using this dynamically rotating coordinate system, the original 4th-order system is reduced to a new 3rd-order system. the $\psi_q$ component of the flux will always be zero. this helps when designing the control algorithm, because the torque of the machine then depends on the $\psi_d$ and $i_q$ components only. because we aim to design a sliding mode controller for the torque of the im, we will set $\psi_d$ and $i_q$ as our sliding plane, and all investigations will be related to this plane. the last variable, the current $i_d$, remains uncontrolled, but it follows from the following analysis that it can be controlled vicariously by proper selection of the input voltage vector or by limiting the area of operation. now, let us investigate which behaviors of the im can be expressed from the system (10).

3.1 reaction to voltage vector

before designing the controller, let us study the response of the machine to the voltage vector that is applied to it. in order to simplify the analysis, let us begin by analyzing the reaction to the four vectors in the directions of the d, q axes with the amplitude of the virtual dc-link voltage, instead of the six vectors with an unknown heading angle that would be available at any moment. fig. 6 depicts the change in the state of the machine (in the flux/current plane). the arrows with a rhombus visualize the change in the state of the machine (torque). the plain arrows visualize the change in the $i_d$ component (right orientation means a positive sign, left orientation a negative sign). the investigated machine [8] has the parameters summarized in tab. 1.

table 1: motor parameters.
motor properties: $r_s = 25.9\cdot10^{-3}\,\Omega$, $r_r = 18.0\cdot10^{-3}\,\Omega$, $l_\mu = 27.6\cdot10^{-3}\,\mathrm{H}$, $l_\sigma = 1.3\cdot10^{-3}\,\mathrm{H}$, $p = 2$;
nominal working point: $p_n = 87.0\cdot10^{3}\,\mathrm{W}$, $\omega_n = 150\,\mathrm{rad/s}$, $\psi_n = 0.84\,\mathrm{Wb}$, $u_n = 230\,\mathrm{V}$, $i_n = 180\,\mathrm{A}$.

the figures show that, for different operation states, the voltage vectors have a different effect on the machine. we can see from the rightmost column of figures, where the machine is operating at nominal speed, that the application of the voltage vector in the direction of the d axis will either increase or decrease the flux, but it will always decrease the torque. as assumed, the application of the voltage vector in the negative direction of the q axis will lead to a large decrease in $i_q$ (torque), but an increase in torque is not possible because the machine has already reached its limits.

figure 6: reaction to the applied voltage vector.
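the reaction analysis of fig. 6 can be reproduced numerically from eqs. (9)–(11) with the tab. 1 parameters. the minimal sketch below does this for one applied voltage vector; the operating point and the electrical speeds are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Reaction of the stator state vector [psi_a, psi_b, i_sa, i_sb] to an
# applied voltage vector, following eqs. (9)-(11) and tab. 1.
rs, rr = 25.9e-3, 18.0e-3          # stator / rotor resistance (ohm)
lmu, lsig = 27.6e-3, 1.3e-3        # magnetizing / leakage inductance (H)
w, wb = 2.0 * 150.0, 2.0 * 150.0   # electrical rotor / frame speed (rad/s), assumed

s1 = -(rr * lmu + rs * lmu + rs * lsig) / (lmu * lsig)
s2 = rr / (lmu * lsig)
s3 = lmu + lsig

A = np.array([
    [0.0, wb, -rs, 0.0],
    [-wb, 0.0, 0.0, -rs],
    [s2, w * s3 / (lmu * lsig), s1, wb - w],
    [-w * s3 / (lmu * lsig), s2, w - wb, -s1],
])

def derivative(state, e_sa, e_sb):
    """State derivative for stator voltage (e_sa, e_sb), eqs. (9)-(11)."""
    B = np.array([e_sa, e_sb,
                  e_sa * s3 / (lmu * lsig),
                  e_sb * s3 / (lmu * lsig)])
    return A @ state + B

x = np.array([0.84, 0.0, 50.0, 100.0])   # assumed operating point
print(derivative(x, 560.0, 0.0))         # reaction to a d-axis voltage vector
```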
4 design of the controller

we might assume that the limitation of the working area for torque control depends on a number of factors. let us investigate the existence condition for sliding mode torque control. typically, the flux is kept constant and the torque control is realized via changes in the orthogonal current ($i_q$). in this case, the sliding line is orthogonal to the flux axis. there are typically two movements: upwards to increase the torque, and downwards to decrease the torque [9]. both cases are symbolically depicted in fig. 7. there is a sliding mode that increases the torque if there is always a pair of voltage vectors, both increasing $i_q$, where the first vector derives positive torque and the second vector derives negative torque. conversely, there is a sliding mode that decreases the torque if there is always a pair of voltage vectors, both decreasing $i_q$, where the first vector derives positive torque and the second vector derives negative torque. obviously, the worst case (the strongest condition for existence) arises for a voltage vector that simultaneously increases both the flux and the orthogonal current component. since the analysis is being conducted in the rotating plane, there is a 60-degree uncertainty in the available direction of the voltage vector (see fig. 8). there will always be one vector with a direction between $u_{b1}$ and $u_{b2}$, but the exact value of the angle is random. the optimization task consists in appropriately selecting the 60-degree region (the choice of $\varphi_0$ — see fig. 9) in order to extend the guaranteed working area as far as possible.

figure 7: principle of sliding mode torque control.

figure 8: principle of sliding mode torque control.

figure 9: principle of sliding mode torque control.

a sliding mode controller for a drive fed by a matrix converter was designed on the basis of the analysis presented here. the inputs for the regulation are the desired flux, $\psi_d^*$, and the desired torque, $m^*$. from these two values we calculate the current component $i_q^*$ that is required for torque generation. the values of $\psi_d^*$ and $i_q^*$ are then compared with the current values of $\psi_d$ and $i_q$, and according to the result of this comparison the vector that should be generated is selected, so as to produce the desired movement of the operation point of the machine, see fig. 9. the vectors that are required were selected on the basis of the analysis of the reaction to the voltage vector. it followed from the analysis that the decrease in torque is more rapid than the increase. for this reason, we boost the increase in the $i_q$ component when an increase in torque is required, by requesting a vector with an angle of 60 degrees. conversely, we smooth out the decrease in the $i_q$ component when a decrease in torque is required, by requesting a voltage vector with an angle of −45 degrees. taking the uncertainty of the available voltage vectors into account, torque-increasing voltage vectors are selected randomly from the interval between 30 and 90 degrees, and torque-decreasing voltage vectors belong to the interval between −75 and −15 degrees. a simplified block diagram of the controller implemented in matlab/simulink is depicted in fig. 10. from the measured voltages and currents, we recalculate the values that cannot be measured directly — the motor flux ψ, and the angle of the flux phasor ϕ.
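before the remaining steps of the controller are described, the selection logic just outlined can be sketched as follows. this is an illustration only: the hysteresis decision, the requested angles, and the stand-in vector set are assumptions, and the flux comparator branch of fig. 9 is omitted for brevity.

```python
import numpy as np

# Sketch of the vector selection: a hysteresis decision on i_q picks the
# requested direction (+60 deg to build torque, -45 deg to reduce it, both
# relative to the flux angle), and the instantaneously available output
# vector with the largest scalar product is switched. The vector set below
# stands in for whatever the converter can actually produce at a given moment.

def select_vector(iq, iq_ref, flux_angle, available_vectors):
    angle = flux_angle + np.deg2rad(60.0 if iq < iq_ref else -45.0)
    request = np.array([np.cos(angle), np.sin(angle)])
    scores = [float(np.dot(request, v)) for v in available_vectors]
    return available_vectors[int(np.argmax(scores))]

# six inverter-style unit vectors as a stand-in for the available set:
vectors = [np.array([np.cos(k * np.pi / 3.0), np.sin(k * np.pi / 3.0)])
           for k in range(6)]
print(select_vector(100.0, 150.0, 0.0, vectors))  # torque must rise -> ~60 deg vector
```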
the current values of $\psi_d$ and $i_q$ are expressed on the basis of the knowledge of these two variables. the most suitable available voltage vector is then selected by searching for the maximum of the scalar product of the required vector with the available converter output vectors.

figure 10: controller block diagram.

5 simulation results

the simulation results of this control are depicted in fig. 11, which shows the results of the simulation with the proposed control strategy for a desired value of $\psi_d = 0.84$ wb and various load torques. the red line shows the desired torque and the green line represents the real torque. it can be seen that the controller tracks the desired value very rapidly, and without oscillations in the transitions. fig. 12 shows the trajectory of the controlled variables in the polar coordinate system. it can be seen that the flux of the machine first reaches its desired value and then starts chattering around the desired flux line, while the orthogonal current component is adjusted according to the required mechanical torque. fig. 13 shows the selected voltage vectors. it can be seen that the voltage vectors are scattered around the desired vectors ±60° and ±45°.

figure 11: reaction of the drive.

figure 12: trajectory of the controlled variables.

figure 13: generated output voltage vectors.

figure 14: input of the converter.

6 conclusion

we have presented a control strategy for a matrix converter based on the virtual dc-link concept and generalized sliding mode modulation of the virtual inverter. the simulation results look very promising, and show that sliding mode control of the virtual inverter can operate simultaneously with independent input sinusoidal modulation. the responses of the controller to a change in the desired state are rapid, and there is no overshoot. however, the real impact of the modulation can only be evaluated after the algorithm has been implemented on a real converter prototype. nevertheless, it forms a solid basis for further investigations and for improving the suggested control concept for a matrix converter induction motor drive system. an analysis of the behavior of the induction machine based on the state space description has also been presented. the output of the analysis
pwm strategy applied to a realized matrix converter system. in proceedings of piers 2007. prague, 2007.
[6] d. w. novotny, t. a. lipo. vector control and dynamics of ac drives. 1st edition, oxford: oxford university press, 1996.
[7] s. ryvkin. sliding mode technique for ac drive. in proceedings of the 10th international power electronics & motion control conference, epe-pemc. dubrovnik & cavtat, croatia, 2005.
[8] s. ryvkin, r. schmidt-obermoeller, a. steimel. sliding-mode-based control for a three-level inverter drive. ieee transactions on industrial electronics 55(11):3828–3835, 2008.
[9] v. i. utkin, j. guldner, j. shi. sliding mode control in electromechanical systems. london: taylor & francis, 1999.
[10] p. vas. sensorless vector and direct torque control. oxford: oxford university press, 1998.

electronic education at the faculty of nuclear sciences and physical engineering

d. vaněček, v. klement

abstract

this paper deals with the current issue of electronic education, and is based on a study of internet support for education at the faculty of nuclear sciences and physical engineering at the czech technical university in prague. the goal of the study was to establish to what extent and in what ways electronic support for education is utilized at fnspe ctu. in order to answer these questions, a questionnaire was conducted at the faculty. we will present the outcomes here.

keywords: blended learning, lms, fnspe.

1 electronic education

1.1 introduction

in the present-day education system, e-learning is a widely-used term referring to activities that can improve not only distance learning but also full-time study. it refers to a manner of education which enables learners to complement and support standard education with the use of modern computer technology (in most cases web technologies). e-learning is changing the view on current education, and has influenced even the most important pedagogical-psychological theories of learning connected with technologies. if proper procedures are observed while preparing a course, its application in education corresponds to modern conceptions of cognition: constructivism and connectivism. to the concepts of remembering, recalling, and learning are added the concepts of thinking, creating, and forming. a proper e-learning course should consciously support the internal, rather than external, motivation of learners, and should accompany individual learning with cooperative learning. the main attributes of the course should be:

• open learning materials connected to a number of expanding and elaborating texts.
• methodological openness.
• a problem-solving approach that propounds alternative views and theories, and consequently places emphasis on critical and divergent thinking.
• students can be included in preparing the content.

1.2 history

electronic education as it is known today is a relatively new way of delivering education. its history started to be written only in the 1990s, alongside the development of the internet. nevertheless, if we take a broader definition of this notion, the first e-learning (in the sense of learning through technology) took place through learning machines, the first of which was constructed already in the 1920s by the psychologist s. l. pressey. however, this machine did not come into its own. the next attempt at programmed learning appeared in the 1950s, and even came to the czechoslovak republic, where the unitutor educational automaton was created. however, not even these automata were successful, most probably as a result of their cost. foreign language teaching attempted to make use of each new development in audio and audiovisual technology in the 20th century, but until quite late in the century high cost, user-unfriendliness and unsatisfactory auditory quality were considerable drawbacks. the next milestone of e-learning came in the period between 1984 and 1993. an idea to support education by the use of personal computers arose with their development. it consisted mostly of distribution of educational content on floppy disks or cd-roms; this manner of application is often called cbt (computer-based training).
from the current point of view, this way of education has many problems, such as the inability to update the content of learning materials, or the fact that the student has no contact with a teacher or other students. the history of true e-learning starts after the year 1993 (although the term itself was not coined until 1999), alongside the development of the web. at first there were only static sites, where learning materials were available, and communication with the teacher either did not take place at all, or only through e-mail. this phase is often called wbt (web-based training). later, alongside the development of internet technologies, more elaborate courses began to appear, which enable better cooperation, contact with a teacher, and feedback. moreover, the content could stay up to date, due to easy updating of the sites, and became more and more multimedia-based. nowadays, e-learning has found its application not only in the educational system, but also in commercial firms, which use it in the lifelong learning of their employees. as a result, there are thousands of so-called corporate universities, which are one of the main driving forces of this method of learning. finally, the current situation in our country will be presented. the first attempts to introduce e-learning appeared around the year 1999, and since then the number of projects dealing with this topic has been increasing. within the ctu, the well-known belcom conference also takes place; this conference is focused on monitoring new trends in the area of e-learning.

1.3 definition

defining the notion of e-learning proves to be difficult, and the definitions are disunited, in spite of the term's high frequency of use. a number of definitions have arisen during the time of its existence, and many differ significantly from each other. the usual differences are whether it is a definition from a pedagogical or technological point of view, whether older forms of usage of computers for education (especially cbt) belong to e-learning, or whether e-learning means only the use of internet technologies. for better understanding we will present some definitions of this notion by contemporary experts, see [1]:

• for me, e-learning is studying by means of electronic media, be it learning through cd or the internet.
• by e-learning i understand educational and training systems (especially on-line ones).
• i understand e-learning to be education supported by modern electronic means (computers, media, internet) in distance learning, combined and full-time study.
• i imagine e-learning to be electronic education, i.e. an educational course created in an lms which is intended for self-study under the supervision of a teacher who communicates with the student in an electronic manner through this environment.
experts in their definitions of e-learning mostly prefer its more sophisticated and more exacting form.

1.4 course creation

creating an e-learning course can prove rather difficult for many teachers. in order to create such a course it is necessary to be able to create a website. this requires knowledge of how to create html documents and, in the case of websites with some active components, also knowledge of programming, e.g. in php. this is a big problem, because it cannot be expected that every teacher will be knowledgeable in the area of web technologies. fortunately, so-called web content management systems (wcms) can be used for creating e-learning courses. wcms are nowadays one of the basic means used for website creation. they are mostly web applications (programs running on the internet, accessed through a browser) which enable almost anybody, even without any programming skills, to create and maintain websites. their main purpose is to enable a visitor to log in and to add some text or other content, which will be saved and subsequently made available to other users for viewing as a website (e.g. wikipedia). however, they often involve advanced functions which would not be easy to create even for a skilled programmer. this includes various forums, discussions, inquiries and others (like those e.g. on facebook, which is a web application but not exactly a wcms). all these functions are moreover very well secured and debugged. popular management systems include, for example, drupal, joomla, and other wiki-based systems. from that it was just a small step to utilizing such systems for education-oriented website creation, and thus so-called lcms (learning content management systems) were developed. they differ from ordinary wcms by their specialization in the creation of specific sites (online educational courses) and by adding some functions specific to education. besides lcms there exist lms (learning management systems). these systems do not serve for creating e-learning courses but for operating and managing them. an lms thus enables one, for example, to follow the results of individual students, to assign homework, or to make available parts of courses created in advance in an lcms. it should be noted that many applications fulfill both roles. the best known lms/lcms in the czech republic is probably moodle (see [2]).

1.5 utilization of e-learning at universities

e-learning is being integrated as a part of education at many czech universities, created and run by lcms/lms systems. this integration of electronic education is at the present time of three types (see [3]):

• electronic support of full-time study: classic education is mixed with elements of individual work of the student with electronic sources.
• interactive elements are added to the electronic support; classic education is in some cases limited.
• the student gets access to electronic courses and whole educational cycles. classic education is limited to a minimum, or it is omitted completely. the electronic support is highly interactive, and even tests of knowledge are often realized this way.

2 research of current electronic support of education at fnspe

2.1 faculty of nuclear sciences and physical engineering

our research on the utilization of internet support for education was conducted at the faculty of nuclear sciences and physical engineering. we tried to determine how e-learning is being used here nowadays, which means are used in its creation, and how students and teachers see its future development.
2.2 history of the faculty

it is one of the smaller faculties at the czech technical university in prague. it was founded as a part of the czechoslovak nuclear program in 1955 under charles university in prague as the faculty of technical and nuclear physics. however, it quickly expanded its sphere of activity to a wide range of mathematical, physical, and chemical subjects. by a government order of 12th august 1959 the faculty was transferred from charles university to the czech technical university as its fourth faculty and was renamed the faculty of nuclear sciences and physical engineering. studying at the faculty is generally considered to be very demanding, but due to the small number of students they get almost individual care, especially during the higher grades. the faculty offers various subjects to study, but only as full-time study. as a matter of interest, the faculty is in possession of its own nuclear reactor vr-1 (called "vrabec", i.e. "sparrow"), which was put into operation in 1990.

2.3 methodology and form

the inquiries were divided into two groups: one for students, and one for the staff who use internet support. because internet support systems differ from department to department (sometimes even within a department), one teacher was chosen from each department to answer this inquiry. inquiries for teachers were realized in the form of an interview; students could either fill out their inquiries online, or some were handed printed copies.

2.4 results from the staff

the inquiry was undertaken at six departments, and as a result six members of the staff answered it. it is apparent from the results that almost all departments have their own servers and web space where teachers can upload materials for their courses. however, the manners of execution vary greatly. nevertheless, they can be divided into three groups:

• teachers have web space assigned to them and create websites on their own (or with help from students).
• teachers give the materials they want to put on the web to a server administrator, who puts them online.
• teachers use some management system for website creation.

concerning management systems, they are used by approximately half of the departments, of which one specifically uses an lms. however, general knowledge about the existence of lms is relatively low; teachers often have only a vague idea that something like this exists. on the other hand, almost everyone knew what a management system is. concerning e-learning, opinions were rather skeptical. a great number of teachers see behind this notion just a replacement of classic education in the form of presentations which are available somewhere on the internet. they do not consider this, understandably, to be an improvement in education. consequently, a purely electronic form of education instead of a classic one was renounced by almost everyone. on the other hand, using websites as a support for lectures, according to the majority, makes sense. nonetheless, they mostly consider it time-consuming and technically demanding. some of the teachers see e-learning as a convenient variant for postgraduate students and lifelong learning systems in the future.

2.5 results from students

142 students from all grades of both bachelor and master study participated in the inquiry. first, we will deal with the question of how often students encounter internet support of education.
the results (fig. 2) showed that approximately every third subject already has some sort of internet support. therefore, this manner of support is in relatively common use at the faculty, and thus the inquiry itself makes sense.

fig. 1: chart showing which wcms are used and their distribution (moodle 13 %, drupal 25 %, wiki 13 %, none 49 %).
fig. 2: chart showing the average use of e-learning during classes according to students (with internet support 31 %, without internet support 69 %).

when we look at the question dealing with the content of the websites (table 1), we see that the majority consists of static elements such as study materials, exercises, news from the subject, and links to other sites with similar content, whereas assigning homework through the web, forums, or interactive tests are rather rare.

table 1: how often specific elements of electronic support are used

element              never    once    sometimes   often   almost always   always
news from subject    10 %     20 %    51 %        15 %     3 %            1 %
study materials       0 %     13 %    30 %        41 %    11 %            5 %
related links        12 %     17 %    46 %        15 %     6 %            4 %
forum                69 %     17 %     9 %         4 %     0 %            1 %
homework assigning   15 %     26 %    36 %        19 %     4 %            0 %
exercises             6 %     11 %    64 %        19 %     0 %            0 %
interactive tests    84 %     16 %     0 %         0 %     0 %            0 %

the most common feature is study materials, so let us concentrate on the quality of these materials. this question is answered by fig. 3. from it we can see that if there are any materials, then the materials are of good quality and cover the whole subject. on the other hand, it is very rare that these materials bring something new to the subject; they are often just presentations from lectures.

fig. 3: chart showing the quality of electronic materials according to students (categories: perfect and extending the subject; good and complete; slides from lectures; poor and insufficient; they don't exist).

as a matter of interest, we tried to find out what students do with the electronic materials. here the results (fig. 4) showed that the majority of students print their materials, because reading from the monitor does not suit them.

fig. 4: chart showing students' opinions on work with electronic materials (categories: don't use them at all; don't use them too much; print them; would prefer them printed; read them from the monitor).

in the second part of the inquiry we explored what students think about e-learning and teaching based on e-learning. first we inquired in which areas of a subject e-learning could be used (fig. 5). programming-oriented subjects absolutely won this question. next followed the final question of how students would envision the integration of e-learning into study (fig. 6). here the option of mere support won by a majority; only some students would at best accept a reduction of the number of lectures per week, but students definitely do not want a purely electronic form.

fig. 5: chart showing students' opinion on which fields of study are appropriate for e-learning (students could check multiple answers; categories: math, physics, it, others).
fig. 6: chart showing the best form of integration of e-learning into study according to students (options: pure electronic education instead of lectures; fewer lectures; just support; don't use e-learning at all; somehow else).

3 conclusion

internet support of education at fnspe is not unknown. a lot of teachers at this faculty are skilled enough to create their own web support, and they often do so. moreover, at every department there is someone who is able to take care of
web creation, if there are teachers unable to do it themselves. concerning the technologies and space used, almost all departments have their own servers and do not use the means provided for web support either by the faculty (unified web space) or by the university (lms moodle). in addition, half of the departments do not use any management systems, and individual authors do not cooperate in any way. thus the websites differ in visual, functional and technological aspects basically from subject to subject. concerning opinions on e-learning and its utilization, there is agreement between students and teachers. almost everyone agreed that internet support has many advantages and should be used, but that it should definitely remain only a support, and classes should not become purely electronic.

references

[1] sak, p., skalková, j., mareš, j.: člověk a vzdělání v informační společnosti. praha: portál, 2007. isbn 978-80-7367-230-0.
[2] české stránky systému moodle, http://moodle.cz/, [online].
[3] rambousek, j.: e-learning z druhé strany. zpravodaj úvt mu, 2003, roč. 8, č. 5. issn 1212-0901.
[4] pejsar, z.: elektronické vzdělávání. univerzita j. e. purkyně v ústí nad labem, 2007. isbn 978-80-7044-968-4.
[5] vaněček, d.: informační a komunikační technologie ve vzdělávání. praha: čvut, 2008. isbn 978-80-01-04087-4.

ing. bc. david vaněček, ph.d., e-mail: vanecek@muvs.cz, czech technical university in prague, masaryk institute of advanced studies, horská 3, 128 00 praha 2, czech republic

bc. vladimír klement, e-mail: klemevl1@fjfi.cvut.cz, czech technical university in prague, faculty of nuclear sciences and physical engineering, břehová 7, 115 19 praha 1, czech republic

does a functional integral really need a lagrangian?

d. kochan

abstract

the path integral formulation of quantum mechanics (and also other equivalent formulations) depends on a lagrangian and/or hamiltonian function that is chosen to describe the underlying classical system. the arbitrariness present in this choice leads to a phenomenon called quantization ambiguity. for example, both l1 = q̇² and l2 = e^{q̇} are suitable lagrangians on the classical level (δl1 = δl2, both yielding the free equation q̈ = 0), but quantum mechanically they are diverse. this paper presents a simple rearrangement of the path integral to a surface functional integral. it is shown that the surface functional integral formulation gives a transition probability amplitude which is free of any lagrangian/hamiltonian and requires just the underlying classical equations of motion. a simple example examining the functionality of the proposed method is considered.

dedicated to my friend and colleague pavel bóna.

keywords: quantization of (non-)lagrangian systems, path vs. surface functional integral.

1 a standard path integral lore

according to feynman [1], the probability amplitude of the transition of a system from the space-time configuration (q0, t0) to (q1, t1) is given as follows:

a(q_1, t_1 | q_0, t_0) \propto \int [d\tilde\gamma]\, \exp\left\{ \frac{i}{\hbar} \int_{\tilde\gamma} p_a\, dq^a - h\, dt \right\}. \qquad (1)
here the path-summation is taken over all trajectories γ̃(t) = (q̃(t), p̃(t), t) in the extended phase space which are constrained as follows:

\tilde\gamma(t_0) = \big(\tilde q(t_0) = q_0,\ \tilde p(t_0)\ \text{arbitrary},\ t_0\big), \qquad \tilde\gamma(t_1) = \big(\tilde q(t_1) = q_1,\ \tilde p(t_1)\ \text{arbitrary},\ t_1\big).

to obtain a proper normalization of the feynman propagator, one requires:

\delta(\tilde q_0 - q_0) = \int_{\mathbb{R}^n[q_1]} dq_1\, a^*(q_1, t_1 | \tilde q_0, t_0)\, a(q_1, t_1 | q_0, t_0), \qquad \delta(q_1 - q_0) = \lim_{t_1 \to t_0} a(q_1, t_1 | q_0, t_0).

the first equation asks for the conservation of the total probability, and the second expresses the obvious fact that no evolution takes place whenever t1 approaches t0. it is a miraculous consequence (not a requirement!) of the propagator definition (1) that it satisfies an evolutionary chain rule (chapman-kolmogorov equation)

a(q_1, t_1 | q_0, t_0) = \int_{\mathbb{R}^n[q]} dq\, a(q_1, t_1 | q, t)\, a(q, t | q_0, t_0),

whose infinitesimal version is the celebrated schrödinger equation. it is a curious fact that formula (1) was not originally discovered by feynman. in his pioneering paper [2] he arrives at a functional integral in the configuration space only:

a(q_1, t_1 | q_0, t_0) \propto \int [dq]\, \exp\left\{ \frac{i}{\hbar} \int_{q(t)} l(q, \dot q, t)\, dt \right\}. \qquad (2)

later, however, it was shown that this formula represents a very special case of the most general prescription (1). formula (1) is at the heart of our further discussion.

2 dehamiltonianization

a step beyond involves eliminating the hamiltonian function h from formula (1). the price to be paid for this will be to replace the path summation therein by a surface functional integration. our aim is the transition probability amplitude between (q0, t0) and (q1, t1). let us suppose that there exists a unique classical trajectory in the extended phase space γcl(t) = (qcl(t), pcl(t), t) which connects these points (locally this assumption is always satisfied). then to any curve γ̃(t) = (q̃(t), p̃(t), t) which enters the path integration in (1), we can assign two auxiliary curves which we call λ0(s) and λ1(s). they are parameterized by s ∈ [0, 1] and specified as follows:

\lambda_0(s) = (q_0, \pi_0(s), t_0) \quad \text{where} \quad \pi_0(0) = p_{cl}(t_0),\ \pi_0(1) = \tilde p(t_0),
\lambda_1(s) = (q_1, \pi_1(s), t_1) \quad \text{where} \quad \pi_1(0) = p_{cl}(t_1),\ \pi_1(1) = \tilde p(t_1).

let us emphasize that neither λ0(s) nor λ1(s) varies with respect to the q and t coordinates in the extended phase space. they are allowed to evolve only with respect to the momentum variables. there are, of course, infinitely many such curves, but as we will see, nothing in the theory will depend on a particular choice of λ0(s) and λ1(s). using these curves one can write:

\int_{\tilde\gamma} p_a\, dq^a - h\, dt = \int_{\gamma_{cl}} p_a\, dq^a - h\, dt + \oint_{\partial\sigma} p_a\, dq^a - h\, dt, \qquad (3)

where ∂σ = γ̃ − λ1 − γcl + λ0 is a contour spanned by the four curves γ̃(t), γcl(t), λ0(s), λ1(s), counting their orientations. the first integral on the right is the classical action scl(q1, t1 | q0, t0), while the contour integral in (3) can be rearranged to represent a surface integral:

\oint_{\partial\sigma} p_a\, dq^a - h\, dt = \int_{\sigma} dp_a \wedge \left( dq^a - \frac{\partial h}{\partial p_a}\, dt \right) - \frac{\partial h}{\partial q^a}\, dq^a \wedge dt =: \int_{\sigma} \omega. \qquad (4)

the surface σ spanning the contour ∂σ is understood here as a map from the parametric space (t, s) ∈ [t0, t1] × [0, 1] to the extended phase space, i.e.

\sigma : (t, s) \mapsto \big( q^a(t, s),\ p_a(t, s),\ t(t, s) = t \big).

the partial derivatives of the initial hamiltonian function can be substituted using the velocity-momentum relations and the classical equations of motion:

\frac{\partial h}{\partial p_a} = t^{ab} p_b\ \big(= \dot q^a\big) \qquad \text{and} \qquad \frac{\partial h}{\partial q^a} = -f_a\ \big(= -\dot p_a\big).

here we consider only the physically most relevant situation.
in this case the velocities and momenta are related linearly by the metric tensor t_{ab}(q) (and its inverse) defined by the kinetic energy t = ½ t_{ab} q̇^a q̇^b = ½ t^{ab} p_a p_b of the system; then

\omega = dp_a \wedge dq^a - \left( t^{ab} p_a\, dp_b - f_a\, dq^a \right) \wedge dt. \qquad (5)

this object represents a canonical two-form in the extended phase space. it is a straightforward generalization of the standard closed two-form dθ = dp ∧ dq − dh ∧ dt to the case when the forces are not potential-generated. it is clear that for a given pair of trajectories (γ̃, γcl) there exist infinitely many σ surfaces. they form a set which we call uγ̃. since the surface integral ∫σ ω is only boundary dependent and formulas (3) and (4) are satisfied, we can write:

\exp\left\{ \frac{i}{\hbar} \int_{\tilde\gamma} p_a\, dq^a - h\, dt \right\} = \frac{e^{\frac{i}{\hbar} s_{cl}}}{\infty_{\tilde\gamma}} \int_{u_{\tilde\gamma}} [d\varsigma]\, \exp\left\{ \frac{i}{\hbar} \int_{\sigma} \omega \right\}.

here ∞γ̃ stands for the number of elements pertaining to the corresponding stringy set uγ̃. assuming no topological obstructions from the side of the extended phase space, ∞γ̃ becomes an infinite constant independent of γ̃. taking all of this into account, we can rewrite (1) as follows:

a(q_1, t_1 | q_0, t_0) \propto e^{\frac{i}{\hbar} s_{cl}} \int_{u} [d\varsigma]\, \exp\left\{ \frac{i}{\hbar} \int_{\sigma} \omega \right\}. \qquad (6)

in this formula the undetermined normalization constant ∞ was included in the integration measure [dς], and the path integral over the γ̃'s was converted to the surface functional integral, as was promised:

\int [d\tilde\gamma] \int_{u_{\tilde\gamma}} [d\varsigma] \ldots = \int_{\bigcup_{\tilde\gamma} u_{\tilde\gamma}} [d\varsigma] \ldots =: \int_{u} [d\varsigma] \ldots

the set u = ∪γ̃ uγ̃ over which the functional integration is carried out contains all extended phase space strings which are anchored to the given classical trajectory γcl. to eliminate the hamiltonian h completely, we need to express scl(q1, t1 | q0, t0) in terms of the force field. such a quantity may not exist in general; however, we will see that in special cases one can recover an appropriate analog of scl(q1, t1 | q0, t0) by requiring a certain behavior of the amplitude (6) in the limit ħ → 0.

3 functionality

a major advantage of the surface functional integral formulation rests in its explicit independence of the hamiltonian h. from the point of view of classical physics, the dynamical equations and the force fields entering them seem to be more fundamental than the hamiltonian and/or lagrangian function, which provide these equations in a relatively compact but ambiguous way, see [3]. therefore, from the conceptual point of view, formula (6) gives us the transition probability amplitude from a different and hopefully new perspective. it is clear that for potential-generated forces the surface functional integral formula (6) gives nothing new compared to (1), since in this case ω is closed and can be represented as ω = d(p_a dq^a − h dt). there are, of course, some hidden subtleties which we pass over either quickly or in silence; however, all of them are discussed in [4]. to show functionality we need to analyze either a strongly non-lagrangian system [5] or a weakly non-lagrangian one. for the sake of simplicity let us focus on the second case. to this end, let us consider a system consisting of a free particle with unit mass affected by friction:

\ddot q = -\kappa \dot q \quad \Leftrightarrow \quad \omega = dp \wedge (dq - p\, dt) - \kappa p\, dq \wedge dt.

in the considered example, the surface functional integral can be carried out explicitly (for details see [4]). at the end one arrives at a path integral in the configuration space with a surprisingly trivial result:

\int_{u} [d\varsigma]\, \exp\left\{ \frac{i}{\hbar} \int_{\sigma} \omega \right\} \propto \exp\left\{ -\frac{i}{\hbar} \int_{t_0}^{t_1} \left( \tfrac{1}{2}\dot q_{cl}^2 - \kappa p_{cl} q_{cl} \right) dt \right\} \times \int [dq]\, \exp\left\{ \frac{i}{\hbar} \int_{t_0}^{t_1} \left( \tfrac{1}{2}\dot q^2 - \kappa p_{cl} q \right) dt \right\}.
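for orientation, the classical data qcl, pcl entering this result follow from elementary integration of q̈ = −κq̇ with the boundary conditions qcl(t0) = q0, qcl(t1) = q1 (this short derivation is our own filling-in; pcl = q̇cl for unit mass):

\dot q_{cl}(t) = \dot q_{cl}(t_0)\, e^{-\kappa (t - t_0)}, \qquad q_{cl}(t) = q_0 + \dot q_{cl}(t_0)\, \frac{1 - e^{-\kappa (t - t_0)}}{\kappa},

and imposing q_{cl}(t_1) = q_1 fixes

\dot q_{cl}(t_0) = \frac{\kappa\, (q_1 - q_0)}{1 - e^{-\kappa (t_1 - t_0)}}.

substituting these expressions into the definition of scl below reproduces, after some algebra, the closed form quoted there.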
if we define scl to be

s_{cl}(q_1, t_1 | q_0, t_0) = \int_{t_0}^{t_1} \left( \tfrac{1}{2}\dot q_{cl}^2 - \kappa p_{cl} q_{cl} \right) dt, \qquad (7)

then

a(q_1, t_1 | q_0, t_0) \propto \int [dq]\, \exp\left\{ \frac{i}{\hbar} \int_{t_0}^{t_1} \left( \tfrac{1}{2}\dot q^2 - \kappa p_{cl} q \right) dt \right\},

and in the classical limit ħ → 0 we arrive at the saddle-point equation which is specified by the functional in the exponent above:

\ddot q = -\kappa \dot q_{cl}.

this differential equation is different from the equation q̈ = −κq̇ that we started with initially, but both of them coincide when a solution q(t) satisfying q(t0) = q0 and q(t1) = q1 is sought. in the present situation we obtain:

s_{cl} = \frac{\kappa}{4}\, (q_1 - q_0)\, \frac{(q_0 + 3q_1)\, e^{-\kappa t_1} - (q_1 + 3q_0)\, e^{-\kappa t_0}}{e^{-\kappa t_0} - e^{-\kappa t_1}}

and

a(q_1, t_1 | q_0, t_0) = \sqrt{\frac{\kappa}{4\pi i \hbar \tanh\frac{\kappa}{2}(t_1 - t_0)}}\; e^{\frac{i}{\hbar} s_{cl}}. \qquad (8)

here we have already employed the normalization conditions specified in the first paragraph. one can immediately verify that in the frictionless limit (κ → 0) the transition probability amplitude a(q1, t1 | q0, t0) matches the schrödinger propagator for a single free particle.

4 conclusion

the quantization of dissipative systems has been a very attractive problem from the early days of quantum mechanics. it has been revived again and again across the decades. many phenomenological techniques and effective methods have been suggested. references [6] and [7] provide a very basic list of papers dealing with this point. we have developed here a new quantization method that generalizes the conventional path integral approach. we have focused only on the nonrelativistic quantum mechanics of spinless systems. however, the generalization to field theory is straightforward. let us stress that the proposed method represents an alternative approach to [7] and possesses several qualitative advantages. for example, propagator (8) is invariant with respect to time translations, the same symmetry property which is possessed by the underlying equation of motion. moreover, it is reasonable to expect that the "dissipative quantum evolution" will not remain markovian. this fact is again confirmed, since the probability amplitude under consideration does not satisfy the memoryless chapman-kolmogorov equation mentioned in the first paragraph. finally, let us believe that the simple geometrical idea behind the surface functional integral quantization will fit within the ludwig faddeev dictum: quantization is not a science, quantization is an art!

acknowledgement

this research was supported in part by mšmt grant lc06002, vega grant 1/1008/09 and by the mšsr program for cern and international mobility. a. m. d. g.

references

[1] feynman, r. p., hibbs, a. r.: quantum mechanics and path integrals, mcgraw-hill inc., new york, 1965.
[2] feynman, r. p.: space-time approach to non-relativistic quantum mechanics, rev. of mod. phys. 20 (1948), 367–387.
[3] wigner, e. p.: do the equations of motion determine the quantum mechanical commutation relations?, phys. rev. 77 (1950), 711–712. okubo, s.: does the equation of motion determine commutation relations?, phys. rev. d 22 (1980), 919–923. henneaux, m.: equations of motion, commutation relations and ambiguities in the lagrangian formalism, annals phys. 140 (1982), 45–64.
[4] kochan, d.: direct quantization of equations of motion, acta polytechnica 47, no. 2–3 (2007), 60–67, arxiv:hep-th/0703073. kochan, d.: how to quantize forces(?): an academic essay on how the strings could enter classical mechanics, j. geom. phys. 60 (2010), 219–229, arxiv:hep-th/0612115. kochan, d.: quantization of non-lagrangian systems: some irresponsible speculations, aip conf. proc. 956 (2007), 3–8.
kochan, d.: quantization of non-lagrangian systems, int. j. mod. phys. a 24, nos. 28 & 29 (2009), 5319–5340. kochan, d.: functional integral for non-lagrangian systems, phys. rev. a 81, 022112 (2010), arxiv:1001.1863.
[5] douglas, j.: solution of the inverse problem in the calculus of variations, trans. am. math. soc. 50 (1941), 71–128.
[6] bateman, h.: on dissipative systems and related variational principles, phys. rev. 38 (1931), 815–819. feynman, r. p., vernon, f. l.: the theory of a general quantum system interacting with a linear dissipative system, annals of physics 24 (1963), 118–173. kostin, m. d.: on the schrödinger-langevin equation, j. chem. phys. 57 (1972), 3589–3591. kan, k. k., griffin, j. j.: quantized friction and the correspondence principle: single particle with friction, phys. lett. 50b (1974), 241–243. hasse, r. w.: on the quantum mechanical treatment of dissipative systems, j. math. phys. 16 (1975), 2005–2011. dekker, h.: classical and quantum mechanics of the damped harmonic oscillator, phys. rep. 80 (1981), 1–112. alicki, r.: path integrals and stationary phase approximation for quantum dynamical semigroups. quadratic systems, j. math. phys. 23 (1982), 1370–1375. caldeira, a. o., leggett, a. j.: path integral approach to quantum brownian motion, physica 121a (1983), 587–616. geicke, j.: semi-classical quantisation of dissipative equations, j. phys. a: math. gen. 22 (1989), 1017–1025. weiss, u.: quantum dissipative systems (2nd ed.), series in modern condensed matter physics, vol. 10, world scientific, singapore, 1999. tarasov, v. e.: quantization of non-hamiltonian and dissipative systems, phys. lett. a 288 (2001), 173–182, quant-ph/0311159. breuer, h. p., petruccione, f.: the theory of open quantum systems. oxford university press, oxford, 2002. razavy, m.: classical and quantum dissipative systems. imperial college press, london, 2005. chruściński, d., jurkowski, j.: quantum damped oscillator i: dissipation and resonances, ann. phys. 321 (2006), 854–874, quant-ph/0506007. gitman, d. m., kupriyanov, v. g.: canonical quantization of so-called non-lagrangian systems, eur. phys. j. c 50 (2007), 691–700, arxiv:hep-th/0605025. tarasov, v. e.: quantum mechanics of non-hamiltonian and dissipative systems. elsevier science, 2008.
[7] caldirola, p.: forze non conservative nella meccanica quantistica, nuovo cim. 18 (1941), 393–400. kanai, e.: on the quantization of the dissipative systems, prog. theor. phys. 3 (1948), 440–442. moreira, i. c.: propagators for the caldirola-kanai-schrödinger equation, lett. nuovo cim. 23 (1978), 294–298. jannussis, a. d., brodimas, g. n., streclas, a.: propagator with friction in quantum mechanics, phys. lett. 74a (1979), 6–10. das, u., ghosh, s., sarkar, p., talukdar, b.: quantization of dissipative systems with friction linear in velocity, physica scripta 71 (2005), 235–237.

mgr. denis kochan, ph.d., e-mail: kochan@fmph.uniba.sk, department of theoretical physics, fmfi, comenius university, 842 48 bratislava, slovakia

cataclysmic variables — x-rays and optical activity in v1223sgr and v709cas

r. gális, l. hric, e. kundra, f. münz

abstract

intermediate polars are a major fraction of all cataclysmic variables detected by integral in hard x-rays. these objects have recently been proposed to be the dominant x-ray source population detected near the galactic centre, and they also contribute significantly to the x-ray diffuse galactic ridge emission.
nevertheless, only 25 % of all known intermediate polars have been detected in hard x-rays. this fact can be related to the activity state of these close interacting binaries. a multi-frequency (from optical to x-ray) investigation of intermediate polars is essential for understanding the physical mechanisms responsible for the observed activity of these objects.

keywords: cataclysmic variables, optical and x-ray variability, mass transfer and accretion.

1 introduction

cataclysmic variables (cvs) manifest strong activity over the whole spectrum from radio up to γ-rays. cvs are close binary systems consisting of a hot white dwarf (wd) and a red main-sequence star of spectral type m or k, which fills the volume of its inner roche lobe and transfers matter to the vicinity of the wd [1]. the mass transfer between the components causes the observed activity of the cvs, which varies from relatively small light variations (flickering) to enormous photometric changes (outbursts of novae), on time scales that cover a wide range from very fast variability of fractions of seconds (flickering) to long-term variations of several years or decades (activity cycles). according to the strength of the wd magnetic field, the transferred matter creates an accretion disk (classical cvs) or follows the magnetic field lines and falls onto the surface of the wd (magnetic cvs). magnetic cvs are a small subset of all catalogued cv systems, and fall into two categories: polars and intermediate polars (ips). in ips, the wd magnetic field (10^6–10^7 g) is not strong enough to disrupt the disc entirely (as in the case of polars) and it simply truncates the inner part of the disc [2]. the accretion flow is channelled down towards the magnetic poles and onto the wd surface. increased interest in cvs was aroused after the discovery of the very hard x-ray emission (up to 80 kev) of many ips. when the transferred material impacts the wd atmosphere, a strong shock forms above its surface [3]. the temperature in the post-shock region (psr) can be very high, and the plasma is cooled mainly via optically thin bremsstrahlung radiation in the hard x-ray band [4]. as we showed in our previous analysis [5], the broad-band spectra (3–100 kev) of the studied ips can be well fitted by a thermal bremsstrahlung model with psr temperature kt ≈ (20–25) kev. reflection of the bremsstrahlung photons on an optically thick cold medium can also contribute to the hard x-ray spectrum [6]. in the case of v2400 oph, a significant emission excess detected at ≈ 26 kev can be caused by the reflection of x-rays from the surface of the wd [5]. in total, 32 cvs (and 2 symbiotic systems) were detected in the data of integral [7–9, 5]. this is more than was expected, and it represents ≈ 5 % of all detections of this space observatory. the sample of ips detected in the (20–40) kev energy band has 23 members, representing more than 70 % of all cvs detected by integral [5]. ips are the most luminous and the hardest x-ray sources among accreting wds. in hard x-rays, these objects seem to be more luminous (up to a factor of 10) than polars [10]. in strongly magnetized (b ≥ 10^7 g) polar systems, cyclotron radiation is an important cooling mechanism, which suppresses the high-temperature bremsstrahlung emission, whilst it should be negligible for ips. this could explain why most of the cvs observed in the hard x-ray band are ips.
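as a rough illustration of the spectral shape behind such fits (our own toy sketch, not the authors' fitting code; the gaunt factor, absorption and reflection are deliberately ignored), an optically thin thermal bremsstrahlung photon spectrum dn/de ∝ e^{-1} exp(−e/kt) can be evaluated as follows:

import numpy as np

def brems_photon_spectrum(e_kev, kt_kev):
    """toy optically thin thermal bremsstrahlung photon spectrum,
    dn/de ~ e**-1 * exp(-e/kt); gaunt factor ignored."""
    return np.exp(-e_kev / kt_kev) / e_kev

def band_flux(kt_kev, lo_kev, hi_kev, n=2000):
    """energy flux integrated over a band (simple trapezoidal rule)."""
    e = np.linspace(lo_kev, hi_kev, n)
    f = e * brems_photon_spectrum(e, kt_kev)
    return np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(e))

# hardness ratio between two of the bands used in this paper,
# for psr temperatures at the ends of the quoted range
for kt in (20.0, 25.0):
    ratio = band_flux(kt, 25, 40) / band_flux(kt, 15, 25)
    print(f"kt = {kt:.0f} kev: (25-40)/(15-25) kev flux ratio = {ratio:.2f}")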
short-term soft x-ray modulations have been observed in the orbital period, in the spin period of the wd, or at a beat between the two. no significant modulation has been found so far in the 20–30 kev light curves [11]. most cvs seem to have persistent soft gamma-ray fluxes. nevertheless, the sample detected by integral represents only 25 % of all known ips [5]. some ips are not detectable even if we have significant exposure time (more than 4 msec) for these sources. this fact can be related to the activity state of these interacting binaries. to understand this relation, it is necessary to study the correlation between the activity state and the x-ray emission of ips.

fig. 1: left panel: integral/ibis flux curves of v1223sgr in the corresponding energy bands. the arrows represent 3σ upper limits. right panel: integral/omc light curve of v1223sgr in the optical spectral band. the crosses represent average magnitude values in the corresponding seasons.

2 observations, analysis and results

we used all publicly available observational data from integral/jem-x and integral/ibis to study the possible variability of the selected ips in the hard x-ray and soft γ-ray spectral bands. in addition, observations from integral/omc were used to look for long-term variability of these objects in the optical band. the observational data used in our analysis was processed by integral's offline standard analysis package osa 7. the libraries grid.py and ms_suite.py developed by fm were also used for the data processing and for preparation of the final mosaics.

2.1 intermediate polar v1223sgr

ip v1223sgr is a bright x-ray source (4u 1849-31) with possible x-ray flare activity. a short-term burst has also been detected from this system in the optical [12]. these outbursts are probably a result of disk instabilities or an increase in the mass transfer, but there is no correlation between the optical and x-ray burst activity. moreover, episodes of a deep low state (a decrease by several magnitudes) of v1223sgr in the optical band have also been detected [13]. the observational material for ip v1223sgr consisted of 132 (integral/jem-x) and 1375 (integral/ibis) individual pointings obtained in the course of almost three years (mjd 52710.38–53809.25). the overall integral/ibis mosaics (total exposure time 1405.5 ksec) showed that v1223sgr is detectable up to 80 kev with ≈ 5σ significance. the broad-band (3–100) kev spectrum was very well fitted by a bremsstrahlung model with temperature kt = 23.7 (+1.4/−1.3) kev [5]. during the monitored period, the mean fluxes of this object were (103.0 ± 4.0) × 10^−12 erg cm^−2 s^−1, (46.4 ± 1.4) × 10^−12 erg cm^−2 s^−1, (15.1 ± 1.5) × 10^−12 erg cm^−2 s^−1 and (12.3 ± 2.0) × 10^−12 erg cm^−2 s^−1 in the (15–25) kev, (25–40) kev, (40–60) kev and (60–80) kev bands, respectively. inspection of the data showed that the observations were obtained in the course of 7 separate seasons. as the next step, we split the data according to these seasons to investigate the long-term x-ray/γ-ray variability. the flux curves are displayed in figure 1 (left panel). it is clear that during the monitored period the fluxes of v1223sgr were long-term variable (especially in the softer bands), with a significant drop around mjd ≈ 53650. the optical light curve of v1223sgr (integral/omc) is shown in figure 1 (right panel). we can see that the optical brightness of this source was long-term variable, too. moreover, these light variations are strongly correlated with the changes in the (15–25) kev, (25–40) kev and (40–60) kev spectral bands, with correlation coefficients 0.81, 0.82 and 0.89, respectively.
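correlations of this kind can be reproduced with a standard pearson coefficient on the season-averaged values; a minimal sketch follows (python; the arrays below are illustrative placeholders, not the measured fluxes, which are shown in fig. 1):

import numpy as np

# placeholder season-averaged series for 7 seasons; substitute the values
# read off the integral/ibis mosaics and the integral/omc light curve
xray_flux = np.array([110.0, 104.0, 99.0, 103.0, 96.0, 62.0, 71.0])  # 1e-12 cgs
optical_mag = np.array([13.1, 13.2, 13.3, 13.2, 13.4, 14.3, 14.1])   # mag

# magnitudes grow when the source fades, so convert to a linear flux scale
# before correlating with the x-ray flux
optical_flux = 10.0 ** (-0.4 * optical_mag)
r = np.corrcoef(xray_flux, optical_flux)[0, 1]
print(f"pearson correlation coefficient: {r:.2f}")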
our detailed period analysis of the integral/omc data did not yield any significant period, only a partial detection of the orbital period porb = 3.37 hrs [14]. this was probably caused by the complex intrinsic variability (flickering) of v1223sgr, possible period variations, or the drastic change in the brightness of this object during the monitored period. however, a more detailed analysis is not possible due to the unsuitable time distribution of the data.

fig. 2: left panel: integral/jem-x flux curves of v1223sgr in the corresponding energy bands. the arrows represent 3σ upper limits. right panel: integral/ibis phase diagram of v1223sgr in the (15–25) kev band folded with the orbital period (3.37 hrs).

we also prepared overall mosaics using all available data from integral/jem-x. the total exposure time was 80.6 ksec. the mean fluxes of v1223sgr during the monitored period were (2.8 ± 0.4) × 10^−12 erg cm^−2 s^−1, (3.7 ± 0.3) × 10^−12 erg cm^−2 s^−1, (2.6 ± 0.4) × 10^−12 erg cm^−2 s^−1 and (4.9 ± 1.1) × 10^−12 erg cm^−2 s^−1 in the (3–6) kev, (6–10) kev, (10–15) kev and (15–25) kev bands, respectively. we also split the data into 7 seasons to investigate the long-term x-ray variability of v1223sgr. the flux curves are displayed in figure 2 (left panel). as we can see, the corresponding errors are too large for subsequent analysis, and therefore we can conclude that the integral/jem-x fluxes of v1223sgr were persistent within their errors in the monitored period. typically, soft x-ray modulations of ips were observed in the orbital period, in the spin period of the wd, or at a beat between the two. however, ips are close binary systems with orbital periods in the order of hours, and these objects are not detectable on these time scales by integral/ibis. as the next step, we attempted to investigate the possible short-term variability of v1223sgr in the integral/ibis data. we prepared a unique method of folding over a particular phase interval on the basis of the proper time intervals from the individual science windows. our method applied good time intervals (gtis) according to the (orbital or other) phase bin and created phase-resolved mosaics (assuming sufficient exposure) of a periodic source. a phase diagram of the fluxes of v1223sgr in the (15–25) kev band, folded with the orbital period and constructed using the data from the time interval mjd (52917.17–52926.84), is shown in figure 2 (right panel).
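the gti-based folding step can be sketched as follows (python; the function name and the reduction of science windows to plain start/stop arrays are our own simplification, not part of osa):

import numpy as np

def phase_bin_of_windows(t_start, t_stop, period, t_ref, n_bins):
    """assign each science-window interval to an orbital phase bin;
    windows straddling a bin boundary would have to be split into
    separate good time intervals, which this sketch omits."""
    t_mid = 0.5 * (t_start + t_stop)
    phase = ((t_mid - t_ref) / period) % 1.0
    return (phase * n_bins).astype(int)

# toy usage: ~2 ks science windows folded on porb = 3.37 h ~ 0.1404 d (mjd)
t_start = np.arange(52917.17, 52926.84, 0.03)
t_stop = t_start + 0.023
bins = phase_bin_of_windows(t_start, t_stop, 0.1404, 52917.17, n_bins=8)
for b in range(8):
    print(f"phase bin {b}: {np.sum(bins == b)} science windows")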
2.2 intermediate polar v709cas

this x-ray source was recognized as an ip after it had been detected in the rosat all sky survey as rxj0028.8+5917 and was identified with a 14th magnitude blue star, v709cas. the broad-band (3–100) kev spectrum of this object was fitted by a bremsstrahlung model with temperature kt = 29.6 ± 2.5 kev, covering factor cf = 0.31 ± 0.04, column density nh = (58 ± 15) × 10^22 cm^−2 and iron line energy 6.5 ± 0.2 kev [15]. the analysis showed that accounting for compton scattering does not significantly change the obtained mass of the wd (mwd ≈ 0.9 m⊙) in the case of v709cas [15]. the broad-band (3–100) kev spectrum from integral was well fitted by a thermal bremsstrahlung model with post-shock temperature kt = 24.4 (+1.5/−1.4) kev [5]. our analysis of all available observational data for v709cas showed that this source is detectable up to 100 kev. the hard x-ray/soft γ-ray fluxes are not persistent, and the flux curves indicate that the brightness of this ip increased by a factor of ≈ 2 from mjd 52700 to mjd 53700 in the (15–25) kev energy band (figure 3).

fig. 3: integral/ibis light curves of v709cas in the corresponding energy bands. the arrows represent 3σ upper limits.

3 conclusions

we analysed all available observational data from integral for ips v1223sgr and v709cas. our analysis of the data from integral/ibis showed that the fluxes of these objects are long-term variable, mainly in the (15–25) kev and (25–40) kev bands. moreover, this hard x-ray/soft γ-ray variability is correlated with the changes in the optical spectral band in the case of v1223sgr. our analysis revealed a deep flux drop around mjd ≈ 53650, observed in both the x-ray band and the optical band for this intermediate polar. a significant part of the optical emission from ips is produced by a hot spot, where the matter from the donor star interacts with the outer rim of the accretion disk. the x-ray emission is produced by the interaction of the accreting matter with the wd surface. the emission in both the optical and the x-ray bands is thus related to the mass transfer, and the observed variations are therefore probably caused by changes in the mass accretion rate. we are preparing a photometric campaign to obtain long-term homogeneous observations (to cover whole activity cycles) as well as sets of observations with high time resolution (to cover the orbital cycles in detail) of selected cvs, mainly as a follow-up to the integral observations. simultaneous analysis of multi-frequency observations (from optical to x-ray) enables a complex study of the physical mechanisms related to the mass transfer in these interacting binaries.

acknowledgement

the international gamma-ray astrophysics observatory (integral) is a european space agency mission with instruments and a science data centre funded by the esa member states (especially by the pi countries: denmark, france, germany, italy, spain, switzerland), the czech republic and poland, and with the participation of russia and the usa. this study was supported by a project of the slovak academy of sciences, vega grant no. 2/0078/10.

references

[1] warner, b.: cataclysmic variable stars. cambridge: cambridge university press, 1995.
[2] patterson, j.: publ. astron. soc. pacific, 1994, 106, 209.
[3] aizu, k.: prog. theor. phys. 1973, 49, 1184.
[4] king, a. r., lasota, j. p.: mon. not. r. astron. soc. 1979, 188, 653.
[5] gális, r., eckert, d., paltani, s., münz, f., kocka, m.: baltic astronomy, 2009, 18, 321.
[6] van teeseling, a., kaastra, j. s., heise, j.: astron. astrophys. 1996, 312, 186.
[7] bird, a. j., malizia, a., bazzano, a., barlow, e. j., et al.: astrophys. j. suppl. ser. 2007, 170, 175.
[8] kniazev, a., revnivtsev, m., sazonov, s., burenin, r., tekola, a.: the astronomer's telegram, 2008, 1488.
[9] masetti, n., parisi, p., palazzi, e., landi, r., et al.: the astronomer's telegram, 2008, 1620.
[10] chanmugam, g., ray, a., singh, k. p.: astrophys. j. 1991, 375, 600.
[11] barlow, e. j., knigge, c., bird, a. j., dean, a., et al.: mon. not. r. astron. soc. 2006, 372, 224.
[12] van amerongen, s., van paradijs, j.: astron. astrophys. 1989, 219, 195.
[13] garnavich, p., szkody, p.: publ. astron. soc. pacific, 1988, 100, 1522.
[14] jablonski, f., steiner, j. e.: astrophys. j. 1987, 323, 672.
[15] suleimanov, v., poutanen, j., falanga, m., werner, k.: astron. astrophys. 2008, 491, 525.

r. gális, department of theoretical physics and astrophysics, institute of physics, faculty of sciences, p. j.
šafárik university, park angelinum 9, 04001 košice, slovakia

l. hric, e. kundra, astronomical institute of the slovak academy of sciences, 05960 tatranská lomnica, slovakia

f. münz, faculty of science, masaryk university, kotlářská 2, 611 37 brno, czech republic

propagators of generalized schrödinger equations related by first-order supersymmetry

a. schulze-halberg

abstract

we construct an explicit relation between propagators of generalized schrödinger equations that are linked by a first-order supersymmetric transformation. our findings extend and complement recent results on the conventional case [1].

keywords: generalized schrödinger equation, propagator, supersymmetry.

1 introduction

the formalism of supersymmetry (susy) is a well-known tool for identifying integrable cases of the schrödinger equation and for generating the corresponding solutions. the principal idea is to relate two schrödinger equations and their solutions via a supersymmetrical (or darboux [3]) transformation, such that known solutions of one equation can be mapped onto solutions of the second equation. there is a vast amount of literature on the susy formalism and its applications; for details we refer the reader to [5, 2] and references therein. while the concept of the susy formalism as a method for generating solutions is popular, it is less known that two schrödinger equations related by a susy transformation (susy partners) share more properties than the link between their solutions. in particular, susy establishes a connection between the propagators and the green's functions of the underlying partner equations: the propagators are linked by means of an integral expression [1], while the green's functions satisfy a simple trace formula [9, 6]. it is interesting to note that this trace formula persists under generalization of the schrödinger equation to the effective-mass case [8] or to a generalized sturm-liouville problem [7]. due to the close relation between the green's function and the propagator, one would expect that the propagator relation found in [1] also extends to generalized cases of the schrödinger equation. this is in fact true, as will be shown in this note. we restrict ourselves to first-order susy transformations of generalized schrödinger equations, a brief review of which, and of related theory, is given in section 2. subsequently, the explicit integral formula that links the propagators of our susy partner equations is derived in section 3.

2 preliminaries

in the following we briefly summarize basic facts about generalized schrödinger equations, their susy formalism, propagators and green's functions.

generalized schrödinger equation. we consider the following generalized sturm-liouville problem on the real interval (a, b), equipped with dirichlet boundary conditions:

f(x)\,\psi''(x) + f'(x)\,\psi'(x) + [e\,h(x) - v(x)]\,\psi(x) = 0, \quad x \in (a, b), \qquad (1)
\psi(a) = \psi(b) = 0. \qquad (2)

here f, h, v are smooth, real functions, with f, h positive and bounded in a and b. the constant e will be referred to as the energy, and for solutions of (1), (2) that belong to the discrete spectrum, e stands for the spectral value. any solution ψ of (1), (2) belonging to a value e from the discrete spectrum is located in the weighted hilbert space l²_h(a, b) with weight function h [4]. the lowest value of the discrete spectrum will be called the ground state and denoted by e0, with corresponding solution ψ0.
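two special cases of (1) are worth keeping in mind (this brief illustration is our own addition; the cases themselves are only named in the next paragraph):

\psi'' + (e - v)\,\psi = 0 \quad (f = h = 1), \qquad \left(\frac{\psi'}{m}\right)' + (e - v)\,\psi = 0 \quad \left(f = \tfrac{1}{m},\ h = 1\right),

i.e. the stationary schrödinger equation and an effective-mass form, while writing e\,h - v = e - [v - e\,(h - 1)] exhibits the case h ≠ 1 as a potential depending linearly on the energy e.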
the interval (a, b) can be unbounded, that is, a or b can represent minus infinity or infinity, respectively (however, if a and/or b are finite, then we require f, h, v to be continuous there). we see that the problem (1), (2) can be singular, which means that its spectrum can admit a continuous part. equation (1) will be referred to as the generalized schrödinger equation, since its special cases are frequently encountered in quantum mechanics, such as the schrödinger equation for effective mass or with a linearly energy-dependent potential. in the quantum-mechanical context, e denotes the energy associated with a solution ψ, and v stands for the potential.

generalized susy formalism. we summarize results from [10]. the boundary-value problem (1), (2) can be linked to another problem of the same kind by means of the susy transformation method. consider

f(x)\,\phi''(x) + f'(x)\,\phi'(x) + [e\,h(x) - u(x)]\,\phi(x) = 0, \quad x \in (a, b), \qquad (3)
\phi(a) = \phi(b) = 0, \qquad (4)

where the same settings imposed for (1), (2) apply. clearly, a solution φ = φ(x) and the potential u = u(x) are in general different from their respective counterparts ψ and v. now, suppose that ψ and u are solutions of the boundary-value problem (1), (2) and of equation (1) at real energies e and λ ≤ e, respectively. define the susy transformation of ψ as

d_{u,x}\,\psi(x) = \sqrt{\frac{f(x)}{h(x)}}\; \frac{w(u, \psi)(x)}{u(x)} = \sqrt{\frac{f(x)}{h(x)}} \left[ -\frac{u'(x)}{u(x)}\,\psi(x) + \psi'(x) \right], \qquad (5)

where w(u, ψ) stands for the wronskian of the functions u, ψ, and the second index of d denotes the variable to which the derivatives in the wronskian apply. the function φ = d_{u,x}ψ as defined in (5) solves the boundary-value problem (3), (4) if the potential u is given in terms of its counterpart v as follows:

u(x) = v(x) - 2 f(x)\,\frac{d^2}{dx^2}\big\{ \log[u(x)] \big\} + \left[ \frac{f(x)\,h'(x)}{h(x)} - f'(x) \right] \frac{u'(x)}{u(x)} - \frac{f''(x)}{2} + \frac{[f'(x)]^2}{4 f(x)} + \frac{3 f(x)\,[h'(x)]^2}{4 h^2(x)} - \frac{f(x)\,h''(x)}{h(x)}. \qquad (6)

note that (5) remains valid when multiplied by a constant, which can be used for normalization. now, depending on the choice of the auxiliary solution u in (5), the discrete spectrum of problem (3), (4) can be affected in three possible ways: if λ = e0 and u = ψ0, then e0 is removed from the spectrum of (3), (4). the opposite case, the creation of a new spectral value λ < e0, happens if the auxiliary solution u does not fulfill the boundary conditions (4). finally, the spectra of both problems (1), (2) and (3), (4) are the same if we pick λ < e0 and a u that fulfills only one of the boundary conditions (2).

propagator and green's function. the propagator governs a quantum system's time evolution. for a stationary schrödinger equation, the propagator k has the defining property

\psi(x, t) = \exp(-i e t)\,\psi(x) = \int_{(a,b)} k(x, y, t)\,\psi(y)\,dy, \qquad (7)

as the solution ψ of the time-dependent schrödinger equation is related to its stationary counterpart by the exponential factor used for separating the time and spatial variables. suppose problem (1), (2) admits a complete set of eigenfunctions (ψn), n = 0, 1, 2, . . . , m, m ∈ ℕ0, where m can stand for infinity, and (ψk), k ∈ ℝ, belonging to the discrete and the continuous part of the spectrum, respectively. then the propagator k has the representation

k(x, y, t) = h(y) \left[ \sum_{n=0}^{m} \psi_n(x)\,\exp(-i e_n t)\,\psi_n(y) + \int_{\mathbb{R}} \psi_k(x)\,\exp(-i k^2 t)\,\psi_k(y)\,dk \right], \qquad (8)

where en and k² stand for the spectral values belonging to the discrete and continuous spectrum, respectively.
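as an aside, a minimal numerical check of the transformation (5) and the partner potential (6) in the conventional case f = h = 1 (our own illustration, using the harmonic oscillator v = x² in units with 2m = ħ = 1, auxiliary ground state u = exp(−x²/2) at λ = 1, and first excited state ψ1 at e1 = 3):

import numpy as np

x = np.linspace(-4.0, 4.0, 2001)
dx = x[1] - x[0]

def d(g):
    """numerical derivative on the grid."""
    return np.gradient(g, dx)

u = np.exp(-x**2 / 2)            # auxiliary solution at lambda = 1
psi1 = x * np.exp(-x**2 / 2)     # state to transform, e1 = 3

# susy transformation (5) with f = h = 1: phi = psi' - (u'/u) psi
phi = d(psi1) - d(u) / u * psi1

# partner potential (6) with f = h = 1: u_pot = v - 2 (log u)'' = x^2 + 2
u_pot = x**2 - 2 * d(d(np.log(u)))

# residual of the partner equation phi'' + (e1 - u_pot) phi = 0;
# it should vanish up to discretization error away from the grid edges
res = d(d(phi)) + (3.0 - u_pot) * phi
print(np.max(np.abs(res[200:-200])))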
the green's function g of problem (1), (2) has two equivalent representations [4], both of which we will use here. in order to state the first representation, let ψ0,l and ψ0,r be solutions of equation (1) that fulfill the unilateral boundary conditions ψ0,l(a) = ψ0,r(b) = 0. the wronskian w(ψ0,l, ψ0,r) of these functions is given by

w(\psi_{0,l}, \psi_{0,r})(x) = \frac{c_0}{f(x)}, \qquad (9)

where c0 is a constant that depends on the explicit form of ψ0,l and ψ0,r. now we can give the first representation of the green's function g for our boundary-value problem (1), (2):

g(x, y) = -\frac{1}{c_0} \left[ \psi_{0,l}(y)\,\psi_{0,r}(x)\,\theta(x - y) + \psi_{0,l}(x)\,\psi_{0,r}(y)\,\theta(y - x) \right], \qquad (10)

where c0 is the constant from (9) and θ stands for the heaviside distribution. the second representation of the green's function g can be obtained as follows, provided problem (1), (2) admits a complete set of solutions:

g(x, y) = \sum_{n=0}^{m} \frac{\psi_n(x)\,\psi_n(y)}{e_n - e} + \int_{\mathbb{R}} \frac{\psi_k(x)\,\psi_k(y)}{k^2 - e}\,dk, \qquad (11)

where the notation is the same as in (8).

3 propagators related by generalized susy

in order to obtain a relation between the propagators of the two boundary-value problems (1), (2) and (3), (4), we take the propagator k1 of the second problem and express it through quantities related to the first problem. for the sake of simplicity we assume for now that the two boundary-value problems have the same discrete spectrum and that both of them admit a complete set of solutions belonging to a discrete and a continuous part of the spectrum. furthermore, we assume that the solutions of problem (1), (2) are real-valued functions. this is no restriction, as equation (1) involves only real functions.

3.1 general case

the construction of our propagator k1 is similar to the way it was done in [1]. according to representation (8) we have

k_1(x, y, t) = h(y) \left[ \sum_{n=0}^{m} \phi_n(x)\,\exp(-i e_n t)\,\phi_n(y) + \int_{\mathbb{R}} \phi_k(x)\,\exp(-i k^2 t)\,\phi_k(y)\,dk \right]. \qquad (12)

the notation is the same as in (8), with ψn and ψk replaced by φn and φk, respectively. now, since the solutions φn, φk are obtained by means of the susy transformation (5) from ψn, ψk, we can rewrite (12) as follows, taking into account normalization constants ln and lk, respectively:

k_1(x, y, t) = h(y)\, d_{u,x} d_{u,y} \left[ \sum_{n=0}^{m} l_n^2\, \psi_n(x)\,\exp(-i e_n t)\,\psi_n(y) + \int_{\mathbb{R}} l_k^2\, \psi_k(x)\,\exp(-i k^2 t)\,\psi_k(y)\,dk \right].

in the next step we apply the defining property (7) to the previously obtained expression:

k_1(x, y, t) = h(y)\, d_{u,x} d_{u,y} \left[ \sum_{n=0}^{m} l_n^2 \int_{(a,b)} k_0(x, z, t)\,\psi_n(z)\,dz\;\psi_n(y) + \int_{\mathbb{R}} l_k^2 \int_{(a,b)} k_0(x, z, t)\,\psi_k(z)\,dz\;\psi_k(y)\,dk \right]. \qquad (13)

we choose the free constants as ln = 1/(en − λ)^{1/2}, n = 0, . . . , m, and lk = 1/(k² − λ)^{1/2}, k ∈ ℝ, where λ is the discrete spectral value associated with the auxiliary function u. after regrouping terms, these settings render (13) in the form

k_1(x, y, t) = h(y)\, d_{u,x} d_{u,y} \left\{ \int_{(a,b)} k_0(x, z, t) \left[ \sum_{n=0}^{m} \frac{\psi_n(z)\,\psi_n(y)}{e_n - \lambda} + \int_{\mathbb{R}} \frac{\psi_k(z)\,\psi_k(y)}{k^2 - \lambda}\,dk \right] dz \right\} \qquad (14)
= h(y)\, d_{u,x} d_{u,y} \int_{(a,b)} k_0(x, z, t)\, g_0(z, y)\,dz, \qquad (15)

where g0 is the green's function of our boundary-value problem (1), (2) in its form (11), taken at energy λ. relation (15) gives the final connection between the propagators of our two boundary-value problems, provided they admit the same discrete spectrum. in the case that problem (3), (4) admits an additional discrete spectral value λ with corresponding solution φ−1, formula (12) must be modified as follows:

k_1(x, y, t) = h(y) \left[ \sum_{n=0}^{m} \phi_n(x)\,\exp(-i e_n t)\,\phi_n(y) + \phi_{-1}(x)\,\exp(-i \lambda t)\,\phi_{-1}(y) + \int_{\mathbb{R}} \phi_k(x)\,\exp(-i k^2 t)\,\phi_k(y)\,dk \right]. \qquad (16)

from this point, the additional term is maintained until the final relation between the propagators k0 and k1 results as

k_1(x, y, t) = h(y) \left\{ d_{u,x} d_{u,y} \left[ \int_{(a,b)} k_0(x, z, t)\, g_0(z, y)\,dz \right] + \phi_{-1}(x)\,\exp(-i \lambda t)\,\phi_{-1}(y) \right\},

where the green's function g0 is to be taken at energy λ.
finally, if problem (3), (4) admits one discrete spectral value less than its initial counterpart, formula (12) remains the same except that the sum starts at one instead of at zero. this is maintained until formula (14), where the summation now starts at one. expression (15) then turns into

$$K_1(x,y,t) = h(y)\,D_{u,x}D_{u,y}\int_{(a,b)} K_0(x,z,t)\,\lim_{E\to E_0}\left[G_0(z,y)-\frac{\psi_0(z)\,\psi_0(y)}{E_0-E}\right]\mathrm{d}z, \qquad (17)$$

where the green's function g0 is to be taken at energy e. in summary, the last three expressions stand for the final relations between the propagators of our two boundary-value problems. clearly, in the conventional case h = 1, the above expressions reduce correctly to the known relations [1]. in general, our expressions cannot be simplified any further, unless more information on the auxiliary solution u is known. we will now study such a case.

3.2 special case: ground state as auxiliary solution

let us assume that the auxiliary solution u is chosen to be the ground state ψ0, associated with the spectral value e0, of problem (1), (2). according to our brief susy review in section 2, this choice implies that the discrete spectrum of problem (3), (4) will no longer contain the value e0. we will now show that the corresponding relation between the propagators (17), where u is replaced by ψ0, can be simplified considerably. while the general procedure of simplification follows a similar path as in the conventional case [1], one must keep track of the nonconstant factor in front of (5). before we start simplifying (17), we observe that the limit and the operator $D_{\psi_0,y}$ in (17) commute, because

$$D_{\psi_0,y}\lim_{E\to E_0}\left[G_0(z,y)-\frac{\psi_0(z)\,\psi_0(y)}{E_0-E}\right] = \lim_{E\to E_0}\left[\sum_{n=1}^{m}\frac{\psi_n(z)\,D_{\psi_0,y}\psi_n(y)}{E_n-E} + \int_{\mathbb{R}}\frac{\psi_k(z)\,D_{\psi_0,y}\psi_k(y)}{k^2-E}\,\mathrm{d}k\right];$$

the term with n = 0 drops out of the sum, as $D_{\psi_0,y}\,\psi_0(y)=0$, so we obtain

$$D_{\psi_0,y}\lim_{E\to E_0}\left[G_0(z,y)-\frac{\psi_0(z)\,\psi_0(y)}{E_0-E}\right] = \lim_{E\to E_0}\left[D_{\psi_0,y}\,G_0(z,y)\right].$$

this property will be useful for rewriting (17). note that for the sake of simplicity we divide by the factor h:

$$\frac{1}{h(y)}\,K_1(x,y,t) = D_{\psi_0,x}\int_{(a,b)} K_0(x,z,t)\,\lim_{E\to E_0}\left[D_{\psi_0,y}G_0(z,y)\right]\mathrm{d}z$$

$$= D_{\psi_0,x}\int_{(a,y)} K_0(x,z,t)\,\lim_{E\to E_0}\left[-\frac{1}{C_0}\,D_{\psi_0,y}\psi_{0,L}(y)\,\psi_{0,R}(z)\right]\mathrm{d}z + D_{\psi_0,x}\int_{(y,b)} K_0(x,z,t)\,\lim_{E\to E_0}\left[-\frac{1}{C_0}\,\psi_{0,L}(z)\,D_{\psi_0,y}\psi_{0,R}(y)\right]\mathrm{d}z. \qquad (18)$$

we will now determine the limits that the integrals contain. to this end, first note that according to (5) the wronskian $W(\psi_0,\psi_{0,R})$ is involved in the limits; we find it by means of the differential equation that it obeys. we have

$$W(\psi_0,\psi_{0,R})'(y) = \frac{\mathrm{d}}{\mathrm{d}y}\left[\psi_0(y)\,\psi_{0,R}'(y)-\psi_{0,R}(y)\,\psi_0'(y)\right] = \psi_0(y)\,\psi_{0,R}''(y)-\psi_0''(y)\,\psi_{0,R}(y)$$

$$= \frac{h(y)}{f(y)}\,\psi_0(y)\,\psi_{0,R}(y)\,(E_0-E) - \frac{f'(y)}{f(y)}\,W(\psi_0,\psi_{0,R})(y), \qquad (19)$$

where in the last step we replaced the second derivatives by means of the generalized schrödinger equation (1). equation (19) can be solved with respect to the wronskian, giving

$$W(\psi_0,\psi_{0,R})(y) = \frac{E_0-E}{f(y)}\int_{(y,b)} h(z)\,\psi_0(z)\,\psi_{0,R}(z)\,\mathrm{d}z,$$

where a constant of integration has been set to zero. thus, we have

$$D_{\psi_0,y}\,\psi_{0,R}(y) = \sqrt{\frac{1}{f(y)\,h(y)}}\;\frac{E_0-E}{\psi_0(y)}\int_{(y,b)} h(z)\,\psi_0(z)\,\psi_{0,R}(z)\,\mathrm{d}z,$$

so that the term inside the limit in (18) reads

$$-\frac{1}{C_0}\,\psi_{0,L}(z)\,D_{\psi_0,y}\,\psi_{0,R}(y) = -\frac{E_0-E}{C_0}\,\sqrt{\frac{1}{f(y)\,h(y)}}\;\frac{\psi_{0,L}(z)}{\psi_0(y)}\int_{(y,b)} h(z)\,\psi_0(z)\,\psi_{0,R}(z)\,\mathrm{d}z. \qquad (20)$$

according to (9), we have $C_0 = f(b)\,W(\psi_{0,L},\psi_{0,R})(b)$, where the right-hand side could have been evaluated at any point of [a, b]; the choice b will prove convenient in subsequent calculations. thus, taking into account the fact that ψ0,r(b) = 0, we have $C_0 = -f(b)\,\psi_{0,L}(b)\,\psi_{0,R}'(b)$, which we will now express by means of an integral.
to this end, consider our generalized schrödinger equation (1) and its derivative with respect to e, multiplied by $\frac{\partial}{\partial E}\psi_{0,L}$ and by $\psi_{0,L}$, respectively:

$$\left\{f(z)\,\psi_{0,L}''(z) + f'(z)\,\psi_{0,L}'(z) + \left[E\,h(z)-V(z)\right]\psi_{0,L}(z)\right\}\frac{\partial}{\partial E}\psi_{0,L}(z) = 0,$$

$$\left[f(z)\,\frac{\partial}{\partial E}\psi_{0,L}''(z) + f'(z)\,\frac{\partial}{\partial E}\psi_{0,L}'(z) + h(z)\,\psi_{0,L}(z) + E\,h(z)\,\frac{\partial}{\partial E}\psi_{0,L}(z) - V(z)\,\frac{\partial}{\partial E}\psi_{0,L}(z)\right]\psi_{0,L}(z) = 0.$$

taking the difference of these two equations yields the following result:

$$f'(z)\,\psi_{0,L}'(z)\,\frac{\partial}{\partial E}\psi_{0,L}(z) - f'(z)\,\psi_{0,L}(z)\,\frac{\partial}{\partial E}\psi_{0,L}'(z) - h(z)\,\psi_{0,L}^2(z) + f(z)\,\psi_{0,L}''(z)\,\frac{\partial}{\partial E}\psi_{0,L}(z) - f(z)\,\psi_{0,L}(z)\,\frac{\partial}{\partial E}\psi_{0,L}''(z) = 0.$$

rewriting and integrating this equation gives

$$\int_{(a,b)} h(z)\,\psi_{0,L}^2(z)\,\mathrm{d}z = \int_{(a,b)} \frac{\mathrm{d}}{\mathrm{d}z}\left[f(z)\,\psi_{0,L}'(z)\right]\frac{\partial}{\partial E}\psi_{0,L}(z)\,\mathrm{d}z - \int_{(a,b)} \psi_{0,L}(z)\,\frac{\mathrm{d}}{\mathrm{d}z}\left[f(z)\,\frac{\partial}{\partial E}\psi_{0,L}'(z)\right]\mathrm{d}z.$$

the two integrals on the right-hand side can each be reformulated using integration by parts:

$$\int_{(a,b)} \frac{\mathrm{d}}{\mathrm{d}z}\left[f(z)\,\psi_{0,L}'(z)\right]\frac{\partial}{\partial E}\psi_{0,L}(z)\,\mathrm{d}z = \frac{\partial}{\partial E}\psi_{0,L}(z)\,f(z)\,\psi_{0,L}'(z)\Bigg|_a^b - \int_{(a,b)} f(z)\,\psi_{0,L}'(z)\,\frac{\partial}{\partial E}\psi_{0,L}'(z)\,\mathrm{d}z,$$

$$\int_{(a,b)} \psi_{0,L}(z)\,\frac{\mathrm{d}}{\mathrm{d}z}\left[f(z)\,\frac{\partial}{\partial E}\psi_{0,L}'(z)\right]\mathrm{d}z = f(z)\,\psi_{0,L}(z)\,\frac{\partial}{\partial E}\psi_{0,L}'(z)\Bigg|_a^b - \int_{(a,b)} f(z)\,\psi_{0,L}'(z)\,\frac{\partial}{\partial E}\psi_{0,L}'(z)\,\mathrm{d}z.$$

observe that both expressions involve the same integral term on their right-hand sides. therefore, if we substitute into their difference above, we obtain

$$\int_{(a,b)} h(z)\,\psi_{0,L}^2(z)\,\mathrm{d}z = f(b)\,\psi_{0,L}'(b)\,\frac{\partial}{\partial E}\psi_{0,L}(b) - f(b)\,\psi_{0,L}(b)\,\frac{\partial}{\partial E}\psi_{0,L}'(b), \qquad (21)$$

where in the last step we substituted the limits of integration and made use of the fact that ψ0,l(a) = 0. now we are ready to compute the limits in (18). according to (20), the first limit reads

$$\lim_{E\to E_0}\left[-\frac{1}{C_0}\,D_{\psi_0,y}\psi_{0,L}(y)\,\psi_{0,R}(z)\right] = \lim_{E\to E_0}\left[\frac{E_0-E}{f(b)\,\psi_{0,L}(b)\,\psi_{0,R}'(b)}\,\sqrt{\frac{1}{f(y)\,h(y)}}\;\frac{\psi_{0,L}(z)}{\psi_0(y)}\int_{(y,b)} h(z)\,\psi_0(z)\,\psi_{0,R}(z)\,\mathrm{d}z\right]$$

$$= \lim_{E\to E_0}\left[\frac{E_0-E}{\psi_{0,L}(b)}\right]\cdot\lim_{E\to E_0}\left[\frac{1}{f(b)\,\psi_{0,R}'(b)}\,\sqrt{\frac{1}{f(y)\,h(y)}}\;\frac{\psi_{0,L}(z)}{\psi_0(y)}\int_{(y,b)} h(z)\,\psi_0(z)\,\psi_{0,R}(z)\,\mathrm{d}z\right]$$

$$= -\left(\frac{\partial}{\partial E}\psi_{0,L}(b)\Bigg|_{E=E_0}\right)^{-1}\lim_{E\to E_0}\left[\frac{1}{f(b)\,\psi_{0,R}'(b)}\,\sqrt{\frac{1}{f(y)\,h(y)}}\;\frac{\psi_{0,L}(z)}{\psi_0(y)}\int_{(y,b)} h(z)\,\psi_0(z)\,\psi_{0,R}(z)\,\mathrm{d}z\right]. \qquad (22)$$

in the next step we substitute the first factor using our relation (21), which we need to evaluate at e = e0. since our generalized schrödinger equation (1) can only have two linearly independent solutions at e = e0, the three solutions ψ0,l, ψ0,r and ψ0 become linearly dependent there. in particular, they must all fulfill the same boundary conditions (2), which implies for the right-hand side of (21) that

$$\left[f(b)\,\psi_{0,L}'(b)\,\frac{\partial}{\partial E}\psi_{0,L}(b) - f(b)\,\psi_{0,L}(b)\,\frac{\partial}{\partial E}\psi_{0,L}'(b)\right]\Bigg|_{E=E_0} = f(b)\,\psi_{0,L}'(b)\,\frac{\partial}{\partial E}\psi_{0,L}(b)\Bigg|_{E=E_0}.$$

now, (21) can be solved for $\frac{\partial}{\partial E}\psi_{0,L}(b)$:

$$\frac{\partial}{\partial E}\psi_{0,L}(b)\Bigg|_{E=E_0} = \left[\frac{1}{f(b)\,\psi_{0,L}'(b)}\int_{(a,b)} h(z)\,\psi_{0,L}^2(z)\,\mathrm{d}z\right]\Bigg|_{E=E_0}.$$

we use this to replace the first factor of (22) and get

$$\lim_{E\to E_0}\left[-\frac{1}{C_0}\,D_{\psi_0,y}\psi_{0,L}(y)\,\psi_{0,R}(z)\right] = -\lim_{E\to E_0}\left[\frac{\psi_{0,L}'(b)}{\psi_{0,R}'(b)}\,\sqrt{\frac{1}{f(y)\,h(y)}}\;\frac{\psi_{0,L}(z)}{\psi_0(y)}\,\frac{\int_{(y,b)} h(z)\,\psi_0(z)\,\psi_{0,R}(z)\,\mathrm{d}z}{\int_{(a,b)} h(z)\,\psi_{0,L}^2(z)\,\mathrm{d}z}\right].$$

for taking the limit we recall that at e = e0 the functions ψ0,l, ψ0,r and ψ0 become linearly dependent. the respective proportionality constants cancel out and we obtain

$$\lim_{E\to E_0}\left[-\frac{1}{C_0}\,D_{\psi_0,y}\psi_{0,L}(y)\,\psi_{0,R}(z)\right] = -\sqrt{\frac{1}{f(y)\,h(y)}}\;\frac{\psi_0(z)}{\psi_0(y)}\,\frac{\int_{(y,b)} h(z)\,\psi_0^2(z)\,\mathrm{d}z}{\int_{(a,b)} h(z)\,\psi_0^2(z)\,\mathrm{d}z}.$$

the second limit in (18) is found in a similar fashion, yielding

$$\lim_{E\to E_0}\left[-\frac{1}{C_0}\,\psi_{0,L}(z)\,D_{\psi_0,y}\psi_{0,R}(y)\right] = \sqrt{\frac{1}{f(y)\,h(y)}}\;\frac{\psi_0(z)}{\psi_0(y)}\,\frac{\int_{(a,y)} h(z)\,\psi_0^2(z)\,\mathrm{d}z}{\int_{(a,b)} h(z)\,\psi_0^2(z)\,\mathrm{d}z};$$

note that there is no negative sign in front.
now our two limits can be plugged into the propagator (18):

$$\frac{1}{h(y)}\,K_1(x,y,t) = D_{\psi_0,x}\int_{(a,y)} K_0(x,z,t)\left[-\sqrt{\frac{1}{f(y)\,h(y)}}\;\frac{\psi_0(z)}{\psi_0(y)}\,\frac{\int_{(y,b)} h(w)\,\psi_0^2(w)\,\mathrm{d}w}{\int_{(a,b)} h(w)\,\psi_0^2(w)\,\mathrm{d}w}\right]\mathrm{d}z + D_{\psi_0,x}\int_{(y,b)} K_0(x,z,t)\left[\sqrt{\frac{1}{f(y)\,h(y)}}\;\frac{\psi_0(z)}{\psi_0(y)}\,\frac{\int_{(a,y)} h(w)\,\psi_0^2(w)\,\mathrm{d}w}{\int_{(a,b)} h(w)\,\psi_0^2(w)\,\mathrm{d}w}\right]\mathrm{d}z. \qquad (23)$$

in order to join the two terms, we rewrite the inner integral over (y, b) as a difference of integrals over (a, b) and (a, y), respectively:

$$\frac{1}{h(y)}\,K_1(x,y,t) = D_{\psi_0,x}\int_{(a,y)} K_0(x,z,t)\left[-\sqrt{\frac{1}{f(y)\,h(y)}}\;\frac{\psi_0(z)}{\psi_0(y)}\right]\mathrm{d}z + D_{\psi_0,x}\int_{(a,y)} K_0(x,z,t)\left[\sqrt{\frac{1}{f(y)\,h(y)}}\;\frac{\psi_0(z)}{\psi_0(y)}\,\frac{\int_{(a,y)} h(w)\,\psi_0^2(w)\,\mathrm{d}w}{\int_{(a,b)} h(w)\,\psi_0^2(w)\,\mathrm{d}w}\right]\mathrm{d}z + D_{\psi_0,x}\int_{(y,b)} K_0(x,z,t)\left[\sqrt{\frac{1}{f(y)\,h(y)}}\;\frac{\psi_0(z)}{\psi_0(y)}\,\frac{\int_{(a,y)} h(w)\,\psi_0^2(w)\,\mathrm{d}w}{\int_{(a,b)} h(w)\,\psi_0^2(w)\,\mathrm{d}w}\right]\mathrm{d}z$$

$$= -\sqrt{\frac{1}{f(y)\,h(y)}}\;\frac{1}{\psi_0(y)}\,D_{\psi_0,x}\int_{(a,y)} K_0(x,z,t)\,\psi_0(z)\,\mathrm{d}z + \sqrt{\frac{1}{f(y)\,h(y)}}\;\frac{\int_{(a,y)} h(w)\,\psi_0^2(w)\,\mathrm{d}w}{\psi_0(y)\int_{(a,b)} h(w)\,\psi_0^2(w)\,\mathrm{d}w}\,D_{\psi_0,x}\int_{(a,b)} K_0(x,z,t)\,\psi_0(z)\,\mathrm{d}z. \qquad (24)$$

since according to (5) we have

$$D_{\psi_0,x}\int_{(a,b)} K_0(x,z,t)\,\psi_0(z)\,\mathrm{d}z = \mathrm{e}^{-iE_0t}\,D_{\psi_0,x}\,\psi_0(x) = 0,$$

relation (24) turns after multiplication by h into its final form

$$K_1(x,y,t) = -\sqrt{\frac{h(y)}{f(y)}}\;\frac{1}{\psi_0(y)}\,D_{\psi_0,x}\int_{(a,y)} K_0(x,z,t)\,\psi_0(z)\,\mathrm{d}z. \qquad (25)$$

alternatively, in (23) one can write the inner integral over (a, y) as a difference of integrals over (a, b) and (y, b), respectively. this gives a result slightly different from (25):

$$K_1(x,y,t) = \sqrt{\frac{h(y)}{f(y)}}\;\frac{1}{\psi_0(y)}\,D_{\psi_0,x}\int_{(y,b)} K_0(x,z,t)\,\psi_0(z)\,\mathrm{d}z.$$

it can be seen immediately that for a conventional schrödinger equation (1) with f = 1 and h = 1, our results reduce correctly to the known findings [1].

4 concluding remarks

we have obtained a relation between propagators of generalized sturm-liouville problems that are connected by means of susy transformations. our results complement and generalize former findings for the conventional schrödinger equation [1]. while in the latter reference propagators related by higher-order susy transformations are also found to satisfy simple interrelations, the corresponding situation in the generalized case is subject to ongoing research.

references

[1] pupasov, a. m., samsonov, b. f., günther, u.: exact propagators for susy partners, j. phys. a 40 (2007), 10557–10589.
[2] cooper, f., khare, a., sukhatme, u.: supersymmetry and quantum mechanics, phys. rep. 251 (1995), 267–388.
[3] darboux, m. g.: sur une proposition relative aux équations linéaires, comptes rendus acad. sci. paris 94 (1882), 1456–1459.
[4] duffy, d. g.: green's functions with applications, chapman and hall, new york, 2001.
[5] fernandez, d. j. c.: supersymmetric quantum mechanics, arxiv: quant-ph/0910.0192.
[6] samsonov, b. f., sukumar, c. v., pupasov, a. m.: susy transformation of the green function and a trace formula, j. phys. a 38 (2005), 7557–7565.
[7] schulze-halberg, a.: green's functions and trace formulas for generalized sturm-liouville problems related by darboux transformations, j. math. phys. 51 (2010), 053501 (13pp).
[8] pozdeeva, e., schulze-halberg, a.: trace formula for green's functions of effective mass schrödinger equations and n-th order darboux transformations, internat. j. modern phys. a 23 (2008), 2635–2647.
[9] sukumar, c. v.: green's functions, sum rules and matrix elements for susy partners, j. phys. a 37 (2004), 10287–10295.
[10] suzko, a. a., schulze-halberg, a.: darboux transformations and supersymmetry for the generalized schrödinger equations in (1+1) dimensions, j. phys. a 42 (2009), 295203–295217.
axel schulze-halberg
e-mail: xbataxel@gmail.com
department of mathematics and actuarial science
indiana university northwest
3400 broadway, gary, in 46408, usa

thermal – oxidation stability of insulating fluids

p. semančík, r. cimbala, k. záliš

abstract: this paper deals with the thermal – oxidation stability of insulating fluids tested by the čez – orgrez method [1]. it describes the principle of the test method, the preparation of the experiment, and the measurement of the thermal – oxidation stability of an insulating oil.

keywords: oxidation stability, oil life, dielectric dissipation factor, čez – orgrez test method.

1 introduction

insulating oils should have stable, high-quality properties, not only in the original state, but also during their time in operation. the stability of insulating oils is of elementary importance during operation, because they work at high temperatures, usually in the presence of oxygen, so they should be oxidation resistant. the oxidation of oil increases its acidity and the content of sediments; low sediment values indicate high oxidation stability, leading to long oil life. minimizing the creation of sediments, the dielectric dissipation factor, the corrosion of metals, and electric failures maximizes the insulating ability of the oil. oxidation stability is measured by iec 61125, method c [2, 3], or by the čez – orgrez method [1]. oxidation stability is an indicator that allows us to set stricter limits for oils in special applications. in some countries, stricter limits or other requirements and tests are imposed.

2 čez – orgrez methodology

during the test, a sample of new or reclaimed oil is exposed to conditions simulating a load application, similar to the load during operation. individual factors are simplified. high-quality parameters are periodically monitored until sediments form and the oil is no longer usable.

2.1 laboratory instruments, devices and chemicals [1]

• glass sulphonation flask, 6000 ml
• separation trap, 250 ml
• tubing resistant to high temperatures and oils (teflon, silicon)
• air compressor
• laboratory drying chamber with temperature regulation (100 °c)
• plastic syringe for taking samples (150 ml)
• clean copper wire without surface treatment (unvarnished)
• analytical scales, precision 0.001 g
• measuring cylinder, 2000 ml

2.2 preparation of the experiment, and testing process [1]

before the test begins, the initial high-quality parameters are determined and 500 ml is taken away from the oil sample. a measuring cylinder is used to measure out 5000 ml of the oil marked out for testing into the sulphonation flask. the required quantity of copper wire is added to the oil – 10 g (approximately 0.1 cm²/g of oil) to each litre of oil. the flask with oil is put into the laboratory drying chamber, at a temperature of 100 °c. using tubes, possibly made of glass, air is conducted into the samples, and the condensed fluid is delivered into the separation trap outside the drying chamber. a control test and verification of the temperature regulated in the drying chamber is carried out every day. some of the samples are removed at weekly intervals to determine the values of selected parameters (acidity, interfacial tension and content of inhibitors – once per week; dielectric dissipation factor – once per 3 weeks). the test is completed when sediments insoluble in n-heptane are present, when there are no more samples for continuing the test, or after 840 hours of testing.

2.3 comparison with the čsn en 61125 standard, method c

this method describes a test for interpreting the oxidation stability of new hydrocarbon insulating fluids under accelerated conditions, without reference to whether antioxidant additives are present.
test conditions: the filtered fluid sample oxidizes under the following conditions [3]:

• oil weight: 25 g ± 0.1 g,
• oxidation gas: air,
• gas flow speed: 0.15 l/h ± 0.015 l/h,
• test period:
  – 164 h for uninhibited oil,
  – 500 h for inhibited oil,
• accelerator: copper wires, 28.6 cm² ± 0.3 cm², in the measured oil.

3 experimental results

the thermal – oxidation stability test of the insulating oil was made using the čez – orgrez method [1]. a power transformer insulating oil was used as the sample; further information about the sample is confidential to the manufacturer and to the plant operator. during the experiment, the data interpreted in table 1 was measured [4]. the principle of the test is based on air oxidation of the measured oil with an added accelerator at a given temperature. the test was carried out under the following conditions [1]:

• temperature: 100 °c,
• volume of oil samples: 5 l,
• bubbling of the oil with dried and refined air in larger amounts than are needed for the reaction of the oil with the air,
• accelerator: copper wires in quantities of approximately 0.1 cm²/g of measured oil.

the separation trap, placed outside the drying chamber, gathers the condensed fluid released during the test. the values measured in dependence on the length of the test periods are recorded in table 1, which shows the degradation process of the oil until the moment when sediments insoluble in n-heptane form, or until the test is terminated. the graphic dependencies in figs. 1–4 were made from the measured values monitoring the individual parameters.

monitored parameters [2–5]:

• tgδ – dielectric dissipation factor,
• εr – dielectric permittivity,
• ρ – volume resistivity,
• čk – acidity,
• σ – interfacial tension of oil against water,
• qi – content of inhibitors.

table 1: values measured by the čez – orgrez test method [4]

test period (h) | tgδ (×10⁻²) at 20/70/90 °c | εr (–) at 20/70/90 °c | ρ (gΩ·m) at 20/70/90 °c | čk | σ | qi | sediments insoluble in n-heptane
0 | 0.018 / 0.075 / 0.138 | 2.257 / 2.199 / 2.175 | 2203.8 / 598.1 / 232.9 | 0.004 | 51 | 0.35 | –
168 | – | – | – | 0.006 | 50 | 0.37 | –
336 | 0.040 / 0.126 / 0.385 | 2.255 / 2.159 / 2.136 | 856.8 / 178.5 / 107.1 | 0.006 | 49 | 0.38 | –
504 | – | – | – | 0.007 | 49 | 0.37 | –
672 | – | – | – | 0.010 | 47 | 0.29 | –
840 | 0.076 / 0.832 / 1.473 | 2.263 / 2.207 / 2.182 | 561.4 / 147.3 / 122.8 | 0.012 | 47 | 0.29 | –
1008 | – | – | – | 0.011 | 45 | 0.27 | –
1176 | – | – | – | 0.016 | 44 | 0.27 | –
1344 | 0.122 / 0.886 / 1.866 | 2.262 / 2.190 / 2.165 | 235.8 / 108.8 / 48.3 | 0.016 | 43 | 0.25 | –
1512 | – | – | – | 0.016 | 42 | 0.24 | –
1680 | – | – | – | 0.021 | 41 | 0.23 | –
1848 | 0.200 / 2.509 / 3.761 | 2.268 / 2.191 / 2.164 | 169.3 / 52.0 / 24.7 | 0.026 | 41 | 0.23 | –
2016 | – | – | – | 0.030 | 39 | 0.23 | –
2184 | – | – | – | 0.037 | 38 | 0.22 | –
2352 | 0.168 / 3.300 / 5.978 | 2.271 / 2.202 / 2.181 | 127.4 / 15.2 / 9.5 | 0.048 | 37 | 0.16 | –
2520 | – | – | – | 0.065 | 37 | 0.13 | –
2688 | – | – | – | 0.079 | 35 | 0.08 | –
2856 | 0.483 / 7.890 / 17.648 | 2.300 / 2.244 / 2.227 | 51.2 / 2.7 / 1.0 | 0.132 | 28 | 0.00 | presence

čk – mg koh/g, σ – mn/m, qi – % hmot., f – sample filtered using white tape filter paper (6–6.8 μm) before the measurement.
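the degradation trends of table 1 can be re-plotted directly. the following python sketch is ours and purely illustrative (the arrays hold the values transcribed from table 1; it is not the tool used to produce figs. 1–4). the steady drop of the inhibitor content qi, followed by the sharp rise in acidity once the inhibitor is exhausted, is clearly visible:

```python
import matplotlib.pyplot as plt

# values transcribed from table 1 (cez - orgrez test of the oil sample)
t_h = [0, 168, 336, 504, 672, 840, 1008, 1176, 1344, 1512, 1680,
       1848, 2016, 2184, 2352, 2520, 2688, 2856]            # test period [h]
ck  = [0.004, 0.006, 0.006, 0.007, 0.010, 0.012, 0.011, 0.016, 0.016,
       0.016, 0.021, 0.026, 0.030, 0.037, 0.048, 0.065, 0.079, 0.132]  # [mg KOH/g]
sig = [51, 50, 49, 49, 47, 47, 45, 44, 43, 42, 41, 41, 39, 38,
       37, 37, 35, 28]                                      # [mN/m]
qi  = [0.35, 0.37, 0.38, 0.37, 0.29, 0.29, 0.27, 0.27, 0.25, 0.24,
       0.23, 0.23, 0.23, 0.22, 0.16, 0.13, 0.08, 0.00]      # [% by weight]

fig, ax1 = plt.subplots()
ax1.plot(t_h, ck, "o-", label="acidity ck [mg KOH/g]")
ax1.plot(t_h, qi, "s-", label="inhibitor content qi [% wt.]")
ax1.set_xlabel("test period t [h]")
ax1.legend(loc="upper left")

ax2 = ax1.twinx()   # interfacial tension on a second axis
ax2.plot(t_h, sig, "^--", color="gray", label="interfacial tension [mN/m]")
ax2.legend(loc="upper right")
plt.show()
```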
4 conclusion

the oxidation stability of an oil is evaluated by the period of time until sediments form that are soluble in the insulating oil (insoluble in n-heptane), or by the creation of sediments that are insoluble in the insulating oil. in the test of thermal-oxidation stability, the submitted sample of insulating oil degraded in 2856 hours. this was documented by the presence of sediments insoluble in n-heptane. the thermal – oxidation stability test was carried out using the čez – orgrez method, and the graphic dependencies were assessed from the monitored oil parameters.

acknowledgments

this work was supported by the scientific grant agency of the ministry of education of the slovak republic and the slovak academy of sciences under project vega no. 1/3142/06 and apvv-20-006005.

references

[1] sop 2-32/72: tepelně – oxidační stabilita izolačných kapalin podle metodiky orgrez a.s. orgrez a.s., brno, czech republic, 2004.
[2] stn en 60296: kvapaliny na elektrotechnické aplikácie. nepoužité minerálne izolačné oleje pre transformátory a spínače. slovenský ústav technickej normalizácie, bratislava, 2005.
[3] čsn en 61125: nové izolační kapaliny na bázi uhlovodíků. zkušební metody na vyhodnocování oxidační stálosti. český normalizační institut, 1996.
[4] orgrez, a.s.: protokol o měření – zkouška tepelně-oxidační stálosti izolačního oleje – metodika čez-orgrez. orgrez a.s., divize elektrotechnických laboratoří, praha, 2006.
[5] čez, a.s.: profylaktika minerálních izolačních olejů. podniková norma 00/08 rev0.

fig. 1: dependence of tgδ (×10⁻²) on the test period t [h], at 20 °c, 70 °c and 90 °c

fig. 2: dependence of εr (–) on the test period t [h], at 20 °c, 70 °c and 90 °c

fig. 3: dependence of the volume resistivity ρ (gΩ·m) on the test period t [h], at 20 °c, 70 °c and 90 °c

fig. 4: dependence of čk (mg koh/g), qi (% hmot.) and σ (mn/m) on the test period t [h]

ing. peter semančík
phone: +421 556 023 560
e-mail: peter.semancik@tuke.sk

doc. ing. roman cimbala, ph.d.
phone: +421 556 023 557
e-mail: roman.cimbala@tuke.sk

department of electric power engineering
technical university in košice
faculty of electrical engineering and informatics
mäsiarska 74, 041 20 košice, slovak republic

doc. ing. karel záliš, csc.
phone: +420 224 352 369
e-mail: zalis@fel.cvut.cz

department of electrical power engineering
czech technical university in prague
faculty of electrical engineering
technická 2, 166 27 prague 6, czech republic
plasma surface treatment of powder materials — process and application

monika pavlatová¹, marta horáková², jan hladík², petr špatenka¹

¹ surfacetreat, a. s., na lukách 669, 511 01 turnov, czech republic
² technical university of liberec, faculty of mechanical engineering, department of material science, studentská 2, 461 17 liberec, czech republic

correspondence to: monika.pavlatova@surface-treat.cz

abstract: polyolefin particles are hydrophobic, and this prevents their use for various applications. plasma treatment is an environment-friendly polyolefin hydrophilisation method. we developed an industrial-scale plant for plasma treatment of particles as small as micrometers in diameter. materials such as pe waxes, uhmwpe and powders for rotomolding production were tested to verify their new surface properties. we achieved significantly increased wettability of the particles, so that they are very easily dispersed in water without agglomeration, and their higher surface energy is retained even after sintering in the case of rotomolding powders.

keywords: plasma treatment, polyolefin powders, wettability, surface energy, agglomeration.

1 introduction

cold plasma surface modification has been established as an effective and low-cost technology for modifying the surface properties of polymer materials without altering the bulk material. modification of plastic packaging foils, plastic headlights, or automobile bumpers prior to metallization or painting are examples of industrial applications. another application is in surface cleaning processes for removing contaminants such as grease, greasy acids, dust, and even bacteria [1]. although powders find a range of applications in many branches of industry, e.g.
painting, biotechnology, and fillings for composite materials, plasma modification of the powder surface has not found as wide an application as the modification of flat solid materials. this is due to problems connected with the three-dimensional geometry of the particles, the need for sophisticated mixing to overcome the aggregation phenomenon, and the large surface area of the powders that are to be modified. the surfacetreat company has developed an original setup that enables powder hydrophilisation on an industrial scale. the aim of this paper is to present the process of powder surface modification and, mainly, the new properties and the specific applications of the modified material.

2 plasma technology

plasma discharge is a widely-used method for changing the properties of surfaces, and a simple and easy approach for changing the chemical constitution of raw materials. basically, a plasma can be generated by putting energy into a gas by heating, applying a voltage, injecting electromagnetic waves, or a combination of these methods. due to the increasing energy, the gas particles (atoms and molecules) begin to move faster in the three spatial dimensions, and simultaneously they acquire higher rotational and vibrational energies. the particles are pulled apart due to the collisions between them, and they give rise to positively charged ions and to electrons that escape from the atoms and molecules. the principle of the plasma treatment process involves creating active particles by transporting the working gas through the plasma discharge, which changes the surface properties of the material in various ways [2,3]. the bonding of new functional groups onto the molecular chain during plasma surface treatment of polymer materials is one way in which this type of treatment can be used. the most frequently applied process is polymer hydrophilisation by oh groups. the resulting effect is a change in the surface energy, resulting in higher wettability, a better ability of the material to disperse, and also enhancement of the adhesion properties of the polymer to other materials.

3 system st 650

surfacetreat developed the st 650 pilot plant, which produces hydrophilic powder materials. the plant is based on a vacuum chamber equipped with a mechanical stirrer. the plasma is excited by two microwave power sources with a total power of up to 2 kw. the development of this industrial-scale plant for powder treatment is described in detail in a previous paper [4]. two-shift operation produces up to 3 tons per week, which is a sufficient amount to cover the delivery of material to start production, and also serves for semi-industrial tests of materials for various applications. the st 1000 machine is now under development, with a production of at least 70 kg per hour estimated for a 500 micrometer particle size. the advantages of plasma treatment of powders are summarized as follows:

• surface properties can be modified without altering the bulk material
• high stability of the treatment effect
• great efficiency, together with wide variability of the process
• an environment-friendly surface modification process

project no. fr-ti1/176 "application of plasma modification of micro powder in industrial scale" is being implemented within the framework of the tip programme for industrial research and development, supported by the ministry of industry and trade of the czech republic.
the project tests the modification effect on various types of powder materials and their industrial applications, and is developing an industrial-scale machine prototype in cooperation with the technical university of liberec, czech republic.

4 application examples

the plasma-modified powder has already found various applications. two characteristic industrial applications of the hydrophilized material, mainly used as an additive and/or as a raw material for rotomolding, will be described in the next section. we also participate in and support research in other applications of plasma surface treatment.

5 additives

• polyethylene wax

dispersivity is a major problem with polyolefin powders. the aim is to increase the surface energy of the particles in order to ensure dispersion without creating clusters. this problem is usually solved by using chemical dispersing agents, or by using chemically oxidized variants for application in dispersions. examples are polyolefin waxes, which are used as a rub-resistant additive for printing inks, as matting agents modifying the surface properties of paints & coatings, as lubricants for plastics, and especially as dispersing aids for highly demanding pigment concentrates based on pe and pp. we tested a pe wax with a particle size of 8 μm. first, it was necessary to find suitable treatment conditions in laboratory equipment; the plasma treatment process is very variable in terms of pressure, working gas, and treatment time, and the mixing system is also very important. then appropriate conditions were found in the production apparatus. samples with different grades of modification were prepared and tested immediately after treatment. in this case, the wetting behavior was observed after intensive mixing (2000 rpm for 1 hour using a dissolver disc, ⌀ = 60 mm). after stopping the stirrer and waiting for 30 min, we evaluated the settling behavior. estimation of the wettability: 1–2 cm of water phase was tolerated. all samples were compared with the commercially-produced oxidized wax (ox) benchmark system, and also with the unmodified powder (zero sample). we achieved very good dispersion, comparable with or even better than the benchmark (ox), see figure 1. the material is now being tested by the end-user. additional testing in the form of a long-term stability test is still running, and technological adjustments are being made to the production apparatus to ensure high quality of the treatment as well as effectiveness of the process.

figure 1: from the left — unmodified powder (zero sample), plasma-modified powder (111024 10), chemically oxidized powder (3715)

• ultra-high molecular weight polyethylene (uhmwpe)

many types of powders are classified as additives for paints, and are used, e.g., to influence the abrasion properties or the appearance. we have developed a treatment for an uhmwpe powder with particles 30 μm in size. very good wettability was achieved after plasma treatment, see figure 2. the wetting behavior is observed in a dispersion in distilled water to show the effect immediately after treatment. the plasma-treated powder is dispersed without any agglomerations, unlike the unmodified powder. the wettability of the powder is then practically determined from dynamic capillary rise measurements using a tensiometer, according to the washburn method [5]. the suction of the material is determined as the time dependence of the weight of the penetrating benzyl alcohol at 25 °c.
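the washburn evaluation just described boils down to comparing slopes: the washburn relation m²(t) = (c ρ² γ cos θ / η) t makes the squared sorbed mass linear in time, with a slope proportional to cos θ for a fixed liquid and powder packing. the following python sketch is ours and purely illustrative; the arrays hold hypothetical tensiometer readings, not the measured data of this paper:

```python
import numpy as np

def washburn_slope(t_s, m_g):
    """least-squares slope (through the origin) of m^2 versus t; by the
    washburn relation this slope is proportional to cos(theta) for a fixed
    liquid and a fixed powder packing, so slope ratios compare wettability."""
    t = np.asarray(t_s, dtype=float)
    m2 = np.asarray(m_g, dtype=float) ** 2
    return np.sum(t * m2) / np.sum(t * t)

# hypothetical readings (time [s], sorbed mass [g]) -- illustration only
t = [0, 5, 10, 15, 20, 25, 30]
m_untreated = [0.000, 0.010, 0.014, 0.017, 0.020, 0.022, 0.024]
m_treated   = [0.000, 0.014, 0.019, 0.024, 0.027, 0.031, 0.033]

k0 = washburn_slope(t, m_untreated)
k1 = washburn_slope(t, m_treated)
print(f"relative wettability gain: {100 * (k1 - k0) / k0:.0f} %")
```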
the wettability of the powder was about 90 % higher than the wettability of the unmodified sample. various grades of treatment were tested.

figure 2: from the left — plasma-modified powder in 2 grades, unmodified powder

enhancing the wettability of the powder results in effective dispersion of the powder in a liquid environment. this enables the powder to be applied as a filler in a water-based paint. ecolor profi base c, type 9005, was used for the test, and was coated on a sheet metal plate as a layer of pure varnish, and as a layer with added modified powder. the modified material was immediately and very easily dispersed, without agglomeration or air bubbles. then we tested the influence of the addition of uhmwpe to the water-based paint on the abrasion resistance. a typical result of an abrasion test performed on the elcometer 1720 abrasion, scrubbing & washability tester according to norm en iso din 11998 is presented in figures 3 and 4. the tester was set to the fixed speed mode at 37 cycles per minute. the specimens were exposed to the abrasion tester, and the number of cycles until the paint layer disrupts was counted. for the tests, a ratio of 10 % plasma-modified powder by volume was used in the varnish.

figure 3: pure varnish — 30 000 cycles, boar hair brush tool

figure 4: varnish filled with 10 % by volume plasma-modified uhmwpe — 135 000 cycles, boar hair brush tool

the layer made of pure varnish was able to withstand 30 000 cycles using the boar hair brush tool before the first wear-through occurred, see figure 3. the layer of varnish with 10 % by volume plasma-modified powder was able to withstand 135 000 cycles using the hair brush tool without any wear-through, see figure 4. adding unmodified pe into the paint also raised the abrasion resistance of the coating, but there were significant problems with admixing the unmodified filler into the paint. the paint layer with plasma-modified powder was smooth and homogeneous. we are working in cooperation with lacquer and paint producers, and various plasma-modified materials are being tested by customers, using certified laboratories.

6 powders for rotomolding

plasma surface treatment of solids is currently widely employed; it can replace chemical wet processes or flame treatment, which are used to enhance adhesion properties. we are able to go even further, and to treat the material in powder form and then sinter a product with new surface properties using a rotomolding technique. the surface is directly prepared for painting, gluing, printing or foaming without any additional pretreatment. the characteristics of the material, i.e. impact strength, escr and other major specifications, are the same as for traditional materials, and the processing also remains unchanged [6]. the enhanced surface hydrophilicity remains even after sintering. we tested various types of pe powders suitable for rotomolding. the effect of treatment was verified by measuring the surface energy after sintering, using test inks and test pens by arcotest gmbh. surface energy is a decisive criterion for the adhesion of printing ink, glue, varnish, etc., to many plastic and metal surfaces. it is given in mn/m. a value of 38 mn/m is often mentioned as a general limit: if the surface energy level is below this value, the adhesion is poor; above this value, the adhesion is good or sufficient [7].
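this limit-based reading of the dyne-pen test is simple enough to express in a few lines. the sketch below is ours and purely illustrative; the readings in it are hypothetical examples, not the measured values reported in table 1:

```python
# reading a dyne-pen test against the 38 mN/m adhesion limit quoted in [7];
# the readings below are hypothetical, for illustration only
LIMIT = 38  # mN/m

readings = {
    "sample sintered from unmodified powder": 30,
    "sample sintered from plasma-modified powder": 42,
}

for sample, surface_energy in readings.items():
    verdict = ("good or sufficient adhesion" if surface_energy >= LIMIT
               else "poor adhesion")
    print(f"{sample}: {surface_energy} mN/m -> {verdict}")
```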
our samples sintered from modified powder have surface energy values about 10 mn/m higher than the zero (unmodified) samples, always depending on the type of material and the modification grade, see table 1. the particle size of the tested materials was approximately 500 μm. all materials mentioned here reached or exceeded the required limit value of 38 mn/m.

as practical tests for industrial applications, we tested the adhesion of a water-based paint layer on a surface sintered from the modified material, without any additional pre-treatment of the surface. then an adhesion test was performed using the cross cut method, according to iso 2409. the plasma-modified material samples produced very good results: the sample sintered from unmodified powder received classification no. 5 (figure 5), and the sample sintered from plasma-modified powder received classification no. 0 or 1 (figure 6).

table 1: overview of the surface energy measured on a surface sintered from unmodified and modified powder for different types of material

material / producer | type | density [g/cm³] | mfi [g/10 min] | surface energy before treatment [mn/m] (arcotest pens/inks) | surface energy after treatment [mn/m] (arcotest pens/inks)
borecene rm 8343 / borealis | lmdpe | 0.9340 | 6.0 | 30 / <35 | 44 / 41–44
mpe m 3581 uv s01 / total petrochemical | mdpe | 0.9350 | 6.0 | 30 / <35 | 44
lupolen 3621 m rm / lyondellbasell | mdpe | 0.9355 | 7.5 | 30 / n.a. | 42–44 / 41–44
surpass rms 244 ug / nova chemicals | hdpe | 0.9440 | 1.7 | 30–36 / <35 | 42–44 / 41
dowlex 2631.10ue / dow chemical comp. | pe | 0.9350 | 7.0 | 30 / 28 | 38–40 / 38–41
icorene 3590 / ico polymers | lmdpe | 0.9350 | 9.0 | 30 / 28 | 38 / 44

figure 5: cross cut test (water-based paint) on the surface sintered from unmodified pe powder

figure 6: cross cut test (water-based paint) on the surface sintered from plasma-modified pe powder

the adhesive bonding ability of the sintered parts was tested on the adhesion of pur foam to a surface sintered from plasma-modified powder. the specimen was created in the form of a sandwich, and consisted of one layer of sintered plasma-modified pe material, a layer of standard pur foam, and another layer of sintered plasma-modified material. other materials were also tested. it is known that there is normally no adhesion between pe and pur. in our case, we achieved very good results using only a manual strength test: the adhesion between pe and the pur foam was higher than the cohesion of the foam itself. adhesion between sintered plasma-modified pe and pur foam parts could be used for foamed plastic products, such as seats or thermo boxes. there is also a significant enhancement of the entire body strength, which facilitates construction. at the same time, the so-called multilayer rotomolding process seems to be a very interesting area for applying plasma-modified powder. for example, fuel tanks could be produced using this process, where the plasma-modified powder can act as an adhesive, e.g. for an inner barrier layer.
7 conclusion

this paper has presented a plasma treatment method for powder materials, and practical examples have shown its industrial applications. treated polyolefin has been applied as a filler in water-based suspensions, significantly enhancing the wear resistance of the coating. parts with hydrophilic surfaces can be produced by using the treated raw material in rotomolding technology; these parts can be directly painted or glued, without any pretreatment. plasma treatment of powder on an industrial scale has been developed. the st-650 pilot plant has a production capacity of up to 3 tons per month, which is sufficient for small-series production, and provides material for industrial testing. further development is in progress. plasma treatment of nano-powders will open up broad new areas of application.

acknowledgement

this work has been partially supported by the ministry of industry and trade of the czech republic; project no. fr-ti1/176.

references

[1] lebert, r., neff, w., pochner, k.: atmospheric pressure gas discharges for surface treatment, surface and coatings technology, 74–75, 1995, 394–398.
[2] abdul-kader, a. m., kelany, a. m., radwan, r. m., turos, a.: surface free energy of ultra-high molecular weight polyethylene modified by electron and gamma irradiation, applied surface science, vol. 255, issue 17, 2009, p. 7786–7790.
[3] fellenberg, r., kickuth, r., reichel, k.: plasma technology — process diversity + sustainability, german federal ministry of education and research, printec offset, kassel, 2001, p. 6–12.
[4] hladík, j., peciar, m., špatenka, p., zítko, m.: plasma treatment of micropowder — from laboratory experiments to production plant, 52nd annual technical conference proceedings of the society of vacuum coaters, 2009, p. 537–540.
[5] washburn, e. w.: phys. rev. ser. 2, 17, 1921, 273.
[6] boersch, d. e., knoth, p., pfitzmann, a.: plasma modified polyolefin powders for rotational molding, proceedings: 61st spe annual technical conference (antec), vol. 1, processing session t13 material formulation and properties, p. 1278–1281, nashville, may 4–8, 2003, society of plastics engineers.
[7] arcotest manual — test inks for testing the surface energy.

technical evolution of medical endoscopy

s. gross, m. kollenbrandt

abstract: this paper gives a summary of the technical evolution of medical endoscopy. the first documented redirection of sunlight into the human body dates back to the 16th century. rigid tubes with candle light were given a trial later on. low light intensity forced the development of alternative light sources; some of these experiments included burning chemical components. electric lighting finally solved the problems of heat production and smoke. flexible endoscopy increased the range of medical examinations, as it allowed access to tight and angular body cavities. the first cameras for endoscopic applications made taking photos from inside the human body possible. later on, digital video endoscopy made endoscopes easier to use and allowed multiple spectators to observe the endoscopic intervention. swallowable capsules called pill-cams made endoscopic examinations of the small intestine possible. modern technologies like narrow band imaging and fluorescence endoscopy increased the diagnostic significance of endoscopic images. today, image processing is applied to decrease noise and enhance image quality. these enhancements have made medical endoscopy an invaluable tool in many diagnostic processes. in closing, an example is given of an interdisciplinary examination, taken from the archaeological field.

keywords: technical evolution of endoscopy, medical examination history, flexible endoscopy, rigid endoscopy, photo endoscopy, spectral endoscopy, capsule endoscopy, wireless endoscopy, image processing.

1 introduction

endoscopy is a vital part of medical diagnostic processes and an everyday tool in today's medical environments. the evolution of endoscopic technologies was driven by the desire to gain information on the patient's medical status from the optical appearance of body cavities; data on the biological processes inside the body were hard to come by. medical practitioners, engineers, and inventors participated in the endeavor to bring light into the body and guide it out again. a review of the technical evolution of medical endoscopy is given in this paper. the first steps in the development, which are centered around rigid endoscopy, are described in section 2. the advances in flexible endoscopy, which led to an increased range in the human body, are highlighted in section 3. the introduction of photographic documentation is the topic of section 4. the focus of section 5 is on the adoption of video technology in endoscopy. in section 6, the use of spectral imaging in endoscopy is described. swallowable capsules for endoscopic examination are discussed in section 7. the increased significance of endoscopic images by means of image processing is the topic of section 8. an example of an interdisciplinary application of medical imaging technology is shown in section 9, and closing remarks are given in section 10.

2 rigid endoscopy

giulio cesare aranzi (1530–1589), an italian physician, was the first medical practitioner to direct sunlight into a body cavity. this medical examination took place in 1585; he used a flask of water to reflect light into his patient's nose [1]. there were several later descriptions of the use of sunlight and its redirection into the human body. these examinations include some conducted by waldenburg and kussmaul (both in 1870).
the first technologically successful attempt to guide light into the human body was undertaken by philipp bozzini (1773–1808) [2]. he was an italian physician who grew up in germany. his apparatus, called 'lichtleiter' (german, translated to english: light conductor), was constructed from a metal casing which was designed to hold a candle [3]. a schematic drawing of the lichtleiter can be seen in fig. 1. on one side of the casing, bozzini placed holes, to which he attached metal tubes. the tubes were used to guide the light into the human body; tubes of different sizes were available for different body openings. he even invented a tube with a mirror to redirect the light to the vocal cords. on the other side of the casing was an opening for the observer to look into. the apparatus was a revolutionary idea. the disadvantages were the heat and smoke created by the candle, and consequently the invention did not gain wide acceptance among medical practitioners [4]. unfortunately, philipp bozzini died of typhus at the age of 35 and did not live to see the changes he inspired with his creation.

fig. 1: schematic drawing of bozzini's lichtleiter

antonin jean desormeaux (1815–1882) replaced the candle with a burning mixture of alcohol and turpentine to increase the illumination. however, this solution still created heat and sometimes smoke while burning. furthermore, he used condenser lenses to concentrate the illumination on a single spot during the examinations. desormeaux conducted the first successful operative endoscopic procedures in living patients, and is considered by many as the 'father of endoscopy' [6].

the lighting system was further improved by julius bruck (1840–1902), a dentist. he was the first to suggest inserting a light source into the human body.
in 1867 he installed a platinum wire into a water reservoir and in this way created an easily applicable light source. max nitze (1848–1906) was a general practitioner interested in medical examination of the urinary bladder. he was the first inventor to create an endoscope with the light source at its tip. after experimenting with julius bruck's platinum wire, he miniaturized edison's filament globe and created the first cystoscope in 1877. the cystoscope contained several optical lenses used to guide the light through the tube and to the observer's eye. however, collaborating with several medical instrument makers, he failed to claim the patent for his invention, and a patent war between several parties erupted [7]. johann von mikulicz-radecki (1850–1905) used nitze's concept and introduced a mirror to create the angular field of view still found in many of today's endoscopic devices [5]. he also added an air canal to his endoscope, which enabled the examiner to inflate the body cavity under observation. this greatly increased the field of view and allowed the inspection of otherwise collapsed body cavities. his rigid gastroscope was 650 mm long and 13 mm in diameter. a new type of light transmission was introduced by fourestier in 1952: a rigid quartz rod just 1.5 mm in diameter was inserted into a 2 mm stainless steel tube. the quartz rod was later used in the development of the first high-quality movie films.

3 flexible endoscopy

the first recorded flexible esophagoscope was invented by kelling in 1898. the lower third of this endoscope could be flexed up to an angle of 45°. the instrument maker who manufactured the endoscope was albright; both kelling and albright might be called pioneers of gastroscopy [2]. schindler introduced an improved version of a semi-flexible gastroscope in 1936. the flexible lower third was 12 mm in diameter and contained a spiral which hosted fixings for more than 48 lenses; the rigid part of the endoscope was 8.5 mm in diameter. the illumination was supplied by an electric globe. the maximum bending angle for the endoscope was 30°, as exceeding this angle caused the transmission of the image to fail. the system was an improvement over existing technologies, but was still not flawless, as blind spots existed which could not be visualized.

4 photoendoscopy

maximilian nitze (1848–1906) realized that creating suitable photos is of special importance in the medical field. photos made sharing information with colleagues or creating records for later documentation possible. he invented a cystoscope which could hold glass plates with a light-sensitive coating [3]. the plates could be moved into the light, and an exposure time of 3–5 seconds was necessary to create a photograph. lange and meltzing invented a small camera that was attached to a rubber tube and could be swallowed. their idea was to create images from the inside of the esophagus without the need for an endoscope which could guide the light out of the body. the front part was still rigid and contained the camera, a film magazine for film rolls with 50 exposures, and an electric globe for illumination. the rubber tube hosted the electric wiring for the globe, the mechanical camera trigger, and an air channel for insufflation of the body cavities. the disadvantage of this design was that the examiner was not aware of the actual position of the camera or the field of view. only after the images had been processed could one say whether the images were of any diagnostic value.
5 video endoscopy

conventional endoscopy had always imposed many restrictions on the investigator. similar to using a microscope, the observer needed one eye to look along the optical axis of the endoscope. this was uncomfortable at best, and only a single person could observe the body cavities at a time. taking images from the inside of the body was tedious, and was often done without knowing what would actually be on the photos. a remedy for this situation was video endoscopy. the first reported video endoscopy was conducted by soulas in france in 1956. video equipment was still very large and heavy, as miniaturization had not been a focus; in fact, the easiest way to get video signals from the human body was to take the patient to a television studio. a rigid endoscope was fixed to a regular television camera, and video images were transmitted to a television monitor. these first steps were cumbersome, but with the development of better equipment, video endoscopy became the standard method for medical examinations and interventions. however, the first endoscopic video images were black & white only. a team in melbourne, australia, miniaturized the camera in 1960. it was 45 mm wide and 120 mm long, with a reduced weight of 350 g. the camera could be attached to a regular endoscope and could transmit the images to a screen. however, the images were still black and white, and a monochrome display was used. there was little enthusiasm in the clinical world. the introduction of digital imaging with charge-coupled device (ccd) image sensors in 1985 was a breakthrough. the chips could be miniaturized, and the whole imaging procedure could take place at the tip of the endoscope. the fiber bundles used to guide the light in optical systems could be replaced by wires. the flexibility of the endoscopes increased and the image quality rose. digital video data could now easily be processed digitally, and noise filtering and image enhancement became built-in post-processing steps of the endoscopic video units.

6 spectral endoscopy

technological advancements have not only increased the flexibility, the image quality, and the operational reach of endoscopes, but have also changed the kind of information that they can retrieve. standard white light endoscopy is still a very important aspect of endoscopic examinations. however, spectral endoscopy has evolved and allows the detection or easier diagnosis of additional diseases. the use of various illumination techniques has enhanced the visibility of features that cannot be distinguished under white light.

an example of this is narrow-band imaging (nbi) [8]. a filter wheel is moved in front of the white light source, which reduces the emitted light to two narrow bands. the blue light at 415 nm enhances the superficial capillary network, while the green light at 540 nm displays subepithelial vessels. the combination of the two light bands highlights the blood vessel pattern, which can be essential for certain diagnoses. for example, in their 2007 paper, tischendorf et al. [9] presented a promising study on the diagnostic relevance of the blood vessel pattern in the differentiation between benign and possibly malignant colorectal polyps. the difference between a white light image and an nbi image can be seen in fig. 2.

fig. 2: two images (white light on the left, narrow-band imaging on the right) showing the same scene. one can easily see the difference: the blood vessel pattern is emphasized in the image on the right.

another example of the diagnostic value of spectral imaging is fluorescence endoscopy.
since the 1960's, urologists have investigated ways to label bladder tumors in vivo [10]. however, staining with tetracyclines, methylene blue, fluorescein or synthetic porphyrin compounds could not be established as standard procedures, and these approaches were abandoned. in 1995, an investigation was made of a fluorescing agent called ala [11], which is injected into the patient's blood circuit. the agent is transported into all parts of the patient's body and accumulates in areas with high metabolic activity. during a consecutive bladder examination, special lighting is used to activate the fluorescence, which then indicates any tumorous growth on the bladder wall. an example of the discriminative power of fluorescence endoscopy is highlighted in fig. 3.

fig. 3: the image on the left side shows a white light image from bladder endoscopy; there is no evidence of tumors. on the right side, the same location is imaged with bladder fluorescence endoscopy, and two tumors can easily be spotted. the images give a clear example of the discriminative power of spectral imaging in endoscopy.

7 capsule endoscopy

one of the recent developments in the field of endoscopy is capsule endoscopy, where a mini-camera in capsule form is given to the patient for swallowing. an israeli physician, gavriel iddan, started developing a pill-sized camera in 1981; however, miniaturization and battery technology were not ready for his idea. in 2001, the american food and drug administration approved a camera of the size of a pill (26 × 11 mm) for endoscopic purposes. the pill contained a digital camera, its battery, and control and transceiver electronics, and was covered by a plastic coating. after the capsule had been swallowed, the camera took two images per second; during the eight-hour procedure, approximately 60000 images were taken (2 images/s × 8 × 3600 s = 57600). an image of a capsule endoscope can be seen in fig. 4.

fig. 4: image of a capsule endoscope (pill-cam) with size reference

so-called pill-cams are primarily used to visualize diseases in the small intestine, which is too long for an examination using traditional scopes. however, the number of images created during the long procedure forces medical staff to spend much time on the analysis. this can sometimes take up to two hours, which is longer than a standard colonoscopy or gastroscopy. leading research in this field is now being done in the united states, japan, and germany.

8 computer-based image processing in medical endoscopy

image processing is a crucial part of the efforts to increase the quality and significance of medical images. numerous filtering and image quality enhancement techniques have been introduced for various modalities and situations; a small sample selection is presented here. in the endoscopic field, a wide range of methods and ideas have been suggested. an example of quality enhancement for fluorescence endoscopy, which is described in section 6, is contrast enhancement [12]. other efforts concentrate on the correction of systematic errors introduced by endoscopic imaging equipment. to this end, stehle et al. suggested a dynamic distortion correction for endoscopy systems with exchangeable optics [13].
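to give a flavour of what such post-processing involves, a global contrast stretch is about the simplest member of this enhancement family. the python sketch below is ours and purely illustrative; it is not the method of [12], which operates on fluorescence video, and the synthetic frame only stands in for real endoscopic data:

```python
import numpy as np

def stretch_contrast(img, low_pct=1.0, high_pct=99.0):
    """linear contrast stretch: map the given low/high intensity percentiles
    to the full 8-bit range and clip everything outside that window."""
    lo, hi = np.percentile(img, [low_pct, high_pct])
    out = (img.astype(np.float64) - lo) / max(hi - lo, 1e-9) * 255.0
    return np.clip(out, 0, 255).astype(np.uint8)

# demo on a synthetic low-contrast frame (values roughly 40..70)
rng = np.random.default_rng(0)
frame = (40 + 30 * rng.random((256, 256))).astype(np.uint8)
enhanced = stretch_contrast(frame)
print(frame.min(), frame.max(), "->", enhanced.min(), enhanced.max())
```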
a main focus of research on capsule endoscopes is reducing the time humans need to review the data created during the procedure. automatic video browsing and region-of-interest tagging, as well as content evaluation, are among the most common research projects [14, 15]. mosaicking techniques have been adapted to create panoramic views from endoscopic images, which are otherwise locally restricted [16].

9 interdisciplinary applications

while investigating the historical evolution of endoscopic technology, one recognizes that endoscopic equipment is used in a multitude of different application fields. these fields include security, science and criminal investigations. endoscopes are also used in maintenance tasks for technical equipment such as airplane engines. endoscopic technologies have fostered new ways of research, especially in interdisciplinary contexts. disciplines like archeology and biology can often profit from research and developments in the medical field. an excellent example of an archaeological application is the examination of oetzi the iceman [17]. the 5300-year-old mummy from the neolithic age was found in a glacier in the alps between austria and italy. an endoscopic examination of the stomach via the esophagus (gastroscopy) was performed amongst other medical examinations of the corpse. the contents of the stomach gave indications of the eating habits of humankind in the late stone age. remains of whip-worms, which are parasites causing diarrhea and anemia, were also found during the analysis. an endoscopic examination of the colon (colonoscopy) substantiated this finding, as a large number of whip-worm eggs were found. this led to the conclusion that the iceman had an infestation caused by the whip-worm parasite.

fig. 5: image of oetzi the iceman during examinations

10 conclusion

endoscopy is an integral part of many of today’s medical diagnostic processes. engineers have tried to extend the boundaries of endoscopic imaging ever since the first physicians used sunlight to look into the human body. the pioneering years were spent trying to guide light into the body and back out again. while only inelastic tubes were used in those days, various light sources and mirroring techniques were invented. rigid endoscopy is still used today for many investigations and surgical interventions. there were, however, parts of visceral cavities that could not be reached. to this end flexible endoscopy was invented, to catch a glimpse of narrow and angulated body cavities like the colon, and medical practitioners gained additional insight into the human body. photographic equipment for endoscopes gave the opportunity to take images from the inside of the human body, but this idea was soon replaced by video endoscopy, which made the clinical routine easier and gave multiple spectators the chance to follow interventions. spectral imaging was introduced to increase the significance of endoscopic images and to create new diagnostic methods for many disease patterns. but even with all these advances the small intestine was still out of reach for investigators, and capsule endoscopes were developed to close this diagnostic gap. today computing power is exploited, e.g. to reduce noise in endoscopic images and to support medical practitioners in their decisions. all these advancements make medical endoscopy a unique and important tool in the clinical routine. however, the benefits of endoscopic equipment, and of the inventions made to increase its value in clinical environments, are not restricted to the medical field.
acknowledgments

the authors would like to thank univ.-prof. dr. phil. christine roll, head of the institute of history, rwth aachen university, and univ.-prof. dr.-ing. til aach, head of the institute of imaging & computer vision, rwth aachen university, for their kind support and guidance.

references

[1] aranzi, g. c.: hippocratis librum de vulneribus capitis commentarius cum claudii porralii annotationibus marginalibus. lugduni: apud ludovicum cloquemin, 1580.
[2] berci, g., forde, k. a.: history of endoscopy: what lessons have we learned from the past? in surg endosc, vol. 14 (2000), no. 1, p. 5–15.
[3] prevedello, d. m., doglietto, f., jane, j. a. jr., jagannathan, j., han, j., laws, e. r. jr.: history of endoscopic skull base surgery: its evolution and current reality. in j neurosurg, vol. 107 (2007), p. 206–213.
[4] bozzini, p. h.: lichtleiter, eine erfindung zur anschauung innerer teile und krankheiten. in j prak heilk, vol. 24 (1806).
[5] zajaczkowski, t.: johann anton von mikulicz-radecki (1850–1905) – a pioneer of gastroscopy and modern surgery: his credit to urology. in world j urol, springer, vol. 26 (2008), no. 1, p. 75–86.
[6] lamaro, v. p.: gynaecological endoscopic surgery – past, present and future. in st. vincent’s clinic, proceedings, vol. 12 (2004), no. 1, p. 23–29.
[7] nitze, m.: zur photographie der menschlichen harnblase. in med wochenschr, vol. 178 (1893), no. 2.
[8] gono, k., obi, t., yamaguchi, m., ohyama, n., machida, h., sano, y., yoshida, s., hamamoto, y., endo, t.: appearance of enhanced tissue features in narrow-band endoscopic imaging. in journal of biomedical optics, vol. 9 (2004), no. 3, p. 568–577.
[9] tischendorf, j. j. w., wasmuth, h. e., koch, a., hecker, h., trautwein, c., winograd, r.: value of magnifying chromoendoscopy and narrow band imaging (nbi) in classifying colorectal polyps: a prospective controlled study. in endoscopy, vol. 39 (2007), no. 12, p. 1092–1096.
[10] zaak, d., karl, a., knüchel, r., stepp, h., hartmann, a., reich, o., bachmann, a., siebels, m., popken, g., stief, c.: diagnosis of urothelial carcinoma of the bladder using fluorescence endoscopy. in bju int., vol. 96 (2005), no. 2, p. 217–222.
[11] steinbach, p., weingandt, h., baumgartner, r., kriegmair, m., hofstädter, f., knüchel, r.: cellular fluorescence of the endogenous photosensitizer protoporphyrin ix following exposure to 5-aminolevulinic acid. in photochem photobiol, vol. 62 (1995), p. 887–895.
[12] stehle, t., behrens, a., aach, t.: enhancement of visual contrast in fluorescence endoscopy. in proceedings of the ieee international conference on multimedia and expo, june 2008, p. 537–540.
[13] stehle, t., hennes, m., gross, s., behrens, a., wulff, j., aach, t.: dynamic distortion correction for endoscopy systems with exchangeable optics. in bildverarbeitung für die medizin 2009, springer, berlin (germany), 2009.
[14] bourbakis, n.: detecting abnormal patterns in wce images. in fifth ieee symposium on bioinformatics and bioengineering 2005, bibe 2005, october 2005, p. 232–238.
[15] kodogiannis, v.: computer-aided diagnosis in clinical endoscopy using neuro-fuzzy systems. in proceedings ieee international conference on fuzzy systems 2004, vol. 3 (2004), p. 1425–1429.
[16] behrens, a.: an image mosaicing algorithm for bladder fluorescence endoscopy.
in proceedings of the 12th international student conference on electrical engineering poster 2008, prague (czech republic), may 2008.
[17] spindler, k.: the man in the ice: the preserved body of a neolithic man reveals the secrets of the stone age. london (great britain): weidenfeld & nicolson, january 1994.

sebastian gross
e-mail: sebastian.gross@lfb.rwth-aachen.de
institute of imaging & computer vision, rwth aachen university, d-52056 aachen, germany
medical department iii, rwth aachen university hospital, pauwelsstr. 30, d-52074 aachen, germany

maren kollenbrandt
e-mail: maren.kollenbrandt@rwth-aachen
institute of history, rwth aachen university, d-52056 aachen, germany
acta polytechnica vol. 50 no. 4/2010

subjective evaluation of audiovisual signals

f. fikejz

abstract

this paper deals with the subjective evaluation of audiovisual signals, with emphasis on the interaction between acoustic and visual quality. the subjective test is realized by a simple rating method. the audiovisual signal used in this test is a combination of images compressed by the jpeg compression codec and sound samples compressed by mpeg-1 layer iii. the images and sounds have various contents. this simulates a real situation in which the subject listens to compressed music and watches compressed pictures without access to the original, i.e. uncompressed, signals.

keywords: subjective test, audiovisual signal quality, compression, jpeg, mp3.

1 introduction

we come into contact with compressed images and sounds every day. compression is used to save space on a disc or to maximize the speed of data transfers.
the compromise between compression and subjective quality is still under investigation, and a subjective test is one way to find it.

2 methodology

good guidelines on how to prepare subjective tests can be found in [1] and [2]. a method without a reference is good if the author of the test wants to model a real situation; if he wants to compare something (such as different types of compression), methods with a reference are better. for more on methods, see chapter 2.2 of this paper. the choice of suitable subjects is as important as calibration in a physical measurement. according to [1], subjects should have stable emotions and physical condition, they should be well motivated, they should present the typical reactions of a focus group, they should be independent of the author, and they should not have information that could influence them (about the compression formats, the types of displays, the loudspeakers, etc.).

2.1 psychometry

psychometry is a field of psychology concerned with the measurement of psychological effects. psychometric tests evaluate the influence of sound and image stimuli on humans. a subjective test is a type of psychometric test in which the subjects evaluate, in our case, sound and image quality in a defined way. based on subjective test results, new compression algorithms can be programmed and new audiovisual technology can be developed. it is important to train the subjects correctly on what they are evaluating and how they should do it. the instructions, including the degree scale, should be comprehensible for people with no experience of subjective tests. one of the most important considerations for subjective evaluation is duplication: the same test can be performed several times with the same subject, or the same test can be performed with several subjects. the second version is more frequently used, and this method was implemented in the test described in this paper. opponents of subjective tests say that it is not possible to achieve the same conditions all the time, due to the influence of time and atmosphere, and in addition each subject is an individual. supporters say that quite similar conditions can lead to good validity (see [1]).

2.2 methods

subjective tests can be implemented using various types of methods, of which two are widely employed. the first type comprises methods without a reference (simple rating methods, single stimulus continuous quality evaluation – sscqe, etc.), while the second type consists of methods with pair comparison (double blind triple stimulus with hidden reference – dbts, double stimulus continuous quality scale – dscqs, double stimulus impairment scale – dsis, etc.). when a simple rating method is used, the subject has to classify each sample on a metric scale; this is most often a numeric or graphic scale (for more details on types of scales, see [2]). the degrees of the scale can be characterized by words, but more frequently only a maximum and a minimum are defined. sscqe is a relatively new method for dynamic rating of video sequences, and it is suitable for an objective evaluation; the subject evaluates the quality of a video sequence in time periods. for more on this method, see [3], and for a comparison with dscqs, see [4]. in the dbts method, the subjects listen to a reference signal and then to two samples of an audio signal, one of which is the reference and the other is a compressed signal. the subjects try to identify the compressed signal and evaluate it. the reference signal is evaluated by the best degree of the scale.
a similar method for visual signals is dscqs. the sequence of two samples is played twice; one is the reference and the other is the compressed image. for more, see [5]. the dsis method is quite similar, but the subjects know that the first image is the reference and the second image is compressed. this may be easier for the subjects than classic dscqs; for a comparison of the two methods, see [6].

2.3 compression formats

the audiovisual signal of the subjective test was realized by jpeg pictures accompanied by mp3-compressed sound. jpeg (jpg) is a compression standard which has become the most widely used format for storing digital photos; the principle of the whole algorithm is described in [7]. mpeg-1 layer iii (mp3) is a very popular sound format. for more about mp3, see [8] and [9].

3 subjective test

the main goal of this subjective test was to model the real situation where people watch compressed images on tv and listen to compressed music. therefore the subjects did not compare the samples of the audiovisual signal with a reference; they evaluated each sample. the simple rating method used a metric scale from 1 (the worst) to 10 (the best).

3.1 preparation and realization

all basic image and sound samples have different contents. it is important to eliminate situations where one subject has a previous positive emotional bias towards the signal contents, unlike another subject, and automatically evaluates it more positively. the inverse situation, when a subject has a negative bias towards some content and automatically rates it negatively, should also be avoided. the test described in this paper consisted of five basic samples, as listed in tab. 1a. all photos were taken with a nikon e8700 coolpix camera with 3264 × 2448 pixel picture resolution, with the quality set to 100 percent. the images were compressed in irfan view 4.23 software to jpeg with qualities of 80, 60, 40 and 20 percent. all sounds were taken from original cds. each original sound was a wav file with a bit rate of 1411 kb/s, 16-bit quantization, and a stereo signal with a sample frequency of 44.1 khz, pcm-modulated. the samples were edited in sony sound forge 9.0, and in this software they were compressed to mp3 format with bitrates of 256 kb/s, 128 kb/s, 96 kb/s and 64 kb/s. the subjects did not know which formats and standards were being used in the test, because this could have influenced their rating.

table 1a: list of samples
    image         music
1   decoration    snow (hey oh)
2   troubadours   the handsome cabin boy
3   butterfly     jeux de vagues
4   church        invitatorium hodie exultandum
5   monte carlo   vrooom

table 1b: authors of music (album, track)
1   red hot chili peppers (stadium arcadium, 2 – cd1)
2   sweeney’s men (the irish folk collection, 1)
3   claude debussy (jean fournet conducts debussy, 5)
4   schola gregoriana pragensis (bohemorum sancti, 1)
5   king crimson (thrak, 1)
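the variants were produced manually in irfan view and sound forge; purely as an illustration, the python sketch below batch-produces equivalent jpeg quality and mp3 bitrate variants. the pillow and pydub libraries (pydub requires ffmpeg) are assumptions made here, not the tools used by the author.

```python
from pathlib import Path

from PIL import Image              # pillow, for jpeg re-encoding
from pydub import AudioSegment     # pydub + ffmpeg, for mp3 encoding

JPEG_QUALITIES = [80, 60, 40, 20]             # percent, as in the test
MP3_BITRATES = ["256k", "128k", "96k", "64k"]

def make_variants(image_path: str, wav_path: str, out_dir: str) -> None:
    """Re-encode one master image/sound pair at every tested quality level."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    img = Image.open(image_path).convert("RGB")
    for q in JPEG_QUALITIES:
        img.save(out / f"{Path(image_path).stem}_q{q}.jpg", "JPEG", quality=q)
    snd = AudioSegment.from_wav(wav_path)
    for br in MP3_BITRATES:
        snd.export(out / f"{Path(wav_path).stem}_{br}.mp3",
                   format="mp3", bitrate=br)

# hypothetical file names, for illustration only:
# make_variants("decoration.png", "snow_hey_oh.wav", "stimuli/")
```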
the main idea of this subjective test was to discover the interaction between acoustic and visual quality. there were ten combinations of compressed sounds and images. some of them were extreme (better audio quality with worse visual quality, or worse audio quality with better visual quality), while others were compromises with medium quality of both. a complete list of the combinations of compression is shown in tab. 2.

table 2: combinations of sound and image compression (average file size in megabytes)
c.   image format /mb/        sound format /mb/
1    jpeg (100 %) /1.55/      wav (1411 kb/s) /3.2/
2    jpeg (20 %) /0.3/        mp3 (64 kb/s) /0.15/
3    jpeg (60 %) /0.6/        mp3 (128 kb/s) /0.3/
4    jpeg (80 %) /1.03/       mp3 (256 kb/s) /0.6/
5    jpeg (40 %) /0.5/        mp3 (96 kb/s) /0.2/
6    jpeg (100 %) /1.55/      mp3 (64 kb/s) /0.15/
7    jpeg (20 %) /0.3/        wav (1411 kb/s) /3.2/
8    jpeg (60 %) /0.6/        mp3 (256 kb/s) /0.6/
9    jpeg (80 %) /1.03/       mp3 (96 kb/s) /0.2/
10   jpeg (40 %) /0.5/        mp3 (128 kb/s) /0.3/

each basic sample of tab. 1a was modified to all ten combinations of tab. 2, so the whole test was composed of fifty samples. the length of each sample was between 17 and 20 seconds. all samples were numbered from 1 to 50. after each sample there was a black display with white text: “rate example number ‘x’, example number ‘x+1’ follows”, where ‘x’ was the number of the sample being played. the subjects had ten seconds for their rating, which they marked on the test form. the length of the whole test was 25 minutes. the subjective test was created in sonic foundry vegas 4.0 software as an avi file. the resolution was pal dv (720 × 576 px / 25 fields per second). the test was conducted at the department of radioelectronics, ctu fee, in room b3-552 (multimedia studio). the display was realized by a panasonic th-50px8ea plasma monitor with 1280 × 768 pixel resolution and high colour quality (32 bits). the monitor was connected to the pc by hdmi (high definition multimedia interface). the sound system consisted of event electronics tr8 (100 w / 220 v / 0.5 a) stereophonic loudspeakers. the block scheme of the workspace is shown in fig. 1.

fig. 1: block scheme of workspace

the subjects evaluated all samples with regard to subjective quality using the simple rating method, on a scale from 1 (worst quality) to 10 (best quality). before the start of the test they were informed that it was not a pair comparison test with a reference, and that they had to evaluate each sample independently. the order of the samples was generated by the latin squares algorithm (for details, see [12]). the frequency of the same types of samples was not regular, so the subjects did not know which of the five types of samples would come next. 41 participants (33 male and 8 female) took part in this subjective test. they were aged between 16 and 29, and one person was 36 years old. 21 of them wrote in the answer sheet that they had experience of image and sound processing.

3.2 analysis and results

the whole statistical evaluation of the subjective test was implemented in matlab 7.1. the data was entered into a 50 × 41 matrix; the rows of the matrix represent the samples, and the columns represent the subjects. the mean value, standard deviation and variance were calculated for all samples of the audiovisual signal. the mean value fell between 4.5 and 7.5, the standard deviation fell between 1.3 and 2.3, and the variance fell between 1.8 and 5.6. in some subjective tests a few of the last samples can be eliminated, but not in our test, because the variance in the evaluation of the last samples does not show constant growth. this indicates that fatigue of the subjects did not influence our data. another element in evaluating this test was calculating the average values for all combinations of compression according to tab. 2: the sum of all five values (one per type of sample according to tab. 1a) for each combination was divided by 5 (the number of types of samples). fig. 2 shows the average values of all combinations.
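a short python/numpy sketch of the per-sample statistics and per-combination averaging described above, run on a hypothetical 50 × 41 rating matrix (random stand-in values; the original evaluation was done in matlab 7.1, and the row grouping by content type in the reshape is an assumption):

```python
import numpy as np

rng = np.random.default_rng(0)
ratings = rng.integers(1, 11, size=(50, 41)).astype(float)  # stand-in data

# per-sample statistics (rows = samples, columns = subjects)
means = ratings.mean(axis=1)        # paper reports means between 4.5 and 7.5
stds = ratings.std(axis=1, ddof=1)  # paper: between 1.3 and 2.3
variances = ratings.var(axis=1, ddof=1)

# per-combination averages: assuming the fifty rows are grouped as
# 5 content types x 10 combinations, average over types and subjects
by_combination = ratings.reshape(5, 10, 41).mean(axis=(0, 2))
print(np.round(by_combination, 2))  # cf. the mu column of tab. 3
```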
the values are shown in tab. 3.

fig. 2: average of the mean values and variances of each combination

table 3: average of the mean values and variances of each combination
combination             μ      σy
1    100 % + wav        6.37   3.01
2    20 % + 64 kb/s     5.90   3.26
3    60 % + 128 kb/s    6.29   3.21
4    80 % + 256 kb/s    6.21   3.87
5    40 % + 96 kb/s     6.50   3.10
6    100 % + 64 kb/s    6.38   3.46
7    20 % + wav         6.33   3.20
8    60 % + 256 kb/s    6.29   3.71
9    80 % + 96 kb/s     6.25   2.95
10   40 % + 128 kb/s    6.37   2.93

the highest mean value is, surprisingly, for combination no. 5 (40 % jpeg and mp3 with a bit rate of 96 kb/s). the reason may be that people are used to listening to compressed records rather than original cds. the combination of originals was given the third highest mean value, and the lowest mean value went to the combination of the worst qualities. almost all combinations apart from the worst oscillated around 6.35. the highest variance was observed for combination no. 4 (80 % jpeg, mp3 with a bit rate of 256 kb/s).

the evaluation was also performed separately for experienced and inexperienced subjects. experienced subjects rated the samples with lower variance than inexperienced subjects. the combinations rated best by experienced subjects were no. 3 (60 % jpeg and mp3 with a bit rate of 128 kb/s) and no. 5 (40 % jpeg and mp3 with a bit rate of 96 kb/s), with average mean values of 6.61 and 6.60 respectively; the combination of originals was given a value of 6.48. inexperienced subjects rated no. 6 as the best combination (original image with the worst sound quality), with an average mean value of 6.43, but the rating of both combinations with original sound (nos. 1 and 7) was also high (6.26 and 6.28). combination no. 3, which received the highest rating from experienced subjects, was rated as the second worst. both experienced and inexperienced subjects identified the combination of the worst qualities well. the difference between the mean values awarded by experienced and inexperienced subjects was very small, so there was a hypothesis that the mean values of both groups were almost the same. the results of both groups were analyzed by anova (analysis of variance); for more on anova, see [10]. fig. 3 and tab. 4 show the results of the anova. ss is the sum of squares due to each source, df is the degrees of freedom associated with each source, ms is the mean square for each source (the ratio ss/df), and f is the statistic, which is the ratio of the mean squares; the prob value decreases as f increases. because of this result (f close to 1), and because the mean values of both groups were quite similar, the hypothesis can be accepted. fig. 4 and tab. 5 show that anova was also applied to all subjects, but f was too high and the same hypothesis was rejected.

fig. 3: anova (1 – experienced subjects, 2 – inexperienced subjects)

table 4: anova (1 – experienced subjects, 2 – inexperienced subjects)
source    ss        df    ms        f      prob > f
columns   0.8935    1     0.89348   1.28   0.2599
error     68.1747   98    0.69566
total     69.0682   99

fig. 4: anova (all subjects)

table 5: anova (all subjects)
source    ss        df     ms        f       prob > f
columns   2144.82   40     53.6206   18.62   0
error     5784.52   2009   2.8793
total     7929.34   2049

another part of the evaluation was an analysis of the mean value and variance of all combinations for each type of sample from tab. 1a. the x-axis represents the combination number according to tab. 2, and the y-axis shows the mean value or variance (as in fig. 2, but there the values averaged over all five types of samples were displayed).
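the anova results of tab. 4 and tab. 5 can be reproduced with scipy's one-way anova; a sketch on stand-in data, with each of the 41 subjects as a group (as in tab. 5, giving 40 and 2009 degrees of freedom) and the two experience groups compared via their per-sample mean ratings (as in tab. 4, giving 1 and 98 degrees of freedom):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
ratings = rng.normal(6.3, 1.7, size=(50, 41))   # stand-in rating matrix

# tab. 5 analogue: one-way anova with each of the 41 subjects as a group
f_all, p_all = stats.f_oneway(*[ratings[:, j] for j in range(41)])

# tab. 4 analogue: experienced (first 21 columns, per the paper) versus
# inexperienced (remaining 20), each reduced to 50 per-sample means
experienced = ratings[:, :21].mean(axis=1)
inexperienced = ratings[:, 21:].mean(axis=1)
f_groups, p_groups = stats.f_oneway(experienced, inexperienced)

print(f"all subjects:  f = {f_all:.2f}, prob > f = {p_all:.4f}")
print(f"two groups:    f = {f_groups:.2f}, prob > f = {p_groups:.4f}")
```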
sample no. 1 (according to tab. 1a), a photo of decoration and music by red hot chili peppers, was rated very well. the best classification was given to combination no. 3 (60 % jpeg and mp3 with a bit rate of 128 kb/s), and the worst to combination no. 6 (original image and the worst sound quality); in this case, the subjects were influenced by the sound quality. the second sample (troubadours and an irish ballad) was scored very low, and the subjects were again more influenced by the sound quality. sample no. 3 (a butterfly on a flower, with music by the impressionist composer claude debussy) was scored very high, and the variance of the rating was very low. combination no. 6 (original image and the worst sound quality) was given the highest mean value, so here the subjects were more influenced by the image quality. sample no. 4 (church and gregorian chant) was also rated well; the subjects evaluated combination no. 7 (the worst image quality with original sound) as the best, and were probably influenced by the sound quality. the worst classification was given to sample no. 5 (monte carlo panorama and music by king crimson); its best classification was for the compromise combination no. 10 (40 % jpeg with mp3 at a bit rate of 128 kb/s). in this case, the subjects were influenced by the quality of the images.

special attention is focused on each combination according to tab. 2, comparing the values for each type of sample according to tab. 1a. the results for combination no. 1 (originals) are shown in fig. 5, where the numbers on the x-axis represent the types of samples, and the y-axis shows the mean value and variance.

fig. 5: mean value and variance – image 100 %, sound wav

the results of the other combinations were quite similar in terms of the mean value: the rating of each combination (according to tab. 2) was not influenced by the type of sample (according to tab. 1a). the variances were more varied. samples no. 1, no. 3 and no. 4 were regularly rated better than samples no. 2 and no. 5. this result may be influenced by the selection of subjects: younger people mostly prefer music like sample no. 1, and they do not have much experience with classical music and gregorian chant. after the test, many subjects said that they had recognised marks of compression in samples no. 2 and no. 5. on the basis of all the results, the subjects were influenced by the sound element of the sample in the case of samples no. 1, no. 2 and no. 4, and by the image element of the audiovisual signal in the case of samples no. 3 and no. 5. more details about the whole subjective test are given in [11].

4 conclusion and discussion

this subjective test evaluated combinations of static images and parts of sound records. the aim of the test was to investigate the interaction between the sound quality and the image quality of audiovisual signals. forty-one subjects evaluated ten combinations of compressed audio and video signals. each of the ten combinations (see tab. 2) was applied to five types of samples of different content, according to tab. 1a; thus the test was composed of fifty samples in all. a simple rating method was used (with a metric scale from 1 – the worst to 10 – the best). as in real life, the subjects had no reference when they were watching the compressed images and listening to the compressed music. it is interesting that the combination of a 40 % quality image and an mp3 record with a bit rate of 96 kb/s was the best rated.
the combination of a 60 % quality image and an mp3 record with a bit rate of 256 kb/s, and all combinations with the original sound or the original image, were also well rated. the combination of the worst image and the worst sound was the worst rated. more experienced subjects had lower variance in their evaluations, but there was no great difference between the mean values of experienced and inexperienced subjects. samples no. 1, no. 3 and no. 4 (according to tab. 1a) were regularly rated much better than samples no. 2 and no. 5. in the case of nos. 1, 2 and 4, the subjects were particularly influenced by the sound quality, and in the case of nos. 3 and 5 they were particularly influenced by the image quality. the type of situation determines whether people are more influenced by sound quality or by image quality.

acknowledgement

this research has been supervised by libor husník and supported by research project no. sgs10/265/ohk3/3t/13 modern modeling and monitoring methods in acoustics.

references

[1] melka, a.: základy experimentální psychoakustiky. praha: akademie múzických umění, 2005.
[2] guilford, j. p.: psychometric methods. new york: mcgraw-hill, 1936.
[3] horita, y., miyata, t., gunawan, i. p., murai, t., ghanbari, m.: evaluation model considering static-temporal quality degradation and human memory for sscqe video quality. visual communications and image processing, spie vol. 5150, 2003, p. 1601–1611.
[4] pinson, m., wolf, s.: comparing subjective video quality testing methodologies. institute for telecommunication sciences (its), national telecommunications and information administration (ntia), u.s. department of commerce.
[5] itu-r bt.500-11, methodology for the subjective assessment of the quality of television pictures.
[6] van den ende, n., meesters, l. m. j., haakma, r.: relation between dsis and dscqs for temporal and spatial video artifacts in a wireless home environment. human vision and electronic imaging xii, spie-is&t, vol. 6492, 2007, p. 64920ll-1–11.
[7] furht, b.: encyclopedia of multimedia. springer, 2005, p. 372–373.
[8] spanias, a., painter, t., atti, v.: audio signal processing and coding. wiley interscience, 2007, p. 120–128.
[9] kahrs, m., brandenburg, k.: applications of digital signal processing to audio and acoustics. kluwer academic publishers, 2003, p. 75–78.
[10] sung-hwan shin, jeong-guon ih, hyuk jeong: statistical processing of subjective listening test data for psq. institute of noise control engineering, 2003 july–august, p. 232–238.
[11] fikejz, f.: metodika subjektivního vyhodnocování multimediálního signálu. diploma thesis, department of radioelectronics, ctu fee in prague, 2009. supervisor: libor husník.
[12] latinské čtverce. http://math.feld.cvut.cz/demlova/teaching/avt/pred-a09.pdf [quoted 2009–05–10].

about the author

filip fikejz was born in prague, czech republic, on december 15th, 1983. he graduated with an ing. degree from the czech technical university in prague, faculty of electrical engineering, in 2009. he is now a phd student at the department of radioelectronics, ctu fee. his research interests are in audio signal processing from the psychoacoustic point of view.

filip fikejz
e-mail: fikejfil@fel.cvut.cz
dept. of radioelectronics, faculty of electrical engineering, czech technical university in prague, technická 2, 166 27 praha 6, czech republic
acta polytechnica vol. 52 no. 4/2012

laser cutting of materials of various thicknesses

martin grepl, marek pagáč, jana petrů

všb – technical university of ostrava, faculty of mechanical engineering, department of machining and assembly, 17. listopadu 15/2172, 708 33 ostrava, czech republic
correspondence to: martin.grepl.st@vsb.cz

abstract

this paper deals with the application of laser technology and with optimizing the parameters for cutting nickel alloy. the theoretical part of the paper describes various types of lasers, their principles and usage. the experimental part focuses on optimizing the cutting parameters for haynes 718 alloy using a co2 gas laser. this alloy is employed in the production of components for the aircraft industry. the experiment was performed on the winbro delta laser system, designed for sizable parts. the actual cut is assessed with respect to its quality and to any accompanying side effects that occur during the process. in this case, laser output and cutting speed were the parameters with most influence on the final cut. the summary explains the results achieved in a metallographic laboratory.

keywords: laser, cutting, thickness.

1 introduction

lasers have found wide application in scientific work in astronomy and in optics, in the investigation of material characteristics, and in other basic research areas. other practical applications include optical equipment for eye surgery, use in geodesy and seismography, welding miniature parts from hard-to-melt materials, and spectral microanalyses in chemistry and metallography.

laser cutting

laser cutting can be:
• sublimating – the material is removed primarily by evaporation, due to the high intensity of the laser radiation in the cut area;
• melting – the material is melted by a laser beam in the cut area and blown away by an auxiliary gas; mainly metallic materials are cut using this process;
• burning – a laser beam heats the material to its ignition temperature, so that it can then burn in an exothermic reaction with the reactive gas (e.g., oxygen), and the slag is removed from the cutting area by an auxiliary gas. titanium, low-carbon and corrosion-resistant steels can be cut this way.

laser cutting, the most established laser material processing technology, is a method for shaping and separating a workpiece into segments of desired geometry. the cutting process is executed by moving a focused laser beam along the surface of the workpiece at a constant distance, thereby generating a narrow cut kerf. this kerf fully penetrates the material along the desired cut contour. the absorbed energy heats and transforms the prospective kerf volume into a state (molten, vaporized, or chemically changed) which is volatile or which can be removed easily. normally, removal of the material is supported by a gas jet that impinges coaxially to the laser beam. this cutting gas accelerates the transformed material and ejects it from the kerf. the process is successful only if the melt zone completely penetrates the workpiece. laser metal cutting is therefore generally restricted to thin sections: while cutting has been reported through 100 mm sections of steel, the process is more typically used on metal sheets 6 mm or less in thickness.

figure 1: principle of laser cutting
table 1: chemical composition of haynes 718 alloy
element      ni   co   fe   cr   cb + ta   mo   mn      si      ti    al    c      b       cu
weight [%]   52   1*   19   18   5         3    0.35*   0.35*   0.9   0.5   0.05   0.009   0.1*
* maximum

2 experimental cutting of materials

the goal of this experiment was to find suitable cutting parameters for cutting 2.5 mm and 3.2 mm metal sheets of this type of alloy using the winbro delta laser system. the winbro delta manufacturer recommends using this machine for cutting material with a maximum thickness of 6 mm. we also investigated the influence of the cutting parameters on the resulting cut structure, from the point of view of shape deformations of the cutting gap and the creation of a recast layer, which is an undesirable accompanying effect of laser cutting.

2.1 material

test samples were made from the haynes 718 high-strength alloy, which belongs to the nickel alloy group. these alloys are suitable for operation under extremely demanding conditions; they are materials that are primarily resistant to high temperatures. they are used in the construction of land gas and aircraft engine turbines, and also for industrial furnaces, combustion chambers, etc. the alloy features outstanding resistance to temperatures from −253 °c to +705 °c, and also excellent resistance against oxidation up to 980 °c.

2.2 laser system

the experiment was performed on the winbro delta laser system, designed for sizable parts up to 1 900 mm in diameter, 500 mm in height, and up to 500 kg in weight. the delta system can be configured with up to four different types of laser sources in order to meet the requirements of specific applications of operational laser technology (e.g., cutting, drilling, or welding); see figure 2. the laser system has been supplemented by the rofin dc 020 source, and is equipped with the heidenhain itnc 530 control system. it is a gas co2 laser that operates in continuous regime.

figure 2: winbro delta laser system with the heidenhain itnc 530 control system

2.3 cutting parameters

samples were made from sheet metal 2.5 mm and 3.2 mm in thickness. the laser cutting speed was set to 500 mm·min−1 based on experimental experience, the distance of the jet from the surface was 0.9 mm, the exciting frequency was 2 000 hz, and the filling was 75 %. the laser source output was changed by 10 % for each cut, see tables 2 and 3. the cut length was 10 mm. the placement and spacing of the cuts were selected empirically so that the gaps between them would be sufficient from the point of view of possible temperature effects on the neighboring cuts.

table 2: power of laser (2.5 mm thickness)
number of cut   power of the 2 kw laser [%]
cut 1           90
cut 2           80
cut 3           70
cut 4           60
cut 5           50
cut 6           55

table 3: power of laser (3.2 mm thickness)
number of cut   power of the 2 kw laser [%]
cut 7           90
cut 8           80
cut 9           70
cut 10          60

3 realization of the experiment

cuts nos. 1 to 5 were formed according to the proposed cutting parameters. it was demonstrated that the metal was not completely cut through with the parameters set for no. 5. the output for cut no. 6 was therefore set to 55 %, i.e. the average value between cuts no. 4 and 5. however, 55 % power was again not sufficient for cutting the metal. the lowest power useful for this material thickness is 60 % of the maximum source output, which corresponds to a laser source output of 1 200 w.
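the power settings are percentages of the 2 kw source; the trivial conversion below makes the absolute levels explicit and reproduces the 1 200 w value quoted in the text.

```python
SOURCE_POWER_W = 2000  # rofin dc 020 source, 2 kW

for pct in (90, 80, 70, 60, 55, 50):
    print(f"{pct:2d} % -> {SOURCE_POWER_W * pct / 100:.0f} W")
# 60 % -> 1200 W, the lowest power that still cut the 2.5 mm sheet;
# for the 3.2 mm sheet the threshold is 70 % -> 1400 W
```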
figure 3a) shows cuts nos. 1 to 6. cuts nos. 1 to 4 went through the whole thickness of the metal; the last two cuts were unsuccessful because of insufficient laser source power. figure 3b) shows the apparent temperature-influenced cutting area, which decreases with increasing output. burnt-on slag forms on the bottom part of the cut. this can be considered an accompanying phenomenon of the cutting process that can be influenced by the setting of the cut parameters. it can generally be stated that the height of the slag that forms does not exceed the thickness of the cut metal, and that it is brittle and breaks off.

figure 3: the sample – sheet metal 2.5 mm in thickness (haynes 718): a) upper side of the sample, b) underside of the sample

then we investigated a suitable cutting speed, using the cuts marked nos. 4a to 4f, in order to increase the cut quality. the source parameters were set to the values for cut no. 4, and the cutting speed was the only variable; for each cut we increased the cutting speed by 100 mm·min−1. cuts nos. 4c and 4d were evaluated as the best from the point of view of cutting quality. these cuts correspond to a cutting speed interval of 〈700; 800〉 mm·min−1. in order to make the cutting speed value more exact, we performed cut no. 4f with a cutting speed of 750 mm·min−1.

table 4: laser cutting velocities with 60 % power
number of cut   cutting velocity [mm·min−1]
cut 4a          500
cut 4b          600
cut 4c          700
cut 4d          800
cut 4e          900
cut 4f          750

figure 4: the cuts of variable cutting velocity with constant power 60 %
figure 5: the sample – sheet metal 3.2 mm in thickness (haynes 718): a) upper side of the sample, b) underside of the sample

table 5: laser cutting velocities with 80 % power
number of cut   cutting velocity [mm·min−1]
cut 8a          500
cut 8b          600
cut 8c          400
cut 8d          300
cut 8e          450

figure 6: the cuts of variable cutting velocity with constant power 80 %

cuts nos. 7 to 10 were formed according to the proposed cutting parameters. it was demonstrated that the metal was not completely cut through with the parameters set for no. 10. the lowest power useful for this material thickness is 70 % of the maximum source output, which corresponds to a laser source output of 1 400 w. then we investigated a suitable cutting speed, using the cuts marked nos. 8a to 8e, in order to increase the cut quality. the source parameters were set to the values of cut no. 8, and the cutting speed was the only variable; for each cut we changed the cutting speed, see table 5. cuts nos. 8a and 8c were evaluated as the best from the point of view of cutting quality. these cuts correspond to a cutting speed interval of 〈400; 500〉 mm·min−1. in order to make the cutting speed value more exact, we performed cut no. 8e with a cutting speed of 450 mm·min−1.

4 metallographic evaluation

we made metallographic sections in the metallographic laboratory and compared the cuts. figure 7 is a photograph showing a comprehensive view of cut no. 1 (magnified 10 and 50 times, etching agent vilella), with no recast layer observable, except for an imperceptible layer on the cut walls that probably always occurs. the cut profile is symmetrical, without major shape deformations.

figure 7: detail of cut no. 1 (power 90 %, cutting velocity 500 mm·min−1, magnified 10× and 50×)
figure 8: detail of cut no. 4 (power 60 %, cutting velocity 500 mm·min−1, magnified 10× and 50×)
figure 9: detail of cut no. 6 (power 55 %, cutting velocity 500 mm·min−1, magnified 10× and 50×)

in figure 8 we can see an asymmetrical distribution of the recast layer originating at the bottom of the cut.
this layer is caused by the barrier that originates due to the melt concentrated in the bottom part of the cut. after solidification, the melt manifests itself as a burr on the bottom edge of the cut. burrs are not acceptable, and must be removed by milling. moreover, the heat that originates during cutting and has no chance to dissipate from the surroundings of the cut accumulates in this location. as a consequence, there is local overheating, a change in the shape of the cut, and an increase in the volume of the recast layer. cut no. 4 was selected as the best in quality after visual inspection; it appears satisfactory even from the point of view of the size of the recast layer. figure 9 shows a photograph of cut no. 6, in which the metal was not completely cut through. this is an unacceptable situation, caused by insufficient output of the laser.

5 conclusion

on the basis of the experiments, we have found that suitable parameters for cutting haynes 718 alloy at a metal thickness of 2.5 mm are 60 % (1 200 w) of the laser output and a cutting speed of 750 mm·min−1, according to cut no. 4f. for a metal thickness of 3.2 mm, the optimal parameters are 80 % (1 600 w) of the laser output and a cutting speed of 450 mm·min−1, according to cut no. 8e. to avoid any influence of the surrounding atmosphere on the cut, it is suitable to measure the recast layer using a microprobe, and then to perform a microchemical analysis. we recommend that increased attention be paid to a study of the recast layer and its dependence on the cutting parameters. it would be suitable to measure the microhardness at the melting boundary of the original material and, more importantly, on the recast layer itself. our paper has contributed a comprehensive view of the influence of the process parameters on a narrow group of materials used in the aerospace industry.

acknowledgement

this paper was prepared within the project: increasing of professional skills by practical acquirements and knowledge, no. cz.1.07/2.4.00/17.0082, supported by the education for competitiveness operational programme financed by the european union structural funds and with support from the state budget of the czech republic.

acta polytechnica vol. 50 no. 1/2010

adaptation of an evolutionary algorithm in modeling electric circuits

j. hájek

abstract

this paper describes the influence of the setting of the control parameters of a differential evolutionary algorithm (de), and of adapting these parameters, on the simulation of electric circuits and their components. various de algorithm strategies are investigated, as well as the influence of adapting the controlling parameters (cr, f) during simulation and the effect of sample size. optimizing an equivalent circuit diagram is chosen as a test task. several strategies and settings of the de algorithm are evaluated according to their convergence to the right solution.

keywords: differential evolution, adaptation, maple, modeling of electric circuits.

1 introduction

many cases of the use of evolutionary algorithms in optimizing technical tasks have been published, most of them in the design of antennas, microstrip lines and other microwave elements. these tasks are characterized by difficult geometry, which can be solved only numerically; a solution based on an analytic description followed by optimization is almost impossible. therefore, stochastic or evolutionary algorithms are used for such tasks. in the electrical engineering industry, the synthesis of high-frequency circuits (for example radio-interference filters) is a very frequent task.
the goal is to find the values of the electrical components for which the circuit behaves according to our requirements. this can also be called optimization; the target function is, for example, the difference between the real behavior of the circuit and its prescribed frequency response. as shown in this paper, an evolutionary algorithm can also be used in such tasks. a big advantage of evolutionary algorithms is their robustness and the large probability of finding a right solution, or more than one right solution. the main disadvantages are the longer time needed to finalize the computation (especially when optimizing several variables simultaneously) and the great sensitivity to the control settings. therefore, some studies have recently been undertaken to eliminate these unwanted features; one way is to use auto-adaptation of the controlling parameters during the optimization process [1]. there are some special features of using an evolutionary algorithm when optimizing electric circuits. first, there is a huge range of variable values, from pico ($10^{-12}$) to units or tens of mega ($10^{7}$). next, there is a huge sensitivity to a small variance in the inputs; a resonance effect in lc circuits (consisting only of inductors l and capacitors c) can serve as an example. the final difficulty is the large dimension of the solution space: most simple digital or lc filters are composed of at least 3–4 elements, but it is necessary to consider parasitic effects and initial conditions, so the final number of wanted variables can be 10 or more. thus these engineering tasks are more difficult than ordinary mathematical tests for optimizing algorithms; some tests (de jong, rastrigin, schwefel and ackley’s functions) are introduced in [2].

2 optimization using de

in technical practice, the differential evolutionary algorithm is commonly considered a very reliable and robust optimizing technique. de was developed for computing with real numbers, and the domain of definition can be specified (when solving box-constrained problems). de works with a population of solutions p, the size of which is np individuals. population p changes during g generations. each individual of the current population can be represented as a d-dimensional vector, where d is the dimension of the definition domain and corresponds to the number of required variables. de can be described in pseudo-code close to pascal as follows:

1. create initial population $P = (x_1, x_2, \ldots, x_N)$ randomly
2. for i := 1 to G do
3.   for j := 1 to N do
4.     create mutation vector $u = (u_1, u_2, \ldots, u_d)$
5.     create new solution $y_j$ by combining the “parents” $u$ and $x_j$
6.     if $f(y_j) < f(x_j)$ then replace $x_j$ by $y_j$ ($f$ is the target function)
7.     end if
8.   end for
9. end for (or a repeat–until cycle with a suitable terminating condition)

this process is essentially common to all evolutionary algorithms; the core of de lies in the strategy for creating the mutation vector u. there are several ways (strategies) of generating the mutation vector. the most frequent strategy is called de/rand, where a weighted difference of randomly chosen solutions (individuals from p) is used:

$u = r_1 + F\,(r_2 - r_3)$. (1)

the disparity $r_1 \ne r_2 \ne r_3 \ne x_j$ must hold, the $r_i$ being randomly chosen from population p. the name of this strategy results from this equation (rand), as does the name of the algorithm (weighted difference – de). control factor f is entered by the user; the authors of de recommend f = 0.8.
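the pseudo-code and eq. (1) map directly onto a compact program. the following is a minimal runnable de/rand sketch in python/numpy, using the binomial crossover described below; the paper's own routines were written in maple, so all names here are illustrative.

```python
import numpy as np

def de_rand(f, bounds, NP=40, G=100, F=0.8, CR=0.5, seed=0):
    """Minimal DE, strategy de/rand, with binomial crossover.

    f      -- target function mapping a d-vector to a scalar (minimized)
    bounds -- array-like of shape (d, 2) with [low, high] per variable
    """
    rng = np.random.default_rng(seed)
    bounds = np.asarray(bounds, float)
    d = len(bounds)
    pop = rng.uniform(bounds[:, 0], bounds[:, 1], size=(NP, d))
    cost = np.array([f(x) for x in pop])
    for _ in range(G):
        for j in range(NP):
            # r1, r2, r3 distinct from each other and from x_j, eq. (1)
            r1, r2, r3 = rng.choice([k for k in range(NP) if k != j],
                                    size=3, replace=False)
            u = pop[r1] + F * (pop[r2] - pop[r3])
            u = np.clip(u, bounds[:, 0], bounds[:, 1])  # keep box constraints
            # binomial crossover: at least one element taken from u
            l = rng.integers(d)
            mask = rng.random(d) <= CR
            mask[l] = True
            y = np.where(mask, u, pop[j])
            fy = f(y)
            if fy < cost[j]:            # greedy selection, step 6
                pop[j], cost[j] = y, fy
    best = np.argmin(cost)
    return pop[best], cost[best]

# toy usage: minimize the sphere function on [-5, 5]^4
best_x, best_f = de_rand(lambda x: float(np.sum(x**2)), [(-5, 5)] * 4)
print(best_x, best_f)
```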
another strategy, known as de/best, generates the mutation vector u as follows:

$u = x_{\min} + F\,(r_1 - r_2)$. (2)

in this case the disparity $r_1 \ne r_2 \ne x_{\min}$ must again hold, and $x_{\min}$ represents the best solution of the previous and present populations, i.e. the solution with the minimal target function $f$; this explains the name of the strategy (best). after creating a mutation vector (sometimes called a “noise vector”), it is possible to make a descendant $y$ from the “parents” $x_j$ and $u$. the elements $y_k$, for $k = 1, 2, \ldots, d$, are created according to the following rules:

$y_k = u_k$, while $U_k \le CR$ or $l = k$ (element $y_k$ changes), (3)

$y_k = x_{j,k}$, while $U_k > CR$ and $l \ne k$ (element $y_k$ does not change). (4)

the variable $l$ is a randomly chosen number from the set $\{1, 2, \ldots, d\}$, and the random variables $U_1$ up to $U_d$ are drawn from the interval $[0, 1]$ with uniform distribution. cr is the next control factor; it influences the frequency of crossing, $CR \in [0, 1]$. the authors recommend choosing cr = 0.5, i.e. a 50 % probability of changing $x_{j,k}$, and it is recommended to choose np = 10d. the advantage of de is its simplicity, because the introduced procedure is easily programmable. the settings of the factors f and cr, and a suitable choice of the population size, are the biggest disadvantages: the convergence of de depends dramatically on the controlling factors, and the values f = 0.8 and cr = 0.5 do not always guarantee good convergence; various settings are suitable in various cases. auto-adaptation of the de algorithm can be used as a defense against this unwanted sensitivity. it can be provided by changing the weight factor f according to the transient results, as shown in [1]:

$F = \max\left\{F_{\min},\ 1 - \left|\frac{f_{\max}}{f_{\min}}\right|\right\}$, while $\left|\frac{f_{\max}}{f_{\min}}\right| < 1$, (5)

$F = \max\left\{F_{\min},\ 1 - \left|\frac{f_{\min}}{f_{\max}}\right|\right\}$, while $\left|\frac{f_{\max}}{f_{\min}}\right| \ge 1$, (6)

where $f_{\min}$ and $f_{\max}$ are the smallest and largest values of the target function in the current population, and $F_{\min}$ is a user-chosen lower bound. another way, which was not tested in this paper, is to insert the factors f and cr as variables directly into population p and let them develop in each generation; successful values will survive longer and can affect the optimization process positively. this way is described in detail in [2].
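eqs. (5) and (6) transcribe directly into code; a sketch, with the division-by-zero guard being an added assumption of this illustration.

```python
def adapt_f(f_min: float, f_max: float, F_min: float = 0.4) -> float:
    """Adaptive weight factor F after eqs. (5) and (6).

    f_min, f_max -- extreme target-function values in the current population
    F_min        -- user-chosen lower bound on F (illustrative default)
    """
    if f_min == 0:                 # guard, not part of the original formula
        return F_min
    ratio = abs(f_max / f_min)
    if ratio < 1.0:                                 # eq. (5)
        return max(F_min, 1.0 - ratio)
    return max(F_min, 1.0 - abs(f_min / f_max))     # eq. (6)

# a large spread (f_max >> f_min) gives F near 1, favoring exploration;
# as the population converges (f_max ~ f_min), F falls towards F_min
print(adapt_f(0.5, 4.0))   # 0.875 by eq. (6)
```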
3 description of test task

optimizing a model of a suppressor capacitor was chosen as the test task for the de algorithm. this model has four input variables r1, r2, l1, c1, which affect its behavior, especially the insertion loss. the output variable of the model is the computed insertion loss dependence. the goal of the optimization process was to find values of the input variables (r1, r2, l1, c1) such that the model has the same insertion loss dependence as shown in fig. 1. the target function for the optimizing routine was defined as the sum of the squares of the deviations between the model insertion loss and the measured insertion loss. a common foil capacitor with plastic dielectric and axial outlets (tc 218, 100 nf/630 v) served as the object for measuring. the insertion loss dependence (i.e. the amplitude frequency response) of capacitor tc 218 was measured using a pmm 9010 signal analyzer in accordance with standard cispr 17 [3]. the domain of the solution has dimension 4, because there are four wanted elements r1, r2, l1, c1. this domain was restricted to realistic ranges: $R_1\,(\Omega) \in [0.01; 1]$, $R_2\,(\Omega) \in [10^4; 10^7]$, $C_1\,(\mathrm{nF}) \in [80; 120]$, $L_1\,(\mathrm{nH}) \in [0; 60]$. this restriction has two purposes: first, it accelerates the computation (a smaller space has to be searched through); second, it separates out solutions that are mathematically correct but practically impossible. another necessary procedure was to scale all variables before running the algorithm, because of the potential influence of round-off errors: the values of the variables in the test task range from picofarads to tens of mhz. after finishing de, all results were re-scaled back into common units. the following transformations were used for scaling and re-scaling:

$R_{i,\mathrm{norm}} = R_i / R_n$, (7)

$C_{1,\mathrm{norm}} = C_1 \cdot 2\pi f_n R_n$, (8)

$L_{1,\mathrm{norm}} = L_1 \cdot 2\pi f_n / R_n$. (9)

the reference values were $R_n = 1\,\Omega$ and $f_n = 1\,\mathrm{MHz}$. all calculations were carried out in absolute values; relative units (db) were used only in the results and graphs. this was done to avoid the possibility of round-off errors.

fig. 1: model of the capacitor (below, right) and the insertion loss measured on the real capacitor tc 218

all the optimization routines of the de algorithm were written in a maple worksheet. the model of the capacitor was implemented in maple 9.5 with help from the syrup library [4], which enables electric circuits to be entered directly into the maple environment. the test task ran on a pc with the following configuration: intel celeron 2.4 ghz, 256 mb ram, windows 2000, maple 9.5 with the syrup library.
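a python sketch of the scaling of eqs. (7)–(9) and of the least-squares target function; the circuit model itself (built in maple/syrup in the original) is left as a stand-in callable, so everything here is illustrative.

```python
import math

import numpy as np

R_N, F_N = 1.0, 1.0e6        # reference values: 1 ohm, 1 MHz
W_N = 2 * math.pi * F_N      # reference angular frequency

def scale(r_ohm, c_farad, l_henry):
    """Eqs. (7)-(9): component values -> dimensionless variables."""
    return r_ohm / R_N, c_farad * W_N * R_N, l_henry * W_N / R_N

def rescale(r_norm, c_norm, l_norm):
    """Inverse mapping, back to ohms, farads, henries."""
    return r_norm * R_N, c_norm / (W_N * R_N), l_norm * R_N / W_N

def target(params, freqs_hz, measured_loss_db, model_loss_db):
    """Sum of squared deviations between modelled and measured insertion
    loss; model_loss_db(params, freqs_hz) stands in for the circuit model."""
    return float(np.sum((model_loss_db(params, freqs_hz)
                         - measured_loss_db) ** 2))

# example scaling of roughly the fitted values from tab. 2
print(scale(0.064, 100e-9, 36e-9))
```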
The input variable R2 (which varies over a large range) has a practically negligible influence on the characteristic of the circuit. The conformity with the measured dependence is presented in Table 2 as the absolute peak error. The typical accuracy of insertion-loss measurement required by [3] is ±3 dB; the simple four-element model from Fig. 1 cannot provide greater conformity.

An evaluation and comparison of various DE algorithm strategies was declared as the main goal of this paper. This comparison can be based on an analysis of the convergence.

Table 1: Tested variants of DE. NP – size of population, G – number of generations, auto-adapt. – fixed or adapted F factor, strategy – used strategy, F100 – average weight factor F in the 100th generation, CR – fixed factor, best – average minimum of the target function, CPU time – average time needed to compute one test.

    No.  NP  G    auto-adapt.  strategy  F100    CR    best    CPU time (s)
    1.   8   100  no           rand      0.9000  0.85  0.0420  39
    2.   8   100  no           best      0.9000  0.85  0.0400  34
    3.   8   100  yes          rand      0.9427  0.85  0.1159  34
    4.   8   100  yes          best      0.5000  0.85  0.0951  32
    5.   16  100  no           rand      0.9000  0.85  0.0488  70
    6.   16  100  no           best      0.9000  0.85  0.0400  70
    7.   16  100  yes          rand      0.9715  0.85  0.0468  73
    8.   16  100  yes          best      0.9824  0.85  0.0401  72
    9.   40  100  no           rand      0.9000  0.85  0.0656  216
    10.  40  100  no           best      0.9000  0.85  0.0400  214
    11.  40  100  yes          rand      0.9994  0.85  0.0431  220
    12.  40  100  yes          best      0.5000  0.85  0.0400  208
    13.  64  100  no           rand      0.9000  0.85  0.0476  385
    14.  64  100  no           best      0.9000  0.85  0.0400  380
    15.  64  100  yes          rand      0.9992  0.85  0.0519  401
    16.  64  100  yes          best      0.5000  0.85  0.0400  395

Fig. 2: Development of convergence in a population NP = 2D

The results of the first four tests are shown in Fig. 2. There are no visible trends in the random dependences; this is caused by the small population, NP = 2D. Convergence is even slower when auto-adaptation is used, and the algorithms tend towards stagnation from approximately the 40th generation. The second quartet of tests (Fig. 3) worked with a bigger population, NP = 4D; however, this population is still small, and the algorithms exhibited the same behavior as in the first case.

The situation improves for algorithms working with a bigger population of about 10D. The dependences in Fig. 4 are more interesting: the DE/best variants exhibit better convergence. They reach solutions faster and their final target function (best) is lower (curves 10 and 12 lie below curves 9 and 11), and this trend can be observed in almost all generations. There is a relatively small difference between the test variants with and without adaptation of the F factor; the variants with adaptation (curves 11 and 12) exhibit slightly faster convergence, but the differences are not pronounced. Almost the same results are obtained for a population with NP = 16D (Fig. 5): the DE/best variants again show better convergence than DE/rand, and there is no pronounced difference between the adapted and non-adapted variants. When comparing the curves in Fig. 5, it should be pointed out that it makes no sense to discuss generations earlier than about the 10th; at approximately that time the solution first crystallizes from the random numbers, since all initial populations were generated completely randomly.

Table 2: Results of all variants after re-scaling and averaging

    No.  R1 (mΩ)  R2 (MΩ)  L1 (nH)  C1 (nF)  abs. peak error (dB)
    1.   63.9     4.134    36.2     88.6     0.82
    2.   64.6     0.052    35.7     89.6     0.86
    3.   43.0     1.596    35.8     87.2     1.42
    4.   68.0     6.423    32.7     98.2     1.40
    5.   62.6     5.882    36.9     87.1     0.78
    6.   64.6     0.109    35.7     89.6     0.86
    7.   66.5     1.695    36.1     89.6     0.82
    8.   64.5     2.386    35.9     89.2     0.84
    9.   73.4     4.640    35.2     92.6     1.32
    10.  64.6     0.024    35.7     89.6     1.02
    11.  66.0     5.619    36.3     88.0     0.86
    12.  64.6     5.575    35.7     89.6     0.86
    13.  66.8     9.780    34.8     91.6     1.22
    14.  64.6     0.021    35.7     89.6     0.84
    15.  58.2     7.011    35.4     90.8     1.02
    16.  64.6     8.435    35.7     89.6     0.86

Fig. 3: Development of convergence in a population NP = 4D
Fig. 4: Development of convergence in a population NP = 10D
Fig. 5: Development of convergence in a population NP = 16D

5 Conclusion

We tested several variants of a DE algorithm for optimizing an electrical circuit. Attention was focused first on the difference between the DE/rand and DE/best strategies, and then on the influence of auto-adaptation; for this purpose a simple adaptation of the control factor F according to [1] was tested. The tests showed a strong influence of the choice of strategy: the DE/best variants exhibit better results than the DE/rand variants, provided the population is big enough. The influence of auto-adaptation on convergence was not very pronounced in this case, but the variants with auto-adaptation of the F factor were somewhat faster. All tested variants found a solution within the dedicated bounds; the maximum deviation between the computed and the measured insertion loss was less than the 3 dB required by the CISPR standard.

6 References

[1] Tvrdík, J.: Evoluční algoritmy a adaptace jejich řídicích parametrů. Automatizace, Vol. 50 (2007), No. 7–8, p. 453–457.
[2] Zelinka, I.: Umělá inteligence v problémech globální optimalizace. Praha: BEN, 2002.
[3] ČSN CISPR 17: Methods of measurement of the suppression characteristics of passive radio interference filters and suppression components.
[4] http://www.maplesoft.com/applications/view.aspx?sid=4680

Ing. Jiří Hájek
phone: +420 2 2435 2124, fax: +420 2 2435 3949
e-mail: hajekj1@fel.cvut.cz
Department of Electrotechnology, Czech Technical University in Prague, Faculty of Electrical Engineering, Technická 2, 166 27 Prague 6, Czech Republic

Influence of simulated LOCA on the properties of Zircaloy oxide layers

H. Frank

Specimens of Zr1Nb and Zry-4W were pre-oxidized first for 360 days in steam at 425 °C and were then exposed for 3 min to 1200 °C in steam, simulating loss-of-coolant conditions. In this way, the oxide thickness was more than doubled. The I-V characteristics, measured up to 90 V at temperatures up to 180 °C, revealed the formation of double layers, consisting of dark monoclinic oxide with 1.3 eV activation energy near the metal, and of a whitish surface phase with only 0.5 eV, for both samples. The I-V characteristics of Zr1Nb showed normal behavior, whereas Zry-4W changed from an unusual sub-linear form at low temperatures and voltages to normal space-charge limited currents at higher temperatures and voltages. Observation of the injection and extraction currents in the Zry-4W sample showed abnormally high negative zero voltages produced by extraction after former injection.

Keywords: zirconium alloys, oxide layers, semiconductor doping conduction type, reduction semiconductor, anomalous I-V characteristics, space-charge limited currents, injection and extraction currents, temperature dependence of resistivity, activation energy.

1 Introduction

In a LOCA-type accident in atomic reactors, the cladding of the fuel cells is exposed to a high-temperature shock for several minutes, and it is vital to know the possible changes in the material properties that influence safety. The aim of this work is to investigate the oxide layers formed by the thermal shock, and to compare their properties with those of the original pre-oxidized layers.

2 Samples

In order to simulate LOCA conditions, samples of Zr1Nb and of Zry-4W pre-oxidized in steam at 425 °C for 360 days were exposed to 1200 °C in steam for 3 minutes. The oxide layer thickness increased from 33.3 µm to 78.8 µm for Zr1Nb, and from 35.3 µm to 78.7 µm for Zry-4W. On the metallographic section, the additionally grown oxide has a different color, and a very thin layer of metallic tin could be observed at the interface. Oxygen diffused into the underlying zirconium, reaching 30 % at the interface, and structural changes could be detected down to 100 µm (Table 1).

Table 1: Characterization of samples

    sample  material  number   medium  temperature (°C)  time (min)  thickness (µm)
    1a      Zr1Nb     1744332  steam   1200              3           78.79
    3a      Zry-4W    3744337  steam   1200              3           78.74

Tube specimens of 30 mm length and 9 mm outer diameter, prepared from the zirconium alloys, were used. The properties of the pre-oxidized layers have been described in [1].
3 Experimental

The oxide at the end faces of the tubes was ground off for good contact, and painted-on contacts of colloidal silver, 6.0 mm in diameter, were used. The samples were mounted in a mini-thermostat for measurement at temperatures up to 180 °C; details of the measuring procedure are given in [2]. The relative permittivity, 36 and 31 for the two samples respectively, was very high, whereas the pre-oxidized forms had permittivities of only 20.5 and 17.2. The I-V characteristics were symmetrical; therefore only the forward-voltage branch, with the positive terminal connected to the zirconium metal, was measured, in voltage steps of 5 V up to 90 V and at constant temperatures in steps of 20 °C up to 180 °C. Sample Zr1Nb had normal I-V characteristics with space-charge limited currents given by

    I = aU² + bU + c ,    (1)

allowing computation of the resistivity ρ, mobility µ and carrier concentration n.

4 Results and discussion

Further details concerning the theoretical aspects are given in [3, 4]. As an example, the I-V characteristic at room temperature is shown in Fig. 1 (Fig. 1: Zr1Nb, I-V characteristics at 23, 44 and 63 °C, respectively). At higher temperatures the current values are higher, but the form remains unchanged. The temperature dependence of the resistivity, giving E = 1.30 eV, is shown in Fig. 2: the pre-oxidized layer has 1.30 eV, and the oxide grown at high temperature (the upper layer) has 0.54 eV activation energy.

Sample Zry-4W behaved completely differently from sample Zr1Nb. As shown in Fig. 3a, the I-V characteristic at room temperature was sub-linear up to 50 V, i.e. it obeyed Eq. (1) though with a negative coefficient a, and at higher voltages it continued with a linear trend. With rising temperature the sub-linear part diminished continually, and the linear part gradually changed over to a space-charge limited current (Fig. 3b), until this character prevailed completely at temperatures over 150 °C (Fig. 3c). The end point of the sub-linear behavior decreased from 50 V at 23.5 °C to 20 V at 120 °C, with an indication of 10 V at 142 °C, and was not observed at temperatures over 150 °C.

This behavior may be explained by the different properties of the original pre-oxidized layer and of the subsequent, rapidly grown top layer, which together create a double layer with attributes similar to p-n junctions. In steam at temperatures below 450 °C, the dark monoclinic oxide phase is formed, with an activation energy of about 1.3 eV; it prevails at higher measuring temperatures and higher voltages. At temperatures over 450 °C a whitish oxide is formed, with a low activation energy of 0.5 eV, which determines the current flowing at and near room temperature. At the interface between the two oxide types there is a potential drop from 1.3 to 0.5 V, which influences the form of the I-V characteristics.

Using Eq. (1), the coefficient a in the prevailing region of space-charge limited currents determines the mobility with sufficient precision, but the linear part represents only a small fraction of the characteristic, so coefficient b carries a large error. It is therefore better to compute the resistivity directly, using current readings near the origin, at voltages below the onset of the space-charge influence; coefficient c is best determined at zero voltage. The temperature dependence of the resistivity plotted in Fig. 4, using values taken from the I-V characteristics measured at 102, 120, 142, 159 and 180 °C, is thus well defined, but not so at lower temperatures. To fill this gap, the resistivity in Fig. 5 was determined by current measurement at a constant 5 V with slowly rising temperature up to 100 °C, completing the inaccurate range under 100 °C in Fig. 4.

Fig. 2: Zr1Nb, temperature dependence of resistivity
Fig. 3: Zry-4W, I-V characteristics changing with temperature: a) 23.5 °C, b) 102 °C, c) 180 °C

The observed negative zero current (without external voltage) in Fig. 6 was due to extraction of the formerly injected charge, and was not caused by continued oxidation, which would have produced a positive ion current. At a flow of current I, the resistance R of the layer and the input resistance Ri of the pico-amperemeter form a closed circuit, thus acting as a voltage source defined by the sum of the voltage drops across the resistors, U = I (R + Ri). This is an open-circuit voltage and can be assessed by compensating the current, or computed if the resistance of the layer is known. The result shown in Fig. 7 is astonishing: the extremely high voltages, up to 22 V, are the consequence of the former high injection at 90 V into the very thick oxide layer. The typical time dependence of the injection and extraction currents is shown in Fig. 8.
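Both evaluations used above, the fit of Eq. (1) to an I-V curve and the activation energy taken from the slope of log ρ versus 1000/T, are simple to reproduce numerically; a sketch with hypothetical readings (the arrays below are placeholders, not the measured data):

    import numpy as np

    # Fit of eq. (1), I = a*U^2 + b*U + c, to one measured branch.
    U = np.arange(5.0, 95.0, 5.0)                 # voltage steps of 5 V
    I = 0.006 * U**2 + 0.05 * U + 0.4             # placeholder currents (pA)
    a, b, c = np.polyfit(U, I, 2)                 # coefficients of eq. (1)

    # Activation energy from the slope of log10(rho) versus 1000/T.
    invT = np.array([2.0, 2.4, 2.8, 3.2])         # 1000/T (1/K), placeholders
    log_rho = np.array([11.4, 12.3, 13.2, 14.1])  # log10 rho (ohm cm)
    slope = np.polyfit(invT, log_rho, 1)[0]
    k_B = 8.617e-5                                # Boltzmann constant (eV/K)
    E_act = slope * 1000.0 * k_B * np.log(10.0)   # activation energy (eV)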
It takes a long time, up to 30 min, to reach equilibrium after applying the voltage. The injected charge is stored and can be observed as a negative extraction current by short-circuiting the sample with the pico-amperemeter; the current obeys a power law of the form I = I0 t^(−n), with exponent 0.5 ≤ n ≤ 1. The time integrals of the extraction current and of the negative injection current (minus the equilibrium current) are equal to the injected space charge; divided by the injection voltage, they define the capacity C/V of the layer.

5 Conclusions

The changes provoked by the high-temperature shock are best seen by comparing the properties with those of the pre-oxidized samples, as shown in Table 2. The main difference consists in the formation of double layers: the pre-oxidized grey monoclinic phase with a high activation energy (1.3 eV), and the phase subsequently grown at 1200 °C, whitish and with a low activation energy (0.5 eV), together acting as n⁺-n structures. In Zr1Nb the alloyed Nb does not form active centers, and therefore the I-V characteristics are normal space-charge limited current curves, whereas in Zry-4W there are active centers of Sn, Fe and Cr, which influence the I-V characteristics.

Fig. 4: Zry-4W, resistivity at temperatures over 100 °C
Fig. 5: Zry-4W, resistivity under 100 °C
Fig. 6: Zry-4W, negative zero current (extraction)
Fig. 7: Zry-4W, negative zero voltage
Fig. 8: Zry-4W, injection (at 75 V) and extraction at T = 83 °C

Table 2: Comparison of samples before and after thermal shock

                     Zr1Nb                     Zry-4W
    steam at         425 °C      1200 °C       425 °C      1200 °C
    εr               18          36            11.5        31
    ρ (Ω cm)         7×10¹³      6.6×10¹³      2.5×10¹⁴    4×10¹⁴
    E (eV)           1.3         1.3/0.5       1.4         1.1/0.5
    C/V (µF/cm³)     3.9         –             1.0         0.9

In summary: the permittivity εr changes from low to high for both alloys; the resistivity ρ stays nearly equal for Zr1Nb and becomes twice higher for Zry-4W; the activation energy E changes from the single value 1.3 eV (1.4 eV) to a change of slope at about 85 °C; and the capacity C/V, high for Zr1Nb before the shock, remains essentially unchanged (slightly lower) for Zry-4W.

Acknowledgments

Support for this work from UJP, Praha a.s. and from grant MSM 680770015 is highly appreciated. Special thanks are due to Mrs. V. Vrtílková for preparing the specimens with measured thickness.

References

[1] Frank, H.: Influence of oxidation media on the transport properties of thin oxide layers of zirconium alloys. Acta Polytechnica, Vol. 47 (2007), No. 6, p. 36.
[2] Frank, H.: J. Nucl. Mater., Vol. 340 (2005), p. 119.
[3] Mott, N. F., Gurney, R. W.: Electronic Processes in Ionic Crystals. Clarendon, Oxford, 1940.
[4] Gould, R. D.: J. Appl. Phys., Vol. 53 (1982), p. 3353.

Prof. RNDr. Helmar Frank, DrSc.
phone: +420 2 2435 8655, fax: +420 2 2191 2407
e-mail: helmar.frank@fjfi.cvut.cz
Department of Solid State Engineering, Czech Technical University in Prague, Faculty of Nuclear Sciences and Physical Engineering, Trojanova 13, 120 00 Prague 2, Czech Republic

Atmospheric argon free burning arcs with a simplified unified model using CFD-arc modeling

Won-Ho Lee (a), Jong-Chul Lee (b,*)
(a) Graduate School of Automotive Engineering, Gangneung-Wonju National University, Republic of Korea
(b) School of Mechanical and Automotive Engineering, Gangneung-Wonju National University, Republic of Korea
(*) corresponding author: jclee01@gwnu.ac.kr

Abstract. Free burning arcs, where the work piece acts as an anode, are frequently used for a number of applications. Our investigation is exclusively concerned with a simplified unified model of arcs and their electrodes under steady-state conditions at atmospheric pressure. The model is used to predict arc and electrode temperatures and the arc voltage for a 200 A arc in argon. The computed temperatures along the axis between the cathode tip and the anode surface compare well with the measured data.

Keywords: free burning arcs, thermal plasmas, arc modeling, arc-electrode interaction, computational fluid dynamics.

1. Introduction

High-pressure arcs (also known as thermal plasmas) have been used for cutting, welding, spraying, coating, material heating and melting, lighting, current interruption and, more recently, for waste disposal, the production of fine particles, and thermal plasma vapor deposition [7, 10]. Two methods are used for generating thermal plasmas, a DC arc discharge and an inductively coupled discharge, and DC arc torches have been widely adopted for applications because of their simplicity. By the configuration of the electrodes there are two types of DC arc torches: transferred arc and non-transferred arc. High-power DC arc torches for the transferred arc (also known as the free burning arc) have been studied intensively, because in free burning arc plasma systems the arc energy is deposited directly on the treated material, which acts as the anode, with high heat-transfer efficiency [2].

Thermal plasmas are usually at atmospheric pressure or above. The very frequent collisions between particles of different species within the arc ensure the attainment of a single temperature for all species. For such a plasma, local thermodynamic equilibrium (LTE) usually holds, and computer simulation based on LTE can usually predict satisfactorily the bulk properties of the arc plasma (i.e. the arc column). For understanding the basic physical processes occurring in free burning arcs, several computations have been conducted, comparing the results successfully with temperature measurements of the arc column [4, 1]. However, the accuracy of the computations is insufficient for the anode region, where there is a severe departure from LTE. It is very important to predict precisely the temperature of the anode region in free burning arcs: the quality and the efficiency of a process depend on the energy flux going into the work piece, which acts as the anode.
Our investigation is exclusively concerned with argon free burning arcs under steady-state conditions at atmospheric pressure, studied by computational fluid dynamics (CFD) analysis. We are also interested in the energy flux and temperature transferred to the anode work piece, using a simplified unified model of the arcs and their electrodes. In order to determine the two thermodynamic quantities, temperature and pressure, together with the flow characteristics, we have modified the Navier-Stokes equations to take into account radiation transport, the electrical power input and the electromagnetic driving forces, coupled with the relevant Maxwell equations. From the simplified self-consistent solution the energy flux to the anode work piece can be derived.

2. Numerical methods

The governing equations can be written in the common conservation form

    (1/r) ∂/∂r [ r ρ v φ − r Γφ ∂φ/∂r ] + ∂/∂z [ ρ w φ − Γφ ∂φ/∂z ] = Sφ ,    (1)

    (1/r) ∂/∂r ( r σ ∂ϕ/∂r ) + ∂/∂z ( σ ∂ϕ/∂z ) = 0 ,    (2)

where ρ is the gas density, Γφ the diffusion coefficient, Sφ the source term, φ the dependent variable, µ the molecular viscosity, κ the thermal conductivity, ϕ the electrostatic potential and σ the electrical conductivity (Tab. 1). The subscript 'l' denotes the laminar part of a diffusion coefficient and 't' its turbulent part. In the energy conservation equation, q represents the net radiation loss per unit volume and time.

Table 1. Definitions of the variable, diffusion coefficient and source term for the governing equations.

    equation                 φ    Γφ             Sφ
    continuity               1    0              0
    axial momentum           w    µl + µt        −∂p/∂z + jr Bθ + viscous terms
    radial momentum          v    µl + µt        −∂p/∂r + jz Bθ + viscous terms
    enthalpy                 h    (κl + κt)/cp   σE² − q + (∂p/∂r)v + (∂p/∂z)w + viscous dissipation
    electrostatic potential  ϕ    σ              0

Table 2. Boundary conditions for the velocities, the temperature and the electric potential (the boundary segments are marked by the letters A to I in Fig. 1).

    boundary  v (m/s)  w (m/s)      T (K)        ϕ
    DE        **       **           1000         ∂ϕ/∂z = j0/σ
    EJ        0        w_in         1123         ∂ϕ/∂z = 0
    KG        *        *            ∂h/∂n = 0    ∂ϕ/∂n = 0
    GH, BC    0        ∂w/∂r = 0    ∂h/∂r = 0    ∂ϕ/∂r = 0
    CD        **       **           ∂T/∂r = 0    ∂ϕ/∂r = 0
    EF, FC    0        0            ---          ---
    BH        0        0            ---          ϕ = const.
    HI, AI    **       **           1000         ---

The viscous terms in the momentum and energy equations which are not included in the diffusion coefficient are treated as parts of the respective source terms. Ohmic heating σE² provides the heat source, and the Lorentz force j × B provides the momentum source. In addition, the current density and the magnetic field are calculated from the electrostatic potential and from Ampère's law, respectively, and can be written as

    jr = σ ∂ϕ/∂r ,   jz = σ ∂ϕ/∂z ,    (3)

    (1/r) ∂/∂r (r Bθ) = µ0 jz ,    (4)

where Bθ is the azimuthal component of the magnetic field and µ0 the permeability of free space. The thermodynamic and transport properties required for this calculation (density, enthalpy, constant-pressure specific heat, viscosity, electrical and thermal conductivities, and optically thin radiation losses) depend strongly on temperature and pressure; the data adopted in this study have been taken from Refs. [8, 9].

The boundary conditions required for the solution of Eqs. 1 and 2 are listed in Tab. 2. In Tab. 2, j0 is the total current divided by the uniform cross-section of the cathode and 'n' denotes the direction normal to a surface. The symbol '*' denotes that the pressure is set at 0.1 MPa, and the symbol '**' denotes locations where the velocities do not need to be specified. The given inlet flow conditions correspond to those investigated by Haddad and Farmer, i.e. 0.5 l/min [3]. The symbol '---' indicates that the solid surface temperature is calculated.
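Of the governing equations, Eq. (2) for the electrostatic potential is the simplest to discretize and shows the structure of the solver well. A finite-difference Gauss-Seidel sketch on a uniform r-z grid follows; the grid, the conductivity field and the pure-Dirichlet boundary handling are simplifications of the actual CFD treatment:

    import numpy as np

    def solve_potential(sigma, dr, dz, phi, n_sweeps=2000):
        # Gauss-Seidel sweeps for (1/r) d/dr(r sigma dphi/dr) + d/dz(sigma dphi/dz) = 0.
        # sigma, phi: (Nr, Nz) arrays; boundary values of phi are held fixed.
        Nr, Nz = phi.shape
        r = (np.arange(Nr) + 0.5) * dr            # cell-centre radii (avoids r = 0)
        for _ in range(n_sweeps):
            for i in range(1, Nr - 1):
                for k in range(1, Nz - 1):
                    # face coefficients: averaged sigma times face radius
                    sE = 0.5 * (sigma[i, k] + sigma[i + 1, k]) * (r[i] + 0.5 * dr) / dr**2
                    sW = 0.5 * (sigma[i, k] + sigma[i - 1, k]) * (r[i] - 0.5 * dr) / dr**2
                    sN = 0.5 * (sigma[i, k] + sigma[i, k + 1]) * r[i] / dz**2
                    sS = 0.5 * (sigma[i, k] + sigma[i, k - 1]) * r[i] / dz**2
                    phi[i, k] = (sE * phi[i + 1, k] + sW * phi[i - 1, k]
                                 + sN * phi[i, k + 1] + sS * phi[i, k - 1]) / (sE + sW + sN + sS)
        return phi

    # Afterwards, eq. (3) gives j_z = sigma * dphi/dz, and eq. (4) gives
    # B_theta(r) as (mu0 / r) times the running integral of j_z * r' dr'.

In the full model this solve would be repeated every outer iteration, since σ depends strongly on the temperature and pressure fields being solved simultaneously.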
Energy transport within the solid electrodes is governed by conduction; the energy conservation equation for the steady-state case can be written as

    ∇ · (km ∇Tm) = 0 ,    (5)

where km and Tm are the thermal conductivity and the temperature of the electrode material, respectively.

Figure 1. Schematic of a free-burning arc plasma system for the experiments performed by Haddad and Farmer [3].

3. Results and discussion

The argon arc burns at atmospheric pressure in a 5 mm gap between a shaped cathode and a water-cooled copper anode, as shown in Fig. 1 (commonly known as the point-plane arc), where the key locations in the computational domain are indicated by the letters A to I. The tungsten cathode has a rod diameter of 3.2 mm and its tip has a full conical angle of 60° [3]. The temperature distribution in the arc can be predicted by including the electrodes in the solution domain, without the inclusion of the non-LTE sheaths. The experimental arrangement and the experimentally measured arc temperatures of Farmer and Haddad [3] have become a benchmark for theoretical studies of the argon free burning arc. Argon gas with a temperature of 1000 K and a pressure of 0.1 MPa is fed in through an annular nozzle at a fixed flow rate of 0.5 l/min. The comparison between the simulation results and the experimental measurements was made for an arc current of 200 A.

The two-dimensional temperature and axial-velocity fields are shown in Fig. 2 (Figure 2. Temperature [K] and axial velocity [m/s] fields for a 200 A free burning argon arc at 1 atm under the experimental conditions of Haddad and Farmer [3]). The temperature in the arc region presents the typical bell shape of high-intensity arcs. The present results for 200 A, compared with those of [3, 11] given in brackets, were: maximum arc temperature 24 300 K (23 700 K), maximum axial velocity 500 m/s (500 m/s) and arc voltage 10.5 V (12.9 V). The temperature near the cathode is high because of the contraction of the arc column near the cathode spot. The magnetic pinch effect induced by the arc current creates an over-pressure near the cathode. Since the arc temperature is very high, a small pressure difference will accelerate the arcing gas to a high speed, typically a couple of hundred meters per second; there is a very thin, high-velocity core in the arc column (Fig. 2b). The half-width of the radial profile of axial velocity, defined as the radial position at which the velocity is reduced to 50 % of the axis velocity at the mid-gap (2.5 mm from the cathode tip), is 0.4 mm. Such a narrow high-velocity core is typical of laminar arc plasma flow, for which viscous effects are negligible. The energy transport is dominated by convection and radiation in the core region near the axis; however, a turbulence model is essential to predict the flow characteristics near the cathode and the anode, due to separation and stagnation, respectively [5]. According to the above comparative data, our simulation results match well the measured and predicted data given by [3, 11]. Figure 3 shows the detailed comparison of the temperatures along the axis; the experimental data of [3] for an arc under the same conditions are also plotted in Fig. 3.
The temperature near the cathode is higher than that near the anode because of the contraction of the arc column near the cathode spot. The differences between the two sets of results are not significant, and there is good agreement.

Figure 3. Comparison of the predicted temperature (solid lines) along the axis between the cathode tip and the anode surface with the experimental results (symbols) of Haddad and Farmer [3] for a 200 A argon arc at 1 atm.

Figure 4. Temperature and isotherm lines for a 200 A argon arc at 1 atm. In the anode the outer isotherm is 1400 K, with an interval of 25 K for the temperature range 1000–1400 K, whereas an interval of 1000 K is used for the range 1000–23 000 K in the arc column.

In Fig. 4 we present the computed isotherm lines for the anode region. The temperature reaches a maximum of 1365 K in the anode, which shows good agreement with the results of Lago et al. [6]. From this value we can predict the presence of a liquid metal pool in the anode, and of metal vapours in the plasma, because it is greater than the melting point of copper (1357 K). Therefore, further investigation should include the modelling of Cu evaporation from the anode and of the non-LTE situation near the electrode, for more realistic calculations.

4. Conclusions

In this study we have carried out a computational investigation of argon free burning arcs under steady-state conditions at atmospheric pressure, with a simplified unified model of the arcs and their electrodes. It was found that the computed temperatures along the axis between the cathode tip and the anode surface show good agreement with those measured by Haddad and Farmer [3]. For the maximum axial velocity, the predicted value of 750 m/s from our simulation matches well the predicted value given by Zhu et al. [11]. In addition to the arc column, the temperature distribution in the anode also shows good agreement with the results of Lago et al. [6]. This knowledge of the features of free burning arcs can play a role in developing atmospheric plasma systems; however, further investigation should include the modelling of Cu evaporation from the anode and of the non-LTE situation near the electrodes, for more realistic calculations.

Acknowledgements

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education, Science and Technology (2012-0007808).

References

[1] R. Bibi, M. Monno, M. I. Boulos. Numerical and experimental study of transferred arcs in argon. J Phys D: Appl Phys 39(15):3253–3266, 2006.
[2] W. H. Gauvin. Some characteristics of transferred-arc plasmas. Plasma Chem and Plasma Processing 9(1):65S–84S, 1989.
[3] G. N. Haddad, A. J. D. Farmer. Temperature determinations in a free-burning arc. I. Experimental techniques and results in argon. J Phys D: Appl Phys 17(6):1189–1196, 1984.
[4] J. Haidar. A theoretical model for gas metal arc welding and gas tungsten arc welding. J Appl Phys 84(7):3518–3529, 1998.
[5] Y. J. Kim, J. C. Lee. Computational investigation on the disturbed thermal plasma for improving the reliability of electrostatic diagnosis. Vacuum 84(6):766–769, 2010.
[6] F. Lago, J. J. Gonzalez, P. Freton, A. Gleizes. A numerical modelling of an electric arc and its interaction with the anode: part I. The two-dimensional model. J Phys D: Appl Phys 37(6):883–897, 2004.
[7] S. Liau. Computer simulation of high pressure non-equilibrium plasma. Department of Electrical Engineering, University of Liverpool, England, 2005. Dr. thesis.
[8] J. Menart, J. Heberlein, E. Pfender. Theoretical radiative emission results for argon/copper thermal plasmas. Plasma Chem and Plasma Processing 16(1):245S–265S, 1995.
[9] A. B. Murphy. A comparison of treatments of diffusion in thermal plasmas. J Phys D: Appl Phys 29(7):1922–1932, 1996.
[10] E. Pfender. Thermal plasma technology: where do we stand and where are we going? Plasma Chem and Plasma Processing 19(1):1–31, 1999.
[11] P. Zhu, J. J. Lowke, R. Morrow. A unified theory of free burning arcs, cathode sheaths and cathodes. J Phys D: Appl Phys 25(8):1221–1230, 1992.

Incorporation of load induced thermal strain in finite element models

A. Law, M. Gillie, P. Pankaj

Load induced thermal strains (LITS) are an integral part of the behaviour of concrete in fire; their existence has been well documented and modelled by different researchers. A more thorough representation of LITS is needed to accurately represent their plastic constituents in finite element models. This paper develops a technique to allow the evolution of LITS in accordance with the rules developed in several academic models. The technique is implemented with a simple Drucker-Prager yield surface and the results assessed.

Keywords: load induced thermal strain, finite element, concrete, structures in fire, yield surface.

1 Introduction

Load induced thermal strain (LITS) is an integral part of the behaviour of concrete in fire. The existence of LITS has been well documented and modelled by different researchers. It is vital that this strain development is correctly represented in structural models, as the locked-in strains due to the LITS constituents are significant. Current methods of modelling LITS involve incorporating the strains into constitutive curves; this approach allows the total strains developed due to LITS to be included simply in a finite element analysis. A more thorough representation is needed to accurately represent the plastic components in the loading directions, and the total strains in the non-loading directions. This paper presents a technique to allow the evolution of LITS in accordance with the rules developed in several academic material models [1–3]. The technique is implemented with a simple Drucker-Prager yield surface and the results are assessed.

2 Current methods

Inclusion of LITS in a concrete constitutive curve is a convenient way of representing LITS in finite element analyses. It allows the modeller to make the LITS constituents temperature dependent and stress dependent, through the use of multiple curves and by giving strains for different stresses, respectively. A number of models are available from different sources and for different concretes [1–4]. Failure to represent LITS will result in the modeller not modelling the strains developed in the material accurately, thereby giving an excessively stiff structure. In fact, it could be argued that since LITS is an integral part of concrete behaviour, a modeller failing to include it will not be modelling concrete but some other, non-physical, material.

Once the total strains caused by LITS have been represented, one can then think about the division between elastic and plastic strains. It has been observed that the largest LITS
constituents are irrecoverable [5], i.e. they are plastic strains. Therefore, to model these plastic strains accurately it is necessary to determine the elastic modulus of the material as a function of temperature. If the modulus is too stiff, the plastic strains will be overestimated; too soft, and they will be underestimated. The correct modelling of the plastic strain constituents becomes increasingly important as a structure cools, since the plastic strains will induce greater tension on strain reversal.

Some authors have presented their material models in parts, allowing the user to build the strain constituents into the full curve. The elastic modulus is then a precisely identifiable constituent of the material model and can be included in a structural model as such; henceforth, this will be termed the "actual" modulus. Other material data, such as that presented in the Eurocode, do not specify the value of the elastic modulus. In this case, extra care must be taken to represent the strain components accurately. Where the elastic modulus is the initial gradient of the constitutive curve, it will be termed the "apparent" modulus.

3 Multiple dimensions

The primary focus of research has been on the total and plastic strains in the direction of loading. However, attention must also be paid to the non-loading directions. Depending on the model in use, failure to consider the elastic modulus of the material carefully will result in unrepresentative plastic strains, unexpected strains in the non-loading directions, or a mixture of both. The potential for these effects to manifest themselves can be demonstrated by a simple example.

3.1 Simple example

Consider a small cube of concrete, subject to a displacement-controlled loading in principal direction 2 but free to move in the transverse directions, with a Drucker-Prager yield surface and perfectly plastic material behaviour, as shown in Fig. 1 (Fig. 1: plastic flow, and model setup). The associative isotropic flow rule (used here for simplicity) dictates that once the yield surface is reached, plastic strain must occur in a direction orthogonal to the yield surface in stress space. This means that plastic strains are induced in directions other than the one in which the load is applied. Since the location of the trial stress is a function of the elastic modulus, the implications for the implementation of LITS via a constitutive curve are significant. The inclusion of LITS, whether implicit (with an "apparent" elastic modulus) or explicit (with an "actual" elastic modulus), will result in a proportion of that LITS becoming active in the transverse directions. The magnitude of the extra strain depends on the stress state of the material, and on the degree of plasticity developed in the principal direction. For example, should the element described above be at the stress state of point A, no plastic strains would be induced in the 1-direction.
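The transverse coupling described above can be made concrete with a few lines of code evaluating a Drucker-Prager yield function and its (associative) flow direction at a uniaxial stress state; the constants ALPHA and K below are illustrative placeholders, not values from the paper:

    import numpy as np

    ALPHA, K = 0.2, 10.0                          # illustrative material constants

    def dp_yield(sig):
        # Drucker-Prager: f = alpha*I1 + sqrt(J2) - k, sig = principal stresses.
        i1 = np.sum(sig)
        s = sig - i1 / 3.0                        # deviatoric part
        j2 = 0.5 * np.dot(s, s)
        return ALPHA * i1 + np.sqrt(j2) - K

    def flow_direction(sig, h=1e-6):
        # Associative flow: plastic strain increment ~ df/dsigma (numerical).
        g = np.zeros(3)
        for i in range(3):
            d = np.zeros(3); d[i] = h
            g[i] = (dp_yield(sig + d) - dp_yield(sig - d)) / (2 * h)
        return g

    sig = np.array([0.0, -30.0, 0.0])             # uniaxial compression, direction 2
    print(dp_yield(sig))                          # > 0: the state is at/over yield
    print(flow_direction(sig))                    # components 1 and 3 are non-zero

The non-zero components in directions 1 and 3 are exactly the transverse plastic strains discussed in the text.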
In the case of the apparent modulus, a large proportion of the extra transverse strain may be elastic, while in the case of the actual modulus the major constituent of the incremental strain would be plastic. The impact of this difference is demonstrated below, using the Drucker-Prager yield criterion with a constitutive curve corresponding to the 200 °C Terro [2] LITS curve. This temperature was used because there is a significant difference between the actual and apparent moduli, but the temperature is not too extreme. Two different models were created, each with a different elastic modulus – apparent or actual – but with the same constitutive curve (Fig. 2a). The numerical models consisted of a single cubic finite element, restrained at the base in the 2-direction (but free to displace in the 1- and 3-directions) and strained in the 2-direction; the corresponding deformations and plastic strains were recorded.

Fig. 2: The same constitutive curve with different elastic moduli gives different lateral deformations and direct plastic strains

Fig. 2b shows the total strains in the lateral deformation direction. The strains in the 2-direction (i.e. the direction of strain control) are the same for both models. In the unrestricted directions, however, there are significant differences in the total strains, particularly in the inelastic phase of the constitutive model. The origin of these differences can be seen clearly from Fig. 2c: in the "apparent" model the plastic strains do not develop until much later in the deformation process, whereas the "actual" model – because of the difference between the elastic modulus and the shape of the constitutive curve – activates the plastic strain constituents immediately. This difference in plastic strain is entirely due to the activation of the flow rule at a much lower stress. Consequently, though the plastic strain in the loading direction is what would be expected from using the "actual" modulus in the constitutive curve, the impact of this approach can be seen clearly in the non-loading directions. Since the equations representing LITS are all functions of temperature and direct stress, the use of either the apparent or the actual modulus is inadequate if one wants to model the plastic strains accurately whilst limiting the lateral deformations.

Fig. 3: Calculation of plastic and elastic strains: a) hardening with apparent modulus, b) corresponding strains
Fig. 4: Redistribution of strains due to the difference between actual modulus and apparent modulus: a) redistribution found using actual modulus, b) corresponding strains

4 The embedded modulus

To allow the modelling of LITS to be more representative, a new method is proposed for including LITS in the constitutive model while avoiding the transverse strain issue outlined above. The Drucker-Prager yield criterion and plasticity equations are solved in a two-step method: first, the elastic strains and corresponding plastic strains are calculated using the apparent modulus and the normal solution methods (Fig. 3); secondly, the elastic (εel1) and plastic (εpl1) strains are recalculated using the actual modulus (Fig. 4). As such, the actual modulus is embedded within the solution procedure. This second stage can be expressed simply as

    εel1 = σ / Eem ,    (1)

where Eem is the embedded actual modulus and σ is the stress calculated from the previous solution.
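In code, the second stage is a small re-partition of the strains, using Eq. (1) together with Eqs. (2) and (3) derived below (a minimal sketch; the variable names are ours):

    def repartition_strains(sigma, eps_total, E_em):
        # Two-step correction: after the ordinary (apparent-modulus) solution,
        # recompute the elastic/plastic split with the embedded actual modulus.
        eps_el1 = sigma / E_em            # eq. (1)
        eps_pl1 = eps_total - eps_el1     # eqs. (2)-(3)
        return eps_el1, eps_pl1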
Since

    εel0 + εpl0 = εtotal ,    (2)

where εel0 and εpl0 are the original elastic and plastic strains and εtotal is the total strain, the new plastic strain can be calculated directly from

    εpl1 = εtotal − εel1 .    (3)

The new plastic and elastic strains are then used in the subsequent analysis; the equivalent plastic strain is not, however, changed. Consequently, the strains developed in the transverse directions are in line with those that would occur when using an apparent modulus, but the plastic strains developed in the principal direction are as would be expected from using the actual modulus. It should also be noted that where plastic strain has occurred but the yield function is found to be negative (i.e. the total strain is reduced), the corresponding elastic stresses must be recalculated using the embedded modulus; otherwise, the redistributed strains would be reabsorbed into the elastic region on the return to zero stress.

A Drucker-Prager model incorporating this method of modification by the embedded modulus was created [6–12]. A model with an apparent elastic modulus and an embedded actual modulus was subjected to the test described previously, and the results were compared with the previous models (Fig. 5; Fig. 5: comparison of the two-stage approach with the results from the original models). The total lateral strains experienced by the "embedded" material are the same as those experienced by the "apparent" material. Equally, the total plastic strain experienced in the loading direction is the same as that experienced by the "actual" material. Thus, a fully plastic, transient strain constituent has been included in the model without affecting the deformations in the non-loading directions. This allows the plastic LITS effect to be modelled successfully uniaxially, and in proportion to the applied stress, in the way stated in the governing LITS equations.

5 Conclusion

There are several conclusions to be drawn from this study:
– There are significant differences between a constitutive curve which includes LITS and a full constitutive model which accurately represents the LITS components.
– Inclusion of the plastic strains by means of an "apparent" modulus is useful in one dimension; however, plastic flow rules cause unwanted strains to develop laterally when more than one dimension is considered.
– Use of a two-step model with an "apparent" modulus and an embedded "actual" modulus within the material model is one approach which can be used to model the plastic strain due to the LITS equations correctly, while also allowing the strain in the lateral directions to be modelled correctly. This model has been demonstrated in the case of an element deformed uniaxially.

The authors of this work would like to gratefully thank the project sponsors, the BRE Trust and the EPSRC.

References

[1] Anderberg, Y., Thelandersson, S.: Stress and Deformation Characteristics of Concrete, 2: Experimental Investigation and Material Behaviour Model. Sweden, University of Lund, 1976.
[2] Terro, M. J.: Numerical modeling of the behaviour of concrete structures in fire. ACI Structural Journal, Vol. 95 (1998), p. 183–193.
[3] Nielsen, C. V., Pearce, C. J., Bicanic, N.: Theoretical model of high temperature effects on uniaxial concrete member under elastic restraint. Magazine of Concrete Research, Vol. 54 (2002), p. 239–249.
[4] EN 1992-1-2: Design of concrete structures – Part 1-2: General rules – Structural fire design, 1992.
[5] Khoury, G. A., Grainger, B. N., Sullivan, P. J. E.: Transient thermal strain of concrete: literature review, conditions within specimen and behaviour of individual constituents. Magazine of Concrete Research, Vol. 37 (1985), p. 131–144.
[6] Calladine, C. R.: Plasticity for Engineers: Theory and Applications. Chichester: Horwood Publishing, 2000.
[7] Crisfield, M. A.: Non-linear Finite Element Analysis of Solids and Structures. Chichester: Wiley, 1991.
[8] Crisfield, M. A.: Non-linear Finite Element Analysis of Solids and Structures: Advanced Topics. Chichester: Wiley, 1997.
[9] Cook, R. D., Malkus, D. S., Plesha, M. E., et al.: Concepts and Applications of Finite Element Analysis. Chichester: Wiley, 2002.
[10] Hill, R.: The Mathematical Theory of Plasticity. Oxford: Oxford University Press, 1950.
[11] Pankaj, P.: Finite Element Analysis in Strain Softening and Localisation Problems. Department of Civil Engineering, Swansea: University College of Swansea, 1990.
[12] Zienkiewicz, O. C.: The Finite Element Method (3rd ed.). London: McGraw-Hill, 1977.

Angus Law, Martin Gillie, Pankaj Pankaj
The University of Edinburgh, BRE Centre for Fire Safety Engineering, Edinburgh, UK

Tilings generated by Ito-Sadahiro and balanced (−β)-numeration systems

P. Ambrož

Abstract: Let β > 1 be a cubic Pisot unit. We study the forms of Thurston tilings arising from the classical β-numeration system and from the (−β)-numeration system, for both the Ito-Sadahiro and the balanced definition of the (−β)-transformation.

Keywords: beta-expansion, negative base, tiling.

1 Introduction

Representations of real numbers in a positional numeration system with an arbitrary base β > 1, so-called β-expansions, were introduced by Rényi [10]. During the fifty years since the publication of this seminal paper, β-expansions have been studied extensively from various points of view. This paper considers tilings generated by β-expansions in the case when β is a Pisot unit. A general method for constructing the tiling of a Euclidean space by a Pisot unit was proposed by Thurston [11], although an example of such a tiling had already appeared in the work of Rauzy [9]. Fundamental properties of these tilings were later studied by Praggastis [8] and Akiyama [1, 2].

In 2009, Ito and Sadahiro introduced a new numeration system [6], using a non-integer negative base −β < −1; their approach is very similar to the approach of Rényi. Another definition of a system using a non-integer negative base −β < −1, obtained as a slight modification of the system of Ito and Sadahiro, was considered by Dombek [4]. The main subject of this paper is to transfer Thurston's construction into the framework of (−β)-numeration (in both cases) and to provide examples of how the tilings (for a fixed β) in the positive and the negative case can resemble and/or differ from each other. The paper is intended as an entry point into a study of the properties of these tilings.

2 Rényi β-expansions

Let β > 1 be a real number and let the transformation Tβ : [0, 1) → [0, 1) be defined by the prescription Tβ(x) := βx − ⌊βx⌋.
The representation of a number x ∈ [0, 1) of the form

    x = x1/β + x2/β² + x3/β³ + · · · ,   where xi = ⌊β Tβ^(i−1)(x)⌋ ,

is called the β-expansion of x. Since β Tβ^(i−1)(x) ∈ [0, β), the coefficients xi (called digits) are elements of the set {0, 1, ..., ⌈β⌉ − 1}. The β-expansion of an arbitrary real number x ≥ 1 can be defined naturally in the following way: find an exponent k ∈ N such that x/β^k ∈ [0, 1), and using the transformation Tβ derive the β-expansion of x/β^k in the form

    x/β^k = x1/β + x2/β² + x3/β³ + · · · ,

so that

    x = x1 β^(k−1) + x2 β^(k−2) + · · · + x_(k−1) β + x_k + x_(k+1)/β + · · ·

The β-expansion of x ∈ R⁺ is denoted by dβ(x), and as usual we write dβ(x) = x1 x2 ... x_k • x_(k+1) x_(k+2) ...

A digit string x1x2x3 · · · is said to be β-admissible if there exists a number x ∈ [0, 1) such that dβ(x) = •x1x2x3 ... is its β-expansion. The set of admissible digit strings can be described using the Rényi expansion of 1, denoted dβ(1) = t1 t2 t3 ..., where t1 = ⌊β⌋ and dβ(β − ⌊β⌋) = t2 t3 t4 ... The Rényi expansion of 1 may or may not be finite (i.e., ending in infinitely many 0's, which are omitted). The infinite Rényi expansion of 1, denoted d*β(1), is defined by

    d*β(1) = lim_(ε→0+) dβ(1 − ε) ,

where the limit is taken in the usual product topology on {0, 1, ..., ⌈β⌉ − 1}^N. It can be shown that

    d*β(1) = dβ(1)                              if dβ(1) is infinite,
    d*β(1) = ( t1 · · · t_(m−1) (t_m − 1) )^ω   if dβ(1) = t1 · · · t_m 0^ω with t_m ≠ 0.

The characterization of admissible strings is given by the following theorem, due to Parry.

Theorem 1 ([7]). A string x1x2x3 ... over the alphabet {0, 1, ..., ⌈β⌉ − 1} is β-admissible if and only if for all i = 1, 2, 3, ...

    0^ω ⪯lex xi x_(i+1) x_(i+2) ... ≺lex d*β(1) ,

where ⪯lex is the lexicographic order.

Using β-admissible digit strings one can define the set of non-negative β-integers, denoted Zβ,

    Zβ := { a_k β^k + · · · + a_1 β + a_0 | a_k · · · a_1 a_0 0^ω is a β-admissible digit string } ,

and the set Fin(β) of those x ∈ R⁺ whose β-expansions have only finitely many non-zero coefficients to the right of the fractional point,

    Fin(β) := ⋃_(n∈N) (1/β^n) Zβ .

The distances between consecutive β-integers are described in [11]. It is shown there that they take values in the set {δi | i = 0, 1, ...}, where

    δi = ∑_(j≥1) t_(i+j) / β^j   and   dβ(1) = t1 t2 ...

Moreover, the sequence coding the distances in Zβ is known to be invariant under a substitution, provided dβ(1) is eventually periodic [5]; the form of this substitution also depends on dβ(1). If β is an algebraic integer, then obviously Fin(β) ⊂ Z[β⁻¹]⁺. The converse inclusion, which is very important for the construction of the tiling and also for the arithmetical properties of the system, does not hold in general. An algebraic integer β for which Fin(β) = Z[β⁻¹]⁺ holds is said to have property (F).

3 Ito-Sadahiro (−β)-expansions

Now consider the real base −β < −1 and the transformation T−β : [ −β/(β+1), 1/(β+1) ) → [ −β/(β+1), 1/(β+1) ) defined by the prescription

    T−β(x) = −βx − ⌊ −βx + β/(β+1) ⌋ .

Every number x ∈ [ −β/(β+1), 1/(β+1) ) can be represented in the form

    x = x1/(−β) + x2/(−β)² + x3/(−β)³ + · · · ,   where xi = ⌊ −β T−β^(i−1)(x) + β/(β+1) ⌋ .

The representation of x in this form is called the (−β)-expansion of x and is denoted d−β(x) = •x1x2x3 ...

By analogy with the case of Rényi β-expansions, for the (−β)-expansion of x ∈ R we use a suitable exponent l ∈ N such that x/(−β)^l ∈ [ −β/(β+1), 1/(β+1) ). It is easily shown that the digits xi of a (−β)-expansion belong to the set {0, 1, ..., ⌊β⌋}.
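Both transformations lend themselves to a direct numerical sketch; floating-point arithmetic is only trustworthy for a few leading digits, and the function names are ours:

    import math

    def beta_digits(x, beta, n):
        # First n digits of the Renyi beta-expansion of x in [0, 1).
        digits = []
        for _ in range(n):
            d = math.floor(beta * x)
            digits.append(d)
            x = beta * x - d                 # T_beta(x)
        return digits

    def ito_sadahiro_digits(x, beta, n):
        # First n digits of the (-beta)-expansion of x
        # in [-beta/(beta+1), 1/(beta+1)).
        digits = []
        for _ in range(n):
            d = math.floor(-beta * x + beta / (beta + 1))
            digits.append(d)
            x = -beta * x - d                # T_{-beta}(x)
        return digits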
In order to describe the strings that arise as (−β)-expansions of some x ∈ [ −β/(β+1), 1/(β+1) ), the so-called (−β)-admissible digit strings, we will use the notation introduced in [6]. We denote by lβ = −β/(β+1) and rβ = 1/(β+1) the left and right end-points of the definition interval Iβ of the transformation T−β, that is, Iβ = [lβ, rβ). We also denote d−β(lβ) = d1 d2 d3 ...

Theorem 2 ([6]). A string x1x2x3 · · · over the alphabet {0, 1, ..., ⌊β⌋} is (−β)-admissible if and only if for all i = 1, 2, 3, ...

    d−β(lβ) ⪯alt xi x_(i+1) x_(i+2) ... ≺alt d*−β(rβ) ,

where d*−β(rβ) = lim_(ε→0+) d−β(rβ − ε) and ⪯alt is the alternate order.

Recall that the alternate order is defined as follows: we say that x1x2x3 ... ≺alt y1y2y3 ... if (−1)^i (xi − yi) < 0 for the smallest index i satisfying xi ≠ yi.

The relation between d*−β(rβ) and d−β(lβ) is described in the same paper.

Theorem 3 ([6]). Let d−β(lβ) = d1 d2 d3 ... If d−β(lβ) is purely periodic with odd period length, i.e., d−β(lβ) = (d1 d2 · · · d_(2l+1))^ω, then d*−β(rβ) = (0 d1 d2 · · · d_(2l) (d_(2l+1) − 1))^ω. Otherwise, d*−β(rβ) = 0 d−β(lβ).

Similarly to the Rényi case, one can define the set of (−β)-integers, denoted Z−β, using the admissible digit strings:

    Z−β := { a_k(−β)^k + · · · + a_1(−β) + a_0 | a_k · · · a_1 a_0 0^ω is a (−β)-admissible digit string } .

The set of distances between consecutive (−β)-integers has been described only for a particular class of β, cf. [3].

4 Balanced (−β)-numeration system

The last numeration system used in this paper is a slight modification of the (−β)-numeration defined by Ito and Sadahiro. Let −β < −1 be the base and consider the transformation S−β : [−1/2, 1/2) → [−1/2, 1/2) given by

    S−β(x) = −βx − ⌊ −βx + 1/2 ⌋ .

The balanced (−β)-expansion of a number x ∈ [−1/2, 1/2), denoted db,−β(x) = •x1x2x3 ..., is

    x = x1/(−β) + x2/(−β)² + x3/(−β)³ + · · · ,   where xi = ⌊ −β S−β^(i−1)(x) + 1/2 ⌋ .

Also in this case we use, for the (−β)-expansion of x ∈ R, a suitable exponent l ∈ N such that x/(−β)^l ∈ [−1/2, 1/2). It is easily shown that the digits xi of a balanced (−β)-expansion belong to the set { −⌊(β+1)/2⌋, ..., ⌊(β+1)/2⌋ }. Note that d̄ is sometimes written instead of −d.

A digit string x1x2x3 ... is called balanced (−β)-admissible if it arises as the balanced (−β)-expansion of some x ∈ [−1/2, 1/2). The two following theorems by Dombek [4] show that also in this case the admissible strings are characterized by the balanced (−β)-expansions of the end-points of the interval [−1/2, 1/2).

Theorem 4 ([4]). A string x1x2x3 ... over the alphabet { −⌊(β+1)/2⌋, ..., ⌊(β+1)/2⌋ } is balanced (−β)-admissible if and only if for all i = 1, 2, 3, ...

    db,−β(−1/2) ⪯alt xi x_(i+1) x_(i+2) ... ≺alt d*b,−β(1/2) ,

where d*b,−β(1/2) = lim_(ε→0+) db,−β(1/2 − ε).
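The admissibility conditions of Theorems 2 and 4 reduce to comparisons in the alternate order, which is short to state in code (a sketch over finite digit lists; our own helper, not taken from [6] or [4]):

    def alt_less(x, y):
        # Alternate order (1-based indices in the text): x <_alt y iff
        # (-1)^i (x_i - y_i) < 0 at the first index i where they differ.
        for i, (a, b) in enumerate(zip(x, y)):    # i is 0-based here,
            if a != b:                            # hence the flipped sign below
                return ((-1) ** i) * (a - b) > 0
        return False                              # equal prefixes: not strictly less

A finite string is then checked for admissibility by testing each of its suffixes against the reference expansions of the end-points, truncated to a common length.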
, β(r+2s) are complex conjugates of β such that β(r+j) = β(r+s+j) for j =1, . . . , s. denote by x(j), j =1, . . . , n the corresponding conjugate of x ∈ q(β), i.e., x = q0 + q1β + . . . + qd−1β q−1 �→ x(j) = q0 + q1β(j) + . . . + qd−1(β(j))q−1 . consider the map φ : q(β) → rd−1 defined by φ(x) := ( x(2), . . . , x(r), �(x(r+1)), �(x(r+1)), . . . , �(x(r+s)), �(x(r+s)) . proposition 6 ([1]) let β > 1 be a pisot number of degree d. then φ(z[β]) is dense in rd−1. the map φ is used to construct the tiling in the following way. let w = w1 . . . wl ∈ {0,1, . . . , �β� − 1}∗ be a finite word such that w0ω is an admissible digit string. we define the tile tw as tw := {φ(x) | x ∈ fin(β) and (x)β = ak . . . x1x0•w1 · · · wl} . the properties of the tiling of the euclidean space using tiles tw were described byakiyama; the results are summarized in the following theorems. theorem 7 ([1]) let β be a pisot unit of degree d with property (f). then • rd−1 = ⋃ w0ω admissible tw, • for each x ∈ zβ we have φ(x) ∈ inn(t�), where � is the empty word and inn(x) denotes the set of inner points of x; especially, the origin 0 is an inner point of the so-called central tile t�, • for each tile tw we have inn(tw)= tw, • ∂(tw) is closed and nowhere dense in rd−1, where ∂(tw) is the set of boundary elements of tw, • if dβ(1)= t1 · · · tm−11 then each tile tw is arc-wise connected. 10 acta polytechnica vol. 50 no. 5/2010 theorem 8 ([2]) let β be a pisot unit of degree d such that dβ(1) = t1 . . . tm(tm+1 · · · tm+p)ω with m, p the smallest possible. then there are exactly m + p different tiles up to translation. note that q(β) = q(−β) and z[β] = z[−β]. thus the construction of the tiling associated to (−β)numeration follows the same lines, the corresponding mapping φ− being defined using isomorphisms of the extension fields q(−β) and q(−β(j)), and the following variant of proposition 6 holds; its proof follows the same lines as in the proof of the original proposition. proposition 9 let β > 1 be a pisot number of degree d. then φ−(z[−β]) is dense in the space rd−1. 6 examples of tilings in the rest of the paper we provide several examples of tilings associated with β cubic pisot units, i.e., the minimal polynomial of β is of the form x3 − ax2 − bx ± 1. every time all the tiles tw with w of length 0,1,2 are plotted. so far no properties of tilings in the negative case similar to those in theorem 7 and theorem 8 have been proved. however, the following examples demonstrate that it is reasonable to anticipate that most of the properties remain valid. on the other hand, one can also observe that for a fixed β when we change the β-numeration into the (−β)-numeration (either ito-sadahiro or balanced) the shape and form of the tiles can be either preserved or changed slightly or completely. 6.1 minimal polynomial x3 − x2 − 1 the tilings associated to −β are trivial in this case. indeed, d−β(lβ)= 1001ω and db,−β ( − 1 2 ) = ( 10(−1)(−1) (−1)(−1)(−1)010(−1)011 )ω , hence z−β = zb,−β = {0} (cf. [3, 4]). 6.2 minimal polynomial x3 − 2x2 − 2x − 1 this β is an example of a base for which the three considered tilings almost do not change. we have dβ(1)=211 , d−β(lβ)=201 ω , db,−β ( − 1 2 ) =(1(−1)1)ω . all three sets zβ, z−β and zb,−β have the same set of three possible distances between consecutive elements, namely {1, β − 2, β2 − 2β − 2}. the codings of the distances in these sets are generated by substitutions which are pairwise conjugated. 
6 Examples of tilings

In the rest of the paper we provide several examples of tilings associated with cubic Pisot units $\beta$, i.e., the minimal polynomial of $\beta$ is of the form $x^3 - ax^2 - bx \pm 1$. In every case all the tiles $\mathcal{T}_w$ with $w$ of length $0, 1, 2$ are plotted. So far no properties of tilings in the negative case similar to those in Theorem 7 and Theorem 8 have been proved. However, the following examples demonstrate that it is reasonable to anticipate that most of the properties remain valid. On the other hand, one can also observe that for a fixed $\beta$, when the $\beta$-numeration is changed into the $(-\beta)$-numeration (either Ito-Sadahiro or balanced), the shape and form of the tiles can be preserved, changed slightly, or changed completely.

6.1 Minimal polynomial $x^3 - x^2 - 1$

The tilings associated with $-\beta$ are trivial in this case. Indeed, $d_{-\beta}(l_\beta) = (1001)^\omega$ and
$$d_{b,-\beta}\Big(-\frac12\Big) = \big(1\,0\,(-1)\,(-1)\,(-1)\,(-1)\,(-1)\,0\,1\,0\,(-1)\,0\,1\,1\big)^\omega\,,$$
hence $\mathbb{Z}_{-\beta} = \mathbb{Z}_{b,-\beta} = \{0\}$ (cf. [3, 4]).

6.2 Minimal polynomial $x^3 - 2x^2 - 2x - 1$

This $\beta$ is an example of a base for which the three considered tilings almost do not change. We have
$$d_\beta(1) = 211\,, \qquad d_{-\beta}(l_\beta) = 2\,0\,1^\omega\,, \qquad d_{b,-\beta}\Big(-\frac12\Big) = \big(1\,(-1)\,1\big)^\omega\,.$$
All three sets $\mathbb{Z}_\beta$, $\mathbb{Z}_{-\beta}$ and $\mathbb{Z}_{b,-\beta}$ have the same set of three possible distances between consecutive elements, namely $\{1, \beta - 2, \beta^2 - 2\beta - 2\}$. The codings of the distances in these sets are generated by substitutions which are pairwise conjugated. Recall that substitutions $\varphi$ and $\psi$ over an alphabet $\mathcal{A}$ are said to be conjugated if there exists a word $w \in \mathcal{A}^*$ such that $\varphi(a) = w\,\psi(a)\,w^{-1}$ for all $a \in \mathcal{A}$. The tilings are composed of the same tiles (up to rotation); see Figure 1.

Fig. 1: Minimal polynomial $x^3 - 2x^2 - 2x - 1$ (Rényi, Ito-Sadahiro and balanced cases)

6.3 Minimal polynomial $x^3 - 3x^2 + x - 1$

In this case
$$d_\beta(1) = 2201\,, \qquad d_{-\beta}(l_\beta) = (201)^\omega\,, \qquad d_{b,-\beta}\Big(-\frac12\Big) = \big(1\,(-1)\,0\,0\big)^\omega\,,$$
and again all three sets of integers have the same possible distances between consecutive elements, $\delta_i \in \{1, \beta - 2, \beta^2 - 2\beta - 2, \beta^2 - 3\beta + 1\}$. However, in this case the associated substitutions are not conjugated (the condition fails on exactly one of the four letters), and even though the tilings do look similar, they are composed of different tiles; see Figure 2.

Fig. 2: Minimal polynomial $x^3 - 3x^2 + x - 1$ (Rényi, Ito-Sadahiro and balanced cases)

6.4 Minimal polynomial $x^3 - 2x^2 - 1$

This $\beta$ is an example of a base for which two of the tilings (and the corresponding properties of the sets of integers) are very similar, but the third tiling differs substantially. We have
$$d_\beta(1) = 201\,, \qquad d_{-\beta}(l_\beta) = (2101)^\omega\,, \qquad d_{b,-\beta}\Big(-\frac12\Big) = (101)^\omega\,.$$
The sets $\mathbb{Z}_\beta$ and $\mathbb{Z}_{b,-\beta}$ have the same set of distances, $\{1, \beta - 2, \beta^2 - 2\beta\}$; however, the associated substitutions are not conjugated. On the other hand, there are five distances between consecutive elements of the set $\mathbb{Z}_{-\beta}$, namely $\{1, \beta^2 - \beta - 1, \beta - 1, \beta, \beta^2 - \beta\}$. The forms of the tilings comply: the tiling in the Rényi case and the tiling in the balanced case are somewhat similar, but the tiling in the Ito-Sadahiro case is completely different; see Figure 3.

Fig. 3: Minimal polynomial $x^3 - 2x^2 - 1$ (Rényi, Ito-Sadahiro and balanced cases)
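The distance sets quoted in these examples can be explored numerically: a finite word padded with zeros is admissible precisely when the expansion algorithm applied to the number it represents reproduces its digits, so $(-\beta)$-integers can be enumerated by brute force. The sketch below is our own illustration (names are ours; floating-point rounding near the end-points and edge effects at the extremes of the enumerated range are ignored).

```python
import itertools, math
import numpy as np

def isad_digits(x, beta, n):
    """Ito-Sadahiro digits of x in [l_beta, r_beta)."""
    l = -beta / (beta + 1.0)
    out = []
    for _ in range(n):
        d = math.floor(-beta * x - l)
        x = -beta * x - d
        out.append(d)
    return out

def neg_beta_integers(beta, k):
    """Values of all words a_{k-1}...a_0 with a_{k-1}...a_0 0^w admissible."""
    l, r = -beta / (beta + 1.0), 1.0 / (beta + 1.0)
    zs = set()
    for word in itertools.product(range(int(beta) + 1), repeat=k):
        x = sum(a * (-beta) ** (k - 1 - i) for i, a in enumerate(word))
        y = x / (-beta) ** k                  # shift behind the radix point
        if not (l <= y < r):
            continue
        # admissible iff the expansion of y is exactly the word, then zeros
        if isad_digits(y, beta, k + 8) == list(word) + [0] * 8:
            zs.add(round(x, 9))
    return sorted(zs)

beta = max(np.roots([1, -2, -2, -1]), key=abs).real   # ~2.8312, Section 6.2
zs = neg_beta_integers(beta, 5)
print(sorted({round(b - a, 6) for a, b in zip(zs, zs[1:])}))
# expected distances ~ {beta^2 - 2*beta - 2, beta - 2, 1}
```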
6.5 Minimal polynomial $x^3 - 3x^2 + 2x - 1$

The last example demonstrates that the tiling can change fundamentally when different numeration systems are considered for a fixed $\beta$. In this case
$$d_\beta(1) = 2\,0\,1^\omega\,, \qquad d_{-\beta}(l_\beta) = (211)^\omega\,, \qquad d_{b,-\beta}\Big(-\frac12\Big) = \big(1\,0\,1\,0\,(-1)\,(-1)\,(-1)\,(-1)\,0\,(-1)\,0\,1\,1\,1\big)^\omega\,.$$
There are three distances between consecutive elements in the set $\mathbb{Z}_\beta$, four in the set $\mathbb{Z}_{-\beta}$, and seven in the set $\mathbb{Z}_{b,-\beta}$. The tilings are completely different; see Figure 4.

Fig. 4: Minimal polynomial $x^3 - 3x^2 + 2x - 1$ (Rényi, Ito-Sadahiro and balanced cases)

7 Conclusion

Due to the similar nature of $\beta$-numeration and $(-\beta)$-numeration, the transfer of the construction of the tiling of a space due to Thurston into the framework of $(-\beta)$-numeration is quite straightforward. In this paper we have provided several examples of these tilings (for both the Ito-Sadahiro definition and the balanced definition of the $(-\beta)$-transformation). Although the shape and form of the tiling can change dramatically when one changes (for a fixed $\beta$) the $\beta$-numeration into the $(-\beta)$-numeration, the examples demonstrate that the validity of most of the properties derived by Akiyama and Praggastis in the positive case should be preserved. It remains an open question to provide proofs of such properties.

Acknowledgement

We acknowledge financial support from Czech Science Foundation grant 201/09/0584 and from grants MSM 6840770039 and LC06002 of the Ministry of Education, Youth, and Sports of the Czech Republic.

References

[1] Akiyama, S.: Self affine tiling and Pisot numeration system. In Number Theory and its Applications (Kyoto, 1997), K. Győry and S. Kanemitsu (eds.), Vol. 2 of Dev. Math., Kluwer Acad. Publ., 1999, 7–17.
[2] Akiyama, S.: On the boundary of self affine tilings generated by Pisot numbers. J. Math. Soc. Japan 54, 2002, 283–308.
[3] Ambrož, P., Dombek, D., Masáková, Z., Pelantová, E.: Numbers with integer expansions in the numeration system with negative base. Submitted to Acta Arithmetica.
[4] Dombek, D.: Beta-numeration systems with negative base. Master's thesis, Czech Technical University in Prague, 2010.
[5] Fabre, S.: Substitutions et β-systèmes de numération. Theoret. Comput. Sci. 137, 1995, 219–236.
[6] Ito, S., Sadahiro, T.: Beta-expansions with negative bases. Integers 9, 2009, A22, 239–259.
[7] Parry, W.: On the β-expansions of real numbers. Acta Math. Acad. Sci. Hungar. 11, 1960, 401–416.
[8] Praggastis, B.: Markov partitions for hyperbolic toral automorphisms. PhD thesis, University of Washington, 1994.
[9] Rauzy, G.: Nombres algébriques et substitutions. Bull. Soc. Math. France 110, 1982, 147–178.
[10] Rényi, A.: Representations for real numbers and their ergodic properties. Acta Math. Acad. Sci. Hungar. 8, 1957, 477–493.
[11] Thurston, W. P.: Groups, tilings, and finite state automata. AMS Colloquium Lecture Notes, 1989.

Ing. Petr Ambrož, Ph.D.
E-mail: petr.ambroz@fjfi.cvut.cz
Department of Mathematics, FNSPE, Czech Technical University in Prague
Trojanova 13, 120 00 Praha 2, Czech Republic

More on PT-symmetry in (generalized) effect algebras and partial groups

J. Paseka, J. Janda

Abstract
We continue in the direction of our paper on PT-symmetry in (generalized) effect algebras and partial groups. Namely, we extend our considerations to the setting of weakly ordered partial groups. In this setting, any operator weakly ordered partial group is a pasting of its partially ordered commutative subgroups of linear operators with a fixed dense domain over bounded operators. Moreover, applications of our approach to generalized effect algebras are mentioned.

Keywords: (generalized) effect algebra, partially ordered commutative group, weakly ordered partial group, Hilbert space, (unbounded) linear operators, PT-symmetry, pseudo-Hermitian quantum mechanics.

1 Introduction

It is a well known fact that unbounded linear operators play the role of the observables in the mathematical formulation of quantum mechanics. Examples of such observables, corresponding to the momentum and position observables respectively, are the following self-adjoint unbounded linear operators on the Hilbert space $L^2(\mathbb{R})$:

(i) the differential operator $A$ defined by $(Af)(x) = i\frac{d}{dx}f(x)$, where $i$ is the imaginary unit and $f$ is a differentiable function with compact support; then $D(A) \neq L^2(\mathbb{R})$, since otherwise the derivative need not exist;

(ii) $(Bf)(x) = xf(x)$, multiplication by $x$; again $D(B) \neq L^2(\mathbb{R})$, since otherwise $xf(x)$ need not be square integrable.

Note that in both cases the possible domains are dense sub-spaces of $L^2(\mathbb{R})$, i.e., $\overline{D(A)} = \overline{D(B)} = L^2(\mathbb{R})$. The same is true in general, since for an unbounded linear operator there is no standard way to extend it to the whole space $\mathcal{H}$: by the Hellinger-Toeplitz theorem, every symmetric operator $A$ with $D(A) = \mathcal{H}$ is bounded.

An important attempt at an alternative formulation of quantum mechanics started in the seminal paper [1] by Bender and Boettcher in 1998. Bender and others adopted all the axioms of quantum mechanics except the axiom that restricted the Hamiltonian to be Hermitian. They replaced this condition with the requirement that the Hamiltonian must have an exact PT-symmetry. Later, A.
Mostafazadeh [6] showed that PT-symmetric quantum mechanics is an example of a more general class of theories, called pseudo-Hermitian quantum mechanics.

In [4] Foulis and Bennett introduced the notion of effect algebras, which generalize the algebraic structure of the set $\mathcal{E}(\mathcal{H})$ of Hilbert space effects. In this case the set $\mathcal{E}(\mathcal{H})$ of effects is the set of all self-adjoint operators $A$ on a Hilbert space $\mathcal{H}$ between the null operator $0$ and the identity operator $1$, endowed with the partial operation $+$ defined iff $A + B$ is in $\mathcal{E}(\mathcal{H})$, where $+$ is the usual operator sum. Recently, M. Polakovič and Z. Riečanová [8] established new examples of generalized effect algebras of positive operators on a Hilbert space. In [7] we showed how the standard effect algebra $\mathcal{E}(\mathcal{H})$ and the latter generalized effect algebras of positive operators are related to the type of attempt mentioned above. As a by-product, we placed some of the results from [8] under the common roof of partially ordered commutative groups.

The aim of the present note is to continue in this direction. The paper is organized as follows. In Section 1 we recall the basic notions concerning the theory of (generalized) effect algebras and partially ordered commutative groups. In Section 2 we show that the linear operators on $\mathcal{H}$ and the symmetric linear operators on $\mathcal{H}$ are equipped with the structure of a weakly ordered commutative partial group. In Section 3 we establish that each of these operator structures is a pasting of partially ordered commutative groups of the respective operators with a fixed dense domain. In the last section we show that our results from [7], concerning the application of renormalization due to the PT-symmetry of an operator, remain true for weakly ordered commutative partial groups.

2 Basic definitions and some known facts

The basic reference for the present text is the classic book by A. Dvurečenskij and S. Pulmannová [3], where the interested reader can find unexplained terms and notation concerning the subject. We now review some terminology concerning (generalized) effect algebras and weakly ordered partial commutative groups.

Definition 1 ([4]). A partial algebra $(E; +, 0, 1)$ is called an effect algebra if $0, 1$ are two distinct elements and $+$ is a partially defined binary operation on $E$ which satisfies the following conditions for any $x, y, z \in E$:
(Ei) $x + y = y + x$ if $x + y$ is defined,
(Eii) $(x + y) + z = x + (y + z)$ if one side is defined,
(Eiii) for every $x \in E$ there exists a unique $y \in E$ such that $x + y = 1$ (we put $x' = y$),
(Eiv) if $1 + x$ is defined then $x = 0$.

Definition 2 ([5]). A partial algebra $(E; +, 0)$ is called a generalized effect algebra if $0 \in E$ is a distinguished element and $+$ is a partially defined binary operation on $E$ which satisfies the following conditions for any $x, y, z \in E$:
(GEi) $x + y = y + x$, if one side is defined,
(GEii) $(x + y) + z = x + (y + z)$, if one side is defined,
(GEiii) $x + 0 = x$,
(GEiv) $x + y = x + z$ implies $y = z$ (cancellation law),
(GEv) $x + y = 0$ implies $x = y = 0$.

In every generalized effect algebra $E$ a partial binary operation $\ominus$ and a relation $\leq$ can be defined by
(ED) $x \leq y$ and $y \ominus x = z$ iff $x + z$ is defined and $x + z = y$.
Then $\leq$ is a partial order on $E$ under which $0$ is the least element of $E$. Note that every effect algebra satisfies the axioms of a generalized effect algebra, and a generalized effect algebra with a greatest element is an effect algebra.
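A finite-dimensional toy version of $\mathcal{E}(\mathcal{H})$ makes the partial operation concrete. In the sketch below, which is our own illustration and not from the paper, effects are Hermitian matrices $A$ with $0 \leq A \leq I$, and $A + B$ is defined exactly when the sum is again an effect; the helper names are ours.

```python
import numpy as np

def is_effect(a, tol=1e-9):
    """Hermitian with spectrum in [0, 1], i.e. 0 <= A <= I."""
    if not np.allclose(a, a.conj().T, atol=tol):
        return False
    ev = np.linalg.eigvalsh(a)
    return ev.min() >= -tol and ev.max() <= 1 + tol

def effect_sum(a, b):
    """Partial operation of E(H): defined iff A + B is again an effect."""
    s = a + b
    return s if is_effect(s) else None       # None means "undefined"

rng = np.random.default_rng(0)
m = rng.standard_normal((3, 3))
a = m @ m.T                                  # positive semi-definite
a /= 2 * np.linalg.eigvalsh(a).max()         # scale so that A <= I/2
one = np.eye(3)

assert effect_sum(a, one - a) is not None            # x + x' = 1  (Eiii)
assert effect_sum(one, a) is None                    # 1 + x defined only for x = 0  (Eiv)
s = effect_sum(a, a)
assert s is not None and np.allclose(s, 2 * a)       # the sum, when defined, is the usual one
```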
definition 3 a partial algebra (g; +, 0) is called a commutative partial group if 0 ∈ e is a distinguished element and + is a partially defined binary operation on e which satisfy the following conditions for any x, y, z ∈ e: (gi) x + y = y + x if x + y is defined, (gii) (x + y) + z = x + (y + z) if both sides are defined, (giii) x + 0 is defined and x + 0 = x, (giv) for every x ∈ e there exists a unique y ∈ e such that x + y = 0 (we put −x = y), (gv) x + y = x + z implies y = z. we will put ⊥g = {(x, y) ∈ g × g | x + y is defined}. a commutative partial group (g; +, 0) is called weakly ordered (shortly a wop-group) with respect to a reflexive and antisymmetric relation ≤ on g if ≤ is compatible w.r.t. partial addition, i.e., for all x, y, z ∈ g, x ≤ y and both x + z and y + z are defined implies x+z ≤ y +z. we will denote by p os(g) the set {x ∈ g | x ≥ 0}. recall that wop-groups equipped with a total operation + such that ≤ is an order are exactly partially ordered commutative groups. throughout the paper we assume that h is an infinite-dimensional complex hilbert space, i.e., a linear space with inner product 〈· , ·〉 which is complete in the induced metric. recall that here for any x, y ∈ h we have 〈x, y〉 ∈ c (the set of complex numbers) such that 〈x, αy + βz〉 = α〈x, y〉+ β〈x, z〉 for all α, β ∈ c and x, y, z ∈ h. moreover, 〈x, y〉 = 〈y, x〉 and finally 〈x, x〉 ≥ 0 at which 〈x, x〉 = 0 iff x = 0. the term dimension of h in the following always means the hilbertian dimension defined as the cardinality of any orthonormal basis of h (see [2]). moreover, we will assume that all considered linear operators a (i.e., linear maps a : d(a) → h) have a domain d(a) a linear subspace dense in h with respect to the metric topology induced by the inner product, so d(a) = h (we say that a is densely defined ). we denote by d the set of all dense linear subspaces of h. moreover, by positive linear operators a, (denoted by a ≥ 0) it means that 〈ax, x〉 ≥ 0 for all x ∈ d(a), therefore operators a are also symmetric, i.e., 〈y, ax〉 = 〈ay, x〉 for all x, y ∈ d(a) (for more details see [2]). to every linear operator a : d(a) → h with d(a) = h there exists the adjoint operator a∗ of a such that d(a∗) = {y ∈ h | there exists y∗ ∈ h such that (y∗, x) = (y, ax) for every x ∈ d(a)} and a∗y = y∗ for every y ∈ d(a∗). if a∗ = a then a is called self-adjoint. recall that a : d(a) → h is called a bounded operator if there exists a real constant c ≥ 0 such that ‖ax‖ ≤ c‖x‖ for all x ∈ d(a) and hence a is an unbounded operator if to every c ∈ r, c ≥ 0 there exists xc ∈ d(a) with ‖axc‖ > c‖xc‖. the set of all bounded operators on h is denoted by b(h). for every bounded operator a : d(a) → h densely defined on d(a) = d ⊂ h exists a unique extension b such as d(b) = h and ax = bx for every x ∈ d(a). we will denote this extension b = ab (for more details see [2]). bounded and symmetric operators are called hermitian operators. we also write, for linear operators a : d(a) → h and b : d(b) → h, a ⊂ b iff d(a) ⊆ d(b) and ax = bx for every x ∈ d(a). 66 acta polytechnica vol. 51 no. 4/2011 3 operator wop-groups as a pasting of operator sub-groups equipped with the usual sum of operators definition 4 let h be an infinite-dimensional complex hilbert space. let us define the following set of linear operators densely defined in h: gr(h) = {a : d(a) → h | d(a) = h and d(a) = h if a is bounded}. theorem 1 let h be an infinite-dimensional complex hilbert space. 
let ⊕d be a partial operation on gr(h) defined for a, b ∈ gr(h) by a ⊕d b = ⎧⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎨ ⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎩ a + b (the usual sum) if a + b is unbounded and (d(a)= d(b) or one out of a, b is bounded) , (a + b)b if a + b is bounded and d(a)= d(b), undefined otherwise and ≤ be a relation on gr(h) defined for a, b ∈ gr(h) by a ≤ b iff there is a positive linear operator c ∈ gr(h) such that b = a ⊕d c. then gr(h) = (gr(h); ⊕d, 0) is a wop-group with respect to ≤. proof. let a, b, c ∈ gr(h). then (gi) is valid since a ⊕d b is defined iff b ⊕d a is defined and because the usual sum is commutative we get that a ⊕d b = b ⊕d a. moreover, (giii) is valid since a ⊕d 0 is always defined and a ⊕d 0 = a. clearly, (giv) follows from the fact that for every a ∈ gr(h) there exists a unique b ∈ gr(h) such that a ⊕d b = 0 (namely we put −a = b and evidently a + b = 0/d(a) yields 0 b /d(a) = 0). it remains to check (gii) and (gv). this will be proved by cases. assume that (a ⊕d b) ⊕d c is defined and a ⊕d (b ⊕d c) is defined. first, let (a⊕db)⊕dc be of the form (a+b)+c, hence (a + b) + c is unbounded. then d(a + b) = d(a)∩d(b) ∈ {d(a), d(b)} and d((a+b)+c) = d(a + b) ∩ d(c) ∈ {d(a), d(b), d(c)}. assume for the moment that d((a + b) + c) = d(a) �= h (the other cases follow by a symmetric argument). we have the following possibilities: (α1): d(a) = d(b), hence a + b is unbounded and d((a + b) + c) = d(a + b) = d(a) = d(b). then either c is unbounded and d(a + b) = d(c) or c is bounded and d(a + b) ⊂ d(c). in both cases we have that d(b + c) = d(b) = d(a). but this yields that (a + b) + c = a + (b + c) = a ⊕d (b ⊕d c). (β1): d(a) �= d(b), hence a + b is unbounded, d(a + b) = d(a) and b is bounded. as in (α1), either c is unbounded and d(a + b) = d(c) or c is bounded and d(a+b) ⊂ d(c), d(b) = d(c) = h. in both cases we have that d(b + c) = d(c) ⊇ d(a). hence again (a + b) + c = a + (b + c) = a ⊕d (b ⊕d c). similarly, let (a ⊕d b) ⊕d c be of the form ((a + b) + c)b, hence (a + b) + c is bounded. because (a + b) is unbounded, c is also unbounded. then d(a + b) = d(a) ∩ d(b) ∈ {d(a), d(b)} and d(a + b) = d(c). assume that d(a) �= h (the other case where d(b) �= h is symmetric). we will distinguish the following cases: (α2): d(a) = d(b), then d(a + b) = d(a) = d(b) and because d(c) = d(a + b) we have d(a) = d(b) = d(c). so ((a + b) + c)b = (a + (b + c))b = a ⊕d (b ⊕d c). (β2): d(a) �= d(b), then b is bounded and d(b) = h, hence d(a) = d(a + b) = d(c). then d(b + c) = d(c) = d(a), which yields that ((a + b) + c)b = (a + (b + c))b = a ⊕d (b ⊕d c). let (a⊕db)⊕dc be of the form ((a+b)b+c)b, hence (a + b)b + c is bounded a + b is bounded and then c is also bounded. we will verify each of the following cases: (α3): d(a) = d(b) �= h, then d(b + c) = d(b) = d(a). and then ((a + b)b + c)b = (a + (b + c))b = a ⊕d (b ⊕d c). (β3): d(a) = d(b) = h, therefore d(b + c) = h = d(a) and ((a+b)b+c)b = (a+(b+c)b)b = a ⊕d (b ⊕d c). and in the last case, let (a ⊕d b) ⊕d c be of the form ((a + b)b + c). that is, (a + b) is bounded and c is unbounded. then we prove: (α4): d(a) = d(b) = h i.e. a and b are bounded. then d((a + b)b + c) = d(c) and d(c) = d(b + c) = d(a + (b + c)). hence ((a + b)b + c) = (a + (b + c)) = a ⊕d (b ⊕d c). (β4): d(a) = d(b) �= h, i.e. a and b are unbounded. then if d(b) �= d(c), ((a⊕d b)⊕d c) = ((a + b)b + c) is defined and d((a + b)b + c) = d(c), but (b⊕dc) is not defined, so (a⊕d(b⊕dc)) is not defined. in the case that d(b) = d(c) we have d(b) = d(c) = d(a), so ((a + b)b + c) = (a + (b + c)) = a ⊕d (b ⊕d c). 
Now, assume that $A \oplus_D B = A \oplus_D C$. First, assume that $A$ is bounded. Then $D(B) = D(C) \subseteq D(A)$, hence $A + B = A + C$. This yields by (Gii) that $C = -A + (A + C) = -A + (A + B) = B$. Now, assume that $A$ is unbounded. If $B$ is unbounded, then $D(B) = D(A)$ and we distinguish the following cases:

(γ1): $A + B$ is bounded, and hence also $A + C$ is bounded. It follows that $C$ is unbounded and hence $D(C) = D(A)$; therefore also $A + B = A + C$.

(δ1): $A + B$ is unbounded, and hence also $A + C$ is unbounded. We get that $D(C) = D(A)$, i.e., $A + B = A + C$.

In both cases we obtain from (Gii), as above, that $C = B$. If $B$ is bounded, then $A + B$ is unbounded. This implies that $A + C$ is unbounded as well. Therefore $D(A + B) = D(A) \subseteq D(B)$ and $D(A) \subseteq D(C)$. We then have $A + B = A + C$. It follows again by (Gii) that $C/_{D(A)}$ is bounded and $B/_{D(A)} = C/_{D(A)}$; therefore $B = C$.

Let us check that $\leq$ is reflexive and antisymmetric. Let $A, B \in Gr(\mathcal{H})$. Evidently $A \leq A$, since $A = A \oplus_D 0$ and $0$ is a positive bounded linear operator on $\mathcal{H}$. Now, assume that $B = A \oplus_D C_1$ and $A = B \oplus_D C_2$ for some positive linear operators $C_1, C_2$ on $\mathcal{H}$. By cases we have that ($A + C_1 = B$ with $B$ unbounded, or $(A + C_1)^b = B$ with $B$ bounded) and ($B + C_2 = A$ with $A$ unbounded, or $(B + C_2)^b = A$ with $A$ bounded).

Assume first that $A + C_1 = B$ with $B$ unbounded and $B + C_2 = A$ with $A$ unbounded. The possibilities for $C_1$ and $C_2$ are listed in Table 1.

Table 1:

  D(C1)     D(C2)             C1 ⊕_D C2                             D(C1 ⊕_D C2)
  = H       = H               (C1 + C2)^b                           = H
  = D(B)    = H               C1 + C2                               = D(C1) = D(B)
  = H       = D(A)            C1 + C2                               = D(C2) = D(A)
  = D(B)    = D(A) = D(C1)    C1 + C2, if C1 + C2 is unbounded      = D(A) = D(B) (≠ H)
                              (C1 + C2)^b, if C1 + C2 is bounded    = H

Now assume that $A + C_1 = B$ with $B$ unbounded and $(B + C_2)^b = A$ with $A$ bounded. Then $C_1$ has to be unbounded with $D(C_1) = D(B)$, and $C_2$ can likewise only be unbounded, with $D(C_2) = D(B)$. When $C_1 + C_2$ is bounded, then $C_1 \oplus_D C_2 = (C_1 + C_2)^b$ and $D(C_1 \oplus_D C_2) = \mathcal{H}$; for $C_1 + C_2$ unbounded we have $D(C_1 \oplus_D C_2) = D(B)$. The situation for $(A + C_1)^b = B$ with $B$ bounded and $B + C_2 = A$ with $A$ unbounded is symmetric to the previous case, with $D(C_1 \oplus_D C_2) = \mathcal{H}$ when $C_1 + C_2$ is bounded and $D(C_1 \oplus_D C_2) = D(A)$ when $C_1 + C_2$ is unbounded. The last case is $(A + C_1)^b = B$ with $B$ bounded and $(B + C_2)^b = A$ with $A$ bounded too; hence $C_1$ and $C_2$ are bounded as well and $C_1 \oplus_D C_2 = (C_1 + C_2)^b$ with $D(C_1 \oplus_D C_2) = \mathcal{H}$.

This yields that $A \oplus_D (C_1 \oplus_D C_2)$ is defined and $A \oplus_D (C_1 \oplus_D C_2) = (A \oplus_D C_1) \oplus_D C_2 = B \oplus_D C_2 = A$. Hence by (Giii) and (Giv) we obtain that $C_1 \oplus_D C_2 = 0$, hence $C_1 + C_2 = 0/_{D(C_1)}$. By [7, Theorem 2] we have that $C_1 = C_2 = 0$.

So, it remains to check that $\leq$ is compatible with addition, i.e., for all $A, B, C \in Gr(\mathcal{H})$ such that $A \leq B$ and such that $C \oplus_D A$ and $C \oplus_D B$ are defined, we have $C \oplus_D A \leq C \oplus_D B$. Again by cases we have that ($C \oplus_D A = C + A$ with $C + A$ unbounded, or $C \oplus_D A = (C + A)^b$ with $C + A$ bounded) and ($C \oplus_D B = C + B$ with $C + B$ unbounded, or $C \oplus_D B = (C + B)^b$ with $C + B$ bounded). Since $A \leq B$, there is a positive $E \in Gr(\mathcal{H})$ such that $A \oplus_D E = B$; but $C \oplus_D B = C \oplus_D (A \oplus_D E)$. For the case when $E$ is bounded, $(C \oplus_D A) \oplus_D E$ is clearly always defined. Now assume that $E$ is unbounded. In the case when $A$ is unbounded and $B$ is bounded, we get that $D(E) = D(A)$ and $D((E + A)^b) = \mathcal{H}$. We have the following possibilities:

(a): $C$ is bounded. Then $D(C + A) = D(A) = D(E)$.

(b): $C$ is unbounded. Then $D(C) = D(A) = D(E)$.

In the case when both $A, B$ are unbounded, we obtain that $D(E + A) = D(B) = D(A) = D(E)$. We distinguish:

(c): $C$ is bounded. Then $D(C + A) = D(A) = D(E)$.

(d): $C$ is unbounded. Then $D(C) = D(A) = D(E)$.
in the last case assume that a is bounded and b is unbounded, hence d(e + a) = d(b) = d(e). (e): c is bounded. then d(c + a) = h. (f): c is unbounded. then d(c + a) = d(c) = d(b) = d(e). hence in all cases (c ⊕d a) ⊕d e is defined. recall that we have the following result of [9]. 68 acta polytechnica vol. 51 no. 4/2011 theorem 2 [9, theorem 1] let h be an infinitedimensional complex hilbert space. let us define the following set of positive linear operators densely defined in h: v(h) = {a : d(a) → h | a ≥ 0, d(a) = h and d(a) = h if a is bounded}. let ⊕d be defined for a, b ∈ v(h) by a ⊕d b = a + b (the usual sum) iff 1. either at least one out of a, b is bounded 2. or both a, b are unbounded and d(a) = d(b). then vd(h) = (v(h); ⊕d, 0) is a generalized effect algebra such that ⊕d extends the operation ⊕. then p os(gr(h)) = v(h), hence the generalized effect algebra v(h) is the positive cone of gr(h) and, for all a, b, c ∈ v(h), (a ⊕d b) ⊕d c exists iff a ⊕d (b ⊕d c) exists. definition 5 let (g, +, 0) be a commutative partial group and let s be a subset of g such as: (si) 0 ∈ s, (sii) −x ∈ s for all x ∈ s, (siii) for every x, y ∈ s such x + y is defined also x + y ∈ s. then we call s a commutative partial subgroup of g. let g be a wop-group with respect to a partial order ≤g and let ≤s be a partial order on a commutative partial subgroup s ⊆ g. if for all x, y ∈ s holds: x ≤s y if and only if x ≤g y, we call s a wop-subgroup of g. for a commutative partial group g = (g, +, 0) and a commutative partial subgroup s, we denote +s = +/s2. we will omit an index and we will write s = (s, +, 0) instead of (s, +s , 0) where no confusion can result. lemma 1 let g = (g, +, 0) be a commutative partial group and let s be a commutative partial subgroup of g. then (s, +, 0) is a commutative partial group. let g be a wop-group and let s be a wop-subgroup of g. then s is a wop-group. proof. conditions (gi), (gii) and (gv) follow immediately from (siii). condition (giii) follows from (si) and (siii) and condition (giv) follows from (sii). assume now that g is a wop-group such that s is a wop-subgroup of g. if x, y, z ∈ s, x ≤s y and x + z, y + z are defined, then x + z, y + z ∈ s and x + z ≤g y + z hence x + z ≤s y + z. lemma 2 let g = (g, +, 0) be a commutative partial group and s1, s2 commutative partial subgroups of g. then s = s1∩s2 is also a commutative partial subgroup of g. proof. condition (si) is clear. (sii): if x ∈ s then x ∈ s1 and x ∈ s2 hence −x ∈ s1 and −x ∈ s2. therefore −x ∈ s. (siii): assume that x, y ∈ s such that x + y is defined. then x, y ∈ s1 and x, y ∈ s2. hence x+y ∈ s1 and x + y ∈ s2. this yields x + y ∈ s. definition 6 let g1 = (g1, +1, 01) and g2 = (g2, +2, 02) be commutative partial groups. a morphism is a map ϕ : g1 → g2 such that, for any x, y ∈ g1, whenever x +1 y exists then ϕ(x) +2 ϕ(y) exists, in which case ϕ(x +1 y) = ϕ(x) +2 ϕ(y). if ϕ is a bijection such that ϕ and ϕ−1 are morphisms we say that ϕ is an isomorphism of commutative partial groups and g1 and g2 are isomorphic. moreover, let ≤1 on g1 and ≤2 on g2 be partial orders such that g1 and g2 are wop-groups. let ϕ : g1 → g2 be a morphism between commutative partial groups. if for every x, y ∈ g1 : x ≤1 y implies ϕ(x) ≤2 ϕ(y), then ϕ is a morphism between wopgroups. if ϕ is a bijection, ϕ and ϕ−1 are morphisms we say that ϕ is an isomorphism of wop-groups and g1 and g2 are isomorphic as wop-groups. definition 7 let h be an infinite-dimensional complex hilbert space. 
let us define the following sets of linear operators densely defined in h: sgr(h) = {a ∈ gr(h) | a ⊂ a∗} hgr(h) = {a ∈ gr(h) | a ⊂ a∗, d(a) = h}. i.e. sgr(h) is the set of all symmetric operators and hgr(h) is the set of all hermitian operators. from the definition we can see that hgr(h) ⊆ sgr(h). it is a well known fact that every positive operator is symmetric and every positive bounded operator is both self-adjoint and hermitian (see [2]). theorem 3 let h be an infinite-dimensional complex hilbert space. let ≤s be a relation on sgr(h) defined for a, b ∈ sgr(h) by a ≤s b if and only if there exists a positive operator c ∈ sgr(h) such as a ⊕d c = b. then (sgr(h); ⊕d, 0) equipped with ≤s forms a wop-subgroup of gr(h). proof. conditions (si) and (sii) are clearly satisfied. we have to verify that sgr(h) is closed under addition. let a, b ∈ sgr(h) be bounded, then they are hermitian and it is well known that the sum of two hermitian operators is also hermitian. recall that a ⊂ a∗ iff for all x, y ∈ d(a) : 〈x, ay〉 = 〈ax, y〉. if a is bounded and b is unbounded then d(a + b) = d(b) and, for all x, y ∈ d(b), it holds 〈x, (a + b)y〉 = 〈x, ay〉 + 〈x, by〉 = 69 acta polytechnica vol. 51 no. 4/2011 〈ax, y〉 + 〈bx, y〉 = 〈(a + b)x, y〉 hence (a + b) ⊂ (a + b)∗. a similar argument holds for a unbounded, b unbounded and a + b unbounded, where d(a + b) = d(a) = d(b). if a and b are unbounded and a + b is bounded, then, for all x, y ∈ d(a) = d(b), we have 〈x, (a + b)y〉 = 〈x, ay〉 + 〈x, by〉 = 〈ax, y〉 + 〈bx, y〉 = 〈(a + b)x, y〉. therefore (a + b) ⊂ (a + b)∗. from (a + b) ⊂ (a + b)b we get ((a + b)b)∗ ⊂ (a + b)∗ and then h = d((a + b)b∗) = d((a + b)∗). hence (a + b)∗ = ((a + b)b)∗. because (a + b)∗ is symmetric one obtains that (a + b)∗ = ((a + b)b)∗ = (a + b)b. hence (a + b)b = (a ⊕d b) ∈ sgr(h). now let a ⊕d c = b where a, b ∈ sgr(h) and c ∈ gr(h), c positive. since c is a positive operator we get that c ∈ sgr(h). therefore ≤s =≤/sgr(h)2. 4 operator weakly ordered partial groups as a pasting of operator sub-groups equipped with the usual sum of operators let us recall the following theorem from [7] that was our basic motivation for investigating the set of linear operators on a hilbert space. theorem 4 let h be an infinite-dimensional complex hilbert space and let d ∈ d. let lind(h) = {a : d → h | a is a linear operator defined on d}. then (lind(h); +, ≤, 0) is a partially ordered commutative group where 0 is the null operator, + is the usual sum of operators defined on d and ≤ is defined for all a, b ∈ lind(h) by a ≤ b iff b − a is positive. definition 8 let h be an infinite-dimensional complex hilbert space and let d ∈ d. let grd (h) = {a ∈ gr(h) | d(a) = d or a is bounded}. sgrd (h) = {a ∈ sgr(h) | d(a) = d or a is bounded}. now, we are going to show that the set grd (h) equipped with the prescription ⊕d = ⊕d/(grd(h))2 and the relation ≤d=≤/(grd(h))2 is a partially ordered commutative group isomorphic to lind(h). theorem 5 let h be an infinite-dimensional complex hilbert space and let d ∈ d. then grd (h) = (grd (h); ⊕d, 0) with respect to ≤d is a wopsubgroup of gr(h) such that the induced operation ⊕d is total. moreover, grd (h) is isomorphic to lind(h) and hence a partially ordered commutative group. proof. conditions (si) and (sii) are clearly satisfied. let us check condition (siii). for a, b ∈ grd (h), first assume that d(a) = d(b) ∈ {d, h}. then d(a) = d(b) = d(a + b) ∈ {d, h}, hence a ⊕d b exists in grd (h). on the other hand, let d(a) �= d(b). 
then either d(a) ⊂ d(b) = h, in which case d(a) = d(a + b) = d or d(b) ⊂ d(a) = h with d(b) = d(a + b) = d hence a ⊕d b also exists in grd (h). hence for all a, b ∈ grd (h) we have that a ⊕d b ∈ grd (h) i.e. ⊕d is a total operation on grd (h). now let a ⊕d b = c where a, c ∈ grd (h) and b ∈ gr(h), b positive. since a ⊕d b is defined and b is positive we have that b ∈ grd (h) ∩ v(h). we can define a map ϕ : lind → grd (h) where: ϕ(a) = { a if a is unbounded, ab if a is bounded. for a ∈ lind unbounded it holds d(a) = d(ϕ(a)) = d hence ϕ(a) ∈ grd (h). for a bounded we have d(ϕ(a)) = d(ab) = h hence a ∈ grd (h). we can define ψ : grd (h) → lind as ψ(b) = b for b unbounded and ψ(b) = b/d for b bounded. then clearly ψ ◦ ϕ = idlind and ϕ ◦ ψ = idgrd(h) hence ϕ is a bijection. it is evident that ϕ(0) = 0 and + on lind is total. for a, b ∈ lind, let us assume that: (a): a, b be bounded. then ϕ(a + b) = (a + b)b = ab + bb = ϕ(a) ⊕d ϕ(b). (b): a be bounded, b be unbounded. then d(a + b) = d(b) = d and ϕ(a + b) = a + b = ab + b = ϕ(a) ⊕d ϕ(b). (c): a be unbounded, b be unbounded, a + b be unbounded. then ϕ(a+b) = a+b = ϕ(a)⊕d ϕ(b) (d): a be unbounded, b be unbounded, a + b be bounded. then ϕ(a + b) = (a + b)b = (ϕ(a) + ϕ(b))b = ϕ(a) ⊕d ϕ(b). now, we should verify order preservation, but it is clear that ϕ and ψ preserve order. theorem 6 let h be an infinite-dimensional complex hilbert space and let d ∈ d. then sgrd (h) with the induced total operation ⊕d and the induced partial order ≤sgrd(h) is a wop-subgroup of grd (h) and hence a partially ordered commutative subgroup of grd (h). proof. sgrd (h) is a commutative subgroup of grd (h) because of lemma 2 and sgrd (h) = 70 acta polytechnica vol. 51 no. 4/2011 grd (h) ∩sgr(h). we have to check order preservation. for any a, b ∈ sgrd(h) such that a ≤grd(h) b we have that there exists positive c ∈ grd (h) such that a + c = b. since every positive operator is symmetric we have that c ∈ sgrd (h). this yields that ≤sgrd(h)= (≤grd(h))/sgrd(h)2. theorem 7 (the pasting theorem for gr(h)) let h be an infinite-dimensional complex hilbert space. then the wop-group gr(h) pastes their partially ordered commutative subgroups grd (h), d ⊆ h a dense linear subspace of h, together over b(h), i.e. grd1(h) ∩ grd2(h) = b(h) for every pair d1, d2 of dense linear subspaces of h, d1 �= d2, and gr(h) = ⋃ {grd(h) | d ∈ d}. proof. straightforward from definition, for d ∈ d, every bounded a ∈ gr(h) lies in grd (h). for any unbounded b ∈ gr(h), b ∈ grd (h) if and only if d(b) = d hence there is unique grd (h) in which b lies. hence grd1 (h) ∩ grd2 (h) = b(h) for all d1 �= d2, d1, d2 ∈ d. and because grd (h) in which b lies exists for every b ∈ gr(h), we have gr(h) = ⋃ {grd(h) | d ∈ d}. theorem 8 (the pasting theorem for sgr(h)) let h be an infinite-dimensional complex hilbert space. then the wop-group sgr(h) pastes their partially ordered commutative subgroups sgrd (h), d ∈ d, together over hgr(h), i.e., for every pair d1, d2 of dense linear subspaces of h, d1 �= d2, sgrd1 (h) ∩ sgrd2(h) = hgr(h) and sgr(h) = ⋃ {sgrd (h) | d ∈ d}. proof. let d ∈ d. since sgrd (h) = grd (h) ∩ sgr(h), with previous theorem ⋃ d∈d sgrd(h) = ⋃ d∈d (grd (h) ∩ sgr(h)) = ( ⋃ d∈d grd (h) ) ∩ sgr(h) = gr(h) ∩ sgr(h) = sgr(h). similarly we have sgrd1(h)∩sgrd2(h) = (sgr(h)∩grd1 (h))∩ (sgr(h) ∩ grd2(h)) = sgr(h) ∩ (grd1 (h) ∩ grd2 (h)) = sgr(h) ∩ b(h) = hgr(h) for all d1 �= d2, d1, d2 ∈ d. 
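The pasting of Theorem 7 can be mimicked with a toy bookkeeping model: represent an element of GR(H) by a label for its domain ("H" for bounded operators, a dense-domain name otherwise) and implement the domain rule of ⊕_D from Theorem 1. This sketch is our own illustration, and it deliberately ignores the case in which a sum of two unbounded operators happens to be bounded.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Op:
    name: str
    dom: str          # "H" for bounded operators, otherwise a dense-domain label

def oplus_domain(a, b):
    """Domain of A (+)_D B (simplified: unbounded + unbounded is assumed to
    stay unbounded), or None if the sum is undefined."""
    if a.dom == "H" or b.dom == "H":              # at least one summand bounded
        return a.dom if b.dom == "H" else b.dom
    return a.dom if a.dom == b.dom else None      # equal dense domains required

def gr_d(ops, d):
    """The piece GR_D(H): operators with domain D, plus all bounded ones."""
    return {o for o in ops if o.dom in (d, "H")}

ops = {Op("A", "D1"), Op("B", "D1"), Op("C", "D2"), Op("S", "H"), Op("T", "H")}
# the pieces intersect exactly in the bounded operators ("pasting over B(H)")
assert gr_d(ops, "D1") & gr_d(ops, "D2") == {Op("S", "H"), Op("T", "H")}
assert oplus_domain(Op("A", "D1"), Op("C", "D2")) is None   # undefined sum
assert oplus_domain(Op("A", "D1"), Op("S", "H")) == "D1"
```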
5 pt -symmetry and related effect algebras let us repeat some of the notions concerning the basics of pt -symmetry from [7]. let h be a hilbert space equipped with an inner product 〈ψ, φ〉. let ω : h → h be an invertible linear operator. then we obtain a new inner product 〈〈 −, − 〉〉 on h which will have the form: 〈〈ψ, ϕ〉〉 = 〈ωψ, ωϕ〉, ∀ψ, ϕ ∈ h. clearly, our new inner product space is complete with respect to 〈〈 −, − 〉〉. let us denote hω the corresponding hilbert space. hence ω : hω → h and ω−1 : h → hω provide a realization of the unitaryequivalence of the hilbert spaces hω and h. let us define a map (·)dω : grd (h) → grω−1(d)(hω) by aω = ω−1 ◦ a ◦ ω for a linear map a ∈ grd (h), d ∈ d. we then have proposition 1 [7, proposition 3] let h be an infinite-dimensional complex hilbert space. assume moreover that ω : h → h is an invertible linear operator and d ∈ d. then 1. (a)dω is a positive operator on hω iff a is a positive operator on h. 2. (·)dω is an isomorphism of partially ordered commutative groups. 3. (a)dω is a hermitian operator on hω iff a is a hermitian operator on h. 4. (i)dω = i. the preceding proposition immediately yields that theorem 9 let h be an infinite-dimensional complex hilbert space and ω : h → h an invertible linear operator. then the map (·)ω : gr(h) → gr(hω) defined by (a)ω := (a) d(a) ω for all a ∈ gr(h) is an isomorphism of wop-groups. proof. it follows from proposition 1 and theorem 7. corollary 1 let h be an infinite-dimensional complex hilbert space and ω : h → h an invertible linear operator. then v(h) and v(hω) are isomorphic generalized effect algebras. proof. it follows immediately from the fact that (·)ω preserves and reflects positive operators and from theorem 9. we say that an operator h : d → h defined on a dense linear subspace d of a hilbert space h is η+-pseudo-hermitian and η+ is a metric operator if η+ : h → h is a positive, hermitian, invertible, linear operator such that h∗ = η+hη −1 + (see also [6]). theorem 10 let h be an infinite-dimensional complex hilbert space, let d ⊆ h be a linear subspace dense in h and let h : d → h be a η+pseudo-hermitian operator for some metric operator η+ : h → h such that η+ = ρ2+. then 1. gr(hρ+ ) and gr(h) are mutually isomorphic wop-groups such that h ∈ gr(hρ+ ) and h is a self-adjoint operator with respect to the positivedefinite inner product 〈〈 −, − 〉〉 = 〈ρ+−, ρ+−〉 on hρ+. 71 acta polytechnica vol. 51 no. 4/2011 2. v(h) and v(hρ+) are mutually isomorphic generalized effect algebras. if moreover h is a positive operator with respect to 〈〈 −, − 〉〉 (i.e., its real spectrum will be contained in the interval [0, ∞)) then h ∈ v(hρ+). proof. it follows from the above considerations and [7, theorem 3]. 6 conclusion in this paper we have shown that a η+-pseudohermitian operator for some metric operator η+ of a quantum system described by a hilbert space h yields an isomorphism between the weakly ordered commutative partial group of linear maps on h and the weakly ordered commutative partial group of linear maps on hρ+. the same applies to the generalized effect algebras of positive operators introduced in [9]. hence, from the standpoint of (generalized) effect algebra theory the two representations of our quantum system coincide. acknowledgement the work of the author was supported by the ministry of education of the czech republic under project msm0021622409 and by grant 0964/2009 of masaryk university. the second author was supported by grant 0964/2009 of masaryk university. references [1] bender, c. 
M., Boettcher, S.: Real spectra in non-Hermitian Hamiltonians having PT symmetry, Phys. Rev. Lett. 80 (1998), 5243–5246.
[2] Blank, J., Exner, P., Havlíček, M.: Hilbert Space Operators in Quantum Physics. 2nd edn. Berlin: Springer, 2008.
[3] Dvurečenskij, A., Pulmannová, S.: New Trends in Quantum Structures. Bratislava: Kluwer Acad. Publ., Dordrecht/Ister Science, 2000.
[4] Foulis, D. J., Bennett, M. K.: Effect algebras and unsharp quantum logics, Found. Phys. 24 (1994), 1331–1352.
[5] Hedlíková, J., Pulmannová, S.: Generalized difference posets and orthoalgebras, Acta Math. Univ. Comenianae 45 (1996), 247–279.
[6] Mostafazadeh, A.: Pseudo-Hermitian representation of quantum mechanics, Int. J. Geom. Meth. Mod. Phys. 7 (2010), 1191–1306.
[7] Paseka, J.: PT-symmetry in (generalized) effect algebras, Internat. J. Theoret. Phys. 50 (2011), 1198–1205.
[8] Polakovič, M., Riečanová, Z.: Generalized effect algebras of positive operators densely defined on Hilbert spaces, Internat. J. Theoret. Phys. 50 (2011), 1167–1174.
[9] Riečanová, Z., Zajac, M., Pulmannová, S.: Effect algebras of positive operators densely defined on Hilbert spaces, Reports on Mathematical Physics (2011), accepted.

Jan Paseka
E-mail: paseka@math.muni.cz
Jiří Janda
E-mail: 98599@mail.muni.cz
Department of Mathematics and Statistics, Faculty of Science, Masaryk University
Kotlářská 2, CZ-611 37 Brno

A wideband low-pass filter for differential mode distortion

M. Brejcha

Abstract
This paper deals with a solution for a wideband low-pass filter that can be used for filtering the input currents of switching converters, which are distorted by the switching frequency of the PWM. The filter was initially proposed for a special type of AC converter, which is described in the paper. However, the solution can also be used at the inputs of active PFC converters and at the outputs of PWM converters, where there are similar problems with the switching frequency. The frequency band of the filter is given by the switching frequency of the filtered device and by the demands of the EMC standards. This makes the filter work in the frequency band from 10 kHz to 30 MHz. To ensure such a frequency band, the filter should be designed with two sections, each for a specific part of the band.

Keywords: filter, filtering, low-pass, wideband, EMC, differential mode.

1 Introduction

Each realization of a passive filter is affected by the frequency dependence of its circuit components. The value of the main parameter of a component depends on other, parasitic properties. In a coil, the main parameter is the inductance, and the parasitic parameters are the resistivity of the wire and the capacitance among the turns. The resistivity affects the quality factor of the coil, and the capacitance defines its serial resonant frequency.

Further on we deal with the common circuit parts of passive filters: coils and capacitors. The equivalent circuits of both are depicted in Figure 1. The use of coils and capacitors is limited by the resonant frequency, because when they reach this frequency their reactance changes sign. The values of the parasitic parameters are associated with the size and construction of the circuit part. It can be said that the size of capacitors and coils is related to the value of the main parameter and the required reactive power.
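The equivalent circuits of Figure 1 can be evaluated numerically to locate the resonances discussed in the next section. In the sketch below (our own illustration; the parasitic element values are representative guesses, not the measured data of the paper) a capacitor is modelled as C in series with a parasitic inductance and resistance, and a coil as L in series with a wire resistance, shunted by the winding capacitance.

```python
import numpy as np

def z_capacitor(f, c, l_par, r_esr):
    """Series R-L-C model of a real capacitor."""
    w = 2 * np.pi * f
    return r_esr + 1j * w * l_par + 1 / (1j * w * c)

def z_inductor(f, l, r_wire, c_par):
    """(R + jwL) in parallel with the turn-to-turn capacitance."""
    w = 2 * np.pi * f
    z_series = r_wire + 1j * w * l
    z_cap = 1 / (1j * w * c_par)
    return z_series * z_cap / (z_series + z_cap)

f = np.logspace(3, 8, 2000)                     # 1 kHz ... 100 MHz
zc = z_capacitor(f, 680e-9, 30e-9, 0.05)        # 680 nF with assumed parasitics
zl = z_inductor(f, 3.18e-3, 2.0, 20e-12)        # 3.18 mH with assumed parasitics
print("capacitor resonance ~ %.2f MHz" % (f[np.argmin(abs(zc))] / 1e6))
print("inductor resonance  ~ %.2f MHz" % (f[np.argmax(abs(zl))] / 1e6))
```

With these assumed parasitics the series resonance of the 680 nF capacitor lands near 1 MHz, consistent with the measurement discussed below.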
In the case of inductors, if the inductance value were increased at the same reactive power, the size of the coil would have to be bigger. As stated above, the values of the parasitic parameters (resistance and capacitance) would then increase.

Ordinary filters are usually designed to filter distortion from 150 kHz to 30 MHz; this frequency band is defined in the EMC standards. Such filters are not difficult to design, because the frequency boundaries are relatively close together: the circuit parts are relatively small, and their parasitic properties do not affect the frequency response much. A problem arises when we try to reduce the cut-off frequency of the filter, because we then have to increase the inductance and capacitance values.

A filter with a relatively low cut-off frequency had to be used for the special AC converter shown in Figure 2. This four-quadrant converter is based on the topology of a buck converter, which is widely used in DC circuits. The amplitude of the fundamental harmonic of the output voltage is proportional to the duty cycle of the switching. To obtain the fundamental harmonic function, the output filter has to be able to filter out the switching frequency (about 20 kHz). In such a case we should design the filter as two or more parts, each for a specific frequency band.

Fig. 1: Equivalent circuits for a) inductors and b) capacitors

Fig. 2: An AC converter based on a buck converter

2 Dependency of the parameters on frequency

All circuit parts used in the proposed filter were measured to obtain the dependency of the main parameter on frequency. We used 10 nF and 680 nF foil capacitors and two handmade 56 μH and 3.18 mH inductors. In all cases we plotted the absolute impedance and the phase angle characteristics, to make it easier to compare the plots with each other.

2.1 Capacitors

For both capacitors the impedance decreases with frequency until the resonant frequency is reached. This corresponds to the equation
$$Z \approx X_C = \frac{1}{2\pi f C} \qquad (\Omega;\ \mathrm{Hz},\ \mathrm{F}) \qquad (1)$$
where $f$ is the frequency and $C$ is the capacitance. After reaching the resonant frequency the impedance begins to rise, because the capacitor starts to behave as an inductor. Note that the resonant frequency of the 10 nF capacitor (Figure 3) is higher than that of the 680 nF capacitor (Figure 4). This is because of the size of the parts: the 680 nF capacitor must have larger electrodes to obtain the higher capacitance, and its parasitic inductance increases, because the electrodes also behave as wires.

Fig. 3: The impedance of the 10 nF capacitor

Fig. 4: The impedance of the 680 nF capacitor

The resonant frequency of the 680 nF capacitor is close to 1 MHz. If this capacitor were used in a filter, an unwanted shape of the frequency response would appear above this frequency. The 10 nF capacitor has a lower parasitic inductance and can be used up to 10 MHz.

2.2 Inductors

The situation is similar for the inductors. At the beginning, the impedance increases with frequency for both types of inductors. This corresponds to the equation
$$Z \approx X_L = 2\pi f L \qquad (\Omega;\ \mathrm{Hz},\ \mathrm{H}) \qquad (2)$$
where $f$ is the frequency and $L$ is the inductance. As in the previous case, after the resonant frequency is reached the parasitic parameters become dominant and the inductor behaves like a capacitor. The inductor with the higher inductance value has the lower resonant frequency.
This is easy to understand when we realize that the smaller inductor has only 10 turns on a ferrite core, while the bigger inductor has 110 turns on an iron powder core.

Note that some breaks occur in the trace beyond the resonant frequency in Figure 6. This is because the length of the coiled wire exceeded 1/4 of the wavelength of the signal. This is another parasitic phenomenon that affects the resulting frequency response of the filter. The inductor in Figure 6 certainly should not be used at frequencies higher than 500 kHz.

Fig. 5: The impedance of the 56 μH inductor

Fig. 6: The impedance of the 3.18 mH inductor

3 Realization of the filter

We now consider the realization of the filter whose tolerance scheme is shown in Figure 7. Our goal is to propose a filter with high attenuation between 20 kHz and 30 MHz. The terminal resistors are taken as 50 Ω. If we use the Butterworth approximation for the coefficients of the filter, we obtain the values of the circuit parts for the π-section shown in Table 1. These are the same inductor and capacitor values as were measured in Section 2; their frequency characteristics are shown in Figure 4 and Figure 6.

Table 1: Filter realization by the Butterworth approximation

  R1 (Ω)   C1 (nF)   L1 (mH)   C2 (nF)   R2 (Ω)
  50       680       3.18      680       50

Fig. 7: Prescribed tolerance scheme for the filter

The reader will have noticed that this realization of the filter can be used only up to approximately 1 MHz; above this frequency the attenuation of the filter begins to decrease. To solve this problem we should add another section to the filter, one that works well between 1 MHz and 30 MHz and does not affect the previous section of the filter.

3.1 The corrective section of the filter

The corrective section of the filter can be only of 2nd order, because it merely needs to maintain high attenuation at high frequencies; the high level of attenuation itself should be reached by the preceding low-frequency section. The first task is to ensure that the corrective section does not affect the previous section. This can be achieved by a progressive sequence of the sections: if the cut-off frequency of the corrective section is much higher than the cut-off frequency of the filter, it does not affect the passband. The influence of the corrective section is then negligible, and the filter behaves as if it were terminated by R1 and R2. This is because the values of the inductors and capacitors in the corrective section are then much lower than in the filter, and lower values of the main parameters also mean lower values of the parasitic parameters. For this design we propose a cut-off frequency of 100 kHz for the corrective section.

The next task is to take the impedance of the filter section into account in the design of the corrective section. We can no longer count on the terminal resistor R1, because the impedance of the filter section is not negligible. However, the resulting output impedance of the filter is very low at high frequencies, because of the large value of capacitor C2. We can therefore consider the input impedance of the corrective section to be zero and design the section with terminal resistors 0 and R2. The values of the circuit parts, using the Butterworth approximation for both sections, are shown in Table 2.

Table 2: Filter design with the corrective section

  R1 (Ω)   C1 (nF)   L1 (mH)   C2 (nF)   L2 (μH)   C3 (nF)   R2 (Ω)
  50       680       3.18      680       56        10        50

Fig. 8: Filter topology
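The frequency response of the combined π-section plus corrective section can be estimated by cascading ABCD (chain) matrices of the series and shunt elements. The sketch below is our own check of the design, using the ideal element values of Table 2 with parasitics ignored, so it will not reproduce the measured roll-off above the component resonances; the assumed ladder order (shunt C1, series L1, shunt C2, series L2, shunt C3) is our reading of the topology.

```python
import numpy as np

def series(z):
    return np.array([[1, z], [0, 1]], dtype=complex)

def shunt(y):
    return np.array([[1, 0], [y, 1]], dtype=complex)

def gain_db(f, rs=50.0, rl=50.0):
    """Insertion gain of the ladder C1 | L1 | C2 | L2 | C3 between Rs and Rl."""
    w = 2j * np.pi * f
    c1, l1, c2, l2, c3 = 680e-9, 3.18e-3, 680e-9, 56e-6, 10e-9   # Table 2
    m = np.eye(2, dtype=complex)
    for el in (shunt(w * c1), series(w * l1), shunt(w * c2),
               series(w * l2), shunt(w * c3)):
        m = m @ el
    a, b, c, d = m.ravel()
    h = rl / (a * rl + b + c * rs * rl + d * rs)    # V_load / V_source
    return 20 * np.log10(abs(h))

for f in (1e3, 20e3, 100e3, 1e6, 10e6):
    print("%8.0f Hz: %8.1f dB" % (f, gain_db(f)))   # ~ -6 dB in the passband
```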
3.2 The prototype of the filter

Using the method described above, we made a prototype of the filter, shown in Figure 9. The filter was made to attenuate the differential mode of the distortion. The circuit parts were soldered on the top layer of the PCB; the bottom layer was grounded to the shielding. To suppress possible interference among the circuit parts, the inductor L1 (3.18 mH) was enclosed in its own shielding. The input and the output of the filter were realized by BNC connectors.

Fig. 9: The prototype of the filter

The frequency response of the filter prototype was measured, and the result is shown in Figure 10. There is only low attenuation in the passband up to 5 kHz, which corresponds to the design requirements. After crossing the cut-off frequency, the trace begins to decrease. The maximum attenuation is close to 100 dB at approximately 200 kHz; the corrective section should also function at this frequency. Attenuation higher than 85 dB is then held up to 30 MHz.

Fig. 10: Frequency response of the filter prototype

4 Conclusion

We have described a method for designing a wideband low-pass filter by adding a corrective section to the filter. The resulting frequency response of the filter meets the requirements, and high attenuation is retained up to 30 MHz. However, the additional circuit parts make the filter more expensive and increase its size, and the larger dimensions of the filter can lead to further problems with parasitic effects.

Acknowledgement

The research presented in this paper was supported by the Czech Ministry of Education under grant No. MSM6840770017 (Rozvoj, spolehlivost a bezpečnost elektroenergetických systémů).

About the author

Michal Brejcha was born on 26 March 1983 in Prague. He attended the SPŠE Františka Křižíka secondary school between 1999 and 2003, where he studied in the electrotechnic section. He then studied electric power engineering at CTU in Prague, where he was awarded his bachelor degree and his master degree in electrotechnology in 2006. He is now a PhD student in the same department of the Faculty of Electrical Engineering.

Michal Brejcha
E-mail: brejcmic@fel.cvut.cz
Dept. of Electrotechnology, Czech Technical University
Technická 2, 166 27 Praha, Czech Republic
Timing analysis with INTEGRAL: comparing different reconstruction algorithms

V. Grinberg, I. Kreykenbohm, F. Fürst, J. Wilms, K. Pottschmidt, M. Cadolle Bel, J. Rodriguez, D. M. Marcu, S. Suchy, A. Markowitz, M. A. Nowak

Abstract
INTEGRAL is one of the few instruments capable of detecting X-rays above 20 keV. It is therefore in principle well suited for studying X-ray variability in this regime. Because INTEGRAL uses coded mask instruments for imaging, the reconstruction of lightcurves of X-ray sources is highly non-trivial. We present results from a comparison of two commonly employed algorithms, which primarily measure flux from mask deconvolution (ii_lc_extract) and from calculating the pixel illuminated fraction (ii_light). Both methods agree well for timescales above about 10 s, the highest time resolution for which image reconstruction is possible. For higher time resolution, ii_light produces meaningful results, although the overall variance of the lightcurves is not preserved.

Keywords: timing analysis, instrumentation and methods, lightcurves, data extraction.

1 Introduction

The INTEGRAL satellite is one of the few instruments designed for the detection of X-rays above 20 keV with a good time resolution. It offers a unique opportunity for timing studies in this regime, though the exact analysis at high time resolution remains a challenge.

In coded mask instruments like the IBIS telescope aboard INTEGRAL, the source radiation is modulated by a mask. Each source casts a shadow image (shadowgram), and the combined shadowgram is recorded in the detector plane. To obtain the original image, the detected flux distribution has to be deconvolved in an analytically and computationally non-trivial process which is highly CPU-intensive.

For the reconstruction of lightcurves, two algorithms are commonly employed: ii_lc_extract deconvolves shadowgrams for each time and energy bin for which the lightcurve is extracted; ii_light calculates the lightcurves primarily from the pixel illuminated fraction (PIF, a number between 0 and 1 expressing, for a given source, the degree of illumination of a detector pixel). Both are included in the INTEGRAL Off-line Scientific Analysis (OSA) software package. In the following we compare the two extraction mechanisms and discuss their advantages and shortcomings (Sec. 2), and then assess the suitability of ii_light for high time resolution analysis (Sec. 3). A short summary of the results and the implications for further timing analysis with INTEGRAL are given in Sec. 4.

2 Comparison between different lightcurve extraction algorithms

To reduce the influence of the selected field on the results of the lightcurve extraction, we perform the comparison on two different fields. Figure 1 shows a comparison of significance mosaics obtained during the Cyg X-1 and GRS 1915+105 INTEGRAL key programmes in the 20–40 keV energy band with the source in the fully coded field of view (FOV), i.e., up to a maximum pointing offset of 4.5°, from 15 science windows (ScWs) from revolution 628 and 26 ScWs from revolution 852, respectively. Cyg X-1 (countrate ∼ 100 cps) is significantly brighter than GRS 1915+105 (∼ 40 cps). While both fields are comparable regarding the sources taken into account for our extractions (named boxes, σ_detection ≥ 6, cps ∼ 0.5–4.5), the field of GRS 1915+105 is crowded with ∼ 35 weak sources (marked with ×, 1 ≤ σ_detection < 6), while the field of Cyg X-1 shows only ∼ 20 of them.

Fig. 1: Intensity mosaics of the fields of Cyg X-1 and GRS 1915+105 in the 20–40 keV band
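The deconvolution step described in the introduction can be illustrated with a toy balanced cross-correlation decoder: a random mask pattern is shifted according to the source position, and the sky is recovered by correlating the detector image with the mask. This sketch is our own simplified one-dimensional, noise-free illustration, not the OSA implementation; for a random mask the off-peak response is small but non-zero (coding noise).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 64
mask = rng.integers(0, 2, n).astype(float)     # random open/closed mask pattern
sky = np.zeros(n)
sky[10], sky[40] = 100.0, 40.0                 # two point sources

# each sky position shifts the mask pattern; the detector records the sum
detector = np.zeros(n)
for pos, flux in enumerate(sky):
    if flux:
        detector += flux * np.roll(mask, pos)

# balanced decoding: correlate with (mask - open fraction) to cancel flat background
g = mask - mask.mean()
recon = np.array([np.dot(detector, np.roll(g, pos)) for pos in range(n)])
recon /= np.dot(mask, g)                       # normalize peaks to ~ source flux
print(np.sort(np.argsort(recon)[-2:]), round(recon.max(), 1))   # peaks at 10, 40
```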
Figure 2 shows the correlation between the results obtained with ii_light (OSA 7 version) and ii_lc_extract (OSA 8 version) for sources in the fully coded field of view in the 18–50 keV and 20–50 keV bands for Cyg X-1 and GRS 1915+105 (same revolutions and ScWs as for the mosaic images), respectively. Note that the ii_light algorithm was not included in the OSA 8 release; we therefore used the newest version of both algorithms available at the time of writing. At a 10 s time resolution, ii_lc_extract fails to detect the respective sources in several time bins, resulting in data points with zero countrate and zero error (red circles), which are excluded from our analysis. Negative countrate values are an artifact of the background extraction and are common for X-ray lightcurves. ii_lc_extract does not allow a much higher time resolution than 10 s.

ii_light systematically underestimates the countrate. The 10 s lightcurves (cyan circles) are, however, well linearly correlated, with a best-fit slope of 1.05 ± 0.01 for Cyg X-1 and 1.15 ± 0.01 for GRS 1915+105. Fits to the individual ScW-averaged countrates (black circles) show in both cases a different linear correlation, with a lower slope and a significant offset. Given the good correlation of the non-averaged lightcurves and the fact that the averaged data points lie well on the 10 s lightcurve fits, we are inclined to attribute this to the low number of ScWs analysed; more data, covering a greater range of countrates, will shed light on this issue. The performance of ii_light on all available INTEGRAL Crab data was analysed by [2], who found that ii_light underestimates the countrates by about 5%, consistent with our results. We attribute the different ratios of the ii_light and ii_lc_extract results for Cyg X-1 and GRS 1915+105 to the differences in the fields: in a more crowded field like that of GRS 1915+105, a signal is more likely to be assigned to the wrong source.

Fig. 2: Scatterplots of the countrates obtained with ii_lc_extract and ii_light

3 High time resolution with ii_light

For the following section we use all ScWs from revolution 628 where Cyg X-1 is in the fully or partially coded FOV, i.e., science windows with a pointing offset of up to ∼ 15°. Histograms (width 1 cps) of the ii_light lightcurves for Cyg X-1 (Fig. 3) and Gaussian fits to them show that, though the scatter increases with the time resolution, the routine produces meaningful results. The centers of the fitted Gaussian components (dashed lines) are well consistent with each other. The FWHM of the Gaussians increases by a factor of ∼ 3 for one order of magnitude increase in time resolution, consistent with the decreasing S/N. The deviations from the Gaussian shape are explained by the high intrinsic variability of the source, > 25% over the 3 days of INTEGRAL revolution 628, as seen in Fig. 4. Note also that Fig. 4 supports the finding that ii_light underestimates the source flux; the different time resolutions are, however, consistent among each other and reproduce the shape of the lightcurve well.

Fig. 3: Histograms (width 1 cps) of the ii_light lightcurves of revolution 628 with 10 s (upper panel), 1 s (middle panel) and 0.1 s (lower panel) time resolution and up to ∼ 15° pointing offset angle in the 20–40 keV band of Cyg X-1, and Gaussian fits to them; the dotted lines indicate the centers of the Gaussians

Fig. 4: Fluxes from image extraction (box; deconvolution algorithm consistent with ii_lc_extract) as well as averaged countrates for individual ScWs of the ii_light lightcurves with 10 s (triangle), 1 s (x) and 0.1 s (circle) time resolution in the 20–40 keV energy band
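The histogram analysis of Fig. 3 amounts to fitting a Gaussian to the countrate distribution. A minimal version with scipy follows; it is our own illustration, using simulated countrates in place of the ISGRI lightcurves.

```python
import numpy as np
from scipy.optimize import curve_fit

def gauss(x, a, mu, sigma):
    return a * np.exp(-0.5 * ((x - mu) / sigma) ** 2)

rng = np.random.default_rng(2)
rates = rng.normal(100.0, 30.0, 20000)            # stand-in for 0.1 s bin countrates

edges = np.arange(rates.min(), rates.max(), 1.0)  # 1 cps wide bins, as in Fig. 3
hist, edges = np.histogram(rates, bins=edges)
centers = 0.5 * (edges[:-1] + edges[1:])

popt, _ = curve_fit(gauss, centers, hist,
                    p0=[hist.max(), rates.mean(), rates.std()])
print("centre = %.1f cps, FWHM = %.1f cps" % (popt[1], 2.3548 * abs(popt[2])))
```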
4: fluxes from image extraction (box; deconvolution algorithm consistent with ii lc extract) as well as averaged countrates for individual scw of the ii light lightcurves with 10s (triangle) 1s (x) and 0.1s (circle) time resolution in the 20–40kev energy band 2 4 6 8 10 12 14 0 .9 1 1 .1 offset angle [deg] ii_ lig h t/ flu x fr o m im a g e e xt ra ct io n fig. 5: the ratio between averaged countrates for individual scws of the ii light lightcurves with 10s (triangle), (x) and 0.1s (circle) time resolution in the 20–40kev energy band to the fluxes from the image extraction in dependency on the pointing offset angle of the science window 35 acta polytechnica vol. 51 no. 2/2011 10−3 0.01 0.1 1 1 0 0 5 0 2 0 0 fourier frequency l e a h y p o w e r fig. 6: power spectrum densities (psds) for cygx-1 presented here for the lightcurves with 10s (red), 1s (blue) and 0.1s (brown) time resolution in the 20–40kev energy band in the leahy normalization ing that ii light underestimates the source flux — the different time resolutions are, however, consistent among each other and reproduce the shape of the lightcurve well. comparing the ratio between averaged countrates for individual scws of the ii light lightcurves to the fluxes from the image extraction, we see no offset angle dependency, as reported by [2], cf. fig. 5. the respective means of the ratios (dotted lines) agree well and indicate an offset of ∼ 5 %, consistent with the linear correlation presented above for the fully coded fov. the power spectrum densities (psds) calculated from the above discussed ii light lightcurves are shown in fig. 6. for such psds calculated in leahy normalizazion, the poisson noise level should be equal to 2, independent of the countrate of the source. it can, however, clearly be seen here that even at as high frequencies as a few hz, the psd flattens out at a value of ∼ 80 rms2/hz. this is consistent with the findings of [1] for vela x-1, where the poisson noise contributes as much as 100 rms2/hz at a given frequency. our psds for different time resolutions agree reasonably well with each other in shape (for exact timing studies longer periods than a single revolution would be necessary to reduce the uncertainities). note that [3] also found consistent psd shapes comparing isgri and rxte-pca 15–70 kev data for cyg x-1. so while a better noise correction is required, ii light lightcurves are still well suited for timing studies with a 10s to 0.1s resolution in the regime above 20kev. 4 summary and conclusions we have shown that it is possible to perform timing studies with a resolution of up to 0.1 s with integral when using the ii light tool. although ii light (osa 7 version) systematically underestimates the countrates when compared to more exact deconvolution algorithms (which do not allow better time resolution than 10 s even for bright sources such as cyg x-1), the differences can in principle be assessed and taken into account. the correlation between the countrates is linear, with the slope apparently depending on the field under consideration. a more detailed analysis of sources in different fields will allow to better quantify this linear correlation. the countrates of the ii light lightcurves follow a gaussian distribution around the mean value. we do not see a dependency on the pointing offset angle of the observation. psds calculated from these lightcurves with different time resolutions agree well with each other, the noise does however show anomalous behaviour which has also been observed by [1]. 
references
[1] fürst, f., kreykenbohm, i., pottschmidt, k., et al.: a&a, 519, 37, 2010.
[2] kreykenbohm, i., wilms, j., kretschmar, p., et al.: a&a, 492, 511, 2008.
[3] pottschmidt, k., wilms, j., nowak, m., et al.: advances in space research, 38, 2006, p. 1350.

v. grinberg, e-mail: victoria.grinberg@sternwarte.uni-erlangen.de, remeis-observatory/ecap/fau, sternwartstr. 7, d-96049 bamberg, germany; usm/lmu, munich, germany
i. kreykenbohm, remeis-observatory/ecap/fau, bamberg, germany
f. fürst, remeis-observatory/ecap/fau, bamberg, germany
j. wilms, remeis-observatory/ecap/fau, bamberg, germany
k. pottschmidt, cresst/nasa-gsfc, greenbelt, md, usa; umbc, baltimore, md, usa
m. cadolle bel, esac, madrid, spain
j. rodriguez, dsm/dapnia/sap, cea saclay, france
d. m. marcu, cresst/nasa-gsfc, greenbelt, md, usa; umbc, baltimore, md, usa; gmu, fairfax, va, usa
s. suchy, cass/ucsd, la jolla, ca, usa
a. markowitz, cass/ucsd, la jolla, ca, usa
m. a. nowak, mit/chandra x-ray center, cambridge, ma, usa

a model of active roll vehicle suspension
i. čech

abstract
this paper describes active suspension with active roll for a four-wheel vehicle (bus) by means of an in-series pump actuator with doubled hydropneumatic springs. it also gives the full control law with no sky-craping. lateral stiffness and solid axle geometry in the mechanical model are not neglected. responses to lateral input as well as responses to statistical unevennesses show considerable improvement of passenger comfort and safety when cornering.

keywords: active suspension, roll-yaw model of a four-wheel vehicle, cross control, solid axle, hydraulic control, pump actuator.

1 introduction

a mathematical description of a model of active roll vehicle suspension is made in the steady state of harmonic input in symbolic form. the complex amplitudes and effective values are denoted by capital letters, and the instantaneous values, and also the constant values, are denoted by lower-case letters. differentiations are substituted by the operator $s = i 2\pi f$, second differentiations by $s^2$, and so on. as our model includes two more stages of differentiation than usual mechanical models, a complex symbolic form is necessary to handle the model. only linear relations are used. a complete 4-wheel vehicle is dealt with here, but no heave input is presumed. the parameters relate to a tall bus. the computed answers are compared with passive suspension. the results are given for both vertical and horizontal input, in spectral form and in impulse form. unlike other proposed solutions, e.g. [1, 2, 3], our project includes the static load control, and should be low-powered. this is because of the in-series character of the control, so that static displacement compensation takes no power. in addition, the controller uses no throttling and works against no static load. the control scheme is simple, so that there are no problems with stability. there is also the possibility of achieving active suspension using a source of force parallel to air suspension (which includes the static load displacement control). this source of force can be implemented by a linear electric motor. however, the power produced by the linear motor is small in comparison with a rotational electric motor with gearing.

2 mechanical model

dealing only with linear relations, we first presume the unevennesses decomposition diagram in fig. 1.
it shows an analysis of the vertical input (unevennesses) $z_1$–$z_4$ of the four wheels 1–4 of the model into the input $z_{ks1}$, $z_{ks2}$ of the heave-pitch model [4], which will not be dealt with here, and the input $z_{kp1}$, $z_{kp2}$ of the four-wheel roll model. this input will be marked $z_{k1}$, $z_{k2}$. indices 1, 2 relate to the front and rear axles. the mechanical schema of this roll-yaw model is shown in fig. 2.

fig. 1: analysis of vertical inputs

this figure shows a model of a suspension with a solid rear axle. we will consider the radii of gyration of the body $r_{bx}$, of the seat $r_{sx}$, and of the axle $r_{wx}$. (the mass of the axle is assumed to be concentrated in the wheels.) then there are the heights of the mass centers of the body $z_{tb}$, of the seat $z_{ts}$ and of the wheel $z_{tw}$. the track is denoted by $y_w$, the lateral distance of the seats by $y_s$, and the distance of the spring settings by $y_c$. the height of the joint of the solid axle is marked by $z_q$, and the lateral stiffness is marked by $k_y$. the lateral displacements of the wheels at stiffnesses $k_{y1}$, $k_{y2}$ are denoted by $y_{ky1}$, $y_{ky2}$. vertical antiphased displacements of the body (above the wheel), seat and wheel are denoted by $z_b$, $z_s$, and $z_{w1}$, $z_{w2}$.

fig. 2: mechanical scheme

the most important input of this model is the lateral acceleration $a_c$. we will take into account the lateral sliding of the rolling tyre. sliding acts similarly to a damper. its damping is proportional to the weight of the vehicle and reciprocally proportional to the travelling velocity and the constant of the tyre $tg\alpha$. the stiffness of a damper is its damping multiplied by the operator $s$, so

$k_{ypk} = s \cdot a_g (m_b + m_s + m_w)/(v_x\, tg\alpha)$.

the lateral stiffness $k_y$ consists of this sliding stiffness $k_{ypk}$ and the lateral stiffness of the tyre $k_{wy}$ acting in series, so

$1/k_y = 1/k_{wy} + 1/k_{ypk}$.

we will use $k_{wyre} = 0.6\,k_w$, $tg\alpha = 0.1$. the lateral input of the model is due to the lateral acceleration $a_c$ of the body. the solid axle rotates around the center of the anti-phased unevennesses c, the body rotates around the joint q and translates in the lateral direction with the lateral displacement of the joint. only small angular displacements are assumed. the lateral stiffness in front and rear, and also the wheel masses front and rear, correspond to the position of the mass centre, i.e.

$k_{y1}/k_{y2} = m_{w1}/m_{w2} = x_{w2}/x_{w1}$.

the roll model rotates around the on-axis, but its angle $\alpha = \mathrm{arctg}(z_q/x_w)$ is small and approximately taken as $\cos\alpha = 1$. nevertheless, the efficient value of the axle joint height under the mass centre of the body is

$z_{qt} = (x_{w1}/x_w)\, z_q$.

3 the equations of the mechanical model

two auxiliary constants are used to enable the equations to be used also for independent suspension, namely $t_n = 1$, $n_z = 0$ with a solid axle, and $t_n = 0$, $n_z = 1$ when no solid axle is used. (with independent suspension the inertia forces of the wheel are supported by the body.) if $y_{z1}$, $y_{z2}$ are the lateral displacements of the yaw motion, then the lateral force originating from the body mass $m_b$ is due to the acceleration of cornering, the roll movement of the body, the roll of the joint, and the lateral shift of the mass center, so

$s_{qb} = 4 m_b a_c + 4 m_b s^2 [(z_{tb} - z_{qt}) z_b + z_{qt} z_{w2}]\cdot 2/y_w - 4 m_b s^2 (x_{w2} y_{ky1} + x_{w1} y_{ky2})/x_w$   (3.1)

and the lateral force of the seat is

$s_{qs} = 4 m_s a_c + 4 m_s s^2 [(z_{ts} - z_{qt}) z_b + z_{qt} z_{w2}]\cdot 2/y_w - 4 m_s s^2 (x_{w2} y_{ky1} + x_{w1} y_{ky2})/x_w$.   (3.2)
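as a quick numeric illustration of the series combination of sliding stiffness and tyre stiffness introduced in section 2, the sketch below evaluates $k_y$ at a few frequencies using the complex operator $s = i 2\pi f$; the mass and speed values are taken from the parameter list in section 7, while $k_w$ is a hypothetical stand-in value (an assumption for demonstration only).

```python
import numpy as np

# parameters following section 7 (k_w is a hypothetical stand-in value)
a_g = 9.81                                # gravity acceleration [m/s^2]
m_b, m_s, m_w = 1000.0, 500.0, 150.0      # quarter masses [kg] (m_s = 0.5*m_b)
v_x = 60.0 / 3.6                          # travelling speed [m/s]
tg_alpha = 0.1                            # tyre slip constant
k_w = 4.0e5                               # radial tyre stiffness [N/m], assumed
k_wy = 0.6 * k_w                          # lateral tyre stiffness, k_wyre = 0.6 k_w

def k_y(f):
    """complex lateral stiffness: sliding stiffness k_ypk in series with k_wy."""
    s = 1j * 2 * np.pi * f                # differentiation operator s = i*2*pi*f
    k_ypk = s * a_g * (m_b + m_s + m_w) / (v_x * tg_alpha)
    return 1.0 / (1.0 / k_wy + 1.0 / k_ypk)

for f in (0.5, 1.0, 5.0):
    print(f"f = {f:4.1f} hz: |k_y| = {abs(k_y(f)):.3e} N/m")
```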
the vertical forces in the suspension between body and wheel, front and rear, are

$s_{bw1} = k_{b1}(z_b - z_{w1} - z_{r1}) + k_{st1}(z_b - z_{w1})$,   (3.3)

$s_{bw2} = k_{b2}[(y_c/y_w)(z_b - z_{w2}) - z_{r2}] + k_{st2}(y_c/y_w)(z_b - z_{w2})$.   (3.4)

now the equation of moments on the body around the on-axis, assembled from the moments of the weight of the displaced mass centres, the moment of the inertia force from the roll motion, the moments of the forces in the suspension, the moment from the suspension of the seat, the moments from the lateral forces, the moments of the roll motion of the wheels with independent suspension, and the moments from the lateral motion of the wheels:

$0 = -8[(z_{tb} - z_{qt})m_b + (z_{ts} - z_{qt})m_s](a_g/y_w)\, z_b + 8 s^2 [m_b r_{bx}^2 + m_s r_{sx}^2]\, z_b/y_w + y_w s_{bw1} + y_c s_{bw2} - 2 y_s k_s (z_s - y_s z_b/y_w) + (z_{tb} - z_{qt}) s_{qb} + (z_{ts} - z_{qt}) s_{qs} + 2(m_{w1} + n_z m_{w2})[2 s^2 (z_{tw}^2 + r_{wx}^2)\, z_b/y_w - 2 z_{tw} a_g z_b/y_w] + 2 z_{tw}[(m_{w1} + n_z m_{w2}) a_c - s^2 (m_{w1} y_{ky1} + n_z m_{w2} y_{ky2})]$.   (3.5)

equation of the forces vertical to the seat:

$0 = s^2 m_s z_s + k_s (z_s - (y_s/y_w) z_b)$.   (3.6)

equation of the vertical forces on the front wheels:

$0 = s^2 m_{w1} z_{w1} + k_{w1}(z_{w1} - z_{k1}) - s_{bw1}$.   (3.7)

equation of moments on the solid rear axle — from the inertia of the axle, from the unevennesses, from the suspension, from the lateral forces in the joint $s_{qb}$, $s_{qs}$, from the lateral motion of the axle, from the weight of the displaced mass centre, and from the weight of the body and seat on the displaced axle:

$0 = 2 s^2 m_{w2}(y_w^2/4 + t_n z_{tw}^2 + t_n r_{wx}^2)(2/y_w)\, z_{w2} + y_w k_{w2}(z_{w2} - z_{k2}) - y_c s_{bw2} + z_q (x_{w1}/x_w)(s_{qb} + s_{qs}) - 2 t_n m_{w2}(z_{tw} - z_q)(a_c - s^2 y_{ky2}) - 2 t_n m_{w2} z_{tw}\cdot 2 a_g z_{w2}/y_w + 8(m_b + m_s) a_g z_q (z_{w2}/y_w)(x_{w1}/x_w)$.   (3.8)

equation of lateral forces — the lateral forces between the wheels and the road, the inertia forces of the body and seat, the contribution from solid axle roll, and from the lateral shift of the wheels:

$0 = -2 k_{y1} y_{ky1} - 2 k_{y2} y_{ky2} + s_{qb} + s_{qs} + s^2 m_{w2} z_{tw} (2/y_w)(t_n z_{w2} + n_z z_b) + 2(m_{w1} + m_{w2}) a_c - s^2 (m_{w1} y_{ky1} + m_{w2} y_{ky2})$.   (3.9)

finally, the equation of moments on the body around the z-axis passing through the center of gravity of the body — the inertia moments of the body and seat, plus the inertia moments of the wheels, and the moment of the forces between the wheels and the road:

$0 = s^2 (4 m_b r_{bz}^2 + 4 m_s r_{sz}^2)(y_{ky1} - y_{ky2} - z_q (2/y_w) z_{w2})/x_w + s^2 [2 m_{w1}(r_{wz}^2 + x_{w1}^2 + y_w^2/4) + 2 m_{w2}(r_{wz}^2 + x_{w2}^2 + y_w^2/4)](y_{ky1} - y_{ky2} - z_q (2/y_w) z_{w2})/x_w + 2 x_{w1} k_{y1} y_{ky1} - 2 x_{w2} k_{y2}(y_{ky2} + z_q (2/y_w) z_{w2})$.   (3.10)
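equations (3.1)–(3.10), together with the control equations derived in the next section, form a linear set in the unknown complex amplitudes at each frequency. a minimal sketch of how such a set can be solved numerically is given below; the 2×2 matrix shown is a placeholder standing in for the full coefficient matrix of the model, not the actual coefficients of (3.5)–(3.10).

```python
import numpy as np

def solve_model(f, build_matrix, build_rhs):
    """solve the linear set a(s)·u = b(s) at one frequency, with s = i*2*pi*f.
    build_matrix/build_rhs assemble the complex coefficients of the model."""
    s = 1j * 2 * np.pi * f
    return np.linalg.solve(build_matrix(s), build_rhs(s))

# placeholder 2x2 system standing in for the 12-unknown set of the paper
demo_matrix = lambda s: np.array([[s**2 + 4.0, -2.0],
                                  [-2.0, s**2 + 6.0]], dtype=complex)
demo_rhs = lambda s: np.array([1.0, 0.0], dtype=complex)

for f in (0.5, 1.0, 2.0):
    u = solve_model(f, demo_matrix, demo_rhs)
    print(f"f = {f} hz -> |u| = {np.abs(u)}")
```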
4 the control

fig. 3 shows the control scheme of the suspension. in it, sensors are marked by their sensitivity constants. we will assume that active suspension will be used only on the rear axle. on the left side is the scheme of the front axle, with its spring $k_{b1}$ and damping $b_{b1}$, and with static level control consisting of the sensor $f_z$, very low pass $vlp$, hydraulic valve $v$, central accumulator $ca$ and drain $dr$. the dynamics of this control will not be dealt with here, i.e. $z_{r1} = 0$. some explanation first: elements marked $s_p$ are pressure sensors. a ring with a minus in it produces the difference of two signals. a ring with n in it means a negator — an element converting the phase by 180 deg. a ring with p in it means an amplifier, with amplification proportional to the control input marked by an arrow. hydraulic linkage is shown by double lines, and the signal paths are shown by single lines. the directions of the signals are marked by triangular arrows. no feed-back through them is assumed. signals coming to a common point add together. crossing signal paths are not connected. on the left side, the equipment of front wheels 1, 2 is shown. on the right side, the equipment of rear wheels 3, 4 is shown. for reasons of stability, active suspension can be used only on one axle — let it be the rear axle. (cars with front drive can use the front axle.) the lateral acceleration meter (with sensitivity constant $t_{ar}$) is mounted on the body at a height of $z_{ta}$.

fig. 3: control scheme

the active suspension on the rear axle (right part of the figure) consists of the sensors of the displacement wheel-body marked by $f_z$, producing the mean value of the front and rear displacements, of the sensor of the vertical acceleration, marked by $1/f_a$, and of the actuator embodied by a pump driven by an electric motor. the actuator is marked a, and its control amplification is $\zeta_2$; the common amplification of signals is $\zeta$. there is also a control velocity sensor, marked by its constant $\zeta_r$, to develop a source of velocity by means of a negative feedback. thus, the force of the actuator (proportional to the mass $m_b + m_s$) is

$s_a = \zeta_2 s \zeta_r z_{r2} + \zeta \zeta_2 \zeta_d [\zeta_h s^2/(2\pi f_a)\cdot z_b + 2\pi f_z \zeta_b (z_b - z_{w1}/2 - z_{w2}/2) - (a_c - s^2 y_{ky2} + 2 s^2 ((z_{ta} - z_{qt}) z_b + z_{qt} z_{w2})/y_w - 2 a_g z_b/y_w)\, t_{ar}]\, b_0 (m_b + m_s)/m_0$,   (4.1)

where $b_0$ is a formal constant valued 1 ns/m and $m_0$ is a constant with dimension kg, and where suitable damping with damping coefficient $\beta_b$ is introduced by the element marked by its transmission

$\zeta_b = 1 + s \cdot 2\beta_b/(2\pi f_z)$.

two filters are also included: a low pass with transmission $\zeta_d$ and cut-off frequency $f_d$, which helps to prevent undamping of the wheel, and a high pass with transmission $\zeta_h$ and cut-off frequency $f_h$, which eliminates the false signal of the acceleration sensor of the tilted body, namely

$\zeta_d = 1/(1 + s/(2\pi f_d))$, $\zeta_h = s/(1 + s/(2\pi f_h))$.

on the negative feedback: this control equation makes the actuator approximately a source of hydraulic current, i.e. the control displacement is not dependent on the force $s_{bw2}$. (this can also be achieved by a very high inner stiffness of the actuator, but with great losses of power.) the force of the actuator is at the same time

$s_r = k_r z_{r2} + k_0 z_{r2} - k_{b2}(z_{b2} - z_{w2} - z_{r2})$,   (4.2)

where $z_{r2}$ is the control displacement and $k_0$ is the stiffness of the balancing spring, in which the same static pressure is maintained as that in the spring $k_b$ by means of the sensors $s_p$, differentiator d, very low pass $vlp$ and valve $v$. (in the computed examples $k_0 = k_{bre}$.) the inner stiffness of the actuator $k_r$ consists of the damping of the pump $b_r$ and the inertia of the pump and the electric motor, so

$k_r = s b_r + s^2 m_r$.

combining equations (4.1) and (4.2), a single control equation can be written as follows:

$0 = \zeta_2 s \zeta_r z_{r2} + \zeta \zeta_2 \zeta_d [\zeta_h s^2/(2\pi f_a)\cdot z_b + 2\pi f_z \zeta_b (z_b - z_{w1}/2 - z_{w2}/2) - (a_c - s^2 y_{ky} + 2 s^2 ((z_{ta} - z_{qt}) z_b + z_{qt} z_{w2})/y_w - 2 a_g z_b/y_w)\, t_{ar}] - [(s b_r + s^2 m_r) z_r + k_0 z_r - k_b (z_b - z_w - z_r)]\, m_0/((m_b + m_s) b_0)$.   (4.3)

cross control is provided on the front axle. this cross control uses the measurements of the pressure transducers marked $s_p$ (in fig. 3). the sum of the loads in the rear (after filtering through a very low pass $vlp$ to get the static values $s_{st1}$, $s_{st2}$) controls the amplification of the proportional amplifier p of the load difference of the front wheel load. the outputs of the two proportional amplifiers are compared, and their difference controls the pumps of the front wheels with the appropriate phase. an equation to fulfil this aim can be written as follows:

$0 = s z_{r1} + \zeta_{dp} v_{cc} [k_{bre2}(z_b + (y_w/y_c) z_{r2} - z_{w2})/((x_{w1}/x_w)\, s_{st1}) - k_{bre1}(z_b + z_{r1} - z_{w1})/((x_{w2}/x_w)\, s_{st2})]$,   (4.4)

where $v_{cc}$ is the control constant and where the transmission of the low pass $\zeta_{dp} = 2\pi f_{dp}/(2\pi f_d + s)$ with the characteristic frequency $f_{dp}$ delays the answer to antiphased unevennesses.
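the control law (4.1)–(4.3) is built from simple first-order transmissions in the operator $s$. the sketch below evaluates $\zeta_d$, $\zeta_h$ and $\zeta_b$ over frequency with the cut-offs from the parameter list in section 7; it only illustrates the filter definitions as written above, and is not the author's controller code.

```python
import numpy as np

# cut-off frequencies and damping from the parameter list (section 7)
f_d, f_h, f_z, beta_b = 1.0, 0.1, 0.4, 0.4   # [hz], [hz], [hz], [-]

def transmissions(f):
    """first-order filter transmissions used in the control law."""
    s = 1j * 2 * np.pi * f
    zeta_d = 1.0 / (1.0 + s / (2 * np.pi * f_d))        # low pass
    zeta_h = s / (1.0 + s / (2 * np.pi * f_h))          # high pass (as defined in the text)
    zeta_b = 1.0 + s * 2 * beta_b / (2 * np.pi * f_z)   # corrective damping element
    return zeta_d, zeta_h, zeta_b

for f in (0.1, 1.0, 10.0):
    zd, zh, zb = transmissions(f)
    print(f"f = {f:5.1f} hz: |zeta_d| = {abs(zd):.3f}, "
          f"|zeta_h| = {abs(zh):.3f}, |zeta_b| = {abs(zb):.3f}")
```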
the influence of the inner stiffness of the actuator is not included here (a source of displacement is assumed). equations (3.1) to (3.10), (4.3) and (4.4) make a set of equations for the unknown quantities $s_{bw1}$, $s_{bw2}$, $s_{qb}$, $s_{qs}$, $z_b$, $z_s$, $z_{w1}$, $z_{w2}$, $y_{ky1}$, $y_{ky2}$, $z_{r1}$, $z_{r2}$.

5 input and output

the spectrum of a surface of medium roughness is used for the statistical form of the road unevennesses (ride velocity $v_x = 60$ km/h):

$z_{fk} = \sqrt{v_x/f}\cdot\sqrt{1/2 - (1/2)/(1 + 2\pi f \cdot 4.5\, y_w/v_x^4)}\,/\,(1 + (r_{pn} f/v_x)^2)\cdot 0.64$ mm,   (5.1)

where the track is $y_w = 1.8$ m. a correction for the tyre radius $r_{pn}$ is added. the effective value of this spectrum is 7.7 mm. the impulse input of the shape $1 - \cos$ (also "translated sinusoidal") was used in its spectral form

$\tau_f = \sin(2\pi f t_i)/(2\pi f) + (t_i/2)[\sin(\tau_1)/\tau_1 + \sin(\tau_2)/\tau_2]$,   (5.2)

where $\tau_1 = \pi - 2\pi f t_i$, $\tau_2 = \pi + 2\pi f t_i$, and where $t_i$ is the duration of the impulse at its half-height. the maximum value of the impulse unevenness $z_{ki}$ is assumed to be proportional to its duration (progressive impulses). these inputs are shown in the graphs of $z_{fbw}$ or $z_{bw}$. the lateral input (lateral acceleration) is a sort of trapezoid-shaped impulse, but with rounded edges. it has been obtained by adding the two above-mentioned $1 - \cos$ curves with $t_i/2$ (double cosinus), the second delayed by $t_i$ from the first. this input will be attached to the graphs of lateral seat acceleration. the graphs of the frequency characteristics $h_f$ have the effective values attached. these effective values follow the formula $\sqrt{2\int h_f^2\, df}$ within the limits $0 : f_{max}$. the pulse-effective values, given with the time histories $h_t$, are the effective values of the whole answer divided by the pulse duration, i.e. $\sqrt{\int h_t^2\, dt}/t_i$ within the limits $t = 0$ to $\infty$. the vertical statistical input is due to the central path between the wheels. the time interval between the front and the rear of the vertical input depends on the axle base $x_w$ and the riding speed $v_x$: $t_x = x_w/v_x$, so the relation of the vertical input front and rear is $z_{k2} = z_{k1}(\cos(s t_x) + i\sin(s t_x))$.

6 criteria

lateral acceleration in the rear seat: $a_{sy2} = a_c - s^2 y_{ky2} + s^2[(z_{tb} - z_{qt}) z_b + z_q z_{w2}]\cdot 2/y_w - 2 a_g z_b/y_w$. vertical acceleration in the seat: $a_s = s^2 z_s$. vertical acceleration of the body at radius $y_w/2$: $a_b = s^2 z_b$. body-wheel displacement front and rear: $z_{bw1} = z_b - z_{w1}$, $z_{bw2} = z_b - z_{w2}$. load transfer ratio front and rear: $s_{w1}/s_{st1} = k_{w1}(z_{w1} - z_{k1})/s_{st1}$, $s_{w2}/s_{st2} = k_{w2}(z_{w2} - z_{k2})/s_{st2}$, where the static forces are $s_{st1} = 2(m_b + m_s + m_w) a_g x_{w2}/x_w$, $s_{st2} = 2(m_b + m_s + m_w) a_g x_{w1}/x_w$. seat-body displacements front and rear: $z_{sb1} = z_{s1} - z_b$, $z_{sb2} = z_{s2} - z_b$. dynamic-to-static ratio of lateral forces between wheels and road, front and rear: $s_{y1}/s_{st1} = k_{y1} y_{ky1}/s_{st1}$, $s_{y2}/s_{st2} = k_{y2} y_{ky2}/s_{st2}$. spring displacements, front and rear: $z_{p1} = z_b - z_{w1} - z_{r1}$, $z_{p2} = (z_b - z_{w2}) y_w/y_c - z_{r2}$. the forces in the suspension, front and rear, are $s_{bw1} = k_{b1}(z_{b1} - z_{w1})$, $s_{bw2} = k_{b2}(z_{b2} - z_{w2} - z_{r2})$.

7 universal parameters and used parameters

the universal parameters (natural frequencies, damping coefficients) defined for the heave-pitch model are also used in the roll model, but we must also consider the anti-roll bar. if the ratio $\kappa_{stab}$ of anti-roll bar stiffness to spring stiffness $k_b$ is given, then

$k_b = k_{bre} + s b_b = 4\pi^2 f_b^2 (m_b + m_s)(1 + \kappa_{stab}) + s \cdot 4\pi \beta_b f_b (m_b + m_s)$.
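a small sketch of the complex suspension stiffness $k_b$ just defined, split into its real spring rate $k_{bre}$ and damping $b_b$; the parameter values follow the list below, and the split assumes the two terms map onto $k_{bre}$ and $s\,b_b$ exactly as written.

```python
import numpy as np

# universal parameters from the list below (variant x-a)
f_b, beta_b = 1.0, 0.4        # natural frequency [hz], relative damping [-]
m_b, m_s = 1000.0, 500.0      # quarter body and seat masses [kg]
kappa_stab = 0.0              # anti-roll bar stiffness ratio

# real spring rate and damping read off from k_b = k_bre + s*b_b
k_bre = 4 * np.pi**2 * f_b**2 * (m_b + m_s) * (1 + kappa_stab)   # [N/m]
b_b = 4 * np.pi * beta_b * f_b * (m_b + m_s)                     # [Ns/m]

def k_b(f):
    """complex stiffness at frequency f, with s = i*2*pi*f."""
    return k_bre + 1j * 2 * np.pi * f * b_b

print(f"k_bre = {k_bre:.0f} N/m, b_b = {b_b:.0f} Ns/m, "
      f"|k_b(1 hz)| = {abs(k_b(1.0)):.0f} N/m")
```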
the parameters used with a high bus, variant x–a, i.e. cross control front and active control rear, are:

$f_b = 1$ hz, $\beta_b = 0.4$, $m_b = 1000$ kg, $z_{tb} = 0.9$ m, $r_{bx} = 0.6$ m, $y_w = 1.8$ m, $f_w = 10$ hz, $\beta_w = 0.05$, $m_w = 150$ kg, $z_{tw} = 0.5$ m, $r_{wx} = 0.3$ m, $y_c = 1.2$ m, $f_s = 3$ hz, $\beta_s = 0.28$, $m_s/m_b = 0.5$, $z_{ts} = 2.7$ m, $r_{sx} = 0.4$ m, $y_s = 1.2$ m, $\zeta = 3$, $f_z = 0.4$ hz, $\beta_a = 1$, $f_a = 0.3$ hz, $f_d = 1$ hz, $f_h = 0.1$ hz, $t_{ar} = 0.075$ s, $\zeta_2 = 100$, $\zeta_r = 0.9$, $b_r = b_b/1000$, $m_r = m_b/1000$, $b_0 = 1$ ns/m, $m_0 = 1$ kg, $z_{ta} = 0.9$ m, $v_{cc} = 1$ m/s, $f_{cc} = 3$ hz, $\kappa_{stab1} = 0$, $\kappa_{stab2} = 0$, $z_q = 0.3$ m, $x_w = 5.5$ m, $x_{w1}/x_{w2} = 2$, $r_{bz} = 2$ m, $r_{sz} = 2$ m, $r_{wz} = 0.3$ m.

fig. 4: time histories due to the impulse of lateral acceleration. a thick line means the active variant (act), a thin line means the passive variant (pas). the attached numbers are imp-eff. values. a semicolon is used to mark an index. in the graph of power, the attached value refers to energy consumption [ws/1000 kg].

the stiffnesses $k_b$, $k_w$ are assumed to be proportional to the static load front and rear: $k_{b1} = k_b \cdot 2 x_{w2}/x_w$, $k_{b2} = k_b \cdot 2 x_{w1}/x_w$, $k_{w1} = k_w \cdot 2 x_{w2}/x_w$, $k_{w2} = k_w \cdot 2 x_{w1}/x_w$. for variant p–p, i.e. for a vehicle with passive suspension front and rear, the parameters differ as follows: $\kappa_{stab1} = 1.2$, $\kappa_{stab2} = 1.4$, $z_q = 0.9$ m, $\zeta_2 = 100$. the parameters of the actuators $m_r$ and $b_r$ are only estimated.

8 example of modelling results

in the graphs in figs. 4, 5, the active variant act (passive suspension front, active rear) is indicated by more prominent lines (lines with dots) than those of the passive variant (pas–pas). in some suitable graphs, the appropriate input is shown by interrupted lines. effective values (with frequency characteristics) or impulse-effective values (effective values of the answers divided by the impulse duration, with time histories) are attached to the labels.

fig. 5: frequency characteristics due to unevennesses. a thick line means the active variant (act), a thin line means the passive variant (pas). the attached numbers are imp-eff. values. a semicolon is used to mark an index.

steady-state values due to $a_c = 1$ m/s²:                          active    passive
  lateral acceleration in the seat $a_{sy}$ [mm/s²]                  0.77      1.35
  body-wheel travel $z_{bw2}$ [mm]                                   25        25
  vertical dynamic force to static force ratio, front $s_{w1}/s_{st1}$   12        20
  vertical dynamic force to static force ratio, rear $s_{w2}/s_{st2}$    14        20

steady-state values due to in-phase unevenness $z_{k1} = z_{k2} = 0.01$ m:
  lateral acceleration in the seat $a_{sy}$ [m/s²]                   0.11      0.15
  vertical dynamic force to static force ratio, front $s_{w1}/s_{st1}$   0.02      0.022
  vertical dynamic force to static force ratio, rear $s_{w2}/s_{st2}$    0.017     0.019

steady-state values due to cross unevenness $z_{k1} = 0.01$ m, $z_{k2} = -0.01$ m:
  lateral acceleration in the seat $a_{sy}$ [m/s²]                   0.028     0.023
  vertical dynamic force to static force ratio, front $s_{w1}/s_{st1}$   0.00042   −0.07
  vertical dynamic force to static force ratio, rear $s_{w2}/s_{st2}$    0.009     0.03

9 stability

stability against self-exciting oscillation is an important criterion of the system. it is most important with the roll model, where there is also the possibility of roll-over. asymptotic stability is used, i.e. the real part of the eigenvalues of the equation matrix must be negative. stability was checked for all parameters of a simplified model. the parameters of nominal value $h_{nom}$ were varied between 1/100 and 100 multiples of the given value $h_{jm}$. when no instability was found, the output 100 was given. if instability was found at the parameter value $h_{crit}$, the stability rate $h_{crit}/h_{nom}$ was put out. a minimal stability rate of about 2.2 for the passive model increases to 5 with active suspension.
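the asymptotic-stability test described in section 9 reduces to checking the signs of the real parts of the system matrix eigenvalues. the following is a minimal sketch of such a check on a placeholder state matrix (a damped oscillator), not the author's simplified model; the parameter sweep mimics the 1/100 .. 100 variation described above.

```python
import numpy as np

def is_asymptotically_stable(a):
    """true if all eigenvalues of the state matrix have negative real parts."""
    return bool(np.all(np.linalg.eigvals(a).real < 0))

# placeholder state matrix: damped oscillator x'' + 2*beta*w*x' + w^2*x = 0
w, beta = 2 * np.pi * 1.0, 0.4
a = np.array([[0.0, 1.0],
              [-w**2, -2 * beta * w]])
print(is_asymptotically_stable(a))   # -> True

# parameter sweep in the spirit of section 9: vary a parameter h over
# 1/100 .. 100 of its nominal value and record where stability is lost
for scale in (0.01, 1.0, 100.0):
    a_var = np.array([[0.0, 1.0], [-w**2, -2 * beta * w * scale]])
    print(f"h/h_nom = {scale:6.2f}: stable = {is_asymptotically_stable(a_var)}")
```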
10 conclusions

based on the example with vertical statistical input, the lateral acceleration in the seat is slightly increased and the lateral force is substantially decreased. the lateral load transfer is slightly deteriorated, but the effective values incorporating the values from the heave-pitch model are little influenced. it has been shown that it is possible by means of cross control to distribute the load changes due to cornering proportionally to the static load of the axle. the lateral input is the main advantage of active suspension, thanks to the active roll: from the viewpoint of ride comfort, active roll substantially suppresses the lateral force during cornering. from the viewpoint of ride safety, active roll also substantially suppresses the load transfer to the outside wheels in a curve. thus the vehicle can take curves at a higher speed or with more safety. with active roll suspension there is no need to use a solid axle for anti-roll purposes. this means more comfort for the passenger, and there is reduced damage to the road, due to a big reduction in the lateral forces between wheel and road. a solid axle was used in our examples, but with a reduced height of the roll axis. with active suspension there is also no need for anti-roll bars. active roll technology also enables the design of a slender passenger car for two passengers sitting in tandem (we can call this a tandemo) with, e.g., an 0.8 m track.

references
[1] sampson, d. j. m., cebon, d.: active roll control of single unit heavy road vehicles, http://www.cvdc.org/recent papers/ sampsoncebon vsd02.pdf
[2] vaculín, o., valášek, m., svoboda, j.: influence of vehicle tilting on its performance, conf. interaction and feedbacks, praha, 2005.
[3] yamashita, m.: application of h∞ control to active suspension systems, automatica, vol. 30, 1994.
[4] čech, i.: a pitch-plane model of a vehicle with controlled suspension. vehicle system dynamics, vol. 23, 1998, pp. 133–148.

ing. ilja čech
e-mail: cech10600jes11@seznam.cz
jesenická 11, 106 00 prague 10, czech republic
list of symbols

complex amplitudes, spectral values and instantaneous values (listed in this order for each quantity):

$a_c$ / $a_{fc}$ / $a_c$ — lateral (centripetal) acceleration [m/s²]
$a_s$ / $a_{fs}$ / $a_s$ — vertical acceleration of the seat [m/s²]
$a_{sy}$ / $a_{fsy}$ / $a_{sy}$ — lateral acceleration in the seat above the center of gravity [m/s²]
$a_x$ / $a_{fx}$ / $a_x$ — longitudinal acceleration input [m/s²]
$s_a$ / $s_{fa}$ / $s_a$ — force developed by the actuator [n]
$s_{bw}$ / $s_{fbw}$ / $s_{bw}$ — force between body and wheel [n]
$s_q$ / $s_{fq}$ / $s_q$ — lateral inertia force due to yaw motion [n]
$s_p$ / $s_{fp}$ / $s_p$ — force on the spring [n]
$s_{qb}$ / $s_{fqb}$ / $s_{qb}$ — lateral force in the joint due to the body [n]
$s_{qs}$ / $s_{fqs}$ / $s_{qs}$ — lateral force in the joint due to the seat [n]
$s_{sb}$ / $s_{fsb}$ / $s_{sb}$ — vertical force between seat and body [n]
$s_w$ / $s_{fw}$ / $s_w$ — vertical force between wheel and road surface [n]
$s_y$ / $s_{fy}$ / $s_y$ — lateral force between wheel and road surface [n]
$v_r$ / $v_{fr}$ / $v_r$ — velocity of movement of the actuator [m/s]
$y_{ar}$ / $y_{far}$ / $y_{ar}$ — lateral displacement of the body at lateral acceleration meter height [m]
$y_{ky}$ / $y_{fky}$ / $y_{ky}$ — lateral displacement in lateral stiffnesses $k_{y1}$, $k_{y2}$
$z_b$ / $z_{fb}$ / $z_b$ — vertical displacement (for the roll model at radius $y_w/2$) [m]
$y_z$ / $y_{fz}$ / $y_z$ — lateral displacement of the body due to yaw motion [m]
$z_{bw}$ / $z_{fbw}$ / $z_{bw}$ — body-wheel displacement [m]
$z_k$ / $z_{fk}$ / $z_k$ — vertical input displacement [m]
$z_r$ / $z_{fr}$ / $z_r$ — control displacement [m]
$z_s$ / $z_{fs}$ / $z_s$ — vertical seat displacement [m]
$z_w$ / $z_{fw}$ / $z_w$ — vertical wheel displacement [m]

quantities without physical dimension:
$\beta_a$ — relative damping rate of the control
$\beta_b$ — relative damping rate of the body
$\beta_s$ — relative damping rate of the seat
$\zeta$ — common amplification of signals
$\zeta_2$ — amplification of the actuator
$\zeta_b$ — corrective damping element
$\zeta_d$ — transmission of the low pass filter
$\zeta_h$ — transmission of the high pass filter
$\zeta_r$ — sensitivity constant of the feedback sensor
$\kappa_{stab}$ — anti-roll bar stiffness ratio
$\kappa_{yw}$ — lateral stiffness to radial stiffness ratio
$n_z$, $t_n$ — $n_z = 0$, $t_n = 1$ for solid axle; $n_z = 1$, $t_n = 0$ for no solid axle

other quantities:
$a_g$ — gravity acceleration [m/s²]
$b_b$ — damping of passive suspension [ns/m]
$b_{pk}$ — slip damping of the tyre [ns/m]
$b_0$ — formal constant of the actuator, $b_0 = 1$ [ns/m]
$b_r$ — damping of the actuator [ns/m]
$b_s$ — seat damping [ns/m]
$b_w$ — wheel damping [ns/m]
$f$ — frequency [hz]
$f_a$ — $1/(2\pi f_a)$ is the sensitivity constant of the transducer of vertical acceleration [hz]
$f_b$ — natural frequency of passive suspension [hz]
$f_d$, $f_h$ — characteristic frequency of the filter [hz]
$f_r$ — sensitivity constant of the control velocity sensor [hz]
$f_z$ — $2\pi f_z$ is the constant of the displacement transducer [hz]
$f_\varepsilon$ — $1/(2\pi f_\varepsilon)$ is the constant of the roll acceleration transducer [hz]
$h_j$ — nominal value of the parameter
$h_{krit}$ — critical value of the parameter
$k_0$ — spring rate of the balancing spring [n/m]
$k_b$ — stiffness of the body-wheel suspension [n/m]
$k_{bre}$ — spring rate of the body-wheel spring [n/m]
$k_r$ — inner stiffness of the actuator [n/m]
$k_s$ — stiffness of the seat-body spring [n/m]
$k_{st}$ — stiffness of the roll-bar [n/m]
$k_w$ — radial stiffness of the tyre [n/m]
$k_{wyre}$ — real lateral stiffness of the tyre [n/m]
$k_y$ — lateral stiffness [n/m]
$k_{ypk}$ — sliding stiffness [n/m]
$m_0$ — formal constant, $m_0 = 1$ kg
$m_b$ — quarter body mass [kg]
$m_s$ — quarter seat mass [kg]
$m_c$ — total mass of vehicle, $m_b + m_s + m_w$ [kg]
$m_r$ — actuator mass transferred to the pump radius [kg]
$m_w$ — wheel mass [kg]
$p$ — power [w]
$r_{pn}$ — tyre radius [m]
$r_{bx}$ — gyration radius of the body to the longitudinal axis [m]
$r_{bz}$ — gyration radius of the body to the vertical axis [m]
$r_{sx}$ — gyration radius of the seat to the longitudinal axis [m]
$r_{sz}$ — gyration radius of the seat to the vertical axis [m]
$r_{wx}$ — gyration radius of the wheel to the longitudinal axis [m]
$r_{wz}$ — gyration radius of the wheel to the vertical axis [m]
$s$ — operator $i 2\pi f$
$t$ — time [s]
$t_{ar}$ — sensitivity constant of the lateral acceleration sensor [hz]
$tg\alpha$ — constant of inverse proportionality between slip rotation and load
$t_i$ — duration of the input impulse at half-height [s]
$v_x$ — travelling speed [m/s]
$y_c$ — spring distance [m]
$y_s$ — seat distance [m]
$y_w$ — wheel track [m]
$z_a$ — height of the lateral acceleration meter [m]
$z_{ki}$ — maximum value of the vertical input impulse [m]
$z_q$ — height of the joint [m]
$z_{qt}$ — height of the on-axis under the center of gravity [m]
$z_{ta}$ — height of the lateral acceleration transducer [m]
$x_w$ — axle base [m]
$x_{w1}$ — distance between the mass centre of the body and the front axle [m]
$x_{w2}$ — distance between the mass centre of the body and the rear axle [m]
$z_{tb}$ — height of the body mass centre [m]
$z_{ts}$ — height of the seat mass centre [m]
$z_{tw}$ — height of the wheel mass centre [m]

indices: 1 — front axle; 2 — rear axle; b — body; f — spectral values; re — real part; s — seat; x — longitudinal direction, axis; y — lateral direction, axis; w — wheel.

space variant psf – deconvolution of wide-field astronomical images
m. řeřábek

abstract
the properties of uwfc (ultra wide-field camera) astronomical systems, along with specific visual data in astronomical images, contribute to a comprehensive evaluation of the acquired image data. these systems contain many different kinds of optical aberrations which have a negative effect on image quality and imaging system transfer characteristics, and reduce the precision of astronomical measurement. it is very important to figure out two main questions. first: how do astrometric measurements depend on optical aberrations? and second: how do optical aberrations affect the transfer characteristics of the whole optical system? if we define the psf (point spread function) [2] of an optical system, we can use some suitable methods for restoring the original image. optical aberration models for lsi/lsv (linear space invariant/variant) [2] systems are presented in this paper. these models are based on seidel and zernike approximating polynomials [1]. optical aberration models serve as a suitable tool for estimating and fitting the wavefront aberration of a real optical system. real data from the bootes (burst observer and optical transient exploring system) experiment is used for our simulations. problems related to uwfc imaging systems, especially a restoration method in the presence of space variant psf, are described in this paper. a model of the space variant imaging system, and partially of the space variant optical system, has been implemented in matlab. the "brute force" method has been used for restoration of the testing images. the results of different deconvolution algorithms are demonstrated in this paper. this approach could help to improve the precision of astronomic measurements.

keywords: high order aberrations, psf, uwfc, bootes, zernike, seidel, lsi, lsv, deconvolution, image processing.

1 introduction

a wide range of applications use lenses with a wide angle of view (wfc, uwfc) for sky monitoring. detection of new objects, e.g. novae, supernovae and agn (active galactic nuclei), is a well known application in which these images are analysed. all-sky imaging (monitoring) based on so-called "fish-eye" lenses is also used in some applications. this paper deals with scientific (astronomical) image data processing. real image data from the bootes experiment and from double-station video observation of meteors is analyzed. bootes is a system for monitoring the optical transient of grb (gamma ray bursts). the main goal of double-station video observation of meteors is to acquire and analyse video records of meteors. the images from these systems contain survey data with huge numbers of objects of small size. bootes is an automatic robotic system located in southern spain. it is equipped with a set of telescopes and ccd (charge-coupled device) cameras, which are fitted with lenses of various focal lengths. each camera is used for a different task in monitoring the ot (optical transient) of the grb and agn. in the following text, we deal with image data acquired from the uwfc optical system. the second application is double-station video observation of meteors in the czech republic.
observation using a double camera system enables us to determine the trajectory of meteors in the earth's atmosphere. uwfc is also used in this system. high quality of the imaging system is therefore required. uwfc image data analysis is very difficult in general. there are many different kinds of optical aberrations and distortions in these systems. moreover, the objects in ultra wide-field images are very small (a few pixels per object dimension). optical aberrations have their greatest impact toward the margins of the fov (field of view); that is, the aberration error grows with increasing angular distance from the optical axis of the system. these aberrations distort the psf of the optical system and rapidly reduce the accuracy of the astrometry measurements. optical aberrations are dependent on three-dimensional coordinates. this relation affects the transfer characteristics of optical systems and makes them spatially variant. the influence of spatially variant optical aberrations on the transfer function of optical imaging systems is outlined in this paper.

2 transfer characteristics of optical imaging systems

the description of the transfer characteristics (mainly the psf) of an optical imaging system is presented in the following section. to specify the properties of the lens optical system, we adopt the point of view that all optical elements are lumped into a "black box". referring to fig. 1, the terminals of this black box consist of the planes containing the entrance and exit pupils [2].

fig. 1: generalized model of an optical imaging system

2.1 point spread function

the smallest detail that the imaging system can produce is determined by its impulse response $h(u, v)$. the impulse response, referred to in optical systems as the point spread function (psf), describes the spatial distribution of the illumination in the image plane when a point source in the object plane is used. the psf is actually the response of the optical imaging system to the two-dimensional dirac impulse (see fig. 2).
3 effects of aberrations on psf

optical systems with all aberrations compensated are known as diffraction limited systems. the influence of aberrations on image quality can be expressed as the wavefront error at the exit pupil, and their effects on the transfer characteristics can be expressed as the changes in the psf size and shape. when an imaging system is diffraction limited, the psf consists of the fraunhofer diffraction pattern of the exit pupil [2]. in that case, we consider the departures of the exit-pupil wavefront from the ideal spherical form.

3.1 shift invariant psf modeling

the relation between the object and the image of an lsi optical system can be expressed by the convolution in the spatial domain (the object irradiance distribution $f(\xi, \eta)$ is convolved with the impulse response $h(u, v)$) [5]. the psf of the lsi optical imaging system can be expressed as

$\mathrm{psf}(x, y) = \left|\mathrm{ft}\{q(u, v)\}\right|^2 = \left|\mathrm{ft}\{p(x, y)\exp(i 2\pi w(x, y))\}\right|^2$,   (1)

where $p(x, y)$ defines the shape, size and transmission of the exit pupil, and $\exp(i 2\pi w(x, y))$ accounts for the phase deviation of the wavefront from a reference sphere. fig. 3 shows the psf of a diffraction limited optical system. an example of the influence of coma aberration (one of the basic optical aberrations) [1, 3] on the psf is presented in fig. 4.

3.2 shift variant psf modeling

when the impulse response of the optical imaging system depends on the coordinates of the object $(\xi, \eta)$, we speak about linear shift variant (lsv) systems. if we consider the lsv system, we cannot use convolution to express the relation between object and image. in order to compute the psf we have to use the diffraction integral [1, 3, 4]. the psf can then be expressed as [6, 10]

$h(u, v; \xi, \eta) = \frac{1}{\lambda^2 z_i z_0}\iint p(x, y)\,\exp(i 2\pi w(x, y))\,\exp\!\left\{-\frac{j 2\pi}{\lambda z_i}\left[(u - m\xi)x + (v - m\eta)y\right]\right\} dx\, dy$.   (2)

fig. 5 shows real astronomical data, which was taken within the double-station observation project. this program is currently running at the ondřejov observatory. a stellar object profile [8, 9] depending on the image position is also demonstrated in fig. 5.

4 model of the spatially variant imaging system

astrometric measurements are often limited by variations in psf shape and size over the image. these variations in psf structure occur especially in uwfc systems, because of the amount of aberrations. optical aberrations increase toward the margins, as mentioned above. the principal difficulty in spatially variant (sv) systems is that fourier approaches can no longer be used for restorations (deconvolutions) of the original image [5, 7].

fig. 2: two dimensional impulse response of the optical imaging system – i.e. point spread function (psf)
fig. 3: psf of a diffraction limited (aberration free) system
fig. 4: psf distorted with coma aberration
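to make eq. (1) concrete, here is a minimal numeric sketch (not the author's matlab model) of a psf computed as the squared magnitude of the fourier transform of a generalized pupil; the circular aperture, grid size and the zernike-like coma amplitude are illustrative assumptions.

```python
import numpy as np

n = 256
x = np.linspace(-1.0, 1.0, n)
xx, yy = np.meshgrid(x, x)
r = np.hypot(xx, yy)
theta = np.arctan2(yy, xx)

# circular exit pupil p(x, y): unit transmission inside the aperture
pupil = (r <= 1.0).astype(float)

# illustrative coma-like wavefront error w(x, y) in waves
# (zernike coma ~ (3 r^3 - 2 r) cos(theta); the amplitude is an assumption)
w = 0.2 * (3 * r**3 - 2 * r) * np.cos(theta)

# generalized pupil q = p * exp(i * 2*pi * w); psf = |ft{q}|^2  (eq. 1)
q = pupil * np.exp(1j * 2 * np.pi * w)
psf = np.abs(np.fft.fftshift(np.fft.fft2(q))) ** 2
psf /= psf.sum()                    # normalize the total energy

print(f"peak (strehl-like) value: {psf.max():.3e}")
```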
let us consider an sv optical system distorted only by coma aberration. if we want to use fourier methods for the deconvolution process, we need to split the original image. the transfer characteristics of each part (a so-called isoplanatic patch) are described by a unique psf. this system is called a partially space invariant system, and we use it in our experiments (see fig. 6). such a system has a parametric psf – for each value of a parameter (in our case the coordinate of the object in the object plane) the psf takes a different size and shape according to the aberrations. the wavefront aberration function for the sv optical system can be described as

$w(\rho, \varphi; \xi, \eta) = \sum_n \sum_m w_n^m(\xi, \eta)\, z_n^m(\rho, \varphi)$,   (3)

where $w_n^m(\xi, \eta)$ is the rms wavefront error for the aberration mode $(n, m)$ and for the object coordinate $(\xi, \eta)$. use of the partially space invariant system enables us to describe the transfer characteristics by fourier methods. the psf of this system can then be described as

$\mathrm{psf}(u, v; \xi, \eta) = \left|\mathrm{ft}\{p(x, y)\exp(i 2\pi w(x, y; \xi, \eta))\}\right|^2$,   (4)

where $(\rho, \varphi)$ is the polar coordinate on the object plane. so we obtain a number of psfs, one for each part. now, we can also use the fourier approach for image deconvolution.

fig. 5: input image data from the system at the ondřejov base. demonstration of the influence of aberrations at the edge and in the centre of an image.
fig. 6: model of a partially space invariant optical system (psf_i, psf_j, …, psf_n as functions of (ξ, η))
fig. 7: a) source image – object, b) image passed through the sv coma distorted optical system

the optical system is divided into 40 spatially invariant parts in our experiments. three deconvolution algorithms [8, 9] – wiener, lucy-richardson and blind – have been used for restoring the original image (see figs. 7 and 8). the wiener deconvolution algorithm gives inaccurate results; this method would need finer divisions. the lucy-richardson method gives the best results. the results are influenced by the setting of the weighting function of each spatially invariant part.

fig. 8: restored images: a) wiener deconvolution, b) lucy-richardson deconvolution, c) blind deconvolution
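the following sketch illustrates the patch-wise ("partially space invariant") restoration idea described above: the image is split into blocks and each block is wiener-deconvolved with its own psf. it is a simplified stand-in for the paper's matlab "brute force" implementation; the psf generator, patch grid and regularization constant are assumptions.

```python
import numpy as np

def wiener_deconvolve(img, psf, k=1e-2):
    """frequency-domain wiener filter: h* / (|h|^2 + k), psf padded to image size."""
    h = np.fft.fft2(psf, s=img.shape)
    g = np.fft.fft2(img)
    f_hat = np.conj(h) / (np.abs(h) ** 2 + k) * g
    return np.real(np.fft.ifft2(f_hat))

def patchwise_restore(img, psf_for_patch, grid=(4, 10)):
    """split the image into grid patches, deconvolving each with its own psf
    (a crude version of the 40-patch partially space invariant model)."""
    out = np.zeros_like(img)
    rows = np.array_split(np.arange(img.shape[0]), grid[0])
    cols = np.array_split(np.arange(img.shape[1]), grid[1])
    for i, r in enumerate(rows):
        for j, c in enumerate(cols):
            patch = img[np.ix_(r, c)]
            out[np.ix_(r, c)] = wiener_deconvolve(patch, psf_for_patch(i, j))
    return out

# hypothetical psf generator: a gaussian blur that widens toward the margins
def psf_for_patch(i, j, size=15):
    y, x = np.mgrid[-(size // 2):size // 2 + 1, -(size // 2):size // 2 + 1]
    sigma = 1.0 + 0.3 * abs(j - 4.5)       # aberrations grow toward the fov edge
    psf = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    return psf / psf.sum()

image = np.random.default_rng(2).poisson(50.0, size=(200, 500)).astype(float)
restored = patchwise_restore(image, psf_for_patch)
```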
5 conclusions

optical aberration models for lsi, lsv and partially invariant systems have been presented in this paper. these models are based on the seidel and zernike approximating polynomials. we can commonly use these approaches for modeling the wfc and uwfc of the bootes experiment and also for double-station video observation of meteors. the influence of coma aberration on astrometric measurement precision has been shown. the effects of optical aberrations and spatial variations on the transfer function (psf) of optical imaging systems have been described. the results of various deconvolution algorithms have been demonstrated. the goal of our future work is to find the psf model of high order optical aberrations for a real sv (partially space invariant) system and to remove aberrations from the image. it will be necessary to find a sufficient splitting of the spatially variant system, and to find a proper deconvolution method for removing the aberrations. this approach will also help to improve the precision of astronomic measurements around the optical axis and on the edges of the fov (far from the optical axis).

acknowledgments

the research described in this paper was supervised by mgr. petr páta, phd., fee ctu in prague. this work has been supported by grant no. ctu0707613 "shift variant optical systems in imaging algorithms" of ctu in prague. a part of this research work has been supported by the czech grant agency under grant no. 102/05/2054 "qualitative aspects of audiovisual information processing in multimedia systems".

references
[1] born, m., wolf, e.: principles of optics. 2nd ed. london: pergamon press, 1993.
[2] goodman, j. w.: introduction to fourier optics. boston: mcgraw-hill, 1996.
[3] hopkins, h. h.: wave theory of aberrations. london: oxford university press, 1950.
[4] goodman, j. w.: linear space-variant optical data processing. in: optical information processing, edited by s. h. lee. berlin, heidelberg, new york: springer-verlag, 1981, p. 235–260. isbn 3-540-10522-0.
[5] lohman, a. w., paris, d. p.: space-variant image formation. journal of the optical society of america, vol. 55 (1965), no. 8, p. 1007–1013.
[6] campisi, p., egiazarian, k.: blind image deconvolution: theory and applications. crc, 2007.
[7] řeřábek, m., páta, p., koten, p.: high order optical aberration influence to precision of astronomical image data processing. in: spie europe conference proceedings 6584, prague, 2007. isbn 9780819467126.
[8] starck, j., murtagh, f.: astronomical image and data analysis. berlin: springer verlag, 2002.
[9] starck, j., murtagh, f., bijaoui, a.: image and data analysis: the multiscale approach. cambridge university press, 1998.
[10] maeda, p. y.: zernike polynomials and their use in describing the wavefront aberrations of the human eye. 2003.

martin řeřábek
e-mail: rerabem@fel.cvut.cz
department of radioelectronics, czech technical university in prague, faculty of electrical engineering, technická 2, 166 27 praha, czech republic

dissipative heating in a rotational viscometer with coaxial cylinders
f. rieger

abstract
rotational viscometers with coaxial cylinders are often used for measuring rheological behaviour. if the inner to outer cylinder diameter ratio does not differ significantly from 1, the curvature can be neglected and the flow reduces to the flow between moving and stationary plates. the power-law and bingham models are often used for describing rheological behaviour. this paper deals with the temperature distribution obtained by solving the fourier-kirchhoff equation and, in the case of negligible inner heat resistance, it also covers the temperature time dependence. the solution is illustrated by a numerical example.

keywords: viscous heating, viscometer with coaxial cylinders, power-law fluids, bingham plastics.

fig. 1: rotational viscometer with coaxial cylinders

1 introduction

this paper presents viscous heating in a rotational viscometer with coaxial cylinders (see fig. 1), which is often used for measuring rheological behaviour. power-law and bingham models are often used to describe rheological behaviour [1]. the power-law model is the simplest model widely used for describing the rheological behaviour of non-newtonian fluids. using this model, the dependence of shear stress $\tau$ on shear rate $\dot\gamma$ can be expressed by the relation:

$\tau = k \dot\gamma^n$,   (1)

where $k$ is the coefficient of consistency and $n$ stands for the flow behaviour index. the bingham model is the simplest model used for describing the rheological behaviour of viscoplastic materials. using this model, the relation of shear stress $\tau$ and shear rate $\dot\gamma$ can be expressed by the following relation:

$\tau = \tau_0 + \eta_p \dot\gamma$ for $\tau \ge \tau_0$,   (2)

where $\tau_0$ is the yield stress and $\eta_p$ stands for plastic viscosity. if the inner to outer cylinder diameter ratio does not differ significantly from 1, the curvature can be neglected and the flow reduces to the flow between the moving and stationary plates. the temperature distribution can be obtained by solving the fourier-kirchhoff equation [1]

$\lambda \frac{d^2 t}{dy^2} = -\tau \dot\gamma$,   (3)

where $y$ is the distance from the stationary plate. the equation will be solved with the following boundary conditions:

$y = h:\ \frac{dt}{dy} = 0$;   $y = 0:\ \lambda\frac{dt}{dy} = \alpha(t - t_f)$,   (4)

i.e., we assume an insulated moving plate (rotating cylinder) and a stationary plate (cylinder) tempered to temperature $t_f$.
2 solution

2.1 power-law fluids

inserting (1) into (3), we get

$\lambda \frac{d^2 t}{dy^2} = -k \dot\gamma^{n+1}$   (5)

and after integration we obtain

$t = t_f + \frac{k \dot\gamma^{n+1} h^2}{\lambda}\left(\frac{y}{h} - \frac{y^2}{2h^2} + \frac{1}{bi}\right)$,   (6)

where $bi = \alpha h/\lambda$. the solution is shown in graphical form in fig. 2, where

$t^* = \frac{(t - t_f)\,\lambda}{k \dot\gamma^{n+1} h^2}$,   $y^* = y/h$.   (7)

fig. 2: dimensionless temperature profiles for power-law fluids ($bi$ = 1, 10, 100)

2.2 bingham plastics

inserting (2) into (3), we get

$\lambda \frac{d^2 t}{dy^2} = -(\tau_0 + \eta_p \dot\gamma)\,\dot\gamma$   (8)

and after integration and rearrangement we obtain

$t = t_f + \frac{\eta_p \dot\gamma^2 h^2}{\lambda}(1 + \tau_0^*)\left(\frac{y}{h} - \frac{y^2}{2h^2} + \frac{1}{bi}\right)$,   (9)

where

$\tau_0^* = \frac{\tau_0}{\eta_p \dot\gamma}$.   (10)

the solution of (9) for $bi \to \infty$ is shown in graphical form in fig. 3, where

$t^* = \frac{(t - t_f)\,\lambda}{\eta_p \dot\gamma^2 h^2}$.   (11)

fig. 3 shows that increasing plasticity ($\tau_0^*$) increases the temperature rise.

fig. 3: dimensionless temperature profiles for bingham plastics (curves for several values of $\tau_0^*$)

from the dependencies shown in fig. 2 it can be seen that at small biot number values the outer temperature resistance prevails and the temperature of the liquid is practically constant, and in an unsteady state it depends on time. for this case the enthalpy balance can be written in the form

$\rho c h \frac{dt}{dt'} = \tau \dot\gamma h - \alpha (t - t_f)$,   (12)

where $t'$ denotes time, and after integration it transforms to

$t^{**} \equiv \frac{t - t_f}{t_f - t_0} = k^* - (1 + k^*)\exp(-t^*)$,   (13)

where

$k^* = \frac{\eta \dot\gamma^2 h}{\alpha (t_f - t_0)}$,   $t^* = \frac{\alpha t'}{\rho c h}$.   (14)

the dependence is shown in fig. 4, where the line for $k^* = 1$ designates the equilibrium state at which all dissipative heat is removed by convection.

fig. 4: dependence of dimensionless temperature on dimensionless time (curves for several values of $k^*$)

the dependence of the dimensionless time $t^*$ after which the dimensionless temperature attains 99 % of the steady-state value on $k^*$ is shown in fig. 5.

fig. 5: dependence of the dimensionless time $t^*$ after which the dimensionless temperature attains 99 % of the steady-state value on $k^*$

in the case when the initial temperature $t_0$ is equal to the temperature $t_f$, it is suitable to define the dimensionless temperature as

$t^+ = \frac{(t - t_f)\,\alpha}{\eta \dot\gamma^2 h}$;   (15)

after integration of (12) we can obtain the relation

$t^+ = 1 - \exp(-t^*)$.   (16)

the application of the above relations will be illustrated in the following example.
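a minimal sketch (not from the paper) evaluating the steady profile (6) and the transient relations (13) and (16); the parameter values are the ones used in the worked example that follows.

```python
import numpy as np

# material and geometry data from the example below (newtonian liquid, n = 1)
lam, h = 0.5, 0.001          # heat conductivity [W/m/K], gap width [m]
rho, c = 1000.0, 4200.0      # density [kg/m^3], heat capacity [J/kg/K]

def viscosity(t_c):
    """temperature dependence of viscosity from the example, t in deg C."""
    return 0.82 * np.exp(-0.025 * t_c)

def steady_profile(y, gamma_dot, eta, alpha, t_f):
    """steady temperature from eq. (6) with k = eta, n = 1."""
    bi = alpha * h / lam
    return t_f + eta * gamma_dot**2 * h**2 / lam * (y / h - y**2 / (2 * h**2) + 1 / bi)

def transient(t_s, gamma_dot, eta, alpha, t_f, t_0):
    """temperature after time t_s from eq. (13); reduces to eq. (16) for t_0 = t_f."""
    t_star = alpha * t_s / (rho * c * h)                 # dimensionless time
    if t_f == t_0:                                       # eq. (16) route
        return t_f + eta * gamma_dot**2 * h / alpha * (1 - np.exp(-t_star))
    k_star = eta * gamma_dot**2 * h / (alpha * (t_f - t_0))
    return t_f + (t_f - t_0) * (k_star - (1 + k_star) * np.exp(-t_star))

# case 1a of the example: tempering at 20 C, alpha = 100, shear rate 100 1/s
eta = viscosity(20.0)
print(f"inner wall: {steady_profile(h, 100.0, eta, 100.0, 20.0):.3f} C")   # ~20.055 C
print(f"after 300 s: {transient(300.0, 100.0, eta, 100.0, 20.0, 20.0):.2f} C")
```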
3 example

a rotational viscometer with inner cylinder diameter 48 mm and outer cylinder diameter 50 mm contains a newtonian liquid with density $\rho = 1000$ kg m⁻³, heat capacity $c = 4200$ j kg⁻¹ k⁻¹ and heat conductivity $\lambda = 0.5$ w m⁻¹ k⁻¹. calculate the inner and outer cylinder temperature
1) when the tempering temperature is 20 °c and the heat transfer coefficient is $\alpha = 100$ w m⁻² k⁻¹,
2) at no tempering, outer temperature 20 °c and $\alpha = 5$ w m⁻² k⁻¹.
the temperature dependence of viscosity is described by the relation $\eta\,[\mathrm{pa\,s}] = 0.82\exp(-0.025\, t\,[°\mathrm{c}])$. measurement is carried out at shear rate a) 100 s⁻¹, b) 1000 s⁻¹. calculate also the temperature after 5 minutes, when the initial temperature is α) 20 °c, β) 15 °c.

4 solution

4.1 calculations of final temperatures

first the biot number will be calculated:

1) $bi = \alpha h/\lambda = 100 \cdot 0.001/0.5 = 0.2$. inserting $n = 1$ and $k = \eta$ into eq. (6), or inserting $\tau_0^* = 0$ and $\eta_p = \eta$ into eq. (9), the inner cylinder temperature $t_i$ is calculated for $y = h$ and the outer cylinder temperature $t_e$ for $y = 0$:

a) $t_i = \left(\frac{1}{2} + \frac{1}{bi}\right)\frac{\eta \dot\gamma^2 h^2}{\lambda} + t_f = \left(\frac{1}{2} + 5\right)\frac{0.5\cdot 10^4\cdot 10^{-6}}{0.5} + 20 = 20.055$ °c,
$t_e = \frac{1}{bi}\cdot\frac{\eta \dot\gamma^2 h^2}{\lambda} + t_f = 5\cdot\frac{0.5\cdot 10^4\cdot 10^{-6}}{0.5} + 20 = 20.05$ °c.

b) $t_i = \left(\frac{1}{2} + 5\right)\frac{0.44\cdot 10^6\cdot 10^{-6}}{0.5} + 20 = 24.9$ °c,
$t_e = 5\cdot\frac{0.44\cdot 10^6\cdot 10^{-6}}{0.5} + 20 = 24.4$ °c.

the viscosities at the mean liquid temperature were inserted into the above equations.

2) $bi = 5\cdot 0.001/0.5 = 0.01$, and the two temperatures will be calculated from the same equations:

a) $t_i = \left(\frac{1}{2} + 100\right)\frac{0.485\cdot 10^4\cdot 10^{-6}}{0.5} + 20 = 21$ °c,
$t_e = 100\cdot\frac{0.485\cdot 10^4\cdot 10^{-6}}{0.5} + 20 = 21$ °c.

b) $t_i = \left(\frac{1}{2} + 100\right)\frac{0.19\cdot 10^6\cdot 10^{-6}}{0.5} + 20 = 58.3$ °c,
$t_e = 100\cdot\frac{0.19\cdot 10^6\cdot 10^{-6}}{0.5} + 20 = 58.2$ °c.

these results show that at a high shear rate the temperature rise is unacceptable, especially without tempering. it can also be seen that the difference between the inner and outer cylinder temperature is not high, due to the low biot number values, especially in case 2).

4.2 calculations of temperatures after 5 minutes of measurement

α) in the case when the initial temperature $t_0$ is equal to the temperature $t_f$, eqs. (15) and (16) will be used in the calculations:

1a) $t = \frac{\eta \dot\gamma^2 h}{\alpha}[1 - \exp(-t^*)] + t_f = \frac{0.497\cdot 10^4\cdot 0.001}{100}\left[1 - \exp\left(-\frac{100\cdot 300}{1000\cdot 4200\cdot 0.001}\right)\right] + 20 = 20.05$ °c,

1b) $t = \frac{0.47\cdot 10^6\cdot 0.001}{100}\,[1 - \exp(-t^*)] + 20 = 24.7$ °c,

2a) $t = \frac{0.5\cdot 10^4\cdot 0.001}{5}\left[1 - \exp\left(-\frac{5\cdot 300}{1000\cdot 4200\cdot 0.001}\right)\right] + 20 = 20.3$ °c,

2b) $t = \frac{0.375\cdot 10^6\cdot 0.001}{5}\,[1 - \exp(-t^*)] + 20 = 42.5$ °c,

where the viscosity was calculated at the mean temperature $(t + t_f)/2$.
β) in cases when the initial temperature $t_0$ is not equal to the temperature $t_f$, eq. (13) will be used in the calculations:

1a) $t = t_f + (t_f - t_0)[k^* - (1 + k^*)\exp(-t^*)] = 20 + 5\,[0.0106 - 1.0106\exp(-t^*)] = 20.05$ °c,

1b) $t = 20 + 5\,[0.995 - 1.995\exp(-t^*)] = 25$ °c,

2a) $t = 20 + 5\,[0.22 - 1.22\exp(-t^*)] = 16.8$ °c,

2b) $t = 20 + 5\,[16.3 - 17.3\exp(-t^*)] = 41$ °c,

where the viscosity was calculated at the mean temperature $(t_0 + t)/2$. the results presented above show that the temperature rise and the experimental error are considerable, especially at the high shear rate in cases 1b) and 2b).

5 conclusion

it was shown that dissipative heating can play an important role in the measurement of highly viscous fluids. the temperature of the measured liquid can be significantly higher than the tempering temperature, which can cause significant experimental error. the time necessary for temperature stabilisation is often not negligible. measurement without tempering can lead to a significant temperature rise and an unacceptable error of measurement.

list of symbols
$bi$ — biot number, 1
$c$ — specific heat capacity, j kg⁻¹ k⁻¹
$h$ — distance of planes (cylinder walls), m
$k$ — consistency coefficient, pa sⁿ
$n$ — flow index, 1
$r_1$ — inner cylinder radius, m
$r_2$ — outer cylinder radius, m
$t$ — time, s
$t$ — temperature, k
$y$ — coordinate, m
$\alpha$ — heat transfer coefficient, w m⁻² k⁻¹
$\dot\gamma$ — shear rate, s⁻¹
$\kappa$ — ratio $r_1/r_2$, 1
$\lambda$ — heat conductivity, w m⁻¹ k⁻¹
$\eta$ — viscosity, pa s
$\eta_p$ — plastic viscosity, pa s
$\rho$ — density, kg m⁻³
$\tau$ — shear stress, pa
$\tau_0$ — yield stress, pa
lower indices: f — fluid; 0 — initial
upper indices: *, + — dimensionless

acknowledgment
this work was supported by research project of the ministry of education of the czech republic msm6840770035.

reference
[1] bird, r. b., et al.: transport phenomena, j. wiley, new york, 1960.

prof. ing. františek rieger, drsc.
phone: +420 224 352 548
e-mail: frantisek.rieger@fs.cvut.cz
department of process engineering, czech technical university in prague, faculty of mechanical engineering, technická 4, 166 07 prague 6, czech republic

superconformal calogero models as a gauged matrix mechanics
s. fedoruk

abstract
we present the basics of the gauged superfield approach to constructing n-superconformal multi-particle calogero-type systems developed in arxiv:0812.4276, arxiv:0905.4951 and arxiv:0912.3508. this approach is illustrated by multi-particle systems possessing su(1,1|1) and d(2,1;α) supersymmetries, as well as by the model of new n = 4 superconformal quantum mechanics.

1 introduction

the celebrated calogero model [1] is a prime example of an integrable and exactly solvable multi-particle system. it describes a system of n identical particles interacting through an inverse-square pair potential $\sum_{a\neq b} g/(x_a - x_b)^2$, $a, b = 1, \ldots, n$. the calogero model and its generalizations provide deep connections between various branches of theoretical physics and have a wide range of physical and mathematical applications (for a review, see [2, 3]). an important property of the calogero model is d = 1 conformal symmetry so(1,2). being multi-particle conformal mechanics, this model, in the two-particle case, yields the standard conformal mechanics [4].
conformal properties of the calogero model and the supersymmetric generalizations of the latter give possibilities to apply them in black hole physics, since the near-horizon limits of extreme black hole solutions in m-theory correspond to ads₂ geometry, having the same so(1,2) isometry group. analysis of the physical fermionic degrees of freedom in the black hole solutions of four- and five-dimensional supergravities shows that the related d = 1 superconformal systems must possess n = 4 supersymmetry [5, 6, 7]. superconformal calogero models with n = 2 supersymmetry were considered in [8, 9] and with n = 4 supersymmetry in [10, 11, 12, 13, 14, 15]. unfortunately, a consistent lagrange formulation for the n-particle calogero model with n = 4 superconformal symmetry for any n is still lacking. recently, we developed a universal approach to superconformal calogero models for an arbitrary number of interacting particles, including n = 4 models. it is based on the superfield gauging of some non-abelian isometries of d = 1 field theories [16]. our gauge model involves three matrix superfields. one is a bosonic superfield in the adjoint representation of u(n). it carries the physical degrees of freedom of the supercalogero system. the second superfield is in the fundamental (spinor) representation of u(n) and is described by a chern-simons mechanical action [17, 18]. the third matrix superfield accommodates the gauge "topological" supermultiplet [16]. n-extended superconformal symmetry plays a very important role in our model. elimination of the pure gauge and auxiliary fields gives rise to calogero-like interactions for the physical fields. the talk is based on the papers [19, 20, 21]. (talk presented at the conference "selected topics in mathematical and particle physics", in honor of the 70th birthday of jiri niederle, 5–7 may 2009, prague, and at the xviii international colloquium "integrable systems and quantum symmetries", 18–20 june 2009, prague, czech republic.)

2 gauged formulation of the calogero model

the renowned calogero system [1] can be described by the following action [18, 22]:

$s_0 = \int dt \left[\mathrm{tr}(\nabla x \nabla x) + \frac{i}{2}(\bar z \nabla z - \nabla\bar z\, z) + c\,\mathrm{tr}\, a\right]$,   (2.1)

where $\nabla x = \dot x + i[a, x]$, $\nabla z = \dot z + i a z$, $\nabla\bar z = \dot{\bar z} - i\bar z a$. the action (2.1) is the action of a u(n), d = 1 gauge theory. the hermitian n×n matrix field $x^b_a(t)$, $(x^b_a)^\dagger = x^a_b$, $a, b = 1, \ldots, n$, and the complex commuting u(n)-spinor field $z_a(t)$, $\bar z^a = (z_a)^\dagger$, present the matter, scalar and spinor fields, respectively. the n² "gauge fields" $a^b_a(t)$, $(a^b_a)^\dagger = a^a_b$, are non-propagating ones in d = 1 gauge theory. the second term in the action (2.1) is the wess-zumino (wz) term. the third term is the standard fayet-iliopoulos (fi) term. the action (2.1) is invariant under the d = 1 conformal so(1,2) transformations:

$\delta t = \alpha$, $\delta x^b_a = \frac{1}{2}\dot\alpha\, x^b_a$, $\delta z_a = 0$, $\delta a^b_a = -\dot\alpha\, a^b_a$,   (2.2)

where the constrained parameter, $\partial_t^3 \alpha = 0$, contains three independent infinitesimal constant parameters of so(1,2). the action (2.1) is also invariant with respect to the local u(n) invariance

$x \to g x g^\dagger$, $z \to g z$, $a \to g a g^\dagger + i\dot g g^\dagger$,   (2.3)

where $g(\tau) \in u(n)$. let us demonstrate, in the hamiltonian formalism, that the gauge model (2.1) is equivalent to the standard calogero system. the definitions of the momenta, corresponding to the action (2.1),

$p_x = 2\nabla x$, $p_z = \frac{i}{2}\bar z$, $\bar p_z = -\frac{i}{2} z$, $p_a = 0$,   (2.4)

imply the primary constraints

a) $g \equiv p_z - \frac{i}{2}\bar z \approx 0$, $\bar g \equiv \bar p_z + \frac{i}{2} z \approx 0$;   b) $p_a \approx 0$   (2.5)

and give us the following expression for the canonical hamiltonian:

$h = \frac{1}{4}\mathrm{tr}(p_x p_x) - \mathrm{tr}(a\, t)$,   (2.6)

where the matrix quantity $t$ is defined as

$t \equiv i[x, p_x] - z\cdot\bar z + c\, i_n$   (2.7)

(with $i_n$ the unit matrix). the preservation of the constraints (2.5b) in time leads to the secondary constraints

$t \approx 0$.   (2.8)

the gauge fields $a$ play the role of the lagrange multipliers for these constraints. using the canonical poisson brackets $[x^b_a, (p_x)^d_c]_p = \delta^d_a\delta^b_c$, $[z_a, p_z^b]_p = \delta^b_a$, $[\bar z^a, \bar p_{z\,b}]_p = \delta^a_b$, we obtain the poisson brackets of the constraints (2.5a):

$[g_a, \bar g^b]_p = -i\delta^b_a$.   (2.9)

dirac brackets for these second class constraints (2.5a) eliminate the spinor momenta $p_z$, $\bar p_z$ from the phase space. the dirac brackets for the residual variables take the form

$[x^b_a, (p_x)^d_c]_d = \delta^d_a\delta^b_c$, $[z_a, \bar z^b]_d = -i\delta^b_a$.   (2.10)

the residual constraints (2.8), $t = t^\dagger$, form the u(n) algebra with respect to the dirac brackets

$[t^b_a, t^d_c]_d = i(\delta^d_a t^b_c - \delta^b_c t^d_a)$   (2.11)

and generate the gauge transformations (2.3). let us fix the gauges for these transformations. in the notations $x_a \equiv x^a_a$, $p_a \equiv (p_x)^a_a$ (no summation over a); $x^b_a$, $p^b_a \equiv (p_x)^b_a$ for $a \neq b$, the constraints (2.7) take the form

$t^b_a = i(x_a - x_b)\, p^b_a - i(p_a - p_b)\, x^b_a + i\sum_c (x^c_a p^b_c - p^c_a x^b_c) - z_a\bar z^b \approx 0$ for $a \neq b$,   (2.12)

$t^a_a = i\sum_c (x^c_a p^a_c - p^c_a x^a_c) - z_a\bar z^a + c \approx 0$ (no summation over a).   (2.13)

the non-diagonal constraints (2.12) generate the transformations $\delta x^b_a = [x^b_a, \epsilon^a_b t^a_b]_d \sim i(x_a - x_b)\,\epsilon^a_b$. therefore, in the case of the calogero-like condition $x_a \neq x_b$, we can impose the gauge

$x^b_a \approx 0$.   (2.14)

then we introduce dirac brackets for the constraints (2.12), (2.14) and eliminate $x^b_a$, $p^b_a$. in particular, the resolved expression for $p^b_a$ is

$p^b_a = -\frac{i}{x_a - x_b}\, z_a\bar z^b$.   (2.15)

the dirac brackets of the residual variables coincide with the poisson ones due to the resolved form of the gauge fixing condition (2.14). after the gauge fixing (2.14), the constraints (2.13) become

$z_a\bar z^a - c \approx 0$ (no summation over a)   (2.16)

and generate local phase transformations of $z_a$. for these gauge transformations we impose the gauge

$z_a - \bar z^a \approx 0$.   (2.17)

the conditions (2.16) and (2.17) eliminate $z_a$ and $\bar z^a$ completely. finally, using the expressions (2.15) and the conditions (2.14), (2.16), we obtain the following expression for the hamiltonian (2.6):

$h_0 = \frac{1}{4}\mathrm{tr}(p_x p_x) = \frac{1}{4}\left(\sum_a (p_a)^2 + \sum_{a\neq b}\frac{c^2}{(x_a - x_b)^2}\right)$,   (2.18)

which corresponds to the standard calogero action [1]

$s_0 = \int dt \left[\sum_a \dot x_a\dot x_a - \sum_{a\neq b}\frac{c^2}{4(x_a - x_b)^2}\right]$.   (2.19)
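as a quick numeric check of the reduction (2.15)–(2.18), the sketch below builds the gauge-fixed matrix momentum with off-diagonal entries $-ic/(x_a - x_b)$ and verifies that $\frac{1}{4}\mathrm{tr}(p^2)$ reproduces the calogero hamiltonian; the particle positions, momenta and the coupling are arbitrary test values, not data from the paper.

```python
import numpy as np

n, c = 4, 1.3                                # particle number and coupling (test values)
rng = np.random.default_rng(3)
x = np.sort(rng.uniform(-1.0, 1.0, n))       # distinct positions x_a
p = rng.normal(size=n)                       # diagonal momenta p_a

# gauge-fixed matrix momentum: diagonal p_a, off-diagonal -i*c/(x_a - x_b), eq. (2.15)
pm = np.diag(p).astype(complex)
for a in range(n):
    for b in range(n):
        if a != b:
            pm[a, b] = -1j * c / (x[a] - x[b])

h_matrix = 0.25 * np.trace(pm @ pm).real     # (1/4) tr(p_x p_x), lhs of eq. (2.18)
h_calogero = 0.25 * (np.sum(p**2) + sum(c**2 / (x[a] - x[b])**2
                                        for a in range(n) for b in range(n) if a != b))
print(np.isclose(h_matrix, h_calogero))      # -> True
```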
the definitions of the momenta, corresponding to the action (2.1), px =2∇x , pz = i 2 z̄ , p̄z = − i 2 z , pa =0 (2.4) imply the primary constraints a) g ≡ pz − i 2 z̄ ≈ 0 , ḡ ≡ p̄z + i 2 z ≈ 0; b) pa ≈ 0 (2.5) and give us the following expression for the canonical hamiltonian h = 1 4 tr(px px)−tr(a t) , (2.6) where matrix quantity t is defined as t ≡ i[x, px]− z · z̄ + cin . (2.7) the preservation of the constraints (2.5b) in time leads to the secondary constraints t ≈ 0 . (2.8) the gauge fields a play the role of the lagrangemultipliers for these constraints. using canonical poisson brackets [x ba, px d c]p = δdaδ b c, [za, p b z ] p = δba, [z̄ a, p̄z b]p = δ a b , we obtain the poisson brackets of the constraints (2.5a) [ga, ḡb]p = −iδ a b . (2.9) dirac brackets for these second class constraints (2.5a) eliminate spinormomenta pz, p̄z fromthephase space. the dirac brackets for the residual variables take the form [x ba, px d c]d = δ d aδ b c , [za, z̄ b] d = −i δba . (2.10) the residual constraints (2.8) t = t+ form the u(n) algebra with respect to the dirac brackets [t ba , t d c ]d = i(δ d at b c − δ b ct d a) (2.11) and generate gauge transformations (2.3). let us fix the gauges for these transformations. in the notations xa ≡ x aa , pa ≡ px a a (no summation over a); xba ≡ x b a , p b a ≡ px b a for a �= b the constraints (2.7) take the form t ba = i(xa − xb)p b a − i(pa − pb)x b a + (2.12) i ∑ c (xcap b c − p c ax b c)− zaz̄ b ≈ 0 for a �= b , t aa = i ∑ c (xcap a c − p c ax a c)− zaz̄ a + c ≈ 0 (2.13) (no summation over a) . the non-diagonal constraints (2.12) generate the transformations δxba = [x b a, � a b t a b ]d ∼ i(xa − xb)� a b . therefore, in case of the calogero-like condition xa �=xb, we can impose the gauge xba ≈ 0 . (2.14) then we introduce dirac brackets for the constraints (2.12), (2.14) and eliminate xba, p b a. in particular, the resolved expression for pba is pba = − i (xa − xb) zaz̄ b . (2.15) the dirac brackets of residual variables coincide with poisson ones due to the resolved form of the gauge fixing condition (2.14). after gauge-fixing (2.14), the constraints (2.13)become zaz̄ a − c ≈ 0 (no summation over a) (2.16) and generate local phase transformations of za. for these gauge transformations we impose the gauge za − z̄a ≈ 0 . (2.17) the conditions (2.16) and (2.17) eliminate za and z̄ a completely. finally, using the expressions (2.15) and the conditions (2.14), (2.16) we obtain the following expression for the hamiltonian (2.6) h0 = 1 4 tr(px px)= 1 4 ⎛ ⎝∑ a (pa) 2 + ∑ a�=b c2 (xa − xb)2 ⎞ ⎠ , (2.18) which corresponds to the standardcalogero action [1] s0 = ∫ dt [ ∑ a ẋaẋa − ∑ a�=b c2 4(xa − xb)2 ] . (2.19) 3 n =2 superconformal calogero model n = 2 supersymmetric generalization of the system (2.1) is described by • the even hermitian (n × n)-matrix superfield xba(t, θ, θ̄), (x) + = x , a, b =1, . . . , n [supermultiplets (1,2,1)]; 24 acta polytechnica vol. 50 no. 3/2010 • commuting chiral u(n)–spinor superfield za(tl, θ), z̄a(tr, θ̄) = (za)+, tl,r = t ± iθθ̄ [supermultiplets (2,2,0)]; • commuting n2 complex “bridge” superfields bca(t, θ, θ̄). the n = 2 superconformally invariant action of these superfields has the form s2 = ∫ dtd2θ [ tr ( d̄x dx ) + 1 2 z̄ e2vz − ctrv ] . (3.1) here the covariant derivatives of the superfield x are dx = dx+i[a, x ] , d̄x = d̄x+i[ā, x ] , (3.2) d = ∂θ + iθ̄∂t , d̄ = −∂θ̄ − iθ∂t , {d, d̄} = −2i∂t , where the potentials are constructed from the bridges as a = −i eib̄(de−ib̄) , ā = −i eib(d̄e−ib) (b̄ ≡ b+) . 
3 n = 2 superconformal calogero model

the n = 2 supersymmetric generalization of the system (2.1) is described by:

• the even hermitian (n × n)-matrix superfield \(X^b_a(t,\theta,\bar\theta)\), \((X)^\dagger = X\), a, b = 1, ..., n [supermultiplets (1,2,1)];
• the commuting chiral u(n)-spinor superfields \(Z_a(t_L,\theta)\), \(\bar Z^a(t_R,\bar\theta) = (Z_a)^\dagger\), \(t_{L,R} = t \pm i\theta\bar\theta\) [supermultiplets (2,2,0)];
• the commuting n² complex "bridge" superfields \(b^c_a(t,\theta,\bar\theta)\).

the n = 2 superconformally invariant action of these superfields has the form

\[ S_2 = \int dt\, d^2\theta \left[\, \mathrm{Tr}\big( \bar{\mathcal{D}}X\, \mathcal{D}X \big) + \tfrac{1}{2}\bar Z e^{2V} Z - c\,\mathrm{Tr}\,V \,\right]. \tag{3.1} \]

here the covariant derivatives of the superfield X are

\[ \mathcal{D}X = DX + i[\mathcal{A}, X], \quad \bar{\mathcal{D}}X = \bar DX + i[\bar{\mathcal{A}}, X], \tag{3.2} \]

\[ D = \partial_\theta + i\bar\theta\partial_t, \quad \bar D = -\partial_{\bar\theta} - i\theta\partial_t, \quad \{D, \bar D\} = -2i\partial_t, \]

where the potentials are constructed from the bridges as

\[ \mathcal{A} = -i\,e^{i\bar b}\big(D e^{-i\bar b}\big), \quad \bar{\mathcal{A}} = -i\,e^{ib}\big(\bar D e^{-ib}\big) \quad (\bar b \equiv b^\dagger). \tag{3.3} \]

the gauge superfield prepotential \(V^b_a(t,\theta,\bar\theta)\), \((V)^\dagger = V\), is constructed from the bridges as

\[ e^{2V} = e^{-i\bar b}\,e^{ib}. \tag{3.4} \]

the superconformal boosts of the n = 2 superconformal group su(1,1|1) ≃ osp(2|2) have the following realization:

\[ \delta t = -i(\eta\bar\theta + \bar\eta\theta)\,t, \quad \delta\theta = \eta\,(t + i\theta\bar\theta), \quad \delta\bar\theta = \bar\eta\,(t - i\theta\bar\theta), \tag{3.5} \]

\[ \delta X = -i(\eta\bar\theta + \bar\eta\theta)\,X, \quad \delta Z = 0, \quad \delta b = 0, \quad \delta V = 0. \tag{3.6} \]

its closure with the n = 2 supertranslations yields the full n = 2 superconformal invariance of the action (3.1). the action (3.1) is also invariant with respect to two types of local u(n) transformations:

• τ-transformations with the hermitian (n×n)-matrix parameter \(\tau(t,\theta,\bar\theta) \in u(n)\), \((\tau)^\dagger = \tau\);
• λ-transformations with complex chiral gauge parameters \(\lambda(t_L,\theta) \in u(n)\), \(\bar\lambda(t_R,\bar\theta) = (\lambda)^\dagger\).

these u(n) transformations act on the superfields in the action (3.1) as

\[ e^{ib'} = e^{i\tau}\, e^{ib}\, e^{-i\lambda}, \quad e^{2V'} = e^{i\bar\lambda}\, e^{2V}\, e^{-i\lambda}, \tag{3.7} \]

\[ X' = e^{i\tau}\, X\, e^{-i\tau}, \quad Z' = e^{i\lambda}Z, \quad \bar Z' = \bar Z\, e^{-i\bar\lambda}. \tag{3.8} \]

in terms of the τ-invariant superfields V, Z and the new hermitian (n×n)-matrix superfield

\[ \mathcal{X} = e^{-ib}\, X\, e^{i\bar b}, \quad \mathcal{X}' = e^{i\lambda}\,\mathcal{X}\, e^{-i\bar\lambda}, \tag{3.9} \]

the action (3.1) takes the form

\[ S_2 = \int dt\, d^2\theta \left[\, \mathrm{Tr}\big( \bar{\mathcal{D}}\mathcal{X}\, e^{2V}\, \mathcal{D}\mathcal{X}\, e^{2V} \big) + \tfrac{1}{2}\bar Z e^{2V} Z - c\,\mathrm{Tr}\,V \,\right], \tag{3.10} \]

where the covariant derivatives of the superfield \(\mathcal{X}\) are

\[ \mathcal{D}\mathcal{X} = D\mathcal{X} + e^{-2V}\big(D e^{2V}\big)\mathcal{X}, \quad \bar{\mathcal{D}}\mathcal{X} = \bar D\mathcal{X} - \mathcal{X}\, e^{2V}\big(\bar D e^{-2V}\big). \tag{3.11} \]

for the gauge λ-transformations we impose the wz gauge \(V(t,\theta,\bar\theta) = -\theta\bar\theta\, A(t)\). then the action (3.10) takes the form

\[ S_2 = S_0 + S^\Psi_2, \qquad S^\Psi_2 = -i\,\mathrm{Tr}\int dt\, \big(\bar\Psi\nabla\Psi - \nabla\bar\Psi\,\Psi\big), \tag{3.12} \]

where \(\Psi = \mathcal{D}\mathcal{X}|\) and \(\nabla\Psi = \dot\Psi + i[A,\Psi]\), \(\nabla\bar\Psi = \dot{\bar\Psi} + i[A,\bar\Psi]\). the bosonic core of (3.12) exactly coincides with the calogero action (2.19).

exactly as in the pure bosonic case, the residual local u(n) invariance of the action (3.12) eliminates the non-diagonal fields \(x^b_a\), \(a \neq b\), and all spinor fields \(z_a\). thus the physical fields in our n = 2 supersymmetric generalization of the calogero system are n bosons \(x_a = x^a_a\) and 2n² fermions \(\psi^b_a\). these fields present the on-shell content of n multiplets (1,2,1) and n² − n multiplets (0,2,2), which are obtained from n² multiplets (1,2,1) by the gauging procedure [16]. we can present this by the following scheme:

before gauging: \(\mathcal{X}^a_a = (x^a_a, \psi^a_a, c^a_a)\) and \(\mathcal{X}^b_a = (x^b_a, \psi^b_a, c^b_a)\), \(a \neq b\) — all (1,2,1) multiplets;
after gauging: interacting \(\mathcal{X}^a_a = (x^a_a, \psi^a_a, c^a_a)\) [(1,2,1) multiplets] and \(\Omega^b_a = (\psi^b_a, b^b_a, c^b_a)\), \(a \neq b\) [(0,2,2) multiplets],

where the bosonic fields \(c^a_a, c^b_a\) and \(b^b_a\) are auxiliary components of the supermultiplets. thus we obtain new n = 2 extensions of the n-particle calogero model with n bosons and 2n² fermions, as compared to the standard n = 2 super-calogero model with 2n fermions constructed by freedman and mende [8].
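a quick count (our illustration, not in the original text) confirms the quoted field content: each (1,2,1) multiplet carries one physical boson and two fermions, while each (0,2,2) multiplet carries two fermions and no physical boson, so

\[ \underbrace{2n}_{n\ \text{multiplets}\ (1,2,1)} \;+\; \underbrace{2(n^2-n)}_{(n^2-n)\ \text{multiplets}\ (0,2,2)} \;=\; 2n^2\ \text{fermions}, \qquad n\ \text{bosons}\ x_a . \]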
4 n = 4 superconformal calogero model

the most natural formulation of n = 4, d = 1 superfield theories is achieved in the harmonic superspace [23] parametrized by

\[ (t, \theta_i, \bar\theta^k, u^\pm_i) \sim (t, \theta^\pm, \bar\theta^\pm, u^\pm_i), \quad \theta^\pm = \theta^i u^\pm_i, \quad \bar\theta^\pm = \bar\theta^i u^\pm_i, \quad i, k = 1, 2. \]

the commuting su(2) doublets \(u^\pm_i\) are harmonic coordinates [24], subject to the constraint \(u^{+i}u^-_i = 1\). the n = 4 superconformally invariant harmonic analytic subspace is parametrized by

\[ (\zeta, u) = (t_A, \theta^+, \bar\theta^+, u^\pm_i), \qquad t_A = t - i(\theta^+\bar\theta^- + \theta^-\bar\theta^+). \]

the integration measures in these superspaces are \(\mu_H = du\, dt\, d^4\theta\) and \(\mu^{(-2)}_A = du\, d\zeta^{(-2)}\).

the n = 4 supergauge theory related to our task is described by:

• hermitian matrix superfields \(\mathcal{X}(t, \theta^\pm, \bar\theta^\pm, u^\pm_i) = (\mathcal{X}^b_a)\) subject to the constraints

\[ \mathcal{D}^{++}\mathcal{X} = 0, \quad \mathcal{D}^+\mathcal{D}^-\mathcal{X} = 0, \quad (\mathcal{D}^+\bar{\mathcal{D}}^- + \bar{\mathcal{D}}^+\mathcal{D}^-)\mathcal{X} = 0 \tag{4.1} \]

[multiplets (1,4,3)];

• analytic superfields \(\mathcal{Z}^+(\zeta, u) = (\mathcal{Z}^+_a)\) subject to the constraint

\[ \mathcal{D}^{++}\mathcal{Z}^+ = 0 \tag{4.2} \]

[multiplets (4,4,0)];

• the gauge matrix connection \(V^{++}(\zeta, u) = (V^{++}{}^b_a)\).

in (4.1) and (4.2) the covariant derivatives are defined by \(\mathcal{D}^{++}\mathcal{X} = D^{++}\mathcal{X} + i[V^{++}, \mathcal{X}]\), \(\mathcal{D}^{++}\mathcal{Z}^+ = D^{++}\mathcal{Z}^+ + iV^{++}\mathcal{Z}^+\). also \(\mathcal{D}^+ = D^+\), \(\bar{\mathcal{D}}^+ = \bar D^+\), and the connections in \(\mathcal{D}^-\), \(\bar{\mathcal{D}}^-\) are expressed through derivatives of \(V^{++}\). the n = 4 superconformal model is described by the action

\[ S^{\alpha\neq 0}_4 = -\frac{1}{4(1+\alpha)}\int \mu_H\, \mathrm{Tr}\big(\mathcal{X}^{-1/\alpha}\big) + \frac{1}{2}\int \mu^{(-2)}_A\, \mathcal{V}_0\, \tilde{\mathcal{Z}}^+\mathcal{Z}^+ + \frac{i}{2}\,c\int \mu^{(-2)}_A\, \mathrm{Tr}\,V^{++}. \tag{4.3} \]

the tilde in \(\tilde{\mathcal{Z}}^+\) denotes 'hermitian' conjugation preserving analyticity [24, 23]. the unconstrained superfield \(\mathcal{V}_0(\zeta, u)\) is a real analytic superfield, defined by the integral transform (\(\mathcal{X}_0 \equiv \mathrm{Tr}(\mathcal{X})\))

\[ \mathcal{X}_0(t, \theta_i, \bar\theta^i) = \int du\, \mathcal{V}_0\big(t_A, \theta^+, \bar\theta^+, u^\pm\big)\Big|_{\theta^\pm = \theta^i u^\pm_i,\; \bar\theta^\pm = \bar\theta^i u^\pm_i}. \]

the real number \(\alpha \neq 0\) in (4.3) coincides with the parameter of the n = 4 superconformal group d(2,1;α), which is the symmetry group of the action (4.3). the field transformations under the superconformal boosts are (see the coordinate transformations in [23, 16])

\[ \delta\mathcal{X} = -\Lambda_0\,\mathcal{X}, \quad \delta\mathcal{Z}^+ = \Lambda\,\mathcal{Z}^+, \quad \delta V^{++} = 0, \tag{4.4} \]

where \(\Lambda = 2i\alpha(\bar\eta^-\theta^+ - \eta^-\bar\theta^+)\), \(\Lambda_0 = 2\Lambda - D^{--}D^{++}\Lambda\). it is important that precisely the superfield multiplier \(\mathcal{V}_0\) in the action provides this invariance, due to \(\delta\mathcal{V}_0 = -2\Lambda\mathcal{V}_0\) (note that \(\delta\mu^{(-2)}_A = 0\)).

the action (4.3) is invariant under the local u(n) transformations

\[ \mathcal{X}' = e^{i\lambda}\mathcal{X}e^{-i\lambda}, \quad \mathcal{Z}^{+\prime} = e^{i\lambda}\mathcal{Z}^+, \quad V^{++\prime} = e^{i\lambda}\, V^{++}\, e^{-i\lambda} - i\,e^{i\lambda}\big(D^{++}e^{-i\lambda}\big), \tag{4.5} \]

where \(\lambda^b_a(\zeta, u^\pm) \in u(n)\) is the 'hermitian' analytic matrix parameter, \(\tilde\lambda = \lambda\). using the gauge freedom (4.5) we choose the wz gauge

\[ V^{++} = -2i\,\theta^+\bar\theta^+ A(t_A). \tag{4.6} \]

considering the case \(\alpha = -\tfrac{1}{2}\) (when d(2,1;α) ≃ osp(4|2)) in the wz gauge and eliminating the auxiliary and gauge fields, we find that the action (4.3) has the following bosonic limit:

\[ S^{\alpha=-1/2}_{4,b} = \int dt \left\{ \sum_a \dot x_a\dot x_a + \frac{i}{2}\sum_a \big(\bar z_{ak}\dot z^k_a - \dot{\bar z}_{ak} z^k_a\big) + \sum_{a\neq b} \frac{\mathrm{Tr}(S_a S_b)}{4(x_a - x_b)^2} - \frac{n\,\mathrm{Tr}(\hat S\hat S)}{2(x_0)^2} \right\}, \tag{4.7} \]

where \((S_a)_i{}^j \equiv \bar z_{ai}\, z^j_a\), \((\hat S)_i{}^j \equiv \sum_a \big[ (S_a)_i{}^j - \tfrac{1}{2}\delta^j_i\, (S_a)_k{}^k \big]\). the fields \(x_a\) are the "diagonal" fields in \(X = \mathcal{X}|\). the fields \(z^i\) define the first components of \(\mathcal{Z}^+\), \(\mathcal{Z}^+| = z^i u^+_i\). they are subject to the constraints

\[ \bar z_{ai} z^i_a = c \quad \forall\, a. \tag{4.8} \]

these constraints are generated by the equations of motion with respect to the diagonal components of the gauge field A. using the dirac brackets \([\bar z_{ai}, z^j_b]_D = i\,\delta_{ab}\,\delta^j_i\), which are generated by the kinetic wz term for z, we find that the quantities \(S_a\) for each a form u(2) algebras:

\[ [(S_a)_i{}^j, (S_b)_k{}^l]_D = i\,\delta_{ab}\left\{ \delta^l_i (S_a)_k{}^j - \delta^j_k (S_a)_i{}^l \right\}. \]

thus, modulo the center-of-mass conformal potential (the last term in (4.7)), the bosonic limit (4.7) is none other than the integrable u(2)-spin calogero model in the formulation of [25, 3].

except for the case \(\alpha = -\tfrac{1}{2}\), the action (4.3) yields a non-trivial sigma-model-type kinetic term for the field \(X = \mathcal{X}|\). for \(\alpha = 0\) it is necessary to modify the transformation law of \(\mathcal{X}\) in the following way [16]:

\[ \delta_{\mathrm{mod}}\mathcal{X} = 2i\big(\theta^k\bar\eta_k + \bar\theta_k\eta^k\big). \tag{4.9} \]

then the d(2,1;α = 0) superconformal action reads

\[ S^{\alpha=0}_4 = -\frac{1}{4}\int \mu_H\, \mathrm{Tr}\big(e^{\mathcal{X}}\big) + \frac{1}{2}\int \mu^{(-2)}_A\, \tilde{\mathcal{Z}}^+\mathcal{Z}^+ + \frac{i}{2}\,c\int \mu^{(-2)}_A\, \mathrm{Tr}\,V^{++}. \tag{4.10} \]

the d(2,1;α = 0) superconformal invariance is not compatible with the presence of \(\mathcal{V}_0\) in the wz term of the action (4.10), while it still implies the transformation laws (4.4) for \(\mathcal{Z}^+\) and \(V^{++}\). this situation is quite analogous to what happens in the n = 2 super-calogero model considered in section 3, where the center-of-mass supermultiplet \(\mathrm{Tr}(\mathcal{X})\) decouples from the wz and gauge supermultiplets. note that the (matrix) \(\mathcal{X}\) supermultiplet interacts with the (column) \(\mathcal{Z}\) supermultiplet in (3.1) and (4.10) via the gauge supermultiplet.
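as a purely numerical illustration of the bosonic limit (4.7) (our sketch, not from the original text; the particle positions and spin variables below are made up), the u(2)-spin interaction potential can be evaluated directly from the spin matrices \((S_a)_i{}^j = \bar z_{ai} z^j_a\), with the spinors rescaled to satisfy the constraint (4.8):

```python
import numpy as np

def spin_matrices(z):
    """(s_a)_i^j = zbar_{ai} z_a^j for each particle a; z has shape (n, 2)."""
    return np.array([np.outer(np.conj(za), za) for za in z])

def spin_calogero_potential(x, z):
    """interaction potential of the bosonic limit (4.7):
    sum_{a != b} tr(s_a s_b) / (4 (x_a - x_b)^2)."""
    s = spin_matrices(z)
    n = len(x)
    v = 0.0
    for a in range(n):
        for b in range(n):
            if a != b:
                v += np.trace(s[a] @ s[b]).real / (4.0 * (x[a] - x[b]) ** 2)
    return v

# toy configuration (illustrative numbers only)
rng = np.random.default_rng(0)
n, c = 3, 2.0
x = np.array([-1.0, 0.3, 1.7])
z = rng.normal(size=(n, 2)) + 1j * rng.normal(size=(n, 2))
z *= np.sqrt(c / np.einsum('ai,ai->a', np.conj(z), z).real)[:, None]  # enforce (4.8)
print(spin_calogero_potential(x, z))
```

note that \(\mathrm{Tr}(S_a S_b) = |\bar z_a\cdot z_b|^2 \ge 0\), so the pairwise potential is repulsive, as in the scalar calogero case.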
5 d(2,1;α) quantum mechanics

the n = 1 case of the n = 4 calogero-like model (4.3) above (the center-of-mass coordinate case) amounts to a non-trivial model of n = 4 superconformal mechanics. choosing the wz gauge (4.6) and eliminating the auxiliary fields by their algebraic equations of motion, we obtain the on-shell action

\[ S = S_b + S_f, \tag{5.1} \]

\[ S_b = \int dt \left[ \dot x\dot x + \frac{i}{2}\big( \bar z_k\dot z^k - \dot{\bar z}_k z^k \big) - \frac{\alpha^2 (\bar z_k z^k)^2}{4x^2} - A\big( \bar z_k z^k - c \big) \right], \tag{5.2} \]

\[ S_f = -i\int dt \left( \bar\psi_k\dot\psi^k - \dot{\bar\psi}_k\psi^k \right) + 2\alpha\int dt\, \frac{\psi^i\bar\psi^k z_{(i}\bar z_{k)}}{x^2} + \frac{2}{3}(1+2\alpha)\int dt\, \frac{\psi^i\bar\psi^k \psi_{(i}\bar\psi_{k)}}{x^2}. \tag{5.3} \]

the action (5.1) possesses d(2,1;α) superconformal invariance. using the nöther procedure, we find the d(2,1;α) generators. their quantum counterparts are

\[ Q^i = p\,\psi^i + 2i\alpha\,\frac{z^{(i}\bar z^{k)}\psi_k}{x} + i(1+2\alpha)\,\frac{\langle\psi^k\psi_k\bar\psi^i\rangle}{x}, \tag{5.4} \]

\[ \bar Q_i = p\,\bar\psi_i - 2i\alpha\,\frac{z_{(i}\bar z_{k)}\bar\psi^k}{x} + i(1+2\alpha)\,\frac{\langle\bar\psi_k\bar\psi^k\psi_i\rangle}{x}, \tag{5.5} \]

\[ S^i = -2x\,\psi^i + t\,Q^i, \qquad \bar S_i = -2x\,\bar\psi_i + t\,\bar Q_i, \tag{5.6} \]

\[ H = \frac{1}{4}p^2 + \frac{\alpha^2(\bar z_k z^k)^2 + 2\bar z_k z^k}{4x^2} - 2\alpha\,\frac{z^{(i}\bar z^{k)}\psi_{(i}\bar\psi_{k)}}{x^2} - (1+2\alpha)\,\frac{\langle\psi^i\psi_i\,\bar\psi_k\bar\psi^k\rangle}{2x^2} + \frac{(1+2\alpha)^2}{16x^2}, \tag{5.7} \]

\[ K = x^2 - t\,\tfrac{1}{2}\{x, p\} + t^2 H, \qquad D = -\tfrac{1}{4}\{x, p\} + tH, \tag{5.8} \]

\[ J^{ik} = i\big[ z^{(i}\bar z^{k)} + 2\psi^{(i}\bar\psi^{k)} \big], \quad I^{1'1'} = -i\psi^k\psi_k, \quad I^{2'2'} = i\bar\psi_k\bar\psi^k, \quad I^{1'2'} = -\tfrac{i}{2}[\psi_k, \bar\psi^k]. \tag{5.9} \]

the symbol ⟨...⟩ denotes weyl ordering. it can be checked directly that the generators (5.4)–(5.9) form the d(2,1;α) superalgebra

\[ \{Q^{ai'i}, Q^{bk'k}\} = -2\left( \epsilon^{ik}\epsilon^{i'k'}T^{ab} + \alpha\,\epsilon^{ab}\epsilon^{i'k'}J^{ik} - (1+\alpha)\,\epsilon^{ab}\epsilon^{ik}I^{i'k'} \right), \tag{5.10} \]

\[ [T^{ab}, T^{cd}] = -i\left( \epsilon^{ac}T^{bd} + \epsilon^{bd}T^{ac} \right), \tag{5.11} \]

\[ [J^{ij}, J^{kl}] = -i\left( \epsilon^{ik}J^{jl} + \epsilon^{jl}J^{ik} \right), \qquad [I^{i'j'}, I^{k'l'}] = -i\left( \epsilon^{i'k'}I^{j'l'} + \epsilon^{j'l'}I^{i'k'} \right), \tag{5.12} \]

\[ [T^{ab}, Q^{ci'i}] = i\,\epsilon^{c(a}Q^{b)i'i}, \quad [J^{ij}, Q^{ai'k}] = i\,\epsilon^{k(i}Q^{ai'j)}, \quad [I^{i'j'}, Q^{ak'i}] = i\,\epsilon^{k'(i'}Q^{aj')i} \tag{5.13} \]

due to the quantum brackets

\[ [x, p] = i, \qquad [z^i, \bar z_j] = \delta^i_j, \qquad \{\psi^i, \bar\psi_j\} = -\tfrac{1}{2}\delta^i_j. \tag{5.14} \]

in (5.11)–(5.14) we use the notation \(Q^{21'i} = -Q^i\), \(Q^{22'i} = -\bar Q^i\), \(Q^{11'i} = S^i\), \(Q^{12'i} = \bar S^i\), \(T^{22} = H\), \(T^{11} = K\), \(T^{12} = -D\).

to find the quantum spectrum, we make use of the realization

\[ \bar z_i = v^+_i, \qquad z^i = \partial/\partial v^+_i \tag{5.15} \]

for the bosonic operators, where \(v^+_i\) is a commuting complex su(2) spinor, as well as the following realization of the odd operators:

\[ \psi^i = \psi^i, \qquad \bar\psi_i = -\tfrac{1}{2}\,\partial/\partial\psi^i, \tag{5.16} \]

where \(\psi^i\) are complex grassmann variables. the full wave function \(\Phi = A_1 + \psi^i B_i + \psi^i\psi_i A_2\) is subject to the constraints

\[ \bar z_i z^i\,\Phi = v^+_i \frac{\partial}{\partial v^+_i}\,\Phi = c\,\Phi. \tag{5.17} \]

requiring the wave function \(\Phi(v^+)\) to be single-valued gives rise to the condition that the positive constant c is an integer, \(c \in \mathbb{Z}\). then (5.17) implies that the wave function \(\Phi(v^+)\) is a homogeneous polynomial in \(v^+_i\) of degree c:

\[ \Phi = A^{(c)}_1 + \psi^i B^{(c)}_i + \psi^i\psi_i A^{(c)}_2, \tag{5.18} \]

\[ A^{(c)}_{i'} = a_{i',k_1\ldots k_c}\, v^{+k_1}\cdots v^{+k_c}, \tag{5.19} \]

\[ B^{(c)}_i = B'^{(c)}_i + B''^{(c)}_i = v^+_i\, b'_{k_1\ldots k_{c-1}}\, v^{+k_1}\cdots v^{+k_{c-1}} + b''_{(ik_1\ldots k_c)}\, v^{+k_1}\cdots v^{+k_c}. \tag{5.20} \]

on the physical states (5.17), (5.18) the casimir operator takes the value

\[ C_2 = T^2 + \alpha J^2 - (1+\alpha)I^2 + \frac{i}{4}Q^{ai'i}Q_{ai'i} = \alpha(1+\alpha)(c+1)^2/4. \tag{5.21} \]

on the same states, the casimir operators of the bosonic subgroups su(1,1), su(2)_R and su(2)_L,

\[ T^2 = r_0(r_0 - 1), \qquad J^2 = j(j+1), \qquad I^2 = i(i+1), \]

take the values listed in table 1.

table 1:

state | r_0 | j | i
\(A^{(c)}_{k'}(x, v^+)\) | \(\frac{|\alpha|(c+1)+1}{2}\) | \(\frac{c}{2}\) | \(\frac{1}{2}\)
\(B'^{(c)}_k(x, v^+)\) | \(\frac{|\alpha|(c+1)+1}{2} - \frac{1}{2}\,\mathrm{sign}(\alpha)\) | \(\frac{c}{2} - \frac{1}{2}\) | 0
\(B''^{(c)}_k(x, v^+)\) | \(\frac{|\alpha|(c+1)+1}{2} + \frac{1}{2}\,\mathrm{sign}(\alpha)\) | \(\frac{c}{2} + \frac{1}{2}\) | 0
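a small consistency check of table 1 (our remark, not in the original text): the space of homogeneous polynomials of degree c in the two components \(v^{+1}, v^{+2}\) has dimension c + 1 and carries the spin-c/2 representation of su(2). hence

\[ \dim\{A^{(c)}_{i'}\} = c+1 \;\Rightarrow\; j = \tfrac{c}{2}, \qquad \dim\{b'_{k_1\ldots k_{c-1}}\} = c,\quad \dim\{b''_{(ik_1\ldots k_c)}\} = c+2 \;\Rightarrow\; j = \tfrac{c}{2}\mp\tfrac{1}{2}, \]

matching the su(2)_R quantum numbers quoted in the table (and \(c + (c+2) = 2(c+1)\) reproduces the two su(2)_L components of \(B_i\)).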
the fields \(b'_i\) and \(b''_i\) form doublets of su(2)_R, generated by \(J^{ik}\), whereas the component fields \(A_{i'} = (A_1, A_2)\) form a doublet of su(2)_L, generated by \(I^{i'k'}\). each of \(A_{i'}\), \(b'_i\), \(b''_i\) carries a representation of the su(1,1) group. the basis functions of these representations are eigenvectors of the generator

\[ R = \tfrac{1}{2}\left( a^{-1}K + aH \right), \]

where a is a constant of length dimension. the eigenvalues are \(r = r_0 + n\), \(n \in \mathbb{N}\).

6 outlook

in [19, 20, 21] we proposed a new gauge approach to the construction of superconformal calogero-type systems. the characteristic features of this approach are the presence of auxiliary supermultiplets with wz-type actions, the built-in superconformal invariance, and the emergence of the calogero coupling constant as the strength of the fi term of the u(1) gauge (super)field. we see the continuation of the research presented here in the solution of the following problems:

• an analysis of possible integrability properties of the new super-calogero models, clarifying the role of the center-of-mass contribution in the case of d(2,1;α), α ≠ 0, invariant systems;
• construction of quantum n = 4 superconformal calogero systems by canonical quantization of the systems (4.3) and (4.10);
• obtaining systems constructed from mirror supermultiplets and possessing d(2,1;α) symmetry, by using the gauging procedure in bi-harmonic superspace [26];
• obtaining other superextensions of the calogero model, distinct from the a_{n−1} type (related to the root system of the su(n) group), by applying the gauging procedure to other gauge groups.

acknowledgement

i thank the organizers of jiri niederle's fest and of the xviii international colloquium for their kind hospitality in prague. i would also like to thank my co-authors e. ivanov and o. lechtenfeld for fruitful collaboration. i acknowledge support from the rfbr grants 08-02-90490, 09-02-01209 and 09-01-93107 and from grants of the heisenberg-landau and votruba-blokhintsev programs.

references

[1] calogero, f.: j. math. phys. 10 (1969) 2191; 10 (1969) 2197.
[2] olshanetsky, m. a., perelomov, a. m.: phys. rept. 71 (1981) 313; 94 (1983) 313.
[3] polychronakos, a. p.: j. phys. a: math. gen. 39 (2006) 12793.
[4] de alfaro, v., fubini, s., furlan, g.: nuovo cim. a34 (1976) 569.
[5] claus, p., derix, m., kallosh, r., kumar, j., townsend, p. k., van proeyen, a.: phys. rev. lett. 81 (1998) 4553.
[6] gibbons, g. w., townsend, p. k.: phys. lett. b454 (1999) 187.
[7] michelson, j., strominger, a.: commun. math. phys. 213 (2000) 1; jhep 9909 (1999) 005; maloney, a., spradlin, m., strominger, a.: jhep 0204 (2002) 003.
[8] freedman, d. z., mende, p. f.: nucl. phys. b344 (1990) 317.
[9] brink, l., hansson, t. h., vasiliev, m. a.: phys. lett. b286 (1992) 109; brink, l., hansson, t. h., konstein, s., vasiliev, m. a.: nucl. phys. b401 (1993) 591.
[10] wyllard, n.: j. math. phys. 41 (2000) 2826.
[11] bellucci, s., galajinsky, a., krivonos, s.: phys. rev. d68 (2003) 064010.
[12] bellucci, s., galajinsky, a. v., latini, e.: phys. rev. d71 (2005) 044023.
[13] galajinsky, a., lechtenfeld, o., polovnikov, k.: phys. lett. b643 (2006) 221; jhep 0711 (2007) 008; jhep 0903 (2009) 113.
[14] bellucci, s., krivonos, s., sutulin, a.: nucl. phys. b805 (2008) 24.
[15] krivonos, s., lechtenfeld, o., polovnikov, k.: nucl. phys. b817 (2009) 265.
[16] delduc, f., ivanov, e.: nucl. phys. b753 (2006) 211; b770 (2007) 179.
[17] faddeev, l., jackiw, r.: phys. rev. lett. 60 (1988) 1692; dunne, g. v., jackiw, r., trugenberger, c. a.: phys. rev. d41 (1990) 661; floreanini, r., percacci, r., sezgin, e.: nucl. phys. b322 (1989) 255; howe, p. s., townsend, p. k.: class. quant. grav. 7 (1990) 1655.
[18] polychronakos, a. p.: phys. lett. b266 (1991) 29.
[19] fedoruk, s., ivanov, e., lechtenfeld, o.: phys. rev. d79 (2009) 105015.
[20] fedoruk, s., ivanov, e., lechtenfeld, o.: jhep 0908 (2009) 081.
[21] fedoruk, s., ivanov, e., lechtenfeld, o.: jhep 1004 (2010) 129.
[22] gorsky, a., nekrasov, n.: nucl. phys. b414 (1994) 213.
[23] ivanov, e., lechtenfeld, o.: jhep 0309 (2003) 073.
[24] galperin, a. s., ivanov, e. a., ogievetsky, v. i., sokatchev, e. s.: harmonic superspace, cambridge univ. press, 2001.
[25] polychronakos, a. p.: jhep 0104 (2001) 011; morariu, b., polychronakos, a. p.: jhep 0107 (2001) 006; phys. rev. d72 (2005) 125002.
[26] ivanov, e., niederle, j.: phys. rev. d80 (2009) 065027.

sergey fedoruk
e-mail: fedoruk@theor.jinr.ru
bogoliubov laboratory of theoretical physics, jinr, 141980 dubna, moscow region, russia

acta polytechnica vol. 52 no. 1/2012

s5 1803+78 revisited

r. nesci, a. maselli, f. montagni, s. sclavi

abstract: we report on our optical monitoring of the bl lac object s5 1803+78 from 1996 to 2011. the source showed no clear periodicity, but a time scale of about 1300 days between major flares is possibly present. no systematic trend of the color index with flux variations is evident, at variance with other bl lacs. in one flare, however, the source was bluer in the rising phase and redder in the falling one. two γ-ray flares were detected by fermi-gst during our monitoring: on the occasion of only one of them did we find a simultaneous optical brightening. a one-zone synchrotron self-compton (ssc) model appears too simple to explain the source behavior.

keywords: active galactic nuclei, blazars, bl lac objects.

1 introduction

s5 1803+78 is a bl lac object, a special class of active galactic nuclei (agn). bl lacs take their name from the prototype source, discovered at sonneberg in 1929 and then cataloged as the variable star bl lacertae. they are characterized by: a) a featureless optical continuum; b) large and fast variability; c) appreciable polarization; d) a flat radio spectrum. many bl lacs are x-ray sources, and they are a substantial part of the extragalactic sources detected in γ rays.

the current model to explain the emission of bl lacs is based on a supermassive black hole, hosted in the galaxy center, which accretes matter from its surroundings. part of the matter in the accretion disk is funnelled into a narrow relativistic jet, oriented at a small angle to the observer's line of sight: the existence of such jets has been proven by vlbi radio observations. most of the observed emission is believed to be produced in this jet by two different processes: 1) synchrotron radiation of relativistic electrons in a strong magnetic field; 2) inverse compton radiation from these electrons on the ambient photons. depending on the energy of the electrons and the intensity of the magnetic field, the peak of the synchrotron emission, in the log(ν f_ν) vs log(ν) plane, may range between the far ir and x rays. accordingly, bl lacs are classified as low frequency (lbl), intermediate frequency (ibl) and high frequency (hbl) peaked sources. the peak of the inverse compton emission is between the hard x-ray and the γ-ray bands. during a flare the spectral energy distribution (sed) shape is deformed and generally shifted towards higher frequencies.
the light curves of several bl lacs have been studied since their discovery, also using historical plate archives for long-term variability studies. their behavior has proven to be generally quite erratic, save the case of oj 287 [1], which has a periodicity of about 11 years for the major flares. in a number of cases time scales of a few years have been found, though not strict periodicities (e.g. s5 0716+71 [2]; ao 0235+174 [3]; gb6 j1058+5628 [4]). long-term trends have also been found in a few sources, superposed on the short-term ones, extending over tens of years and possibly being just part of longer recurrence time scales (e.g. oq 530 [5]; on 231 [6]; s5 0716+71 [2]). the long-term variability may be due to intrinsic changes in the source or to geometrical effects: in the latter case the best explanation could be a precession of the relativistic jet, producing a variation of the doppler boosting and therefore an achromatic variation of the apparent source luminosity.

2 s5 1803+78 in brief

s5 1803+78 is a bright radio source discovered in 1981. due to its circumpolar position it can be followed for a large part of the year. it has a large optical polarization and a redshift z = 0.680, based on a single emission line assumed to be mg ii (2900 å). it is well monitored in the radio band because it is a source of the icrs reference frame and is also used as a geodetic reference source. it was observed by several x-ray satellites, but was not pointed at by egret. its first systematic optical monitoring was made by [7] in the years 1996–2001. the overall sed is of the lbl type.

most of the well-monitored lbl sources (e.g. bl lacertae itself) show a correlation between optical luminosity and optical spectral slope, being bluer when brighter. on the contrary, s5 1803+78 showed very limited variation in the optical spectral slope, although it varied by more than 2 mag in flux. we decided therefore to continue its monitoring to check whether: a) the color variation remained small in time; b) the source showed any recurrent time scale in its flux variations; c) the source showed a monotonic long-term trend. the monitoring was performed with the telescopes of asiago (183 cm), loiano (152 cm), vallinfreda (50 cm) and greve in chianti (30 cm), mainly in the r band, but with several observations also in the b, v and i bands. the resulting light curve is shown in figure 1.

fig. 1: light curve in the r band of s5 1803+78 from 1996 to 2010. abscissa: lower scale in julian days, upper scale in years. the squares after jd 4000 are swift-uvot data extrapolated to the r band. vertical dashes indicate the dates of gamma-ray flares.

3 optical light curve

the source underwent a number of flares during our monitoring: we derived by-eye estimates of the starting and ending points of the rising and falling branches of each major well-sampled flare. the luminosity variation rates range between 0.03 and 0.07 mag/day. the average values of the rising and falling rates are not statistically different. an inspection of the light curve (by eye and by fft analysis) shows no strict periodicity for the flares. a possible time scale of 1300 days between major flares may be present. the (v − i) color index showed small variations around the average value of v − i = 1.1 mag during all our monitoring, a typical value for lbl sources, corresponding to a spectral index of −1.6.
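for unevenly sampled monitoring data like these, a lomb-scargle periodogram is a standard complement to an fft search. the sketch below is ours (the light curve is made up: a 1300-day sinusoid plus noise stands in for the actual photometry) and only illustrates how such a timescale search could be set up with astropy:

```python
import numpy as np
from astropy.timeseries import LombScargle

# made-up stand-in for the r-band light curve: irregular sampling over ~14 yr
rng = np.random.default_rng(1)
t = np.sort(rng.uniform(0.0, 5000.0, 400))            # days
mag = 15.5 + 0.5 * np.sin(2 * np.pi * t / 1300.0) + rng.normal(0, 0.1, t.size)

# search periods from ~100 d up to a few thousand days
freq = np.linspace(1.0 / 4000.0, 1.0 / 100.0, 5000)   # cycles per day
power = LombScargle(t, mag).power(freq)
best_period = 1.0 / freq[np.argmax(power)]
print(f"highest peak at ~{best_period:.0f} days")
```

for a real analysis one would also estimate false-alarm probabilities, since sparse sampling easily produces spurious peaks at long periods.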
the most accurate measurements, carried out with the larger asiago and loiano telescopes during the strong flare at jd 3550, showed a bluer color during the rising phase and a redder one in the falling part. similar behavior was reported in the literature for a few other bl lacs (s5 0716+71, 3c 66a [8]). the uv spectral slope observed by swift-uvot was consistent with the optical one from our bvri data, indicating a common origin (the tail of the synchrotron emission) for the optical and uv light.

4 high energy observations

the x-ray spectrum was observed by swift on four occasions between 2007 and 2009: it always showed a flux level comparable to that observed by bepposax in 1998, with small variations. also the x-ray spectral slope (photon index) was stable, around +1.6, suggesting an inverse compton origin (see figure 2). during these x-ray observations the r magnitude was not very different, ranging between 15.4 and 16.3 mag: no clear correlation is evident between the x-ray and the optical band.

fig. 2: x-ray spectrum (ν f(ν) vs ν) of s5 1803+78: different symbols are used for the four epochs; the labels are the last digits of the observation ids in the swift archive.

s5 1803+78 was first detected in γ rays by fermi-gst (1fgl [9]). its weekly sampled γ-ray light curve shows an average level around 2 · 10⁻⁸ ph cm⁻² s⁻¹ with oscillations within a factor of 2 [10]. our optical light curve has only a few points around the end of this 11-month time interval; the source was rather faint (r ∼ 16.0) and shows a short increase of 0.3 mag two weeks before a short γ-ray bump (mjd 54990). a strong flare, at 9 · 10⁻⁷ ph cm⁻² s⁻¹ (e ≥ 100 mev) on 2010 january 11, was reported by [11], 40 times brighter than the average value. we could observe the source in the optical 4 days after the γ-ray burst at r = 14.9 mag, with color index (v − i) = 1.2 mag. it was already in the decreasing phase at a rate of 0.04 mag/day, a typical value for this source. extrapolating this trend backwards to the epoch of the γ-ray flare gives r = 14.7 mag for the peak value, rather lower than the maximum recorded flares of this source (r = 14.0–14.2 mag). a second, brighter flare (1 · 10⁻⁶ ph cm⁻² s⁻¹) was detected by fermi-gst [12] on 2011 may 2: we observed the source in the following days, finding it at r = 15.6 mag, at variance with the previous flare [13]. the source rose to about r = 15.3 in the following 30 days. the dates of the flares are indicated in figure 1 with vertical dashes. on the basis of these two contradictory gamma-ray episodes it is rather premature to draw any strong conclusion on a correlation between optical and gamma-ray flux density variations.
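the backward extrapolation quoted above is simple linear photometry arithmetic; a minimal sketch (ours) reproduces the quoted peak estimate:

```python
# linear back-extrapolation of the decay branch to the gamma-ray flare epoch
r_observed = 14.9      # mag, measured 4 days after the 2010 jan 11 flare
decay_rate = 0.04      # mag/day, typical fading rate of the source
delta_t = 4.0          # days between the flare and our first observation

r_peak = r_observed - decay_rate * delta_t
print(f"estimated r at flare epoch: {r_peak:.1f} mag")   # -> 14.7 mag
```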
5 spectral energy distribution

we used literature data, as well as our own observations and data analysis, to build the sed of the source, which is shown in figure 3. the gamma-ray flux densities (violet dots) are from fermi-gst: they have been retrieved from the asi science data center (asdc) website and are the average values during the aug 2008 – jul 2009 period. the hard x-ray flux density at 20 kev (violet square) was calculated by converting the count rate in the 15–30 kev map obtained from the reduction of 54 months of bat data using the batimager software [14]. the conversion factor was derived from the crab count rate, and the crab spectrum was used for calibration purposes, as explained in the bat calibration status report (http://swift.gsfc.nasa.gov/docs/swift/analysis/bat_digest.html). the x-ray data in the 0.5–8 kev range are those already reported in figure 2, with the same color code. near-uv data (blue dots) are from swift-uvot, simultaneous with one (blue) swift-xrt pointing. the optical black dots are from sdss. the optical and nir red dots are from our observations: for clarity, only the maximum and minimum optical levels of the source during our monitoring are shown. finally, mm and radio data are shown as black dots.

fig. 3: the sed (ν f(ν) vs ν) of s5 1803+78 from non-simultaneous data. see the text for the colour codes. the log-parabolic fits to the synchrotron and inverse compton components are also shown.

the sed is typical of lbl sources, with two large bell-shaped parts: the synchrotron component, peaking between 10¹³ and 10¹⁴ hz, is well fitted by a log-parabola [15]; the inverse compton branch is well defined by the swift-xrt, bat and fermi-gst instruments: a log-parabolic fit gives a peak at 3.8 · 10²¹ hz, similar to that of other lbl objects observed by fermi-gst. a bluer-when-brighter behaviour of the synchrotron component is quite evident. the photon index in the γ-ray domain is 2.33, well within the range of values for lbl objects (2.21 ± 0.16, see [16]).
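a log-parabola in the log(ν f_ν)–log ν plane is just a quadratic, so each sed branch can be fitted with an ordinary least-squares polynomial. the sketch below is ours; the flux points are invented stand-ins for the actual sed data and only demonstrate the method:

```python
import numpy as np

def fit_log_parabola(nu, nuFnu):
    """fit log10(nu*F_nu) = a*(log10 nu)**2 + b*log10 nu + c and
    return the peak frequency 10**(-b/(2a)) of the parabola."""
    x, y = np.log10(nu), np.log10(nuFnu)
    a, b, c = np.polyfit(x, y, 2)
    return 10 ** (-b / (2 * a)), (a, b, c)

# invented synchrotron-branch points, peaking near 10**13.5 hz
nu = 10 ** np.array([11.0, 12.0, 13.0, 13.5, 14.0, 14.5])
nuFnu = 10 ** (-11.5 - 0.2 * (np.log10(nu) - 13.5) ** 2)

nu_peak, _ = fit_log_parabola(nu, nuFnu)
print(f"fitted peak at ~1e{np.log10(nu_peak):.1f} hz")
```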
6 conclusions

the optical light curve of s5 1803+78 shows several large (2 mag) flares with a possible time scale of about 1300 days. the flux variations are slow, with a typical rate of 0.04 mag/day, without a marked difference between the rising and falling branches. no monotonic long-term trend is apparent over a 14-year time span. the optical spectral slope shows only small (10 %) variations even in large flares: an accuracy of 0.01 mag is necessary to study these variations. on the occasion of a strong flare we had enough accuracy to detect a bluer-when-rising behavior. the overall sed is well described by two log-parabolae, with peak positions and slopes typical of lbl sources. the correlation between γ-ray and optical flares is based on two cases only: further observations are necessary to explore this topic, but a one-zone synchrotron self-compton model seems too simple to explain the behavior of this source. if the flares are correlated, from the frequency of the optical flares we statistically expect at least one more γ-ray flare within the lifetime of fermi-gst.

references

[1] sillanpaa, a., et al.: apj, 1988, 325, 628.
[2] nesci, r., et al.: aj, 2005, 130, 1466.
[3] raiteri, c. m., et al.: a&a, 2008, 480, 339.
[4] nesci, r.: aj, 2010, 139, 2425.
[5] massaro, e., et al.: a&a, 2004, 423, 935.
[6] massaro, e., et al.: a&a, 2001, 374, 435.
[7] nesci, r., et al.: aj, 2002, 124, 53.
[8] fiorucci, m., et al.: a&a, 2004, 419, 25.
[9] abdo, a. a., et al.: apjs, 2010a, 188, 405.
[10] abdo, a. a., et al.: apj, 2010b, 722, 520.
[11] donato, d., et al.: atel, 2010, 2386.
[12] reyes, l. c.: atel, 2011, 3322.
[13] nesci, r.: atel, 2011, 3323.
[14] segreto, a., et al.: a&a, 2010, 510, 47.
[15] maselli, a., et al.: blazar variability across the em spectrum. http://pos.sissa.it, 2008, p. 77.
[16] abdo, a. a., et al.: apj, 2010b, 710, 1271.

r. nesci, s. sclavi: university la sapienza, roma, italy
a. maselli: inaf-iasf, palermo, italy
f. montagni: greve in chianti observatory, italy

acta polytechnica vol. 52 no. 2/2012

teaching management at technical universities, business reality in the academic environment

v. baroch, b. duchoň, v. faifrová, z. ř́ıha

abstract: students of technical universities often do not understand why their studies should include learning management skills (in addition to the study of economics). however, not only the experience of graduates but also the requirements of their future employers show that education in the field of management should provide training, skills and practical testing. it is only a matter of time before graduates of a technical university take up leading positions or become part of a team working on some complicated technical problem. a classical technical education is no longer sufficient and, above all, it is employees with knowledge of economics and with managerial skills, specifically soft skills, who come to the fore. it is evident from ample experience that people's individual dispositions play a role in learning soft skills, but many of these skills can also be acquired by progressive training. the question is which form of teaching to choose to enable the necessary skills to be learned, without at the same time discouraging students by offering them potentially unattractive courses. these are the issues that will be treated in this paper.

keywords: management, business requirements analysis, teaching, business simulation.

1 introduction

the solution of any problem must be based on a detailed analysis. crucial information concerns the extent of the need for the various skills required by employers. ctu in prague has been carrying out surveys among its graduates over a long period, as the following text will show. the results of two long-term surveys are in strong agreement. in response to questions about which knowledge and skills they use most, graduates begin by mentioning technical knowledge (93 % of the respondents), but this is followed immediately by communication skills and skills for dealing with people (78 %) [1]. the list of skills to be worked on in the years at the university begins with foreign languages; work with information technology and practical skills are next in the list, followed by managerial skills (leadership, presentation, teamwork and social competences) [1]. these findings are illustrated in graphs 1 and 2.

graph 1: knowledge used by graduates of ctu in their work (percentage of graduates using each type of skill), source [1]
graph 2: skills that graduates had to work on extensively after graduation, source [1]

a second way to study the need for managerial skills is to pose questions directly to the personnel managers of businesses employing ctu graduates. as petr fiala, former president of the conference of rectors, said: "business should mainly play a substantially larger role in education quality assessment. feedback must exist of the type: you have prepared them poorly, or: the graduates cause us the following problems." [2] this feedback was the target of the second part of the survey. 35 businesses took part in the survey. they were asked about the particular skills of graduates: on the one hand the real level of skills of the graduates, and on the other hand the level of skills required by the business. thus the survey answers the questions: how does a business evaluate particular skills, and to what extent does it require them. this focuses attention on skills that are required, but not obtained by, business.
these are above all foreign language skills, soft skills and issues of decision-making and business economics. the survey validates the original premise that the labour market requires above all graduates with a technical education supplemented by management and economic knowledge and skills, all supported by knowledge of at least one foreign language. the results of the survey are illustrated in graph 3. the graph shows selected skills of graduates of the faculty of transportation sciences of ctu in prague, as assessed by their employers. all of the skills are situated above the diagonal of the graph, which means that employers require these skills at a higher level than they really obtain. this is true above all for soft skills and analytical thinking. similarly, the following graph shows that businesses also seek additional management skills, above all decision-making and exact methods for use in decision-making.

graph 3: assessment by their employers of the skills of graduates of the faculty of transportation sciences of ctu in prague, source [1]
graph 4: assessment of the skills of graduates of the faculty of transportation sciences of ctu in prague by their employers, source [1]

it should be pointed out that the conclusions from our survey match the conclusions of similar projects implemented at other technical universities. it also follows from these surveys that businesses require management skills to be trained at universities (in-service training courses are an additional expense for business companies [3]). however, businesses should realize that they need to provide financial and material support for the implementation of managerial "laboratory teaching" at technical faculties. the businesses also mentioned that while technical skills can be acquired within half a year of work experience, soft skills can require much more time [4].

management conception

until this point we have used the concept of management and the concepts of hard and soft elements. however, the word "management" is widely used to express a variety of concepts, and its practical meaning is not sufficiently well understood. management should be understood as a broadly based field in which many disciplines interpenetrate, from mathematics to psychological factors (see figure 1).

fig. 1: penetration of disciplines, source [5]

many business managers confirm what is taught e.g. at ctu in prague and at other technical universities. some opinions from this field can be cited [6]. p. kafka, president of the czech managerial association, affirms the breadth of disciplines needed for management. he points out that it is necessary "to aim at creativity, at the acceptance of new disciplines, at links between technical disciplines and other disciplines, e.g. economics and the arts". kafka argues that it is essential "to join the ability to assess the offers in the whole production context, from the technical viewpoint and from the viewpoint of cost/price". in the next part of the article we will concentrate on ways to make the teaching of management more attractive, and ways to adjust it to the practical requirements of business life. a priority is to transfer business practice to the laboratory conditions of a university. in teaching organized in this way, the main accent should be on teamwork, and on the simulation of the business environment.
business simulation

it is essential for students to master the integration of managerial activities, so that they can apply this integration both in project management and in business management. the problem described in this paper can provide a basis for a general idea of a draft concept of management teaching for master's students. the main problem is how to transfer real situations into academic teaching. one way is to simulate the operation of a real business in management teaching. the simulation itself does not solve a management problem by reaching a strict conclusion. however, it provides support for the final decision, which must be made by the decision-maker, i.e. the manager responsible for managing the company.

the implementation is realized by means of a management game that simulates business activities. the students influence the result of the simulation on the basis of their actions, so that the simulation engages them actively in business management and teaches them to think strategically. students take active decision steps based on changes in the system environment, and also based on changes in the learning curve of the student-decision-maker's knowledge. students try various approaches and evaluate the results of their decisions. simulations are appropriate where students learn to solve problems, to integrate information and to react to changes in the environment, including the risk that the business may collapse. the simulation can take the form of electronic financial models for business decision-making, or the form of a contact board game. both forms are based on simulating business processes that are very close to the way real companies operate. teamwork should be an essential element.

the concept of the business simulation board game is not based on the virtual reality of computer applications. it attempts to recreate a real contact situation from everyday business activities. besides specialist knowledge, students learn face-to-face how to behave with other people. they also learn teamwork and to take individual responsibility for the results of the working group. students improve in team communication and in operational decision-making, and they learn to take a personal role in the team. they work on their presentation skills and they have to react to the questions of the lecturer and of their teammates. the lecturer is a part of the game; he functions as a consultant, coordinator, auditor and supervisor.
software versions of management games have been becoming increasingly popular all over the world, because they are able to create a very realistic business environment in an e-learning form. they enable a trained participant to experience real situations from everyday business practice. a great advantage of management games is that they save time, because the simulation can be used in a remote form via the internet. this makes simulations highly suitable for teaching students by distance study. worldwide experience has shown that this method, using competition, provides efficient motivation for active work. the team members meet regularly for business meetings and, thanks to e-learning and the communication technologies that have been developed, the meetings can take the very modern and desirable form of a video conference. experience of teamwork through videoconferencing is nowadays in great demand in preparation for carrying out the time-demanding profession of a top manager. software business simulations depend greatly on mathematical methods that express thousands of complicated relations among hundreds of simulated elements. the following part of the text is devoted to our experience of working with students using the board version of a business management simulation; a software version is under preparation.

experience from management teaching for technically-oriented students through business simulation

the concept of a unique management contact business simulation (sim) was developed on the basis of our experience of teaching management at a technical university, with emphasis on the practical requirements of msc graduates. the chosen approach was tested in a pilot version with a sample of approximately 130 technically-oriented students, according to the following process:

1. after the teams had been formed, each student team was given a functioning virtual company in the form of a board game, but without a company management.
2. the starting state of the company was the same for all, and its further fate depended fully on the team's decisions (at the beginning of each laboratory seminar the lecturer distributed the components of the board game according to the balances recording the movements of assets and liabilities from the preceding seminar).
3. in this way, a certain element of competition among the student teams was incorporated into the implementation, and above all the need for good team cooperation.
4. the students took on the role of top managers of a virtual company, and took on responsibility for the further running of "their" company.
5. during the seminars, the teams led their virtual company according to established rules, which were laid down in the game manual, for a predefined period of time. at certain times they were able to direct and manage the operation of the company.
6. each team made decisions, e.g. about the capital structure of the company, and made long-term production plans and research programs.
7. all decisions were transparent, as the simulation took place in the form of a board game.

for the evaluation we used the opinions of a target group, an anonymous sample of 128 students out of those who had taken part in our laboratory verification of the business simulation game. in this way quality feedback was acquired. the sample comprised, in an equal ratio, all the main study profiles at the faculty of transportation sciences of ctu in prague, including 29 students from the part-time study programme. the sample of 128 addressed students was segmented into 2 disjoint classes, with a total of 7 segments:

• 1st class: 1. men; 2. women
• 2nd class: 3. students of the profile transportation systems and technology; 4. students of the profile air traffic control and management; 5. students of the profile management and economics of transportation and telecommunications; 6. students of the profiles engineering informatics of transportation and communication, intelligent transport systems, and security of information and telecommunication systems; 7. part-time students

the students were asked to respond to 10 questions through an anonymous questionnaire. they were able to express their experience and opinions in free text after passing through the pilot project.
to summarize the most substantial results, we can state that 93 % of the respondents greatly welcome this form of innovation, 74 % would also add the sw version of the business simulation to the teaching programme, 5 % do not know, and only 2 % were in favour of preserving the traditional form of teaching. 90 % of the respondents expected to gain better employment on the labour market after completing the innovated course. 77 % favoured further teaching beyond the number of lessons at present allocated for this subject. a detailed overview of the survey is published at the url http://www.esim.cz.

conclusion

the commercial sector calls for better preparation of graduates of technical universities in the fields of leading practical projects, team decision-making, and the strategic management of operations and entire businesses. students and graduates of ctu in prague have good specialist and technical knowledge. however, the development of soft skills and practical skills is usually not adequately provided at the university. most businesses support the idea that young people should already acquire these skills during their university studies. the teaching concept should be as close as possible to business practice. students should therefore adopt teamwork and project work as much as possible while they are still at the university. a team approach is nowadays used increasingly in all fields. this trend is related to the need for rapid and efficient changes in decision-making. appropriate team forming is a condition for efficient and successful business functioning. the key competences developed in the framework of management classes should include the ability to assert oneself and promote one's ideas, the art of teamwork and team leadership, the ability to reach decisions independently, and the ability to integrate easily and rapidly into the ever-changing economic environment, including knowledge of business terminology in english.

teaching based on business simulation enables business reality to be implemented in the university environment. although this type of teaching has been increasingly required by the business companies in which future graduates will be employed, there is still inadequate provision. moreover, a survey among 128 students who participated in the laboratory test showed clearly that students are interested in this form of innovation. commercial courses offered in this field for company employees are generally much more expensive than early preparation during students' university studies, and they are less efficient.

acknowledgement

the study reported in this paper was supported by the european union and co-funded by the social fund and by the budget of the capital city of prague (project no. cz.2.17/3.1.00/33309, "an innovative management education technique for better preparation of technical college graduates entering the labour market").

references

[1] šafránková, j.: hodnoceńı a posuzováńı kvality výuky z pohledu absolventů čvut a z hlediska požadavků organizaćı. praha, 2010.
[2] http://ekonom.ihned.cz/c1-47081190-prilis-magistru-malo-bakalaru
[3] hay group: průzkum absolventů technických vysokých škol. studie pro vut brno, hay group, 2006.
[4] průzkum požadavků zaměstnavatelů na absolventy technických a př́ırodovědeckých oborů. 2009. http://ipn.msmt.cz/data/uploads/portal/pruzkum_pozadavku_zamestnavatelu.pdf
[5] duchoň, b., šafránková, j.: management. integrace tvrdých a měkkých prvků ř́ızeńı. praha: c. h. beck, 2008.
ing. václav baroch, phone: +420 224 359 168, e-mail: xbaroch@fd.cvut.cz
prof. ing. bedřich duchoň, csc., phone: +420 224 359 155, e-mail: duchon@fd.cvut.cz
ing. veronika faifrová, phd., phone: +420 224 359 165, e-mail: faifrova@fd.cvut.cz
ing. zdeněk řı́ha, phd., phone: +420 224 359 156, e-mail: riha@fd.cvut.cz
all: department of economics and management of transport and telecommunications, ctu fts, horská 3, 128 03 praha 2

acta polytechnica 53(3):314–316, 2013

intervals in generalized effect algebras and their sub-generalized effect algebras

zdenka riečanová*, michal zajac
department of mathematics, faculty of electrical engineering and information technology stu, ilkovičova 3, sk-81219 bratislava
* corresponding author: zdenka.riecanova@stuba.sk

abstract. we consider subsets g of a generalized effect algebra e with 0 ∈ g and such that every interval [0, q]_g = [0, q]_e ∩ g of g (q ∈ g, q ≠ 0) is a sub-effect algebra of the effect algebra [0, q]_e. we give a condition on e and g under which every such g is a sub-generalized effect algebra of e.

keywords: generalized effect algebra, effect algebra, hilbert space, densely defined linear operators, embedding, positive operators valued state.

1. introduction and some basic definitions and facts

the hilbert space effect algebra e(h) on a hilbert space h is the set of positive operators dominated by the identity operator i. in the quantum mechanical framework the elements of an effect algebra represent quantum effects, and these are important for quantum statistics and for quantum mechanical theory (see [2, 3]). one may think of quantum effects as elementary yes-no measurements that may be unsharp or imprecise.

effect algebras were introduced by d. foulis and m. k. bennet in 1994 [1]. the prototype for the abstract definition of an effect algebra was the set e(h) (hilbert space effects) of all self-adjoint operators between the null and identity operators in a complex hilbert space h. if a quantum mechanical system is represented in the usual way by a complex hilbert space h, then self-adjoint operators from e(h) represent yes-no measurements that may be unsharp. recently several examples and properties of operator (generalized) effect algebras were studied in the papers polakovič, riečanová [9], polakovič [10], paseka, riečanová [8], riečanová, zajac, pulmannová [12], pulmannová, riečanová, zajac [11], riečanová, zajac [13] and riečanová [14].

the abstract definition of an effect algebra follows the properties of the usual sum of operators in the interval [0, i] (i.e. between the null and identity operators in h), and it is the following.

definition 1.1 (foulis, bennet [1]). a partial algebra (e; ⊕, 0, 1) is called an effect algebra if 0, 1 are two distinguished elements and ⊕ is a partially defined binary operation on e which satisfies the following conditions for any x, y, z ∈ e:
(e1) x ⊕ y = y ⊕ x if x ⊕ y is defined,
(e2) (x ⊕ y) ⊕ z = x ⊕ (y ⊕ z) if one side is defined,
(e3) for every x ∈ e there exists a unique y ∈ e such that x ⊕ y = 1 (we put x' = y),
(e4) if 1 ⊕ x is defined then x = 0.
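a concrete finite instance may help here: the chain {0, 1, ..., n} with x ⊕ y = x + y, defined iff x + y ≤ n, is an effect algebra with top element n, and the axioms (e1)–(e4) can be checked by brute force. the sketch below is our illustration, not from the paper:

```python
from itertools import product

def make_chain(n):
    """effect algebra on {0,...,n}: x (+) y = x + y, defined iff x + y <= n."""
    elems = range(n + 1)
    plus = lambda x, y: x + y if x + y <= n else None   # None = undefined
    return elems, plus, n   # n plays the role of the top element 1

def check_effect_algebra(elems, plus, top):
    for x, y in product(elems, repeat=2):               # (e1) commutativity
        assert plus(x, y) == plus(y, x)
    for x, y, z in product(elems, repeat=3):            # (e2) associativity
        lhs = plus(plus(x, y), z) if plus(x, y) is not None else None
        rhs = plus(x, plus(y, z)) if plus(y, z) is not None else None
        if lhs is not None or rhs is not None:
            assert lhs == rhs
    for x in elems:                                     # (e3) unique complement
        assert len([y for y in elems if plus(x, y) == top]) == 1
    for x in elems:                                     # (e4) top (+) x defined => x = 0
        if plus(top, x) is not None:
            assert x == 0

check_effect_algebra(*make_chain(5))
print("axioms (e1)-(e4) hold for the chain {0,...,5}")
```

the same chain, read as the interval [0, n] inside the generalized effect algebra of non-negative integers, previews the interval construction used in section 2.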
immediately afterwards, in 1994, the study of generalizations of effect algebras (without the top element 1) was started by several authors (foulis and bennet [1], kalmbach and riečanová [4], hedlíková and pulmannová [5], kôpka and chovanec [6]). it was found that all these generalizations coincide, and their common definition is the following:

definition 1.2. a generalized effect algebra (e; ⊕, 0) is a set e with an element 0 ∈ e and a partial binary operation ⊕ satisfying, for any x, y, z ∈ e, the conditions
(ge1) x ⊕ y = y ⊕ x if one side is defined,
(ge2) (x ⊕ y) ⊕ z = x ⊕ (y ⊕ z) if one side is defined,
(ge3) if x ⊕ y = x ⊕ z then y = z,
(ge4) if x ⊕ y = 0 then x = y = 0,
(ge5) x ⊕ 0 = x for all x ∈ e.

in every (generalized) effect algebra e a partial order ≤ and a binary operation ⊖ can be introduced as follows: for any a, b ∈ e, a ≤ b and b ⊖ a = c iff a ⊕ c is defined and a ⊕ c = b.

throughout the paper we assume that h is an infinite-dimensional complex hilbert space. for notions and results on hilbert space operators we refer the reader to [7]. we will assume that the domains d(a) of all considered linear operators a are dense linear subspaces of h (in the metric topology induced by the inner product); we say that such operators a are densely defined in h. the set of all densely defined linear operators on h will be denoted by l(h). recall that a : d(a) → h is a bounded operator if there exists a real constant c > 0 such that ‖ax‖ ≤ c‖x‖ for all x ∈ d(a); if a is not bounded then it is called unbounded.

recall that if (e; ⊕, 0, 1) is an effect algebra ((e; ⊕, 0) is a generalized effect algebra) then a subset g ≠ ∅ such that 1 ∈ g (0 ∈ g, respectively) is a sub-effect algebra (sub-generalized effect algebra) of e iff

(s) for any a, b, c ∈ e with a ⊕ b = c in e, the fact that two out of the elements a, b, c are in g implies that all of a, b, c are in g.

moreover, as one can easily check, every sub-generalized effect algebra is a generalized effect algebra in its own right.

2. sub-generalized effect algebras of generalized effect algebras

a significant property of a generalized effect algebra (e; ⊕, 0) is the fact that for any q ∈ e, q ≠ 0, the interval

[0, q]_e = {c ∈ e | there exists d ∈ e such that c ⊕ d = q}

is an effect algebra with top element q and the partial binary operation ⊕_q defined for a, b ∈ [0, q]_e iff a ⊕ b ≤ q, in which case we set a ⊕_q b = a ⊕ b (we write ⊕_q = ⊕|_{[0,q]_e}). thus if a set g ⊆ e with 0 ∈ g is a sub-generalized effect algebra of e, then the same is true for all [0, q]_e ∩ g and [0, q]_e, q ∈ e. more precisely:

theorem 2.1. let e be a generalized effect algebra and 0 ∈ g ⊆ e. then the following assertions are equivalent:
(1.) g is a sub-generalized effect algebra of e.
(2.) for all nonzero q ∈ e the set g ∩ [0, q]_e is a sub-generalized effect algebra of [0, q]_e considered as a generalized effect algebra.

proof. (1.) ⇒ (2.): this implication is obvious. (2.) ⇒ (1.): let a, b, c ∈ e, a ⊕ b = c. substituting c = q into (2.) we obtain that g ∩ [0, c]_e is a sub-generalized effect algebra of [0, c]_e. hence a, b, c ∈ [0, c]_e satisfy the property (s), i.e., g is a sub-generalized effect algebra of e.

the following example shows that condition (2.) in theorem 2.1 cannot be replaced by the stronger one:

(2'.) for all nonzero q ∈ e the set g ∩ [0, q]_e is a sub-effect algebra of the effect algebra [0, q]_e.
example 2.2. let e = ℝ⁺ and g = ℚ⁺ be the sets of all non-negative real and rational numbers, respectively, and let ⊕ denote the usual sum of real numbers. then (2.) obviously holds, but for non-rational q > 0, e.g. for q = √2, the set g ∩ [0, q]_e is not a sub-effect algebra of [0, q]_e: for any rational a ∈ [0, √2] the complement √2 − a is irrational, so it never lies in g.

it is easy to see that if g is a sub-generalized effect algebra of a generalized effect algebra e, then for all q ∈ g, q ≠ 0, the intersection [0, q]_g = [0, q]_e ∩ g is a sub-generalized effect algebra of [0, q]_e considered as a generalized effect algebra. our goal, roughly speaking, is to investigate under what conditions the converse holds.

theorem 2.3. let g be a subset of a generalized effect algebra (e; ⊕, 0) such that 0 ∈ g and for every c ∈ e there exists g ∈ g with c ≤ g. then the following conditions are equivalent:
(1.) g is a sub-generalized effect algebra of (e; ⊕, 0).
(2.) for any q ∈ g, q ≠ 0, the interval [0, q]_g = [0, q]_e ∩ g in g is a sub-effect algebra of the effect algebra [0, q]_e.

proof. (1.) ⇒ (2.): this is obvious, since (1.) implies that g satisfies the condition (s), hence [0, q]_e is an effect algebra for every nonzero q ∈ g. thus the intersection of the two generalized effect algebras, [0, q]_e ∩ g = [0, q]_g, is an effect algebra, as q ∈ g. (2.) ⇒ (1.): let a, b ∈ g with a ⊕ b = c ∈ e. there exists g ∈ g with c ≤ g, and hence a ⊕ b ∈ [0, g]_e ∩ g = [0, g]_g, which, by (2.), gives a ⊕ b ∈ g. if a, c ∈ g, b ∈ e and a ⊕ b = c, then by (2.) we have b ∈ g. this proves that g is a sub-generalized effect algebra of e.

example 2.4. assume that h is an infinite-dimensional complex hilbert space. further, let b⁺(h) be the set of all bounded positive linear operators with domain h. in [12] it was proved that for any dense linear subspace d ⊆ h the set

g_d(h) = b⁺(h) ∪ {a : d → h | a ≥ 0, an unbounded linear operator with d(a) = d}

is a generalized effect algebra with the operation ⊕_d which for any a, b ∈ g_d(h) coincides with the usual sum of linear operators, i.e. a ⊕_d b = a + b. it is easy to show that b⁺(h) is a sub-generalized effect algebra of g_d(h). for every q ∈ b⁺(h), q ≠ 0, the intervals under q in b⁺(h) and g_d(h) coincide, i.e.

[0, q]_{b⁺(h)} = [0, q]_{g_d(h)} ∩ b⁺(h) = [0, q]_{g_d(h)},

and they also coincide as effect algebras. this shows that conditions (1.) and (2.) from theorem 2.3 hold.

open problem 2.5. example 2.4 shows that, in theorem 2.3, the condition "for every q ∈ e there exists c ∈ g with q ≤ c" is only sufficient but not necessary for the equivalence of conditions (1.) and (2.). thus the open problem remains to find a necessary and sufficient condition for the equivalence of (1.) and (2.) in theorem 2.3.

in fact, for every dense subspace d of h, the generalized effect algebra decomposes as g_d(h) = b⁺(h) ∪ u⁺_d(h), where u⁺_d(h) is the set of all unbounded positive linear operators with domain d together with the null operator 0. clearly, b⁺(h) ∩ u⁺_d(h) = {0}. on the other hand, while b⁺(h) is a sub-generalized effect algebra of g_d(h), the same is not true for u⁺_d(h). this shows that the union of {0} with the difference of two sub-generalized effect algebras of a generalized effect algebra e need not again be a sub-generalized effect algebra of e.

example 2.6. let u⁺_d(h) = (g_d(h) \ b⁺(h)) ∪ {0} be the set of all positive unbounded operators in h with domain d together with the null operator 0. then u⁺_d(h) is not a sub-generalized effect algebra of g_d(h), because for a ∈ b⁺(h) and u, v ∈ u⁺_d(h) such that v = u + a we have a ∉ u⁺_d(h).
it follows that there are q ∈ u⁺_d(h), q ≠ 0, such that [0, q]_{u⁺_d(h)} = [0, q]_{g_d(h)} ∩ u⁺_d(h) is not a sub-effect algebra of [0, q]_{g_d(h)}.

acknowledgements. this work was supported by grants vega 1/0297/11 and vega 1/0426/12 of the ministry of education of the slovak republic and by grant apvv-0178-11 of the slovak research and development agency.

references

[1] foulis, d. j., bennet, m. k.: effect algebras and unsharp quantum logics, found. phys. 24 (1994) 1331–1352.
[2] foulis, d. j.: observables, states and symmetries in the context of cb-effect algebras, reports on mathematical physics 60 (2007) 329–346.
[3] foulis, d. j.: effects, observables and symmetries in physics, found. phys. 37 (2007) 1421–1446.
[4] kalmbach, g., riečanová, z.: an axiomatization for abelian relative inverses, demonstratio math. 27 (1994) 769–780.
[5] hedlíková, j., pulmannová, s.: generalized difference posets and orthoalgebras, acta math. univ. comenianae 45 (1996) 247–279.
[6] kôpka, f., chovanec, f.: d-posets, math. slovaca 44 (1994) 21–34.
[7] blank, j., exner, p., havlíček, m.: hilbert space operators in quantum physics (second edition), springer, 2008.
[8] paseka, j., riečanová, z.: considerable sets of linear operators in hilbert spaces as operator generalized effect algebras, found. phys. 41 (2011) 1634–1647.
[9] polakovič, m., riečanová, z.: generalized effect algebras of positive operators densely defined in hilbert spaces, internat. j. theor. phys. 50 (2011) 1167–1174.
[10] polakovič, m.: generalized effect algebras of bounded positive operators defined on hilbert spaces, reports on mathematical physics 68 (2011) 241–250.
[11] pulmannová, s., riečanová, z., zajac, m.: topological properties of operator generalized effect algebras, reports on mathematical physics 69 (2012), no. 2, 311–320.
[12] riečanová, z., zajac, m., pulmannová, s.: effect algebras of positive linear operators densely defined on hilbert spaces, reports on mathematical physics 68 (2011), no. 3, 261–270.
[13] riečanová, z., zajac, m.: extension of effect algebra operations, acta polytechnica 51 (2011), no. 4, 73–77.
[14] riečanová, z.: effect algebra of positive self-adjoint operators densely defined on hilbert spaces, acta polytechnica 51 (2011), no. 4, 78–82.
[15] riečanová, z.: subalgebras, intervals and central elements of generalized effect algebras, international journal of theoretical physics 38 (1999) 3209–3220.

acta polytechnica vol. 51 no. 6/2011

grs 1758–258: rxte monitoring of a rare persistent hard state black hole

m. obst, k. pottschmidt, a. lohfink, j. wilms, m. böck, d. m. smith, j. a. tomsick, i. kreykenbohm

abstract: grs 1758–258 is the least studied of the three persistent black hole x-ray binaries in our galaxy. it is also one of only two known black hole candidates, including all black hole transients, which show a decrease of their 3–10 kev flux when entering the thermally dominated soft state, rather than an increase. we present the spectral evolution of grs 1758–258 from rxte-pca observations spanning about 11 years, from 1996 to 2007. during this time, seven dim soft states are detected. we also consider integral monitoring observations of the source and compare the long-term behavior to that of the bright persistent black hole x-ray binary cygnus x-1.
We discuss the observed state transitions in the light of physical scenarios for black hole state transitions.

Keywords: black hole binaries: general; black hole binaries: individual: GRS 1758–258.

1 Introduction

GRS 1758−258 (Figure 1) is an intermediate-mass X-ray binary harboring a black hole and a companion consistent with an early A-type main sequence star, but with unusual colors [2]. Mass transfer is probably driven by Roche lobe overflow. Among such systems, which are usually transients, GRS 1758−258 is one of only a few persistent sources. Generally it can be found in the hard state; in some respects, however, it still displays behavior typical of transient sources [3–6; hysteresis, rare decay-type soft states].

Fig. 1: INTEGRAL-ISGRI count rate mosaic image in the 20–40 keV band obtained during Galactic Center region Key Programme observations performed in spring 2007 [1]

2 RXTE monitoring

GRS 1758–258 was monitored by RXTE in 1–1.5 ks pointed snapshots: monthly in 1996, weekly through 2000, and twice a week from March 2001 to October 2007. Every year there is a gap from November to January, as the Sun is then too close to the Galactic Center and RXTE cannot observe the source. The spectra were modeled taking into account the Galactic ridge background (see Sect. 2.1 for details). The flux has been corrected for the contribution of the Galactic ridge emission (Figure 2, top). Error bars are shown; typical errors are in the range of 1.0–1.5 %. The photon index varies between 1.5 and 3. Most of the time, GRS 1758–258 is in the hard state. However, seven dim soft states, during which the flux decreases and the spectrum softens, appear clearly in the data (Figure 2, bottom), although no periodicity is detected in their occurrences. During the 2001 soft state (highlighted in dark red), the source almost turned off completely. This strong decline in flux makes GRS 1758–258 especially interesting, since such behavior is typical for transient, not for persistent sources (see also the hardness intensity diagram, Sect. 3).

Fig. 2: Top: flux in keV s⁻¹ cm⁻² in the 3–20 keV band, fitted to the spectra taken by RXTE. Bottom: photon index obtained from the modeling. Soft states are highlighted for episodes reaching a photon index above 2

2.1 Background modeling

As GRS 1758–258 is a rather faint source located close to the Galactic Center (Figure 1), the RXTE-PCA spectra contain not only source counts but also a strong background component caused by the Galactic ridge emission (Figure 3). In order to distinguish between these, a 13 ks background observation 1.5° offset from GRS 1758–258 was performed by RXTE in 1999.

Fig. 3: Spectrum for the April 2003 GRS 1758–258 observation, i.e. containing the source as well as the Galactic ridge contribution (blue (black)), and spectrum for the Galactic ridge emission alone (red (grey))

Figure 4 contains the spectrum modeled with two bremsstrahlung components (F₁,₃₋₈ keV = 0.015 keV s⁻¹ cm⁻², kT₁ = 8 keV; F₂,₃₋₈ keV = 0.0027 keV s⁻¹ cm⁻², kT₂ = 1.2 keV). The iron line complex was modeled according to Galactic ridge observations performed with Suzaku [7]: three lines at 6.4 keV, 6.67 keV and 7 keV, respectively, whose equivalent widths scale as 85:458:129.

Fig. 4: Spectrum of the Galactic ridge emission as seen by RXTE. The data were fitted with two bremsstrahlung components (1: orange dashed line, 2: red dash-dotted line) and an iron line complex as described in [7]
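For readers who want to experiment with this two-temperature description, the short script below evaluates a toy version of the ridge model: two thermal bremsstrahlung continua plus three Gaussian iron lines with the quoted equivalent-width ratios. The simplified exp(−E/kT)/E continuum shape, the line widths and all normalizations are our own illustrative assumptions; the published analysis used full spectral-fitting models.

import numpy as np

def bremsstrahlung(E, kT):
    """Simplified optically thin bremsstrahlung shape ~ exp(-E/kT)/E
    (Gaunt factor omitted; illustration only)."""
    return np.exp(-E / kT) / E

def gaussian_line(E, E0, sigma):
    return np.exp(-0.5 * ((E - E0) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def ridge_model(E, n1, n2, n_fe):
    """Ridge toy model: bremsstrahlung components with kT = 8 and 1.2 keV
    plus the 6.4/6.67/7.0 keV iron complex, equivalent widths 85:458:129."""
    cont = n1 * bremsstrahlung(E, 8.0) + n2 * bremsstrahlung(E, 1.2)
    ew = np.array([85.0, 458.0, 129.0])
    ew /= ew.sum()
    lines = sum(w * gaussian_line(E, e0, 0.05)
                for w, e0 in zip(ew, [6.4, 6.67, 7.0]))
    return cont + n_fe * lines

E = np.linspace(3.0, 20.0, 500)                  # 3-20 keV PCA band
spec = ridge_model(E, n1=1.0, n2=5.0, n_fe=0.1)  # arbitrary normalizations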
2.2 Spectral parameters

The RXTE-PCA spectra in the 3–20 keV band were fitted with an absorbed power law, a weak neutral iron Kα line, and a black body disk component where required, always including the Galactic ridge emission (see Sect. 2.1). The column density due to interstellar absorption in the direction of GRS 1758–258 is fixed at N_H = 1.5 × 10²² cm⁻² according to earlier results [8]. An example is shown in Figure 5. The disk becomes visible in the dim soft states, the low source flux increasing the error bars (Figure 6).

Fig. 5: Example spectrum taken by RXTE on 2009 April 08, containing the absorbed power law component (orange dash-dotted line), the disk (red dashed line) and the iron line (green (grey) solid line)

Fig. 6: Spectral parameters from RXTE monitoring observations of GRS 1758–258: the photon index, temperature and normalization of the disk component, and the reduced χ²

3 Hardness intensity diagram

For energies < 20 keV, the hardness intensity diagram (HID) of GRS 1758–258 shows a clear hysteresis for hard and soft state fluxes (absorbed fluxes, see Figure 7). This behavior is similar to that shown by black hole transients over their outbursts [9; q-shaped HID]. Differently from transients, there is no rise in the hard state from quiescence. During the most extreme soft state, the 3–20 keV flux is clearly below the lowest hard state flux, with no full return to the hard branch observed down to near-quiescence.

Fig. 7: Hardness intensity diagram (HID) from RXTE monitoring observations of GRS 1758–258 from 1997 to 2007. The seven dim soft states are highlighted as in Figures 6 and 2

A comparison at these energies with our long-term RXTE monitoring observations of the persistent black hole X-ray binary Cyg X-1 is in preparation.
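A hardness intensity diagram of the kind shown in Figure 7 reduces to a few lines of code once band-resolved fluxes are available. The sketch below builds an HID from two hypothetical flux columns; the sub-bands and the numbers are placeholders, since the paper states only that the diagram uses absorbed < 20 keV fluxes.

import numpy as np
import matplotlib.pyplot as plt

# Hypothetical per-pointing fluxes in a soft and a hard sub-band
# (units as in Fig. 2); which sub-bands define the hardness ratio
# is our assumption.
soft = np.array([0.10, 0.15, 0.20, 0.18, 0.05, 0.03])
hard = np.array([0.12, 0.16, 0.19, 0.10, 0.02, 0.01])

hardness = hard / soft          # hardness ratio
intensity = soft + hard         # total band flux

plt.loglog(hardness, intensity, 'k+')
plt.xlabel('hardness (hard/soft flux ratio)')
plt.ylabel('total < 20 keV flux')
plt.savefig('hid_sketch.png')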
during themonitoring, there are 7 occurrences of soft states with photon indices softer than 2. these states are most likely due to a decrease in the mass accretion rate. during the soft states, a marginal disk detection with ktin ∼ 500–800ev can be seen. the < 20kev hid shows hysteresis with no full return to the hard branch observed down to nearquiescence. the > 20kev hid shows the same hard state luminosity and hardness as cygx-1 but a different decay. acknowledgement this research has been partly funded by the europeancommission under contract itn215212 “black holeuniverse”andbydlrcontractdlr50or1007. k. pottschmidt anda. m. lohfink acknowledge supportbynasaintegralgograntnnx08ay17g and in additiona. m. lohfink acknowledges support by the daad. references [1] lohfink, a. m., pottschmidt, k., wilms, j., lubiński, p.: apj, 2011, in prep. [2] muñoz-arjonilla, a. j., mart́ı, j., luqueescamilla, p. l., et al.: a & a, 519, a15, 2010. [3] pottschmidt, k., chernyakova, m., zdziarski, a. a., et al.: a & a, 452, 285, 2006. [4] smith, d. m., heindl, w. a., markwardt,c. b., swank, j. h.: apj, 554, l41, 2001. [5] smith,d.m.,heindl,w.a., swank, j.h.: apj, 569, 362, 2002. [6] soria, r., broderick, j. w., hao, j., hannikainen, d. c., et al.: mnras, 2011, in press (arxiv:1103.3009). [7] ebisawa, k., yamauchi, s., tanaka, y., et al.: progress of theoretical physics supplement, 169, 121, 2007. [8] pottschmidt,k.,chernyakova,m., lubiński, p., et al.: inproc. 7th integralworkshop. pos, 2008, p. 98. [9] fender, r. p., belloni, t. m., gallo, e.: mnras, 355, 1105, 2004. [10] reid, m. j., mcclintock, j. e., narayan, r., et al.: 2011, arxiv:1106.3688. [11] xiang, j., lee, j. c., nowak, m. a., wilms, j.: apj, 2011, in press (arxiv:1106.3378). m. obst remeis observatory/ecap bamberg, germany 52 acta polytechnica vol. 51 no. 6/2011 k. pottschmidt cresst/nasa-gsfc greenbelt, md, usa umbc baltimore, md, usa a. lohfink umcp college park, md, usa j. wilms remeis observatory/ecap bamberg, germany m. böck remeis observatory/ecap bamberg, germany d. m. smith scipp/ucsc santa cruz, ca,usa j. a. tomsick ssl/ucb berkeley, ca, usa i. kreykenbohm remeis observatory/ecap bamberg, germany 53 acta polytechnica acta polytechnica 53(2):233–236, 2013 © czech technical university in prague, 2013 available online at http://ctn.cvut.cz/ap/ x-ray spectroscopic characterization of shock-ignition-relevant plasmas michal šmída,b,∗, luca antonellic,d, oldřich rennera a institute of physics of the ascr, v.v.i., prague, czech republic b czech technical university in prague, fnspe, prague, czech republic c dipartimento di ingegneria industriale, universita roma ‘tor vergata’, roma, italy d celia, université de bordeaux 1, talence, bordeaux, france ∗ corresponding author: smidm@fzu.cz abstract. experiments with multilayer plastic/cu targets performed at a pals laser system aimed at the study of matter at conditions relevant to a shock ignition icf scheme, and, in particular, at the investigation of hot electrons generation. plasma temperature and density were obtained using high-resolution x-ray spectroscopy. 2d-spatially resolved quasi–monochromatic imaging was observing the hot electrons via fluorescence kα emission in the copper tracer layer. found values of plasma temperature 690 ± 10 ev, electron density 3 × 1022 cm−3 and the effective energy of hot electrons 45 ± 20 kev demonstrate the potential of x-ray methods in the characterization of the shock ignition environmental conditions. 
Keywords: hot electrons, shock ignition, laser-produced plasma, X-ray spectroscopy, Kα radiation.

1. Introduction

The shock ignition (SI) scheme is one of the alternative approaches to inertial confinement fusion. It anticipates a compression of the spherical target by a laser beam with intensity ≈ 1 × 10¹⁴ W cm⁻², followed by an intense pulse (≈ 1 × 10¹⁶ W cm⁻²) which drives the strong converging igniting shock with pressures at the ablation front up to several 100 Mbar [2]. Under these conditions, the igniting (main) beam is not absorbed in the solid target, but interacts with the underdense plasma corona, where parametric instabilities take place. Primarily, stimulated Raman scattering occurring below the n_c/4 limit is a significant mechanism of collisionless absorption. In this process, a plasma wave is excited along the laser direction and a light beam is reflected with a modified frequency spectrum. The plasma wave accelerates free electrons to suprathermal energies. These 'hot' electrons may affect the SI target significantly, but their role is not fully understood yet: they can either preheat the compressed material, which would lead to premature target expansion and a decrease of the target gain, or they may be stopped in the dense shell of the target, thus increasing the ablation pressure, improving the symmetry of the converging shock pressure front and increasing the gain [10].

The aim of the experiments performed at the PALS research center [6] was to create plasma with conditions relevant to SI and to implement advanced spectroscopic diagnostics to validate this relevance. The collected data should contribute to the investigation of hot electron generation and propagation in the dense target.

Figure 1. Experimental setup.

2. Experimental setup

The scheme of the experiment shown in Fig. 1 includes the multi-layer target, the irradiating laser beam, the X-ray spectrometer, and the Kα imaging system. The target consisted of a 25 or 40 µm thick layer of chlorine-doped plastic (parylene-C; C₈H₇Cl) and a 5 µm thick copper tracer layer. The relevance of using doped plastic targets as an SI ablator layer has been discussed, e.g., in paper [3]; similar measurements have also been performed [4]. The target was irradiated from the plastic side. The prepulse (1315 nm, 70 J, focal spot diameter 700 µm, ≈ 6 × 10¹³ W cm⁻²) generates a long-scale preplasma corresponding to the compression phase of SI. The frequency-tripled PALS main beam (438 nm, 170 J, focal spot diameter 80 µm, pulse length 300 ps, ≈ 1 × 10¹⁶ W cm⁻²) strikes the target at a variable delay, generating a shock wave and the hot electrons. These electrons propagate into the copper layer and create vacancies in the K-shell of the Cu atoms, which consequently emit the fluorescence Kα radiation.

The X-ray spectra were measured using a spherically-bent mica crystal spectrometer which was aligned to provide spatial resolution along the laser axis. In the 4th order it covered the wavelengths 4.17 ÷ 4.52 Å to image the Cl Heα and Lyα lines; the Heγ ÷ Heη and Lyβ lines were observed in the 5th order. The Cu Kα emission was measured using a spherically-bent quartz (211) crystal, which was set up as a monochromator in imaging mode (Bragg angle θ = 88.7°) to provide a quasi-monochromatic distribution of Kα intensity, 2D-spatially resolved along the target surface. Both diagnostics used Kodak AA400 film to detect the signal.
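The quoted intensities follow directly from beam energy, pulse length and spot size. A quick back-of-the-envelope check (flat-top spot assumed; the prepulse duration is not given in the text, so the 300 ps used below is our assumption):

import math

def intensity_w_cm2(energy_j, pulse_s, spot_diameter_um):
    """Mean focused intensity I = E / (tau * pi * r^2), flat-top spot."""
    r_cm = 0.5 * spot_diameter_um * 1e-4
    return energy_j / (pulse_s * math.pi * r_cm ** 2)

# Main beam: 170 J, 300 ps, 80 um focal spot (values from the text).
print(f"main:     {intensity_w_cm2(170, 300e-12, 80):.2e} W/cm^2")   # ~1.1e16

# Prepulse: 70 J, 700 um spot; 300 ps duration assumed, the paper
# quotes only the resulting ~6e13 W/cm^2.
print(f"prepulse: {intensity_w_cm2(70, 300e-12, 700):.2e} W/cm^2")   # ~6e13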
3. X-ray spectra evaluation

The measured spectral records were digitized using a calibrated table-top scanner and recalculated to optical densities. The raw record with line identification is in Fig. 2. This record was split into regions corresponding to the 4th and 5th spectroscopic orders, each recalculated into emitted intensity using the theoretical energy-dependent reflectivity, filter transmission, and film response. The dominant Lyβ and Heδ lines were selected for the evaluation. The ratio of these two lines, reflecting the ratio of hydrogen- and helium-like ions, is very sensitive to the plasma temperature, while their width is given mainly by Stark broadening, thus providing a very sensitive tool for plasma density estimation. Though their recorded intensities are comparable to those of the α lines, their emissivities are relatively low; this is because the crystal reflectivity and the filter transmission increase with decreasing wavelength. The low optical thickness of these lines is very beneficial for diagnostic purposes, as it minimizes undesirable effects like opacity broadening or reabsorption in inhomogeneous plasma.

Figure 2. Raw spectral record with line identification.

The temperature and density were determined from the best fit of the observed spectra with simulations. A set of synthetic spectra with variable parameters was generated using the PrismSPECT code [8] under the assumption of homogeneous planar plasma and a steady-state approximation. The fitting was done using the least squares method. The logarithm of the least squares of the differences between the experimental and synthetic spectra for typical experimental data, shown as a function of the temperature and density of the theoretical spectra, is plotted in Fig. 3. This figure characterizes the uncertainty of the parameter estimation: the diagonally-elongated minimum represents a set of synthetic spectra in good agreement with the experimental one. To increase the precision of the plasma parameter estimation, the density was separately derived using the FWHM width of the Lyβ line. Having the density fixed, the uncertainty of the temperature estimation was about 20 eV. Figure 4 shows the comparison of the reconstructed experimental spectrum with the best fitting theoretical one; the 5th order lines are magnified in the inset.

Figure 3. The dependence of the merit function on T and ρ; the diagonally-elongated minimum indicates the uncertainty of the plasma parameter estimation.

Figure 4. Comparison of the reconstructed experimental spectrum and the best fitting theoretical one.

4. Kα evaluation

The 2D-spatially resolved Kα record directly provides information on the radius and intensity of the Kα signal. These data were used to evaluate the hot electron effective energy and absolute population.

4.1. Effective energy

The effective energy of the hot electron beam was estimated by analyzing the attenuation of the signal in dependence on the plastic layer thickness. As the experiments were conducted with various thicknesses, namely 0, 25 and 40 µm, the signal intensity could be plotted as a function of this thickness and fitted with an assumed exponential attenuation (Fig. 5).

Figure 5. The dependence of the normalized integrated Kα intensity on the plastic layer thickness, its experimental values and the fitted attenuation curve.
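The attenuation fit itself is a one-parameter problem. A minimal sketch follows; the three intensity values are invented placeholders, and only the thicknesses 0, 25 and 40 µm come from the text:

import numpy as np
from scipy.optimize import curve_fit

# Plastic layer thicknesses from the text; normalized Kalpha intensities
# are hypothetical stand-ins for the measured values.
x_um = np.array([0.0, 25.0, 40.0])
i_norm = np.array([1.00, 0.55, 0.38])

def attenuation(x, lam):
    """I(x) = exp(-x / lam), with lam the attenuation length in um."""
    return np.exp(-x / lam)

(lam,), _ = curve_fit(attenuation, x_um, i_norm, p0=[30.0])
print(f"attenuation length ~ {lam:.1f} um")
# Comparing lam with tabulated electron stopping powers in plastic
# (e.g. the ESTAR database [1]) then yields the effective electron
# energy, here of order tens of keV.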
Comparing the attenuation with the stopping power from the ESTAR database [1], the mean electron energy was found to be about 45 ± 20 keV. This high uncertainty is caused mainly by the hot electron production process: in the relevant intensity regime, hot electrons are generated by strongly nonlinear processes (stimulated Raman scattering and two plasmon decay) which are correlated with the stability of the laser system and of the laser-plasma interaction. As a consequence, the energy of the hot electrons can vary considerably between comparable shots. For our purposes, the essential information is the order of magnitude of this energy, which equals tens of keV.

4.2. Absolute calibration

To perform the absolute calibration of the measured data, it is necessary to relate the detected signal to the Kα emission, and to relate this Kα emission to the number of hot electrons propagating through the copper layer.

The first problem was solved using a detailed quantitative analysis based on a ray-tracing procedure. The calculations follow the standard algorithms described, e.g., in paper [11]. The simulation assumes a point source giving origin to a fan of isotropic quasi-monochromatic X-rays; each ray carries an equivalent part of the emitted intensity. By default, it is assumed that the source emits 1 photon into the full solid angle. The reflection curve of the spherically bent crystal is calculated using the modified Taupin equation [5]. By taking into account all relevant geometric factors (source-to-crystal and crystal-to-detector distances, crystal and detector dimensions), the directions and amplitudes of the rays reflected from the crystal are found and the signal distribution at the detector plane is calculated. The collection efficiency is then determined by integrating the signal over the active detector area. The resulting ratio of detected to emitted radiation is only 2.4 × 10⁻⁶, which is more than one order of magnitude less than might be expected when neglecting the variation of the incidence angle. The full number of emitted Kα photons is determined with respect to the transmission of the protective filters and the characteristic curve of the X-ray film used.

To relate the number of hot electrons (N_he) to the number of emitted Kα photons (N_Kα), the assumption of a monoenergetic 50 keV electron beam was made. The collisional cross section for Cu K-shell ionization is σ = 4 × 10⁻²² cm² [9]. Using the thin target approximation, the number of excited copper ions is

N_Cu* = σ d n_Cu N_he ,   (1)

where n_Cu is the number density of copper atoms and d the thickness of the tracer layer. The probability that an excited copper ion decays through radiative recombination is w_K = 0.39 [7], and the final relation is

N_Kα = σ d n_Cu w_K N_he .   (2)

Using this formula, the absolute number of hot electrons propagating through the copper layer can be estimated.

5. Results and discussion

Several shots with similar conditions and with variable delays between the prepulse and the main beam have been analysed. The measured parameters did not show any dependence on the delay, so the impact of the prepulse in this situation can be considered negligible. The found plasma temperature was T = 690 ± 10 eV, and the density ρ = 0.10 ± 0.02 g cm⁻³ (which corresponds to n_e = 3 × 10²² cm⁻³). The total number of Cu Kα photons emitted was in the range 3 ÷ 30 × 10⁹, which corresponds to the production of hot electrons N_he = 2 ÷ 20 × 10¹¹.
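Plugging the quoted quantities into Eq. (2) reproduces the order of magnitude of the hot-electron population. The copper atomic density below is standard tabulated data, not a value given in the paper:

SIGMA = 4e-22        # Cu K-shell ionization cross section [cm^2], from [9]
D_CU = 5e-4          # tracer layer thickness: 5 um expressed in cm
N_CU = 8.5e22        # atomic number density of solid copper [cm^-3], tabulated
W_K = 0.39           # K-shell radiative yield, from [7]

def hot_electrons(n_kalpha):
    """Invert Eq. (2): N_he = N_Kalpha / (sigma * d * n_Cu * w_K)."""
    return n_kalpha / (SIGMA * D_CU * N_CU * W_K)

for n_ka in (3e9, 30e9):
    print(f"N_Kalpha = {n_ka:.0e}  ->  N_he ~ {hot_electrons(n_ka):.1e}")
# Gives ~5e11 to ~5e12, the same order of magnitude as the quoted
# range of 2-20 x 10^11 hot electrons.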
Since this number is lower than theoretical predictions, further experiments are planned to revise this measurement in the future. The important measured value is the effective energy of the hot electrons, estimated as 45 ± 20 keV.

6. Conclusion

High-resolution X-ray spectroscopy combined with monochromatic Kα imaging was implemented in laser-plasma interaction experiments with laser intensities ≈ 1 × 10¹⁶ W cm⁻² and with a weaker prepulse at a variable delay to the main beam, thus relevant to the shock-ignition ICF scheme. The usability of these diagnostics was demonstrated, and sample data needed for the explanation of hot electron generation, which is decisive for the success of the shock-ignition scheme, were collected. The measured data did not exhibit any distinct dependence on the prepulse timing; there are several alternative scenarios explaining this unexpected behavior, and their validity should be confirmed by further experiments and simulations.

Acknowledgements

This research has been supported by the Czech Science Foundation, grant no. P205/10/0814, by Laserlab-Europe (grant no. 228334), and by the CTU grant SGS10/299/OHK4/3T/14. The work has been done within the activities of Working Package 10 (Fusion Experiment) of the HiPER project.

References

[1] Berger, M. J., et al.: ESTAR, PSTAR, and ASTAR: computer programs for calculating stopping-power and range tables for electrons, protons, and helium ions. http://physics.nist.gov/star, version 1.2.3.
[2] Betti, R., et al.: Shock ignition of thermonuclear fuel with high areal density. Phys. Rev. Lett. 98(15):155001, 2007.
[3] Cook, R., et al.: Production and characterization of doped mandrels for inertial-confinement fusion experiments. J. Vac. Sci. Technol. A 12:1275, 1994.
[4] Haynes, D. A., Jr., et al.: Chlorine K-shell spectroscopy of directly driven cylindrical implosions. J. Quant. Spectrosc. Radiat. Transf. 65:297–302, 2000.
[5] Hölzer, G., et al.: Characterization of flat and bent crystals for X-ray spectroscopy and imaging. Cryst. Res. Technol. 33:555, 1998.
[6] Jungwirth, K., et al.: The Prague Asterix Laser System. Phys. Plasmas 8(5):2495–2501, 2001.
[7] Krause, M. O.: Atomic radiative and radiationless yields from K and L shells. J. Phys. Chem. Ref. Data 8:307, 1979.
[8] MacFarlane, J. J., et al.: SPECT3D: a multi-dimensional collisional-radiative code for generating diagnostic signatures based on hydrodynamics and PIC simulation output. High Energy Density Physics 3(1–2):181–190, 2007.
[9] Morace, A., Batani, D.: Spherically bent crystal for X-ray imaging of laser produced plasmas. Nucl. Instrum. Meth. A 623:797, 2010.
[10] Perkins, L. J., et al.: Shock ignition: a new approach to high gain inertial confinement fusion on the National Ignition Facility. Phys. Rev. Lett. 103(4):045004, 2009.
[11] Podorov, S. G., et al.: Optimized polychromatic X-ray imaging with asymmetrically cut bent crystals. J. Phys. D: Appl. Phys. 34:2363, 2001.

Some aspects of profiling electron fields for irradiation by scattering on foils

Č. Šimáně, M. Vognar, D. Chvátil

With the use of formulae derived by Molière and others, the angular distributions of electrons scattered on Al foils have been calculated for electron energies E₀ = 10, 15, 20, 22 and 25 MeV. Theoretical results are compared with flux density measurements of 22 MeV electrons scattered by a 0.16 mm Al foil, made with a Faraday cup at distances 100 to 150 cm from the beam exit window, and at distances 110 and 130 cm for electrons scattered by Al foils 0.16, 0.26, 0.36, 0.46 and 0.66 mm in thickness. A strong dependence on the beam aperture, due to broad beam scattering effects in air, was observed. Semiempirical formulae for beam apertures 1.56 and 4.6 degrees were derived, which reproduce well the experimental results for the forward peak electron flux density (electrons cm⁻² s⁻¹ µA⁻¹) per 1 µA of total beam current as functions of the distance from the beam exit window and of the thickness of the Al scattering foils.

Keywords: microtron, electron fields, electron scattering, scattering foils.

1 Introduction

High energy electron beams from linear accelerators or microtrons are strongly peaked forward.
To get uniform irradiation fields for medical treatment or for other purposes, several techniques are used, either to destroy the forward peaking by scattering on foils [1] or to distribute the electron current by some scanning method. There is a principal difference between these two ways of making the electron flux uniform over a given area. When the beam is scanned, only a relatively small part of the total beam current is lost at the boundaries of the field, inevitable in order to keep the mean current density within the boundaries in the prescribed limits. This is far from the case if scattering by foils is used for this purpose.

The results of the following considerations will to some extent have a qualitative character, as only a small angle scattering model is used for calculating the angular distribution of the electron beam scattered by foils, not including the radiation processes. Let us suppose, as usual, a Gaussian angular distribution of the beam after scattering. Because the electron is not lost during the scattering process, the probability of finding the scattered electron in the total solid angle is equal to one. The electron flux F(θ) inside a cone of angular aperture 2θ, normalized to unity for θ from 0 to π, can be written in the form

F(θ) = ∫₀^θ w(θ′) sin θ′ dθ′   (1)

where w(θ) dθ, the probability of an electron being scattered under an angle θ into an interval dθ, is, according to the Molière theory [2], given by the expression (with only the first term taken into account)

w(θ) dθ = (2 / (θ₁² B)) exp(−θ² / (θ₁² B)) dθ   (2)

The constant factor before the exponential on the right side is the normalization factor and at the same time the amplitude of the Gaussian distribution. The peak value of the distribution at θ = 0 is inversely proportional to the width in radians of the angular distribution at e⁻¹ height of the Gaussian; therefore, knowing the amplitude, the half-width can also be deduced from it. The constants θ₁² and B depend on the thickness x, atomic number Z, atomic weight A and density s of the material of the scattering foil, and on the energy E₀ of the primary beam (see Addendum).

Differential electron densities w(θ)·x (Fig. 1) were calculated by formulae taken from [2], [3] and [4] for Al foils of thickness x from 0.1 to 1 mm in 0.1 mm steps, for energies E₀ = 10, 15, 20 and 25 MeV.

Fig. 1: Gaussian distributions w(θ)·x of electrons scattered on Al foils of thickness x (amplitude normalized to 1) for energies 10, 15, 20 and 25 MeV

It is interesting to note that the dependence of the peak value of the Gaussian distribution A_m = 2/(θ₁² B) resulting from the Molière formulae can, with a high degree of precision, be approximated by a simple function, which in the case of Al foils has the form

A_m = a · x^b   (3)

where x is the Al thickness in mm of the scattering foil and the constants are a = 5002 and b = −1.162.
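The fit (3) makes it easy to tabulate the distribution without re-deriving the Molière constants. A short sketch evaluating the document's own expressions (the grid of thicknesses is an arbitrary choice):

import numpy as np

A_COEF, B_EXP = 5002.0, -1.162   # fit constants from Eq. (3), Al foils

def peak_amplitude(x_mm):
    """Peak value A_m = a * x**b of the Gaussian angular distribution."""
    return A_COEF * x_mm ** B_EXP

def w(theta_rad, x_mm):
    """First-term Moliere angular density, Eq. (2): since A_m = 2/(theta1^2 B),
    w(theta) = A_m * exp(-theta^2 * A_m / 2)."""
    a_m = peak_amplitude(x_mm)
    return a_m * np.exp(-theta_rad ** 2 * a_m / 2.0)

for x in (0.1, 0.5, 1.0):                        # foil thickness in mm
    width_1e = np.sqrt(2.0 / peak_amplitude(x))  # 1/e half-width, radians
    print(f"x = {x:.1f} mm: A_m = {peak_amplitude(x):8.0f}, "
          f"theta_1/e = {np.degrees(width_1e):.2f} deg")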
2 Experimental part

For electron flux measurements in our experiments, the Faraday cup described in [5] was used. It was constructed for electron energies up to 25 MeV and currents from 10⁻¹⁰ to 10⁻⁵ A, with a circular 0.1 mm Al entry window of ⌀ = 1.8 cm (window area 2.54 cm²). The angular aperture of the input window of the Faraday cup as a function of the distance from the beam outlet window is represented in Fig. 2. If situated close to the beam outlet window of the microtron electron guideline, the Faraday cup receives the total electron beam, thus giving the value of the total electron flux. This value was used for calibration of the induction type current monitor, situated on the vacuum side before the beam exit window in the electron guideline, which served as the beam intensity monitor during the experiments.

Fig. 2: Angular aperture 2θ_F of the input window of the Faraday cup as a function of its distance from the beam outlet window

The forward peak electron flux density F_{θ=0}(l, x, γ) [n_e cm⁻² s⁻¹ µA⁻¹] per 1 µA of electron beam current (n_e is the number of electrons, x the scattering Al foil thickness in mm and γ the beam aperture in degrees) was measured for l (the distance from the beam exit window) from 110 to 150 cm in 10 cm steps, for x = 0.16 mm, γ₁ = 1.57 and γ₂ = 4.6 degrees. To limit the beam aperture γ₁ to 1.57 degrees, a diaphragm of ⌀ = 1.2 cm was placed at a distance of 53 cm from the beam exit window; for γ₂ = 4.6 degrees a diaphragm of ⌀ = 4.0 cm was situated at a distance of 60 cm. The experimental results are presented as crosses in the graph in Fig. 4.

In another series of experiments, the electron flux densities were measured for two distances, l = 110 and 130 cm, with scattering foils 0.16, 0.26, 0.36, 0.46 and 0.66 mm thick, for beam apertures γ = 1.57 and 4.6 degrees. The emplacement of the diaphragms and their diameters were the same as in the previous case. The results are presented as crosses in the graphs in Fig. 5 and Fig. 6.

The forward peak electron flux density F_{θ=0} was calculated by dividing the current of the Faraday cup by the total beam current, by the area of its entry window (2.54 cm²) and by the electron charge 1.6 × 10⁻¹⁹ C; finally, the electron flux density was reduced by a factor of 10⁻⁶ to obtain the results per 1 µA of the total beam current. The mean square error of the mean value of five current readings in each measurement was less than ±3 %.
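The conversion from raw currents to F_{θ=0} is simple bookkeeping. A sketch with an invented pair of current readings:

E_CHARGE = 1.6e-19   # C
WINDOW_AREA = 2.54   # cm^2, Faraday cup entry window

def forward_flux_density(i_cup_a, i_beam_a):
    """Forward peak flux density in electrons cm^-2 s^-1 per uA of beam:
    divide the cup current by charge, window area and total beam current,
    then rescale to 1 uA (factor 1e-6)."""
    per_ampere = i_cup_a / (E_CHARGE * WINDOW_AREA * i_beam_a)
    return per_ampere * 1e-6

# Hypothetical readings: 20 nA in the cup for a 1 uA total beam.
print(f"{forward_flux_density(20e-9, 1e-6):.2e} e- cm^-2 s^-1 uA^-1")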
3 Theoretical part

We shall suppose that the electron beam before scattering is strictly parallel and has negligible radial dimensions. Then, for the electron energy E₀ = 22 MeV used in our experiments, the angular apertures corresponding to the heights h = 0.95, 0.5 and 1/e of the peak value of the Gaussian distribution of the scattered beams (Fig. 3) close behind the Al scattering foils can be calculated as a function of their thickness x [mm]. For the shortest distance in our experiments, l = 110 cm of the Faraday cup from the beam outlet window, the angular aperture of the Faraday cup entry window is about 0.9°. Fig. 3 and Fig. 1 show that the inhomogeneity of the electron flux density at the Faraday cup entry opening is less than ±5 % in the case of x ≥ 0.16 mm and less than ±2.5 % in the case of x ≥ 0.26 mm. In reality, the uniformity will be better, due to the scattering of electrons during their flight in air.

Fig. 3: Angular apertures θ(x)_h of the beam width at the heights h = 0.95, 0.5 and 1/e of the amplitude of the Gaussian distribution of electrons scattered on Al foils of thickness x

Because the thickness of the air layer in the electron flight path, when expressed in g/cm², is comparable with the thickness of the scattering foil, it must be taken into account, and a formula for calculating the electron flux density must contain a member that accounts for the scattering on air molecules. Unlike the scattering in a thin Al foil, where the radial displacement of the electron in its volume is negligible, the calculation of the scattering in the air volume, if axially symmetric, is a three-dimensional problem and radial displacements of the electrons cannot be neglected: the scattering must be considered as broad beam scattering. We tried to simplify the problem by assuming, as is usually done in similar problems, e.g., in neutron or gamma scattering, that the build-up effects can be accounted for by an exponential decrease of the flux density with a proper choice of the linear attenuation coefficient µ and by a modification of the point source magnitude. The forward peak (θ = 0) electron flux density will then be of the form

F_{θ=0}(l, x, γᵢ) = A(x, γᵢ) exp(−µᵢ l) / l²   (4)

[n_e cm⁻² s⁻¹ µA⁻¹], where A depends on the foil thickness x and the beam aperture γ, while µ depends only on the beam aperture; the index i denotes the beam aperture.

Using expressions (1) and (4), the 22 MeV electron flux density at the entry of the Faraday cup has been calculated in dependence on the distance l in the range of 110 to 150 cm. With A₁ = 2.58 × 10¹⁵ and A₂ = 2.28 × 10¹⁵ [n_e s⁻¹ µA⁻¹] in (4), which are very close to the peak value as calculated from the Molière formulae, and with µ₁ and µ₂ put equal to 0.00639 cm⁻¹ and 0.0019 cm⁻¹ for γ₁ and γ₂ respectively, formula (4) reproduces very well the experimental results for x = 0.16 mm, as can be seen from Fig. 4 (full lines).

Fig. 4: Theoretical (full line) and experimental values (crosses) of forward peak flux density F(l, x)_{θ=0} of electrons E₀ = 22 MeV scattered on a 0.16 mm thick Al foil, as functions of the distance from the beam exit window, for two values of beam aperture γ₁ = 1.57 and γ₂ = 4.60 degrees

Keeping µ₁ and µ₂ unchanged, the values of the functions A(x, γ) were fitted on the experimental results for x = 0.16, 0.26, 0.36, 0.46 and 0.66 mm. We found that the set of values of A selected in this way can be approximated by the functions

A(x) = 5.46 × 10¹⁴ x^(−0.8480)  for beam aperture γ₁ = 1.57°   (5)
A(x) = 6.92 × 10¹⁴ x^(−0.6525)  for beam aperture γ₂ = 4.60°   (6)

Final expressions used for calculating the theoretical curves represented in the graphs in Fig. 5 and Fig. 6 are then

F(l, x)_{θ=0} = 5.46 × 10¹⁴ x^(−0.8480) exp(−0.00639 l) / l²  for γ₁ = 1.57°   (7)

Fig. 5: Theoretical (full line) and experimental values (crosses) of forward peak flux density F(l, x)_{θ=0} of electrons E₀ = 22 MeV as functions of the thickness of Al scattering foils for beam aperture γ₁ = 1.57 degrees at distances 110 to 150 cm from the beam exit window
for beam aperture �2 = 4.60° (6) final expressions used for calculating the theoretical curves represented in the graphs in fig. 5 and fig. 6 are then � � � �f l x = x l l, . exp ..� � � �0 14 0 8480 25 46 10 0 00639 for �1 = 1.57° (7) 16 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 41 no.1/2001 0 00. 2 00. 4 00. 6 00. 8 00. 10 00. 12 00. 14 00. 16 00. 18 00. 20 00. 22 00. 24.00 100 110 120 130 140 150 distance [cm] from the beam exit windowl e le c tr o n fl u x d e n s it y [n e c m � 2 s� 1 a � 1 ]� 1 0 � 1 0 �2 � 4.60° �1 � 1.57° e0 � 22 mev x 0.16 mm al� fig. 4: theoretical (full line) and experimental values (crosses) of forward peak flux density f (l, x) � = 0 of electrons e0 = 22 mev scattered on al foil 0.16 mm thick as functions of the distance from the beam exit window for two values of beam aperture �1 = 1.57 and �2 = 4.60 degree 0 00. 1 00. 2 00. 3 00. 4 00. 5 00. 6 00. 7 00. 8.00 0 0 1. 0 2. 0 3. 0 4. 0 5. 0 6. 0 7. 0 8. 0 9. 1 al foil thickness [mm]x � h (x ) [d e g ] h e1/� h 0.5� h � 0.95 angul. apert. 2 � f of the faraday cup for 130 cml � 0 2 4 6 8 2 7 12 1 �h h fig. 3: angular apertures �(x)h of the beam width at the heighs h=0.95, 0.5 and 1/e of the amplitude of the gaussian distribution of electrons scattered on al foils of thickness x 1 10 100 0.1 0 2. 0 3. 0 4. 0 5. 0 6. 0 7. al scattering foil thickness [mm]x l � 110 cm l �150 cm e 0 � 22 mev �1 � 1.57° e le c tr o n fl u x d e n s it y [n e c m � 2 s� 1 a � 1 ]� 1 0 � 1 0 fig. 5: theoretical (full line) and experimental values (crosses) of forward peak flux density f (l, x)� = 0 of electrons e0= 22 mev as functions of the thickness of al scattering foils for beam aperture �1 = 1.57 degree at distances 110 to 150 cm from the beam exit window � � � �f l x x l l, . exp ..� � �� �0 14 0 6525 26 92 10 0 00 19 for �2 = 4.60�. (8) the deviations of the experimental points from the theoretical curves are in the order of several percent. the electron beam before scattering has a diameter of 3 to 5 mm and expressions (7) and (8) may be modified if other beam diameters are adjusted at the exit window. the ideal initial conditions supposed at the beginning of this part are not strictly fulfilled and in applying the formulae this must be respected. 4 conclusion the semiempirical expressions (7) and (8) were derived for two beam apertures from measurements in a rather limited range of thickness of al scattering foils and of distances between the beam exit window and the faraday cup. it is reasonable to suppose that similar simple functions can also be constructed from experimental measurements for other beam apertures. the strong dependence of the linear attenuation coefficients of air on beam apertures indicates a very important role of the geometrical configuration of diaphragms in the electron flight path. this role is not easily predictable from theory. it has also not been clarified how far the expressions can be extrapolated beyond the range of the variables for which they were derived. similar expressions for other foil materials can probably be constructed from the same kind of measurements, perhaps excluding materials in which the probabilities of radiation processes and of small angle scattering are comparable. further experiments with an extended range of variables x, l and � would be necessary to elucidate these questions. 
4 Conclusion

The semiempirical expressions (7) and (8) were derived for two beam apertures from measurements in a rather limited range of thicknesses of Al scattering foils and of distances between the beam exit window and the Faraday cup. It is reasonable to suppose that similar simple functions can also be constructed from experimental measurements for other beam apertures. The strong dependence of the linear attenuation coefficients of air on the beam apertures indicates a very important role of the geometrical configuration of the diaphragms in the electron flight path; this role is not easily predictable from theory. It has also not been clarified how far the expressions can be extrapolated beyond the range of the variables for which they were derived. Similar expressions for other foil materials can probably be constructed from the same kind of measurements, perhaps excluding materials in which the probabilities of radiation processes and of small angle scattering are comparable. Further experiments with an extended range of the variables x, l and γ would be necessary to elucidate these questions.

The build-up effect in air evidently destroys the relation resulting from (2) between the half-width of the Gaussian distribution and its peak value. The distribution is no longer Gaussian, and intuitively should be broader. The build-up effect, if not limited by excessively small angular beam apertures, contributes not only to the electron flux density but also to the flattening of its distribution over the irradiation area. At the beginning of our paper we stated that scanning of the beam would be preferable if one intends to minimize electron beam losses while making the field uniform. This is true provided that the electron flight path does not include air layers. Our experiments show not only that scattering in air may reduce the advantage of scanning over the utilization of scattering foils, but also that scanning of the electron beam in air is always accompanied by broad beam build-up effects.

Addendum

This addendum provides a summary of the expressions compiled from [2], [3] and [4] which served as a basis for the calculations of B and θ₁². The physical units are the same as those used by the authors. The expressions are organized in the order that corresponds to the flow of the calculations:

γ₀ = Ze²/(ℏc) = Z/137
ε(x) = ε₀ − 0.153 (sZx/A)(ln ε₀ + 19.45)
f(x) = ∫₀ˣ [ε(x′)]⁻² dx′
ω_b² = 7800 (Z + 1) Z^(1/3) f(x) / [A (1 + 3.35 γ₀²)]
B − ln B = ln ω_b²
k₁ = 4π n₀ Z(Z + 1) e⁴
θ₁² = k₁ f(x) / (mc²)²

e = 4.8 × 10⁻¹⁰ elst. u., mc² = 8.18 × 10⁻⁷ erg = 0.511 MeV, ε₀ = E₀/mc².

x = foil thickness [cm]; Z = atomic number; A = atomic weight; s = density [g/cm³]; E = energy [MeV]; E₀ = initial electron energy [MeV]; ε = energy in electron rest energy units; m = electron mass [g]; e = charge of the electron [elst. u.]; n₀ = atoms/cm³. For Al: Z = 13, A = 26.98, s = 2.699 g/cm³, n₀ = 6.02 × 10²² cm⁻³.

References

[1] Vognar, M., Šimáně, Č., Burian, A., Chvátil, D.: Electron and photon fields for dosimetric metrology generated by electron beams from a microtron. Nuclear Instruments and Methods in Physics Research A 380 (1996), 613–617.
[2] Molière, G.: Z. Naturf. 3a (1948), 78.
[3] Kovalev, V. P.: Vtoritchnye izlutchenija uskoritelej elektronov [Secondary radiation of electron accelerators]. Atomizdat, Moskva, 1979.
[4] Siegbahn, K.: Beta and Gamma Ray Spectroscopy. North Holland Publishing Company, Amsterdam, 1955.
[5] Vognar, M., Šimáně, Č., Chvátil, D.: Faraday cup for electron flux measurements on the microtron MT 25. Submitted for publication in Acta Polytechnica.

Prof. Ing. Čestmír Šimáně, DrSc., Ing. Miroslav Vognar, Ing. David Chvátil
Dept. of Dosimetry & Application of Ionizing Radiation
CTU, Faculty of Nuclear Sciences & Physical Engineering, Břehová 7, 115 19 Praha 1, Czech Republic
phone: +420 2 2323657, +420 2 2315212, fax: +420 2 2320861, e-mail: vognar@br.fjfi.cvut.cz

MAIA: technical development of a novel system for video observations of meteors

S. Vítek, K. Fliegel, P. Páta, P. Koten
Abstract

A system for double station observation of meteors, now known as MAIA (Meteor Automatic Imager and Analyzer), is introduced in this paper. The system is based on two stations with gigabit Ethernet cameras, sensitive image intensifiers and automatic processing of the recorded image data. This paper presents the measured electrooptical characteristics of the components and the overall performance of the new digital system in comparison with the current analog solution.

Keywords: imaging systems, image processing, Ethernet camera, image intensifier, system testing, astronomy, meteors.

1 Introduction

Double station observation of meteors using two video systems coupled with image intensifiers started at the Ondřejov observatory about a decade ago [1]. It was shown that the properties of a system with an image intensifier allow the detection of meteors down to masses of fractions of one gram. Good time resolution of meteor events is provided by the video technique, which enables us to calculate the atmospheric trajectory and many other properties of meteors. However, the precision of the image data captured by video recording, both in spatial resolution and in dynamic range, is lower than with a photographic approach [2]. This paper describes the evolution of the current project to replace the analog S-VHS camcorders with a new design in which gigabit Ethernet cameras are used. The direct digital output will have many advantages in the enhanced parameters of the system, and especially in the advanced automation of the observation process.

Fig. 1: System inner housing with installed components

2 Design of the new system

The design of the new video capturing hardware is based on experience with the current analog system; the main components of the image sensing hardware are the input lens, the image intensifier, the camera lens, and the camera itself. The central part of the system is the XX1332 image intensifier manufactured by Philips (now Photonis). These image intensifiers are characterized by very large input (50 mm) and output (40 mm) apertures, high gain (typically 30 000 to 60 000 lm/lm) and good resolution (typically 30 lp/mm). Since the diameter of the photocathode in the image intensifier is 50 mm and the angle of view for meteor observation should be about 50°, the most suitable focal length of the input lens comes out at about 50 mm. The aperture of the input lens plays an important role in the overall sensitivity and signal-to-noise ratio of the system. After an extensive search, the fast lens Pentax SMC FA 1.4/50 mm was chosen as a compromise between aperture, sharpness and price. This 50 mm lens contains 7 optical elements in 6 groups, offers aperture f/1.4 and an angle of view of 47°. The lens features an SMC multi-layer coating to lower the surface reflection, reduce ultraviolet rays and deliver clear, high-contrast images.

The parameters of a suitable camera for video observation of meteors should be better than those of the analog S-VHS camcorder used in the current system. This means that the camera should offer at least the frame rate and resolution common for the PAL standard, i.e. 50 interlaced fields per second and a digitized resolution of 720 × 576 pixels. These requirements are met by the JAI CM-040GE camera with a 1/2″ progressive scan CCD sensor offering a resolution of 776 × 582 pixels. The gigabit Ethernet interface allows a maximum frame rate of 61.15 fps and 10- or 8-bit output.
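The quoted angles of view follow from the rectilinear thin-lens relation FOV = 2 arctan(d/2f). A quick check of the input lens figure and of the camera lens discussed next (the 43.3 mm value is the standard 35 mm-film frame diagonal over which the 47° specification of the Pentax lens is normally quoted, which is our assumption):

import math

def fov_deg(size_mm, focal_mm):
    """Rectilinear angle of view across a detector dimension size_mm."""
    return math.degrees(2 * math.atan(size_mm / (2 * focal_mm)))

print(f"50 mm lens, 35 mm frame diagonal:   {fov_deg(43.3, 50):.1f} deg")  # ~47
print(f"50 mm lens, 50 mm photocathode:     {fov_deg(50.0, 50):.1f} deg")  # ~53
print(f"12 mm camera lens, 4.83 mm CCD:     {fov_deg(4.83, 12):.1f} deg")  # ~23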
The focal length of the camera lens was selected to get a perfect match between the output screen of the image intensifier (diameter of 40 mm), the height of the CCD (4.83 mm) and a suitable distance between the camera and the image intensifier (about 10 cm). The fast lens Pentax H1214-M 1.4/12 mm was selected; the vertical viewing angle is approximately 22° with the selected camera. The image of the screen can be focused easily on the CCD with the 250 µm spacer ring. The image intensifier screen to focal plane distance is about 120 mm in this configuration.

3 Measurement of the electrooptical characteristics

The spectral transparency of both lenses (input and camera), the spectral sensitivity of the image intensifier, the spectral sensitivity of the camera, the spectrum of the light at the output of the image intensifier, the spatial resolution of the input lens and the image intensifier, and the spatial resolution of the whole system are among the most important parameters tested and presented in this paper.

Fig. 2: The normalized gain of the system (measured at 650 nm), describing the automatic gain control as a nonlinearity in the image intensifier (a); the overall MTF of the system including the camera (solid line) and the partial MTF of the image intensifier with the lens (dashed line) (b)

Fig. 3: Relative spectral response of the camera with (or without) a lens, and the relative power of the light at the output screen of the image intensifier (a); relative overall spectral sensitivity of the system for different digital levels b in the output image (b = 1 white, b = 0 black) (b)

3.1 Spectral response

The spectral response was measured independently for all parts of the system [3]. The experimental setup consisted of the LOT-Oriel collimated halogen light source, the LOT-Oriel Omni 150 computer-controlled monochromator, an expander to get even illumination of the image sensor, and the Avantes AvaSpec-3648 fiber optic spectrometer. The measurement results are shown in Figure 3.

3.2 Spatial resolution

The MTF was measured using a test chart according to ISO 12233. This chart can be used to evaluate the MTF with two different approaches, utilizing a slanted edge (an approx. 5° slightly slanted black bar used to measure the horizontal or vertical spatial frequency response) or a line square wave sweep with the spatial frequency range 100–1 000 LW/PH (line widths per picture height). In our case, slanted edges were used to determine the spatial frequency response; see Figure 2(b).
4 Conclusions

This paper has presented the design of a new system for double station video observation of meteors, now known as MAIA. The basic electrooptical characteristics have been measured, and the functionality of the proposed image sensing part of the system has been verified. The achieved parameters have proved that the proposed system can be used for the intended task. Measurements of the spectral characteristics of the image intensifier and the camera show that the two devices are well matched. The overall spectral response of the system is broad, and it can be used easily in the range 455–845 nm. The time resolution of the selected camera (i.e. 61.15 frames per second) is above the frame rate offered by the analog S-VHS camcorder used in the current system. The only limiting factor is the lower effective spatial resolution of the selected camera (approximately 0.24 megapixels) in comparison with the spatial properties of the image intensifier (0.95 megapixels).

Acknowledgement

This work has been supported by grant no. 205/09/1302 "Study of sporadic meteors and weak meteor showers using automatic video intensifier cameras" of the Grant Agency of the Czech Republic.

References

[1] Koten, P.: Software for processing of meteor video records. Proceedings of the Asteroids, Comets, Meteors 2002 Conference, 197 (2002).
[2] Vítek, S., Koten, P., Páta, P., Fliegel, K.: Double-station automatic video observation of the meteors. Advances in Astronomy 2010 (article ID 943145), 4 pages (2010).
[3] Fliegel, K., Havlín, J.: Imaging photometer with a non-professional digital camera. Proc. SPIE 7443, 74431Q (2009).

Stanislav Vítek (e-mail: viteks@fel.cvut.cz), Karel Fliegel, Petr Páta
Czech Technical University, Faculty of Electrical Engineering, Technická 2, 166 27 Prague, Czech Republic

Pavel Koten
Astronomical Institute of the Academy of Sciences of the Czech Republic, 251 65 Ondřejov, Czech Republic

Physical and electrical properties of yttria-stabilized zirconia thin films prepared by radio frequency magnetron sputtering

Dmitriy A. Golosov, Sergey M. Zavatskiy, Sergey N. Melnikov

Thin Film Research Laboratory, Belarusian State University of Informatics and Radioelectronics, Minsk, Belarus
Corresponding author: dmgolosov@gmail.com

Abstract. This paper presents the electrophysical characteristics of 7 mol.% yttria-stabilized zirconia (YSZ) thin films deposited by radio-frequency magnetron sputtering. In order to form the crystalline structure, the deposited films were annealed in air over the temperature range 700 ÷ 900 °C. By XRD analysis it was established that the as-deposited films were amorphous; they crystallized into a pure cubic structure as a result of annealing in air at temperatures above 820 °C. The electrophysical properties of the YSZ films were investigated on structures such as Ni/YSZ/Pt/Ti/Si and Ni/YSZ/Si. Film characteristics of ε > 20 and tg δ < 0.05 were obtained. Evaluation of the capacity–voltage characteristic proved that the Ni/YSZ/Si structures possess a hysteresis; this hysteresis results from the drift of mobile ions in the YSZ film. The high-temperature ionic conductivity of the stabilized zirconia was determined from measurements of the electric resistivity of the YSZ films at 1 kHz over the temperature range from ambient to 800 °C. The YSZ film conductivity obtained was 1.96 × 10⁻² S/cm at 800 °C.

Keywords: yttria-stabilized zirconia, RF sputtering, X-ray diffraction, dielectric constant, loss tangent, capacity–voltage characteristic, ionic conductivity.
1. Introduction

Yttria-stabilized zirconia (YSZ) is recognized as a very attractive electrical insulator, because it is characterized by high chemical stability, resistivity, and relative dielectric constant. In microelectronics, films of stabilized zirconia are used as buffer layers preventing the functional layers from interacting chemically with silicon, in particular in high-temperature superconductors [2, 8]. Today these films are widely applied as gate dielectrics in field-effect transistors and dynamic random-access memories (DRAM) [15, 11], and also as insulators in silicon-on-insulator structures [7, 5]. Doped zirconia has ionic transport properties at high temperatures, and thus can be used as a solid electrolyte in micro solid-oxide fuel cells (µSOFC) [3, 13, 10] or as the sensitive element in integrated gas sensors [4, 14]. For these devices, the thickness of the solid electrolyte layer must not exceed 1 ÷ 2 µm in order to reduce ohmic loss and decrease the operating temperature down to 500 ÷ 600 °C [6]. The requirements imposed on the solid electrolyte are rather tough: it has to be mechanically strong and chemically reliable at high temperatures, mechanically and chemically stable in time, should have maximal ionic and minimal electronic conductivity, and also be gas-proof. The aim is therefore to obtain and study the properties of doped zirconia films grown by plasma deposition methods [1]. RF magnetron sputtering, unlike other techniques, enables dense uniform films to be deposited at comparatively low temperatures; it features stability of the process and the possibility of stand-alone control of the deposition parameters, and it supports deposition over large-area substrates [9]. However, the electrophysical properties of RF sputtered doped zirconia thin films need further study. In the present paper, the authors report the synthesis of YSZ thin films by RF magnetron sputtering and the results of a study of the structural and electrical properties of the YSZ films.

2. Experiment

Figure 1 illustrates our experimental installation for thin-film YSZ deposition by RF magnetron sputtering. The installation is based on the Leybold–Heraeus A550 VZK exhaust cart. The vacuum chamber is equipped with an external flanged Hall current ion source, used for substrate pre-cleaning. The originally designed RIF.039.001 magnetron sputtering system with a ⌀ 39 mm target was used to sputter a ceramic ZrO₂ + 7 mol.% Y₂O₃ target. The magnetron was mounted in place of the target unit in the ion source. A 13.56 MHz RF power source with 1300 W maximal power output was used as the power supply.

Figure 1. Experimental installation for deposition of thin-film doped zirconia by RF magnetron sputtering: MSS – magnetron sputtering system, IS – ion source, MFC – mass flow controller.

Monocrystalline superalloyed silicon Si (100) and structures of Si₃N₄ (1 µm)/Si and Pt (150 nm)/Ti (100 nm)/Si were used as substrates. In the series of experiments, the substrates were mounted in a rotating substrate holder. The vacuum chamber was pumped down to a residual pressure of 8 × 10⁻⁴ Pa. The substrates were pre-cleaned with an ion beam; for this purpose, Ar was added into the gas distribution system of the ion source at a flow rate of q_Ar = 10 sccm.
During all the experiments, the cleaning time, ion energy and discharge current were kept constant at 3 min, 700 eV, and 40 mA, respectively. Then the targets were cleaned of impurities, for which the substrates were taken away from the deposition zone and the working gases were supplied directly into the magnetron discharge zone (an Ar/O₂ mixture). The argon flow rate was q_Ar = 50 sccm, while the oxygen flow rate was q_O₂ = 10 sccm. The target cleaning modes for all the experiments were constant: forward power P_f = 70 W, reflected power P_r = 5 W, and a sputtering time of 20 min. Then the substrates were placed into the deposition zone. The total gas flow rate was kept constant at a level of 60 sccm. The oxygen content in the gas mixture was varied from 0 to 30 %, whereas the pressure in the chamber was maintained at 0.2 Pa. The discharge power stabilization mode with 125 W or 80 W of forward power was used for thin-film deposition, while the level of the reflected power did not exceed 10 % of the forward power. The deposition time in all the experiments was kept constant at 180 min. The substrate was placed 82 mm away from the target surface. The thickness of the deposited films varied within 200 ÷ 400 nm, depending on the oxygen content in the Ar/O₂ gas mixture. The deposited films were then annealed in air, using the Isoprin IR heating system. The annealing temperature varied within the temperature range 700 ÷ 900 °C (annealing time 15 min).

The phase composition of the YSZ films was determined by means of the X-ray diffraction (XRD) method, using the DRON-3 system in Cu Kα radiation. The X-ray patterns were obtained at a rate of 60°/h in the angular range 2θ = 20 ÷ 80°, at room temperature. The thickness of the deposited layers was determined with the POI-08 optical interferometric profilometer.

The capacitor structures (Fig. 2) were formed to measure the electrophysical characteristics of the YSZ films. The doped zirconia thin film (200 ÷ 400 nm in thickness) was deposited onto superalloyed silicon Si (100) and Pt/Ti/Si structures, followed by film annealing. The upper Ni electrode was deposited by ion-beam sputtering through a mask. The capacitors obtained were 0.8 × 0.8 mm.

Figure 2. Ni/YSZ/Pt capacitor structure: layer sequence Ni / ZrO₂+Y₂O₃ / Pt / Ti / SiO₂ on an Si (100) substrate.

The capacitance, the dielectric loss tangent, and the C–V characteristics were obtained using the LCR meter E7-20 at frequencies between 25 Hz and 1.0 MHz. The dielectric constant values were calculated on the basis of the thickness and capacitance of the YSZ layer, using the formula ε = Cd/(ε₀S), where ε₀ = 8.85 × 10⁻¹² F/m and S is the capacitor area, S = 6.4 × 10⁻⁷ m². The ionic conductivity of the stabilized zirconia was determined through measurements of the YSZ film electric resistance at 1 kHz over the temperature range from room temperature up to 800 °C.
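The conversion from measured capacitance to dielectric constant is a one-liner. A sketch using the stated electrode area, with a film thickness and capacitance chosen purely for illustration:

EPS0 = 8.85e-12          # F/m
AREA = 6.4e-7            # m^2, the 0.8 x 0.8 mm capacitor

def dielectric_constant(c_farad, d_m):
    """epsilon = C d / (eps0 S) for a parallel-plate capacitor."""
    return c_farad * d_m / (EPS0 * AREA)

# Hypothetical reading: 300 pF measured across a 300 nm thick YSZ film.
print(f"eps = {dielectric_constant(300e-12, 300e-9):.1f}")   # ~15.9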
the dependence of the phase composition of the ysz films on the annealing temperature was studied, with si3n4/si (100) structures used as substrates. a silicon nitride layer 1 µm in thickness was formed on the surface of the monocrystalline silicon by the cvd method. yttria-stabilized zirconia thin films were deposited up to a thickness of more than 500 nm under the following conditions: forward power 125 w, reflected power 10 w, qar = 35 sccm, qo2 = 25 sccm. then the samples were annealed in air over the temperature range of 700 ÷ 900 ◦c, for 15 min.

figure 3. ysz film deposition rate as a function of oxygen percentage in the ar/o2 gas mixture at various forward power values: a – 80 w, b – 125 w.

xrd was used to analyze the positions and the intensity of the peaks of the respective zirconia phases (fig. 4). it was determined that, regardless of the deposition conditions, broad weakly defined peaks with a high noise level were present in the x-ray patterns of the as-deposited films, implying that the films were originally almost non-crystalline and/or amorphous (fig. 4a). the positions of the peaks corresponded to the monoclinic modification of the crystalline structure. annealing at temperatures up to 800 ◦c resulted in an increase in the intensity of the (−111) and (022) peaks of the monoclinic modification of the crystal lattice, and evoked cubic lattice peaks (fig. 4b, c). the peak positions in the films annealed at temperatures above 820 ◦c corresponded to the pure cubic modification of the crystal lattice. the (111), (200), (220), (311), and (400) cubic peaks, with a preferred (200) orientation, were observed (fig. 4d).

the frequency curves for the dielectric constant and the dielectric loss tangent of the ysz films were obtained. figures 5 and 6 plot the dielectric constant and the loss tangent curves of the deposited ysz films in ni/ysz/pt and ni/ysz/si structures. the following thin-film ysz deposition conditions were applied: forward power 125 w, reflected power 11 w, qar = 50 sccm, qo2 = 10 sccm. a dielectric constant ε = 6.0 and a loss tangent tg δ = 0.11 were observed at a frequency of 1 mhz for the deposited films in ni/ysz/pt structures, whereas ε = 13.2 and tg δ = 0.4 were observed at 1 khz. ni/ysz/si structures exhibited a dielectric constant ε = 6.0 and a loss tangent tg δ = 0.06 at 1 mhz, and ε = 6.8 and tg δ = 0.34 at 1 khz. at frequencies below 500 hz, the capacitor structures were characterized by high values of dielectric loss and electric conductivity. after annealing in air at temperatures exceeding 820 ◦c, as the ysz cubic structure was formed, the dielectric constant of the films increased, whereas the loss tangent dropped. figures 7 and 8 plot the dielectric constant and loss tangent curves for ni/ysz/pt and ni/ysz/si structures annealed in air at a temperature of 850 ◦c (annealing time 15 min). the ysz films deposited on the pt electrode demonstrate a dielectric constant ε > 20 and a loss tangent tg δ < 0.05 over the 25 hz ÷ 1.0 mhz frequency range.

figure 4. xrd patterns of the ysz films annealed at different temperatures: a – as-deposited, b – 700 ◦c, c – 800 ◦c, d – 900 ◦c (annealing time 15 min).

figure 5.
dielectric constant and loss tangent vs. frequency curves for as-deposited ysz films in ni/ysz/pt structures.

figure 6. dielectric constant and loss tangent vs. frequency curves for as-deposited ysz films in ni/ysz/si structures.

ysz films deposited on an si substrate showed, in that frequency range, a dielectric constant as low as ε = 13 ÷ 15 and a loss tangent tg δ = 0.04 ÷ 0.1.

figure 7. dielectric constant and loss tangent vs. frequency curves for ni/ysz/pt structures annealed in air at 850 ◦c (annealing time 15 min).

figure 8. dielectric constant and loss tangent vs. frequency curves of ni/ysz/si structures annealed in air at 850 ◦c (annealing time 15 min).

the c–v relationships for ni/ysz/pt and ni/ysz/si capacitor structures annealed at 850 ◦c were measured at dc bias up to ±10 v (fig. 9). it was established that the capacitance of the ni/ysz/pt structures did not depend on the bias voltage, which is typical for conventional dielectrics. as for the ni/ysz/si structures, measurements of the capacitance–voltage relationships showed that these structures exhibit hysteresis (fig. 9b). capacitance variation at dc bias is typical of metal–insulator–semiconductor structures and can be accounted for by charge carrier drift at the ysz/si boundary under the applied electric field. the measured p–e characteristics of the ysz layers showed the absence of dielectric polarization, indicating that the deposited ysz films behave as linear dielectrics.

figure 9. c–v characteristics of ni/ysz/pt (a) and ni/ysz/si (b) structures annealed in air at 850 ◦c (annealing time 15 min).

the high-temperature ionic conductivity of the stabilized zirconia was determined by measuring the electric resistance of the ni/ysz/pt capacitor structures annealed at 850 ◦c, at a frequency of 1 khz, with the temperature varied from ambient up to 800 ◦c. it was established that, as the temperature was increased, the film conductivity also grew. when the substrate temperature was 800 ◦c, the ionic conductivity of the ysz films reached 0.0196 s/cm (fig. 10). for comparison: the ionic conductivity of bulk ysz samples is about 0.025 s/cm at 800 ◦c [12].

figure 10. ionic conductivity of the ysz films in ni/ysz/pt structures annealed in air at 850 ◦c (annealing time 15 min) as a function of temperature (f = 1.0 khz).
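the conductivity quoted above is extracted from the measured resistance of the capacitor structure; a minimal sketch of that conversion, assuming a simple parallel-plate geometry with the electrode area given in the experiment section (the resistance and thickness values in the example are illustrative only):

```python
# through-thickness conductivity of the ysz layer, sigma = d/(r*s),
# converted from s/m to the s/cm units used in the text
S = 6.4e-7                      # electrode area, m^2

def conductivity_s_per_cm(resistance_ohm, thickness_m):
    sigma_si = thickness_m / (resistance_ohm * S)   # s/m
    return sigma_si / 100.0                         # s/cm

# e.g. a 300 nm film measuring ~0.24 ohm at 800 c gives ~0.02 s/cm
print(conductivity_s_per_cm(0.24, 300e-9))
```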
4. conclusion
the rf magnetron sputtering method was used to deposit thin films of 7 mol.% yttria-stabilized zirconia. by means of xrd analysis, it was established that the as-deposited films were amorphous; they crystallized into the pure cubic structure after being annealed in air at temperatures above 820 ◦c. the annealing treatment of ysz films at temperatures above 700 ◦c causes the dielectric constant to rise and the dielectric loss tangent to go down. the films that are obtained are characterized by ε > 20 and tg δ < 0.05. measurements of the capacitance–voltage relationships showed that the ni/ysz/si structures exhibit hysteresis, resulting from the drift of mobile ions in the ysz film. it was found that the high-temperature ionic conductivity of thin-film ysz grows with the temperature and reached 0.0196 s/cm at a substrate temperature of 800 ◦c. thus, annealed ysz films can be used as electrical insulators at room temperature, and as an efficient ion conductor at high temperatures.

references
[1] j. w. bae, et al. characterization of yttria-stabilized zirconia thin films prepared by radio frequency magnetron sputtering for a combustion control oxygen sensor. journal of the electrochemical society 147(6):2380–2384, 2000.
[2] a. barda, et al. initial stages of epitaxial growth of y-stabilized zro2 thin films on a-siox/si(001) substrates. journal of applied physics 75(6):2902–2910, 1994.
[3] d. beckel, et al. thin films for micro solid oxide fuel cells. journal of power sources 173:325–345, 2007.
[4] a. dubbe. fundamentals of solid state ionic micro gas sensors. sensors and actuators b 88:138–148, 2003.
[5] m. hartmanova, et al. characterization of ceria/yttria stabilized zirconia grown on silicon substrate. thin solid films 345:330–337, 1999.
[6] j. van herle, et al. concept and technology of sofc for electric vehicles. solid state ionics 132:333–342, 2000.
[7] t. koch, p. ziemann. effects of ion-beam-assisted deposition on the growth of zirconia films. thin solid films 303:122–127, 1997.
[8] l. mechin, et al. double ceo2/ysz buffer layer for the epitaxial growth of yba2cu3o7−δ films on si(001) substrates. physica c: superconductivity and applications 269(1–2):124–130, 1996.
[9] l. r. pederson, p. singh, x.-d. zhou. review: application of vacuum deposition methods to solid oxide fuel cells. vacuum 80:1066–1083, 2006.
[10] l. r. pederson, p. singh, x.-d. zhou. application of vacuum deposition methods to solid oxide fuel cells. review. vacuum 80:1066–1083, 2006.
[11] k. sasaki, et al. limited reaction growth of ysz (zro2:y2o3) thin films for gate insulator. vacuum 66:403–408, 2002.
[12] s.c. singhal. high-temperature solid oxide fuel cells: fundamentals, design and applications. elsevier science, oxford, 2003.
[13] j. will, et al. fabrication of thin electrolytes for second-generation solid oxide fuel cells. solid state ionics 131:79–96, 2000.
[14] s. yu, et al. development of a silicon-based yttria-stabilized-zirconia (ysz) amperometric oxygen sensor. sensors and actuators b 85:212–218, 2002.
[15] j. zhu, z.g. liu. dielectric properties of ysz high-k thin films fabricated at low temperature by pulsed laser deposition. materials letters 57(26–27):4297–4301, 2003.

acta polytechnica vol. 52 no.
6/2012 a micromechanics-based model for stiffness and strength estimation of cocciopesto mortars václav nežerka, jan zeman department of mechanics, faculty of civil engineering, czech technical university in prague, thákurova 7, 166 29 praha 6, czech republic corresponding author: vaclav.nezerka@fsv.cvut.cz, zemanj@cml.fsv.cvut.cz abstract the purpose of this paper is to propose an inexpensive micromechanics-based scheme for stiffness homogenization and strength estimation of mortars containing crushed bricks, known as cocciopesto. the model utilizes the mori-tanaka method for determining the effective stiffness, combined with estimates of quadratic invariants of the deviatoric stresses inside phases to predict the compressive strength. special attention is paid to the representation of the c-s-h gel layer around bricks and the interfacial transition zone around sand aggregates, which renders the predictions sensitive to particle sizes. several parametric studies are performed to demonstrate that the method correctly reproduces the data and trends reported in the available literature. moreover, the model is based exclusively on parameters with a clear physical or geometrical meaning, and as such it provides a convenient framework for its further experimental validation. keywords: micromechanics, homogenization, strength estimation, cocciopesto, c-s-h gel coating, interfacial transition zone. 1 introduction the use of lime as a binder in mortars is associated with well-known inconveniences, such as slow setting and carbonation, high drying shrinkage and porosity, and low mechanical strength [1]. although these limitations have been overcome with the use of portland cement in the last 50 years, lime mortars still find use in the restoration of historic structures. this is mainly due to their superior compatibility with the original materials, in contrast to many modern renovation render systems, e.g. [16, 17, 29]. the mechanical properties of lime mortars can be improved by the suitable design of the mixture. the phoenicians were probably the first ones to add crushed clay products, such as burnt bricks, tiles or pieces of pottery, to lime mortars to increase their durability and strength. the romans called this material cocciopesto and utilized this mortar in areas where other natural pozzolans were not available. cocciopesto-based structures exhibit increased ductility, leading to their remarkable resistance to earthquakes [3, 21]. much later, it was found that the mortars containing crushed clay bricks, burnt at 600–900°c, exhibit a hydraulic character, manifested by the formation of a thin layer of calcium-silicate-hydrate (c-s-h) gel at the lime-brick interface [20]. since c-s-h gel is the key component responsible for the favorable mechanical performance of portland cement pastes [23], it is generally conjectured that the enhanced performance of cocciopesto mortars can be attributed to the high strength and stiffness of the c-s-h gel coating [3, 20, 21, 29]. this mechanism competes with the formation of the interfacial transition zone (itz) at the matrix-aggregate interface, which is known to possess higher porosity and thus lower stiffness in cement-based mortars, e.g. [24, 28, 36]. the purpose of this work is to interpret these experimental findings by a micromechanical model based on the mori-tanaka method [19], motivated by its recent applications to related material systems. 
these include estimates of the effective thermal conductivity of rubber-reinforced cement composites [31], elasticity predictions for early-age cement [5] or alkali-activated [35] pastes, upscaling the compressive strength of cement mortars [26], and multi-scale simulations of three-point bending tests of concrete specimens [34]. here, we exploit these developments to propose a simple analytical model for the stiffness and strength estimation of cocciopesto mortars in section 2. in particular, the elasticity predictions utilize benveniste's reformulation [4] of the mori-tanaka method [19], whereas the strength predictions build on recent results by pichler and hellmich [26], who demonstrated that compressive strength is closely related to the quadratic average of the deviatoric stress in the weakest phase. particular attention is paid to the representation of the coatings by c-s-h gel and itz, which renders the predictions sensitive to the size of the brick particles and aggregates. in section 3, we verify predictions of the proposed scheme against data available in the open literature. these findings are summarized in section 4, mainly as a support for future validation of the model against experimental results. finally, in appendix a we gather the technical details needed to account for coated inclusions, in order to make the paper self-contained.

figure 1: scheme of the micromechanics-based model: matrix (0), void (1), brick (2), c-s-h (3), sand (4), itz (5). the numbers in parentheses refer to the indexes of the individual phases.

in what follows, the mandel representation of symmetric tensorial quantities is systematically employed, e.g. [18, p. 23]. in particular, italic letters refer to scalar quantities, and boldface letters denote vectors or matrix representations of second- or fourth-order tensors. a^t and a^(−1) denote the matrix transpose and the inverse matrix. other symbols and abbreviations are introduced in the text when needed.

2 model
we consider a composite sample occupying domain ω, composed of n distinct phases indexed by r. the value r = 0 is reserved for the matrix phase, and r = 1, . . . , n refer to heterogeneities having the shape of a sphere or a spherical shell, see fig. 1. the volume fraction of the r-th phase is defined as c^(r) = |ω^(r)|/|ω|, where |ω^(r)| denotes the volume occupied by the r-th phase, and the geometry of the coated particles is specified by their radii r^(r) for r = 2, . . . , 5, fig. 2. several comments are now in order concerning the simplifications adopted in the model. first, brick particles and voids are considered spherical in shape, instead of more realistic ellipsoids, as in e.g. [27, 26]. this step is known to introduce only minor errors in the prediction of the overall transport [31] or elastic [27] properties. as demonstrated by pichler et al. [27], the up-scaled strength is more sensitive to the shape of inhomogeneities, but the model is still capable of predicting the correct trends. second, the itz is taken as homogeneous and is not resolved down to the level of micro-heterogeneities. this arises from the fact that, in contrast to cement-based materials [24, 28], we are currently not aware of any work studying the structure of the itz in lime-based mortars, so input data for a more detailed representation are not available. third, only a monodisperse distribution of particles is assumed; polydispersity can be incorporated by simple averaging arguments and results only in a moderate increase in accuracy [31, and references therein].
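since the phase data of table 1 below are given as engineering constants e and ν, while the model works with bulk and shear moduli, the sketches accompanying the following subsections assume the standard conversion quoted later in section 3; a minimal helper (the example values are the lime-matrix entries from table 1):

```python
# bulk and shear moduli from young's modulus and poisson's ratio,
# k = e/(3(1-2nu)), g = e/(2(1+nu))
def bulk_shear(e, nu):
    k = e / (3.0 * (1.0 - 2.0 * nu))
    g = e / (2.0 * (1.0 + nu))
    return k, g

k0, g0 = bulk_shear(2000.0, 0.25)   # lime matrix: k ~ 1333 mpa, g = 800 mpa
```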
the elastic properties of the individual phases are specified by the material stiffness matrix l^(r). as each phase is assumed to be homogeneous and isotropic, we have
\[ l^{(r)} = 3k^{(r)} i_{\mathrm{v}} + 2g^{(r)} i_{\mathrm{d}} \quad \text{for } r = 0, \ldots, n, \tag{1} \]
where k^(r) and g^(r) are the bulk and shear moduli of the r-th phase, and i_v and i_d denote the orthogonal projections to the volumetric and deviatoric components, e.g. [13], so that
\[ \varepsilon(x) = (i_{\mathrm{v}} + i_{\mathrm{d}})\,\varepsilon(x) = \varepsilon_{\mathrm{v}}(x)\,\mathbf{1} + \varepsilon_{\mathrm{d}}(x), \tag{2a} \]
\[ \sigma(x) = (i_{\mathrm{v}} + i_{\mathrm{d}})\,\sigma(x) = \sigma_{\mathrm{v}}(x)\,\mathbf{1} + \sigma_{\mathrm{d}}(x), \tag{2b} \]
for x ∈ ω. in eq. (2), ε and σ refer to the local strains and stresses, ε_v and ε_d are the volumetric and deviatoric strain components, σ_v and σ_d refer to the corresponding stress components, and 1 is the second-order unit tensor (in the matrix representation).

the development of the model follows the standard routine of continuum micromechanics, e.g. [37]. the sample ω is subjected to the overall strain loading e. neglecting the interaction among the phases, the mean strains inside the heterogeneities are obtained as e^(r) = a^(r)_dil e for r = 1, . . . , n, where a^(r)_dil is the dilute concentration factor of the r-th phase, see section 2.1. in section 2.2, after accounting for the phase interaction, these are combined into the full concentration factors satisfying e^(r) = a^(r) e for r = 0, . . . , n, utilized next to estimate the overall stiffness of the composite material, l_eff. moreover, as outlined in section 2.3, the expression for the overall stiffness also encodes the mean value of the quadratic invariant of the local stress deviator σ_d, defined as
\[ j_2^{(r)} = \sqrt{\frac{1}{2|\omega^{(r)}|} \int_{\omega^{(r)}} \sigma_{\mathrm{d}}(x)^{\mathsf{t}} \sigma_{\mathrm{d}}(x)\,\mathrm{d}x}, \tag{3} \]
which can be directly used to estimate the overall strength of the material.

2.1 dilute concentration factors
due to the geometrical and material isotropy of the individual phases, the dilute concentration factors attain a form analogous to (1):
\[ a^{(r)}_{\mathrm{dil}} = a^{(r)}_{\mathrm{dil,v}} i_{\mathrm{v}} + a^{(r)}_{\mathrm{dil,d}} i_{\mathrm{d}} \quad \text{for } r = 1, \ldots, n. \tag{4} \]

figure 2: scheme of a single-layer inclusion.

the expressions for the components are given separately for the uncoated (r = 1) and coated (r = 2, . . . , 5) particles. namely, in the first case it holds
\[ a^{(1)}_{\mathrm{dil,v}} = \frac{k^{(0)}}{k^{(0)} + \alpha^{(0)}\left(k^{(1)} - k^{(0)}\right)}, \qquad a^{(1)}_{\mathrm{dil,d}} = \frac{g^{(0)}}{g^{(0)} + \beta^{(0)}\left(g^{(1)} - g^{(0)}\right)}, \]
with the auxiliary factors following from the eshelby solution [9] in the form
\[ \alpha^{(0)} = \frac{1 + \nu^{(0)}}{3\left(1 - \nu^{(0)}\right)}, \qquad \beta^{(0)} = \frac{2\left(4 - 5\nu^{(0)}\right)}{15\left(1 - \nu^{(0)}\right)}, \]
where ν^(0) is the poisson ratio of the matrix phase. the coated case is more involved, and was first solved in its full generality by herve and zaoui [10] for a multi-layered spherical inclusion. to apply their results in the current setting, we locally number the phases by the index i = [i1, i2, i3]^t, see fig. 2, where i = [2, 3, 0]^t for the brick–c-s-h conglomerate and i = [4, 5, 0]^t refers to a sand particle coated by itz. now, we have
\[ a^{(i_1)}_{\mathrm{dil,v}} = \frac{1}{q^{(2)}_{11}}, \qquad a^{(i_2)}_{\mathrm{dil,v}} = \frac{q^{(1)}_{11}}{q^{(2)}_{11}}, \tag{5} \]
and
\[ a^{(i_1)}_{\mathrm{dil,d}} = a_1 - \frac{21}{5}\,\frac{r^{(i_1)2}}{1 - 2\nu^{(i_1)}}\,b_1, \qquad a^{(i_2)}_{\mathrm{dil,d}} = a_2 - \frac{21}{5}\,\frac{r^{(i_2)5} - r^{(i_1)5}}{\left(1 - 2\nu^{(i_2)}\right)\left(r^{(i_2)3} - r^{(i_1)3}\right)}\,b_2, \tag{6} \]
where the auxiliary factors are provided in appendix a.
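a minimal sketch of the uncoated-inclusion formulas of section 2.1 just given (the coated case would additionally require the herve-zaoui factors of appendix a); the phase values in the example call are illustrative:

```python
# dilute volumetric/deviatoric strain concentration factors for an
# uncoated spherical inclusion (phase 1) embedded in the matrix (phase 0)
def dilute_uncoated(k0, g0, nu0, k1, g1):
    alpha0 = (1.0 + nu0) / (3.0 * (1.0 - nu0))                # eshelby, volumetric
    beta0 = 2.0 * (4.0 - 5.0 * nu0) / (15.0 * (1.0 - nu0))    # eshelby, deviatoric
    a_v = k0 / (k0 + alpha0 * (k1 - k0))
    a_d = g0 / (g0 + beta0 * (g1 - g0))
    return a_v, a_d

# e.g. a void-like inclusion (k1 = g1 ~ 0) in the lime matrix
a_v, a_d = dilute_uncoated(1333.0, 800.0, 0.25, 1e-9, 1e-9)   # ~2.25, ~1.96
```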
2.2 stiffness estimates
in benveniste's [4] interpretation of the original mori-tanaka method [19], the mutual interaction among the heterogeneities is modeled by loading each particle by the average strain in the matrix phase, e^(0), instead of e. for this purpose, we relate e^(0) to e by a strain compatibility condition, valid under the dilute approximation,
\[ e = \Big( c^{(0)} i + \sum_{r=1}^{n} c^{(r)} a^{(r)}_{\mathrm{dil}} \Big)\, e^{(0)}, \]
from which we express the full concentration factors as
\[ a^{(0)} = \Big( c^{(0)} i + \sum_{r=1}^{n} c^{(r)} a^{(r)}_{\mathrm{dil}} \Big)^{-1}, \qquad a^{(r)} = a^{(r)}_{\mathrm{dil}}\, a^{(0)} \quad \text{for } r = 1, \ldots, n. \]
utilizing the universal relation
\[ l_{\mathrm{eff}} = \sum_{r=0}^{n} c^{(r)} l^{(r)} a^{(r)}, \]
we can see that the effective stiffness inherits the symmetry of the individual phases (1), with
\[ k_{\mathrm{eff}} = \frac{c^{(0)} k^{(0)} + \sum_{r=1}^{n} c^{(r)} k^{(r)} a^{(r)}_{\mathrm{dil,v}}}{c^{(0)} + \sum_{r=1}^{n} c^{(r)} a^{(r)}_{\mathrm{dil,v}}}, \tag{7a} \]
\[ g_{\mathrm{eff}} = \frac{c^{(0)} g^{(0)} + \sum_{r=1}^{n} c^{(r)} g^{(r)} a^{(r)}_{\mathrm{dil,d}}}{c^{(0)} + \sum_{r=1}^{n} c^{(r)} a^{(r)}_{\mathrm{dil,d}}}. \tag{7b} \]

2.3 strength estimates
as was first recognized by kreher [14], the fluctuations of the stresses and strains in the individual phases can be estimated from the energy conservation condition due to hill [11]:
\[ e^{\mathsf{t}} l_{\mathrm{eff}}\, e = \frac{1}{|\omega|} \sum_{r=0}^{n} \Big( 9k^{(r)} \int_{\omega^{(r)}} \varepsilon_{\mathrm{v}}^2(x)\,\mathrm{d}x + 2g^{(r)} \int_{\omega^{(r)}} \varepsilon_{\mathrm{d}}^{\mathsf{t}}(x)\,\varepsilon_{\mathrm{d}}(x)\,\mathrm{d}x \Big), \tag{8} \]
expressing the conservation of energy between the macroscale, due to e, and the average local values due to ε_v and ε_d. differentiating (8) with respect to g^(r), we obtain
\[ e^{\mathsf{t}} \frac{\partial l_{\mathrm{eff}}}{\partial g^{(r)}}\, e = \frac{2}{|\omega|} \int_{\omega^{(r)}} \varepsilon_{\mathrm{d}}^{\mathsf{t}}(x)\,\varepsilon_{\mathrm{d}}(x)\,\mathrm{d}x, \quad \text{for } r = 0, \ldots, n. \]
next, we recognize that σ_d(x) = 2g^(r) ε_d(x) inside ω^(r), and recall the definition of the quadratic invariant (3), to arrive at
\[ j_2^{(r)} = g^{(r)} \sqrt{\frac{1}{c^{(r)}}\, e^{\mathsf{t}} \frac{\partial l_{\mathrm{eff}}}{\partial g^{(r)}}\, e}. \tag{9} \]
as was thoroughly demonstrated by pichler et al. [27] and by pichler and hellmich [26], this quantity is closely related to the compressive strength f_c of cement pastes at various degrees of hydration. here, we postulate that
\[ \frac{f_c(p_1)}{f_c(p_2)} \approx \frac{j_2^{(w)}(p_2)}{j_2^{(w)}(p_1)}, \tag{10} \]
where w = 0, . . . , n is the index of the weakest phase and p refers to a parameter characterizing the mixture composition; see the next section for concrete examples.
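a compact sketch of the scalar mori-tanaka estimates of eq. (7); each heterogeneity is described by its volume fraction, moduli and dilute factors (e.g. from the uncoated-sphere sketch above), and the input values are purely illustrative:

```python
# mori-tanaka effective bulk and shear moduli, eq. (7)
def mori_tanaka(c0, k0, g0, inclusions):
    """inclusions: iterable of (c, k, g, a_v, a_d) tuples."""
    num_k, den_k = c0 * k0, c0
    num_g, den_g = c0 * g0, c0
    for c, k, g, a_v, a_d in inclusions:
        num_k += c * k * a_v
        den_k += c * a_v
        num_g += c * g * a_d
        den_g += c * a_d
    return num_k / den_k, num_g / den_g

# e.g. 35 % voids in the lime matrix, dilute factors from the previous sketch
k_eff, g_eff = mori_tanaka(0.65, 1333.0, 800.0,
                           [(0.35, 1e-9, 1e-9, 2.25, 1.956)])
```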
3 results and discussion
the purpose of this section is to examine the trends in the mechanical properties as predicted by the proposed scheme. the default data for the individual phases, summarized in table 1, were assembled from the open literature and complemented with our own, yet unpublished, measurements. note that the matrix–brick–sand fractions correspond to a typical composition of historic lime mortars [3, 2], and that the engineering constants e and ν are connected to the bulk and shear moduli through the well-known relations, e.g. [18, p. 23],
\[ k = \frac{e}{3(1 - 2\nu)}, \qquad g = \frac{e}{2(1 + \nu)}. \]

table 1: reference properties of individual phases; ρ denotes the mass density, f_t is the tensile strength, m is the mass fraction, and the radii r are defined according to fig. 2.
r | phase | ρ [kg/m3] | e [mpa] | ν [-] | f_t [mpa] | m [-] | r [µm] | note
0 | pure lime matrix | 1200^a | 2000 [8] | 0.25 [8] | 0.4 [8] | 3 | × |
1 | voids | × | 10^−9 | 0.25 | × | × | 0.1–100 [22] | c^(1) = 35 %^a
2 | clay brick | 2300^a | 5000^a | 0.17^a | 3.2^a | 1 | 500 |
3 | c-s-h gel | 2000 [12] | 22000 [7] | 0.2 [7] | × | × | 510 [6] |
4 | siliceous sand | 2700^a | 60000 [34] | 0.17 [34] | 48 [23] | 1 | 500 |
5 | itz | 1200^b | 500^c | 0.25^b | × | × | 520^c |
^a our own (unpublished) data. densities and porosity were measured by a pycnometer, elastic constants were determined from strain-gauge data in a compression test, and the tensile strength follows from a unidirectional tensile test.
^b same value as for the lime matrix.
^c set as in [36] for cement-based concretes, i.e., the young modulus to 20–40 % of the value for the matrix phase and the thickness to 20 µm.

given the data in table 1, the volume fractions of the individual phases are determined from six independent conditions. the first two relate the volume fractions of the brick and sand particles and their coatings by
\[ c^{(3)} = \Big( \Big(\frac{r^{(3)}}{r^{(2)}}\Big)^3 - 1 \Big)\, c^{(2)}, \qquad c^{(5)} = \Big( \Big(\frac{r^{(5)}}{r^{(4)}}\Big)^3 - 1 \Big)\, c^{(4)}. \]
next, we enforce the values of the mass fractions via
\[ c^{(0)} = \frac{m^{(0)} \rho^{(2)}}{m^{(2)} \rho^{(0)}}\, c^{(2)}, \qquad c^{(0)} = \frac{m^{(0)} \rho^{(4)}}{m^{(4)} \rho^{(0)}}\, c^{(4)}, \]
where m^(r) and ρ^(r) denote the mass fraction and the mass density of the r-th phase, respectively. since c^(1) is given, the remaining condition is provided by \( \sum_{r=0}^{5} c^{(r)} = 1 \), and the phase volume fractions follow as the solution of a system of 6 × 6 linear equations.

in the sensitivity analyses, motivated by experimental findings in e.g. [3, 33, 32], we assume that an increase in the c-s-h gel volume fraction (∆c^(3)) is compensated by the corresponding changes for the matrix (∆c^(0)), voids (∆c^(1)), and clay bricks (∆c^(2)), so that ∆c^(0) + ∆c^(1) + ∆c^(2) + ∆c^(3) = 0, where we set for simplicity ∆c^(1) = ∆c^(2) = ∆c^(3). by analogy, the increase in the volume fraction of the itz corresponds to the decrease in the volume of the matrix phase: ∆c^(0) + ∆c^(5) = 0. in the strength estimates, the imposed loading simulates the uniaxial compression test, for which σ = [−1, 0, 0, 0, 0, 0]^t and the average strain follows from e = (l_eff)^−1 σ. we assume that the itz is the weakest phase, i.e. w = 5 in eq. (10), and, similarly to [27], we estimate the derivative in eq. (9) by a forward difference with a step size of ∆g^(5) = 1 pa. (our results are reproducible with a matlab code, homogenizator mt, freely available at http://mech.fsv.cvut.cz/~nezerka/software.)

figure 3: influence of the coating thickness on the dilute concentration factors for (a) brick and (b) sand particles. the vertical lines refer to the default thicknesses that are kept constant in the remaining sensitivity analyses.

figure 4: (a) stiffness-porosity and (b) strength-porosity relations.

3.1 effect of coatings
the first aspect that we would like to discuss is the effect of the coating on the dilute concentration factors of the brick and sand particles. fig. 3 demonstrates that, in terms of the volumetric phase strains, the effects of c-s-h and itz are comparable, despite the fact that the c-s-h is stiffer and the itz is more compliant than the matrix phase. the differences in the deviatoric part, which drive the strength estimates according to eq. (10), become more pronounced with increasing thickness. this indicates that the contributions of the brick and sand particles to the overall properties might still be different, after accounting for the phase properties and their interaction; see section 3.3 for a further discussion.

3.2 influence of porosity
by analogy with cement pastes, porosity has a major influence on the overall properties of lime-based mortars. this is confirmed by the results of the proposed model, shown in fig. 4. as for the overall stiffness, for the realistic range of porosities of 25–40 % [8], the estimates (7) predict young modulus values between ≈ 2000 and 1000 mpa. this is consistent with the values reported in [3] for historic lime mortars (without pozzolan admixtures).
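the six volume-fraction conditions listed at the beginning of this section can be assembled into a single linear solve; a sketch of that bookkeeping (numpy used for the solve, phase indices 0–5 as in fig. 1, and the radii, mass fractions and densities passed as dictionaries keyed by phase index, with table 1 values in the example):

```python
import numpy as np

# phase volume fractions from the six linear conditions of section 3
def volume_fractions(r, m, rho, c1=0.35):
    a, b = np.zeros((6, 6)), np.zeros(6)
    a[0, 3], a[0, 2] = 1.0, -((r[3] / r[2]) ** 3 - 1.0)        # c-s-h shell around brick
    a[1, 5], a[1, 4] = 1.0, -((r[5] / r[4]) ** 3 - 1.0)        # itz shell around sand
    a[2, 0], a[2, 2] = 1.0, -m[0] * rho[2] / (m[2] * rho[0])   # matrix/brick mass ratio
    a[3, 0], a[3, 4] = 1.0, -m[0] * rho[4] / (m[4] * rho[0])   # matrix/sand mass ratio
    a[4, 1], b[4] = 1.0, c1                                    # prescribed porosity
    a[5, :], b[5] = 1.0, 1.0                                   # fractions sum to one
    return np.linalg.solve(a, b)

c = volume_fractions(r={2: 500.0, 3: 510.0, 4: 500.0, 5: 520.0},
                     m={0: 3.0, 2: 1.0, 4: 1.0},
                     rho={0: 1200.0, 2: 2300.0, 4: 2700.0})
```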
as for the strength estimates, it follows from fig. 4(b) that they reproduce a power-law relation [23, p. 280],
\[ f_c\big(c^{(1)}\big) = f_c(0)\,\big(1 - c^{(1)}\big)^n, \tag{11} \]
with n ≈ 1.04, yielding a practically linear strength-porosity scaling. unfortunately, we are currently unable to validate this prediction against experiments; the only available work we are aware of, by papayianni and stefanidou [25], does not contain enough data. still, eq. (11) complies with the fact that the influence of porosity is much smaller in lime mortars than in cement-based materials [15, section 2.6], for which n ≈ 3 is typically used, see [25, and references therein].

3.3 size effects
now we proceed to clarify the impact of the brick and sand particles on the overall mechanical properties (in this sensitivity analysis, the mass ratio is kept constant according to table 1). in particular, when the size of the brick particles increases, the material becomes more compliant, since the stiffening effect of the c-s-h layer decreases, fig. 5(a). this also increases the deviatoric stresses in the itz, as manifested by the strength reduction visible in fig. 5(b). these effects practically stabilize for particles larger than 0.5 mm, and their magnitude is rather limited: the stiffness decreases by about 10 % and the strength only by 4 %. such trends are qualitatively consistent with the results presented e.g. in [3, 21, 29].

figure 5: influence of the brick particle size on the overall (a) stiffness and (b) strength.

figure 6: influence of the sand particle size on the overall (a) stiffness and (b) strength.

larger sand particles, on the other hand, tend to make the composite material stiffer, fig. 6(a), by compensating for the inferior mechanical properties of the itz. since the relative thickness of the itz layer decreases, the stresses inside this phase increase and the material becomes weaker overall, fig. 6(b). when compared to the brick particles, these effects are much more pronounced: in the considered range of radii, the young modulus increases by about 100 % and the strength decreases by 25 %, with no tendency to stabilize. this agrees well with the experimental outcomes reported in [30].

4 conclusions
in the present work, following the recent developments presented in [27, 26, 35, 34], a simple micromechanics-based scheme for strength and stiffness estimates of cocciopesto mortars has been presented. the model directly utilizes measurable material and geometrical properties of the individual phases and is free of adjustable parameters. on the basis of the presented results, we conclude that the model
1. predicts realistic values of the overall young modulus and of the strength-porosity scaling,
2. captures the “smaller is stiffer” and “smaller is stronger” trends for crushed brick particles,
3. captures the “larger is stiffer” and “larger is weaker” trends for sand aggregates,
4. explains the positive role of crushed bricks in comparison with sand aggregates.
of course, in order for this model to be accepted for practical use, it needs to be validated against comprehensive experimental data at the micro- and macro-scales, and the role of the itz in lime-based mortars needs to be clarified. this topic is currently under investigation and will be reported separately.
acknowledgements
we wish to thank an anonymous referee for a number of valuable suggestions on the previous version of the manuscript. this work was supported by the ministry of culture of the czech republic, project no. df11p01ovv008. references appear after the appendix.

appendix a: herve-zaoui solution
the effect of the coating on the mechanical properties enters the solution through the auxiliary factors q_k in eq. (5), and a_k and b_k in eq. (6). here, these are provided in a closed form optimized for coding, utilizing the results and nomenclature of herve and zaoui [10]. note that, in order to keep the notation consistent, a^(k) corresponds to a property of the k-th phase, whereas a_k denotes a quantity utilized in the herve-zaoui solution (independent of a^(k)). also recall that we employ the local numbering of the phases by the index i = [i1, i2, i3]^t introduced in fig. 2. in particular, the volumetric part is expressed in terms of the matrices
\[ q^{(1)} = n^{(1)}, \qquad q^{(2)} = n^{(2)} q^{(1)}, \]
with
\[ n^{(k)} = \frac{1}{3k^{(i_{k+1})} + 4g^{(i_{k+1})}}
\begin{pmatrix}
3k^{(i_k)} + 4g^{(i_{k+1})} & \dfrac{4}{r^{(i_k)3}}\left(g^{(i_{k+1})} - g^{(i_k)}\right) \\
3r^{(i_k)3}\left(k^{(i_{k+1})} - k^{(i_k)}\right) & 3k^{(i_{k+1})} + 4g^{(i_k)}
\end{pmatrix}
\quad \text{for } k = 1, 2. \]
the matrices needed to evaluate the deviatoric part follow from
\[ a_1 = \frac{p^{(2)}_{22}}{p^{(2)}_{11}p^{(2)}_{22} - p^{(2)}_{12}p^{(2)}_{21}}, \qquad
b_1 = \frac{-p^{(2)}_{21}}{p^{(2)}_{11}p^{(2)}_{22} - p^{(2)}_{12}p^{(2)}_{21}}, \qquad
a_2 = w^{(2)}_1, \qquad b_2 = w^{(2)}_2, \]
where
\[ w^{(k)} = \frac{1}{p^{(2)}_{11}p^{(2)}_{22} - p^{(2)}_{12}p^{(2)}_{21}}\,
p^{(k-1)} \left( p^{(2)}_{22},\; -p^{(2)}_{21},\; 0,\; 0 \right)^{\mathsf{t}}
\quad (k = 1, 2), \qquad p^{(1)} = m^{(1)}, \quad p^{(2)} = m^{(2)} p^{(1)}. \]
the auxiliary matrix m^(k) admits the expression
\[ m^{(k)} = \frac{1}{5\left(1 - \nu^{(i_{k+1})}\right)}
\begin{pmatrix}
\dfrac{c_k}{3} & \dfrac{r^{(i_k)2}\left(3b_k - 7c_k\right)}{5\left(1 - 2\nu^{(i_k)}\right)} & \dfrac{-12\alpha_k}{r^{(i_k)5}} & \dfrac{4\left(f_k - 27\alpha_k\right)}{15r^{(i_k)3}\left(1 - 2\nu^{(i_k)}\right)} \\
0 & \dfrac{b_k\left(1 - 2\nu^{(i_{k+1})}\right)}{7\left(1 - 2\nu^{(i_k)}\right)} & m^{(k)}_{23} & \dfrac{-12\alpha_k\left(1 - 2\nu^{(i_{k+1})}\right)}{7r^{(i_k)7}\left(1 - 2\nu^{(i_k)}\right)} \\
\dfrac{r^{(i_k)5}\alpha_k}{2} & \dfrac{-r^{(i_k)7}\left(2a_k + 147\alpha_k\right)}{70\left(1 - 2\nu^{(i_k)}\right)} & \dfrac{d_k}{7} & m^{(k)}_{34} \\
m^{(k)}_{41} & \dfrac{7\alpha_k r^{(i_k)5}\left(1 - 2\nu^{(i_{k+1})}\right)}{2\left(1 - 2\nu^{(i_k)}\right)} & 0 & \dfrac{e_k\left(1 - 2\nu^{(i_{k+1})}\right)}{3\left(1 - 2\nu^{(i_k)}\right)}
\end{pmatrix} \]
with
\[ m^{(k)}_{23} = \frac{-20\alpha_k\left(1 - 2\nu^{(i_{k+1})}\right)}{7r^{(i_k)7}}, \qquad
m^{(k)}_{41} = \frac{-5\alpha_k r^{(i_k)3}\left(1 - 2\nu^{(i_{k+1})}\right)}{6}, \]
\[ m^{(k)}_{34} = \frac{r^{(i_k)2}\left(105\left(1 - \nu^{(i_{k+1})}\right) + 12\alpha_k\left(7 - 10\nu^{(i_{k+1})}\right) - 7e_k\right)}{35\left(1 - 2\nu^{(i_k)}\right)}, \]
and
\[ a_k = \frac{g^{(i_k)}}{g^{(i_{k+1})}}\left(7 + 5\nu^{(i_k)}\right)\left(7 - 10\nu^{(i_{k+1})}\right) - \left(7 - 10\nu^{(i_k)}\right)\left(7 + 5\nu^{(i_{k+1})}\right), \]
\[ b_k = \frac{g^{(i_k)}}{g^{(i_{k+1})}}\left(7 + 5\nu^{(i_k)}\right) + 4\left(7 - 10\nu^{(i_k)}\right), \qquad
c_k = \left(7 - 5\nu^{(i_{k+1})}\right) + 2\left(4 - 5\nu^{(i_{k+1})}\right)\frac{g^{(i_k)}}{g^{(i_{k+1})}}, \]
\[ d_k = \left(7 + 5\nu^{(i_{k+1})}\right) + 4\left(7 - 10\nu^{(i_{k+1})}\right)\frac{g^{(i_k)}}{g^{(i_{k+1})}}, \qquad
e_k = 2\left(4 - 5\nu^{(i_k)}\right) + \frac{g^{(i_k)}}{g^{(i_{k+1})}}\left(7 - 5\nu^{(i_k)}\right), \]
\[ f_k = \left(4 - 5\nu^{(i_k)}\right)\left(7 - 5\nu^{(i_{k+1})}\right) - \frac{g^{(i_k)}}{g^{(i_{k+1})}}\left(4 - 5\nu^{(i_{k+1})}\right)\left(7 - 5\nu^{(i_k)}\right), \qquad
\alpha_k = \frac{g^{(i_k)}}{g^{(i_{k+1})}} - 1. \]

references
[1] a. arizzi, g. cultrone. aerial lime-based mortars blended with a pozzolanic additive and different admixtures: a mineralogical, textural and physical-mechanical study. construction and building materials 31:135–143, 2012.
[2] g. baronio, l. binda. study of the pozzolanicity of some bricks and clays. construction and building materials 11:41–46, 1997.
[3] g. baronio, l. binda, n. lombardini. the role of brick pebbles and dust in conglomerates based on hydrated lime and crushed bricks. construction and building materials 11:33–40, 1997.
[4] y. benveniste. a new approach to the application of mori-tanaka theory in composite materials. mechanics of materials 6:147–157, 1987.
[5] o. bernard, f.-j. ulm, e. lemarchand. a multiscale micromechanics-hydration model for the early-age elastic properties of cement-based materials. cement and concrete research 33:1293–1309, 2003.
[6] h. böke, s. akkurt, b. ípekoǧlu, e. uǧurlu. characteristics of brick used as aggregate in historic brick-lime mortars and plasters. cement and concrete research 36:1115–1122, 2006.
[7] g. constantinides, f.j. ulm. the effect of two types of c-s-h on the elasticity of cement-based materials: results from nanoindentation and micromechanical modeling. cement and concrete research 34:67–80, 2004.
[8] m. f. drdácký, d. michoinová. lime mortars with natural fibres. in a. m. brandt, v. c. li, i. h. marshall (eds.), brittle matrix composites 7: proceedings of the 7th international symposium, pp. 523–532. institute of fundamental technological research, woodhead publishing limited, cambridge, uk, 2003.
[9] j.d. eshelby. the determination of the elastic field of an ellipsoidal inclusion, and related problems. proceedings of the royal society of london 241:376–396, 1957.
[10] e. herve, a. zaoui. n-layered inclusion-based micromechanical modelling. international journal of engineering science 31:1–10, 1993.
[11] r. hill. elastic properties of reinforced solids: some theoretical principles. journal of the mechanics and physics of solids 11(5):357–372, 1963.
[12] j.t. jeffrey, m.j. hamlin. a colloidal interpretation of chemical aging of the c-s-h gel and its effects on the properties of cement paste. cement and concrete research 36:30–38, 2006.
[13] m. jirásek. basic concepts and equations of solid mechanics. revue européenne de génie civil 11:879–892, 2007.
[14] w. kreher. residual stresses and stored elastic energy of composites and polycrystals. journal of the mechanics and physics of solids 38:115–128, 1990.
[15] r.m.h. lawrence. a study of carbonation in non-hydraulic lime mortars. ph.d. thesis, university of bath, 2006.
[16] p. maravelaki-kalaitzaki, a. bakolas, a. moropoulou. physico-chemical study of cretan ancient mortars. cement and concrete research 33:651–661, 2003.
[17] d. michoniová. questions about renovation plasters. zprávy památkové péče 65:313–316, 2005. in czech.
[18] g. w. milton. the theory of composites. cambridge monographs on applied and computational mathematics. cambridge university press, 2002.
[19] t. mori, k. tanaka. average stress in matrix and average elastic energy of materials with misfitting inclusions. acta metallurgica 21:571–574, 1973.
[20] a. moropoulou, a. bakolas, k. bisbikou. characterization of ancient, byzantine and later historic mortars by thermal and x-ray diffraction techniques. thermochimica acta 269/270:779–995, 1995.
[21] a. moropoulou, a.s. cakmak, g. biscontin, et al. advanced byzantine cement based composites resisting earthquake stresses: the crushed brick/lime mortars of justinian's hagia sophia. construction and building materials 16(8):543–552, 2002.
[22] m.j. mosquera, b. silva, b. prieto, e. ruiz-herrera. addition of cement to lime-based mortars: effect on pore structure and vapor transport. cement and concrete research 36:1635–1642, 2006.
[23] a.m. neville. properties of concrete: fourth edition. longman, 1996.
[24] j.p. ollivier, j. c. maso, b. bourdette. interfacial transition zone in concrete. advanced cement based materials 2:30–38, 1995.
[25] i. papayianni, m. stefanidou. strength-porosity relationships in lime-pozzolan mortars. construction and building materials 20:700–705, 2005.
[26] b. pichler, c. hellmich. upscaling quasi-brittle strength of cement paste and mortar: a multiscale engineering mechanics model. cement and concrete research 41:467–476, 2011.
[27] b. pichler, c. hellmich, j. eberhardsteiner. spherical and acicular representation of hydrates in a micromechanical model for cement paste: prediction of early-age elasticity and strength. acta mechanica 203(3):137–162, 2009.
[28] k.l. scrivener, a.k. crumbie, p. laugesen. the interfacial transition zone (itz) between cement paste and aggregate in concrete. interface science 12:411–421, 2004.
[29] a. sepulcre-aguilar, f. hernández-olivares. assessment of phase formation in lime-based mortars with added metakaolin, portland cement and sepiolite, for grouting of historic masonry. cement and concrete research 40:66–76, 2010.
[30] m. stefanidou, i. papayianni. the role of aggregates on the structure and properties of lime mortars. cement & concrete composites 27:914–919, 2005.
[31] j. stránský, j. vorel, j. zeman, m. šejnoha. mori-tanaka based estimates of effective thermal conductivity of various engineering materials. micromachines 2:129–149, 2011.
[32] e. vejmelková, m. keppert, z. keršner, et al. mechanical, fracture-mechanical, hydric, thermal, and durability properties of lime-metakaolin plasters for renovation of historical buildings. construction and building materials 31:22–28, 2012.
[33] a.l. velosa, f. rocha, r. veiga. influence of chemical and mineralogical composition of metakaolin on mortar characteristics. acta geodynamica et geomaterialia 153:121–126, 2009.
[34] j. vorel, v. šmilauer, z. bittnar. multiscale simulations of concrete mechanical tests. journal of computational and applied mathematics 236:4882–4892, 2012.
[35] v. šmilauer, p. hlaváček, f. škvára, et al. micromechanical multiscale model for alkali activation of fly ash and metakaolin. journal of materials science 46:6545–6555, 2011.
[36] c.c. yang. effect of the transition zone on the elastic moduli of mortar. cement and concrete research 28:727–736, 1998.
[37] a. zaoui. continuum micromechanics: survey. journal of engineering mechanics 128:808–816, 2002.

acta polytechnica vol. 50 no. 6/2010
practical results of a five-level flying capacitor inverter
o. sivkov

abstract
this paper investigates the realization of a five-level flying capacitor inverter. after a brief description of general power electronic converters and an introduction to the advantages of multilevel inverters over conventional two-level inverters, the main focus is on the five-level flying capacitor inverter. the flying capacitor multilevel inverter (fcmi) is a multilevel inverter (mi) in which the capacitor voltage can be balanced using only a control strategy, for any number of levels. after a general description of the five-level fcmi topology, the simulation and experimental results are presented. the capacitor voltage is stabilized here at various output voltage amplitude values. the simulation and experimental results for the five-level fcmi show that the voltage on the capacitors is stabilized using the control strategy. a single-phase five-level fcmi model is currently being developed and constructed in the laboratory. some of the experimental results are available.
keywords: multilevel inverters, control strategy, igbt, dcmi, fcmi, active power filters, unified power flow controller.

1 introduction
since the 1990s, switching devices such as gto thyristors and igbt transistors have been widely used in power electronic converters.
these converters are currently classified into:
a) rectifiers — converters that convert an input alternating voltage and current into an output direct voltage and current;
b) inverters — converters that convert an input direct voltage and current into an output alternating voltage and current;
c) dc voltage converters (choppers) — converters that convert an input direct voltage and current of one value into an output direct voltage and current of other values;
d) ac converters — converters that convert input energy with one set of parameters (alternating voltage, current, number of phases, frequency) into alternating output energy with other parameters.
this paper investigates the realization of a multilevel inverter (mi), which belongs to category (b) — inverters. a conventional single-phase inverter generates an ordinary rectangular voltage on its output: the power electronic switches are switched between two input voltages — positive and negative. this type of inverter is called a two-level inverter. the electronic switches are stressed by the full input dc voltage in this case. the switch voltage stress can be reduced by using a series connection or a multilevel inverter. conventional inverters are used in low-voltage electrical equipment up to 1 kv. for medium- and high-voltage equipment over 1 kv, a series connection of gto/igbt devices or a multilevel inverter is applied. the three-level inverter is the multilevel inverter with the smallest number of levels. the advantages of three-level inverter topology over conventional two-level topology are:
• the voltage across the switches is only one half of the dc source voltage;
• the switching frequency can be reduced for the same switching losses;
• the higher output current harmonics are reduced at the same switching frequency.
multilevel inverters find use in new areas of medium- and high-voltage applications, e.g. frequency inverters for high-voltage adjustable speed drives, inverters for high-voltage compensators, high-voltage unified power flow controllers (upfc), and high-voltage active power filters. the voltage sources for the additional levels can be realized as separate sources or as capacitor voltage dividers. separate sources require further power hardware. the chief problem in multilevel inverters with capacitor dividers is a proper control strategy for voltage stabilization without additional power hardware. the most popular types of mi are diode-clamped multilevel inverters (dcmi) and flying capacitor multilevel inverters (fcmi). for a three-level topology, both types of mi can be designed without any separate voltage sources or auxiliary power circuits. papers [6] and [7] compare these two types of multilevel inverters from various points of view. the comparison was carried out for the same output powers. the mathematical models of both types of inverters were investigated in the simulink program. this comparison showed that a three-level dcmi requires the total capacity of all capacitors to be at least two times lower than a three-level fcmi in order to achieve the same capacitor voltage swing. hence the dcmi solution is considered more effective for three-level mi. however, for numbers of levels higher than 3, the capacitor voltage in dcmi cannot be balanced using only the control strategy; additional circuits or independent sources are required. for example, paper [3] describes the stabilization of the capacitor voltage in a five-level dcmi using an advanced strategy together with auxiliary circuits.
on the contrary, in fcmi the voltage can be stabilized using only the control strategy, for any number of levels. multilevel inverters with more than three levels are mainly used in high-voltage applications, for voltages greater than 10 kv. the purpose of this paper is to present the results of a theoretical study and the practical realization of a five-level fcmi. after a brief description of the five-level fcmi topology, we present the simulation and experimental results. the simulation and experimental results for the five-level fcmi are obtained on a model supplied from a 200 v dc source. both of these results show that a five-level output voltage is generated and that the capacitor voltage is stabilized using only the control strategy. the five-level fcmi is investigated here because only this topology is able to balance the capacitor voltage using only the control strategy.

2 five-level flying capacitor topology simulation
2.1 description of five-level fcmi topology
generally, an fcmi with n levels needs n − 2 flying capacitors for each phase. the scheme of a three-phase five-level flying capacitor inverter is shown in fig. 1. this figure shows (5 − 2) × 3 = 9 capacitors. the lower levels on the phase outputs a, b, c are achieved as the difference between the supply voltage udc and the capacitor voltages uc1,2,3.

fig. 1: scheme of the three-phase five-level flying capacitor inverter

fig. 2: switching states in the five-level fcmi

the idea of the control strategy can be explained using fig. 2, which shows the possible switching states for one phase of the five-level fcmi. the voltage levels are labelled (−1), (−0.5), (0), (0.5), (1), and these numbers represent the relation between the voltage levels and the dc supply voltage, + = (1) and − = (−1). each level can be obtained by one or more switching states. the number of possible states is represented by the number of circles in the corresponding level. the upper number in the circle is the switching state number. the lower three signs represent the voltage behaviour on the corresponding capacitors: increasing (+), decreasing (−), or unchanging (0). the rectangles at the bottom show the output voltage value, in relative units, that corresponds to each column of states. this behaviour is correct for positive output current polarity. the bidirectional bonds show the possible state transitions between levels that fulfil the condition that only one couple of igbts can switch in one moment (one of them on, the other one off). an illustrative sketch of this state-selection idea is given below.
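the full state-transition table that implements this balancing is not reproduced in the paper, so the following is only an illustrative sketch of the underlying idea: among the redundant states of the demanded level, pick the one whose predicted effect on the capacitor voltages (reversed for negative output current) best reduces the voltage errors. the state numbers and effect triples below are hypothetical placeholders, not the actual table.

```python
# per level: {state number: effect of the state on (uc1, uc2, uc3) for
# positive output current, +1 charging / -1 discharging / 0 unchanged}
STATE_EFFECTS = {
    0.5: {12: (-1, 0, 0), 13: (+1, -1, 0), 14: (0, +1, -1), 15: (0, 0, +1)},
    # ... remaining levels would be filled in from the full transition table
}

def pick_state(level, uc, uc_ref, i_out):
    """choose the redundant state that best drives uc towards uc_ref."""
    sgn = 1.0 if i_out >= 0.0 else -1.0   # current polarity flips the effects
    def score(item):
        _, effect = item
        # reward charging a capacitor that is below its reference, and vice versa
        return sum(sgn * e * (ref - u) for e, u, ref in zip(effect, uc, uc_ref))
    return max(STATE_EFFECTS[level].items(), key=score)[0]

# e.g. uc1 sagging below 150 v -> prefer a state that recharges c1
state = pick_state(0.5, [147.0, 100.0, 50.0], [150.0, 100.0, 50.0], +1.0)
```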
the five-level inverter allows sixteen possible states. all four upper switches (fig. 1) are on in state 16; consequently, the output voltage is +1, equal to one half of the dc voltage source, uout = udc/2. in states 12, 13, 14 and 15, the output voltage is the difference between one half of the dc voltage source and some of the capacitor voltages, with the resulting output voltage 0.5. different capacitors are connected in the specific states. in these states, three upper switches and one lower switch are on, and the others are off; which switches are actually on and off depends on the actual switching state. in states 6, 7, 8, 9, 10 and 11, the output voltage is the difference between half the dc voltage source and some of the capacitor voltages, and the resulting output voltage is 0. in these states, two upper switches and two lower switches are on; similarly to the previous states, which specific switches are on and off depends on the concrete switching state. in states 2, 3, 4 and 5, one upper switch and three lower switches are on, and all the rest are off. the output voltage is the difference between one half of the dc voltage source and some of the capacitor voltages; the result is −0.5. in state 1, all four lower switches (fig. 1) are on and the four upper switches are off; consequently, the output voltage is −1, equal to minus one half of the dc voltage source, uout = −udc/2.

the transition from one switching state to the next is implemented so as to ensure the capacitor voltage balance. it is determined by the polarities of all three capacitor voltages and also by the output current polarity. the switching states of the five-level fcmi in fig. 2 are shown only for positive output current polarity. the full matrix of switching state transitions, including all capacitor voltages and the current polarity, would require a large table, and cannot be published in this paper. all the flying capacitor values in each phase are taken to be the same, c1 = c2 = c3 = c. the dividing capacitor voltages are defined in the following equations: (1) defines the voltage on capacitor c1, (2) defines the voltage on capacitor c2, and (3) defines the voltage on capacitor c3.
uc1 = (3/4) × udc (1)
uc2 = (1/2) × udc (2)
uc3 = (1/4) × udc (3)

2.2 load simulation
the behavior of this inverter was simulated, for a passive load, in matlab simulink, for one phase only. the connection of a three-phase passive load is depicted in fig. 3. the passive load in this picture is represented by a series connection of resistance r and reactance ωl in each phase, in star connection. the output voltage phasors uouta, uoutb, uoutc can represent line-to-zero voltages either to the zero point of the source os or to the zero point of the load ol. the real model was realized only for one phase, and therefore the phasors represent line-to-zero-point-of-source voltages. they are shifted by an angle of −(2/3) × π from each other. the direction of the output current is shown in fig. 3 only for phase a.

fig. 3: scheme of a three-phase equivalent inverter with a passive load

the voltage equation for a one-phase model can be written according to the second kirchhoff law (see equation (4)):
ûouta = îouta × (ra + jωla) (4)
this connection allows us to change the load type from purely resistive to purely inductive by changing the values of the resistance ra or the inductance la. if the inductance la ≈ 0 and the resistance has some definite value, the load is purely resistive, and the output current is in phase with the output voltage and has the same shape as the voltage. if the inductance has a definite value and the resistance is close to zero, the load is purely inductive, and the first harmonic of the current is delayed by 90◦ with respect to the first harmonic of the voltage. if the reactance is equal to the resistance, then the first harmonic of the current is delayed by 45◦ with respect to the first harmonic of the voltage.
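a quick numerical check of the phase-shift statements above, for the first harmonic and the series r-l load of eq. (4); the component values follow table 1 below (r = 150 ω, x = ωl = 110 ω), and the voltage amplitude is an arbitrary illustrative value:

```python
import cmath
import math

# first-harmonic load current phasor for the series r-l load of eq. (4)
def load_current(u_amp, r, x):
    i = u_amp / complex(r, x)                       # i = u / (r + j*omega*l)
    return abs(i), -math.degrees(cmath.phase(i))    # amplitude, lag in degrees

amp, lag = load_current(100.0, 150.0, 110.0)        # lag ~ 36 deg for these values
amp45, lag45 = load_current(100.0, 110.0, 110.0)    # r = x gives exactly 45 deg lag
```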
2.3 simulation strategy
the input parameters of the five-level fcmi for the simulink program are given in table 1. the inverter is supplied from a 200 v dc source, and the flying capacitor values ca1, ca2, ca3 are identical and equal to 1000 μf. the load values are: ra = 150 ω, ωla = 110 ω. the equations for the variable input parameters used in the simulink program are given in table 2. they are: the angular frequency omega, the load inductance l and the load impedance z. the initial capacitor voltages uc1, uc2, uc3 are taken from equations (1)–(3).

table 1: parameter values for the matlab simulink program in the five-level fcmi
f = 50: network frequency [hz]
n = 14: switching triangular carrier frequency per period [number/period]
x = 110: reactance [ω]
r = 150: resistance [ω]
c = 0.001: clamping capacitor [f]
rc = 0.001: resistance of the capacitor circuit [ω]
udc = 200: direct voltage source [v]
umax = 0.95: amplitude of the output voltage [p.u.]

table 2: variable input parameter equations used for the matlab simulink program in the five-level fcmi
ω = 2 · π · f: angular frequency
l = x/ω: inductance
z = √(r² + x²): impedance
uc01 = (3/4) udc: initial voltage on the capacitor c1
uc02 = (1/2) udc: initial voltage on the capacitor c2
uc03 = (1/4) udc: initial voltage on the capacitor c3

fig. 4 shows the simulation results from simulink, with the output voltage uout, the output current iout, and the capacitor voltages uc1, uc2, uc3. the flying capacitor voltages fluctuate around their initial values: uc1 = 3/4 udc = 150 v, uc2 = 1/2 udc = 100 v, uc3 = 1/4 udc = 50 v.

fig. 4: output voltage uout, output current iout and flying capacitor voltages uc1, uc2 and uc3 of the five-level fcmi with a resistive-inductive load

fig. 5: a zoom-in of the output voltage uout, output current iout and capacitor voltages uc1, uc2, uc3 shown in fig. 4
8 shows somedetails of the experimentprototype—the igbtdrivers, the power supplies, the flying capacitors, the input capacitorsand thedriveand protectionboard. someof theboardsareconstructed asprinted circuits, someare soldered inwires. all the boards are interconnected with wires. the schemes of the voltage capacitor and the load current measurement, the igbt drivers, the drive and protection board, and pictures of the other devices will be published in the final thesis. fig. 6: principal scheme of the single-phase five-level flying capacitor inverter that is in the laboratory fig. 7: structural scheme of the experimental model of a single-phase five-level flying capacitor inverter 78 acta polytechnica vol. 50 no. 6/2010 fig. 8: pictures of some experimental prototype details: 1 – igbt drivers; 2 – power supplies 15 v; 3 – flying capacitor c1; 4 – input capacitors; 5 – drive and protection board 3.2 experimental results theparametersof the five-levelfcmi for experimentalmeasurements are given intable 3. the control is realized using the matlab simulink program, and also with the same parameters as in the simulation measurement. only the first three parameters from table 3 are controlled in dpsace.they are: output frequency f, switching triangular carrier frequency per period n, and control amplitude output voltage umax. table 3: parameter values for the matlab simulink program in five-level fcmi for experiment results f =50 network frequency, [hz] n =14 switching triangular carrier frequency per period, [number/period] umax =0.6 . . .0.95 amplitude of output voltage, [p.u.] x =110 reactance, [ω] r =150 resistance, [ω] udc =200 direct voltage source, [v] the other parameters, e.g. direct voltage source udc, resistance ra, inductance la, capacities c1, c2, c3, are realized physically in laboratory conditions. output voltage uout is the controlparameter, and it is set in relative units. the maximum value must be a little less than 1, so that it can be modulated. fig. 9 shows experimental results for output voltage uout and output current iout with control amplitude output voltage umax = 0.95. five-level output voltage is generated. it is in the scale 5vper division and the voltage probe has the divider in oscilloscope 1/20. the voltage axis with its values is on the left. its amplitude is around ±100 v. the output current has a curved shape close to sinusoidal and its delay to the first harmonic voltage is around 45◦ because the load is resistive inductive. the scale of the current is 5 mv per division and its probe divider is 10 mv/a. thecurrentaxis is on the left. thecurrentamplitude is ±0.5 a. fig. 9: experimental results: 1 – output voltage uout; 2 – output current iout with umax =0.95 fig. 10 shows the experimental results for output voltage uout andoutput current iout with controlamplitudeoutputvoltageequal to umax =0.6. thegenerated five-level output voltage uout has much narrower pulses than the corresponding voltage in the previous case. the output current amplitude is almost two times lower than the previous amplitude corresponding to an output voltage value of uout. fig. 10: experimental results: 1 – output voltage uout; 2 – output current iout with umax =0.6 fig. 11 shows the experimental results for capacitor voltages uc1, uc2, uc3 measurement with umax = 0.95. they are all stabilized same as in the simulation results above the same value. 
fig. 11: experimental results of the flying capacitor voltages: 1 – uc1, 2 – uc2, 3 – uc3 with umax = 0.95

fig. 12 shows the experimental results of the capacitor voltages uc1, uc2, uc3 with the control amplitude umax = 0.6. the capacitor voltages are all stabilized at the same values as with the control amplitude umax = 0.95 (fig. 11).

fig. 12: experimental results of the flying capacitor voltages: 1 – uc1, 2 – uc2, 3 – uc3 with umax = 0.6

4 conclusion

a brief description of a five-level fcmi and the principle of its function have been presented. a simulation model of the inverter with its passive load has been designed and tested. the capacitor voltages in the five-level fcmi have been balanced without auxiliary circuits. the five-level fcmi is of practical interest because the capacitor voltages are balanced using only the control strategy. the simulation results are confirmed by the experimental results obtained on the five-level fcmi: the five-level output voltage is generated and the capacitor voltages are stabilized using the selected control strategy. the experimental measurements are presented for two different control output voltage amplitudes umax, and in both cases the capacitor voltages are stabilized.

acknowledgement

research described in the paper has been supervised by prof. j. pavelka, and has been supported by the department of electric drives and traction, fee ctu in prague.

ing.
oleg sivkov
phone: +420 608 415 978
e-mail: ollerrg@seznam.cz
department of electric drives and traction
czech technical university in prague
faculty of electrical engineering
technická 2, 166 27 prague, czech republic

acta polytechnica 53(1):58–62, 2013 © czech technical university in prague, 2013 available online at http://ctn.cvut.cz/ap/

luiza: analysis framework for gloria

aleksander filip żarnecki a,∗, lech wiktor piotrowski a,b, lech mankiewicz c, sebastian małek d

a faculty of physics, university of warsaw, hoża 69, 00-681 warsaw, poland
b riken advanced science institute, wako, japan
c center for theoretical physics, al. lotnikow 32/46, 02-668 warsaw, poland
d national centre for nuclear research, hoża 69, 00-681 warsaw, poland
∗ corresponding author: zarnecki@fuw.edu.pl

abstract. the luiza analysis framework for gloria is based on the marlin package, which was originally developed for data analysis in a new high energy physics (hep) project, the international linear collider (ilc). hep experiments have to deal with enormous amounts of data, and distributed data analysis is therefore essential. the marlin framework concept seems to be well suited to the needs of gloria. the idea (and large parts of the code) taken from marlin is that every computing task is implemented as a processor (module) that analyzes the data stored in an internal data structure, and the additional output is also added to that collection. the advantage of this modular approach is that it keeps things as simple as possible. each step of the full analysis chain, e.g. from raw images to light curves, can be processed step by step, and the output of each step is still self-consistent and can be fed into the next step without any manipulation.

keywords: telescope network, image processing, data analysis.

1. introduction

gloria [1] (global robotic-telescope intelligent array) is an innovative citizen-science network of robotic observatories, which will give a virtual community free access to professional telescopes via the internet. the gloria project will develop free standards and tools for doing research in astronomy, both by making observations with robotic telescopes and by analyzing data that other users have acquired with gloria and/or from other free-access databases, e.g. the european virtual observatory. dedicated tools will be implemented for designing and running so-called off-line experiments, based on the analysis of available data. many different types of experiments are considered, for example classification of variable stars, searches for optical transients, and searches for occultations of stars by solar system objects.

one of the challenges we have to face in designing the environment for off-line gloria experiments is how to deal with huge amounts of data and a large variety of analysis tasks. we need an analysis framework that will be both very efficient and very flexible. these requirements are new to astronomy. however, high energy physics experiments have long years of experience of dealing with enormous amounts of data and complicated analysis tasks. experiments at the cern large hadron collider read information from about 100 million electronic channels, which is equivalent to taking a 100 mpixel image of the detector, every 50 ns (20 million times per second). even after very strong (10⁻⁵) on-line selection of events (using multi-level trigger systems), gbs of data are stored every second.
data analysis for lhc experiments has to be performed on the lhc computing grid (wlcg), which currently includes about 170 000 tb of disk space and a cpu power of about 1 800 000 hepspec06 units. however, this analysis is only possible thanks to custom-designed, highly efficient analysis software.

detectors at the future international linear collider (ilc), the next-generation e+e− collider under study, will deal with even larger “images”. although the project will not be realized before 2020, detailed studies of physics and detector concepts, and also detector prototype tests, have been under way for many years. large samples of monte carlo data have already been generated to test detector performance and analysis methods. a dedicated framework, marlin [2], has been developed for efficient data reconstruction (corresponding to image reduction in astronomy) and analysis. we decided to adopt this framework for the needs of data analysis in gloria.

2. basic concept

marlin (modular analysis and reconstruction for the linear collider) is a simple modular application framework for developing reconstruction and analysis code for the ilc. data reconstruction and analysis should be divided into small, well-defined steps, implemented as so-called processors. processors are grouped in modules, dedicated to particular tasks or types of analysis. since many different groups worldwide are involved in the ilc project, it was assumed that the framework should allow distributed development of modules and should combine existing modules into a larger application according to users' needs.

figure 1. example of the marlin analysis chain for mimosa silicon pixel detectors, developed within the eudet project (eutelescope package).

the crucial requirement in such an approach is that each step of the analysis has a well-defined input and output data structure. in the case of marlin, all possible data classes that can be exchanged between processors are defined in the lcio (linear collider i/o) data model. lcio is used by almost all groups involved in linear collider detector studies and has thus become a de facto standard in software development. by defining universal data structures we make sure that different processors can be connected in a single analysis chain, and can exchange data and analysis results.

the base class for a marlin processor is also defined in the marlin framework. it defines a set of standard callbacks that the user can implement in their subclasses. these callbacks are used to initialize the analysis chain, to process subsequent sets of data, and to conclude the analysis. a steering file mechanism allows the needed processors to be activated and their parameters to be set at run time. a dedicated processor manager loads the selected processors and calls their corresponding methods for the subsequent steps of data analysis. an example of the marlin analysis chain for silicon pixel detectors, developed within the eudet project [3], is shown in fig. 1. processing data from pixel detectors in high energy physics is in fact like ccd image analysis in astronomy: charged particle tracks are measured instead of photons, but the analysis steps are similar. raw data are read from file, pixel clusters are found (object finding), their position and charge are reconstructed (photometry), and they are used to fit the particle track (astrometry).
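to make the processor concept concrete, the following language-agnostic sketch mimics the callback pattern in python (marlin itself is c++); the class and method names are illustrative and do not reproduce the actual marlin api.

```python
# a minimal sketch of the marlin-style processor pattern; names are
# illustrative, not the actual marlin interface
class Processor:
    """base class: standard callbacks invoked by the framework."""
    def init(self):                  # called once, before the event loop
        pass
    def process(self, data: dict):   # called for each set of data
        raise NotImplementedError
    def end(self):                   # called once, after the event loop
        pass

class ClusterFinder(Processor):
    def process(self, data):
        # read one named collection, append a new one with the results
        raw = data["RawPixels"]
        data["Clusters"] = [p for p in raw if p > 42]  # toy "clustering"

chain = [ClusterFinder()]            # the steering file defines this list
events = [{"RawPixels": [10, 50, 99]}, {"RawPixels": [7, 43]}]

for p in chain:
    p.init()
for event in events:                 # every processor sees the same container
    for p in chain:
        p.process(event)
for p in chain:
    p.end()
print(events[0]["Clusters"])         # [50, 99]
```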
the marlin framework has turned out to be very efficient and flexible, and is widely used by the ilc community. it is designed to run in batch mode, without user interaction, with all input streams, analysis tasks, parameters and options specified in the steering file. this approach enables huge amounts of data to be handled and distributed computing resources to be used (grid computing). we therefore decided to use the same concept in developing the luiza framework for gloria. the package is developed mostly in c++, and makes wide use of standard template library (stl) classes and methods.

3. luiza framework

3.1. data structures

fits [4] (flexible image transport system) is the standard astronomical data format endorsed by both nasa and the iau. fits is much more than an image format (such as jpg or gif): it is primarily designed to store scientific data sets consisting of multi-dimensional arrays (1-d spectra, 2-d images or 3-d data cubes) and 2-dimensional tables containing rows and columns of data. when developing luiza, we decided to use the cfitsio [5] library for reading and writing data files in fits format. the following basic data classes are defined:

the gloriafitsimage class – for storing 2-dimensional fits images, with either integer or floating point pixels. it includes basic methods for image manipulation (addition, subtraction, multiplication and division);

the gloriafitstable class – for storing data tables. various column types are allowed (integers, floats, strings, and also vectors of integers and floats);

the gloriafitsheader class – for storing fits header data. this includes basic methods for accessing and modifying header information. both gloriafitsimage and gloriafitstable inherit from this class.

additional classes can be implemented on this basis, for example the gloriaobjectlist class. this class defines a gloriafitstable with predefined columns for storing the object position on the ccd (“ccd_x” and “ccd_y”) and the object brightness (“signal”). this ensures that object lists will be exchangeable between processors. a user can add additional columns if needed.

figure 2. example steering file header, containing information on the modules selected for the analysis chain.

figure 3. example section of the steering file for luiza, with parameters of the processor reading dark frames.

for internal storage of all data being processed, a dedicated class gloriadatacontainer was implemented. it stores vectors, so-called “collections”, of images or tables. each collection has a unique name (string), which can be used to access its elements. multiple collections can be stored in memory, each with multiple images or tables (though in many cases collections will contain just a single image or table). the pointer to the gloriadatacontainer is passed to each luiza processor in the data processing loop. processors can analyse data already stored in memory, and can also add new collections (e.g. when reading data from storage or saving analysis results).

3.2. data processing

we assume that every computing task can be implemented as a processor (module) that analyzes the data stored in a gloriadatacontainer structure; any additional output that is created is also added to that structure. a user defines the analysis chain at run time, by specifying a list of active processors in an xml steering file (see fig. 2).
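the steering mechanism can be illustrated with a short sketch that parses an xml file and instantiates named processor instances with their parameters. the tag names and the toy registry below are assumptions in the spirit of figs. 2 and 3, not the actual luiza schema.

```python
import xml.etree.ElementTree as ET

# hypothetical steering file in the spirit of figs. 2 and 3; the tag and
# attribute names are illustrative, not the real luiza/marlin schema
STEERING = """
<luiza>
  <processor name="DarkReader"  type="FitsImageReader">
    <parameter name="FileName">dark.fit</parameter>
    <parameter name="Collection">DarkImages</parameter>
  </processor>
  <processor name="LightReader" type="FitsImageReader">
    <parameter name="FileName">light.fit</parameter>
    <parameter name="Collection">RawImages</parameter>
  </processor>
</luiza>
"""

REGISTRY = {"FitsImageReader": dict}   # processor type -> factory (toy stand-in)

def build_chain(xml_text: str):
    """return a list of (instance_name, configured_processor)."""
    chain = []
    for node in ET.fromstring(xml_text).iter("processor"):
        params = {p.get("name"): p.text for p in node.iter("parameter")}
        factory = REGISTRY[node.get("type")]
        chain.append((node.get("name"), factory(**params)))
    return chain

# the same processor type is used twice, distinguished by unique names
for name, proc in build_chain(STEERING):
    print(name, proc)
```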
the idea is to develop a large number of processors in gloria, performing many different tasks, so that the user is always able to find a set which matches his/her needs. the main “work horse” of luiza is the processor manager (the processormgr class). it is used by luiza to create the list of active processors (after parsing the xml file), and to set the values of the parameters required by these processors (given in the same xml file). the same processor type (e.g. a processor reading fits images, fitsimagereader) can be used many times: the instances are distinguished by unique names given by the user. each instance has its own set of parameters, so one instance of an image reader can be used to read the dark frames used for calibration, and another instance to read the actual images; an example of a parameter section for one processor is shown in fig. 3.

3.3. analysis tools

until now we have mainly focused on the development of the general structure and functionality of luiza: data classes based on the fits standard have been designed, steering file parsing and processing management have been adopted from marlin, and processors for input and output of fits image files have been implemented on the basis of the cfitsio library. nevertheless, the current version of luiza already includes some basic tools for image processing:

• an image viewer based on the cern root [6] package (see fig. 4)
• an image normalization processor, allowing for dark/bias subtraction and flat correction
• a processor for image stacking or averaging
• a processor for simple geometry operations: image cropping and rotations
• two simple object finding algorithms: one based on the particle identification algorithm developed for silicon pixel detectors, and the other based on the python library mahotas
• an astrometry algorithm based on astrometry.net (still being tested)

figure 4. various graphics options implemented in cern root, available for viewing fits images in luiza: as an image (left), as a histogram (centre), and as a histogram in 3d projection (right). red circles in the middle plot indicate objects reconstructed in an image by the pixelclusterfinder processor. in the plot on the right, only a small section of the image is shown for clarity, presenting the psf of the flare star rxj0413.4-0139 at the outburst maximum.

in addition, a dedicated user interface, luizagui, has been prepared for creating and editing xml steering files.
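as an illustration of the normalization and stacking processors listed above, the following numpy sketch performs dark subtraction, flat correction and per-pixel stacking on synthetic frames; it is a simplified stand-in, not the luiza implementation.

```python
import numpy as np

def normalize(raw: np.ndarray, dark: np.ndarray, flat: np.ndarray) -> np.ndarray:
    """dark/bias subtraction and flat correction of one frame."""
    corrected = raw - dark
    return corrected / (flat / flat.mean())     # normalize flat to unit mean

def stack(frames: list, mode: str = "mean") -> np.ndarray:
    """simple image stacking: per-pixel mean (or median) of aligned frames."""
    cube = np.stack(frames)
    return cube.mean(axis=0) if mode == "mean" else np.median(cube, axis=0)

rng = np.random.default_rng(0)
dark = rng.normal(100, 2, (64, 64))             # synthetic calibration frames
flat = rng.normal(1.0, 0.05, (64, 64))
raws = [dark + 500 * flat + rng.normal(0, 5, (64, 64)) for _ in range(9)]

science = stack([normalize(r, dark, flat) for r in raws])
print(science.mean())    # ~500 counts, as constructed for this toy example
```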
3.4. development plans

we plan to continue developing the basic tools for image manipulation and analysis (astrometry, photometry, light curve reconstruction). general purpose algorithms, which should be flexible enough to cope with data from all gloria telescopes, are not expected to be the most highly precise algorithms. they can be used as examples and starting points for future improvements, and in the development of more advanced tools dedicated to particular studies. dedicated processors will also be developed in the course of the gloria off-line experiments. our current plans include the development of:

• an interface to star catalogues and external databases;
• an interface to virtual observatory resources;
• a processor for smart image stacking (correcting for image shifts and rotations);
• frame quality analysis;
• high quality aperture and profile photometry;
• light curve determination and variability analysis;
• searches for optical bursts, flares etc.

thanks to the simple and modular structure of luiza, individual gloria users will also be able to contribute to software development. new packages can be compiled as independent libraries and loaded dynamically at run time, without the need to change anything in luiza or other modules. it is therefore possible for users to develop “private” luiza modules and libraries, adapted to their particular analysis, which can later be included in luiza as a separate package after proper testing.

3.5. documentation

we decided to use the doxygen [7] package to manage the framework documentation. web page and/or latex documentation is created automatically from the class header files, based on simple tags used in the comments included in the code. the additional work needed to keep the documentation up to date is minimal, assuming that developers put comments in the code. this solution also makes it straightforward to add code submitted by users to the documentation. documentation for the public luiza release is available on a dedicated web page [9].

4. results

on november 24, 2011, just before midnight, the pi of the sky telescope located at inta near huelva, spain, automatically recognized a new object in the sky. unfortunately, it was visible to our detector only for about one minute, fading fast below our limiting magnitude. we asked the bootes group, operating a bigger telescope at the same site, to make a follow-up observation. figure 5 shows the light curve of the object, later identified as the flare star rxj0413.4-0139, as observed by the bootes-1 telescope [8]. we clearly see a secondary outburst, more than one magnitude brighter than the first one (11.8m at maximum) and much, much longer. the data analysis resulting in this light curve was performed with luiza.

figure 5. lightcurve of the flare star rxj0413.4-0139 reconstructed with luiza. data from the follow-up observation by bootes-1 of the outburst observed by pi of the sky on nov. 24, 2011. the secondary outburst of the star was measured.

5. conclusions

an efficient and flexible analysis framework for gloria has been developed, based on a concept taken from high energy physics. the basic data classes, the framework structure and the data processing functionality have been implemented, as well as selected data processing algorithms. the framework will be further developed as part of the work on the gloria off-line experiments. the first public version of the framework has been released via the gloria project svn [10].

acknowledgements

the research leading to these results has received funding from the european commission seventh framework programme (fp7/2007–2013) under grant agreement no. 283783. the research work is also being funded by the polish ministry of science and higher education from 2011–2014, as a co-funded international project.

references

[1] http://www.gloria-project.eu/
[2] gaede, f.: marlin and lccd: software tools for the ilc. nucl. instrum. meth. a559, 2006, 177–180.
http://ilcsoft.desy.de/portal/software_packages/marlin/
[3] rubinskiy, i.: eutelescope final status. presented at the eudet annual meeting 2010, http://ilcagenda.linearcollider.org/conferencedisplay.py?confid=4649
[4] http://fits.gsfc.nasa.gov/
[5] http://heasarc.gsfc.nasa.gov/docs/software/fitsio/fitsio.html
[6] http://root.cern.ch/
[7] http://www.doxygen.org/
[8] martin jelinek and petr kubanek, private communication.
[9] http://hep.fuw.edu.pl/u/zarnecki/gloria/luiza/doc/html/index.html
[10] http://sourceforge.net/projects/gloriaproject/

acta polytechnica vol. 49 no. 2–3/2009

capillary discharge xuv radiation source

m. nevrkla

abstract

a device producing z-pinching plasma as a source of xuv radiation is described. here a ceramic capacitor bank pulse-charged up to 100 kv is discharged through a pre-ionized gas-filled ceramic tube 3.2 mm in diameter and 21 cm in length. the discharge current has an amplitude of 20 ka and a rise-time of 65 ns. the apparatus will serve as an experimental device for studying capillary discharge plasma, for testing x-ray optics elements and for investigating the interaction of water-window radiation with biological samples. after optimization it will be able to produce 46.9 nm laser radiation with a collisionally pumped ne-like argon ion active medium.

keywords: capillary discharge, z-pinch, xuv, soft x-ray, rogowski coil, pulsed power.

1 introduction

in the region of xuv (extended ultraviolet) radiation, i.e. in the region of ~(0.2–100) nm, there are two significant sub-regions. the first is the water-window region, (2.3–4.4) nm. radiation in this region is highly absorbed in carbon, but not much absorbed in water, so it is useful for observing organic samples in their native environment. the second is the region around 13.5 nm. at this wavelength there are mo/si multilayer mirrors with high reflectivity at a normal incidence angle, and thus radiation in this wavelength region is convenient for industrial applications such as xuv lithography.

a typical x-ray source, the roentgen tube, is based on bremsstrahlung of electrons and produces a broad, continuous spectrum with intensity peaks on the k-lines. if we are interested in a narrow spectrum of xuv radiation, synchrotron accelerators or fels are suitable sources. however, due to their huge dimensions and their build and operation costs, these devices are unobtainable for most laboratories. another approach to obtaining xuv radiation involves electron transitions in plasma ions. in this way, radiation with a line spectrum is created. some electron transitions have a noticeably higher probability than other transitions, and when suitable conditions arise in the plasma, these transitions become dominant and a significant peak can appear in the spectral region of interest. in addition, when the plasma forms a uniform column with a length exceeding its diameter by two orders of magnitude and the pumping power is sufficiently high to ensure a high enough population of a meta-stable energy level, amplified spontaneous emission (ase) can appear. ase has the properties of a laser. the active medium of these lasers is created by laser-produced plasma or by capillary-produced plasma. the first capillary-produced-plasma xuv laser was made by rocca et al. at colorado state university in 1994 [1]. in 2005, heinbuch et al., from rocca's team, produced a tabletop version of the laser [2].
2 discharge apparatus

2.1 z-pinching capillary discharge

in order to obtain radiation in the xuv region, the electron transitions have to proceed in highly ionized states. these states appear in plasma with an electron temperature of te ~ 100 ev. such hot plasma with a sufficient electron density ne > 10^17 cm^-3 is created by radial compression of a plasma column by the lorentz force fl of the magnetic field b of the flowing current i – a z-pinch. first, the current flows along the walls of the capillary. the flowing current creates a magnetic field, which acts on the charged particles of the plasma with a radial force. this force compresses the particles like a snowplow until the magnetic pressure is equilibrated by the plasma pressure. a schematic of a z-pinching capillary discharge is shown in fig. 1. to obtain sufficient electron density and electron temperature, rapid compression is needed.

fig. 1: z-pinching capillary discharge

2.2 discharge circuit design

a pulsed current with an amplitude of ~10 ka and an abrupt rise of di/dt > 10^11 a·s^-1 is needed for rapid compression with a high compression rate, which is necessary for obtaining sufficiently dense and hot plasma. our discharge circuit is realized by a ceramic capacitor bank discharged by the closing of a spark-gap switch through a ceramic capillary filled with gas. because of the high di/dt, all inductances in the circuit are undesirable, because their impedance becomes dominant. reduction of the inductances in the circuit allows us to achieve a high di/dt with a relatively small charging voltage. fig. 2 shows the scheme of the apparatus.

a ceramic capacitor bank with a maximum capacity of 26.2 nf is pulse-charged by a 2-stage marx generator and an rlc oscillating circuit up to 100 kv (limited by the breakdown voltage of the capacitors). if the voltage on the capacitors exceeds the self-breakdown voltage of the spark gap, the gap closes and the capacitors discharge through the capillary. this is the main discharge circuit. to reduce the inductance of this circuit, a close coaxial configuration was designed. the capillary is also shielded by a duralumin tube. this tube reduces the energy of the magnetic field around the capillary and thus reduces its inductance. before the main discharge, a (30–40) a, (3–6) μs long current pulse pre-ionizes the gas in the capillary and prepares a uniform conducting channel. the capillary is made of al2o3, which is a material with low wall ablation. ablation of material from the capillary walls is an undesirable effect in a gas-filled capillary, since it introduces impurities into the discharge plasma.
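the quoted figures can be cross-checked with an idealized lossless lc estimate: taking the measured rise time as a quarter period of an undamped oscillation gives the order of the loop inductance, the initial di/dt and an upper bound on the peak current. the inferred inductance is our assumption; the real, damped circuit yields the lower measured amplitude.

```python
import math

C = 26.2e-9      # capacitor bank [F] (maximum quoted capacity)
U0 = 100e3       # charging voltage [V] (upper limit)
t_rise = 65e-9   # measured current rise time [s], taken as a quarter period

# treat the rise time as T/4 of an undamped LC oscillation to infer the
# total loop inductance (an idealization; the real circuit is damped)
T = 4 * t_rise
L = T**2 / (4 * math.pi**2 * C)          # ~65 nH

i_peak = U0 * math.sqrt(C / L)           # lossless peak current
didt0 = U0 / L                           # initial current slope

print(f"L ~ {L*1e9:.0f} nH")
print(f"lossless i_peak ~ {i_peak/1e3:.0f} kA (damping lowers this toward 20 kA)")
print(f"di/dt(0) ~ {didt0:.1e} A/s  (> 1e11 A/s, as required)")
```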
the dimensions of the capillary are: diameter 3.2 mm and length 210 mm. it is filled with gas through a hollow electrode on its grounded side. radiation is also emitted through this hole. xuv radiation is highly absorbed in any material, even a gas, so 3 mm from the capillary exit there is a 1.2 mm diameter aperture separating the discharge gas at a working pressure of (10–100) pa from the high vacuum at a pressure 3 orders of magnitude lower.

the charging voltage is measured by a 40 kv tektronix probe. because the charging voltage of 100 kv could damage the probe, a capacitive divider is used. this divider divides the voltage according to the ratio of the adjustable capacitor capacitance and the input capacitance of the probe. the current is measured using a rogowski coil with rl integration. the breakdown voltage is adjusted by the nitrogen pressure in the spark-gap. the system is enclosed in a duralumin housing in order to reduce electromagnetic noise. biodegradable oil is used to insulate the circuit and avoid unwanted breakdowns.

fig. 2: discharge apparatus

2.3 electrostatic analysis

electrostatic simulations in quickfield, a finite element analysis software, were performed in order to identify spots predisposed to unwanted electrical breakdowns. the input parameters were 100 kv on the hv electrode, a closed spark-gap, and constant resistivity along the capillary, i.e. a constant voltage drop between the electrodes of the capillary. the electric field strength e in the region of e = (0.02–2.00)×10^7 v/m, with three problem spots, is shown in fig. 3. during experiments without electro-insulating oil, breakdowns really did occur between point a and the housing, and between points b and c in the case of spark-gap switching into a non-preionized capillary. formerly, when a dc charging voltage was used, these breakdowns occasionally also emerged in the electro-insulating oil after exceeding 70 kv; above this voltage the main trigatron spark-gap also switched on unexpectedly. there was a need to change the dc charging design with a trigatron spark-gap to a pulse charging design with a self-breakdown spark-gap.

fig. 3: quickfield electrostatic analysis

2.4 charging circuit and pre-ionization

the ceramic capacitor bank is charged by a two-stage marx generator and an rlc oscillating circuit. a simplified schematic of the circuit is shown in fig. 4. the rlc circuit is formed by two charging capacitors with a capacity of cc = 37.5 nf in series, a ~1 mh charging coil, the charged capacitor bank c, and the parasitic resistivity. after closing the marx generator, the voltage on the capacitor bank stabilizes approximately at a value ust, given by

ust = umarx · cc / (cc + c), (1)

where umarx is the voltage on the charging capacitor cc after switching the marx generator on. by charging through the coil, the voltage on the capacitor bank has the waveform of under-damped oscillations with a period of ~25 μs (depending on the capacitor bank capacity), with the first maximum at ~2×ust. in the case of c ≪ cc, the voltage on the capacitor bank can reach almost 160 kv with a charging voltage of 40 kv (doubled by the marx generator and again doubled in the coil). the first stage of the marx generator is used for pre-ionization of the capillary. the pre-ionization current is limited by the 1 kω resistor to (30–40) a, and the duration of the pre-ionization is given by the charging time of the capacitor bank, i.e. by the time from switching the marx generator on to the breakdown of the self-breakdown spark-gap between the capacitor bank and the capillary. this time is typically (3–6) μs.

fig. 4: charging circuit
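a short sketch checks equation (1) and the quoted ~25 μs charging period; treating cc = 37.5 nf as the combined series value of the two charging capacitors is our assumed reading of the text.

```python
import math

u_charge = 40e3        # power supply charging voltage [V]
u_marx = 2 * u_charge  # doubled by the two-stage marx generator
c_c = 37.5e-9          # charging capacitors in series, combined value [F]
                       # (assumed interpretation of "cc = 37.5 nF in series")
c = 26.2e-9            # main capacitor bank [F]
L = 1e-3               # charging coil [H]

# equation (1): asymptotic bank voltage after charge sharing
u_st = u_marx * c_c / (c_c + c)

# under-damped charging: first voltage maximum near 2*u_st, period set by
# the series combination of c_c and the bank c
c_eq = c_c * c / (c_c + c)
T = 2 * math.pi * math.sqrt(L * c_eq)

print(f"u_st ~ {u_st/1e3:.0f} kV, first maximum ~ {2*u_st/1e3:.0f} kV")
print(f"oscillation period ~ {T*1e6:.0f} us (the paper quotes ~25 us)")
```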
formerly, the capacitor bank was charged by a dc power supply and discharged by a trigatron spark-gap switched by a commercial triggering unit. the pre-ionization circuit was charged separately by a second power supply and switched by a second commercial triggering unit. a breakdown of the capillary at the beginning of the pre-ionization causes big electromagnetic noise. the triggering units were not resistant to this electromagnetic noise, and the main discharge was switched prematurely. this was another reason for turning to a pulse charging design with a self-breakdown spark-gap, one power supply, and one triggering unit.

2.5 rogowski coil

the current in the capillary is measured by a rogowski coil. this is a simple current probe based on the faraday principle of electromagnetic induction. a toroidal coil with a returning loop is placed around the current conductor. with a changing current, a voltage uind is induced on the terminals of the coil, according to

uind = dφ/dt = μ0 · s · n · di/dt, (2)

where φ is the magnetic flux, t is time, μ0 is the permeability of vacuum, s is the cross-section of the coil's loop, i is the measured current, and n is the number of loops of the coil per unit length. as seen from (2), uind is proportional to the rate of change of the current. in order to be proportional directly to the measured current, uind has to be integrated. there are three basic types of integrating circuit: integration with an operational amplifier (fig. 5a), rc integration (fig. 5b), and rl integration (fig. 5c).

fig. 5: integrating circuits: a) active with oa, b) passive rc, c) passive rl

because of the big changes in the measured current, we do not need active integration with an oa. the measured current has a basic frequency around 3 mhz. a coil with rc integration would have to have an even higher self-resonant frequency in order to measure the current correctly, and it is difficult to manufacture a coil with a sufficiently high uind and a sufficiently high self-resonant frequency. the best solution in our case is rl integration. the equivalent high-frequency circuit of a coil with rl integration is shown in fig. 6.

fig. 6: high-frequency equivalent circuit of the rogowski coil

by solving this circuit under the assumptions rd ≪ ω·lx and ω·rd·cx ≪ 1, where rd is a small integrating resistor, lx is the coil's inductance, and cx is the inter-turn capacitance of the coil, and using (2) together with the estimate of the coil inductance

lx = μ0 · n · ntot · s = μ0 · n² · v, (3)

where μ0 is the permeability of air, n is the number of coil turns per unit length, v is the coil's inner volume, and ntot is the total number of coil turns, we can obtain the output of the coil:

uout ≈ (rd/lx) ∫ uind dt = (rd/ntot) · i. (4)

more on rogowski coil theory is available in [3]. our rogowski coil has the calculated properties:

low-frequency limit: f = rd/(2π·lx) = 29 khz, (5)

time resolution: t = ld/c ≈ 10 ns, (6)

sensitivity: i/uout = ntot/rd = 137 a·v⁻¹, (7)

where the time resolution is limited by the finite signal propagation along the coil wire of length ld. the coil was calibrated by placing it in the discharge system, where the capillary was replaced by a copper tube and the capacitor was discharged over a non-inductive 10 ω resistor. the discharge current was measured via a differential voltage measurement on the resistor, and was compared with the voltage output of the coil. the experimentally determined sensitivity was

i/uout = (130 ± 8) a·v⁻¹. (8)
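the design numbers in equations (5)–(7) can be reproduced as follows; since rd, ntot, lx and the wire length are not all quoted in the text, the component values below are illustrative assumptions chosen to match the stated results.

```python
import math

# illustrative component values chosen to reproduce the quoted figures;
# the actual values of r_d, ntot, l_x and the wire length are not all
# given in the text
r_d = 1.0        # integrating resistor [ohm] (assumed)
ntot = 137       # total number of coil turns (assumed, gives 137 A/V)
l_x = 5.5e-6     # coil self-inductance [H] (assumed, gives ~29 kHz)
l_wire = 3.0     # coil wire length [m] (assumed, gives ~10 ns)

sensitivity = ntot / r_d                 # equation (7): i/u_out [A/V]
f_low = r_d / (2 * math.pi * l_x)        # equation (5): low-frequency limit
t_res = l_wire / 3e8                     # equation (6): propagation limit

print(f"sensitivity ~ {sensitivity:.0f} A/V (measured: 130 +/- 8 A/V)")
print(f"f_low ~ {f_low/1e3:.0f} kHz, time resolution ~ {t_res*1e9:.0f} ns")
```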
as seen, the theoretical sensitivity (7) is in good agreement with the measured sensitivity (8).

3 experimental results

here we present measurements of the charging voltage, the discharge current, the output radiation, and the change of the pinch position with respect to the discharge gas pressure. fig. 7 shows the charging voltage and the discharge current. the charging voltage is measured directly using a tektronix hv probe and with an hv probe behind the capacitive divider. the capillary is replaced by a copper tube. the capacitance of the capacitor bank is 11.2 nf.

fig. 7: charging voltage and discharge current

fig. 8 shows the measured current through an argon-filled capillary at pressures of 13 pa, 40 pa, and 65 pa. the capacity of the bank is 15 nf and the charging voltage is 75 kv. the pinch is observable as a drop in the current waveform. fig. 8 shows how the pinch moves forward in time with increasing pressure. the strongest pinch appears when the current reaches its maximum; for our experimental voltage and capacity this is at 40 pa. fig. 8 also shows the xuv emission. the emission is measured by a vacuum diode with a gold cathode, 15 cm from the capillary end. electrons, visible and uv radiation are filtered by a 0.8 μm thick aluminum foil.

fig. 8: pinch and xuv radiation (current – solid line, xuv signal – dot-and-dash line)

4 conclusion

an apparatus for producing a pinching capillary discharge, as a source of xuv radiation, was designed and realized. the discharge current was measured using a calibrated rogowski coil, the charging voltage was measured using a capacitive divider and a tektronix hv probe, and the xuv output radiation was measured using a vacuum diode. the apparatus is in the final stage of construction. after completing the apparatus we would like to find the laser gain of ne-like argon ions at a wavelength of 46.9 nm.

acknowledgment

research described in the paper was supervised by dr. a. jančárek. this research is supported by ministry of education cr grants 102/07/0275 and 202/08/h057, and by internal grant ctu0910414 of the czech technical university in prague. i also thank dr. vrba for consultations on the theoretical background of z-pinching plasma and laser gain estimation.

references

[1] rocca, j. j. et al.: demonstration of a discharge pumped table-top soft x-ray laser. physical review letters, october 1994, vol. 73 (1994), no. 16, p. 1236.
[2] heinbuch, s., grisham, m., martz, d., rocca, j. j.: demonstration of a desk-top size 46.9 nm laser at 12 hz repetition rate. optics express, vol. 13 (2005), no. 11, p. 4050–4055.
[3] nevrkla, m.: design and realisation of apparatus to study capillary discharge in argon. master thesis. prague: ctu, faculty of nuclear sciences and physical engineering, 2008.

michal nevrkla
e-mail: michal.nevrkla@fjfi.cvut.cz
department of physical electronics
czech technical university in prague
faculty of nuclear sciences and physical engineering
břehová 7, 115 19 prague 1, czech republic
acta polytechnica 53(4):388–393, 2013 © czech technical university in prague, 2013 available online at http://ctn.cvut.cz/ap/

the effect of kaolin concentration on flock growth kinetics in an agitated tank

radek šulc∗, ondřej svačina

czech technical university in prague, faculty of mechanical engineering, department of process engineering, technická 4, 166 07 prague, czech republic
∗ corresponding author: radek.sulc@fs.cvut.cz

abstract. this paper reports on an investigation of the effect of the initial solid particle concentration on flock growth and on the flock shape characterized by the fractal dimension df2. the experiments were carried out in a baffled tank agitated by a rushton turbine at a mixing intensity of 64 w/m³ and a constant dimensionless flocculant dosage df/ck0 = 4.545 mg/g. the model wastewater (a suspension of tap water and kaolin) was flocculated with sokoflok 56a organic flocculant (solution 0.1 % wt.). the size and shape of the flocks were investigated by image analysis. the flock growth kinetics was fitted according to a semi-empirical generalized correlation proposed by the authors. the dependences a*f = 100.35·ck0^1.532, df,eq,max = 1.0474·ck0^(−0.311) and (ntf)max = 1622·ck0^(−0.393) were found. the fractal dimension df2 was found to be independent of the flocculation time and the initial kaolin concentration, and its value df2 = 1.470 ± 0.023 was determined as an average for the given conditions.

keywords: flocculation, flock growth, flock size, mixing, rushton turbine, kaolin slurry, fractal geometry, image analysis.
1. introduction

flocculation is one of the most important operations in solid–liquid separation processes in water supply and wastewater treatment (e.g. [1–3]). the purpose of flocculation is to transform fine particles into coarse aggregates, flocks, which will eventually settle, in order to achieve efficient separation. the properties of the separated particles have a major effect on the separation process and on the separation efficiency in a solid–liquid system. the solid particles in a common solid–liquid system are compact and regular in shape, and their size does not change during the process. the size of these particles is usually sufficiently described by the diameter, or by some equivalent diameter (e.g. [4]). however, the flocks that are generated during flocculation are often porous and irregular in shape, and this complicates the design of the flocculation process. flock properties such as size, density and porosity affect the separation process and its efficiency.

the flock growth kinetics, and the effects of the flocculant dosage and the mixing intensity on the flock size during flocculation in an agitated vessel, were investigated in our previous research [5–7]. the aim of this work is to find the effect of the initial solid particle concentration on the flock growth kinetics and on the flock shape characterized by the fractal dimension. the experiments were carried out in a baffled tank agitated by a rushton turbine at a mixing intensity of 64 w/m³ and a constant dimensionless flocculant dosage df/ck0 = 4.545 mg/g. the model wastewater (a suspension of tap water and kaolin) was flocculated with sokoflok 56a organic flocculant (solution 0.1 % wt.). the size and shape of the flocks were investigated by image analysis.

2. theoretical background

the dependence of the flock size on the flocculation time can be expressed by a simple empirical formula, taking into account flock breaking [5]:

1/df = af·log²(ntf) + bf·log(ntf) + cf, (1)

where df is the flock size, ntf is the dimensionless flocculation time, tf is the flocculation time, n is the impeller rotational speed, and af, bf, cf are model parameters. based on eq. (1), a generalized correlation for the flock growth kinetics in an agitated tank that takes into account flock breaking can be rewritten as follows [5]:

Δ(1/df)* = a*f · (Δ(ntf)*log)² (2)

or

df,max/df = 1 + a*f · (Δ(ntf)*log)², (3)

where

Δ(1/df)* = (1/df − 1/df,max) / (1/df,max), (4)

Δ(ntf)*log = (log(ntf) − log(ntf)max) / log(ntf)max, (5)

a*f = bf² / (4·af·cf − bf²), (6)

where df,max is the maximum flock size, reached at the dimensionless time (ntf)max, a*f is the flock size shift coefficient, Δ(1/df)* and Δ(ntf)*log are the variables defined by eqs. (4) and (5), and af, bf, cf are the parameters of eq. (1). the generalized correlation parameters df,max, (ntf)max and a*f generally depend on the flocculation process conditions, such as the mixing intensity, the flocculant dosage and the initial solid particle concentration. the flock growth kinetics model was successfully tested on published experimental data [8], and was also successfully adopted for quantifying the effect of flocculant sonication on the flock growth kinetics occurring in an agitated vessel [9]. in those experiments, a model suspension of commercial chalk dust in distilled water and an industrial suspension of ultrafine coal particles were used.
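a minimal sketch of the generalized correlation, eqs. (2)–(5), evaluated with the fitted parameters of the ck0 = 440 mg/l run reported in tab. 3 below; the vectorized helper function is ours.

```python
import numpy as np

def flock_size(ntf, df_max, ntf_max, a_star):
    """equivalent flock size d_f from the generalized correlation (3):
    d_f,max / d_f = 1 + a*_f * (delta(n t_f)*_log)**2"""
    delta_log = (np.log10(ntf) - np.log10(ntf_max)) / np.log10(ntf_max)  # eq. (5)
    return df_max / (1.0 + a_star * delta_log**2)                        # eq. (3)

# fitted parameters of the ck0 = 440 mg/l run from tab. 3
ntf = np.array([720, 1200, 1680, 2400, 3600])
d = flock_size(ntf, df_max=1.334, ntf_max=2252, a_star=30.379)
print(np.round(d, 3))   # grows toward 1.334 mm near ntf ~ 2252, then shrinks
```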
3. experimental

3.1. experimental apparatus

the flocculation experiments were conducted in a fully baffled cylindrical vessel of diameter d = 150 mm with a flat bottom and 4 baffles per 90°, filled to a height h = d with the model wastewater (tap water + kaolin particles) (fig. 1). the model wastewater was agitated by a rushton turbine (rt) of diameter d = 60 mm placed at an off-bottom clearance of ht/d = 0.85. the baffle width b/d was 0.1. an impeller motor with a cole parmer servodyne model 50000-25 speed control unit was used in our experiments. the impeller speed was set, and the impeller power input value was calculated using the impeller power characteristics.

the flock size was determined using a non-intrusive optical method. the method is based on an analysis of the images obtained by a digital cmos camera in a plane illuminated by laser light. the method consists of the following steps:
(1.) illuminate a plane in the tank with a laser light sheet (also called a laser knife) in order to visualize the flocks;
(2.) record images of the flocks using a camera;
(3.) process the captured images with the image analysis software.

the illuminated plane is usually vertical, and the camera is placed horizontally (e.g. [10, 11]). kysela and ditl [11] used this technique for observing flocculation kinetics. they found that the application of this method is limited by the optical properties of the system: the required transparency limits the maximum particle concentration in the system. we determine the flock size not during flocculation but during sedimentation, so this limitation should be overcome. therefore, the laser-illuminated plane is horizontal and perpendicular to the impeller axis. the scheme of the experimental apparatus for image analysis is shown in fig. 1. the agitated vessel was placed in an optical box (a water-filled rectangular box). the optical box reduces laser beam dispersion, and thus improves the optical properties of the measuring system. the camera with its objective and the laser diode are placed on a laboratory support stand. the hardware & software installation and setup are described in detail in [12]. the technical parameters are presented in tab. 1.

figure 1. scheme of the experimental apparatus for image analysis.

table 1. technical parameters.
laser diode – nt 57113, 30 mw, wavelength 635 nm (red light), edmund optics, germany
diode optics – optical projection head nt54-186, projection head line, edmund optics, germany
camera – colour cmos camera silicon video® si-sv9t001c, epix inc., usa
camera optics – objective 12vm1040asir 10–40 mm, tamron inc., japan
image processing (grabbing) card – pixci si pci image capture board, epix inc., usa
camera control software – xcap®, epix inc., usa
operation software – linux centos version 5.2, linux kernel 2.6
software for image analysis – sigmascan pro 5.0

3.2. experimental procedure

the maximum flock sizes formed during flocculation were measured during sedimentation. the experimental parameters are summarized in tab. 2.

table 2. experimental conditions.
εv [w/m³] – 64
n [rpm] – 210
tf [min] – 3.43, 5.71, 8, 11.4, 17.1
ntf [–] – 720, 1200, 1680, 2400, 3600
ck0 [mg/l] – 440, 560, 640
df/ck0 [mg/g] – 4.545
df [ml/l] – 2, 2.55, 2.91
no. of exp. points – 15

the experimental procedure was as follows:

(1.) calibration and experimental apparatus setting. the calibration grid was placed in the illuminated plane and the camera was focused manually onto this grid before each flocculation experiment. then an image of the calibration grid was recorded. for a camera resolution of 800 × 800 pixels, a scale of 1 px ∝ 45 µm was found for our images. this corresponds to a scanned area of 35 mm × 35 mm (approx. 6 % of the cross-section area of the tank). the scanned area was located in the middle of one quarter of the vessel, between the vessel wall and the impeller.

(2.) model wastewater preparation. kaolin slurry (a suspension of water and kaolin particles [18672 kaolin powder finest, riedel-de haën]) was used as the model system. the solid concentration of kaolin was 440, 560 and 640 mg/l.

(3.) flocculation. the model wastewater was flocculated using sokoflok 56a organic polymer flocculant (medium anionity, 0.1 % wt. aqueous solution; flocculant weight per flocculant solution volume mf/vf = 1 mg/ml; sokoflok ltd., czech republic). the experimental conditions are specified in tab. 2. the flocculation was initiated by adding the flocculant into the agitated vessel, and the flocculation time measurement was started.

(4.) image acquisition. after impeller shutdown the flocks began to settle. during sedimentation, images of the flocks passing through the illuminated plane were captured. the camera was set to 10-bit depth at a frame rate of 10 s⁻¹, an exposure of 5 ms and a gain of 35 db. image capturing started 20 s after impeller shutdown and took 120 s. finally, 1200 images were obtained for each flocculation experiment, and some were stored on a hard disk in 24-bit jpg format.

(5.) image analysis. the images were analyzed using the sigmascan software with its pre-defined filters (filter no. 8 for removing one-pixel points, filter no. 10 for removing objects touching the image border, and filter no. 11 for filling empty spaces caused by capture errors in identified objects) and our macros (see svačina [13] for details).
3.3. experimental data evaluation

from the captured images, the largest flock was identified and its projected area was determined for the given flocculation time. the equivalent diameter calculated from the flock area, plotted in dependence on the flocculation time for a given dimensionless flocculant dosage and various initial kaolin concentrations, is shown in fig. 2. when the flocculation time increases, the flock size increases up to a maximum value, due to primary aggregation, and then decreases due to flock breaking.

figure 2. experimental data – maximum flock size vs. flocculation time – df,eq = f(tf).

the experiments were conducted at a dimensionless flocculant dosage df/ck0 = const. we expected that the effect of the initial solid particle concentration could be eliminated by satisfying this condition, and thus that the same flock growth curves would be observed. this assumption was not clearly confirmed by the experiment.
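the size evaluation from the projected area can be summarized in a short sketch using the pixel calibration from section 3.2; the area-equivalent circle definition d_eq = √(4a/π) is our assumption of how the equivalent diameter is computed.

```python
import math

PX_SCALE_UM = 45.0   # calibration from section 3.2: 1 px ~ 45 um

def equivalent_diameter_mm(area_px: int) -> float:
    """area-equivalent circle diameter of a flock from its projected area
    in pixels (assumed definition: d_eq = sqrt(4*A/pi))."""
    area_um2 = area_px * PX_SCALE_UM**2
    return math.sqrt(4.0 * area_um2 / math.pi) / 1000.0   # [mm]

# a flock covering ~700 px corresponds to d_eq ~ 1.3 mm, the order of the
# maximum sizes reported in tab. 3
print(f"{equivalent_diameter_mm(700):.2f} mm")
```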
3.3.1. generalized correlation Δ(1/df,eq)* = f(Δ(ntf)*log)

the dependence of the calculated equivalent diameter df,eq on the flocculation time was fitted according to the generalized correlation (2). the generalized correlation parameters are presented in tab. 3. a comparison of the experimental data and the generalized correlation is depicted in fig. 3.

table 3. generalized correlation Δ(1/df,eq)* = f(Δ(ntf)*log): fitted parameters.
ck0 [mg/l] | n [rpm] | (ntf)max [–] | tf,max [min] | df,eq,max [mm] | a*f [–] | iyx (a) [–] | δr,ave/δr,max (b) [%]
440 | 210 | 2252 | 10.7 | 1.334 | 30.379 | 0.986 | 3.6/7.3
560 | 210 | 2006 | 9.6 | 1.304 | 34.631 | 0.999 | 0.2/0.3 (c)
640 | 210 | 1952 | 9.3 | 1.174 | 56.723 | 0.986 | 2.1/10.4
(a) correlation index. (b) relative error of the equivalent flock size df,eq: average/maximum absolute value. (c) the outlier flock size for ntf = 720 was excluded from the evaluation.

figure 3. generalized correlation Δ(1/df,eq)* = f(Δ(ntf)*log).

3.3.2. fractal dimension

the flocks generated are often porous and have an irregular shape [14, 15], which complicates the design of the flocculation process. fractal geometry [16] is a method that can be used for describing the properties of irregular objects. some flock properties, e.g. shape, density and sedimentation velocity, can be described on the basis of fractal geometry. usually a fractal dimension of the 3rd order, df3, evaluated from the flock mass, is determined. since we were measuring the projected area of the flocks, the fractal dimension of the 2nd order, df2, was used for characterizing the flock shape. the dependence of the projected area a on the characteristic length scale lchar and on the fractal dimension df2 is given by:

a = c·lchar^df2. (7)

the largest flocks determined in the images were used for estimating the fractal dimension, and the maximum flock size was used as the characteristic length scale. the fractal dimension df2 was determined for each flocculation time and initial kaolin concentration. for illustration, the dependence of the projected area on the maximum flock size is shown in fig. 4 for the dimensionless flocculation time ntf = 1680.

figure 4. fractal dimension determination – an example for ntf = 1680 and ck0 = 560 mg/l.

the fractal dimension df2, plotted in dependence on the flocculation time for a given flocculant dosage and mixing intensity, is shown in fig. 5.

figure 5. effect of the initial kaolin concentration on the fractal dimension.

the effect of the flocculation time on the fractal dimension df2 was tested by hypothesis testing. the statistical method of hypothesis testing can estimate whether the differences between the parameter values that are predicted (e.g. by some proposed theory) and the parameter values that are evaluated from the measured data are negligible. in this case, we assumed the dependence of the fractal dimension df2 on the dimensionless flocculation time to be described by the simple power law df2 = b·(ntf)^β, and the difference between the predicted exponent βpred and the evaluated exponent βcalc was tested. the hypothesis test characteristic is given as t = (βcalc − βpred)/sβ, where sβ is the standard error of the parameter βcalc. if the calculated |t| value is less than the critical value of the t-distribution for (m − 2) degrees of freedom and significance level α, the difference between βcalc and βpred is statistically negligible (statisticians state: “the hypothesis cannot be rejected”).
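the test can be reproduced with a few lines; the critical value t(m−2, α = 0.05) = 3.1825 for m = 5 follows from the t-distribution, and the standard error used below is back-computed from the tabulated |t|, which is an assumption.

```python
from scipy import stats

def slope_is_negligible(beta_calc, s_beta, m, beta_pred=0.0, alpha=0.05):
    """two-sided t-test of beta_calc against beta_pred with m data points:
    t = (beta_calc - beta_pred) / s_beta, compared with t_{m-2, alpha}."""
    t = (beta_calc - beta_pred) / s_beta
    t_crit = stats.t.ppf(1.0 - alpha / 2.0, df=m - 2)   # 3.1825 for m = 5
    return abs(t) < t_crit, t, t_crit

# ck0 = 440 mg/l row of tab. 4: beta_calc = 0.016 and |t| = 1.1, so the
# implied standard error is s_beta ~ 0.016 / 1.1 (back-computed assumption)
ok, t, t_crit = slope_is_negligible(0.016, 0.016 / 1.1, m=5)
print(ok, round(abs(t), 1), round(t_crit, 4))   # True 1.1 3.1825
```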
the hypothesis test results and the parameters β evaluated from the data are presented in tab. 4. the fractal dimension was found to be independent of flocculation time for all three initial kaolin concentrations.

4. effect of initial solid particle concentration
we found that the model parameters a∗f, df,eq,max and (ntf)max depend on the initial kaolin concentration for the given constant dimensionless flocculant dosage df/ck0. the relations can be described simply as follows:

$a_f^* = 100.35\, c_{k0}^{1.532}$ (iyx = 0.883), (8)
$d_{f,eq,max} = 1.0474\, c_{k0}^{-0.311}$ (iyx = 0.869), (9)
$(n t_f)_{max} = 1622\, c_{k0}^{-0.393}$ (iyx = 0.984). (10)

finally, the effect of initial kaolin concentration on fractal dimension df2 was tested in the same manner. the power exponent −0.009 and t-characteristic |t| = 0.2 were evaluated from the data. based on this, the fractal dimension was found to be independent of the initial kaolin concentration, and the value df2 = 1.470 ± 0.023 was determined as an average value for the given conditions.

5. conclusions
the following results have been obtained:
• the effect of initial solid particle concentration on flock growth and shape was investigated in a baffled tank agitated by a rushton turbine at mixing intensity 64 w/m3 and constant dimensionless flocculant dosage df/ck0 = 4.545 mgf/gk. flock size and flock shape were investigated by image analysis.
• the experiments were carried out on a kaolin slurry of initial particle concentration 440, 560 and 640 mg/l as a model wastewater. the model wastewater was flocculated with sokoflok 56a organic flocculant (solution 0.1 % wt.).
• the largest flock was identified in the obtained images, and its projected area was determined for a given flocculation time. the calculated equivalent diameter plotted in dependence on flocculation time for the given initial kaolin concentration is shown in fig. 2.
• the flock size increases with increasing flocculation time up to a maximum value due to primary aggregation, and then decreases due to flock breaking. the flock growth curves are similar and shifted for various initial kaolin concentrations. with increasing particle concentration, the flock sizes are smaller for flocculation times > 6 minutes.
• the flock growth kinetics were fitted according to a semi-empirical generalized correlation proposed by the authors. a comparison of the experimental data and the generalized correlation is depicted in fig. 3. the dependences $a_f^* = 100.35\, c_{k0}^{1.532}$, $d_{f,eq,max} = 1.0474\, c_{k0}^{-0.311}$ and $(n t_f)_{max} = 1622\, c_{k0}^{-0.393}$ were found.
• the fractal dimension df2 was determined for each flocculation time on the basis of the experimental data. using the statistical hypothesis test, the fractal dimension df2 was found to be independent of flocculation time and initial kaolin concentration, and its value df2 = 1.470 ± 0.023 was determined as an average for the given conditions.

acknowledgements
this research has been supported by grant agency of the czech republic project no. 101/12/2274 "local rate of turbulent energy dissipation in agitated reactors & bioreactors".

references
[1] bache, d. h., ross, g.: flocs in water treatment. london, uk: iwa publishing, 2007.
[2] lin, s. d., lee, c. c.: water and wastewater calculations manual. usa: mcgraw-hill, 2001.
[3] weiner, r. f., matthews, r.: environmental engineering (4th ed.). boston, usa: butterworth-heinemann.
[4] lemanowicz, m., gierczycki, a. t., al-rashed, m. h.: dual-polymer flocculation with unmodified and ultrasonically conditioned flocculant. chem. eng. proc., 50, 2011, pp. 128–138.
[5] šulc, r., svačina, o.: flock growth kinetics for flocculation in an agitated tank. acta polytechnica, 50 (2), 2010, pp. 22–29.
[6] šulc, r., svačina, o.: effect of flocculant dosage onto flock size at flocculation in an agitated vessel. inżynieria i aparatura chemiczna, 49 (1), 2010, pp. 111–112.
[7] šulc, r., svačina, o.: effect of mixing intensity onto kinetics of flock growth in an agitated tank. chemical and process engineering, 31, 2010, pp. 261–271.
[8] kilander, j., blomström, s., rasmuson, a.: scale-up behaviour in stirred square flocculation tanks. chem. eng. sci., 62, 2007, pp. 1606–1618.
[9] šulc, r., lemanowicz, m., gierczycki, a. t.: effect of flocculant sonication on floc growth kinetics occurring in an agitated vessel. chemical engineering and processing, 60, 2012, pp. 49–54. http://dx.doi.org/10.1016/j.cep.2012.05.008
[10] bouyer, d., escudie, r., do-quang, z., line, a.: mixing and flocculation in a jar for drinking water treatment. in: proceedings of 5th international symposium on mixing in industrial processes ismip-5, june 1–4, 2004, sevilla, spain.
[11] kysela, b., ditl, p.: floc size measurement by the laser knife and digital camera in an agitated vessel. chemical and process engineering (inzynieria chemiczna i procesowa), 26, 2005, pp. 461–468.
[12] dostál, m., šulc, r.: in situ particle size measurement: hardware & software installation and setup. in: proceedings of 36th international conference of the slovak society of chemical engineering ssche 2009, may 25–29, tatranské matliare, slovakia. (cd rom)
[13] svačina, o.: application of image analysis for flocculation process monitoring. diploma project. czech technical university, faculty of mechanical engineering, 2009 (in czech).
[14] li, d. h., ganczarczyk, j.: fractal geometry of particle aggregates generated in water and wastewater treatment processes. environ. sci. technol., 23 (11), 1989, pp. 1385–1389.
[15] logan, b. e., kilps, j. r.: fractal dimensions of aggregates formed in different fluid mechanical environments. wat. res., 29 (2), 1995, pp. 443–453.
[16] mandelbrot, b. b.: the fractal geometry of nature. new york, usa: w. h. freeman and co., 1983.

acta polytechnica vol. 51 no. 6/2011

observing galaxy clusters with erosita: simulations
j. hölzl, j. wilms, i. kreykenbohm, ch. schmid, ch. grossberger, m. wille, w. eikmann, t. brand

abstract
the erosita instrument on board the russian spectrum roentgen gamma spacecraft, which will be launched in 2013, will conduct an all sky survey in x-rays. a main objective of the survey is to observe galaxy clusters in order to constrain cosmological parameters and to obtain further knowledge about dark matter and dark energy. for the simulation of the erosita survey we present a monte-carlo code generating a mock catalogue of galaxy clusters distributed according to the mass function of [1]. the simulation generates the celestial coordinates as well as the cluster mass and redshift.
from these parameters, the observed intensity and angular diameter are derived. these are used to scale chandra cluster images as input for the survey simulation.
keywords: galaxy clusters, cosmology, erosita, simulation.

1 introduction
in recent years, our knowledge about the cosmological parameters has increased significantly through, e.g., precision measurements of the cosmic microwave background [2] and the supernova cosmology project [3]. measurements of the expansion history of our universe have provided evidence for the existence of a dark energy component which dominates all other contents of the universe and drives expansion [4]. a complementary method for measuring the cosmological parameters is to perform observations of large-scale structures. galaxy clusters form in areas overdense with respect to the mean density of the universe and therefore trace the large-scale structure, such that a statistically complete sample of clusters provides us with information about the cosmological parameters [4]. by combining different measurements, degeneracies of parameters can be broken [4]. the mass function of clusters depends on the density parameter ωm and the amplitude of the primordial power spectrum σ8. the evolution of the mass function and also the amplitude and the shape of the power spectrum of the spatial cluster distribution p(k) [5] are strongly influenced by dark matter and dark energy [4]. the baryonic acoustic oscillations, which enable the curvature of space to be measured, are also imprinted on the large-scale structure. in the following we present a simulation generating a catalogue of galaxy clusters, which is used as an input for the complete simulation of the erosita all sky survey [6].

galaxy clusters are the largest virialized structures in the universe. the space between the galaxies is filled with an intracluster medium with temperatures of several tens of millions of degrees, which causes x-ray emission in the energy band of ∼ 2–15 kev. the gas traces the gravitational potential of the cluster, therefore the total mass of the cluster can be calculated by measuring the temperature and density profile [7].

2 erosita
erosita (extended roentgen survey with an imaging telescope array) is one of two instruments on the russian spectrum roentgen gamma mission [8]. it consists of seven co-aligned identical wolter-i x-ray telescopes with 54 nested mirrors each. each of the wolter telescopes is equipped with an identical pnccd camera developed by the mpi halbleiterlabor. the pnccds, which are backside-illuminated charge coupled devices (ccds), are advanced versions of the pnccds flying on xmm-newton [9]. the main science drivers of erosita are high precision cosmology, i.e., determination of the cosmological parameters independent of cosmic microwave background and supernova measurements, and the study of dark matter and dark energy. erosita will perform a deep all sky survey for four years, followed by pointed observations of selected fields. the flux limit will be at least one order of magnitude lower than the flux limit of the rosat all sky survey [10]. with an effective area of ∼ 1500 cm2 at 1.5 kev, erosita will detect about 50000–100000 galaxy clusters [4].

3 mass function
as a cluster mass function, the function from [1] was used (fig. 1):

$\frac{dn}{dm} = f(\sigma)\, \frac{\rho_m}{m}\, \frac{d \ln \sigma^{-1}}{dm}$ (1)

fig. 1: halo mass function from [1] for redshifts of 0, 1 and 2.5 for halos with an overdensity of 500 relative to the critical density at redshift z. the y-axis is plotted with a factor m2/ρ0. the function shows a strong redshift evolution; the number density of clusters decreases with increasing redshift.

here n = dn/dv is the number density of halos, ρm is the mean matter density of the universe, and σ is the root mean square of the linear matter power spectrum at redshift z. the function f(σ) has been calibrated by n-body simulations for redshifts up to z = 2.5 [1]. clusters are identified as isolated density peaks, and the halo mass is calculated in a spherical area around the peak enclosing a specified overdensity. [1] describes the redshift evolution of the mass function through interpolation formulae or, alternatively, splines for the fitting parameters of f(σ). the redshift also enters through σ(z).

4 simulation
as a lower limit for the cluster mass we chose $10^{13}\, m_\odot$. up to a redshift of z = 2.5, the mass function gives about 23 million clusters with a mass larger than the limit. in the first step, the celestial coordinates, redshift, and mass were generated via the rejection sampling method [12]. the lx − m relation by [11] was used to obtain the luminosity as a function of the mass. after sampling the catalogue, it was converted into a fits file to be used as an input for the simulation of the survey, which is performed with the simulation of x-ray telescopes (sixt) software [6]. each entry of the catalogue links to a chandra image of a galaxy cluster provided by reiprich (private communication), which is scaled in size and luminosity according to the corresponding mass and redshift generated in the simulation, and to a spectrum.
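rejection sampling, as used here for drawing the cluster parameters, only needs the target distribution up to normalization: draw a candidate uniformly, then accept it with probability proportional to the pdf value. the python sketch below illustrates the idea for the mass variable with a toy power-law stand-in for the tinker mass function; the actual code samples coordinates, redshift and mass from the full distribution.

```python
import numpy as np

rng = np.random.default_rng(42)

def sample_rejection(pdf, x_min, x_max, pdf_max, n):
    """draw n samples from an un-normalized 1-d pdf via rejection sampling"""
    out = []
    while len(out) < n:
        x = rng.uniform(x_min, x_max)
        if rng.uniform(0.0, pdf_max) < pdf(x):
            out.append(x)
    return np.array(out)

# toy stand-in for dn/dm: a steep power law above the mass cut of 1e13 solar masses
pdf = lambda m: (m / 1e13) ** -2.0
masses = sample_rejection(pdf, 1e13, 1e16, pdf(1e13), 10_000)
```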
5 summary and conclusions
we generated a mock catalogue of galaxy clusters using a monte-carlo code. the final source catalogue contains chandra images of galaxy clusters, which are scaled in luminosity and diameter according to the mass and redshift sampled by the simulation. the next step is to take the correlation function, which is the fourier transform of the power spectrum [5], into account when sampling the cluster positions, such that the celestial coordinates are no longer independent of redshift and mass (the mass enters because of the redshift evolution of the mass function). this will allow us to perform a realistic simulation of erosita's observing program.

this research was funded by the bundesministerium für wirtschaft und technologie under deutsches zentrum für luft- und raumfahrt grant numbers 50 qr 0903 and 50 or 0801.

references
[1] tinker, j., kravtsov, a. v., klypin, a., et al.: astrophys. j., 688, 709, 2008.
[2] komatsu, e., smith, k. m., dunkley, j., et al.: astrophys. j. suppl. ser., 192, 18, 2011.
[3] perlmutter, s., schmidt, b. p.: in k. weiler (ed.), supernovae and gamma-ray bursters. lecture notes in physics, berlin: springer verlag, 598, 195, 2003.
[4] predehl, p., hasinger, g., böhringer, h., et al.: in space telescopes and instrumentation ii: ultraviolet to gamma ray. proc. spie 6266, 62660p, 2006.
[5] peacock, j. a.: cosmological physics. cambridge: cambridge university press, 1999.
[6] schmid, c., martin, m., wilms, j., et al.: x-ray astronomy 2009, aip conf. proc., 1248, 591, 2010.
[7] trümper, j. e., hasinger, g. (eds.): the universe in x-rays (astronomy and astrophysics library). berlin: springer, 2008.
[8] pavlinsky, m., sunyaev, r., churazov, e., et al.: in optics for euv, x-ray, and gamma-ray astronomy iv. proc. spie 7437, 743708, 2009.
[9] strüder, l., briel, u., dennerl, k., et al.: astron. astrophys., 365, l18, 2001.
[10] voges, w.: adv. space res., 13, 391, 1993.
[11] vikhlinin, a., burenin, r. a., ebeling, h., et al.: astrophys. j., 692, 1033, 2009.
[12] deák, i.: random number generators and simulation. budapest: akademiai kiado, 1990.

johannes hölzl, jörn wilms, ingo kreykenbohm, christian schmid, christoph grossberger, michael wille, wiebke eikmann, thorsten brand
dr. karl remeis-sternwarte/ecap, universität erlangen-nürnberg, germany

acta polytechnica vol. 50 no. 2/2010

measurement of heat transfer coefficients in an agitated vessel with tube baffles
m. dostál, k. petera, f. rieger

abstract
cooling or heating an agitated liquid is a very common operation in many industrial processes. a classic approach is to transfer the necessary heat through the vessel jacket. another option, frequently used in the chemical and biochemical industries, is to use the heat transfer area of vertical tube baffles. in large equipment, e.g. a fermentor, the jacket surface is often not sufficient for large heat transfer requirements, and tube baffles can help in such cases. it is then important to know the values of the heat transfer coefficients between the baffles and the agitated liquid. this paper presents the results of heat transfer measurements using the transient method, when the agitated liquid is periodically heated and cooled by hot and cold water running through tube baffles. solving the unsteady enthalpy balance, it is possible to determine the heat transfer coefficient. our results are summarized by nusselt number correlations, which describe the dependency on the reynolds number, and they are compared with other measurements obtained by a steady-state method.
keywords: heat transfer coefficients, agitated vessels, tube baffles.

1 introduction
cooling or heating agitated liquid in vessels is a basic technological operation in the chemical, biochemical, pharmaceutical, food and processing industries. the cooling or heating rate depends on how the heat is supplied or removed, the mixing intensity and many other parameters. good knowledge of all parameters is important for the design of real equipment, e.g. fermentors for transforming biomass to biogas. a very frequent technique for heating or cooling agitated liquids is to transfer heat via the vessel jacket. in the case of large vessels, the heat transfer area of the jacket may not be sufficient, because the relative size of the transfer area decreases with increasing volume (the area increases with power 2 of the characteristic dimension, e.g. diameter, but the volume increases with power 3), or the jacket cannot be used for other, e.g. structural, reasons. in such cases, helical pipe coils or tube baffles can be used, usually with water or steam flowing inside as the heat transfer medium. in addition to the heat transfer, tube baffles also prevent circular motion of the agitated liquid and generate some axial mixing. the areas around the tube baffles are highly turbulent, so good heat transfer rates (coefficients) can be achieved. the heat transfer rate between a tube baffle and an agitated liquid depends on many parameters, e.g. the geometry, the agitated liquid properties, and the mixing intensity, which is influenced by the type of agitator and its rotation rate. the influence of most of these parameters can be represented by the heat transfer coefficient α.
the heat transfer rate $\dot Q$ between the agitated liquid and the tube baffle can then be expressed as

$\dot Q = \alpha S \Delta T$, (1)

where s is the heat transfer area of the tube baffle, and δt represents the characteristic mean temperature difference. this paper uses the transient method to find the heat transfer coefficient α on tube baffles in a vessel mixed by a six-blade turbine impeller with pitched blades. dimensionless parameters are usually used to describe the relation between heat transfer coefficients and other parameters, e.g. mixing intensity. the resulting dimensionless correlations, based on data from small laboratory equipment, can then be used to predict the rate of heat transfer in large-scale plant vessels. basic dimensionless parameters are the reynolds number

$Re = \frac{n d^2 \rho}{\mu}$, (2)

the prandtl number

$Pr = \frac{\nu}{a} = \frac{\mu c_p}{\lambda}$, (3)

and the nusselt number, which includes the heat transfer coefficient α,

$Nu = \frac{\alpha D}{\lambda}$. (4)

here, d is the vessel diameter and λ is the thermal conductivity of the agitated liquid. a general relation between all these dimensionless numbers is usually written as

$Nu = f(Re, Pr, \mathrm{geometry})$, (5)

and the following form is often seen in the literature:

$Nu = C\, Re^m Pr^n Vi^s$. (6)

the last term on the right-hand side is sieder-tate's correction factor, which represents the change in the thermophysical properties of the agitated liquid near the heat transfer wall (tube baffle, in our case). the reynolds power m is usually within the range 2/3…3/4. the prandtl power n is commonly given as 1/3, and sieder-tate's correction term power s is 0.14. the viscosity number vi is defined as the ratio of the agitated liquid dynamic viscosity at mean temperature and at the heat transfer wall temperature:

$Vi = \frac{\bar\mu}{\mu_w}$ (7)

many correlations for the nusselt number describing the heat transfer in jacketed vessels agitated by six-blade turbines with pitched angle 45° can be found in the literature. for example, chisholm [1] reported

$Nu = 0.52\, Re^{2/3} Pr^{1/3} Vi^{0.14}$ (8)

and rieger et al. [2] used

$Nu = 0.56\, Re^{0.67} Pr^{1/3} Vi^{0.14}$. (9)

karcz and stręk [3] presented the results of heat transfer coefficient measurements for various three-blade propellers and various configurations of tube baffles. the following correlation is for a three-blade propeller

$Nu = 0.494\, Re^{0.67} Pr^{1/3} Vi^{0.14}$ (10)

and for the he3 impeller they presented

$Nu = 0.513\, Re^{0.67} Pr^{1/3} Vi^{0.14}$. (11)

karcz et al. [4] measured the heat transfer coefficients for rushton and smith turbine impellers, six-blade and three-blade impellers with pitched angle 45°, a three-blade propeller, and six various geometrical configurations of tube baffles. they presented the results using energy characteristics describing the dependency of the nusselt number on the modified reynolds number

$Re^* = \frac{(P/V)\, D^4 \rho^2}{\mu^3}$, (12)

where p is the agitator power input and v is the volume of the agitated liquid. the general nusselt correlation (6) then transforms to (liquid height equal to vessel diameter, and flat bottom)

$Nu = K \left(\frac{\pi}{4}\right)^{m/3} Re^{*\,m/3}\, Pr^n Vi^s$. (13)

lukeš [5] also measured heat transfer coefficients in a vessel with tube baffles. he compared the results obtained for a two-stage impeller (combining an axial and a radial type impeller) to a three-blade turbine with pitched angle 45°. the following correlation describes a pitched three-blade impeller:

$Nu = 0.5416\, Re^{0.6576} Pr^{1/3} Vi^{0.14}$. (14)
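all the correlations above share the same structure, so a single helper turns any (c, m) pair into a heat transfer coefficient via eqs. (2)–(4). the python sketch below is an illustration only, assuming the water properties and geometry of table 1; it is not code from the paper.

```python
def alpha_from_correlation(c, m, n, d, d_vessel, rho, mu, cp, lam, vi=1.0, s=0.14):
    """nu = c * re**m * pr**(1/3) * vi**s, eq. (6) with n = 1/3, solved for alpha"""
    re = n * d**2 * rho / mu          # eq. (2), n in 1/s
    pr = mu * cp / lam                # eq. (3)
    nu = c * re**m * pr**(1.0/3.0) * vi**s
    return nu * lam / d_vessel        # eq. (4), nu = alpha*D/lambda

# correlation (9) for water at 30 deg c in the vessel of table 1, n = 500 rpm
print(alpha_from_correlation(0.56, 0.67, 500/60, 0.067, 0.200,
                             995.7, 0.7966e-3, 4178.0, 0.618))  # ~4.1e3 w m-2 k-1
```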
2 theoretical basics of the transient method
the transient method is based on time monitoring of the temperature of the agitated liquid. assuming a perfectly mixed liquid with constant temperature t throughout its entire volume, a perfectly insulated system with no heat sources (e.g. dissipation of the mechanical energy of the impeller), and constant liquid mass m and specific heat capacity cp, we can write the unsteady enthalpy balance

$m c_p \frac{dT}{dt} = \dot Q$. (15)

the heat flow rate $\dot Q$ on the right-hand side of eq. (15) is proportional to the heat transfer coefficient α, the heat transfer area of the tube baffle s, and the characteristic mean temperature difference between the agitated liquid and the tube surface δt, see eq. (1). to express this mean temperature difference, we need to know the surface (wall) temperature tw. one way is to measure it directly, as we did for the jacket surface in our previous work [6]. the other way is to use the enthalpy balance of the cooling or heating water inside the tube baffle, as is usual in heat exchanger design theory, see for example [7]. in this case, we also have to take into account the heat transfer inside the tube and determine the corresponding heat transfer coefficient αi. assuming a constant specific heat capacity of the heat transfer medium cpb and constant values of the heat transfer coefficients on both sides of the tube, we can express the heat flow rate as

$\dot Q = k S \Delta T_{\ln}$, (16)

where k is the overall heat transfer coefficient and δtln is the logarithmic mean temperature difference between the agitated liquid and the heat transfer medium,

$\Delta T_{\ln} = \frac{T'_b - T''_b}{\ln \dfrac{T'_b - T}{T''_b - T}}$. (17)

neglecting the tube baffle wall thickness in the case of materials with big thermal conductivities (e.g. copper), the overall heat transfer coefficient can be expressed using the heat transfer coefficients on both sides,

$k = \left( \frac{1}{\alpha} + \frac{1}{\alpha_i} \right)^{-1}$. (18)

the heat transfer rate $\dot Q$ at a specific time can also be expressed using the enthalpy balance of the heating or cooling medium,

$\dot Q = \dot m_b c_{pb} (T'_b - T''_b)$. (19)

fig. 1: schema of our experimental equipment

substituting (16) and (17) into (15), we get the first-order ordinary differential equation

$m c_p \frac{dT}{dt} = k S \frac{T'_b - T''_b}{\ln \dfrac{T'_b - T}{T''_b - T}}$. (20)

using the initial condition for the agitated liquid temperature,

$T\big|_{t=0} = T_0$, (21)

we can solve eq. (20) and get the time course of the temperature of the agitated liquid. in the case of a constant inlet temperature t′b, we can directly use the enthalpy balance (19) to find the outlet temperature of the cooling/heating medium t′′b, and the ordinary differential equation (20) has an analytical solution. in our transient method, the inlet and outlet temperatures change in time; we measure them together with the temperature of the agitated vessel ti, and we have to use some numerical method to solve eq. (20). the solution gives us the theoretical time profile t(t) for a given overall heat transfer coefficient k and the measured inlet and outlet temperatures of the heat transfer medium. the real heat transfer coefficient k should ensure small deviations of the theoretical temperature profile t(t) from the measured profile ti. mathematically, we can look for such a value of k which minimizes the sum of squares of the deviations,

$\sum_{i=1}^{N} \left( T(t_i) - T_i \right)^2 = \min$. (22)

using this condition, we get an optimal value of the overall heat transfer coefficient k which best fits our experimental data. to express the heat transfer coefficient on the side of the agitated liquid, α, from eq. (18), we have to know the heat transfer coefficient inside the tube baffle, αi. it mostly depends on the geometry and the flow regime. for turbulent flow of a newtonian liquid in a pipe of circular cross-section, [8] gave the correlation

$Nu_b = \frac{\alpha_i d_{bi}}{\lambda_b} = \frac{(\xi/8)\, Re_b Pr_b}{1 + 12.7 \sqrt{\xi/8}\, \left( Pr_b^{2/3} - 1 \right)} \left[ 1 + \left( \frac{d_{bi}}{L_b} \right)^{2/3} \right]$, (23)

where ξ is defined as

$\xi = (1.8 \log Re_b - 1.5)^{-2}$ (24)

and the reynolds number for a circular pipe with diameter dbi is

$Re_b = \frac{\bar u_b d_{bi} \rho_b}{\mu_b}$. (25)

the mean velocity of the heating (cooling) transfer medium, ūb, can be written as

$\bar u_b = \frac{4 \dot m_b}{\pi \rho_b d_{bi}^2}$. (26)
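a direct transcription of eqs. (23)–(26) into python is shown below; the mass flow rate and tube length in the example are illustrative assumptions only (the real values were measured), and turbulent flow is assumed.

```python
import math

def alpha_inside_tube(mdot, d_bi, l_b, rho, mu, cp, lam):
    """in-tube heat transfer coefficient from eqs. (23)-(26), turbulent flow assumed"""
    u = 4.0 * mdot / (math.pi * rho * d_bi**2)      # eq. (26)
    re = u * d_bi * rho / mu                        # eq. (25)
    pr = mu * cp / lam
    xi = (1.8 * math.log10(re) - 1.5) ** -2.0       # eq. (24)
    nu = ((xi / 8.0) * re * pr
          / (1.0 + 12.7 * math.sqrt(xi / 8.0) * (pr ** (2.0/3.0) - 1.0))
          * (1.0 + (d_bi / l_b) ** (2.0/3.0)))      # eq. (23)
    return nu * lam / d_bi

# water in the 8 mm tube baffle; flow rate and tube length are assumptions
print(alpha_inside_tube(0.05, 0.008, 0.4, 995.7, 0.7966e-3, 4178.0, 0.618))
```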
table 1: geometrical parameters of our experimental equipment, and thermophysical properties of the agitated liquid
vessel diameter D | 200 mm
liquid height H | 200 mm (H/D = 1)
inner baffle diameter dbi | 8 mm
outer baffle diameter dbe | 10 mm (dbe/D = 0.05)
baffles circle diameter db | 144 mm (db/D = 0.72)
number of baffles | 4
baffles material | copper
heat transfer area S | 0.011 m2
impeller type | six-blade turbine, pitched angle 45°
impeller diameter d | 67 mm (D/d = 3)
impeller height above bottom h2 | 67 mm (h2/d = 1, h2/D = 1/3)
blade width b | 13 mm (b/D = 0.065, b/d = 0.194)
impeller rotation rate n | 200–1200 min−1
agitated liquid | distilled water
average temperature T | 30 °c
density at T, ρ | 995.7 kg m−3
specific heat capacity cp | 4178 j kg−1 k−1
thermal conductivity λ | 0.618 w m−1 k−1
dynamic viscosity μ | 0.7966·10−3 pa s
prandtl number Pr | 5.39
thermal diffusivity a | 0.148·10−6 m2 s−1
agitated liquid mass m | 5.760 kg

as already mentioned, it is not possible to solve the ordinary differential equation (20) analytically, because the inlet and outlet temperatures, t′b and t′′b, of the heating or cooling liquid flowing inside the tube baffle change with time. in our case, we used the improved euler method with second-order accuracy, see [9],

$T_{n+1} = T_n + 0.5\,(k_1 + k_2), \qquad k_1 = \Delta t\, f(t_n, T_n), \qquad k_2 = \Delta t\, f(t_n + \Delta t,\, T_n + k_1)$, (27)

which solves an ordinary differential equation with right-hand side f(t, t) corresponding in our case to the right-hand side of eq. (20) divided by m cp:

$\frac{dT}{dt} = f(t, T) = \frac{k S}{m c_p}\, \frac{T'_b - T''_b}{\ln \dfrac{T'_b - T}{T''_b - T}}$. (28)

this means that in every step of our optimization procedure described by eq. (22) it is necessary to solve the previous differential equation numerically. this sets higher demands on computational resources, but they can be satisfied using present-day computers, and the optimization process can be implemented in high-level programming language systems like matlab® or octave.

3 experimental
measurements of the heat transfer coefficient between the agitated liquid and the tube baffle, using the transient method described in the previous section, were carried out in a cylindrical vessel with an elliptical bottom, 200 mm in diameter. the vessel was insulated by a polystyrene jacket. four two-tube baffles were used, regularly positioned by 90° along the vessel wall. a six-blade turbine impeller with pitched angle 45° was used. the geometrical and other parameters are depicted in detail in figure 1 and table 1.

fig. 2: typical time courses of the agitated liquid temperature and the inlet and outlet temperatures of the cooling/heating medium flowing in the tube baffle during a single heating/cooling cycle, n = 500 min−1. red and blue circles outline the results of numerical integration with best-fit values of the overall heat transfer coefficient k.

the agitator was driven by a servodyne 5000-45 power unit (cole parmer instrument co., 150–6000 min−1).
distilled water was used in the vessel, and its temperature was measured using a pt100 platinum resistance thermometer placed in the area between the impeller and the tube baffles, see fig. 1. public water mains were used to supply hot or cold water into the tube baffles. the inlet and outlet temperatures of the heat transfer medium were again measured using pt100 platinum resistance thermometers, and the flow rate was determined by weighing the liquid passed in a specific time interval. the platinum resistance thermometers were calibrated before the experiments, using an accurate laboratory mercury thermometer, to obtain the dependency of their resistance on temperature (standard relations for pt100 were not used). the resistance of the pt100 thermometers was measured using the four-wire method and an agilent 34970a programmable multimeter (agilent technologies). the multimeter contains an integration type a/d converter, so we set the integration period to 20 ms, which corresponds to the period length of the voltage supply (frequency 50 hz). the temperatures of the agitated liquid and the heat transfer medium were measured with period 1 s. measurements were performed periodically. first, the whole equipment was assembled (tube baffles and impeller) and the amount of agitated liquid was weighed. then, the liquid, agitated at constant impeller rotation speed, was cooled down to a low temperature by cold water flowing through the baffle. after reaching a steady state, we switched to hot water. the agitated liquid temperature started to increase, and it was measured together with the inlet and outlet temperatures of the water running inside the baffle until the agitated liquid temperature approached the inlet hot water temperature. during this period, samples of flowing water were weighed in order to determine the mass flow rate. then, we switched to cold water again and repeated the whole measurement process during cooling of the agitated liquid. figure 2 shows a typical time course of temperatures measured during one experiment cycle for a specific rotation rate. the heat transfer coefficient was not evaluated using the whole time course. it is obvious from figure 2 that the measured temperatures of the agitated liquid are within the range of 15 °c through 45 °c, which corresponds to the water mains temperatures. we used a narrower temperature range of 20–40 °c to evaluate the heat transfer coefficient, as described in the previous section. the mean temperature of this range was 30 °c, which was close to the ambient temperature; it therefore practically prevented substantial heat exchange between the agitated liquid and the surroundings, and minimized the measurement errors (these were neglected in our mathematical model).
(23) and measured flow rate n (min−1) re α (w m−2 k−1) nu vi0.14 αi 200 18 681 2 455 / 2 218 794 / 717 1.041 0 / 0.953 2 16 941 / 14 052 300 28 021 3 199 / 2 900 1 034 / 938 1.042 0 / 0.955 5 17 091 / 13 128 400 37 362 3 831 / 3 486 1 239 / 1 127 1.039 8 / 0.958 3 16 879 / 13 107 500 46 702 3 890 / 4 061 1 258 / 1 313 1.039 1 / 0.963 0 14 796 / 11 683 500 46 702 3 798 / 4 036 1 228 / 1 305 1.040 4 / 0.959 2 16 898 / 13 205 600 56 042 4 956 / 4 519 1 602 / 1 461 1.038 0 / 0.963 8 16 106 / 11 243 700 65 383 5 498 / 5 028 1 778 / 1 626 1.036 2 / 0.963 3 17 275 / 12 459 800 74 723 6 039 / 5 522 1 953 / 1 785 1.037 1 / 0.962 9 17 332 / 13 729 900 84 064 6 467 / 5 950 2 091 / 1 924 1.034 2 / 0.963 7 16 970 / 13 945 1 000 93 404 7 006 / 6 357 2 265 / 2 056 1.034 4 / 0.968 2 17 032 / 11 487 1 200 112 085 8 045 / 7 737 2 601 / 2 502 1.032 4 / 0.954 5 18 595 / 13 338 4 measured data evaluation the measured data was processed in two steps. in the first step, we determined the heat transfer coefficients for specific rotation rates, and in the second step the nusselt correlation parameters were determined. in the first step, we obtained the time courses of the measured temperatures during one heating/cooling cycle for a specific rotation rate, as displayed in figure 2. using a numerical solution of eq. (28) and minimizing the sum of squares (22), we found the overall heat transfer coefficient k which best described the measured temperature profile. the red and blue circles in figure 2 outline the result of this numerical solution using the best-fit values. see [10] for more details and some matlab code examples. this procedure was applied to both the heating phase and the cooling phase, so we had two different values of the overall heat transfer coefficients, one for heating, and the other for cooling. using eq. (23) and the measured mass flow rate of the heating (cooling) water, the heat transfer coefficients inside the tube baffles can be calculated, and it is easy to express the heat transfer coefficients on the agitated liquid side from eq. (18). α = ( 1 k − 1 αi )−1 (29) these values for different rotation speeds are shown in table 2 which presents the results of the first data evaluation step (pairs of values delimited by a forward slash correspond to heating and cooling, respectively). other columns in this table show the calculated values of nusselt numbers, and also the viscosity numbers, which describe the influence of temperature on the thermophysical properties near the heat transfer area (baffles). the second step of our data evaluation focused on finding optimal values of parameters c and m in eq. (6) for the nusselt number. again, this was based on minimizing the sum of squares of the deviations, defined as ss = n∑ i=1 [ c remi pr 1/3 − nui/vi0.14i ]2 = min , (30) where rei, nui and vi 0.14 i correspond to individual rows in table 2 (the prandtl number was calculated for the mean temperature of agitated liquid t = 30◦c, and the last row with rotation rate 1 200 min−1 was skipped in the optimization procedure because one thermometer broke during the experiment and the calculated values of the heat transfer coefficients were therefore inaccurate). the result of this optimization procedure (nonlinear regression, actually) is the following correlation, describing the nusselt number for our case of heating or cooling of an agitated liquid using tube baffles nu = 0.54 re0.675pr1/3vi0.14 . (31) figure 3 compares our results with other data in the literature [5, 2, 3, 1]. 
5 confidence interval analysis
confidence intervals are omitted in many papers, especially when dealing with a nonlinear regression. however, they are important, as they can show how precisely the parameters were determined. they usually express some (95 %) probability that the true parameter value lies within a certain interval. if this interval is wide, then we do not know the parameter value well, and we should probably obtain more (precise) data or redefine our model function. this "qualitative" conclusion can be made for the case of nonlinear regression with approximate (asymptotic) intervals. in our case, we have determined them for the two parameters in eq. (31) using the matlab command nlparci as

$C = 0.540 \pm 0.278 \;(0.262 \ldots 0.818), \qquad m = 0.675 \pm 0.047 \;(0.628 \ldots 0.722)$. (32)

fig. 3: our measured data points and the nusselt correlation described by eq. (31). correlations from [5, 2, 3, 1] are depicted for comparison.

parameter m has a relatively narrow confidence interval, so we can be satisfied. this is not the case for parameter c, which has quite a large confidence interval. what does this show? well, yes, we have a small data set here, and it would be nice to have more data points and more accurate data points. the other reason is that parameter c is closely connected with m, and if m is changed only a little, the consequence is a relatively large change in c. if we fixed parameter m to some constant value, for example 0.67, then we would obtain a very narrow confidence interval for c,

$C = 0.571 \pm 0.010$, (33)

which confirms a high correlation of the two parameters. this is also confirmed by the correlation matrix

$r_{ij} = \frac{C_{ij}}{\sqrt{C_{jj} C_{ii}}}\,; \qquad r = \begin{pmatrix} 1 & -0.9994 \\ -0.9994 & 1 \end{pmatrix}$, (34)

where the non-diagonal elements represent the correlation between parameters c and m. the closer their values are to 1 (or −1), the higher is the correlation. the minus sign means that an increase in the value of one parameter can be compensated by decreasing the other parameter, and vice versa. cij is the covariance matrix [11]. so, in our case, if we increase m, we have to decrease c so that we will get a result (fit) that is not much worse. another way to analyze the confidence intervals of the parameters is via the "extra sum-of-squares f test" [12], which is an adaptation of anova (analysis of variance). it describes the difference between two models (simpler and more complex) using their sums of squares of deviations (errors) ss and their corresponding degrees of freedom df:

$F = \frac{(SS_A - SS_B)/SS_B}{(df_A - df_B)/df_B}$ (35)

if the relative difference of the sums of squares of two different models (in the numerator) is approximately the same as (or smaller than) the relative difference of the degrees of freedom (in the denominator), then the two models are most probably similar and we can use the simpler one. if the relative difference of the sums of squares is greater than the relative difference in degrees of freedom, then this probability is smaller and the model with the smaller sum of squares (the more complex model) is probably better. the probability p of getting an f-ratio less than or equal to a specific value can be described by the f-distribution (fisher-snedecor), see figure 4. the more frequently used p-value, defined as 1 − p, shows the probability of getting f greater than some specific value. in other words, the p-value shows the probability that the simpler model and the more complex model are similar (not too different). if we get a p-value less than 0.05 (5 %), then the two models are considered significantly different and we should reject the simpler model.

fig. 4: the f cumulative distribution function used in the extra sum-of-squares f test. this describes the probability p that the f-ratio (eq. 35) is less than or equal to some specific value. the more frequently used p-value, defined as 1 − p, shows the probability that the simpler model and the more complicated model are similar (not too different). the lower its value is, the more significant is the difference. if the p-value is less than 0.05, we usually assume the simpler model is not correct and should be rejected.

fig. 5: the probability density function of the f-distribution, which integrated within a certain range gives the probability of an f-ratio located within that range. the filled red area corresponds to p-value 0.05, that is, to the probability that the f-ratio is greater than the critical value. the critical value fcrit = 3.5546 stands here for p-value 0.05 and degrees of freedom 2 and 18, and can be calculated as finv(1-0.05,2,18) in matlab.

our goal is the reverse. we would like to find a region where the sum of squares is not significantly different from the sum of squares for our best-fit parameters, so that models with parameters in this region can be considered practically the same (statistically not significantly different). this region can be defined as [12]

$SS_{\text{all-fixed}} = SS_{\text{best-fit}} \left[ \frac{p}{n - p}\, F_{0.95}(p,\, n - p) + 1 \right]$,

where f0.95 represents the inverse cumulative distribution function for the given confidence level of 95 %, p is the number of parameters, and n is the number of data points. in matlab, the f value for the 95 % confidence level can be calculated as finv(0.95,p,n-p). such a confidence region is depicted in figure 6. the contour command can be used to plot this region in matlab.
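the f-test machinery above is easy to script. the python sketch below, an illustration rather than the original matlab code, computes the critical f value (3.5546 for p = 2 and n = 20) and checks whether a given (c, m) pair lies inside the 95 % confidence region defined by ss_all-fixed.

```python
import numpy as np
from scipy import stats

def ss(c, m, re, y, pr=5.39):
    return np.sum((c * re**m * pr**(1.0/3.0) - y) ** 2)   # eq. (30)

def in_confidence_region(c, m, re, y, ss_best, p=2, conf=0.95):
    """true if (c, m) produces a curve not significantly different from the best fit"""
    n = len(re)
    f_crit = stats.f.ppf(conf, p, n - p)  # finv(conf, p, n-p) in matlab; 3.5546 for (2, 18)
    bound = ss_best * (p / (n - p) * f_crit + 1.0)
    return ss(c, m, re, y) <= bound
```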
the maximum and minimum values of the parameters obtained from this region give us larger and asymmetric confidence intervals compared to the asymptotic ones, see eq. (32):

$C = 0.276 \ldots 1.034, \qquad m = 0.616 \ldots 0.736$. (36)

here, we should realize that these two parameters are closely joined together. so if one parameter moves to one side of its confidence interval, the other should also move so that it stays inside the confidence region in figure 6. this is, for example, the case of our 1-parameter fit (eq. 33), or the correlation by [2]. our 1-parameter fit is very close to the 2-parameter fit, so it is not plotted in figure 3. rieger's correlation is plotted there, and it is close to ours. from the statistical point of view, there is no difference between these models for the 95 % confidence level, so we can be satisfied with the 1-parameter fit alone. let us try to compare situations when we take values of parameters c and m from places near to the left or right margins of our confidence region, denoted in figure 6 as "test 1" and "test 2". they are compared with our 2-parameter fit in figure 7. the difference is not so big if we look at the data points, and it is not significant from the statistical point of view. both curves fall into the darker gray region, and this corresponds to the asymptotic confidence band constructed by the matlab command nlpredci with options simopt=on and predopt=curve. this represents an interval where, with 95 % probability, the true best-fit curve should be. the lighter gray and wider band corresponds to the asymptotic prediction band, where 95 % of data points from all following measurements should fall (nlpredci with options simopt=on and predopt=observation).

fig. 6: 95 % confidence region (contour) of parameters m and c, which encloses parameter values that produce curves not significantly different from the best-fit curve.

fig. 7: comparing two "extreme" values of parameters c and m, "test 1" and "test 2", with our best-fit correlation (eq. 31) and with rieger et al. [2]. in addition, the darker gray region displayed here corresponds to the asymptotic confidence band where the true best-fit curve should lie with 95 % probability. the lighter gray region is the asymptotic prediction band where 95 % of the data points obtained in many following measurements should fall.

6 conclusions
we have measured the heat transfer coefficients on tube baffles using the transient method, when the agitated liquid is periodically heated and cooled by the liquid running through tube baffles. for the reported geometrical parameters, the following correlation summarizes our data:

$Nu = 0.54\, Re^{0.675} Pr^{1/3} Vi^{0.14}$.

we have also analyzed the confidence regions of the parameters in the previous correlation, and we found that the one-parameter fit of our data with the commonly used exponent m = 0.67,

$Nu = 0.571\, Re^{0.67} Pr^{1/3} Vi^{0.14}$,

and the correlation in [2] also fall into the 95 % confidence region, producing curves which are not significantly different from the best-fit curve (from the statistical point of view). table 3 summarizes the constants of the heat, energy and power characteristics for the corresponding correlations.

table 3: comparison of different impeller types and tube baffle configurations (4×2 means four two-tube baffles). constant c, eq. (6), is given for the commonly used exponent m = 0.67 ($Nu = C\, Re^{0.67} Pr^{1/3} Vi^{0.14}$). constant k from the energy characteristic, eq. (13), is also given for m = 0.67. the power number $Po = P/(\rho n^3 d^5)$ follows the corresponding references, [5], or our own experiments.
authors | impeller type and tube baffle configuration | c | k | po | m (two-param. fit)
[3] | propeller, 4×4 | 0.494 | 0.518 | 0.27 | 0.642 ± 0.075
[3] | he3, 4×4 | 0.513 | 0.406 | 0.95 | 0.665 ± 0.056
[4] | pitched six-blade 45°, 24×1 | 0.750 | 0.540 | 1.50 | –
[4] | propeller, 24×1 | 0.640 | 0.630 | 0.37 | –
[5] | pitched three-blade 45°, 4×2 | 0.494 | 0.393 | 0.93 | 0.6576
this work | pitched six-blade 45°, 4×2 | 0.571 | 0.396 | 1.60 | 0.676 ± 0.047

acknowledgement
this work has been supported by research project of the czech ministry of education msm6840770035.

references
[1] chisholm, d., ed.: heat exchanger technology. elsevier applied science, 1988.
[2] rieger, f., novák, v., ditl, p., fořt, i., vlček, j., ludvík, m., machoň, v., medek, j.: míchání a míchací zařízení. maprint 9, čschi, praha, 1995 (in czech).
[3] karcz, j., stręk, f.: heat transfer in agitated vessel equipped with tubular coil and axial impeller. mieszanie '99, 1999, pp. 135–140.
[4] karcz, j., stręk, f., major, m., michalska, m.: badania efektywności wnikania ciepła w mieszalniku z pionową wężovnicą rurową. inżynieria i aparatura chemiczna, 2002, pp. 76–78.
[5] lukeš, j.: mixing equipment with tube baffles. master thesis, czech technical university in prague, 2000 (in czech).
[6] petera, k., dostál, m., rieger, f.: transient measurement of heat transfer coefficient in agitated vessel. in: mechanical engineering 2008, slovak university of technology, bratislava, 2008.
[7] shah, r. k., sekulić, d. p.: fundamentals of heat exchanger design. john wiley & sons, 2003.
[8] schlünder, e. u., ed.: vdi-wärmeatlas (berechnungsblätter für den wärmeübergang). vdi verlag, düsseldorf, 1994.
[9] acheson, d.: from calculus to chaos: an introduction to dynamics. oxford university press, 1997.
[10] dostál, m., petera, k., rieger, f.: measurement of heat transfer coefficients in agitated vessel with tube baffles. in: chisa conference, srní, 2009 (in czech).
[11] press, w. h., teukolsky, s. a., vetterling, w. t., flannery, b. p.: numerical recipes: the art of scientific computing. cambridge university press, 1992, 2nd edition.
[12] motulsky, h. j., christopulos, a.: fitting models to biological data using linear and nonlinear regression. a practical guide to curve fitting. graphpad software inc., san diego ca, http://www.graphpad.com, 2003.

ing. martin dostál, ph.d., phone: +420 224 358 489, e-mail: martin.dostal@fs.cvut.cz
ing. karel petera, ph.d., phone: +420 224 359 949, e-mail: karel.petera@fs.cvut.cz
prof. ing. františek rieger, drsc., phone: +420 224 352 548, e-mail: frantisek.rieger@fs.cvut.cz
department of process engineering, faculty of mechanical engineering, czech technical university in prague, technická 4, 166 07 prague 6, czech republic

nomenclature
a – thermal diffusivity (m2 s−1)
C – model parameter (−)
cpb – specific heat capacity of heating or cooling liquid b (j kg−1 k−1)
cp – specific heat capacity of the agitated liquid (j kg−1 k−1)
Cij – covariance matrix, [11] (−)
d – impeller diameter (m)
dbi – inner diameter of tube baffle (m)
dbe – outer diameter of tube baffle (m)
D – inner diameter of vessel (m)
db – tube baffle circle diameter (m)
df – degrees of freedom (−)
F – ratio of the sums of squares and degrees of freedom for two different models (−)
F – cumulative f-distribution function (fisher-snedecor distribution) (−)
h2 – clearance between impeller and vessel bottom (m)
H – height of agitated liquid in the vessel (m)
k – overall heat transfer coefficient (w m−2 k−1)
k1,2 – euler's method constants (°c, k)
K – model parameter (−)
m – model parameter (−)
ṁb – mass flow rate of heating (cooling) liquid b (kg s−1)
m – mass of agitated liquid (kg)
n – model parameter (−)
N – number of measurements (−)
n – impeller rotation speed (s−1)
Nu – nusselt number (−)
Nub – nusselt number of heating (cooling) liquid b (−)
p – number of parameters (−)
p – p-value, probability 1 − P (−)
P – power input (w)
P – probability (−)
Po – power number (−)
Pr – prandtl number (−)
Prb – prandtl number for heating (cooling) liquid b (−)
q – heat flux (w m−2)
Q̇ – heat transfer rate (w)
rij – correlation matrix, coefficient (−)
Re – reynolds number (−)
Reb – reynolds number for heating (cooling) liquid b (−)
Re∗ – modified reynolds number (−)
s – model parameter (−)
S – heat transfer area (m2)
SS – sum of squares, eq. (30) (−)
t – time (s)
T – temperature, temperature of agitated liquid (°c, k)
T0 – initial temperature of agitated liquid (°c, k)
Tb – temperature of heating (cooling) liquid b (°c, k)
T′b – inlet temperature of heating (cooling) liquid b (°c, k)
T′′b – outlet temperature of heating (cooling) liquid b (°c, k)
Ti – measured temperature of agitated liquid (°c, k)
Tw – wall temperature (°c, k)
ūb – mean velocity of liquid in tube baffle (m s−1)
V – volume of agitated liquid (m3)
Vi – viscosity ratio (−)
α – heat transfer coefficient between agitated liquid and tube baffle (w m−2 k−1)
αi – heat transfer coefficient inside tube baffle (w m−2 k−1)
Δt – time step of euler's method (s)
ΔTln – mean logarithmic temperature difference (°c, k)
λ – thermal conductivity of agitated liquid (w m−1 k−1)
λb – thermal conductivity of heating (cooling) liquid (w m−1 k−1)
μ – dynamic viscosity of agitated liquid (pa s)
μb – dynamic viscosity of heating (cooling) liquid (pa s)
μ̄ – dynamic viscosity of agitated liquid at mean temperature (pa s)
μw – dynamic viscosity of agitated liquid at wall temperature of tube baffle tw (pa s)
ν – kinematic viscosity of agitated liquid (m2 s−1)
ρ – density of agitated liquid (kg m−3)
ρb – density of heating (cooling) liquid (kg m−3)

acta polytechnica vol. 41 no. 1/2001

temperature and moisture dependence of the specific heat of high performance concrete
j. toman, r. černý

abstract
the specific heat of two types of high performance concrete was measured in the temperature range from 20 °c to 1000 °c and in the moisture range from dry material to saturation water content. a nonadiabatic method was chosen instead of classical adiabatic treatments in order to meet the requirements following from the large representative elementary volume of the materials. the measured results reveal a significant temperature effect on the specific heat value. the influence of moisture is less important than the influence of temperature, but it is also not negligible.
keywords: concrete, specific heat, moisture content, temperature.

1 experimental determination of specific heat
application of classical adiabatic-calorimetry methods in measuring the specific heat of building materials can lead to certain difficulties for the following reasons: a) the samples of building materials have to be relatively large because of their inhomogeneity; b) due to the low thermal conductivity of most building materials, it can take a relatively long time to reach temperature equilibration over large dimensions, resulting in significant heat loss. therefore, a nonadiabatic method was used for determining the temperature-dependent specific heat (see [1] for details). the nonadiabatic calorimeter that we used has a mixing vessel with a volume of 2.5 liters. the volume of the measuring fluid (water in this case) is about 1 liter. the maximum volume of the measured samples is 1 liter. the amount of heat loss of the nonadiabatic system is determined using a calibration. the calorimeter is filled with water, whose temperature is different from the ambient air. then, the relation of water temperature to time, tc(t), is measured. the tests show that the calibration curve tc(t) is nearly exponential; small differences in measuring conditions cause only small changes in the calibration curve. the experiments have a replicability better than 1 %. the measuring method itself is based on well-known principles. the sample is heated to a predetermined temperature ts in a muffle furnace and then put into the calorimeter with water. then, the relation of water temperature to time, tw(t), is measured, the water being slowly stirred all the time, until the temperatures of the measured sample and the calorimeter are equal. the duration of temperature equilibration ranges from 20 minutes to 1 hour, depending on the thermal conductivity and size of the material being measured.
the heat balance of the sample-calorimeter system can be written in the form:

$m c\,(T_s - T_e) = (K + m_w c_w)(T_e - T_{w0}) + \Delta m \cdot L - Q_r$, (1)

where m is the mass of the sample, ts is the temperature of the sample prior to being put into the calorimeter, c is the specific heat of the sample in the temperature interval [te, ts], k is the heat capacity of the calorimeter, mw is the mass of the water, cw is the specific heat of water, tw0 is the initial water temperature, l is the latent heat of evaporation of water, qr is the reaction heat, and δm is the mass of evaporated water,

$\Delta m = m_{cw} + m - m_s - \Delta m_n - \Delta m_{sc}$, (2)

where mcw is the mass of the calorimeter with water before the measurement, ms is the mass of the system calorimeter-water-sample after the measurement, δmn is the mass of the water naturally evaporated during the measurement (this heat loss is already included in the heat loss calibration curve), and δmsc is the change of mass due to the chemical reaction of the sample with water (e.g., hydrolysis). this value can be obtained as δmsc = m − md, where md is the mass of the dried sample after the measurement. determining the specific heat c directly from equation (1), we would obtain a mean value of specific heat, c̄, in the interval [te, ts] by

$\bar c = \frac{(K + m_w c_w)(T_e - T_{w0}) + \Delta m \cdot L - Q_r}{m\,(T_s - T_e)}$. (3)

however, from the physical point of view, it is more correct to determine the value of the specific heat "pointwise", in accordance with the definition of specific heat,

$c(T_i) = \left( \frac{\partial h}{\partial T} \right)_{T = T_i}$, (4)

where h is the specific enthalpy. using relation (4) to determine specific heat, we have to specify the zero-point of the enthalpy scale, i.e., we have to ensure that all the enthalpy calculations are related to a certain constant temperature. this reference temperature can be, for example, tk = 0 °c. upon adding

$\Delta Q = m c_0\,(T_e - T_k)$, (5)

where c0 is the mean specific heat of the sample in the temperature interval [0, te], to both sides of equation (1), and dividing by m, we obtain

$h(T_s) = \frac{(K + m_w c_w)(T_e - T_{w0}) + \Delta m \cdot L - Q_r}{m} + c_0\,(T_e - T_k)$. (6)

the value of c0 is considered to be constant, taking into account the condition

$T_s - T_e \gg T_e - T_k$, (7)

and it can be measured, for example, using the classical adiabatic method. performing a set of measurements for various sample temperatures ti, we obtain a set of points [ti, h(ti)]. a regression analysis of this pointwise given function results in a functional relationship h = h(t) and, using relation (4), also in the function c = c(t) as the first derivative of h with respect to t.
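the last step, regression of the pointwise enthalpy and its differentiation, can be sketched in a few lines of python. the [ti, h(ti)] values below are made-up placeholders, and a quadratic polynomial is just one possible choice of the regression function.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# made-up enthalpy points [t_i, h(t_i)] in (deg c, j/kg)
t = np.array([100., 200., 400., 600., 800., 1000.])
h = np.array([7.5e4, 1.7e5, 3.9e5, 6.4e5, 9.1e5, 1.2e6])

coef = P.polyfit(t, h, deg=2)        # regression h = h(t)
c = P.polyval(t, P.polyder(coef))    # eq. (4): c(t) = dh/dt
```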
2 materials and samples
the experimental work was done with two types of high performance concrete used in nuclear power plants: penly concrete and temelin concrete. penly concrete was used for the concrete containment building in a nuclear power plant in france (samples were obtained from m. dugat, bouygues company, france). it had a dry density of 2290 kg/m3, and consisted of the following components: cement cpa hp le havre (290 kg m−3), sand 0/5 size fraction (831 kg m−3), gravel sand 5/12.5 size fraction (287 kg m−3), gravel sand 12.5/25 size fraction (752 kg m−3), calcareous filler piketty (105 kg m−3), silica fume (30 kg m−3), water (131 kg m−3), retarder chrytard 1.7, super-plasticizer resine gt 10.62. the maximum water saturation was 4 %kg/kg. the temelin concrete, used for the concrete containment building of the temelin nuclear power plant in the czech republic, had a dry density of 2200 kg/m3 and maximum water saturation 7 %kg/kg. the composition was as follows: cement 42.5 r mokrá (499 kg m−3), sand 0/4 size fraction (705 kg m−3), gravel sand 8/16 size fraction (460 kg m−3), gravel sand 16/22 size fraction (527 kg m−3), water (215 kg m−3), plasticizer 4.5 l m−3. the specimens for measuring the specific heat were cubic in shape, 71 × 71 × 71 mm, for both materials. twenty measurements were made for each material.

3 measured results
3.1 temperature dependence of specific heat
the measured results are summarized in figs. 1a, b. in the temperature range up to 400 °c, both materials behaved in a similar way, and we observed a characteristic increase of specific heat with temperature. the measured data were very close to those determined earlier for similar concrete mixes [2]. for temperatures above 400 °c, the specific heat of temelin concrete slowed down its increase with temperature, and for temperatures higher than 600 °c the specific heat began to decrease. these changes in the character of the specific heat vs. temperature relation can be explained by structural changes of the material in this temperature range, which are due to the loss of crystallically bonded water and dehydration of some components. the penly concrete was found (from the point of view of structural changes) to be more stable against high temperature exposure. the specific heat vs. temperature relation remained increasing in the whole studied temperature range; only the increase was slower for temperatures above 400 °c.

fig. 1a: dependence of the specific heat of temelin concrete on temperature
fig. 1b: dependence of the specific heat of penly concrete on temperature

3.2 moisture dependence of specific heat
the measured results showed that the sensitivity of the nonadiabatic method was not high enough to distinguish the changes of specific heat due to the moisture variation in the whole range of moisture; only the differences between the dry samples and the samples with very high moisture content could be distinguished in a reliable way. the comparative experiments with the classical adiabatic method gave very similar results; again, no reliable data were obtained. therefore, we employed a classical superposition principle of water and dry material and calculated the specific heat according to the analytical formula given by i. kašpar [3]:

$c = \frac{c_d + c_w\, \frac{u}{100}}{1 + \frac{u}{100}}$ (8)

where cd is the specific heat of the dry material, cw the specific heat of water, and u the moisture in %kg/kg.
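eq. (8) is a plain mass-weighted average and is trivial to evaluate; a short python sketch follows. the dry specific heat value in the example is an assumed placeholder, not a measured number.

```python
def moist_specific_heat(c_dry, u_percent, c_water=4180.0):
    """eq. (8): superposition of water and dry material, u in %kg/kg"""
    w = u_percent / 100.0
    return (c_dry + c_water * w) / (1.0 + w)

# assumed dry value, moisture at the 7 % saturation limit of temelin concrete
print(moist_specific_heat(800.0, 7.0))   # ~1021 j/(kg k)
```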
4 conclusions

the specific heat of two types of high performance concrete was determined in wide temperature and moisture ranges. the temperature dependence was found to be very important. compared to the room temperature data, we observed changes within the range of several tens of percent, and in the case of penly concrete the value at 1000 °c was approximately two times higher than at 100 °c. the moisture dependence of the specific heat was not found to be as remarkable as the temperature dependence, but the changes of the specific heat, in the range of approximately 10 % compared to the dry material, were also not negligible.

acknowledgement

this paper is based on work supported by the ministry of education of the czech republic, under contract no. cez:j04/98:210000003.

references

[1] toman, j., černý, r.: high temp.-high press. 25 (1993), p. 643.
[2] toman, j.: influence of external conditions on building materials and structures (in czech). thesis, ctu prague, 1986.
[3] kašpar, i.: moisture transport in building materials (in czech). thesis, ctu prague, 1984.

prof. mgr. jan toman, drsc., department of physics, phone: +420 2 2435 4694, e-mail: toman@fsv.cvut.cz
prof. ing. robert černý, drsc., department of structural mechanics, phone: +420 2 2435 4429, e-mail: cernyr@fsv.cvut.cz
ctu, faculty of civil engineering, thákurova 7, 166 29 prague 6, czech republic

relation between cutting surface quality and alloying element contents when using a co2 laser

j. litecká, j. fabianová, s. pavlenko

abstract

this paper deals with the influence of material content on changes in the quality parameters of the cutting surface when cutting with a laser. the study focuses on experiments to find the effect of material structure and cutting parameters on surface roughness, vickers microhardness and precision of laser cutting. the experimental results are shown in graphs which illustrate the suitability of materials for achieving the required cutting surface quality parameters. these results can be used for optimizing production in practical applications using a laser cutting machine.

keywords: laser cutting, alloyed steel, unalloyed steel, surface quality, cutting parameters.

1 introduction

laser material cutting is a progressive technology. however, due to its high cost and high energy use, it is not as widely used in industry as traditional material cutting technologies. this progressive technology offers a wide area for researchers: it is necessary to focus on optimization and reengineering, using nonconventional ways of machining. there are nowadays wide opportunities for using lasers in technological applications. laser processing is a physical way of processing that is very clean and has a high energy density. it is able to focus energy on a very small surface. it is very fast, and it can reduce material requirements.
2 cutting surface quality evaluation parameters after laser beam cutting

the quality of the cutting area is determined by three basic cutting parameters: cutting speed, gap width and surface quality. the cutting speed should be as high as possible; the gap width should be as small as possible; and the surface quality is determined by ra based on stn, čsn and by rz based on din, en iso 9013. this standard is based on the english text of the european standard en iso 9013:2002 "thermal cutting — classification of thermal cuts — geometrical product specification and quality tolerances".

the cutting area produced by a laser is characterized by analogy with most high-power cutting technologies. a grooved track is created by laser cutting as a result of the cyclic phase of the energy beam interacting with the material by oscillation of hot-melt flow. for certain types of material (mild steel, corrosion-resistant steel, ceramics, composite materials, titanium, plastics) the quality of surfaces cut by a co2 laser with an additive gas can, based on en iso 9013, be characterized by the following values:

a) perpendicularity or angularity tolerance, u;
b) mean profile height, rz.

the following characteristic values may be used in addition:

c) drag, δn;
d) melting of the top edge, r;
e) possible occurrence of dross or melting drops on the lower edge of the cut.

fig. 1: parameters for evaluating the cutting surface during laser cutting: rz – mean profile height, n – drag line, α – angle of beam deviation, r – radius, w – groove width, s – material thickness, δs – thickness reduction, m – measured area

fig. 2: evaluation of the range of cutting surface quality based on perpendicularity or angularity by en iso 9013 [7]

fig. 3: evaluation of the range of cutting surface quality based on the mean height of profile rz by en iso 9013 [7]

in our experiments, we made precise measurements based on the en iso 9013 standard. the measurements focused on finding the perpendicularity, because the angle of the laser beam used in the experiments was α = 0°. the perpendicularity value was determined according to the following equation:

$u = u_{ps} - u_{ns}$, (1)

where u is the perpendicularity value, $u_{ps}$ is the measured value on the upper side of the cutting surface, and $u_{ns}$ is the measured value on the lower side of the cutting surface.

the experimental surface roughness measurements focused on finding all the roughness parameters ra, rq and rz. however, parameter rz was of the greatest importance for determining the surface quality. the measurement was supplemented by a vickers microhardness measurement. this is not defined in en iso 9013 as a surface quality parameter. however, microhardness is a very important property of thermally cut materials because of its effects on subsequent processes, especially mechanical working.

3 description of the experiment

the following materials, with different structures, i.e. with different alloy contents, were chosen for an experiment to determine the quality characteristics of material cutting using a co2 laser: s355j2c+n, qste380tm and s235jr. the samples were cut with dimensions 40 × 40 mm, 6 mm in thickness. the samples were marked by laser engraving, and the laser setting parameters were allocated for each sample. each material was cut with six different parameter sets. each sample was measured ten times, and the value was verified by the grubbs test. the evaluation was made on the basis of the measured results.
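the evaluation chain just described, ten readings per sample, a grubbs outlier check, then the quality parameter of equation (1), can be sketched as follows. the readings are invented placeholders, and the grubbs critical value uses the standard two-sided formulation:

```python
import numpy as np
from scipy import stats

def grubbs_ok(x, alpha=0.05):
    """two-sided grubbs test; returns True if no outlier is detected."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    g = np.max(np.abs(x - x.mean())) / x.std(ddof=1)
    t2 = stats.t.ppf(1.0 - alpha / (2.0 * n), n - 2) ** 2
    g_crit = (n - 1) / np.sqrt(n) * np.sqrt(t2 / (n - 2 + t2))
    return g <= g_crit

# ten hypothetical readings [mm], 1 mm from the top and bottom cut edges
u_ps = [40.02, 40.03, 40.01, 40.04, 40.02, 40.03, 40.02, 40.01, 40.03, 40.02]
u_ns = [39.96, 39.97, 39.95, 39.96, 39.97, 39.96, 39.95, 39.97, 39.96, 39.96]

if grubbs_ok(u_ps) and grubbs_ok(u_ns):
    u = np.mean(u_ps) - np.mean(u_ns)   # perpendicularity value, eq. (1)
    print(f"u = {u:.3f} mm")
```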
the measurements focused on the quality of the cutting surface.

3.1 quality parameter measurement procedure

surface roughness. the touch method was used for measuring the roughness parameters. a sharp stylus which moves on the surface of the measured object was used for finding the surface roughness value. the measurements were made using mitutoyo surftest sj-301 mobile equipment. this is workshop equipment, but its parameters were convenient for the experiment. the test results can be displayed according to several international standards: din, jis, iso, ansi.

fig. 4: mean profile height rz by en iso 9013 [7]

measuring of precision. each sample was cut with the nominal dimension nd = 40 mm. the aim was to find the material and cutting parameters that are able to cut out the most precise dimension. the cutting surface was divided into two parts, the upper side (ups) and the lower side (uns), because the start and run surface of the material is unevenly melted. the measurement was made at ten points along a cut, 1 mm from the top cut edge and 1 mm from the bottom cut edge. the measurement principle is illustrated in figure 5. the measurement was performed using the 3d coordinate rapid thome me 5008 measuring machine with high-precision measurement.

fig. 5: principle of precision measurement

vickers microhardness. this test involves impressing an indentor (a diamond square pyramid with a top angle of 136°) into the test material. the samples were loaded by a force f = 200 gf for a period t = 20 s in the direction normal to the surface of the sample. ten impressions were made on each sample along the bottom cut edge at a distance of 0.20 mm from the edge, nearest to the heat-affected zone, and the average microhardness was calculated. vickers microhardness is marked as hvm, and is defined as the force f divided by the impression surface s. the principle of vickers microhardness measurement is illustrated in figure 6.

fig. 6: principle of vickers microhardness measurement

4 experimental materials

4.1 s355j2c+n (1.0579+n)

material normative: en 10025-2/04. this is an alloyed construction steel with good ductility. it is used for steel structures, welded structures and machined parts with a high tensile yield strength, and for long and plane products that are hot rolled. it is intended for welded, screwed and riveted structures with frequent cold shaping (bending, flanging, beading, profiling), statically and dynamically stressed, for working at normal temperatures and at reduced temperatures down to −20 °c. it is suitable for all welding operations. it is suitable for cold shaping, with marking c. the final condition of the supplies is normalization rolled, with marking +n.

table 1: chemical properties of s355j2c+n material, quality based on en 10 204-3.1, u. s. steel kosice; chemical analysis of melting number 58479 in %

c      mn     si     p      s      al     nb     v      ni     mo     cu     cr     cev
0.20   1.20   0.47   0.012  0.007  0.042  0.04   0.003  0.01   0.002  0.03   0.03   0.414

table 2: mechanical properties of s355j2c+n material, quality based on en 10 204-3.1, u. s. steel kosice

lower yield point re [mpa]   breaking strength rm [mpa]   elongation a5 [%]   kv [j] (at [°c])
414                          554                          28.0                83 (−20)

4.2 qste 380 tm

material normative: sew 092/90. this is a micro-alloyed, fine-grained, high-quality steel with a low content of carbon, suitable for cold shaping. it is used for long and plane products which are hot rolled, e.g. strips, sheet metals, wide steel. the final condition of the supplies, thermomechanically rolled, is marked tm.
the steel quality marking of class qste includes a triple number which expresses the minimum lower yield point value re, measured in the direction normal to the direction of rolling.

table 3: chemical properties of qste380tm material, quality based on en 10 204-3.1, u. s. steel kosice; chemical analysis of melting number 53019 in %

c      mn     si     p      s      al     nb     v      ti
0.08   0.90   0.01   0.009  0.006  0.039  0.04   0.001  0.013

table 4: mechanical properties of qste380tm material, quality based on en 10 204-3.1, u. s. steel kosice

lower yield point re [mpa]   breaking strength rm [mpa]   elongation a5 [%]   kv [j] (at [°c])
482                          540                          29.0                40 (−40)

4.3 s235jr

material normative: en 10025-2/04. this is a non-alloyed high-quality construction steel. it is used for plane and long products, hot rolled in thicknesses up to 250 mm, with a certain peak load at +20 °c. this steel is not suitable for heat treatment, except in the case of products supplied in the condition +n; these products can be hot shaped or normalized after delivery. normalization after removing the internal stress is allowed. this steel is suitable for screwed and riveted constructions, and it is suitable for welding.

table 5: chemical properties of s235jr material, quality based on en 10 204-2.2, u. s. steel kosice; chemical analysis of melting number 53019 in %

c      mn     si   p      s      al   n      v    ti
0.19   1.50   –    0.045  0.045  –    0.014  –    –

table 6: mechanical properties of s235jr material, quality based on en 10 204-3.1, u. s. steel kosice

lower yield point re [mpa]   breaking strength rm [mpa]   elongation a5 [%]   kv [j] (at [°c])
276                          439                          31.5                40 (20)

table 7: group of all cutting parameters used in the experiment

index   cutting speed [m/min]   power [w]   gas pressure [bar]   cutting gap [mm]   focal distance of lens [mm]   nozzle diameter [mm]   gas
1       2.8                     3200        0.8                  0.3                7.5                           1                      o2
2       2.6                     2700        0.6                  0.3                7.5                           1                      o2
3       2.2                     2000        0.6                  0.3                7.5                           1                      o2
4       2.0                     1800        0.6                  0.3                7.5                           1                      o2
5       3.2                     3200        0.8                  0.3                7.5                           1                      o2
6       2.5                     3200        0.8                  0.3                7.5                           1                      o2

5 experimental results

for the experiment, the samples were marked with letters of the alphabet and numerical indices to differentiate the materials and the cutting parameters. the materials were marked as follows:

• s355j2c+n by letter "b"
• qste 380 tm by letter "c"
• s235jr by letter "d"

each material was cut with six different cutting parameter sets (table 7). the cutting parameter sets were marked with the numerical indices "1, 2, 3, 4, 5, 6", indicating changes in cutting speed, power and working gas pressure. the samples with the best quality results were selected for further tests. the surface roughness results are shown in figure 7, the vickers microhardness in figure 8, and the precision of the cut surface relative to the nominal dimension in figure 9.

figure 7 illustrates three basic surface roughness parameters: rz – average value of the five highest points and the five lowest points on the measured length; rq – sum of the average values of the highest and lowest points on the measured length; ra – average of all deflections on the measured length. the qste 380tm material showed the best surface roughness, with the cutting parameters: cutting speed – 2.2 m·min⁻¹, power – 2000 w, gas pressure – 0.6 bar.

fig. 7: evaluation of surface roughness

figure 8 presents the average vickers microhardness values. these values were measured along the heat-affected area to obtain the average microhardness value.
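as a small illustrative aside (this is not the authors' evaluation software, only an illustration of the definition hv = f/s from section 3.1), the hardness number follows from the load and the mean indentation diagonal; the diagonals below are invented placeholders:

```python
import numpy as np

def vickers(force_kgf, diag_mm):
    """vickers hardness from the indentation geometry:
    hv = 2*f*sin(136deg/2)/d**2, i.e. approximately 1.8544*f/d**2,
    with f in kgf and the mean diagonal d in mm."""
    return 1.8544 * force_kgf / diag_mm ** 2

# ten hypothetical mean diagonals [mm], measured 0.20 mm from the bottom cut edge
diagonals = np.array([0.042, 0.041, 0.043, 0.042, 0.040,
                      0.042, 0.041, 0.043, 0.042, 0.041])
hv = vickers(0.2, diagonals)          # f = 200 gf = 0.2 kgf, dwell t = 20 s
print(f"average microhardness: {hv.mean():.0f} hv")
```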
again, the best results were for the qste 380tm material, but with the cutting parameters changed to: cutting speed – 2.5 m·min⁻¹, power – 3200 w, gas pressure – 0.8 bar.

fig. 8: vickers microhardness evaluation

fig. 9: evaluation of dimensional inaccuracy

figure 9 illustrates the dimensional inaccuracy, i.e. the size of the deflection from the nominal dimension. the start-up of the laser beam causes a change in material melting on the upper side (the melting on the upper side is greater than on the lower side). it is therefore very important to measure the divergence between the values on the upper side and on the lower side. the graph in figure 9 shows that the qste 380 tm material again has the best precision values. for the purposes of the experiment, the upper side and the lower side of an experimental sample were melted. this was due to the increased exothermic reaction of the alloying elements. a correction can be made by changing the cutting beam parameters. the cutting parameters for the highest-precision sample are: cutting speed – 2 m·min⁻¹, power – 1800 w, gas pressure – 0.6 bar.

6 conclusion

our experiment focused on finding the influence of materials with different contents of alloying elements on the quality of the cut surface. the results of the test show that for cutting with a co2 laser the best material is qste 380 tm. this material has a low carbide and alloying element content. it had the best results for all measured quality parameters. the results also show that in order to achieve the required cut quality it is necessary to correct the cutting laser parameters, which also influence the cut quality. it is important in the component production process to determine the necessary priorities. these priorities depend on the subsequent processing or on the practical use of the products. it is very important to design an optimization solution for achieving the required results with minimum operating costs. our results can help companies to improve the efficiency and cost-effectiveness of their production.

references

[1] maňková, i.: progresívne technológie. vienala košice, 2000, 275 p. isbn 80-7099-430-4.
[2] vasilko, k., kmec, j.: delenie materiálu. datapress prešov, 2003, 232 p. isbn 80-7099-903-9.
[3] havrila, m.: progresívne technológie. tu v košiciach, fvt so sídlom v prešove. prešov, 2002, 105 p. isbn 80-7099-891-1.
[4] rasa, j., jindrova, r.: nekonvenční technologie. in: mm spektrum 2006, 7, p. 34. issn 1212-2572.
[5] maščenik, j., batešková, e., haľko, j.: change the structure of the material technology of thermal effects of drilling flowdrill. in: sbw 2009: robotization and automation in welding and other techniques: slavonski brod, 11.–13. 11. 2009. mechanical engineering faculty, 2009, p. 187–190. isbn 978-953-6048-51-9.
[6] fečová, v., barna, j., litecká, j.: the using of 3d coordinate measuring machine into the control system of components made in engineering. in: journal of engineering and technology for young scientists, vol. 1, issue 1 (2010), p. 11–16. issn 1338-2349.
[7] standard en iso 9013:2002 thermal cutting — classification of thermal cuts — geometrical product specification and quality tolerances.
[8] hatala, m.: laser – nástroj budúcnosti. in: strojárstvo, 2008, issue 4, [online]: http://www.strojarstvo.sk/docwww/sk/299/299.pdf
ing. juliána litecká, phone: +421 051 772 37 96, e-mail: juliana.litecka@tuke.sk, department of technological devices, faculty of manufacturing technologies with a seat in prešov, technical university in košice, štúrova 31, 080 01 prešov, slovakia

doc. ing. jarmila fabianová, csc., phone: +421 051 772 37 96, e-mail: jarmila.fabianova@tuke.sk, department of manufacturing technologies, faculty of manufacturing technologies with a seat in prešov, technical university in košice, štúrova 31, 080 01 prešov, slovakia

prof. ing. slavko pavlenko, csc., phone: +421 051 772 37 96, e-mail: slavko.pavlenko@tuke.sk, department of technological devices, faculty of manufacturing technologies with a seat in prešov, technical university in košice, štúrova 31, 080 01 prešov, slovakia

on global and nonlinear symmetries in quantum mechanics

h.-d. doebner

(a lecture on the occasion of the 70th birthday of jiri niederle)

1 prolog

i first met jiri niederle around 40 years ago in the international centre for theoretical physics at trieste. during this time the centre was located in a modern building at piazza oberdan; the director was abdus salam and his deputy was paolo budini. jiri and i were fellows in a group under the indirect guidance of asim barut and chris fronsdal. i remember some of our colleagues in this group: arno bohm, richard raczka, moshe flato. it was the period in which abdus asked the scientists in the centre "to push forward the frontiers of knowledge", especially in particle physics through models based on group theory; ũ(12) was fashionable in those times. some of us expected a description of the geometrical structure behind fundamental quantum physics through groups, their lie algebras and their representations to be too narrow and not flexible enough to model real physical classical and quantum systems: if the group and its linear representation are chosen, there is not enough freedom to accommodate their characteristic properties. hence, we were interested in applying a framework which goes beyond groups. we tried e.g. nonlinear and nonintegrable lie-algebra representations, nonseparable hilbert spaces and lie algebras over unusual number fields; we studied differential geometric methods which reflect local as well as global geometrical, algebraic, analytic and symmetry properties and which are more general than group theory.

jiri niederle and jouko mickelsson were interested in the construction of nonlinear representations, which opens a plethora of unknown possibilities to model lie symmetries of quantum systems. these representations contain nonlinear generators acting in a linear representation space. one has to specify the type of "nonlinearity" and its physical interpretation. jiri tolar and i intended to model quantum systems localized and moving on a smooth manifold m with (classical) position and momentum observables which span their kinematics k(m), through a quantization q(m, k) of k(m) – borel kinematics – and, after the introduction of a time dependence t, their dynamical properties – borel quantization – e.g. as a quantum analogue of the classical case.

it is reasonable to connect a paper on the occasion of the 70th birthday of jiri niederle with a retrospective view on both of our attempts. we thought in trieste that these ideas – the utilization of nonlinear representations and the application of global methods – were not directly related to each other. however, it turns out now that this is not the case.
a discussion of these relations is the topic of my paper: the global structure of the set {q(m, k)} of quantizations of the kinematics in a hilbert space h is connected with a nonlinear representation of a matrix group g, which leads also to a quantization of other observables through nonlinear operators. one obtains e.g. for m = r³ a nonlinear representation of the central extension of the inhomogeneous galilei group g(3) and furthermore nonlinear schrödinger equations with given potentials, which were also derived in another context in [1, 2].

2 preview

we consider scalar nonrelativistic systems localized and moving on a smooth manifold m, without internal degrees of freedom and external fields (2-forms on m), and their quantizations (quantum maps) on a suitable hilbert space h. the set of unitarily inequivalent quantum maps q of systems on m with kinematics k(m), {q^{α,d}(m, k)}, is labelled through two quantum numbers α and d which depend on topological and global properties of m and k(m) [3, 4]; for reviews see [5, 6]. we explain this in section 3. in the case m = r³ (with trivial α and a real number d), presented in section 4, the set {q^d(r³, k)} is shown to carry a physically motivated nonlinear representation n_g of a 2×2 matrix group g which acts on a domain d ∈ h:

$n \in n_g: \psi \longrightarrow n[\psi]$. (1)

this implies that unitarily inequivalent quantizations q^d(o), o = f, x, for different d are related through tangent maps with n (see also [4]):

$\frac{d}{d\epsilon}\,\big(n[\psi + i \epsilon\, q^d(o)\psi]\big)\big|_{\epsilon=0} = i\, q^{d'}(o) \cdot n[\psi]$. (2)

this construction leads also to nonlinear operators from elements of order ≥ 2 in esa polynomials in q⁰ and p⁰.

3 borel kinematic k(m) and its quantization

3.1 k(m) as global kinematical algebra

our systems are localized and moving on a smooth manifold m. to model their localizations we consider a set of regions on m such that this set contains information on the probabilities to observe the system in a given region. a sufficiently large canonical set of regions is the σ-algebra b(m) of borel sets in m. we choose b(m) as the set of position observables. smooth motions of systems localized in b ∈ b(m) can be described canonically through flows φ on m,

$b \mapsto b' = \phi^x_s(b)$,

parameterized with s and characterized through infinitesimal generators x which are contained in the set vect₀(m) of smooth complete vector fields on m. these cover a large class of motions; their generators x can be chosen as momentum observables. collecting the results we have:

• the kinematical situation of the system is described with a flow model in m. position observables are modelled with the σ-algebra b(m) and momentum observables with vector fields vect₀(m). the collection of these observables,

$k(m) = (b(m), \mathrm{vect}_0(m))$, (3)

contains through the flow model all possible positions and momenta as global properties of the moving borel field b(m). this justifies the name 'borel' kinematic.

this notion can be generalized [4, 7, 8] to systems with k internal degrees of freedom and external fields (2-forms on m). the kinematics k(m) is a time-independent quantity. if a time dependence t is introduced to parametrize the motion of the system, e.g. through t-dependent states, the quantization q of k(m) leads to evolution equations which select, with e.g. initial values, the t dependence of the matrix elements of the quantized kinematical observables (see e.g. [1, 9, 10]).

3.2 quantizations of k(m)

we use 'quantization' in the following sense: consider a set of classical objects, e.g.
k(m), a separable hilbert space h and the set sa(h) of esa (and hence linear) operators in h. the states of the system are modelled through h. we define 'quantization' as a map q (quantum map) of this classical object onto a cdi domain in sa(h), with certain properties depending on the structure of the system. hence a quantization of k(m) is a quantum map q(m, k) = (q, p),

$q: k(m) \ni (b, x) \mapsto (q(b), p(x)) \in sa(h)$, (4)

on a cdi domain d in h.

3.2.1 quantizations of borel fields b(m) and algebraic properties of k(m)

following the geometrical background of the flow model and the above notion, we interpret the map

$q: b \mapsto q(b) \in sa(h), \quad b \in b(m)$, (5)

through the matrix elements of q(b) in a (pure) state with ψ ∈ d,

$\mu_\psi(b) = (\psi, q(b)\psi)/(\psi, \psi)$, (6)

as the probability to find the system in this state localized in b ∈ b(m). therefore we assume μ_ψ: b ↦ μ_ψ(b) to be a probability measure on b(m), which implies that q(b) is a positive operator valued (pov) measure on b(m). we assume that the position observables b and also their quantizations q(b) are commutative¹ and get a projective pov measure q. if h is realized as l²(m, ν) (ν is a standard measure on m), the q(b) act, up to unitary equivalence, as multiplication operators with the characteristic function of b:

$q(b)\psi = \chi_b\, \psi$. (7)

finally we induce a quantum map q of the set c∞(m) of real smooth functions f on m via the spectral theorem:

$q: c^\infty(m) \ni f \mapsto q(f)(m) = f(m) \in sa(l^2(m, \nu))$. (8)

modelling the position observables with smooth functions f ∈ c∞(m) instead of borel sets b ∈ b(m) is motivated because one can equivalently write k(m) as

$k(m) = (c^\infty(m), \mathrm{vect}_0(m))$, (9)

which implies an algebraic structure of k(m): the abelian lie algebra c∞(m) and the algebra of smooth vector fields vect₀(m) act together on m as the semidirect sum

$k(m) = c^\infty(m) \oplus_s \mathrm{vect}_0(m)$. (10)

hence k(m) can be viewed as an (infinite dimensional) global lie-algebraic symmetry of a system on m. it can also be considered as the lie algebra of an inhomogeneous subgroup of the diffeomorphism group diff(m), which was used in [11].

(¹ certain types of noncommutative positions can also be discussed in this approach.)

this symmetry (10) reflects the physics in the flow model, and it is plausible to assume that the symmetry survives the quantization map q and leads, after quantization in l²(m, ν), to a partial realization of k(m) in (10). we add 'partial' because two esa elements of k(m) on d belong to q(m, k) only if their linear combinations and commutators are also esa on d. collecting the results we have:

• quantum maps q of position observables f ∈ c∞(m) lead to q(f) which act as multiplication operators in l²(m, ν). this allows us to view k(m) as a global infinite dimensional lie-algebraic symmetry of the system, which is assumed to survive the quantum map.

3.2.2 quantizations of vect₀(m)

momentum observables in k(m) are introduced as flow generators x ∈ vect₀(m) of φ^x. following our above arguments, their quantum map p should again be based on the geometrical roots and must respect consistency with q(f) in (10).

i. the observables x act in our flow model as differential operators on position observables in c∞(m). hence it is plausible to assume that p(x) ∈ sa(l²(m, ν)) is also a finite order differential operator on l²(m, ν). however, a realization of this assumption is difficult.
this is because a definition of differential operators in l²(m, ν) needs a notion of the differentiation of complex functions ψ(m) on a smooth manifold m which are square integrable with respect to the measure ν; this measure-theoretic property contains no information on their differentiability. hence additional assumptions are necessary to realize momentum operators p(x) as pdo's. to formulate such assumptions we note that differentiation on a smooth manifold m is given through its definition, and differentiation on the complex plane c is a standard notion. this implies, technically, the existence of differential structures 𝒟(m) on m and 𝒟(c) on c. for the differentiation of complex functions on m in l²(m, ν) we need, technically, a differential structure 𝒟(m × c) on the point set m × c with the restrictions

$\mathcal{D}(m \times c)|_m = \mathcal{D}(m), \quad \mathcal{D}(m \times c)|_c = \mathcal{D}(c)$. (11)

without going into mathematical details, including the definition of 𝒟 and interesting applications, we need for p some results on differential structures with (11). a possibility is to look at complex line bundles η over m with hermitean metric ⟨ , ⟩. this is because there exists with η an isometrically isomorphic complex line bundle η₀ with hermitean metric ⟨ , ⟩₀ and a differentiable structure 𝒟(m × c) with (11). hence we are interested in complex line bundles with metric ⟨ , ⟩ and hermitean connection ∇, and in their classification up to equivalence. this is well known; we refer e.g. to [12, 13]. because we are dealing with a system without external fields, the connection is flat. the inequivalent line bundles with flat connection are labelled by the character group π₁*(m) of the fundamental group of m.

we have to relate the sections σ of a line bundle η to the elements ψ of l²(m, ν). the sections σ in η form a complex vector space and, together with ν and ⟨ , ⟩, the 'square integrable' ones form a hilbert space l²(η, ⟨ , ⟩, ν). a dense domain of smooth sections in η can be embedded in a dense domain d in l²(m, ν). hence we use this domain d to define differential operators. in general there is a large number of nonequivalent differentiable structures on m × c which lead to different dense domains in l²(m, ν). hence one can view the definition of differential operators as a domain problem.

ii. in part i we explained how to introduce a differential structure. now we refer again to the geometrical roots to argue that our differential operators are of finite order. a classical generator x with support supp(x) moves, with φ^x, the characteristic function χ(b), resp. the support of f. after quantization the situation should be analogous. this motivates a locality condition for p(x):

$\mathrm{supp}(p(x)\psi) \subseteq \mathrm{supp}(\psi), \quad \psi \in l^2(m, \nu)$. (12)

with peetre's theorem [14], this locality condition and a differentiable structure 𝒟(m × c) yield for p(x) differential operators of finite order. collecting the results we have:

• the quantum map p of momentum observables x leads to differential operators of finite order on a domain in l²(m, ν) which is embedded in a domain in l²(η, ⟨ , ⟩, ν) spanned by square integrable sections of a complex line bundle η on m with a hermitean metric and a flat hermitean connection.

3.2.3 quantizations of the borel kinematic k(m)

with the properties for q(f) and p(x) at the end of sections 3.2.1 and 3.2.2, we construct the quantum map q(m, k). the result reflects the classification of flat complex line bundles with a hermitean metric and a hermitean connection. a further classifying real number d is related to the global lie-algebraic symmetry k(m) (10) and the consistency of the partial realization in q(m, k).

classification theorem for quantum borel kinematics. inequivalent irreducible quantum maps q(m, k) from k(m) to a cdi domain d_q in sa(l²(m, ν)),
a further classifying real number d is related to the global lie-algebraic symmetry k(m) (10) and the consistency of the partial realization in q(m, k). classification theorem for quantum borel kinematics inequivalent irreducible quantum maps q(m, k) from k(m) to a cdi domain dq in sa(l2(m, ν)) 78 acta polytechnica vol. 50 no. 3/2010 qα,d(m, k)= (qα,d,pα,d (13) are labelled by two numbers α,d α ∈ π∗1(m), d ∈ r. (14) the domain dq ⊂ l2(m, ν) is obtained through an embedding in thehilbert space l2(η, <, >, ν) spanned through square integrable sections of the complex line bundle η on m with hermitean metric and a flat hermitean connection. the α, d are quantum numbers in the sense of wigner. quantizations pα,d for different α and/or d are unitary inequivalent. quantizations qα,d which act on d as qα,d(f(m)))= f(m), f(m) ∈ c∞(m, r) are independent of α and d. for pα,d on d one obtains a first order pdo through lie-derivatives and as the zero order part a smooth section of the endomorphism bundle which depends on d and α (see the example in section 3.2.4 and [3]). there are many applications. we refer to the list in [4] and e.g. to quantizations on a trefoil manifold [15] and on two and higher dimensional configurationmanifolds for n identical particles [16] including anyons. physical effects of quantummaps qα,d with d =0 through quantum numbers α are experimentally observed. however, for qα,d with d �= 0 our derivation contains no information on the interpretation and hints for its physical relevance; even in the case for trivial α the physical meaning of d is unknown. furthermore qα,d with d �=0 implies as a quantummap of the kinematics no information on the time dependence of the system, i.e. of its dynamics and furthermore no rule for the quantization (up to orderings) of higher order (≥ 2) polynomials of momentum and position observable. a possible answer to these questions was proposed by jerry goldin and hdd [1]. they introduced for m = r3 a time dependence of ψ through particle conservation and obtainedwith qα,d(r3, k) after ‘gauge generalisation’ nonlinear schrödinger equations with nonlinear termsproportional to d. theprocedure can be generalizedto systemson m. these results indicate a hidden nonlinear structure of {qα,d(m, k3} which will be explored in section 4. 3.2.4 quantizations on m = r3 as an example for the classification theorem we consider m = r3 with trivial π∗1(r 3) and x = −→g (x) −→ ∇ , f = f(x). we find for qd(r3, k) in d ∈ l2(r3,dx3) qd(f) = f(x), pd(x) = 1 i −→g (x) −→ ∇ + ( 1 2i + d ) div−→g (x). (15) qd=0 ≡ q0 is the canonical quantization. the maps qd and qd ′ are unitarily inequivalent for d �= d′. the expectation values can be scaled according to their physical interpretation with automorphisms of v ect0(m) � x �−→ ax and c∞(m, r) � f �−→ bf. 4 a nonlinear symmetry of{ qd } 4.1 nonlinear transformations and operators we use the above set { qd(r3, k) } to look for a ‘hidden non linear structure’. because nonlinearities are often a result of nonlinear transformations it is plausible to try physically motivated nonlinear transformations n of ψ ∈ l2(r3, d3x) n:ψ −→ n[ψ] = n(ψ)ψ. (16) we assume that n act asmultiplication operators and depend only on ψ, i.e. not on derivatives of ψ and not explicitly on x, t (domain and range questions are not discussed in this paper). these transformation n[ψ] could imply singularities e.g. in evolution equations, which are not important here. 
in the case of a multivalued n[ψ] one has to show that the relevant results are unique and independent of the choice of the representatives of the ray {ψ_τ = ψ exp(iτ), τ real} which describe physically equivalent states.

to construct from esa operators a, through n, operators a_n, which may be linear or nonlinear, we again use the flow model. the flow with generator x corresponds after quantization to a strongly continuous one-parameter unitary group u_ε with an esa generator. this generator appears through a tangent map d/dε (u_ε ψ)|_{ε=0} on a path {u_ε ψ, ε ∈ [−1, +1]} in l²(r³, d³x). with this in mind we construct a_n from a given esa a with v_ε = exp(iεa) via a tangent map n[v_ε ψ] with n on a corresponding nonlinear path,

$\frac{d}{d\epsilon}\big(n[v_\epsilon\psi]\big)\big|_{\epsilon=0} = \frac{d}{d\epsilon}\big(n[\psi + i\epsilon a\psi]\big)\big|_{\epsilon=0} \equiv i\, a_n \cdot n[\psi]$, (17)

on a domain on which the limit exists. for differential operators a_n this domain can be extended in l²(r³, d³x). for some non-esa linear a this construction is also possible.

4.2 choice of nonlinear transformations

our aim is to determine a set n of nonlinear transformations such that q^d, p^d in (15),

$q^d = f, \quad p^d = \frac{1}{i}\,\vec{g}\cdot\vec{\nabla} + \left(\frac{1}{2i} + d\right)\mathrm{div}\,\vec{g}$,

are connected for different d through a tangent map with n ∈ n:

$q^{d'} = (q^d)_n, \quad p^{d'} = (p^d)_n$. (18)

an evaluation of these conditions is indeed possible; the calculation is straightforward. with a (non-unique) polar decomposition ψ = r exp(is) we find for the nonlinear transformations n

$n(\gamma, \lambda)\psi = \exp i\big(\gamma \ln r + (\lambda - 1)s\big) \cdot \psi$. (19)

they build a nonlinear representation n_g = {n(γ, λ)} of a two-parameter (γ, λ) group g. the group n_g was derived in a different context and with other assumptions in [17]. for the corresponding tangent maps we find

$q^d = (q^0)_{n(\gamma,\lambda)}$ with γ = 2d, λ arbitrary; $\quad q^{d'} = (q^d)_{n(\gamma,\lambda)}$ with γ = 2(d′ − d), λ = 1. (20)

hence our search for a hidden symmetry was successful. the result implies:

• quantizations q^d and q^{d′} are inequivalent with respect to the group of unitary transformations but 'equivalent' with respect to a nonlinear realization n_g of a matrix group g, with one, and for d = 0 and d′ ≠ 0 with two, real parameters. the quantum number d leads to a nonlinear representation of a g symmetry of {q^d(k(r³))}.

4.3 nonlinear quantizations and symmetries

we now extend our construction from linear first and zero order differential operators in q(r³, k) to esa higher order differential operators in polynomials of canonically quantized observables, d ∈ p_esa(q⁰(f), p⁰(x)). an application of tangent maps leads to

$n(\gamma, \lambda): d \mapsto d_{n(\gamma,\lambda)}$, (21)

i.e. to a two-parameter set of nonlinear differential operators d_{n(γ,λ)}. this nonlinear 'extension' of canonically quantized polynomials d of classical observables through tangent maps with n_g is an attempt at a formal path to nonlinear 'extensions' of quantum mechanics. however, an interpretation of nonlinear operators and nonlinear evolutions as in (linear) quantum theory is not possible. results from a nonlinear theory can be interpreted in some approximation as in the linear case, e.g. the eigenvalues of a nonlinear schrödinger equation (22) (see [18]). a complete mathematical framework and convincing physical interpretations for a nonlinear 'extension' are not known. with (21) one obtains the already mentioned nonlinear realization of g(3), with linear generators of the galilei algebra with the exception of the free hamiltonian, which appears as a nonlinear operator. furthermore, one gets with n ∈ n_g from a given linear schrödinger equation a nonlinear one with a nonlinear part f[ψ] which depends on γ, λ.
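the group property of (19) can be made concrete numerically. in the small sketch below, the composition law used for the comparison, (γ, λ)∘(γ′, λ′) = (γ + λγ′, λλ′), is our reading of the 2×2 matrix group, and an integer λ is chosen in the outer map so that the 2π ambiguity of the polar angle drops out (cf. the multivaluedness remark above):

```python
import numpy as np

def n_map(psi, gamma, lam):
    """nonlinear map (19): psi = r*exp(i s) -> exp(i(gamma*ln r + (lam-1)*s))*psi."""
    r, s = np.abs(psi), np.angle(psi)
    return np.exp(1j * (gamma * np.log(r) + (lam - 1.0) * s)) * psi

x = np.linspace(-5.0, 5.0, 401)
psi = np.exp(-x ** 2 + 0.7j * x)      # a nowhere-vanishing sample wavefunction

g1, l1, g2, l2 = 0.3, 2.0, -0.5, 1.5  # l1 integer: the phase branch drops out
lhs = n_map(n_map(psi, g2, l2), g1, l1)
rhs = n_map(psi, g1 + l1 * g2, l1 * l2)   # assumed composition law
print(np.max(np.abs(lhs - rhs)))      # ~1e-12: the maps compose as a group
print(np.max(np.abs(np.abs(lhs) - np.abs(psi))))  # ~0: position density unchanged
```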
the nonlinear schrödinger equation just mentioned was generalized in [19] to a non-linearisable schrödinger equation (known as the doebner-goldin equation) with real coefficients d, λ₁, …, λ₅:

$i\hbar\,\partial_t \psi = \left(-\frac{\hbar^2}{2m}\Delta + f[\psi]\right)\psi, \quad f[\psi] = \frac{i\hbar d}{2}\,\frac{\Delta\rho}{\rho} + \hbar d\,\{\lambda_1 r_1 + \ldots + \lambda_5 r_5\}$, (22)

with

$\rho = \bar\psi\psi, \quad j = \frac{\hbar}{2mi}\{\bar\psi\nabla\psi - (\nabla\bar\psi)\psi\}$,

$r_1[\psi] = \frac{\nabla \cdot j}{\rho}, \quad r_2[\psi] = \frac{\Delta\rho}{\rho}, \quad r_3[\psi] = \frac{j^2}{\rho^2}, \quad r_4[\psi] = \frac{j \cdot \nabla\rho}{\rho^2}, \quad r_5[\psi] = \frac{\nabla\rho \cdot \nabla\rho}{\rho^2}$.

5 conclusion

we announced in the preview that we would relate global and nonlinear structures of quantum maps q(m, k) for the borel kinematics k(m) of nonrelativistic systems on m, without internal degrees of freedom and external fields, with applications for q(r³, k).

• inequivalent quantum maps {q^{α,d}(m, k)} are labelled through quantum numbers α, d in the sense of wigner, which reflect topological and global properties of m and k(m). experiments to measure effects in q^{α,d} are known for d = 0.

• for m = r³ and d ≠ 0 these global structures imply that {q^d(r³, k)} carries a nonlinear representation n_g of a 2×2 matrix group g.

• for esa differential operators d of order ≥ 2 in the polynomial set p_esa(q⁰(f), p⁰(x)) one obtains nonlinear differential operators d_n through tangent maps with n ∈ n_g. nonlinear versions of symmetry and dynamical symmetry algebras are available.

our construction may be viewed as part of a path to a nonlinear extension of quantum mechanics. this may be relevant in the case that precision experiments show a nonlinear character and corresponding nonlinear evolutions based on a global character of m and k(m).

references

[1] doebner, h.-d., goldin, g. a.: phys. lett. a 162, 397 (1992).
[2] doebner, h.-d., goldin, g. a.: phys. rev. a 54, 3764 (1996).
[3] angermann, b., doebner, h.-d., tolar, j.: lecture notes in math., vol. 1037 (1983).
[4] doebner, h.-d., tolar, j.: in symmetry in science xiv, kluwer, dordrecht (2004).
[5] doebner, h.-d., stovicek, p., tolar, j.: rev. math. phys. 13, 799 (2001).
[6] ali, s., english, m.: rev. math. phys. 17, 205 (2005).
[7] drees, m.: zur kinematik lokalisierbarer quantenmechanischer systeme unter berücksichtigung innerer freiheitsgrade und äusserer felder, phd thesis, tu clausthal (1992).
[8] nattermann, p.: dynamics in borel quantisation: nonlinear schrödinger equations vs. master equations, phd thesis, tu clausthal (1997).
[9] hennig, j. d.: in nonlinear, deformed and irreversible quantum systems, world scientific (1995).
[10] doebner, h.-d., hennig, j. d.: in symmetry in science ix, plenum press (1996).
[11] goldin, g. a., menikoff, r., sharp, d. h.: j. math. phys. 21, 650 (1980).
[12] kostant, b.: in quantisations and unitary representations, lecture notes in math., vol. 170 (1970).
[13] weil, a.: variétés kählériennes, hermann, paris (1958).
[14] kahn, w.: introduction to global analysis, academic press, london (1980).
[15] doebner, h.-d., groth, w.: j. phys. a 30, l503 (1997).
[16] doebner, h.-d., groth, w., hennig, j. d.: j. geom. phys. 31, 35 (1998).
[17] doebner, h.-d., goldin, g. a., nattermann, p.: j. math. phys. 40, 49 (1996).
[18] doebner, h.-d., manko, v. i., scherer, w.: phys. lett. a 268, 17 (2000).
[19] doebner, h.-d., goldin, g. a.: j. phys. a: math. gen. 27, 1771 (1994).
h.-d. doebner, e-mail: asi@pt.tu-clausthal.de, doebner@t-online.de, technical university of clausthal, institute for energy research and physical technology

calculation of a tunnel cross section subjected to fire with a new advanced transient concrete model for reinforced structures

u. schneider, m. schneider, j.-m. franssen

the paper presents the structural application of a new thermal induced strain model for concrete – the tis-model. an advanced transient concrete model (atcm) is applied with the material model of the tis-model. the non-linear model comprises thermal strain, elastic strain, plastic strain and transient temperature strains, as well as load history modelling of restraint concrete structures subjected to fire. the calculations by finite element analysis (fea) were done using the safir structural code. the fea software was substantially new with respect to the material modelling derived for the new tis-model (a transient model considering thermally induced strain). the equations of the atcm cover a wide range of capabilities, especially for considering irreversible effects of temperature on some material properties. by considering the load history during heating up, an increasing load bearing capacity may be obtained due to the higher stiffness of the concrete. with this model, it is possible to apply the thermal-physical behaviour of material laws for the calculation of structures under extreme temperature conditions. a tunnel cross section designed and built by the cut and cover method is calculated with a tunnel fire curve. the results are compared with the results of a calculation with the model of eurocode 2 (ec2-model). the effect of the load history in highly loaded structures under fire load is investigated. a comparison of this model with the ordinary calculation system of eurocode 2 (ec2) shows that a better evaluation of the safety level is achieved with the new model. this opens a space for optimizing concrete structure design under transient temperature conditions up to 1000 °c.

keywords: material model, transient thermal strain, thermal creep, tunnel, concrete, fire.

1 introduction

calculations to predict the deformation rate and load bearing capacity of concrete structures at high temperatures are often based on material models according to the model of eurocode 2 (ec2-model). in europe, most calculations of structures are based on this model. the model is very usable and provides a high level of safety for members under bending and standard fire test conditions. it has not been tested for natural fire conditions, which include decreasing temperature phases. the load bearing capacity of concrete structures can be optimized with models representing transient material behaviour. models which are approximated by transient data are more realistic. the following investigation describes the potential of using a new transient concrete model. this model considers thermally induced strain with external load or internal restraint load during heating up. for this model, a realisation of all components of concrete strain is needed. the concrete behaviour is influenced by the transient temperature and load history. a material model for the calculation of siliceous concrete is given in [1]. this new model is based on the thermal-induced-strain-model (tis-model) and is called the advanced transient concrete model (atcm). transient conditions during the whole calculation routine are taken into account: the transient load and the real temperature development are considered. generally, an atcm can be used for all types of concrete; only some parameters have to be changed. this examination is based on ordinary concrete with siliceous aggregates.

the general calculation method is divided into thermal and mechanical analyses, which are normally nonlinear. using this model, finite element analysis (fea) is applied to the calculation [2]. in order to determine the time / temperature curves within the concrete, the thermal equation is solved with the inclusion of heat transfer through thermal analysis [3]. mass transports can also be included, because during fire exposure many phase transitions of the cement stone matrix and aggregate appear [4, 5]. these thermally conditioned physico-chemical variables can influence the mechanical model [6, 7, 8]. the mechanical analysis is based on these results. there are numerous models available for determining the behavior of ordinary concrete at high temperatures [9, 10, 11]. in this regard, there is also a high dependency on the type of concrete, as studies for ultra-high performance concrete (uhpc) have shown [12, 13].

in a first step, the behaviour of small cylinders of siliceous concrete is calculated. the results are obtained using an atcm, which determines all local stresses and mechanical strains considering the whole cross section. these results are based on measured results according to [14]. in addition, a calculation of restraint stresses is given. the fea considers different material behaviour, which allows all results obtained with the new model to be compared with the results of calculations obtained with the ec2-model, which is widely used in europe. the two concrete models, the ec2-model and the atcm based on material properties according to the tis-model (see equation (1)), show a very different behaviour for deformation and restraint stresses during calculation. the influence of the load during heating is essential.
the calculations with simple structures show a good approximation between calculation results and measured data [15]. the good adaptation of the new atcm to measured data gives hope for a good adaptation in the calculation of complex structures. a cut and cover rectangular-shape reinforced concrete tunnel is calculated with the new model in the following sections.

2 generals and calculation results with concrete models

2.1 general tis-model

it is generally agreed that the total strain $\epsilon_{tot}$ comprises the following parts:

$\epsilon_{tot} = \epsilon_{el} + \epsilon_{pl} + \epsilon_{tr} + \epsilon_{th}$, (1)

where $\epsilon_{tot}$ is the total strain, $\epsilon_{el}$ the elastic strain, $\epsilon_{pl}$ the plastic strain, $\epsilon_{tr}$ the total transient creep strain, and $\epsilon_{th}$ the thermal dilatation. it is therefore convenient to write for the pure mechanical strain:

$\epsilon_m = \epsilon_{el} + \epsilon_{pl} + \epsilon_{tr} = \epsilon_{tot} - \epsilon_{th}$. (2)

during an isothermal creep test the following types of deformation occur, see fig. 1. according to [17], in this case the term is called "load induced thermal strain". it consists of transient creep (transitional thermal creep and drying creep), basic creep and elastic strains. the shrinkage during the first heating is accounted for by the observed thermal strain (load 0 %).

fig. 1: deformations of concrete at ambient temperatures subjected to a constant compressive load, according to [16]

fig. 2 shows the general evolution of the total strain for specimens under different constant loads during heating up, based on the tis-model. the high influence of the load during transient heating is to be seen.

fig. 2: total strain at high temperatures as a function of load history (curves for loads of 0 %, 10 % and 30 %; axes: temperature [°c] vs. total strain [‰])
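equations (1) and (2) translate directly into a bookkeeping of strain components. the following sketch uses purely illustrative shapes for the individual parts, not the calibrated tis-model functions:

```python
import numpy as np

temp = np.linspace(20.0, 800.0, 40)   # temperature [degc]
load = 0.3                            # stress/strength ratio during heating (assumed)

eps_th = 1.0e-5 * (temp - 20.0)                 # thermal dilatation (assumed linear)
eps_el = -1.0e-4 * load * np.ones_like(temp)    # elastic strain (assumed constant)
eps_tr = -4.0e-7 * load * (temp - 20.0)         # transient creep, grows with t (assumed)
eps_pl = np.zeros_like(temp)                    # no plastic strain in this example

eps_tot = eps_el + eps_pl + eps_tr + eps_th     # eq. (1)
eps_m = eps_tot - eps_th                        # pure mechanical strain, eq. (2)
print(eps_tot[-1], eps_m[-1])
```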
the elastic strain is very small at the temperature t = 20 °c compared to the high deformation at high temperatures. it is concluded that the irreversible character of the main material properties must be incorporated in a calculation model to ensure a realistic consideration of the behavior of concrete.

2.2 calculation of total strains with the atcm and ec2 method

2.2.1 model parameters for the calculation of total strains with the ec2 and atcm method

the specimens are cylinders 80 mm in diameter and 300 mm in height. the heating rate is 2 k/min. the compressive strength at 20 °c is 38 mpa. the moisture content is w = 2 %. the results are obtained from heated specimens under different stress-time relationships [14]. in the advanced transient concrete model (atcm), the tis-model is used. the fea uses a model taken from eurocode 2 with a stress-strain constitutive model with minimum, recommended and maximum values of the peak stress strain. the minimum value of the peak stress strain (pss) is nevertheless not considered further, because its results are on a very negative side compared to the other models. fig. 3 shows the different peak stress strain values. the concrete behaviour shows a different young's modulus during heating: the higher the pss, the smaller the young's modulus. the practical relationship according to the measured data is not given in eurocode 2. the stress-strain relationship in eurocode 2 is also used for a normative temperature condition, according to iso 834 (iso fire curve).

fig. 3: stress-strain relationship subjected to fire, according to ec2 [18]

2.2.2 results of measurements and calculations of total strains with the atcm and ec2 model

the calculation is done with different load functions during heating. the atcm is also well approximated to the mechanical strain based on measured data according to [14]. fig. 4 shows the load function as a stress-time relationship with a constantly increasing load. a comparison between the atcm and the ec2 calculation is shown in fig. 5. the atcm with the tis-model approximates the measurement-based data very well. the result of the calculation with the ec2 model with the maximum pss value is generally as good as the value approximated by the atcm.

fig. 4: stress-time relationship with constantly increasing load

fig. 5: comparison of measured and calculated total strains under an applied load function according to fig. 4
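for reference, the ascending branch of the ec2 stress-strain relation behind fig. 3 can be written down compactly; this is our reading of the en 1992-1-2 relation, and the peak-stress strains below are placeholders for the tabulated minimum/recommended/maximum values:

```python
import numpy as np

def sigma_ec2(eps, f_c, eps_c1):
    """ascending branch of the en 1992-1-2 stress-strain relation:
    sigma = 3*eps*f_c / (eps_c1 * (2 + (eps/eps_c1)**3)), valid for eps <= eps_c1."""
    return 3.0 * eps * f_c / (eps_c1 * (2.0 + (eps / eps_c1) ** 3))

f_c = 38.0                                   # compressive strength at 20 degc [mpa]
eps = np.linspace(1.0e-6, 0.025, 500)
for label, eps_c1 in (("recommended pss", 0.0025), ("maximum pss", 0.025)):
    sig = sigma_ec2(eps[eps <= eps_c1], f_c, eps_c1)
    print(label, sig.max())                  # peak stress approaches f_c at eps = eps_c1
```

the sketch makes the pss effect visible: the larger the peak-stress strain, the flatter the initial slope, i.e. the smaller the apparent young's modulus, exactly the trend noted above.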
the calculation with the recommended value of pss is totally different above 3.5 h.

fig. 6 shows the evolution of stress as a function of time that has been considered, with a linear increase until 15000 seconds and a linear decrease thereafter. fig. 7 shows the results of the comparison. the fea with the atcm and the ec2 model with the maximum value of pss approximated the measurements very well. the calculation with the ec2 model with the recommended value of pss generally shows more deformation than the other calculations.

fig. 6: stress-time relationship with continuously increasing load and continuous decrease above 15000 seconds

fig. 7: comparison of measured and calculated total strains under an applied load function according to fig. 6

the load function as a stress-time relationship with stepwise application of the load and stepwise unloading is given in fig. 8. a comparison between the different models is shown in fig. 9. the approximation between the two compared calculation methods with the atcm is comparatively good. however, a much higher difference between the total strains calculated with the atcm and the ec2 model with the maximum pss value is observed. the result of the calculation with the ec2 model with the recommended value of pss is significantly different from the calculations with the atcm and from the test results.

fig. 10 shows the load function as a stress-time relationship with 3 increasing load steps and 3 decreasing load steps. fig. 11 shows a comparison between the different calculation models. the differences between the calculations with the ec2 model and the atcm generally increase. the calculations with the atcm are a good approximation of the test results. the ec2 models, whatever value of pss is chosen, do not allow deformations to be calculated under a load function with a complex stress-time relationship. for such calculations, the atcm must be used.

fig. 8: stress-time relationship with a sudden increase of the load and a sudden decrease till the origin

fig. 9: comparison of measured and calculated total strains under an applied load function according to fig. 8

fig. 10: stress-time relationship with 3 sudden increases in load and 3 sudden decreases till the origin

fig. 11: comparison of measured and calculated total strains under an applied load function according to fig. 10
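the load functions of figs. 4, 6, 8 and 10 are simple piecewise-linear or stepwise stress-time histories; representing them as breakpoint tables is enough for a transient calculation. the breakpoints below are read qualitatively from the figures and are not exact test values:

```python
import numpy as np

# stress-time histories [s] -> [mpa] as breakpoint tables (qualitative values)
histories = {
    "ramp (fig. 4)":         ([0.0, 21000.0], [0.0, 16.0]),
    "ramp up/down (fig. 6)": ([0.0, 15000.0, 21000.0], [0.0, 18.0, 0.0]),
}

def stress_at(name, t):
    t_pts, s_pts = histories[name]
    return np.interp(t, t_pts, s_pts)    # piecewise-linear interpolation

print(stress_at("ramp up/down (fig. 6)", 18000.0))   # 9.0 mpa
```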
2.3 calculation of restraint axial force of a specimen under restraint condition

2.3.1 model parameters for the calculation of restraint axial force under restraint condition

the specimens are calculated with the atcm and the ec2 model under restraint conditions and with a heating rate of 2 k/min. the restraint deformation applied at the beginning of the calculation is kept constant during heating up. the specimens are cylinders 80 mm in diameter and 300 mm in height. the cube compressive strength of the siliceous concrete at 20 °c is 20 mpa, and it has a moisture content of w = 2 %.

2.3.2 calculation results of restraint axial forces for a heated specimen which is fully restrained

the following figures compare the results of the calculation with the atcm with measured data taken from [14]. fig. 12 shows the restraint axial forces during heating with a load factor of 0.3. the measured data are based on different storage conditions during curing. the curve of the atcm lies below the data of the concrete specimen dried at 105 °c up to 300 °c, and near the standard cured concrete (w = 2–4 %). above a temperature of 300 °c, the curve of the atcm is close to the curve of the water-stored specimen. the curve of the atcm lies in the confidence interval of all curves.

fig. 12: restraint axial force during heating with a load factor 0.3 compared to measured results (curves: atcm, stored with water, predried at 105 °c, standard storing)

fig. 13 shows the ratio of the restraint axial force divided by the compressive strength. the figure compares the restraint axial forces under different load conditions. the ec2 model is a stress-strain constitutive model that does not consider the load factor, i.e. it does not yield a good simulation result for restraint. at temperatures below 420 °c the different load conditions indicate different restraint axial forces; above 420 °c the curves are nearly identical. the higher the load level, the higher are the restraint axial forces. the lines of the calculation with the ec2 model do not give a good approximation to the results of the atcm. from the experimental result of fig. 12 we come to the conclusion that ec2 model simulations do not give a good approximation of the measured values: the restraint axial forces are significantly lower than the measured data. since the axial stress has a significant effect on the fire resistance of building elements according to [19], a realistic simulation is important for loaded structures.

fig. 13: comparison of restraint forces for different load factors

2.4 calculation of a tunnel cross section

2.4.1 model of the calculation of a tunnel cross section

in general, calculation methods have two separate arithmetic steps: a thermal analysis and a mechanical analysis. for further information, please see the references [17, 20, 21]. the calculation model was divided into the following parts of the structure, see fig. 14:
fig. 12: restraint axial force during heating with a load factor of 0.3 compared to measured results (curves: atcm; stored with water; predried 105 °c; standard storing)

fig. 13: comparison of restraint forces for different load factors (curves: atcm with 0 % load; atcm with 10 % load; atcm with 30 % load; ec2 with recommended value of the peak stress strain, 0 % load)

• ground plate
  beam 01 = symmetric axis of the cross section at node 1
  beam 12 = mid-point between beam 01 and beam 20 at node 20
  beam 20 = corner between ground plate and wall at node 41
• wall
  beam 23 = corner between wall and ground plate at node 41
  beam 36 = point of maximum bending moment at node 75
  beam 49 = corner between ceiling and wall at node 97
• ceiling
  beam 49 = corner between wall and ceiling at node 97
  beam 60 = mid-point between beam 49 and beam 71 at node 120
  beam 71 = symmetric axis of the cross section at node 143

in the following example, a single-bay frame is calculated. it is a model of a tunnel taken from a research project, shown in fig. 14 [22]. the simulation calculates a tunnel cross section with exposure to a hci curve [23]. derived from the hydrocarbon curve, the maximum temperature of the hci curve is 1300 °c instead of the 1100 °c of the standard hc curve. fig. 15 shows the time-temperature relationship. such fires may occur in accidents involving tank trucks [7, 24]. the arithmetic model is based on a section 1 meter in width [25]. general calculations utilize the semi-probabilistic concept of eurocode 1 [17, 26]. the bedding is considered with the help of a spring component under every beam element of the ground plate [27]. the material used here is ordinary siliceous concrete c25/30 and steel bst500. the heating is calculated as transient heating. before the structure is subjected to fire, the basic combination must be used to determine the amount of reinforcement that is to be used for comparison purposes during the fire exposure. it is assumed that no spalling occurs during the fire.

fig. 14: principle sketch of the tunnel; according to [22]

fig. 15: hydrocarbon increased (hcm) fire curve according to [7] (temperature [°c] vs. time [min])

2.4.2 results of the calculation of a tunnel cross section

figs. 16 and 17 show the results of the deformation with the ec2 model with the maximum pss value, and with the atcm. the various displacements demonstrate how the whole structure responds during heating. the stiffness of the system changes as a function of time [28, 29]. most of the nodes show a lower deformation with the atcm. only at node 1 is the deformation along the y-axis slightly larger with the atcm than with the ec2 model. these results show the effect of the higher load utilisation of the new model.
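for reference, the standard hydrocarbon (hc) curve of en 1991-1-2 levels off at 1100 °c; a common way of writing the increased (hcm/hci) variant, peaking near 1300 °c as described above, is to raise the amplitude coefficient from 1080 to 1280. the sketch below uses that assumption; it is not a formula quoted from [7] or [23]:

```python
# sketch of the hydrocarbon (hc) and hydrocarbon-increased (hcm) fire curves.
# the hc formula is the en 1991-1-2 expression; using 1280 instead of 1080 for
# the hcm variant (peak near 1300 degc) is an assumption matching the text, not
# a formula quoted from the paper.
import math

def hc_curve(t_min, rise=1080.0):
    """gas temperature [degc] after t_min minutes of fire exposure."""
    return 20.0 + rise * (1.0 - 0.325 * math.exp(-0.167 * t_min)
                               - 0.675 * math.exp(-2.5 * t_min))

for t in (5, 30, 60, 120, 180):
    print(t, round(hc_curve(t)), round(hc_curve(t, rise=1280.0)))  # hc vs. hcm
```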
without considering the load history, the influence of the load under temperature exposure is not sufficiently reflected in the calculation of the deformation of the structure. the next figures show the mechanical properties of the structure with respect to the axial forces and the bending moments. figures 18 to 23 show a comparison between the mechanical results. the axial forces of the ground plate, the wall and the ceiling are generally higher according to simulations with the ec2 model compared to simulations with the atcm. due to the lower deformation in the atcm, lower axial forces occur. an insignificant difference between the two models is seen in the calculation of the bending moment. positive bending moments are lower with the atcm than with the ec2 model; negative bending moments are higher with the atcm than with the ec2 model.

fig. 16: displacement in the x-axis in various nodes (ec2 model with maximum value of peak stress strain vs. atcm; nodes 1, 41, 97, 143)

fig. 17: displacement in the y-axis in various nodes (ec2 model with maximum value of peak stress strain vs. atcm; nodes 1, 41, 97, 143)

fig. 18: axial forces in various beams in the ground plate (ec2 model with maximum value of pss vs. atcm; beams 1, 12, 20)

3 discussion of the results

to calculate the load bearing capacity and the behaviour of structures subjected to fire, new material equations for the most important material properties of ordinary concrete have been developed [1, 15]. this model was developed to supplement the existing concrete model of ec2 with respect to the transient thermal creep and the effect of the load history. with this new model we can consider the load history in all phases of thermal exposure. with this complex model, we can calculate the total strain, taking into account a wide range of variations of load history and temperatures. the different parts of the deformation are approximated with discrete equations interacting in the new concrete model. this technique can be used for realistic calculations of the behaviour of structures [30, 31, 32], especially in the case of restraint. by considering the load history during heating up, an increasing load bearing capacity due to higher stiffness of the concrete may be obtained in several cases.
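the atcm equations themselves are given in [1, 15] and are not reproduced here. as a reader's note, transient concrete models of this family are usually built on a total-strain decomposition of the following standard form (an assumption about the model family, not an equation quoted from this paper):

```latex
% standard total-strain decomposition used in transient concrete models
% (reader's note; the atcm equations themselves are given in [1, 15]):
\varepsilon_\mathrm{tot}(T,\sigma) =
    \varepsilon_\mathrm{th}(T)            % free thermal strain
  + \varepsilon_{\sigma}(\sigma,T)        % instantaneous stress-related strain
  + \varepsilon_\mathrm{cr}(\sigma,T,t)   % classical (time-dependent) creep
  + \varepsilon_\mathrm{tr}(\sigma,T)     % load-induced transient strain
```

the load history enters through the load-induced transient term, which is often approximated as proportional to the stress level, ε_tr ≈ −k_tr (σ/f_c) ε_th.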
fig. 19: axial forces in various beams in the wall (ec2 model with maximum value of pss vs. atcm; beams 23, 36, 48)

fig. 20: axial forces in various beams in the ceiling (ec2 model with maximum value of pss vs. atcm; beams 49, 60, 71)

fig. 21: bending moments in various beams in the ground plate (ec2 model with maximum value of pss vs. atcm; beams 1, 12, 20)

fig. 22: bending moments in various beams in the wall (ec2 model with maximum value of pss vs. atcm; beams 23, 36, 48)

with this model, we can consider the thermal-physical behaviour of the material properties for the calculation of reinforced concrete structures. application of this model, instead of the calculation system of ec2, will lead to a better evaluation of the safety level. this opens the way to optimizing reinforced concrete structures under temperature exposure.

a calculation of a tunnel cross section of a cut-and-cover single-bay frame was performed and presented above. lower deformations are calculated in all parts of the structure using the new advanced transient concrete model (atcm). due to this lower deformation, there is a lower axial force during heating. the results of the calculation of the bending moments show a lower moment on the inside of the tunnel surface and a higher bending moment on the outside of the tunnel, if we compare the results of the atcm with those of the ec2 model. the differences between the calculations are very small; we do not observe a significant difference in this structure when using the new model of concrete.

4 conclusion

it has been shown that the recommended model of ec2 does not calculate realistic values of deformations of concrete structures under high temperature, when compared with the results of the advanced transient concrete model (atcm), which is based on measured data. a maximum value of peak stress strain is necessary for a relatively realistic description of the behaviour of the structure. for the calculation of tunnels with concrete with siliceous aggregates, the ec2 model should be taken with the maximum value of the peak stress strain.
for calculating members with higher load utilisation, the atcm should be applied. note that the full concrete behaviour is used in the structure only with the tis-model with the equations of the atcm. calculation with the atcm has a high potential for optimizing concrete structures, higher than with the ec2 model. the reliability of the load bearing capacity is higher with the atcm, because the deformations are lower than with the ec2 model. the axial forces calculated with the atcm and with the ec2 model are close to each other. a potential is observed for more detailed calculations of complex structures. in the design of structures the model may be applied with lower safety factors, i.e. lower allowances may be used in the design.

fig. 23: bending moments in various beams in the ceiling (ec2 model with maximum value of pss vs. atcm; beams 49, 60, 71)

references

[1] schneider, u., schneider, m., franssen, j.-m.: consideration of nonlinear creep strain of siliceous concrete on calculation of mechanical strain under transient temperatures as a function of load history. proceedings of the fifth international conference – structures in fire sif 08, singapore 2008, p. 463–476.
[2] franssen, j.-m.: safir. a thermal/structural program modelling structures under fire. engineering journal, a.i.s.c., vol. 42 (2005), no. 3, p. 143–158.
[3] pesaveto, f. et al.: finite-element modelling of concrete subjected to high temperature. in: fib task group 4.3 fire design of concrete structures: what now? what next? milano, 2004.
[4] lang, e.: feuerbeton. schriftenreihe spezialbetone band 4, verlag bau + technik, 2001.
[5] wolf, g.: untersuchung über das temperaturverhalten eines tunnelbetons mit spezieller gesteinskörnung. diplomarbeit, technische universität wien, 2004.
[6] florian, a.: schädigung von beton bei tunnelbränden. diplomarbeit, universität innsbruck, 2002.
[7] schneider, u., horvath, j.: brandschutz-praxis in tunnelbauten. bauwerk verlag gmbh, berlin, 2006.
[8] debicki, g., langhcha, a.: mass transport through concrete walls subjected to high temperature and gas pressure. in: fib task group 4.3 fire design of concrete structures: what now? what next? milano, 2004.
[9] schneider, u.: verhalten von betonen bei hohen temperaturen. deutscher ausschuss für stahlbeton, berlin–münchen: verlag wilhelm ernst & sohn, 1982.
[10] horvath, j., schneider, u.: behaviour of ordinary concrete at high temperatures. institut für baustofflehre, bauphysik und brandschutz, tu wien, 2003.
[11] schneider, u., morita, t., franssen, j.-m.: a concrete model considering the load history applied to centrally loaded columns under fire attack. in: fire safety science – proceedings of the fourth international symposium, ontario, 1994.
[12] horvath, j.: beiträge zum brandverhalten von hochleistungsbetonen. technische universität wien, 2003.
[13] horvath, j., schneider, u., diedrichs, u.: brandverhalten von hochleistungsbetonen. institut für baustofflehre, bauphysik und brandschutz, tu wien, 2004.
[14] schneider, u.: ein beitrag zur frage des kriechens und der relaxation von beton unter hohen temperaturen. (habilitationsschrift) institut für baustoffe, massivbau und brandschutz, tu braunschweig, heft 42, braunschweig, 1979.
[15] schneider, u., schneider, m., franssen, j.-m.: numerical evaluation of load induced thermal strain in restraint structures compared with an experimental study on reinforced concrete columns. proceedings of the 11th international conference and exhibition, fire and materials 2009, 26–28 january 2009, fisherman's wharf, san francisco, usa.
[16] khoury, g. a., grainger, b. n., sullivan, p. j. e.: transient thermal strain of concrete: literature review, conditions with specimens and behaviour of individual constituents. magazine of concrete research, vol. 37 (1985), no. 132.
[17] schneider, u., lebeda, c., franssen, j.-m.: baulicher brandschutz. berlin: bauwerk verlag gmbh, 2008.
[18] eurocode 2: design of concrete structures – part 1-2: general rules – structural fire design. 2004.
[19] dwaikat, m. b., kodur, v. k. r.: effect of fire scenario, restraint conditions, and spalling on the behaviour of rc columns. proceedings of the fifth international conference – structures in fire sif 08, singapore 2008, p. 463–476.
[20] franssen, j.-m.: contributions à la modélisation des incendies dans les bâtiments et leurs effets sur les structures. université de liège, belgium, 1998.
[21] mason, j. e.: heat transfer programs for the design of structures exposed to fire. university of canterbury, christchurch, 1999.
[22] franssen, j.-m., hanus, f., dotreppe, j.-c.: numerical evaluation of the fire behaviour of a concrete tunnel integrating the effects of spalling. in: proceedings fib workshop – coimbra, november 2007.
[23] övbb-sachstandsbericht: brandeinwirkungen – straße, eisenbahn, u-bahn. övbb-arbeitskreis aa1, verf.: lemmerer, j. et al., wien, januar 2005.
[24] sfpe: the sfpe handbook of fire protection engineering. 2nd edition, quincy, ma, usa: sfpe, 1995.
[25] wittke, w., wittke-gattermann, p.: tunnelstatik. in: beton kalender 2005: fertigteile-tunnelbauwerke. berlin: verlag ernst & sohn, 2005.
[26] eurocode 1: actions on structures. part 1-1: general actions. densities, self-weight, imposed loads for buildings. en 1991-1-1:2002.
[27] beutinger, p., sawade, g.: standsicherheit – vorhersagemöglichkeit der bodentragfähigkeit aus geotechnischer sicht. tiefbau tagung magdeburg, 2004.
[28] feron, c.: the effect of the restraint conditions on the fire resistance of tunnel structures. in: fib task group 4.3 fire design of concrete structures: what now? what next? milano, 2004.
[29] vözfi: abschlussbericht – praxisverhalten von erhöht brandbeständigem innenschalen-beton. wien, 2003.
[30] rotter, j. m., usmani, a. s.: thermal effects. in: proceedings of the first international workshop "structures in fire", copenhagen, 19–20 june 2000.
[31] harada, kazunori: actual state of the codes on fire design in japan. in: fib task group 4.3 fire design of concrete structures: what now? what next? milano, 2004.
[32] bailey, c. g., toh, w. s.: experimental behaviour of concrete floor slabs at ambient and elevated temperatures. in: proceedings fourth international workshop "structures in fire", aveiro, 2006.
ulrich schneider
e-mail: ulrich.schneider+e206@tuwien.ac.at

martin schneider
e-mail: e0527948@student.tuwien.ac.at

university of technology vienna
karlsplatz 13/206, 1040 wien, austria

jean-marc franssen
e-mail: jm.franssen@ulg.ac.be
university of liège
1, ch. des chevreuils, 4000 liège, belgium

a new design method for industrial portal frames in fire

y. song, z. huang, i. burgess, r. plank

industrial portal frames near to other buildings must keep their vertical walls standing in fire in order to prevent fire spread. a recently developed analysis, implemented in the program vulcan, using a combination of static and dynamic solvers, has shown that the strong base connections recommended by the current design method may not always lead to a conservative design. a second-phase failure mechanism has been observed in numerical modelling, and the critical temperature at which final run-away collapse occurs may be higher than the temperature at which the roof frame initially loses its stability, because a re-stabilisation often happens. a new method for estimating critical temperatures of portal frames in fire, using these two failure mechanisms, is presented. numerical tests on typical industrial frames are used to calibrate this new method.

keywords: steel structures, portal frames, fire, boundary conditions, dynamic analysis, plastic theory, base connections.

1 introduction

for single-storey steel portal frames in fire, especially when situated close to a site perimeter, it is imperative that the boundary walls stay close to vertical, so that fires which occur are not allowed to spread to adjacent properties. a current uk fire design guide [1] requires either that the whole frame be protected as a single element, or that the rafter can be left unprotected but the column bases and foundations be designed to resist the forces and moments necessary to prevent collapse of the rafter, in order to ensure the lateral stability of the boundary walls. some arbitrary assumptions regarding the behaviour of the frame in fire, which are used to simplify this current design model, can lead to very uneconomical foundation design and base-plate detailing. further understanding of the behaviour of portal frames in fire is required, to provide other design options so that over-design of column bases and foundations can be avoided, and a more reasonable prediction of real critical temperatures can be made.

on the basis of fire tests, a simplified method to estimate the critical temperatures of portal frames in fire was developed by wong in 2001 [2] for single-span portal frames with simple base connections. it was shown by numerical modelling that this method could predict the temperature at which the rafters initially lose stability in fire. a recently developed quasi-static analysis [3], implemented in the program vulcan, using a combination of static and dynamic solvers, has also shown that the strong base connections recommended by the current design method may not always lead to a conservative design. a second-phase failure mechanism observed in numerical modelling corresponds with the failure mode shown in one of the previous fire tests. the critical temperature at which runaway collapse occurs may be higher than that at which the roof initially loses its stability, because of re-stabilisation. in this paper, a new method is presented for estimating critical temperatures of single-span frames in fire, using these two failure mechanisms. numerical tests on typical industrial frames are used to calibrate this new method against the current design method.

2 behaviour of single-storey portal frames in fire

as early as 1979 the behaviour of steel portal frames in accidental fires was described in the report of a study [4] of fires in a number of portal frames in the uk. a typical variation of the overturning moment at the column base with time, after a fire is ignited in a pitched-roof portal frame, is described in the constrado design guide [5]. it was believed that the stability of the column was mainly determined by the resistance provided by the column base connections. however, the fire test on a scaled pitched-roof portal frame performed in 1999 [2] showed that the steel columns, connected to their foundations by a fairly flexible connection, could stand almost upright throughout the fire while the rafters snapped through to an inverted shape.
this indicates that strong column bases are not always essential to the stability of an industrial frame under fire conditions. it has been postulated by o'meagher et al. [6] that unaffected parts of a building can act as anchorage for the fire-affected zone, provided that the forces developed in the purlins are reasonably small and that they have sufficient capacity at high temperatures, so that cold frames will also deform in an acceptable mode. the results of a series of parametric studies [2] using two- and three-dimensional modelling showed that a portal frame with semi-rigid bases initially loses stability in a combined mechanism, which differs from the assumption used in the current design method. further deformation could not be simulated because of the limitations of the static solver. in a previous paper the behaviour of single-span pitched portal frames was simulated using the recently-developed quasi-static solver [3] in vulcan. this showed that collapse of the frame happens in two phases [7]. it was also found that the initial collapse of the rafter is always caused by a plastic hinge mechanism which is based on the frame's initial configuration. if the frame can re-stabilize when the roof is substantially inverted, a second plastic mechanism based on the re-stabilized configuration leads to eventual failure of the whole frame.

3 new design method

a single-span portal frame fails either in the first-phase mechanism when it initially loses stability, or may re-stabilise for a while before collapsing in the second-phase mechanism. the simple method developed by wong [2] is based on the initial configuration of the frame. hence, it is capable of explaining the reason why frames initially lose stability in fire, but is not valid for frame collapse in the second mechanism, in which the deformation of the frame is significant. the estimation of the critical temperatures for a two-phase failure mechanism should be based on different initial configurations for each of the two phases.

3.1 estimation of first-phase failure

when the roof of the frame starts to deform downward under the loading and fire temperature, the columns are pushed outward due to the change of geometry and to thermal expansion of the rafters.
for a portal frame with frictionless pinned base connections, high rotations can be generated at these bases, caused by either elastic or plastic deformation. these rotations, together with the fire hinges formed at the apex and eaves, can generate a combined plastic mechanism. wong's simple model, as shown in fig. 1, uses this mechanism, whose kinematics is referred to the initial configuration of the portal frame. this method can only apply to the frame's initial loss of stability at relatively low deflections. according to plastic theory, for the mechanism shown in fig. 1, the fire hinge moments at corners 1 and 2 can be calculated. the ratio of the fire hinge moment to the normal moment capacity is given by the strength reduction factor at the critical temperature, so the critical temperature of this frame can be interpolated from stress-strain curves defined in eurocode 3 part 1.2 [8].

3.2 estimation of second-phase failure

the initial collapse of the roof frame may initiate a "combined" mechanism leading to collapse of the whole frame, or the columns may be pulled back towards the upright position (see shape abcde in fig. 2) due to the collapse of the rafters. in the latter case the change of direction of the column's rotation causes elastic unloading of the base moments of the columns, so that the plastic hinges developed at the bases are effectively locked. when the pitched roof deflects further than the position bcd, column ab is pulled inward, so that the base moments of the columns increase again. when the rotation of column ab is faster than the rotation of the adjacent rafter, the moment at one eave (corner b) starts to reverse, leading to locking of the adjacent plastic hinge. this causes the frame to re-stabilise at this position (shape ab0c0de in fig. 2), at which the internal angle (ab0c0) between column ab0 and the connected rafter b0c0 stops closing and starts opening. with further increase of the pulling force at the column top caused by catenary action of the inverted roof, the fire hinges at the eave and column base can be mobilised again (shape ab'c'de in fig. 2), and a new mechanism, referred to as the second-phase failure mechanism, is established which leads to complete collapse of the frame.

the new design method developed here focuses mainly on collapse caused by the second-phase mechanism in fire, and aims to predict the critical temperatures which initiate formation of the second-phase failure mechanism. the method is based on calculating the strength reduction factor of the fire hinge moment according to the work balance within the frame. because of the significant deformation of the roof frame before the start of the second failure mechanism, this new model has to identify the re-stabilised position of the frame and its critical position at the start point of the second-phase mechanism. both the thermal elongation of the rafters and the degradation of fire hinge moments at elevated temperatures are considered under a given temperature assumption. moreover, because a plastic hinge at one column base is essential to generate a second-phase mechanism, the strength of the column bases is also included in this new method. when the second-phase mechanism of the frame is established, the elongation of the rafters is significant. this should not be ignored when the work balance is calculated within this system. estimation of the critical temperature for the second-phase mechanism of the frame is also based on the reduced moment capacity of the rafters.
because the strength reduction factor and the steel elongation at the critical temperature are both unknown, an iterative solution procedure, illustrated by the flow chart in fig. 3, is required. at the beginning of the second-phase calculation, an initial temperature t0 is assumed, so that the re-stabilised position can be estimated on the basis of the geometry of the frame, including the elongation of the rafter, at temperature t0. the fire hinge moment can also be calculated on the basis of the configuration of the frame at the re-stabilised position (as shown in fig. 4) and the work balance based on plastic theory. the critical temperature t1 can be obtained from the strength reduction factor given by dividing the fire hinge moment by the moment capacity of the rafter section, and relating this to the corresponding temperature, as defined in eurocode 3 part 1.2 [8]. if the difference between t0 and t1 is larger than the tolerance required, steps 2 to 5 as defined in fig. 3 are repeated, using the elongated lengths of the rafters at t1, until tdiff is smaller than the tolerance required. the temperature t1 estimated from the final iteration is the critical temperature of the frame at the beginning of the second-phase mechanism.

fig. 1: the model of wong's simple design method (udl on the roof = w; fire hinges at corners 1 and 2; centre of rotation)

fig. 2: illustration of the second-phase mechanism (shapes abcde, re-stabilised position ab0c0de with the angle at b0 opening, and ab'c'de)

4 validations against numerical tests

in order to validate the new design method, a series of comparisons has been conducted between critical failure temperatures predicted using the new design method and those obtained from previous numerical tests [9]. because the two-phase mechanisms are included in this method, critical temperatures for the creation of both the first- and the second-phase mechanisms are compared. figs. 5 and 6 compare the critical temperatures predicted by the new design method and numerical analysis results for two typical portal frames. the results presented in fig. 5 are for the portal frame designed without haunches but with varying base strength. results for the other portal frame, which is designed with typical-sized haunches and modelled with different base strengths, are shown in fig. 6.

the first re-stabilised position of the frame is reached when the rafter is deformed into the inverted position and the vertical displacement of the apex is around 5 m. the prediction of the new design method is 5 m. this confirms that re-stabilisation during the collapse of the portal frame is caused by locking of the plastic hinge near to an eave, which disables the first-phase mechanism. once the opening of the locked angle exceeds the elastic rotation limit, the frame loses its stability again. for the frame with haunches, the re-stabilised position predicted by the new design method is about 0.4 m lower than for the numerical results. this is because, in the new design method, a re-stabilised position is assumed to occur when, in the first-phase mechanism, the rafter ceases to rotate relative to the column at hinge b (fig. 4) and this hinge consequently locks itself. when the hinge finally begins to rotate again, in the opposite sense to its original rotation, the second-phase mechanism is created and failure occurs.
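the iterative loop described above (and summarized in fig. 3) amounts to a simple fixed-point iteration. the sketch below is a schematic reading of that flow chart, not the authors' implementation: the geometry and work-balance functions are toy stand-ins, while the strength reduction table k_y(θ) is the standard one for steel from eurocode 3 part 1.2.

```python
# schematic fixed-point iteration behind fig. 3 (our own reading of the flow
# chart, not the authors' code). restabilised_rafter_length() and
# fire_hinge_moment() are toy stand-ins for the geometric and work-balance
# calculations of section 3.2.
import numpy as np

THETA = np.array([20, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200])
K_Y = np.array([1.0, 1.0, 1.0, 1.0, 1.0, 0.78, 0.47, 0.23, 0.11, 0.06, 0.04, 0.02, 0.0])

def temperature_from_reduction(k):
    """invert the monotonically decreasing k_y table by linear interpolation."""
    return float(np.interp(k, K_Y[::-1], THETA[::-1]))

def restabilised_rafter_length(span, t):
    # toy stand-in: thermal elongation with an assumed expansion coefficient
    return span * (1.0 + 1.2e-5 * (t - 20.0))

def fire_hinge_moment(length, span):
    # toy stand-in for the plastic work balance at the re-stabilised position
    return 40.0 * length / span          # [knm]

def critical_temperature(span=20.0, m_capacity=120.0, t0=400.0, tol=1.0):
    for _ in range(50):
        length = restabilised_rafter_length(span, t0)      # step 2
        k = fire_hinge_moment(length, span) / m_capacity   # step 3
        t1 = temperature_from_reduction(k)                 # step 4 (ec3 part 1.2)
        if abs(t1 - t0) < tol:                             # step 5: tdiff < tolerance
            return t1
        t0 = t1                                            # repeat steps 2-5
    raise RuntimeError("no convergence")

print(critical_temperature())   # ~656 degc with these toy numbers
```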
locking of the plastic hinges developed at one of the column bases could also be encountered when the column is pulled inward and passes the vertical position.

fig. 3: flow chart of the iterative procedure: (1) try temperature t0; (2) calculate the re-stabilised position, with the elongation of the rafters at t0; (3) estimate the strength reduction factor; (4) get the critical temperature t1 related to the strength reduction factor (ec3 part 1.2); (5) tdiff = |t1 − t0|; if tdiff > tolerance, set t0 = t1 and repeat from step 2.

abstract. at the symmetry-breaking threshold λ = 1 some of the eigenvectors become degenerate, giving rise to a jordan-block structure for each degenerate eigenvector. in general this is expected to give rise to a secular growth in the amplitude of the wave. however, it has been shown in a recent paper by longhi, by numerical simulation and by the use of perturbation theory, that for an initial wave packet this growth is suppressed, giving instead a constant maximum amplitude. we revisit this problem by developing the perturbation theory further. we verify that the results found by longhi persist to second order, and with different input wave packets we are able to see the seeds in perturbation theory of the phenomenon of birefringence first discovered by el-ganainy et al.

keywords: pseudo-hermitian quantum mechanics, optical lattices, perturbation theory.

1 introduction

the study of quantum mechanical hamiltonians that are pt-symmetric but not hermitian [1–6] has recently found an unexpected application in classical optics [7–15], due to the fact that in the paraxial approximation the equation of propagation of an electromagnetic wave in a medium is formally identical to the schrödinger equation, but with different interpretations for the symbols appearing therein. it turns out that propagation through such a medium exhibits many new and interesting properties, such as power oscillations and birefringence. the equation of propagation takes the form

i ∂ψ/∂z = −(∂²/∂x² + v(x)) ψ, (1)

where ψ(x, z) represents the envelope function of the amplitude of the electric field, z is a scaled propagation distance, and v(x) is the optical potential, proportional to the variation in the refractive index of the material through which the wave is passing. a complex v corresponds to a complex refractive index, whose imaginary part represents either loss or gain. in principle the loss and gain regions can be carefully configured so that v is pt-symmetric, that is v*(x) = v(−x). there is also a non-linear version of this equation, arising from sufficiently intense beams, where there is an additional term proportional to |ψ|²ψ. however, for the purposes of this paper we shall limit ourselves to the linear case.

a model system exemplifying some of the novel features of beam propagation in pt-symmetric optical lattices uses the sinusoidal potential

v = v0 [cos(2πx/a) + iλ sin(2πx/a)].

this model has been studied numerically and theoretically in refs. [9, 12, 13]. the propagation in z of the amplitude ψ(x, z) is governed by the analogue schrödinger equation (1), which for an eigenstate of h, with eigenvalue β and z-dependence ψ ∝ e^{−iβz}, reduces to the eigenvalue equation

−ψ′′ − v0 [cos(2πx/a) + iλ sin(2πx/a)] ψ = βψ. (2)

it turns out that these eigenvalues are real for λ ≤ 1, which corresponds to unbroken pt symmetry, where the eigenfunctions respect the (anti-linear) symmetry of the hamiltonian.
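the reality of the spectrum below threshold is easy to check numerically: in the plane-wave (bloch) basis the potential couples neighbouring momentum components with weights −v0(1 ± λ)/2, so a truncated matrix diagonalization shows real eigenvalues for λ ≤ 1 and complex pairs appearing above. the sketch below is our own illustration, not code from the paper:

```python
# bloch-matrix check that the eigenvalues of eq. (2) are real for lambda <= 1.
# in the plane-wave basis e^{i(k + 2*pi*n/a)x} the potential
# -v0[cos(2*pi*x/a) + i*lambda*sin(2*pi*x/a)] couples n -> n+1 with weight
# -v0(1+lambda)/2 and n -> n-1 with weight -v0(1-lambda)/2. our own
# illustration (truncated basis), not code from the paper.
import numpy as np

def bloch_eigenvalues(k, lam, v0=2.0, a=1.0, nmax=20):
    n = np.arange(-nmax, nmax + 1)
    h = np.diag(((k + 2.0 * np.pi * n / a) ** 2).astype(complex))
    h += np.diag(np.full(2 * nmax, -v0 * (1.0 + lam) / 2.0), -1)  # e^{+2i*pi*x/a} part
    h += np.diag(np.full(2 * nmax, -v0 * (1.0 - lam) / 2.0), +1)  # e^{-2i*pi*x/a} part
    return np.linalg.eigvals(h)

for lam in (0.5, 1.0, 1.2):
    beta = bloch_eigenvalues(k=np.pi, lam=lam)   # zone-boundary bloch momentum
    print(lam, np.max(np.abs(beta.imag)))        # ~0 for lam <= 1, nonzero above
```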
above λ = 1 complex eigenvalues begin to appear, and indeed above λ ≈ 1.77687 all the eigenvalues are complex [16]. clearly one would expect oscillatory behaviour of the amplitude below the threshold at λ = 1 and exponential behaviour above the threshold, but the precise form of the evolution at λ = 1 is less obvious. at first sight one would expect linear growth because of the appearance of jordan blocks associated with the degenerate eigenvalues that merge at that value of λ, but, as longhi [12] has emphasized, this behaviour can be significantly modified depending on the nature of the initial wave packet.

in a previous paper [17] we approached this problem by explicitly constructing the bloch wavefunctions and the associated jordan functions corresponding to the degenerate eigenvalues and then using the method of stationary states to construct the z-dependence. we found that the explicit linear dependence arising from the jordan associated functions is indeed cancelled by the combined contributions from the non-degenerate wave-functions and were able to understand how this cancellation came about. in the present paper we approach the problem from a different point of view by revisiting the complementary perturbative calculation of longhi [12]. in section 2 we briefly recapitulate how the spectrum and eigenfunctions are calculated. then in section 3, which forms the main body of the paper, we give an explicit expression for the first-order contribution and carry out the second-order calculation in detail. this enables us to investigate the saturation phenomenon for a variety of different inputs. finally, in section 4 we give a brief discussion of our results.

2 band structure at threshold

at the threshold λ = 1, the potential v in eq. (2) becomes the complex exponential v = v0 exp(2iπx/a), for which the schrödinger equation is

−ψ′′ − v0 exp(2iπx/a)ψ = βψ. (3)

this is a form of the bessel equation, as is seen by the substitution y = y0 exp(iπx/a), where y0 = (a/π)√v0, giving

y² d²ψ/dy² + y dψ/dy − (y² + q²)ψ = 0, (4)

where q² = β(a/π)².

fig. 1: band structure for λ = 1 in the reduced zone scheme. the bloch momentum k is plotted in units of π/a and the eigenvalue β in units of (π/a)²

thus the spectrum is that of a free massive particle, shown in the reduced zone scheme in fig. 1, and for q ≡ ka/π not an integer the solutions ψ_k(x) = i_q(y) and ψ_{−k}(x) = i_{−q}(y) are linearly independent, and have exactly the correct periodicity, ψ_k(x + a) = e^{ika}ψ_k(x), to be the bloch wavefunctions. it is important to note, however, that because the original potential is pt-symmetric rather than hermitian, these functions are not orthogonal in the usual sense, but rather with respect to the pt inner product, namely

∫ dx ψ_{−k}(x)ψ_{k′}(x) = δ_{kk′} ∫ dx ψ_{−k}(x)ψ_k(x). (5)

however, for q = n, a non-zero integer, i_n(y) and i_{−n}(y) are no longer independent. in that case the bloch eigenfunctions do not form a complete set, and we must search for other functions, still with the same periodicity, to supplement them. these are the jordan associated functions, which we denote by ϕ_k(x) ≡ χ_n(y).
they may be defined as derivatives of the eigenfunctions with respect to β, and satisfy the generalized eigenvalue equation

[y² d²/dy² + y d/dy − (y² + n²)] χ_n(y) = i_n(y). (6)

the crucial feature of the jordan functions is that because of this latter equation they naturally give rise to linear growth in z, provided that they are excited:

e^{−ihz} ϕ_r = e^{−iβ_r z} e^{−i(h−β_r)z} ϕ_r = e^{−iβ_r z}(ϕ_r − iz ψ_r). (7)

however, as was found numerically in ref. [12], and explored further in ref. [17], this natural linear growth may become saturated due to the contributions of neighbouring bloch functions, which are closely correlated with those of the jordan functions.

3 perturbation theory

the analysis of ref. [17] approached the problem from one point of view, in which the interplay between the contributions of the bloch eigenfunctions and the jordan associated functions was made explicit. a complementary way of looking at things, which does not separate these two contributions, is to use the perturbative expansion, which instead emphasizes the contributions of the free propagation and the corrections brought about by the potential. the general framework for an expansion of ψ(x, z) in powers of v0, namely

ψ(x, z) = Σ_{r=0}^{∞} v0^r ψ_r(x, z),

has been given in ref. [12], along with an approximate form of the first-order term ψ1(x, z) for the case q0 = −1 and w large. in this section we generalize this calculation by obtaining analytic expressions for both the first- and second-order terms for general q0 and w. of course, this can only be used as a guide because there is no guarantee that the expansion converges, nor that the large-z behaviour of the complete amplitude can be extracted from the behaviour of the truncated series.

fig. 2: characteristic behaviour of error functions whose arguments are (a) real or (b) have a large imaginary part. in (a) we plot erf(4 + y/w) + erf(4 − y/w) with w = 80. in (b) we plot w e^{−w²} |erf(4 + y/w − iw) + erf(4 − y/w + iw)|

we will take as our input a gaussian profile of the form

ψ(x, 0) = f(x) ≡ e^{−(x/w)² + ik0x}, (8)

with offset k0 and width w. the zeroth-order term, ψ0(x, z), is just the freely-propagating wave-packet

ψ0(x, z) = [w/√(w² + 4iz)] e^{ik0(x−k0z)} e^{−(x−2k0z)²/(w²+4iz)}, (9)

while the first-order term, ψ1(x, z), is given by

ψ1(x, z) = −i ∫ dk f̃(k) e^{i(k+kb)x} · [−i ∫_0^z dy e^{−ik²y} e^{i(k+kb)²(y−z)}], (10)

where

f̃(k) = [w/(2√π)] e^{−(k−k0)²w²/4} (11)

is the fourier transform of f(x) of eq. (8) and kb = 2π/a is the width of the first brillouin zone. we can reverse the order of integration in the expression for ψ1, performing the (gaussian) k integration first, to obtain

ψ1(x, z) = −i [w/√(w² + 4iz)] e^{(1/2)ikb x − (1/4)ikb²z − (1/4)(k0 + kb/2)²w²} ∫_0^z dy e^{−(2kb y − (x + kb z) + (1/2)iw²(k0 + kb/2))²/(w²+4iz)}. (12)

the y integration is then also a gaussian integration over a finite range, giving the result

ψ1(x, z) = −i [w√π/(4kb)] e^{(1/2)ikb x − (1/4)ikb²z − (1/4)(k0 + kb/2)²w²} (erf(η1) + erf(η2)), (13)

where

η1 = [kb z + x − (1/2)iw²(k0 + kb/2)] / √(w² + 4iz), (14)

η2 = [kb z − x + (1/2)iw²(k0 + kb/2)] / √(w² + 4iz). (15)

for the purposes of considering large w, it is convenient to rewrite these in the form

η1 = (x − 2k0z)/√(w² + 4iz) − (1/2)i(k0 + kb/2)√(w² + 4iz), (16)

η2 = [2z(kb + k0) − x]/√(w² + 4iz) + (1/2)i(k0 + kb/2)√(w² + 4iz). (17)
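eq. (13) is straightforward to evaluate numerically, since scipy's erf accepts complex arguments; the sketch below is our own check of the first-order term, with illustrative parameter values:

```python
# numerical evaluation of the first-order term, eq. (13); our own check,
# with illustrative parameters (a = 1, and q0 = -1 means k0 = -kb/2).
import numpy as np
from scipy.special import erf

a, v0, w = 1.0, 0.2, 6 * np.pi
kb = 2 * np.pi / a
k0 = -kb / 2            # q0 = -1: the special case treated in the text

def psi1(x, z):
    s = np.sqrt(w**2 + 4j * z)
    eta1 = (kb * z + x - 0.5j * w**2 * (k0 + kb / 2)) / s   # eq. (14)
    eta2 = (kb * z - x + 0.5j * w**2 * (k0 + kb / 2)) / s   # eq. (15)
    pref = -1j * w * np.sqrt(np.pi) / (4 * kb)
    phase = np.exp(0.5j * kb * x - 0.25j * kb**2 * z
                   - 0.25 * (k0 + kb / 2)**2 * w**2)
    return pref * phase * (erf(eta1) + erf(eta2))

x = np.linspace(-60, 60, 1201)
for z in (10.0, 25.0, 50.0):
    # roughly constant in z while w^2 >> 4z: the saturation discussed in the text
    print(z, np.max(np.abs(v0 * psi1(x, z))))
```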
the case k0 = −kb/2, i.e. q0 = −1, is clearly very special, since in this case the second terms in the contributions to η1 and η2 vanish, so that we get the simple expressions η1 = (x + kbz)/√(w² + 4iz) and η2 = −(x − kbz)/√(w² + 4iz). in that case, as long as w² ≫ 4z, the arguments may be treated as effectively real, and each error function behaves like a sign function of its argument (see figure 2(a)), so that the sum of the two behaves like the step function θ(kbz − |x|). this is the function φ(x/(kbz)) of ref. [12]. in this case the qualitative features of the perturbative calculation are in complete agreement with the spreading of the wave-function in figure 3 of that paper, and the saturation of ψmax (ψ1(x, z) grows initially like z for small z). however, in this treatment there is of course no mention of whether or not any jordan functions are excited.

fig. 3: |ψ2(x, z)| versus x for z = 50. (a) q0 = −1, (b) q0 = 0. the parameters are: a = 1, v0 = 2 and w = 6π

the same expressions in eqs. (13) and (16) can also be used for the cases q0 = 0 and q0 = 1 in the limit of large w. in each case the arguments η1 and η2 now have a large imaginary part. in that situation the modulus of the erf has a narrow peak where the real part vanishes (see figure 2(b)). thus the result consists of two narrow rays, which are centered on x = 2k0z and x = 2(kb + k0)z. here we have the seeds of the birefringence first observed in ref. [7]. for the case k0 = q0 = 0, the two rays are centered on x = 0 and x = 2kbz, while for the case q0 = 1, or k0 = kb/2, the two rays are centered on x = kbz and x = 3kbz.

we now go on to second-order perturbation theory to investigate the behaviour of ψ2(x, z), which is given by [12]

ψ2(x, z) = −∫ dk f̃(k) e^{i(k+2kb)x} ∫_0^z dη ∫_0^{z−η} dξ e^{−ik²z − 4ikb(k+kb)η + ikb(kb+2k)ξ}. (18)

again the k integration is a gaussian, which leaves finite-range gaussian integrations over ξ and η. it is convenient to change the integration variable ξ to y ≡ 2η − ξ. the integration over y then yields the expression

ψ2(x, z) = −[w/(2kb)] ∫_0^z dη (√π/2) (erf(a) − erf(b)) e^{−2ikb²η} e^{(3/2)ikb x − (1/4)ikb²z − (1/4)w²δ²}, (19)

where we have written k0 = −(kb/2 + δ), and a and b are given by

a = [3kb(2η − z) − x − (1/2)iw²δ] / √(w² + 4iz),

b = [4kbη − x − kbz − (1/2)iw²δ] / √(w² + 4iz). (20)

the final η integrations are then of the form

i_η ≡ ∫ dη erf(c1η + c2) e^{c3η} = (1/c3) [e^{c3η} erf(c1η + c2) − e^{c3(c3−4c1c2)/(4c1²)} erf(c1η + c2 − c3/(2c1))]. (21)

thus in principle ψ2 is expressible in terms of eight error functions. however, it turns out in practice that only six are involved. in the case k0 = −kb/2, or q0 = −1, when δ = 0, the arguments of the error functions are such as to give a plateau in |ψ2| between x = −3kbz and x = −kbz, i.e. a widening of the beam to the left of ψ0, and a much smaller peak, centered around x = 3kbz, that is, a second weak beam to the right. more importantly, the second-order contribution again shows no sign of the linear growth naïvely expected from the excitation of jordan associated functions (in fact ψ2(x, z) is proportional to z² for small z). in the other case, q0 = 0, corresponding to δ = −kb/2, there are three peaks, centered around x = 0, x = −2kbz and x = 4kbz, representing a further splitting of the initial beam. both cases are illustrated in figure 3.
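the antiderivative (21) can be verified symbolically: differentiating the right-hand side must give back the integrand. the following check is our own, using sympy:

```python
# symbolic verification of the antiderivative (21): the derivative of the
# right-hand side must reproduce the integrand erf(c1*eta + c2) * exp(c3*eta).
# our own check, not code from the paper.
import sympy as sp

eta = sp.Symbol('eta')
c1, c2, c3 = sp.symbols('c1 c2 c3', positive=True)

i_eta = (sp.exp(c3 * eta) * sp.erf(c1 * eta + c2)
         - sp.exp(c3 * (c3 - 4 * c1 * c2) / (4 * c1 ** 2))
           * sp.erf(c1 * eta + c2 - c3 / (2 * c1))) / c3

residual = sp.diff(i_eta, eta) - sp.erf(c1 * eta + c2) * sp.exp(c3 * eta)
print(sp.simplify(sp.expand(residual)))   # -> 0
```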
4 discussion

of course one needs to treat the results of perturbation theory with caution. in general terms we have no proof that the perturbation series converges, and in particular the asymptotic behaviour in z of a few terms of the series does not necessarily give the correct asymptotic behaviour of the entire sum. nonetheless this series does appear to give reliable results. for the parameters used by longhi in ref. [12], with v0 = 0.2, first-order perturbation theory already reproduces the numerical results very well, and the second-order term gives only an extremely small correction.

in all of our calculations we have not allowed z to become too large, restricting it by the condition z ≪ w²/4, in which case saturation is an inbuilt feature of perturbation theory. in fact it was shown in ref. [17] using the method of stationary states that if one goes to much larger values of z the amplitude so calculated begins to grow again, but this ultimate resumption of linear growth is not physical, because it corresponds to the situation where the beam has widened beyond the lateral limits of the optical lattice. it is an elegant feature of the perturbative expansion that the different types of possible behaviour of the beam (spreading, or splitting into two or more beams) arise from very simple properties of the error function, depending crucially on the offset k0. in the first case the arguments are essentially real, and the error functions behave like sign functions, while in the second case there is a large imaginary part, and the moduli of the error functions behave instead like narrow peaks.

references

[1] bender, c. m., boettcher, s.: phys. rev. lett. 80, 5243 (1998).
[2] bender, c. m.: contemp. phys. 46, 277 (2005); rep. prog. phys. 70, 947 (2007).
[3] mostafazadeh, a.: arxiv:0810.5643.
[4] bender, c. m., brody, d. c., jones, h. f.: phys. rev. lett. 89, 270401 (2002); 92, 119902(e) (2004).
[5] bender, c. m., brody, d. c., jones, h. f.: phys. rev. d 70, 025001 (2004); 71, 049901(e) (2005).
[6] mostafazadeh, a.: j. math. phys. 43, 205 (2002); j. phys. a 36, 7081 (2003).
[7] el-ganainy, r. et al.: optics letters 32, 2632 (2007).
[8] musslimani, z. et al.: phys. rev. lett. 100, 030402 (2008).
[9] makris, k. et al.: phys. rev. lett. 100, 103904 (2008).
[10] klaiman, s., günther, u., moiseyev, n.: phys. rev. lett. 101, 080402 (2008).
[11] guo, a. et al.: phys. rev. lett. 103, 093902 (2009).
[12] longhi, s.: phys. rev. a 81, 022102 (2010).
[13] makris, k. et al.: phys. rev. a 81, 063807 (2010).
[14] rüter, c. et al.: nature physics 6, 192 (2010).
[15] ramezani, h. et al.: arxiv:1005.5189 (2010).
[16] midya, n., roy, b., choudhury, r.: phys. lett. a 374, 2605 (2010).
[17] graefe, e. m., jones, h. f.: arxiv:1104.2838 (2011).

hugh f. jones
e-mail: h.f.jones@imperial.ac.uk
physics department
imperial college
london sw7 2bz, uk

the search for blazars among the unidentified egret γ-ray sources

pieter j. meintjes^a,∗, pheneas nkundabakura^a,b

a department of physics, university of the free state, po box 339, bloemfontein, 9300, south africa
b kigali institute of education, p.o. box 5039, kigali, rwanda
∗ corresponding author: meintjpj@ufs.ac.za

abstract.
in this paper we report the results of a multi-wavelength follow-up study of selected flat spectrum extragalactic radio-optical counterparts within the error boxes of 13 unidentified egret sources. two of these previously unidentified counterparts have been selected for optical photometric and spectroscopic follow-up studies. spectroscopic observations made with the 4.1 m soar telescope at cerro pachón, chile, showed that the spectra of the optical counterparts of 3eg j0821−5814 (pks j0820−5705) and 3eg j0706−3837 (pmn j0710−3835) correspond to a flat spectrum radio quasar (fsrq) and a liner-seyfert i galaxy, respectively. optical photometry of these sources, performed with the 1.0 m telescope at sutherland (south africa), shows noticeable intranight variability for pks j0820−5705, as well as a 5 sigma variation of the mean brightness in the r-filter over a timescale of three nights. significant variability has been detected in the b-band for pmn j0710−3835 as well. the gamma-ray spectral indices of all 13 candidates range between 2–3, correlating well with the bl lacs and fsrqs detected with fermi-lat in the first 11 months of operation.

keywords: radiation mechanisms: non-thermal, line: identification, techniques: spectroscopic, galaxies: jets, bl lacertae objects.

1. introduction

the energetic gamma ray telescope experiment egret (20 mev ÷ 30 gev) provided the highest gamma-ray window on board the compton gamma-ray observatory (cgro). the main scientific objective was to survey the gamma-ray sky to identify and study possible point sources of gamma-ray emission. egret detected 271 gamma-ray sources above 100 mev, 92 % of which were blazars. of the 271 sources detected, 131 remained unidentified, i.e. could not be associated with any specific point source of gamma-ray emission [5]. the aim of this study is to search for possible extra-galactic radio-loud active galactic nuclei (agn), i.e. blazars and flat spectrum radio quasars (fsrqs), within some selected egret error boxes. to avoid confusion with possible galactic sources, especially molecular cloud distributions, the search was restricted to some unidentified sources at galactic latitudes |b| > 10 deg.

selection criteria: the counterparts should be inside the error box associated with the detection [5], confirmed as extragalactic in the nasa extragalactic database (ned), have radio brightness above 100 mjy at 8.4 ghz [8], exhibit hard spectra with spectral indices |α| < 0.7 [8], and display variability (e.g. [4]) that may be associated with an inner accretion disc or jet. based upon these criteria, 13 sources have been selected for further follow-up study.

the egret (30 mev ÷ 10 gev) gamma-ray spectra of these sources observed between april 1991 and october 1995 (cycles 1, 2, 3 and 4 of the mission) have been determined. the spectral index distribution is displayed in fig. 1.

figure 1. gamma-ray spectral index distribution of selected unidentified sources.

the spectral distribution of these unidentified sources corresponds remarkably well with the gamma-ray blazar spectral index distribution observed by fermi-lat in the first 11 months of operation [1]. in the first phase of this study, two of these previously unidentified counterparts, i.e. pks j0820−5705 and pmn j0710−3850, were selected for further optical spectroscopic and photometric follow-up studies.
egret            counterpart   dec (j2000)   ra (j2000)
3eg j0159−3603   j0156−3616    −36 16 14     01 56 47
3eg j0500+2502   j0502+2516    +25 16 24     05 02 59
3eg j0702−6212   j0657−6139    −61 39 26     06 57 02
3eg j0706−3837   j0710−3850    −38 50 36     07 10 43
3eg j0724−4713   j0728−4745    −47 45 14     07 28 22
3eg j0821−5814   j0820−5705    −57 05 35     08 20 58
3eg j1300−4406   j1302−4446    −44 46 52     13 02 31
3eg j1659−6251   j1703−6212    −62 12 38     17 03 37
3eg j1709−0828   j1713−0817    −08 17 01     17 13 06
3eg j1800−0146   j1802−0207    −02 07 44     18 02 50
3eg j1813−6419   j1807−6413    −64 13 50     18 07 54
3eg j1822+1641   j1822+1600    +16 00 12     18 22 11
3eg j1824+3441   j1827+3431    +34 31 05     18 27 00

table 1. some high galactic latitude unidentified sources and their counterparts.

figure 2. spectrum of pks j0820−5705 (fsrq) at z = 0.06 (flux vs. wavelength; marked features: ca ii h and k, mgib, nad, hα+[nii], and the atmospheric a- and b-bands).

2. optical follow-up studies

optical spectroscopy of the two selected counterparts was performed using the 4.1 m soar telescope in chile on the night of january 16–17, 2009, utilizing the goodman spectrograph. the spectra of both these sources are shown in fig. 2 (pks j0820−5705) and fig. 3 (pmn j0710−3850), respectively. the spectrum for pks j0820−5705 resembles that of an fsrq at redshift z = 0.06, while the spectrum of pmn j0710−3850 shows broad and narrow lines resembling the spectrum of a liner or seyfert i galaxy at redshift z = 0.129. what distinguishes the spectrum of pks j0820−5705 from that of a normal radio galaxy is the shallow k4000 depression of only 8.8 % ± 2.5 %, indicating substantial non-thermal activity, while the corresponding value for pmn j0710−3850 is 80 % ± 1 %, in agreement with the value expected for a liner-seyfert i galaxy (e.g. [2]).

figure 3. spectrum of pmn j0710−3850 (liner-seyfert i) at z = 0.129 (flux vs. wavelength; marked features: hγ, hβ, [oiii], [oi], nad, hα+[nii], and the atmospheric a- and b-bands).

optical photometry in the b and r filters (fig. 4) of these systems during january 2009 shows peculiar intranight variability, as well as variability at the 5 σ level over timescales of a few nights for pks j0820−5705. this level of intranight variability from pks j0820−5705 has also been observed in a few other sources, e.g. the blazars ao 0235+164 and pks j0736+17 (e.g. [4]). the variability observed in pmn j0710−3850 corresponds to that observed from another seyfert i galaxy, i.e. ngc 4395 (e.g. [3]).

3. sed modelling

the multi-wavelength data from the two counterparts, from radio to gamma-rays, have been combined to create the spectral energy distribution (sed) over more than 15 decades in energy (fig. 5). the data is fitted with a single-zone synchrotron self-compton (ssc) model (e.g. [6]) and an external compton (ec) model (e.g. [7]), i.e. where relativistic jet electrons up-scatter infrared (ir) photons from the disc torus and possibly optical photons from the emission line regions to high energies.
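as a rough consistency check of the ssc parameters quoted in table 2 below, one can evaluate the standard doppler-boosted synchrotron critical frequency, ν ≈ 4.2 × 10^6 γ² b[g] δ hz; the numbers below are our own estimate, not part of the authors' fit:

```python
# rough check: doppler-boosted synchrotron critical frequency,
#   nu ~ 4.2e6 * gamma^2 * b[gauss] * delta  [hz]   (standard formula).
# our own estimate using the ssc (opt) parameters of table 2, not the fit itself.
def sync_peak_hz(gamma_max, b_tesla, delta):
    b_gauss = b_tesla * 1.0e4
    return 4.2e6 * gamma_max**2 * b_gauss * delta

# ssc (opt) entry for 3eg j0821-5814: gamma_max = 3.1e3, b = 2.5e-4 t, delta = 3.8
print(f"{sync_peak_hz(3.1e3, 2.5e-4, 3.8):.2e} hz")  # ~4e14 hz, i.e. optical band
```

the result lands in the optical band, consistent with labelling that parameter set "ssc (opt)".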
the gamma-ray emission for 3eg j0706−3837 can readily be explained by an ec process, where jet electrons upscatter photons from both the disc torus and emission line regions to high energies, while for 3eg j0821−5814 the gamma-ray component is mostly compatible with an ssc process. for 3eg j0821−5814, a higher energy component of the egret spectrum below the indicated upper limits, could possibly be associated with an ec process. 4. conclusions we report the discovery of 13 flat spectrum extragalactic sources within the error boxes of some high galactic latitude unidentified egret sources. four of these egret sources have been detected with fermilat within the first 11 months of operation. optical spectroscopy of pks j0820−5705 shows a featureless spectrum, shallow k4000 depression with broad absorption lines at z = 0.06 resembling an fsrq, while the spectrum of pmn j0710−3850 shows broad and narrow emission lines at z = 0.129, resembling a liner-seyfert i galaxy. the optical spectrum of pks j0820−5705 shows a very shallow k4000 depression of 8.8 %±2.5 %, implying the non-thermal emission associated with an fsrq, while the k4000 depression for pmn j0710−3850 is significantly deeper at 80 %±1 %, in accordance with that expected of a liner-seyfert i galaxy. photometry of both these sources shows ∼ 1 magnitude intranight variability in the b-band, with an additional 5 σ variability seen over a 3-day period in the r-band from pks j0820−5705. the sed of these sources have been fitted with a combination of ssc and ec models. 633 pieter j. meintjes, pheneas nkundabakura acta polytechnica 1e-18 1e-17 1e-16 1e-15 1e-14 1e-13 1e-12 1e+10 1e+15 1e+20 1e+25 ν f ν (w m -2 ) ν (hz) radio opt. nir rosat egret 3eg j0706-3837 (pmn j0710-3850) ec (ir) ssc sync ec (bel) 1e-18 1e-17 1e-16 1e-15 1e-14 1e-13 1e-12 1e-11 1e+10 1e+15 1e+20 1e+25 ν f ν (w m -2 ) ν (hz) radio opt. nir rosat egret 3eg j0821-5814 (pks j0820-5705) xmm ec ssc-opt figure 5. sed model fits for 3eg j0706−3837 (pmn j0710−3850) and 3eg j0821−5814 (pks j0820−5705). ssc parameter 3eg j0821−5814 3eg j0706−3837 ssc (opt) ssc (x-ray) r (m) 1.00 × 1013 1.00 × 1012 3.50 × 1013 b (t) 2.50 × 10−4 2.50 × 10−4 2.50 × 10−4 δ 3.8 15 12 γmax 3.10 × 103 7.80 × 104 2.00 × 103 ec parameter 3eg j0821−5814 3eg j0706−3837 r (m) 1.0 × 1015 1.0 × 1015 b (t) 2.50 × 10−4 2.50 × 10−4 γ 10 10 θobs (rad) 0.2 0.52 ke 3.0 × 1050 2.5 × 1053 uir (in erg cm−3) 0.08 25 νir (in ev) 0.01 0.1 table 2. ssc and ec model parameters for 3eg j0821−5814 and 3eg j0706−3837 . the ssc (opt) refers to the ssc model fit obtained using the optical data as part of the synchrotron emission while the ssc (x-ray) refers to the model fit obtained using the x-ray data as part of the synchrotron emission. here r and b are the radius and the magnetic field intensity of the emitting region respectively; δ and γmax are the doppler factor and the maximum lorentz factor of the electrons. the main parameters for the ec model are θ, uir, νir, γ and kel representing the viewing angle, the energy density of dust ir radiation, the ir radiation characteristic frequency, the bulk jet lorentz factor and the normalization constant in the electron density distribution respectively. references [1] abdo a.a. et al., 2010, apj, 710, 1271 [2] caccianiga a., della ceca r., gioi i.m., maccacaro t & wolter t., 1999, apj, 513, 51 [3] desroches l-b et al., 2006, apj, 650, 88 [4] fan j-h., 2005, chin. j. astron. astrophys. 
suppl., 5, 213 [5] hartman r. et al., 1999, apjs, 123, 79 [6] katarzynski k., sol h. & kus a., 2001, a&a, 367, 809 [7] sikora m., begelman m.c. & rees m.j., 1994, apj, 421, 153 [8] sowards-emmerd d., romani r.w. & michelson p.f., 2003, apj, 590
discussion
james beall — does the variability you mentioned correlate with that observed in blazars? thank you.
pieter meintjes — the level of intranight variability seen in the b and r filters from pks j0820−5705 has been seen in a few other blazars, for example ao 0235+164 and pks j0736+17. however, it must be stressed that in most cases the intranight variability (microvariability) is at a much lower level than has been observed from pks j0820−5705. the intranight variability seen from pmn j0710−3850 resembles that of another seyfert i galaxy, ngc 4395. so this level of variability is not normally observed, especially that of pks j0820−5705, but it could represent an active phase of these sources. regular optical monitoring will be performed to verify this.
acta polytechnica 53(supplement):631–634, 2013
acta polytechnica 53(1):11–22, 2013 © czech technical university in prague, 2013 available online at http://ctn.cvut.cz/ap/
astroparticle physics today
franco giovannelli∗
inaf – istituto di astrofisica e planetologia spaziali, area di ricerca di tor vergata, via del fosso del cavaliere, 100 – i 00133 roma, italy
∗ corresponding author: franco.giovannelli@iaps.inaf.it
abstract. in this short review paper i will draw attention to the most important steps made in the past decade toward a better understanding of the physics governing our universe. the results that i will discuss are drawn from photonic astrophysics, particle astrophysics, and neutrino astrophysics, which constitute the main tools for exploring the universe. the union of these three tools has given rise to a new field of physics known as astroparticle physics. because of the limited length of this paper, i have selected only a few arguments that, in my opinion, have been crucial for the progress of physics.
keywords: photonic astrophysics, particle astrophysics, neutrino astrophysics.
1. introduction
astroparticle physics, a new field of physics, was formed about twenty years ago by joining the efforts of the community of high energy astrophysicists and the community of particle physicists. over this relatively short period of time, astroparticle physics has developed strongly through the study of cosmic sources that emit photons, charged particles, and neutrinos. these sources are considered as frontier objects between astrophysics and particle physics. results emerging from the study of cosmic sources via photonic astrophysics, particle astrophysics, and neutrino astrophysics have been stimulating the scientific community toward a unifying scheme for a general comprehension of the physics governing our universe. the wmap (wilkinson microwave anisotropy probe) mission described by bennett et al. (2003) determined that the universe is 13.7 ± 0.2 gyr old. the combination of wmap and 2dfgrs data constrains the energy density in stable neutrinos: ων h² < 0.0072. for three degenerate neutrino species, this limit implies that their mass is less than 0.23 ev (95 % confidence limit). the best fit of the data favors a flat universe, from which it follows that the mean energy density in the universe is equal to the critical density (spergel et al., 2003).
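the equivalence stated in the next paragraph (a critical density of ∼10^−29 g cm^−3, i.e. a few protons per cubic metre) follows from ρ_c = 3h0²/(8πg); a minimal check, assuming h0 ≈ 71 km s^−1 mpc^−1, a wmap-era value (the exact figures shift slightly with the adopted h0):

import math

G = 6.674e-11            # gravitational constant (m^3 kg^-1 s^-2)
M_PROTON = 1.673e-27     # proton mass (kg)
H0 = 71e3 / 3.086e22     # hubble constant: 71 km/s/mpc in s^-1

rho_c = 3.0 * H0**2 / (8.0 * math.pi * G)         # critical density (kg m^-3)
print(f"{rho_c * 1e-3:.2e} g cm^-3")              # ~9.5e-30 g cm^-3
print(f"{rho_c / M_PROTON:.1f} protons per m^3")  # ~5.7

the quoted values of 9.9 × 10^−30 g cm^−3 and 5.9 protons m^−3 correspond to a slightly larger h0; the order of magnitude is the point.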
this is equivalent to a mass density of 9.9 × 10^−30 g cm^−3, i.e. to only about 5.9 protons per m³. of this total density, we now know the breakdown to be:
4.6 % baryons. this includes stars, heavy elements, free hydrogen and helium.
0.4 % neutrinos. fast-moving neutrinos do not play a major role in the evolution of structure in the universe; a larger neutrino component would have prevented the early clumping of gas in the universe, delaying the emergence of the first stars, in conflict with the wmap data. however, with 5 years of data, wmap is able to show evidence that a sea of cosmic neutrinos does exist in numbers that are expected from other lines of reasoning. this is the first time that such evidence has come from the cosmic microwave background.
23 % cold dark matter. dark matter is likely to be composed of one or more species of sub-atomic particles that interact very weakly with ordinary matter. several plausible candidates for dark matter exist. new particle accelerator experiments, and especially the large hadron collider (lhc), are likely to provide new insights in the coming years.
72 % dark energy. the first observational hints of dark energy date back to the 1980s, when astronomers were trying to understand how clusters of galaxies were formed. their attempts to explain the observed distribution of galaxies improved if dark energy was present, but the evidence was highly uncertain. in the 1990s, observations of supernovae were used to trace the expansion history of the universe, and it was a big surprise that the expansion appeared to be accelerating rather than decelerating. in 2003, the first wmap results came out, indicating that the universe was flat and that dark matter made up only ∼23 % of the density required to produce a flat universe. if 72 % of the energy density in the universe is in the form of dark energy, which has a gravitationally repulsive effect, this is just the right amount to explain both the flatness of the universe and the observed accelerated expansion. thus dark energy explains many cosmological observations all at the same time (nasa official web page, edward j. wollack, 2011). however, up to now we have no experimental proof of the existence of dark matter and dark energy. thus we can affirm with reasonable certainty that the ocean composed of dark matter and dark energy is the equivalent of the region outside the roman world where the ancient romans placed lions (hic sunt leones). we know rather well, but not yet completely, only about 5 % of the content of the universe! photonic astrophysics, particle astrophysics and neutrino astrophysics are the three tools that have made it possible to sound this 5 % portion of the universe. the combined use of these three tools has given rise to a new field of physics, known as astroparticle physics. all cosmic sources, both discrete and diffuse, are variable in intensity and in spectral shape on different time scales. in this sense, we can affirm that no source is sufficiently stable to be considered a standard candle. for this reason, multifrequency observations, possibly simultaneous, are mandatory for a proper comprehension of the behaviour of a target cosmic source.
a clear example of this assertion is the case of the crab nebula, which was considered very stable, like a standard candle, until the strong flares detected by the agile satellite in october 2007 and september 2010 in the 100 mev–10 gev energy range, with an intensity three times greater than in quiescence (tavani et al., 2011), and by the fermi gamma-ray space telescope in february 2009 and september 2010 at energies > 100 mev, with intensities four and six times greater than in quiescence, respectively (abdo et al., 2011). the brevity of the flares implies that the γ-rays were emitted via synchrotron radiation from 10^15 ev electrons in a region smaller than 1.4 × 10^−2 pc (the standard light-travel-time argument: a source cannot vary coherently on timescales shorter than its light-crossing time, so r ≲ cδt, and flares lasting days confine the emitter to light-days, i.e. of order 10^−2 pc). these are the highest energy particles that can be associated with a discrete astronomical source, and they pose challenges to particle acceleration theory. in this paper i will discuss, on the basis of my knowledge and feelings, the most relevant results obtained in the recent past that have significantly improved our knowledge of the physics governing our universe. deeper discussions about astroparticle physics can be found in the review papers by giovannelli (2007, 2009, 2011). in their reviews, giovannelli & sabau-graziati (2010, 2012a) and de angelis, mansutti & persic (2008) have discussed papers about the multifrequency behaviour of high energy cosmic sources and very high energy (vhe) γ-ray astrophysics.
2. astroparticle physics development
high energy astrophysics is generally approached through the study of cosmic rays. the reason for this is historical in nature. since the discovery of this extraterrestrial radiation by victor hess (1912), an enormous amount of scientific research has attempted to discover its nature, and as a result many separate research fields have developed. before particle accelerators came into operation, high energy cosmic rays were the laboratory tools for investigations of elementary particle production, and to date they remain the only source of particles with energies greater than 10^12 ev. research on the composition of this radiation led to the study of the astrophysical environment using information in the charge, mass, and energy spectra; this field is known as particle astrophysics. now the large hadron collider (lhc), described by straessner et al. (2011), is able to reach tev energies, with p–p interactions at 7 tev used in the search for the higgs boson with the atlas detector (aad et al., 2012). no significant excess of events over the expected background is observed, and limits on the higgs boson production cross section are derived for higgs boson masses in the range 110 gev < mh < 300 gev. the observations exclude the presence of a standard model higgs boson with mass 145 gev < mh < 206 gev at a 95 % confidence level. of great importance was the discovery of high energy photons near the top of the earth's atmosphere. this discovery sparked the development of new astronomical fields such as x-ray and γ-ray astronomy. however, many of these high energy photons have their origin in interactions of high energy charged particles with cosmic matter, light, or magnetic fields. in this fact, the fields of particle astrophysics and astronomical research have found a bond, joining their efforts in trying to understand the high energy processes that occur in astrophysical systems.
2.1. cosmic rays
the modern picture of cosmic rays is of a steady rain of particles moving at speeds close to the speed of light.
the particles are primarily nuclei with atomic weights less than 56, as well as a few nuclei of heavier elements, some electrons and positrons, and a few γ-rays and neutrinos. the energy spectrum extends over 12 orders of magnitude (∼10^8–10^20 ev), and the particle flux decreases rapidly with increasing energy. figure 1 (after nagano & watson, 2000 and the references therein) shows the energy spectrum, starting from 10^11 ev. the flux is multiplied by e^3 in order to keep the plot more compact and to emphasize the variations in the spectrum. a simple inspection of fig. 1 shows a break in the spectrum around 10^15–10^16 ev. this break is called the knee. the knee was found at the end of the 1950s, initially as the steepening of the eas size spectrum (kulikov & khristiansen, 1958). more than half a century has passed since then, but the origin of the knee is still a challenge for cosmic ray physics. the steepening appears when the intensity of the showers falls to a value of ∼4–5 × 10^−11 cm^−2 s^−1 sr^−1. this intensity delivers one shower above the knee per area of ∼25 m² in one hour. owing to this low intensity the knee has not yet been studied by direct measurements in space, but it has been studied by indirect methods which use atmospheric cascades. in summary, the cosmic ray spectrum for protons at gev energies is close to e^−2.75, and for he and higher elements it is close to e^−2.65, below the knee at ∼5 × 10^15 ev; there the spectrum turns down to ∼e^−3.1, and flattens out again near 10^18.5 ev, at the feature called the ankle (e.g. lawrence, reid & watson, 1991; nagano et al., 1992; zatsepin, 1995).
figure 1. the all-particle spectrum of cosmic rays obtained by several experiments cited in the figure. energies less than or greater than 4 × 10^14 ev divide the two ranges in which direct and indirect measurements of the cr spectrum are possible. the knee, the ankle, the greisen–zatsepin–kuzmin (gzk) cutoff and a 2nd knee are clearly shown (after nagano & watson, 2000).
the origin of the knee can be clarified by a detailed study of the primary spectrum, mass composition and arrival directions of cosmic rays. reviews by erlykin (1995), müller (1993), and hillas (1999) discussed these topics in detail. in particular, erlykin (1995) provides a very clear discussion of the mass composition and the origin of cosmic rays. indeed, their origin, independent of all the discussions and models, is focused around four main issues, namely: (1.) the sources of cosmic rays; (2.) the acceleration mechanisms; (3.) the propagation through the ism; (4.) the interaction characteristics. in most models based on one of these four items, the mass composition varies with energy in various ways. erlykin (1995) then discussed the mass composition below 10^5 gev, near the knee (10^5–10^8 gev), and above 10^8 gev. he concluded that most of the results indicate that the genesis of cosmic rays is much more complicated than had been thought in the past. the most likely tendency is for cosmic rays to grow heavier with rising energy up to the knee region, and then to become lighter beyond the knee. several reviews have been published about the chemical composition around the knee (alessandro, 2001), about ultra high energy (uhe) cosmic rays (blasi, 2001), and about the propagation and clustering of cosmic rays (stanev, 2001).
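the spectral shape summarized above can be written down compactly; the following schematic parameterization uses the approximate indices and break energies quoted in the text (an illustration, not a fit to data):

def cr_flux(e_ev, norm=1.0):
    # schematic all-particle spectrum: a power law ~e^-2.7 below the knee,
    # ~e^-3.1 between the knee and the ankle, flattening again above it;
    # the branches are matched so the curve is continuous at each break
    knee, ankle = 5.0e15, 10**18.5
    if e_ev < knee:
        return norm * e_ev**-2.7
    if e_ev < ankle:
        return norm * knee**0.4 * e_ev**-3.1
    return norm * knee**0.4 * ankle**-0.4 * e_ev**-2.7

plotting e³ × cr_flux(e), as in fig. 1, turns the knee and the ankle into visible changes of slope around an almost flat curve.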
a possible interpretation of the knee is that it represents the energy at which cosmic rays can escape more freely from the galaxy, or it may indicate a transition between two different acceleration mechanisms. in the first case one might expect an anisotropy in the distribution of the arrival directions above this energy if the cosmic rays originated within the galaxy. shibata (2008) discusses the status of the composition measurement of cosmic rays by direct and indirect observations, which provides key information on the origin of cosmic rays. the results of the tibet hybrid experiment, measuring the proton and helium spectra at the knee, show decreasing fractions of both the proton and the helium components in the all-particle spectrum with increasing energy. this suggests that the knee is dominated by heavy nuclei. a 2nd knee is present at about 10^18 ev. its origin is not yet completely clear: it could be due to the dispersion of sns, to the reacceleration of particles, or to an early transition to extragalactic cosmic rays (e.g. nagano & watson, 2000). a summary of the status of the search for the origin of the highest energy cosmic rays has been published by biermann (1999). he mentioned several competing proposals, such as supersymmetric particles, gamma ray bursts also giving rise to energetic protons, interacting high energy neutrinos and cosmological defects, and then discussed the propagation of these particles, assuming that they are charged. the distribution of the arrival directions of the highest energy particles on the sky ought to reflect the source distribution and also the propagation history. he remarked that the present status can be summarized as inconclusive. however, he concluded as follows: if we can identify the origin of the events at the highest energies, beyond 5 × 10^19 ev, the greisen–zatsepin–kuzmin cutoff due to the microwave background, near 10^21 ev, and if we can establish the nature of their propagation through the universe to us, then we will obtain a tool to do physics at eev energies. the arrival directions of ≥ 60 eev ultra-high-energy cosmic rays (uhecrs) cluster along the supergalactic plane and correlate with active galactic nuclei (agn) within ∼100 mpc (abraham et al., 2007, 2008). the association of several events with the nearby radio galaxy cen a supports the paradigm that uhecrs are powered by supermassive black-hole engines and are accelerated to ultra-high energies in the shocks formed by variable plasma winds in the inner jets of radio galaxies. the gzk horizon length of 75 eev uhecr protons is ∼100 mpc, so that the auger results are consistent with the assumed proton composition of uhecrs. in this scenario, the sources of uhecrs are fr ii radio galaxies and fr i galaxies like cen a with scattered radiation fields that enhance uhecr neutral beam production. radio galaxies with jets pointed away from us can still be observed as uhecr sources owing to the deflection of uhecrs by magnetic fields in the radio lobes of these galaxies. a broadband ∼1 mev–10 eev radiation component in the spectra of blazar agn is formed by uhecr-induced cascade radiation in the extragalactic background light. this emission is too faint to be seen from cen a, but could be detected from more luminous blazars (dermer et al., 2009).
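the magnetic-deflection argument invoked here, and quantified by piran (2010) in the next paragraph, reduces to the larmor radius r_l = e/(zebc); in convenient units, r_l ≈ 1.1 mpc (e/eev)/(z b/ng). a minimal sketch:

E_CHARGE = 1.602e-19   # c
C = 3.0e8              # m/s
MPC_M = 3.086e22       # metres per megaparsec

def larmor_radius_mpc(e_ev, z, b_ngauss):
    # gyroradius of an ultrarelativistic nucleus of charge z
    b_tesla = b_ngauss * 1e-13          # 1 ng = 1e-13 t
    r_m = (e_ev * E_CHARGE) / (z * E_CHARGE * b_tesla * C)
    return r_m / MPC_M

print(f"{larmor_radius_mpc(6e19, 1, 1):.0f} mpc")    # 60 eev proton: ~65 mpc
print(f"{larmor_radius_mpc(6e19, 26, 1):.1f} mpc")   # 60 eev iron:   ~2.5 mpc

iron at the same energy is bent ∼26 times more strongly than a proton, which is why a heavy composition, combined with a non-negligible egmf, erases the directional anisotropy expected from a single source.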
recent evidence from the pierre auger observatory suggests a transition at 5–10 eev in the composition of uhecrs, from protons to heavier nuclei such as iron (abraham et al., 2010). piran (2010) considered the implications of a heavier composition for the sources of uhecrs. he concluded that, with typical reasonable parameters of a few ng for the extragalactic magnetic field (egmf) intensity and a coherence distance of a mpc, the distance that uhecr nuclei above the gzk energy traverse before photodisintegrating is only a few mpc. in spite of the significantly weaker limits on the luminosity, cen a is the only currently active potential source of uhecr nuclei within this distance. the large deflections erase the directional anisotropy expected from a single source. if indeed the composition of the above-gzk uhecrs is iron, and if the egmf is not too small, then cen a is the dominant source of the uhecr nuclei observed above the gzk limit. in summary, charged cosmic rays are influenced in their propagation through space by the magnetic fields in the galaxy and, for the lowest energy particles, also by those in the solar system. the result is that the distribution of arrival directions as the radiation enters the earth's atmosphere is nearly isotropic. it is not possible to identify the sources of the cosmic rays by detecting the charged particles themselves. however, in the high energy interactions produced at the source, electrically neutral particles such as photons, neutrons, and neutrinos are also produced, and their trajectories are not deviated, being directed from their point of origin to the observer. owing to their short lifetime, neutrons cannot survive the path length to the earth (decay length ∼9 pc at 1 pev), and neutrinos do not interact efficiently in the atmosphere. it is in this context that gamma ray astronomy has demonstrated itself to be a powerful tool. the observations made to date have detected γ-rays from many astronomical objects, e.g. neutron stars, interstellar clouds, the center of our galaxy and the nuclei of active galaxies (agns). one might expect very important implications for high energy astrophysics from observations of extragalactic sources at energies greater than 10^11 ev (e.g. hillas & johnson, 1990). the fluxes of γ-rays at these energies are attenuated by their interactions with the cosmic radio, microwave, infrared and optical radiation fields. measurements of the flux attenuation can then provide important information on the distribution of these fields. for example, the threshold energy for pair production in reactions of photons with the 2.7 k background radiation is reached at 10^14 ev, and the absorption length is of the order of ∼7 kpc. for the infrared background the maximum absorption is reached at energies greater than 10^12 ev. the qualitative problem of the origin of cosmic rays has practically been solved, whilst the quantitative problem of determining the fraction of these rays coming from the different possible sources remains open.
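the pair-production threshold quoted above follows directly from kinematics: for a head-on collision, e_th ≈ (m_e c²)²/ε, where ε is the target photon energy. a quick check with a typical cmb photon energy of ∼6 × 10^−4 ev (the precise threshold depends on the collision geometry and the target spectrum):

M_E_C2_EV = 0.511e6   # electron rest energy (ev)

def gg_threshold_ev(eps_ev):
    # head-on gamma-gamma -> e+ e- threshold against a photon of energy eps_ev
    return M_E_C2_EV**2 / eps_ev

print(f"{gg_threshold_ev(6e-4):.1e} ev")   # ~4e14 ev on the cmb

against the more energetic infrared background photons (∼0.1 ev) the threshold drops to ∼10^12 ev, matching the statement above that ir absorption sets in above 10^12 ev.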
2.2. tev sources
the most exciting results of the last decade have been obtained in the field of vhe astrophysics, from various experiments (e.g. cgro/egret, whipple, hegra, cangaroo, celeste, stacee, tibet, hess, veritas, milagro, magic) that have detected many vhe cosmic sources. the high energy sky — with the exception of the crab nebula, vela x, and 3c 273 — was empty until the mid 1990s. updated to 19 april 2012, the vhe sky (e > 100 gev) is populated by 107 cosmic sources, 46 of them extragalactic and 61 galactic (http://www.mppmu.mpg.de/~rwagner/sources/ or http://tevcat.uchicago.edu). one of the most interesting results has been the determination of the spectral energy distribution (sed) of the crab nebula, thanks to many measurements obtained through various he–vhe experiments (albert et al., 2008b). another exciting result has been the detection of the first variable galactic tev source, namely the binary pulsar psr b1259−63 (aharonian et al., 2005). the authors found that radio silence occurs during the
djorgovski, 2004, 2005). quasars, the brightest and most distant objects known, offer a window on the reionization era, because neutral hydrogen gas absorbs their ultraviolet light. reionization drastically changes the environment for galaxy formation and evolution, and in a hierarchical clustering scenario the galaxies responsible for reionization may be the seeds for the most massive galaxies in the local universe. reionization is the last global phase transition in the universe. the reionization era is thus a cosmological milestone, marking the appearance of the first stars, galaxies and quasars. recent results obtained by ouchi et al. (2010) are an important contribution toward solving this problem. indeed, from the the lyα luminosity function (lf), clustering measurements and the lyα line profiles based on the largest sample to date of 207 lyα emitters at z = 6.6 on the 1 deg2 sky of subaru/xmmnewton deep survey field, ouchi et al. (2010) found that the combination of various reionization models and observational results about the lf, clustering, and line profile indicates that there would exist a small decrease of the intergalactic medium’s (igm’s) lyα transmission owing to reionization, but that the hydrogen igm is not highly neutral at z = 6.6. their neutral-hydrogen fraction constraint implies that the major reionization process took place at z >∼ 7. the w. m. keck 10 m telescope has shown quasar sdss j1148+5251 at a redshift of 6.41 (∼ 12.6 × 109 yr ago). this is currently the most distant known quasar (djorkovski, 2004). this measurement does not contradict the result found for the epoch of reionization. however, the search of the epoch of reionization remains one of the most important open problems for understanding the formation of the first stars, galaxies and quasars. 2.5. clusters of galaxies the problems of the production and transport of heavy elements seem to have been resolved. indeed, thermally driven galactic winds, e.g. from m82, have shown that only active galaxies with an ongoing starburst can enrich the intracluster medium (icm) with metals. the amounts of metals in the the icm are at least as high as the sum of the metals in all galaxies of the cluster (e.g. tozzi et al., 2003). several clusters of galaxies (cgs) with strong radio emission have been associated with egret sources. this is an important step in clarifying the nature of many unknown egret sources (colafrancesco, 2002). however, in the first 11 months of the fermi lat cg monitoring program, no γ-ray emission from any of the monitored cgs was detected (ackermann et al., 2010b). in spite of many important results coming from satellites of the last decade, the hierarchical distribution of dark matter, and also the role of the intergalactic magnetic fields in cgs, are still open. multifrequency simultaneous measurements with higher sensitivity instruments, in particular those in the hard 15 franco giovannelli acta polytechnica x-ray and radio energy regions and optical–near infrared (nir) could solve problems such as these. 2.6. dark energy and dark matter by using various methods to determine the mass of galaxies, a discrepancy has been found that suggests that ∼ 95 % of the universe is in a form that cannot be seen. this form of unknown content of the universe is the sum of dark energy (de) and dark matter (dm). colafrancesco (2003) provides a deep discussion about the new cosmology. 
the discovery of the nature of dark energy may provide an invaluable clue for understanding the nature and the dynamics of our universe. however, there is ∼ 30 % of the matter content of the universe which is dark and still requires a detailed explanation. baryonic dm consisting of machos (massive astrophysical compact halo objects) can yield only some fraction of the total amount of dark matter required by cmb observations. wimps (weakly interacting massive particles) (non-baryonic dm) can yield the needed cosmological amount of dm and its large scale distribution provided that it is “cold” enough. several options have been proposed so far, e.g.: (1.) light neutrinos with mass in the range mν ∼ 10– 30 ev; (2.) light exotic particles like axions with mass in the range maxion ∼ 10−5–10−2 ev; (3.) or weakly interacting massive particles like neutralinos with mass in the range mχ ∼ 10–1000 gev, this last option being favored at the present time (see e.g. ellis 2002). eros and macho, two experiments based on the gravitational microlensing, have been developed. two lines of sight have been probed intensively: the large magellanic cloud (lmc) and the small magellanic cloud (smc), located 52 kpc and 63 kpc, respectively, from the sun (palanque-delabrouille, 2003). with 6 years of data towards lmc, the macho experiment published a most probable halo fraction between 8 % and 50 % in the form of 0.2m� objects (alcock et al., 2000). most of this range is excluded by the eros exclusion limit, and in particular the macho preferred value of 20 % of the halo. among experiments for searching wimp dark matter candidates, pamela is devoted to a search for dark matter annihilation, antihelium (primordial antimatter), new matter in the universe (strangelets?), a study of cosmic-ray propagation (light nuclei and isotopes), the electron spectrum (local sources?), solar physics and solar modulation, and the terrestrial magnetosphere. a comparison of pamela expectations with many other experiments has been discussed by morselli (2007). bruno (2011) discusses some results from pamela. the search for dm is one of the main open problems in present–day astroparticle physics. 2.7. the galactic center the galactic center (gc) is one of the most interesting places for testing theories in which frontier physics plays a fundamental role. there is an excellent review by mezger, duschl & zylka (1996), which discusses the physical state of stars and interstellar matter in the galactic bulge (r ∼ 0.3–3 kpc from the dynamic center of the galaxy), in the nuclear bulge (r < 0.3 kpc) and in the sgr a radio and gmc complex (the central ∼ 50 pc of the milky way). this review also reports a list of review papers and conference proceedings related to the galactic center, with bibliographic details. in the review paper by giovannelli & sabau-graziati (2004, and the references therein), multifrequency gc behaviour is also discussed. larosa et al. (2000) presented a wide-field, high dynamic range, high-resolution, long-wavelength (λ = 90 cm) vla image of the galactic center region. this is the most accurate image of the gc. it is highly obscured in optical and soft x-rays; it shows a central compact object – a black hole candidate – with m ∼ 3.6 × 106m� (genzel et al., 2003a), which coincides with the compact radio source sgr a∗ [r.a. 17 45 41.3 (hh mm ss); dec.: -29 00 22 (dd mm ss)]. sgr a∗ in x-rays/infrared is highly variable (genzel et al., 2003b). gc is also a good candidate for indirect dark matter observations. 
moreover, the detected excess of he γ-rays at gc would be produced by neutralino annihilation in the dark matter halo. this excess could be better measured by the fermi observatory. 2.8. gamma-ray bursts a theoretical description of grbs is still an open and strongly controversial issue. the most popular descriptions are the fireball (fb) model (meszaros & rees, 1992; piran, 1999), the cannon ball (cb) model (dar & de rújula, 2004), the spinnin-precessing jet (spj) model (fargion, 2003a,b; fargion & grossi, 2006), and the fireshell (izzo et al., 2010) model, which comes directly from the electromagnetic black hole (embh) model (e.g. ruffini et al., 2003 and the references therein). however, each model competes against the others. important implications on the origin of the highest redshift grbs come from the detection of grb 080913 at z = 6.7 (greiner et al., 2009) and grb 090423 at z ' 8.2 (tanvir et al., 2009). this means that we are really approaching the possibility of detecting grbs at the end of the dark era, where the first pop iii stars appeared. izzo et al. (2010) successfully discuss a theoretical interpretation of the grb 090423 within their fireshell model. wang & dai (2009) studied the high-redshift star formation rate (sfr) up to z ' 8.3 considering the swift grbs tracing the star formation history and the cosmic metallicity evolution in various background cosmological models including λcdm, quintessence, 16 vol. 53 no. 1/2013 astroparticle physics today and quintessence with a time-varying equation of state and brane-world models. λcdm is the preferred model, and it is compared with other results. although major progress has been made in recent years, grbs theory needs further investigation in the light of experimental data coming from old and new satellites, often coordinated, e.g. bepposax or batse/rxte or asm/rxte or ipn or hete or integral or swift or agile or fermi or maxi. 2.9. extragalactic background light space is filled with diffuse extragalactic background light (ebl), which is the sum of the starlight emitted by galaxies throughout the history of the universe. high energy γ-rays traversing cosmological distances are expected to be absorbed through their interactions with ebl by: γvhe +γebl −→ e+e−. then the γ-ray flux φ is suppressed while travelling from the emission point to the detection point, as φ = φ0e−τ(e,z), where τ(e,z) is the opacity. the e–fold reduction [τ(e,z) = 1] is the gamma ray horizon (grh) (e.g. blanch & martinez, 2005; martinez, 2007). direct measurement of ebl is difficult at optical to infrared wavelengths because of the strong foreground radiation originating in the solar system. however, the measurement of ebl is important for vhe gamma-ray astronomy, and also for astronomers modelling star formation and galaxy evolution. second in intensity only to the cosmic microwave background (cmb), optical and infrared (ir) ebl contains the imprint of galaxy evolution since the big bang. this includes the light produced during the formation and reprocessing of stars. current measurements of ebl are reported in the paper by schroedter (2005, and references therein). schroedter used the available vhe spectra from six blazars. later, the redshift region over which the gamma reaction history (grh) can be constrained by observations was extended up to z = 0.536. the upper ebl limit has been obtained on the basis of 3c 279 data (albert et al., 2008a). the universe is more transparent to vhe gamma rays than expected. 
thus many more agns could be seen at these energies. indeed, abdo et al. (2009a) observed a number of tev-selected agns during the first 5.5 months of observations with the large area telescope (lat) on board the fermi gamma-ray space telescope. redshift-dependent evolution is detected in the spectra of objects detected at gev and tev energies. the most reasonable explanation for this is absorption on ebl. as such, this would be the first modelindependent evidence for absorption of γ-rays on ebl. by using a sample of γ-ray blazars with redshift up to z ' 3, and grbs with redshift up to z ' 4.3, measured by fermi/lat, abdo et al. (2010b) placed upper limits on the γ-ray opacity of the universe at various energies and redshifts and compared this with predictions from well-known ebl models. they found that ebl intensity in the optical-ultraviolet wavelengths as great as predicted by the "baseline" model of stecker, malkan & scully (2006) can be ruled out with high confidence. 2.10. relativistic jets relativistic jets have been found in numerous galactic and extragalactic cosmic sources in various energy bands. the emitted spectra of jets from cosmic sources of different nature are strongly dependent on the angle formed by the beam axis and the line of sight, and obviously by the lorentz factor of the particles (e.g. bednarek et al., 1990 and the references therein; beall, guillory & rose, 1999, 2009; beall, 2002, 2003, 2008, 2009; beall et al., 2006, 2007). observations of jet sources at various frequencies can therefore provide new inputs for the comprehension of these extremely efficient carriers of energy, as for cosmological grbs. the discovered analogy among µ–qsos, qsos, and grbs is fundamental for studying the common physics governing these different classes of objects via µ–qsos, which are galactic, and then apparently brighter and with all processes occurring in time scales accessible by our experiments (e.g. chaty, 1998). chaty (2007) noted the importance of multifrequency observations of jet sources by means of measurements of grs 1915+105. dermer et al. (2009) suggest that uhecrs could come from black hole jets of radio galaxies. spectral signatures associated with uhecr hadron acceleration in studies of radio galaxies and blazars with the fermi observatory and ground–based γ-ray observatories can provide evidence for cosmic-ray particle acceleration in black hole plasma jets. also in this case, γ-ray multifrequency observations (mev–gev– tev) together with observations of pev neutrinos could confirm whether black-hole jets in radio galaxies accelerate uhecrs. despite their frequent outburst activity, microquasars have never been unambiguously detected emitting high-energy gamma rays. fermi/lat has detected a variable high-energy source coinciding with the position of the x-ray binary and microquasar cygnus x-3. its identification with cygnus x-3 is secured by the detection of its orbital period in gamma rays, as well as the correlation of the lat flux with radio emission from the relativistic jets of cygnus x-3. the γ-ray emission probably originates from within the binary system (abdo et al., 2009b). the microquasar ls 5039 has also been unambiguously detected by fermi/lat its emission being modulated with a period of 3.9 days. analyzing the spectrum, variable with the orbital phase, and having a cutoff, abdo et al. (2009c) conclude that the γ-ray emission of ls 5039 is magnetospheric in origin, like that of the pulsars detected by fermi. 
this experimental evidence of emission in the gev region of microquasars opens an interesting window on the formation of relativistic jets. 17 franco giovannelli acta polytechnica 2.11. cataclysmic variables the detection of cvs by the integral observatory (barlow et al., 2006) has recently renewed the interest of high energy astrophysicists in these systems, and has lead to renewed involvement of the low energy astrophysical community. the detection of cvs having orbital periods inside the so-called period gap between 2 and 3 hours, which separates polars, experiencing gravitational radiation, from intermediate polars, experiencing magnetic braking, renders attractive the idea of physical continuity between the two classes. further investigations are necessary in order to solve this important problem. for a recent review on cvs see the paper by giovannelli & sabau-graziati (2012b). 2.12. high mass x-ray binaries for general reviews, see e.g. giovannelli & sabaugraziati (2001, 2004) and van den heuvel (2009) and references therein. hmxbs are young systems, with age ≤ 107 yr, mainly located in the galactic plane (e.g. van paradijs, 1998). a compact object, the secondary star, mostly a magnetized neutron star (x-ray pulsar) is orbiting around an early type star (o, b, be), the primary, with m ≥ 10m�. the optical luminosity of the system is dominated by the early type star. such systems are the best laboratory for the study of accreting processes thanks to their relatively high luminosity in a large part of the electromagnetic spectrum. because of the strong interactions between optical companion and collapsed object, the low and high energy processes are strictly related. in x-ray/be binaries the mass loss processes are due to the rapid rotation of the be star, the stellar wind and, sporadically, due to the expulsion of a casual quantity of matter essentially triggered by gravitational effects close to the periastron passage of the neutron star. the long orbital period (> 10 days) and the large eccentricity of the orbit (> 0.2) together with transient hard x-ray behavior are the main characteristics of these systems. among the whole sample of galactic systems containing 114 x-ray pulsars (johnstone, 2005), only a few have been extensively studied. among these, system a 0535+26/hde 245770 is the best known, thanks to concomitant favorable causes which have rendered possible thirty seven years of coordinated multifrequency observations, most of them discussed e.g. by giovannelli & sabau-graziati (1992, 2008), burger et al. (1996). accretion powered x-ray pulsars usually capture material from the optical companion via the stellar wind, since this primary star generally does not fill its roche lobe. however, in some specific conditions (e.g. passage at the periastron of the neutron star) and in particular systems (e.g. a 0535+26/hde 245770), a temporary accretion disk can form around the neutron star behind the shock front of the stellar wind. this enhances the efficiency of the process of mass transfer from the primary star onto the secondary collapsed star, as discussed by giovannelli & ziolkowski (1990) and by giovannelli et al. (2007) in the case of a 0535+26. 
giovannelli & sabau-graziati (2011) discussed the history of the discovery of optical indicators of high energy emission in the prototype system a0535+26/hde 245770 ≡ flavia’ star, updated to the march–april 2010 event when a strong optical activity occurred roughly 8 days before the x-ray outburst (caballero et al., 2010) that was predicted by giovannelli, gualandi & sabau-graziati (2010). this optical indicator of an x-ray outburst, together with the whole history of the a0535+26 system, led to the conclusion that the periastron passage of the neutron star is scanned every 110.856 ± 0.002 days (optical orbital period) (bartolini et al., 1983), and the x-ray outbursts are triggered starting from that moment and occur roughly after 8 days – the transit time of the material expelled from the primary for reaching the secondary. however, it is still an important open problem how x-ray outbursts are triggered in x-ray pulsars. this issue has given rise to controversy among astrophysicists. important news has also been coming also from gev observations of hmxbs. indeed, abdo et al. (2009e) have presented the first results from the observations of lsi + 61°303 using fermi/lat data obtained between 2008 august and 2009 march. their results indicate variability that is consistent with the binary period, with the emission being modulated at 26.6 days. this constitutes the first detection of orbital periodicity in high–energy γ-rays (20 mev–100 gev). the light curve is characterized by a broad peak after the periastron, as well as a smaller peak just before the apastron. the spectrum is best represented by a power law with an exponential cutoff, yielding an overall flux above 100 mev of ∼ 0.82 × 10−6 ph cm−2 s−1, with a cutoff at ∼ 6.3 gev and photon index γ ' 2.21. there is no significant spectral change with orbital phase. the phase of maximum emission, close to the periastron, hints at inverse compton scattering as the main radiation mechanism. however, previous very high-energy gamma ray (> 100 gev) observations by magic and veritas show peak emission close to the apastron. this and the energy cutoff seen with fermi suggest that the link between he and vhe gamma rays is nontrivial. this is an open problem to be solved in future. 2.13. obscured sources and supergiant fast x-ray transients there are relevant integral results about a new population of obscured sources and supergiant fast x-ray transients (sfxts) (chaty & filliatre, 2005; chaty, 2007; rahoui et al., 2008; chaty, 2008). the importance of the discovery of this new population is based on the constraints on the formation and 18 vol. 53 no. 1/2013 astroparticle physics today evolution of hmxbs: does the dominant population of short-living systems — born with two very massive components — occur in a rich star-forming region? what will happen when the supergiant star dies? are primary progenitors of ns/ns or ns/bh mergers good candidates for gravitational waves emitters? can we find a link with short/hard γ-ray bursts? 2.14. ultra compact double degenerated binaries ultra-compact double-degenerated binaries (ucd) consist of two compact stars, which can be black holes, neutron stars or white dwarfs. in the case of two white dwarfs revolving around each other with an orbital period porb < 20 min, the separation of the two components for a ucd with porb ' 10 min or shorter is smaller than the diameter of jupiter. these ucds are evolutionary remnants of low–mass binaries, and they are numerous in the milky way. 
the discovery of ucd hints interestingly at possible gravitational wave detection with the lisa observatory. 2.15. magnetars the discovery of magnetars (anomalous x-ray pulsars – axps – and soft gamma-ray repeaters – sgrs) is another very exciting result from recent years (mereghetti & stella, 1995; van paradijs, taam & van den heuvel, 1995; and e.g. a review by giovannelli & sabau-graziati, 2004 and the references therein). indeed, with a magnetic field intensity of order 1014–1015 g the question naturally arises: what kind of sn produces these axps and sgrs? are the collapsed objects in axps and sgrs really neutron stars? (e.g. hurley, 2008). with such high magnetic field intensity, an almost ‘obvious’ consequence can be derived: the corresponding dimension of the source must be ∼ 10 m (giovannelli & sabau-graziati, 2006). this could be the dimension of the acceleration zone in supercompact stars. could they be quark stars? ghosh (2009) discussed some of the recent developments in the quark star physics along with the consequences of possible hadron–to–quark phase transition in a high density scenario of neutron stars, and their implications for astroparticle physics. important consequences could be derived by the experimentally demonstrated continuity among rotationpowered pulsars, magnetars, and millisecond pulsars, (kuiper, 2007). however, the physical reason for this continuity remains unclear. 3. cross sections of nuclear reactions in stars knowledge of the cross-sections of nuclear reactions occurring in the stars appears to be one of the most crucial points of all astroparticle physics. direct measurements of the cross sections of the 3he(4he,γ)7be and 7be(p,γ)8be reactions of the p–p chain and 14n(p,γ)15o reaction of the cno-cycle will allow a substantial improvement in our knowledge about stellar evolution. the luna collaboration has already measured with good accuracy the key reactions d(p, γ)3he, 3he(d, p)4he and 3he(4he,γ)7be. these measurements substantially reduces the theoretical uncertainty of d, 3he, 7li abundances. the d(4he,γ)6li cross section, which is the key reaction for determining the primordial abundance of 6li, will be measured in the near future (gustavino, 2007, 2009 and 2011). 4. neutrino astronomy for a short discussion about neutrino astronomy, see e.g. the paper by giovannelli (2007 and the references therein), as well as all the papers of the neutrino astronomy session, which were published in the proceedings of the vulcano workshops 2006, 2008, and 2010 (giovannelli & mannocchi, 2007, 2009, 2011). however, it should be noted that several papers have appeared about: (1.) the sources of he neutrinos (aharonian, 2007) and diffuse neutrinos in the galaxy (evoli, grasso & maccione, 2007); (2.) potential neutrino signals from galactic γ-ray sources (kappes et al., 2007); (3.) galactic cosmic-ray pevatrons with multi-tev γrays and neutrinos (gabici & aharonian, 2007); (4.) results achieved with amanda: 32 galactic and extragalactic sources have been detected (xu & icecube collaboration, 2008); diffuse neutrino flux from the inner galaxy (taylor et al., 2008); discussions about vhe neutrino astronomic experiments (cao, 2008). important news and references can be found in the proceedings of the les rencontres de physique de la vallée d’aoste (greco, 2009, 2010). news about neutrino oscillations has been reported by mezzetto (2011). angle θ13 differs from zero: sin2 θ13 = 0.013. 
this result opens the door to cp violation searches in the neutrino sector, with profound implications for our understanding of matter– antimatter asymmetry in the universe. 5. conclusions and reflections i will conclude this brief and incomplete review with some comments about the topics discussed here. (1.) many ground–based and space–based experiments have been exploring the whole energy range of the cr spectrum from ∼ 1 gev to ∼ 1012 gev, and many experiments are programmed for the near future. significant improvements have been obtained in the definition of the cr spectrum for protons, electrons, positrons, antiprotons and all ions. better results are expected in the near future. particular interest is devoted to knowledge about extreme high energy crs. 19 franco giovannelli acta polytechnica (2.) many experiments have been exploring cosmic sources along the whole electromagnetic spectrum, and new space–based and ground–based experiments are developing a tendency to explore processes at higher and higher energies, which are directly linking photonic astrophysics with particle astrophysics. (3.) particular attention is needed at the highest energies, where the cosmic ray spectrum extends to 1020 ev (see figure 1). however, the origins of these spectacularly high energy particles remains obscure. particle energies of this magnitude imply that there is a range of elementary particle physics phenomena present near their acceleration sites which is beyond the ability of present day particle accelerators to explore. vhe γ-ray astronomy may catch a glimpse of these phenomena. it is becoming increasingly clear that the energy régime covered vhe γ-ray astronomy will be able to address a number of significant scientific questions, which include: (1.) what parameters determine the cut-off energy for pulsed γ-rays from pulsars? (2.) what is the role of shell-type supernovae in the production of cosmic rays? (3.) at what energies do agn blazar spectra cut off? (4.) are gamma blazar spectral cut-offs intrinsic to the source or due to intergalactic absorption? (5.) is the dominant particle species in agn jets leptonic or hadronic? (6.) can intergalactic absorption of the vhe emission of agns be a tool for calibrating the epoch of galaxy formation, the hubble parameter, and the distance to γ-ray bursts? (7.) are there sources of γ-rays which are ‘loud’ at vhes, but ‘quiet’ at other wavelengths? the importance of multifrequency astrophysics and multienergy particle astrophysics seems evident. there are many problems in performing simultaneous multi–frequency, multi–energy multi–site, multi– instrument, multi–platform measurements, due to: (1.) objective technological difficulties; (2.) sharing common scientific objectives; (3.) problems of scheduling and budgets; (4.) political management of science. in spite of the many ground–based and space–based experiments that provide an impressive quantity of excellent data in various energy regions, many open problems still exist. i believe that only a drastic change in the philosophy of the experiments will lead to faster solution of most of the problems that remain open. for example, in the case of space–based experiments, small satellites — dedicated to specific missions and problems, and able to schedule very long–term observations — must be supported, because they can be prepared relatively rapidly, are easier to manage and are less expensive than medium–size and large satellites. 
i strongly believe that in the coming decades passive–physics experiments, space–based and ground– based and maybe also lunar–based, will be the most suitable probes for sounding the physics of the universe. active physics experiments have probably already reached the maximum dimensions compatible with a reasonable cost/benefit ratio, with the obvious exception of neutrino–astronomy experiments. acknowledgements i wish to thank the loc of the karlovy vary 9th integral/bart workshop for logistic support. this research has made use of nasa’s astrophysics data system. references [1] aad, g. et al. (atlas collaboration): 2012, phys. rev. letter 108, 1802. [2] abdo, a.a. et al.: 2009a, apj 707, 1310. [3] abdo, a.a. et al.: 2009b, sci. 326, 1512. [4] abdo, a.a. et al.: 2009c, apj 706, l56. [5] abdo, a.a. et al.: 2009e, apj 701l, 123. [6] abdo, a.a. et al.: 2010b, apj 723, 1082. [7] abdo, a.a.: 2011, science 331, 739. [8] abraham, j. et al. (the pierre auger collaboration): 2007, science 318, 938. [9] abraham, j. et al. (the pierre auger collaboration): 2008, astropart. phys. 29, 188. [10] j. abraham, j. et al.: 2010, phys. rev. letter 104, 091101. [11] ackermann, m. et al.: 2010b, apj 717, l71. [12] aharonian, f.a.: 2007, sci. 315, 70. [13] aharonian, f. et al.: 2005, a&a 442, 1. [14] albert, j. et al. (magic collaboration): 2008a, sci. 320, 1752. [15] albert, j. et al.: 2008b, apj 674, 1037. [16] alcock, c. et al.: 2000, apj 542, 281. [17] alessandro, b.: 2001, in frontier objects in astrophysics and particle physics, f. giovannelli & g. mannocchi (eds.), italian physical society, editrice compositori, bologna, italy 73, 363. [18] barlow, e.j. et al.: 2006, mnras 372, 224. [19] bartolini, c., bianco, g., guarnieri, a., piccioni, a., giovannelli, f.: 1983, hvar obs. bull. 7(1), 159. [20] beall, j.h.: 2002, in multifrequency behaviour of high energy cosmic sources, f. giovannelli & l. sabau-graziati (eds.), mem. sait 73, 379. [21] beall, j.h.: 2003, chja&as 3, 373. [22] beall, j.h.: 2008, chja&as 8, 311. [23] beall, j.h.: 2009, in frontier objects in astrophysics and particle physics, f. giovannelli & g. mannocchi, (eds.), italian physical society, ed. compositori, bologna, italy, 98, 283. 20 vol. 53 no. 1/2013 astroparticle physics today [24] beall, j.h., guillory, j., rose, d.v.: 1999, in multifrequency behaviour of high energy cosmic sources, f. giovannelli & l. sabau-graziati (eds.), mem. sait 70, 1235. [25] beall, j.h. et al.: 2006, chja&as1 6, 283. [26] beall, j.h. et al.: 2007, in frontier objects in astrophysics and particle physics, f. giovannelli & g. mannocchi, (eds.), italian physical society, ed. compositori, bologna, italy, 93, 315. [27] beall, j.h., guillory, j., rose, d.v.: 2009, in frontier objects in astrophysics and particle physics, f. giovannelli & g. mannocchi, (eds.), italian physical society, ed. compositori, bologna, italy, 98, 301. [28] bednarek, w., giovannelli, f., karakula, s., tkaczyk, w.: 1990, a&a 236, 268. [29] bennett, c.l. et al.: 2003, apj 583, 1. [30] biermann, p. l.: 1999, astrophys. and space sci. 264, 423. [31] blanch, o., martinez, m.: 2005, astrop. phys. 23, 588. [32] blasi, p.: 2001, astrop. phys. 15, 223. [33] bruno, a.: 2011, in frontier objects in astrophysics and particle physics, f. giovannelli & g. mannocchi (eds.), italian physical society, ed. compositori, bologna, italy, 103, 139. [34] burger, m. et al.: 1996, in multifrequency behaviour of high energy cosmic sources, f. giovannelli & l. sabau-graziati (eds.), mem. sait 67, 365. 
[35] burles, s., nollett, k.m., turner, m.s.: 2001, apjl 552, l1. [36] caballero, i. et al.: 2010, atel no. 2541. [37] cao, z.: 2008, nucl. phys. b (proc. suppl.) 175–176, 377. [38] chaty, s.: 1998, ph.d. thesis, university paris xi. [39] chaty, s.: 2007, in frontier objects in astrophysics and particle physics, f. giovannelli & g. mannocchi (eds.), italian physical society, ed. compositori, bologna, italy, 93, 329. [40] chaty, s.: 2008, chja&as 8, 197. [41] chaty, s., filliatre, p.: 2005, chja&as 5, 104. [42] colafrancesco, s.: 2002, a&a 396, 31. [43] colafrancesco, s.: 2003, in frontier objects in astrophysics and particle physics, f. giovannelli & g. mannocchi (eds.), italian physical society, ed. compositori, bologna, italy, 85, 141. [44] dar, a., de rújula, a.: 2004, phys. rep. 405, 203. [45] de angelis, a., mansutti, o., persic, m.: 2008, il n. cim. 31 n. 4, 187. [46] dermer, c.d., razzaque, s., finke, j.d., atoyan, a.: 2009, new j. of phys. 11, 1. [47] djorgovski, s.g.: 2004, nature 427, 790. [48] djorgovski, s.g.: 2005, in the tenth marcel grossmann meeting, m. novello, s. perez bergliaffa & r. ruffini (eds.), world scientific publishing co., p. 422. [49] ellis, j.: 2002, astro-ph 4059 (arxiv:hep-ex/0210052). [50] erlykin, a.d.: 1995, in frontier objects in astrophysics and particle physics, f. giovannelli & g. mannocchi (eds.), italian physical society, editrice compositori, bologna, italy 47, 483. [51] evoli, c., grasso, d., maccione, l.: 2007, astro-ph 0701856. [52] fargion, d.: 2003a, in frontier objects in astrophysics and particle physics, f. giovannelli & g. mannocchi (eds.), italian physical society, ed. compositori, bologna, italy 85, 267. [53] fargion, d.: 2003b, chja&as 3, 472. [54] fargion, d., grossi, m.: 2006, chja&as1 6, 342. [55] gabici, s., aharonian, f.a.: 2007, apjl 665, l131. [56] genzel, r. et al.: 2003a, apj 594, 812. [57] genzel, r. et al.: 2003b, nature 425, 934. [58] ghosh, s.k.: 2009, in frontier objects in astrophysics and particle physics, f. giovannelli & g. mannocchi (eds.), italian physical society, ed. compositori, bologna, italy, 98, 243. [59] giovannelli, f.: 2007, in frontier objects in astrophysics and particle physics, f. giovannelli & g. mannocchi (eds.), italian physical society, ed. compositori, bologna, italy, 93, 3. [60] giovannelli, f.: 2009, in frontier objects in astrophysics and particle physics, f. giovannelli & g. mannocchi (eds.), italian physical society, ed. compositori, bologna, italy, 98, 3. [61] giovannelli, f.: 2011, in frontier objects in astrophysics and particle physics, f. giovannelli & g. mannocchi (eds.), italian physical society, ed. compositori, bologna, italy, 103, 3. [62] giovannelli, f., ziółkowski, j.: 1990, aca 40, 95. [63] giovannelli, f., sabau-graziati, l.: 1992, space sci. rev. 59, 1. [64] giovannelli, f., sabau-graziati, l.: 2001, ap&ss 276, 67. [65] giovannelli, f., sabau-graziati, l.: 2004, space sci. rev. 112, 1. [66] giovannelli, f., sabau-graziati, l.: 2006, chja&as1 6, 1. [67] giovannelli, f., bernabei, s., rossi, c., sabau-graziati, l.: 2007, a&a 475, 651. [68] giovannelli, f., mannocchi, g. (eds.): 2007, 2009, 2011, proc. vulcano workshops on frontier objects in astrophysics and particle physics, italian physical society, ed. compositori, bologna, italy, vol. 93, 98, 103. [69] giovannelli, f., sabau-graziati, l.: 2008, chja&as 8, 1. [70] giovannelli, f., gualandi, r., sabau-graziati, l.: 2010, atel no. 2497.
acta polytechnica vol. 52 no. 5/2012

a task-driven grammar refactoring algorithm

ivan halupka, ján kollár, emília pietriková
dept. of computers and informatics, technical university of košice, letná 9, 042 00 košice, slovak republic
corresponding author: ivan.halupka@tuke.sk

abstract
this paper presents our proposal and implementation of an algorithm for automated refactoring of context-free grammars. rather than operating under some fixed domain-specific task, in our approach refactoring is performed on the basis of a refactoring task defined by the user. the algorithm and the corresponding refactoring system are called martinica. martinica is able to refactor grammars of arbitrary size and structural complexity; however, the computation time needed to perform a refactoring task with the desired outcome depends strongly on the size of the grammar.
until now, we have successfully performed refactoring tasks on small and medium-size grammars of pascal-like languages and on parts of the algol-60 programming language grammar. this paper also briefly introduces the reader to the processes occurring in grammar refactoring and to a method for describing the desired properties that a refactored grammar should fulfill, and it discusses the overall significance of grammar refactoring.

keywords: grammar refactoring, evolutionary algorithm, refactoring processes, task-driven transformation.

1 introduction

our work in the field of automated grammar refactoring derives from the fact that two or more equivalent context-free grammars may have different forms. although two equivalent grammars generate the same language, they do not necessarily share other specific properties that are measurable by grammar metrics [2]. the form in which a context-free grammar is written may have a strong impact on many aspects of its future application. for example, it may affect the general performance of the parser used to recognize the language generated by the grammar [12], or it may influence, and in many cases limit, our choice of parser generator for implementing the syntactic analyzer [12].

since there is a close relation between the form in which a grammar is expressed and the purpose for which the grammar is designed, different grammars generating the same language become domain-specific formalizations of the language. the ability to transform one grammar into another equivalent grammar therefore becomes the capability to shift between domains of possible application of grammars. although this ability makes each context-free grammar more universal in the scope of its application, its practical advantages may easily be overwhelmed by the difficulties that manual transformation can introduce. the problem is that grammar refactoring is in many cases a non-trivial task, and if done manually it is prone to errors, especially in the case of larger grammars. this is a serious issue because there is in general no way of proving that two context-free grammars generate the same language, since the equivalence problem for context-free grammars is undecidable.

in our work, we address this issue by proposing an evolutionary algorithm for automated task-driven grammar refactoring. the algorithm is called martinica. the main idea behind our algorithm is to apply a sequence of simple transformation processes to a chosen context-free grammar in order to produce an equivalent grammar with the desired properties. the current state of development of the algorithm requires that the grammar's production rules be expressed in bnf notation. these refactoring processes are discussed more closely in section 4, while the refactoring algorithm itself is discussed in section 6. the desired properties of a grammar produced by the algorithm are defined by an objective function, which we discuss in section 5. finally, in section 7, we present experimental results obtained when our algorithm was used for refactoring a context-free grammar.
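throughout the paper a context-free grammar is the usual quadruple g = (n, t, r, s), with bnf right sides treated as symbol sequences. to make the later sketches concrete, we fix one purely illustrative python representation of this quadruple; martinica itself is implemented in java (see section 7.1), so the following is our paraphrase, not project code:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    lhs: str    # left side: a single nonterminal
    rhs: tuple  # right side: a sequence of terminals and nonterminals

@dataclass(frozen=True)
class Grammar:
    n: frozenset  # nonterminals N
    t: frozenset  # terminals T
    r: frozenset  # production rules R (Rule instances)
    s: str        # start symbol S

# a tiny example: S -> a S b | epsilon (an empty right side is the empty tuple)
g = Grammar(n=frozenset({"S"}),
            t=frozenset({"a", "b"}),
            r=frozenset({Rule("S", ("a", "S", "b")), Rule("S", ())}),
            s="S")
```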
2 motivation

grammarware engineering is an up-and-coming discipline in software engineering which aims to solve many issues in grammar development, and it promises an overall rise in the quality of the grammars that are produced and in the productivity of their development [5]. grammar refactoring is a process that may occur in many fields of grammarware engineering, e.g. grammar recovery, evolution and customization [5]. in fact, it is one of five core processes occurring in grammar evolution, alongside grammar extension, restriction, error correction and recovery [1]. the problem is that, unlike program refactoring, which is a well-established practice, grammar refactoring is little understood and little practised [5].

if there is a clear purpose for which the grammar is being developed, its specification is usually not an issue for an experienced grammar engineer. problems arise when a grammar is being developed for multiple purposes [1], or when a grammar engineer lacks knowledge about the future purpose of the grammar. in the first case, the problem is usually solved by developing multiple grammars of one language [1]. this need to develop multiple grammars could be replaced by developing a single grammar generating the given language and automatically refactoring it to other forms suited to particular requirements, thus increasing the productivity of the grammar engineer. this is one of the main objectives of our work in the field of grammar refactoring. in cases when the grammar engineer lacks knowledge about some aspect of the future purpose of the grammar, its final shape may not satisfy some specific requirements, even if it generates the correct language. in this case, the grammar must either be refactored or be rewritten from scratch, thus draining valuable resources. an automated or even semi-automated way of refactoring the grammar could produce significant savings in this redundant consumption of resources.

these are not the only two scenarios in which an efficient refactoring tool is needed. an automated approach can be useful in all cases where we have a grammar whose form needs to be changed while preserving the language that it generates. here we see two main domains for applying our algorithm: adaptation of legacy grammars, and grammar inference.

parser generators and other implementation platforms for context-free grammars develop over time. newly established platforms and other tools operating with context-free grammars may require the grammar to be expressed in a form that differs from what the tools of the previous technological generation required, or they may operate with unequal efficiency over the same grammar forms. kent beck states that programs have two kinds of value: what they can do for us today, and what they can do for us tomorrow [4]. when we take this principle into account, we can say that the ability to refactor a context-free grammar in order to adjust it to the requirements of current platforms is in fact the ability to add value to a legacy formalization of the language.

grammar inference is defined as recovering a grammar from a set of positive and negative language samples [11]. grammar inference focuses on resolving issues of over-generality and over-specialization of the generated language [3], while the form of the grammar is only a secondary concern. grammar recovery tools in general do not give their users fine-grained enough tuning options for recovering a grammar in the desired form, which in many cases makes the recovered grammar difficult to comprehend and not useful until it has been refactored [6].

3 related work

we were able to find very little reported research in the field of automated grammar refactoring. the small amount of work that we did find is mostly concerned with refactoring context-free grammars in order to achieve some fixed domain-specific objective.
kraft, duffy and malloy developed a semi-automated grammar refactoring approach to replace iterative production rules with left-recursive rules [6]. they present a three-step procedure consisting of grammar metrics computation, metrics analysis in order to identify candidate non-terminals, and transformation of the candidate non-terminals. the first and third steps of this procedure are fully automated, while the process of identifying the non-terminals to be transformed by replacing iteration with left recursion is done manually. this approach is called metrics-guided refactoring, since the grammar metrics are calculated automatically, but the resulting values must be interpreted by a human being, who uses them as a basis for the decisions necessary for resuming the refactoring procedure. the work also provides an exemplary illustration of the benefits of grammar refactoring, since left-recursive grammars are more useful for some aspects of the application of a grammar [8], and are also more useful to human users [9], than iterative grammars. the procedure for left-recursion removal itself is a well-known practice in the field of compiler design; an algorithm for automated removal of direct and indirect left recursion can be found in louden [10]. this approach is further extended by lohmann, riedewald and stoy [9], who present a technique for removing left recursion in attribute grammars while preserving semantics during this procedure.

4 refactoring processes

in our approach, we use grammar refactoring processes as a tool for incremental grammar refactoring. formally, a grammar refactoring process is a function that takes some context-free grammar g = (n, t, r, s) and uses it as a basis for creating a new grammar g′ = (n′, t′, r′, s′) equivalent to g. this function may also require some additional arguments, known as process parameters. we refer to each assignment of actual values to the required process parameters of a specific grammar refactoring process as refactoring process instantiation, and an instance of a refactoring process is a specific grammar refactoring process with actual values assigned to its required process parameters.

at this stage of development, we have experimented with a base of eight grammar refactoring processes (unfold, fold, remove, pack, extend, reduce, split and nop), the first three of which have been adopted from ralf lämmel's paper on grammar adaptation [7], while the others are proposed by us. for the purpose of better understanding what refactoring processes really are and how they work, the following subsections briefly introduce two of them, namely pack and nop. in our research, we tend to keep the process base as small as possible, and we try to keep the refactoring processes as universal as possible. this is mainly because, as the refactoring process base grows, the state space of possible solution grammars also grows, and thus the size of the process base has a significant impact on the computational complexity of the algorithm. the lack of domain-specific refactoring processes is compensated by the overall openness of the process base, which means that it is a relatively trivial task to expand or reduce it. in fact, the only refactoring process required by the algorithm, which must reside in the process base at all times, is the nop process.
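the process abstraction can be made concrete with a small sketch of ours (the names are illustrative, not taken from the martinica code base): a process is a function from a grammar and its parameters to an equivalent grammar, and a process instance binds actual parameter values, nop being the trivial case.

```python
class ProcessInstance:
    """a refactoring process with actual parameter values assigned."""
    def __init__(self, process, params):
        self.process = process  # the refactoring process (a function)
        self.params = params    # dict of actual process parameters

    def apply(self, g):
        # applying the instance yields an equivalent grammar
        return self.process(g, **self.params)

def nop(g):
    """the identical transformation: no changes are imposed on g."""
    return g
```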
4.1 nop

nop, or the identical transformation, is a grammar refactoring process that transforms a context-free grammar g into the same context-free grammar g; in other words, it does not impose any changes on the grammar.

4.2 pack

pack is a grammar refactoring process that creates the transformed grammar g′ on the basis of three process parameters: a mandatory parameter called the packed production rule (pr), and optional parameters initial package symbol (ps) and package length (pl). these parameters must have the following properties:

pr ∈ r,
ps ∈ ℕ ∧ ps ≥ 0 ∧ ps < rightsidelength(pr),
pl ∈ ℕ ∧ pl > 0 ∧ pl ≤ rightsidelength(pr) − ps.

in cases when the initial package symbol is not defined, we assume that ps = 0. if the package length is not defined, we define it as pl = rightsidelength(pr) − ps. the function rightsidelength returns the number of symbols contained within the right side of a production rule. pack replaces a sequence of symbols contained within the right side of the packed production rule with a new nonterminal, and creates a rule whose left side is this new nonterminal and whose right side is the sequence of symbols mentioned above. this sequence of symbols is defined by the initial package symbol and the package length; more precisely, it is the sequence of pl symbols starting from the symbol whose position within the packed rule is ps + 1. a code sketch of pack is given below.
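a sketch of pack over the illustrative representation introduced above (the fresh-nonterminal naming is ours, and collision handling is omitted):

```python
def pack(g, pr, ps=0, pl=None):
    """pack (section 4.2): replace pl symbols of pr's right side,
    starting at position ps + 1, by a fresh nonterminal, and add a
    rule deriving that sequence from the fresh nonterminal."""
    if pl is None:
        pl = len(pr.rhs) - ps                 # default package length
    assert pr in g.r
    assert 0 <= ps < len(pr.rhs) and 0 < pl <= len(pr.rhs) - ps
    fresh = "pack_" + str(len(g.n))           # new nonterminal (illustrative)
    packed = pr.rhs[ps:ps + pl]               # the packed symbol sequence
    new_rhs = pr.rhs[:ps] + (fresh,) + pr.rhs[ps + pl:]
    rules = (g.r - {pr}) | {Rule(pr.lhs, new_rhs), Rule(fresh, packed)}
    return Grammar(g.n | {fresh}, g.t, rules, g.s)
```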
5 objective function

we adopt a somewhat modified understanding and notation of objective functions from mathematical optimization. in our setting, the objective function describes the properties of the context-free grammar that we seek to achieve by refactoring; it does not, however, describe the way in which the refactoring should be performed, nor the condition under which the desired properties of the grammar are achieved. in our view, the objective function consists of two parts: an objective and a state function. our automated refactoring algorithm works with only two kinds of objectives, namely minimization and maximization of the state function. we define a state function as an arithmetic expression whose only variables are grammar metrics calculable for any context-free grammar. as such, a state function is a tool for qualitative comparison of two or more equivalent context-free grammars. until now, we have experimented with some grammar size metrics [2], e.g. the number of non-terminals (var) and the number of production rules (prod). an example of an objective function, defining a refactoring task executable by our algorithm on a grammar g, is

f(g) = minimize 2 · var + prod. (1)
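evaluating the state function of example (1) over the illustrative representation above amounts to a few lines (the evaluator sketch is ours):

```python
def var(g):
    """grammar metric var: the number of non-terminals."""
    return len(g.n)

def prod(g):
    """grammar metric prod: the number of production rules."""
    return len(g.r)

def objective(g):
    """state function of example (1); the objective is to minimize it."""
    return 2 * var(g) + prod(g)
```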
6 refactoring algorithm

the main idea behind our grammar refactoring algorithm is to apply a sequence of grammar refactoring processes to a chosen context-free grammar in order to produce an equivalent grammar with a lower value of the objective function when the objective is minimization, or a higher value when the objective is maximization. since it is an evolutionary algorithm, it requires some other input parameters besides the initial grammar and the objective function. the algorithm requires three further input parameters: the number of evolution cycles, the population size, and the length of life of a generation. the first two of these parameters are characteristic of algorithms of this type, while the third parameter is our own.

as shown in fig. 1, which presents a white-box view of our algorithm, the central notion in martinica is an abstraction called the population of grammars.

figure 1: white-box view of martinica.

in our view, the population of grammars is a set containing a constant number of grammar population entities. its main property is that, after performing an arbitrary step of our algorithm, the number of elements in the population of grammars is always equal to the population size. further, we define a grammar population entity as an ordered triple of elements: a post-grammar, a process chain of grammar generation, and a difference in objective functions. a post-grammar is a context-free grammar equivalent to the initial grammar. the process chain of grammar generation is the sequence of refactoring process instances that was used to create the post-grammar from the corresponding post-grammar of the previous generation; the number of refactoring process instances in each process chain is always equal to the length of life of a generation. the difference in objective functions is the difference between the values of the objective function calculated for a post-grammar of the current population and for the corresponding post-grammar of the previous population of grammars.

6.1 refactoring process instantiation

all process instances occurring in our algorithm are created automatically in one of three procedures, referred to as random process creation, random parameter creation, and identical process creation.

random process creation creates an instance of a random refactoring process with random parameters. the first step of this procedure randomly selects a process from the base of grammar refactoring processes; each grammar refactoring process has the same probability of being selected. the second step defines concrete process parameters for this process on the basis of the grammar to which the process instance will be applied. all combinations of process parameters that respect the restrictions defined by the specific refactoring process have the same probability of being generated.

random parameter creation creates a process instance originating from some other process instance. the two process instances share the same refactoring process, but their process parameters may differ, since new process parameters are created in a procedure analogous to the second step of random process creation. the only exception to this rule occurs when there is no acceptable combination of process parameters for the given refactoring process to be applicable to the given context-free grammar; in that case, random parameter creation returns an instance of the nop refactoring process.

identical process creation creates an instance of the nop grammar refactoring process.

6.2 creating an initial population

in the first phase of the automated refactoring algorithm, the initial population of grammars is created; this phase is not repeated during the algorithm.

the first step in this phase is to create a grammar generation process chain for each grammar population entity. all process instances of each process chain created in this phase are created by the random process creation procedure, except for one chain, whose processes are all created by the identical process creation procedure. the reason for this exception is to guarantee that the initial grammar will be incorporated into the initial population of grammars. since the sequence of process instances contained in a process chain must be applicable, in exact order, to the grammar for which it is being generated, we must take into account all changes to the grammar performed by one refactoring process instance in order to be able to generate the next process instance of the chain. we solve this issue by generating an intermediate grammar after each random process creation procedure, applying the newly created refactoring process instance to the grammar for which it was generated; we then generate the next random process instance of the chain on the basis of this intermediate grammar. to illustrate the idea behind this approach, fig. 2 shows the creation of a process chain consisting of three random refactoring processes for the initial grammar.

figure 2: creating a random process chain.

the second step of the first phase creates the corresponding post-grammar for each grammar population entity by applying its process chain to the initial grammar, and finally the third step calculates the difference between the objective function evaluated for the initial grammar and for the post-grammar of the corresponding grammar population entity.
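the chain-building step of fig. 2 can be sketched as follows (the instantiate helper stands in for random process creation and is supplied by the caller, so the names here are illustrative only):

```python
import random

def random_chain(grammar, length, process_base, instantiate):
    """build one grammar generation process chain (fig. 2), sketched.
    instantiate(process, grammar) picks random admissible parameters
    for the given process on the given grammar."""
    chain, g = [], grammar
    for _ in range(length):                  # length of life of a generation
        process = random.choice(process_base)  # uniform choice of a process
        instance = instantiate(process, g)     # random process creation
        chain.append(instance)
        g = instance.apply(g)                  # intermediate grammar
    return chain, g                            # the chain and its post-grammar
```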
6.3 creating test grammars

the second and third phases of the algorithm, called test-grammar creation and selection, are repeated in sequence for the given number of evolution cycles.

in test-grammar creation, we create three test grammar population entities for each grammar population entity. these entities are called the self-test grammar, the foreign-test grammar, and the random-test grammar.

the self-test grammar is created on the basis of the corresponding grammar population entity and of a process chain generated from the process chain of this entity. all refactoring process instances in the newly generated process chain are created by the random parameter creation procedure, and the algorithm for creating them is analogous to the algorithm for creating a random process chain in the initial-population phase. the self-test grammar is therefore a grammar population entity containing a grammar that was created by the same refactoring processes as the original tested grammar, but these processes may have different process parameters.

the foreign-test grammar is created by a procedure similar to that for the self-test grammar, with the exception that the new population entity is not created on the basis of the tested grammar's process chain, but on the basis of the process chain of some other grammar population entity, randomly selected from the population of grammars.

the random-test grammar is created by a procedure analogous to the creation of a random grammar population entity in the first phase of the algorithm, with the exception that the random process chain is not generated for the initial grammar, but for the grammar contained within the tested grammar population entity.

6.4 selection and evaluation

in the selection phase of the algorithm, we compare the value of the objective function of each grammar within the population of grammars with the values of the objective function of the corresponding test grammars, and we choose the grammar with the best value of the objective function. this is the grammar that will be incorporated into the next generation of the population of grammars. when the chosen grammar is the tested grammar itself, no change occurs, and the corresponding grammar population entity is preserved in the population of grammars. otherwise, the tested grammar population entity is removed from the population of grammars and is substituted by the test grammar population entity with the best value of the objective function.

the fourth and final phase of the algorithm is performed after all evolution cycles have ended. in this phase, we compare the values of the objective function calculated for each grammar within the population of grammars, and we choose the grammar with the highest or lowest value, depending on the objective. this is the solution grammar, and as such it is the result of the automated refactoring. a condensed sketch of this evolutionary loop is given below.
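the control flow of sections 6.2–6.4 condenses into the following loop; this is our paraphrase, not martinica's actual java code, it assumes the objective has been normalized to minimization, and the entity-building callbacks (one per test-grammar kind of section 6.3) are supplied by the caller:

```python
def evolve(initial_grammar, objective, cycles, pop_size, life,
           make_entity, test_makers):
    """sketch of martinica's outer loop. make_entity builds one random
    grammar population entity; each function in test_makers builds one
    of the self-, foreign- and random-test entities of section 6.3."""
    population = [make_entity(initial_grammar, life) for _ in range(pop_size)]
    for _ in range(cycles):                   # evolution cycles
        next_population = []
        for entity in population:             # selection (section 6.4)
            candidates = [entity] + [make(entity, population, life)
                                     for make in test_makers]
            next_population.append(min(candidates,
                                       key=lambda e: objective(e.grammar)))
        population = next_population
    # final phase: the solution grammar is the best one in the population
    return min(population, key=lambda e: objective(e.grammar)).grammar
```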
7 experimental results

7.1 martinica implementation

in order to be able to perform experiments and to demonstrate the correctness of our approach, we implemented a grammar refactoring system in which martinica plays the central role. the entire system is implemented in java, and its architecture is shown in fig. 3.

figure 3: martinica system architecture.

the refactoring system takes the initial grammar to be refactored and the objective function from two text files and, after the refactoring has been performed, it creates two files: a text file containing the resulting grammar, and a pdf file containing the evolution report. the core of the system is divided into two coexisting entities: the automated refactoring algorithm and the objective function evaluator. the automated refactoring algorithm contains the implementation of the entire martinica algorithm, the refactoring process base, and the interactive user interface for obtaining the number of evolution cycles, the population size and the length of life of a generation. the initial grammar is taken from the text file and parsed by the grammar parser, which creates the grammar model on the basis of which the refactoring is done. the objective function is parsed by the objective function evaluator, which calculates the values of the objective function for all grammars provided by the automated refactoring algorithm. the evaluator does not pass a refactoring objective to the automated refactoring algorithm, since the algorithm always assumes that the objective is minimization; if this is not the case, the objective function evaluator transforms the state function into an equivalent one with the objective of minimization. the entire refactoring process is monitored by the evolution monitor, which creates a report containing analytical data concerning the specific refactoring run.

7.2 refactoring experiment

we experimented with a context-free grammar generating a simple assignment language. the grammar contains 11 non-terminals, 13 terminals and 18 production rules; its bnf notation is shown in tab. 1. symbols starting with a lowercase letter represent nonterminals, while symbols starting with uppercase letters represent terminals. the start symbol of the grammar is the non-terminal program.

table 1: initial grammar.
program ::= PROGRAM IDENT BEGIN commandsequence END
commandsequence ::= command COMMA commandsequence
commandsequence ::= command
command ::= assignement
command ::= declaration
assignement ::= variable ASSIGN expression
declaration ::= VAR IDENT TYPE type
expression ::= variable operation expression
expression ::= constant operation expression
expression ::= variable
expression ::= constant
operation ::= PLUS
operation ::= MINUS
type ::= INTEGER
type ::= REAL
ident ::= IDENT
variable ::= ident
constant ::= NUMBER

in our experiment, the refactoring task was described by the objective function of example (1). we iterated through 30 evolutionary cycles with a population of 500 grammar population entities, and the length of life of a generation was set to 4. after the refactoring had been performed, we obtained the grammar shown in tab. 2.

table 2: refactored grammar.
program ::= PROGRAM IDENT BEGIN commandsequence END
commandsequence ::= command COMMA commandsequence
commandsequence ::= command
command ::= IDENT ASSIGN expression
command ::= VAR IDENT TYPE REAL
command ::= VAR IDENT TYPE INTEGER
expression ::= IDENT PLUS expression
expression ::= IDENT MINUS expression
expression ::= NUMBER PLUS expression
expression ::= NUMBER MINUS expression
expression ::= IDENT
expression ::= NUMBER

the value of the objective function evaluated for the initial grammar was 40 (2 · 11 + 18), while the value of the objective function evaluated for the refactored grammar is 20 (2 · 4 + 12), which means that martinica managed to reduce the value of the objective function by 50 % and thus fulfilled the refactoring task. the development of the value of the objective function through the evolutionary cycles is illustrated in fig. 4. the upper line illustrates the development of the average value of the objective function over all grammars within the population of grammars, while the lower line shows the development of the value of the objective function of the best grammar found within the population.

figure 4: values of the objective function through evolution.

8 conclusion

in this paper, we have presented our algorithm and software system for automated refactoring of context-free grammars. the main advantage of the algorithm is its relatively broad range of application, while its main disadvantages are its relatively high computational complexity and the slow propagation of positive changes within the population of grammars.
a case study in grammar engineering. in proceedings of sle’2008 (eds. d. gašević, r. lämmel and e. wyk), pp. 285–304. springer-verlag, berlin-heidelberg, 2009. [2] j. cervelle et al. on defining quality based grammar metrics. in proceedings of imcsit ’09. international multiconference (eds. m. ganzha and m. paprzycki), pp. 651–658. ieee computer society press, los alamitos, 2009. [3] a. d’ulizia, f. ferri, p. grifoni. a learning algorithm for multimodal grammar inference. systems, man, and cybernetics, part b: cybernetics 41(6):1495–1510, 2011. [4] m. fowler et al. refactoring: improving the design of existing code. addison-wesley, boston, 1999. [5] p. klint, r. läammel, c. verhoef. toward an engineering discipline for grammarware. transaction on software engineering and methodology 14(3):331–380, 2005. [6] n. a. kraft, e. b. duffy, b. a. malloy. grammar recovery from parse trees and metrics-guided grammar refactoring. software engineering 35(6):780–794, 2009. [7] r. lämmel. grammar adaptation. in fme 2001: formal methods for increasing software productivity (eds. j. oliveira and p. zave), pp. 550–570. springer-verlag, berlin-heidelberg, 2001. [8] r. läammel, c. verhoef. semi-automatic grammar recovery. software: practice and experience 31(15):1395–1438, 2001. [9] w. lohmann, g. riedewald, m. stoy. semanticspreserving migration of semantic rules during left recursion removal in attribute grammars. electron notes theor comput sci 110:133–148, 2004. [10] k. c. louden. compiler construction: principles and practice. pws publishing, boston, 1997. [11] m. mernik et al. grammar inference algorithms and applications in software engineering. in proceedings of icat 2009. xxii international symposium (eds. a. salihbegović, j. velagić, h. šupić and a. sadžak), pp. 14–20. ieee computer society press, los alamitos, 2009. [12] t. mogensen. basics of compiler design. university of copenhagen, 2007. 57 acta polytechnica doi:10.14311/ap.2013.53.0433 acta polytechnica 53(5):433–437, 2013 © czech technical university in prague, 2013 available online at http://ojs.cvut.cz/ojs/index.php/ap on renormalization of poisson–lie t-plural sigma models ladislav hlavatý∗, josef navrátil, libor šnobl department of physics, faculty of nuclear sciences and physical engineering, czech technical university in prague, břehová 7, 115 19 prague 1, czech republic ∗ corresponding author: ladislav.hlavaty@fjfi.cvut.cz abstract. covariance of the one-loop renormalization group equations with respect to poisson–lie t-plurality of sigma models is discussed. the role of ambiguities in renormalization group equations of poisson–lie sigma models with truncated matrices of parameters is investigated. keywords: sigma models, string duality, renormalization group. ams mathematics subject classification: 81t40 (81t30, 81t15). submitted: 20 may 2013. accepted: 31 may 2013. 1. introduction one-loop renormalizability of poisson–lie dualizable σ-models and their renormalization group equations were derived in [1]. covariance of the renormalization group equations with respect to poisson–lie t-duality was proven in [2]. this suggests that also properties of quantum σ-models can be given in terms of drinfel’d doubles and not their decompositions into manin triples. this was indeed claimed in [3] where a renormalization on the level of sigma models defined on drinfel’d double was proposed. a natural way to independently verify this claim would be to extend the proof of covariance of [2] to poisson–lie t-plurality. 
unfortunately, the transformation properties of the structure constants and of the matrix m (the parameters of the models) under poisson–lie t-plurality are much more complicated than in the case of t-duality. that is why we decided to check the covariance first on examples, using our lists of 4- and 6-dimensional drinfel'd doubles and their decompositions into manin triples [4, 5]. it turned out that the renormalization group equations of [1, 2] are indeed invariant under poisson–lie t-plurality. the equivalence of the renormalization flows of the models on the poisson–lie group of [2] and on the drinfel'd double [3] also holds in all cases studied so far, provided one is careful in interpreting the formulas in different parts of [3]; see section 3.

an assumption in the renormalizability proof [1] is that there is no a priori restriction on the elements of the matrix m that, together with the structure of the manin triple, determines the models. it was noted in [2, 6] that the renormalization group equations need not be consistent with a truncation of the parameter space. on the other hand, there is some freedom in the renormalization group equations, and we are going to show how it can be used in the choice of one-loop β-functions for a given truncation.

2. review of poisson–lie t-plurality

for simplicity we will consider σ-models without spectator fields, i.e. with target manifold isomorphic to a group. let g be a lie group and $\mathfrak{g}$ its lie algebra. the σ-model on the group g is given by the classical action

$s_e[g] = \int d^2x \, (r_-(g))^a \, e_{ab}(g) \, (r_+(g))^b$, (1)

where $g : \mathbb{r}^2 \to g$, $(\sigma_+,\sigma_-) \mapsto g(\sigma_+,\sigma_-)$, the $(r_\pm(g))^a$ are the components of the right-invariant fields $\partial_\pm g \, g^{-1}$ in a basis $t_a$ of the lie algebra $\mathfrak{g}$,

$\partial_\pm g \, g^{-1} = (r_\pm(g))^a \, t_a \in \mathfrak{g}$,

and e(g) is a certain bilinear form on the lie algebra $\mathfrak{g}$, to be specified below. the σ-models that can be transformed by poisson–lie t-duality are formulated (see [7, 8]) by virtue of the drinfel'd double d ≡ (g|g̃), a lie group whose lie algebra $\mathfrak{d}$ admits a decomposition $\mathfrak{d} = \mathfrak{g} \dotplus \tilde{\mathfrak{g}}$ into a pair of subalgebras maximally isotropic with respect to a symmetric ad-invariant nondegenerate bilinear form ⟨., .⟩. these decompositions are called manin triples. the matrices e(g) of such σ-models are of the form

$e(g) = (m + \pi(g))^{-1}, \qquad \pi(g) = b(g) \cdot a^{-1}(g) = -\pi(g)^t$, (2)

where m is a constant matrix, the superscript t denotes matrix transposition, and a(g), b(g) are submatrices of the adjoint representation of the subgroup g on the lie algebra $\mathfrak{d}$, defined by

$g \, t \, g^{-1} \equiv ad(g) \triangleright t = a^{-1}(g) \cdot t$,
$g \, \tilde t \, g^{-1} \equiv ad(g) \triangleright \tilde t = b^t(g) \cdot t + a^t(g) \cdot \tilde t$, (3)

where $t_a$ and $\tilde t^a$ are elements of dual bases of $\mathfrak{g}$ and $\tilde{\mathfrak{g}}$, i.e. $\langle t_a, t_b \rangle = 0$, $\langle \tilde t^a, \tilde t^b \rangle = 0$, $\langle t_a, \tilde t^b \rangle = \delta_a^b$.

the origin of poisson–lie t-plurality [7, 9] lies in the fact that in general several decompositions (manin triples) of the drinfel'd double may exist. let $\mathfrak{d} = \hat{\mathfrak{g}} \dotplus \bar{\mathfrak{g}}$ be another decomposition of the lie algebra $\mathfrak{d}$ into maximally isotropic subalgebras. the dual bases of $\mathfrak{g}, \tilde{\mathfrak{g}}$ and $\hat{\mathfrak{g}}, \bar{\mathfrak{g}}$ are related by the linear transformation

$\begin{pmatrix} t \\ \tilde t \end{pmatrix} = \begin{pmatrix} k & q \\ w & s \end{pmatrix} \begin{pmatrix} \hat t \\ \bar t \end{pmatrix}$, (4)

where the matrices k, q, w, s are chosen in such a way that the structure of the lie algebra $\mathfrak{d}$ in the basis $(t_a, \tilde t^b)$,

$[t_a, t_b] = f_{ab}{}^c \, t_c, \qquad [\tilde t^a, \tilde t^b] = \tilde f^{ab}{}_c \, \tilde t^c, \qquad [\tilde t^a, t_b] = f_{bc}{}^a \, \tilde t^c - \tilde f^{ac}{}_b \, t_c$, (5)

transforms into a similar one in which $t \to \hat t$, $\tilde t \to \bar t$ and the structure constants f, f̃ of $\mathfrak{g}$ and $\tilde{\mathfrak{g}}$ are replaced by the structure constants f̂, f̄ of $\hat{\mathfrak{g}}$ and $\bar{\mathfrak{g}}$.
the duality of both bases requires

$\begin{pmatrix} k & q \\ w & s \end{pmatrix}^{-1} = \begin{pmatrix} s^t & q^t \\ w^t & k^t \end{pmatrix}$. (6)

the σ-model obtained by poisson–lie t-plurality is defined analogously to (1)–(2), with

$\hat e(\hat g) = (\hat m + \hat\pi(\hat g))^{-1}, \qquad \hat\pi(\hat g) = \hat b(\hat g) \cdot \hat a^{-1}(\hat g) = -\hat\pi(\hat g)^t$,
$\hat m = (m \cdot q + s)^{-1} \cdot (m \cdot k + w) = (k^t \cdot m - w^t) \cdot (s^t - q^t \cdot m)^{-1}$. (7)

the transformation (7), $m \mapsto \hat m$, is obtained when the subspaces $\mathcal{e}^\pm = \mathrm{span}\{e^\pm_a\}_{a=1}^n$ spanned by

$e^+_a := t_a + m^{-1}_{ab} \, \tilde t^b, \qquad e^-_a := t_a - m^{-1}_{ba} \, \tilde t^b$ (8)

are expressed as $\mathcal{e}^+ = \{\hat t_a + \hat m^{-1}_{ab} \, \bar t^b\}_{a=1}^n$, $\mathcal{e}^- = \{\hat t_a - \hat m^{-1}_{ba} \, \bar t^b\}_{a=1}^n$. classical solutions of the two σ-models are related by the two possible decompositions of $l \in d$,

$l = g \tilde h = \hat g \bar h$. (9)

examples of explicit solutions of σ-models related by poisson–lie t-plurality were given in [10]. poisson–lie t-duality is the special case of poisson–lie t-plurality with $k = s = 0$, $q = w = 1$.

it is useful to recall that several other conventions are used in the literature. e.g., the action in [2, 3] is defined as

$s[g] = \int d^2x \, l_+(g) \cdot (m + \bar\pi(g))^{-1} \cdot l_-(g)$, (10)

where $\bar\pi(g) = b^t(g) \cdot a(g) = \pi(g^{-1})$. the transition between the actions (1) and (10) is given by $g \leftrightarrow g^{-1}$, $m \leftrightarrow m^t$.

the one-loop renormalization group equations for poisson–lie dualizable σ-models were found in [1]. in our notation they read

$\dfrac{dm^{ba}}{dt} = r^{ab}(m^t)$. (11)

note that equation (11) appears in [1, 2] without the transposition of m on both sides of the equation, due to the different formulations of the σ-model action, (1) vs. (10). the matrix-valued function $r^{ab}$ is defined as

$r^{ab}(m) = r^{ac}{}_d(m) \, l^{db}{}_c(m)$, (12)
$r^{ab}{}_c(m) = \tfrac{1}{2} (m_s^{-1})_{cd} \left( a^{ab}{}_e m^{de} + b^{ad}{}_e m^{eb} - b^{db}{}_e m^{ae} \right)$, (13)
$l^{ab}{}_c(m) = \tfrac{1}{2} (m_s^{-1})_{cd} \left( b^{ab}{}_e m^{ed} + a^{db}{}_e m^{ae} - a^{ad}{}_e m^{eb} \right)$, (14)
$a^{ab}{}_c = \tilde f^{ab}{}_c - f_{cd}{}^a m^{db}, \qquad b^{ab}{}_c = \tilde f^{ab}{}_c + m^{ad} f_{dc}{}^b$, (15)
$m_s = \tfrac{1}{2}(m + m^t)$. (16)

it was shown in [2] that equation (11) is covariant with respect to poisson–lie t-duality, i.e., it is equivalent to

$\dfrac{d\tilde m^{ba}}{dt} = \tilde r^{ab}(\tilde m^t)$ (17)

obtained by

$f \to \tilde f, \qquad \tilde f \to f, \qquad m \to \tilde m = m^{-1}$. (18)

one expects that equations (11) are covariant also with respect to poisson–lie t-plurality, i.e. under

$f \to \hat f, \qquad \tilde f \to \bar f, \qquad m \to \hat m$, (19)

where the transformation of m under the plurality is given by (7). we have checked the invariance on numerous examples of poisson–lie t-plurality, using the 4- and 6-dimensional drinfel'd doubles and their decompositions into manin triples of [4, 5], and have found no counterexamples.
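the algebraic part of the plurality transformation (7) is easy to exercise numerically. the following numpy sketch (ours) only checks the matrix algebra of (7); in an actual application the matrices k, q, w, s must come from a genuine decomposition of the drinfel'd double satisfying (6):

```python
import numpy as np

def m_hat(m, k, q, w, s):
    """plurality transform, first form of eq. (7):
    m_hat = (m q + s)^(-1) (m k + w)."""
    return np.linalg.solve(m @ q + s, m @ k + w)

def m_hat_alt(m, k, q, w, s):
    """second, equivalent form of eq. (7):
    m_hat = (k^t m - w^t)(s^t - q^t m)^(-1)."""
    return (k.T @ m - w.T) @ np.linalg.inv(s.T - q.T @ m)

# sanity check on the duality special case k = s = 0, q = w = 1,
# where eq. (7) must reduce to m -> m^(-1), cf. (18):
n = 3
m = np.eye(n) + 0.1 * np.random.default_rng(1).random((n, n))
i, z = np.eye(n), np.zeros((n, n))
assert np.allclose(m_hat(m, z, i, i, z), np.linalg.inv(m))
assert np.allclose(m_hat_alt(m, z, i, i, z), np.linalg.inv(m))
```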
3. relation to the renormalization group equations on the drinfel'd double

the renormalization group equation (11) presented above will be compared with the renormalization group equations derived in [3] on the whole drinfel'd double,

$\dfrac{dr^{ab}}{dt} = s^{ab}(r, h) = \tfrac{1}{4} \left( r^{ac} r^{bf} - \eta^{ac} \eta^{bf} \right) \left( r_{kd} r_{he} - \eta_{kd} \eta_{he} \right) h^{kh}{}_c \, h^{de}{}_f$ (20)

for the symmetric matrix r; the indices a, b, ... refer to the drinfel'd double lie algebra $\mathfrak{d}$ spanned by the basis $t_a = \{t_i, \tilde t^j\}$. for a given decomposition of the drinfel'd double into a manin triple (g|g̃), the structure constants h of the drinfel'd double are given by the structure constants f, f̃ of the subalgebras of the manin triple, $h = h(f, \tilde f)$, as in equation (5). the matrix r is related to the matrix m, which defines the σ-model on the group g, by

$r^{ab} = \rho^{ab}(m) = \begin{pmatrix} \tilde m_s - b \, \tilde m_s^{-1} b & -b \, \tilde m_s^{-1} \\ \tilde m_s^{-1} b & \tilde m_s^{-1} \end{pmatrix}$, (21)

where

$b = \tfrac{1}{2}\left[ m^{-1} - (m^{-1})^t \right], \qquad \tilde m_s = \tfrac{1}{2}\left[ m^{-1} + (m^{-1})^t \right], \qquad r_{ab} = (r^{-1})_{ab}, \qquad r^{-1} = \eta \cdot r \cdot \eta$,

and

$\eta_{ab} = \langle t_a | t_b \rangle = \begin{pmatrix} 0 & i_{d_g \times d_g} \\ i_{d_g \times d_g} & 0 \end{pmatrix}$. (22)

it is easy to show that, due to (21), the equivalence of (20) and (11), where $r^{ab} = r^{ab}(m, f, \tilde f)$, requires

$s^{ab}\big( \rho(m), h(f, \tilde f) \big) = \dfrac{\partial \rho^{ab}}{\partial m_{cd}}(m) \; r^{dc}(m^t, f, \tilde f)$. (23)

note the presence of the transpositions on the right-hand side. by construction (cf. equation (4.15) of [3]), the matrix m which is inserted into equation (21), and thus appears in equation (20), transforms under t-plurality as in (7), i.e. it agrees with the convention used here for the sigma model of the form (1). however, the sigma models on the poisson–lie groups in [3] are expressed in a different convention, as in equation (10) here. thus, a tacit transposition of the matrix m is necessary when comparing the renormalization group flows on the double and on the individual poisson–lie subgroup in [3]. taking this fact into consideration, we were able to recover the examples presented in [3] and also to confirm the conjectured equivalence of the renormalization group equations (20) and (11) in all the investigated 4- and 6-dimensional drinfel'd doubles.
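one internal consistency check of the block structure (21) is that a matrix of this form satisfies r · η · r = η, so that $r^{-1} = \eta \cdot r \cdot \eta$ as required by (22). a numpy sketch of ours, with a randomly generated m:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2
m = np.eye(n) + 0.2 * rng.random((n, n))   # a generic invertible m
minv = np.linalg.inv(m)
b = 0.5 * (minv - minv.T)                  # antisymmetric part of m^(-1)
ms = 0.5 * (minv + minv.T)                 # symmetric part, m_s tilde
msi = np.linalg.inv(ms)

# block matrix rho(m) of eq. (21)
r = np.block([[ms - b @ msi @ b, -b @ msi],
              [msi @ b,          msi]])

eta = np.block([[np.zeros((n, n)), np.eye(n)],
                [np.eye(n),        np.zeros((n, n))]])

# r^(-1) = eta . r . eta is equivalent to r . eta . r = eta
assert np.allclose(r @ eta @ r, eta)
```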
4. non-uniqueness of the renormalization group equations

it was noted in [1] that there is a certain ambiguity in the one-loop renormalization group equations. namely, the flow given by equation (11) is physically equivalent to the flow given by the equation

$\dfrac{dm^{ba}}{dt} = r^{ab}(m^t) + r^{ab}{}_c(m^t) \, \xi^c$, (24)

where $\xi^c$ are arbitrary functions of the renormalization scale t and the $r^{ab}{}_c(m)$ were defined in (13). the origin of this arbitrariness in $\xi^c$ lies in the fact that the metric and the b-field are determined only up to a choice of coordinates, i.e. up to a diffeomorphism of the group g viewed as a manifold. in our case we may in addition require that the transformed action again take the form (1)–(2) for some matrix m′. on the other hand, we do not have to require the diffeomorphism to be a group homomorphism, because the group structure plays only an auxiliary role in the physical interpretation. for example, in the particular case of a semiabelian double, i.e. $\tilde f = 0$, $\pi = 0$, with a symmetric matrix m, the left translation by an arbitrary group element $h = \exp(x) \in g$, i.e. the replacement of g by hg in the action (1), leads to the new matrix $m' = ad(h) \cdot m \cdot ad(h)$, specifying a metric physically equivalent to the original one. such a diffeomorphism is generated by the flow of the left-invariant vector field x. for general manin triples and matrices m, similar transformations are generated by more complicated vector fields parameterized by $\xi^c$, as was found in [1]. thus the renormalization group flows (24) differing by the choice of $\xi^c$ are physically equivalent. consistency under poisson–lie t-plurality requires that the functions $\hat\xi^c$ of the plural model satisfy

$\hat r(\hat m^t) \cdot \hat\xi = (s - m^t \cdot q)^{-1} \cdot \big( r(m^t) \cdot \xi \big) \cdot (k + q \cdot \hat m^t)$. (25)

for poisson–lie t-duality this formula simplifies to $\tilde r(\tilde m^t) \cdot (\tilde\xi + \tilde m^t \cdot \xi) = 0$. the freedom in the choice of the functions $\xi^a$ can be employed when compatibility of the renormalization group flow with a chosen ansatz (truncation) for the matrix m is sought.

4.1. renormalizable σ-models for m proportional to the unit or diagonal matrix

the simplest ansatz for the constant matrix is $m = m \mathbb{1}$, where $\mathbb{1}$ is the identity matrix and $m \neq 0$. as mentioned in the introduction, a truncation or symmetry of the constant matrix m that determines the background of the σ-model often contradicts the form of the r.h.s. of the renormalization group equations (11). on the other hand, the freedom in the choice of $\xi^c$ in (24) may help to restore renormalizability. it is therefore of interest to find consistency conditions for the renormalization group equations of the σ-models given by this simple m.

two-dimensional poisson–lie σ-models are given by manin triples generated by abelian or solvable lie algebras with the lie products

$[t_1, t_2] = a \, t_2, \qquad [\tilde t^1, \tilde t^2] = \tilde a \, \tilde t^2, \qquad a \in \{0, 1\},\ \tilde a \in \mathbb{r}$, (26)

or

$[t_1, t_2] = t_2, \qquad [\tilde t^1, \tilde t^2] = \tilde t^1$. (27)

in the former case, equation (24) for $m = m \mathbb{1}$ reads

$\begin{pmatrix} \frac{dm}{dt} & 0 \\ 0 & \frac{dm}{dt} \end{pmatrix} = \begin{pmatrix} a^2 m^2 - \tilde a^2 & (am + \tilde a) \, \xi^2 \\ 0 & -(am + \tilde a) \, \xi^1 \end{pmatrix}$, (28)

so that we generically get $\xi^1 = \tilde a - am$, $\xi^2 = 0$, and the renormalization group equation is $dm/dt = a^2 m^2 - \tilde a^2$. in the special case $a = 1$, $m = -\tilde a$, the r.h.s. of equation (28) vanishes for all choices of $\xi^k$, i.e. there is no renormalization. notice that had we allowed the diagonal ansatz

$m = \begin{pmatrix} m_1 & 0 \\ 0 & m_2 \end{pmatrix}$ (29)

instead of a multiple of the unit matrix, the restriction on the value of $\xi^1$ would disappear and the renormalization group equations would take the form

$\dfrac{dm_1}{dt} = -\tilde a^2 + m_1^2 a^2, \qquad \dfrac{dm_2}{dt} = -\dfrac{m_2}{m_1} \, \xi^1 (\tilde a + m_1 a)$. (30)

for the manin triple (27), equation (24) reads

$\begin{pmatrix} \frac{dm}{dt} & 0 \\ 0 & \frac{dm}{dt} \end{pmatrix} = \begin{pmatrix} m^2 + \xi^2 & m(\xi^2 - 1) \\ m - \xi^1 & -1 - m\xi^1 \end{pmatrix}$ (31)

and no choice of $\xi^1, \xi^2$ satisfies equation (31). therefore the poisson–lie σ-model given by the manin triple (27) is not renormalizable with m kept proportional to the unit matrix. the situation changes when we allow the general diagonal form (29) of the matrix m. then the renormalization group equation becomes

$\begin{pmatrix} \frac{dm_1}{dt} & 0 \\ 0 & \frac{dm_2}{dt} \end{pmatrix} = \begin{pmatrix} m_1^2 + \frac{m_1}{m_2}\xi^2 & m_1(\xi^2 - 1) \\ m_1 - \xi^1 & -1 - m_2\xi^1 \end{pmatrix}$, (32)

which allows the flow

$\dfrac{dm_1}{dt} = m_1^2 + \dfrac{m_1}{m_2}, \qquad \dfrac{dm_2}{dt} = -1 - m_1 m_2$,

respecting the diagonal ansatz (29), for the unique choice $\xi^1 = m_1$, $\xi^2 = 1$.

consistency of the one-loop renormalization group equations for three-dimensional poisson–lie σ-models with m proportional to the unit matrix fixes $\xi^3 = 0$ and is consistent with the choice $\xi^2 = 0$ (unique in some cases). it exists for the manin triples and the choices of $\xi^1$ and/or m, and their duals, summarized in table 1.

table 1: conditions for consistency of the one-loop renormalization group equations for three-dimensional σ-models with m proportional to the unit matrix (for the notation (x|y) or (x|y|b) see [5]).
(1|1): $dm/dt = 0$, $\xi^1 = 0$
(3|3.i|b): $dm/dt = 0$, $\xi^1 = 0$, $m = \pm b$
(5|1): $dm/dt = 2m^2$, $\xi^1 = 2m$
(6_0|5.iii|b): $dm/dt = 0$, $\xi^1 = 0$, $m = \pm b$
(6_a|6_{1/a}.i|b): $dm/dt = 0$, $\xi^1 = 0$, $m = \pm b/a$
(6_a|6_{1/a}.i|b): $dm/dt = 2b^2(a^2 - 1/a^2)$, $\xi^1 = -2b(a + 1/a)$, $m = -b$
(7_a|1): $dm/dt = 2a^2m^2$, $\xi^1 = 2am$, $a \geq 0$
(7_a|7_{1/a}|b): $dm/dt = 2(m^2 - b^2)$, $\xi^1 = 2(m - b)$, $a = 1$
(9|1): $dm/dt = -m^2/2$, $\xi^1 = 0$
(9|5|b): $dm/dt = -\tfrac{1}{2}m^2 - 2b^2$, $\xi^1 = -2b$

renormalization of the poisson–lie σ-models given by the other six-dimensional manin triples is not consistent with the assumption of m proportional to the identity, i.e. renormalization spoils the ansatz. we have also investigated three-dimensional σ-models with general diagonal matrices m, but the list of renormalizable models is rather long, so we do not display it here.

we note that the list of renormalizable three-dimensional poisson–lie σ-models with m proportional to the unit matrix is in agreement with the results obtained in [11]. there the conformally invariant poisson–lie σ-models, i.e. those with vanishing β-function, were studied, and sigma models with diagonal m and constant dilaton field were obtained. they appear in the list constructed above with vanishing r.h.s. of the renormalization group equation.
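the two-dimensional bookkeeping above is simple enough to check symbolically. a sympy sketch of ours for the manin triple (26), verifying that the quoted choice $\xi^1 = \tilde a - am$, $\xi^2 = 0$ keeps the right-hand side of (28) diagonal with both diagonal entries equal to $a^2m^2 - \tilde a^2$:

```python
import sympy as sp

a, atilde, m = sp.symbols("a atilde m")
xi1, xi2 = atilde - a * m, 0

# right-hand side of eq. (28) for the manin triple (26) with m = m * identity
rhs = sp.Matrix([[a**2 * m**2 - atilde**2, (a * m + atilde) * xi2],
                 [0,                       -(a * m + atilde) * xi1]])

assert rhs[0, 1] == 0 and rhs[1, 0] == 0                 # flow stays diagonal
assert sp.expand(rhs[1, 1] - (a**2 * m**2 - atilde**2)) == 0
```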
5. conclusions

we have discussed the transformation properties of the renormalization group flow under poisson–lie t-plurality. originally, on the basis of our previous experience with poisson–lie t-duality and t-plurality, we expected that it should be possible to generalize the proof of the equivalence of the renormalization group flows (11) of poisson–lie t-dual sigma models [2] to the case of poisson–lie t-plurality. unfortunately, this task proved to be beyond our present means, due to the relative complexity of the transformation formula (7) compared with the duality case (18). we therefore resorted to an investigation of the invariance properties of the renormalization group flows on low-dimensional examples. we have found no contradiction with the hypothesis that the renormalization group flows as formulated in [2] are equivalent under poisson–lie t-plurality, nor with the claim that the renormalization flows of the models on the poisson–lie group and on the drinfel'd double are compatible.

next, we studied whether the freedom in the choice of the functions $\xi^c$ in the renormalization group equations (24) can be employed to preserve a chosen ansatz for the matrix m during the renormalization group flow. it turned out that this ambiguity indeed often enables one to stay within the diagonal ansatz for the matrix m.

acknowledgements

this work was supported by rvo68407700 and research plan msm6840770039 of the ministry of education of the czech republic (l.h. and l.š.) and by the grant agency of the czech technical university in prague, grant no. sgs10/295/ohk4/3t/14 (j.n.). we are grateful to konstadinos sfetsos and konstadinos siampos for e-mail discussions that helped to pinpoint the differences in notation and the corresponding reformulations of the renormalization group equations.

references

[1] galliano valent, ctirad klimčík, romain squellari. one loop renormalizability of the poisson-lie sigma models. phys. lett. b 678(1):143–148, 2009.
[2] konstadinos sfetsos, konstadinos siampos. quantum equivalence in poisson-lie t-duality. j. high energy phys. (6):082, 2009.
[3] konstadinos sfetsos, konstadinos siampos, daniel c. thompson. renormalization of lorentz non-invariant actions and manifest t-duality. nuclear phys. b 827(3):545–564, 2010.
[4] ladislav hlavatý, libor šnobl. classification of poisson-lie t-dual models with two-dimensional targets. modern phys. lett. a 17(7):429–434, 2002.
[5] libor šnobl, ladislav hlavatý. classification of six-dimensional real drinfeld doubles. internat. j. modern phys. a 17(28):4043–4067, 2002.
[6] konstadinos sfetsos. duality-invariant class of two-dimensional field theories. nuclear phys. b 561(1-2):316–340, 1999.
[7] ctirad klimčík, pavel ševera. dual non-abelian duality and the drinfel'd double. phys. lett. b 351(4):455–462, 1995.
[8] ctirad klimčík. poisson-lie t-duality. nuclear phys. b proc. suppl. 46:116–121, 1996. s-duality and mirror symmetry (trieste, 1995).
[9] rikard von unge. poisson-lie t-plurality. j. high energy phys. (7):014, 2002.
[10] ladislav hlavatý, jan hýbl, miroslav turek. classical solutions of sigma models in curved backgrounds by the poisson-lie t-plurality. internat. j. modern phys. a 22(5):1039–1052, 2007.
[11] ladislav hlavatý, libor šnobl. poisson-lie t-plurality of three-dimensional conformally invariant sigma models. j. high energy phys. (5):010, 2004.
acta polytechnica vol. 52 no. 6/2012

simulation of free airfoil vibrations in incompressible viscous flow — comparison of fem and fvm

petr sváček1, jaromír horáček2, radek honzátko3, karel kozel1
1 faculty of mechanical engineering, czech technical university in prague, prague, czech republic
2 institute of thermomechanics, academy of sciences of the czech republic, prague, czech republic
3 faculty of production technology and management, jan evangelisty purkyně university in ústí nad labem, ústí nad labem, czech republic
corresponding author: petr.svacek@fs.cvut.cz

abstract
this paper deals with a numerical solution of the interaction of two-dimensional (2-d) incompressible viscous flow and a vibrating profile naca 0012 with large amplitudes. the laminar flow is described by the navier-stokes equations in the arbitrary lagrangian-eulerian form. the profile with two degrees of freedom (2-dof) can rotate around its elastic axis and oscillate in the vertical direction. its motion is described by a nonlinear system of two ordinary differential equations. deformations of the computational domain due to the profile motion are treated by the arbitrary lagrangian-eulerian method. the finite volume method and the finite element method are applied, and the numerical results are compared.

keywords: laminar flow, finite volume method, finite element method, arbitrary lagrangian-eulerian method, nonlinear aeroelasticity.

1 introduction

coupled problems describing the interaction of fluid flow with an elastic structure are of great importance in many engineering applications [23, 9, 1, 24]. commercial codes (e.g. nastran or ansys) are widely used in technical practice, and aeroelastic computations are performed mostly in the linear domain, which enables the stability of the system to be determined. recently, research has also focused on numerical modeling of nonlinear coupled problems, because nonlinear phenomena in post-critical states with large vibration amplitudes cannot be captured within a linear analysis (see, e.g., [28]). nonlinear phenomena are important namely in cases when the structure loses its aeroelastic stability due to flutter or divergence. a nonlinear approach allows the character of the flutter boundary to be determined. the dynamic behavior of the structure at the stability boundary can be either acceptable, when the vibration amplitudes are moderate, or catastrophic, when the amplitudes increase in time above the safety limits. the terminology of benign or catastrophic instability is synonymous with that of stable and unstable limit cycle oscillations (lco), [9], also referred to as supercritical and subcritical hopf bifurcations [24].

the nonlinear aeroelasticity of airfoils is very closely related to flutter control methods. librescu and marzocca published a review of control techniques in [21] and presented in [22] the aeroelastic response of a 2-dof airfoil in 2-d incompressible flow to external time-dependent excitations, together with the flutter instability of actively controlled airfoils. flutter boundaries and lco of aeroelastic systems with structural nonlinearities for a 2-dof airfoil in 2-d incompressible flow were studied by jones et al. [18]. the limit-cycle excitation mechanism was investigated for a similar aeroelastic system with structural nonlinearities by dessi and mastroddi in [7]. recently, chung et al. [4] analyzed lco for a 2-dof airfoil with a hysteresis nonlinearity.

here, two well-known numerical methods, the finite volume method (fvm, see [15, 16]) and the finite element method (fem, see [28]), were employed for a numerical solution of the interaction of 2-d incompressible viscous flow and the elastically supported profile naca 0012. the mathematical model of the flow is represented by the incompressible navier-stokes equations. the profile motion is described in section 2 by two nonlinear ordinary differential equations (odes) for rotation around an elastic axis and oscillation in the vertical direction. the application of the fvm numerical scheme is described in section 3. the dual-time stepping method is applied to the numerical solution of unsteady simulations, and the runge-kutta method is used to solve the odes numerically. the arbitrary lagrangian-eulerian (ale) method is employed to cope with strong distortions of the computational domain due to the profile motion. the numerical scheme used for the unsteady flow calculations satisfies the geometric conservation law (gcl), cf. [19]. the application of fem is described in section 4. section 5 presents numerical results for both methods.

2 equations of profile motion

a vibrating profile with 2-dof can oscillate in the vertical direction and rotate around the elastic axis ea (see figure 1).

figure 1: elastic support of the profile on translational and rotational springs.

the motion is described by the following system of two nonlinear ordinary differential equations [17]:

$m\ddot h + s_\varphi \ddot\varphi \cos\varphi - s_\varphi \dot\varphi^2 \sin\varphi + k_h h = -l$,
$s_\varphi \ddot h \cos\varphi + i_\varphi \ddot\varphi + k_\varphi \varphi = m$, (1)

where h is the vertical displacement of the elastic axis (downwards positive) [m], φ is the rotation angle around the elastic axis (clockwise positive) [rad], m is the mass of the profile [kg], $s_\varphi$ is the static moment around the elastic axis [kg m], $k_h$ is the bending stiffness [n/m], $i_\varphi$ is the moment of inertia around the elastic axis [kg m²], and $k_\varphi$ is the torsional stiffness [n m/rad]. the aerodynamic lift force l [n], acting in the vertical direction (upwards positive), and the torsional moment m [n m] (clockwise positive) are defined as

$l = -d \rho u_\infty^2 c \int_{\gamma_{wt}} \sum_{j=1}^{2} \sigma_{2j} n_j \, ds$,
$m = d \rho u_\infty^2 c^2 \int_{\gamma_{wt}} \sum_{i,j=1}^{2} \sigma_{ij} n_j r_i^\perp \, ds$, (2)

where d [m] is the airfoil depth, ρ [kg m⁻³] is the constant fluid density, $u_\infty$ [m s⁻¹] denotes the magnitude of the far-field velocity, c [m] denotes the airfoil chord, and $n = (n_1, n_2)$ is the unit inner normal to the profile surface $\gamma_{wt}$ (here the concept of non-dimensional variables was used, cf. [11]).
Flutter boundaries and LCO of aeroelastic systems with structural nonlinearities for a 2-DOF airfoil in 2-D incompressible flow were studied by Jones et al. [18]. The limit-cycle excitation mechanism was investigated for a similar aeroelastic system with structural nonlinearities by Dessi and Mastroddi in [7]. Recently, Chung et al. [4] analyzed LCO for a 2-DOF airfoil with hysteresis nonlinearity.

Here, two well-known numerical methods, the finite volume method (FVM, see [15, 16]) and the finite element method (FEM, see [28]), were employed for a numerical solution of the interaction of 2-D incompressible viscous flow and the elastically supported NACA 0012 profile. The mathematical model of the flow is represented by the incompressible Navier-Stokes equations. The profile motion is described in Section 2 by two nonlinear ordinary differential equations (ODEs) for rotation around an elastic axis and oscillation in the vertical direction. The application of the FVM numerical scheme is described in Section 3. The dual-time stepping method is applied to the numerical solution of unsteady simulations, and the Runge-Kutta method is used to solve the ODEs numerically. The arbitrary Lagrangian-Eulerian (ALE) method is employed to cope with strong distortions of the computational domain due to the profile motion. The numerical scheme used for unsteady flow calculations satisfies the geometric conservation law (GCL), cf. [19]. The application of FEM is described in Section 4. Section 5 presents numerical results for both methods.

Figure 1: Elastic support of the profile on translational and rotational springs.

2 Equations of profile motion

A vibrating profile with 2-DOF can oscillate in the vertical direction and rotate around the elastic axis EA (see Figure 1). The motion is described by the following system of two nonlinear ordinary differential equations [17]:

$m\ddot h + S_\varphi \ddot\varphi\cos\varphi - S_\varphi\dot\varphi^2\sin\varphi + k_h h = -L,$
$S_\varphi\ddot h\cos\varphi + I_\varphi\ddot\varphi + k_\varphi\varphi = M,$   (1)

where $h$ is the vertical displacement of the elastic axis (downwards positive) [m], $\varphi$ is the rotation angle around the elastic axis (clockwise positive) [rad], $m$ is the mass of the profile [kg], $S_\varphi$ is the static moment around the elastic axis [kg m], $k_h$ is the bending stiffness [N/m], $I_\varphi$ is the inertia moment around the elastic axis [kg m²], and $k_\varphi$ is the torsional stiffness [N m/rad]. The aerodynamic lift force $L$ [N] acting in the vertical direction (upwards positive) and the torsional moment $M$ [N m] (clockwise positive) are defined as

$L = -d\,\rho U_\infty^2 c \int_{\Gamma_{Wt}} \sum_{j=1}^{2}\sigma_{2j} n_j\,\mathrm{d}S,$
$M = d\,\rho U_\infty^2 c^2 \int_{\Gamma_{Wt}} \sum_{i,j=1}^{2}\sigma_{ij} n_j r_i^\perp\,\mathrm{d}S,$   (2)

where $d$ [m] is the airfoil depth, $\rho$ [kg m⁻³] is the constant fluid density, $U_\infty$ [m s⁻¹] denotes the magnitude of the far-field velocity, $c$ [m] denotes the airfoil chord, and $n = (n_1, n_2)$ is the unit inner normal to the profile surface $\Gamma_{Wt}$ (here the concept of non-dimensional variables was used, cf. [11]). Furthermore, the vector $r^\perp$ is given by $r^\perp = (r_1^\perp, r_2^\perp) = (-(y - y_{EA}),\ x - x_{EA})$, where $x, y$ are the non-dimensional coordinates of a point on the profile surface (i.e., the physical coordinates divided by the airfoil chord $c$), $\Gamma_{Wt}$ denotes the moving part of the boundary (i.e., the surface of the airfoil), and $(x_{EA}, y_{EA})$ are the non-dimensional coordinates of the elastic axis (see Figure 2).

Figure 2: Airfoil segment.
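As described below, system (1) is transformed to a first-order system and integrated by a fourth-order Runge-Kutta method. The following sketch (assuming NumPy; the forcing functions lift(t) and moment(t) are placeholders for the aerodynamic load L, M of eq. (2), which in the coupled computation comes from the flow solver) shows one way this step can be carried out; the parameter values are those listed later in Section 5.

```python
import numpy as np

# Structural parameters taken from Section 5 of the paper.
m_, S_phi, I_phi = 8.66e-2, -7.797e-4, 4.87e-4   # kg, kg m, kg m^2
k_h, k_phi = 105.1, 3.696                        # N/m, N m/rad

def rhs(t, y, lift, moment):
    """System (1) as a first-order system; y = (h, phi, h_dot, phi_dot).
    A 2x2 'mass matrix' couples the h and phi accelerations."""
    h, phi, hd, phid = y
    M_mat = np.array([[m_, S_phi*np.cos(phi)],
                      [S_phi*np.cos(phi), I_phi]])
    f = np.array([-lift(t) - k_h*h + S_phi*phid**2*np.sin(phi),
                  moment(t) - k_phi*phi])
    hdd, phidd = np.linalg.solve(M_mat, f)
    return np.array([hd, phid, hdd, phidd])

def rk4_step(f, t, y, dt, *args):
    """One step of the classical fourth-order Runge-Kutta method."""
    k1 = f(t, y, *args)
    k2 = f(t + dt/2, y + dt/2*k1, *args)
    k3 = f(t + dt/2, y + dt/2*k2, *args)
    k4 = f(t + dt, y + dt*k3, *args)
    return y + dt/6*(k1 + 2*k2 + 2*k3 + k4)

# Free vibration test (L = M = 0) from the initial state used in Section 5.
y = np.array([-0.020, np.radians(6.0), 0.0, 0.0])
dt = 1e-4
for n in range(20000):                           # 2 seconds of motion
    y = rk4_step(rhs, n*dt, y, dt, lambda t: 0.0, lambda t: 0.0)
```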
The components of the (non-dimensional) stress tensor $\sigma_{ij}$ are given by

$\sigma_{11} = -p + \frac{2}{Re}\frac{\partial u}{\partial x},$
$\sigma_{12} = \sigma_{21} = \frac{1}{Re}\left(\frac{\partial u}{\partial y} + \frac{\partial v}{\partial x}\right),$   (3)
$\sigma_{22} = -p + \frac{2}{Re}\frac{\partial v}{\partial y},$

where $u = (u, v)$ is the non-dimensional fluid velocity vector, $p$ is the non-dimensional pressure, $Re$ is the Reynolds number defined as $Re = U_\infty c/\nu$, and $\nu$ [m² s⁻¹] is the fluid kinematic viscosity. Equations (2), (3), together with the boundary conditions for the velocity on the moving part $\Gamma_{Wt}$ of the boundary, represent the coupling of the fluid with the structure.

The system of equations (1) is supplemented by the initial conditions prescribing the values $h(0)$, $\varphi(0)$, $\dot h(0)$, $\dot\varphi(0)$. Furthermore, it is transformed to a system of first-order ordinary differential equations and solved numerically by the fourth-order multistage Runge-Kutta method.

3 Finite volume method

3.1 Mathematical model of the flow

The flow of a viscous incompressible fluid in the computational domain $\Omega_t$ is described by the two-dimensional Navier-Stokes equations written in the conservative form

$(Dw)_t + F^c_x + G^c_y = \frac{1}{Re}\left(F^v_x + G^v_y\right),$   (4)

where $w$ is the vector of conservative variables $w = (p, u, v)^T$, $F^c = F^c(w)$, $G^c = G^c(w)$ are the inviscid physical fluxes defined by $F^c = (u,\ u^2 + p,\ uv)^T$, $G^c = (v,\ uv,\ v^2 + p)^T$, and $F^v = F^v(w)$, $G^v = G^v(w)$ are the viscous physical fluxes defined by $F^v = (0, u_x, v_x)^T$, $G^v = (0, u_y, v_y)^T$, where the partial derivatives with respect to $x$, $y$ and $t$ are denoted by the subscripts $x$, $y$ and $t$, respectively. The symbol $D$ denotes the diagonal matrix $D = \mathrm{diag}(0, 1, 1)$, the non-dimensional time is denoted by $t$, and the non-dimensional space coordinates are denoted by $x, y$. The symbols $u = (u, v)^T$ and $p$ stand for the dimensionless velocity vector and the dimensionless pressure, respectively. The detailed derivation of the dimensionless form of the governing equations and the relationship between the dimensional and non-dimensional variables can be found, e.g., in [11], [15].

In order to discretize (4) in time, a partition $0 = t_0 < t_1 < \cdots < t_M = T$ of the time interval $[0, T]$ is considered, and the (variable) time step is denoted by $\Delta t_n = t_{n+1} - t_n$. Further, $w^n$ denotes the approximation of the solution $w$ at the time instant $t_n$. Eqs. (4) are discretized in time, and the solution of the nonlinear problem for each time step is performed with the use of the concept of dual time, cf. [12]. We start with an approximation of the time derivative $w_t$ by the second-order backward difference formula (BDF2)

$\frac{\partial w}{\partial t}(t_{n+1}) \approx \frac{\alpha_0^n w^{n+1} + \alpha_1^n w^n + \alpha_2^n w^{n-1}}{\Delta t_n},$   (5)

where $\alpha_0^n = 2 - 1/(1+\theta)$, $\alpha_1^n = -(1+\theta)$, $\alpha_2^n = \theta^2/(1+\theta)$, $\theta = \Delta t_n/\Delta t_{n-1}$. The artificial compressibility method, see [3, 16], together with a time-marching method is used for the steady-state computations. This means that the derivatives $D_\beta w_\tau$ with respect to a fictitious dual time $\tau$ are added to the time-discretized equations (4), with the matrix $D_\beta = \mathrm{diag}(\beta, 1, 1)$, where $\beta > 0$ denotes the artificial compressibility constant. The solution of the nonlinear problem is then found as a solution of the steady-state problem in the dual time $\tau$, i.e.

$D_\beta w_\tau = -\tilde R(w),$   (6)

where $\tilde R(w)$ is the unsteady residuum defined by

$\tilde R(w) = D\,\frac{\alpha_0^n w + \alpha_1^n w^n + \alpha_2^n w^{n-1}}{\Delta t_n} + R(w),$

and $R(w)$ is the steady residuum defined by

$R(w) = \left(F^c - \tfrac{1}{Re}F^v\right)_x + \left(G^c - \tfrac{1}{Re}G^v\right)_y.$

The required real-time accurate solution at the time level $n+1$ satisfies $\tilde R(w^{n+1}) = 0$, and it is found by marching equation (6) to a steady state in the dual time $\tau$.
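The coefficients of the variable-step BDF2 formula (5) depend only on the step-size ratio θ. The short sketch below (Python, assuming nothing beyond the formulas above) verifies the two consistency properties such a formula must have: the coefficients annihilate constants and reproduce the derivative of a linear function exactly.

```python
def bdf2_coeffs(dt_n, dt_nm1):
    """Coefficients of the variable-step BDF2 formula (5):
    dw/dt(t_{n+1}) ~ (a0*w^{n+1} + a1*w^n + a2*w^{n-1}) / dt_n."""
    theta = dt_n / dt_nm1
    a0 = 2.0 - 1.0/(1.0 + theta)       # = (1 + 2*theta)/(1 + theta)
    a1 = -(1.0 + theta)
    a2 = theta**2/(1.0 + theta)
    return a0, a1, a2

# Consistency checks for dt_n = 0.01 and dt_{n-1} = 0.02.
a0, a1, a2 = bdf2_coeffs(0.01, 0.02)
assert abs(a0 + a1 + a2) < 1e-14                 # constants are annihilated
t_np1, t_n, t_nm1 = 0.03, 0.02, 0.0              # w(t) = t has derivative 1
assert abs((a0*t_np1 + a1*t_n + a2*t_nm1)/0.01 - 1.0) < 1e-12
```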
System (4) is equipped with boundary conditions on $\partial\Omega_t$. The Dirichlet boundary condition $u = u_\infty = (U_\infty, 0)$ is prescribed on the inlet $\Gamma_D$, whereas on the outlet $\Gamma_O$ a mean value of the pressure $p$ is specified. The no-slip boundary condition is used on the walls, i.e. $u = w$ is prescribed on $\Gamma_{Wt}$, where $w$ denotes the velocity of the moving wall.

3.2 Numerical scheme

The governing equations for the dual-time approach are represented by Eq. (6). The arbitrary Lagrangian-Eulerian method, satisfying the so-called geometric conservation law, is used for the application in the case of moving meshes, see [19]. The integral form of equation (4) in the ALE formulation is given by

$\frac{\partial}{\partial t}\int_{C_i(t)} Dw\,\mathrm{d}x\,\mathrm{d}y + \int_{\partial C_i}\left(\tilde F^c\,\mathrm{d}y - \tilde G^c\,\mathrm{d}x\right) - \int_{\partial C_i}\frac{1}{Re}\left(F^v\,\mathrm{d}y - G^v\,\mathrm{d}x\right) = 0,$   (7)

where $\tilde F^c = \tilde F^c(w, w_1) = F^c(w) - w_1 Dw$, $\tilde G^c = \tilde G^c(w, w_2) = G^c(w) - w_2 Dw$, $(w_1, w_2)^T$ is the domain velocity, and $C_i = C_i(t)$ is a control volume, which moves in time with the domain velocity, see [19].

In what follows, let us assume that $\Omega_t$ is a polygonal approximation of the domain occupied by the fluid at time $t$, discretized by a mesh consisting of a finite number of cells $C_j(t)$ satisfying $\Omega_t = \bigcup_{j\in J} C_j(t)$. Let us denote by $N(i)$ the set of all neighbouring cells $C_j(t)$ of the cell $C_i(t)$, i.e. the set of all cells whose intersection with $C_i(t)$ is a common part of their boundary, denoted by $\Gamma_{ij}(t)$. Only quadrilateral cells are considered in the computations; in this case, the set $\Gamma_{ij}(t)$ is the common side of the cells $C_i(t)$ and $C_j(t)$. The mean value of the vector $w$ over the cell $C_i(t)$ at the time instant $t_n$ is approximated by $w_i^n$, and the symbol $w^n$ denotes the vector formed by the collection of all $w_i^n$, $i \in J$. In addition, the numerical approximation depends on the ALE mapping $A_t$ and on the domain velocity, see also [19]. In order to show this dependence, we shall denote at the time instant $t_n$ the vector of grid point positions by $x^n$ and the volume of the cell $C_i(t_n)$ by $|C_i^n| = |C_i(t_n)|$. Thus the resulting equation for the $i$-th finite volume cell reads

$D\,\frac{\partial\left(|C_i(t)|\,w_i\right)}{\partial t} + R_i(w, t) = 0,$   (8)

with $R_i(w, t) = F_i^c(w(t), x(t), w(t)) - F_i^v(w(t), x(t))$, where $F_i^c$ and $F_i^v$ represent the numerical approximations of the convective and diffusive fluxes, respectively. Further, the BDF2 formula is used for the approximation of the time derivative in (7) by the difference

$\delta_t^{n+1} w_i = \frac{1}{\Delta t_n}\left(\alpha_0^n |C_i^{n+1}|\,w_i^{n+1} + \alpha_1^n |C_i^n|\,w_i^n + \alpha_2^n |C_i^{n-1}|\,w_i^{n-1}\right).$

In order to solve the arising nonlinear problem, an additional term related to the artificial time $\tau$ is added to the scheme, so that to find the solution on the time level $t_{n+1}$ the following problem, written in the ALE formulation, needs to be solved:

$D_\beta\,|C_i^{n+1}|\,\frac{\partial w_i^{n+1}}{\partial\tau} + D\,\delta_t^{n+1} w_i = -R_i^{n+1}(w^{n+1}),$   (9)

where

$R_i^{n+1}(w^{n+1}) = \omega_1 F_i^{c,n+1/2}(w^{n+1}) + \omega_2 F_i^{c,n-1/2}(w^{n+1}) - F_i^{v,n+1}(w^{n+1}) + AD_i^{n+1}(w^{n+1})$

and $\omega_1 = \alpha_{n+1}$, $\omega_2 = -\alpha_{n-1}/\theta$. The symbol $AD_i^{n+1}(w^{n+1})$ stands for the artificial dissipation term (see [15]). The approximation of the convective fluxes $F_i^{c,n\pm 1/2}$ is given by

$F_i^{c,m}(w^{n+1}) = \sum_{j\in N(i)}\left(\tilde F^c(w_{ij}^{n+1}, w_1^m)\,\Delta y_{ij}^m - \tilde G^c(w_{ij}^{n+1}, w_2^m)\,\Delta x_{ij}^m\right),$

where $w_{ij}^{n+1} = (w_i^{n+1} + w_j^{n+1})/2$, $x^{n+1/2} = (x^{n+1} + x^n)/2$, $w^{n+1/2} = (x^{n+1} - x^n)/\Delta t_n$, and $\Delta x_{ij}^m$, $\Delta y_{ij}^m$ are the $x$- and $y$-coordinate differences of the segment $\Gamma_{ij}^m$. The approximation of the fluxes given by $R_i(w^{n+1})$ guarantees the satisfaction of the GCL, see [19].
The approximation of the viscous fluxes is evaluated on the mesh configuration at the time instant $t_{n+1}$ and reads

$F_i^{v,n+1}(w^{n+1}) = \frac{1}{Re}\sum_{j\in N(i)}\left(\hat F^v|_{ij}^{n+1}\,\Delta y_{ij}^{n+1} - \hat G^v|_{ij}^{n+1}\,\Delta x_{ij}^{n+1}\right),$

where $\hat F^v|_{ij}^{n+1} = \left(0,\ \hat u_x|_{ij}^{n+1},\ \hat v_x|_{ij}^{n+1}\right)^T$, $\hat G^v|_{ij}^{n+1} = \left(0,\ \hat u_y|_{ij}^{n+1},\ \hat v_y|_{ij}^{n+1}\right)^T$, and $\hat u_x|_{ij}^{n+1}$, $\hat u_y|_{ij}^{n+1}$, $\hat v_x|_{ij}^{n+1}$, $\hat v_y|_{ij}^{n+1}$ are the approximations of the first partial derivatives of $u$ and $v$ with respect to $x$ and $y$, evaluated with the use of dual finite volume cells, see [15].

Finally, to find the steady-state solution of (9) in the dual time $\tau$, the four-stage Runge-Kutta scheme is used. Denoting $\tilde R_i^{n+1}(w^{n+1}) = D\,\delta_t^{n+1} w_i^{n+1} + R_i(w^{n+1})$, equation (9) can be written in the form

$\frac{\partial w_i^{n+1}}{\partial\tau} = -D_\beta^{-1}\,\frac{1}{|C_i^{n+1}|}\,\tilde R_i(w^{n+1})$

and solved with the aid of the Runge-Kutta method, cf. [15].
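The dual-time iteration can be summarized in a few lines. The sketch below is a hypothetical illustration, not the authors' code: the residual assembly `res_tilde`, the low-storage stage weights and the convergence test are all assumptions; it marches equation (9) to a steady state in the dual time τ with a four-stage Runge-Kutta loop.

```python
import numpy as np

def dual_time_march(res_tilde, W, vol, beta, dtau, n_inner=200, tol=1e-8):
    """March equation (9) to a steady state in the dual time tau.
    `res_tilde(W)` assembles the unsteady residual R~_i(W) (BDF2 term plus
    space residual) and stands in for the flow solver; `vol` holds the cell
    volumes |C_i^{n+1}|; W has shape (n_cells, 3) for (p, u, v)."""
    d_beta_inv = np.array([1.0/beta, 1.0, 1.0])      # D_beta^{-1}
    for _ in range(n_inner):
        W0 = W.copy()
        for a in (1/4, 1/3, 1/2, 1.0):               # four RK stages
            W = W0 - a*dtau*(res_tilde(W)*d_beta_inv)/vol[:, None]
        if np.max(np.abs(res_tilde(W))) < tol:       # R~(W^{n+1}) ~ 0
            break
    return W
```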
4 Finite element method

4.1 Mathematical model of the flow

For the application of the finite element method, the flow was described by the incompressible Navier-Stokes equations written in the ALE form

$\frac{D^A u}{Dt} - \frac{1}{Re}\Delta u + \left((u - w)\cdot\nabla\right)u + \nabla p = 0,\qquad \operatorname{div} u = 0\quad\text{in }\Omega_t\subset\mathbb{R}^2,$   (10)

where $\frac{D^A}{Dt}$ is the ALE derivative, $u$ denotes the velocity vector, $p$ denotes the non-dimensional pressure, $w$ is the domain velocity known from the ALE method, and $Re$ denotes the Reynolds number. Furthermore, equation (10) is equipped with the boundary conditions

a) $u(x, t) = u_D(x)$ for $x\in\Gamma_D$,
b) $u(x, t) = w(x, t)$ for $x\in\Gamma_{Wt}$,   (11)
c) $-pn + \frac{1}{2}(u\cdot n)^- u + \nu\frac{\partial u}{\partial n} = 0$ on $\Gamma_O$,

where $n$ denotes the unit outward normal vector, $u_D$ is the Dirichlet boundary condition, and $(\alpha)^-$ denotes the negative part of a real number $\alpha$, see also [2], [14]. Further, system (10) is equipped with the initial condition $u(x, 0) = u_0(x)$.

4.2 Finite element approximation

FEM is well known as a general discretization method for partial differential equations. However, a straightforward application of FEM procedures often fails in the case of the incompressible Navier-Stokes equations. The reason is that the momentum equations are of advection-diffusion type with dominating advection, and the Galerkin FEM leads to unphysical solutions if the grid is not fine enough in regions of strong gradients (e.g. in the boundary layer). In order to obtain physically admissible correct solutions, it is necessary to apply suitable mesh refinement (e.g. an anisotropically refined mesh, cf. [8]) combined with a stabilization technique, cf. [5], [27]. In this work, FEM is stabilized with the aid of the streamline upwind/pressure stabilizing Petrov-Galerkin (SUPG/PSPG) method (the so-called fully stabilized scheme, cf. [13]), modified for the application on moving domains (cf. [28]).

In order to discretize problem (10), we approximate the time derivative by the second-order backward difference formula

$\frac{D^A u}{Dt}(x, t) \approx \frac{3u^{n+1} - 4\hat u^n + \hat u^{n-1}}{2\Delta t},$

where $u^{n+1}$ is the flow velocity at time $t_{n+1}$ defined on the computational domain $\Omega^{n+1}$, and $\hat u^k$ is the flow velocity at time $t_k$, defined on $\Omega^k$ and transformed onto $\Omega^{n+1}$. Further, equation (10) is formulated weakly, and the solution is sought in a couple of finite element spaces $W_\Delta\subset H^1(\Omega^{n+1})$ and $Q_\Delta\subset L^2(\Omega^{n+1})$ for the approximation of the velocity components and the pressure, respectively; the subspace of the test functions is denoted by $X_\Delta\subset W_\Delta$. Let us mention that the finite element spaces should satisfy the Babuška-Brezzi (BB) condition (see e.g. [25], [26] or [29]).

In practical computations we assume that the domain $\Omega = \Omega^{n+1}$ is a polygonal approximation of the region occupied by the fluid at time $t_{n+1}$, and the finite element spaces are defined over a triangulation $T_\Delta$ of the domain $\Omega$ as piecewise polynomial functions. In our computations, the well-known Taylor-Hood P2/P1 conforming finite element spaces are used for the velocity/pressure approximation. This means that the pressure approximation $p = p_\Delta$ and the velocity approximation $u = u_\Delta$ are linear and quadratic vector-valued functions on each element $K\in T_\Delta$, respectively.

The stabilized discrete problem at the time instant $t = t_{n+1}$ reads: find $U = (u, p)\in W_\Delta\times Q_\Delta$, $p := p^{n+1}$, $u := u^{n+1}$, such that $U$ satisfies approximately the conditions (11 a-b) and

$a(U; U, V) + L(U; U, V) + P(U, V) = f(V) + F(U; V)$   (12)

holds for all $V = (v, q)\in X_\Delta\times Q_\Delta$. Here, the Galerkin terms are defined for any $U = (u, p)$, $V = (v, q)$, $U^* = (u^*, p^*)$ by

$a(U^*; U, V) = \frac{3}{2\Delta t}(u, v)_\Omega + \frac{1}{Re}(\nabla u, \nabla v)_\Omega + ((\bar w\cdot\nabla)u, v)_\Omega - (p, \nabla\cdot v)_\Omega + (\nabla\cdot u, q)_\Omega,$   (13)
$f(V) = \frac{1}{2\Delta t}(4\hat u^n - \hat u^{n-1}, v)_\Omega,$

where $\bar w = u^* - w^{n+1}$, and the scalar product in $L^2(\Omega)$ is denoted by $(\cdot, \cdot)_\Omega$. Further, the SUPG/PSPG stabilization terms are used in order to obtain a stable solution also for large Reynolds number values,

$L(U^*; U, V) = \sum_{K\in T_\Delta}\delta_K\left(\frac{3u}{2\Delta t} - \frac{1}{Re}\Delta u + (\bar w\cdot\nabla)u + \nabla p,\ \psi(V)\right)_K,$
$F(U^*; V) = \sum_{K\in T_\Delta}\delta_K\left(\frac{4\hat u^n - \hat u^{n-1}}{2\Delta t},\ \psi(V)\right)_K,$

where $\psi(V) = (\bar w\cdot\nabla)v + \nabla q$, $\bar w$ stands for the transport velocity at the time instant $t_{n+1}$, i.e. $\bar w = u^* - w^{n+1}$, and $(\cdot, \cdot)_K$ denotes the scalar product in $L^2(K)$. The term $P(U, V)$ is the additional grad-div stabilization defined by

$P(U, V) = \sum_{K\in T_\Delta}\tau_K(\nabla\cdot u, \nabla\cdot v)_K.$

Here, the choice of the parameters $\delta_K$ and $\tau_K$ is carried out according to [13] or [27] on the basis of the local element length $h_K$: $\delta_K = \delta^* h_K^2$ and $\tau_K = 1$. Furthermore, problem (12) is solved with the aid of Oseen linearizations, and the arising large system of linear equations is solved with the aid of a direct solver, e.g. UMFPACK (cf. [6]), where different stabilization procedures can easily be applied, even when anisotropically refined grids are employed. The equations describing the motion of the flexibly supported airfoil are discretized with the aid of the fourth-order Runge-Kutta method, and the coupled fluid-structure model is solved with the aid of a partitioned, strongly coupled scheme. This means that for every time step the fluid flow and the structure motion are approximated repeatedly, in order to converge to a solution which satisfies all interface conditions.

5 Numerical results

Numerical flow-induced vibration results for both FVM and FEM are presented for the NACA 0012 profile, taking into account the following parameters of the flowing air and of the profile: $m = 8.66\times 10^{-2}$ kg, $S_\varphi = -7.797\times 10^{-4}$ kg m, $I_\varphi = 4.87\times 10^{-4}$ kg m², $k_h = 105.1$ N/m, $k_\varphi = 3.696$ N m/rad, $d = 0.05$ m, $c = 0.3$ m, $\rho = 1.225$ kg/m³, $\nu = 1.5\times 10^{-5}$ m²/s. The elastic axis EA is located at 40 % of the chord. Figures 3–6 compare the profile motion numerically simulated by FVM and FEM. The angle of rotation $\varphi$ [°] and the vertical displacement $h$ [mm] of the profile in dependence on time $t$ [s] are presented for the upstream flow velocities $U_\infty = \{5, 10, 15, 20, 25, 35, 40\}$ m/s (the resulting Reynolds numbers are in the range from $10^5$ to $8\times 10^5$). The initial conditions $h(0) = -20$ mm, $\dot h(0) = 0$, $\varphi(0) = 6°$, $\dot\varphi(0) = 0$ were used.
The vibrations are evidently damped by the aerodynamic forces, and the decay of the oscillations is faster with increasing fluid velocity up to about $U_\infty = 35$ m/s (see Figures 3–5). Further, the system loses its static stability at about $U_\infty = 40$ m/s by divergence. A more severe instability occurs in the FEM simulation, see Figure 6, where the rotation angle reaches nearly 14° (at that moment the computation failed due to mesh distortion). The aeroelastic response for this case computed by FVM tends instead to smaller static deflections (see Figures 5 and 6). However, the FVM results for the upstream velocity $U_\infty = 42.5$ m/s shown in Figure 7 clearly demonstrate the unstable behaviour of the system. Here, the vibration amplitudes increase rapidly and reach values of about 9° for the rotation and −70 mm for the vertical displacement just before the computation collapses. Note that the critical flow velocity for the aeroelastic instability computed by NASTRAN was 37.7 m/s (see [28]). This corresponds to the results of the numerical simulations presented here, since the system was found stable for the upstream flow velocity 35 m/s and unstable for the upstream flow velocity 40 m/s in the case of FEM, and 42.5 m/s in the case of FVM.

Further, the Fourier transformations of the rotation angle $\varphi(t)$ and vertical displacement $h(t)$ signals were computed. Figures 8 and 9 compare the frequency spectrum analysis of the FVM and FEM results for the far-field flow velocities $U_\infty = 5, 10, 15$ and 20 m/s. Two dominant frequencies can be seen for all the considered far-field velocities. The lower frequency $f_1 \approx 5.3$ Hz refers to the vertical translation of the profile, and the higher frequency $f_2 \approx 13.6$ Hz refers to the profile rotation around the elastic axis. When the flow velocity is increased, both resonances are more damped and the frequencies get closer ($f_1 \approx 5.5$ Hz, $f_2 \approx 12.8$ Hz), which is a typical phenomenon in flutter analyses. The spectra show very good agreement of the dominant frequencies between the FVM and FEM results for these flow velocities. The result of the Fourier transformations shown in Fig. 10 for $U_\infty = 25$ m/s is different, and it is difficult to identify the dominant frequencies precisely. This is above all due to the much higher aerodynamic damping, which is similar also for the cases $U_\infty = 35$–45 m/s (the results are not shown here).
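A frequency analysis of this kind can be reproduced from sampled h(t) and φ(t) signals with a discrete Fourier transform. A minimal sketch with NumPy follows; the Hann window, the mean removal and the synthetic test signal are our own assumptions, not details taken from the paper.

```python
import numpy as np

def spectrum(signal, dt):
    """Amplitude spectrum |G(f)| of a sampled signal, as plotted in
    Figures 8-10 for h(t) and phi(t)."""
    s = (signal - np.mean(signal))*np.hanning(len(signal))
    G = np.fft.rfft(s)
    f = np.fft.rfftfreq(len(signal), dt)
    return f, np.abs(G)/len(signal)

# For the stable cases one expects peaks near f1 ~ 5.3 Hz (translation)
# and f2 ~ 13.6 Hz (rotation); synthetic damped signal sampled at 1 kHz:
t = np.arange(0.0, 2.0, 1e-3)
h = np.exp(-0.8*t)*(np.sin(2*np.pi*5.3*t) + 0.3*np.sin(2*np.pi*13.6*t))
f, amp = spectrum(h, 1e-3)
print(f[np.argmax(amp)])          # dominant frequency, ~ 5.3 Hz
```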
6 Conclusions

Aeroelastic instability due to divergence appeared prior to flutter in both numerical approaches. The results of the numerical simulations in the time domain computed by FVM are in good qualitative agreement with the FEM computations of the same aeroelastic problem, and both numerical methods estimate the critical flow velocity in good agreement with the NASTRAN computation using a classical linear approach. The numerical results presented here, computed by FVM and FEM, are in very good quantitative agreement for upstream velocities up to 35 m/s; here, both computations fully and almost identically represent the aeroelastic behaviour of the system. The FVM and FEM results differ markedly when the upstream flow velocity reaches 40 m/s; the system is unstable for upstream velocities close to this value. The artificial numerical dissipation implemented in the FVM scheme may be responsible for the excessively strong influence of the flow, and consequently for the behaviour of the system, at upstream velocities close to 40 m/s.

The numerical scheme in FVM is implicit in real time but explicit in dual time, while the FEM method is fully implicit. This implies that the computations in FVM are more time consuming than in FEM. In the computations presented here, the mesh for FVM consisted of approximately 18 thousand cells and 54 thousand unknowns, and the computations took approximately 50–80 seconds for a single time step (depending on the far-field velocity) on a PC with an Intel quad-core processor and 4 GB of memory. The FEM computation was performed on a mesh with approximately 34 thousand elements and approximately 150 thousand unknowns, and the computations took approximately 30–50 seconds for a single time step. In both cases the time consumed is also influenced by the number of iterations of the coupling algorithm between the fluid and the structure. However, the implemented numerical scheme for FVM does not require any big matrices to be stored, unlike the fully implicit numerical scheme in FEM. Hence, the computer memory requirements are much more moderate for FVM than for FEM (approximately 1.2 GB of memory).

Acknowledgements

This work has been supported by the Research Plan of the Ministry of Education of the Czech Republic No. 6840770003 and by grants of the Czech Science Foundation No. P101/11/0207 and P101/12/1271. References appear after the figures.

Figure 3: Vertical displacement $h$ [mm] and angle of rotation $\varphi$ [°] in dependence on time $t$ [s] for $U_\infty = 5$ m/s ($Re = 10^5$, above) and 10 m/s ($Re = 2\times 10^5$, below). Solid line: FVM results, dashed line: FEM results.

Figure 4: Vertical displacement $h$ [mm] and angle of rotation $\varphi$ [°] in dependence on time $t$ [s] for $U_\infty = 15$ m/s ($Re = 3\times 10^5$, above) and 20 m/s ($Re = 4\times 10^5$, below). Solid line: FVM results, dashed line: FEM results.

Figure 5: Vertical displacement $h$ [mm] and angle of rotation $\varphi$ [°] in dependence on time $t$ [s] for $U_\infty = 25$ m/s ($Re = 5\times 10^5$, above) and 35 m/s ($Re = 7\times 10^5$, below). Solid line: FVM results, dashed line: FEM results.

Figure 6: Vertical displacement $h$ [mm] (above) and angle of rotation $\varphi$ [°] (below) in dependence on time $t$ [s] for $U_\infty = 40$ m/s ($Re = 8\times 10^5$); FVM and FEM results.
Figure 7: Vertical displacement $h$ [mm] and angle of rotation $\varphi$ [°] in dependence on time $t$ [s] for $U_\infty = 42.5$ m/s ($Re = 8.5\times 10^5$), computed using FVM.

Figure 8: Frequency spectrum analysis of the vertical displacement $h$ [mm] and angle of rotation $\varphi$ [°] signals for $U_\infty = 5$ (above) and 10 m/s (below). Solid line: FVM results, dashed line: FEM results.

Figure 9: Frequency spectrum analysis of the vertical displacement $h$ [mm] and angle of rotation $\varphi$ [°] signals for $U_\infty = 15$ (above) and 20 m/s (below). Solid line: FVM results, dashed line: FEM results.

Figure 10: Frequency spectrum analysis of the vertical displacement $h$ [mm] and angle of rotation $\varphi$ [°] signals for $U_\infty = 25$ m/s. Solid line: FVM results, dashed line: FEM results.

References
[1] Bisplinghoff R. L., Ashley H., Halfman R. L.: Aeroelasticity. New York: Dover, 1996.
[2] Bruneau Ch.-H., Fabrie P.: Effective downstream boundary conditions for incompressible Navier-Stokes equations, Int. J. Numer. Methods Fluids, 19, 8, 1994, 693–705.
[3] Chorin A. J.: A numerical method for solving incompressible viscous flow problems, J. Comput. Phys., 2, 1, 1967, 12–26.
[4] Chung K. W., He Y. B., Lee B. H. K.: Bifurcation analysis of a two-degree-of-freedom aeroelastic system with hysteresis structural nonlinearity by a perturbation-incremental method, J. Sound Vib., 320, 2009, 163–183.
[5] Codina R.: Stabilization of incompressibility and convection through orthogonal sub-scales in finite element methods, Comput. Methods Appl. Mech. Eng., 190, 2000, 1579–1599.
[6] Davis T. A., Duff I. S.: A combined unifrontal/multifrontal method for unsymmetric sparse matrices, ACM Trans. Math. Softw., 25, 1999, 1–19.
[7] Dessi D., Mastroddi F.: A nonlinear analysis of stability and gust response of aeroelastic systems, J. Fluids Struct., 24, 2008, 436–445.
[8] Dolejší V.: Anisotropic mesh adaptation technique for viscous flow simulation, East-West J. Numer. Math., 9, 2001, 1–24.
[9] Dowell E. H.: A Modern Course in Aeroelasticity. Dordrecht: Kluwer Academic Publishers, 1995.
[10] Feistauer M.: Mathematical Methods in Fluid Dynamics. Harlow: Longman Scientific & Technical, 1993.
[11] Feistauer M., Felcman J., Straškraba I.: Mathematical and Computational Methods for Compressible Flow. Oxford: Clarendon Press, 2003.
[12] Gaitonde A. L.: A dual-time method for two-dimensional unsteady incompressible flow calculations, Int. J. Numer. Methods Eng., 41, 1998, 1153–1166.
[13] Gelhard T., Lube G., Olshanskii M. A., Starcke J.-H.: Stabilized finite element schemes with LBB-stable elements for incompressible flows, J. Comput. Appl. Math., 177, 2005, 243–267.
[14] Heywood J. G., Rannacher R., Turek S.: Artificial boundaries and flux and pressure conditions for the incompressible Navier-Stokes equations, Int. J. Numer. Methods Fluids, 22, 1992, 325–352.
[15] Honzátko R.: Numerical Simulations of Incompressible Flows with Dynamical and Aeroelastic Effects. PhD dissertation, Czech Technical University in Prague, Faculty of Nuclear Sciences and Physical Engineering, Prague, 2007.
[16] Honzátko R., Horáček J., Kozel K.: Numerical solution of flow induced vibrations of a profile. In: Numerical Mathematics and Advanced Applications, ENUMATH 2007, Heidelberg: Springer, 2008, 547–554.
[17] Horáček J.: Nonlinear formulation of oscillations of a profile for aero-hydroelastic computations. In: Dynamics of Machines, Prague: Institute of Thermomechanics AS CR, 2003, 51–56.
[18] Jones D. P., Roberts I., Gaitonde A. L.: Identification of limit cycles for piecewise nonlinear aeroelastic systems, J. Fluids Struct., 23, 2007, 1012–1028.
[19] Koobus B., Farhat Ch.: Second-order time-accurate and geometrically conservative implicit schemes for flow computations on unstructured dynamic meshes, Comput. Methods Appl. Mech. Eng., 170, 1999, 103–129.
[20] Lesoinne M., Farhat Ch.: Geometric conservation laws for flow problems with moving boundaries and deformable meshes, and their impact on aeroelastic computations, Comput. Methods Appl. Mech. Eng., 134, 1996, 71–90.
[21] Librescu L., Marzocca P.: Advances in the linear/nonlinear control of aeroelastic structural systems, Acta Mech., 178, 2005, 147–186.
[22] Librescu L., Marzocca P., Silva W. A.: Aeroelasticity of 2-D lifting surfaces with time-delayed feedback control, J. Fluids Struct., 20, 2, 2005, 197–215.
[23] Naudasher E., Rockwell D.: Flow-Induced Vibrations. Rotterdam: A. A. Balkema, 1994.
[24] Paidoussis M. P.: Slender Structures and Axial Flow, Volume 1 of Fluid-Structure Interactions. San Diego: Academic Press, 1998.
[25] Raviart P.-A., Girault V.: Finite Element Methods for the Navier-Stokes Equations. Berlin: Springer, 1986.
[26] Sani R. L., Gresho P. M.: Incompressible Flow and the Finite Element Method. Chichester: Wiley, 2000.
[27] Sváček P., Feistauer M.: Application of a stabilized FEM to problems of aeroelasticity. In: Numerical Mathematics and Advanced Applications. Berlin: Springer, 2004, 796–805.
[28] Sváček P., Feistauer M., Horáček J.: Numerical simulation of flow induced airfoil vibrations with large amplitudes, J. Fluids Struct., 23, 3, 2007, 391–411.
[29] Verfürth R.: Error estimates for mixed finite element approximation of the Stokes equations, R.A.I.R.O. Analyse Numérique/Numerical Analysis, 18, 1984, 175–182.

Acta Polytechnica 53(Supplement):703–706, 2013; DOI: 10.14311/AP.2013.53.0703. © Czech Technical University in Prague, 2013. Available online at http://ojs.cvut.cz/ojs/index.php/ap

Computational schemes for the propagation of ultra high energy cosmic rays

Roberto Aloisio^{a,b,*}
a INAF, Osservatorio Astrofisico Arcetri, I-50125 Arcetri (Firenze), Italy
b INFN, Laboratori Nazionali del Gran Sasso, I-67010 Assergi (L'Aquila), Italy
* Corresponding author: aloisio@arcetri.astro.it

Abstract. We discuss the problem of the propagation of ultra high energy particles in astrophysical backgrounds. We present two different computational schemes based on kinetic and Monte Carlo approaches.
The kinetic approach is an analytical computation scheme based on the hypothesis of continuous energy losses, while the Monte Carlo scheme also takes into account the stochastic nature of particle interactions. These schemes, which give quite reliable results, enable the computation of fluxes keeping track of the different primary and secondary components, providing a fast and useful workbench for studying ultra high energy cosmic rays.

Keywords: particle astrophysics, ultra high energy cosmic rays, astrophysical backgrounds.

1. Introduction

Ultra high energy cosmic rays (UHECR) are the most energetic particles observed in nature, with energies up to several $10^{20}$ eV. Experimental studies of UHECR are currently being conducted in three different experiments: Auger in Argentina, and HiRes and Telescope Array in the USA. The propagation of UHECR from the source to the observer is conditioned by their interactions with astrophysical backgrounds: the cosmic microwave background (CMB) and the extragalactic background light (EBL). Understanding the key features of propagation is of paramount importance for interpreting experimental observations, paving the way for the discovery of the astrophysical origin of these fascinating particles.

Several features of the observed spectrum can be linked directly to the chemical composition of UHECR and to their sources [1, 2, 7, 9, 13]. One particularly important feature is the Greisen, Zatsepin and Kuzmin (GZK) suppression of the flux, an abrupt depletion of the observed proton spectrum arising at energies $E \simeq 5\times 10^{19}$ eV, due to the interaction of UHE protons with the CMB radiation field [9, 13]. The GZK suppression, as follows from the original papers, refers to protons, and it is due to the photopion production process on the CMB radiation field ($p + \gamma_{CMB} \to \pi + p$). In the case of nuclei the expected flux also shows a suppression at the highest energies which, depending on the nuclear species, is due to the photo-disintegration process on the CMB and EBL radiation fields ($A + \gamma_{CMB,EBL} \to (A - nN) + nN$) [3].

Another important feature in the spectrum that can be directly linked with the nature of the primary particles and their origin (galactic/extragalactic) is the pair-production dip [1, 7]. This feature is present only in the spectrum of UHE extragalactic protons and, like the GZK suppression, is a direct consequence of the interaction of protons with the CMB radiation field. In particular, the dip carries a direct imprint of the pair production process $p + \gamma_{CMB} \to p + e^+ + e^-$ suffered by protons.

From the experimental point of view the situation is far from clear, with different experiments claiming contradictory results. The HiRes experiment, which is no longer taking data, showed a proton-dominated spectrum up to the highest energies [10, 11], while the Auger observations show a heavy mass composition at energies $E > 4\times 10^{18}$ eV [5]. This puzzling situation, with different experiments favoring different scenarios, shows once more the importance of a systematic study of UHECR propagation in astrophysical backgrounds.

In the present paper we review the main points of two alternative computation schemes which enable the determination of the fluxes expected on Earth once the injection spectrum and the distribution of sources are fixed. These two schemes are based on different approaches to modeling the interactions between particles and backgrounds: the continuum energy losses (CEL) approximation, which forms the basis of the kinetic approach, and the Monte Carlo (MC) technique.
As we will discuss in the following, these two different schemes give reliable results that, within the framework of their different assumptions, agree with each other and offer a suitable theoretical framework for studying experimental results and unveiling the intimate nature of UHECR.

2. Kinetic equations

The main assumption on which the kinetic theory is built is the CEL approximation [6], through which particle interactions are treated as a continuum process that continuously depletes the energy of the particles. UHECR propagating through astrophysical backgrounds suffer different interaction processes:

• Protons: UHE protons interact only with the CMB radiation field, giving rise to the two processes of pair production and photo-pion production. Both of these reactions can be treated in the CEL hypothesis.
• Nuclei: UHE nuclei interact with the CMB and EBL radiation fields, suffering the process of pair production, for which only the CMB is relevant, and photo-disintegration, which involves both the CMB and EBL backgrounds. While the first process can be treated in the CEL hypothesis, the nuclear species being conserved, the second cannot, since it produces a change in the nuclear species.

Following Aloisio et al. [3], in the framework of the kinetic approach we will treat the photo-disintegration process as a "decaying" process that simply depletes the flux of the propagating nucleus. Taking into account all energy loss processes, we can describe the propagation of protons and nuclei through kinetic equations of the type

$\frac{\partial n_p(\Gamma, t)}{\partial t} - \frac{\partial}{\partial\Gamma}\left[b_p(\Gamma, t)\,n_p(\Gamma, t)\right] = Q_p(\Gamma, t),$   (1)

$\frac{\partial n_A(\Gamma, t)}{\partial t} - \frac{\partial}{\partial\Gamma}\left[n_A(\Gamma, t)\,b_A(\Gamma, t)\right] + \frac{n_A(\Gamma, t)}{\tau_A(\Gamma, t)} = Q_A(\Gamma, t),$   (2)

where $n$ is the equilibrium distribution of particles, $b$ are the energy losses (adiabatic expansion of the universe and pair/photo-pion production for protons, or only pair production for nuclei), and $Q$ is the injection of freshly accelerated particles and, in the case of nuclei, also the injection of secondary particles produced by photo-disintegration (see below). The energy losses $b$ for protons or nuclei depend only on the CMB field, and in the CEL hypothesis they can be computed analytically [1, 3, 7].

The second process that affects nuclei propagation is photo-disintegration on the CMB and EBL backgrounds. This process is treated as a decaying process that depletes the flux of nuclei. It enters the kinetic equation (see Eq. 2) through a sort of "life-time" of the nucleus under the photo-disintegration process. This "life-time" corresponds to the mean time needed for a nucleus of Lorentz factor $\Gamma$ and atomic mass number $A$ to lose at least one of its nucleons:

$\frac{1}{\tau_A} = \frac{c}{2\Gamma^2}\int_{\epsilon_0(A)}^{\infty}\mathrm{d}\epsilon_r\,\sigma(\epsilon_r, A)\,\nu(\epsilon_r)\,\epsilon_r\int_{\epsilon_r/(2\Gamma)}^{\infty}\mathrm{d}\epsilon\,\frac{n_{bkg}(\epsilon)}{\epsilon^2},$   (3)

where $\sigma(\epsilon_r, A)$ is the photo-disintegration cross-section, $\nu(\epsilon_r)$ is the multiplicity associated with this process, namely the average number of nucleons extracted from the nucleus by a single interaction, and $n_{bkg} = n_{CMB} + n_{EBL}$. The dependence of $\tau_A$ on red-shift follows directly from the evolution with red-shift of the background photon densities $n_{CMB}$ and $n_{EBL}$. In the case of the CMB this dependence is known analytically, while for the EBL one should refer to evolution models (in our computations we have used the model by Stecker et al. [12]).
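Equation (3) is a nested quadrature over the recoil energy and the background photon spectrum. A direct numerical evaluation for the CMB part of $n_{bkg}$ could look as follows (a sketch with SciPy; the cross-section sigma(eps_r) and multiplicity nu(eps_r) are placeholders for the nuclear model of [3], and the Planck-spectrum normalization 8π/(hc)³ is our own numerical input):

```python
import numpy as np
from scipy.integrate import quad

C_CM = 2.998e10          # speed of light [cm/s]
KT0 = 2.35e-4            # CMB temperature today, k_B*T [eV]
PLANCK = 1.318e13        # 8*pi/(hc)^3 in eV^-3 cm^-3

def n_cmb(eps, z=0.0):
    """CMB photon density per unit energy [photons cm^-3 eV^-1]."""
    kt = KT0*(1.0 + z)
    return PLANCK*eps**2/np.expm1(eps/kt)

def inv_tau(gamma, sigma, nu, eps0, z=0.0):
    """Inverse photo-disintegration lifetime, equation (3).
    sigma(eps_r) [cm^2], nu(eps_r) and the threshold eps0 = eps_0(A) [eV]
    are placeholders to be supplied by a nuclear model."""
    inner = lambda er: quad(lambda e: n_cmb(e, z)/e**2,
                            er/(2*gamma), np.inf)[0]
    outer = lambda er: sigma(er)*nu(er)*er*inner(er)
    return C_CM/(2*gamma**2)*quad(outer, eps0, np.inf, limit=200)[0]
```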
One important feature of the photo-disintegration process is that it starts to contribute to the propagation of nuclei at a Lorentz factor that is almost independent of the nuclear species, $\Gamma_{cr} \simeq 2\times 10^9$ [3]. This is an important general characteristic of the nuclei photo-disintegration process, from which we can immediately deduce the dependence on the nuclear species of the energy corresponding to the photo-disintegration suppression of the flux: $E^A_{cut} = A\,m_N\,\Gamma_{cr}$, $A$ being the atomic mass number of the nucleus and $m_N$ the proton mass. From this expression for $E^A_{cut}$ it is evident how the flux behavior can provide information on the chemical composition of UHECR. In the case of helium ($A = 4$), the suppression is expected around energies $E \simeq 10^{19}$ eV, while in the case of iron ($A = 56$) the suppression is expected at higher energies, $E \simeq 10^{20}$ eV.

Let us now discuss the generation function $Q_A(\Gamma, t)$ on the right-hand side of Eq. 2. One should distinguish between primary nuclei, i.e. nuclei accelerated at the source and injected into intergalactic space, and secondary nuclei and nucleons, i.e. particles produced as secondaries in the photo-disintegration chain. In the case of primaries the injection function is an assumption of the source model, while the injection of secondaries should be modeled taking into account the characteristics of the photo-disintegration process. The dominant process of photo-disintegration is one-nucleon ($N$) emission, namely the process $(A+1) + \gamma_{bkg} \to A + N$. This follows directly from the behavior of the photo-disintegration cross-section (see [3, and references therein]), which shows the giant dipole resonance corresponding to one-nucleon emission. Moreover, at the typical energies of UHECR ($E > 10^{17}$ eV) one can safely neglect the nucleus recoil, so that photo-disintegration conserves the Lorentz factor of the particles. The production rate of secondary $A$-nuclei and $A$-associated nucleons will therefore be given by

$Q_A(\Gamma, z) = Q^p_A(\Gamma, z) = \frac{n_{A+1}(\Gamma, z)}{\tau_{A+1}(\Gamma, z)},$   (4)

where $\tau_{A+1}$ is the photo-disintegration life-time of the father nucleus $(A+1)$ and $n_{A+1}$ is its equilibrium distribution, the solution of the kinetic equation (Eq. 2). Using Eq. 4 we can build a system of coupled differential equations that, starting from the primary injected nuclei ($A_0$), follows the complete photo-disintegration chain for all secondary nuclei ($A < A_0$) and nucleons. Clearly, secondary proton propagation will be described by the proper kinetic equation (Eq. 1) with an injection term given by Eq. 4. (Neutrons decay very fast into protons, so we will always refer to secondary protons.)

The solution of the kinetic equation for protons and nuclei can be worked out analytically. In the case of protons:

$n_p(\Gamma, z) = \int_z^{z_{max}}\frac{\mathrm{d}z'}{(1+z')H(z')}\,Q_p(\Gamma', z')\,\frac{\mathrm{d}\Gamma'}{\mathrm{d}\Gamma},$   (5)

$Q_p$ being the injection of primary protons or of secondary protons (Eq. 4), and $\Gamma' = \Gamma'(\Gamma, z')$ the characteristic function of the kinetic equation [3]. In the case of nuclei:

$n_A(\Gamma, z) = \int_z^{z_{max}}\frac{\mathrm{d}z'}{(1+z')H(z')}\,Q_A(\Gamma', z')\,\frac{\mathrm{d}\Gamma'}{\mathrm{d}\Gamma}\,e^{-\eta_A(\Gamma', z')},$   (6)

$Q_A$ being, again, the injection of primary or secondary (Eq. 4) nuclei.
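For illustration, equation (5) becomes a single quadrature once the characteristic Γ′(Γ, z′) is known. In the toy case of purely adiabatic losses one has Γ′ = Γ(1+z′) and dΓ′/dΓ = 1+z′, so the flux integral can be sketched as below; the power-law injection with index 2.2 and the flat ΛCDM parameters are assumptions used only for this example.

```python
import numpy as np
from scipy.integrate import quad

H0, Om, Ol = 2.27e-18, 0.3, 0.7                 # flat LCDM, H0 ~ 70 km/s/Mpc
H = lambda z: H0*np.sqrt(Om*(1+z)**3 + Ol)      # Hubble rate [1/s]

def n_p(gamma, z_max=3.0, Q=lambda g, z: g**-2.2):
    """Equation (5) in the toy case of purely adiabatic losses, where the
    characteristic is gamma' = gamma*(1+z') and d gamma'/d gamma = 1+z'."""
    integrand = lambda zp: Q(gamma*(1+zp), zp)*(1+zp)/((1+zp)*H(zp))
    return quad(integrand, 0.0, z_max)[0]

print(n_p(1e9))   # one point of the equilibrium proton distribution
```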
The exponential term in Eq. 6 represents the survival probability during the propagation time $t' - t$ for a nucleus with fixed $A$, and it can be computed according to Aloisio et al. [3]. The derivative term $\mathrm{d}\Gamma'/\mathrm{d}\Gamma$, present in both solutions Eq. 5 and Eq. 6, is given analytically [3].

3. Monte Carlo

The kinetic approach outlined above neglects fluctuations in the interactions, considering an (average) continuum loss of energy suffered by the particles. In the case of protons, this approximation has a limited effect on the flux computation, only at the highest energies ($E > 100$ EeV) [1, 7, 8]. In order to evaluate the effects of fluctuations on the expected nuclei flux, we have built a computation scheme alternative to the kinetic one, which uses the MC technique to simulate nuclei interactions. First of all, let us remark that fluctuations can be relevant only in the case of nuclei photo-disintegration. This follows from the fact that the pair-production process involving nuclei can be considered as an interaction process of the inside nucleon; therefore fluctuations in proton pair production are irrelevant [8], and the same holds for nuclei.

The SimProp MC simulation scheme that we have developed [4] is mono-dimensional: it does not take into account spatial distributions, tagging sources only through their distance from the observer (red-shift). The MC simulation propagates particles in steps of red-shift, following the injected nucleus and the secondary nuclei and protons produced at each photo-disintegration interaction, and calculates their losses down to the observer, placed at red-shift zero. The nuclear model on which SimProp is based is the same as is used for the kinetic approach (see [3, 4, and references therein]). The stochastic nature of the nuclei photo-disintegration process is modeled through the survival probability of a nucleus of atomic mass number $A$ and Lorentz factor $\Gamma$,

$P(\Gamma, z) = \exp\left(-\int_z^{z_*}\frac{1}{\tau_A(\Gamma, z')}\left|\frac{\mathrm{d}t}{\mathrm{d}z'}\right|\mathrm{d}z'\right),$   (7)

where $z$ and $z_*$ are the red-shift values of the current step (from $z_*$ to $z$). The SimProp code is designed in such a way that any red-shift distribution of sources and any injection spectrum can be simulated. This is achieved by drawing events from a flat distribution in the red-shift of the sources and in the logarithm of the injection energy. Once the event is recorded at $z = 0$, the actual source/energy distribution is recovered through a proper weight attributed to the event [4].

We will now compare the spectra obtained using SimProp [4] with the spectra calculated by solving the kinetic equations associated with the propagation of nuclei [3]. To pursue this comparison, a pure iron injection with a power-law injection of the type $\propto E^{-\gamma_g}$ with $\gamma_g = 2.2$ has been assumed. The sources have been assumed to be homogeneously distributed in the red-shift range $0 < z < 3$. In Figure 1 the fluxes expected at $z = 0$ are shown for iron and the secondary nuclei produced in the photo-disintegration chain suffered by the primary injected iron. The points refer to the SimProp results, while the continuous lines refer to the fluxes computed in the kinetic approach.

Figure 1: Flux of iron and secondary nuclei ($A = 50, 40, 30, 20, 10$) at $z = 0$ in the case of pure iron injection at the source with a power-law injection index $\gamma = 2.2$. Full squares correspond to the SimProp result [4], while continuous lines correspond to the solution of the nuclei kinetic equation of [3].

Good agreement between the two schemes is clearly visible in Figure 1. At the highest energies the path-length of iron nuclei is very short; therefore, to achieve good sampling in the MC simulation, higher statistics is needed. This is the reason for the larger error bars in the SimProp results at the highest energies.
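The stochastic step of a SimProp-like propagation can be sketched directly from equation (7): at each red-shift step the nucleus survives with probability P, otherwise an interaction is drawn. The code below is a toy mono-dimensional version: the lifetime tau(gamma, z) is a placeholder, H(z) uses assumed flat ΛCDM parameters, and the adiabatic update of Γ is the only energy loss retained.

```python
import numpy as np

rng = np.random.default_rng(0)
H0, Om, Ol = 2.27e-18, 0.3, 0.7            # flat LCDM (assumption)
H = lambda z: H0*np.sqrt(Om*(1+z)**3 + Ol)

def propagate(gamma, z_src, tau, dz=1e-3):
    """One MC history: step the nucleus from z_src down to the observer;
    at each step it survives photo-disintegration with the probability
    of eq. (7), where |dt/dz| = 1/((1+z)H(z))."""
    z = z_src
    while z > 0.0:
        dt = dz/((1.0 + z)*H(z))            # |dt/dz| dz, in seconds
        if rng.random() > np.exp(-dt/tau(gamma, z)):
            return gamma, z                  # interaction at this red-shift
        gamma *= 1.0 - dz/(1.0 + z)          # adiabatic loss only (toy)
        z -= dz
    return gamma, 0.0                        # reached the observer intact

# Placeholder lifetime (constant 1e16 s), purely for demonstration.
print(propagate(2e9, 0.5, lambda g, z: 1e16))
```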
Let us conclude by discussing why it is useful to go beyond the kinetic approach. The kinetic approach has the important feature of being analytical: the fluxes are computed mathematically by solving a first-principles equation [3]. This means that the flux of primaries and secondaries is expressed in terms of several integrals that can be computed numerically once the injection spectrum and the source distribution are specified. In particular, the flux of secondary nuclei and nucleons produced in the photo-disintegration chain of a primary $A_0$ is obtained by the numerical computation of $A_0$ nested integrals, and this computation has to be repeated each time the hypothesis on the sources (injection and distribution) is changed. This computation, while always feasible numerically, takes some time. The time can, however, be substantially reduced by using an MC computation scheme. This follows from the fact that, as discussed above, it is possible within the SimProp approach to simulate different source distributions and injection spectra without repeating the overall propagation of particles. In this sense, a faster computation scheme is provided by the MC approach presented here, which is the minimal stochastic extension of the kinetic approach.

Acknowledgements

I am grateful to all the collaborators with whom the works presented here were developed: V. Berezinsky, A. Gazizov and S. Grigorieva, from the theory group of the Gran Sasso laboratory; D. Boncioli, A. Grillo, S. Petrera and F. Salamida, Auger group of L'Aquila University; and P. Blasi, from the high energy astrophysics group of the Arcetri Astrophysical Observatory.

References
[1] R. Aloisio, V. Berezinsky, P. Blasi, A. Gazizov, S. Grigorieva and B. Hnatyk, Astropart. Phys. 27 (2007) 76
[2] R. Aloisio and D. Boncioli, Astropart. Phys. 35 (2011) 152
[3] R. Aloisio, V. Berezinsky and S. Grigorieva (2012a), Astropart. Phys., in press, DOI: 10.1016/j.astropartphys.2012.07.010; Astropart. Phys., in press, DOI: 10.1016/j.astropartphys.2012.06.003
[4] R. Aloisio, D. Boncioli, A. Grillo, S. Petrera and F. Salamida, arXiv:1204.2970 (2012b), JCAP, in press
[5] Auger Collaboration, Phys. Lett. B 685 (2010) 239; Phys. Rev. Lett. 104 (2010) 091101
[6] V. Berezinsky, S. Bulanov, V. Dogiel, V. Ginzburg and V. Ptuskin, Astrophysics of Cosmic Rays, North-Holland, 1990
[7] V. Berezinsky, A. Gazizov and S. Grigorieva (2006a), Phys. Rev. D 74 (2006) 043005
[8] V. Berezinsky, A. Gazizov and M. Kachelriess (2006b), Phys. Rev. Lett. 97 (2006) 23110
[9] K. Greisen, Phys. Rev. Lett. 16 (1966) 748
[10] HiRes Collaboration, Phys. Rev. Lett. 100 (2008) 101101
[11] HiRes Collaboration, Phys. Rev. Lett. 104 (2010) 161101
[12] F. W. Stecker, M. A. Malkan and S. T. Scully, ApJ 648 (2006) 774
[13] G. T. Zatsepin and V. A. Kuzmin, Pisma Zh. Experim. Theor. Phys. 4 (1966) 114

Discussion

Carlo Gustavino: Could the difference between Auger and the other experiments be due to the fact that they are looking at different hemispheres?

Roberto Aloisio: This is a hypothesis that was recently put forward. I personally do not believe in such an explanation, for the simple reason that at energies around $(2$–$3)\times 10^{19}$ eV, where the difference between Auger and HiRes already starts, the universe visible in UHECR has a huge scale, of the order of Gpc. Therefore it is very unlikely to have differences between observations carried out from the southern and northern hemispheres.
Acta Polytechnica 53(3):283–288, 2013. © Czech Technical University in Prague, 2013. Available online at http://ctn.cvut.cz/ap/

On the solvability of some partial differential inequality

Martin Himmel*
FB 08 - Institut für Mathematik, Johannes Gutenberg-Universität Mainz, Staudinger Weg 9, D-55099 Mainz, Germany
* Corresponding author: himmel@mathematik.uni-mainz.de

Abstract. The Dulac criterion is a classical method for ruling out the existence of periodic solutions in planar differential equations. In this paper the applicability, and therefore the reversibility, of this criterion is under consideration.

Keywords: dynamical systems, planar differential equations, limit cycles, Dulac's inequality, locally sufficient criteria.

1. Introduction and motivation

Let $X : D \to \mathbb{R}^2$ be a smooth vector-valued function with components $P$ and $Q$, $X = (P, Q)$, defined on some planar domain $D \subseteq \mathbb{R}^2$. Now consider the system of two ordinary differential equations

$\frac{\mathrm{d}x}{\mathrm{d}t} = P(x, y),\qquad \frac{\mathrm{d}y}{\mathrm{d}t} = Q(x, y).$   (1)

Such systems appear frequently in applications, e.g. in electrical engineering, physics, biology and many others, but they have also become a field of mathematical interest of their own. For simplicity, one can think of $P$ and $Q$ either just as polynomials in the two real variables $x$ and $y$, or as smooth or even analytic functions given by some power series that converges in $D$; i.e., $X$ is of class $C^r$ with $r \in \{1, 2, \ldots, \infty, \omega\}$. Geometrically speaking, the solutions of the system (1) are smooth curves in the plane that are tangent to the vector field $X$ at each point $z = (x, y) \in D$. If the components of $X$ are polynomials of degree one in $x$ and $y$, then

$X(x, y) = \begin{pmatrix} ax + by \\ cx + dy \end{pmatrix}$

with real constants $a, b, c, d \in \mathbb{R}$; the analysis of system (1) is then easy, and the solution curves of system (1) are given explicitly in terms of the matrix exponential function $e^A := \sum_{k=0}^{\infty}\frac{1}{k!}A^k$.
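This linear case is easy to explore numerically: the orbit through $z_0$ is $z(t) = e^{tA}z_0$, and SciPy provides the matrix exponential directly. A small sketch follows; the sample matrix is our own choice, a damped spiral with no closed orbits.

```python
import numpy as np
from scipy.linalg import expm

# X(z) = A z with eigenvalues -0.05 +- ~i: a damped spiral, no closed orbits.
A = np.array([[0.0, 1.0],
              [-1.0, -0.1]])

def orbit(t, z0):
    """Solution curve of the linear system (1): z(t) = e^{tA} z0."""
    return expm(t*A) @ z0

print(orbit(1.0, np.array([1.0, 0.0])))
```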
The situation changes drastically if $X$ is at least quadratic, i.e., if $P$ and/or $Q$ are polynomials of degree two or higher, $P, Q \in \mathbb{R}_n[x, y]$, $n \geq 2$. Many things are known about quadratic differential equations, see [1–3], but they are still a broad area of research. Quite a few people have dedicated their whole lives to the investigation of quadratic planar differential equations. Interest in this field is related to the 16th Hilbert problem, which is still unsolved even in the quadratic case.

1.1. Sixteenth Hilbert problem

Let $\mathbb{R}_d[x, y]$ denote the space given by polynomials in $x$ and $y$ of degree at most $d$, and consider a planar system (1) with polynomial right-hand side $X \in \mathbb{R}^2_d[x, y]$. The 16th Hilbert problem consists of two parts, the first of which is a purely algebraic problem dealing with questions related to the topology of algebraic curves and surfaces. In this paper we are mainly concerned with the second part of the 16th Hilbert problem, which deals with the curves that arise as solutions of the planar differential equation (1) with polynomial vector field $X$. More precisely, in the second part of his problem Hilbert asks for the maximal number and the relative position of the isolated closed orbits, called limit cycles¹, that the differential system (1) can have. It is even difficult to understand why a fixed planar differential equation with polynomial vector field can only have a finite number of limit cycles. A proof of this fact is due to Ilyashenko in 1991 [4], who actually corrected a very complicated and long proof due to Dulac [5], a student of Poincaré. Écalle et al. independently obtained a proof of this theorem [6]. Note that this does not imply the existence of an upper bound for the maximal number of limit cycles $H(d)$ which a planar polynomial system of degree $d$ can have. One calls $H(d)$ the $d$-th Hilbert number. Linear vector fields have no limit cycles; hence $H(1) = 0$. Quadratic systems can actually have four limit cycles, and some people believe that this is the maximal number of limit cycles a quadratic system can have, but it is still unknown whether or not $H(2)$ is a finite number. Usually, the first part of the 16th Hilbert problem is studied by researchers in real algebraic geometry, while the second part is considered by mathematicians working in dynamical systems or differential equations. Hilbert also pointed out that there possibly exist connections between these two parts. See [7], and [8] for the original paper in Russian by Ilyashenko, and, more recently, [9] for a survey about the second part of the 16th Hilbert problem.

¹ In the past, the term limit cycle was used for a stable isolated closed orbit or just for a closed orbit. Nowadays, and in this paper, a limit cycle is a closed orbit that is isolated in the set of all periodic orbits of some differential equation.

1.2. Dulac criterion

In 2008, the author was confronted with the analysis of a quadratic polynomial differential equation. In the literature there existed a quite technical proof showing the uniqueness of a limit cycle for this system. Then, fortunately, he was able to obtain the same result by applying the well-known

Theorem 1 (Dulac criterion [5]). Let $\Omega \subseteq \mathbb{R}^2$ be a simply connected region, $X := (P, Q) \in C^1(\Omega, \mathbb{R}^2)$ a smooth vector field and $B \in C^1(\Omega, \mathbb{R})$ a smooth real-valued function such that the partial differential inequality

$\operatorname{div}(BX) := \frac{\partial(BP)}{\partial x} + \frac{\partial(BQ)}{\partial y} > 0$   (2)

is satisfied in $\Omega$. Then the planar ordinary differential equation (1) with $X$ as right-hand side does not possess any periodic solution that is fully contained in $\Omega$.

Remark 1.
(1.) The proof of Theorem 1 is indirect: one assumes the existence of a closed orbit of (1) in $\Omega$ and applies the divergence theorem of Gauß. This immediately gives a contradiction.
(2.) In 2009, the author was able to weaken the assumptions made on the function $B$, see [10]. Essentially, the statement of Theorem 1 still holds if the function $B$ is only weakly differentiable of first degree and equation (2) holds only almost everywhere in $\Omega$, compare Definition 1.
(3.) If the domain $\Omega \subseteq \mathbb{R}^2$ referred to in Theorem 1 is not simply connected but $p$-connected for some $p \in \mathbb{N}$, $p \geq 2$, then there can be at most $p - 1$ closed orbits fully contained in $\Omega$.
(4.) The Dulac criterion is a generalization of the Bendixson criterion [11]. In fact, the Bendixson criterion follows from Theorem 1 if we choose $B = 1$, the function that is constant $1$ for all $z \in \Omega$.
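Verifying condition (2) for a guessed multiplier $B$ is indeed a purely mechanical computation, as the following sketch illustrates on a classical textbook example (a competing-species system, not an example from this paper), using SymPy:

```python
import sympy as sp

x, y = sp.symbols('x y', positive=True)                 # open first quadrant
a, b, c, d, A_, B_ = sp.symbols('a b c d A B', positive=True)

# Competing-species field (classical textbook example).
P = x*(A_ - a*x - b*y)
Q = y*(B_ - c*x - d*y)
Bmul = -1/(x*y)                                         # candidate multiplier

div_BX = sp.simplify(sp.diff(Bmul*P, x) + sp.diff(Bmul*Q, y))
print(div_BX)   # -> a/y + d/x, strictly positive on the quadrant, so by
                # Theorem 1 no periodic orbit lies entirely in x > 0, y > 0
```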
Definition 1 (Dulac function). The multiplier $B \in W^{1,p}(\Omega, \mathbb{R})$ is called a Dulac function of the planar dynamical system (1) in $\Omega \subseteq D$ if and only if there is a real-valued continuous function $g > 0$, having a positive sign in $\Omega$ except on a set of Lebesgue measure zero, such that the equation

$\operatorname{div}(BX) = B\cdot\operatorname{div}X + \langle\nabla B, X\rangle = g$   (3)

holds in $\Omega$. Here $W^{1,p}(\Omega, \mathbb{R})$ denotes the Sobolev space of $L^p$ functions that are weakly differentiable of first degree and with a weak derivative in $L^p$. Vice versa, we call any function $B$ of class $W^{1,p}$ satisfying (3) a $g$-Dulac function of $X$ in $\Omega$.

In general it is very difficult to find a Dulac function for some given vector field $X$, but if one incidentally guesses such a function, it is quite easy to verify that it obeys condition (2).² Just out of curiosity, the author wondered whether the converse statement of Theorem 1 is also true.

Question 1. Given a smooth planar vector field $X$, assume that the planar differential equation (1) does not have any periodic solution in some simply connected domain $\Omega$. Does there exist a Dulac function in $\Omega$?

The answer is no if the boundary of $\Omega$ is formed by a periodic solution of the corresponding system, but in every other case considered by the author he was able to obtain an affirmative answer to Question 1 raised above. In the following we quote some of these positive results.

² One can compare this problem to the decomposition of a natural number into its prime factors: if some natural number $n$ together with some product $p := \prod_{i=1}^{N} p_i^{d_i}$ of powers of prime numbers is given, it is quite easy to decide whether $p$ is the prime factorization of $n$, $p = n$; but, from our current state of knowledge, it is very difficult to decompose a big natural number into its prime factors.

2. Local results

Gradient systems never possess periodic solutions [12]; hence we expect gradient fields to have a Dulac function. Indeed, this is the case.

Theorem 2 (gradient fields [10]). Let $X = \nabla V$ be a globally defined gradient field with potential $V \in C^2(\mathbb{R}^2, \mathbb{R})$. Then $B_1 = \exp(V)$, $B_2 = \exp(-V)$ and $B_3 = V$ are local Dulac functions for $X$ in $\Omega_i \subset \mathbb{R}^2$, $i = 1, 2, 3$, with $\bigcup_{i=1}^{3}\Omega_i = \mathbb{R}^2$.

Proof. Define $B_1 = \exp(V)$, $B_2 = \exp(-V)$ and $B_3 = V$, and set $\Omega_i := \{z \in \mathbb{R}^2 \mid \operatorname{div}(B_i X)(z) > 0\}$. Verify that $\bigcup_{i=1}^{3}\Omega_i = \mathbb{R}^2$ indeed gives the whole plane.

A standard result from calculus says that vector fields can be straightened locally unless there is an equilibrium. This fact can be used to obtain a local existence result in domains not containing zeros of the vector field. Such domains are called canonical regions.

Theorem 3 (parallel flow [13]). Let $X : D \subseteq \mathbb{R}^2 \to \mathbb{R}^2$ be any smooth vector field and $\Omega$ a canonical region of $X$, i.e., a region where the flow of system (1) is equivalent to a parallel flow in the sense of Neumann [14]. Then, under some integrability assumptions, $X$ has a Dulac function in $\Omega$.

Proof. By assumption there are no equilibria in the domain; hence the vector field can be straightened locally. Observe that by this process of straightening the planar system (1) decouples, and the partial differential equation (3) from the definition of a Dulac function reduces to the linear ordinary differential equation

$B\operatorname{div}X + \alpha B_r = g,$

which implies $B_r = \frac{g - B\operatorname{div}X}{\alpha}$. This is an ordinary differential equation with the solution

$B = e^{-A(r)}\left(\int \frac{g}{\alpha}\,e^{A(r)}\,\mathrm{d}r + \kappa\right),$

where $A(r) = \frac{1}{\alpha}\int\operatorname{div}X\,\mathrm{d}r$ and $\kappa \in \mathbb{R}$ is a constant of integration.
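The proof of Theorem 2 is a direct computation that can be delegated to a computer algebra system. For a sample potential (our own choice, not from the paper) one obtains, for instance:

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
V = x**2 + y**2                          # sample potential (our choice)
P, Q = sp.diff(V, x), sp.diff(V, y)      # gradient field X = grad V

for B in (sp.exp(V), sp.exp(-V), V):
    div_BX = sp.simplify(sp.diff(B*P, x) + sp.diff(B*Q, y))
    print(B, '->', div_BX)
# exp(V)  -> (4x^2 + 4y^2 + 4) e^{V},  positive on all of R^2,
# exp(-V) -> (4 - 4x^2 - 4y^2) e^{-V}, positive inside the unit disc,
# V       -> 8x^2 + 8y^2,              positive away from the origin.
```

For this particular potential the set $\Omega_1$ already covers the whole plane; in general the three sets only cover $\mathbb{R}^2$ jointly, as the theorem states.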
let x(z) = az be a linear vector field where a has at most one eigenvalue zero. the linear system (1) does not have periodic orbits if and only if the spectrum of the matrix a consists only of eigenvalues with nonzero real part or zero, i.e., σ(a) ∩ ir ⊆ {0}.

proof. let the vector field be x(z) = az with a = (a b; c d) ∈ r^{2×2}. instead of the general partial differential inequality (2), the author considered

div(bx) = ‖x‖^2 := p^2 + q^2   (4)

and made the quadratic ansatz b = ½⟨z, gz⟩ for the dulac function, with some matrix g ∈ r^{2×2}. then equation (4) reduces to a^t g + g a + (tr a) · g = a^t a. letting s := a + ½(tr a) · i, one obtains

s^t g + g s = a^t a,   (5)

equation (5) being the well-known lyapunov equation, a special case of the more general sylvester equation. now one can apply the well-established solvability theory for the sylvester equation [15] and verify that the lyapunov equation (5) does indeed have a unique solution under the spectral assumptions that were made. on the other hand, one obtains by direct calculation

b = b_{20} x^2 + b_{02} y^2 + b_{11} xy + b_{10} x + b_{01} y + b_{00};   (6)

the values of the b_ij are shown in figure 1. the case tr a = a + d = 0 has to be examined with care. the reader may verify that having spectrum σ(a) = {0, ½ tr a} is equivalent to det a = (tr a / 2)^2. hence the quadratic dulac function from (6) does the job, because we assumed that at most one eigenvalue of a is zero. □

the hartman–grobman theorem combined with the last two results gives a local existence statement holding in some neighborhood of hyperbolic fixed points.

proposition 1. let x be a smooth vector field and z a hyperbolic zero of x, i.e., all eigenvalues of the linearization of x at z have nonzero real part. then there is a neighborhood u of z such that x has a dulac function on u.

remark 2. proposition 1 says, morally, that near hyperbolic equilibria one can define dulac functions, which is what we expected, since in dimension n = 2 a hyperbolic equilibrium is either a node (two real eigenvalues of the same sign), a saddle (two real eigenvalues of different sign) or a focus, sometimes also called a spiral point (two complex conjugate eigenvalues with non-zero real part). can there be a dulac function near a non-hyperbolic equilibrium? we believe so, unless the equilibrium is a center. recall that a non-hyperbolic equilibrium (two purely imaginary eigenvalues of opposite sign) can be either a center or a focus.

3. qualitative theory

let us recall now some basic definitions and results that are frequently used in the qualitative theory of planar differential equations. for a more detailed introduction we refer to [16] and [13]. we call system (1) integrable in a domain d if it has a first integral defined on this domain, i.e., a non-constant smooth scalar-valued function h of class c^k which is constant on each solution (x(t), y(t)) of (1) as long as it is defined. this means: if (x(t), y(t)) is any fixed solution of (1) defined for t ∈ [0, t_max] =: i_max, its maximal interval of existence, and h ∈ c^1(d, r) a first integral of system (1), then there is a real number h such that

h(x(t), y(t)) = h for all t ∈ i_max   (7)

is satisfied. taking the derivative of equation (7) with respect to time t, we see that any first integral h ∈ c^1(d, r) of (1) satisfies the linear partial differential equation

⟨∇h, x⟩ = p · h_x + q · h_y = 0.   (8)

conversely, every non-constant solution of equation (8) is a first integral h : d → r of (1).
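equation (8) is easy to test in a computer algebra system. the sketch below is ours, a minimal illustration rather than an example from the paper; the harmonic oscillator and the candidate integral h = x^2 + y^2 are a hypothetical test case:

import sympy as sp

x, y = sp.symbols('x y', real=True)

# hypothetical test case: the harmonic oscillator x' = y, y' = -x
P, Q = y, -x
H = x**2 + y**2                   # candidate first integral

# eq. (8): <grad H, X> = P*H_x + Q*H_y must vanish identically
print(sp.simplify(P*sp.diff(H, x) + Q*sp.diff(H, y)))                   # -> 0

# any smooth function of H solves (8) as well, e.g. exp(H)
print(sp.simplify(P*sp.diff(sp.exp(H), x) + Q*sp.diff(sp.exp(H), y)))   # -> 0

the second check illustrates the remark that follows: composing a first integral with an arbitrary smooth function again yields a solution of (8).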
if h_0 is a non-constant solution of (8), then every other solution is of the form f(h_0), where f is an arbitrary function having continuous partial derivatives (use the chain rule to verify this). first integrals are strongly related to the notion of integrating factors.

definition 2 (integrating factor). an integrating factor µ of the planar dynamical system (1) is a smooth solution µ ∈ c^1(ω, r) of the linear partial differential equation

div(µ · x) = µ · div x + ⟨∇µ, x⟩ = 0 in ω.   (9)

note that the first equality in (9) is due to the leibniz rule, and that div x, the divergence of the vector field x, denotes the trace of the jacobian of x, div x = ∂p/∂x + ∂q/∂y. if instead a first integral h : d → r of (1) is known, then, using equation (8), one can equivalently define an integrating factor as the common value of the ratios

µ(x, y) = h_y / p = −h_x / q,   (10)

i.e., an integrating factor must satisfy both h_y = µp and h_x = −µq. the latter two relations can be read as

dh(x, y) = µ(x, y) (p(x, y) dy − q(x, y) dx),   (11)

where dh(x, y) denotes the differential of h. thus, multiplying the right-hand side of (1) by an integrating factor makes the equation exact (an exact differential equation), which means that, whenever an integrating factor is available, the modified vector field µx has vanishing divergence and the problem of solving equation (1) is reduced to one-dimensional integration:

h(x, y) = ∫_{(x_0,y_0)}^{(x,y)} µ(x, y) (p(x, y) dy − q(x, y) dx).

note that the latter line integral might not be well defined if the domain d is not simply connected. for this reason, integrating factors are usually considered only in connected components of d. secondly, we observe that the vector fields x and µx have the same phase portrait (with maybe reversed orientation if µ is negative) as long as µ does not vanish. for this reason, solving system (1), i.e., constructing a first integral, and finding an integrating factor for it are considered to be equivalent problems. in applications the notion of an inverse integrating factor is very common.

definition 3 (inverse integrating factor). the function v ∈ c^1(ω, r) is called an inverse integrating factor for the planar system (1) in the domain ω ⊆ r^2 if µ = 1/v is an integrating factor of system (1) in ω \ {v = 0}. as usual, {v = 0} is short notation for the preimage of zero under v, v^{−1}(0) = {z ∈ ω : v(z) = 0}.

the method of integrating factors is, both historically and theoretically, a very important technique in the qualitative analysis of first order ordinary differential equations. the use of integrating factors goes back to leonhard euler (1707–1783). integrating factors and first integrals refer to the problem of integrating the planar system (1), which geometrically means nothing but finding smooth curves that are tangential to the vector field at each point. another interesting problem is then to derive the qualitative behavior of these solution curves, for instance their topology (whether they are closed or not), their asymptotics (whether they blow up in finite time or remain in some compact set and approach a limit cycle or an equilibrium point) and their stability properties.
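conditions (9) and (10) can again be verified mechanically. the following sympy sketch is our own illustration (the star node x' = x, y' = y on x ≠ 0 is a hypothetical test case, not an example from the paper):

import sympy as sp

x, y = sp.symbols('x y', real=True)

# hypothetical test case: the star node x' = x, y' = y, considered on x != 0
P, Q = x, y
mu = 1/x**2                       # candidate integrating factor
H = y/x                           # associated first integral

# eq. (9): div(mu*X) = 0
print(sp.simplify(sp.diff(mu*P, x) + sp.diff(mu*Q, y)))                      # -> 0

# eq. (10): H_y = mu*P and H_x = -mu*Q
print(sp.simplify(sp.diff(H, y) - mu*P), sp.simplify(sp.diff(H, x) + mu*Q))  # -> 0 0

in the language of definition 3, v = x^2 is an inverse integrating factor for this system; its vanishing set {x = 0} is exactly where the construction breaks down.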
4. a unifying point of view

definition 4 (invariant curve). let ω ⊆ r^2 be an open set of the plane. an invariant curve is the vanishing set, or preimage of zero, of some smooth function. more precisely, to a given function f ∈ c^1(ω, c) we associate the preimage of zero

f^{−1}(0) ≡ {f = 0} := { z ∈ ω | f(z) = 0 }   (12)

and call it an invariant curve of the vector field x = (p, q) ∈ c^1(d, r^2) if there is a smooth function k ∈ c^1(ω, c), called the cofactor of f, satisfying the relation

⟨∇f, x⟩ = k · f for all z ∈ ω,   (13)

or, more explicitly, f_x(z) · p(z) + f_y(z) · q(z) = k(z) · f(z). here ∇f = (∂f/∂x, ∂f/∂y)^t denotes the gradient of f, ⟨·,·⟩ is the canonical inner product of r^2, and the subscripts of f indicate partial derivatives, f_x = ∂f/∂x, f_y = ∂f/∂y. note that on the invariant curve {f = 0} the gradient ∇f is orthogonal to the vector field x, by the defining property (13) of invariant curves. by convention, the function f defining the invariant curve f^{−1}(0) ⊆ r^2 is called the invariant function.

the notion of exponential factors, a special case of invariant functions, is useful for studying the multiplicity of invariant curves. it allows the construction of first integrals for polynomial systems via the same method used by darboux.

definition 5 (exponential factor). given two coprime polynomials h, g ∈ r[x, y], the function exp(g/h) is called an exponential factor for system (1) if there is a polynomial k ∈ r[x, y] of degree at most d−1, d := max{deg p, deg q} being the degree of the polynomial system (1), satisfying the relation

⟨∇(e^{g/h}), x⟩ = k · e^{g/h}.   (14)

note that obviously {e^{g/h} = 0} = ∅, but {h = 0} defines an invariant curve for system (1).

definition 6 (darboux function). any function of the form

∏_{i=1}^{r} f_i^{λ_i} ∏_{j=1}^{l} (exp{g_j / h_j^{n_j}})^{µ_j},   (15)

where, for 1 ≤ i ≤ r and 1 ≤ j ≤ l, f_i(z) = 0 and h_j(z) = 0 are invariant curves for system (1), h_j is a polynomial of c[x, y], λ_i and µ_j are complex numbers and n_j is a natural number or zero, is called a darboux function.
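the cofactor in (13) is straightforward to compute symbolically. the sketch below is ours; the cubic system with the invariant unit circle is a classical textbook example used here as a hypothetical illustration, not one of the paper's systems:

import sympy as sp

x, y = sp.symbols('x y', real=True)

# classical example: the unit circle is a limit cycle of this cubic system
P = -y + x*(1 - x**2 - y**2)
Q =  x + y*(1 - x**2 - y**2)
f = x**2 + y**2 - 1               # candidate invariant function

# eq. (13): k = <grad f, X>/f must be smooth (here: a polynomial)
k = sp.cancel((sp.diff(f, x)*P + sp.diff(f, y)*Q)/f)
print(k)                          # -> -2*x**2 - 2*y**2

since the quotient is a polynomial, f = 0 — the unit circle — is an invariant curve with cofactor k = −2(x^2 + y^2), in accordance with remark 3 (6.) below.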
the following remark reminds the reader of some facts about invariant curves.

remark 3.
(1.) invariant curves are very important in the qualitative study of dynamical systems because they generalize the notion of integrating factors and dulac functions. thus, it is possible to interpret integrating factors and dulac functions as invariant functions to certain cofactors: an integrating factor is nothing but an invariant function having cofactor

k = −div x,   (16)

and a dulac function is an invariant function with cofactor

k = −div x + (1/f) · g,   (17)

g being a continuous function with g > 0 for almost every z ∈ ω. note that the latter interpretation of a dulac function as an invariant function to a specific cofactor makes sense only when f does not vanish, i.e., when the dulac function is not defined, respectively singular, on the invariant curve, the vanishing set of the invariant function. this already gives a clue to the natural boundaries of the maximal domain of definition of dulac functions.
(2.) an easy observation is that, if f and g are invariant functions with cofactors k_f and k_g, respectively, then their pointwise product f · g also defines an invariant curve, with cofactor k_f + k_g.
(3.) without loss of generality, we will always consider complex-valued invariant functions because, if f is an invariant function with cofactor k (with respect to some vector field in some domain), then its conjugate function f̄ is also an invariant function, having cofactor k̄, and therefore the product f · f̄ is a real-valued invariant function with cofactor k + k̄. the same holds for exponential factors.
(4.) in the case of polynomial planar vector fields, the algebraic part of the theory of invariant curves and exponential factors has already been developed. we quote some of these results and refer to [17], [18] and [19] for further reading. therefore, let x be a polynomial vector field of degree d, and assume that d(d+1)/2 + 1 different irreducible invariant algebraic curves are known. then one can construct a first integral of the form

h = f_1^{λ_1} · ... · f_s^{λ_s},   (18)

where each f_i(x, y) defines an invariant algebraic curve

f_i(x, y) = 0   (19)

for system (1) and λ_i ∈ c, not all of them null, for i = 1, 2, ..., s, s ∈ n. the functions of type (15) are called darboux functions.
(5.) the irreducibility of the invariant functions in the algebraic case must be replaced by the condition { p ∈ ω : f(p) = 0 and ∇f(p) = 0 } ⊆ { p ∈ ω : x(p) = 0 } in the non-algebraic case.
(6.) an easy observation: any invariant curve {f = 0} has exactly one cofactor, k = ⟨∇f, x⟩ / f.

5. open questions

in section 2, proposition 1, we obtained a local existence result for dulac functions near a hyperbolic equilibrium. the proof was based on theorem 4, where we calculated a dulac function explicitly in terms of a quadratic polynomial. in this proof we observed, in fact, why one needs to impose that at most one eigenvalue of the matrix a is zero: if we had two such eigenvalues, σ(a) = {0}, the matrix would have trace zero and we would have divided by it. then, by applying the hartman–grobman theorem, the result carried over to hyperbolic fixed points. observe that the hartman–grobman theorem does not hold for equilibria p whose linearization has a purely imaginary or zero spectrum, nor could we make any use of it, because in both cases div x(p) = 0 holds and our dulac function blows up. hence, in the future, the following question for nonlinear vector fields has to be addressed:

question 2. when does a dulac function exist near non-hyperbolic fixed points?

this question is somewhat naive, because one has to deal here with the center-focus problem, which is still not completely solved in general. however, the dulac approach may give a new perspective on it. why do we bother about all these local existence statements for dulac functions? in principle, the motivation arose from the following

algorithm 1.
input: a nonlinear planar vector field x.
output: number and position of periodic orbits, together with the phase portrait.
step 1: determine the zeros of x.
step 2: at each zero, define a dulac function locally.
step 3: extend these local dulac functions as far as possible.
if no further extension is possible, one has found the limit cycles and determined the phase portrait.

of course, this is rather a pseudo-algorithm, because one cannot yet accomplish any of its steps. one more approachable, but somewhat technical, step is to combine these local results into a global one. technically — as one has to deal here with different local coordinate representations — this means that one has to glue together different diffeomorphisms (cf. figure 2).

figure 2. gluing three different diffeomorphisms.

summing up, we claim that the dulac method in the spirit of algorithm 1 will give new insights into the qualitative theory of planar differential equations.

references

[1] i. burdujan. some geometrical aspects of the theory of quadratic differential equations. bul. inst. politehn. iaşi, secţ. i, 37(41)(1-4):39–44, 1991.
[2] d. k. kayumov. limit cycles of a class of quadratic differential equations.
in problems in the theory of ordinary differential equations (russian), pp. 25–29, 109–110. samarkand. gos. univ., samarkand, 1984.
[3] d. e. koditschek, et al. limit cycles of planar quadratic differential equations. j. differential equations 54(2):181–195, 1984.
[4] y. s. il'yashenko. finiteness theorems for limit cycles, vol. 94 of translations of mathematical monographs. american mathematical society, providence, ri, 1991. translated from the russian by h. h. mcfaden.
[5] h. dulac. sur les cycles limites. bull. soc. math. france 51:45–188, 1923.
[6] j. écalle, et al. non-accumulation des cycles-limites. i. c. r. acad. sci. paris sér. i math. 304(13):375–377, 1987.
[7] y. ilyashenko. centennial history of hilbert's 16th problem. bull. amer. math. soc. (n.s.) 39(3):301–354, 2002.
[8] y. s. il'yashenko. centennial history of hilbert's 16th problem. in fundamental mathematics today (russian), pp. 135–213. nezavis. mosk. univ., moscow, 2003.
[9] j. llibre. on the 16-hilbert problem. gac. r. soc. mat. esp., preprint, 2012.
[10] m. himmel. on the existence of periodic orbits of ordinary differential equations (transl.), pp. 43–46, 2009.
[11] a. a. andronov, et al. qualitative theory of second-order dynamic systems. halsted press (a division of john wiley & sons), new york–toronto, ont., 1973. translated from the russian by d. louvish.
[12] j. guckenheimer, p. holmes. nonlinear oscillations, dynamical systems, and bifurcations of vector fields. springer-verlag, 1983.
[13] f. dumortier, et al. qualitative theory of planar differential systems. universitext. springer-verlag, berlin, 2006.
[14] d. a. neumann. classification of continuous flows on 2-manifolds. proc. amer. math. soc. 48:73–81, 1975.
[15] r. bhatia, p. rosenthal. how and why to solve the operator equation ax − xb = y. bull. london math. soc. 29:1–21, 1997.
[16] i. a. garcía, et al. a survey on the inverse integrating factor. qual. theory dyn. syst. 9(1-2):115–166, 2010.
[17] j. llibre. integrability of polynomial differential systems. in handbook of differential equations, pp. 437–532. elsevier/north-holland, amsterdam, 2004.
[18] d. schlomiuk. algebraic particular integrals, integrability and the problem of the center. trans. amer. math. soc. 338(2):799–841, 1993.
[19] d. schlomiuk. algebraic and geometric aspects of the theory of polynomial vector fields. in bifurcations and periodic orbits of vector fields (montreal, pq, 1992), vol. 408 of nato adv. sci. inst. ser. c math. phys. sci., pp. 429–467. kluwer acad. publ., dordrecht, 1993.

acta polytechnica 53(3):283–288, 2013

acta polytechnica vol. 51 no. 1/2011

some formulas for legendre functions induced by the poisson transform

i. a. shilin, a. i. nizhnikov

abstract: using the poisson transform, which maps any homogeneous and infinitely differentiable function on a cone into a corresponding function on a hyperboloid, we derive some integral representations of the legendre functions.

keywords: legendre functions, lorentz group, poisson transform.

1 introduction

let us assume that the linear space r^{n+1} is endowed with the quadratic form q(x) := x_0^2 − x_1^2 − ... − x_n^2. we denote the polar bilinear form for q by q̂. the lorentz group so(n,1) preserves this form and divides r^{n+1} into orbits. we will deal with two kinds of these orbits. one of them is c := {x | q(x) = 0}; it is a cone.
the second kind of orbits consists of the two-sheet hyperboloids h(r) := {x | q(x) = r^2} for any r > 0. the group so(n,1) has 2 connected components. one of them contains the identity and will be under our consideration further on; we denote this subgroup by g. the action x ↦ g^{−1}x of the group g is transitive on c. let σ ∈ c and let d_σ be the linear subspace of c^∞(c) consisting of σ-homogeneous functions. it is useful to suppose throughout this paper that −n+1 < re σ < 0. we define the representation t_σ in d_σ by left shifts: t_σ(g)[f(x)] := f(g^{−1}x). suppose that γ is a contour on c intersecting all generatrices (i.e. all lines containing the origin). every point x ∈ γ depends on n−1 parameters, so every point x ∈ c can be represented as x_i = t f_i(ξ_1, ..., ξ_{n−1}), i = 1, ..., n+1. denoting by g̃ the subgroup of g which acts transitively on γ, we have

dx = t^{n−3} dt dγ,   (1)

where dγ is the g̃-invariant measure on γ. for any pair (d_σ, d_σ̃), we define the bilinear functionals f_γ : (d_σ, d_σ̃) → c, (f_1, f_2) ↦ ∫_γ f_1(x) f_2(x) dγ. the functional f_γ does not depend on γ if σ̃ = −σ − n + 1 because, first, we have formula (1); second, f_1 and f_2 are both homogeneous functions; and, third, the g-invariant measure on c can be represented in the form

dx = dx_{ζ(1)} ... dx_{ζ(n)} / |x_{ζ(n+1)}|,   (2)

where ζ ∈ s_{n+1}, the permutation group of the set {1, ..., n+1}. let f ∈ d_σ and y ∈ h(1). we refer to the integral transform π(f)(y) := f_γ(q̂^{−σ−n+1}(y, x), f) as the poisson transform [1].

(the research presented in this paper was supported by grant nk 586p-30 from the ministry of education and science of the russian federation.)

2 formulas related to the sphere and the paraboloid

let γ_1 be the intersection of the cone c and the plane x_0 = 1. each point x ∈ γ_1 depends on the spherical parameters φ_1, ..., φ_{n−1} by the formula x_s = ∏_{i=1}^{n−s} sin φ_i · cos φ_{n−s+1}, s ≠ 0, if the angle φ_{n−s+1} exists. here φ_{n−1} ∈ [0; 2π) and φ_1, ..., φ_{n−2} ∈ [0; π). the subgroup h_1 ≅ so(n) acts transitively on γ_1, and any permutation ζ ∈ s_{n+1} defines the h_1-invariant measure dγ_1 = dx_{ζ(2)} ... dx_{ζ(n)} / |x_{ζ(n+1)}|. the invariant measure in spherical coordinates is given by 9.1.1.(9) [2].

let γ_2 be the intersection of the cone c and the hyperplane x_0 + x_n = 1. we describe every point x ∈ γ_2 by the coordinates r, φ_1, ..., φ_{n−2} according to the formulas x_0 = (1 + r^2)/2, x_n = (1 − r^2)/2, x_s = r ∏_{i=1}^{n−s−1} sin φ_i · cos φ_{n−s}, s ∉ {0, n} (if the angle φ_{n−s} exists), where r ≥ 0, φ_{n−2} ∈ [0; 2π) and φ_1, ..., φ_{n−3} ∈ [0; π). we denote by h_2 the subgroup of g acting transitively on γ_2. h_2 consists of the matrices

n(b) = ( i_{n−1}  b^t  b^t ; −b  1−b*  −b* ; b  b*  1+b* ),

where i_{n−1} is the identity block, b = (b_1, ..., b_{n−1}) and b* = ½(b_1^2 + ... + b_{n−1}^2). it is not too hard to derive the h_2-invariant measure dγ_2 = r^{n−2} dr ∏_{i=1}^{n−2} sin^{n−i−2} φ_i dφ_i on γ_2.

let λ > 0, µ ∈ r, k_0 ≥ k_1 ≥ ... ≥ k_{n−2} ≥ 0, l_1 ≥ ... ≥ l_{n−2} ≥ 0, m_1 ≥ ... ≥ m_{n−2} ≥ 0, k = (k_0, k_1, ..., k_{n−3}, ±k_{n−2}), l = (l_1, ..., l_{n−3}, ±l_{n−2}), m = (m_1, ..., m_{n−3}, ±m_{n−2}). we will now deal with two bases in d_σ. one of them consists of the functions f^{σ,1}_k(x) = x_0^{σ−k_0} ξ^n_k(x), where k = (k_0, k_1, ..., k_{n−3}, ±k_{n−2}) ∈ z^{n−1}, k_i ≥ k_{i+1} ≥ 0, and

ξ^n_t(x) = ∏_{i=1}^{n−3} r_{n−i}^{t_i − t_{i+1}} · c^{(n−i)/2 − 1}_{t_i − t_{i+1}}(x_{n−i}/r_{n−i}) · (x_2 ± i x_1)^{t_{n−2}}

(here c^ν_m denotes a gegenbauer polynomial). the second basis consists of the functions

f^{σ,2}_{(l,λ)}(x) = (x_0 + x_n)^{σ + (n−3)/2} · (λ/2)^{l_1} (λ r_{n−1}/2)^{(3−n)/2 − l_1} · j_{l_1 + (n−3)/2}(λ r_{n−1}/(x_0 + x_n)) ξ^{n−1}_l(x),

where j_ν is the bessel function, r_j^2 = x_1^2 + ... + x_j^2, l = (l_1, ..., l_{n−3}, ±l_{n−2}) ∈ z^{n−2}, λ ≥ 0 and l_i ≥ l_{i+1} ≥ 0. suppose, in addition, that the functions of the above bases are equipped with the normalizing factors defined by the formulas [2, 9.4.1.7, 10.3.4.9]. let us consider the expansion

f^{σ,1}_k(x) = ∑_l ∫_0^{+∞} c^{σ,12}_{k,(l,λ)} f^{σ,2}_{(l,λ)} dλ.   (3)

from the orthogonality of the functions ξ^n_t, we obtain the property f_γ(f^{σ,1}_k, f^{−σ−n+1,1}_{−k̃}) = δ_{k,k̃}. from this property it immediately follows that c^{σ,12}_{k,(l,λ)} = f_γ(f^{σ,1}_k, f^{−σ−n+1,2}_{(l,λ)}). let γ = γ_1. then from the formula ∫_{−1}^{1} (1 − x^2)^{ν−1/2} c^ν_m(x) c^ν_n(x) dx = 0, where m ≠ n, re ν > −1/2, we derive

lemma 1. if ∑_{i=1}^{n−2} (k_i − l_i)^2 ≠ 0, then c^{σ,12}_{k,(l,λ)} = 0.

let us assume another situation.

lemma 2. if ∑_{i=1}^{n−2} (k_i − l_i)^2 = 0, then

c^{σ,12}_{k,(l,λ)} = 2^{−σ+n+3k_1−3} π^{−1} i^{k_1} (n + 2k_0 − 2)^{1/2} √((k_0 − k_1)!) λ^{k_1} γ((n−1)/2) γ((n−1)/2 + k_1) γ(n/2 + k_1 − 1) γ^{1/2}((n−1)/2) γ^{−1}(n + 2k_1 − 2) γ^{−1/2}(n + k_0 + k_1 − 2) ∑_{m=0}^{k_0−k_1} (−1)^m (m!)^{−1} γ(n + k_0 + k_1 + m − 2) γ^{−1}((n−1)/2 + k_1 + m) γ^{−1}(k_0 − k_1 − m − 1) γ^{−1}(−σ + k_1 + m) · g^{2,1}_{1,3}(λ^2/4 | −m; −σ + k_1 − 1, (n−3)/2 + k_1).

proof. suppose γ = γ_2. then we obtain the integral

∫_0^{+∞} r^{(n−1)/2 + l_1} (r^2 + 1)^{σ−k_1} c^{n/2 − k_1 − 1}_{k_0−k_1}((1 − r^2)/(1 + r^2)) j_{(n−3)/2 + l_1}(λr) dr,

which can be evaluated explicitly after the replacement r^k j_k(λr) = 2^k λ^{−k} g^{1,0}_{0,2}((λr/2)^2 | k, 0), according to formulas [3, 8.932.1, 8.932.2] and [4, 20.5.4]. □

theorem 1.

p^{−n/2+1}_{−σ−n/2}(cosh α) = 2^{2n−9/2} π^{−3/2} √(n−1) sinh^{n/2−1} α · e^{(σ+n−1)α} γ(n/2 − 1) γ((n+1)/2) γ^{−1}(−σ) γ^{−1/2}(n−1) ∫_0^{+∞} λ^{−n+3} g^{2,1}_{1,3}(λ^2/4 | 0; −σ−1, 0, (n−3)/2) g^{2,1}_{1,3}((λe^{−α})^2/4 | 0; σ − (n−1)/2, 0, (n−3)/2) dλ.

proof. suppose that the condition k_1 = l_1, ..., k_{n−2} = l_{n−2} holds. from the expansion (3) we obtain π(f^{σ,1}_k) = ∫_0^{+∞} c^{σ,12}_{k,(l,λ)} π(f^{σ,2}_{(l,λ)}) dλ. further, we evaluate π(f^{σ,1}_k) = f_{γ_1}(q̂^{−σ−n+1}(y, x), f^{σ,1}_k) and π(f^{σ,2}_{(l,λ)}) = f_{γ_2}(q̂^{−σ−n+1}(y, x), f^{σ,2}_{(l,λ)}), then consider the case y = (cosh α, 0, ..., 0, sinh α) and put k = (0, ..., 0). □

consider the case so(2,1) of the group so(n,1). in this case, k ≡ k and (l, λ) ≡ λ. the following theorem is related to this case.

theorem 2. if −1 < re σ < 0 and α ≠ 0, then

p^{−l+1/2}_{σ+1/2}(cosh α) = (−1)^{l−1} 2^{−σ−l/2−9/4} π^{−1/2} e^{−α} sin(−πσ) sinh^{l+1/2} α · ((cosh α + 1)/(cosh α − 1))^{l/2 + 1/4} γ(σ − l + 1) γ(l − 3/2) γ^{−1}(l + 1/2) ∫_0^∞ ρ^{−σ−1} k_{σ+1}(ρe^{−α}) ∑_{s=0}^∞ (−1)^s γ^{−2}(s + 1) γ^{−1}(s − σ) g^{2,1}_{1,3}(ρ^2/4 | −s; −σ−1, 0, 0) dρ,   (4)

where k_ν denotes the macdonald function.

proof. after repeating the proof of the previous theorem, we derive the following representation of the gauss hypergeometric function:

2f1(−σ − 1/2, σ + 3/2; 1/2 + l; (1 − cosh α)/2) = (−1)^{l−1} 2^{−σ−5/2} π^{−1/2} e^{−α} sinh α sin(−πσ) · ((cosh α + 1)/(cosh α − 1))^{l/2 + 1/4} γ(σ + 1 − l) γ(l − 3/2) · ∫_0^∞ λ^{−σ−1} k_{σ+1}(λe^{−α}) ∑_{s=0}^∞ (−1)^s γ^{−2}(s + 1) γ^{−1}(s − σ) g^{2,1}_{1,3}(λ^2/4 | −s; −σ−1, 0, 0) dλ.

now we use the formula [5, 7.3.1.88] for l = 0. □

3 formulas related to the paraboloid and the hyperboloid

let γ_{3+} be the intersection of the cone c and the plane x_n = 1. we denote by γ_{3−} the intersection of c and the plane x_n = −1. let γ_3 := γ_{3+} ∪ γ_{3−}. the contour γ_3 is a homogeneous space with respect to the subgroup h_3 ≅ so(n−1,1). if x belongs to γ_3, then x_n = ±1, x_0 = cosh t, x_s = sinh t ∏_{i=1}^{n−s−1} sin φ_i · cos φ_{n−s}, s ∉ {0, n} (if the angle φ_{n−s} exists), where t ∈ r, φ_{n−2} ∈ [0; 2π) and φ_1, ..., φ_{n−3} ∈ [0; π). any permutation ζ ∈ s_n determines the h_3-invariant measure dγ_3 = dx_{ζ(1)} ... dx_{ζ(n−1)} / |x_{ζ(n)}| on γ_3, so dγ_3 = cosh^{n−2} t dt ∏_{i=1}^{n−2} sin^{n−i−2} φ_i dφ_i.
let us now consider the basis consisting of the functions

f^{σ,3}_{(m,µ,±)}(x) = (x_n)_±^{σ + (n−3)/2} r_{n−1}^{(3−n)/2 − m_1} p^{(3−n)/2 − m_1}_{−1/2 + iµ}(x_0/x_n) ξ^{n−1}_m(x),

where (x_n)_±^{σ + (n−3)/2} is the generalized function defined as

(x_n)_±^{σ + (n−3)/2} = |x_n|^{σ + (n−3)/2} if sign x_n = ±1, and 0 if sign x_n ≠ ±1,

m = (m_1, ..., m_{n−3}, ±m_{n−2}) ∈ z^{n−2}, m_i ≥ m_{i+1} ≥ 0 and µ ∈ r. by analogy with the previous case, we can obtain the coefficients c_{k,(m,µ,+)}. let us suppose that n = 3 and k = (l, s), m ≡ m. from the expansion

f^{σ,3}_{m,µ,+}(x) = ∑_{l=0}^{∞} ∑_{s=−|l|}^{|l|} c_{l,s,m,µ,+} f^{σ,1}_{l,s}(x),

we have f^{σ,3}_{−s,µ,+}(x) = ∑_{l=0}^{∞} c_{l,s,−s,µ,+} f^{σ,1}_{l,s}(x) and, therefore,

π(f^{σ,3}_{−s,µ,+}) = ∑_{l=0}^{∞} ∑_{s=−|l|}^{|l|} c_{l,s,−s,µ,+} π(f^{σ,1}_{l,s}).   (5)

we choose γ_3 (in fact, γ_{3+}) on the left side of equality (5) and γ_1 on the opposite side. in accordance with our choice, we use two parametrizations of a point y ∈ h(1): y(v) = ((v + v^{−1})/2, 0, ..., 0, (v^{−1} − v)/2) and y(t) = (cosh t, 0, ..., 0, sinh t), respectively, so v = e^{−t}. after integration we have

sin[π(σ+1)] cosh^{−1} t · γ(iµ − σ − 1/2) γ(−3/2 − σ − iµ) p^{σ+1}_{−1/2+iµ}(tanh t) = 2^{1/2} π^{3/2} ∑_{l=0}^{∞} (−1)^l (l!)^{−1} a_l sinh^{1/2} t · γ(l+1) γ^{−1}(σ − l + 1) p^{−1/2−l}_{σ+1/2}(cosh t),

where a_l is the normalizing factor of the function f^{σ,1}_{l,s}(x).

references

[1] vilenkin, n. ja., klimyk, a. u.: representation of lie groups and special functions, vol. 2, 1993.
[2] vilenkin, n. ja.: special functions and theory of group representations, 1968.
[3] erdelyi, a.: tables of integral transforms, 1954.
[4] gradstein, i. s., ryshik, i. m.: tables of series, products and integrals, 1981.
[5] prudnikov, a. p., brychkov, yu. a., marichev, o. i.: integrals and series, vol. 3: more special functions, 1989.

ilya shilin, dept of higher mathematics, m. scholokhov moscow state university for the humanities, verhnya radishevskaya 16–18, moscow 109240, russia; dept 311, moscow aviation institute, volokolamskoe shosse 4, moscow 125993, russia

aleksandr nizhnikov, moscow pedagogical state university, m. pirogovskaya 1, moscow 119991, russia

acta polytechnica vol. 50 no. 1/2010

new policies to facilitate affordable housing in central and eastern europe

w. amann

abstract: affordable housing construction is a lagging sector in most of the cee countries. recent increases in housing production have mainly served the top end of the market. in some countries, policy schemes for public housing have been established. however, actual construction of housing for moderate-income households has yet to take place. wolfgang amann describes the approach of the austria-based iibw – institute for real estate, construction and housing, aimed at rectifying this unsatisfactory situation. the framework for a public-private partnership (ppp) model in housing will be established, with two parallel strategies. first, a legal framework will be established to allow for a new business model of ppp housing. second, financing tools will be implemented, similar to structured financing, implementing the different layers of sources. this approach refers to lessons learned in many western european countries, e.g., the netherlands and austria, where a third sector in housing contributes substantially to good housing provision for large parts of the population. iibw is currently implementing this new approach in several cee countries, including romania, montenegro and albania.

keywords: affordable housing, housing law, housing construction, cee countries, romania.

1 introduction

for a number of years iibw, the austria-based institute for real estate, construction and housing ltd., has been providing advice aimed at establishing affordable rental housing sectors in transition economies (e.g. amann 2005, amann 2006, amann et al. 2006), based on the rationale that private housing construction is unlikely to satisfy the housing demand of low- and moderate-income households. the main aim of this paper is to outline two important requirements for providing a satisfactory supply of affordable rental housing in transition countries. it focuses on two projects already undertaken by iibw. one of these aims to establish a legal basis and a business model for public-private partnership (ppp) housing companies, while the other aims to provide long-term and stable financing for affordable rental housing in transition economies.

2 a new housing law for romania

commissioned by the romanian ministry of development, public works and housing in 2007, iibw has developed a new housing law, based on european best practice while meeting eu requirements. the rationale for this work stemmed from major inefficiencies in the romanian rental housing sector. as a result of mass privatization in the 1990s, involving 27 per cent of the total housing stock (accounting for some 2.2 million dwellings), virtually no rental dwellings remained. there is only an informal rental sector, which is largely self-organized on an irregular basis. an estimated 1.0 million privatized condominiums are rented out privately, without any consumer protection and very often even without written contracts (tsenkova 2005, prc 2005, iibw 2007).

the condominium sector has already undergone some legal reforms. however, restructuring the regulations has revealed some inconsistencies and gaps. homeowners' associations are organized in a fairly operative way, but they are formed on a voluntary basis, and are consequently not widespread. housing management and maintenance is partially regulated within the legislation on condominiums. however, as is commonly found in all cee countries, there is inadequate enforcement. today, housing administration is mostly organized by individual owners, and rarely by professional service providers. nevertheless, in romania private initiative has achieved the licensing of administrators, who are required to have taken a basic training programme. housing maintenance continues to be a major challenge, particularly thermal refurbishment, which has only been realized in a few projects. despite rather generous subsidies, improvements have been impeded by a decision-making process that now involves multiple owners, including many with very limited resources. there are some subsidy programmes in place, e.g. to promote the completion of unfinished residential buildings or to shelter young families. for thermal refurbishment, subsidies of up to two thirds of the construction costs are available, but these are rarely applied for. unfortunately, subsidy programmes tend to stem from short-term political motives and for this reason lack a more strategic approach.

in order to re-establish social housing, a national housing agency (anl) was established in the late 1990s. this was originally assigned to organize financing of social housing, but has since changed its focus to ownership of housing assets. toward this aim, anl has realized some remarkable projects, e.g. the brâncuşi rental housing estate in bucharest, with some 1,500 social dwellings of fairly high quality. however, the rents are set politically. they are extremely low, and the allocation of the dwellings lacks transparency. currently, the government has decided to sell the dwellings to the sitting tenants at far below market prices, with the effect of again diminishing the newly-established social housing stock.

the proposed new romanian housing law consolidates all previous regulations pertaining to housing, and supplements them with european best practice to provide a comprehensive canon of housing regulations. this includes a housing law as an umbrella law providing a framework to ensure legal consistency of the six laws that constitute romanian housing legislation: a rent law, which resolves common deficiencies that can undermine relationships between tenants and landlords; a condominium law; a housing management and maintenance law that covers all regulations in the field of housing operation, administration, accounting, maintenance and refurbishment; and a housing subsidy law that defines the legal basis for all activities of the state in (co-)financing housing construction, refurbishment, housing benefits and related activities. iibw has newly introduced a ppp housing law to establish a new type of housing provider that has shown outstanding efficiency in several european countries. it combines the functions of a housing developer, an investor and a housing administrator, and is particularly eligible for rental housing construction, the takeover of social housing stock and the refurbishment of existing residential buildings.

the ppp housing law draws on the example of the best european models of limited-profit social housing and turns them into a model applicable to the specific environment faced by countries in transition. it combines the strengths of the markets (privately-run companies) with the backing of the state (privileged access to subsidies, public control). in this way, it is expected to promote a strong sector for the provision of affordable dwellings. the ppp housing sector is designed fully in line with european positions. the european union has communicated quite plainly its support for the establishment of social housing sectors in the new member states. ppp housing companies fulfill public service obligations and may be compensated for these obligations without interfering with eu regulations on competition (ec 2005/179; ec 2005/842).

3 structured financing for ppp housing

ppp housing legislation has been described as a strategy leading toward the establishment of a new business sector, targeting affordable housing, particularly rental housing (unece 2005a, lux 2006, chiquier/lea 2009). this is unambiguously a top-down approach, which requires political will to facilitate. however, in order to establish ppp housing as a new business sector, a second strategy is necessary, i.e. financing schemes that allow for affordable rents without leaving the paths of market-based operations. together, the aim is to develop social housing as a bankable product. in 2005 and 2006, iibw carried out research which paved the way for the development of a housing finance agency for countries in transition (h!fact, cf. amann et al. 2006).
initiated by the stability pact for south eastern europe and in cooperation with some commercial banks and financing institutions active in the cee countries, new ways of financing affordable housing were sought. the need for action was, and still is, evident: within the decade to come, around 5 million dwellings will be required in the cee countries and a very large part of the existing 40 million dwellings is in urgent need of refurbishment. the theoretical basis of our approach is built on the numerous studies that iibw has completed concerning the austrian system of housing finance and housing promotion (amann & mundt 2005, lugger & amann 2006, lux 2006). the model for ppp housing, as executed in the ppp housing law for romania and the h!fact financing scheme, is the sector of limited-profit housing associations (lpha) in austria, which dominates both the affordable rental housing market and the new residential construction market. some 20 per cent of the total housing stock in austria has been built by lphas, comprising 800,000 dwellings, around two thirds of which are rental housing units and one third of which are affordable condominiums. lphas are responsible for more than 60 per cent of multi-apartment new construction. notably, social housing in austria is rooted in an ideological background which stems both from the socialist idea of solidarity and the catholic social doctrine. for this reason, the lpha sector is supported by the two major political parties, the social democrats and the people’s party (kemeny et al. 2001). this is certainly a significant consideration when attempting to transfer and establish a ppp housing sector in countries in transition. the financing of affordable housing in austria is quite complex, but nevertheless rather efficient. even though more than 80 per cent of new construction is co-financed by the state, public expenditure on housing promotion only amounts to approx. 1.0 per cent of gdp, which is well below the western european average. the main reason for this cost-efficiency is the focus on construction-based subsidies, specifically including the lpha sector (amann & mundt 2005). the housing products are targeted at lower and middle income groups, which may be defined as the 2nd to the 8th income decile. the majority of beneficiaries are able to cover their rents or annuities without a need for additional housing benefits. hence, subject-oriented subsidies amount to only approx. 8 per cent of total public expenditure on housing policy. that is to say, the prices of lpha rental dwellings are not cheap, but they are usually below the market prices for private rentals. with broad accessibility and a remarkable market share, lpha rental housing has an effective influence on price levels and price development in the private market. it is mainly because of this interference that housing market prices did not shoot up in the boom period and have not slumped since 2007. for this reason, the austrian model of rental housing may well be described as a unitary or integrated market, as classified by jim kemeny (1995; et al., 2001). as private companies, lphas are responsible for cost-effective execution of construction works and financing. multiple incentives contribute to sound performance, despite the limitation on profits. 
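the tranche structure described next — capital-market mortgage loans, low-interest public loans, developer equity and tenant equity — lends itself to a toy blended-financing calculation. the python sketch below is our own illustration: all shares, rates and costs are hypothetical values picked inside the ranges quoted in the text, not figures from iibw's actual model:

# toy blended-cost calculation for a structured housing financing stack;
# every number below is hypothetical, chosen within the ranges in the text
tranches = {
    # name: (share of project cost, annual interest rate)
    "capital-market mortgage loan": (0.45, 0.045),
    "low-interest public loan":     (0.35, 0.010),
    "developer (lpha) equity":      (0.12, 0.000),
    "tenant equity":                (0.08, 0.000),
}
assert abs(sum(share for share, _ in tranches.values()) - 1.0) < 1e-9

blended_rate = sum(share*rate for share, rate in tranches.values())
print(f"blended cost of capital: {blended_rate:.2%}")            # ~2.4 % p.a.

# indicative cost rent for hypothetical construction costs of 500 eur/m2,
# amortized straight-line over 30 years (a deliberate simplification)
cost_per_m2 = 500.0
rent = cost_per_m2*(blended_rate + 1/30)/12
print(f"indicative cost rent: {rent:.2f} eur/m2 per month")      # ~2.4 eur/m2

the point of the toy model is only to show the mechanics: the cheap public tranche and the zero-cost equity tranches pull the blended rate far below the mortgage rate, which is what makes cost rents of around 2 eur/m² conceivable at all.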
a typical housing project is financed 30–50 per cent with capital market mortgage loans, 30–40 per cent with low-interest public loans, 10–20 per cent with equity of the lpha, mostly for land purchase, and up to 10 per cent with equity of the future tenants. the subsidized public loans have a maturity of more than 25 years and interest rates of mostly only 1 per cent. the diverse financing models aim to reduce the necessary public funding, to keep effective costs for the tenants below market levels, and to serve other policy targets. the different tranches of financing have quite different characteristics. the ppp housing business model leads to a good equity position of most lphas, which allows them to purchase land and afford bridging finance for the construction period from their own capital. the low-interest public loans are not just cheap money: given the strict audit and supervision of lphas, and the occasional disposition as a subordinated claim, public loans are treated as equity capital.

for capital market financing, additional tools for increasing efficiency are in place. all major banks have special housing banks that issue tax-privileged housing construction convertible bonds. the acquired capital has to be invested in affordable housing in austria that also qualifies for public subsidies, i.e. mainly in lpha housing (schmidinger 2008). this reduces the capital costs of lphas by about 0.8 percentage points. more than this, it turns the competition of banks and borrowers upside down: as the banks are limited to investing in the affordable housing sector, they must compete for the lphas with the best credit history. lphas as a whole pose very few risks, due to their mostly solid equity basis, very low vacancy rates, public support and the strict auditing procedures. capital funding with housing construction convertible bonds allows for interest rates equivalent to the euribor flat rate for the best social housing developers (amann & mundt 2005). combined, affordable housing finance in austria can be considered a low-risk model of structured financing. in contrast to more common models of structured financing in commercial real estate (unece 2005c), it not only lowers the capital costs, but also contributes to the stabilization of financing markets (springler 2008). based on this model, the following principles for financing affordable housing in transition economies have been developed (h!fact financing):

1 legal framework. as financing is bound to public funding, a legal framework is essential; this is achieved via ppp housing legislation. both ppp housing and h!fact financing are top-down approaches, which require a clear commitment of the state authorities in a target country.

2 affordability. this is basically defined by cost coverage. it means condominiums at own cost and rents of about 2 eur/m² usable floor space. this is only possible by drawing on public support on several levels. rents and prices shall never be determined by political decision, but in principle by sound financing schemes. mortgages have to be repaid from rent incomes, which rise according to cpi or slightly above. the break-even should not exceed 10–15 years.

3 target groups. beneficiaries of affordable rental housing shall be households from the 2nd to the 6th income decile, i.e. lower and middle income groups. affordable condominiums may address even higher income groups. lowest income groups and vulnerable households may be served as well, but require additional housing allowances. there shall be no housing estates with predominantly lowest-income households. the inclusion of lowest income groups is a social policy task and has to follow criteria for integrative development of communities.

4 consumer choice. the share of rental dwellings and condominiums shall be determined by transparent parameters, such as the availability of retail financing for buyers or the equity of the developer, but first and foremost by demand and consumer choice. rental housing shall be established in such a way that it is economically rational for tenants to enter rental markets instead of buying condominiums.

5 management and maintenance. today, a big part of the housing stock lacks sound management and maintenance. the h!fact financing scheme includes monthly fees for the operating costs of the building, including housing administration, costs for common parts, sewage disposal, savings for a reserve fund and others. these costs are estimated at 0.50 eur/m² usable floor space.

6 subsidies. these must be available. they may be low-interest loans of 30–40 per cent of construction costs, or grants of about half that amount.

7 cooperation with municipalities. h!fact financing requires the cooperation of municipalities. land and infrastructure should be provided free of charge, by concession or at a low price. in return, municipalities should play a major role in allocating the dwellings. h!fact financing will apply to very different local markets. compared to western european states, countries in transition show much higher economic disparities between underdeveloped areas and areas of strong economic development. there is an urgent need for affordable housing in both poor and rich areas, but of course the ability to pay differs widely.

8 equity. the housing developer (ppp housing company) should have sources of equity to invest in affordable housing. this will be rather limited at the commencement of operations, but may grow to a substantial quantity over time.

9 cross-subsidies. sources of cross-subsidies should be tapped, i.e. from richer to poorer regions, from for-profit condominiums or from commercial space to affordable rental dwellings.

10 international financing institutions (ifis). h!fact financing includes international financing sources. the most helpful sources are mortgage loans from the housing fund digh – dutch international guarantees for housing. these loans are guaranteed by dutch housing associations to cover the risk of first loss; hence, they are regarded as being equal to equity capital. in the medium term, other international financing institutions shall be attracted.

11 capital market financing. this is addressed for bridging finance during the construction period and, in the medium term, for strategic long-term investments in rental housing. taking into account the risk position of the other tranches, capital market financing shall be addressed only for senior loans with appropriate conditions. affordable rental housing may develop as an important property sector attracting investment from the capital market, as shown in austria and switzerland (unece 2005b). decreasing the risk involved in housing finance will be a major requirement for securing future financial involvement by commercial banks.

12 allocation of dwellings. this must follow transparent procedures. similar to the housing developer, who is bound to a limitation of profits, the tenant shall be limited in her/his right to 'cash up', i.e. to extract public subsidies. resale of affordable condominiums shall be allowed only at regulated prices for a defined period, e.g. ten years. subleasing of affordable rental dwellings shall be prohibited.

a sizeable rental sector has important functions for a national economy, far beyond social policy goals. rental housing not only offers low entry prices, it also promotes mobility of the workforce. a rental market for housing is crucial for young households and domestic migrants who have not accumulated sufficient capital to access the financial and mortgage markets for home purchase. in the long run, establishing a rental market offers substantial institutional investment opportunities. altogether, affordable housing should be developed in a way that integrates social and private rents, following the integrated market concept of jim kemeny (1995; et al., 2001) (see above). the supply of affordable condominiums and rental dwellings should be developed to sufficient quantities in order to influence and stabilize the private markets.

h!fact financing is of course in line with eu legislation, particularly regarding state aid. analyzing current case law, a set of rules can be identified: a clear definition of services of general economic interest in the field of social housing (target groups), limitation of subsidies to the additional costs of these services, and transparent and separate accounting principles (ec 2005/179; ecr i-7747/2003, the "altmark trans gmbh" case). the new financing model may very well be combined with ppp housing legislation; this has been proved compliant with eu legislation. the financing model described above has been applied in montenegro and is in preparation for application in albania.

references

[1] amann, w.: how to boost rental housing construction in cee-/see-countries, the housing finance international journal, iuhe, 12/2005, 2005.
[2] amann, w., beijer, e., komendantova, n., neuwirth, g., roy, f., schimpel, m., schwimmer, w.: h!fact – a housing finance agency for cee/see. feasibility study, vienna: iibw, 2006.
[3] amann, w., mundt, a.: the austrian system of social housing finance, 2005, www.iibw.at.
[4] cgfs – committee on the global financial system: the role of ratings in structured finance issues and implications, bank for international settlements, 2005.
[5] chiquier, l., lea, m.: housing finance policy in emerging markets, washington: the world bank, 2009.
[6] dübel, h.-j., brzeski, w. j., hamilton, e.: rental choice and housing policy realignment in transition, post-privatization challenges in the europe and central asia region, washington: the world bank, 2006.
[7] ec (2005/179): community framework for state aid in the form of public service compensation.
[8] ec (2005/842): commission decision of 28 november 2005 on the application of article 86(2) of the ec treaty to state aid in the form of public service compensation granted to certain undertakings entrusted with the operation of services of general economic interest.
[9] ec (2006/1080): regulation of the european parliament and of the council of 5 july 2006 on the european regional development fund and repealing regulation (ec) no 1783/1999.
[10] ec (2006/177): communication from the commission. implementing the community lisbon programme social services of general interest in the european union.
[11] european court (ecr i-7747/2003): the "altmark trans gmbh" case.
[12] iibw (ed.): implementation of european standards in romanian housing legislation. final report. research study commissioned by the romanian ministry of development, public works and housing. with the collaboration of wolfgang amann, ioan a. bejan, helmut böhm, nadeyda komendantova, mihai mereuta, alexis mundt, theo österreicher, ciprian paun, gerhard schuster, andreas sommer, arin o. stanescu, walter tancsits. vienna: iibw, 2007.
[13] kemeny, j.: from public housing to the social market, rental policy strategies in comparative perspective, london: routledge, 1995.
[14] kemeny, j., andersen, h. t., matznetter, w., thalman, p.: non-retrenchment reasons for state withdrawal, developing the social rental market in four countries. institute for housing and urban research working paper 40, uppsala university, 2001.
[15] kemeny, j., kersloot, j., thalmann, p.: non-profit housing influencing, leading and dominating the unitary rental market. three case studies, in: housing studies, vol. 20 (2005), no. 6, p. 855–872.
[16] lugger, k., amann, w. (ed.): der soziale wohnbau in europa. österreich als vorbild, vienna: iibw, 2006. contributors: amann, w., ball, m., birgersson, b., ghekiere, l., lux, m., mundt, a., turner, b.
[17] lux, m.: gemeinnütziges wohnen – eine herausforderung für mittel-ost- und südost-europa, in: lugger/amann (ed.) 2006, p. 77–91.
[18] prc bouwcentrum international: sustainable refurbishment of high-rise residential buildings and restructuring of surrounding areas in europe, report for european housing ministers. in: conference proceedings 14–15 march 2005, the netherlands: prc, 2005.
[19] tsenkova, s.: housing policy reforms in post-socialist europe. lost in transition, heidelberg: physica, 2009.

wolfgang amann, e-mail: amann@iibw.at, iibw, pb 2, 1020 vienna, austria

acta polytechnica 53(5):438–443, 2013. doi:10.14311/ap.2013.53.0438

lie groups and numerical solutions of differential equations: invariant discretization versus differential approximation

decio levi (a), pavel winternitz (b)

(a) dipartimento di matematica e fisica, universitá degli studi roma tre and infn, sezione roma tre, via della vasca navale 84, 00184 roma, italy
(b) centre de recherches mathématiques, université de montréal, c.p. 6128, succ. centre-ville, montréal, qc, h3c 3j7, canada; corresponding author: wintern@crm.umontreal.ca

abstract. we briefly review two different methods of applying lie group theory in the numerical solution of ordinary differential equations. on specific examples we show how the symmetry preserving discretization provides difference schemes for which the "first differential approximation" is invariant under the same lie group as the original ordinary differential equation.

keywords: ordinary difference equations, numerical solution, lie group theory, invariant discretization.

submitted: 25 april 2013. accepted: 5 may 2013.

1. introduction

lie group theory provides powerful tools for solving ordinary and partial differential equations, especially nonlinear ones [1–3]. the standard approach is to find the lie point symmetry group g of the equation and then look for invariant solutions, i.e. solutions that are invariant under some subgroup g_0 ⊂ g.
for ordinary differential equations (odes) this leads to a reduction of the order of the equation (without any loss of information). if the dimension of the symmetry group is large enough (at least equal to the order of the equation) and the group is solvable, then the order of the equation can be reduced to zero. this can be viewed as obtaining the general solution of the equation, explicitly or implicitly. however, for an ode, obtaining an implicit solution is essentially equivalent to replacing the differential equation by an algebraic or a functional one. from the point of view of visualizing the solution, or presenting a graph of the solution, it may be easier to solve the ode numerically than to do the same for the functional equation. for partial differential equations (pdes) symmetry reduction reduces the number of independent variables in the equation and leads to particular solutions, rather than the general solution.

both for odes and pdes with nontrivial symmetry groups it may still be necessary to resort to numerical solutions. the question then arises of making good use of the group g. any numerical method involves replacing the differential equation by a difference one. in standard discretizations no heed is paid to the symmetry group g, and some or all of the symmetries are lost. since the lie point symmetry group encodes many of the properties of the solution space of a differential equation, it seems desirable to preserve it, or at least some of its features, in the discretization process.

two different methods for incorporating symmetry concepts into the discretization of differential equations exist in the literature. one was proposed and explored by shokin and yanenko [4–8] for pdes and has been implemented in several recent studies [9, 10]. it is called the differential approximation method, and the basic idea is the following. a uniform orthogonal lattice in x_i is introduced and the considered differential equation

e(x⃗, u, u_{x_i}, u_{x_i x_j}, ...) = 0   (1)

is approximated by some difference equation e_∆ = 0; the derivatives are replaced by discrete derivatives. all known and unknown functions in (1) are then expanded in taylor series about some reference point x⃗, in terms of the lattice spacings. in the simplest case of 2 variables, x and t, we have σ = x_{n+1} − x_n, τ = t_{n+1} − t_n, and

e_∆(x, t, u, ∆_x u, ∆_t u, ∆_{xx} u, ∆_{tt} u, ∆_{xt} u, ...) = e + σ e_1 + τ e_2 + σ^2 e_3 + 2στ e_4 + τ^2 e_5 + ... .   (2)

the expansion on the right-hand side of (2) is a "differential approximation" of the difference equation

e_∆ = 0.   (3)

the "zero order differential approximation" of (3) is the original differential equation (1), and hence is invariant under the symmetry group g. keeping terms of order σ or τ in (2), we obtain the first differential approximation, etc. the idea is to take a higher order differential approximation, at least the first order one, and require that it also be invariant under g, or at least under a subgroup g_0 ⊂ g. this is done by constructing different possible difference schemes approximating eq. (1) and choosing among them the one for which the first (or higher) differential approximation has the "best" symmetry properties.

the second approach, the invariant discretization method, is part of a program devoted to the study of continuous symmetries of discrete equations, i.e. the application of lie groups to difference equations [11–24].
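before the methods are elaborated below, the expansion (2) can be made concrete with a line of computer algebra. the sympy sketch below is our own illustration (the scheme — forward-time, centred-space for the heat equation u_t = u_xx — is a standard textbook choice, not an example from this paper):

import sympy as sp

x, t, sigma, tau = sp.symbols('x t sigma tau', positive=True)
u = sp.Function('u')

# forward difference in t, central difference in x, for u_t - u_xx = 0
scheme = (u(x, t + tau) - u(x, t))/tau \
       - (u(x + sigma, t) - 2*u(x, t) + u(x - sigma, t))/sigma**2

# taylor-expand in the lattice spacings, keeping the lowest correction terms
approx = scheme.series(tau, 0, 2).removeO().series(sigma, 0, 3).removeO().doit()
print(sp.expand(approx))
# schematically: u_t - u_xx + (tau/2)*u_tt - (sigma**2/12)*u_xxxx;
# the zero order part is the pde itself, while the sigma and tau terms form
# the first differential approximation whose invariance the method inspects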
as far as applications to the numerical solutions of differential equations are concerned, the idea, originally due to dorodnitsyn [12, 13], is to start from the differential equation, its symmetry group g and the lie algebra l of g, realized by vector fields. the differential equation can then be expressed in terms of differential invariants of g. the differential equation is then approximated by a finite difference scheme that is constructed so as to be invariant under the same group g as the differential equation. the difference scheme will consist of several equations establishing relations between points in the space of independent and dependent variables. these equations determine both the evolution of the dependent variables and the form of the lattice. the equations are written in terms of group invariants of the group g, acting via its prolongation to all points on the lattice (rather than to derivatives of the dependent functions). as pointed out by p. olver, this amounts to prolonging the group action to "multi-space" for difference equations [24], rather than to "jet space" as for differential equations [25].

the purpose of this paper is to compare the two different methods of incorporating lie symmetries into the numerical analysis of differential equations. for simplicity we restrict ourselves to the case of odes and analyze difference schemes that were used in recent articles [26–28] to solve numerically some third order nonlinear odes with three or four dimensional symmetry algebras. in this article we take invariant difference schemes (on symmetry adapted lattices) and construct their first differential approximation. we then verify that in all examples this first differential approximation is invariant under the entire symmetry group g.

2. differential approximations of ordinary difference equations and invariant discretization of odes

let us consider the case of a third order ode

e ≡ e(x, y, y′, y′′, y′′′) = 0   (4)

(the generalization to order n ≥ 3 is straightforward). we can approximate (4) on a 4-point stencil with points (x_k, y_k), k = n−1, n, n+1, n+2. alternative coordinates on the stencil are the coordinates of one reference point, say (x_n, y_n), the distances between the points, and the discrete derivatives up to order 3:

{ x_n, y_n, h_{n+k} = x_{n+k} − x_{n+k−1}, p^{(1)}_{n+1} = (y_{n+1} − y_n)/(x_{n+1} − x_n), p^{(2)}_{n+2} = 2 (p^{(1)}_{n+2} − p^{(1)}_{n+1})/(x_{n+2} − x_n), p^{(3)}_{n+3} = 3 (p^{(2)}_{n+3} − p^{(2)}_{n+2})/(x_{n+3} − x_n) }.   (5)

the one-dimensional lattice can be chosen to be uniform, and then

h_{n+1} = h_n ≡ h,   (6)

or some other distribution of points can be chosen. the lie point symmetry group g transforms the variables (x, y) into (x̃, ỹ) = (λ(x, y), ω(x, y)), and its lie point symmetry algebra l is represented by vector fields of the form

x̂_µ = ξ_µ(x, y) ∂_x + φ_µ(x, y) ∂_y, 1 ≤ µ ≤ dim l.   (7)

the vector fields must be prolonged in the standard manner [1] to derivatives,

pr x̂_µ = x̂_µ + φ^x(x, y, y_x) ∂_{y_x} + φ^{xx}(x, y, y_x, y_{xx}) ∂_{y_{xx}} + ...,   (8)

when acting on a differential equation, or to all points when acting on a difference scheme [15, 18, 20],

pr_∆ x̂_µ = ∑_i [ ξ_µ(x_i, y_i) ∂_{x_i} + φ_µ(x_i, y_i) ∂_{y_i} ].   (9)

in the differential approximation method we start with a difference equation, usually on a uniform lattice (6),

e_∆(x_n, h_{n+1}, h_{n+2}, y_n, p^{(1)}_{n+1}, p^{(2)}_{n+2}, p^{(3)}_{n+3}) = 0,   (10)

and expand it into a taylor series in the spacing h:

e_∆ = e_0 + h e_1 + h^2 e_2 + ... = 0.   (11)

for any difference equation approximating the differential equation e_0 = 0, the lowest order term will be invariant under the group g.
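the discrete derivatives in (5) are k! times newton divided differences, so their continuous limit can be checked mechanically. the sympy sketch below is ours (all symbol names are our own); it verifies that the third discrete derivative tends to y′′′(x) under the scaling h_{n+k} = α_k ε used later in (22):

import sympy as sp

eps, a1, a2, a3 = sp.symbols('epsilon alpha1 alpha2 alpha3', positive=True)
x = sp.Symbol('x')
y = sp.Function('y')

# 4-point stencil x_{n-1}, x_n, x_{n+1}, x_{n+2} with spacings h_k = alpha_k*eps
pts = [x - a1*eps, x, x + a2*eps, x + (a2 + a3)*eps]

def taylor(p, order=6):
    # taylor polynomial of y about x, evaluated at a stencil point p
    return sum(sp.diff(y(x), x, k)*(p - x)**k/sp.factorial(k) for k in range(order))

def dd(points):
    # newton divided difference y[x0, ..., xk] built from the taylor values
    if len(points) == 1:
        return taylor(points[0])
    return (dd(points[1:]) - dd(points[:-1]))/(points[-1] - points[0])

p3 = sp.expand(6*dd(pts))     # 3! * third divided difference, cf. (5)
print(p3.subs(eps, 0))        # -> Derivative(y(x), (x, 3))

the terms of order ε and higher in p3 are exactly the material that feeds a differential approximation like (11) on a non-uniform lattice.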
Different schemes (10) can then be compared with respect to the invariance of the first differential approximation

$$E_0 + hE_1 = 0, \quad (12)$$

or of some higher-order differential approximations. Better results can be expected for schemes for which (12) is invariant under all of G, or under some subgroup $G_0 \subset G$ that is relevant for the problem. When studying the invariance of (12) it must be remembered that the prolongations of $\hat{X}_\mu$ also act on the lattice parameter h.

In the invariant discretization method one constructs an invariant difference scheme

$$E^\Delta_a\big(x_n, y_n, h_n, h_{n+1}, h_{n+2}, p^{(1)}_{n+1}, p^{(2)}_{n+2}, p^{(3)}_{n+3}\big) = 0, \quad a = 1, 2, \quad (13)$$

$$\mathrm{pr}\,\hat{X}_\mu E^\Delta_a \Big|_{E^\Delta_1 = 0,\, E^\Delta_2 = 0} = 0. \quad (14)$$

The two equations (13) determine both the lattice and the difference equation. Both are constructed out of the invariants of the group G prolonged to the lattice, as indicated in Eq. (14). Thus the difference scheme is by construction invariant under the entire group G acting on the equation and the lattice. In the continuous limit we have

$$E^\Delta_1 = E + h_n E^{(1)}_1 + h_{n+1} E^{(2)}_1 + h_{n+2} E^{(3)}_1 + \text{h.o.t.}, \quad (15)$$

$$E^\Delta_2 = 0 + h_n E^{(1)}_2 + h_{n+1} E^{(2)}_2 + h_{n+2} E^{(3)}_2 + \text{h.o.t.} \quad (16)$$

The terms spelled out in (15) correspond to the first differential approximation. Since the left-hand sides of (15), (16) are invariant under G, the series on the right-hand sides must also be invariant. This does not guarantee that the first (or n-th) differential approximation will be invariant. In the next three sections we will show on examples that the first differential approximation is indeed invariant. Thus, choosing an invariant difference scheme guarantees, at least in the considered cases, that the aims of the differential approximation method are fully achieved. The examples are all third-order ODEs with 3- or 4-dimensional symmetry groups. In each case we write an invariant difference scheme of the form (13) and its first differential approximation (15). The terms $E^{(k)}_1$, $k = 1, 2, 3$, in (15) are differential expressions containing $y'''$ and $y''''$. These expressions will be simplified by eliminating $y'''$ and $y''''$ using the ODE (4) and its first differential consequence.

3. Equations invariant under the similitude group Sim(2)

Let us consider the group of translations, rotations and uniform dilations of a Euclidean plane. Its Lie algebra sim(2) is realized by the vector fields

$$\hat{X}_1 = \frac{\partial}{\partial x}, \quad \hat{X}_2 = \frac{\partial}{\partial y}, \quad \hat{X}_3 = y\frac{\partial}{\partial x} - x\frac{\partial}{\partial y}, \quad \hat{X}_4 = x\frac{\partial}{\partial x} + y\frac{\partial}{\partial y}. \quad (17)$$

This group has no second-order differential invariant and precisely one third-order one, namely [26]

$$I = \frac{(1 + y'^2)\,y''' - 3y'y''^2}{y''^2}. \quad (18)$$

The expressions

$$I_1 = \frac{y''}{(1 + y'^2)^{3/2}}, \qquad I_2 = \frac{(1 + y'^2)\,y''' - 3y'y''^2}{(1 + y'^2)^3}$$

are invariant under the Euclidean group, with Lie algebra $\{\hat{X}_1, \hat{X}_2, \hat{X}_3\}$, but only the ratio $I_2/I_1^2 = I$ is invariant under dilations. Thus the lowest-order ODE invariant under Sim(2) is

$$(1 + y'^2)\,y''' - 3y'y''^2 = ky''^2, \quad (19)$$

where k is an arbitrary constant. To discretize (19) (or any third-order ODE) we need (at least) a four-point stencil. The Euclidean group has 5 independent invariants depending on 4 points $(x_{n+k}, y_{n+k})$, $k = -1, 0, 1, 2$, namely [26]

$$\xi_1 = h_{n+2}\left[1 + \left(\frac{y_{n+2} - y_{n+1}}{h_{n+2}}\right)^2\right]^{1/2}, \quad \xi_2 = h_{n+1}\left[1 + \left(\frac{y_{n+1} - y_n}{h_{n+1}}\right)^2\right]^{1/2}, \quad \xi_3 = h_n\left[1 + \left(\frac{y_n - y_{n-1}}{h_n}\right)^2\right]^{1/2},$$

$$\xi_4 = (y_{n+2} - y_{n+1})\,h_{n+1} - (y_{n+1} - y_n)\,h_{n+2}, \qquad \xi_5 = (y_{n+1} - y_n)\,h_n - (y_n - y_{n-1})\,h_{n+1}. \quad (20)$$
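Before these quantities are combined into Sim(2) invariants below, a quick numerical sanity check (our own construction, not from the paper) confirms their Euclidean invariance: $\xi_1, \xi_2, \xi_3$ are chord lengths and $\xi_4, \xi_5$ are cross products of edge vectors, so none of them should change when the whole stencil is rotated and translated.

```python
# Hedged check: the quantities xi_1..xi_5 of Eq. (20) are Euclidean invariants.
import numpy as np

def xi_invariants(xs, ys):
    h, dy = np.diff(xs), np.diff(ys)
    xi123 = (h * np.sqrt(1.0 + (dy/h)**2))[::-1]   # xi_1, xi_2, xi_3 as in (20)
    xi4 = dy[2]*h[1] - dy[1]*h[2]                  # cross products of edge vectors
    xi5 = dy[1]*h[0] - dy[0]*h[1]
    return np.array([*xi123, xi4, xi5])

xs = np.array([0.0, 0.3, 0.6, 0.9])
ys = np.array([0.0, 0.05, 0.02, 0.08])
phi = 0.3                                          # rotate by 0.3 rad, then translate
c, s = np.cos(phi), np.sin(phi)
xr, yr = c*xs - s*ys + 2.0, s*xs + c*ys - 1.0
print(np.allclose(xi_invariants(xs, ys), xi_invariants(xr, yr)))   # True
```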
Out of them we can construct 4 independent Sim(2) invariants, for instance

$$J_1 = \frac{2\alpha\,\xi_4}{\xi_1\xi_2(\xi_1 + \xi_2)} + \frac{2\beta\,\xi_5}{\xi_2\xi_3(\xi_2 + \xi_3)}, \quad \alpha + \beta = 1, \qquad J_2 = \frac{6}{\xi_1 + \xi_2 + \xi_3}\left[\frac{\xi_4}{\xi_1\xi_2(\xi_1 + \xi_2)} - \frac{\xi_5}{\xi_2\xi_3(\xi_2 + \xi_3)}\right], \quad (21)$$

and the two ratios $\xi_1/\xi_2$, $\xi_2/\xi_3$. To obtain the continuous limit we put

$$x_{n-1} = x_n - h_n, \quad x_{n+1} = x_n + h_{n+1}, \quad x_{n+2} = x_n + h_{n+1} + h_{n+2}, \quad y_{n+k} = y(x_{n+k}), \quad h_{n+k} = \alpha_k\epsilon, \quad \alpha_k \sim 1, \quad (22)$$

and expand $y_{n+k}$ into a Taylor series about $x_n \equiv x$. The invariants $J_1$, $J_2$ were so chosen that their continuous limits are $I_1$ and $I_2$, respectively. The invariant scheme used in [26] to solve the ODE (19) numerically was

$$E^\Delta_2 = \xi_1\xi_3 - \xi_2^2 = 0, \quad (23)$$

$$E^\Delta_1 = J_2 - kJ_1^2 = 0. \quad (24)$$

The first differential approximation of (23) is

$$E^\Delta_2 \approx 2\big(-h_{n+1}^2 + h_n h_{n+2}\big)(y'^2 + 1) + \big(2h_n h_{n+1} h_{n+2} - 2h_{n+1}^3 - h_n^2 h_{n+2} + h_n h_{n+2}^2\big)\,y'y'' = 0. \quad (25)$$

Applying $\mathrm{pr}^\Delta\hat{X}_i$ to Eq. (25) we find that the equation is invariant under the entire group Sim(2), as are the terms of order $\epsilon^2$ and $\epsilon^3$ separately. The first-order differential approximation of the difference equation (24) is quite complicated. However, if we substitute the ODE (19) and its differential consequences into the first nonvanishing term of the approximation, we obtain a manageable expression,

$$E^\Delta_1 \approx (1 + y'^2)\,y''' - 3y'y''^2 - ky''^2 - \frac{1}{24}\,\frac{y''^3}{[1 + y'^4]\,(h_n + h_{n+1} + h_{n+2})}\Big\{k^2\big[16\alpha(h_n + h_{n+1} + h_{n+2})^2 - 4h_n^2 - 12h_{n+2}^2 - 8h_{n+1}^2 - 16h_n h_{n+2} - 20h_{n+1}h_{n+2} + 12h_n h_{n+1}\big] + 9h_{n+1}(h_{n+2} - h_n)\Big\} = 0. \quad (26)$$

Expression (26) also satisfies

$$\mathrm{pr}^\Delta\hat{X}_i\,E^\Delta_1\Big|_{E^\Delta_1 = E^\Delta_2 = 0} = 0, \quad i = 1, \cdots, 4, \quad (27)$$

so the first differential approximation of the entire scheme (23), (24) is invariant under Sim(2).

4. Equations invariant under a one-dimensional realization of SL(2,R)

Four non-equivalent realizations of SL(2,R) by vector fields in two variables exist [29]. Invariant difference schemes for second- and third-order ODEs have been constructed and tested for all of them [26–28]. In this section we will consider the first one, called SL₁(2,R), or alternatively SLy(2,R), which actually involves one variable only:

$$\hat{X}_1 = \frac{\partial}{\partial y}, \quad \hat{X}_2 = y\frac{\partial}{\partial y}, \quad \hat{X}_3 = y^2\frac{\partial}{\partial y}. \quad (28)$$

The corresponding Lie group acts by Möbius transformations (fractional linear transformations) on y. The third-order differential invariants of this action are the Schwarzian derivative of y and the independent variable x. The most general third-order invariant ODE is

$$\frac{1}{y'^2}\left(y'y''' - \frac{3}{2}y''^2\right) = F(x), \quad (29)$$

where F(x) is arbitrary. For F(x) = k = const the symmetry group is GL(2,R), and (28) is extended by the vector field $\hat{X}_4 = \partial_x$. For F(x) = 0 the symmetry group is $SL_y(2,R) \otimes SL_x(2,R)$. The difference invariants on a four-point stencil are

$$R = \frac{(y_{n+2} - y_n)(y_{n+1} - y_{n-1})}{(y_{n+2} - y_{n+1})(y_n - y_{n-1})}, \qquad x_n,\; h_{n+2},\; h_{n+1},\; h_n. \quad (30)$$

The discrete invariant approximating the left-hand side of (29) is

$$J_1 = \frac{6h_{n+2}h_n}{h_{n+1}(h_{n+1} + h_{n+2})(h_n + h_{n+1})(h_{n+2} + h_{n+1} + h_n)}\left[\frac{(h_{n+2} + h_{n+1})(h_{n+1} + h_n)}{h_n h_{n+2}} - R\right]. \quad (31)$$

Any lattice depending only on the $x_k$ will be invariant; in particular the lattice equation can be chosen to be

$$x_{n+1} - 2x_n + x_{n-1} = 0. \quad (32)$$

The general solution of (32) is

$$x_n = nh + x_0, \quad (33)$$

i.e. a uniform lattice with origin $x_0$ and spacing $x_{n+1} - x_n = h$. Let us consider the invariant ODE

$$\frac{1}{y'^2}\left(y'y''' - \frac{3}{2}y''^2\right) = \sin(x) \quad (34)$$

and approximate it on an a priori arbitrary lattice. Such a scheme is given by

$$E^\Delta = J_1 - \sin(\xi) = 0, \quad (35)$$

$$\xi = x_n + ah_n + bh_{n+1} + ch_{n+2}, \qquad \phi(x_n, h_n, h_{n+1}, h_{n+2}) = 0, \quad (36)$$

where a, b, c are constants and φ satisfies the condition

$$\phi(x_n, 0, 0, 0) \equiv 0 \quad (37)$$

(for instance, φ can be linear, as in (32)).
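The claim that the discrete invariant (31) approximates the left-hand side of (29) can be probed numerically. The sketch below (our own, not from the paper) evaluates $J_1$ from the cross-ratio R for $y = \tan(x)$, whose Schwarzian-type expression (29) equals 2 everywhere, and also verifies that $J_1$ is unchanged under a Möbius transformation of the y values (R is a cross-ratio, hence exactly Möbius-invariant).

```python
# Hedged numerical check of J1 from Eqs. (30)-(31) on a uniform lattice.
import numpy as np

def J1(xs, ys):
    hn, hn1, hn2 = np.diff(xs)
    R = ((ys[3] - ys[1]) * (ys[2] - ys[0])) / ((ys[3] - ys[2]) * (ys[1] - ys[0]))
    pref = 6*hn2*hn / (hn1*(hn1 + hn2)*(hn + hn1)*(hn + hn1 + hn2))
    return pref * ((hn2 + hn1)*(hn1 + hn)/(hn*hn2) - R)

for h in (0.1, 0.05, 0.025):
    xs = 1.0 + h*np.arange(4.0)          # uniform lattice as in (32)-(33)
    ys = np.tan(xs)                      # Schwarzian expression (29) equals 2
    moebius = (2*ys + 1)/(ys + 3)        # an arbitrary Moebius transformation
    print(h, J1(xs, ys) - 2.0, J1(xs, moebius) - J1(xs, ys)))
```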
The first differential approximation of (35), after the usual simplifications, is

$$E_0 \approx \frac{1}{y'^2}\left(y'y''' - \frac{3}{2}y''^2\right) - \sin(x) + \cos(x)\Big\{h_n(1 + 4a) - 2h_{n+1}(1 - 2b) - h_{n+2}(1 - 4c)\Big\}. \quad (38)$$

Eq. (38) is invariant under $SL_y(2,R)$. Moreover, if we choose

$$a = -\frac{1}{4}, \quad b = \frac{1}{2}, \quad c = \frac{1}{4}, \quad (39)$$

the second term in (38) vanishes completely and $E_0 = 0$ is a second-order approximation of the ODE (34). We mention that the uniform lattice (33), used in [26], satisfies (39).

5. Equations invariant under a two-dimensional realization of GL(2,R)

A genuinely two-dimensional realization of the algebra gl(2,R) is given by the vector fields

$$\hat{X}_1 = \frac{\partial}{\partial y}, \quad \hat{X}_2 = x\frac{\partial}{\partial x} + y\frac{\partial}{\partial y}, \quad \hat{X}_3 = 2xy\frac{\partial}{\partial x} + y^2\frac{\partial}{\partial y}, \quad \hat{X}_4 = x\frac{\partial}{\partial x}. \quad (40)$$

The SL(2,R) group corresponding to $\{\hat{X}_1, \hat{X}_2, \hat{X}_3\}$ has two differential invariants of order $m \le 3$:

$$I_1 = \frac{2xy'' + y'}{y'^3}, \qquad I_2 = \frac{x^2(y'y''' - 3y''^2)}{y'^5}. \quad (41)$$

The expression $I_2/I_1^{3/2}$ is also invariant under the dilations generated by $\hat{X}_4$, and hence under GL(2,R). The corresponding ODE invariant under GL(2,R) is

$$I_2^2 = A^2 I_1^3, \quad (42)$$

which is equivalent to

$$E = x^4(y'y''' - 3y''^2)^2 - A^2 y'(2xy'' + y')^3 = 0. \quad (43)$$

Five independent SL(2,R) difference invariants are

$$\xi_1 = \frac{y_{n+2} - y_{n+1}}{\sqrt{x_{n+1}x_{n+2}}}, \quad \xi_2 = \frac{y_{n+1} - y_n}{\sqrt{x_n x_{n+1}}}, \quad \xi_3 = \frac{y_n - y_{n-1}}{\sqrt{x_{n-1}x_n}}, \quad \xi_4 = \frac{y_{n+2} - y_n}{\sqrt{x_n x_{n+2}}}, \quad \xi_5 = \frac{y_{n+1} - y_{n-1}}{\sqrt{x_{n+1}x_{n-1}}}. \quad (44)$$

Any 4 ratios $\xi_i/\xi_k$ will be GL(2,R) invariants. The SL(2,R) invariants that have the correct continuous limits are

$$J_2 = \frac{12}{\xi_2(\xi_1 + \xi_2 + \xi_3)}\left[\frac{\xi_4 - \xi_1 - \xi_2}{\xi_1(\xi_1 + \xi_2)} - \frac{\xi_5 - \xi_2 - \xi_3}{\xi_3(\xi_2 + \xi_3)}\right], \qquad J_1 = 8\left[\alpha\,\frac{\xi_4 - \xi_1 - \xi_2}{\xi_1\xi_2(\xi_1 + \xi_2)} + (1 - \alpha)\,\frac{\xi_5 - \xi_2 - \xi_3}{\xi_2\xi_3(\xi_2 + \xi_3)}\right]. \quad (45)$$

The difference scheme for the ODE (43) with A = −1 used in Ref. [26] was actually the square root of (42), equivalent to

$$E^\Delta_1 = J_2 + J_1^{3/2} = 0, \quad (46)$$

$$E^\Delta_2 = \frac{\xi_1}{\xi_2} = \gamma = \mathrm{const}. \quad (47)$$

The first differential approximation of the lattice equation (47) is

$$E^\Delta_2 \approx (\gamma h_{n+1} - h_{n+2})\,y' + \frac{1}{2x^2}\Big\{\big[-\gamma h_{n+1}^2 + h_{n+2}(h_{n+2} + 2h_{n+1})\big]y' + xy''\big[\gamma h_{n+1}^2 - h_{n+1}h_{n+2} - h_{n+2}^2\big]\Big\} = 0. \quad (48)$$

Both terms in (48) are invariant under GL(2,R). The first differential approximation to the difference equation (46) is

$$E^\Delta_1 \approx \frac{E}{y'^{10}} - \frac{\sqrt{y'}\,(y' + 2xy'')^{7/2}}{16xy'^{10}(h_n + h_{n+1} + h_{n+2})}\Big[h_n^2(32\alpha - 11) + 16h_{n+1}^2(2\alpha - 1) + h_{n+2}^2(32\alpha - 21) + h_n h_{n+1}(64\alpha - 21) + 32h_n h_{n+2}(2\alpha - 1) + h_{n+1}h_{n+2}(64\alpha - 43)\Big]. \quad (49)$$

The second term of expression (49) has been simplified using (43) and its differential consequences. Again, (49) is invariant under the entire GL(2,R) group.
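The invariance statements of this section can also be tested numerically. The following hedged sketch (ours) checks that the quantities (44) are unchanged under the flows of $\hat{X}_1$ and $\hat{X}_3$ (the latter flow has the closed form $(x, y) \to \big(x/(1 - ty)^2,\, y/(1 - ty)\big)$, obtained by integrating the vector field), and that under the dilation generated by $\hat{X}_4$ all $\xi_i$ scale uniformly, so that their ratios are GL(2,R) invariants.

```python
# Hedged check: xi_1..xi_5 of Eq. (44) are SL(2,R) invariants for realization (40).
import numpy as np

def xi44(x, y):
    g = np.sqrt
    return np.array([
        (y[3]-y[2])/g(x[2]*x[3]), (y[2]-y[1])/g(x[1]*x[2]), (y[1]-y[0])/g(x[0]*x[1]),
        (y[3]-y[1])/g(x[1]*x[3]), (y[2]-y[0])/g(x[0]*x[2])])

x = np.array([1.0, 1.2, 1.5, 1.9])
y = np.array([0.3, 0.35, 0.42, 0.5])
t = 0.4
x3, y3 = x/(1 - t*y)**2, y/(1 - t*y)                # finite flow of X_3
print(np.allclose(xi44(x, y), xi44(x3, y3)))         # True
print(np.allclose(xi44(x, y + 0.7), xi44(x, y)))     # True (flow of X_1)
print(xi44(3.0*x, y) / xi44(x, y))                   # constant 1/3: under X_4 all
                                                     # xi_i scale alike, so ratios
                                                     # xi_i/xi_k are GL(2,R) invariant
```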
6. Conclusions

The three examples considered above in Sections 3, 4 and 5 confirm that the method of invariant discretization provides a systematic way of constructing difference schemes for which the first differential approximation is invariant under the entire symmetry group of the original ODE. The two equations (13) determining the invariant difference scheme for an ODE are not unique, since there are more difference invariants than differential ones. The differential approximation of an invariant scheme can be used to benefit from this freedom and to choose an invariant scheme with a higher degree of accuracy. An example of this is given in Section 4, Eq. (38), where the choice of the parameters (39) assures that the terms of order $\epsilon$ in (38) vanish identically.

Previous numerical comparisons between invariant discretizations and standard noninvariant numerical methods for ODEs [26–28] have shown two features. The first is that the discretization errors for invariant schemes are significantly smaller [26] (by 3 orders of magnitude for Eq. (18), and by 1 order of magnitude for Eq. (43)). The second feature is that the qualitative behaviour of solutions close to singularities is described much more accurately by the invariant schemes [26–28]. An analysis of the relation between invariant discretization and the differential approximation method for PDEs is in progress.

Acknowledgements
We thank Alex Bihlo for interesting discussions. The research of P.W. was partially supported by a grant from NSERC of Canada. D.L. thanks the CRM for its hospitality. The research of D.L. has been partly supported by the Italian Ministry of Education and Research, 2010 PRIN "Teorie geometriche e analitiche dei sistemi Hamiltoniani in dimensioni finite e infinite".

References
[1] P. J. Olver, Applications of Lie Groups to Differential Equations, 2nd edition, Springer-Verlag, New York, 1993.
[2] G. W. Bluman, S. Kumei, Symmetries and Differential Equations, Springer-Verlag, New York, 1989 (reprinted with corrections, 1996).
[3] N. H. Ibragimov, Transformation Groups Applied to Mathematical Physics, translated from the Russian, D. Reidel, Dordrecht, 1985.
[4] Yu. I. Shokin, The Method of Differential Approximation, translated from the Russian by K. G. Roesner, Springer-Verlag, New York–Berlin, 1983.
[5] N. N. Yanenko, Yu. I. Shokin, First differential approximation method and approximate viscosity of difference schemes, Phys. Fluids 12(II):28–33, 1969.
[6] N. N. Yanenko, Z. I. Fedotova, L. A. Tusheva, Yu. I. Shokin, Classification of difference schemes of gas dynamics by the method of differential approximation. I. One-dimensional case, Comput. & Fluids 11(3):187–206, 1983.
[7] N. N. Yanenko, Yu. I. Shokin, Group classification of difference schemes for a system of one-dimensional equations of gas dynamics, Amer. Math. Soc. Transl. Ser. 2, 104:259–265, 1976.
[8] Yu. I. Shokin, N. N. Yanenko, The connection between the proper formulation of first differential approximations and the stability of difference schemes for hyperbolic systems of equations (Russian), Mat. Zametki 4:493–502, 1968.
[9] E. Hoarau, C. David, P. Sagaut, T.-H. Lê, Lie group study of finite difference schemes, Discrete Contin. Dyn. Syst., Suppl. (Proceedings of the 6th AIMS International Conference):495–505, 2007.
[10] M. Chhay, E. Hoarau, A. Hamdouni, P. Sagaut, Comparison of some Lie-symmetry-based integrators, J. Comput. Phys. 230(5):2174–2188, 2011.
[11] S. Maeda, Canonical structure and symmetries for discrete systems, Math. Japon. 25(4):405–420, 1980.
[12] V. A. Dorodnitsyn, Transformation groups in difference spaces (Russian), translated in J. Soviet Math. 55(1):1490–1517, 1991.
[13] V. A. Dorodnitsyn, Finite difference models entirely inheriting continuous symmetry of original differential equations, Internat. J. Modern Phys. C 5(4):723–734, 1994.
[14] V. A. Dorodnitsyn, R. Kozlov, P. Winternitz, Lie group classification of second-order ordinary difference equations, J. Math. Phys. 41(1):480–504, 2000.
[15] V. Dorodnitsyn, Applications of Lie Groups to Difference Equations, CRC Press, Boca Raton, FL, 2011.
[16] D. Levi, P. Winternitz, Continuous symmetries of discrete equations, Phys. Lett. A 152(7):335–338, 1991.
[17] D. Levi, P. Winternitz, Symmetries and conditional symmetries of differential-difference equations, J. Math. Phys. 34(8):3713–3730, 1993.
[18] D. Levi, P. Winternitz, Continuous symmetries of difference equations, J. Phys. A 39(2):R1–R63, 2006.
[19] D. Levi, P. Winternitz, R. I. Yamilov, Lie point symmetries of differential-difference equations, J. Phys. A 43(29):292002, 2010.
[20] P. Winternitz, Symmetry preserving discretization of differential equations and Lie point symmetries of differential-difference equations, in: Symmetries and Integrability of Difference Equations (D. Levi, P. Olver, Z. Thomova, P. Winternitz, eds.), Cambridge University Press, Cambridge, pp. 292–336, 2011.
[21] G. R. W. Quispel, H. W. Capel, R. Sahadevan, Continuous symmetries of differential-difference equations: the Kac–van Moerbeke equation and Painlevé reduction, Phys. Lett. A 170(5):379–383, 1992.
[22] G. R. W. Quispel, R. Sahadevan, Lie symmetries and the integration of difference equations, Phys. Lett. A 184(1):64–70, 1993.
[23] P. E. Hydon, Symmetries and first integrals of ordinary difference equations, R. Soc. Lond. Proc. Ser. A Math. Phys. Eng. Sci. 456(2004):2835–2855, 2000.
[24] P. J. Olver, Geometric foundations of numerical algorithms and symmetry, Appl. Algebra Engrg. Comm. Comput. 11(5):417–436, 2001.
[25] C. Ehresmann, Les prolongements d'une variété différentiable, I–V, C. R. Acad. Sci. Paris 233:598–600, 777–779, 1081–1083, 1951; 234:1028–1030, 1424–1425, 1952.
[26] A. Bourlioux, C. Cyr-Gagnon, P. Winternitz, Difference schemes with point symmetries and their numerical tests, J. Phys. A 39(22):6877–6896, 2006.
[27] R. Rebelo, P. Winternitz, Invariant difference schemes and their application to SL(2,R) invariant ordinary differential equations, J. Phys. A 42(45):454016, 2009.
[28] A. Bourlioux, R. Rebelo, P. Winternitz, Symmetry preserving discretization of SL(2,R) invariant equations, J. Nonlinear Math. Phys. 15(Suppl. 3):362–372, 2008.
[29] A. González-López, N. Kamran, P. J. Olver, Lie algebras of vector fields in the real plane, Proc. London Math. Soc. (3) 64(2):339–368, 1992.

Acta Polytechnica 53(Supplement):742–745, 2013

The Past and the Future of Direct Search of GW from Pulsars in the Era of GW Antennas

L. Milano, E. Calloni, R. De Rosa, M. De Laurentis, L. Di Fiore, F. Garufi
Dipartimento di Scienze Fisiche, Università Federico II di Napoli, and Sezione INFN di Napoli, Complesso Universitario di Monte S. Angelo, Via Cinthia, I-80125 Napoli, Italy
Corresponding author: milano@na.infn.it

Abstract. In this paper we give an overview of the past and present status of gravitational wave (GW) research associated with pulsars, taking into account the target sensitivity achieved by interferometric laser GW antennas such as TAMA, GEO, LIGO and Virgo. We will see that the upper limits obtained with searches for periodic GW begin to be astrophysically interesting by imposing non-trivial constraints on the structure and evolution of neutron stars. We give prospects for the future detection of pulsar GW signals with Advanced LIGO, Advanced Virgo and future enhanced detectors, e.g. the Einstein Telescope.

Keywords: gravitational waves, pulsar, neutron star.

1. Introduction

Efforts to detect gravitational waves started about fifty years ago with Joe Weber's bar detectors [1]. They opened the way to the present-day interferometric detectors with useful bandwidths of thousands of hertz, namely 10 Hz ÷ 10 kHz for Virgo [2], 40 Hz ÷ 10 kHz for LIGO [3], 50 Hz ÷ 1.5 kHz for GEO600 [4] and 10 Hz ÷ 10 kHz for the Japanese TAMA [5]. In this paper we give an overview of the past and present status of research on gravitational waves from pulsars. Taking into account all interferometric laser GW antennas (TAMA, GEO600, LIGO and Virgo), we must note that no direct detection of a credited GW signal has yet been announced, although target sensitivity has been reached for both Virgo and LIGO (see Fig. 1). Searches for GWs from pulsars did not begin with LIGO and Virgo. After the discovery of the Crab pulsar optical pulsations at ∼ 30 Hz [6] there was a series of pioneering searches for GW emission:
• 1972: Levine and Stebbins [7] used a 30 m laser interferometer (single-arm Fabry–Pérot cavity);
• 1978: Hirakawa et al. [8, 9] searched for Crab GW emission using a ∼ 1000 kg aluminum quadrupole antenna with resonant frequency at 60.2 Hz;
• 1983: Hereld [10] at Caltech used a 40 m interferometer;
• 1983: Hough et al. in Glasgow used a split bar detector [11];
• 1993: Niebauer [12] searched for GWs from a possible neutron star (NS) remnant of SN1987A using the Garching 30 m interferometer (100 hrs of data from 1989, searching around 2 and 4 kHz): h0 < 9 × 10−21;
• 1986–1995: Tokyo group Crab pulsar search using a 74 kg torsion-type antenna cooled to 4.2 K (Owa et al. [13, 14]) and using a cooled (4.2 K) 1200 kg antenna (Suzuki [15]): h0 < 2 × 10−22 (∼ 140 × the spin-down limit);
• 2008: CLIO search for GWs from Vela [16]: h0 < 5.3 × 10−20 at 99.4 % CL;
• 2010: Tokyo prototype low-frequency mag-lev torsional bar antenna search for the slowest pulsar (PSR J2144−3933 at νrot ∼ 0.1 Hz) [17]: h0 < 8.4 × 10−10 (Bayesian 95 % UL with 10 % calibration errors);
• various narrow-band all-sky and galactic-centre blind searches have been performed using the EXPLORER and AURIGA bar detectors [18, 19, 35]; for a review see [36].

The paper is organized as follows: Section 2 offers a basic introduction to GW emission from pulsars; results from past and present research are given in Section 3; future work is outlined in Section 4; conclusions and open questions are dealt with in Section 5.
2. Basics of GW emission by pulsars

GW emission is a non-axisymmetric process, since GWs are emitted through a quadrupolar mechanism. Possible processes producing this kind of asymmetry in neutron stars are: a rotating tri-axial ellipsoid (i.e. an NS with a bump), a precessing NS (wobble angle), an oscillating NS, and an accreting NS in an inspiralling binary system. We will focus on the first three items, since they give rise to continuous GW emission, while an accreting NS gives rise to a chirping GW emission in the very last stage of the life of the system. The GW amplitude can be decomposed into two polarizations, "+" and "×", according to the effect they have on a circle of free-falling masses orthogonal to the propagation direction. The functional form is described by [21]

$$h_+ = h_0\,\frac{1 + \cos^2\iota}{2}\,\cos 2\omega\!\left(t - \frac{d}{c}\right), \qquad h_\times = h_0\,\cos\iota\,\sin 2\omega\!\left(t - \frac{d}{c}\right), \quad (1)$$

where $\omega = 2\pi\nu$ is the rotational angular velocity and

$$h_0 = \frac{16\pi^2 G}{c^4}\,\frac{I_{zz}\,\nu^2}{d}\,\epsilon, \quad (2)$$

where ι is the angle between the rotation plane and the observer, $I_{zz}$ is the moment of inertia of the NS with respect to the rotation axis, d is the distance from the observation point, and $\epsilon = (I_{xx} - I_{yy})/I_{zz}$ is the quadrupole ellipticity. From a large set of equations of state (EOS) describing NS matter at supra-nuclear densities one gets $I_{zz} = (1 \div 3) \times 10^{38}$ kg m², depending on the EOS and on the rotation velocity. The scale parameter of the GW amplitude can thus be expressed in more physical units as

$$h_0 = 4 \times 10^{-25}\left(\frac{\epsilon}{10^{-6}}\right)\left(\frac{I}{10^{38}\,\mathrm{kg\,m^2}}\right)\left(\frac{100\,\mathrm{pc}}{d}\right)\left(\frac{\nu}{100\,\mathrm{Hz}}\right)^2. \quad (3)$$

GWs can also be emitted by an axisymmetric rotating star when the symmetry and rotation axes are misaligned by a "wobble" angle α. The maximum sustainable quadrupole deformation ε for an NS depends on the physics of the crust and on the EOS of matter at super-nuclear density. We can set a limit on ε by looking at the maximum strain sustainable by the crust without breaking. Modeling of crustal strains indicates that [21]

$$\epsilon < k\,\frac{u_{\mathrm{break}}}{b_{\mathrm{rlim}}},$$

where $u_{\mathrm{break}}$ is the crustal limit strain, and k and $b_{\mathrm{rlim}}$ depend on the star model. From molecular dynamics simulations one obtains $k = 2 \times 10^{-5}$ and $u_{\mathrm{break}} \simeq b_{\mathrm{rlim}} = 0.1$. This means that an NS breaks for deformations larger than 20 cm. According to various models for solid quark stars, we can have $b_{\mathrm{rlim}} = 10^{-2}$ and $k = 6 \times 10^{-4} \div 6 \times 10^{-3}$. For strong initial magnetic fields ($10^{16}$ G), the star can emit intense GWs due to the deformation induced by the magnetic field. In brief, we have many estimates for the ellipticity, but few certainties. Something can be gathered by looking at the data from GW experiments.

How can LIGO, Virgo and astronomical data be useful for constraining the EOS of an NS using the upper limits on ε and the moment of inertia I? Let us start from what we know from astrophysical observations and gravitational observatory data. In our galaxy we can estimate a number of NS of about 10⁸, 2000 of which are radio pulsars. The distance d and the rotation frequency ν can be evaluated from radio and optical observations (GW emission has a frequency given by 2ν). If the variation of the rotational period is such that ν̇ < 0, there is a mechanism that makes the pulsar lose its energy. It can be: electromagnetic (EM) dipole radiation, particle acceleration, or GW emission. Since $\dot{\nu} = k\nu^n$, one can derive the braking index n from the rotational frequency and its derivatives,

$$n = \frac{\nu\ddot{\nu}}{\dot{\nu}^2}.$$
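For orientation, the scaling relation (3) is easy to evaluate; the short sketch below uses illustrative fiducial numbers, not measured values.

```python
# Evaluating Eq. (3) for the characteristic GW amplitude of a triaxial NS.
# All inputs are illustrative fiducial values.
def h0_scaling(epsilon, I_zz=1e38, d_pc=100.0, nu_hz=100.0):
    """Characteristic amplitude; I_zz in kg m^2, d in pc, nu in Hz."""
    return 4e-25 * (epsilon/1e-6) * (I_zz/1e38) * (100.0/d_pc) * (nu_hz/100.0)**2

# A maximally strained crust (epsilon ~ 2e-5) at 1 kpc, spinning at 100 Hz:
print(h0_scaling(2e-5, d_pc=1000.0))   # ~ 8e-25
```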
It has been evaluated that for pure magnetic dipole radiation the braking index is n = 3, while for pure GW radiation n = 5. For the few objects that have been observed with sufficient accuracy to determine the braking index, one finds n < 3; thus some other mechanism, or a combination, must be invoked (see [20] for an interesting phenomenological analysis of the problem). Now let us try to derive some upper limit on GW emission. We start with the strong hypothesis that all the energy is lost by GW emission. Thus the spin-down GW amplitude is given by

$$h_{sd} \sim \left(\frac{5}{2}\,\frac{G I_{zz}|\dot{\nu}|}{c^3 d^2 \nu}\right)^{1/2}$$

or, expressed in a more convenient set of units,

$$h_{sd} \sim 2.5 \times 10^{-25}\left(\frac{I_{zz}}{10^{38}\,\mathrm{kg\,m^2}}\right)^{1/2}\left(\frac{|\dot{\nu}|}{10^{-11}\,\mathrm{Hz\,s^{-1}}}\right)^{1/2}\left(\frac{100\,\mathrm{Hz}}{\nu}\right)^{1/2}\left(\frac{1\,\mathrm{kpc}}{d}\right),$$

from which we can get an upper limit for $I_{zz}$ combining EM and GW observations.

3. GW from pulsars, and results from the past until now

In recent years, since ground-based GW interferometric detectors came into operation, several studies have been carried out to detect GW signals from continuous sources, both with single interferometers and in coherent searches using two or more detectors; but as of June 2012 the results of these searches have only yielded upper limits on the rate and on the breaking strain threshold. The search for periodic sources (continuous waves, CW) benefits from long integration times and coincident analysis between different detectors. It was first started in Scientific Run 1 (S1) of the LIGO detector, using time- and frequency-domain analyses [22] on a single pulsar of known frequency.

[Figure 1: detector strain sensitivity versus GW frequency, comparing the Virgo VSR2, Virgo design, LIGO S6, LIGO design, Advanced Virgo design, Advanced LIGO design and ET design curves; the Crab and Vela pulsars are marked.] Figure 1. The figure shows the upper bounds on GW amplitude from known pulsars, assuming 100 % conversion of spin-down energy into GWs. An integration time of one year is assumed. The (green) diamonds represent pulsars with a spin-down outside the detection band for both AdV and Advanced LIGO; the (cyan) squares are in the potential detection band for at least one of the advanced detectors. The red circles represent the Crab and Vela pulsars.

In S2–S5 the search evolved into a broadband one [23, 24], using various techniques, e.g. Hough transforms, short Fourier transforms and excess power. Recently, the S5/VSR1 search (VSR1 is the first Virgo scientific run) on pulsars of known period and spin-down, using ephemerides given by radio telescopes and X-ray satellites [25], has set upper limits on GW radiation from the targets, on the assumption that it is locked to the EM emission. A further analysis was performed on joint data from the S6 and VSR2 runs, and physically interesting results were obtained from the upper limits on GW emission from the Crab pulsar [25] and the Vela pulsar (Abadie et al. 2011), as shown by the red circles in Fig. 1. Figure 1 shows that the great majority of known pulsars (green diamonds) are below the present sensitivity curves of the LIGO and Virgo GW antennas, and only a few of them (shown in cyan) fall in the range of sensitivity of either Advanced LIGO or Advanced Virgo, or in the range of the Einstein Telescope [32]. The upper limits obtained from the analysis of the sources that can be studied with the present interferometers provide interesting astrophysical information, since they give non-trivial constraints on the structure and the evolution of NS.
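As a hedged numerical illustration of the spin-down limit, evaluating the formula above with rounded literature parameters for the Crab pulsar (ν ≈ 29.7 Hz, ν̇ ≈ −3.7 × 10⁻¹⁰ Hz/s, d ≈ 2 kpc, Izz ≈ 10³⁸ kg m²) reproduces the hsd ≈ 1.4 × 10⁻²⁴ value quoted below; the parameter values are approximate and serve only to show the order of magnitude.

```python
# Spin-down limit h_sd from the formula above, in SI units.
import math

G, c, KPC = 6.674e-11, 2.998e8, 3.086e19

def h_sd(nu, nu_dot, d_kpc, I_zz=1e38):
    d = d_kpc * KPC
    return math.sqrt(2.5 * G * I_zz * abs(nu_dot) / (c**3 * d**2 * nu))

print(h_sd(29.7, -3.7e-10, 2.0))    # ~ 1.4e-24, the Crab spin-down limit
```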
Two highlight results have come from pulsar GW data analysis [26, Abadie et al. 2011]. For the Crab pulsar the hsd ≃ 1.4 × 10−24 limit was beaten, finding h0 < 2.4 × 10−25, i.e. ≈ hsd/6. This implies that less than 2 % of the rotational energy is lost in GW, and also constrains the ellipticity to ε < 1.3 × 10−4. For the Vela pulsar the hsd ≃ 3.29 × 10−24 limit was beaten by finding h0 < 2.4 × 10−24, implying that less than 35 % of the rotational energy is lost in GW and constraining the ellipticity to ε < 1.1 × 10−3. These results also impose stringent limits on the EOS of an NS [21].

4. The near future: the new proposals

The ground-based GW detector network has been running with improved sensitivity (the Virgo VSR3 run in July 2010 and the S6b LIGO run). Major upgrades, e.g. Advanced LIGO [27] and Advanced Virgo [28], are aimed at achieving one order of magnitude better sensitivity than the current instruments (Fig. 1, blue and red curves). These detectors come into operation in 2014 or 2015, and, unless something in GR theory is wrong, we can foresee regular detections of binary inspirals, with very good prospects of detecting various other signals. The proposals for other large interferometric detectors, LCGT in Japan [29] (approval received in June 2010) and AIGO in Australia [30], will significantly improve the capabilities of the current detector network. We will have good chances of detecting low-frequency signals with pulsar timing arrays on about the same time scale (Hobbs et al. 2010), while space-based detectors such as LISA/NGO [33], aimed at detecting GW in the range 10−4 ÷ 10−1 Hz, and DECIGO [34] are anticipated years later to open up the intermediate frequency band. The third-generation conceptual design for the Einstein Telescope (ET) (Fig. 1) [32] will be a future opportunity for GW detection, since this telescope should attain a sensitivity about a factor of 100 better than the first-generation Virgo and LIGO interferometers.

5. Conclusions and open questions

In this paper we have given an overview of the past and present status of gravitational wave (GW) research associated with pulsars. Many questions remain open, e.g.: What are the upper limits for pulsars, and when will there be a direct detection? What fraction of pulsar energy is emitted in the form of gravitational waves? Can we have direct inferences on pulsar parameters from gravitational waves to constrain the EOS of an NS? Which will be the right model: NS or quark stars? With interferometric GW antennas (LIGO, GEO600, TAMA, Virgo) we have as yet only succeeded in setting interesting upper limits, but the cherished belief is that the first direct detection will be possible with the new generation of interferometers, and as a result answers may be found to some open questions.

Acknowledgements
This work was supported by INFN grant 2011.

References
[1] Shawhan, P. S.: 2010, Class. Quantum Grav. 27, 084017
[2] Accadia, T. et al.: 2012, Journal of Instrumentation 7, P03012
[3] Abbott, B. P. et al.: 2009, Rep. Prog. Phys. 72, 076901
[4] Grote, H. et al.: 2008, Class. Quantum Grav. 25, 114043
[5] Takahashi, R.: 2004, CQG 21, S403–8
[6] Cocke, W. J. et al.: 1969, Nature 221, 525
[7] Levine, J., Stebbins, R.: 1972, Phys. Rev. D 6, 1565
[8] Hirakawa, K. et al.: 1978, Phys. Rev. D 17, 1919
[9] Hirakawa, K. et al.: 1979, Phys. Rev. D 20, 2480
[10] Hereld, M.: 1983, PhD dissertation, California Institute of Technology
[11] Hough, J. et al.: 1983, Nature 303, 216
[12] Niebauer, J. et al.: 1993, Phys. Rev. D 47, 3106
[13] Owa, S. et al.: 1986, Proceedings of the IV Marcel Grossmann Meeting on General Relativity, Part A, 571
[14] Owa, S. et al.: 1988, Intern. Symposium on Experimental Gravitational Physics, 397
[15] Suzuki, T.: 1995, First Edoardo Amaldi Conference on Gravitational Waves
[16] Akutsu, T. et al.: 2008, Class. Quantum Grav. 25, 184013
[17] Ishidoshiro, K.: 2010, PhD dissertation, University of Tokyo
[18] Astone, P. et al.: 2001, Phys. Rev. D 65, 022001
[19] Astone, P. et al.: 2003, Class. Quantum Grav. 20, S665
[20] Palomba, C.: 2000, Astron. & Astrophys. 354, 163
[21] Ferrari, V.: 2010, Class. Quantum Grav. 27, 194006
[22] Abbott, B. P. et al.: 2004, Phys. Rev. D 69, 102001
[23] Abbott, B. P. et al.: 2005, Phys. Rev. D 72, 042002
[24] Abbott, B. P. et al.: 2007, Phys. Rev. D 76, 062003
[25] Abbott, B. P. et al.: 2010, Astrophys. J. 713, 617
[26] Abadie, J. et al.: 2010, ApJ 737, 93
[27] Harry, G. M. et al.: 2010, Class. Quantum Grav. 27, 084006
[28] Hild, S. et al.: 2009, Class. Quantum Grav. 26, 025005
[29] Ohashi, M., for the LCGT Collaboration: 2008, J. Phys.: Conf. Ser. 120, 032008
[30] Barriga, P. et al.: 2010, Class. Quantum Grav. 27, 084005
[31] Hobbs, G. B. et al.: 2009, Publ. Astron. Soc. Australia 26, 103
[32] Punturo, M. et al.: 2010, Class. Quantum Grav. 27, 084007
[33] Amaro-Seoane, P. et al.: 2012, arXiv:1201.3621v1
[34] Kawamura, S. et al.: 2008, Journal of Physics: Conference Series 122, 012006
[35] Mion, A. et al.: 2009, Astron. & Astroph. 504, 673
[36] Aguiar, O. D.: 2010, arXiv:1009.1138v1

Acta Polytechnica Vol. 52 No. 3/2012

Optimal Combustion Conditions for a Small-Scale Biomass Boiler

Viktor Plaček¹, Cyril Oswald¹, Jan Hrdlička²

¹ Czech Technical University in Prague, Faculty of Mechanical Engineering, Department of Instrumentation and Control Engineering, Technická 4, 166 07 Prague 6, Czech Republic
² Czech Technical University in Prague, Faculty of Mechanical Engineering, Department of Power Engineering, Technická 4, 166 07 Prague 6, Czech Republic
Correspondence to: viktor.placek@fs.cvut.cz

Abstract
This paper reports on an attempt to achieve maximum efficiency and the lowest possible emissions for a small-scale biomass boiler. This aim can be attained only by changing the control algorithm of the boiler, thus not raising its acquisition costs. This paper describes the experimental facility, the problems that arose while establishing the facility, and how we have dealt with them. The focus is on discontinuities arising after periodic grate sweeping, and on finding the parameters of the PID control loops. Alongside these methods, which need a lambda probe signal for proper functionality, we introduce another method, which is able to intercept the optimal combustion working point without the need to use a lambda sensor.

Keywords: biomass, combustion, control.

1 Introduction

Small-scale boilers used for residential heating have undergone considerable development in the last decade. The developments have focused on reducing manual user handling. This has led to the requirement for a low-volume fuel that can be delivered to the boiler by automatic feeding.
Automatic feeding enables the fuel to be fed in smaller batches, so that the combustion process is disturbed much less than when the fuel is stoked in the form of whole logs. Feeding small batches also allows combustion in much smaller combustion chambers, so that the size of the boiler can also be reduced [5]. However, a smaller combustion chamber leads to issues with combustion process control. The small mass of the burning fuel leads to much faster dynamics of the combustion process response, requiring the controller parameters to be adjusted more carefully for faster reactions. Other differences between small-scale boilers and larger boilers are that small-scale boilers:
• are not only faster but also more sensitive to external influences,
• have to be more resistant to a lack of boiler maintenance,
• have sensors that are not periodically tested.

We are developing a new original algorithm to control the combustion process in a way that will maintain the combustion process in optimal conditions. The algorithm meets the requirements of low acquisition costs and takes advantage of the sensitivity of the combustion process to external sources of excitation. It is also resistant to qualitative changes in the combustion process due to a lack of maintenance. The combustion process is often controlled by maintaining an optimal air-to-fuel ratio using feedback from a lambda probe (more on this topic in [5]). Although a lambda probe is not expensive equipment nowadays, it still has a significant impact on the cost of automating a boiler. The proposed algorithm is able to find the optimal air-to-fuel ratio without the need to use a lambda probe.

2 Experimental arrangement and deployment

The basic arrangement used for combustion control of small-scale biomass boilers is depicted in Figure 1. It is based on a biomass boiler for residential and small-enterprise heating that is widely available on the market. Its warmed-water heat output is 25 kW when combusting wood pellets. The original control electronics for the boiler was purpose-made, and could not be used for experimental control. We therefore developed and installed a control unit in addition to the original control electronics. The control unit is based on the REXWinLab-8000 data acquisition and control station, which had earlier been developed at the department where the authors work. Further information on the control unit can be found in [3].

[Figure 1: process schematic of the adapted boiler used for small-scale experiments.]

The boiler instrumentation was further extended by thermal measurements, a flue gas analyzing unit, a frequency changer for combustion air speed control, etc. The first issue that emerged when we completed the experimental arrangement was that we observed quite strong peaks in CO emissions in the instants just after grate sweeping. According to [1], these peaks are probably caused by the formation of air channels in the burning layer of the biomass, which are destroyed by the movement of the whole layer during grate sweeping. Until new channels are established, the combustion is temporarily short of oxygen. We shortened the length of the grate-sweeper run and shortened the grate sweeping period accordingly. The tuned grate sweeping algorithm led to almost complete elimination of the CO peaks. More on this topic can be found in [3]. The original control electronics of the boiler uses some proprietary modulation of the heat output that is unknown to us.
The modulation is based on burn-out and start-up of the fire in the combustion chamber, leading to quite long transitions. During the transitions, the boiler has considerably high emissions of flue gases, and, as in the case of the grate sweeping peaks mentioned above, unpleasant fluctuations make any automatic evaluation of the steady state of the combustion process difficult.

3 Optimizing the operation conditions for a small-scale biomass-fired hot-water boiler

The optimal operation conditions for a biomass-fired boiler can be viewed from two main perspectives: economic and ecological. From the economic point of view, we want to maintain the desired performance of the boiler while using as little fuel as possible. In other words, we want to achieve maximum boiler efficiency. From the ecological point of view, we want to have the lowest content of the monitored components in the flue gases throughout the process. Both the economic aspects and the ecological aspects of optimal combustion can be affected by controlling the current excess air ratio. If the only ecological consideration is to monitor carbon monoxide emissions, both of the combustion optimality requirements can be fulfilled at very similar excess air ratio values (Figure 2). Then the optimal operation conditions for a biomass-fired boiler can be achieved by monitoring only a single factor.

[Figure 2: efficiency and CO dependency on the excess air ratio for a biomass combustion process: a typical trend scheme demonstrating that the minimum of CO emissions and the efficiency maximum occur for approximately the same excess air ratio value.]

A typical approach to controlling the optimal operation conditions for a biomass-fired boiler involves monitoring the current carbon monoxide content in the flue gases and controlling the excess air ratio by changing the primary and/or secondary combustion air supply. This approach is effective and relatively fast. On the other hand, optimizing the boiler operation conditions by analyzing the composition of the flue gases requires special, relatively expensive instrumentation, such as a lambda probe. The additional expenses are a problem for producers of small-scale biomass-fired boilers for residential heating, and for their customers. The control algorithm for optimizing small-scale biomass-fired boiler operation conditions by monitoring current fuel consumption has been developed for this reason. The new optimization algorithm is based on the following assumption: if the controller continuously controls the current boiler heat output by changing the fuel supply, and there are no changes in the desired boiler heat output, then maximum boiler efficiency is achieved when the fuel consumption is lowest. The proposed algorithm continuously analyzes the current fuel consumption and adjusts the combustion air supply, changing the excess air ratio, so as to reach the minimum required fuel consumption. This approach is more time-consuming than the approach using an analysis of the flue gases. On the other hand, the proposed optimization algorithm does not need special additional instrumentation, and does not introduce any additional costs for the boiler producer. The proposed optimization algorithm was tested on the small-scale biomass-fired boiler introduced above. The results of the test are depicted in Figure 3.
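A minimal sketch of the optimization logic described above is given below, assuming a toy convex relation between the excess air ratio and the fuel rate needed for constant output (the real relation is only qualitatively known, cf. Figure 2). All names, step sizes and the assumed optimum near λ ≈ 1.6 are ours; the authors' implementation in the REXWinLab-8000 station may differ substantially.

```python
# Perturb-and-observe search for the air setpoint minimizing fuel consumption.
# Toy plant model: fuel rate for constant output, convex in the excess air
# ratio with an assumed, illustrative minimum near 1.6.
def average_fuel_rate(air):
    return 5.0 + 3.0 * (air - 1.6)**2

def optimize_air(air0, step=0.02, n_iter=40):
    air = air0
    prev = average_fuel_rate(air)
    direction = +1.0
    for _ in range(n_iter):
        air += direction * step
        rate = average_fuel_rate(air)      # in practice: averaged over minutes
        if rate > prev:                    # consumption got worse:
            direction = -direction         # reverse the search direction
        prev = rate
    return air

print(optimize_air(1.2))                   # walks to and oscillates near ~1.6
```

On the real plant the `average_fuel_rate` call would be replaced by a measurement of the fuel feed rate averaged over a window long enough for the slow boiler dynamics to settle, which is what makes the approach time-consuming.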
It shows that, although this approach is very time-consuming, it is an effective and cheap way to find and maintain small-scale biomass-fired boiler combustion in its optimal operating conditions.

[Figure 3: results of an air–fuel ratio optimization algorithm experiment.]

We are currently developing the optimization approach for mid-scale boilers, which will use a combination of the two approaches mentioned here to achieve a better compromise between the two aspects of optimal operating conditions. In addition, this new approach should be able to observe the current combustion process parameters.

4 Conclusion

Although there are no emission regulations for small-scale boilers in most countries, recent developments in national legislation show that this situation will probably change in the near future. One probably inevitable way to reduce emissions will be by improving boiler control algorithms. This paper has pointed out some of the issues that can arise when building an experimental small-scale biomass combustion platform. It has also introduced a method for finding optimal combustion conditions without the need for a lambda probe. Future work will involve testing the proposed algorithm on a boiler with a much greater heat output. The Verner 600 experimental boiler is located in Turňa nad Bodvou in Slovakia, and experiments on it are controlled remotely, using the internet, in cooperation with the Technical University in Košice.

Acknowledgement
This work has been supported by the Ministry of Education of the Czech Republic, under project No. MSM68400770035 "Development of Environmentally Friendly Decentralized Power Systems", which is gratefully acknowledged. The work of PhD students has been supported by the doctoral grant support of the Czech Technical University in Prague, grant No. SGS10/252/OHK2/3T/12.

References
[1] Korpela, T., Björkqvist, T., Lautala, P.: Durable feedback control system for small scale wood chip combustion. In Proceedings of World Bioenergy 2008, Jönköping, Sweden, May 2008, p. 224–230.
[2] Mižák, J., Pitel', J.: On the problem of use of lambda sensors for combustion control. In Zborník príspevkov ARTEP 2011, Stará Lesná, 16.–18. 2. 2011. Košice: TU v Košiciach, 2011. ISBN 978-80-553-0606-3.
[3] Plaček, V., Šulc, B., Vrána, S., Hrdlička, J., Pitel', J.: Investigation in control of small-scale biomass boilers. In Proceedings of the 2011 12th International Carpathian Control Conference (ICCC), Velké Karlovice: IEEE, 2011, p. 312–315.
[4] Oswald, C., Šulc, B.: Achieving optimal operating conditions in PI controlled biomass-fired boilers: undemanding way for improvement of small-scale boiler effectiveness. In Proceedings of the 2011 12th International Carpathian Control Conference, Velké Karlovice, 25.–28. May 2011, p. 280–285. ISBN 978-1-61284-359-9.
[5] Pitel', J., Mižák, J.: Approximation of CO/lambda biomass combustion dependence by artificial intelligence techniques. In Annals of DAAAM for 2011 & Proceedings of the 22nd International DAAAM Symposium, Vienna: DAAAM International, p. 0143–0144.
[6] van Loo, S., Koppejan, J.: The Handbook of Biomass Combustion and Co-firing. London: Earthscan, 2008.

Acta Polytechnica 53(Supplement):550–554, 2013

Dark Energy and Key Physical Parameters of Clusters of Galaxies
Gennady S. Bisnovatyi-Kogan a,b,*, Artur Chernin c, Marco Merafina d

a Space Research Institute, Rus. Acad. Sci., Moscow, Russia
b Moscow Engineering Physics Institute, Moscow, Russia
c Sternberg Astronomical Institute, MSU, Moscow, Russia
d University of Rome "La Sapienza", Department of Physics, Rome, Italy
* Corresponding author: gkogan@iki.rssi.ru

Abstract. We study the physics of clusters of galaxies embedded in the cosmic dark energy background. The equilibrium and stability of polytropic spheres with equation of state of the matter $P = K\rho^\gamma$, $\gamma = 1 + 1/n$, in the presence of a non-zero cosmological constant Λ is investigated. The equilibrium state exists only for central densities ρ0 larger than the critical value ρc, and there are no static solutions at ρ0 < ρc. At this density the radius of the configuration is equal to the zero-gravity radius, at which the dark matter gravity is balanced by the dark energy antigravity. It is shown that dark energy reduces the dynamic stability of the configuration. We show that the dynamical effects of dark energy are strong in clusters like the Virgo cluster, whose halo radius is close to the zero-gravity radius. It is shown that the empirical data on clusters like the Virgo cluster or the Coma cluster are consistent with the assumption that the local density of dark energy on the scale of clusters of galaxies is the same as on the global cosmological scales.

Keywords: dark energy, equilibrium models, galaxy clusters.

1. Introduction

Analysis of the observations of distant SN Ia [16, 17] and of the spectrum of fluctuations of the cosmic microwave background radiation (CMB), see e.g. [18], has led to the conclusion that the term representing "dark energy" (DE) contains about 70 % of the average energy density in the present universe, and its properties are very close to the properties of the Einstein cosmological Λ term, with a density $\rho_\Lambda = \frac{c^2}{8\pi G}\Lambda = 0.7 \times 10^{-29}$ g/cm³ and pressure $p_\Lambda = -\frac{c^2}{8\pi G}\Lambda$, so that $p_\Lambda = -\rho_\Lambda$ in units with c = 1. Merafina et al. [15] constructed Newtonian self-gravitating models with a polytropic equation of state in the presence of DE. The additional parameter β represents the ratio of the density of DE to the matter central density of the configuration. The limiting values βc were found, so that at β > βc there are no equilibrium configurations. The dynamic stability of the equilibrium models with DE is analyzed using an approximate energetic method. It is shown that DE produces a destabilizing effect, contrary to the stabilizing influence of the cold dark matter [2, 14].

Local dynamical effects of dark energy were first recognized by Chernin et al. (2000), based on studies of the Local Group of galaxies and the expansion outflow of dwarf galaxies around it [1, 5–7, 12, 19]. Chernin et al. [10] have shown that in the nearest rich cluster of galaxies, the Virgo cluster, the matter gravity dominates in the volume of the cluster, while the dark energy antigravity is stronger than the matter gravity in the Virgocentric outflow at distances of ≃ 10 ÷ 30 Mpc from the cluster center. The key physical parameter here is the "zero-gravity radius", the distance from the system center at which the matter gravity and the dark energy antigravity exactly balance each other. Bisnovatyi-Kogan and Chernin [4] have considered a cluster as a gravitationally bound quasi-spherical configuration of cold non-relativistic collisionless dark and baryonic matter in the cosmological proportion, in the presence of a dark energy with the cosmological density ρΛ in the same volume. It was shown that the zero-gravity radius may serve as a natural cut-off radius for the dark matter halo of a cluster.

The organization of the paper is the following: in Sections 2 and 3 we derive the equations, find equilibrium solutions, and analyze the stability of polytropic configurations in the presence of dark energy in the form of a cosmological constant. Section 4 is devoted to the application of these results to the estimation of parameters of the Local and Virgo clusters. This presentation follows the papers of Merafina et al. [15] and Bisnovatyi-Kogan and Chernin [4].
it was shown that the zero-gravity radius may serve as a natural cut-off radius for the dark matter halo of a cluster. the organization of the paper is the following: in sections 2 and 3 we derive equations, find equilibrium solutions, and analyze a stability of polytropic configurations in presence of a dark energy, in the form of a cosmological constant. the section 4 is devoted to application of these result to the estimation of parameters of local and virgo clusters. this presentation follows the papers of merafina et al. [15], and bisnovatyi-kogan and chernin [4]. 2. main equations let us consider spherically symmetric equilibrium configuration in newtonian gravity, in presence of de, represented by the cosmological constant λ. in this case, the gravitational force fg which a unit mass undergoes in a spherically symmetric body is written as fg = −gmr2 + λr 3 , where m = m(r) is the mass inside the radius r. its connections with the matter den550 http://dx.doi.org/10.14311/ap.2013.53.0550 http://ojs.cvut.cz/ojs/index.php/ap vol. 53 supplement/2013 dark energy and key physical parameters of clusters of galaxies sity ρ and the equilibrium equation are respectively written as dmdr = 4πρr 2, 1 ρ dp dr = − gm r2 + λr3 , and the de density ρv is connected with λ as ρv = λ8πg. let us consider a polytropic equation of state p = kργ, with γ = 1 + 1/n. by introducing the nondimensional variables ξ and θn so that r = αξ and ρ = ρ0θnn, α2 = (n+1)k4πg ρ 1 n −1 0 , we obtain the lane–emden equation for polytropic models with de 1 ξ2 d dξ ( ξ2 dθn dξ ) = −θnn + β. (1) here ρ0 is the matter central density, α is the characteristic radius, β = λ/4πgρ0 = 2ρv/ρ0 is twice the ratio of the de density to the central density of the configuration. the spherically symmetric poisson equation for the gravitational potential ϕ∗ in presence of de is given by 1 r2 d dr ( r2 dϕ∗ dr ) = 4πg(ρ−2ρv), ϕ∗ = ϕ+ϕλ. (2) the gravitational energy of a spherical body εg is εg = −g ∫m 0 m r dm, m = 4π ∫ r 0 ρr 2dr, where m = m(r), r is the total radius, and the energy ελ, representing the interaction of the matter with de, is given by ελ = ∫m 0 ϕλdm, ϕλ = −4πgρvr 2/3. the relations between gravitational εg, thermal εth energies, and the energy ελ (the virial theorem) have been found by merafina et al. [15]. εg = − 3 5 −n gm2 r − λ 2(5 −n) mr2− 2n + 5 5 −n ελ, (3) εth = n 5 −n gm2 r + nλ 6(5 −n) mr2 + 5n 5 −n ελ. (4) εtot = n− 3 5 −n gm2 r + (n− 3)λ 6(5 −n) mr2 + 2n 5 −n ελ. (5) 3. equilibrium solutions the equilibrium mass mn for a polytropic configuration which is solution of the lane–emden equation is written as mn = 4π [ (n + 1)k 4πg ]3/2 ρ 3 2n − 1 2 0 ∫ ξout 0 θn nξ2dξ. (6) using eq. 1, the integral in the right site may be calculated by partial integration, giving the following relation for the mass of the configuration mn = 4πρ0α3 [ −ξ2out ( dθn dξ ) out + βξ3out 3 ] . here θn(ξ) is not a unique function, but depends on the parameter β, according to eq. 1. for the limiting configuration, with β = βc, we have on the outer boundary θn(ξout) = 0, dθndξ |ξout = 0, and the mass mn,lim of the limiting configuration is written as mn,lim = 4π3 rout 3βcρ0c = 4π3 rout 3ρ̄c, so that the limiting value βc is exactly equal to the ratio of the average matter density ρ̄c of the limiting configuration to its central density ρ0c: βc = ρ̄c/ρ0c. for the lane–emden solution with β = 0, we have ρ0/ρ̄ = 3.290, 5.99, 54.18 for n = 1, 1.5, 3, respectively. let us consider the curve m(ρ0) for a constant de density ρv = λ/8πg. 
Let us consider the curve M(ρ0) for a constant DE density ρv = Λ/8πG. For plotting this curve in nondimensional form, we introduce an arbitrary scaling constant ρch and write the mass in the form

$$M_n = 4\pi\left[\frac{(n+1)K}{4\pi G}\right]^{3/2}\rho_{ch}^{\frac{3}{2n}-\frac{1}{2}}\hat{M}_n, \qquad \hat{M}_n = \hat{\rho}_0^{\frac{3}{2n}-\frac{1}{2}}\left[\frac{\beta\xi_{out}^3}{3} - \xi_{out}^2\left(\frac{d\theta_n}{d\xi}\right)_{out}\right],$$

where ρ̂0 = ρ0/ρch is the nondimensional central density and M̂n is the nondimensional mass. The numerical solutions of Eq. (1) have been obtained by Merafina et al. [15] for n = 1, 3, 1.5. At n = 1 we have ξout = π, 3.490, 4.493 for β = 0, β = 0.5βc = 0.089, β = βc = 0.178, respectively. The nondimensional curve M̂n(ρ̂0) at constant ρv = βρ0/2 is plotted in Fig. 1 for βin = 0, βin = 0.5βc, βin = βc, for which M̂1 = π, 3.941, 5.397 at ρ̂0 = 1, ρ̂0β = βin = const. At n = 3 the numerical solution of the equilibrium equation gives ξout = 6.897, 7.489, 9.889 for β = 0, β = 0.5βc = 0.003, β = βc = 0.006, respectively. In Fig. 2 we show the behavior of M̂3(ρ̂0)|Λ for different values of βin = 0, βin = 0.5βc, βin = βc, for which M̂3 = 2.018, 2.060, 2.109 at ρ̂0 = 1, respectively. At n = 1.5 we have ξout = 3.654, 3.984, 5.086 for β = 0, β = 0.5βc = 0.041, β = βc = 0.082, respectively. For βin = 0, βin = 0.5βc, βin = βc we have M̂3/2 = 2.714, 3.081, 3.622 at ρ̂0 = 1, respectively.

A stability analysis of these configurations was performed by Merafina et al. [15] using an approximate energetic method [3, 20]. The density in the configuration is distributed according to the Lane–Emden solution at n = 3, ρ = ρ0θ3³(ξ), and we investigate the stability with respect to homologous perturbations. Taking ρ = ρ0φ(m/M), with a nondimensional function φ remaining constant during homologous perturbations, we write the derivative of the total energy ε, set equal to zero, as an equilibrium equation:

$$\frac{\partial\varepsilon}{\partial\rho_0^{1/3}} = 3\rho_0^{-4/3}\int_0^M \frac{P\,dm}{\varphi(m/M)} - 0.639\,GM^{5/3} + 0.208\,\Lambda M^{5/3}\rho_0^{-1} - 1.84\,\frac{G^2M^{7/3}}{c^2}\,\rho_0^{1/3} = 0. \quad (7)$$

The dynamical stability is defined by the sign of the second derivative. The DE input in the stability of the configuration is negative, like the general relativistic correction [15].

[Figure 1. Nondimensional mass M̂1 of the equilibrium polytropic configurations at n = 1 as a function of the nondimensional central density ρ̂0, for different values of βin. The cosmological constant Λ is the same along each curve. The curves at βin ≠ 0 are limited by the configuration with β = βc.]

4. Local and Virgo clusters

For the presently accepted value of the DE density, ρv = (0.72 ± 0.03) × 10−29 g/cm³, the mass of the Local Group, including the dark matter input, is between MLC ∼ 3.5 × 10¹² M⊙ [8] and MLC ∼ 1.3 × 10¹² M⊙ [11]. The radius RLC of the LC may be estimated by measuring the velocity dispersion vt of galaxies in the LC and by applying the virial theorem, so that RLC ∼ GMLC/vt². The estimated vt = 63 km/s is close to the value of the local Hubble constant, H = 68 km s⁻¹ Mpc⁻¹ [11]. The radius of the LC may thus be estimated as RLC = GMLC/vt² = (1.5 ÷ 4) Mpc. Chernin et al. [8] identify the radius RLC with the radius rΛ of the zero-gravity force, with 1.2 < MLC < 3.7 × 10¹² M⊙ and 1.1 < rΛ < 1.6 Mpc. These estimates indicate the importance of DE for the structure and dynamics of the outer parts of the LC. Clusters of galaxies are known as the largest gravitationally bound systems, and the zero-gravity radius is an absolute upper limit for the radial size R of a static cluster with a mass M:

$$R < R_\Lambda = \left[\frac{M}{\frac{8\pi}{3}\rho_\Lambda}\right]^{1/3}.$$
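A small numerical illustration (ours) of the zero-gravity radius formula just stated, using the ρΛ value from the Introduction and the Virgo- and Coma-like masses discussed next:

```python
# R_Lambda = (M / (8*pi/3 * rho_Lambda))**(1/3), evaluated in CGS units.
import math

RHO_LAMBDA = 0.72e-29        # g/cm^3
M_SUN = 1.989e33             # g
CM_PER_MPC = 3.086e24

def zero_gravity_radius_mpc(mass_msun):
    r_cm = (mass_msun * M_SUN / (8.0*math.pi/3.0 * RHO_LAMBDA)) ** (1.0/3.0)
    return r_cm / CM_PER_MPC

print(zero_gravity_radius_mpc(1.0e15))   # ~ 10 Mpc (Virgo-like mass)
print(zero_gravity_radius_mpc(1.0e16))   # ~ 20 Mpc (Coma-like mass)
```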
Taking the total mass of the Virgo cluster (dark matter and baryons) M = (0.6 ÷ 1.2) × 10¹⁵ M⊙ [4], one finds the zero-gravity radius of the Virgo cluster: RΛ = (9 ÷ 11) Mpc ≃ 10 Mpc. For the richest clusters, like the Coma cluster with masses ≃ 10¹⁶ M⊙, the zero-gravity radius is about 20 Mpc.

[Figure 2. Same as in Fig. 1, for n = 3.]

The data of the Hubble diagram for the Virgo system [13] enable us to obtain another approximate empirical equality:

$$\left[\frac{Rv^2}{GM}\right]_{Virgo} \simeq 1.$$

This relation does not assume either any kind of equilibrium state of the system, or any special relation between the kinetic and potential energies. It assumes only that the system is embedded in the dark energy background and is gravitationally bound. The data on the Local Group [8, 12] give

$$\left[\frac{Rv^2}{GM}\right]_{Virgo} \simeq \left[\frac{Rv^2}{GM}\right]_{LG} \simeq 1.$$

Here we use for the Local Group the following empirical data: R ≃ 1 Mpc, M ≃ 10¹² M⊙, v ≃ 70 km s⁻¹ [12]. Assuming that the Virgo system has a zero-gravity radius RΛ, we obtain from the empirical relation that

$$v^2 \simeq \left(\frac{8\pi}{3}\right)^{1/3} GM^{2/3}\rho_\Lambda^{1/3}. \quad (8)$$

The velocity dispersion in a gravitationally bound system thus depends only on its mass and on the universal dark energy density. The relation (8) enables one to estimate the matter mass of a cluster from its velocity dispersion:

$$M \simeq G^{-3/2}\left[\frac{8\pi}{3}\rho_\Lambda\right]^{-1/2}v^3 \simeq 10^{15}\left[\frac{v}{700\ \mathrm{km/s}}\right]^3 M_\odot.$$

The approximate empirical relation may also serve as an estimator of the local dark energy density, ρloc. If the mass of a cluster and its velocity dispersion are independently measured, one has

$$\rho_{loc} \simeq \frac{3}{8\pi G^3}\,M^{-2}v^6 \simeq \rho_\Lambda\left[\frac{M}{10^{15}M_\odot}\right]^{-2}\left[\frac{v}{700\ \mathrm{km/s}}\right]^6,$$

which indicates that the observational data on the Local system and the Virgo system provide evidence in favor of a universal value of the dark energy density, the same on both global and local scales.
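The two estimators above are one-liners to evaluate; in the hedged sketch below the constants are in CGS units and the inputs are the fiducial values used in the text.

```python
# Cluster mass from velocity dispersion, and the implied local DE density.
import math

G = 6.674e-8                 # cm^3 g^-1 s^-2
RHO_LAMBDA = 0.72e-29        # g/cm^3
M_SUN = 1.989e33             # g

def mass_from_dispersion(v_kms):
    v = v_kms * 1e5
    return G**-1.5 * (8*math.pi/3 * RHO_LAMBDA)**-0.5 * v**3 / M_SUN

def rho_local(mass_msun, v_kms):
    v = v_kms * 1e5
    return 3.0/(8*math.pi*G**3) * (mass_msun*M_SUN)**-2 * v**6

print(mass_from_dispersion(700.0))             # ~ 1e15 solar masses
print(rho_local(1.0e15, 700.0) / RHO_LAMBDA)   # of order unity: rho_loc ~ rho_Lambda
```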
in a similar way, the mass may be found if the theoretical value of the temperature $T_{\rm iso}$ is identified with the measured temperature of the intracluster plasma:

$M = \left(\frac{3k}{Gm}\right)^{3/2}\left(\frac{8\pi}{3}\rho_\Lambda\right)^{-1/2} T_{\rm iso}^{3/2} = \left(\frac{T}{2\times10^7\,\mathrm{K}}\right)^{3/2}\times10^{15}\,M_\odot.$

if the matter mass of a cluster and its velocity dispersion or its plasma temperature are measured independently, one can estimate the local density of dark energy:

$\rho_{\rm loc} = \rho_\Lambda\left(\frac{M}{10^{15}M_\odot}\right)^{-2}\left(\frac{V}{780\,\mathrm{km/s}}\right)^6,$ (9)

$\rho_{\rm loc} = \rho_\Lambda\left(\frac{M}{10^{15}M_\odot}\right)^{-2}\left(\frac{T}{3\times10^7\,\mathrm{K}}\right)^3.$ (10)

the empirical data on clusters like the virgo cluster or the coma cluster are consistent with our assumption that the local density of dark energy on the scale of clusters of galaxies is the same as on the global cosmological scales.

5. conclusions

(1.) the key physical parameter of a cluster of galaxies is the zero-gravity radius $R_\Lambda = \left[M\big/\left(\frac{8\pi}{3}\rho_\Lambda\right)\right]^{1/3}$. a bound system must have a radius $R \le R_\Lambda$. for the virgo cluster $R \simeq R_\Lambda \simeq 10$ mpc.

(2.) the mean density of a cluster's dark matter halo does not depend on the halo density profile and is determined by the dark energy density only: $\langle\rho\rangle = 2\rho_\Lambda$.

(3.) the available observational data show that the local density is near the global value $\rho_\Lambda$.

acknowledgements

the work of gsbk was partially supported by rfbr grant 11-02-00602, the ran program p20 and grant nsh-3458.2010.2. a.c. appreciates partial support from the rfbr grant 10-02-0178. gsbk is grateful to the organizers of the workshop for support.

references

[1] baryshev yu.v., chernin a.d., teerikorpi p.: 2001, a&a, 378, 729
[2] bisnovatyi-kogan, g.s.: 1998, apj 497, 559
[3] bisnovatyi-kogan, g.s.: 2001, stellar physics. ii. stellar structure and stability (heidelberg: springer)
[4] bisnovatyi-kogan, g.s., chernin a.d.: 2012, apss, 338, 337
[5] byrd g.g., chernin a.d., valtonen m.j.: 2007, cosmology: foundations and frontiers (moscow: urss)
[6] chernin a.d.: 2001, physics-uspekhi, 44, 1099
[7] chernin, a.d.: 2008, physics-uspekhi, 51, 267
[8] chernin, a.d. et al.: 2009, arxiv:0902.3871v1
[9] chernin a.d., karachentsev i.d., valtonen m.j. et al.: 2009, a&a, 507, 1271
[10] chernin a.d., karachentsev i.d., nasonova o.g. et al.: 2010, a&a, 520, a104
[11] karachentsev, i.d. et al.: 2006, aj 131, 1361
[12] karachentsev i.d., kashibadze o.g., makarov d.i. et al.: 2009, mnras, 393, 1265
[13] karachentsev, i.d., nasonova o.g.: 2010, mnras, 405, 1075
[14] mclaughlin, g., fuller, g.: 1996, apj 456, 71
[15] merafina m., bisnovatyi-kogan g.s., tarasov s.o.: 2012, a&a, 541, 84
[16] perlmutter s., aldering g., goldhaber g. et al.: 1999, apj, 517, 565
[17] riess a.g., filippenko a.v., challis p. et al.: 1998, aj, 116, 1009
[18] spergel, d.n. et al.: 2003, apj suppl., 148, 17
[19] teerikorpi p., chernin a.d.: 2010, a&a 516, 93
[20] zel'dovich, ya.b., novikov, i.d.: 1966, physics-uspekhi 8, 522

axial force at the vessel bottom induced by axial impellers
i. fořt, p. hasal, a. paglianti, f. magelli

this paper deals with the axial force affecting the flat bottom of a cylindrical stirred vessel. the vessel is equipped with four radial baffles and is stirred with a four 45° pitched blade impeller pumping downwards. the set of pressure transducers is located along the whole radius of the flat bottom between two radial baffles. the radial distribution of the dynamic pressures indicated by the transducers is measured in dependence on the impeller off-bottom clearance and impeller speed. it follows from the results of the experiments that, under a turbulent regime of flow of an agitated liquid, the mean time values of the dynamic pressures affecting the bottom depend not on the impeller speed but on the impeller off-bottom clearance. according to the model of the flow pattern of an agitated liquid along the flat bottom of a mixing vessel with a pitched blade impeller, three subregions can be considered in this region: the subregion where the liquid jet streaming downwards from the impeller deviates from its vertical (axial) direction to the horizontal direction, the subregion of the liquid flowing horizontally along the bottom and, finally, the subregion of the liquid changing direction from the bottom upwards (vertically) along the wall of the cylindrical vessel, the volumetric flow rates of the liquid taking place in the downward and upward flows being the same.

keywords: pitched blade impeller, dynamic pressure, mixing vessel, flat bottom, pressure transducer.

1 introduction

axial high-speed rotary impellers (propeller impellers, pitched blade impellers, axial hydrofoil impellers) create a significant axial force (thrust). the rotation of the impeller causes a field of axial forces by which the flowing liquid acts on the impeller and the vessel. it is relatively easy to determine such forces and their components, and it can be done without interfering with the velocity field.

until now, the axial force of axial flow impellers has been determined directly as the change in vessel weight [1] or from the distribution of the axial force affecting the bottom [2–4]. from the latter investigation it follows that the distribution of the axial force affecting the flat bottom of a cylindrical mixing vessel is unambiguously joined with the flow pattern along the bottom: force fax1 originates when the liquid jet streaming downwards from the impeller rotor region deviates from its vertical (axial) direction and flows along the vessel bottom, and force fax2 appears in the region where the liquid flowing along the vessel bottom changes direction and starts flowing vertically (axially) up the cylindrical vessel walls. between the areas of the bottom affected by forces fax1 and fax2 there is, under certain conditions (e.g. impeller off-bottom clearance), a subregion where the pressure force acts on the corresponding area.

in experiments, most attention has been paid to the distribution of the axial pressures along the bottom. these distributions were determined directly [2, 3] by measuring the total pressures in holes situated in the bottom of the vessel; the corresponding dynamic pressures were calculated as the differences between the measured total pressure and the hydrostatic pressure under the conditions of the experiment. direct measurement of the dynamic pressures affecting the wall of the mixing vessel [5, 6] has recently been developed. this approach consists of a non-invasive measurement by means of pressure transducers located in appropriate positions on the boundary of the stirred system (wall, bottom), simultaneously recording the measured instantaneous signal throughout the experiment. such an experimental technique allows us to determine not only the mean value of the measured quantity but also some characteristics of its oscillations.

the aim of this study is to determine the distribution of the dynamic pressures along the flat bottom of a pilot plant cylindrical mixing vessel equipped with four radial baffles and stirred with a four 45° pitched blade impeller pumping downwards. a set of pressure transducers is located along the whole radius of the flat bottom between two adjacent baffles. the radial distribution of the dynamic pressures indicated by the pressure transducers is determined in dependence on the impeller off-bottom clearance and the impeller speed. the results of the experiments are evaluated according to the model idea of the flow along the flat bottom in a cylindrical mixing vessel with an axial high-speed impeller under a turbulent regime of flow of the agitated liquid.

2 experimental

the experiments were performed in a flat-bottomed cylindrical stirred vessel of inner diameter $T = 0.49$ m, filled with water at room temperature to the vessel height $H = T$. the vessel was equipped with four radial baffles (width of baffles $b = 0.1T$) and was stirred with a four pitched blade impeller (pitch angle $\alpha = 45°$, diameter $d = 2T/5$, width of blade $w = d/5$), pumping downward. two levels of impeller off-bottom clearance, $c = T/3$ and $c = T/4$, and two levels of impeller speed, $n = 284$ rpm and 412 rpm, were adjusted (see fig. 1).

fig. 1: pilot plant experimental equipment

in this work, the total pressure time series were recorded using a set of nine pressure transducers distributed along the flat bottom between two adjacent baffles (see fig. 2). the transducer is based on a silica chip with a slim diaphragm of 2.54 mm² surface area, which is able to reveal small pressure variations. the time traces were recorded and transferred to a personal computer by a data acquisition card, controlled by a simple labview programme. for each experimental run, a time history of about 60 minutes was recorded at a sampling frequency of 500 hz.

3 results of experiments

the mean time values of the local total axial pressures detected by the pressure transducers were calculated from the measured total pressure time series. then the mean time values of the local dynamic axial pressures $\bar p_{\rm ax}$ were calculated as the difference between the local mean total axial pressure and the hydrostatic pressure given by the height of the liquid level $H$. the calculated results for each transducer position were the mean time value of the local dynamic pressure $\bar p_{\rm ax}$ and its standard deviation $s_{p_{\rm ax}}$, always at the given (selected) value of the off-bottom clearance and the impeller speed. the dimensional quantity $\bar p_{\rm ax}$ was expressed in dimensionless form,

$P_{\rm ax} = \frac{\bar p_{\rm ax}}{\rho n^2 d^2},$ (1)

and plotted in dependence on the dimensionless radial coordinate $R$, where $r$ is a radial coordinate with its origin in the centre of the circular bottom:

$R = \frac{2r}{T}.$ (2)

figs. 3 and 4 illustrate the above mentioned radial profiles of the mean time local dynamic pressure, always at two levels of impeller speed. a statistical analysis of the fluctuating local dynamic pressures resulted in calculating the standard deviation of this quantity, $s_{p_{\rm ax}}$. these statistical characteristics were expressed in dimensionless form,
$S_{p_{\rm ax}} = \frac{s_{p_{\rm ax}}}{\rho n^2 d^2},$ (3)

or as the coefficient of variation,

$v_{p_{\rm ax}} = \frac{s_{p_{\rm ax}}}{\bar p_{\rm ax}}.$ (4)

fig. 2: distribution of pressure transducers along the bottom (spacings in cm, from the centre to the wall: 4, 2.2, 2, 2, 2, 1.8, 2.2, 1.5, 2, 1.8, 3; total 24.5 cm = t/2)

fig. 3: radial profiles of the dimensionless mean time local axial dynamic pressures on the vessel bottom (c = t/3)

figs. 5 and 6 illustrate the radial profiles of the quantity $S_{p_{\rm ax}}$, and figs. 7 and 8 illustrate the radial profiles of the quantity $v_{p_{\rm ax}}$, always in dependence on the dimensionless radial coordinate $R$. in accordance with the results of previous investigations [3, 4], three subregions of the forces acting on the flat bottom of the mixing vessel can be considered:

1. the subregion below the axial impeller, where the liquid jet streaming downward from the impeller rotor region deviates from its direction and flows along the vessel bottom. the outer boundary of this region, $r_1$, is given by the first intersection of the curve $\bar p_{\rm ax} = \bar p_{\rm ax}(r)$ with the coordinate $\bar p_{\rm ax} = 0$.

2. the subregion at the vessel wall, where the liquid flowing along the vessel bottom changes direction and starts flowing up the cylindrical vessel walls. the inner boundary of this region, $r_2$, is given by the second intersection of the curve $\bar p_{\rm ax} = \bar p_{\rm ax}(r)$ with the coordinate $\bar p_{\rm ax} = 0$. the outer boundary of this subregion is the radial coordinate of the vessel wall.

3. the subregion at the flat bottom where the agitated liquid flows along the bottom. its boundaries are given by the two above mentioned intersections, $r_1$ (the inner one) and $r_2$ (the outer one).

fig. 4: radial profiles of the dimensionless mean time local axial dynamic pressures on the vessel bottom (c = t/4)

fig. 5: radial profiles of dimensionless standard deviations of the local axial dynamic pressures on the vessel bottom (c = t/3)

fig. 6: radial profiles of dimensionless standard deviations of the local axial dynamic pressures on the vessel bottom (c = t/4)

fig. 7: radial profiles of variances of the local axial dynamic pressures on the vessel bottom (c = t/3)

it follows from the plots $\bar p_{\rm ax} = \bar p_{\rm ax}(R)$ shown in figs. 3 and 4 that the series of above defined subregions depends on the impeller off-bottom clearance. the higher the quantity $c/T$, the smaller the area of the second subregion; finally, this subregion can disappear, so that it holds

$R_{1,\max} = R_{2,\min} = \frac{\sqrt{2}}{4}.$ (5)

then the maximum value of the coordinate $R_1$ coincides with the minimum value of the coordinate $R_2$, corresponding to the value $\sqrt{2}/4$ of the dimensionless radial coordinate. the mean time axial forces affecting the individual subregions of the bottom can be calculated, under the assumption of axial symmetry of the investigated region, by integrating the mean time local dynamic pressures over the corresponding area of the bottom:

$\bar f_{\rm ax1} = 2\pi\int_0^{r_1} \bar p_{\rm ax}(r)\,r\,\mathrm{d}r$ (6)

and

$\bar f_{\rm ax2} = 2\pi\int_{r_2}^{T/2} \bar p_{\rm ax}(r)\,r\,\mathrm{d}r.$ (7)
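eqs. (6) and (7) amount to a weighted radial integration of the measured mean pressure profile. a minimal sketch of how such a force could be evaluated from discrete transducer readings is given below; the sample profile is a made-up placeholder, not the measured data, and the nondimensionalization anticipates eq. (8) defined just below:

```python
# sketch of eqs. (6)-(7): axial force from the radial pressure profile by
# trapezoidal integration of 2*pi*p(r)*r dr, then non-dimensionalisation.
import math

RHO, N, D = 998.0, 284.0 / 60.0, 0.196   # water; rpm -> 1/s; d = 2T/5, T = 0.49 m

def axial_force(r, p):
    """trapezoidal evaluation of f = 2*pi * int p(r) r dr over the given span."""
    f = 0.0
    for i in range(len(r) - 1):
        f += 0.5 * (p[i] * r[i] + p[i + 1] * r[i + 1]) * (r[i + 1] - r[i])
    return 2.0 * math.pi * f

# illustrative placeholder profile under the impeller (0 <= r <= r1), in pa;
# r1 = R1 * T/2 = 0.34 * 0.245 m
r = [0.0, 0.02, 0.04, 0.06, 0.083]
p = [30.0, 27.0, 20.0, 10.0, 0.0]
f_ax1 = axial_force(r, p)
print(f"f_ax1 = {f_ax1:.3f} n, dimensionless {f_ax1 / (RHO * N**2 * D**4):.4f}")
```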
the mean time axial forces can be expressed in dimensionless form as follows:

$F_{\mathrm{ax}i} = \frac{\bar f_{\mathrm{ax}i}}{\rho n^2 d^4}, \quad i = 1, 2.$ (8)

tables 1 and 2 list the calculated mean time axial forces, both in dimensional and in dimensionless form, as well as the values of the boundary coordinate $R_1$ for both investigated impeller off-bottom clearances. it should be pointed out that the curves $\bar p_{\rm ax} = \bar p_{\rm ax}(r)$ were extrapolated by a cubic function up to the vessel wall. the values of the components $\bar f_{\rm ax1}$ and $\bar f_{\rm ax2}$ are derived from an impulse theorem: the vertical (axial) momentum component of the free liquid stream changes in both subregions, where a π/2-radian turn takes place [4],

$\bar f_{\rm ax1} = \frac{\rho\,Q_{\rm bt1}^2}{\pi r_1^2},$ (9)

$\bar f_{\rm ax2} = \frac{\rho\,Q_{\rm bt2}^2}{\pi\left(\frac{T^2}{4} - r_2^2\right)}.$ (10)

the volumetric flow rates along the bottom, $Q_{\rm bt1}$ and $Q_{\rm bt2}$, act on the areas limited by the radius $r_1$, by the radii $r_1$ and $r_2$, and by $T/2$ and $r_2$, respectively, and create the mean axial forces $\bar f_{\rm ax1}$ and $\bar f_{\rm ax2}$, respectively. both quantities can be expressed in dimensionless form:

$N_{Q\mathrm{bt}i} = \frac{Q_{\mathrm{bt}i}}{n d^3}, \quad i = 1, 2.$ (11)

fig. 8: radial profiles of variances of the local axial dynamic pressures on the vessel bottom (c = t/4)

table 1: axial forces affecting the flat bottom (four 45° pitched blade impeller, c = t/3); r1 = 0.340
n [rpm] | f̄ax1 [n] | f̄ax2 [n] | Fax1 [–] | Fax2 [–] | qbt1 [m³ s⁻¹] | qbt2 [m³ s⁻¹] | nQbt1 [–] | nQbt2 [–]
284 | 13.17 | 11.49 | 0.406 | 0.355 | 0.0329 | 0.0322 | 0.939 | 0.917
412 | 25.41 | 26.03 | 0.373 | 0.382 | 0.0485 | 0.0485 | 0.899 | 0.952
average | – | – | 0.390 | 0.369 | – | – | 0.919 | 0.935

table 2: axial forces affecting the flat bottom (four 45° pitched blade impeller, c = t/4); r1 = 0.320
n [rpm] | f̄ax1 [n] | f̄ax2 [n] | Fax1 [–] | Fax2 [–] | qbt1 [m³ s⁻¹] | qbt2 [m³ s⁻¹] | nQbt1 [–] | nQbt2 [–]
284 | 14.86 | 14.67 | 0.459 | 0.453 | 0.0350 | 0.0364 | 0.997 | 1.036
412 | 28.60 | 26.37 | 0.420 | 0.388 | 0.0486 | 0.0488 | 0.954 | 0.958
average | – | – | 0.440 | 0.403 | – | – | 0.976 | 0.997

tables 1 and 2 also show the values of the calculated volumetric flow rates along the bottom, in dimensional and in dimensionless form. figs. 5 and 6 clearly indicate that the value of the dimensionless (and also of the dimensional) standard deviation of the local dynamic pressure depends significantly on the absolute level of the measured pressure signal: the higher signal level observed at the lower impeller off-bottom clearance, $c/T = 1/4$, gives mutually similar radial profiles of $S_{p_{\rm ax}}$ at both examined impeller speeds, while at $c/T = 1/3$ the variations of the dimensionless standard deviation along the bottom radius seem to be quite irregular. the values of the coefficient of variation, $v_{p_{\rm ax}}$, of the local bottom dynamic pressure (see figs. 7 and 8) exhibit very similar trends at both off-bottom clearance values used in the experiments: at higher values of the mean time axial dynamic pressure $\bar p_{\rm ax}$, which originate from the more directional liquid flow within the region beneath the impeller and in the bottom region close to the vessel wall, the pressure fluctuations are insignificant in comparison with the fluctuations in the region close to the radius of the first turn of the liquid stream from the impeller rotor region (radius $r_1$ in figs. 3 and 4) and close to the very beginning of the second turn at the vessel wall (radius $r_2$ in figs. 3 and 4).
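the statistics behind eqs. (1)–(4) are straightforward to reproduce from a single transducer record. the sketch below uses a synthetic time series as a stand-in for the 500 hz measurements; all numerical values in it are illustrative assumptions, not recorded data:

```python
# sketch of the processing behind eqs. (1)-(4): from a transducer's total-
# pressure time series to the dimensionless mean dynamic pressure, its
# dimensionless standard deviation and the coefficient of variation.
import math, random

RHO, N, D = 998.0, 284.0 / 60.0, 0.196   # water, impeller speed (1/s), d = 2T/5

def dimensionless_stats(total_pressure, hydrostatic):
    dyn = [p - hydrostatic for p in total_pressure]        # dynamic pressure
    mean = sum(dyn) / len(dyn)
    var = sum((p - mean) ** 2 for p in dyn) / (len(dyn) - 1)
    s = math.sqrt(var)
    scale = RHO * N ** 2 * D ** 2
    return mean / scale, s / scale, s / mean               # eqs. (1), (3), (4)

random.seed(1)
# synthetic ~60 s record at 500 hz; hydrostatic level rho*g*H for H = 0.49 m
series = [4805.0 + random.gauss(15.0, 8.0) for _ in range(30000)]
p_star, s_star, v = dimensionless_stats(series, hydrostatic=4805.0)
print(f"P_ax = {p_star:.4f}, S_pax = {s_star:.4f}, v_pax = {v:.2f}")
```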
4 discussion

all experiments were conducted under a turbulent regime of flow of the agitated liquid. this fact confirms that all the results of the experiments presented here, expressed in dimensionless form, depend not on the impeller speed but on the impeller off-bottom clearance.

with respect to the statistical analysis of the experimental data, it seems that the accuracy of the experiments performed in the subregion between radius $r_1$ ($R_1$) and radius $r_2$ ($R_2$) of the flow along the bottom is worse than the accuracy in the two other subregions. this difference can be explained by the different flow mechanisms in these subregions and, thus, by the different mechanisms of origin of the measured axial dynamic pressures [4].

it follows from figs. 3 and 4 that the subregion between radius $r_1$ ($R_1$) and radius $r_2$ ($R_2$) of the flow along the bottom practically diminishes at the impeller level $c/T = 1/3$; the value of radius $R_1$ (see table 1) is then rather close to the value of $R_{1,\max}$ (see eq. 5). moreover, the values of the local axial dynamic pressure in both regions of the flow turn are significantly higher at the impeller off-bottom clearance $c/T = 1/4$ than when the impeller is located at a higher axial distance from the bottom.

tables 1 and 2 clearly illustrate the expected validity of the continuity equation of the flow around the bottom,

$Q_{\rm bt1} = Q_{\rm bt2},$ (12)

found earlier [3, 4] for the six 45° pitched blade impeller. for the off-bottom impeller clearance $c = T/3$ it moreover holds that

$\bar f_{\rm ax1} = \bar f_{\rm ax2},$ (13)

always at the given impeller speed. table 3 covers the same quantities as those observed in this study, but for the six 45° pitched blade impeller published earlier [3, 4].

table 3: axial forces affecting the flat bottom – d/t = 2/5 (six 45° pitched blade impeller; data from refs. [3, 4])
c/t [–] | Fax [–] | nQbt [–] | R1 [–] | nQp [–]
1/3 | 0.486 | 0.620 | 0.200 | 0.988
1/4 | 0.538 | 0.731 | 0.188 | 1.020

the general dependence of all investigated quantities ($F_{\rm ax}$, $N_{Q{\rm bt}}$, $R_1$) on the impeller off-bottom clearance corresponds fairly well to the results of this study. the only, rather significant, difference is probably caused by the relation between the flow rate along the bottom, $N_{Q{\rm bt}}$, and the flow rate number $N_{Qp}$, i.e. the dimensionless impeller pumping capacity. the value of this quantity for the arrangement of the agitated system investigated here is [7]

$N_{Qp} = 0.753$ (four 45° pitched blade impeller). (14)

while for the six 45° pitched blade impeller the quantity $N_{Q{\rm bt}}$ does not reach its pumping capacity $N_{Qp}$, for the four 45° pitched blade impeller it significantly exceeds this quantity. these facts are connected with the different values of the pumping area at the bottom (up to radius $r_1$), which is greater for the latter impeller than for the former. the direction of the flow pattern is therefore more axial for the four blade impeller than for the six blade impeller, and more induced flow is entrained from the surroundings of the impeller rotor region towards the bottom.

5 conclusions

under a turbulent regime of flow of an agitated liquid, the mean time values of the dimensionless dynamic pressures affecting the bottom depend not on the impeller speed but on the impeller off-bottom clearance. according to the experimentally confirmed model of the flow pattern of an agitated liquid along the flat bottom of a mixing vessel with a pitched blade impeller, three subregions can be considered in this region: the subregion where the liquid jet streaming downwards from the impeller rotor region deviates from its vertical (axial) direction to the horizontal direction, the subregion of the liquid flowing horizontally along the bottom and, finally, the subregion of the liquid changing direction from the bottom upwards (vertically) along the wall of the cylindrical vessel, the volumetric flow rates of the liquid taking place in the downward and upward flows being the same.

acknowledgments

the authors of this paper are grateful for financial support from the czech science foundation (grant no. 104/05/2500) and from the czech ministry of education (grant no. msm6046137306).

list of symbols

b width of baffle, m
c impeller off-bottom clearance, m
d impeller diameter, m
f̄ax axial force, n
Fax dimensionless axial force
h height of water level, m
n impeller speed, s⁻¹ (rpm)
nQp impeller flow rate number
nQbt dimensionless flow rate of the stream along the bottom
p̄ax axial dynamic pressure, pa
Pax dimensionless axial dynamic pressure
qp impeller pumping capacity, m³ s⁻¹
qbt flow rate of the stream along the bottom, m³ s⁻¹
r radial coordinate, m
R dimensionless radial coordinate
s_pax standard deviation of the axial dynamic pressure, pa
S_pax dimensionless standard deviation of the axial dynamic pressure
t diameter of mixing vessel, m
v coefficient of variation of the axial dynamic pressure
w width of impeller blade, m
α pitch angle, deg
ρ density of agitated liquid, kg m⁻³

subscripts and superscripts
1 related to the bottom subregion below the impeller rotor region
2 related to the bottom subregion at the vessel wall
an overbar denotes the mean time value

references

[1] hrubý, m., žaloudík, p.: axial component of impellers (in czech). chemical industry, vol. 15 (1965), p. 469–472.
[2] fořt, i., tomeš, l.: studies on mixing. xviii. the action of a stream from a propeller mixer on the bottom of a mixing vessel. collection of czechoslovak chemical communications, vol. 32 (1967), p. 3520–3529.
[3] fořt, i., eslamy, m., košina, m.: studies on mixing. xxv. axial force of axial rotary mixers. collection of czechoslovak chemical communications, vol. 34 (1969), p. 3673–3691.
[4] fořt, i.: flow and turbulence in vessels with axial impellers. chapter 14 of the book mixing, theory and practice, vol. iii (1986), new york: academic press, inc., p. 133–197.
[5] paglianti, a., montante, g., magelli, f.: novel experiments and mechanistic model for macroinstabilities in stirred tanks. aiche journal, vol. 52 (2006), p. 426–436.
[6] paglianti, a., liu, zh., montante, g., magelli, f.: effect of macroinstabilities in single and multiple stirred tanks. industrial and engineering chemistry research (in press).
[7] kresta, s., wood, p. e.: the mean flow field produced by a 45° pbt. changes of the circulation pattern due to off-bottom clearance. canadian journal of chemical engineering, vol. 71 (1993), p. 42–52.

doc. ing. ivan fořt, drsc.
phone: +420 224 352 713, fax: +420 224 310 292
e-mail: ivan.fort@fs.cvut.cz
department of process engineering, faculty of mechanical engineering, czech technical university in prague, technická 4, 166 07 praha 6, czech republic

prof. ing. pavel hasal, csc.
phone: +420 220 443 167, fax: +420 220 444 320
e-mail: pavel.hasal@vscht.cz
department of chemical and process engineering, prague institute of chemical technology, technická 5, 166 28 praha 6, czech republic

prof. ing. alessandro paglianti
phone: +39 051 209 0403, fax: +39 051 634 7788
e-mail: alessandro.paglianti@mail.ing.unibo.it

prof. ing.
franco magelli
phone: +39 051 209 0245, fax: +39 051 209 0247
e-mail: franco.magelli@mail.ing.unibo.it
department of chemical, mining and environmental engineering, university of bologna, via terracini 28, 40 131 bologna, italy
thermal conductivity of gypsum at high temperatures: a combined experimental and numerical approach
i. rahmanian, y. wang

the performance of gypsum board based systems in fire is highly influenced by the temperature-dependent thermal conductivity of gypsum boards, yet there is a wide difference in the thermal conductivity values used in literature. presented here is a hybrid method to determine the effective thermal conductivity of gypsum boards at high temperatures, based on using small-scale experimental results and a thermal conductivity model which includes the effects of radiation in voids.

keywords: gypsum board, thermal conductivity, high temperatures, radiation in voids, porous material, passive fire protection, fire resistance.

1 introduction

gypsum board based systems are now widely used in buildings, as walls or ceilings, to provide passive fire protection. the basis of the fire resistance of such systems lies in the low thermal conductivity and the evaporation of the water content of the gypsum board, which absorbs a considerable amount of heat, thereby delaying the temperature rise through the system. to accurately model the performance of such systems in fire conditions, their thermal properties should be known. the thermal properties of gypsum are temperature-dependent and, among them, thermal conductivity has a critical influence; however, there is a wide difference in the values reported in literature. given the effects of porosity, non-homogeneity and moisture in gypsum, direct experimental measurement of the thermal conductivity of gypsum at high temperatures is not an easy task. as an alternative, this paper proposes a hybrid numerical and experimental method to extract the thermal conductivity of gypsum. a one-dimensional finite difference heat conduction programme has been developed to predict the temperature development through the thickness of the gypsum board, based on an initial estimate of the thermal conductivity-temperature relationship as a function of porosity and radiation within the voids. this relationship is then calibrated by comparing numerical results with the experimental results from small-scale fire tests, so that the temperature history of the specimen calculated by the programme closely matches the temperatures recorded during the test. this method has been found to yield more consistent results than those reported in literature.

2 outline of the numerical analysis method

the transient heat transfer through a gypsum board is modelled using a one-dimensional finite difference formulation. a computer program has been developed and implemented in the familiar environment of microsoft excel using vba. the modelling procedure has been thoroughly validated [1] by comparisons with a number of analytical solutions and with simulation results using abaqus/standard. the following describes the basis of the modelling method.

2.1 one-dimensional finite difference formulation

fourier's law of conduction in one dimension with no heat generation is expressed as

$\frac{\partial}{\partial x}\left(k(T)\,\frac{\partial T(x,t)}{\partial x}\right) = \rho c\,\frac{\partial T(x,t)}{\partial t},$ (1)

where $T(x,t)$ is temperature (°c), $k(T)$ is thermal conductivity (w/m °c), ρ is material density (kg/m³), c is the specific heat of the material (j/kg °c), t is time (s) and x is the coordinate ($0 \le x \le L$, L being the thickness of the panel). choosing the explicit technique, the temperature of a volume cell (refer to figs. 1 and 2) at a time step is computed directly from the temperatures of the adjacent cells in the previous time step, which leads to a very simple scheme of computation [2]:

(i) for a typical node m within the material (fig. 1):

$T'_m = \frac{2F_0}{k_{m,m-1}+k_{m,m+1}}\left(k_{m,m-1}T_{m-1} + k_{m,m+1}T_{m+1}\right) + \left(1 - 2F_0\right)T_m,$ (2)

where $F_0$ is defined as

$F_0 = \frac{\left(k_{m,m-1}+k_{m,m+1}\right)\Delta t}{2\rho c\,(\Delta x)^2},$ (3)

$T'_m$ is the temperature of m in the subsequent time step, and $k_{i,j}$ is the thermal conductivity at the average temperature of cells i and j:

$k_{i,j} = k\!\left(\frac{T_i + T_j}{2}\right).$ (4)

numerical stability under the explicit scheme requires

$\Delta t \le \frac{\rho c\,(\Delta x)^2}{k_{m,m-1}+k_{m,m+1}}.$ (5)

(ii) for a boundary node subjected to convective and radiative boundary conditions (fig. 2):

$T'_1 = 2F_0\left(T_2 + \frac{h(t)\,\Delta x}{k_1}\,T_\infty\right) + \left(1 - 2F_0 - 2F_0\,\frac{h(t)\,\Delta x}{k_1}\right)T_1 + \frac{2\,\Phi e\sigma\,\Delta t}{\rho c\,\Delta x}\left[(T_\infty+273)^4 - (T_1+273)^4\right],$ (6)

where $F_0 = \frac{k_1\,\Delta t}{\rho c\,(\Delta x)^2}$, $h(t)$ is the convection heat transfer coefficient (w/m² °c), $T_\infty$ is the ambient temperature (°c), Φ is a geometric "view factor", e is the effective emissivity, and σ is the stefan–boltzmann constant (5.67×10⁻⁸ w/m² k⁴).
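before completing the scheme with the boundary stability limit (eq. 7 below), the update rules may be easier to follow as code. the sketch below is a minimal constant-property illustration of eqs. (2)–(6), not the authors' excel/vba programme: k, ρ and c are frozen at ambient values (so eq. 3 reduces to the usual fourier number), the exposed face is driven by a prescribed temperature, as done with the recorded test data, and the heating ramp is an arbitrary stand-in for the kiln curve:

```python
# minimal explicit 1-d conduction sketch, constant properties for brevity
SIGMA = 5.67e-8                      # stefan-boltzmann constant, w/m^2 k^4
L, NX = 0.0125, 26                   # 12.5 mm board, 26 nodes
DX = L / (NX - 1)
K, RHO, C = 0.24, 770.0, 950.0       # w/m c, kg/m^3, j/kg c (fireline values)
H, E, T_AMB = 10.0, 0.8, 20.0        # unexposed-side h, emissivity, ambient
F0 = 0.2                             # k*dt/(rho*c*dx^2); < 0.5 keeps eq. (5)
DT = F0 * RHO * C * DX * DX / K

def step(temp, t_exposed):
    new = temp[:]
    new[0] = t_exposed                               # exposed face: input data
    for m in range(1, NX - 1):                       # interior nodes, eq. (2)
        new[m] = temp[m] + F0 * (temp[m - 1] - 2.0 * temp[m] + temp[m + 1])
    m = NX - 1                                       # unexposed face, cf. eq. (6)
    new[m] = (2 * F0 * (temp[m - 1] + H * DX / K * T_AMB)
              + (1 - 2 * F0 - 2 * F0 * H * DX / K) * temp[m]
              + 2 * DT * E * SIGMA / (RHO * C * DX)
              * ((T_AMB + 273) ** 4 - (temp[m] + 273) ** 4))
    return new

temp = [T_AMB] * NX
for i in range(10000):                               # ~25 minutes of heating
    temp = step(temp, t_exposed=min(T_AMB + 0.5 * i * DT, 800.0))  # toy ramp
print(f"unexposed face after {10000 * DT:.0f} s: {temp[-1]:.1f} c")
```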
numerical stability limits the time step to

$\Delta t \le \frac{\rho c\,(\Delta x)^2}{2\left[k_1 + h\,\Delta x + \Phi e\sigma\,\Delta x\,\dfrac{(T_1+273)^4 - (T_\infty+273)^4}{T_1 - T_\infty}\right]}.$ (7)

2.2 initial and boundary conditions

the gypsum board is assumed to have a uniform initial temperature equal to the ambient temperature. on the unexposed boundary, the convective heat transfer coefficient h is assumed to be constant, and its value is taken as 10 w/m² °c. the surface of gypsum plasterboards is laminated by paper with an emissivity of 0.8–0.9, as reported in reference [3]; thus, the surface emissivity of the board is taken as 0.8 and the view factor equals unity. for the extraction of thermal conductivity based on fire test results, the recorded temperatures on the exposed surface are used as input data.

2.3 specific heat and density

the temperature-dependent specific heat of gypsum exhibits two peaks, corresponding to the two dehydration reactions of gypsum, as shown in fig. 3. these peaks represent the energy consumed to dissociate and evaporate water, and include the effect of water movement and the recondensation of water in cooler regions of the gypsum [4]. the base value of the specific heat is 950 j/kg °c, as reported by mehaffey et al. [5], and the additional specific heat at each dehydration reaction can be expressed by [4]

$\Delta\bar c = \frac{2.26\times10^6\left(e_{\rm d}\,f_1 + e_{\rm free}\right)f_2}{\Delta T}\ \mathrm{(j/kg\ °c)},$ (8)

where $\Delta\bar c$ is the average additional specific heat, $e_{\rm d}$ is the dehydration water content (percentage by total weight), $e_{\rm free}$ is the free water content (percentage by total weight), $\Delta T$ is the temperature interval, and $f_1$, $f_2$ are correction factors to account for the heat of the reactions and the effects of water movement. according to ang and wang [4], $f_1 = 1.28$ and 1.42 for the first and second dehydration reactions, respectively; for standard fire conditions $f_2 = 1.4$. due to the evaporation of water, the density of gypsum reduces with temperature increase. fig. 4 shows the density used in the modelling as a percentage of the original density of gypsum at ambient temperature.

fig. 1: finite difference discretization for node m within the material
fig. 2: finite difference discretization for a boundary node
fig. 3: specific heat of gypsum as used in the analysis
fig. 4: density of gypsum as used in the analysis (% of the original density)

2.4 thermal conductivity

since gypsum is a porous material, heat transfer through gypsum is a combination of all three modes: conduction through the solid, and convection and radiation through the pores. therefore the effective thermal conductivity of gypsum should include these effects. this effective thermal conductivity can be affected by many factors, such as the temperature, density, moisture content and porosity of the material. such sensitivity contributes to the diverse data reported in literature, as demonstrated in fig. 5.
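before moving on to the conductivity model, eq. (8) of section 2.3 can be illustrated numerically. note that the grouping of $f_1$ and $f_2$ in eq. (8) is reconstructed from a garbled source, and the water contents and temperature interval used below are illustrative assumptions, not values given in the text:

```python
# sketch of eq. (8): average additional specific heat over a dehydration
# interval, added on top of the 950 j/kg c base value. e_d, e_free and dT
# are illustrative assumptions; f1, f2 follow ang and wang [4].
BASE_C = 950.0  # j/kg c

def added_specific_heat(e_d, e_free, f1, f2, dT):
    """eq. (8) as reconstructed; water contents taken as mass fractions."""
    return 2.26e6 * (e_d * f1 + e_free) * f2 / dT

# first dehydration reaction: f1 = 1.28; standard fire conditions: f2 = 1.4
dc = added_specific_heat(e_d=0.15, e_free=0.03, f1=1.28, f2=1.4, dT=40.0)
print(f"peak specific heat ~ {(BASE_C + dc) / 1e3:.1f} kj/kg c")  # order of fig. 3
```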
assuming gypsum is made of a solid substrate and uniformly distributed spherical pores, the effective thermal conductivity of gypsum may be calculated using the following equation [6]:

$k^* = k_{\rm s}\,\frac{k_{\rm g}\,\phi^{2/3} + k_{\rm s}\left(1-\phi^{2/3}\right)}{k_{\rm g}\left(\phi^{2/3}-\phi\right) + k_{\rm s}\left(1-\phi^{2/3}+\phi\right)},$ (9)

where $k^*$ is the effective thermal conductivity of gypsum, $k_{\rm g}$ is the effective thermal conductivity of the gas, accounting for heat transfer in the pores, $k_{\rm s}$ is the thermal conductivity of the solid, and φ is the porosity of the material (the ratio of the volume of voids to the overall volume). in this study, the thermal conductivity of solid dried gypsum ($k_{\rm s}$) is 0.31 w/m °c and the porosity of gypsum is 67 %–72 %. since the size of the pores is very small (never larger than 5 mm), natural convection in the pores can be neglected. therefore the effective thermal conductivity of the gas is [6]

$k_{\rm g} = 4.815\times10^{-4}\,T^{0.717} + 4\,d_{\rm e}\,\sigma T^3,$ (10)

where T is the absolute temperature and $d_{\rm e}$ is the effective diameter of the pores; in this study $d_{\rm e} = 0.15$ mm. hence, the effective thermal conductivity-temperature relationship consists of three parts, as demonstrated in fig. 6:

1) constant thermal conductivity up to 95 °c, before water evaporation, equal to that at ambient temperature, as reported by the manufacturer;
2) linear reduction of the conductivity to 0.1 w/m °c at 155 °c;
3) non-linear increase in thermal conductivity, based on equations 9 and 10.

fig. 5: thermal conductivity of gypsum, as reported by various researchers [7]
fig. 6: effective thermal conductivity of gypsum as used in this study

3 small-scale high temperature tests

a limited number of small-scale experiments have been performed. the specimens tested were gypsum board panels of two different types: 12.5 mm gyproc fireline plasterboard and 9.5 mm gyproc wallboard plasterboard, both british gypsum products. a total number of 8 specimens were tested, as specified in table 1. all specimens had approximate dimensions of mm. each specimen was placed horizontally on top of an electric kiln, as the source of heat, so that one side of the panel was subjected to the kiln temperature and the other side faced up to the room temperature (19–25 °c). an opening of 280×265 mm in the top lid of the kiln allowed the lower side of the panel to be exposed to the elevated kiln temperatures. a 30 mm layer of glass wool (with an opening of the same size as that in the kiln lid) was laid underneath the specimen to insulate the contact surface between the top lid and the plasterboard. fig. 7 shows the typical set-up of the experiments. fig. 8 shows the heating curve achieved in the kiln, which is compared to a standard cellulosic fire (bs476) [8]. temperatures were measured on the unexposed side, the midpoint (for double layered panels) and the exposed side of the gypsum panel using type k thermocouples.

fig. 7: typical set-up for the small-scale fire tests
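as a numerical check of the conductivity model before turning to the results, the sketch below evaluates eqs. (9) and (10) with the parameter values quoted in section 2.4 ($k_{\rm s} = 0.31$ w/m °c, $d_{\rm e} = 0.15$ mm) and a porosity of 0.70 taken from the quoted 67–72 % range. the exact forms of eqs. (9)–(10) above are themselves reconstructed from a garbled source (a russell-type solid/pore combination plus an optically thin radiation term), so treat this as an assumption-laden illustration, not the authors' implementation:

```python
# evaluation of the reconstructed effective-conductivity model of section 2.4
SIGMA = 5.67e-8
K_S, D_E, PHI = 0.31, 0.15e-3, 0.70

def k_gas(t_c):
    """effective pore conductivity: air conduction + radiation across voids."""
    t = t_c + 273.0
    return 4.815e-4 * t ** 0.717 + 4.0 * D_E * SIGMA * t ** 3

def k_eff(t_c):
    """russell-type solid/pore combination, eq. (9) as reconstructed."""
    kg, p23 = k_gas(t_c), PHI ** (2.0 / 3.0)
    return K_S * (kg * p23 + K_S * (1 - p23)) / (
        kg * (p23 - PHI) + K_S * (1 - p23 + PHI))

for t in (200, 400, 600, 800, 1000):
    print(f"{t:5d} c: k* = {k_eff(t):.3f} w/m c")   # rises with t, as in fig. 6
```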
table 1: specifications of gypsum board specimens
test no | plasterboard type | layers | total thickness (mm) | density (kg/m³) | free water (% by weight) | initial thermal conductivity (w/m °c)
1 | gyproc fireline | single | 12.5 | 770 | 3.5 | 0.24
2 | gyproc fireline | single | 12.5 | 770 | 3.5 | 0.24
3 | gyproc fireline | double | 25 | 770 | 3.5 | 0.24
4 | gyproc fireline | double | 25 | 770 | 3.5 | 0.24
5 | gyproc wallboard | single | 9.5 | 641 | 3.5 | 0.19
6 | gyproc wallboard | single | 9.5 | 641 | 3.5 | 0.19
7 | gyproc wallboard | double | 19 | 641 | 3.5 | 0.19
8 | gyproc wallboard | double | 19 | 641 | 3.5 | 0.19

fig. 8: time-temperature curve for the kiln against the standard cellulosic fire curve
fig. 9: temperature history for the 12.5 mm fireline gypsum panel
fig. 10: temperature history for the 25 mm fireline gypsum panel
fig. 11: temperature history for the 9.5 mm wallboard gypsum panel
[4] ang, c. n., wang, y. c.: the effect of water movement on specific heat of gypsum plasterboard in heat transfer analysis under natural fire exposure. construction and building materials, vol. 18 (2004), p. 505–515. [5] mehaffey, j. r., cuerrier, p., carisse, g. a.: a model for predicting heat transfer through gypsum-board/wood-stud walls exposed to fire. fire and materials, vol. 18(1994), p. 297–305. [6] yuan, j.: fire protection performance of intumescent coating under realistic fire conditions. phd thesis, school of mechanical, aerospace and civil engineering, university of manchester, uk, 2009. [7] thomas, g.: thermal properties of gypsum plasterboard at high temperatures. fire and materials, vol. 26 (2002), p. 37–45. [8] bs476, fire tests on building materials and structures, part 20: method for determination of the fire resistance of elements of construction (general principles), british standards institution, 1987. ima rahmanian email: ima.rahmanian@postgrad.manchester.ac.uk yong wang university of manchester school of mechanical, aerospace and civil engineering, manchester, po box 88 manchester m60 1qd, united kingdom 20 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 49 no. 1/2009 0 100 200 300 400 500 600 700 800 900 1000 0 10 20 30 40 50 60 70 80 90 100 time (min) t e m p e ra tu re s (° c ) experiment analysis using the proposed thermal conductivity analysis using thermal conductivity as used by mehaffey at al midpoint unexposed fig. 12: temperature history for 19 mm wallboard gypsum panel acta polytechnica acta polytechnica 53(2):193–196, 2013 © czech technical university in prague, 2013 available online at http://ctn.cvut.cz/ap/ high-speed monitoring of dust particles in iter elms simulation experiments with qspa kh-50 vadym a. makhlaja,∗, igor e. garkushaa, nikolay n. aksenova, alexander a. chuviloa, igor s. landmanb a institute of plasma physics of the nsc kipt, 61108 kharkov, ukraine b karlsruhe institute of technology (kit), ihm, 76344 karlsruhe, germany ∗ corresponding author: makhlay@kipt.kharkov.ua abstract. dust generation under powerful plasma stream impacts has been studied in iter elm simulation experiments with qspa kh-50 plasma accelerator. repetitive plasma exposures of tungsten have been performed by 0.25 ms plasma pulses and the heat load varied in the range (0.1 ÷ 1.1) mj m−2. main characteristics of dust particles such as a number of ejected particles, their velocity, angular distribution and start time from the surface are investigated. dust particles have not been observed under heat load below the cracking threshold. quantity of dust particles rises with increasing heat load. average velocities of dust particles are found to be strongly dependent on their start time from the surface after beginning of plasma-surface interaction. maximal velocity achieved a few tens of meters per second. keywords: iter, elms simulation experiments, qspa, plasma-surface interaction, tungsten dust. 1. introduction the anticipated regime of the tokamak iter is the elmy h-mode. the edge localized modes (elms) of plasma instabilities, intrinsic for h-mode, produce short periodic pulses of heat flux at the divertor armor. tungsten is the most suitable material for the iter divertor. it should withstand both the stationary and transient heat fluxes demonstrating tolerably erosion rate. 
the drawbacks of tungsten as plasma-facing material are w brittleness and damage effects related with brittle destruction as well as the melt layer erosion [11]. tungsten cracking leads to generation of the dust, which can contaminate and radiatively cool the core plasma [7, 4]. the dust particles produced during tungsten cracking and the droplets ejected from the melt layer under the action of giant elms and disruptions are critical issues for iter performance that require comprehensive experimental studies. there are three main approaches to experimental dust study in fusion plasmas: (i) collection of dust particles resulted from transient events, (ii) laser scattering by the dust grain sand and (iii) dust monitoring with fast cameras [6]. observations with fast cameras can track the trajectories of the grains in the chamber during the discharge (with trajectory reconstruction), giving the magnitude of dust speed. it can also provide information on grain–wall collisions, some peculiar features of dust dynamics in fusion reactor, the amount of visible dust grains in vacuum chamber during operation in different discharges and regimes [6]. the iter energy fluxes to the divertor are not achievable in existing tokamaks. for this reason simulation experiments of iter transients have been carried out with other plasma devices [7, 4, 5]. in particular, the quasi-stationary plasma accelerators (qspa) can reproduce energy densities (0.2 ÷ 2 mj m−2) and pulse duration (0.1 ÷ 0.5 ms) of iter elms [7, 4]. therefore, qspa can be applied for investigation of material response to the expected heat loads [4, 5, 3]. this paper presents the results of qspa kh-50 experiments on high-power interactions with material surfaces and behavior of tungsten dust under pulsed energy loads typical for iter type i elms. 2. experimental setup the quasi-steady-state plasma accelerator qspa kh-50 is the largest and most powerful device of this kind [4, 12]. qspa kh-50 consists of two stages. the first one is used for plasma production and preacceleration. the main second stage is a coaxial system of shaped active electrodes with magnetically screened elements, supplied from independent power sources. the hydrogen plasma streams, generated by qspa kh-50, are injected into the magnetic system of 1.6 m in length and 0.44 m in inner diameter with a magnetic field of up to 0.54 t in diagnostic chamber (2.3 m from acceleretor) where the target has been installed [3, 12]. plasma parameters were varied both by changing dynamics and the amount of gas filled in the accelerator channel and by the variation of the working voltage of capacitor battery of the accelerating channel. the main parameters of qspa plasma streams were as follows: ion impact energy about 0.4 ÷ 0.6 kev, the maximum plasma pressure 3.2 bar, and the plasma stream diameter about 18 cm. the surface energy loads measured with a thermocouple calorimeters were varied in the range 193 http://ctn.cvut.cz/ap/ vadym a. makhlaj et al. acta polytechnica figure 1. high speed imaging of plasma interaction with tungsten target: t = 1.2 ms after the start of plasma-surface interaction, texp = 1.2 ms; a) qsurf = 0.45 mj m−2, b) qsurf = 0.6 mj m−2, c) qsurf = 0.75 mj m−2. of (0.1 ÷ 1.1) mjm−2 [3]. the plasma pulse shape was approximately triangular, and the pulse duration was about 0.25 ms. in previous experiments it was demonstrated that tungsten cracking and melting thresholds under qspa kh-50 exposures corresponded to 0.3 mj m−2 and 0.6 mj m−2 respectively. 
the evaporation onset is estimated as 1.1 mj m−2 [4, 3]. the targets made of pure tungsten of sizes 5 × 5 × 0.5 cm3 and 12 × 8 × 0.1 cm3 have been used for these experiments. observations of plasma interactions with exposed surfaces, the dust particle dynamics and the droplets monitoring have been performed with a high-speed 10 bit cmos pco.1200 s digital camera pco ag (exposure time from 1 µs to 1 s, spectral range from 290 to 1100 nm). in general, the applied measurement scheme for droplets monitoring was similar to one used in [4, 5, 3]. as an example, fig. 1 shows camera frames registered with the same exposure for different plasma heat loads. camera frames corresponding to different time moments during one plasma pulse are presented in fig. 2. dynamics of particles ejected from exposed surfaces has been analyzed from the series of camera frames. this is the main difference of applied scheme from the experiments described in [5]. velocities of ejected tungsten dust particles/droplets have been evaluated from the lengths of their traces done coluring selected frame. the moment when the particle was released from the target surface could be calculated also. temporal distributions of quantity and velocity of erosion products were obtained for different heat loads to the exposed tungsten surfaces. the observed region in front of target was 8 cm due to feature of a design the diagnostic vacuum chamber of qspa kh-50. therefore, the particles with velocity higher than 35 m/s were able to fly away of observed region during of 2.2 ms. taking into account figure 2. high speed images of plasma interaction with inclined (35◦ to plasma jet) tungsten target, a) t = 1.16 ms and b) t = 9.06 ms after the start of plasma-surface interaction, qsurf = 0.75 mj m−2, texp = 0.25 ms. these circumstance, the special pulsed system was used for synchronization of the camera registration start and the beginning the plasma-surface interaction. this allows improvement of temporal resolution for applied registration system in comparison with that described in [5]. the exposure was not exceed 1.5 ms. particles were also registered after plasma impact for more than 9 ms (fig. 2b). the total recording time achieved 50 ms. 3. experimental results the irradiation of tungsten surface with qspa kh-50 plasma heat load below the cracking threshold (0.3 mj m−2) does not trigger the generation of erosion products. at the heat load above the cracking threshold but below melting threshold only several dust particles traces have been registered (fig. 1a). 194 vol. 53 no. 2/2013 high-speed monitoring of dust particles elastic energy stored in stressed tungsten surface layer should be the motive force for the cracking process with following acceleration of separated solid particles in this case [10]. further increase of heat load leads to the surface melting and results in splashing of eroded material (fig. 1b). number of ejected particles rises with increasing heat load due to growing thickness of melted layer. quantity of particles ejected from irradiated surface increases more than twice as a consequence of heat load elevation from 0.6 mj m−2 to 0.75 mj m−2 only. the majority of w particles are ejected from the exposed surface 0.2 ms after beginning of plasma– surface interaction (fig. 3). velocity of registered tungsten particles achieves 25 m/s for earlier instants. for the later moments, velocity decreases to several m/s [3, 9]. maximal velocity only weakly depends on the heat load. 
the maximum of particles with high velocities (i.e. ejected from surface before t = 0.4 ms) is clearly detected on camera frame at 1.2 ms after beginning of plasma-surface interaction (fig. 3). for the later moments of observation [9], they leave a zone of observation. therefore, the number of such particles decreases. the quantity of particles with velocity of 10 m/s changes weakly during 5 ms. as follows from [3], under perpendicular plasma impacts droplets are ejected primarily with small angles to the normal (i.e. towards impact plasma, see fig. 1c). nevertheless, rather large angles of ejection, up to 80◦ have been observed also. analysis of droplet traces from consecutive images shows the influence of gravitational force on droplets with larger mass and smaller velocity values. due to gravitation the resulting angular distribution of droplets became non-symmetric. the high speed imaging of qspa kh-50 plasma interaction with the inclined tungsten target is presented in fig. 2. for this case, the angular distribution of ejected droplets is also non-symmetric (fig. 4). the maximal number of particles has been registered at angle (35 ÷ 40)◦ to the normal. such particles flying towards the plasma impact, i.e., in upstream direction. however, quite large number of droplets turns to the downstream direction due to the influence of gravitational force and plasma stream pressure. it should be mentioned that the impacting energy density to the exposed surface depends on the plasma incidence angle to the target [8, 2]. a non-uniform distribution of the heat load along the target surface (for instance, due to formation of non-uniform shielding layer) can also cause a non-symmetric angular distribution of ejected droplets. for heat load exceeding the melting threshold, the flying particles can be originated from melt surface due to kelvin–helmholtz or rayleigh–taylor instabilities [1]. analysis of obtained experimental results and comparison with the results of numerical simulations [10, 9] allows conclusion that the generation of tungsten particles in the form of droplets may occur figure 3. amount of dust particles versus delay from start for different heat loads to target surface; a) qsurf = 0.6 mj m−2, b) qsurf = 0.75 mj m−2. only during the plasma pulse and (as latest) few tens microseconds after the pulse end. other particles may be exclusively solid dust that is generated due to elastic energy stored in stressed re-solidified tungsten surface layer. it is interesting to compare the results of qspa kh-50 plasma exposures with performed experiments on erosion product monitoring in qspa-t facility [3, 1]. similar velocity of erosion products and the energy threshold of particles appearing have been observed. however, in our experiments with inclined irradiation the droplets are primarily ejected at small angles to the plasma jet. the reason for somewhat different results obtained in mentioned devices can be much larger plasma pressure and output electric currents in qspa-t. 4. summary the results of erosion products monitoring from tungsten targets exposed to iter elm-like surface heat load at qspa kh-50 have been discussed. plasma energy load to the target surfaces achieved 0.75 mj m−2 and caused a pronounced surface melting. the erosion products for tungsten targets have been registered with a high-speed digital camera both under normal and inclined plasma irradiation. 
distributions of the particles as a function of their start time from the target surface were obtained for different heat loads. the number of particles grows significantly with increasing heat load. erosion products were observed with the high-speed camera only under heat loads exceeding the cracking threshold. the maximal velocity of the tungsten particles may reach a few tens of m/s.
figure 4. the angular distribution of ejected droplets for inclined (35° to the plasma jet) plasma irradiation.
the main erosion mechanisms are found to be droplet ejection from the melt and solid dust origination during the brittle destruction of the exposed tungsten surfaces due to thermal stress. for energy loads below 0.75 mj m−2, the start time of the particles ejected from the surface depends only weakly on the heat load value. the measurement scheme applied at qspa kh-50 can be used effectively for investigating dust dynamics in the near-surface plasma in simulation studies of plasma interaction with iter divertor surfaces. references [1] b. bazylev, et al. experimental and theoretical investigation of droplet emission from tungsten melt layer. fusion engineering and design 84:441, 2009. [2] a. a. chuvilo, et al. calorimetric studies of the energy deposition on a material surface by plasma jets generated with qspa and mpc devices. nukleonika 57:49, 2012. [3] i. e. garkusha, et al. experimental study of plasma energy transfer and material erosion under elm-like heat loads. journal of nuclear materials 390/391:814, 2009. [4] i. e. garkusha, et al. latest results from elm-simulation experiments in plasma accelerators. phys scr t135:014054, 2009. [5] n. klimov, et al. experimental study of pfcs erosion and eroded material deposition under iter-like transient loads at the plasma gun facility qspa-t. journal of nuclear materials 415:559, 2011. [6] s. i. krasheninnikov, r. d. smirnov, d. l. rudakov. dust in magnetic fusion devices. plasma phys control fusion 53:083001, 2011. [7] i. landman, et al. material surface damage under high pulse loads typical for elm bursts and disruptions in iter. phys scr 111:206, 2004. [8] v. a. makhlaj, et al. simulation of iter edge localized modes' impacts on the divertor surfaces within plasma accelerators. phys scr t145:014061, 2011. [9] s. pestchanyi, et al. estimation of the dust production rate from the tungsten armour after repetitive elm-like heat loads. phys scr t145:014062, 2011. [10] s. pestchanyi, et al. simulation of residual thermostress in tungsten after repetitive elm-like heat loads. fusion engineering and design 86:1681, 2011. [11] m. rieth, et al. review on the efda programme on tungsten materials technology and science. journal of nuclear materials 417(1-3):463, 2011. [12] v. i. tereshin, et al. powerful quasi-steady-state plasma accelerator for fusion experiments. brazil j phys 32:165, 2002. acta polytechnica 53(supplement):512–517, 2013. opera highlights. thomas strauss, for the opera collaboration. albert einstein center for fundamental physics, laboratory for high energy physics, university of bern, sidlerstrasse 5, ch-3012 bern. corresponding author: thomas.strauss@lhep.unibe.ch. abstract.
the opera experiment is a long baseline neutrino oscillation experiment aimed at observing the νµ → ντ neutrino oscillation in the cern neutrino to gran sasso beamline in the appearance mode by detecting the τ-decay. here i will summarize the results from the run years 2008–10, with an update on observed rare decay topologies and the results of the neutrino velocity measurements. keywords: neutrino oscillation, neutrino velocity, tau appearance, nuclear emulsions, charm decay, electron neutrino, emulsion cloud chamber, lngs, cern, opera.
figure 1. picture of the opera detector, with a view of a reconstructed neutrino interaction occurring in the 2nd super module.
1. introduction the opera experiment [1], located at the gran sasso laboratory (lngs), aims at observing the νµ → ντ neutrino oscillation in the direct appearance mode in the cern neutrino to gran sasso (cngs) [2, 3] beam by detecting the decay of the τ produced in charged current (cc) interactions. a detailed description of the detector can be found in [1, 4–9]. the opera detector consists of two identical super modules (sm), each of them consisting of a target area and a muon spectrometer, as shown in fig. 1. the target area consists of alternating layers of scintillator strip planes and target walls. the muon spectrometer is used to reconstruct and identify muons from νµ–cc interactions and to estimate their momentum and charge. the target walls are trays in which target units of 10 × 12.5 cm2 cross-section and a depth of 10 x0 in lead (7.9 cm), with a mass of around 10 kg each, are stored; they are also referred to as bricks. a brick is formed by alternating layers of lead plates and emulsion films (2 emulsion layers separated by a plastic base), building an emulsion cloud chamber (ecc). this provides high granularity and high mass, which is ideal for ντ interaction detection.
figure 2. the cngs neutrino beamline. figure from [13].
figure 3 left shows an image of the unwrapped ecc brick. on the right, the arrangement of the scintillator strip planes (target tracker, tt) and the ecc is shown. note an extra pair of emulsion films in a removable box called a changeable sheet (cs), shown in blue. the total mass of each target area is about 625 tons, leading to a target mass of 1.25 ktons for 145 000 bricks. 2. daq and analysis the information from the reconstructed event, recorded by the electronic detectors, is used to predict the most probable ecc for the neutrino interaction vertex [10]. a display of a reconstructed event is shown in fig. 1. figure 4 shows the procedure for localizing the event vertex in the ecc, by extrapolating the reconstructed tracks from the electronic detector to the cs emulsion films. the signal recorded in the cs films will confirm the prediction from the electronic detector, or will act as a veto and trigger a search in neighboring bricks to find the correct ecc in which the neutrino interaction is contained. after a positive cs result, the ecc is unpacked and the emulsion films are developed and sent to one of the various scanning stations in japan and europe. dedicated automatic scanning systems allow us to follow the tracks from the cs prediction up to their stopping point.
figure 3. τ detection principle in opera.
figure 4. event detection principle in the opera experiment. the candidate interaction brick is determined from the prediction of the electronic detector (blue);
the changeable sheet (cs) is used to confirm the prediction. from the cs result, the tracks are followed up to the interaction point inside the ecc.
around these stopping points, a volume of 1 cm2 times 15 emulsion films is scanned to find the interaction vertex. as illustrated in fig. 4, only track segments in the active emulsion volume are visible, and a reconstruction of the event is needed to find tracks and vertices. a dedicated procedure, called a "decay search", is used to search for possibly interesting topologies, like the τ-decay pictured in fig. 4. the accuracy of the track reconstruction goes from cm in the electronic detector, down to mm for the cs analysis, and to micrometric precision in the final vertex reconstruction (after aligning the ecc emulsion plates with passing-through cosmic-ray tracks). 2.1. tau detection τ detection is only possible due to the micrometric resolution of the emulsion films, as it allows us to separate the primary neutrino interaction vertex from the decay vertex of the τ particle. the most prominent background for τ-decay is either hadron scattering or charged charm decays. the background from hadron scattering can be controlled by cuts applied to the event kinematics. the background due to charm can be reduced by identifying the muon at the primary vertex, as charm will occur primarily in νµ–cc interactions (for further details see [8, 10]). after topological and kinematical cuts are applied, the number of background events in the nominal event sample is anticipated to be 0.7 at the end of the experiment. in 2010 the first ντ candidate was reconstructed inside the opera emulsion. it was recorded in an event classified as a neutral current, as no muon was identified in the electronic detector. to crosscheck the τ hypothesis, all tracks were followed downstream of the vertex until their stopping or their re-interaction point. they were all attributed to hadrons, and no soft muon (e < 2 gev) was found. in 2010, the expected total number of τ candidates was 0.9, while the expected background was less than 0.1 events. more details on the analysis are presented in [8]. fig. 5 shows a picture of this event. track number 4, labeled as the parent, is the τ, decaying into one charged daughter track. the two showers are most likely connected to the decay vertex of the τ, rather than to activity from the primary interaction. thus the decay is compatible with τ− → ρ−(→ π− + π0)ντ. figure 6 shows the cuts used for the selection criteria defined at the time of the proposal and the kinematic variables of the τ-decay observed by the opera experiment. at the time of this conference, the number of expected τ candidates was 1.7, with 0.5 events expected in the single-prong channel. the expected background for the analyzed event sample, corresponding to 4.9 × 10^19 protons on target (pot), was 0.16 ± 0.05 events. at the time of writing these proceedings, a second τ candidate appeared [11]. 2.2. physics run performance and data analysis status since 2007, the opera experiment has collected a total of 18.5 × 10^19 pot. this corresponds to about 15 000 interactions in the target areas of the experiment.
figure 5. display of the 2010 τ− candidate event. top left: view transverse to the neutrino direction. top right: same view zoomed on the vertices. bottom: longitudinal view. figure from [8].
figure 6.
mc distributions of: a) the kink angle for τ-decays, b) the path length of the τ, c) the momentum of the decay daughter, d) the total transverse momentum pt of the detected daughter particles of τ-decays with respect to the parent track. the red band shows the 68.3 % range of values allowed for the candidate event, and the dark red line indicates the most probable value. the dark shaded area represents the excluded region corresponding to the a priori τ selection cuts. figure from [8].
figure 7 shows, from top to bottom and as a function of time, the integrated number of events occurring in the target (showing the cngs shutdown periods): with their vertex reconstructed by the electronic detectors; for which at least one brick has been extracted; for which at least one cs has been analysed; for which this analysis has been positive (track stubs corresponding to the event have been found); for which the brick has been analysed; for which the vertex has been located; and for which the decay search has been completed. the efficiency of the analysis of the most probable cs is rather low, about 65 %, and is significantly lower for nc events, among which ντ interactions are the most likely to be found. to recover this loss, multiple brick extraction is performed; this brings the final efficiency of observing tracks of the event in the cs to about 74 %.
figure 7. events recorded and analyzed in the opera experiment from 2008 until 2012.
after an ecc has been identified as the most likely interaction brick, the ecc is developed and sent to one of the scanning laboratories, where it is scanned within a short time. the efficiency for locating an event within the ecc is about 70 % with respect to the number of positive cs results. after the event has been located, a dedicated decay search is performed to obtain a data sample which can be compared to mc and which provides uniform data quality from all laboratories. this decay search includes the search for decay daughters and a reconstruction of the kinematics of the particles at the vertex. at the time of this conference, 4611 events have been localized in the ecc, with a total of 4126 cc and nc events having completed the decay search. 2.3. charm decay topologies in neutrino interactions in about 5 % of the νµ–cc interactions, the production of a charmed particle at the primary vertex takes place. charmed particles have lifetimes similar to the τ and similar decay channels. thus charm events provide a subsample of decay topologies similar to τ-decay, for which the detection efficiency can be estimated based on mc simulation. a study of a high purity selection of charm events in 2008 and 2009 shows agreement with the data [12]. one-prong charm decay candidates are retained if the charged daughter particle has a momentum larger than 1 gev/c. this leads to an efficiency of εshort = 0.31 ± 0.02 (stat.) ± 0.03 (syst.) for short and εlong = 0.61 ± 0.05 (stat.) ± 0.06 (syst.) for long charm decays, wherein 'long' means that the production and the decay vertices are not located in the same lead plate.
figure 8. display of one of the charm events. top: view of the reconstructed event in the emulsion. bottom: zoom into the vertex region of the primary and secondary vertex.
the number of events for which the charm search is complete is 2167 cc interactions.
in these we expect 51 ± 7.5 charm candidates, with a background of 5.3 ± 2.3 events. the number of observed candidates is 49, which is in agreement with expectations. figure 8 shows a charm decay detected in the opera experiment. in the electronic detector reconstruction, two muons were observed, one charged positively, the other negatively. the µ− is attached to the primary vertex, while the µ+ is connected to the decay vertex. this topology corresponds to a charged charm particle decaying into a muon; the measured kinematic parameters are a flight length of 1330 µm and a kink angle of 209 mrad. the impact parameter (ip) of the µ+ with respect to the primary vertex is 262 µm, and its momentum is measured as 2.2 gev/c. this accounts for a transverse momentum (pt) of 0.46 gev/c (2.2 gev/c × sin 209 mrad ≈ 0.46 gev/c). 3. ν-velocity measurement due to the time structure of the cern sps beam, the opera experiment is able to trigger on the proton spill hitting the cngs target. as a result, the electronic detector provides a time signal of the recorded events, which can be used to measure the neutrino velocity in the cngs beam. one needs to measure, with a precision of a few ns, the flight time (time of flight – tof) between cern and lngs, and the distance between the reference points at the detector and at cngs.
figure 9. scheme of the time of flight measurement. figure from [13].
the concept of the neutrino time of flight measurement is illustrated in fig. 9. the procedures are explained in great detail in [13]. since the time of the conference, an instrumental mistake has been identified that makes the results presented at this conference obsolete. updated results taken from [14] are presented below. figure 10 shows the timing systems both at cern and lngs, which allowed time calibration between the two sites with an accuracy of ±4 ns. the distance between cern and the opera detector was measured via gps geodesy and extrapolation down to the location of both the cngs target and the opera detector with terrestrial traverse methods. the effective baseline is measured as 731 278.0 ± 0.2 m. the proton waveform for each sps extraction was measured with a beam current transformer (bct). the sum of the waveforms, restricted to those associated with a neutrino interaction in opera, was used as the pdf for the time distribution of the events within the extraction. the maximum likelihood method was used to extract the time shift between the two distributions, i.e. the neutrino time of flight. internal nc and cc interactions in the opera target and external cc interactions occurring in the upstream rock from the 2009, 2010 and 2011 cngs runs were used for this analysis. as shown in fig. 11, it is measured to be δt = tof_c − tof_ν = (6.5 ± 7.4 (stat.) +8.3/−8.0 (syst.)) ns. modifying the analysis by using each neutrino interaction waveform as the pdf instead of their sum gives a comparable result of δt = (3.5 ± 5.6 (stat.) +9.4/−9.1 (syst.)) ns. no energy dependence was observed.
figure 10. top: scheme of the timing system at cern. bottom: scheme of the timing system at lngs. figures from [13].
to cross-check for systematic effects, a dedicated bunched beam run was performed in autumn 2011, with the sps proton delivery split into 3 ns long spills separated by 524 ns in time, and a similar mode with 3 ns spills and 100 ns separation was used in spring 2012. the value of δt obtained in 2011 by using timing information provided by the target tracker is 1.9 ± 3.7 ns; it is 0.8 ± 3.5 ns when based on the spectrometer data [13].
for the 2012 run, the corresponding value is δt = (−1.6 ± 1.1 (stat.) +6.1/−3.7 (syst.)) ns [14]. all results are in agreement with the measurement from standard cngs beam operation. 4. conclusions the opera experiment detected two τ neutrino events appearing in the cngs beam. further, we measured the neutrino velocity to be in agreement with the speed of light in vacuum to o(10^−6); a shift of δt ≈ 7 ns over the 731 km baseline corresponds to |v − c|/c ≈ c·δt/d ≈ 3 × 10^−6. other short decay topologies like νe or charm decays can also be detected and are in agreement with mc expectations, thus providing a benchmark for validating the τ efficiency expectations. acknowledgements firstly, i thank the organizers of the workshop and the opera ptb for the possibility to join this workshop.
figure 11. top: comparison of the measured neutrino interaction time distributions (data points) and the proton pdf (red and blue line) for the two sps extractions resulting from the maximum likelihood analysis. bottom: blow-up of the leading edge (left plot) and the trailing edge (right plot) of the measured neutrino interaction time distributions (data points) and the proton pdf (red line) for the first sps extraction after correcting for δt = 6.5 ns. within errors, the second extraction is equal to the first one. figures from [13].
the opera collaboration thanks cern, infn and lngs for their support and work. in addition opera is grateful for funding from the following national agencies: fonds de la recherche scientifique – fnrs and institut interuniversitaire des sciences nucléaires for belgium; moses for croatia; cnrs and in2p3 for france; bmbf for germany; infn for italy; jsps (japan society for the promotion of science), mext (ministry of education, culture, sports, science and technology), qfpu (global coe program of nagoya university, "quest for fundamental principles in the universe", supported by jsps and mext) and promotion and mutual aid corporation for private schools of japan for japan; the swiss national science foundation (snf), the university of bern and eth zurich for switzerland; the russian foundation for basic research (grant 09-02-00300 a), the programs of the presidium of the russian academy of sciences "neutrino physics" and "experimental and theoretical researches of fundamental interactions connected with work on the accelerator of cern", the support programs of leading schools (grant 3517.2010.2), and the ministry of education and science of the russian federation for russia; the korea research foundation grant (krf-2008-313-c00201) for korea; and tubitak, the scientific and technological research council of turkey, for turkey. in addition the opera collaboration thanks the technical collaborators and the in2p3 computing centre (cc-in2p3). references [1] opera collaboration, r. acquafredda et al., jinst 4 (2009) p04018 [2] ed. k. elsener, the cern neutrino beam to gran sasso (conceptual technical design), cern 98–02, infn/ae-98/05 [3] r. bailey et al., the cern neutrino beam to gran sasso (cngs) (addendum to cern 98–02, infn/ae-98/05), cern-sl/99-034(di), infn/ae-99/05 [4] a. ereditato, k. niwa and p. strolin, the emulsion technique for short, medium and long baseline νµ → ντ oscillation experiments, 423, infn-ae-97-06, dapnu-97-07 [5] opera collaboration, h. shibuya et al., letter of intent: the opera emulsion detector for a long-baseline neutrino-oscillation experiment, cern-spsc-97-24, lngs-loi-8-97 [6] opera collaboration, m.
guler et al., an appearance experiment to search for νµ → ντ oscillations in the cngs beam: experimental proposal, cern-spsc-2000-028, lngs p25/2000 [7] opera collaboration, m. guler et al., status report on the opera experiment, cern/spsc 2001-025, lngs-exp 30/2001 add. 1/01 [8] opera collaboration, n. agafonova et al., phys. lett. b 691 (2010) 138 [9] opera collaboration, n. agafonova et al., arxiv:1107.2594v1 [10] n. agafonova et al., search for νµ → ντ oscillation with the opera experiment in the cngs beam, new j. phys. 14 (2012) 033017 [11] m. nakamura, neutrino 2012, xxv international conference on neutrino physics and astrophysics, 3–9 june 2012, kyoto, japan [12] t. strauss, charm production in the opera experiment and the study of a high temperature superconducting solenoid for a liquid argon time projection chamber, phd thesis eth-19247 [13] opera collaboration, t. adam et al., arxiv:1109.4897v4 [14] m. dracos, neutrino 2012, xxv international conference on neutrino physics and astrophysics, 3–9 june 2012, kyoto. discussion james beall – do you have evidence that the neutrinos travel at less than the speed of light? when will we know the answer to this question? thomas strauss – the speed of the neutrinos from cngs to lngs could be in agreement with the speed of light. the answer to the second question will be presented at the neutrino 2012 conference. addendum: the final results from [13, 14] are used in these proceedings, but were not used in the talk at vulcano 2012. maurice h.p.m. van putten – in your upcoming announcement on neutrino velocity, will you insist on consistency of the results from the 10.5 µs bunch experiment and the 2 ns pulse experiment? in arxiv:1110.4781, i pointed out the need for a 2-parameter analysis for causal matched filtering, to accurately determine the tof. the results show a reduction to 3.75σ from the opera claim of 6.04σ, demonstrating the need for a careful analysis. will opera make its data public, and will opera pursue proper causal matched filtering on the 10.5 µs experiment? thomas strauss – for the first question, the answer will be given at the neutrino 2012 conference. the data is available to the public in raw format, and a guideline for setting up the timelink calibration has been developed together with cern. as for the matched filtering, this question will be forwarded to the person responsible for the neutrino velocity analysis. addendum (statement from the collaboration): the 2-parameter analysis has been debated at length. it is clear that by introducing a new degree of freedom (the length of the bunch) one may obtain a better result, as van putten actually obtains. however, this corresponds to a possible new unknown physical property of neutrinos. to be conservative about the measurement in september, opera chose to exclude any debate on the possible physical source of the result. acta polytechnica vol. 52 no. 5/2012. a network simulation tool for task scheduling. ondřej votava, dept. of computer science, czech technical university, karlovo nám. 13, 121 35 praha 2, czech republic. corresponding author: votavon1@fel.cvut.cz. abstract distributed computing may be looked at from many points of view.
task scheduling is the viewpoint in which a distributed application is described as a directed acyclic graph and every node of the graph is executed independently. there are, however, data dependencies, and the nodes have to be executed in a specified order; hence the parallelism of the execution is limited. the scheduling problem is difficult, and therefore heuristics are used. however, many inaccuracies are caused by the model used for the system in which the heuristics are being tested. in this paper we present a tool for simulating the execution of a distributed application on a "real" computer network, and try to show how the execution is influenced compared to the model. keywords: task scheduling, dag scheduling, simulation, network simulation. 1 introduction heterogeneous computation platforms have become very popular in the past decade. they are cheap and easy to construct and offer good computation power. compared to parallel computers, distributed systems offer a better price-to-power ratio. however, the properties of distributed systems are different. communication is provided by a high-speed network, which is still slow in comparison with the specialized networks used in parallel systems [15]. mainly, communication leads to the need to modify traditional parallel algorithms into distributed algorithms [25, 14, 19]. task scheduling is one of many approaches used for distributed algorithms. the idea is simple. let us take an application: this application consists of several parts that may be executed independently. these parts can then be computed on different computers concurrently, and the application can be speeded up. task scheduling tries to answer which parts should be computed on which computers and when, so that the computation time is minimized. the structure of the paper is as follows. in the next section we describe the problem of task scheduling itself, and show several approaches that are widely used for solving the problem. at the end of the next section we show the network-related problem of the simplified models that are used. section 3 then describes the simulation tool that we used for making measurements, and in section 4 we show some interesting results obtained from the simulations. 2 task scheduling the application that is to be scheduled can be described as a directed acyclic graph (dag), i.e. $A_M = (V, E, B, C)$, where: $V = \{v_1, v_2, \ldots, v_v\}$, $|V| = v$, is the set of tasks; task $v_i \in V$ represents a piece of code that has to be executed sequentially on the same machine. $E = \{e_1, e_2, \ldots, e_e\}$, $|E| = e$, is the set of edges; edge $e_j = (v_k, v_l)$ represents a data dependency, i.e. task $v_l$ cannot start computation until the data from task $v_k$ have been received; task $v_k$ is called the parent of $v_l$, and $v_l$ is called the child of $v_k$. $B = \{b_1, b_2, \ldots, b_v\}$, $|B| = v$, is the set of computation costs (e.g. numbers of instructions), where $b_i \in B$ is the computation cost of task $v_i$. $C = \{c_1, c_2, \ldots, c_e\}$, $|C| = e$, is the set of data dependency costs, where $c_j = c_{k,l}$ is the data dependency cost (e.g. amount of data) corresponding to edge $e_j = (v_k, v_l)$. a task which has no parents or children is called an entry task or an exit task, respectively. if there is more than one entry/exit task in the graph, a new virtual entry/exit task can be added to the graph. such a task would have zero weight and would be connected by zero-weight edges to the real entry/exit tasks.
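as a concrete illustration, the model can be encoded in a few lines of python (a hypothetical sketch; the task ids, costs and helper names are invented for illustration and do not come from the paper):

```python
# a minimal, hypothetical encoding of the application model AM = (V, E, B, C):
# tasks carry computation costs b_i, edges carry data-dependency costs c_(k,l).

tasks = {1: 10, 2: 20, 3: 30, 4: 25}          # task id -> computation cost b_i
edges = {(1, 2): 14, (1, 3): 26, (2, 4): 36,  # (parent, child) -> data cost
         (3, 4): 25}

def parents(v):
    return [k for (k, l) in edges if l == v]

def children(v):
    return [l for (k, l) in edges if k == v]

# a task with no parents is an entry task; one with no children is an exit task
entry_tasks = [v for v in tasks if not parents(v)]
exit_tasks = [v for v in tasks if not children(v)]
print(entry_tasks, exit_tasks)   # [1] [4]
```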
the application has to be scheduled on a heterogeneous computation system (cs), which can be described as a general graph, but there is one very important restriction. the graph represents a connection structure, and even though there may be no direct connection between two computation nodes, there must be an edge between all of the nodes that are able to communicate. this restriction leads to the observation that the cs is always a complete graph.
figure 1: application model described as a dag (ten tasks plus zero-weight virtual entry and exit tasks; node weights $b_i$, edge weights $c_{i,j}$).
the computation system can then be described as $CS = (P, Q, R, S)$, where: $P = \{p_1, p_2, \ldots, p_p\}$, $|P| = p$, is the set of computation nodes; $Q = \{q_1, q_2, \ldots, q_p\}$, $|Q| = p$, is the set of speeds of the computation nodes, where $q_i$ is the speed of node $p_i$; $R$ is a matrix describing the communication costs, the size of $R$ being $p \times p$; $S$ is a matrix used to describe the communication startup costs; it is usually one-dimensional, i.e. its size is $p \times 1$. scheduling is connected to the specific application and the specific cs. the computation time $t_{i,j}$ of task $v_i$ on a node $p_j$ can be calculated using the equation $t_{i,j} = b_i / q_j$. (1) when thinking of static scheduling, a matrix $W$ can be used. $W$ contains the computation times for all of the tasks on every node, i.e. the size of $W$ is $v \times p$. the scheduling algorithm has to take the communication delay into account. the duration of the transfer of edge $e_i$ from node $p$ to node $q$ is then defined as $t^{(i)}_{p,q} = S[p] + c_i \cdot R[p][q]$. (2) 2.1 scheduling algorithms the problem of task scheduling is claimed to be np-complete [21, 7], and therefore intensive research on heuristics has been done. the heuristics can be divided into several categories, the common criterion being the knowledge of when the schedule is created. if the schedule is computed before the computation of the application begins, i.e. if the schedule is known a priori, the heuristic is called static or offline. in contrast, when the schedule is computed as a part of the computation of the application, the heuristic is called dynamic or online.
figure 2: the computation system described as a complete graph.
both static and dynamic algorithms have been proposed in the literature. for a heterogeneous computation platform, an example of a very well-known static algorithm is heft [20]. the main idea of heft is to order the tasks in a list and to assign each task that is ready to the computer that minimizes the earliest finish time of the task. another algorithm proposed in [20] is cpop. this algorithm finds a critical path and minimizes the execution of the tasks which are on that path. the quality of the schedule is then very dependent on how the critical path was created. cpop has slightly worse computational complexity than heft, but its scheduling quality results are close. the idea of creating a list of tasks ordered in a specific manner is common to a whole group of scheduling heuristics; they are called list scheduling algorithms. modifying heft in such a way that some tasks can be duplicated, we obtain the algorithm presented in [8]. this algorithm can then be applied to the specific context of cluster-based computation systems, and the results that it achieves are very good [11]. many other algorithms have been published. some were summarized in [2], and many others, which are focused on homogeneous computation systems, are compared in [13].
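the following sketch shows how eqs. (1) and (2) drive a heft-style list scheduler. it is a simplified illustration (no insertion into idle gaps, a simplified upward rank), not the exact algorithm of [20], and all identifiers are hypothetical:

```python
# hedged sketch of a HEFT-style list scheduler built on eqs. (1) and (2);
# simplified illustration only -- not the exact algorithm of [20].
import statistics

def schedule(tasks, edges, q, s, r):
    # w[i][j]: computation time of task i on node j, eq. (1): t_ij = b_i / q_j
    w = {i: [b / qj for qj in q] for i, b in tasks.items()}

    memo = {}
    def rank(i):
        # upward rank: mean execution time plus the heaviest path to an exit
        if i not in memo:
            succ = [(l, c) for (k, l), c in edges.items() if k == i]
            memo[i] = statistics.mean(w[i]) + max(
                (c + rank(l) for l, c in succ), default=0.0)
        return memo[i]

    node_free = [0.0] * len(q)      # earliest free time of each node
    finish, placed = {}, {}
    # parents are always processed first: a parent's rank exceeds its child's
    for i in sorted(tasks, key=rank, reverse=True):
        best = None
        for j in range(len(q)):
            # data from every parent must arrive; eq. (2): s[p] + c * r[p][q]
            ready = max((finish[k] + (0.0 if placed[k] == j else
                         s[placed[k]] + c * r[placed[k]][j])
                         for (k, l), c in edges.items() if l == i),
                        default=0.0)
            eft = max(ready, node_free[j]) + w[i][j]
            if best is None or eft < best[0]:
                best = (eft, j)
        finish[i], placed[i] = best
        node_free[placed[i]] = finish[i]
    return placed, finish
```

with the tasks and edges dictionaries from the previous sketch and, for example, q = [1.0, 2.0], s = [0.1, 0.1] and r = [[0, 0.01], [0.01, 0]], schedule(tasks, edges, q, s, r) returns a placement and finish times computed with eqs. (1)–(2).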
unlike static algorithms, dynamic algorithms are usually used to schedule more than one application at a time. however, there are some exceptions. for example, [24] schedules several applications in a static way and compares several attitudes that are permissible for this problem. a semi-dynamic attitude is described in [1], where the tasks are scheduled statically, but there is a global application structure which contains all of the applications, and this global structure changes when a new application arrives in the system.
figure 3: a real network in which every link has a capacity of 100 bps (left), and the false fully-connected representation implied by the matrix r, whose diagonal entries are zero and whose off-diagonal entries are all 0.01 (right).
a completely different attitude to dynamic algorithms is presented in [17], where there is one central scheduling node and each computation node collects statistics of its own usage. these data are then sent to the central node, and the scheduling algorithm adds the tasks from the queue to the queues of tasks of specific nodes according to the prediction of node utilization computed from the statistics. a system in which there is more than one scheduling node is described in [10]. the scheduling nodes are independent machines and therefore have no information about the schedules of the others, so a statistical approach to node utilization is used. a two-level scheduling algorithm is described in [9]. the first level schedules the task to a specific "server", which is the leader of a set of close nodes. the "server" then schedules the task to a specific node. a very simple dynamic algorithm was also proposed in [23]. the idea is that the nodes are not differentiated, i.e. a node can be both a "worker" and a "server". the schedule is created in steps, and in each step several messages are sent that try to get information on the utilization of the neighbours of the node. 2.2 weaknesses of the model the model of the application ($A_M$) describes the application in a very good way. however, this description is limited and does not reflect reality in every respect that we could imagine. there is, for example, a hidden prerequisite that all of the nodes know the code of the application and that all the data needed as input for the application are available before the application is computed. similarly, the output of the application is not targeted to a specific node, and the computation can finish on any node, which may be confusing in reality. the computation system cs is also simplified. all of the properties of the network are merged into the two matrices $R$ and $S$. the values of the matrices do not take into account all of the possible properties of the network. in figure 3 the network contains one bottleneck; however, the properties of the network gained from independent measurements of the properties of each link do not indicate this, and communication may therefore be delayed against the plan of the schedule. this delay may or may not be critical for the subsequent computation, and it is the purpose of this paper to show how the communication delay may change the execution order of the schedule. 2.3 related work the problem of evaluating the correctness of generated schedules has been studied extensively. several tools have been presented, all of which try to help researchers to validate their algorithms. there are two main attitudes to the problem:
testing the algorithms on real platforms, and simulating the experiments. the problem with using real systems for testing is that the possible system architectures are very limited, due to the limited hardware resources. there are, however, some systems that are focused on this type of testing. grid'5000 [4] and planetlab [6] are two examples of platforms available for application testing. the results provided by these tools are very reliable, but the scalability of the network is limited. another very important problem tightly coupled with task scheduling is that the number of existing applications is limited. since only generated structures of non-existing applications are tested, real systems cannot be used. the same problem emerges when we talk about emulation tools. simulating experiments suits the task scheduling problem much better. this method is very widely used, though not all authors mention the system and the simulation method that they have used [16]. however, several systems have become well known. gridsim [3] is a simulation tool focused on modeling the resources of the nodes of the cs. the network layer is modeled using a simple discrete event simulation, and starts at the level of the third layer of iso/osi. the alea framework [12], an extension of gridsim, provides a tool for various grid scheduling problems.
figure 4: topologies of four testing networks.
another well-known simulator is simgrid [5]. simgrid has transformed from a tool for scheduling applications with a dag structure into a system that is able to both simulate and emulate distributed algorithms. simgrid uses a mathematical model of the network, but version 3 also introduces a hybrid system that allows the use of gtnets [18] as a transport simulation tool. 3 simulation tool the purpose of this paper is to show the influence of network parameters on schedule execution. hence the simulation tool has to offer a realistic simulation of the network, and we decided to use the omnet++ simulation tool [22]. omnet++ aims primarily at network simulations and is used as a core for many projects. there are also many extensions to omnet++; e.g. the inet framework is a set of omnet++ modules that simulate internet devices. inet contains modules for both physical devices (routers, switches, hubs or access points for wireless networks) and protocols (the tcp/ip protocol family, sctp, ospf or mpls). since we want to make the simulation of the network as realistic as possible, we chose omnet++ with the inet framework as the simulation core. omnet++ itself offers no support for scheduling. as mentioned above, the applications that we use for testing the scheduling algorithms are randomly generated and have no real representation (i.e. code). the simulation of the execution of the schedule then consists only of sequences in which data is sent or received and in which the nodes pretend to be working; in terms of the simulation, they sleep for a specified amount of time.
table 1: speeds and distances used in network 3 – link r2–r1: 10 gbit, 1000 m; link r2–r3: 100 mbit, 100 m; link r2–r4: 1 gbit, 10000 m.
table 2: speeds and distances in network 4 – link r2–r1: 100 mbit, 100 ms; link r2–r3: 10 mbit, 100 m.
this behavior is executed in the taskexecutor module, which may be connected to two modules, the first for tcp communication and the second for udp communication. the communication itself is then simulated by the standard inet framework. this involves packet collisions, routing, queuing of packets, etc.
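since the real taskexecutor is an omnet++ c++ module, the following standalone python sketch (hypothetical names, illustrative only) merely mirrors its "wait for inputs, sleep, send outputs" logic: it computes the idealized timing that the schedule expects, which is exactly what the inet simulation then perturbs with collisions, routing and queuing.

```python
# hypothetical, simplified replay of a schedule's communication pattern;
# the paper's TaskExecutor is an OMNeT++ C++ module -- this sketch only
# reproduces the contention-free timing implied by eqs. (1) and (2).

def topological_order(tasks, edges):
    indeg = {i: 0 for i in tasks}
    for (k, l) in edges:
        indeg[l] += 1
    queue = [i for i in tasks if indeg[i] == 0]
    out = []
    while queue:
        i = queue.pop()
        out.append(i)
        for (k, l) in edges:
            if k == i:
                indeg[l] -= 1
                if indeg[l] == 0:
                    queue.append(l)
    return out

def replay(tasks, edges, placed, w, s, r):
    """return idealized finish times, ignoring contention (unlike INET)."""
    finish, node_free = {}, {}
    for i in topological_order(tasks, edges):
        j = placed[i]
        # inputs arrive after the parent's finish plus the eq. (2) transfer
        ready = 0.0
        for (k, l), c in edges.items():
            if l == i:
                pj = placed[k]
                comm = 0.0 if pj == j else s[pj] + c * r[pj][j]
                ready = max(ready, finish[k] + comm)
        start = max(ready, node_free.get(j, 0.0))
        finish[i] = start + w[i][j]     # the node "sleeps" for the task time
        node_free[j] = finish[i]
    return finish
```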
4 simulation results we created four network topologies that were used for simulation. the topologies of the networks are shown in figure 4, and the main properties of each of the networks are as follows. network 1 contains 10 nodes connected by a 1 gbit ethernet link to a central switch; the cable length is 10 meters. network 2 contains 2 groups of 10 nodes. the nodes are connected by 1 gbit ethernet to the router; the routers are connected by a 10 gbit point-to-point line with a delay corresponding to a distance of 10 km. network 3 contains 4 groups of 5 nodes. the nodes are connected by 1 gbit ethernet to the router, and the routers are connected as shown in table 1. network 4 contains 1 group of 10 nodes and two groups of 5 nodes. the nodes are connected by 1 gbit ethernet to the router, and the routers are connected as shown in table 2. the communication delays are specified in time units (ms) or in distance units (m) – the real delay is then computed by the equation $\mathrm{delay} = \mathrm{distance} / (0.64\,c)$. (3) for the 10 km line of network 2, for example, this gives $10^4\,\mathrm{m} / (0.64 \cdot 3 \times 10^8\,\mathrm{m/s}) \approx 52$ µs. networks 1 to 4 were the computation systems for a set of 300 randomly generated applications. the method for generating them was copied from [20].
figure 5: differences between the expected and real start of tasks of one application, with network 1 as the execution platform (time difference in seconds versus time of computation, for tcp and udp).
a schedule for each network was created for each application. we used heft and cpop as the scheduling algorithms and tcp and udp as the transport layer. in the end, we had a set of 4800 schedules (300 applications × 4 networks × 2 algorithms × 2 transport protocols). all of the schedules were simulated, and the differences between the execution time of the scheduled task and the simulated task were recorded. the results of the simulations showed that the differences between the real execution time and the expected execution time are not large. we had expected that these small differences would accumulate and grow, but the difference seems to be almost constant; see figures 5, 6 and 7. the structure of the network influences the execution of the schedule. it may not be a coincidence that the time differences are usually of the order of the startup delay. for example, the time differences in network 1 were very small, the maximum being about 10^−6 s, which is of the same order as the startup delay for network 1 stored in $S$. the differences in network 4 were higher, 10^−1 s, which is higher than the order of the values in the startup delay matrix $S$. nevertheless, the value is much smaller than the total execution time of the computation. we have also shown that the average length of the schedule was 10^3 s, and the average difference caused by the network transport was −10^−6 to −10^−1 s. the difference is really small compared to the schedule length. 5 conclusion in this paper we have presented the problem of task scheduling. since the problem is difficult, heuristics are used, and great progress has been achieved in this area. however, the models that are used as the standard input for most of the algorithms are only models, and may suffer from many simplifications. we have focused here on the computation system, especially the networking subsystem. we have shown that there are several points that may lead to misunderstandings about how the network may work.
in order to show whether these points affect real-world applications, we created a simulation tool which is able to simulate the execution of the schedule and the corresponding networking activity. the tool is based on omnet++, which is often used for network-based simulations. we created a set of randomly generated applications, and schedules for them. we ran the simulations on four network topologies, and our results show that the communication caused some differences against the expected execution time. however, the differences were very small. for this specific set of applications and networks we may say that the differences are insignificant.
figure 6: differences between the expected and real start of applications with more than 50 tasks across all platforms (average, minimal and maximal values).
we also mentioned the idea that the time differences caused by the network are of the order of the startup delay. to make this idea bullet-proof we need to run more simulations on various types of networks and with various types of applications. of course, the best way would be to execute real applications on real networks. real networks have more problems than only collisions or delays. there may be other traffic, and the bandwidth may therefore change during the computation. these and many other problems are still to be solved. acknowledgements the research reported in this paper has been supported by the ministry of education, youth and sports of the czech republic under research program msm 6840770014 and by the grant agency of the czech technical university in prague, grant no. sgs11/158/ohk3/3t/13. references [1] j. barbosa, b. moreira. dynamic job scheduling on heterogeneous clusters. in parallel and distributed computing, 2009. ispdc '09. eighth international symposium on, pp. 3–10. 2009. [2] tracy d. braun, howard jay siegel, noah beck, et al. a comparison of eleven static heuristics for mapping a class of independent tasks onto heterogeneous distributed computing systems. journal of parallel and distributed computing 61(6):810–837, 2001. [3] rajkumar buyya, manzur murshed. gridsim: a toolkit for the modeling and simulation of distributed resource management and scheduling for grid computing. concurrency and computation: practice and experience 14(13-15):1175–1220, 2002. [4] f. cappello, e. caron, m. dayde, et al. grid'5000: a large scale and highly reconfigurable grid experimental testbed. in proceedings of the 6th ieee/acm international workshop on grid computing, grid '05, pp. 99–106. ieee computer society, washington, dc, usa, 2005. [5] henri casanova, arnaud legrand, martin quinson. simgrid: a generic framework for large-scale distributed experiments. in 10th ieee international conference on computer modeling and simulation. 2008.
figure 7: average, minimal and maximal time difference between the expected and real start of all tasks of all applications executed on network 1.
[6] brent chun, david culler, timothy roscoe, et al. planetlab: an overlay testbed for broad-coverage services.
sigcomm comput commun rev 33:3–12, 2003. [7] michael r. garey, david s. johnson. computers and intractability: a guide to the theory of np-completeness. w. h. freeman & co., new york, ny, usa, 1979. [8] t. hagras, j. janecek. a high performance, low complexity algorithm for compile-time task scheduling in heterogeneous systems. parallel computing 31(7):653–670, 2005. heterogeneous computing. [9] m. a. iverson, f. özgüner. hierarchical, competitive scheduling of multiple dags in a dynamic heterogeneous environment. distributed systems engineering 6(3):112, 1999. [10] michael iverson, fusun ozguner. dynamic, competitive scheduling of multiple dags in a distributed heterogeneous environment. heterogeneous computing workshop 0:70, 1998. [11] jan janeček, peter macejko, tarek morad gomaa hagras. task scheduling for clustered heterogeneous systems. in m. h. hamza (ed.), iasted international conference parallel and distributed computing and networks (pdcn 2009), pp. 115–120. 2009. isbn: 978-0-88986-783-3, isbn (cd): 978-0-88986-784-0. [12] dalibor klusáček, hana rudová. alea 2 – job scheduling simulator. in proceedings of the 3rd international icst conference on simulation tools and techniques (simutools 2010). icst, 2010. [13] yu-kwong kwok, ishfaq ahmad. benchmarking the task graph scheduling algorithms. in proc. ipps/spdp, pp. 531–537. 1998. [14] claudia leopold. parallel and distributed computing: a survey of models, paradigms and approaches. john wiley & sons, inc., new york, ny, usa, 2001. [15] jiuxing liu, b. chandrasekaran, jiesheng wu, et al. performance comparison of mpi implementations over infiniband, myrinet and quadrics. in supercomputing, 2003 acm/ieee conference, p. 58. 2003. [16] s. naicken, a. basu, b. livingston, s. rodhetbhai. towards yet another peer-to-peer simulator. in proceedings of the fourth international working conference on performance modeling and evaluation of heterogeneous networks (het-nets), pp. 37/1–37/10. ilkley, uk, 2006. [17] xiao qin, hong jiang. dynamic, reliability-driven scheduling of parallel real-time jobs in heterogeneous systems. parallel processing, international conference on 0:0113, 2001. [18] george f. riley. simulation of large scale networks ii: large-scale network simulations with gtnets. in proceedings of the 35th conference on winter simulation: driving innovation, wsc '03, pp. 676–684. winter simulation conference, 2003. [19] gerard tel. introduction to distributed algorithms. cambridge university press, new york, ny, usa, 2nd edn., 2001. [20] h. topcuoglu, s. hariri, m. y. wu. performance-effective and low-complexity task scheduling for heterogeneous computing. ieee transactions on parallel and distributed systems 13:260–274, 2002. [21] j. d. ullman. np-complete scheduling problems. journal of computer and system sciences 10(3):384–393, 1975. [22] andrás varga, rudolf hornig. an overview of the omnet++ simulation environment. in proceedings of the 1st international conference on simulation tools and techniques for communications, networks and systems & workshops, simutools '08, pp. 60:1–60:10. icst (institute for computer sciences, social-informatics and telecommunications engineering), brussels, belgium, 2008. [23] o. votava, p. macejko, j. kubr, j. janeček. dynamic local scheduling of multiple dags in distributed heterogeneous systems. in proceedings of the 2011 international conference on telecommunication systems management, pp. 171–178.
american telecommunications systems management association inc., dallas, tx, 2011. [24] henan zhao, r. sakellariou. scheduling multiple dags onto heterogeneous systems. parallel and distributed processing symposium, international 0:130, 2006. [25] albert y. h. zomaya (ed.). parallel and distributed computing handbook. mcgraw-hill, inc., new york, ny, usa, 1996. acta polytechnica 53(supplement):807–810, 2013. large area hodoscopes for muon diagnostics of heliosphere and earth's magnetosphere. i.i. yashin (a,*), n.v. ampilogov (a), i.i. astapov (a), n.s. barbashina (a), v.v. borog (a), a.n. dmitrieva (a), r.p. kokoulin (a), k.g. kompaniets (a), g. mannocchi (b), a.s. mikhailenko (a), a.a. petrukhin (a), o. saavedra (c), v.v. shutenko (a), g. trinchero (b), e.i. yakovleva (a). a: national research nuclear university mephi, 115409, moscow, russia; b: istituto di fisica dello spazio interplanetario, inaf, 10133, torino, italy; c: dipartimento di fisica dell'universita di torino, 10125, torino, italy. corresponding author: iiyashin@mephi.ru. abstract. muon diagnostics is a technique for remote monitoring of active processes in the heliosphere and the magnetosphere of the earth, based on the analysis of angular variations of the muon flux simultaneously detected from all directions of the upper hemisphere. to carry out muon diagnostics, special detectors – muon hodoscopes – which can detect muons from any direction with good angular resolution in real-time mode are required. we discuss approaches to data analysis and the results of studies of various extra-terrestrial processes detected by means of the wide-aperture uragan muon hodoscope. keywords: cosmic rays, muons, muon hodoscope, angular variations of muon flux, solar-terrestrial connections, heliosphere, magnetosphere. 1. introduction during powerful solar events, e.g. coronal mass ejections (cme), clouds of magnetized plasma disturb the interplanetary magnetic field (imf) and can produce strong magnetic storms in the magnetosphere of the earth. however, the problem of early recognition and forecasting of the development of such dangerous phenomena has not yet been solved. the main reason for this is a scarcity of information about heliospheric conditions between the orbits of mercury and the earth. solar observations make it possible to see the moments when various events appear. space-borne apparatus can give information about heliospheric disturbances 1–2 hours before their arrival at the earth. from this point of view, cosmic rays are a promising tool in the development of a world environmental observation system, owing to their penetrative ability. the movement of solar plasma through the heliosphere disturbs the magnetic field, causing modulation of the primary cr (see fig. 1) and correspondingly of the secondary components of cosmic rays (mainly neutrons and muons) generated in the atmosphere. to detect them, a world-wide net of ground-based stations (neutron monitors and muon telescopes) is used. the objective of these observations is to solve the inverse task – a study of dynamic processes in the heliosphere and magnetosphere using cosmic ray variation data. however, neutron monitors can measure only integral flux variations, and only the information from the whole world-wide neutron monitor net allows some conclusions to be drawn about heliospheric processes.
muons (in contrast to neutrons) keep the parent particle directions, and hence there is an opportunity to measure primary cosmic ray variations from various directions. however, muon telescopes do not have sufficient resolution to obtain a spatial picture of the disturbance development in the heliosphere. new possibilities in this field are opened by the use of muon hodoscopes [4, 7], which allow the solution of this task by the methods of muon diagnostics [3, 11]. unfortunately, however, the scheme of cosmic ray application for heliosphere and magnetosphere investigations is not so simple, since the flux of secondary particles at the surface depends not only on the primary cosmic ray flux, but also on the conditions in the atmosphere. for example, the muon flux at ground level is strongly related to various thermodynamic processes in the earth's atmosphere at the generation level (barometric and temperature effects) and to more complex wave processes in the lower stratosphere (internal gravity waves of air density, density gradients, etc.) correlated with various turbulent and wave processes of geophysical origin, which are localized in space and time [2]. for reliable recognition of extraterrestrial phenomena, it is therefore necessary to take into account the variations of the muon flux of atmospheric origin. 2. muon hodoscopes to realize muon diagnostics, there is a need for wide-aperture muon detectors – hodoscopes – which enable simultaneous measurements of the intensity of muons from all directions of the upper hemisphere.
figure 1. modulation of primary cosmic rays related with solar activity.
such detectors must have an angular resolution of muon track reconstruction of the order of 1 degree and a large sensitive area, to provide the necessary statistical accuracy of the experimental data for all zenith and azimuth angular bins. the possibilities of muon diagnostics were demonstrated by means of the first hodoscopes: the temp scintillation muon detector [7] and the uragan hodoscope [4]. now a new scintillation muon hodoscope with wavelength shifting (wls) optical fiber light collection is under construction [1]. since most of the experimental results were obtained with the uragan muon hodoscope, we mainly consider them. uragan has been in operation at mephi (moscow, 55.7° n, 37.7° e, 173 m altitude a.s.l.) since may 2005 [4]. it consists of four eight-layer assemblies (supermodules, sm) on the basis of streamer tubes (1 cm2 cross-section, 3.5 m length) with external two-coordinate data readout. it has a total area of 46 m2, sufficient to provide good statistics: more than 5000 muons per second. the supermodule response contains information about the muon track in both x- and y-projections [8]. two projected zenith angles are reconstructed in real-time mode and are accumulated in 2d directional matrices (zenith and azimuth angles, or a pair of projected zenith angles). a sequence of such matrices provides the filming of the upper hemisphere in "muon light". to study muon flux variations, for every cell of the angular matrix the average number of muons (estimated during the preceding 24 hours and corrected for atmospheric pressure and temperature [9]) is subtracted, and the results are divided by the standard deviations. the obtained data array provides a "muon snapshot" of the upper hemisphere with a 1-minute exposure.
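as a rough numerical sketch of this per-cell normalization (hypothetical code, not the collaboration's software; the sample standard deviation as the estimator, the 45 × 45 binning and the poisson test data are all assumptions):

```python
# hypothetical sketch of the "muon snapshot" normalization described above:
# subtract the corrected 24-hour mean from each 1-minute angular matrix and
# express the deviation in units of the standard deviation.
import numpy as np

def muon_snapshot(minute_matrix, last_24h_matrices):
    """last_24h_matrices: stack of corrected 1-minute matrices, shape (1440, n, n)."""
    mean = last_24h_matrices.mean(axis=0)
    sigma = last_24h_matrices.std(axis=0)
    return (minute_matrix - mean) / np.where(sigma > 0, sigma, 1.0)

# example with random counts standing in for real data
rng = np.random.default_rng(0)
history = rng.poisson(lam=100, size=(1440, 45, 45)).astype(float)
snapshot = muon_snapshot(rng.poisson(lam=100, size=(45, 45)), history)
print(snapshot.shape)   # (45, 45) matrix of deviations in units of sigma
```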
the size of the angular cells in the uragan matrix data was chosen to be 2° × 2° (in two projected zenith angles). the local anisotropy vector is used as a quantitative characteristic of the angular variations of the muon flux. this vector indicates the average arrival direction of the muons. the local anisotropy vector a is defined as the sum of the unit vectors, each representing the direction of an individual track, normalized to the total number of muons [12]. the projections of the vector of local anisotropy $(a_x, a_y, a_z)$ are obtained from the initial matrix data $M$ as
$a_x = \frac{1}{N}\sum_\theta \sum_\varphi m(\theta,\varphi)\cos\varphi\sin\theta$, $a_y = \frac{1}{N}\sum_\theta \sum_\varphi m(\theta,\varphi)\sin\varphi\sin\theta$, $a_z = \frac{1}{N}\sum_\theta \sum_\varphi m(\theta,\varphi)\cos\theta$, $N = \sum_\theta \sum_\varphi m(\theta,\varphi)$,
where θ and ϕ are the angles corresponding to the matrix cell midpoints, $m(\theta,\varphi)$ is the number of recorded events in the corresponding cell $(\theta,\varphi)$ of the matrix $M$, and $N$ is the total number of events in the angular range that is used (a numerical sketch of this computation is given below, after the discussion of fig. 6). for convenience, the x axis is directed to the south, and the y axis is directed to the east (see fig. 2). in addition, the vector of the relative anisotropy $r = a − \langle a \rangle$, where $\langle a \rangle$ is the local anisotropy vector averaged over a long time period, and its projection $r_h$ onto the horizontal plane are considered. 3. analysis of the uragan data 3.1. temporal variations variations of the total counting rate of muons measured by means of the uragan hodoscope between 2007 and 2011 are presented in fig. 3. the data in the figure (about 4 × 10^11 muons) were obtained by summing the counting rates of three supermodules of the uragan for the zenith angle range 25° ≤ θ < 76°. the frequency spectrum obtained as a result of the fourier analysis of the time series of the hourly average muon counting rate is shown in fig. 4. the part of the spectrum with periods from 10 to 40 days is highlighted in the inset. the analysis of the variations of the counting rates shows the presence of annual, 27-day, diurnal and semi-diurnal variations [10]. figure 5 presents correlations between the diurnal changes of the east and south projections of the local anisotropy vector. the correlations in the annual average diurnal cycle are clearly seen. 3.2. analysis of forbush decreases for the analysis of the variations of the muon flux at the time of forbush decreases (fd) detected by the uragan, an experimental technique has been developed that allows their energy, angular, and temporal characteristics to be investigated [5].
figure 2. vector of local anisotropy.
figure 3. time dependence of the daily average muon counting rate (for the zenith angle interval 25° ÷ 76°) estimated from the uragan data (normalized to one module).
variations in the muon flux during fds were studied using both the integral counting rate, summed over the three sms (average 10-minute data corrected for barometric and temperature effects), and the counting rates for five zenith-angular ranges: 0° ÷ 17°, 17° ÷ 26°, 26° ÷ 34°, 34° ÷ 44°, and over 44°, the boundaries of which were chosen to provide nearly equal statistics. the threshold muon energies depend on the zenith angle, and vary from 200 to 400 mev. the fds detected with the uragan hodoscope between 2006 and 2011 with an amplitude of the decrease in the integral counting rate of ≥ 0.5 % were selected. figure 6 presents the results of a study of the variations of muons and neutrons (from moscow neutron monitor – mnm – data) during the forbush decrease of july 11, 2011. solar wind and imf parameters are also shown.
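before turning to the gse images, here is the promised numerical sketch of the anisotropy quantities defined above (hypothetical code; the binning, angle conventions and array layout are assumptions for illustration):

```python
# hypothetical numerical sketch of the local anisotropy vector defined in
# section 2 (illustrative only; conventions are assumptions).
import numpy as np

def local_anisotropy(m, theta, phi):
    """m[i, j]: counts in cell (theta_i, phi_j); angles at cell midpoints (rad)."""
    t = theta[:, None]                 # broadcast zenith over azimuth columns
    p = phi[None, :]
    n = m.sum()
    ax = (m * np.cos(p) * np.sin(t)).sum() / n   # x towards the south
    ay = (m * np.sin(p) * np.sin(t)).sum() / n   # y towards the east
    az = (m * np.cos(t)).sum() / n
    return np.array([ax, ay, az])

def relative_anisotropy(a, a_mean):
    """r = a - <a> and the magnitude of its horizontal projection r_h."""
    r = a - a_mean
    return r, np.hypot(r[0], r[1])
```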
3. Analysis of the URAGAN data

3.1. Temporal variations

Variations of the total counting rate of muons measured by means of the URAGAN hodoscope between 2007 and 2011 are presented in Fig. 3. The data in the figure (about 4 × 10^11 muons) were obtained by summing the counting rates of three supermodules of the URAGAN for the zenith angle range 25° ≤ θ < 76°.

Figure 3. Time dependence of the daily average muon counting rate (for the zenith angle interval 25°–76°) estimated from the URAGAN data (normalized to one module).

The frequency spectrum obtained as a result of the Fourier analysis of the time series of the hourly average muon counting rate is shown in Fig. 4. The part of the spectrum with periods from 10 to 40 days is highlighted in the inset. The analysis of the variations of the counting rates shows the presence of annual, 27-day, diurnal and semi-diurnal variations [10].

Figure 4. Fourier power spectra of the time series of the muon counting rate. In the boxes, the values of the periods (in days) are specified for the corresponding peaks.

Figure 5 presents the correlations between the diurnal changes of the east and south projections of the local anisotropy vector. The correlations in the annual average diurnal cycle are clearly seen.

Figure 5. Correlations in the annual average diurnal cycle (Asouth × 10^5 versus Aeast × 10^5, yearly curves for 2007–2011).
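The periodicity search behind Fig. 4 can be reproduced in outline as follows. This is only a sketch of the standard approach, not the authors' analysis chain; the gap-free hourly array `rate` and the simple mean-subtraction detrending are assumptions.

```python
# A minimal sketch of the Fourier analysis of the hourly counting rate
# described above. Peaks of the power spectrum at periods of ~365, ~27,
# 1 and 0.5 days would correspond to the annual, 27-day, diurnal and
# semi-diurnal variations reported in [10].
import numpy as np

def power_spectrum(rate: np.ndarray, dt_hours: float = 1.0):
    """Return (period_days, power) of the detrended counting-rate series."""
    x = rate - rate.mean()                       # remove the constant level
    power = np.abs(np.fft.rfft(x)) ** 2          # one-sided power spectrum
    freq = np.fft.rfftfreq(x.size, d=dt_hours)   # cycles per hour
    period_days = np.divide(1.0, freq * 24.0,
                            out=np.full_like(freq, np.inf), where=freq > 0)
    return period_days, power
```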
3.2. Analysis of Forbush decreases

For the analysis of variations of the muon flux at the times of Forbush decreases (FDs) detected by the URAGAN, an experimental technique has been developed that allows their energy, angular and temporal characteristics to be investigated [5]. Variations in the muon flux during FDs were studied using both the integral counting rate, summed over the three SMs (average 10-minute data corrected for barometric and temperature effects), and the counting rates for five zenith-angle ranges: 0°–17°, 17°–26°, 26°–34°, 34°–44°, and over 44°, the boundaries of which were chosen to provide nearly equal statistics. The threshold muon energies depend on the zenith angle and vary from 200 to 400 MeV. The FDs detected with the URAGAN hodoscope between 2006 and 2011 with an amplitude of the decrease in the integral counting rate ≥ 0.5 % were selected.

Figure 6 presents the results of a study of the variations of muons and neutrons (from Moscow neutron monitor, MNM, data) during the Forbush decrease of July 11, 2011; solar wind and IMF parameters are also shown. The right panel of the figure presents GSE-images obtained for the moments of the maximum increase of rh during the period July 07–14, 2011. To obtain a GSE-image, a "muon snapshot" is projected onto the magnetopause using asymptotic directions, and is transformed into the GSE (geocentric solar ecliptic) coordinate system [6]: x is directed from the Earth to the Sun, y lies in the plane of the ecliptic and points against the orbital motion of the Earth, and z is parallel to the direction to the pole of the ecliptic. GSE is usually used to display the trajectories of satellites and observations of the interplanetary magnetic field and the solar wind.

Figure 6. Variations of CR muons during the FD of July 11, 2011. Left (from top to bottom): solar wind velocity; parameters of the IMF vector (BT and Bz); counting rate of the URAGAN; counting rate of the MNM; changes of rh. Right: GSE-images obtained for the moments indicated on the rh plot in the left panel. Note the behavior of the local anisotropy parameter rh: the increase of rh (up to the 4σ level) started 2 days before the beginning of the FD. The cross in the circle indicates the direction of the interplanetary magnetic field force line.

4. Conclusions

The use of cosmic rays as a penetrating component in the heliosphere, and of muon hodoscopes as the apparatus for the formation of "muon images" of the magnetosphere and extra-terrestrial space, opens a new approach to remote monitoring of the environment. The analysis of the muon flux anisotropy during heliospheric perturbations related to solar activity by means of even a single wide-aperture hodoscope provides a way to obtain unique information about the structure and dynamics of such events, and to compare the predictions of various models of heliospheric processes with direct measurements of muon flux variations.

Acknowledgements

This research was performed at the NEVOD scientific and educational centre, with support from the Russian Ministry of Education and Science (contract no. 16.518.11.7053) within the framework of the federal target program "Scientific and pedagogical cadres for innovative Russia" and the leading scientific school grant NSh-6817.2012.2.

References
[1] Ampilogov, N. V., et al.: 2011, Astrophys. Space Sci. Trans. 7, 435
[2] Astapov, I. I., et al.: 2011, in Proc. 32nd ICRC (Beijing), 11, 444
[3] Barbashina, N. S., et al.: 2007, Bull. Rus. Acad. Sci., Phys. 71, 1041
[4] Barbashina, N. S., et al.: 2008, Instrum. Exp. Tech. 51, 180
[5] Barbashina, N. S., et al.: 2011, Bull. Rus. Acad. Sci., Phys. 75, 814
[6] Barbashina, N. S., et al.: 2012, in Proc. 23rd ECRS (Moscow), id 574
[7] Borog, V. V., et al.: 1995, in Proc. 24th ICRC (Rome), 4, 1291
[8] Chernov, D. V., et al.: 2005, in Proc. 29th ICRC (Pune), 2, 457
[9] Dmitrieva, A. N., et al.: 2011, Astropart. Phys. 34, 401
[10] Shutenko, V. V., et al.: 2012, in Proc. 23rd ECRS (Moscow), id 584
[11] Timashkov, D. A., et al.: 2007, in Proc. 30th ICRC (Merida), 1, 685
[12] Timashkov, D. A., et al.: 2008, in Proc. 21st ECRS (Košice), 338

Acta Polytechnica 53(supplement):807–810, 2013

Acta Polytechnica Vol. 50 No. 4/2010

Valve concepts for microfluidic cell handling

M. Grabowski, A. Buchenauer

Abstract. In this paper we present various pneumatically actuated microfluidic valves to enable user-defined fluid management within a microfluidic chip. To identify a feasible valve design, several valve concepts are simulated in ANSYS to investigate the pressure-dependent opening and closing characteristics of each design. The results are verified in a series of tests. Both the microfluidic layer and the pneumatic layer are realized by means of soft-lithographic techniques. In this way, a network of channels is fabricated in photoresist as a molding master. By casting these masters with PDMS (polydimethylsiloxane) we obtain polymeric replicas containing the channel network. After a plasma-enhanced bonding process, the two layers are irreversibly bonded to each other. The bonding is tight for pressures of up to 2 bar. The valves are integrated into a microfluidic cell handling system that is designed to manipulate cells in the presence of a liquid reagent (e.g. PEG, polyethylene glycol, for cell fusion). For this purpose a user-defined fluid management system was developed. The first test series with human cell lines show that the microfluidic chip is suitable for accumulating cells within a reaction chamber, where they can be flushed by a liquid medium.

Keywords: microfluidic chip, valve, cell handling, simulation, pneumatic actuation, polydimethylsiloxane.

1 Introduction

The integration of new characteristics into a cellular system is an essential challenge in modern biotechnology. Methods which are suitable for fusing different cell types to generate hybrid cells showing hybrid characteristics of the two original cells are of central interest in this area. The hybridoma technology, for example, generates hybrid cell lines by fusing antibody-producing B-cells with a cancerous myeloma cell line. As a result of this fusion, the generated hybridomas produce monoclonal antibodies. Cells can be fused, for example, chemically in the presence of polyethylene glycol (PEG), electrically by the application of pulsed electric fields (electrofusion), or by means of a focused laser beam (laser-induced fusion). Fusion is traditionally performed in a cell bulk of two mixed cell lines. The drawback of this traditional type of cell fusion within a bulk is the statistical combination of the cells: equal concentrations in both cell suspensions will lead to 50 % heterokaryon formation (fusion of two different cell types) and 50 % monokaryon formation (fusion of cells of the same type). User-controlled and observable cell fusion is not realizable with this method. In recent years, the use of microfluidic systems has become an eligible method for applications like biochemical assays, medical diagnostics, drug delivery, cell sorting and cell manipulation, as such systems enable the transportation, isolation and manipulation of small amounts of liquids and cells, or even single cells. A microfluidic system of this kind enables the user to select certain cells to be manipulated or investigated.
Polydimethylsiloxane (PDMS) is a well-suited material for these applications, as microfluidic systems can be developed very cheaply and rapidly by methods of soft lithography [1, 2].

Fig. 1: Common geometrical parameters of all valve types

2 Materials and methods

2.1 Valve design

Various valve concepts were designed and fabricated in PDMS using soft-lithographic techniques to find a feasible geometry. The geometry of the pneumatic layer is the same in all investigated valve types, as are the width and height of the fluidic channel. The designs differ only in the form of the fluidic channel (rectangular or rounded) and the order of the two functional layers (Fig. 2, 3, 4, 5). Figure 1 shows the general geometrical parameters of all investigated valve types. The pneumatic chamber has a diameter of 1 mm, and the height of the chamber is 70 μm. A change in height has no influence on the characteristics of the valve, as long as it is high enough to make upward bending of the membrane possible. The cross-section of the fluidic channel is 200 μm × 40 μm. The membrane is 50 μm in thickness.

The first valve geometry (valve type 1) was built up with an interrupted microfluidic channel and a pneumatic chamber above the interruption (Fig. 2), similar to [3, 4]. The bar is 30 μm in width. This valve was at first operated passively, i.e. no pressure (positive or negative) was applied to the pneumatic layer. With no pressure difference between the pneumatic layer and the fluidic layer, the bar blocks the flow and the valve remains closed; valve type 1 is therefore intrinsically closed. When pressure is applied to the filled microfluidic channel, the membrane around the interruption lifts up and the fluid is able to pass the valve. This is the open state of the valve. To optimize this concept, the valve was operated actively by applying a positive or negative pressure to the pneumatic chamber. The second valve geometry (valve type 2) was designed without the interruption in the fluidic channel, using a microfluidic channel with a simple rectangular cross-section (Fig. 3).

Fig. 2: Valve type 1. Cross-section of the valve area with interrupted fluidic channel. 1: pneumatic layer; 2: fluidic layer; 3: substrate

Fig. 3: Valve type 2. Cross-section of the valve area with rectangular channel geometry. 1: pneumatic layer; 2: fluidic layer; 3: substrate

This valve type can only work when actively actuated, because it is intrinsically open. To avoid the disadvantages of valve type 2, which are explained in detail in the results, we designed a subtype of this geometry. This optimized valve type 3 is characterized by a fluidic channel with a rounded cross-section (Fig. 4). The rounded geometry of the fluidic channel is realized by heating the photoresist master to 140 °C for 5 minutes, similar to [5]. For further optimization based on the ANSYS simulations, we designed a valve type 4 similar to type 3. It is also characterized by a rounded fluidic cross-section; the significant difference is the reverse order of the PDMS layers (Fig. 5), with the pneumatic layer now below the fluidic channel.

Fig. 4: Valve type 3. Cross-section of the valve area with round channel geometry. 1: pneumatic layer; 2: fluidic layer; 3: substrate

Fig. 5: Valve type 4. Cross-section of the valve area with round channel geometry. 1: pneumatic layer; 2: fluidic layer; 3: substrate
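To get a feeling for how compliant the membrane defined by these dimensions is, one can apply the small-deflection formula for a clamped circular plate under uniform pressure. This is only a rough order-of-magnitude check, not part of the authors' ANSYS work: the material values E ≈ 2 MPa and ν ≈ 0.49 are typical literature numbers for cured Sylgard 184, not values given in the paper, and linear plate theory is not strictly valid at these deflections.

```python
# Rough estimate (not from the paper) of the center deflection of the
# 50 um PDMS membrane over the 1 mm pneumatic chamber, using
# w0 = p*a^4 / (64*D), with flexural rigidity D = E*t^3 / (12*(1 - nu^2)).
# E and nu are assumed typical values for cured Sylgard 184.

def center_deflection(p_pa: float, a_m: float, t_m: float,
                      e_pa: float = 2.0e6, nu: float = 0.49) -> float:
    d = e_pa * t_m**3 / (12.0 * (1.0 - nu**2))   # flexural rigidity
    return p_pa * a_m**4 / (64.0 * d)            # linear-theory deflection

w0 = center_deflection(p_pa=0.75e5, a_m=0.5e-3, t_m=50e-6)  # 0.75 bar
print(f"linear-theory center deflection: {w0 * 1e6:.0f} um")
# ~2700 um, far beyond both the 40 um channel depth and the validity of
# small-deflection theory; the point is only that the membrane is easily
# soft enough to close the channel at sub-bar pressures, consistent with
# the ANSYS results discussed in the results section.
```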
2.2 Fabrication

The fabrication of the microfluidic chip starts with the design of a microfluidic and pneumatic channel system using CAD software. This layout is then printed at 25 kdpi onto a transparency, which acts as a photomask in photolithography to structure AZ nLOF/AZ 9260 photoresist (AZ Electronic Materials) on a silicon wafer. By casting liquid polymer onto the master wafer, the channel network is transferred to the PDMS device. The PDMS (Sylgard 184, Dow Corning) is a two-component system consisting of a base and a curing agent, which are mixed in a ratio of 10:1 (base:curing agent). Subsequently, the liquid pre-polymer is poured onto the master and is cured in a convection oven at 80 °C for 20 minutes. Afterwards, the PDMS replica can be peeled off from the wafer. The channel system can be made accessible by punching holes into the PDMS slide using a hollow needle. The cast PDMS layers can be bonded in two ways. Bringing the polymer substrates into tight contact without a pretreatment leads to reversible bonding of the two slabs due to van der Waals forces [1, 2]. Because this kind of bonding does not withstand elevated pressures (> 0.4 bar) in the channel network, we prefer plasma-enhanced bonding of the substrates. If the PDMS devices are exposed to an O2 plasma, covalent Si–O–Si bonds are generated, leading to irreversible, tight bonding up to pressures of about 2 bar [2, 6, 7]. The plasma pretreatment is performed in a plasma asher with 275 W at 13.56 MHz, 420 sccm O2, 2 mbar, for 12 s (Tegal Plasmaline 415). The device is finally bonded to another PDMS layer or to a glass substrate to seal the remaining open channels, utilizing the same plasma process (Fig. 2, 3, 4, 5).

2.3 Microfluidic chip

The valves are integrated into a microfluidic chip. The actual chip layout consists of three fluidic inlet channels (e.g. cell solution 1, cell solution 2, liquid reagent), one outlet channel, and two backflush channels with an inlet and an outlet for both (Fig. 6). Each channel is equipped with a pneumatically actuated valve to open or close the fluid channel (Fig. 7).

Fig. 6: Microfluidic network with inlet and outlet channels, backflush channels and pneumatic valves

Fig. 7: Pneumatically actuated valves over fluid channels

Two driving mechanisms were investigated as the actuation principle for the cell solution. An actuation mechanism utilizing a peristaltic pump was not feasible, due to the low survival rate of the cells caused by the mechanical components of the peristaltic pump. Because of these restrictions, a software-controlled pneumatic driving mechanism was developed which actuates the pneumatic layer as well as the fluidic layer. The whole chip is therefore integrated into an adapter that makes all fluidic and pneumatic channels accessible. A centralized valve group supplied with pressurized air is connected to the adapter. The valve group itself is computer-controllable. By using a LabVIEW-based interface, user-defined fluid management is possible (Fig. 8). The reaction chamber itself consists of a beaked structure including a centered passage 5 μm in height (Fig. 9). This design is suitable for accumulating cells in the middle of the chamber and bringing them into tight contact. At the same time, it enables a gentle fluid stream, which is necessary for transporting more cell solution or another liquid medium into the chamber.

Fig. 8: Experimental setup with a microfluidic chip mounted on an adapter. The adapter is pneumatically connected via a centralized and software-controlled valve group

Fig. 9: Reaction chamber. Cells can be accumulated in the central structure

A typical cell accumulation process consists of the following steps (a minimal scripted version of this sequence is sketched after the list):

1. Open the valve of the exit channel.
2. Open the valve of inlet 1 and pump cell solution 1.
3. Close the valve of inlet 1.
4. Open the valve of inlet 2 and pump cell solution 2.
5. Close the valve of inlet 2.
6. Open the valve of inlet 3 to flush the accumulated cells with reagent.
7. Close the valves of inlet 3 and the exit channel.
8. Open the valves of one backflush channel and pump the cells out of the chip.
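The following sketch shows how the eight-step sequence above could be scripted against a computer-controllable valve group. It is not the authors' LabVIEW code: the `ValveGroup` class, its method names and the channel names are hypothetical placeholders for whatever driver the pressurized-air manifold actually exposes.

```python
# Hypothetical scripted version of the cell accumulation sequence above.
import time

class ValveGroup:
    """Stand-in for the driver of the centralized pneumatic valve group."""
    def open(self, name: str) -> None: print(f"open  {name}")
    def close(self, name: str) -> None: print(f"close {name}")

def accumulate_cells(v: ValveGroup, pump_s: float = 5.0) -> None:
    v.open("exit")                          # step 1
    for inlet in ("inlet1", "inlet2"):      # steps 2-5: load both cell lines
        v.open(inlet)
        time.sleep(pump_s)                  # pneumatic pumping interval
        v.close(inlet)
    v.open("inlet3")                        # step 6: flush with reagent
    time.sleep(pump_s)
    v.close("inlet3")
    v.close("exit")                         # step 7
    v.open("backflush_in")                  # step 8: recover the cells
    v.open("backflush_out")

accumulate_cells(ValveGroup())
```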
2.4 Cell lines

In order to analyze the suitability of the microfluidic chip for cell handling, permanent cell lines were identified which are comparable in size and handling to the final target cells to be merged within the chip. The human myeloid cell line U937 and the human lymphoid cell line L540 were found to meet these requirements. This avoids laborious preparation of spleen cells from mice and animal-consuming procedures for the preliminary experiments. Passages through microfluidic structures may affect cell functions, e.g. by mechanical stress or adherence-dependent activation. The viability of the cells after passage through the microfluidics was therefore tested by viability and proliferation assays.

3 Results

3.1 Valve simulations

All valve concepts were simulated using ANSYS to optimize the geometries. Valve type 1 was at first operated only passively. This means that the pneumatic chamber is not pressurized, and the membrane just bends up and down depending on the pressure in the fluidic channel. This concept is optimized by applying a positive or negative pressure to the pneumatic layer. With increasing pressure in the pneumatic chamber in the closed state, the bar can be pressed down even more strongly, leading to effective sealing of the fluidic channel that blocks the flow. In the opened state, a relative negative pressure of 0.2 bar in the pneumatic layer results in upward bending of the membrane and the bar, enabling continuous flow. Because the bar is linked at the sides of the channel, increasing the negative pressure does not improve the valve opening significantly. At about 15 μm in the central region of the valve, the opened cross-section of this valve type is clearly smaller than the channel itself, as can be seen in the ANSYS simulations (Fig. 10, 11).

Fig. 10: ANSYS simulation of valve type 1 in the opened state when actuated with a negative pressure of 0.2 bar in the pneumatic layer

Fig. 11: ANSYS simulation of valve type 1 in the opened state when actuated with a negative pressure of 0.2 bar in the pneumatic layer. Perspective from inside the channel

Figures 12, 13 and 14 show the simulation results for valve type 2. When the pneumatic layer is not pressurized, continuous flow is possible in the fluidic channel and the valve is intrinsically open. In this state, the opening cross-section of the valve is the same as the channel cross-section itself, allowing undisturbed flow. If pressure is applied to the pneumatic layer, the membrane bends downwards and blocks the channel, so that the valve is in the closed state. The ANSYS simulations show insufficient closure of valve type 2 (Fig. 13, 14).

Fig. 12: ANSYS simulation of valve type 2 in the closed state when actuated with 0.5 bar in the pneumatic layer

Fig. 13: ANSYS simulation of valve type 2 in the closed state when actuated with 0.5 bar in the pneumatic layer. Perspective from inside the channel

Fig. 14: ANSYS simulation of valve type 2 in the closed state when actuated with 1.0 bar in the pneumatic layer. Perspective from inside the channel
Due to the rectangular fluidic channel, the membrane cannot bend entirely down to the bottom of the channel. The simulations show two remaining, approximately triangular areas where flow is still possible; only in the middle of the channel is the closure of the valve complete. Depending on the pressure applied to the pneumatic layer, a greater or smaller amount of fluid is able to pass the valve. Figure 14 shows that an entire stop of the fluid cannot be obtained within an acceptable pressure range (< 1 bar). These imperfect closing characteristics persist even if the pressure is increased up to 2 bar. In the same pressure region, valve type 3 shows better sealing than valve type 2, but no complete flow blockage can be achieved with this modification (Fig. 15). Figure 16 shows the results of the ANSYS simulations for valve type 4. This valve type is characterized by a reverse order of the functional layers, as described above. The reverse order results in a membrane which, when actuated by pressure, tightly adapts to the rounded fluidic channel, resulting in reliable sealing. The ANSYS simulations show effective closure of this valve type for pressures < 1 bar.

Fig. 15: ANSYS simulation of valve type 3 in the closed state when actuated with 0.5 bar in the pneumatic layer. Perspective from inside the channel

Fig. 16: ANSYS simulation of valve type 4 in the closed state when actuated with 0.75 bar in the pneumatic layer

In summary, the simulations show that valve types 2 and 3 are not suitable geometries for reliable fluid management. These results were verified in several experimental test series. Valve types 2 and 3 show blocked flow in the middle of the channel, and with increasing pressure on the pneumatic layer the closed area of the channel also increases. However, a persistent flow rate is observable in these geometries at the sides of the channel, which cannot be blocked despite a pressure increase of up to 2 bar in the pneumatic layer. By contrast, the simulations indicate that valve types 1 and 4 are adequate designs for a well-working valve. Valve type 4 in particular shows excellent properties in the opened state: the cross-section of the valve in this state is the entire cross-section of the fluidic channel, and the flow does not need to pass a constriction causing high shear rates, as in type 1. In the closed state, valve type 4 entirely blocks the flow when a pressure of about 0.75–1 bar is applied to the pneumatic chamber. In the test series, valve type 4 showed one disadvantage concerning the integrity of the valve. Due to the stiffness of PDMS, the membrane of the pneumatic channel must not exceed a thickness of about 50 μm. This is necessary because the membrane is distorted when the valve is actuated and has to fit tightly against the rounded wall of the fluidic channel; an excessively thick membrane limits this distortion and the valve does not entirely block the flow. Due to its low thickness, however, the membrane can be damaged, so careful handling during casting and the whole assembly process is essential. Valve type 1 shows clear advantages concerning the closed state. Its intrinsic closing behavior is already very efficient, and the sealing can be further improved by a small overpressure in the pneumatic chamber. The disadvantage of valve type 1 is its reduced cross-section in the opened state: when actuated with negative pressure, the membrane and the bar of the valve bend up only in the central region.
This produces just a small opening compared to the whole cross-section of the fluidic channel. The test series showed that this is no serious problem when dealing with particle-free solutions. When handling solutions containing particles such as cells, however, this problem causes obstruction of the valve if the particles are too big to pass the opened part of the channel. This makes valve type 1 inapplicable for such applications.

3.2 Biological tolerance

The compatibility of the microfluidic chip with cells was tested by viability and proliferation assays to exclude a negative impact of the microfluidic passage on the cells. No significant difference in viability and proliferating activity was observed. The suitability of the different valve designs for cell handling was tested using the human cell lines U937 and L540. Valve type 1 shows excellent results when dealing with cell-free media such as buffer solutions. Unfortunately, cells are generally damaged when passing the valve, due to the mechanical stress they are exposed to between the lifted bar and the bottom of the fluidic channel. The test series for valve type 4 showed excellent suitability for cell handling applications; no negative impact on the cell survival rate is caused by this type of valve geometry.

3.3 Cell accumulation

In the test series with the human cell lines U937 and L540 we showed that the microfluidic chip is suitable for user-defined cell handling. We are able to collect one or more cells within the reaction chamber (Fig. 17). After collecting the cells, we can flush them with an arbitrary liquid medium to initiate certain reactions (e.g. cell fusion when using PEG as a medium). Afterwards, the cells can be flushed out of the chip for further biological investigations.

Fig. 17: Cell accumulation in the reaction chamber

4 Conclusions

In this paper we have simulated four pneumatically actuated valve concepts, and we propose two of them (valve types 1 and 4) as suitable for microfluidic applications in PDMS substrates. The valve function has been simulated in ANSYS, and the simulation results have been verified in several test series. One valve type (valve type 4) is shown to be very advantageous when dealing with solutions containing particles such as cells. The other valve type (valve type 1) is proposed for microfluidic applications with particle-free solutions. By integrating these valves into microfluidic channels, we designed a pneumatically actuated microfluidic chip for cell handling applications. A software interface allows user-defined fluid management. The system design is suitable for collecting one or more cells within a reaction chamber, where they can be flushed with a liquid medium to initiate biological reactions such as cell fusion.

Acknowledgement

The research described in this paper was supervised by Prof. Dr. W. Mokwa, Institute of Materials in Electrical Engineering, RWTH Aachen University, and by Prof. Dr. Dr. S. Barth, Fraunhofer Institute for Molecular Biology, Aachen, and was supported by the Exploratory Research Space at RWTH Aachen University under grant no. MSE04.

References

[1] Cooper McDonald, J., Whitesides, G. M.: Poly(dimethylsiloxane) as a material for fabricating microfluidic devices. Accounts of Chemical Research, 2002, Vol. 35, No. 7, p. 491–499.
[2] Cooper McDonald, J., et al.: Fabrication of microfluidic systems in poly(dimethylsiloxane). Electrophoresis, 2000, Vol. 21, p. 27–40.
[3] Buchenauer, A., et al.: Microbioreactors with microfluidic control and user-friendly connection to the actuator hardware. J. Micromech. Microeng., 2009, Vol. 19.
[4] Hosokawa, K., Maeda, R.: A pneumatically-actuated three-way microvalve fabricated with polydimethylsiloxane using the membrane transfer technique. J. Micromech. Microeng., 2000, Vol. 10, p. 415–420.
[5] Unger, M. A., et al.: Monolithic microfabricated valves and pumps by multilayer soft lithography. Science, 2000, Vol. 288, p. 113–116.
[6] Morent, R., et al.: Adhesion enhancement by a dielectric barrier discharge of PDMS used for flexible and stretchable electronics. J. Phys. D: Appl. Phys., 2007, Vol. 40, p. 7392–7401.
[7] Bhattacharya, S., et al.: Studies on surface wettability of poly(dimethyl)siloxane (PDMS) and glass under oxygen-plasma treatment and correlation with bond strength. Journal of Microelectromechanical Systems, 2005, Vol. 14, No. 3, p. 590–597.

About the authors

Mirko Grabowski was born in Bochum, Germany, in 1980. He studied electrical engineering at Ruhr-Universität Bochum, focusing on plasma technology. Mirko Grabowski is currently working as a research associate at the Institute of Materials in Electrical Engineering at RWTH Aachen University, Germany, under the supervision of Prof. Dr. W. Mokwa. His work focuses on microfluidic devices for applications in biomedical engineering.

Andreas Buchenauer was born in Stuttgart, Germany, in 1978. He studied mechanical engineering at RWTH Aachen University, focusing on MEMS technology. Andreas Buchenauer is currently working as a research associate at the Institute of Materials in Electrical Engineering at RWTH Aachen University, Germany, under the supervision of Prof. Dr. W. Mokwa. His work focuses on microfluidic devices for applications in biochemical engineering.

Mirko Grabowski, Andreas Buchenauer
E-mail: grabowski@iwe1.rwth-aachen.de, buchenauer@iwe1.rwth-aachen.de
Institute of Materials in Electrical Engineering 1
RWTH Aachen University
Sommerfeldstraße 24, 52074 Aachen, Germany

Acta Polytechnica Vol. 51 No. 5/2011

Performance of beams made of low-cost self-compacting concrete in an aggressive environment

M. A. Safan

Abstract. Self-compacting concrete (SCC) mixes incorporating silica fume, fly ash and dolomite powder were used in casting two groups of beams. The beams in one group were stored in an open environment, while those in the other group were subjected to salt attack and successive wet/drying cycles. The beams were stored for about one year under a sustained load. The structural performance of the stored beams was evaluated by testing the specimens under four-point loading until failure. The results indicated that the low-cost SCC mixes showed comparable structural behavior with respect to the corresponding control mixes in a normal environment. Different SCC mixes in a corrosive environment yielded a different structural performance, depending on the composition of the fillers.

Keywords: corrosion, self-compacting concrete, silica fume, fly ash, dolomite powder, harsh.

1 Introduction

Steel reinforcement is used in concrete members to resist tensile stresses and to provide the concrete structure with the required structural integrity. Unfortunately, steel reinforcement has a natural tendency to corrode, returning to its stable state as an iron ore [1]. Steel corrosion is considered the leading cause of deterioration in concrete structures.
Corrosion in progress results in rust occupying a greater volume and thus exerting stress on the surrounding concrete. Estimates of the expansive stress exerted by rust have been reported to vary from 32 to 500 MPa [2]. Such a substantial stress causes concrete to crack, delaminate and finally spall. Loss of the concrete-steel bond and reduction of the effective reinforcement area put the integrity and safety of the structure seriously into question.

Corrosion is an electrochemical process involving the flow of electrons and soluble metal cations (Fe2+) that migrate through the concrete pore water to combine with hydroxyl ions (OH−) to form iron hydroxide (Fe(OH)2), or rust. The amount and rate of corrosion depend largely on the solubility of the metal cations, which is influenced by the temperature, by the pH of the surrounding medium, and by the humidity of the concrete. Sound concrete has a pH of 12 to 13 [3]. This high alkalinity results in the formation of a tight film of iron oxide, which serves as passive protection against corrosion. The passive film reduces the corrosion rate to an insignificant level, typically 0.2 μm per year; this rate is increased by up to 1000 times if the passive layer is destroyed [1]. The passive protection is subject to destruction due to the penetration of chloride ions when concrete serves in a salt-rich environment. Dissolved chlorides can permeate slowly through sound concrete, or can reach the steel more rapidly through cracks. As expected, the risk of corrosion increases as the chloride content increases. According to ACI 318 [4], the maximum content of water-soluble chloride from the constituting materials is 0.15 % by weight of the concrete for reinforced concrete exposed to chloride in service, and 0.3 % by weight of the concrete for protected concrete. It is interesting to note that chlorides are directly responsible for the initiation of corrosion, but do not influence the future corrosion rate.

The natural protection of concrete against corrosion is affected by carbonation, which reduces the pH of concrete due to the reaction of carbon dioxide from the air with calcium hydroxide. The pH can be as low as 8.5 due to this reaction, and the passive layer becomes unstable [5]. In addition, carbonation allows for a much smaller chloride corrosion threshold: it is reported that 7000 to 8000 ppm of chlorides are required to initiate corrosion when the concrete pH is 12 to 13, while only a 100 ppm concentration is needed when the pH is 10 to 12 [6]. However, carbonation reactions are slow, and carbonation proceeds at a rate of up to 1.0 mm per year in high-quality concrete characterized by low permeability, a high cement content and a low water/cement ratio. The carbonation rate depends on the relative humidity of the concrete; the highest rates occur in concretes with 50–75 percent relative humidity, so that the carbonation rates are insignificant in dry as well as in water-saturated concrete [7].

The above literature concerning corrosion mechanisms, environments and triggers clearly demonstrates that concrete quality and design practices may be considered the first defense against corrosion. First, quality concretes are produced with well-proportioned quality materials, and they typically have low water/cement ratios, are well compacted and are well cured. These practices reduce the permeability and porosity of the concrete and thus slow down the penetration of chloride ions and carbonation.
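To put the carbonation rates quoted above into perspective, a back-of-the-envelope estimate can illustrate how strongly the concrete cover delays depassivation. This is only an illustration, not from the paper: it uses the common parabolic-front engineering model (depth = k·√t) rather than the linear rate quoted above, and the coefficient values and cover depths are assumptions (k ≈ 1 mm/√year reproduces the ~1 mm in the first year quoted for high-quality concrete).

```python
# Illustrative estimate (not from the paper) of the time for a carbonation
# front to reach the reinforcement, using the common parabolic model
# depth = k * sqrt(t), i.e. t = (depth / k)**2.

def years_to_depassivation(cover_mm: float, k_mm_per_sqrt_year: float) -> float:
    return (cover_mm / k_mm_per_sqrt_year) ** 2

for k in (1.0, 5.0):                       # k ~ 1: high-quality; k ~ 5: assumed poorer concrete
    for cover in (15.0, 25.0, 40.0):       # assumed typical cover depths, mm
        t = years_to_depassivation(cover, k)
        print(f"k = {k} mm/sqrt(yr), cover = {cover:.0f} mm -> ~{t:.0f} years")
```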
The water/cement ratios can be lowered simply by increasing the cement content, by using water-reducing admixtures and by using fly ashes. Further, the use of micro-silica can help to produce almost impermeable concrete [8]. Second, design practices for concrete structures specify the amount of steel reinforcement that will keep the cracks tight. ACI 318 [4] specifies a maximum crack width of 0.33 mm for exterior exposure under service loads, and 0.4 mm for interior exposure under service loads. Another important factor is to specify minimum concrete covers depending on the exposure conditions, in order to delay the onset of corrosion by extending the time required for carbonation and penetrating chlorides to reach the steel reinforcement [9].

It is interesting to find that self-compacting concrete (SCC) was initially named high-performance concrete, a term that was more widely used worldwide to describe durable concrete with a low water/cement ratio [10]. This new type of concrete, proposed by Okamura in 1986, was intended to be self-compactable in the fresh state, free from initial defects at an early age, and durable after hardening [11]. The intrinsic properties of SCC involve a high deformability of the mortar and resistance to segregation when the concrete flows through confining steel rebars. According to Okamura and Ozawa [12], these properties can be achieved by utilizing a limited aggregate content, a lower water/powder ratio, and superplasticizers. SCC mixes typically incorporate high fractions of cement and fine materials such as silica fume (SF), fly ash (FA), granulated blast furnace slag (GBFS), limestone powder, etc. Combining finely divided powders, admixtures and Portland cement can enhance the behavior of SCC in terms of filling ability, passing ability and stability. The main target is to enhance the grain-size distribution and particle packing, thus ensuring greater cohesiveness [13].

Silica fume is a by-product resulting from the operation of the electric arc furnaces used to reduce high-purity quartz to produce silicon and ferrosilicon alloys. Silica fume consists of extremely fine spherical particles of amorphous silicon dioxide with an average particle diameter that is about 1/100 of the average particle diameter of cement. Due to its extreme fineness, silica fume has a pronounced effect on the fresh and hardened properties of concrete, through physical effects and pozzolanic reactions. The physical effects include reduced bleeding and greater cohesiveness, which directly influence the properties of the hardened concrete [8]. According to Mindess [14], silica fume increases the strength of hardened concrete by increasing the strength of the transition zone between the cement paste and the coarse aggregates. Through pozzolanic reactions, silica fume reacts with the calcium hydroxide (CH) resulting from the cement hydration process, producing calcium silicate hydrate (CSH). Relatively more CH forms in the transition zone than throughout the paste. CH crystals tend to decrease the strength of cement materials, as they are normally large, strongly oriented parallel to the aggregate surface, and weaker than CSH. Concern has been raised regarding a reduction in the pH of the pore fluid by the consumption of CH, and the effect of any such reduction on the passivation of the reinforcing steel. At the levels of silica fume dosage typically used in concrete, the reduction of pH is not large enough to be of concern.
For corrosion protection purposes, the increased electrical resistivity and the reduced diffusivity to chloride ions are believed to be more significant than any reduction in the pore solution pH [8].

Fly ash is a by-product of coal combustion in electric power plants. Depending on the chemical composition, fly ashes are classified as class F (low-lime ashes with pozzolanic properties, normally produced from anthracite or bituminous coal) and class C (high-lime ashes with pozzolanic and cementitious properties, normally produced from lignite or subbituminous coal). According to ASTM C618 [15], the sum of SiO2, Al2O3 and Fe2O3 should be greater than 70 % to classify the ash as type F, and greater than 50 % to classify it as type C. Fly ash is used to improve the workability of fresh concrete, reduce the temperature rise during initial hydration, improve the resistance to sulfates, reduce the expansion due to alkali-silica reaction, and increase both the strength and the durability of hardened concrete. Fly ashes suitable for use in concrete should be fine enough that no more than 34 % of the particles are retained on a 45 μm (No. 325) sieve, in accordance with the ASTM C618 requirements. The properties of fresh concrete containing fly ash are improved in terms of reduced bleeding, as a greater surface area of solid particles is provided at a lower water content for a specified workability. Unlike silica fume, the pozzolanic reaction of fly ash continues over several years if the concrete is kept moist; thus, concrete containing fly ash with a lower early strength would be expected to have equivalent or higher strength at 28 days and at later ages [16].

Dolomite powder was used as a filler in the current work; however, limestone powders are most frequently used in the SCC mixes reported in the literature. It is expected that the powders obtained when cutting or sieving natural rocks would give similar physical effects, depending on their shape, size and surface texture characteristics. Both limestone and dolomite are carbonate rocks. Pure limestone is mainly composed of calcium oxide and carbon dioxide; when about 20 percent of magnesium oxide is introduced, we obtain high-quality dolomite stone that is stronger and harder than limestone. Ye et al. [17] explored the hydration and the microstructure of cementitious pastes of typical composition for ordinary, high-performance, and self-compacting concrete, and reported that limestone powders do not participate in the chemical reactions during hydration.

By definition and by composition, SCC is expected to provide a protective environment against corrosion, owing to its low water/powder ratio, reduced bleeding and reduced permeability due to the use of extremely fine materials. The corrosion resistance parameters of SCC have been investigated by many authors. Yazici [18] investigated the chloride penetration of SCC mixes incorporating silica fume and high-volume class C fly ash. It was reported that incorporating FA and/or SF was very effective for improving the resistance to chloride penetration: the penetration depth in the control mix with only Portland cement was 19 mm after 90 cycles, while this depth was only 9.5 mm in SCC with 60 % FA and 10 % SF replacement. The potentials of SCC durability and the corrosion parameters, in terms of chloride penetration, oxygen permeability and accelerated carbonation, were addressed by Assié et al.
[19] for both SCC and vibrated concrete (VC). They used CEM II/A-LL 32.5 R and CEM I 52.5 N cements, which are typically used for low-strength and high-strength concrete, respectively. The SCC mixes were proportioned using limestone filler to replace fractions of the coarse aggregate. The results showed that the SCC mixes were more resistant than VC to oxygen permeability, while the chloride penetration and carbonation amounts were similar in both the SCC and VC mixes. Zhu and Bartos [20] investigated the permeation properties of SCC as durability measures, utilizing FA and limestone fillers. It was emphasized that the permeation measures strongly reflect the type of filler. Saylev and Francois [21] investigated the influence of the steel-concrete interface on the corrosion of steel, in terms of the resistance to polarization and the corroded surface area. Interface defects related to gaps caused by bleeding, settlement and segregation of the fresh concrete were more limited in SCC than in VC.

Despite the appealing characteristics and merits of self-compacting concrete, it is still classified as a special type of concrete. The wide spread of this type of concrete is restricted by the need to implement strict quality measures and by the relatively high cost of SCC mixes. A rational mix design should attempt to balance the cost and the structural efficiency of the concrete in terms of durability and strength. It is, however, expected that SCC can become a conventionally used material with a competitive cost, for two reasons. First, concrete developers believe that 21st century concrete practice must be driven by considerations of durability rather than strength, in order to build environmentally sustainable concrete structures [22]. Second, it is possible to invest in establishing ready-mix concrete plants even in small local communities, due to the continuously expanding market, and in this way quality production can be achieved. The cost-effectiveness of SCC was a matter of concern in many recent research works reported in the literature [23–25]. Mix economy is usually achieved by incorporating low-cost by-products that may replace Portland cement as active and non-active fillers. The initial work presented by the author [25] dealt with attempts to produce low-cost SCC by using dolomite stone powder to replace significant amounts of cement, and by limiting the use of chemical admixtures to HRWR admixtures. The results indicated that it was possible to combine SF (10 % by weight of cement) either with FA (up to 40 % by weight of cement) or with dolomite powder (up to 30 % by weight of cement) to obtain SCC with a satisfactory level of compressive strength for structural concrete.

2 Significance of the research

Extensive research work is needed to explore the durability parameters of self-compacting concrete. The properties of hardened SCC can differ greatly, as the mixes usually incorporate different combinations of fines for technical and economic purposes. For this reason, studies addressing the influence of fine powders and combinations of fine powders on strength and durability are needed for the development of SCC mixes. The current work has investigated the corrosion protection provided by different SCC mixes containing different fillers. The test specimens were exposed to a corrosive environment while they were under service loading as simply supported beams.
The beams were subjected to salt solutions and wet/drying cycles for about one year to accelerate corrosion, after which the beams were tested to failure to examine their structural behavior and address the apparent defects due to corrosion.

3 Experimental study

The experimental work included the manufacture of two sets of test beams. The beams in one set were stored for one year in an aggressive medium to accelerate corrosion, while those in the other set were stored protected from extreme exposure. The beams in the two sets were loaded during storage under service loading. Based on the results reported in an initial phase of the research [25], seven mixes were selected to produce SCC based on the compressive strength criterion. The selected mixes incorporated dolomite powder (DP) replacing up to 30 % of the cement weight, along with either silica fume (SF) or fly ash (FA), which replaced 10 % of the cement by weight. The chemical analyses of the fine materials that were used, including cement, silica fume, fly ash and dolomite powder, are reported in Table 1. The constituents of the selected SCC mixes are given in Table 2. In these mixes, the fine-to-coarse aggregate ratio was 1.13, the total content of powders (cement and fillers) was 500 kg/m³, and the HRWR dosage was fixed at 10 kg/m³ (2 % by weight of the powders). The water content was determined by a trial-and-error procedure to obtain a consistent mix with the required fresh rheological properties. Table 3 shows the compressive strength evaluated at 28 and 365 days, and the measured rheological properties.

Table 1: Chemical analysis of fine materials (% by mass)

  material   CEM I 52.5-N   silica fume   fly ash (F)   dolomite powder
  SiO2           21.3           93.2          49.0            0.83
  Fe2O3           2.67           1.58          4.10           0.52
  Al2O3           4.22           0.51         32.3            0.77
  CaO            62.3            0.20          5.33          28.5
  MgO             2.65           0.57          1.56          19.3
  K2O             1.00           0.53          0.54            nd
  Na2O            0.15           0.45          0.28            nd
  SO3             3.20           0.22          0.16            nd
  CO2              nd             nd           0.80           46.8
  L.O.I.          1.80           2.62          1.25           43.2

  nd: not detected

Table 2: Concrete mix proportions (kg/m³) (a)

  mix   cement   sand   dolomite   water   silica fume   fly ash   dolomite powder   w/p
  M1     500     945      840       160        –            –            –           0.32
  M2     450     935      830       165        –            –           50           0.33
  M3     400     925      820       170        –            –          100           0.34
  M4     350     915      810       170       50            –          100           0.34
  M5     300     900      800       175       50            –          150           0.35
  M6     350     915      810       170        –           50          100           0.34
  M7     300     900      800       175        –           50          150           0.35

  (a) superplasticizer dosage in all mixes = 10 kg/m³ (2 % by weight of powders)

Table 3: Rheological and hardened properties of the SCC mixes

  mix   fcy, 28 days (MPa)   fcy, 365 days (MPa)   slump flow (mm)   V-funnel t0 (s)
  M1           33                   36.5                 710               6.2
  M2           32                   34.0                 675               6.0
  M3           28                   31.0                 660               5.3
  M4           33                   36.0                 600               5.3
  M5           31                   33.0                 590               5.0
  M6           29                   32.0                 650               5.0
  M7           28                   31.0                 630               5.2
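As a quick consistency check of Table 2 above, the w/p column is simply the water content divided by the total powder content (cement + silica fume + fly ash + dolomite powder = 500 kg/m³ in all mixes). The short sketch below verifies this for three representative mixes; the dictionary layout is an illustration choice, not from the paper.

```python
# Consistency check (not from the paper) of the w/p column in Table 2.
mixes = {  # mix: (cement, water, silica fume, fly ash, dolomite powder), kg/m3
    "M1": (500, 160, 0, 0, 0),
    "M4": (350, 170, 50, 0, 100),
    "M7": (300, 175, 0, 50, 150),
}
for name, (c, w, sf, fa, dp) in mixes.items():
    print(name, round(w / (c + sf + fa + dp), 2))   # -> 0.32, 0.34, 0.35
```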
theparticle sizedistribution curve, figure 1, shows that 90 percent by weight of the ash passes through a 45-μm sieve. the dolomite powderwasobtained froma local plant for ready-mix asphalt concrete. the production process includes drying the crusheddolomite used as coarse aggregate and sieving the aggregates to separate the different sizes. a small fraction of the powder that passes through sieve no. 50 (300 μm) is used in the mix, while most of the powder is a by-product. the powder had a light brownish color, and a specific gravity of 2.72. sieving six random samples of the powder showed that the average percentage passing through the 45-μm sieve was 63 percent (figure 1). natural siliceous sand having a fineness modulus of 2.63 and a specific gravity of 2.65 was used in the scc mixes. crushed dolomite with a maximum nominal size of 16 mm was used as coarse aggregate. the aggregate had a specific gravity of 2.65 and a crushing modulus of 19 %. the grading of the aggregates that were used is shown in figure 1. a traditional sulfonated naphthalene formaldehyde condensate hrwr admixture conforming to astm c494 (types a and f) was used. the admixture is a brown liquidwith a specific gravityof 1.18. high tensile deformed steel rebars (nominal diameter 10mm) were used for tension reinforcement. the rebars had a yield strength of 553 mpa. mild steel rebars were used for stirrups with a nominal diameter of 8 mm and yield strength of 380 mpa. 5 test specimens preparation: tight steel forms were used to cast fourteen test specimens with the dimensions and re124 acta polytechnica vol. 51 no. 5/2011 inforcement arrangement shown in figure 2. each concrete mix was used to cast two beams. the amount of tension and web reinforcement was adequate to ensure ductile failure under ultimate load. six 150 × 300 mm cylinders were cast to determine the compressive strength for each mix after 28 days and after one year of exposure. the test beams and the cylinder specimens were cured under a wet cloth for 7 days, after which they were left to dry in the laboratory atmosphere. after 28 days, the test beams were loaded to a specified level as if the beamswere under service flexure loading. each twobeams cast from the samemix were laid horizontal and parallel to each other with the tension side out. the two beams were tied by means of a 12-mm welded steel stirrup 50 mm away from the beam ends. a 100 kn hydraulic jack was used to apply a concentrated load atmid span, while the endties counteractedtheacting load. onceapredetermined load level had been attained, two pieces ofwood (50×50mmcross section and100mmapart) were used tomaintain the deformed shape of the two opposite beams and the hydraulic jack was released. the loading sequence is illustrated in figure 3. fig. 3: test beams tied and loaded at mid-span by means of a hydraulic jack: (a) loading configuration, (b) wooden struts keeping the deformed shape after releasing the hydraulic jack the load applied by the hydraulic jack in this stage (ps) was the load causing the test beams to crack so that the length of the cracks would not exceed two thirds of the beam depth. 
This load was found to be 20 kN, and this value was about 47 % of the nominal failure load (Pn) estimated using the ACI 318 code [4] design equation

\[ M_n = \rho f_y \left(1 - 0.59\,\frac{\rho f_y}{f_{cy}}\right) b\,d^2, \]

in which Mn is the nominal moment (Mn = Pn l/4, where l is the clear span = 900 mm), ρ is the reinforcement ratio (ρ = 0.012), fy is the yield strength of the tension reinforcement (fy = 553 MPa), fcy is the 28-day cylinder compressive strength, b is the width of the beam cross-section (b = 100 mm) and d is the effective depth (d = 127 mm).
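The quoted service-load ratio can be checked numerically from the stated parameter values. The sketch below is only such a check, evaluating the design equation above for mix M1 (fcy = 33 MPa at 28 days, from Table 3); the variable names are illustration choices.

```python
# Numerical check (not from the paper) of the ~47 % service-load level,
# using the ACI 318 nominal-moment expression with the stated values.
rho, fy, fcy = 0.012, 553.0, 33.0          # -, MPa, MPa (mix M1, 28 days)
b, d, span = 100.0, 127.0, 900.0           # mm

mn = rho * fy * (1.0 - 0.59 * rho * fy / fcy) * b * d**2   # N*mm
pn = 4.0 * mn / span                        # from the paper's Mn = Pn*l/4
print(f"Mn = {mn / 1e6:.2f} kN*m, Pn = {pn / 1e3:.1f} kN")
print(f"Ps / Pn = {20e3 / pn:.2f}")         # -> ~0.48, i.e. about 47 %
```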
table 5 reports the ultimate loads recorded as well as the deflections (δy, δu) at the yield and ultimate loads. the corresponding values for the ductility index (δu/δy) as a measure of ductility were also computed. the corrosion resistance behavior of a givenmixmi (used in casting the twobeamsbi andbic)was evaluatedon the basis of the following criteria after performing the four-point load test: 1. comparing the structural performance in terms of stiffness, ultimate loads, ductility and cracking patterns. 2. visual inspectionof themain rebarsafter removing the concrete cover. 126 acta polytechnica vol. 51 no. 5/2011 fig. 6: load versus mid-span deflection for test beams knowing that all test beamswerepre-cracked,figure 6 shows that the load-deflectioncurves consisted of only two parts: a linear relation up to yield and a yield plateau. the post-cracking stiffness of all test beams subjected to corrosion was smaller than the post-cracking stiffness of the corresponding control beams. a significant reduction in stiffness can be observed in beams b3c (20 % dp) and b7c (30 % dp+10 % fa) compared to the stiffness of beams b3 and b7, respectively. it can be noted that the 127 acta polytechnica vol. 51 no. 5/2011 table 5: ductility index and ultimate loads recorded for test beams control beams beams in a corrosive environment beam ultimate load, deflection, mm ductility beam ultimate load, deflection, mm ductility kn δy b δu b index kn δy b δu b index b1 59.6 (1.00)a 2.66 18.58 7.0 b1c 55.0 (0.92) 3.59 10.38 2.9 b2 53.1 (0.89) 3.14 9.98 3.2 b2c 50.4 (0.85) 3.21 7.12 2.2 b3 52.0 (0.87) 3.17 7.17 2.3 b3c 48.7 (0.82) 3.55 7.05 2.0 b4 54.8 (0.92) 3.00 13.8 4.6 b4c 52.3 (0.89) 2.88 9.70 3.4 b5 52.2 (0.88) 2.95 15.00 5.1 b5c 49.1 (0.82) 3.03 8.96 3.0 b6 53.0 (0.89) 2.75 17.00 6.2 b6c 51.7 (0.87) 3.94 8.54 2.2 b7 50.8 (0.85) 2.77 15.05 5.4 b7c 49.3 (0.83) 3.86 8.9 2.3 a ( ) : ultimate load as a fraction of the ultimate load of the control beam (b1) b δy,δu: mid-span deflection at yield and ultimate load, respectively (a) beams in corrosive environment (b) control beams fig. 7: cracking patterns for test beams under sustained load (continuous lines) and at ultimate loads (dashed lines) stiffness of beams b3 and b7 is very close to that of the control beam b1. on the other hand, the stiffness reduction in beam b4c (20 % dp+10 % sf) andb5c (30%dp+10%sf) compared to the stiffness in beamsb4 andb5 is insignificant. the results summarized in table 5 show that the ultimate loads recorded for beams (b2–b7)were 81 to 92 percent of the ultimate loadof control beamb1. this reduction in the ultimate load was consistent with the change in the compressive strength of the scc mixes that were used. therecordedultimateloadsforbeams(b1c–b7c) werevery close to the correspondingultimate loadsof beams (b1–b7), and were only 5 % less in average. these results indicated that no serious loss in the area of the main reinforcement occurred under the prescribed exposure conditions. on the other hand, the results suggested that the corrosion affected only the bond characteristics between the steel rebars and concrete. it is well known that a steel-concrete bond is the sum of two components due to adhesion and the interlocking action provided by the ribs. it is here suggested that the bond reduction is mainly related to reduced adhesion between the concrete and the steel rebars, while the ultimate bond strength was adequate so that the test beams (b1c–b7c) developed the reported ultimate loads. 
the suggestion of the lack of adhesion bond in the corroded beams is consistent with the following observations: 1. a reduction in post-cracking stiffness. 2. a comparison of the cracking patterns in figure7 showsthatnewcracksdeveloped inbeams 128 acta polytechnica vol. 51 no. 5/2011 (b1–b7) during four-point loading, while this was not the case for the corroded beams. however, beamsb4c (20%dp+10%sf) andb5c (30 % dp+10 % sf) were exceptions. examining the cracking patterns at failure, figure 7, shows that all test beams failed due to concrete crushing in themaximummoment region. however, beams b3c (20 % dp) failed in shear due to the formation of a diagonal tension crack. this premature failure may be related to the lack of bond between the main reinforcement and concrete. the computed ductility index reported in table 5 shows that control beam b1 had the highest ductility measure. the ductility index was considerably lower in beams b2 and b3, while a better ductility measure was achieved by the rest of the beams (b4–b7) containing 10%sf or fa. the ductility index in beams (b1c–b7c) ranged from 35 % to 87 % of the corresponding values recorded for the corresponding control beams (b1–b7), indicating limited ductility due to corrosion. finally, a visual inspection of the main reinforcement in the corroded beams (b1c–b7c) after conducting the four-point test andremoving the concrete cover confirmed the existence of a corroded surface layer inallbeams. thecorrodedsurfacedidnot cover the whole length of the bar. the corroded portions covered about 70–100 mm around the points of intersection of the rebar with the pre-cracks developed under the sustained load. small and shallow pitting was observed along the main rebars in the corroded beams. 7 conclusions this research included experimental testing of reinforced concrete beams cast with different scc mixes incorporating combinationsof silica fume, flyashand dolomite powder as finematerials replacingportland cement. all test beams were stored under sustained load that caused the beams to crack to simulate actual exposure conditions. half of the beams were exposed to a harsh environment, while the other half were similar control beams. after one year of exposure, the corrosion resistance provided by the different scc mixes was evaluated on the basis of the structural performance of the test beams in flexure compared to the behavior of the control beams. the proposed aggressive environment helped to initiate corrosion along the main rebars. no severe reduction in the area of the main reinforcement was detected, and only small and shallow pitting was observed. based on the available test results, the following conclusions can be drawn: 1. independent of the scc mix composition, the structural performance of the tested control beams was quite similar in terms of postcracking stiffness and mode of failure. the variation in the ultimate loads was consistent with the variation in the compressive strength of the scc mixes that were used. 2. the ductility index of the control beams was considerably lower than that of the control beam containing only portland cement. the addition of either silica fume or fly ash effectively increased the ductility index. 3. the post cracking stiffness of the corroded beams was significantly less in the beam containing 20 % dolomite powder as a cement replacement. the use of 10 % silica fume was effective in increasing the post-cracking stiffness, even when the dolomite powder replacement increased to 30 percent. 4. 
4. All corroded beams showed a reduction in the ductility index compared to the corresponding control beams. The use of silica fume yielded a relatively higher ductility index.
5. The use of silica fume was found to be more effective than fly ash in improving the structural performance, in terms of ductility and post-cracking stiffness, of corrosion-exposed SCC beams containing up to 30 percent dolomite powder replacing Portland cement.
Mohamad A. Safan
Phone: 0020 104 919 623, Fax: 0020 482 328 405
E-mail: msafan2000@yahoo.com
Department of Civil Engineering, Engineering Faculty, Menoufia University, Shebeen El-Koom, Menoufia, Egypt

Acta Polytechnica 53(Supplement):776–781, 2013
doi:10.14311/AP.2013.53.0776
© Czech Technical University in Prague, 2013

Neutrinos from ICARUS

Christian Farnese*, for the ICARUS Collaboration
Dipartimento di Fisica e INFN, Università di Padova, Via Marzolo 8, I-35131 Padova, Italy
* Corresponding author: christian.farnese@pd.infn.it

Abstract. Liquid argon time projection chambers are very promising detectors for neutrino and astroparticle physics due to their high granularity, good energy resolution and 3D imaging, allowing for a precise event reconstruction. ICARUS T600 is the largest liquid argon (LAr) TPC detector ever built (∼ 600 ton LAr mass) and is presently operating underground at the LNGS laboratory. This detector, internationally considered as the milestone towards the realization of the next generation of massive detectors (∼ tens of ktons) for neutrino and rare-event physics, has been running smoothly since summer 2010, collecting data with the CNGS beam and with cosmics. The status of this detector will be described briefly, together with the intent to adopt the LAr TPC technology at CERN as a possible solution to the sterile neutrino puzzle.

Keywords: liquid argon, TPC, neutrino, CNGS, CERN, LNGS, sterile neutrino.

1. The ICARUS T600 detector

1.1. Detection technique

The liquid argon time projection chamber (LAr-TPC), first proposed by C. Rubbia in 1977 [1], is a powerful detection technique that can provide 3D imaging of any ionizing event. In a liquid argon TPC the ionization electrons produced by each ionizing event taking place in highly purified LAr can be transported by a uniform electric field and collected by 3 parallel wire planes, placed at the end of the drift path with their wires oriented along different directions. The passage of charged particles through the LAr also generates a copious prompt ultraviolet scintillation light signal, which can be detected using immersed PMTs.
The LAr-TPC is a continuously sensitive and self-triggering detector, characterized by high granularity and spatial resolution, similar to bubble chambers. The signal collected on the wire planes provides a precise three-dimensional reconstruction of the recorded particle trajectory. Moreover, the LAr-TPC is an excellent calorimeter, and it can provide efficient particle identification based on the density of the energy deposition at the end of the particle range.

1.2. Detector layout

The ICARUS T600 LAr-TPC detector, presently taking data in Hall B of the Gran Sasso National Underground Laboratory (LNGS), is the largest liquid argon TPC ever built: it consists of a large cryostat split into two identical, adjacent modules, with internal dimensions 3.6 × 3.9 × 19.9 m³ each. Both modules house two time projection chambers made of three parallel wire planes, 3 mm apart, the first with horizontal wires and the other two at ±60° from the horizontal direction. The two TPCs are separated by a common cathode, and the distance between the cathode and the wire planes, corresponding to the maximal drift length, is 1.5 m, equivalent to 1 ms drift time for a nominal drift field of 500 V/cm. Finally, in order to collect the scintillation light produced by the events, an array of 74 PMTs (20 in the west module and 54 in the east module) has been located behind the wire planes [2]. The PMT signals are used not only for triggering but also to determine the absolute time of the ionizing event.

1.3. Electron lifetime

The ICARUS detector is equipped with both gas and liquid recirculation systems containing standard Hydrosorb/Oxysorb™ filters, in order to trap and keep at an exceptionally low level the electronegative impurities, especially oxygen and nitrogen. The electron lifetime is continuously and automatically measured in real time from the charge-signal attenuation as a function of the drift time along through-going muon tracks. The free electron lifetime is constantly above 5 ms in both modules (see Fig. 1), corresponding to the remarkable value of 0.06 ppb O2-equivalent impurity concentration and resulting in a 17 % charge attenuation at the maximum 1.5 m drift distance.

Figure 1. Free electron lifetime evolution with time, for both cryostats.

1.4. Trigger system

The main ICARUS T600 trigger system is based on the analog sum of the signals from the PMTs in the same chamber, with a defined photoelectron discrimination threshold for each TPC chamber (100 phe in the west cryostat and 200 phe in the east cryostat). In particular, for CNGS neutrino events, using the "early warning signal" sent from CERN to LNGS 80 ms before each proton extraction, a 60 µs gate is opened at the expected proton extraction time, and the trigger is generated when the PMT sum signal is present in at least one TPC chamber within the gate. Using this source of trigger, about 80 events/day related to the CNGS neutrino beam are recorded (a rate of roughly 1 mHz). For cosmic-ray events, the trigger is provided by the coincidence of the PMT sum signals of the two adjacent chambers in the same module.
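The drift and purity figures quoted in Sections 1.2–1.3 are mutually consistent and easy to check: a 1.5 m drift corresponds to 1 ms at the nominal field, and an exponential charge attenuation exp(−t_drift/τ) with a free-electron lifetime slightly above the 5 ms floor gives the quoted ≈ 17 % loss. A minimal sketch:

```python
import math

drift_time_ms = 1.0   # quoted drift time for 1.5 m at the nominal 500 V/cm field
# Scan lifetimes at and just above the 5 ms floor reported for both modules.
for tau_ms in (5.0, 5.5, 6.0):
    attenuation = 1.0 - math.exp(-drift_time_ms / tau_ms)
    print(f"tau = {tau_ms} ms -> charge attenuation = {attenuation:.1%}")
# tau = 5.0 ms gives ~18 %; the quoted ~17 % corresponds to a lifetime a bit above 5 ms.
```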
Figure 2. Comparison between the differential cosmic-ray rates in the east (top) and west (bottom) modules; in blue the results for the events collected in July 2011, and in pink the results for the events collected in March 2012, after the PMT improvements carried out from December 2011 to March 2012.

In Fig. 2, the differential cosmic-ray rates in 2011 and 2012, as a function of deposited energy, are compared for the west and east modules. Thanks to the recent improvement in the PMT HV biasing scheme and signal treatment, an increase of the rates in the east half-module below about 1 GeV has been obtained: an overall trigger rate of about 35 mHz, corresponding to 125 events/hour, has been achieved on the full T600 (expected: 160 events/hour). A new FPGA called the SuperDaedalus chip, implementing a dedicated hit-finding algorithm, has recently been installed in order to improve the trigger efficiency both for CNGS and for cosmic events at low energies (i.e. below 500 MeV). This new trigger system, exploiting the local charge deposition on the TPC wires, was used in the 2012 run [4].

1.5. Physics potentials and results for the CNGS neutrino runs

LAr-TPCs are very suitable detectors for the study of rare events, such as neutrino oscillation physics and proton decay searches, because of their high spatial granularity (resolution of ∼ 1 mm³ over an overall active volume of 340 m³ for the ICARUS T600) and their good calorimetric response (σE/E ≈ 11 %/√E(GeV)). The ICARUS detection technique offers the possibility to detect "bubble chamber like" neutrino events produced by the CNGS neutrino beam from the CERN SPS to Gran Sasso. The ICARUS-CNGS run started in stable conditions on October 1st 2010, collecting 5.8 × 10¹⁸ protons on target (p.o.t.) out of the 8 × 10¹⁸ delivered by CERN up to November 22nd 2010. During the 2011 run an event statistics corresponding to 4.44 × 10¹⁹ p.o.t. out of the 4.78 × 10¹⁹ p.o.t. delivered by CERN was collected (Fig. 3), with a detector live-time of 93 % for CNGS exploitation: with this p.o.t. statistics, ∼ 1280 νµ CC and ∼ 400 NC events are expected. Examples of a νµ CC and of a νµ NC interaction collected during the 2011 run are shown in Figs. 5 and 6. Starting from the reconstruction of the muon produced in the CNGS νµ interactions, it is possible to verify the uniformity and stability of the detector: the muon tracks collected during a short period of the 2011 CNGS run have been reconstructed in 3D, rejecting δ-rays and showers along the track.

Figure 3. Integrated proton-on-target statistics delivered to CNGS in the 2011 campaign (red). The corresponding integrated CNGS beam intensity (protons on target) collected by the ICARUS T600 detector, taking the detector live-time into account, is also shown (blue).

Figure 4. Comparison between the expected and the reconstructed energy-deposition density distributions for muons in CNGS νµ CC interactions.

The same analysis has been performed also on MC events: the comparison between the dE/dx distributions (Fig. 4) shows very good agreement, both in shape and in average value within 2–3 %, demonstrating a good detector calibration. Thanks to the high resolution and granularity, the information redundancy and the particle identification capability, ICARUS can search for νµ → ντ oscillation in the CNGS neutrino beam, recognizing the τ decay on the basis of kinematical criteria. In addition, thanks to the capability to identify the interacting neutrino flavour and to discriminate the νe CC signal from the NC π⁰ background with very high efficiency, ICARUS can also search for neutrino oscillation in the LSND parameter space.
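As a quick numerical illustration of the quoted calorimetric response σE/E ≈ 11 %/√E(GeV) (a sketch of the scaling only, not an ICARUS parametrization beyond what is quoted above):

```python
# Relative and absolute energy resolution from sigma_E/E = 0.11 / sqrt(E[GeV]).
for e_gev in (0.5, 1.0, 2.0, 5.0):
    rel = 0.11 / e_gev ** 0.5
    print(f"E = {e_gev:>3} GeV: sigma_E/E = {rel:.1%}, sigma_E = {rel * e_gev * 1000:.0f} MeV")
```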
ICARUS can also perform exclusive nucleon decay searches in a background-free mode, thanks to its powerful background rejection: with an exposure of a few years ICARUS can improve the known limits, in particular for some "supersymmetry-favored" nucleon decay channels [3]. Finally, at least 100 CC atmospheric neutrino interactions per year are expected, and these can be fully reconstructed in ICARUS.

Figure 5. Example of a CC muon neutrino interaction collected during the 2011 run: the long muon track coming out from the primary vertex is clearly visible.

Figure 6. Example of an NC muon neutrino interaction collected during the 2011 run.

2. Search for superluminal neutrinos' radiative processes in ICARUS

In September 2011 the OPERA collaboration reported evidence of superluminal neutrino propagation (later disproved by the same collaboration): they observed that CNGS muon neutrinos arrive at Gran Sasso from CERN 60 ns earlier than expected for travel at the speed of light. This result corresponds to δ ≡ (v²ν − v²c)/v²c ≈ 5 × 10⁻⁵ [5]. Cohen and Glashow argued that if neutrinos are superluminal, they should lose energy by producing photons and e⁺e⁻ pairs, through Z⁰-mediated processes similar to Cherenkov radiation [6]. To study in detail the effects produced by superluminal-neutrino Cherenkov radiation, a full 3D simulation of the CNGS neutrino propagation from CERN to Gran Sasso has been performed using FLUKA for different values of the δ parameter, and the expected rates of e⁺e⁻ pair events have been estimated. Assuming the value of the parameter δ observed in the OPERA experiment, a strong deformation of the neutrino energy spectrum and, in particular, the full suppression of neutrino events with an energy greater than 30 GeV are expected. In addition, a very large number of events containing e⁺e⁻ pairs and photons should be found in the detector at LNGS (∼ 10⁷ events/kt/10¹⁹ p.o.t.). In order to verify the presence of these two effects using the ICARUS detector, the events corresponding to a total of 4.9 × 10¹⁸ p.o.t. for the 2010 run and 1.04 × 10¹⁹ p.o.t. for the 2011 run have been studied in detail by visual scanning. The fiducial volume used for this analysis was 447 tons of liquid argon in 2010 and 434 tons in 2011. No events containing e⁺e⁻ pairs have been observed in the ICARUS T600 detector, as shown in Tab. 1. Furthermore, the experimental energy-deposition spectrum of neutrino events agrees with the MC expectation assuming δ = 0. The lack of e⁺e⁻ pairs, combined with the absence of spectrum distortion, allows a limit δ < 2.5 × 10⁻⁸ to be set at 90 % C.L. for multi-GeV neutrinos [7].

3. Neutrino time-of-flight measurement with the 2011 CNGS bunched beam

From October 21st to November 6th 2011 the CERN CNGS neutrino beam was operated in a new, lower-intensity mode that allows a very accurate neutrino time-of-flight measurement on an event-by-event basis. The bunched beam structure was characterized by 4 LHC-like extractions with a narrow width (∼ 3 ns FWHM) separated by 524 ns, with ∼ 10¹² p.o.t. per pulse.
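The quoted δ ≈ 5 × 10⁻⁵ follows from the 60 ns early arrival over the CERN–LNGS baseline; a back-of-envelope check (the ≈ 730 km baseline is the commonly quoted round figure and is an assumption here, not stated in the text):

```python
c = 299_792_458.0        # m/s
L = 730e3                # CERN -> LNGS baseline, ~730 km (assumed round number)
dt = 60e-9               # early arrival reported by OPERA, s

beta_excess = c * dt / L      # (v - c)/c to first order
delta = 2 * beta_excess       # (v^2 - c^2)/c^2 ~ 2*(v - c)/c for a small excess
print(f"(v-c)/c = {beta_excess:.2e}, delta = {delta:.1e}")   # delta ~ 5e-5
```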
Because of a problem related to the "early warning signal", ICARUS started to collect neutrino data with this new beam only from October 31st. Seven events were collected: two νµ CC interactions and one NC event with the primary vertex contained in the ICARUS fiducial volume, and 4 through-going muons generated by neutrino interactions in the upstream rock. For each event, the neutrino time of flight tofν has been calculated as the difference between the event time in the ICARUS detector, t_stop, and the time t_start of the proton transit at the beam current transformer, accounting for the additional time related to the nearest proton bunch. The use of the GPS-based timing system linking CERN to LNGS allows the time measurement with an uncertainty of a few nanoseconds. In order to improve the precision of the neutrino arrival time, the ICARUS PMT system has been equipped with an additional PMT-DAQ system: at each CNGS trigger the PMT-sum signals and the absolute time of the LNGS PPmS signal are stored, allowing the determination of the absolute neutrino arrival time in the detector with few-ns precision. Moreover, for each event the distance from the closest PMT and the position of the interaction vertex along the detector length have been evaluated and used to obtain a more precise arrival-time estimate. Finally, for each event the difference δt between the expected time of flight at the speed of light and the measured neutrino time of flight is calculated. The average value of the parameter δt obtained from the 7 events is δt = +0.3 ns ± 4.9 ns (stat.) ± 9 ns (syst.), compatible with Lorentz invariance (δt ∼ 0) [8].

4. ICARUS after CNGS2: a new approach to sterile neutrinos at CERN/SPS

The recent observations of an electron excess in an anti-νµ beam, made by the LSND and MiniBooNE experiments, and of an apparent disappearance signal in the anti-νe events collected by the reactor neutrino experiments, seem to suggest the presence of invisible "sterile" neutrinos, in addition to the three well-known physical neutrinos νe, νµ and ντ. The ideal device to try to solve all these anomalies at the same time is the LAr-TPC, thanks to its capability to recognize genuine electron-neutrino events combined with a high level of rejection of the associated background events, due for example to NC π⁰ interactions. In addition, the detector granularity and energy resolution provided by LAr TPCs are excellent for neutrino events at low energy (i.e. below 3 GeV). For these reasons a new experimental search, based on two strictly identical LAr-TPC detectors and two magnetic spectrometers located at different distances on-axis in a neutrino beam, is proposed at CERN/SPS after the ICARUS T600 exploitation at the Gran Sasso laboratories. A complete description of this new experiment can be found in the technical proposal [9].

Table 1. Expected and observed neutrino and e⁺e⁻ pair rates for the ICARUS T600 detector [7].

Rates | Observed | Expected (δ = 0) | Expected (δ = 5 × 10⁻⁵)
CC | 308 | 315 ± 5 | 98.1 ± 2
NC | 89 | 93.1 ± 3 | 33 ± 1
νµ CC, Edep > 25 GeV | 25 | 18 ± 1.3 | < 10⁻⁶
e⁺e⁻ pairs | 0 | 0 | 7.4 × 10⁶

Figure 7. Scheme of the new neutrino facility proposed in the CERN North Area [9].

A simple scheme of the neutrino facility located in the CERN North Area is shown in Fig. 7: the ICARUS T600 detector will be moved from Gran Sasso to CERN and located at 1600 m from the proton target (far position). At the same time, a new LAr-TPC detector, identical to the ICARUS T600 but of smaller size (150 t), will be built and located at 330 m from the target (near position). The two NESSiE spectrometers will be placed downstream of the LAr-TPCs in order to greatly complement the physics capabilities of this experiment. The neutrino beam will be produced by a 100 GeV proton beam fast-extracted from the SPS.
The anti-neutrino beam can be produced simply by inverting the current in the horn used to focus the beam. The resulting neutrino CC event energy will be centered around 2 GeV. In the absence of oscillations, the comparison between the neutrino spectra in the near and far positions shows an excellent agreement for the νe events and a fairly good agreement for the νµ events, apart from some small beam-related spatial corrections. Therefore, an exact observed proportionality between the two νe spectra directly implies the absence of neutrino oscillations over the measured interval of L/E, without any need for Monte Carlo corrections: any difference between the near and far observed spectra would be a direct proof of neutrino oscillations, and both the mixing angle and the mass difference can be determined separately. At the same time, the determination of the muon charge by the spectrometers allows the full discrimination between νµ and anti-νµ events. Thanks to the unique features described above and to the properties of the LAr-TPC and of the NESSiE magnetic spectrometers, the proposed experiment can provide a full exploration of the LSND allowed region, and can also study a possible oscillatory disappearance both in the νµ and in the νe initial signal, using both the neutrino and the anti-neutrino beam [9].
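A minimal two-flavour sketch of the near/far comparison logic: any energy-dependent departure of the far/near ratio from a constant would signal oscillations. The Δm² = 1 eV² value echoes the scale in the proposal title [9], but the mixing amplitude below is a purely illustrative placeholder; the 330 m and 1600 m distances are from the text.

```python
import math

def survival(E_GeV, L_km, dm2_eV2=1.0, sin2_2theta=0.01):
    # Two-flavour survival probability; the 1.27 factor carries the unit conversion.
    x = 1.27 * dm2_eV2 * L_km / E_GeV
    return 1.0 - sin2_2theta * math.sin(x) ** 2

for E in (0.5, 1.0, 2.0, 3.0):
    near = survival(E, 0.33)   # near detector, 330 m
    far = survival(E, 1.60)    # far detector, 1600 m
    print(f"E = {E} GeV: far/near ratio = {far / near:.4f}")
# A flat ratio of 1 over all energies would mean no oscillation in this L/E window.
```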
5. Conclusions

The ICARUS T600 detector, installed underground at LNGS, has been taking data with the CNGS neutrino beam since October 2010 in stable conditions, searching both for νµ → ντ and νµ → νe oscillations as well as for atmospheric neutrinos and proton decay events. Its unique imaging capability, its spatial and calorimetric resolution, and its capability to separate the showers produced by π⁰ from those produced by electrons allow events to be reconstructed and identified in a new way with respect to the other current neutrino experiments. The successful start and data taking of the ICARUS T600 detector surely represents a major milestone towards the realization of large-mass LAr-TPC detectors for neutrino and astroparticle physics. Recently, the employment of the LAr-TPC technique, combined with the use of magnetic spectrometers, has been proposed at the CERN SPS, after the ICARUS T600 exploitation at LNGS, in order to definitively solve the sterile neutrino puzzle.

Acknowledgements

I thank first of all the organizing committee for the possibility to present my talk during the Vulcano 2012 workshop. I would also like to thank all the ICARUS collaboration for the contributions in the preparation of the talk and of this proceeding.

References

[1] C. Rubbia: The liquid-argon time projection chamber: a new concept for neutrino detectors, CERN-EP/77-08 (1977).
[2] A. Ankowski et al. [ICARUS Collab.]: Nucl. Instr. and Meth. A556 (2005) 146.
[3] F. Arneodo et al. [ICARUS Collab.]: ICARUS initial physics program, LNGS P28/01, LNGS-EXP 13/89 add. 1/01; Cloning of T600 modules to reach the design sensitive mass, LNGS-EXP 13/89 add. 2/01.
[4] B. Baibussinov et al.: A hardware implementation of region-of-interest selection in LAr-TPC for data reduction and triggering, JINST 5, P12006 (2010).
[5] T. Adam et al. [OPERA Collab.]: arXiv:1109.4897v2, 2011.
[6] A. G. Cohen, S. L. Glashow: Phys. Rev. Lett. 107 (2011) 181803.
[7] M. Antonello et al. [ICARUS Collab.]: A search for the analogue to Cherenkov radiation by high energy neutrinos at superluminal speeds in ICARUS, Physics Letters B 711 (2012) 270–275.
[8] M. Antonello et al. [ICARUS Collab.]: Measurement of the neutrino velocity with the ICARUS detector at the CNGS beam, Physics Letters B 713 (2012) 17–22.
[9] M. Antonello et al. [ICARUS and NESSiE Collab.]: Search for "anomalies" from neutrino and anti-neutrino oscillations at Δm² ≈ 1 eV² with muon spectrometers and large LAr-TPC imaging detectors, CERN-SPSC-2012-010 and SPSC-P-347 (2012).

Discussion

Fernando Ferroni – Can you comment more on what the magnetic spectrometers bring to the measurement of sterile neutrinos?

Christian Farnese – Thanks to the use of magnetic spectrometers in the new proposed experiment at CERN-SPS, the νµ disappearance signal can be studied in a better way. The determination of the muon charge with the spectrometers allows for the full discrimination of the νµ from the anti-νµ signal, increasing the statistics and the sensitivity for this physics item. Finally, combining the muon reconstruction based on multiple scattering in the LAr-TPCs with the results from the spectrometer reconstruction, a precise measurement for the muons can be obtained.

Acta Polytechnica 53(2):241–245, 2013
© Czech Technical University in Prague, 2013

Nuclear fusion effects induced in intense laser-generated plasmas

Lorenzo Torrisi (a, b, *), Salvatore Cavallaro (b), Mariapompea Cutroneo (a, c), Josef Krasa (d)
(a) Department of Physics, University of Messina, Messina, Italy
(b) INFN – Laboratori Nazionali del Sud, Catania, Italy
(c) CSFNSM – Centro Siciliano di Fisica Nucleare e Struttura della Materia, Catania, Italy
(d) Institute of Physics, ASCR – PALS Lab, Prague, Czech Republic
* Corresponding author: lorenzo.torrisi@unime.it

Abstract. Deuterated polyethylene (CD2)n thin and thick targets were irradiated in high vacuum by infrared laser pulses at 10¹⁵ W/cm² intensity. The high laser energy transferred to the polymer generates a plasma, expanding in vacuum at supersonic velocity and accelerating hydrogen and carbon ions. Deuterium ions with kinetic energies above 4 MeV have been measured by using ion collectors and SiC detectors in time-of-flight configuration. At these energies the deuterium–deuterium collisions may induce above-threshold fusion effects, in agreement with the high D−D cross-section values around 3 MeV. In the first instants of the plasma generation, during which high temperature, high density and ion acceleration occur, D−D fusions take place, as confirmed by the detection of mono-energetic protons and neutrons with kinetic energies of 3.0 MeV and 2.5 MeV, respectively, produced by the nuclear reaction. The number of fusion events depends strongly on the experimental set-up, i.e. on the laser parameters (intensity, wavelength, focal spot dimension), the target conditions (thickness, chemical composition, absorption coefficient, presence of secondary targets) and the geometry used (incidence angle, laser spot, secondary target positions). A number of D−D fusion events of the order of 10⁶–10⁷ per laser shot has been measured.
Keywords: D−D fusion, plasma laser, D−D cross section, proton detection, neutron detection.

1. Introduction

Nuclear reactions between two deuterium nuclei are generally accepted as playing a crucial role in recently observed nuclear processes and in significant heat production in condensed matter. Independent measurements of the cross sections for these nuclear reactions therefore play an important role, as they determine the heat production achievable from D−D nuclear reactions. If the D−D reaction is pursued by injecting deuterium ions into the plasma, and the number of fusion events caused by the beam deuterons turns out to be too low to compensate for the energy expended on creating the plasma and the high-energy ion beam, there remains the possibility of increasing the efficiency of the D−D reaction. This can be done by using, directly or indirectly, the neutrons released in the D−D reaction [5]. As is well known, in a deuterium plasma the D−D reaction can proceed along two paths having the same probability:

D + D → (T + 1.01 MeV) + (p + 3.03 MeV),   (1)
D + D → (³He + 0.82 MeV) + (n + 2.45 MeV).   (2)

Intense pulsed lasers can be employed in the field of nuclear fusion in different ways, such as to increase the plasma temperature, to increase the electron density of the plasma, to ignite fusion processes, or to accelerate ions inside the plasma. This is the reason why different lasers with different pulse durations, wavelengths, focalization methods and pulse energies can be utilized. In our experiment a laser intensity of about 10¹⁵ W/cm² is used to irradiate in vacuum a deuterated target, producing a plasma from which deuterons are accelerated to energies above 3 MeV, as recently demonstrated [7]. These ions induce D−D nuclear fusion in the same target and in secondary targets, from which monochromatic protons and neutrons are generated.

2. Material and methods

The iodine laser at the Prague PALS laboratory was employed for the experiment; it provides 300 ps pulses at 1315 nm wavelength, a 70 µm focused spot diameter and an energy of 500 J [3]. This laser has been used to irradiate thick and thin targets at normal incidence in high vacuum (10⁻⁶ mbar). Deuterated polyethylene, (CD2)n, was used as thick (5 mm) and thin (5 µm) targets. The primary target irradiated by the laser was a porous polymer acting as a strong absorber of the laser radiation; the secondary targets were three CD2 polymers of high density and flat surface (each 1 cm in radius). These secondary targets were placed at different distances and angles from the primary target, as reported in the sketch of Fig. 1.

Figure 1. Sketch of the experimental set-up.

A SiC detector, a semiconductor with a 3.2 eV energy gap, was fixed in the forward direction, at a 150° angle with respect to the incidence direction, at a distance of 102 cm from the primary target and of 115 cm, 127 cm and 141 cm from the first, second and third secondary targets, respectively. It permits ions to be detected with a low background signal, since it is insensitive to the visible light emitted from the plasma. Thus protons and deuterium ions emitted from the primary and secondary targets can be detected and their kinetic energies measured. The SiC detector was employed in time-of-flight (TOF) configuration; its signal was acquired through a fast storage oscilloscope (20 GS/s) in order to measure the TOF of the arriving ions and the corresponding kinetic energy of the produced protons, as in previous experiments [4].
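The characteristic energies these detectors look for follow directly from the two-body kinematics of reactions (1) and (2): for fusion essentially at rest, the kinetic energy splits inversely with the product masses. A minimal check (approximate masses in atomic mass units):

```python
def split(Q, m1, m2):
    # Two-body decay from rest: particle 1 carries the mass fraction of particle 2.
    return Q * m2 / (m1 + m2), Q * m1 / (m1 + m2)

m_p, m_n, m_T, m_He3 = 1.007, 1.009, 3.016, 3.016   # u, approximate

T_p, T_T = split(4.04, m_p, m_T)        # D+D -> T + p,   Q = 4.04 MeV
T_nu, T_He = split(3.27, m_n, m_He3)    # D+D -> 3He + n, Q = 3.27 MeV
print(f"p: {T_p:.2f} MeV, T: {T_T:.2f} MeV")        # ~3.03 and ~1.01, as in Eq. (1)
print(f"n: {T_nu:.2f} MeV, 3He: {T_He:.2f} MeV")    # ~2.45 and ~0.82, as in Eq. (2)
```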
A plastic scintillator NE102A was used, coupled to a fast photomultiplier and a storage oscilloscope, to detect the MeV neutrons produced by the D−D neutron branch. The scintillator, having a density of 1.032 g/cm³, provides a fast response to gammas and neutrons, thanks to its 2.4 ns decay time and high detection efficiency (light output 65 % with respect to an anthracene medium). The scintillator was placed at a distance of 200 cm from the primary target and of 216 cm, 238 cm and 242 cm from the first, second and third secondary targets, respectively. Its use was dedicated mainly to neutron energy measurements through the TOF approach. A Thomson parabola (TP) spectrometer was fixed along the normal to the target surface, at about 2 m distance from the target. The TP analyzes the plasma ion emission produced by thin targets (∼ 1 µm in thickness) that is transmitted through a narrow collimator constituted by two pinholes, the first 1 mm and the second 100 µm in diameter. A magnetic field of 0.2 T and an electric field of 3 kV are provided in order to produce the ion deflection.

Figure 2. D−D cross section as a function of the deuterium energy.

The electric field is placed after the magnetic one; it is realized using two parallel plates 8 cm long and 1 cm apart. The distance between the electric deflector plates and the shield containing the micro-channel-plate (MCP) detector for recording the parabolas is 16.5 cm. A CCD camera, under remote control, captures at high spatial resolution the parabola images displayed by the MCP. The Opera-3D/TOSCA code [1] allows the ion trajectories to be simulated, starting from the magnetostatic and electrostatic forces acting in the TP spectrometer, so that the simulation data can be compared with the experimental ones in order to obtain information about the charge-to-mass ratios, the charge states and the ion kinetic energies.

3. Results

The D−D cross section as a function of the deuterium energy, which permits the calculation of the number of fusion events generating the monochromatic proton and neutron branches, is reported in Fig. 2. The maximum cross section of 0.2 barn is obtained for 3.0 MeV incident deuterons. The deuteron energy acquired in the laser-generated plasma has been measured by irradiating thick and thin CD2 targets at normal incidence. Figure 3 shows a typical example of a SiC-TOF spectrum at a 150° detection angle in the forward direction, obtained from a plasma generated by a 5 µm deuterated polyethylene target. The spectrum of the SiC detector, placed at 102 cm distance from the target, shows a narrow photopeak due to photons coming from the plasma (start signal) and a minor narrow peak coming from electron bremsstrahlung at about 10 ns. Moreover, a structured and larger peak, due to the detection of fast and slower ions, extends to times above 60 ns.

Figure 3. Typical example of a SiC-TOF spectrum produced by the ions emitted from the plasma generated in the forward direction by an irradiated 5 µm deuterated polyethylene target; the deuterium peak located at a TOF of 55 ns has an energy of 3.5 MeV.

Figure 4. Typical TP spectrum (a) and comparison with simulation data (b), reporting the parabolas for protons, deuterium, carbon and contaminant oxygen ions; maximum energies of 3.0 MeV and 3.5 MeV are evaluated for protons and deuterium, respectively.

The front peak, located at 55 ns, is due to fast deuterons detected at a kinetic energy of 3.5 MeV.
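The TOF-to-energy conversion behind the 55 ns / 3.5 MeV statement is elementary; a minimal relativistic sketch, using the 102 cm flight path quoted in the text:

```python
import math

M_D = 1875.6           # deuteron rest mass, MeV/c^2
c = 0.299792458        # speed of light, m/ns

def tof_to_energy(L_m, t_ns):
    beta = L_m / (t_ns * c)
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return (gamma - 1.0) * M_D     # relativistic kinetic energy, MeV

print(f"{tof_to_energy(1.02, 55.0):.2f} MeV")
# ~3.6 MeV, consistent with the quoted 3.5 MeV within the start-time and bin-width uncertainties.
```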
The deuteron energy measurement is also confirmed by the TP analysis of the forward ion emission. A typical TP spectrum is reported in Fig. 4a, together with the simulated plot (Fig. 4b), which permits the recognition of the ion parabolas. It shows the parabolas of the detected ion species, charge states and kinetic energies coming from a laser-irradiated thin deuterated polyethylene foil. The spectrum features protons, the six charge states of carbon ions, the deuterium parabola overlapped with the C⁶⁺ parabola, and the presence of contaminant oxygen ions. The maximum ion energy, measurable from the distance between the center point of the circle (due to X-ray detection in the MCP) and the initial point of the parabola line, is 3.0 MeV and 3.5 MeV for protons and deuterium, respectively. The maximum energy of the carbon ions is about 0.5 MeV per charge state. The detection of protons and neutrons with the characteristic energies of 3.0 MeV and 2.45 MeV, respectively, was obtained by irradiating thick deuterated polyethylene and by observing the SiC and plastic-scintillator spectra, which show signals coming from the protons and neutrons generated in the nuclear events. The scintillator spectrum for the neutron detection is reported in Fig. 5. It shows a fast and very high photopeak, due to electron bremsstrahlung in the primary target, followed by three lower peaks at different TOF times, all corresponding to the 2.45 MeV neutrons emitted from the three secondary targets.

Figure 5. Typical TOF neutron spectrum obtained with the plastic scintillator, showing the peaks of the 2.5 MeV neutrons coming from the secondary targets.

The fusion reaction yield, in terms of the number of fusions per incident D⁺ ion produced along the ion track in the target, as a function of depth, is given by

Y(x) dx = Φx nD σ(E(x)) dx,   (3)

where Y(x) is the probability that a fusion will occur per unit length, Φx is the ion current, nD is the density of deuterium atoms in the target, σ(E(x)) is the D−D fusion cross section as a function of the deuteron energy, and dx is the distance along the deuteron track. The deuterium ions accelerated by the plasma penetrate into the CD2 secondary-target matter down to the range depth of the energetic particles. This can be calculated with the SRIM code [8], which gives the energy loss per unit length in the target material, dE(x)/dx, as a function of depth. These data can be used to calculate the D⁺ ion energy as a function of the depth travelled in the target as

E(x) = E0 − ∫0^x (dE/dx') dx',   (4)

where E0 is the initial accelerating energy, assumed to be 3.5 MeV.

Figure 6. Plot of the deuterium energy (a) and of the D−D cross section (b) versus the polyethylene depth for 3.5 MeV D⁺ ions impacting a CD2 substrate.

Figure 6a reports the deuterium energy versus the polyethylene depth for 3.5 MeV D⁺ ions impacting a CD2 substrate; the range is about 135 µm. The total D−D cross section as a function of depth can be calculated by integrating the values over the deuterium ion range in the deuterated polyethylene target through the equation

σT = ∫0^R σ(E(x)) dx.   (5)

Equation (5) can be plotted as a function of the polyethylene depth, as shown in Fig. 6b, demonstrating that the cross section maintains its maximum value within the first 120 µm of the surface layers.
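The paper evaluates Eq. (4) with SRIM stopping tables, which are not reproduced here; the sketch below therefore uses a crude Bethe-like 1/E stopping model, normalized so that the range of a 3.5 MeV deuteron in CD2 matches the quoted 135 µm. Under that assumption it reproduces the qualitative behaviour of Fig. 6a, including the residual energy of roughly 1 MeV near 120 µm of depth.

```python
import math

E0 = 3.5      # MeV, initial deuteron energy
R = 135.0     # micron, range in CD2 quoted from the SRIM calculation
k = E0 ** 2 / (2 * R)     # toy dE/dx = k/E, normalized to reproduce the range R

def energy(x_um):
    # Closed form of Eq. (4) for the 1/E toy stopping model: E(x) = sqrt(E0^2 - 2 k x)
    return math.sqrt(max(E0 ** 2 - 2 * k * x_um, 0.0))

for x in (0, 30, 60, 90, 120, 135):
    print(f"x = {x:3d} um: E = {energy(x):.2f} MeV")
# E(120 um) ~ 1.2 MeV, in line with the 3.5 MeV -> ~1 MeV span stated in the text.
```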
The density of deuterium atoms can be calculated from the equation

nD = 2ρNA/M,   (6)

where ρ is the polyethylene density, NA Avogadro's number and M the CD2 molar mass. The factor 2 is due to the presence of two deuterium atoms per carbon atom. The total number of fusion processes can then be calculated as

Nf = ∫0^R Y(x) dx = Φx nD σT.   (7)

Thus, in order to evaluate Nf, three parameters must be known. The first parameter is the deuterium ion current, Φx, produced by the plasma developed from the primary target. IC measurements of ion currents from polyethylene laser irradiation have indicated that a total current of the order of 10 mA can be produced at times of the order of 10 µs. Assuming the mean charge state to be 2+ (carbon is present with charge states from 1+ up to 6+, but the lower charge states are more intense than the higher ones) and that deuterium ions represent only 30 % of the total ions in the plasma, a current of about Φx ≈ 10¹¹ D⁺ ions per laser shot can be estimated. The evaluation of the second parameter concerns the target density: assuming the CD2 polymer to have a density of 0.98 g/cm³, the atomic density is nD = 7.4 × 10²² cm⁻³. Thus the deuterium atoms, over the 120 µm deuterium range, correspond to 8.9 × 10²⁰ cm⁻². The surface of the three secondary targets irradiated by the primary deuterons is about 9.4 cm², so a total irradiation of about 8.4 × 10²¹ atoms is possible. The third parameter is the D−D cross section, which is approximately constant at σT = 0.018 × 10⁻²⁴ cm² over the first 120 µm of depth, where the kinetic energy ranges between 3.5 MeV and about 1 MeV. These approximations permit the evaluation of a total number of fusion processes Nf of about 1.5 × 10⁷ per laser shot, a value comparable to those of other similar experiments performed with laser-generated plasmas [6, 2].
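Equations (6) and (7) can be checked numerically, with one caveat: taking the parameters exactly as quoted (Φx = 10¹¹, nD = 7.4 × 10²² cm⁻³, 120 µm effective depth, σ = 0.018 barn), the product comes out near 1.6 × 10⁶, not 1.5 × 10⁷. The stated Nf is recovered if the mean cross section over the 1–3.5 MeV span is taken as ≈ 0.17 barn, which is also more in line with the ≈ 0.2 barn peak of Fig. 2, so the 0.018 value may be a transcription slip. The sketch below uses the back-inferred 0.17 barn and labels it as such.

```python
NA = 6.022e23
phi = 1e11            # D+ ions per shot reaching the targets (from the text)
rho, M = 0.98, 16.0   # CD2 density [g/cm^3] and molar mass [g/mol] (12 + 2*2)
depth = 120e-4        # cm, effective depth over which sigma stays near maximal

n_D = 2 * rho * NA / M          # Eq. (6): ~7.4e22 deuterons per cm^3
areal = n_D * depth             # ~8.9e20 cm^-2, as quoted in the text
sigma_bar = 0.17e-24            # cm^2; back-inferred mean cross section (assumption)

N_f = phi * areal * sigma_bar   # Eq. (7), with sigma_T = sigma_bar * depth folded in
print(f"n_D = {n_D:.2e} cm^-3, areal density = {areal:.2e} cm^-2")
print(f"N_f = {N_f:.1e} fusion events per laser shot")   # ~1.5e7
```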
4. Conclusions

The measurements have established, with an accuracy of the order of 15 %, the detection of 3.0 MeV protons and 2.5 MeV neutrons coming from the secondary targets irradiated by the MeV deuterons accelerated in the laser-generated plasma. The proton detection occurs together with that of fast deuterium ions, while the neutron spectra show the coexistence of gamma rays, as a consequence of the electron bremsstrahlung. The preliminary evaluation of the number of fusion events per laser shot is of the order of 10⁷. Further nuclear fusion events may occur in the primary target, due to D−D collisions generated in the plasma, and may increase the yield of monoenergetic protons and neutrons. Since each fusion event releases 3.27 MeV in the neutron-production channel and 4.03 MeV in the proton-emission channel, the total number of events occurring in the three secondary targets corresponds to a nuclear energy generation of the order of 10 µJ per laser shot. Thus the conversion of laser energy into nuclear events is low, considering that the laser pulse used is of about 500 J, and further investigations should be performed in order to increase this conversion factor. The presented results highlight the importance of laser-induced nuclear fusion events for developing nuclear energy without the generation of dangerous radioactive species.

References

[1] Vector Fields: Opera-3D/TOSCA code. http://www.cobham.com/about-cobham/aerospace-and-security/about-us/antenna-systems/kidlington/products/opera-3d.aspx, 2012.
[2] J. Krasa, A. Velyhan, B. Bienkowska, et al.: In: 33rd EPS Conference on Plasma Physics, Rome, p. P-5.035, 2006.
[3] PALS laboratory. http://www.pals.cas.cz, 2012.
[4] D. Margarone, J. Krasa, L. Giuffrida, et al.: Full characterization of laser-accelerated ion beams using Faraday cup, silicon carbide, and single-crystal diamond detectors. J. of Appl. Phys. 109:103302, 2011.
[5] B. R. Martin: Nuclear and Particle Physics. John Wiley & Sons, Ltd., 2006.
[6] L. Torrisi, S. Cavallaro, M. Cutroneo, et al.: Deuterium-deuterium nuclear reaction induced by high intensity laser pulses. Applied Surface Science, 2012, in press.
[7] L. Torrisi, S. Cavallaro, M. Cutroneo, et al.: Monoenergetic proton emission from nuclear reaction induced by high intensity laser-generated plasma. Review of Scientific Instruments 83(2):02B111, 2012.
[8] J. F. Ziegler, J. P. Biersack, M. D. Ziegler: SRIM – the stopping and range of ions in matter. http://www.srim.org/.

Acta Polytechnica Vol. 50 No. 6/2010

Fractality of fracture surfaces

T. Ficker

Abstract
A recently published fractal model of the fracture surfaces of porous materials is discussed, and a series of explanatory remarks are added. The model has revealed a functional dependence of the compressive strength of porous materials on the fractal dimension of fracture surfaces. This dependence has also been confirmed experimentally. The explanatory remarks provide a basis for establishing the model better.

Keywords: fractal dimension, fracture surfaces, porous materials, compressive strength.

1 Introduction

Mandelbrot and his co-workers [1] started the fractal research of fracture surfaces of solid materials. After their pioneering work had been published, many other authors [2, 3, 4, 5] tried to correlate the fractal properties of fracture surfaces with the mechanical quantities of materials. Due to the complexity of these surfaces, especially with composite materials, the results of such studies were not consistent, sometimes even contradictory. Good examples of such complicated surfaces are the fracture surfaces of cementitious materials. Nevertheless, they are extensively studied [6, 7]. Recently a fractal model of the fracture surfaces of porous materials was published [8, 9, 10]. Its functionality has been tested and proved with porous cementitious materials. One of the most important results of the model is the relation σ = f(D), i.e. the dependence of the compressive strength σ of materials on the fractal dimension D of fracture surfaces. This finding may be of practical importance, since it indicates the possibility of estimating compressive strength on the basis of the fractal geometry of fracture surfaces. The aim of this paper is to provide the necessary explanations and comments on the particular steps performed within the derivation of the model [8–10], in order to make all its parts transparent. After a short overview of the basic relations of the model (Section 2), a series of explanatory Sections 3.1–3.3 follows.
2 Outline of the model

A short sketch of the fractal model [8, 9, 10] of fracture surfaces of porous materials is presented here. The content of this section has its source in Ref. [10], which is the most recent presentation of the model.

2.1 Fractal porosity

The large class of porous materials possesses at least one common feature, namely, they are composed of grains (particles, globules, etc.) of microscopic size l. The grains are usually arranged fractally with number distribution N(l) and fractal dimension D:

N(l) = (L/l)^D,  l < L.   (1)

Assuming the volume of a globule to be v = const · l³ and the volume of a sample V = const · L³, the porosity P of the cluster may be derived as follows:

P = (V − N·v)/V = 1 − N·(v/V) = 1 − (L/l)^D · (l/L)³ = 1 − (l/L)^(3−D).   (2)

In general, the porosity P of a sample with a characteristic size Λ, containing stochastically scattered fractal clusters of sizes {L_i}, i = 0, 1, 2, ..., n ≪ Λ, and dimensions {D_i}, i = 0, 1, 2, ..., n, reads

P = 1 − Σ_{i=0}^{n} ξ_i (l_i/L_i)^(3−D_i),  ξ_i = m_i L_i³/Λ³,   (3)

where m_i is the number of fractal clusters with dimension D_i.

2.2 Fractal compressive strength

The relations for estimating the compressive strength of porous materials published earlier [11, 12] rely on porosity P as the main decisive factor. In the fractal model under discussion the Balshin [11] relation σ = σ*_o (1 − P)^k was used as a starting point. The relation was developed [10] into a more general form

σ = σ*_o (1 − P/P_cr)^k + s_o = σ_o (1 − P − b)^k + s_o,   (4)

0 ≤ b = 1 − P_cr ≤ 1,  0 ≤ σ*_o ≤ σ_o = σ*_o / P_cr^k,   (5)

where s_o is the remaining strength, which may be caused, among other things, by the virtual incompressibility of pore liquids. Combining (3) and (4), the compressive strength of porous matter appears as a function of the fractal structure:

σ = σ_o [ Σ_{i=0}^{n} ξ_i (l_i/L_i)^(3−D_i) − b ]^k + s_o.   (6)

2.3 Dimension of fracture surface

When performing the fracture of a porous material whose inner (volume) structure has fractal dimension D_i, this structure is projected onto the fracture surface with a lower dimension D*_i < D_i. Provided a fracture surface has its own dimension S and its morphology is 'typical' rather than 'special', the relation between D*_i and D_i can be expressed [13] as follows:

D_i = max{0, D*_i + (3 − S)},  D*_i ≤ S < 3,   (7)

where 3 − S is the co-dimension of the fracture surface. Using (7), the exponent 3 − D_i in Eq. (6) can be replaced by S − D*_i, and the generalized strength function now reads

σ = σ_o [ Σ_{i=0}^{n} ξ_i (l_i/L_i)^(S−D*_i) − b ]^k + s_o.   (8)

This function may contain many parameters, so that it is difficult to fit it to the experimental data, because there may be more than one 'reliable' set of parameters σ_o, {ξ_i}, l_i, L_i, S, D*_i, b, k, s_o. Fortunately, the structure of a porous material often contains only one type of grain, i.e. one type of fractal arrangement (i = 1) dominates over the solid rest (i = 0), which is usually of non-fractal character (D_o = 3):

σ = σ_o [ ξ_1 (l_1/L_1)^(S−D*_1) + (ξ_o − b) ]^k + s_o = σ_o [ ξ_1 exp((S − D*_1)/a) − γ ]^k + s_o,   (9)

a = 1/ln(l_1/L_1),  γ = b − ξ_o.

2.4 Experimental tests

Relation (9) is directly applicable to samples of hydrated Portland cement paste, since it is a composite whose main component (calcium-silicate-hydrate gel) is known for its inner fractal structure. Other components can be assigned to a non-fractal remnant. Therefore, this material was used [8, 9, 10] to test the functionality of Eq. (9).
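Before the experimental test, a numerical sketch of the single-component strength function (9) may be helpful. All parameter values below are illustrative placeholders, not the fitted values of Fig. 1 (which are shown only in the figure); the point is the qualitative shape: σ rises from the residual level s_o as D* grows toward S, i.e. as the structure becomes denser and less porous.

```python
def strength(D_star, S=2.2, l_over_L=0.01, xi1=1.0, b=0.10, xi0=0.05,
             sigma_o=80.0, k=2.0, s_o=5.0):
    """Eq. (9): sigma = sigma_o [xi1 (l/L)^(S - D*) + (xi0 - b)]^k + s_o.
    The power form is equivalent to the exponential form with a = 1/ln(l/L).
    All parameter values here are purely illustrative."""
    base = xi1 * l_over_L ** (S - D_star) + (xi0 - b)
    return sigma_o * max(base, 0.0) ** k + s_o

for D_star in (2.00, 2.05, 2.10, 2.15, 2.20):
    print(f"D* = {D_star:.2f}: sigma = {strength(D_star):6.1f} MPa")
```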
Fig. 1: Dependence of compressive strength on fractal dimension for cement paste [8]

Seventy-two samples of hydrated ordinary Portland cement paste of various water-to-cement ratios r (0.4, 0.6, 0.8, 1.0, 1.2, 1.4) were prepared. After 28 days of hydration the samples were subjected to three-point bending and were fractured. The fracture surfaces were used for further fractal analysis. A 3D digital reconstruction of the fracture surfaces was performed using a confocal microscope, and then a series of horizontal sections (contours) were analyzed at a resolution of 0.2 µm²/pixel by means of the standard box-counting method [8, 9, 10] to obtain a representative D* for each particular surface. The box-counting analyses were performed in the length interval 〈2 µm, 300 µm〉. The second parts of the fractured samples were cut into small cubes and subjected to destructive tests to determine their compressive strength values σ. It is known that cement pastes of higher water-to-cement ratio suffer from sedimentation of cement clinker grains and from bleeding, which may lead to lower homogeneity and a modified w/c ratio. The influences of these effects on the strength measurements have been partly suppressed by the fact that samples with higher w/c are localized on the strength curve in the less sensitive region, in which the curve is bent and asymptotically approaches the horizontal direction. The surface inhomogeneity has been partly compensated by taking microscopic images from different sites on the surface. Samples prepared with different water-to-cement ratios r possess different porosities. From cement technology it is well known that with increasing ratio r the porosity increases. Naturally, this will change the dimensions of the projected patterns D*. Six groups of samples with different r mean six different D* at which we are able to measure the dependence σ(D*) and check it against Eq. (9). The result can be seen in Fig. 1, along with all the fitting parameters. Since the assumed analytical form (9) of the dependence σ(D*) has been reproduced well, one may conclude that compressive strength is one of those mechanical quantities whose value is 'coded' in the surface arrangement of the fractured samples of porous materials.

3 Explanatory remarks and discussion

The following paragraphs provide comments on the proposed concept of fractal compressive strength, in order to clarify all its crucial points.

3.1 Derivation of fractal porosity

The derivation of fractal porosity starts with Eq. (1), which determines the number N of fractal elements on the length scale l. The length L has been taken as a reference scale standing for the largest possible scale, on which only one fractal element is present (the so-called initiator, to use Mandelbrot's nomenclature [14]). In short, the object under discussion shows the power-law behavior (1) only within a limited length interval (l, L), whose borders l and L were coined by Mandelbrot [1] as the 'inner' and 'outer' cutoffs. Beyond this interval the object behaves as an ordinary non-fractal Euclidean body. On the smallest length scale l we can 'see' a great number N(l) of basic building elements of size l, and as we go to larger length scales, the number of corresponding elements decreases. At the length scale l = L there is only one element, i.e. N_o = 1. This is a common property of all self-similar fractals, and it can be demonstrated very instructively with all deterministic fractals [15], e.g. the Cantor set, the Koch curve, the Menger sponge, etc.
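Before the analytic treatment that follows, the scaling law N(l) = (L/l)^D can be confirmed numerically on the deterministic fractals just mentioned (a minimal check with L = 1):

```python
import math

# At generation g the Koch curve has 4^g pieces of length 3^-g,
# and the Cantor set has 2^g pieces of length 3^-g.
cases = {
    "Koch curve": (4, 3, math.log(4) / math.log(3)),
    "Cantor set": (2, 3, math.log(2) / math.log(3)),
}
for name, (pieces, shrink, D) in cases.items():
    for g in (1, 3, 5):
        l = shrink ** -g
        predicted = (1.0 / l) ** D    # N(l) = (L/l)^D with L = 1
        actual = pieces ** g
        print(f"{name}, g = {g}: predicted N = {predicted:.1f}, actual N = {actual}")
```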
To explain the origin of Eq. (1), it is necessary to go to the definitions of a fractal measure and a fractal dimension. The most general definitions of these quantities are those of Hausdorff [16]. However, his definitions are rather sophisticated and not convenient for computer implementation. For software processing there are some modifications, among which the box-counting procedure is frequently used [17, 18, 19, 20]. The box-counting measure M is given as the sum of the d-dimensional 'boxes' (l^d) needed to cover the fractal object embedded in the E-dimensional Euclidean space. The boxes are parts of the d-dimensional network created in the Euclidean space:

M = Σ_{i=1}^{N} l^d = N · l^d = exp(ln N) · l^d = [exp(ln l)]^(ln N / ln l) · l^d = l^(ln N / ln l) · l^d = l^(d − ln N / ln(1/l)) = l^(d−D),

lim_{l→0} l^(d−D) = ∞ for d < D,  0 for d > D.   (10)

The fractal box-counting dimension is defined by the point of discontinuity of the function M(d). According to Eq. (10), this is just the point

d = D = ln N / ln(1/l),   (11)

where the measure M abruptly changes its value from infinity to zero. From the dimension D defined in this way it is easy to express the number N of fractal elements whose size is equal to l or L:

N(l) = l^(−D),   (12)
N(L) = L^(−D) = N_o.   (13)

Combining (12) and (13), we obtain

N(l) = N_o (L/l)^D.   (14)

Bearing in mind that N_o belongs to the 'initiator' of the fractal object, i.e. N_o = 1, we obtain

N(l) = (L/l)^D   (15)

with the total interval of fractality (l, L). Relation (15) is in fact Eq. (1), which was the starting point in deriving the fractal porosity in Section 2.1. The functionality of (15) can easily be verified using deterministic fractals [15]. For example, the Koch curve (D = ln 4 / ln 3) in its third generation has N₃ = 4³ elements of length l₃ = 1/3³. The number N₃ = 4³ can be obtained from Eq. (15) by inserting L = 1, l₃ = 1/3³ and D = ln 4 / ln 3, which gives the following result: N₃ = [1/(1/3³)]^D = 3^(3D) = 3^(3 ln 4 / ln 3) = e^(3 ln 3 · ln 4 / ln 3) = e^(3 ln 4) = 4³, in full agreement with what had been expected. If the length L of the initiator is different from one, however, the result remains unchanged, i.e. N₃ = [L/(L/3³)]^D = 3^(3D).

As far as Eq. (2) is concerned, we may raise a question about its validity if the basic building elements of size l are small spheres tightly packed in the Euclidean space. Due to the compactness of the structure it holds D = 3. Eq. (2) then yields P = 0 instead of P > 0, which would be expected since there are always gaps between spheres, regardless of their type of space arrangement. Here we should bear in mind that spheres of finite diameter cannot generate a true fractal, since that requires the presence of an infinitely fine structure, i.e. l → 0. In such a case Eq. (2) provides P = 1 − 0⁰, which is, however, an indeterminate expression that allows no mathematical decision to be made. Nevertheless, the condition l → 0 ensures that with a tight arrangement there are no gaps between the 'spheres', since 'point-like spheres' completely fill in the Euclidean space and, thus, the porosity must be zero. This means that the indeterminate expression P = 1 − 0⁰ should also converge to zero. In addition, performing the same procedure with small cubes instead of small spheres, the value D = 3 (tight arrangement) can be attained even with cubes of finite size (l > 0), and the corresponding porosity is then exactly zero (P = 0), as required. Therefore, the non-zero porosity in the case of 'tightly' packed spheres of finite size is a consequence of a shape artifact and not of erroneous behavior of Eq. (2).
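The box-counting recipe behind Eqs. (10)–(11) is straightforward to implement; a minimal 2D sketch (NumPy assumed), with a filled square as a sanity check, where D must come out ≈ 2:

```python
import numpy as np

def box_counting_dimension(mask, sizes=(2, 4, 8, 16, 32)):
    """Estimate D of a binary 2-D pattern by the standard box-counting method."""
    counts = []
    for s in sizes:
        h, w = (mask.shape[0] // s) * s, (mask.shape[1] // s) * s
        blocks = mask[:h, :w].reshape(h // s, s, w // s, s)
        counts.append(blocks.any(axis=(1, 3)).sum())   # N(s): occupied boxes
    # D is minus the slope of log N(s) versus log s
    slope, _ = np.polyfit(np.log(sizes), np.log(counts), 1)
    return -slope

print(box_counting_dimension(np.ones((256, 256), dtype=bool)))   # ~2.0
```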
When dealing with a composite material, not each of its components is a fractal, and not each fractal cluster is delocalized over the whole sample. For this reason, it is necessary to assume that the sample consists of sets (i = 0, 1, 2, ..., n) of both fractal and non-fractal clusters whose characteristic sizes L_i are smaller than, or at most equal to, the size Λ of the sample (Fig. 2). If m_i denotes the number of clusters of the i-th type (either fractal or non-fractal), the porosity P can be derived by analogy with Eq. (2):

P = (V − Σ_{i=0}^{n} m_i · N_i · v_i)/V = [Λ³ − Σ_{i=0}^{n} m_i (L_i/l_i)^(D_i) l_i³]/Λ³ = 1 − Σ_{i=0}^{n} (m_i L_i³/Λ³) · (l_i/L_i)^(3−D_i) = 1 − Σ_{i=0}^{n} ξ_i (l_i/L_i)^(3−D_i),   (16)

where ξ_i = m_i L_i³/Λ³. Eq. (16) includes all possibilities of fractal, non-fractal or mixed arrangements. For example, when only n + 1 fractal components exist and are delocalized over the whole sample (L_i = Λ, m_i = 1, ξ_i = 1), Eq. (16) reads

P = 1 − Σ_{i=0}^{n} (l_i/L_i)^(3−D_i) = 1 − Σ_{i=0}^{n} (l_i/Λ)^(3−D_i).   (17)

If there are only n + 1 non-fractal compact components localized inside the sample (L_i < Λ), then the porosity assumes the common 'non-fractal' form

P = 1 − Σ_{i=0}^{n} ξ_i,  ξ_i = m_i L_i³/Λ³ = V_i/V.   (18)

If a fully delocalized compact (i.e. non-fractal) component (L_i = Λ) is considered, then, naturally, only one of them can be taken into account, since no more than a single fully compact component can fill in the volume of the sample, i.e. D_o = 3, L_o = Λ, m_o = 1, ξ_o = 1, n = 0:

P = 1 − ξ_o = 0 (fully compact body).   (19)

Finally, in cases when one fractal component (D_1) and one non-fractal component (compact, D_o = 3) are present, either of them localized in several sites of the sample (m_1 fractal clusters, m_o non-fractal clusters), the porosity can be expressed as follows:

P = 1 − ξ_1 (l_1/L_1)^(3−D_1) − ξ_o,  ξ_1 = m_1 L_1³/Λ³,  ξ_o = m_o L_o³/Λ³.   (20)

Fig. 2: A scheme of a porous composite material (3 components)

3.2 Derivation of compressive strength

In the technical literature, many relations have been introduced by various authors. Most of these relations use porosity P as the main governing factor. Let us discuss some relations concerning compressive strength. Balshin [11] suggested the power function σ = σ*_o (1 − P)^k, which was related to porous metallic ceramic materials. His relation is equivalent to the well-known expression of Powers [21]. Ryshkewitch [22] recommended the exponential function σ = σ*_o exp(−bP), which is in fact an asymptotic form¹ (P → 0) of the Balshin power function. Schiller [23] presented an expression similar to that of Balshin, in the form σ = σ*_o [1 − (P/P_cr)^(1/a)], corrected for a critical porosity P_cr at which the compressive strength approaches zero. Several other relations are summarized in Ref. [12]. There are two important points that should be taken into account when dealing with the compressive strength of porous materials, namely, the so-called critical porosity P_cr and partly incompressible pore liquids. Schiller [23] considered the critical porosity P_cr as a limiting factor for compressive strength, i.e. σ(P_cr) = 0. But the limit may also be somewhat influenced by the virtual incompressibility of pore liquids.

¹ The Balshin relation can be rewritten into the Ryshkewitch formula by considering (1 − P)^k = exp[k · ln(1 − P)] and restricting the Taylor series of the logarithmic function to its first term.
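Footnote 1 is easy to verify numerically: the two formulas agree closely for small P and drift apart as the porosity grows. A minimal check with an illustrative exponent k:

```python
import math

k = 3.0   # illustrative exponent; in the footnote's approximation b = k
for p in (0.02, 0.05, 0.10, 0.20, 0.40):
    balshin = (1 - p) ** k           # power law
    ryshkewitch = math.exp(-k * p)   # its small-P exponential asymptote
    print(f"P = {p:.2f}: (1-P)^k = {balshin:.4f}, exp(-kP) = {ryshkewitch:.4f}")
```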
3.2 Derivation of compressive strength

In the technical literature, many relations have been introduced by various authors. Most of these relations use porosity p as the main governing factor. Let us discuss some relations concerning compressive strength. Balshin [11] suggested a power function σ = σ*_o(1 − p)^k that was related to porous metallic ceramic materials. His relation is equivalent to the well-known expression of Powers [21]. Ryshkewitch [22] recommended the exponential function σ = σ*_o exp(−bp), which is in fact an asymptotic form¹ (p → 0) of the Balshin power function. Schiller [23] presented an expression similar to that of Balshin in the form σ = σ*_o[1 − (p/p_cr)^{1/a}], corrected for a critical porosity p_cr at which compressive strength approaches zero. Several other relations are summarized in Ref. [12].

¹The Balshin relation can be rewritten in the Ryshkewitch formula considering (1 − p)^k = exp[k · ln(1 − p)] and restricting to the first term of the Taylor series of the logarithmic function, ln(1 − p) ≈ −p.

There are two important points that should be taken into account when dealing with the compressive strength of porous materials, namely, the so-called critical porosity p_cr and partly incompressible pore liquids. Schiller [23] considered the critical porosity p_cr as a limiting factor for compressive strength, i.e. σ(p_cr) = 0. But the limit may also be somewhat influenced by the virtual incompressibility of pore liquids. Liquids are displaced in the porous network under the action of an imposed external mechanical load, but narrow pores hinder the liquid movement [24], and due to the virtual incompressibility of the liquid the strength of the structure may be somewhat modified. It is natural that this effect concerns especially quite narrow pores, and with increasing pore diameter the effect weakens. Let us term the modified strength the remaining strength s_o. Now it is clear that p_cr and s_o should be correlated to fulfill the condition σ(p_cr) = s_o. Taking the Balshin relation σ = σ*_o(1 − p)^k as a good starting point, his form may be generalized by taking into account p_cr and s_o, as follows:

σ = σ*_o \left(1 - \frac{p}{p_{cr}}\right)^k + s_o, \qquad σ(p_{cr}) = s_o.   (21)

Now the critical porosity p_cr does not represent the absolute limit of strength; it only defines a limit at which the influence of incompressible liquids begins to play a role.

3.3 Universal exponent of fracture surfaces

It is important to realize that the fractality of porous materials is determined by their solid skeleton and not by their pores, which are a consequence of the volume arrangement of material components possessing dimensions {D_i}. As soon as the volume structure is broken and a fracture surface appears, a new topological situation occurs. The volume components {D_i} create surface patterns {D*_i} with lower dimensions D*_i < D_i. The decrease of the dimensions {D_i} can easily be found when the fracture surface is a plane (intersection of the Euclidean plane and the volume fractal component). In this case D*_i = D_i − 1, as is well known. In general, the value of the dimensional shift of a fractal that has originally been embedded in the Euclidean space (E) and then projected onto a subspace (s < E) is called the co-dimension (E − s). When the original space is three-dimensional (E = 3) and the subspace two-dimensional (s = 2), the co-dimension is one (E − s = 1), as in the case of the intersection of the Euclidean plane with a volume fractal. However, fracture surfaces are not smooth Euclidean planes but rather irregular wavy surfaces. Let us consider the simplest case of fracture of a non-porous, fully compact solid, e.g. a pure metal. Such a solid has no fractal volume component, but its fracture surface is fractal (2 < D*_o < 3), as has been shown elsewhere [25]. On the other hand, when a solid consisting of one delocalized fractal component (D) is broken, the dimension D* of the corresponding fracture surface is equal to the dimension of the fractal projection onto an 'imaginary' subspace s, i.e. D* = D − (3 − s). The dimension s of the subspace can be calculated from the dimensions of the volume fractal D and its surface projection D*, i.e. s = 3 − (D − D*), provided there are techniques for determining D and D*. Similarly, if a solid is composed of more than one fractal component {D_i}, the dimensions of their surface projections {D*_i} are given as follows:

D*_i = D_i − (3 − s) \quad \Rightarrow \quad 3 − D_i = s − D*_i.   (22)

This relation has been used when going from Eq. (6) to Eq. (8), i.e. from volume fractals to their surface projections. The dimensions of surface projections D*_i are 'measurable', e.g. using the confocal technique, provided these components are extended over different length scales and do not overlap each other. In our case of hydrated cement paste there is one fractal component (the calcium-silicate-hydrate gel) that dominates over the other non-fractal components, and this simplifies the computations according to Eq. (9).
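The dimensional bookkeeping of Eq. (22) can be illustrated by a few lines of Python (the function names are ours; the numerical values below are only examples, except for D* = 2.2, which is quoted from [25]):

    def surface_projection_dim(D_volume, s):
        # Eq. (22): dimension of the surface pattern created by a volume
        # component of dimension D_volume on a fracture surface whose
        # underlying subspace has dimension s.
        return D_volume - (3.0 - s)

    def subspace_dim(D_volume, D_surface):
        # Inverse reading of Eq. (22): s = 3 - (D - D*).
        return 3.0 - (D_volume - D_surface)

    # A planar cut (s = 2) reproduces the textbook shift D* = D - 1:
    print(surface_projection_dim(2.6, s=2.0))   # -> 1.6
    # A compact metal (D = 3) with the measured D* = 2.2 of [25] points
    # at s = 2.2, the candidate universal exponent discussed below:
    print(subspace_dim(3.0, 2.2))               # -> 2.2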
There is no reason why the mentioned subspace has to be of the Euclidean type. It can also be of fractal type, i.e. its dimension s can be not only an integer but also a non-integer number. This is the case of compact metals possessing the dimension D_i = 3, for which Eq. (22) gives D*_i = s. Bouchaud, Lapasset and Planès [25], when investigating metallic fractures, found D*_i = 2.2, which means s = 2.2. In our case of porous calcium-silicate-hydrates the dimension s has also been found in the same range, s ≈ 2.2, although these two materials are quite different. The idea that s may be a universal exponent related to fracture surfaces has been indicated previously [25], and our results seem to support it; nevertheless, this concept should be studied further. If future experiments confirm the concept, then it will be necessary to distinguish carefully between the two types of exponents, s and D*_i. The former exponent s is a relatively stable and probably universal exponent which would be directly measurable if the sample were fully compact (non-porous and non-fractal), i.e. a perfect Euclidean body. The dimension s seems to depend more on the fracture process itself than on structural components. The latter exponents are the dimensions {D*_i} of surface projections. They vary with the properties of materials, which has been illustrated in previous studies [9, 10] using a series of samples of different compressive strength. These studies have simultaneously confirmed that fracture surfaces bear information on the compressive strength of porous materials (Fig. 2).

4 Conclusion

The fractal model of compressive strength may be applicable to all fractal porous materials. If the particular material is composed of a single fractal component, the model contains only a few parameters that can easily be fitted to experimental data. However, when more fractal components are present, many parameters have to be fitted, and numerical problems may arise in selecting their 'right' values among all the options, each of which satisfies the optimizing criteria equally well. There is no general numerical procedure that will guarantee such a right selection of values. In these cases an intuitive and heuristic approach, supported by physical reasoning, may be instrumental in finding an optimum solution.

Acknowledgement

This work was supported by grant no. ME09046 provided by the Ministry of Education, Youth and Sports of the Czech Republic.

References

[1] Mandelbrot, B. B., Passoja, D. E., Paullay, A. J.: Nature, 1984, vol. 308, p. 721.
[2] Balankin, A. S. et al.: Phys. Rev., 2005, E 72, 065101R.
[3] Marconi, V. I., Jagla, E. A.: Phys. Rev., 2005, E 71, 036110.
[4] Bohn, S. et al.: Phys. Rev., 2005, E 71, 046214.
[5] Bouchbinder, E., Kessler, D., Procaccia, I.: Phys. Rev., 2004, E 70, 046107.
[6] Yan, A., Wu, K.-R., Zhang, D., Yao, W.: Cem. Concr. Res., 2003, vol. 25, p. 153.
[7] Wang, Y., Diamond, S.: Cem. Concr. Res., 2001, vol. 31, p. 1385.
[8] Ficker, T.: Acta Polytechnica, 2007, vol. 47, p. 27.
[9] Ficker, T.: Europhys. Lett., 2007, vol. 80, p. 16002.
[10] Ficker, T.: Theoret. Appl. Fract. Mech., 2008, vol. 50, p. 167.
[11] Balshin, M. Y.: Dokl. Akad. Nauk SSSR, 1949, vol. 67, p. 831 (in Russian).
[12] Narayanan, N., Ramamurthy, K.: Cem. Concr. Res., 2000, vol. 22, p. 321.
[13] Falconer, K. J.: Professor of Mathematics, University of St. Andrews, Scotland – private communication.
[14] Mandelbrot, B. B.: Fractal Geometry of Nature. San Francisco, Freeman, 1982.
[15] Ficker, T., Benešovský, P.: Eur. J. Phys., 2002, vol. 23, p. 403.
[16] Hausdorff, F.: Math. Ann., 1919, vol. 79, p. 157.
[17] Ficker, T., Druckmüller, M., Martišek, D.: Czech. J. Phys., 1999, vol. 49, p. 1445.
[18] Ficker, T.: J. Phys. D: Appl. Phys., 1999, vol. 32, p. 219.
[19] Ficker, T.: IEEE Trans. Diel. El. Insul., 2003, vol. 10, p. 700.
[20] Ficker, T.: J. Phys. D: Appl. Phys., 2005, vol. 38, p. 483.
[21] Ficker, T., Němec, P.: Porosity and strength of cement materials. In Proceedings of the International Workshop on Physical and Material Engineering 2006, Bratislava, Slovak University of Technology, 2006, pp. 30–33.
[22] Ryshkewitch, E.: J. Am. Ceram. Soc., 1953, vol. 36, p. 65.
[23] Schiller, K. K.: In: Walton, W. H. (ed.), Mechanical Properties of Non-metallic Brittle Materials. London, Butterworths, 1958, pp. 35–45.
[24] Scherer, G. W.: Cem. Concr. Res., 1999, vol. 29, p. 1149.
[25] Bouchaud, E., Lapasset, G., Planès, J.: Europhys. Lett., 1990, vol. 13, p. 73.

Prof. RNDr. Tomáš Ficker, DrSc.
Phone: +420 541 147 661
E-mail: ficker.t@fce.vutbr.cz
Department of Physics, Faculty of Civil Engineering, University of Technology, Žižkova 17, 662 37 Brno, Czech Republic

Acta Polytechnica 53(2):94–97, 2013
© Czech Technical University in Prague, 2013

Predictive Models in Diagnosis of Alzheimer's Disease from EEG

Lucie Tylová^a,*, Jaromír Kukal^a, Oldřich Vyšata^b

^a Faculty of Nuclear Sciences and Physical Engineering, Czech Technical University in Prague, Department of Software Engineering in Economy, Trojanova 13, 120 00 Praha 2, Czech Republic
^b Department of Computing and Control Engineering, Faculty of Chemical Engineering, Institute of Chemical Technology in Prague
* Corresponding author: tylovluc@fjfi.cvut.cz

Abstract. The fluctuation of an EEG signal is a useful symptom of EEG quasi-stationarity. Linear predictive models of three types and their prediction errors are studied via traditional and robust measures. The resulting EEG characteristics are applied to the diagnosis of Alzheimer's disease. Our aim is to decide among forward, backward, and symmetric predictive models, EEG channels, and also robust and non-robust variability measures, and then to find statistically significant measures for use in the diagnosis of Alzheimer's disease from EEG.

Keywords: Alzheimer's disease, EEG, linear predictive model, quasi-stationarity, robust statistics, multiple testing, FDR.

1. Introduction

Dementia is a set of clinical symptoms, e.g., memory loss and communicative difficulties. Two main categories are cortical and subcortical dementias. The most important cortical dementia, which accounts for about 50 % of the cases, is Alzheimer's disease. In patients with Alzheimer's disease, brain cells die quickly and the chemistry and structure of the brain are changed. EEG is used to study and detect abnormalities caused by dementia. Biological rest is an endogenously dynamic process. Transient EEG events identify and quantify brain electric microstates as time epochs with a quasi-stable field topography [1]. We can assume better predictability inside microstates and lower predictability during changes between microstates. Higher fluctuations of EEG predictability may therefore be connected with a higher frequency of microstate changes. Falk et al. [2] analysed an EEG signal via its envelope. They first constructed the modulation spectrum and found the region of significant spectral peaks (SP).
This technique achieves accuracy 81.3 % with sensitivity 85.7 % and specificity 72.7 %. After the Hilbert transform, they also calculated the percentage modulation energy (PME), with better accuracy 90.6 %, sensitivity 90.5 %, and specificity 90.9 %. Another approach was used by Ahmadlou et al. [3]. The first step in their approach was based on wavelet decomposition. The resulting patterns were processed by the visibility graph algorithm (VGA). The power spectrum of the VGA structures was used for feature extraction. Two types of classifier (RBFNN, PCA-RBFNN) were used for the final decision. The accuracy was 96.50 % with sensitivity 100 % and specificity 87.03 % in the case of RBFNN. PCA-RBFNN increased the accuracy to 97.75 % with sensitivity 100 % and specificity 91.08 %.

2. Models

The main hypothesis of this work is that the predictability of brain activity differs between groups of patients with Alzheimer's disease (AD) and normal controls (CN). The activity of the human brain is measured via multichannel EEG, which produces a time series. On the basis of the quasi-stationarity of the EEG signal, the time series were decomposed into non-overlapping segments of constant length. Each segment of a given EEG channel and of each patient produced a short time series, the properties of which were studied via linear autoregressive models of three types.

2.1. Predictive model

Let m be the length of a segment. Let n be the model size as a number of parameters. Let x_1, …, x_m be an EEG [4] data segment. The linear predictive model has the form

x_k = \sum_{i=1}^{n} a_i x_{k-i} + e_k   (1)

for k = n+1, …, m, where e_k is the model error in the k-th measurement and a_i is the model parameter for i = 1, …, n. Formula (1) represents the traditional AR (autoregressive) model [5].

2.2. Back-predictive model

The predictive AR model (1) can also be used in the opposite time direction. The resulting model is

x_k = \sum_{i=1}^{n} a_i x_{k+i} + e_k,   (2)

where e_k is again the model error, but for k = 1, …, m − n.
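To make models (1) and (2) concrete, the following Python sketch (synthetic data; the function name ar_fit is ours) estimates the forward AR parameters of one segment by least squares and evaluates the prediction error defined below in Eq. (4):

    import numpy as np

    def ar_fit(x, n):
        # Least-squares fit of the forward AR model (1):
        # x_k = sum_{i=1..n} a_i * x_{k-i} + e_k,  k = n+1..m.
        m = len(x)
        A = np.column_stack([x[n - i:m - i] for i in range(1, n + 1)])
        b = x[n:m]
        a, *_ = np.linalg.lstsq(A, b, rcond=None)
        return a, b - A @ a              # parameters and residues

    rng = np.random.default_rng(0)
    x = rng.standard_normal(150)         # one segment, m = 150 as in Sec. 4
    a, r = ar_fit(x, n=50)               # model size n = 50
    se = np.sqrt(np.sum(r**2) / (150 - 50))   # prediction error, Eq. (4)
    print(se)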
2.3. Symmetric model

The third AR model is symmetric, and thus has a lower prediction error for smooth signals. Supposing n is even, the adequate model is

x_k = \sum_{i=1}^{n/2} a_i x_{k-i} + \sum_{i=1}^{n/2} a_{n/2+i} x_{k+i} + e_k,   (3)

where e_k is the model error for k = n/2 + 1, …, m − n/2.

2.4. Model error

The three AR models above are easily comparable, because they produce an overdetermined system of M = m − n linear equations for n unknown variables a_1, …, a_n. The unknown parameters a_1, …, a_n were estimated by the method of least squares (LSQ) [6], and the residues r_1, …, r_M are determined as the difference between the observed value and the predicted value. The estimate of the prediction error inside the given segment is

s_e = \sqrt{\frac{\sum_{i=1}^{M} r_i^2}{m - n}}.   (4)

3. Fluctuation of the model error

Three basic characteristics were used to characterize EEG fluctuations: the standard deviation (STD), the mean of the absolute differences from the mean value (MAD1), and the mean of the absolute differences from the median value (MAD2). However, these characteristics are excessively sensitive to outlier values. We therefore preferred robust measures of EEG fluctuations: the median of the absolute differences from the median (MAD3), the interquartile range (IQR), and the first quartile of the absolute mutual differences (MED). Let N be the number of EEG signal segments. Let S = (S_1, S_2, …, S_N) be the vector of errors (4) in all segments. Let Q1, Q2, Q3, E be the first, second, and third quartile and the mean value functions. The fluctuation criteria are defined as

STD = (E(S − E(S))^2)^{1/2},   (5)
MAD1 = E(|S − E(S)|),   (6)
MAD2 = E(|S − Q2(S)|),   (7)
MAD3 = Q2(|S − Q2(S)|),   (8)
IQR = Q3(S) − Q1(S),   (9)
MED = Q1(|S_i − S_j|).   (10)

We obtained the STD, MAD1, MAD2, MAD3, IQR, and MED values of the model fluctuations of each channel for all AD and CN patients. The null hypothesis H0: µ_AD = µ_CN was tested via a two-sample t-test [7] against the alternative HA: µ_AD ≠ µ_CN. Here, µ_AD = E ln fluctuation (5–10) for the AD group and µ_CN = E ln fluctuation (5–10) for the CN group.
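A minimal Python sketch of the six criteria (5)–(10) (our own helper; for MED we assume that the mutual differences are taken over the pairs i < j, which the text leaves implicit):

    import numpy as np

    def fluctuations(s):
        # Traditional and robust fluctuation measures, Eqs. (5)-(10),
        # applied to the vector s of segment errors.
        s = np.asarray(s, dtype=float)
        q1, q2, q3 = np.percentile(s, [25, 50, 75])
        std  = np.sqrt(np.mean((s - s.mean())**2))
        mad1 = np.mean(np.abs(s - s.mean()))
        mad2 = np.mean(np.abs(s - q2))
        mad3 = np.median(np.abs(s - q2))
        iqr  = q3 - q1
        diffs = np.abs(s[:, None] - s[None, :])
        med  = np.percentile(diffs[np.triu_indices(len(s), k=1)], 25)
        return std, mad1, mad2, mad3, iqr, med

    print(fluctuations(np.random.default_rng(1).lognormal(size=400)))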
4. Experimental part

Groups of 26 AD and 139 CN patients were used for testing. We used the international 10–20 electrode system with a constant sampling frequency of 200 Hz. A predictive model (1), a back-predictive model (2), and a symmetric model (3) were identified, and the model errors (4) and their fluctuations were studied for segment length m = 150 and model size n = 50. The number of EEG segments varied patient by patient and satisfied the inequality 352 ≤ N ≤ 762. The significance level for the testing was α = 0.001. The hypotheses of mean equality were tested on 19 EEG channels, three predictive models, and six fluctuation characteristics. This is a kind of multiple testing, with 342 potentially dependent tests. The standard false discovery rate (FDR) methodology [8] was used to eliminate the acceptance of a false hypothesis. The corrected critical value was determined as α_FDR = 4.8347 · 10^−6. The t-test results (p-values) for traditional measures are included in Tab. 1; results for robust measures are collected in Tab. 2. In the original tables, bold font marked p-values below the critical probability α_FDR.

Table 1. Traditional fluctuation measures (p-values; within each model the columns are STD, MAD1, MAD2).

ch |     predictive           |   back-predictive        |     symmetric
 1 | 0.1027 0.0402 0.0314     | 0.0711 0.0338 0.0270     | 0.0503 0.0131 0.0074
 2 | 0.0065 0.0016 6.52·10^-4 | 0.0031 0.0012 5.06·10^-4 | 0.0015 2.16·10^-4 6.00·10^-5
 3 | 0.0121 0.0038 0.0014     | 0.0141 0.0035 0.0013     | 0.0172 0.0010 2.24·10^-4
 4 | 0.1408 0.0612 0.0337     | 0.1470 0.0540 0.0308     | 0.0635 0.0081 0.0029
 5 | 0.2551 0.1906 0.1277     | 0.2573 0.1690 0.1141     | 0.2063 0.0867 0.0441
 6 | 0.0643 0.0476 0.0275     | 0.0647 0.0391 0.0223     | 0.0417 0.0165 0.0064
 7 | 0.0279 0.0192 0.0103     | 0.0232 0.0166 0.0086     | 0.0288 0.0214 0.0091
 8 | 0.0917 0.1619 0.1290     | 0.0947 0.1474 0.1152     | 0.0572 0.0908 0.0664
 9 | 0.1780 0.2093 0.1512     | 0.1815 0.1797 0.1308     | 0.1862 0.0832 0.0504
10 | 0.6823 0.8572 0.8429     | 0.7739 0.8309 0.8136     | 0.6093 0.6226 0.5553
11 | 0.2358 0.1763 0.1203     | 0.2218 0.1540 0.1046     | 0.1234 0.0527 0.0255
12 | 0.0910 0.0598 0.0467     | 0.0924 0.0545 0.0446     | 0.0359 0.0285 0.0216
13 | 0.1183 0.2376 0.1806     | 0.0953 0.2120 0.1607     | 0.0997 0.1113 0.0602
14 | 0.1027 0.1964 0.1744     | 0.1114 0.1779 0.1595     | 0.0706 0.0827 0.0558
15 | 0.2297 0.2925 0.2539     | 0.2363 0.2521 0.2174     | 0.1673 0.1517 0.0985
16 | 0.4478 0.5942 0.5282     | 0.4009 0.5395 0.4806     | 0.3636 0.3136 0.2170
17 | 0.0680 0.1197 0.1094     | 0.0437 0.1070 0.0965     | 0.0288 0.0304 0.0175
18 | 0.0418 0.0634 0.0595     | 0.0545 0.0694 0.0654     | 0.0296 0.0299 0.0219
19 | 0.2875 0.3889 0.3288     | 0.2483 0.3431 0.2868     | 0.1506 0.1568 0.1025

Table 2. Robust fluctuation measures (p-values; within each model the columns are MAD3, IQR, MED).

ch |     predictive                |   back-predictive             |     symmetric
 1 | 0.0029 0.0265 0.0018          | 0.0035 0.0236 0.0021          | 2.6·10^-4 0.0025 2.3·10^-4
 2 | 6.9·10^-6 6.2·10^-5 4.8·10^-6 | 1.0·10^-5 5.1·10^-5 4.8·10^-6 | 3.9·10^-7 3.0·10^-6 5.1·10^-7
 3 | 3.5·10^-6 6.5·10^-5 3.5·10^-6 | 1.7·10^-6 4.2·10^-5 3.7·10^-6 | 1.8·10^-7 1.6·10^-6 3.0·10^-7
 4 | 2.4·10^-4 0.0019 2.2·10^-4    | 6.6·10^-4 0.0026 3.4·10^-4    | 4.8·10^-6 4.6·10^-5 9.1·10^-6
 5 | 0.0017 0.0124 0.0022          | 0.0038 0.0170 0.0025          | 1.4·10^-4 0.0015 2.1·10^-4
 6 | 2.9·10^-4 0.0047 4.1·10^-4    | 2.9·10^-4 0.0039 3.2·10^-4    | 2.4·10^-5 2.4·10^-4 3.2·10^-5
 7 | 2.4·10^-4 0.0025 1.9·10^-4    | 1.7·10^-4 0.0019 1.7·10^-4    | 2.9·10^-4 0.0014 2.6·10^-4
 8 | 0.0478 0.0787 0.0390          | 0.0495 0.0679 0.0406          | 0.0174 0.0384 0.0204
 9 | 0.0159 0.0490 0.0127          | 0.0130 0.0387 0.0123          | 0.0013 0.0066 0.0023
10 | 0.8614 0.6785 0.7914          | 0.8522 0.6462 0.7958          | 0.3281 0.2948 0.3613
11 | 0.0038 0.0227 0.0034          | 0.0021 0.0151 0.0031          | 1.8·10^-4 0.0015 2.4·10^-4
12 | 0.0054 0.0182 0.0082          | 0.0066 0.0201 0.0121          | 0.0051 0.0100 0.0083
13 | 0.0177 0.0722 0.0212          | 0.0201 0.0730 0.0219          | 7.0·10^-4 0.0085 6.9·10^-4
14 | 0.0713 0.1341 0.0873          | 0.0676 0.1056 0.0885          | 0.0040 0.0180 0.0053
15 | 0.0877 0.1351 0.0791          | 0.0581 0.0994 0.0631          | 0.0028 0.0139 0.0050
16 | 0.2547 0.2882 0.2583          | 0.2338 0.2464 0.2512          | 0.0131 0.0368 0.0195
17 | 0.0307 0.0740 0.0359          | 0.0277 0.0676 0.0338          | 4.3·10^-4 0.0038 4.7·10^-4
18 | 0.0511 0.1400 0.0317          | 0.0451 0.1313 0.0375          | 0.0032 0.0257 0.0033
19 | 0.0491 0.1666 0.0492          | 0.0448 0.1625 0.0439          | 0.0017 0.0253 0.0021

The null hypothesis was rejected only in channels 2, 3, and 4, which correspond to the frontal domain of the human brain. Only three robust fluctuation characteristics are significant: ln MAD3, ln IQR, and ln MED. The second channel is significant only for ln MED or symmetric prediction. The third channel is significant only for ln MED, ln MAD3 or symmetric prediction. The fourth channel is significant only for ln MAD3 together with symmetric prediction. Tab. 3 summarizes the results.

Table 3. Significant channels.

     | predictive | back-predictive | symmetric
MAD3 | 3          | 3               | 2, 3, 4
IQR  | –          | –               | 2, 3
MED  | 2, 3       | 2, 3            | 2, 3

The best p-value = 1.8885 · 10^−7 was obtained on the third channel for the symmetric model and the ln MAD3 criterion. Figure 1 shows its receiver operating characteristic (ROC) curve [9]. The area under the curve (AUC) is 0.77, which evaluates the model as good. The boxplot in Fig. 2 displays the differences between AD and CN patients.
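The FDR correction used above follows the step-up procedure of Benjamini and Hochberg [8]. A Python sketch of how such a corrected critical value arises (the p-values below are synthetic placeholders, not the study's data):

    import numpy as np

    def bh_critical(pvals, alpha=0.001):
        # Benjamini-Hochberg step-up rule [8]: the corrected critical
        # value is the largest ordered p_(k) with p_(k) <= k*alpha/T.
        p = np.sort(np.asarray(pvals))
        T = len(p)
        ok = p <= alpha * np.arange(1, T + 1) / T
        return p[ok].max() if ok.any() else 0.0

    rng = np.random.default_rng(2)
    pvals = np.concatenate([rng.uniform(size=330),
                            rng.uniform(0, 1e-6, size=12)])  # 342 tests
    print(bh_critical(pvals, alpha=0.001))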
5. Discussion

While the autoregressive model is linear and requires a stationary signal, the higher fluctuation of the model error in Alzheimer's subjects may reflect a different structure of brain microstates than in healthy subjects. It may reflect alterations in the anatomical cortical connectivity of the brain in resting-state networks. In contrast to applying robust methods and filters, the autoregressive linear model offers a simple and traditional solution that provides results with a sufficient level of significance. These results could also be influenced by the small group of testing data.

Figure 1. ROC curve for ln MAD3.

Figure 2. Boxplot diagram for ln MAD3 (AD vs. CN).

6. Conclusion

Using a symmetric predictive model of the EEG signal and the MAD3, IQR, and MED robust measures of predictive error fluctuations, we recognize significant differences between the AD and CN groups in the case of the frontal electrodes, which are represented by the second, third, and fourth channels of EEG. This result is directly applicable to the diagnosis of Alzheimer's disease. The accuracy of our method is 63.64 % with sensitivity 84.62 % and specificity 59.71 %. However, methods based on the modulation spectrum and the Hilbert transform [2] or on the visibility graph algorithm [3] are better in accuracy, sensitivity, and specificity.

Acknowledgements

This paper was created with support from CTU in Prague grant SGS11/165/OHK4/3T/14.

References

[1] Musso, F., Brinkmeyer, J., Mobascher, A., Warbrick, T., Winterer, G.: Spontaneous brain activity and EEG microstates. A novel EEG/fMRI analysis approach to explore resting-state networks. NeuroImage, vol. 52, no. 4, 2010, pp. 1149–1161.
[2] Falk, T. H., et al.: EEG amplitude modulation analysis for semiautomated diagnosis of Alzheimer's disease. EURASIP Journal on Advances in Signal Processing, 2012, 2012:192.
[3] Ahmadlou, M., Adeli, H., Adeli, A.: New diagnostic EEG markers of Alzheimer's disease using visibility graph. EURASIP Journal on Advances in Signal Processing, 2010, 2010:177.
[4] Niedermeyer, E., Lopes da Silva, F.: Electroencephalography: Basic Principles, Clinical Applications, and Related Fields. Lippincott Williams & Wilkins, 2005.
[5] Priestley, M. B.: Non-linear and Non-stationary Time Series Analysis. Academic Press, 1988.
[6] Björck, Å.: Numerical Methods for Least Squares Problems. SIAM, 1996.
[7] Meloun, M., Militký, J.: The Statistical Analysis of Experimental Data. Academia, 2004.
[8] Benjamini, Y., Hochberg, Y.: Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society, vol. 57, no. 1, 1995, pp. 289–300.
[9] Fawcett, T.: An introduction to ROC analysis. Pattern Recognition Letters, vol. 27, no. 8, 2006, pp. 861–874.

Acta Polytechnica 53(3):317–321, 2013
© Czech Technical University in Prague, 2013

Non-Hermitian Star-Shaped Quantum Graphs

Miloslav Znojil*

Nuclear Physics Institute ASCR, 250 68 Řež, Czech Republic
* Corresponding author: znojil@ujf.cas.cz

Abstract. A compact review is given, and a few new numerical results are added to the recent studies of the q-pointed one-dimensional star-shaped quantum graphs.
These graphs are assumed endowed with certain specific, manifestly non-Hermitian point interactions, localized either in the inner vertex or in all of the outer vertices and carrying, in the latter case, an interesting zero-inflow interpretation.

Keywords: quantum observables, hidden Hermiticity, non-local inner products, smearing requirement, discretized coordinates, exactly solvable quantum graphs.

1. The concept of quantum graphs

The scope of the traditional quantum theory is currently being broadened to cover not only the 1D motion of (quasi)particles, say, along a single thin wire, but also along more complicated one-dimensional graph-shaped structures G which are, by definition, composed of some N_W (finite or semi-infinite) wedges which may, though need not, be connected at some of the elements of the given set of N_V vertices. Under the influence of contemporary solid-state technologies these structures may be realized as ultrathin 1D-like waveguides with nontrivial topological, geometrical and/or physical properties. Nowadays, there exists an extensive literature on the subject (cf., e.g., [1]). From the formal point of view, one of the key features of these structures is the necessity of their description using the formalism of quantum mechanics. This necessity naturally opens a number of challenging questions which concern both the kinematics (for this reason, in a typical quantum graph model one assumes a free motion along the interior of the edges) and the dynamics (most often, one uses just point interactions which are localized strictly at the vertices).

2. The concept of crypto-Hermiticity

One of the most interesting mathematical questions addressed in the quantum-graph context concerns the proofs, or the sufficient conditions, of the self-adjoint nature of the corresponding Hamiltonian H = H(G) in a pre-selected, sufficiently "friendly" Hilbert space H^(F) of states. Recently, a new approach to this type of problem (carrying the nickname of PT-symmetric quantum mechanics) has been advocated and made popular by Carl Bender with multiple coauthors (cf. review [2]). The mathematical essence of their recommendation may be seen in the replacement of the a priori selected friendly or "first" Hilbert space H^(F) by another, ad hoc and Hamiltonian-dependent "second" Hilbert space H^(S), which only differs from H^(F) by a redefinition of the inner product in terms of an operator Θ called the "Hilbert space metric",

⟨φ|ψ⟩^(S) := ⟨φ|Θ|ψ⟩^(F) ≡ ⟨⟨φ|ψ⟩^(F)   (1)

(we use the Dirac-like notation conventions as introduced and used in the compact reviews [3–5]). As an immediate consequence one reveals, in general, that one may define a generic quantum system by a doublet of operators (i.e., by the Hamiltonian H and the metric Θ) which only have to satisfy the following generalized version of the standard and necessary Hermiticity requirement,

H†Θ = ΘH, \qquad 0 < Θ = Θ† < ∞,   (2)

which may be called a "disguised" Hermiticity or crypto-Hermiticity [6]. The introduction of this concept may be attributed, historically, to Dieudonné [7] and, in the context of (nuclear) physics, to Scholtz et al. [8].

3. The concept of crypto-Hermitian quantum graphs

Even for the most trivial, N_W = 1 quantum graphs with G = R it is by far not easy to satisfy the self-consistency condition (2). In essence, two technical problems are encountered.
Firstly, we must guarantee that the energy spectrum of our Hamiltonian is real (this is necessary for the probabilistic interpretation/tractability but, generically, the proof is difficult; see [9] for an illustrative example). Secondly, the compatibility (2) between H and Θ (studied, most often, as a brute-force construction of Θ = Θ(H)) is usually not an easy task either (for the above-mentioned example, various tedious approximative constructions may be found in the literature; cf. [10]).

Figure 1. A typical six-pointed star graph.

This being said, one of the most efficient ways of addressing these technical challenges has been described in [11] and generalized, to nontrivial graphs, in its sequel [12]. The guiding idea lay in the discretization of the graphs (see also [13] for a deeper analysis and discussion of this technique). Its use also opened the successful transition to the description of some topologically nontrivial graphs in [14]. Unfortunately, for the topologically more complicated crypto-Hermitian graphs, the advantages of the discretization technique (which, basically, converted the underlying Schrödinger equation into a finite-dimensional matrix problem) proved more or less lost due to the growth of the difficulties caused by the necessity of finding a suitable metric Θ = Θ(H). In this sense, the discrete elementary solvable examples described in [11, 12] appeared exceptional and exceptionally friendly. For this reason we returned, recently, to the study of continuous quantum graphs. In order to keep them tractable nonnumerically, we restricted our attention to the merely topologically trivial, star-shaped graphs in [15]. Here we intend to recall one of these models and add a few comments, yielding the conclusion that the study of PT-symmetric quantum graphs may be made sufficiently friendly not only via a simplification of the kinematics (i.e., by the discretization of the wedges) but also via an alternative extreme simplification of the dynamics, realized not by the selection of a sufficiently elementary form of the graph itself but rather by the use of symmetry-rich point interactions at the vertices.

4. The simplifications of kinematics

During the study of the broad variety of graph-related open questions one often selects models in which the underlying mathematics is simplified, to its very core, by the replacement of the real-world problem in 3D by its idealized representation by a piecewise linear 1D graph, on the wedges of which the Hamiltonian is represented by the Laplacean, H ∼ −Δ. In such a scenario one may feel inspired by the simplest systems which live on a single straight line, which may be finite (then the system is, in effect, a square well possessing just bound states) or infinite (then one usually speaks about the one-dimensional problem of scattering). In the most natural direction of a generalization of the theory one then glues a q-plet of half-lines in a single vertex and arrives at the star-shaped family of graphs G^(q) (see Fig. 1, where q = 6 and where all of the wedge lengths and angular distances of neighbors are equal). Naturally, the main principles of quantum mechanics remain unchanged. In particular, the bound states living on a given star graph G^(q) will be represented by the quadratically integrable wave functions ψ_n(x) ∈ L²(G),

∫_G dx\, ψ*_n(x) ψ_n(x) < ∞.   (3)
Whenever the corresponding friendly and trivial metric Θ^(F) = I is replaced by its nontrivial version defining the correct physical Hilbert space H^(S), the "naive" Eq. (3) must be replaced by the double integral which also defines the correct physical orthonormalization of the bound states,

∫_G dx ∫_G dx′\, ψ*_m(x)\, Θ(x, x′)\, ψ_n(x′) = δ_{mn}   (4)

where δ_{mn} is the Kronecker symbol. For Hamiltonians which are Hermitian in H^(F), the use of the so-called Dirac metric Θ(x, x′) = δ(x − x′) leads, of course, to the degenerate form (3) of Eq. (4).

Figure 2. The real-momentum-dependence of the two factors of the secular determinant of Eq. (10) in the subcritical non-Hermitian regime with α = 0.7.

In the most natural move towards a more realistic scenario one often tries to think about a nonzero thickness of the wedges. Of course, with the necessary transition to partial-differential Schrödinger equations this would make the eigenvalue problem highly nontrivial even for the straight-line 1D graph. In [11], an alternative idea of the smearing of the 1D wedges has therefore been proposed. Its realization relied heavily upon the discretization of the 1D graph, which reduced Eq. (2) to a mere N by N matrix problem. The key idea lay in the smearing of the wedges caused by the use of a smearing-providing metric Θ. Illustrative examples were constructed in the form of tridiagonal (or block-tridiagonal or block-pentadiagonal, etc.) matrices Θ (see also [16] for a more detailed account of this metric-mediated smearing idea as well as of its possible phenomenological interpretation and consequences).

5. The simplifications of dynamics

The discretization of the star-shaped graphs rendered it possible to construct the smearing-metric matrices Θ for various models of dynamics. For illustration, we may recall the one-parametric toy-model discrete-graph Hamiltonian of [11] with N = 4,

H^(4)(λ) =
[  2   −1    0    0 ]
[ −1    2  −1−λ   0 ]
[  0  −1+λ   2   −1 ]
[  0    0   −1    2 ],   (5)

which has been assigned there the four-parametric set of metrics

Θ^(4)(λ) = Θ^(4)[α_1, α_2, α_3, α_4](λ) = α_1 M_1 + α_2 M_2 + α_3 M_3 + α_4 M_4,   (6)

where

M_1 = diag(1−λ, 1−λ, 1+λ, 1+λ),

M_2 =
[  0   1−λ    0    0  ]
[ 1−λ   0   1−λ²   0  ]
[  0   1−λ²   0   1+λ ]
[  0    0   1+λ    0  ],

M_3 =
[ 0   0    1   0 ]
[ 0  1−λ   0   1 ]
[ 1   0   1+λ  0 ]
[ 0   1    0   0 ],

M_4 =
[ 0 0 0 1 ]
[ 0 0 1 0 ]
[ 0 1 0 0 ]
[ 1 0 0 0 ].   (7)

Obviously, in the light of the well-known spectral-expansion formula [17], the general N by N matrix of the metric contains N free real parameters, so that due to the independence of the components (7) the solution (6) of Eq. (2) is exhaustive whenever all four α_j are positive and whenever |λ| < 1. Moreover, the first component M_1 is diagonal (so that we may conclude that the energy spectrum remains real). At the same time, merely the last component cannot be treated as a smearing-mediating band matrix. Analogous (though, naturally, not so easily displayed) results were obtained, in [12], for the nontrivial discrete star graphs with q > 2. In place of the diagonal, tridiagonal, pentadiagonal (etc.) matrices (cf. Eq. (7)) we merely had to construct there the block-diagonal, block-tridiagonal, block-pentadiagonal (etc.) matrices, respectively.

Figure 3. The real-momentum-dependence of the two factors of the secular determinant of Eq. (10) in the strongly non-Hermitian regime with α = 1.
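The crypto-Hermiticity condition (2) for the toy model (5)–(7) is easy to verify numerically. A short Python sketch (our own construction; the particular values of λ and α_j below are arbitrary admissible choices):

    import numpy as np

    def H4(lam):
        # Toy-model Hamiltonian of Eq. (5)
        return np.array([[ 2.0, -1.0,       0.0,  0.0],
                         [-1.0,  2.0, -1 - lam,   0.0],
                         [ 0.0, -1 + lam,   2.0, -1.0],
                         [ 0.0,  0.0,      -1.0,  2.0]])

    def Theta4(lam, a):
        # Metric of Eq. (6) built from the components (7)
        M1 = np.diag([1 - lam, 1 - lam, 1 + lam, 1 + lam])
        M2 = np.array([[0, 1 - lam, 0, 0],
                       [1 - lam, 0, 1 - lam**2, 0],
                       [0, 1 - lam**2, 0, 1 + lam],
                       [0, 0, 1 + lam, 0]], dtype=float)
        M3 = np.array([[0, 0, 1, 0],
                       [0, 1 - lam, 0, 1],
                       [1, 0, 1 + lam, 0],
                       [0, 1, 0, 0]], dtype=float)
        M4 = np.fliplr(np.eye(4))
        return a[0]*M1 + a[1]*M2 + a[2]*M3 + a[3]*M4

    lam, a = 0.3, [1.0, 0.2, 0.1, 0.05]
    H, Th = H4(lam), Theta4(lam, a)
    print(np.allclose(H.T @ Th, Th @ H))        # crypto-Hermiticity, Eq. (2)
    print(np.all(np.linalg.eigvalsh(Th) > 0))   # positivity of the metric
                                                # for this parameter choice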
6. An illustrative, exactly solvable star-shaped 1D quantum graph

The main technical disadvantage of the discretization recipe lies in the quick weakening of its merits (viz., the tractability of the mathematics by means of linear algebra) with the growth of q and/or of the matrix dimensions (say, beyond N > 10 [12]). The occurrence of these limitations forced us to return, recently, to the continuous graphs of the star-shaped form sampled in Fig. 1. In [15], in particular, we omitted the "central-vertex" interaction (as used in [12]) completely. This means that just the most elementary Kirchhoff law has been postulated to hold in the x = L center of the graph, with the wedges oriented, conventionally, inwards,

ψ_j(L) = ψ_0(L), \quad j = 1, …, q − 1, \qquad \sum_{j=0}^{q-1} ∂_{x_j} ψ_j(L) = 0.   (8)

The simplification has been compensated by the "exotic"¹ external boundary conditions

∂_{x_j} ψ_j(0) = iα e^{ijϕ} ψ_j(0), \quad j = 0, 1, …, q − 1, \quad ϕ = 2π/q.   (9)

¹An anonymous referee of this paper correctly objected that these boundary conditions need not be considered exotic at all. The essence of such an objection may be found explained in paper [18], in which the authors show how, in the q = 2 special case, these boundary conditions may be given a natural and elementary physical interpretation in terms of a perfect-transmission external scattering. Naturally, the presence of the additional phase factors still leaves the net inflow into the whole graph equal to zero at any q > 2. Another extremely interesting aspect of models of a similar class has also been pointed out and discussed (though not yet published) by Stefan Rotter et al. [19].

In the units L = 1 we arrived at the q = 6 secular equation

\frac{k^6 − α^6 \tan^4 k}{k^6 + α^6 \tan^6 k}\, \tan k = 0.   (10)

This equation admits a straightforward graphical solution, which is sampled here in Figs. 2 and 3. A thorough inspection of these two pictures reveals that there exists a so-called Kato [20] exceptional-point value of the critical coupling, which may be localized to lie at α_min ≈ 0.7862806298. At this value of the coupling (which is, in our model, real) the number of real roots of the secular determinant changes. More precisely, the more detailed Fig. 2 shows that at the not too large value of the coupling, α = 7/10, there exist four distinct real zeros of the secular determinant in the interval k ∈ (0, 2). The subsequent inspection of Fig. 3 enables us to find out that after the transition to the overcritical value α = 1 the two lowest roots disappeared and, obviously, complexified (in an independent calculation, the position of these "missing roots" in the complex plane of k has been localized by an ad hoc, brute-force numerical method).
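A brute-force numerical cross-check of this root structure is easy in Python (a minimal sketch, assuming the reconstruction of Eq. (10) printed above; the grid scan is our own crude choice and should be checked against Figs. 2 and 3):

    import numpy as np

    def real_roots(alpha, k_max=2.0, steps=200_000):
        # Locate sign changes of the left-hand side of Eq. (10) on a
        # fine grid and return the midpoints of the bracketing cells.
        k = np.linspace(1e-6, k_max, steps)
        t = np.tan(k)
        F = t * (k**6 - alpha**6 * t**4) / (k**6 + alpha**6 * t**6)
        idx = np.flatnonzero(np.sign(F[:-1]) != np.sign(F[1:]))
        return 0.5 * (k[idx] + k[idx + 1])

    print(real_roots(0.7))   # the paper reports four real zeros in (0, 2)
    print(real_roots(1.0))   # beyond alpha_min ~ 0.7863 the two lowest
                             # roots leave the real axis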
7. Summary and outlook

The idea of the study of non-Hermitian star-shaped quantum graphs relates, equally strongly, to the studies of its non-Hermitian predecessors which remain topologically trivial (cf. their samples, say, in [2–4]) and to the studies of the topologically nontrivial, genuine q-pointed-star-shaped quantum graphs with q > 2, where the Hamiltonian H of the quantum system is assumed self-adjoint in one of the most common Hilbert-space forms of square-integrable wave functions (see, pars pro toto, an overall discussion of this approach in the recent paper [21], which also lists further references in this area). Each of these predecessors offers a slightly different motivation, which originated much more from the needs of phenomenology and quantum theory in the former case (cf., in particular, [2]) and which was able to make use of an extensive knowledge of the underlying mathematics and, in particular, of the existing theorems and results of functional analysis in the latter context. In this sense, at present as well as in the nearest future, the combination of the two ideas may be expected to share both their merits and weaknesses. In particular, the strengths may be expected to emerge due to the related new perspectives in applications in physics (cf., e.g., the new perspective which appears to be opening in experimental optics using newly developed anomalous metamaterials [22]). At the same time, the most important weak points of the theory may, understandably, be identified with the current lack of a reliable abstract mathematical understanding of some subtleties which naturally emerge not only in connection with the possible, uninhibited encounter of Kato's exceptional points, say, in the space of couplings, but also in connection with our, up to now fairly vague, current understanding of the representation theory of Hilbert spaces in which the operator of the metric is admitted to be non-trivial (cf., e.g., [23, 24] for further reading).

Acknowledgements

The support by GAČR grant P203/11/1433 is acknowledged. The author also appreciates discussions with Ondrej Turek and Taksu Cheon (Kochi University), with Sergii Kuzhel (AGH University in Cracow) and, last but not least, unidirectionally, with anonymous referees.

References

[1] P. Exner, J. P. Keating, P. Kuchment and A. Teplyaev, Analysis on Graphs and Its Applications (AMS, Rhode Island, 2008).
[2] C. M. Bender, Rep. Prog. Phys. 70 (2007) 947.
[3] M. Znojil, SIGMA 5, 001 (2009), arXiv:0901.0700.
[4] J. Železný, The Krein-Space Theory for Non-Hermitian PT-Symmetric Operators (MSc thesis, FNSPE CTU, 2011).
[5] P. Siegl, Non-Hermitian Quantum Models, Indecomposable Representations and Coherent States Quantization (PhD thesis, Univ. Paris Diderot & FNSPE CTU, 2011).
[6] M. Znojil, Acta Polytechnica 50 (2010) 62.
[7] J. Dieudonné, Proc. Int. Symp. Lin. Spaces (Pergamon, Oxford, 1961), p. 115.
[8] F. G. Scholtz, H. B. Geyer and F. J. W. Hahne, Ann. Phys. (NY) 213 (1992) 74.
[9] P. Dorey, C. Dunning and R. Tateo, J. Phys. A: Math. Theor. 40 (2007) R205 and Pramana – J. Phys. 73 (2009) 217.
[10] A. Mostafazadeh, Int. J. Geom. Meth. Mod. Phys. 7 (2010) 1191.
[11] M. Znojil, Phys. Rev. D 80 (2009) 045022.
[12] M. Znojil, Phys. Rev. D 80 (2009) 105004.
[13] P. Exner and K. Němcová, J. Phys. A: Math. Gen. 36 (2003) 10173.
[14] M. Znojil, J. Phys. A: Math. Theor. 43 (2010) 335303; M. Znojil, Int. J. Theor. Phys. 50 (2011) 1052 and 1614.
[15] M. Znojil, Can. J. Phys. 90 (2012) 1287.
[16] M. Znojil, Phys. Rev. D 80 (2009) 045009.
[17] M. Znojil, SIGMA 4, 001 (2008), arXiv:0710.4432v3.
[18] H. Hernandez-Coronado, D. Krejcirik and P. Siegl, Phys. Lett. A 375 (2011) 2149.
[19] S. Rotter, "Exceptional points in open and closed gain-loss structures", http://phhqp11.in2p3.fr/wednesday_29_files/rottersl12.pdf
[20] T. Kato, Perturbation Theory for Linear Operators (Springer, Berlin, 1966).
[21] L. P. Nizhnik, Methods Funct. Anal. Topology 18 (2012) 68.
[22] C. E. Rüter, R. Makris, K. G. El-Ganainy, D. N. Christodoulides, M. Segev and D. Kip, Nat. Phys. 6 (2010) 192.
[23] F. Bagarello and M. Znojil, J. Phys. A: Math. Theor.
45 (2012) 115311; A. Mostafazadeh, Phil. Trans. R. Soc. A 371 (2013) 20120050.
[24] P. Siegl and D. Krejcirik, Phys. Rev. D 86 (2012) 121702(R).

Acta Polytechnica Vol. 51 No. 4/2011

Extensions of Effect Algebra Operations

Z. Riečanová, M. Zajac

Abstract

We study the set of all positive linear operators densely defined in an infinite-dimensional complex Hilbert space. We equip this set with various effect-algebraic operations making it a generalized effect algebra. Further, sub-generalized effect algebras and interval effect algebras with respect to these operations are investigated.

Keywords: generalized effect algebra, effect algebra, Hilbert space, densely defined linear operators, extension of operations.

1 Introduction and some basic definitions and facts

The aim of this paper is to show that (generalized) effect algebras may be suitable and natural algebraic structures for sets of linear operators (including unbounded ones) densely defined on an infinite-dimensional complex Hilbert space H. In all cases, if the effect-algebraic sum of operators A, B is defined, then it coincides with the usual sum of operators in H. Effect algebras were introduced by D. Foulis and M. K. Bennet in 1994 [2]. The prototype for the abstract definition of an effect algebra was the set E(H) (Hilbert space effects) of all self-adjoint operators between the null and identity operators in a complex Hilbert space H. If a quantum mechanical system is represented in the usual way by a complex Hilbert space H, then the self-adjoint operators from E(H) represent yes-no measurements that may be unsharp. The subset P(H) of E(H) consisting of orthogonal projections represents yes-no measurements that are sharp. The abstract definition of an effect algebra follows the properties of the usual sum of operators in the interval [0, I] (i.e. between the null and identity operators in H), and it is the following.

Definition 1 (Foulis, Bennet [2]). A partial algebra (E; ⊕, 0, 1) is called an effect algebra if 0, 1 are two distinguished elements and ⊕ is a partially defined binary operation on E which satisfies the following conditions for any x, y, z ∈ E:
(E1) x ⊕ y = y ⊕ x if x ⊕ y is defined,
(E2) (x ⊕ y) ⊕ z = x ⊕ (y ⊕ z) if one side is defined,
(E3) for every x ∈ E there exists a unique y ∈ E such that x ⊕ y = 1 (we put x′ = y),
(E4) if 1 ⊕ x is defined then x = 0.

Immediately after 1994 the study of generalizations of effect algebras (without the top element 1) was started by several authors (Foulis and Bennet [2], Kalmbach and Riečanová [4], Hedlíková and Pulmannová [3], Kôpka and Chovanec [5]).
It was found that all these generalizations coincide, and their common definition is the following:

Definition 2. A generalized effect algebra (E; ⊕, 0) is a set E with an element 0 ∈ E and a partial binary operation ⊕ satisfying, for any x, y, z ∈ E, the conditions
(GE1) x ⊕ y = y ⊕ x if one side is defined,
(GE2) (x ⊕ y) ⊕ z = x ⊕ (y ⊕ z) if one side is defined,
(GE3) if x ⊕ y = x ⊕ z then y = z,
(GE4) if x ⊕ y = 0 then x = y = 0,
(GE5) x ⊕ 0 = x for all x ∈ E.

In every (generalized) effect algebra E a partial order ≤ and a binary operation ⊖ can be introduced as follows: for any a, b ∈ E, a ≤ b and b ⊖ a = c iff a ⊕ c is defined and a ⊕ c = b. If the elements of a (generalized) effect algebra E are positive linear operators in a given infinite-dimensional complex Hilbert space, then E is called an operator (generalized) effect algebra.

Throughout the paper we assume that H is an infinite-dimensional complex Hilbert space, i.e., a linear space with an inner product (·, ·) which is complete in the induced metric. Recall that here, for any x, y ∈ H, we have (x, y) ∈ C (the set of all complex numbers) such that (x, αy + βz) = α(x, y) + β(x, z) for all α, β ∈ C and x, y, z ∈ H. Moreover, (x, y) = \overline{(y, x)} and (x, x) ≥ 0, with (x, x) = 0 iff x = 0. The term dimension of H in the following always means the Hilbertian dimension, i.e. the cardinality of any orthonormal basis of H (see [1, p. 44]). For notions and results on Hilbert space operators we refer the reader to [1]. We will assume that the domains D(A) of all considered linear operators A are dense linear subspaces of H (in the metric topology induced by the inner product). We say that the operators A are densely defined on H. The set of all densely defined linear operators on H will be denoted by L(H).

Recall that A : D(A) → H is a bounded operator if there exists a real constant C > 0 such that ‖Ax‖ ≤ C‖x‖ for all x ∈ D(A). If A is not bounded, then it is called unbounded. Let T ∈ L(H). Since D(T) is dense in H, for any y ∈ H there is at most one y* ∈ H satisfying (y, Tx) = (y*, x) for all x ∈ D(T). This allows us to define the adjoint T* of T by putting D(T*) = {y ∈ H | there exists y* ∈ H such that (y, Tx) = (y*, x) for all x ∈ D(T)} and T*y = y*. The operator T is said to be self-adjoint if T = T*. An operator T ∈ L(H) is called symmetric if (Tx, y) = (x, Ty) for all x, y ∈ D(T). It is well known that this is equivalent to (Tx, x) ∈ R for all x ∈ D(T). Clearly every self-adjoint operator is symmetric, but the converse need not hold for unbounded operators (see [1], p. 98). Since every (generalized) effect algebra includes the zero element 0 as the least element of E, we will assume that all considered operators are positive (written A ≥ 0). This means that (Ax, x) ≥ 0 for all x ∈ D(A), and hence A is also symmetric, i.e. (Ax, y) = (x, Ay) for all x, y ∈ D(A) (see [1, pp. 68 and 94]). For two operators A : D(A) → H and B : D(B) → H we write A ⊂ B iff D(A) ⊂ D(B) and Ax = Bx for all x ∈ D(A). Then B is called an extension of A. We show some examples of partial binary operations (sums) on the set V(H) of all positive linear operators densely defined on an infinite-dimensional Hilbert space H. Our main goal is to study the properties of sub-generalized effect algebras, and of effect algebras arising as intervals in V(H), when V(H) is equipped with two different partial sums such that one of them is an extension of the other.
2 Some properties of unbounded operators in complex Hilbert spaces

Before defining (generalized) effect algebras consisting of operators A ∈ L(H), we review some necessary results on Hilbert space operators [1, Chapter 4].

Theorem 3. Let D_1 ⊂ D_2 ⊂ H be linear subspaces with \overline{D_1} = \overline{D_2} = H. Let A ∈ L(H) with D(A) = D_2, let its restriction A_1 = A|_{D_1} = 0, and let (Ax, x) ∈ R for all x ∈ D_2. Then A = 0.

Proof. If A ≠ 0, then there exists e ∈ D_2 \ D_1 for which Ae ≠ 0. For all d ∈ D_1 and all λ ∈ C,

Ad = 0, \quad d + λe ∈ D_2, \quad (A(d + λe), d + λe) = λ(Ae, d) + |λ|²(Ae, e) ∈ R.   (1)

Since D_1^⊥ = {0} and Ae ≠ 0, we can choose d_1 ∈ D_1 for which (Ae, d_1) ≠ 0. Then, by (1), λ(Ae, d_1) + |λ|²(Ae, e) ∈ R for all λ ∈ C. Since the second summand is real, the first one must be real for all λ ∈ C. However, this is possible only if (Ae, d_1) = 0, a contradiction.

Corollary 4. If A ∈ L(H), D = D(A) ≠ H, is a symmetric bounded operator and B ∈ L(H) is its symmetric extension, then B is also bounded.

Proof. Let B be a proper symmetric extension of A and let Ã be the unique bounded extension of A. Then (B − Ã)|_D = 0 and, by Theorem 3, (B − Ã)|_{D(B)} = 0. It follows that B = (B − Ã)|_{D(B)} + Ã|_{D(B)} = Ã|_{D(B)} is a bounded linear operator.

Theorem 5. Let A ∈ L(H) and let there exist k > 0 such that |(Ax, x)| ≤ k‖x‖² for all x ∈ D(A). Then A is bounded.

Proof. Using the polarization identity

(Ax, y) = \frac{1}{4}\left\{(A(x+y), x+y) − (A(x−y), x−y) + i\left[(A(ix+y), ix+y) − (A(ix−y), ix−y)\right]\right\},

we obtain, for any x, y ∈ D(A) with ‖x‖ = ‖y‖ = 1, that

|(Ax, y)| ≤ \frac{1}{4}\left\{|(A(x+y), x+y)| + |(A(x−y), x−y)| + |(A(ix+y), ix+y)| + |(A(ix−y), ix−y)|\right\} ≤ \frac{k}{4}\left\{‖x+y‖² + ‖x−y‖² + ‖ix+y‖² + ‖ix−y‖²\right\} = \frac{k}{4}\left\{2‖x‖² + 2‖y‖² + 2‖ix‖² + 2‖y‖²\right\} = k(‖x‖² + ‖y‖²) = 2k.   (2)

This means that the quadratic form (Ax, y) is bounded, i.e.

|(Ax, y)| ≤ 2k‖x‖‖y‖ \quad \text{for all } x, y ∈ D(A).   (3)

It follows that for any fixed x ∈ D(A) the linear functional ϕ(y) = (Ax, y) is bounded and defined on D(A); therefore ϕ̃ : \overline{D(A)} = H → C, ϕ̃(y) = (Ax, y) for all y ∈ H, is its unique bounded linear extension and ‖ϕ̃‖ = ‖ϕ‖ ≤ 2k‖x‖. Now, putting y = Ax, we obtain, by (3), ‖Ax‖² = (Ax, Ax) = ϕ̃(Ax) ≤ 2k‖x‖‖Ax‖, whence ‖Ax‖ ≤ 2k‖x‖.

Corollary 6. Let A, B be nonnegative densely defined linear operators having the same domain D. If A + B is bounded, then both A and B are bounded.

Proof. It suffices to observe that 0 ≤ (Ax, x) ≤ (Ax, x) + (Bx, x) = ((A + B)x, x) ≤ ‖A + B‖‖x‖², and then, by Theorem 5, A is bounded. By the same reasoning we obtain that B is bounded.

3 Extensions of effect algebra operations

It is well known that if a set E includes two distinguished elements 0, 1, then there may exist more than one partial binary operation ⊕ on E making E an effect algebra.

Example 7. Let E = {0, a, b, 1} and let us define the operations ⊕_1, ⊕_2 on E as follows:
⊕_1: a ⊕_1 b = b ⊕_1 a = 1 and 0 ⊕_1 x = x ⊕_1 0 = x for all x ∈ E;
⊕_2: a ⊕_2 a = b ⊕_2 b = 1 and 0 ⊕_2 x = x ⊕_2 0 = x for all x ∈ E.

Then (E; ⊕_1, 0, 1), (E; ⊕_2, 0, 1) are effect algebras with the same set of elements. On the other hand, (E; ⊕_1, 0, 1) is a Boolean algebra, while (E; ⊕_2, 0, 1) is a horizontal sum of the two chains {0, a, a ⊕_2 a = 1} and {0, b, b ⊕_2 b = 1}; hence in this case the elements a, b are noncompatible.
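Both operations of Example 7 can be verified against axioms (E1)–(E4) by brute force. A small Python sketch (the encoding of the partial tables is ours; undefined sums are represented by None, and the associativity test below checks the stronger property that both sides of (E2) are simultaneously defined and equal, which these two examples satisfy):

    from itertools import product

    E = ['0', 'a', 'b', '1']

    def make(table):
        t = dict(table)
        for x in E:                      # 0 acts neutrally in both cases
            t[('0', x)] = t[(x, '0')] = x
        return t

    plus1 = make({('a', 'b'): '1', ('b', 'a'): '1'})
    plus2 = make({('a', 'a'): '1', ('b', 'b'): '1'})

    def check(t):
        f = lambda x, y: t.get((x, y))   # None encodes "undefined"
        e1 = all(f(x, y) == f(y, x) for x, y in product(E, E))
        e2 = all(f(f(x, y), z) == f(x, f(y, z))
                 for x, y, z in product(E, E, E))
        e3 = all(sum(f(x, y) == '1' for y in E) == 1 for x in E)
        e4 = all(f('1', x) is None for x in E if x != '0')
        return e1, e2, e3, e4

    print(check(plus1), check(plus2))    # (True, True, True, True) twice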
We obtain special cases of two operations ⊕_1 ≠ ⊕_2 on the same underlying set E if ⊕_2 extends ⊕_1:

Definition 8. Let (E; ⊕_1, 0) and (E; ⊕_2, 0) be generalized effect algebras. We say that the operation ⊕_2 extends ⊕_1 (written ⊕_1 ⊂ ⊕_2) if for any a, b ∈ E the existence of a ⊕_1 b implies that a ⊕_2 b exists and a ⊕_2 b = a ⊕_1 b.

Lemma 9. Let E_1 = (E; ⊕_1, 0), E_2 = (E; ⊕_2, 0) be generalized effect algebras and ⊕_1 ⊂ ⊕_2. Then
(i) if G ⊆ E is a sub-generalized effect algebra of E_2, then G is also a sub-generalized effect algebra of E_1;
(ii) if ≤_1 and ≤_2 are the partial orders on E derived from ⊕_1 and ⊕_2, respectively, then, for a, b ∈ E, a ≤_1 b implies a ≤_2 b;
(iii) for intervals in E_1, E_2 the following inclusion holds: [0, q]_{E_1} ⊆ [0, q]_{E_2} for any nonzero q ∈ E.

Proof. The proof obviously follows from the fact that, for any a, b ∈ E, the existence of a ⊕_1 b implies a ⊕_2 b = a ⊕_1 b. Let us prove, e.g., (i): let a, b, c ∈ E with a ⊕_1 b = c and assume that at least two of the elements a, b, c are in G. Since ⊕_1 ⊂ ⊕_2 implies a ⊕_2 b = c, and G is a sub-effect algebra of E_2, we obtain a, b, c ∈ G. Hence G is a sub-generalized effect algebra of E_1.

The following example shows that the converses of assertions (i)–(iii) do not hold.

Example 10. Let E = {0, 1, 2, …} and G = {0, 1, 2, 4, 6, …}. Define the partial binary operations ⊕_1 and ⊕_2 for a, b ∈ E by

a ⊕_1 b = a + b if a = 0 or both a, b are even; a ⊕_1 b is not defined otherwise,   (4)
a ⊕_2 b = a + b for all a, b ∈ E,   (5)

and let ≤_1 and ≤_2 be the corresponding derived partial orders. Then obviously ⊕_1 ⊂ ⊕_2 and
(i) E_1 = (E, ⊕_1, 0) and E_2 = (E, ⊕_2, 0) are generalized effect algebras;
(ii) G is a sub-generalized effect algebra of E_1;
(iii) G is not a sub-generalized effect algebra of E_2;
(iv) there exist a, b ∈ E for which a ≤_2 b but a ≰_1 b;
(v) there exists q ∈ E for which [0, q]_{E_2} ⊄ [0, q]_{E_1}.

Let us prove conditions (i)–(v).
(i) Let us show that ⊕_1 is associative. Suppose that for a, b, c ∈ E the sum (a ⊕_1 b) ⊕_1 c exists. First, if c = 0 then (a ⊕_1 b) ⊕_1 c = a ⊕_1 b = a ⊕_1 (b ⊕_1 c). Next, if c ≠ 0 then either a = b = 0 or both c and a + b are even. If a = b = 0 then (a ⊕_1 b) ⊕_1 c = a ⊕_1 (b ⊕_1 c) is obvious. The sum a + b would also be even if both a, b were odd, but this is impossible because a ⊕_1 b exists. So both a, b are even and then, again, (a ⊕_1 b) ⊕_1 c = a ⊕_1 (b ⊕_1 c) is obvious. The rest of the proof of (i) is obvious.
(ii) Suppose that a, b, c ∈ E satisfy a ⊕_1 b = c. If a, b ∈ G then clearly c ∈ G as well. If a, c ∈ G and a = 0, then b = c, hence b ∈ G. If a, c ∈ G are nonzero, then both are even and then b is even as well. So, if two of the elements a, b, c are in G, then all three are in G. This proves (ii).
(iii) Put a = 1, b = 2. Now a, b ∈ G and a ⊕_2 b = 3 ∉ G, which shows that G is not a sub-generalized effect algebra of E_2.
(v) Clearly, [0, q]_{E_1} = {0, q} for any odd q. So, e.g., [0, 3]_{E_2} = {0, 1, 2, 3} ⊄ [0, 3]_{E_1}.
(iv) Obviously (v) implies (iv).

4 Extensions of operator effect algebra operations

We introduce examples of operator generalized effect algebras with the same set of elements and different operations. Moreover, we consider intervals in these algebras. A generalized effect algebra whose elements are positive linear operators on a complex Hilbert space H is called an operator generalized effect algebra.

Definition 11. Assume that H is an infinite-dimensional Hilbert space and that A ∈ L(H) is positive. Let D denote the set of all dense linear subspaces of H and
(i) V(H) = {A ∈ L(H) | A ≥ 0, D(A) = H if A is bounded and D(A) ∈ D if A is unbounded};
(ii) G_D(H) = {A ∈ V(H) | A is bounded, or D(A) = D if A is unbounded}, D ∈ D.
(iii) Let ⊕ be the partial binary operation on V(H) defined by: for A, B ∈ V(H), A ⊕ B is defined and A ⊕ B = A + B (the usual sum) iff at least one of the operators A, B is bounded. The triple (V(H); ⊕, 0) will be denoted by V(H) for short.
(iv) Let ⊕_D be the partial binary operation on V(H) defined by: for A, B ∈ V(H), A ⊕_D B is defined and A ⊕_D B = A + B (the usual sum) iff either at least one of A, B is bounded, or D(A) = D(B) ∈ D if both are unbounded. The triple (V(H); ⊕_D, 0) will be denoted by V_D(H) for short.

From the above definition it is clear that V(H) = ⋃{G_D(H) | D ∈ D} and that, for D_1 ≠ D_2, it holds that G_{D_1}(H) ∩ G_{D_2}(H) = B^+(H) = {A ∈ V(H) | A is bounded with D(A) = H}. Moreover, for any A, B ∈ V(H), if A ⊕ B is defined, then A ⊕_D B is defined and A ⊕_D B = A ⊕ B = A + B. Hence ⊕ ⊂ ⊕_D. Recently, in [6, 8], the following theorems were proved.

Theorem 12 [6]. For V(H) from Definition 11 it holds that:
(i) (V(H); ⊕, 0) is a generalized effect algebra.
(ii) If Sp(H) = {A ∈ V(H) | A = A*} is equipped with ⊕|_{Sp(H)}, i.e., for A, B ∈ Sp(H) there exists A ⊕|_{Sp(H)} B = A ⊕ B iff there exists A ⊕ B in V(H), then (Sp(H); ⊕|_{Sp(H)}, 0) is a sub-generalized effect algebra of (V(H); ⊕, 0).

Theorem 13 [8]. Using the notation from Definition 11 we obtain:
(i) (V(H); ⊕_D, 0) is a generalized effect algebra.
(ii) Let D ∈ D be a fixed dense subspace of H. Let G_{D,D}(H) = {A ∈ V(H) | either A is bounded with D(A) = H, or A is unbounded with D(A) = D}. For A, B ∈ G_{D,D}(H) let A ⊕_D|_{G_{D,D}(H)} B = A ⊕_D B if there exists A ⊕_D B ∈ G_{D,D}(H); otherwise A ⊕_D|_{G_{D,D}(H)} B is not defined. Then (G_{D,D}(H); ⊕_D|_{G_{D,D}(H)}, 0) is a sub-generalized effect algebra of (V(H); ⊕_D, 0).

Now, because ⊕ ⊂ ⊕_D on V(H) and, for every fixed D ∈ D, G_{D,D}(H) is a sub-generalized effect algebra of (V(H); ⊕_D, 0), we obtain that G_{D,D}(H) is also a sub-generalized effect algebra of (V(H); ⊕, 0). Clearly, the intersection of two sub-generalized effect algebras of (V(H); ⊕, 0) is again a sub-generalized effect algebra of it. Thus we obtain the following corollary of Theorems 12 and 13.

Theorem 14. For every fixed D ∈ D, let Sp_D(H) = Sp(H) ∩ G_{D,D}(H) = {A ∈ V(H) | A = A* and either D(A) = H if A is bounded, or D(A) = D if A is unbounded}. Let ⊕|_{Sp_D(H)} on Sp_D(H) be defined as follows: for A, B ∈ Sp_D(H), A ⊕|_{Sp_D(H)} B = A ⊕ B iff A ⊕ B is defined in V(H). Then (Sp_D(H); ⊕|_{Sp_D(H)}, 0) is a sub-generalized effect algebra of (Sp(H); ⊕, 0).

The above observations show that the generalized effect algebra V_D(H) is a pasting of its sub-generalized effect algebras G_{D,D}(H) through the sub-generalized effect algebra B^+(H) = {A ∈ V(H) | A is bounded}. Consequently, the generalized effect algebra Sp(H) is a pasting of its sub-generalized effect algebras Sp_D(H) through the sub-generalized effect algebra B^+(H). More precisely:

Theorem 15 (Pasting theorem).
(i) For the generalized effect algebra V_D(H) = (V(H); ⊕_D, 0) and its sub-generalized effect algebras G_{D,D}(H), D ∈ D, and B^+(H):
(1) G_{D,D_1}(H) ∩ G_{D,D_2}(H) = B^+(H) for every D_1, D_2 ∈ D, D_1 ≠ D_2;
(2) V_D(H) = ⋃{G_{D,D}(H) | D ∈ D}.
(ii) For the generalized effect algebra Sp(H) and its sub-generalized effect algebras Sp_D(H), D ∈ D:
(1) Sp_{D_1}(H) ∩ Sp_{D_2}(H) = B^+(H) for every D_1, D_2 ∈ D, D_1 ≠ D_2;
(2) Sp(H) = ⋃{Sp_D(H) | D ∈ D}.

Remark 16. It is worth noting that the operations ⊕, ⊕_D are the usual sum of operators in H, applied only partially, to certain pairs A, B ∈ V(H). Therefore all the effect algebra operations ⊕, ⊕_D, ⊕|_{Sp(H)}, ⊕_D|_{G_{D,D}(H)}, ⊕|_{Sp_D(H)} coincide with the usual sum of operators A + B whenever the corresponding effect-algebraic sum of A and B exists.
5 intervals in generalized effect algebras of self-adjoint operators assume that (e; ⊕, 0) is a generalized effect algebra. for any q ∈ e, q �= 0, let [0, q]e = {a ∈ e | there exists b ∈ e with a ⊕ b = q} be an interval in (e; ⊕, 0]. we will denote by ⊕|[0,q]e the partial binary operation on [0, q]e defined as follows: 76 acta polytechnica vol. 51 no. 4/2011 for a, b ∈ [0, q]e the sum a ⊕|[0,q]e b = a ⊕ b iff a ⊕ b is defined in e and a ⊕ b ∈ [0, q]e . it is known that [0, q]e equipped with ⊕|[0,q]e is an effect algebra (see, e.g. [7]). in this way, for all nonzero q ∈ v(h), we obtain the operator effect algebras( [0, q]v(h); ⊕|[0,q]v(h) , 0, q ) and ( [0, q]vd(h); ⊕d|[0,q]vd(h) , 0, q ) (see [8]). by definition of ⊕, ⊕d and the results of section 2, it is clear that for q ∈ v(h) with d(q) = d ∈ d we have [0, q]v(h) ⊂ gd (h) and [0, q]vd(h) ⊂ gd,d(h). the same is true for any q ∈ sp(h), q �= 0, d(q) = d. namely, [0, q]sp(h) = [0, q]sp,d(h) and ( [0, q]sp,d(h); ⊕|[0,q]sp,d(h) , 0, q ) , d ∈ d, are effect algebras of positive self-adjoint operators a ≤d q, where ≤d is the partial order on sp(h) derived from operation ⊕|sp,d(h). since d(q) = d is dense in h, the next theorem 18 about states on intervals is sp(h) (hence in sp,d (h)) can be proved by the same argument as theorem 7 in [8] for states on intervals in gd,d(h). definition 17 let e be an effect algebra. (i) a map ω : e → [0, 1] is called a state on e if 1. ω(0) = 0, ω(1) = 1, 2. ω(a ⊕ b) = ω(a) + ω(b) for all a, b ∈ e with a ⊕ b defined in e. 3. a state ω is faithful if ω(a) = 0 implies a = 0. 4. a set m of states is called an ordering set of states on e if a ≤ b iff ω(a) ≤ ω(b) for all ω ∈ m, a, b ∈ e. theorem 18 let d ∈ d and q ∈ sp,d(h), q �= 0. then (i) there exists x̃ ∈ d(q) such that cx̃ = (qx̃, x̃) > 0. (ii) the mapping ωx̃ : [0, q]sp,d(h) → [0, 1] ⊂ r given by ωx̃(a) = 1 cx̃ (ax̃, x̃) for every a ∈ [0, q]sp,d(h) is a state. (iii) if d0 = {x ∈ d(q) | cx = (qx, x) > 0} then m = {ωx | x ∈ d0} is an ordering set of states on [0, q]sp,d(h). (iv) if h is separable, then there exists a faithful state ω : [0, q]sp,d(h) → [0, 1]. acknowledgement supported by grants vega 1/0297/11 and vega 1/0021/10 of the ministry of education of the slovak republic. references [1] blank, j., exner, p., havĺıček, m.: hilbert space operators in quantum physics (second edition). springer, 2008. [2] foulis, d. j., bennet, m. k.: effect algebras and unsharp quantum logics, found. phys. 24 (1994), 1 331–1 352. [3] hedĺıková, j., pulmannová, s.: generalized difference posets and orthoalgebras, acta math. univ. comenianae 45 (1996), 247–279. [4] kalmbach, g., riečanová, z.: an axiomatization for abelian relative inverses, demonstratio math. 27 (1996), 769–780. [5] kôpka, f., chovanec, f.: d-posets, math. slovaca 44 (1994), 21–34. [6] riečanová, z.: effect algebras of positive selfadjoint operators densely defined on hilbert spaces, preprint. [7] riečanová, z.: subalgebras, intervals and central elements of generalized effect algebras. international journal of theoretic physics 38 (1999), 3 209–3 220. [8] riečanová, z., zajac, m., pulmannová, s.: effect algebras of positive linear operators densely defined on hilbert spaces, reports of mathematical physics, to appear. zdenka riečanová e-mail: zdenka.riecanova@stuba.sk m. 
zajac e-mail: zajacm@stuba.sk department of mathematics faculty of electrical engineering and information technology stu ilkovičova 3, sk-81219 bratislava 77 acta polytechnica doi:10.14311/ap.2013.53.0878 acta polytechnica 53(6):878–882, 2013 © czech technical university in prague, 2013 available online at http://ojs.cvut.cz/ojs/index.php/ap a lightning conductor monitoring system based on a wireless sensor network jan mikeša,∗, ondrej kreibichb, jan neužilb a department of economics, management and humanities, faculty of electrical engineering, czech technical university in prague, technická 2, cz-16627 prague, prague 6, czech republic b department of measurement, faculty of electrical engineering, czech technical university in prague, technická 2, cz-16627 prague, prague 6, czech republic ∗ corresponding author: mikes.jan@fel.cvut.cz abstract. automated heating, lighting and irrigation systems are nowadays standard features of industrial and commercial buildings, and are also increasingly found in ordinary housing. in addition to the benefits of user comfort, automated technology for buildings saves energy and, above all, it provides enhanced protection against leakage of water and hazardous gases, and against fire hazards. lightning strikes are a natural phenomenon that poses a significant threat to the safety of buildings. the statistics of the fire and rescue service of the czech republic show that buildings are in many cases inadequately protected against lightning strikes, or that systems have been damaged by previous strikes. a subsequent strike can occur within the period between regular inspections, which are normally made at intervals of 2–4 years. over the whole of europe, thousands of buildings are subjected to the effects of direct lightning strikes each year. this paper presents ways to carry out wireless monitoring of lightning strikes on buildings and to deal with their impact on lightning conductors. by intervening promptly (disconnecting the power supply, disconnecting the gas supply, sending an engineer to inspect the structure, submitting a report to arc, etc.) we can prevent many downstream effects of direct lightning strikes on buildings (fires, electric shocks, etc.) this paper introduces a way to enhance contemporary home automation systems for monitoring lightning strikes based on wireless sensor networks technology. keywords: lightning protection, lightning monitoring, wireless sensor networks, lightning counter sensor node. 1. introduction lightning discharges are highly unpredictable and uncontrollable natural phenomena, and their direct and indirect effects can have destructive consequences for structures. due to the low probability of a strike, owners and managers of buildings often neglect to ensure that they are safely protected from direct lightning strikes. the building is thus exposed to the risk of being struck by a lightning current without any protection. other buildings are located in places where lightning strikes so frequently that a direct hit on the building is almost inevitable, and may even be repeated several times a year [1]. the only currently used protection systems for residential, commercial and industrial buildings involve conducting lightning discharges from the point of the strike through the catchment system safely into the ground. safe operation of this system depends on the condition in which it is maintained. 
statistics [2] indicate that a direct lightning strike can cause a fire or, more commonly, can destroy electrical appliances and consumer electronics even in a building with a protective conductor [3]. our study addresses the issue of monitoring lightning strikes on buildings and processing this information online. a direct lightning strike can discharge a current of tens to hundreds of ka. the strike can have a considerable dynamic and thermal impact on components of the protection system. all kinds of mechanical joints are vulnerable. the conductive connection to the grounding system may be damaged, and the conductors and the catchment equipment itself may be mechanically and thermally damaged.

regular inspections of the lightning conductor system are covered in annex e of čsn en 62305-3 ed. 2, which specifies periodic inspections at intervals of 2–4 years. the time intervals are defined according to the protection classification of the structure (commercial buildings are mostly in class ii and class iii), based on an analysis of the risk of harm as defined in čsn en 62305. in the czech republic, the applicable procedures are the čsn 33 1500 standard with valid change z4 and the initial inspection according to čsn 33 2000-6-61 ed. 2 (332000), electrical installations of buildings – part 6-61: revision – initial revision. the čsn en 62305 standard is a translation of the european standard, so equivalent rules apply in the countries of the european union. however, experience suggests that an interval of 2 to 4 years between inspections may be too long.

table 1. maximum period between inspections of a lightning protection system, adopted from iec 62305-3 [4].

protection level | visual inspection [year] | complete inspection [year] | complete inspection in critical situations^a,b [year]
i and ii         | 1                        | 2                          | 1
iii and iv       | 2                        | 4                          | 1

^a lightning protection systems utilized in applications involving structures with a risk caused by explosive materials should be visually inspected every 6 months. electrical testing of the installation should be performed once a year. an acceptable exception to the yearly test schedule is to perform the tests on a 14 to 15 month cycle where it is considered beneficial to conduct earth resistance testing over different times of the year to get an indication of seasonal variations.
^b critical situations could include structures containing sensitive internal systems, office blocks, commercial buildings or places where a large number of people may be present.

table 2. lightning strikes on structures; statistics for 2008–2012 [5]. causes: w – buildings with protection, wo – buildings without protection.

year | cause          | number of fires | ratio [%] | damage [thousands of czk] | ratio [%] | fatalities | injuries
2012 | lightning – w  | 13              | 0.06      | 8 953.0                   | 0.31      | 0          | 1
2012 | lightning – wo | 30              | 0.15      | 10 727.0                  | 0.37      | 0          | 5
2011 | lightning – w  | 14              | 0.07      | 26 159.7                  | 1.17      | 0          | 0
2011 | lightning – wo | 31              | 0.15      | 18 994.5                  | 0.85      | 0          | 5
2010 | lightning – w  | 13              | 0.07      | 3 041.0                   | 0.16      | 0          | 0
2010 | lightning – wo | 24              | 0.13      | 6 912.3                   | 0.35      | 0          | 3
2009 | lightning – w  | 12              | 0.06      | 2 067.0                   | 0.10      | 0          | 1
2009 | lightning – wo | 29              | 0.14      | 14 700.5                  | 0.68      | 0          | 1
2008 | lightning – w  | 9               | 0.04      | 1 385.0                   | 0.04      | 0          | 1
2008 | lightning – wo | 32              | 0.15      | 10 455.0                  | 0.32      | 0          | 1

when a building, its lightning conductor and grounding system are struck by lightning, the protective system can be damaged to such an extent that it may not be able to provide protection against a subsequent lightning strike.
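as a quick arithmetical reading of table 2 (an editorial sketch, not part of the original statistics), summing the 2008–2012 rows shows roughly 2.4 times as many lightning-caused fires at unprotected buildings as at protected ones:

```python
# totals from table 2 (2008-2012): fires at protected (w) vs unprotected (wo) buildings
fires_w = [13, 14, 13, 12, 9]     # with protection
fires_wo = [30, 31, 24, 29, 32]   # without protection
print(sum(fires_w), sum(fires_wo), round(sum(fires_wo) / sum(fires_w), 2))
# -> 61 146 2.39
```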
if the owner or the operator of the building is not aware of the lightning strike, appropriate measures may not be taken. the czech hydrometeorological institute records the times and the localities in which storms have occurred, for the purposes of insurance companies and claimants. however, this system does not provide evidence of a direct lightning strike on a building; it only provides evidence of lightning discharges in the locality. in our proposed system, each conductor is equipped with a wireless sensor which records the event of a lightning strike. this data is subsequently used to check the effectiveness of the catchment system and to indicate where improvements are needed.

2. lightning and the catchment system
in the last five years, systems have appeared in industrial applications that can monitor the passage of the lightning discharge through the conductor when a building is struck. these systems are based on the principle of electromagnetic induction, or they work with non-electrical phenomena such as polarized light signals. however, these systems indicate the lightning strike only through a mechanical dial mounted directly on the conductor. as a part of our project, a wireless sensor module was prepared that provides online information about the state of the lightning current that is passing through.

3. monitored building
the proposed solution was developed by implementing a wireless sensor into a device used commercially for making a statistical record of lightning strikes on lightning conductors. most devices used for monitoring systems struck by lightning are passive devices. after the critical current passes (mostly 2–100 ka), they detect the event by increasing a mechanical counter by one unit. the speed with which this change is evaluated depends on the operator of the building. by contrast, our proposed system works online, and immediately after the building has been struck it provides information on the passage of the current through the conductor. the monitored building is 5 × 3 × 4 m in size, and is situated in the šumava region in the czech republic. according to the isokeraunic map [6], it is located in an area with an average frequency of 30–35 thunderstorm days per year (see figure 1).

figure 1. isokeraunic map of the czech republic: number of days with thunderstorms per year.

the advantages of the proposed solution are that instant information is obtained, and a rapid response can be made. the building can be disconnected from the networks, an engineer can be sent to make checks, and other actions can be taken to stabilize the structure. when information is received, an integral part of the response is to predict that there may be a fire, which can often arise as a direct result of a lightning strike. however, there is often a delay before a fire breaks out, and the time interval can be used to take measures to minimize its impact.

4. description of the proposed solution
we present a modular solution that can be integrated into commercially available instruments for registering lightning strikes. this paper describes the methodology for online transmission of information about lightning strikes, proposes a technological process for processing and evaluating the information, and describes the practical verification of a prototype.
5. proposed system
the proposed system for monitoring the passage of a lightning current through a collection conductor is based on the following conditions:
• the device installation shall not affect the functionality of the building.
• the equipment must be easy to install on new and existing buildings.
• the device shall not significantly increase the budget for the construction of the lightning protection.
• the device must provide long-term maintenance-free operation between inspections, and should ideally be completely maintenance-free.
• the electronic part of the equipment should be protected from the effects of shock.

under these conditions, wireless transmission was the only option. at the time when the system was implemented, an appropriate wireless sensor network (wsn) technology was available. a wsn is a network of wirelessly interconnected sensors that monitor the surrounding physical phenomena (light, humidity, temperature, etc.). wireless sensors transmit measured data to each other, or are able to preprocess the data when it is transferred through the network to the so-called sink node. the sink node, also called the gateway node, transmits the data to the control unit (pc), where it is processed and analyzed. on the basis of this information, action is taken and/or information about the status of the monitored environment is displayed or transferred [7].

the sensor node is the basic unit of the wireless sensor network. it generally consists of a sensor, a computer, a power supply, and a radio module (figure 2: concept of a sensor node [8]). the whole wsn system, i.e. computing performance, node performance and transmission protocols, is expected to offer maximum energy saving and high flexibility. the node uses various levels of “sleep”, in which the performance of the cpu, the memory and the peripherals is controlled according to current needs (scanning parameters, data processing, communication, inactivity, etc.). these networks can consist of just a few nodes, though there is no theoretical limit on their number. the network can operate with the well-known star topology as the basic arrangement, but the biggest benefit of wsn is that it forms mesh-type networks using multi-hop routing (figure 3: mesh multi-hop routing [8]).

for monitoring purposes, we used the crossbow iris development kit, which already supports mesh networking technology thanks to the moteworks technology. the properties of this series are summarized in table 3 [9].

table 3. iris development kit features.
processor performance:
  processor              atmel atmega 1281
  speed                  8 mhz
  program flash memory   128 kb
  serial flash           512 kb
  ram                    8 kb
  adc                    10 bit, 8 ch., 0–3 v input
  operating system       tinyos 1.0
rf transceiver:
  frequency band         2.4 ghz ism band, programmable in 1 mhz steps
  transmit data rate     250 kb/s
  outdoor range          > 300 m
  indoor range           > 50 m

to capture the event when a discharge has occurred, we use the commercially available dehn and söhne lightning counter, enriched by the iris node (figure 4: lightning sensor node block diagram; electromechanical counter and micro relay latch circuit from the original equipment, plus the iris wsn module with mcu, radio and battery, a solar cell, and a toroid on the conductor). the selected wsn technology ensures reliable data transmission for periods of several years. when on battery power, the power supply battery needs to be replaced every few years.
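to put “every few years” in perspective, here is a back-of-the-envelope standby estimate in python (an editorial sketch, not from the paper), using the currents quoted in section 6 below (8 µa deep sleep, ~17 ma transmit) and assuming a nominal ~225 mah coin cell and a handful of short transmissions per year:

```python
# rough battery-life estimate for the sensor node (illustrative assumptions)
capacity_mah = 225.0      # assumed nominal coin-cell capacity
sleep_ma     = 0.008      # 8 uA deep-sleep current (section 6)
tx_ma        = 17.0       # transmit-mode current (section 6)
tx_events    = 10         # assumed strike events per year
tx_seconds   = 5          # assumed radio-on time per event

hours_per_year = 365 * 24
sleep_mah = sleep_ma * hours_per_year                  # ~70 mAh/year
tx_mah    = tx_ma * tx_events * tx_seconds / 3600.0    # ~0.24 mAh/year
years = capacity_mah / (sleep_mah + tx_mah)
print(f"~{years:.1f} years on one cell")               # ~3.2 years
```

the sleep current dominates the budget, which is why the duty-cycled design and the solar-cell energy harvesting described next matter more than the radio itself.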
to enhance the lifetime of the system, we enriched the battery power system by a solar cell energy harvesting system, which operated reliably throughout the experiment. the sensing element of the sensor node is a toroidal coil with wound threads, in which the voltage is induced during the passage of the lightning current. in our application, we connected the input of the mechanical counter to the micro relay. the micro relay serves primarily as a galvanic isolation element against voltage surges in the secondary circuit of the current sensing coil. the electronic equipment is therefore quite simple. it is basically a reference switch connected to the input of a simple microcontroller and a radio module. the block circuit diagram is shown in figure 4. the gateway is from the iris set. it is used to connect with a pc via the usb. as the system is designed to be maintenance-free, we chose the power of the central node of the 230 v network (backup adapter). the digital output of the iris gateway was applied to the digital input to the alarm with the gsm module (already installed in the house). the topology of the entire system is presented in figure 5. 6. system features the deep sleep mode of mcu in the sensor node consumes 8 µa, while in transmit mode the current is almost 17 ma. thus the concept is based on the deep sleep mode while waiting for an event. the event is a lightning discharge. a discharge activates the relay connected to a +3 v backup battery (cr2030). this voltage wakes up the node. after waking up, the node sends an “event message”. the central node has radio communication continually poweredon, and when the message arrives it activates the gsm alarm input. features of the wireless system: • nodes are inserted into the respective measurement points for the lightning conductor, • selected communication of the mesh type for guaranteed transfer of discharge information, • the electronics of the measuring node is stored inside a lightning counter, galvanically separated using an electromechanical relay • the central node is connected to the gsm gateway of the house alarm • the power supply is provided by a photovoltaic cell and the backup battery 7. economic balance sheet of the proposed system the components used in the test are relatively expensive, but the topology of the system is relatively inexpensive. today, such a system could be based on nodes available in a wsn network. after that, the balance sheet would be as follows: 881 jan mikeš, ondrej kreibich, jan neužil acta polytechnica node 1 distribution board gsm alarm or a building bus wsn gateway node 2 node 3 node 4 figure 5. monitoring system topology. figure 6. lightning counter sensor node. • the sensing part (coil and circuit switching relays) < usd 5 • complete wsn node < usd 10 • gateway (ethernet, usb or gsm) < usd 20 • mechanical parts < usd 10 the total cost of the system is around usd 70. however the customer solution would be even cheaper (less than usd 50). 8. summary the system described above was implemented in the lightning conductor of a family house in the šumava region in the czech republic, and for a period of one year the events were recorded and transmitted to the central unit. in the assembled module, the key parameters, mainly related to battery life, were verified. the authors suggest possible future possible extension with precise wireless monitoring of the grounding system itself, detecting the amount of current passing through the conductor. 
the wsn technology enables sensing of commonly measured physical phenomena. this paper has shown that high-quality monitoring of lightning protection, especially with reference to disruption due to mechanical damage, is feasible.

references
[1] hasse, p., wiesinger, j., zischank, w.: handbuch für blitzschutz und erdung. pflaum, 2006.
[2] kutáč, j., meravý, j.: ochrana před bleskem a přepětím z pohledu soudních znalců. praha, trenčín: spbi, 2010. isbn 978-80-7385-081-4.
[3] kutáč, j.: rozbor mimořádných událostí způsobených údery blesku v roce 2012. seminar of unie soudních znalců, 2012. isbn 978-80-260-3382-0.
[4] international standard iec 62305-3, edition 2.0, 2010-12: protection against lightning. part 3: physical damage to structures and life hazard. p. 151, table e.2.
[5] fire and rescue service of the czech republic, http://www.hzscr.cz/
[6] http://mve.energetika.cz
[7] yick, j., mukherjee, b., ghosal, d.: wireless sensor network survey. computer networks, vol. 52, no. 12, pp. 2292–2330, aug. 2008.
[8] neužil, j., šmíd, r., kreibich, o.: distributed classification in wireless sensor networks for machine condition monitoring. in: the seventh international conference on condition monitoring and machinery failure prevention technologies. dublin: mfpt + bindt, 2010. isbn 9781901892338.
[9] xmesh user's manual. [s.l.]: crossbow technology, 2007.

acta polytechnica vol. 50 no. 3/2010

coherent state quantization and moment problem
j. p. gazeau, m. c. baldiotti, d. m. gitman

abstract
berezin-klauder-toeplitz (“anti-wick”) or “coherent state” quantization of the complex plane, viewed as the phase space of a particle moving on the line, is derived from the resolution of the unity provided by the standard (or gaussian) coherent states. the construction of these states and their attractive properties are essentially based on the energy spectrum of the harmonic oscillator, that is, on the natural numbers. in this work we follow the same path by considering sequences of non-negative numbers and their associated “non-linear” coherent states. we illustrate our approach with the 2-d motion of a charged particle in a uniform magnetic field. by solving the involved stieltjes moment problem we construct a family of coherent states for this model. we then proceed with the corresponding coherent state quantization and we show that this procedure takes into account the circle topology of the classical motion.

1 introduction
one of the most interesting properties of standard or glauber coherent states |z〉 [1, 2, 3, 4] is the bayesian duality [5, 6] that they encode between the discrete poisson probability distribution, $n \mapsto e^{-|z|^2}|z|^{2n}/n!$, of obtaining n quantum excitations (“photons” or “quanta”) in a measurement through some counting device, and the continuous gamma probability distribution $|z|^2 \mapsto e^{-|z|^2}|z|^{2n}/n!$ on the classical phase space. for this latter distribution, |z|² is itself a random variable, denoting the average number of photons, given that n photons have been counted. such a duality underlies the construction of all types of coherent state families, provided they satisfy a resolution of the unity condition.
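before moving on, the duality just described is easy to probe numerically. the following python sketch (an editorial illustration, not part of the paper) checks that the photon-count probabilities at fixed z form a normalized poisson distribution, and that the radial moment identity $\int_0^\infty t^n e^{-t}\,dt = n!$, which underlies the resolution of the unity used below, holds:

```python
# numerical check: |<e_n|z>|^2 is poisson(|z|^2), and the n-th moment of e^{-t} is n!
import math
from scipy.integrate import quad

z2 = 1.7  # |z|^2, arbitrary test value
probs = [math.exp(-z2) * z2**n / math.factorial(n) for n in range(60)]
print(abs(sum(probs) - 1.0) < 1e-12)       # normalized: True

for n in range(6):
    moment, _ = quad(lambda t: t**n * math.exp(-t), 0, math.inf)
    print(n, moment, math.factorial(n))    # numerical moment matches n!
```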
it turns out that this condition is equivalent to setting up a “positive operator valued measure” (povm) [7, 4] on the phase space. such a measure, in turn, leads to the quantization of the classical phase space, which associates to each point z ≡ (q + ip)/ √ 2 the one dimensional projection operator pz, projecting onto to the subspace generated by the coherent state vector, and then for z �= z′, pzpz′ �= pz′pz). this “berezin-klauder-töplitz” quantization (or “anti-wick”) [1, 8, 9] turns out, in this case, to be equivalent to the canonical quantization procedure. clearly, this non-commutative version of the complex plane is intrinsically based on the nonnegative integers (appearing in the n! term). we then follow a similar path by considering sequences of nonnegativenumberswhichare far ornot fromthenatural numbers [10]. the resulting quantizationswill then be looked upon as generalizations of the one yielded by the standard coherent states. we illustrate our approach with the elementary model of the 2-d motion of a charged particle in a uniformmagnetic field [11, 12]. by using a solution to a version of the stieltjes moment problem [13, 14] we construct a family of coherent states for this model. we prove that these states form an overcomplete set that is normalized and resolves the unity. we then carry out the corresponding coherent state quantization and we examine the consequences in terms of its probabilistic, functional, and localization aspects. this article is organized as follows. in section 2, we briefly review the standard coherent states and the way they allow painless quantization of the complex plane viewed as a phase space. the so-called nonlinear coherent states built from arbitrary sequences of numbers are described in section 3 and we show how the moment problem immediately emerges from the exigence of unity resolution. if the positive case, the corresponding quantization of the complex plane is described in section 4. in section 5 we apply our formalism to themotion of a chargedparticle in a uniformmagnetic field. there exist two families of coherent states for suchamodel, namely themalkin-man’ko states [15], which are just tensor products of standard coherent states, and the kowalski-rembielinski states [16]. by introducing a kind of squeezing parameter q = eλ > 1 we extend the definition of the latter and solve the corresponding stieltjes moment problem. this allows us to proceed with the quantization of the physical quantities and illustrate our studywith numerical investigation. 2 quantization with standard coherent states and a short review of standard cs let h be a separable (complex) hilbert space with orthonormal basis e0, e1, . . . , en ≡ |en〉, . . .. to each complex number z ∈ c there corresponds the following vector in h: |z〉= ∞∑ n=0 e− |z|2 2 zn√ n! |en〉 . (1) selected topics in mathematical and particle physics, prague, may 5–7, 2009 30 acta polytechnica vol. 50 no. 3/2010 such vectors are the well-known glauber-klauderschrödinger-sudarshan coherent states or standard coherent states. they are distinguished by many properties. here we particularly retain the following. (i) 〈z|z〉 =1 (normalization). (ii) themap c � z �→ |z〉 is continuous (continuity). (iii) the map n ∈ n �→ |〈en|z〉|2 = e−|z| 2 |z|2n/n! is a poisson probability distribution with average number of occurrences equal to |z|2 (discrete probabilistic content). (iv) the map c � z �→ |〈en|z〉|2 = e−|z| 2 |z|2n/n! 
is a gamma probability distribution (with respect to the square of the radial variable) with n as a shape parameter (continuous probabilistic content). (v) there holds resolution of the unity in h: i = ∫ c d2z π pz , (2) where pz = |z〉〈z| is the orthogonal projector on vector |z〉 and the integral should be understood in the weak sense. the proof is straightforward and stems from the orthogonality of the fourier exponentials and fromthe integral expression of the gamma function which solves the moment problem for the factorial n!∫ c d2z π pz = ∞∑ n,n′=0 |en〉〈en′| 1√ n!n′! · ∫ c d2z π e−|z| 2 znz̄n ′ = (3) ∞∑ n=0 |en〉〈en′| = i . berezin-klauder-toeplitz-“anti-wick” quantization or “coherent state quantization” property (v) allows to define 1. a normalized positive operator-valued measure (povm) on the complex plane equipped with its lebesgue measure d2z π and its σ−algebra f of borel sets: f � δ �→ ∫ δ d2z π pz ∈ l(h)+ , (4) wherel(h)+ is the cone of positive bounded operators on h. 2. a quantization of the complex plane, which means that to a function f(z, z̄) in the complex plane there corresponds the operator af in h defined by f �→ af = ∫ c d2z π f(z, z̄)pz = ∞∑ n,n′=0 |en〉〈en′| 1 √ n!n′! · (5) ∫ c d2z π f(z, z̄)e−|z| 2 znz̄n ′ provided that weak convergence holds. for the simplest functions f(z) = z and f(z) = z̄ we obtain az = â , â |en〉 = √ n|en−1〉 , (6) â|e0〉 = 0 , (lowering operator) az̄ = â † , ↠|en〉 = √ n +1|en+1〉 (7) (raising operator) . these two basic operators obey the canonical commutation rule : [â, â†] = i. the number operator n̂ = â†â is such that its spectrum is exactly n with eigenvectors en : n̂|en〉 = n|en〉. the fact that the complex plane has become non-commutative is apparent from the quantization of the real and imaginary parts of z = 1 √ 2 (q + ip): aq def = q = 1√ 2 (â + â†) , (8) ap def = p = 1 √ 2i (â − â†) , [q, p ] = ii . 3 coherent states for generic sequences let x = {xn}n∈n be a strictly increasing sequence such that x0 = 0 and lim n→∞ xn = ∞. then its associated exponential e(t)= +∞∑ n=0 tn xn! , xn! ≡ x1x2 · · · xn , x0! = 1 , (9) has an infinite convergence radius. associated “coherent states” (“non-linear cs” in quantum optics) read as elements of h, a separable hilbert space with orthonormal basis {|en〉 , n ∈ n}: |vz〉 = ∞∑ n=0 1√ e(|z|2) zn √ xn! |en〉 . (10) these vectors still enjoy someproperties similar to the standard ones. (i) 〈vz|vz〉 =1 (normalization). (ii) the map c � z �→ |vz〉 is continuous (continuity). 31 acta polytechnica vol. 50 no. 3/2010 (iii) the map n ∈ n �→ |〈en|vz〉|2 = |z|2n e(|z|2)xn! is a poisson-like distribution with average number of occurrences equal to |z|2 (discreteprobabilistic content). consider the discrete probability distributionwith parameter t ≥ 0: n �→ p(n;t)= 1 e(t) tn xn! . (11) the average of the random variable n �→ xn is 〈xn〉 = t. contrariwise to the standard case x = n, the continuous (gammalike) distribution t �→ 1 e(t) tn xn! with parameter n is not a probabilitydistributionwith respect to the lebesgue measure dt:∫ +∞ 0 dt e(t) tn xn! def = μn �=1 . (12) finding the right measure amounts to solve a usually intractable moment problem. so, the map c � z �→ |〈en|vz〉|2 = |z|2n/ ( e(|z|2)xn! ) is not a (gamma-like) probability distribution (with respect to the square of the radial variable in the complex plane)with xn+1 as a shapeparameter, and this is a serious setback for the berezin-toeplitz quantization program. 
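to make this setback concrete, the following python sketch evaluates the moments μn of eq. (12) numerically (an editorial illustration; the choice xn = n², for which xn! = (n!)² and e(t) is the modified bessel function i₀(2√t), is ours): for the standard sequence xn = n they all equal 1, while for xn = n² they drift away from 1.

```python
# eq. (12): mu_n = (1/x_n!) * integral_0^inf t^n dt / E(t) need not equal 1
from scipy.integrate import quad

def E(t, x):
    """E(t) = sum_n t^n / x_n!, summed via stable term recursion; x = (x_1, x_2, ...)."""
    term, total = 1.0, 1.0
    for xn in x:
        term *= t / xn
        total += term
    return total

def mu(n, x):
    xfact = 1.0
    for xn in x[:n]:              # x_n! = x_1 * x_2 * ... * x_n
        xfact *= xn
    return quad(lambda t: t**n / E(t, x), 0, 200)[0] / xfact

for label, x in [("x_n = n  ", list(range(1, 400))),
                 ("x_n = n^2", [k * k for k in range(1, 400)])]:
    print(label, [round(mu(n, x), 3) for n in range(4)])
# x_n = n   -> [1.0, 1.0, 1.0, 1.0]  (here dt/E(t) = e^{-t} dt: the standard case)
# x_n = n^2 -> mu_n drifts away from 1: dt/E(t) alone is not a solving measure
```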
indeed, there is now no reason to obtain the resolution of the unity: with pz = |vz〉〈vz|,

$$\int_{\mathbb C}\frac{d^2z}{\pi}\,P_z=\sum_{n,n'=0}^{\infty}|e_n\rangle\langle e_{n'}|\,\frac{1}{\sqrt{x_n!\,x_{n'}!}}\int_{\mathbb C}\frac{d^2z}{\pi}\,\frac{z^n\bar z^{n'}}{E(|z|^2)}=\sum_{n=0}^{\infty}\frac{I(n)}{x_n!}\,|e_n\rangle\langle e_n|\ \overset{\mathrm{def}}{=}\ F.\qquad(13)$$

here f is a diagonal operator determined by the sequence of integrals $I(n)=\int_0^{+\infty} t^n\,dt/E(t)$. these integrals form a sequence of stieltjes moments for the measure $dt/E(t)$.

if the moment problem has a solution. suppose that the stieltjes moment problem [13, 14] has a solution for the sequence $(x_n!)_{n\in\mathbb N}$, i.e. there exists a probability distribution $t\mapsto w(t)$ on $[0,+\infty)$ with infinite support such that

$$x_n!=\int_0^{+\infty}t^n\,w(t)\,dt.\qquad(14)$$

we know that a necessary and sufficient condition for this is that the two hankel matrices

$$\begin{pmatrix}1&x_1!&x_2!&\dots&x_n!\\ x_1!&x_2!&x_3!&\dots&x_{n+1}!\\ x_2!&x_3!&x_4!&\dots&x_{n+2}!\\ \vdots&\vdots&\vdots&\ddots&\vdots\\ x_n!&x_{n+1}!&x_{n+2}!&\dots&x_{2n}!\end{pmatrix},\qquad \begin{pmatrix}x_1!&x_2!&x_3!&\dots&x_{n+1}!\\ x_2!&x_3!&x_4!&\dots&x_{n+2}!\\ x_3!&x_4!&x_5!&\dots&x_{n+3}!\\ \vdots&\vdots&\vdots&\ddots&\vdots\\ x_{n+1}!&x_{n+2}!&x_{n+3}!&\dots&x_{2n+1}!\end{pmatrix}\qquad(15)$$

have strictly positive determinants for all n. then a natural approach is simply to modify the measure on $\mathbb C$ by including the weight $w(|z|^2)E(|z|^2)$. we then obtain the resolution of the identity:

$$\int_{\mathbb C}\frac{d^2z}{\pi}\,w(|z|^2)\,E(|z|^2)\,P_z=\sum_{n,n'=0}^{\infty}|e_n\rangle\langle e_{n'}|\,\frac{1}{\sqrt{x_n!\,x_{n'}!}}\int_{\mathbb C}\frac{d^2z}{\pi}\,w(|z|^2)\,z^n\bar z^{n'}=\sum_{n=0}^{\infty}\frac{1}{x_n!}\int_0^{+\infty}t^n w(t)\,dt\;|e_n\rangle\langle e_n|=I.\qquad(16)$$

if the moment problem is solved by a measure $t\mapsto w(t)$, then the vectors |vz〉 enjoy all the properties needed for quantization:

(iv) the map $\mathbb C\ni z\mapsto|\langle e_n|v_z\rangle|^2=|z|^{2n}/\big(E(|z|^2)\,x_n!\big)$ is a (gamma-like) probability distribution (with respect to the square of the radial variable), with $x_{n+1}$ as a shape parameter, with respect to the modified measure on the complex plane

$$\nu(dz)\overset{\mathrm{def}}{=}w(|z|^2)\,E(|z|^2)\,\frac{d^2z}{\pi}.\qquad(17)$$

note that with $(x_n!)$ we might face an indeterminate moment sequence, which means that there are several representing measures. to each such measure there then corresponds a probability distribution on the classical phase space, to be interpreted in terms of statistical mechanics.

if the moment problem has no (explicit) solution. see an alternative in [10].

4 cs quantization with sequence x

if the moment problem has an explicit solution, one can proceed with the corresponding cs quantization of the complex plane, since the family of vectors |vz〉 resolves the unity: with pz = |vz〉〈vz|,

$$\int_{\mathbb C}\nu(dz)\,P_z=\sum_{n,n'=0}^{\infty}|e_n\rangle\langle e_{n'}|\,\frac{1}{\sqrt{x_n!\,x_{n'}!}}\int_{\mathbb C}\frac{d^2z}{\pi}\,w(|z|^2)\,z^n\bar z^{n'}=\sum_{n=0}^{\infty}|e_n\rangle\langle e_n|=I.\qquad(18)$$

we proceed with this quantization as in the standard case: to a function f(z, z̄) on the complex plane there corresponds the operator $A_f$ in h defined by

$$f\mapsto A_f=\int_{\mathbb C}\frac{d^2z}{\pi}\,f(z,\bar z)\,w(|z|^2)\,E(|z|^2)\,P_z=\sum_{n,n'=0}^{\infty}|e_n\rangle\langle e_{n'}|\,\frac{1}{\sqrt{x_n!\,x_{n'}!}}\int_{\mathbb C}\frac{d^2z}{\pi}\,w(|z|^2)\,f(z,\bar z)\,z^n\bar z^{n'},\qquad(19)$$

provided that weak convergence holds. for the simplest functions $f(z,\bar z)=z$ and $f(z,\bar z)=\bar z$ we obtain

$$A_z=\hat a,\qquad \hat a\,|e_n\rangle=\sqrt{x_n}\,|e_{n-1}\rangle,\qquad \hat a\,|e_0\rangle=0\qquad\text{(lowering operator)},\qquad(20)$$
$$A_{\bar z}=\hat a^{\dagger},\qquad \hat a^{\dagger}\,|e_n\rangle=\sqrt{x_{n+1}}\,|e_{n+1}\rangle\qquad\text{(raising operator)}.\qquad(21)$$

these two basic operators obey the commutation rule $[\hat a,\hat a^{\dagger}]=x_{N+1}-x_{N}\overset{\mathrm{def}}{=}\delta_{N}$. the operator $x_{N}$ is defined by $x_{N}=\hat a^{\dagger}\hat a$ and is such that its spectrum is exactly the sequence x, with eigenvectors $e_n$: $x_{N}|e_n\rangle=x_n|e_n\rangle$. the triple $\{\hat a,\hat a^{\dagger},\delta_{N}\}$ equipped with the operator commutator $[\,\cdot\,,\,\cdot\,]$ generates (generically) an infinite lie algebra which replaces the weyl-heisenberg lie algebra.
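returning for a moment to the solvability criterion (15): it is straightforward to test numerically. the sketch below (added for illustration; it is not from the paper) checks the unshifted and shifted hankel determinants for the standard sequence xn = n, i.e. xn! = n!, in exact integer arithmetic via a small bareiss elimination, so no floating-point sign errors can occur:

```python
# check the stieltjes condition (15) for x_n = n (so x_n! = n!): all hankel
# determinants det(x_{i+j}!) and det(x_{i+j+1}!) must be strictly positive.
from math import factorial

def det_bareiss(m):
    """exact integer determinant via fraction-free gaussian elimination."""
    m = [row[:] for row in m]
    n, sign, prev = len(m), 1, 1
    for k in range(n - 1):
        if m[k][k] == 0:                      # pivot if needed
            for i in range(k + 1, n):
                if m[i][k]:
                    m[k], m[i], sign = m[i], m[k], -sign
                    break
            else:
                return 0
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                m[i][j] = (m[i][j] * m[k][k] - m[i][k] * m[k][j]) // prev
        prev = m[k][k]
    return sign * m[-1][-1]

xfact = [factorial(n) for n in range(20)]     # x_n! for x_n = n
for size in range(1, 7):
    h0 = [[xfact[i + j] for j in range(size)] for i in range(size)]
    h1 = [[xfact[i + j + 1] for j in range(size)] for i in range(size)]
    assert det_bareiss(h0) > 0 and det_bareiss(h1) > 0
print("hankel determinants positive up to size 6 -> moment problem solvable")
```

for the factorial sequence these determinants are indeed positive, consistent with the fact that the measure $e^{-t}\,dt$ solves the moment problem for n!; replacing `factorial(n)` by another candidate xn! gives a quick feasibility probe before attempting (14) analytically.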
the quantization of the real and imaginary parts of z = 1 √ 2 (q + ip) yields position and momentum operators corresponding to the sequence x , aq def = q = 1 √ 2 (â + â†) , (22) ap def = p = 1 √ 2i (â − â†) , [q, p ] = iδn , together with new quantum localization properties. 5 an example: charged particle in a magnetic field consider a classicalnonrelativistic particle, charge−e, moving in the plane ( x1, x2 ) and interacting with a constant and uniform magnetic field of intensityb perpendicular to the plane, described by a vector potential a only (a0 = 0). the hamiltonian of the particle is h(x,p) = 1 2μ [ p+ e c a(x) ]2 , x =( x1, x2 ) , p = (p1, p2). with the symmetric gauge âi = − b 2 εij x̂ j , i, j, k = 1,2, the quantum hamiltonian takes the form ĥ = 1 2μ ( p̂21 + p̂ 2 2 ) . the p̂i, i = 1,2, are components of the kinematic momentum operator, p̂i = p̂i − eb 2c εij x̂ j , [ p̂1, p̂2 ] = −ih̄ eb c , (23) where εij is the levi-civita symbol. kowalski & rembielinski coherent states kowalskiandrembielinski [16] haveproposed the construction of cs for a particle in a uniform magnetic field by using their coherent states for the circle [17]. the latter are constructed from the angular momentum operator ĵ and the unitary operator û that represents the position of the particle on the unit circle. these operators obey the commutation relations[ ĵ, û ] = u , [ ĵ, û+ ] = −û+ . (24) the introduction of these coherent states permits to avoid the problem of the infinite degeneracy present in the approach followed by man’ko and malkin [15], and, in addition, takes into in account the momentum part of the phase space. consequently, the so obtained cs offer a better way to compare the quantumbehavior of the system with the classical trajectories in the phase space. let us introduce the centre-coordinate operators x̂10 = x̂ 1 − 1 μω p̂2 , x̂ 2 0 = x̂ 2 + 1 μω p̂1 , (25) where ω = eb/μ is the cyclotron frequency. the x̂i0 are integral of motion, [h, x̂i0] = 0. relative motion coordinate operators, r̂1 = x̂1 − x̂10 = 1 μω p̂2 , (26) r̂2 = x̂2 − x̂10 = − 1 μω p̂1 . introduce now the operators r̂0± = x̂ 1 0 ± ix̂ 2 0 , (27) r̂± = r̂ 1 ± ir̂2 = 1 μω (p̂2 ∓ ip̂1) . they obey the commutation rules [r̂0+, r̂0−] = 2 h̄ μω , [r̂+, r̂−] = −2 h̄ μω , (28) [r̂0±, r̂±] = 0 . the “relative” angular momentum operator ĵ is proportional to the hamiltonian ĵ = r̂1p̂2 − r̂2p̂1 = − 2 ω ĥ = (29) μωr̂+r̂− + h̄ = μωr̂−r̂+ − h̄ . due to the rules, [j, r̂0±] = 0 , [j, r̂±] = ±2h̄r̂± , (30) ĵ can be viewed as the generator of rotations about the axis passing through the classical point (x10, x 2 0) 33 acta polytechnica vol. 50 no. 3/2010 and perpendicular to the (x1, x2) plane. the nonunitary operator r̂− describes to a certain extent the angular position of the particle on a circle. the symmetries and the integrability of the model can be encoded into the two independent weyl-heisenberg algebras, one for the center of circular orbit and the other for the relative motion. they allow one to construct the fock space with orthonormal basis {|m, n〉 ≡ |m〉⊗|n〉, m, n ∈ z}, as repeated actions of the raising operators r̂0− and r̂+, r̂0−|m〉 = √ 2h̄(m +1) μω |m +1〉 , (31) r̂+|n〉 = √ 2h̄(n +1) μω |n +1〉 . r̂0+|m〉 = √ 2h̄m μω |m −1〉 , (32) r̂−|n〉 = √ 2h̄n μω |n −1〉 , and the eigenvalue equation ĵ |m, n〉 =(2n +1) h̄ |m, n〉 . 
(33) the k&r cs |z0, ζ〉 are constructed in the hilbert space spanned by the orthonormal basis as solution to the eigenvalue equation: r̂0+ |z0, ζ〉 = z0 |z0, ζ〉 , (34) ẑ |z0, ζ〉 = ζ |z0, ζ〉 , z0, ζ ∈ c , where ẑ = e 1 2(ĵ /h̄+1)r̂−. the projection of these cs in this fock basis reads as 〈m, n| ζ, z0〉 = e− |z̃0| 2 2√ e(|ζ̃|2) z̃m0√ m! ζ̃n √ n! e− 1 2 n(n+1) , (35) where z̃0 = √ μω 2h̄ z0 , ζ̃ = √ μω 2h̄ ζ. the normalization factor involves the function e (t)= ∞∑ n=0 e−n(n+1) tn n! ≡ ∞∑ n=0 tn xn! , (36) where we recognize a generalized exponential with xn ≡ e2nn. squeezing/deforming the k& r states the introduction of a “squeezing” parameter λ allows us to generalize the previous cs of a charged particle in a uniform magnetic field as an eigenvector of the commuting operators r̂0+ and ẑλ, r̂0+ |z0, ζ〉 = z0 |z0, ζ〉 , ẑλ |z0, ζ〉 = ζ |z0, ζ〉 , ẑλ = exp [ λ 4 ( ĵ/h̄ +1 )] r̂− . (37) operator ẑλ coincideswith thek&r ẑ for λ =2, and with just r̂− for λ =0, i.e., the case ofmalkin-man’ko cs, which are actually tensor products of standard cs. operator ẑλ controls the dispersion relations of the angular momentum ĵ and of the “position operator” r̂−. the corresponding cs read: |z0, ζ〉 = e− |z̃0| 2 2√ eλ (∣∣∣ζ̃∣∣∣2) ∑ m,n z̃m0√ m! ζ̃n √ xn! |m, n〉 , (38) with eλ (t)= ∞∑ n=0 tn/xn! and xn ≡ enλn. the complex numbers z0 and ζ parameterize, respectively, the position of the centre of the circle and the classical phase space state of the circular motion. some properties of these cs make them more suitable with regard to the semi-classical behavior of a charged particle in a magnetic field, in comparisonwith themalkin-man’ko cs. the generalization involving λ allows one to exhibit better these interesting characteristics. resolution of the moment problem the λ-cs |z0, ζ〉 are the tensor product of the states |z0〉 and |ζ〉, where the first one is a standard cs. so, in order to perform the cs quantization, we concentrate only on the states |ζ〉. for convenience, we put μω/2h̄ = 1, and so ζ̃ = ζ. then, in the fock basis {|n〉}, |ζ〉 = 1√ eλ|ζ|2) +∞∑ n=0 ζn√ xn! |n〉 , eλ(t) = +∞∑ n=0 t xn! , xn = e λn n . they resolve the unity in the fock space spanned by the kets |n〉, ∫ c �λ ( |ζ|2 ) d2ζ π |ζ〉〈ζ| = i . the weight function �λ solves the moment problem∫ ∞ 0 tn�λ (t) dt = n! exp { λn(n +1) 2 } ≡ xn! ,(39) λ ≥ 0 , and is given under the form of the laplace transform, �λ (t) = e−λ/2 √ 2πλ ∫ +∞ 0 du exp ( −e−λ/2tu ) e− (ln u)2 2λ = e−λ/2 √ 2πλ l [ e− (ln u)2 2λ ] ( e−λ/2t ) . 34 acta polytechnica vol. 50 no. 3/2010 fig. 1: error function as a function of l for λ = 2 (solid line), λ =4 (dashed line) and λ =6 (dotted line). we see that, with the λ-cs, this approximation can be improved, for |l| ≤ 1, by increasing the value of λ cs quantization the correspondingcsquantizationof functions on the complex plane is the map f ( ζ, ζ̄ ) �→ ∫ c d2ζ π �λ ( |ζ|2 ) · (40) f ( ζ, ζ̄ ) eλ ( |ζ|2 ) |ζ〉〈ζ| def= f̂ . as expected, the cs quantization of the variables ζ and ζ̄ yields ζ �→ ζ̂ = ẑλ , ζ̄ �→ ζ̂ = ẑ † λ . (41) numerical analysis one convenient criterion to evaluate the closeness of the introduced λ-cs to the classical phase space consists in verifying how closely the expectation value of the angular momentum operator approaches the respective classical quantity. this can be implemented through the evaluation of the relative error e(λ, l)= |(〈ĵ〉ζ /h̄ − l)| l , (42) with the expectation value of the angular momentum given by 〈ĵ〉ζ = 〈ζ|ĵ|ζ〉 = 1 eq(|ζ|2) +∞∑ n=0 |ζ|2n (2n +1) xn! , xn = e λ 2 n n . 
the parameter ζ is related to the classical angular momentum l = μωr² (where r is the classical radius) by |ζ| = √(l/(μω)) exp(λl/4). kowalski and rembielinski observed that the approximate equality 〈ĵ〉ζ ≈ l does not hold for arbitrarily small l, being really acceptable only for |l| > 1.

references
[1] klauder, j. r., skagerstam, b. s.: coherent states, applications in physics and mathematical physics. world scientific, singapore, 1985, pp. 991.
[2] perelomov, a. m.: generalized coherent states and their applications. springer-verlag, new york, 1986.
[3] malkin, i. a., man'ko, v. i.: dynamical symmetries and coherent states of quantum systems. nauka, moscow, 1979, pp. 320.
[4] ali, s. t., antoine, j. p., gazeau, j.-p.: coherent states, wavelets and their generalizations. graduate texts in contemporary physics, springer-verlag, new york, 2000.
[5] ali, s. t., gazeau, j.-p., heller, b.: j. phys. a: math. theor. 41 (2008) 365302.
[6] gazeau, j.-p.: coherent states in quantum physics. wiley-vch, berlin, 2009.
[7] holevo, a. s.: statistical structure of quantum theory. springer-verlag, berlin, 2001.
[8] berezin, f. a.: comm. math. phys. 40 (1975) 153.
[9] chakraborty, b., gazeau, j.-p., youssef, a.: arxiv:0805.1847v1.
[10] ali, s. t., balková, l., curado, e. m. f., gazeau, j.-p., rego-monteiro, m. a., rodrigues, ligia m. c. s., sekimoto, k.: j. math. phys. 50, 043517-1-28 (2009).
[11] baldiotti, m. c., gazeau, j.-p., gitman, d. m.: phys. lett. a 373, 1916–1920 (2009); erratum: phys. lett. a 373, 2600 (2009).
[12] baldiotti, m. c., gazeau, j.-p., gitman, d. m.: phys. lett. a 373, 3937–3943 (2009).
[13] stieltjes, t.: ann. fac. sci. univ. toulouse 8 (1894–1895), j1–j122; 9, a5–a47.
[14] simon, b.: adv. in math. 137 (1998), 82–203.
[15] malkin, i. a., man'ko, v. i.: zh. eksp. teor. fiz. 55 (1968) 1014 [sov. phys. – jetp 28, no. 3 (1969) 527].
[16] kowalski, k., rembielinski, j.: j. phys. a 38 (2005) 8247.
[17] kowalski, k., rembielinski, j., papaloucas, l. c.: j. phys. a 29 (1996) 4149.

j. p. gazeau
e-mail: gazeau@apc.univ-paris7.fr
laboratoire apc, université paris diderot paris 7
10 rue a. domon et l. duquet, 75205 paris cedex 13, france

m. c. baldiotti, d. m. gitman
e-mail: baldiott@fma.if.usp.br, gitman@dfn.if.usp.br
instituto de física, universidade de são paulo
caixa postal 66318-cep 05315-970, são paulo, s.p., brazil

acta polytechnica 53(supplement):589–594, 2013
doi:10.14311/ap.2013.53.0589
© czech technical university in prague, 2013
available online at http://ojs.cvut.cz/ojs/index.php/ap

dama results: dark matter in the galactic halo
r. bernabei^{a,b,*}, p. belli^b, f. cappella^{c,d}, v. caracciolo^e, r. cerulli^e, c. j. dai^f, a. d'angelo^{c,d}, a. di marco^{a,b}, h. l. he^f, a. incicchitti^d, x. h. ma^f, f. montecchia^{b,g}, x. d. sheng^f, r. g. wang^f, z. p. ye^{f,h}

a dip. di fisica, università di roma “tor vergata”, i-00133 rome, italy
b infn, sez. roma “tor vergata”, i-00133 rome, italy
c dip. di fisica, università di roma “la sapienza”, i-00185 rome, italy
d infn, sez. roma, i-00185 rome, italy
e laboratori nazionali del gran sasso, i.n.f.n., assergi, italy
f ihep, chinese academy, p.o. box 918/3, beijing 100039, china
g laboratorio sperimentale policentrico di ingegneria medica, università degli studi di roma “tor vergata”
h university of jing gangshan, jiangxi, china
* corresponding author: rita.bernabei@roma2.infn.it

abstract.
experimental efforts and theoretical developments support that most of the universe is dark and that a large fraction of it should be made of relic particles; many possibilities are open on their nature and interaction types. in particular, the dama/libra experiment at the gran sasso laboratory (sensitive mass: ∼ 250 kg) is mainly devoted to the investigation of dark matter (dm) particles in the galactic halo by exploiting the model-independent dm annual modulation signature with highly radiopure na i(tl) targets. dama/libra is the successor of the first-generation dama/na i (sensitive mass: ∼ 100 kg); cumulatively the two experiments have released so far the results obtained by analyzing an exposure of 1.17 t yr, collected over 13 annual cycles. the data show a model-independent evidence of the presence of dm particles in the galactic halo at 8.9σ confidence level (c.l.). some of the already-achieved results are briefly recalled, the last upgrade, which took place in fall 2010, is mentioned, and future perspectives are summarized.

keywords: dark matter, dark matter annual modulation signature, highly radiopure na i(tl) crystals.

1. introduction
about one century of experimental efforts and of theoretical developments has pointed out that most of our universe is dark and that a large fraction of it should be in the form of relic particles. many possibilities are open about the nature and the interaction types of such particles. often wimp is adopted as a synonym of dm particle, referring usually to a particle with spin-independent elastic scattering on nuclei. on the contrary, wimp identifies a class of dm candidates which can have different phenomenologies and interaction types. there are many open aspects having a large impact on model-dependent investigations and comparisons, such as the right description of the dark halo and related parameters, the right related atomic/nuclear and particle physics, etc., as well as the fundamental question of how many kinds of dm particles exist in the universe. the experiments at accelerators could prove the existence of some possible dm candidate particles, but they could never credit by themselves that a certain particle is in the halo as a solution, or the only solution, for dm particle(s). moreover, dm candidate particles and scenarios (even for the neutralino candidate) exist which cannot be investigated at accelerators. thus, to pursue direct detection of dm particles with a model-independent approach, with suitably large exposure, with full control of the running conditions and with an ultra-low-background (ulb) widely sensitive target material is mandatory.

the direct detection of dm particles is based on different approaches. the dm interaction processes can be of well different nature depending on the candidate, e.g.:
(1.) elastic scatterings on target nuclei with either spin-independent or spin-dependent or mixed coupling; moreover, an additional electromagnetic contribution in the case of few-gev candidates can arise from the excitation of bound electrons by the recoiling nuclei [1];
(2.) inelastic scatterings on target nuclei with either spin-independent or spin-dependent or mixed coupling in various scenarios [2, 3];
(3.) interaction of light dm (ldm) either on electrons or on nuclei with production of a lighter particle [4];
(4.) preferred interaction with electrons [5];
(5.) conversion of dm particles into electromagnetic radiation [6];
(6.) etc.
thus, considering the richness of particle possibilities and the existing uncertainties on related astrophysics (e.g. halo model and related parameters, etc.), nuclear physics (e.g. form factors, spin factors, scaling laws, etc.) and particle physics (e.g. particle nature and interaction types, etc.), a widely sensitive model-independent approach is mandatory, as well as a suitable exposure and full control of the running conditions over the whole data taking. at present the only model-independent signature which can be effectively exploited is the dm annual modulation signature, originally suggested in [7] and investigated by the dama experiments at the gran sasso national laboratory with highly radiopure na i(tl) as target-detectors; in the following, the already-achieved results (see [8–14] and references therein) are briefly summarized.

2. the dama project
the dama project is an observatory for rare processes located deep underground at the gran sasso national laboratory of the i.n.f.n. it is based on the development and use of low-background scintillators. profiting from the low-background features of the realized set-ups, many rare processes are studied [1, 4–6, 8–13, 15–25]. the main apparatus, dama/libra, is investigating the presence of dm particles in the galactic halo by exploiting the model-independent dm annual modulation signature. in fact, as a consequence of its annual revolution around the sun, which is moving in the galaxy, travelling with respect to the local standard of rest towards the star vega near the constellation of hercules, the earth should be crossed by a larger flux of dm particles around june 2 (when the earth's orbital velocity is summed to that of the sun with respect to the galaxy) and by a smaller one around december 2 (when the two velocities are subtracted). thus, this signature has a different origin and different peculiarities than the seasons on the earth and than effects correlated with seasons (consider the expected value of the phase as well as the other requirements listed below). this dm annual modulation signature is very distinctive, since the effect induced by dm particles must simultaneously satisfy all the following requirements:
(1.) the rate must contain a component modulated according to a cosine function;
(2.) with one year period;
(3.) with a phase that peaks roughly around june 2;
(4.) this modulation must be present only in a well-defined low-energy range, where dm particles can induce signals;
(5.) it must be present only in those events where just a single detector, among all the available ones in the used set-up, actually “fires” (single-hit events), since the probability that dm particles experience multiple interactions is negligible;
(6.) the modulation amplitude in the region of maximal sensitivity has to be ≈ 7 % in the case of usually adopted halo distributions, but it may be significantly larger in the case of some particular scenarios such as e.g. those in [3, 26].

the dama/libra data released so far correspond to six annual cycles for an exposure of 0.87 t yr [9, 10]. considering these data together with those previously collected by dama/na i over 7 annual cycles (0.29 t yr), the total exposure collected over 13 annual cycles is 1.17 t yr; this is orders of magnitude larger than the exposures typically released in the field.
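the ≈ 7 % figure in requirement (6.) follows from simple kinematics. the sketch below (an editorial illustration with commonly quoted round numbers, not dama's analysis code) models the earth's velocity through the halo as v(t) = v_sun + v_orb cos γ cos ω(t − t₀), with v_sun ≈ 232 km/s, v_orb ≈ 30 km/s and γ ≈ 60° the inclination of the earth's orbital plane with respect to the galactic plane; for a rate that scales roughly linearly with v, the modulation amplitude is then of order v_orb cos γ / v_sun:

```python
# toy model of the expected annual modulation of the earth's halo velocity
import math

V_SUN = 232.0      # km/s, sun's velocity w.r.t. the galactic rest frame (rounded)
V_ORB = 30.0       # km/s, earth's orbital velocity
COS_GAMMA = 0.5    # cos of ~60 deg inclination w.r.t. the galactic plane
T0 = 152.5         # day of maximum (june 2)

def v_earth(t_days):
    omega = 2.0 * math.pi / 365.25
    return V_SUN + V_ORB * COS_GAMMA * math.cos(omega * (t_days - T0))

v_max, v_min = v_earth(T0), v_earth(T0 + 365.25 / 2.0)
print(f"june 2: {v_max:.1f} km/s, december 2: {v_min:.1f} km/s")
print(f"fractional modulation ~ {(v_max - v_min) / (v_max + v_min):.3f}")
# -> ~0.065, i.e. the few-percent effect quoted in requirement (6.)
```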
the dama/na i set-up and its performances are described in [15, 17–19], while the dama/libra set-up and its performances are described in [8, 10]. the sensitive part of the dama/libra set-up is made of 25 highly radiopure na i(tl) crystal scintillators placed in a 5-rows by 5-columns matrix; each crystal is coupled to two low-background photomultipliers working in coincidence at the single photoelectron level. the detectors are placed inside a sealed copper box flushed with hp nitrogen and surrounded by a low-background and massive shield made of cu/pb/cd-foils/polyethylene/paraffin; moreover, about 1 m of concrete (made from the gran sasso rock material) almost fully surrounds this passive shield (mostly outside the barrack), acting as a further neutron moderator. the installation has a 3-fold-level sealing system which excludes the detectors from environmental air. the whole installation is air-conditioned and the temperature is continuously monitored and recorded. the detectors' responses range from 5.5 to 7.5 photoelectrons/kev. energy calibrations with x-ray/γ sources are regularly carried out down to a few kev in the same conditions as the production runs. a software energy threshold of 2 kev is considered. the experiment takes data up to the mev scale and thus it is also sensitive to high-energy signals. for all the details see [8].

several analyses of the model-independent dm annual modulation signature have been performed (see [9, 10] and references therein). figure 1 shows the time behaviour of the experimental residual rates of the single-hit events collected by dama/na i and by dama/libra in the (2 ÷ 6) kev energy interval [9, 10]. the superimposed curve is the cosinusoidal function a cos ω(t − t₀) with period t = 2π/ω = 1 yr, phase t₀ = 152.5 day (june 2), and modulation amplitude a obtained by best fit over the 13 annual cycles. the hypothesis of absence of modulation in the data can be discarded [9, 10] and, when the period and the phase are released in the fit, values well compatible with those expected for a dm-particle-induced effect are obtained [10]; for example, in the cumulative (2 ÷ 6) kev energy interval: a = (0.0116 ± 0.0013) cpd/kg/kev, t = (0.999 ± 0.002) yr and t₀ = (146 ± 7) day.

figure 1. experimental model-independent residual rate of the single-hit scintillation events (cpd/kg/kev) versus time (day), measured by dama/na i (≈ 100 kg, 0.29 ton × yr) over seven annual cycles and by dama/libra (≈ 250 kg, 0.87 ton × yr) over six annual cycles in the (2 ÷ 6) kev energy interval [9, 10, 18]. the zero of the time scale is january 1 of the first year of data taking. the experimental points present the errors as vertical bars and the associated time bin width as horizontal bars. the superimposed curve is a cos ω(t − t₀) with period t = 2π/ω = 1 yr, phase t₀ = 152.5 day (june 2) and modulation amplitude a equal to the central value obtained by best fit over the whole data. the dashed vertical lines correspond to the maximum expected for the dm signal (june 2), while the dotted vertical lines correspond to the minimum ([9, 10] and references therein).

summarizing, the analysis of the single-hit residual rate favours the presence of a modulated cosine-like behaviour with the proper features at 8.9σ confidence level (c.l.) [10].
the same data of fig. 1 have also been investigated by a fourier analysis, obtaining a clear peak corresponding to a period of 1 year [10]; this analysis in other energy regions shows only aliasing peaks instead. moreover, in order to verify absence of annual modulation in other energy regions and, thus, to also verify the absence of any significant background modulation, the energy distribution in energy regions not of interest for dm detection has also been investigated. this has allowed the exclusion of a background modulation in the whole energy spectrum at a level much lower than the effect found in the lowest energy region for the single-hit events [10].

a further relevant investigation has been done by applying the same hardware and software procedures, used to acquire and to analyse the single-hit residual rate, to the multiple-hits events in which more than one detector “fires”. in fact, since the probability that a dm particle interacts in more than one detector is negligible, a dm signal can be present just in the single-hit residual rate. thus, this allows the study of the background behaviour in the same energy interval of the observed positive effect. a clear modulation is present in the single-hit events, while the fitted modulation amplitudes for the multiple-hits residual rate are well compatible with zero [10]. similar results were previously obtained also for the dama/na i case [19].

the annual modulation present at low energy has also been analyzed by depicting the differential modulation amplitudes, sm, as a function of the energy; the sm is the modulation amplitude of the modulated part of the signal obtained by maximum likelihood method over the data, considering t = 1 yr and t₀ = 152.5 day. the sm values are reported as a function of the energy in fig. 2.

figure 2. energy distribution of the modulation amplitudes sm (cpd/kg/kev) for the total cumulative exposure 1.17 t yr, obtained by maximum likelihood analysis; the energy bin is 0.5 kev. a clear modulation is present in the lowest energy region, while sm values compatible with zero are present just above. in fact, the sm values in the (6 ÷ 20) kev energy interval have random fluctuations around zero with χ² equal to 27.5 for 28 degrees of freedom [9, 10].

it can be inferred that a positive signal is present in the (2 ÷ 6) kev energy interval, while sm values compatible with zero are present just above; in particular, the sm values in the (6 ÷ 20) kev energy interval have random fluctuations around zero with χ² equal to 27.5 for 28 degrees of freedom. it has been also verified that the measured modulation amplitudes are statistically well distributed in all the crystals, in all the annual cycles and energy bins; these and other discussions can be found in [10].

these results confirm those achieved by other kinds of analyses. in particular, a modulation is present in the rate of the single-hit events of the lower energy intervals; the period and the phase agree with those expected for dm signals [10]. both the data of dama/libra and of dama/na i fulfil all the requirements of the dm annual modulation signature. careful investigations on absence of any significant systematics or side reaction have been quantitatively carried out (see e.g. [8–10, 13, 18, 27], and references therein).
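for readers who want to see what such a single-hit fit looks like mechanically, here is a self-contained python sketch with synthetic data (an editorial illustration, not the dama likelihood analysis) that fits a cos ω(t − t₀) to binned residuals and recovers the amplitude, period and phase:

```python
# fit a*cos(2*pi*(t - t0)/T) to synthetic residual-rate data (illustration only)
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(0)
t = np.arange(0, 13 * 365.25, 30.0)                  # ~13 yearly cycles, 30 d bins
true_a, true_T, true_t0 = 0.0116, 365.25, 152.5      # cpd/kg/keV, day, day
y = true_a * np.cos(2 * np.pi * (t - true_t0) / true_T) \
    + rng.normal(0, 0.004, t.size)                   # assumed gaussian scatter

def model(t, a, T, t0):
    return a * np.cos(2 * np.pi * (t - t0) / T)

popt, pcov = curve_fit(model, t, y, p0=[0.01, 365.0, 150.0])
for name, val, err in zip(("a", "T", "t0"), popt, np.sqrt(np.diag(pcov))):
    print(f"{name} = {val:.4f} +/- {err:.4f}")
# recovers a ~ 0.0116 cpd/kg/keV, T ~ 365 d, t0 ~ 152 d within errors
```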
no systematics or side reactions able to mimic the signature (that is, able to account for the measured modulation amplitude and simultaneously satisfy all the requirements of the signature) have been found or suggested by anyone over more than a decade. the obtained model-independent evidence is compatible with a wide set of scenarios regarding the nature of the dm candidate and the related astrophysical, nuclear and particle physics. some scenarios and parameter sets are discussed, for example, in [1, 4–6, 16, 18–21] and in appendix a of [9]. a further large literature is available on these topics [28]; other possibilities remain open. here we just recall the recent papers [29, 30], where the dama/nai and dama/libra results, which fulfil all the many peculiarities of the model-independent dm annual modulation signature, are examined under the particular hypothesis of a light-mass dm candidate particle interacting with the detector nuclei by a coherent elastic process; comparisons with some recent possible positive hints [31, 32] are also given. no other experiment exists whose results can be compared directly, in a model-independent way, with those of dama/nai and dama/libra. some activities claim model-dependent exclusions under many largely arbitrary assumptions (see for example the discussions in [9, 18, 19, 33, 34]); often critical points exist in their experimental aspects (e.g. the use of marginal exposures; the determination of the energy threshold, the energy resolution and the energy scale in the few-kev energy region of interest; multiple selection procedures; non-uniformity of the detector response; the absence of suitable periodical calibrations in the same running conditions and in the claimed low-energy region; stabilities; tails/overlapping of the populations of the subtracted events and of the considered recoil-like ones; well-known side processes mimicking recoil-like events; etc.), and the existing experimental and theoretical uncertainties are usually not considered in their presented model-dependent results. moreover, the implications of the dama results are generally presented in an incorrect, partial or outdated way, as has appeared in many papers in the literature and at conferences. similar considerations hold for the indirect detection searches: in this case too, no direct model-independent comparison can be performed between the results obtained in direct and indirect activities, since no one-to-one correspondence exists between the observables in the two kinds of experiment. moreover, these searches are restricted to some dm candidates and scenarios, and their results are strongly model dependent. in any case, the measurements published up to now are not in conflict with the effect observed by the dama experiments. in conclusion, for completeness we recall: i) the recent possible positive hints presented by cogent and cresst, exploiting different approaches/different target materials; ii) the uncertainties in the model-dependent results and comparisons; iii) the relevant argument of methodological robustness [35]. in particular, the general considerations on comparisons reported in appendix a of [9] still hold; on the other hand, any possible "positive" result has to be interpreted, and considerable room for compatibility with the dama annual modulation evidence remains.

3. last dama/libra upgrade and perspectives

the positive model-independent evidence for the presence of dm particles in the galactic halo is now supported at 8.9σ c.l. (cumulative exposure: 1.17 ton×yr, i.e.
13 annual cycles of dama/nai and dama/libra). it is worth noting, e.g., that: i) the exploited dark matter annual modulation signature acts itself as a strong background reduction procedure, as pointed out since the papers of the 1980s mentioned above; ii) this signature is unambiguous, since it requires the simultaneous fulfilment of many peculiarities; iii) the dama positive evidence has already been verified over 13 independent experiments of 1 year each and in different experimental situations (different detectors, different assembling, slightly different shields, different electronics, etc.); iv) no systematic effect or side reaction able to mimic this signature (that is, able to account for the observed modulation amplitude and to simultaneously satisfy all the many peculiarities of the signature) has been found or suggested by anyone over more than a decade. further corollary analyses in some of the many possible scenarios for dm candidates, interactions, halo models, nuclear/atomic properties, etc. are in progress, as are analyses/data taking to investigate other rare processes. a first upgrade of the dama/libra set-up was performed in september 2008. a further and more important upgrade was performed at the end of 2010, when all the pmts were replaced with new ones with higher quantum efficiency; all the details related to the developments and to the features of these high-q.e. pmts in dama/libra are reported in [14]. the purpose of this upgrade is: i) to increase the experimental sensitivity by lowering the software energy threshold of the experiment; ii) to improve the corollary investigations on the nature of the dark matter particle and the related astrophysical, nuclear and particle physics arguments; iii) to investigate other signal features; iv) to improve the sensitivity in the investigation of rare processes other than dark matter. this requires long, heavy, full-time dedicated work for the reliable collection and analysis of very large exposures, as the dama collaboration has always done. since january 2011 the dama/libra experiment has again been taking data in the new configuration, identified as dama/libra stage 2. further improvements are foreseen, with new preamplifiers and trigger modules realised to further advance the lowest-energy studies. moreover, in the future dama/libra will also continue its studies on several other rare processes [11, 12], as the former dama/nai apparatus also did [22]. further developments are in progress.

references

[1] r. bernabei et al.: 2007, int. j. mod. phys. a 22, 3155 [2] r. bernabei et al.: 2002, eur. phys. j. c 23, 61 [3] d. smith and n. weiner: 2001, phys. rev. d 64, 043502; d. tucker-smith and n. weiner: 2005, phys. rev. d 72, 063509 [4] r. bernabei et al.: 2008, mod. phys. lett. a 23, 2125 [5] r. bernabei et al.: 2008, phys. rev. d 77, 023506 [6] r. bernabei et al.: 2006, int. j. mod. phys. a 21, 1445 [7] a.k. drukier, k. freese, d.n. spergel: 1986, phys. rev. d 33, 3495; k. freese, j.a. frieman, a. gould: 1988, phys. rev. d 37, 3388 [8] r. bernabei et al.: 2008, nucl. instr. & meth. a 592, 297 [9] r. bernabei et al.: 2008, eur. phys. j. c 56, 333 [10] r. bernabei et al.: 2010, eur. phys. j. c 67, 39 [11] r. bernabei et al.: 2009, eur. phys. j. c 62, 327 [12] r. bernabei et al.: 2012, eur. phys. j. c 72, 1920 [13] r. bernabei et al.: 2012, eur. phys. j. c 72, 2064 [14] r. bernabei et al.: 2012, j. of inst. 7, p03009 [15] r.
bernabei et al.: 1999, il nuovo cim. a 112, 545 [16] r. bernabei et al.: 1996, phys. lett. b 389, 757; 1998, phys. lett. b 424, 195; 1999, phys. lett. b 450, 448; 2000, phys. rev. d 61, 023512; 2000, phys. lett. b 480, 23; 2001, phys. lett. b 509, 197; 2002, eur. phys. j. c 23, 61; 2002, phys. rev. d 66, 043503; 2006, int. j. mod. phys. a 21, 1445; 2006, eur. phys. j. c 47, 263; 2007, int. j. mod. phys. a 22, 3155; 2008, eur. phys. j. c 53, 205; 2008, phys. rev. d 77, 023506; 2008, mod. phys. lett. a 23, 2125 [17] r. bernabei et al.: 2000, eur. phys. j. c 18, 283 [18] r. bernabei et al.: 2003, la rivista del nuovo cimento 26, n.1, 1 [19] r. bernabei et al.: 2004, int. j. mod. phys. d 13, 2127 [20] r. bernabei et al.: 2006, eur. phys. j. c 47, 263 [21] r. bernabei et al.: 2008, eur. phys. j. c 53, 205 [22] see in: http://people.roma2.infn.it/dama [23] p. belli et al.: 1996, astropart. phys. 5, 217; 1996, nuovo cim. c 19, 537; 1996, phys. lett. b 387, 222; 1996, phys. lett. b 389, 783 (err.); r. bernabei et al.: 1998, phys. lett. b 436, 379; p. belli et al.: 1999, phys. lett. b 465, 315; 2000, phys. rev. d 61, 117301; r. bernabei et al.: 2000, new j. of phys. 2, 15.1; 2000, phys. lett. b 493, 12; 2000, nucl. instr. & meth. a 482, 728; 2001, eur. phys. j. direct c 11, 1; 2002, phys. lett. b 527, 182; 2002, phys. lett. b 546, 23; 2003, beyond the desert 2003 (berlin: springer) p. 365; 2006, eur. phys. j. a 27, s01 35 [24] r. bernabei et al.: 1997, astropart. phys. 7, 73; 1997, nuovo cim. a 110, 189; p. belli et al.: 1999, astropart. phys. 10, 115; 1999, nucl. phys. b 563, 97; r. bernabei et al.: 2002, nucl. phys. a 705, 29; p. belli et al.: 2003, nucl. instr. & meth a 498, 352; r. cerulli et al.: 2004, nucl. instr. & meth a 525, 535; r. bernabei et al.: 2005, nucl. instr. & meth a 555, 270; 2006, ukr. j. phys. 51, 1037; p. belli et al.: 2007, nucl. phys. a 789, 15; 2007, phys. rev. c 76, 064603; 2008, phys. lett. b 658, 193; 2008, eur. phys. j. a 36, 167; 2009, nucl. phys. a 826, 256; 2010, nucl. instr. & meth a 615, 301; 2011, nucl. instr. & meth a 626-627, 31; 2011, j. phys. g: nucl. part. phys. 38, 015103 [25] p. belli et al.: 2007, nucl. instr. & meth. a 572, 734; 2008, nucl. phys. a 806, 388; 2009, nucl. phys. a 824, 101; 2009, proceed. of the int. conf. npae 2008 (ed. inr-kiev, kiev), p. 473; 2009, eur. phys. j. a 42, 171; 2010, nucl. phys. a 846, 143; 2011, nucl. phys. a 859, 126; 2011, phys. rev. c 83, 034603; 2011, eur. phys. j. a 47, 91 [26] k. freese et al.: 2005, phys. rev. d 71, 043516; 2004, phys. rev. lett. 92, 111301 [27] r. bernabei et al.: 2010, aip conf. proceed. 1223, 50 (arxiv:0912.0660); 2010, j. phys.: conf. ser. 203, 012040 (arxiv:0912.4200); http: //taup2009.lngs.infn.it/slides/jul3/nozzoli.pdf, talk given by f. nozzoli; 2011, can. j. phys. 89, 11; 2011, sif atti conf. 103 (arxiv:1007.0595); pre-print rom2f/2011/12, tipp2011 conf., chicago, usa (2011) to appear on physics procedia [28] a. bottino et al.: arxiv:1112.5666; a. bottino et al.: 2010, phys. rev. d 81, 107302; n. fornengo et al.: 2011, phys. rev. d 83, 015001; a.l. fitzpatrick et al.: arxiv:1003.0014; d. hooper et al.: arxiv:1007.1005v2; d. g. cerdeno and o. seto: 2009, jcap 0908, 032; d. g. cerdeno, c. munoz, o. seto: 2009, phys. rev. d 79, 023510; d. g. cerdeno et al.: 2007, jcap 0706, 008; j.f. gunion et al.: arxiv:1009.2555; a.v. belikov et al.: arxiv:1009.0549; c. arina and n. fornengo: 2007, jhep 11, 029; g. belanger et al.: arxiv:1105.4878; s. chang et al.: 2009, phys. rev. d 79, 043513; s. 
chang et al.: arxiv:1007.2688; r. foot: arxiv:1001.0096, arxiv:1106.2688; 2010, phys. rev. d 82, 095001; y. mambrini: 2011, jcap 1107, 009; 2010, jcap 1009, 022; y. bai and p.j. fox: 2009, jhep 0911, 052; j. alwall et al.: arxiv:1002.3366; m. yu khlopov et al.: arxiv:1003.1144; s. andreas et al.: 2010, phys. rev. d 82, 043522; m.s. boucenna, s. profumo: arxiv:1106.3368; p.w. graham et al.: 2010, phys. rev. d 82, 063512; b. batell, m. pospelov and a. ritz: 2009, phys. rev. d 79, 115019; e. del nobile et al.: 2011, phys. rev. d 84, 027301; j. kopp et al.: 2010, jcap 1002, 014; v. barger et al.: 2010, phys. rev. d 82, 035019; s. chang et al.: 2010, jcap 1008, 018; j.l. feng et al.: 2011, phys. lett. b 703, 124; m.t. frandsen et al.: 2011, phys. rev. d 84, 041301; y.g. kim and s. shin: 2009, jhep 0905, 036; s. shin: arxiv:1011.6377; m.r. buckley: arxiv:1106.3583; n. fornengo et al.: arxiv:1108.4661; p. gondolo et al.: arxiv:1106.0885; e. kuflik et al.: arxiv:1003.0682; c. arina et al.: arxiv:1105.5121; m.r. buckley et al.: arxiv:1011.1499. [29] p. belli et al.: 2011, phys. rev. d 84, 055014. [30] a. bottino et al.: 2012, phys. rev. d 85, 095013. [31] c.e. aalseth et al.: arxiv:1002.4703; arxiv:1106.0650 [32] g. angloher et al.: arxiv:1109.0702 [33] r. bernabei et al.: 2009, "liquid noble gases for dark matter searches: a synoptic survey", exorma ed., roma, isbn 978-88-95688-12-1, pp. 1–53 (arxiv:0806.0011v2). [34] j.i. collar and d.n. mckinsey: arxiv:1005.0838; arxiv:1005.3723; j.i. collar: arxiv:1006.2031; arxiv:1010.5187; arxiv:1103.3481; arxiv:1106.0653; arxiv:1106.3559 [35] r. hudson: 2009, found. phys. 39, 174

discussion

francesco ronga — when will the data with the new set-up be released?

rita bernabei — the first data release will occur when we have collected and analysed an adequately large exposure, so that we can start to present significant results on the topics which have motivated this effort. for completeness, i mention that we plan to release the results of the last annual cycle of dama/libra stage 1 in the coming year or before.

acta polytechnica 53(supplement):589–594, 2013

doi:10.14311/ap.2013.53.0671 acta polytechnica 53(supplement):671–676, 2013 © czech technical university in prague, 2013 available online at http://ojs.cvut.cz/ojs/index.php/ap

from galactic to extragalactic jets: a review

j.h. bealla,b,c,∗ a st. john's college, annapolis, md b ssd, nrl, washington, dc c college of science, george mason university, fairfax, va ∗ corresponding author: beall@sjca.edu

abstract. an analysis of the data that have recently become available from observing campaigns, including vla, vlba, and satellite instruments, shows some remarkable similarities and significant differences in the data from some epochs of galactic microquasars, including grs 1915+105, the concurrent radio and x-ray data [3] on centaurus a (ngc 5128), 3c120 [35], and 3c454.3 as reported by bonning et al. [16], who showed the first results from the fermi space telescope for the concurrent optical, uv, ir, and γ-ray variability of that source.
in combination with observations of microquasars and quasars from the mojave collaboration [32], these data provide time-dependent evolutions of radio structures at mas resolution (i.e., on parsec scales for agns and astronomical-unit scales for microquasars). these sources all show a remarkable richness of patterns of variability for astrophysical jets across the entire electromagnetic spectrum. it is likely that these patterns of variability arise from the complex structures through which the jets propagate, but it is also possible that the jets' constitution, initial energy, and collimation have significant observational consequences. on the other hand, ulrich et al. [42] suggest that this picture is complicated for radio-quiet agn by the presence of significant emission from accretion disks in those sources. consistent with the jet-ambient-medium hypothesis, the observed concurrent radio and x-ray variability of centaurus a [3] could have been caused by the launch of a jet element from cen a's central source and that jet's interaction with the interstellar medium in the core region of that galaxy.

keywords: astrophysical jets, active galactic nuclei, uhe cosmic rays, quasars, microquasars.

1. introduction

we have become aware that jets are ubiquitous phenomena in astrophysics. extended linear structures that can be associated with jets are found in star-forming regions, in compact binaries, and of course in agn. in so-called blazar sources (see, e.g., [35, 42]) there seems to be a confirmed connection of jets with accretion disks. in sources without large-scale linear structures (i.e., jets), as ulrich et al. [42] note, the source variability could result from the complex interactions of the accretion disk with an x-ray-emitting corona. but to the extent that "small" jets are present in these sources, the disk-jet interaction must still be of paramount importance, since it provides a mechanism for carrying energy away from the disk. current theories (see, e.g., [15, 26] for a discussion of disk structure and jet-launch issues, respectively) suppose that the jet is formed and accelerated in the accretion disk. but even if this is true in all sources, it is still unclear whether astrophysical jets with shorter propagation lengths are essentially different in constitution from those that have much longer ranges, or whether the material through which the jet propagates determines the extent of the observational structures we call jets. at all events, the complexities of the jet-ambient-medium interaction must have a great deal to do with the ultimate size of the jets. this sort of argument has applications to both quasars and microquasars, especially if essentially similar physical processes occur in all these objects (see, e.g., [9, 35]). to some, it has become plausible that essentially the same physics is working over a broad range of temporal, spatial, and luminosity scales. hannikainen [25] and chaty [17] have discussed some of the emission characteristics of microquasars, and paredes [38] has considered the role of microquasars and agns as sources of high-energy γ-ray emission. in fact, the early reports of the concurrent radio and x-ray variability of centaurus a can be plausibly interpreted as the launch of a jet from cen a's central source into the complex structures in its core.
additionally, these observations are remarkably similar to the observations of galactic microquasars and agns, including the observations from the fermi space telescope of concurrent γ-ray, ir, optical, and uv variability of 3c454.3 [16], and observations [33] for bl lac, among others.

2. concurrent radio and x-ray variability of centaurus a (ngc 5128) as an indication of jet launch or jet–cloud interactions?

the first detection of concurrent, multifrequency variability of an agn comes from observations of centaurus a (fig. 1 of [3]). beall et al. conducted the observing campaign of cen a at three different radio frequencies in conjunction with observations from two different x-ray instruments on the oso-8 spacecraft, in the 2 ÷ 6 kev and 100 kev x-ray ranges. these data were obtained over a period of a few weeks, with the stanford interferometer at 10.7 ghz obtaining the most data. beall et al. also used data from other epochs to construct a decade-long radio and x-ray light curve of the source. figure 1a of [12] shows the radio data during the interval of the oso-8 x-ray observations, as well as the much longer-timescale flaring behavior evident at the three different radio frequencies and at both low-energy (2 ÷ 6 kev, see fig. 1b of [12]) and high-energy (∼ 100 kev, see fig. 1c of [12]) x-rays. as noted by beall [12], a perusal of fig. 1a shows that the data at 10.7 ghz (represented as a "+" in that figure) generally rise during 1973 to reach a peak in mid-1974, then decline to a relative minimum in mid-1975, only to go through a second peak toward the end of 1975 and a subsequent decline toward the end of 1976. this pattern of behavior is also shown in the ∼ 30 ghz data and the ∼ 90 ghz data, albeit with less coverage at the two higher radio frequencies. several points are worthy of note. first, as beall et al. [3] note, the radio and x-ray light curves track one another. this result is the first report of concurrent radio and x-ray variability of an active galaxy. mushotzky et al. [37], using the weekly 10.7 ghz data obtained by beall et al. [3], demonstrated that the 10.7 ghz radio data also track the 2 ÷ 6 kev x-ray data on weekly time scales. the concurrent variability at radio and x-ray frequencies suggests that the emitting region is the same for both the radio and the x-ray light. this, as was noted by beall and rose [4], can be used to set interesting limits on the parameters of the emitting region. in addition, the observations at the three radio frequencies (10.7 ghz, ∼ 30 ghz, and ∼ 90 ghz) clearly track one another throughout the interval whenever concurrent data are available. one plausible hypothesis for the observations is that we witnessed physical processes associated with the "launch" of an astrophysical jet into the complex structures in the core of centaurus a. the 6-day x-ray flare early in 1973 is likely to be associated with an accretion-disk acceleration event, while the longer-timescale evolution from early 1973 through 1977 appears to be associated with the evolution of a larger structure over more extended regions. it is also possible, however, that the observations are consistent with the interaction of the astrophysical jet with an interstellar cloud (i.e., the ambient medium) in the core of cen a.
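the weekly-timescale tracking of the radio and x-ray light curves can be quantified with a simple lagged correlation. the sketch below does this for two synthetic, evenly sampled series standing in for the 10.7 ghz and 2 ÷ 6 kev data; the series, the lag convention, and the numbers are illustrative only, not the published cen a measurements.

```python
import numpy as np

# toy weekly light curves: a shared smooth "driver" plus independent noise
rng = np.random.default_rng(0)
n_weeks = 200
driver = np.convolve(rng.normal(size=n_weeks + 59), np.ones(60) / 60, mode="valid")
radio = driver + 0.1 * rng.normal(size=n_weeks)
xray = driver + 0.1 * rng.normal(size=n_weeks)

def lag_correlation(a, b, max_lag):
    """Pearson correlation of two evenly sampled series at integer lags.

    Positive lag k correlates a shifted forward by k samples against b.
    """
    a = (a - a.mean()) / a.std()
    b = (b - b.mean()) / b.std()
    return {k: np.mean(a[max(0, k):len(a) + min(0, k)] *
                       b[max(0, -k):len(b) - max(0, k)])
            for k in range(-max_lag, max_lag + 1)}

corr = lag_correlation(radio, xray, max_lag=10)
best = max(corr, key=corr.get)
print(f"peak correlation {corr[best]:.2f} at lag {best} weeks")
```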
it is clear from this discussion that a distinction needs to be made here between the observational signatures associated with the jet launch and the jet itself, and those associated with the ambient medium's reaction to the jet. but this issue is greatly complicated by the fact that the ambient medium (through which the jet propagates) is accelerated and in fact becomes part of the jet. this occurs in agns on scales of parsecs. this issue will be discussed later in this paper when we consider the data from 3c120 (see, e.g., [35]). this behavior is also evident in the observations of sco x-1 by fomalont, geldzahler, and bradshaw [21], as discussed below, but on much shorter physical and temporal scales.

3. a comment on particle acceleration in the jet–ambient-medium interaction

van der laan [44] discussed the theoretical interpretation of cosmic radio data by assuming a source which contained a uniform magnetic field, suffused with an isotropic distribution of relativistic electrons. the source, as it expanded, caused an evolution of the radio light curve at different frequencies. each of the curves in van der laan's paper represents a factor of 2 difference in frequency, the vertical axis representing the intensity of the radio flux and the horizontal axis representing an expansion timescale for the emitting region. van der laan's calculations show a marked difference between the peaks at the various frequencies. the data from cen a (as discussed more fully in beall 2008 [11]) are, therefore, not consistent with van der laan expansion, since for van der laan expansion we would expect the different frequencies to achieve their maxima at different times. also, the peak intensities should decline with increasing frequencies, at least in the power-law portion of the spectrum. chaty [17] and hannikainen [25] have pointed out that for galactic microquasars there are some episodes which are consistent with van der laan expansion and, remarkably, some episodes that are not. another episode that seems nominally consistent with van der laan expansion of an isotropic source has been presented by mirabel et al. (1998) for a galactic microquasar. mirabel shows a series of observations of grs 1915+105 at radio, infrared, and x-ray frequencies. these data associate the genesis of the first galactic microquasar with instabilities in the accretion disk that are inferred from the x-ray flaring (see, e.g., [11]). the most likely explanation for the changes in the spectrum of the cen a data at 100 kev [3] is that the emitting region suffered an injection of energetic electrons. that is, a jet–ambient-medium interaction dumped energetic particles into a putative "blob", or, equivalently, there was a re-acceleration of the emitting electrons on a timescale short compared to the expansion time of the source. bonning et al. [16] performed an analysis of the multi-wavelength data from the blazar 3c454.3, using ir and optical observations from the smarts telescopes, optical, uv and x-ray data from the swift satellite, and public-release γ-ray data from the fermi-lat experiment, and demonstrated an excellent correlation between the ir, optical, uv and γ-ray light curves, with a time lag of less than one day. urry [43] in her recent paper noted that 3c454.3 can be a laboratory for multifrequency variability in blazars.
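as a concrete reference point for the van der laan picture discussed above, the following sketch evaluates the light-curve evolution of a uniformly expanding, homogeneous synchrotron source. the opacity law and the parameter values (the electron power-law index gamma and the reference optical depth tau0) are generic textbook choices, not quantities fitted to any of the sources discussed here.

```python
import numpy as np

def vdl_flux(r, nu, gamma=2.5, tau0=1.0):
    """Flux of a uniformly expanding synchrotron source (van der Laan picture).

    r: expansion factor R/R0; nu: frequency in units of a reference nu0.
    S ~ R^3 nu^{5/2} (1 - exp(-tau)), tau = tau0 nu^{-(gamma+4)/2} R^{-(2 gamma+3)}.
    """
    tau = tau0 * nu ** (-(gamma + 4) / 2) * r ** (-(2 * gamma + 3))
    return r ** 3 * nu ** 2.5 * (1.0 - np.exp(-tau))

r = np.linspace(1.0, 5.0, 400)          # source radius grows linearly with time
for nu in (1.0, 2.0, 4.0):              # curves a factor of 2 apart in frequency
    s = vdl_flux(r, nu)
    print(f"nu = {nu:>3}: flux peaks at R/R0 = {r[s.argmax()]:.2f}")
# different frequencies peak at different epochs -- the signature the
# cen a light curves, which track one another, do not show
```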
while a more precise analysis of the data will be required to determine the characteristics of the emitting regions for the observed concurrent flaring at the different frequencies, the pattern of a correlation between low-energy and higher-energy variability is consistent with that observed for cen a, albeit with the proviso that the energetics of the radiating particles in 3c454.3 is considerably greater. that is, the pattern of variability is consistent with the injection of relativistic particles into a region with relatively high particle and radiation densities (i.e., an interstellar cloud). the picture that emerges, therefore, is consistent with the observations of spatially and temporally resolved galactic microquasars and agn jets. perhaps most importantly, for the cen a data, and now for the data from 3c454.3, it is the concurrent variability that suggests that the radio to x-ray fluxes (in cen a's case) and the ir, optical, and uv to γ-ray fluxes (in 3c454.3's case) are created in the same region. this opens the possibility of estimating the source parameters from models of these sources. vlbi observations of cores vs. jets (see, e.g., the study of bl lac by bach et al. [2]) show the structures of the core and the jet as they change in frequency and time. it has thus become possible to separate and study the time variability of the jet and the core of agn at remarkably fine temporal and spatial scales.

4. galactic microquasars and agn jets

an analysis of the 3c120 results compared with the data from the galactic microquasar sco x-1, undertaken by beall (2006), shows a similar radio evolution, with rapidly moving "bullets" interacting with slower-moving, expanding blobs. it is highly likely that the elements of these sources that are consistent with van der laan expansion are the slower-moving, expanding blobs. i believe that the relativistically moving bullets, when they interact with these slower-moving blobs, are the genesis of the flaring we see, which appears as a re-acceleration of the emitting relativistic particles. i note that a similar scenario could be operating in cen a. this is not to say that the "slower-moving blobs" are not themselves moving relativistically, since the bi-polar lobes have significant enhancements in brightness due to relativistic doppler boosting for the blobs moving toward us. the true test of this hypothesis will require concurrent, multifrequency observations with resolutions sufficient to distinguish jet components from core emissions in galactic microquasars as well as in agn jets. one of the most remarkable sagas regarding the discovery of quasar-like activity in galactic sources comes from the decades-long investigation of sco x-1 by ed fomalont, barry geldzahler, and charlie bradshaw [21]. during their observations, an extended source changed relative position with respect to the primary object, disappeared, and then reappeared many times. we now know that they were observing a highly variable jet from a binary, neutron-star system. the determinant observation was conducted using the very large array (vla) in socorro, new mexico, and the vlba interferometer (see, e.g., beall, 2008, for a more complete discussion). put briefly, the data from sco x-1 and 3c120 show remarkable similarities and reveal a consistent pattern of behavior, albeit on remarkably different temporal and physical scales.
the radio structures appear to originate from the central source and propagate along an axis that maintains itself over time scales long compared to the variability time scales of the respective sources. the emission from the lobes fades over time, as one would expect from a source radiating via synchrotron and perhaps inverse-compton processes. the subsequent brightening of the lobes is apparently due to a re-energizing or re-acceleration via the interaction of the highly relativistic "bullets" of material, which propagate outward from the source and interact with the radio-emitting jets. the radio jets apparently come from prior eruptions in the central source, or from the ambient material through which the jet moves. it is unclear whether all of the material in sco x-1 comes from the central source, but it is likely that in 3c120 some part of the ambient medium through which the very fast beam propagates (i.e., the broad-line region) contributes to the material in the jet. as marscher et al. [34] note, this radiating material is intermediate between the broad-line region (blr) and the narrow-line region (nlr). the data outlined here suggest a model for the jet structures in which beams or blobs of energetic plasmas propagate outward from the central engine to interact with the ambient medium in the source region. this ambient medium in many cases comes from prior ejecta from the central source, but can also come from clouds in the broad-line region. the jet can apparently also excavate large regions, as is suggested by the complex structures in, for example, 3c120. the physical processes which can accelerate and entrain the ambient medium through which the jet propagates have been discussed in detail in several venues (see, e.g., [6, 9, 11, 40, 41]).

figure 1. 3c454.3 shown at milliarcsecond scales (left side, taken from the mojave vlba campaign) and arcsecond scales (right side, taken from the vla) (lister et al. [32], cooper et al. [18], respectively). the data show the remarkable complexity of the jet in the first few parsecs of its evolution, including a remarkable change in direction as it evolves. on the other hand, the jet on the scale of hundreds to thousands of parsecs shows a remarkable stability and constancy of direction. the data at milliarcsecond scales are the most recent available and were taken in april of 2012.

the observations of the concurrent ir, optical, uv, and γ-ray variability of 3c454.3 suggest a reinvestigation of the vlba data for this source using the mojave observations. it is worthy of note that the milliarcsecond observations show a complex evolution of structure at parsec scales, including an apparently sharp change of direction associated with changes in the polarization of the radio light at that point in the jet's evolution (see fig. 1a). this can be compared with vla data from the same source, which show the jet on scales of hundreds to thousands of parsecs. the parsec-scale jet seems eventually to order itself in the direction of the large-scale jet (see fig. 1), but it shows quite a dynamic evolution in its early stages. these data are even more complex than the data from sco x-1 or 3c120, since they add to the time-dependent evolution of the linear structure of the objects an apparent change in direction of the jet on a scale of a few parsecs.
furthermore, the suggestion of such an enormous scale for the dynamo region in 3c454.3 argues that we need to model magnetic structures on parsec scales for agn jet acceleration. it is also of interest to note that in cases like sco x-1 and 3c120 the complex jet structure, consisting of slower-moving "blobs" and very rapidly moving "bullets", is aligned on an apparently stable axis, while in the case of 3c454.3 and 1308+326.1 there is evidence of radical changes in direction over 10s to 100s of parsecs (see fig. 2). this is also true of the galactic microquasar ss433 (see, e.g., mioduszewski et al. [36], and doolin and blundell [20]), which shows a helical structure to the jet in both directions.

figure 2. observations of 1308+326.1 on a scale of milliarcseconds from the mojave vlba campaign [32] show an apparent precession of the jet in that source. these data thus apparently show a remarkable change in the propagation of the relativistic jet on the scale of parsecs.

this behavior is indeed remarkable. such a change in propagation direction bears directly on the question of the strength and large-scale structure of the magnetic field in the jet-interacting region. regarding the acceleration region and the possible mechanisms for the collimation of the jets, a number of models have been proposed (see, e.g., [14, 15, 29, 39]) which might help explain the complexity present in these data.

5. concluding remarks

our realization of the dynamic nature of jets from microquasars to agn has come from laudable persistence on the part of observing teams throughout the world. the detail available from observatories in the current epoch provides considerable guidance to those interested in modeling these sources. prior to these efforts, it was understandable that jets could have been considered as remarkable for their stability and persistence, given the data we had seen in the past, even in spite of the variability observed from agn and microquasar jets. however, what we are now seeing suggests a more complex and dynamic structure to the source regions. our condition is similar to that which occurred three decades ago, when we broke from assumptions of spherical symmetry in our models of active galaxies. we did this to consider the linear structures we saw on the sky. we now find that the true nature of these sources is even more complex. we must face the challenge this poses for our modeling of these truly dynamic, complex, and asymmetric structures.

acknowledgements

the author gratefully acknowledges the support of the office of naval research for this research. this research has made use of data from the mojave database, which is maintained by the mojave team [32]. the author gratefully acknowledges the data provided by the mojave team for this paper.

references

[1] the pierre auger collaboration, et al., 2007, science 318, 938 [2] bach, u., villata, m., raiteri, c. m., et al. 2006, "structure and variability in the vlbi jet of bl lacertae during the webt campaigns (1995–2004)," a&a, 456, 105 [3] beall, j.h. et al., 1978, ap. j., 219, 836 [4] beall, j.h., rose, w.k., 1980, ap. j., 238, 579 [5] beall, j.h., 1987, ap. j., 316, 227 [6] beall, j.h., 1990, physical processes in hot cosmic plasmas (kluwer: dordrecht), w. brinkman, a. c. fabian, & f. giovannelli, eds., pp.
341–355 [7] beall, j.h., bednarek, w., 2002, ap. j., 569, 343 [8] beall, j.h., 2002, in multifrequency behaviour of high energy cosmic sources, f. giovannelli & l. sabau-graziati (eds.), mem. s.a.it. 73, 379 [9] beall, j.h., 2003, in multifrequency behavior of high energy cosmic sources, chin. j. astron. astrophys. 3, suppl., 373 [10] beall, j.h., 2009, societa italiana di fisica, 98, 283–294 [11] beall, j.h., 2010, in multifrequency behaviour of high energy cosmic sources, frascati workship 2009, f. giovannelli & l. sabau-graziati (eds.), mem. s.a.it. 81, 395 [12] beall, j.h., 2011, mem. s. a. it., 83, 283–290 [13] benford, gregory, and protheroe, r.j., 2008, m.n.r.a.s., 383, 417–816 [14] bisnovatyi-kogan, genaddi, et al., 2002, mem. s.a.it. 73, 1134., in proceedings of the vulcano workshop on high-energy cosmic sources, 2001 [15] bisnovatyi-kogan, genaddi, 2012, acta polytechnica, vol. 53, supplement [16] bonning, e. w., bailyn, c., urry, c. m., buxton, m., fossati, g., maraschi, l., coppi, p.,scalzo, r., isler, j., kaptur, a. 2009, astrophys. j. lett., 697, l81 [17] chaty, s., 2007, in frontier objects in astrophysics and particle physics, f. giovannelli & g. mannocchi (eds.), italian physical society, editrice compositori, bologna, italy, 93, 329 [18] cooper et al., 2007, ap.j. supplement, 171, 376 [19] dar, arnon, 2009, private communication [20] doolin, s., and blundell, k.m., 2009, ap. j., 698, l23 [21] fomalont, e., geldzahler, b., bradshaw, c., 2001, ap. j. 558, 283–301 [22] giovannelli, f., sabau-graziati, l., 2004, space science reviews, 112, 1–443 (kluwer academic publishers: netherlands) [23] gómez et al. 2000, science 289, 2317 [24] hiribayashi et al., 2000 pasj 52, 997 [25] hannikainen, d.c., rodriguez, j., 2008, in multifrequency behavior of high energy cosmic sources, chin. j. astron. astrophys. 8, suppl., 341 [26] hawley, j.f., 2003, phys. plasmas 10, 1946 [27] jorstad, s.g., marscher, a.p., lister, m.l., stirling, a.m., cawthorne, t.v. et al., 2005, astron. j. 130, 1418–1465 [28] jorstadt, s., marscher, a., stevens, j., smith, p., forster, j. et al., 2006, in multifrequency behavior of high energy cosmic sources, chin. j. astron. astrophys. 6, suppl. 1, 247 [29] kundt, w. and gopal-krishna, 2004, journal of astrophysics and astronomy, 25, 115–127 [30] krause, m., camenzind, m., 2003, in the physics of relativistic jets in the chandra and xmm era, new astron. rev. 47, 573 [31] lightman, a. p., eardley, d. n., 1974, ap. j. letters 187, l1 [32] lister et al., 2009, a.j., 137, 3718 [33] madejski, greg m., et al., 1999, apj, 521, 145–154 [34] marscher, a.p., et al., 2002, nature, 417, 625–627 [35] marscher, a.p., 2006, in multifrequency behavior of high energy cosmic sources, chin. j. astron. astrophys. 6, suppl. 1, 262 [36] mioduszewski, a. j., rupen, m. p., walker, r. c., schillemat, k. m., & taylor, g. b. 2004, baas, 36, 967 [37] mushotzky, r.f., serlemitsos, p.j., becker, r.h., boldt, e.a., and holt, s.s., ap. j. 220, 790–797 [38] paredes, j., 2007, in frontier objects in astrophysics and particle physics, f. giovannelli & g. mannocchi (eds.), italian physical society, editrice compositori, bologna, italy, 93, 341 [39] romanova, m. and lovelace, r., 2005, 2009, triggering of relativistic jets, (instituto de astronomia, universidad nacional autonoma de mexico, william h. lee and enrico ramirez-ruiz, eds.); also at arxiv:0901.4753v1, astro-ph.he, 29 jan 2009 [40] rose, w.k., et al., 1984, ap. j. 280, 550 [41] rose, w.k., et al., 1987, ap. j. 
314, 95 [42] ulrich, marie-helene, maraschi, laura, and urry, c. megan, 1997, ann. rev. astron. astrophys., 35, 445–502 [43] urry, c. megan, 2011, j. astrophys. astr. 32, 139–145 [44] van der laan, h., 1966, nature 211, 1131 [45] zanni, c., murante, g., bodo, g., massaglia, s., rossi, p., ferrari, a., 2005, astron. astrophys., 429

discussion

giora shaviv — if my colleague, arnon dar, were here, he would have asked you whether you can say that the clouds you have described as moving in the jet are "cannon balls".

jim beall — although i don't think our colleague, arnon dar, would like the equivocation, i believe that it is a matter of perspective whether or not you consider these clouds as "cannon balls". they seem to be associated with instabilities in the accretion disks of agns, galactic binary systems, and perhaps even star-forming regions. and there appear to be at least two kinds of structures emitted: the slower-moving blobs and the very fast bullets.

matteo guainazzi — could you comment on the experimental evidence for "jets stacked in the ambient medium"? do radio and x-ray measurements indicate that compact radio galaxies are young rather than "frustrated"?

jim beall — it does appear that the jets have a significant effect on the ambient medium through which they propagate. but my saying this suggests that the distinction between the jet and the ambient medium is somewhat arbitrary. the jets are enormously energetic. they therefore heat the ambient material, accelerate it, and entrain it. however, the larger and more important question you are asking is whether the jets are somehow different in seyferts (where the jets are confined within the agn core) from the jets in giant radio galaxies, where they can extend hundreds of kiloparsecs. my guess is that the jets evolve as they propagate. they are likely to be electrons, positrons, and hadrons to begin with, and then an admixture of stellar-like material as they evolve. the difference in propagation lengths might be due to the energetics of the accretion disk, but that, too, is only a guess.

acta polytechnica 53(supplement):671–676, 2013

doi:10.14311/ap.2014.54.0281 acta polytechnica 54(4):281–284, 2014 © czech technical university in prague, 2014 available online at http://ojs.cvut.cz/ojs/index.php/ap

impact of strain rate on microalloyed steel sheet fracturing

mária miháliková∗, miroslav német, marek vojtko, department of materials science, faculty of metallurgy, technical university of košice, letná 9, 042 00 košice, slovakia ∗ corresponding author: maria.mihalikova@tuke.sk

abstract. the strain rate is a significant external factor, and its influence on material behavior in the forming process is a function of the material's internal structure. this paper presents an analysis of the impact of loading rates from 1.6 × 10−4 m s−1 to 24 m s−1 on changes in the fracture properties of steel sheet used for bodywork components in cars. experiments were performed on samples taken from hc420la grade strips produced by cold rolling and hot-dip galvanization. material strength properties were compared on the basis of the measured values, and changes in the character of the fracture surface were observed.
keywords: strain rate, microalloyed steel, fracture surface.

1. introduction

high-strength low-alloy (hsla) steel is a type of alloy steel that provides enhanced mechanical properties [1, 2]. increasing the strain rate increases the resistance of the material against deformation, but also increases the tendency toward brittle fracture. an increasing strain rate results in changes to the microstructure and substructure of a deformed material. in practical terms, this means that it is necessary to know the impact of the strain rate on the mechanical properties of a specific material. this forms the basis for calculating the deformation resistance, and also for describing the processes taking place during forming. it is quite complicated to predict the impact of the strain rate on the properties of a material. this is related to the fact that the intensity of the impact of the strain rate is a function of the internal structure of the material, and it is also very difficult to interpret test results at high rates. an increasing strain rate also increases the critical flow stress. the yield strength grows strongly, the tensile strength increases, and the deformation characteristics of the material are changed [3, 4]. at the same time, the values of the forming criteria derived from these characteristics are also changed [5–7].

2. experimental material and methods

experiments were performed on samples taken from cold-rolled and subsequently hot-dip-galvanized strips of hc420la grade intended for the production of stampings in the automotive industry. the chemical composition of the tested steel is shown in table 1. the microstructure of the tested material is polygonal ferritic, with small amounts of fine pearlitic grains precipitated at the boundaries of the ferritic grains (fig. 1).

figure 1. steel hc420la – microstructure.

the tested material was 1.0 mm in thickness. samples of the material were taken in the rolling direction, and flat test specimens were produced for the tensile test. the tensile test was carried out on an instron 1185 tensile testing machine at loading speeds of 1–1000 mm/min. dynamic tests were carried out on a psw-type pendulum impact tester at a maximum speed of 24 m/s.

3. results and discussion

according to [3, 4], an increasing strain rate increases the resistance of the tested steel to plastic deformation, and there is an increase in the yield strength and the tensile strength (fig. 2). the relation between the stress (r) and the strain rate ε̇ can be described by the following equation [4]:

rmε̇ = rmε̇0 + a log(ε̇/ε̇0),

where a is a material constant expressing the sensitivity of the material to the strain rate, rmε̇ is the ultimate tensile strength at strain rate ε̇, and rmε̇0 is the ultimate tensile strength at the lowest strain rate.

figure 2. effect of loading rate on the yield stress re and the ultimate tensile strength rm of hc420la steel.
figure 3. influence of loading rate on re/rm.

one of the characteristics of thin-sheet formability is the re/rm ratio. this ratio grows with increasing strain rate, as shown in fig. 3. this increases the risk of local plastic instability. typical deep-drawing sheets have a ratio of about 0.6. sheets tested under a static load had a ratio re/rm = 0.72. this ratio increased with increasing loading speed (fig. 3).
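the logarithmic strengthening relation above is easy to evaluate numerically. the sketch below does so in python; the baseline strength rm0 and the sensitivity constant a are illustrative placeholder values for an hc420la-like sheet, not constants fitted in this paper.

```python
import math

def tensile_strength(rate, rm0=470.0, a=18.0, rate0=1.6e-4):
    """Logarithmic strain-rate strengthening: Rm = Rm0 + A*log10(rate/rate0).

    rm0 [MPa] and a [MPa per decade of rate] are illustrative values,
    not fitted constants from the measurements reported here.
    """
    return rm0 + a * math.log10(rate / rate0)

for rate in (1.6e-4, 1.6e-2, 1.6, 24.0):   # loading speeds in m/s
    print(f"v = {rate:8.2e} m/s -> Rm ~ {tensile_strength(rate):5.1f} MPa")
```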
the paper [8] states that the formability (plasticity) of the tested steels decreased after an re/rm ratio > 0.82 was reached. at a static loading speed of 1.6 × 10−4 m/s, a transcrystalline ductile failure with a dimple morphology can be observed on the fracture surface. the shape and symmetry of the dimples are related to the stress at the failure spot. the size and layout of the dimples depend on the grain size. at low rates, see fig. 4, there is a ductile failure with an equiaxial dimple morphology, where the dimples are deep. the generation of a fracture surface is accompanied by significant plastic deformation associated with the increasing number of active slip systems at a higher strain rate. the fracture surface obtained at a loading speed of 1.6 m/s has similar characteristics (fig. 5). on the fracture surface, an increased number of secondary cracks and voids are generated in the direction of the lines. the increase in plastic deformation is clear from the shape of the dimples.

table 1. chemical composition of the material hc420la (in wt. %): c 0.1, si 0.5, mn 1.6, p 0.025, s 0.025, al 0.015, ti 0.015, nb 0.09.
figure 4. fracture surface at a loading rate of 1.6 × 10−4 m/s.
figure 5. fracture surface at a loading rate of 1.6 m/s.

the dimples are elongated, with a strong presence of striations on the walls. coalescence of the cavities is seen more significantly in the direction perpendicular to the direction of the tensile stress. at a loading speed of 24 m/s, the angle of rupture increases (fig. 6), an uneven surface is generated, and the dimples are shallower. as the voids grow during ductile failure, the coalescence bridges become narrow, and the bridges break due to gradual stretching. ductile fractures form through micro-defect nucleation, void growth, and contraction of the bridges between the voids. the nucleation of micro-defects in ductile failure is caused by decohesion of the inclusions and other particles from the matrix.

figure 6. fracture surface at a loading rate of 24 m/s.

microscopic observation of the strain-strengthened steel structure confirmed that as the strain rate increases, the inhomogeneity of the plastic deformation also increases in the volume of the deformed steel. it follows that the resulting properties of the strain-strengthened material are influenced by the strain rate.

4. conclusion

this paper has presented an analysis of the impact of the strain rate on changes in the mechanical properties and the fracture appearance of hc420la steel sheet. on the basis of the results of tensile tests and dynamic tests in the loading-speed range from 1.6 × 10−4 m/s to 24 m/s for the tested steel, it can be stated that:

• at increasing strain rates up to about 3 s−1, there is no deterioration in the deep-drawing characteristics of the material, but the deformation resistance increases.
• the limit state characterizing the drop in formability as a result of increasing the loading speed can be determined by the re/rm ratio. for the tested sheet, this ratio was around 0.82.
• for the hsla steel of hc420la grade tested here, ductile failure is generated by the void mechanism.
• at all rates, the material fails by transcrystalline ductile fracture with a dimple morphology.
• as the strain rate increases, the plastic deformation becomes more significant, there is a greater number of voids oriented in the direction of the lines, and the dimples become shallower.

the localization of plastic deformation in deep-drawing automobile sheet metal is considered to be one of the most important indicators of the formability of the sheets. the results of these experiments are recommended for practical applications regarding the exhaustion of the plasticity of materials and the behavior of materials during dynamic processes. when using and including these experimental findings in practical applications, it is necessary to support the analysis by measurements of the substructure of the material and by implementing ebsd analysis.

acknowledgements

this study has been supported by the vega 1/0549/14 grant project.

references

[1] wang, h.r., wang, w., gao, j.o.: precipitates in two zr-bearing hsla steel plates. materials letters, 64 (2), 2010, p. 219–222. doi:10.1016/j.matlet.2009.10.053 [2] huh, h., lim, j.h. et al.: high speed tensile test of steel for the stress-strain curve at the intermediate strain rate. international journal of automotive technology, 10 (2), 2009, p. 195–204. doi:10.1007/s12239-009-0023-3 [3] čižmárová, e., micheľ, j. et al.: influence of strain rate on properties of microalloyed s-mc steel grades. metalurgija, 41 (4), 2002, p. 285–290. [4] mihaliková, m., janek, j.: influence of the loading and strain rates on the strength properties and formability of higher-strength sheet. metalurgija, 46 (2), 2007, p. 107–110. [5] német, m., mihaliková, m.: the effect of strain rate on the mechanical properties of automotive steel sheets. acta polytechnica, 53 (4), 2013, p. 384–387. [6] niechajowicz, a., tobota, a.: application of flywheel machine for sheet metal dynamic tensile tests. archives of civil and mechanical engineering, 8, 2008, p. 129–137. [7] buršák, m., mamuzič, i., micheľ, j.: contribution to evaluation of mechanical properties during impact loading. metalurgija, 47 (1), 2008, p. 19–23. [8] micheľ, j., buršák, m.: the influence of strain rate on the plasticity of steel sheets. komunikacie, 12 (4), 2010, p. 27–32.

acta polytechnica 54(4):281–284, 2014

acta polytechnica vol. 50 no. 5/2010

two-particle harmonic oscillator in a one-dimensional box

p. amore, f. m. fernández

abstract we study a harmonic molecule confined to a one-dimensional box with impenetrable walls. we explicitly consider the symmetry of the problem for the cases of different and equal masses. we propose suitable variational functions and compare the approximate energies given by the variational method and perturbation theory with accurate numerical ones for a wide range of values of the box length. we analyze the limits of small and large box size.

keywords: harmonic oscillator, diatomic molecule, confined system, one-dimensional box, point symmetry, avoided crossings, perturbation theory, variational method.

1 introduction

during the last decades, there has been great interest in the model of a harmonic oscillator confined to boxes of different shapes and sizes [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23].
such a model has been suitable for the study of several physical problems, ranging from dynamical friction in star clusters [4] to magnetic properties of solids [6] and impurities in quantum dots [23]. one of the most widely studied models is given by a particle confined to a box with impenetrable walls at −l/2 and l/2, bound by a linear force that produces a parabolic potential-energy function v(x) = k(x − x0)²/2, where |x0| < l/2. when x0 = 0 the problem is symmetric and the eigenfunctions are either even or odd; such symmetry is broken when x0 ≠ 0. although interesting in itself, this model is rather artificial because the cause of the force is not specified. it may, for example, arise from an infinitely heavy particle clamped at x0. in such a case we think that it is more interesting to consider that the other particle also moves within the box. the purpose of this paper is to discuss the model of two particles confined to a one-dimensional box with impenetrable walls. for simplicity we assume that the force between them is linear. in sec. 2 we introduce the model and discuss some of its general mathematical properties. in sec. 3 we discuss the solutions of the schrödinger equation for small box lengths by means of perturbation theory. in sec. 4 we consider the regime of large boxes and propose suitable variational functions. in sec. 5 we compare the approximate energies provided by perturbation theory and the variational method with accurate numerical methods. finally, in sec. 6 we summarize the main results and draw additional conclusions.

2 the model

as mentioned above, we are interested in a system of two particles of masses m1 and m2 confined to a one-dimensional box with impenetrable walls located at x = −l/2 and x = l/2. if we assume a linear force between the particles, then the hamiltonian operator reads

$$\hat{H} = -\frac{\hbar^2}{2}\left(\frac{1}{m_1}\frac{\partial^2}{\partial x_1^2} + \frac{1}{m_2}\frac{\partial^2}{\partial x_2^2}\right) + \frac{k}{2}(x_1 - x_2)^2 \tag{1}$$

and the boundary conditions are ψ = 0 when xi = ±l/2. it is convenient to convert it to a dimensionless form by means of the variable transformation qi = xi/l, which leads to

$$\hat{H}_d = \frac{m_1 L^2}{\hbar^2}\,\hat{H} = -\frac{1}{2}\left(\frac{\partial^2}{\partial q_1^2} + \beta\frac{\partial^2}{\partial q_2^2}\right) + \frac{\lambda}{2}(q_1 - q_2)^2 \tag{2}$$

where β = m1/m2 and λ = k m1 l⁴/ħ², and the boundary conditions become ψ = 0 if qi = ±1/2. without loss of generality we assume that 0 < β ≤ 1. the free problem (−∞ < xi < ∞) is separable in terms of the relative and center-of-mass variables

$$x = x_1 - x_2, \qquad X = \frac{1}{M}(m_1 x_1 + m_2 x_2), \qquad M = m_1 + m_2 \tag{3}$$

respectively, which lead to

$$\hat{H} = -\frac{\hbar^2}{2}\left(\frac{1}{M}\frac{\partial^2}{\partial X^2} + \frac{1}{m}\frac{\partial^2}{\partial x^2}\right) + V(x), \qquad m = \frac{m_1 m_2}{M}. \tag{4}$$

in this case we can factor the eigenfunctions as

$$\psi_{Kv}(x_1, x_2) = e^{iKX}\,\phi_v(x), \qquad -\infty < K < \infty, \quad v = 0, 1, \ldots \tag{5}$$

where φv(x) are the well-known eigenfunctions of the harmonic oscillator, and the eigenvalues read

$$E_{Kv} = \frac{\hbar^2 K^2}{2M} + \hbar\sqrt{\frac{k}{m}}\left(v + \frac{1}{2}\right). \tag{6}$$

however, because of the boundary conditions, any eigenfunction is of the form ψ(x1, x2) = (l²/4 − x1²)(l²/4 − x2²) φ(x1, x2), where φ(x1, x2) does not vanish at the walls. we clearly appreciate that the separation just outlined is not possible in the confined model. when β < 1 the transformations that leave the hamiltonian operator (including the boundary conditions) invariant are the identity ê : (q1, q2) → (q1, q2) and the inversion ı̂ : (q1, q2) → (−q1, −q2). therefore, the eigenfunctions of ĥd are bases for the irreducible representations ag and au of the point group s2 [24] (also called ci by other authors). on the other hand, when β = 1 (equal masses) the problem exhibits the highest possible symmetry. in this case the transformations that leave the hamiltonian operator (including the boundary conditions) invariant are the identity ê : (q1, q2) → (q1, q2), the rotation by π, c2 : (q1, q2) → (q2, q1), the inversion ı̂ : (q1, q2) → (−q1, −q2), and the reflection in a plane perpendicular to the rotation axis, σh : (q1, q2) → (−q2, −q1). the states are then basis functions for the irreducible representations ag, au, bg, and bu of the point group c2h [24].
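since the confined problem is not separable, its spectrum has to be obtained numerically. the sketch below is a minimal stand-in for such a calculation: it diagonalizes the dimensionless hamiltonian (2) with plain finite differences and dirichlet walls at q = ±1/2, rather than the lsf collocation basis used later in the paper; the grid size and parameter values are illustrative.

```python
import numpy as np
from scipy.sparse import diags, identity, kron
from scipy.sparse.linalg import eigsh

def box_oscillator_levels(beta=0.5, lam=100.0, n=60, k=6):
    """Lowest eigenvalues of H_d = -(1/2)(d2/dq1^2 + beta d2/dq2^2)
    + (lam/2)(q1 - q2)^2 with psi = 0 at q_i = +/- 1/2."""
    q = np.linspace(-0.5, 0.5, n + 2)[1:-1]          # interior grid points only
    h = q[1] - q[0]
    d2 = diags([1.0, -2.0, 1.0], [-1, 0, 1], shape=(n, n)) / h**2
    eye = identity(n)
    q1 = np.repeat(q, n)                              # q1 varies with the first index
    q2 = np.tile(q, n)                                # q2 varies with the second index
    hmat = (-0.5 * kron(d2, eye) - 0.5 * beta * kron(eye, d2)
            + diags(0.5 * lam * (q1 - q2) ** 2))
    return np.sort(eigsh(hmat, k=k, which="SA")[0])

print(box_oscillator_levels())   # rough check against the accurate LSF values
```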
3 small box

when λ ≪ 1 we can apply perturbation theory, choosing the unperturbed or reference hamiltonian operator to be ĥ0d = ĥd(λ = 0). its eigenfunctions and eigenvalues are given by

$$\varphi^{(0)}_{n_1,n_2}(q_1,q_2) = \begin{cases} 2\cos[(2i-1)\pi q_1]\cos[(2j-1)\pi q_2] & A_g \\ 2\sin(2i\pi q_1)\sin(2j\pi q_2) & A_g \\ 2\cos[(2i-1)\pi q_1]\sin(2j\pi q_2) & A_u \\ 2\sin(2i\pi q_1)\cos[(2j-1)\pi q_2] & A_u \end{cases}, \quad i, j = 1, 2, \ldots$$

$$\epsilon^{(0)}_{n_1,n_2} = \frac{\pi^2}{2}\left(n_1^2 + \beta n_2^2\right), \qquad n_1, n_2 = 1, 2, \ldots \tag{7}$$

there is no degeneracy when β < 1, except for the accidental one that takes place for particular values of β, which we will not discuss in this paper. the energies corrected to first order read

$$\epsilon^{[1]}_{n_1,n_2} = \frac{\pi^2}{2}\left(n_1^2 + \beta n_2^2\right) + \lambda\,\frac{\pi^2 n_1^2 n_2^2 - 3\left(n_1^2 + n_2^2\right)}{12\pi^2 n_1^2 n_2^2}. \tag{8}$$

when β = 1 the zeroth-order states $\varphi^{(0)}_{n_1,n_2}$ and $\varphi^{(0)}_{n_2,n_1}$ (n1 ≠ n2) are degenerate, but it is not necessary to resort to perturbation theory for degenerate states in order to obtain the first-order energies. we simply take into account that the eigenfunctions of ĥ0d adapted to the symmetry of the problem are

$$\varphi^{(0)}_{n_1,n_2}(q_1,q_2) = \begin{cases} \sqrt{2-\delta_{ij}}\,\left\{\cos[(2i-1)\pi q_1]\cos[(2j-1)\pi q_2] + \cos[(2j-1)\pi q_1]\cos[(2i-1)\pi q_2]\right\} & A_g \\ \sqrt{2}\,\left\{\cos[(2i-1)\pi q_1]\cos[(2j-1)\pi q_2] - \cos[(2j-1)\pi q_1]\cos[(2i-1)\pi q_2]\right\} & B_g \\ \sqrt{2}\,\left\{\cos[(2i-1)\pi q_1]\sin(2j\pi q_2) + \sin(2j\pi q_1)\cos[(2i-1)\pi q_2]\right\} & A_u \\ \sqrt{2}\,\left\{\cos[(2i-1)\pi q_1]\sin(2j\pi q_2) - \sin(2j\pi q_1)\cos[(2i-1)\pi q_2]\right\} & B_u \\ \sqrt{2-\delta_{ij}}\,\left\{\sin(2i\pi q_1)\sin(2j\pi q_2) + \sin(2j\pi q_1)\sin(2i\pi q_2)\right\} & A_g \\ \sqrt{2}\,\left\{\sin(2i\pi q_1)\sin(2j\pi q_2) - \sin(2j\pi q_1)\sin(2i\pi q_2)\right\} & B_g \end{cases} \tag{9}$$

where i, j = 1, 2, ... these give us the energies corrected to first order as $\epsilon^{[1]}_n(S) = \langle\varphi^{(0)}_{ij}(S)|\hat{H}_d|\varphi^{(0)}_{ij}(S)\rangle$, where s denotes the irreducible representation. since some of these analytical expressions are rather cumbersome for arbitrary quantum numbers, we simply show the first six energy levels for future reference:

$$\epsilon^{[1]}_1(A_g) = \pi^2 + \frac{\lambda(\pi^2 - 6)}{12\pi^2}$$
$$\epsilon^{[1]}_1(A_u) = \frac{5\pi^2}{2} + \frac{\lambda(108\pi^4 - 405\pi^2 - 4096)}{1296\pi^4}$$
$$\epsilon^{[1]}_1(B_u) = \frac{5\pi^2}{2} + \frac{\lambda(108\pi^4 - 405\pi^2 + 4096)}{1296\pi^4}$$
$$\epsilon^{[1]}_2(A_g) = 4\pi^2 + \frac{\lambda(2\pi^2 - 3)}{24\pi^2}$$
$$\epsilon^{[1]}_3(A_g) = \epsilon^{[1]}_1(B_g) = 5\pi^2 + \frac{\lambda(3\pi^2 - 10)}{36\pi^2}. \tag{10}$$

the degeneracy of the approximate energies denoted $\epsilon^{[1]}_3(A_g)$ and $\epsilon^{[1]}_1(B_g)$ is broken at higher perturbation orders, as shown by the numerical results in sec. 5.
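the closed forms in eq. (10) are straightforward to tabulate; a minimal sketch doing so is given below, useful only in the small-λ regime where first-order perturbation theory is meaningful (the value λ = 10 is an illustrative choice).

```python
import numpy as np

def first_order_levels(lam):
    """First-order small-box energies of eq. (10) for beta = 1 (equal masses)."""
    pi = np.pi
    return {
        "eps1(Ag)": pi**2 + lam * (pi**2 - 6) / (12 * pi**2),
        "eps1(Au)": 2.5 * pi**2 + lam * (108 * pi**4 - 405 * pi**2 - 4096) / (1296 * pi**4),
        "eps1(Bu)": 2.5 * pi**2 + lam * (108 * pi**4 - 405 * pi**2 + 4096) / (1296 * pi**4),
        "eps2(Ag)": 4 * pi**2 + lam * (2 * pi**2 - 3) / (24 * pi**2),
        "eps3(Ag)=eps1(Bg)": 5 * pi**2 + lam * (3 * pi**2 - 10) / (36 * pi**2),
    }

for label, e in first_order_levels(lam=10.0).items():
    print(f"{label:>18}: {e:8.4f}")
```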
(13) obviously, perturbation expressions (8) or (10) are unsuitable for this analysis and we have to resort to other approaches. in order to obtain accurate eigenvalues and eigenfunctions for the present model we may resort to the rayleigh-ritz variational method and the basis set of eigenfunctions of ĥ0d given in equations (7) and (9). alternatively, we can also make use of the collocation method with the so-called little sinc functions (lsf) that proved useful for the treatment of coupled anharmonic oscillators [25]. in this paper we choose the latter approach. another way of obtaining approximate eigenvalues and eigenfunctions is provided by a straightforward variational method proposed some time ago [26]. the trial functions suitable for the present model are of the form ϕ(q1, q2)= ( 1 4 − q21 )( 1 4 − q22 ) f(c, q1, q2)e −a(q1−q2)2 (14) where c= {c1, c2, . . . , cn } are linear variational parameters, which would give rise to the well known rayleighritz secular equations, and a is a nonlinear variational parameter. even the simplest and crudest variational functions provide reasonable results for all values of λ, as shown in sec. 5. the simplest trial function for the ground state of the model with β < 1 is ϕ(q1, q2)= ( 1 4 − q21 )( 1 4 − q22 ) e−a(q1−q2) 2 . (15) note that this function is the basis for the irreducible representation ag. we calculate w(a, λ) = 〈 ϕ ∣∣∣ĥd∣∣∣ϕ〉/ 〈ϕ|ϕ〉 andobtain λ(a) fromthevariationalcondition ∂w/∂a =0sothat [w(a, λ(a)), λ(a)] is a suitableparametric representation of the approximate energy. in this way we avoid the tedious numerical calculation of a for each given value of λ and obtain an analytical parametric expression for the energy, which we do not show here because it is rather cumbersome. we just mention that the parametric expression is valid for a > a0 where a0 is the greatest positive root of λ(a)=0. when β =1 we choose the following trial functions for the lowest states of each symmetry type ϕag(q1, q2) = ( 1 4 − q21 )( 1 4 − q22 ) e−a(q1−q2) 2 ϕau(q1, q2) = ( 1 4 − q21 )( 1 4 − q22 ) (q1 + q2)e −a(q1−q2)2 ϕbu(q1, q2) = ( 1 4 − q21 )( 1 4 − q22 ) (q1 − q2)e−a(q1−q2) 2 ϕbg(q1, q2) = ( 1 4 − q21 )( 1 4 − q22 )( q21 − q 2 2 ) e−a(q1−q2) 2 . (16) 5 results fig. 1 shows the ground-state energy for β =1/2 calculated by means of perturbation theory, the lsf method and the variational function (15) for small and moderate values of λ. fig. 2 shows the results of the latter two approaches for a wider range of values of λ. we appreciate the accuracy of the energy provided by the simple variational function (15) for all values of λ. the reader will find all the necessary details about the lsf collocation method elsewhere [25]. here we just mention that a grid with n = 60 was sufficient for the calculations carried out in this paper. fig. 3 shows that �(λ)/ √ λ calculated by the same two methods for β = 1/2 approaches √ 3/8 as suggested by equation (11). fig. 4 shows the first six eigenvalues �(λ) for β = 1/2 calculated by means of the lsf collocation method. the level order to the left of the crossings 20 acta polytechnica vol. 50 no. 5/2010 0 50 100 150 200 λ 8 10 12 14 ε fig. 1: perturbation theory (dotted line), lsf (points) and variational (solid line) calculation of the ground-state energy �(λ) for β =1/2 0 500 1000 1500 2000 λ 10 15 20 25 30 ε fig. 2: variational (line) and lsf (points) calculation of the ground-state �(λ) for β =1/2 10 -1 10 0 10 1 10 2 10 3 10 4 10 5 10 6 λ 0 1 2 3 4 5 6 7 8 9 10 ε / λ1 /2 fig. 
3: variational (line) and lsf (points) calculation of the ground-state �(λ)/ √ λ for β = 1/2. the horizontal line marks the limit √ 3/8 21 acta polytechnica vol. 50 no. 5/2010 0 200 400 600 800 1000 λ 0 10 20 30 40 50 60 70 80 ε fig. 4: first six eigenvalues for β =1/2 calculated by means of the lsf method 0 50 100 150 200 λ 0 10 20 30 40 50 60 70 ε fig. 5: first six eigenvalues for β = 1. circles, dotted line and solid line correspond to the lsf collocation approach, perturbation theory and variation method, respectively. the level order is ag < au < bu < 2ag < 3ag < bg 10 0 10 1 10 2 10 3 10 4 10 5 10 6 λ 1 10 100 ε / λ 1 /2 a g a u b u b g fig. 6: variational (solid line) and lsf (symbols) calculation of �(λ)/ √ λ for β =1. the horizontal lines mark the limits 1/ √ 2 and 3/ √ 2 22 acta polytechnica vol. 50 no. 5/2010 is �1(ag) < �1(au) < �2(au) < �2(ag) < �3(ag) < �3(au). note the crossings between states of different symmetry and the avoided crossing between the states 2au and 3au. fig. 5 shows the first six eigenvalues for β =1 calculated by means of perturbation theory, the lsf method and the variational functions (16) for small values of λ. the energy order is �1(ag) < �1(au) < �1(bu) < �2(ag) < �3(ag) < �1(bg) and we appreciate the splitting of the energy levels �3(ag) and �1(bg) that does not take place at the first order of perturbation theory, as discussed in sec. 3. finally, fig. 6 shows �(λ)/ √ λ for sufficiently large values of λ. we appreciate that the four simple variational functions (16) are remarkably accurate and that �(λ)/ √ λ → 1/ √ 2 for the first two states of symmetry ag and au and �(λ)/ √ λ → 3/ √ 2 for the next two ones of symmetry bu and bg. these results are consistentwith equation (13), which suggests that the energies of the states with symmetry a and b approach √ 2(2v +1/2) and √ 2(2v +3/2), respectively. in fact, fig. 6 shows four particular examples with v =0. 6 conclusions themodel discussed in this paper is different from those considered before [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23], because in the present case the linear force is due to the interactionbetween two particles. although the interaction potential depends on the distance between the particles the problem is not separable and should be treated as a two-dimensional eigenvalue equation. it is almost separable for a sufficiently small box because the interaction potential is negligible in such a limit, and also for a sufficiently large box where the boundary conditions have no effect. it is convenient to take into account the symmetry of the problemand classify the states in terms of the irreducible representationsbecause it facilitates the discussion of the connection between both regimes. the model may be suitable for investigating the effect of pressure on the vibrational spectrum of a diatomic molecule, and in principle one can calculate the spectral lines by means of the rayleigh-ritz or the lsf collocation method [25]. the simple variational functions developed some time ago [26] and adapted to present problem in sec. 4 provide remarkably accurate energies for all box size values, and are, for that reason, most useful for showing the connection between the two regimes and for verifying the accuracyofmore elaborate numerical calculations. references [1] auluck, f.c.,kothari,d. s.: energy-levels of an artificially bounded linear oscillator.science andculture, 7 (6), 1940, p. 370–371. [2] auluck, f. 
c.: energy levels of an artificially bounded linear oscillator.proc. nat. inst. india, 7 (2), 1941, p. 133–140. [3] auluck, f. c.: white dwarf and harmonic oscillator. proc. nat. inst. india, 8 (2), 1942, p. 147–156. [4] chandrasekhar, s.: dynamical friction ii. the rate of escape of stars from clusters and the evidence for the operation of dynamical friction. astrophys. j., 97 (2), 1943, p. 263–273. [5] auluck, f. c., kothari, d. s.: the quantum mechanics of a bounded linear harmonic oscillator. proc. camb. phil. soc., 41 (2), 1945, p. 175–179. [6] dingle, r. b.: some magnetic properties of metals iv. properties of small systems of electrons. proc. roy. soc. london ser. a, 212 (1108), 1952, p. 47–65. [7] baijal, j. s., singh, k. k.: the energy-levels and transition probabilities for a bounded linear harmonic oscillator. prog. theor. phys., 14 (3), 1955, p. 214–224. [8] dean, p.: the constrained quantum mechanical harmonic oscillator.proc. camb. phil. soc., 62 (2), 1966, p. 277–286. [9] vawter, r.: effects of finite boundaries on a one-dimensional harmonic oscillator. phys. rev., 174 (3), 1968, p. 749–757. [10] vawter, r.: energy eigenvalues of a bounded centrally located harmonic oscillator. j. math. phys., 14 (12), 1973, p. 1864–1870. 23 acta polytechnica vol. 50 no. 5/2010 [11] consortini, a., frieden, b. r.: quantum-mechanical solution for the simple harmonic oscillator in a box. nuovo cim. b, 35 (2), 1976, p. 153–163. [12] adams, j. e., miller, w. h.: semiclassical eigenvalues for potential functions defined on a finite interval. j. chem. phys., 67 (12), 1977, p. 5775–5778. [13] rotbar,f.c.: quantumsymmetrical quadratic potential in abox.j.phys. a,11 (12), 1978, p. 2363–2368. [14] aguilera-navarro,v. c., ley koo, e., zimerman, a. h.: perturbative, asymptotic and pade-approximant solutions for harmonic and inverted oscillators in a box. j. phys. a, 13 (12), 1980, p. 3585–3598. [15] aguilera-navarro, v. c., iwamoto, h., ley koo, e., zimerman, a. h.: quantum-mechanical solution of the double oscillator in a box. nuovo cim. b, 62 (1), 1981, p. 91–128. [16] barakat, r., rosner, r.: the bounded quartic oscillator. phys. lett. a, 83 (4), 1981, p. 149–150. [17] fernández, f. m., castro, e. a.: hypervirial treatment of multidimensional isotropic bounded oscillators. phys. rev. a, 24 (5), 1981, p. 2883–2888. [18] fernández, f. m., castro, e. a.: hypervirial calculation of energy eigenvalues of a bounded centrally located harmonic oscillator. j. math. phys., 22 (8), 1981, p. 1669–1671. [19] aguilera-navarro, v. c., gomes, j. f., zimerman, a. h., ley koo, e.: on the radius of convergence of rayleigh-schroedinger perturbative solutions for quantum oscillators in circular and spherical boxes. j. phys. a, 16 (13), 1983, p. 2943–2952. [20] chaudhuri, r. n., mukherjee, b.: the eigenvalues of the bounded lx2m oscillators. j. phys. a, 16 (14), 1983, p. 3193–3196. [21] mei, w. n., lee, y. c.: harmonic oscillator with potential barriers — exact solutions and perturbative treatments. j. phys. a, 16 (8), 1983, p. 1623–1632. [22] aquino, n.: the isotropic bounded oscillators. j. phys. a, 30 (7), 1997, p. 2403–2415. [23] varshni, y. p.: simple wavefunction for an impurity in a parabolic quantum dot. superlattice microst, 23 (1), 1998, p. 145–149. [24] tinkham, m.: group theory and quantum mechanics, new york, mcgraw-hill, 1964. [25] amore, p., fernández, f.m.: variational collocation for systems of coupled anharmonic oscillators. arxiv: 0905.1038v1 [quant-ph]. [26] arteca, g. a., fernández, f. m., castro, e. 
a.: approximate calculation of physical properties of enclosed central field quantum systems. j. chem. phys., 80 (4), 1984, p. 1569–1575. dr. paolo amore e-mail: paolo.amore@gmail.com facultad de ciencias, cuicbas universidad de colima bernal dı́az del castillo 340, colima, colima, mexico dr. francisco m. fernández e-mail: fernande@quimica.unlp.edu.ar inifta (conicet, unlp) division quimica teorica diagonal 113 y 64 s/n sucursal 4, casilla de correo 16, 1900 la plata, argentina 24 acta polytechnica acta polytechnica 53(2):98–102, 2013 © czech technical university in prague, 2013 available online at http://ctn.cvut.cz/ap/ multigroup approximation of radiation transfer in sf6 arc plasmas milada bartlova∗, vladimir aubrecht, nadezhda bogatyreva, vladimir holcman faculty of electrical engineering and communication, brno university of technology, technicka 10, 616 00 brno, czech republic ∗ corresponding author: bartlova@feec.vutbr.cz abstract. the first order of the method of spherical harmonics (p1-approximation) has been used to evaluate the radiation properties of arc plasmas of various mixtures of sf6 and ptfe ((c2f4)n, polytetrafluoroethylene) in the temperature range (1000 ÷ 35 000) k and pressures from 0.5 to 5 mpa. calculations have been performed for isothermal cylindrical plasma of various radii (0.01 ÷ 10) cm. the frequency dependence of the absorption coefficients has been handled using the planck and rosseland averaging methods for several frequency intervals. results obtained using various means calculated for different choices of frequency intervals are discussed. keywords: sf6 and ptfe plasmas, radiation transfer, mean absorption coefficients, p1-approximation. 1. introduction an electric (switching) arc between separated contacts is an integral part of a switching process. for all kinds of high power circuit breakers, the basic mechanism is to extinguish the switching arc at the natural current zero by gas convection. the switching arc is responsible for proper disconnection of a circuit. in the mid and high voltage region, sf6 self-blast circuit breakers are widely used. radiation transfer is the dominant energy exchange mechanism during the high current period of the switching operation. due to the extreme conditions, experimental work only gives global information instead of local information, which may be important for determining the optimum operating conditions; theoretical modelling is then of great importance. several approximate methods for radiation transfer in arc plasma have been developed (isothermal net emission coefficient method [7, 1, 8], partial characteristics method [2, 11], p1-approximation [9], discrete ordinates method [9], etc.). in this paper, the p1-approximation has been used to predict radiation processes in various mixtures of sf6 and ptfe plasmas. 2. p1-approximation if diffusion of light is neglected and local thermodynamic equilibrium is assumed, the radiation transfer equation can be written as ω ·∇iν(r, ω) = κν(bν − iν), (1) where iν is the spectral intensity of radiation, ω is a unit direction vector, κν is the spectral absorption coefficient, and bν is the planck function – the spectral density of equilibrium radiation. in p1-approximation, the angular dependence of the specific intensity is assumed to be represented by the first two terms in a spherical harmonics expansion iν(r, ω) = c 4π uν(r) + 3 4π fν(r) · ω , (2) where uν denotes the radiation field density, fν is the radiation flux, and c is the speed of light. 
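for later reference, the planck function b_ν entering eq. 1 is the blackbody spectral radiance, b_ν(t) = (2hν^3/c^2)/(exp(hν/kt) − 1) (other conventions for b_ν differ by constant factors). a minimal evaluation sketch, our own illustration with a placeholder frequency grid, reads:

# our illustration: the planck function b_nu on the frequency grid used in
# the later sketches; np.expm1 keeps the wien tail numerically harmless.
import numpy as np
from scipy.constants import h, c, k

def planck_bnu(nu, T):
    return 2.0 * h * nu**3 / c**2 / np.expm1(h * nu / (k * T))

nu = np.logspace(12, 16, 4001)   # (10^12, 10^16) s^-1, the interval used in sec. 6
bnu = planck_bnu(nu, 20000.0)    # e.g. t = 20 000 k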
combining this expression with eq. 1, one finds for the radiation flux

f_\nu(r) = -\frac{c}{3\kappa_\nu}\,\nabla u_\nu(r)  (3)

and a simple elliptic partial differential equation for the density of radiation u_ν:

\nabla\cdot\left[-\frac{c}{3\kappa_\nu(t)}\,\nabla u_\nu(r)\right] + \kappa_\nu(t)\,c\,u_\nu(r) = \kappa_\nu(t)\,4\pi b_\nu(t).  (4)

integrating over frequency, the total density of the radiation and the total radiation flux are obtained:

u(r) = \int_0^\infty u_\nu(r)\,d\nu, \qquad f(r) = \int_0^\infty f_\nu(r)\,d\nu.  (5)

3. absorption coefficients

prediction of both radiation emission and absorption properties requires knowledge of the spectral coefficients κ_ν of absorption as a function of radiation frequency. these coefficients are proportional to the concentration of the chemical species occurring in the plasma, and depend on the cross sections of various radiation processes. in the mixture of sf6 and ptfe (c2f4) we assume the following species: sf6 molecules, s, f, c atoms, s+, s+2, s+3, f+, f+2, c+, c+2, c+3 ions and electrons. the equilibrium concentrations of each species in various sf6 + ptfe mixtures were taken from [3]. spectral coefficients of absorption were calculated using semi-empirical formulas to represent both continuum and line radiation. the continuum spectrum is formed by bound-free transitions (photo-recombination, photo-ionization) and free-free transitions (bremsstrahlung). the photo-ionization cross sections for neutral atoms were calculated by the quantum defect method of seaton [12]; the cross sections of the photo-ionization of ions and free-free transitions were treated using the coulomb approximation for hydrogen-like species [6]. in the discrete radiation calculations, spectral line broadening and the complex line shapes have to be carefully considered. the lines are broadened due to numerous phenomena; the most important are doppler broadening, stark broadening, and resonance broadening. for each line, we have calculated the values of half-widths and spectral shifts. the line shape is given by a convolution of the doppler and lorentz profiles, resulting in a simplified voigt profile. lines that overlap have also been taken into account. due to lack of data, from molecular species we have only considered sf6 molecules, with their experimentally measured absorption cross sections [5].

4. absorption means

one of the procedures for handling the frequency variable in the radiation transfer equation is the multigroup method [10, 4]. it is based on a simplified spectral description with only some spectral groups, assuming grey-body conditions within each group with a certain average absorption coefficient value, i.e. for the k-th spectral group

\kappa_\nu(r, \nu, t) = \kappa_k(r, t); \qquad \nu_k \le \nu \le \nu_{k+1}.  (6)

the mean absorption coefficient values are generally taken as either the rosseland mean or the planck mean. the planck mean is appropriate in the case of an optically thin system. the planck mean absorption coefficient is given by

\kappa_p = \frac{\int_{\nu_k}^{\nu_{k+1}} \kappa_\nu\, b_\nu \, d\nu}{b_k}, \qquad b_k = \int_{\nu_k}^{\nu_{k+1}} b_\nu \, d\nu.  (7)

the rosseland mean is appropriate when the system approaches equilibrium (almost all radiation is reabsorbed). the rosseland mean is given by

\kappa_r^{-1} = \frac{\int_{\nu_k}^{\nu_{k+1}} \kappa_\nu^{-1}\,\frac{d b_\nu}{d t}\,d\nu}{\int_{\nu_k}^{\nu_{k+1}} \frac{d b_\nu}{d t}\,d\nu}.  (8)

the total radiation density value is then given by

u(r) = \sum_k u_k(r),  (9)

where u_k are solutions of eq. 4 with frequency-independent κ_k(t) and b_k(t).

5. net emission coefficients

assuming local thermodynamic equilibrium, the coefficient of absorption κ_ν is related to the coefficient of emission ε_ν by kirchhoff's law

\varepsilon_\nu = b_\nu \kappa_\nu.
(10) strong self-absorption of radiation in the plasma volume occurs, and this must be taken into account in the calculations. the net emission coefficient of radiation, εnν is defined by lowke [7] as εnν = εν −jνκν , (11) where jν is an average radiation intensity, which is a function of temperature. for an isothermal plasma sphere at radius r (the results are approximately the same as for the isothermal cylinder), it is defined as jν = bν[1 − exp(−κνr)]. (12) a combination of eqs. 10–12 gives the expression for the net emission coefficient εn = ∫ ∞ 0 bνκνexp(−κνr) dν. (13) the isothermal net emission coefficient corresponds to the fraction of the total power per unit volume and unit solid angle irradiated into a volume surrounding the axis of the arc plasma and escaping from the arc column after crossing thickness r of the isothermal plasma. it is often used for predicting the energy balance, since the net emission of radiation (the divergence of the radiation flux) can be written as ∇· fr = 4πεn. (14) in multigroup p1-approximation, the net emission coefficient can be determined from eq. 4. in the case of cylindrically symmetrical isothermal plasma, eq. 4 has constant coefficients κk and bk, and depends only on one variable – radial distance r. it represents the modified bessel equation, and can be solved analytically. taking into account the boundary condition (no radiation enters into the plasma cylinder from outside) n · fk(r) = − cuk(r) 2 (15) the net emission over the volume of the arc for the k-th frequency group is (wavg)k = 2π πr2 ∫ r 0 r∇· fk(r) dr = = 2 r 4πbk 2i1( √ 3 κkr) + √ 3 i0( √ 3 κkr) i1( √ 3 κkr) (16) 99 m. bartlova, v. aubrecht, n. bogatyreva, v. holcman acta polytechnica 2 4 6 8 10 10-4 10-2 100 102 104 a bs or pt io n c oe ffi ci en t ( cm -1 ) radiation frequency (1015 s-1) (a) absorption spectrum planck mean rosseland mean sf 6 p = 0.5 mpa t = 20 000 k figure 1. the real absorption spectrum of sf6 plasma at p = 0.5 mpa and t = 20 000 k compared with the planck and rosseland means. 10000 20000 30000 10-3 10-1 101 103 105 107 sf 6 p = 0.5 mpa r = 0.1 cm n et e m is si on c oe ffi ci en t ( w cm -3 sr -1 ) temperature (k) planck mean, 5 gr. rosseland mean, 5 gr. planck mean, 10 gr. rosseland mean, 10 gr. aubrecht [2] figure 2. net emission coefficients of sf6 plasma with radius 0.1 cm as a function of temperature for two different cuttings of the frequency interval and various absorption means; comparison with results of aubrecht [1]. where i0(x) and i1(x) are modified bessel functions. summing over all frequency groups gives the net emission of radiation ∇· fr = ∑ k (wavg)k = 4πεn. (17) 6. results the mean absorption coefficient values depend on the choice of the frequency interval cutting. 5000 10000 15000 20000 25000 30000 35000 10-1 100 101 102 103 104 105 106 n et e m is si on c oe ffi ci en t ( w cm -3 sr -1 ) temperature (k) aubrecht [2] planck mean rosseland mean r = 0 r = 1 cm r = 10 cm sf 6 p = 0.5 mpa figure 3. net emission coefficients of sf6 plasma as a function of temperature for various thicknesses of the plasma and various absorption means. 5000 10000 15000 20000 25000 30000 35000 10-1 101 103 105 107 n et e m is si on c oe ffi ci en t ( w cm -3 sr -1 ) temperature (k) p = 0.5 mpa p = 2 mpa p = 5 mpa sf 6 r = 0.1 cm planck rosseland figure 4. net emission coefficients of sf6 plasma with radius 0.1 cm as a function of temperature for various pressures. 
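before discussing the cuttings themselves, it may help to see the averaging step in executable form. the sketch below is our own illustration of eqs. 7 and 8; κ_ν would come from the tabulated absorption spectrum, and the temperature and grid are placeholders:

# our illustration of the group means, eqs. 7 and 8: ratios of quadratures
# over each frequency group defined by the cutting edges.
import numpy as np
from scipy.constants import h, c, k
from scipy.integrate import trapezoid

def planck_bnu(nu, T):
    return 2.0 * h * nu**3 / c**2 / np.expm1(h * nu / (k * T))

def dbnu_dt(nu, T):
    x = h * nu / (k * T)
    # db_nu/dt = (2 h^2 nu^4 / c^2 k t^2) e^x / (e^x - 1)^2, written overflow-safely
    return 2.0 * h**2 * nu**4 / (c**2 * k * T**2) * np.exp(-x) / np.expm1(-x)**2

def group_means(nu, kappa, T, edges):
    bnu, db = planck_bnu(nu, T), dbnu_dt(nu, T)
    kp, kr = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        m = (nu >= lo) & (nu <= hi)
        kp.append(trapezoid(kappa[m] * bnu[m], nu[m]) / trapezoid(bnu[m], nu[m]))  # eq. 7
        kr.append(trapezoid(db[m], nu[m]) / trapezoid(db[m] / kappa[m], nu[m]))    # eq. 8
    return np.array(kp), np.array(kr)

edges5 = np.array([0.001, 1.0, 2.0, 4.1, 6.8, 10.0]) * 1e15  # the five-group cutting of eq. 18 below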
the cutting frequencies are mainly defined by the steep jumps of the evolution of the continuum absorption coefficients that correspond to individual absorption edges. however, the number of groups should be minimized to decrease the computation time. in this work, the frequency interval (1012 − 1016) s−1 was cut into (a) five frequency groups with cutting frequencies (0.001, 1, 2, 4.1, 6.8, 10) × 1015 s−1 , (18) (b) ten frequency groups with cutting frequencies (0.001, 1, 1.4, 1.77, 2, 2.2, 2.5, 3, 4.1, 6.8, 10) × 1015 s−1 . (19) 100 vol. 53 no. 2/2013 multigroup approximation of radiation transfer in sf6 arc plasmas 10000 20000 30000 101 102 103 104 105 106 n et e m is si on c oe ffi ci en t ( w cm -3 sr -1 ) temperature (k) 100% sf 6 80% sf 6 + 20% ptfe 20% sf 6 + 80% ptfe 100% ptfe p = 0.5 mpa r = 0.1 cm planck rosseland figure 5. net emission coefficients of different mixtures of sf6 and ptfe plasmas as a function of temperature at pressure 0.5 mpa for various absorption means. the two cuttings differ in the frequency interval (1 ÷ 4.1) × 1015 s−1, which is split into two groups in case (a), and in greater detail into seven groups in case (b). the absorption spectrum evaluated at 20 000 k compared with various averaged versions for five groups cutting is shown in fig. 1. the net emission coefficients were calculated by combining eqs. 16 and 17. results for an isothermal plasma cylinder of radius r = 0.1 cm for two different cuttings eqs. 18, 19 of the frequency interval are given in fig. 2. in the case of the planck averaging method, cutting the spectrum into more frequency groups influences the resulting net emission coefficients only slightly. a comparison is also provided with the values of aubrecht [1], which were obtained by direct integration from eq. 13. it can be seen that the planck mean leads to an overestimation of the emitted radiation, while the rosseland approach underestimates it. an example of the calculated temperature dependence of the net emission coefficients for various thicknesses of pure sf6 plasma at a pressure of 0.5 mpa is presented in fig. 3. the strong effect of plasma thickness can be seen both for direct frequency integration eq. 13 and for planck means; rosseland averages are influenced only slightly. as can be expected from the definition of the planck and rosseland means, by omitting self-absorption (r = 0) the planck means give good agreement with the results of direct integration, while for thick plasma (r = 10 cm) the rosseland mean is a good approach. the influence of the plasma pressure on the net emission coefficient values is shown in fig. 4. net emission coefficients increase with increasing pressure, mainly for rosseland means. the influence of an admixture of ptfe on the values of the net emission coefficients of sf6 plasma is given in the fig. 5 for plasma thickness 0.1 cm. the differences between net emission coefficients are very small. this can be explained by the approximately equivalent role of sulphur and carbon species. sulphur and carbon atoms and ions have similar radiation emission behavior. 7. conclusions net emission coefficients for various mixtures of sf6 and ptfe plasmas have been calculated using p1approximation for an isothermal plasma cylinder. multigroup approximation for handling the frequency variable has been used. both planck and rosseland averaging methods have been applied to obtain mean values of absorption coefficient values. 
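the direct-integration reference values entering this comparison follow eq. 13; a compact sketch, again our own illustration on whatever κ_ν table is at hand, is:

# our illustration of eq. 13: net emission coefficient by direct frequency
# integration for an isothermal plasma of radius r (si units).
import numpy as np
from scipy.constants import h, c, k
from scipy.integrate import trapezoid

def net_emission(nu, kappa, T, R):
    bnu = 2.0 * h * nu**3 / c**2 / np.expm1(h * nu / (k * T))
    return trapezoid(bnu * kappa * np.exp(-kappa * R), nu)

# e.g. net_emission(nu, kappa, 20000.0, 1e-3) for the r = 0.1 cm case of fig. 2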
a comparison with the net emission coefficients calculated by direct frequency integration has been provided. it has been shown that planck means generally overestimate the emission of radiation, while rosseland means underestimate it. planck means give good results only for a very small plasma radius (omitting self-absorption). the rosseland mean is a suitable approach for thick plasma (absorption dominated system). in reality, neither mean is correct in general. the simplest procedure for improving the accuracy is to use the planck mean for frequency groups with low absorption coefficient values and the rosseland mean for groups with high absorption coefficient values. another approach was suggested in [9], where each group based on original frequency splitting was further divided according to the absorption coefficient values, and planck averaging for these new groups was calculated. this procedure partially solves the problem of overestimation of the role of lines in the planck averaging method. another correction of the influence of lines on planck means was presented in [4], where the escape factor was introduced. acknowledgements this work has been supported by the czech science foundation under project no. gd102/09/h074 and by the european regional development fund under projects nos. cz.1.05/2.1.00/01.0014 and cz.1.07/2.3.00/09.0214. references [1] v. aubrecht, m. bartlova. net emission coefficients of radiation in air and sf6 thermal plasmas. plasma chem plasma process 29(2):131–147, 2009. [2] v. aubrecht, j. j. lowke. calculations of radiation transfer in sf6 plasmas using the method of partial characteristics. j phys d: appl phys 27(10):2066–2074, 1994. [3] o. coufal, p. sezemsky, o. zivny. database system of thermodynamic properties of individual substances at high temperatures. j phys d: appl phys 38(8):1265–1274, 2005. 101 m. bartlova, v. aubrecht, n. bogatyreva, v. holcman acta polytechnica [4] y. cressault, a. gleizes. mean absorption coefficients for co2 thermal plasmas. high temp mat process 10(1):47–54, 2006. [5] a. r. hochstim, g. a. massel. kinetic processes in gases and plasmas. academic press, new york, 1969. [6] r. w. liebermann, j. j. lowke. radiation emission coefficients for sulfur hexafluoride arc plasmas. j quant spectrosc radiat transfer 16(3):253–264, 1976. [7] j. j. lowke. prediction of arc temperature profiles using approximate emission coefficients for radiation losses. j quant spectrosc radiat transfer 14(2):111–122, 1974. [8] y. naghizadeh-kashani, y. cressault, a. gleizes. net emission coefficients of air thermal plasmas. j phys d: appl phys 35(22):2925–2934, 2002. [9] h. nordborg, a. a. iordanidis. self-consistent radiation based modeling of electric arcs: i. efficient radiation approximations. j phys d: appl phys 41(13):135205, 2008. [10] g. c. pomraning. the equations of radiation hydrodynamics. dover publications, new york, 2005. [11] g. raynal, a. gleizes. radiative transfer calculations in sf6 arc plasmas using partial characteristics. plasma sources sci technol 4(1):152–160, 1995. [12] m. seaton. quantum defect method. monthly not royal astronom society 118(5):504–518, 1958. 
doi:10.14311/ap.2014.54.0122 acta polytechnica 54(2):122–123, 2014 © czech technical university in prague, 2014 available online at http://ojs.cvut.cz/ojs/index.php/ap

the floquet method for pt-symmetric periodic potentials

h. f. jones, physics department, imperial college, london sw7 2bz, uk
correspondence: h.f.jones@imperial.ac.uk

abstract. by the general theory of pt-symmetric quantum systems, their energy levels are either real or occur in complex-conjugate pairs, which implies that the secular equation must be real. however, for periodic potentials it is by no means clear that the secular equation arising in the floquet method is indeed real, since it involves two linearly independent solutions of the schrödinger equation. in this brief note we elucidate how that reality can be established.

keywords: band structure, pt symmetry, floquet method.

the study of systems governed by hamiltonians for which the standard requirement of hermiticity is replaced by that of pt-symmetry has undergone significant development in recent years [1–6]. provided that the symmetry is not broken, that is, that the energy eigenfunctions respect the symmetry of the hamiltonian, the energy eigenvalues are guaranteed to be real. in the case where the symmetry is broken, energy levels may instead appear as complex-conjugate pairs. this phenomenon is particularly interesting for the case of periodic pt-symmetric potentials, where unusual band structures may occur [7, 8]. an important physical realization of such systems arises in classical optics, because of the formal similarity of the time-dependent schrödinger equation to the paraxial equation for the propagation of electromagnetic waves. this equation takes the form [9]

i\,\frac{\partial\psi}{\partial z} = -\left(\frac{\partial^2}{\partial x^2} + v(x)\right)\psi,  (1)

where ψ(x, z) represents the envelope function of the amplitude of the electric field and z is a scaled propagation distance. the optical potential v(x) is proportional to the variation in the refractive index of the material through which the wave is passing. in optics this potential may well be complex, with its imaginary part representing either loss or gain. if loss and gain are balanced in a pt-symmetric way, so that v*(x) = v(−x), we have the situation described above. optical systems of this type have a number of very interesting properties [9–14], particularly when they are periodic. in such a case the potential v(x), whose period we can take as π without loss of generality, satisfies the two conditions v*(−x) = v(x) = v(x + π). for periodic potentials we are interested in finding the bloch solutions, which are solutions of the time-independent schrödinger equation

-\left(\frac{\partial^2}{\partial x^2} + v(x)\right)\psi_k(x) = e\,\psi_k(x)  (2)

with the periodicity property ψ_k(x + π) = e^{ikπ} ψ_k(x). the standard way of obtaining such solutions is the floquet method, whereby ψ_k(x) is expressed in terms of two linearly-independent solutions, u_1(x) and u_2(x), of eq. (2), with initial conditions

u_1(0) = 1, \quad u_1'(0) = 0, \quad u_2(0) = 0, \quad u_2'(0) = 1.  (3)

then ψ_k(x) is written as the superposition

\psi_k(x) = c_k u_1(x) + d_k u_2(x).  (4)

imposing the conditions ψ_k(π) = e^{ikπ} ψ_k(0) and ψ'_k(π) = e^{ikπ} ψ'_k(0) and exploiting the invariance of the wronskian w(u_1, u_2), one arrives at the secular equation

\cos k\pi = \Delta \equiv \frac{1}{2}\left(u_1(\pi) + u_2'(\pi)\right).
(5) in the hermitian situation both u1(π) and u2(π) are real, and the equation for e has real solutions (bands) when |∆| ≤ 1. however, in the non-hermitian, pt symmetric, situation it is not at all obvious that ∆ is real, since that implies a relation between u1(π) and u′2(π), even though u1(x) and u2(x) are linearly independent solutions of eq. (2). it is that problem that we wish to address in the present note. in fact we will show that u′2(π) = u∗1(π). the clue to relating u1(π) and u2(π) comes from considering a half-period shift, namely x = z + π/2. we write ϕ(z) = ψ(z + π/2) and u(z) = v (z + π/2). then ϕ(z) satisfies the schrödinger equation − ( ∂2 ∂z2 + u(z) ) ϕk(z) = eϕk(z). (6) the crucial point is that because of the periodicity and pt -symmetry of v (x) the new potential u(z) is also pt -symmetric. thus u(−z) = v (−z + π/2) = v (−z −π/2) = v ∗(z + π/2) = u∗(z). 122 http://dx.doi.org/10.14311/ap.2014.54.0122 http://ojs.cvut.cz/ojs/index.php/ap vol. 54 no. 2/2014 the floquet method for pt-symmetric periodic potentials now we can express the floquet functions u1(x), u2(x) in terms of floquet functions v1(z), v2(z) of the transformed equation (6), satisfying v1(0) = 1, v′1(0) = 0, v2(0) = 0, v′2(0) = 1. (7) it is easily seen that the relation is u1(x) = v′2(−π/2)v1(z) −v ′ 1(−π/2)v2(z), u2(x) = −v2(−π/2)v1(z) + v1(−π/2)v2(z), (8) in order to satisfy the initial conditions on u1(x), u2(x). so u1(π) = v′2(−π/2)v1(π/2) −v ′ 1(−π/2)v2(π/2), u′1(π) = v ′ 2(−π/2)v ′ 1(π/2) −v ′ 1(−π/2)v ′ 2(π/2), u2(π) = −v2(−π/2)v1(π/2) + v1(−π/2)v2(π/2), u′2(π) = −v2(−π/2)v ′ 1(π/2) + v1(−π/2)v ′ 2(π/2). (9) but, because of the pt -symmetry of eq. (6) and the initial conditions satisfied by v1(z), v2(z), v1(−π/2) = (v1(π/2))∗, v′1(−π/2) = −(v ′ 1(π/2)) ∗, v2(−π/2) = −(v2(π/2))∗, v′2(−π/2) = (v ′ 2(π/2)) ∗. (10) hence, indeed, u1(π) = (u′2(π))∗, so that ∆ in eq. (5) is real and the energy eigenvalues of the bloch wavefunctions are either real or occur in complex conjugate pairs. from eq. (10) we also see that u′1(π) and u2(π) are real. the statement u1(π) = (u′2(π))∗ is in fact the pt -generalization of the relation u1(π) = u′2(π) implied without proof by eq. (20.3.10) of ref. [16] for the hermitian case of the mathieu equation, where v (x) = cos(2x). if we wish, we may express everything in terms of u1, u2 because from eq. (8) u1(π/2) = v′2(−π/2), u′1(π/2) = −v ′ 1(−π/2), u2(π/2) = −v2(−π/2), u′2(π/2) = v1(−π/2). (11) hence u1(π) = (u′2(π)) ∗ = u1(π/2)(u′2(π/2)) ∗ + u′1(π/2)(u2(π/2)) ∗, (12) which is the pt -generalization of a relation implied by eq. (20.3.11) of ref. [16] after the use of the invariance of the wronskian. similarly u′1(π) = 2re (u ∗ 1(π/2)u ′ 1(π/2)) , u2(π) = 2re (u∗2(π/2)u ′ 2(π/2)) . (13) to conclude, we have shown that the secular equation for the band structure of pt -symmetric periodic potentials is indeed real, even though in the floquet method the discriminant involves the two ostensibly independent functions u1(x) and u2(x). the crucial point is that for such potentials there is also a pt symmetry about the midpoint of the brillouin zone. the proof involves expressing u1(x) and u2(x) in terms of shifted functions v1(x) and v2(x), and shows that u1(π) and u′2(π) are actually complex conjugates of each other. the proof incidentally casts light on certain relations that hold for real symmetric potentials, such as cos (2x). references [1] c. m. bender and s. boettcher, phys. rev. lett. 80, 5243 (1998). doi: 10.1103/physrevlett.80.5243 [2] c. m. bender, contemp. phys. 
46, 277 (2005); rep. prog. phys. 70, 947 (2007). doi: 10.1080/00107500072632 [3] a. mostafazadeh, int. j. geom. meth. mod. phys. 7, 1191 (2010). doi: 10.1142/s0219887810004816 [4] c. m. bender, d. c. brody and h. f. jones, phys. rev. lett. 89, 270401 (2002) doi: 10.1103/physrevlett.89.270401; 92, 119902(e) (2004). doi: 10.1103/physrevlett.92.119902 [5] c. m. bender, d. c. brody and h. f. jones, phys. rev. d 70, 025001 (2004) doi: 10.1103/physrevd.70.025001; 71, 049901(e) (2005). doi: 10.1103/physrevd.71.049901 [6] a. mostafazadeh, j. math. phys. 43, 205 (2002); j. phys. a 36, 7081 (2003). [7] c. m. bender, g. v. dunne, p. n. meisinger, phys. lett. a 252, 272 (1999). [8] h. f. jones, phys. lett. a 262, 242 (1999). [9] r. el-ganainy, k. g. makris, d. n. christodoulides and z. h. musslimani, optics letters 32, 2632 (2007). [10] z. musslimani, k. g. makris, r. el-ganainy and d. n. christodoulides, phys. rev. lett. 100, 030402 (2008). doi: 10.1103/physrevlett.100.030402 [11] k. makris, r. el-ganainy, d. n. christodoulides and z. musslimani, phys. rev. lett. 100, 103904 (2008). doi: 10.1103/physrevlett.100.103904 [12] k. makris, r. el-ganainy, d. n. christodoulides and z. musslimani, phys. rev. a 81, 063807 (2010). doi: 10.1103/physreva.81.063807 [13] s. longhi, phys. rev. a 81, 022102 (2010). doi: 10.1103/physreva.81.022102 [14] e. m. graefe and h. f. jones, phys. rev. a 84, 013818 (2011). doi: 10.1103/physreva.84.013818 [15] h. f. jones, j. phys. a: 45 135306 (2012). doi: 10.1088/1751-8113/45/13/135306 [16] m. abramowitz and i. a. stegun, handbook of mathematical functions, dover, ny, 1970. 123 http://dx.doi.org/10.1103/physrevlett.80.5243 http://dx.doi.org/10.1080/00107500072632 http://dx.doi.org/10.1142/s0219887810004816 http://dx.doi.org/10.1103/physrevlett.89.270401 http://dx.doi.org/10.1103/physrevlett.92.119902 http://dx.doi.org/10.1103/physrevd.70.025001 http://dx.doi.org/10.1103/physrevd.71.049901 http://dx.doi.org/10.1103/physrevlett.100.030402 http://dx.doi.org/10.1103/physrevlett.100.103904 http://dx.doi.org/10.1103/physreva.81.063807 http://dx.doi.org/10.1103/physreva.81.022102 http://dx.doi.org/10.1103/physreva.84.013818 http://dx.doi.org/10.1088/1751-8113/45/13/135306 acta polytechnica 54(2):122–123, 2014 references acta polytechnica acta polytechnica 53(2):127–130, 2013 © czech technical university in prague, 2013 available online at http://ctn.cvut.cz/ap/ diffuse coplanar surface barrier discharge in nitrogen: microdischarges statistical behavior jan cech∗, jana hanusova, pavel stahel, pavel slavicek masaryk university, regional r&d center for low-cost plasma and nanotechnology surface modifications, faculty of science, masaryk university, kotlarska 2, 611 37 brno, czech republic ∗ corresponding author: cech@physics.muni.cz abstract. we studied statistical behavior of microdischarges of diffuse coplanar surface barrier discharge (dcsbd) operated in nitrogen atmosphere at two input voltage regimes. we measured spectrally unresolved discharge patterns together with discharge electrical parameters using highspeed iccd camera and digital storage oscilloscope. external synchronization enabled us to measure the discharge pattern during positive and/or negative half-period of input high voltage in the single-shot mode of operation. the comparison of microdischarges behavior during positive, negative and both half periods of input high voltage was performed for two levels of input voltage, i.e. voltage slightly above ignition voltage and high above ignition voltage (“overvoltage”). 
the number of microchannels crossing discharge gap was counted and compared with number of microdischarge current peaks observed during corresponding half-period of input high voltage. the relations of those incidences was shown and discussed. keywords: dcsbd, diffuse coplanar surface barrier discharge, microdischarges, time resolved imaging, iccd. 1. introduction in past decades the importance of barrier discharges as the sources of non-equilibrium plasmas for material processing has raised [3]. as a type of planar configuration of barrier discharge so called diffuse coplanar surface barrier discharge (dcsbd) was invented by prof. cernak [1]. plasma of dcsbd is generated in thin layer above dielectric, at relatively high power densities of the order of 100 w m−3. the discharge consists of thin channels (filaments or micro-discharges) crossing the electrode gap between electrodes [2] and visually diffuselike layer above electrodes. these properties make dcsbd a promising candidate for high-speed plasma processing of various materials [1]. because of its application potential coplanar barrier discharge properties and potential for applications have been investigated in recent decade [4, 5]. in presented paper the statistical behavior of dcsbd microdischarges operated in nitrogen was studied by means of high-speed iccd camera imaging synchronized with the power supply generator and digital storage oscilloscope. the relation of number of microfilament channels crossing discharge gap and number of current peaks observed during corresponding half-period of input high voltage was studied. 2. experimental setup for presented study the simplified dcsbd cell or system was used. the one-electrode-pair discharge cell was used. in fig. 1 (left) the cross-section of discharge cell is given. the discharge cell is made of polymer capsule in which the system of semi-movable electrodes is placed. both electrodes are pressed against dielectric plate and dipped in insulating oil bath. the arbitrary rectangular electrode gap between electrodes can be set with width up to 5 mm. the minimum distance between electrodes is governed by the insulating properties of oil bath. in presented study the electrode distance was set to 0.55 ± 0.05 mm. the dielectric plate is pressed directly to the surface of electrodes, which form two semicircle footprints on the dielectric plate, forming rectangular electrode gap in between. the schematic view of the electrodes groundplan is given in fig. 1 (right). the view plane of the picture is the same as if the paper would be the dielectric plate. the diameter of semicircle electrodes is approx. 20 mm. discharge chamber is placed from the opposite side of dielectric plate; see fig. 1 (left). the dielectric plate is made of 96 % alumina (al2o3) with dimensions of 10 × 10 cm and thickness of approximately 0.6 mm. discharge chamber enables us to operate the discharge in controlled working gas environment. quartz window on the opposite side of discharge chamber enables us to perform optical diagnostics of the discharge. experimental setup for time resolved iccd measurements is given in fig. 2. the discharge cell described in previous section was used. voghtlin instruments red-y gcr gas flow controller was used to set requested working gas atmosphere. nitrogen gas of the purity better than 99.996 % and total gas flow rate of 3 slpm was used. 127 http://ctn.cvut.cz/ap/ jan cech, jana hanusova, pavel stahel, pavel slavicek acta polytechnica figure 1. 
experimental setup ii: a) discharge chamber cross-section, b) electrode system groundplan. figure 2. experimental setup: general scheme of discharge setup and time-resolved optical imaging. high voltage (hv) power supply was used to ignite and maintain the discharge. the hv power supply consists of high frequency tunable generator lifetech hf power source powered by stabilized dc power source statron 3262 and lifetech hv transformer. the hv power supply was operated at 30 khz and 24 kv peak-to-peak, resp. 32 kv peakto-peak, sine-wave. these voltage levels correspond to the level of “just ignited” discharge and discharge operated at high “overvoltage” level above ignition voltage. the current–voltage characteristics were recorded using lecroy waverunner 6100a 1 ghz/5 gsa digital storage oscilloscope coupled with hv probe tectronix p6015a 1000:1 (in fig. 2 denoted as pr1) and pearson current monitor 2877 (in fig. 2 denoted as pr2). variable high voltage capacitor was used as the displacement current compensator. the hv capacitor was connected antiparallel through current probe. tuning the hv capacitor to the capacity close to discharge cell capacity effectively reduces measured displacement current of discharge cell which is of the same order as the discharge current. this increases effectively the signal-to-noise ratio of discharge current measurements. for the high speed synchronized discharge imaging, the princeton instruments pi-max 1024rb-25-fg43 iccd camera equipped with 50 mm, f/2.8 uv lens was used. the iccd camera was placed along axis of symmetry perpendicular to dcsbd plasma layer. to synchronize and semi-automate the measurements, the agilent 33220a function generator was used as trigger. as the source signal for trigger the reference signal of hf generator was used. the agilent 33220a fires triggering signals for synchronous iccd image capture together with current–voltage measurement of the same event. this setup enables us to take series of synchronized images of the discharge pattern together with its current–voltage characteristics with the resolution of single half-period of high voltage waveform. in presented work the 100 images series of first, second and both half-periods were taken with gate times of 17, resp. 34 µs. the iccd delays were set in the way to guarantee that images represent all discharge events of half-periods that can be identified in current–voltage waveforms. 3. results and discussion for each half period of the discharge the 100 events were recorded, that gives 600 discharge pattern images and also current–voltage measurement. in the first step of data processing the discharge current corresponding to specified half period of input high voltage was analyzed. the peak detection algorithm was adopted to evaluate the current peaks total number, together with their polarity and amplitude. in the second step of data processing the corresponding iccd images of discharge pattern were processed. for analysis of discharge microchannels the narrow rectangular region corresponding to inter-electrode rectangular gap (see figs. 1 and 3) was selected. the total number of microdischarges (bright channels) crossing the electrode gap was obtained together with their positions and amplitudes using similar peak detection algorithm used for discharge current analysis. finally the data sets of current peaks events and microdischarge channels events were processed to obtain relation frequency matrix of incidence described in next section. in fig. 
3 the iccd images of the discharge pattern for input voltage of 24 kv is shown. the central part of images corresponding to electrode gap was processed to evaluate microchannels crossing the gap. images represent single shots of discharge microchannels incidence during one half period of input high voltage. in fig. 3 the left, resp. middle image represents positive, resp. negative half period of discharge with respect to the polarity of the discharge electrodes (shown for clarity only in the first image), image on the right represents whole period of discharge. polarity is taken as polarity of hv signal on the left electrode, see figs. 1 and 2. in fig. 4 the typical current–voltage characteristics of dcsbd operated in nitrogen is given. the numerous current peaks per half period of input high voltage can be seen. the reference triggering signal used for diagnostics synchronization is also depicted. 128 vol. 53 no. 2/2013 diffuse coplanar surface barrier discharge in nitrogen figure 5. relations of number of iccd identified microchannels crossing discharge gap and number of current peaks in corresponding half period, input voltage is 24 kv (pk-pk); figures represents: a) positive half period, b) negative half period and c) whole period of input high voltage. figure 6. relations of number of iccd identified microchannels crossing discharge gap and number of current peaks in corresponding half period, input voltage is 32 kv (pk-pk); figures represents: a) positive half period, b) negative half period and c) whole period of input high voltage. figure 3. time resolved iccd images of dcsbd, d = 0.6 mm, upk-pk = 24 kv; half-periods: left to right: positive, negative, both. in figs. 5 and 6 the resulting relation frequency matrices of incidence is given. the matrices represent evaluation of microdischarges behavior. on x axis the number of microchannels crossing the gap during one half period (whole period) is given. on y axis the corresponding number of microdischarge current peaks within the same half period (whole period) is given. the numbers in matrices represents numbers of incidences of described events. from the incidence matrices the statistical behavior of the dcsbd microdischarges (of filaments) can be seen. in case of 1:1 correspondence, where each microdischarge will produce unique microchannel crossing the discharge gap, all events will be positioned along main diagonal figure 4. typical current–voltage characteristics of dcsbd operated at 32 kv (pk-pk); synchronization signal used for synchronization of iccd camera and digital storage oscilloscope is shown. of the incidence matrix. this tendency can be seen in fig. 5 for the matrices representing positive (t1) and negative (t2) half period of discharge for input voltage of 24 kv. it can be seen, that the substantial part of events is distributed along the main diagonal. in case of whole period (t3) the most occurring event corresponds to case where only half number 129 jan cech, jana hanusova, pavel stahel, pavel slavicek acta polytechnica of unique microchannels crossed the discharge gap in comparison of total microdischarges current peaks within the period. notice the different scale (section) of incidence matrix in case of whole period. when we compare the behavior of filaments of dcsbd operated at very low input voltage (fig. 5) with behavior under “overvoltage” of input voltage (fig. 6), we can see substantial differences. please notice the shifted scales (sections) of corresponding incidence matrices in figs. 5 and 6. 
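the pairing behind these matrices is easy to reproduce: per half-period, count the current peaks in the oscilloscope trace and the bright channels in the gap profile of the corresponding iccd frame, then histogram the pairs over all recorded events. the sketch below is our own reconstruction of that counting step; scipy's find_peaks stands in for the unspecified peak-detection algorithm, and the thresholds are placeholders:

# our reconstruction of the counting step (not the authors' code): pair the
# number of current peaks with the number of microchannels crossing the gap
# for each event, and accumulate the relation frequency (incidence) matrix.
import numpy as np
from scipy.signal import find_peaks

def incidence_matrix(current_traces, gap_profiles, size=20,
                     i_min=0.05, ch_min=10.0):
    m = np.zeros((size, size), dtype=int)  # rows: current peaks, cols: channels
    for trace, profile in zip(current_traces, gap_profiles):
        peaks, _ = find_peaks(np.abs(trace), height=i_min)  # placeholder threshold
        chans, _ = find_peaks(profile, height=ch_min)       # intensity across the gap
        m[min(len(peaks), size - 1), min(len(chans), size - 1)] += 1
    return m

plotting m for each half-period and voltage level then gives matrices of the kind shown in figs. 5 and 6, with 1:1 microdischarge-to-microchannel behavior appearing as weight on the main diagonal.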
with agreement to the research performed by hoder [2] on the single-filament dcsbd configuration, the filaments of dcsbd in the regime of “overvoltage” prefer to prolongate or branch existing microchannels against formation of new microchannels crossing the electrode gap. from the presented data the tendency is that the substantial number of microdischarges (identified as discharge current peaks) do not form new microchannels but rather prolongate or branch existing discharge microchannels. 4. conclusion in this paper we have presented time resolved optical measurements of dcsbd in nitrogen. with presented experimental setup we were able to take single-shot pictures of discharge pattern with the resolution of half-period of input voltage. the behavior of dcsbd microdischarges in nitrogen was studied for two different input voltage regimes. the incidence frequency matrices of microdischarges and microchannels occurrence were derived. it has been investigated, that under low input voltage conditions, the preferred way of microdischarge behavior is formation of unique microchannel per microdischarge. contrary in the case of substantially increased input voltage (“overvoltage”) the preferred microdischarges behavior is to reuse existing microchannels and/or prolongate or branch existing microchannels instead of forming completely new microchannels. obtained results support the previous measurements of dcsbd performed by hoder in single-filament dcsbd configuration [2]. to distinguish between different regimes of microdischarges behavior further investigations has to be carried. acknowledgements this research has been supported by the project regional r&d center for low-cost plasma and nanotechnology surface modifications cz.1.05/2.1.00/03.0086 funded by european regional development fund and projects no. ta01011356/2011 and ta01010948/2011 of the technology agency of the czech republic. references [1] m. cernak, l. cernakova, i. hudec, et al. diffuse coplanar surface barrier discharge and its applications for in-line processing of low-added-value materials. european physical journal-applied physics 47(2):22806, 2009. [2] t. hoder. studium filamentu koplanarniho barieroveho vyboje. masaryk university, brno, 2009. ph.d. thesis. [3] u. kogelschatz. dielectric-barrier discharges: their history, discharge physics, and industrial applications. plasma chemistry and plasma processing 23(1):1–46, 2003. [4] d. korzec, e. g. finantu-dinu, g. l. dinu, et al. comparison of coplanar and surface barrier discharges operated in oxygen-nitrogen gas mixtures. surface and coatings technology 174–175:503–508, 2003. [5] m. stefecka, m. kando, m. cernak, et al. spatial distribution of surface treatment efficiency in coplanar barrier discharge operated with oxygen-nitrogen gas mixtures. surface and coatings technology 174–175:553–558, 2003. 130 acta polytechnica 53(2):127–130, 2013 1 introduction 2 experimental setup 3 results and discussion 4 conclusion acknowledgements references acta polytechnica doi:10.14311/ap.2014.54.0059 acta polytechnica 54(1):59–62, 2014 © czech technical university in prague, 2014 available online at http://ojs.cvut.cz/ojs/index.php/ap experimental verification of the structural and technological parameters of the pks iveta onderová∗, ľudovít kolláth, lucia ploskuňáková slovak university of technology in bratislava, faculty of mechanical engineering, institute of manufacturing systems, environmental technology and quality management ∗ corresponding author: iveta.onderova@stuba.sk abstract. 
this paper presents the theoretical design for experimental verification of the structural and technological parameters of the parallel kinematic structure. the experimental equipment was developed at the faculty of mechanical engineering, stu in bratislava. it is called tricept, and due to its kinematics it is classified as a parallel structure. the previous phase of the project dealt with developing a mathematical model of the tool path. we have worked with two methods, a monte carlo method and the use of a mathematical analytical geometry structure. the calculated values were verified by comparing the results of the two methods. based on the equations that were obtained, we can design the control of the tool path during cutting. the next stage focuses on the test methods and on verifying the structural and technological parameters of tricept. the experiment is designed on the basis of the en iso 9283 standard, and involves testing the technological parameters: one-way positioning accuracy and repeatability of the position, changes of multidirectional positioning accuracy, repeatability and accuracy of the distance, position overshoot, drift of the position parameters, path accuracy, and repeatability of the path. this paper classifies the measurement methods and presents the measurement processes and the equipment for the experiment. keywords: parallel kinematic structure, quality, positioning accuracy, repeatable positioning accuracy, design of the experiment. 1. introduction the current trend is toward high-speed machining, which encourages the development of machine tools with high dynamics, improved rigidity and reduced moving masses. in general, parallel robots — the basic mechanical tool — are referred to as parallel kinematic machines. parallel kinematic mechanisms offer higher stiffness, lower moving mass, greater acceleration, potentially higher precision, lower installation requirements, and greater mechanical simplicity than existing conventional machine tools. on the basis of these attributes, parallel kinematic mechanisms offer the potential to change current production forms. they have the potential to be highly modular, highly configurable, high-precision machine tools. other potential benefits are high dexterity, simpler and smaller tools, a multiple mode of production capacity, and a small footprint. conventional machine tools are usually based on a serial structure. there are as many degrees of freedom as necessary, and the axes are arranged in series. this results in a single kinematic chain. the axes are generally arranged by cartesian axes, which means that there is an x-axis, a y-axis and a z-axis, and rotation axes if necessary. these machines are easy to handle, because each axis directly controls one cartesian degree of freedom, and there is no connection between the axes. parallel kinematic machines are machines in which the movement of the tool is based on the principle of parallel mechanisms. a parallel mechanism is a closed mechanism, in which the end-effector (the mobile platform) is connected to the base by at least two independent kinematic chains. 2. tricept research being carried out at the institute of manufacturing systems, environmental technology and quality management is developing a parallel kinematic structure of the tricept type. a fixed platform is connected with a moving platform with three telescopic rods with drives and one central rod without any drive. between the moving platform and the central rod there is a fixed connection. 
the central rod placed on the fixed platform allows translational motion without any turning.this type of mechanism is created from kinematic pairs of hps type (universal, sliding and spherical joints). the universal joint is formed by two swivel joints. its role is to transmit the rotary motion of the telescopic rod, with sufficient accuracy, stiffness and low friction in the joint. the location of the primary points is important in creating the program that will be used to manage the ejection of the telescopic rods. the movement of these swivel joints is ensured by a pair of bearings, which are located in an axis perpendicular to another pair of bearings. the bearings 59 http://dx.doi.org/10.14311/ap.2014.54.0059 http://ojs.cvut.cz/ojs/index.php/ap i. onderová, ľ. kolláth, l. ploskuňáková acta polytechnica figure 1. a computer model and a real model of tricept: 1–solid platform, 2–central pole, 3–universal (primary) joint, 4–telescopic pole, 5–spherical (secondary) joint, 6–movable platform. figure 2. primary joint. allow smooth and accurate turning of the telescopic rods, which are connected by universal joints to the fixed platform of the mechanism. the sliding joint is formed by a telescopic rod that transmits the rotary motion of the motor to the moving platform. the telescopic rods are the most important and the most exposed parts of tricept. they convert the rotary motion of the actuator into linear motion. ejection is performed by the inner cylinder, which ejects from the outer cylinder. the inner cylinder is fixed with one part fixed on the platform secondary joint onto the carrier. the outer cylinder is fixed by the primary joint onto the fixed platform. the inner cylinder is slidably placed in the outer cylinder. as both rods are slim, the telescopic rods are the most stressed parts. the accuracy of the bars has the greatest effect on the final position of the tool. the telescopic bars are stressed in terms of force transmission, and they are also sensitive to phenomena arising from long and slim rods. they are also stressed to buckling and, in a wide temperature range, also to shortening and lengthening as a result of thermal expansion. the telescopic rod is created by a moving screw. it can be ejected a distance of 300 mm. the ball joint transmits the movement of the telescopic rods to the carrier. it must allow spherical motion of the telescopic rod against the carrier. not only the functions of the joint are important, but also its location on the carrier. the location must be as close to the center of the carrier as possible. figure 3. pull-rod of the tricept. figure 4. the relationship between the specified position and the reached position. we therefore we reduce the dimensions of the carrier, and this minimizes the secondary circle. previous analyses have shown that the location of the joint is also important in terms of the tensions that arise. the incline of the telescopic rods to the central rod is important for the tensions. the smaller the incline is, the greater tensions are generated in the telescopic rods with the same force applied. a dynamic analysis shows a minimum incline of the telescopic rod to the central rod, which must be maintained even in the most unfavorable position. if the inclination is lower, the tension in the rods will increase significantly. the ball joint itself is formed by a spherical pin fastened in a bed with the inverse shape of the pin. to make it simple, it contains no rolling elements but is secured slidably. 3. 
3. the basic concepts
desired (programmed) position — a position determined by programming by teaching, by manual data entry, or by explicit programming. the programmed (desired) positions for robots specified by teach-in programming must be defined as a measuring point on the robot. this point is obtained when programming the robot to approximate the points of a cube (p1, p2, etc.). when the accuracy calculation is based on the successively reached positions, the coordinates expressed by the measuring system are used as the programmed positions.

reached position — the position reached by the robot in automatic mode in response to the programmed position (see figure 4). the parameters of accuracy and position repeatability express the deviations that occur between the desired position and the reached positions, as well as the variation in the reached position over a series of runs to the programmed position. these errors may be due to the properties of internal control functions, coordinate transformation errors, differences between the dimensions of the joint structures and the dimensions used in the control system model, and mechanical defects such as backlash, hysteresis, friction and temperature effects. the method for recording data on the specified position is associated with the options for controlling the robot, and significantly affects the accuracy parameters. for this reason, the chosen method must be clearly stated in the protocol on the implementation of the test. if the desired position is programmed explicitly, the relationship (distance and orientation) between the individual specified positions is known or can be determined, and is required for specifying and measuring the distance parameters.

4. positioning accuracy and position repeatability
4.1. unidirectional positioning accuracy
unidirectional positioning accuracy (ap) expresses the deviation between the desired position and the mean of the reached positions when moving to the desired position from the same direction. we distinguish: unidirectional positioning accuracy — the difference between the desired location and the barycenter of the set of reached points (see figure 5); unidirectional orientation accuracy — the difference between the programmed orientation and the mean value of the achieved angular orientation (see figure 6).

figure 5. unidirectional positioning accuracy.

figure 6. unidirectional orientation accuracy.

unidirectional positioning accuracy is calculated as follows:
\[ AP_P = \sqrt{(\bar x - x_c)^2 + (\bar y - y_c)^2 + (\bar z - z_c)^2}, \]
\[ AP_x = \bar x - x_c,\qquad AP_y = \bar y - y_c,\qquad AP_z = \bar z - z_c, \]
\[ \bar x = \frac{1}{n}\sum_{j=1}^{n} x_j,\qquad \bar y = \frac{1}{n}\sum_{j=1}^{n} y_j,\qquad \bar z = \frac{1}{n}\sum_{j=1}^{n} z_j, \]
where \(\bar x, \bar y, \bar z\) are the barycentric coordinates of the set of points obtained by repeating the same position n times, \(x_c, y_c, z_c\) are the coordinates of the programmed (specified) position, and \(x_j, y_j, z_j\) are the coordinates of the reached position.

unidirectional orientation accuracy is calculated as follows:
\[ AP_a = \bar a - a_c,\qquad AP_b = \bar b - b_c,\qquad AP_c = \bar c - c_c, \]
\[ \bar a = \frac{1}{n}\sum_{j=1}^{n} a_j,\qquad \bar b = \frac{1}{n}\sum_{j=1}^{n} b_j,\qquad \bar c = \frac{1}{n}\sum_{j=1}^{n} c_j, \]
where \(a_c, b_c, c_c\) are the angles of the programmed (specified) position and \(a_j, b_j, c_j\) are the angles of the reached position.
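a minimal numerical sketch of the ap formulas above (illustrative only; the example coordinates are invented, not measured tricept data):

    import numpy as np

    def unidirectional_accuracy(reached, target):
        """ap_p and (ap_x, ap_y, ap_z) for n runs to one programmed position."""
        reached = np.asarray(reached, dtype=float)   # shape (n, 3)
        bary = reached.mean(axis=0)                  # barycenter (x̄, ȳ, z̄)
        ap_xyz = bary - np.asarray(target, dtype=float)
        return float(np.linalg.norm(ap_xyz)), ap_xyz

    # invented example: three runs to the programmed position (100, 50, 20) mm
    runs = [[100.02, 49.98, 20.01],
            [100.01, 50.00, 19.99],
            [ 99.99, 50.02, 20.02]]
    ap_p, ap_xyz = unidirectional_accuracy(runs, (100.0, 50.0, 20.0))
    print(ap_p, ap_xyz)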
4.2. measurement procedure for unidirectional positioning accuracy
tricept gradually moves with its mechanical connection (interface) from point p1 to the positions p5, p4, p3, p2, and then gradually back to p1. each position has to be approached unidirectionally, that is, from the same direction. the individual measurements are performed only when tricept is in a steady state at the given position. from the coordinates of the programmed positions, the mean values of the reached position coordinates and the mean values of the angular orientations over n repetitions of the same position, we can calculate the unidirectional orientation and positioning accuracy for each position using the formulas above.

4.3. unidirectional positioning repeatability
unidirectional positioning repeatability (rp) expresses the degree of agreement between the positions and orientations of the reached positions after n repetitions of the movement to the same desired position in the same direction. for a given position the repeatability is expressed by: the radius of the sphere \(RP_l\) whose center is the barycenter (see figure 5); and the dispersion of the angles \(\pm 3s_a, \pm 3s_b, \pm 3s_c\) around the mean values \(\bar a, \bar b, \bar c\), where \(s_a\), \(s_b\) and \(s_c\) are standard deviations (see figure 6). unidirectional position repeatability is calculated as follows:
\[ RP = \bar l + 3 s_l, \]
where
\[ \bar l = \frac{1}{n}\sum_{j=1}^{n} l_j,\qquad l_j = \sqrt{(x_j-\bar x)^2 + (y_j-\bar y)^2 + (z_j-\bar z)^2},\qquad s_l = \sqrt{\frac{\sum_{j=1}^{n}(l_j-\bar l)^2}{n-1}}. \]
unidirectional orientation repeatability is calculated as follows:
\[ RP_a = \pm 3 s_a = \pm 3\sqrt{\frac{\sum_{j=1}^{n}(a_j-\bar a)^2}{n-1}},\qquad RP_b = \pm 3 s_b = \pm 3\sqrt{\frac{\sum_{j=1}^{n}(b_j-\bar b)^2}{n-1}},\qquad RP_c = \pm 3 s_c = \pm 3\sqrt{\frac{\sum_{j=1}^{n}(c_j-\bar c)^2}{n-1}}. \]

4.4. measurement procedure for unidirectional positioning repeatability
the robot gradually moves with its mechanical connection through the selected cycle in the same way as when the unidirectional accuracy was measured, except that when measuring unidirectional position repeatability, the sphere radius rp and the angle errors \(RP_a, RP_b, RP_c\) are calculated for each position.
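a companion sketch for the repeatability formulas above (same caveats as before; the data are invented, and the sample standard deviation uses n − 1 in the denominator, as in the formulas):

    import numpy as np

    def unidirectional_repeatability(reached):
        """rp = l̄ + 3 s_l, with l_j the distance of run j from the barycenter."""
        reached = np.asarray(reached, dtype=float)
        l = np.linalg.norm(reached - reached.mean(axis=0), axis=1)
        return l.mean() + 3.0 * l.std(ddof=1)

    runs = [[100.02, 49.98, 20.01],
            [100.01, 50.00, 19.99],
            [ 99.99, 50.02, 20.02]]
    print(unidirectional_repeatability(runs))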
5. conclusions
the tricept experiments will take place when the wiring has been completed; the preparations are currently being finalized. the experimental results will form the basis for optimizing the design of tricept, and also for tuning the control system.

acknowledgements
the research work presented in this paper was performed with financial support from vega grant 1/0584/12.

acta polytechnica 53(2):165–169, 2013 © czech technical university in prague, 2013 available online at http://ctn.cvut.cz/ap/

proton and α-radiation of the micro-pinch with the boron-containing target
anatolii a. gurin (a,b,∗), andrii s. adamenko (a), stanislav v. adamenko (a), mykola m. kuzmenko (a)
(a) edl proton-21 ltd, kyiv, ukraine
(b) institute for nuclear research nasu, kyiv, ukraine
∗ corresponding author: gurin@kinr.kiev.ua

abstract. using an ion pinhole camera and track detectors, the image of the hot spot is recorded in a pulsed diode micro-pinch equipped with a solid anode target. the track image is a record of repeated fronts of fast protons with energies up to 1 mev. fluctuations in the ion luminosity of the hot spot are associated with the wave-like nature of the proton-accelerating processes in the dense plasma of the target material, which is characterized by a mean energy of 100 kev. the results of the track analysis of fast ions, detected in the thomson analyser in experiments with boron–polyethylene targets, are presented. in 5 % of the shots, the presence of α-particles with energies up to 2 mev in the flux of fast ions is discovered by means of a thomson analyser equipped with track detectors. estimates of the total amount of helium nuclei as products of the nuclear reaction p(b11, 2α)α give an output of 10^8 ÷ 10^9 per successful shot.

keywords: micro-pinch, fast ions, proton, boron, target, α-particles.

1. introduction
the hot spot flashing in the channel of a micro-pinch ("plasma spark") has been an intriguing object since the fifties of the last century, when it was shown that the "star", helium- and hydrogen-like emission spectra of heavy elements up to iron can be simulated in laboratory experiments. in micro-pinches arising from the high-voltage breakdown of a vacuum diode, the temperature exceeds 10 kev, and the plasma density is close to that of the solid state. however, the uncontrolled nature of this self-organization is the main obstacle to the use of micro-pinches for controlled fusion. breakdowns in vacuum diodes are prepared by an anodic pre-plasma, formed as a result of surface ionization of the anode by the primary electron beam or by other means; the parameters of the beam and of the anode plasma are critical for the formation of the hot spot. this picture is much more complex than the one that takes place in the gas-filled low-pressure "plasma focus", which has long maintained a leading position among fusion projects in d−d neutron output (see the surveys on this topic [9, 5]).

in the edl "proton-21", we found new and interesting features of the hot spots. in a vacuum diode sparked as the load of a plasma opening switch at a voltage of 400 kv, with a current rise over tens of nanoseconds up to values of 30 ÷ 40 ka, there is a phenomenon which can be described as flicker luminosity of the hot spot, but only in terms of ion optics. an ion pinhole camera fitted with a track detector recorded a repetition of fronts of fast protons, which characterize the high-energy corpuscular radiation of the micro-pinch. figure 1 shows an example of the hot spot image, which, unlike the compact spot seen in images of the "plasma focus" or "laser focus" [8, 10], is a composite image in the form of several strips ("candles") oriented along the pinch axis. separate sparse track clusters are sometimes recorded too. (the pinhole axis is placed in the plane of the diode; the distance from the diode axis to the inlet is 17 mm and the distance to the detector 45 mm, i.e. the zoom factor is about 2.) the width of the track strips (200 ÷ 220 µm) depends only weakly on the diameter of the inlets (tens of µm), so geometrical considerations give a hot spot size of about 100 µm. the "heads" of the strips, their edges close to the pinhole axis, are formed by deep latent tracks of fast protons with energies from 0.6 to 1 mev, while the continuation of the strips is represented by highly degraded tracks down to the lower limit of track formation, an energy of 100 kev. the images, and all of the details recorded by the pinhole, are not snapshots of the hot spot but integral pictures without any time binding.
however, the main conclusions can be drawn about how such images arise. the overlapping of the strips shows that the flows of fast protons are emitted not simultaneously but in a number of fronts following sequentially. the full picture arises by the drawing out of a compact spot, which is the trace of a thin radial beam of protons penetrating into the pinhole camera during the period of each front. each strip is formed by the deflection of the slow protons of the passing front in the peripheral azimuthal magnetic field present in the pinhole camera. images like that in fig. 1 have been published previously [7, 1]. since then, hundreds of images similar to fig. 1 have been accumulated, and the energy spectrum [7] is confirmed.

figure 1. the typical image of the "hot spot" of a micro-pinch on the track detector exposed in the ion pinhole camera.

in our experiments, the average energy of track-productive protons is within the range 200 ÷ 300 kev. the total number of fast protons in each of the shots is estimated at 10^14 ÷ 10^15 per shot. note that the usual optical radiation of the diode discharge, which continues for microseconds after the fast protons pass, is realized with an amount of the order of 10^18 hydrogen atoms coming from the peripheral anode plasma. in our experiments, the "target" is an element of the anode: a rod about 1 mm thick, fastened in the metal anode in the diode gap. characteristically, metal and dielectric targets can be used fully equally well, which clearly indicates that the target material serves, above all, to give rise to a conductive anode plasma.

despite the long history of the problem, there is still no sufficiently complete model of the mechanisms of the hot spot and the proton acceleration in micro-pinches. the concept of collective ion acceleration with a virtual cathode can be considered dominant [5]. however, we note [2] where, in the configuration of the "luce diode" close to the one we used, many bursts of accelerated protons were found, the possibility of their acceleration by the potential hole of electrons was refuted, and the wave character of the formation of accelerated proton streams was suggested. we also note [3], which likewise mentions the radiation of separate bunches of protons from a micro-pinch. in our view, the results most important for understanding the processes of proton acceleration are obtained in [13, 12], where the effect of predominant acceleration of protons, as the lightest component of the plasma, is observed, following the shock wave of the magnetic field that forms a strong current channel. this mechanism leads to a multispecies hall mhd model, awaiting further development, of the acceleration and separation of ions due to fast magnetic field penetration into a plasma with rising current. the results of [13, 12] relate to the current-carrying plasma in erosion switches, but it is natural to assume that the same, sharpened, processes take place in the denser micro-pinch plasma after the breakdown of the diode acting as a shunt of the plasma switch. in this publication we would like to note that fig. 1 and other data confirm the wave-like nature of the proton-accelerating processes directly in the dense plasma of the hot spot. the track images on the detectors indicate that in some cases there are up to 20 "candles", though the highest-energy emission is found in the case of a small number of them, from 3 to 5, as shown in fig. 1.
if the accelerated proton fluxes arise directly inside the hot spot of the micro-pinch, the presence of relative motion of protons through the relatively dense plasma of heavy ions is an important attribute of the hot spot. although there is still no complete dynamical model, it is reasonable to conclude that the relative velocity of the protons involved in the process of pinching and radiation is scaled by the average energy of order 100 kev recorded by the tracks in the pinhole camera at the end of the protons' free flight. the ensemble of these protons could be characterized by a time-averaged velocity distribution function with a temperature scale of 100 kev. such temperatures do not occur in nature; it is more correct to speak, rather, of a hybrid version of a beam–plasma system, attractive in terms of ion inertial fusion.

the purpose of our experiments was to examine the possibility of the occurrence of a number of fusion reactions p + b11 = 3α + 8.7 mev in the "hot spot", initiated in micro-pinch discharges with solid boron-containing anode targets. this reaction has long been discussed as the most attractive from the standpoint of the requirements of an environmentally friendly, "aneutronic" energy of the future ("3α-energetics", as some authors call it). it is extremely interesting to determine whether the conditions for this reaction are achieved in the ongoing laboratory experiments; it is also known that the energy threshold is very high (about 500 kev) and out of reach for all contemporary fusion projects. starting the experiments with boron-containing targets, we were fully aware that the parameters of our micro-pinch are not sufficiently high to achieve above-barrier proton–boron synthesis, so one cannot expect to see an abundance of alpha particles in the products of the imploded targets. the exponentially decreasing spectrum of fast protons, with the average energy in the range 200 ÷ 300 kev, only allows for the possibility of individual sub-barrier reactions. however, the results of the first steps in this direction may be of interest.

2. equipment and methods
the preparation of boron-based targets is hampered by the fact that pure natural boron, as well as the main boron–hydrogen compounds, does not form a solid state. therefore, the targets used were based on a hydrogen-rich filler, most often pure polyethylene, containing a suspension of amorphous natural boron (83 % of b11). it was possible to achieve a weight ratio of 8:5 in favour of boron. we also used other fillers: paraffin, silica or epoxy glue. coated targets were used as well: a gold foil (submicron natural "gold leaf"), or enrichment of the surface layer with amorphous boron. about 200 experiments have been performed.

figure 2. experimental arrangement; an example of the photo image of thomson parabolas on the etched track detectors is also shown.

to register the track pinhole image in the pinhole camera, the target was oriented parallel to the axis of the diode. we also used the diagnostics shown in fig. 2, where the target was located across the anode axis in front of a 6 mm hole in the flat anode, letting a beam of ions out into the space behind the anode. a thinned plasma beam then passes sequentially through the parallel magnetic (9 kgs) and electric (6 kv on planar electrodes with a gap of 8 mm) fields according to the scheme of a thomson analyser. figure 2 shows the proportions of the target, the fields and the proton orbit sizes.
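as a reminder of why ions of fixed m/z trace parabolas in such an analyser (a standard small-deflection relation, our sketch rather than the authors' calibration; the effective field lengths \(\ell_E, \ell_B\) and drift length \(L\) are schematic stand-ins for the real geometry): an ion of charge q, mass m and speed v acquires deflections

\[
x \simeq \frac{qE\,\ell_E L}{m v^{2}}, \qquad
y \simeq \frac{qB\,\ell_B L}{m v}
\quad\Longrightarrow\quad
x = \frac{E\,\ell_E}{B^{2}\ell_B^{2} L}\cdot\frac{m}{q}\,y^{2},
\]

so the velocity drops out and all ions with the same m/q (i.e. m/z) fall on one parabola, regardless of energy; the position along the parabola encodes the energy.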
in our thomson analyser, ions are registered with track detectors and with scintillators provided with electron-optical signal conversion. the scintillators are set at several positions corresponding to certain proton energies, which allows time-of-flight signals of fast ions to be measured over a base of 69 cm. in fig. 3 the waveform of the proton signal delay, registered by the scintillator at the position e_proton = 0.5 mev, is shown together with the diode current waveform recorded by a rogowski coil, as well as the x-ray electron-beam bremsstrahlung signal detected by a scintillator open only to the hot spot. the delay time of the protons corresponds to their emission at the growth phase of the discharge current, lasting no more than 40 ns. (these data will be published in more detail elsewhere.)

figure 3. the waveforms of the diode current, the collimated x-ray hot-spot emission, and the signal of the scintillator placed at position e_proton = 0.5 mev on the proton parabola m/z = 1 in the thomson analyser.

time-of-flight measurements are insufficient for the final identification of the ion composition of the fast plasma flows from the affected target. to this end, the tracks of ions recorded on the cr-39 track detectors along the thomson parabolas m/z = 1 (of the protons) and m/z ≥ 2 (of heavier ions) were studied. in our experiments the ions he_4^{+2}, b_10^{+5}, b_11^{+6}, c_12^{+6}, etc., can occupy the m/z = 2 parabola (the isotopic splitting of the parabolas is not achieved). in successful shots, in which the second parabola is well expressed, track groups were selected within its area to be analysed by the "squared diameters asymptotic method" (sdam) [1, 11], which allows the full track length in the detector, r, and the maximal depth of the etched tracks, l, to be determined by measuring the diameters of the tracks in the normal procedure of long-time etching. these data allow us to determine the average reduced etch rate, v = r/(r − l), and to compare these values with the calibration and theoretical dependences v(r) for different ions. this procedure reduces to the construction of the locus "v vs r" for all analysed tracks, and to fixing the splitting of the loci in accordance with the calibration or theoretical data. we used the theoretical and experimental dependences v(r) [4, 6], and the gauge locus for alpha particles [7], which we built for α-tracks on a detector irradiated with a neptunium source (np247). the main drawback of sdam is that it is very time-consuming and needs high-precision optical measuring instruments, especially in the case of the short-range tracks of heavy ions. however, sdam is a convenient, or even the only possible, method for finding and determining the energy of alpha particles in situ, in a mixed flow of ions with very-low-energy light nuclei with a range of 5 ÷ 10 mm in the detector, for which the values of v are well distinguished. the following data also illustrate another feature of sdam. as a rule, especially in the case of heavy ions, the calculated errors of v, r for individual tracks are small compared with the statistical spread of these values for the dense groups of similar tracks selected for analysis. random parameters of individual tracks (brownian straggling), which sdam ignores, affect the final form of the loci. therefore, in figs. 4–6 the data of some groups of tracks with close values of r are combined into a single "middle track", followed by the standard deviation of the coordinates (Δr, Δv) for the group. this helps the interpretation of the measurement results.
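a short numerical sketch of the two sdam quantities used above (illustrative; the track-length values are invented, and the grouping into a "middle track" with its (Δr, Δv) spread follows the description in the text):

    import numpy as np

    def reduced_etch_rate(r, l):
        """sdam quantity v = r / (r - l) from the full track length r and
        the maximal depth l of the etched track (same units for both)."""
        r, l = np.asarray(r, dtype=float), np.asarray(l, dtype=float)
        return r / (r - l)

    def group_to_middle_track(r_vals, l_vals):
        """combine a group of similar tracks into a 'middle track' with its
        (Δr, Δv) spread, as done for the loci in figs. 4-6."""
        r = np.asarray(r_vals, dtype=float)
        v = reduced_etch_rate(r, l_vals)
        return (r.mean(), v.mean()), (r.std(ddof=1), v.std(ddof=1))

    print(group_to_middle_track([6.1, 5.8, 6.4], [4.9, 4.5, 5.2]))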
figure 4. locus of 80 tracks chosen for the analysis within different parts of the parabola m/z = 2; groups 1 and 3 represent 30 individual tracks of α-particles, shown with 20 % estimated errors of the v measurements; groups 2 and 4–8 each join 10 tracks into an 'average' track and apparently belong to the ions b, c or o, or even heavier ions.

we note in this connection that the data in [4], used to calibrate the relations (r, l), are presented as average statistical values over 12 tracks, with no indication of error intervals. below, the loci are shown with the standard deviations Δv, Δr for the respective groups.

3. results
the summarized results of these studies are as follows. in some shots, separate track groups are recorded within the parabola m/z = 2 which are identified as α-particles with ranges of 4 ÷ 8 µm in the detector, corresponding to α-particle energies of 0.8 ÷ 1.7 mev, together with single tracks with a range up to 14 microns. only about 10 experiments in a series of 200 shots were positive. in some cases, the track analysis does not allow definite conclusions, for various reasons. figures 4–6 are positive examples of the identification of α-particles, as well as of the accuracy achieved in the construction of the loci by sdam. these data are presented together with the gauge v(r) for α-particles. we also give the theoretical relations v(r) for light nuclei, since in the case of the small track lengths of heavy nuclei any calibration is absent. figures 4 and 5 show the splitting of the loci, indicating the presence of α-particles with energies up to 1.7 mev in the flow of heavier ions. note that in the range r < 4 µm track analysis is difficult, and sdam does not allow unambiguous identification of such very weak tracks. figure 6 combines the results of detecting relatively small groups of α-tracks, including the case of registration of the fastest individual α-particles: 3 α-tracks are characterized by ranges of about 13 microns, which corresponds to an α-particle energy of 2.7 mev. note that single α-tracks with long ranges are often recorded in arbitrary samples; however, such data can hardly be considered conclusive.

figure 5. locus of almost 100 tracks (boron–plastic target coated with au): a rich locus of 105 tracks belongs to α-particles; two group loci are treated equally as belonging to the nuclei of carbon or oxygen.

figure 6. the total loci of 120 tracks from the 4 experiments in each of which α-particle tracks are detected, including the fastest α-particles recorded in one experiment: three tracks on the background of 37 tracks of boron nuclei.

4. discussion and conclusions
these results indicate that the accuracy we have achieved is sufficient for the identification of alpha particles as the lightest nuclei corresponding to the condition m/z = 2. the main reason for further discussion is the absence of alpha particles in the range 3 ÷ 4 mev, as products of the reaction p(b11, 2α)α; the identification of such particles would be an easier task. another question is their amount.
obviously, the detectors record the helium ions formed in the central axial region of the anode target which, in the explosions of the boron–hydrocarbon target, is the source of the dense plasma, with a starting density close to that of the solid. in this case the energy of 8.7 mev resulting from the proton–boron reaction, divided among three α-particles, is partially lost in the target material. note that the range of 3 mev α-particles in hydrocarbon media does not exceed 20 microns. the detectors record only the portion of the emitted fast helium ions that have overcome the scattering in the target material, whereas some of the helium ions are completely stopped and mixed with the gas-plasma decay products of the target (the data of the gas analyser confirm this conclusion). such an effect may occur if the "optically opaque" core of the hot spot (in terms of ion optics) has a size of 100 µm. the determination of the total number of fast helium ions in the plasma flows of the affected targets directly in the experiment, in contrast to the analogous problem for the neutron experiments with d−d inertial fusion, seems to be much more difficult. the total number of registered alpha particles is difficult to assess, because the complete overlapping of the etched tracks makes them impossible to count. α-particles are found only in that part of the parabola m/z = 2 which corresponds to the interval 1 ÷ 2 mev, and the number of their tracks is about 1 % of the total track number on the corresponding segment of the parabola. a comparison of the intensities of the first and second parabolas shows that a coefficient not higher than 10^-6 can be taken to evaluate the proportion of the total number of fast helium ions to the total number of the fast protons that are regularly recorded. the integral number of fast protons emitted by the target is estimated at about 10^14 ÷ 10^15. (this number is determined by counting the total number of proton tracks, deflected by the magnetic field, directly in the magnet gap with an additional small 5 mm aperture at the entrance to the magnet.) consequently, up to 10^8 fast helium ions can be produced in a lucky shot with a boron–polyethylene target. it must be recognized that the reproducibility of the process is low: only 5 % of the experiments gave a positive result. this estimate of the amount of helium produced in the experiments with boron-containing targets is confirmed, and even rises up to a value of 10^9 per shot, thanks to the more reliable data of the gas analyser, based on samples taken from the chamber immediately after the shot. these samples also contain helium atoms which are formed as a result of the complete stopping of alpha particles in the diode plasma and are not registered by the thomson analyser. the above results of the track detection support, rather, a qualitative conclusion about the presence of the nuclear reaction p(b11, 2α)α in the hot spot of a laboratory micro-pinch with far-from-record parameters. note that in the early experiments on thermonuclear d−d fusion using the "plasma focus", a number of neutrons recorded at 10^8 per shot was considered a relative success. in experiments with solid-target micro-pinches, the "3α-energy" program must take the next step: to achieve an average proton energy of the order of 1 mev in the fluctuations of the corpuscular emission of the hot spots. if the average energy of a pulsed "runaway" of protons is that high, it can be expected that the "flicker" will be associated with abundant above-barrier proton–boron synthesis in the hot spot plasma.
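written out, the yield estimate above simply combines the two measured numbers (with ε the parabola intensity ratio):

\[
N_\alpha \;\lesssim\; \varepsilon\, N_p \;\approx\; 10^{-6}\times\left(10^{14}\div 10^{15}\right) \;=\; 10^{8}\div 10^{9}\ \text{helium ions per shot},
\]

consistent with the gas-analyser value of up to 10^9 per shot quoted above.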
references
[1] s. v. adamenko, a. s. adamenko, a. a. gurin, yu. m. onishchuk: track measurements of fast particle streams in pulsed discharge explosive plasma. radiation measurements 40(2–6):486–489, 2005.
[2] r. adler, j. a. nation, v. cerlin: collective acceleration of protons in relativistic electron beam propagation in evacuated drift tubes. phys. fluids 24(2):347–356, 1981.
[3] v. m. bystritski, a. n. didenko, ya. e. krasik, et al.: letters to jetp 4(9):547–551, 1978.
[4] b. dorschel, d. hermsdorf, k. kander, h. kuchne: track parameters and etch rates in alpha-irradiated cr-39 detectors used for dosemeter response calculation. radiat. prot. dosim. 78(3):205–212, 1998.
[5] a. e. dubinov, i. yu. kornilova, v. l. selemir: collective ion acceleration in systems with a virtual cathode. advances in physical sciences 45(11):1109–1130, 2002.
[6] a. n. golovchenko: on registration properties of intercast company cr-39. int. j. radiat. appl. instrum. part d 20(3):517–519, 1992.
[7] a. a. gurin, a. s. adamenko: controlled nucleosynthesis (fundamental theories in physics). springer, 2007, pp. 105–151. edited by s. adamenko, f. selleri, a. van der merwe.
[8] u. jager, h. herold: fast ion kinetics and fusion reaction mechanism in the plasma focus. nucl. fusion 27(3):407–423, 1987.
[9] r. b. miller: an introduction to the physics of intense charged particle beams. plenum press, new york–london, 1984.
[10] d. c. slater: pinhole imaging of fast ions from laser-produced plasmas. appl. phys. lett. 31(3):196–198, 1977.
[11] g. somogyi, s. a. szalay: track-diameter kinetics in dielectric track detectors. nucl. instrum. methods 109(2):211–232, 1973.
[12] s. b. swanekamp, r. j. commisso, p. f. ottinger, et al.: species separation and field-penetration in a multi-component plasma. in: beams 2002: 14th international conference on high-power particle beams, pp. 455–458, albuquerque, new mexico (usa), 2002.
[13] a. weingarten, r. arad, y. maron, a. fruchtman: ion separation due to magnetic field penetration into a multispecies plasma. phys. rev. lett. 87(11):115004, 2001.

acta polytechnica vol. 50 no. 5/2010

self-adjoint extensions of schrödinger operators with δ-magnetic fields on riemannian manifolds
t. mine

abstract
we consider the magnetic schrödinger operator on a riemannian manifold m. we assume the magnetic field is given by the sum of a regular field and dirac δ measures supported on a discrete set γ in m. we give a complete characterization of the self-adjoint extensions of the minimal operator, in terms of boundary conditions. the result is an extension of the former results by dabrowski-šťovíček and exner-šťovíček-vytřas.

keywords: spectral theory, functional analysis, self-adjointness, aharonov-bohm effect, quantum mechanics, differential geometry, schrödinger operator.

1 introduction
let (m, g) be a two-dimensional, oriented, connected, complete c^∞ riemannian manifold, where g is the riemannian metric on m. let dμ be the measure induced from the riemannian metric. if we take a local chart (u, ϕ), ϕ = (x^1, x^2), the measure dμ is written as
\[ d\mu = \sqrt{g}\, dx^1 dx^2 \]
in u, where g = det(g_{mn}), g_{mn} = g(∂_m, ∂_n), and ∂_m = ∂/∂x^m. we denote l^2(m) = l^2(m; dμ).
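for instance (our illustration; the poincaré disk, which reappears below in connection with [11], is a convenient concrete case): with the poincaré metric on the unit disk,

\[
g_{mn} = \frac{4\,\delta_{mn}}{\left(1-|x|^2\right)^{2}}
\quad\Longrightarrow\quad
\sqrt{g} = \frac{4}{\left(1-|x|^2\right)^{2}},
\qquad
d\mu = \frac{4\, dx^1 dx^2}{\left(1-|x|^2\right)^{2}}.
\]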
the set of all 1-forms on m is denoted by λ^1(m). in the coordinate neighborhood u, a ∈ λ^1(m) is written as a = a_1 dx^1 + a_2 dx^2. in general, the coefficients a_1, a_2 are complex-valued; we say a is real-valued if the coefficients are real-valued. we say a is of the class c^k λ^1(m) if the coefficients are of the class c^k(u) for any local chart (u, ϕ). we define the class l^q_loc λ^1(m) (1 ≤ q ≤ ∞), etc., similarly. (the measure dμ is omitted here, since the class l^q_loc λ^1(m; dμ) is independent of the choice of dμ.) the 2-form da is called the magnetic field. if a ∈ l^1_loc λ^1(m), then da can be defined at least in the distribution sense. in u, the magnetic field is given by
\[ dA = (\partial_1 A_2 - \partial_2 A_1)\, dx^1\wedge dx^2. \]
let γ = {γ_k}_{k=1}^K be a sequence of mutually distinct points in m. the number K may be infinite, and in this case we additionally assume that γ has no accumulation points in m. let a be a 1-form on m given by the sum of two 1-forms:

(a) a = a^{(0)} + a^{(1)}.

the part a^{(0)} corresponds to the δ magnetic fields; that is, we assume the following.

(a0) a^{(0)} ∈ c^∞ λ^1(m \ γ) ∩ l^1_loc λ^1(m), real-valued, and
\[ dA^{(0)} = \sum_{k=1}^{K} 2\pi\alpha_k\,\delta_{\gamma_k}, \tag{1} \]
where α_k ∈ r and δ_γ is the dirac measure concentrated at the point γ. more precisely, (1) means
\[ -\int_M d\phi\wedge A^{(0)} = \sum_{k=1}^{K} 2\pi\alpha_k\,\phi(\gamma_k) \quad\text{for any } \phi\in C_0^\infty(M) \]
(since a^{(0)} ∈ l^1_loc λ^1(m), the left-hand side is well-defined). notice that this equation is independent of the riemannian metric g. for the regular part a^{(1)} and the scalar potential v, we assume the following:

(a1) a^{(1)} ∈ c^1 λ^1(m), real-valued.

(v) v is real-valued, v ∈ l^2_loc(m), and is bounded in some open neighborhood of γ_k for every k = 1, ..., K.

using the local coordinates (x^1, x^2), we define the schrödinger operator l in each coordinate neighborhood by
\[ Lu = -\frac{1}{\sqrt g}\sum_{m,n=1,2} (\partial_m + iA_m)\Big(\sqrt g\, g^{mn}(\partial_n + iA_n)u\Big) + Vu, \]
where (g^{mn}) is the inverse matrix of (g_{mn}). (the coefficient a_m is a function on u ⊂ m; however, we denote the pull-back (ϕ^{-1})^* a_m = a_m ∘ ϕ^{-1} on ϕ(u) ⊂ r^2 by the same symbol a_m, for simplicity of notation. this convention is used frequently in this paper.) this definition is independent of the choice of local coordinates (see section 2). define the minimal operator h_min by
\[ H_{\min} u = Lu,\qquad D(H_{\min}) = \overline{C_0^\infty(M\setminus\Gamma)}, \]
where the overline denotes the closure with respect to the graph norm. define the maximal operator h_max by h_max = h_min^*. then we can show that
\[ H_{\max} u = Lu,\qquad D(H_{\max}) = \{\, u\in L^2(M)\mid Lu\in L^2(M)\,\}, \]
where l is regarded as a differential operator on d′(m \ γ). we assume

(sb) the operator h_min is bounded from below.

in the case where m is the flat euclidean plane, it is well known that the operator h_min is not essentially self-adjoint, and the structure of the self-adjoint extensions of h_min can be determined via the celebrated krein-von neumann theory of self-adjoint extensions (see e.g. reed-simon [13]). in the textbook by albeverio et al. [3], the case a^{(0)} = a^{(1)} = 0 and v = 0 (but γ ≠ ∅) is studied exhaustively. adami-teta [1] and dabrowski-šťovíček [7] study the case K = 1, α_1 ∉ z, a^{(1)} = 0, and v = 0. exner-šťovíček-vytřas [8] study the case K = 1, α_1 ∉ z, da^{(1)} = b dx^1 ∧ dx^2 for some non-zero constant b (the constant magnetic field), and v = 0. moreover, lisovyy [11] studies the case where m is the poincaré disk, g is the poincaré metric, v = 0 and da = b ω_g + 2πα δ_0, where b is a non-zero constant and ω_g is the surface form induced from the poincaré metric g. in all the results above, the authors first determine the deficiency subspaces ker(h_max ∓ i) and apply the krein-von neumann theory.
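for a single flux sitting at the origin of a flat chart, the model potential satisfying (a0) is the aharonov-bohm form; the following standard computation (our sketch, with the flat chart as an assumption) checks (1) via the stokes theorem:

\[
A^{(0)} = \alpha\,\frac{-x^2\,dx^1 + x^1\,dx^2}{r^2} = \alpha\,d\theta \ \ \text{on } \mathbb{R}^2\setminus\{0\},
\qquad dA^{(0)} = 0 \ \text{away from } 0,
\]
\[
-\int_{\mathbb{R}^2} d\phi\wedge A^{(0)}
= -\lim_{\varepsilon\to 0}\int_{r\ge\varepsilon} d\big(\phi\, A^{(0)}\big)
= \lim_{\varepsilon\to 0}\oint_{r=\varepsilon} \phi\,\alpha\, d\theta
= 2\pi\alpha\,\phi(0),
\]

so dA^{(0)} = 2πα δ_0 in the distribution sense, independently of the metric, as stated above.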
this method cannot be applied in the case K ≥ 2 and α_k ∉ z; however, this case (with a^{(1)} the constant field and v = 0) on the flat euclidean plane is studied by the author [12], and the structure of the self-adjoint extensions is determined there. our main purpose in this paper is to generalize the result of [12] to general complete riemannian manifolds and to more general a and v. our first result is about the deficiency indices n_±(h_min) = dim ker(h_max ∓ i).

theorem 1.1 assume (a), (a0), (a1), (v), and (sb). then both deficiency indices n_±(h_min) are equal to 2K_1 + K_2, where
\[ K_1 = \#\{\alpha_k \mid \alpha_k \notin \mathbb{Z}\},\qquad K_2 = \#\{\alpha_k \mid \alpha_k \in \mathbb{Z}\}. \]

note that bulla-gesztesy [4] obtain a similar result in the case a = 0 and v has singularities, and iwai-yabu [9] also obtain a similar result on the two-dimensional torus. next, we shall give a complete characterization of the self-adjoint extensions of h_min. to this purpose, we introduce some nice coordinates around the singularities and some auxiliary functions. for simplicity, we assume K = #γ is finite for a while. for k = 1, ..., K, let (u_k, φ_k), φ_k = (x^1, x^2), be a local chart around γ_k such that u_k is simply connected, φ_k(γ_k) = 0, v is bounded in u_k, and {u_k}_{k=1}^K are disjoint. let (r, θ) be the radial coordinates in u_k defined by x^1 + ix^2 = re^{iθ}, r ≥ 0, 0 ≤ θ < 2π. we assume
\[ g_{mn}(0,0) = \delta_{mn},\qquad \partial_j g_{mn}(0,0) = 0 \quad (m, n, j = 1, 2), \tag{2} \]
where δ_{mn} is the kronecker delta. condition (2) is satisfied, for example, if we take normal coordinates (the coordinates defined by the local inverse of the exponential map from the tangent space at γ_k to m) as (x^1, x^2). let β_k be the fractional part of α_k, that is, α_k = [α_k] + β_k, [α_k] ∈ z and 0 ≤ β_k < 1. put
\[ \tilde A^{(0)} = \beta_k\, r^{-2}(-x^2\, dx^1 + x^1\, dx^2),\qquad \tilde A^{(1)} = A^{(1)} - A^{(1)}(0). \]
(more precisely, the 1-form a^{(1)} − a^{(1)}(0) is defined as (a^{(1)}_1(x^1, x^2) − a^{(1)}_1(0,0)) dx^1 + (a^{(1)}_2(x^1, x^2) − a^{(1)}_2(0,0)) dx^2 in the coordinate neighborhood u_k.) it is well known that dã^{(0)} = 2πβ_k δ_0 (see e.g. aharonov-bohm [2, 1] or [7]). define a phase function ψ_k ∈ c^∞(u_k \ {0}) by
\[ \psi_k(x) = \exp\frac{1}{i}\left( A_1^{(1)}(0)\,x^1 + A_2^{(1)}(0)\,x^2 + \int_{x_0}^{x}\big(A^{(0)} - \tilde A^{(0)}\big) \right), \tag{3} \]
where a^{(1)} = a^{(1)}_1 dx^1 + a^{(1)}_2 dx^2, x_0 is some point in u_k \ {0}, and the path of the line integral lies in u_k \ {0}. notice that the value of the line integral is independent of the choice of path modulo 2πz, by the stokes theorem and the assumption d(a^{(0)} − ã^{(0)}) = 2π[α_k]δ_0 in u_k. then we have
\[ A = \tilde A + i\psi_k^{-1}\, d\psi_k,\qquad \tilde A = \tilde A^{(0)} + \tilde A^{(1)}, \tag{4} \]
and
\[ L = \psi_k\, \tilde L\, \psi_k^{-1} \tag{5} \]
in u_k \ {0}, where l̃ is the operator l corresponding to the vector potential ã and the scalar potential v. let K_1, K_2 be the numbers in theorem 1.1. in the sequel, we rearrange the index k so that 0 < β_k < 1 for 1 ≤ k ≤ K_1. as we prove later, the asymptotics of u ∈ d(h_max) in u_k as r → 0 are given by
\[
u = \begin{cases}
\psi_k\big( c_1^k\, r^{\beta_k - 1} e^{-i\theta} + c_2^k\, r^{-\beta_k} + c_4^k\, r^{1-\beta_k} e^{-i\theta} + c_5^k\, r^{\beta_k} \big) + \xi & (1 \le k \le K_1),\\[2pt]
\psi_k\big( c_3^k \log r + c_6^k \big) + \xi & (K_1 + 1 \le k \le K),
\end{cases}
\]
where c_1^k, ..., c_6^k are constants and ξ is a regular function in the sense ξ ∈ d(h_min). define
\[
\Phi_j(u) = \begin{cases} {}^t\big(c_j^1, \dots, c_j^{K_1}\big) \in \mathbb{C}^{K_1} & (j = 1, 2, 4, 5),\\[2pt] {}^t\big(c_j^{K_1+1}, \dots, c_j^{K}\big) \in \mathbb{C}^{K_2} & (j = 3, 6), \end{cases}
\qquad
\Phi(u) = {}^t\big({}^t\Phi_1(u)\,\cdots\,{}^t\Phi_6(u)\big) \in \mathbb{C}^{4K_1 + 2K_2}.
\]
define a (2K_1 + K_2) × (2K_1 + K_2) diagonal matrix d by
\[ D = \operatorname{diag}\big(1-\beta_1, \dots, 1-\beta_{K_1},\ \beta_1, \dots, \beta_{K_1},\ -1/2, \dots, -1/2\big). \tag{6} \]
now our theorem is stated as follows.
theorem 1.2 assume (a), (a0), (a1), (v), (sb) and K < ∞. let Φ(u) and d be given as above.

(i) let \( X = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \), where x_1, x_2 are (2K_1 + K_2) × (2K_1 + K_2) matrices satisfying
\[ \operatorname{rank} X = 2K_1 + K_2,\qquad X_1^* D X_2 = X_2^* D X_1. \tag{7} \]
then the operator h_x defined by
\[ H_X u = Lu,\qquad D(H_X) = \{\, u \in D(H_{\max}) \mid \Phi(u) \in \operatorname{ran} X \,\} \]
is a self-adjoint extension of h_min.

(ii) for any self-adjoint extension h of h_min, there exists some matrix x satisfying (7) such that h = h_x.

we can consider the case K = ∞, but some technical assumptions are necessary; we shall discuss this case in section 5. thus we can characterize the self-adjoint extensions in terms of boundary conditions. we can easily prove that the friedrichs extension corresponds to the case x_1 = o, x_2 = id (see the display at the end of this introduction). in the case m = r^2 and K = 1, similar results are obtained in [7] and [8], and our theorem is a generalization of their results. as stated in those papers, the choice of the matrix x is of course not unique: there are infinitely many matrices x giving the same ran x. the difficulty in the proof is that we cannot determine the deficiency subspaces explicitly. to overcome this difficulty, we describe the self-adjointness condition using only the quotient space d(h_max)/d(h_min). this quotient space is essentially the same object as the sum of the deficiency subspaces, but is much more easily tractable than the deficiency subspaces themselves. this idea is also used in [4] or [12]. we note that recently self-adjoint extensions of schrödinger operators on r^2 with δ magnetic fields have been studied from the viewpoint of a hidden supersymmetric structure; see correa et al. [5, 6]. the rest of the paper is organized as follows. in section 2, we review basic notation and facts from differential geometry and the theory of self-adjoint extensions. in section 3, we shall prove that the structure of the self-adjoint extensions depends only on the singular part of the vector potential. in section 4, we shall prove the main theorems. in section 5, we shall consider the case K = ∞ and give a complete characterization of the self-adjoint extensions under some homogeneity conditions.
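as a sanity check of condition (7) (our illustration, assuming the block ordering of Φ into a singular half (Φ_1, Φ_2, Φ_3) and a regular half (Φ_4, Φ_5, Φ_6)): for the friedrichs choice x_1 = o, x_2 = id,

\[
\operatorname{rank}\begin{pmatrix} O \\ \mathrm{Id} \end{pmatrix} = 2K_1 + K_2,
\qquad
O^*\, D\,\mathrm{Id} = O = \mathrm{Id}^*\, D\, O,
\]

so (7) holds, and ran x = {0} ⊕ c^{2K_1+K_2} forces Φ_1(u) = Φ_2(u) = Φ_3(u) = 0: the coefficients of the singular terms r^{β_k−1}e^{−iθ}, r^{−β_k} and log r vanish, which is exactly the boundary condition one expects to single out the friedrichs extension.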
2 basic facts
2.1 formulas in differential geometry
we quote some formulas used in shubin [14] for the convenience of the reader. take a local chart (u, ϕ), ϕ = (x^1, x^2), around p ∈ m. put g_{mn} = g(∂_m, ∂_n), and let (g^{mn}) be the inverse matrix of (g_{mn}). for α, β ∈ λ^1_p(m) (the cotangent space at p), we define the scalar product
\[ \langle\alpha, \beta\rangle = \sum_{m,n=1,2} g^{mn}\,\alpha_m \overline{\beta_n}, \]
where α = α_1 dx^1 + α_2 dx^2 and β = β_1 dx^1 + β_2 dx^2. put |α|^2 = ⟨α, α⟩. for a 1-form ω = ω_1 dx^1 + ω_2 dx^2, we define a function d^*ω by
\[ d^*\omega = -\frac{1}{\sqrt g}\sum_{m,n=1,2} \partial_m\big(\sqrt g\, g^{mn}\omega_n\big). \]
this definition is independent of the choice of local coordinates. actually, the operator d^* is characterized by the relation
\[ \int_M \langle du, \omega\rangle\, d\mu = \int_M u\,\overline{d^*\omega}\, d\mu \]
for any u ∈ c^∞_0(m) and ω ∈ c^∞_0 λ^1(m). let a be a 1-form satisfying our assumptions. for a function f, we define a 1-form d_a f by
\[ d_A f = df + i f A, \]
where d is the exterior derivative and i = √−1. for a 1-form ω, we define
\[ d_A^*\omega = d^*\omega - i A^*\omega,\qquad A^*\omega = \langle A, \omega\rangle. \]
then we obtain a representation of our schrödinger operator l independent of local coordinates:
\[ L = d_A^* d_A + V. \]
for the operator d_a^*, the following leibniz formulas hold: for an appropriate function f and 1-form ω, we have
\[ d_A^*(f\omega) = f\, d^*\omega - \langle df, \omega\rangle - i f\langle A, \omega\rangle = f\, d_A^*\omega - \langle df, \omega\rangle = f\, d^*\omega - \langle d_A f, \omega\rangle, \tag{8} \]
\[ d_A^* d_A (fg) = f\, d_A^* d_A g - 2\langle df, d_A g\rangle + g\, d^* df. \tag{9} \]

proposition 2.1 let u, u′ be open subsets of m \ γ such that the closure of u is a compact subset of u′, and v is bounded in u′. then there exists a constant c > 0 such that
\[ \int_U |d_A f|^2\, d\mu \le C \int_{U'} \big(|f|^2 + |Lf|^2\big)\, d\mu \tag{10} \]
for f ∈ d(h_max).

proof. according to [14, (5.3)] (since the function φ avoids the singularities, the proof of [14, (5.3)] is also available in our case), we have
\[ \big(L(\phi f), \phi f\big) = \Re\big(\phi Lf, \phi f\big) + \int_M |d\phi|^2 |f|^2\, d\mu \]
for f ∈ d(h_max) and φ ∈ c^∞_0(m \ γ). take φ ∈ c^∞_0(u′) such that φ = 1 on u. then the conclusion follows from the above equality, the identity
\[ \int_{\operatorname{supp}\phi} |d_A(\phi f)|^2\, d\mu = \big(L(\phi f), \phi f\big) - \big(V\phi f, \phi f\big), \]
and the assumption that v is bounded. □

2.2 theory of self-adjoint extensions
we quote some notation from the textbook [13]. let h be a separable hilbert space; denote its inner product by (·, ·) and its norm by ‖·‖. all linear operators in this subsection act on the hilbert space h. for a linear operator x, d(x) denotes the domain of definition of x, x̄ the closure of x, and x^* the adjoint operator of x. for a linear operator x, the graph inner product of x is defined by (x, y)_x = (xx, xy) + (x, y) for x, y ∈ d(x), and the graph norm by ‖x‖_x = (x, x)_x^{1/2}. we introduce an equivalent of the sum of the deficiency subspaces, which is also introduced in [4] and [12]. let x be a closed, densely defined symmetric operator. let d = d(x^*)/d(x), where the right-hand side denotes the quotient space. the space d is a hilbert space equipped with the norm
\[ \|[x]\|_{\mathcal D}^2 = \min_{y\in[x]} \|y\|_{X^*}^2 = \|Qx\|_{X^*}^2, \]
where x ∈ d(x^*), [x] = x + d(x) denotes the equivalence class of x in the quotient space d(x^*)/d(x), and q denotes the orthogonal projection onto the orthogonal complement of d(x) in d(x^*). for u, v ∈ d, define
\[ [u, v]_{\mathcal D} = (X^* x, y) - (x, X^* y),\qquad u = [x],\ v = [y],\ x, y \in D(X^*). \]
the value [u, v]_d is independent of the choice of the representatives x, y. let p be the canonical projection from d(x^*) to d. for a closed subspace v of d, we define a closed linear operator x_v by
\[ D(X_V) = \{\, x \in D(X^*) \mid Px \in V \,\},\qquad X_V x = X^* x. \]
we also define
\[ V^{[\perp]} = \{\, u \in \mathcal D \mid [u, v]_{\mathcal D} = 0 \ \text{for any } v \in V \,\}. \]
then the following proposition immediately follows from the definition of self-adjointness.

proposition 2.2 (1) for a closed subspace v of d, the operator x_v is a self-adjoint extension of x if and only if
\[ V^{[\perp]} = V. \tag{11} \]
(2) for any self-adjoint extension x̃ of x, there exists a closed subspace v of d such that x_v = x̃.

in terms of the above notation, the krein-von neumann theory can be rephrased as follows.

proposition 2.3 let n_± = ker(x^* ∓ i) be the deficiency subspaces of x, and n_± = dim n_± the deficiency indices of x. then the following holds.
(i) the projection operator p gives a hilbert space isomorphism from the direct sum n_+ ⊕ n_− to d. in particular, dim d = n_+ + n_−.
(ii) there exists a one-to-one correspondence between the closed subspaces v of d satisfying (11) and the unitary operators u from n_+ to n_−, given by v = p(1 + u)n_+.

this proposition says that the space d can play the same role as the sum of the deficiency subspaces in the theory of self-adjoint extensions. particularly when n_± are difficult to determine explicitly (as in our case), the space d is more tractable, since the elements of this space have ambiguity by d(x). actually, in the next section we shall see that the structure of d for our schrödinger operator h_min and the form [·,·]_d are determined only by the singular part a^{(0)} of the vector potential.

3 reduction
3.1 division to the local potential
let (u_k, φ_k), φ_k = (x^1, x^2), be the local coordinates introduced in section 1. let ã be the 1-form given by (4). take a positive number ε_k so small that the closed disc {r ≤ 2ε_k} is contained in u_k. let η_k ∈ c^∞_0(u_k) be such that 0 ≤ η_k ≤ 1, η_k = 1 for r ≤ ε_k, η_k = 0 for r ≥ 2ε_k.
define functions ĝmn, âm and v̂ on r2 by ĝmn = ηkgmn +(1 − ηk)δmn, âm = ã (0) m + ηkã (1) m , v̂ = ηkv. define a differential operator lk on r2 by lk = − 1√ ĝ ∑ m,n=1,2 ( ∂ ∂xm + iâm ) · √ ĝĝmn ( ∂ ∂xn + iân ) + v̂ , where ĝ = det(ĝmn), and (ĝ mn) is the inverse matrix of (ĝmn). define a linear operator lk,min on l2(r2;dμk), dμk = √ ĝdx1dx2, by lk,minu = lku, d(lk,min)= c∞0 (r 2 \ {0}). let lk,max = l ∗ k,min. then lk,maxu = lku, d(lk,max)= {u ∈ l2(r2;dμk) | lku ∈ l2(r2;dμk)}, where lk is regarded as a differential operator on d′(r2 \ 0). let d = d(hmax)/d(hmin), dk = d(lk,max)/d(lk,min). let χk ∈ c∞0 (m) such that 0 ≤ χk ≤ 1, χk = 0 for r ≥ �k and χk = 1 for r ≤ �k/2. define a map tk from d to dk by tk[f] = [ψ −1 k χkf], where the function ψk is given by (3). define a map t from d to the direct sum k⊕ k=1 dk by t [f] = k⊕ k=1 tk[f]. we also define a map s from k⊕ k=1 dk to d by5 s k⊕ k=1 [fk] = [ k∑ k=1 ψkχkfk ] . in the sequel, we sometimes write [f, g]d = [[f], [g]]d etc. for simplicity of notations. lemma 3.1 1. assume k < ∞. then, the maps s, t defined above are well-defined andmutually inverse. moreover, we have [f, g]d = k∑ k=1 [tk[f], tk[g]]dk (12) for any [f], [g] ∈ d. 2. assume k = ∞. then themap s is well-defined and injective. proof. (i) we divide the proof into three steps. step 1. the map d(hmax) + f �→ ψ−1k χkf ∈ d(lk,max) is well-defined and continuous. proof. clearly ψ−1k χkf ∈ l 2(r2;dμk), so it suffices to show that lk(ψ−1k χkf) ∈ l 2(r2;dμk). by (5) and the leibniz rule (9), we have lk(ψ−1k χkf)= ψ −1 k l(χkf)= ψ−1k (χklf − 2〈dχk, daf 〉 +(d ∗dχk)f) . the first term and the third in the parenthesis of the right hand side are in l2(r2;dμk) and continuous with respect to ‖ · ‖hmax. moreover,we can prove the second term is also in l2 and continuouswith respect to ‖ · ‖hmax by using (10). � step 2. let f ∈ d(hmin). then, we have ψ−1k χkf ∈ d(lk,min). proof. by definition, there exists a sequence {fn}∞n=1 ⊂ c ∞ 0 (m \γ) such that fn → f in d(hmin). then, ψ−1k χkfn ∈ c ∞ 0 (r 2 \ {0}) and ψ−1k χkfn → ψ−1k χkf in d(lk,max), by step 1. since d(lk,min) is a closed subspace of d(lk,max), we have the conclusion. � step 1 and 2 imply the map t is well-defined. we can similarly prove that the map s is also welldefined. step 3. the operator st is the identity map on d. proof. by definition, we have (i − st)[f]= [ψf], ψ =1 − k∑ k=1 χ2k. so it suffices to prove that g = ψf ∈ d(hmin). 5when k = ∞, we define the map s for the elements of ∞⊕ k=1 dk having only finite nonzero components [fk]. so there is no difficulty in the definition of s. 66 acta polytechnica vol. 50 no. 5/2010 let (r, θ) be the radial coordinate in uk and put bk,� = {x ∈ uk | r < �}. then we have suppψ ⊂ m \ k⋃ k=1 bk,�k/2. for c > 0, let ξc ∈ c ∞(m) such that 0 ≤ ξc ≤ 1, ξc =1 in m \ k⋃ k=1 b�k/c, ξc =0 in k⋃ k=1 bk,�k/(2c). let l0, h0,min and h0,max be the operators corresponding to the potentials ξ4a and ξ4v . these potentials have no singularities, so we have h0,min = h0,max by [14]. since lg = l0g ∈ l2, we have g ∈ d(h0,max) = d(h0,min). thus we can take a sequence {gn} such that gn → g in ‖ · ‖h0,min. then ξ2gn ∈ c∞0 (m \ γ) and ξ2gn → ξ2g = g in ‖ · ‖hmin. thus we have g ∈ d(hmin). � wecanprove t s = i similarly. then (12) follows from (5) and the equality [f, g]d = k∑ k=1 [χkf, χkg]d (notice that f − ∑ k χkf ∈ d(hmin) can be proved as in step 3). (2)let k = ∞. for anypositive integer n, wecan define t(n) from d to n⊕ k=1 dk, and s(n) from n⊕ k=1 dk to d similarly, andprove t(n)s(n) = id. 
this implies the map s is well-defined and injective. � 3.2 analysis of operators on r2 we shall analyze the operator lk (or lk,min, lk,max) defined in the previous subsection. for simplicity of notation, we omitˆand˜in the definition of lk in the sequel. then our assumptions are the following: 1. lk = d∗ada + v on r 2 \ {0}, a = a(0) + a(1), 2. lk,min and lk,max are operators on l 2(r2;dμk), dμk = √ gdx1dx2, 3. a(0) = βkr −2(−x2dx1 + x1dx2), 0 ≤ βk < 1, 4. a(1) ∈ c10λ 1({r < 2�k}), real-valued, a(1)(0) = 0, 5. v is bounded, real-valued, 6. gmn(0) = δmn, ∂j gmn(0) = 0, and gmn = δmn for r ≥ 2�k. we shall show that gmn, a (1) and v havenothing todowith the structureof the self-adjoint extensions. to this purpose, define a differential operator mk on r 2 by mk = − ∑ n=1,2 ( ∂ ∂xn + ian )2 . define a linear operator mk,min on l 2(r2;dx1dx2) by d(mk,min) = c∞0 (r 2 \ {0}), mk,minu = mku for u ∈ d(mk,min). put mk,max = m ∗ k,min, and ek = d(mk,max)/ d(mk,min). we also define m (0) k , m (0) k,min, m (0) k,max, and e(0)k , by replacing an by a (0) n in the above definition. the operator m(0)k is already studied in [1] and [7]. here we quote their results and calculate the form [·, ·]e(0) k . proposition 3.2 let χ ∈ c∞0 (r 2) such that χ = 1 in some neighborhood of 0. 1. assume 0 < βk < 1. put f1k = χe −iθrβk−1, f2k = χr −βk , f4k = χe −iθr1−βk , f5k = χr βk . then, the deficiency indices n±(m (0) k,min) = 2, dime(0)k =4 and the vectors {[f n k ]}n=1,2,4,5 form a basis of e(0)k . moreover, for m, n ∈ {1,2,4,5} with m ≤ n,6 we have [f mk , f n k ]e(0) k = ⎧⎪⎪⎨ ⎪⎪⎩ 4π(βk − 1) for (m, n)= (1,4), −4πβk for (m, n)= (2,5), 0 otherwise. 2. assume βk =0. put f3k = χ logr, f 6 k = χ. then, the deficiency indices n±(m (0) k,min) = 1, dime(0)k = 2, {[f j k]}j=3,6 form a basis of e (0) k , and [f3k , f 6 k]e(0) k =2π, [f3k , f 3 k]e(0) k = [f6k , f 6 k]e(0) k =0. proof. (i) the first statement follows from the result in [7] or [1]. for the calculation of [u, v]e(0) k , we use some notation in vector analysis. we use the gradient vector ∇ = t(∂1, ∂2), and identify a 1-form a with the component vector t(a1, a2). the dot · denotes the euclidean inner product. then we have [u, v]e(0) k = lim �→0 ∫ r≥� ( −v(∇ + ia(0)) · (∇ + ia(0))u + u(∇ + ia(0)) · (∇ + ia(0))v ) dx1dx2 = 6notice that [f nk , f m k ]e(0) k = −[f m k , f n k ] e(0) k by definition. 67 acta polytechnica vol. 50 no. 5/2010 lim �→0 ∫ r=� ( vn · (∇ + ia(0))u− un · (∇ + ia(0))v ) rdθ = lim �→0 ∫ r=� ( v∂ru − u∂rv ) rdθ, (13) where n =(cosθ,sinθ), and the line integral is taken counterclockwise. we used the green formula and the fact n · a(0) = 0. then we can easily prove the second statement by using (13). (ii) the first part of the statement follows from the results in [3]. the second statement can be justified by using (13). � next, we prove that the regular part a(1) does not affect the structure of ek and the corresponding form. proposition 3.3 all the statements of proposition 3.2 hold even if we replace m (0) k,min by mk,min, and e(0)k by ek. before the proof, we prepare a perturbative lemma, which is an immediate corollary of [10, theorem iv.5.22]. lemma 3.4 let h be a separable hilbert space and ‖ · ‖ its norm. let x, y be densely defined symmetric operators on h. assume d(x) ⊂ d(y ) and there exist positive constants c, δ with 0 < δ < 1 and ‖y u‖ ≤ δ‖xu‖ + c‖u‖ for every u ∈ d(x). then, we have d(x + y ) = d(x) and n±(x + y ) = n±(x), where the overline denotes the operator closure. proof of proposition 3.3 we prove only statement (i). 
statement (ii) can be proved similarly. by the leibniz formula (8), we have for u ∈ c∞0 (r 2 \ {0}) (mk − m (0) k )u = i(d ∗a(1))u − (14) 2i〈a(1), da(0)u〉 + |a (1)|2u. we denote ‖u‖2 = ∫ r 2 |u|2dx1dx2 for a function u, and ‖ω‖2 = ∫ r 2 |ω|2dx1dx2 for a 1-form ω (notice that |ω|2 = 〈ω, ω〉). we denote the essential supremumnormof |u| and |ω| by ‖u‖∞ and ‖ω‖∞, respectively. then we have by the schwarz inequality ‖(mk − m (0) k )u‖ ≤ ‖d∗a(1)‖∞‖u‖+2‖a(1)‖∞‖da(0)u‖+‖a (1)‖2∞‖u‖ ≤ (‖d∗a(1)‖∞ + ‖a(1)‖2∞)‖u‖ + ‖a(1)‖∞(�‖m (0) k u‖ 2 + �−1‖u‖2) for any � > 0, where we used the inequality ‖da(0)u‖ =(m (0) k u, u) 1/2 ≤ (�‖m(0)k u‖) 1/2(�−1‖u‖)1/2 ≤ 1 2 (�‖m(0)k u‖ + � −1‖u‖). take � > 0 sufficiently small and apply lemma 3.4. then we have n±(mk,min) = n±(m (0) k,min) = 2, thus dimek = 4 by (i) of proposition 2.3. moreover we have d(mk,min)= d(m (0) k,min), so the functions {f j k } (j = 1,2,4,5) do not belong to d(mk,min). and we can prove mkf jk ∈ l 2(r2) by using (14) and the fact |a(1)| = o(r) near the origin. thus {[f jk]} form a basis of ek. for the form [·, ·]ek, we can prove the formula [u, v]ek = lim �→0 ∫ r=� ( v(∂r u) − u(∂rv) − 2i(n · a(1))uv ) rdθ in a similarwayas in (13). thus the value [f mk , f n k ]ek is not affected by a(1), since |a(1)| = o(r) and |f mk f n k | is at most o(r − max(2βk,2(1−βk))). � nextwe shall consider the non-flat case. we shall show that metric g also does not affect the structure of dk and the corresponding form. proposition 3.5 all the statements of proposition 3.2 hold even if we replace m (0) k,min by lk,min and e (0) k by dk. since v is bounded, we can assume v = 0. in the sequel, we use the following notation: l = g−1/2(d + a) · g1/2g−1(d + a), where d is the column vector t(d1, d2), dj = −i∂j, a is identified with the component vector t(a1, a2), and g−1 is the inverse matrix of g =(gmn). we shall prepare some elliptic a priori estimate. lemma 3.6 let m, n ∈ {1,2}. then, there exist cm > 0 and cmn > 0 such that ‖(dm + am)u‖ ≤ cm(�‖mku‖ + �−1‖u‖), ‖(dm + am)(dn + an)u‖ ≤ cmn(‖mku‖+‖u‖)(15) for every u ∈ c∞0 (r 2 \ {0}) and every � > 0, where ‖ · ‖ = ‖ · ‖ l2(r 2 ;dx1dx2) . the difficulty is the singularity of our vector potential a at the origin. we can overcome this difficulty by using some commutator technique. proof of lemma 3.6 putπj = dj+aj (j =1,2). then, since ‖πju‖2 = (�1/2π2j u, � −1/2u) ≤ 1 2 ( �‖π2j u‖ 2 + �−1‖u‖2 ) for u ∈ c∞0 (r 2), it suffices to prove (15). 68 acta polytechnica vol. 50 no. 5/2010 define auxiliary operators a = iπ1 +π2, a† = −iπ1 +π2. let [x, y ] = xy − y x be the commutator of operators x and y . then we have [π1,π2] = [d1, a2] − [d2, a1] = −i(b +2πβkδ0), where b = ∂1a (1) 2 − ∂2a (1) 1 is the magnetic field corresponding to a(1). thus we have [a, a†] = 2i[π1,π2] = 2(b +2πβkδ0). particularly for u ∈ c∞0 (r 2 \ {0}), we have (aa† − a†a)u =2bu. moreover, we have by definition (aa† + a†a)u =2mku. these equalities imply aa† = mk + b, a†a = mk − b (16) on c∞0 (r 2 \ {0}). sinceπmπn canbewritten as a finite linear combinationof the operatorsof the form xy , where x, y are a or a†, it suffices to showthat there exists some constant c > 0 such that ‖xy u‖ ≤ c(‖mku‖ + ‖u‖) (17) for u ∈ c∞0 (r 2\{0}). for (x, y )= (a, a†),(a†, a), (17) follows from (16), since b is bounded. to estimate ‖a2u‖2, we assume a(1) ∈ c∞ for a while. then, we have by (16) ‖a2u‖2 =(a2u, a2u)= ((a†)2a2u, u)= (a†(aa† − 2b)au, u)= ‖a†au‖2 − 2(bau, au) ≤ ‖a†au‖2 +2‖b‖∞‖au‖2 ≤ ‖a†au‖2 +2‖b‖∞‖a†au‖‖u‖. when a(1) ∈ c1, we approximate a(1) by c∞potentials w.r.t. 
c1-norm on some neighborhood of suppu, thenweget the above inequalityagain. then, we have (17) by using (16). the case x = y = a† can be treated similarly. � proof of proposition 3.5 first, by assumption (vi), we have g−1 = i + ĝ, max |ĝmn| = o(r2), (18) g = 1+ o(r2), |dg| = o(r), as r → 0. define a unitary operator u from l2(r2; √ gdx1dx2) to l2(r2;dx1dx2) by u u = g1/4u. put l̃k = u lku −1, l̃k,min = u lk,minu −1, etc. then we have for v ∈ c∞0 (r 2 \ {0}) l̃k,minv = g −1/4(d + a) · √ gg−1(d + a)g−1/4v. thus we have l̃k,min=g −1/4(d + a) · g1/4g−1(d + a)+ g−1/4(d + a) · √ gg−1(dg−1/4). (19) the second term of (19) is written as g−1/4 ( d · ( √ gg−1(dg−1/4)) ) + (20) (dg−1/4) · g1/4g−1(d + a). the first term of (20) is bounded, and the second is infinitesimally small w.r.t. mk,min, by lemma 3.6. the first term of (19) is written as (d + a) · g−1(d + a)+ (21) g−1/4(dg1/4) · g−1(d + a). the second term of (21) is also infinitesimally small w.r.t. mk,min, by lemma 3.6. the first term of (21) is written as mk,min +(d + a) · ĝ(d + a). the second term of this expression is written as∑ m,n=1,2 (dmĝmn)(dn + an)+ (22) ∑ m,n=1,2 ĝmn(dm + am)(dn + an). the first sum of (22) is infinitesimally small w.r.t. mk,min. if we take �k sufficiently small, the second sum is mk,min-boundedwith relative bound less than 1, bylemma3.6. nowwe can applylemma3.4, and conclude that d(l̃k,min)= d(mk,min)= d(m (0) k,min), and n±(lk,min) = n±(l̃k,min) = n±(mk,min) = n±(m (0) k,min). moreover, one can show that multiplication by g1/4 is a bijective continuous map on d(m (0) k,min). thus we have d(lk,min)= u −1d(l̃k,min)= g−1/4d(m (0) k,min)= d(m (0) k,min). and then we can prove lkf mk ∈ l 2(r2;dμk) by the leibniz formula and (18), and thus {[f mk ]}m form a basis of dk. 69 acta polytechnica vol. 50 no. 5/2010 in a similar way as in (13), we have [u, v]dk = lim �→0 ∫ r=� ( vn · √ gg−1(∇ + ia)u− −un · √ gg−1(∇ + ia)v ) rdθ. since √ gg−1 = i + o(r2), we can replace √ gg−1 by i in the calculation of [f mk , f n k ]dk, and we have [f mk , f n k ]dk = [f m k , f n k ]ek. thus we have the conclusion. � 4 proof of main theorems proof of theorem 1.1 since hmin is semibounded, we have n+(hmin) = n−(hmin) = dimd/2. by lemma 3.1 and proposition 3.5, we have for k < ∞ dimd = k∑ k=1 dimdk =4k1 +2k2, and for k = ∞ dimd ≥ ∞∑ k=1 dimdk = ∞. thus we have the conclusion. � proof of theorem 1.2 bylemma 3.1 andproposition 3.5, we have for u, v ∈ d(hmax) [u, v]d =4πφ(u) ∗ ( o −d d o ) φ(v), whereφ(u)∗ is the row-vector tφ(u) and d is thematrix given by (6). let x = t(x1, x2) be the matrix satisfying (7). then we have x∗ ( o −d d o ) x = o, which implies v ⊂ v [⊥] for v = ranx. moreover, if rankx =2k1 + k2, we have dimv [⊥] =4k1+2k2 −dimv =2k1+k2 =dimv. thus we have (11), and therefore hx is self-adjoint. conversely, for a given self-adjoint extension h of hmin, we can construct a (4k1+2k2)×(2k1+k2)matrix x by arranging the coefficients of an arbitrary basis of v = p d(h) with respect to the basis {[ψkf jk]}. � 5 infinite singularities let us consider the case k = ∞, and extend theorem 1.2. even in this case, for u ∈ d(hmax) and for each k, we can define the asymptotic coefficients ckj at γk. however, the sequence φj(u) is an infinite sequence. we shall findappropriateassumptionswhich make these infinite sequences square summable. in the sequel, uk, βk, gmn are those introduced in section 1. however, we may replace ψk defined by (3) more appropriate one satisfying (4), if such one exists. for simplicity, we assume v =0. 
(U) (i) There exists $\epsilon_0 > 0$, independent of $k$, such that $U_k = \{r < \epsilon_0\}$ for every $k$.
(ii) There exist $\beta_-, \beta_+$ such that $0 < \beta_- \le \beta_k \le \beta_+ < 1$ or $\beta_k = 0$, for every $k$.
(iii) There exists $C_1 > 0$ independent of $k$ such that $g_{mn}$ satisfies (2) and $|\partial_i\partial_j g_{mn}| \le C_1$ in $U_k$, for every $i, j, m, n = 1, 2$.
(iv) There exist $C_2 > 0$ independent of $k$, and phase functions $\psi_k \in C^\infty(U_k\setminus\{0\})$ satisfying $|\psi_k| = 1$, (4) and $|\partial_j a_m^{(1)}| \le C_2$ in $U_k$, for $j, m = 1, 2$.

Thus we assume some homogeneity for $g$, $a^{(0)}$ and $a^{(1)}$. Since the open sets $\{U_k\}_{k=1}^\infty$ are required to be disjoint, assumption (i) says that the points of $\Gamma$ are uniformly separated in some sense. Assumption (ii) seems a little strange, but we need it if we want to make the boundary value $\Phi(u)$ square summable (if we consider another type of characterization, assumption (ii) may be dropped). Assumption (iii) bounds the curvature of $M$, and (iv) the intensity of the magnetic field. In [12], the author considers a similar assumption when $M$ is the flat Euclidean plane and $da^{(1)}$ is a constant magnetic field.

In the sequel, we use the notation
$$ \mathbb{C}^\infty = \ell^2 = \Big\{(c_j)_{j=1}^\infty \;\Big|\; \sum_{j=1}^\infty |c_j|^2 < \infty\Big\}, $$
and define its inner product as the usual $\ell^2$ inner product. Let $h = \mathbb{C}^{K_1}\oplus\mathbb{C}^{K_1}\oplus\mathbb{C}^{K_2}$.

Proposition 5.1. Assume (A), (A0), (A1), (SB), (U), $V = 0$, and $K = \infty$. Then the following linear map
$$ \mathcal{D} \ni [u] \mapsto \Phi(u) \in h\oplus h \tag{23} $$
is a well-defined homeomorphism. Moreover,
$$ [u,v]_{\mathcal{D}} = 4\pi\,(\Phi(u), \tilde{D}\Phi(v)), \qquad \tilde{D} = \begin{pmatrix} O & -D \\ D & O \end{pmatrix}, \tag{24} $$
where $D$ is a bounded operator on $h$ defined by (6).

Once this proposition is established, our theorem can be proved similarly as in the proof of Theorem 1.2, so we omit the proof.

Theorem 5.2. Assume the same conditions as in Proposition 5.1. Then the statements of Theorem 1.2 hold with the following changes: $X_1, X_2$ are bounded operators on $h$, and condition (7) is replaced by the condition $\operatorname{ran}X = \ker X^*\tilde{D}$, where $\tilde{D}$ is the bounded operator on $h\oplus h$ defined in Proposition 5.1.

We conclude this paper by proving Proposition 5.1.

Proof of Proposition 5.1. We divide the proof into two steps.

Step 1. The map
$$ \mathcal{D} \ni [f] \mapsto \bigoplus_{k=1}^\infty T_k[f] \in \bigoplus_{k=1}^\infty D_k $$
is continuous, bijective, and its inverse is also continuous.

Proof. By our assumption (U) and the calculation in Section 3, we can prove that there exists $C > 0$ independent of $k$ such that
$$ \|\psi_k^{-1}\chi_k f\|_{L_{k,\max}}^2 \le C\int_{U_k}\left(|Lf|^2 + |f|^2\right)d\mu_k. $$
Summing up these inequalities with respect to $k$, we conclude that the map
$$ D(H_{\max}) \ni f \mapsto \bigoplus_{k=1}^\infty \psi_k^{-1}\chi_k f \in \bigoplus_{k=1}^\infty D(L_{k,\max}) $$
is continuous. Then the well-definedness of the map (23) can be proved similarly as in Section 3. Since $\mathcal{D}$ is identified with the closed subspace $D(H_{\min})^\perp$ of $D(H_{\max})$ and the projection from $D(L_{k,\max})$ to $D_k$ is continuous, we conclude that the map (23) is continuous. Moreover, we can prove that the inverse map is also well-defined and continuous, so we have the conclusion. $\square$

Step 2. There exists $C > 1$ independent of $k$ such that
$$ C^{-1}|c^k| \le \|[u]\|_{D_k} \le C|c^k| $$
for every $[u] \in D_k$, where $c^k = (c_1^k, c_2^k, c_4^k, c_5^k)$ for $0 < \beta_k < 1$, $c^k = (c_3^k, c_6^k)$ for $\beta_k = 0$, and $c_j^k$ are the asymptotic coefficients of $u$ defined in Section 1.

Proof. We only consider the case $0 < \beta_k < 1$. Consider the following formula for $c_1^k$:
$$ c_1^k = \frac{1}{4\pi(1 - \beta_k)}\,[f_k^4, u]_{D_k}, $$
which can be verified by substituting all the basis functions into $u$. By choosing the representative $u \in D(L_{k,\min})^\perp$ (so that $\|u\|_{L_{k,\max}} = \|[u]\|_{D_k}$) and using the Schwarz inequality, we have
$$ |c_1^k| \le \frac{1}{2\pi(1 - \beta_k)}\,\|f_k^4\|_{L_{k,\max}}\|[u]\|_{D_k}. $$
The fraction is bounded uniformly w.r.t. $k$, by our assumption (ii) of (U).
Moreover, we can prove that $\|f_k^j\|_{L_{k,\max}}$ is also uniformly bounded, by (U) and the calculations in Section 3 (first decompose $L_k$ as in Section 3, and estimate all terms). Thus we have $|c_j^k| \le C\|[u]\|_{D_k}$ for $j = 1$. The cases $j = 2, 4, 5$ can be treated similarly. Conversely,
$$ \sum_{j=1,2,4,5}\|c_j^k[f_k^j]\|_{D_k} \le |c^k|\Big(\sum_{j=1,2,4,5}\|f_k^j\|_{L_{k,\max}}^2\Big)^{1/2}, $$
and the sum on the right-hand side is uniformly bounded. Thus the conclusion holds. $\square$

By Steps 1 and 2, we have proved that the map (23) is a well-defined homeomorphism. Equation (24) is confirmed by substituting each $f_k^j$ as $u$ or $v$. $\square$

Acknowledgement

This work was partially supported by the Doppler Institute for Mathematical Physics and Applied Mathematics, the KIT Faculty Research Abroad Fellowship Program, and JSPS Grant Wakate 20740093.

References

[1] Adami, R., Teta, A.: On the Aharonov-Bohm Hamiltonian, Lett. Math. Phys. 43 (1998), 43–54.
[2] Aharonov, Y., Bohm, D.: Significance of electromagnetic potentials in the quantum theory, Phys. Rev. 115 (1959), 485–491.
[3] Albeverio, S., Gesztesy, F., Høegh-Krohn, R., Holden, H.: Solvable Models in Quantum Mechanics. Texts and Monographs in Physics, Springer-Verlag, New York, 1988.
[4] Bulla, W., Gesztesy, F.: Deficiency indices and singular boundary conditions in quantum mechanics, J. Math. Phys. 26, no. 10 (1985), 2520–2528.
[5] Correa, F., Falomir, H., Jakubsky, V., Plyushchay, M. S.: Hidden superconformal symmetry of spinless Aharonov-Bohm system, preprint, URL: http://arxiv.org/abs/0906.4055
[6] Correa, F., Falomir, H., Jakubsky, V., Plyushchay, M. S.: Supersymmetries of the spin-1/2 particle in the field of magnetic vortex, and anyons, preprint, URL: http://arxiv.org/abs/1003.1434
[7] Dabrowski, L., Šťovíček, P.: Aharonov-Bohm effect with δ-type interaction, J. Math. Phys. 39, no. 1 (1998), 47–62.
[8] Exner, P., Šťovíček, P., Vytřas, P.: Generalized boundary conditions for the Aharonov-Bohm effect combined with a homogeneous magnetic field, J. Math. Phys. 43, no. 5 (2002), 2151–2167.
[9] Iwai, T., Yabu, Y.: Aharonov-Bohm quantum systems on a punctured 2-torus, J. Phys. A: Math. Gen. 39 (2006), 739–777.
[10] Kato, T.: Perturbation Theory for Linear Operators. Springer, 1966.
[11] Lisovyy, O.: Aharonov-Bohm effect on the Poincaré disk, J. Math. Phys. 48 (2007), no. 5, 052112.
[12] Mine, T.: The Aharonov-Bohm solenoids in a constant magnetic field, Ann. Henri Poincaré 6 (2005), no. 1, 125–154.
[13] Reed, M., Simon, B.: Methods of Modern Mathematical Physics. II. Fourier Analysis, Self-adjointness, Academic Press, New York-London, 1975.
[14] Shubin, M.: Essential self-adjointness for semi-bounded magnetic Schrödinger operators on non-compact manifolds, J. Funct. Anal. 186 (2001), 92–116.

Dr. Takuya Mine
E-mail: mine@kit.ac.jp
Kyoto Institute of Technology
Matsugasaki, Sakyo-ku
Kyoto 606-8585, Japan

Finding hidden treasures: investigations in US astronomical plate archives

René Hudec (a, b, *), Lukáš Hudec (b)
(a) Astronomical Institute, Academy of Sciences of the Czech Republic, Ondřejov, Czech Republic
(b) Czech Technical University in Prague, Faculty of Electrical Engineering, Prague, Czech Republic
(*) Corresponding author: rhudec@asu.cas.cz

Acta Polytechnica 53(1):23–26, 2013. © Czech Technical University in Prague, 2013. Available online at http://ctn.cvut.cz/ap/

Abstract. We report here on an ongoing investigation of US astronomical plate archives and tests of the suitability of transportable scanning devices for in situ digitization of archival astronomical plates.
Keywords: astronomical data archives, astronomical photography, astronomical photographic archives.

1. Introduction

There are numerous important astronomical plate archives in the USA, including plate collections that are little known to the community and that have been little investigated in the past. Within the framework of a Czech-US collaborative project, we have recently analysed some of them, obtaining test scans with the use of a portable digitizing device. Digitization is a necessary step for an extended evaluation of the plate data using dedicated programs and powerful computers.

2. The plate archives

The US astronomical archival plate collections that we recently visited include those housed in the following 14 institutions: (1.) Carnegie Observatories, Pasadena, CA; (2.) Lick Observatory, CA; (3.) Yerkes Observatory, WI; (4.) Mt. Palomar Observatory, CA; (5.) PARI, Rosman, NC (which has a collection of plates from many observatories); (6.) KPNO, Tucson, AZ; (7.) CFHT, Waimea, HI; (8.) IfA, Manoa, HI; (9.) USNO, Flagstaff, AZ; (10.) USNO, Washington, DC; (11.) Steward Observatory, Tucson, AZ; (12.) NMSU, Las Cruces, NM; (13.) Rosemary Hill Observatory, University of Florida, Gainesville, FL; (14.) Leander McCormick Observatory, University of Virginia, Charlottesville, VA.

Our estimate is that there are more than 1 million astronomical archival plates in these archives. We performed a quality check and analyzed these plate archives with emphasis on their scientific, historical and cultural value, which we have found to be enormous.

Figure 1. Digitizing plates with a transportable device in the plate vault of the Steward Observatory, Tucson, AZ. This digitization technique can be used even in very small rooms.

3. Transportable digitizing device

Most of the plate archives that we visited have no plate scanners and lack modern instrumentation in general. As our study includes plate digitization, it was necessary to find a solution. Since we were going to travel from Europe to the US by air, the obvious option was a transportable digitization device based on a digital camera with a high-quality lens and a stable tripod. This solution has the following advantages over other techniques: the device is easily transportable, and offers much faster scanning and higher repeatability than commercial flatbed scanners, because there are no moving scanner parts.

Figure 2. Storage of archival plates at the McCormick Observatory.

The equipment that we used was as follows: camera: 21 Mpx Canon EOS 5D Mark II; lenses: Canon EF 24-70 f/2.8 L USM and Canon 70-200 mm f/4; a stable tripod; and a Fomei LP-310 professional photographic light table. More recently, we have been working on the design and development of a better custom-made light table based on highly homogeneous LED illumination, and also a further improved camera and lens. The recorded images are then corrected for lens image distortions and for other effects, in order to store research-grade digital images.
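As an illustration of the image-correction step just described, the following sketch applies OpenCV's standard radial/tangential distortion model to a raw frame. The camera matrix, distortion coefficients and file names are placeholders, not values from this paper: in practice they would come from a calibration run (e.g. cv2.calibrateCamera on a chessboard target photographed with the same lens and focus setting).

```python
import cv2
import numpy as np

# Hypothetical intrinsics for a 21 Mpx full-frame camera (focal length and
# principal point in pixels); real values must come from a calibration run.
camera_matrix = np.array([[5600.0,    0.0, 2808.0],
                          [   0.0, 5600.0, 1872.0],
                          [   0.0,    0.0,    1.0]])
# Radial (k1, k2, k3) and tangential (p1, p2) coefficients, illustrative only.
dist_coeffs = np.array([-0.12, 0.05, 0.0, 0.0, 0.0])

raw = cv2.imread("plate_raw.tif", cv2.IMREAD_UNCHANGED)   # digitized frame
corrected = cv2.undistort(raw, camera_matrix, dist_coeffs)
cv2.imwrite("plate_corrected.tif", corrected)             # research-grade copy
```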
4. General picture

After visiting the 14 US plate collections mentioned above, we offer a (subjective) list of the major problems found in these archives:

(1.) The list of US plate collections provided by Dr. Wayne Osborn (Robbins and Osborn, 2009) was found to be incomplete. We have found valuable plate collections with plates from important telescopes that are not listed, e.g. the two Hawaii plate collections in Manoa (Institute for Astronomy) and in Waimea (Canada-France-Hawaii Telescope, CFHT). Some plate archives have been completely hidden, as their home institutes were in some extreme cases not even aware that they have plate stacks.

(2.) In numerous collections, only a very rough estimate of the number of plates can be given, as no exact information about the total number of plates, etc., is available. Usually, the real number of plates is higher than the previously available estimate. In general, it is very difficult to give the exact number of plates, due to the lack of observation logs and the inadequate organization of the plate archives.

(3.) In many cases, there is no contact person responsible for the plate archive, and it is difficult to make contact. In some places, it is even difficult to get access. This situation has a serious adverse effect on efforts to exploit these plate collections for scientific purposes.

Figure 3. Example of metadata on the plate envelope (McCormick Observatory plate archive). In numerous US plate archives, the plate envelopes are the only source of metadata information.

(4.) For some archives, no information is available, not even an approximate number of plates. In many archives, no plate logs are available; they have either been removed or are lost. The only available information is what is written on the plates or on the envelopes (in some cases, there is not even adequate information on the envelope). We guess that there were originally observation logs, but that these were later separated from the plates and archived in a different location. Example: Carnegie Observatories, Pasadena (nearly 0.5 million plates), where the logs are probably located in the attic above the library, with difficult access.

(5.) Damaged plates in some archives (mostly due to a partly or even completely released emulsion layer), probably caused by improper storage (or changes in humidity/temperature over time). We point out, however, that even these plates can be restored using suitable chemical methods and procedures.

(6.) Lack of electronic records: no lists of plates; the only information is on the plates and/or plate envelopes.

(7.) Many of the archives that we visited suffer from inadequate funding and a lack of devices, e.g. no scanners.

(8.) We have found that many plates have been removed from their home institutions: these plates are usually scattered in private homes and offices, or are being kept by observers (often abroad). Numerous plates taken at US observatories have been found in European plate archives.

Figure 4. Historical plate evaluation instruments at the McCormick Observatory: blink microscope (left) and the PDS machine (right).

Figure 5. Example of a digitized direct plate (blazar program, full area), Rosemary Hill Observatory.

Nevertheless, we found highly valuable plates almost everywhere, and the quality of the plates (and hence their scientific potential) is mostly high or even very high, in comparison with the plates in European archives. This is true both for direct images and for spectral images (taken with an objective prism). In addition to stellar images, some of the archives that we visited also include extensive collections of solar images (e.g. Carnegie Observatories in Pasadena) and/or planetary images (a unique collection in Las Cruces).
The storage conditions were found to vary from archive to archive, from proper temperature and humidity conditions to less proper conditions. The main degradation sources found in the plate collections were high levels of humidity and probably also temperature variation, resulting in partial or complete release of the emulsion layer. The scientific use of the plate archives is negatively impacted by poor access to the plates at some places, and also by the fact that the plates have in most cases not yet been cataloged.

Figure 6. Example of a digitized low-dispersion spectral plate (small selected area), McCormick Observatory. This is one of many plates taken with a 10-inch Cooke camera and an objective prism for the long spectral survey program (lasting over 20 years) set up by A. Vyssotsky.

5. Suggested strategy

Our suggested strategy for data mining and plate digitization in US plate archives is as follows:

(1.) Digitize the plate archives using a fast and transportable scanning device, as described above. This scanning method is fast and inexpensive. These are important considerations, as the archives are scattered and there are very large numbers of plates.

(2.) Create electronic catalogs (a minimal example is sketched after this list).

(3.) Include these catalogs in search programs like WFPD, operated by our Bulgarian colleagues (e.g. Tsvetkov et al., 2005, and Tsvetkov, 2009).
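Point (2.) can start very modestly. The sketch below collects envelope metadata into a CSV file that a search service could later ingest; the field names and the sample record are our own invention for illustration, not an existing cataloguing standard.

```python
import csv

# One row per plate; the fields mirror what is typically written on a plate
# or its envelope (cf. Figure 3). The record shown is fictitious.
FIELDS = ["archive", "plate_id", "object", "date_obs", "emulsion", "exposure_min"]

records = [
    {"archive": "McCormick", "plate_id": "MC-00123", "object": "survey field",
     "date_obs": "1938-04-12", "emulsion": "103a-O", "exposure_min": 60},
]

with open("plate_catalog.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)
```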
6. Summary

Fourteen US astronomical plate archives were visited within the AMVIS Czech-US collaborative project. The quality of the plates and their scientific, historical and cultural value were investigated for possible inclusion in the US astronomical plate repository at PARI, NC. Some of these archives (CFHT Waimea and IfA Manoa) were unknown to the astronomical community before our study. Selected plates were digitized using a transportable scanning device. All the archives that we visited have plates that are scientifically valuable, and in many cases unique. The plates are, however, mostly hidden from the astronomical community, and they have not yet been catalogued. The total number of plates is higher than expected; in many of the locations, the actual number of plates is unknown. As no catalogs exist, the real number of plates is very difficult to estimate, but the places that we visited certainly have more than a million photographic plates in their collections.

Acknowledgements

We acknowledge grants 102/08/0997 and 205/08/1207 from the Grant Agency of the Czech Republic, and MSMT project ME09027. This work is based on collaboration with Thurburn Barker and Mike Castelaz, PARI, within the framework of a collaborative Czech-US plate project.

References

[1] Hudec, R., et al.: Acta Polytechnica 1(52), 2012.
[2] Hudec, R.: Astrophysics with astronomical plate archives, in Exploring the Cosmic Frontier: Astrophysical Instruments for the 21st Century. ESO Astrophysics Symposia, edited by A. P. Lobanov, J. A. Zensus, C. Cesarsky and P. J. Diamond. Springer-Verlag, Berlin and Heidelberg, 2007, p. 79.
[3] Hudec, R.: An introduction to the world's large plate archives, Acta Historica Astronomiae, Vol. 6, p. 28-40, 1999.
[4] Tsvetkov, M., et al.: Proc. IV Serbian-Bulgarian Astronomical Conference, Publ. Astron. Soc. Rudjer Boskovic No. 5, 303-308, 2005.
[5] Tsvetkov, M.: Making astronomical photographic data available: the European perspective, in Preserving Astronomy's Photographic Legacy: Current State and the Future of North American Astronomical Plates. ASP Conference Series, Vol. 410, edited by W. Osborn and L. Robbins. San Francisco: Astronomical Society of the Pacific, 15, 2009.
[6] Robbins, L., Osborn, W.: Preserving Astronomy's Photographic Legacy: Current State and the Future of North American Astronomical Plates. ASP Conference Series, Vol. 410, Appendix B and C. San Francisco: Astronomical Society of the Pacific, 2009.

Treatment of biogas for use as energy

J. Koller

Abstract. The biogas generated in biogas plants offers significant potential for the production of energy from renewable energy sources. The number of biogas plants in the Czech Republic is expected to exceed one hundred in the near future. Substrates from agriculture, industry and municipal wastes are used for biogas production. Biogas plants usually use co-generation units to generate electricity and heat. Increased effectiveness can be achieved by using heat as a source of energy for producing renewable natural gas.

Keywords: renewable energy sources, biogas plant, anaerobic fermentation, organic substrates, heat utilisation, renewable natural gas.

1 Introduction

Biogas plants are significant sources of renewable energy. At present, there are 47 such facilities in operation in the Czech Republic, and in 2008 they produced 214 GWh of electric energy (0.3 % of total consumption). Biogas production from various kinds of degradable substrates in biogas plants is technically feasible and economically viable. This resource offers considerable potential, and with complete utilization the capacity is in excess of one thousand MW. The primary application of biogas plants is in agriculture, enabling the utilization of all types of agricultural land for growing plants for energy, with guaranteed sales. Biogas plants help to develop rural areas and provide employment. The technology is suitable for villages and towns that can efficiently handle the processing of separated biologically degradable municipal wastes by means of suitably located biogas plants, leading to less waste disposal in landfills. The construction of biogas plants can reduce dependence on fossil fuel imports, and it is also in accordance with the commitment of the Czech Republic toward the EU to increase the proportion of energy produced from renewable resources. There could be a total of several hundred biogas plants in the Czech Republic within the next 20 years.

2 Basic arrangement of a biogas plant

Biogas production technology uses anaerobic fermentation, in the course of which organic substances degrade without access of air, and thus biogas is created. The most frequently used method is wet mesophilic fermentation at a temperature of 37-42 °C, where the solids content in the measured-out substrate and in the fermentor is lower than 12 %. A simplified scheme of a biogas plant is shown in Fig. 1.

The first structure in a biogas station is the closed homogenization pit, where the separate raw materials are mixed together. In cases where fibrous substrates are added, disintegration to particles not exceeding 1 cm in size is necessary.
For some types of raw material, particularly animal wastes, sanitation is prescribed. This is done by means of pasteurization in a closed container, where the waste is heated to 70 °C for at least 1 hour. The mixed raw material is pumped once or twice a day into fermentors.

Fermentors are closed cylindrical reactors, usually made of concrete or steel, thermally insulated and heated. The heating is done by hot water, either by means of heating coils or in a slurry/water exchanger. The material in the reactor is regularly agitated; high-speed and slow-speed paddle agitators are used. The biogas that is produced is collected in a gasholder, which is most frequently designed as an integrated gasholder, meaning that the gas is accumulated in the top section of the fermentor. From there it is drained away for purification and for further processing. Naturally, a solution with a separate gasholder can also be used.

Fermentation is usually performed in two fermentors arranged in series. The size of the fermentor, the operational volume, the detention period and the fermentor load are selected in a way that ensures sufficient stabilization of the organic substances in the substrate and optimum biogas production. Usual detention periods lie between 30-50 days. To complete the decomposition of the organic substances contained in the substrates, a stabilising basin (for final digestion) is added after the fermentors. The stabilising basin is not heated, but because biogas production is still proceeding, it is covered and connected to the gasholder system.

Fig. 1: Simplified scheme of a biogas plant. (The labels in the original figure are in Czech; in English: exhaust gas, air, heat, cooling, co-generation, electricity, purification, sanitation, disintegration, bio-filter, homogenization, fermentation, stabilization, separation, storage, fugate, separate, biogas.)

Biogas is a mixture of gases, the prevailing components of which are methane and carbon dioxide. Further admixtures are hydrogen, water vapour, hydrogen sulphide and other sulphur compounds, a small amount of nitrogen from the air, hydrocarbons, and organic silicon and chlorine compounds. The composition of the biogas and the CH4/CO2 ratio are determined by the composition of the substrate, in particular by the proportions of carbohydrates, proteins and fats. Before biogas is used in the co-generating unit, it is necessary to eliminate humidity and hydrogen sulphide in order to prevent corrosion of the engine. The water vapour condenses as water during biogas cooling. The hydrogen sulphide content is usually reduced by microbiological processes directly in the fermentor. If the concentration is too high, it is removed chemically, either by a special wash-out or by adsorption on active coal.

The co-generating unit consists of a gas combustion engine and an electric energy generator. The biogas is combusted and the mechanical energy that is created is partially converted to electricity, but the main part is carried away, through the engine and exhaust gas cooling, as heat. The usual electrical efficiency of a co-generating unit is 33-40 %, and it increases with size. The electrical energy that is produced is fed into the grid at long-term guaranteed subsidised purchase prices.

Table 1: Composition of substrate and biogas

Substrate       Formula     TOD (kg/kg)   Biogas production (m³/kg)   CH4 (%)
Carbohydrates   C6H12O6     1.07          0.75-0.9                    53
Fats            C18H34O2    2.89          1.1-1.6                     75
Proteins        C6H14N2O2   1.56          0.6-0.8                     70

Note: TOD - total oxygen demand.
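Table 1 allows a rough first estimate of what a given substrate mix will produce. The short sketch below uses mid-range yields read from the table; the example mix and the function itself are only illustrative, not part of the plant design methodology.

```python
# Mid-range specific yields from Table 1: biogas volume per kg of degradable
# component, and the CH4 content of the biogas that component produces.
YIELD_M3_PER_KG = {"carbohydrates": 0.82, "fats": 1.35, "proteins": 0.70}
CH4_PERCENT     = {"carbohydrates": 53.0, "fats": 75.0, "proteins": 70.0}

def biogas_estimate(mix_kg):
    """Total biogas volume (m3) and mean CH4 content (%) for a substrate mix
    given as {component: mass of degradable matter in kg}."""
    volumes = {c: m * YIELD_M3_PER_KG[c] for c, m in mix_kg.items()}
    total = sum(volumes.values())
    ch4 = sum(v * CH4_PERCENT[c] for c, v in volumes.items()) / total
    return total, ch4

# Example: 100 kg of degradable matter from a carbohydrate-rich plant substrate.
vol, ch4 = biogas_estimate({"carbohydrates": 70, "fats": 10, "proteins": 20})
print(f"approx. {vol:.0f} m3 of biogas at about {ch4:.0f} % CH4")
```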
The decomposed substrate, flowing away as the residue after fermentation from the stabilising tank, is transferred to the separator. Either centrifugal machines or band screens are used for separation. The fermentation residue is divided into liquid fugate (usually 2 % of dry matter) and separate (approx. 28-35 %). Separation can be improved by adding suitable polymeric organic flocculants. The separated solid is easily transported and can be used as a component for the production of compost, or can be worked into the soil. Fugate is a liquid organic nitrogenous fertilizer which is applied to fields, etc. Because it can be applied only outside the growing season, it is necessary to provide storage for a period of approx. 110 days (the usual growing season) in a watertight storage tank, which is usually open. In some biogas plants there is no separation; the whole fermentation residue is stored, and it is worked into the soil as a fertilizer at a suitable time.

A major problem in operating a biogas plant is the development of malodorous substances, and their leakage must be prevented. The main point where odour emissions arise is the entry section of the biogas plant. This section must therefore be closed off, and is opened only for short periods of time to accept materials. The air mass from these facilities is exhausted and removed through an odour bio-filter, filled with a moistened mass consisting of a mixture of peat, bark and other porous materials on which aerobic microorganisms grow. In this way the malodorous compounds are disposed of. There are no odour problems from a correctly designed and well-functioning biogas plant. Such problems arise when the plant is overloaded, in particular when processing animal wastes.

3 Substrates

In general, any suitably treated material containing a sufficient concentration of biologically degradable organic substances can be used in a biogas plant.

Animal materials from agriculture:
• pig manure
• cow manure, including litter
• manure and litter from horse, goat and rabbit keeping
• poultry droppings, including litter

Plant materials:
• maize silage
• grass biomass or hay (haylage)
• chaff and waste from cereal treatment
• potato leaves and potato peel
• beet leaves, including sugar beet leaves

Biologically degradable wastes:
• municipal waste from households, restaurants and cafeterias
• separated biological waste
• biologically degradable industrial waste

The fermentation process is also influenced by the content of degradable organic carbon and by the concentration of nitrogen and sulphur compounds. The main part of the nitrogen compounds in the substrates consists of organically bound nitrogen in proteins, which is converted into ammonia during anaerobic fermentation. Depending on the pH value in the fermentation mixture, the NH4+ ↔ NH3 balance is established. The ionic form is non-toxic, but at the optimum pH value for methanogenic microorganisms (7-8) a substantial part of the nitrogen is present as toxic dissolved NH3. The C/N ratio in the substrates is therefore a significant monitored parameter.
The optimum C/N value is 20, and there is a critical value of 10, at which considerable inhibition of biogas production develops and the fermentation process can take place only under a low load. The C/N ratios for different types of substrates are given in Table 2.

Table 2: The C/N ratio for typical substrates

Substrate             C/N
Blood                 3-4
Meat and bone meal    4-7
Pig liquid manure     12-15
Maize silage          35-40
Straw                 20-40
Vegetable biomass     40-100

Clearly, all animal wastes with a high protein content are hazardous; on the other hand, plant biomass has a substantially lower content of nitrogen compounds. An optimum substrate composition can be achieved by using a suitable mixture of the components that are available.

The source of sulphur in the materials may be dissolved inorganic sulphates or organic sulphur in some proteins. All these compounds are converted to sulphides during anaerobic fermentation. If Fe2+ ions are present, insoluble FeS is created. The remaining sulphur appears in the biogas and in the fermentation liquid as gaseous hydrogen sulphide or, during process overloading, as much more malodorous organic sulphur compounds. Hydrogen sulphide and other sulphur-containing compounds must be removed from the biogas before it is burned, otherwise there is a risk of corrosion damage to the co-generation unit.

It is also very important to check the presence of heavy metals in the materials. Anaerobic fermentation tolerates relatively high concentrations without remarkable inhibition, but the presence of an excessive concentration of heavy metals would make it impossible to apply the fermentation residue subsequently as a fertilizer, and this would make the operation of the biogas plant totally impossible.

3.1 Biogas plant operation efficiency

Currently operated biogas plants burn biogas in co-generating units and produce electricity which is compulsorily purchased at subsidised prices. The electric efficiency of these units is approx. 33-40 % of the contained energy. The remainder is thermal energy, which is released as hot water from motor cooling or is taken away as heat in the combustion gas. About 20 % (in summer) or 33 % (in winter) of the heat is used for heating the fermentors, and the rest of the heat is available for other purposes, or is disposed of as exhaust heat. The efficiency of a biogas plant can be increased in several ways:
• electricity production in energy peaks
• better heat utilization
• biomethane production.

3.2 Production of electricity in energy peaks

The current conditions for providing the subsidy for the construction of a biogas plant require continuous operation of the co-generation unit. According to the project of Skanska CZ, it would be more convenient to accumulate the biogas and to operate the co-generation unit and supply electricity into the grid only during energy peaks, when the price of electricity is much higher. Technically this is not a problem, but it would require a change in the legislation, an agreement with the purchasers of electricity, and a modification of the subsidy conditions. This provision would have a significant effect in the future, when the operation of several hundred biogas plants is anticipated. It naturally means an increase in investment costs, because the co-generation unit must have approximately double the electric output.
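The energy flows quoted in Section 3.1 can be put into numbers. The sketch below assumes a lower heating value of methane of roughly 10 kWh/m³ (a standard figure, not taken from this paper) and, following the simplification used above, treats everything that is not converted to electricity as heat.

```python
LHV_CH4 = 9.97  # kWh per m3 of methane; standard value, an assumption here

def cogeneration_balance(biogas_m3, ch4_frac=0.60, eta_el=0.37,
                         fermentor_heat_share=0.25):
    """Split the energy of a biogas volume into electricity, fermentor
    heating and surplus heat, using the shares quoted in Section 3.1
    (electrical efficiency 33-40 %; fermentor heating takes ~20 % of the
    heat in summer and ~33 % in winter)."""
    energy = biogas_m3 * ch4_frac * LHV_CH4   # chemical energy, kWh
    electricity = eta_el * energy
    heat = energy - electricity               # engine + exhaust gas heat
    fermentor = fermentor_heat_share * heat
    return electricity, fermentor, heat - fermentor

el, ferm, surplus = cogeneration_balance(1000.0)
print(f"electricity {el:.0f} kWh, fermentor heating {ferm:.0f} kWh, "
      f"surplus heat {surplus:.0f} kWh")
```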
3.3 Increased heat utilization

The excess thermal energy can be used in the following ways:
• hot water for heating agricultural structures or structures in a nearby village
• heat utilization for drying
• heat utilization from exhaust gas for steam production
• in tri-generation, where the energy is used to produce heat and cold; cold can be accumulated in the form of ice.

3.4 Biomethane production

The dominant components of biogas are CH4 and CO2, together with 4-6 % by volume of other admixtures. If the CO2 and the admixtures are removed, we obtain a gas that in practice does not differ from natural gas in its composition. This gas can then be burned as a heating gas, used for driving motor vehicles, or added to the natural gas pipeline. In these applications, all the energy contained in the biogas is utilized.

Several methods are known for separating CO2 from biogas. For the separation process, a water, glycol or ethanolamine wash-out can be used. There is also a cryogenic method in which CO2 is removed as dry ice. In technical practice, the technically and economically optimum method is PSA (pressure swing adsorption), which uses special active coal working as a molecular sieve. The adsorption runs under increased pressure (0.5-1 MPa); the smaller CO2 molecules are captured, and the bigger CH4 molecules pass through the filter. Desorption of the captured gases takes place during the pressure decrease (5 kPa).

Biomethane production from biogas has established itself in Sweden (for use in cars), and in Germany and Switzerland (biomethane is fed into gas pipelines). However, connecting a biogas plant to the gas pipeline requires relatively demanding technical measures. High biomethane quality must be continuously monitored by means of chromatographic analysis. The heating power is adjusted by adding propane from an external source. A supply point equipped with a high-pressure compressor has to be constructed. In the conditions of the Czech Republic, the implementation of biomethane production would have to be supported by technical and legislative modifications to the existing regulations.
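A simple volume balance shows what upgrading does to the gas stream. The sketch below assumes ideal separation (all CO2 and minor admixtures removed, no methane slip into the off-gas), which a real PSA unit only approximates.

```python
def upgrade_to_biomethane(biogas_m3, ch4_frac=0.60):
    """Ideal PSA upgrading: CO2 and the minor admixtures are removed and
    the methane passes through. Real units lose a few per cent of CH4."""
    biomethane = biogas_m3 * ch4_frac
    offgas = biogas_m3 - biomethane
    return biomethane, offgas

bm, off = upgrade_to_biomethane(1000.0)
print(f"1000 m3 biogas -> {bm:.0f} m3 biomethane + {off:.0f} m3 off-gas")
```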
4 Conclusion

The production and utilization of biogas from biogas plants offers significant potential for the production of energy from renewable resources. In the near future, a substantial increase in the number of biogas plants is anticipated, and they will utilize all kinds of suitable materials. Agricultural animal and plant wastes, energy biomass, slurry from waste water treatment plants, and also industrial and municipal wastes with a high content of organic degradable materials can be used as substrates. A standard biogas plant burns biogas in a co-generation unit, and during this process electricity and heat are generated. An increase in the efficiency of biogas plant operation can be achieved by better heat utilization, or by biomethane production.

Doc. Ing. Jan Koller, CSc.
Phone: +420 595 953 013
E-mail: koller@tomkar.cz
http://www.tomkar.cz
Tomášek servis, Výstavní 135/107, 703 00 Ostrava-Vítkovice

The sensitivity of the greenhouse effect to changes in the concentration of gases in planetary atmospheres

Smadar Bressler (a, *), Giora Shaviv (a), Nir J. Shaviv (b)
(a) Dept. of Physics, Israel Institute of Technology, Haifa 32000, Israel
(b) Racah Institute of Theoretical Physics, The Hebrew University, Jerusalem 91904, Israel
(*) Corresponding author: smadar@physics.technion.ac.il

Acta Polytechnica 53(Supplement):832-838, 2013, doi:10.14311/ap.2013.53.0832. © Czech Technical University in Prague, 2013. Available online at http://ojs.cvut.cz/ojs/index.php/ap

Abstract. We present a radiative transfer model for Earth-like planets (ELP). The model allows the assessment of the effect of a change in the concentration of an atmospheric component, especially a greenhouse gas (GHG), on the surface temperature of a planet. The model is based on the separation between the contributions of the short-wavelength molecular absorption and the long-wavelength one. A unique feature of the model is the condition of energy conservation at every point in the atmosphere. The radiative transfer equation is solved in the two-stream approximation without assuming the existence of LTE in any wavelength range. The model allows us to solve the Simpson paradox, whereby the greenhouse effect (GHE) has no temperature limit. On the contrary, we show that the temperature saturates, and that its value depends primarily on the distance of the planet from the central star. We also show how the relative humidity affects the surface temperature of a planet, and explain why the effect is smaller than the one derived when the above assumptions are neglected.

Keywords: greenhouse effect, radiative transfer, Earth-like planets, surface temperature.

1. Introduction

The influence of concentration changes of atmospheric gases on the surface temperature of planetary atmospheres is a leading thread in current planetary research [7]. Due to the importance of the problem, it is desirable to have a model which can predict the greenhouse effect and its dependence on various changes correctly, while at the same time being sufficiently simple to provide a tool for understanding the details of the radiative transfer physics affecting this problem. We devised such a model. The model consists of two parts: the radiative transfer, and the molecular-absorption-dependent optical depth. We find that the minimum number of bands needed for a model to be faithful to the underlying physics is two, i.e. a semi-grey model. Consequently, we first solve the radiative transfer problem in terms of two optical depths in the two chosen bands, chosen in such a way as to provide a faithful representation of the underlying radiative transfer. We denote the two bands by "vis" and "fir", and we will specify them shortly. The radiative transfer equation is then solved in terms of the optical depths $\tau_{\rm vis}$ and $\tau_{\rm fir}$ to yield a universal function $T_{\rm surf}(\tau_{\rm vis},\tau_{\rm fir})$. As the solution is found in terms of dimensionless quantities, it is universal and does not depend on many of the planetary parameters, such as the planet's mass or the specific composition of the atmosphere. Once we obtain the universal solution to the radiative transfer problem, we calculate the optical depths $\tau_{\rm vis}$ and $\tau_{\rm fir}$ from the basic molecular absorption data.
As the molecular absorption is predominantly line absorption with wide windows, special care must be exercised in devising the algorithm which converts the molecular absorption coefficients $\kappa(\lambda)$ into the above optical depths. With the universal radiative transfer solution and the optical depths, a change $\delta x$ in the concentration of a certain gas yields a change $\delta T$ according to:
$$ \frac{\partial T_{\rm surf}}{\partial x} = \frac{\partial T_{\rm surf}}{\partial \tau_{\rm vis}}\frac{\partial \tau_{\rm vis}}{\partial x} + \frac{\partial T_{\rm surf}}{\partial \tau_{\rm fir}}\frac{\partial \tau_{\rm fir}}{\partial x}, \tag{1} $$
where the optical depths $\tau_{\rm vis}$ and $\tau_{\rm fir}$ will be defined shortly.

Our study is unique in several ways. In particular, we define the two wavelength bands according to physical properties. Since the original treatment by Simpson [10, 11], the semi-grey model has only been applied to RT problems by Thomas and Stamnes [12], and practically no other similar distinction has been made. It is also crucial to note the fundamental difference between stellar and planetary atmospheres. The molecular absorption, with its large variation in wavelength, and the total optical depth of planetary atmospheres are such that the atmosphere contains spectral windows through which the planetary radiation can leak to space almost freely. Such a phenomenon does not exist in stellar atmospheres, which are hotter and in which the absorption is due to ions. Consequently, the assumption of LTE, frequently implemented in stellar atmospheres, is not really justified in planetary atmospheres. Note that the assumption of LTE usually refers to the distribution of the electrons over the various levels and to the distribution of the radiation field with wavelength. Many treatments treat the visible as a heat source and the IR in the diffusion approximation. We assume that the electrons satisfy the Boltzmann distribution, but we do not assume that the radiation field is Planckian. In this sense, the radiation field in illuminated planetary atmospheres is not in LTE, and it is improper to assume that $F = \nabla B(T)$, where $F$ is the radiation flux. Separate treatment of the radiative transfer in the visible and in the infrared has been carried out in the past (cf. [12, Section 12.3]). Our treatment differs in several points: we impose energy conservation at each layer, we redefine the absorption coefficients, we apply radiative transfer and not the diffusion approximation, and we find the temperature feedback directly from the radiative equation and not from the diffusion approximation for the far-infrared part of the spectrum.

2. Basic assumptions

Figure 1 shows the specific intensities of the insolating star and of the thermal emission of the planet. We define $\lambda_{\rm rad}$ as the wavelength at which the two intensities are equal, namely
$$ I_p(\lambda_{\rm rad}) = \frac{(1-A)}{4}\left(\frac{R_*}{d}\right)^2 I_*(\lambda_{\rm rad}), \tag{2} $$
where $I_p$ and $I_*$ are the specific intensities of the planet and of the insolating star at the top of the planetary atmosphere, and $A$ is the mean albedo. Energy conservation implies that
$$ \frac{1}{c_V}\frac{dQ(z)}{dt} = \int_0^\infty \kappa(\lambda,z)\left[J(\lambda,z) - B(\lambda,z)\right]d\lambda, \tag{3} $$
where $J$ is the mean specific intensity. A further requirement of steady state implies that $dQ(z)/dt = 0$. As is apparent in the figure, $J(\lambda) \gg B(\lambda, T_{\rm atm})$ for $\lambda < \lambda_{\rm rad}$. Consequently, we can split the energy integral and write
$$ \int_0^{\lambda_{\rm rad}} \kappa(\lambda,z)J(\lambda,z)\,d\lambda = -\int_{\lambda_{\rm rad}}^\infty \kappa(\lambda,z)\left[J(\lambda,z) - B(\lambda,z)\right]d\lambda. \tag{4} $$
The left-hand side is positive definite and hence always represents heating. The fir term $\int_{\lambda_{\rm rad}}^\infty \kappa[J - B]\,d\lambda$ must therefore be negative, and thus represents cooling. Clearly, the two wavelength ranges describe different phenomena. At this point we can also state that $J = B$, which is the condition for LTE, only if the first integral vanishes, namely if there is no heating of the atmosphere by absorbed radiation. Our semi-grey model therefore has two bands: the vis band in the range up to $\lambda_{\rm rad}$, and the fir band for wavelengths $\lambda > \lambda_{\rm rad}$.

Figure 1. The definition of $\lambda_{\rm rad}$, where the specific intensity of insolation equals the specific intensity of the planetary thermal emission. Also shown is the definition of $\lambda_{\rm cut}$, the wavelength above which the molecular absorption (denoted by $\kappa$) becomes significant.

The radiative transfer equation is solved in the two-stream approximation. However, while the optical depths $\tau_{\rm vis}$ and $\tau_{\rm fir}$ are constant over the respective wavelength ranges, we allow $I(\lambda)$ to change with $\lambda$. We assumed in the calculations reported here: $n_{\rm vis} = 200$ wavelengths in the range $(10^3\,\text{Å} \div \lambda_{\rm rad})$ and $n_{\rm fir} = 400$ wavelengths in the range $(\lambda_{\rm rad} \div 8\times 10^5\,\text{Å})$. The atmosphere was divided into 50 slabs. Each slab has an optical depth of $\tau_{\rm vis}/50$ for $\lambda < \lambda_{\rm rad}$ and an optical depth of $\tau_{\rm fir}/50$ for $\lambda > \lambda_{\rm rad}$. The $n_{\rm vis}$ and $n_{\rm fir}$ wavelengths were distributed logarithmically. The energy condition was used to calculate the temperature of slab $i$. The energy condition in our calculation is given by:
$$ \frac{\tau_{\rm vis}}{n_{\rm atm}}\sum_{j=1}^{n_{\rm vis}}\left[J_{i,j} - B(T_i,\lambda_j)\right] + \frac{\tau_{\rm fir}}{n_{\rm atm}}\sum_{j=1}^{n_{\rm fir}}\left[J_{i,j} - B(T_i,\lambda_j)\right] = 0, \tag{5} $$
where $n_{\rm atm}$ is the number of layers in the atmosphere (50 in our case) and the index $i$ runs over all layers in the atmosphere. $n_{\rm vis}$ and $n_{\rm fir}$ are the numbers of wavelengths we use in the respective wavelength ranges, $J_{i,j}$ is $J$ at atmospheric layer $i$ and wavelength $j$, and $B(T_i,\lambda_j)$ is the Planck function for temperature $T_i$ at layer $i$ and wavelength $\lambda_j$. Although it can be eliminated, $n_{\rm atm}$ is kept for clarity. There are $n_{\rm atm}$ such equations to solve for the temperature $T_i$ in each layer $i$. Note that we kept the Planck function in the $\lambda < \lambda_{\rm rad}$ range despite the fact that it is very small. We assume that the total optical depth is divided equally among all the layers of the atmosphere. This implies that the physical width of the atmosphere varies with height. Moreover, we work with the two $\tau$'s as the independent variables, not with $\kappa$.
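To make Eq. (5) concrete, the sketch below solves the balance of a single layer for its temperature by bisection, given the mean intensities $J_{i,j}$. The wavelength grids, the intensities and the temperature bracket are placeholders; only the structure of the condition comes from the text.

```python
import numpy as np

H, C, KB = 6.626e-34, 2.998e8, 1.381e-23  # SI constants

def planck(lam, T):
    """Planck specific intensity B(lambda, T); lam in metres, T in kelvin."""
    return 2 * H * C**2 / lam**5 / np.expm1(H * C / (lam * KB * T))

def layer_temperature(lam_vis, J_vis, tau_vis, lam_fir, J_fir, tau_fir,
                      n_atm=50):
    """Solve the energy condition (5) of one layer for T_i by bisection:
    the tau-weighted sums of (J - B(T)) over the vis and fir grids cancel."""
    def residual(T):
        return (tau_vis / n_atm * np.sum(J_vis - planck(lam_vis, T))
                + tau_fir / n_atm * np.sum(J_fir - planck(lam_fir, T)))
    lo, hi = 50.0, 1500.0            # temperature bracket in kelvin
    for _ in range(60):              # residual decreases monotonically in T
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if residual(mid) > 0.0 else (lo, mid)
    return 0.5 * (lo + hi)
```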
clearly, the two wavelength ranges describe different phenomena. at this point, we can also state that j = b, which is the condition for lte, only if the first integral vanishes, namely there is no heating of the atmosphere by absorbed radiation. our semi-grey model therefore has two bands, the vis band in the range up to λrad, and the fir band for wavelength in te ns ity /a bs or pt io n central star emission planet emission !(") "cut "rad figure 1. the definition of λrad, where the specific intensity of insolation equals the specific intensity of the planetary thermal emission. also shown is the definition of λcut, the wavelength above which the molecular absorption (denoted by κ) becomes significant. λ > λrad. the radiative transfer equation is solved in the two stream approximation. however, while the optical depths τvis and τfir are constant over the respective wavelengths, we allow i(λ) to change with λ. we assumed in the calculations reported here: nvis = 200 wavelengths in the range (103å÷λrad) and nfir = 400 wavelengths in the range (λrad ÷ 8 × 105å). the atmosphere was divided into 50 slabs. each slab has an optical depth of τvis/50 for λ < λrad and an optical depth τfir/50 for λ > λrad. the nvis and nfir wavelengths were distributed logarithmically. the energy condition was used to calculate the temperature of slab i. the energy condition in our calculation is given by: τvis natm j=nvis∑ j=1 [ji,j −b(ti,j)] (5) + τfir natm j=nfir∑ j=1 [ji,j −b(ti,j)] = 0 where natm is the number of layers in the atmosphere (50 in our case) and the index i runs over all layers in the atmosphere. the above condition must be satisfied for every i. nvis and nfir are the number of wavelengths we use in the respective wavelength range. ji,j is j at atmospheric layer i and wavelength j and b(ti,j) is the planck function for temperature ti at layer i and wavelength λj. although it can be eliminated, natm is kept for clarity. there are natm such equations to solve for the temperature ti in each layer i. note that we kept the planck function in the λ < λrad range despite the fact that it is very small. we assume that the total optical depth is divided equally in all the layers of the atmosphere. this implies that the physical width of the atmosphere varies with height. moreover, we work with the two τ’s as the independent variables, not κ. 833 smadar bressler, giora shaviv, nir j shaviv acta polytechnica figure 2. the surface temperature as a function of τfir for a fixed τvis. also shown is the sky temperature tsky. 0.01 0.1 1 10 100 1000 300 400 500 600 rad åλ =20,000 rad åλ =10,000 rad åλ =50,000 su rf ac e t em pe ra tu re k τfir>>>1 τvis figure 3. the effect of τvis on tsurf for three cases. 3. greenhouse and anti-greenhouse models figure 2 illustrates a typical case, where τfir increases indefinitely while keeping τvis fixed. the surface temperature increases slowly for small τfir’s (i.e., per given logarithmic increase of τfir), but when τfir ∼ 1, the rate of increase becomes larger, up to about τfir ∼ 100, where the surface temperature saturates and levels off. the reason is that as the temperature rises, the peak of the planck spectrum progressively moves towards shorter wavelengths and thermal radiation leaks through the vis range (cf. [9]). thus, the greenhouse effect does not experience a runaway when the concentration of any gas, and its corresponding optical depth, increase indefinitely. clearly, simpson’s paradox does not exist when the problem is solved properly. 
Previous attempts to solve the paradox assumed the existence of windows in the absorption coefficient. The saturation is obtained, however, even if no windows are assumed. Moreover, the saturation appears even without assuming various feedbacks, like an increased albedo etc. The figure also depicts the sky temperature $T_{\rm sky}$, defined such that $\sigma T_{\rm sky}^4$ is equal to the radiation flux from the atmosphere to the surface of the planet. As $T_{\rm sky} < T_{\rm surf}$ in this equilibrium model, the surface cools by exchanging energy with the atmosphere.

Figure 3 demonstrates the effect of a changing $\tau_{\rm vis}$ on the surface temperature, for extremely large $\tau_{\rm fir}$, for which the saturation temperature is reached. We show that $\tau_{\rm vis}$ is sufficiently powerful to create an anti-GHE even under the most adverse conditions. As long as $\tau_{\rm vis} \le 1$, the effect of $\tau_{\rm vis}$ is negligible, but for larger values the effect is very noticeable. In the limit of $\tau_{\rm vis} \gg 100$, the decrease in temperature approaches an asymptotic value. It is interesting to note that, irrespective of $\lambda_{\rm rad}$, the minimal temperature reached is the same. Finally, the saturation temperature is not a monotonic function of $\lambda_{\rm rad}$.

4. The resolution of the Simpson paradox

In 1927 Simpson treated the radiative transfer problem in planetary atmospheres, and tacitly assumed (a) that $\lambda_{\rm cut} \equiv \lambda_{\rm rad}$, and (b) the existence of LTE for the long-wavelength range. Under these assumptions, he found that the temperature of the atmosphere is given by:
$$ T_p^4(\tau) = \left(1 + \frac{3\tau}{4}\right)\frac{(1 - \langle A\rangle)}{4}\left(\frac{R_*}{d}\right)^2 T_*^4, \tag{6} $$
where $\tau$ is the mean optical depth for $\lambda \ge \lambda_{\rm rad}$, measured from the top of the atmosphere downward, and $\langle A\rangle$ is the mean albedo. Simpson did not specify how $\tau$ is evaluated; presumably it was the Planck mean. In particular, the temperature near the surface is obtained by substituting $\tau = \tau_{\rm tot}$. Obviously, as $\tau_{\rm tot} \to \infty$, so does $T_p$. For sufficiently large $\tau_{\rm tot}$ ($\tau_{\rm tot} \approx 3.8\times 10^5$), the temperature of the planet reaches the temperature of the central star and can even surpass it. We note that there are particular wavelength ranges in the far IR for which the total optical depth is as high as $10^4$. The possibility of a temperature runaway, as predicted by the simple Simpson solution, was coined the Simpson paradox. There were attempts to resolve the paradox by assuming the existence of windows in the fir. However, it is obvious that as long as the vis band is ignored and the radiative transfer is solved with a single band, there is no way to eliminate the paradox. Our treatment (see [9]) solves the problem, as we demonstrate that for moderate optical depths of $\tau_{\rm fir}$ the temperature saturates.
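Equation (6) and the runaway it implies are easy to reproduce numerically. The solar and orbital constants below are standard values inserted for illustration; only the form of the equation comes from the text.

```python
R_STAR = 6.96e8    # solar radius in m
D      = 1.496e11  # 1 au in m
T_STAR = 5778.0    # solar effective temperature in K
ALBEDO = 0.3       # mean albedo, illustrative

def simpson_temperature(tau):
    """Grey solution (6): T^4 = (1 + 3*tau/4)*(1-<A>)/4*(R*/d)^2*T*^4."""
    return ((1 + 0.75 * tau) * (1 - ALBEDO) / 4
            * (R_STAR / D) ** 2) ** 0.25 * T_STAR

for tau in (0.0, 1.0, 10.0, 1.0e3, 3.8e5):
    print(f"tau = {tau:g}: T = {simpson_temperature(tau):.0f} K")
# tau = 0 gives ~255 K; tau = 3.8e5 drives T up to the stellar temperature.
```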
the (τvis,τfir) plane allows us to determine the effect that the change in concentration of an atmospheric constituent has, once we evaluate the two optical depths τvis and τfir and their changes with concentration. in the next section we show how to evaluate the mean optical depth of the relevant bands. 6. the algorithm for the optical depth both the planck and the rosseland means are poor averages of the absorption when it comes to planetary atmospheres, where molecular absorption dominates. two factors play here, the existence of spectral windows and the large variation as a function of wavelength. in any averaging of this sort, the optimal weight function is the one which is the closest to the actual solution, which is of course not known. nevertheless, it is imperative to have a good guess for the weight function, or else the results may be completely skewed. the failure of the rosseland mean is a good example for a poor weight function which yields an unacceptable result. consider first the vis range. since the temperature of the radiation is that of the central star and hence very high relative to that of the planetary atmosphere, the zeroth solution for the transmission of specific intensity i(z,ν) is given by i(z,ν) = itoa∗,ν e −κ(ν)z. (7) here itoa∗,ν is the stellar specific intensity at the top of the atmosphere. to secure the energy flux transfer through the atmosphere, we therefore write that:∫ ν2 ν1 itoa∗,ν e −κ(ν)zdν = e−〈τvis〉 ∫ ν2 ν1 itoa∗,ν dν. (8) next we note that the radiation interacts with the atmosphere, which is at temperature tatm, and the stellar radiation is to a good approximation that of a black body at temperature t∗, so we have: 〈τvis〉 = − log (∫ν2 ν1 b(t∗,ν)e−τ(ν,tatm)dν∫ν2 ν1 b(t∗,ν)dν ) , (9) where ν1 and ν2 correspond to the boundaries of the vis range, tatm is the temperature of the atmosphere and τvis is the total average optical depth for this range. it is important to note that the temperature in the weighting function is not that of the atmosphere through which the radiation passes, but that of the insolating star – the sun in the particular case of the earth or the star in the general case. the optical depth, ∫z 0 κ(λ,t,p)dλ, however, is calculated with the temperature of the atmosphere. the expression so obtained reduces to the trivial results in various limits. if there is no absorption in the vis range, then τvis = 0. when the optical depth is constant, then τvis is equal to this constant as well. 6.1. transition in the far infrared domain consider now the radiative transfer in the fir range. let τ be measured from the top of the atmosphere downwards. if we write i+(τ) as the thermal flux towards larger optical depths (downwards) and i−(τ) as the flux towards smaller optical depths, then the solutions for i±(τ) under the above approximations are: f(τ) ≡ i−(τ) − i+(τ) = const. (10) e(τ) ≡ i−(τ) + i+(τ) (11) = [i−(τ) − i+(τ)] τ + const. by comparing the conditions at the top (τ = 0) to the bottom (τ = τtot), we obtain i−(τtot) − i+(τtot) = i−(0) − i+(0), (12) i−(τtot) + i+(τtot) = [i−(0) − i+(0)] τtot + [i−(0) + i+(0)] . the boundary conditions we have are: i+(0) = i∗,fir and i−(τtot) = ip,fir, (13) where i∗,fir is the insolation for λ > λrad and ip,fir is the planet’s emission at λ > λrad. it is generally assumed that i∗,fir = 0. however, if λrad decreases significantly, this assumption may no longer be justified. 
6.1. Transition in the far infrared domain

Consider now the radiative transfer in the fir range. Let $\tau$ be measured from the top of the atmosphere downwards. If we write $I^+(\tau)$ for the thermal flux towards larger optical depths (downwards) and $I^-(\tau)$ for the flux towards smaller optical depths, then the solutions for $I^\pm(\tau)$ under the above approximations are:
$$ F(\tau) \equiv I^-(\tau) - I^+(\tau) = \text{const}, \tag{10} $$
$$ E(\tau) \equiv I^-(\tau) + I^+(\tau) = \left[I^-(\tau) - I^+(\tau)\right]\tau + \text{const}. \tag{11} $$
By comparing the conditions at the top ($\tau = 0$) to those at the bottom ($\tau = \tau_{\rm tot}$), we obtain
$$ I^-(\tau_{\rm tot}) - I^+(\tau_{\rm tot}) = I^-(0) - I^+(0), \qquad I^-(\tau_{\rm tot}) + I^+(\tau_{\rm tot}) = \left[I^-(0) - I^+(0)\right]\tau_{\rm tot} + \left[I^-(0) + I^+(0)\right]. \tag{12} $$
The boundary conditions we have are:
$$ I^+(0) = I_{*,\rm fir} \quad\text{and}\quad I^-(\tau_{\rm tot}) = I_{p,\rm fir}, \tag{13} $$
where $I_{*,\rm fir}$ is the insolation for $\lambda > \lambda_{\rm rad}$ and $I_{p,\rm fir}$ is the planet's emission at $\lambda > \lambda_{\rm rad}$. It is generally assumed that $I_{*,\rm fir} = 0$. However, if $\lambda_{\rm rad}$ decreases significantly, this assumption may no longer be justified.

Next we consider the thermal equilibrium of the surface, i.e., total absorption equals total emission:
$$ (1 - A(\lambda))I_{*,\rm vis} + I_{\rm atm,\downarrow} = (1 - A(\lambda))I_{p,\rm vis} + I_{p,\rm fir}, \tag{14} $$
where $I_{*,\rm vis}$ is the insolation for $\lambda < \lambda_{\rm rad}$, $I_{\rm atm,\downarrow}$ is the emission of the atmosphere towards the surface (at $\lambda > \lambda_{\rm rad}$), $I_{p,\rm vis}$ is the planet's emission at $\lambda < \lambda_{\rm rad}$, and $A$ is the albedo at the short wavelengths. Our main point is that $I_{p,\rm vis}$ must be included at relatively high surface temperatures. Using the two sets of Eqs. 12 and 13, the thermal equilibrium becomes:
$$ (1 - A(\lambda))\left(I_{*,\rm vis} - I_{p,\rm vis}\right) = \frac{2(I_{p,\rm fir} - I_{*,\rm fir})}{2 + \tau_{\rm fir,tot}}. \tag{15} $$
From the above set of equations we can derive an expression for the average net fir flux (per unit frequency) over a finite band $\Delta\nu$, and define an effective opacity through the following:
$$ \overline{\Delta I}_{\rm fir} = \frac{1}{\Delta\nu}\int_{\nu_1}^{\nu_2}\frac{I_{p,\rm fir}(T_p) - I_{*,\rm fir}}{1 + 3\tau(\nu)/4}\,d\nu \approx \frac{I_{p,\rm fir}(T_p) - I_{*,\rm fir}}{\Delta\nu}\int_{\nu_1}^{\nu_2}\frac{d\nu}{1 + 3\tau(\nu)/4} \equiv \frac{I_{p,\rm fir}(T_p) - I_{*,\rm fir}}{\Delta\nu}\,\frac{1}{1 + 3\tau_{\rm fir,tot}/4}, \tag{16} $$
that is,
$$ \tau_{\rm fir,tot} = \frac{4}{3}\left(\frac{\int_{\nu_1}^{\nu_2} B(T_{\rm atm},\nu)\,d\nu}{\int_{\nu_1}^{\nu_2}\dfrac{B(T_{\rm atm},\nu)}{1 + 3\tau_{\rm tot}(\nu)/4}\,d\nu} - 1\right). \tag{17} $$
This expression for the grey absorption is useful because it encapsulates the different behaviours in the fir. In particular, wavelength regions for which the optical depth is small allow a larger flux and therefore receive a larger weight in the averaging. This is present in the Rosseland mean as well. However, unlike the Rosseland mean, the flux does not diverge if the wavelength region becomes optically thin. In such a case, the emission saturates at the surface emission. Thus, the mean can adequately describe the effect of spectral windows.
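Equation (17) is likewise a short quadrature once $\tau(\nu)$ is tabulated. The toy opacity below is invented to show the behaviour just described: an optically thin window pulls the mean down without letting the flux diverge, unlike the Rosseland mean.

```python
import numpy as np

H, C, KB = 6.626e-34, 2.998e8, 1.381e-23

def planck_nu(nu, T):
    return 2 * H * nu**3 / C**2 / np.expm1(H * nu / (KB * T))

def tau_fir_tot(nu, tau_nu, T_atm=288.0):
    """Eq. (17): effective grey fir optical depth, Planck-weighted at the
    atmospheric temperature."""
    b = planck_nu(nu, T_atm)
    ratio = np.sum(b) / np.sum(b / (1 + 0.75 * tau_nu))
    return 4.0 / 3.0 * (ratio - 1.0)

# Toy fir opacity: very strong absorption with one open spectral window.
nu = np.linspace(1e13, 1e14, 4000)                 # Hz, far infrared
tau_nu = np.where((nu > 2.4e13) & (nu < 3.2e13), 0.01, 1.0e4)
print(f"tau_fir,tot = {tau_fir_tot(nu, tau_nu):.1f}")
```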
7. Actual calculation

The data for the molecular absorption were taken from the HITRAN compilation [6]. The procedure for calculating the line absorption is described in the manual of this compilation.

8. The effect of water vapour

We calculated the optical depths for increasing column densities of water molecules. The results are shown in Fig. 5. The effect of increasing the column density of water vapour is shown by the green line. The interesting phenomenon is that the curve starts in the greenhouse region and continues, for sufficiently high amounts of water, towards the anti-greenhouse domain. The water curve is not vertical but has a slope. The pink arrow denotes the result that a model which does not distinguish between the two domains, the vis and the fir, would yield. As we can see, the inclusion of the effect of $\tau_{\rm vis}$ lowers the predicted surface temperature. The arrow marks the location of the Earth for a column density of $8.12\times 10^{22}$ molecules/cm², as given by Crisp [2] for the mean Earth.

Figure 5. The effect of gaseous water molecules in the $(\tau_{\rm fir},\tau_{\rm vis})$ plane. The points are increasing column density, starting from zero up to the maximum amount of water vapour (relative humidity of 100 %). The green line depicts the increase in $T_{\rm surf}$, while the pink arrow depicts the result that would have been obtained from the classical approach, which neglects $\tau_{\rm vis}$. The arrow marks the location of the Earth according to the Earth's mean column density of water vapour as given by Crisp [2].

8.1. Line-by-line models

Of all the methods applied to describe the greenhouse effect of any gas on the atmospheric temperature [3], the line-by-line (LBL) models are the most elaborate. These are not full radiative transfer models but radiation transmittance models. These models calculate the absorbed energy in each absorption line. The LBL model treats only one downward stream of radiation. In the vis, it is pure insolation, whereas beyond $\lambda_{\rm rad}$ it is the downward self-emission from the top of the atmosphere (TOA) and subsequent lower layers, in response to the long-wavelength radiation emitted from the planetary surface. The absorbed energy so calculated (in terms of W/m²) is transferred to a global circulation model (GCM), where the effect on the temperature is calculated. Since in this way the increase in the concentration of any absorber always leads to increased absorbed energy, the results are always positive, namely heating. Less detailed methods, like the correlated-k distribution, are even less accurate.

9. A maximum temperature for a planet

The fact that the greenhouse effect saturates implies that a planet at a given distance from the central star has a maximal temperature, the temperature of saturation. This is a strict limit which does not depend on the parameters of the planet, like the atmospheric composition or the mass or structure of the atmosphere (pressure and density). The saturation temperature depends only on the distance of the planet from the central star and the mean albedo. Hence, this limit can serve to distinguish between brown dwarfs revolving around a central star and planets. This model should be further developed for Jovian planets, to distinguish them from brown dwarfs; this should be done by changing the lower boundary conditions.

10. Summary

The radiative transfer model for planetary atmospheres presented here enjoys simplicity, yet it does not compromise the fundamental physics of the greenhouse effect. The prediction of the greenhouse effect is correct and reliable, as we saw with the prediction of the location of the Earth in the $(\tau_{\rm vis},\tau_{\rm fir})$ plane. In general, having the universal solution $T_{\rm surf}(\tau_{\rm vis},\tau_{\rm fir})$ allows an easy determination of the surface temperature of the atmosphere, irrespective of the particular composition. We demonstrated the solution in the case of water vapour, and have shown that water vapour does not lead to a runaway greenhouse effect, and does not even drive an Earth-like atmosphere into the saturation region. The model generalizes the greenhouse and anti-greenhouse effects and generates a comprehensive picture of the phenomenon. Radiative models which are essentially transmittance models, like the LBL, cannot predict the resulting temperature changes, as they do not impose an energy balance equation. Such models can predict how much energy is absorbed by a given atmospheric structure, but they do not have the feedback to evaluate the resulting temperature changes; for this purpose, one has to resort to a general circulation model, which is a dynamic model and not a static one. The saturation of the greenhouse effect leads to the existence of a strict limit on the temperature of a planet. This limit can help to distinguish between planets and brown dwarfs.

Acknowledgements

Special thanks to Dr. Eyal Bressler and to Jenia and Ram Oren.

References

[1] Bressler, S., Shaviv, G., Shaviv, N. J.: 2012, in preparation.
[2] Crisp, D.: Chap. 11 in Allen's Astrophysical Quantities, 2000.
[3] IPCC Climate Change 2007: Synthesis Report.
[4] Kasting, J. F.: 1988, Runaway and moist greenhouse atmospheres and the evolution of Earth and Venus, Icarus 74, 472.
[5] Mihalas, D.: 1970, Stellar Atmospheres, W. H. Freeman and Company, San Francisco.
[6] Rothman, L. S., et al.: The HITRAN 2008 molecular spectroscopic database, 2009, JQSRT 110, 533.
9. a maximum temperature for a planet

the fact that the greenhouse effect saturates implies that a planet at a given distance from the central star has a maximal temperature, the temperature of saturation. this is a strict limit which does not depend on the parameters of the planet, like the atmospheric composition or the mass and structure of the atmosphere (pressure and density). the saturation temperature depends only on the distance of the planet from the central star and on the mean albedo. hence, this limit can serve to distinguish between brown dwarfs revolving around a central star and planets. the model should be further developed for jovian planets, to distinguish them from brown dwarfs; this should be done by changing the lower boundary conditions.

10. summary

the radiative transfer model for planetary atmospheres presented here enjoys simplicity, yet it does not compromise the fundamental physics of the greenhouse effect. the prediction of the greenhouse effect is correct and reliable, as we saw with the prediction of the location of the earth in the (τvis, τfir) plane. in general, having the universal solution tsurf(τvis, τfir) allows an easy determination of the surface temperature of the atmosphere, irrespective of the particular composition. we demonstrated the solution in the case of water vapor, and have shown that water vapor does not lead to a runaway greenhouse effect, and does not even drive an earth-like atmosphere into the saturation region. the model generalizes the greenhouse and anti-greenhouse effects and generates a comprehensive picture of the phenomenon. radiative models which are essentially transmittance models, like the lbl, cannot predict the resulting temperature changes, as they do not impose an energy balance equation. such models can predict how much energy is absorbed by a given atmospheric structure, but they do not have the feedback to evaluate the resulting temperature changes; for this purpose, one has to resort to a general circulation model, which is a dynamic model, not a static one. the saturation of the greenhouse effect leads to the existence of a strict limit on the temperature of a planet. this limit can help to distinguish between planets and brown dwarfs.

acknowledgements

special thanks to dr. eyal bressler and to jenia and ram oren.

references

[1] bressler, s., shaviv, g., shaviv, n.j., 2012, in preparation
[2] crisp, d., chap. 11 in allen's astrophysical quantities, 2000
[3] ipcc climate change 2007: synthesis report
[4] kasting, j.f., 1988, runaway and moist greenhouse atmospheres and the evolution of earth and venus, icarus 74, 472
[5] mihalas, d., 1970, stellar atmospheres, w.h. freeman and company, san francisco
[6] rothman, l.s., et al., 2009, the hitran 2008 molecular spectroscopic database, jqsrt 110, 533
[7] seager, s., 2010, exoplanet atmospheres, princeton university press, new jersey
[8] shaviv, g. & wehrse, r., 1991, continuous energy distributions of accretion discs, a&a 251, 117
[9] shaviv, n.j., shaviv, g. & wehrse, r., 2011, the maximal runaway temperature of earth-like planets, icarus 216, 403
[10] simpson, g.c., 1927, some studies in terrestrial radiation, mem. r. meteorol. soc. ii (16), 69–95
[11] simpson, g.c., 1928, further studies in terrestrial radiation, mem. r. meteorol. soc. iii (21), 1–26
[12] thomas, g.e. and stamnes, k., 1999, radiative transfer in the atmosphere and ocean, section 12.3, cambridge university press

discussion

jim beall — there is a correlation between temperature and co2 in the industrial period between 1880–2004: co2 goes from 280 ppm to 380 ppm, but the temperature increased by only 3/4 °c during that interval.

smadar — different analyses give a wide range for the anthropogenic contribution to the temperature rise, and even the measured temperature rise itself has a large uncertainty, such that our results all fall within these limits. it is only with better and more detailed models that we will be able to rule out (through this methodology) possible misconceptions about the extent to which anthropogenic co2 is the sole cause of the heating of the atmosphere.

maurice van putten — is it known how much ch4 is released when there is a 1 °c increase? the result, i expect, can be readily included in your model.

smadar — you are correct that ch4 is the next runner-up in our model after co2, being the third most important greenhouse gas in the earth's atmosphere. the answer to your question can be obtained using our model in a manner similar to that for co2 and water. on the one hand, we expect a smaller effect due to the still relatively low column density of ch4 (about 200 times less than co2); its strong absorption, on the other hand, might compensate for this, and it is hard to predict its final effect without calculating it. we have developed a greenhouse indicator aimed exactly at comparing different greenhouse gases by their slope in the (τfir, τvis) plane under changes in concentration. for example, from the presented data it seems that co2 has a stronger greenhouse effect than water, at least in some humidity regions. so we should wait and see.

bozena czerny — 1. what about cloud coverage? it varies across the surface and is also coupled vertically with high cloud reflectivity. 2. what about mechanical effects like convection or winds? these likely predominantly cool the surface.

smadar — 1. cloud coverage has not yet been considered, as it is a complicated matter: clouds at different heights have different albedos, and there is also the complication of combining the surface albedo with partial cloud coverage, which changes the effective albedo. initial experiments with changing the surface albedo, though, show very low sensitivity of the model to even 30 % changes, but this is yet to be determined carefully in further calculations. 2. convection was not yet considered in this preliminary model, only pure radiative transfer. as convection will set in only at the steepest temperature gradient, which is the adiabatic limit, our calculation yields the upper limit of the temperature. we can parameterize convection from our model by translating dT/dτfir into dT/dz through a linear scale factor
$$L = \left(\frac{d\ln\tau}{dz}\right)^{-1}.\tag{18}$$
this will be carried out in further studies.
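as a minimal sketch of how eq. (18) would be used (every number below is a placeholder assumption, not a model output), one can convert a radiative gradient in optical depth into a gradient in height and compare it with the adiabatic limit:

```python
dT_dtau = 12.0     # hypothetical radiative gradient [k per unit fir optical depth]
tau     = 2.0      # local fir optical depth
H       = 8000.0   # assumed scale height of tau(z) = tau0 * exp(-z/H) [m]

# for an exponential tau(z), d ln(tau)/dz = -1/H, i.e. L = -H in eq. (18)
dtau_dz = tau * (-1.0 / H)
dT_dz   = dT_dtau * dtau_dz          # chain rule: dT/dz = dT/dtau * dtau/dz
gamma_ad = -9.8e-3                   # dry adiabatic gradient [k/m]
print(dT_dz, "k/m; convection sets in where this is steeper than", gamma_ad)
```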
as to winds, our model is not suitable for calculating the effect of global winds and wind jets (especially on gravitationally locked planets); this requires coupling with global circulation models (gcm), which are used today to calculate greenhouse temperatures from lbl models that yield radiation fluxes in w/m². we hope to match our radiative transfer software in the future with a good gcm, in order to feed our results into climatic models. as you see, there is still much work to be done! we are only at the beginning . . .

acta polytechnica 53(supplement):832–838, 2013

stirred fluidized-bed dryer of regenerated ion exchanger particles

michal pěnička, pavel hoffman∗, ivan fořt

czech technical university in prague, department of process engineering, technická 4, 166 07 prague, czech republic
∗ corresponding author: pavel.hoffman@fs.cvut.cz

acta polytechnica 54(3):231–239, 2014, doi:10.14311/ap.2014.54.0231

abstract. this paper describes the intensification of the fluidized-bed drying process for regenerated spherical ion exchanger particles in batch mode, achieved by a mechanical stirrer in the fluidized bed of dried particles. the effect of the mechanical stirring system on the drying process is examined. the calculations, and also the results of comparative measurements, provide evidence of a favourable effect of stirring on the total drying time in comparison with the initial unstirred system. the regenerated ion exchanger particles pass to the fluid state in a shorter time, and the ultimate total drying time is thus more than 60 % shorter.

keywords: fluidized-bed drying; stirring; mechanical disruption; drying kinetics; ion exchanger.

1. introduction

this study seeks savings in the fluidized-bed drying of particles in batch mode, the particles being highly adhesive due to the surface tension of the liquid adhering to them. drying is an extremely energy-demanding process, and so it is important, especially now that energy prices are becoming so high, to find ways to achieve energy savings. fluidized-bed drying is a drying process in which intensive heat and mass transfer occur between the particles, which are in the fluid state, and the air flowing through the bed. the surface tension of the liquid adhering to the surface of the particles at the beginning of the drying process gives rise to a strong tendency for the particles to stick to each other and also to the walls of the drying chamber. the batch mode of this process, together with the insufficient amount of material present in a batch, does not allow a vibro-fluid dryer to be used, since these dryers are mainly designed for continuous mode.
a solution may be found in drying with a fluidized bed which is stirred [1–4], in which the particle clusters are disintegrated continuously and the particles adhering to the drying chamber walls are swept off by the stirring process. the aim of this study is to intensify and improve the drying process of the regenerated ion exchanger particles, i.e. to shorten the process while at the same time improving the homogeneity of the moisture of the particles.

2. material and methods

2.1. spherical particles of the regenerated ion exchanger

ion exchanger particles are used in membrane technology processes for wastewater treatment purposes [5]. they are also widely used for water treatment in the nuclear power sector [6]. ion exchangers are mostly synthetic high-molecular-weight organic compounds, largely based on styrene, polyacrylate, phenol-formaldehyde resins, etc. [7]. the marathon-a cation exchanger, consisting of spherical particles 450–580 µm in diameter, was selected as the model material [8]. the maximum permitted temperature is 120 °c, to avoid thermal stress resulting in structural degradation of the particles.

the drying process consists of three main periods, see fig. 1. in the initial phase (period 0–1 in fig. 1), the particles act as a stationary porous layer. this period is characterized by particle surface moisture, resulting in a strong tendency toward adhesion. in their appearance and physical properties, the dried particles resemble an amorphous material (fig. 2, point 0) and stick strongly to the walls and to each other. the initial moisture of the regenerated ion exchanger is approximately 65 % rm (fig. 1). in this period the heat supplied by the drying air is used to evaporate the free surface water from the particles (fig. 2, point 0); the particle diameter therefore remains constant, as is clear from the particle diameter change curve during the drying process (see fig. 1b). due to imperfect circumfluence of the drying air around the particles (it flows through the layer only in several channels), the heat energy is not consumed completely, and thus neither the drying air temperature nor the humidity at the fluid chamber output is close to the ideal state, in which the drying air humidity approaches the saturation limit and the temperature approaches the wet bulb temperature (fig. 1d, e).

figure 1. time dependences of the particle and drying air properties during the whole drying process.

as soon as the attractive forces from the water surface tension decrease enough to enable the transition of the particles to the fluidized state (period 1–2 in fig. 1), moisture starts to evaporate from within the particles (fig. 2, point 1). this transition to the fluidized state improves the circumfluence of the drying air around the particles and consequently accelerates the heat and mass transfer. as a result, the drying air temperature at the fluid chamber output starts to approach the wet bulb temperature, and the maximum possible humidity of the drying air is reached (fig. 1c, d). the dried particles now enter what is referred to as the first drying period, in which the drying rate is constant. the material is characterized by a significant expansion in volume, depending on its moisture content (fig. 2, point 2); for this reason, the particle diameters decrease in this period. the rate of moisture evaporation from the particle surface is lower than the rate of moisture transfer from the centre of the particle to the surface, so the limiting factor of the drying process is the evaporation from the particle surface.

figure 2. particle diameter change curve during the drying process.

after passing point 2 in fig. 1, the diameter of the particles remains constant and the particles start to be dried to the required final moisture level. the drying process thereby reaches the second drying period, in which the drying rate decreases: the drying air temperature at the fluid chamber output increases and the humidity decreases. moisture transfer from the particle centre to the surface is the limiting factor in this period.

2.2. stirring unit

the main issue in this process is that the particles stick to each other or to the dryer walls in the first drying period. this is caused by the surface tension of the liquid, which is present both on the surface and inside the particles. hence, the properties of the stirrer are an important factor in the technology and in intensifying the process. a suitable stirrer should have the capacity to disrupt the ion exchanger particle clusters and to sweep the particles off the walls. this ability should be effective enough to minimise the initial drying period, in which the particles adhere to each other, but must not cause particle degradation. the effect of the stirrer in the first and second periods, in which the surface of the pre-dried particles is dry enough for the particles to pass to the fluidized state, must not be adverse, i.e. it must not disturb the fluidized bed or the contact between the particles and the drying air, which would slow down the heat and mass transfer.

2.3. experimental equipment

an experimental assembly for fluidized-bed drying with a stirred layer and various fluid chamber diameters was designed to evaluate the development of the drying process. the layout of the experimental equipment is shown in fig. 3. pressurized air at a known temperature and humidity is fed in from the main air distribution line (fig. 3, item 1). the air flow is controlled by the pressure control valve (fig. 3, item 2) and is measured using a rotameter (fig. 3, item 3). downstream of the flowmeter the air is heated by a heating device with resistance wires; the output of the unit is controlled manually by changing the voltage with a transformer (fig. 3, item 4). the temperature of the heated air is measured upstream of the fluid chamber using a contact thermometer with an accuracy of ±0.1 °c. the fluidized bed chamber (fig. 3, item 5) consists of a duct made of galvanized zinc sheet, which is thermally insulated; this main part is 1 m long, and the diameter then expands to 250 mm. a sensor measuring the humidity (accuracy ±0.1 %) and the temperature (accuracy ±0.1 %) of the drying air after passing the fluidized bed was always positioned 300 mm above the fluidized-bed chamber grid; it determines the properties of the drying air at the output (point c in fig. 4). an adjustable-speed stirrer unit is positioned above the chamber (fig. 3, item 7). a schematic drawing of the wire stirrer is shown as item 6 in fig. 3.

figure 3. experimental equipment layout.
fluidized bed chambers with three basic diameters (85 mm, 100 mm and 140 mm) were designed to enable the development of the model drying process to be precisely monitored.

2.4. properties of the drying air

the flow rate of the heated air downstream of the heating unit was determined using the equation of state of an ideal gas (1), assuming that the amount of substance of the flowing air $n_a$ at the heating unit input and output is constant, i.e. that there is no leakage of the drying air from the experimental equipment:
$$p_a \dot V_a = n_a R T_a.\tag{1}$$
the values from points a and b in fig. 4 are inserted into the equation of state, including the thermodynamic temperature of the drying air $T_a$, the air pressure $p_a$ and the universal gas constant $R$. the superficial drying air velocity $u_{ab}$ in the drying chamber was determined from the air flow rate $\dot V_a$ at point b (fig. 4):
$$u_{ab} = \frac{\dot V_a}{S_k},\tag{2}$$
where $S_k$ is the cross-section of the fluidized-bed chamber. the density of the wet air as a function of temperature was calculated according to [9]:
$$\rho_a = \frac{1.316\cdot10^{-3}}{T_a}\left(2.65\,p - \varphi_a\,p''_v\right),\tag{3}$$
where $p$ is the barometric pressure, $\varphi_a$ is the relative humidity of the drying air and the pressure of the saturated water vapour $p''_v$ at temperature $T_a$ was determined according to relation (4), which is valid for the temperature range 0 to 200 °c, with coefficients $C_8$ to $C_{13}$ taken from the literature [10]:
$$\ln p''_v = \frac{C_8}{T_a} + C_9 + C_{10}T_a + C_{11}T_a^2 + C_{12}T_a^3 + C_{13}\ln T_a.\tag{4}$$
the dependence of the dynamic viscosity of the drying air on temperature is expressed, following [11], by
$$\mu_a = 4\cdot10^{-8}\,t_a + 1.76\cdot10^{-5}\ \ \mathrm{[Pa\,s]},\tag{5}$$
where $t_a$ is the temperature of the drying air in degrees celsius.

figure 4. h-x diagram for the adiabatic drying process.

2.5. determination of layer porosity

table 1 presents the values of the initial conditions of the studied drying process of the regenerated ion exchanger particles, identical for all tested sizes of the fluid drying chambers.

table 1. input parameters of the drying air.
column diameter | dk | m | 0.085 / 0.100 / 0.140
temperature of drying air at point b in fig. 3 | ta | °c | 120
density of drying air at point b in fig. 3 | ρa | kg m⁻³ | 0.91
viscosity of drying air at point b in fig. 3 | µa | pa s | 2.24·10⁻⁵
density of dry particles | ρs | kg m⁻³ | 1440
superficial drying air velocity | uab | m s⁻¹ | 2.1

the reynolds number was determined from the known drying air velocity $u_{ab}$ according to
$$u_{ab} = \frac{Re\,\mu_a}{d_p\,\rho_a},\tag{6}$$
where $\mu_a$ is the dynamic viscosity and $\rho_a$ the density of the drying air, determined according to equations (3) and (5), and $d_p$ is the mean particle diameter in the studied period. the relationship between the porosity $\varepsilon$ and the drying air velocity $u_{ab}$, or between the porosity and the reynolds number $Re$, is expressed [12] by an equation valid for fine particles:
$$Re = \frac{Ar\,\varepsilon^{4.75}}{18 + 0.6\sqrt{Ar\,\varepsilon^{4.75}}}.\tag{7}$$
the archimedes criterion $Ar$ is determined according to
$$Ar = \frac{d_p^3\left(\rho_{sw}-\rho_a\right)\rho_a\,g}{\mu_a^2},\tag{8}$$
where $g$ is the acceleration of gravity and $d_p$ is the mean particle diameter in the studied period; $\rho_a$ and $\mu_a$ are again given by equations (3) and (5). the density of the wet particles $\rho_{sw}$ for the given interval is determined using
$$\rho_{sw} = \rho_s\left(1 + \frac{m_w}{m_s}\right),\tag{9}$$
where $\rho_s$ is the density of the dried material and $m_w$ is the mass of the liquid component corresponding to the moisture of the measured material at a given point of the studied process.
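as a cross-check of this chain of formulas, the following sketch (ours; the barometric pressure and the inlet humidity are assumptions) reproduces the air properties of table 1 and then inverts eq. (7) for the bed porosity by bisection, here with the period 1–2 particle values, anticipating table 3:

```python
import math

t_a = 120.0                        # drying air temperature [degc]
T_a = t_a + 273.15                 # [k]
p, phi_a = 101325.0, 0.0           # assumed pressure and negligible inlet humidity

rho_a = 1.316e-3 / T_a * (2.65 * p - phi_a * 198500.0)   # eq. (3) -> ~0.90 kg/m3
mu_a = 4e-8 * t_a + 1.76e-5                              # eq. (5) -> 2.24e-5 pa s

d_p, rho_sw, u_ab, g = 5.3e-4, 2290.0, 2.1, 9.81         # period 1-2, dk = 0.085 m
Re = u_ab * d_p * rho_a / mu_a                           # from eq. (6) -> ~45
Ar = d_p**3 * (rho_sw - rho_a) * rho_a * g / mu_a**2     # eq. (8)      -> ~6.0e3

# invert eq. (7) for the porosity: its right-hand side grows monotonically with eps
lo, hi = 0.3, 1.0
for _ in range(60):
    eps = 0.5 * (lo + hi)
    x = Ar * eps**4.75
    lo, hi = (eps, hi) if x / (18 + 0.6 * math.sqrt(x)) < Re else (lo, eps)
print(f"Re = {Re:.0f}, Ar = {Ar:.0f}, eps = {eps:.2f}")  # table 3: ~43, 6046, 0.79
```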
2.6. determining the drying rate

the drying rate for a given interval is determined according to [13]:
$$n_w = -\frac{m_s}{A_s}\,\frac{\Delta x_w}{\Delta\tau},\tag{10}$$
where $\Delta x_w$ is the difference in absolute moisture between the initial and final points of the studied interval and $\Delta\tau$ is the duration of the drying period. the total particle surface was determined from
$$A_s = V_p\,a_s,\tag{11}$$
where the total particle volume $V_p$ is calculated as
$$V_p = \frac{m_s}{\rho_{sw}},\tag{12}$$
with $m_s$ the mass of the dry matter of the measured sample and $\rho_{sw}$ the density of the wet particles for the given interval, eq. (9). the specific particle surface is determined according to the formula valid for monodisperse systems of spherical particles:
$$a_s = \frac{6(1-\varepsilon)}{d_p},\tag{13}$$
where $\varepsilon$ is the porosity in the studied period.
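a minimal sketch of this chain for one measured interval follows; the dry-matter mass $m_s$ is our assumption (it is not listed explicitly), and the other inputs follow the stirred dk = 0.085 m column of table 2:

```python
d_p   = 5.9e-4        # mean particle diameter in period 0-1 [m]
eps   = 0.39          # porous-layer porosity (table 2)
m_s   = 0.13          # assumed dry-matter mass [kg]
rho_sw1 = 2880.0      # wet-particle density at point 1, eq. (9) [kg/m3]
dx_w  = 0.962         # absolute-moisture drop over the period [kg_w/kg_s]
dtau  = 1634.0        # period length [s]

V_p = m_s / rho_sw1                 # eq. (12), total particle volume
a_s = 6 * (1 - eps) / d_p           # eq. (13), specific surface -> ~6.2e3 1/m
A_s = V_p * a_s                     # eq. (11), total surface    -> ~0.28 m2
n_w = (m_s / A_s) * dx_w / dtau     # eq. (10)                   -> ~2.7e-4
print(f"A_s = {A_s:.2f} m2, n_w = {n_w:.2e} kg m-2 s-1")
```

note that $n_w$ here is independent of the assumed $m_s$ (it cancels between eqs. (10)-(12)); the result comes out close to table 2's 2.55·10⁻⁴ kg m⁻² s⁻¹, which was obtained from the full measurement record.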
3. results and discussion

3.1. design of the optimized stirrer for minimizing the total drying time

the wire stirrer prototype shown schematically in fig. 3 was designed on the basis of the results of preliminary experiments with standardized stirrers. the stirrer is made of stainless steel wire 0.4 to 0.6 mm in diameter and was designed for a centric arrangement. the ratio of the stirrer diameter to the column diameter is dm/dk = 0.95–0.97, and the stirrer height is 2.8 to 5 times the height of the material layer at rest. the expected rotation speed is approximately 50 rpm for all fluidized bed dryer sizes tested here; the peripheral speed of the blades is then in the range 0.225 to 0.26 m s⁻¹ for the column sizes tested. the preliminary experiments clearly showed that when the stirrer mesh ho is too small, the wet ion exchanger particle clusters are insufficiently disintegrated and move along with the stirrer; if the stirrer mesh is too large, the clusters are disintegrated, but the resulting clusters are not small enough. there was no significant effect of mesh orientation on the cluster disintegration process. the optimum mesh size was found to be 10 mm × 10 mm.

3.2. the course of changes in ion exchanger particle diameters in the test periods

during the individual experiments, samples of the dried ion exchanger were taken every 5 minutes from the same point of the drying column. over the drying process, more than 30 samples were taken for the unstirred version and more than 13 samples for the stirred version. in one part of each sample, the absolute moisture was determined using drying balances; in this way, the moisture of the dried particles at a particular point of the drying process was determined. the rest of the sample was immediately subjected to microscopic analysis, and photos of at least six particles were taken. the mean diameter of the particles for that point of the drying process (and thus for that particular moisture) was then determined using an optical method. the average mean particle diameter for all column diameters, for both the stirred and unstirred variants, was $d_{p01} = 590\cdot10^{-6}\,\mathrm{m}\pm2\,\%$ for the porous layer period, $d_{p12} = 530\cdot10^{-6}\,\mathrm{m}\pm3\,\%$ for the first period of the fluidized bed, and $d_{p23} = 460\cdot10^{-6}\,\mathrm{m}\pm3\,\%$ for the second period of the fluidized bed. the dependence of the mean particle diameter on the absolute moisture of the particle is shown in fig. 5, where the transition points between the individual periods of fig. 1 are highlighted.

figure 5. the dependence of mean particle diameter on particle absolute humidity.

3.3. the drying rate in the porous layer period

the porous layer period is defined as section 0–1 in fig. 1. the porosity of the dried particle layer was determined according to the formula for a porous layer [14]:
$$\varepsilon_{01} = \varepsilon_0 + 0.33\,\frac{d_{p01}}{d_k},\tag{14}$$
where $\varepsilon_0$, the probable mean porosity value, lies within the interval 0.38 to 0.39 [11], $d_{p01}$ is the mean ion exchanger particle diameter in the studied region, and $d_k$ is the diameter of the fluid dryer column. introducing the porosity $\varepsilon_{01}$ and the mean particle diameter $d_{p01}$ into equation (13) gives the specific surface of the particles $a_{s01}$ in this period. the total particle volume $V_{p01}$ is determined using equation (12), where $m_s$ is the mass of the dry particles and $\rho_{sw1}$ is the density of the wet particles at the final point of the studied region, i.e. at point 1 in figs. 1 and 2; this value is determined using equation (9). the mass of the moisture, i.e. of the water dried off in this region, is denoted $m_{w01}$. introducing the specific surface $a_{s01}$ and the total particle volume $V_{p01}$ into equation (11) gives $A_{s01}$, the total surface of the dried particles in this period. the drying rate in the porous layer period $n_{w01}$ is then determined using equation (10), where $\Delta x_{w01}$ is the difference in the absolute moisture of the dried particles between the initial and final points of the studied period and $\Delta\tau_{01}$ is the length of this period. table 2 summarizes the determined values of the characteristic quantities for this period of the drying process for the individual drying column diameters.

table 2. overview of values for the determination of the drying rate in the porous layer period.
column diameter dk [m] | 0.085 | 0.100 | 0.140
stirred version:
mean particle size dp01 [m] | 5.9·10⁻⁴ | 5.9·10⁻⁴ | 5.9·10⁻⁴
period time ∆τ01 [s] | 1634 | 1625 | 1860
difference in humidity ∆xw01 [kgw kgs⁻¹] | 0.962 | 1.125 | 1.030
water mass at point 0, mw0 [kg] | 0.227 | 0.306 | 0.533
water mass at point 1, mw1 [kg] | 0.122 | 0.165 | 0.287
water mass in period 0–1, mw01 [kg] | 0.105 | 0.141 | 0.246
density of wet particles at point 1, ρsw1 [kg m⁻³] | 2880 | 2880 | 2880
mean porosity ε0 [–] | 0.39 | 0.39 | 0.39
porosity ε01 [–] | 0.39 | 0.39 | 0.39
specific surface of particles as1 [m⁻¹] | 6207 | 6210 | 6216
total particle volume Vp1 [m³] | 4.54·10⁻⁵ | 5.61·10⁻⁵ | 9.79·10⁻⁵
total surface of particles As1 [m²] | 0.28 | 0.35 | 0.61
drying rate nw01 [kg m⁻² s⁻¹] | 2.55·10⁻⁴ | 3.27·10⁻⁴ | 2.61·10⁻⁴
unstirred version:
mean particle size dp01 [m] | 5.9·10⁻⁴ | 5.9·10⁻⁴ | 5.9·10⁻⁴
period time ∆τ01 [s] | 5687 | 5880 | 5765
difference in humidity ∆xw01 [kgw kgs⁻¹] | 1.035 | 1.018 | 1.003
water mass at point 0, mw0 [kg] | 0.220 | 0.327 | 0.571
water mass at point 1, mw1 [kg] | 0.097 | 0.144 | 0.251
water mass in period 0–1, mw01 [kg] | 0.123 | 0.183 | 0.319
density of wet particles at point 1, ρsw1 [kg m⁻³] | 2618 | 2618 | 2618
mean porosity ε0 [–] | 0.39 | 0.39 | 0.39
porosity ε01 [–] | 0.39 | 0.39 | 0.39
specific surface of particles as1 [m⁻¹] | 6207 | 6210 | 6216
total particle volume Vp1 [m³] | 3.85·10⁻⁵ | 5.87·10⁻⁵ | 1.03·10⁻⁴
total surface of particles As1 [m²] | 0.24 | 0.36 | 0.64
drying rate nw01 [kg m⁻² s⁻¹] | 9.02·10⁻⁵ | 8.35·10⁻⁵ | 8.38·10⁻⁵

3.4. the drying rate in the first period of fluidized-bed drying

in the first period of fluidized-bed drying (period 1–2 in fig. 1), the transition of the particles to the fluidized state begins and moisture starts to evaporate from the particle centre. this causes a change in the particle diameter, which depends on the moisture of the particle (see fig. 5). the important quantities for determining the properties of the fluidized bed are the velocity at the fluidization threshold $u_{a,mf}$ and the porosity $\varepsilon_{12}$ at the superficial drying air velocity $u_{ab}$. the velocity at the fluidization threshold is determined from
$$u_{a,mf} = \frac{Re_{p,mf}\,\mu_{ab}}{d_{p12}\,\rho_{ab}},\tag{15}$$
where $\mu_{ab}$ is the dynamic viscosity and $\rho_{ab}$ the density of the drying air, and $d_{p12}$ is the mean particle diameter in this period. to determine $u_{a,mf}$, the quantity $Re_{p,mf}$ must first be determined [15] from the formula valid for fine particles:
$$Re_{p,mf} = \sqrt{33.7^2 + 0.0408\,Ar}\;-\;33.7,\tag{16}$$
where the archimedes number $Ar$ is defined by equation (8), using the dynamic viscosity $\mu_{ab}$, the drying air density $\rho_{ab}$, the acceleration of gravity $g$, the mean particle diameter $d_{p12}$ and the final particle density $\rho_{sw12}$ of the calculated period 1–2. this density is determined according to equation (9), where $m_s$ is the dry mass of the particles, $\rho_s$ is the density of the dry particles and $m_{w12}$ is the average mass of the particle moisture in the studied period.
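the threshold estimate is easy to reproduce. the sketch below (ours) evaluates eqs. (8), (15) and (16) with the period 1–2 values of table 3 for the stirred dk = 0.085 m column:

```python
import math

d_p    = 5.3e-4       # mean particle diameter, period 1-2 [m]
rho_sw = 2290.0       # wet-particle density at point 2 [kg/m3]
rho_a  = 0.91         # drying-air density, table 1 [kg/m3]
mu_a   = 2.24e-5      # drying-air viscosity, table 1 [pa s]
g      = 9.81

Ar = d_p**3 * (rho_sw - rho_a) * rho_a * g / mu_a**2      # eq. (8)  -> ~6.0e3
Re_mf = math.sqrt(33.7**2 + 0.0408 * Ar) - 33.7           # eq. (16), wen & yu [15]
u_mf = Re_mf * mu_a / (d_p * rho_a)                       # eq. (15) -> ~0.16 m/s
print(f"Ar = {Ar:.0f}, Re_mf = {Re_mf:.2f}, u_mf = {u_mf:.2f} m/s")
```

the output matches the table 3 entries (ar ≈ 6046, re_p,mf ≈ 3.48, u_a,mf ≈ 0.16 m s⁻¹), and the working velocity u_ab = 2.1 m s⁻¹ is well above the threshold, as required for fluidization.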
the porosity $\varepsilon_{12}$ for the first period of fluidized-bed drying is also determined using formula (7), where $Ar$ is the archimedes number for this period and $Re$ is the reynolds number determined at the superficial drying air velocity $u_{ab}$, with the dynamic viscosity $\mu_{ab}$ and the density $\rho_{ab}$ of the drying air at point b in fig. 3. the drying rate for the first period of fluidized-bed drying is determined according to formula (10), where $m_s$ is the dry material mass and $A_{s12}$ is the total area of the dried particles determined using formulas (11), (12), (13) and (9) for the studied period. an overview of the values of the characteristic quantities for this period of the drying process of the regenerated ion exchanger particles is summarized in table 3.

table 3. overview of values for the determination of the drying rate in the first period of fluidized-bed drying.
column diameter dk [m] | 0.085 | 0.100 | 0.140
stirred version:
mean particle size dp12 [m] | 5.30·10⁻⁴ | 5.30·10⁻⁴ | 5.30·10⁻⁴
period time ∆τ12 [s] | 1434 | 1262 | 1587
difference in humidity ∆xw12 [kgw kgs⁻¹] | 0.588 | 0.588 | 0.588
water mass at point 1, mw1 [kg] | 0.122 | 0.165 | 0.287
water mass at point 2, mw2 [kg] | 0.022 | 0.029 | 0.051
water mass in period 1–2, mw12 [kg] | 0.100 | 0.136 | 0.236
density of wet particles at point 2, ρsw2 [kg m⁻³] | 2290 | 2070 | 1802
archimedes number Ar [–] | 6046 | 5465 | 4756
reynolds number at fluidization threshold Rep,mf [–] | 3.480 | 3.160 | 2.766
velocity at fluidization threshold ua,mf [m s⁻¹] | 0.16 | 0.15 | 0.13
porosity at fluidization start εmf [–] | 0.40 | 0.40 | 0.40
reynolds number at uab, Re [–] | 43 | 47 | 44
porosity at working conditions ε12 [–] | 0.79 | 0.82 | 0.83
specific surface of particles as12 [m⁻¹] | 2429 | 2016 | 1930
total particle volume Vp12 [m³] | 8.47·10⁻⁵ | 1.14·10⁻⁴ | 1.99·10⁻⁴
total surface of particles As12 [m²] | 0.21 | 0.23 | 0.38
drying rate nw12 [kg m⁻² s⁻¹] | 2.43·10⁻⁴ | 3.33·10⁻⁴ | 2.76·10⁻⁴
unstirred version:
mean particle size dp12 [m] | 5.30·10⁻⁴ | 5.30·10⁻⁴ | 5.30·10⁻⁴
period time ∆τ12 [s] | 1406 | 1411 | 1691
difference in humidity ∆xw12 [kgw kgs⁻¹] | 0.497 | 0.497 | 0.497
water mass at point 1, mw1 [kg] | 0.097 | 0.144 | 0.251
water mass at point 2, mw2 [kg] | 0.021 | 0.031 | 0.054
water mass in period 1–2, mw12 [kg] | 0.076 | 0.113 | 0.197
density of wet particles at point 2, ρsw2 [kg m⁻³] | 2315 | 2030 | 1778
archimedes number Ar [–] | 6114 | 5359 | 4693
reynolds number at fluidization threshold Rep,mf [–] | 3.517 | 3.101 | 2.730
velocity at fluidization threshold ua,mf [m s⁻¹] | 0.16 | 0.14 | 0.13
porosity at fluidization start εmf [–] | 0.40 | 0.40 | 0.40
reynolds number at uab, Re [–] | 43 | 47 | 44
porosity at working conditions ε12 [–] | 0.78 | 0.83 | 0.83
specific surface of particles as12 [m⁻¹] | 2450 | 1977 | 1904
total particle volume Vp12 [m³] | 8.22·10⁻⁵ | 1.22·10⁻⁴ | 2.13·10⁻⁴
total surface of particles As12 [m²] | 0.20 | 0.24 | 0.41
drying rate nw12 [kg m⁻² s⁻¹] | 2.08·10⁻⁴ | 2.57·10⁻⁴ | 2.22·10⁻⁴

3.5. effect of the stirrer on the total drying time

a comparison of the drying curves for the stirred and unstirred fluidized beds is shown in fig. 6. the comparison shows the effect of the stirrer in the porous layer period, where the drying time is significantly shorter. the rotating stirrer also helps the particles pass to the fluid state already at 50 % rh instead of the initial 45 % rh. there is also a positive effect of the stirrer in the first period of fluidized-bed drying, because it sweeps the wet particles stuck on the walls back into the fluidized bed, where the transfer phenomena are more intensive.

figure 6. resulting drying curves.

4. conclusions

it has been demonstrated that the stirrer has a positive effect in the porous layer period and during the first period of the drying process in the fluidized bed. the stirrer regularly disturbs the stationary porous layer that has formed, and this intensifies the rate of the transport phenomena. the total drying time to reach the required moisture content of the material is thus 60 minutes, which corresponds to a 63 % shorter drying time.
it follows from the experiments and the results that the drying rate in the period of the porous stationary layer is 8.58·10⁻⁵ kg m⁻² s⁻¹ ± 5 %, and after introducing the stirrer the rate increases to 2.81·10⁻⁴ kg m⁻² s⁻¹ ± 16 %, which represents an increase of 220 %. in the first and second periods of fluidized-bed drying, the stirrer does not disturb the fluidized bed that has formed; the process is nevertheless intensified, because the stuck wet particles are swept off the walls of the drying chamber into the fluidized bed. this results in an increase in the drying rate in the first period of fluidized-bed drying from 2.29·10⁻⁴ kg m⁻² s⁻¹ ± 12 % to 2.84·10⁻⁴ kg m⁻² s⁻¹ ± 17 %, i.e. an increase of 17 %. the results presented here are based on more than one hundred long-term drying experiments, providing figures that describe the course of the investigated drying process.

acknowledgements

the authors appreciate the support provided by internal grant sgs12/057/ohk2/1t/12 of the czech technical university in prague, grant ingo lg 13036 of the czech ministry of education, and research project j04/98: 212200008 of the czech ministry of education.
list of symbols
dk  column diameter [m]
ar  archimedes number [–]
as  specific surface of particles [m⁻¹]
As  total surface of particles [m²]
c8–c13  calculation constants [–]
d  diameter [m]
g  acceleration of gravity [m s⁻²]
h  height [m]
m  mass [kg]
n  amount of substance [mol]
nw  drying rate [kg m⁻² s⁻¹]
p  pressure [pa]
p″  pressure of saturated water vapour [pa]
r  universal gas constant [j k⁻¹ mol⁻¹]
re  reynolds number [–]
s  cross-section [m²]
t  temperature [°c]
T  thermodynamic temperature [k]
u  velocity [m s⁻¹]
v  total particle volume [m³]
v̇  volume flow [m³ s⁻¹]
x  absolute humidity (moisture) [kgw kgs⁻¹]
ε  porosity [–]
ϕ  relative humidity (moisture) [% rh (rm)]
µ  viscosity [pa s]
ρ  density [kg m⁻³]
τ  time period [s]

list of subscripts
a  air
w  water (humidity, moisture)
sw  wet material
s  dry matter
p  particle
m  stirrer
k  column
0  initial point
1  transition point between the porous layer period and the first period of fluidized-bed drying
2  transition point between the first and second periods of fluidized-bed drying
a  zone of cold air
b  zone of heated air
c  zone of chilled wet air
wall  wall

references
[1] kim, j., han, g.y.: effect of agitation on fluidization characteristics of fine particles in a fluidized bed. powder technol. 2006; 166:113–122. doi: 10.1016/j.powtec.2006.06.001
[2] daud, w.r.w.: fluidized bed dryers – recent advances. adv. powder technol. 2008; 19:403–418. doi: 10.1163/156855208x336675
[3] ying han, jia-jun wang, xue-ping gu, lian-fang feng, guo-hua hu: homogeneous fluidization of geldart d particles in a gas-solid fluidized bed with a frame impeller. industrial & engineering chemistry research. 2012; 51(50):16482–16487. doi: 10.1021/ie301574q
[4] ying han, jia-jun wang, xue-ping gu, lian-fang feng, guo-hua hu: effect of agitation on the fluidization behavior of a gas-solid fluidized bed with a frame impeller. aiche journal. 2013; 59:1066–1074. doi: 10.1002/aic.13893
[5] mega.cz, ralex heterogeneous ionex membranes (in czech). qartin s.r.o. [online]. available: http://www.mega.cz/heterogenni-iontomenicove-membrany-ralex.html [accessed 20 11 2012].
[6] m. j. hudson, p. a. williams: recent developments in ion exchangers, ion exchangers in the nuclear industry. springer netherlands. 1990.
[7] l. jelínek: desalination and separation methods in water treatment (in czech). prague institute of chemical technology. 2009.
[8] dow, dowex marathon a, 26 11 2010. [online]. available: http://www.dowwaterandprocess.com/products/ix/dx_mar_a.htm [accessed 22 05 2011].
[9] j. chyský: moist air (in czech). publishing house of the czech technical university in prague. 1977.
[10] j. šesták, j. bukovský and m. houška: thermal processes: transport and thermodynamic data (in czech). publishing house of the czech technical university in prague. 1993.
[11] y. s. touloukian: thermophysical properties of matter, vol. 11. ifi/plenum press, 1975.
[12] r. t. goroško: izv. vuzov, neft i gaz. 1958; 1:125.
[13] a. s. mujumdar: handbook of industrial drying. dekker. 1995.
[14] v. hlavačka: thermal processes in technical systems of gas–solid particles (in czech). publishing house of technical literature. 1980.
[15] c. wen and y. yu: a generalized method for predicting the minimum fluidization velocity. aiche journal. 1966; 12:610–612.

acta polytechnica 54(3):231–239, 2014

tracking down localized modes in pt-symmetric hamiltonians under the influence of a competing nonlinearity

bijan bagchi (a,∗), subhrajit modak (b), prasanta k. panigrahi (b)

a department of applied mathematics, university of calcutta, 92 acharya prafulla chandra road, kolkata-700 009, india
b department of physical science, indian institute of science, education and research (kolkata), mohanpur, west bengal 741 252, india
∗ corresponding author: bbagchi123@rediffmail.com

acta polytechnica 54(2):79–84, 2014, doi:10.14311/ap.2014.54.0079

abstract. the relevance of parity and time reversal (pt)-symmetric structures in optical systems has been known for some time, with a correspondence existing between the schrödinger equation and the paraxial equation of diffraction, where the time parameter represents the propagation distance and the refractive index acts as the complex potential. in this paper, we systematically analyze a normalized form of the nonlinear schrödinger system with two new families of pt-symmetric potentials in the presence of competing nonlinearities. we generate a class of localized eigenmodes and carry out a linear stability analysis on the solutions. in particular, we find an interesting feature of bifurcation, characterized by the parameter of the perturbative growth rate passing through zero, where a transition to imaginary eigenvalues occurs.

keywords: nonlinear schrödinger equation, pt-symmetry, competing nonlinearity.

1. introduction

following bender and boettcher's seminal paper [1], in which they offered the first coherent explanation of a special class of non-hermitian but parity- and time-reversal (pt)-symmetric hamiltonians possessing a real bound-state spectrum, the last decade has witnessed extensive theoretical work [2–4] devoted to this growing field of research. the interplay between the parametric regions where pt is unbroken and the regions in which pt is broken, as signaled by the appearance of complex-conjugate eigenvalues (see, for example, [5–7]), has for some time found repeated experimental support [8–17], as evidenced by the observations of a phase transition that clearly marks the separation of these regions. it is useful to bear in mind that the analytical studies in this regard have mostly been carried out in the linear domain.
of late, the relevance of the pt structure has been noticed in various optical systems, and interesting features have been observed, such as power oscillations [9], unidirectional invisibility [13], coherent perfect absorption [12, 18], giant wave amplification [19] and realization through electromagnetically induced transparency [20]. in optical systems, pt-symmetry has the implication that the index guiding part $n_R(x)$ and the gain/loss profile $n_I(x)$ of the complex refractive index $n(x) = n_R(x) + in_I(x)$ obey the symmetric $n_R(x) = n_R(-x)$ and antisymmetric $n_I(x) = -n_I(-x)$ combinations (see, for example, [21–23]). balancing gain and loss [24–27] is an interesting avenue towards the experimental realization of pt-symmetric hamiltonians. against the background of the experimental findings, musslimani et al. [24, 25] have reported optical solitons in pt-periodic potentials which are stable over a wide range of potential parameters. specifically, they considered optical wave propagation with the beam evolution controlled by a normalized nonlinear schrödinger (nls) equation defined in terms of an electric field envelope and a scaled propagation distance. indeed, the generalized nls that they consider, in the presence of a pt-symmetric potential, is given by
$$i\psi_z + \psi_{xx} + \left[V(x) + iW(x)\right]\psi + g|\psi|^2\psi = 0,\tag{1}$$
with the pt-symmetric potential possessing the usual properties [28] $V(-x) = V(x)$ and $W(-x) = -W(x)$. in (1), ψ represents the electric field envelope, z is a scaled propagation distance, and g = 1 or −1 corresponds to a self-focusing or a defocusing nonlinearity, respectively. further, the $\psi_{xx}$ term describes the optical diffraction, $V(x)$ is the index guiding and $W(x)$ represents the gain/loss distribution of the optical potential. musslimani et al. [24, 25] studied nonlinear stationary solutions of the form $\psi(x,z) = \phi(x)\exp(i\lambda z)$, λ being a real propagation constant and φ the nonlinear eigenmode. in the context of nonlinear optics, localized modes are either temporal or spatial, depending on whether the confinement of light occurs in time or in space during wave propagation. in particular, spatial modes represent propagating transverse self-guided beams orthogonal to the direction of movement. because of a balance between diffraction and the kerr effect, a spatial mode does not change with propagation; such modes are termed spatial solitons. both types of solitons emerge from a nonlinear change in the refractive index of an optical material induced by the light intensity, a phenomenon referred to as the optical kerr effect. the intensity dependence of the refractive index leads to spatial self-focusing (or self-defocusing) and temporal self-phase modulation, the two major nonlinear effects responsible for the formation of optical solitary modes, or optical solitons [29]. in this article we report some new localized solutions of the nls and study the distribution of the eigenmodes in the real and complex plane by incorporating higher-degree nonlinear effects over and above the minimal cubic term.
by parametrizing the coupling strength of the latter and arbitrarily specifying the order of the additional nonlinearity on a rosen-morse potential, we observe numerically, for one class of solutions, the existence of a threshold value of the growth rate parameter beyond which suitably chosen pairs of discrete eigenmodes on the real axis merge and subsequently appear as conjugate imaginary pairs, exhibiting the qualitative character of a bifurcation. in this connection it needs to be pointed out that our model differs significantly from those advanced so far in the search for solitonic solutions [30, 31]. for instance, the potentials of our interest are markedly different from the rosen-morse type considered in [31] because of the presence of an additional nonlinear term in our case. the pt-symmetric potentials addressed in [30] are basically nonlinear extensions of scarf ii. also, the above aspect of bifurcation did not arise in the models considered in [30, 31].

2. mathematical model and formulation

we consider optical wave propagation in the presence of a pt-symmetric potential. in this case the beam dynamics is governed by a generalized nonlinear schrödinger model with competing nonlinearities, i.e.,
$$i\frac{\partial\psi}{\partial z} + \frac{\partial^2\psi}{\partial x^2} + \left[V(x) + iW(x)\right]\psi + g_1|\psi|^2\psi + g_2|\psi|^{2\kappa}\psi = 0,\tag{2}$$
where κ is an arbitrary real number, ψ(x, z) is the complex electric field envelope, and g1 and g2 control the strengths of the cubic and the arbitrary nonlinear term, respectively. equation (2) admits stationary solutions $\psi(x,z) = \phi(x)e^{i\lambda z}$, where λ is a real propagation constant and the complex function φ(x) obeys the eigenvalue equation
$$\frac{\partial^2\phi}{\partial x^2} + \left[V(x) + iW(x)\right]\phi + g_1|\phi|^2\phi + g_2|\phi|^{2\kappa}\phi = \lambda\phi.\tag{3}$$
we now show that this model supports two different soliton solutions, marked as the class i and class ii cases, provided we do not alter the imaginary part of the potential but only choose the real part appropriately.

2.1. class i solutions

we focus on a pt-symmetric rosen-morse potential $-a(a+1)\,\mathrm{sech}^2 x + 2ib\tanh x$ (a, b ∈ ℝ) subjected to an additional term $-V_1\,\mathrm{sech}^{2\kappa}x$ (V1, κ ∈ ℝ), i.e.,
$$V(x) = -a(a+1)\,\mathrm{sech}^2 x - V_1\,\mathrm{sech}^{2\kappa}x,\qquad W(x) = 2b\tanh x.\tag{4}$$
it may be noted that the pt-symmetric rosen-morse potential has been well studied in the literature (see [32] and references therein). a noteworthy feature of this potential is that while its real component vanishes asymptotically, its imaginary part remains finite and nonzero. corresponding to (4) there always exists for (3) a typical solution
$$\phi(x) = \phi_0\,\mathrm{sech}(x)\,e^{i\mu x},\tag{5}$$
provided that the amplitude φ0 and the phase factor µ are related to the potential parameters through
$$\phi_0^{2\kappa} = \frac{V_1}{g_2},\qquad \phi_0^2 = \frac{a^2+a+2}{g_1},\qquad b = \mu,\qquad \lambda = 1-\mu^2.\tag{6}$$
note that the imaginary strength of the potential contributes only through the phase factor, whereas in the real part both the parameters a and V1, along with the nonlinearity parameters g1 and g2, appear with odd powers, and thus their sign is not of relevance. for this solution the transverse power flow, defined by
$$S = \frac{i}{2}\left(\phi\,\phi_x^* - \phi^*\phi_x\right),\tag{7}$$
turns out to be $S = b\,\phi_0^2\,\mathrm{sech}^2(x)$, implying that the transmission depends upon the strength of the imaginary part of the potential.
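the constraints (6) can be checked numerically. the sketch below (ours, with sample parameter values) builds the class i mode of eq. (5) on a grid and verifies that the residual of the stationary equation (3) vanishes to discretization accuracy:

```python
import numpy as np

a, mu, kappa, g1, g2 = 0.5, 0.3, 2.0, 1.0, 1.0   # sample values
phi0 = np.sqrt((a**2 + a + 2) / g1)              # eq. (6)
V1 = g2 * phi0**(2 * kappa)                      # eq. (6)
b, lam = mu, 1 - mu**2                           # eq. (6)

x = np.linspace(-10, 10, 4001)
dx = x[1] - x[0]
sech = 1 / np.cosh(x)
phi = phi0 * sech * np.exp(1j * mu * x)                   # eq. (5)
V = -a * (a + 1) * sech**2 - V1 * sech**(2 * kappa)       # eq. (4)
W = 2 * b * np.tanh(x)

phi_xx = (np.roll(phi, -1) - 2 * phi + np.roll(phi, 1)) / dx**2
res = (phi_xx + (V + 1j * W) * phi + g1 * np.abs(phi)**2 * phi
       + g2 * np.abs(phi)**(2 * kappa) * phi - lam * phi)
print(np.max(np.abs(res[2:-2])))   # small (~1e-5), shrinking as dx**2
```

the residual is limited only by the second-order finite difference, confirming that (5) with (6) is an exact solution of (3).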
2.2. class ii solutions

on the other hand, if we choose the extended rosen-morse potential in the form
$$V(x) = -a(a+1)\,\mathrm{sech}^2 x - V_1\,\mathrm{sech}^{2/\kappa}x,\qquad W(x) = 2b\tanh x,\qquad \kappa\in\mathbb{R}\setminus\{0\},\tag{8}$$
then eq. (3) enjoys a solution
$$\phi(x) = \phi_0\,\mathrm{sech}^{1/\kappa}x\;e^{i\mu x}\tag{9}$$
if the amplitude and the phase factor are constrained by the relations
$$\phi_0^2 = \frac{V_1}{g_1},\qquad b = \frac{\mu}{\kappa},\qquad \lambda = \frac{1}{\kappa^2}-\mu^2,\qquad \phi_0^{2\kappa} = \frac{1}{g_2}\left[a(a+1) + \frac{1}{\kappa} + \frac{1}{\kappa^2}\right].\tag{10}$$
note that the solution (10) is valid irrespective of the signs of g1 and g2. the transverse power flow, defined by eq. (7), turns out to be $S = b\,\phi_0^2\,\kappa\,\mathrm{sech}^{2/\kappa}x$, which in this case is influenced by both b and κ, including of course the effects of their signs.

3. numerical computations and eigenmode distribution

solitary waves associated with non-kerr nonlinear media retain their shape, but their stability is not guaranteed because of the nonintegrable nature of the underlying extended nls equation. in fact, their stability against small perturbations is an important issue, because only stable (or weakly unstable) self-trapped beams can be observed experimentally. let us impose a small perturbation to determine whether a solution is stable against a slight disturbance. more specifically, we consider a perturbation of the form [30]
$$\psi(x,z) = \phi(x)e^{i\lambda z} + \left\{\left[v(x)+\omega(x)\right]e^{\eta z} + \left[v^*(x)-\omega^*(x)\right]e^{\eta^* z}\right\}e^{i\lambda z},\tag{11}$$
where v(x) and ω(x) are infinitesimal perturbation eigenfunctions, |v|, |ω| ≪ |φ|, and η indicates the perturbation growth rate. linearization of eq. (2) around φ(x) yields the eigenvalue problem
$$\begin{pmatrix} 0 & \hat L_0\\ \hat L_1 & 0 \end{pmatrix}\begin{pmatrix} v\\ \omega\end{pmatrix} = -i\eta\begin{pmatrix} v\\ \omega\end{pmatrix},\tag{12}$$
where $\hat L_0 = \partial_{xx} - \lambda + (V+iW) + g_1|\phi|^2 + g_2|\phi|^{2\kappa}$ and $\hat L_1 = \partial_{xx} - \lambda + (V+iW) + 3g_1|\phi|^2 + g_2(1+2\kappa)|\phi|^{2\kappa}$, and η is the eigenvalue corresponding to the growth rate parameter. the η-spectrum is called the linear-stability spectrum of the localized modes. it is easy to see that if η is an eigenvalue, then so are η*, −η and −η*, indicating that these eigenvalues always appear in pairs or quadruples. the continuous spectrum of eq. (12) can be readily recovered in the large-distance limit |x| → ∞; in this limit, $\hat L_0$ and $\hat L_1$ reduce to differential operators with constant coefficients, and for η to be in the continuous spectrum, the corresponding eigenfunction at large distance must be a fourier mode. inspecting the orientation of the eigenmodes in the entire spectrum, we encounter three different kinds of modes. the appearance of nonzero discrete eigenvalues in the linearization spectrum of solitary waves is a signature of the nonintegrable character of the equation. if the spectrum contains a real positive eigenvalue, the corresponding eigenmode in the perturbed solution will grow exponentially with time, and the solitary wave is linearly unstable; more generally, any eigenvalue with a positive real part is unstable. secondly, if the spectrum admits a pair of complex-conjugate eigenvalues (internal modes), the perturbed solution will exhibit oscillations, leading eventually to shape fluctuations that are damped out with time. thirdly, one can encounter zero eigenvalues, the so-called goldstone modes (see [30] for a discussion of this point). the behavior of the eigenvalues η can be ascertained by solving eq. (12) numerically; here we adopt the fourier collocation method [33, 34] to track the tendencies of the eigenvalues.
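the following compact sketch (ours, not the implementation of [33, 34]; the parameters are sample class i values) assembles eq. (12) with fourier-collocation spectral second derivatives and extracts the growth rates:

```python
import numpy as np

N, Lbox = 256, 40.0
x = Lbox / N * np.arange(-N // 2, N // 2)
k = 2 * np.pi / Lbox * np.fft.fftfreq(N, d=1.0 / N)   # wavenumbers 2*pi*j/L
F = np.fft.fft(np.eye(N), axis=0)                     # dft matrix
D2 = np.linalg.inv(F) @ np.diag(-k**2) @ F            # spectral d^2/dx^2

a, mu, kappa, g1, g2 = 0.5, 0.3, 2.0, 1.0, 1.0        # sample class i values
phi0 = np.sqrt((a**2 + a + 2) / g1)                   # eq. (6)
V1, lam = g2 * phi0**(2 * kappa), 1 - mu**2           # eq. (6)
sech = 1 / np.cosh(x)
phi2 = (phi0 * sech)**2                               # |phi|^2 of the mode (5)
VW = -a * (a + 1) * sech**2 - V1 * sech**(2 * kappa) + 2j * mu * np.tanh(x)

L0 = D2 + np.diag(VW - lam + g1 * phi2 + g2 * phi2**kappa)
L1 = D2 + np.diag(VW - lam + 3 * g1 * phi2 + g2 * (1 + 2 * kappa) * phi2**kappa)
M = np.block([[np.zeros((N, N)), L0], [L1, np.zeros((N, N))]])
eta = 1j * np.linalg.eigvals(M)        # eq. (12): M u = -i eta u
print(np.max(eta.real))                # a positive value signals instability
```

scanning a parameter (e.g. the potential parameter a) in a loop over this routine is the kind of computation that produces the eigenvalue maps discussed next.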
we now turn to a discussion of our results.

4. results and discussion

figures 1–4 give a graphical display of our numerical results for the eigenmode distribution. the interplay between the cubic and the competing nonlinearity in the soliton dynamics is best understood in terms of the parameters φ0, |g1|, |g2|. we also look for stability around some specific value of κ, as mentioned in the figure captions. it should be borne in mind that eqs. (6) and (10) constrain these parameters in terms of the amplitude φ0 and the phase factor µ, as well as the coupling constants of the potential.

the evolution of a class i solution for different choices of g1 and g2 is laid out in fig. 1. we start with a sample choice of a, b and g2 (for the specific values see the figure captions), for which the discrete modes lie on the real axis. of course, a continuous change of the parameter a will ultimately result in the eigenvalues acquiring a non-vanishing imaginary part (discussed below). normalizing φ0 to unity without any loss of information fixes g1 automatically, while b, λ and µ acquire their values from the consistency conditions (6). with this parametrization, g1 and g2 differ in sign, and the magnitude of g1 turns out to be weaker than that of g2.

figure 1. numerically computed eigenmode distribution for a = .01, b = .3, v1 = g2 = −4, g1 = 2.01 and κ = 3. the value of g1 follows from the choice of the potential parameter a.

in fig. 2 the plots are arranged sequentially according to the varying strength of the coupling of the cubic nonlinear term, holding g2 fixed; g1 changes according to the potential parameter a. a new type of solution develops here due to the sensitive dependence of the perturbative growth rate parameter η on a: around a = 0.03 we see that, as the parameter a is varied, the discrete modes initially lying on the real axis approach the zero-mode eigenvalue; a further change in a then causes a pair of imaginary eigenvalues to develop, revealing a typical feature of bifurcation.

figure 2. sequentially computed eigenmode behavior for b = .003, v1 = g2 = −4 and κ = 3. the coupling parameter of the cubic nonlinearity is varied against the potential parameter a continuously from a = .03 to .09; four values, a = .03, .04, .05 and .09, are shown, arranged from the lowest value of a on the left to the highest on the right.

we next carry out computations for equal and opposite values of the couplings |g1| and |g2|. in fig. 3 we see that in such a case the real eigenmodes render the solitonic solutions unstable.

figure 3. unstable modes for coupling parameters |g1| and |g2| of equal and opposite strength: distribution of eigenmodes for a = 1, b = .003, g1 = 4, v1 = g2 = −4 and κ = 3.

a similar situation exists in fig. 4, where an oscillatory instability along the imaginary axis is caused by equal-strength coupling parameters of the two nonlinear terms.

figure 4. left: unstable modes for equal-strength couplings, g1 = 4 and g2 = 4, obtained from (6) with κ = 3; the potential parameters are a = 1 and b = .003. right: the corresponding evaluation for a class ii solution under the same parametrization, with coupling strengths g1 = 4 and g2 = 2.44 obtained from (10).
finally, let us point out that class ii solutions can be evaluated under various parametric conditions. here we inevitably run into the unstable character of the soliton solutions (fig. 4 presents one such situation in comparison with its class i counterpart). it should be emphasized that, for the various runs of the parameters, we were unable to track down any feature of bifurcation characterized by the growth rate parameter crossing the zero threshold and transiting to the complex plane.

acknowledgements

we thank dr. abhijit banerjee for useful discussions and the referee for making a number of constructive suggestions.

references
[1] c. m. bender and s. boettcher, phys. rev. lett. 80, 5243 (1998). doi: 10.1103/physrevlett.80.5243
[2] c. m. bender, contemp. phys. 46, 277 (2005). doi: 10.1080/00107500072632
[3] c. m. bender, rep. prog. phys. 70, 947 (2007). doi: 10.1088/0034-4885/70/6/r03
[4] a. mostafazadeh, int. j. geom. methods mod. phys. 7, 1191 (2010).
[5] m. znojil, phys. lett. a 285, 7 (2001).
[6] b. bagchi, s. mallik and c. quesne, mod. phys. lett. a 17, 1651 (2002). doi: 10.1142/s0217732302008009
[7] k. abhinav and p. k. panigrahi, ann. phys. (n.y.) 325, 1198 (2010); ibid. 326, 538 (2011).
[8] a. guo, g. j. salamo, d. duchesne, r. morandotti, m. volatier-ravat, v. aimez, g. a. siviloglou and d. n. christodoulides, phys. rev. lett. 103, 093902 (2009). doi: 10.1103/physrevlett.103.093902
[9] c. e. rüter, k. g. makris, r. el-ganainy, d. n. christodoulides, m. segev and d. kip, nat. phys. 6, 192 (2010). doi: 10.1038/nphys1515
[10] j. rubinstein, p. sternberg and q. ma, phys. rev. lett. 99, 167003 (2007). doi: 10.1103/physrevlett.99.167003
[11] k. f. zhao, m. schaden and z. wu, phys. rev. a 81, 042903 (2010). doi: 10.1103/physreva.81.042903
[12] y. d. chong, l. ge and a. d. stone, phys. rev. lett. 106, 093902 (2011). doi: 10.1103/physrevlett.106.093902
[13] z. lin, h. ramezani, t. eichelkraut, t. kottos, h. cao and d. n. christodoulides, phys. rev. lett. 106, 213901 (2011). doi: 10.1103/physrevlett.106.213901
[14] c. zheng, l. hao and g. l. long, arxiv:1105.6157 [quant-ph].
[15] s. bittner, b. dietz, u. günther, h. l. harney, m. miski-oglu, a. richter and f. schaefer, phys. rev. lett. 108, 024101 (2012). doi: 10.1103/physrevlett.108.024101
[16] j. schindler, a. li, m. c. zheng, f. m. ellis and t. kottos, phys. rev. a 84, 040101 (2011). doi: 10.1103/physreva.84.040101
[17] a. szameit, m. c. rechtsman, o. bahat-treidel and m. segev, phys. rev. a 84, 021806(r) (2011). doi: 10.1103/physreva.84.021806
[18] s. longhi, phys. rev. a 82, 031801 (2010). doi: 10.1103/physreva.82.031801
[19] v. v. konotop, v. s. shchesnovich and d. a. zezyulin, phys. lett. a 376, 2750 (2012).
[20] h.-j. li, j.-p. dou, j.-j. yang and g. huang, pt symmetry via electromagnetically induced transparency, arxiv:1307.2695 [quant-ph].
[21] s. klaiman, u. günther and n. moiseyev, phys. rev. lett. 101, 080402 (2008). doi: 10.1103/physrevlett.101.080402
[22] o. bendix, r. fleischmann, t. kottos and b. shapiro, phys. rev. lett. 101, 080402 (2008).
[23] a. regensburger, c. bersch, m. miri, g. onishchukov and d. n. christodoulides, nature 488, 167 (2012). doi: 10.1038/nature11298
[24] z. h. musslimani, k. g. makris, r. el-ganainy and d. n. christodoulides, phys. rev. lett. 103, 030402 (2009). doi: 10.1103/physrevlett.103.030402
[25] z. h. musslimani, k. g. makris, r. el-ganainy and d. n. christodoulides, j. phys. a: math. theor. 41, 244019 (2008).
[26] k. g. makris, r. el-ganainy, d. n. christodoulides and z. h. musslimani, phys. rev. lett. 100, 103904 (2008). doi: 10.1103/physrevlett.100.103904
[27] r. el-ganainy, k. g. makris, d. n. christodoulides and z. h. musslimani, opt. lett. 32, 2632 (2008).
[28] b. bagchi and r. roychoudhury, j. phys. a: math. theor. 33, l1 (2000).
[29] g. p. agrawal, nonlinear fiber optics (elsevier science, 2012).
[30] a. khare, s. m. al-marzoug and h. bahlouli, phys. lett. a 376, 2880 (2012).
[31] b. midya and r. roychoudhury, phys. rev. a 87, 045803 (2013). doi: 10.1103/physreva.87.045803
[32] g. lévai and e. magyari, j. phys. a: math. theor. 42, 195302 (2009). doi: 10.1088/1751-8113/42/19/195302
[33] j. yang, nonlinear waves in integrable and non-integrable systems (siam, 2010).
[34] j. yang, j. comp. phys. 227, 6862 (2008). doi: 10.1016/j.jcp.2008.03.039

acta polytechnica 54(2):79–84, 2014

"pi of the sky" off-line experiment with gloria

ariel majcher (a,∗), arkadiusz ćwiek (a), mikołaj ćwiok (b), lech mankiewicz (c), marcin zaremba (b), aleksander f. żarnecki (b)

a national centre for nuclear research, hoża 69, 00-681 warsaw, poland
b faculty of physics, university of warsaw, hoża 69, 00-681 warsaw, poland
c center for theoretical physics, polish academy of sciences, al. lotników 32/46, 02-668 warsaw, poland
∗ corresponding author: majer@cft.edu.pl

acta polytechnica 54(3):205–209, 2014, doi:10.14311/ap.2014.54.0205

abstract.
gloria is the first free and open-access network of robotic telescopes in the world. based on the web 2.0 environment, amateur and professional users can do research in astronomy by observing with a robotic telescope and/or by analyzing data acquired with gloria or from other free-access databases. the gloria project develops free standards, protocols and tools for controlling robotic telescopes and related instrumentation, for scheduling observations in the telescope network, and for conducting so-called off-line experiments based on the analysis of astronomical data. this contribution summarizes the implementation and results of the first research-level off-line demonstrator experiment implemented in gloria, which was based on data collected with the “pi of the sky” telescope in chile.

keywords: telescope network, citizen science, web 2.0, image processing, data analysis, robotic telescopes.

1. introduction
gloria [1] (global robotic-telescope intelligent array) is an innovative citizen-science network of robotic observatories, which will give free access to professional telescopes to a virtual community via the internet. contributions to gloria are made by 13 partners with 17 robotic telescopes scattered over 7 countries all over the world (see fig. 1). this network will allow for continuous observations of celestial objects from different locations: when the object of interest sets in one location, an observer is able to continue observations from another location. this should allow an object to be observed for up to 24 hours continuously, providing a new quality in astronomical observations. in addition, there will often be an opportunity to observe celestial objects simultaneously from two or more observatories.

one of the challenges we have to face in designing the environment for gloria off-line experiments is dealing with huge amounts of data and a large variety of analysis tasks. luiza [2], an efficient and flexible analysis framework for gloria, has been developed based on a concept taken from high energy physics. the basic data classes, framework structure and data processing functionality are implemented, as well as selected data processing algorithms. the framework was used to implement the first gloria research-level off-line demonstrator experiment, which is described in this contribution.

figure 1. location of observatories that contribute to the gloria project [1].

2. demonstrator experiments
gloria allows users to run experiments in the network. these experiments can be divided into two general types:
• on-line experiments, for making observations with robotic telescopes. experiments can involve teleoperation of the telescope or making scheduled sky observations in the network.
• off-line experiments, for analysing the collected data on a basic level (for education or outreach) or doing more advanced image analysis (research level).

figure 2. fields of view selected for the off-line demonstrator experiment drawn on the stellarium [4] map (left), and their common part, as seen on a “pi of the sky” image (right).

to present the network capabilities and the performance of the tools, demonstrator experiments have been created in both categories.
to participate in these experiments, it is only necessary to create a user account via the project's website [1]; after signing in, the user can choose among the various on-line and off-line experiments.

the tad (telescopio abierto divulgacion) robotic solar telescope at the observatorio del teide in tenerife (canary islands) was the first gloria telescope made available to users as an on-line demonstrator experiment. gloria provides the interface for users to schedule observation time and then to access and control the telescope remotely and make observations. the user gets almost full control over the telescope performance, including telescope pointing, focus adjustment, gain control and exposure settings.

3. “pi of the sky” project
the “pi of the sky” project involves two observatories, one in chile and the other in spain [3]. in both observatories we use custom-designed cameras with a 2000 × 2000 pixel matrix and canon lenses (f = 85 mm, f/d = 1.2). all cameras in spain and one camera in chile have only standard uv and infrared cut filters installed, while the second camera in chile has an r johnson-bessel filter. data from “pi of the sky” was used to implement the research-level off-line demonstrator experiment for gloria, focusing on light curve reconstruction and classification of variable objects. the experiment is based on pre-selected data from the telescope in chile. the analysis is done using the luiza framework, designed within gloria for efficient analysis of astronomical data.

the “pi of the sky” telescope takes sky images with a 10 s exposure time. however, much better photometry accuracy is obtained from sums of 20 subsequent frames, corresponding to a 200 s exposure time. stacked images taken from 2006 to 2009 at las campanas observatory (lco) and in 2012 and 2013 at san pedro de atacama (spda), chile, were selected for the demonstrator experiments. after visual inspection and image pre-processing (see the following section), about 500 images remained. the field of view of the “pi of the sky” cameras is 20° × 20°. the selected frames correspond to 4 overlapping observation fields, and the central region of about 10° × 10°, visible on all frames, was selected as the subject of the analysis. the chosen fields of view and their common part are shown in fig. 2.

4. luiza framework
luiza is a simple modular application framework for the development of image reduction and analysis tools. it is based on the following assumptions: (1.) each data (image) analysis can be divided into small, well-defined steps, implemented as so-called processors; (2.) each step has to have a well-defined input and output data structure; (3.) by defining universal data structures we make sure that different processors can be connected in a single analysis chain; (4.) the processor configuration and parameters can be set by the user at run time in a simple steering file.

the luiza framework implements all the basic data structures required for image analysis: gloriafitsimage, a class for storing fits images, which uses the fitsio library for reading and storing images and implements basic methods for image manipulation; gloriafitstable, a flexible class for storing other data (integers, floats, strings, vectors of int/float); and gloriadatacontainer, an internal storage class in which images and tables are stored in collections, each collection having a unique name (string).

to use the luiza framework, each user has to create a steering file (a gui is provided), in which an analysis chain can be defined by selecting processors and defining their order. the steering file also allows the user to define input-output streams and set other processor parameters. each luiza processor gets a pointer to a global gloriadatacontainer; it can create a new collection when reading data from a file, analyse data stored in memory, and in the end save the analysis results to output files. an example scheme of image processing in the luiza framework is shown in fig. 3.

figure 3. an example of the processing concept in the luiza framework.
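luiza itself is a c++ framework, and neither its steering-file format nor its processor api is reproduced in this paper. the python sketch below is only meant to illustrate the processor-chain pattern described above (named processors with runtime parameters acting on a shared data container); every name in it is hypothetical and is not the actual luiza api.

```python
# illustrative sketch of the processor-chain pattern described above;
# all names here are hypothetical, not the actual luiza api.

class DataContainer:
    """shared storage: named collections of images or tables."""
    def __init__(self):
        self.collections = {}

class Processor:
    """one well-defined analysis step with runtime parameters."""
    def __init__(self, **params):
        self.params = params
    def process(self, data):
        raise NotImplementedError

class ImageReader(Processor):
    def process(self, data):
        # a real processor would read fits files into the container
        data.collections[self.params["output"]] = ["frame1", "frame2"]

class ClusterFinder(Processor):
    def process(self, data):
        frames = data.collections[self.params["input"]]
        # stand-in for thresholding pixels above k times the noise level
        data.collections[self.params["output"]] = [f"objects({f})" for f in frames]

def run_chain(chain, data):
    """the role of the steering file: fix processors, order and parameters."""
    for processor in chain:
        processor.process(data)

data = DataContainer()
run_chain([ImageReader(output="raw"),
           ClusterFinder(input="raw", output="objects", threshold=8.0)],
          data)
print(data.collections["objects"])
```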
5. implementation of the experiment
the “pi of the sky” images selected for the off-line demonstrator experiment are analysed within the luiza framework. unfortunately, the full analysis chain, starting from a raw image and resulting in the light curve of the selected object, is quite time consuming. therefore we decided to divide the image analysis into two steps: image preprocessing and object light curve reconstruction. preprocessing (image stacking, dark subtraction, flat correction, object finding and astrometry) is done only once, while setting up the experiment; the object light curve reconstruction is run in response to each user request.

object finding is done with the pixelclusterfinder processor. the algorithm searches for groups of pixels (called clusters) with a signal above the defined threshold. in the current analysis we use thresholds of 8 and 3 times the noise level. to define the thresholds, the background level and the average noise level are calculated first. to correct for significant background variation over the frame, the background level is calculated in 4 × 4 subframes and then interpolated. finally, the cluster signal and the position on the ccd are calculated. the output of the cluster finding processor is stored as a gloriaobjectlist table.

the astrometry processor is based on the astrometry.net algorithm for finding the frame orientation and the transformation for calculating object positions in the sky. the processor analyses the gloriaobjectlist table and calculates the parameters of the position transformation. in the final step, the positions of all objects in the list are calculated (ra, dec) and added to the object list table. the final object lists are stored in binary fits tables.

the light curve construction is based on coordinates specified by the user (objects are selected from a given range in both right ascension and declination). however, when measurements from different fields are combined, significant systematic effects appear. they most probably result from significant psf deformations over the “pi of the sky” field of view and from the dependence of the psf on the spectral type of the reference star (resulting from the wide spectral acceptance of the “pi of the sky” apparatus, where only a uv + ir cut filter is used). more precise calibration is obtained when a larger set of stars from the same part of the frame is used. after dedicated tests, we decided to use the “pi of the sky” reference star list, but to limit it to stars between 7 and 8.5 mag. one has to note that although “pi of the sky” measurements are normalized to v magnitude, there is a non-vanishing spectral dependence due to the wide spectral acceptance mentioned above; as a result, the “pi of the sky” magnitude coincides with that of the v catalogue only for stars with j − k ≈ 0.4.

when multiple reference stars are selected, the photometric calibration can be based on the average reference-star magnitude correction. an advantage of this approach is that the rms of the differences between the individual reference-star corrections and the average correction, ∆, can be used as an estimate of the photometry uncertainty. after applying the calculated correction to the object magnitude, the measurement is added to the light curve with an hjd timestamp.
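the calibration recipe just described reduces to a few lines of arithmetic. the sketch below is our own illustration (hypothetical variable names, not code from the experiment): it computes the average reference-star correction for one frame, the rms spread ∆ used as the uncertainty estimate, and applies the quality cut discussed in the next section.

```python
import numpy as np

def calibrate_frame(instr_mags, catalog_mags, object_mag):
    """return the calibrated object magnitude and the spread Delta.

    instr_mags, catalog_mags: instrumental and catalogue magnitudes of
    the reference stars on one frame; object_mag: instrumental
    magnitude of the measured object.
    """
    corrections = np.asarray(catalog_mags) - np.asarray(instr_mags)
    mean_correction = corrections.mean()
    delta = corrections.std()  # rms scatter of the individual corrections
    return object_mag + mean_correction, delta

# hypothetical frame with three reference stars between 7 and 8.5 mag
mag, delta = calibrate_frame([7.32, 8.11, 8.46], [7.25, 8.02, 8.40], 10.61)
if delta < 0.1:  # the quality cut used in section 6.1
    print(f"accept: v = {mag:.2f}, delta = {delta:.3f} mag")
```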
6. results
6.1. tests with constant stars
the light curve reconstruction procedure was first tested on selected constant stars. the distribution of the estimated correction uncertainty ∆ for one of the considered stars is presented in fig. 4. the distribution has a clear two-peak structure, indicating that for some frames the photometry accuracy was significantly worse, probably due to systematic effects (or the object being near the edge of the ccd in some fields).

figure 4. distribution of the estimated calibration uncertainty ∆ for one constant star.

by using a cut ∆ < 0.1 we can limit ourselves to the best measurements only. the result of such a cut is illustrated in fig. 5, where the magnitude distribution for the same selected star is presented before (dashed line) and after (solid line) the cut. the (properly adjusted) cut has a dramatic impact on the measurement: while reducing the event statistics by about 30 %, it also improves the photometry quality (measurement spread) for the star by a factor of more than 4 (in terms of the rms of the magnitude distribution: from 0.111 mag to 0.026 mag).

figure 5. magnitude distribution for one constant star. the dashed line is before, and the solid line after, the quality cut ∆ < 0.1.

6.2. variable star reconstruction
the approach tested with constant stars was then applied to selected variable stars in the considered field. regular variable stars were selected based on the “pi of the sky” and simbad [5] catalogues. for each star, after removing poor-quality measurements with a cut on the estimated calibration uncertainty, a phased light curve was fitted. selected results are presented in figures 6 and 7. the fit, indicated by a green line, was performed with the cern root package. the light curve was modelled with a fourier series.
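as an aside, a truncated fourier-series model is linear in its coefficients, so the fit itself needs only a least-squares solve. the sketch below shows the idea in python on synthetic data; it is our illustration only, since the fits in figures 6 and 7 were performed with the cern root package.

```python
import numpy as np

def fourier_design(phase, order):
    """design matrix for m(phi) = a0 + sum_k (a_k cos + b_k sin)(2 pi k phi)."""
    cols = [np.ones_like(phase)]
    for k in range(1, order + 1):
        cols.append(np.cos(2 * np.pi * k * phase))
        cols.append(np.sin(2 * np.pi * k * phase))
    return np.column_stack(cols)

def fit_phased_curve(times, mags, period, order=4):
    """phase-fold and fit a truncated fourier series; returns (phase, model)."""
    phase = (np.asarray(times) / period) % 1.0
    design = fourier_design(phase, order)
    coeffs, *_ = np.linalg.lstsq(design, np.asarray(mags), rcond=None)
    return phase, design @ coeffs

# synthetic cepheid-like light curve, just to exercise the code
t = np.sort(np.random.uniform(0.0, 60.0, 300))
m = 7.0 + 0.4 * np.cos(2 * np.pi * t / 7.91) + 0.03 * np.random.randn(t.size)
phase, model = fit_phased_curve(t, m, period=7.91)
```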
figure 6. light curve of w gem, a classical cepheid (δ cep type), as reconstructed with luiza.

figure 7. light curve of v1388 ori, an eclipsing binary of algol type (detached), as reconstructed with luiza.

7. summary
the first on-line and off-line experiments in the gloria network are ready and are already available to users. the first off-line demonstrator experiment has been developed based on “pi of the sky” data collected at las campanas observatory and at the san pedro de atacama observatory (both in chile). the data are pre-processed with luiza [2], an efficient and flexible analysis framework. fast light curve reconstruction, based on the pre-processed data, is also done with luiza and is triggered by a user request sent via a web interface. after successful testing and tuning, the experiment should be released soon on gloria's website [1].

acknowledgements
the research leading to these results has received funding from the european union seventh framework programme (fp7/2007-2013) under grant agreement 283783. the research work has also been funded by the polish ministry of science and higher education in 2011-2014, as a co-funded international project.

references
[1] http://gloria-project.eu/, [2014-06-01].
[2] a. f. żarnecki, l. w. piotrowski, l. mankiewicz, s. małek. luiza: analysis framework for gloria. acta polytechnica 53(1):58, 2013.
[3] a. majcher, m. sokolowski, t. batsch, et al. present status of “pi of the sky” telescopes. in society of photo-optical instrumentation engineers (spie) conference series, vol. 8008, 2011. doi:10.1117/12.905642.
[4] http://www.stellarium.org, [2014-06-01].
[5] http://simbad.u-strasbg.fr/simbad/, [2014-06-01].

acta polytechnica vol. 41 no. 1/2001

experimental investigations on drying of porous media using infrared radiation

a. k. haghi

increased interest is being shown in infrared drying today because of the environmental and technological advantages offered by this method. in order to assess the advantages of this drying process, extensive trials have been carried out. the objective of this investigation was to study the drying rate of infrared drying. this was achieved with the use of a scanning pyrometer and image analysis.

keywords: infrared dryer, porous media, drying rate, image analysis.

1 introduction
drying is one of the most energy-intensive processes in many industrial fields (food industry, paper industry, textile industry, etc.). therefore, studies of this type of problem have become very important and have for several decades attracted the attention of researchers. this article discusses an experimental study of heat transfer during the drying of porous media, using image analysis with the aim of improving process performance.

textile drying machines are among the most expensive and most energy-consuming machines used in wet processing. most of them consist of hot-air dryers from which a proportion of the circulating air is discharged to waste as exhaust air. the air discharged from the chimney stack represents a significant energy loss. infrared dryers have a variety of advantages, in particular when they are used for the treatment of textile sheet material in the wet state. first, they permeate into and heat the textile sheet very quickly. secondly, they heat only the required sections of the object; there is no heating of unnecessary sections, thereby avoiding wastage of thermal energy. thirdly, since infrared irradiation causes an almost simultaneous temperature rise in different sections of an exposed object, regional variations in temperature within the object can be significantly reduced, leading to ideal and uniform heating. finally, adjustment of the output voltage of the infrared source enables simple, easy and swift control of the heating conditions in accordance with the demands of the actual treatment of textile sheet materials.

2 experimental
the experimental setup consisted of a simple 20 kw quartz-tube laboratory infrared dryer. the dryer sends heat to the fabric surface. an infrared camera observes the thermal scene and produces a real-time, monochrome thermal image. the analog video images are recorded on tape for the entire experiment, with later extraction of the temperature data. the ir camera, which is calibrated at a given temperature, maps the spectral radiance onto a thermal image of the fabric surface. following image acquisition, the analog video images are digitized and subjected to image processing routines (algorithms) to improve the signal-to-noise ratio and to transform them into useful data. this image processing system has already been described in detail in previous papers [1–6].
generally, a typical image processing system contains three fundamental elements:
• an image-acquisition element,
• an image-processing element, and
• an image-display element.
a continuous scene is converted to a digital image and stored in compacted memory by the image-acquisition element. this image is displayed in some form by the image-display element for human viewing. the image-processing element is designed for various tasks, which generally fall into three main groups:
• image enhancement,
• image analysis and
• image coding.

the scanning pyrometer rests on the fabric surface, ensuring that the temperature measurement does not result in damage to the fabric. knowing the water content of the fabric before drying ($h_i$) and after drying ($h_f$), one can calculate the global rate of water evaporation from the fabric in the dryer at a given total power of the infrared source [7]:

$\mathrm{evap} = (h_i - h_f)\, s\, l\, d \qquad (1)$

where s = speed of the fabric in the oven (m/s), l = fabric width (m), and d = superficial weight of the dry fabric (g/m²). the energetic efficiency (eff) may be calculated from the evaporation rate using [7]:

$\mathrm{eff} = \frac{\mathrm{evap} \cdot \Delta h}{p} \times 100 \qquad (2)$

where Δh is the latent heat of evaporation of water (j/g). alternatively, evap and p can be expressed in units of g h₂o m⁻² s⁻¹ and w m⁻², respectively. equation (2) predicts that evap will be directly proportional to p provided eff remains constant. the value of eff applies only to this particular dryer and depends on the operating conditions. the characteristics of the various fabrics used in this drying study are given in table 1.

table 1: fabrics used in the drying study
fabric   fiber       weight (g/m²)
a        cotton      130
b        polyester   150

in the drying process, water is evaporated from the exposed yarn surfaces and replaced by water migration from within the fabric structure. during this process, a steady state is established and the fabric temperature and evaporation rate remain almost constant. at the end of the so-called constant rate drying period, the fabric temperature begins to increase and the rate of evaporation decreases, because below a certain critical water content the migration of liquid water, and later water vapor, to the exposed surfaces becomes more difficult and eventually ceases.
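equations (1) and (2) amount to simple unit bookkeeping, which the following sketch makes explicit. the numbers in the example are hypothetical and serve only to illustrate the calculation for a dryer of the stated 20 kw power.

```python
def evaporation_rate(h_i, h_f, s, l, d):
    """eq. (1): evaporated water in g/s.
    h_i, h_f: water content before/after drying (g water per g dry fabric),
    s: fabric speed (m/s), l: fabric width (m), d: dry weight (g/m2)."""
    return (h_i - h_f) * s * l * d

def energetic_efficiency(evap, delta_h, p):
    """eq. (2): efficiency in percent.
    evap: g/s, delta_h: latent heat of evaporation (j/g), p: power (w)."""
    return 100.0 * evap * delta_h / p

# hypothetical run: cotton fabric a (130 g/m2), 1 m wide, moving at 0.05 m/s
evap = evaporation_rate(h_i=0.6, h_f=0.1, s=0.05, l=1.0, d=130.0)
print(energetic_efficiency(evap, delta_h=2257.0, p=20000.0))  # about 37 %
```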
3 results and discussion
the drying curves obtained (fig. 1) show moisture content versus time. the slope of this curve is the drying rate, i.e. the rate at which moisture is being removed. the curve begins with a warm-up period, where the material is heated; the drying rate can be negative in the warm-up period. as the material heats up, the rate of drying increases to a peak rate that is maintained for a period of time known as the constant rate period. eventually, the moisture content of the material drops to a level known as the critical moisture content, where the high rate of evaporation cannot be maintained. this is the beginning of the falling rate period.

fig. 1: moisture content versus time

the greatest effects of humidity on the moisture content versus time curves occur in the warm-up and constant rate periods. once the critical moisture content is reached, the curves are similar. in the second falling rate period, humidity has little effect on the time required to remove an increment of moisture. the warm-up period corresponds to the region where the moisture content is the highest and the drying rate is changing rapidly. the plots reveal that the warm-up period is relatively short and represents a small part of the total drying time.

an image analysis of the surface is shown in figure 2. the shape of the wave is influenced by the pore size distribution of the fabric. the flat region of the surface depicts the highest temperature. in phenomenological terms, the fabric reaches a critical temperature when there is an irreversible shift in the heat balance between the heat generation rate and the heat losses. therefore, factors such as fabric orientation affect the measurement of this parameter. as the weight increases, the temperature of the fabric increases at a slower rate. this is as expected, because increasing fabric thickness increases the internal resistance to heat transfer within the fabric.

fig. 2: surface temperature distribution at 90 seconds and surface isothermal distribution

the surface isotherm distributions are also shown in figure 2. the fabric surface temperature depends on the balance between the rates of heat generation and heat loss. surface radiation is a major part of the total heat losses and depends strongly on the surface temperature: a small change in surface temperature leads to significant changes in the surface heat losses. the surface isotherms vary with the fabric properties, including weight, construction and porosity. the irregularity of the isothermal contour lines is due to surface non-uniformities and yarn structure.

4 conclusions
in this study, the rate of drying was essentially independent of the nature of the fabric type, provided the final water content was not much below the critical value corresponding to the end of the constant rate drying period. the drying rate in the constant rate period decreases with increasing humidity if the temperature is low enough to prevent thermal damage. humidity had little effect in the falling rate period.

references
[1] haghi, a. k.: an experimental approach for study of heat dissipation from the surface of small electric machines. the annals of the stefan cel mare university, romania, 2000, 7 (13), pp. 66–70
[2] haghi, a. k.: a thermal imaging technique for measuring transient temperature field – an experimental approach. the annals of the stefan cel mare university, romania, 1999, 6 (12), pp. 73–76
[3] haghi, a. k. et al.: determination des coefficients de transfert de chaleur lors du sechage. 2nd int. conference on development & application systems proc., romania, 1994, pp. 189–196
[4] haghi, a. k.: determination of heat transfer coefficients during the process of through drying of wet textile materials with an optico-mechanical scanning pyrometer & ir thermograph. 3rd int. conf. on development & application systems proc., romania, 1996, pp. 25–32
[5] haghi, a. k.: controle de materiaux par thermographie infrarouge: modelisation et experiences. 4th int. conf. on development & application systems proc., romania, 1998, pp. 65–76
[6] haghi, a. k. et al.: determination des coefficients de transfert de chaleurs lors du sechage de textiles par thermographie infrarouge et microscopie thermique a balayage. poster presentation, sft, paris, france, 1994
[7] broadbent, a. d.: predrying of textile fabrics with infrared radiation. textile res. journal, 1994, 64 (3), pp. 123–129

dr a. k. haghi, university of guilan, p.o. box 3756, rasht, iran, e-mail: haghi2002@yahoo.com

acta polytechnica vol. 41 no. 1/2001

clinoptilolite in drinking water treatment for ammonia removal

h. m. abd el-hady, a. grünwald, k. vlčková, j. zeithammerová

in most countries today the removal of ammonium ions from drinking water has become almost a necessity. the natural zeolite clinoptilolite is mined commercially in many parts of the world. it is a selective exchanger for the ammonium cation, and this has prompted its use in water treatment, wastewater treatment, swimming pools and fish farming. the work described in this paper provides dynamic data on cation exchange processes in clinoptilolite involving the nh4+, ca2+ and mg2+ cations. we used material of natural origin – clinoptilolite from nižný hrabovec in slovakia (particle size 3–5 mm). the breakthrough capacity was determined by dynamic laboratory investigations, and we investigated the influence of thermal pretreatment of the clinoptilolite and of the concentration of the regenerant solution (2, 5, and 10% nacl). the ammonium ion input concentrations in the tap water that we used were 10, 5, and 2 mg nh4+ l−1, treated down to levels below 0.5 mg nh4+ l−1. the experimental results show that repeated pretreatment sufficiently improves the zeolite's properties, and the structure of clinoptilolite remains unchanged during the loading and regeneration cycles. ammonium removal capacities were increased by approximately 40 % and 20 % for the heat-treated zeolite samples. there was no difference between the regenerants at 10% and 5% nacl. we conclude that the use of zeolite is an attractive and promising method for ammonium removal.

keywords: clinoptilolite, zeolite, ion exchange, ammonia removal, drinking water.

1 introduction
nitrogen is a nutrient essential to all forms of life as a basic building block of plant and animal protein. nevertheless, too much of it can be toxic. the presence of excess nitrogen in the environment has caused serious distortions of the natural nutrient cycle between the living world and the soil, water, and atmosphere. nitrogen in the form of nitrous oxide, nitric oxide, nitrate, nitrite or ammonia/ammonium is soluble in water and can end up in ground water and drinking water. one of the best-documented and best-understood consequences of human alterations of the nitrogen cycle is the eutrophication of estuaries and coastal seas (oenema and roest, 1998). according to the instruction of the european community council (cce) of 15 july 1980, the maximum permissible level of ammonium in drinking water is 0.5 mg l−1.

the processes used for the removal of ammonium ions from drinking waters include air stripping, nitrification, breakpoint chlorination, and ion exchange. the properties of synthetic and natural zeolites as ion exchangers for the removal of ammonium ions have been discussed in great detail (jörgensen, 1976; gaspard and martin, 1983; hlavay et al., 1983; vokáčová et al., 1986; hódi et al., 1995; booker et al., 1996; beler baykal and akca guven, 1997; cooney et al., 1999). the behavior of the exchange and the quantities of ammonium removed are very much dependent on the origin of the material, the impurities contained, and the counter ions, as well as on the pretreatment performed on the zeolite. the objective of our work is to treat drinking water spiked with nh4+ (10 ± 0.5, 5 ± 0.5, and 2 ± 0.5 mg l−1) with the use of clinoptilolite, with and without pretreatment of the mineral.

2 zeolites in drinking water treatment
zeolites are natural minerals which can be described chemically as aluminum silicates. they are used for various applications, e.g., ion exchange, molecular sieves, and air drying. the simplified chemical composition and structure of zeolites can be seen in fig. 1. in the regular structure of silicates a few places are occupied by aluminum ions, which causes an additional charge. this charge is compensated by other ions, like sodium, potassium or ammonium, which are reversibly fixed by interactions and can easily be exchanged for other ions. the composition of this structure leads to the various forms of zeolites. of all these zeolite types, clinoptilolite has the best selectivity for ammonia. preparing clinoptilolite so that all exchange places are filled with sodium ions results in a form with the best selectivity for ammonia (jörgensen et al., 1979; oldenburg and sekoulov, 1995).

fig. 1: simplified chemical composition and structure of zeolites (me^n+_x al_nx si_y o_2(nx+y), y/x > 1; me^n+ = k+, ca2+, na+, mg2+, fe3+, nh4+)

3 materials and methods
physical description of the system
fig. 2 shows the laboratory column set-up, which had the following characteristics: internal diameter = 20 mm, total height of the column = 60 cm, height of the zeolite bed = 33 cm, 100 ml of material at a volumetric flow rate of 8.7 ml min−1 (equivalent to 5.25 bed volumes (bv) per hour), particle size of the material = 3–5 mm, and mass of zeolite in the column = 100 g. a nylon screen supported the zeolite in the column.

fig. 2: laboratory column (reservoir, peristaltic pump, regeneration and service lines, valve, ion exchange column)

analysis
all analyses were made according to standard methods (apha, see greenberg et al., 1992). ammonia was determined by the nessler method using a spectrophotometer (model hach dr/2000). calcium and magnesium were determined by the edta titrimetric method.

specification of the material
two materials were used for the investigation:
a) natural clinoptilolite (clino. 1). origin: nižný hrabovec, slovakia. the chemical composition is given as:

component:  sio2     al2o3    fe2o3   feo     tio2    cao     mgo
content:    69.77 %  12.24 %  2.2 %   0.14 %  0.45 %  1.68 %  1 %

component:  na2o     k2o      mno      p2o5    h2o     total
content:    2.7 %    2.11 %   0.034 %  0.08 %  7.3 %   99.704 %

b) activated clinoptilolite. the activation was performed by heating two samples, which were exposed to an elevated temperature of 400 °c for 6 h. before being heated, the first sample (clino. 2, previously used as natural clinoptilolite) was washed with distilled water, while the second sample (clino. 3) was heated in its natural form.

system operation parameters
the first experiment was carried out to observe the exhaustion performance of clino. 1 using distilled water spiked with nh4+ at a concentration of 10 mg l−1.
then we applied tap water containing ca2+ = 60 mg l−1 and mg2+ = 12 mg l−1. the ammonium input concentrations were 10, 5 and 2 mg nh4+ l−1 (dosed as nh4cl), and the water was treated down to levels below 0.5 mg l−1. clino. 2 and clino. 3 were tested only with tap water containing ca2+, mg2+ and nh4+. in the regeneration phase, a 2.0% nacl solution in the ph range 11.5–12.5, at a volumetric flow rate of 8.7 ml min−1 (equivalent to 5.25 bed volumes (bv) per hour), was investigated for clino. 1 in the first experiment; 5% and 10% nacl solutions were then investigated for clino. 1. for clino. 2 and clino. 3, only 5% nacl was used. after regeneration, the excess cl− was removed from the clinoptilolite with distilled water. this washing was repeated until visual tests with agno3 revealed zero chloride.

4 results and discussion
breakthrough capacity and effects of activated clinoptilolite
within the scope of this work, the results from the first experiment, using distilled water for clino. 1, indicated that the breakthrough capacity, defined at 0.5 mg l−1 nh4+ in the effluent, was 0.47 mol l−1 for nh4+ = 10 mg l−1, as shown in table 1. figs. 3, 4 and 5 and table 2 show that, for tap water containing calcium and magnesium ions, the breakthrough capacities were 0.038, 0.052 and 0.058 mol l−1 for nh4+ = 10, 5 and 2 mg l−1. for clino. 2 the values were 0.048, 0.063 and 0.068 mol l−1 for nh4+ = 10, 5 and 2 mg l−1, and for clino. 3 the values were 0.06, 0.063 and 0.076 mol l−1, respectively. these values indicate that tap water gives a lower breakthrough capacity than distilled water, due to the presence of other ions in the water, especially ions with a polyvalent charge, such as ca2+ and mg2+.

fig. 3: breakthrough curves of the ion exchange resin for tap water containing nh4+ = 10 mg/l, ca2+ = 60 mg/l, mg2+ = 12 mg/l

fig. 4: breakthrough curves of the ion exchange resin for tap water containing nh4+ = 5 mg/l, ca2+ = 60 mg/l, mg2+ = 12 mg/l

these results reveal that thermal activation increases the clinoptilolite selectivity for ammonium ions. the results shown in fig. 6 indicate that the breakthrough capacities of the thermally activated samples are on average roughly 20 % and 40 % higher than the value for untreated clinoptilolite, for clino. 2 and clino. 3 respectively. in addition, these results indicate that ammonium ions can be removed efficiently by the zeolite, and that the exchange capacity of zeolites depends on the initial ammonia concentration at the column inlet.

table 1: important parameters measured during the experiment using distilled water spiked with nh4+ (influent 10 mg l−1) for clino. 1

solution volume (l):      50     61     72     82     88.5
nh4+ effluent (mg l−1):   0.06   0.088  0.094  0.29   0.503
bv:                       500    610    720    820    885
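the bed-volume and capacity figures in tables 1 and 2 follow from straightforward bookkeeping over the treated volume. the sketch below is our own illustration of that arithmetic for the 100 ml column used here; it neglects the small amount of ammonium leaking into the effluent before breakthrough, which is why it slightly overestimates the tabulated capacity.

```python
MOLAR_MASS_NH4 = 18.04  # g/mol

def bed_volumes(treated_l, bed_l=0.1):
    """number of bed volumes treated (100 ml of zeolite in this study)."""
    return treated_l / bed_l

def breakthrough_capacity(treated_l, feed_mg_per_l, bed_l=0.1):
    """ammonium taken up per litre of zeolite bed, in mol/l."""
    grams = treated_l * feed_mg_per_l / 1000.0
    return grams / MOLAR_MASS_NH4 / bed_l

# clino. 1 with tap water and a 10 mg/l feed broke through after 7.25 l
print(bed_volumes(7.25))                  # 72.5 bv, as in table 2
print(breakthrough_capacity(7.25, 10.0))  # ~0.040 mol/l vs 0.038 tabulated
```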
table 2: important parameters measured during the experiment using tap water spiked with ammonia

material                  clino. 1               clino. 2               clino. 3
nh4+ influent (mg l−1)    10     5      2        10     5      2        10     5      2
volume (l)                7.25   21     70       9      25     81       11.25  28     91
bv                        72.5   210    700      90     250    810      112.5  280    910
capacity (mol l−1)        0.038  0.052  0.058    0.048  0.063  0.068    0.06   0.063  0.076

comparing the results summarized in figs. 7 and 8, it can be seen that the removal of calcium and magnesium at the breakthrough level of ammonia was lower for clino. 2 and clino. 3 than for clino. 1 at all ammonia concentrations, which agrees with the selectivity rules for ionic exchange in untreated clinoptilolite zeolite. these results reveal that the removal efficiency for calcium and magnesium was higher for clino. 1 than for clino. 2 and clino. 3.

fig. 5: breakthrough curves of the ion exchange resin for tap water containing nh4+ = 2 mg/l, ca2+ = 60 mg/l, mg2+ = 12 mg/l

fig. 6: exchange capacity

fig. 7: comparison of calcium breakthrough of the ion exchange resin for tap water containing nh4+, ca2+ = 60 mg/l, mg2+ = 12 mg/l

fig. 8: comparison of magnesium breakthrough of the ion exchange resin for tap water containing nh4+, ca2+ = 60 mg/l, mg2+ = 12 mg/l

regeneration effects
the results from the first regeneration, using 2% nacl, indicate that 140 bv of the nacl solution is sufficient for ammonium elution from clino. 1. the elution curves (figs. 9, 10 and table 3) indicate no difference between regeneration with 10% and 5% nacl solutions for clino. 1; 65 bv of the nacl solution is sufficient for ammonium elution. on the other hand, complete elution of ammonia from clino. 2 and clino. 3 requires 110 bv of the 5% nacl solution. these results reveal that clino. 1 is slightly more economical than clino. 2 and clino. 3, and that 5% nacl should be used.

table 3: elution of ammonia from the resins using nacl

clino. 1:
             nacl 2%                  nacl 5%                  nacl 10%
volume (ml)  nh4+ (mg l−1)  bv   |   volume (ml)  nh4+ (mg l−1)  bv   |   volume (ml)  nh4+ (mg l−1)  bv
1000         24.1           10       1000         27.6           10       1000         659            10
5000         2.65           50       5000         1.08           50       6000         25             60
6000         0.098          60       6000         0.098          60       1000         0.98           100
6500         0.036          65       6500         0.04           65       14000        0.028          140

clino. 2 and clino. 3, nacl 5%:
             clino. 2                 clino. 3
volume (ml)  nh4+ (mg l−1)  bv   |   volume (ml)  nh4+ (mg l−1)  bv
1000         40.39          10       1000         23.77          10
6000         3.01           60       6000         2.66           60
10000        0.255          100      10000        0.266          100
11000        0.05           110      11000        0.04           110

fig. 9: typical regeneration curves for the elution of ammonia from the resins using nacl = 5%

fig. 10: typical regeneration curves for the elution of ammonia from the resins using nacl = 10%

5 conclusions
the experimental results indicate that ammonium ions can be removed efficiently by zeolites. pretreatment of the zeolite considerably improves its ion exchange capacity, and clino. 3 can be used for ammonia removal as a more economical exchanger than clino. 1 and clino. 2. ammonium removal capacities were increased by approximately 40 % and 20 % for the heat-treated zeolite samples. there was no difference between the regenerants at 10% and 5% nacl. the removal efficiency for calcium and magnesium was higher for clino. 1 than for clino. 2 and clino. 3. we conclude that the use of zeolite is an attractive and promising method for ammonium removal.

references
[1] apha: standard methods for examination of water and waste water. 18th edition, 1992, published by american public health association, washington, usa
[2] beler baykal, b. and akca guven, d.: performance of clinoptilolite alone and in combination with sand filters for the removal of ammonia peaks from domestic wastewater. wat. sci. technol., 1997, vol. 35, no. 7, pp. 47–54
[3] booker, n., cooney, e. and priestly, a.: ammonia removal from sewage using natural australian zeolite. wat. sci. technol., 1996, vol. 34, no. 9, pp. 17–24
[4] cooney, e., booker, n., shallcross, d. and stevens, g.: ammonia removal from wastewater using natural australian zeolite. i. characterization of the zeolite. sep. sci. technol., 1999, vol. 34, no. 12, pp. 2307–2327
[5] gaspard, m. and martin, a.: clinoptilolite in drinking water treatment for nh4+ removal. wat. res., 1983, vol. 17, no. 3, pp. 279–288
[6] hlavay, j., vigh, gy., olaszi, v. and inczédy, j.: ammonia and iron removal from drinking water with clinoptilolite tuff. zeolite 3, 1983, pp. 188–190
[7] hódi, m., polyák, k. and hlavay, j.: removal of pollutants from drinking water by combined ion exchange and adsorption methods. envir. international, 1995, vol. 21, no. 3, pp. 325–331
[8] jörgensen, s. e., libor, o., barkacs, k. and kuna, l.: equilibrium and capacity data of clinoptilolite. wat. res., 1979, vol. 13, pp. 159–165
[9] jörgensen, s. e.: ammonia removal by use of clinoptilolite. wat. res., 1976, vol. 10, pp. 213–224
[10] oenema, o. and roest, c. w. j.: nitrogen and phosphorus losses from agriculture into surface waters: the effects of policies and measures in the netherlands. wat. sci. technol., 1998, vol. 37, no. 2, pp. 19–26
[11] vokáčová, m., matejka, z. and ellášek, j.: sorption of ammonium-ion by clinoptilolite and by strongly acidic cation exchangers. acta hydrochim., 1986, vol. 14, no. 6, pp. 605–611

hossam monier abd el-hady, phone: +420 2 2435 4605, e-mail: hoss@fsv.cvut.cz
prof. ing. alexander grünwald, csc., phone: +420 2 2435 4638, e-mail: grunwald@fsv.cvut.cz
ing. karla vlčková
ing. jitka zeithammerová
department of sanitary engineering, czech technical university in prague, faculty of civil engineering, thákurova 7, 166 29 praha 6, czech republic

acta polytechnica vol. 52 no. 5/2012

diamond films for implantable electrodes

petra henychová, klára hiřmanová, martin vraný
department of biomedical technology, faculty of biomedical engineering, czech technical university, kladno, czech republic
corresponding author: petra.henychova@fbmi.cvut.cz

abstract
diamond is a promising material for implantable electrodes due to its unique properties. the aim of this work is to investigate the growth of boron-doped nanocrystalline diamond (b-ncd) films by plasma-enhanced microwave chemical vapor deposition at various temperatures, and to propose optimal diamond growth conditions for implantable electrodes. we have investigated the temperature dependence (450 °c–820 °c) of boron incorporation, surface morphology and growth rate on a polished quartz plate.
surface morphology and thickness were examined by atomic force microscopy (afm). the quality of the films, in terms of the diamond and non-diamond phases of carbon, was investigated by raman spectroscopy. afm imaging showed that the size of the grains was determined mainly by the thickness of the films, and varied from an average size of 40 nm in the lowest-temperature sample to an average size of 150 nm in the sample prepared at the highest temperature. the surface roughness of the measured samples varied between 10 nm (495 °c) and 25 nm (800 °c). the growth rate of the samples increased with temperature. we found that the level of boron doping was strongly dependent on the temperature during deposition. an optimal b-ncd sample was prepared at 595 °c.

keywords: nanocrystalline diamond, implantable electrode.

1 introduction
an implantable electrode is an invasive electronic device, and for this reason there are very high requirements on the materials that are used. long-term functionality and good performance must be ensured by selecting appropriate materials for the electrode [4]. above all, the electrodes must not adversely affect the health of the patient [1]. the material of the implant must be biocompatible and must not cause an allergic reaction [4]. there should be no acute or chronic adverse reaction, and no evidence of toxicity. in addition, the host organism must not affect the implanted system, and the system must preserve its properties in the aggressive environment of a living organism [1]. the impedance between electrode and tissue must be low and stable in order to provide reliable measurements or stimulation [4]. electrodes can be exposed to potentials of hundreds of volts, so the material must be resistant to electrochemical degradation [5]. last but not least, the material should be compatible with magnetic resonance [4]. further important factors are manufacturability, reproducible quality, and cost [1].

diamond is a promising material for implantable electrodes due to its unique properties. most importantly, diamond is biologically compatible. it is the hardest material in nature, with extreme wear resistance and resistance to chemical corrosion. it is an excellent thermal conductor, and its thermal expansion coefficient at room temperature is very low. diamond can easily be doped, whereupon it becomes a semiconductor. it does not melt during current flow, and it has a wide potential window [2].

diamond can be made in many different ways. one example is the high-pressure high-temperature (hpht) technique. this method simulates the conditions under which diamonds are formed in nature: deep underground, under very high pressure and temperature. this results in the formation of diamond monocrystals. diamond in the form of monocrystals is not a convenient material for most medical applications. another commonly used technique for diamond deposition is cvd (chemical vapor deposition). this method enables more than just single-crystalline diamond to be deposited: carbon atoms are added from the gas phase to an initial template, and lower pressures and temperatures are used. cvd is a more economical method, and its biggest advantage is that a very thin layer of diamond can be formed [2].

platforms for implantable diamond electrodes are made in several steps. first, a thin diamond film is grown on a silicon wafer. the second step is the deposition of a film of a suitable polymer (e.g. poly(methyl methacrylate) or polyimide) by spin coating, followed by hardening of the polymer by baking.
finally, chemical etching of the silicon wafer leaves a polycrystalline diamond film on a flexible substrate.

in this paper, we focus on optimizing the growth of thin diamond films by plasma-enhanced microwave chemical vapor deposition. we investigate the temperature dependence of boron incorporation, surface morphology and growth rate, and we propose optimal diamond growth conditions for application in implantable electrodes.

2 experimental
the study was performed for five different substrate temperatures during deposition, ranging between 450 °c and 820 °c. the growth conditions and temperatures are summarized in tab. 1.

table 1: variable conditions for mwpecvd
run id   temperature (°c)   pressure (mbar)   power (w)
s820     820                90                1522
s670     670                50                1150
s595     595                50                750
s450     450                50                550
s490     490                50                650

2.1 substrate pretreatment
we used a polished quartz plate, 10×10×1 mm (mateck gmbh, germany), as the substrate. the substrates were cleaned using isopropyl alcohol on dust-free cellulose wadding. the sample was then inserted into deionized water (2 mΩm) and sonicated in an ultrasonic bath at 80 °c for 10 minutes. finally, we dried the substrate with dry compressed air. initial nucleation of the substrates was done by immersion in a solution of nanoamando nanodiamond (nanocarbon research institute, japan) at a concentration of 0.2 g/l, followed by spin coating at 3800 rpm for 30 s. the declared size of the nanodiamonds in the solution was 5 ± 0.9 nm.

2.2 growth of ncd
ncd was grown by the microwave plasma-enhanced chemical vapor deposition method (mwpecvd). we used an astex ax5010 reactor (seki technotron corp., japan). all samples were grown for 90 minutes with a fixed boron concentration in the gas mixture (2000 ppm b/c) and a fixed concentration of methane (0.5 %). the variable growth conditions are presented in tab. 1.

2.3 characterization
the surface roughness and thickness of the layer were obtained by atomic force microscopy (afm). the roughness (root mean square) is calculated as

$R_q = \sqrt{\frac{1}{N}\sum_{i=1}^{N} y_i^2}$

which is analogous to the standard deviation in statistics.
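this r_q definition translates directly into a few lines of numpy. the sketch below is only illustrative; it assumes the afm height map has already been plane-corrected, so that the deviations y_i are measured about the mean surface.

```python
import numpy as np

def rms_roughness(heights):
    """r_q = sqrt(mean(y_i^2)) of mean-subtracted heights (e.g. in nm)."""
    y = np.asarray(heights, dtype=float)
    y = y - y.mean()  # deviations about the mean plane
    return np.sqrt(np.mean(y ** 2))

# synthetic height map with ~15 nm rms, just to exercise the function
rng = np.random.default_rng(0)
print(rms_roughness(rng.normal(0.0, 15.0, size=(256, 256))))
```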
we used an ntegra prima nt-mdt microscope and budget sensors tap ai-150g tips at a frequency of 150 khz and a stiffness of 5 n/m. for the thickness measurements, a part of the substrate was cleared of seeds prior to growth; this resulted in an area without diamond growth, which was used as a reference for the thickness evaluation. the growth rate was calculated by dividing the thickness by the growth time. the layer quality was measured using an invia raman microscope. spectra were taken at room temperature using a laser power of 2 mw at 633 nm. all spectra were normalized to the characteristic diamond peak at a wave number around 1332 cm−1. the temperature during deposition was measured using a pyrometric 92 two-color pyrometer (williamson, usa).

3 results and discussion
3.1 raman spectroscopy
all samples exhibit a peak between 1306 cm−1 and 1322 cm−1 attributed to diamond (sp3 bonds). the position and intensity of the diamond peak are influenced by many factors, including the number of sp3 bonds in the material and the level of boron doping; high boron doping shifts the diamond raman peak downwards and reduces its intensity [6]. another prominent peak in all spectra is at 500 cm−1. this peak is related to the presence of boron in the crystal lattice of diamond, and its intensity can be used for a semiquantitative estimate of the boron content [3]. fano resonance, another spectral feature related to the presence of boron, is also distinguishable in all spectra.

figure 1: raman spectra (excitation 633 nm) of boron-doped ncd films prepared at various temperatures. peaks at 500 cm−1 and 1225 cm−1 related to boron doping are clearly visible for the samples prepared at 595 °c and 670 °c.

the relative intensity of the peaks varies among the investigated samples. according to the raman spectra, the highest boron incorporation was achieved at a temperature of 595 °c, followed by 670 °c. these samples feature an intense peak at 500 cm−1, a prominent fano resonance at 1225 cm−1, and a shifted diamond raman peak. the sample prepared at 820 °c has a narrow raman peak, and its boron-related peaks are less apparent. the spectra recorded for the samples prepared at 450 °c and 490 °c are similar to each other: both have a broad diamond raman peak of low intensity, indicating small grains and low diamond quality. the intensity of their boron-related peaks is similar to that of the sample prepared at 820 °c. raman spectroscopy also shows the presence of a high amount of non-diamond phase in some samples. most notably, the g band (1580 cm−1) that appears in the spectra of the b-ncd samples is attributed to crystalline graphite. this peak is relatively strong in the samples prepared at the lowest temperature (450 °c) and at the highest temperature (820 °c). it should be noted that the laser excitation that we used is very sensitive to sp2 bonding.

to verify our assumption, based on raman spectroscopy, that the boron doping is highest for the sample prepared at 595 °c, we measured the resistance of the films by a simple 2-point probe method. the results, adjusted for the thickness of the film, are plotted in fig. 2.

figure 2: position of the diamond raman peak and resistance measured by the 2-point probe method.

although 2-point probe measurements allow only a crude estimate of the true resistivity of the sample, fig. 2 clearly shows a correlation between the resistance and the position of the diamond raman peak, which is influenced by the presence of boron in the material. the higher resistance of the films prepared at lower temperatures may be caused by the fact that these films are not completely continuous. in summary, all of the b-ncd films exhibit evidence of boron doping. the level of doping is lower at low temperatures, increases with temperature to reach a maximum at around 600 °c, and then decreases again at high temperatures.

3.2 growth rate and surface morphology
afm imaging showed that the b-ncd films (fig. 4) consist of randomly oriented grains of submicron size. the size of the grains was determined mainly by the thickness of the films, and varied from an average size of 40 nm in the lowest-temperature sample to an average size of 150 nm in the sample prepared at the highest temperature. at lower temperatures we observe small grains without visible facets; clearly visible crystal facets can be identified on the samples prepared at 595 °c and above. the surface roughness of the measured samples varies between 10 nm (495 °c) and 25 nm (800 °c).

figure 3: growth rate of a b-ncd film as a function of temperature.

the growth rate of the film is an important factor for commercial applications: it must be sufficiently high to ensure a viable production rate. the growth rate of the b-ncd films prepared in this study was found to be linearly dependent on temperature, and it varied by one order of magnitude between the samples prepared at the lowest and the highest temperatures. the growth rate of the sample prepared at 450 °c was 40 nm/hour; the highest growth rate, over 500 nm/hour, was achieved for the sample prepared at 800 °c. the growth rates of all samples are summarized in fig. 3. the growth rate values at the various temperatures are typical for the given methane concentration. as the growth rates depend strongly on the methane concentration, an increase in the methane fraction during deposition could be used to boost the growth rate [7].
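since the growth rate is simply thickness over deposition time, and the temperature trend is reported to be linear, the trend can be checked with a one-line least-squares fit. the sketch below uses only the two rates quoted in the text (40 nm/h at 450 °c and roughly 500 nm/h at 800 °c) and is meant as an illustration, not as a re-analysis of the data.

```python
import numpy as np

def growth_rate_nm_per_hour(thickness_nm, minutes):
    """growth rate from film thickness and deposition time (90 min runs here)."""
    return thickness_nm / (minutes / 60.0)

print(growth_rate_nm_per_hour(60.0, 90.0))  # a 60 nm film in 90 min -> 40 nm/h

# linear trend through the two rates quoted in the text
temperature = np.array([450.0, 800.0])
rate = np.array([40.0, 500.0])
slope, intercept = np.polyfit(temperature, rate, 1)
print(f"slope: {slope:.2f} nm/h per degree c")  # roughly 1.3
```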
4 conclusions
in this work, we have investigated the quality of boron-doped nanodiamond films prepared at various temperatures in order to find the optimal conditions for application in implantable electrodes. five samples were deposited by mwpecvd for the experiment. the morphology and thickness were investigated by afm. raman spectroscopy was used for estimating the boron content and for determining the quality of the films.

figure 4: 5 µm × 5 µm afm images of b-ncd films prepared at various temperatures.

we found that the level of boron doping was strongly dependent on the deposition temperature: low temperatures, and also high temperatures, led to low boron incorporation. the optimal b-ncd film was prepared at 595 °c. this sample had the highest level of boron incorporation and a low amount of impurities, and the growth rate achieved at this temperature is considered sufficient for potential commercial production.

acknowledgements
the research presented in this paper was supervised by václav petrák, fbmi ctu in prague. special thanks to vlaďka petráková from fbmi ctu in prague for useful consultations. we are grateful to colleagues from the institute of physics as cr for their help with the afm measurements, and to colleagues from the heyrovsky institute of physical chemistry as cr for their help with the raman spectroscopy measurements. this work was supported by the grant agency of the czech technical university in prague, grant no. ctu 10/811700.

references
[1] m. houdek, et al. neuromodulace. grada, 2007.
[2] p. w. may. diamond thin films: a 21st-century material. philosophical transactions of the royal society of london series a 358(1766):473–495, 2000.
[3] p. w. may, w. j. ludlow, m. hannaway, et al. raman and conductivity studies of boron-doped microcrystalline diamond, faceted nanocrystalline diamond and cauliflower diamond films. diamond & related materials 17(2):105–117, 2008.
[4] s. myllymaa, k. myllymaa, r. lappalainen. flexible implantable thin film neural electrodes. http://cdn.intechweb.org/pdfs/9218.pdf, [2012-03-14].
[5] a. norlin, j. pan, c. leygraf. investigation of pt, ti, tin, and nano-porous carbon electrodes for implantable cardioverter-defibrillator applications. electrochimica acta 49(22–23):4011–4020, 2004.
[6] s. prawer, r. j. nemanich. raman spectroscopy of diamond and doped diamond. philosophical transactions of the royal society of london series a 362(1824):2537–2565, 2004.
[7] o. a. williams, m. nesladek, m. daenen, et al. growth, electronic properties and applications of nanodiamond. diamond & related materials 17(7–10):1080–1088, 2008.
acta polytechnica 53(5):405–409, 2013, doi:10.14311/ap.2013.53.0405
© czech technical university in prague, 2013, available online at http://ojs.cvut.cz/ojs/index.php/ap

proposal of new optical elements

goce chadzitaskos, jiří tolar
department of physics, faculty of nuclear sciences and physical engineering, czech technical university in prague, břehová 7, cz–115 19 prague
corresponding author: goce.chadzitaskos@fjfi.cvut.cz

abstract. an overview of our patented proposals of new optical elements is presented. the elements are suitable for laser pulse analysis, telescopy, x-ray microscopy and x-ray telescopy. they are based on the interference properties of light: a special grating for a double-slit pattern, parabolic strip imaging for a telescope, and bragg's condition for x-ray scattering on a slice of a single crystal for x-ray microscopy and x-ray telescopy.

keywords: optical elements, interference of light, telescopes, x-ray microscopy, x-ray telescopy.

submitted: 31 may 2013. accepted: 28 june 2013.

1. introduction
over the last twelve years we have patented four new optical elements and systems. prof. m. havlíček has been interested in our work all this time, and we would like to dedicate this summary to him. we will describe the following proposals:
• a two-diffraction system;
• a parabolic strip telescope;
• an optical element for x-ray microscopy;
• an x-ray bragg's telescope.
the principles of each proposal are described in the next four sections.

2. two-diffraction system
the two-diffraction system was patented [1] and published in [5]. the idea of the two-diffraction system is based on the two-slit experiment, with two channels coupled to alternating slits and light passing through both of them. when a measurement is made, i.e. a photon is identified in one of the channels, we can distinguish which channel the light passed through, and the interference of light between the two channels is lost. otherwise, when light passes through the two channels and nobody can distinguish which channel the light passed through, the interference of the light between the channels remains. the new feature of the proposal is the combination of a two-slit arrangement with an interference grating, which allows for better measurement options.

let us provide an example. one possible realization of the 2-diffraction system is shown in fig. 1. light source 1 is monochromatic. the light is divided into two channels 3 by a 1:1 beam splitter 2 and is brought into the 2-diffraction grating 5. the 2-diffraction grating generally consists of a large number of equally spaced slits, which allows the connection of one channel with the odd slits and of the other channel with the even slits 4. spots 7 (interference maxima) then appear on the screen 6.

figure 1. let 1 be a monochromatic light source. the light is divided into two channels 3 by a 1:1 beam splitter 2 and is brought into the two-diffraction grating 5. a two-diffraction grating generally consists of a large number of equally spaced slits, which allows the connection of one channel with the odd slits and of the other channel with the even slits 4. spots (interference maxima) then appear on the screen 6.

the positions of the maxima depend on the possibility of distinguishing the paths.

case 1. if it is possible to distinguish the photon paths between the two channels, then

$\sin \alpha_n = \frac{n\lambda}{2d}, \qquad (1)$

where d is the spacing of the slits (or optical segments as in fig. 1), λ is the wavelength, and n = 0, ±1, ±2, . . .

case 2. if it is not possible to distinguish through which of the channels the photon has passed, and the photon has the same phase in the even and odd slits, then

$\sin \alpha_n = \frac{n\lambda}{d}, \qquad (2)$

and half of the spots will not appear. of course, the intensity of the spots is different.

case 3. if it is possible to distinguish the photon paths between the two channels only for some of the photons (for example, for the interacting photons) and not for the others, the intensity of the spots depends on the ratio of these parts. the reliability of the information transmission depends on the efficiency of the detectors. moreover, entry into the transmitted information also needs a detector: a small part of the light is turned aside and detected there. if the efficiency of this detector is comparable with that of the detector used in the 2-diffraction system, the entry will be detected.

the intensity of the light passing through a diffraction grating with N slits of the same width a, spaced at an equal distance a + b, depends in cases 1 and 2 on the angle α as

$I_\gamma(\alpha) = I_{\gamma 0} \left( \frac{\sin\left(\frac{\pi a}{\lambda}\sin\alpha\right)}{\frac{\pi a}{\lambda}\sin\alpha} \, \frac{\sin\left(\frac{N\pi\gamma(a+b)}{\lambda}\sin\alpha\right)}{\sin\left(\frac{\pi\gamma(a+b)}{\lambda}\sin\alpha\right)} \right)^{2}, \qquad (3)$

where γ = 1 for case 1 and γ = 2 for case 2, $I_{10}$ and $I_{20}$ are the integral intensities of the light passing through the diffraction system in cases 1 and 2, λ is the wavelength of the light, and N is the number of slits. in case 3, for the same diffraction grating, the intensity is the sum of the intensities of case 1 and case 2. let p percent of the photons belong to case 1 and 100 − p percent to case 2. then the intensity is

$I_3(\alpha) = \frac{p}{100}\, I_0 \left( \frac{\sin\left(\frac{\pi a}{\lambda}\sin\alpha\right)}{\frac{\pi a}{\lambda}\sin\alpha} \, \frac{\sin\left(\frac{N\pi(a+b)}{\lambda}\sin\alpha\right)}{\sin\left(\frac{\pi(a+b)}{\lambda}\sin\alpha\right)} \right)^{2} + \frac{100-p}{100}\, I_0 \left( \frac{\sin\left(\frac{\pi a}{\lambda}\sin\alpha\right)}{\frac{\pi a}{\lambda}\sin\alpha} \, \frac{\sin\left(\frac{2N\pi(a+b)}{\lambda}\sin\alpha\right)}{\sin\left(\frac{2\pi(a+b)}{\lambda}\sin\alpha\right)} \right)^{2},$

where $I_0$ is the integral intensity of the light passing through the system.
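equation (3) is straightforward to evaluate numerically, which makes it easy to compare the spot patterns of the two cases. the following sketch simply implements the formula as printed; all parameter values in the example are arbitrary.

```python
import numpy as np

def grating_intensity(alpha, wavelength, a, b, n_slits, gamma=1, i0=1.0):
    """eq. (3): gamma = 1 for case 1, gamma = 2 for case 2."""
    s = np.sin(alpha)
    u = np.pi * a * s / wavelength                 # single-slit factor argument
    v = np.pi * gamma * (a + b) * s / wavelength   # grating factor argument
    single = np.sinc(u / np.pi)                    # sin(u)/u, finite at u = 0
    with np.errstate(divide="ignore", invalid="ignore"):
        grating = np.where(np.isclose(np.sin(v), 0.0),
                           n_slits,                # limiting value at the maxima
                           np.sin(n_slits * v) / np.sin(v))
    return i0 * (single * grating) ** 2

alpha = np.linspace(-0.02, 0.02, 4001)
case1 = grating_intensity(alpha, 633e-9, a=5e-6, b=15e-6, n_slits=40, gamma=1)
case2 = grating_intensity(alpha, 633e-9, a=5e-6, b=15e-6, n_slits=40, gamma=2)
```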
case 2. if it is not possible to distinguish through which of the channels the photon has passed, and the light has the same phase in the even and odd slits, then
$$\sin\alpha_n = \frac{n\lambda}{d}, \qquad (2)$$
and half of the spots will not appear. of course, the intensity of the spots is different.

figure 2. principle of the new telescopic system. it consists of a parabolic strip mirror 1 and a ccd camera 2 in the image plane, supported by the mounting 3; it rotates around the optical axis 4. the shots are stored in a computer, where the image is reconstructed.

case 3. if it is possible to distinguish the photon paths between the two channels only for some of the photons (for example for the interacting photons) and it is not possible for the other photons, the intensity of the spots depends on the ratio of these parts.

the reliability of the information transmission depends on the efficiency of the detectors. moreover, entry into the transmitted information also needs a detector: a small part of the light is turned aside and detected there. if the efficiency of this detector is comparable with the detector used in the two-diffraction system, the entry will be detected.

the intensity of the light passing through a diffraction grating with n slits of the same width a, spaced at an equal distance a + b, depends in cases 1 and 2 on the angle α as
$$i_\gamma(\alpha) = i_{\gamma 0}\left(\frac{\sin\left(\frac{\pi a}{\lambda}\sin\alpha\right)}{\frac{\pi a}{\lambda}\sin\alpha}\cdot\frac{\sin\left(\frac{n\pi\gamma(a+b)}{\lambda}\sin\alpha\right)}{\sin\left(\frac{\pi\gamma(a+b)}{\lambda}\sin\alpha\right)}\right)^2, \qquad (3)$$
where γ = 1 for case 1 and γ = 2 for case 2, $i_{10}$ and $i_{20}$ are the integral intensities of the light passed through the diffraction system in cases 1 and 2, λ is the wavelength of the light, and n is the number of slits. in case 3, for the same diffraction grating, the intensity is the sum of the intensities of case 1 and case 2. let p percent of the photons belong to case 1 and 100 − p percent belong to case 2. then the intensity is
$$i_3(\alpha) = \frac{p}{100}\, i_0\left(\frac{\sin\left(\frac{\pi a}{\lambda}\sin\alpha\right)}{\frac{\pi a}{\lambda}\sin\alpha}\cdot\frac{\sin\left(\frac{n\pi(a+b)}{\lambda}\sin\alpha\right)}{\sin\left(\frac{\pi(a+b)}{\lambda}\sin\alpha\right)}\right)^2 + \frac{100-p}{100}\, i_0\left(\frac{\sin\left(\frac{\pi a}{\lambda}\sin\alpha\right)}{\frac{\pi a}{\lambda}\sin\alpha}\cdot\frac{\sin\left(\frac{2n\pi(a+b)}{\lambda}\sin\alpha\right)}{\sin\left(\frac{2\pi(a+b)}{\lambda}\sin\alpha\right)}\right)^2,$$
where $i_0$ is the integral intensity of the light passed through the system.

figure 3. the paraxial beams 5 are reflected by the strip onto the focal line f; v is the vertex line of the strip.

3. telescope with a rotating objective element

the parabolic strip telescope was patented [2] and published in [6]. the basic idea of our new system was inspired by x-ray computed tomography (ct): the integral absorption of x-rays is measured at different angles, step by step, during the rotation of the sample. the total absorption of all photons coming along different lines perpendicular to the camera is registered as points of a one-dimensional picture. finally, the inverse radon transform is used to reconstruct the magnitude of the absorption of x-rays at different points of the medium. the same approach can be used when the parabolic strip is the primary mirror of a telescope, following the scheme shown in figures 2 and 3. the images of the observed objects are lines. each line represents the integral intensity of the light incoming from the object or objects perpendicular to the strip (parallel to the focal line) located inside the field of view, which is guaranteed by the geometry.
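the reconstruction step used here is, in effect, the inverse radon transform familiar from ct. as a minimal sketch of this step (our illustration, not the authors' software, and assuming numpy and scikit-image are available), the stack of one-dimensional line images recorded at successive rotation angles forms a sinogram that iradon inverts:

```python
# a minimal sketch: reconstructing a 2-d image from the 1-d line images
# produced by a rotating parabolic-strip mirror. each exposure is treated as
# a parallel projection of the sky brightness, so the stack of exposures
# forms a sinogram and scikit-image's radon/iradon pair can stand in for
# real instrument data.
import numpy as np
from skimage.transform import radon, iradon

# toy "sky": five bright circular objects on a dark background
sky = np.zeros((128, 128))
for cx, cy in [(40, 40), (88, 40), (64, 64), (40, 88), (88, 88)]:
    yy, xx = np.ogrid[:128, :128]
    sky[(xx - cx) ** 2 + (yy - cy) ** 2 < 16] = 1.0

angles = np.linspace(0.0, 180.0, 90, endpoint=False)  # rotation steps of the telescope
sinogram = radon(sky, theta=angles)                   # one 1-d "line image" per angle
image = iradon(sinogram, theta=angles)                # inverse radon reconstruction

print("mean reconstruction error:", np.abs(image - sky).mean())
```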
making a series of photos step by step during the rotation of the telescope around its optical axis, one can use the inverse radon transform to reconstruct the image with the above-mentioned angular resolution. fig. 5 shows a series of images of five illuminated circular apertures, after rotation of the telescope by 0, 45, 90 and 135 degrees. the angular resolution of a parabolic strip telescope is
$$\delta_l = \frac{\lambda}{l},$$
where λ is the wavelength of the light and l is the length of the projection of the parabolic strip on the x axis. the main advantages of such telescopes are:
• good angular resolution;
• low cost;
• simple technological development;
• the possibility to install a grid of large telescopes across the earth;
• lower weight for use on satellites.

figure 4. two parabolic strips of lengths 20 and 40 cm; for the proof-of-principle experiment, the 40 cm strip was used.

figure 5. series of images of the five illuminated circular objects after rotation of the telescope by 0, 45, 90 and 135 degrees.

the only major complication is that one more rotational movement is needed to reconstruct the image with the same angular resolution in all directions. the good angular resolution can be used for direct observation of bright objects. additional rotation is not necessary for some purposes, for example when the plane of rotation of the rotating objects is known and changes of orbits are observed.

4. optical element for x-ray microscopy

the optical element for x-ray microscopy was patented [3] and published in [7]. our proposal exploits a single crystal to image monochromatic x-ray radiation with wavelength λ. bragg's condition is guaranteed by applying stress to the single crystal. let us consider one dislocation-free single crystal strip with atomic planes oriented in parallel with the optical axis, which is the line connecting an imaged point with the center of its image, see fig. 6c. the mutual distance of the atomic planes in a state of rest, i.e. without stress, is d0. the cross section of the single crystal is variable, in order to use stress to change the distances of the atomic planes.

figure 6. a) and b) show two examples of the possible shape of the single crystal 1, designed for shifting the atomic planes 2 by a push or pull force f. c) shows an example of an optical set-up diffracting x-ray radiation with wavelength λ. it consists of a single crystal 1 with atomic planes 2 which are parallel to the optical axis 3. the mutual distance of the atomic planes 2 in the resting state is d0 and the single crystal's 1 cross section s is variable. it is equipped with a push device to maintain a push force f in the direction orthogonal to the atomic planes 2 of this single crystal 1, located between distances r0 and rm.

with respect to the optical axis, the farther or closer side of the single crystal, orthogonal to this optical axis, is equipped with a device to create a push (fig. 6a) or pull (fig. 6b) force and to maintain it in the direction orthogonal to the atomic planes of this single crystal. we have to determine the cross section s of the single crystal at a distance r from the optical axis. for constructive interference at the image plane it is necessary that the difference in the paths between two neighboring reflected rays be an integer multiple of the wavelength. since the planes are parallel, we will study the effect of the spatial grating via bragg's condition for x-ray diffraction.
bragg's condition of constructive interference for the distance d between two atomic planes is given by the expression
$$n\lambda = 2d\sin\alpha,$$
where n is an integer, λ is the wavelength of the x-rays, and α is the angle between the ray and the optical axis,
$$\sin\alpha = \frac{r}{\sqrt{r^2+s^2}}.$$
the distance d can be expressed via the distance d0 between the atomic layers without any stress, i.e. $d = d_0 \pm \Delta d$, and hooke's law gives
$$\frac{\Delta d}{d_0} = \frac{f}{es},$$
where s is the cross section, f is the push or pull force, and e is the young modulus of elasticity of the single crystal in the direction of the acting force. the single crystal's cross section is given by the following formulas:
$$s = \frac{f}{e}\,\frac{1}{\frac{n\lambda}{2rd_0}\sqrt{r^2+s^2}-1}, \qquad s = \frac{f}{e}\,\frac{1}{1-\frac{n\lambda}{2rd_0}\sqrt{r^2+s^2}}, \qquad (4)$$
for the pull and push force, respectively. the cross section has to be positive in both cases, and is restricted by the limits of hooke's law. the force f is defined by the equation
$$f = \pm s_0 e\left(\frac{n\lambda}{2r_0 d_0}\sqrt{r_0^2+s^2}-1\right),$$
where s, measured along the single crystal's longitudinal axis, is the object and also the image distance, s0 is a pre-selected cross section of the single crystal, chosen at a distance r0 from the optical axis according to the requirements of the application, λ is the x-ray radiation wavelength and n is a natural number. in both equations, the plus sign applies for the pull force and the minus sign for the push force. the cross section is then
$$s = s_0\,\frac{\frac{n\lambda}{2rd_0}\sqrt{r^2+s^2}-1}{\frac{n\lambda}{2r_0d_0}\sqrt{r_0^2+s^2}-1}.$$
the calculated cross section s corresponds to a stress-free state, but r is the distance from the optical axis under stress. in order to calculate the shape of the strip for manufacturing, it is necessary to calculate the cross sections corresponding to the resting distances by the following procedure: express the inverse function r(x) of the function x(r),
$$x = \frac{r-r_0}{\frac{n\lambda}{2rd_0}\sqrt{r^2+s^2}}$$
(x is the distance from r0), then substitute r(x) in (4); the cross section s is thus expressed as a function of x, the distance of the corresponding cross section from the end of the single crystal in a stress-free state. a correction according to the applicable poisson ratio can be included. numerical calculation methods are applicable. the maximal rm is then given by the young modulus of the crystal in the direction of the stress.

figure 7. a parabolic cylinder consisting of bent single crystal strips of two types 1 and 2. each strip has a different distance between the atomic planes parallel with its surface. the temperature at each point ensures that the bragg condition for x-ray reflection is satisfied there. the bent single crystal strips are fixed on the sides by brackets 3.

an advantage of this device is that it is relatively simple to manufacture its components. another advantage is that it also works for shorter wavelengths, i.e. below 1 nm.

5. x-ray bragg telescope

we have proposed a structure based on bragg's x-ray reflection from single crystals arranged along a truncated parabolic cylinder. the basic idea, presented in [7] and [6], is applied here in telescopy. the x-ray reflector is composed of bent rectangular single crystal strips. the strips are cut so that the atomic planes are parallel to their surfaces, see fig. 7. these bent single crystal strips are attached by their ends, fixed in two parallel brackets ensuring the parabolic geometry of the assembly. the reflected incoming x-rays are focused roughly to a line. the essence of the arrangement is that the bent strips form a parabolic cylinder.
this geometry ensures that the incidence and reflection angles are equal at each point of the parabolic cylinder. the conditions for bragg's reflection have to be guaranteed by a structure satisfying the following conditions:
(1.) the cylinder is assembled from single crystal strips made from various materials and cut along different planes, so that the mutual distances between the atomic planes at one end of each strip fulfill bragg's condition;
(2.) the mutual distance between the atomic planes of each single crystal strip is changed by suitable thermal expansion at different places on the strip. it is also possible to control the concentration of impurities in order to change the mutual distances of the atomic planes.

figure 8. one bent single crystal strip displaying atomic planes 5 with mutual distance d0 at one end and dmax at the opposite end. the geometry is the same as in fig. 3.

the vertex line of a given parabolic cylinder goes through the origin (x = 0, y = 0, z), and the focal line has the coordinates (x = 0, y = p/2, z). the position of a strip can then be parameterized by its x coordinate; a strip is located between xmin and xmax. each dislocation-free single crystal strip has atomic planes parallel with its surface. let the mutual distance of the atomic planes at temperature t be d. the bragg condition is fulfilled at all points on the strip due to the changing mutual distance between the atomic planes parallel with the surface at different points. the mutual distance of the atomic planes at different points of the strip is properly changed by local changes in temperature. the smallest distance d0, at temperature t(xmin), is at the end of the strip closer to the vertex line of the parabolic cylinder, at xmin, and the maximal distance between planes, dmax, is at the opposite end, at xmax. the mutual distance d of the atomic planes between the ends of the strip, according to bragg's condition, depends on x:
$$d = \frac{n\lambda}{2p}\sqrt{x^2+p^2},$$
where n is a natural number indicating the order of bragg's reflection from two adjacent atomic planes, and p is the parameter of the parabola. the temperature along the strip at points in the direction of the x axis has to be a function of the x-coordinate,
$$t(x) = t(x_{min}) + \frac{1}{\gamma}\left(\frac{n\lambda}{2pd_0}\sqrt{x^2+p^2}-1\right),$$
where γ is the coefficient of thermal expansion of the single crystal strip in the direction perpendicular to its surface. for example, for a germanium single crystal strip cut along the plane (100), for temperatures between tmin = 250 k and tmax = 1000 k, the ratio is xmax/xmin = 1.023 for an x-ray wavelength of 0.5 nm, and xmax/xmin = 1.08 for a wavelength of 0.55 nm; i.e., if xmin = 10 cm, then xmax = 10.8 cm. the ratio, given by
$$\frac{x_{max}^2}{x_{min}^2} = \frac{4d_{max}^2 - n^2\lambda^2}{4d_0^2 - n^2\lambda^2},$$
is enhanced when the difference between 2d0 and nλ is smaller. it is also possible to make very small changes in wavelength by a change in the temperature distribution on the surface of the single crystal strip.

6. conclusion

we have proposed four optical elements; they are suitable for use in examining the interference properties of two beams, in optical astronomy, in x-ray microscopy, and in x-ray astronomy.

acknowledgements

support from the ministry of education of the czech republic (project msm6840770039) is gratefully acknowledged.

references

[1] chadzitaskos, g. – tolar, j., difrakční systém, cz patent 285999, 1999-10-18.
[2] chadzitaskos, g. – tolar, j., teleskopický systém, cz patent 298313, 2007-07-12.
[3] chadzitaskos, g., optical element for x-ray microscopy, european patent office, patent ep2168130, 2012-06-06.
[4] chadzitaskos, g., x-ray telescope, utility model (užitný vzor) 24204, 2012-08-20, applied for patent.
[5] chadzitaskos, g. – tolar, j., the two-diffraction system, optics communications 187 (2001), 359–362.
[6] chadzitaskos, g. – tolar, j., telescopic system with a rotating objective element, washington: the international society for optical engineering, 2004, 5 pp., isbn 0-8194-5419-2.
[7] chadzitaskos, g., optical element for x-ray microscopy, nuclear instruments and methods in physics research a 629 (2011) 206.
[8] chadzitaskos, g., x-ray bragg's telescope, submitted.

acta polytechnica 54(3):254–258, 2014, doi:10.14311/ap.2014.54.0254, © czech technical university in prague, 2014, available online at http://ojs.cvut.cz/ojs/index.php/ap

on observational phenomena related to kerr superspinars

zdenek stuchlik∗, jan schee

institute of physics, faculty of philosophy and science, silesian university in opava, bezručovo nám. 13, opava, czech republic
∗ corresponding author: zdenek.stuchlik@fpf.slu.cz

abstract. we investigate possible signatures of a kerr naked singularity (superspinar) in various observational phenomena. it has been shown that kerr naked singularities (superspinars) have to be efficiently converted into black holes due to accretion from keplerian discs. in the final stages of the conversion process, near-extreme kerr naked singularities (superspinars) provide a variety of extraordinary physical phenomena. such superspinning kerr geometries can serve as an efficient accelerator for extremely high-energy collisions, enabling a direct and clear demonstration of the outcomes of the collision processes. we shall discuss the efficiency and the visibility of ultra-high-energy collisions in the deepest parts of the gravitational well of superspinning near-extreme kerr geometries for a whole variety of particles freely falling from infinity. we demonstrate that ultra-high-energy processes can be obtained with no fine tuning of the motion constants, and that the products of the collision can escape to infinity with efficiency substantially higher than in the case of near-extreme black holes. such phenomena influence the radiative processes taking place in the accretion disc, and together with the particular generated geometry they influence the observed radiation field. here we assume the "geometrical" influence of a kerr naked singularity on the spectral line profiles of radiation emitted by monochromatically and isotropically radiating point sources forming a keplerian ring or disc around such a compact object. we have found that the profiled spectral line of a radiating keplerian ring can be split into two parts, because there is no event horizon in the naked singularity spacetimes. the profiled lines generated by keplerian discs are qualitatively different for kerr naked singularity and black hole spacetimes, being broadened near the inner edge of the keplerian disc.

keywords: kerr superspinar, particle collisions, spectral line profiles.

1. introduction

string theory, one of the most relevant candidates for the theory of all physical interactions and quantum gravity, indicates a possibility to be tested in relativistic astrophysics.
gimon and hořava [1] have shown that kerr superspinars with mass m and angular momentum j violating the general relativistic limit on the spin of black holes (a = j/m² > 1) could be primordial remnants of the high-energy phase of a very early period in the evolution of the universe, when the effects of string theory were relevant. it is assumed that the spacetime outside a kerr superspinar of radius r, where the stringy effects are irrelevant, is described by the standard kerr geometry. the exact solution describing the interior of the superspinar is not yet known in the 3+1 theory, but it is considered that its extension is limited to 0 < r < m, thus covering the region of causality violations (naked time machine) and the physical singularity, while still allowing for the presence of the most interesting astrophysical phenomena related to the kerr naked singularity spacetimes [1]; we assume here r = 0. the properties of the surface of kerr superspinars are usually assumed to correspond to those of the black hole horizon, i.e., the surface is assumed to serve as a one-way membrane. of course, we can introduce assumptions of non-zero reflexivity (or emissivity), as discussed in [2]. here we assume for simplicity surface properties resembling those of the black hole horizon.

2. ultra-high-energy collisions

ultra-high-energy collisions could be relevant in the field of near-extreme superspinning kerr geometries with spin a = 1 + δ (δ ≪ 1) that can well describe the exterior of primordial kerr superspinars. they have to be converted into (near-extreme) black holes due to accretion. such a classical instability works on large time scales, so that primordial superspinars can quite well survive to the era of high-redshift quasars (at z ≥ 2) [3]. of course, the travel time of the colliding particles to the collision point has to be smaller than the conversion time of the kerr superspinar [4]. collisions of particles in the field of near-extreme superspinars that exhibit extremely large energy in the cm system occur if the collisions take place at the special surface r = m, or in its close vicinity. these ultra-high-energy processes occur quite naturally for a wide range of motion constants of the particles freely falling from infinity.

figure 1. escape cones of the lnrf. the lnrf source is at r = m and the representative latitude values are θ = 5° (left) and 85° (right). the superspinar spin is set to the representative values a = 1 + 10⁻⁷ (upper row) and a = 1 + 10⁻² (lower row). the frequency shifts of the photons are represented by the colour code, with the range given in each case.

assuming two particles with constants of motion, namely the covariant energy $e_i = m_i$, the azimuthal angular momentum $l_i$, and the angular momentum $q_i$ related to the total angular momentum, where i = 1, 2, in the zeroth approximation the formula for the energy of the collision in the centre-of-mass system reads
$$e_{cm}^2 = \frac{2m_1m_2}{1+\cos^2\theta}\,\frac{(2-l_1)(2-l_2)}{\delta}. \qquad (1)$$
for impact parameters l1 = l2 = −5 one obtains extremal efficiency. the corresponding energy of collision then reads
$$e_{cm}^2 = \frac{2m_1m_2}{1+\cos^2\theta}\,\frac{49}{\delta}. \qquad (2)$$
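as a quick numerical illustration of eqs. (1) and (2) (our sketch, not from the paper), the following evaluates the centre-of-mass energy per unit rest mass for two equal-mass particles; the function name and the sample values of δ are illustrative only:

```python
import math

def ecm_over_m(delta, theta_deg, l1=-5.0, l2=-5.0):
    """centre-of-mass energy per unit rest mass from eq. (1), for two
    equal-mass particles (m1 = m2 = m) colliding at r = m in a kerr
    spacetime with spin a = 1 + delta; theta is the latitude of the event."""
    c = math.cos(math.radians(theta_deg))
    e2 = 2.0 * (2.0 - l1) * (2.0 - l2) / ((1.0 + c * c) * delta)
    return math.sqrt(e2)

# eq. (2) corresponds to l1 = l2 = -5, giving the factor (2 + 5)^2 = 49;
# the energy diverges as delta -> 0 (the near-extreme limit)
for delta in (1e-2, 1e-7):
    print(f"delta = {delta:g}: e_cm/m = {ecm_over_m(delta, 85.0):.3g}")
```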
the most efficient energy processes occur for colliding particles with negative axial angular momentum (impact parameter) of maximal magnitude l ∼ 5, when the local cm energy is larger by a factor of 7/2 than in the simplest case of inward and outward radially moving particles colliding at r = m [4]. the possibility of escape of the particles generated by ultra-high-energy collisions can be determined by constructing escape photon cones, which represent well the escape of ultra-relativistic particles (we can expect direct formation of ultra-high-energy photons). for simplicity, our attention is restricted to the case of collisions of radially moving particles of identical rest energy, whose cm system is identical with the lnrf, since they move purely radially relative to such frames [4]. the resulting escape cones and the related frequency shift are given in fig. 1.

we observe a strong restriction of the extension of the escaping photon cone and a strong decrease in the frequency shift of the escaping photons with decreasing kerr superspinar spin for the small latitude θ = 5°, while for the large latitude θ = 85° the escape cone remains large in the part related to photons co-moving relative to the spacetime, and the frequency shift of the photons remains very small. in fact, for θ = 5° the behaviour of both the escape cone and the frequency shift in the field of kerr superspinars with a = 1 + 10⁻⁷ strongly resembles their behaviour in the field of near-extreme kerr black holes: almost precisely radially directed photons can escape, with a largely decreasing frequency shift. in such situations, ultra-high photon energies occur only when ultra-heavy particles collide and their rest energy corresponds to the energy of the photons observed at large distances. from the behaviour of the frequency shift of the radiated photons it follows that there is a high probability of observing ultra-high-energy photons if the collision occurs near the equatorial plane [4].

we conclude that near-extreme superspinning kerr geometries are much more efficient for the occurrence of ultra-high-energy collisions than the vicinity of the black hole horizon, and imply much more efficient escape of the created photons (ultra-relativistic particles) and an opportunity for distant observers to observe them at such ultra-high energies. we plan to study these effects explicitly also in the more complex situations occurring for collisions of "non-radially" moving particles, when the collisions can be more efficient by a factor of 3.5 than purely radial collisions.

figure 2. profiled lines from a keplerian disc plotted for two representative values of spin, a = 0.9999 (solid) and 3.0 (dashed); emissivity index p = 2.0, rsusp = 0.1, specific energy flux versus frequency shift g. the disc ranges from rin = risco(a) to rout = 10m. the inclination of the observer is θ0 = 30°.

figure 3. profiled lines from a keplerian disc plotted for two representative values of spin, a = 0.9999 (solid) and 3.0 (dashed). the disc ranges from rin = risco(a) to rout = 10m. the inclination of the observer is θ0 = 85°.

3. profiled spectral line

in our simulations of spectral line profiles we assume that a keplerian ring (disc) is composed of point sources emitting monochromatic radiation isotropically. each point emitter moves along a circular geodesic in the equatorial plane.
the general formula for the observed flux of the profiled line is given by
$$f_o(\nu_o) = \int i(r)\,g^4\,\delta(\nu_o - g\nu_e)\,\mathrm{d}\pi, \qquad (3)$$
where i(r) is the emitter radiation intensity, g is the frequency shift of the radiation, $\nu_o$ ($\nu_e$ = const.) is the photon frequency at the observer (emitter), and dπ is the solid angle element subtended by the source on the observer sky. the radial dependence of the radiation intensity is assumed in the standard form
$$i(r) \sim 1/r^p. \qquad (4)$$
the frequency shift g takes the usual form
$$g = \frac{\sqrt{1 - 2(1 - a\omega)^2/r_e - (r_e^2 + a^2)\,\omega^2}}{1 - \lambda\omega}, \qquad (5)$$
where $\omega = 1/(a + \sqrt{r_e^3})$ is the angular frequency of the emitter.

figure 4. profiled lines from a keplerian ring plotted for the representative spin value a = 2.0 and two ring radii, re = rms (solid) and re = 1.5 rms (dashed); rsusp = 0.1, specific energy flux versus frequency shift g. the inclination of the observer is θ0 = 30°.

figure 5. profiled lines from a keplerian ring plotted for the representative spin value a = 2.0 and two ring radii, re = rms (solid) and re = 1.5 rms (dashed); rsusp = 0.1. the inclination of the observer is θ0 = 85°.

we illustrate our results in figs 2–7, where the profiled lines generated in the field of kerr superspinars and black holes are compared for a small and a large inclination angle of distant observers. when the event horizon is not present, there is an additional group of photons that can reach a distant observer and contribute to the observed profiled lines; they are initially directed inward, with the impact parameter below the corresponding value of the photon spherical orbit [5]. additionally, there is a clear, strong functional dependence of the width of the profiled spectral line on the spin parameter a of the kerr superspinars.

figure 6. profiled lines from a keplerian disc plotted for a representative spin value a = 1.01 (rsusp = 0.01) and three emissivity index values, p = 1 (thick solid), 2 (solid) and 3 (dashed). the inclination of the observer is θ0 = 30°.

figure 7. profiled lines from a keplerian disc plotted for a representative spin value a = 1.01 (rsusp = 0.01) and three emissivity index values, p = 1 (thick solid), 2 (solid) and 3 (dashed). the inclination of the observer is θ0 = 85°.

in the case of a keplerian disc, the superspinar "fingerprints" are in the shape of the profiled line and in its frequency range [5]. in the case of a keplerian ring, the profiled lines "split" into two parts, where the "blue" line is strongly influenced by the superspinar surface radius. of course, the inclination of the observer plays an important role too, and should be known prior to the analysis. there is a strong qualitative difference between the profiled lines created in the field of kerr superspinars and of kerr black holes. these phenomena related to the profiled lines can be observed even for extremely distant objects. for more details see [5–7]. we expect that the spectral line profiles could be well distinguished by future x-ray satellites, e.g. loft [8].
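before turning to the disc images, a minimal sketch (ours, not the authors' ray-tracing code) of the frequency-shift factor of eq. (5); the photon impact parameter λ is left as a free input, since in a full simulation it follows from the photon geodesic connecting emitter and observer:

```python
# frequency-shift factor g of eq. (5) for a circular equatorial emitter in
# kerr geometry, in geometrized units with m = 1. lambda_ is the photon's
# impact parameter, here treated as a free parameter for illustration.
import math

def keplerian_omega(r_e, a):
    """angular frequency of a circular equatorial geodesic: 1/(a + r^(3/2))."""
    return 1.0 / (a + r_e ** 1.5)

def g_shift(r_e, a, lambda_):
    """frequency shift g of eq. (5)."""
    om = keplerian_omega(r_e, a)
    num = 1.0 - 2.0 * (1.0 - a * om) ** 2 / r_e - (r_e ** 2 + a ** 2) * om ** 2
    return math.sqrt(num) / (1.0 - lambda_ * om)

# shift of photons with lambda_ = 0 emitted at small radii, comparing a
# superspinar (a = 2.0) with a near-extreme black hole (a = 0.9999)
for a in (2.0, 0.9999):
    print(a, [round(g_shift(r, a, 0.0), 3) for r in (2.0, 4.0, 8.0)])
```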
4. observed shape of the thin keplerian disc

the radiation also carries the imprints of the kerr naked singularity spacetime in the shape of thin keplerian discs. for spins a > 1, an additional image is present in the inner part of the disc image. increasing the spin parameter a, one can clearly see that the additional image creates, in the inner part of the disc image, a complex structure which clearly distinguishes naked singularity spacetimes from black hole spacetimes, as demonstrated in figs 8 and 9. of course, phenomena related to the details of the innermost parts of keplerian discs can be observed only in objects that are sufficiently close to the observer, e.g. the supermassive object in the centre of the galaxy. for detailed information, see [5].

figure 8. shapes of keplerian discs plotted for spin parameters a = 0.998 (top) and 1.0001 (bottom). the disc ranges from rin = risco(a) to rout = 20m. the inclination of the observer is θ0 = 85°.

figure 9. shapes of keplerian discs plotted for spin parameters a = 1.1 (top) and 3.0 (bottom). the disc ranges from rin = risco(a) to rout = 20m. the inclination of the observer is θ0 = 85°.

acknowledgements

the authors acknowledge the albert einstein centre for gravitation and astrophysics, czech science foundation grant no. 14-37086g.

references

[1] e. g. gimon, p. hořava. astrophysical violations of the kerr bound as a possible signature of string theory. phys lett b 672:299–302, 2009. doi:10.1016/j.physletb.2009.01.026.
[2] p. pani, e. barausse, e. berti, v. cardoso. gravitational instabilities of superspinars. phys rev d 82(4), 2010. doi:10.1103/physrevd.82.044009.
[3] z. stuchlík, s. hledík, k. truparová. evolution of kerr superspinars due to accretion counterrotating keplerian discs. class and quant grav 28(155017), 2011. doi:10.1088/0264-9381/28/15/155017.
[4] z. stuchlík, j. schee. ultra-high energy collisions in the superspinning kerr geometry. class and quant grav 30(075012), 2013. doi:10.1088/0264-9381/30/7/075012.
[5] z. stuchlík, j. schee. appearance of keplerian discs orbiting kerr superspinars. class and quant grav 27(215017), 2010. doi:10.1088/0264-9381/27/21/215017.
[6] z. stuchlík, j. schee. observational phenomena related to primordial kerr superspinars. class and quant grav 29(065002), 2012. doi:10.1088/0264-9381/29/6/065002.
[7] j. schee, z. stuchlík. profiled spectral lines generated in the field of kerr superspinars. jour of cosmolog and astropart phys 04(005), 2013. doi:10.1088/1475-7516/2013/04/005.
[8] m. feroci, l. stella, m. van der klis, et al. the large observatory for x-ray timing (loft). experimental astronomy 34:415–444, 2012. doi:10.1007/s10686-011-9237-2.
acta polytechnica 55(3):199–202, 2015, doi:10.14311/ap.2015.55.0199, © czech technical university in prague, 2015, available online at http://ojs.cvut.cz/ojs/index.php/ap

hydrogenated targets for high energy proton generation from laser irradiation in the tnsa regime

lorenzo torrisi (a,∗), mariapompea cutroneo (b), jiri ullschmied (c)

a physics department and es, messina university, v.le f.s. d'alcontres 31, 98166 s. agata, messina, italy
b nuclear physics institute, ascr, 25068 rez, czech republic
c institute of physics, ascr, v.v.i., 182 21 prague 8, czech republic
∗ corresponding author: ltorrisi@unimel.it

abstract. polyethylene-based thin targets were irradiated in high vacuum in the tnsa (target normal sheath acceleration) regime using the pals laser facility. the plasma is produced in the forward direction, depending on the laser irradiation conditions, the composition of the target and the geometry. the optical properties of the polymer are modified with nanostructures in order to increase the laser absorbance. proton kinetic energies from hundreds of kev up to about 3 mev were obtained under optimal conditions, enhancing the electric field that drives the ion acceleration.

keywords: hydrogenated target; proton acceleration; tnsa regime.

1. introduction

one of the objectives of studies of high-intensity laser-generated plasmas has been to attain high proton acceleration, and the results are becoming competitive with traditional ion acceleration systems. mev proton beams can be applied in various areas of interest, from biomedicine (hadron therapy) to nuclear physics (nuclear fusion processes), from microelectronics (semiconductor doping) to laser ion sources (ion injection into accelerators) and to other fields [1, 2]. currently, proton beam acceleration reaches about 50 mev using fs lasers, but the yield is low, of the order of 10^11 particles per laser shot [3]. due to the low proton yield, the large energy distribution, contamination of the beam with other ions, and high proton emittance, this topic needs further investigation.

using thick hydrogenated targets, such as polymers, backward proton acceleration (bpa), obtainable at laser intensities of the order of 10^15 w/cm², shows the advantages of giving high proton yields and generating a continuous proton current emission using repetitive laser pulses. however, bpa has the disadvantage that it accelerates protons to maximum kinetic energies generally below 1 mev [4]. using the target normal sheath acceleration (tnsa) regime, at intensities of the order of 10^16–10^18 w/cm², the laser interaction confers higher acceleration, and protons may reach energies above 10 mev. however, due to the limited thickness of the target, of the order of 1–10 µm, the proton current remains low, especially for the most accelerated ions [5].
further proton acceleration can be obtained using the radiation pressure acceleration (rpa) regime, at intensities above 10^18 w/cm², at which the ion energies grow significantly but at which the protons have a high energy spread and a low emission yield [6]. our measurements of proton production by laser-generated plasma are conducted at a laser intensity of the order of 10^16 w/cm², using laser pulses of 300 ps duration at 1315 nm wavelength, in order to examine in greater depth the optimization of the composition and the geometry of the target, and also the irradiation conditions that maximize the proton energy emission in the tnsa regime. the measurements are performed using the asterix laser at the pals facility, with high pulse reproducibility and rich plasma diagnostics. the targets were based on polyethylene, and a deep investigation was carried out in order to increase its absorption coefficient at the ir laser wavelength by using nanostructures embedded in the polymer. the increased laser absorption, in fact, permits a greater laser energy release to the polymer and to the generated plasma, resulting in higher proton acceleration, as will be presented and discussed here.

2. material and methods

the iodine laser at pals employed for this study operates at a wavelength of 1315 nm, a maximum pulse energy of about 600 j, 300 ps pulse duration and a 70 µm spot size. the laser beam was employed in non-polarized mode and in the "p" or "s" polarized mode. the focal position (fp) of the laser beam with respect to the target surface was fixed to 0 (focal position on the target surface). the targets were based on semicrystalline polyethylene, (ch2)n, ranging in thickness from 0.5 µm up to 50 µm, with flat surfaces, placed in a suitable holder. the targets were prepared by diluting polyethylene (pe) powder in xylene at a temperature of 150 °c in a rotating solution for 1 hour and then depositing the melted mixture on a glass substrate. after solidification, the films were detached from the glass by water floating. in some cases, nanoparticles of carbon nanotubes (cnt) and fe2o3 nanostructures at concentrations of 0.1–1 % in weight were introduced into the solution.

figure 1. experimental set-up (a), transmission vs. thickness at 1315 nm wavelength of pure pe and of pe containing 1 % fe2o3 and cnt nanostructures (b), and a photo of the pe targets that we used (c).

pure, thin pe targets appear nearly transparent to visible light, while thick targets and targets containing nanostructures appear opaque and coloured: black if the solution contains cnt and red if it contains fe2o3 (fig. 1c). the targets were irradiated in the tnsa regime, under 10⁻⁶ mbar vacuum conditions, with an incidence angle of 30°. an x-ray ccd streak camera (xsc) was employed to observe the x-ray plasma emission in the backward direction. this camera has an exposure time of 2 ns, a time resolution of 30 ps, a spatial resolution of 20 µm and a spectral detection range from about 100 ev to 10 kev. the calibration correlates the photon energy to false colours from black to red. the ion acceleration was monitored by using a sic detector in time-of-flight (tof) configuration, with a fast, 20 gs/s storage oscilloscope. the sic semiconductor has a gap of 3.2 ev, so it is transparent to the visible radiation emitted from the hot plasma [7]; it detects uv and x-rays, electrons and ions.
its depletion layer is 80 µm deep at a bias voltage of 600 v. the surface metallization uses ni2si with a thickness of 200 nm, so the minimum detectable energy is 30 kev for protons and 150 kev for carbon ions. the sic detector was placed in the forward direction, at an angle of 30° and a distance of 60 cm with respect to the normal to the target surface. a thomson parabola spectrometer (tps) was employed in the forward direction, along the normal to the target surface, at a distance of 133 cm from the target. two pin holes, the first 1 mm in diameter and the second 100 µm in diameter, were employed to select only the axial ion component emitted from the plasma.

figure 2. sic spectra detecting ions from the irradiation of 6 µm pe (a) and 50 µm pe, pure (b), containing 1 % fe2o3 (c) and containing 1 % cnt (d).

the tps uses a 0.06 t magnetic field and a 1.4 kv/cm electric field to deflect the ions. a multichannel plate (mcp), placed at a distance of 16.5 cm from the electric field, collects the deflected ions, transmits the electrons to the coupled phosphorus screen and enables the images of the parabolas to be recorded with a traditional ccd camera. tps parabola recognition is performed using the opera 3d tosca code, in which ion trajectories of various charge-to-mass ratios and energies are simulated and projected, using the real geometry and values of the deflecting fields, onto the orthogonal mcp plane, as presented in a previous paper [8]. fig. 1a shows a scheme of the experimental set-up.

3. results and discussion

fig. 2a reports a typical sic-tof spectrum obtained by irradiating a pure pe thin film, 6 µm in thickness, with 586 j pulse energy. the spectrum shows a photopeak due to the detection of uv and x-rays, representing the start of the tof analysis, the background signal due to the electrons and their bremsstrahlung, and a larger peak due to the protons and carbon ions of the plasma that is produced. the faster protons, which occur during the rapid ascent of the peak at 77 ns, correspond to a kinetic energy of 300 kev, and the subsequent faster carbon ions, which occur at about 135 ns, correspond to a maximum kinetic energy of about 1.23 mev. this low ion acceleration in the tnsa regime is explained by the low laser energy deposition in the pure polymer. pe, in fact, has very low absorption at the ir laser wavelength, according to the transmission measurements reported in the plot of fig. 1b, indicating a high transmission of about 90 % at 1.3 µm wavelength for a thickness of 100 µm.
a further increment of the laser absorption in pe can be obtained using highly absorbent nanostructures embedded in the polymer. this is the case for fe2o3 and cnt embedded uniformly in the polymer at concentrations of the order of 1 % in weight, which significantly changes the chemical and physical properties of the polymer. fig. 2c shows a typical sic-tof spectrum of pe, 50 µm in thickness, containing fe2o3 nanostructures, irradiated at 469 j. in this case, the spectrum differs significantly from the spectrum for pure pe, showing ion peaks occurring at lower tof. protons have a minimum tof of about 33 ns, corresponding to a maximum kinetic energy of 1.72 mev, while c ions have a maximum energy of 3.5 mev. this result can be justified on the basis of the high absorption coefficient of the polymer which, with the fe2o3 nanostructures, permits an absorption level of about 53 %. using a pe thin film 50 µm in thickness, containing cnt at 1 % concentration, the foil absorbance increases up to 78 %. higher ion acceleration is therefore expected, as reported in fig. 2d for pe irradiated at 468 j. in this case, the protons have a minimum tof of about 25 ns corresponding to a maximum kinetic energy of 2.99 mev and the carbon ions have maximum energy of 5.5 mev. tps spectra provide further information figure 4. x-ray streak camera images relative to plasma detection from10 µm pe (a), from 50 µm pe as pure (b), containing 1 % of fe2o3 (c) and containing 1 % of cnt (d). on the plasma produced in forward direction. fig. 3 shows typical tps images obtained detecting the ion emission from a pe foil 10 µm in thickness (a) and 50 µm in thickness (b). for comparison, the spectra obtained when irradiating 50 µm pe with embedded fe2o3 (c) and with embedded cnt (d) are also reported. the obtained parabolas are in relation to the protons and carbon ions coming from the polymer composition. for thick pe foils, the luminosity of the parabola decreases due to the lower transmitted ions, and only the most energetic ions, which are less deflected, are detected. using the embedded nanostructures, the ions increase their energy and their parabolas appear nearer to the non-deflection center. however the weak luminosity indicates that there is only a small number of more accelerated ions, and in some cases this means that the maximum ion energy for protons and carbon ions cannot be evaluated accurately. from this point of view, the sic detector seems to be more sensitive to the ion energetic measurement with respect to tps. further details on the plasma that is produced can be obtained by analyzing the x-ray streak camera images. as an example fig. 4 shows the xsc images obtained by irradiating a pe foil 1 µm in thickness (a) and 50 µm in thickness (b). in the first case, cold plasma is obtained, because the laser absorption in the foil is negligible and no x-rays are emitted, while in the second case the laser absorption increases and plasma emitting low energetic x-rays is produced. for comparison the fig. 4c and fig. 4d present the plasma obtained by irradiating pe in which 1 % fe2o3 is embedded and pe in which 1 % cnt is embedded respectively. the higher xsc light intensity (false colors proportional to the plasma temperature) demonstrates that in this last case the plasma emits more energetic x-ray as result of the higher laser absorption in the thin nanostructured target. 201 l. torrisi, m. cutroneo, j. ullschmied acta polytechnica 4. 
4. conclusions

hydrogenated targets can be used for high-energy proton acceleration from laser irradiation in the tnsa regime. polyethylene can be employed to generate high-energy and high-yield protons only if its thickness is in the range 6–50 µm and if it contains absorbent nanostructures, such as fe2o3 and cnt, at concentrations of about 1 %. the latter nanostructure has the additional advantage that it can absorb a high hydrogen content and thus develop a high proton emission as a consequence of the laser-generated plasma.

acknowledgements

the research leading to these results has received funding from laserlab-europe (grant agreement 284464, ec's seventh framework programme).

references

[1] g. a. p. cirrone, m. carpinelli, g. cuttone, et al. elimed, future hadrontherapy applications of laser-accelerated beams. nucl instr and methods a 730:174–177, 2013.
[2] l. torrisi, s. cavallaro, m. cutroneo, et al. deuterium-deuterium nuclear reaction induced by high intensity laser pulses. applied surface sci 272:42–45, 2013.
[3] a. macchi, m. borghesi, m. passoni. ion acceleration by super-intense laser-plasma interaction. plasma physics 1775:1–58, 2013.
[4] l. torrisi, g. foti, l. giuffrida, et al. single crystal silicon carbide detector of emitted ions and soft x rays from power laser-generated plasmas. j appl phys 105:123304, 2009.
[5] j. badziak, s. glowacz, s. jablonski, et al. production of ultrahigh ion current densities at skin-layer subrelativistic laser-plasma interaction. plasma phys contr fus 46(12b):044, 2004.
[6] a. p. l. robinson, m. zepf, s. kar, et al. radiation pressure acceleration of thin foils with circularly polarized laser pulses. new journal of physics 10:1367, 2008.
[7] m. cutroneo, p. musumeci, m. zimbone, et al. high performance sic detectors for mev ion beams generated by intense pulsed laser plasmas. j material research 28(01):87–93, 2012.
[8] m. cutroneo, l. torrisi, l. andó, et al. thomson parabola spectrometer for energetic ions emitted from sub-ns laser generated plasmas. acta polytechnica 53(2):138–141, 2013.
[9] l. torrisi, f. caridi, a. m. visco, et al. polyethylene welding by pulsed visible laser irradiation. appl surf science 257:2567–2575, 2011.

acta polytechnica 54(6):439–441, 2014, doi:10.14311/ap.2014.54.0439, © czech technical university in prague, 2014, available online at http://ojs.cvut.cz/ojs/index.php/ap

the shape of a nucleus growing on the strongly curved surface of a nanofiber

alexey m. sveshnikov (a,b,∗), pavel demo (a,b), zdeněk kožíšek (b), petra tichá (a)

a faculty of civil engineering, czech technical university in prague
b institute of physics, academy of sciences of the czech republic
∗ corresponding author: sveshnik@fzu.cz

abstract. the equilibrium shape of a critical nucleus has a strong impact on important parameters of nucleation theory, such as the nucleation rate. while on a flat substrate the growing nucleus has the shape of a spherical cap, the shape of a nucleus on the strongly curved surface of a nanofiber is more complex. in the present paper, we propose a simple model for estimating the deviation of the shape of the critical nucleus from spherical.
it is shown that the nucleus extends more in the direction of the axis of the nanofiber than in the perpendicular direction, and that the deviation from a spherical shape is stronger in the case of well-wettable surfaces.

keywords: nanofiber, nucleation, equilibrium shape.

1. introduction

in recent years, nanotechnology has found a constantly growing range of applications in civil engineering. nano-capsules filled with gel and placed in the volume of concrete can provide a self-healing ability for the material; submicron optical fibers mixed with portland cement have enabled the creation of translucent concrete; carbon nanotubes improve the mechanical durability of concrete; titanium dioxide nanoparticles break down organic pollutants; zinc oxide nanoparticles improve the resistance of concrete to water, etc. [1, 2]. one such possible application of nanotechnology in civil engineering is the protection of buildings from the aggressive influence of the environment. many deteriorative phenomena start at the surface of a building material and slowly find their way into the volume, which makes protection of the surface a very important task [3]. however, nanoparticles tend to attract each other and to coagulate, which usually makes it a non-trivial task to cover the surface of a building material. in recent times, nanotextiles have been proposed as a potential way to solve this problem.

nanotextiles by themselves have a number of useful properties which make them a promising candidate for protective layers. their small thickness makes them very flexible and nearly transparent, which is important, for example, for the protection of historical buildings, which often have strongly curved architectonic elements. because a nanotextile is a porous material, air can pass through it relatively freely. this porosity, and the extremely small radius of curvature of the nanofibers, lead to a very high surface-to-volume ratio for this type of material in comparison with usual protective materials like plasters. this means that nanotextiles interact with the environment more strongly than usual materials. moreover, if we manage to distribute nanoparticles with desired properties uniformly over the surface of a nanotextile, the resulting textile will combine the good properties of both types of nanomaterial. for example, silver nanoparticles are known to have a strong bactericidal effect. a nanotextile imbued with silver nanoparticles and attached to the surface of a construction material will inhibit the growth of molds and significantly improve the longevity of the building. taking into account the huge amount of money paid every year for the restoration of the surfaces of civil constructions, every factor that can reduce the rate of the required treatment will lead to appreciable savings.

one possible way to achieve a uniform distribution of nanoparticles on the surface of a nanotextile is by a nucleation process. the nanotextile is placed in a chamber filled with a medium in a metastable state. due to the subsequent nucleation and growth process, nanoparticles are formed on the surface of the nanotextile. since the nucleation process is stochastic, a generally uniform distribution of nanoparticles over the surface can be expected. a better understanding of the nucleation process on a highly curved nanotextile surface will also be valuable in applications outside civil engineering. for example, technology involving this process has been suggested for improving the efficiency of semiconductor devices [4].
2. heterogeneous nucleation

from the thermodynamical point of view, the formation of nuclei on the surface of a nanofiber is controlled by an excess of gibbs free energy. in a metastable medium, atoms have a higher free energy than in the form of a cluster. the expression for the change of free energy during this transition consists of two terms: the volume term and the surface term.

figure 1. cross-sections of the nucleus on the nanofiber: a) parallel to the nanofiber axis; b) perpendicular to the nanofiber axis.

the volume term expresses the general tendency of a metastable phase to become stable; this term makes a negative contribution to the total free energy of the cluster. the surface term is positive, because the atoms on the interface have a relatively high free energy. thus, the change in the total free energy of n atoms during their transition from the metastable phase into a nucleus can be written as follows:
$$\Delta g = -n\Delta\mu + \gamma\sigma n^{2/3}, \qquad (1)$$
where ∆µ is the difference of the chemical potentials of the initial metastable phase and the cluster, σ represents the average excess surface energy, and γ stands for the so-called shape factor. the growing nucleus is in contact with the ambient metastable phase and the supporting nanofiber. both areas of contact are proportional to $n^{2/3}$, and the specific proportionality coefficient leads to the shape factor γ. equation (1) represents the energy barrier that the growing nuclei have to overcome, because for small n the surface term in (1) prevails. the critical nucleus is in unstable equilibrium with the environment, and its size can be found to be
$$n = \left(\frac{2\gamma\sigma}{3\Delta\mu}\right)^3. \qquad (2)$$
the nucleation rate, i.e. the number of overcritical nuclei appearing in the system per unit volume per unit time, is proportional to the exponent of the free energy of a critical nucleus:
$$i \sim \exp(-\Delta g_c). \qquad (3)$$
it can be seen from the above formulas that the nucleation rate depends on the specific shape of the critical nucleus, represented by its shape factor γ. in order to describe the nucleation process on the surface of nanofibers, one needs to know the shape of the corresponding nuclei [5, 6]. in the standard theory of heterogeneous nucleation it is assumed that the characteristic radius of curvature of the substrate is much larger than the radius of the nucleus. this assumption leads to the conclusion that growing nuclei have the shape of a spherical cap. however, this assumption is not valid for heterogeneous nucleation on a nanotextile.
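a minimal sketch (our illustration; the symbols follow eqs. (1)–(3), with energies measured in units of kt and purely illustrative parameter values) of how the shape factor γ controls the critical size and the nucleation barrier:

```python
import math

def critical_size(gamma, sigma, dmu):
    """critical cluster size from eq. (2): n = (2*gamma*sigma/(3*dmu))**3."""
    return (2.0 * gamma * sigma / (3.0 * dmu)) ** 3

def barrier(gamma, sigma, dmu):
    """nucleation barrier: eq. (1) evaluated at the critical size."""
    n = critical_size(gamma, sigma, dmu)
    return -n * dmu + gamma * sigma * n ** (2.0 / 3.0)

# a smaller shape factor (e.g. better wetting of the fiber) lowers the
# barrier and strongly raises the relative rate i ~ exp(-dg_c) of eq. (3)
for gamma in (4.8, 4.0, 3.2):
    n_c = critical_size(gamma, sigma=1.0, dmu=0.5)
    dg_c = barrier(gamma, sigma=1.0, dmu=0.5)
    print(f"gamma={gamma}: n_c={n_c:.0f}, dg_c={dg_c:.1f}, "
          f"relative rate={math.exp(-dg_c):.2e}")
```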
3. equilibrium shape of a nucleus

the equilibrium shape of a nucleus on an arbitrary surface can be obtained by minimizing its free energy under the condition of constant volume. the application of this principle to the line of contact of the nucleus with the substrate leads to the well-known young-laplace equation for the contact angle θ:
$$\sigma_{sa} - \sigma_{sn} - \sigma_{na}\cos\theta = 0, \qquad (4)$$
where σsa stands for the surface tension between the substrate and the ambient phase, σsn for the surface tension between the substrate and the nucleus, and σna for the surface tension between the nucleus and the ambient phase. from (4) it follows that the contact angle θ is constant along the whole line of contact of the nucleus with the nanofiber. consequently, the equilibrium shape of a nucleus on the cylindrical surface of a nanofiber cannot be a spherical cap, because the contact angle on the intersection of a sphere with a cylinder varies along the contact line.

some research has been done on the equilibrium shape of droplets on fibers [7, 8]. in these papers it is usually assumed that the droplet already has a macroscopic size. however, for applications in nucleation theory one needs to consider very small droplets, smaller than or comparable in size with the radius of the nanofiber. in the present paper, we propose a simple theoretical model allowing the shape factor of very small nuclei to be estimated.

while the nucleus as a whole cannot be considered as a spherical cap, one can assume that its cross-sections by planes parallel and perpendicular to the axis of the nanofiber may be well approximated by circular arcs. however, these arcs have different diameters, and their centers do not coincide. we denote the thickness of the nucleus (i.e., the maximum distance of its surface from the surface of the nanofiber) as h, the radius of the nanofiber as r, and the radii of the circular arcs in the parallel and perpendicular cross-sections as r1 and r2 (see figure 1). then
$$h = r_1(1-\cos\theta) \qquad (5)$$
and
$$h = d + r_2 - r, \qquad (6)$$
where d is the distance between the axis of the nanofiber and the center of the circular arc in the perpendicular cross-section:
$$d^2 = r_2^2 + r^2 - 2r_2 r\cos\theta. \qquad (7)$$
from (6) and (7), we obtain the expression for the radius of curvature of the surface of the nucleus in the plane perpendicular to the nanofiber axis:
$$r_2 = \frac{h^2 + 2hr}{2(h + r - r\cos\theta)}. \qquad (8)$$
the deviation of the shape of the nucleus from spherical can be estimated by the ratio of its radii of curvature r1 and r2:
$$\frac{r_1}{r_2} = \frac{2(x + 1 - \cos\theta)}{(x+2)(1-\cos\theta)}, \qquad (9)$$
where x = h/r is the dimensionless thickness of the nucleus. in figure 2, the dependence of ratio (9) on the dimensionless thickness x is shown for different values of the contact angle. it is clear that the more wettable the surface of the nanofiber is, the larger is the deviation of the shape of the nucleus from a spherical cap.

figure 2. deviation of the equilibrium nucleus shape from spherical as a function of the nucleus relative thickness x; r1 is the radius of curvature in the plane parallel to the axis of the nanofiber, r2 is the radius of curvature in the plane perpendicular to the axis of the nanofiber, and θ is the contact angle.
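a short numerical check of eq. (9) (ours, not from the paper) reproduces the trend shown in figure 2:

```python
# the anisotropy r1/r2 of the nucleus for several contact angles, as a
# function of the dimensionless thickness x = h/r. values above 1 mean the
# nucleus is more extended along the fiber axis than across it.
import math

def r1_over_r2(x, theta_deg):
    """eq. (9): ratio of the parallel and perpendicular radii of curvature."""
    c = math.cos(math.radians(theta_deg))
    return 2.0 * (x + 1.0 - c) / ((x + 2.0) * (1.0 - c))

# smaller contact angles (better wetting) give a larger deviation from
# the spherical shape, in agreement with the discussion above
for theta in (30.0, 60.0, 90.0):
    row = [round(r1_over_r2(x, theta), 2) for x in (0.1, 0.5, 1.0, 2.0)]
    print(f"theta={theta:>4}: {row}")
```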
acknowledgements

this work has been supported by ctu project sgs14/111/ohk1/2t/11 and by gacr projects p108/12/0891 and 14-04431p, and has been performed within the framework of the joint laboratory of nanofiber technology of the institute of physics ascr and the faculty of civil engineering, ctu in prague.

references

[1] lee j., mahendra s., alvarez p.: nanomaterials in the construction industry: a review of their applications and environmental health and safety considerations. acs nano, vol. 4, issue 7, 2010, p. 3580.
[2] olar r.: nanomaterials and nanotechnologies for civil engineering. buletinul institutului politehnic din iasi, vol. liv, issue 4, 2011.
[3] vodák f. et al.: effect of γ-irradiation on strength of concrete for nuclear-safety structures. cement and concrete res. 35, 2005, p. 1447.
[4] vaněček m. et al.: steady-state transport under non-equilibrium conditions in undoped and phosphorus doped a-si:h at very low temperature. journal of non-crystalline solids, vol. 90, issue 1–3, 1987, p. 183.
[5] demo p., kožíšek z., šášik r.: analytical approach to time lag in binary nucleation. phys. rev. e 59, 1999, p. 5124.
[6] demo p., sveshnikov a., hošková š., ladman d., tichá p.: physical and chemical aspects of the nucleation of cement-based materials. acta polytechnica 52, 2012, p. 15.
[7] haloi p. et al.: stability and surface free energy analysis of a liquid drop on a horizontal cylindrical wire using fem simulation. ijret, vol. 03, issue 03, 2014, p. 332.
[8] wu x., dzenis y. a.: droplet on a fiber: geometrical shape and contact angle. acta mech., vol. 185, 2006, p. 215.

the jem-euso mission

mario bertaina a,b, toshikazu ebisuzaki c, piero galeotti a,b,∗, fumiyoshi kajino d, on behalf of the jem-euso collaboration
a department of physics, university of torino, torino, italy
b national institute of nuclear physics (infn), sez. di torino, italy
c riken advance science institute, tokyo, japan
d department of physics, konan university, kobe, japan
∗ corresponding author: piero.galeotti@unito.it

abstract. the jem-euso mission explores the origin of extreme energy cosmic rays (eecrs) above 50 eev and explores the limits of fundamental physics, through observations of their arrival directions and energies. it is designed to open a new particle astronomy channel. this super-wide-field (60 degrees) telescope with a diameter of about 2.5 m looks down from space onto the night sky to detect near-uv photons (330 ÷ 400 nm, both fluorescent and cherenkov photons) emitted from the giant air showers produced by eecrs. the arrival direction map with more than five hundred events will tell us the origin of the eecrs and will allow us to identify the nearest eecr sources with known astronomical objects. it will allow them to be examined in other astronomical channels. this is likely to lead to an understanding of the acceleration mechanisms, perhaps producing discoveries in astrophysics and/or fundamental physics. the comparison of the energy spectra among the spatially resolved individual sources will help to clarify the acceleration/emission mechanism, and also finally to confirm the greisen–zatsepin–kuz'min process for the validation of lorentz invariance up to γ ∼ 10¹¹. neutral components (neutrinos and gamma rays) can also be detected, if their fluxes are high enough. the jem-euso mission is planned to be launched by an h2b rocket in about 2017 and transferred to the iss by the h2 transfer vehicle (htv). it will be attached to the exposed facility external experiment platform of kibo.

keywords: cosmic rays, neutrino, lorentz invariance, international space station.
1. introduction

the extreme universe space observatory – euso is the first space mission devoted to exploring the universe through the detection of extreme energy (e > 50 eev) cosmic rays (eecrs) and neutrinos [1–5]; it looks downward from the international space station (iss). it was first proposed as a free-flyer, but was selected by the european space agency (esa) as a mission attached to the columbus module of the iss. the phase a study for the feasibility of this observatory (which we will refer to here as esa-euso) was successfully completed in july 2004. however, because of financial problems at esa and in the european countries, together with the logistic uncertainty caused by the columbia accident, the start of phase b was put back. in 2006, japanese and u.s. teams redefined the mission as an observatory attached to kibo, the japanese experiment module (jem) of the iss. they renamed it jem-euso and started with a renewed phase a study.

jem-euso is designed to achieve our main scientific objective: astronomy and astrophysics through the particle channel, to identify sources by arrival direction analysis and to measure the energy spectra from individual sources, with an overwhelmingly high exposure, approaching 1 × 10⁶ km² sr yr at energies above 300 eev (see fig. 3). this will allow the exploration of an energy region beyond any other previous or planned experiment (in the case of space-based telescopes, the observation area of the earth's surface is essentially determined by the projection of the field of view of the optics). it will constrain the acceleration or emission mechanisms, and will also finally confirm the greisen–zatsepin–kuz'min process [6] for the validation of lorentz invariance up to γ ∼ 10¹¹.

2. scientific objectives

the scientific objectives of the jem-euso mission are divided into one main objective and five exploratory objectives. the main objective of jem-euso is to initiate a new field of astronomy that uses the extreme energy particle channel (5 × 10¹⁹ ev < e < 10²¹ ev). jem-euso has the critical exposure ((0.1 ÷ 1) × 10⁶ km² sr yr, depending on energy) to observe all the sources at least once inside several hundred mpc, and makes possible the following:

• identification of sources with high statistics by arrival direction analysis.
• measurement of energy spectra from individual sources to constrain the acceleration or the emission mechanisms.

figure 1. principle of the jem-euso telescope for detecting extreme energy cosmic rays.

we have set five exploratory objectives:

• detection of extreme energy gamma rays.
• detection of extreme energy neutrinos.
• a study of the galactic magnetic field.
• verification of relativity and quantum gravity effects at extreme energies.
• a global survey of nightglows, plasma discharges, and lightning and meteors.

see [7–9] for detailed discussions of the scientific objectives. the criteria for the success of the mission involve achieving these scientific objectives.

3. instrument

the jem-euso instrument consists of the main telescope, the atmosphere monitoring system, and the calibration system [10]. the main telescope of the jem-euso mission is an extremely fast (∼ µs) and highly pixelized (∼ 3 × 10⁵ pixels) digital camera with a large diameter (about 2.5 m) and a wide fov (±30°). it works in the near-uv wavelength range (330 ÷ 400 nm) in single-photon-counting mode.
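as a rough plausibility check of the exposure quoted above, the instantaneous ground footprint can be estimated from the ±30° field of view alone. the sketch below assumes a nominal iss altitude of about 400 km and an observational duty cycle of order 10–20 % (moonless, cloud-free night); neither number is stated in this excerpt, so both are assumptions rather than mission figures.

```python
import numpy as np

# footprint of a ±30° fov projected from orbit (flat-earth approximation)
h_iss_km = 400.0                    # assumed iss altitude [km]
half_fov = np.radians(30.0)
r_fp = h_iss_km * np.tan(half_fov)  # footprint radius on the ground [km]
area = np.pi * r_fp**2
print(f"footprint radius ≈ {r_fp:.0f} km, instantaneous area ≈ {area:.2e} km²")
# ≈ 1.7e5 km²; folded with an assumed ~10-20 % night-time duty cycle and the
# solid angle of shower arrival directions, several years of operation give
# an exposure of order 1e5-1e6 km² sr yr, the order quoted in the text.
```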
the telescope consists of four parts: the optics, the focal surface detector, the electronics, and the structure. the optics focuses the incident uv photons onto the focal surface with an angular resolution of 0.07 degree [11]. the focal surface detector converts the incident photons to photoelectrons and then to electric pulses [12, 13]. the data electronics issues a trigger for an air-shower event or another transient event in the atmosphere, and sends the necessary data to the ground for further analysis. the atmosphere monitoring system (ams) monitors the earth's atmosphere continuously inside the fov of the jem-euso telescope [14]. the ams uses an ir camera, lidar, and the slow data of the main telescope to measure the cloud-top height with an accuracy better than 500 m. the calibration system measures the efficiencies of the optics, the focal surface detector, and the data acquisition electronics [15].

figure 2. arrival time distribution of photons at the telescope per m² from an eas of e = 10²⁰ ev and θ = 60° with various cloud conditions. dashed and dotted lines correspond to the case of cirrus-like and stratus-like test clouds, along with a solid line for the clear atmosphere case. the peak at ∼ 150 µs in a clear atmosphere and cirrus-like cloud is due to the cherenkov light reflected from the ground, and it is fainter for cirrus. in the case of stratus-like clouds, such reflection occurs at the cloud top; this peak is therefore closer to the fluorescence shower maximum.

4. observational merit

in comparison with ground-based observatories, the space-based telescope may provide various merits in observations of eass induced by eecrs. one of the substantial differences is that the signals of an eas from higher altitudes are efficiently observed with no attenuation or with limited attenuation in cloudy cases, either if the cloud lies at lower altitudes or if optically thin clouds lie at high altitude. in order to determine the primary energy of eecrs, it is necessary to measure the shower development, including the signature around the maximum of the shower development. figure 2 compares the behavior of a typical eas developing inside clouds and without clouds. in the case of optically thick clouds that lie at altitudes lower than the shower maximum, e.g. stratus, the main part of the shower development is well measured, and this allows the energy deposit in the atmosphere to be reconstructed. moreover, the diffusively reflected cherenkov light enhances the total intensity from the shower, which helps to increase the efficiency of triggering the showers at nearly threshold energies. in the presence of the optically thin clouds that lie at high altitudes, e.g. those categorized as cirrus, most eas signals penetrate the layer of the clouds and are partly attenuated, and such a shower may be recognized as a lower energy event. in such a case, however, the geometry of the shower axis is properly determined by the analysis of the angular velocity of the eas signal.

figure 3 shows the evolution of the exposures of the past and future missions devoted to research on the extremely high energy cosmic rays. at the highest energies, jem-euso can achieve more than one order of magnitude larger exposure than the auger experiment or the telescope array experiment.

figure 3. expected highest cumulative exposure, in km² sr yr or linsley units, of jem-euso. the two thick red curves correspond to pure nadir mode and pure tilted mode; the actual exposure will depend on the final operating mode adopted and will lie between the two curves.
for comparison, the evolution of exposure by other retired, running and proposed eecr observatories is shown.

figure 4. relative deviation from uniformity of the aperture as a function of the sine of declination. dashed curves show the cases for a selection of events with different zenith angles. the pure isotropic exposure to the solid angle is defined as 0. the horizontal axis on the bottom denotes the corresponding declination.

figure 4 demonstrates the uniformity of exposure expected in the jem-euso mission as a function of the sine of declination (solid angle). this is not the case for ground-based experiments, such as auger and telescope array. in addition to the significant increase in the overall exposure by about one order of magnitude compared with auger as of today, the orbiting jem-euso telescope will cover the entire celestial sphere. moreover, the cumulative exposure results in a high degree of uniformity, thanks to the inclined iss orbit. this advantage is more pronounced if the eecrs from a single source are observed with angular spread. if this is the case, the gradient of the exposure distribution in the celestial sphere may smear out the real signals from the sources. with the wide fov of the jem-euso telescope observing from space, the measurement of the entire profile of the shower development is simpler than with ground-based experiments with a relatively small fov. jem-euso will survey an atmospheric mass on the earth in excess of 10¹² tons, and will be more sensitive to showers with larger zenith angles. this property allows effective measurements of neutrino-induced showers.

5. conclusions

jem-euso is a scientific mission looking downward from the iss to explore the extremes in the universe and fundamental physics through the detection of extreme energy (e > 5 × 10¹⁹ ev) cosmic rays. it is the first instrument that has full-sky coverage and achieves an exposure comparable to one million km² sr yr. the jem-euso mission is planned to be launched by an h2b rocket in about 2017. it will be transferred to the iss by the h2 transfer vehicle (htv), and attached to the external experiment platform of kibo.

acknowledgements

this work has been partially supported by the italian ministry of foreign affairs, general direction for cultural promotion and cooperation.

references

[1] y. takahashi et al., 2009, new journal of physics, 11, 065009
[2] t. ebisuzaki et al., 2008, nucl. phys. b (proc. suppl.), 175–176, 237
[3] t. ebisuzaki et al. (jem-euso collab.), proc. 31st icrc, 2009 (#icrc1035)
[4] t. ebisuzaki et al., 2009, tours symp. on nuclear physics and astrophys. vii, pp. 369–376
[5] f. kajino et al., 2010, nuclear instruments and methods a 623, 422–424
[6] k. greisen, 1966, phys. lett. 16, 148; g. t. zatsepin and v. a. kuz'min, 1966, jetp phys. lett. 4, 78
[7] a. santangelo et al., 2009, tours symp. nuclear physics and astrophys. vii, pp. 380–387
[8] k. shinozaki et al., 2009, tours symp. nuclear physics and astrophys. vii, pp. 377–37
[9] g. medina-tanco et al. (jem-euso collab.), proc. 32nd icrc, 2011 (ibidem #0956)
[10] f. kajino et al. (jem-euso collab.), proc. 32nd icrc, 2011 (ibidem #1216)
[11] j. h. adams (jem-euso collab.), proc. 32nd icrc, 2011 (ibidem #1100)
[12] y. kawasaki et al. (jem-euso collab.), proc. 32nd icrc, 2011 (ibidem #0472)
[13] m. casolino et al. (jem-euso collab.), proc. 32nd icrc, 2011 (ibidem #1219)
[14] a. neronov et al. (jem-euso collab.), proc. 32nd icrc, 2011 (ibidem #0301)
[15] p. gorodetzky et al. (jem-euso collab.), proc. 32nd icrc, 2011 (ibidem #0218)

decomposition of intumescent coatings: comparison between a numerical method and experimental results

l. m. r. mesquita, p. a. g. piloto, m. a. p. vaz, t. m. g. pinto

abstract. an investigation of two different intumescent coatings used in steel fire protection has been performed to evaluate their efficiency. a set of experimental tests is presented. they were conducted in a cone calorimeter, considering different thicknesses and heat fluxes. among other quantities, the steel temperature and the intumescence thickness variation were measured. a numerical model for the paint decomposition is also presented. based on the intumescence experimental value, the model is able to provide a reasonably good prediction of the steel temperature evolution.

keywords: fire protection, intumescent coatings, cone calorimeter, char formation, heat transfer, energy equation, thermal decomposition, porosity, arrhenius equation, activation energy.

1 introduction

passive fire protection materials insulate steel structures from the effects of the elevated temperatures that may be generated during a fire. they can be divided into two types: non-reactive, of which the most common types are boards and sprays, and reactive, of which intumescent coatings are an example. they are available as solvent- or water-based systems applied up to approximately 3 mm. one problem associated with the use of such systems is the adhesion of the charred structure to the steel element during and after a fire. it is very important that the char remains on the steel surface to ensure fire protection. intumescent chemistry has changed little in recent years, and almost all coatings are largely based on the presence of similar key components. the chemical compounds of intumescent systems are classified into three categories: a carbonisation agent, a carbon-rich polyhydric compound that influences the amount of char formed and the rate of char formation; an acid source; and a foaming agent, which during their degradation release non-flammable gases such as co2 and nh3 [1]. activated by fire or heat, a sequential chemical reaction between several chemical products takes place. at higher temperatures, between (200–300) °c, the acid reacts with the carboniferous agent. the formed gases will expand, beginning the intumescence in the form of a carbonaceous char.

different models handle the intumescent behaviour of char-forming polymers as a heat and mass transfer problem. other existing models provide a suitable description regarding the intumescence and char formation using kinetic studies of thermal degradation, accounting for the complex sequence of chemical reactions and the thermal and transport phenomena [2–5]. due to the thermal decomposition complexity of intumescent coating systems, the models presented so far are based on several assumptions, the most relevant being the consideration of one-dimensional heat transfer through the material, temperature- and space-independent thermal properties, and the assumption of a constant incident heat flux where the heat losses by radiation and convection are ignored [3]. some authors also assume that the thermochemical processes of intumescence occur without energy release or energy absorption [6]. results show that the insulation efficiency of the char depends on the cell structure, and the low thermal conductivity of intumescent chars results from the pockets of trapped gas within the porous char, which act as a blowing agent to the solid material.

in a previous work, considering the results obtained from coated steel plates tested in a cone calorimeter, the authors studied intumescence as a single homogeneous layer. the steel temperature variation was considered, and with the intumescence thickness time variation an inverse one-dimensional heat conduction problem (ihcp) was applied to determine the intumescence effective thermal conductivity and thermal resistance [7]. this work presents an experimental study to assess the performance of water-based intumescent paints used as a passive fire protection material.
these tests were performed in a cone calorimeter, on steel plates coated with two different paints, with three dry film thicknesses, and considering two different radiant heat fluxes. during the tests, among other quantities, the steel temperature, the intumescence mass loss and the thickness variation were measured. a numerical model is also presented to study the intumescence behaviour. the paint thermal decomposition numerical model is based on the conservation equations of energy, mass and momentum.

2 experimental tests performed in the cone calorimeter

to assess the performance of two commercial water-based intumescent paints, a set of experimental tests was performed in a cone calorimeter, see fig. 1 and fig. 2. the steel plates are 100 mm square and 4 or 6 mm thick, coated on one side with different dry film thicknesses and tested in a cone calorimeter, as prescribed by the iso 5660 standard [8]. temperatures are measured by means of four thermocouples, type k, welded to the plate on the heating side and on the opposite side, at two different positions. the samples were weighed before and after being coated, to allow for the initial coating mass. the dry thickness was also measured at 16 different points. the mean values and the standard deviation are presented in fig. 1. between the steel plate and the sample holder, two silicate plates were used to put the specimen in place, and a thermocouple was also placed there to measure its temperature variation. the distance between the sample surface and the heater remained unchanged, at approximately 60 mm. this means that with increasing intumescence the top of the sample came closer to the cone surface. due to the large volume of results, only a set of samples will be referenced in this paper.

experimental results

the temperature evolution in a steel plate without protection was also tested, to assess the efficiency of this fire protection. the measured temperatures are presented in fig. 3, for a radiant heat flux of 35 kw/m² and then resetting the cone to 75 kw/m².
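the per-specimen statistics tabulated in fig. 1 below come from the 16 point readings of dry film thickness. a minimal sketch of that reduction step; the 16 readings used here are made-up illustrative numbers, not measured values from the paper:

```python
import numpy as np

# 16 dft point readings for one specimen -> mean, sd, max, min (as in fig. 1)
readings_um = np.array([545, 601, 478, 540, 560, 530, 555, 548,
                        571, 520, 538, 562, 549, 533, 557, 544])
print(f"dft mean = {readings_um.mean():.0f} µm, "
      f"sd = {readings_um.std(ddof=1):.1f} µm, "
      f"higher = {readings_um.max()} µm, smaller = {readings_um.min()} µm")
```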
specimen | initial mass [g] | final mass [g] | coating mass [g] | dft [µm] | σ (sd) [µm] | higher [µm] | smaller [µm]

coating a:
a 35 4 0.5 1-3 | 363.77 | 371.28 | 7.51 | 545 | 40.6 | 601 | 478
a 35 4 0.5 2-3 | 363.82 | 369.98 | 6.16 | 615 | 51.5 | 695 | 504
a 35 4 0.5 3 | 364.54 | 373.19 | 8.65 | 528 | 60.4 | 624 | 427
a 35 4 1.5 1 | 361.10 | 387.74 | 26.64 | 1670 | 107 | 1860 | 1500
a 35 4 1.5 2 | 362.17 | 388.06 | 25.89 | 1610 | 72.2 | 1750 | 1500
a 35 4 1.5 3 | 361.38 | 385.42 | 24.04 | 1450 | 84.9 | 1580 | 1280
a 35 4 2.5 1-2 | 362.81 | 400.38 | 37.54 | 2530 | 165 | 2800 | 2240
a 35 4 2.5 2 | 365.81 | 407.89 | 42.08 | 2590 | 122 | 2790 | 2310
a 35 4 2.5 3-2 | 363.49 | 404.38 | 40.89 | 2680 | 179 | 2920 | 2370
a 75 4 0.5 1 | 363.46 | 372.34 | 8.88 | 549 | 60.3 | 639 | 425
a 75 4 0.5 2-2 | 363.58 | 371.33 | 7.75 | 586 | 36.3 | 651 | 538
a 75 4 0.5 3 | 368.44 | 377.85 | 9.41 | 582 | 48.6 | 657 | 466
a 75 4 1.5 1 | 369.59 | 394.82 | 25.23 | 1510 | 83.7 | 1660 | 1390
a 75 4 1.5 2 | 371.11 | 396.24 | 25.13 | 1530 | 87.7 | 1720 | 1380
a 75 4 1.5 3 | 364.87 | 391.13 | 26.26 | 1620 | 98.7 | 1820 | 1450
a 75 4 2.5 1 | 366.97 | 407.71 | 40.74 | 2590 | 122 | 2760 | 2330
a 75 4 2.5 2 | 365.11 | 404.90 | 39.79 | 2590 | 134 | 2800 | 2350
a 75 4 2.5 3 | 370.60 | 410.77 | 40.17 | 2530 | 167 | 2810 | 2260
a 35 6 0.5 1 | 527.37 | 535.05 | 7.68 | 476 | 33.1 | 518 | 403
a 35 6 2.5 1 | 526.65 | 565.71 | 39.06 | 2420 | 150 | 2610 | 2130
a 75 6 0.5 1 | 522.90 | 530.58 | 7.68 | 494 | 33.9 | 561 | 434
a 75 6 2.5 1 | 525.71 | 564.89 | 39.18 | 2490 | 112 | 2670 | 2290

coating b:
b 35 4 0.5 1 | 366.73 | 375.36 | 8.63 | 571 | 41.6 | 665 | 506
b 35 4 0.5 2 | 365.38 | 374.88 | 9.50 | 626 | 38.6 | 698 | 563
b 35 4 0.5 3 | 364.95 | 373.95 | 9.00 | 603 | 49.5 | 710 | 481
b 35 4 1.5 1 | 365.63 | 390.10 | 24.47 | 1510 | 70.2 | 1610 | 1400
b 35 4 1.5 2 | 365.82 | 391.42 | 25.60 | 1570 | 64.1 | 1670 | 1470
b 35 4 1.5 3 | 364.80 | 390.67 | 25.87 | 1580 | 66.5 | 1710 | 1470
b 35 4 2.5 1 | 365.49 | 409.85 | 44.36 | 2640 | 90.9 | 2750 | 2460
b 35 4 2.5 2 | 366.29 | 409.12 | 42.83 | 2560 | 89.0 | 2660 | 2400
b 35 4 2.5 3 | 366.40 | 407.77 | 41.37 | 2510 | 85.7 | 2660 | 2350
b 75 4 0.5 1 | 362.92 | 371.94 | 9.02 | 581 | 35.9 | 653 | 518
b 75 4 0.5 2 | 366.00 | 375.97 | 9.97 | 662 | 53.9 | 817 | 599
b 75 4 0.5 3 | 367.53 | 377.53 | 10.00 | 631 | 31.2 | 707 | 583
b 75 4 1.5 1 | 366.27 | 390.71 | 24.44 | 1530 | 79.5 | 1720 | 1440
b 75 4 1.5 2 | 364.69 | 389.63 | 24.94 | 1550 | 67.8 | 1690 | 1450
b 75 4 1.5 3 | 359.09 | 384.05 | 24.96 | 1560 | 74.9 | 1740 | 1450
b 75 4 2.5 1 | 359.79 | 399.66 | 39.87 | 2520 | 211 | 2840 | 2170
b 75 4 2.5 2 | 364.28 | 405.30 | 41.02 | 2520 | 91.4 | 2690 | 2350
b 75 4 2.5 3 | 364.80 | 404.97 | 40.17 | 2490 | 126 | 2760 | 2340
b 35 6 0.5 1 | 528.60 | 537.10 | 8.50 | 533 | 56.7 | 663 | 431
b 35 6 2.5 1 | 528.91 | 571.74 | 42.83 | 2570 | 105 | 2720 | 2360
b 75 6 0.5 1 | 525.47 | 534.86 | 9.39 | 607 | 65.9 | 799 | 528
b 75 6 2.5 1 | 529.04 | 570.00 | 40.96 | 2610 | 75.8 | 2760 | 2500

fig. 1: set of experimental tests. reference: paint / heat flux / steel thickness / dry thickness / test n°.
fig. 2: coated steel plates, with fixed thermocouples. tested samples at 35 kw/m² and at 75 kw/m², reference and position of the thermocouples.

fig. 4 represents the mass loss of each sample and shows a variation almost linear with time, mainly for a heat flux of 35 kw/m². using discrete frames obtained from a digital camera during the tests, and by image processing techniques using matlab, the intumescence development was measured over time.
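the frame-processing step just described can be sketched in a few lines; the paper used matlab, and the version below transcribes the idea to python with a synthetic frame. the threshold and the pixel-to-millimetre calibration are assumptions, not the paper's values:

```python
import numpy as np

# toy version of the per-frame measurement: estimate the intumescence
# height from one grayscale camera frame (here a synthetic array; a real
# frame would come from an image reader).
frame = np.zeros((120, 100))
frame[60:95, 10:90] = 0.8            # bright intumescent char region
frame += 0.05 * np.random.rand(*frame.shape)

mask = frame > 0.5                    # simple intensity threshold (assumed)
rows = np.where(mask.any(axis=1))[0]
height_px = rows.max() - rows.min() + 1
mm_per_px = 0.1                       # assumed calibration [mm/pixel]
print(f"intumescence thickness ≈ {height_px * mm_per_px:.1f} mm")
# repeating this for every frame gives the l(t) curves of fig. 5; the paper
# averages four central measurements per frame.
```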
fig. 3: measured temperature in the steel plate without protection.
fig. 4: measured mass loss with time.
fig. 5: intumescence thickness, mean values of four central measurements.
fig. 6: temperature variation of steel and silicate plates for coating a.

fig. 5 presents the intumescent development (free boundary l(t)) for specimens with paint a and b, different thicknesses and radiant heat fluxes. higher intumescence may be observed in the right-hand region of the sample, coincident with the thermocouple wire position, which is responsible for coating accumulation. the presented values are mean values determined from four central measurements. the figures show that for the lower heat flux the intumescence becomes stable, but for the higher heat flux it continues to increase. coating a presents a higher expansion at the initial stage compared to coating b. for longer periods of exposure coating b continues to expand. the steel temperature profiles and the temperatures in the middle of the silicate plates are reported in fig. 6 and fig. 7. measured values from the thermocouples welded to the bottom of the plate are very close to the temperatures at the top. for the same heat flux, the time to reach the same temperature increases with the increase of the dry thickness. the behaviour is very similar for both coatings, but in all cases the time to reach, e.g., a temperature of 200 °c is always higher when paint b is used. for these conditions it gives improved fire protection.

3 mathematical model of the intumescence behaviour

to determine the temperature field in an intumescent material, it is necessary to solve a phase transformation problem with two or more moving boundaries that characterize its states: initial, softened and carbonaceous char. different methodologies can be found in the literature to model the thermal decomposition of a polymer or polymer-based materials. the methodology followed in this work was to consider that the decomposition occurs not only at the outside surface but also inside, for temperatures above the pyrolysis temperature, tp. in this case the moving boundary regression rate must be determined considering the motion of the whole domain. this strategy implies that a mass diffusion term needs to appear in the energy equation due to its motion. this term was disregarded due to the small thickness of the virgin layer for these types of applications, about 1–3 mm. considering a first-order reaction, the mass loss is given by

ṁ(t(x, t)) = ρv a0 exp(−e0/(r t(x, t))) for t ≥ tp, (1)

where ṁ is the local mass loss [kg·m⁻³·s⁻¹], t(x, t) is the temperature at point x at instant t, a0 is the pre-exponential factor [s⁻¹], e0 the activation energy [j·mol⁻¹], and r the universal gas constant [j·mol⁻¹·k⁻¹]. the position of the moving boundary is obtained by summing all the mass loss and dividing by the specific mass. the energy equation for the steel and virgin layers is based on the one-dimensional heat conduction equation. the conservation equation for the solid virgin material phase is given by

∂ρv/∂t = −ẇmdv, (2)

where ẇmdv represents the destruction rate of virgin material per unit volume, originated by the thermal decomposition. the virgin material decomposition produces a fraction of gas, equal to the porosity, φ, and a solid char fraction equal to (1 − φ).

fig. 7: temperature variation of steel and silicate plates for coating b.
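a minimal sketch evaluating the first-order arrhenius mass-loss rate of eq. (1). the parameter values are the ones quoted later in the text for the simulations (a0 = 4.67 · 10¹² s⁻¹, e0 = 125 kj/mol, tp = 525 k, and the measured virgin density of paint a); the temperature sweep itself is illustrative.

```python
import numpy as np

R = 8.314        # universal gas constant [j mol^-1 k^-1]
A0 = 4.67e12     # pre-exponential factor [s^-1] (value used in the paper)
E0 = 125e3       # activation energy [j mol^-1] (literature value, see below)
RHO_V = 1360.0   # virgin coating specific mass, paint a [kg m^-3]
T_P = 525.0      # pyrolysis temperature [k]

def mass_loss_rate(T):
    """local mass-loss rate of eq. (1); zero below the pyrolysis temperature."""
    return np.where(T >= T_P, RHO_V * A0 * np.exp(-E0 / (R * T)), 0.0)

for T in (500.0, 550.0, 600.0, 700.0):
    print(f"t = {T:5.0f} k -> mdot = {mass_loss_rate(T):10.3e} kg m^-3 s^-1")
```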
the formation rates of char and gas mass are

ẇgas = γ (1 − ρc/ρv) ρv ds(t)/dt, ẇchar = (1 − γ (1 − ρc/ρv)) ρv ds(t)/dt, (3)

where γ represents the fraction of the bulk density difference between the virgin and char materials that is converted to gas. in this study the value used was γ = 0.66 [9]. the conservation of gas mass is given by eq. (4):

∂(φρg)/∂t + (φρg/v) ∂v/∂t + ∂ṁ″g/∂x − ẇgas = 0. (4)

in the previous equation, ∂v/∂t represents the intumescence rate. the gas mass flux, ṁ″g, is calculated according to darcy's law, and it is assumed that the gases present in the intumescent material behave as a perfect gas. the thermodynamic properties are related by the ideal gas law and, assuming that the gas is a mixture of 50 wt% co2 and 50 wt% h2o, the generated gas molar mass used in the model, mg, is 31 g/mol. the conservation of gas mass equation with the darcy and the ideal gas laws combined can be used to give a differential equation for the pressure inside the intumescence. in the numerical calculations, the intumescence rate is assumed to be known, provided by experimental results, so the pressure calculation is disregarded on the assumption that the internal pressure is constant and equal to the atmospheric pressure.

an energy equation for the conservation of energy within the intumescence zone can be obtained by combining the energy equation for the gases with that of the solid char material. the equation for the conservation of energy per unit bulk volume can be written as:

(ρ cp)eff ∂t/∂t + ṁ″g cpg ∂t/∂x = ∂/∂x (keff ∂t/∂x) − (ρ cp)eff t (1/v) ∂v/∂t, (5)

where (ρ cp)eff = φ ρg cpg + (1 − φ) ρc cpc and keff = φ kg + (1 − φ) kc. the effective thermal conductivity for the intumescence bulk material, including gas and char, is equal to the thermal conductivity of the gas per unit bulk volume, plus that of the solid material. the same applies to the effective heat capacity. at the steel plate back surface we assume an adiabatic boundary condition, and at the boundary between the steel and virgin layers we assume perfect thermal contact. at the moving front, the boundary conditions are:

kv ∂t/∂x = q̇″ − εσ(t⁴ − ta⁴) − hc(t − ta) for t(s(t), t) < tp,
kv ∂t/∂x = kc ∂t/∂x − q̇h for t(s(t), t) ≥ tp, (6)

in which q̇h is the heat flux due to the endothermic decomposition of the virgin material, given by q̇h = hp ρv ṡ(t), where hp represents the decomposition enthalpy. a wide range of values is reported in the literature for the heat of pyrolysis, ranging from a few units to units of millions. the value used in the calculations was 50 j/kg. the intumescent coating specific mass was measured by the pycnometer method, giving values of 1360 and 1250 kg/m³ for the virgin coating, and values of 692.4 and 450 kg/m³ for the char material, for paints a and b, respectively. the steel properties are assumed constant, with a specific heat value of 600 j/(kg·k) and a specific mass equal to 7850 kg/m³. the mathematical model is based on the following major simplifying assumptions: there is no heat exchange between gas and char, and the thermophysical properties and the pressure in both layers are constant. the solution method was implemented in a matlab routine using the method of lines (mol) [10], and the integrator ode15s to solve the set of ordinary differential equations. the temperature field is determined by the steel and virgin energy equations. when the front reaches the pyrolysis temperature, equal to 250 °c, it starts to decompose and to move. then the moving front rate is determined and the intumescence forms.
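a minimal sketch of the method-of-lines idea described above, transcribed to python/scipy rather than matlab (scipy's bdf integrator plays the role of ode15s). it covers only the fixed-geometry conduction part, steel plus a virgin layer heated by the cone, without the decomposition terms and moving front of eqs. (1)–(6); the steel conductivity and the grid size are assumed values, the other properties are the ones listed in the text.

```python
import numpy as np
from scipy.integrate import solve_ivp

# steel (4 mm) + virgin coating (1.5 mm), 1-d conduction by method of lines;
# K_S is an assumed textbook value, the rest follows the paper's properties
L_S, L_V = 4e-3, 1.5e-3
K_S, RHO_S, CP_S = 45.0, 7850.0, 600.0
K_V, RHO_V, CP_V = 0.5, 1360.0, 2600.0
Q_INC = 35e3                                     # incident heat flux [w/m²]
EPS, SIG, H_C, T_A = 0.92, 5.67e-8, 20.0, 293.0

N = 40
x = np.linspace(0.0, L_S + L_V, N)
dx = x[1] - x[0]
k = np.where(x <= L_S, K_S, K_V)
rho_cp = np.where(x <= L_S, RHO_S * CP_S, RHO_V * CP_V)

def rhs(t, T):
    kf = 2.0 * k[:-1] * k[1:] / (k[:-1] + k[1:])    # harmonic mean at faces
    q = -kf * np.diff(T) / dx                        # interior conductive fluxes
    # exposed face: cone flux minus radiative and convective losses
    q_surf = Q_INC - EPS * SIG * (T[-1]**4 - T_A**4) - H_C * (T[-1] - T_A)
    q_faces = np.concatenate([[0.0], q, [-q_surf]])  # adiabatic steel back face
    return -np.diff(q_faces) / (rho_cp * dx)

sol = solve_ivp(rhs, (0.0, 600.0), np.full(N, T_A), method="BDF", rtol=1e-6)
print(f"steel back-face temperature after 600 s: {sol.y[0, -1] - 273.15:.0f} °c")
```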
the position of the free boundary is set equal to the experimental results, and the intumescence temperature field is determined. in each time step the virgin and char layers are remeshed. the input parameters are listed as follows: kv = 0.5 w·m⁻¹·k⁻¹; kc = 0.1 w·m⁻¹·k⁻¹; cpv = 2600 j·kg⁻¹·k⁻¹; cpc = 3000 j·kg⁻¹·k⁻¹; hc = 20 w·m⁻²·k⁻¹; ε = 0.92; tp = 525 k; a0 = 4.67 · 10¹² s⁻¹.

two case studies are presented in fig. 8 and fig. 9. in the first study, the steel temperature variation and the moving front position are determined based on a value of the activation energy equal to e0 = 125 kj·mol⁻¹. the numerical results follow the experimental values reasonably well. the major differences occur at intermediate times, probably because a transition state of molten polymer was not considered. both the determined steel temperatures and the moving front are strongly dependent on the activation energy, which defines the amount of mass loss of virgin paint, as presented in fig. 9. it must be said that the value used in the simulations was obtained from the literature, but the correct values for both paints are needed. the reaction kinetics parameters can be obtained from thermogravimetric analysis.

fig. 8: comparison of measured and computed steel temperatures and position of the moving front, e0 = 125 kj·mol⁻¹.

acknowledgments

the authors acknowledge financial support from the portuguese science and technology foundation, project ptdc/eme-pme/64913/2006, “assessment of intumescent paint behaviour for passive protection of structural elements submitted to fire conditions”, and fellowship sfrh/bd/28909/2006. the authors also acknowledge the contribution from the paint producers: cin, nullifire.

references

[1] duquesne, s., bourbigot, s., delobel, r.: “mechanism of fire protection in intumescent coatings”. european coatings conference: fire retardant coatings ii, berlin, 2007.
[2] staggs, j. e. j.: “a discussion of modelling idealised ablative materials with particular reference to fire testing”. fire safety journal, vol. 28 (1997), p. 47–66.
[3] moghtaderi, b., novozhilov, v., fletcher, d., kent, j. h.: “an integral model for the transient pyrolysis of solid materials”. fire and materials, vol. 21 (1997), p. 7–16.
[4] lyon, r. e.: “pyrolysis kinetics of char forming polymers”. polymer degradation and stability, n1, vol. 61 (1998), p. 201–210.
[5] jia, f., galea, e. r., patel, m. k.: “numerical simulation of the mass loss process in pyrolizing char materials”. fire and materials, n1, vol. 23 (1999), p. 71–78.
[6] kuznetsov, g. v., rudzinskii, v. p.: “heat transfer in intumescent heat- and fire-insulating coatings”. journal of applied mechanics and technical physics, vol. 40 (1999), no. 3.
[7] mesquita, l. m. r., piloto, p. a. g., vaz, m. a. p., pinto, t.: “numerical estimation for intumescent thermal protection using one-dimensional ihcp”. wccm8-eccomas 2008, venice, italy, june 30–july 5, 2008. isbn 978-84-96736-55-9.
[8] iso 5660-1:2002: reaction-to-fire tests – heat release, smoke production and mass loss rate. part 1: heat release rate (cone calorimeter method), international organization for standardization, 2002.
[9] lautenberger, c.: a generalized pyrolysis model for combustible solids. ph.d. thesis, university of california at berkeley, berkeley, ca, 2007.
[10] wouwer, a. v., saucez, p., schiesser, w. e.: “simulation of distributed parameter systems using a matlab-based method of lines toolbox: chemical engineering applications”. ind. eng. chem. res. vol. 43 (2004), p. 3469–3477.
luís m. r. mesquita (e-mail: lmesquita@ipb.pt), paulo a. g. piloto, tiago m. g. pinto: applied mechanics department, polytechnic institute of bragança, portugal.
mário a. p. vaz: faculty of engineering, university of porto, portugal.

fig. 9: influence of the activation energy on the steel temperature and on the moving front.

influence of the results of uhecr detection on the lhc experiments

anatoly a. petrukhin∗
national research nuclear university mephi, moscow, russia
∗ corresponding author: aapetrukhin@mephi.ru

abstract. the cosmic ray energy region 10¹⁵ ÷ 10¹⁷ ev corresponds to lhc energies 1 ÷ 14 tev in the center-of-mass system. the results obtained in cosmic rays (cr) in this energy interval can therefore be used for developing new approaches to the analysis of experimental data, for interpreting the results, and for planning new experiments. the main problem in cosmic ray investigations is the remarkable excess of muons, which increases with energy and cannot be explained by means of contemporary theoretical models. some possible new explanations of this effect and other unusual phenomena observed in cr, and ways of searching for them in the lhc experiments, are discussed.

keywords: cosmic ray energy spectrum, cosmic ray composition, eas, muons, quark-gluon matter.

1. introduction

the discovery of the higgs boson is the main task for the lhc, and this task will be solved in the nearest future, positively or negatively¹. (¹ during the preparation of this paper, the discovery of the higgs boson was announced.) however, this gigantic experimental complex will continue to work, and its possibilities will continue to expand. the question "what are the next tasks?" is therefore very topical. of course, there are many tasks connected with investigations of hadron-hadron interactions at lhc energies, tests of the behavior of existing theoretical models at such energies, etc. there are also many new theoretical ideas: supersymmetry, dark matter, etc., which will be searched for at lhc energies.

in parallel with the development of accelerator equipment and experiments, corresponding investigations have been conducted in cosmic rays. lhc energies of 1 ÷ 14 tev correspond to the interval 10¹⁵ ÷ 10¹⁷ ev in the laboratory system for pp-interactions (for nuclei–nuclei interactions the upper limit can be higher), and it is precisely at these energies that many interesting and sometimes unusual results have been observed in cr investigations. of course, investigations in cosmic rays have many drawbacks compared to accelerator experiments, since in cosmic ray interactions many parameters of the particles are unknown: the type of particle, its energy, arrival direction, etc. in addition, the results of cr experiments can be interpreted in two ways: as an investigation of particle interactions, if one believes that the energy spectrum and composition of cr are known, or as a study of the characteristics of the cr flux, if one assumes that the hadron interaction model is known. one of the main disadvantages is the poor statistics, since the cr flux decreases very rapidly with increasing energy.
however, numerous cr experiments have shown that below 10¹⁵ ev no serious deviations of the measured results from the standard cr energy spectrum and composition, or from the interaction model, have been observed. the purpose of this paper is to analyze the consequences for the lhc experiments that follow from the results of cr investigations at energies above 10¹⁵ ev.

2. evidence of new physics in the cr experiments

investigations of cosmic rays with energies above 10¹⁵ ev can be conducted at the earth's surface only, since the flux of such particles is very small and very large detectors are required. primary crs interact with air nuclei at high altitudes, and the results of these interactions are detected. the general scheme of a cr study at energies above 10¹⁵ ev is illustrated by fig. 1.

figure 1. general scheme of eas investigations.

in principle, it is possible to detect all secondary components: the number of electrons ne (more exactly, all charged particles), the number of muons nµ, the energy deposit of the eas core δeh, the longitudinal shape of the eas development by using cherenkov or fluorescence radiation (c.c., cascade curve), and the maximum of the eas development xmax. in recent times, new parameters of eas have also been investigated: dµ – the local muon density, and nn – the number of eas neutrons. in principle, two approaches to the analysis of measured eas parameters are possible:

– a cosmophysical approach, in which it is assumed that the eas energy is equal to the energy of the primary particle, and all changes of eas parameters in dependence on energy are the result of changes of the energy spectrum or/and the composition of the cr only.
– a nuclear-physical approach, in which it is believed that changes of eas parameters are the result of the inclusion of a new process of interaction or the production of new particles, states of matter, etc. in this case, the eas energy is not necessarily equal to the primary particle energy.

the formation of the knee can be considered as the first evidence in favor of a change in the interaction model. as was first shown in [1], the knee in the cr energy spectrum can be the result of the appearance of missing energy (fig. 2), which is taken away by undetectable particles (three types of neutrinos) and muons, the energy of which is not usually measured. to implement this approach, a change in the interaction model at a center-of-mass energy of about 3 tev is required (for details, see [1]).

figure 2. the change in the cr energy spectrum at the appearance of the missing energy ∆e = e0 − eeas.

the second piece of evidence in favor of a change in the interaction model was obtained from the mountain experiments, mainly “pamir” and “chakaltaya”, in which various unusual phenomena – halos, alignment, penetrating cascades, centauros, anti-centauros – were observed. though it was very difficult to evaluate the energy of the primary cr particles in these experiments, a comparison of the intensity of the observed events with the intensity of the cosmic rays shows that these events appear at energies between 10¹⁵ and 10¹⁶ ev.
a more detailed description and an analysis of the unusual events were given earlier [2]. the third piece of evidence in favor of new physics was obtained in cr investigations in muon experiments. first, it is the ever-increasing excess of the ratio of the number of muons to the number of electrons compared with the predicted ratio at energies 10¹⁵ ÷ 10¹⁷ ev. this increase can be explained by a change in the cr composition. however, a further increase in this ratio and the appearance of the muon excess even under the assumption of a pure iron composition can be evidence for the inclusion of new physical processes. the situation is the same with muons of very high energies (> 100 tev). the tendency toward the appearance of a muon excess with increasing muon energy to 100 tev is remarkable [3].

3. possible version of a new interaction model

a possible model for describing all unusual phenomena observed in cr investigations above 10¹⁵ ev must satisfy the following requirements:

(1.) threshold behavior (unusual events appear at several pev only).
(2.) large cross section (to change the eas spectrum slope).
(3.) large yield of leptons (excess of vhe muons, missing energy and penetrating cascades).
(4.) large orbital momentum (alignment).
(5.) quicker development of eas (to increase the nµ/ne ratio and decrease the xmax elongation rate).

there are various ways to construct the necessary model, from including a new (e.g., super-strong) interaction at distances of about 10⁻¹⁷ cm and generating new massive particles (m ∼ 1 tev), to producing blobs of quark-gluon matter (qgm). we consider the last model, since it provides a demonstrable explanation of the inclusion of a new interaction and provides predictions that can be checked both in cr investigations and in lhc experiments. the production of qgm provides two main conditions:

– threshold behavior, since a high temperature (energy) is required for it;
– a large cross section, since the transition occurs from a quark–quark interaction to some collective interaction of many quarks:

σ = πλ̄² → σ ∼ π(λ̄ + r)² or π(r1 + r2)², (1)

where r, r1 and r2 are the sizes of the quark-gluon blobs.

however, a large value of the orbital angular momentum is required for an explanation of the other observed phenomena. as has been shown by zuo-tang liang and xin-nian wang [4], a globally polarized qgp with a large orbital angular momentum, which increases with energy as l ∼ √s, appears in non-central ion–ion collisions. in this case, a blob of quark-gluon matter can be considered as a usual resonance with a large centrifugal barrier. the centrifugal barrier v(l) = l²/(2mr²) will be large for light quarks but small for top-quarks or other heavy particles. the orbital angular momentum value can be of the order of 10⁵ [5]. the probability of the production of top–antitop pairs therefore increases. however, the simultaneous interaction of many quarks changes the energy in the center-of-mass system drastically:

√s = √(2mp e0) → √(2mc e0), (2)

where the mass of the qgm blob is mc ≈ n mn. at the threshold energy, n ∼ 4 (α-particle). the tt̄-quarks that are produced take away the energy εt > 2mt ≈ 350 gev and, taking into account the fly-out energy, εt > 4mt ≈ 700 gev in the center-of-mass system.
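a quick numerical reading of eq. (2), as a minimal sketch; the grid of lab energies is illustrative:

```python
import numpy as np

M_P = 0.938e9                                # proton (≈ nucleon) mass [ev]
E0 = np.array([1e15, 3e15, 1e16, 1e17])      # lab-system primary energies [ev]

# eq. (2): sqrt(s) for a target nucleon (n = 1) vs. a qgm blob m_c ≈ n*m_n
for n in (1, 4):
    sqrt_s = np.sqrt(2.0 * n * M_P * E0)
    print(f"n = {n}:", ["%.1f tev" % (v / 1e12) for v in sqrt_s])
# for n = 1 the knee region (~3e15 ev) gives sqrt(s) ≈ 2.4 tev, close to the
# ~3 tev threshold quoted above; a blob with n = 4 doubles sqrt(s).
```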
top-quarks decay in the following way: tt̄ → w⁺(w⁻) + b(b̄). in their turn, w-bosons decay into leptons (∼ 30 %) and hadrons (∼ 70 %); b → c → s → u with the production of muons and neutrinos.

4. consequences for cr experiments

one part of the t-quark energy gives the missing energy (νe, νµ, ντ, µ), and another part changes the eas development, especially its beginning, the parameters of which are not measured. as a result, additional muons appear, the measured eas energy eeas will not be equal to the primary particle energy e0, and the measured spectrum will be different from the primary spectrum (fig. 2). the transition of particles from energy e0 to energy eeas leads to a bump in the energy spectrum near the threshold (fig. 3), which appears if we sum the two solid curves in fig. 2. the appearance of other bumps in fig. 4 is explained in the same way.

figure 3. the production of the knee with some “bump” in the nuclear-physical approach.

since not only a high temperature (energy) but also a high density is required for qgm production, the threshold energy for the production of a new state of matter for heavy nuclei will be less than for light nuclei and protons. the heavy nuclei (e.g., iron) spectra are therefore changed earlier than the light nuclei and proton spectra. the measured spectra of different nuclei will not correspond to the primary composition (fig. 4). thus, the observed increase in the mass of the cr composition is explained not by its real change, but by the increasing detection probability of eas generated by heavy nuclei.

figure 4. energy spectra of the basic groups of cr nuclei and the formation of the all-particle spectrum.

in the framework of this hypothesis, the so-called muon problem (muon puzzle) – the excessive number of measured eas muons compared to the simulated number even for a pure iron composition of primary cr – can be solved, since with 70 % probability w-bosons decay into hadrons (mainly pions) with an average number of about 20, and the multiplicity of secondary particles (and also muons) begins to increase more sharply than the existing models predict.

there is a more interesting situation with the muon energy spectrum. as was shown in the first paper about the nuclear-physical approach to the explanation of the knee [1], in this case a considerable excess of very high energy muons (> 100 tev) must appear. figure 5 presents the results of muon energy spectrum simulations in the framework of the qgm model. since corsika does not include tt̄-quark production, the well-known pythia code was used to introduce them. one can see that a remarkable excess of muons appears only at energies near 100 tev. the contribution of tt̄-quarks leads to a sharper increase of the muon spectrum than in the case of so-called prompt muons. it is a very difficult task to get experimental data at such energies, and the corresponding results have been obtained only in recent years [6, 7]. since there are no other ways to generate vhe muons, apart from the production of massive particles (states of matter), these results practically prove the approach considered here. if this effect does not disappear when there is a further increase in the statistics in the icecube experiment, it will provide excellent proof of the validity of the nuclear-physical approach.
figure 5. the differential muon energy spectrum simulated by corsika with the tt̄-quarks included from pythia (e0 = 10¹⁷ ev, height of the first interaction 26.7 km).

another way is to measure the eas muon energy deposit below and above the knee. a change in the behavior of this value at the transition through the knee energy will also provide serious proof of the nuclear-physical approach.

5. consequences for the lhc experiments

on the face of it, the search for qgm with the characteristics described above (excess of t-quarks, excess of vhe muons, sharp increase of missing energy, etc.) is a very simple task. however, there is apparently no possibility to observe it in pp-interactions even at the full lhc energy of 14 tev, since larger energies are required for that. in fact, detailed investigations of pp-interactions at a total energy of 7 tev showed very good agreement with the predictions of existing theoretical models, and no evidence of new physics was obtained [8]. however, in nuclei–nuclei interactions some deviations were observed. the first of them is a sharper increase in the charged particle multiplicity in the interactions of the nuclei than in the predictions of the simulations (fig. 6 [9]). of course, a single experimental point is not sufficient to draw a serious conclusion about new physics, but the tendency is very clear.

figure 6. results of charged particle multiplicity measurements at the lhc energy of 3 tev [9].

the second result which can provide evidence in favor of qgm production is the detection of highly asymmetric dijet events (fig. 7, [10]). in the framework of the model considered here, these events can be explained very simply. when a top-quark decays in the center-of-mass system, the kinetic energy is distributed as tb ∼ 65 gev and tw ∼ 25 gev. if one takes into account the fly-out energy of the top-quark, tb can be more than 100 gev. in the case that the b-quark gives a jet and the w decays into ∼ 20π, the atlas event can be obtained.

figure 7. highly asymmetric dijet event observed in atlas in heavy ion collisions [10].

some evidence of qgm production in nuclei–nuclei interactions in the lhc experiments is therefore obtained. however, a more detailed investigation of the new state of matter at the lhc will not be so easy, since the usual accelerator methods of searching for relatively narrow resonances in pp-interactions cannot be applied in the case of the production of blobs of qgm. when hot blobs of qgm decay, one can hardly expect the reconstruction of a narrow resonance state. apparently, new methods for investigations of the new state of matter will need to be developed. one possible method is an evaluation of the missing energy, an increase in which with total energy will provide evidence in favor of the production of qgm or some other new state of matter.

6. discussion

it should be noted that the considered model of qgm blob production can be checked in very different ways in cr investigations and in lhc experiments. since the lifetime of a qgm blob is very short (in spite of its very large orbital angular momentum and centrifugal barrier), it is very difficult to obtain any evidence of its existence by measuring most of the usually detected parameters. apparently, two values can be used: the multiplicity of charged particles, which will be sharply increased with increasing energy and mass of the interacting nuclei, and the average missing energy in certain types of events. from this point of view, the experiments in cosmic rays
have some advantages connected with the large longitudinal momentum of the primary particles. for this reason, the muons (and neutrinos) from the w-boson decays have an energy not of ∼ 40 gev (as in the center-of-mass system) but of more than 100 tev. unfortunately, very large detectors are required for muons with such energy. it is therefore very difficult to predict in what kind of experiment an exhaustive proof of new physics (production of qgm blobs, or some other new state of matter) will be obtained, though the author has no doubt that the nuclear-physical approach is correct.

7. conclusions

the approach considered here, based on the production of qgm blobs, which allows an explanation of practically all problems of cosmic ray investigations above 10¹⁵ ev, shows that this new physics at lhc energies can be found in nuclei–nuclei interactions only. two clear predictions can be made: a quicker increase in the charged particle multiplicity and in the missing energy with the increase in the energy and the mass of the interacting nuclei.

acknowledgements

the author thanks alexey bogdanov and rostislav kokoulin for some very fruitful discussions, and for help in preparing this paper. this work has been supported by the russian ministry of education and science, rfbr (grant 11-02-12222-ofim-2011) and by a grant of the leading scientific school nsh-6817.2012.2.

references

[1] petrukhin, a.a.: 1999, proc. xith rencontres de blois “frontiers of matter”, j. tran thanh van (ed.) (the gioi publishers, vietnam, 2001), 401
[2] petrukhin, a.a.: 2004, proc. vulcano workshop “frontier objects in astrophysics and particle physics”, f. giovannelli & g. mannocchi (eds.), 489
[3] bugaev, e.v. et al.: 1998, phys. rev. d 58, 05401
[4] zuo-tang liang and xin-nian wang: 2005, prl 94, 102301
[5] jian-hua gao et al.: 2008, phys. rev. c 77, 044902
[6] bogdanov, a.g. et al.: 2012, astropart. phys. 36, 224
[7] berghaus, p.: 2011, presentation at 32nd icrc, beijing
[8] d'enterria, d. et al.: 2011, astropart. phys. 35, 98
[9] tonelli, g.: 2011, presentation at 32nd icrc, beijing
[10] cern courier, january/february 2011

discussion

todor stanev — you gave the impression that we have no idea about particle physics. there is a paper by d'enterria et al. that compares the lhc results at √s = 7 tev. cosmic ray models predict measurements better than some versions of pythia.

anatoly petrukhin — in the paper of d'enterria et al., only pp-interactions are considered. the idea of my talk is the following: qgp will appear first in interactions of heavy nuclei (e.g., iron) with nuclei of the atmosphere. apparently, the threshold of qgp production in pp-interactions will be at energies higher than 14 tev. of course it is possible that some deviations can begin at this energy, but at an energy of 7 tev qgp blobs cannot be produced. how are we to search for top-quarks in nuclei–nuclei interactions? i believe nobody has thought about this. therefore we have time to obtain additional proof of qgp production in cosmic ray investigations.
operation modes and characteristics of a plasma dipole antenna

nikolay n. bogachev a,b,∗, irina l. bogdankevich b,c, namik g. gusein-zade b,c, konstantin f. sergeychev b
a moscow state technical university of radio engineering, electronics and automation, moscow, russia
b prokhorov general physics institute, russian academy of sciences, moscow, russia
c pirogov russian national research medical university, moscow, russia
∗ corresponding author: bgniknik@yandex.ru

abstract. the existence modes of a surface electromagnetic wave on a plasma cylinder, and the operating modes and characteristics of a plasma antenna, are studied in this paper. solutions of the dispersion equation of the surface wave are obtained for a plasma cylinder of finite radius for different plasma density values. the operation modes of a plasma asymmetric dipole antenna of finite length and radius are researched by numerical simulation. the electric field distributions of the plasma antenna in the near field and the radiation pattern are obtained. these characteristics are compared with the characteristics of a similar metal antenna. the numerical models are verified by comparing the computed and measured metal antenna radiation patterns.

keywords: plasma antenna, surface electromagnetic wave, numerical simulation, operation modes, metal monopole, radiation pattern.

1. introduction

in recent years much plasma antenna research has been done by theoretical, numerical and experimental methods [1–11]. the most popular plasma antenna type is the plasma asymmetrical dipole (monopole) antenna (pad) [1, 2, 4–11]. some little-explored problems of plasma antennas are: the noise performance of the plasma antenna; non-linear distortions and current instabilities in the plasma of the gas discharge; the choice of optimum plasma parameters for pad operation; and the role of the surface electromagnetic wave (surface wave) [12] in antenna operation. our research was focused on determining the influence of the plasma density on the radiation modes of the pad and the interrelation of the surface wave with the antenna operation modes. previous papers [2, 4, 5, 8–11] have reported numerical simulation results. in the context of those simulations, some characteristics of plasma antennas have been determined. in those simulations, plasma density values ne were used such that the plasma antenna functioned like a metal antenna (ωpe > 10 · ωew0). however, in experimental studies the plasma density values may be very different, and the plasma frequency may be close to the threshold ωpe ≈ √2 ωew0. in this case, the antenna radiation fails, or nonlinear distortions appear in the transmitted signal. for example, it was shown in [7] that at a frequency of 400 mhz the radiated power dependence p(ne) is a nonlinear function. our task was to study the interrelation between the surface electromagnetic wave and the operation modes of a plasma antenna for plasma frequency values in the range √2 ωew0 ≤ ωpe ≤ 15 · ωew0, where ωew0 = 2πf0 = 2π · 1.7 ghz = 1.07 · 10¹⁰ rad/s.
2. surface electromagnetic wave on a cylindrical plasma column

in this section we will consider the conditions for the existence of the modes of a surface electromagnetic wave on the boundary of a plasma cylinder of infinite length and fixed radius r0. for this, we use the dispersion equation for an azimuthally symmetric surface wave on the cylindrical surface of a conducting medium of radius r0 [13]:

$$\varepsilon \sqrt{k_z^2 - \frac{\omega_{ew}^2}{c^2}}\; \frac{K_0\!\left(\sqrt{k_z^2 - \frac{\omega_{ew}^2}{c^2}}\, r_0\right)}{K_0'\!\left(\sqrt{k_z^2 - \frac{\omega_{ew}^2}{c^2}}\, r_0\right)} - \sqrt{k_z^2 - \frac{\omega_{ew}^2}{c^2}\,\varepsilon}\; \frac{I_0\!\left(\sqrt{k_z^2 - \frac{\omega_{ew}^2}{c^2}\,\varepsilon}\, r_0\right)}{I_0'\!\left(\sqrt{k_z^2 - \frac{\omega_{ew}^2}{c^2}\,\varepsilon}\, r_0\right)} = 0, \quad (1)$$

where the plasma dielectric permittivity ε is defined as

$$\varepsilon(\omega) = \varepsilon_0 - \frac{\omega_{pe}^2}{\omega_{ew}(\omega_{ew} + i\nu_e)} = \begin{cases} \varepsilon_0 - \dfrac{\omega_{pe}^2}{\nu_e^2} + i\,\dfrac{\omega_{pe}^2}{\omega_{ew}\nu_e} & \text{if } \omega_{ew} \ll \nu_e, \\[2ex] \varepsilon_0 - \dfrac{\omega_{pe}^2}{\omega_{ew}^2}\left(1 - i\,\dfrac{\nu_e}{\omega_{ew}}\right) & \text{if } \omega_{ew} \gg \nu_e, \end{cases} \quad (2)$$

and ωpe = √(ne e^2/(me ε0)) is the electron plasma frequency; i0, k0 and i0', k0' are the modified bessel functions and their derivatives, respectively; kz is the wave number; ωew = 2πf is the cyclic frequency of the electromagnetic wave; c is the velocity of light; νe is the electron collision frequency; ε0 is the relative dielectric permittivity; ne is the electron plasma density; and me and e are the mass and charge of an electron.

the real part of the solution of the dispersion equation (1) is given in fig. 1 for the parameters of the plasma used in the simulation part of our studies. note that νe = 10^7 s^-1, for argon in a tube with pressure p0 = 3 · 10^-2 torr = 4 pa [4], and r0 = 0.5 cm; the angular frequency on the vertical axis of the graphs in fig. 1 is normalized as ω = ωew r0/c, so ω0 = ωew0 r0/c = 0.18, and k = kz r0.

fig. 1a shows for ωpe = √2 ωew0 = 1.58 · 10^10 rad/s (ne = 8.0 · 10^10 cm^-3) that the frequency ω0 misses the asymptotic part of the dispersion curve. in this case, the surface electromagnetic wave propagates along the plasma column and does not radiate into the surrounding space (nonradiative mode). if ωpe = 5.35 · 10^10 rad/s (ne = 9.1 · 10^11 cm^-3) (see fig. 1b), the wave frequency ω0 falls on the nonlinear part of the dispersion curve. at such settings, the electromagnetic wave is radiated into space, although the characteristics of the emitted wave are suboptimal. in this sub-optimal (or transition) mode any small change in the wave parameters can lead to a change in the characteristics of the radiation. for plasma with ωpe = 1.07 · 10^11 rad/s (ne = 3.6 · 10^12 cm^-3) (see fig. 1c), the wave frequency ω0 is near the border of the linear part, and the radiation characteristics are close to optimal. this will be called the linear mode.
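to make the use of equation (1) concrete, the following sketch (illustrative only, not the authors' code) locates a real root kz at the operating frequency, assuming scipy's exponentially scaled modified bessel functions, the collisionless limit νe ≪ ωew of (2) with ε0 = 1, the identities K0' = −K1 and I0' = I1, and a heuristic bracketing interval for the root search:

import numpy as np
from scipy.special import i0e, i1e, k0e, k1e
from scipy.optimize import brentq

r0, c = 0.005, 3.0e8                       # tube radius [m], speed of light [m/s]

def dispersion(kz, omega, omega_pe):
    eps = 1.0 - (omega_pe / omega) ** 2    # collisionless limit of eq. (2)
    kv = np.sqrt(kz**2 - (omega / c) ** 2)          # transverse constant in vacuum
    kp = np.sqrt(kz**2 - eps * (omega / c) ** 2)    # transverse constant in plasma
    # eq. (1); the exponentially scaled bessel ratios avoid under/overflow
    return (eps * kv * k0e(kv * r0) / (-k1e(kv * r0))
            - kp * i0e(kp * r0) / i1e(kp * r0))

omega = 2.0 * np.pi * 1.7e9                # operating frequency omega_ew0
for omega_pe in (1.58e10, 5.35e10, 1.07e11):       # the three cases of fig. 1
    kz = brentq(dispersion, 1.001 * omega / c, 200.0 / r0, args=(omega, omega_pe))
    print(f"omega_pe = {omega_pe:.2e} rad/s: kz*r0 = {kz * r0:.3f}")

the resulting kz r0 values can be compared with the dispersion curves of fig. 1; as the density approaches the √2 threshold, the root moves toward large kz (a short, non-radiating surface wavelength), consistent with fig. 1a.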
3. model verification

in this section, we compare the results of numerical simulations and experimental measurements of the radiation pattern for verification. we investigated a metal asymmetric dipole (monopole) antenna (mad) with la = 4.1 cm, da = 0.3 cm, ds = 18 cm at the frequency f0 = 1.7 ghz. the radiation pattern in the far field of the mad was obtained by numerical simulation in the karat code [14] and in cad empro [15], and experimental measurements were carried out in an anechoic chamber. the general scheme of the quarter-wave asymmetric dipole with length la, diameter da = 2ra and metal screen diameter ds = 2rs is shown in fig. 2. the mad model was implemented in the full electromagnetic karat code [14] in its 2.5d version. we consider the axisymmetric case with a perfectly matched layer (pml) on the borders of the counting area. the metal screen and the pin of the mad were given as perfectly conducting surfaces. simulation was carried out by the finite difference time domain (fdtd) method. the model in empro was created in three-dimensional geometry in the xyz coordinate system, with a resizable and perfectly matched layer at the edges of the counting area. the calculation was performed using the finite element method (fem) in the agilent fem simulator block.

figure 1. real parts of the dispersion equation solutions for plasma: a) ωpe = 1.58 · 10^10 rad/s (ne = 8.0 · 10^10 cm^-3), b) ωpe = 5.35 · 10^10 rad/s (ne = 9.1 · 10^11 cm^-3), c) ωpe = 1.07 · 10^11 rad/s (ne = 3.6 · 10^12 cm^-3).

fig. 2 shows the results of the numerical simulation and the measurement of the radiation patterns for the mad with la = 4.1 cm, da = 0.3 cm and ds = 18 cm at the frequency f0 = 1.7 ghz. as can be seen from the graphs, the radiation patterns coincide on the main lobe. the side lobes differ in level and in position. the radiation pattern obtained in the karat code differs from the measured radiation pattern because the absorber layer is very near to the back surface of the metal screen in the model. the differences in the radiation pattern of the empro model are due to the fact that the fem method is not very accurate for devices with a small q-factor (gain-bandwidth) [15].

figure 2. scheme of an asymmetric dipole (monopole) antenna (left side) and the experimental and modeled radiation patterns (right side) of the metal asymmetric dipole antenna.

figure 3. plasma antenna model in karat code: 1 — coaxial cable, 2 — plasma column, 3 — metal screen, 4 — absorber (pml).

4. numerical simulation results and discussion

this section presents the results of a numerical simulation of the plasma and metal asymmetric dipole (la = 4 cm, da = 1 cm) with an infinite screen size, ds = ∞. the plasma in the model was set as a medium described by the drude theory, where the dielectric permittivity of the plasma was determined by formula (3) [16]:

$$\varepsilon(\omega) = 1 - \frac{\omega_{pe}^2}{\omega_{ew}(\omega_{ew} - i\nu_e)}. \quad (3)$$

in this model (see fig. 3), a gaussian pulse with τi = 15 ns and frequency f0 = 1.7 ghz reached the plasma (metal) antenna through a coaxial cable. the plasma parameters were changed by varying the plasma density ne (the electron collision frequency remained constant, νe = 10^7 s^-1). the field of the quarter-wave dipole has the structure of a tm-mode, so we consider the functions ez(r) and er(z) to be the most informative. in addition, the selection of the function er(z) is due to the proportionality of this component to the distribution of the charge q along the antenna.

table 1. parameters of the plasma.
no. | ωpe vs ωew0 | ωpe [rad/s] | ne [cm^-3]
1 | √2 ωew0 = √2 · 2πf0 | 1.58 · 10^10 | 8.0 · 10^10
2 | 5 ωew0 = 5 · 2πf0 | 5.35 · 10^10 | 9.1 · 10^11
3 | 10 ωew0 = 10 · 2πf0 | 1.07 · 10^11 | 3.6 · 10^12
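a minimal numerical check of the drude permittivity (3) for the three densities of table 1 (a sketch assuming only numpy; since νe ≪ ωew0, ε is essentially real and increasingly negative as ne grows):

import numpy as np

nu_e = 1.0e7                        # electron collision frequency [1/s]
omega = 2.0 * np.pi * 1.7e9         # wave frequency omega_ew0 [rad/s]

def eps_drude(omega_pe):
    """formula (3): eps = 1 - wpe^2 / (w * (w - i*nu_e))"""
    return 1.0 - omega_pe**2 / (omega * (omega - 1j * nu_e))

for no, wpe in ((1, 1.58e10), (2, 5.35e10), (3, 1.07e11)):
    print(f"no. {no}: eps = {eps_drude(wpe):.2f}")
# re(eps) comes out near -1.2, -24 and -99: the plasma column becomes
# increasingly metal-like as the density grows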
the spatial structure of the field components ez(r) and er(z) (fig. 4 and fig. 5) was plotted for the plasma and metal antennas with la = 4 cm, da = 1 cm, ds = ∞ according to the simulation results of the karat code at the frequency f0 = 1.7 ghz. in fig. 4, the functions er(z) are presented for the three operating modes of the pad. in the first mode (the parameters correspond to point no. 1 in table 1) there is a surface wave distribution with wavelength λ ≈ 1.5 cm along the plasma column of the antenna (curve 1). this wavelength matches the wavelength calculated in section 2 for the same parameters. this case is the nonradiative mode. the second mode (no. 2 in table 1) is the suboptimal transition mode (curve 2). the third mode (no. 3 in table 1) of the plasma antenna (curve 3) is close to the operation mode of the metal asymmetrical dipole (curve 4).

in fig. 5, the graphs of ez(r) are shown for the same plasma concentration values as in fig. 4. three qualitatively different operation modes of the plasma antenna are again clearly visible. in the first mode (curve 1), it can be seen that ez(r) fades out in both directions from the plasma-vacuum boundary at different speeds, and in vacuum it fades out at the distance a = 1 cm, which is much smaller than the wavelength supplied to the antenna (λ ≈ 18 cm). this indicates that when ωpe = √2 ωew0 the antenna operates as a surface wave line, without radiation of the surface wave into the surrounding space. this mode of operation of the antenna is nonradiative, and it coincides with the existence mode of the surface wave on the plasma column (see section 2).

figure 4. distributions of er(z): 1–3 — modes of the plasma antenna, 4 — the metal antenna.
figure 5. distributions of ez(r): 1–3 — modes of the plasma antenna, 4 — the metal antenna.
figure 6. radiation patterns: 1–3 — modes of the plasma antenna, 4 — the metal antenna.

the second mode is characterized by the presence of a surface wave component and a radiated volumetric wave component in the distribution of ez(r) (curve 2). the surface component of the wave slowly fades in the depth of the plasma, and the radiated component of the wave for the case ωpe = 5 ωew0 differs in phase by more than 60° from the radiation of the metal antenna (curve 4). this is the transition mode, and it is also associated with the regime of the existence of a surface wave on the plasma column. in the third mode (curve 3), ez(r) consists of a surface wave part and a volumetric wave part, but the surface wave is attenuated rapidly in the plasma, and the volumetric part differs from the mad case (curve 4) by only 20° in phase. the difference in the phase of ez(r) for a real pad and a mad made of imperfect conductors may be smaller, due to the finiteness of the skin layer. this mode is the linear (radiative) mode.

the radiation patterns were plotted for the considered cases of the pad and the mad (see fig. 6). the plasma antenna radiation patterns (curves 1–3) are normalized to the metal antenna radiation pattern (curve 4), and are plotted in a rectangular coordinate system for θ values from 0° to 90° (0° coincides with the antenna axis). as the graph shows, in the case of ne = 8.0 · 10^10 cm^-3, curve 1 is close to 0, i.e. when ωpe = √2 ωew0, as noted above, the antenna does not radiate energy into the surrounding space, and all the energy goes to surface wave propagation along the plasma tube. in the transitional mode (curve 2), with ne = 9.1 · 10^11 cm^-3 and ωpe = 5 ωew0, the radiation pattern is smaller in amplitude than the radiation pattern of the metal antenna, which means the plasma antenna operation is not optimal compared to the mad. the radiation pattern of the linear mode (ne = 3.6 · 10^12 cm^-3 and ωpe = 10 ωew0, curve 3) is very close to curve 4, which implies that the characteristics of the plasma antenna are close to those of the metal antenna.

5. conclusions

we have obtained the following results by using the solution of the dispersion equation and a numerical simulation:
(1.) three existence modes of the surface wave on an infinite plasma cylinder of finite radius.
(2.) the operation modes of a plasma asymmetric dipole antenna: nonradiative, transition and linear (radiative).
(3.) the relationship between the modes of existence of a surface wave on an infinite plasma cylinder and the operation of a plasma asymmetric dipole antenna.
(4.) the dependence of the operation modes of a plasma asymmetric dipole antenna on the ratio of the plasma frequency and the electromagnetic wave frequency.
(5.) the plasma antenna characteristics in the linear mode are close to the characteristics of the metal antenna.
in addition, the models used here were verified by experimental measurements.

acknowledgements

this work has been supported by the russian foundation for basic research (rfbr) project n14-08-31336. the authors are grateful to professor a.a. rukhadze and professor a.m. ignatov for discussions and useful comments. the pattern measurements were carried out in an anechoic chamber at the jsc kulon research institute. the authors thank the management and staff of the jsc kulon research institute for their assistance.

references
[1] g. g. borg, j. h. harris, d. g. miljak, n. m. martin. application of plasma columns to radiofrequency antennas. appl phys lett 74(22):3272, 1999. doi:10.1063/1.874041.
[2] j. p. rayner, a. p. whichello, a. d. cheetham. physical characteristics of a plasma antenna. ieee trans on plasma science 32(1):269, 2004. doi:10.1109/tps.2004.826019.
[3] i. alexeff, t. anderson, s. parameswaran, et al. experimental and theoretical results with plasma antennas. ieee trans on plasma science 34(2):166, 2006. doi:10.1109/tps.2006.872180.
[4] e. n. istomin, d. m. karfidov, i. m. minaev, et al. plasma asymmetric dipole antenna excited by a surface wave. plasma physics reports 32(5):388–400, 2006. doi:10.1134/s1063780x06050047.
[5] c. liang, y. xu, z. wang. numerical simulation of plasma antenna with fdtd method. chin phys lett 25(10):3712, 2008.
[6] d. qian, d. jun, g. chen-jiang, s. lei. on characteristics of a plasma column antenna. icmmt 1:413–415, 2008. doi:10.1109/icmmt.2008.4540404.
[7] t. anderson. plasma antennas. artech house, norwood, 2011.
[8] z. chen, a. zhu, j. lv. two-dimensional models of cylindrical monopole plasma antenna excited by surface wave. wseas trans on com 12(2):63, 2013.
[9] n. n. bogachev, i. l. bogdankevich, n. g. gusein-zade, v. p. tarakanov. computer simulation of plasma vibrator antenna. acta polytechnica 53(2):1–3, 2013.
[10] v. konovalov, g. kuzmin, i. minaev, o. tikhonevich. spectral characteristics of plasma antennas. xli zvenigorod international conference on plasma physics and controlled fusion, 2014.
[11] b. belyaev, a. leksikov, a. leksikov, et al. investigation of nonlinear behavior of plasma antenna. izvestiia vysshykh uchebnykh zavedenii fizika 56(8):88–91, 2013.
[12] m. moisan, c. beaudry, p. leprince. a small microwave plasma source for long column production without magnetic field. ieee transactions on plasma science 3(2):55, 1975.
[13] a. aleksandrov, l. bogdankevich, a. rukhadze. principles of plasma electrodynamics. springer verlag, heidelberg, 1984.
[14] v. tarakanov. user's manual for code karat. va, springfield, 1992.
[15] keysight technologies. about empro. http://www.keysight.com/en/pc-1297143/empro.
[16] p. drude. zur elektronentheorie der metalle. ann. phys. 1:566, 1900.
acta polytechnica 55(1):34–38, 2015

acta polytechnica doi:10.14311/ap.2013.53.0555 acta polytechnica 53(supplement):555–559, 2013 © czech technical university in prague, 2013 available online at http://ojs.cvut.cz/ojs/index.php/ap

results from the xenon100 experiment

rino persiani*
university of bologna and infn-bologna, bologna, italy
* corresponding author: rino.persiani@bo.infn.it

abstract. the xenon program consists in operating and developing double-phase time projection chambers using liquid xenon as the target material. it aims to directly detect dark matter in the form of wimps via their elastic scattering off xenon nuclei. the current phase is xenon100, located at the laboratori nazionali del gran sasso (lngs), with a 62 kg liquid xenon target. we present the 100.9 live days of data, acquired between january and june 2010, with no evidence of dark matter, as well as the new results of the last scientific run, with about 225 live days. the next phase, xenon1t, will increase the sensitivity by two orders of magnitude.

keywords: dark matter, wimp, xenon.

1. introduction

astronomical and cosmological observations indicate that a large amount of the energy content in the universe is made of dark matter [1]. particle candidates under the generic name of weakly interacting massive particles (wimps) arise naturally in many theories beyond the standard model of particle physics, such as supersymmetry, universal extra dimensions, or little higgs models [2]. they may be observed in underground-based detectors, sensitive enough to measure the low-energy nuclear recoil resulting from the coherent scattering of a wimp with a target nucleus [3]. the xenon dark matter project searches for nuclear recoils from wimps scattering off xenon nuclei. in a phased approach, experiments with increasingly larger mass and lower background are being operated underground, at the infn laboratori nazionali del gran sasso (lngs) in italy [4]. xenon100 is the current phase of the xenon program, which aims to improve the sensitivity to dark matter interactions in liquid xenon (lxe) with two-phase (liquid/gas) time-projection chambers (tpcs) of large mass and low background. the extraordinary sensitivity of xenon to dark matter is due to the combination of a large, homogeneous volume of ultra pure liquid xenon as the wimp target, in a detector which measures not only the energy, but also the three spatial coordinates of each event occurring within the active target. the ability to localise events with millimetre resolution enables the selection of a fiducial volume in which the radioactive background is minimised. the simultaneous detection of the xe scintillation light (s1) at the few kevee level (kev electron equivalent) and of the ionization (s2) at the single electron level allows one to discriminate electronic recoils (ers) from nuclear recoils (nrs), providing the basis of one of the major background rejection techniques.

2. xenon100

2.1. detector description

the xenon100 detector is a double-phase (liquid-gas) time projection chamber (tpc).
a particle interacting with the target generates scintillation light and ionization electrons (fig. 1). the primary light (s1) is detected immediately by two photomultiplier (pmt) arrays above and below the target. the light readout is based on 1"-square hamamatsu r8520-06-al low-radioactivity pmts with quantum efficiencies up to ∼ 35 %. the 98 pmts in the top array are arranged to improve the position reconstruction, and the 80 pmts on the bottom are arranged to optimise the light collection. the other 64 pmts are located in the lxe veto to reduce the background. each interaction liberates electrons, which are drifted upwards across the tpc by an electric field (∼ 0.53 kv/cm) to the liquid-gas interface with a speed of about 2 mm/µs. these electrons are then extracted into the gas phase by a strong extraction field. in the gas phase, the electrons generate very localised secondary scintillation light (s2), which can be used to determine the horizontal position of the interaction vertex with a resolution < 3 mm (1σ). the time difference between these two signals gives the depth of the interaction in the tpc with a resolution of 0.3 mm (1σ). the event positions can thus be fully reconstructed and used to fiducialise the target volume in order to drastically reduce the radioactive background from external sources. the high ionization density of nuclear recoils (nrs) in lxe leads to larger s1 and smaller s2 signals compared to electron recoils (ers). the simultaneous measurement of charge and light provides powerful discrimination between nr signals and er background events via the ratio s2/s1.

figure 1. principle of a two-phase liquid xenon tpc. a particle generates primary scintillation light (s1) and ionization electrons. these are drifted upwards by an electric field and detected via secondary scintillation light in the gas phase (s2). the s2 hit pattern (xy) and the drift time (z) give three-dimensional information on the position of events. additionally, the ratio s2/s1 allows event discrimination between nuclear recoils (wimps, neutrons) and electron recoils (γ, β).
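as an illustration of the event reconstruction just described (a schematic sketch with hypothetical variable names, not xenon100 analysis code), the interaction depth follows from the s1–s2 time difference, and the recoil type from the ratio s2/s1:

import numpy as np

DRIFT_SPEED_MM_PER_US = 2.0        # electron drift speed quoted above

def interaction_depth_mm(t_s1_us, t_s2_us):
    """depth below the liquid-gas interface from the s1-s2 time difference"""
    return (t_s2_us - t_s1_us) * DRIFT_SPEED_MM_PER_US

def discriminator(s1_pe, s2_pe):
    """log10(s2/s1): smaller for nuclear recoils, larger for electron recoils"""
    return np.log10(np.asarray(s2_pe) / np.asarray(s1_pe))

print(interaction_depth_mm(0.0, 100.0))   # an event drifting 100 us lies ~200 mm deep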
2.2. detector backgrounds

the background in xenon100 comes mainly from external sources and from the materials used for the construction of the detector itself. in order to reduce the background from the radioactivity in the experiment's environment, in the laboratory walls, etc. [5], a passive shield has been installed. an improved version of the xenon10 shield [6] was required to increase the sensitivity of the xenon100 experiment. the detector is surrounded (from inside to outside) by 5 cm of copper, followed by 20 cm of polyethylene, and 20 cm of lead, where the innermost 5 cm consist of lead with low 210pb contamination [7]. the entire shield rests on a 25 cm thick slab of polyethylene. an additional outer layer of 20 cm of water and polyethylene has been added on top and on 3 sides of the shield to reduce the neutron background further. figure 2 shows a picture of the xenon100 detector in front of its shield. in order to minimise the radioactive background in the detector, the cryogenic system is located outside the passive shield. to reduce the radioactivity from the detector and the shielding materials, radioactivity screening was performed with a 2.2 kg high purity ge detector in an ultra-low background cu cryostat and cu/pb shield, operated at lngs [8]. the radioactive contaminations of each material screened can be found in reference [7]. the radioactivity of all the components has been measured and is used as input for detailed monte-carlo simulations of the γ and neutron background of the experiment. gammas from the decay chains of the radioactive contaminants in the detector materials dominate the electron recoil background. the activity of the pmts is the major contributor to the current xenon100 electron recoil background (about 65 % of the total background from all detector and shield materials). this background in xenon100 is 6 × 10^-3 dru. after 99 % electron recoil rejection we obtain a conservative background rate of 6 × 10^-5 dru.

figure 2. xenon100 with opened shield door. the circular copper pipe around the detector is used for inserting calibration sources from outside the shield. the pb brick visible at the front is necessary to block gamma rays from the ambe source during neutron calibration.

a potentially dangerous background for xenon100 is the gamma background from the decay chain of 222rn daughters inside the shield cavity. the average measured radon activity in the lngs tunnel is about 350 bq/m^3. therefore, the shield cavity is constantly flushed with nitrogen gas when the shield door is closed. the 222rn concentration in the cavity is continuously monitored, and the measured values are at the limit of the sensitivity of the radon monitor itself. without the veto cut, the background rate from 1 bq/m^3 of 222rn in the shield is 6 × 10^-3 dru for the entire target mass of 62 kg and 2 × 10^-4 dru in the 30 kg fiducial volume. this background is less than 2 % of the background from the detector and shield materials. moreover, the measured radon concentration is well below 1 bq/m^3. there is no long-lived radioactive xenon isotope, with the exception of the potential double beta emitter 136xe, but its half-life is so long that it does not limit the sensitivity of the detector. another lxe intrinsic background is due to 85kr, which is a beta emitter with an endpoint of 687 kev and a half-life of 10.76 years. its concentration in the detector can be measured using a second decay mode (through a metastable state of rb) with a 0.454 % branching ratio. commercially available xenon gas has a concentration of natural krypton at the ppm level, and 85kr has an isotopic abundance of 85kr/natkr ∼ 2 × 10^-11. for the 85kr-induced background to be subdominant, the fraction of natkr in xe must be at the level of 100 ppt. a natkr/xe ratio of 100 ppt would contribute a rate of ∼ 2 mdru from 85kr [9]. to reduce the kr level in xe, a small-scale cryogenic distillation column [10] was
it takes into account the neutron spectra and the total production rates from spontaneous fission and (α, n) reactions in the detector, the shielding materials, and the surrounding concrete-rock, calculated with a modified sources4a code [11]. the muon flux at the 3600 mw.e. gran sasso depth is 24 muons/m2/day. muons will produce neutrons in the shielding materials due to electromagnetic and hadronic showers and through direct spallation. high energy muon interacting in the rock-concrete can produce highly penetrating neutrons with energies up to a few gev. the impact of muon-induced neutrons is obtained from simulations and contributes 70 % to the total nrs background. 2.3. detector calibrations to characterise the detector performances and its stability in time, calibration sources are regularly inserted in the xenon100 shield through a copper tube surrounding the cryostat (see fig. 2). while the vertical position of sources is restricted to the tpc centre, they can be placed at all azimuthal angles. the electronic recoil band in log10(s2/s1) vs. energy space defines the region of background events from βand γ-particles (blue dots in fig. 3). it is measured using the low energy compton tail of highenergy γ-sources such as 60co and 232th. in the 225 day dark matter search, the amount of electronic figure 4. results of ambe calibration data. besides the nr recoil band due to neutron elastic scatter off xe nuclei, this calibration provides additional gamma lines from neutron activation of xenon and fluorine (in the teflon) nuclei at 40 kev (129xe), 80 kev (131xe), 110 kev (19f), 164 kev (131mxe), 197 kev (19f) and 236 kev (129mxe). recoil calibration data taken is a significant increase over that taken during the 100.9 day dark matter search. the detector response to single scatter nuclear recoils, the expected signature of a wimp scattering off a nucleus, is measured with an ambe (α, n) source (red dots in fig. 3). this source is shielded by 10 cm of lead in order to suppress the contribution from its high energy gamma rays (4.4 mev). the comparison of the charge/light ratio allows to define a region where most of the neutrons are, and only few gammas. the discrimination power is ∼ 99.5 % at low energies for 50 % neutron acceptance. besides the definition of the nuclear recoil band and a benchmark wimp search region, the ambe calibration provides additional gamma lines from inelastic scattering as well as from xenon or fluorine (in the teflon) neutron activation at 40 kev (129xe), 80 kev (131xe), 110 kev (19f), 164 kev (131mxe), 197 kev (19f) and 236 kev (129mxe). these lines are clearly visible in fig. 4 as ellipsoids over the nuclear recoils band due to neutron elastic scatters off xenon nuclei. in order to get a uniform proportional scintillation s2 signal, the liquid-gas interface has to be adjusted at the optimal distance from the anode to optimise the s2 resolution. this levelling was performed at the beginning of the run with a 137cs source, by varying the overall liquid xenon level until the best resolution of the full absorption s2 peak was found. 
after detector levelling, two independent effects remain that have an impact on the size of the proportional scintillation s2 signal from the charge: a) the absorption of electrons as they drift (finite electron lifetime), leading to a z-dependent correction; b) the reduced s2 light collection efficiency at large radii, non-functional pmts, quantum efficiency differences between neighbouring pmts, as well as non-uniformities in the proportional scintillation gap, leading to an s2 correction that depends on the (x,y)-position of the s2 signal. regular 137cs calibration runs are taken in order to determine the mean lifetime of electrons traversing the liquid xenon volume (free electrons are removed through attachment to electronegative impurities). the lxe is continuously purified through a hot getter in order to reduce the impurity level. in the 100.9 day dark matter search, the mean electron lifetime increased from 230 µs to 380 µs, while in the last run (225 days) it increased (constantly, if the maintenance periods are not taken into account) from around 300 µs to 650 µs.

figure 5. new result on spin-independent wimp-nucleon scattering from xenon100: the expected sensitivity of this run is shown by the green/yellow band (1σ/2σ) and the resulting exclusion limit (90 % cl) in blue. for comparison, the xenon100 exclusion limit of the last 100.9 day run and other experimental results are also shown, together with the regions (1σ/2σ) preferred by supersymmetric (cmssm) models. the projected sensitivity for xenon1t is shown in red.

2.4. data analysis

the data used in the analysis were selected from periods with stable detector operating conditions. we excluded from the analysis all periods with xenon pressure and temperature values that were more than 5 sigma away from the average value. after this selection, other parameters were found to be stable during the whole science run. the radon concentration in the xenon100 room and inside the shield cavity was measured using a dedicated radon monitor, and the concentration was stable for the whole period. two parallel analyses were performed to interpret these data in a spin-independent wimp-nucleon interaction framework. the energy region used for both analyses was (4 ÷ 30) pe in s1, corresponding to (8.4 ÷ 44.6) kevnr. the lower bound of this energy region was lowered from 4 pe to 3 pe for the analysis of the last run (225 days). before unblinding, it was decided to use the profile likelihood analysis method [12] as the primary interpretation method, which does not employ a fixed discrimination cut in the s2/s1 parameter space. a cut-based analysis was also performed [13] to cross-check the results. the criteria applied to the science data to select candidate events in the region of interest are grouped into: data quality cuts, an energy selection and a threshold cut on s2, a selection of single scatter events, consistency cuts, the selection of the fiducial volume, and the selection of the signal region for the cut-based analysis [14].

3. results

3.1. 100 live days dm search

the dark matter data analysed for this run were acquired between january 13 and june 8, 2010. about 2 % of the exposure was rejected due to variations in detector operation parameters. in addition, 18 live days of data taken in april were rejected due to an increased electronic noise level. removing all the calibration runs during the data-taking period, this led to a dataset of 100.9 live days.
the expected background in the wimp search region is the sum of the gaussian leakage from the er background, the non-gaussian leakage, and the nrs from neutron interactions. the total background prediction for 99.75 % er rejection, 100.9 days of exposure and a 48 kg fiducial mass is (1.8 ± 0.6) events. after unblinding the pre-defined wimp search region, a population of events was observed that passed the s1 coincidence requirement only because of correlated electronic noise. these events are mostly found below the s1 analysis threshold, with 3 events from this population leaking into the wimp search region close to the 4 pe lower bound. this population was identified and rejected with a cut that takes into account the correlated pick-up noise. once they had been removed, 3 events passed all quality criteria for single-scatter nrs and fell in the wimp search region. given the background expectation, the observation of 3 events does not constitute evidence for dark matter, as the chance probability of the corresponding poisson process resulting in 3 or more events is 28 %.

3.2. 225 live days dm search

the last run of xenon100 represents 224.6 live days of dark matter search data. purification through a dedicated krypton removal column saw the intrinsic background of the liquid xenon drop by more than a factor of 10. in the last run, for unblinded data (above 99.75 % er rejection) and a 30 kg fiducial mass, about 2 single-scatter events are observed per day below 30 pe in s1. these values illustrate the low count rate of the electron recoil background in xenon100 and the improvement made with the reduction of the intrinsic krypton background. moreover, the trigger threshold was lowered from 300 pe to 150 pe in s2. the dark matter data analysed for this run were acquired over 13 months in 2011 and 2012. a blind analysis of 224.6 live days × 34 kg exposure yielded no evidence for dark matter interactions. the two candidate events observed in the pre-defined nuclear recoil energy range of 6.6 ÷ 30.5 kevnr are consistent with the background expectation of (1.0 ± 0.2) events [15].
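the quoted chance probability can be reproduced to a good approximation with a one-line poisson computation (a check, not the collaboration's full treatment; presumably folding in the ±0.6 background uncertainty accounts for the small difference from the quoted 28 %):

from scipy.stats import poisson

# 100.9-day run: 3 single-scatter nr candidates on (1.8 +/- 0.6) expected
print(f"p(>=3 | mu=1.8) = {poisson.sf(2, 1.8):.3f}")   # ~0.269

# 225-day run: 2 candidates on (1.0 +/- 0.2) expected
print(f"p(>=2 | mu=1.0) = {poisson.sf(1, 1.0):.3f}")   # ~0.264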
4. xenon1t

in parallel with the successful operation of xenon100, the collaboration has already designed the next generation detector: xenon1t. the detector is based on an lxe tpc with a fiducial mass of 1 ton and a total mass of 2.4 tons of lxe, viewed by low radioactivity photomultiplier tubes and housed in a water cherenkov muon veto at lngs. detailed simulation studies, informed by results from previous xenon and other lxe detectors, indicate that an increase in the light yield of xenon1t relative to xenon100 is achievable by adopting relatively modest modifications to the design of the tpc, such as greater coverage of non-reflective surfaces with ptfe of near unity reflectivity, greater optical transmission of electrode structures and, especially, greater quantum efficiency of the photomultiplier tubes. the background can be reduced through the selection of every component used in the experiment, based on an extensive radiation screening campaign using a variety of complementary techniques and dedicated measurements. moreover, the self-shielding of lxe is exploited to attenuate and moderate radiation from material components within the tpc, and simultaneously a fiducial volume will be defined thanks to the tpc event imaging capability. the experiment aims to reduce the background from all expected sources such that the fiducial mass and the low energy threshold will allow xenon1t to achieve unprecedented sensitivity. with 2 years of live-time and a 1.1 ton fiducial mass, xenon1t reaches a sensitivity of 2 × 10^-47 cm^2 at 90 % cl for 50 gev/c^2 wimps, as shown in fig. 5.

acknowledgements

references
[1] n. jarosik et al., astropart. j. suppl. 192, (2011) 14; k. nakamura et al., (particle data group), j. phys. g 37, (2010) 075021.
[2] g. steigman and m. s. turner, nucl. phys. b 253, (1985) 375; g. jungman, m. kamionkowski and k. griest, phys. rept. 267, (1996) 195.
[3] m. w. goodman and e. witten, phys. rev. d 31, (1985) 3059.
[4] www.lngs.infn.it
[5] m. haffke et al., nucl. instr. meth. phys. res. sect. a 643, (2011) 36.
[6] e. aprile et al., (xenon10 collaboration), astropart. phys. 34, (2011) 679.
[7] e. aprile et al., (xenon100 collaboration), astropart. phys. 35, (2011) 43.
[8] g. heusser, m. laubenstein and h. neder, radionuclides in the environment 8, (2006) 495.
[9] e. aprile et al., (xenon100 collaboration), phys. rev. d 83, (2011) 082001.
[10] http://www.tn-sanso.co.jp/en/
[11] r. lemrani et al., nucl. instr. meth. a 560, (2006) 454.
[12] e. aprile et al., (xenon100 collaboration), phys. rev. d 84, (2011) 052003.
[13] e. aprile et al., (xenon100 collaboration), phys. rev. lett. 107, (2011) 131302.
[14] e. aprile et al., (xenon100 collaboration), arxiv:1207.3458v2
[15] e. aprile et al., (xenon100 collaboration), arxiv:1207.5988

acta polytechnica 53(supplement):555–559, 2013

acta polytechnica vol. 52 no. 6/2012

geometrical modeling of concrete microstructure for the assessment of itz percolation

daniel rypl (1), tomáš bým (2)
1 department of mechanics, faculty of civil engineering, czech technical university in prague, thákurova 7, 166 29 prague, czech republic
2 golder associates ab, östgötagatan 12, 116 25 stockholm, sweden
corresponding author: drypl@fsv.cvut.cz

abstract

percolation is considered to be a critical factor affecting the transport properties of multiphase materials. in the case of concrete, the transport properties are strongly dependent on the interfacial transition zone (itz), which is a thin layer of cement paste next to the aggregate particles. it is not computationally simple to assess itz percolation in concrete, as the geometry and topology of this phase are complex. while there are many advanced models that analyze the behavior of concrete, they are mostly based on the use of spherical or ellipsoidal shapes for the geometry of the aggregate inclusions. these simplified shapes may become unsatisfactory in many simulations, including the assessment of itz percolation. this paper deals with geometrical modeling of the concrete microstructure using realistic shapes of aggregate particles, the geometry of which is represented in terms of a spherical harmonic expansion. the percolation is assessed using the hard core – soft shell model, in which each randomly-placed aggregate particle is surrounded by a shell of constant thickness representing the itz.

keywords: percolation, concrete, interfacial transition zone, aggregates, hard core – soft shell, spherical harmonic analysis.

1 introduction

concrete is nowadays a modern composite material.
it is a multiscale material, with length scales ranging from nanometers (c-s-h), via micrometers (cement paste), to millimeters (mortar and concrete). it is a different random composite at each length scale, and the range of the microstructure of concrete thus spans nine orders of magnitude. it is therefore a large and difficult task to try to relate the microstructure and the properties of concrete theoretically. a possible way to relate several material properties to the microstructure of the material is based on percolation theory [1, 2]. percolation theory uses the idea of connectivity. the percolation threshold denotes the volume fraction of a particular phase of a composite material at which that phase goes from being disconnected to being connected (or vice versa), so that there is a change in the topology of the microstructure. percolation properties are now more or less commonly accepted as the critical geometrical and topological factors influencing the transport properties of multiphase materials (e.g. ionic diffusivity, electric or thermal conductivity) [3, 4, 5]. moreover, in many practical applications the structure of composite materials evolves in time, so that the percolation transition occurs after an ageing time (as is typical for cement-based materials and gels). these are very important factors, because it is now well recognized that much of the deterioration of concrete in infrastructures is caused by corrosion of the reinforcing bars due to a massive chloride attack (or due to an attack by ions of some other salts). in the case of mortar and concrete, the transport properties are strongly dependent on the region of cement paste close to the aggregate particle surface (typically within 50 micrometers). this region, known as the interfacial transition zone (itz), exhibits higher capillary porosity and larger pores than the bulk cement paste matrix [6, 7]. these features are commonly attributed to the cement particle packing effect and the one-sided growth effect. however, if itzs do not percolate, their effect on transport will be fairly small, as any transport path through concrete would have to go through the bulk cement paste. the transport properties would then be dominated by the transport properties of the bulk cement paste. the problem of itz percolation in concrete is computationally not simple, as the geometry and topology of this phase are complex. note that micrometer and millimeter length scales have to be considered simultaneously in such a study. this makes the standard models, based solely on digital image processing [8, 9], prohibitive in terms of memory requirements. the hard core – soft shell model [10, 11] is therefore utilized. in this continuum model, each randomly-placed
this work goes one step further and elaborates the assessment of itz percolation for realistic shapes and distribution of aggregates. a relatively simply and robust approach introduced in [14] is employed for describing real three-dimensional aggregate particles. this method is based on approximating the particle shape by spherical harmonic functions (the threedimensional equivalent of two-dimensional fourier analysis). although this representation is not universal (it implies that the aggregate particle is star-like in shape with no internal voids), it is suitable for almost all aggregates used in structural concrete. a significant advantage of this approach is that the resolution of the smooth representation can be flexibly controlled by the number of terms in the expansion. the paper is organized as follows. in section 2, the representation of the aggregate particle using spherical harmonic expansion is recalled. then the algorithm for packing the aggregate particles into the representative volume is described in section 3. the actual assessment of itz percolation is worked out in section 4. section 5 presents a simple example of a concrete specimen with realistic shapes and distribution of the aggregates. the paper ends with concluding remarks in section 6. 2 geometrical representation of aggregate particles the geometrical representation of aggregate particles is based on a scalar function r(η,ϕ), defined as the distance of the particle surface point from the particle center (possibly center of mass) measured in the direction of spherical coordinates η and ϕ with the origin located at that center (figure 1). the cartesian coordinates of the surface point (components of its position vector r) are then given by x(η,ϕ) = r(η,ϕ) sin(η) cos(ϕ) y(η,ϕ) = r(η,ϕ) sin(η) sin(ϕ) (1) z(η,ϕ) = r(η,ϕ) cos(η) . η y x ϕ o z η ϕr ( , ) η y x ϕ o z η ϕr ( , ) surface surface real digital figure 1: description of the aggregate particle in the spherical coordinate system. if the function r(η,ϕ) is smooth on unit sphere (0 ≤ η ≤ π, 0 ≤ ϕ ≤ 2π) and periodic in ϕ, it can be expressed in the form of the expansion into spherical harmonic functions as r(η,ϕ) = ∞∑ n=0 n∑ m=−n anmy m n (η,ϕ), (2) where anm are the coefficients of the expansion and y mn (η,ϕ) are the spherical harmonic functions (see [14] for details). the coefficients anm are determined by the integration anm = ∫ 2π 0 ∫ π 0 r(η,ϕ) sin(η)ȳ mn (η,ϕ) dη dϕ, (3) where ȳ mn (η,ϕ) is the complex conjugate to y mn (η,ϕ). an analytical evaluation of anm is practically not feasible (except for the case of a sphere where a00 is the only non-zero coefficient), and it is therefore necessary to apply numerical integration. gaussian numerical integration is employed in the present study. theoretically, the order of the numerical integration should approach infinity. however, taking for the approximation of r(η,ϕ) the final summation r(η,ϕ) = n∑ n=0 n∑ m=−n anmy m n (η,ϕ), (4) where n is the order of the expansion, there is a reduction in the order of the numerical integration needed for an accurate evaluation of coefficients anm and consequently also in the number of values r(η,ϕ) needed for the integration. the values of r(η,ϕ) at the integration points may be derived, for example, from a digital (voxel-based) representation of an aggregate particle (easily acquainted from computer tomography or any other similar scanning device). 
r(η,ϕ) is defined as the length of the segment in the direction given by η and ϕ connecting the particle center with the side of the voxel forming the aggregate particle boundary (figure 1 on the right). numerical experiments reveal that the contributions of the expansion terms for n > 20 are usually 39 acta polytechnica vol. 52 no. 6/2012 negligible, and that order 128 of gaussian numerical integration is sufficient in most cases. it is also important to realize that growth of the expansion order may lead not to expected improvement in the geometrical representation, but to undesirable capturing of the unrealistic digital roughness inherently present in the voxel-based description. an example of the representation of a particular aggregate particle using spherical harmonic expansion with 128-point gaussian numerical integration is shown in figure 2 for different values of n. after the spherical harmonic analysis is completed, the aggregate particle in a general location is represented by its origin and unit vectors, corresponding to the positive x, y, and z axes of the local spherical coordinate system used for spherical harmonic expansion, by the order of the expansion, and by the set of expansion coefficients. for an evaluation of most geometrical properties1 only the last two parameters, the order of the expansion and the expansion coefficients, are relevant. the remaining parameters describe the spatial location of the aggregate particle, and may be used for evaluating some of the geometrical properties in the global cartesian coordinate system. 3 packing algorithm in order to model a concrete specimen with the aggregate distribution reasonably matching the realistic shapes and volumetric fractions of aggregate particles of various size grades, it is necessary to employ an appropriate packing algorithm that assembles aggregate particles into a representative volume. this kind of algorithm should (i) ensure sufficient randomness, (ii) comply with the volumetric content and the statistical distribution of size grades of the individual aggregates, and (iii) prevent overlapping of individual aggregate particles. this is a non-trivial task when realistic aggregate shapes are considered. there are many packing strategies [15, 16, 17, 18, 19, 20] designated for spherical, cylindrical, and ellipsoidal inclusions, but many of them are specialized for a particular purpose, or are restricted by some constraints not applicable to the problem of aggregate packing. the most common packing approach is based on the so-called take-and-place method ([21, 22, 23]), which is also employed in the present study. in the first phase, the particles are randomly chosen from a container of predefined and sufficiently representative shapes of aggregates of a particular type (e.g. artificially crushed aggregates or natural aggregates from a riverbed). each particle is randomly scaled to fit into a particular size grade and, its volume is 1with respect to the present work only the volume, needed for the evaluation of aggregates volume fraction, and the first gradients of the r with respect to η and ϕ, needed for the intersection check, are of interest. figure 2: representation of an aggregate particle by spherical harmonic expansion of order 5, 10, 15, 20, 25, and 30 (from left to right and from top to bottom). added to the volumetric content of that grade. the individual size grades are processed sequentially in descending order with respect to the sieve size, until the desired volumetric content is reached. 
in the next phase, which is computationally most demanding, the individual particles are placed into the representative volume in descending order with respect to the radius of their bounding sphere. each particle is first randomly rotated, and then its tentative location in the representative volume is randomly generated. note that if a periodic representative volume is treated, the periodic counterpart(s) of the processed particle must also be handled. before the tentative location is accepted, it has to be verified that the particle (and its periodic counterpart(s)) does not overlap with any of the already inserted particles and, if the periodicity is not considered, that it does not intersect the boundary of the representative volume. if an overlap or an intersection is detected, a new position (and, after a large number of unsuccessful attempts, also a new rotation) is randomly generated, and the process is repeated until the location can be accepted. note that the above algorithm is suitable only for realistic size grading with a volumetric content of the aggregates not exceeding 50–60 % of the total representative volume. for higher volumetric content or for unusual size grading, alternative more sophisticated packing procedures (see [24, 25, 26] for example) are necessary.

the crucial aspect of the packing algorithm is the intersection check. in order to make it efficient, an octree data structure is used to easily identify pairs of particles which could potentially overlap. although many intersection checks can be resolved using the bounding spheres of individual particles (a single bounding sphere per particle), the computational effort related to the remaining checks is still prohibitive. a refined bounding box, comprising a set of spheres of variable radius covering the whole particle, is therefore established. the spheres in the refined bounding box are the spheres circumscribed to the individual simplices in a constrained 3d delaunay triangulation [27] constructed over a set of particle surface points. these points are obtained as the intersection of the particle surface with regularly spaced rays emanating from the expansion center of the particle. the particle surface points are connected by a surface triangulation, the topology of which is the same for all particles and is known in advance. the process for constructing the refined bounding box is schematically demonstrated for a 2d case in figure 3.

figure 3: construction of the refined bounding box of an aggregate particle: a) surface points generated by the intersection of regularly spaced rays emanating from the expansion center of the particle with the surface of the particle, b) constrained delaunay triangulation of the surface points, c) refined bounding box defined by the envelope (in gray) of the circles circumscribed to the delaunay simplices.

in the current implementation, the directions of the rays are taken from the microplane material model [28], which uses 122 regularly spaced directions (defining the normals of the individual constitutive microplanes in that material model). this yields 240 triangles approximating the particle surface. this basic triangulation is sufficient up to expansion order 10 (inclusive). for higher expansion order values, however, the basic triangulation may be too coarse and must be globally refined by a regular recursive subdivision, in which each surface triangle is divided into four geometrically similar subtriangles. the newly-introduced subtriangle vertices define new rays which, in turn, produce new surface points that enrich the original set of basic surface points, and that replace the original vertices (obtained by subdivision) of the individual subtriangles (see figure 4). an example of particle surface triangulations corresponding to the basic points and to the points after the first and the second refinement level is depicted in figure 5. the estimate of the number of refinement levels ensuring that the particle is completely covered by the refined bounding box is based on a heuristic (verified experimentally on a set of particles) which applies a one-level refinement (an increase in the number of triangles by a factor of 4) whenever the expansion order increases by 10. this implies that just one level of refinement is necessary for the most commonly-used expansion orders between 10 and 20 (inclusive).

figure 4: refinement of the surface triangulation using hierarchical subdivision. black nodes correspond to surface points on the preceding refinement level, white nodes are subdivision points (not on the real surface) on the current refinement level, gray nodes are surface points on the current refinement level.

however, even for just a single-level refinement, the refined bounding box comprises 960 spheres, which is computationally unfavourable. the number of spheres forming the refined bounding box can be decreased by performing the subdivision locally (for example, with respect to the change in the particle surface curvature). in this case, the constrained delaunay triangulation is no longer conforming, due to the introduced hanging nodes (if only one out of two adjacent triangles was refined). however, this causes no difficulty. the only consequence is that the refined bounding box is slightly larger, because it is formed by fewer spheres. in the current approach, an alternative strategy is adopted to keep the number of bounding spheres reasonably small. initially, the constrained delaunay triangulation is built using only the basic 240 surface triangles, irrespective of the actual number of refinement levels that are applied. then, for each of the created tetrahedra (a maximum of 240), its circumscribed sphere is appropriately expanded to cover also the refined points constructed during the refinement over the basic triangle(s) bounding that tetrahedron. finally, the total number of spheres in the refined bounding box is reduced by merging (again using the appropriate expansion) several almost identical spheres (having a similar radius and location). note that the expansion is accompanied by an appropriate shift of the center of the sphere to minimize the necessary growth of its radius; the center, however, has to be kept safely inside the particle. of course, this merging increases the size of the refined bounding box. the magnitude of the increase is controlled by an expansion factor related to the radius of the single bounding sphere of the particle. figure 6 demonstrates the effect of merging the spheres in the refined bounding box of the particle from figure 5 when one refinement level (see figure 5b) was used. in the present study, an expansion factor of 0.1 was chosen as optimal, considerably reducing the number of spheres in the refined bounding box while maintaining reasonably tight bounding. unfortunately, it is difficult to make a quantitative assessment of the measure of agreement between the particle itself and its refined bounding box.
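to make the placement machinery above concrete, here is a schematic sketch of the first-phase placement loop, reduced to the coarse single-bounding-sphere test (the actual algorithm additionally consults the octree, the refined bounding boxes and the exact surface geometry, and handles rotations and periodic images):

import numpy as np

rng = np.random.default_rng(0)

def place_particles(radii, box, max_tries=10_000):
    """take-and-place sketch: largest-first random placement with a coarse
    bounding-sphere rejection test (no octree, no periodic images)"""
    placed = []                                    # list of (center, radius)
    for r in sorted(radii, reverse=True):
        for _ in range(max_tries):
            c = rng.uniform(r, box - r, size=3)    # keep the sphere inside the box
            if all(np.linalg.norm(c - c2) > r + r2 for c2, r2 in placed):
                placed.append((c, r))
                break
        else:
            raise RuntimeError("could not place particle; packing too dense")
    return placed

pack = place_particles(rng.uniform(2.0, 8.0, size=40), box=100.0)
print(len(pack), "particles placed")

rejecting on bounding spheres alone is conservative (it discards some admissible configurations), which is exactly why the refined bounding box and the exact surface checks described above are worthwhile.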
the newlyintroduced subtriangle vertices define new rays which, in turn, produce new surface points that enrich the original set of basic surface points, and that replace the original vertices (obtained by subdivision) of individual subtriangles (see figure 4). an example of particle surface triangulations corresponding to basic points and to points after the first and the second refinement level is depicted in figure 5. the estimate of the number of refinement levels ensuring that the particle is completely covered by the refined bounding box is based on a heuristics (verified experimentally on a set of particles) which applies a one-level refinement (an increase in the number of triangles by a factor of 4) whenever the expansion order increases by 10. this implies that just one level of refinement is necessary for the most commonly-used expansion orders between 10 and 20 (inclusive). however, even for just a singlelevel refinement, the refined bounding box comprises figure 4: refinement of the surface triangulation using hierarchical subdivision. black nodes correspond to surface points on the preceding refinement level, white nodes are subdivision points (not on the real surface) on current refinement level, gray nodes are surface points on the current refinement level. 960 spheres, which is computationally unfavourable. the number of spheres forming the refined bounding box can be decreased by performing the subdivision locally (for example, with respect to the change in particle surface curvature). in this case, the constrained delaunay triangulation is no longer conforming, due to the introduced hanging nodes (if only one out of two adjacent triangles was refined). however, this causes no difficulty. the only consequence is that the refined bounding box is slightly larger because it is formed by fewer spheres. in the current approach, an alternative strategy is adopted to keep the number of bounding spheres reasonably small. initially, the constrained delaunay triangulation is built using only the basic 240 surface triangles, irrespective of the actual number of refinement levels that are applied. then, for each of the created tetrahedra (a maximum of 240), its circumscribed sphere is appropriately expanded2 to cover also the refined points constructed during the refinement over the basic triangle(s) bounding that tetrahedron. finally, the total number of spheres in the refined bounding box is reduced by merging (again using the appropriate expansion2) several almost identical spheres (having a similar radius and location). of course, this increases the size of the refined bounding box. the magnitude of the increase is controlled by an expansion factor related to the radius of the single bounding sphere of the particle. figure 6 demonstrates the effect of merging the spheres in the refined bounding box of the particle from figure 5 when one refinement level (see figure 5b) was used. in the present study, an expansion factor of 0.1 was chosen as optimal, considerably reducing the number of spheres in the refined bounding box while maintaining reasonably tight bounding. unfortunately, it is difficult to make a quantitative assessment of the measure of agreement between the particle itself and its refined bounding box. since the particle is geometrically described by the 2note that expansion is accompanied by an appropriate shift of the center of the sphere to minimize necessary growth of its radius. the center, however, has to be kept safely inside the particle. 
41 acta polytechnica vol. 52 no. 6/2012 a) b) c) figure 5: surface triangulation of an aggregate particle: a) basic triangulation (240 triangles), b) triangulation with one-level refinement (960 triangles) and c) triangulation with two-level refinement (3840 triangles). a) b) c) d) e) f) figure 6: refined bounding box (of the particle from figure 5): a) no merging applied (240 spheres), b) merging using expansion factor 0.05 (156 spheres), c) 0.1 (71 spheres), d) 0.15 (49 spheres), e) 0.2 (36 spheres) and f) 0.25 (31 spheres). ray length r(η,ϕ), the refined bounding box may be precomputed for individual particles in the container and then only appropriately scaled and rotated according to the randomly generated values during the packing procedure. the refined bounding box is used to efficiently resolve the vast majority of the intersection checks between close particles. note that handling all pairs of spheres (one from each refined bounding box) in such a check is inefficient, even for the relatively small number of spheres in refined bounding boxes. in the present implementation, the most marginal sphere (on the side facing the other particle along the direction connecting the expansion centers of the two particles) in the refined bounding box of each of the two particles is identified first. then a search is made for the closest sphere from the refined bounding box of one particle to the marginal sphere of the refined bounding box of the other particle. from these two pairs, the one with smaller distance between the spheres is selected. then the search continues only among mutual pairs of those spheres (one from each refined bounding box) crossing a layer perpendicular to the line connecting the expansion centers of the two particles. the planes bounding the layer are defined as the planes touching the marginal spheres ec 1 90 2min d(ms , s ) 2min d(ms , s ) ec 2 1 1 1 ms1 2 1 2 ms 12 2 d figure 7: intersection check of refined bounding boxes of two particles. only pairs of spheres (one from each refined bounding box) crossing the light gray layer are investigated. eci indicates expansion center of particle i, msji denotes marginal sphere (in dark gray) of particle i facing particle j, d(msji ,sj) represents distance between marginal sphere of particle i and any sphere of particle j, d stands for the minimal detected distance (monotonically decreasing during the intersection check) initially set to smaller from min(d(ms21,s2)) and min(d(ms12,s1)). (on the side facing the other particle) shifted toward the other particle by the so far detected minimum distance between spheres defining the refined bounding boxes. the strategy discussed above is schematically depicted in figure 7. if the intersection cannot be reliably refuted (i.e. if any of the spheres in the refined bounding box of one particle is touching or overlapping any sphere in the refined bounding box of the other particle) or cannot be reliably proved3 (if the center of any sphere in the refined bounding box of one particle is inside the other particle), then the real geometry of the investigated particles is considered. since the surface of the particle (represented by the spherical harmonic expansion) is naturally parametrized by the spherical angles η, ranging from 0 to π and ϕ, varying from 0 to 2π and being periodic, a standard algorithm (the so-called closest point projection) is adopted which returns, for a given point, the closest point on the surface (for details see [29]). 
since the surface of the particle (represented by the spherical harmonic expansion) is naturally parametrized by the spherical angles η, ranging from 0 to π, and ϕ, varying from 0 to 2π and being periodic, a standard algorithm (the so-called closest point projection) is adopted, which returns, for a given point, the closest point on the surface (for details see [29]). to obtain a pair of mutually closest points, one on each particle surface, the closest point projection is performed in a staggered way. this means that the projection of a point to the surface of one particle is then projected to the surface of the other particle, and so on (see figure 8), until the converged state is achieved (the projection of the point on the surface of one particle to the surface of the other particle is the point on the other particle, and vice versa). note that the staggered projection may become quite inefficient if the surfaces of the particles facing each other are almost parallel, or if the particles are far from each other.

figure 8: intersection check of particles represented by the real geometry using the closest point projection in the staggered way. starting points are indicated as white circles, the intermediate projection points as gray circles, and the closest points as black circles. choosing sp_a as the starting point results in the locally closest points a1 and a2. the globally closest points b1 and b2 are obtained by choosing sp_b as the starting point.

in the case of convex non-overlapping particles, this procedure yields the closest points in the global sense, irrespective of the location of the starting point. if the convex particles overlap or touch each other, then the procedure converges to geometrically identical points on the intersection of the particles (there is a pathological case, not too likely to happen because of round-off errors, when the starting point is located on a line which is normal simultaneously to both overlapping particles; in this case, the closest points are distinct, each one inside the other particle). to speed up the identification of overlapping particles, a point projected to the surface of one particle is checked against being inside the other particle before it is projected to that other particle. if the particles are not convex (which is the case for the present study), the staggered projection generally results in points closest to each other in the local sense only (points a1 and a2 in figure 8), depending on the starting point. in order to identify the closest points in the global sense (points b1 and b2 in figure 8), the staggered projection procedure is performed for several starting points. in the present implementation, the points used to build the refined bounding box are used. to limit the total number of starting points, only those points on the surface of one particle which are inside the single bounding sphere of the other particle are considered (see figure 9). from this point of view, local refinement of the basic surface triangulation (used to build the refined bounding box) has a clear advantage over global refinement, as it yields fewer points that can potentially be used as starting points for the staggered projection.

figure 9: points (white circles) used as starting points for the intersection check of particles represented by the real geometry. bs_i indicates the single bounding sphere of the i-th particle.
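a minimal sketch of the staggered projection, with the two closest-point projections supplied as callables; this illustrates the scheme on spheres, where the projection has a closed form, and is not the implementation of [29]:

```python
import numpy as np

def staggered_projection(closest_on_a, closest_on_b, start, tol=1e-9, max_iter=200):
    """alternate closest-point projections between two surfaces until the
    pair of points is mutually fixed; returns a locally closest pair."""
    p_a = closest_on_a(np.asarray(start, dtype=float))
    for _ in range(max_iter):
        p_b = closest_on_b(p_a)        # project onto the other surface ...
        p_a_new = closest_on_a(p_b)    # ... and back again
        if np.linalg.norm(p_a_new - p_a) < tol:
            return p_a_new, p_b        # converged state reached
        p_a = p_a_new
    return p_a, p_b

def closest_pair(closest_on_a, closest_on_b, starts):
    """run the staggered projection from several starting points and keep
    the globally best pair (needed for non-convex particles)."""
    pairs = [staggered_projection(closest_on_a, closest_on_b, s) for s in starts]
    return min(pairs, key=lambda ab: np.linalg.norm(ab[0] - ab[1]))

# toy usage with two spheres (center c, radius r), whose projection is
# simply c + r * (p - c) / |p - c|
def sphere_proj(c, r):
    c = np.asarray(c, dtype=float)
    return lambda p: c + r * (p - c) / np.linalg.norm(p - c)

if __name__ == "__main__":
    a, b = sphere_proj((0, 0, 0), 1.0), sphere_proj((3, 0, 0), 1.0)
    pa, pb = closest_pair(a, b, [(0.3, 0.8, 0.1), (0.9, -0.2, 0.4)])
    print(np.linalg.norm(pa - pb))   # approaches 1.0, the true distance
```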
the performance can also be improved by using only those surface points of one particle, within the single bounding sphere of the other particle, which are closest to any of the surface points on the other particle. however, this acceleration may determine only the locally closest points. note that, in some pathological cases, there may be no starting points, or only a few starting points, within the single bounding spheres, even if the refined bounding boxes overlap. in this case, the radii of the single bounding spheres are slightly enlarged to identify a large enough number of starting points. it is apparent that the above approach is appropriate for identifying whether two particles do or do not intersect, and for determining the distance that is needed for estimating the scaling factor (see below) of close enough particles (with overlapping refined bounding boxes). the distance of two particles with non-overlapping refined bounding boxes is only approximated by the minimum distance of the spheres in the refined bounding boxes. if their distance is to be evaluated exactly, then the starting points for the staggered projection are determined using enlarged single bounding spheres containing the expansion center of the other particle.

in the second phase of the aggregate packing algorithm, each of the aggregate particles is expanded (scaled with respect to its expansion center) and shifted until it is touching at least two of its surrounding particles or, if periodicity is not considered, one or more sides of the representative volume. note that this may lead to the introduction of additional periodic particles, if a representative volume with a periodic boundary is considered. the expansion is a computationally demanding process that must be performed in an iterative manner. in order to reduce the computational load related to the calculation of the distance of two aggregate particles (needed for estimating the scaling factor) and also to the intersection check (needed for verifying that the scaling has not been overestimated), only particles in the immediate neighbourhood are considered. since a large number of iterations are required to achieve touching (within the round-off error) of two neighbouring particles, a scaling yielding non-overlapping particles closer than the sum of the thicknesses of their itzs is accepted as approximate touching. note that although this approach allows us to handle variable itz thickness, we utilize a constant thickness that is the same for all particles. there are several expansion scenarios, depending on the order in which the aggregate particles are processed, and on whether a maximum scaling factor (cumulative within a single expansion step) is prescribed. currently, the simplest approach is adopted, in which the aggregate particles are processed in the order in which they were packed into the representative volume. note that the expansion phase results in a slight violation of the initially prescribed size grading. the whole process of the packing procedure is schematically depicted in figure 10 for a two-dimensional example of a periodic representative volume with aggregate particles corresponding to three size grades (distinguished by different levels of gray). note that some of the particles appear in the pack more than once. these are the periodic particles, and they can easily be identified in figure 10 as those crossing the boundary of the representative volume.

figure 10: aggregates packing procedure: a) container of aggregate shapes (particles are rotated according to the final location; only the used shapes are shown), b) empty representative volume, c) representative volume with the periodic aggregates pack, d) representative volume with the periodic expanded aggregates pack.
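the iterative expansion step described above can be sketched as follows, with a hypothetical gap_at(scale) callback standing in for the staggered-projection distance to the nearest neighbour; the damping factor and iteration limit are illustrative choices:

```python
def expand(scale, gap_at, r0, itz=2 * 30e-6, max_iter=100):
    """grow one particle about its expansion center until it approximately
    touches a neighbour (gap within the summed itz thicknesses).

    gap_at(scale): smallest surface-to-surface distance to any neighbour
    (or wall) at the given scale; a placeholder for the real computation.
    r0: radius of the particle's single bounding sphere at scale 1, used
    to turn a gap into a first-order scaling estimate.
    """
    for _ in range(max_iter):
        gap = gap_at(scale)
        if gap < 0.0:
            raise ValueError("scaling overestimated: particles overlap")
        if gap <= itz:
            return scale                 # accepted as approximate touching
        # first-order estimate, damped because the particle is not a sphere
        scale += 0.5 * gap / r0
    return scale

# toy check: spherical particle of radius 1*scale, neighbour surface 2.5 away
if __name__ == "__main__":
    print(expand(1.0, lambda s: 2.5 - 1.0 * s, r0=1.0))   # converges to ~2.5
```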
4 assessment of itz percolation

after the representative aggregate pack is available, the itz percolation has to be verified. this is accomplished using the hard core – soft shell model, in which the hard core is formed by the aggregate particle and the soft shell by an itz of constant thickness (the same for all particles, irrespective of their size). initially, the connectivity of aggregate particles that are less than twice the thickness of itz apart is built up. note that this connectivity has already been established during the expansion process of the aggregates packing procedure described in section 3. the mutually connected aggregate particles form regions that are surrounded by distinct continuous itzs. itz percolation occurs if any of the distinct itzs interconnects the opposite sides of the representative volume (preferably in all principal directions). to prove that a particular pack has percolated, it is sufficient to tag aggregate particles touching (and/or crossing, if the periodicity of the pack is taken into account) opposite sides of the representative volume (using a different tag for each side) and to verify whether, in the connectivity lists, there are particles marked by tags corresponding to opposite sides (a minimal sketch of this check is given at the end of this section). since it is difficult to evaluate the itz volume fraction, as the itz is free to overlap itself, itz percolation is indirectly described by the aggregate volume fraction of the total representative volume at which itz percolation just occurs. note that generally only the volume of the parts of the aggregate particles that are inside the representative volume should be taken into consideration. however, if the aggregate pack is built as periodic, then the whole volume of each particle is taken, with the exception of the periodic counterparts, which are ignored.
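a minimal sketch of the percolation check described above, using a union-find structure over the connectivity pairs established during the expansion phase (the data layout and names are illustrative):

```python
class UnionFind:
    """disjoint sets of particle indices (the connectivity regions)."""
    def __init__(self, n):
        self.parent = list(range(n))
    def find(self, i):
        while self.parent[i] != i:
            self.parent[i] = self.parent[self.parent[i]]  # path halving
            i = self.parent[i]
        return i
    def union(self, i, j):
        self.parent[self.find(i)] = self.find(j)

def itz_percolates(n, connected_pairs, touching_min_face, touching_max_face):
    """hard core - soft shell percolation in one principal direction.

    connected_pairs: pairs (i, j) of particles less than twice the itz
    thickness apart. touching_*_face: indices of particles tagged as
    touching (or crossing) the two opposite sides of the representative
    volume. percolation occurs if one connected region carries both tags.
    """
    uf = UnionFind(n)
    for i, j in connected_pairs:
        uf.union(i, j)
    roots_min = {uf.find(i) for i in touching_min_face}
    roots_max = {uf.find(i) for i in touching_max_face}
    return bool(roots_min & roots_max)

# tiny example: a chain 0-1-2 spanning the volume percolates
if __name__ == "__main__":
    print(itz_percolates(4, [(0, 1), (1, 2)], {0}, {2}))   # True
    print(itz_percolates(4, [(0, 1)], {0}, {2}))           # False
```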
5 example

in this section, a single aggregate pack used for the itz percolation assessment is presented. the pack was generated inside a representative cube 50 mm in edge length, with a total volumetric content of aggregates prescribed to 50 %. a total of three aggregate size grades were used. the coarsest grade ranges from 8 to 16 mm, the intermediate grade from 4 to 8 mm, and the finest grade from 0.5 to 4 mm. the thickness of itz was chosen to be 30 µm. the container of representative aggregate shapes was formed by 100 aggregate particles derived from a randomly generated ellipsoid by randomly scaling the lengths of the rays r used for evaluating the coefficients of the spherical harmonic expansion (eq. (3)). note, however, that sufficient correlation was enforced for the lengths of rays corresponding to adjacent integration points, in order to avoid unrealistic artificial shapes. the expansion coefficients were calculated for an expansion order equal to 20, using 128-point numerical gaussian integration. some of the representative aggregate particles are visualized in figure 11. although the particles are generated artificially, they quite realistically resemble natural riverbed aggregates.

figure 11: some of the representative shapes of aggregate particles.

figure 12: aggregates packing into the representative cube (levels of gray correspond to individual size grades).

the final aggregate pack, shown in figures 12 and 13, contains 2542 particles, out of which 14 are from the coarse grade, 161 from the intermediate grade, and 2367 are from the fine grade. the resulting aggregate volumetric content was 53 %, which suggests that there was not much space for particle expansion. however, the percolation of itz in this particular case was not proved. this again reveals that a more powerful packing procedure is desirable to reach the higher aggregate volumetric content at which itz percolation may occur.

figure 13: detail of aggregates packing into the representative cube.

6 conclusions

this paper has presented an algorithm for geometrical modeling of the microstructure of concrete with embedded aggregate particles of realistic shape. unlike other works on similar topics, real-shaped aggregate particles, with the geometrical representation based on the spherical harmonic expansion, have been considered. the microstructure that was produced is used for assessing itz percolation within the representative volume, using the hard core – soft shell model. the representative volume is built using a simple two-phase randomized particle packing procedure. in the first phase, the particles are taken from the container of representative aggregate particle shapes, and are randomly placed into the representative volume. in the second phase, the particles are scaled and shifted until they come into mutual contact (within the tolerance given by the thickness of itz). to make the investigation of the particle contacts more efficient, the individual particles are approximated by a refined bounding box composed of a set of spheres of variable radius. if the contact cannot be reliably assessed using the refined bounding box, the real geometry of the investigated particles is considered. the contacts are used to set up the connectivity of the aggregate particles, which is then used to verify whether itz percolation occurs. in order to estimate the itz percolation threshold, given in terms of the aggregate volume fraction of the total representative volume, a large number of simulations have to be performed. those that are close to the state at which itz percolation just occurs are then taken into consideration. however, this is a subject for future work. future research will also focus on alternative packing procedures that enable higher aggregate volumetric content values to be reached.

acknowledgements

this work was supported by the ministry of education of the czech republic under project no. msm 6840770003. its financial assistance is gratefully acknowledged.

references
[1] garboczi, e.j., bentz, d.p.: computer simulation and percolation theory applied to concrete, annual reviews of computational physics vii, d. stauffer (ed.), world scientific publishing company, 1999, pp. 85–123.
[2] bentz, d.p., garboczi, e.j.: percolation of phases in a three-dimensional cement paste microstructure model, cement and concrete research, 21, 1991, pp. 325–344.
[3] bentz, d.p., schlangen, e., garboczi, e.j.: computer simulation of interfacial zone microstructure and its effect on the properties of cement-based composites, in material science of concrete iv, skalny, j.p. et al. (eds), journal of the american ceramic society, 1995, pp. 155–199.
[4] garboczi, e.j., bentz, d.p.: computer simulation of the diffusivity of cement-based materials, journal of materials science, 27, 1992, pp. 2083–2092.
[5] shane, j.d., mason, t.o., jennings, h.m., garboczi, e.j., bentz, d.p.: effect of the interfacial transition zone on the conductivity of portland cement mortars, journal of the american ceramic society, 83, 2000, pp. 1137–1144.
[6] maso, j.c.: the bond between aggregates and hydrated cement paste, in proceedings of the 7th international congress on the chemistry of cement i, vii-1/3, 1980.
[7] garboczi, e.j., bentz, d.p.: digital simulation of the aggregate-cement paste interfacial zone in concrete, journal of materials research, 6, 1991, pp. 196–201.
[8] garboczi, e.j.: finite element and finite difference programs for computing the linear electric and elastic properties of digital images of random materials, nist internal report 6269, 1998.
[9] jia, x., williams, r.a.: a packing algorithm for particles of arbitrary shapes, powder technology, 120, 2001, pp. 175–186.
[10] torquato, s.: two-point distribution function for a dispersion of impenetrable spheres in a matrix, journal of chemical physics, 85, 1986, pp. 6248–6249.
[11] lee, s.b., torquato, s.: porosity for the penetrable-concentric-shell model of two-phase disordered media: computer simulation results, journal of chemical physics, 89, 1988, pp. 3258–3263.
[12] bentz, d.p., garboczi, e.j., stutzman, p.e.: computer modelling of the interfacial transition zone in concrete, in interfaces in cementitious composites, maso, j.c. et al. (eds), 1993, pp. 259–268.
[13] bentz, d.p., hwang, j.t.g., hagwood, c., garboczi, e.j., snyder, k.a., buenfeld, n., scrivener, k.l.: interfacial zone percolation in concrete: effects of interfacial thickness and aggregate shape, in microstructure of cement-based systems/bonding and interfaces in cementitious materials, diamond, s. et al. (eds), material research society, pittsburgh, 1995, pp. 437–442.
[14] garboczi, e.j.: three-dimensional mathematical analysis of particle shape using x-ray tomography and spherical harmonics: application to aggregate used in concrete, cement and concrete research, 32, 2002, pp. 1621–1638.
[15] berryman, j.g.: random close packing of hard spheres and disks, physical review a, 27, 1983, pp. 1053–1061.
[16] salvat, w.i., mariani, n.j., barreto, g.f., martínez, o.m.: an algorithm to simulate packing structure in cylindrical containers, catalysis today, 107–108, 2005, pp. 513–519.
[17] kristiansen, k.l., wouterse, a., philipse, a.: simulation of random packing of binary sphere mixtures by mechanical contraction, physica a, 358, 2005, pp. 249–261.
[18] knott, g.m., jackson, t.l., buckmaster, j.: the random packing of heterogeneous propellants, aiaa journal, 39, 4, 2001, pp. 678–686.
[19] lo, s.h., wang, w.x.: generation of tetrahedral mesh of variable element size by sphere packing over an unbounded 3d domain, computer methods in applied mechanics and engineering, 194, 2005, pp. 5005–5018.
[20] han, k., feng, y.t., owen, d.r.j.: sphere packing with a geometric based compression algorithm, powder technology, 155, 1, 2005, pp. 33–41.
[21] bažant, z.p., tabbara, m.r., kazemi, m.t., pijaudier-cabot, g.: random particle model for fracture of aggregate or fiber composites, journal of engineering mechanics asce, 116, 8, 1990, pp. 1686–1705.
[22] wang, z.m., kwan, a.h.k., chan, h.c.: mesoscopic study of concrete i: generation of random aggregate structure and finite element mesh, computers and structures, 70, 5, 1999, pp. 533–544.
[23] koenke, c., eckardt, s., haefner, s.: spatial and temporal multiscale simulations of damage processes for concrete, in innovation in computational structures technology, topping b.h.v. et al. (eds), saxe-coburg publications, 2006, pp. 133–157.
[24] van mier, j.g.m., van vliet, m.r.a.: influence of microstructure of concrete on size/scale effects in tensile fracture, engineering fracture mechanics, 70, 16, 2003, pp. 2281–2306.
[25] leite, l.p.b., slowik, v., mihashi, h.: computer simulations of fracture processes of concrete using meso-level models of lattice structures, cement and concrete research, 34, 2004, pp. 1025–1033.
[26] haefner, s., eckardt, s., koenke, c.: a geometrical inclusion-matrix model for the finite element analysis of concrete at multiple scales, in proceedings of the 16th ikm, guerlebeck, k. et al. (eds), 2003.
[27] baker, t.j.: automatic mesh generation for complex three-dimensional regions using a constrained delaunay triangulation, engineering with computers, 5, 1989, pp. 161–175.
[28] bažant, z.p., caner, f.c., carol, i., adley, m.d., akers, s.a.: microplane model m4 for concrete. part i: formulation with work-conjugate deviatoric stress, journal of engineering mechanics asce, 126, 2000, pp. 944–953.
[29] rypl, d.: discretization of three-dimensional aggregate particles, in cd-rom proceedings of the iii. european conference on computational mechanics, mota soares, c.a. et al. (eds), 2006.

acta polytechnica 54(4):275–280, 2014, doi:10.14311/ap.2014.54.0275

using photoplethysmography imaging for objective contactless pain assessment

marcus koeny (a, *), nikolai blanik (a), xinchi yu (a), michael czaplik (b), marian walter (a), rolf rossaint (b), vladimir blazek (a), steffen leonhardt (a); a: chair for medical information technology, helmholtz-institute, rwth aachen university, pauwelsstrasse 20, 52074 aachen, germany; b: department of anesthesiology, university hospital aachen, pauwelsstrasse 30, 52074 aachen, germany; *: corresponding author: koeny@hia.rwth-aachen.de

abstract. this work presents an extension to the known analgesia nociception index (ani), which provides an objective estimation of the current depth of analgesia. an adequate "measure" would facilitate so-called balanced anesthesia. generally, ani is computed using heart rate variability or rather beat-to-beat intervals based on an electrocardiogram (ecg). there are clinical situations where no ecg monitoring is available or required, but only photoplethysmography (ppg), e.g., in some cases in postoperative care or pain therapy. in addition, a combination of ppg and ecg for obtaining beat-to-beat intervals may lead to increased robustness and reliability for dealing with artifacts. this work therefore investigates the computation of ani using standard ppg. in addition, new methods and opportunities are presented using contactless ppg imaging (ppgi). ppgi® enables contactless ppg recordings for deriving beat-to-beat intervals as well as analysis of local perfusion and wounds.

keywords: anesthesiology, analgesia, nociception, pain, heart rate variability, ecg, ppg, image based ppg, ppgi.

1. introduction

this work discusses an extension to the photoplethysmogram (ppg) [1] based analgesia nociception index (ani) for image based ppg (ppgi®). one of the major tasks of the anesthesiologist during surgical interventions is to maintain adequate narcosis.
the dosage of drugs leading to suppression of the patient's consciousness and the sensation of pain must be adapted to the current surgical progress, and also individually for each patient with regard to his premedical history and current health state. some events, e.g., skin incisions, require deeper analgesia. increases of blood pressure and/or heart rate can indicate inadequate narcosis. however, it is often particularly challenging to discriminate pain from insufficient sedation. an adequate assessment of the depth of analgesia would support the anesthesiologist in balancing the narcosis properly. unfortunately, a reliable objective measurement of pain intensity is not yet available, since pain is an individual sensation leading to large inter-patient variability. awake patients are usually asked to estimate their pain intensity on a given scale (e.g., from 1 to 10), using, e.g., the visual analogue scale (vas) [2]. however, this procedure is not feasible with unconscious or uncooperative patients. several new methods which aim to quantify the depth of analgesia are currently under investigation. one is the surgical stress index (ssi), which has been especially developed for analgesia monitoring during surgical interventions and is based on an analysis of the ppg measured by a finger clip [3]. ssi is currently available as an extension module for ge healthcare monitors to assess the analgesia or the stress level during surgical interventions. another method is based on skin conductance measurements [4]. the number of fluctuations of the skin conductance (nfsc) is associated with the patient's pain sensation. analyses of skin conductance are mostly used in postoperative clinical trials under various circumstances [5, 6]. a promising new method is the ani. the ani index is based on an analysis of the ecg signal [7], especially an analysis of heart rate variability (hrv) or the beat-to-beat intervals [8].

1.1. heart rate variability

heart frequency is related to the activity of the sympathetic and parasympathetic nervous system. variations of frequency are affected by the autonomous nervous system (ans) which, for example, reacts to external influences, such as stress or pain. since the patient is unconscious during surgical interventions, it can be assumed that in this situation pain is the most significant stress factor. in studies, a relatively high correlation between analgesic drugs and hrv has been confirmed [7]. the spectrum of the hrv can be divided into four parts, which are associated with different sources [8, 9]. ani uses the hf frequency range between 0.15 hz and 0.5 hz, which is associated with the coupling between breathing and heart rhythm [7], i.e. the respiratory sinus arrhythmia.

1.2. analgesia nociception index

the following steps of the ani computation are summarized from [7]. the first step is to determine the beat-to-beat intervals from the ecg signal. then a special filter is used to remove extrasystoles and artifacts from the beat-to-beat interval series [10]. after resampling, the signal is normalized over a time period of 64 seconds, and the signal is band-pass filtered between 0.15 and 0.5 hz. finally, the parasympathetic tone is computed using an area-under-the-envelope algorithm of the filtered beat-to-beat series curve.
ani is now computed according to equation (1), where $\mathrm{auc}_{\min}$ is determined as the minimum area of the envelopes of the 64 s interval divided into 16 s parts. finally, ani is determined as described in [7]:
$$\mathrm{ani} = \frac{100\,(\alpha \cdot \mathrm{auc}_{\min} + \beta)}{12.8}. \qquad (1)$$
one advantage of the system is that no additional sensor is necessary. according to the manufacturer, ani is also valid for awake patients in postoperative care [7], for example in the recovery room or in intensive care.

1.3. use of ppg

ppg is an optical volumetric measurement. it is routinely used during surgical interventions to determine oxygen saturation (spo2). in most cases, it is measured using a finger clip. figure 4 shows an ecg and the corresponding ppg signal. of course, the ecg and ppg correlate well. therefore, the beat-to-beat intervals should also correlate, and the authors assume the ppg signal can be used to determine the ani index. a very promising approach would be to fuse ecg and ppg in order to improve the accuracy and the robustness of the ani computation. in addition, the source can be switched if the quality of one of the signals is not sufficient or is contaminated with artifacts.

1.4. image based ppg

image-based ppg technology (photoplethysmography imaging, ppgi) was first introduced in 1998 by rwth aachen university, aachen, germany [11]. a ppgi signal is recorded using visible and/or near infrared camera technology focused on the skin surface. similar to classical ppg, ppgi detects minor changes in light intensity originating from modulated perfusion of the skin tissue. hence, these changes (although not visible to the human eye) carry similar information to a ppg signal. the signal can be used to estimate heart rate and heart rate variability [11]. since camera sensor and measurement location are not rigidly fixed together, relative movements between the observed object and the camera may become an issue in the assessment of ppgi sequences. to prevent such movement artifacts, either image processing algorithms for movement compensation must be applied, or the measurement setup must guarantee a motion-free measurement scenario, for example by appropriate fixation of the observed body part.

figure 1. setup of the system in the university hospital in aachen for recording vital signs during surgical interventions.

to extract a ppg signal from a ppgi video sequence, every single pixel of the sensitive camera chip senses like a classical ppg sensor. in addition, inside the recorded video frame, a region of interest (roi) can be defined, in which for every frame the mean gray value of all contained pixels is calculated, representing the mean regional ppg value at that particular moment. this enables totally contact-free, non-obtrusive monitoring of skin perfusion and further derivable vital parameters. the local distribution within the observed body surface can be estimated for all findings, resulting in functional mapping of the observed parameter with spatial resolution [12]. we acknowledge that part of the work presented in this paper has been previously published in [1].
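a compact sketch of the pipeline of sections 1.2 and equation (1), using scipy; the calibration constants α and β of equation (1) are not given in the text, so unit placeholders are used here, and the envelope is crudely approximated by the rectified band-passed signal:

```python
import numpy as np
from scipy.signal import butter, filtfilt

def ani(rr_times, rr_values, fs=8.0, alpha=1.0, beta=0.0):
    """sketch of the ani pipeline (after [7]); assumes >= 64 s of
    artifact-filtered beat-to-beat intervals (times and values in s).
    alpha/beta are placeholders for the published calibration constants."""
    # resample the interval series to a uniform grid
    t = np.arange(rr_times[0], rr_times[-1], 1.0 / fs)
    rr = np.interp(t, rr_times, rr_values)

    # analyse the last 64 s; normalize, then band-pass 0.15-0.5 Hz (HF band)
    win = rr[-int(64 * fs):]
    win = (win - win.mean()) / win.std()
    b, a = butter(4, [0.15, 0.5], btype="band", fs=fs)
    hf = filtfilt(b, a, win)

    # area under the (rectified) envelope in four 16 s sub-windows;
    # AUC_min is the smallest of the four
    env = np.abs(hf)
    auc = [np.trapz(seg, dx=1.0 / fs) for seg in np.split(env, 4)]

    return 100.0 * (alpha * min(auc) + beta) / 12.8   # equation (1)
```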
2. materials and methods

2.1. data recordings in the university hospital in aachen

the ecg and ppg signals used to develop and evaluate the described algorithms were recorded anonymously over a period of two months at the university hospital of aachen, germany, after approval by the local ethics committee. the patients received either total intravenous anesthesia or balanced anesthesia with volatile anesthetics. in addition to the ecg and ppg signals, all available vital signs from the patient monitor, the anesthesia machine and the syringe pumps were recorded. in addition, important anesthesia-associated and surgical events, such as skin incision or manually administered drugs, were recorded by a dedicated observer. based on these data, influences on the ani could be examined and correlations could be analyzed, cf. figure 1. the system used for data acquisition was developed as an "anesthesia workstation" in the context of a research project [13] on modular networking in the operating room. using the pc data connection with the mp70 patient monitor (philips ag), a primus or cato anesthesia machine (draeger medical, luebeck) and syringe pumps (bbraun, melsungen), data were recorded during the surgical intervention and were prepared for further visualization and analysis in matlab (the mathworks company). using the same system, during an animal trial, the vital signs of a young pig ("deutsche landrasse") were recorded as a reference. using an additional pc, which was connected to the ppgi camera (avt pike 210b), the ppgi video was recorded synchronously, using a self-developed software tool written in matlab. due to restrictions of the recording system, three minutes could be recorded continuously. for this recording, the resolution of the camera was set to 640 × 480 pixels, with a sample rate of 15 frames per second.

figure 2. steps of the beat-to-beat interval calculation, modified after [14].

2.2. preprocessing the ppg signal

because the ani was originally developed using the ecg-derived beat-to-beat intervals, the most important step was to adapt the beat-to-beat interval computation to the ppg signal. the first step is to estimate the beat-to-beat intervals from the ppg signal. this was done using a modified version of the ppg-adapted adapit algorithm [14], with the following steps, according to figure 2:

(1.) a median filter of length 550 ms is applied to the original signal (figure 2.1) to determine the dc offset and trend of the ppg signal (figure 2.2).
(2.) the median-filtered signal is subtracted from the original signal in order to eliminate the dc offset and trend of the ppg signal (figure 2.3).
(3.) to filter out invalid peaks, two thresholds are calculated in a seven-second-long window (figure 2.3): t1 = 2 × standard deviation, t2 = 3 × standard deviation. the first estimation of peaks consists of the peaks greater than t2.
(4.) the third threshold is t3 = 0.8 × standard deviation of the resulting signal.
(5.) all peaks not within the thresholds are invalid.
(6.) a standard peak detection algorithm is used to detect the maxima of the resulting signal (figure 2.4).
(7.) in figure 2.5, the detected peaks can be seen together with the original signal.

now the distance between the peaks can be determined in order to estimate the ani, as described in section 1.2 and [7].
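the steps above can be condensed into the following sketch; the threshold cascade of steps (3.)–(5.) is simplified into a single acceptance threshold, window lengths follow the text, and everything else is an illustrative assumption rather than the authors' matlab code:

```python
import numpy as np
from scipy.ndimage import median_filter
from scipy.signal import find_peaks

def ppg_beat_intervals(ppg, fs=125.0):
    """beat-to-beat intervals from a ppg trace, loosely following the
    modified adapit steps above."""
    ppg = np.asarray(ppg, dtype=float)

    # steps 1-2: estimate and subtract the dc offset / trend with a
    # 550 ms median filter (filter length forced to be odd)
    baseline = median_filter(ppg, size=int(0.55 * fs) | 1)
    x = ppg - baseline

    # steps 3-6: per 7 s window, derive a threshold from the standard
    # deviation and detect maxima of the detrended signal
    beats = []
    w = int(7.0 * fs)
    for s in range(0, len(x) - w + 1, w):
        seg = x[s:s + w]
        t3 = 0.8 * seg.std()                       # acceptance threshold
        idx, _ = find_peaks(seg, height=t3, distance=int(0.3 * fs))
        beats.extend(s + i for i in idx)

    # step 7: peak-to-peak distances give the interval series (seconds)
    return np.diff(np.asarray(beats)) / fs
```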
2.3. preprocessing the ppgi signal

first, the ppgi signal must be extracted from the ppgi video signal. in the context of the animal trial, this was done by setting up a fixed region of interest (roi), as shown in figure 3. a fixed roi can be used because the pig is sedated and the camera is mounted in a fixed position; movement artifacts are therefore suspended. the video signal was filtered within the region of interest using a 2d median filter.

figure 3. selected example of a ppgi image.

the extracted ppgi signal has a sample rate of 15 samples per second, as this is the frame rate of the ppgi camera. for heart rate variability computation, exact determination of the beat-to-beat intervals is essential. therefore, 15 samples per second are not sufficient, and the signal is interpolated to 125 hz, which is equal to the sample rate of the measured ppg signal. a simple spline interpolation was used, because this matches the characteristics of the ppg signal much better than a linear interpolation. figure 5 shows the original extracted signal and the up-sampled signal. the interpolated ppgi signal is handled like the regular ppg signal for further processing steps.

figure 4. example of an ecg and a ppg signal with a time-delay.

figure 5. interpolated vs. non-interpolated ppgi signal.

the ppgi recordings were limited to three minutes, due to restrictions of the recording system used in the study; ani computation is not reliable in this context. in addition, the pig was ventilated with up to 40 breaths per minute, while the hrv filter is set to 0.15–0.5 hz, according to the ani specification; thus the ani would not be valid. therefore, in the following results only the rr-intervals are considered and are compared with the ecg and ppg rr-intervals.
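the 15 hz to 125 hz spline interpolation amounts to a few lines with scipy (sample rates as given in the text; the function name is illustrative):

```python
import numpy as np
from scipy.interpolate import CubicSpline

def upsample_ppgi(signal, fs_in=15.0, fs_out=125.0):
    """spline-interpolate a 15 hz ppgi trace to the 125 hz ppg rate; a
    cubic spline matches the smooth pulse shape better than linear
    interpolation, as noted above."""
    t_in = np.arange(len(signal)) / fs_in
    t_out = np.arange(0.0, t_in[-1], 1.0 / fs_out)
    return t_out, CubicSpline(t_in, signal)(t_out)
```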
the patient monitor used in this study has a sample rate of 250 hz for the ecg signal and 125 hz for the ppg signal. this is standard for currently used patient monitors. the ppg signal is much smoother than the ecg signal, see figure 3. for the lower sample frequency, the error of the peak detection is larger. additional care must therefore be taken in peak detection and in threshold adjustment of this algorithm. in addition, the 278 vol. 54 no. 4/2014 using ppg imaging for objective contactless pain assessment -1 0 1 2 0 25 50 75 1. 2. 0 2 43. 43:50 44:02 44:14 44:26 44:38 44:50 0 25 50 75 time (mm:ss) 4. figure 7. (1) ecg signal. (2) ecg-based ani. (3) ppg signal. (4) ppg-based ani. dicrotic notch can influence the peak detection of the adapit algorithm. the non-interpolated ppgi signal has a sample rate of 15 hz. this sample rate is not sufficient to determine the peaks of the ppgi wave accurately. using the interpolation method described in section 2.3, the ppgi curve is smoother and is up-sampled to a standard ppg sample rate of 125 hz. unfortunately, interpolation is only an approach and does not represent the real ppgi signal. the error is therefore larger than the error of the ppg signal. figure 6 shows the rr-intervals from the synchronized ppg and ppgi signals. the increase is caused by inaccurate determination of the rr-intervals, caused by the interpolation and the lower sampling rate. the big spikes are generated by artifacts of the peak detection algorithm. of course this deviation influences the ani, as is discussed in the following section. 3.3. comparing indices the differences mentioned above in ecgand ppgbased beat-to-beat interval calculation result in a deviation between the two anis. as shown in figure 7, there is an offset between the different ani indices, resulting in a changed distribution, see figure 8. the distribution is computed over 20 surgical interventions, using matlab software. hence, the ani is lower and if the hrv increases the ppg-ani is lower than the ecgani. however, both anis can be merged, resulting in a single more reliable index. for example, the ppg signal is free of artifacts during cautery, which however makes the ecg signal useless. ani is computed over a time of about one minute, and even a short cautery can influence ani for a minute. an analysis of 10 surgical interventions using autocorrelation shows a correlation from 0.92 to 0.98 of the beat-to-beat intervals for the time delay. the ani correlates from 0.90 to 0.97. it can be assumed that most of the deviation results from errors in peak detection and possible misdetected peaks caused by artifacts. unfortunately, the ani of the animal trial cannot be computed, because the respiration figure 8. distribution of (1) ecg-ani and (2) ppgani. the distribution is computed based on 40 hours of surgical interventions. rate was set up to 40 per minute. this exceeds the limit of the filter between 0.15–0.5 hz and produces an invalid ani index. 4. discussion and outlook ppg-based ani aims to improve the assessment of analgesia during surgical interventions. with the ppg signal, an additional source for ani computation is available, which can be chosen if the ecg signal is not available or not valid, for example during electro-surgical procedures. both indices (ecg-ani and ppg-ani) could in future be combined to one index to improve ani. ani is therefore likewise usable in postoperative care with awake patients. the ppg-based version is an alternative if no ecg is derived. 
additionally, it can be combined with other methods [3] for postoperative pain and stress analysis. further work needs to be done to adapt the scaling factors (α and β) of the ani to the offset in rr-interval variation. furthermore, histogram transformation can be used to translate the ppg-ani to the distribution of the ecg-ani, compared to the results in figure 8. the indices can be fused, for example using artifact detection algorithms or other strategies for selection 279 m. koeny, n. blanik, x. yu et al. acta polytechnica between ecg or ppg. additional information about current surgical procedures from other devices in a networked operating room [13] can make artifact detection more reliable. for example, if the system could get information that an electro surgical procedure has started, the source of ani computation can automatically be switched from ecg and ppg. this would prevent complex analysis and artifact detection. further research needs to be done on ppgi-based rr interval computation and ppgi-ani. first, the sample rate needs to be increased, and automatic determination of the region of interest should be implemented. this should be done by focusing the ppgi camera on a smaller field of view, or by setting an roi directly in the camera. then the frame-rate can be increased, leading to a higher sample-rate of the resulting ppgi signal. a higher sample rate improves the rr-interval detection. the improved ppgi system should be verified in a human trial. in addition, histogram transformation can be used to match the ppg-based ani to the ecg-ani. contactless measurement of the ppg waveform offers new opportunities. for example, during postoperative pain therapy, the ppgi method offers a very comfortable method for assessing the ani, as the patient does not have to be connected to an ecg or ppg device. in addition, the camera-based procedure enables contactless measurements on various regions of the body. for example, the ppg signal can be measured synchronously on the face and on the hand. this enables the physician to focus on different regions and consider local physiological phenomena, e.g., local vasomotion. another opportunity can be the use of ppgi in wound diagnosis [15]. for example, the ppgi camera, optionally in combination with an thermo infrared camera, can be focused on a wound. this can deliver information about local ppg correlated phenomena and the difference in thermography caused by inflammation. this information can be used for improved wound diagnosis, possibly leading to improved wound treatment. acknowledgements this work forms part of the or.net project, which is funded by the german federal ministry of education and research (bmbf). the authors would like to thank all partners for the efficient collaborative work. the human and animal trials have been approved by the local ethics committee. references [1] m. koeny, x. yu, m. czaplik, et al. computing the analgesia nociception index based on ppg signal analysis. in poster conference. 2013. [2] s. l. collins, r. a. moore, h. j. mcquay. the visual analogue pain intensity scale: what is moderate pain in millimetres? pain 72(1):95–97, 1997. [3] m. huiku, k. uutela, m. van gils, et al. assessment of surgical stress during general anaesthesia. british journal of anaesthesia 98(4):447–455, 2007. doi:10.1093/bja/aem004. [4] t. ledowski, j. bromilow, j. wu, et al. the assessment of postoperative pain by monitoring skin conductance: results of a prospective study. anaesthesia 62(10):989– 993, 2007. 
doi:10.1111/j.1365-2044.2007.05191.x.
[5] t. ledowski, j. bromilow, m. paech, et al. monitoring of skin conductance to assess postoperative pain intensity. british journal of anaesthesia 97(6):862–865, 2006. doi:10.1093/bja/ael280.
[6] m. czaplik, c. hübner, m. köny, et al. acute pain therapy in postanesthesia care unit directed by skin conductance: a randomized controlled trial. plos one 7(7):e41758, 2012. doi:10.1371/journal.pone.0041758.
[7] r. logier, m. jeanne, a. dassonneville, et al. physiodoloris: a monitoring device for analgesia/nociception balance evaluation using heart rate variability analysis. in engineering in medicine and biology society (embc), 2010 annual international conference of the ieee, pp. 1194–1197. ieee, 2010. doi:10.1109/iembs.2010.5625971.
[8] m. malik, j. bigger, a. camm, et al. heart rate variability. standards of measurement, physiological interpretation, and clinical use, task force of the european society of cardiology and the north american society of pacing and electrophysiology. circulation 93(5):1043–1065, 1996. doi:10.1161/01.cir.93.5.1043.
[9] s. akselrod, d. gordon, j. madwed, et al. hemodynamic regulation: investigation by spectral analysis. american journal of physiology-heart and circulatory physiology 249(4):h867–h875, 1985.
[10] r. logier, j. de jonckheere, a. dassonneville. an efficient algorithm for rr intervals series filtering. in engineering in medicine and biology society, 2004. iembs'04. 26th annual international conference of the ieee, vol. 2, pp. 3937–3940. ieee, 2004. doi:10.1109/iembs.2004.1404100.
[11] v. blazek, u. schultz-ehrenburg. frontiers in computer-aided visualization of vascular functions. in fortschrittberichte vdi, pp. 117–124. vdi verlag duesseldorf, 1998.
[12] t. wu, v. blazek, h. j. schmitt. photoplethysmography imaging: a new noninvasive and noncontact method for mapping of the dermal perfusion changes. in eos/spie european biomedical optics week, pp. 62–70. international society for optics and photonics, 2000. doi:10.1117/12.407646.
[13] m. koeny, j. benzko, m. czaplik, et al. getting anesthesia online: the smartor network. international journal on advances in internet technology 5(3 and 4):114–125, 2012.
[14] c. yu, z. liu, t. mckenna, et al. a method for automatic identification of reliable heart rates calculated from ecg and ppg waveforms. journal of the american medical informatics association 13(3):309–320, 2006. doi:10.1197/jamia.m1925.
[15] u. schultz-ehrenburg, v. blazek. value of quantitative photoplethysmography for functional vascular diagnostics. skin pharmacol appl skin physiol 14(5):316–323, 2001.
acta polytechnica 54(2):142–148, 2014, doi:10.14311/ap.2014.54.0142

a simple derivation of finite-temperature cft correlators from the btz black hole

satoshi ohya (a, b); a: department of physics, faculty of nuclear sciences and physical engineering, czech technical university in prague, pohraniční 1288/1, 40501 děčín, czech republic; b: doppler institute for mathematical physics and applied mathematics, czech technical university in prague, břehová 7, 11519 prague, czech republic; correspondence: ohyasato@fjfi.cvut.cz

abstract. we present a simple lie-algebraic approach to momentum-space two-point functions of two-dimensional conformal field theory at finite temperature dual to the btz black hole. making use of the real-time prescription of ads/cft correspondence and ladder equations of the lie algebra so(2, 2) ∼= sl(2,r)l ⊕ sl(2,r)r, we show that the finite-temperature two-point functions in momentum space satisfy linear recurrence relations with respect to the left and right momenta. these recurrence relations are exactly solvable and completely determine the momentum-dependence of retarded and advanced two-point functions of finite-temperature conformal field theory.

keywords: ads/cft correspondence, correlation functions, btz black hole.

1. introduction and summary

conformal symmetry is powerful enough to constrain the possible forms of correlation functions in quantum field theory. it has been long appreciated that, for scalar (quasi-)primary operators, for example, so(2, d) conformal symmetry completely fixes the possible forms of two- and three-point functions up to an overall normalization factor in any spacetime dimension d ≥ 1. this symmetry constraint works well in position space; however, its direct implication for momentum-space correlators is less obvious before performing the fourier transform. since momentum-space correlators are directly related to physical observables (e.g. the imaginary part of a retarded two-point function in momentum space gives the spectral density of many-body systems), it is important to understand how directly conformal symmetry constrains the possible forms of momentum-space correlators. from the practical computational viewpoint, it is also important to develop efficient methods for computing momentum-space correlators directly through symmetry considerations, because fourier transforms of position-space correlators are hard in general.
in this short paper we continue our investigation [1] and present a novel lie-algebraic approach to momentum-space two-point functions of conformal field theory at finite temperature by using the ads/cft correspondence. the ads/cft correspondence relates strongly-coupled conformal field theory to classical gravity in one higher spatial dimension. according to the correspondence, finite-temperature conformal field theory is dual to an asymptotically ads spacetime that contains black holes. in this paper we focus on two-dimensional conformal field theory (cft2) at finite temperature dual to the three-dimensional anti-de sitter (ads3) black hole (i.e. the bañados-teitelboim-zanelli (btz) black hole [2, 3]), and we give a simple derivation of retarded and advanced two-point functions of scalar operators of the dual cft2 by just using the real-time prescription of ads/cft correspondence à la iqbal and liu [4, 5] and the ladder equations of the lie algebra so(2, 2) ∼= sl(2,r)l ⊕ sl(2,r)r of the isometry group so(2, 2) ∼= (sl(2,r)l × sl(2,r)r)/z2 of ads3. in contrast to the conventional approaches to momentum-space cft correlators (such as fourier transforming position-space correlators, or the original real-time ads/cft prescription [4–6], which requires the bulk field equations to be solved explicitly), our lie-algebraic method is quite simple and clarifies the role of conformal symmetry in momentum-space correlators in a direct way: for finite-temperature two-point functions in momentum space, conformal symmetry manifests itself in the form of recurrence relations which are exactly solvable and, up to an overall normalization factor, completely determine the momentum dependence of two-point functions.

the rest of the paper is organized as follows. in section 2 we briefly review the ads3 black hole based on the quotient construction [3, 7]: the ads3 black hole is a locally ads3 spacetime and is given by a quotient space of ads3 with an identification of points under the action of a discrete subgroup z of the isometry group so(2, 2) of ads3. though not so widely appreciated, the ads3 black hole is a quotient space of ads3 with a particular coordinate patch in which both the time- and angle-translation generators generate the one-parameter subgroup so(1, 1) ⊂ so(2, 2). (this is true for a non-extremal black hole with positive mass. the time- and angle-translation generators generate other one-parameter subgroups for the zero-mass limit of a black hole (or a black hole vacuum), the extremal black hole, and the negative-mass black hole (or black hole with naked singularity). for example, in the case of a black hole vacuum, the time- and angle-translation generators generate the subgroup e(1) × e(1) ⊂ so(2, 2) ∼= (sl(2,r)l × sl(2,r)r)/z2 prior to making the z-identification. note that sl(2,r) contains three distinct one-parameter subgroups: the rotation group so(2), the lorentz group so(1, 1) and the euclidean group e(1). for detailed discussions of the quotient construction, we refer to the original paper [3]; see also [7].) in section 3 we introduce a coordinate realization of the lie algebra so(2, 2) ∼= sl(2,r)l ⊕ sl(2,r)r realized in scalar field theory on the ads3 black hole background. we then demonstrate in section 4 a simple lie-algebraic method for computing the retarded and advanced cft2 two-point functions by just using the ladder equations of the lie algebra so(2, 2) ∼= sl(2,r)l ⊕ sl(2,r)r in the basis in which the so(1, 1) × so(1, 1) ⊂ so(2, 2) generators become diagonal. we will see that our method correctly reproduces the known results [5, 8, 9].
2. ads3 black hole: locally ads3 spacetime in the so(1, 1) × so(1, 1) diagonal basis

let us start with the following non-rotating btz black hole described by the metric
$$ds^2_{ads_3\,\mathrm{bh}} = -\left(\frac{\rho^2}{r^2} - 1\right)d\tau^2 + \frac{d\rho^2}{\rho^2/r^2 - 1} + \rho^2\, d\theta^2, \qquad (1)$$
where τ ∈ (−∞, +∞), ρ ∈ (0, ∞), θ ∈ [0, 2π), and r > 0 is the ads3 radius. in this paper we simply call (1) the ads3 black hole and focus on the region outside the horizon ρ > r. for the following discussions it is convenient to introduce a new spatial coordinate x as follows:
$$\rho = r\coth(x/r), \qquad (2)$$
where x ranges from 0 to ∞. note that the black hole horizon ρ = r corresponds to x = ∞, while the ads3 boundary ρ = ∞ corresponds to x = 0. a straightforward calculation shows that in the coordinate system (τ, x, θ) the black hole metric (1) takes the following form:
$$ds^2_{ads_3\,\mathrm{bh}} = \frac{-d\tau^2 + dx^2 + r^2\cosh^2(x/r)\,d\theta^2}{\sinh^2(x/r)}. \qquad (3)$$
for the sake of notational brevity, we will hereafter work in units in which r = 1. several comments are in order:

(1.) btz black hole. the above ads3 black hole (1) is locally isometric to the rotating btz black hole [2, 3] and is obtained by a suitable change of spacetime coordinates. indeed, it is easy to show that the btz black hole metric
$$ds^2_{btz} = -\frac{(r^2 - r_+^2)(r^2 - r_-^2)}{r^2}\,dt^2 + \frac{r^2\,dr^2}{(r^2 - r_+^2)(r^2 - r_-^2)} + r^2\left(d\phi - \frac{r_+ r_-}{r^2}\,dt\right)^2, \qquad (4)$$
where r_+ and r_− are the outer and inner horizons, respectively, is reduced to the ads3 black hole (1) by the following coordinate change [10]:
$$\rho = \sqrt{\frac{r^2 - r_-^2}{r_+^2 - r_-^2}}, \qquad \tau = r_+ t - r_-\phi, \qquad \theta = r_+\phi - r_- t. \qquad (5)$$
note that the light-cone coordinates satisfy the relations τ ± θ = (r_+ ∓ r_−)(t ± φ).

(2.) local coordinate patch of ads3. the ads3 black hole is a locally ads3 spacetime and is obtained from the ads3 spacetime with a suitable periodic identification [3, 7]. to see this, let us first note that the ads3 spacetime can be embedded into the four-dimensional ambient space r^{2,2} and defined as the following hypersurface with constant negative curvature −1 (= −1/r²):
$$ads_3 = \left\{(x^{-1}, x^0, x^1, x^2) \in \mathbb{R}^{2,2} \;\middle|\; -(x^{-1})^2 - (x^0)^2 + (x^1)^2 + (x^2)^2 = -1\right\}. \qquad (6)$$
the ads3 black hole (3) is given by the following local coordinate patch of the hypersurface:
$$(x^{-1}, x^0, x^1, x^2) = \left(\coth x\cosh\theta,\; \frac{\sinh\tau}{\sinh x},\; \coth x\sinh\theta,\; \frac{\cosh\tau}{\sinh x}\right), \qquad (7)$$
with the periodic identification θ ∼ θ + 2nπ (n ∈ z). in fact, it is straightforward to show that the induced metric ds²_{ads3} = −(dx^{−1})² − (dx^0)² + (dx^1)² + (dx^2)², restricted to the hypersurface, takes the following form:
$$ds^2_{ads_3} = \frac{-d\tau^2 + dx^2 + \cosh^2 x\, d\theta^2}{\sinh^2 x}. \qquad (8)$$
it should be emphasized that the periodic identification θ ∼ θ + 2nπ makes the metric (8) a black hole. as mentioned in [3], without such identification the metric (8) just describes a portion of ads3, and the horizon is just that of an accelerated observer. (roughly speaking, (7) is the ads3 counterpart of the rindler coordinate patch of minkowski spacetime.)

(3.) so(1, 1) × so(1, 1) global symmetry. as is well known, the ads3 spacetime (6) has an alternative, equivalent description as the sl(2,r) group manifold defined as follows:
$$ads_3 = \left\{X = \begin{pmatrix} x^{-1}+x^2 & x^1-x^0 \\ x^1+x^0 & x^{-1}-x^2 \end{pmatrix} \;\middle|\; \det X = 1\right\}. \qquad (9)$$
with this definition it is obvious that the ads3 spacetime (9) is invariant under left- and right-multiplications of sl(2,r) matrices, x ↦ x′ = g_l x g_r, where g_l ∈ sl(2,r)_l and g_r ∈ sl(2,r)_r, with the z2-identification (g_l, g_r) ∼ (−g_l, −g_r). (note that (g_l, g_r) and (−g_l, −g_r) give the same x′.) in the local coordinate patch (7) the 2 × 2 matrix x takes the following form:
$$X = \begin{pmatrix} \coth x\cosh\theta + \dfrac{\cosh\tau}{\sinh x} & \coth x\sinh\theta - \dfrac{\sinh\tau}{\sinh x} \\[2mm] \coth x\sinh\theta + \dfrac{\sinh\tau}{\sinh x} & \coth x\cosh\theta - \dfrac{\cosh\tau}{\sinh x} \end{pmatrix}. \qquad (10)$$
now it is easy to see that the time-translation τ ↦ τ′ = τ + ε is induced by the noncompact so(1, 1) ⊂ so(2, 2) group action x ↦ x′ = g_l x g_r given by the matrices
$$g_L = g_R^{-1} = \begin{pmatrix} \cosh\frac{\epsilon}{2} & \sinh\frac{\epsilon}{2} \\ \sinh\frac{\epsilon}{2} & \cosh\frac{\epsilon}{2} \end{pmatrix} \in so(1,1). \qquad (11)$$
likewise, the spatial translation θ ↦ θ′ = θ + ε is induced by another so(1, 1) ⊂ so(2, 2) group action x ↦ x′ = g_l x g_r given by the matrices
$$g_L = g_R = \begin{pmatrix} \cosh\frac{\epsilon}{2} & \sinh\frac{\epsilon}{2} \\ \sinh\frac{\epsilon}{2} & \cosh\frac{\epsilon}{2} \end{pmatrix} \in so(1,1). \qquad (12)$$
hence, prior to making the periodic identification θ ∼ θ + 2nπ, the time- and spatial-translation generators i∂τ and −i∂θ must be given by two distinct so(1, 1) generators of the lie group so(2, 2) ∼= (sl(2,r)l × sl(2,r)r)/z2. after the periodic identification θ ∼ θ + 2nπ, on the other hand, g_l and g_r in eq. (12) should be regarded as elements of the coset so(1, 1)/z, where the z-identification is defined by ε ∼ ε + 2nπ (n ∈ z). (note that the parameter space of so(1, 1), more precisely of so+(1, 1), i.e. the connected component of the identity element, is the whole line r; hence the parameter space of so(1, 1)/z is r/z, which is isomorphic to a circle s1.) hence, the ads3 black hole is given by the quotient space ads3/z, where the identification subgroup z = {g^n | n = 0, ±1, ±2, . . .} ⊂ so(2, 2) is generated by the matrix $g = g_L = g_R = \begin{pmatrix} \cosh\pi & \sinh\pi \\ \sinh\pi & \cosh\pi \end{pmatrix}$. it should be emphasized here that the fact that the time-translation generator generates the noncompact lorentz group so(1, 1) is a manifestation of the thermodynamic aspects of a black hole: if we work in euclidean signature, the noncompact lorentz group so(1, 1) becomes the compact rotation group so(2) ∼= s1, so that the frequencies conjugate to the imaginary time are quantized and hence give rise to the matsubara frequencies.
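the group actions (11) and (12) can be verified symbolically; the following sympy sketch (an illustrative check, not part of the original paper) confirms that the boosts indeed shift τ and θ by ε while leaving the other coordinates untouched:

```python
import sympy as sp

tau, x, theta, eps = sp.symbols('tau x theta epsilon', real=True)

def X(t, th):
    """the matrix (10), built from the coordinate patch (7)."""
    return sp.Matrix([
        [sp.coth(x)*sp.cosh(th) + sp.cosh(t)/sp.sinh(x),
         sp.coth(x)*sp.sinh(th) - sp.sinh(t)/sp.sinh(x)],
        [sp.coth(x)*sp.sinh(th) + sp.sinh(t)/sp.sinh(x),
         sp.coth(x)*sp.cosh(th) - sp.cosh(t)/sp.sinh(x)]])

def boost(a):
    """one-parameter so(1,1) subgroup; note boost(a)**(-1) == boost(-a)."""
    return sp.Matrix([[sp.cosh(a), sp.sinh(a)],
                      [sp.sinh(a), sp.cosh(a)]])

# rewrite hyperbolics as exponentials so cancellations are exact
zero = lambda m: m.applyfunc(lambda e: sp.simplify(sp.expand(e.rewrite(sp.exp))))

# (11): g_L = g_R**(-1) = boost(eps/2) induces tau -> tau + eps
print(zero(boost(eps/2) * X(tau, theta) * boost(-eps/2) - X(tau + eps, theta)))

# (12): g_L = g_R = boost(eps/2) induces theta -> theta + eps
print(zero(boost(eps/2) * X(tau, theta) * boost(eps/2) - X(tau, theta + eps)))
# both print the zero matrix
```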
(4.) Two-point function. As we have seen, the AdS3 black hole (1) is a locally AdS3 spacetime, but its global structure is quite different from that of AdS3. This global difference of course leads to a big difference between the structure of two-point functions of the CFT2 living on the boundary of the AdS3 black hole and those of the CFT2 living on the boundary of AdS3 [10]. To see this, let $G_{\mathrm{AdS}_3\,\mathrm{BH}}(\tau,\theta)$ be a scalar two-point function of the CFT2 dual to the AdS3 black hole, and let $G_{\mathrm{AdS}_3}(\tau,\theta)$ be a scalar two-point function of the CFT2 living on the AdS3 boundary without periodic identification. Then, once we get $G_{\mathrm{AdS}_3}(\tau,\theta)$, the scalar two-point function of the CFT2 dual to the AdS3 black hole is given by the coset construction (or the method of images [10]):
$$G_{\mathrm{AdS}_3\,\mathrm{BH}}(\tau,\theta) = \sum_{n\in\mathbb{Z}} \rho(n)\,G_{\mathrm{AdS}_3}(\tau,\theta+2n\pi), \quad (13)$$
where $\rho: \mathbb{Z} \to U(1)$ is a scalar (i.e. one-dimensional) unitary representation of the identification subgroup $\mathbb{Z}$, given by $\rho(n) = e^{in\alpha}$. Here $\alpha$ is a real parameter whose value depends on the model. For example, for a scalar operator $O(\tau,\theta)$ that satisfies the periodic boundary condition $O(\tau,\theta+2\pi) = O(\tau,\theta)$, $\alpha$ is zero (i.e. $\rho$ is the trivial representation). (Basically, $\alpha$ is a boundary condition parameter for $O(\tau,\theta)$ with respect to the angle $\theta$.) We emphasize that, regardless of the value of $\alpha$, the two-point function (13) thus constructed indeed satisfies the periodic boundary condition $G_{\mathrm{AdS}_3\,\mathrm{BH}}(\tau,\theta+2\pi) = G_{\mathrm{AdS}_3\,\mathrm{BH}}(\tau,\theta)$. For simplicity, throughout this paper we will focus on $G_{\mathrm{AdS}_3}$ (i.e. the zero-winding sector of $G_{\mathrm{AdS}_3\,\mathrm{BH}}$), because $G_{\mathrm{AdS}_3\,\mathrm{BH}}$ can be constructed from the knowledge of $G_{\mathrm{AdS}_3}$. Hence, in what follows we do not need to worry about the subtleties of periodic identification and the global difference between the AdS3 black hole and the AdS3 spacetime. (Footnote: actually, the momentum-space two-point functions computed in refs. [5, 8, 9] are nothing but the momentum-space representation of $G_{\mathrm{AdS}_3}$ rather than $G_{\mathrm{AdS}_3\,\mathrm{BH}}$ (or $G_{\mathrm{BTZ}}$).)

Let us next consider a massive scalar field $\phi$ of mass $m$ on the background spacetime (8) (without periodic identification) that satisfies the Klein-Gordon equation $(\Box_{\mathrm{AdS}_3} - m^2)\phi = 0$, where the d'Alembertian is given by
$$\Box_{\mathrm{AdS}_3} = \sinh^2 x\Bigl[-\partial_\tau^2 + \partial_x^2 - \frac{1}{\sinh x\cosh x}\partial_x + \frac{\partial_\theta^2}{\cosh^2 x}\Bigr].$$
In order to get CFT two-point functions via the real-time AdS/CFT prescription, we need to find a solution to the Klein-Gordon equation whose $\tau$- and $\theta$-dependences are given by plane waves, $\phi(\tau,x,\theta) = \phi_{\omega,k}(x)\,e^{-i\omega\tau+ik\theta}$; that is, we need to know a simultaneous eigenfunction of the d'Alembertian $\Box_{\mathrm{AdS}_3}$, the time-translation generator $i\partial_\tau$ and the spatial-translation generator $-i\partial_\theta$. For such a simultaneous eigenfunction the Klein-Gordon equation reduces to the following differential equation:
$$\Bigl(-\partial_x^2 + \frac{1}{\sinh x\cosh x}\partial_x + \frac{\Delta(\Delta-2)}{\sinh^2 x} + \frac{k^2}{\cosh^2 x}\Bigr)\phi_{\omega,k} = \omega^2\phi_{\omega,k}, \quad (14)$$
where $\Delta = 1 + \sqrt{m^2+1}$ is one of the solutions to the quadratic equation $\Delta(\Delta-2) = m^2$. (Footnote: redefining the field as $\phi \mapsto \tilde\phi = (\coth x)^{1/2}\phi$, one sees that the differential equation (14) reduces to the Schrödinger equation with hyperbolic Pöschl-Teller potential
$$\Bigl(-\partial_x^2 + \frac{(\Delta-1)^2 - 1/4}{\sinh^2 x} + \frac{k^2 + 1/4}{\cosh^2 x}\Bigr)\tilde\phi_{\omega,k} = \omega^2\tilde\phi_{\omega,k}.)$$
Note that near the AdS3 boundary $x = 0$ the differential operator on the left-hand side of (14) behaves as $-\partial_x^2 + \frac{1}{x}\partial_x + \frac{\Delta(\Delta-2)}{x^2} + O(1)$. Hence the general solution has the following asymptotic near-boundary behavior:
$$\phi(\tau,x,\theta) \sim A_\Delta(\omega,k)\,x^\Delta\,e^{-i\omega\tau+ik\theta} + B_\Delta(\omega,k)\,x^{2-\Delta}\,e^{-i\omega\tau+ik\theta} \quad \text{as } x\to0, \quad (15)$$
where $A_\Delta(\omega,k)$ and $B_\Delta(\omega,k)$ are integration constants which may depend on $\Delta$, $\omega$ and $k$. The real-time prescription of the AdS/CFT correspondence tells us that the retarded and advanced two-point functions are given by the ratio [4, 5]
$$G^{R/A}_\Delta(\omega,k) = (2\Delta-2)\,\frac{A_\Delta(\omega,k)}{B_\Delta(\omega,k)}, \quad (16)$$
where the retarded two-point function $G^R_\Delta$ is obtained from the solution that satisfies the in-falling boundary condition at the horizon, whereas the advanced two-point function $G^A_\Delta$ is obtained from the solution that satisfies the out-going boundary condition at the horizon [6]. The goal of this paper is to compute the ratio (16) in a Lie-algebraic fashion, without solving the Klein-Gordon equation explicitly.
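The field redefinition quoted in the footnote can be checked symbolically. The following SymPy sketch (an illustrative check under the substitution stated above, not code from the paper) inserts $\phi = (\tanh x)^{1/2}\tilde\phi$ into the left-hand side of (14) and compares the result with the Pöschl-Teller form:

```python
import sympy as sp

x = sp.symbols('x', positive=True)
Delta, k = sp.symbols('Delta k', real=True)
u = sp.Function('u')  # stands for the redefined field phi-tilde

phi = sp.sqrt(sp.tanh(x))*u(x)  # phi = (coth x)^(-1/2) * phi-tilde

# Left-hand side of the radial equation (14) acting on phi
L = (-sp.diff(phi, x, 2) + sp.diff(phi, x)/(sp.sinh(x)*sp.cosh(x))
     + (Delta*(Delta - 2)/sp.sinh(x)**2 + k**2/sp.cosh(x)**2)*phi)

# Expected Poeschl-Teller operator acting on phi-tilde (see the footnote)
PT = (-sp.diff(u(x), x, 2)
      + ((Delta - 1)**2 - sp.Rational(1, 4))/sp.sinh(x)**2*u(x)
      + (k**2 + sp.Rational(1, 4))/sp.cosh(x)**2*u(x))

residue = sp.simplify(sp.expand((L/sp.sqrt(sp.tanh(x)) - PT).rewrite(sp.exp)))
assert residue == 0
print("(14) reduces to the hyperbolic Poeschl-Teller equation")
```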
3. Lie algebra $sl(2,\mathbb{R})_L \oplus sl(2,\mathbb{R})_R$ in the SO(1,1) × SO(1,1) diagonal basis

In order to get the momentum-space two-point functions, we need to find a simultaneous eigenfunction of the d'Alembertian $\Box_{\mathrm{AdS}_3}$, the time-translation generator $i\partial_\tau$ and the spatial-translation generator $-i\partial_\theta$. As we will see below, the d'Alembertian is given by the quadratic Casimir of the Lie algebra $so(2,2) \cong sl(2,\mathbb{R})_L \oplus sl(2,\mathbb{R})_R$. On the other hand, as we have seen in the previous section, the time- and spatial translations are induced by two distinct noncompact $SO(1,1)$ group actions, such that $i\partial_\tau$ and $-i\partial_\theta$ must be given by $SO(1,1)$ generators of the Lie algebra $so(2,2) \cong sl(2,\mathbb{R})_L \oplus sl(2,\mathbb{R})_R$. Hence we need to work in the basis in which the noncompact $SO(1,1)$ generators become diagonal. We note that unitary representations of the Lie algebra $sl(2,\mathbb{R})$ in the noncompact $SO(1,1)$ basis have been studied in the mathematical literature [11, 12], and are known to be a bit complicated. In this paper we will not touch upon these mathematical subtleties, and will not discuss which of the unitary representations are realized in the scalar field theory on the background (8) (without periodic identification). Instead, we will present a rather heuristic argument that reproduces the known results by just using the ladder equations of the Lie algebra $so(2,2)$ in the $SO(1,1)\times SO(1,1)$ diagonal basis. (Footnote: one way to avoid these subtleties is to Wick-rotate both the time $\tau$ and the angle $\theta$. In such Euclidean-like signature, the noncompact $SO(1,1)\times SO(1,1)$ symmetry becomes the compact $SO(2)\times SO(2)$ symmetry, such that we can use standard unitary representations of the Lie algebra $so(2,2) \cong sl(2,\mathbb{R})_L\oplus sl(2,\mathbb{R})_R$ in the $SO(2)\times SO(2)$ diagonal basis. In this approach, computations of momentum-space two-point functions are essentially reduced to those presented in ref. [1].)

To begin with, let us first recall the Lie algebra $so(2,2) \cong sl(2,\mathbb{R})_L \oplus sl(2,\mathbb{R})_R$, which is spanned by six self-adjoint generators $\{A_0,A_1,A_2,B_0,B_1,B_2\}$ that satisfy the following commutation relations:
$$[A_0,A_1] = iA_2, \quad [A_1,A_2] = -iA_0, \quad [A_2,A_0] = iA_1, \quad (17a)$$
$$[B_0,B_1] = iB_2, \quad [B_1,B_2] = -iB_0, \quad [B_2,B_0] = iB_1, \quad (17b)$$
with the other commutators vanishing, $[A_a,B_b] = 0$ $(a,b = 0,1,2)$. We note that $A_0$ and $B_0$ are compact $SO(2)$ generators, whereas $A_1$, $A_2$, $B_1$, $B_2$ are noncompact $SO(1,1)$ generators. Note also that the standard classification of unitary representations of the Lie algebra $so(2,2)$ is based on the Cartan-Weyl basis $\{A_0, A_1\pm iA_2, B_0, B_1\pm iB_2\}$, where $A_1\pm iA_2$ and $B_1\pm iB_2$ play the role of ladder operators that raise and lower the eigenvalues of $A_0$ and $B_0$ by $\pm1$. For the following discussions, however, it is convenient to introduce the Hermitian linear combinations $A_\pm = A_2 \mp A_0$ and $B_\pm = B_2 \mp B_0$, which also play the role of "ladder" operators; see the next section. In the basis $\{A_1,A_\pm,B_1,B_\pm\}$ the commutation relations (17a) and (17b) are cast into the following forms:
$$[A_1,A_\pm] = \pm iA_\pm, \quad [A_+,A_-] = 2iA_1, \quad (18a)$$
$$[B_1,B_\pm] = \pm iB_\pm, \quad [B_+,B_-] = 2iB_1. \quad (18b)$$
In the problem of scalar field theory on the background (8) (without periodic identification), these symmetry generators turn out to be given by the following first-order differential operators:
$$A_1 = \frac{i}{2}(\partial_\tau + \partial_\theta), \quad (19a)$$
$$A_\pm = -\frac{i}{2}e^{\pm(\tau+\theta)}\Bigl[\sinh x\,\partial_x \pm \Bigl(\cosh x\,\partial_\tau + \frac{1}{\cosh x}\partial_\theta\Bigr)\Bigr], \quad (19b)$$
$$B_1 = \frac{i}{2}(\partial_\tau - \partial_\theta), \quad (19c)$$
$$B_\pm = +\frac{i}{2}e^{\pm(\tau-\theta)}\Bigl[\sinh x\,\partial_x \pm \Bigl(\cosh x\,\partial_\tau - \frac{1}{\cosh x}\partial_\theta\Bigr)\Bigr], \quad (19d)$$
which indeed satisfy the commutation relations (18a) and (18b).
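That the operators (19a)-(19d) close into the algebra (18) can be verified by brute force. The sketch below (illustrative SymPy code, assuming nothing beyond the definitions above) applies the commutators to a generic test function:

```python
import sympy as sp

tau, x, theta = sp.symbols('tau x theta', real=True, positive=True)
f = sp.Function('f')(tau, x, theta)   # generic test function
I = sp.I

def A1(g):
    return I/2*(sp.diff(g, tau) + sp.diff(g, theta))     # eq. (19a)

def Apm(s):
    """A_+ for s = +1, A_- for s = -1; see eq. (19b)."""
    def op(g):
        return -I/2*sp.exp(s*(tau + theta))*(
            sp.sinh(x)*sp.diff(g, x)
            + s*(sp.cosh(x)*sp.diff(g, tau) + sp.diff(g, theta)/sp.cosh(x)))
    return op

Ap, Am = Apm(+1), Apm(-1)

def comm(P, Q, g):
    return P(Q(g)) - Q(P(g))

# eq. (18a): [A1, A+-] = +-i A+-  and  [A+, A-] = 2i A1
assert sp.simplify(comm(A1, Ap, f) - I*Ap(f)) == 0
assert sp.simplify(comm(A1, Am, f) + I*Am(f)) == 0
assert sp.simplify(comm(Ap, Am, f) - 2*I*A1(f)) == 0
print("the realization (19) satisfies the sl(2,R) relations (18a)")
```

The $B$-sector check of (18b) is the same computation with $\theta \to -\theta$ and an overall sign flip of $B_\pm$, which does not affect the relations.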
The quadratic Casimir of the Lie algebra $so(2,2) \cong sl(2,\mathbb{R})_L\oplus sl(2,\mathbb{R})_R$ yields the d'Alembertian on the AdS3 black hole:
$$C_2(so(2,2)) = 2C_2(sl(2,\mathbb{R})_L) + 2C_2(sl(2,\mathbb{R})_R) = \sinh^2 x\Bigl(-\partial_\tau^2 + \partial_x^2 - \frac{1}{\sinh x\cosh x}\partial_x + \frac{\partial_\theta^2}{\cosh^2 x}\Bigr), \quad (20)$$
where the quadratic Casimir of each $sl(2,\mathbb{R})$ is given by $C_2(sl(2,\mathbb{R})_L) = (A_0)^2 - (A_1)^2 - (A_2)^2 = -A_1(A_1\pm i) - A_\mp A_\pm$ and $C_2(sl(2,\mathbb{R})_R) = (B_0)^2 - (B_1)^2 - (B_2)^2 = -B_1(B_1\pm i) - B_\mp B_\pm$. A straightforward calculation shows that $C_2(sl(2,\mathbb{R})_L)$ and $C_2(sl(2,\mathbb{R})_R)$ coincide and are given by
$$C_2(sl(2,\mathbb{R})_L) = C_2(sl(2,\mathbb{R})_R) = \frac{1}{4}\,C_2(so(2,2)). \quad (21)$$

Asymptotic near-boundary algebra. We are interested in the asymptotic near-boundary behavior of the solution to the Klein-Gordon equation (14). To analyze this, let us introduce the boundary symmetry generators defined as the limit $x\to0$ of (19a)-(19d):
$$A^0_1 := \lim_{x\to0}A_1 = \frac{i}{2}(\partial_\tau + \partial_\theta), \quad (22a)$$
$$A^0_\pm := \lim_{x\to0}A_\pm = -\frac{i}{2}e^{\pm(\tau+\theta)}\bigl[x\partial_x \pm (\partial_\tau + \partial_\theta)\bigr], \quad (22b)$$
$$B^0_1 := \lim_{x\to0}B_1 = \frac{i}{2}(\partial_\tau - \partial_\theta), \quad (22c)$$
$$B^0_\pm := \lim_{x\to0}B_\pm = +\frac{i}{2}e^{\pm(\tau-\theta)}\bigl[x\partial_x \pm (\partial_\tau - \partial_\theta)\bigr], \quad (22d)$$
which still satisfy the commutation relations of the Lie algebra $so(2,2) \cong sl(2,\mathbb{R})_L\oplus sl(2,\mathbb{R})_R$:
$$[A^0_1,A^0_\pm] = \pm iA^0_\pm, \quad [A^0_+,A^0_-] = 2iA^0_1, \quad (23a)$$
$$[B^0_1,B^0_\pm] = \pm iB^0_\pm, \quad [B^0_+,B^0_-] = 2iB^0_1. \quad (23b)$$
The quadratic Casimir of this asymptotic near-boundary algebra, which we denote by $so(2,2)^0 \cong sl(2,\mathbb{R})^0_L\oplus sl(2,\mathbb{R})^0_R$, takes the following simple form:
$$C_2(so(2,2)^0) = 4C_2(sl(2,\mathbb{R})^0_L) = 4C_2(sl(2,\mathbb{R})^0_R) = x^2\partial_x^2 - x\partial_x =: C^0_2. \quad (24)$$
As we have repeatedly emphasized, we are interested in simultaneous eigenstates of the d'Alembertian $\Box_{\mathrm{AdS}_3} \to C^0_2$ (as $x\to0$), the time-translation generator $i\partial_\tau = B^0_1 + A^0_1$ and the spatial-translation generator $-i\partial_\theta = B^0_1 - A^0_1$. Let $|\Delta,k_L,k_R\rangle^0$ be a simultaneous eigenstate of $C^0_2$, $A^0_1$ and $B^0_1$ that satisfies the following eigenvalue equations:
$$C^0_2|\Delta,k_L,k_R\rangle^0 = \Delta(\Delta-2)|\Delta,k_L,k_R\rangle^0, \quad (25a)$$
$$A^0_1|\Delta,k_L,k_R\rangle^0 = k_L|\Delta,k_L,k_R\rangle^0, \quad (25b)$$
$$B^0_1|\Delta,k_L,k_R\rangle^0 = k_R|\Delta,k_L,k_R\rangle^0. \quad (25c)$$
In the coordinate realization these eigenvalue equations become the following differential equations:
$$\Bigl(-\partial_x^2 + \frac{1}{x}\partial_x + \frac{\Delta(\Delta-2)}{x^2}\Bigr)\phi^0_{\Delta,k_L,k_R} = 0, \quad (26a)$$
$$(i\partial_{x_L} - k_L)\,\phi^0_{\Delta,k_L,k_R} = 0, \quad (26b)$$
$$(i\partial_{x_R} - k_R)\,\phi^0_{\Delta,k_L,k_R} = 0, \quad (26c)$$
where $x_L$ and $x_R$ are light-cone coordinates given by $x_L = \tau+\theta$ and $x_R = \tau-\theta$, and $(k_L,k_R)$ and $(\omega,k)$ are related by $k_L = (\omega-k)/2$ and $k_R = (\omega+k)/2$. These differential equations are easily solved, with the result
$$\phi^0_{\Delta,k_L,k_R}(\tau,x,\theta) = A_\Delta(k_L,k_R)\,x^\Delta\,e^{-ik_Lx_L}e^{-ik_Rx_R} + B_\Delta(k_L,k_R)\,x^{2-\Delta}\,e^{-ik_Lx_L}e^{-ik_Rx_R}, \quad (27)$$
which precisely coincides with the asymptotic near-boundary behavior (15) of the solution.
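As a quick consistency check (an illustrative snippet, not part of the paper), one can verify that both powers appearing in (27) are annihilated by the operator in (26a):

```python
import sympy as sp

x, Delta, a, b = sp.symbols('x Delta a b', positive=True)

# General near-boundary solution (27), keeping only the x-dependence
phi0 = a*x**Delta + b*x**(2 - Delta)

# Operator of eq. (26a) acting on phi0 must vanish identically
residue = -sp.diff(phi0, x, 2) + sp.diff(phi0, x)/x + Delta*(Delta - 2)/x**2*phi0
assert sp.simplify(residue) == 0
print("x^Delta and x^(2-Delta) both solve (26a)")
```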
4. Recurrence relations for finite-temperature CFT2 two-point functions

As mentioned in the previous section, $A_\pm$ and $B_\pm$ (and also $A^0_\pm$ and $B^0_\pm$) play the role of "ladder" operators. To see this, let us consider the states $A^0_\pm|\Delta,k_L,k_R\rangle^0$ and $B^0_\pm|\Delta,k_L,k_R\rangle^0$. The commutation relations $[A^0_1,A^0_\pm] = \pm iA^0_\pm$ and $[B^0_1,B^0_\pm] = \pm iB^0_\pm$ give $A^0_1A^0_\pm|\Delta,k_L,k_R\rangle^0 = (k_L\pm i)A^0_\pm|\Delta,k_L,k_R\rangle^0$ and $B^0_1B^0_\pm|\Delta,k_L,k_R\rangle^0 = (k_R\pm i)B^0_\pm|\Delta,k_L,k_R\rangle^0$, which imply that $A^0_\pm$ and $B^0_\pm$ raise and lower the eigenvalues $k_L$ and $k_R$ by $\pm i$:
$$A^0_\pm|\Delta,k_L,k_R\rangle^0 \propto |\Delta,k_L\pm i,k_R\rangle^0, \quad (28a)$$
$$B^0_\pm|\Delta,k_L,k_R\rangle^0 \propto |\Delta,k_L,k_R\pm i\rangle^0. \quad (28b)$$
(Footnote: one may wonder why the eigenvalues of the self-adjoint operators $A^0_1$ and $B^0_1$ take the complex values $k_L\pm i$ and $k_R\pm i$. The reason is that, even if the state $|\Delta,k_L,k_R\rangle^0$ lies inside the domain in which the operators $A^0_1$ and $B^0_1$ become self-adjoint, the states $A^0_\pm|\Delta,k_L,k_R\rangle^0$ and $B^0_\pm|\Delta,k_L,k_R\rangle^0$ turn out to lie outside the self-adjoint domain of $A^0_1$ and $B^0_1$. (For rigorous mathematical discussions we refer to the literature [11, 12].) As we will see below, however, a naive use of the "ladder" equations (28a) and (28b) correctly yields the retarded and advanced two-point functions.) In the coordinate realization (22b) and (22d), with the solution (27), the left-hand sides become
$$A^0_\pm\phi^0_{\Delta,k_L,k_R} = i\Bigl(-\frac{\Delta}{2}\pm ik_L\Bigr)A_\Delta(k_L,k_R)\,x^\Delta e^{-i(k_L\pm i)x_L}e^{-ik_Rx_R} + i\Bigl(\frac{\Delta}{2}-1\pm ik_L\Bigr)B_\Delta(k_L,k_R)\,x^{2-\Delta}e^{-i(k_L\pm i)x_L}e^{-ik_Rx_R}, \quad (29a)$$
$$B^0_\pm\phi^0_{\Delta,k_L,k_R} = -i\Bigl(-\frac{\Delta}{2}\pm ik_R\Bigr)A_\Delta(k_L,k_R)\,x^\Delta e^{-ik_Lx_L}e^{-i(k_R\pm i)x_R} - i\Bigl(\frac{\Delta}{2}-1\pm ik_R\Bigr)B_\Delta(k_L,k_R)\,x^{2-\Delta}e^{-ik_Lx_L}e^{-i(k_R\pm i)x_R}, \quad (29b)$$
which should be proportional to $\phi^0_{\Delta,k_L\pm i,k_R}$ and $\phi^0_{\Delta,k_L,k_R\pm i}$, respectively. In other words, the integration constants should satisfy the recurrence relations $\bigl(-\frac{\Delta}{2}\pm ik_L\bigr)A_\Delta(k_L,k_R) \propto A_\Delta(k_L\pm i,k_R)$ and $\bigl(\frac{\Delta}{2}-1\pm ik_L\bigr)B_\Delta(k_L,k_R) \propto B_\Delta(k_L\pm i,k_R)$, and similar expressions for $k_R$. Hence the two-point function $G_\Delta(k_L,k_R)$, which is given by the ratio $G_\Delta(k_L,k_R) = (2\Delta-2)\,A_\Delta(k_L,k_R)/B_\Delta(k_L,k_R)$, should satisfy the following recurrence relations:
$$G_\Delta(k_L,k_R) = \frac{\frac{\Delta}{2}-1\pm ik_L}{-\frac{\Delta}{2}\pm ik_L}\,G_\Delta(k_L\pm i,k_R), \quad (30a)$$
$$G_\Delta(k_L,k_R) = \frac{\frac{\Delta}{2}-1\pm ik_R}{-\frac{\Delta}{2}\pm ik_R}\,G_\Delta(k_L,k_R\pm i). \quad (30b)$$
These recurrence relations are linear, such that they are easily solved by iteration. But how should we identify the solutions to these recurrence relations with the retarded and advanced two-point functions? A standard prescription for getting the retarded (advanced) two-point functions via AdS/CFT is to use the solution to the Klein-Gordon equation that satisfies the in-falling (out-going) boundary condition at the horizon $x = \infty$ [6]. Here we present an alternative approach to get the retarded and advanced two-point functions without knowing the boundary conditions at the horizon $x = \infty$. The key is the generic causal property of two-point functions: the retarded two-point function has support only on the future light-cone, whereas the advanced two-point function has support only on the past light-cone.

Let us first focus on the case where the point $(\tau,\theta)$ on the AdS3 boundary ($\partial\mathrm{AdS}_3$) lies inside the future light-cone, $x_L = \tau+\theta > 0$ and $x_R = \tau-\theta > 0$. In this case the state $(A^0_-)^n(B^0_-)^m\phi^0_{\Delta,k_L,k_R} \propto e^{-i(k_L-in)x_L}e^{-i(k_R-im)x_R}$ converges as $n,m\to\infty$, such that $(A^0_-)^n(B^0_-)^m\phi^0_{\Delta,k_L,k_R}$ would be well-defined. Hence it would be natural to expect that the ladder equations $A^0_-\phi^0_{\Delta,k_L,k_R} \propto \phi^0_{\Delta,k_L-i,k_R}$ and $B^0_-\phi^0_{\Delta,k_L,k_R} \propto \phi^0_{\Delta,k_L,k_R-i}$ lead to the retarded two-point function. Indeed, iterative use of the relations $G_\Delta(k_L,k_R) = \frac{\frac{\Delta}{2}-1-ik_L}{-\frac{\Delta}{2}-ik_L}G_\Delta(k_L-i,k_R)$ and $G_\Delta(k_L,k_R) = \frac{\frac{\Delta}{2}-1-ik_R}{-\frac{\Delta}{2}-ik_R}G_\Delta(k_L,k_R-i)$ gives
$$G^R_\Delta(k_L,k_R) = \frac{\Gamma(\frac{\Delta}{2}-ik_L)}{\Gamma(1-\frac{\Delta}{2}-ik_L)}\,\frac{\Gamma(\frac{\Delta}{2}-ik_R)}{\Gamma(1-\frac{\Delta}{2}-ik_R)}\,G_R(\Delta), \quad (31)$$
where $G_R(\Delta)$ is a normalization factor given by $G_R(\Delta) = \lim_{n,m\to\infty}G^R_\Delta(k_L-in,k_R-im)$. This is the retarded two-point function with the desired analytic structure: $G^R_\Delta(k_L,k_R)$ is analytic in the upper-half complex $k_L$- and $k_R$-planes, and has simple poles at $k_L = -i\,2\pi T(\frac{\Delta}{2}+n)$ and $k_R = -i\,2\pi T(\frac{\Delta}{2}+m)$ $(n,m\in\mathbb{Z}_{\ge0})$ in the lower-half complex $k_L$- and $k_R$-planes, where $T = \frac{1}{2\pi}$ $(=\frac{1}{2\pi r})$ is the Hawking temperature with respect to the time $\tau$.

Let us next derive the retarded two-point function of the CFT2 dual to the rotating BTZ black hole (4). To this end, let $p_L$ and $p_R$ be the momenta conjugate to the BTZ light-cone coordinates $t\pm\phi$. Since $\tau\pm\theta$ and $t\pm\phi$ are related as $\tau\pm\theta = (r_+\mp r_-)(t\pm\phi)$, we have $k_L = \frac{1}{r_+-r_-}p_L$ and $k_R = \frac{1}{r_++r_-}p_R$, from which we get
$$G^R_\Delta(p_L,p_R) = \frac{\Gamma\bigl(h_L-\frac{ip_L}{2\pi T_L}\bigr)}{\Gamma\bigl(\bar h_L-\frac{ip_L}{2\pi T_L}\bigr)}\,\frac{\Gamma\bigl(h_R-\frac{ip_R}{2\pi T_R}\bigr)}{\Gamma\bigl(\bar h_R-\frac{ip_R}{2\pi T_R}\bigr)}\,G_R(\Delta), \quad (32)$$
where $T_L$ and $T_R$ are the Hawking temperatures for the left- and right-moving sectors with respect to the BTZ time $t$, given by
$$T_L = \frac{r_+-r_-}{2\pi} \quad\text{and}\quad T_R = \frac{r_++r_-}{2\pi}. \quad (33)$$
$h_L$ and $h_R$ are the conformal weights of a scalar operator of the dual CFT2, given by
$$h_L = h_R = \frac{\Delta}{2} \quad\text{with}\quad \bar h_L = \bar h_R = 1-\frac{\Delta}{2}. \quad (34)$$
Note that eq. (32) precisely coincides with the known results [8] (see also [5, 9] for the case of fermionic operators).
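The claim that the Gamma-function ratio (31) solves the recurrence (30a) is easy to verify numerically. The following sketch (illustrative code with arbitrary test values, not from the paper) uses the functional equation of the Gamma function:

```python
import mpmath as mp

Delta = mp.mpf('1.7')         # test conformal weight (arbitrary illustrative value)

def G(kl):
    """Gamma-function ratio of eq. (31) in the k_L variable, up to normalization."""
    return mp.gamma(Delta/2 - 1j*kl) / mp.gamma(1 - Delta/2 - 1j*kl)

kl = mp.mpc('0.43', '0.11')   # arbitrary complex test momentum
lhs = G(kl)
rhs = (Delta/2 - 1 - 1j*kl)/(-Delta/2 - 1j*kl) * G(kl - 1j)   # eq. (30a), lower sign
assert mp.almosteq(lhs, rhs)
print("the ratio (31) satisfies the recurrence (30a)")
```

The check works because $\Gamma(z) = (z-1)\Gamma(z-1)$: shifting $k_L \to k_L - i$ shifts both Gamma arguments by $-1$, reproducing exactly the rational prefactor in (30a).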
Let us next move on to the case where the point $(\tau,\theta) \in \partial\mathrm{AdS}_3$ lies inside the past light-cone, $x_L = \tau+\theta < 0$ and $x_R = \tau-\theta < 0$. In this case the state $(A^0_+)^n(B^0_+)^m\phi^0_{\Delta,k_L,k_R} \propto e^{-i(k_L+in)x_L}e^{-i(k_R+im)x_R}$ converges as $n,m\to\infty$, such that $(A^0_+)^n(B^0_+)^m\phi^0_{\Delta,k_L,k_R}$ would be well-defined. Iterative use of the relations
$$G_\Delta(k_L,k_R) = \frac{\frac{\Delta}{2}-1+ik_L}{-\frac{\Delta}{2}+ik_L}\,G_\Delta(k_L+i,k_R), \quad (35a)$$
$$G_\Delta(k_L,k_R) = \frac{\frac{\Delta}{2}-1+ik_R}{-\frac{\Delta}{2}+ik_R}\,G_\Delta(k_L,k_R+i), \quad (35b)$$
then gives the advanced two-point function
$$G^A_\Delta(p_L,p_R) = \frac{\Gamma\bigl(h_L+\frac{ip_L}{2\pi T_L}\bigr)}{\Gamma\bigl(\bar h_L+\frac{ip_L}{2\pi T_L}\bigr)}\,\frac{\Gamma\bigl(h_R+\frac{ip_R}{2\pi T_R}\bigr)}{\Gamma\bigl(\bar h_R+\frac{ip_R}{2\pi T_R}\bigr)}\,G_A(\Delta), \quad (36)$$
where $G_A(\Delta) = \lim_{n,m\to\infty}G^A_\Delta(p_L+in,p_R+im)$. We note that, since in general the retarded and advanced two-point functions are related by complex conjugation, $G^A_\Delta(p_L,p_R) = [G^R_\Delta(p_L,p_R)]^*$, the normalization constants must be related by $G_A(\Delta) = [G_R(\Delta)]^*$.

Acknowledgements
The author is supported in part by ESF grant CZ.1.07/2.3.00/30.0034.

References
[1] S. Ohya, "Recurrence relations for finite-temperature correlators via AdS2/CFT1," JHEP 12 (2013) 011, arXiv:1309.2939 [hep-th]. DOI: 10.1007/JHEP12(2013)011
[2] M. Bañados, C. Teitelboim, J. Zanelli, "Black hole in three-dimensional spacetime," Phys. Rev. Lett. 69 (1992) 1849-1851, arXiv:hep-th/9204099. DOI: 10.1103/PhysRevLett.69.1849
[3] M. Bañados, M. Henneaux, C. Teitelboim, J. Zanelli, "Geometry of the 2+1 black hole," Phys. Rev. D 48 (1993) 1506-1525, arXiv:gr-qc/9302012. DOI: 10.1103/PhysRevD.48.1506
[4] N. Iqbal, H. Liu, "Universality of the hydrodynamic limit in AdS/CFT and the membrane paradigm," Phys. Rev. D 79 (2009) 025023, arXiv:0809.3808 [hep-th]. DOI: 10.1103/PhysRevD.79.025023
[5] N. Iqbal, H. Liu, "Real-time response in AdS/CFT with application to spinors," Fortsch. Phys. 57 (2009) 367-384, arXiv:0903.2596 [hep-th]. DOI: 10.1002/prop.200900057
[6] D. T. Son, A. O. Starinets, "Minkowski-space correlators in AdS/CFT correspondence: recipe and applications," JHEP 09 (2002) 042, arXiv:hep-th/0205051. DOI: 10.1088/1126-6708/2002/09/042
[7] A. R. Steif, "Supergeometry of three-dimensional black holes," Phys. Rev. D 53 (1996) 5521-5526, arXiv:hep-th/9504012. DOI: 10.1103/PhysRevD.53.5521
[8] D. Birmingham, I. Sachs, S. N. Solodukhin, "Conformal field theory interpretation of black hole quasinormal modes," Phys. Rev. Lett. 88 (2002) 151301, arXiv:hep-th/0112055. DOI: 10.1103/PhysRevLett.88.151301
[9] V. Balasubramanian, I. García-Etxebarria, F. Larsen, J. Simón, "Helical Luttinger liquids and three-dimensional black holes," Phys. Rev. D 84 (2011) 126012, arXiv:1012.4363 [hep-th]. DOI: 10.1103/PhysRevD.84.126012
[10] E. Keski-Vakkuri, "Bulk and boundary dynamics in BTZ black holes," Phys. Rev. D 59 (1999) 104001, arXiv:hep-th/9808037. DOI: 10.1103/PhysRevD.59.104001
[11] J. G. Kuriyan, N. Mukunda, E. C. G. Sudarshan, "Master analytic representation: reduction of O(2,1) in an O(1,1) basis," J. Math. Phys. 9 (1968) 2100-2108. DOI: 10.1063/1.1664551
[12] G. Lindblad, B. Nagel, "Continuous bases for unitary irreducible representations of SU(1,1)," Ann. Inst. Henri Poincaré 13 (1970) 27-56. http://www.numdam.org/item?id=aihpa_1970__13_1_27_0
Acta Polytechnica 53(1):5-10, 2013

On some statistical properties of GRBs with measured redshifts having peaks in optical light curves

Grigori I. Beskin (a,*), Giuseppe Greco (b), Gor Oganesyan (c), Sergey Karpov (a)
(a) Special Astrophysical Observatory of Russian Academy of Sciences
(b) Astronomical Observatory of Bologna, INAF, Italy
(c) South Federal University, Russia
(*) corresponding author: beskin@sao.ru

Abstract. We studied the subset of optical light curves of gamma-ray bursts with measured redshifts and well-sampled R band data that have clearly detected peaks. Among 43 such events, 11 are prompt optical peaks (P), coincident with gamma-ray activity, 22 are purely afterglows (A), and 10 more carry the signatures of an underlying activity (A(u)). We studied pair correlations of their gamma-ray and optical parameters, e.g. total energetics, peak optical luminosities, and durations. The main outcome of our study is the detection of source frame correlations between both optical peak luminosity and total energy and the redshift for classes A and A(u), and the absence of such a correlation for class P events. This result seems to provide evidence of the cosmological evolution of a medium around the burst defining class A and A(u) energetics, and the absence of cosmological evolution of the internal properties of GRB engines. We also discuss some other prominent correlations.

Keywords: gamma-ray bursts, statistical methods.

1. Introduction
From the very first observations of optical afterglows of GRBs, it was immediately obvious that these phenomena are no mere extinguishing of the energy that powered the GRB; rather, they represent a vast wealth of physical phenomena that are not yet fully understood.
The aim of our work is to develop a statistical analysis of the prominent optical peaks that characterize several GRBs in the early stage of their emission. These are important for a study of the properties of the interstellar medium over a wide range of distances, and of the physical mechanisms operating during the transition phases between internal and external shock. With this purpose in mind, we discuss the selection criteria and the classification scheme for the GRB sample with a well-measured optical peak. We study the correlations of pairs of parameters of these GRBs, and define possible directions for future work. We adopt the concordance $\Lambda$CDM cosmology with $\Omega_m = 0.3$, $\Omega_\Lambda = 0.7$, and $H_0 = 70$ km s$^{-1}$ Mpc$^{-1}$.

2. Method and classification
We gathered all available GRBs with measured redshifts $z$ and well-sampled optical light curves that present a clear peak or peaks during their evolution (over the period from February 28, 1997 until February 28, 2011). Subsequently, the entire optical peak sample was divided into sub-groups that took into account the main characteristics of the contemporaneous emissions at higher energies. In particular, we were interested in comparing the rising phases of optical emission with the simultaneous behaviour at gamma-ray wavelengths, as observed by space-borne telescopes. This operational classification leads to the division of our sample into three main sub-groups.
• P: optical peaks arise during the main phases of the prompt gamma-ray emission; we will refer to these as prompt optical emission.
• A(u): there is still some residual underlying gamma-ray activity simultaneously with the formation of the optical peak.
• A: no significant gamma-ray activity is seen simultaneously with the onset of the optical peak.
• P?: events that may not be unambiguously classified either as A(u) or as P. Three of the P? objects fall into the A(u)-populated regions of the correlation plots. We also use the P-3 notation to refer to all prompt events except P? events.
The observed optical peak flux $F_{\rm opt}$ was obtained using the calibration of [1],
$$F_{\rm opt} = 1568\cdot\bigl(2.15\cdot10^{-9}\cdot10^{-0.4\,\rm mag}\bigr)\ {\rm erg\,cm^{-2}\,s^{-1}}, \quad (1)$$
and was then corrected for Galactic extinction (based on the map of [2]) and for the brightness of the host galaxy (if this value is available). For the bursts for which the spectral index $\beta$ and the host galaxy extinction $A_{\rm host}$ are not yet available, we assumed $\beta = 0.75$ and the mean value of $A_{\rm host}$ measured in the corresponding ranges of redshifts. Namely, the $A_{\rm host}$ data collected in the golden sample presented in [3] were divided into five redshift ranges, and for each distance interval the corresponding average value of $A_V$ was obtained. The isotropic equivalent luminosity $L_{\rm opt}$ of the optical peak is related to the peak optical flux $F_{\rm opt}$ by
$$L_{\rm opt} = 4\pi\,\kappa_{\rm opt}(z)\,D_L^2(z)\,F_{\rm opt}, \quad (2)$$
where $D_L(z)$ is the luminosity distance for the standard cosmological model and $\kappa_{\rm opt}(z)$ is the cosmological $\kappa$-correction that takes into account the transformation of the R passband to the proper GRB rest frame:
$$\kappa_{\rm opt} = \int_{\nu_{R0}/(1+z)}^{\nu_{R1}/(1+z)}\nu^{-\beta}\,d\nu \Bigm/ \int_{\nu_{R0}}^{\nu_{R1}}\nu^{-\beta}\,d\nu = \frac{1}{(1+z)^{1-\beta}}. \quad (3)$$
Here, $\nu_{R0}$ and $\nu_{R1}$ are the frequency boundaries of the R band, and $\beta$ is the power-law index of the optical spectrum, $F_\nu \propto \nu^{-\beta}$.
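The chain of eqs. (1)-(3) is straightforward to implement. The following sketch (illustrative code, not the authors' pipeline; the function name and example values are ours) converts an observed R magnitude and redshift into the isotropic-equivalent peak optical luminosity using the standard astropy cosmology tools:

```python
import numpy as np
from astropy.cosmology import FlatLambdaCDM
import astropy.units as u

cosmo = FlatLambdaCDM(H0=70, Om0=0.3)   # concordance cosmology adopted in the text

def peak_optical_luminosity(mag_R, z, beta=0.75):
    """Isotropic-equivalent L_opt in erg/s from an R magnitude and redshift."""
    f_opt = 1568.0*2.15e-9*10**(-0.4*mag_R)            # eq. (1), erg cm^-2 s^-1
    kappa = (1.0 + z)**(beta - 1.0)                    # eq. (3): 1/(1+z)^(1-beta)
    d_L = cosmo.luminosity_distance(z).to(u.cm).value  # luminosity distance in cm
    return 4.0*np.pi*kappa*d_L**2*f_opt                # eq. (2)

# example call with made-up numbers (mag_R = 17, z = 1):
print(f"L_opt = {peak_optical_luminosity(17.0, 1.0):.3e} erg/s")
```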
The optical fluence $S_{\rm opt}$ was determined by numerical integration of the afterglow light curve over the interval from the earliest observation to the latest observation, with a power-law interpolation of the flux $F(t)$ in the segments between the observational points. Since only a part of the optical afterglow light curve may be recorded in practice, this quantity is actually only a lower limit on the fluence. The isotropic equivalent of the total optical energy in the R band, $E_{\rm opt}$, in the rest frame of the source was determined from the optical fluence $S_{\rm opt}$ using the relation
$$E_{\rm opt} = \frac{4\pi\,\kappa_{\rm opt}(z)\,D_L^2(z)\,S_{\rm opt}}{1+z}. \quad (4)$$
Below is a list of quantities we use to characterize the GRBs: the duration of the burst until the emission of 90 % of its energy, $T_{\rm opt}$; the delay of the maximal optical flux after the gamma-ray trigger, $T_{\rm peak}$; the width of the optical peak at 10 % of the maximum, $T_{\rm width}$. These quantities are converted to the comoving frame by dividing by $(1+z)$, in the same way as the total fluences in the optical and gamma ranges ($S_\gamma$ and $S_{\rm opt}$) are converted to the rest frame energies $E_{\rm iso}$ and $E_{\rm opt}$, also taking into account the kappa-correction. The shape of the optical light curves is characterized by the power-law indices of the rise before the peak and the decay after the peak, $\alpha_r$ and $\alpha_d$. The peak values of the optical and gamma light curves are characterized by the fluxes $F_\gamma$ and $F_{\rm opt}$, and also by the rest frame isotropic equivalent luminosities $L_{\rm iso}$ and $L_{\rm opt}$. Finally, the gamma-ray spectrum is characterized by the photon index $\alpha$.

To measure the relations between different parameters we use the unweighted Pearson correlation coefficient $r$, computed for the logarithms of all quantities. The significances of these estimates are computed using Student's t-distribution for the quantity
$$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}},$$
where $n$ is the sample size.

3. Results
Table 1 lists all the pair correlations for various classes of optical companions with correlation coefficients greater than 0.5 and significances better than 1 %. We found a strong correlation between the peak optical luminosity and the redshift for all object classes except for the prompt class (see the lower right panel of Figure 1). There is a less obvious, but still significant, correlation of the optical energetics with the redshift. This may provide evidence for the cosmological evolution of interstellar medium parameters in regions around the progenitors of long gamma-ray bursts. We detected significant connections between energies and peak luminosities, both in the gamma-ray range and in the optical range, for all classes of objects (see the upper row of panels in Figure 1), and also between the optical energetics and the peak luminosities and energetics in gamma-rays. It is worth noting that in the latter cases the characteristics of these connections are similar in all classes. This suggests similar jet opening angles in the gamma-ray and optical emission regions. In Table 2, we show the parameters of linear fits of the quantities whose correlation coefficients are shown in bold in Table 1. A detailed discussion of all detected correlations will be performed in the future.
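The significance test described in Section 2 is simple to reproduce. A minimal sketch (illustrative code on mock data, not the authors' analysis) using SciPy:

```python
import numpy as np
from scipy import stats

def correlation_significance(x, y):
    """Unweighted Pearson r on logarithms and its two-sided Student-t significance."""
    r, _ = stats.pearsonr(np.log10(x), np.log10(y))
    n = len(x)
    t = r*np.sqrt(n - 2)/np.sqrt(1 - r**2)     # t = r*sqrt(n-2)/sqrt(1-r^2)
    p = 2*stats.t.sf(abs(t), df=n - 2)         # two-sided p-value
    return r, p

# mock correlated luminosities/energies, for illustration only
rng = np.random.default_rng(0)
x = 10**rng.uniform(46, 50, size=40)
y = x**1.4 * 10**rng.normal(0, 3, size=40)
print(correlation_significance(x, y))
```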
4. Conclusions
Monitoring with robotic telescopes has helped us to understand some characteristics of the optical emissions: in the region of peak formation, the gamma-ray emission can proceed in coincidence with the optical luminosity (P), can be extinguished (A), or can carry the remnants of an underlying activity (A(u)). In the case of A or A(u), the peak events that arise when the prompt gamma-ray emission is completely extinguished appear to carry the signatures of the circumburst environment at different cosmological epochs. For the P type, the optical luminosity of the peak, which coincides with a major activity of the prompt gamma-ray emission, seems to deviate from any cosmological evolution trend. This fact provides independent confirmation that the prompt physical mechanisms are independent of their location in the Universe. However, the mechanisms of the prompt optical emission are still unclear; with the exception of a few rare cases, the morphology of these peaks is complex and poorly sampled.

Acknowledgements
This work was supported by Bologna University Progetti Pluriennali 2003, by grants of CRDF (No. RP1-2394-MO02), RFBR (No. 04-02-17555, 06-02-08313, 09-02-12053 and 12-02-00743), INTAS (04-78-7366), by the Presidium of the Russian Academy of Sciences Program, and by a grant of the President of the Russian Federation for the support of young Russian scientists. S.K. has also been supported by a grant from the Dynasty Foundation. G.B. thanks Landau Network-Centro Volta and the Cariplo Foundation for a fellowship, and Brera Observatory for hospitality. G.G. also gratefully acknowledges support from Fondazione Carisbo.

References
[1] Fukugita, M., Shimasaku, K., Ichikawa, T.: Galaxy colors in various photometric band systems. PASP, 107, 1995, 945.
[2] Schlegel, D. J., Finkbeiner, D. P., Davis, M.: Maps of dust infrared emission for use in estimation of reddening and cosmic microwave background radiation foregrounds. ApJ, 500, 1998, 525.
[3] Kann, D. A., Klose, S., Zhang, B., et al.: The afterglows of Swift-era gamma-ray bursts. I. Comparing pre-Swift and Swift-era long/soft (Type II) GRB optical afterglows. ApJ, 720, 2010, 1513-1558.
Table 1. Pair correlation coefficients for rest frame parameters. Those shown in bold have correlation coefficients greater than 0.5 and significance better than 1 %. 1: P, 2: A, 3: A(u), 4: P+A+A(u), 5: A(u)+A, 6: P-3, 7: [A(u)+3]+A.

Figure 1. Scatter plots of pair correlations between various parameters.
Table 2. Pair correlations for various classes of optical companions with correlation coefficients greater than 0.5 and significances less than 1 %. Four columns are the linear regression ($a + b\cdot x$) coefficients, derived through unweighted least squares. The stars mark the log-linear correlations, in contrast to the log-log correlations used otherwise.

Acta Polytechnica 53(2):213-218, 2013

Long-lived plasma process, created by impulse discharge in micro-disperse droplet environment

Serge Olszewski (*), Tamara Lisitchenko, Vitalij Yukhymenko
Taras Shevchenko National University of Kyiv, Radio Physics Faculty, pr. A. Glushkova 4g, Kyiv, Ukraine
(*) corresponding author: olszewski.serge@gmail.com

Abstract. The processes of organic compound (phenol and cation-active surfactants) destruction in water solutions under the influence of plasma treatment have been investigated in different dynamic plasma-liquid systems (PLS) with discharges in droplet micro-disperse environments (DMDE). A long-lived plasma process with distinct spectral properties has been observed for the pulsed discharge in DMDE. An approximate computer model is proposed for the description of this effect. According to the introduced model, this long-lived process is an aggregation of correlated discharges between charged droplets.

Keywords: dynamic plasma-liquid system, plasma-chemical processing, ultrasonic nebulization, droplet micro-disperse environment.

1. Introduction
Water is a valuable natural resource. Underlying metabolic processes in all their aspects and forming the basis of human life, water plays an exclusive role. Methods based on plasma-chemical processes in liquid-gas environments are among the most promising for water treatment and for the purification of highly polluted wastewater. Unlike the regenerative methods, which remove the collected impurities from the water into a solid (absorption), gas (desorption) or non-aqueous liquid (extraction) phase, the destructive methods (technologies of plasma-chemical processing of water and industrial waste) are based on the conversion of the chemical structure of the impurity molecules. The problem of complete cleaning of industrial wastewaters from organic highly active and toxic substances (HATS) is both important and difficult, and so far it cannot be solved completely. Plasma-chemical technologies appear to be the most promising ones, since they allow a high rate of substance destruction at the expense of a high energy concentration.
However, it is necessary to take into account that chemical reactions occurring in the liquid phase can be stimulated only by particles that penetrate from the plasma into the liquid. They can be active radicals created as a result of recombination processes in the plasma phase. They can also be excited molecules generated as a result of the bombardment of the liquid surface by charged particles of the plasma. Therefore the effectiveness of plasma-liquid systems (PLS) as applications for liquid treatment is primarily determined by the size of the interaction zone between the liquid phase and the plasma. An integral feature of all systems with plasma-liquid interaction is their small active volume as compared with the total liquid volume. As a matter of fact, the active volume of a plasma-liquid system is defined as the contact area multiplied by the thickness of the diffusion layer of active plasma particles in the liquid, where the contact area is the area of the surface of contact between plasma and liquid. One of the directions for increasing the ratio of active volume to total liquid volume is the use of PLS with liquid solutions in a micro-disperse state, for example PLS based on discharges in a fog that contains drops with sizes of the order of the diffusion layer thickness. Using pulsed discharges as sources of decaying plasma in a droplet environment can generate flows of chemically active particles onto an extended liquid surface. At the same time, the small size of the discrete droplets provides more complete treatment of the total liquid volume. Some features of plasma-liquid systems based on pulsed discharges in droplet micro-disperse environments have been studied in the present work.

2. Material and methods
The scheme and a photo of the experimental setup are represented in Fig. 1. In this setup distilled water (4) was sprayed by an ultrasound field into the inside volume of the working vessel (3) and transformed into a monodisperse fog (10). A quartz cylinder with inside diameter 28 mm and height 150 mm was used as the working vessel. The ultrasound field was created by a quartz crystal (8). The frequency of the ultrasound field was 800 Hz and the acoustic power was ~ 60 W. For initiating a spark discharge between the copper electrodes (1), a high voltage of ~ 10 kV was applied to them. The discharge gap between the electrodes was 3 mm. The gap between the high-voltage electrode and the inside surface of the quartz vessel was 5 mm. The current of the spark was measured by a Rogowski coil.

Figure 1. Experimental setup: a) scheme, b) photo; 1 - metal electrodes, 2 - quartz insulator, 3 - quartz vessel, 4 - up flanges, 5 - side metal walls, 6 - rubber seals, 7 - ultrasonic sprayer, 8 - quartz piezocrystal, 9 - water cooler, 10 - ultrasonic fog, 11 - ultrasonic fountain, 12 - work liquid.

Figure 2. Diagram of optical measurements; 1 - copper electrodes, 2 - droplet environment, 3 - pulsed discharge, 4 - filter wheels, 5 - optical-fiber waveguide, 6 - CCD-spectrometer, 7 - PC-recorder.

The value of the current was ~ 1 kA. All processes were observed at atmospheric pressure. The emission spectra were registered by a Solar TII CCD-spectrometer. The scheme of the optical measurements is shown in Fig. 2. The coordinate axis z was directed downwards along the geometrical axis of the system. The point z = 0 was located in the horizontal plane containing the discharge gap.
The optical radiation was collected, along an axis directed perpendicularly to the axis z, through the filter wheel (4) into an optical-fiber waveguide (5). The spectra registered by the CCD-spectrometer (6) were stored by specialized software on a personal computer (7). The typical timing of slow processes in the volume of the system was investigated by analysis of images from video frames. Video filming was performed by an EAN850A video camera with a video-frame duration of 30 ms. The video-frame duration itself determined the interval between time readings in the experiment. According to the results of test drives for different video cameras [7, 9, 8], the typical time interval of CCD-matrix perception is ~ 1 µs. This time interval determined the experimental accuracy in time for slow processes.

Figure 3. Photo of plasma processes in micro-disperse droplet environment: a) spark discharge between metal electrodes, b) sliding discharge along the inside surface of the working vessel between the high-voltage electrode and the water surface, c) long-lived plasma process with spectral properties different from the previous discharge types.

3. Results and discussion
3.1. Experiment
In the course of the experiment, three different processes were observed in the working volume of the system (Fig. 3). Fig. 3a shows the trivial spark discharge between the metal electrodes. The breakdown voltage of this discharge in water fog was ~ 7-10 kV. The pulsing current had an amplitude of ~ 1.5 kA. This discharge had a single channel. The visual lengthwise size of the spark channel was ~ 10 mm; its visual diameter was ~ 1 mm. The duration of the current impulse of the spark discharge was <= 10 µs. Fig. 3b shows the sliding discharge between the high-voltage metal electrode and the liquid surface. The breakdown voltage of this discharge was ~ 10 kV. The pulsing current of the sliding discharge had an amplitude of ~ 1.5 kA. This discharge had multiple channels located near the inside dielectric surface of the working vessel. The visual lengthwise size of these channels was ~ 50 mm; their visual diameter was ~ 1 mm. The duration of the current impulse of the sliding discharge was ~ 10 µs. Fig. 3c shows the volume process, with a glow color distinct from the spark and sliding discharges. On the video, the evolution of this process extended over several consecutive video frames. As a rule the number of these video frames was >= 4; in some cases it reached 18. The typical lifetime of this process was determined by counting these video frames, and was ~ 120-540 ms. This process did not have channels and had the look of a transparent luminous cloud. As systematic observations have shown, the necessary condition for the start-up of the long-lived process is the existence of fog in the area of the process. For the cases of pulsed discharges, the breakdown voltage was measured by a high-voltage voltmeter built into the power source for the reservoir capacitor. The reservoir capacitor was used for energy supply into the discharge, and at the moment of discharge breakdown the power source was detached from this capacitor automatically. However, this on-board voltmeter did not register any voltage in the system during the lifetime of the volume process. The discharge current was not registered either. The maximum output current of the power source was 150 mA; therefore, during the lifetime of the volume process the current in the system was <= 150 mA.
The results of the spectroscopic investigations of the pulsed discharges in the PLS with droplet micro-disperse environment are shown in Fig. 4: the emission spectrum of the spark discharge is represented in plot a), and the spectrum of the sliding discharge in plot b). The spark spectrum was registered along the axis with coordinate z = 0 mm. The radiation of the sliding discharge was registered along the axis with coordinate z = 15 mm (Fig. 2). This axis was purposely distanced from the place of the spark, to avoid the overlapping of the spectra of the different discharge types; the same method was also used for the registration of the spectra of the long-lived process. The atomic lines of hydrogen at λ = 657.8 nm, of oxygen at λ = 778.8 nm, and of the electrode material at λ = 229.2, 327.8, 524.8 nm (in these experiments, the atomic lines of copper) are present in the spectra in both cases. The molecular bands of hydroxyl (286.3-343.7 nm) were not registered in the radiation of the pulsed discharges. The probable cause of the absence of the hydroxyl bands in the emission spectra of both pulsed discharges may be the presence of metal vapor: metals have the lowest excitation potentials of electron states, so a high concentration of metal atoms can appreciably increase the probability of radiationless de-excitation of hydroxyl molecules. But this supposition requires screening and goes beyond the scope of this work. The emission spectrum of the long-lived process is represented in Fig. 5. The optical radiation of the long-lived process was registered along the axis with coordinate z = 15 mm. The atomic lines of hydrogen and oxygen are also present in this spectrum. However, unlike the cases of the pulsed discharges, a strong band of hydroxyl is present and the atomic lines of the electrode material are absent in the emission spectrum of the long-lived process. The absence of the atomic copper lines in the emission spectra of the long-lived process can be explained by the hypothesized nature of this process: if the long-lived process is related to discharges between charged drops, then the "electrode material" in this case is water, and atomic lines of copper cannot be present in the emission spectra.

Figure 4. Emission spectra of pulsed discharges in PLS with micro-disperse droplet environment: a) spark discharge, z = 0 mm, b) sliding discharge, z = 15 mm.

Figure 5. Emission spectra of long-lived processes in PLS with micro-disperse droplet environment, z = 15 mm.

The spreading of the long-lived process in the working volume is shown by serial video frames in Fig. 6. As a result of processing the images from these video frames, an estimate of the linear speed of spreading of the glow area was made. The estimated rate of spreading of the glow area border was ~ 0.7 m s−1. Such a small value of this speed indicates that the mentioned process cannot be the propagation of a combustion wave.

3.2. Physical model
Statistical analysis of the experimental data has shown that the necessary condition for the appearance of the durable process is the obligatory presence of water fog in the volume with the discharge. The hypothesis has been introduced that, under the given conditions, the long-lived process is a correlated multiple spark (CMS) discharge between charged and uncharged fog drops during their approach. To verify this hypothesis, an approximate model of the long-lived process was created. According to the model, fog drops located in the area of the pulsed discharge channel gain an electric charge due to their contact with the plasma. As a result of Brownian motion, they are mixed in the volume with the uncharged fog drops.

Figure 6. Spreading of the long-lived process in the volume, on serial video frames.
chaotic motion of aerosol particles 215 serge olszewski, tamara lisitchenko, vitalij yukhymenko acta polytechnica figure 6. spreading of long-lived process in volume on serial video-frames. leads to the approaching of single drops on the lengths in order of magnitude of their radii. in the case of such approaching between the fog particles with different charges, or between charged and uncharged particles, the electric field appears with the magnitude that can be much larger (according to [5, 3]) then the one calculated using the coulomb’s law. electric fields’ value between the single particles is also related to self-consistent electric field, formed by the ensemble of charged drop, chaotically distributed in the working space. spatial redistribution of charged particles in time is defined by the langevin equation with the additional determinate force, which has an electrostatic nature, ξxtr dx dt = fx(t) + fx , (1) where ξxtr is drag coefficient for a ellipsoidal drop, fx(t) is random force and fx is external electrostatic force. according to [5], the drop can be split into the parts as a result of rayleigh or taylor instability. as far as taylor instability depends on the outer electric field and can be developed even for the electro neutral drops, in case of charged aerosol drops ensemble it is more probable. according to [6], charged liquid drop always acquires an ellipsoidal form. an instability criterion for ellipsoid drops is given by the inequality ε0e 2 sr 4αs ≤ 1.54 (2) according to taylor [4], wherein r is the drops’ radius, αs is coefficient of surface tension, es is the value of outer electric field and ε0 is absolute dielectric permittivity. the conditions of spark ignition discharge between the drops, according to [2], are defined by the electric field value in kv/cm as es = 27.2 ( 1 + 0.54 √ r ) , (3) wherein r is radius of drops. the conditions of spark breakdown require more intensity of the electric field, comparing to the case of drops’ break-up according to the capillary surface instability of taylors’ criterion. in cases of quasistatic systems, spark breakdown between micro-drops of electro conductive liquid is low probable. but according to [1] the typical time of capillary instability development can be estimated as τ ∼ r2ρ1/2α1/2s , (4) wherein ρ is liquids’ density. this time range is more than in 6 times of magnitude greater than the time range of spark discharge. hence, in dynamic systems the conditions can be carried out when the electric field between neighbor drops can increase to the value enough for spark discharge ignition during period of time less than the typical time of instability development. such an increasing of the electric field can be provided either by rate of charged and uncharged drops approaching, or by the superposition of charged aerosol drops’ self-consistent field and vortex electric field produced by the alternating current of spark discharge between neighbor pair of drops. the latter mechanism is also an extra discharges correlation factor between the approaching drops’ pairs, because it relieves the breakdown conditions due to the photoelectric effect and charged particles diffusion, and leads to the impulse increasing of the electric field value. 3.3. simulation according to presented physical model, 3d computer model has also been developed. 
3.3. Simulation
According to the presented physical model, a 3D computer model has also been developed. Due to the orbital symmetry of the ellipsoidal drops, the number of dimensions can be reduced to 2D, so for two coordinates the working space was chosen similar to the experiment, while the third dimension was contracted by two orders of magnitude: 0.02 × 2 × 10 cm. The fog density was chosen as 5 × 10³ cm⁻³. At the first step of the calculation, an ensemble of fog drops was created, with gamma-distributed characteristic sizes and random coordinates inside the working space,
$$n(a) = \frac{\mu^{\mu+1}}{\Gamma(\mu+1)}\,\frac{a^\mu}{r_m^{\mu+1}}\,\exp\Bigl(-\mu\frac{a}{r_m}\Bigr), \quad (5)$$
wherein $\Gamma(\mu+1)$ is the gamma function, $r_m$ is the most probable drop radius and $\mu$ is the half-width of the distribution. The initial velocities of the particles were generated according to the Maxwell distribution. For each $n$-th particle a random charge was specified, such that the value $1/n$ was distributed in space according to the Gaussian law. At the next step, the self-consistent electric field at the instantaneous coordinate of each particle was calculated. Then, for each pair of particles, the correction to the electric field intensity [3, 6], the criterion of electrical breakdown (eq. 3), and the particle break-up criterion (eq. 2) were calculated. For the pairs of drops which conformed to the breakdown criterion, the breakdown impulse discharge current was calculated. For the drops conforming to the break-up criterion, the initial time point of instability development was fixed.
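Before continuing with the evolution loop, note that the size distribution (5) is a standard Gamma density with shape $\mu+1$ and scale $r_m/\mu$ (its mode is exactly $r_m$), so the initialization step can be sketched with a stock Gamma sampler. The code below is an illustrative reconstruction; the numerical values of $r_m$ and $\mu$ are assumptions, not the authors' parameters:

```python
import numpy as np

rng = np.random.default_rng(42)

r_m, mu = 1.0e-6, 4.0        # most probable radius (m) and half-width: assumed values
volume_cm3 = 0.02*2.0*10.0   # working space 0.02 x 2 x 10 cm
n_drops = int(5.0e3*volume_cm3)   # fog density 5e3 cm^-3

# eq. (5) == Gamma(shape=mu+1, scale=r_m/mu); mode = (shape-1)*scale = r_m
radii = rng.gamma(shape=mu + 1.0, scale=r_m/mu, size=n_drops)

# random coordinates inside the working space (converted to meters)
coords = rng.uniform([0.0, 0.0, 0.0], [0.02e-2, 2.0e-2, 10.0e-2], size=(n_drops, 3))

print(f"{n_drops} drops, mean radius {radii.mean()*1e6:.2f} um "
      f"(most probable ~ {r_m*1e6:.1f} um)")
```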
For drop sizes ≤ 10 µm, the atomization probability is greater than the probability of spark breakdown, because in this case the drops have a smaller mean velocity and the capillary instability has enough time to develop. For drop sizes ≤ 0.5 µm, the probability of charge loss as a result of corona discharge is greater than the probabilities of the other elementary drop-state transformation processes considered. An increase of the self-consistent field value reduces the probability of spark breakdown, since it relieves the conditions for the development of the capillary instability. However, the field gradient enhances the conditions for spark breakdown for sufficiently rapid drops.

Figure 7. Space distribution of the areas that contained pairs of drops with spark discharges between them.

Figure 8. Dependence of the position of the glow border on time; the black curve represents the results of the experimental measurements; the grey curve represents the results of the simulation.

On the periphery of the charged fog cloud, the gradient of the self-consistent field mainly hinders the increase of the electric field during drop approach. This holds for charged drops that move from the charged cloud towards the periphery. However, the chaotic pattern of drop motion provides a large number of drops whose velocity is oriented along the self-consistent field. It is precisely the ensemble of these drops that is the main source of the CMS-discharge.

4. Conclusions
Comparison of the calculated and experimental results leads to the following conclusions:
• In terms of the introduced model, the long-lived plasma process created by an impulse discharge in a micro-disperse droplet environment can be represented as a correlated multiple spark discharge (CMS-discharge) between discrete pairs of differently charged drops.
• The self-consistent electric field of the charged droplet ensemble decreases the probability of CMS-discharge initiation on the periphery of the charged fog cloud and increases the probability of droplet fragmentation due to the Taylor instability.
• The calculated and the experimentally measured propagation speeds of the glow boundary of the CMS-discharge agree within an order of magnitude. This speed corresponds to the submicron size of the drops that predominantly support the CMS-discharge.

Acknowledgements
The present work was partly supported by the Kyiv National Taras Shevchenko University, by the National Academy of Sciences of Ukraine, and by the Ministry of Education and Science of Ukraine.

References
[1] D. F. Belonoszko, A. I. Grigoriev. The characteristic time of instability of droplets charged to the Rayleigh limit. Technical Physics Letters 25(15):41-45, 1999.
[2] E. D. Lozanskij, O. B. Firsov. The Theory of the Spark. Atomizdat, Moscow, 1975.
[3] M. V. Mirolubov, et al. Methods for Calculating Electrostatic Fields. Higher School, Moscow, 1963.
[4] V. M. Muchnik, B. E. Fishman. The Electrification of Coarse Aerosols in the Atmosphere. Gidrometeoizdat, Moscow, 1982.
[5] V. A. Saranin. Some of the effects of electrostatic interaction of water drops in the atmosphere. Technical Physics 65(12):12-17, 1999.
[6] A. A. Shutov. The form of drops in a constant electric field. Technical Physics 72(12):15-22, 2002.
[7] A. Zajtsev, et al. Testing of a network television camera Samsung SNP-5200H. ProSystem CCTV 1:28-45, 2012.
[8] A. Zajtsev, et al. Testing of a network television camera Sony SNOEM52. ProSystem CCTV 2:60-72, 2012.
[9] A. Zajtsev, et al. Testing of the network video registrar SoftTera NVR-SR5Q24. ProSystem CCTV 1:46-53, 2012.
doi:10.14311/ap.2013.53.0782
Acta Polytechnica 53(Supplement):782-785, 2013. © Czech Technical University in Prague, 2013. Available online at http://ojs.cvut.cz/ojs/index.php/ap

Status of the CUORE experiment
Claudia Tomei*, for the CUORE Collaboration
Sez. INFN di Roma, Roma I-00815, Italy
* Corresponding author: claudia.tomei@roma1.infn.it

Abstract. The CUORE (Cryogenic Underground Observatory for Rare Events) experiment will search for neutrinoless double beta decay of ¹³⁰Te, a rare nuclear process that, if observed, would demonstrate the Majorana nature of the neutrino and enable measurements of the effective Majorana mass. The CUORE setup consists of an array of 988 tellurium dioxide crystals, operated as bolometers, with a total mass of about 200 kg of ¹³⁰Te. The experiment is under construction at the Gran Sasso National Laboratory in Italy. As a first step towards CUORE, the first tower (CUORE-0) has been assembled and will soon be in operation.

Keywords: double beta decay, neutrino mass, Majorana neutrinos, bolometers, cryogenic detectors.

1. Introduction
Neutrinoless double beta decay is a unique probe for addressing the open questions in neutrino physics: the nature of the neutrino (Dirac or Majorana), the absolute neutrino mass scale and the mass hierarchy. Neutrinoless double beta decay (0νββ) is an extremely rare nuclear process, never observed up to now (with the exception of a very controversial claim [2], which will be tested by current-generation experiments). It can take place through the exchange of a massive Majorana neutrino, and its occurrence violates conservation of the total lepton number, thus implying physics beyond the Standard Model (for a recent review see [1]). The 0νββ decay amplitude depends on the effective Majorana mass

$m_{\beta\beta} = \Bigl|\sum_k U_{ek}^2\, m_k\Bigr|$, (1)

where $U_{ek}$ are the elements of the neutrino mixing matrix and $m_k$ are the neutrino mass eigenstates. The measurable quantity is the half-life, which is related to $m_{\beta\beta}$ by the formula

$\bigl(T^{0\nu}_{1/2}\bigr)^{-1} = G^{0\nu}(Q,Z)\,|M^{0\nu}|^2\,\dfrac{|\langle m_{\beta\beta}\rangle|^2}{m_e^2}$, (2)

where $G^{0\nu}$ is the phase-space factor and $M^{0\nu}$ is the nuclear matrix element. From a measurement (or a lower limit, in the case of a null result) on $T^{0\nu}_{1/2}$ for any ββ-emitting isotope, one can extract the value (or an upper limit) of the Majorana neutrino mass, assuming knowledge of the phase-space factor and the nuclear matrix element for the given isotope. The phase-space factors $G^{0\nu}$ are known with good accuracy [3]. Unfortunately, the same is not true for the nuclear matrix elements $M^{0\nu}$, which are extremely difficult to calculate and can only be estimated using approximations. The actual spread in the $M^{0\nu}$ calculations for a given isotope is a factor of ∼2, which is reflected in an uncertainty of the same size on $m_{\beta\beta}$ for a given experimental result.

1.1. 0νββ in experiments
The signature of 0νββ is the emission of two simultaneous electrons with summed energy equal to the Q-value of the ββ emitter. For experiments able to measure the sum energy of the two electrons, the signal appears as a peak at the Q-value. Despite this simple signature, the detection of 0νββ is extremely challenging, because the predicted half-lives for this kind of decay are above 10²⁵ y.
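As a worked illustration of eq. (2), a lower limit on the half-life translates into an upper limit on the effective Majorana mass. The sketch below is ours; the phase-space factor and the nuclear matrix element are illustrative placeholders, not values quoted in this paper:

```python
import math

M_E_EV = 0.511e6  # electron mass [eV]

def mbb_limit_ev(t_half_limit_y, g0nu_per_y, m0nu):
    """Upper limit on <m_bb> from a half-life lower limit, inverting eq. (2):
    (T_1/2)^-1 = G0nu(Q, Z) * |M0nu|^2 * |<m_bb>|^2 / m_e^2."""
    return M_E_EV / (m0nu * math.sqrt(g0nu_per_y * t_half_limit_y))

# A factor ~2 spread in M0nu propagates linearly: doubling m0nu halves the
# extracted mass limit, which is exactly the uncertainty discussed in the text.
```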
The main experimental issues are those common to all experiments that search for rare events. The most crucial parameters are summarized in the formula for the experimental sensitivity,

$S^{0\nu} = \ln 2\, N_A \cdot \dfrac{a\,\epsilon}{A}\left(\dfrac{M t}{B \Delta E}\right)^{1/2}$, (3)

where $N_A$ is the Avogadro number, $a$ is the isotopic abundance, $\epsilon$ is the detection efficiency, and the other quantities have the usual meaning ($A$ is the molar mass, $M$ the detector mass, $t$ the live time, $B$ the background index and $\Delta E$ the energy window). Many experiments, based on various isotopes, are presently dedicated to the search for 0νββ decay. Some of them are already running, while others are in the construction phase and are expected to start data taking in the coming years. The first goal is to scrutinize the ⁷⁶Ge claim [2], corresponding to a half-life of (2.23 ± 0.04) × 10²⁵ y, and, if it is confirmed, to collect sufficient statistics for a precision measurement. If it is not confirmed, the next goal is to improve the half-life sensitivity above 10²⁶ y, in order to enter the region of $m_{\beta\beta}$ allowed for the inverted hierarchy of neutrino masses. The best experimental results (namely, limits on the 0νββ half-life for many isotopes) have been produced up to now with calorimetric detectors (Heidelberg-Moscow [4], Cuoricino [5]), tracking detectors (NEMO-3 [6, 7]) and, more recently, liquid scintillator detectors (EXO [8], KamLAND-Zen [9]).

Figure 1. The CUORE experimental setup: a drawing of the full detector inside the future CUORE cryostat and shields.

1.2. CUORE and the bolometric technique
Cryogenic bolometers are particle detectors in which the energy of the impinging particle is converted into phonons, causing a temperature variation. Given the low heat capacity and the low base temperature (∼10 mK) of the device, even a small ΔT (∼0.1 mK/MeV) can be detected. The temperature rises are measured by thermistors glued to the surface of the bolometer. The CUORE bolometers [10, 11] are tellurium dioxide crystals. The choice of the absorber (TeO₂) is driven by many considerations: the excellent bolometric properties of this material; the high natural abundance of the ββ-emitting isotope ¹³⁰Te (∼34 %, the highest among the ββ-emitting isotopes used to date); the reasonably high Q-value (2528 keV), which lies almost above the energy spectrum of natural radioactivity; and the optimal energy resolution (∼5 keV at the Q-value). TeO₂ crystals have been successfully operated for more than ten years in the search for the 0νββ of ¹³⁰Te. The best result, obtained by the Cuoricino experiment, is $T^{0\nu}_{1/2} > 2.8 \times 10^{24}$ y at 90 % C.L.

The future of the bolometric technique lies in the CUORE experiment, which will prove the feasibility of a ∼1-ton-scale experiment with cryogenic bolometers. The CUORE setup consists of an array of 988 tellurium dioxide crystals with a total mass of about 200 kg of ¹³⁰Te. The crystals will be arranged in 19 towers, each with 13 floors of 4 detectors, held by means of Teflon spacers inside a copper structure. The whole array will be cooled down to 10 mK in a cryostat, surrounded by a gamma shield and a neutron shield (see Fig. 1). The experiment is under construction at the Gran Sasso National Laboratory in Italy.

1.3. CUORE background budget
The CUORE goal is to achieve a flat background in the region of interest (ROI) around the 0νββ Q-value of 0.01 counts/(kg keV y). This is quite a dramatic improvement over the Cuoricino background index (0.169 ± 0.006 counts/(kg keV y)).
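Equation (3) and the background goal just stated can be combined into a quick numerical estimate. This is our own sketch: the per-crystal mass (750 g, hence ∼741 kg of TeO₂) and the efficiency are assumptions, not numbers quoted in this text, and the confidence-level factor is ignored:

```python
import math

N_A = 6.022e23  # Avogadro number [1/mol]

def half_life_sensitivity_y(a, eff, A_g_mol, M_kg, t_y, B, dE_keV):
    """Eq. (3): S = ln2 * N_A * (a * eff / A) * sqrt(M * t / (B * dE)).
    a: isotopic abundance, eff: efficiency, A: molar mass [g/mol],
    M: detector mass [kg], t: live time [y], B: background [counts/(keV kg y)],
    dE: energy window [keV]. The factor 1000 converts kg to g."""
    return (math.log(2.0) * N_A * (a * eff / A_g_mol) * 1000.0
            * math.sqrt(M_kg * t_y / (B * dE_keV)))

# Illustrative CUORE-like inputs (assumed): 988 x 0.75 kg TeO2, a = 0.34,
# eff = 0.87, B = 0.01 counts/(keV kg y), dE = 5 keV, t = 5 y:
S = half_life_sensitivity_y(0.34, 0.87, 159.6, 988 * 0.75, 5.0, 0.01, 5.0)
# -> order 1e26 y, consistent with the sensitivity quoted in the conclusions.
```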
To achieve this goal, only carefully selected radio-pure materials are used in the detector region, e.g., high-purity low-background copper and Teflon. Moreover, the compact design and the granularity of the CUORE array will provide an additional background reduction through self-shielding and anti-coincidence between the detectors. To shield against external radioactivity, the detectors are surrounded by at least 30 cm of lead, divided into several layers. The innermost lead shields are made from ancient Roman lead with a low contamination of ²¹⁰Pb. An additional 20 cm borated polyethylene layer surrounding the whole cryostat will act as a neutron shield.

The experience of Cuoricino has shown that an important fraction of the background in the ROI is due to degraded alphas from U/Th contaminations on the surfaces of the crystals and of the mounting structure (copper and PTFE). The CUORE crystals are purchased from SICCAS, and have been produced since 2009 following a very strict protocol to ensure the required radiopurity [12]. Approximately 4 % of the crystals delivered to LNGS are tested as cryogenic bolometers in the so-called CUORE Crystal Validation Runs (CCVRs). The results of the first five CCVRs have proven the excellent performance of the CUORE crystals in terms of energy resolution (improved with respect to Cuoricino) and the absence of U/Th surface and bulk contaminations [13]. As far as the copper is concerned, all components facing the detectors have to undergo a complex cleaning process, following a procedure developed at the Legnaro National Laboratory of INFN. A small tower assembled using Cuoricino crystals (completely reprocessed according to the CUORE standard) was operated underground at LNGS, inside the Cuoricino cryostat, to measure the surface activity of the mounting structure [14].

Background source                      B in ROI [counts/(kg keV y)]
External (µ + n + γ)                   < 2 × 10⁻³
Roman lead (bulk)                      < 4 × 10⁻³
Cosmogenic activation of crystals      ∼ 10⁻³
Crystals (bulk)                        < 1 × 10⁻⁴
Crystals (surface)                     < 4 × 10⁻³
Mounting - copper (surface)            < (2 ÷ 3) × 10⁻²
Mounting - PTFE (surface)              < 6 × 10⁻²

Table 1. Some contributions to the CUORE background. The numbers and limits were obtained from bolometric tests, radioactivity measurements (HP-Ge, NAA) and Monte-Carlo simulations, and are reported in [13-15].

Some contributions to the CUORE background are shown in Tab. 1, based on the pessimistic hypothesis that all the contributions observed in the R&D tests come from sources that will not improve with the geometry of CUORE. The limits for the mounting structure are mutually exclusive.

2. Status of the CUORE experiment
The CUORE experiment is presently under construction at the Gran Sasso National Laboratory (LNGS) of INFN. The CUORE experimental building (hut) is located in Hall A of LNGS and consists of three floors (cryostat equipment, clean rooms and electronics). The CUORE clean rooms are fully equipped for tower assembly. The assembly procedure is composed of several steps, each taking place in a dedicated clean room. The first phase of the assembly of a CUORE tower is gluing the thermistors (temperature sensors) and the Joule heaters (silicon resistors used to check fluctuations in the stability of the gain) to the crystals. This operation is performed within the thermistor gluing station, equipped with two robotic arms that ensure precise and uniform gluing.
the mechanical assembly of the tower is performed inside the assembly clean room, following the cuore tower assembly line (ctal). the operational principle of ctal is the following: a sealed and flushed stainless steel chamber (garage) supports a common working plane (uwp), where 4 pmma chambers (glove boxes) will switch, each one with features able to allow specific operations, namely mounting, bonding, cabling and storage, until completion of the tower. the ctal procedure was successfully tested in may 2012 with the assembly of cuore-0 (see below). as concerns crystal production, more than 80 % of the bolometers are already at lngs, stored in nitrogen-fluxed cabinets in cuore’s underground parts storage area (psa). the cuore cryogenic system is a fundamental, though technologically challenging, part of the setup. it should ensure stable cooling of the detector’s array to its operating temperature, while guaranteeing the operation of the detectors (readout and calibration) and proper shielding from environmental radiation. the key parts are the cryostat, the cooling unit and the fast cooling system, the readout (wiring), and the calibration systems. the cuore cryostat is made of six nested vessels at different temperatures, from 300 k (outer vacuum chamber) to 0.01 k (mixing chamber). the cooling power to operate the cuore detectors to temperatures in the 10 mk range is provided by a cooling system based on a closed cycle, cryogen-free, high-power 3he/4he dilution refrigerator (drs). the drs was completed and successfully tested in july 2011. the cryostat commissioning at lngs started in april 2012. the calibration of the cuore bolometers is also challenging. due to the thick lead shields and the large self-shielding effect of the bolometer array, sources must be inserted into the detector region, adhering to the strict heat-load requirements of the cryostat. the cuore detector calibration system (dcs) consists of 12 radioactive source strings that are able to move, under their own weight, through a set of guide tubes that route them from outside the cryostat at 300 k to their locations between the bolometers in the detector region inside the cryostat. the guide tubes are made of stainless steel, or high-purity low-background copper, depending on their position in the cryostat and the thermal requirements. the source strings are moved into the detector region for calibrations, and removed during regular data taking. the dcs is under construction at the university of wisconsin. the updated cuore schedule foresees the first cooldown of the full detector by fall 2014. 3. cuore-0 cuore-0 is a single cuore tower, assembled from detector components manufactured, cleaned and stored following the stringent protocols defined for cuore, with the aim of testing the performance of the new detector assembly line and detector structure. at the same time, given its sensitive mass (about 11 kg of 130te) and provided the background can be kept at a level lower than cuoricino, cuore-0 can have a physics program of its own, surpassing cuoricino in sensitivity reach, while cuore is being assembled (see fig. 2) [17], and demonstrating the potential for dark matter and axion detection with the help of a new software trigger able to reduce the threshold to a few kev [16]. cuore-0 operations began in september 2011, when the crystals were successfully instrumented with thermistors and heaters using the new semiautomated cuore gluing system. 
during march 2012, a test assembly of all the new cuore-0 copper parts, without the crystals, was successfully carried out. the mechanical assembly of the tower was started immediately afterwards, and was completed successfully in april 784 vol. 53 supplement/2013 status of the cuore experiment live time [y] -1 0 1 2 3 4 5 6 7 s e n si ti v it y σ [ y ] 1 1 /2ν 0 t 25 10 26 10 cuoricino cuore-0 bkg: 0.05 cts/(kev kg y) cuore bkg: 0.01 cts/(kev kg y) figure 2. cuore-0 sensitivity compared to the cuoricino result and the cuore expectations [17]. 2012 with the installation of the copper shields. the fully assembled tower has been moved to the cuoricino building for insertion into the cryostat and for cooldown. unfortunately, a leak appeared in the dilution unit soon after the initial system check. the leak was then fixed, and the cooldown of cuore-0 has started. data taking was due to start at the end of august 2012. 4. conclusions the cuore (cryogenic underground observatory for rare events) experiment will search for neutrinoless double beta decay of 130te. with a background of 0.01 counts/kg/kev/year and a lifetime of 5 years, cuore will reach a sensitivity of 1.6 × 1026 y @ 68 % c.l., thus entering for the first time the allowed region for the effective majorana mass in the inverted hierarchy scenario. cuore is under construction at the gran sasso national laboratory in italy, and cooldown of the detector is scheduled for fall 2014. as a first step towards cuore, the first tower (cuore-0) has been assembled and will soon come into operation. references [1] s. bilenky, c. giunti: 2012, mod.phys.lett. a27, 1230015. [2] h. v. klapdor-kleingrothaus, i. v. krivosheina, a. dietz, o. chkvorets: 2004, physics letters b586, 198. [3] j. kotila and f. iachello: 2012, phys.rev. c85, 034316. [4] h. v. klapdor-kleingrothaus, et al.: 2001, eur.phys.j. a12, 147. [5] e. andreotti, et al.: 2011, astropart.phys. 34, 822. [6] r. arnold, et al.: 2006, nucl.phys. a765, 483–494. [7] a. barabash, v. brudanin: 2011, phys.atom.nucl. 74, 312–317. [8] m. auger et al.: 2012, phys.rev.lett. 109, 032505. [9] kamland-zen collaboration: 2012, phys.rev.c 85, 045504. [10] r. ardito et al.: 2005, arxiv:hep-ex/0501010. [11] c. arnaboldi et al.: 2004, nucl. instr. meth. a518, 775. [12] c. arnaboldi et al.: 2010, journal of crystal growth 312, 2999. [13] f. alessandria et al.: 2012, astropart.phys. 35, 839–849. [14] f. alessandria et al.: 2013, astropart.phys. 45, 13–22. [15] f. bellini et al.: 2010, astropart.phys. 33, 169. [16] s. di domizio, f. orio and m. vignati: 2011, jinst 6, p02007. [17] f. alessandria et al.: 2013, arxiv:1109.0494v3 [nucl-ex], submitted to astropart.phys. discussion james beall — what is the size of the cuore crystals? claudia tomei — each crystal is 5×5×5 cm3. 785 acta polytechnica 53(supplement):782–785, 2013 1 introduction 1.1 0 nu beta beta in experiments 1.2 cuore and the bolometric technique 1.3 cuore background budget 2 status of the cuore experiment 3 cuore-0 4 conclusions references acta polytechnica acta polytechnica 53(3):289–294, 2013 © czech technical university in prague, 2013 available online at http://ctn.cvut.cz/ap/ weakly ordered a-commutative partial groups of linear operators densely defined on hilbert space jiří janda∗ department of mathematics and statistics, faculty of science, masaryk university, kotlářská 2, cz-611 37 brno, czech republic ∗ corresponding author: 98599@mail.muni.cz abstract. 
the notion of a generalized effect algebra is presented as a generalization of effect algebra for an algebraic description of the structure of the set of all positive linear operators densely defined on a hilbert space with the usual sum of operators. the structure of the set of not only positive linear operators can be described with the notion of a weakly ordered partial commutative group (wop-group). due to the non-constructive algebraic nature of the wop-group we introduce its stronger version called a weakly ordered partial a-commutative group (woa-group). we show that it also describes the structure of not only positive linear operators. keywords: (generalized) effect algebra, partial group, weakly ordered partial group, hilbert space, unbounded linear operator, self-adjoint linear operator. ams mathematics subject classification: 06f05, (03g25, 81p10, 08a55). 1. introduction the notion of an effect algebra was presented by foulis and bennett in [3]. the definition was motivated by giving an algebraic description of positive self-adjoint linear operators between the zero and the identity operator in a complex hilbert space h. the notion of a generalized effect algebra extends these ideas on unbounded sets of positive linear operators. to answer the natural question concerning the structure of sets of not only positive linear operators paseka started to investigate a partially ordered commutative group of operators with a fixed domain [5]. in [6] paseka and janda introduced the structure of a weakly ordered partial commutative group (shortly a wop-group). they also showed that the set of all linear operators on complex hilbert space h with the usual sum, which is restricted to the same domain for unbounded operators (partial operation ⊕d), possesses this structure. in [4] we considered the structure on the important subset of self-adjoint operators, showing that it is also a wop-group. wop-groups have only a non-constructive associativity (the equation holds if and only if both sides are defined). it has been shown [4] that the set of all linear operators has generally stronger algebraic properties. this was a motivation for introducing the notion of a weakly ordered partial a-commutative group (woa-group) where the associative law is more constructive. also a weak order is more strongly related to the partial operation. moreover, every positive cone of a woa-group is a generalized effect algebra. on the other hand, we present a construction showing that every generalized effect algebra is a positive cone of some woa-group. 2. preliminaries we review some basic terminology, definitions and statements. the basic reference for this text is the book by dvurečenskij and pulmannová [2]. definition 1. a partial algebra (e, +, 0) is called a generalized effect algebra if 0 ∈ e is a distinguished element and + is a partially defined binary operation on e which satisfies the following conditions for any x,y,z ∈ e: (gei) x + y = y + x, if one side is defined, (geii) (x + y) + z = x + (y + z), if one side is defined, (geiii) x + 0 = x, (geiv) x + y = x + z implies y = z (cancellation law), (gev) x + y = 0 implies x = y = 0. in every generalized effect algebra e the partial binary operation − and relation ≤ can be defined by (ed) x ≤ y and y−x = z iff x + z is defined and x + z = y. then ≤ is a partial order on e under which 0 is the least element of e. a generalized effect algebra with the top element 1 ∈ e is called an effect algebra and we usually write (e, +, 0, 1). 
a subset s of e is called a sub-generalized effect algebra (sub-effect algebra) of e iff (i) 0 ∈ s (1 ∈ s), (ii) if out of elements x,y,z ∈ e such that x + y = z at least two are in s, then all x,y,z ∈ s. definition 2. [6] a partial algebra (g, +, 0) is called a commutative partial group if 0 ∈ e is a distinguished 289 http://ctn.cvut.cz/ap/ jiří janda acta polytechnica element and + is a partially defined binary operation on e which satisfies the following conditions for any x,y,z ∈ e: (gpi) x + y = y + x if x + y is defined, (gpii) (x + y) + z = x + (y + z) if both sides are defined, (gpiii) x + 0 is defined and x + 0 = x, (gpiv) for every x ∈ e there exists a unique y ∈ e such that x + y = 0 (we put −x = y), (gpv) x + y = x + z implies y = z (cancellation law). we say that a commutative partial group (g, +, 0) is weakly ordered (shortly a wop-group) with respect to a reflexive and antisymmetric relation ≤ on g if ≤ is compatible w.r.t. partial addition, i.e., for all x,y,z ∈ g, x ≤ y and both x+z and y +z are defined implies x + z ≤ y + z. due to the non-constructive algebraic nature of wop-groups, we will introduce a stronger structure with the notion of a woa-group. definition 3. a partial algebra (g, +, 0) is called an a-commutative partial group if 0 ∈ g is a distinguished element and + is a partially defined binary operation on g which satisfies the following conditions for any x,y,z ∈ g: (gi) x + y = y + x if x + y is defined, (gii) x + 0 is defined and x + 0 = x, (giii) for every x ∈ e there exists a unique y ∈ e such that x + y = 0 (we put −x = y), (giv) if (x + y) + z and (y + z) are defined, then x + (y + z) is defined and (x + y) + z = x + (y + z). an a-commutative partial group (g, +, 0) is called weakly ordered (shortly a woa-group) with respect to a reflexive and antisymmetric relation ≤ on g (we call it a weak order) if (ri) x ≤ y iff there exists 0 ≤ z, x + z is defined and x + z = y, (rii) 0 ≤ x,y and x + y defined, then 0 ≤ x + y, (riii) 0 ≤ x, 0 ≤ z, x ≤ y and y+z defined implies x + z defined. note that in the case of generalized effect algebras the partial order is induced from a partial operation. on the other hand, the weak order for woa-groups (wop-groups) can be chosen in various ways. we will show that the weak order is determined by the partial operation and the set of positive elements. remark 1. using the commutativity (gi) with the axiom (giv), we can obtain similar formulas to (giv) for any permutation of variables. that is, let us have an a-commutative group (g, 0, +). then for any x,y,z ∈ g, from the existence of (x + y) + z and x + z we have (x + y) + z = (x + z) + y. or similarly, the existence of x + (y + z) and x + y implies x + (y + z) = (x + y) + z and so on. in the following, we will be using (giv) in this more general sense and we omit mentioning the commutativity. definition 4. let (g, +, 0) be an (a-)commutative partial group and let s be a subset of g such as (si) 0 ∈ s, (sii) −x ∈ s for all x ∈ s, (siii) for every x,y ∈ s such that x + y is defined also x + y ∈ s. then we call s an (a-)commutative partial subgroup of g. let g be a wop-group (woa-group) with respect to a relation ≤g and let ≤s be a relation on a (a-)commutative partial subgroup s ⊆ g. if for all x,y ∈ s holds: x ≤s y if and only if x ≤g y, we call s a wop-subgroup (woa-subgroup) of g. lemma 1. let (g, +, 0) be an a-commutative partial group. then for any a,b,c,x,y,z ∈ g the following holds: (1.) a + c = b iff c = b + (−a), (2.) a + x = (a + y) + z implies x = (y + z), (3.) 
whenever a + b is defined, then (−a) + (−b) is defined and (−a) + (−b) = −(a + b). proof. (1.) we have c = c + (a + (−a)) = (c + a) + (−a) = b + (−a). (2.) let us have a + x = (a + y) + z. then x = ((a + y) + z) + (−a). since y + a and y + (a + (−a)) are defined then (y + a) + (−a) is defined. we have x = ((a+y)+z)+(−a) = (((a+y)+(−a))+z) = y+z. (3.) let us have a,b ∈ g such that a + b ∈ g. we have a = a + (b + (−b)) = (a + b) + (−b) defined. then by (1.) (−b) = a + (−(a + b)) and also by (1.) (−a) + (−b) = (−(a + b)). lemma 2. let (g, +, 0) w.r.t. ≤ be a woa-group. then for any a,b,c,x,y,z ∈ g the following holds: (1.) a ≤ b iff b + (−a) ≥ 0, (2.) a ≤ b iff −b ≤−a. proof. (1.) let a ≤ b. then there exists c ≥ 0 such that a + c = b and from lemma 1 (1.) b + (−a) ≥ 0. on the other hand, let b + (−a) ≥ 0. then b = b + (a + (−a)) = a + (b + (−a)) hence by (ri) a ≤ b. (2.) let a ≤ b for some a,b ∈ g. then (b + (−a)) ≥ 0 and −a = ((−b) + b) + (−a) = (b + (−a)) + (−b) ie. −b ≤−a. lemma 3. every a-commutative partial group (g, +, 0) is a partial commutative group. proof. the cancellation law follows from lemma 1 (2.) choosing z = 0. 290 vol. 53 no. 3/2013 weakly ordered a-commutative partial groups of linear operators lemma 4. every woa-group (g, +, 0) w.r.t. ≤ is a wop-group w.r.t. ≤. proof. by the previous lemma, we have (g, +, 0) is a partial commutative group. let us have x,y,z ∈ g, x ≤ y, x + z,y + z defined. then by lemma 2 (1.) 0 ≤ (y + (−x)) and x + (y + (−x)) = y hence y + z = (x + (y + (−x)) + z = (x + z) + (y + (−x)) and according to (rii) x + z ≤ y + z. example 1. let g = {0,a,b,c,−a,−b,−c} be a set with partial operation + defined for a + b = c and 0 + x = x, x + (−x) = 0 for all x ∈ g. then (g, +, 0) is a commutative partial group, but it is not an a-commutative partial group. note that even c + (−b) = (a + b) + (−b) is not defined. lemma 5. let (g, +, 0) be an a-commutative partial group and s ⊆ g its a-commutative partial subgroup. then (s, +/s, 0) is an a-commutative partial group. proof. immediately (gi) and (giv) follows from (siii), (gii) from (si), (giii) from (sii). lemma 6. let (g, +, 0) be a woa-group w.r.t. ≤ and s ⊆ g its woa-subgroup w.r.t. ≤s. then (s, +/s, 0) w.r.t. ≤s is a woa-group. proof. by the previous lemma (s, +/s, 0) is an acommutative partial group. for (ri) let x,y ∈ s, x ≤s y. then by lemma 2 (1.) 0 ≤ y + (−x) is defined and by (siii) y + (−x) ∈ s. the other direction is straightforward. (rii) is clear and (riii) follows from (siii). corollary 1. let (g, +, 0) be a wop-group w.r.t. ≤ and s ⊆ g its wop-subgroup w.r.t. ≤s. whenever (g, +, 0) w.r.t. ≤ is also a woa-group, then s is its woa-subgroup. theorem 1. let (g, +, 0) w.r.t. ≤ be a woa-group. then the set pos(g) = {x ∈ g | 0 ≤ x} with the restriction of the partial operation + on pos(g), i.e., (pos(g), +/ pos(g), 0) forms a generalized effect algebra. proof. (gei) and (geiii) hold from definition and the cancellation law follows from lemma 4. for the axiom (gev) let x,y ∈ pos(g) such that x + y = 0. then x = (−y) and with lemma 2 (2.) we have −y ≤ 0 hence x = y = 0. we will verify (geii). let us have x,y,z ∈ pos(g) such that (x + y) + z is defined. therefore y ≤ x + y, (x + y) + z exists and (riii) implies that y + z exists. using (giv) we have (x + y) + z = x + (y + z). lemma 7. let (g, +, 0) be an a-commutative partial group and e ⊆ g a subset closed under the +, i.e., x,y ∈ e,x+y ∈ g implies x+y ∈ e, such that 0 ∈ e and (e, +/e, 0) forms a generalized effect algebra. 
define a relation ≤ by x ≤ y iff (−x) + y is defined and ((−x) + y) ∈ e. then (g, +, 0) is a woa-group w.r.t. ≤, pos(g) = e and ≤ on pos(g) coincides with induced partial order ≤e from (e, +/e, 0). proof. reflexivity is clear since 0 ∈ e. let x ≤ y and y ≤ x, then −x + y,x + (−y) ∈ e and with (giv) (−x + x) + (−y + y) = (−x + y) + (x + (−y)) = 0. e is a generalized effect algebra hence by (gev) x + (−y) = −x + y = 0 that is x = y. clearly pos(g) = e. then (ri) is straightforward using the definition of ≤ and lemma 1 (1.). (rii) holds because we want e to be closed under the +. for (riii), let x,z ∈ e, x ≤ y and y + z be defined. then y + z = (x + ((−x) + y)) + z = (x + z) + ((−x) + y)) using the associativity of generalized effect algebra e since x,z, ((−x) + y) ∈ e. we show the coincidence. let us have −x + y,x,y ∈ e. because e is closed on + we have x + (−x + y) defined and x + (−x + y) = y, i.e., x ≤e y. on the other hand let x ≤e y, x,y ∈ e. then there exists z ∈ e such that x + z = y and by lemma 1 (1.) z = y + (−x). the previous lemma formalizes the idea of determining a weak order by the set of positive elements. let us have an a-commutative partial group and choose some elements to be positive. consider the smallest set closed under partial addition containing zero and the chosen elements. if the set has the form of a generalized effect algebra, then there exists such a weak order that our set is exactly the set of positive elements. on the other hand, it is not hard to show that if the set is not a generalized effect algebra, then there is no such weak order that all of our chosen elements are positive. theorem 2. let (e, +, 0) be a generalized effect algebra with induced order ≤. then there exists a woa-group (g,⊕, 0) w.r.t. relation ≤g such that (pos(g), ⊕/ pos(g), 0) = (e, +, 0) and ≤g/ pos(g)=≤. proof. let (e, +, 0) be a generalized effect algebra with induced order ≤. for any a,b ∈ e, a ≤ b, the symbol b−a denotes such an element that a+(b−a) = b. let e− be a set with the same cardinality disjoint from e. consider a bijection ϕ : e → e−. we set a− = ϕ(a) for a ∈ e r{0} and 0− = 0. let g = e ∪̇ ( e− r { ϕ(0) }) be a disjoint union of e and e− r{ϕ(0)}. let us define define a partial binary operation ⊕ on g by • a⊕ b exists iff a + b exists and then a⊕ b = a + b for all a,b ∈ e, • a− ⊕ b− exist iff a + b exists and then a− ⊕ b− = (a + b)− for all a,b ∈ e, • a⊕ b− = b− ⊕a is defined iff (1.) b ≤ a (a− b exists) then a⊕ b− = a− b or (2.) a ≤ b (b−a exists) then a⊕ b− = (b−a)− for any nonzero a,b ∈ e. 291 jiří janda acta polytechnica it is not hard to show that the definition is correct. for any a,b ∈ e it holds that (α1)(a⊕ b)− = (a + b)− = (a− ⊕ b−). let a⊕ b− be defined and (β1)b ≤ a (a−b exists), then (a⊕b−)− = (a−b)− = a− ⊕ b. we show that (g,⊕, 0) forms an a-commutative partial group. commutativity (gi) is clear from the definition. since 0 ∈ e it follows a⊕ 0 = (a + 0) = a and (a− ⊕ 0) = (a− 0)− = a− for all a ∈ e, that is (gii). clearly for any a ∈ e there exists a− ∈ e− where a⊕a− = a−a = 0. this defines also inverse elements for any a− ∈ e−. let us verify the associativity case by case. we assume that x,y,z ∈ e. case i. first, let us have (x⊕y)⊕z defined and y⊕z defined. then (x⊕y)⊕z = (x+y) +z = x+ (y +z) = x⊕(y⊕z) where the existence and the equation follow from the associativity of the generalized effect algebra e. case ii. let (x⊕y) ⊕z− and y⊕z− be defined. and let (α2) z ≤ y (i.e., y−z is defined). hence z ≤ (x + y) and (x⊕y) ⊕ z− = (x + y) − z. 
since y − z exists, we have y = (y − z) + z and then ((x + y) − z) + z = x + y = x + ((y − z) + z) = (x + (y − z)) + z, where we used the associativity of the generalized effect algebra e. by the cancellation law we have (x⊕y)⊕z− = (x + y)−z = x + (y−z) = x⊕(y⊕z−). (β2) y ≤ z and z ≤ (x + y) (that is z − y and (x + y) − z exist). we have z = (z − y) + y and also ((x + y) − z) + z = x + y. putting together ((z − y) + y) + ((x + y) − z) = x + y from which (z−y)+((x+y)−z) = x hence (x+y)−z = x−(z−y). so we have (x⊕y)⊕z− = (x + y)−z = x−(z−y) = x⊕(z⊕y−)− = x⊕(z−⊕y). the last equation holds by β1. (γ2) y ≤ z and (x+y) ≤ z (hence z−y and z−(x+y) exist). from (z−y)+y = (z−(x+y))+(x+y) we have z−y = (z−(x+y)) +x hence (z−y)−x = z−(x+y). therefore (x⊕y) ⊕z− = (z − (x + y))− = ((z −y) − x)− = (x⊕ (z −y)−) = x⊕ (y ⊕z−). case iii. let (x− ⊕y) ⊕z and y ⊕z be defined. let (α3) x ≤ y (i.e., y −x is defined and also x ≤ y ≤ y+z). then y = (y−x)+x and y+z = ((y−x)+x)+z hence (y+z)−x = (y−x)+z. therefore (x−⊕y)⊕z = (y −x) + z = (y + z) −x = x− ⊕ (y ⊕z). (β3) y ≤ x and (x − y) ≤ z (that is x − y and z−(x−y) are defined). since z + y is defined, we have from z = (z − (x−y)) + (x−y) with x = (x−y) + y the equation z +y = (z−(x−y)) +x and (z +y)−x = z − (x − y). hence (x− ⊕ y) ⊕ z = (x − y)− ⊕ z = z − (x−y) = (z + y) −x = x− ⊕ (y ⊕z). (γ3) y ≤ x and z ≤ (x − y) (hence x − y and (x − y) − z are defined). we have (x − y) − z = (x−y)−z from which x = ((x−y)−z) + y) + z hence x− (y + z) = (x−y) −z. and then (x− ⊕y) ⊕z = (x − y)− ⊕ z = ((x − y) − z)− = (x − (y + z))− = (x ⊕ (y ⊕ z)−)− = x− ⊕ (y ⊕ z) (the last equation follows from β1). case iv. let (x⊕y−) ⊕z and y− ⊕z be defined. let (α4) y ≤ x and y ≤ z (i.e., x − y and z − y are defined). using y = x−(x−y) and z = (z−y)+y we get (x−y) +z = (x−y) + ((z−y) +y) = (z−y) +x. hence (x⊕y−)⊕z = (x−y) + z = (z−y) + x = x⊕(y−⊕z). (β4) y ≤ x and z ≤ y (hence x−y and y − z are defined). similarly y = (y−z) + z and x = (x−y) + y and then x = (x − y) + ((y − z) + z) from which x − (y − z) = (x − y) + z. hence (x ⊕ y−) ⊕ z = (x−y)+z = x−(y−z) = x−(y⊕z−) = x⊕(y⊕z−)− = x⊕ (y− ⊕z) (the last equation follows from β1). (γ4) x ≤ y and y ≤ z which implies (y − x) ≤ z (hence y−x, z− (y−x) and z−y are defined). then z = (z −y) + y = (z −y) + ((y −x) + x) from which z − (y −x) = (z −y) + x and hence (x⊕y−) ⊕z = (y−x)−⊕z = z−(y−x) = (z−y) +x = x⊕(y−⊕z). (δ4) x ≤ y, (y − x) ≤ z and z ≤ y (i.e., y − x, z−(y−x) and y−z are defined). then y = (y−z)+z = (y−z) + ((z− (y−x)) + (y−x)) = (y−x) + x hence (y−z) + (z− (y−x)) = x from which (x⊕y−) ⊕z = (y−x)−⊕z = z−(y−x) = x−(y−z) = x−(y⊕z−) = x⊕ (y− ⊕z) (using β1). (�4) x ≤ y, z ≤ (y −x) and z ≤ y (that is y −x, (y−x)−z and y−z are defined). then y = (y−x)+x = ((y−x)−z)+z)+x = (y−z)+z. hence (y−x)−z)+x = (y−z) and (x⊕y−)⊕z = (y−x)−⊕z = ((y−x)−z)− = ((y −z) −x)− = (x⊕ (y −z)−) = x⊕ (y− ⊕z). until now, the map − : e → e− ∪ 0 has been defined only for elements of e. we can extend it on g by (a−)− = a, which gives us an involution on g. then for any a,b ∈ e whenever b − a is defined it holds (a ⊕ b−)− = ((b − a)−)− = (b − a) = a− ⊕ b. together with (α1) and (β1) we have (γ1)(a⊕ b−)− = a− ⊕ b for all a,b ∈ e, (δ1)(a−⊕b−)− = ((a⊕b)−)− = a⊕b for all a,b ∈ e case v. let (x−⊕y−)⊕z and y−⊕z be defined. then also y ⊕z− is defined and using (α1), (γ1) and (ii) we get (x−⊕y−)⊕z = ((x⊕y)⊕z−)− = (x⊕(y⊕z−))− = (x⊕ (y− ⊕z)). case vi. let (x⊕y−)⊕z− and y−⊕z− be defined. 
then y ⊕z is defined and using (α1), (γ1) and (iii) we have ((x⊕y−)⊕z−) = ((x−⊕y)⊕z)− = (x−⊕(y⊕z))− = (x⊕ (y− ⊕z−)). case vii. let (x− ⊕y) ⊕z− and y ⊕z− be defined. then y− ⊕z is defined and with (α1), (γ1) and (iv) we can see that ((x− ⊕y) ⊕z−) = ((x⊕y−) ⊕z)− = (x⊕ (y− ⊕z))− = (x− ⊕ (y ⊕z−)). case viii. let (x−⊕y−)⊕z− and y−⊕z− be defined. then y⊕z is defined and (x−⊕y−)⊕z− = (x⊕y)−⊕ z− = ((x⊕y)⊕z)− = (x⊕(y⊕z))− = x−⊕(y−⊕z−). 292 vol. 53 no. 3/2013 weakly ordered a-commutative partial groups of linear operators hence (g,⊕, 0) forms an a-commutative partial group. since e is a generalized effect algebra, by lemma 7 there exists a relation ≤g such that (g,⊕, 0) w.r.t. ≤g forms a woa-group. example 2. an interval [−1, 1] with ⊕ defined for 0 ≤ x,y by • x⊕y = x + y iff x + y ≤ 1, • x⊕ (−y) = x−y, • (−x) ⊕ (−y) = −(x⊕y) iff (x⊕y) exists and relation ≤g defined by x ≤g y iff x ≤ y and y −x ≤ 1 for all x,y ∈ [−1, 1] forms a woa-group. a positive cone ( [0, 1],⊕/[0,1], 0 ) forms a well-known unit interval effect algebra. 3. hilbert spaces we assume that h is an infinite-dimensional complex hilbert space, i.e., a linear space with inner product 〈· , ·〉 which is complete in the induced metric. the term dimension of h in the following always means the hilbertian dimension defined as the cardinality of any orthonormal basis of h (see [1]). moreover, we will assume that all considered linear operators a (i.e., linear maps a : d(a) →h) have a domain d(a) a linear subspace dense in h with respect to the metric topology induced by the inner product, so d(a) = h (we say that a is densely defined). we denote by d the set of all dense linear subspaces of h. by positive linear operators a, (denoted by a ≥ 0) it means that 〈ax,x〉≥ 0 for all x ∈ d(a). to every linear operator a : d(a) → h with d(a) = h there exists the adjoint operator a∗ of a such that d(a∗) = { y ∈ h | there exists y∗ ∈ h such that 〈y∗,x〉 = 〈y,ax〉 for every x ∈ d(a) } and a∗y = y∗ for every y ∈ d(a∗). when a∗ = a, a is called self-adjoint (for more details see [1]). recall that a linear operator a : d(a) → h is called a bounded operator if there exists a real constant c ≥ 0 such that ‖ax‖ ≤ c‖x‖ for all x ∈ d(a) and hence a is an unbounded operator if to every c ∈ r, c ≥ 0 there exists xc ∈ d(a) with ‖axc‖ > c‖xc‖. by symbol 0 we mean a null operator and it is a bounded operator. the set of all bounded operators on h is denoted by b(h). for every bounded operator a : d(a) →h densely defined on d(a) = d ⊂ h exists a unique extension b such as d(b) = h and ax = bx for every x ∈ d(a). we will denote this extension b = ab (see again [1]). definition 5. for an infinite-dimensional complex hilbert space h, let us define these sets of operators (in order to [4, 6]): • gr(h) = {a : d(a) →h| d(a) = h and d(a) = h if a is bounded} • grd (h) = {a ∈ gr(h) | d(a) = d or a is bounded} • sagr(h) = {a ∈gr(h) | a = a∗} • sagrd (h) = {a ∈ sagr(h) | d(a) = d or a is bounded} • v(h) = {a ∈gr(h) | a ≥ 0}. we also define an operation ⊕d on gr(h) by a⊕db =   a + b if a + b is unbounded and( d(a) = d(b) or one out of a,b is bounded ) , (a + b)b if a + b is bounded and d(a) = d(b), undefined otherwise, and operation ⊕u by a⊕u b = a⊕d b iff at least one of a,b ∈gr(h) is bounded or a,b ∈gr(h) are both unbounded, d(a) = d(b) and there exists a real number λab 6= 0 such that a−λ a bb is bounded[4]. for an arbitrary subset x ⊆gr(h) let us define a relation ≤xd (resp. ≤ x u ) such that for any a,b ∈ x, a ≤xd b (resp. 
a ≤ x u b) iff there exists a positive c ∈ x such that a⊕d c = b (resp. a⊕u c = b). theorem 3. let h be an infinite-dimensional complex hilbert space. then ( gr(h),⊕d, 0 ) w.r.t. relation ≤gr(h)d forms a woa-group. moreover,( grd (h),⊕d/grd (h), 0 ) w.r.t. relation ≤grd (h)d forms its woa-subgroup. proof. it has been shown [6] that ( gr(h),⊕d, 0 ) w.r.t. relation ≤gr(h)d forms a wop-group. moreover, in [4, lemma 4] the axiom (giv) is proved. axiom (ri) holds by definition. since ( v(h),⊕d/v(h), 0 ) =( pos(gr(h)),⊕d/p os(gr(h)), 0 ) is a generalized effect algebra [8], (rii) and (riii) hold. according to [6] ( grd (h), ⊕d/grd (h), 0 ) w.r.t. ≤gr(h)d is a wopsubgroup hence by corollary 1 it is also a woasubgroup. note that since the operation ⊕d/grd (h) is total on grd (h), it is also a partially ordered commutative group. theorem 4. let h be an infinite-dimensional complex hilbert space. then ( gr(h),⊕u, 0 ) w.r.t. relation ≤gr(h)u forms a woa-group. moreover,( grd (h), ⊕u/grd(h), 0 ) w.r.t. relation ≤grd (h)u ,( sagr(h),⊕u/sagr(h), 0 ) w.r.t. relation ≤sagr(h)u and ( sagrd (h),⊕u/sagrd(h), 0 ) w.r.t. relation ≤sagrd (h)u form its woa-subgroups. proof. we have shown [6] that ( gr(h),⊕u, 0 ) w.r.t. relation ≤gr(h)u forms a wop-group. in [4, lemma 6]. is proved the axiom (giv). (ri) holds by definition. because ( v(h)⊕u/v(h), 0 ) =( pos(gr(h)),⊕u/p os(gr(h)), 0 ) is a generalized effect algebra [4, 7], we have (rii) and (riii). according to [4] ( grd (h),⊕u/grd(h), 0 ) w.r.t. relation ≤grd (h)u , 293 jiří janda acta polytechnica ( sagr(h),⊕u/sagr(h), 0 ) w.r.t. relation ≤sagr(h)u and ( sagrd (h),⊕u/sagrd(h), 0 ) w.r.t. ≤sagrd (h)u are wop-subgroups hence by using corollary 1 they are woa-subgroups. acknowledgements the author gratefully acknowledges financial support from masaryk university under grant 0964/2009 and financial support from esf project cz.1.07/2.3.00/20.0051 algebraic methods in quantum logic of the masaryk university. references [1] blank j., exner p., havlíček m., hilbert space operators in quantum physics, 2nd edn. springer, berlin (2008). [2] dvurečenskij a., pulmannová s., new trends in quantum structures, kluwer acad. publ., dordrecht/ister science, bratislava, 2000. [3] foulis d. j., bennett m. k., effect algebras and unsharp quantum logics, found. phys. 24 (1994), 1331–1352. [4] janda j., weakly ordered partial commutative group of self-adjoint operators densely defined on hilbert space, tatra mt. math. publ., 50 (2011), 1-16. [5] paseka j., pt -symmetry in (generalized) effect algebras, internat. j. theoret. phys., 50 (2011), 1198–1205. [6] paseka j., janda j., more on pt -symmetry in (generalized) effect algebras and partial groups, acta polytechnica, 51 (2011), no. 4, 65–72. [7] paseka j., riečanová z., considerable sets of linear operators in hilbert spaces as generalized effect algebras. found. phys. 41 (2011), 1634–1647. [8] riečanová z., zajac m. and pulmannová s., effect algebras of positive linear operators densely defined on hilbert spaces, reports on mathematical physics 68, (2011), 261–270. 
doi:10.14311/ap.2014.54.0430
Acta Polytechnica 54(6):430-438, 2014. © Czech Technical University in Prague, 2014. Available online at http://ojs.cvut.cz/ojs/index.php/ap

Local velocity profiles measured by PIV in a vessel agitated by a Rushton turbine
Radek Šulc*, Vít Pešava, Pavel Ditl
Czech Technical University in Prague, Faculty of Mechanical Engineering, Department of Process Engineering, Technická 4, 166 07 Prague, Czech Republic
* Corresponding author: radek.sulc@fs.cvut.cz

Abstract. The hydrodynamics and the flow field were measured in an agitated vessel using 2-D time-resolved particle image velocimetry (2-D TR PIV). The experiments were carried out in a fully baffled cylindrical flat-bottomed vessel 300 mm in inner diameter, agitated by a Rushton turbine 100 mm in diameter. The velocity fields were measured for three impeller rotation speeds, 300 rpm, 450 rpm and 600 rpm, corresponding to Reynolds numbers in the range 50000 < Re < 100000, i.e., fully developed turbulent flow. In accordance with the theory of mixing, the dimensionless mean and fluctuation velocities in the measured directions were found to be constant and independent of the impeller rotational speed. The velocity profiles were averaged and expressed by Chebyshev polynomials of the first kind. Although the experimentally investigated area was relatively far from the impeller and was located in the upward flow towards the impeller, no state of local isotropy was found: the ratio of the axial RMS fluctuation velocity to the radial component was found to be in the range from 0.523 to 0.768. The axial turbulence intensity was found to be in the range from 0.293 to 0.667, which corresponds to a high turbulence intensity.

Keywords: mixing, Rushton turbine, particle image velocimetry, flow, velocity profile, mean velocity, fluctuation velocity.

1. Introduction
It is important to know the flow and the flow pattern in an agitated vessel in order to determine many impeller and turbulence characteristics, e.g., the impeller pumping capacity, the intensity of turbulence, the turbulent kinetic energy, the convective velocity and the turbulent energy dissipation rate. The information and data obtained can also be used for CFD verification. Drbohlav et al. [1] experimentally investigated the velocity field in the stream discharging from a Rushton turbine at Reynolds numbers Re = 146000 and Re = 166000. They described the axial profiles of the mean velocity components in this region using a phenomenological three-parameter model based on a tangential cylindrical jet, proposed by [2]. Obeid et al. [3] used the proposed model to describe the velocity field in the discharge flow produced by various types of turbine impellers.

The flow in a mechanically agitated vessel should be divided into several regions in which the flow behaviour is quite different. Fořt et al. [4] divided the flow in a vessel agitated by a Rushton turbine into the following regions: (1.) region O, in which the stream discharges from the impeller; (2.) region A, close to the wall, in which the flow direction changes from radial to axial; (3.) region B, which contains the predominantly ascending and descending sections of flow along the vessel wall; (4.) region C, in which the flow direction changes from axial to radial at the vessel bottom or at the liquid surface; (5.)
region D, which contains the prevailing radial flow at the vessel bottom or at the liquid surface; (6.) region E, in which the flow at the bottom or at the surface turns towards the direction of the vessel axis; (7.) region F, which contains the predominantly ascending or descending flow along the vessel axis towards the impeller; (8.) region G, in which the streamline pattern is stagnant and unstable, and neither of the mean velocity components is significant. The flow pattern, characterized by the mean velocity components, is described by the proposed theoretical model based on the Stokes stream function.

Although the flow discharging from an impeller has been investigated by many authors, the regions outside the impeller region have not been treated with the same level of interest [5]. The aim of this work is to study the scaling of the velocity field outside the impeller region in a vessel mechanically agitated by a Rushton turbine in the fully turbulent regime, at high Reynolds numbers in the range 50000 < Re < 100000. The hydrodynamics and the flow field were measured in the agitated vessel using time-resolved particle image velocimetry (TR PIV). The radial profiles of the mean and fluctuation velocities are expressed in terms of Chebyshev polynomials.

2. Theoretical background
2.1. Inspection analysis of flow in an agitated vessel
The flow of a Newtonian fluid in an agitated vessel is described by the Navier-Stokes equation:

$\rho\left(\dfrac{\partial \vec{u}}{\partial t} + \vec{u}\cdot\nabla\vec{u}\right) = -\nabla p + \mu\nabla^2\vec{u} + \rho\vec{g}$. (1)

This equation can be rewritten in dimensionless form as follows (e.g. [6]):

$\dfrac{\partial \vec{u}^*}{\partial t^*} + \vec{u}^*\cdot\nabla^*\vec{u}^* = -\nabla^* p^* + \dfrac{1}{Re}\nabla^{*2}\vec{u}^* + \dfrac{1}{Fr}\vec{n}^*$, (2)

where the dimensionless properties are defined as follows:
• dimensionless instantaneous velocity $\vec{u}^* = \vec{u}/(nD)$;
• dimensionless instantaneous pressure $p^* = p/(\rho n^2 D^2)$;
• dimensionless space gradient $\nabla^* = D\,\nabla$;
• Reynolds number $Re = nD^2/\nu$;
• Froude number $Fr = n^2 D/g$;
and where $\vec{n}$ is a unit vector. Similarly, the continuity equation for stationary flow of an incompressible fluid, given as

$\nabla\cdot\vec{u} = 0$, (3)

can be rewritten in the dimensionless form

$\nabla^*\cdot\vec{u}^* = 0$. (4)

The following relations for the dimensionless velocity components and the dimensionless pressure, respectively, can be obtained by inspection analysis of Eqs. (2) and (4):

$\vec{u}^* = f_1(\vec{x}^*, t^*, Re, Fr)$, (5)

$p^* = f_2(\vec{x}^*, t^*, Re, Fr)$, (6)

where $\vec{x}^*$ is a dimensionless position vector. For stationary flow with a periodic character, the time dependence of the velocity can be eliminated by replacing the velocity and pressure with their time-averaged values. For highly turbulent flow in a baffled vessel, the viscous and gravitational forces can be neglected, and finally the time-averaged dimensionless velocity components and pressure are independent of the Reynolds number and the Froude number and depend on location only:

$\bar{u}_i^* = f_1(\vec{x}^*)$, $\quad \bar{p}^* = f_2(\vec{x}^*)$. (7)

Reynolds decomposition of the instantaneous velocity components has been applied to the velocity profiles studied in this work.

2.2. Mean and fluctuation velocity
Using PIV, the instantaneous velocity data set $u_i(t_j)$ in the $i$-th direction, for $j = 1, 2, \ldots, N_R$, at observation times $t_j$ with an equidistant time step $\Delta t_s$ (i.e., $\Delta t_s = t_{j+1} - t_j$) was obtained at a given location. Assuming the so-called ergodic hypothesis, the time-averaged mean velocity $\bar{u}_i$ was determined as the average value of the velocity data set $u_i(t_j)$:

$\bar{u}_i = \dfrac{1}{N_R}\sum_{j=1}^{N_R} u_i(t_j)$, (8)

where $\bar{u}_i$ is the mean velocity in the $i$-th direction, $u_i(t_j)$ is the instantaneous velocity in the $i$-th direction at observation time $t_j$, and $N_R$ is the number of items in the velocity data set. Consequently, the fluctuation velocity $u_i'(t_j)$ in the $i$-th direction at observation time $t_j$ is obtained by decomposition of the instantaneous velocity:

$u_i'(t_j) = u_i(t_j) - \bar{u}_i$ for $j = 1, 2, \ldots, N_R$, (9)

The root-mean-squared (RMS) fluctuation velocity is determined as follows:

$u_i' = \left(\dfrac{1}{N_R}\sum_{j=1}^{N_R} u_i'(t_j)^2\right)^{1/2}$, (10)

where $u_i'$ is the RMS fluctuation velocity and $u_i'(t_j)$ is the fluctuation velocity at observation time $t_j$.
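The statistics of eqs. (8)-(10) are straightforward to compute from a sampled record; the following is a minimal NumPy sketch (our own naming, not the authors' processing code):

```python
import numpy as np

def velocity_statistics(u, n, D):
    """Mean, RMS fluctuation and their dimensionless forms for one record u[j].
    u: sampled velocities u_i(t_j) [m/s]; n: impeller speed [1/s]; D: diameter [m]."""
    u = np.asarray(u, dtype=float)
    u_mean = u.mean()                       # eq. (8): time-averaged mean velocity
    u_fluct = u - u_mean                    # eq. (9): Reynolds decomposition
    u_rms = np.sqrt(np.mean(u_fluct ** 2))  # eq. (10): RMS fluctuation velocity
    return u_mean / (n * D), u_rms / (n * D)
```

Dividing by nD gives the dimensionless quantities whose independence of the impeller speed is tested in Section 4.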
3. Experimental
The hydrodynamics and the flow field were measured in an agitated vessel using time-resolved particle image velocimetry (TR PIV). The experiments were carried out in a fully baffled cylindrical flat-bottomed vessel 300 mm in inner diameter [7]. The tank was agitated by a Rushton turbine 100 mm in diameter, i.e., the dimensionless impeller diameter D/T was 1/3. The dimensionless impeller clearance C/D, taken from the lower impeller edge, was 0.75. The tank was filled with degassed distilled water to a liquid height of 300 mm, i.e., the dimensionless liquid height H/T was 1. The dimensionless baffle width b/T was 1/10. To prevent air suction, the vessel was covered by a lid. The velocity fields were measured for three impeller rotation speeds, 300, 450 and 600 rpm, at which fully developed turbulent flow was reached. Distilled water at a temperature of 23 °C (density ρ = 997.4 kg m⁻³, dynamic viscosity µ = 0.9321 mPa s) was used as the agitated liquid.
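As a quick check of the operating regime (a back-of-the-envelope calculation, not taken from the paper), the impeller Reynolds number Re = nD²/ν for the three speeds is:

```python
rho = 997.4        # water density at 23 degC [kg/m^3]
mu = 0.9321e-3     # dynamic viscosity [Pa s]
D = 0.100          # impeller diameter [m]
nu = mu / rho      # kinematic viscosity [m^2/s]

for rpm in (300, 450, 600):
    n = rpm / 60.0                 # impeller speed [1/s]
    print(rpm, "rpm -> Re =", round(n * D ** 2 / nu))
# -> about 5.4e4, 8.0e4 and 1.1e5, i.e. the fully turbulent regime stated above.
```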
the velocity profiles in four horizontal planes located in the investigated area in the positions z∗ = 0.1114, 0.1294, 0.1474 and 0.1654 are presented in this paper in the dimensionless radius range r∗ = 〈0.0702÷0.3509〉. the investigated area corresponds to region f according to the classification given by fořt et al. (1982). the ensemble averaged velocity field method can be used due to the axisymmetric character of the flow in this region. for all experiments, 5000 images were taken at a sampling interval of 0.001 s, i.e., the total record time length was 5 s. unfortunately, a recording time of only 3.865 s was available for the 300 rpm measurement, due to damage to the storage disk. 4. experimental data evaluation according to the inspection analysis, the dimensionless velocities normalized by the product of impeller speed n and impeller diameter d should be independent of the reynolds number. for a single impeller size and a single liquid, this dimensionless velocity should be independent of the impeller rotational speed. the effect of impeller rotational speed on dimensionless velocities was tested by hypothesis testing [8]. the statistical method for hypothesis testing can estimate whether the differences between the predicted parameter values (e.g., predicted by some proposed theory) and the parameter values evaluated from the measured data are negligible. in this case, we assumed dependence of the tested parameter on the impeller rotational speed, described by the simple power law parameter = bnβ, and the difference between predicted exponent βpred and evaluated exponent βcalc figure 1. scheme of the experimental apparatus and the investigated area. was tested. the hypothesis test characteristics are given as t = (βcalc−βpred)/sβ where sβ is the standard error of parameter βcalc. if the calculated |t| value is less than the critical value of the t-distribution for m − 2 degrees of freedom and significance level α, the difference between βcalc and βpred is statistically negligible (statisticians state: “the hypothesis cannot be rejected”). in our case, the independence of dimensionless velocities from the impeller speed was tested as the hypothesis, i.e., parameter = bn0 = const., i.e., βpred = 0. the t-distribution coefficient tm−2,α for three impeller rotational speeds and significance level α = 0.05 is 12.706. hypothesis testing was performed for each point in the investigated profiles. the hypothesis test results are presented for each profile in table 1 by the percentage of points in which the above-formulated hypothesis parameter = const. is satisfied, and by the percentage of points in which the hypothesis parameter = const. cannot be accepted. for illustration, the average calculated |t| values are also presented here. 432 vol. 54 no. 6/2014 local velocity profiles hypothesis: parameter = bn0 percentage; t-characteristics |t| parameter ur/(nd) ur/(nd) uax/(nd) uax/(nd) profile z∗ = 0.1114 acceptable 92.4 %; 4.1 100 %; 0.7 100 %; 3.4 100 %; 0.9 not acceptable 7.6 %; 46.7 0 % 0 % 0 % profile z∗ = 0.1294 acceptable 98.7 %; 0.76 100 %; 0.6 95 %; 2.9 96.2 %; 1.2 not acceptable 1.3 %; 185 0 % 5 %; 57 3.8 %; 47.7 profile z∗ = 0.1474 acceptable 100 %; 0.9 100 %; 0.6 92.4 %; 4.1 93.7 %; 1.1 not acceptable 0 % 0 % 7.6 %; 21.5 6.3 %; 22.3 profile z∗ = 0.1654 acceptable 95 %; 2.4 100 %; 0.73 86.1 %; 2.81 97.5 %; 1.44 not acceptable 5 %; 28.1 0 % 13.9 %; 51.3 2.5 %; 18.1 table 1. dimensionless velocities – effect of impeller speed. figure 2. 
(figure 2. dimensionless radial mean velocity profile – effect of impeller speed; z∗ = 0.1474. figure 3. dimensionless radial fluctuation velocity profile – effect of impeller speed; z∗ = 0.1474. figure 4. dimensionless axial mean velocity profile – effect of impeller speed; z∗ = 0.1474. figure 5. dimensionless axial fluctuation velocity profile – effect of impeller speed; z∗ = 0.1474.)

for illustration, the profiles of the dimensionless velocities for the three impeller speeds and their average are presented in figures 2–5 for the plane z∗ = 0.1474. the dimensionless radial mean velocities were found to be close to zero; as shown in fig. 2, some velocity values are positive, while others are negative. these findings correspond to the characteristics of the given zone according to fořt et al. (1982): this region contains predominantly ascending flow along the vessel axis towards the impeller. the dimensionless axial mean velocity was found to be in the range from 0.323 to 0.488. these values are higher than the dimensionless radial mean velocities, as expected for this region. the tested hypothesis can be accepted in almost all profile points. the dimensionless radial rms fluctuation velocities were found to be in the range from 0.194 to 0.326; the tested hypothesis can be accepted in all profile points, as is signalled by the very low calculated |t| values. the dimensionless axial rms fluctuation velocities were found to be in the range from 0.139 to 0.204; the tested hypothesis can be accepted in the majority of profile points, as is again signalled by the low calculated |t| values. because the selected position is relatively far from the impeller and outside the impeller discharge flow, we expected the local isotropy state, defined on the length-scale level corresponding to the integral length scale, to manifest itself as equality of the fluctuation velocity components. however, this expectation was not confirmed: the ratio of the axial rms fluctuation velocity to the radial component was found to be in the range from 0.523 to 0.768. on the basis of the results of this hypothesis test, we assume that all dimensionless velocities can be statistically taken as constant and independent of the impeller rotational speed. the velocity profiles were therefore averaged and expressed by a polynomial approximation written in chebyshev form.

profile        a0x10^2   a1x10^2   a2x10^2   a3x10^3   a4x10^3   iyx†     δr ave/δr max‡
radial mean velocity u_r/(nd)
z* = 0.1114    -0.8872    8.407    -2.716    -2.484     4.457    0.9988   30.7/1459
z* = 0.1294    -2.893     6.289    -2.276    -2.649     4.652    0.9987   30.8/1249
z* = 0.1474    -4.727     5.097    -1.772    -3.677     3.809    0.998     6.5/22
z* = 0.1654    -6.457     3.907    -1.455    -4.148     3.322    0.9957    3.8/10.5
axial mean velocity u_ax/(nd)
z* = 0.1114    36.37      7.362     0.7714    3.053     4.342    0.9967    0.82/2.2
z* = 0.1294    39.74      7.974    -0.6153    8.866     0.1507   0.9975    0.7/1.7
z* = 0.1474    41.36      8.17     -1.387     7.645     2.016    0.9953    0.9/4.6
z* = 0.1654    42.31      7.801    -2.142     8.041     7.179    0.9975    0.6/3.2
radial rms fluctuation velocity u'_r/(nd)
z* = 0.1114    27.84     -8.426     1.025     8.843     3.06     0.9983    1/1.8
z* = 0.1294    27.08     -7.825     0.1065    8.521     5.46     0.9987    0.73/2.6
z* = 0.1474    26.44     -7.295    -0.5566    8.669     3.414    0.9986    0.8/3.8
z* = 0.1654    25.5      -6.929    -0.3675    7.561     6.272    0.9969    1.1/2.8
axial rms fluctuation velocity u'_ax/(nd)
z* = 0.1114    14.65     -1.313     1.212     0.2547    1.802    0.977     1.14/4.7
z* = 0.1294    15.29     -2.063     1.151     1.144     1.028    0.9759    1.8/4.1
z* = 0.1474    16.07     -2.347     1.317     0.2177    2.836    0.9828    1.6/4.6
z* = 0.1654    16.78     -2.84      1.746    -3.173     1.911    0.994     0.91/6.5
† correlation index. ‡ relative error of velocity: average/maximum absolute value.
table 2. profiles of dimensionless velocity components – coefficients of polynomial approximation in chebyshev form.
4.1. the profile as a function of the dimensionless radius
the velocity profiles were described as functions of the dimensionless radius r∗ in the range $r^* \in \langle r^*_{\rm lower}; r^*_{\rm upper}\rangle$ using chebyshev polynomials of the 1st kind (see [9], based on the original work of chebyshev [10]):
$$\text{parameter} = \sum_{k=0}^{4} a_k\,T_k(x_{\rm ch}), \qquad (11)$$
where $a_k$ are the coefficients of the polynomial approximation in chebyshev form, $x_{\rm ch}$ is the chebyshev polynomial variable, and $T_k(x_{\rm ch})$ are the chebyshev polynomials, defined as
$$T_0 = 1, \quad (12) \qquad T_1 = x_{\rm ch}, \quad (13) \qquad T_2 = 2x_{\rm ch}^2 - 1, \quad (14) \qquad T_3 = 4x_{\rm ch}^3 - 3x_{\rm ch}, \quad (15) \qquad T_4 = 8x_{\rm ch}^4 - 8x_{\rm ch}^2 + 1. \quad (16)$$
this fourth-order polynomial approximation was found to be sufficient for a quality description of the velocity profiles in this region. the chebyshev polynomial variable $x_{\rm ch}$ was calculated as
$$x_{\rm ch} = \frac{2r^* - (r^*_{\rm lower} + r^*_{\rm upper})}{r^*_{\rm upper} - r^*_{\rm lower}}, \qquad (17)$$
where $r^*$ is the dimensionless radius at a given point, defined as the ratio of the radius r at the given point to the tank radius t/2, and $r^*_{\rm lower}$ and $r^*_{\rm upper}$ are the lower and upper limit values of the dimensionless radius. the evaluated coefficients of the chebyshev polynomials are presented in table 2. a comparison of the averaged velocity profiles and the regression polynomials is presented in figures 6–9 for all four horizontal positions. the extremely high relative error values obtained for the dimensionless radial mean velocity occur for values close to zero. for faster calculation of the velocity at a given point, eq. (11) can be rewritten in the form
$$\text{parameter} = b_0 + b_1 x_{\rm ch} + b_2 x_{\rm ch}^2 + b_3 x_{\rm ch}^3 + b_4 x_{\rm ch}^4, \qquad (18)$$
where
$$b_0 = a_0 - a_2 + a_4, \quad (19) \qquad b_1 = a_1 - 3a_3, \quad (20) \qquad b_2 = 2a_2 - 8a_4, \quad (21) \qquad b_3 = 4a_3, \quad (22) \qquad b_4 = 8a_4. \quad (23)$$

(figure 6. dimensionless radial mean velocity profile ū_r/(nd) = f(r∗). figure 7. dimensionless radial fluctuation velocity profile u′_r/(nd) = f(r∗). figure 8. dimensionless axial mean velocity profile ū_ax/(nd) = f(r∗). figure 9. dimensionless axial fluctuation velocity profile u′_ax/(nd) = f(r∗).)

a brief numerical sketch of evaluating eq. (11) is given below.
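as a minimal sketch (not from the original paper), eq. (11) with the mapping (17) can be evaluated directly from the tabulated coefficients; the row used below is the axial mean velocity at z∗ = 0.1474 from table 2, with the 10⁻² and 10⁻³ factors implied by the table header:

```python
import numpy as np

# coefficients a0..a4 for u_ax/(nd) at z* = 0.1474 from table 2
# (a0..a2 carry the 10^-2 factor, a3 and a4 the 10^-3 factor of the header)
a = np.array([41.36e-2, 8.17e-2, -1.387e-2, 7.645e-3, 2.016e-3])
r_lo, r_up = 0.0702, 0.3509                 # limits of the dimensionless radius

def profile(r_star):
    """u_ax/(n d) at dimensionless radius r*, eq. (11) in chebyshev form."""
    x = (2 * r_star - (r_lo + r_up)) / (r_up - r_lo)     # eq. (17)
    # chebval evaluates the same sum over t_0..t_4 of eqs. (12)-(16)
    return np.polynomial.chebyshev.chebval(x, a)

r = np.linspace(r_lo, r_up, 5)
print(np.round(profile(r), 4))   # values of order 0.31-0.49, cf. the
                                 # 0.323-0.488 range quoted in the text
```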
4.2. intensity of turbulence
the axial turbulence intensity was calculated as
$$ti_{ax} = u'_{ax}/\bar{u}_{ax}, \qquad (24)$$
where $u'_{ax}$ is the axial rms fluctuation velocity and $\bar{u}_{ax}$ is the axial mean velocity. for dimensionless velocities independent of the impeller rotational speed, the turbulence intensity should also be independent of the impeller rotational speed. this independence was again tested by hypothesis testing. the t-distribution coefficient $t_{m-2,\alpha}$ for three impeller rotational speeds and significance level α = 0.05 is 12.706. hypothesis testing was performed for each point in the investigated profiles. the hypothesis test results for each profile are presented in table 3.

hypothesis: parameter = b n^0          percentage of points; average |t|
profile          z* = 0.1114    z* = 0.1294    z* = 0.1474    z* = 0.1654
acceptable       96.2 %; 1.91   100 %; 1.2     92.4 %; 1.64   93.7 %; 1.6
not acceptable    3.8 %; 207      0 %           7.6 %; 183     6.3 %; 28.1
table 3. axial turbulence intensity – effect of impeller speed.

the tested hypothesis can be accepted in almost all profile points, as is again signalled by the low calculated |t| values. on the basis of these hypothesis test results, we assume that the axial turbulence intensity can be statistically taken as constant and independent of the impeller rotational speed, as expected. the radial profiles of the axial turbulence intensity were averaged and expressed by the polynomial approximation in chebyshev form. the evaluated coefficients are presented in table 4; the radial profiles are presented in fig. 10 for the given horizontal positions. as shown there, the calculated values are in the range from 0.293 to 0.667; these values correspond to high turbulence intensity. as expected, the highest turbulence intensity was found close to the impeller axis, in the ascending flow core. the calculated axial turbulence intensity values were found to be approximately the same in each horizontal profile (see fig. 10). the effect of the dimensionless profile height on the turbulence intensity was therefore tested by hypothesis testing for each point in the investigated profiles. the t-distribution coefficient $t_{m-2,\alpha}$ for the four horizontal planes and significance level α = 0.05 is 4.3027. the percentage of points in which the hypothesis ti_ax = const. is satisfied was found to be 73.4 %, with an average calculated |t| value of 1.6; the percentage of points in which the hypothesis cannot be accepted was found to be 26.6 %, with an average calculated |t| value of 8.41. on the basis of this hypothesis test result, we assume that the axial turbulence intensity can be statistically taken as constant and independent of the dimensionless profile height in the investigated area. the radial profiles in the four horizontal planes were averaged and expressed by the polynomial approximation in chebyshev form; the evaluated coefficients, denoted as "area", are presented in table 4.

profile        a0x10^2   a1x10^2   a2x10^2   a3x10^3   a4x10^3   iyx†     δr ave/δr max‡
ti_ax
z* = 0.1114    41.64    -12.34     3.817     -4.749    1.637    0.9928    1.8/5.3
z* = 0.1294    40.06    -13.94     5.261    -12.62     6.243    0.9933    2.2/4.4
z* = 0.1474    40.49    -14.58     6.237    -14.92     7.515    0.9968    1.7/5.4
z* = 0.1654    41.37    -15.59     7.88     -24.23     1.472    0.9981    1.2/10.1
area           40.89    -14.11     5.799    -14.13     4.217    0.9979    1.3/4.6
† correlation index. ‡ relative error of velocity: average/maximum absolute value.
table 4. profiles of axial turbulence intensity – coefficients of polynomial approximation in chebyshev form.

(figure 10. axial turbulence intensity profile ti_ax = f(r∗).)

5. conclusions
the following results have been obtained:
(1.) the hydrodynamics and the flow field were measured in a vessel 300 mm in inner diameter agitated by a rushton turbine using 2-d time resolved particle image velocimetry (2-d tr piv).
the velocity fields were measured in the zone of upward flow to the impeller for three impeller rotation speeds, 300, 450 and 600 rpm, corresponding to reynolds numbers in the range 50000 < re < 100000.
(2.) the dimensionless radial mean velocities were found to be close to zero. these findings correspond to the characteristics of the given zone according to fořt et al. (1982): this region contains predominantly ascending flow along the vessel axis towards the impeller.
(3.) in accordance with the theory of mixing, the dimensionless mean and fluctuation velocities in the measured directions were found to be constant and independent of the impeller rotational speed. consequently, the velocity profiles were averaged and expressed by chebyshev polynomials of the 1st kind.
(4.) because the investigated area is relatively far from the impeller and outside the impeller discharge flow, we expected a local state of isotropy, defined on the length-scale level corresponding to the integral length scale by equality of the fluctuation velocity components. this expectation was not confirmed: the ratio of the axial rms fluctuation velocity to the radial component was found to be in the range from 0.523 to 0.768. this state will affect the determination of the turbulent energy dissipation rate in the given region.
(5.) the axial turbulence intensity was calculated and found to be in the range from 0.293 to 0.667, which corresponds to high turbulence intensity. as expected, the highest turbulence intensity was found close to the impeller axis, in the ascending flow core. it was found that the axial turbulence intensity can be statistically taken as constant and independent of the impeller rotational speed; the radial profiles of the axial turbulence intensity were therefore averaged and expressed by chebyshev polynomials of the 1st kind. the calculated values were found to be approximately the same in each horizontal profile, so the effect of the dimensionless profile height on the turbulence intensity was also tested by hypothesis testing for each point in the investigated profiles. it was found that the axial turbulence intensity can be characterized by a single formula in the investigated area.

acknowledgements
this research has been supported by grant agency of the czech republic project no. 101/12/2274 "local rate of turbulent energy dissipation in agitated reactors & bioreactors" and by ctu in prague project no. sgs14/061/ohk2/1t/12.

references
[1] drbohlav, j., fořt, i., máca, k., ptáček, j.: turbulent characteristics of discharge flow from the turbine impeller. coll. czechoslov. chem. comm., 1978, vol. 43, pp. 3148–3162
[2] drbohlav, j., fořt, i., krátký, j.: turbine impeller as a tangential cylindrical jet. coll. czechoslov. chem. comm., 1978, vol. 43, pp. 696–712
[3] obeid, a., fořt, i., bertrand, j.: hydrodynamic characteristics of flow in systems with turbine impellers. coll. czechoslov. chem. comm., 1983, vol. 48, pp. 568–577
[4] fořt, i., obeid, a., březina, v.: flow of liquid in a cylindrical vessel with a turbine impeller and radial baffles. coll. czechoslov. chem. comm., 1982, vol. 47, pp. 226–239
[5] kysela, b., konfršt, j., fořt, i., kotek, m., chára, z.: study of the turbulent flow structure around a standard rushton impeller. chem. and process eng., 2014, vol. 35, no. 1, pp. 137–147. doi: 10.2478/cpe-2014-0010
[6] novák, v., rieger, f., vavro, k.: hydraulické pochody v chemickém a potravinářském průmyslu. sntl, praha, 1989
[7] kotek, m., pešava, v., kopecký, v., jašíková, d., kysela, b.: piv measurement in a vessel of d = 0.3 m agitated by rushton turbine. research report for project no. 101/12/2274. liberec, 2012
[8] bowerman, b. l., o'connell, r. t.: applied statistics: improving business processes. richard d. irwin, usa, 1997, isbn 0-256-19386-x
[9] rivlin, t. j.: the chebyshev polynomials. pure and applied mathematics, wiley, new york, 1974
[10] chebyshev, p. l.: théorie des mécanismes connus sous le nom de parallélogrammes. mémoires des savants étrangers présentés à l'académie de saint-pétersbourg, 1854, vol. 7, pp. 539–586

acta polytechnica 54(6):430–438, 2014

acta polytechnica 53(supplement):518–523, 2013, doi:10.14311/ap.2013.53.0518

highlights from the lhc
arno straessner∗, on behalf of the atlas and cms collaborations
institut für kern- und teilchenphysik, tu dresden, 01062 dresden, germany
∗ corresponding author: straessner@physik.tu-dresden.de

abstract. the large hadron collider (lhc) and the two multi-purpose detectors, atlas and cms, have been operated successfully at record centre-of-mass energies of 7÷8 tev. this paper presents the main physics results from proton–proton collisions based on a total luminosity of 2 × 5 fb⁻¹. the most recent results from standard model measurements, standard model and mssm higgs searches, as well as searches for supersymmetric and exotic particles are reported. prospects for ongoing and future data taking are presented.

keywords: lhc, atlas, cms, standard model measurements, sm higgs boson, mssm higgs boson, supersymmetry, top quark, exotic particle searches.

1. introduction
the large hadron collider (lhc) [1] is a proton–proton (pp) and heavy-ion collider built in an underground tunnel at cern. four detectors are installed at the interaction points: the multi-purpose experiments atlas [2] and cms [3], the b-physics experiment lhcb [4], and the alice detector [5] designed to study heavy-ion physics. this paper concentrates on pp collision measurements with the atlas and cms detectors, and represents the status at the time of the conference (june 2012). in 2011 and 2012, pp centre-of-mass energies, √s, of 7 tev and 8 tev were reached. maximum instantaneous luminosities of 6.5 × 10³³ cm⁻² s⁻¹ and an average lhc operating time fraction of about 35 % with stable beams [6] allowed the lhc to deliver about 5.5 fb⁻¹ and 3 fb⁻¹ of data at 7 and 8 tev, respectively, to the atlas and cms detectors. the prospects for the data-taking period of 2012 are to collect about 12 fb⁻¹ of additional pp collision data per experiment at 8 tev. following the current lhc run, an 18-month period is foreseen to repair the lhc superconducting magnet splices and to prepare for operation at centre-of-mass energies of 13 tev or more. in addition, the peak luminosity is planned to be increased to 2 × 10³⁴ cm⁻² s⁻¹, which corresponds to twice the nominal value. the atlas and cms detectors ran in 2011 and 2012 with more than 97 % of the detector channels operational.
high data-taking efficiencies of about 94 % were achieved. the detectors were calibrated using control data samples. similarly, trigger and detector efficiencies were determined directly from data or derived from simulations which were verified with data. the high instantaneous luminosities lead to a pile-up of simultaneous pp collisions in one lhc proton bunch crossing: interesting hard-scattering reactions are overlaid with additional inelastic pp collisions, as shown in fig. 1.

(figure 1. display of a z → µ⁺µ⁻ candidate event recorded by the atlas detector, with 25 additional pp collision vertices reconstructed in the tracking detectors.)

the average number of pile-up events reached mean values of ≈ 12 in 2011 and ≈ 20 in 2012. the simulations to which the atlas and cms data are compared include event pile-up effects as measured in data. since the longitudinal momentum of the partonic process in a pp collision is a priori unknown, kinematic quantities like the transverse momentum, pt, and the transverse energy, et, of electrons, muons, tau leptons, and jets are often used in the analyses. momentum conservation in the plane transverse to the beam is applied to measure the missing transverse energy, etmiss (a toy numerical illustration of this definition is given below). event triggers and selection algorithms apply thresholds to these quantities in order to identify hard-scattering processes.

2. standard model measurements
the lhc experiments have performed an extensive measurement program of standard model (sm) pp collision processes. the aim is to establish whether sm reactions take place at the expected rate also at centre-of-mass energies of 7÷8 tev, to improve on precision measurements of sm parameters, and to develop a good understanding of those sm processes which form the background to searches for new physics. an important study is the measurement of gauge boson production, pp → w± + x, z + x, accompanied by additional quark and gluon jets. the corresponding final states with leptons, possibly missing transverse energy, and jets are typical background signatures for new particle searches. as an example, the measured pt spectra of the jet with the largest pt in w + jets events are shown in fig. 2 [7]. the data are well described by leading-order (lo) multi-parton event generators [8, 9], after normalisation to the next-to-next-to-leading-order (nnlo) total cross-section of inclusive w-boson production, and by nlo calculations [10]. the leptonic decays of w and z bosons, w → eνe, µνµ and z → e⁺e⁻, µ⁺µ⁻, are also studied to derive a detailed understanding of the parton density functions (pdf) of the colliding protons. in particular, the w± leptonic charge asymmetry and the z rapidity distributions are sensitive to the strange-to-down sea quark ratio, rs = 0.5(s + s̄)/d̄, which is derived to be 1.00+0.25−0.28 at a momentum transfer of q² = 1.9 gev² and a parton momentum fraction of x = 0.023 in an nnlo pqcd analysis [11]. this is significantly above previous pdf analyses. a similar measurement is done by studying w + c final states [12], which are dominantly produced from sea s-quarks. the σ(w + c + x)/σ(w + jets + x) cross-section ratio is found to be in agreement with nlo pdf predictions. in general, the production cross-sections of w and z bosons are measured in all leptonic final states, including tau leptons, and are found to agree with (n)nlo predictions, as illustrated in fig. 3.
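as a toy sketch (not from the paper), the missing transverse energy definition used throughout can be illustrated numerically — etmiss is the magnitude of the negative vector sum of the transverse momenta of the reconstructed objects; the four objects below are invented for the example:

```python
import numpy as np

# toy illustration of et_miss: negative vector sum of transverse momenta
pt  = np.array([45.0, 38.0, 27.0, 12.0])     # transverse momenta [gev]
phi = np.array([0.3, 2.8, -1.9, 1.1])        # azimuthal angles [rad]

px, py = pt * np.cos(phi), pt * np.sin(phi)
met_x, met_y = -px.sum(), -py.sum()
met = np.hypot(met_x, met_y)                 # magnitude of the missing et vector
print(f"et_miss = {met:.1f} gev")
```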
similarly, the production of two gauge bosons, ww, wz, zz, wγ, zγ, shows no deviations from sm expectations [14]; anomalous contributions to triple gauge boson vertices are thus further constrained. in the area of tests of the electroweak sm, the lhc collaborations will attempt to achieve precisions for electroweak parameters, like the w and top quark masses, mw and mtop, competitive with the current lep and tevatron results [15]. in preparation for the lhc w-mass measurement from the reconstructed lepton pt and w transverse-mass spectra, the pt distributions of w and z bosons are compared to nlo calculations [13]. the good agreement observed will enable the lhc experiments to control one important source of systematic uncertainty of the future w-mass measurement. top quarks are produced at the lhc individually or in pairs, and decay nearly exclusively to a w boson and a b-quark.

(figure 2. differential cross-section of w + jets events as a function of the pt of the jet with the highest pt in the event, as measured by atlas with 36 pb⁻¹ of 2010 data at √s = 7 tev (anti-kt r = 0.4 jets, pt > 30 gev, |y| < 4.4, for w + ≥ 1 ... ≥ 4 jets), compared with lo predictions by alpgen [8] and sherpa [9], normalized to the nnlo inclusive cross-section, and with nlo predictions using blackhat-sherpa [10].)

top pairs are detected using information from b-tagging of jets in single-lepton, di-lepton, tau + lepton, tau + jets, and all-hadronic final states. the measured cross-sections [16], summarized in fig. 4, reach relative precisions of 6÷8 %, mostly dominated by systematic uncertainties. the main sources of channel-dependent systematics are jet energy scales, b-tag uncertainties, the pile-up description, and signal and background modeling. full nnlo theory calculations will soon be necessary to meet the experimental precision. the top-quark pair events in the di-lepton, lepton + jets, and fully hadronic channels are further used to derive the mass of the top quark. mass-dependent event samples, so-called templates, are simulated, and the underlying top-quark mass is varied until the mass-sensitive distributions fit the data. important systematic uncertainties, like the light-quark jet energy scale, the b-jet energy scale, and signal and background modeling, are determined directly from data or by comparing data control samples with simulations. the results [17] obtained by atlas and cms are summarized in tab. 1.
(figure 3. ratios of cross-sections and related quantities for single and pair production of gauge bosons with decay into leptons, as measured by cms at √s = 7 tev, compared to (n)nlo predictions; shown are w, z, wγ, zγ, ww, wz, zz cross-sections, sin²θeff, the w/z and w± ratios, and the w/z + n-jet cross-section ratios, with a ±4 % luminosity uncertainty.)

(figure 4. measurements of the top-pair production cross-section in various final states by atlas, in comparison with approximate-nnlo theoretical predictions for mtop = 172.5 gev; the per-channel values range from about 167 to 200 pb, with the combination at 177 ± 3 +8−7 ± 7 pb. cms has obtained comparable results [16].)

the measurements are in good agreement with each other and with the top-quark mass measurements performed at the tevatron [15]. the precision of the lhc measurements is currently limited by systematic uncertainties, which are expected to improve with further understanding of the details of soft-qcd effects in top-pair production and decay, of the relevant background, and with more refined calibration procedures.

channel                       mtop [gev]
cms: di-lepton, lepton+jets   172.6 ± 0.4 ± 1.2
atlas: all-hadronic           174.9 ± 2.1 ± 3.8
atlas: lepton+jets            174.5 ± 0.6 ± 2.3
table 1. top-quark mass measurements performed at the lhc in different top-pair final states. the statistical and systematic uncertainties are given separately. the cms results do not yet include systematic uncertainties due to so-called colour reconnection and underlying event effects.

the physics of b-flavoured hadrons is being analysed in detail at the lhc, and new resonances like χb(3p) [18] and ξb [19] have been discovered. this gives an insight into the description of bound quark states, e.g. by lattice qcd [20]. furthermore, flavour-changing b-hadron decays are loop-suppressed in the sm and thus appear with very small branching ratios; new physics may increase these decay rates by orders of magnitude. as an example, the decay bs → µ⁺µ⁻ is predicted to appear at a relative rate of brsm(bs → µ⁺µ⁻) = (3.5 ± 0.3) × 10⁻⁹ [21], and is searched for by atlas, cms, and lhcb. isolated di-muon pairs in the relevant bs mass range are triggered and selected. background from hadronic bs decays and combinatorial background remains, but the analyses still allow upper limits to be put on the branching ratio, normalized by the b⁺ → j/ψk⁺ → µ⁺µ⁻k⁺ decay, of 2.2 × 10⁻⁸ (atlas [22]), 7.7 × 10⁻⁹ (cms [23]), and 4.5 × 10⁻⁹ (lhcb [24]). all results are in agreement with sm expectations, and no significant excess of events is observed.
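schematically, such a normalization converts the observed signal yield into a branching ratio through the yield of the reference decay; the relation below is a standard form of this normalization (a sketch for orientation only — the fragmentation-fraction ratio $f_u/f_s$, selection efficiencies $\varepsilon$, and yields $n$ are generic labels, and the precise treatment differs between the experiments):
$$\mathrm{br}(b_s \to \mu^+\mu^-) \;=\; \mathrm{br}(b^+ \to j/\psi k^+ \to \mu^+\mu^- k^+)\;\frac{f_u}{f_s}\;\frac{\varepsilon_{b^+}}{\varepsilon_{b_s}}\;\frac{n_{b_s}}{n_{b^+}}.$$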
3. searches for the standard model higgs boson
one of the major goals of the lhc physics program is to understand the source of electroweak symmetry breaking. within the sm, this symmetry breaking is introduced by an su(2) higgs doublet field, which gives mass to the gauge bosons and fermions; its excitations can be measured as a scalar higgs boson. at the lhc, the sm higgs boson is produced by gluon–gluon fusion, gg → h, vector-boson fusion, qqv∗v∗ → qqh, and higgs-strahlung, w/z → w/z + h, processes, as well as by associated production with top quarks, where the first two dominate the production cross-section [25]. atlas and cms search mainly for decays into photon pairs, h → γγ, and into gauge boson pairs with subsequent decay into leptons, neutrinos or quarks, h → zz∗ → 4ℓ, ℓℓqq, ℓℓνν, and h → ww∗ → ℓνℓν, ℓνqq. the γγ and 4ℓ final states provide excellent mass resolution. low-mass higgs bosons decay mostly into pairs of b-quarks, which is also searched for in associated production with w or z bosons. at low masses, h → ττ decays are analysed as well, though with limited mass resolution due to the unmeasured neutrinos in the final state. the background in each channel is estimated either from data, using one- and two-dimensional side-band methods and control samples, or by simulation after verifying its consistency with data in phase-space regions with negligible contributions from an sm higgs signal.

(figure 5. upper limit at 95 % cl for the production cross-section of an sm higgs boson relative to the sm expectation, for atlas (left) and cms (right), from the 2011 data at √s = 7 tev.)

combining all search channels that were analysed [26], a 95 % cl upper limit on the sm higgs production cross-section is derived by both atlas and cms, as displayed in fig. 5. with the 2011 data of in total 5 fb⁻¹ at √s = 7 tev, atlas excludes an sm higgs boson in the mass ranges 110 < mh < 117.5 gev, 118.5 < mh < 122.5 gev and 129 < mh < 539 gev, while cms excludes 127.5 < mh < 600 gev. at masses of about 120÷127 gev, both experiments observe an excess above the background expectation. the local probability p0 for a background-only experiment to be more signal-like than the observation corresponds to significances of about 3σ, for both atlas and cms. the total production rate observed would be compatible with the production of an sm higgs boson. however, the additional data taken in 2012 will have to be analysed to further study the mass region where the excess is observed.
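a minimal sketch (not from the paper) of the relation between the local p-value p0 quoted above and the gaussian significance z — p0 is the one-sided tail probability of a background-only fluctuation, so z = Φ⁻¹(1 − p0):

```python
from scipy.stats import norm

# convert a local p-value into a gaussian significance via the
# inverse survival function: z = phi^{-1}(1 - p0)
for p0 in (1.35e-3, 2.87e-7):
    z = norm.isf(p0)
    print(f"p0 = {p0:.2e}  ->  z = {z:.1f} sigma")
# a ~3 sigma excess, as reported, corresponds to p0 of about 1.3e-3
```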
4. searches for supersymmetry and supersymmetric higgs bosons
supersymmetric (susy) extensions of the sm predict scalar partners of the sm fermions and fermionic partners of the sm bosons, however at higher mass scales due to supersymmetry breaking. in r-parity conserving models, like the constrained minimal supersymmetric standard model (cmssm/msugra), the lightest susy particle (lsp) is stable. in the cmssm in particular, the weakly interacting neutralino represents this lsp and thus escapes undetected, with a typical signature of missing transverse energy, etmiss. furthermore, gluino–squark initiated susy particle production involves susy decay chains with multiple leptons and jets measured in the detector. susy particle production is thus being searched for with these signatures. background from sm processes is estimated directly from data, using multi-dimensional side-band methods. further signatures with etmiss are photons + jets, disappearing tracks, hadronic multijets, same-sign di-leptons, multi-leptons, same-sign di-leptons with b-jets, etc. recent analyses also look for exclusive production, e.g. direct gaugino production or 3rd-generation sparticle production with leptons and b-jets in the final state. r-parity violating scenarios are also studied. however, no significant excess of data is observed, either by atlas or by cms. as an example, the excluded parameter ranges in the cmssm/msugra scenario are shown in fig. 6 [28].

(figure 6. lhc results of searches for susy particles within the cmssm/msugra framework: the excluded parameter range for the universal scalar and gaugino masses, m0 and m1/2, at the gut scale is shown for cms (razor inclusive analysis, 4.4 fb⁻¹ at √s = 7 tev, hybrid cls 95 % cl limits, tan β = 10, a0 = 0, µ > 0, mt = 173.2 gev), while atlas obtained very similar results [28].)

within this model, gluino masses mgluino < 800 gev (cms) and 850 gev (atlas) can be excluded at 95 % cl, and under the assumption msquark = mgluino the exclusion limits are further improved to 1.2 tev (atlas) and 1.35 tev (cms) at 95 % cl, respectively. even if the excess in the search for a light neutral higgs boson with sm properties is confirmed with more data, this boson may be part of a larger set of higgs bosons, as predicted by supersymmetric extensions of the sm. in the mssm, two higgs doublets give rise to 5 physical higgs bosons: two cp-even, h and h, one cp-odd, a, and two charged higgs bosons, h±. the mssm higgs sector is described to first order by the mass of the cp-odd higgs, ma, and the ratio of the vacuum expectation values of the two higgs doublets, tan β. the latter parameter also modifies the coupling strength of up- and down-type fermions to the higgs bosons. for large tan β, down-type fermion couplings are enhanced, thus increasing the production mechanisms involving b-quarks. final states like h/h/a → ττ and h± → τν, cs are studied by atlas and cms. in the absence of a significant excess in the 2011 data, upper limits on the relevant production cross-sections are derived and interpreted as exclusions in the tan β–ma parameter plane in the mhmax scenario at 95 % cl, as displayed in fig. 7 [27]. assuming the absence of a possible mssm higgs boson signal, the gap in the ma ≲ 180 gev range between the lhc and lep results is expected to be closed with the additional lhc data of 2012.

(figure 7. lhc results [27] of searches for mssm higgs bosons in the tan β–ma parameter space in the mhmax scenario.)
5. searches for exotic particles
atlas and cms have performed searches for new physics beyond the sm in a wide range of possible theoretical frameworks, but without observing any signal. as an example, searches for pair-produced down-type 4th-generation quarks decaying to a top quark and a w boson yielded mass limits of the order of 400÷600 gev [29] at 95 % cl. masses of heavy gauge bosons, w′ and z′, were excluded by atlas up to 2.2 tev at 95 % cl in the sequential standard model [30]. in searches for signatures of large extra dimensions, the scale for the onset of quantum gravity, ms, was extracted to be larger than 2.5÷3.8 tev [31] at 95 % cl, depending on the number of extra dimensions, and the fundamental planck scale, md, was found to be larger than 2.0÷3.2 tev [32] at 95 % cl, again depending on the number of extra dimensions. these results represent only a few examples; atlas and cms typically obtained similar exclusions in the tev range for the mass and energy scales of new-physics signatures.

6. summary and conclusion
the lhc experiments atlas and cms have measured sm processes in generally good agreement with (n)nlo calculations, including gauge-boson and top-quark physics. improved determination of sm parameters like mw and mtop is progressing. searches for new particles in the susy and exotic sectors, including mssm higgs bosons, have not shown significant deviations from the sm background expectations in the data analysed up to now. in the search for the sm higgs boson, an excess of signal candidates is observed by both atlas and cms at the 3σ level. more data will have to be analysed to verify whether this excess stays consistent with the production of a light higgs boson, and possibly to confirm the higgs field as the source of electroweak symmetry breaking.

acknowledgements
this work was supported in part by the german helmholtz alliance "physics at the terascale" and by the german bundesministerium für bildung und forschung (bmbf) within the research network fsp-101 "physics on the tev scale with atlas at the lhc".

references
[1] evans, l., bryant, ph. (eds.): 2008, lhc machine, jinst 3, s08001
[2] atlas collaboration: 2008, jinst 3, s08003
[3] cms collaboration: 2008, jinst 3, s08004
[4] lhcb collaboration: 2008, jinst 3, s08005
[5] alice collaboration: 2008, jinst 3, s08002
[6] see https://lhc-statistics.web.cern.ch for recent updates
[7] atlas collaboration: 2012, phys. rev. d 85, 092002; cms collaboration: 2012, jhep 01, 010
[8] mangano, m. l., et al.: 2003, j. high energy phys. 0307, 001
[9] gleisberg, t., et al.: 2009, j. high energy phys. 0902, 007
[10] berger, c. f., et al.: 2011, phys. rev. lett. 106, 092001
[11] atlas collaboration: 2012, phys. rev. lett. 109, 012001
[12] cms collaboration: 2011, cms-pas-ewk-11-013
[13] atlas collaboration: 2012, phys. rev. d 85, 012005; 2011, phys. lett. b 705, 415–434; cms collaboration: 2012, phys. rev. d 85, 032002
[14] atlas collaboration: 2012, phys. lett. b 706, 276–294; 2011, phys. rev. d 84, 112006; 2010, jhep 1012, 060; 2012, cern-ph-ep-2012-059, arxiv:1205.2531 [hep-ex]; 2012, phys. lett. b 712, 289–308; 2012, phys. lett. b 709, 341–357; 2012, phys. rev. lett. 108, 041804; 2011, jhep 1109, 072; 2011, phys. rev. lett. 107, 041802; cms collaboration: 2011, jhep 10, 132; 2012, jhep 08, 117; 2011, jhep 01, 080; 2012, cms-pas-smp-12-005; 2011, phys. lett. b 701, 535–555; 2011, cms-pas-ewk-11-010
[15] tevatron electroweak working group, cdf, d0 collaborations: 2011, arxiv:1107.5255 [hep-ex]; 2012, arxiv:1204.0042 [hep-ex]; cdf collaboration: 2012, arxiv:1203.0275 [hep-ex]
[16] atlas collaboration: 2012, atlas-conf-2012-024, -031, -032; cern-ph-ep-2012-102, arxiv:1205.2067 [hep-ex]; cms collaboration: 2011, cms-pas-top-11-024, -007, -005, -004, -003; 2012, cern-ph-ep-2012-078, arxiv:1203.6810 [hep-ex]
[17] atlas collaboration: 2012, eur. phys. j. c 72, 2046; atlas-conf-2012-031; cms collaboration: 2011, cms-pas-top-11-015, cms-pas-top-11-018
[18] atlas collaboration: 2012, phys. rev. lett. 108, 152001
[19] cms collaboration: 2012, phys. rev. lett. 108, 252002
[20] see for example: lewis, r., woloshyn, r. m.: 2009, phys. rev. d 79, 014502
[21] buras, a. j.: 2010, acta phys. polon. b 41, 2487–2561, arxiv:1012.1447 [hep-ph]
[22] atlas collaboration: 2012, phys. lett. b 713, 387
[23] cms collaboration: 2012, cern-ph-ep-2012-086, arxiv:1203.3976 [hep-ex]
[24] lhcb collaboration: 2012, phys. rev. lett. 108, 231801
[25] lhc higgs cross section working group, dittmaier, s., mariotti, c., passarino, g., tanaka, r. (eds.): 2011, arxiv:1101.0593 [hep-ph]; 2012, arxiv:1201.3084 [hep-ph]
[26] atlas collaboration: 2012, atlas-conf-2012-019, and references therein; cms collaboration: 2012, phys. lett. b 710, 26, and references therein
[27] atlas collaboration: 2012, atlas-conf-2012-011; cms collaboration: 2011, cms-pas-hig-11-019; 2012, phys. lett. b 713, 68; dasu, s.: 2012, higgs sector beyond the standard model, talk at the xlviith rencontres de moriond "electroweak interactions and unified theories", la thuile
[28] atlas collaboration: 2012, atlas-conf-2012-037, atlas-conf-2012-041; cms collaboration: 2012, cms-pas-sus-12-005
[29] atlas collaboration: 2012, arxiv:1202.6540 [hep-ex]; cms collaboration: 2012, arxiv:1204.1088 [hep-ex]
[30] atlas collaboration: 2011, arxiv:1108.1316 [hep-ex]; 2012, atlas-conf-2012-007
[31] cms collaboration: 2012, arxiv:1202.3827 [hep-ex]
[32] atlas collaboration: 2011, atlas-conf-2011-096
[33] heinemeyer, s., stal, o., weiglein, g.: 2011, arxiv:1112.3026 [hep-ph]; abdus salam, s. s., et al.: 2011, arxiv:1109.3859 [hep-ph]

discussion
a. antonelli — if new data confirms a neutral higgs boson at ≈ 125 gev and no signal for supersymmetric particles is found, which of the different susy models will be favoured?
a. straessner — a low-mass higgs boson with properties predicted by the standard model higgs is not in contradiction even with the minimal supersymmetric standard model (mssm). however, the currently observed absence of signatures of supersymmetric particles up to the tev scale places challenging constraints on the susy models. further discussions of the mssm higgs sector in the light of the lhc measurements can be found e.g. in reference [33].

acta polytechnica 53(supplement):518–523, 2013

acta polytechnica 53(5):462–469, 2013, doi:10.14311/ap.2013.53.0462

gln+1 algebra of matrix differential operators and matrix quasi-exactly-solvable problems
yuri f. smirnov (deceased), alexander v. turbiner∗
instituto de ciencias nucleares, universidad nacional autónoma de méxico, apartado postal 70-543, 04510 méxico, d.f., mexico
∗ corresponding author: turbiner@nucleares.unam.mx

abstract. the generators of the algebra gln+1 in the form of differential operators of the first order acting on rn with matrix coefficients are explicitly written. the algebraic hamiltonians for a matrix generalization of the 3-body calogero and sutherland models are presented.

keywords: algebra of differential operators, exactly-solvable problems.
submitted: 13 may 2013. accepted: 6 june 2013.

1. introduction
this work has a certain history related to miloslav havlicek. on the important occasion of miloslav's 75th birthday, we think this story should be revealed. about 25 years ago, when quasi-exactly-solvable schroedinger equations with the hidden algebra sl2 were discovered [1], one of the present authors (avt) approached israel m. gelfand and asked about the existence of the algebra gln+1 of matrix differential operators. instead of giving an answer, israel moiseevich said that m. havlicek knows the answer and that he must be asked. a set of dubna preprints was given (see [2, 3] and references therein). then avt studied them for many years, at first separately and then together with the first author (yufs), who also happened to have the same set of preprints. the results of these studies are presented below. while carrying out these studies, we always kept in mind that a constructive answer exists and is known to miloslav; thus, we are certain that at least some of the results presented here are known to him. having difficulty understanding what is written in those texts, we did not know what he really knew, and were therefore unable to indicate it in our text. our main goal is to find a mixed representation of the algebra gln+1 which contains both matrices and differential operators in a non-trivial way, and then to generalize it to a polynomial algebra which we call g(m) (see below, section 4). another goal is to apply the obtained representations to a construction of the algebraic forms of (quasi)-exactly-solvable matrix hamiltonians.

2. the algebra gln in mixed representation
let us take the algebra gln and consider the vector field representation
$$\tilde{e}_{ij} = x_i\partial_j, \qquad i,j = 1,\dots,n, \quad x \in \mathbb{R}^n. \qquad (1)$$
it obeys the canonical commutation relations
$$[\tilde{e}_{ij}, \tilde{e}_{kl}] = \delta_{jk}\tilde{e}_{il} - \delta_{il}\tilde{e}_{kj}. \qquad (2)$$
on the other hand, let us consider another representation $m_{pm}$, $p,m = 1,\dots,n$, of the algebra gln in terms of some operators (matrix, finite-difference, etc.), with the condition that all 'cross-commutators' between the two representations vanish,
$$[\tilde{e}_{ij}, m_{pm}] = 0. \qquad (3)$$
let us choose $m_{pm}$ to obey the canonical commutation relations
$$[m_{ij}, m_{kl}] = \delta_{jk} m_{il} - \delta_{il} m_{kj} \qquad (4)$$
(cf. (2)). it is evident that the sum of these two representations is also a representation,
$$e_{ij} \equiv \tilde{e}_{ij} + m_{ij} \in gl_n. \qquad (5)$$
now we consider an embedding $gl_n \subset gl_{n+1}$, trying to complement the representation (1) of the algebra gln up to a representation of the algebra gln+1. in principle, this can be done due to the existence of the weyl-cartan decomposition $gl_{n+1} = \mathcal{L} \oplus (gl_n \oplus \mathcal{I}) \oplus \mathcal{U}$ with the property
$$gl_{n+1} = \mathcal{L} \rtimes (gl_n \oplus \mathcal{I}) \ltimes \mathcal{U}, \qquad (6)$$
where $\mathcal{L}$ ($\mathcal{U}$) is the commutative algebra of the lowering (raising) generators with the property $[\mathcal{L}, \mathcal{U}] = gl_n \oplus \mathcal{I}$. thus, it realizes a property of the gauss decomposition of gln+1. it is worth emphasizing that $\dim\mathcal{L} = \dim\mathcal{U} = n$. obviously, the lowering generators (of negative grading) from $\mathcal{L}$ can be given by the derivations
$$t^-_i = \partial_i, \qquad i = 1,\dots,n, \quad \partial_i \equiv \frac{\partial}{\partial x_i} \qquad (7)$$
(see e.g. [5]),
when assuming that all commutators
$$[t^-_i, m_{pm}] = 0 \qquad (8)$$
vanish. this probably implies that the only possible choice for $m_{pm}$ exists when they are either given by matrices or act in a space which is a complement to $x \in \mathbb{R}^n$. it is easy to check that $[e_{ij}, t^-_k] = -\delta_{ik}\,t^-_j$. now we have to add the euler-cartan generator of the gln algebra, see (6),
$$-e_0 = \sum_{j=1}^{n} x_j\partial_j - k, \qquad (9)$$
where k is an arbitrary constant. the raising generators from $\mathcal{U}$ are chosen as
$$-t^+_i = -x_i e_0 + \sum_{j=1}^{n} x_j m_{ij} = x_i\Big(\sum_{j=1}^{n} x_j\partial_j - k\Big) + \sum_{j=1}^{n} x_j m_{ij}, \qquad i = 1,\dots,n \qquad (10)$$
(cf. for instance [5]). needless to say, one can check explicitly that $t^-_i$, $e_{ij}$, $e_0$, $t^+_i$ span the algebra gln+1; in particular, $[e_0, t^+_i] = t^+_i$ and $[t^+_i, t^-_j] = e_{ij} - \delta_{ij}\,e_0$. if the parameter k takes non-negative integer values, the algebra gln+1 spanned by the generators (5), (7), (9), (10) appears in a finite-dimensional representation: there exists a linear finite-dimensional space of polynomials of finite order in the space of columns/spinors of finite length which is a common invariant subspace for all generators (5), (7), (9), (10). this finite-dimensional representation is irreducible. the non-negative integer parameter k has the meaning of the length of the first row of the young tableau of gln+1, describing a totally symmetric representation (see below). all other parameters are coded in $m_{ij}$, which corresponds to an arbitrary young tableau of gln. thus, we have a peculiar splitting of the young tableau. each representation is characterized by the gelfand-tseitlin signature $[m_{1,n},\dots,m_{n,n}]$, where $m_{i,n} \ge m_{i+1,n}$ with integer differences. each basic vector is characterized by a gelfand-tseitlin scheme, and an explicit form of the representation is given by the gelfand-tseitlin formulas [4]. it can be demonstrated that all casimir operators of gln+1 in this realization (5), (7), (9), (10) are expressed in terms of $m_{ij}$, and thus do not depend on x; they coincide with the casimir operators of the gln-subalgebra realized by the matrices $m_{ij}$.

3. example: the algebra gl3 in mixed representation
in the case of the algebra gl3, the generators (5), (7), (9), (10) take the form
$$\begin{aligned}
e_{11} &= x_1\partial_1 + m_{11}, \quad e_{22} = x_2\partial_2 + m_{22}, \quad e_{12} = x_1\partial_2 + m_{12}, \quad e_{21} = x_2\partial_1 + m_{21},\\
e_0 &= k - x_1\partial_1 - x_2\partial_2, \quad t^-_1 = \partial_1, \quad t^-_2 = \partial_2,\\
t^+_1 &= x_1(k - x_1\partial_1 - x_2\partial_2) - x_1 m_{11} - x_2 m_{12},\\
t^+_2 &= x_2(k - x_1\partial_1 - x_2\partial_2) - x_1 m_{21} - x_2 m_{22}.
\end{aligned} \qquad (11)$$
the casimir operators of gl3 in this realization are given by
$$c_1 = e_{11} + e_{22} + e_0 = k + m_{11} + m_{22} = k + c_1(m),$$
$$c_2 = e_{12}e_{21} + e_{21}e_{12} + t^+_1 t^-_1 + t^-_1 t^+_1 + t^+_2 t^-_2 + t^-_2 t^+_2 + e_{11}^2 + e_{22}^2 + e_0^2 = k(k+2) + m_{11}^2 + m_{22}^2 + m_{12}m_{21} + m_{21}m_{12} - m_{11} - m_{22} = k(k+2) + c_2(m) - c_1(m),$$
and, finally,
$$c_3 = -\tfrac{1}{2}c_1^3 + \tfrac{3}{2}c_1 c_2 + 3c_2 - 2c_1^2 - 2c_1.$$
in this realization, the casimir operator c3 is algebraically dependent on c1 and c2. in fact, c1 and c2 are nothing but the casimir operators of the gl2 sub-algebra. therefore, the center of the gl3 universal enveloping algebra in the realization (11) is generated by the casimir operators of the gl2 sub-algebra realized by $m_{ij}$. thus, it seems natural that these representations are irreducible. now we consider concrete matrix realizations of the gl2-subalgebra in our scheme.
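a minimal computational sketch (not from the paper) verifying one of the commutators quoted above, in the scalar case $m_{ij} = 0$ of the realization (11):

```python
import sympy as sp

# verify [t+_1, t-_1] = e11 - e0 for the realization (11) with m_ij = 0
x1, x2, k = sp.symbols('x1 x2 k')
f = sp.Function('f')(x1, x2)

def e11(g): return x1 * sp.diff(g, x1)
def e0(g):  return k * g - x1 * sp.diff(g, x1) - x2 * sp.diff(g, x2)
def tm1(g): return sp.diff(g, x1)
def tp1(g): return x1 * (k * g - x1 * sp.diff(g, x1) - x2 * sp.diff(g, x2))

lhs = tp1(tm1(f)) - tm1(tp1(f))     # [t+_1, t-_1] acting on a test function
rhs = e11(f) - e0(f)                # (e11 - e0) acting on the same function
print(sp.simplify(lhs - rhs))       # prints 0
```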
3.1. reps in 1 × 1 matrices
this corresponds to the trivial representation of gl2, $m_{11} = m_{12} = m_{21} = m_{22} = 0$. this is the [k, 0] or, in other words, symmetric representation (the young tableau has two rows of lengths k and 0). we can also call it a scalar representation, since the generators
$$\begin{aligned}
e_{11} &= x_1\partial_1, \quad e_{22} = x_2\partial_2, \quad e_{12} = x_1\partial_2, \quad e_{21} = x_2\partial_1,\\
e_0 &= k - x_1\partial_1 - x_2\partial_2, \quad t^-_1 = \partial_1, \quad t^-_2 = \partial_2,\\
t^+_1 &= x_1(k - x_1\partial_1 - x_2\partial_2), \quad t^+_2 = x_2(k - x_1\partial_1 - x_2\partial_2),
\end{aligned} \qquad (12)$$
act on one-component spinors or, in other words, on scalar functions (see e.g. [5]). the casimir operators are $c_1 = k$, $c_2 = k(k+2)$. if the parameter k takes non-negative integer values, the algebra gl3 spanned by the generators (12) appears in a finite-dimensional representation. its finite-dimensional representation space is the space of polynomials
$$\mathcal{P}_{k,0} = \big\langle x_1^{p_1} x_2^{p_2} \,\big|\, 0 \le p_1 + p_2 \le k \big\rangle, \qquad k = 0,1,2,\dots \qquad (13)$$
namely in this representation (12), the algebra gl3 appears as the hidden algebra of the 3-body calogero and sutherland models [5], of the bc2 rational and trigonometric and g2 rational models [6], and even of the bc2 elliptic model [7].

3.2. reps in 2 × 2 matrices
take gl2 in the two-dimensional representation by 2 × 2 matrices,
$$m_{11} = \begin{pmatrix} 1 & 0\\ 0 & 0\end{pmatrix}, \quad m_{22} = \begin{pmatrix} 0 & 0\\ 0 & 1\end{pmatrix}, \quad m_{12} = \begin{pmatrix} 0 & 1\\ 0 & 0\end{pmatrix}, \quad m_{21} = \begin{pmatrix} 0 & 0\\ 1 & 0\end{pmatrix};$$
then the generators (11) of gl3 are
$$\begin{aligned}
t^-_1 &= \begin{pmatrix} \partial_1 & 0\\ 0 & \partial_1\end{pmatrix}, \quad t^-_2 = \begin{pmatrix} \partial_2 & 0\\ 0 & \partial_2\end{pmatrix}, \quad e_0 = \begin{pmatrix} a & 0\\ 0 & a\end{pmatrix},\\
e_{11} &= \begin{pmatrix} x_1\partial_1 + 1 & 0\\ 0 & x_1\partial_1\end{pmatrix}, \quad e_{12} = \begin{pmatrix} x_1\partial_2 & 1\\ 0 & x_1\partial_2\end{pmatrix}, \quad e_{21} = \begin{pmatrix} x_2\partial_1 & 0\\ 1 & x_2\partial_1\end{pmatrix}, \quad e_{22} = \begin{pmatrix} x_2\partial_2 & 0\\ 0 & x_2\partial_2 + 1\end{pmatrix},\\
t^+_1 &= \begin{pmatrix} x_1(a-1) & -x_2\\ 0 & x_1 a\end{pmatrix}, \quad t^+_2 = \begin{pmatrix} x_2 a & 0\\ -x_1 & x_2(a-1)\end{pmatrix},
\end{aligned} \qquad (14)$$
where $a = k - x_1\partial_1 - x_2\partial_2$. this is the [k, 1] representation (the young tableau has two rows of lengths k and 1), and the casimir operators are $c_1 = k+1$, $c_2 = (k+1)^2$. if the parameter k takes non-negative integer values, the algebra gl3 spanned by the generators (14) appears in a finite-dimensional representation. let us consider several different values of k in detail.

the case k = 1. the three-dimensional representation space $v^{(2)}_1$ is spanned by
$$p_- = \begin{bmatrix} 0\\ 1\end{bmatrix}, \quad p_+ = \begin{bmatrix} 1\\ 0\end{bmatrix}, \quad y_1 = \begin{bmatrix} x_2\\ -x_1\end{bmatrix}. \qquad (15)$$
this corresponds to the antiquark multiplet in the standard (fundamental) representation. the newton polygon is a triangle with the points $p_\pm$ as vertices at the base.

(figure 1. newton hexagon for the representation space $v^{(2)}_4$ of the [4, 1]-representation of dimension 24.)

the case k = 2. the eight-dimensional representation space $v^{(2)}_2$ is spanned by
$$p_- = \begin{bmatrix} 0\\ 1\end{bmatrix},\; p_+ = \begin{bmatrix} 1\\ 0\end{bmatrix},\; p^{(1)}_- = \begin{bmatrix} 0\\ x_2\end{bmatrix},\; y^{(1)}_1 = \begin{bmatrix} 0\\ x_1\end{bmatrix},\; y^{(2)}_1 = \begin{bmatrix} x_2\\ 0\end{bmatrix},\; p^{(1)}_+ = \begin{bmatrix} x_1\\ 0\end{bmatrix},\; y_2 = \begin{bmatrix} x_2^2\\ -x_1x_2\end{bmatrix},\; y_3 = \begin{bmatrix} x_1x_2\\ -x_1^2\end{bmatrix}. \qquad (16)$$
this corresponds to the octet in the standard (fundamental) representation. the space $v^{(2)}_2$ contains $v^{(2)}_1$ as a subspace, $v^{(2)}_1 \subset v^{(2)}_2$; it should be mentioned that $y_1 = -y^{(1)}_1 + y^{(2)}_1$. now the newton polygon is a hexagon where the central point is doubled, being represented by $y^{(1,2)}_1$, and the lower (upper) base has length two, being given by $p_\pm$ ($y_{2,3}$).

the case k = 3. the representation space $v^{(2)}_3$ is 15-dimensional. in addition to $p_\pm$, $p^{(1)}_\pm$ and $y^{(1,2)}_1$ (see (15) and (16)), it contains several more vectors, namely
$$p^{(2)}_- = \begin{bmatrix} 0\\ x_2^2\end{bmatrix}, \quad p^{(2)}_+ = \begin{bmatrix} x_1^2\\ 0\end{bmatrix}, \qquad (17)$$
which are situated on the ±-sides of the newton hexagon, doubling the points corresponding to $y_{2,3}$ (see (16)),
$$y^{(1)}_2 = \begin{bmatrix} 0\\ x_1x_2\end{bmatrix}, \quad y^{(2)}_2 = \begin{bmatrix} x_2^2\\ 0\end{bmatrix}, \quad y^{(1)}_3 = \begin{bmatrix} 0\\ x_1^2\end{bmatrix}, \quad y^{(2)}_3 = \begin{bmatrix} x_1x_2\\ 0\end{bmatrix}, \qquad (18)$$
plus three extra vectors on the boundary,
$$y_8 = \begin{bmatrix} x_2^3\\ -x_1x_2^2\end{bmatrix}, \quad y_9 = \begin{bmatrix} x_1x_2^2\\ -x_1^2x_2\end{bmatrix}, \quad y_{10} = \begin{bmatrix} x_1^2x_2\\ -x_1^3\end{bmatrix}. \qquad (19)$$
it is clear that $v^{(2)}_1 \subset v^{(2)}_2 \subset v^{(2)}_3$. all internal points of the newton hexagon are double points, while the points on the boundary are single ones.
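a minimal computational sketch (not from the paper) verifying that the raising generators of (14) at k = 1 act inside the triplet (15):

```python
import sympy as sp

# check that t+_1, t+_2 of (14) map the k = 1 basis (15) into its own span;
# spinors are represented as 2-component columns of polynomials in x1, x2
x1, x2 = sp.symbols('x1 x2')
k = 1

def a(g):                                   # a = k - x1 d1 - x2 d2
    return k * g - x1 * sp.diff(g, x1) - x2 * sp.diff(g, x2)

def tp1(v):                                 # first raising generator of (14)
    return sp.Matrix([x1 * a(v[0]) - x1 * v[0] - x2 * v[1], x1 * a(v[1])])

def tp2(v):                                 # second raising generator of (14)
    return sp.Matrix([x2 * a(v[0]), -x1 * v[0] + x2 * a(v[1]) - x2 * v[1]])

p_minus = sp.Matrix([0, 1])
p_plus  = sp.Matrix([1, 0])
y1      = sp.Matrix([x2, -x1])

assert (tp1(p_minus) + y1).expand() == sp.zeros(2, 1)   # t+_1 p- = -y1
assert (tp2(p_plus) - y1).expand() == sp.zeros(2, 1)    # t+_2 p+ =  y1
assert tp1(y1).expand() == sp.zeros(2, 1)               # y1 is annihilated
print("v^(2)_1 is invariant under the raising generators")
```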
the general case. the finite-dimensional representation space $v^{(2)}_k$ has dimension k(k+2) and is represented by the newton hexagon, which contains (k+1) horizontal layers. the lower base has length two, while the upper base has length k (see fig. 1 as an illustration for k = 4). all internal points of the newton hexagon are double points, while the points on the boundary are single ones. except for the k vectors of the last (highest) layer of the newton hexagon, the remaining k(k+1) vectors span the space of all possible two-component spinors with components given by inhomogeneous polynomials in $x_1, x_2$ of degree not higher than (k−1). we denote this space by $\tilde{v}^{(2)}_k \subset v^{(2)}_k$. the non-trivial task is to describe the k vectors of the last (highest) layer of the hexagon. after some analysis one finds that they have the form
$$y_{k(k+1)+i} = \begin{bmatrix} x_2^{k-i}x_1^i\\ -x_2^{k-i-1}x_1^{i+1}\end{bmatrix}, \qquad i = 0,1,2,\dots,(k-1); \qquad (20)$$
hence they span a non-trivial k-dimensional subspace of spinors with components given by specific homogeneous polynomials of degree k.

3.3. reps in 3 × 3 matrices
take gl2 in the three-dimensional representation by 3 × 3 matrices,
$$m_{11} = \begin{pmatrix} 2&0&0\\ 0&1&0\\ 0&0&0\end{pmatrix}, \quad m_{22} = \begin{pmatrix} 0&0&0\\ 0&1&0\\ 0&0&2\end{pmatrix}, \quad m_{12} = \begin{pmatrix} 0&\sqrt{2}&0\\ 0&0&\sqrt{2}\\ 0&0&0\end{pmatrix}, \quad m_{21} = \begin{pmatrix} 0&0&0\\ \sqrt{2}&0&0\\ 0&\sqrt{2}&0\end{pmatrix}.$$
then the generators (11) of gl3 are
$$\begin{aligned}
t^-_1 &= \partial_1\,\mathbb{1}, \quad t^-_2 = \partial_2\,\mathbb{1}, \quad e_0 = a\,\mathbb{1},\\
e_{11} &= \mathrm{diag}\big(x_1\partial_1 + 2,\; x_1\partial_1 + 1,\; x_1\partial_1\big), \quad e_{22} = \mathrm{diag}\big(x_2\partial_2,\; x_2\partial_2 + 1,\; x_2\partial_2 + 2\big),\\
e_{12} &= x_1\partial_2\,\mathbb{1} + m_{12}, \quad e_{21} = x_2\partial_1\,\mathbb{1} + m_{21},\\
t^+_1 &= \begin{pmatrix} x_1(a-2) & -\sqrt{2}x_2 & 0\\ 0 & x_1(a-1) & -\sqrt{2}x_2\\ 0 & 0 & x_1 a\end{pmatrix}, \quad
t^+_2 = \begin{pmatrix} x_2 a & 0 & 0\\ -\sqrt{2}x_1 & x_2(a-1) & 0\\ 0 & -\sqrt{2}x_1 & x_2(a-2)\end{pmatrix},
\end{aligned} \qquad (21)$$
where $a = k - x_1\partial_1 - x_2\partial_2$ and $\mathbb{1}$ is the 3 × 3 unit matrix. this is the [k, 2] representation (the young tableau has two rows of lengths k and 2), and the casimir operators are $c_1 = k+2$, $c_2 = (k+1)^2 + 3$. as an illustration, let us show explicitly the finite-dimensional representation spaces for k = 2, 3.

the case k = 2. the six-dimensional representation space $v^{(3)}_2$ is spanned by
$$p_- = \begin{bmatrix} 0\\ 0\\ 1\end{bmatrix},\; p_0 = \begin{bmatrix} 0\\ 1\\ 0\end{bmatrix},\; p_+ = \begin{bmatrix} 1\\ 0\\ 0\end{bmatrix},\; y_1 = \begin{bmatrix} 0\\ x_2\\ -\sqrt{2}x_1\end{bmatrix},\; y_2 = \begin{bmatrix} -\sqrt{2}x_2\\ x_1\\ 0\end{bmatrix},\; y_3 = \begin{bmatrix} x_2^2\\ -\sqrt{2}x_1x_2\\ x_1^2\end{bmatrix}. \qquad (22)$$
this corresponds to the 'di-antiquark' multiplet.

the case k = 3. the 15-dimensional representation space $v^{(3)}_3$ is spanned by
$$p_- = \begin{bmatrix} 0\\ 0\\ 1\end{bmatrix},\; p_0 = \begin{bmatrix} 0\\ 1\\ 0\end{bmatrix},\; p_+ = \begin{bmatrix} 1\\ 0\\ 0\end{bmatrix},\; y^{(1)}_1 = \begin{bmatrix} 0\\ x_2\\ 0\end{bmatrix},\; y^{(2)}_1 = \begin{bmatrix} 0\\ 0\\ x_1\end{bmatrix},\; y^{(1)}_2 = \begin{bmatrix} x_2\\ 0\\ 0\end{bmatrix},\; y^{(2)}_2 = \begin{bmatrix} 0\\ x_1\\ 0\end{bmatrix},\; p^{(1)}_- = \begin{bmatrix} 0\\ 0\\ x_2\end{bmatrix},\; p^{(1)}_+ = \begin{bmatrix} x_1\\ 0\\ 0\end{bmatrix},$$
$$y^{(1)}_3 = \begin{bmatrix} -\sqrt{2}x_2^2\\ x_1x_2\\ 0\end{bmatrix},\; y^{(2)}_3 = \begin{bmatrix} 0\\ x_1x_2\\ -\sqrt{2}x_1^2\end{bmatrix},\; y_4 = \begin{bmatrix} 0\\ -\sqrt{2}x_2^2\\ 2x_1x_2\end{bmatrix},\; y_5 = \begin{bmatrix} 2x_1x_2\\ -\sqrt{2}x_1^2\\ 0\end{bmatrix},\; y_6 = \begin{bmatrix} x_2^3\\ -\sqrt{2}x_1x_2^2\\ x_1^2x_2\end{bmatrix},\; y_7 = \begin{bmatrix} x_1x_2^2\\ -\sqrt{2}x_1^2x_2\\ x_1^3\end{bmatrix}. \qquad (23)$$
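as a cross-check of the dimensions quoted in sections 3.1–3.3 (a standard computation added here for convenience, not taken from the original text), the weyl dimension formula for a gl3 representation with young tableau rows $[m_1, m_2, 0]$ gives
$$\dim[m_1,m_2,0] = \frac{(m_1-m_2+1)(m_2+1)(m_1+2)}{2},$$
so that $\dim[k,0,0] = \tfrac{1}{2}(k+1)(k+2)$, matching the number of monomials in (13); $\dim[k,1,0] = k(k+2)$, matching the count for $v^{(2)}_k$; and $\dim[k,2,0] = \tfrac{3}{2}(k-1)(k+2)$, which gives 6 for k = 2 and 15 for k = 3, in agreement with (22) and (23).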
it is worth mentioning that, as a consequence of the particular realization (11) of the generators of the gl3 algebra, there exist certain relations between the generators other than those given by the casimir operators. the first observation is that there are no linear relations of such a type. some time ago, nine quadratic relations were found between the gl3 generators taken in the scalar representation (12), other than the casimir operators [8].

(figure 2. verma module with the lowest weight, for s = 5/2.)

surprisingly, certain modifications of these relations also exist for the [k, n] mixed representations (11):
$$-t^+_1 e_{22} + t^+_2 e_{12} = x_1\big[m_{22}x_1\partial_1 + m_{11}x_2\partial_2 + (m_{11}-k)m_{22} - m_{21}m_{12}\big] - x_2(x_1\partial_1 - k - 1)m_{12} - m_{21}x_1^2\partial_2 \equiv -\tilde{t}^+_1, \qquad (24)$$
$$-t^+_2 e_{11} + t^+_1 e_{21} = x_2\big[m_{22}x_1\partial_1 + m_{11}x_2\partial_2 + (m_{22}-k)m_{11} - m_{12}m_{21}\big] - x_1(x_2\partial_2 - k - 1)m_{21} - m_{12}x_2^2\partial_1 \equiv -\tilde{t}^+_2, \qquad (25)$$
$$-e_{12}(e_0+1) + t^+_1 t^-_2 = m_{12}(x_1\partial_1 - k - 1) - m_{11}x_1\partial_2 \equiv -\tilde{e}_{12}, \qquad (26)$$
$$-e_{21}(e_0+1) + t^+_2 t^-_1 = m_{21}(x_2\partial_2 - k - 1) - m_{22}x_2\partial_1 \equiv -\tilde{e}_{21}, \qquad (27)$$
$$t^+_1 t^-_1 - e_{11}(1+e_0) = m_{11}x_2\partial_2 - m_{12}x_2\partial_1 - (k+1)m_{11} \equiv -\tilde{e}_{11}, \qquad (28)$$
$$t^+_2 t^-_2 - e_{22}(1+e_0) = m_{22}x_1\partial_1 - m_{21}x_1\partial_2 - (k+1)m_{22} \equiv -\tilde{e}_{22}, \qquad (29)$$
$$e_{12}e_{21} - e_{11}e_{22} - e_{11} = m_{12}x_2\partial_1 + m_{21}x_1\partial_2 - m_{22}x_1\partial_1 - m_{11}x_2\partial_2 + m_{12}m_{21} - m_{11}m_{22} - m_{11} \equiv -\hat{e}_{11}, \qquad (30)$$
$$e_{22}t^-_1 - e_{21}t^-_2 = m_{22}\partial_1 - m_{21}\partial_2 \equiv \tilde{t}^-_1, \qquad (31)$$
$$e_{12}t^-_1 - e_{11}t^-_2 = m_{12}\partial_1 - m_{11}\partial_2 \equiv -\tilde{t}^-_2. \qquad (32)$$
not all these relations are independent: it can be shown that one relation is linearly dependent, since the sum (28) + (29) + (30) gives the second casimir operator c2. in the scalar case, at least, we can assign a natural (vectorial) grading to the generators, and the above relations then reflect a certain decomposition of the gradings: (1,0)(0,0) = (0,1)(1,−1) and (0,1)(0,0) = (1,0)(−1,1) for the first two relations; (1,−1)(0,0) = (1,0)(0,−1) and (−1,1)(0,0) = (0,1)(−1,0) for the second two; (1,0)(−1,0) = (0,0)(0,0), (0,1)(0,−1) = (0,0)(0,0) and (1,−1)(−1,1) = (0,0)(0,0) for the three before the last two; and (0,0)(−1,0) = (−1,1)(0,−1) and (0,0)(0,−1) = (1,−1)(−1,0) for the last two.

4. algebra g(m) in mixed representation
the basic property used to construct the mixed representation of the algebra gln+1 was the existence of the weyl-cartan decomposition $gl_{n+1} = \mathcal{L} \oplus (gl_n \oplus \mathcal{I}) \oplus \mathcal{U}$ with the property (6). one can pose the question of the existence of algebras other than gln+1 for which a weyl-cartan decomposition with the property (6) holds. the answer is affirmative. let us consider the important particular case of the cartan algebra $gl_2 \oplus \mathcal{I}$ and construct a realization of a new algebra, denoted g(m), with the property
$$g(m) = \mathcal{L}_{m+1} \rtimes (gl_2 \oplus \mathcal{I}) \ltimes \mathcal{U}_{m+1}, \qquad (33)$$
where $\mathcal{L}_m$ ($\mathcal{U}_m$) is the commutative algebra of the lowering (raising) generators with the property $[\mathcal{L}_m, \mathcal{U}_m] = \mathcal{P}_{m-1}(gl_2 \oplus \mathcal{I})$, with $\mathcal{P}_{m-1}$ a polynomial of degree (m−1) in the generators of $gl_2 \oplus \mathcal{I}$. thus, it realizes a property of a generalized gauss decomposition; the emerging algebra is a polynomial algebra. it is worth emphasizing that the realization we are going to construct appears at $\dim\mathcal{L}_m = \dim\mathcal{U}_m = m$. for m = 1, the algebra g(1) = gl3, see (6). our final goal is to build a realization of (33) in terms of finite-order differential operators acting on the plane $\mathbb{R}^2$. the simplest realization of the algebra gl2 by differential operators in two variables is the vector field representation, see (1) at n = 2; exactly this representation was used to construct the representation of the gl3 algebra acting on $\mathbb{R}^2$, see (11), (12). in this case $\dim\mathcal{L}_m = \dim\mathcal{U}_m = 2$, and we are unable to find other algebras with $\dim\mathcal{L}_m = \dim\mathcal{U}_m > 2$. however, there exists another representation of the algebra gl2 by first-order differential operators in two variables,
$$\tilde{j}_{12} = \partial_x, \quad \tilde{j}^{(k)}_{11} = -x\partial_x + \frac{k}{3}, \quad \tilde{j}^{(k)}_{22} = -x\partial_x + s\,y\partial_y, \quad \tilde{j}^{(k)}_{21} = x^2\partial_x + s\,xy\partial_y - kx \qquad (34)$$
(see s. lie [9] at k = 0, and a. gonzález-lópez et al. [10] at k ≠ 0 (case 24)), where s, k are arbitrary numbers. these generators obey the standard commutation relations (2) of the algebra gl2 in the vector field representation (1).
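returning briefly to the quadratic relations above, a minimal computational sketch (not from the paper) verifies one of them, eq. (26), in the scalar representation (12), where all $m_{ij} = 0$ and the right-hand side vanishes, so that $t^+_1 t^-_2 = e_{12}(e_0 + 1)$:

```python
import sympy as sp

# verify eq. (26) in the scalar representation (12): t+_1 t-_2 = e12 (e0 + 1)
x1, x2, k = sp.symbols('x1 x2 k')
f = sp.Function('f')(x1, x2)

D   = lambda g: x1 * sp.diff(g, x1) + x2 * sp.diff(g, x2)   # euler operator
e12 = lambda g: x1 * sp.diff(g, x2)
e0  = lambda g: k * g - D(g)
tp1 = lambda g: x1 * (k * g - D(g))
tm2 = lambda g: sp.diff(g, x2)

lhs = tp1(tm2(f))
rhs = e12(e0(f) + f)
print(sp.simplify(lhs - rhs))   # prints 0
```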
it is evident that the sum of the two representations, j̃_{ij} and the matrix representation m_{ij}, is also a representation,

j_{ij} ≡ j̃_{ij} + m_{ij} ∈ gl_2, (35)

(cf. (5)). it is worth mentioning that the gl_2 commutation relations for the matrices m_{ij} are taken in the canonical form (4). the unity generator i in (33) is written in the form of a generalized euler–cartan operator

j^{(k)}_0 = x∂_x + s y∂_y − k. (36)

now let us assume that s is a non-negative integer, s = m, m = 0, 1, 2, …. evidently, the lowering generators (of negative grading) from l_{m+1} can be given by

t^-_i = x^i ∂_y, \quad i = 0, 1, \ldots, m, (37)

forming a commutative algebra,

[t^-_i, t^-_j] = 0 (38)

(cf. [9, 10]). eventually, the generators of the algebra (gl_2 ⊕ i) ⋉ l_{m+1} take the form

j_{12} = ∂_x + m_{12}, \quad j^{(k)}_{11} = −x∂_x + k/3 + m_{11}, \quad j^{(k)}_{22} = −x∂_x + m y∂_y + m_{22}, \quad j^{(k)}_{21} = x^2∂_x + m x y∂_y − k x + m_{21}, (39)

with j^{(k)}_0 and t^-_i given by (36) and (37), respectively. let us consider two particular cases of the general construction of the raising generators for the commutative algebra u.

case 1. for the first case we take the trivial matrix representation of gl_2, m_{11} = m_{12} = m_{21} = m_{22} = 0. one can check that one of the raising generators is given by

u_0 = y ∂_x^m, (40)

while all the other raising generators are multiple commutators of j^{(k)}_{21} with u_0,

u_i ≡ [ j^{(k)}_{21}, [ j^{(k)}_{21}, [ \cdots j^{(k)}_{21}, u_0 ] \cdots ] ] (i commutators) = y ∂_x^{m−i} \, j^{(k)}_0 (j^{(k)}_0 + 1) \cdots (j^{(k)}_0 + i − 1), (41)

at i = 1, …, m. all of them are differential operators of fixed degree m. the procedure for constructing the operators u_i has the property of nilpotency: u_i = 0 for i > m. in particular, for m = 1, u_0 = y∂_x and u_1 = y j^{(k)}_0 = y(x∂_x + y∂_y − k). inspecting the generators t^-_{0,1}, j_{ij}, j^{(k)}_0, u_{0,1}, one can see that they span the algebra gl_3, see (12); hence, the algebra g^{(1)} ≡ gl_3. if the parameter k takes non-negative integer values, the algebra g^{(m)} spanned by the generators (39), (40), (41) appears in a finite-dimensional representation. its finite-dimensional representation space is a triangular space of polynomials,

p_{k,0} = ⟨ x^{p_1} y^{p_2} | 0 ≤ p_1 + m p_2 ≤ k ⟩, \quad k = 0, 1, 2, \ldots . (42)

namely in this representation the algebra g^{(m)} appears as a hidden algebra of the 3-body g_2 trigonometric model [6] at m = 2, and of the so-called ttw model at integer m, in particular, of the dihedral i_2(m) rational model [11].
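the triangular space (42) is easy to enumerate explicitly. a minimal sketch (plain python, the function name is ours) lists the monomial exponents and the dimension for a few values of m and k; for m = 1 the dimension reduces to the full triangle (k + 1)(k + 2)/2 of the gl_3 case:

```python
# enumerating the triangular basis (42): monomials x^p1 y^p2 with
# 0 <= p1 + m*p2 <= k (a minimal illustration, names are ours).
def basis(m, k):
    return [(p1, p2) for p2 in range(k // m + 1)
                     for p1 in range(k - m * p2 + 1)]

for m in (1, 2, 3):
    for k in (2, 3, 4):
        b = basis(m, k)
        print(f"m={m}, k={k}: dim p_k,0 = {len(b)}, exponents {b}")
# for m = 1 the printed dimensions are 6, 10, 15, i.e. (k+1)(k+2)/2.
```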
case 2. the second case is an evident extension in which the generators m_{ij} are taken in an arbitrary matrix representation of the algebra gl_2. the raising generators (40), (41) remain raising generators even if the cartan generators are given by (39) with arbitrary m_{ij} ∈ gl_2. however, the algebra is not closed: [t, u] ≠ p(gl_2 ⊕ i). it can be fixed, at least for the case m = 1, if the m_{ij} are generators of the gl_2 subalgebra of gl_3: by adding to the generators (38), (40), (41) the appropriate matrix generators from gl_3, the algebra gets closed. we end up with a gl_3 algebra of matrix differential operators other than (11). we are not aware of a solution to this problem for the case m ≠ 1, except for the case of the trivial matrix representation, see case 1.

5. extension of the 3-body calogero model

the first algebraic form of the 3-body calogero hamiltonian [12] appears after a gauge rotation with the ground state function, separation of the center-of-mass, and a change of variables to the elementary symmetric polynomials of the translationally-symmetric coordinates [5],

h_cal = −2τ_2 ∂^2_{τ_2τ_2} − 6τ_3 ∂^2_{τ_2τ_3} + \frac{2}{3} τ_2^2 ∂^2_{τ_3τ_3} − [4ωτ_2 + 2(1 + 3ν)] ∂_{τ_2} − 6ωτ_3 ∂_{τ_3}. (43)

these new coordinates are polynomial invariants of the a_2 weyl group. the eigenvalues of (43) are −ε_p, where

ε_p = 2ω(2p_1 + 3p_2), \quad p_{1,2} = 0, 1, \ldots . (44)

as shown in rühl and turbiner [5], the operator (43) can be rewritten in a lie-algebraic form in terms of the gl(3)-algebra generators of the representation [k, 0]. the corresponding expression is

h_cal = −2 e_{11} t^-_1 − 6 e_{22} t^-_1 + \frac{2}{3} e_{12} e_{12} − 4ω e_{11} − 2(1 + 3ν) t^-_1 − 6ω e_{22}. (45)

now we can substitute the generators of the representation [k, n] in the form (11):

h̃_cal = −2τ_2 ∂^2_{τ_2τ_2} − 6τ_3 ∂^2_{τ_2τ_3} + \frac{2}{3} τ_2^2 ∂^2_{τ_3τ_3} − 2[2ωτ_2 + (1 + 3ν) + (n − 2m_{22})] ∂_{τ_2} − (6ωτ_3 − \frac{4}{3} m_{12} τ_2) ∂_{τ_3} + \frac{2}{3} m_{12} m_{12} − 4ωn − 2ω m_{22}. (46)

this is an n × n matrix differential operator. it contains infinitely many finite-dimensional invariant subspaces, which are nothing but the finite-dimensional representation spaces of the algebra gl(3). this operator remains exactly-solvable, with the same spectrum as the scalar calogero operator, and it probably remains completely integrable. a higher-than-second-order integral is a differential operator of the sixth order (ω ≠ 0) or of the third order (ω = 0), which takes an algebraic form after gauging away the ground state function in the τ coordinates. it can be rewritten in terms of the gl(3)-algebra generators of the representation [k, 0], which can then be replaced by the generators of the representation [k, n]. under such a replacement the spectrum of the integral remains unchanged and algebraic.
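the spectral statement (44) can be cross-checked numerically on the invariant subspaces just mentioned. the sketch below (assuming sympy; all names are ours) builds the matrix of (43) restricted to the monomials τ_2^{p_1} τ_3^{p_2} with p_1 + p_2 ≤ n for test values ω = 1, ν = 1/2, and compares its eigenvalues with −2ω(2p_1 + 3p_2); the non-diagonal terms of (43) only lower the grading 2p_1 + 3p_2, which is why the spectrum is given by the diagonal terms alone:

```python
# numerical cross-check (assuming sympy) of the spectrum (44) of the operator
# (43) on the invariant polynomial subspace of total degree <= n; omega and nu
# are arbitrary test values, not quoted from the source.
import sympy as sp

t2, t3 = sp.symbols('tau2 tau3')
omega, nu, n = sp.Rational(1), sp.Rational(1, 2), 3

def h_cal(f):
    return sp.expand(
        -2*t2*sp.diff(f, t2, 2) - 6*t3*sp.diff(f, t2, t3)
        + sp.Rational(2, 3)*t2**2*sp.diff(f, t3, 2)
        - (4*omega*t2 + 2*(1 + 3*nu))*sp.diff(f, t2)
        - 6*omega*t3*sp.diff(f, t3))

basis = [(p1, p2) for p1 in range(n + 1) for p2 in range(n + 1 - p1)]
mat = sp.zeros(len(basis), len(basis))
for j, (p1, p2) in enumerate(basis):
    img = sp.Poly(h_cal(t2**p1 * t3**p2), t2, t3)
    for i, (q1, q2) in enumerate(basis):
        mat[i, j] = img.coeff_monomial(t2**q1 * t3**q2)

eigs = sorted(mat.eigenvals(multiple=True))
expected = sorted(-2*omega*(2*p1 + 3*p2) for (p1, p2) in basis)
assert eigs == expected
print("eigenvalues match -2*omega*(2*p1 + 3*p2) on the degree <=", n, "space")
```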
6. extension of the 3-body sutherland model

the first algebraic form of the 3-body sutherland hamiltonian [13] appears after a gauge rotation with the ground state function, separation of the center-of-mass, and a change of variables to the elementary symmetric polynomials of the exponentials of the translationally-symmetric coordinates [5],

h_suth = −(2η_2 + \frac{α^2}{2} η_2^2 − \frac{α^4}{24} η_3^2) ∂^2_{η_2η_2} − (6 + \frac{4α^2}{3} η_2) η_3 ∂^2_{η_2η_3} + (\frac{2}{3} η_2^2 − \frac{α^2}{2} η_3^2) ∂^2_{η_3η_3} − [2(1 + 3ν) + 2(ν + \frac{1}{3}) α^2 η_2] ∂_{η_2} − 2(ν + \frac{1}{3}) α^2 η_3 ∂_{η_3}, (47)

where α is the inverse radius of the circle on which the bodies are situated. these new coordinates are fundamental trigonometric invariants of the a_2 weyl group. as shown in [5], the operator (47) can be rewritten in a lie-algebraic form in terms of the gl(3)-algebra generators of the representation [k, 0],

h_suth = −2 e_{11} t^-_1 − 6 e_{22} t^-_1 + \frac{2}{3} e_{12} e_{12} − 2(1 + 3ν) t^-_1 + \frac{α^4}{24} e_{21} e_{21} − \frac{α^2}{6} [3 e_{11} e_{11} + 8 e_{11} e_{22} + 3 e_{22} e_{22} + (1 + 12ν)(e_{11} + e_{22})]. (48)

now we can substitute the generators of the representation [k, n] in the form (11):

h̃_suth = −(2η_2 + \frac{α^2}{2} η_2^2 − \frac{α^4}{24} η_3^2) ∂^2_{η_2η_2} − (6 + \frac{4α^2}{3} η_2) η_3 ∂^2_{η_2η_3} + (\frac{2}{3} η_2^2 − \frac{α^2}{2} η_3^2) ∂^2_{η_3η_3} − 2[(1 + 3ν) + (ν + \frac{1}{3}) α^2 η_2 + (n − 2m_{22})] ∂_{η_2} + \frac{α^4}{24} m_{21} η_3 ∂_{η_2} + [2(ν + \frac{1}{3}) α^2 η_3 − \frac{4}{3} m_{12} η_2] ∂_{η_3} − \frac{α^2}{3} [3n(η_2∂_{η_2} + η_3∂_{η_3}) + m_{11} η_3∂_{η_3} + m_{22} η_2∂_{η_2}] + \frac{2}{3} m_{12} m_{12} + \frac{α^4}{24} m_{21} m_{21} − \frac{α^2}{6} [2 m_{11} m_{22} + (1 + 12ν + 3n)n]. (49)

this is an n × n matrix differential operator. it contains infinitely many finite-dimensional invariant subspaces, which are nothing but the finite-dimensional representation spaces of the algebra gl(3). this operator remains exactly-solvable, with the same spectrum as the scalar sutherland operator, and the operator (49) probably remains completely integrable. a non-trivial integral is a differential operator of the third order; it takes an algebraic form after gauging away the ground state function in the η coordinates. it can be rewritten in terms of the gl(3)-algebra generators of the representation [k, 0], which can then be replaced by the generators of the representation [k, n]. under such a replacement the spectrum of the integral remains unchanged and algebraic.

7. conclusions

the algebra gl_n of differential operators plays the role of a hidden algebra for all a_n, b_n, c_n, d_n, bc_n calogero–moser hamiltonians, both rational and trigonometric, with the weyl symmetry of classical root spaces (see [14] and references therein). we have described a procedure which, in our opinion, should carry the name of the havlicek procedure, for constructing the algebra gl_n of matrix differential operators. the procedure is based on a mixed, matrix-differential operator realization of the gauss decomposition diagram. as for the hamiltonian reduction models with the exceptional weyl symmetry groups g_2, f_4, e_{6,7,8}, both rational and trigonometric, there exist hidden algebras of differential operators (see [14] and references therein). all these algebras are infinite-dimensional but finitely-generated. for the generating elements of these algebras an analogue of the weyl–cartan decomposition exists, but in the gauss decomposition diagram a commutator of the lowering and raising generators is a polynomial of higher than first order in the cartan generators. matrix realizations of these algebras surely exist. thus, the above-mentioned procedure for building the mixed representations can be realized; it may lead to a new class of matrix exactly-solvable models with exceptional weyl symmetry.

acknowledgements

it was planned long ago to dedicate this text to miloslav havlicek, who has always been deeply respected by both authors as a scientist and also as a citizen. the text is based mainly on notes jointly prepared by the two authors; it does not include results obtained by the authors separately (except for section 4), which the authors had no chance to discuss. thus, the text will appear somewhat incomplete. when the first author (yufs) passed away, it took years for the second author (avt) to return to the subject due to sad memories; even now, almost a decade after the death of yura smirnov, the preparation of this text was quite difficult for avt. avt thanks crm, montreal, for their kind hospitality extended to him; a part of this work was done there during his numerous visits. avt is grateful to j. c. lopez vieyra for taking an interest in the work and for technical assistance. this work was supported in part by the university program fenomec, by papiit grant in109512, and by conacyt grant 166189 (mexico).

references
[1] a.v. turbiner. quasi-exactly-solvable problems and the sl(2, r) group, comm. math. phys. 118: 467–474, 1988.
[2] c. burdik. realisations of the real semisimple lie algebras: a method of construction, j. phys. a18: 3101–3111, 1985.
[3] c. burdik, m. havlicek. boson realization of the semi-simple lie algebras, in symmetry in physics: in memory of robert t. sharp, crm proceedings 2004, vol. 34, pp. 87–98, edited by r.t. sharp, p. winternitz.
[4] i.m. gelfand and m.l. tsetlin. finite-dimensional representations of groups of orthogonal matrices, dokl. akad. nauk sssr 71: 1017–1020, 1950 (in russian); english transl. in: i.m. gelfand, collected papers, vol. ii, berlin: springer-verlag, 1988, pp. 657–661.
[5] w. rühl and a.v. turbiner. exact solvability of the calogero and sutherland models, mod. phys. lett. a10: 2213–2222, 1995; arxiv:hep-th/9506105.
[6] a.v. turbiner. hidden algebra of three-body integrable systems, mod. phys. lett. a13: 1473–1483, 1998.
[7] a.v. turbiner. bc_2 lamé polynomials, talks presented at the 1085th special session of the american mathematical society, tucson, az, usa (october 2012) and at the annual meeting of the canadian mathematical society, montreal, canada (december 2012).
[8] a.v. turbiner. lie algebras and linear operators with invariant subspace, in lie algebras, cohomologies and new findings in quantum mechanics (n. kamran and p.j. olver, eds.), ams contemporary mathematics, vol. 160, pp. 263–310, 1994; arxiv:funct-an/9301001.
[9] s. lie. gruppenregister, vol. 5, b.g. teubner, leipzig, 1924, pp. 767–773.
[10] a. gonzález-lopéz, n. kamran and p.j. olver. quasi-exactly-solvable lie algebras of the first order differential operators in two complex variables, j. phys. a24: 3995–4008, 1991; lie algebras of differential operators in two complex variables, american j. math. 114: 1163–1185, 1992.
[11] f. tremblay, a.v. turbiner and p. winternitz. an infinite family of solvable and integrable quantum systems on a plane, j. phys. a42: 242001, 2009; arxiv:0904.0738.
[12] f. calogero. solution of a three-body problem in one dimension, j. math. phys. 10: 2191–2196, 1969.
[13] b. sutherland. exact results for a quantum many-body problem in one dimension i, phys. rev. a4: 2019–2021, 1971.
[14] a.v. turbiner. from quantum a_n (calogero) to h_4 (rational) model, sigma 7: 071, 2011; from quantum a_n (sutherland) to e_8 trigonometric model: space-of-orbits view, sigma 9: 003, 2013.

acta polytechnica 55(6):422–426, 2015, doi:10.14311/ap.2015.55.0422

on the speed of sound in steam

pavel šafařík^a,∗, adam nový^b, david jícha^{a,b}, miroslav hajšman^b
a czech technical university in prague, faculty of mechanical engineering, technická 4, 166 07 prague 6, czech republic
b doosan škoda power, tylova 1/57, 301 28 plzeň, czech republic
∗ corresponding author: pavel.safarik@fs.cvut.cz

abstract. a study of the speed of sound in a pure water substance is presented here. the iapws data on the state of water and steam are applied only for investigating the speed of sound in a one-phase medium. a special numerical model for investigating the parameters of shock waves in steam is presented here, and it is applied to extremely weak waves in order to obtain velocities representing the speed of sound in both one-phase and two-phase steam. problems with the speed of sound in two-phase steam are discussed, and three types of speed of sound are derived for the metastable region of wet steam.

keywords: speed of sound; steam; iapws-if97; shock wave.

1. introduction

the speed of sound is a physical quantity closely connected with the compressibility of a medium. physically, the speed of sound expresses the speed of the advance of undulation in a given medium.
generally, the speed of sound a in a continuum is defined as the square root of the ratio of an infinitesimal pressure disturbance ∂p to the corresponding infinitesimal change in density ∂ρ in an isentropic process,

a = \sqrt{ \left( \frac{∂p}{∂ρ} \right)_s }. (1)

the fundamental definition of the speed of sound is expressed by (1). there is no problem in evaluating the speed of sound when the equation of state in the form f(p, ρ, s) = 0 is known. the well-known relation for the speed of sound in an ideal gas is

a = \sqrt{κ p/ρ} = \sqrt{κ r t}. (2)

from (2) it follows that the speed of sound in an ideal gas depends on temperature only, because κ = const is the ratio of the heat capacities (the poisson constant) and r is the specific gas constant,

r = R/m, (3)

where R = 8314.41 j kmol⁻¹ k⁻¹ is the universal gas constant and m is the molar mass of the gas.

some considerations on the speed of sound in steam are presented in this paper. in their previous publications [1, 2], the authors pointed out problems with the propagation of waves in steam. one point is that the speed of sound in steam depends on the state parameters in a more complicated manner than in (2). this is obvious, because the equation of state for steam according to the data of the international association for the properties of water and steam (iapws) is complex. iapws has released two formulations for water and steam: the first is the formulation for general and scientific use, iapws-95 [3], and the second is the industrial formulation, iapws-if97 [4]. it should be mentioned here that data on the speed of sound in steam are available only for a one-phase medium; the available tools are tables or calculators. data on the speed of sound in wet steam have not yet been integrated. papers [5, 6] provide much stimulation for further studies of the speed of sound in wet steam, because they deal with the propagation of waves in wet steam; however, no measured data have in fact been published.

2. speed of sound in steam

the formulation of iapws-95 is a fundamental equation for the specific helmholtz free energy f(ρ, t). its dimensionless form is separated into two parts,

\frac{f(ρ, t)}{rt} = φ(δ, τ) = φ^o(δ, τ) + φ^r(δ, τ), (4)

where δ = ρ/ρ_c and τ = t_c/t. for the water substance reference constants, iapws defined in [3] the critical temperature t_c = 647.096 k, the critical density ρ_c = 322 kg m⁻³, and the specific gas constant r = 461.51805 j kg⁻¹ k⁻¹. the functions φ^o(δ, τ) and φ^r(δ, τ) are defined by iapws [2]. the speed of sound is then calculated from

a(δ, τ) = \left[ rt \left( 1 + 2δ \frac{∂φ^r}{∂δ} + δ^2 \frac{∂^2φ^r}{∂δ^2} − \frac{ \left( 1 + δ \frac{∂φ^r}{∂δ} − δτ \frac{∂^2φ^r}{∂δ\,∂τ} \right)^2 }{ τ^2 \left( \frac{∂^2φ^o}{∂τ^2} + \frac{∂^2φ^r}{∂τ^2} \right) } \right) \right]^{1/2}. (5)

the formulation of iapws-if97 is divided into 5 regions, where different fundamental equations are defined: the gibbs free energy g(p, t) and the helmholtz free energy f(ρ, t). the fundamental equation for the gibbs free energy is expressed in dimensionless form as

\frac{g(p, t)}{rt} = γ(π, τ), (6)

where π is the dimensionless reduced pressure and τ is the dimensionless reduced temperature. the function γ(π, τ) is defined by iapws [4], and the speed of sound is then obtained from

a(π, τ) = \sqrt{ \frac{ rt \, (∂γ/∂π)^2 \, τ^2 \, ∂^2γ/∂τ^2 }{ \left( ∂γ/∂π − τ \, ∂^2γ/(∂π\,∂τ) \right)^2 − (∂^2γ/∂π^2) \, τ^2 \, (∂^2γ/∂τ^2) } }. (7)

values of the speed of sound were evaluated according to (7) and were presented in [1] in a p–t (pressure–temperature) phase diagram; the values are shown in fig. 1.

figure 1. values of the speed of sound [m s⁻¹] in water and in steam in a p–t diagram.
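as a quick numerical illustration of the difference between (2) and the full formulation, the following sketch (assuming the coolprop package, which implements the iapws-95 formulation for water; κ ≈ 1.33 is a rough assumption for superheated steam, not a value from this paper) compares the two at a few single-phase states:

```python
# comparing the ideal-gas estimate (2) with the full-formulation speed of
# sound; assumes the CoolProp package (output key 'A' is speed of sound).
from math import sqrt
from CoolProp.CoolProp import PropsSI

R = 461.51805   # specific gas constant of water [J kg^-1 K^-1]
kappa = 1.33    # rough heat-capacity ratio for superheated steam (assumption)

for p, T in [(0.1e6, 423.15), (1.0e6, 573.15), (10.0e6, 673.15)]:
    a_ideal = sqrt(kappa * R * T)
    a_full = PropsSI('A', 'P', p, 'T', T, 'Water')   # speed of sound [m/s]
    print(f"p = {p/1e6:5.1f} MPa, T = {T:6.1f} K: "
          f"ideal gas {a_ideal:6.1f} m/s, IAPWS {a_full:6.1f} m/s")
```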
it is evident that ideal gas theory cannot be applied to water and steam.

3. the speed of sound in steam solved by means of the model for solving the thermodynamic parameters of steam downstream from a normal shock wave

an equilibrium model of a shock wave in steam was formulated in [1]. the theoretical approach for calculating the steam parameters is based on balance equations for steam passing an infinitesimally thin control volume on a normal shock wave (fig. 2). the three modified balance equations are the balance of mass, the balance of momentum, and the balance of energy (under the assumption of constant total enthalpy h_{01} = h_{02} = h_0):

figure 2. scheme of a shock wave, control volume, and parameters on a normal shock wave in steam.

• balance of mass: ṁ/a = ρ_1 v_1 = ρ_2 v_2; (8)
• balance of momentum: p_1 − p_2 = (ṁ/a)(v_2 − v_1), hence p_1 + ρ_1 v_1^2 = p_2 + ρ_2 v_2^2; (9)
• balance of energy: h_1 + v_1^2/2 = h_2 + v_2^2/2 = h_{01} = h_{02} = h_0; (10)
• equation of state: 1/ρ = f_{v,ph}(p, h). (11)

index 1 refers to conditions upstream from the shock wave, index 2 to conditions downstream from the shock wave, and 0 indicates a total value. all thermodynamic parameters upstream from the normal shock wave are given. in paper [1], the calculation procedure for a given pressure p_2 downstream from the shock wave (where p_2 > p_1) is derived. the iterative procedure is based on the balance equations and the equation of state of steam according to iapws-if97, and all thermodynamic parameters downstream from the normal shock wave are calculated. the wet steam model assumes isobaric separation of the two-phase medium; for further details, see [1]. relations for the velocities as functions of the calculated thermodynamic parameters can be derived from the balance equations. the velocity v_1 of steam upstream from the normal shock wave can be calculated according to the relation

v_1 = \frac{(p_2 − p_1)/ρ_1}{\sqrt{ 2(p_2 − p_1)/ρ_1 − 2(h_2 − h_1) }}. (12)

the velocity v_2 downstream from the normal shock wave is also derived from the balance equations, and can be calculated according to the relation

v_2 = v_1 − \frac{p_2 − p_1}{ρ_1 v_1}. (13)

the model for calculating the shock wave parameters was applied successfully to superheated, saturated and wet steam for various pressure ratios p_2/p_1. a special case is the limit p_2 → p_1: then v_1 = v_2 = a, the speed of sound.

figure 3. the h–s diagram of water and steam with curves of constant speed of sound a [m s⁻¹] in steam. in the wet steam region, the blue lines follow from definition (1) and the blue crosses are calculated by means of the model for normal shock waves; for superheated steam, the a values according to iapws-if97 are depicted by green curves.

3.1. results of the solution for the speed of sound

the numerical procedure for the model for calculating the thermodynamic parameters of steam downstream from a normal shock wave was performed for p_2/p_1 → 1. the values obtained for the speed of sound proved to be very close to the iapws data according to (7) for superheated and saturated steam. we developed a calculation tool in matlab, which proved to perform well. the results for wet steam are very close (less than a 2 % difference) to the values of the speed of sound in wet steam derived according to (1); see fig. 3, where a detail of the h–s (enthalpy–entropy) diagram for water and steam is depicted.
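a sketch of the iterative procedure (hypothetical helper names, assuming coolprop for the equation of state (11) and scipy for root finding) is given below: for a prescribed p_2, the unknown h_2 is found from the mass balance (8), with v_1 and v_2 supplied by (12) and (13); for p_2/p_1 close to one, the resulting v_1 approaches the speed of sound. numerical robustness for extremely weak waves is not addressed here:

```python
# sketch of the normal-shock iteration of section 3 (assumes CoolProp and
# scipy; the function and variable names are ours, not from the paper's tool).
from math import sqrt
from CoolProp.CoolProp import PropsSI
from scipy.optimize import brentq

def shock_velocities(p1, T1, p2, fluid='Water'):
    rho1 = PropsSI('D', 'P', p1, 'T', T1, fluid)   # upstream state (given)
    h1 = PropsSI('H', 'P', p1, 'T', T1, fluid)
    dpr = (p2 - p1) / rho1

    def residual(h2):
        rho2 = PropsSI('D', 'P', p2, 'H', h2, fluid)   # equation of state (11)
        v1 = dpr / sqrt(2.0 * dpr - 2.0 * (h2 - h1))   # eq. (12)
        v2 = v1 - (p2 - p1) / (rho1 * v1)              # eq. (13)
        return rho1 * v1 - rho2 * v2                   # mass balance (8)

    # h2 must stay below h1 + (p2 - p1)/rho1 for the square root in (12)
    h2 = brentq(residual, h1 + 1e-9 * dpr, h1 + (1.0 - 1e-9) * dpr)
    v1 = dpr / sqrt(2.0 * dpr - 2.0 * (h2 - h1))
    return v1, v1 - (p2 - p1) / (rho1 * v1)

p1, T1 = 0.1e6, 423.15                    # superheated steam upstream
v1, v2 = shock_velocities(p1, T1, 1.001 * p1)
print(f"weak wave: v1 = {v1:.1f} m/s vs "
      f"a = {PropsSI('A', 'P', p1, 'T', T1, 'Water'):.1f} m/s")
```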
the assumption of an infinitesimally weak normal shock wave defines the equilibrium conditions for wet steam, so the speed of sound values obtained here correspond to the speed of sound for the thermodynamic equilibrium state of wet steam. a numerical model for calculating the speed of sound in steam has thus been developed and verified. figure 3 shows a special effect of the speed of sound on the steam saturation line: there is a considerable discontinuity of the values. this effect can be explained by a discontinuity of the first derivative of the lines of constant temperature on the steam saturation line in the h–s (enthalpy–entropy) and p–v (pressure–specific volume) diagrams, and perhaps by a discontinuity of the first derivative of the lines of constant pressure on the saturation line in the t–s (temperature–entropy) diagram for steam. the state parameters, namely pressure p, density ρ and entropy s, are defined in the two-phase region by means of the maxwell rule as equilibrium parameters. from the thermodynamic point of view, wet steam in the equilibrium state is therefore considered to be a continuum which contains both phases of water (saturated liquid water and saturated steam) according to the dryness value. the speed of sound obtained from this numerical model for wet steam should be referred to as the equilibrium speed of sound.

4. the speed of sound in wet steam in the metastable region

4.1. equilibrium speed of sound

the equilibrium speed of sound can be evaluated for the whole wet steam region, see fig. 3. the low equilibrium speed of sound values near the saturation line of water are not acceptable. this is shown in fig. 4, where the speed of sound values are depicted as functions of the dryness x for constant pressures p.

figure 4. dependence of the values of the equilibrium speed of sound a on the dryness of wet steam at constant pressures p.

4.2. frozen speed of sound

the assumption is often made that isobars and isotherms are identical in the wet steam region when calculating the speed of sound in wet steam. according to (2), the value of the speed of sound is then calculated using the relation

a_f = \sqrt{κ r t''}, (14)

where t'' is the temperature of saturated steam. the lines of constant speed of sound are therefore identical to the isobars in wet steam. the speed of sound calculated using this assumption should be referred to as the frozen speed of sound. this approach can be applied for approximate calculations in the region of temperatures up to 100 °c and dryness of wet steam from 0.98 to 1.00. however, it is more convenient to determine the frozen speed of sound from the iapws data for saturated steam.
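the frozen value (14) can be compared directly against the saturated-steam data, e.g. as in the sketch below (assuming coolprop; κ ≈ 1.33 is again a rough assumption; if the installed backend refuses the saturated-vapor query at q = 1, the same value can be approached from the single-phase side):

```python
# frozen speed of sound (14) from the saturation temperature, compared with
# the saturated-vapor value from the IAPWS-based data (assumes CoolProp).
from math import sqrt
from CoolProp.CoolProp import PropsSI

R, kappa = 461.51805, 1.33
for p in [1e3, 1e4, 1e5]:                # up to ~100 degC, cf. section 4.2
    T_sat = PropsSI('T', 'P', p, 'Q', 1, 'Water')
    a_frozen = sqrt(kappa * R * T_sat)            # eq. (14)
    a_sat = PropsSI('A', 'P', p, 'Q', 1, 'Water') # saturated-vapor value
    print(f"p = {p:8.0f} Pa: T'' = {T_sat:6.2f} K, "
          f"a_f = {a_frozen:6.1f} m/s, a(sat. vapor) = {a_sat:6.1f} m/s")
```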
4.3. metastable speed of sound

a further approach is based on a continuum model, so that the dependence of the speed of sound on the specific enthalpy is extended from the region of superheated steam into the region of wet steam. if superheated steam is considered as an ideal gas, the dependence of the speed of sound on the specific enthalpy is parabolic. it can be derived from (2):

a_{mc} = \sqrt{(κ − 1) h_{id}}, (15)

where κ is the ratio of specific heat capacities and h_{id} is the specific enthalpy of an ideal gas, defined as zero at zero total temperature. the iapws-if97 data are depicted in fig. 5 in a diagram of speed of sound vs. specific enthalpy for constant pressures. in the region of superheated steam the dependencies are similar curves, but the equilibrium speed of sound has a discontinuity on the saturation line. it should be mentioned here that the specific enthalpy of steam is applied according to the iapws definition: at the triple point, the enthalpy of steam is h_{t.p.} = 2500.9 kj kg⁻¹. in the new model, the curves are prolonged into the region of wet steam; the speed of sound defined by these prolonged curves should be referred to as the metastable speed of sound. the region from the saturation line of steam to the intersection of the prolonged curves with the dependencies of the equilibrium speed of sound determines the region of the metastable speed of sound. we estimated the numerical uncertainty of the calculation to be lower than 5 %, and it can be further improved. the conditions for equality of the metastable and equilibrium speeds of sound are determined and given in table 1.

pressure p [mpa]   specific enthalpy h [kj kg⁻¹]   dryness x [kg_sat kg_wet⁻¹]
0.001              2421.89                          0.9629
0.010              2473.12                          0.9534
0.100              2542.29                          0.9411
1.000              2645.18                          0.9350
2.000              2658.32                          0.9264
5.000              2660.51                          0.9186
10.000             2555.87                          0.8699

table 1. parameters of wet steam where the metastable and equilibrium speeds of sound are equal.

it is remarkable that the dryness values in table 1 are very close to the wilson line for the condensation of steam into droplets. the speed of sound in the metastable region can acquire any value between the metastable speed of sound and the equilibrium speed of sound; the values of the frozen speed of sound are rather overvalued. for dryness values lower than those given in table 1, continuum models lose validity, and it is necessary to investigate the propagation of waves in a heterogeneous (two-phase) environment; examples can be found in [5, 6]. the metastable-vapor region is defined by iapws [4, 7] from the saturation line to the 95 % equilibrium dryness line for pressures from the triple-point pressure up to 10 mpa.

figure 5. dependence of the speed of sound on the specific enthalpy for constant pressures.

5. conclusions

a numerical model for calculating the speed of sound in steam, based on the balances of mass, momentum and energy and on an equation of state for steam based on iapws-if97, has been developed and verified. our results are in very good accordance with the iapws data in the region of superheated steam, where the relative error ranges from 0.3 % up to a maximum of 1.5 % in the surroundings of the critical point. a proper method for determining the uncertainty of the calculated values of the speed of sound in wet steam has not yet been determined. a detailed study of the speed of sound in wet steam has attempted to determine the metastable region and to define the speed of sound in this region. it should be pointed out that no measured data are available for the speed of sound in wet steam, and there are also only limited theoretical resources.

list of symbols
p pressure [pa]
a speed of sound [m s⁻¹]
ρ density [kg m⁻³]
t temperature [k]
t temperature [°c]
κ ratio of heat capacities (poisson constant) [1]
r specific gas constant [j kg⁻¹ k⁻¹]
v velocity [m s⁻¹]
h specific enthalpy [kj kg⁻¹]
s specific entropy [kj kg⁻¹ k⁻¹]
x dryness [kg_sat kg_wet⁻¹]
a area [m²]
ṁ mass flux [kg s⁻¹]

acknowledgements
support from the technology agency of the czech republic under project te01020036 is gratefully acknowledged. doosan škoda power also provided crucial support for this research.
references
[1] šafařík, p., nový, a., hajšman, m., jícha, d. on a model for solution of shock wave parameters in wet steam, paper no. pws-011, in: 16th international conference on the properties of water and steam, proceedings (electronic form), london: imeche, 2013.
[2] nový, a., jícha, d., šafařík, p., hajšman, m. on parameters of shock waves in saturated steam, pp. 43–46, in: topical problems of fluid mechanics 2013, proceedings, prague, 2013.
[3] iapws. revised release on the iapws formulation 1995 for the thermodynamic properties of ordinary water substance for general and scientific use, iapws release, 2014.
[4] iapws. revised release on the iapws industrial formulation 1997 for the thermodynamic properties of water and steam, iapws release, 2007.
[5] petr, v. wave propagation in wet steam, journal of mechanical engineering science, vol. 218, part c, pp. 871–882, 2004. doi:10.1243/0954406041474237
[6] young, j. b., guha, a. normal shock-wave structure in two-phase vapor-droplet flows, journal of fluid mechanics, vol. 228, pp. 243–274, 1991. doi:10.1017/s0022112091002690
[7] wagner, w., cooper, j.r., dittmann, a., kijima, j., kretzschmar, h.-j., kruse, a., mareš, r., oguchi, k., sato, h., stöcker, i., šifner, o., takaishi, y., tanishita, i., trübenbach, j., willkommen, th. the iapws industrial formulation 1997 for the thermodynamic properties of water and steam, journal of engineering for gas turbines and power, transactions of the asme, vol. 122, pp. 150–182, 2000. doi:10.1115/1.483186

acta polytechnica 55(5):319–323, 2015, doi:10.14311/ap.2015.55.0319

pyrolysis of brown coal using a catalyst based on w–ni

lenka jílková^a,∗, karel ciahotný^a, jaroslav kusý^b
a department of gas, coke and air protection, institute of chemical technology prague, technická 5, prague 166 28, czech republic
b brown coal research institute, budovatelů 2830, most 434 37, czech republic
∗ corresponding author: lenka.jilkova@vscht.cz

abstract. tars from the pyrolysis of brown coal can be refined to obtain compounds suitable for fuel production. however, it is problematic to refine the liquids from brown coal pyrolysis, because high-molecular compounds are produced and the sample solidifies. we therefore decided to investigate the possibility of treating the product in the gas phase during pyrolysis, using a catalyst. a two-step process was investigated: thermal-catalytic refining. in the first step, alumina was used as the filling material, and in the second step a catalyst based on w–ni was used. these materials were placed in two separate layers above the coal, so that the volatile products passed through the alumina and catalyst layers.
pyrolysis tests showed that using the catalyst has no significant effect on the mass balance, but it improves the properties of the gas and the properties of the organic part of the liquid pyrolysis products, which will then be processed further.

keywords: catalytic pyrolysis; w–ni catalyst; brown coal.

1. introduction

the pyrolysis products of brown coal are [1, 2]:
(1.) a solid pyrolysis residue, referred to as coke or semi-coke, which is used as a fuel or as an adsorbent;
(2.) a liquid product, which consists of an organic part and a water part and can be utilized as a fuel after refining;
(3.) a gas, which is often used for heating the reactor.

the highest yields of liquid products can be achieved by fast pyrolysis, in which the products are rapidly removed from the pyrolysis reactor [3]. the properties of the liquid product are closely related to the material properties and the process conditions [4]. unlike the yield of coke, the yield of volatile matter increases as the pyrolysis temperature increases. increasing the residence time of the volatile product in the hot area leads to secondary reactions, such as thermal cracking, polymerization and condensation, which reduce the yield of the liquid product [5, 6]. substances containing sulphur, nitrogen and oxygen allow easier thermal decomposition, which leads to the formation of carbon dioxide, carbon monoxide, ammonia, hydrogen sulphide, pyrogenetic water and other products [1]. when the material is being heated, the following typical processes take place [7, 8]:
(1.) evaporation of water and desorption of adsorbed gases (up to 150 °c);
(2.) the onset of thermal decomposition, formation of the first hydrocarbon gases, co2, co and h2s (150–300 °c);
(3.) formation of tar, ch4 and higher hydrocarbons (300–400 °c);
(4.) ongoing thermal decomposition of organic matter, formation of hydrogen and pyrolysis carbon (semi-coke) (400–600 °c);
(5.) tar formation comes to an end, while the decomposition of organic matter to hydrogen and carbon continues (600–1000 °c).

the organic part of the liquid can be used for fuel production. after pyrolysis, the organic part must be separated from the liquid product contaminated by water. the organic product must be hydrogenated to decompose the compounds with a high molecular weight, because these compounds can cause the sample to solidify; this can be achieved by using a catalyst during pyrolysis. after hydrogenation, or pyrolysis with the catalyst, the organic liquid product has properties similar to those of petroleum. this suggests that these products could also be processed in refineries for conversion to engine fuel [3].

2. material and methods

2.1. pyrolysis apparatus

figure 1. pyrolysis apparatus: 1 – furnace, 2 – water cooler, 3 – separators, 4 – sampling bag or gas burner.

the pyrolysis apparatus (fig. 1) was located in the laboratory of the brown coal research institute in most. the batch (500 g) was poured into a metal retort, which was placed in an electrically heated furnace. in the retort, 2 cm above the coal, there was a stainless steel grid, on which the alumina (a layer 1.5 cm in thickness) or the alumina and the catalyst (a layer 1.5 cm in thickness plus a layer 1 cm in thickness) was placed. the retort was heated to 650 °c (at 8.93 °c min⁻¹), and this temperature was maintained for 3 hours. the system was purged with 20–30 dm³ of nitrogen when the temperature in the reactor was 200 °c.
the volatile compounds that were generated were conducted into a water cooler and then into four serially connected separators, where the liquid condensate accumulated. the pyrolysis gas was collected in a sampling bag once the temperature in the pyrolysis reactor reached 510 °c, and was analyzed using a gc 82tt (labio praha) a few minutes later. the redundant pyrolysis gas was burned in a gas burner. after each pyrolysis test, a mass balance was performed and an analysis was made of the liquid products (using a gc hp 6890 coupled with a mass detector).

                          brown coal             brown coal + alumina   brown coal + alumina + catalyst
                          mass [g]  part [wt %]  mass [g]  part [wt %]  mass [g]  part [wt %]
batch                     500       100          500       100          500       100
semi-coke                 257.14    51.43        258.93    51.79        259.51    51.90
water part of the tar     110.50    22.10        107.71    21.54        114.00    22.79
organic part of the tar   25.14     5.03         24.00     4.80         18.48     3.70
gas + losses              —         21.44        —         21.87        —         21.61

table 2. mass balance of brown coal pyrolysis; brown coal with alumina; brown coal with alumina and the catalyst.

substance                                part [wt %]
analytical water w^a                     6.9
analytical ash a^a                       5.3
sulphur in a dry sample s^d              1.0
carbon in a dry sample c^d               74.6
volatile flammable in a dry sample v^d   57.7
combustion heat q_s                      31.3

table 1. properties of the brown coal.

2.2. brown coal for pyrolysis

brown coal from the čsa mine was used as the feedstock. the basic properties of the brown coal are shown in table 1. for the tests on the pyrolysis apparatus the following were used: brown coal; brown coal with alumina; and brown coal with alumina and a catalyst based on w–ni.

2.3. analytical methods

off-line analysis of the pyrolysis gases. the pyrolysis gases were collected in sampling bags and were analyzed immediately, using a gc 82tt labio praha with a dual thermal conductivity detector (tcd). hydrogen, oxygen, nitrogen, methane and carbon monoxide were determined by the first tcd (150 °c, stainless steel column: length 2 m, diameter 3.2 mm, stationary phase: molecular sieve 5a, carrier gas: argon). carbon dioxide was determined by the second tcd (150 °c, teflon column: length 2 m, diameter 3.2 mm, stationary phase: porapak q, carrier gas: helium).

off-line analysis of the water and the organic part of the pyrolysis condensate. the liquid products were analyzed qualitatively, using a gc hp 6890 with a mass detector (msd 5973). the gas chromatograph had an mtx-1 metallic column (length 30 m, diameter 250 µm, carrier gas helium). the column was at a temperature of 50 °c for the first two minutes, and then the temperature was raised to 320 °c (at 15 °c min⁻¹); a temperature of 320 °c was maintained for the next five minutes.

component          brown coal   brown coal + alumina   brown coal + alumina + catalyst
hydrogen           0.65         1.31                   10.73
oxygen             1.20         0.89                   0.51
nitrogen           7.57         6.04                   5.70
carbon monoxide    10.71        12.46                  14.80
methane            6.84         8.52                   9.68
carbon dioxide     38.47        39.12                  35.84
ethene             0.13         0.31                   0.41
ethane             0.42         1.30                   1.51
propene            0.33         0.45                   0.48
propane            0.50         0.65                   0.82

table 3. composition of the pyrolysis gases, in vol %.

figure 2. content of phenol, 1,2-benzenediol and acetic acid in the water parts of the pyrolysis condensates.

3. results and discussion

3.1. mass balance

table 2 presents the mass balances of the individual pyrolysis runs. it shows that the use of alumina, or of alumina with the catalyst, has no major effect on the mass balance. the only exception is the organic part of the condensate, which decreased from 5.03 wt % (brown coal) to 4.80 wt % (brown coal with alumina), and then to 3.70 wt % (brown coal with alumina and the catalyst).
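the wt % columns of table 2 are straightforward to reproduce; the sketch below (plain python, all names ours) also shows that the "gas + losses" entry follows by difference, since the gas itself was not weighed:

```python
# reproducing the wt % columns of table 2 from the measured masses; the
# 'gas + losses' entry is obtained by difference (a minimal sketch).
batch = 500.0
runs = {
    'brown coal':                  {'semi-coke': 257.14, 'water part': 110.50, 'organic part': 25.14},
    'brown coal + alumina':        {'semi-coke': 258.93, 'water part': 107.71, 'organic part': 24.00},
    'brown coal + alumina + cat.': {'semi-coke': 259.51, 'water part': 114.00, 'organic part': 18.48},
}
for run, masses in runs.items():
    parts = {name: 100.0 * m / batch for name, m in masses.items()}
    parts['gas + losses'] = 100.0 - sum(parts.values())
    line = ', '.join(f"{name} {w:.2f} wt %" for name, w in parts.items())
    print(f"{run}: {line}")
```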
3.2. composition of pyrolysis gases

table 3 presents the results of the analysis of the pyrolysis gases. the analysis showed that the content of hydrogen, carbon monoxide and methane in the pyrolysis gas increased significantly when the catalyst was used. at the same time, the content of nitrogen and carbon dioxide in the pyrolysis gas decreased.

3.3. composition of the water parts of the pyrolysis condensate

in each sample of the water part of the pyrolysis condensate, about 25 compounds were identified by the gc-ms method. these were mostly polar compounds, such as phenol, 1,2-benzenediol and acetic acid. figure 2 shows the contents of these three compounds in the water parts of the pyrolysis condensates. the relative contents in percentages (y-axis) in the graphs are in fact the percentages of the peak area of the respective compound relative to the total peak area of the analyzed sample. the graph in fig. 2 shows that the use of alumina and the catalyst during the pyrolysis of brown coal decreased the content of 1,2-benzenediol in the water part; there is no clear trend for phenol and acetic acid. a significant trend is revealed when we compare the compounds according to their functional groups, see fig. 3: there is an increasing content of phenols, unlike the content of acids and of aromatic alcohols with two –oh groups.

figure 3. content of compounds (selected according to functional groups) in the water parts of the pyrolysis condensates.

figure 4. content of phenol, 4-methylphenol and 2,4-dimethylphenol in the organic parts of the pyrolysis condensates.

3.4. composition of the organic parts of the pyrolysis condensate

over 100 compounds were identified by the gc-ms method in each sample of the organic part of the pyrolysis condensate. the compounds that were identified include phenol, 4-methylphenol, and 2,4-dimethylphenol. figure 4 shows the content of these three compounds in the organic parts of the pyrolysis condensates. the relative contents as a percentage (y-axis) in the graphs are in fact the percentages of the peak area of the respective compound relative to the total peak area of the analyzed sample. the graph in fig. 4 shows that the highest content of the identified phenols is in the organic part of the condensate from the pyrolysis of brown coal with alumina. a clear trend is apparent from the comparison of compounds selected according to functional groups.
when alumina, or alumina with the catalyst, is used, the content of phenols decreases during pyrolysis, whereas the content of aromatic and aliphatic hydrocarbons increases slightly. the decrease in the content of phenols and substituted phenols may be attributed to a positive influence of the catalyst on the hydrogenation of the phenols, even though the operating conditions were not optimal for the catalyst (a low temperature in the incipient phase of pyrolysis).

figure 5. content of compounds (selected according to functional groups) in the organic parts of the pyrolysis condensates.

3.5. conclusions

the use of alumina, or alumina with the catalyst, during the pyrolysis of brown coal has no major effect on the mass balance, but it improves the gas properties and the properties of the organic part of the liquid pyrolysis products, which are then further processed. the content of hydrogen, carbon monoxide and methane in the pyrolysis gas increased significantly when the catalyst was used during pyrolysis; however, the content of nitrogen and carbon dioxide in the pyrolysis gas decreased. the increasing content of hydrogen supports desirable reactions in the pyrolysis reactor, leading to better properties of the organic part that are useful for refining. the decrease in the content of phenols and substituted phenols may be attributed to the positive influence of the catalyst on the hydrogenation of phenols, even though the operating conditions were not optimal for the catalyst (a low temperature in the incipient phase of pyrolysis). the catalyst should be placed in a separately heated reactor, leading to a further decrease in the phenol content.

acknowledgements
financial support from specific university research (msmt no 20/2013).

references
[1] a. procházková, j. káňa, j. kusý, k. ciahotný, l. jílková, v. vrbová: aprochem 2011, kouty nad desnou, 11.–13. 4. 2011, part 1, pp. 305–312, kouty nad desnou, 2011.
[2] g. k. janssens, g. reggers, h. pastijn, j. yperman, m. jans, m. stals, r. carleer, s. schreurs, t. cornelissen, t. kuppens, t. thewys: j. anal. appl. pyrolysis, 85, 87–97 (2009). doi:10.1016/j.jaap.2008.12.003
[3] r. h. venderbosch, w. prins: biofuels, bioprod. biorefin., 6, 178–208 (2010). doi:10.1002/bbb.205
[4] w. gerhartz: ullmann's encyclopedia of industrial chemistry, part a7, vch, 5th ed., weinheim, 1986.
[5] h. knoetze, j. gorgens, m. carrier, t. hugo: j. anal. appl. pyrolysis, 90, 12–26 (2011). doi:10.1016/j.jaap.2010.10.001
[6] r. he, y. chen: j. anal. appl. pyrolysis, 90, 72–79 (2011). doi:10.1016/j.jaap.2010.10.007
[7] r. riedl, v. veselý: technologie paliv. sntl, praha, 1962.
[8] s. landa: paliva a jejich využití. sntl, praha, 1956.
acta polytechnica 54(5):333–340, 2014, doi:10.14311/ap.2014.54.0333

alternative selection functions for information set monte carlo tree search

viliam lisý
agent technology center, department of computer science, fee, czech technical university in prague
correspondence: viliam.lisy@agents.fel.cvut.cz

abstract. we evaluate the performance of various selection methods for the monte carlo tree search algorithm in two-player zero-sum extensive-form games with imperfect information. we compare the standard upper confidence bounds applied to trees (uct) along with the less common exponential weights for exploration and exploitation (exp3) and the novel regret matching (rm) selection in two distinct imperfect information games: imperfect information goofspiel and phantom tic-tac-toe. we show that uct, after an initial fast convergence towards a nash equilibrium, computes increasingly worse strategies after some point in time. this is not the case with exp3 and rm, which also show superior performance in head-to-head matches.

keywords: monte carlo tree search, imperfect information game, selection function, regret matching.

1. introduction

monte carlo tree search (mcts) is a family of sample-based tree search algorithms that has recently led to a significant improvement in the quality of state-of-the-art solvers for perfect information problems, such as the game of go [1] or domain independent planning [2]. the main idea of monte carlo tree search is to run a large number of randomized simulations of the problem and learn the best actions to choose from this experience. the earlier simulations are generally used to create statistics that help to guide the later simulations to more important parts of the search space and to decide on the best action to take in the current state of the game. the core component of the algorithm, determining which action to choose next and what statistics to collect, is called a selection function.

inspired by the success of mcts in perfect information games, the algorithm has recently also been adapted for imperfect information games [3–5]. games with imperfect information are fundamentally more complicated than perfect information games, for several reasons. the most important complication is that the optimal strategies in imperfect information games may require the players to make randomized decisions. for example, in the well-known game of rock-paper-scissors, none of the available actions can be considered optimal: playing any one of the actions all the time can always be exploited by the opponent, and the optimal strategy against a rational opponent is to play each action with the same probability. another important complication is the strong inter-dependency between the strategies in different parts of the game.
a player often does not know the exact state of the game during the match, and the probability of the game being in the individual possible states depends on the opponent's strategy in previous decisions in the game, which can in turn depend on the optimal strategy in any other decision in the game. game theory provides the means to deal with all these complications, but previous attempts to adapt mcts to imperfect information games generally did not evaluate the properties of the resulting strategies from the perspective of game theory; the algorithms were developed mainly on the basis of heuristics and analogies with the perfect information case.

in this paper, we analyze various selection functions in information set monte carlo tree search (is-mcts) [3]. we show that the most commonly used selection function, uct [6], does not allow the algorithm to converge to good strategies, and even causes the strategies to get worse with more computation time. we further evaluate two additional selection functions: exp3 [7], which has previously been used in mcts mainly to handle simultaneous moves [3, 8, 9], and regret matching [10], previously evaluated in the context of simultaneous move games [9], but never used for mcts in generic imperfect information games. in an imperfect information variant of the game of goofspiel and in phantom tic-tac-toe, we show that these alternative selection strategies allow is-mcts to converge closer to the nash equilibrium strategy and to perform better in mutual matches.

the following section introduces the basic game-theoretic concepts necessary to describe the is-mcts algorithm, along with the existing and novel selection functions, in section 3. afterwards, we continue with an experimental evaluation in section 4, and then we conclude the paper.

2. definitions and background

we focus on domains that can be modelled as two-player zero-sum extensive-form imperfect-information games.

2.1. extensive form games

we adapt the definition of the extensive-form game (efg) from [11].

figure 1. example of a zero-sum imperfect-information extensive-form game.

definition 1 (extensive-form game). an extensive form game with imperfect information consists of:
• n , the set of players, including the nature player (c) that represents the dynamics of the game;
• for each i ∈ n , a_i, the set of actions available to player i;
• h, the set of possible states of the game, each corresponding to a history of actions performed by all players;
• z ⊆ h, the set of terminal game states or histories;
• p : h \ z → n , a function assigning to each non-terminal state the player that selects an action in the state;
• a : h \ z → 2^{a_i}, a function assigning to each non-terminal game state the actions applicable by the acting player;
• t : h × a_i → h ∪ z, the transition function realizing one move of the game;
• u_i : z → r, for each i ∈ n \ {c}, the utility functions of the players, defined only on the terminal states of the game;
• i_i, for player i, represents the player's imperfect information about the game. it is a partition of h_i = {h ∈ h : p(h) = i}, termed information sets. each information set i ∈ i_i represents a set of histories that are indistinguishable for player i; therefore, we naturally extend a(i) = a(h) for some h ∈ i.
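a minimal data-structure sketch of definition 1 (plain python; all field names are ours and hypothetical) may help make the information-set partition concrete; the toy game below is a matching-pennies-like tree in which player 2 does not observe player 1's move, so both of player 2's histories share a single information set:

```python
# minimal container mirroring definition 1 (hypothetical field names); in this
# sketch a history is simply the string of actions taken so far.
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass(frozen=True)
class EFG:
    actions: Callable[[str], List[str]]   # a(h): actions applicable in h
    player: Callable[[str], int]          # p(h): acting player (no chance here)
    terminal_utility: Dict[str, float]    # u1 on z (zero-sum: u2 = -u1)
    infoset_key: Callable[[str], str]     # equal keys = indistinguishable to p(h)

# matching pennies as a two-level tree: player 2 cannot tell 'h' from 't',
# so those two histories fall into one information set.
pennies = EFG(
    actions=lambda h: [] if len(h) == 2 else ['h', 't'],
    player=lambda h: 1 if h == '' else 2,
    terminal_utility={'hh': 1.0, 'tt': 1.0, 'ht': -1.0, 'th': -1.0},
    infoset_key=lambda h: 'p1-root' if h == '' else 'p2-root',
)
assert pennies.infoset_key('h') == pennies.infoset_key('t')
```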
the game starts with the empty history ∅. in each history h, player p(h) selects an action a ∈ a(h). after the action is performed, the game proceeds to history h′ = t(h, a), often also denoted ha. the game ends when it reaches a terminal history h ∈ z, and each player i ∈ n receives the payoff u_i(h). a zero-sum game is a two-player game (|n \ {c}| = 2) such that, for each terminal history h ∈ z, the reward of one player is the loss of the other player (u_i(h) = −u_{−i}(h)).

extensive form games can be represented by a game tree, such as the one in figure 1. the set of histories h are the nodes in the tree and z are the leaves. we denote the nodes where the first player selects an action (h_1) by △, and the nodes of the second player (who minimizes the utility of the first player) by ▽. in this example, the action sets of the players are a_1 = {a, b, …, h} and a_2 = {a, b, …, j}. we denote the information sets by the ellipses around the tree nodes. after the history h = aa, player △ decides about the next move from a(h) = {c, d}; but in the game, he has exactly the same information as if h′ = ab were the current state of the game. the numbers in the leaves denote the utility of the first player.

a pure strategy in an extensive form game is a function that assigns to each information set i ∈ i_i of player i an action from a(i). each pair of pure strategies then naturally defines the utility function as the expected value of playing this pair of strategies; the expectation is taken over the actions in the chance nodes, in which the action is selected according to a commonly known probability distribution. the set of mixed strategies of an extensive form game is defined as the set of all probability distributions over the pure strategies. in games where players do not forget the actions they performed, a theorem by kuhn [12] allows us to use the following equivalent, but much more compact, representation of the mixed strategies: a behavioral strategy assigns to each information set a probability distribution over the available actions. if not specified otherwise, by a strategy in an extensive form game we mean the behavioral strategy. we denote the set of all mixed strategies of player i ∈ n by σ_i. we term a vector of one strategy for each player, σ ∈ π_{i∈n} σ_i, a strategy profile; we denote by σ_i the strategy of player i and by σ_{−i} the strategy of the opponent of i. ∆(a_i) is the set of all probability distributions over a_i. u_i can be naturally extended to mixed strategies as the expectation over the pure strategies.

definition 2 (best response). for a strategy profile σ, we define a best response of player i to the opponent's strategy σ_{−i} as the strategy br(σ_{−i}) ∈ arg max_{σ′_i ∈ σ_i} u_i(σ′_i, σ_{−i}).

one of the most fundamental solution concepts in game theory is the nash equilibrium [11].

definition 3 (nash equilibrium). a strategy profile σ* ∈ σ is a nash equilibrium if u_i(σ_i, σ*_{−i}) ≤ u_i(σ*) for all i ∈ n and all σ_i ∈ σ_i.

in words, in a nash equilibrium each player plays a best response to the strategies of the other players. in zero-sum games, a nash equilibrium is a very appealing strategy to play. it prescribes a (generally randomized) strategy that is optimal in several respects: it gains the highest expected reward against its worst-case opponent, even if the opponent knows the strategy played by the player in advance. moreover, in the zero-sum setting, even if the opponent does not play rationally, the strategy still guarantees at least the reward it would have gained against the rational opponent. this guaranteed reward is termed the value of the game.
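for a normal-form example, the value, an equilibrium strategy and the exploitability of definition 4 (introduced next) can be computed by linear programming. the sketch below (assuming numpy and scipy; the lp encoding is the standard one for zero-sum matrix games) does this for rock-paper-scissors; the non-uniform strategy σ = (0.5, 0.3, 0.2) has exploitability 0.3:

```python
# value and equilibrium of a zero-sum matrix game via LP, plus the
# exploitability of a fixed strategy (assumes numpy and scipy).
import numpy as np
from scipy.optimize import linprog

A = np.array([[0, -1, 1],
              [1, 0, -1],
              [-1, 1, 0]], dtype=float)   # row player's payoffs for r, p, s

n = A.shape[0]
# maximize v subject to A^T x >= v, sum(x) = 1, x >= 0; variables (x, v)
res = linprog(c=[0.0] * n + [-1.0],
              A_ub=np.hstack([-A.T, np.ones((n, 1))]), b_ub=np.zeros(n),
              A_eq=[[1.0] * n + [0.0]], b_eq=[1.0],
              bounds=[(0, None)] * n + [(None, None)])
x, v = res.x[:n], res.x[n]
print("equilibrium strategy:", np.round(x, 3), " game value:", round(v, 6))

sigma = np.array([0.5, 0.3, 0.2])   # an exploitable strategy
u_vs_br = min(sigma @ A)            # opponent plays a best response
print("exploitability of sigma:", v - u_vs_br)   # 0.3 for this sigma
```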
the distance of a strategy from a nash equilibrium can be measured in terms of exploitability (e.g., [13]).

definition 4 (exploitability). the exploitability of strategy σ_i is expl(σ_i) = v_i − u_i(σ_i, br(σ_i)), where v_i is the value of the game for player i. the exploitability of a strategy profile σ is expl(σ) = u_i(σ_i, br(σ_i)) − u_i(br(σ_{−i}), σ_i).

in this paper, we compute the exploitability using the best response function described in [14].

3. information set monte carlo tree search

information set monte carlo tree search (is-mcts) is a monte carlo tree search variant for imperfect information games. fundamentally very similar algorithms have previously been formulated in two different ways. the formulation that is easier to understand is presented in [4]: the mcts iterations are performed on the complete game tree of the extensive form game (i.e., all possible states of the game); however, the statistics for the selection algorithm, such as uct, are collected for the whole information set. if any of the nodes in the information set is reached, the selection algorithm is used to select the next move and, subsequently, the statistics stored by this algorithm are updated by the result of the simulation. in the tree expansion step, the authors in [4] suggest adding a single node to the extensive form tree of the game. a very similar algorithm is presented in [15], in [3] as "multiple-observer information set monte carlo tree search", and in [5] as "multiple monte carlo tree search". the pseudo-code for a single iteration of the algorithm is presented in algorithm 1. it starts in the root node of the game tree and descends the game tree towards a terminal state. the tree nodes are stored in the memory, and common statistics are stored for all nodes in each information set.

is-mcts(h)
1: if h is terminal (h ∈ z) then return u1(h)
2: if h is a chance node (p(h) = c) then
3:   a = random chance action from ac(h)
4:   return is-mcts(t(h, a))
5: if h not in memory then add h to memory
6: is = information set for state h
7: if is is not null then
8:   a = select(is)
9:   v = is-mcts(t(h, a))
10: else
11:   add new is for h to memory
12:   a = select(is)
13:   v = simulate(t(h, a))
14: update(is, a, v)
15: return v

algorithm 1. information set monte carlo tree search.

if the function is called with a terminal state, it just returns its utility for the first player (line 1). if nature selects an action in the current node (line 2), it is selected from the commonly known nature distribution and the algorithm is called recursively on the resulting state (line 4). otherwise, the node is added to memory if it is not already present (line 5). the statistics for the information set it belongs to are accessed (line 6) and used to make the decision about the action to select (line 8). this action is then executed on the game state, producing the following game state, which is used in a recursive call of the function (line 9). this process is continued until a state belonging to a new information set is reached (line 10). this is the end of the selection stage, and the expansion consists of creating new (empty) statistics for the information set (line 11); the specific data structures depend on the selection function. at this point, an action is selected by the selection function and a simulation is executed to estimate the quality of the following position (lines 12–13). the information sets accessed during the iteration are updated by the result of the simulation (line 14) as the algorithm returns from the recursion, and the next iteration can start. iterations are repeated until a given time budget is spent. the functions select and update in the pseudo-code can be implemented by a suitable selection function, such as uct (with the negative of the received value for the opponent), and simulate can be either completely random or a domain dependent simulation, as in perfect information mcts. in the following section, we discuss possible selection functions to be used with the algorithm.

during an actual match, the player using this algorithm does not always start iterations from the root state of the game; rather, it maintains the current information set in the form of a collection of all states that can be the current state of the game. after the player using is-mcts selects an action, it applies the action to all of these states to determine the possible following states. when the opponent executes some action in the game, it executes all of the opponent's applicable actions in the current information set and keeps only the resulting states that generate the same observations for the player. in a search from an inner information set during a match, the player chooses for each iteration the initial state uniformly at random from all states in the current information set [3].
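a direct transcription of algorithm 1 into python may be useful. the sketch below is a skeleton under stated assumptions: the game object and the selector interface (select/update, anticipating the selection functions of the next section) are hypothetical, and the sign handling implements the remark that the opponent receives the negative of the value:

```python
# skeleton of algorithm 1 (hypothetical game and selector interfaces);
# memory maps an information-set key to that set's selection statistics.
import random

def is_mcts(game, h, memory, make_selector, simulate):
    if game.is_terminal(h):                       # line 1
        return game.utility1(h)
    if game.is_chance(h):                         # lines 2-4
        a = random.choice(game.actions(h))
        return is_mcts(game, game.apply(h, a), memory, make_selector, simulate)

    key = game.infoset_key(h)
    infoset = memory.get(key)                     # lines 5-6
    if infoset is not None:                       # lines 7-9: selection stage
        a = infoset.select()
        v = is_mcts(game, game.apply(h, a), memory, make_selector, simulate)
    else:                                         # lines 10-13: expansion
        infoset = memory[key] = make_selector(game.actions(h))
        a = infoset.select()
        v = simulate(game, game.apply(h, a))      # random or domain playout
    # the minimizing player updates with the negative of the received value
    sign = 1.0 if game.player(h) == 1 else -1.0
    infoset.update(a, sign * v)                   # line 14
    return v                                      # line 15
```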
during an actual match, the player using this algorithm does not always start the iterations from the root state of the game, but rather maintains the current information set in the form of a collection of all states that can be the current state of the game. after the player using is-mcts selects an action, it applies the action to all of these states to determine the possible following states. when the opponent executes some action in the game, the player executes all of the opponent's applicable actions in the current information set and keeps only the resulting states that generate the same observations for the player. in the search from an inner information set during a match, the player chooses the initial state of each iteration uniformly at random from all states in the current information set [3].

3.1. selection functions

3.1.1. upper confidence bounds

the selection function that was suggested for is-mcts in both [4] and [3] is the same modification of ucb1 [16] that was successful in perfect information mcts [6]. we present the algorithm in algorithm 2 and further refer to it as uct.

input: k – number of actions; c – exploration parameter
1: $\forall i\; n_i = 0,\ \bar{x}_i = 0$
2: for n = 1, 2, . . . do
3:   $i = \arg\max_i\ \bar{x}_i + c\sqrt{\frac{2 \ln n}{n_i}}$
4:   use action i and receive reward r
5:   $\bar{x}_i = \frac{n_i \bar{x}_i + r}{n_i + 1}$
6:   $n_i = n_i + 1$
algorithm 2. uct: upper confidence bounds algorithm for selection in monte carlo tree search.

the algorithm maintains the mean of the rewards received for each action, $\bar{x}_i$, and the number of times each action has been used, $n_i$. it first uses each of the actions once (the term with zero in the denominator is defined as ∞) and then decides which action to use based on the size of the one-sided confidence interval on the reward computed from the chernoff-hoeffding bounds (line 3). we follow the suggestion from [17] and break ties on line 3 randomly. the strategy output as the solution for the information set after all simulations are performed is the vector of action use counts $n_i$ normalized to sum to one.
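algorithm 2 can be transcribed directly into python; the two helpers below use our own naming, with the infinite score reproducing the convention that a zero denominator is treated as ∞, and with ties broken randomly as suggested in [17].

import math, random

def uct_select(x_bar, n_i, n, c):
    # line 3 of algorithm 2: mean reward plus exploration bonus;
    # unvisited actions score infinity, so each one is tried once first
    def score(i):
        if n_i[i] == 0:
            return float("inf")
        return x_bar[i] + c * math.sqrt(2.0 * math.log(n) / n_i[i])
    best = max(score(i) for i in range(len(n_i)))
    return random.choice([i for i in range(len(n_i)) if score(i) == best])

def uct_update(x_bar, n_i, i, r):
    # lines 5-6 of algorithm 2: incremental mean and visit count
    x_bar[i] = (n_i[i] * x_bar[i] + r) / (n_i[i] + 1)
    n_i[i] += 1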
3.2. exponential weights for exploration and exploitation

uct is a successful selection function for perfect information problems, but it has been shown to converge to an exploitable strategy in simultaneous move games [18], which are a special case of imperfect information games. therefore [3, 8] and [5] propose the use of an alternative selection function, exp3, which can be modified to guarantee convergence to the optimal solution in single-stage simultaneous move games. exp3 stores the estimates of the cumulative reward of each action over all iterations, even in the case that the action was not selected. in the pseudo-code in algorithm 3, we denote this value for action i by $\hat{x}_i$. it is initially set to 0 on line 1.

input: k – number of actions; γ – exploration parameter
1: $\forall i\; \hat{x}_i = 0$
2: for t = 1, 2, . . . do
3:   $\forall i\; p_i = \frac{\exp(\frac{\gamma}{k} \hat{x}_i)}{\sum_{j=1}^{k} \exp(\frac{\gamma}{k} \hat{x}_j)}$
4:   $p'_i = (1-\gamma) p_i + \frac{\gamma}{k}$
5:   use action $i_t$ from distribution p' and receive reward r
6:   $\hat{x}_{i_t} = \hat{x}_{i_t} + \frac{r}{p'_{i_t}}$
algorithm 3. exp3: exponential weights for exploration and exploitation algorithm for selection in mcts.

in each iteration, a probability distribution p is created proportionally to the exponential of these estimates (line 3). the distribution is combined with a uniform distribution with probability γ to ensure sufficient exploration of all actions (line 4). after an action is selected and the reward is received from the recursive call of is-mcts, the estimate for the performed action is updated using importance sampling (line 6): the reward is weighted by one over the probability of using the action, in order to reach the correct value in expectation. the strategy output as the solution for the information set after all simulations are performed is the mean of the strategies p over all iterations. in the implementation of the algorithm, we use the numerically more stable form of the equation on line 3 proposed in [3].

3.3. regret matching

regret matching is a general procedure originally developed for playing known general-sum matrix games in [10]. the algorithm computes, for each action in each step, the regret for not having played that fixed action every time the played action was used in the past. the action to be played in the next round is selected randomly with probability proportional to the positive portion of the regret for not playing the action. the regret matching procedure in [10] requires exact information about the utility values in the game matrix as well as the action selected by the opponent in each step. in [19], the authors relax these requirements: instead of computing the exact values of the regrets, the regrets are estimated in a similar way as the cumulative rewards in exp3. as a result, the modified regret matching procedure can be used as the selection function in is-mcts. we present the algorithm in algorithm 4.

input: k – number of actions; $\gamma_t$ – non-increasing sequence of real numbers
1: $\forall i\; r_i = 0$
2: for t = 1, 2, . . . do
3:   $\forall i\; r^+_i = \max\{0, r_i\}$
4:   if $\sum_{j=1}^{k} r^+_j = 0$ then
5:     $\forall i\; p_i = 1/k$
6:   else
7:     $\forall i\; p_i = \frac{r^+_i}{\sum_{j=1}^{k} r^+_j}$
8:   $p'_i = (1-\gamma) p_i + \frac{\gamma}{k}$
9:   use action $i_t$ from distribution p' and receive reward r
10:  $\forall i\; r_i = r_i - r$
11:  $r_{i_t} = r_{i_t} + \frac{r}{p'_{i_t}}$
algorithm 4. rm: regret matching variant for selection in mcts.

the algorithm stores the estimates of the regrets for not taking action i in all time steps in the past in variables $r_i$. on lines 3–7, it computes the strategy for the current time step proportionally to the positive part of the regrets. the uniform strategy is added, similarly to the case of exp3, on line 8. this ensures exploration and keeps the addition on line 11 bounded. lines 10 and 11 update the cumulative regrets using importance sampling. the main computational advantage of this procedure over exp3 is that it requires only simple division instead of computing the expensive exponential function for each action in each iteration. when used in an mcts algorithm, this allows substantially more iterations to be performed in the same time budget. the strategy output as the solution for the information set after all simulations are performed is the mean of the strategies p over all iterations.
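the two remaining selection functions can be sketched in the same style. the code below is our transcription of algorithms 3 and 4; the max-subtraction in exp3_policy is one common way of stabilizing the exponentials and need not coincide with the exact stable form used in [3].

import math

def exp3_policy(x_hat, gamma):
    # line 3 of algorithm 3: distribution proportional to exp((gamma/k) x_hat);
    # subtracting the maximum leaves the distribution unchanged but avoids overflow
    k = len(x_hat)
    m = max(x_hat)
    w = [math.exp((gamma / k) * (x - m)) for x in x_hat]
    s = sum(w)
    # line 4: mix with the uniform distribution to ensure exploration
    return [(1 - gamma) * wi / s + gamma / k for wi in w]

def exp3_update(x_hat, i, r, p_i):
    # line 6: importance-sampled estimate of the cumulative reward
    x_hat[i] += r / p_i

def rm_policy(r, gamma):
    # lines 3-8 of algorithm 4: play proportionally to the positive regrets,
    # falling back to the uniform strategy when all regrets are non-positive
    k = len(r)
    r_plus = [max(0.0, ri) for ri in r]
    s = sum(r_plus)
    p = [1.0 / k] * k if s == 0 else [rp / s for rp in r_plus]
    return [(1 - gamma) * pi + gamma / k for pi in p]

def rm_update(r, i, reward, p_i):
    # lines 10-11 of algorithm 4: every regret loses the received reward,
    # and the played action gets the importance-weighted reward back
    for j in range(len(r)):
        r[j] -= reward
    r[i] += reward / p_i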
4. experimental evaluation

this section first presents the two imperfect information games that we use in the evaluation. afterwards, we analyze the dependence of the speed of convergence of is-mcts, and of the eventual distance from a nash equilibrium, on the selection function used. finally, we evaluate how these convergence properties translate to the quality of actual game playing. in the whole section, we use hand-tuned values of the exploration parameters: c = 2 for uct and γ = 0.1 for exp3 and rm. all experiments were performed in a unified publicly available codebase [20].

4.1. experimental domains

goofspiel. goofspiel is a simultaneous move card game often used to evaluate ai algorithms. the game is played with three identical decks of cards. each deck contains cards of values {0, . . . , (n−1)} and belongs to one of the players, including nature. the deck for nature is shuffled at the beginning of the game. in each round, nature reveals the top card from its deck. each player selects any of their remaining cards and places it face down on the table, so that the opponent does not see the card. afterwards, the cards are turned face up and the player with the higher card wins the card revealed by nature. the card is discarded in case of a draw. at the end, the player with the higher sum of nature cards wins the whole game. in the results, we use utilities 1/0/−1 for win/draw/loss and count a draw as half a win and half a loss in the win rates. we use the imperfect information variant of this game introduced by lanctot [21] for the evaluation. it introduces two modifications. first, in each round, the players only learn who won or lost the round, but not the bid played by the opponent. second, both players know that the cards in nature's deck are sorted in decreasing order.
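for concreteness, the bidding step of one goofspiel round described above can be written in a few lines (a toy helper of ours, not code from the evaluated framework):

def resolve_round(nature_card, bid1, bid2, score1, score2):
    # the higher simultaneous bid wins the value of the revealed nature card;
    # on a draw the card is discarded and nobody scores
    if bid1 > bid2:
        score1 += nature_card
    elif bid2 > bid1:
        score2 += nature_card
    return score1, score2

in the imperfect information variant, only the sign of bid1 − bid2 would be revealed to the players, not the bids themselves.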
phantom tic-tac-toe. phantom tic-tac-toe is a blind variant of the well-known game of tic-tac-toe. the game is played on a 3 × 3 board, where two players (cross and circle) attempt to place 3 identical marks in a horizontal, vertical, or diagonal row to win the game. the player who achieves this goal first wins the game. in the blind variant, the players are unable to observe their opponent's moves, and each player only knows that the opponent made a move and that it is her turn. if a player tries to place her mark on a square that is already occupied by an opponent's mark, the player learns this information and can place the mark on some other square. the uncertainty in phantom tic-tac-toe makes the game large (≈ $10^{10}$ nodes [22]). in addition, since one player can try several squares before a move is successful, the players do not necessarily alternate in making their moves. this rule makes the structure of the information sets rather complex, and since a player never learns how many attempts the opponent actually performed, a single information set can contain nodes at different depths in the game tree.

4.2. convergence to nash equilibrium

first, we focus on the speed of convergence in a small variant of goofspiel with 6 cards (0, 1, . . . , 5). we measure the ability of the algorithms to approximate the nash equilibrium strategies of the complete game by the sum of the exploitabilities of both players' strategies (see section 2). we compare the is-mcts algorithm with three different selection functions: uct, exp3, and rm. the results are the means of 20 runs of each algorithm. due to the different selection and update functions, the algorithms differ in the number of iterations per second. rm is the fastest, with more than $5.9 \times 10^4$ iterations per second, while uct, with its square-root computations and random tie-breaking, performs around $3.9 \times 10^4$ iterations per second, and exp3, which computes the exponential functions, around $3.4 \times 10^4$.

figure 2 presents the exploitability of the algorithms run from the root state for 30 minutes. note that the x-scale is logarithmic.

figure 2. convergence in the game root in imperfect information goofspiel with 6 cards (equilibrium distance vs. time [s] for uct, exp3 and rm).

the exploitability of uct starts decreasing fairly quickly, but after approximately 20 seconds of computation it starts increasing again. the lowest error achieved by uct is 0.39, reached after 25 seconds of computation. the variants of the is-mcts algorithm with exp3 and rm converge more slowly at the beginning, but eventually achieve a smaller error than uct. after 500 seconds of computation, the error of exp3 is 0.41 and the error of rm is 0.27.

the game of phantom tic-tac-toe has approximately $10^{10}$ terminal states, which makes it difficult to compute the exploitability of strategies in the whole game. therefore, we initially focus on a simpler version of the game with the first move enforced to be to the square in the center of the board. the first player has only this action available, and the second player can also first play only this action, which reveals the position of the first player's mark and allows the second player to move again. the second move of the second player and all the following moves can then be any legal moves of phantom tic-tac-toe. the number of iterations per second in this restricted game is similar to the previous game. rm makes the most iterations ($9.5 \times 10^4$), because the game initially has a higher number of applicable actions and rm uses the simplest functions to compute the probability of selecting each action. uct performs significantly fewer iterations ($5.1 \times 10^4$), which is probably caused by the actions often having very similar values; uct needs to iterate through the actions multiple times to select an action. exp3 is again the slowest, with $3.4 \times 10^4$ iterations per second.

the convergence of the algorithms run from the root of the game is presented in figure 3.

figure 3. convergence and game playing comparison of different algorithms on phantom tic-tac-toe, in which the first move of each player is forced to be to the middle square (equilibrium distance vs. time [s] for uct, exp3 and rm).

uct first quickly reduces the exploitability of the strategy and then gradually makes the strategy more exploitable after 10 seconds of computation. the minimum exploitability achieved by uct is 0.27. shortly after 10 seconds of computation, the initially worst rm becomes the best algorithm and, after the full 30 minutes, it converges to an exploitability of 0.13. exp3 is clearly the worst algorithm between the first and the hundredth second.
4.3. head-to-head matches

after analyzing the convergence of the strategies computed by the sampling algorithms, we evaluate how this property translates to the actual performance of the algorithms in head-to-head matches. all presented results are averages of 1000 matches. we first evaluate the small game used for the computation of the exploitability in the previous section, to see the connection to the exploitabilities, and then we focus on substantially larger games to evaluate the practical applicability.

table 1 presents the win rates of the algorithms in mutual matches in imperfect information goofspiel with 6 cards per deck.

0.1 s   uct          exp3         rm           rand
uct     62.9 (2.6)   55.6 (2.1)   62.7 (2.5)   84.0 (2.1)
exp3    74.5 (2.2)   62.8 (1.6)   74.8 (2.0)   88.0 (1.9)
rm      70.6 (1.7)   60.0 (1.6)   73.1 (2.1)   87.8 (1.4)
rand    15.7 (2.2)    8.2 (1.6)   10.3 (1.8)   49.7 (3.0)

1 s     uct          exp3         rm           rand
uct     61.5 (2.7)   54.0 (2.1)   64.5 (2.5)   84.2 (2.1)
exp3    74.5 (2.2)   61.8 (1.5)   75.7 (2.0)   87.8 (1.9)
rm      72.8 (2.3)   59.2 (1.7)   75.2 (2.1)   88.8 (1.9)
rand    14.9 (2.1)    8.7 (1.7)   10.6 (1.8)   48.7 (3.0)

table 1. win rates of the row player against different algorithms in imperfect information goofspiel with 6 cards, with 0.1 (top) and 1 (bottom) second of computation per move. the number in brackets indicates the 95 % confidence interval.

the results with 0.1 second per move (top) and with 1 second per move (bottom) are very similar. the first important observation is that even though the game is symmetric and the first player does not have any advantage, all is-mcts variants perform much better as the first player (rows). the reason is the asymmetry of the game model in the form of an efg. even though in reality the players choose an action simultaneously, the game models this fact as a sequential decision with hidden information. as a result, the second player has substantially larger information sets than the first player. if the search is run from a larger information set, it is generally more important to have a good approximation of the probability of the individual states being the actual state of the game. the second player is at a large disadvantage here. is-mcts with exp3 wins the most from the first position and loses the least from the second position. this is a surprising result, as exp3 has the slowest convergence. apparently, it quickly reaches a strategy that is exploitable, but performs well against the other opponents in the tournament.

the game playing performance of the algorithms in the large game of imperfect information goofspiel, with 13 cards in each deck and one second of computation per move, is presented in table 2.

1 s     uct          exp3         rm           rand
uct     70.0 (2.8)   67.7 (2.9)   61.0 (3.0)   90.6 (1.8)
exp3    79.5 (2.5)   63.2 (2.8)   66.4 (2.9)   95.6 (1.2)
rm      73.8 (2.7)   68.2 (2.8)   67.5 (2.9)   92.8 (1.5)
rand     6.2 (1.5)    4.0 (1.2)    4.9 (1.3)   49.0 (3.0)

table 2. win rate of different algorithms on imperfect information goofspiel with 13 cards. the number in brackets indicates the 95 % confidence interval.

in the large game, the performance of the is-mcts variants is similar to the performance on the smaller version. exp3 achieves the highest win rate against uct, but rm loses less against uct and against itself from the second position. this makes it hard to choose between exp3 and rm, but both of them are clearly better than uct in this game.
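as a reminder of how such matches are actually played, the state tracking described in section 3 can be sketched as follows; apply, opponent_actions and observation are hypothetical interface names of ours.

import random

def after_own_action(states, a, apply):
    # the player's own action is applied to every state it considers possible
    return [apply(h, a) for h in states]

def after_opponent_move(states, my_obs, apply, opponent_actions, observation):
    # expand each state by every applicable opponent action and keep only
    # the successors consistent with the observation the player received
    nxt = []
    for h in states:
        for a in opponent_actions(h):
            h2 = apply(h, a)
            if observation(h2) == my_obs:
                nxt.append(h2)
    return nxt

def sample_iteration_root(states):
    # each is-mcts iteration starts from a uniformly drawn possible state [3]
    return random.choice(states)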
since all the algorithms seem to perform very well even in the complete game of phantom tic-tac-toe, we do not present the evaluation on the smaller game with the enforced first move here; we assume that the differences between the algorithms would be even smaller there. the results in table 3 show an even larger imbalance between the results achieved from the first and the second position in the game.

0.1 s   uct          exp3         rm           rand
uct     85.9 (1.7)   84.8 (1.7)   82.3 (1.8)   95.6 (1.1)
exp3    87.2 (1.6)   88.1 (1.5)   85.3 (1.6)   96.6 (1.0)
rm      87.7 (1.6)   88.1 (1.4)   85.2 (1.5)   96.0 (1.0)
rand    50.5 (3.0)   46.1 (3.0)   48.0 (3.0)   71.4 (2.7)

1 s     uct          exp3         rm           rand
uct     86.5 (1.8)   86.3 (1.7)   84.0 (1.8)   95.2 (1.1)
exp3    88.5 (1.6)   88.5 (1.5)   87.0 (1.6)   96.2 (1.0)
rm      90.0 (1.4)   87.4 (1.5)   85.2 (1.5)   96.3 (1.0)
rand    52.0 (3.0)   48.9 (3.0)   46.2 (2.9)   70.9 (2.7)

table 3. win rate of different algorithms on full phantom tic-tac-toe. the number in brackets indicates the 95 % confidence interval.

this time, the imbalance is not caused only by the disadvantage of larger information sets, but also by the fact that the game is not balanced on its own: when both players play the optimal strategy, the player who moves first wins 83 % of the games. we computed this value by the double-oracle algorithm [14] implemented in our framework. with 0.1 second per move (top), the performance of rm and exp3 is practically identical from the first position, but rm loses less from the second position, which is consistent with its generally lower exploitability over the experiments. uct generally performs the worst. with 1 second per move, the performance of all the algorithms is more similar. rm still wins most often and loses least often against uct, but the mutual matches with exp3 are more balanced: either of the two algorithms wins 87 % of matches against the other from the first position. however, when the algorithms play against themselves, rm loses less from the second position, which makes it the more suitable algorithm for this setting.

5. conclusions

we have studied the influence of selection functions on the performance of monte carlo tree search in imperfect information games. we have evaluated the most standard uct selection along with the less common exp3 and the novel regret matching selection in a unified framework on two different games. to the best of our knowledge, this is the first direct comparison of the effect of selection functions on the performance of mcts in imperfect information games.

we show that none of the studied selection functions allows the algorithm to converge very close to the nash equilibrium: even after 30 minutes of computation (approximately $10^8$ iterations), the distance from an equilibrium was still larger than 0.1. consistently in both evaluation domains, the exploitability of the strategy produced with uct first decreased and then started to increase again, showing that is-mcts with uct does not have the desirable anytime property that more computation time yields better results. this is also the case with exp3 at the very beginning, but the novel rm selection monotonically converges towards an equilibrium in our experiments.

the superior performance of rm selection was also confirmed in head-to-head matches among the algorithms. in both domains used for the evaluation, uct performed the worst of the evaluated selection functions. rm was always among the best, but in some cases it was slightly outperformed by exp3. the good performance of exp3 in the matches was not supported by low exploitability in the convergence experiments, which indicates that even though the algorithm performed well against the tested opponents, there are other opponents that would likely perform much better against this algorithm, but not against rm.
acknowledgements

this work was supported by the czech science foundation (grant p202/12/2054). access to the computing and storage facilities owned by parties and projects contributing to the national grid infrastructure metacentrum, provided under the "projects of large infrastructure for research, development, and innovations" programme (lm2010005), is highly appreciated.

references
[1] s. gelly, d. silver. monte-carlo tree search and rapid action value estimation in computer go. artificial intelligence 175(11):1856–1875, 2011. doi:10.1016/j.artint.2011.03.007.
[2] t. keller, p. eyerich. prost: probabilistic planning based on uct. in icaps, 2012.
[3] p. i. cowling, e. j. powley, d. whitehouse. information set monte carlo tree search. ieee transactions on computational intelligence and ai in games 4(2):120–143, 2012. doi:10.1109/tciaig.2012.2200894.
[4] m. ponsen, s. de jong, m. lanctot. computing approximate nash equilibria and robust best-responses using sampling. journal of artificial intelligence research 42:575–605, 2011. doi:10.1613/jair.3402.
[5] d. auger. multiple tree for partially observable monte-carlo tree search. in applications of evolutionary computation, pp. 53–62. springer, 2011.
[6] l. kocsis, c. szepesvári. bandit based monte-carlo planning. in j. fürnkranz, t. scheffer, m. spiliopoulou (eds.), machine learning: ecml 2006, vol. 4212 of lecture notes in computer science, pp. 282–293. springer berlin heidelberg, 2006. doi:10.1007/11871842_29.
[7] p. auer, n. cesa-bianchi, y. freund, r. e. schapire. the nonstochastic multiarmed bandit problem. siam j comput 32(1):48–77, 2003. doi:10.1137/s0097539701398375.
[8] o. teytaud, s. flory. upper confidence trees with short term partial information. in applications of evolutionary computation, vol. 6624 of lecture notes in computer science, pp. 153–162. springer berlin heidelberg, 2011. doi:10.1007/978-3-642-20525-5_16.
[9] m. lanctot, v. lisý, m. h. m. winands. monte carlo tree search in simultaneous move games with applications to goofspiel. in computer games workshop at ijcai 2013, vol. 408 of communications in computer and information science (ccis), pp. 28–43. springer, 2014.
[10] s. hart, a. mas-colell. a simple adaptive procedure leading to correlated equilibrium. econometrica 68(5):1127–1150, 2000. doi:10.1111/1468-0262.00153.
[11] m. osborne, a. rubinstein. a course in game theory. mit press, 1994.
[12] h. w. kuhn. extensive games and the problem of information. contributions to the theory of games 2(28):193–216, 1953.
[13] m. johanson, k. waugh, m. bowling, m. zinkevich. accelerating best response calculation in large extensive games. in proceedings of the twenty-second international joint conference on artificial intelligence, volume one, ijcai'11, pp. 258–265. aaai press, 2011. doi:10.5591/978-1-57735-516-8/ijcai11-054.
[14] b. bosansky, c. kiekintveld, v. lisy, et al. double-oracle algorithm for computing an exact nash equilibrium in zero-sum extensive-form games. in proceedings of international conference on autonomous agents and multiagent systems (aamas), pp. 335–342, 2013.
[15] v. lisy, r. pibil, j. stiborek, et al. game-theoretic approach to adversarial plan recognition. in proceedings of the 20th european conference on artificial intelligence (ecai), pp. 546–551, 2012. doi:10.3233/978-1-61499-098-7-546.
[16] p. auer, n. cesa-bianchi, p. fischer. finite-time analysis of the multiarmed bandit problem. machine learning 47(2-3):235–256, 2002. doi:10.1023/a:1013689704352.
[17] c. b. browne, e. powley, d. whitehouse, et al. a survey of monte carlo tree search methods. ieee transactions on computational intelligence and ai in games 4(1):1–43, 2012. doi:10.1109/tciaig.2012.2186810.
[18] m. shafiei, n. sturtevant, j. schaeffer. comparing uct versus cfr in simultaneous games. in ijcai workshop on general game playing, 2009.
[19] s. hart, a. mas-colell. a reinforcement procedure leading to correlated equilibrium. springer berlin heidelberg, 2001. doi:10.1007/978-3-662-04623-4_12.
[20] agent technology center. computational game theory, 2014. http://agents.felk.cvut.cz/topics/computational_game_theory [2014-09-29].
[21] m. lanctot. monte carlo sampling and regret minimization for equilibrium computation and decision making in large extensive-form games. ph.d. thesis, university of alberta, 2013.
[22] m. lanctot, r. gibson, n. burch, et al. no-regret learning in extensive-form games with imperfect recall. in icml, pp. 1–21, 2012. arxiv:1205.0622v1.

acta polytechnica 53(2):246–248, 2013. © czech technical university in prague, 2013. available online at http://ctn.cvut.cz/ap/

stimulated raman backscattering in plasma — a promising tool for the generation of ultra-high power laser beams

hana turčičová (a,∗), jaroslav huynh (a,b) — (a) department of laser interactions, institute of physics of the academy of sciences of the czech republic, prague, czech republic; (b) department of physical electronics, faculty of nuclear sciences and physical engineering, czech technical university in prague, czech republic. ∗ corresponding author: turcic@fzu.cz

abstract. in the last fifteen years, stimulated raman backscattering (srbs) in plasma has been intensively investigated as a promising tool on the way towards highly intense lasers. this technique has several advantages in comparison to the widely used cpa (chirped pulse amplification) technique for laser amplification. we present the principle of the srbs technique, the best results so far obtained in theory and experiment, and a possible srbs project at the pals research centre in prague.

keywords: stimulated raman scattering, laser plasma, high intensity lasers.

1. introduction

since the very beginning of laser development there has been a trend towards shorter laser pulses and higher powers. the first techniques for manipulating the pulse duration were q-switching and mode-locking.
nanosecond durations were achieved, and further optimization of laser devices led to increased energies and growing output pulse intensities. in the middle of the 80s, a revolutionary technique appeared [13], which profited from the results of [6], themselves inspired by radar pulse coding and compression [3]. the technique is known as cpa, chirped pulse amplification, and it enabled the amplification of ultra-short pulses down to the femtosecond range. however, a prerequisite for using this technique is a sufficient spectral broadness of the laser pulse being amplified. the pulse is first spectrally swept in space, and thus prolonged in time (in a stretcher). its intensity is then much lower, and in the subsequent amplifying medium the pulse can be amplified without any danger of optical damage to the medium. having gained energy, the pulse then has to be optically compressed to its original duration (in an optical compressor). both the stretcher and the compressor contain diffraction gratings, and if the pulse is intense and large-sized after the amplification, the diffraction gratings in the compressor have to be similarly large-sized, as they can sustain only a limited pulse intensity. in the 90s the cpa technique became a part of another revolutionary method for ultra-short-pulse amplification, opcpa – optical parametric chirped pulse amplification. this technique brought many benefits, among others a substantially improved pulse contrast, which is a very important parameter in laser–target interactions. at present both techniques are used in many prestigious laser facilities in the world. however, bulky compressors with demanding adjustment are an indispensable part of all of them.

in the last fifteen years, yet another potentially revolutionary technique has emerged from theoretical studies of wave instabilities in laser plasma, one that offers a possibility of producing intense and ultrashort laser pulses even without the large compressors. raman backscattering has been known as a detrimental parametric instability in the laser energy deposition into the hohlraum target during the inertial confinement fusion process [2]. in underdense plasma, however, stimulated raman backscattering, i.e., the scattering seeded by a counterpropagating beam, revealed some features [12] that made it a new promising tool for the generation of intense ultrashort laser beams. it was found that a short (< 1 ps) and weak laser pulse can be amplified by a counterpropagating long pump and remain short, or even get shortened [8]. since that time the topic has attracted the attention of many laser-plasma physicists and theoreticians. laser plasma has thus ranked among the laser amplifying media, alongside the well-known solid-state materials (glass or crystals doped with nd, sapphire doped with ti, etc.) and gas mixtures (he–ne mixtures, co2 or co based mixtures, excited dimers, etc.). its eminent property is that it can withstand optical intensities several orders of magnitude higher than the latter laser media can. simulations have reported even hundreds of petawatts of output power [15], whereas the practical realization has so far not surpassed 100 gw. in this paper we summarize the best results of simulations and experiments achieved in stimulated raman backscattering (srbs). in the following we propose a possible project on srbs at the research centre of pals (prague asterix laser system) in prague, cz.

2. srbs in simulations and experiments
there is a long track of theoretical works on stimulated raman scattering going back to the 1970s and 1980s. already in those works plasma is shown to be a suitable medium for the amplification and compression of laser pulses. in the 90s, demanding simulation works enabled better insight into this process and into other generated instabilities, see e.g. [9]. it was found that a short weak laser pulse, a seed, can be efficiently amplified and compressed in a plasma before harmful instabilities, such as forward raman scattering, filamentation and others, grow. the computational methods at that time were based either on the classical three-wave-interaction model with the assumption of slowly-varying wave envelopes, where one of the waves was not electromagnetic but a plasma wave, or on pic simulations, mostly one-dimensional, under certain simplifying conditions. a new regime of raman amplification was identified [12], the so-called sra, superradiant amplification, in which the seed amplitude grows linearly with time and the seed is simultaneously contracted. reaching this regime needs an initial seed pulse of sufficiently high intensity; the pump wave can then get strongly depleted. a characteristic feature of superradiant amplification is that the participating electrons from the plasma wave backscatter the pump coherently. in [5] this regime was experimentally demonstrated and numerically simulated in a 1d pic code. the plasma was produced in a h2 gas jet by the leading edge of a picosecond pump wave from a ti:sapphire laser. the initial signal was 80 fs long, and after the amplification in the plasma its energy increased from 70 µj to ≥ 1 mj while the duration was 56 fs. however, the output pulse was a train of several peaks. the experiment was explained as an initial srbs process which, once the seed intensity reached a certain value, proceeded as sra amplification.

at present the highest seed output achieved experimentally in srbs amplification is given in [10]. the plasma channel, 2 mm long, was generated in a c2h6 gas jet; the pump and seed pulses were derived from a ti:sapphire laser. a double-pass geometry was used, in which the seed beam passed through the interaction area twice. the energy transfer from pump to seed was 6.4 %, which led to the amplification of the seed by a factor of more than 20 000. the pulse was compressed from 500 fs to about 50 fs. the seed intensity thus exceeded that of the pump by two orders of magnitude. the output power of the seed was close to the 100 gw level.

it is evident that the seed output power data provided by the simulations exceed those from the experiments by several orders of magnitude. it was shown, see e.g. [17, 16], that mechanisms like detuning from the srbs resonance conditions due to an unintentional pump chirp or plasma channel inhomogeneities, plasma wave breaking at higher plasma temperatures, or an interplay between brillouin and raman scattering, negatively affect the srbs efficiency. the achievement of the multipetawatt regime is presented in the simulations of [15]. the authors performed large-scale multidimensional pic simulations and came to the conclusion that multi-pw peak powers are within reach, however, only in a narrow plasma parameter window. the plasma density should be kept in the range of $(4.5$–$18) \times 10^{18}$ cm−3. the limiting factor is the growth of deleterious plasma instabilities. the authors recommend raman amplification in plasma channels of larger diameters, so that the plasma density can be kept low enough.
they report even 300 pw output if the plasma channel is of 1 cm (!) fwhm and the pump beam (λ = 1 µm) delivers 1 pw in 25 ps. a difference of too many orders of magnitude thus persists between the predicted seed output and the value actually achieved up to now.

3. srbs at pals

a crucial factor in the srbs process is the plasma channel. it should be thoroughly adjusted so as to keep its good wave guiding properties along the interaction region over many rayleigh zones [4]. the radial profile of the refractive index should have its maximum on the axis, which prevents diverging tendencies of both waves. if the laser beams are strong enough, such index profiling can be self-formed by relativistic or ponderomotive forces acting on the electrons. however, the beam peaking on the axis will push the charged particles away, and their displacement might create new centers of diffraction. the pump and seed are usually focused into the plasma channel, which imposes a spherical form on their wavefronts. therefore the fronts of the waves should move with a lower speed than the peripheral parts. balancing all the counteracting processes in the channel is a demanding task. in any case, a high quality laser beam, smooth in time and space, will be beneficial for the production of a plasma channel with a defined plasma density profile and good wave guiding properties.

the research centre of pals has a single-beam photodissociation iodine laser [11] of such high quality at its disposal. its fundamental wavelength is 1315 nm and the delivered energy is ≤ 1 kj @ 400 ps. since 2000 the laser facility has been used by many international research groups, namely due to the smooth top-hat beam profile in 1ω and also in 3ω. a part of the beam could be used for the production of a gas-jet (e.g. c3h8) waveguiding plasma channel. there is also experience with smoothing the laser plasma non-uniformities by additional gas jets [1]. the plasma channel can be produced in a size of ∼ 0.25 × 3 mm² with a plasma density of $n_e \approx (1$–$2) \times 10^{19}$ cm−3, corresponding to a plasma frequency of $\omega_{pe} \approx 0.2 \times 10^{15}$ hz. the pump and seed beams can be produced by a high-power ti:sapphire laser system [7] which is also operated at the pals research centre, λ = 810 nm, ∼ 1 j @ 40 fs. the laser beam will be split and adjusted into the pump (∼ 1 j, pulse duration τ = 10 ps) and the seed (0.2 mj, τ = 500 fs), both beams being focused into a 0.25 mm spot. thus experimental conditions similar to those of [4] would be produced, and first exploratory srbs experiments could be performed. a seed amplification of almost two orders of magnitude should be within close reach.

another srbs experiment on the pals laser system can be proposed. it aims at an output power upgrade on the third harmonics of the iodine laser. at present, 400 j in 250 ps is reached when the pals beam is frequency tripled, i.e., 438 nm, the corresponding wave frequency being $4.3 \times 10^{15}$ hz. the output power at the maximum is therefore ∼ 1.6 tw. the third harmonics beam will be used as a source for both the pump and seed waves; the remaining part of the fundamental beam after tripling will form a pre-pulse, i.e., generate the plasma channel, probably again in a gas jet with a low ionization potential. the channel should be of 1–1.5 mm diameter and 15 mm length. according to [14] the recommended ratio between the pump wave frequency $\omega_{pu}$ and $\omega_{pe}$ should be 20, giving the plasma frequency $\omega_{pe} \sim 0.2 \times 10^{15}$ hz and plasma density $n_e \sim 1.5 \times 10^{19}$ cm−3.
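these numbers are mutually consistent (a quick check of ours, not from the paper): the electron plasma frequency is

$$\omega_{pe} = \sqrt{\frac{n_e e^2}{\varepsilon_0 m_e}} \approx 5.64 \times 10^4 \sqrt{n_e\,[\mathrm{cm^{-3}}]}\ \mathrm{rad/s},$$

so $n_e \sim 1.5 \times 10^{19}\,\mathrm{cm^{-3}}$ gives $\omega_{pe} \approx 2.2 \times 10^{14}\,\mathrm{rad/s} \approx 0.2 \times 10^{15}$ hz, while the tripled pump at 438 nm has $\omega_{pu} = 2\pi c / \lambda \approx 4.3 \times 10^{15}\,\mathrm{rad/s}$, i.e., $\omega_{pu}/\omega_{pe} \approx 20$, as recommended.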
the resonance condition provides the seed frequency, $\omega_s = \omega_{pu} - \omega_{pe}$, which means a 460 nm wavelength. as the pump beam duration is 250 ps, the seed beam duration should be about 250 fs, so as to keep up with the initial simulation parameters of [14]. such a seed beam can be provided by the ti:sapphire laser mentioned above, by frequency-doubling its beam. the doubling will be performed prior to compression to the fs duration, i.e., at the full duration of 250 ps [7]. the beam at about 400 nm then has to be raman shifted to 460 nm. this can be done in a gas raman shifter with a relevant stokes shift (e.g. d2, ch4). after the raman shifting, the beam will be compressed to the desired 250 fs and will function as the seed beam. following further the optimal plasma parameters given in [14], a pump intensity of about $6 \times 10^{12}$ w cm−2 is required throughout the plasma channel. performance parameters adjusted in this way would enable a conversion efficiency of about 40 % and, as a consequence, an output power of 30 tw. using raman amplification in the plasma, an increase of more than one order of magnitude in the third harmonic output power can be expected.

4. conclusion

the investigations of stimulated raman amplification in plasma performed in simulations are very optimistic regarding attainable output powers. the experimental results achieved so far, however, lag behind by several orders of magnitude. in the experiment, it is very difficult to keep the optimal conditions for an efficient energy transfer from the pump to the seed along the whole interaction path in the plasma. the channel should be long enough (≤ 10 mm) and have good wave guiding properties all along. technically it is not easy to maintain a uniform plasma density along such a long channel. however, it can be believed that a single high-quality intense beam, smooth in time and space, such as that of the pals laser system in prague, would be a good candidate for the production of a suitable plasma channel for srbs studies.

references
[1] d. batani, r. benocci, r. dezulian, et al. smoothing of laser energy deposition by gas jets. eur phys j special topics 175(1):65–70, 2009.
[2] r. l. berger, c. h. still, e. a. williams, a. b. langdon. on the dominant and subdominant behavior of stimulated raman and brillouin scattering driven by nonuniform laser beams. phys plasmas 5(12):4337–4355, 1998.
[3] e. brookner. phased-array radars. scient american 252(2):94–103, 1985.
[4] w. cheng. reaching the nonlinear regime of the raman amplification of ultrashort laser pulses. ph.d. thesis, princeton university, u.s.a., 2007.
[5] m. dreher, e. takahashi, j. meyer-ter-vehn, k.-j. witte. observation of superradiant amplification of ultrashort laser pulses in a plasma. phys rev lett 93(9):095001, 2004.
[6] r. a. fisher, w. k. bischel. pulse compression for more efficient operation of solid-state laser amplifier chains ii. appl phys lett 24(10):468–470, 1974.
[7] j. hrebicek, b. rus, j. c. lagron, et al. 25 tw ti:sapphire laser chain at pals. spie 8080, 2011.
[8] v. m. malkin, g. shvets, n. j. fisch. fast compression of laser beams to highly overcritical powers. phys rev lett 82(22):4448–4451, 1999.
[9] v. m. malkin, g. shvets, n. j. fisch. ultra-powerful compact amplifiers for short laser pulses. phys plasmas 7(5):2232–2240, 2000.
[10] j. ren, s. li, a. morozov, et al. a compact double-pass raman backscattering amplifier/compressor. phys plasmas 15(5):056702, 2008.
[11] k. rohlena, k. jungwirth, b. kralikova, et al. a survey of pals activity and development, 2004.
[12] g. shvets, n. j. fisch. superradiant amplification of an ultrashort pulse in a plasma by a counterpropagating pump. phys rev lett 81(22):4879–4882, 1998.
[13] d. strickland, g. mourou. compression of amplified chirped optical pulses. optics com 56(3):219–221, 1985.
[14] r. m. g. m. trines, f. fiuza, r. bingham, et al. production of picosecond, kilojoule, and petawatt laser pulses via raman amplification of nanosecond pulses. phys rev lett 107(10):105002, 2011.
[15] r. m. g. m. trines, f. fiuza, r. bingham, et al. simulations of efficient raman amplification into the multipetawatt regime. nature physics 7(1):87–92, 2011.
[16] d. turnbull, et al. simultaneous stimulated raman, brillouin, and electron-acoustic scattering reveals a potential saturation mechanism in raman plasma amplifiers. phys plasmas 19(8):083109, 2012.
[17] n. a. yampolsky, n. j. fisch. limiting effects on laser compression by resonant backward raman scattering in modern experiments. phys plasmas 18(5):056711, 2011.

acta polytechnica 55(6):379–383, 2015, doi:10.14311/ap.2015.55.0379. © czech technical university in prague, 2015. available online at http://ojs.cvut.cz/ojs/index.php/ap

treatment of natural gas by adsorption of co2

kristýna hádková∗, viktor tekáč, karel ciahotný, zdeněk beňo, veronika vrbová — department of gas, coke and air protection, university of chemistry and technology, prague, technická 5, praha 6, 166 28, czech republic. ∗ corresponding author: kristyna.hadkova@vscht.cz

abstract. apart from burning, one of the possible uses of natural gas is as a fuel for motor vehicles. there are two types of fuel from natural gas — cng (compressed natural gas) and lng (liquefied natural gas). liquefaction of natural gas is carried out for transport by tankers, which are an alternative to long-distance gas pipelines, as well as for transport over short distances, using lng as a fuel for motor vehicles. gas treatment is necessary to obtain lng. an important part of the necessary treatment of natural gas for lng production is the reduction of co2, because there is a danger of the carbon dioxide freezing during the gas cooling. this work deals with the testing of adsorptive removal of co2 from natural gas. the aim of these measurements was to find a suitable adsorbent for co2 removal from natural gas. two different types of adsorbents were tested: activated carbon and a molecular sieve. the adsorption properties of the selected adsorbents were tested and compared, and the breakthrough curves for co2 were measured for both adsorbents. the conditions of the testing were chosen according to the conditions at a gas regulation station — 4.0 mpa pressure and 8 °c temperature. natural gas was simulated by a model gas mixture during the tests. the breakthrough volume was set as the gas volume passing through the adsorber up to a co2 concentration of 300 ml/m3 in the exhaust gas. thermal and pressure desorption of co2 from the saturated adsorbents were also tested after the adsorption.

keywords: natural gas treatment; adsorption; co2 removal.

1. introduction

there are several possible uses of natural gas; fuel for motor vehicles is one of them. natural gas can be used as cng (compressed natural gas) or as lng (liquefied natural gas). lng is very important for the international trade of natural gas and this importance will grow [1].
several adjustments are needed during the liquefaction of natural gas; co2 removal is one of them, because co2 can freeze during the gas cooling process. co2 has to be removed also because, if there is water in the natural gas, an acid can be formed, which leads to corrosion of pipelines [1]. co2 removal can be carried out in several ways – adsorption, absorption, physical scrubbing or cryogenic separation. adsorptive separation of co2 was chosen for this work. two types of adsorbent were used for the experiments – activated carbon c46 and the 13x zeolite molecular sieve. the breakthrough concentration of co2 in the exhaust gas was set at 300 ml/m3. the temperature and pressure were set according to common conditions at a pressure regulation station on the gas pipeline: 8 °c and 4.0 mpa.

2. theory

2.1. lng

lng is liquefied natural gas: an odorless liquid cooled to −162 °c at atmospheric pressure. the volume of liquefied natural gas is 600 times smaller than the same amount of natural gas in the gaseous state. because of its smaller volume, lng is better for long-distance transport if there is no gas pipeline [3, 4].

2.2. adsorption

adsorption is a phenomenon during which molecules of a gas, vapor or liquid are captured on a solid surface. there are two types of adsorption, which work on different principles of bonding forces – physical sorption and chemical sorption. in the case of physical sorption, the adsorption is based on van der waals forces. the bonds are not specific and not chemical, and molecules can be adsorbed in more than one layer [5, 6]. in the case of chemical sorption, the bonds are chemical. the bonds are specific, and molecules are captured only at active centers; they can thus be captured only in one layer. unlike in the case of physical sorption, activation energy is needed [5, 7]. regeneration of an adsorbent can be realized thanks to its different adsorption capacities at different temperatures (thermal-swing adsorption) or at different pressures (pressure-swing adsorption) [1]. psa technology doesn't require the use of chemicals (like amine or other solvents in the case of absorption technology), heat is not used for the regeneration (which means low energy intensity), and psa units are suitable for both large- and small-scale technological arrangements [8].
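to illustrate why pressure-swing regeneration works, a textbook langmuir isotherm can be used (a standard model chosen by us for illustration, not one fitted by the authors): the equilibrium loading

$$q(P) = q_m \frac{bP}{1 + bP}$$

increases with pressure, so a psa cycle operating between an adsorption pressure $P_{ads}$ and a desorption pressure $P_{des} < P_{ads}$ recovers the working capacity $\Delta q = q(P_{ads}) - q(P_{des})$ in every cycle; tsa exploits in the same way the fact that the affinity coefficient $b$ decreases with temperature.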
3. experiments

3.1. adsorption

the aim of the experiments was to compare selected adsorbents for the removal of co2 from natural gas. the breakthrough curves of co2 were measured with the laboratory apparatus. the breakthrough volume was set at 300 ml/m3 of co2 in the exhaust gas. adsorption was carried out at 8 °c and at a pressure of 4.0 mpa. the gas mixture used is described in tab. 1.

component   content [mol%]
methane     86.3
ethane      6.97
propane     0.784
butane      0.786
co2         2.03
n2          3.13

table 1. composition of the inlet gas (linde gas a.s.) [mol%].

the synthetic 13x zeolite molecular sieve from sigma-aldrich and activated carbon c46 from silcarbon aktivkohle were tested as adsorbents. the inner surface and pore volume of these adsorbents were measured by the bet method on a coulter sa 3100. these properties are described in tab. 2.

                       ms 13x   ac c46
inner surface [m2/g]   533      1258
pore volume [ml/g]     0.346    0.589

table 2. inner surface and pore volume of the adsorbents.

fig. 1 shows the apparatus used to measure the breakthrough curves of co2. two cylinders belong to the apparatus: the first of them contains the tested gaseous mixture and the second contains nitrogen, to purge the apparatus. there are reducing valves on which the output pressure of 4.0 mpa can be set. then there are changeover valves to purge the apparatus with nitrogen or to use the model gas mixture. the pressure of the inlet gas is measured by the manometer, which is connected as the next part. then the gas flows into an adsorber filled with the tested adsorbent. the adsorber is equipped with valves which close it at both ends, and it is placed in a bath cooled to 8 °c. then there is a needle valve which reduces the gas pressure in the apparatus to atmospheric pressure. the next part of the apparatus is a manometer, then there is a flow meter, and at the end there is the analyzer, which is connected to the computer.

figure 1. the apparatus for measuring the breakthrough curves: (1) cylinder containing the gas mixture; (2) cylinder containing nitrogen; (3) reducing valves; (4) switching valves; (5) manometer; (6) ball valves; (7) adsorber; (8) cooling bath; (9) needle valve; (10) gas flow meter; (11) analyzer; (12) computer.

the apparatus was always first purged with nitrogen, and then the desired gas flow was set. the flow rates of the gases at each measurement were slightly different due to the low sensitivity of the control element; the flow was set at 4.1 l/min in the case of the 13x molecular sieve and at 3.4 l/min in the case of activated carbon c46. the samples were activated by being heated to 150 °c for 8 hours before the measurement. after being flushed with nitrogen, the model gas mixture was charged into the apparatus. the content of co2 in the exhaust gas was monitored by an ftir analyzer nicolet antaris igs. when the limit concentration of 300 ml/m3 of co2 was exceeded, the time and the volume of gas that had flowed through were recorded. the experiment continued until the saturation of the adsorbent, that is, until the co2 content in the exhaust gas stopped increasing. desorption of co2 from the saturated adsorbent was also carried out: both pressure-swing desorption at low pressure and thermal desorption at elevated temperature were realized. during the pressure-swing desorption the adsorber was connected to a vacuum pump for 1 hour after depressurization to atmospheric pressure; in the case of thermal desorption, the saturated adsorbents were regenerated at 150 °c for 8 hours.
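the breakthrough volume defined above can be read off the logged data with a few lines of code; the sketch below is our illustration (not the authors' software) and assumes time stamps in seconds, flow in l/min and outlet co2 concentration in ml/m3.

def breakthrough_volume(times_s, flow_l_min, co2_ml_m3, limit=300.0):
    # integrate the gas flow until the outlet co2 concentration first
    # exceeds the limit of 300 ml/m3; returns the volume in litres
    volume_l = 0.0
    for k in range(1, len(times_s)):
        dt_min = (times_s[k] - times_s[k - 1]) / 60.0
        volume_l += flow_l_min[k] * dt_min
        if co2_ml_m3[k] > limit:
            return volume_l
    return None   # the limit was never exceeded during the experiment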
3.2. desorption

the desorption of the adsorbed gas from the 13x molecular sieve and activated carbon c46 was also tested. during the thermal desorption the adsorber was disconnected from the apparatus, the pressure was reduced from 4.0 mpa to atmospheric pressure, and the adsorbent was then regenerated at 150 °c for 8 hours at atmospheric pressure. in the case of pressure desorption, the ball valves were closed; the adsorber was disconnected from the apparatus and weighed. after the depressurization to atmospheric pressure, the adsorber was weighed again and subsequently the vacuum pump was connected for 1 hour. the gas from depressurization and from evacuation was collected and analyzed on a hewlett-packard hp 6890 chromatograph. the resulting adsorbed gas mass was the difference between the weight after the measurement (with the adsorbent saturated) and the weight before the measurement. the mass of the gas located in the space between the adsorbent particles was subtracted from the resulting mass; this mass is an important part of the final weight (in this case about 4 g of gas) at a pressure of 4.0 mpa.

4. results and discussion

the results of the total adsorption capacities of the tested adsorbents ac c46 and ms 13x are shown in fig. 2. these capacities were determined by weighing the adsorber.

figure 2. the total adsorption capacity of the tested adsorbents at 8 °c.

this figure shows that the adsorption capacity of the 13x molecular sieve was 9 wt% and that the activated carbon has an adsorption capacity higher than 19 wt%. these results agree with the bet physical adsorption theory: the inner surface and pore volume of activated carbon c46 are higher than those of the molecular sieve.
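written out symbolically (our notation), the weighing procedure of section 3.2 gives the capacity plotted in fig. 2 as

$$w = \frac{m_{sat} - m_0 - m_{void}}{m_{ads}} \cdot 100\,\%,$$

where $m_{sat}$ and $m_0$ are the adsorber weights after and before the measurement, $m_{void}$ is the mass of the gas filling the interparticle space (about 4 g at 4.0 mpa here), and $m_{ads}$ is the mass of the adsorbent charge.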
these results agree with the bet physical adsorption theory: the inner surface and pore volume of activated carbon c46 are higher than those of the molecular sieve. 381 k. hádková, v. tekáč, k. ciahotný et al. acta polytechnica figure 4. adsorption and desorption capacity – pressure desorption. figure 5. adsorption and desorption capacity – thermal desorption. this adsorption capacity shows the total adsorption capacity of all adsorbed components of the gas mixture; it is not clear whether it is only co2 or whether other gases are also adsorbed. an analysis of the adsorbed gas (extracted by reduction of the pressure from 4.0 mpa to atmospheric pressure and then by evacuation) was carried out. these gases were analyzed at a gas chromatograph hewlett-packard hp 6890 equipped with fid and tcd detectors. the composition of the gas taken out by reduction of pressure from 4.0 mpa to atmospheric pressure and the composition of the gas extracted by evacuation are shown in fig. 3 for the 13x molecular sieve and the activated carbon c46. the composition of inlet gas is shown in the first group of columns. the exact gas composition values are described in tab. 1. the amount of co2 grows slightly in the gas from depressurization and from evacuation in the case of activated carbon. co2 is particularly adsorbed and it is released during the pressure reduction from 4.0 mpa to atmospheric pressure and then also during the evacuation. in the case of the 13x molecular sieve, the content of co2 grows slightly in the exhaust gas in comparison to feed gas. it can be assumed that during the pressure reduction to atmospheric pressure in the adsorber a partial release of co2 occurs. a dominant amount of co2 remains adsorbed on the surface of the molecular sieve at atmospheric pressure; co2 releases adsorbent during the evacuation. in the case of activated carbon c46, the methane content is slightly higher in depressurization gas than in the inlet gas; methane is adsorbed on the activated carbon surface and it is released during the pressure reduction from 4.0 mpa to atmospheric pressure. methane is also desorbed during the evacuation. the methane content in the gas from the depressurization of the 13x molecular sieve compared to the inlet gas decreases only slightly; there is no high adsorption of methane on the molecular sieve surface. the small amount of adsorbed methane is released during the evacuation. the amount of nitrogen in depressurization gas of activated carbon is comparable to the amount of nitrogen in the inlet gas, and there is almost no nitrogen in the evacuation gas. it can be assumed that nitrogen is not adsorbed on the activated carbon surface. in the case of the molecular sieve, the amount of nitrogen in exhaust gas is comparable to the amount in inlet gas. there is also no nitrogen in the gas from the evacuation of the adsorbent. thus nitrogen is nearly not adsorbed. 382 vol. 55 no. 6/2015 treatment of natural gas by adsorption of co2 in the case of activated carbon, the amount of ethane in depressurization gas is noticeably lower than in the inlet gas, and it increases slightly in the evacuation gas. ethane is adsorbed on activated carbon and it is released during the evacuation, but almost not during the depressurization. 
in the case of the molecular sieve, the amount of ethane is higher in the depressurization gas than in the inlet gas; ethane is adsorbed on the molecular sieve surface in particular, and it is released during the pressure reduction from 4.0 mpa to atmospheric pressure and also during the evacuation. propane and butane are almost absent from the gas from depressurization and from evacuation of the activated carbon; it can be assumed that propane and butane are strongly adsorbed on the activated carbon surface and that they are not released during any pressure reduction. in the case of the molecular sieve, the amount of propane in the gas from depressurization is comparable to the inlet gas. the amount of propane grows significantly in the gas from evacuation; propane is adsorbed on the molecular sieve surface and it is released during the evacuation. butane is present only in the inlet gas; it is almost absent in the depressurization gas and also in the gas from evacuation. butane is bound so strongly to the molecular sieve surface that it is not released during any pressure reduction.

4.1. desorption
the 13x molecular sieve and activated carbon c46 were also tested for desorption of the adsorbed gas. the results of the pressure desorption are shown in fig. 4. in the case of pressure desorption, the desorbed amount is almost 5 wt% for the 13x molecular sieve; the desorbed amount for activated carbon c46 reaches 10 wt%. for both samples, this is about half of the adsorbed amount. fig. 4 shows that it is not possible to desorb the full captured amount of adsorbed components. propane and butane, and probably part of the adsorbed co2, are adsorbed mainly in the smallest pores, from which they are not released. during the thermal desorption the adsorbent was regenerated at 150 °c for 8 hours at atmospheric pressure. the results of the thermal desorption are shown in fig. 5. in the case of thermal desorption the desorbed amount reaches 9 wt% for the 13x molecular sieve; this value is close to the adsorption capacity of the molecular sieve, so almost all adsorbed gas is released. the desorption capacity is almost 16 wt% in the case of activated carbon, i.e. almost 80 wt% of the adsorbed gas is released.

5. conclusions
the experiments show that activated carbon c46 has a higher adsorption capacity for the components of the gas mixture. activated carbon also has a greater pore volume and a larger inner surface. however, the adsorption capacity does not involve only co2; mainly propane and butane are adsorbed. the content of ethane in the gas from desorption reaches almost 85 wt% and the content of co2 is lower than 10 wt%. in the case of the molecular sieve, there is more than 70 wt% of co2 in the adsorbed gas. thus the molecular sieve has a higher adsorption capacity for co2 than activated carbon c46. both adsorbents were also tested for desorption of the adsorbed gases. almost all adsorbed gases were released by thermal desorption from the 13x molecular sieve, and almost 80 wt% of the adsorbed gas was desorbed from activated carbon c46. the pressure swing desorption was not comparably successful; only about half of the adsorbed gas was released. the 13x molecular sieve is a suitable adsorbent for the removal of co2 from natural gas. co2 is adsorbed mainly on the surface of the molecular sieve and it is also effectively desorbed by thermal desorption. activated carbon, on the other hand, can be used for propane and butane removal from natural gas.

acknowledgements
financial support from specific university research (msmt no. 20/2013) is gratefully acknowledged.

references
[1] rufford, t. e.; smart, s.; watson, g. c. y.; graham, b. f.; boxall, j.; diniz da costa, j. c.; may, e. f.: the removal of co2 and n2 from natural gas: a review of conventional and emerging process technologies. journal of petroleum science and engineering 94, 123–154, 2011.
[2] quack, h.: koncepční řešení malého účinného zdroje lng. plyn 92 (12), 272–274, 2012.
[3] šebor, g.; pospíšil, m.; žákovec, m.: technicko-ekonomická analýza vhodných alternativních paliv v dopravě; praha, 2006 (in czech).
[4] vše o cng – alternativní pohonné hmoty – zkapalněný zemní plyn. http://www.cng.cz/cs/alternativni-pohonne-hmoty-126/ [2014-02-25] (in czech).
[5] bureš, m.; černý, č.; chuchvalec, p.: fyzikální chemie ii, 1st ed.; všcht: praha, 1994 (in czech).
[6] mcketta, j.: encyclopedia of chemical processing and design, vol. 2; new york and basel, 1977.
[7] vrbová, v.: testování nových druhů adsorpčních materiálů pro odstraňování organických látek z plynů. semestrální projekt, všcht praha, květen 2007 (in czech).
[8] tagliabue, m.; farrusseng, d.; valencia, s.; aguado, s.; ravon, u.; rizzo, c.; corma, a.; mirodatos, c.: natural gas treating by selective adsorption: material science and chemical engineering interplay. chemical engineering journal 155, 553–566, 2009.

experiments with externally prestressed continuous composite girders
m. safan, a. kohoutková
steel-concrete composite girders have attractive potentials when applied in bridge construction. the serviceability performance of continuous composite girders is becoming more and more a determining parameter in the design of this type of structure. an effective method for improving this performance is to apply prestressing to control or completely eliminate concrete deck cracking caused by static and time-dependent actions. little literature has been found addressing the experimental analysis of continuous girders prestressed by means of external deviated tendons. the current research aims to investigate the behavior of a double-span steel composite beam externally prestressed by means of continuous tendons in terms of cracking characteristics, load-deflection response, and load carrying capacity. the efficiency of prestressing is evaluated by comparing the results to those of a non-prestressed beam with similar cross sections and spans.
keywords: composite steel-concrete beams, serviceability, external prestressing, deviated tendons.

1 introduction
steel-concrete composite beams are an attractive form of construction in both buildings and bridges due to economic and efficiency considerations. in a typical composite beam section, a reinforced concrete slab is connected to the top flange of a steel beam, providing a stronger and stiffer load-carrying element. in the case of continuous composite beams, the concrete slab is most effective in positive bending, forming a wide compressive flange and raising the position of the neutral axis so that most of the steel section is available to carry tension. in negative bending, the slab, being subjected to tensile stresses, is far less beneficial due to possible cracking under service load conditions. such cracking considerably reduces the contribution of the slab to the strength and stiffness of the beam. in the case of bridges, cracking and possible subsequent deterioration of the concrete slab and corrosion of the rebars due to weathering effects may lead to a serviceability failure. prestressing can be used effectively to control or to completely eliminate deck cracking in composite beams. the prestressing action can be induced by means of internal tendons running through the slab, or externally by means of continuous deviated or discontinuous straight tendons connected to the steel structure. the present work investigates the contribution of external unbonded deviated tendons to improving the performance in terms of cracking characteristics and load-deflection response up to the ultimate capacity of the beam section at the inner support. an experimental program was conducted to test two identical double-span beams, each made as a composite of a reinforced concrete slab (6 × 60 cm) connected to an i-beam (no. 30) by means of a sufficient number of headed studs so as to attain a rigid connection. each beam was symmetrical about the inner support, with a total length of 14 m.
while the first beam was non-prestressed, the second was prestressed by means of two continuous tendons running close and parallel to the sides of the beam web. each tendon was deviated by means of three saddle supports, two at the mid-span sections and one at the interior support. prestressing was applied at a concrete slab age of 28 days to ensure the development of adequate compressive strength. the experimental work demonstrated that external prestressing is relatively easy to apply provided that proper dimensional details are designed. test results indicated that the application of prestressing remarkably improved the mechanical performance in terms of both yield and ultimate loads, and also the cracking characteristics needed to ensure better durability performance under service conditions.

2 construction alternatives and analysis
besides the well-established advantages of prestressing, external prestressing offers the advantages of easy monitoring and modification of the tendon force and easy replacement of damaged tendons, and it also offers the possibility of controlling the tendon path. controlling the tendon path offers the twofold advantage of: (a) inducing the most favourable stress state counteracting the action of the external load, depending on the load configuration, and (b) controlling the secondary effects resulting from prestressing a continuous beam, which may reduce the load carrying capacity by introducing negative moments at the inner support(s). prestressing can be applied by various means and following different prestressing sequences. the stresses developing in the different elements of a composite beam may change significantly depending on the construction sequence. different construction sequences can be described based on possible combinations of the following alternatives: (a) prestressing can be applied through the steel structure and/or the concrete slab; (b) prestressing can be applied to the whole composite structure or to individual elements before the composite action takes place; (c) shoring the steel structure by means of temporary supports removed as soon as the composite action takes place; and (d) the concrete casting sequence. dall'asta and dezi [1] made an analytical study of different construction schemes of double-span beams prestressed by external deviated tendons, and concluded that it was more convenient, in terms of crack prevention, to apply prestressing to the composite structure rather than to the bare steel structure alone. shoring the steel beam can be applied to further enhance the performance as a result of reduced deflection at all stages of loading. however, viest et al. [2] demonstrated that shored composite beam construction
usually cannot be justified from the standpoint of either economy or practical execution. according to current design codes [3], [4], the design of individual sections for flexural strength is relatively straightforward in terms of plastification and buckling criteria. however, the design for serviceability is not as straightforward, considering: (a) the time-dependent effects due to creep and shrinkage, which induce excessive deflections associated with moment redistribution along the spans and stress migration within individual sections, with an expected extension of cracking; and (b) additional non-linearity due to cracking [4]. accordingly, an accurate evaluation of prestressing efficiency should extend over time. studying different prestressing systems in double-span beams, dezi and leoni [5] demonstrated that external prestressing by means of deviated tendons produces a more favourable long-term stress state, with the possibility of controlling the secondary effects by opportunely choosing the tendon path. this study also showed that it is possible to introduce relaxation losses as an instantaneous effect and to further assume that the prestressing force is constant in time (i.e., neglecting its variation under static, creep, and shrinkage actions). based on these simplifying assumptions, the prestressing force along the tendon can be introduced as concentrated forces acting at the saddle points and anchorages. a complete short- and long-term analysis can then be conducted using finite element models such as those presented in [6], [7], taking into account the flexibility of the shear connection, or a more simplified finite element approach assuming a rigid connection and combining the stiffness and flexibility methods to model the beam and the tendon sliding freely at saddle points [8].

3 research significance
the literature addressing prestressing of composite beams shows that more attention has been devoted to the study of internally prestressed continuous beams, while beams prestressed by external continuous tendons have received less attention, especially as regards experimental testing. for this reason, it is believed that extensive research work is still needed in this area to explore the behavior for both ultimate and serviceability limit states, and to provide experimental data to investigate the efficiency of the theoretical models proposed for analysis.

4 experimental work
the aim of the experimental work carried out in this research is to investigate the contribution of external prestressing to improving the performance of continuous composite beams. a double-span beam with a 7 m span length (beam b) was prestressed by means of unbonded deviated tendons. its mechanical performance was compared to that of a similar but non-prestressed beam (beam a) in terms of the following parameters: 1. cracking characteristics: in terms of cracking load, crack width, number of cracks, and extension of cracks at different loading stages. 2. the efficiency of the slab reinforcement for controlling deck cracking in beam a. 3.
load-deflection response. 4. load carrying capacity of the beam in terms of the load carrying capacity of the section at the interior support.

4.1 prestressing tendons
one beam was stressed by means of two single-strand tendons produced and installed by vsl systems – prague. the 15.5 mm diameter strand consisted of 7 wires laid in a helical configuration. the central wire had a diameter of 5.5 mm, while each of the outer wires had a diameter of 5 mm, yielding a gross tendon area of 141.57 mm2. the following properties were reported by the manufacturer concerning tendon force losses: 1. maximum intrinsic relaxation = 2.5 % (after 1000 hours at 20 °c and an initial stress of 70 % of the nominal tensile strength), satisfying the requirements for low-relaxation tendons according to astm a 416-85. 2. considering the special procedure developed by the manufacturer for locking the wedges, a loss due to wedge draw-in of approximately 6 mm occurs upon lock-off.

property                  | vsl      | euronorm 138-79 & bs 5896-80 (super) | astm a 416-85 (grade 270)
nominal diameter [mm]     | 15.50    | 15.7                                 | 15.2
nominal area [mm2]        | 141.57   | 150                                  | 140
nominal mass [kg/m]       | 1.11     | 1.18                                 | 1.10
yield stress [mpa]        | 1673.00  | 1500 (1)                             | 1670 (2)
tensile strength [mpa]    | 1894.00  | 1770                                 | 1860
min. breaking load [kn]   | 268.10   | 265.0                                | 260.7
young's modulus [gpa]     | 194.54   | 195                                  | –
intrinsic relaxation [%]  | max. 2.5 (3) | –                                | –
(1) measured at 0.1 % residual strain (0.1 % offset method). (2) measured at 1.0 % extension (1 % extension under load method). (3) valid for relaxation class 2 according to euronorm 138-79 & bs 5896-80, or low-relaxation grade according to astm a 416-85.
table 1: 15 mm strand properties provided by vsl and specification limits

other losses due to friction at the saddle points can be neglected without significant loss of accuracy, and thus the tendon force is assumed to be constant along the whole tendon length. other mechanical characteristics of the prestressing strand are presented in tab. 1 as provided by the manufacturer. a jacking force of 212 kn per strand was applied so as to induce a jacking stress of 1497 mpa, corresponding to the maximum allowable stress of 0.9 f_y(0.1) as recommended in eurocode 4 [9], where f_y(0.1) is the yield stress measured at 0.1 % residual strain.

4.2 fabrication and erection of test beams
the steel beam with prestressing and bearing assemblies and welded shear connectors was fabricated at the metrostav company workshops – prague. the steel structure was erected in its final position for testing at the prague state technical laboratories, where casting of the concrete slab was performed by metrostav and prestressing by the vsl systems corporation in prague. the steel beam (rolled i-beam no. 30, grade 37) was fabricated and delivered to the laboratory in one piece and supported in the final position. the steel beam had a total length of 14.2 m, allowing 10 cm of the beam length to extend beyond the end supports, fig. 1. the prestressing force was transferred to the composite beam through a thick steel end plate (140 × 110 × 15 mm) welded to the steel beam, fig. 2. each prestressing tendon was deviated by means of three saddle points, two at the mid-span sections and one at the intermediate support. for this purpose, the tendon was guided at these locations by means of curved steel tubes (24 mm inner diameter and 4 mm thick) welded to the steel beam.
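the strand data in table 1 allow a quick consistency check of the prestressing numbers reported in this section. the sketch below (python) is only such a check: the 14.2 m free tendon length is our assumption, taken equal to the overall beam length, and the uniform-strain formula is only a rough estimate for a deviated tendon.

```python
# quick consistency check of the strand numbers quoted in this section
AREA = 141.57e-6       # m^2, nominal strand area (table 1)
E = 194.54e9           # Pa, young's modulus (table 1)
JACK_STRESS = 1497e6   # Pa, jacking stress reported in this section
LENGTH = 14.2          # m, assumed free tendon length (overall beam length)
DRAW_IN = 6e-3         # m, wedge draw-in quoted by the manufacturer

jack_force = JACK_STRESS * AREA          # N; ~212 kN per strand, as quoted
draw_in_loss = E * DRAW_IN / LENGTH      # Pa; stress lost to wedge draw-in

print(f"jacking force: {jack_force / 1e3:.0f} kN")
print(f"draw-in loss:  {draw_in_loss / 1e6:.0f} MPa "
      f"({100 * draw_in_loss / JACK_STRESS:.1f} % of the jacking stress)")
```

the computed jacking force of about 212 kn per strand agrees with the value quoted above, and the 6 mm wedge draw-in translates into a stress loss on the order of 80 mpa, i.e. roughly 5 % of the jacking stress.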
the tendon eccentricity values with respect to the centroidal axis of the composite girder were zero at the end sections, and 28 mm and 194 mm at the intermediate support and mid-span sections, respectively. a stiffened steel plate 240 mm in length was welded horizontally to both edges of the bottom flange, extending the supporting length of the beam over the inner support to ensure the stability of the structure during loading, as shown in fig. 2.

fig. 1: loading configuration & dimensions of the test specimen (beam b) and cross section at mid-span section

headed studs were welded to the steel beam top flange in a single row along the web axis. the connectors were symmetrically distributed with respect to the inner support section. welding of the connectors began at the end support sections and proceeded towards the inner support at a spacing of 40 cm over a span length of 3.6 m, after which the spacing was reduced to 20 cm. the studs had a total height of 54 mm after welding and a shank diameter of 12.6 mm. the steel beam was positioned over the final supports, and then a wooden form was installed to receive the fresh concrete of the 60 × 600 mm slab. a plastic adhesive material was used to fill the narrow gaps between the longitudinal edges of the top flange and the adjacent wooden plate to ensure the water tightness of the form. the two horizontal plates forming the soffit of the concrete slab were supported by means of a sufficient number of vertical wooden plates fixed to longitudinal purlins laid on the floor. the horizontal plates were attached to the soffit of the steel beam, which thus initially carried a small portion of the weight of the concrete slab. as a result, the self-weight of the steel beam and the concrete slab is carried by the composite beam after stripping of the form. one layer of high-tensile deformed steel bars was used to reinforce the concrete slab. the mesh consisted of 8 ⌀10 mm uniformly distributed longitudinal bars and distribution bars of the same diameter at a spacing of 20 cm. the mesh was elevated into position by means of plastic blocks 23 mm in thickness, and thus the centroid of the longitudinal bars can be assumed to coincide with the centroidal axis of the slab. the longitudinal rebars were spliced twice over a length of 70 cm, so that the center of each splice was 2 m from the adjacent end support. slab casting: a ready-mix self-compacting concrete (scc) batch of 1 m3 was used to cast the slabs of the two composite beams and the companion test specimens needed to investigate the different properties of the concrete mix. the concrete was poured from a moving mixing truck directly into the forms. only minimum effort was needed to level the concrete surface to ensure a constant slab thickness along the beam length. to avoid strength losses, casting of all the specimens was scheduled to take place in no more than 60 minutes from the moment of adding the mixing water. the concrete slabs were kept continuously wet for 7 days, after which the wooden forms were stripped and the slabs were left to dry in the laboratory atmosphere. prestressing: prestressing was applied by means of a 230 kn hydraulic jack connected to an electric pump as a pulling device, fig. 3. a jacking force of 212 kn was applied to each tendon, inducing a stress of 1497 mpa.
this value is the maximum stress allowed in eurocode 4 [9] and represents 90 percent of the yield stress corresponding to 0.1 % permanent strain. the tendons were anchored by means of wedges and anchor chuck grips, as shown in fig. 4. the tendons were allowed to go through the cylindrical anchor heads installed against the rigid steel diaphragm at the two end support sections. the wedges were then driven to provide sufficient friction, preventing slippage of the tendon towards the beam. about 1 m of tendon length needed to extend behind the anchorage device so as to be clamped inside the movable core of the tensioning hydraulic cell.

fig. 2: structural details at the inner support section
fig. 3: stressing of tendons by means of a hydraulic jack unit
fig. 4: structural detail at the end support section showing the head of the anchorage device acting against a rigid steel plate

while the tendon was being pulled out, the prestressing cell exerted a reaction against the anchor head and prevented the wedges from moving out, so that the wedges always remained in contact with the tendon during the stressing operation. upon the release of the jacking pressure, the wedges automatically locked in the conical hole of the anchor head. the tendons were interchangeably stretched at increments of one fourth of the jacking force. also, each tendon was stressed from both sides, fig. 3. application of the stressing force from both sides, provided that the anchorage system offers such a facility, is recommended so as to avoid the accumulation of frictional losses at the far-end segments of the tendon, as would occur if prestressing were applied from one side only. such a procedure becomes more significant if the tendon profile is shaped by means of a large number of saddle points inducing friction losses. after the first force increment, a zero reading of the tendon extension was assumed. for each subsequent increment the extension was measured to provide a means of calibrating the applied force based on the standard tendon properties provided by the manufacturer.

5 instrumentation and testing of the composite beams
the instrumentation of the model beams was designed to monitor the load-deflection response intensively along the span. deflection was monitored by deflection gauges attached to the bottom flange of the steel beam. the applied load was controlled by means of a computerized unit connected to an electric pump. after the gauges were installed, the slab sides were painted white to facilitate crack detection. the deflection along the two spans of each beam was measured by means of nine deflection gauges equally distributed along each span at a spacing of 70 cm (1/10 of the span length). in addition, three sets of deflection gauges were attached to the top surface of the concrete slab at the end and inner support sections. each set consisted of two units located 5 cm from the slab edges, to make sure that the line joining these two points at the support sections always lay at the same horizontal level during all loading stages. tab. 2 gives a time schedule for testing the model beams.

5.1 loading procedure
the beams were tested under the action of two simultaneously increasing concentrated loads. each load acted at a distance of 2.7 m from the inner support.
each load was applied by means of a 600 kn maximum capacity hydraulic jack, which was fitted against the loading frame as shown in fig. 6. the jacks were driven by an electric pump connected to a computerized unit to control the load increment and loading rate, and to maintain a constant value of the acting load during the test. each point load was applied through a rigid plated steel box, 20 cm in width, distributing the jacking load uniformly over the slab width. the bearing boxes were laid on a bed of plaster of paris to minimize the effects caused by any unevenness of the slab surface underneath. prestressed beam (beam b): during the first cycle of loading, the load was applied in 20 kn increments at a loading rate of 4 kn/minute. when the load reached a predetermined value of 150 kn, taken as the beam serviceability load, an unloading process took place at a rate of 90 kn/minute. because of the distorted dimensions of the test models, it was not convenient to compute the service load value based on the "no cracking" criterion. for this reason, the service load value was determined so as to induce a maximum compressive stress of 200 mpa (0.55 of the yield stress of the steel beam) at the inner support section. the first cycle was followed by nine cycles in which the beam was loaded from zero to 150 kn in one increment at a loading rate of 45 kn/minute and then unloaded. in the 11th cycle, the load was increased from zero to 150 kn and then in increments of 20 kn at a rate of 4 kn/minute until the maximum was reached. at this final stage of loading, the propagation of yielding, initiated in the lower part of the steel beam, could be recognized by flaking of the mill scale, which radiated upwards and to the sides about the inner support section. this observation can be seen in fig. 7, which may be compared with the same section in fig. 2 immediately after prestressing. the prescribed criterion was observed at a load value of 300 kn, at which the test was ended. in all stages and after each load increment, the deflection measurements were recorded and the cracking characteristics were carefully observed. non-prestressed beam (beam a): a loading procedure similar to that of beam b was followed. the load was cycled from zero to a predetermined service load value of 110 kn for ten cycles. in the 11th cycle, the load was intended to increase further to a value comparable to that of the prestressed beam; this value was found analytically to be 245 kn. however, the test was ended at a load value of 230 kn due to an accidental lateral rotation of the beam, which could be attributed to improper positioning of the applied load.

fig. 5: application of the tendon stressing force from both sides to uniformly distribute frictional losses along the tendon length

operation                           | time interval [days]
casting of concrete slabs           | 0
stripping of wooden forms           | 7
prestressing of beam b              | 28
testing of the prestressed beam     | 41
testing of the non-prestressed beam | 43
table 2: time schedule for beam testing

6 discussion of test results
fig. 8 shows the load-deflection response of the two beams. a stiffer response of beam b (prestressed) compared to beam a (non-prestressed) is obvious. the deflection of beam a at service load (110 kn) was 1.87 times higher than the corresponding value for beam b at the same load value.
the first yield in beam a occurred in the extreme fiber of the lower flange at the inner support section at a load of 190 kn. at this stage the above deflection ratio reduced to 1.4 as the deflection increased in both beams, while a slight non-linear effect appeared due to deck cracking. however, by the end of testing of beam a, the deflection ratio increased to 1.75, demonstrating the obvious non-linear response of beam a compared to beam b up to the same loading level. the figure also shows that the deflection was nearly the same in both beams at the end of testing, although the load carried by beam b was 30 percent higher. the almost parallel load-deflection curves of the two beams up to the yield load of beam a demonstrate that the stiffer response of beam b is due mainly to the cambering associated with prestressing. the load carrying capacity of the tested beams at different loading stages is shown in fig. 9. it can be seen that an improvement in the mechanical behaviour in terms of load carrying capacity was achieved thanks to prestressing. the cracking, serviceability, and ultimate loads were 4.6, 1.2, and 1.2 times higher in beam b compared to beam a, respectively. obviously, the remarkable delay of crack initiation is a highly desired feature for serviceability performance. other cracking characteristics are demonstrated in figs. 10 and 11. at service load, the total number of cracks developed in beam a was 23, spreading over 14 percent of the beam length. in beam b the corresponding number of cracks was only 7, over 4 percent of the beam length. by the end of testing, the number of cracks in beam a increased to 31, spreading over 17 percent of the beam length. the corresponding values in beam b were 20 cracks and 9 percent, respectively.

fig. 6: an overall view of the test specimen and loading frame
fig. 7: diagonal flaking of mill scale indicating the yield development about the inner support section
fig. 8: load-deflection response for beam b (prestressed) & beam a (non-prestressed)
fig. 9: load carrying capacity at different stages of loading
fig. 10: cracking pattern

in beam a, the cracking width was limited to 0.1 mm due to the relatively high reinforcement ratio of 1.75 percent. however, the cracking width further increased to 0.2 mm by the end of loading. thanks to prestressing, the cracking width in beam b was limited to 0.1 mm at all loading stages.

7 conclusions
the experimental work conducted in the current research demonstrates that external prestressing is relatively easy to apply provided that proper dimensional details are designed. the application of prestressing improved the mechanical performance remarkably, giving the following results: 1. while beam a first cracked at a load of 25 kn, the corresponding load for beam b was 4.6 times higher (115 kn). 2. by the end of testing, the total number of cracks in beam a was 31, spreading over a length of 233 cm (16.6 % of the structure length).
the number of corresponding cracks in beam b was limited to 20 cracks, over a length representing only 9 % of the beam length. 3. during all loading stages in beam b, a maximum crack width of 0.1 mm was recorded. an equal cracking width was measured in beam a at the initial stages of loading, at the serviceability level; however, this value further increased to 0.2 mm at the end of testing. 4. the loading capacity in terms of plastification of the steel at the inner support section corresponded to a load of 300 kn in beam b, which was 30 percent higher than that for beam a (230 kn). 5. the application of a relatively high slab reinforcing ratio (1.75 %) made it possible to limit the cracking width to 0.1 mm at service load in the non-prestressed beam; however, the cracks became wider at higher load levels. 6. the load-deflection response indicated a remarkable increase in stiffness due to prestressing. the deflection of the non-prestressed beam was 1.87, 1.4, and 1.75 times higher than that of the prestressed beam at the service, yield, and ultimate load values of the non-prestressed beam, respectively. 7. the almost parallel load-deflection curves up to the yield load of the non-prestressed beam demonstrate that the reduced deflections are due mainly to the cambering associated with external prestressing.

fig. 11: cracking extension and cracking width at different loading stages

references
[1] dall'asta, a., dezi, l.: construction sequence effects on externally prestressed composite girders. international conference report: composite construction: conventional and innovative. innsbruck – austria, september 16–18, 1997, pp. 301–306
[2] viest, i. m., colaco, j. p., furlong, r. w., griffis, l. g., leon, r. t., wyllie, l. a.: composite construction design for buildings. co-published by mcgraw hill & american society of civil engineers (asce), 345 east 47th street, new york, 1997, ny 10017-2398, isbn 0-07-067457-4
[3] commission of the european communities: design of composite steel and concrete structures. eurocode 4, 1992
[4] aisc: manual of steel construction, lrfd. 1st edition, american institute of steel construction (aisc), chicago, 1987
[5] gilbert, r. i., bradford, m. a.: time-dependent behavior of continuous composite beams at service loads. journal of structural engineering, vol. 121, no. 2/1995, pp. 319–327
[6] dezi, l., leoni, g.: time-dependent behavior of continuous composite beams: comparison among different prestressing techniques. costruzioni metalliche, april 1997, no. 2, pp. 15–27
[7] dezi, l., tarantino, a. m.: creep in composite continuous beams i: theoretical treatment. journal of structural engineering, asce, vol. 119, no. 7/1993, pp. 2095–2111
[8] amadio, c., fragiacomo, m.: a finite element method for the study of creep and shrinkage effects in composite beams with deformable shear connection. costruzioni metalliche, no. 4/1993, pp. 213–228
[9] tong, w., saadatmanesh, h.: parametric study of continuous composite girders. journal of structural engineering, asce, vol. 118, no. 1/1992, pp. 186–205

the support provided for this research by the grant agency of the czech republic, project no. 103/99/0734, is gratefully acknowledged.

eng. mohamed safan, msc, phone: +420 2 2435 4620, e-mail: msafan@beton.fsv.cvut.cz
ing.
alena kohoutková, csc., phone: +420 2 2435 3740, e-mail: akohout@fsv.cvut.cz
dept. of concrete structures and bridges, faculty of civil engineering, czech technical university in prague, thákurova 7, 166 29 praha 6, czech republic
acta polytechnica 53(4):322–328, 2013

continued fractions of square roots of natural numbers
ľubomíra balková (a,*), aranka hrušková (b)
(a) department of mathematics, faculty of nuclear sciences and physical engineering, czech technical university in prague, trojanova 13, 120 00 praha 2, czech republic
(b) gymnázium christiana dopplera, zborovská 45, 150 00 praha 5 – smíchov, czech republic
(*) corresponding author: lubomira.balkova@fjfi.cvut.cz
abstract. in this paper, we will first summarize known results concerning continued fractions. then we will limit our consideration to continued fractions of quadratic numbers. the second author describes periods and sometimes the precise form of continued fractions of $\sqrt{N}$, where $N$ is a natural number. in cases where we have been able to find such results in the literature, we recall the original authors; however, many results seem to be new.
keywords: continued fractions, periodic continued fractions, periods of continued fractions, quadratic numbers.

1. introduction
continued fractions have a long history behind them; their origin may go back to the age of euclid's algorithm for the greatest common divisor, or even earlier. however, they are experiencing a revival nowadays, thanks to their applications in high-speed and high-accuracy computer arithmetics. some of the advantages of continued fractions in computer arithmetics are: faster division and multiplication than with positional number representations, fast and precise evaluation of trigonometric, logarithmic and other functions, precise representation of transcendental numbers, and no roundoff or truncation errors ([6], kahan's method in [3, p. 179]).

2. continued fractions
in this section we summarize some basic definitions and results that can be found in any number theory course [1, 2, 4]. we use $\lfloor x \rfloor$ to denote the integer part of a real number $x$.
definition 2.1. the continued fraction (expansion) of a real number $x$ is the sequence of integers $(a_n)_{n \in \mathbb{N}}$ obtained by the following algorithm:
$$x_0 = x, \qquad a_n = \lfloor x_n \rfloor, \qquad x_{n+1} = \begin{cases} \dfrac{1}{x_n - a_n} & \text{if } x_n \notin \mathbb{Z}, \\ 0 & \text{otherwise.} \end{cases}$$
note that $a_0 \in \mathbb{Z}$ and $a_n \in \mathbb{N}$. the algorithm producing the continued fraction is closely related to the euclidean algorithm for computing the greatest common divisor of two integers.
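definition 2.1 translates directly into a program. for the square roots of natural numbers studied below, the standard all-integer recurrence for the partial quotients of $\sqrt{N}$ avoids any floating-point error; a minimal python sketch (the function name and the stopping rule, which uses the fact proved in section 3 that the period ends with $2n$, are ours):

```python
from math import isqrt

def sqrt_cf(N):
    """continued fraction of sqrt(N) for a non-square N, computed with the
    standard all-integer recurrence; returns (a0, period)."""
    a0 = isqrt(N)
    if a0 * a0 == N:
        return a0, []                   # sqrt(N) is an integer, expansion [a0]
    m, d, a, period = 0, 1, a0, []
    while a != 2 * a0:                  # the period ends with 2*a0
        m = d * a - m
        d = (N - m * m) // d
        a = (a0 + m) // d
        period.append(a)
    return a0, period

print(sqrt_cf(7))   # -> (2, [1, 1, 1, 4]), i.e. sqrt(7) = [2, 1, 1, 1, 4 repeated]
```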
it is thus readily seen that if the number $x$ is rational, then the algorithm eventually produces zeroes, i.e. there exists $N \in \mathbb{N}$ such that $a_n = 0$ for all $n > N$, thus
$$x = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{\ddots + \cfrac{1}{a_{N-1} + \cfrac{1}{a_N}}}}} \qquad (1)$$
we write $x = [a_0, \dots, a_N]$. on the other hand, if we want to find an expression of the form (1) with $a_0 \in \mathbb{Z}$ and $a_n \in \mathbb{N} \setminus \{0\}$ otherwise, then there are exactly two of them: the continued fraction $[a_0, \dots, a_N]$ and
$$x = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{\ddots + \cfrac{1}{a_{N-1} + \cfrac{1}{(a_N - 1) + \cfrac{1}{1}}}}}}.$$
if the number $x$ is irrational, then the sequence of the so-called convergents
$$a_0, \quad a_0 + \frac{1}{a_1}, \quad \dots, \quad a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{\ddots + \cfrac{1}{a_{n-1} + \cfrac{1}{a_n}}}}} \qquad (2)$$
converges to $x$ for $n \to \infty$. on the other hand, every sequence of rational numbers of the form (2) with $a_0 \in \mathbb{Z}$ and $a_n \in \mathbb{N} \setminus \{0\}$ converges to an irrational number (and for every irrational number there is only one such sequence: the sequence of convergents). we write $x = [a_0, \dots, a_n, \dots]$. the convergents of the continued fraction are known to represent irrational numbers better than any other fractions.
theorem 2.2 (lagrange). let $x \in \mathbb{R} \setminus \mathbb{Q}$, let $\frac{p_n}{q_n}$ be its $n$-th convergent (where $p_n$ and $q_n$ are coprime), and let $\frac{p}{q}$ with $p, q \in \mathbb{Z}$ be distinct from $\frac{p_n}{q_n}$ and such that $0 < q \le q_n$. then
$$\left| x - \frac{p_n}{q_n} \right| < \left| x - \frac{p}{q} \right|.$$
it is also known how well continued fractions approximate irrational numbers.
theorem 2.3. let $x \in \mathbb{R} \setminus \mathbb{Q}$ and let $\frac{p_n}{q_n}$ be its $n$-th convergent (where $p_n$ and $q_n$ are coprime). then either
$$\left| x - \frac{p_n}{q_n} \right| < \frac{1}{2q_n^2} \qquad \text{or} \qquad \left| x - \frac{p_{n+1}}{q_{n+1}} \right| < \frac{1}{2q_{n+1}^2}.$$
and, in a certain way, only continued fractions get very close to irrational numbers.
theorem 2.4 (legendre). let $x \in \mathbb{R} \setminus \mathbb{Q}$ and let $\frac{p}{q}$ with $p, q \in \mathbb{Z}$ satisfy $\left|x - \frac{p}{q}\right| < \frac{1}{2q^2}$. then $\frac{p}{q}$ is a convergent of $x$.

2.1. continued fractions and continuants
the convergents of continued fractions are closely related to the so-called continuants $K_n(x_1, \dots, x_n)$.
theorem 2.5. let $a_0 \in \mathbb{R}$, $a_i > 0$, $i \in \mathbb{N}$. then it holds
$$a_0 + \cfrac{1}{a_1 + \cfrac{1}{\ddots + \cfrac{1}{a_n}}} = \frac{K_{n+1}(a_0, a_1, \dots, a_n)}{K_n(a_1, \dots, a_n)},$$
where the polynomial $K_n(x_1, \dots, x_n)$ is given by the recurrence relation $K_{-1} = 0$, $K_0 = 1$, and for $n \ge 1$ by
$$K_n(x_1, \dots, x_n) = K_{n-2}(x_1, \dots, x_{n-2}) + x_n K_{n-1}(x_1, \dots, x_{n-1}).$$
corollary 2.6. let $[a_0, \dots, a_n, \dots]$ be the continued fraction of an irrational number $x$. then its $n$-th convergent $\frac{p_n}{q_n}$ satisfies $p_n = K_{n+1}(a_0, \dots, a_n)$ and $q_n = K_n(a_1, \dots, a_n)$.
theorem 2.7. for every $n \in \mathbb{N}$ and $a_1, \dots, a_n \in \mathbb{R}$, we have $K_n(a_1, \dots, a_n) = K_n(a_n, \dots, a_1)$.
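corollary 2.6 is equivalent to the familiar three-term recurrences $p_n = a_n p_{n-1} + p_{n-2}$ and $q_n = a_n q_{n-1} + q_{n-2}$. a small self-contained python sketch illustrating them, together with the approximation quality promised by theorem 2.3, on $\sqrt{2} = [1, \overline{2}]$:

```python
from fractions import Fraction

def convergents(terms):
    """p_n/q_n via p_n = a_n p_(n-1) + p_(n-2), q_n = a_n q_(n-1) + q_(n-2),
    the three-term recurrence equivalent to corollary 2.6."""
    p_prev, q_prev, p, q = 1, 0, terms[0], 1
    yield Fraction(p, q)
    for a in terms[1:]:
        p_prev, p = p, a * p + p_prev
        q_prev, q = q, a * q + q_prev
        yield Fraction(p, q)

x = 2 ** 0.5                           # a float is enough for a size check here
for pq in convergents([1] + [2] * 8):  # sqrt(2) = [1; 2, 2, 2, ...]
    q = pq.denominator
    print(pq, abs(x - pq) < Fraction(1, 2 * q * q))
```

for $\sqrt{2}$ every convergent happens to satisfy the bound $|x - p/q| < 1/(2q^2)$; theorem 2.3 only guarantees it for at least one of any two consecutive convergents.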
2.2. continued fractions of quadratic numbers
we will call a quadratic irrational an irrational root $\alpha$ of a quadratic equation $ax^2 + bx + c = 0$, where $a, b, c \in \mathbb{Z}$. the second root of the equation will be denoted $\alpha'$ and called the (algebraic) conjugate of $\alpha$. in order to state the theorem describing continued fractions of quadratic irrationals, we need to recall that a continued fraction $[a_0, \dots, a_n, \dots]$ is called eventually periodic if $[a_0, \dots, a_n, \dots] = [a_0, \dots, a_{k-1}, \overline{a_k, \dots, a_\ell}]$, i.e., it starts with a preperiod $a_0, \dots, a_{k-1}$ and then a period $a_k, \dots, a_\ell$ is repeated an infinite number of times. it is called purely periodic if $[a_0, \dots, a_n, \dots] = [\overline{a_0, \dots, a_\ell}]$, i.e., if the preperiod is empty.
theorem 2.8 (lagrange). let $\alpha \in \mathbb{R} \setminus \mathbb{Q}$. the continued fraction of $\alpha$ is eventually periodic if and only if $\alpha$ is a quadratic irrational.
theorem 2.9 (galois). let $\alpha$ be a quadratic irrational and $\alpha'$ its conjugate. the continued fraction of $\alpha$ is purely periodic if and only if $\alpha > 1$ and $\alpha' \in (-1, 0)$.
example 2.10. let $\alpha = \frac{1+\sqrt{5}}{2}$, i.e., the so-called golden ratio. then $\alpha$ is a root of $x^2 - x - 1 = 0$ and $\alpha' = \frac{1-\sqrt{5}}{2} \in (-1, 0)$. the continued fraction of $\alpha$ is indeed purely periodic, since
$$\alpha = 1 + \frac{-1+\sqrt{5}}{2} = 1 + \frac{1}{\frac{1+\sqrt{5}}{2}} = 1 + \frac{1}{\alpha},$$
consequently $\alpha = [\overline{1}]$.
in the sequel, when we restrict our consideration to square roots of natural numbers, we will make use of the following lemma from [4].
lemma 2.11. let $\alpha$ be a quadratic irrational and $\alpha'$ its conjugate. if $\alpha$ has a purely periodic continued fraction $[\overline{a_0, a_1, \dots, a_n}]$, then $\frac{-1}{\alpha'} = [\overline{a_n, \dots, a_1, a_0}]$.

3. continued fractions of $\sqrt{N}$
let us consider $N \in \mathbb{N} \setminus \{0\}$. if $N = k^2$ for some $k \in \mathbb{N}$, then $\sqrt{N} = k$ and the continued fraction is $\sqrt{N} = [k]$. therefore, in the sequel we limit our considerations to $N \in \mathbb{N} \setminus \{0\}$ which is not a square. then there exists a unique $n \in \mathbb{N} \setminus \{0\}$ and a unique $j \in \{1, \dots, 2n\}$ such that $N = n^2 + j$. the proofs of the two following theorems can be found in [4], page 15. however, we repeat them here, since they follow almost immediately from the previous statements and they give an insight into the form of continued fractions of quadratic numbers.
theorem 3.1. for every $n \in \mathbb{N} \setminus \{0\}$ and every $j \in \{1, \dots, 2n\}$ the continued fraction of $\sqrt{n^2+j}$ is of the form $[n, \overline{a_1, \dots, a_r, 2n}]$, where $a_1 \dots a_r$ is a palindrome.
proof. denote $\alpha = n + \sqrt{n^2+j}$. then $\alpha$ is a quadratic irrational greater than 1, and $\alpha' = n - \sqrt{n^2+j} \in (-1, 0)$. therefore, by theorem 2.9, $\alpha$ has a purely periodic continued fraction, i.e., there exist $a_1, \dots, a_r \in \mathbb{N}$ such that $\alpha = [\overline{2n, a_1, \dots, a_r}]$. it is thus evident that $\sqrt{n^2+j} = [n, \overline{a_1, \dots, a_r, 2n}]$. it remains to prove that $a_1 \dots a_r$ is a palindrome. according to lemma 2.11, the number $\frac{-1}{\alpha'}$ has its continued fraction equal to $[\overline{a_r, \dots, a_1, 2n}]$. we thus obtain
$$\sqrt{n^2+j} = n + \cfrac{1}{\cfrac{-1}{n - \sqrt{n^2+j}}} = n + \cfrac{1}{\cfrac{-1}{\alpha'}} = [n, \overline{a_r, \dots, a_1, 2n}].$$
since the continued fraction of an irrational number is unique and we have $\sqrt{n^2+j} = [n, \overline{a_1, \dots, a_r, 2n}] = [n, \overline{a_r, \dots, a_1, 2n}]$, it follows that $a_1 = a_r$, $a_2 = a_{r-1}$, etc. consequently, $a_1 \dots a_r$ is a palindrome.
theorem 3.2. a continued fraction of the form $[n, \overline{a_1, \dots, a_r, 2n}]$, where $a_1 \dots a_r$ is a palindrome, corresponds to $\sqrt{N}$ for a rational number $N$.
proof. denote by $x$ the number whose continued fraction equals $[n, \overline{a_1, \dots, a_r, 2n}]$, i.e.,
$$x = n + \cfrac{1}{a_1 + \cfrac{1}{\ddots + \cfrac{1}{a_r + \cfrac{1}{2n + (x - n)}}}}.$$
hence, by theorem 2.5,
$$x - n = \frac{K_r(a_2, \dots, a_r, x+n)}{K_{r+1}(a_1, \dots, a_r, x+n)} = \frac{K_{r-2}(a_2, \dots, a_{r-1}) + (x+n)\,K_{r-1}(a_2, \dots, a_r)}{K_{r-1}(a_1, \dots, a_{r-1}) + (x+n)\,K_r(a_1, \dots, a_r)}.$$
by theorem 2.7, and since $a_1 \dots a_r$ is a palindrome, we have $K_{r-1}(a_1, \dots, a_{r-1}) = K_{r-1}(a_2, \dots, a_r)$. consequently, we obtain
$$x = \sqrt{n^2 + \frac{2n\,K_{r-1}(a_1, \dots, a_{r-1}) + K_{r-2}(a_2, \dots, a_{r-1})}{K_r(a_1, \dots, a_r)}},$$
where under the square root there is certainly a rational number, since by their definition continuants with integer variables are integers.
in the sequel, let us study the length of the period and the form of the continued fraction of $\sqrt{N} = \sqrt{n^2+j}$ in dependence on $n$ and $j$, where $n \in \mathbb{N} \setminus \{0\}$ and $j \in \{1, \dots, 2n\}$. we will prove only some of the observations, since the proofs are quite technical and space-demanding; the rest of the proofs may be found in [5]. in table 1, we have highlighted all classes of $n$ and $j$ for which the continued fractions of $\sqrt{N} = \sqrt{n^2+j}$ have been described.
observation 3.3. the continued fraction of $\sqrt{N}$ has a period of length 1 if and only if $N = n^2 + 1$.
it holds then $\sqrt{N} = [n, \overline{2n}]$.
proof. this observation has already been made in [7].
($\Leftarrow$):
$$\sqrt{n^2+1} = n + (\sqrt{n^2+1} - n) = n + \frac{1}{\sqrt{n^2+1} + n} = n + \frac{1}{2n + (\sqrt{n^2+1} - n)},$$
hence $\sqrt{N} = [n, \overline{2n}]$.
($\Rightarrow$): if the length of the period equals 1, then by theorem 3.1 we have $\sqrt{N} = [n, \overline{2n}]$, i.e.,
$$\sqrt{n^2+j} = n + (\sqrt{n^2+j} - n) = n + \frac{1}{2n + (\sqrt{n^2+j} - n)},$$
hence
$$\sqrt{n^2+j} - n = \frac{1}{2n + (\sqrt{n^2+j} - n)}, \qquad \frac{j}{\sqrt{n^2+j} + n} = \frac{1}{\sqrt{n^2+j} + n}, \qquad j = 1.$$
observation 3.4. the continued fraction of $\sqrt{N}$ has a period of length 2 if and only if $\frac{2n}{j}$ is an integer. it holds then $\sqrt{N} = [n, \overline{\frac{2n}{j}, 2n}]$.
proof. ($\Leftarrow$):
$$\sqrt{n^2+j} = n + \frac{j}{\sqrt{n^2+j} + n} = n + \cfrac{1}{\frac{2n}{j} + \frac{\sqrt{n^2+j} - n}{j}} = n + \cfrac{1}{\frac{2n}{j} + \cfrac{1}{\sqrt{n^2+j} + n}} = n + \cfrac{1}{\frac{2n}{j} + \cfrac{1}{2n + (\sqrt{n^2+j} - n)}},$$
thus $\sqrt{N} = [n, \overline{\frac{2n}{j}, 2n}]$.
($\Rightarrow$): if the length of the period equals 2, then by theorem 3.1 we have $\sqrt{N} = [n, \overline{x, 2n}]$, i.e.,
$$\sqrt{n^2+j} - n = \cfrac{1}{x + \cfrac{1}{2n + (\sqrt{n^2+j} - n)}} = \cfrac{1}{x + \cfrac{1}{\sqrt{n^2+j} + n}},$$
and comparing the rational and irrational parts of both sides yields $x = \frac{2n}{j}$.

table 1. all classes of $n \le 40$ (first column) and $j \le 31$ (first row) for which the continued fractions of $\sqrt{N} = \sqrt{n^2+j}$ have been described are highlighted.

observation 3.5. the continued fraction of $\sqrt{N}$ has a period of length 3 if and only if $j = 4ak + 1$ and $n = aj + k$ for some $a, k \in \mathbb{N}$, $a \ge 1$, $k \ge 1$. it holds then $\sqrt{N} = [n, \overline{2a, 2a, 2n}]$ and $5 \le j \le n-1$.
proof. ($\Rightarrow$): if the length of the period equals 3, then by theorem 3.1 we have $\sqrt{N} = [n, \overline{x, x, 2n}]$, i.e.,
$$\sqrt{n^2+j} = n + \cfrac{1}{x + \cfrac{1}{x + \cfrac{1}{2n + (\sqrt{n^2+j} - n)}}},$$
hence we get $j = \frac{2xn+1}{x^2+1}$. since $j$ is an integer, $x$ must be even. furthermore, as $j \ne 1$ by observation 3.3, there exists $a \ge 1$ such that $x = 2a$. it then follows from $j = \frac{4an+1}{4a^2+1}$ that $n = aj + \frac{j-1}{4a}$. since $n$ is an integer, we finally obtain $j = 4ak + 1$ and $n = aj + k$ for some $k \ge 1$. it is easy to verify that $j \ge 5$ and $j \le n-1$.
($\Leftarrow$): the reverse implication is only an exercise in manipulation with square roots and integer parts. we leave it to the reader.

n | j | $\sqrt{N}$
$\ell = 4$:
$2k+1$, $k \ge 2$ | $2n-3$ | $[n, 1, \frac{n-3}{2}, 1, 2n]$
$2k+1$, $k \ge 1$ | $\frac{3n+1}{2}$ | $[n, 1, 2, 1, 2n]$
$3k+2$ | $\frac{5n+2}{3}$ | $[n, 1, 4, 1, 2n]$
$3k+2$, $k \ge 1$ | $2n-2$ | $[n, 1, \frac{2n-4}{3}, 1, 2n]$
$3k+2$ | $\frac{4n+1}{3}$ | $[n, 1, 1, 1, 2n]$
$5k+2$, $k \ge 1$ | $n-1$ | $[n, 2, \frac{2n-4}{5}, 2, 2n]$
$5k+4$ | $\frac{8n+3}{5}$ | $[n, 1, 3, 1, 2n]$
$6k+1$, $k \ge 1$ | $\frac{5n+1}{6}$ | $[n, 2, 2, 2, 2n]$
$6k+5$ | $\frac{2n-1}{3}$ | $[n, 3, \frac{n-1}{2}, 3, 2n]$
$9k+4$, $k \ge 1$ | $n-2$ | $[n, 2, \frac{2n-8}{9}, 2, 2n]$
$\ell = 5$:
$2k+1$, $k \ge 1$ | $4$ | $[n, \frac{n-1}{2}, 1, 1, \frac{n-1}{2}, 2n]$
$5k+3$ | $\frac{6n+2}{5}$ | $[n, 1, 1, 1, 1, 2n]$
$\ell = 6$:
$2k$, $k \ge 2$ | $2n-3$ | $[n, 1, \frac{n}{2}-1, 2, \frac{n}{2}-1, 1, 2n]$
$10k+7$ | $\frac{2n+1}{5}$ | $[n, 4, 1, \frac{n-3}{2}, 1, 4, 2n]$
$3k+1$, $k \ge 1$ | $\frac{2n+1}{3}$ | $[n, 2, 1, n-1, 1, 2, 2n]$
$3k+1$, $k \ge 1$ | $n+1$ | $[n, 1, 1, \frac{2n-2}{3}, 1, 1, 2n]$
$3k+1$, $k \ge 1$ | $\frac{4n+2}{3}$ | $[n, 1, 2, n, 2, 1, 2n]$
$6k+4$ | $\frac{7n+2}{6}$ | $[n, 1, 1, 2, 1, 1, 2n]$
$7k+3$, $k \ge 1$ | $n+2$ | $[n, 1, 1, \frac{2n-6}{7}, 1, 1, 2n]$
$\ell = 8$:
$4k+1$, $k \ge 2$ | $2n-7$ | $[n, 1, \frac{n-5}{4}, 2, \frac{n-1}{2}, 2, \frac{n-5}{4}, 1, 2n]$
$6k$, $k \ge 1$ | $\frac{4n}{3}$ | $[n, 1, 1, 1, \frac{n-2}{2}, 1, 1, 1, 2n]$
$6k+2$, $k \ge 1$ | $\frac{2n-1}{3}$ | $[n, 3, \frac{n-2}{2}, 1, 4, 1, \frac{n-2}{2}, 3, 2n]$
$7k+5$ | $\frac{8n+2}{7}$ | $[n, 1, 1, 3, n, 3, 1, 1, 2n]$
$9k+3$, $k \ge 1$ | $9$ | $[n, \frac{2n-6}{9}, 1, 2, \frac{2n-6}{9}, 2, 1, \frac{2n-6}{9}, 2n]$
$9k+6$, $k \ge 1$ | $9$ | $[n, \frac{2n-3}{9}, 2, 1, \frac{2n-12}{9}, 1, 2, \frac{2n-3}{9}, 2n]$
$\ell = 10$:
$6k+3$, $k \ge 1$ | $\frac{4n}{3}$ | $[n, 1, 1, 1, \frac{n-1}{2}, 6, \frac{n-1}{2}, 1, 1, 1, 2n]$
$9k+6$ | $\frac{10n+3}{9}$ | $[n, 1, 1, 3, 1, n-1, 1, 3, 1, 1, 2n]$
$10k+5$, $k \ge 1$ | $\frac{4n}{5}$ | $[n, 2, 1, 1, \frac{n-1}{2}, 10, \frac{n-1}{2}, 1, 1, 2, 2n]$
table 2. lengths $\ell$ of periods and the form of continued fractions for several classes.
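the rows of table 2 are easy to check mechanically. as an illustration, the following sketch (python, reusing the exact-arithmetic routine from the sketch in section 2) verifies the first row of each of the $\ell = 4$, $\ell = 5$ and $\ell = 6$ blocks for all admissible $n$ below 400:

```python
from math import isqrt

def sqrt_cf_period(N):
    """period of the continued fraction of sqrt(N), non-square N, all integers."""
    a0 = isqrt(N)
    m, d, a, period = 0, 1, a0, []
    while a != 2 * a0:
        m = d * a - m
        d = (N - m * m) // d
        a = (a0 + m) // d
        period.append(a)
    return period

for n in range(5, 400, 2):    # l = 4 block, first row: n = 2k+1 (k >= 2), j = 2n-3
    assert sqrt_cf_period(n * n + 2 * n - 3) == [1, (n - 3) // 2, 1, 2 * n]
for n in range(3, 400, 2):    # l = 5 block, first row: n = 2k+1 (k >= 1), j = 4
    assert sqrt_cf_period(n * n + 4) == [(n - 1) // 2, 1, 1, (n - 1) // 2, 2 * n]
for n in range(4, 400, 2):    # l = 6 block, first row: n = 2k (k >= 2), j = 2n-3
    assert sqrt_cf_period(n * n + 2 * n - 3) == [1, n // 2 - 1, 2, n // 2 - 1, 1, 2 * n]
print("sampled rows of table 2 check out")
```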
in order to save space in the proofs, let us introduce the following notation:
$$\langle a_0, a_1, \dots, a_{n-1}, a_n \rangle = a_0 + \cfrac{1}{a_1 + \cfrac{1}{\ddots + \cfrac{1}{a_{n-1} + \cfrac{1}{a_n}}}},$$
where $a_i \in \mathbb{N}$ for $i \in \{0, 1, 2, \dots, n-1\}$, but $a_n \in \mathbb{R}$.
observation 3.6. let $j = 4$. if $n$ is even, then the length of the period is 2 and $\sqrt{N} = [n, \overline{\frac{2n}{j}, 2n}]$. if $n$ is odd, then the length of the period is 5 and $\sqrt{N} = [n, \overline{\frac{n-1}{2}, 1, 1, \frac{n-1}{2}, 2n}]$.
proof. if $n$ is even, then $\frac{2n}{j}$ is an integer and the statement is a corollary of observation 3.4. if $n$ is odd, it holds
$$\sqrt{n^2+4} = n + (\sqrt{n^2+4} - n) = \left\langle n, \frac{\sqrt{n^2+4}+n}{4} \right\rangle = \left\langle n, \frac{n-1}{2}, \frac{\sqrt{n^2+4}+n-2}{n} \right\rangle = \left\langle n, \frac{n-1}{2}, 1, \frac{\sqrt{n^2+4}+2}{n} \right\rangle$$
$$= \left\langle n, \frac{n-1}{2}, 1, 1, \frac{\sqrt{n^2+4}+n-2}{4} \right\rangle = \left\langle n, \frac{n-1}{2}, 1, 1, \frac{n-1}{2}, 2n + (\sqrt{n^2+4} - n) \right\rangle,$$
thus $\sqrt{N} = [n, \overline{\frac{n-1}{2}, 1, 1, \frac{n-1}{2}, 2n}]$.
observation 3.7. for $n > 1$ and $j = 2n-1$ the length of the period is 4, and the continued fraction is then $\sqrt{N} = [n, \overline{1, n-1, 1, 2n}]$.
proof.
$$\sqrt{n^2+2n-1} = n + (\sqrt{n^2+2n-1} - n) = \left\langle n, \frac{\sqrt{n^2+2n-1}+n}{2n-1} \right\rangle = \left\langle n, 1, \frac{\sqrt{n^2+2n-1}+(n-1)}{2} \right\rangle$$
$$= \left\langle n, 1, n-1, \frac{\sqrt{n^2+2n-1}+(n-1)}{2n-1} \right\rangle = \left\langle n, 1, n-1, 1, 2n + (\sqrt{n^2+2n-1} - n) \right\rangle,$$
hence $\sqrt{N} = [n, \overline{1, n-1, 1, 2n}]$.
observation 3.8. for $n > 3$ and $j = 2n-3$, either the length of the period is 4 if $n$ is odd, and the continued fraction is then $\sqrt{N} = [n, \overline{1, \frac{n-3}{2}, 1, 2n}]$, or the length of the period is 6 if $n$ is even, and the continued fraction is then $\sqrt{N} = [n, \overline{1, \frac{n}{2}-1, 2, \frac{n}{2}-1, 1, 2n}]$.
proof. for $n$ odd:
$$\sqrt{n^2+2n-3} = n + (\sqrt{n^2+2n-3} - n) = \left\langle n, \frac{\sqrt{n^2+2n-3}+n}{2n-3} \right\rangle = \left\langle n, 1, \frac{\sqrt{n^2+2n-3}+(n-3)}{4} \right\rangle$$
$$= \left\langle n, 1, \frac{n-3}{2}, \frac{\sqrt{n^2+2n-3}+(n-3)}{2n-3} \right\rangle = \left\langle n, 1, \frac{n-3}{2}, 1, 2n + (\sqrt{n^2+2n-3} - n) \right\rangle,$$
thus $\sqrt{N} = [n, \overline{1, \frac{n-3}{2}, 1, 2n}]$. for $n$ even:
$$\sqrt{n^2+2n-3} = \left\langle n, \frac{\sqrt{n^2+2n-3}+n}{2n-3} \right\rangle = \left\langle n, 1, \frac{\sqrt{n^2+2n-3}+(n-3)}{4} \right\rangle = \left\langle n, 1, \frac{n}{2}-1, \frac{\sqrt{n^2+2n-3}+(n-1)}{n-1} \right\rangle$$
$$= \left\langle n, 1, \frac{n}{2}-1, 2, \frac{\sqrt{n^2+2n-3}+(n-1)}{4} \right\rangle = \left\langle n, 1, \frac{n}{2}-1, 2, \frac{n}{2}-1, \frac{\sqrt{n^2+2n-3}+(n-3)}{2n-3} \right\rangle$$
$$= \left\langle n, 1, \frac{n}{2}-1, 2, \frac{n}{2}-1, 1, 2n + (\sqrt{n^2+2n-3} - n) \right\rangle,$$
thus $\sqrt{N} = [n, \overline{1, \frac{n}{2}-1, 2, \frac{n}{2}-1, 1, 2n}]$.
table 2 above includes all the remaining cases of continued fractions of $\sqrt{n^2+j}$ that we were able to determine in terms of $n$ and $j$.
observation 3.9. let $k \in \mathbb{N}$. table 2 summarizes the lengths $\ell$ of the periods and the form of the continued fractions for several classes (described in an analogous way) of $n$ and $j$.
the next observation was made in a different way than all the previous ones: we prescribed the form of the continued fraction and searched for $\sqrt{N}$ having such a continued fraction.
observation 3.10. if the period of the continued fraction of $\sqrt{N} = \sqrt{n^2+j}$ contains $p \ge 1$ ones as its palindromic part, i.e., $\sqrt{N} = [n, \overline{\underbrace{1, \dots, 1}_{p}, 2n}]$, then $n = \frac{kF_p + F_{p+1}}{2}$ for some $k \in \mathbb{N}$, $p+1 \ne 3\ell$ where $\ell \in \mathbb{N}$, and $j = \frac{2nF_{p-1} + F_{p-2}}{F_p}$, where $F_n$ denotes the $n$-th fibonacci number given by the recurrence relation $F_{-1} = 0$, $F_0 = 1$ and $F_n = F_{n-2} + F_{n-1}$ for all $n \ge 1$.
proof. it is a direct consequence of the proof of theorem 3.2 and the definition of continuants.
the last observation is also of a different form than the previous ones, since $j$ and $n$ depend on two parameters.
observation 3.11. let $n = 4ka + 2a$, where $k, a \in \mathbb{N}$, $k \ge 1$, $a \ge 1$, and $j = 8a$. then the continued fraction of $\sqrt{N} = \sqrt{n^2+j}$ equals
$$\left[ n, \overline{\frac{4n-j}{2j}, 1, 1, \frac{n-2}{2}, 1, 1, \frac{4n-j}{2j}, 2n} \right].$$
proof. the proof may be found in [5].
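observation 3.11, whose proof is referred to [5], can at least be confirmed numerically; a short sketch in the same spirit as the previous ones:

```python
from math import isqrt

def sqrt_cf_period(N):                  # same all-integer routine as above
    a0 = isqrt(N)
    m, d, a, period = 0, 1, a0, []
    while a != 2 * a0:
        m = d * a - m
        d = (N - m * m) // d
        a = (a0 + m) // d
        period.append(a)
    return period

for a in range(1, 15):
    for k in range(1, 15):
        n, j = 4 * k * a + 2 * a, 8 * a          # hypotheses of observation 3.11
        c = (4 * n - j) // (2 * j)               # simplifies to k for these n, j
        expected = [c, 1, 1, (n - 2) // 2, 1, 1, c, 2 * n]
        assert sqrt_cf_period(n * n + j) == expected, (a, k)
print("observation 3.11 confirmed for all a, k in 1..14")
```

note that $(4n-j)/(2j)$ simplifies to $k$ for $n = 4ka + 2a$ and $j = 8a$, so all entries of the predicted period are indeed integers.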
we also made one conjecture that turned out to be false.

conjecture 3.12. for $\sqrt{N}$ the length of the period of the continued fraction is less than or equal to $2n$.

this conjecture was made when contemplating a table of periods of $\sqrt{N}$ for $N \le 1000$. however, in [8] it is shown that for $N = 1726$, with $n = 41$, the period of the continued fraction of $\sqrt{N}$ is of length $88 > 82 = 2n$. a rougher upper bound comes from [7].

theorem 3.13. for $\sqrt{N}$ the length of the period of the continued fraction is less than or equal to $2N$.

let us terminate with two conjectures that have not been proved yet.

conjecture 3.14. no element of the period of $\sqrt{N}$, apart from the last one, is bigger than $n$.

conjecture 3.15. there is no period of an odd length for $j = 4k+3$, where $k \in \mathbb{N}$.

acknowledgements

the first author acknowledges financial support from czech science foundation grant 13-03538s.

references
[1] klazar, m.: introduction to number theory, kam-dimatia series 782, 2006.
[2] masáková, z., pelantová, e.: teorie čísel, skriptum čvut, 2010.
[3] muller, j.-m.: elementary functions: algorithms and implementation, 2nd edition, birkhäuser boston, 2006.
[4] lauritzen, n.: continued fractions and factoring, http://home.imf.au.dk/niels/cfracfact.pdf, 2009.
[5] hrušková, a.: řetězové zlomky kvadratických čísel, soč práce, http://bimbo.fjfi.cvut.cz/~soc/, 2013–2014.
[6] seidensticker, r.: continued fractions for high-speed and high-accuracy computer arithmetic, ieee symposium on computer arithmetic (1983), 184–193.
[7] sierpinski, w.: elementary theory of numbers, panstwowe wydawnictwo naukowe, warszawa, 1964.
[8] hickerson, d. r.: length of period of simple continued fraction expansion of √d, pacific journal of mathematics 46(2) (1973), 429–432.

acta polytechnica 53(5):395–398, 2013

the shmushkevich method for higher symmetry groups of interacting particles

mark bodner (a), goce chadzitaskos (b), jiří patera (c, a), agnieszka tereszkiewicz (d, c, ∗)

a mind research institute, 111 academy drive, irvine, ca 92617, usa
b faculty of nuclear sciences and physical engineering, czech technical university in prague, břehová 7, cz-11519 praha 1, czech republic
c centre de recherches mathématiques, université de montréal, c. p. 6128, succ. centre-ville, montréal, h3c 3j7, québec, canada
d institute of mathematics, university of bialystok, akademicka 2, pl-15-267 bialystok, poland
∗ corresponding author: a.tereszkiewicz@uwb.edu.pl

abstract. about 60 years ago, i. shmushkevich presented a simple ingenious method for computing the relative probabilities of channels involving the same interacting multiplets of particles, without the need to compute the clebsch-gordan coefficients. the basic idea of shmushkevich is "isotopic non-polarization" of the states before the interaction and after it. hence his underlying lie group was su(2). we extend this idea to any simple lie group. this paper determines the relative probabilities of various channels of scattering and decay processes following from the invariance of the interactions with respect to a compact simple lie group.
aiming at the probabilities rather than at the clebsch-gordan coefficients makes the task easier, and the simultaneous consideration of all possible channels for the given multiplets involved in the process makes the task possible. the probability of states with multiplicities greater than 1 is averaged over. examples with the symmetry groups o(5), f(4), and e(8) are shown.

keywords: isospin, particle collisions, lie group representation.
submitted: 1 april 2013. accepted: 7 may 2013.

1. introduction

the method of shmushkevich [1, 2] was conceived as a simpler alternative for computing the relative probabilities of various channels of scattering and decay processes under strict isospin invariance (su(2) invariance). the traditional alternative method for calculating the probabilities of the same channels is to calculate first all pertinent clebsch-gordan coefficients (cgc) for the channel. the most remarkable feature of the shmushkevich method is the complete avoidance of the need to calculate the clebsch-gordan coefficients. the underlying idea is to consider isotopically unpolarized states before and after the interaction, assuming that each possible state comes with equal probability. the simplicity of the idea has attracted the attention of many physicists [3–6].

technically, the two methods differ in their objective: shmushkevich's method calculates just the probabilities. the conventional alternative method calculates the cgc, their squares, and then provides the probabilities. neither of the tasks is easy for higher ranks of the representations. from the point of view of symmetries, the two methods differ by the symmetry group that they exploit: shmushkevich uses just the weyl group of the lie group, while cgcs are built using the symmetry group of demazure-tits, see [7, 8].

the difficulty of generalizing shmushkevich's method to higher rank groups lies in the frequent occurrence of multiple states with the same quantum numbers, equivalently labeled by the same weights of irreducible representations [9], as well as in the sheer number of channels that need to be written down. it is likely that practical exploitation of shmushkevich's idea for higher groups, and possibly representations of much higher dimensions, will not proceed by spelling out the large number of channels for each case and counting the number of occurrences of each state in all the channels. instead, one would start from one known channel and use the symmetry group, the weyl group in this case, to produce other channels with the same probability. this is a routine operation which, however, does not produce all possible channels. it relates only the states which are situated on the same weyl group orbit. this may be all that one needs as long as only the channels defined by individual orbits are studied. however, if the equal probability of all the states of the lie group is to be involved, the link between different orbits present in the same representation has to be imposed independently. for the probabilities, a natural link is provided by the requirement that the probabilities add up to one. if an orbit is present in an irreducible representation more than once, say m times, we count them as equally probable m channels.
in this paper we provide an illustration of this approach, for the symmetry groups o(5), f(4), and e(8). the process we consider is the simplest one, where three multiplets are interacting; more precisely, the interaction of two particles yields a third one. our aim is to show how to average over different particles/states which carry the same lie group representation labels. we write the highest weights of the representations in round brackets, and the dominant weights of the weyl group orbits in square brackets. in the examples, we show that there are many states which have identical group labels (weights), although they label different particle states. in order to avoid the almost impossible task of distinguishing these states, we add them up and count their total probability.

2. symmetry group o(5)

consider the example where the underlying symmetry group is the weyl reflection group of the lie group o(5), or equivalently of the lie algebra c2. we label the representations by their unique highest weight (relative to the basis of the fundamental weights). the product of representations of dimensions 10 and 4 decomposes as follows,
\[
(20) \times (10) = (30) + (11) + (10), \qquad 10 \times 4 = 20 + 16 + 4 = 40,
\]
where the second equality shows the dimensions of the representations, see [10]. labeling the weyl group orbits by their unique dominant weights, the product of the weight systems decomposes into the weyl group orbits as follows,
\[
(20) \times (10) = [30] + 2[11] + 5[10], \qquad 10 \times 4 = 4 + 2 \cdot 8 + 5 \cdot 4 = 40,
\]
where the integers in front of the square brackets are the multiplicities of the occurrence of the respective orbits in the decomposition. if only the product of the weyl group orbits were to be considered, the decomposition would be simpler:
\[
[20] \times [10] = [30] + [11] + [10], \qquad 4 \times 4 = 4 + 8 + 4 = 16.
\]
there are 40 states in the product. if equal probability is assumed, each of the channels comes with the probability 1/40. consequently, we have the probabilities:
• 4 states from [30], each present once: 1/40;
• 8 states from [11], each present twice: 1/20;
• 4 states from [10], each present 5 times: 1/8.
the results of the example are summarized in table 1.

  orbit    in (30)   in (11)   in (10)   multiplicity   orbit size
  [30]     [30]      –         –         1              4
  [11]     [11]      [11]      –         2              8
  [10]     2[10]     2[10]     [10]      5              4
  decomposition into orbits with multiplicities: [30] + 2[11] + 5[10]   (40 states)

table 1. decomposition of the product in the c2 example. the decomposition is given in the weight system of irreducible representations (the columns) and in terms of orbits (bottom line). the dimensions of the representations and the sizes of the orbits are shown together with the multiplicities.
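the bookkeeping in this example (and in the f(4) and e(8) examples below) is weighted counting, which is easily scripted. the following small sketch of ours — not code from the paper — recomputes the o(5) channel probabilities from the orbit sizes and multiplicities; feeding it the f(4) and e(8) data below reproduces 676 = 26² and 61504 = 248² in the same way.

```python
from fractions import Fraction

def channel_probabilities(orbits):
    """orbits: (label, orbit size, multiplicity) triples for the decomposed product.
    returns the total number of states and the per-state probability of each orbit."""
    total = sum(size * mult for _, size, mult in orbits)
    return total, {label: Fraction(mult, total) for label, _, mult in orbits}

# c2 = o(5): (20) x (10) decomposed into weyl orbits, cf. table 1
o5 = [("[30]", 4, 1), ("[11]", 8, 2), ("[10]", 4, 5)]
total, prob = channel_probabilities(o5)
assert total == 40
assert prob["[30]"] == Fraction(1, 40)
assert prob["[11]"] == Fraction(1, 20)
assert prob["[10]"] == Fraction(1, 8)
# all states together carry the whole probability
assert sum(prob[label] * size for label, size, _ in o5) == 1
```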
3. symmetry group f(4)

consider the decomposition of the product of representations in terms of their weight systems,
\[
(0001) \times (0001) = (0002) + (0010) + (1000) + (0001) + (0000),
\]
\[
26 \times 26 = 324 + 273 + 52 + 26 + 1 = 676.
\]
decomposition of the same product in terms of weyl group orbits,
\[
(0001) \times (0001) = [0002] + 2[0010] + 6[1000] + 12[0001] + 28[0000],
\]
and the corresponding equality of the dimensions,
\[
26 \times 26 = 24 + 2 \cdot 96 + 6 \cdot 24 + 12 \cdot 24 + 28 \cdot 1.
\]
suppose that we want to decompose only the product of the orbits of the highest weights:
\[
[0001] \times [0001] = [0002] + 2[0010] + 6[1000] + 8[0001] + 24[0000],
\]
\[
24 \times 24 = 24 + 2 \cdot 96 + 6 \cdot 24 + 8 \cdot 24 + 24 \cdot 1.
\]
if equal probability of the 676 states is assumed, we have the following probabilities of the channels:
• 24 states from [0002], each present once: 1/676;
• 96 states from [0010], each present twice: 2/676;
• 24 states from [1000], each present 6 times: 6/676;
• 24 states from [0001], each present 12 times: 12/676;
• 1 state from [0000], present 28 times: 28/676.
the results of the example are summarized in table 2.

  orbit     in (0002)   in (0010)   in (1000)   in (0001)   in (0000)   multiplicity   orbit size
  [0002]    [0002]      –           –           –           –           1              24
  [0010]    [0010]      [0010]      –           –           –           2              96
  [1000]    3[1000]     2[1000]     [1000]      –           –           6              24
  [0001]    5[0001]     5[0001]     [0001]      [0001]      –           12             24
  [0000]    12[0000]    9[0000]     4[0000]     2[0000]     [0000]      28             1
  decomposition into orbits with multiplicities: [0002] + 2[0010] + 6[1000] + 12[0001] + 28[0000];  676 = 26²

table 2. decomposition of the product in the f(4) example. the decomposition is given in the weight system of irreducible representations (the columns) and in terms of orbits (bottom line). the dimensions of the representations and the sizes of the orbits are shown together with the multiplicities.

4. symmetry group e(8)

we write the components of the e(8) weights as they would be attached to the corresponding dynkin diagram; here we render the two-row symbol as (0;1000000). consider the decomposition of the product of the representations in terms of their weight systems,
\[
(0;1000000) \times (0;1000000) = (0;2000000) + (0;0100000) + (0;0000001) + (0;1000000) + (0;0000000),
\]
with the respective dimensions
\[
248 \times 248 = 27000 + 30380 + 3875 + 248 + 1 = 61504.
\]
the same product decomposed into the sum of the weyl group orbits has very different multiplicities,
\[
(0;1000000) \times (0;1000000) = [0;2000000] + 2[0;0100000] + 14[0;0000001] + 72[0;1000000] + 304[0;0000000],
\]
and the equality of the dimensions in the decomposed product,
\[
248 \times 248 = 240 + 2 \cdot 6720 + 14 \cdot 2160 + 72 \cdot 240 + 304 \cdot 1 = 61504.
\]
if only the product of the orbits is to be calculated, the result is much simpler,
\[
[0;1000000] \times [0;1000000] = [0;2000000] + 2[0;0100000] + 14[0;0000001] + 56[0;1000000] + 240[0;0000000],
\]
and the corresponding orbit sizes with the appropriate multiplicities are
\[
240 \times 240 = 240 + 2 \cdot 6720 + 14 \cdot 2160 + 56 \cdot 240 + 240 \cdot 1 = 57600.
\]
if equal probability of the 61504 states is assumed, we have the following probabilities of the channels:
• 240 states from [0;2000000], each present once: 1/61504;
• 6720 states from [0;0100000], each present twice: 2/61504;
• 2160 states from [0;0000001], each present 14 times: 14/61504;
• 240 states from [0;1000000], each present 72 times: 72/61504;
• 1 state from [0;0000000], present 304 times: 304/61504.
the results of the example are summarized in table 3.

  orbit          in (0;2000000)   in (0;0100000)   in (0;0000001)   in (0;1000000)   in (0;0000000)   multiplicity   orbit size
  [0;2000000]    1×               –                –                –                –                1              240
  [0;0100000]    1×               1×               –                –                –                2              6720
  [0;0000001]    6×               7×               1×               –                –                14             2160
  [0;1000000]    29×              35×              7×               1×               –                72             240
  [0;0000000]    120×             140×             35×              8×               1×               304            1
  decomposition into orbits with multiplicities: [0;2000000] + 2[0;0100000] + 14[0;0000001] + 72[0;1000000] + 304[0;0000000];  61504 = 248²

table 3. decomposition of the product in the e(8) example. the decomposition is given in the weight system of irreducible representations (the columns) and in terms of orbits (bottom line). the dimensions of the representations and the sizes of the orbits are shown together with the multiplicities.

acknowledgements

the authors are grateful for support from the natural sciences and engineering research council of canada, to the mind research institute of irvine, california, and to mitacs. a.t. expresses her gratitude to the centre de recherches mathématiques, université de montréal, for the hospitality during her postdoctoral fellowship. g.c. wishes to express thanks for support from the ministry of education of the czech republic (project msm6840770039).
references
[1] i.m. shmushkevich, on deduction of relations between sections that arise from the hypothesis of isotopic invariance, dokl. akad. nauk sssr 103 (1955), 235–238 (in russian).
[2] n. dushin, i.m. shmushkevich, on the relations between cross sections which result from the hypothesis of isotopic invariance, dokl. akad. nauk sssr 106 (1956), 801–805; translated in soviet phys. dokl. 1 (1956), 94–98.
[3] g. pinsky, a.j. macfarlane, e.c.g. sudarshan, shmushkevich's method for a charge-independent theory, phys. rev. 140 (1965), b1045–b1053.
[4] c.g. wohl, isospin relations by counting, amer. j. phys. 50 (1982), 748–753.
[5] p. roman, the theory of elementary particles, north-holland pub. co. (1960).
[6] r. marshak, e. sudarshan, introduction to elementary particles physics, new york, interscience publishers, 1961.
[7] r.v. moody, j. patera, general charge conjugation operators in simple lie groups, j. math. phys. 25 (1984), 2838–2847.
[8] l. michel, j. patera, r.t. sharp, demazure-tits subgroup of a simple lie group, j. math. phys. 29 (1988), 777–796.
[9] p. ramond, group theory. a physicist's survey, cambridge univ. press, 2010.
[10] w.g. mckay, j. patera, d.w. rand, tables of representations of simple lie algebras (exceptional simple lie algebras: vol. 1), université de montréal, centre de recherches mathématiques, 1990.
[11] r.v. moody, j. patera, fast recursion formula for weight multiplicities, bull. amer. math. soc. 7 (1982), 237–242.
[12] m.r. bremner, r.v. moody, j. patera, tables of dominant weight multiplicities for representations of simple lie algebras, marcel dekker, new york, 1985, isbn 0-8247-7270-9.

acta polytechnica 53(2):70–74, 2013

clock math — a system for solving sles exactly

jakub hladík, róbert lórencz, ivan šimeček∗

department of computer systems, faculty of information technology, czech technical university in prague, czech republic
∗ corresponding author: xsimecek@fit.cvut.cz

abstract. in this paper, we present a gpu-accelerated hybrid system that solves ill-conditioned systems of linear equations exactly. exactly means without rounding errors, due to using integer arithmetics. first, we scale floating-point numbers up to integers, then we solve dozens of sles within different modular arithmetics, and then we assemble the sub-solutions back using the chinese remainder theorem. this approach effectively bypasses current cpu floating-point limitations. the system is capable of solving hilbert's matrix without losing a single bit of precision, and with a significant speedup compared to existing cpu solvers.

keywords: integer arithmetic, modular arithmetic, hilbert's matrix, error-free, gpgpu, solver, opencl, optimizations.

1. introduction
solving systems of linear algebraic equations is quite a frequent task in numerical mathematics. one may often face difficulties when solving problems with an ill-conditioned matrix: the stability of the solution cannot be ensured for large dense sets of linear equations, and rounding errors during the numerical computation cannot be tolerated. methods have been developed that minimize the influence of rounding errors on the solution. the method we use relies on modular arithmetics [2] in order to solve dense systems of linear equations precisely. the underlying idea sounds quite simple – bypass floating-point rounding limitations by using integer arithmetics. it consists of three parts – converting floating-point numbers into integers, solving multiple systems of linear equations within their modules, and finally converting the sub-solutions back using the chinese remainder theorem.

nowadays, gpgpu (general-purpose computing on the gpu) is a trending topic in high-performance computing. since gpgpu began in 2006, hundreds of articles have been published every year. gpgpu is usually used to accelerate data-parallel algorithms, and that is our case. dozens of systems of linear equations emerge during the second step of our computation. each of them can be solved in parallel on the gpu, hence speedup is achieved.

in this paper, we present a gpu-accelerated solver of ill-conditioned systems of linear equations. section 3 gives a brief overview of the mathematics that is used. section 4 describes the gpu architecture in general, and presents issues that we faced while optimizing the computation. finally, section 5 shows our measured results – the speedup and a comparison with existing systems solving a similar problem.

2. state of the art

nearly one half of the problems solved in numerical mathematics lead to problems in linear algebra. the primary objective of the numerical methods used in linear algebra is to solve sets of linear equations. they are solved in a chosen computer arithmetic, most often floating-point arithmetic. floating-point arithmetic's well-known advantages have led to its widespread usage, yet it also has important disadvantages, sometimes resulting in severe problems. the disadvantages of floating-point arithmetic include mainly the generation and accumulation of rounding errors during the calculation, and the non-uniform distribution of values in the real number subset. there are a variety of alternatives to floating-point representation and the associated arithmetic operations, including modular, logarithmic, p-adic, and continued fraction arithmetic.

many numerical problems lead to sles with dense and large matrices. the rounding-error sensitivity of such systems can be so grave that they can be considered ill-conditioned. to solve these problems, many direct and iterative numerical methods have been developed that tackle such systems with higher or lower success. apart from many numerical methods, various programming and computing tools have been developed for solving rounding-error-sensitive sles. there are many computer libraries that allow the user to choose an optimal virtual length of a computer word that fully or partially eliminates the destructive influence of rounding errors. there are also arithmetic units that tackle this problem by grouping arithmetic operations, and thus they create a longer computer word for operations sensitive to rounding.
software tools that eliminate rounding errors include libraries for precision computing, such as gmp (gnu multiple precision arithmetic library). unfortunately, these tools lead to such great time complexity when they are used to solve rounding-error-sensitive sles that they may not be applicable to practical problems. another limiting factor when solving sles with a square matrix is the inherent algorithmic time complexity o(n³), where n is the dimension of the matrix.

to solve large sles with a large number of equations, special computers and processors have been developed. these are vector computers. until recently, the vector processors available on the market were cray sv1ex (2001), fujitsu vpp5000 (1999) and vmips (2001). the ibm virtual vector architecture brings vector processing features to its most recent power6 mainframe processors (2007). another suitable architecture for solving rounding-error-sensitive sles is simd (single instruction, multiple data). today, simd features can be found as extensions of the standard processor instruction set, for example intel sse, amd 3dnow! or motorola altivec. these are used in multimedia applications such as video, 3d graphics, animation, audio, etc. the recent trend is gpgpu. today, at least three of the ten top supercomputers in the top 500 list use gpus for numerical acceleration. both major gpu manufacturers, nvidia and amd, offer the possibility to run general-purpose computations on their gpus. at present there are no single-purpose numerical accelerators for solving rounding-error-sensitive sles. we use common nvidia fermi graphics cards together with intel core processors to accelerate the solution.

3. mathematical background

let us take a system of linear equations (see [1]):
\[
A x = b, \tag{1}
\]
where $A \in \mathbb{R}^{n \times n}$ is the matrix of a system of $n$ equations of $n$ unknowns, $b \in \mathbb{R}^n$ is the right-side vector and $x \in \mathbb{R}^n$ is the desired vector, the solution of our system of linear equations.

3.1. matrix scaling

matrix scaling is the first thing to do. the goal of matrix scaling is to adjust all the floating-point numbers of the matrix to their corresponding integer versions. basically, every matrix row is multiplied by its scaling constant (scalar multiplication). this has to be done without losing a single bit of precision. this condition is satisfied only when the scaling constant is $2^N$, where $N \in \mathbb{N}$. the question now is how to determine the scaling constant. first, we find the element with the smallest absolute value (closest to zero) in each row. then, we extract the absolute value of its exponent and multiply the corresponding power of two by the constant $2^{53}$, because the significand (mantissa) in the ieee 754 [6] double-precision floating-point format is 53 bits long. finally, the scaling constant $s$ is computed:
\[
s = 2^{53} \cdot 2^{|\mathrm{exp}(\min_{\mathrm{row}})|}, \tag{2}
\]
where $\mathrm{exp}$ is a function that extracts and returns the exponent (as an integer – a power of 2) and $\min_{\mathrm{row}}$ is the element in the row closest to zero. the approach being used is explained in greater detail (and with alternatives) in [2].
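as an illustration of eq. (2), the following python fragment (our sketch, not the authors' code) scales one row exactly; `math.frexp` reads off the binary exponent, and the products are exact because the scale is a power of two. the helper names are ours.

```python
import math
from fractions import Fraction

def row_scale(row):
    """scaling constant s = 2**53 * 2**|exp(min_row)| for one matrix row, cf. eq. (2)."""
    smallest = min((x for x in row if x != 0.0), key=abs)
    _, e = math.frexp(smallest)            # smallest = m * 2**e with 0.5 <= |m| < 1
    return 2 ** (53 + abs(e))

def scale_row_to_integers(row):
    s = row_scale(row)
    scaled = [Fraction(x) * s for x in row]          # Fraction(float) is exact
    assert all(v.denominator == 1 for v in scaled)   # no precision lost
    return [int(v) for v in scaled]

# e.g. one row of the 3x3 hilbert matrix, the paper's stress-test case
print(scale_row_to_integers([1.0, 0.5, 1/3]))
```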
3.2. solving a system of linear equations

from the previous step we have a properly scaled-up system of linear equations:
\[
A x = b, \tag{3}
\]
where the matrix elements $a_{ij}$ and the vector elements $x_i$ and $b_i$ are just big integers. we solve it using multi-modulus arithmetics over the commutative ring $(\mathbb{Z}_\beta, \oplus, \odot)$ with a base vector $\beta$, which is equivalent to the single-modulus arithmetics over $(\mathbb{Z}_M, +, \cdot)$ with modulus $M$. $M$ has to be a big enough positive integer to avoid rounding errors during our computation. hadamard's estimate of the determinant $d$ of the matrix $A$ can be used:
\[
|d|^2 \le \prod_{i=1}^{n} \bigl( a_{i1}^2 + a_{i2}^2 + \cdots + a_{in}^2 \bigr). \tag{4}
\]
the highest value of $M$ that could appear during the computation:
\[
M > 2 \max \Bigl\{\, n^{\frac{n}{2}} \max_{i,j}(a_{ij})^{n},\;\; n\,(n-1)^{\frac{n-1}{2}} \max_{i,j}(a_{ij})^{n-1} \max_i(y_i) \,\Bigr\},
\qquad i, j = 1, 2, \ldots, n, \tag{5}
\]
and
\[
\gcd(M, d) = 1. \tag{6}
\]
we also need the following conditions for the vector $\beta = (m_1, m_2, \ldots, m_r)$ and the modulus $M$ to be satisfied:
• $\prod_{i}^{r} m_i = M$;
• $m_1 < m_2 < \cdots < m_r$;
• $m_1, m_2, \ldots, m_r$ are prime numbers.

provided the following condition on the determinant of the sle (given by eq. (3)) is satisfied,
\[
|d|_{m_i} \neq 0, \quad i = 1, 2, \ldots, r, \tag{7}
\]
the sle (3) solved within $(\mathbb{Z}_\beta, \oplus, \odot)$ with respect to the vector $\beta$ can be expressed as:
\[
|A x|_\beta = |b|_\beta, \tag{8}
\]
or, for the individual moduli $m_i$ of the vector $\beta$:
\[
|A x|_{m_i} = |b|_{m_i}, \quad i = 1, 2, \ldots, r. \tag{9}
\]
the following expression is also valid for eq. (3) within $(\mathbb{Z}_\beta, \oplus, \odot)$:
\[
|A A^{-1}|_\beta = |A^{-1} A|_\beta = E \tag{10}
\]
and
\[
|x|_\beta = |A^{-1} b|_\beta, \tag{11}
\]
where $E$ is the identity matrix.

to solve the sle of eq. (9) within the specific modular arithmetics we use the gauss-jordan elimination algorithm with non-zero pivoting. the difference from the original gauss-jordan elimination is the usage of modular arithmetics in all algorithm steps. there is also a pivoting simplification – we do not need to find the greatest element in the elimination step, a non-zero element is good enough. for the gj elimination algorithm, let us have a matrix $W$ of dimensions $n \times (n+1)$ consisting of the matrix $A$ and the vector $b$:
\[
W = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} & b_1 \\
a_{21} & a_{22} & \cdots & a_{2n} & b_2 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn} & b_n
\end{pmatrix}. \tag{12}
\]
the goal of the algorithm is to eliminate all $W$ elements one by one to get the resulting vector $x$ of eq. (11) after $\approx n^3$ elimination steps.

3.3. inverse transformation

after completing the two previous steps, we have a set of vectors $x$, one within each module from $\beta = (m_1, m_2, \ldots, m_r)$. they represent a sub-solution for the specific $(\mathbb{Z}_{m_i}, +, \cdot)$ arithmetic. now we advance to the inverse transformation back from $(\mathbb{Z}_\beta, \oplus, \odot)$. the necessary condition for each $\beta$-module solution $x$ is a non-zero determinant, eq. (7). the algorithm used for this transformation is the chinese remainder theorem. the product of our last step consists of the vector $x$ elements in the form of fractions that contain the solution of sle (3).
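to make these two steps concrete, the following python sketch (ours – an illustration of the scheme, not the paper's opencl implementation) performs gauss-jordan elimination entirely within each $(\mathbb{Z}_{m_i}, +, \cdot)$ and then assembles the sub-solutions with the chinese remainder theorem. to obtain the fractional result we combine the residues of $d \cdot x$ (an integer vector, by cramer's rule) and of $d$ itself – one standard way of realizing the "fractions" mentioned above. the moduli and the test system are arbitrary toy values.

```python
from fractions import Fraction
from functools import reduce

def gauss_jordan_mod(A, b, m):
    """solve A x = b (mod m) by gauss-jordan elimination with non-zero pivoting;
    returns (det mod m, x mod m), or None if det == 0 (mod m), cf. eq. (7)."""
    n = len(A)
    W = [[a % m for a in row] + [bi % m] for row, bi in zip(A, b)]
    det = 1
    for col in range(n):
        piv = next((r for r in range(col, n) if W[r][col] != 0), None)
        if piv is None:
            return None                      # this modulus must be discarded
        if piv != col:
            W[col], W[piv] = W[piv], W[col]
            det = (-det) % m
        det = det * W[col][col] % m
        inv = pow(W[col][col], -1, m)        # modular inverse of the pivot
        W[col] = [w * inv % m for w in W[col]]
        for r in range(n):
            if r != col and W[r][col]:
                f = W[r][col]
                W[r] = [(wr - f * wc) % m for wr, wc in zip(W[r], W[col])]
    return det, [W[r][n] for r in range(n)]

def crt(residues, moduli):
    """combine residues into one residue modulo prod(moduli), then center it."""
    M = reduce(lambda x, y: x * y, moduli)
    x = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        x = (x + r * Mi * pow(Mi, -1, m)) % M
    return (x - M if x > M // 2 else x), M

# toy example within beta = (10007, 10009, 10037)
A = [[3, 1, 2], [2, 6, 1], [1, 1, 7]]
b = [13, 20, 26]
beta = [10007, 10009, 10037]
dets, nums = [], []
for m in beta:
    det, x = gauss_jordan_mod(A, b, m)
    dets.append(det)
    nums.append([det * xi % m for xi in x])  # det * x is an integer vector (cramer)
d, M = crt(dets, beta)
solution = [Fraction(crt([nums[i][k] for i in range(3)], beta)[0], d) for k in range(3)]
print(solution)   # exact rational solution of A x = b: [49/34, 79/34, 54/17]
```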
4. gpu and optimizations

gpgpu is the area of our research, so we optimize on pc graphics hardware. there were several platforms to choose from:
• nvidia cuda, the mainstream platform today, was rejected: it is proprietary, and its usage is limited to nvidia hardware;
• microsoft directx 11 directcompute was rejected: it is bound to the microsoft windows platform only (not used in hpc in general);
• opencl is an open standard that works well across platforms (gnu/linux, microsoft windows and apple mac) and on all the latest graphics hardware (nvidia, amd and the latest intel gpus); we use opencl for the implementation.

figure 1. measured computation shares (matrix scaling, beta vector computation, sles solution, and inverse transformation, as percentages of the run time for matrix sizes 16 to 1024).

4.1. profiling

the whole sle-solving process is a rather time-consuming task (fig. 1), so optimizations had to be made. first, all the big-number alu operations – eq. (5) and section 3.3 – are performed using the gmp library, which is tuned for this platform. solving dozens of systems of linear equations using modular arithmetics (section 3.2) within the β vector is the most computation-intensive part of our task, see fig. 1. according to amdahl's law (expressed in eq. (13)), the benefit (overall speedup) from optimizing a part of the computation is equal to:
\[
\text{speedup} = \frac{1}{(1-p) + \frac{p}{s}}, \tag{13}
\]
where $p$ is the proportion of the computation we are optimizing and $s$ is its speedup. because the modular-arithmetics sles take a great amount of time to solve, we optimize the matrix elimination.

4.2. gpu architecture

all modern common gpus have a similar architecture (from our optimization-type point of view). they consist of a global memory (1–4 gb) and several (2–24) smp¹ processors (example in fig. 2). an smp features a scheduler of lightweight threads that have zero overhead. each smp contains dozens (8–48) of processor cores that share the smp memory (shared memory, ≈ 64 kb) and execute the same code. if it happens that one processor core branches differently, the performance goes down dramatically. each processor core also has its own integer and floating-point unit. more details with examples and a description of the cuda platform are given in [5].

¹ smp – symmetric multiprocessor

figure 2. nvidia fermi smp [5].

the processor cores have access to the gpu global ram, but the memory access has to be aligned, otherwise performance will be affected. we used the smp processor cores to load a part of the matrix (the row that contains the current pivot), used it for the computation, and then stored the elimination-step results back to global memory.

4.3. gpu kernels, optimizations

the opencl framework defines the term kernel. it is a function written in a slightly modified c99 language which is to be executed on an opencl-capable device. the opencl runtime executes the same kernel on every processor core within an smp in parallel. in the example of the saxpy function, we are able to process (8–48) vector elements at once. however, our case is not so simple. our system is currently limited to a matrix size of 4096×4096. the row with the current pivot is used in n multiply-add-modulo vector operations. hence, it is quite useful performance-wise to cache it in the smp shared memory. current-generation nvidia gpus have a shared memory size of 48 kb. one matrix row fits the local memory easily (4096 · 8 = 32768 b). the processor cores fetch a matrix row from global memory and compute the inverse of the pivot element. then the processor cores multiply the cached row and multiply-add-modulo the other rows in global memory. more details, including the processor core synchronizations and source code samples, are available online².

² http://www.github.com/kubbing/clock-math

5. results

after embedding the architecture-related optimizations into our clock math solver and verifying the correctness of our results, we were finally able to benchmark it. there are not many up-to-date solvers like ours currently.
we benchmarked primarily against linsolve-0.7.15 [7], a highly optimized solver with performance-critical parts written in x86 assembler. the results (tab. 1) are very satisfying: the clock math solver outperforms linsolve-0.7.15 by almost a factor of two for larger matrices.

  matrix size [n]   clock math [s]   linsolve-0.7.15 [s]   speedup [%]
  16                0.006            0.006425              16.82
  32                0.006            0.014921              42.10
  64                0.029            0.046871              61.97
  128               0.117            0.195461              67.06
  256               1.043            1.401498              34.40
  512               11.252           16.215546             44.11
  1024              124.375          212.557002            70.90

table 1. measured results: comparison of execution times for the clock math solver and for linsolve-0.7.15, and the corresponding speedups.

6. conclusion

we have presented a working system for solving ill-conditioned systems of linear equations exactly. we have managed to effectively bypass floating-point rounding errors by using modular arithmetics. most of the entire computation takes place in solving sles within their corresponding modules. this part of the computation can be (and is) solved in parallel on a common-grade gpu. the gpu is capable of solving several sles at once (8–16, depending on its memory size) and also solves each system in parallel on its smp (8–40 cores). double parallelism has been utilized, so a significant speedup is achieved compared to other implementations. we are currently working on advanced kernel optimizations with larger matrix support.

7. future work

as we now have a working system, we would like to proceed to:
• add support for both single and double precision for the matrix elimination, as the β vector module count and gpu performance may differ;
• add support for matrices larger than 4096 × 4096 – optimize smp shared memory usage;
• add support for automatic kernel group-size tuning for larger matrices, as the group size lowers the memory access time on different gpus/architectures;
• examine the performance of the modulus operation across different gpu architectures and further optimize the saxpy, respectively daxpy, functions (mod m);
• utilize the openmpi library to add cluster support;
• adjust and run the solver on our university star cluster to test amd's opencl cpu implementation.

acknowledgements

this research has been supported by cesnet development fund project 390/2010 and by the grant sgs12/097/ohk3/1t/18.

references
[1] lórencz, r.: aplikovaná numerická matematika a kryptografie. vydavatelství čvut, 2004.
[2] gregory, r.t.: error-free computation: why it is needed and methods for doing it. r. e. krieger pub co, 1980.
[3] lórencz, r., morháč, m.: a modular system for solving linear equations exactly. computers and artificial intelligence, vol. 12, 1992.
[4] zahradnický, t.: mosfet parameter extraction optimization. ph.d. thesis, department of computer systems, faculty of information technology, czech technical university in prague, 2010.
[5] kirk, d.b., hwu, w.w.: programming massively parallel processors, a hands-on approach. morgan kaufmann publishers, 2010.
[6] ieee standard for floating-point arithmetic. ieee std 754-2008, pages 1–58, 2008.
[7] vondra, l., lórencz, r.: system for solving linear equation systems. seminar on numerical analysis, pages 171–174, technical university liberec, 2012.

acta polytechnica vol. 50 no. 3/2010

lorentz and su(3) groups derived from cubic quark algebra
r. kerner

dedicated to jiří niederle on the occasion of his 70th birthday

abstract

we show how the lorentz and su(3) groups can be derived from the covariance principle conserving a z₃-graded three-form on a z₃-graded cubic algebra representing quarks endowed with non-standard commutation laws. this construction suggests that the geometry of space-time can be considered as a manifestation of symmetries of fundamental matter fields.

1.

many fundamental properties of matter at the quantum level can be announced without mentioning the space-time realm. the pauli exclusion principle, the symmetry between particles and anti-particles, and electric charge and baryonic number conservation belong to this category. quantum mechanics itself can be formulated without any mention of space, as was shown by m. born, p. jordan and w. heisenberg [1] in their version of matrix mechanics, or in j. von neumann's [2] formulation of quantum theory in terms of c∗-algebras. non-commutative geometry [4] gives another example of interpreting space-time relationships in pure algebraic terms.

einstein's dream was to be able to derive the properties of matter, and perhaps its very existence, from the singularities of fields defined on space-time, and if possible, from the geometry and topology of space-time itself. a follower of maxwell and faraday, he believed in the primary role of fields and tried to derive the equations of motion as the characteristic behavior of field singularities, or singularities of the space-time (see [3]).

one can defend an alternative point of view, supposing that the existence of matter is primary with respect to that of space-time. in this light, the idea of deriving the geometric properties of space-time, and perhaps its very existence, from fundamental symmetries and interactions proper to matter's most fundamental building blocks seems quite natural. if space-time is to be derived from the interactions of the fundamental constituents of matter, then it seems reasonable to choose the strongest interactions available, which are the interactions between quarks. the difficulty resides in the fact that we should define these "quarks" (or their states) without any mention of space-time. the minimal requirements for the definition of quarks at the initial stage of model building are the following:

i) the mathematical entities representing quarks should form a linear space over complex numbers, so that we could produce their linear combinations with complex coefficients;

ii) they should also form an associative algebra, so that their multilinear combinations may be formed;

iii) there should exist two isomorphic algebras of this type, corresponding to quarks and anti-quarks, and a conjugation transformation that maps one of these algebras onto the other, $\mathcal{A} \to \bar{\mathcal{A}}$;

iv) the three-quark (or three-anti-quark) and the quark-anti-quark combinations should be distinguished in a certain way; for example, they should form a subalgebra in the algebra spanned by the generators.

with this in mind we can start to explore the algebraic properties of quarks that would lead to more general symmetries, those of space and time, appearing as a consequence of covariance requirements imposed on the discrete relations between the generators of the quark algebra.

2.

at present, the most successful theoretical descriptions of fundamental interactions are based on the quark model, despite the fact that isolated quarks cannot be observed.
the only experimentally accessible states are either three-quark or three-anti-quark combinations (fermions) or quark-anti-quark states (bosons). whenever one has to do with a tri-linear combination of fields (or operators), one must investigate the behavior of such states under permutations. let us introduce $N$ generators spanning a linear space over complex numbers, satisfying the following relations, which are a cubic generalization of anti-commutation in the usual (binary) case (see e.g. [5, 6]):
\[
\theta^a \theta^b \theta^c = j\, \theta^b \theta^c \theta^a = j^2\, \theta^c \theta^a \theta^b, \tag{1}
\]
with $j = e^{2\pi i/3}$, the primitive cubic root of 1. we have $\bar{j} = j^2$ and $1 + j + j^2 = 0$. we shall also introduce a similar set of conjugate generators, $\bar{\theta}^{\dot{a}}$, $\dot{a}, \dot{b}, \ldots = 1, 2, \ldots, N$, satisfying a similar condition with $j^2$ replacing $j$:
\[
\bar{\theta}^{\dot{a}} \bar{\theta}^{\dot{b}} \bar{\theta}^{\dot{c}} = j^2\, \bar{\theta}^{\dot{b}} \bar{\theta}^{\dot{c}} \bar{\theta}^{\dot{a}} = j\, \bar{\theta}^{\dot{c}} \bar{\theta}^{\dot{a}} \bar{\theta}^{\dot{b}}. \tag{2}
\]
let us denote this algebra by $\mathcal{A}$. we shall endow this algebra with a natural z₃ grading, considering the generators $\theta^a$ as grade 1 elements, and their conjugates $\bar{\theta}^{\dot{a}}$ as being of grade 2. the grades add up modulo 3, so that the products $\theta^a \theta^b$ span a linear subspace of grade 2, and the cubic products $\theta^a \theta^b \theta^c$ are of grade 0. similarly, all quadratic expressions in the conjugate generators, $\bar{\theta}^{\dot{a}} \bar{\theta}^{\dot{b}}$, are of grade $2+2 = 4 \bmod 3 = 1$, whereas their cubic products are again of grade 0, like the cubic products of the $\theta^a$'s.

combined with associativity, these cubic relations impose a finite dimension on the algebra generated by the z₃-graded generators. as a matter of fact, cubic expressions are of the highest order that does not vanish identically. the proof is immediate:
\[
\theta^a \theta^b \theta^c \theta^d = j\, \theta^b \theta^c \theta^a \theta^d = j^2\, \theta^b \theta^a \theta^d \theta^c = j^3\, \theta^a \theta^d \theta^b \theta^c = j^4\, \theta^a \theta^b \theta^c \theta^d,
\]
and because $j^4 = j \neq 1$, the only solution is
\[
\theta^a \theta^b \theta^c \theta^d = 0. \tag{3}
\]
therefore the total dimension of the algebra defined via the cubic relations (1) is equal to $N + N^2 + (N^3 - N)/3$: the $N$ generators of grade 1, the $N^2$ independent products of two generators, and $(N^3 - N)/3$ independent cubic expressions, because the cube of any generator must be zero, and the remaining $N^3 - N$ ternary products are divided by 3, by virtue of the constitutive relations (1).

the conjugate generators $\bar{\theta}^{\dot{b}}$ span an algebra $\bar{\mathcal{A}}$ isomorphic with $\mathcal{A}$. both algebras split quite naturally into sums of linear subspaces with definite grades:
\[
\mathcal{A} = \mathcal{A}_0 \oplus \mathcal{A}_1 \oplus \mathcal{A}_2, \qquad \bar{\mathcal{A}} = \bar{\mathcal{A}}_0 \oplus \bar{\mathcal{A}}_1 \oplus \bar{\mathcal{A}}_2.
\]
the subspaces $\mathcal{A}_0$ and $\bar{\mathcal{A}}_0$ form zero-graded subalgebras. these algebras can be made unital if we add to each of them the unit element 1, acting as identity and considered as being of grade 0.

if we want the products between the generators $\theta^a$ and their conjugates $\bar{\theta}^{\dot{b}}$ to be included into the greater algebra spanned by both types of generators, we should consider all possible products, which will be included in the linear subspaces with definite grades of the resulting algebra $\mathcal{A} \otimes \bar{\mathcal{A}}$. in order to decide which expressions are linearly dependent, and what is the overall dimension of the enlarged algebra generated by the $\theta^a$'s and their conjugate variables $\bar{\theta}^{\dot{d}}$'s, we must impose some binary commutation relations on their products. the fact that the conjugate generators are of grade 2 may suggest that they behave like products of two ordinary generators, $\theta^a \theta^b$. such a choice was often made (see, e.g., [5, 9, 6]). however, this does not enable one to make a distinction between the conjugate generators and the products of two ordinary generators, and it would be better to be able to make the difference.
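the dimension count above is easy to confirm by brute force. the following short python sketch (ours, purely illustrative) counts the independent ternary monomials by grouping index triples into cyclic orbits: constant triples are killed by (1), and each remaining orbit of size 3 contributes one independent element.

```python
from itertools import product

def cubic_algebra_dimension(N):
    """dimension N + N^2 + (N^3 - N)/3 of the algebra with the cubic relations (1)."""
    orbits = set()
    for a, b, c in product(range(N), repeat=3):
        if a == b == c:
            continue                      # theta^a theta^a theta^a = j * itself, hence zero
        # canonical representative of the cyclic orbit {abc, bca, cab}
        orbits.add(min((a, b, c), (b, c, a), (c, a, b)))
    return N + N * N + len(orbits)

assert cubic_algebra_dimension(2) == 2 + 4 + 2      # matches the two-generator case below
assert all(cubic_algebra_dimension(N) == N + N**2 + (N**3 - N) // 3 for N in range(1, 8))
```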
due to the binary nature of the "mixed" products, another choice is possible, namely, to impose the following relations:
\[
\theta^a \bar{\theta}^{\dot{b}} = -j\, \bar{\theta}^{\dot{b}} \theta^a, \qquad \bar{\theta}^{\dot{b}} \theta^a = -j^2\, \theta^a \bar{\theta}^{\dot{b}}. \tag{4}
\]
in what follows, we shall deal with the two simplest realizations of such algebras, spanned by two or three generators. consider the case when $a, b, \ldots = 1, 2$. the algebra $\mathcal{A}$ contains numbers, two generators of grade 1, $\theta^1$ and $\theta^2$, their four independent products (of grade 2), and two independent cubic expressions, $\theta^1\theta^2\theta^1$ and $\theta^2\theta^1\theta^2$. similar expressions can be produced with the conjugate generators $\bar{\theta}^{\dot{c}}$; finally, mixed expressions appear, like the four independent grade 0 terms $\theta^1\bar{\theta}^{\dot{1}}$, $\theta^1\bar{\theta}^{\dot{2}}$, $\theta^2\bar{\theta}^{\dot{1}}$ and $\theta^2\bar{\theta}^{\dot{2}}$.

3.

let us consider multilinear forms defined on the algebra $\mathcal{A} \otimes \bar{\mathcal{A}}$. because only cubic relations are imposed on products in $\mathcal{A}$ and in $\bar{\mathcal{A}}$, and the binary relations on the products of ordinary and conjugate elements, we shall fix our attention on tri-linear and bi-linear forms, conceived as mappings of $\mathcal{A} \otimes \bar{\mathcal{A}}$ into certain linear spaces over complex numbers.

let us consider a tri-linear form $\rho^\alpha_{abc}$. obviously, as
\[
\rho^\alpha_{abc}\, \theta^a\theta^b\theta^c = \rho^\alpha_{bca}\, \theta^b\theta^c\theta^a = \rho^\alpha_{cab}\, \theta^c\theta^a\theta^b,
\]
by virtue of the commutation relations (1) it follows that we must have
\[
\rho^\alpha_{abc} = j^2 \rho^\alpha_{bca} = j\, \rho^\alpha_{cab}. \tag{5}
\]
even in this minimal and discrete case, there are covariant and contravariant indices: the lower and the upper indices display inverse transformation properties. if a given cyclic permutation is represented by a multiplication by $j$ for the upper indices, the same permutation performed on the lower indices is represented by multiplication by the inverse, i.e. $j^2$, so that they compensate each other.

similar reasoning leads to the definition of the conjugate forms $\bar{\rho}^{\dot\alpha}_{\dot{c}\dot{b}\dot{a}}$, satisfying the relations (5) with $j$ replaced by $j^2$:
\[
\bar{\rho}^{\dot\alpha}_{\dot{a}\dot{b}\dot{c}} = j\, \bar{\rho}^{\dot\alpha}_{\dot{b}\dot{c}\dot{a}} = j^2\, \bar{\rho}^{\dot\alpha}_{\dot{c}\dot{a}\dot{b}}. \tag{6}
\]
in the case of two generators, there are only two independent sets of indices. therefore the upper indices $\alpha, \dot\beta$ take on the values 1 or 2. we choose the following notation:
\[
\rho^1_{121} = j\, \rho^1_{112} = j^2 \rho^1_{211}; \qquad \rho^2_{212} = j\, \rho^2_{221} = j^2 \rho^2_{122}, \tag{7}
\]
all other components identically vanishing. the conjugate matrices $\bar{\rho}^{\dot\alpha}_{\dot{b}\dot{c}\dot{a}}$ are defined by the same formulae, with $j$ replaced by $j^2$ and vice versa.

the constitutive cubic relations between the generators of the z₃-graded algebra can be considered as intrinsic if they are conserved after linear transformations with commuting (pure number) coefficients, i.e. if they are independent of the choice of the basis. let $U^{a'}_a$ denote a non-singular $N \times N$ matrix, transforming the generators $\theta^a$ into another set of generators, $\theta^{b'} = U^{b'}_b \theta^b$. the primed indices run over the same range of values, i.e. from 1 to 2; the prime is there just to make clear that we are referring to a new basis. we are looking for the solution of the covariance condition for the ρ-matrices:
\[
\Lambda^{\alpha'}_\beta\, \rho^\beta_{abc} = U^{a'}_a U^{b'}_b U^{c'}_c\, \rho^{\alpha'}_{a'b'c'}. \tag{8}
\]
let us write down the explicit expression, with fixed indices $(abc)$ on the left-hand side. let us choose one of the two available combinations of indices, $(abc) = (121)$; then the upper index of the ρ-matrix is also fixed and equal to 1:
\[
\Lambda^{\alpha'}_1\, \rho^1_{121} = U^{a'}_1 U^{b'}_2 U^{c'}_1\, \rho^{\alpha'}_{a'b'c'}. \tag{9}
\]
now, $\rho^1_{121} = 1$, and we have two equations corresponding to the choice of values of the index $\alpha'$ equal to 1 or 2.
for $\alpha' = 1'$, the ρ-matrix on the right-hand side is $\rho^{1'}_{a'b'c'}$, which has only three components,
\[
\rho^{1'}_{1'2'1'} = 1, \qquad \rho^{1'}_{2'1'1'} = j^2, \qquad \rho^{1'}_{1'1'2'} = j,
\]
which leads to the following equation:
\[
\Lambda^{1'}_1 = U^{1'}_1 U^{2'}_2 U^{1'}_1 + j^2\, U^{2'}_1 U^{1'}_2 U^{1'}_1 + j\, U^{1'}_1 U^{1'}_2 U^{2'}_1
= U^{1'}_1 \bigl( U^{1'}_1 U^{2'}_2 - U^{2'}_1 U^{1'}_2 \bigr) = U^{1'}_1 [\det(U)], \tag{10}
\]
because $j^2 + j = -1$. for the alternative choice $\alpha' = 2'$, the ρ-matrix on the right-hand side is $\rho^{2'}_{a'b'c'}$, whose three non-vanishing components are
\[
\rho^{2'}_{2'1'2'} = 1, \qquad \rho^{2'}_{1'2'2'} = j^2, \qquad \rho^{2'}_{2'2'1'} = j.
\]
the corresponding equation gives:
\[
\Lambda^{2'}_1 = -U^{2'}_1 [\det(U)]. \tag{11}
\]
the remaining two equations are obtained in a similar manner, resulting in the following:
\[
\Lambda^{1'}_2 = -U^{1'}_2 [\det(U)], \qquad \Lambda^{2'}_2 = U^{2'}_2 [\det(U)]. \tag{12}
\]
the determinant of the $2\times 2$ complex matrix $U^{a'}_b$ appears everywhere on the right-hand side. taking the determinant of the matrix $\Lambda^{\alpha'}_\beta$, one gets immediately
\[
\det(\Lambda) = [\det(U)]^3. \tag{13}
\]
taking into account that the inverse transformation should exist and have the same properties, we arrive at the conclusion that $\det \Lambda = 1$:
\[
\det(\Lambda^{\alpha'}_\beta) = \Lambda^{1'}_1 \Lambda^{2'}_2 - \Lambda^{2'}_1 \Lambda^{1'}_2 = 1, \tag{14}
\]
which defines the sl(2, c) group, the covering group of the lorentz group. however, the u-matrices on the right-hand side are defined only up to a phase: due to the cubic character of the relations (10–12), this phase can take on three different values, 1, $j$ or $j^2$, i.e. the matrices $j U^{a'}_b$ or $j^2 U^{a'}_b$ satisfy the same relations as the matrices $U^{a'}_b$ defined above. the determinant of $U$ can take on the values 1, $j$ or $j^2$, while $\det(\Lambda) = 1$.

let us then choose the matrices $\Lambda^{\alpha'}_\beta$ to be the usual spinor representation of the sl(2, c) group, while the matrices $U^{a'}_b$ will be defined as follows:
\[
U^{1'}_1 = j\Lambda^{1'}_1, \quad U^{1'}_2 = -j\Lambda^{1'}_2, \quad U^{2'}_1 = -j\Lambda^{2'}_1, \quad U^{2'}_2 = j\Lambda^{2'}_2, \tag{15}
\]
the determinant of $U$ being equal to $j^2$. obviously, the same reasoning leads to the conjugate cubic representation of sl(2, c) if we require the covariance of the conjugate tensor
\[
\bar{\rho}^{\dot\beta}_{\dot{d}\dot{e}\dot{f}} = j\, \bar{\rho}^{\dot\beta}_{\dot{e}\dot{f}\dot{d}} = j^2\, \bar{\rho}^{\dot\beta}_{\dot{f}\dot{d}\dot{e}},
\]
by imposing an equation similar to (8):
\[
\Lambda^{\dot\alpha'}_{\dot\beta}\, \bar\rho^{\dot\beta}_{\dot{a}\dot{b}\dot{c}} = \bar\rho^{\dot\alpha'}_{\dot{a}'\dot{b}'\dot{c}'}\, \bar{U}^{\dot{a}'}_{\dot{a}} \bar{U}^{\dot{b}'}_{\dot{b}} \bar{U}^{\dot{c}'}_{\dot{c}}. \tag{16}
\]
the matrix $\bar{U}$ is the complex conjugate of the matrix $U$, and $\det(\bar{U})$ is equal to $j$. moreover, the two-component entities obtained as images of cubic combinations of quarks, $\psi^\alpha = \rho^\alpha_{abc}\, \theta^a\theta^b\theta^c$ and $\bar\psi^{\dot\beta} = \bar\rho^{\dot\beta}_{\dot{d}\dot{e}\dot{f}}\, \bar\theta^{\dot{d}}\bar\theta^{\dot{e}}\bar\theta^{\dot{f}}$, should anticommute, because their arguments do so, by virtue of (4):
\[
(\theta^a\theta^b\theta^c)(\bar\theta^{\dot{d}}\bar\theta^{\dot{e}}\bar\theta^{\dot{f}}) = -(\bar\theta^{\dot{d}}\bar\theta^{\dot{e}}\bar\theta^{\dot{f}})(\theta^a\theta^b\theta^c).
\]
we have found the way to derive the covering group of the lorentz group acting on spinors via the usual spinorial representation. the spinors are obtained as the homomorphic image of a tri-linear combination of three quarks (or anti-quarks). the quarks transform with the matrices $U$ (or $\bar{U}$ for the anti-quarks), but these matrices are not unitary: their determinants are equal to $j^2$ or $j$, respectively. so, quarks cannot be put on the same footing as classical spinors; they transform under a z₃-covering of the sl(2, c) group.
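the covariance condition (8) and the determinant relations (13)–(14) lend themselves to a direct numerical check. below is a small numpy sketch (ours, for illustration only): it builds the ρ-tensors with components fixed by (5) and (7) (the assignment of $j$ versus $j^2$ within each triple does not affect the result, since only $j + j^2 = -1$ enters), applies a random 2×2 matrix $U$ normalized to $\det U = j^2$, and reads off a $\Lambda$ with unit determinant. the seed is arbitrary.

```python
import numpy as np

j = np.exp(2j * np.pi / 3)          # primitive cubic root of unity

# the two rho-tensors, components consistent with eqs. (5) and (7)
rho = np.zeros((2, 2, 2, 2), dtype=complex)
rho[0, 0, 1, 0] = 1;  rho[0, 1, 0, 0] = j;  rho[0, 0, 0, 1] = j**2   # rho^1_{121}, _{211}, _{112}
rho[1, 1, 0, 1] = 1;  rho[1, 0, 1, 1] = j;  rho[1, 1, 1, 0] = j**2   # rho^2_{212}, _{122}, _{221}

rng = np.random.default_rng(0)
U = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
U *= (j**2 / np.linalg.det(U)) ** 0.5          # normalize so that det(U) = j^2

# right-hand side of eq. (8): T^alpha'_{abc} = rho^alpha'_{a'b'c'} U^{a'}_a U^{b'}_b U^{c'}_c
T = np.einsum('xABC,Aa,Bb,Cc->xabc', rho, U, U, U)

# read off Lambda from the two independent index combinations (121) and (212)
Lam = np.stack([T[:, 0, 1, 0], T[:, 1, 0, 1]], axis=1)

assert np.allclose(np.einsum('xy,yabc->xabc', Lam, rho), T)       # eq. (8) holds
assert np.isclose(np.linalg.det(Lam), np.linalg.det(U) ** 3)      # eq. (13)
assert np.isclose(np.linalg.det(Lam), 1.0)                        # eq. (14): det = 1
```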
a similar covariance requirement can be formulated with respect to the set of 2-forms mapping the quadratic quark-anti-quark combinations into a four-dimensional linear real space. as we saw already, the symmetry (4) imposed on these expressions reduces their number to four. let us define two quadratic forms, $\pi^\mu_{a\dot{b}}$ and its conjugate $\bar\pi^\mu_{\dot{b}a}$, with the following symmetry requirement:
\[
\pi^\mu_{a\dot{b}}\, \theta^a \bar\theta^{\dot{b}} = \bar\pi^\mu_{\dot{b}a}\, \bar\theta^{\dot{b}} \theta^a. \tag{17}
\]
the greek indices $\mu, \nu, \ldots$ take on four values, and we shall label them 0, 1, 2, 3. it follows immediately from (4) that
\[
\pi^\mu_{a\dot{b}} = -j^2\, \bar\pi^\mu_{\dot{b}a}. \tag{18}
\]
such matrices are non-hermitian, and they can be realized by the following substitution:
\[
\pi^\mu_{a\dot{b}} = j^2\, i\, \sigma^\mu_{a\dot{b}}, \qquad \bar\pi^\mu_{\dot{b}a} = -j\, i\, \sigma^\mu_{\dot{b}a}, \tag{19}
\]
where $\sigma^\mu_{a\dot{b}}$ is the unit 2×2 matrix for $\mu = 0$, and the three hermitian pauli matrices for $\mu = 1, 2, 3$. again, we want to get the same form of these four matrices in another basis. knowing that the lower indices $a$ and $\dot{b}$ undergo the transformation with the matrices $U^{a'}_b$ and $\bar{U}^{\dot{a}'}_{\dot{b}}$, we demand that there exist some 4×4 matrices $\Lambda^{\mu'}_\nu$ representing the transformation of the lower indices by the matrices $U$ and $\bar{U}$:
\[
\Lambda^{\mu'}_\nu\, \pi^\nu_{a\dot{b}} = U^{a'}_a \bar{U}^{\dot{b}'}_{\dot{b}}\, \pi^{\mu'}_{a'\dot{b}'}, \tag{20}
\]
and this defines the vector (4×4) representation of the lorentz group. introducing the invariant "spinorial metric" in two complex dimensions, $\varepsilon^{ab}$ and $\varepsilon^{\dot{a}\dot{b}}$, such that $\varepsilon^{12} = -\varepsilon^{21} = 1$ and $\varepsilon^{\dot{1}\dot{2}} = -\varepsilon^{\dot{2}\dot{1}}$, we can define the contravariant components $\pi^{\nu\, a\dot{b}}$. it is easy to show that the minkowskian space-time metric, invariant under the lorentz transformations, can be defined as
\[
g^{\mu\nu} = \frac{1}{2} \bigl[ \pi^\mu_{a\dot{b}}\, \pi^{\nu\, a\dot{b}} \bigr] = \mathrm{diag}(+, -, -, -). \tag{21}
\]
together with the anti-commuting spinors $\psi^\alpha$, the four real coefficients defining a lorentz vector, $x^\mu = \pi^\mu_{a\dot{b}}\, \theta^a \bar\theta^{\dot{b}}$, can now generate the supersymmetry via the standard definitions of super-derivations.

4.

consider now three generators, $q^a$, $a = 1, 2, 3$, and their conjugates $\bar{q}^{\dot{b}}$, satisfying cubic commutation relations similar to those of the two-dimensional case:
\[
q^a q^b q^c = j\, q^b q^c q^a = j^2\, q^c q^a q^b, \qquad
\bar{q}^{\dot{a}} \bar{q}^{\dot{b}} \bar{q}^{\dot{c}} = j^2\, \bar{q}^{\dot{b}} \bar{q}^{\dot{c}} \bar{q}^{\dot{a}} = j\, \bar{q}^{\dot{c}} \bar{q}^{\dot{a}} \bar{q}^{\dot{b}}, \qquad
q^a \bar{q}^{\dot{b}} = -j\, \bar{q}^{\dot{b}} q^a.
\]
with the indices $a, b, c, \ldots$ ranging from 1 to 3, we get eight linearly independent combinations of three undotted indices, and the same number of combinations of dotted ones. they can be arranged as follows:
\[
q^3q^2q^3, \quad q^2q^3q^2, \quad q^1q^3q^1, \quad q^3q^1q^3, \quad q^1q^2q^1, \quad q^2q^1q^2, \quad q^1q^2q^3, \quad q^3q^2q^1,
\]
while the quadratic expressions of grade 0, $q^a \bar{q}^{\dot{b}}$, span a 9-dimensional subspace in the finite algebra generated by the $q^a$'s. the invariant 3-form mapping these combinations onto some eight-dimensional space must also have eight independent components (over real numbers). the three-dimensional "cubic matrices" are then as follows:
\[
K^{3+}_{121} = 1, \; K^{3+}_{112} = j^2, \; K^{3+}_{211} = j; \qquad K^{3-}_{212} = 1, \; K^{3-}_{221} = j^2, \; K^{3-}_{122} = j;
\]
\[
K^{2+}_{313} = 1, \; K^{2+}_{331} = j^2, \; K^{2+}_{133} = j; \qquad K^{2-}_{131} = 1, \; K^{2-}_{113} = j^2, \; K^{2-}_{311} = j;
\]
\[
K^{1+}_{232} = 1, \; K^{1+}_{223} = j^2, \; K^{1+}_{322} = j; \qquad K^{1-}_{323} = 1, \; K^{1-}_{332} = j^2, \; K^{1-}_{233} = j;
\]
\[
K^{7}_{123} = 1, \; K^{7}_{231} = j^2, \; K^{7}_{312} = j; \qquad K^{8}_{132} = 1, \; K^{8}_{321} = j^2, \; K^{8}_{213} = j,
\]
all other components being identically zero.

let the capital greek indices $\Phi, \Omega$ take on the values from 1 to 8. as in the case of the $\rho^\alpha$ matrices, we define the conjugate matrices $\bar{K}^{\dot\Omega}$ by replacing $j$ by $j^2$, and vice versa, in the matrices $K^\Omega$. the ternary multiplication of the eight cubic matrices $K$, with the same definition as for the ρ-matrices, reads
\[
\{K^\Gamma, K^\Pi, K^\Lambda\}_{abc} = \sum_{d,e,f=1}^{3} K^\Gamma_{dae} K^\Pi_{ebf} K^\Lambda_{fcd}. \tag{22}
\]
the z₃-graded ternary commutator can be defined as follows:
\[
\{K^\Gamma, K^\Pi, K^\Lambda\}_{Z_3} = \{K^\Gamma, K^\Pi, K^\Lambda\} + j\, \{K^\Pi, K^\Lambda, K^\Gamma\} + j^2\, \{K^\Lambda, K^\Gamma, K^\Pi\}. \tag{23}
\]
the full ternary multiplication table for these cubic matrices would contain 8 × 8 × 8 = 512 entries, and we cannot print it here for lack of space. nevertheless, there are some interesting properties that can be noticed when one takes a closer look at the structure of the defining table above. there are three distinct groups of two generators, each of them reproducing the structure of the ρ-matrices, only with a different choice of two indices: (1,2), (2,3) and (3,1). they obviously reproduce the multiplication rules of the ρ-matrices.
the last two generators are new in the sense that combinations with three different indices did not exist in the previous two-dimensional case. their z₃-graded ternary commutators vanish, which reproduces the behavior of the two generators of the cartan subalgebra of su(3). there is one drawback here, namely, the multiplication does not close under the z₃-graded ternary commutator: one needs to form real and imaginary combinations of the K and K̄ cubic matrices in order to make the corresponding ternary algebra complete.

the covariance principle applied to the cubic matrices $K^\Phi_{abc}$ under a linear change of the basis from $\theta^a$ to $\theta^{a'} = U^{a'}_b \theta^b$ means that we want to solve the following equations:
\[
S^{\Phi'}_\Omega\, K^\Omega_{def} = K^{\Phi'}_{a'b'c'}\, U^{a'}_d U^{b'}_e U^{c'}_f. \tag{24}
\]
it takes more time to prove, but the result is that the 8×8 matrices $S^{\Phi'}_\Omega$ form the adjoint representation of the su(3) group, whereas the 3×3 matrices $U^{a'}_d$ form the fundamental representation of the same group, up to a phase factor that can take on the values 1, $j$ or $j^2$. the nine independent two-forms $P^i_{a\dot{b}} = -j^2\, \bar{P}^i_{\dot{b}a}$ transform as the $3 \otimes \bar{3}$ representation of su(3). finally, the elements of the tensor product of both types of j-anti-commuting entities, $\theta^a$ and $q^b$, can be formed, giving six quarks, $Q^b_a$, transforming via the z₃ coverings of sl(2, c) and su(3), which looks very much like the three flavors.

5.

we have shown how the requirement of covariance of the z₃-graded cubic generalization of anti-commutation relations leads to the spinor and vector representations of the lorentz group and to the fundamental and adjoint representations of the su(3) group, thus giving the cubic z₃-graded quark algebra the primary role in determining the lorentz and su(3) symmetries. however, these representations coincide with the usual ones only when applied to special combinations of quark variables – cubic (spinor) or quadratic (vector) representations of the lorentz group. while acting on the quark variables, the representations correspond to the z₃-coverings of the groups. in this sense quarks are not like ordinary spinors or fermions, and as such, do not obey the usual dirac equation. if the sigma-matrices are replaced by the non-hermitian matrices $\pi^\mu_{a\dot{b}}$, instead of the usual wave-like solutions of dirac's equation we get exponentials of complex wave vectors, and such solutions cannot propagate. nevertheless, as argued in [9], certain tri-linear and bi-linear combinations of such solutions behave as usual plane waves, with real wave vectors and frequencies, if there is a convenient coupling of the non-propagating solutions in the k-space.

acknowledgement

we are greatly indebted to michel dubois-violette for numerous discussions and enlightening remarks.

references
[1] born, m., jordan, p.: zeitschrift für physik 34, 858–878 (1925); ibid heisenberg, w., 879–890 (1925).
[2] von neumann, j.: mathematical foundations of quantum mechanics, princeton univ. press (1996).
[3] einstein, a., infeld, l.: the evolution of physics, simon and schuster, n.y. (1967).
[4] dubois-violette, m., kerner, r., madore, j.: journ. math. phys. 31, 316–322 (1990); ibid, 31, 323–331 (1990).
[5] kerner, r.: journ. math. phys., 33, 403–411 (1992).
[6] abramov, v., kerner, r., leroy, b.: journ. math. phys., 38, 1650–1669 (1997).
[7] lipatov, l. n., rausch de traubenberg, m., volkov, g. g.: journ. of math. phys. 49, 013502 (2008).
[8] campoamor-stursberg, r., rausch de traubenberg, m.: journ. of math. phys. 49, 063506 (2008).
[9] kerner, r.: class. and quantum gravity, 14, a203–a225 (1997).
richard kerner, laboratoire de physique théorique de la matière condensée, université pierre-et-marie-curie – cnrs umr 7600, tour 22, 4-ème étage, boîte 121, 4, place jussieu, 75005 paris, france

acta polytechnica 53(supplement):595–600, 2013

non-thermal emission from cataclysmic variables: implications on astroparticle physics

vojtěch šimon∗

astronomical institute, academy of sciences of the czech republic, 25165 ondřejov, czech republic; czech technical university in prague, fel, prague, czech republic
∗ corresponding author: simon@asu.cas.cz, vojtech.simon@gmail.com

abstract. we review the lines of evidence that some cataclysmic variables (cvs) are sources of non-thermal radiation. it was really observed in some dwarf novae in outburst, a novalike cv in the high state, an intermediate polar, polars, and classical novae (cne) during outburst. the detection of this radiation suggests the presence of highly energetic particles in these cvs. the conditions for the observability of this emission depend on the state of activity and on the system parameters. we review the processes and conditions that lead to the production of this radiation in various spectral bands, from gamma-rays, including tev emission, to radio. synchrotron and cyclotron emission suggests the presence of strong magnetic fields in cvs. in some cvs, e.g. during some dwarf nova outbursts, the magnetic field generated in the accretion disk leads to synchrotron jets radiating in radio. the propeller effect, or a shock in the case of a magnetized white dwarf (wd), can lead to a strong acceleration of particles that produce gamma-ray emission via π⁰ decay; even cherenkov radiation is possible. in addition, gamma-ray production via π⁰ decay was observed in the ejecta of an outburst of a symbiotic cn. nuclear reactions during the thermonuclear runaway in the outer layer of a wd undergoing a cn outburst lead to the production of radioactive isotopes; their decay is a source of gamma-ray emission. the production of accelerated particles in cvs often has an episodic character with a very small duty cycle; this makes their detection, and establishing the relation of the behavior in various bands, difficult.

keywords: acceleration of particles, astroparticle physics, nuclear reactions, nucleosynthesis, abundances, elementary particles, magnetic fields, radiation mechanisms: non-thermal, circumstellar matter, accretion, accretion disks, novae, cataclysmic variables, x-rays: binaries.

1. introduction

in a cataclysmic variable (cv), matter flows from a companion star, the so-called donor (often a low-mass, main-sequence star), onto the white dwarf (wd) accretor. this mass-transferring binary is a complicated and very active system with various emission regions. the release of gravitational energy during the accretion of matter from the donor onto the wd is the dominant energy source of the cv system. this accretion usually occurs via an accretion disk embedding the wd. however, if the wd has a strong magnetic field, this field largely influences the accretion flow – such systems are called polars. a thermonuclear reaction of the accreted matter on the surface of the wd is another, often episodic, source of energy. a review can be found in [40].

cvs have been shown to radiate in various spectral regions via various emission mechanisms.
the complicated structure of the emission regions and the operation of the individual emission mechanisms largely vary with the state of their activity. cvs are therefore very important laboratories for a study of the physical processes, including a search for the highly energetic particles. we discuss the processes and conditions that lead to the production of non-thermal radiation in cvs. the detection of such a radiation suggests the presence of highly energetic particles in these cvs. how can this emission be detected and monitored? in which spectral bands and with which techniques? we also show the importance of non-thermal radiation of cvs for the physics of these systems. we focus on the cases where such emission was really observed – from gamma-rays (including tev emission) to radio. 2. non-thermal radiation in various types of cvs we review the types of cvs and their states of activity in which non-thermal radiation was observed in various spectral regions. 2.1. cvs with accretion disks and “non-magnetized” wds in some cvs, the accretion disk suffers from a thermalviscous instability if the mass transfer rate ṁ lies between certain limits. it gives rise to the largeamplitude optical outbursts in the so-called dwarf novae (e.g. [17]). the accretion disk is a source of intense thermal emission during the outburst (or during the high state in novalike cvs) (e.g. [40]). nevertheless, we will show several cvs which are also the sources of non-thermal radiation. 595 http://dx.doi.org/10.14311/ap.2013.53.0595 http://ojs.cvut.cz/ojs/index.php/ap vojtěch šimon acta polytechnica the dwarf nova ss cyg/3a 2140+433 is a very good example. its non-thermal radio emission was generated during its outburst [25]. this radio emission does not directly follow the optical one; the radio-emitting medium is therefore not any detached reprocessing medium. it originates from ss cyg itself during the outburst. radio emission of ss cyg is optically thick synchrotron radiation of a transient jet because the size of the radio-emitting medium is much larger than the magnetosphere of the wd. a similar behavior, that is radio emission only during the optical outburst, was observed in the dwarf nova em cyg/1rxs j193840.0+303035. gyrosynchrotron emission of nonthermal electrons is a plausible explanation. the radio source is significantly larger than the binary separation. this is consistent with a transient jet like in ss cyg. this radio emission, related to the conditions in the accretion disk [6, 7], suggests a symbiosis of the thermal emission of the accretion disk and the non-thermal radiation of the jet. a single process (thermal-viscous instability of the disk) is the trigger of both types of emission. in addition, radio emission (frequencies of 5.5 ghz and 9 ghz; atca radio telescope) was also detected from the novalike system v3885 sgr. the asas [33] light curve shows a relation between the optical and radio activity. this radio emission falls into a longlasting high state of the optical emission, with the luminosity comparable to the peak of the outburst in dwarf novae (e.g. ss cyg). the analogies with z-sources and outbursting dwarf novae suggest synchrotron emission of a jet [26]. 2.2. intermediate polars accretion flow of intermediate polar is controlled by the magnetic field of the wd inside the magnetosphere of this accretor. accretion of matter therefore occurs onto the magnetic poles of this wd [16, 40]. 
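The scale on which the WD magnetic field channels the accretion flow can be made concrete with a rough estimate. The following sketch is not from the paper; it evaluates the standard magnetospheric (Alfvén) radius $r_m \approx (\mu^4/(2GM\dot M^2))^{1/7}$ for purely illustrative, assumed parameters of an intermediate polar.

# Rough estimate of the magnetospheric (Alfven) radius of an accreting
# white dwarf, r_m ~ (mu^4 / (2 G M Mdot^2))^(1/7), in cgs units.
# All parameter values below are illustrative assumptions, not values
# taken from the paper.
G = 6.674e-8          # gravitational constant [cm^3 g^-1 s^-2]
Msun = 1.989e33       # solar mass [g]

M = 0.9 * Msun        # assumed WD mass
R_wd = 7e8            # assumed WD radius [cm]
B = 1e6               # assumed surface field of an intermediate polar [G]
Mdot = 1e17           # assumed accretion rate [g/s] (~1.6e-9 Msun/yr)

mu = B * R_wd**3      # magnetic dipole moment [G cm^3]
r_m = (mu**4 / (2.0 * G * M * Mdot**2))**(1.0 / 7.0)

print(f"magnetic moment mu = {mu:.2e} G cm^3")
print(f"magnetospheric radius r_m = {r_m:.2e} cm = {r_m / R_wd:.1f} R_wd")

For these assumed numbers the magnetosphere extends to roughly ten WD radii, which is the region where the text locates the acceleration of particles.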
Non-thermal radiation can be produced by electrons and protons which are accelerated in the transition region between the rotating WD magnetosphere and the accreting matter. Gamma-ray emission from the decay of neutral pions is predicted to be produced in hadronic collisions [5]. V1223 Sgr/1H 1853−312 was observed to display brief brightenings that cannot be explained by a thermal-viscous instability of its accretion disk. Part of a flare observed in the far infrared (IR) (λ of 14–21 µm) by the Spitzer satellite, with the flux declining by a factor of 13 in 30 minutes, suggests a transient synchrotron emission [18]. Another two brightenings were observed in the optical band. One of them, with an amplitude of more than 1 mag in the red continuum, lasted only for several hours. It was accompanied by an increase of the Hα line flux, which lasted longer than the increase in the continuum. This suggests that the line emission also participated in this event [39]. Another flare (a brightening by > 1 mag) was found in blue light on an archival photographic plate (the Bamberg Observatory) [34]. One photographic plate of the field was obtained per night during this monitoring.

Figure 1: (a) Position of the synchrotron flare observed by [18] (arrow) in the optical (V band) light curve of V1223 Sgr (ASAS data [33], one CCD image per night). The smooth line represents the HEC13 fit. (b) Statistical distribution of the brightness from panel 'a'. The baseline level of the synchrotron flare is marked by the vertical line. Notice that the flare occurred from a shallow low state of the optical brightness. See Sect. 2.2 for details.

Extrapolation of the flux of the flare with a very flat synchrotron spectrum, observed by [18], from the far IR to the optical band can hardly explain the observed bright optical flares in V1223 Sgr. It is possible that some interaction of the inner disk region with the synchrotron jet occurred to produce the optical brightening; this scenario is supported by the increase of the Hα emission during the flare. Placing the flares in the long-term optical light curve can give us important information about the conditions suitable for generating these events in V1223 Sgr (Fig. 1a). In the last years, this system was in the so-called high state of its activity, but it displayed several recurring decreases to the so-called shallow low state. Nevertheless, even during this shallow low state it was always much brighter than in the true low state reported by [15]. In the statistical distribution (Fig. 1b), the upper (brighter) limit of the brightness is significantly better defined than its lower (fainter) limit. Also the time fraction spent in the high state is considerably larger than in the shallow low state. The synchrotron flare in the Spitzer data occurred from a shallow low state (Fig. 1b). It is close to an optical flare, but does not precisely coincide with it (Fig. 1a). A cluster of flares in a given shallow low state can explain this. The situation is similar for the flare on the Bamberg plate [34]. It therefore appears that the episodes of shallow low state create suitable conditions for the flares in V1223 Sgr (an increase of the Alfvén radius, the inner disk region therefore being closer to the propeller regime, but still with a high Ṁ?). AE Aqr/1RXS J204009.4−005216 is a unique system because it is an intermediate polar in a propeller regime.
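Whether a magnetized WD operates as a propeller can be checked by comparing the corotation radius with the magnetospheric radius. A minimal sketch, with assumed (not quoted) system parameters apart from the 33 s spin period mentioned in the text:

import math

# Propeller condition for a spinning magnetized WD: incoming matter is
# flung out when the magnetospheric radius exceeds the corotation radius
# r_co = (G M P^2 / 4 pi^2)^(1/3).  The WD mass is an assumption.
G = 6.674e-8          # [cm^3 g^-1 s^-2]
Msun = 1.989e33       # [g]

M = 0.9 * Msun        # assumed WD mass for AE Aqr
P_spin = 33.0         # WD spin period [s], quoted in the text

r_co = (G * M * P_spin**2 / (4.0 * math.pi**2))**(1.0 / 3.0)
print(f"corotation radius r_co = {r_co:.2e} cm")
# If the (accretion-rate dependent) magnetospheric radius lies outside
# r_co, the rotating field adds angular momentum to the inflowing blobs
# and ejects them: the propeller regime.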
Most of the transferring matter is therefore ejected by the rapidly spinning (33 s) magnetized WD [41]. An interaction of these blobs with the magnetosphere of the WD leads to flares of the optical (thermal) emission. The typical duration of a single flare is several minutes, but these events can be clustered [32]. Part of the matter of these blobs is trapped in the WD magnetosphere, and these particles are accelerated according to the model of [27]. Synchrotron emission from electrons in expanding clouds then dominates in the far IR and radio bands. This radio emission is a superposition of discrete, synchrotron-emitting flares from the vicinity of the WD [10]. The flare is an expanding plasmoid – a spherical cloud of relativistic electrons which expands adiabatically [4]. Generation of the synchrotron emission is therefore a consequence of the blobs [27], although the occurrence of the individual spikes of the radio emission is not directly correlated with the optical spikes [1]. AE Aqr is a transient TeV source detected by ground-based Cherenkov telescopes [29, 30]. A method of confirming that the TeV emission really comes from the observed object (i.e. an argument against a spurious detection) is to correlate the time variations of its TeV intensity with some period already known to be specific to this object. Indeed, the optical and TeV flares display the same frequency (the 33 s spin period of the WD detected in the optical band). TeV flares are highly transient, with quite a small duty cycle (0.2 percent). These TeV flares were interpreted as due to acceleration of particles by the rotating magnetic field of the WD in an intermediate polar in the propeller regime. TeV flares occur only during a low optical brightness; the accretion luminosity must be low to allow the inner edge of the disk to be outside the co-rotation radius. Electrons are accelerated to E ≈ 10^13 eV and converted to gamma-rays via π0 decay in the blobs. All this production of non-thermal radiation of AE Aqr depends on the state of its long-term activity, which can be measured in the optical band. The mean level of this brightness varies largely on a timescale of years (Fig. 2a). This activity is mainly caused by a variable amount and brightness of the flares. More numerous flares lead to a higher optical brightness of the system, because they emit optical radiation. Even a season with almost absent flares was observed (Fig. 2a).

Figure 2: (a) Long-term activity of AE Aqr in the optical band (ASAS data [33], one CCD image per night). The smooth lines represent the HEC13 fits. The segment with almost absent flares is marked by the box. (b) The light curve folded with the orbital period according to the ephemeris of [12]. Open circles represent all the data points from panel 'a'. Closed circles denote the data from the box in panel 'a'. See Sect. 2.2 for details.

This behavior can be explained by a transient decrease or even a cessation of the mass inflow from the donor. Such an evolution suggests a variable amount of the blobs on this timescale. This implies variable conditions for the generation of the accelerated particles. The light curve of AE Aqr folded with the orbital period according to the ephemeris of [12] displays a large scatter caused by real variations, not by noise (Fig. 2b). The flares can occur at any orbital phase (see also [32]), and the detection of a higher intensity of the flares is less frequent.
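Folding a light curve on a known period, as done for Fig. 2b, is a generic procedure; the sketch below illustrates it on synthetic data. Everything here (function name, bin count, the test signal) is a hypothetical illustration, not the analysis pipeline of the paper.

import numpy as np

def fold_light_curve(t, flux, period, n_bins=20, t0=0.0):
    """Fold a light curve on a trial period and average it in phase bins.

    Used, e.g., to display orbital modulation or to check that a known
    periodicity (such as a WD spin period) is present in the data.
    """
    phase = ((t - t0) / period) % 1.0
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    idx = np.digitize(phase, bins) - 1
    profile = np.array([flux[idx == k].mean() for k in range(n_bins)])
    return 0.5 * (bins[:-1] + bins[1:]), profile

# Synthetic example: a sinusoidal modulation plus noise.
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 3000.0, 500))    # observation times [s]
true_period = 33.0                            # e.g. a 33 s spin period
flux = 1.0 + 0.3 * np.sin(2 * np.pi * t / true_period) \
       + 0.05 * rng.normal(size=t.size)

phase, profile = fold_light_curve(t, flux, true_period)
print(np.round(profile, 2))   # the folded profile recovers the modulation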
the profile of the lower envelope of the folded light curve (remaining all the time, even when the flares are missing) is caused by the tidal deformation of the donor [32]. 2.3. polars polars are cvs with a strongly magnetized wd (typically b > 107 g (e.g. [40]). no disk is formed and the accreting matter flows directly onto the accretion region(s) in the vicinity of the magnetic pole(s) of the wd. this falling matter forms an accretion column above the surface of the accretor (less than 0.1 of the radius of the wd). this column is a source of 597 vojtěch šimon acta polytechnica radiation via several mechanisms. it radiates via cyclotron mechanism in the optical and near ir bands, while the accretion shock emits bremsstrahlung in hard x-rays (e.g. [13, 28]). intensity of the cyclotron emission largely varies with the orbital phase of the polar, which reflects the changing aspect of the emission region with respect to the observer [14]. am her/3a 1815+498 displays a significantly variable ratio of the intensities of the cyclotron and bremsstrahlung components for the individual highstate episodes [35]. this suggests large changes of the properties of the emission region(s) on the wd. these properties are established already in the early phase (several days long) of the high-state episode but they are not reproduced for every individual episode. an increase of ṁ from the donor that switches the polar from the low to the high state of its long-term activity also establishes a division of the emission released during the accretion process into various spectral regions, hence into the individual emission mechanisms operating in the accretion region. this division is valid only for a given episode. each high-state episode is thus characterized by the specific configuration of the accreting matter [35]. this is also supported by the change of the profile of the optical orbital modulation, as observed by [24]. the observed behavior can be reconciled if the bremsstrahlung emission is confined to a smaller region than the cyclotron emission. also the role of several modes of accretion, e.g. a singlepole and two-poles accretion [19], is worth considering. these findings show that both the positions of these emission regions and their contributions to the total intensity varied with time. the emission region caused by the accretion of matter and its conditions are therefore proved to be highly unstable in time. several sites of non-thermalradio emission (frequency of 4.9 ghz) existed in am her [11] in the same time, specifically in the optical low state of its long-term activity [20]. quiescent radio emission was ascribed to the energetic electrons trapped in the magnetosphere of the wd. a radio outburst (flare) was explained as an electron-cyclotron maser near the surface of the late-type donor, in its corona with the magnetic field (b ≈ 1000 g). in the high state of the long-term optical activity, am her is sometimes able to accelerate particles that produce tev emission [8]. its intensity is strongly modulated with the already known orbital period seen in the polarized optical light of am her. this is strong evidence that this gamma-ray emission really comes from this object [8]. protons can be accelerated to very high energies (gev to tev) at the shock front on the top of the accretion column [38], with the subsequent gamma-ray production via π0 decay [3]. 2.4. outbursts of classical novae radioactive isotopes are synthesized during outburst of classical nova (cn). 
Gamma-ray emission is then produced during their decay. The detection of these gamma-rays strongly depends on the isotope's lifetime and on the optical depth in the ejecta of the nova. Only the cumulative effect of the production of 26Al (lifetime of 10^6 years), emitting the gamma-ray line with E = 1.809 MeV in various types of sources concentrated toward the Galactic plane, and of its ejection into space has been observed by COMPTEL/CGRO [9]. This emission therefore does not come from a single source. Outbursts of CNe contribute about 15 percent [23]. The situation of 22Na, with a lifetime of 3.75 years (E = 1.275 MeV), is similar. Outbursts of the neon-type CNe contribute only partly; excitation of 22Ne nuclei, e.g. through low-energy cosmic-ray interactions with the nuclei of the interstellar matter, can dominate [21]. It is very important that in V723 Cas (Nova Cas 1995) the nucleosynthesis during its outburst was really observed in the gamma-ray band [22] (Fig. 3). Both the COMPTEL/CGRO field of Nova Cas 1995 (total integration time of ∼4.5 years) and a spectrum from the position of this nova revealed a new source emitting the 22Na line (E = 1.275 MeV). This is direct evidence that CNe contribute to the gamma-ray flux of 22Na in the Galaxy. This 22Na was synthesized during thermonuclear reactions on the surface of the WD [22]. The time evolution of the 22Na gamma-ray flux of V723 Cas is a combination of the decreasing absorption of the gamma-ray flux in the ejecta and the time dependence of the number of undecayed 22Na nuclei. The case of V723 Cas also made it possible to identify the important properties needed for the formation and preservation of 22Na, hence for the detection of the 22Na gamma-ray flux: the type of CN (slow nova), a massive ejected envelope, and a low mass of the WD (only 0.66 M⊙) [22]. A strongly accelerated population of electrons with a non-thermal energy distribution during an outburst of the classical nova V2491 Cyg was proved by the detection of superhard (E > 10 keV) X-ray emission with a power-law spectral profile up to E of 70 keV, attenuated by a heavy extinction of 1.4 × 10^23 cm^−2 (Suzaku satellite) [37]. According to the model of [36], Compton degradation, i.e. repeated scattering of the gamma-ray photons by electrons in the matter ejected from the WD, can explain the extremely hard X-ray spectrum and the absence of the 1.275 MeV line of 22Na. The ejecta become transparent to the gamma-ray photons within several tens of days. Gamma-rays from the shock in the very fast CN in the symbiotic binary V407 Cyg (Nova Cyg 2010), detected by LAT/Fermi, represent a unique case of the CN activity [2]. The environment of the erupting WD was very specific: this object was deeply embedded in the dense wind of its cool giant companion [31]. According to [2], a variable gamma-ray emission started with the rise of the optical luminosity of the outburst and lasted for about 18 days. Continuum emission with no lines was detected in the spectral region between 0.2 and 5 GeV.

Figure 3: (a) Outburst of the classical nova V723 Cas (Nova Cas 1995) in the optical band (AAVSO data [20]). The vertical line denotes the time of the discovery of the optical outburst. The outburst started at most several days before the discovery. (b) Time evolution of the flux of the line (E = 1.275 MeV) of the radioactive isotope 22Na of V723 Cas (adapted from [22]). See Sect. 2.4 for details.

No spectral
variability was observed over the duration of the active gamma-ray period. most gamma-rays come from the part of the nova shell approaching the red giant donor [2]. this gamma-ray emission was interpreted as an interaction of the material of the nova shell with the dense medium of the red giant donor. particles were accelerated by this interaction of the shell to produce π0 decay gamma-rays from proton-proton interactions [2]. 3. general conclusions non-thermal radiation was definitely observed in the following types of cvs: dwarf nova in outburst, novalike cv in the high state, intermediate polar, cn during outburst. this brings evidence that the processes for the creation and/or acceleration of highly energetic particles must operate in such cvs. the conditions for generating the non-thermal radiation depend on the state of the system’s activity and its parameters. these processes, states of the activity, and the spectral bands in which non-thermal radiation can be observed can be summarized in the following way. synchrotron emission provides us with evidence of generation of the magnetic fields influencing the transferring matter. this emission, which can be studied in the radio band, can come from the jets launched during the outbursts of dwarf novae and even during the high state of some novalike systems. however, the structure of this medium is uncertain in some systems (e.g. the flare radiating in far ir, observed in the intermediate polar v1223 sgr). also in the case of ae aqr, the synchrotron emission in radio is produced by the clouds launched from the system by the propeller effect rather than in a jet. also cyclotron emission carries information about strong magnetic field existing in some cvs, namely polars. it emerges that even a single polar can possess several simultaneously existing cyclotron-emitting regions: accretion region on the wd (emitting in the optical and ir bands), and the donor’s magnetosphere, radiating in radio. gamma-ray production via π0 decay suggests operation of mechanisms that lead to acceleration of protons in various types of cvs. π0 particles can be created in the transition region between the rotating wd magnetosphere and the accreting matter, by the propeller effect, by a shock on a strongly magnetized wd, or in ejecta of outburst of a symbiotic cn. production of radioactive isotopes occurs during the nuclear reactions in the outer layer of the wd during a cn outburst. only some isotopes lead to the production of the observable gamma-rays. only cumulative effect of many cne can be expected from the gamma emission of 26al. the situation is more optimistic as regards 22na, which was detected from a single cn. we offer the following solution and further prospects in search for the suitable states of activity of cvs in which the highly energetic particles are produced. the long-term activity of these objects in the optical band appears to be a plausible indicator of suitable conditions for the generation of these particles. the reason is that the data coverage in other spectral bands is fragmentary or even absent. the phenomena related to the generation of the highly energetic particles often have episodic character with a low duty cycle: e.g. ultra-high energy flares in the propeller systems (ae aqr), radio flares in polars (am her). this property makes the detection of these phenomena and establishing a relation between various emission processes difficult. 
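The interplay invoked for V723 Cas in Sect. 2.4, radioactive decay versus decreasing ejecta opacity, can be illustrated with a toy model. The sketch below is illustrative only; the normalization, initial optical depth and opacity law are assumptions, not fitted values from [22].

import numpy as np

# Toy model of a nova 22Na gamma-ray light curve: the line flux is the
# number of undecayed 22Na nuclei times the escape probability of a
# 1.275 MeV photon through the expanding ejecta.  All numbers are
# illustrative assumptions, not values from the literature.
tau_decay = 3.75 * 365.25        # 22Na mean lifetime [days]
tau0_ejecta = 25.0               # assumed initial gamma-ray optical depth
t_scale = 100.0                  # assumed expansion time scale [days]

t = np.linspace(1.0, 2000.0, 8)  # time since outburst [days]
n_undecayed = np.exp(-t / tau_decay)
tau_gamma = tau0_ejecta * (t_scale / t) ** 2   # column density drops as t^-2
escape = np.exp(-tau_gamma)

flux = n_undecayed * escape      # relative 1.275 MeV line flux
for ti, fi in zip(t, flux):
    print(f"t = {ti:7.1f} d   relative flux = {fi:.3e}")

The flux first rises as the shell turns transparent and then declines on the radioactive time scale, which is the qualitative behavior described for V723 Cas.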
acknowledgements support by the grant 205/08/1207 of the grant agency of the czech republic and the project rvo:67985815 is acknowledged. this research has made use of observations from the asm/rxte, asas, aavso, and afoev databases. i thank prof. p. harmanec for providing me with the code hec13. the fortran source version, compiled version and brief instructions how to use the program can be obtained via http://astro.troja.mff.cuni.cz/ ftp/hec/hec13/. 599 http://astro.troja.mff.cuni.cz/ftp/hec/hec13/ http://astro.troja.mff.cuni.cz/ftp/hec/hec13/ vojtěch šimon acta polytechnica references [1] abada-simon, m. et al., 1995, aspc, 85, 355 [2] abdo, a. a. et al., 2010, sci, 329, 817 [3] barrett, p. et al., 1995, apj, 450, 334 [4] bastian, t. s. et al., 1988, apj, 324, 431 [5] bednarek, w., pabich, j., 2011, mnras, 411, 1701 [6] benz, a. o., guedel, m., 1989, a&a, 218, 137 [7] benz, a. o. et al., 1996, aspc, 93, 188 [8] bhat, c. l. et al., 1991, apj, 369, 475 [9] diehl, r. et al., 1995, adspr, 15, 123 [10] dubus, g. et al., 2007, apj, 663, 516 [11] dulk, g. a. et al., 1983, apj, 273, 249 [12] echevarría, j. et al. 2008, mnras, 387, 1563 [13] gänsicke, b. t., 1997, phdt, 28 [14] gänsicke, b. t. et al., 2001, a&a, 372, 557 [15] garnavich, p., szkody, p., 1988, pasp, 100, 1522 [16] ghosh, p., lamb, f. k., 1978, apj, 223, l83 [17] hameury, j.-m. et al., 1998, mnras, 298, 1048 [18] harrison, t. e. et al., 2010, apj, 710, 325 [19] heise, j. et al., 1985, a&a, 148, l14 [20] henden, a., 2012, aavso international database, private communication [21] iyudin, a. f. et al., 2005, a&a, 443, 477 [22] iyudin, a. f., 2010, arep, 54, 611 [23] josé, j. et al., 2006, nupha, 777, 550 [24] kafka, s., hoard, d. w., 2009, pasp, 121, 1352 [25] körding, e. et al., 2008, sci, 320, 1318 [26] körding, e. g. et al., 2011, mnras, 418, l129 [27] kuijpers, j. et al., 1997, a&a, 322, 242 [28] kuulkers, e. et al., 2006, in: compact stellar x-ray sources. cambridge univ. press, p.421 [29] meintjes, p. j. et al., 1992, apj, 401, 325 [30] meintjes, p. j. et al., 1994, apj, 434, 292 [31] munari, u. et al., 2011, mnras, 410, l52 [32] van paradijs, j. et al., 1989, a&as, 79, 205 [33] pojmanski, g., 1997, aca, 47, 467 [34] šimon, v., 2010, adast, 38, id.382936 [35] šimon, v., 2011, newa, 16, 405 [36] suzuki, a., shigeyama, t., 2010, apj, 723, l84 [37] takei, d. et al., 2009, apj, 697, l54 [38] terada, y. et al., 2010, apj, 721, 1908 [39] van amerongen, s., van paradijs, j., 1989, a&a, 219, 195 [40] warner, b., 1995, cataclysmic variable stars, cambridge univ. press [41] wynn, g. a. et al., 1997, mnras, 286, 436 600 acta polytechnica 53(supplement):595–600, 2013 1 introduction 2 non-thermal radiation in various types of cvs 2.1 cvs with accretion disks and ``non-magnetized'' wds 2.2 intermediate polars 2.3 polars 2.4 outbursts of classical novae 3 general conclusions acknowledgements references acta polytechnica doi:10.14311/ap.2013.53.0883 acta polytechnica 53(6):883–889, 2013 © czech technical university in prague, 2013 available online at http://ojs.cvut.cz/ojs/index.php/ap control of the double inverted pendulum on a cart using the natural motion zdeněk neusser∗, michael valášek department of mechanics, biomechanics and mechatronics, faculty of mechanical engineering, czech technical university in prague, technická 4,166 07 prague, czech republic ∗ corresponding author: zdenek.neusser@fs.cvut.cz abstract. this paper deals with controlling the swing-up motion of the double pendulum on a cart using a novel control. 
the system control is based on finding a feasible trajectory connecting the equilibrium positions from which the eigenfrequencies of the system are determined. then the system is controlled during the motion between the equilibrium positions by the special harmonic excitation at the system resonances. around the two equilibrium positions, the trajectory is stabilized by the nonlinear quadratic regulator nqr (also known as sdre – the state dependent riccati equation). these together form the control between the equilibrium positions demonstrated on the double pendulum on a cart. keywords: underactuated systems, nonlinear control, mechanical systems. 1. introduction the control of systems with fewer actuators than degrees of freedom is a challenging task. systems of this type are called underactuated systems. a more precise definition of underactuated systems should be based on the number of blocks in the brunovsky canonical form [1]. our study deals with the double pendulum on a cart in the gravity field between equilibrium positions passing through singularity positions. the mechanism can be controlled in its equilibrium positions (see [2]), but there are always singularity points in between. the system is not controllable by the nqr control [3, 4]) with zero velocity. the main aim of our work is to determine the behavior of the actuator for utilizing the swing-up motion. it is a challenging task (according to [5]) to move the mechanism over this region of singular positions, or even to perform trajectory tracking. this problem has already been investigated by many authors. the energy based approach used in the design of stabilizing controllers constitutes the basis of the control applied in [6] and in [7], and this is another type of energy-based approach applied to the double inverted pendulum on a cart. a modified energy control strategy using friction forces is used in [10]. a method adding the actuators to the underactuated joints and running the swing-up motion with optimization that minimizes the action of the added actuators is used in [8]. partly stable controllers derived using the dynamic model of the manipulator in order to employ them under an optimal switching sequence is presented in [9]. another approach is to use exact input-output linearization within a certain submanifold (direction) in which the system is controllable and the zero-dynamics is stable [10]. this approach is used in [11], where the feedforward control strategy is combined with input-output linearization. the system acquires new input variables that are suitable for controlling the actuated part together with the underactuated part. however, the system can be controlled by this new input only if the input functions are from a certain admissible class of functions. this admissible class of functions can be determined by the cyclic control described in [12] as periodic invariant functions (with respect to the actuated variables), without details. another way to find suitable admissible functions for controlling underactuated mechanical systems between equilibrium positions across a singularity area in the gravity field is described in [13]. these suitable admissible functions are based on inverting the so-called natural motion, or on reusing specific knowledge gained from these natural motions. in brief, the natural motion is a motion of the system without control. the method is shown in [13] only for a double pendulum and a pendulum on a cart. 
our paper deals with the application of this approach also for the swing-up motion of a double pendulum on a cart as a more complex system than that investigated in [13]. our paper describes this different approach to the control of underactuated systems. it demonstrates a different solution of the problem explored in [7] and justifies the same results as in [7], but with broader validity and applicability. 2. natural motion natural motion [13] is the motion of an investigated mechanical system moving in the gravity field with zero or constant inputs (drive torques caused by gravity), from the unstable equilibrium position to the stable equilibrium position. to reach the desired quiescent motion, it is necessary to add a passive member (a damper) into the actuator. the underactuated part moves freely, but due to the overall energy flow into 883 http://dx.doi.org/10.14311/ap.2013.53.0883 http://ojs.cvut.cz/ojs/index.php/ap z. neusser, m. valášek acta polytechnica figure 1. double pendulum on a cart; left, lower position, right, upper position. the passive part the whole system is stabilized around the stable equilibrium point. the damping term (the added passive member) is chosen small, in order to remove the difference in potential energy between the equilibrium positions. the value of the damping term is optimized with respect to the settling time. the natural motion is later used to determine the eigenfrequencies of the natural motion. these are not influenced by the damping if it is small. proposition 1. the control described here enables the system of the cart-double-pendulum to be moved from the lower position (figure 1 left) to the upper position (figure 1 right). justification. the control consists of two phases. first, the system is moved from the lower position (figure 1 left) to the vicinity of the upper position (figure 1 right) with small final velocities. second, the system is stabilized and controlled exactly into the equilibrium at the upper position by nqr control. the first phase is realized by proper selection of the amplitudes in equation (9)) in order to achieve values in the vicinity of the upper position and small velocities. amplitude values of this kind exist because each amplitude influences the particular different eigenmode, and because the periods of all eigenmodes meet at one time instant. the excitation at the eigenfrequencies guarantees sufficient amplification of the excitation amplitudes to the angular amplitudes of the double-pendulum. the combination of two and more different eigenmodes enables any desired position to be reached, and the meeting of all periods at one time instant enables zero or near zero velocities to be reached. proposition 2. the cart-double-pendulum system can be moved from any position to the upper position (figure 1 right). u x ϕ � ϕ � m v i 1 , m 1 , l 1 , l t1 i 2 , m 2 , l 2 , l t2 figure 2. scheme of the double inverted pendulum on a cart with coordinates, parameters and external force. justification. the control consists of two phases. first, the system is moved from any position to the lower position (figure 1 left), and then the system is moved to the upper position, according to 1. the first phase is realized by active damping, i.e. the control applies damping of significant value to the cart, in order to damp out the energy and stop the system at stable equilibrium in the lower position (figure 1 left). 3. problem formulation figure 2 presents the scheme of a double inverted pendulum on a cart. 
This mechanism has the cart moving horizontally, with two links rotationally connected, and a force acting on the cart. Using Lagrange equations of the second kind, the dynamic equations of the double pendulum on a cart are obtained. Put

$$ M = \begin{bmatrix} m_v+m_1+m_2 & a\cos\varphi_1 & m_2 l_{t2}\cos\varphi_2 \\ a\cos\varphi_1 & I_1+m_1 l_{t1}^2+m_2 l_1^2 & b\cos(\varphi_1-\varphi_2) \\ m_2 l_{t2}\cos\varphi_2 & b\cos(\varphi_1-\varphi_2) & I_2+m_2 l_{t2}^2 \end{bmatrix}, $$

$$ F = \begin{bmatrix} (m_1 l_{t1}+m_2 l_1)\sin\varphi_1\,\dot\varphi_1^2 + m_2 l_{t2}\sin\varphi_2\,\dot\varphi_2^2 \\ -m_2 l_1 l_{t2}\sin(\varphi_1-\varphi_2)\,\dot\varphi_2^2 + (m_1 l_{t1}+m_2 l_1)\sin\varphi_1\, g \\ m_2 l_1 l_{t2}\sin(\varphi_1-\varphi_2)\,\dot\varphi_1^2 + m_2 l_{t2}\sin\varphi_2\, g \end{bmatrix}, $$

$$ M(\varphi_1,\varphi_2)\begin{bmatrix}\ddot x\\ \ddot\varphi_1\\ \ddot\varphi_2\end{bmatrix} = F(\varphi_1,\dot\varphi_1,\varphi_2,\dot\varphi_2) + u(t)\begin{bmatrix}1\\0\\0\end{bmatrix}, \qquad (1) $$

where $a = m_1 l_{t1}+m_2 l_1$ and $b = m_2 l_1 l_{t2}$. Matrix M is the inertia matrix, and vector F contains the forces dependent on the velocities and the external forces besides the control inputs. Actuator u is the horizontal force acting between the base frame and the cart. The goal is to control the mechanism from the stable equilibrium point (φ1 = π, φ2 = π) into the upper unstable equilibrium point (φ1 = 0, φ2 = 0). The procedure described in detail in [13], covering input-output feedback linearization, forms the basis for the subsequent adaptation of the system of equations. It is necessary to divide the coordinates of the mechanism in such a way that one part is controlled and the other part forms the zero dynamics. The following division of the coordinates is used (this choice is not unique, as demonstrated in [13]):

$$ q_m = x, \qquad q_z = \begin{bmatrix}\varphi_1\\ \varphi_2\end{bmatrix}. $$

Index m denotes the actuated (controlled) part, and index z denotes the zero-dynamics coordinates. Matrix M is decomposed in the following way:

$$ M = \begin{bmatrix} M_{mm} & M_{mz} \\ M_{zm} & M_{zz}\end{bmatrix}, $$

where
$M_{mm} = \bigl[\,m_v+m_1+m_2\,\bigr]$,
$M_{mz} = \bigl[\,(m_1 l_{t1}+m_2 l_1)\cos\varphi_1,\ \ m_2 l_{t2}\cos\varphi_2\,\bigr]$,
$M_{zm} = M_{mz}^{T}$, and

$$ M_{zz} = \begin{bmatrix} I_1+m_1 l_{t1}^2+m_2 l_1^2 & m_2 l_1 l_{t2}\cos(\varphi_1-\varphi_2)\\ m_2 l_1 l_{t2}\cos(\varphi_1-\varphi_2) & I_2+m_2 l_{t2}^2\end{bmatrix}. $$

A necessary condition is that matrix $M_{zz}$ be invertible, and for nonzero moments of inertia this is the case. Vector F is decomposed as $F = [f_m\ \ f_z]^T$, where

$$ f_m = (m_1 l_{t1}+m_2 l_1)\sin\varphi_1\,\dot\varphi_1^2 + m_2 l_{t2}\sin\varphi_2\,\dot\varphi_2^2, $$

$$ f_z = \begin{bmatrix} -m_2 l_1 l_{t2}\sin(\varphi_1-\varphi_2)\,\dot\varphi_2^2 + (m_1 l_{t1}+m_2 l_1)\sin\varphi_1\, g \\ m_2 l_1 l_{t2}\sin(\varphi_1-\varphi_2)\,\dot\varphi_1^2 + m_2 l_{t2}\sin\varphi_2\, g \end{bmatrix}. $$

The new control variable w is chosen as the second derivative of the coordinate x:

$$ \ddot x = w. \qquad (2) $$

The remaining coordinates φ1 and φ2 can be expressed directly from equation (1) and its decomposition. The zero dynamics is represented by

$$ \begin{bmatrix}\ddot\varphi_1\\ \ddot\varphi_2\end{bmatrix} = M_{zz}^{-1}\,(f_z - M_{zm}\,w). \qquad (3) $$

The expression for the actuator u is derived by combining equations (2) and (3):

$$ u(t) = f_m - M_{mz}M_{zz}^{-1}f_z - \bigl(M_{mm} - M_{mz}M_{zz}^{-1}M_{zm}\bigr)\,w. \qquad (4) $$

The singular positions of the system are at φ1 = ±π/2, φ2 = ±π/2, with the other states (velocities) zero. Then $M_{zm} = 0$. The system of equations (2)–(3) then takes the following form with zero velocity terms:

$$ \ddot x = w, \qquad \begin{bmatrix}\ddot\varphi_1\\ \ddot\varphi_2\end{bmatrix} = M_{zz}^{-1} f_z, \qquad (5) $$

where the control variable cannot influence the coordinates φ1 and φ2 and the system cannot be controlled.

4. How to get the new input w
The problem of how to get a new input w that ensures that $q_z$ ends in the desired position together with $q_m$ has been addressed in [13]. It is easy to find a w which controls $q_m$, but it is not easy to find a w which simultaneously controls $q_m$ and $q_z$. The new input w contains two parts, according to [12].
It consists of the active part $w_a$, which moves the $q_m$ coordinates from the initial position $q_m(0)$ to the desired final position $q_{md}(T)$ in some chosen time T according to equation (2), and the invariant part $w_i$, which is invariant with respect to the final position of the motion of $q_m$ but, through the dynamic coupling of the zero dynamics, influences $q_z$ in a way that moves $q_z$ from the initial position $q_z(0)$ to the desired final position $q_{zd}(T)$ in time T:

$$ w = w_a + w_i. \qquad (6) $$

The invariant $w_i$ is such that

$$ \int_0^{T} w_i\,dt = 0 \qquad\text{and}\qquad \int_0^{T}\!\!\int_0^{t} w_i\,d\tau\,dt = 0. \qquad (7) $$

In our case, due to conditions (7), the final velocity and position of x are influenced only by the active part $w_a$ of the control (6). This control is chosen arbitrarily, just to fulfil the desired final values of the position and velocity of x. When the active part is chosen, it is a given function of time $w_a(t)$, and substitution of (6) into (3) results in

$$ \begin{bmatrix}\ddot\varphi_1\\ \ddot\varphi_2\end{bmatrix} = M_{zz}^{-1}\bigl(f_z - M_{zm}\,(w_a(t) + w_i)\bigr). \qquad (8) $$

Now the motion of φ1 and φ2 can be controlled by the choice of $w_i$, subject only to conditions (7). For the invariant motion, use is made of the so-called natural motions and their frequencies $f_i$:

$$ w_i = \sum_i a_i \sin(2\pi f_i t + \varphi_i), \qquad (9) $$

where the amplitudes $a_i$ and phases $\varphi_i$ are constants obtained by optimization. The frequencies $f_i$ are chosen, and the amplitudes $a_i$ and phases $\varphi_i$ are optimized so that the coordinates φ1 and φ2 reach the desired position and velocity at the chosen settling time T. If

$$ 2\pi f_i T = 2 k_i \pi \qquad (10) $$

holds for some natural number $k_i$, then the functions (9) satisfy conditions (7). How to determine suitable frequencies is explained in the next section.

5. Simulation of the natural motion
During the simulation, all states are recorded. Fourier analysis is applied to the recorded states in order to determine the eigenfrequencies of the considered nonlinear system. The feedforward control is constructed using these eigenfrequencies and equations (4), (6) and (9).

Figure 3: Motion of the double pendulum on a cart from its upper position into the bottom position, with the damping force between the cart and the base frame (trajectories of the cart, the end of the first link, and the end of the second link in the x–y plane).

The eigenfrequencies of the natural motion describe the vibrational motion of the variables q that ends at equilibrium, where all swings meet at the final time. The conditions (10) are therefore satisfied. At the same time, the determined eigenfrequencies are in fact rational numbers, and the natural numbers $k_i$ at the final time T therefore exist for any set of eigenfrequencies $f_i$. Finally, the conditions (10) from the natural motion need to be satisfied only approximately in order to move the mechanical system into such a vicinity of the upper equilibrium that the NQR control can stabilize the motion. The excitation of the system at its eigenfrequencies enables large motions of the system to be produced through resonance. The eigenfrequencies allow the mechanism to be excited with less input energy than other frequencies, as excitation at the eigenfrequencies enables the energy to be accumulated in the system. The system is nonlinear, and it is not easy to determine the eigenfrequencies. The usage of the natural motion is therefore helpful.
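The natural-motion procedure of this section can be sketched in a few lines of code: simulate the damped fall governed by equation (1) with $u = -b\dot x$, record the accelerations, and read the dominant peaks off their Fourier transforms. The sketch below is a minimal illustration under stated assumptions (the initial deflection and the set of printed peaks are arbitrary choices), not the authors' implementation.

import numpy as np
from scipy.integrate import solve_ivp

# Natural motion of the double pendulum on a cart: damped fall from near
# the upper equilibrium, then FFT of a recorded acceleration signal.
mv, m1, m2 = 1.0, 0.5, 0.5
l1, lt1, lt2 = 1.0, 0.5, 0.5
I1, I2 = 0.0417, 0.0417
g, b = 9.81, 11.92
a_c, b_c = m1 * lt1 + m2 * l1, m2 * l1 * lt2

def accelerations(y):
    x, p1, p2, xd, p1d, p2d = y
    M = np.array([
        [mv + m1 + m2, a_c * np.cos(p1), m2 * lt2 * np.cos(p2)],
        [a_c * np.cos(p1), I1 + m1 * lt1**2 + m2 * l1**2, b_c * np.cos(p1 - p2)],
        [m2 * lt2 * np.cos(p2), b_c * np.cos(p1 - p2), I2 + m2 * lt2**2]])
    F = np.array([
        a_c * np.sin(p1) * p1d**2 + m2 * lt2 * np.sin(p2) * p2d**2,
        -b_c * np.sin(p1 - p2) * p2d**2 + a_c * np.sin(p1) * g,
        b_c * np.sin(p1 - p2) * p1d**2 + m2 * lt2 * np.sin(p2) * g])
    F[0] += -b * xd                       # damper replaces the actuator
    return np.linalg.solve(M, F)          # M(q) qdd = F  ->  qdd

def rhs(t, y):
    return np.hstack((y[3:], accelerations(y)))

y0 = [0.0, 0.05, 0.05, 0.0, 0.0, 0.0]     # slight deflection, zero velocity
t_eval = np.linspace(0.0, 60.0, 6000)
sol = solve_ivp(rhs, (0.0, 60.0), y0, t_eval=t_eval, rtol=1e-8, atol=1e-8)

phi1_dd = np.array([accelerations(y)[1] for y in sol.y.T])
spec = np.abs(np.fft.rfft(phi1_dd))
spec[0] = 0.0                             # drop the DC component
freqs = np.fft.rfftfreq(t_eval.size, d=t_eval[1] - t_eval[0])
print("dominant frequencies [Hz]:", np.sort(freqs[np.argsort(spec)[-6:]]))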
The following parameters are used for the simulation: the mass of the cart is 1 kg, the masses of the first and second links are 0.5 kg each, their moments of inertia are 0.0417 kg m², and the length of each link is 1 m, with the center of gravity in the middle of each link. The natural motion is illustrated in Figure 3. It is obtained by simulating the equations of motion (1) with zero control and with damping in its place, $u(t) = -b\dot x$, from the upper position to the lower position. The outer dotted line represents the end point of the second link, the dashed line is the trajectory of the joint between the links, and the straight line represents the cart movement. The motion starts in the upper position with a slight deflection of the angles φ1, φ2 and zero velocities. The mechanism falls into the bottom stable position with a quiescent motion. This behavior is caused by the damping force acting between the cart and the base frame, where the actuator is later placed. The damping coefficient (b = 11.92 N m s/rad) is obtained by optimization with respect to the settling time. The behaviors of all states, including the accelerations, are recorded.

6. Control design
The data from the natural fall motion are now used. The first frequencies from the angular accelerations of the first and the second links form the basis of the new input variable for the swing-up motion (Figure 4). The frequencies for the angular acceleration of the first link are 0.4053, 0.5722, 0.8583, 0.9537, 1.0490 and 1.1444 Hz (Figure 4, middle row); the frequencies for the angular acceleration of the second link are 0.4053, 0.6437, 0.8345, 0.9537, 1.0490 and 1.1206 Hz (Figure 4, last row); and the frequencies for the cart acceleration are 0.4053, 0.5722, 0.7629, 0.9060, 1.0014 and 1.0967 Hz (Figure 4, first row). The input variable w consists of the active part and the invariant part (according to equation (6)). In this case the active part is zero, because the variable x has the same initial and final position. The sinusoidal behavior of the invariant part uses frequencies from the movement of the two links, because w directly influences φ1 and φ2 through the zero dynamics in equation (3). The amplitudes are proportional to the difference in potential energy between the initial and final positions of the mechanism, but their values are the result of optimization. The difference of the potential energy is

$$ E_p = m_1 l_{t1} g + m_2 (l_1 + l_{t2}) g. \qquad (11) $$

All parameters and the behavior of the control variable w are given by the following system:

$$ a = \left[\tfrac{5}{6},\ \tfrac{1}{12},\ \tfrac{1}{12}\right]\bigl(m_1 l_{t1} g + m_2 (l_1 + l_{t2}) g\bigr), \qquad \varphi = \left[-\tfrac{\pi}{2},\ \tfrac{\pi}{2},\ \tfrac{\pi}{2}\right], \qquad f = [0.4053,\ 0.6437,\ 0.8345], $$

$$ \ddot x_1 = w, \qquad w_a = 0, \qquad w_i = \sum_{i=1}^{3} a_i \sin(2\pi f_i t + \varphi_i), \qquad w = w_a + w_i. \qquad (12) $$

The conditions (10) are fulfilled only approximately (2/0.4053 = 4.93, 3/0.6437 = 4.66 and 4/0.8345 = 4.79), but near the upper position the control is switched to NQR.

Figure 4: Accelerations ẍ, φ̈1, φ̈2 of the cart (top), the first link (middle) and the second link (bottom) as functions of time (left) and their Fourier transforms (right).

The usage of these frequencies is efficient, because they are the eigenfrequencies and the system is sensitive to input at these frequencies.
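How closely the chosen frequencies respect the invariance conditions (7) can be checked by integrating the signal (12) directly. A minimal sketch follows; the settling time T ≈ 4.8 s is an assumption consistent with the ratios $k_i/f_i$ quoted above, and the parameter values are those of equation (12).

import numpy as np
from scipy.integrate import cumulative_trapezoid, trapezoid

# Build the feedforward signal (12) and evaluate the residuals of the
# invariance conditions (7) over an assumed settling time T.
m1 = m2 = 0.5
l1, lt1, lt2 = 1.0, 0.5, 0.5
g = 9.81

Ep = m1 * lt1 * g + m2 * (l1 + lt2) * g        # eq. (11)
a = np.array([5/6, 1/12, 1/12]) * Ep
phi = np.array([-np.pi/2, np.pi/2, np.pi/2])
f = np.array([0.4053, 0.6437, 0.8345])

T = 4.8                                        # assumed settling time [s]
t = np.linspace(0.0, T, 20001)
w = np.sum(a[:, None] * np.sin(2*np.pi*f[:, None]*t + phi[:, None]), axis=0)

W = cumulative_trapezoid(w, t, initial=0.0)    # running integral of w
I1 = trapezoid(w, t)                           # first condition in (7)
I2 = trapezoid(W, t)                           # second condition in (7)
print(f"int w dt     = {I1:+.4f}  (ideally 0)")
print(f"int int w dt = {I2:+.4f}  (ideally 0)")

The residuals are small but nonzero, which mirrors the statement above that conditions (10) hold only approximately and that the NQR controller must take over near the upper position.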
In the surroundings of the desired position, local stabilization control is required. In our case we use the nonlinear quadratic regulator (NQR, also known as SDRE; see [3, 4]). However, the region of attraction of this control is also limited. In each time step, the NQR controller calculates the input force u at the cart. The input variable u is therefore plotted in the same figure: it is computed from the cart acceleration through (4) during the feedforward phase, and directly by the NQR algorithm in the vicinity of the unstable equilibrium.

7. Simulation of the double inverted pendulum on the cart
The control is designed, and it is used for the simulation of the system described by equations (2)–(3). The simulated swing-up motion, guided by the cart acceleration $\ddot x_1 = w$ described in equation (12), reaches the surroundings of the unstable equilibrium position (see Figure 5). The NQR algorithm is used for stabilization around the upper equilibrium. This algorithm controls the system from a position in the surroundings of the upper unstable position to this unstable equilibrium. When the pendulum reaches a position where it can be controlled by NQR, the NQR control is switched on and leads the mechanism to the equilibrium. The entire trajectory of the end point of the pendulum and of the cart is shown in Figure 5, left, with the starting position and the final position marked. Figure 5, right, depicts the behavior of the input force acting on the cart. The NQR control algorithm starts at time 2.715 s. A big step in the input force is seen in Figure 6, right, where the change of the control strategies is made. For illustration, the behavior of the coordinates φ1, φ2 and x and of their velocities is shown in Figure 7.

8. Conclusions
This paper has described a novel control for the bottom-up motion of a double pendulum on a cart using eigenfrequencies determined from the natural motion. The set of equations describing the mechanical system is transformed using exact input-output feedback linearization. The control is based on the feedforward control strategy, using sinusoids with frequencies obtained from the falling trajectory (natural motion). The frequencies of the sinusoids are eigenfrequencies of this underactuated system. The amplitudes of this control are related to the potential energy needed for the system to change its position in the field of gravity. The zero dynamics integrated in the underactuated mechanisms can be controlled by frequencies added to the signal of the control variable. The approach described here has great potential for extension into the field of underactuated systems.

Figure 5: Swing-up motion controlled by the cart acceleration; left, mechanism end points in the working plane; right, input acceleration behavior.

Figure 6: Final swing-up motion of the double pendulum on the cart; left, motion in the plane; right, the input force acting on the cart.

Our work forms the basis for the control of systems with flexible parts, to reduce unintentional residual structural vibrations during the motion. This kind of control is useful in controlling flexible robots.

Acknowledgements
This work was partially supported by project MSM 6840770003 'Algorithms for computer simulation and application in engineering'.
references [1] valášek, m.: design and control of under-actuated and over-actuated mechanical systems challenges of mechanics and mechatronics. supplement to vehicle system dynamics, 40, 2004, p. 37–50. [2] olfati-saber, r.: nonlinear control of underactuated mechanical systems with application to robotics and aerospace vehicles. phd thesis, mit, boston, 2001. [3] steinbauer, p.: nonlinear control of nonlinear mechanical systems. phd thesis, czech technical university in prague, prague, 2002. (in czech) [4] valášek, m., steinbauer, p.: nonlinear control of multibody systems. in: “euromech colloquium 404, advances in computational multibody dynamics”, lisboa: instituto technico superior, av. rovisco pais, 1999, p. 437–444. [5] aneke, n. p. i.: control of underactuated mechanical systems. phd thesis, tu eindhoven, eindhoven, 2002. [6] fantoni, i., lozano, r.: non-linear control for 888 vol. 53 no. 6/2013 control of the double inverted pendulum on a cart 0 5 10 15 −3 −2 −1 0 1 2 3 4 5 6 7 t [s] x [m ] ϕ 1 [r a d ] ϕ 2 [r a d ] x ϕ1 ϕ2 0 5 10 15 −10 −5 0 5 10 15 t [s] ẋ [m s− 1 ] ϕ̇ 1 [r a d ·s − 1 ] ϕ̇ 2 [r a d ·s − 1 ] x ϕ1 ϕ2 figure 7. upward movement of the double inverted pendulum on the cart, positions and velocities. underactuated mechanical systems. london: springer-verlag, 2002. [7] xin, x.: analysis of the energy-based swing-up control for the double pendulum on a cart. international journal of robust and nonlinear control, 21, 2011, p. 387–403. [8] rubi, j., rubio, a., avello, a.: swing-up control problem for a self-erecting double inverted pendulum. iee proc.–control theory app., 149 (2), 2002, p. 169–175. [9] udawatta, l., watanabe, k., izumi, k., kuguchi, k.: control of underactuated robot manipulators using switching computed torque method: ga based approach. soft computing, 8, 2003, p. 51–60. [10] mahindrakar, a. d., rao, s., banavar, r. n.: a point-to-point control of a 2r planar horizontal underactuated manipulator. mechanism and machine theory, 41, 2006, p. 838–844. [11] valášek, m.: exact input-output linearization of general multibody system by dynamic feedback. eccomas conference madrid: multibody dynamics, 2005. [12] valášek, m.: control of elastic industrial robots by nonlinear dynamic compensation. acta polytechnica, 33 (1), 1993, p. 15–30. [13] neusser, z., valášek, m.: control of the underactuated mechanical systems using natural motion. kybernetika; 48 (2), 2012, p. 223–241. 889 acta polytechnica 53(6):883–889, 2013 1 introduction 2 natural motion 3 problem formulation 4 how to get new input w 5 simulation of the natural motion 6 control design 7 simulation of the double inverted pendulum on the cart 8 conclusions acknowledgements references acta polytechnica doi:10.14311/ap.2014.54.0124 acta polytechnica 54(2):124–126, 2014 © czech technical university in prague, 2014 available online at http://ojs.cvut.cz/ojs/index.php/ap on two ways to look for mutually unbiased bases maurice r. kibler université de lyon, université claude bernard lyon 1 et cnrs/in2p3, institut de physique nucléaire, 4 rue enrico fermi, 69622 villeurbanne, france correspondence: kibler@ipnl.in2p3.fr abstract. two equivalent ways of looking for mutually unbiased bases are discussed in this note. the passage from the search for d + 1 mutually unbiased bases in cd to the search for d(d + 1) vectors in cd 2 satisfying constraint relations is clarified. symmetric informationally complete positive-operator-valued measures are briefly discussed in a similar vein. 
Keywords: finite-dimensional quantum mechanics, quantum information, MUBs, SIC POVMs, equiangular lines, equiangular vectors.

1. Introduction
The concept of mutually unbiased bases (MUBs) plays an important role in finite-dimensional quantum mechanics and quantum information (for more details, see [1–4] and references therein). Let us recall that two orthonormal bases $\{|a\alpha\rangle : \alpha = 0,1,\dots,d-1\}$ and $\{|b\beta\rangle : \beta = 0,1,\dots,d-1\}$ in the d-dimensional Hilbert space $\mathbb{C}^d$ (endowed with an inner product denoted $\langle\,|\,\rangle$) are said to be unbiased if the modulus of the inner product $\langle a\alpha|b\beta\rangle$ of any vector $|b\beta\rangle$ with any vector $|a\alpha\rangle$ is equal to $1/\sqrt{d}$. It is known that the maximum number of MUBs in $\mathbb{C}^d$ is d + 1, and that this number is reached when d is a power of a prime integer. In the case where d is not a power of a prime integer, it is not known whether one can construct d + 1 MUBs (see [4] for a review). In a recent paper [5], it was discussed how the search for d + 1 mutually unbiased bases in $\mathbb{C}^d$ can be approached via the search for d(d + 1) vectors in $\mathbb{C}^{d^2}$ satisfying constraint relations. The main aim of this note is to make the results in [5] more precise and to show that the two approaches (looking for d + 1 MUBs in $\mathbb{C}^d$ or for d(d + 1) vectors in $\mathbb{C}^{d^2}$) are entirely equivalent. The central results are presented in Sections 2 and 3. In Section 4, parallel developments for the search for a symmetric informationally complete positive-operator-valued measure (SIC POVM) are considered in the framework of similar approaches. Some concluding remarks are given in the last section.

2. The two approaches
It was shown in [5] how the problem of finding d + 1 MUBs in $\mathbb{C}^d$, i.e., d + 1 bases

$$ B_a = \{|a\alpha\rangle : \alpha = 0,1,\dots,d-1\} \qquad (1) $$

satisfying

$$ |\langle a\alpha|b\beta\rangle| = \delta_{\alpha,\beta}\delta_{a,b} + \frac{1}{\sqrt d}\,(1-\delta_{a,b}) \qquad (2) $$

can be transformed into the problem of finding d(d + 1) vectors $w(a\alpha)$ in $\mathbb{C}^{d^2}$, of components $w_{pq}(a\alpha)$, satisfying

$$ w_{pq}(a\alpha) = \overline{w_{qp}(a\alpha)}, \qquad p,q \in \mathbb{Z}/d\mathbb{Z}, \qquad (3) $$

$$ \sum_{p=0}^{d-1} w_{pp}(a\alpha) = 1, \qquad (4) $$

and

$$ \sum_{p=0}^{d-1}\sum_{q=0}^{d-1} w_{pq}(a\alpha)\,\overline{w_{pq}(b\beta)} = \delta_{\alpha,\beta}\delta_{a,b} + \frac{1}{d}\,(1-\delta_{a,b}), \qquad (5) $$

with $a,b = 0,1,\dots,d$ and $\alpha,\beta = 0,1,\dots,d-1$ in (1)–(5). (In this paper, the bar denotes complex conjugation.) This result was described by Proposition 1 in [5]. In fact, the equivalence of the two approaches (in $\mathbb{C}^d$ and $\mathbb{C}^{d^2}$) requires that each component $w_{pq}(a\alpha)$ be factorized as

$$ w_{pq}(a\alpha) = \omega_p(a\alpha)\,\overline{\omega_q(a\alpha)} \qquad (6) $$

for $a = 0,1,\dots,d$ and $\alpha = 0,1,\dots,d-1$, a condition satisfied by the example given in [5]. The factorization of $w_{pq}(a\alpha)$ follows from the fact that the operator $\Pi_{a\alpha}$ defined in [5] is a projection operator. The introduction of (6) into (3), (4) and (5) leads to some simplifications. First, (6) implies the hermiticity condition (3). Second, by introducing (6) into (4) and (5), we obtain

$$ \sum_{p=0}^{d-1} |\omega_p(a\alpha)|^2 = 1 \qquad (7) $$

and

$$ \left|\sum_{p=0}^{d-1} \omega_p(a\alpha)\,\overline{\omega_p(b\beta)}\right|^2 = \delta_{\alpha,\beta}\delta_{a,b} + \frac{1}{d}\,(1-\delta_{a,b}), \qquad (8) $$

respectively. It is clear that (7) follows from (8) with a = b and α = β. Therefore, (3) and (7) are redundant in view of (5) and (6). As a consequence, Proposition 1 in [5] can be made more precise and reformulated in the following way.

Proposition 1. For d ≥ 2, finding d + 1 MUBs in $\mathbb{C}^d$ (if they exist) is equivalent to finding d(d + 1) vectors $w(a\alpha)$ in $\mathbb{C}^{d^2}$, of components $w_{pq}(a\alpha)$, such that

$$ \sum_{p=0}^{d-1}\sum_{q=0}^{d-1} w_{pq}(a\alpha)\,\overline{w_{pq}(b\beta)} = \delta_{\alpha,\beta}\delta_{a,b} + \frac{1}{d}\,(1-\delta_{a,b}) \qquad (9) $$

and

$$ w_{pq}(a\alpha) = \omega_p(a\alpha)\,\overline{\omega_q(a\alpha)}, \qquad p,q \in \mathbb{Z}/d\mathbb{Z}, \qquad (10) $$

where $a,b = 0,1,\dots,d$ and $\alpha,\beta = 0,1,\dots,d-1$.
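Proposition 1 is easy to check numerically in a small dimension. The sketch below, an illustration that is not part of the original note, builds the standard d + 1 = 3 MUBs of $\mathbb{C}^2$, forms the vectors $w(a\alpha)$ of (10) from the projectors, and verifies the inner-product relations (9).

import itertools
import numpy as np

# Numerical check of Proposition 1 for d = 2: the three standard MUBs of
# C^2, the vectors w(a,alpha) in C^4 with w_pq = omega_p * conj(omega_q)
# (eq. (10)), and the test of eq. (9).
d = 2
s = 1 / np.sqrt(2)
bases = [
    [np.array([1, 0], complex), np.array([0, 1], complex)],           # B_0
    [s * np.array([1, 1], complex), s * np.array([1, -1], complex)],  # B_1
    [s * np.array([1, 1j], complex), s * np.array([1, -1j], complex)] # B_2
]

def w_vector(omega):
    # w_pq = omega_p * conj(omega_q), flattened into a vector of C^{d^2}
    return np.outer(omega, omega.conj()).reshape(d * d)

labels = [(a, al) for a in range(d + 1) for al in range(d)]
ok = True
for (a, al), (b, be) in itertools.product(labels, labels):
    lhs = np.vdot(w_vector(bases[b][be]), w_vector(bases[a][al])).real
    rhs = 1.0 if (a, al) == (b, be) else (0.0 if a == b else 1 / d)
    ok &= abs(lhs - rhs) < 1e-12
print("relations (9) satisfied:", ok)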
this result can be transcribed in matrix form. therefore, we have the following proposition. proposition 2. for d ≥ 2, finding d + 1 mubs in cd (if they exist) is equivalent to finding d(d + 1) matrices maα of dimension d, with elements (maα)pq = ωp(aα)ωq(aα), p,q ∈ z/dz (11) and satisfying the trace relations tr (maαmbβ) = δα,βδa,b + 1 d (1 − δa,b) (12) where a,b = 0, 1, . . . ,d and α,β = 0, 1, . . . ,d− 1. 3. equivalence suppose that we have a complete set {ba : a = 0, 1, . . . ,d} of d + 1 mubs in cd, i.e., d(d + 1) vectors |aα〉 satisfying (2), then we can find d(d + 1) vectors w(aα) in cd 2 , of components wpq(aα), satisfying (9) and (10). this can be achieved by introducing the projection operators πaα = |aα〉〈aα| (13) where a = 0, 1, . . . ,d and α = 0, 1, . . . ,d − 1. in fact, it is sufficient to develop πaα in terms of the epq generators of the gl(d,c) complex lie group; the coefficients of the development are nothing but the wpq(aα) complex numbers satisfying (9) and (10), see [5] for more details. reciprocally, should we find d(d + 1) vectors w(aα) in cd 2 , of components wpq(aα), satisfying (9) and (10), then we could construct d(d+1) vectors |aα〉 satisfying (2). this can be done by means of a diagonalization procedure of the matrices maα = d−1∑ p=0 d−1∑ q=0 wpq(aα)epq (14) where a = 0, 1, . . . ,d and α = 0, 1, . . . ,d − 1. an alternative and more simple way to obtain the |aα〉 vectors from the w(aα) vectors is as follows. equation (8) leads to∣∣∣∣∣ d−1∑ p=0 ωp(aα)ωp(bβ) ∣∣∣∣∣ = δα,βδa,b + 1√d(1 − δa,b) (15) to be compared with (2). then, the |aα〉 vectors can be constructed once the w(aα) vectors are known. the solution, in matrix form, is |aα〉 =   ω0(aα) ω1(aα) ... ωd−1(aα)   (16) a = 0, 1, . . . ,d α = 0, 1, . . . ,d− 1 (17) therefore, we can construct a complete set {ba : a = 0, 1, . . . ,d} of d + 1 mubs from the knowledge of d(d + 1) vectors w(aα). note that, for fixed a and α, the |aα〉 vector is an eigenvector of the maα matrix with eigenvalue 1. this establishes a link with the above-mentioned diagonalization procedure. 4. a parallel problem the present work takes its origin in [6], where some similar developments were achieved in the search of a sic povm. symmetric informationally complete positive-operator-valued measures play an important role in quantum information. their existence in arbitrary dimension is still the object of numerous studies (see for instance [7]). a sic povm in dimension d can be defined as a set of d2 nonnegative operators px = |φx〉〈φx| acting on cd and satisfying 1 d d2∑ x=1 px = i (18) and tr (pxpy) = dδx,y + 1 d + 1 (19) where i is the identity operator. the search for such a sic povm amounts to find d2 vectors |φx〉 in cd satisfying 1 d d2∑ x=1 |φx〉〈φx| = i (20) and |〈φx|φy〉| = √ dδx,y + 1 d + 1 (21) with x,y = 1, 2, . . . ,d2. the px operator can be developed as px = d−1∑ p=0 d−1∑ q=0 vpq(x)epq (22) so that the determination of d2 operators px (or d2 vectors |φx〉) is equivalent to the determination of d2 vectors v(x), of components vpq(x), in cd 2 . in the spirit of the preceding sections, we have the following result. 125 maurice r. kibler acta polytechnica proposition 3. for d ≥ 2, finding a sic povm in cd (if it exists) is equivalent to finding d2 vectors v(x) in cd 2 , of components vpq(x) such that 1 d d2∑ x=1 vpq(x) = δp,q, p,q ∈ z/dz (23) d−1∑ p=0 d−1∑ q=0 vpq(x)vpq(y) = dδx,y + 1 d + 1 (24) and vpq(x) = νp(x)νq(x), p,q ∈ z/dz (25) where x,y = 1, 2, . . . ,d2. 5. 
Concluding remarks
The equivalence discussed in this work of the two ways of looking at MUBs amounts, in some sense, to the equivalence between the search for equiangular lines in $\mathbb{C}^d$ and for equiangular vectors in $\mathbb{C}^{d^2}$ (cf. [8]). Equiangular lines in $\mathbb{C}^d$ correspond to

$$ |\langle a\alpha|b\beta\rangle| = \frac{1}{\sqrt d} \quad\text{for } a \neq b, \qquad (26) $$

while equiangular vectors in $\mathbb{C}^{d^2}$ correspond to

$$ w(a\alpha)\cdot w(b\beta) = \frac{1}{d} \quad\text{for } a \neq b, \qquad (27) $$

where the $w(a\alpha)\cdot w(b\beta)$ inner product in $\mathbb{C}^{d^2}$ is defined as

$$ w(a\alpha)\cdot w(b\beta) = \sum_{p=0}^{d-1}\sum_{q=0}^{d-1} w_{pq}(a\alpha)\,\overline{w_{pq}(b\beta)}. \qquad (28) $$

Observe that the modulus disappears and the $1/\sqrt d$ factor is replaced by $1/d$ when passing from (26) to (27). It was questioned in [5] whether the equiangular vectors approach can shed light on the still unsolved question whether one can find d + 1 MUBs when d is not a (strictly positive) power of a prime integer. In the case where d is not a power of a prime, the impossibility of finding d(d + 1) vectors $w(a\alpha)$ or d(d + 1) matrices $M_{a\alpha}$ satisfying the conditions in Propositions 1 and 2 would mean that d + 1 MUBs do not exist in $\mathbb{C}^d$. However, it is hard to know if one approach is better than the other. It is the hope of the author that the equiangular vectors approach can be tested in the d = 6 case, for which one knows only three MUBs instead of d + 1 = 7, in spite of numerous numerical studies (see [9–11] and references therein for an extensive list of related works). Similar remarks apply to SIC POVMs. The existence problem of SIC POVMs in arbitrary dimension is still unsolved, although SIC POVMs have been constructed in every dimension d ≤ 67 (see [7] and references therein). For SIC POVMs, the equiangular lines in $\mathbb{C}^d$ correspond to

$$ |\langle \phi_x|\phi_y\rangle| = \frac{1}{\sqrt{d+1}} \quad\text{for } x \neq y, \qquad (29) $$

and the equiangular vectors in $\mathbb{C}^{d^2}$ to

$$ v(x)\cdot v(y) = \frac{1}{d+1} \quad\text{for } x \neq y, \qquad (30) $$

where the $v(x)\cdot v(y)$ inner product in $\mathbb{C}^{d^2}$ is defined as

$$ v(x)\cdot v(y) = \sum_{p=0}^{d-1}\sum_{q=0}^{d-1} v_{pq}(x)\,\overline{v_{pq}(y)}. \qquad (31) $$

The parallel between MUBs and SIC POVMs, characterized by the couples of equations (26)–(29), (27)–(30) and (28)–(31), should be noted. These matters will be the subject of future work.

Acknowledgements
The material contained in the present note was planned to be presented at the eleventh edition of the workshop Analytic and Algebraic Methods in Physics (AAMP XI). Unfortunately, the author was unable to participate in AAMP XI. He is greatly indebted to Miloslav Znojil for suggesting that he submit this work to Acta Polytechnica.

References
[1] A. Vourdas. Quantum systems with finite Hilbert space. Rep. Prog. Phys. 67(3):267–320, 2004. doi:10.1088/0034-4885/67/3/R03
[2] J. Tolar, G. Chadzitaskos. Feynman's path integral and mutually unbiased bases. J. Phys. A: Math. Theor. 42(24):1–11, 2009. doi:10.1088/1751-8113/42/24/245306
[3] M. R. Kibler. An angular momentum approach to quadratic Fourier transform, Hadamard matrices, Gauss sums, mutually unbiased bases, unitary group and Pauli group. J. Phys. A: Math. Theor. 42(35):1–28, 2009. doi:10.1088/1751-8113/42/35/353001
[4] T. Durt, B.-G. Englert, I. Bengtsson, K. Życzkowski. On mutually unbiased bases. Int. J. Quantum Inf. 8(4):535–640, 2010. doi:10.1142/S0219749910006502
[5] M. R. Kibler. Equiangular vectors approach to mutually unbiased bases. Entropy 15(5):1726–1737, 2013. doi:10.3390/e15051726
[6] O. Albouy, M. R. Kibler. A unified approach to SIC-POVMs and MUBs. J. Russian Laser Res. 28(5):429–438, 2007. doi:10.1007/s10946-007-0032-5
[7] D. M. Appleby, C. A. Fuchs, H. Zhu. Group theoretic, Lie algebraic and Jordan algebraic formulations of the SIC existence problem. arXiv:1312.0555v1.
[8] C. Godsil, A. Roy.
[9] p. butterley, w. hall. numerical evidence for the maximum number of mutually unbiased bases in dimension six. phys lett a 369(1-2):5–8, 2007. doi: 10.1016/j.physleta.2007.04.059
[10] s. brierley, s. weigert. constructing mutually unbiased bases in dimension six. phys rev a 79(5):1–13, 2009. doi: 10.1103/physreva.79.052316
[11] d. mcnulty, s. weigert. on the impossibility to extend triples of mutually unbiased product bases in dimension six. int j quantum inf 10(5):1–11, 2012. doi: 10.1142/s0219749912500566

acta polytechnica 53(supplement):677–682, 2013. doi:10.14311/ap.2013.53.0677

accretion disks with a large scale magnetic field around black holes

gennady bisnovatyi-kogan a,b,∗, alexandr s. klepnev a,b, richard v.e. lovelace c
a space research institute rus. acad. sci., moscow, russia
b moscow engineering physics institute, moscow, russia
c cornell university, ithaca, usa
∗ corresponding author: gkogan@iki.rssi.ru

abstract. we consider accretion disks around black holes at high luminosity, and the problem of the formation of a large-scale magnetic field in such disks, taking into account the non-uniform vertical structure of the disk. the structure of advective accretion disks is investigated, and conditions for the formation of optically thin regions in central parts of the accretion disk are found. the high electrical conductivity of the outer layers of the disk prevents outward diffusion of the magnetic field. this implies a stationary state with a strong magnetic field in the inner parts of the accretion disk close to the black hole, and zero radial velocity at the surface of the disk. the problem of jet collimation by magneto-torsion oscillations is investigated.

keywords: accretion, black holes, jets, magnetic field.

1. introduction

quasars and agn contain supermassive black holes; about 10 hmxr (high-mass x-ray binaries) contain stellar mass black holes – microquasars. jets are observed in objects with black holes: collimated ejection from accretion disks. the standard model for accretion disks of shakura and sunyaev [20] is based on several simplifying assumptions. the disk must be geometrically thin and rotate at the kepler angular velocity. these assumptions make it possible to neglect radial gradients and to proceed from differential to algebraic equations. for low accretion rates ṁ, this assumption is fully appropriate. however, for high accretion rates, the disk structure may differ from the standard model. to solve the more general problem, advection and a radial pressure gradient have been included in the analysis of the disk structure by paczynski & bisnovatyi-kogan [19]. it was shown by artemova et al.
[1], that for large accretion rates there are no local solutions that are continuous over the entire region of existence of the disk and undergo kepler rotation. a self-consistent solution for an advective accretion disk with a continuous description of the entire region between the optically thin and optically thick regions has been obtained by artemova et al. [3], and klepnev and bisnovatyi-kogan [13].

early work on disk accretion to a black hole argued that a large-scale magnetic field of, for example, the interstellar medium would be dragged inward and greatly compressed by the accreting plasma [11, 12, 14]. subsequently, analytic models of the field advection and diffusion in a turbulent disk suggested that the large-scale field diffuses outward rapidly [15, 17], and prevents a significant amplification of the external poloidal field. this has led to the suggestion that special conditions (non-axisymmetry) are required for the field to be advected inward [21]. the question of the advection/diffusion of a large-scale magnetic field in a turbulent plasma accretion disk was reconsidered by bisnovatyi-kogan & lovelace [6, 7], taking into account its non-uniform vertical structure. the high electrical conductivity of the surface layers of the disk, where the turbulence is suppressed by the radiation flux and the high magnetic field, prevents outward diffusion of the magnetic field. this leads to a strong magnetic field in the inner parts of accretion disks.

2. basic equations for accretion disk structure

we use equations describing a thin, steady-state accretion disk, averaged over its thickness [3]. these equations include advection and can be used for any value of the vertical optical thickness of the disk. we use a pseudo-newtonian approximation for the structure of the disk near the black hole, where the effects of the general theory of relativity are taken into account using the paczyński & wiita [18] potential φ(r) = −gm/(r − 2r_g), where m is the mass of the black hole, and 2r_g = 2gm/c² is the gravitational radius. the self-gravitation of the disk is neglected, and the viscosity tensor component is t_rφ = −αp. the conservation of mass is expressed in the form ṁ = 4πrhρv, where ṁ is the accretion rate, ṁ > 0, and h is the half-thickness of the disk. the equilibrium in the vertical direction, dp/dz = −ρzΩ_k², is replaced by the algebraic relation h = c_s/Ω_k, where c_s = √(p/ρ) is the isothermal sound speed. the equations of motion in the radial and azimuthal directions are, respectively, written as

v dv/dr = −(1/ρ) dp/dr + (Ω² − Ω_k²) r, (1)

(ṁ/4π) dℓ/dr + d/dr (r² h t_rφ) = 0,

where Ω_k is the kepler angular velocity, given by Ω_k² = gm/[r (r − 2r_g)²]; ℓ = Ωr² is the specific angular momentum. other components of the viscosity tensor are assumed negligibly small. the vertically averaged equation for the energy balance is q_adv = q⁺ − q⁻, where

q_adv = −(ṁ/4πr) [de/dr + p d(1/ρ)/dr],

q⁺ = −(ṁ/4π) r Ω (dΩ/dr) (1 − ℓ_in/ℓ), (2)

q⁻ = [2at⁴c / 3(τ_α + τ_0)] · [1 + 4/(3(τ_0 + τ_α)) + 2/(3τ_*²)]⁻¹

are the energy fluxes (erg cm⁻² s⁻¹) associated with advection, viscous dissipation, and radiation from the surface, respectively; τ_0 is the thomson optical depth, and τ_0 = 0.4ρh for the hydrogen composition. we have introduced the optical thickness for absorption, τ_α ≈ 5.2×10²¹ ρ² t^{1/2} h / (a c t⁴), and the effective optical thickness τ_* = [(τ_0 + τ_α) τ_α]^{1/2}.
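as a small illustration (not from the paper), the paczyński & wiita potential above reproduces the position of the last stable orbit quoted in the next section, r = 6r_g, as the minimum of the keplerian specific angular momentum ℓ = Ω_k r²; a minimal sketch in units g = m = c = 1, so r_g = gm/c² = 1:

```python
import numpy as np

def omega_k(r):
    # Kepler angular velocity for the Paczynski-Wiita potential:
    # Omega_K^2 = GM / (r (r - 2 r_g)^2), here with G = M = r_g = 1
    return np.sqrt(1.0 / (r * (r - 2.0) ** 2))

r = np.linspace(3.0, 20.0, 100000)
l = omega_k(r) * r ** 2              # specific angular momentum l = Omega r^2
r_isco = r[np.argmin(l)]             # last stable orbit = minimum of l(r)
print(f"last stable orbit at r = {r_isco:.3f} r_g")   # -> 6 r_g
```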
the equation of state for a mixture of matter and radiation is p_tot = p_gas + p_rad. the gas pressure is given by the formula p_gas = ρrt, where r is the gas constant, and the radiation pressure is given by

p_rad = (at⁴/3) · [1 + 4/(3(τ_0 + τ_α))] / [1 + 4/(3(τ_0 + τ_α)) + 2/(3τ_*²)]. (3)

the specific energy of the mixture of the matter and radiation is determined as ρe = (3/2) p_gas + 3 p_rad. expressions for q⁻ and p_rad, valid for any optical thickness, were obtained by artemova et al. [1].

3. method of solution and numerical results

the system of differential and algebraic equations can be reduced to two ordinary differential equations,

(x/v) dv/dx = n/d, (4)

(x/v) dc_s/dx = 1 − (v²/c_s² − 1) n/d + (x²/c_s²) [ω² − 1/(x(x − 2)²)] + (3x − 2)/(2(x − 2)). (5)

here the numerator n and the denominator d are algebraic expressions depending on x, v, c_s, and ℓ_in; the equations are written in dimensionless form with x = r/r_g, r_g = gm/c². the velocities v and c_s have been scaled by the speed of light c, and the specific angular momentum ℓ_in by the value c·r_g. this system of differential equations has two singular points, defined by the conditions d = 0, n = 0. the inner singularity is situated near the last stable orbit with r = 6r_g. the outer singularity, lying at distances much greater than r_g, is an artifact arising from our use of the artificial parametrization t_rφ = −αp of the viscosity tensor.

the system of ordinary differential equations was solved by a finite difference method discussed by artemova et al. [2]. the method is based on reducing the system of differential equations to a system of nonlinear algebraic equations which are solved by an iterative newton–raphson scheme, with an expansion of the solution near the inner singularity and using ℓ_in as an independent variable in the iterative scheme [2]. the solution is almost independent of the outer boundary condition. the numerical solutions have been obtained for the structure of an accretion disk over a wide range of the parameters ṁ (ṁ = ṁc²/l_edd) and α. for low accretion rates, ṁ < 0.1, the solution for the advection model has τ_* ≫ 1, v ≪ c_s, and the angular velocity is close to the kepler velocity everywhere, except a very thin layer near the inner boundary of the disk. as the accretion rate increases, the situation changes significantly. the changes show up primarily in the inner region of the disk. the calculations made by klepnev & bisnovatyi-kogan [8] are presented in fig. 1, where the radial dependences of the temperature of the accretion disk are given for the accretion rate ṁ = 50 and different values of the viscosity parameter α = 0.01, 0.1 and 0.4.

figure 1. the radial dependence of the temperature of the accretion disk for an accretion rate ṁ = 50, and viscosity parameters α = 0.01 (dotted curve), α = 0.1 (smooth curve), and α = 0.4 (dashed curve).

clearly, for large ṁ and α the inner part of the disk becomes optically thin. because of this, a sharp increase in the temperature of the accretion disk is observed in this region. two distinct regions can be seen in the plot of the radial dependence of the temperature of the accretion disk. this is especially noticeable for a viscosity parameter α = 0.4, where one can see the inner optically thin region with a dominant non-equilibrium radiation pressure p_rad, and an outer region which is optically thick with dominant equilibrium radiation pressure. things are different when the viscosity parameter is small. only a small (considerably smaller than for α = 0.4) inner region becomes optically thin for accretion rates of ṁ ≈ 30÷70. meanwhile, in the case of α = 0.01, there are no optically thin regions at all.
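the behaviour of the bridging formula (3) in the optically thick and optically thin regimes just discussed can be checked numerically; the following minimal sketch (illustrative temperature and optical depths, not values from the paper) shows p_rad approaching the equilibrium value at⁴/3 for large optical depth and being strongly suppressed in the optically thin regime:

```python
# Numerical illustration of the bridging formula (3); temperature and
# optical depths below are illustrative, not values from the paper.
a = 7.566e-15          # radiation constant [erg cm^-3 K^-4]
T = 1e7                # illustrative temperature [K]

def p_rad(tau0, tau_a):
    tau_star2 = (tau0 + tau_a) * tau_a          # effective optical depth squared
    num = 1.0 + 4.0 / (3.0 * (tau0 + tau_a))
    den = num + 2.0 / (3.0 * tau_star2)
    return (a * T**4 / 3.0) * num / den

p_eq = a * T**4 / 3.0
print(p_rad(1e3, 1e2) / p_eq)    # ~1: optically thick, equilibrium value
print(p_rad(1e-2, 1e-3) / p_eq)  # <<1: optically thin, non-equilibrium regime
```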
4. the fully turbulent model

there are two limiting accretion disk models which have analytic solutions for a large-scale magnetic field structure. the first was constructed by bisnovatyi-kogan & ruzmaikin [12] for a stationary non-rotating accretion disk. a stationary state is maintained by the balance between magnetic and gravitational forces, and a local thermal balance is maintained by ohmic heating and radiative heat conductivity for optically thick conditions. the mass flux to the black hole in the accretion disk is determined by the finite electrical conductivity of the disk matter and the diffusion of matter across the large-scale magnetic field. it is widely accepted that the laminar disk is unstable to different hydrodynamic, magneto-hydrodynamic and plasma instabilities, which implies that the disk is turbulent. in x-ray binary systems the assumption of a turbulent accretion disk is necessary for construction of realistic models [20]. the turbulent accretion disks were constructed for non-rotating models with a large-scale magnetic field. a formula for turbulent magnetic diffusivity was derived by bisnovatyi-kogan and ruzmaikin [12], similar to the scaling of the shear α-viscosity in a turbulent accretion disk in binaries [20]. using this representation, the expression for the turbulent electrical conductivity σ_t is written as

σ_t = c² / (α̃ 4π h √(p/ρ)). (6)

here, α̃ = α_1 α_2. the characteristic turbulence scale is ℓ = α_1 h, where h is the half-thickness of the disk, and the characteristic turbulent velocity is v_t = α_2 √(p/ρ).

the large-scale magnetic field threading a turbulent keplerian disk arises from external electrical currents and currents in the accretion disk. the magnetic field may become dynamically important, influencing the accretion disk structure, and leading to powerful jet formation, if it is strongly amplified during the radial inflow of the disk matter. this is possible only when the radial accretion speed of matter in the disk is larger than the outward diffusion speed of the poloidal magnetic field due to the turbulent diffusivity η_t = c²/(4πσ_t). estimates by lubow, papaloizou & pringle [17] have shown that for turbulent conductivity (eq. 6), the outward diffusion speed is larger than the accretion speed, and there is no large-scale magnetic field amplification. the numerical calculations of lubow, papaloizou & pringle [17] are reproduced analytically for the standard accretion disk structure by bisnovatyi-kogan & lovelace [6, 7].

figure 2. sketch of the large-scale poloidal magnetic field threading a rotating turbulent accretion disk with a radiative outer boundary layer. the toroidal current flows mainly in the highly conductive radiative layers. the large-scale (average) field in the turbulent region is almost vertical.

the characteristic time t_visc of the matter advection due to the shear viscosity is t_visc = r/v_r = j/(αv_s²). the time of the magnetic field diffusion is t_diff = (r²/η)(h/r)(b_z/b_r), with η = c²/(4πσ_t) = α̃ h v_s. in the stationary state, the large-scale magnetic field in the accretion disk is determined by the equality t_visc = t_diff, which determines the ratio

b_r/b_z = (α/α̃)(v_s/v_k) = (α/α̃)(h/r) ≪ 1,

with v_k = rΩ_k and j = r v_k for a keplerian disk. in a turbulent disk, matter penetrates through magnetic field lines, almost without field amplification: the field induced by the azimuthal disk currents has b_zd ∼ b_rd.
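plugging representative numbers into the ratio just derived makes the conclusion explicit; the values of α, α̃ and h/r below are illustrative assumptions, not taken from [6, 7]:

```python
# Illustrative numbers for a thin Keplerian disk (assumed values):
alpha, alpha_t = 0.1, 0.1    # shear viscosity and turbulent diffusivity parameters
h_over_r = 0.05              # disk aspect ratio h/r
br_over_bz = (alpha / alpha_t) * h_over_r
print(f"B_r/B_z ~ {br_over_bz:.3f}")
# ~0.05 << 1: in a fully turbulent disk the inflow produces essentially
# no amplification of the large-scale poloidal field
```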
5. turbulent disk with radiative outer zones

near the surface of the disk, in the region of low optical depth, the turbulent motion is suppressed by the radiative and magnetic fluxes, similar to the suppression of the convection over the photospheres of stars with outer convective zones. the presence of the outer radiative layer does not affect the characteristic time t_visc of the matter advection in the accretion disk, determined by the main turbulent part of the disk. the time of the field diffusion, however, is significantly changed, because the electrical current is concentrated in the radiative highly conductive regions, which generate the main part of the magnetic field. the structure of the magnetic field with outer radiative layers is shown in fig. 2. inside the turbulent disk the electrical current is negligibly small, so that the magnetic field there is almost fully vertical, with b_r ≪ b_z. in the outer radiative layer, the field diffusion is very small, so that the matter advection leads to strong magnetic field amplification. we suppose that in the stationary state the magnetic forces support the optically thin regions against gravity. when the magnetic force balances the gravitational force in the optically thin part of the disk of surface density σ_ph, the relation

gmσ_ph/r² ≃ b_z i_φ/(2c) ≃ b_z²/(4π) (7)

takes place [12]. the surface density over the photosphere corresponds to a layer with effective optical depth close to 2/3 (see e.g. [5]). we estimate the lower limit of the magnetic field strength, taking κ_es (instead of the effective opacity κ_eff = √(κ_es κ_a), κ_a ≪ κ_es). writing κ_es σ_ph = 2/3, we obtain σ_ph = 5/3 g cm⁻², for the thomson scattering opacity κ_es = 0.4 cm² g⁻¹. we estimate the lower bound on the large-scale magnetic field in a keplerian accretion disk as [6, 7]

b_z = √(5π/3) · c²/√(gm☉) · 1/(x√m) ≈ 10⁸ g · 1/(x√m). (8)

here x = r/r_g, m = m/m☉. the maximum magnetic field is reached when the outward magnetic force balances the gravitational force on the surface with a mass density σ_ph. in equilibrium, b_z ∼ √σ_ph. we find that b_z in a keplerian accretion disk is about 20 times less than its maximum possible value from bisnovatyi-kogan & ruzmaikin [12], for x = 10, α = 0.1, and ṁ = 10.
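a direct numerical evaluation of the prefactor in (8), with cgs constants, reproduces the quoted scale of 10⁸ g; a minimal sketch:

```python
import numpy as np

# Numerical check of the lower bound (8), cgs units
c  = 2.998e10        # speed of light [cm/s]
G  = 6.674e-8        # gravitational constant [cgs]
Ms = 1.989e33        # solar mass [g]

coeff = np.sqrt(5.0 * np.pi / 3.0) * c**2 / np.sqrt(G * Ms)
print(f"prefactor = {coeff:.2e} G")     # ~1.8e8 G, i.e. ~10^8 G as quoted

x, m = 10.0, 10.0
print(f"B_z(x=10, m=10) ~ {coeff / (x * np.sqrt(m)):.2e} G")   # ~5.6e6 G
```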
6. self-consistent numerical model

self-consistent models of the rotating accretion disks with a large-scale magnetic field require solution of the equations of magneto-hydrodynamics. the strong field solution is the only stable stationary solution for a rotating accretion disk. the vertical structure of the disk with a large-scale poloidal magnetic field was calculated by lovelace, rothstein & bisnovatyi-kogan [16], taking into account the turbulent viscosity and diffusivity, and the fact that the turbulence vanishes at the surface of the disk. the coefficients of the turbulent viscosity ν and magnetic diffusivity η are connected by the magnetic prandtl number p ∼ 1, ν = pη = α (c_s0²/Ω_k) g(z), where α is a constant determining the turbulent viscosity [20]; β = c_s0²/v_a0², where v_a0 = b_0/(4πρ_0)^{1/2} is the midplane alfvén velocity. the function g(z) accounts for the absence of turbulence in the surface layer of the disk. in the body of the disk g = 1, whereas near the surface of the disk g tends over a short distance to a very small value, effectively zero. a smooth function with similar behavior is taken by lovelace, rothstein & bisnovatyi-kogan [16] in the form g(ζ) = (1 − ζ²/ζ_s²)^δ, with δ ≪ 1. in the stationary state the boundary condition on the disk surface is u_r = 0, and only one free parameter – the magnetic prandtl number p – remains in the problem. in a stationary disk, the vertical magnetic field has a unique value. an example of the radial velocity distribution for p = 1 is shown in fig. 3 from bisnovatyi-kogan & lovelace [8, 9].

figure 3. distribution of the radial velocity over the thickness in the stationary accretion disk with a large scale poloidal magnetic field.

figure 4. qualitative picture of jet confinement by magneto-torsional oscillations.

7. jet collimation by magneto-torsional oscillations

following bisnovatyi-kogan [6, 7], we consider the stabilization of a jet by a magneto-hydrodynamic mechanism associated with torsional oscillations. we suggest that the matter in the jet is rotating, and different parts of the jet rotate in different directions, see fig. 4. such a distribution of the rotational velocity produces an azimuthal magnetic field, which prevents a disruption of the jet. the jet represents a periodical, or quasi-periodical, structure along the axis, and its radius oscillates with time all along the axis. the space and time periods of the oscillations depend on the conditions at jet formation: the lengthscale, the amplitude of the rotational velocity, and the strength of the magnetic field. the time period of the oscillations can be obtained during the construction of the dynamical model, and the model should also show at what input parameters a long jet stabilized by torsional oscillations could exist. let us consider a long cylinder with a magnetic field directed along its axis. it is possible that a limiting value of the radius of the cylinder could be reached in a dynamic state, when the whole cylinder undergoes
a simplified equation describing the magnetotorsional oscillations of a long cylinder was obtained by bisnovatyi-kogan [6, 7]. it describes approximately the time dependence of the outer radius of the cylinder r(t) in the symmetry plane, where the rotational velocity remains zero. the equation contains a dimensionless parameter d, which determines the dynamic behavior of the cylinder. an example of the dynamically stabilized cylinder at d = 2.1 is given in fig. 5, from bisnovatyi-kogan [6, 7], y and z are the non-dimensional radius and the radial velocity, respectively. the transition to a stochastic regime in these oscillations was investigated by bisnovatyi-kogan et al. [10]. 8. discussion we have obtained an unambiguous solution for the structure of an advection accretion disk surrounding a nonrotating black hole for different values of the viscosity parameter and the accretion rate. this solution is global, trans-sonic, and, for high ṁ and α, is characterized by a continuous transition of the disk from optically thick in the outer region to optically thin in the inner region. it has a temperature peak in the inner (optically thin) region, which might cause the appearance of a hard component in the spectrum. for a rotating black hole, the peak temperature is so high that it may lead to the formation of electronpositron pairs and change the emission spectrum of the disk at energies of 500 kev and above. preliminary calculations have been made for a disk around a rapidly rotating black hole, with quasi-newtonian gravitational potential, approximating the effects of the kerr metric [4]. we obtain that, for a sufficiently large kerr rotation parameter, the temperature in the optically thin inner region may substantially exceed 500 kev. a consideration with a self-consistent account of pair creation is under way. in the presence of a large scale magnetic field we may expect the formation of relativistic jets with a high lepton excess. the inner optically thin region may exist only at α >∼ 0.01. this is because at very high ṁ large optical thickness is associated with high density in the inner regions of the disk; at low ṁ large effective optical depth is connected with high density because of low temperature. therefore, the effective optical depth has a minimum at intermediate values of ṁ, and for α ≤∼ 0.01 this minimum turns out to be greater than unity. the poloidal magnetic field is amplified during disk accretion, due to high conductivity in the outer radiative layers. a stationary solution is obtained corresponding to β = 240, for pr = 1. note that the value of β is obtained using the density of the disk in the symmetry plane. the local value of β in the outer radiative regions is much lower, and approximately corresponds to equipartition between the pressure of a gas and the magnetic field. 9. conclusions (1.) a global, trans-sonic solution exists, which at high ṁ and α is characterized by a continuous transition of the disk from optically thick in the outer region to optically thin in the inner region. (2.) the model, with correct accounting for the transition between the optically thick and optically thin regions, reveals the existence of a temperature peak in the inner (optically thin) region, which may cause the appearance of a hard component in the spectrum. a high temperature in the inner region of an accretion disk may lead to the formation of electron– positron pairs (in the kerr metric). 681 gennady bisnovatyi-kogan, alexandr s. klepnev, richard v.e. 
(3.) when α = 0.5, a very substantial optically thin region is observed, when α = 0.1 we have a slight optically thin region, and when α = 0.01 no optically thin region is seen at all. (4.) the magnetic field is amplified during disk accretion due to high conductivity in the outer radiative layers. the stationary solution corresponds to β = 240 for pr = 1. (5.) the jets from the accretion disk are magnetically collimated in the presence of a large-scale poloidal magnetic field, by torsion oscillations, which may be regular or chaotic. jets may be produced in magneto-rotational explosions (supernova, etc.).

acknowledgements

the work of gsbk, ask was partially supported by rfbr grant 11-02-00602, by the ran program "origin, formation and evolution of objects of universe" and president grant nsh-3458.2010.2. r.v.e.l was supported in part by nasa grants nnx08ah25g and nnx10af63g and by nsf grant ast-0807129. gsbk is grateful to the organizers of the workshop for support.

references

[1] artemova, y.v., bisnovatyi-kogan, g.s., bjornsson, g., novikov, i.d.: 1996, apj, 456, 119
[2] artemova, y.v., bisnovatyi-kogan, g.s., igumenshchev, i.v., novikov, i.d.: 2001, apj, 549, 1050
[3] artemova, y.v., bisnovatyi-kogan, g.s., igumenshchev, i.v., novikov, i.d.: 2006, apj, 637, 968
[4] artemova, y.v., bjornsson, g., novikov, i.d.: 1996a, apj, 461, 565
[5] bisnovatyi-kogan, g.s.: 2001, stellar physics. vol. 1, 2. berlin: springer
[6] bisnovatyi-kogan, g.s.: 2007, mnras, 376, 457
[7] bisnovatyi-kogan, g.s., lovelace, r.v.e.: 2007, apjl, 667, l167
[8] bisnovatyi-kogan, g.s., lovelace, r.v.e.: 2010, pos (texas 2010) 008; arxiv:1104.4866
[9] bisnovatyi-kogan, g.s., lovelace, r.v.e.: 2012, apj, 750, 109
[10] bisnovatyi-kogan, g.s., neishtadt, a.i., seidov, z.f., tsupko, o.yu., krivosheev, yu.m.: 2011, mnras, 416, 747
[11] bisnovatyi-kogan, g.s., ruzmaikin, a.a.: 1974, ap&ss, 28, 45
[12] bisnovatyi-kogan, g.s., ruzmaikin, a.a.: 1976, ap&ss, 42, 401
[13] klepnev, a.s., bisnovatyi-kogan, g.s.: 2010, astrophysics, 53, 409
[14] lovelace, r.v.e.: 1976, nature, 262, 649
[15] lovelace, r.v.e., romanova, m.m., newman, w.i.: 1994, apj, 437, 136
[16] lovelace, r.v.e., rothstein, d.m., bisnovatyi-kogan, g.s.: 2009, apj, 701, 885
[17] lubow, s.h., papaloizou, j.c.b., pringle, j.e.: 1994, mnras, 267, 235
[18] paczyński, b., wiita, p.j.: 1980, astron. ap., 88, 23
[19] paczyński, b., bisnovatyi-kogan, g.s.: 1981, acta astr., 31, 283
[20] shakura, n.i., sunyaev, r.a.: 1973, astron. ap., 24, 337
[21] spruit, h.c., uzdensky, d.a.: 2005, apj, 629, 960

acta polytechnica 53(2):134–137, 2013

the construction of the fast resistive bolometer for a sxr measurement on the git-12 facility

jakub cikhardt a,∗, daniel klír a, pavel kubeš a, josef kravárik a, karel řezáč a, ondřej šíla a, alexander v. shishlov b, alexey yu. labetsky b
a department of physics, faculty of electrical engineering, czech technical university in prague, czech republic
b institute of high current electronics, siberian branch of the russian academy of sciences in tomsk, russian federation
∗ corresponding author: cikhajak@fel.cvut.cz

abstract. many kinds of instruments are used for sxr measurement at pulsed power facilities, but most of them are difficult to calibrate absolutely. for the determination of the energy of sxr radiated by z-pinch discharges, it is possible to use a bolometer which can be calibrated analytically. the bolometer can be constructed with sufficient sensitivity and, at the same time, with a time resolution of the order of nanoseconds. such a bolometer was designed and constructed for measurements on the 5 ma facility git-12 at the institute of high current electronics (ihce) of the siberian branch of the russian academy of sciences in tomsk. the experiments on git-12 with neon and deuterium gas-puff loads were diagnosed by a copper bolometer with a time resolution of 4 ns and a sensitivity of 12 v cm² j⁻¹.

keywords: bolometers, x-rays, z-pinches, plasma diagnostics.

1. introduction

the ohmic bolometer is based on the principle of transformation of radiated energy into thermal energy. this leads to a change of the resistance according to the well known formula

r = r_0 (1 + αΔϑ), (1)

where r_0 is the initial resistance, α is the thermal coefficient of the resistivity and Δϑ is the change of the temperature. in the case of metallic bolometers, the coefficient α can be considered as a positive constant, thus the dependence of the resistivity on the temperature is linear. the parameters and design are dependent on the application of the bolometer. bolometers are used in microwave technology, pyrometry, astronomy and in plasma diagnostics. the bolometer described in this paper is designed for measurement of the pulses of sxrs produced by a short-lived plasma generated by pulsed power generators, especially large z-pinches. the experiments with this bolometer were performed with triple shell neon and deuterium gas-puffs. during the stagnation of the dense neon gas-puff, most of the kinetic energy of the z-pinch is radiated in neon k-shell lines (hν ≅ 0.9÷1.4 kev) [4]. in the case of deuterium, the main component of the radiation is the continuous bremsstrahlung. in this case the sensitive element of the bolometer is a very thin metal stripe. the dimensions and material of the stripe were determined with regard to the requirements for the bolometer:
• sufficient sensitivity which is constant over a wide range of photon energies,
• time resolution of the order of nanoseconds,
• ability to work in environments with extreme electromagnetic noise.

2. parameters determination

2.1. sensitivity

the sensitivity s of this detector is determined as the ratio between the change of the output voltage and the fluence ϱ at the point where the bolometer is placed,

s = Δu/(Δq/s) = Δu/Δϱ, (2)

where s is the surface of the bolometer sensitive element. using eq. 1 and other well known physical formulas we obtain the sensitivity determined by the dimensions of the sensitive element, the material constants and the bolometer current,

s = [ρ_e l α / (d² c ρ w)] i, (3)

where ρ_e is the resistivity, α is the thermal coefficient of the resistivity, c is the specific heat, ρ is the density of the material, l, d and w are the length, thickness and width of the sensitive element, respectively, and i is the current of the bolometer.
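as a worked example (not from the paper), eq. (3) can be evaluated with textbook material constants for copper and the stripe dimensions quoted in tab. 1; since the text only states that the supply current is at least 10 a, the current below is an assumed value chosen to reproduce the quoted sensitivity of 12.43 v cm² j⁻¹:

```python
# Illustrative evaluation of eq. (3) for the 2 um copper stripe.
# Material constants are textbook values for copper; the supply current
# is an assumption (the text only states i >= 10 A).
rho_e = 1.7e-8     # resistivity [ohm m]
alpha = 3.9e-3     # thermal coefficient of resistivity [1/K]
c     = 385.0      # specific heat [J/(kg K)]
rho   = 8960.0     # density [kg/m^3]
l, d, w = 30e-3, 2e-6, 2e-3   # stripe length, thickness, width [m]
i = 17.0           # assumed bolometer current [A]

s = rho_e * l * alpha / (d**2 * c * rho * w) * i   # eq. (3), [V m^2 / J]
print(f"sensitivity = {s * 1e4:.1f} V cm^2/J")      # ~12 V cm^2/J

r0 = rho_e * l / (w * d)    # initial stripe resistance
print(f"r0 = {r0:.3f} ohm") # ~0.128 ohm, cf. the 0.126 ohm in tab. 1
```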
2.2. range of photon energies

the photon energies which can be measured with the bolometer are determined by the radiation absorption. for our purpose the data from [3] were used.

figure 1. the absorption characteristics of 2 micron metal foils: a) aluminium foil, b) copper foil.

in fig. 1 the absorption of the aluminium foil has a minimum at about 1.6 kev. this minimum corresponds to the k-shell edge of the aluminium [8]. if the bolometer is made of this foil, the detector will be selective. the absorption characteristic of 2 micron thick copper foil can be seen in fig. 1b. this characteristic is much more "smooth" than the characteristic of the aluminium. for our application, a nonselective detector is needed in the range up to 2 kev, therefore the copper foil was chosen. if the photon energy range is considered as the region where the sensitivity is not less than 1/√2 of the maximal sensitivity, the energy range of the bolometer made of 2 micron thick copper is up to 3 kev.

2.3. time resolution

the time resolution of bolometers is defined by the characteristic time needed for equilibration of the temperature in the sensitive element. if the front side of the metal foil is irradiated by a short sxr pulse, the absorbed energy will have an exponential dependence on the depth inside the foil [1]. the energy of sxr absorbed in the material is transformed into thermal energy. therefore we can consider that the dependence of temperature on the depth inside the metal foil is also exponential,

ϑ(x, 0) = ϑ(0, 0) e^{−γρx}, (4)

where x is the distance from the irradiated surface of the foil and γ is the absorption coefficient of the material. we also consider that there is no heat transfer between the sensitive element and the surroundings, because in our detector it is usually placed in vacuum. therefore we can write

∂ϑ(0, t)/∂x = ∂ϑ(d, t)/∂x = 0. (5)

using the conditions eq. 4 and eq. 5 in the one-dimensional heat transfer equation [6]

∂²ϑ(x, t)/∂x² = (1/χ) ∂ϑ(x, t)/∂t (6)

we obtain the solution

ϑ(x, t) = ϑ_0 (1 − e^{−γρd})/(γρd) + [2ϑ_0/(γρd)] Σ_{n=1}^{∞} {[1 − (−1)ⁿ e^{−γρd}] / [(πn/(γρd))² + 1]} cos(πnx/d) e^{−(πn/d)² χt}, (7)

where ϑ_0 = ϑ(0, 0) and χ is the thermal diffusivity. for the determination of the characteristic time, it is useful to study the thermal difference between the two sides of the foil, ϑ(0, t) − ϑ(d, t). the characteristic time of the bolometer can be defined as the time at which this temperature difference drops to 1/e of its maximum value; when this happens, we consider the temperature in the foil to be equilibrated. this temperature difference is plotted in fig. 2.

figure 2. the time dependence of the temperature difference of the sides of the 2 micron copper foil.

the characteristic time of the bolometer with a sensitive element made of 2 micron copper foil is 4 ns. the most important properties of the designed bolometer are shown in tab. 1.
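the 4 ns figure can be recovered from the slowest fourier mode of the solution (7), which decays as e^{−(π/d)²χt}; a minimal numerical sketch using a textbook value for the thermal diffusivity of copper (an assumption, since χ is not quoted in the text):

```python
import numpy as np

# The slowest (n = 1) mode of eq. (7) decays as exp(-(pi/d)^2 * chi * t),
# which sets the characteristic time of the foil.
chi = 1.11e-4                  # thermal diffusivity of copper [m^2/s] (textbook value)
d = 2e-6                       # foil thickness [m]
tau = (d / np.pi) ** 2 / chi   # e-folding time of the n = 1 mode
print(f"characteristic time ~ {tau * 1e9:.1f} ns")   # ~3.7 ns, cf. the 4 ns quoted
```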
3. construction

3.1. bolometric detector

the cross-section of the bolometric detector is shown in fig. 3. the foil is fixed by mechanical clamps. because the resistance of the foil is low (r_0 = 0.126 ω), the four-clamp method was used for connection to the electric circuit. the structure of the sensitive element and the surrounding circuits is symmetrical in order to achieve a lower inductance. a screen with an aperture is placed above the sensitive element. this screen protects the passive parts of the detector against irradiation and it allows a filter to be placed between the sensitive element and the source of radiation.

table 1. the properties of the designed bolometer:
material: copper
thickness: 2 microns
width: 2 mm
length: 30 mm
initial resistance: 0.126 ω
time resolution: 4 ns
sensitivity: 12.43 v cm² j⁻¹

figure 3. cross-section of the bolometric detector.

3.2. power supply

to achieve the required sensitivity, it is necessary to supply the bolometer with a current of at least 10 a. to avoid melting of the thin foil it is necessary to supply the bolometer with a short current pulse, and at the same time the power supply pulse must be longer than the duration of the sxr pulse. we chose a supply pulse length of 10 µs. the bolometer changes its resistance because of the temperature change; therefore we need to know the bolometer voltage and at the same time the bolometer current. we constructed a mos-fet pulsed power supply with a defined output current, thus we need to register the bolometer output voltage only. the comparison of the measured and simulated dependence of the power supply current on the load resistance is shown in fig. 4.

figure 4. the comparison of the measured and simulated dependence of the power supply current on the load resistance.
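to make explicit how a recorded voltage step is converted into a radiated energy in the experiments of the next section, the following worked sketch inverts eq. (2), assuming an isotropic source at the 2 m bolometer distance (the isotropy assumption and the ~1.24 v step are illustrative, chosen to match the 50 kj reported for shot no. 1):

```python
import numpy as np

# Inversion from voltage step to radiated energy, assuming isotropic
# emission from the pinch (assumption; the text gives only the 2 m
# distance and the sensitivity from tab. 1).
s = 12.43          # sensitivity [V cm^2 / J]
R = 200.0          # source-detector distance [cm]

def radiated_energy(delta_u):
    fluence = delta_u / s                 # [J/cm^2], from eq. (2)
    return 4.0 * np.pi * R**2 * fluence   # isotropic source [J]

# a ~1.24 V step corresponds to ~50 kJ, the value reported for shot no. 1
print(f"{radiated_energy(1.24) / 1e3:.0f} kJ")
```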
for a rough estimation of the kinetic energy of the z-pinch can be used the formula [8] ek = 2.3i2ml (kj), (8) where im is the maximum of the current peak in ma and l is the length of the discharge in cm. 136 vol. 53 no. 2/2013 the construction of the fast resistive bolometer figure 5. the signal of the bolometric detector during the discharge in comparison with detector’s signal without the irradiation. shot current triple shell sxr energy no. (ma) gas-puff (kj) 1 3.73 ne-ne-ne 50 2 3.55 d2-d2-d2 13 3 3.64 d2-d2-d2 22 4 2.58 d2-d2-d2 21 5 3.60 ne-d2-d2 28 6 3.36 ne-d2-d2 23 7 3.36 ne-d2-d2 26 8 3.10 ne-d2-d2 18 9 3.28 ne-d2-d2 24 10 3.66 ar-d2-d2 14 11 2.16 d2-d2-d2 27 table 2. results of the sxr measurement at the experiments on git-12 facility. substituting values from shot no. 1 we obtain the kinetic energy about 64 kj. it corresponds to measured values and to the assumption that the most of kinetic energy of the neon gas-puff is changed to the energy of the radiation. the temperature of the bolometer at the steady state can be determined by the well known formula ∆ϑ = ∆q/(cm), where ∆q is an absorbed energy, c is a specific heat capacity and m is the mass of the bolometer foil. substituting the parameters of the bolometer and radiated energy of 50 kj from the shot no. 1, we obtain the change of the temperature of 144 k. if the initial temperature of the bolometer was 293 k, the temperature after irradiation is 437 k. so the melting point of the copper at 1357 k was not achieved. the accuracy of this measurement is decreased by several error factors. the most significant factor is electromagnetic noise. for example the rms value of this noise in shot no. 1 achieved to 7 % of the useful signal. including the noise, uncertainty of the stability of the power supply current and uncertainty of the dimensions of the bolometer foil, the total uncertainty of the measured radiated energy is about 10 %. 5. conclusion the bolometric detector for pulse sxr measurement was designed and constructed. the experiments on the git-12 facility was performed with this detector. the sensitivity, noise immunity and uncertainty of the detector is sufficient for the measurement at the terawatt generator git-12. amount of energies were measured in the range from 13 kj to 50 kj. it depended on the current peak and gas of the load and its amount. the measurements confirmed the assumption that the discharges in the deuterium produced less amount of the radiated energy of sxrs than the discharges in gases with higher atomic number. however the energy of sxrs radiated during the discharge in deuterium gas-puff was surprisingly high. probably it is the radiation of the material of electrodes. the measured values corresponds to the theoretical estimation of the kinetic energy of the gas-puff. because the amount of the radiated energy carries the information about the real kinetic energy of the gas-puff, it can be used for the gas-puff optimization as a scale. for determination of the dependences of the amount of the radiated energy on the gas-puff parameters and for verification of the detector’s time resolution we need more experimental shots which are scheduled on spring 2013. acknowledgements this work was supported by the research program msmt no. la08024, me09087 and gacr p205/12/0454, iaea rc 14817 and ctu sgs 10/266/ohk3/3t/13. references [1] d. attwood. soft x-rays and extreme ultraviolet radiation: principles and applications. canbridge university press, 1999. [2] j. cikhardt. 
5. conclusion

the bolometric detector for pulsed sxr measurement was designed and constructed, and the experiments on the git-12 facility were performed with this detector. the sensitivity, noise immunity and uncertainty of the detector are sufficient for measurements at the terawatt generator git-12. the measured energies were in the range from 13 kj to 50 kj, depending on the current peak and on the gas of the load and its amount. the measurements confirmed the assumption that the discharges in deuterium produce a smaller amount of radiated sxr energy than the discharges in gases with a higher atomic number. however, the energy of sxrs radiated during the discharge in the deuterium gas-puff was surprisingly high; probably it is the radiation of the material of the electrodes. the measured values correspond to the theoretical estimation of the kinetic energy of the gas-puff. because the amount of the radiated energy carries information about the real kinetic energy of the gas-puff, it can be used as a scale for gas-puff optimization. for the determination of the dependences of the amount of the radiated energy on the gas-puff parameters, and for verification of the detector's time resolution, we need more experimental shots, which are scheduled for spring 2013.

acknowledgements

this work was supported by the research program msmt no. la08024, me09087 and gacr p205/12/0454, iaea rc 14817 and ctu sgs 10/266/ohk3/3t/13.

references

[1] d. attwood. soft x-rays and extreme ultraviolet radiation: principles and applications. cambridge university press, 1999.
[2] j. cikhardt. konstrukce rychleho bolometru pro mereni intenzity impulsniho mekkeho rentgenového zareni [construction of a fast bolometer for measuring the intensity of pulsed soft x-ray radiation]. fee ctu, prague, 2012. diploma thesis.
[3] e. m. gullikson. filter transmission. http://henke.lbl.gov/optical_constants/filter2.html. the center for x-ray optics.
[4] a. yu. labetsky, v. a. kokshenev, n. e. kurmaev, et al. study of the current-sheath formation during the implosion of multishell gas puffs. plasma physics reports 34(3):228–238, 2008.
[5] m. a. liberman, j. s. de groot, a. toor, r. b. spielman. x-ray data booklet. berkley. lawrence berkley national laboratory university of california, 2009. 3rd edition.
[6] j. h. lienhard iv, j. h. lienhard v. a heat transfer textbook. phlogiston press, cambridge, massachusetts, 2008. 3rd edition.
[7] r. b. spielman, c. deeney, d. l. fehl, et al. fast resistive bolometer. tech. rep. sand98-1987, sandia national laboratories, 1998.
[8] a. thompson, et al. x-ray data booklet. lawrence berkley national laboratory university of california, berkley, 2009. 3rd edition.

acta polytechnica vol. 41 no. 4 – 5/2001

computational investigation of flows in diffusing s-shaped intakes

r. menzies

this paper examines the flow in a diffusing s-shaped aircraft air intake using computational fluid dynamics (cfd) simulations. diffusing s-shaped ducts such as the rae intake model 2129 (m2129) give rise to complex flow patterns that develop as a result of the offset between the intake cowl plane and engine face plane. euler results compare favourably with experiment and previous calculations for a low mass flow case. for a high mass flow case a converged steady solution was not found and the problem was then simulated using an unsteady flow solver. a choked flow at the intake throat and complex shock reflection system, together with a highly unsteady flow downstream of the first bend, yielded results that did not compare well with previous experimental data. previous work had also experienced this problem and a modification to the geometry to account for flow separation was required to obtain a steady flow. rans results utilising a selection of turbulence models were more satisfactory. the low mass flow case showed good comparison with experiment and previous calculations. a problem of the low mass flow case is the prediction of secondary flow. it was found that the sst turbulence model best predicted this feature. fully converged high mass flow results were obtained. once more, sst results proved to match experiment and previous computations the best. problems with the prediction of the flow in the cowl region of the duct were experienced with the s-a and k-ω models. one of the main problems of turbulence closures in intake flows is the transition of the freestream from laminar to turbulent over the intake cowl region. it is likely that the improvement in this prediction using the sst turbulence model will lead to more satisfactory results for both high and low mass flow rates.

keywords: aerodynamics, computational fluid dynamics, internal flows, turbulence models.

1 introduction

for over three decades the study of air intakes has led to design improvements based on wind tunnel test data. problems with particular designs such as damage to intake structures as a result of engine surge tended to only be found after prototype testing. from the early 1970's wind tunnel testing methods improved considerably and there was also a much greater understanding of some important characteristics of duct flows. during this time cfd techniques have become widely used and advances have led to ever more complex studies yielding excellent agreement with experimental data. cfd methods are advantageous as they are generally cheaper in terms of cost, time and resources. however cfd should be thought of as an aid to experimental work. the validation of computational results with previous work and experimental data is crucial and is the motivation for this paper.

air intakes are a vital component of aircraft and their primary purpose is to offer the engine a uniform stream of air. the efficiency of such devices is crucial in that it makes a contribution to the handling characteristics and performance attributes of the aircraft. just as important is the need for engine/intake compatibility. engine surge can be induced if factors such as cowl lip shape and diffuser shape are not closely considered in the design process. the design of an aircraft intake will depend on the role of the aircraft and the conditions in which it operates. subsonic intakes tend to be shorter in length due to the lower speeds when compared with a supersonic intake. however the position of store bays/undercarriage wells or the need to mask the compressor face in order to reduce the radar cross section and hence observability of the aircraft can lead to offset intakes such as the m2129. the flow in diffusing s-ducts is complex in nature due to effects arising from the offset between the intake plane and engine face plane.
as the flow enters the intake it accelerates and then meets the first bend of the intake, where the centrifugal and pressure forces acting on the faster moving core cause it to move towards the outside of the bend (port side), where there is an adverse pressure region. near-wall fluid that is energy deficient cannot pass through the adverse pressure gradient. instead the flow moves around the outer walls towards the region of low static pressure on the inside of the bend. this feature sets up two cells of swirling secondary flow. as the flow moves on through the duct it would perhaps be expected that a similar motion in the opposite sense be initiated at the second bend. however by this stage the low energy flow is largely on the outside wall relative to the second bend and is not driven back circumferentially. thus the swirl experienced at the engine face is in the sense generated as a result of the first bend.

engine/intake compatibility is purely concerned with the quality of the airflow that is delivered by the intake to the compressor face and how the engine is affected. the flow distribution across the compressor face should be as uniform as possible to maximise the performance of the engine and reduce the possibility of undesirable occurrences such as engine surge. total elimination of non-uniformity in pressure across the compressor face is not possible but the degree can be minimised. distortion is the term used to describe poor pressure distribution, and strong secondary flow causes poor distortion and can be sufficient to induce surge and produce a propagating hammershock.
the geometry of the intake is shown in figure 1.

fig. 1: m2129 showing data line extraction definitions

2 numerical method

the flow solver used for the calculations was the university of glasgow's three dimensional flow solver named pmb3d. it has previously been applied to a number of problems including:
• inviscid and turbulent wings,
• inviscid, laminar and turbulent ogive cylinders at incidence,
• cavity flows,
• rolling, pitching, and yawing delta wings,
• other complex three dimensional geometries.

a cell centred finite volume technique is used to solve the euler and the reynolds averaged navier-stokes (rans) equations. the diffusive terms are discretised using a central differencing scheme and the convective terms use roe's scheme with muscl interpolation offering third order accuracy. steady and unsteady flows can be solved. steady flow calculations proceed in two parts, initially running an explicit scheme to smooth out the flow at a small cfl and then switching to an implicit algorithm to obtain rapid convergence. the preconditioning is based on block incomplete lower-upper (bilu) factorisation which is also decoupled between blocks to help reduce computational time. the linear system arising at each implicit step is solved using a generalised conjugate gradient (gcg) method. the unsteady code uses an implicit unfactored dual time approach and the rate of convergence between two consecutive time steps is monitored by the pseudo time tolerance (ptt). more information can be found in reference [1].

the rans calculations were run using spalart-allmaras (s-a) [2], k-ω [3], and sst [4] turbulence models. flow separation and large secondary flows due to adverse pressure gradients generated by localised accelerating and decelerating flows place high demands on turbulence models. the s-a model is of the one-equation type and is generated from first principles. these models are satisfactory and have been shown to be as successful as mixing length models. however a more universal model is desirable, particularly for separated flows. the k-ω model is based on a two-equation approach which has served as the foundation for much of the turbulence model research over the past two decades. this model accounts for the computation of the turbulent kinetic energy (k) but also for the turbulent length scale. consequently two-equation models can be used to predict properties of a given flow without any prior knowledge of the turbulent structure. the shear stress transport (sst) model is a modified version of the k-ω model and is designed to account for the transport of the principal turbulent stress. the modifications improve the prediction of flows with strong adverse pressure gradients and separation, and hence the sst turbulence model is thought to be more suitable for the application to internal duct flows, as studied here.

a modification had to be made to the existing boundary conditions in pmb3d to account for the simulation of the engine face. simple extrapolation is used, with the exception of static pressure, which is held constant. although small amounts of bulk swirl have been shown to impose back into the main flow, the effects have been proven to be negligible experimentally, and thus this method of modelling the engine face is satisfactory. the value of the constant static pressure set at the engine face depends on the engine demand that is to be modelled, and is determined from the freestream mach number (m), the contraction ratio (ratio of the intake plane area to engine face plane area), the desired pressure ratio and mass flow rate (used in wind tunnel tests).
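a minimal sketch of the engine-face condition described above (the variable names and ghost-cell arrangement are illustrative assumptions; pmb3d's actual implementation is not given in the paper):

```python
import numpy as np

# Hedged sketch of the engine-face boundary condition: all flow quantities
# are extrapolated from the interior, except the static pressure, which is
# pinned to the value p_ef set by the modelled engine demand.
GAMMA = 1.4

def engine_face_bc(interior, p_ef):
    """interior: dict of primitive variables in the last interior cell."""
    ghost = dict(interior)          # zeroth-order extrapolation of rho, u, v, w
    ghost["p"] = p_ef               # static pressure held constant
    return ghost

def conservative(prim):
    # convert primitives to conservative variables for the flux evaluation
    rho, u, v, w, p = (prim[k] for k in ("rho", "u", "v", "w", "p"))
    E = p / (GAMMA - 1.0) + 0.5 * rho * (u*u + v*v + w*w)
    return np.array([rho, rho*u, rho*v, rho*w, E])

cell = {"rho": 1.1, "u": 180.0, "v": 5.0, "w": 0.0, "p": 82000.0}
print(conservative(engine_face_bc(cell, p_ef=90000.0)))
```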
the multiblock grid was generated using the commercial package icemcfd. an extensive farfield region is included upstream of the intake in order to allow for the direct comparison of results between different flow solvers (intake entry conditions need not be specified as the flow is entering from freestream conditions). the interaction of these freestream blocks with the intake, particularly the cowl fold back, leads to some complex topologies. in summary, an o-grid is used in the intake. the outer blocks of the o-grid are then wrapped around at the intake entry plane to form a c-grid for the intake cowl. this then allows an h-grid to be used in the large farfield region, which has the advantage of reducing grid size in regions in which the flow is at freestream conditions. the rans investigation of complex three-dimensional flows inevitably involves the use of dense grids. the computational fluid dynamics group at the university of glasgow owns a cluster of high-tech pcs allowing demanding cases to be studied. the cluster consists of 32 nodes, each a 750 mhz amd athlon thunderbird uni-processor machine with 768 mb of 100 mhz dram.

3 results

the low mass flow rate (lmfr) case simulates low engine demand and is the simpler of the two test cases. euler and rans calculations were examined and compared with previous computations and experimental data [5]. a typical fully converged turbulent calculation required around 2000 implicit steps at a cfl of 30 for a medium sized grid of around 400,000 points. this took about 6 hours of computational time using 8 processors. euler and rans results were also computed for the high mass flow case. however this case is more complicated as supersonic flow is generated as the freestream is accelerated into the duct. supersonic flow is also generated as the flow accelerates around the first bend of the intake. this leads to an unrealistic unsteady flow predicted by the euler calculations. problems have been encountered in previous work with this case [6]. a resolution was found by modifying the geometry on the starboard side following the first bend to account for flow separation. this modification was not implemented in this work, with attention instead focusing on the rans results.

3.1 low mass flow case

the low mass flow rate (lmfr) case is the simpler of the two cases and the euler solution provides a good introduction to the problem and is straightforward to study. comparison of the results with rans computations and experimental data can be seen in figures 2 and 3. local static pressure (ps) is non-dimensionalised with total freestream pressure (pt) and is plotted against duct length, x (non-dimensionalised with engine face diameter, d).
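for reading the pressure traces that follow, it is useful to recall the standard isentropic relation between the plotted ps/pt and the local mach number (a textbook relation, not taken from the paper):

```python
# Standard isentropic relation linking the plotted ps/pt ratio to a local
# Mach number: ps/pt = (1 + (gamma-1)/2 * M^2)^(-gamma/(gamma-1)).
GAMMA = 1.4

def ps_over_pt(mach):
    return (1.0 + 0.5 * (GAMMA - 1.0) * mach**2) ** (-GAMMA / (GAMMA - 1.0))

for m in (0.5, 0.8, 1.0, 1.3):
    print(f"M = {m:.1f}: ps/pt = {ps_over_pt(m):.3f}")
# M = 1 gives ps/pt ~ 0.528: traces dipping below this value indicate
# locally supersonic flow, as discussed for the high mass flow case later.
```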
reasonable agreement of the flow with computation is seen. the location of the stagnation point is well predicted. however, upstream of the first bend (x/d = 1) in the cowl region there are large differences with experimental data. flow acceleration from the stagnation point peak pressure to a minimum just inside the cowl interior is over-predicted. this leads to under-prediction in the pressure recovery after the duct throat. consequently flow acceleration is also over-predicted at the second bend starboard side (as shown in figure 1 at x/d = 1). it is thought that these differences may be because of viscous effects that are being neglected in the euler calculation.

the rans calculations reduce the magnitude of the over- and under-predictions in pressure, caused by the flow accelerating from stagnation into the intake, for all turbulence models examined. the rans calculations used a reynolds number (re) of 777,000 based on the engine face diameter. the peak pressure (stagnation point) on the outside of the cowl wall is well matched. the flow generated through the first bend of the intake is also better predicted. the k-ω results appear to match experimental data the best in the cowl region. sst and s-a results are very similar in this area. downstream of the first bend the port and starboard side data comparisons differ. port extractions match fairly well with experimental data, although s-a and sst results are consistently higher, probably due to the slight over-prediction in the pressure recovery after the initial acceleration into the duct. however the most interesting examination is from the starboard side. a feature of s-duct intake flow is the generation of secondary flow off the first bend, as described in the introduction. the extent of the secondary flow for the low mass flow case is indicated by a pressure drop between the two bends of the duct (between an x/d of 1 and 3.5). figure 2 shows that all turbulence models fail to predict the drop witnessed in the experiments. however closer examination shows that the s-a and, more particularly, the sst models show a slight levelling off of the pressure which is indicative of secondary flow development.
fully converged viscous results were achieved using all turbulence models. figures 5 and 6 show extracted pressures from the starboard and port sides respectively. it is immediately obvious that there are problems in the cowl region with the k-ω and s-a results. following the initial pressure drop resulting from the flow acceleration into the duct, further drops occur, especially for the k-ω model. examining an extraction through the symmetry plane of the grid shows a complex shock wave reflection system, contrary to experiment. this appears to stem from an acceleration into the duct that is greater than in experiment, leading to higher core mach numbers. consequently, the pressure never recovers prior to the first bend. after the first bend the pressure recovers to match previous solutions and experiment well.

the sst model matches experiment very well. the acceleration of the flow into the duct compares very closely with experiment. the subsequent pressure recovery and flow acceleration around the starboard side first bend also match very well. the reason for the superior prediction using the sst model appears to stem from the cowl region and a better prediction of the initial flow features. closer examination of the solution shows that the transition from freestream to turbulent conditions was different from the other models. the s-a model also suffers from an over-acceleration of the flow into the duct, but the extent of this is not as large as for the k-ω model, and the flow recovers prior to the first bend.

fig. 5: hmfr calculation – data extracted from starboard side of intake
fig. 6: hmfr calculation – data extracted from port side of intake

the secondary flow developed in the intake duct can be seen in figure 5 in the form of a levelling of the pressure trace between the two bends. closer examination of figure 5 shows that the sst model again predicts the secondary flow better than the other models. figure 7 shows the effect of the secondary flow at the engine face. it is clear that the secondary flow developed is significant, and it is not untenable to suggest that the level of distortion experienced at the engine face might induce compressor blade stalling and subsequent engine surge.

4 conclusions

euler and rans calculations have been performed on an offset, diffusing intake duct. a variety of turbulence models were used with the aim of validating the flow. two separate cases were examined: firstly, a low mass flow case to simulate low engine demand, and, secondly, a high mass flow case to simulate high engine demand. comparisons were made by examining the local static pressure histories through the duct (non-dimensionalised with freestream total pressure).

euler calculations were initially performed as they are straightforward and serve as a good introduction to the problem. low mass flow results showed that, qualitatively, the salient flow features are captured, but quantitatively there were over-predictions in the level of acceleration into the duct, leading to large pressure drops and poor pressure recoveries. high mass flow euler results failed to converge. the case is complex, as supersonic flow is generated as the flow accelerates into the duct, and also as the flow accelerates around the starboard side first bend. it was found that the solution is unsteady, with the unsteadiness originating from the starboard side first bend.
experimental data shows that there is considerable separation from this location, and the euler code cannot predict this without modifications being made to the geometry.

rans results were computed using the s-a, k-ω and sst turbulence models. fully converged steady solutions were reached. the low mass flow case showed that the sst model performed more satisfactorily than the other models. flow acceleration into the duct was closely matched, although the subsequent pressure recovery was over-predicted. this led to a lower acceleration around the first intake bend than was witnessed in the experiment. one of the main challenges with the low mass flow case is the difficulty of predicting secondary flow. however, there is evidence that the sst model (and also the s-a model) predicts limited secondary flow, as indicated by a levelling of the pressure trace on the starboard side, although the actual pressure magnitude is too high due to the reduced acceleration around the intake first bend.

results from the k-ω turbulence model for the high mass flow case showed that the acceleration into the duct was over-predicted, leading to significant supersonic flow in the cowl region. there is evidence of shock reflection, and the flow can be described as choked. this is all contrary to experiment. similar results can be seen for the s-a model, although the solution is generally better. the sst model once again provides the best comparison with experiment. the level of acceleration into the duct is well predicted although, as for the low mass flow case, the pressure recovery is a little better than witnessed in experiment. secondary flow on the starboard side following the first bend is predicted by all turbulence models, although the sst model predicts the feature best.

fig. 7: hmfr n-s sst calculation – secondary flow at engine face

this work has served as a validation of the application of pmb3d to problems of this type. it has been found that the sst turbulence model appears to offer the best overall results for this type of problem. a non-algebraic model [7] will shortly be tested, and it is anticipated that this will provide further improvements.

acknowledgements

the author gratefully thanks k. j. badcock and b. e. richards in the department of aerospace engineering and m. jackson at dera bedford for their help. thanks also to other members of the computational fluid dynamics group in the department for their suggestions and assistance. this work is supported by sponsorship from dera bedford.

references

[1] badcock, k. j., richards, b. e., woodgate, m. a.: elements of computational fluid dynamics on block structured grids using implicit flow solvers. progress in aerospace sciences, vol. 36, 2000, pp. 351–392
[2] spalart, p. r., allmaras, s. r.: a one-equation turbulence model for aerodynamic flows. aiaa paper 92-0439, 1992
[3] wilcox, d. c.: turbulence modelling for cfd. dcw industries, inc., la canada, california, 1994
[4] menter, f. r.: zonal two equation k-ω turbulence models for aerodynamic flows. aiaa paper 93-2906, 1993
[5] fluid dynamics panel working group 13: test case 3 – subsonic/transonic circular intake. agard advisory report 270, 1991
[6] may, n. e., mchugh, c. a., peace, a. j.: an investigation of two intake/s-bend diffuser geometries using the sauna cfd system – phase 1. aircraft research association, memo 386, 1993
[7] craft, t. j., launder, b. e., suga, k.: development and application of a cubic eddy-viscosity model of turbulence. international journal of heat and fluid flow, 17, 1996, pp. 108–115
ryan d. d. menzies, m.eng.
phone: +44 0 141 330 2227
fax: +44 0 141 330 5560
e-mail: rmenzies@aero.gla.ac.uk
department of aerospace engineering, james watt building (south), university of glasgow, glasgow, g12 8qq, scotland, uk

analysis of parameters affecting the quality of a cutting machine

iveta onderová (a,∗), ľubomír šooš (b)
a slovak university of technology in bratislava, faculty of mechanical engineering, institute of manufacturing systems, environmental technology and quality management
b slovak university of technology in bratislava, faculty of mechanical engineering
∗ corresponding author: iveta.onderova@stuba.sk

abstract. the quality of cutting machines is affected by several factors that can be directly or indirectly influenced by manufacturers, technicians and users of machine tools. the most critical qualitative evaluation parameters of machine tools include accuracy and stability. investigations of accuracy and repeatable positioning accuracy were essential for the research presented in this paper. the aim was to develop and experimentally verify the design of a methodology for cutting centers aimed at achieving the desired working precision. before working on the topic described here, it was necessary to make several scientific analyses, which are summarized in this paper. we can build on the initial working hypothesis that by improving the technological parameters (e.g. by increasing the working speed of the machine, or by improving the precision of the positioning) the quality of the cutting machine will also be improved. for the purposes of our study, several investigated parameters affecting positioning accuracy were set, such as rigidity, positioning speed, etc. first, the stiffness of the portal structure of the cutting machine was analyzed. fem analysis was used to investigate several alternative structures of the cutting machine, and also an innovative solution for beam mounting. the second step was to integrate two types of drives into the design of the cutting machine. the first drive is a classic rack and pinion drive for cutting machines. to increase (improve) the working speed of the machine, linear motors were designed as an alternative drive. the portal of the cutting machine was designed for a working speed of 260 m min−1 and an acceleration of 25 m s−2. the third step was based on the results of the analysis. in collaboration with microstep, an experimental cutting machine in a portal version was produced, using linear synchronous motors driving the portal on both sides and direct linear metering of its position. in the fourth step, an experiment was designed and conducted to explore the positioning accuracy and the repeatable positioning accuracy.

keywords: cutting machines, quality, positioning accuracy, repeatable positioning accuracy.

1. requirements for cutting machinery and quality requirements

cutting centers must meet several requirements. some of the requirements are mandatory and are specified by current standards, while others are either generally anticipated or are determined by the customer.
a summary of all the requirements is a set of parameters affecting the quality of the cutting center. the parameters are divided into various classes, e.g. structural, technological, ergonomic, operational, etc. based on the degree to which the parameters of the center fulfill the prerequisites, we can say that the quality of the cutting center is bad, good or excellent. the quality of a cutting center can be defined as the degree to which the set of cutting center parameters meets the requirements for cutting centers. for the purposes of the experiment described here, the selected qualitative parameters are: parameter a, positioning accuracy, and parameter r, repeatable positioning accuracy.

2. experimental equipment and experiment design

the technical solutions and features of this machine must reflect the high requirements for dynamics, speed and precision when cutting shapes that are difficult to machine, and also parts of small dimensions. a new design for the mechanics and the drive system of the cnc cutting machine was developed for force-free material cutting technology. the design of the machine is of gantry execution with extreme dynamics, and is designed on the basis of advanced structural materials (mineral alloys, polymer concrete, sandwich tubular structures), the application of linear actuators driving the portal on both ends, and direct linear position measurements. the portal support is equipped with a drive unit for height control of the technological head above the material that is being cut (the workpiece). the technological table is designed to work at speeds up to 260 m min−1 and an acceleration of 25 m s−2. particular attention was paid to the design of the installation of the linear actuators, and to the design of the connecting node of the x axis to the beam of the y axis.

figure 1. parameters affecting the quality of a cutting machine
figure 2. experimental measurement of parameters a and r — axis x1 (1–interferometer, 4–axis y, 5–axis x1, 6–axis x2)

the frame of the cutting machine was also developed, and its design was gradually modified [1]. the main part is the exhausted technology table, with an area of 4.5 m², i.e. 3000 × 1500 mm, and with three holes that are used to connect the filtering device. the portal of the machine, for the purposes of the experiment, was designed with a reversible linear actuator in the x axis. the cutting machine is equipped with its own conveyor, which is located inside the frame; its motion is realized through a chain gear on both ends of the conveyor.

the goal of the experiment is to determine the values of the a and r parameters across the technological table, i.e. in the range of motion of the technological head. the experiment, based on the parametric method, was realized with a renishaw ml10 interferometer. in combination with the unit for compensating environmental effects, it achieves extremely high accuracy, up to 0.7 ppm. its resolution is 1 nm at a measuring speed of 1 m s−1.

figure 3. measuring cycle

in order to determine the values of parameters a and r in the full range of the technology head motion, we had to split the experiment into two parts: measurement of the y axis (axis length: 1500 mm), and measurement of the x axis (axis length: 3000 mm).
for each required position pi, ten measurements were performed in the direction from the right, and ten measurements in the direction from the left. the measured positions pi are designed in accordance with the formula

pi = (i − 1)p + r, (1)

where pi is the measured position, i is the number of the measured position, p is the measuring interval, and r takes a different value in each measured position; it is used to prevent periodic errors. the approach to the desired position was carried out in both directions, i.e. from the right (marked as −) and from the left (marked as +). the measurement was carried out by the reverse movement cycle, as shown in figure 3. the measuring cycle was maintained throughout the experiment for all measurements.

table 1. desired positions for measurement of the y-axis
position:       p1     p2     p3     p4     p5     p6     p7     p8      p9      p10
value [mm]:     10.0   166.0  332.8  500.0  666.0  831.6  998.0  1166.0  1332.0  1489.0

table 2. desired positions for measurement of the x-axis
position:       p1     p2     p3     p4     p5      p6      p7      p8      p9      p10     p11     p12
value [mm]:     10.0   250.0  500.2  749.9  1001.0  1248.5  1500.0  1749.3  2000.8  2250.1  2498.3  2749.9

figure 4. scheme of measurement of the y axis
figure 5. scheme of measurement of the x axis

3. evaluation of measurements

the measurements were evaluated under the conditions stipulated in iso 230-2:2006. the goal of the experiment is to determine the values of the a and r parameters. all results of the evaluation of the measurements (maximum insensitivity btmax, average insensitivity b, mean bidirectional positioning error interval m, systematic positioning error e, positioning repeatability r and positioning accuracy a) for the y and x axes are published in [1].

the y axis measurements are similar even for different positions of the portal. for all measurements, the same phenomenon can be observed: as the linear axis increases, the mean unidirectional positioning error increases up to the desired position; from this position, the unidirectional positioning error decreases again. the dispersion of the individual measurements is small, as can be seen in the narrow confidence corridor. all measurements of the y axis are characterized by a change in the range from to . this area will be further investigated. the results of the y-axis measurements also show that the value of the positioning accuracy a increased after moving the portal. while measuring y − x0 the positioning accuracy was a = 118.8 microns, and while measuring y − x3000 the positioning accuracy was a = 135.1 microns. the unidirectional positioning error of the y axis along the x axis thus increased by 16.3 microns.

figure 6. graphical evaluation of direct measurements of the y axis from the left and right sides

the measurements of the x axis are similar even when the support weight is shifted along the y axis. for all measurements, the same phenomenon can be observed: as the linear y axis increases, the mean unidirectional positioning error increases exponentially. the dispersion of the individual measurements is small, as can be seen in the narrow confidence corridor. an interesting area of all measurements on the x axis is the area around the desired position ; at this point, a step change occurred for all measurements, but it did not occur for the next desired position. it is therefore necessary to make a further investigation at a later date.
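as a rough illustration of the procedure just described, the sketch below generates target positions according to eq. (1), with a small random offset r at each position to prevent periodic errors, and evaluates a bidirectional positioning accuracy in the spirit of iso 230-2 (a two-sigma band over both approach directions). this is a schematic reading of the standard, not the authors' evaluation code; the exact expressions used in the paper are published in [1].

```python
# illustrative sketch of eq. (1) and of an iso 230-2 style evaluation of the
# bidirectional positioning accuracy a; function names and the offset range
# are our own assumptions.
import random
import statistics

def target_positions(n, interval):
    # p_i = (i - 1) * p + r, with a different small random offset r at each
    # position so that periodic errors are not sampled systematically
    return [(i - 1) * interval + random.uniform(0.0, 0.1 * interval)
            for i in range(1, n + 1)]

def positioning_accuracy(dev_up, dev_down):
    """dev_up / dev_down: lists (one entry per target position) of deviation
    samples [um] measured when approaching from one direction and the other."""
    upper, lower = [], []
    for up, down in zip(dev_up, dev_down):
        for samples in (up, down):
            m, s = statistics.mean(samples), statistics.stdev(samples)
            upper.append(m + 2.0 * s)
            lower.append(m - 2.0 * s)
    return max(upper) - min(lower)  # bidirectional accuracy a

positions = target_positions(10, 166.0)  # roughly the y-axis measuring interval
```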
the results of the x axis measurements also show that the value of the positioning accuracy a increased after moving the portal. measuring x − y0, the positioning accuracy was a = 137.024 microns; measuring x − y1500, the positioning accuracy was a = 184.632 microns. the positioning accuracy of the x axis thus increased by 47.608 microns after moving the support along the y axis.

figure 7. graphical evaluation of direct measurements of the x axis from the left and right sides

4. conclusions on the measurement evaluation

the resulting positioning accuracy on each axis is equal to the maximum value of the evaluated positioning accuracy, and the same holds for the positioning repeatability of the x axis. the measurement results evaluated according to the standard do not define the positioning error in the whole range of the measured axis, only at the measurement points. using these results, evaluated according to the standard, it is therefore not possible to determine the basic measurement parameters beyond the measured desired positions; between these points the positioning deviation is not known. the standard, however, assumes that the curve of the positional deviation between the measured data items is linear. this presumption is critical for the evaluation of the measured data, but in terms of further processing of the measured data it is insufficient and not correct. for this reason we decided to estimate the positioning accuracy beyond the desired positions using regression analysis.

acknowledgements

the research work presented in this paper was performed with financial support from vega grant 1/0584/12.

references

[1] onderová, i.: prínos k zvyšovaniu vybraných technologických parametrov deliacich strojov. dizertačná práca, slovenská technická univerzita v bratislave, strojnícka fakulta, bratislava, 2010.

pin hole discharge creation in na2so4 water solutions

lucie hlavatá (a), rodica serbanescu (a,b), lenka hlochová (a), zdenka kozáková (a), františek krčma (a,∗)
a brno university of technology, faculty of chemistry, purkyňova 118, 612 00 brno, czech republic
b faculty of physics, ovidius university of constanca, 124, mamaia boulevard, 900527 constanta, romania
∗ corresponding author: krcma@fch.vutbr.cz

abstract. this work deals with the diaphragm discharge generated in water solutions containing na2so4 as a supporting electrolyte. the solution conductivity was varied in the range of 270 ÷ 750 µs cm−1. the batch plasma reactor with a volume of 100 ml was divided into two electrode spaces by a shapal-m™ ceramics dielectric barrier with a pin-hole (diameter of 0.6 mm). three barrier thicknesses (0.3, 0.7 and 1.5 mm) and non-pulsed dc voltage up to 2 kv were used for the discharge creation. each current–voltage characteristic can be divided into three parts: electrolysis, bubble formation and discharge operation.
the experimental results showed that the discharge ignition moment in the pin-hole was significantly dependent on the dielectric diaphragm thickness: the breakdown voltage increases with the thickness of the dielectric barrier.

keywords: pin hole discharge, discharge in liquids, discharge breakdown.

1. introduction

electrical discharges in liquids have been in serious focus of researchers mainly for the last three decades. especially the formation of various reactive species, such as hydroxyl and hydrogen radicals, some ions, and molecules with high oxidation potential (hydrogen peroxide), has been investigated in order to utilize this process in water treatment, removal of organic compounds from water, and sterilization processes [8, 9, 5, 4]. the pin hole discharge configuration consists of two electrode spaces divided by a dielectric barrier with a central pin-hole. the mechanisms of discharge breakdown in liquids are still under intensive research, and their study requires a specific approach in all possible configurations (different kinds of high voltage, various electrode configurations, etc.). generally, two types of theories are considered the most suitable for describing discharge breakdown – the thermal (bubble) theory and the electron theory [3]. discharge ignition in the pin hole configuration probably combines both theories. initially, it starts in the pin-hole when sufficient power is applied. the breakdown moment is probably related to bubble formation [1]. by the application of constant dc voltage, the water solution is significantly heated due to the high current density (joule heating), and microbubbles of water vapor are created in the pin-hole region. it is assumed that the discharge breakdown starts inside these bubbles because of the high potential gradient between the outer and inner bubble region [1], in correspondence with the thermal theory. on the other hand, the further propagation of plasma channels into the bulk solution probably corresponds to the electron theory. moreover, the application of dc voltage initiates the creation of two kinds of plasma streamers on the two sides of the dielectric barrier [7, 2]. longer streamers appear on the side with the cathode, because the pin-hole in the dielectric barrier represents a positive pole (like the point in a point-to-plane configuration), and the streamers propagate to the positive electrode similarly to the positive corona discharge. on the other side, where the anode is placed, shorter streamers of spherical shape propagate towards the pin-hole, as in the case of the negative corona discharge [6]. the presented paper describes the pin hole discharge creation by means of electrical characteristics, and discusses the influence of the dielectric barrier thickness on the breakdown moment and the current–voltage characteristic.

2. experimental set-up

the batch reactor was divided into two electrode spaces by the dielectric barrier, and non-pulsed dc voltage up to 2 kv was used for the discharge creation. the discharge appeared in a pin-hole in the dielectric diaphragm. the dielectric barrier was made of shapal-m™ ceramics with three different thicknesses (0.3, 0.7 and 1.5 mm). the pin-hole diameter was 0.3 mm and remained constant during the whole experiment. planar electrodes (40 × 30 mm) made of stainless steel were installed on each side of the barrier. a water solution containing na2so4 electrolyte, providing an initial conductivity in the range of 270 ÷ 750 µs cm−1, was used as the liquid medium.
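to give a feeling for the thermal (bubble) theory invoked above, the back-of-envelope sketch below estimates how fast joule heating can warm the liquid inside the pin-hole, treating the channel as a uniform cylinder that carries the whole measured current. the numerical values are typical of this experiment, but the calculation itself is our own illustration, not one from the paper.

```python
# back-of-envelope sketch (our illustration): joule heating rate of the liquid
# inside the pin-hole, modelled as a uniform cylindrical channel.
import math

SIGMA = 550e-4      # solution conductivity, 550 uS/cm expressed in S/m (assumed)
DIAMETER = 0.3e-3   # pin-hole diameter [m]
CURRENT = 20e-3     # typical pre-breakdown current [A]
RHO_C = 4.18e6      # volumetric heat capacity of water [J m^-3 K^-1]

area = math.pi * (DIAMETER / 2.0) ** 2
j = CURRENT / area                        # current density [A m^-2]
heating_rate = j * j / (SIGMA * RHO_C)    # temperature rise rate [K s^-1]
print(f"~{heating_rate:.0f} K/s -> ~80 K rise in {80.0 / heating_rate * 1e3:.2f} ms")
```

even at the modest currents measured before breakdown, this estimate brings the liquid in the channel to boiling within a fraction of a millisecond, which is consistent with the microbubble formation inferred from the current pulses discussed below.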
the total volume of the solution was 100 ml (50 ml in each electrode space). the solution temperature was changed by the discharge operation, but its enhancement was only up to 10 k during each measurement; this temperature change was negligible in terms of the discharge breakdown.

figure 1. electrical scheme of the experiment: 1 – discharge reactor, 2 – dielectric barrier with pin-hole, 3 – anode, 4 – cathode, 5 – oscilloscope tektronix tds 1012b, 6 – dc hv source with resistances important for the electric measurements: r3 (100 mω), ri4 (3.114 kω), r5 (5.13 ω), ri6 (105.5 ω), r7 (0.13 ω)

an oscilloscope tektronix tds 1012b operating up to 100 mhz with a tektronix p6015a high voltage probe was used to obtain time-resolved characteristics of the discharge voltage and current, with focus on the breakdown moment. the scheme of the electric circuit, including the diagnostics, is shown in fig. 1. mean values of the breakdown parameters (voltage, current, power, and resistance) were calculated, and subsequently static current–voltage characteristics were constructed for each experiment. the obtained results were compared with respect to the electrode configuration (barrier thickness) and to the electrolyte conductivity.

3. results and discussion

current–voltage characteristics of the dc pin hole discharge were constructed from the mean values of time-resolved current and voltage records over 50 ms. figure 2 demonstrates a typical current–voltage curve obtained in na2so4 solution with an initial conductivity of 550 µs cm−1. this curve can be divided into the following three parts.

(1.) initially, at low applied voltages, the measured current increased more or less in direct proportion to the applied dc voltage. the time-resolved current record shows a smooth line, and thus we can conclude that only electrolysis took place in the system. of course, due to the passing current, the electrolyte solution was heated by the joule effect.

(2.) above a voltage of some hundreds of volts, the first significant breakpoint appeared in the curve – the current markedly jumped up. according to the time-resolved characteristics, the smooth current time record changed and some current pulses can be recognized. we assume that these current pulses were related to the substantial creation of microbubbles formed by the evaporating solution inside the pin hole, where the current density was the highest and thus the joule heating was sufficient for microbubble creation. a further increase of the voltage provided only a small current increase, because bubble creation became more or less regular, until the second significant breakpoint was observed.

(3.) from this moment, the current rose rapidly with only a small voltage increase. this second breakpoint was assumed to be the discharge breakdown moment, which was also confirmed by the light emission recorded by optical emission spectroscopy.

figure 2. typical current–voltage characteristic of the dc diaphragm discharge in na2so4 solution with an initial conductivity of 550 µs cm−1 and 0.3 mm barrier thickness (regions: electrolysis, bubble formation, discharge breakdown)

mean values of voltage and current over 50 ms were estimated from the time-resolved characteristics, and subsequently the current–voltage curves were constructed.
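the two breakpoints separating electrolysis, bubble formation and discharge operation can also be picked out of such a mean current–voltage curve automatically; the sketch below flags the points where the local slope jumps by more than a chosen factor. the threshold and the sample data are arbitrary illustrations, not measured values.

```python
# illustrative sketch: locate the breakpoints of a mean current-voltage curve
# as the points where the local slope di/dv jumps by more than `ratio`.
def breakpoints(voltage, current, ratio=3.0):
    slopes = [(current[k + 1] - current[k]) / (voltage[k + 1] - voltage[k])
              for k in range(len(voltage) - 1)]
    return [voltage[k + 1] for k in range(len(slopes) - 1)
            if slopes[k] > 0 and slopes[k + 1] / slopes[k] >= ratio]

v = [200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100]  # mean voltage [v]
i = [2, 4, 6, 8, 20, 22, 24, 26, 45, 80]                  # mean current [ma], made up
print(breakpoints(v, i))  # -> [500, 900]: bubble-formation and breakdown onsets
```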
figure 3 demonstrates the comparison of current–voltage characteristics obtained for the three different barrier thicknesses (0.3, 0.7 and 1.5 mm) in na2so4 electrolyte solutions at a conductivity of 270 µs cm−1. the increase of the barrier thickness had a substantial effect on all three parts of the current–voltage curve. curves obtained with barrier thicknesses greater than 0.3 mm lay at lower current values. the reason can be explained by the increase of resistance with increasing barrier thickness. the presented current–voltage curves show that the discharge breakdown voltage was enhanced by increasing the barrier thickness. the particular values of the determined breakdown voltage are listed in tab. 1 (in the second column) for the three barrier thicknesses (0.3, 0.7 and 1.5 mm). the breakdown voltage increased from 1000 v (thickness of 0.3 mm) to about 1500 v for the 1.5 mm thickness. these results were obtained in na2so4 solution (initial conductivity of 270 µs cm−1). a similar effect was observed for the other conductivities (550 and 750 µs cm−1) using the same electrolyte solution. the determined breakdown voltages for these conductivities are listed in tab. 1 (in the last two columns).

figure 3. comparison of current–voltage characteristics of the pin hole discharge in na2so4 solutions (conductivity of 270 µs cm−1) for the three different barrier thicknesses (0.3, 0.7 and 1.5 mm)

table 1. breakdown voltage of the dc pin hole discharge as a function of the barrier thickness for the selected conductivities
barrier thickness [mm] | 270 µs cm−1 | 550 µs cm−1 | 750 µs cm−1
0.3                    | 990 v       | 1040 v      | 1020 v
0.7                    | 1270 v      | 1130 v      | 1200 v
1.5                    | 1470 v      | 1370 v      | 1290 v

figure 4 demonstrates the time evolution of voltage and current at a mean voltage of 1000 v for the three different barrier thicknesses. these figures clearly demonstrate the difference between the diaphragm (thickness of 0.3 mm, ratio l/d = 0.5) and capillary discharge (thickness of 1.5 mm, l/d = 2.5). the resistance inside the pin-hole increased, and the current reached lower values, with increasing barrier thickness. the time-resolved characteristic for the 1.5 mm thickness (fig. 4c) shows the electrolysis only; it is clearly visible that there are no significant peaks appearing in the regular voltage and current oscillations. the voltage oscillations were related to the hv source construction, and they had no significant influence on the observed phenomena. remarkably higher oscillations of both current and voltage were recorded at the barrier thickness of 0.7 mm (fig. 4b). the current record shows a nearly regular shape of the oscillations without any significant current peak. this phenomenon was related to microbubble formation due to the intensive solution heating by the passing current. no light emission was observed during this period. an irregular shape of the current peaks, with some high current peaks, appeared when the thinnest barrier
was applied (fig. 4a). this phenomenon was related to random discharge breakdown in the vapor bubbles. simultaneously, short peaks of emitted light were recorded, too. as the pin hole diameter was greater than the barrier thickness, the pin hole represented a significantly lower resistance. therefore, the current in the pin hole was much higher, which led to an easier discharge ignition in the pin hole.

figure 4. time-resolved voltage and current records for a mean voltage of 1000 v in na2so4 solutions (initial conductivity of 270 µs cm−1) using dielectric barriers with thicknesses of a) 0.3 mm (random breakdown), b) 0.7 mm (bubble formation) and c) 1.5 mm (electrolysis)

4. conclusions

the breakdown moment, as well as the processes of electrolysis and bubble formation, were identified from the obtained mean-value current–voltage characteristics; the time-resolved characteristics clarified the proposed mechanisms of the pin hole discharge creation in na2so4 solutions. the current–voltage evolution was remarkably influenced by the dielectric barrier configuration. the breakdown voltage increased with the increase of the dielectric barrier thickness. the current–voltage curves were shifted to lower currents and higher voltages with the increase of the barrier thickness. this effect was caused by the increase of resistance with increasing barrier thickness. the solution conductivity had only a minor effect on the discharge characteristics. thus we can conclude that the pin hole geometry is the main parameter influencing the bubble formation as well as the pin hole discharge breakdown.

acknowledgements

this work was supported by the czech ministry of culture, project no. df11p01ovv004.

references

[1] r. p. joshi, j. qian, k. h. schoenbach. electrical network-based time-dependent model of electrical breakdown in water. j appl phys 92(10):6245–6251, 2002.
[2] z. kozakova, l. hlavata, f. krcma. diagnostics of electrical discharges in electrolytes: influence of electrode and diaphragm configuration. in book of contributed papers: 18th symposium on application of plasma processes and workshop on plasmas as a planetary atmospheres mimics, pp. 83–87. vrátna, 2011.
[3] m. a. malik, a. ghaffar, a. m. salman. water purification by electrical discharges. plasma sources sci technol 10(1):90–97, 2001.
[4] m. moisan, j. barbeau, m.-c. crevier, et al. plasma sterilization. methods and mechanisms. pure appl chem 74(3):349–358, 2002.
[5] e. njatawidjaja, a. t. sigianto, t. ohshima, m. sato. decoloration of electrostatically atomized organic dye by the pulsed streamer corona discharge. j electrostat 63(5):353–359, 2005.
[6] j. prochazkova, z. stara, f. krcma. optical emission spectroscopy of diaphragm discharge in water solutions. czech j phys 56(2 supplement):b1314–b1319, 2006.
[7] z. stara, f. krcma, j. prochazkova. physical aspects of diaphragm discharge creation using constant dc high voltage in electrolyte solution. acta technica csav 53(3):277–286, 2008.
[8] b. sun, m. sato, j. s. clements. oxidative processes occurring when pulsed high voltage discharges degrade phenol in aqueous solution. environ sci technol 34(3):509–513, 2000.
[9] p. sunka, v. babicky, m. clupek, et al. potential applications of pulse electrical discharges in water. acta phys slovaca 54(2):135–145, 2004.
development of the petal laser facility and its diagnostic tools

dimitri batani (a,∗), sebastien hulin (a), jean eric ducret (a), emmanuel d'humières (a), vladimir tikhonchuk (a), jérôme caron (a), jean-luc feugeas (a), philippe nicolai (a), michel koenig (b), serena bastiani-ceccotti (b), julien fuchs (b), tiberio ceccotti (c), sandrine dobosz-dufrenoy (c), cecile szabo-foster (d), laurent serani (e), luca volpe (f), claudio perego (f), isabelle lantuejoul-thfoin (g), eric lefèbvre (g), antoine compant la fontaine (g), jean-luc miquel (g), nathalie blanchot (g), alexis casner (g), alain duval (g), charles reverdin (g), rené wrobel (g), julien gazave (g), jean-luc dubois (g), didier raffestin (g)
a université bordeaux, cea, cnrs, celia umr 5107, f-33400 talence, france
b luli, umr 7605, ecole polytechnique, f-91128 palaiseau, france
c iramis/service photons atomes et molecules, cea-saclay, f-91191 gif sur yvette, france
d laboratoire kastler-brossel umr 8552, 4, place jussieu, f-75252 paris, france
e cenbg umr 5797, chemin du solarium, f-33175 gradignan, france
f universitá degli studi di milano-bicocca, i-20126 milan, italy
g cea-cesta, bp 2, f-33114 le barp, france
∗ corresponding author: batani@celia.u-bordeaux1.fr

abstract. the petal system (petawatt aquitaine laser) is a high-energy short-pulse laser, currently in an advanced construction phase, to be combined with the french mega-joule laser (lmj). in a first operational phase (beginning in 2015 and 2016) petal will provide 1 kj in 1 ps and will be coupled to the first four lmj quads. the ultimate performance goal is to reach 7 pw (3.5 kj with 0.5 ps pulses). once in operation, lmj and petal will form a unique facility in europe for high energy density physics (hedp). petal is aiming at providing secondary sources of particles and radiation to diagnose the hed plasmas generated by the lmj beams. it will also be used to create hed states by short-pulse heating of matter. petal+ is an auxiliary project addressed to design and build diagnostics for experiments with petal. within this project, three types of diagnostics are planned: a proton spectrometer, an electron spectrometer and a large-range x-ray spectrometer.

keywords: plasma diagnostic, x-ray photon emission, proton radiography, particle laser acceleration, laser megajoule, petawatt laser, high energy density physics, electron spectrometer, x-ray spectrometer.

1. introduction

a new era of plasma science started with the first experiments on the national ignition facility (nif) at the lawrence livermore national laboratory (llnl) in the usa. up to now, the 192 beams of the nif have been able to deliver more than 1.8 mj of energy into a hohlraum target. the aim is to reach target ignition by indirect drive, the laser energy being transformed into a high-intensity and high-temperature radiation field, which irradiates and compresses the target [11, 15]. the laser megajoule (lmj), under construction near bordeaux in france, is following the trail opened by the nif, with its planned 160 laser beams delivering more than 1 mj to reach ignition of a deuterium–tritium target using the indirect drive method. an updated status report on the lmj was given at the ifsa (inertial fusion science & applications) conference held in bordeaux in september 2011 [19].
the construction plan foresees the beginning of laser shots on lmj at the end of 2014. the laser lines of the lmj will be assembled in quads of 4 beamlets. each quad will deliver more than 30 kj of energy within a few ns, providing an intensity of about 10^15 w cm−2. at the start of operation, four quads will be available, with the first lmj-petal experiments for the academic science community in mid-2015. besides the physics of icf (plasma physics, shock/fast ignition), nif & lmj will be essential for basic science, exploring fields such as plasma astrophysics (e.g. the study of shocks to simulate violent events in the universe, such as supernovae and accretion disks), planetary physics (highly compressed and warm matter), stellar interiors with large coupling between the radiation field and matter, and nuclear physics. overviews of the physics programme at lmj and nif may be found in [16, 20].

figure 1. initial configuration of the lmj/petal laser system

a pw short pulse laser will be added to the nanosecond pulse beams of the lmj. this is the petal system, under construction on the lmj site near bordeaux (france). it is supported and funded by the aquitaine regional council. once in operation, lmj & petal will be a unique facility in europe for high energy density physics (hedp).

2. petal laser system

the petal system, under construction on the lmj site, has the ultimate goal of reaching 7 pw (3.5 kj with 0.5 ps pulses). for the beginning of operation, the petal energy will be at the 1 kj level, corresponding to an intensity on target of ∼ 10^20 w cm−2. the pulse duration can be varied between 0.5 ps and 10 ps, and the intensity contrast is 10^−7 at −7 ps. updates on the design and construction of the petal laser were given in 2011 [2]. the initial configuration of the lmj/petal laser system is shown in fig. 1. the development of the system has been funded by the aquitaine regional council, with contributions from the french government and the eu (for a total budget of 54.3 me). the aquitaine regional council is thus the contracting owner of the petal facility, while cea is the prime contractor for its construction. technical and scientific assistance is provided by ilp (institut lasers et plasmas). finally, petal is also considered the major french contribution to the hiper (high power laser energy research facility) project, since it will allow: i) significant experiments in the domain of fast ignition (producing hundreds of joules of fast electrons); ii) backlighting for implosion experiments, in particular for direct-drive experiments on shock ignition (selected by hiper as the main route to inertial fusion energy).

the petal system is based on the chirped pulse amplification technique. a short pulse laser oscillator (providing a bandwidth of 16 nm) and an offner stretcher (bringing the pulse duration from 100 fs to 9 ns) form the front end. the preamplifier module (pam) is based on the optical parametric chirped pulse amplification (opcpa) technique, and it produces pulses of 4.5 ns, 8 nm, and 100 mj. the amplifier section produces pulses of 1.7 ns, 3 nm, 6.4 kj. the system is equipped with wavefront correction and chromatism correction. the compression section includes 2 stages (in air and in vacuum, respectively). finally, the transport/focusing section includes a beam transport in vacuum to the lmj interaction chamber.
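for orientation, the quoted on-target intensity follows directly from the pulse energy, the pulse duration and the focal-spot size; the sketch below reproduces the order of magnitude, with the ~50 µm spot diameter being our assumption rather than a petal specification.

```python
# minimal sketch: focused intensity from pulse energy, duration and an assumed
# focal-spot diameter (the spot size is our assumption, not a petal figure).
import math

def intensity_w_per_cm2(energy_j, duration_s, spot_diameter_cm):
    power = energy_j / duration_s
    area = math.pi * (spot_diameter_cm / 2.0) ** 2
    return power / area

# first operational phase: 1 kj in 0.5 ps, assuming a ~50 um spot
print(f"{intensity_w_per_cm2(1e3, 0.5e-12, 50e-4):.1e} w cm^-2")  # ~1e20
```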
the focusing parabola (a 90° off-axis parabola) and the final pointing mirror are placed just outside the lmj chamber. also, the possibility of converting the petal pulse into the second harmonic has been taken into account (leaving space for a conversion crystal), although this will not be implemented in the initial phase.

3. problems related to activation and emp

using a very high-energy, high-intensity system like petal implies facing novel problems related to chamber activation and to the generation of giant electromagnetic pulses (emp). activation of the experimental chamber and adjacent structures is due to the high-energy particles (γ, p, n) produced in experiments on petal coupled with lmj. several sources of these high energy particles are identified:

(1.) laser interaction with thick solid targets and production of hard x-rays and γ-rays.
(2.) laser interaction with a thin solid target and production of high-energy protons and ions.
(3.) laser interaction with compressed heavy hydrogen isotope targets (dd or dt in fast ignition experiments) and production of fusion neutrons (in the future).

figure 2. fast electron source produced in the interaction of the petal laser beam at 10^20 w cm−2 with a plastic target (calder results at the end of the simulation, t = 2.2 ps) and corresponding photon distribution emitted from a 2 mm w target in the forward (p3) and backward (p2) directions (mcnpx results)

a working group of specialists is now assessing the impact of the first two sources, with respect both to safety regulations and to health risks (the third source will be considered in the future, when nuclear implosion experiments are prepared). the first type of radiation (hard x-rays and γ-rays) arises because the ultra-high-intensity interaction between laser and matter produces strong fluxes of relativistic electrons. when the target is thick (and especially if it is made of a high-z material), such electrons are stopped inside it, and hard x-rays and γ-rays are produced by bremsstrahlung. a thick solid target thus acts as a converter of fast electron energy into photon energy. the first step of the work hence consists in modelling the fast electron source. this has been done using the particle-in-cell (pic) code calder [13, 5], which describes the laser plasma interaction, electron acceleration and their transport through the target. however, the generation of photons is not accounted for in this code because of the very large difference in spatial and temporal scales. this process is considered separately in simulations performed with the code mcnpx [7], where a thick (2 mm) and large tungsten (w) target is considered. from a simulation point of view, the electron source is defined at the edges of a (small) calder box and the electrons are injected into a much larger mcnpx box. mcnpx then provides the number, direction and energy of the produced photons and secondary electrons, which are then responsible for the irradiation dose, either directly or through photo-neutron reactions. about 10^11 photo-neutrons are produced, with a rather isotropic distribution and an average energy 〈e〉 = 1.6 mev. calculations show that the dose delivered by photons can be as high as 32 rads at 1 m at the rear of a w-target. in addition, the activation zone is limited to a cone with an opening angle of ∼ 40° around the laser beam axis. about 90 % of the dose is provided by photons with energy < 23 mev.
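as a rough cross-check of such simulations, the mean energy of the fast-electron source is often estimated with the well-known ponderomotive (wilks) scaling; the sketch below applies this generic scaling, assuming a ~1.05 µm laser wavelength, and is not the calder result of fig. 2.

```python
# hedged illustration: generic ponderomotive scaling for the hot-electron
# temperature; the wavelength is an assumption (nd:glass-like laser).
import math

def hot_electron_temperature_mev(intensity_w_cm2, wavelength_um=1.053):
    a0_sq = intensity_w_cm2 * wavelength_um ** 2 / 1.37e18
    return 0.511 * (math.sqrt(1.0 + a0_sq) - 1.0)

print(f"t_hot ~ {hot_electron_temperature_mev(1e20):.1f} mev at 1e20 w/cm^2")
```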
figure 2 shows an example of the fast electron source obtained from the calder simulations and of the corresponding photon source obtained from the mcnpx simulations.

the second type of radiation (protons and ions) is produced by the target normal sheath acceleration (tnsa) mechanism at the rear side of a thin target [27]. on the lmj/petal installation, the pw laser may be used for radiography of the plasma produced by the ns lmj beams. in that case the petal beam is focused on a secondary target, where a short (∼ 20 ps) bunch of particles (electrons, protons, ions) is produced and directed at the primary plasma. the calculations of a proton source were performed with the picls two-dimensional (2d) pic code [23]. the energy spectra and the angular divergences of the protons to be produced with petal are presented in fig. 3. with a laser energy of 3.5 kj in the petal beam we expect protons with energies up to ∼ 100 mev, while at 1 kj the expected cut-off is ∼ 40 mev (this number is compatible with the 50 mev obtained on omega ep by k. flippo et al. [8] with 1 kj but a longer pulse duration). the results concerning radiation doses are in agreement with those obtained at other laser facilities, as reported in fig. 4. for instance, experimental data obtained at the institute of laser engineering of osaka university, using the lfex laser (500 j shot at 10^20 w cm−2), show a dose of 10 msv produced inside the interaction chamber [21], which is rather compatible with the dose of 32 rads given here. in comparison with existing lasers, due to its high energy and intensity, petal is expected to yield larger doses (see fig. 4).

figure 3. left: expected spectrum of proton emission from a plastic 10 µm target irradiated with petal at 3.5 kj, from pic simulations using the code picls. right: angular distribution of protons with an energy over 40 mev in the forward direction

a second working group has been established to evaluate the problem of the giant electromagnetic pulses (emp) that are produced by:

(1.) magnetic fields due to relativistic currents propagating in solid targets.
(2.) magnetic fields in the plasma generated by the crossed pressure and temperature gradients.
(3.) electric and magnetic fields generated by fast electrons escaping the target.
(4.) emission of the electrically charged target after the end of the laser pulse.

the first three of these mechanisms produce very large amplitude fields, which are located essentially inside the plasma. their lifetime is limited to a few hundred ps at most. the last one is more dangerous for the diagnostic and control equipment, as the corresponding fields are of much longer duration (up to the µs scale) and they fill the whole chamber and escape outside through the diagnostic windows. for instance, in experiments performed on the omega laser facility in the us in may 2011, using the omega ep beam (1 kj, 10 ps) as a backlighter, an electric field as high as ∼ 250 kv m−1 was measured inside the interaction chamber. such a field may affect the performance of diagnostics placed near the target chamber centre (tcc). this field is reduced to ∼ 7.5 kv m−1 outside the interaction chamber. fields as large as ∼ 750 v m−1 and ∼ 100 v m−1 were measured in the laser and diagnostic bays, respectively, at distances of a few tens of meters. results obtained at several laser facilities are shown in fig. 5. it is clear that emp features (both the amplitude of the field and the spectrum) depend on the application, i.e.
on the laser power/intensity, the target, the focal spot size, etc. in general, radiographic (backlighting) applications imply a stronger emp signal. we expect that the petal ps shots will produce more electric perturbations than the lmj ns shots (typically, for icf applications the emp is limited to ∼ 100 kv m−1, mainly in the 1 ghz range, while for backlighting a signal with an amplitude of 1 mv m−1 in the range of 10 ghz is expected). coupling of such large em fields to electric cables may induce disturbances in the data acquisition electronics. the electromagnetic protection heavily affects the design of the diagnostic tools (see the next section), implying the need for reliable shielding and grounding of existing electronic equipment and the design of new emp-resistant electronics. in the meantime, in order to limit the problem, the choice of diagnostics relies mainly on passive detectors.

4. development of diagnostics

the development of diagnostic tools is as essential as the construction of the laser system itself; only under this condition can significant experiments in plasma physics be performed. the first goal of the petal+ project is to develop detectors to characterize the emission of particles and radiation from targets irradiated with petal (so-called "secondary sources"): energy range and spectrum, angular distributions and intensity. this is a specific project funded by the anr (the french national agency for research) and managed by the university of bordeaux, with a budget of 9.3 me (of which 1.3 me is for the phase of exploitation of the diagnostics and realization of experiments). the cost of the diagnostics is predicted to be much larger than for similar ones on smaller laser facilities. this is because the realization of diagnostics must take into account the safety issues (chamber activation, radiation hazards, emp, ...) as well as the need for remote handling, remote control and access. the passive detectors used in the diagnostics (films, ips, ...) will need to be automatically extracted from the insertion ports and probably automatically processed. the diagnostics themselves will be inserted in sids (diagnostic insertion systems) that will be moved into the interaction chamber and aligned to the target. in particular, two sids will be used for the petal+ diagnostics. they will be positioned almost facing the petal beam entrance window of the lmj interaction chamber (windows no. 12 and 26). two working groups have been established to develop and design the diagnostics and sids for i) electron spectroscopy, proton spectroscopy and imaging, and ii) large-band x-ray spectroscopy. because of the particular characteristics of the petal beam, large dynamical ranges have to be covered by these diagnostics. other diagnostics will be installed on the lmj/petal facility in the future.

the petal electron source spectrum is shown in fig. 2. it has been computed by e. lefebvre and a. compant la fontaine [12] for radioprotection purposes. these calculations can be considered as upper bounds on the intensities and electron energy spectra endpoints for the diagnostics design. our diagnostics must cover the range 0.1 ÷ 150 mev for the protons and electrons. in addition, a spectrometer dedicated to the highest energy electrons (above 200 mev) may be added. the energy resolution of the detectors will be 5 ÷ 10 % for both types of particles.
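to illustrate why the 0.1 ÷ 150 mev range calls for several spectrometer designs, the sketch below evaluates the relativistic bending radius r = p/(eB) across that range; the 0.5 t field value is an arbitrary assumption, not a petal+ design parameter.

```python
# illustrative sketch: relativistic bending radius of electrons in a magnetic
# spectrometer; the field value is an arbitrary assumption.
import math

MC2 = 0.511  # electron rest energy [mev]

def bending_radius_m(kinetic_mev, b_tesla):
    pc = math.sqrt((kinetic_mev + MC2) ** 2 - MC2 ** 2)  # momentum x c [mev]
    return pc * 1e6 / (299792458.0 * b_tesla)            # r = p / (e b)

for ke in (0.1, 1.0, 10.0, 150.0):
    print(f"{ke:6.1f} mev -> r = {bending_radius_m(ke, 0.5) * 100:.2f} cm at 0.5 t")
```

the radius spans from millimetres to about a metre over this range, which is one way to see why separate instruments are planned for the low, average and high energy parts of the spectrum.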
the angular range to be covered by the diagnostics will be as large as ∼ 20° around the direction normal to the petal target (protons/ions) or around the incoming petal laser direction (electrons), in order to cover the full angular distributions. the first aim of the petal+ diagnostics is to characterize the particle emission from the laser target interaction. to do so, electron magnetic spectrometers will be built in order to detect different parts of the spectrum: low, average and high energies. in addition, an activation detector such as the one developed by cenbg [10], which permits measurement of the angular distribution of energetic electrons, may be used. a small positron spectrometer is also under investigation within the petal+ project. the protons/ions will be detected with a two-component diagnostic. the first component will comprise a stack of radiochromic films or image plates, close to the target chamber center, in order to cover the expected 20° half-divergence of the proton/ion bunch. stacking of these passive detection systems will permit exploration of the ion beam divergence with a sufficient energy resolution. the second component will be a thomson parabola [25]. such a detector, which combines the magnetic and electrostatic analysis of a bunch of particles, will provide a clear separation between the different ion species as well as the good energy resolution needed to determine precisely the energy spectrum of the particles generated by the pw laser [9, 4]. in the present designs, this thomson parabola will also be used for the magnetic analysis of electrons in the energy range 1 ÷ 150 mev.

figure 4. radiation doses measured at several laser installations and comparison (solid line) with simulation results (calculated at 1 m behind a 2 mm w target)

after a detailed characterization of the particle emission from petal targets, the diagnostics will be used for plasma experiments. as an example, we mention proton radiography to determine the magnetic [14] or electric field [1] structure at the plasma scale, or to measure the density of the lmj plasmas [22, 26]. stacks of radiochromic films and/or ips will be used for the x-ray or proton radiography of targets irradiated with the lmj beams. the x-ray spectrometer will have the cauchois geometry [3, 6] and will cover the range 5 ÷ 100 kev. such a geometry, based on a transmission cylindrical crystal, has already been adopted in many laboratories (hxs, henex, dcs at the naval research laboratory, tcs at omega ep, llcs at llnl, lcs at luli, c2s at celia) because of a potentially large spectral range which, in particular, allows detection of kα emission lines from most materials with a variable resolution λ/∆λ ∼ 50 ÷ 300, depending on the distance of the detector from the rowland circle. the range 5 ÷ 100 kev was also chosen in order to complement the range of the x-ray spectrometers already planned and under construction for the lmj. this spectrometer is aimed at measuring the kα lines of any material. such lines are particularly useful for fast ignition and shock ignition dedicated experiments, since they act as tracer layers for the passage of fast electrons inside targets [24, 18].

figure 5. emp measurements performed by cea on laser facilities: 2000–2004 lle-omega; 2006 luli2000; 2007–2009 luli-pico2000; 2009–2010 lle; in addition, omega ep us feedback on nif and emp measurements on titan (national ignition campaign)
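the principle of the thomson parabola [25] can be illustrated with the standard small-angle deflection formulas, under which each charge-to-mass ratio falls on its own parabola on the detector. all geometry and field values below are arbitrary assumptions, not the petal+ design.

```python
# hedged sketch of the thomson parabola principle: small-angle electric and
# magnetic deflections of an ion of mass m and charge q; geometry and field
# values are arbitrary assumptions.
import math

E_FIELD, B_FIELD = 5e5, 0.3   # [v/m], [t] (assumed)
L, DRIFT = 0.1, 0.3           # field-region length and drift to detector [m] (assumed)
AMU, QE = 1.6605e-27, 1.602e-19

def trace_point(energy_mev, mass_amu, charge_state):
    m, q = mass_amu * AMU, charge_state * QE
    v = math.sqrt(2.0 * energy_mev * 1e6 * QE / m)  # non-relativistic speed
    lever = L * (DRIFT + L / 2.0)
    x = q * B_FIELD * lever / (m * v)               # magnetic deflection [m]
    y = q * E_FIELD * lever / (m * v * v)           # electric deflection [m]
    return x, y

for e in (1.0, 5.0, 20.0):
    print("p+ ", trace_point(e, 1.007, 1), "  c4+", trace_point(e, 12.0, 4))
```

eliminating the velocity shows y ∝ (m/q) x², so protons and the various carbon charge states land on distinct parabolas, which is what gives the clear species separation mentioned above.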
5. petal in the context of european research; development of the scientific research programme on petal/lmj

starting in 2015, the european scientific community will have access to petal/lmj (with a contractual 20 to 30 % of laser shots available for academic civilian research). this will be a unique facility (the only other one being nif) for addressing hed physics, and inertial fusion in particular. petal is a key element of the hiper project because i) it guarantees academic access to the lmj/petal installations (through the agreement between cea and the region aquitaine), ii) it will allow integrated experiments in the domain of fast ignition (allowing the injection of up to several hundred joules of fast electrons into the target), and iii) it will allow probing of lmj implosions in integrated experiments related to shock ignition. however, the installation itself will be very complex (problems of activation, remote control, emp, ...), and experiments must be carefully planned and prepared using numerical simulations, targets and diagnostics. preliminary experiments on smaller ("intermediate") laser facilities will be needed as an indispensable step before the "final" experiments on lmj/petal. for this reason, all the intermediate laser facilities play an essential role in ife research (orion and vulcan in the uk, lil and luli2000 in france, pals in prague, phelix at gsi). these installations are also important from a strategic point of view as a means of creating links between academic research and training for research on very large laser systems.

the first phase of petal commissioning (starting from 2015) will include characterization of the laser systems and diagnostics, and studies of the protons, x-rays and electrons produced in laser plasma interactions. the second phase (from 2016) will be dedicated to physics experiments on extreme states of matter produced by short-pulse heating of solid targets. the use of a pw laser such as petal within the context of plasma experiments provides basically two possibilities: either to heat matter isochorically (creating wdm and hed states) or to create a beam of secondary energetic particles to probe the properties of the plasma produced by the lmj beams. in the second case, the high power of petal allows the generation of intense beams of x-rays, gamma-rays, electrons and ions. these will allow probing of the dense states created by shock or adiabatic compression of samples with the lmj beams. in particular, direct measurements of the density and of the shock and fluid velocities can be made by using proton and hard x-ray radiography. it will also be possible to use proton and hard x-ray backlighting to probe the implosion and compression uniformity of targets imploded by lmj (the shock ignition approach to icf, polar direct drive). finally, proton radiography could be used to measure magnetic fields, especially those associated with jet formation in the domain of laboratory astrophysics.

figure 6. scheme of a cauchois transmission x-ray spectrometer

working groups are now elaborating a scientific programme for the years to come for academic civilian research on lmj/petal. this will be based on four "pillars":

(1.) fusion – opportunities for the hiper project: this will be a full-scale facility for the demonstration of shock ignition of fusion targets.
(2.) studies of matter in extreme conditions & high energy density physics.
(3.) laboratory astrophysics experiments.
(4.) acceleration and high energy physics.

such a scientific programme will define the scientific objectives and priorities to be pursued on the installation. this programme will be validated by the ilp direction and by the international scientific advisory committee of petal (sac-p). access to the petal/lmj installation will be decided by the sac-p on the basis of the priorities indicated in the scientific programme. a call for experimental proposals will be announced as soon as the installation is operational. to facilitate access, a "users' committee" (e.g. similar to the one operating at the omega facility) will be established.

references
[1] d. batani, et al. laser-driven fast electron dynamics in gaseous media under the influence of large electric fields. physics of plasmas 16:033104, 2009.
[2] n. blanchot. overview of petal, the multi-petawatt project in the lmj facility. ifsa 11 conference, september, & workshop on the physics with petal, december, bordeaux, france, 2011.
[3] y. cauchois. spectrographie des rayons x par transmission d'un faisceau non canalisé à travers un cristal courbé. journal de physique 3:320, 1932.
[4] d. jung, et al. development of a high resolution and high dispersion thomson parabola. review of scientific instruments 82:013306, 2011.
[5] e. d'humières, et al. proton acceleration mechanisms in high-intensity laser interaction with thin foils. physics of plasmas 12:062704, 2005.
[6] j. f. seely, et al. hard x-ray spectroscopy of inner-shell k transitions generated by mev electron propagation from intense picosecond laser focal spots. high energy density physics 3:263, 2007.
[7] m. l. fensin, j. s. hendricks, s. anghaie. the enhancements and testing for the mcnpx 2.6.0 depletion capability. journal of nuclear technology 170:68-79, 2010. based on la-ur-08-0305.
[8] k. flippo, et al. measurements of proton generation with intense kilojoule laser. journal of physics: conference series 244:022033, 2010.
[9] c. g. freeman, et al. calibration of a thomson parabola ion spectrometer and fujifilm imaging plate detectors for protons, deuterons, and alpha particles. review of scientific instruments 82:073301, 2011.
[10] m. gerbaux, et al. high flux of relativistic electrons produced in femtosecond laser-thin foil target interactions: characterization with nuclear techniques. review of scientific instruments 79:023504, 2008.
[11] s. h. glenzer, et al. demonstration of ignition radiation temperatures in indirect-drive inertial confinement fusion hohlraums. physical review letters 106:085004 & 109903, 2011.
[12] e. lefèbvre, a. compant la fontaine. note de calculs termes sources pour petal couplé à la chambre lmj, 2011. cea-dif.
[13] e. lefèbvre, et al. electron and photon production from relativistic laser-plasma interactions. nuclear fusion 3:629, 2003.
[14] c. k. li, et al. observations of electromagnetic fields and plasma flow in hohlraums with proton radiography. physical review letters 102:205001, 2009.
[15] j. d. lindl. the national ignition campaign: goals and progress. ifsa11 conference, bordeaux, france, 2011, and references therein.
[16] c. lion. the lmj program: an overview. ifsa11 conference, bordeaux, france, 2011.
[17] e. martinolli, et al. conical crystal spectrograph for high brightness x-ray kα spectroscopy in subpicosecond laser-solid interaction. review of scientific instruments 75:2024, 2004.
[18] e. martinolli, et al. fast electron transport and heating of solid targets in high intensity laser interaction measured by kα fluorescence.
physical review e 73:046402, 2006.
[19] j. l. miquel. present status of the lmj facility. ifsa11 conference, bordeaux, france, 2011.
[20] e. l. moses. the nif: an international high energy density science and inertial fusion user facility. ifsa11 conference, bordeaux, france, 2011.
[21] osaka, courtesy of prof. h. nishimura, ile.
[22] j. r. rygg, et al. proton radiography of inertial fusion implosions. science 319:1223, 2008.
[23] y. sentoku, a. kemp. hot-electron energy coupling in ultraintense laser-matter interaction. journal of computational physics 227:6846, 2008.
[24] r. b. stephens, et al. kα fluorescence measurement of relativistic electron transport in the context of fast ignition. physical review e 69:066414, 2004.
[25] j. j. thomson. rays of positive electricity. philosophical magazine 21:225, 1911.
[26] l. volpe, et al. proton radiography of laser-driven imploding target in cylindrical geometry. physics of plasmas 18:012704, 2011.
[27] s. wilks, et al. energetic proton generation in ultra-intense laser-solid interactions. physics of plasmas 8:542, 2001.

study of the blending efficiency of pitched blade impellers
i. fořt, t. jirout, f. rieger, r. allner, r. sperling

this paper presents an analysis of the blending efficiency of pitched blade impellers under a turbulent regime of flow of an agitated low-viscosity liquid. the conductivity method is used to determine the blending (homogenization) time of miscible liquids in pilot plant mixing equipment with standard radial baffles. for the given homogeneity degree (98 %) a three-blade pitched blade impeller is tested with various off-bottom clearances, vessel/impeller diameter ratios and various impeller pitch angles. the experimental results show, in accordance with theoretical data from the literature, that the greatest effect on the dimensionless blending time is exhibited by the vessel/impeller diameter ratio and the impeller pitch angle. the number of total circulations necessary for reaching the chosen homogeneity degree depends on the impeller pitch angle and amounts to more than three. finally, the energetic efficiency of the blending process is calculated. the results of this study show that the highest energetic efficiency of the three-blade pitched blade impeller appears for the pitch angle α = 24°, the vessel/impeller diameter ratio T/d = 2 and the impeller off-bottom clearance h/d = 1.

keywords: pitched blade impeller, blending of liquids, degree of homogeneity, turbulent flow.

1 introduction
stirring processes are used in many parts of the chemical, pharmaceutical, food and bioengineering industries. in most of them stirring is a basic operation, e.g. during homogenization (or blending) of miscible liquids, i.e. the compensation of temperature and concentration differences. for such a process it is very important to know the blending efficiency when designing an industrial plant; the optimal design of stirring plants also improves the economics of the mixing process. in this paper we study the blending efficiency of pitched blade impellers (pbi), especially the course of homogenization performed by a three-blade (3-b) pbi with various pitch angles and different vessel/impeller diameter ratios. a few papers present experimentally determined blending times of the pbi [e.g. 1, 2, 3, 4], but no study describing the blending efficiency of the above mentioned impellers has appeared up to now. therefore, additional experimental data needs to be compiled to supplement our knowledge of the blending efficiency of the 3-b pbi with different geometric configurations.

2 theoretical
let us consider a three-blade pitched blade impeller (3-b pbi) in a blending process of low-viscosity miscible liquids. we can assume a turbulent regime of flow of the agitated batch. the dependence of the dimensionless time of homogenization (mixing time) on the reynolds number, modified for a rotary impeller,

Re = \frac{n d^2 \varrho}{\mu} , (1)

expressed by the function

n t = f(Re) , (2)

can be divided into three regions (see fig. 1). in the range Re < 10, the liquid around the impeller moves with the impeller rotation; the process of mixing is therefore negligible and the real mixing time is very long. when Re > 10, the flow around the impeller is turbulent; with increasing reynolds number the viscous forces in the rest of the agitated batch decrease and the inertia forces increase. in the range Re > 10⁴, the effects of the viscous forces are negligible and the only forces acting in the whole agitated batch are the inertial forces.

fig. 1: typical dependence of the dimensionless blending time on the reynolds number for a high-speed rotary impeller (creeping flow, transient region, turbulent region with n t = const.)
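the regime classification in eq. (1) and fig. 1 is trivial to evaluate; a minimal matlab sketch (the numerical values are illustrative assumptions, not data from this study):

n   = 5;        % impeller speed [1/s]
d   = 0.100;    % impeller diameter [m]
rho = 1000;     % liquid density [kg/m^3]
mu  = 0.001;    % dynamic viscosity [Pa s]
Re  = n*d^2*rho/mu;      % eq. (1), modified reynolds number (here 5e4)
if Re < 10
    disp('creeping flow: mixing negligible');
elseif Re < 1e4
    disp('transient region: n t depends on Re');
else
    disp('fully turbulent: n t independent of Re');
end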
the dimensionless blending time does not depend on the reynolds number, but it does depend on the geometry of the agitated system and on the chosen degree of homogeneity [5]

c(t) = \frac{\bar c(t) - c_k}{c_0 - c_k} , (3)

where the average concentration in the volume of the mixed batch, apart from the volume of the concentration impulse ΔV, is

\bar c(t) = \frac{1}{V} \int_V c(t) \, dV . (4)

the initial concentration is defined as

c_0 = c(t = 0) (5)

and the final concentration of the dissolved matter in the agitated liquid is

c_k = c(t \to \infty) . (6)

the quantity \bar c(t) is the instantaneous concentration of the dissolved matter averaged over the volume V of the agitated liquid, reduced by the volume ΔV of the concentration impulse. for the sake of simplicity we neglect the volume ΔV with respect to the value V; the value ΔV amounts to less than one thousandth of the value V.

turbulent flow of an agitated liquid is realized by its circulation. for the given homogeneity degree the blending process takes place during a given number n_c of total liquid circulations [5]

t = n_c t_t , [c = const.] (7)

where t_t is the mean time of total liquid circulation, which can be calculated from the relation

t_t = \frac{V}{Q_t} , (8)

where Q_t is the total volumetric flow rate of the agitated liquid. if we consider a "squared" configuration of the volume of the agitated liquid, i.e.

V = \frac{\pi}{4} T^3 , (9)

and introduce the total flow rate number

N_{Q_t} = \frac{Q_t}{n d^3} , (10)

we can finally rearrange eq. (7) into the form

n t = \frac{n_c}{N_{Q_t}} \, \frac{\pi}{4} \left(\frac{T}{d}\right)^3 , [c = const.] (11)

eq. (11) includes the impeller speed n, the impeller diameter d, and the mixing vessel diameter T. the energetic blending criterion E indicates in dimensionless form the energy consumption necessary for reaching the chosen homogeneity degree [6]:

E = \frac{P t^3}{\varrho T^5} = Po \, (n t)^3 \left(\frac{d}{T}\right)^5 . (12)
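eq. (11) is easily inverted to obtain the number of circulations implied by a measured blending time; a minimal matlab sketch, using one configuration whose values are quoted in tab. 4 below:

nt  = 54.1;     % measured dimensionless blending time n*t [-]
NQt = 1.614;    % total flow rate number [-], from [5]
ToD = 3;        % vessel/impeller diameter ratio T/d [-]
nc  = nt*NQt/((pi/4)*ToD^3);   % inverted eq. (11)
% gives nc ≈ 4.1, matching the value 4.116 listed in tab. 4

about four total circulations are thus needed for the chosen homogeneity degree, in line with the discussion of tab. 4 below.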
to calculate the energetic criterion we need to know the impeller power input P and the blending time for the given configuration of the agitated system. the quantity P can be calculated from the power number

Po = \frac{P}{\varrho n^3 d^5} , (13)

where ρ is the density of the agitated liquid. the quantity Po depends on the geometry of the system agitated by a pbi, and this dependence can be expressed by the equation [7]

Po = 1.507 \left(\frac{T}{d}\right)^{-0.365} \left(\frac{h}{d}\right)^{-0.165} \left(\frac{H}{T}\right)^{0.140} (\sin \alpha)^{2.077} \, n_b^{0.701} , (14)

valid for values of the reynolds number greater than 10⁴. it contains, among others, the quantities n_b (the number of impeller blades), α (the pitch angle of the impeller blades), and h (the impeller off-bottom clearance).

3 experimental
after the injection of a small amount of tracer ΔV (about 1 ccm) into the agitated liquid, the concentration changes and the mixing time were measured at appropriate locations in the agitated liquid. the conductivity method was used for measuring the blending time. this method is based on monitoring the changes in electrical conductivity within the mixed liquid. the change in electrical conductivity was caused by adding a sample of concentrated sodium chloride solution into the liquid below its surface along the impeller shaft. in our experiments the injected sample had approximately the same density and viscosity as the mixed liquid, and thus the effect of the archimedes number was eliminated. after adding the tracer, the time change of conductivity was measured and recorded, and the blending time was determined. the conductivity cell consisted of two platinum wire electrodes of 0.8 mm diameter in the shape of a rectangle 8×10 mm, 8 mm apart. the volume of the conductivity cell was approximately 0.8 ccm. the cell was located below the surface of the mixed liquid and at 1/10 of the vessel diameter from the wall of the vessel (see fig. 2). we chose this position because during a test with the decolourizing method a dead space was detected in this volume for the impellers and vessels under investigation. the volume of the injected liquid varied in the range 0.5-1.5 ccm. the process of homogenization was recorded with a fast chart recorder. the time of homogenization (the blending time) was found as the moment from which the fluctuation of the measured electrical voltage U remained within ±2 %. a typical time course of the indicated voltage is shown in fig. 3. the principal layout of the pilot plant experimental equipment is shown in fig. 2. the blending time was determined for various geometric configurations; in all cases the impeller pumped the liquid downwards, towards the bottom of the cylindrical vessel, and was located in its axis of symmetry. measurements were carried out with two different series of 3-b pbis (fig. 4 shows the geometry of the impellers used) with diameters of 100 mm and 67 mm, respectively. for each impeller we used three different pitch angles: α = 24°, 35°, and 45°.

fig. 2: cylindrical vessel with a high-speed rotary impeller, radial baffles and the position of the conductivity probe (1 - conductivity probe, 2 - baffle, 3 - vessel, 4 - impeller)

for all the impellers, three different vessel/impeller diameter ratios were investigated: T/d = 2, 3, 4.5. the influence of the impeller off-bottom clearance was studied at the standard configuration (T/d = 3, α = 45°). all these configurations are summarized in tab. 1.
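eqs. (12)-(14) are straightforward to evaluate together; the following matlab lines are a sketch using eq. (14) as reconstructed above and the geometry of one configuration from tab. 2 and tab. 5 below:

ToD = 3; hd = 0.75; HoT = 1; nb = 3;   % geometry: T/d, h/d, H/T, blade count
alpha = 45*pi/180;                     % pitch angle [rad]
Po = 1.507*ToD^-0.365*hd^-0.165*HoT^0.140*sin(alpha)^2.077*nb^0.701;  % eq. (14)
nt = 51.8;                             % measured n*t for this geometry (tab. 2)
E  = Po*nt^3*(1/ToD)^5;                % energetic criterion, eq. (12)
% Po ≈ 1.11 and E ≈ 6.3e2, close to the values 1.11 and 633 in tab. 5

the agreement with the tabulated values is a useful consistency check of the reconstructed exponents in eq. (14).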
additionally, the blending time was determined with a dished bottom of the vessel (for T/d = 2, 3 and α = 45°). all the vessels were equipped with four baffles distributed equally around the vessel wall (see fig. 2), the ratio between the baffle width b and the vessel diameter being b/T = 0.1. a larger interval of the reynolds number was covered by using three different levels of viscosity of the tested liquids: 1. distilled water [viscosity around 1 mPa·s], 2. 25 % w/w glycerol-water solution [viscosity around 2 mPa·s], and 3. 45 % w/w glycerol-water solution [viscosity around 4 mPa·s]. the viscosity was measured with a hoeppler viscosimeter b3 (mlw prufgeraetewerke, freital, germany) at five different temperatures, and the viscosity of the mixed liquid was calculated by linear interpolation from its known temperature dependence.

the accuracy of the data obtained depends significantly on the independent variables, i.e. the viscosity of the mixed liquid, the impeller speed and the geometric configuration of the agitated system. the impeller speed was measured by a photoelectric revolution counter with an accuracy of ±1/min. the diameter of the impeller and the width of its blade were manufactured with an accuracy of ±0.1 mm, and the pitch angle with an accuracy of ±30′. the impeller position in the vessel and the height of the liquid in the vessel were measured by a ruler with an accuracy of ±1 mm. the accuracy of the measurement of the dynamic viscosity was ±0.1 mPa·s. the process of homogenization was recorded with a fast chart recorder with a scale of 1 mm ~ 1 second; the accuracy of the determination of the blending time was considered to be ±1 s. for each configuration of the agitated system the blending time was measured at five different values of the impeller speed at constant viscosity of the mixed liquid. because of the required confidence level of the final experimental results [3, 4, 6], five courses of the blending time were determined for each impeller speed and the average value for the given experiment was calculated. the blending time measured and calculated by this process exhibits a relative deviation of ±5 % from the calculated average value.

4 results and discussion
the results of the experiments were evaluated from the point of view of the dependence of the blending time on the geometric parameters of the agitated system; further, according to the formulas in the theoretical part of this study, the relation between the blending time and the mixed liquid circulation was investigated. finally, the blending efficiency of the 3-b pbi was calculated with respect to the various geometries of the investigated system, with the aim of finding the optimal arrangement for the blending process. it follows from all the results of the experiments (see examples in figs. 5, 6) that when the reynolds number (modified for a rotary impeller) exceeds ten thousand, the dimensionless blending time does not depend on Re.

fig. 3: recorded trace (example of the time course of homogenization); the blending time t is read off where the fluctuation of the measured voltage U [mV] remains within ΔU/U = ±2 %
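the read-off illustrated in fig. 3 can be mimicked numerically; a minimal matlab sketch, assuming the recorded trace is available as vectors t and u (the variable names and the use of the last 50 samples for the steady value are our assumptions):

u_inf  = mean(u(end-49:end));             % steady voltage after blending
outside = abs(u - u_inf) > 0.02*u_inf;    % samples outside the ±2 % band
t_mix  = t(find(outside, 1, 'last'));     % blending time: last excursion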
fig. 4: three-blade pitched blade impeller (d = 67 mm, 100 mm, h/d = 0.2)

T/d [-] | h/d [-]               | vessel diameter T [mm] | impeller diameter d [mm] | off-bottom clearance h [mm]
2       | 1                     | 200 | 100 | 100
3       | 1; 0.75; 0.50; 0.333  | 200 | 67  | 67; 50; 33; 22
3       | 1; 0.75; 0.50; 0.333  | 300 | 100 | 100; 75; 50; 33
4.5     | 1                     | 300 | 67  | 67

table 1: pilot plant configurations investigated, for n_b = 3 (H = T)

dependence of the blending time on the geometry of the mixed system
taking into account the independence of the quantity n t of the reynolds number for the turbulent regime of flow of an agitated liquid, the dependence of the dimensionless blending time on the geometric parameters of the mixing system was expected in the power form

n t = k_1 \left(\frac{T}{d}\right)^{a_1} \left(\frac{h}{d}\right)^{-b_1} (\sin \alpha)^{-c_1} (15)

or

n t = k_2 \left(\frac{T}{d}\right)^{a_2} \left(\frac{h}{d}\right)^{-b_2} \alpha^{-c_2} , (16)

with α in degrees in eq. (16). the values of the exponents a_i, b_i, c_i and the parameters k_i (i = 1, 2) were calculated from the results of the experiments (see tab. 2) by means of multidimensional log-log linear regression, and they are listed in tab. 3 together with the correlation coefficients r_i (i = 1, 2) of the corresponding regression equations. it follows from these two tables that there is no difference between the agreement of the two proposed equations [(n t)_calc vs. (n t)_exp] and that the correlation coefficients of both equations are practically the same, expressing a suitable confidence level of the considered power dependence. when we look at the values of the exponents in eqs. (15) and (16), the highest influence on the blending time is that of the vessel/impeller diameter ratio: the values of the exponents a_1 and a_2 are approximately two. similar values were found for the standard rushton turbine impeller [3] and for four-blade [3] and six-blade pbis [4]. the values of the exponents c_1 and c_2 are moderate and do not differ significantly: the higher the pitch angle, the lower the blending time.

fig. 5: dimensionless blending time as a function of Re (d = 67, 100 mm, T/d = 3, α = 35°) for different viscosities of the mixed liquid (0.92-4.08 mPa·s)

fig. 6: dimensionless blending time as a function of Re (d = 67, 100 mm, T/d = 3, α = 45°) for different viscosities of the mixed liquid (0.93-4.08 mPa·s)

table 4 presents a survey of the calculated values of the number of total liquid circulations n_c necessary for attaining the given homogeneity degree. the values of the total flow rate number N_Qt were taken from the literature [5], and the corresponding values of the dimensionless blending time n t follow from this study. it follows from this table that 3-4 circulations of the mixed liquid are necessary for attaining the required degree of homogeneity, and that the impeller off-bottom clearance does not exhibit a significant influence on this number. it seems that a decrease in the impeller blade pitch angle causes a decrease in the number of liquid circulations, independently of the vessel/impeller diameter ratio.
the energetic blending criterion
the energetic blending criterion E (see eq. (12)) characterizes in dimensionless form the energy necessary for attaining the chosen homogeneity of the agitated liquid. the higher the energetic blending criterion, the lower the blending efficiency of the given geometry of the agitated system. table 5 gives a survey of the values of the quantity E following from our experimental data on the blending time; the impeller power number Po was calculated from eq. (14).

T/d [-] | h/d [-] | α [°] | (n t)_exp [-] | (n t)_calc (eq. 15) [-] | (n t)_calc (eq. 16) [-]
2   | 1    | 24 | 32.8 ± 2.0   | 29.8  | 29.8
2   | 1    | 35 | 25.3 ± 1.3   | 24.5  | 24.6
2   | 1    | 45 | 18.6 ± 1.0   | 21.7  | 21.7
3   | 1    | 24 | 60.3 ± 4.2   | 66.9  | 66.8
3   | 1    | 35 | 58.3 ± 2.7   | 55.0  | 55.3
3   | 1    | 45 | 49.2 ± 2.2   | 48.9  | 48.8
4.5 | 1    | 24 | 142.0 ± 8.1  | 150.1 | 150.0
4.5 | 1    | 35 | 118.0 ± 11.2 | 123.6 | 124.2
4.5 | 1    | 45 | 112.0 ± 10.5 | 109.8 | 109.5
3   | 0.33 | 45 | 54.5 ± 3.2   | 56.1  | 56.1
3   | 0.50 | 45 | 55.3 ± 4.8   | 53.3  | 53.3
3   | 0.77 | 45 | 51.8 ± 3.5   | 50.7  | 50.6

table 2: dependence of the dimensionless blending time on the impeller pitch angle (three-blade pitched blade impeller), c = 0.02, n_b = 3

eq.       | k_i   | a_i   | b_i   | c_i   | r_i
15, i = 1 | 4.49  | 1.994 | 0.125 | 0.566 | 0.992
16, i = 2 | 36.63 | 1.994 | 0.128 | 0.501 | 0.992

table 3: parameters of the power regression with sin α (eq. 15) and with α (eq. 16), c = 0.02, n_b = 3, Re > 10⁴

α [°] | T/d [-] | h/d [-] | n t  | N_Qt [-] | n_c [-] | n_c,av [-]
45 | 3   | 0.30 | 54.1  | 1.614 | 4.116 | 4.072
45 | 3   | 0.50 | 55.3  | 1.654 | 4.313 |
45 | 3   | 0.75 | 51.8  | 1.706 | 4.167 |
45 | 3   | 1.00 | 47.3  | 1.654 | 3.961 |
24 | 2   | 1.00 | 37.8  | 0.686 | 3.584 | 3.323
24 | 3   | 1.00 | 60.3  | 1.024 | 2.913 |
24 | 4.5 | 1.00 | 142.0 | 1.751 | 3.474 |
24 | 3   | 0.75 | 60.3  | 1.331 | 3.788 | 4.162
35 | 3   | 0.75 | 58.7  | 1.637 | 4.534 |
45 | 3   | 0.75 | 51.8  | 1.706 | 4.167 |

table 4: number of total liquid circulations for the given homogeneity degree (c = 0.02) and number of impeller blades (n_b = 3)

for the pitch angle α = 45° we can see a permanent increase in the energetic blending criterion with increasing ratio T/d. for the pitch angles α = 24° and 35° a monotonous trend cannot be seen: e.g., the energetic criterion for α = 35° is similar for the ratios T/d = 3 and 4.5, while for the ratios T/d = 2 and 4.5 at the pitch angle α = 24° the values of the energetic criterion differ by just 4 %, and the difference between the criteria corresponding to the ratios T/d = 3 and 4.5 is about 20 %. the vessel/impeller ratio influences the difference between the values of the energetic blending criterion when the flat bottom of the vessel is replaced by a dished bottom: the difference is more significant for the ratio T/d = 3, where the flow pattern of the agitated liquid follows the shape of the bottom better, and the blending process is then faster than in the system with the flat bottom. the data in tab. 5 also show that there is a dependence between the energetic blending criterion and the impeller off-bottom clearance: the criterion E decreases with increasing off-bottom clearance, and the difference of the energetic criteria between the shortest and the longest impeller off-bottom clearance is around 85 %. the favourable effect of increasing the ratio h/d can be explained by the fact that the liquid is pumped by the impeller towards the bottom, where the high level of turbulence contributes to the better blending process. the energetic blending criterion E increases with increasing pitch angle at the ratio T/d = 4.5, but at the ratio T/d = 3 we cannot find a clear trend for the quantity E as a function of the quantity α.
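the log-log regression behind tab. 3 can be reproduced directly from the data of tab. 2; a minimal matlab sketch (the variable names are ours):

ToD   = [2 2 2 3 3 3 4.5 4.5 4.5 3 3 3]';          % T/d
hd    = [1 1 1 1 1 1 1 1 1 0.33 0.50 0.77]';       % h/d
alpha = [24 35 45 24 35 45 24 35 45 45 45 45]';    % pitch angle [deg]
nt    = [32.8 25.3 18.6 60.3 58.3 49.2 142 118 112 54.5 55.3 51.8]';
A = [ones(12,1) log(ToD) -log(hd) -log(sind(alpha))];  % design matrix, eq. (15)
p = A\log(nt);                 % least squares in log space
k1 = exp(p(1));                % k1 ≈ 4.5, p(2) ≈ 2.0, p(3) ≈ 0.13, p(4) ≈ 0.57
% compare with tab. 3: 4.49, 1.994, 0.125, 0.566

replacing the last column by -log(alpha) gives the eq. (16) variant with α in degrees.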
α [°] | T/d [-] | h/d [-] | n t [-] | Po [-] | E [-]
24 | 2   | 1    | 37.8  | 0.39 | 430
24 | 3   | 1    | 60.3  | 0.34 | 367
24 | 4.5 | 1    | 142.0 | 0.29 | 450
35 | 2   | 1    | 25.3  | 0.80 | 403
35 | 3   | 1    | 58.7  | 0.69 | 572
35 | 4.5 | 1    | 118.0 | 0.59 | 528
45 | 2   | 1    | 18.6  | 1.23 | 236
45 | 2   | 1    | 18.3  | 1.23 | 247 a)
45 | 3   | 1    | 42.3  | 1.06 | 329 a)
45 | 3   | 1    | 48.7  | 1.06 | 454
45 | 3   | 0.75 | 51.8  | 1.11 | 633
45 | 3   | 0.50 | 55.3  | 1.19 | 824
45 | 3   | 0.33 | 54.5  | 1.29 | 858
45 | 4.5 | 1    | 112.0 | 0.92 | 697
a) dished bottom

table 5: dependence of the energetic blending criterion on the geometry of an agitated system with a three-blade pitched blade impeller (c = 0.02), flat bottom of cylindrical vessel unless noted

generally, for the off-bottom clearance h/d = 1, the criterion E decreases with increasing pitch angle, with certain exceptions especially around the pitch angle α = 35°.

5 conclusions
a three-blade pitched blade impeller seems to be suitable for blending processes under the turbulent regime of flow of an agitated liquid. the correlation found allows us to calculate the blending time for different pitch angles and ratios T/d and h/d at the homogeneity degree c = 0.02. among the geometries investigated, the most energetically efficient configuration appears to be α = 24°, T/d = 2, h/d = 1; on the other hand, the most inefficient configuration was found at the geometry α = 45°, T/d = 4.5, h/d = 1.

this research was subsidised by research project of the ministry of education of the czech republic no. j04/98: 212200008.

list of symbols
b baffle width [m]
c degree of homogeneity
c concentration [kg·m⁻³]
d impeller diameter [m]
E energetic blending criterion
H total liquid depth [m]
h impeller off-bottom clearance [m]
n impeller speed [s⁻¹]
n_b number of impeller blades
n t dimensionless blending time
N_Qt total flow rate number
P impeller power consumption [W]
Po power number
r correlation coefficient
Re reynolds number modified for a rotary impeller
T vessel diameter [m]
t homogenization (blending) time [s]
t_t time of total liquid circulation [s]
Q_t total volumetric flow rate of agitated liquid [m³·s⁻¹]
U voltage [V]
V volume of agitated liquid [m³]
ΔV volume of concentration impulse [m³]
α pitch angle of impeller blades [°]
μ dynamic viscosity [Pa·s]
ρ density [kg·m⁻³]

subscripts
calc calculated value
exp experimental value
k final value
0 initial value

references
[1] nagata, sh.: mixing. principles and applications. kodansha ltd., tokyo, john wiley & sons, n.y., 1975
[2] gvc-fachausschuss "mischvorgänge": mischen und rühren. vdi gesellschaft verfahrenstechnik und chemieingenieurwesen, düsseldorf, 1998
[3] procházka, j., landau, j.: homogenization of miscible liquids by rotary impellers. collect. czech. chem. commun. 26, 1961, pp. 2961-2974
[4] kvasnička, j.: thesis. research institute of chemical equipment (vúchz-chepos), brno, 1967
[5] fořt, i., valešová, h., kudrna, v.: liquid circulation in a system with axial mixer and radial baffles. collect. czech. chem. commun. 36, 1971, pp. 164-185
[6] rieger, f., novák, v.: homogenization efficiency of helical ribbon and anchor agitators. chem. eng. jour. 9, 1975, pp. 63-70
[7] medek, j.: power characteristic of agitators with flat inclined blades. int. chem. eng. 20, 1980, pp. 665-672

doc. ing. ivan fořt, drsc.
ing. tomáš jirout
prof. ing. františek rieger, drsc.
dept. of process engineering
phone: +4202 2435 2713, fax: +4202 24310292
e-mail: fort@fsid.cvut.cz
czech technical university in prague
faculty of mechanical engineering
technická 4, 166 07 praha 6, czech republic
ing. ralf allner
prof. dr. ing. reinhard sperling
e-mail: reinhard.sperling@lbv.hs.anhalt.de
dept. of chemical engineering
anhalt university of applied sciences
hochschule anhalt (fh), koethen, germany

nonlinear continuous system identification by means of multiple integration ii
j. john

this paper presents a new modification of the multiple integration method [1, 2, 3] for continuous nonlinear siso system identification from measured input-output data. the model structure is changed compared with [1]; this change enables more sophisticated systems to be identified. the resulting matlab program is available in [4]. as was stated in [1], there is no need to reach a steady state of the identified system. the algorithm also automatically filters the measured data with respect to low-frequency drifts and offsets, and offers the user a potent tool for selecting the frequency range of validity of the obtained model.

keywords: continuous system identification, multiple integration.

1 basic definitions
let us take a nonlinear continuous time-invariant siso (single input - single output) system, described by the state equations (see also fig. 1)

\dot x_i(t) = x_{i+1}(t) + f_i(u(t), y(t)) , \; i = 1, 2, \ldots, n-1 ; \quad \dot x_n(t) = f_n(u(t), y(t)) (1)

and an output equation

y(t) = g_y^{-1}(u(t), x_1(t)) . (2)

(the state-space representation is substantially changed in comparison with [1].) let us suppose that u and y are the only measurable quantities in the system. the state-space diagram must fulfill the following rules:
1. the basic string of the scheme consists of n integrators followed by an algebraic block g_y^{-1}(u, x_1).
2. there exists an inverse function g_y(u, y) of the output function g_y^{-1}(u, x_1) from eq. (2).
3. all the direct paths of the scheme lead from the input source u to the basic string.
4. all the feedback loops of the scheme lead via the output algebraic block g_y^{-1}(u, x_1), and none of them is algebraic, i.e., all of them contain at least one integrator.

fig. 1: state-space diagram of the identified model

then we can define

x_1(t) = -f_0(u(t), y(t)) , (3)

where

f_0(u(t), y(t)) = -g_y(u(t), y(t)) . (4)

the functions f_i are supposed to be linear in the parameters a_{i,j}, i.e., to have the form

f_i(u(t), y(t)) = \sum_{j=1}^{m_i} a_{i,j} \, g_{i,j}(u(t), y(t)) ; \quad i = 0, 1, \ldots, n , (5)

where the g_{i,j} are known (generally nonlinear) functions of the measured data and the a_{i,j} are (generally unknown) constants. example:

f_3(u, y) = a_{3,1} u_1 y + a_{3,2} \cos y + a_{3,3} u_1 u_2 y .

2 identification algorithm
for the sake of simplicity, let us denote a multiple integral of a time-dependent variable x(t) as

x(i, t) = \int_0^t \int_0^{t_i} \cdots \int_0^{t_2} x(t_1) \, dt_1 \, dt_2 \cdots dt_i ; \quad x(0, t) = x(t) , \; i \ge 0 . (6)

note that (6) corresponds to the time response of a chain of i integrators in series, with zero initial conditions, excited by the input function x(t). the first determined integral from 0 to t of the last state-space variable x_1(t) in fig. 1 can be expressed as

x_1(1, t) = \sum_{i=1}^{n} \left[ x_i(0) \frac{t^i}{i!} + \big(f_i(u(t), y(t))\big)(i+1, t) \right] . (7)

from (3), (4), (7) follows

\sum_{i=1}^{n} x_i(0) \frac{t^i}{i!} + \sum_{i=0}^{n} \big(f_i(u(t), y(t))\big)(i+1, t) = 0 . (8)

now let us evaluate (8) for kT, k = 1, 2, \ldots, n+2, and multiply both sides of the k-th equation by a binomial coefficient (-1)^k \binom{n+2}{k}. let us make a sum of the resulting (n+2) equations to obtain
\sum_{k=1}^{n+2} (-1)^k \binom{n+2}{k} \left[ \sum_{i=1}^{n} x_i(0) \frac{(kT)^i}{i!} + \sum_{i=0}^{n} \big(f_i(u(t), y(t))\big)(i+1, kT) \right] = 0 . (9)

because

\sum_{k=1}^{n+2} (-1)^k \binom{n+2}{k} k^m = 0 ; \quad m = 1, 2, \ldots, n+1 , (10)

the generally non-measurable initial conditions x_i(0) disappear from (9) and we obtain

\sum_{k=1}^{n+2} (-1)^k \binom{n+2}{k} \sum_{i=0}^{n} \big(f_i(u(t), y(t))\big)(i+1, kT) = 0 . (11)

let us denote the weighted sum of multiple determined integrals (6) (after evaluation of this expression we obtain a scalar constant) as

c(x, i, n, T) = \sum_{k=1}^{n+2} (-1)^k \binom{n+2}{k} x(i, kT) . (12)

from (5), (11), (12) follows the linear algebraic equation for the unknown coefficients a_{i,j}

\sum_{i=0}^{n} \sum_{j=1}^{m_i} a_{i,j} \, c\big(g_{i,j}(u(t), y(t)), i+1, n, T\big) = 0 . (13)

the weighted sums c of the determined integrals (12) of the (generally nonlinear) functions g_{i,j}(u, y) represent in (13) the known linear coefficients of the algebraic linear equation for the unknown constants a_{i,j}. by shifting the time origin of the input-output data and repeating algorithm (13) we obtain another algebraic equation. after repeating this process enough times we obtain a system of algebraic equations that can be solved for the unknown constants a_{i,j}, for example by the least squares method. to obtain a unique solution, one or more constants a_{i,j} must be known; these constants are then put on the right side of system (13). the matlab program mi, described in part 4, was designed to enable such computation.
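the kernel of (6) and (12) is easy to sketch in matlab. the following function is a minimal illustration only, not the program mi from [4]; the function name wsum, the sampling period h and the use of cumtrapz for the repeated integration are our assumptions, and the record must be at least (n+2)T long.

function c = wsum(x, i, n, T, h)
% weighted sum of multiple determined integrals, eq. (12):
% c = sum_{k=1}^{n+2} (-1)^k nchoosek(n+2,k) x(i, k*T)
% x ... column vector of equally sampled data, h ... sampling period
xi = x;
for m = 1:i
    xi = h*cumtrapz(xi);          % one more determined integral, eq. (6)
end
c = 0;
for k = 1:n+2
    idx = round(k*T/h) + 1;       % sample index of time k*T
    c = c + (-1)^k*nchoosek(n+2, k)*xi(idx);
end

with such a routine, one row of system (13) is assembled by evaluating wsum for every column g_{i,j}(u, y) of the data matrix; shifting the time origin of the data yields further rows.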
it does not fulfill the condition of closing the feedback loops via © czech technical university publishing house http://ctn.cvut.cz/ap/ 65 acta polytechnica vol. 41 no.1/2001 u y fig. 2: state-space diagram of the linear system fig. 3: parallel dynamo the measurable output variable y (item 4, page 64). the diagram must therefore be restructured, as in the fig. 5. then it corresponds to a state-space equation � , , ,x a y a u y u a u y u 1 1 1 1 2 1 2 1 3 1 2 3 � � � � � �� � � (21) the negative inverse output function is then �f u y y u 0 2 , � � . (22) the algebraic equation (13) for computing the unknown parameters is �� � �� � � �� � � � � � y u t a y t a u y u t 2 1 1 1 2 1 2 0 1 1 1 1 1, , , , , , , , ,, , � � �� � � � � � � � �� � � � � � � � � �a u y u t1 3 1 2 3 1 1 0, , , ,� (23) in original notation: � � � � � � � � � � � � � � � � � � � � � � � 2 1 2 0 0 2 1 1 0 2 e t t t e t t t n e t t t t t � � d d d d � � � � t e t t t a n r t e t t t ttt 2 1 1 0 2 0 2 0 1 1 1 2 2 � � � � � � � � � � � � � ��� d d d � � � � 1 0 2 1 1 1 1 0 2 0 2 0 2 2t ttt t r t e t t t t b n � ��� � � � � � � � � � � � � � d d d � � � � � � � 2 1 1 1 1 0 2 1 1 1 2 r t e t t t t r t e t t t t � � � � � � � � � � � � � � � � 3 3 d d d 1 0 2 0 2 0 2 0 ttt t��� � � � � � � � � � �d . (24) 4 identification program mi and its use the identification program (available in [4]) is written for one-shot identification (identification off-line). the basic condition for using the program is to find a proper structure of the model, corresponding to the state-space diagram in fig. 1. this is not always a simple task and sometimes we need to transform the physical reality, as we did it in the case of the parallel dynamo in item b of the previous part. the other necessity is the existence of relevant data u, y in the form of time vectors. the data must contain a considerable content of frequencies interesting from the point of view of the future model. (these frequencies often lie around the critical frequency �k, i.e., during the measurement, at least for a short time, the identified system should oscillate in phase opposition.) the shortest integration period t, which is one of the items of the program input data, has to be (see [1]) tmin � �/�k. by shannon’s theorem, the data ought to have at least ten samples in the critical period. this implies for the sample period h � 0.2 tmin. to obtain one algebraic equation (13), a time sequence (n +2)t of data is necessary. in the program, the time origin of the data for a new equation (13) is shifted by t/2. this implies that from the time period tmax, approximately ne(t) = 2 ( tmax – (n +2 ) t) /t+1= 2tmax / t – 2 n –3 equations will be obtained (an approximation is given by rounding). more than one integration period can be put for one program call, and for each of them its own system of ne(t) algebraic equations (13) will be obtained. before the final computing of the parameters, all these systems are joined together and the parameters are calculated from this joint system by the method of least squares. by a proper choice of the integration period vector t we can freely select the filtration properties of the kernels (see [1] or [4] for further details). it is recommended to choose the integration periods in a geometric series beginning by tmin = �/�k and with quotient q � 1.5�4. 
the program displays on the screen the numbers of equations used for the given integration period t, the sum of these numbers, the root mean square error of the complete equation system, and the singular numbers of its matrix. the singular numbers can serve as a measure of the system conditionality number and, consequently, the error of the calculated parameters. let us suppose in the following that the model contains m independent nonlinear blocks gi,j from the equation (5). these blocks must be programmed as vector functions of the measured data vectors (by the period convention of matlab). the program is called by a statement parameters = mi(h,t,staircase,data,c) where h is the sampling periods of the measured data t is the row vector of the sampling periods staircase is a row m-vector of data indicators; for starcase(i) = 0 the corresponding data will be integrated as continuous, by euler’s method, for staircase(i) = 1 the corresponding data will be integrated as stepwise with a valid value at the end of the sampling period, for staircase(i) = �1 the corresponding data will be integrated as stepwise with a valid value at the beginning of the sampling period; 66 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 41 no.1/2001 fig. 4. state-space diagram of the parallel dynamo fig. 5: restructured state-space diagram of the parallel dynamo data is a matrix consisting of m column vectors of (generally) nonlinear functions of measured data, corresponding to the functions from equation (5); c is a matrix describing the model structure. it consists of m columns, corresponding to the columns of the matrix data. absolute values of the nonzero members in the columns correspond to the index (i + 1), i.e., to the number of integrators between the block gi,j and the final state equation x1 at the end of the integrator chain in fig. 1, augmented by one. the sign of the member corresponds to the sign of the connection. the coefficient c (1, 1) corresponds obligatorily to the known parameter (here one), and the corresponding sum of weighted integrals will be put with inverted sign on the right side of equation (13). we must complete matrix c by zero members to a rectangular shape. these members are of no significance for the computation; parameters is the output row vector. it is ordered in accordance with the columns of matrix c, specifically according to the order of its nonzero elements. unitary coefficient corresponding to c(1, 1) does not appear in the output vector. examples a. the linear system with a transfer function b s b s a s a 1 2 2 1 2 � � � corresponds to the state-space diagram in fig. 2 for n = 2. the corresponding algebraic equations (13), built from the weighted integrals, will have the form of (17). for identification, column data vectors of measured data u(t) and y(t) will be necessary. parameter a0 is selected as unitary in our transfer function. in the state-space diagram, fig. 2, this parameter is connected with the last state-space variable x1 only by an algebraic connection (without integrators), and in the equations it appears with negative sign. due to this reasons the matrix data will be ordered as [y, u], and in the structure matrix c the first member will be c (1, 1) = �1. if the structure matrix has the form c � � � � � � � � � � � � � � 1 2 2 3 3 0 , we obtain the result (output vector) parameters = [a1, a2, b1, b2]. parameter a1 corresponds to the member –2 in c. consequently, a2 corresponds to –3, b1 to 2 and b2 to 3. 
parameter 1 by s 2 in the transfer function denominator corresponds to the member –1 in c. for the same measured data we can obtain the resulting transfer function in the form b s a s a s a 1 0 2 1 2 1� � � with the inputs data = [u, y], c � � � � � � � � � � � � � � 3 1 2 2 0 3 and the result will be parameters = [b1, a0, a1, a2]. b. parallel dynamo – see fig. 3, 4, 5 and eqs. (19–25). if we have at our disposal the time vectors of the measured data u1, u2 and y (resistance in the excitation circuit r, angular velocity �, output voltage of the dynamo e), and if we want to obtain the program output vector parameters = [a1, 1, a1, 2, a1, 3], the input data matrix must be (description in matlab) data=[� y./u2, y, u1.*y./u2, u1.*((y./u2).�3)] and the matrix of description of the model structure must be c=[�1, 2, �2, �2]. in the original notation the input data matrix is data=[e./�, e, r.*e./�, r.*((e./�).�3)] and the output vector of results corresponds to the expression parameters = [1/n, a/n, b/n], see (20). 5 conclusions a new modification of the methods presented in [1, 2, 3] has been presented, including the corresponding program. in comparison to [1, 2, 3], more general nonlinearities are permitted. a complete description of the method, more examples and corresponding programs can be found in [4]. references [1] john, j., rauch, v.: nonlinear continuous system identification by means of multiple integration. acta polytechnica, vol. 39 (1999), no. 1, pp. 55–62 [2] maletínský, v.: i-i-p identifikation kontinuierlicher dynamischer prozesse. abhandlung zur erlangung des titels eines doktors der technischen wissenschaften der eth zürich, (in german: i-i-p identification of continuous dynamic systems. phd thesis, eth zürich.), sss zürich 1978 [3] skočdopole, j.: nelineární vícenásobná integrace. kandidátská disertační práce fel čvut 1981, (in czech: nonlinear multiple integration.) phd thesis, fee ctu prague 1981 [4] see: http://dce.felk.cvut.cz/sri2/index.htm doc. ing. jan john, csc. department of control engineering phone: +420 2 2435 7597 e-mail: john@control.felk.cvut.cz czech technical university in prague faculty of electrical engineering technická 2, 166 27 praha 6, czech republic © czech technical university publishing house http://ctn.cvut.cz/ap/ 67 acta polytechnica vol. 41 no.1/2001 table of contents thermal and hygric expansion of high performance concrete 2 j. toman, r. èerný temperature and moisture dependence of the specific heat of high performance concrete 5 j. toman, r. èerný thermal conductivity of high performance concrete in wide temperature and moisture ranges 8 j. toman, r. èerný faraday cup for electron flux measurements on the microtron mt 25 11 m. vognar, è. šimánì, d. chvátil some aspects of profiling electron fields for irradiation by scattering on foils 14 è. šimánì, m. vognar, d. chvátil critical length of a column in view of bifurcation theory 18 m. kopáèková erosion wear of axial flow impellers in a solid-liquid suspension 23 i. foøt, j. medek, f. ambros a small transfer and distribution system for liquid nitrogen 29 è. šimánì, m. vognar, j. køíž, v. nìmec lewatit s100 in drinking water treatment for ammonia removal 31 h. m. abd el-hady, a. grünwald, k. vlèková, j. zeithammerová evaluation of water resistance and diffusion properties of paint materials 34 j. drchalová, j. podìbradská, j. madìra, r. èerný management of people by managers with a technical background – research results 38 d. 
acta polytechnica 53(1):27-29, 2013
© czech technical university in prague, 2013, available online at http://ctn.cvut.cz/ap/

classification of variable objects for search for grb candidates on bamberg photographic plates

rené hudec a,b,∗, fabian kopel c, robert macsics c, markus hadwiger c, ulrich heber d, walter cayé c
a astronomical institute, academy of sciences of the czech republic, ondřejov, czech republic
b czech technical university in prague, faculty of electrical engineering, prague, czech republic
c dientzenhofer gymnasium, bamberg, germany
d dr. remeis observatory, university of erlangen, bamberg, germany
∗ corresponding author: rhudec@asu.cas.cz

abstract. we report on an ongoing study based on blink-comparison of more than 5000 bamberg observatory southern sky patrol plates, performed within a continuation of a student high school project (jugend forscht). after a detailed analysis and classification, 6 non-classified objects were identified as emulsion defects, 19 as asteroids, 37 as variable stars, and 6 as real ot-grb candidates.

keywords: astronomical photography, astronomical photographic plate archives, gamma-ray bursts, optical transients, variable stars, asteroids.

1. introduction
in this paper we report on further results of an ongoing study based on extended blink-comparison of bamberg observatory southern sky patrol plates, performed within a german high school student project (jugend forscht) in the last year (2011-2012). the bamberg observatory (bavaria, germany) was deeply involved in variable star research in the past, and the bamberg observatory photographic sky surveys (hudec, 1999) delivered the observational data for these studies. the bamberg plate archive contains 40 000 plates from the northern (18 000) and southern (22 000) surveys; the relevant time periods are 1928-1939 (north) and 1963-1976 (south). the southern patrol was taken from the boyden observatory observing station in south africa for variable star research; the northern patrol was performed in bamberg directly. the past work in variable star research in bamberg focused mostly on discoveries and classification of new variable stars. for more details on the motivation of the work and the related background, see hudec et al., 2001.

2. the method
the blink-microscope investigations of thousands of pairs of selected high-quality sky patrol plates were performed in the past under the direction of prof. w. strohmeier, former director of the bamberg observatory, and his collaborators, mainly r. knigge and h. abt. they investigated more than 2500 pre-selected southern sky survey plate pairs (with magnitude limits and overall quality above average), where one plate pair represented about 5 hours of work at the blink-microscope. each plate covers 13×13 square degrees with about 1 hr exposure time, and has a limiting magnitude of about 15 (for southern plates), i.e., enough to detect brighter ots (optical transients) and oas (optical afterglows) of grbs.
taking into account the large field of view (fov) of the plates that were used (the northern plates are 35 × 35 degrees), this is one of the major sky survey programmes of the past. we have re-analysed the objects recorded in 6 measurement books for the southern sky surveys, with emphasis on detailed investigation of non-classified objects suspected to occur only once (i.e., visible only on one plate and below the limit of the other plates). the datasets that were analysed involved a total of 5004 southern sky patrol plates (always blinked as a pair of plates, i.e., 2502 blink comparisons), each plate representing 13×13 = 169 square degrees and 60 min exposure time. this represents in total 845 676 square degrees (i.e., 21.14 full sky spheres) monitored for 60 minutes, i.e., almost a full day of full sky sphere coverage. it is obvious that although the statistical probability of having a real grb ot candidate in our sample is low, it is not negligible. the estimated observed grb rate by grbm is about 1.3/day (guidorzi, 2011), but the intrinsic grb rate is higher, as the estimated rate is influenced by instrument sensitivity. hence one can expect that there was at least one grb inside the investigated plate sample.

3. object classification
object classification was an important part of the investigations performed in the last year. in this section we show examples of originally non-classified objects, now classified by us into different types/categories, namely known and newly-detected variable stars (fig. 1), asteroids (fig. 2), emulsion faults (figs. 3 and 4), and genuine ot candidates (fig. 5). in fig. 1 the light variation is clearly demonstrated; the object indicated on the right picture is a real one, despite being close to the plate background, as it is visible on more plates and there is a faint star at this position. the image size is approximately 4 mm × 4 mm.

figure 1. example of a newly-detected (previously unknown) variable star: maximum (left) and minimum (right).

figure 2. example of ot images identified as asteroids (here vesta). the image size is approximately 1 × 1 mm.

4. the catalogue
the creation of a catalogue represents an important part of our work in the last year. due to the large number of objects, and because the information was scattered in different places, a clear overview was difficult. we therefore decided to put all the information into one volume, namely the catalogue. the catalogue is divided according to the classification types of the non-classified objects, namely previously known and newly discovered variable stars, asteroids, emulsion defects, and true ot-grb candidates. the catalogue includes images of the digitized neighbourhoods of the objects, dss images, computer-generated images, images provided by imagej, as well as all available information for the candidate object, such as the observing site, celestial position, exposure date and time, julian date, plate information, etc.

5. results
we have made a detailed analysis of the 6 measurement protocols of prof. strohmeier and his collaborators, representing the results of analyses of about 5000 selected southern sky patrol plates.

figure 3. identification of emulsion defects by surface plots and 3d imaging. the image size is approximately 0.6 × 0.6 mm.

figure 4. detailed analyses with the sonneberg observatory zeiss microscope: emulsion defect (left) and real ot image (right). the image size is ∼ 0.3 × 0.3 mm.
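as a cross-check of the coverage estimate in section 2, the expectation of roughly one grb in the sample can be verified with a few matlab lines (a sketch; the variable names are ours, and the exact full-sphere solid angle of about 41 253 square degrees is used):

plates = 5004; fov = 13*13;           % plates and field of view [deg^2]
t_exp  = 1/24;                        % 60 min exposure per plate [days]
sphere = 4*pi*(180/pi)^2;             % full sky, about 41253 deg^2
coverage = plates*fov*t_exp/sphere;   % ≈ 0.85 full-sky days of monitoring
n_grb = 1.3*coverage;                 % ≈ 1.1 grbs at the observed rate

this is consistent with the statement above that at least one grb can be expected inside the investigated plate sample.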
we have re-discovered the 63 non-classified objects on the original plates. the surrounding areas (sub-windows of 873 × 873 arcsec) of these objects have been scanned and analysed. the detailed analysis of the discovery plates confirmed that 4 of these objects are in fact variable stars, as they were present both on the discovery plate and on the comparison plate, with different magnitudes. plates from the bamberg archive covering the same position but at different times were used for this study. in addition, the ot positions were investigated both in stargazer and on dss images, uncovering a total of 7 already known variable stars and 25 newly detected variable stars. this study was based on analyses of the objects available at or close to the ot positions on a deeper image (dss) or in a catalogue (stargazer); verification on deeper images and/or in a catalogue was always used to avoid false detections of faint stellar images.

the detailed asteroid search performed with an online tool (http://ssd.jpl.nasa.gov/sbfind.cgi) identified 19 ots as asteroid images. we note that a similar study by bedient (2003) indicated that 6 of the 24 ot candidates identified by ross (1929) represent asteroids mis-classified as suspected variable stars.

six objects were identified as emulsion defects, on the basis of a profile study with imagej as well as with the large zeiss plate microscope at the sonneberg observatory. in this study, deviations from star images and/or profiles were investigated in great detail under high magnification and/or under a reflected-light microscope.

the remaining 6 objects can be considered as real ot/grb candidates. their coordinates were measured to enable detailed analyses of their positions with modern ccd telescopes. in addition, their magnitudes were measured and were found to lie between 10.8 and 13.5 mag (b). because of the absence of quiescent counterparts for all these objects in dss, we can conclude that the brightening amplitude was larger than 6.5 mag, with a very fast light change, typically > 2-3 mag in one day. we have selected the plates in the bamberg plate archive covering the positions of these candidates just before or just after the ot times, and in most cases there was nothing one day before or one day after, down to the typical plate magnitude limit m_b = 15 of the investigated plates. very recently, we have obtained deep images of our ot candidates, in 2 cases taken in different filters. these data are undergoing detailed analysis at the moment of writing this article, and the details will be reported in a separate paper. that paper will also give more details and a table of all 6 grb/ot candidates, as the investigation of these objects (especially with larger instruments) is still ongoing. for these candidate objects, all the alternative explanations, namely known types of astrophysical objects, asteroids, and emulsion defects, could be excluded.

figure 5. one of the 6 real ot-grb candidates (image taken by a usb plate microscope, which was used for rapidly obtaining images of the objects under high magnification). the ot is the bright object in the centre of the image. the image size is ∼ 1 × 1 mm.

6. conclusion
we have re-analysed the bamberg observatory southern sky survey measurement logs, and have re-detected and investigated the relevant non-classified objects as possible ot candidates. the study continued with a detailed ot classification, which has been explained in the paper.
in addition, the candidate objects were scanned and investigated by advanced computer programs. this study won 1st place in the jugend forscht (youth research) high school regional competition in oberfranken in 2012, 1st place in the jugend forscht 2012 competition in bavaria, and 3rd place in the all-german jugend forscht 2012 competition, as well as the award for the best astronomical project. fabian, robert and markus are (now) 18-year-old students. the project was proposed by rené hudec, and was supervised by rené hudec and uli heber.

acknowledgements
rh acknowledges grants 102/08/0997 and 205/08/1207 from the grant agency of the czech republic and msmt project me09027.

references
[1] hudec, r.: astrophysics with astronomical plate archives. in exploring the cosmic frontier: astrophysical instruments for the 21st century, eso astrophysics symposia, edited by a. p. lobanov, j. a. zensus, c. cesarsky and p. j. diamond, series editor b. leibundgut. springer-verlag, berlin and heidelberg, 2007, p. 79. isbn 978-3-540-39755-7.
[2] hudec, r.: on the feasibility of independent detections of optical afterglows of grbs. in gamma-ray bursts in the afterglow era, eso astrophysics symposia. springer, berlin/heidelberg, 2001.
[3] bedient, j. r.: ibvs 5478, 1-4, 2003.
[4] hudec, r.: an introduction to the world's large plate archives. acta historica astronomiae, vol. 6, pp. 28-40, 1999.
[5] hudec, r.: 1942 superoutburst of ot j213806.6+261957. the astronomer's telegram, 2619, 2010.
[6] hudec, r., šimon, v.: identification and investigation of high energy sources on astronomical archival plates. in x-ray astronomy 2009: present status, multi-wavelength approach and future perspectives, aip conference proceedings, volume 1248, pp. 161-162, 2010.
[7] tsvetkov, m., et al.: proc. iv serbian-bulgarian astronomical conference, publ. astron. soc. rudjer boskovic no. 5, 303-308, 2005.
[8] ross, f. e.: aj 39, 140, 1929.
[9] guidorzi, c.: phd thesis, http://www.fe.infn.it/~guidorzi/doktorthese/node1.html, 2011.

acta polytechnica 54(1):1-5, 2014, doi:10.14311/ap.2014.54.0001
© czech technical university in prague, 2014, available online at http://ojs.cvut.cz/ojs/index.php/ap

the operating load of a disintegration machine

juraj beniak∗, peter križan, miloš matúš, monika kováčová
faculty of mechanical engineering, slovak university of technology in bratislava
∗ corresponding author: juraj.beniak@stuba.sk

abstract. the process of disintegration is affected by a large number of operating and design parameters which have not previously been described in detail. this paper discusses the effect of the disintegrative surface area on the size of the operational load of a disintegration machine; this parameter affects the operational power of the device. the paper explains the experimental process, and describes and evaluates the measured data with mathematical functions that describe the relation of the operational power of the device to the selected parameter.

keywords: disintegration, shredding, cutting wedge, torque calculation.
1. definition of a disintegrative surface

it is important to prepare biomass really well for its intended future use [1,2]. several papers have been written on the parameters of disintegration machines, and the significance of the monitored parameters that influence the torque needed to disintegrate material samples has been investigated. this paper discusses experiments that have been carried out to confirm the mathematical relation (1) for calculating the disintegrative force [3,4], and to determine mathematical expressions for significant influencing factors such as the area of the sample cross-section sm and the cutting-edge side rake. various experiments have shown the significance of four known factors, the most significant being the cross-sectional area sm of the sample (the cross-section that has to be broken down by the disintegrative wedge).

before measuring this parameter, it is necessary to define the area of the cross-section of the sample that has to be broken down by the disintegrative wedge in the disintegration process. various papers on this topic suggest that the area s is calculated by multiplying the disintegration wedge width b by the wedge height h. when the disintegrative process was analyzed in greater detail, it was found that the theory of a constant wedge surface s is not correct. the real cross-section area that must be broken down by the tool wedge is defined by adding the surface areas that arise after disintegrating the material sample. therefore sm = 2s1 + s2, where s1 = h·hm is the area defined by the side of the wedge and s2 = b·hm is the area defined by the face of the wedge (fig. 1). then sm = 2h·hm + b·hm and, after simplification,

sm = hm(2h + b),

where sm is the material cross-section area that must be broken down by the disintegrative wedge (mm2), hm is the material thickness (mm), h is the wedge height (mm), and b is the wedge width (mm). this formula is relevant provided that the width of the material sample is greater than the wedge height h. if the sample width h1 is smaller than the wedge height h, then the following is valid: sm = 2h1·hm, where h1 is the width of the material sample (h1 < h).

figure 1. illustration of surfaces used for calculating the disintegrative force.

2. experiment for determining the influence of the disintegrative surface area

in order to define the relation between the measured data (torque mk) and the cross-section area of the cut in the disintegrated material, experiments were performed with the surface area of the cut as the only factor that was changed. the experiment was performed with five material samples of varying cross-sectional area; only the thickness hm of the sample material was varied. two relations were examined: firstly, the relation of the torque to the cross-section area sm of the sample materials, and, secondly, the relation of the torque to the thickness hm of the material sample. the width of the sample was kept constant at 30 mm, so that the material was subjected to the whole surface of the wedge while also overhanging the wedge in the outward direction. the following sample thicknesses were used: 7, 10, 12, 15, and 18 mm, which are the material sizes used for this kind of machine.

figure 2. height of wedge h = 16.5 mm.
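as a quick numerical illustration of the relation sm = hm(2h + b) defined in section 1, the following sketch (python; not part of the original paper) reproduces the cross-section areas listed later in tab. 1. the wedge width b = 17.6 mm used here is not quoted in the text; it is inferred from the tab. 1 ratio sm/hm = 50.6 mm = 2 × 16.5 + 17.6, so it should be read as an assumption.

def disintegrative_area(hm, h=16.5, b=17.6, sample_width=30.0):
    """cross-section area s_m (mm^2) broken down by the disintegrative wedge.

    hm -- material thickness (mm); h -- wedge height (mm, fig. 2);
    b -- wedge width (mm), inferred from tab. 1 (an assumption);
    sample_width -- width of the material sample (mm), 30 mm in the experiment.
    """
    if sample_width > h:
        # sample overhangs the wedge: s_m = h_m * (2h + b)
        return hm * (2.0 * h + b)
    # narrow sample (width h1 < wedge height h): s_m = 2 * h1 * h_m
    return 2.0 * sample_width * hm

for hm in (7, 10, 12, 15, 18):
    print(hm, disintegrative_area(hm))  # 354.2, 506.0, 607.2, 759.0, 910.8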
measurements were made with wedge 2, using the same principles as for the previous experiments. this wedge was used on the assumption that the measured torque values would remain within the measurement range even for thicknesses of 18 mm, while wedges 3 and 4 would give values outside the measurement range. this assumption was based on previous measurements. for each series of samples, 10 measurements were repeated, for 5 types (levels) of samples, resulting in a total of 50 measurements.

the surface area of the disintegrative cross-section is calculated on the basis of the thickness of the sample, but not on the basis of the whole width, since one portion, beneath the wedge, has already been disintegrated. the effective value was at most 16.5 mm (fig. 2). so the surface area of the material cross-section that has been disintegrated is described by the relation sm = hm(2h + b).

3. experimental evaluation

for the measured values and their corresponding uncertainties, see tab. 1 (the total uncertainty is equal to the uncertainty evaluated by the type a method). we can write mk1 = (166.7 ± 21) n m, where the value following the ± sign represents the expanded uncertainty u = k·uc, in which u is determined from the combined standard uncertainty uc = ua = 9.3 n m and the coefficient k = 2.26, based on the t distribution for v = 9 degrees of freedom; this coefficient defines an interval with an estimated confidence level of 95 %. the same record can also be introduced for the other values mki, where i = 1, 2, . . . , k (k = 5).

exp. | hm (mm) | sm (mm2) | mk (n m) | s(mk) (n m) | ua (n m)
1 | 7 | 354.2 | 166.7 | 29.39 | 9.3
2 | 10 | 506 | 262.2 | 42.39 | 13.41
3 | 12 | 607.2 | 299.9 | 64.51 | 20.4
4 | 15 | 759 | 389.1 | 62.23 | 19.68
5 | 18 | 910.8 | 460.1 | 82.61 | 26.12

table 1. estimates of the measured values and their uncertainties calculated using the type a method.

figures 3 and 4 show the relation of the measured torque values mk to the surface area sm of the sample material. in order to describe this relation mathematically, a third-degree polynomial approximation is applied. estimating the polynomial coefficients by the least-squares method shows that the quadratic and cubic terms are small:

y = −103.19 + 0.9447x − (0.02449x)2 + (0.00669x)3.

in the same way, we can obtain the second-degree polynomial (fig. 3) and also a linear approximation (fig. 4). supposing that the mathematical formula for calculating the force has the form [3,4]

f = τ·sm, (1)

the linear function can be used, so formula (1) can be rewritten to express the torque

mk = τ·r·sm. (2)

figure 3. relation of the torque to the disintegrated surface area of the cross-section of the material sample, with a second-degree polynomial approximation.
figure 4. relation of the torque to the disintegrated surface area of the cross-section of the material sample, with a linear approximation function.

the relation between the force f and the torque mk is directly proportional, so only the linear dependence is considered, as shown in fig. 4. the linear approximation function that describes the relation of the torque to the disintegrated surface area of the material cross-section is

mk = 0.5235·sm − 12.897. (3)

formulas (1) and (3) are similar, and therefore the presumption expressing the relation between the material shearing strength τ and the disintegrated surface area of the material cross-section sm is correct.
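the linear approximation (3) and the coverage factor k used above can be checked directly from the tab. 1 data; a minimal sketch (python, assuming numpy and scipy are available), not part of the original paper:

import numpy as np
from scipy import stats

sm = np.array([354.2, 506.0, 607.2, 759.0, 910.8])   # mm^2, tab. 1
mk = np.array([166.7, 262.2, 299.9, 389.1, 460.1])   # n m, tab. 1

# ordinary least-squares line mk = c1*sm + c0, cf. formula (3)
c1, c0 = np.polyfit(sm, mk, 1)
print(c1, c0)   # ~0.523 and ~-12.9, close to mk = 0.5235*sm - 12.897

# coverage factor for the expanded uncertainty u = k*uc: two-sided 95 %
# quantile of the student t distribution with v = 9 degrees of freedom
print(stats.t.ppf(0.975, df=9))   # ~2.262, i.e. the k = 2.26 of the text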
from a physical point of view, when the disintegrated surface area of the cross-section of the material is zero, the torque is also zero. therefore the mathematical model is not a third-degree polynomial, nor a linear dependence y = a + cx; only the expression y = cx is used. on the basis of this assumption, formula (1) is suitable for describing the load during the disintegration process [5].

for the various surface areas of the cross-sections, the following torque measurements were taken [6,7,8]:
• mk1,1, mk1,2, . . . , mk1,10 for sm1;
• mk2,1, mk2,2, . . . , mk2,10 for sm2;
...
• mk5,1, mk5,2, . . . , mk5,10 for sm5.

an estimate of the torque value for a given surface area of the cross-section of the material sample was calculated as the arithmetic mean of the corresponding measured torques for each sml, l = 1, . . . , 5:

mkl = (1/n) Σj=1..n mkl,j.

figure 5. relation of the torque to the thickness of the material sample, with a linear approximation function.

the functional dependence of the torques mk on the surface area of a cross-section sm can be written as

mk = c·sm. (4)

the mathematical model for the individual measurements, mkl = c·sml, l = 1, . . . , 5, can be written in matrix notation as x = A·a, where x = (mk1, mk2, . . . , mk5)ᵀ is the vector of input data, A = (sm1, sm2, . . . , sm5)ᵀ is the known matrix of the measurement plan, and a = (c) is the vector of unknown parameters. it is necessary to obtain an estimate of the unknown parameter c and the uncertainty of this estimate. the estimate of the parameter c is obtained from

â = (Aᵀ U⁻¹(x) A)⁻¹ Aᵀ U⁻¹(x) x, (5)

where â is the vector of estimates of the unknown parameters, x is the vector of estimates of the input parameters, and U(x) is the covariance matrix, with the squared uncertainties u²(mk1), . . . , u²(mk5) of the torques on the diagonal, calculated by the type a method (tab. 1), and the covariances u(mkl, mkl′), l, l′ = 1, . . . , 5, between the torques mkl and mkl′ off the diagonal; other uncertainties are neglected. due to the absence of a common influence on the measurements of the torques mkl, mk(l+1), the covariances between them can be neglected, so the covariance matrix takes the diagonal form

U(x) = diag(u²(mk1), u²(mk2), . . . , u²(mk5)).

the covariance matrix of the estimate of the unknown parameter can be calculated as

U(â) = (Aᵀ U⁻¹(x) A)⁻¹, (6)

where U(â) is a single-element matrix, U(â) = [u²(c)], and u(c) is the uncertainty of the unknown parameter c.
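a minimal numerical sketch (python/numpy; not from the paper) of the weighted least-squares estimate (5) and its uncertainty (6) for the one-parameter model x = A·a, using the tab. 1 data; it reproduces the values c = 0.5005 and u(c) = 0.013 quoted in section 4 below:

import numpy as np

sm = np.array([354.2, 506.0, 607.2, 759.0, 910.8])   # mm^2
mk = np.array([166.7, 262.2, 299.9, 389.1, 460.1])   # n m
ua = np.array([9.3, 13.41, 20.4, 19.68, 26.12])      # type-a uncertainties, n m

A = sm.reshape(-1, 1)                 # measurement-plan matrix (5 x 1)
U_inv = np.diag(1.0 / ua**2)          # inverse of the diagonal covariance U(x)

cov_c = np.linalg.inv(A.T @ U_inv @ A)       # formula (6): U(a^) = (A^T U^-1 A)^-1
c_hat = (cov_c @ A.T @ U_inv @ mk).item()    # formula (5): a^ = U(a^) A^T U^-1 x

print(c_hat, np.sqrt(cov_c).item())   # ~0.5005 and ~0.013, as in section 4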
4. conclusion

by evaluating the matrix relations (5) and (6) we obtain the parameter c = 0.5 and the uncertainty u(c) = 0.013. therefore,

mk = 0.5005·sm. (7)

considering the practical values c = 0.5 kn mm−1 and u(c) = 0.013 kn mm−1, the uncertainty of the torque mk calculated by formula (4) is u(mk) = u(c)·sm. comparing formulas (7) and (2), the tensile shear strength τ of the disintegrated material is c/r (r = 0.1165 m), thus τ = 4.29 mpa. the table value of the tensile shear strength is given by [4] as τtab = 6.7 mpa. based on earlier experiments, and from the fact that the c/r ratio is not equal to the shear strength τ, it can be hypothesized that formula (7) must also reflect an additional parameter γ. therefore it is necessary to perform further experiments, in which the influence of the cutting-edge side rake γ on the torque mk will be monitored, and to add a new coefficient to the formula to reflect the influence of this parameter. the same relation as is illustrated in fig. 3 and fig. 4 can be constructed for the thickness hm of the material sample; the behaviour of the torque for different thicknesses of the material sample is shown in fig. 5.

references

[1] lisý, m., baláš, m., moskalík, j., štelcl, o.: biomass gasification – primary methods for eliminating tar. acta polytechnica 52(3), 2012, pp. 66–70.
[2] moskalík, j., škvařil, j., štelcl, o., baláš, m., lisý, m.: energy recovery from contaminated biomass. acta polytechnica 52(3), 2012, pp. 77–82.
[3] kováč, a., rudolf, b.: tvárniace stroje. sntl, alfa, bratislava, 1989.
[4] lisičan, j.: teória a technika spracovania dreva. matcentrum, zvolen, 1996.
[5] beniak, j., ondruška, j., čačko, v.: design process of energy effective shredding machines for biomass treatment. acta polytechnica 52(5), 2012, pp. 133–137.
[6] palenčár, r., halaj, m.: metrologické zabezpečenie systémov riadenia kvality. vydavateľstvo stu v bratislave, 1998, isbn 80-227-1171-3.
[7] palenčár, r., ruit, j.-m., janiga, i., horníková, a.: štatistické metódy v metrologických a skúšobných laboratóriách. grafické štúdio ing. peter juriga, 2001, isbn 80-968449-3-8.
[8] jarošová, e.: navrhování experimentů. česká společnost pro jakost, 1997.

acta polytechnica 55(1):59–63, 2015, doi:10.14311/ap.2015.55.0059

diffuse dbd in atmospheric air at different applied pulse widths

ekaterina shershunova, maxim malashin, sergei moshkunov, vladislav khomich
institute for electrophysics and electric power ras, st. petersburg, russia, 191186, dvortsovaya nab., 18
corresponding author: eshershunova@gmail.com

abstract. this paper presents the realization and a diagnosis of the volume diffuse dielectric barrier discharge in a 1 mm air gap when high-voltage rectangular pulses are applied to the electrodes. a detailed study has been made of the effect of the applied pulse width on the energy dissipated in the discharge. it has been found experimentally that the energy remained constant when the pulse was elongated from 600 ns to 1 ms.

keywords: dielectric barrier discharge; pulsed power supply; atmospheric pressure; pulse width; pulse energy.

1. introduction

in recent years, much attention has been focused on realizing and investigating the diffuse atmospheric dielectric barrier discharge (dbd), in association with its potential for use in plasma medicine, surface treatment, plasma chemistry, etc. many researchers have shown that the diffuse mode of dbd, with two discharge peaks per voltage pulse, can be ignited at low pressure, and even at atmospheric pressure, in gases such as neon, argon and helium by applying high-voltage pulses of short duration to the electrodes [1, 2]. this increases the energy input into the discharge: there is no power consumption from the source during the secondary discharge; instead, the energy stored at the barrier after the primary discharge is consumed.
as a rule, diffuse high-current dbd in atmospheric air is realized by applying voltage pulses of submicrosecond duration to the electrodes. there have been few studies on the influence of the applied pulse width on dbd behavior in the microsecond range. in previous works, short bell-shaped pulses were used for initiating diffuse dbd in air [3, 4]; only one primary discharge pulse was clearly observed when applying these pulses. in this work, we present an experimental study, using a specially developed generator, of the influence of the applied pulse width on dbd development in atmospheric air.

figure 1. experimental setup for realizing and diagnosing dbd.

2. experimental setup

a special experimental setup was designed to generate a volume dbd in atmospheric air (figure 1). two special semiconductor switches s1 and s2 [5, 6] were used to supply the dbd with rectangular voltage pulses with varying parameters: amplitude from 0 to 16 kv, pulse width from 600 ns to 1 ms, pulse repetition rate 1–3000 hz. in addition, the rate of rise of the applied voltage can easily be changed by varying the value of the external resistor r1, thereby enabling the dbd mode to be controlled [7, 8]. the dbd was initiated in a 1 mm atmospheric air gap (dg), under conditions of natural humidity of 40–60 %, between two plane-parallel aluminum electrodes, one of which was covered by a 2 mm alumina ceramic plate, at a pulse repetition rate of 30 hz. the desired pulse width of the applied voltage was set by varying the time delay between triggering the switches s1 and s2. the voltage applied to the electrodes (vin) was measured using a tektronix p6015a high-voltage probe, and the total current in the dg (it) was measured through the voltage drop at the 50 ω series low-inductance resistor rs. the voltage and current waveforms were displayed on a lecroy waverunner oscilloscope (bandwidth 1 ghz, sampling rate 10 gs/s). the measured traces were the result of processing 1000 events.

figure 2. equivalent circuit of the dg.
figure 3. experimental (vin, it) and calculated (vdg, ids) voltage and current traces of the volume diffuse dbd in a 1 mm air gap.

3. calculating the electrical and energy characteristics of volume diffuse dbd in atmospheric air

the voltage drops across the different elements of the discharge gap (dg), the currents flowing through the circuit, and the discharge and supply power were evaluated according to the widely used equivalent electrical circuit of the capacitive divider (figure 2). the voltage applied to the electrodes, vin, and the total current in the dg, it, corresponding to the sum of the displacement current ia (in the absence of a discharge) and the conduction current ids, are measured experimentally. from these data it is easy to calculate the voltage across the air gap, vdg, and the discharge current, ids. the voltage drop across the air gap is found by subtracting the voltage drop at the barrier, vb, from the applied voltage vin, using the formula vdg = vin − vb, where vb = (1/cb)∫it dt is the voltage at the barrier, it is the total current, and cb is the capacitance of the barrier. the discharge current is calculated from the difference between the total current and the current through the air capacitor ca in the absence of the discharge: ids = it − ia, where ca is the capacitance of the air gap.
the current through the air capacitor is obtained by multiplying the air capacitance by the time derivative of the voltage across the air gap, i.e., ia = ca·dvdg/dt. then, knowing the discharge current and the voltage across the dg, the instantaneous power dissipated in the discharge can be calculated as pds = ids·vdg. hence, by integrating over time, we find the discharge dissipated energy: eds = ∫pds dt. the energy transferred from the external circuit can be estimated by the formula esup = ∫psup dt, where psup = it·vin.

figure 4. externally supplied power psup and discharge dissipated power pds versus time, tp = 600 ns.
figure 5. temporal dependences of the discharge energy eds and the energy from the external source esup (vin = 16 kv, tp = 600 ns, h = 1 mm, air, alumina ceramic barrier).
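a sketch (python/numpy; not part of the original paper) of the section 3 reconstruction chain for sampled waveforms; the input arrays and the capacitances cb, ca are placeholders, since no numerical values for the equivalent circuit are given in the text:

import numpy as np

def dbd_energies(t, v_in, i_t, cb, ca):
    """reconstruct discharge quantities from sampled waveforms.

    t   -- sample times (s); v_in -- applied voltage (v)
    i_t -- total current (a); cb, ca -- barrier and air-gap capacitances (f)
    """
    dt = np.gradient(t)
    vb = np.cumsum(i_t * dt) / cb        # vb = (1/cb) * integral(it dt)
    vdg = v_in - vb                      # vdg = vin - vb
    ia = ca * np.gradient(vdg, t)        # ia = ca * d(vdg)/dt
    ids = i_t - ia                       # ids = it - ia
    pds = ids * vdg                      # instantaneous discharge power
    eds = np.sum(pds * dt)               # eds = integral(pds dt)
    esup = np.sum(i_t * v_in * dt)       # esup = integral(it*vin dt)
    return vdg, ids, eds, esup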
4. results

figure 3 presents typical voltage and current waveforms of the volume diffuse dbd at vin = 16 kv, f = 30 hz, r1 = 85 ω and pulse width tp = 600 ns. as the voltage vin is applied to the electrodes of the dg, the current it starts to flow through the circuit. initially, this current charges the equivalent capacitance of the dbd, which corresponds to a small hump in the current trace it. when the voltage across the air gap vdg exceeds the breakdown value, the primary discharge ignites; this appears as a sharp peak in the waveform. the situation at the falling voltage edge is similar to the picture at the rising edge: first, the capacitance of the dbd is recharged, and then, when the breakdown voltage is exceeded, a discharge appears in the dg, i.e. a conduction current corresponding to the secondary discharge pulse.

the time dependences of the externally supplied power psup and the power dissipated in the discharge pds are shown in figure 4. power comes from the external circuit (psup) to charge the equivalent capacitance of the dbd and to feed the discharge process (pds). the secondary discharge pulse appears without direct consumption of power from the external source; this occurs due to the charge stored on the surface of the barrier after the primary discharge passes. the secondary discharge pulse leaves no charges on the barrier after it finishes. the calculated energy released in the primary discharge is about 1.8 mj, and in the secondary discharge about 1.5 mj (figure 5). thus, the energy released per pulse in the volume diffuse 1 mm dbd in air was ∼ 3.3 mj at an applied-voltage pulse width of 600 ns.

we recorded the voltage and current traces for different pulse widths in order to compare the discharges. figure 6 shows that with the elongation of the applied voltage pulse from 600 ns to 1 ms the peak current of the primary discharge remained constant at ∼ 15 a, but the peak current of the secondary discharge increased slightly. according to our experimental data, the charge transferred during the secondary discharge is constant within the measurement accuracy. we also consider the barrier capacitance to be constant. this allows the voltage at the barrier to be considered constant in the no-discharge period of the pulse. the voltage applied to the dg, vin, decreases due to the leakage current in the circuit, thus having an influence on the gap voltage.

figure 6. dependences of the primary and secondary discharge peaks versus pulse width.
figure 7. current it and voltage vdg traces of the secondary discharge at different pulse widths.

the growth of the secondary current pulse can therefore be explained by an increase in the gap voltage amplitude with pulse elongation (figure 7). the peak power of the primary discharge was about 150 ± 15 kw at any pulse width (figure 8). the peak power of the secondary discharge changed by a factor of two as the pulse width increased from 600 ns to 1 ms. the total discharge energy in the pulse remained the same for any pulse width in this range: it was equal to 3.3 ± 0.1 mj, of which 1.8 mj was dissipated in the primary discharge and 1.5 mj in the secondary discharge.

figure 8. peak discharge power (p1: primary discharge, p2: secondary discharge) versus pulse width.

5. summary

the volume diffuse dbd was realized in a 1 mm air gap by supplying the dg with rectangular unipolar voltage pulses of 16 kv amplitude, with a pulse repetition rate of 30 hz, at different pulse widths from 600 ns to 1 ms. it was found experimentally that there was no correlation between the pulse width of the applied voltage and the energy dissipated in the discharge; the total dissipated discharge energy per pulse was about 3.3 mj. it was also found that the slump in the applied voltage could cause an increase in the peak power of the secondary discharge.

acknowledgements

the research work presented here was funded by the russian foundation for basic research, grant no. 13-08-01043.

references

[1] liu, shuhai, neiger, manfred: excitation of dielectric barrier discharges by unipolar submicrosecond square pulses. journal of physics d: applied physics 34(11):1632, 2001. doi:10.1088/0022-3727/34/11/312
[2] lu, xinpei, laroussi, mounir: temporal and spatial emission behavior of homogeneous dielectric barrier discharge driven by unipolar sub-microsecond square pulses. journal of physics d: applied physics 39(6):1127, 2006. doi:10.1088/0022-3727/39/6/018
[3] shao, t., zhang, d., yu, y., zhang, c., wang, j., yang, p., zhou, y.: a compact repetitive unipolar nanosecond-pulse generator for dielectric barrier discharge application. ieee transactions on plasma science 38(7):1651–1655, 2010. doi:10.1109/tps.2010.2048724
[4] ayan, h., fridman, g., gutsol, a. f., vasilets, v. n., fridman, a., friedman, g.: nanosecond-pulsed uniform dielectric-barrier discharge. ieee transactions on plasma science 36(2):504–508, 2008. doi:10.1109/tps.2008.917947
[5] khomich, v. yu., malashin, m. v., moshkunov, s. i., rebrov, i. e., shershunova, e. a.: solid-state system for copper vapor laser excitation. epe journal 23(4):51–54, 2013.
[6] malashin, m. v., moshkunov, s. i., rebrov, i. e., khomich, v. yu., shershunova, e. a.: high-voltage solid-state switches for microsecond pulse power. instruments and experimental techniques 57(2):140–143, 2014. doi:10.1134/s0020441214010242
[7] shershunova, e., malashin, m., moshkunov, s., khomich, v.: generation of homogeneous dielectric barrier discharge in atmospheric air without preionization. abstract book of the 19th int. vacuum congress, paris, france, p. 1242, 2013. http://apps.key4events.com/key4register/images/client/164/files/abstracts_ivc19.pdf [2014-12-01].
[8] malashin, m. v., moshkunov, s. i., khomich, v. yu., shershunova, e. a., yamshchikov, v. a.: on the possibility of generating volume dielectric barrier discharge in air at atmospheric pressure. technical physics letters 39(3):252–254, 2013. doi:10.1134/s106378501303010
non-linear temperature profiles

t. ficker, j. myslín, z. podešvová

non-linear temperature profiles caused by the temperature-dependent thermal conductivity λ(t) of wall materials are discussed. instead of the conventional thermal resistance, a modified effective resistance has been introduced.

keywords: temperature profile in building structures, temperature-dependent thermal conductivity, fourier's thermal laws.

1 introduction

heat conduction is an irreversible transport process that has been studied by non-equilibrium thermodynamics for a very long time. it is often accompanied by the transport of mass (diffusion), and the corresponding transport equations are mutually coupled and form a system of equations. nevertheless, in solids – to a certain extent – these two processes can be investigated separately. heat conduction is governed by the two fourier laws, while diffusion is governed by the two fick laws. either group of laws is represented by differential equations of the same type, so that their solutions are interchangeable if the initial and boundary conditions are also equivalent. fourier's first and second laws for the heat flux q and the temperature field t(x, y, z, t) can be expressed as follows

q = −λ grad t, (1)

ρc ∂t/∂t = div(λ grad t) + q*, (2)

where ρ, c, λ and q* are the density, heat capacity, thermal conductivity and volume heat flux of the sources, respectively. if there are no sources (q* = 0), the temperature field is independent of time (∂t/∂t = 0), λ is constant and the heat flux is unidirectional (as is usual in thermal building technology), then fourier's laws (1), (2) read

q(x) = −λ dt/dx, (3)

d²t/dx² = 0. (4)

for the boundary conditions t(0) = t1 > t(d) = t2, the simple system of differential equations (3), (4) yields the following solution

q = λ(t1 − t2)/d, (5)

t(x) = t1 − ((t1 − t2)/d)·x. (6)

it is obvious that the heat flux q is constant and the temperature field (profile) t(x) inside the wall shows perfect linearity. this is the common situation described in all textbooks of thermal building technology and in the thermal standards of many countries. it follows from the preceding discussion that the first necessary condition for the linear temperature profile t(x) is the existence of a steady thermal state, which results in the time independence of the temperature profile,

∂t/∂t = 0 ⇒ q = constant, (7)

and the second condition requires the thermal conductivity λ to be a positive constant,

dλ/dx = 0, λ > 0 ⇒ λ = constant. (8)

the positivity of λ follows from its physical definition. if some of these conditions are not fulfilled, a non-linear temperature profile t(x) can appear. frequently the non-linearity is caused by the dependence of λ on temperature and moisture.

fig. 1: linear behaviour of thermal conductivity in a narrow temperature range

2 variable thermal conductivity

let us assume a dry material inside a wall whose thermal conductivity λ depends only on temperature,

dλ(t(x))/dx ≠ 0. (9)

the temperature dependence of λ is a common feature of all building materials and is often investigated, especially with new materials. if a building material is used in a narrow temperature range (e.g. within several tens of degrees), the linear dependence

λ = b·t(x) + λ* (10)

is a good approximation, as can be seen in fig. 1.
the linear graphs in fig. 1 have been created by fitting the corresponding data [1] by the least-squares method:

λ = 2.98 × 10−4 t + 0.0525 wm−1k−1 (asbestos), (11)
λ = 1.90 × 10−4 t + 0.0116 wm−1k−1 (foam concrete), (12)
λ = 1.40 × 10−4 t − 0.0057 wm−1k−1 (cork), (13)
λ = 1.10 × 10−4 t − 0.0034 wm−1k−1 (glass wool). (14)

nevertheless, in a wider temperature range (several hundreds of degrees and more) a non-linear fit for λ must be employed. our experience shows that a reasonable approximation is provided by the parabolic function

λ = λ0 + a(t(x) − t0)², (15)

where λ0, a, t0 are parameters to be fitted. fig. 2 illustrates the parabolic behaviour of λ(t) for the same materials as given in fig. 1, but in a much wider temperature range. the graphs in fig. 2 have the following parameters:

λ = 0.176 − 1.5 × 10−6 (t(x) − 447)² wm−1k−1 (asbestos), (16)
λ = 0.057 + 6 × 10−7 (t(x) − 165)² wm−1k−1 (foam concrete), (17)
λ = 0.053 − 6 × 10−7 (t(x) − 465)² wm−1k−1 (cork), (18)
λ = 0.053 − 2 × 10−7 (t(x) − 648)² wm−1k−1 (glass wool). (19)

fig. 2: parabolic behaviour of thermal conductivity in a wider temperature range

recently, a measurement of λ(t) was published [2] for high performance concrete in a wide temperature interval ranging from 100 °c to 800 °c. at first sight the graphs of λ(t) in [2] resemble the parabolic behaviour (15). for such pronounced non-linear behaviour of λ(t) and for such a wide temperature range, strong non-linearity of t(x) must be expected. the aim of this paper is to specify the non-linear character of the temperature profile t(x) both in the case of the linear conduction function (10) and in the case of the parabolic function (15).

3 non-linear temperature profiles

provided the thermal conductivity λ is not constant but a function of temperature, λ(t(x)), fourier's laws written in one dimension assume the following form when a steady state is reached:

q = −λ(t(x)) dt/dx, (20)

d/dx (λ(t(x)) dt/dx) = 0. (21)

now the corresponding solution [t(x), q] will depend not only on the boundary conditions t(0) = t1, t(d) = t2, but also on the form of the function λ(t). let us discuss both of the mentioned non-linear dependencies λ(t), i.e., relations (10) and (15) from chapter 2.

3.1 linear temperature dependence λ(t)

using the linear function (10) and solving the system of eqs. (20), (21) for the usual boundary conditions t(0) = t1, t(d) = t2, it is possible to obtain

q = [λ*(t1 − t2) + (b/2)(t1² − t2²)]/d = (t1 − t2)/reff*, (22)

t(x) = −λ*/b + [(λ*/b + t1)² − 2qx/b]^(1/2), (23)

reff* = d/(λ* + b(t1 + t2)/2) = d/λ(teff*) = d/λeff*, (24)

teff* = (t1 + t2)/2, λeff* = λ(teff*) = λ* + b·teff*. (25)

as can be seen, the temperature profile (23) is non-linear. nevertheless, in a narrow temperature range, e.g. (t2 = 255 k, t1 = 293 k), the graph of t(x) will deviate only slightly from the linear profile (fig. 3), so that a straight line can be considered a good approximation in this case.

fig. 2: parabolic behaviour of thermal conductivity in a wider temperature range
fig. 3: temperature profile within the foam concrete wall in a narrow temperature range. parameters: d = 0.2 m, t1 = 293 k, t2 = 253 k, b = 1.90 × 10−4 wm−1k−2, λ* = 0.0116 wm−1k−1.

since the linear fit (10) is restricted to narrow temperature ranges only, the extension of the discussed solution to wider temperature ranges would be rather artificial unless the non-linear fit (15) is used (see the next chapter). it is worth noting that in the case of the linear dependence λ(t) it is possible to introduce a generalised concept of thermal resistance reff* using the effective constant conductivity λeff*:

reff* = d/λeff*, λeff* = λ(teff*), teff* = (t1 + t2)/2. (26)

this conclusion holds quite independently of the temperature range.
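the linear-λ solution (22)–(25) can be evaluated numerically; a short sketch (python/numpy; not part of the original paper) using the fig. 3 parameters for foam concrete, where the profile follows from integrating (20), λ*t + (b/2)t² = λ*t1 + (b/2)t1² − qx, i.e. the physically relevant root of a quadratic in t (equivalent to (23)):

import numpy as np

d, t1, t2 = 0.2, 293.0, 253.0        # m, k, k (fig. 3)
b, lam_star = 1.90e-4, 0.0116        # wm-1k-2, wm-1k-1, fit (12)

lam_eff = lam_star + b * (t1 + t2) / 2.0   # lambda(t_eff*), cf. (24), (25)
q = lam_eff * (t1 - t2) / d                # q = (t1 - t2)/r_eff*, cf. (22)

x = np.linspace(0.0, d, 5)
t = -lam_star / b + np.sqrt((lam_star / b + t1)**2 - 2.0 * q * x / b)
print(q, t)   # t runs from 293 k at x = 0 down to 253 k at x = d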
since the linear fit (10) is restricted to narrow temperature ranges only, the extension of the discussed solution to wider temperature ranges would be rather artificial unless the non-linear fit (15) is used (see next chapter). it is worth noting that in the case of linear dependence � ( t) it is possible to introduce a generalised concept of thermal resistance reff * using the effective constant conductivity �eff * � � � � � �r d t t teff eff eff eff * * * *,� � � � � � � � �1 2 2 (26) this conclusion holds quite independently on temperature range. utilization of the thermal conductivity � �teff * ) for the average ‘surface’ temperature � �t t teff * � �1 2 2 can also be found in the classical work of glaser [3] who supposed a less 67 acta polytechnica vol. 41 no. 6/2001 fig. 2: parabolic behaviour of thermal conductivity in a wider temperature range fig. 3: temperature profile within the foam concrete wall in a narrow temperature range. parameters: d � 0.2 m, t1 � 293 k, t2 � 253 k, b � 1.90 �10�4 wm�1k�2, �* � 0.0116 wm�1k�1. general approximation �( )t bt� . however, using the more general function (10), one should be aware that � must not assume negative values ( � >0) otherwise the corresponding solution of the system (20), (21) becomes unacceptable from the physical point of view (a flat contradiction of the second law of thermodynamics). 3.2 parabolic temperature dependence �(t) the solution of the system (20), (21) for parabolic conductivity function (15) and the boundary conditions t ( 0) � t1, t ( d) � t2 reads � � � � q t t d t t t t d a t t reff � � � � � � � �1 2 1 0 3 2 0 3 1 2 3 � , (27) � �� � � �� � � � � � � � � � x a t x t t x t a t t a t t t t a t t � � � � � � � � � � � 0 3 0 0 1 0 3 2 0 3 0 2 1 1 0 3 3 3 � � d, (28) � � � �� � � � r d a t t t t t t t t eff * � � � � � � � � � 1 0 2 1 0 2 0 2 0 2 0 3 � � d eff� , (29) � � � � � �� � � �� �� �� � � � � � � �eff t t t t� � � � � �13 1 2 1 0 2 0 0 . (30) also in this case the non-linear temperature profile (28) – derived in implicit form – can be encountered. the parabolic fit (15) is meaningful especially in a wider temperature range (t1 � t2) which is just the convenient event for a full development of strong non-linearity in the t ( x) profile (see fig. 4). the ‘surface’ temperatures t1 � 1500 k and t2 � 400 k, used in fig. 4, simulate blast-furnace thermal conditions. again in a steady thermal state the heat flux q can be calculated using the effective thermal resistance reff introduced by means of constant effective conductivity �eff (see (30)). 4 conclusion a steady thermal state is characterised by a constant heat flux that can be determined by means of an effective thermal resistance and the ‘surface’ temperature difference of the wall under investigation. the expression of effective thermal resistance is dependent – besides the wall thickness d – on the analytical form of � ( t). this paper has illustrated that within the steady thermal state it is always possible to find a constant resistance (in the conventional sense) regardless of the eventual temperature dependence � ( t) r d eff eff � � . (31) this fact has been illustrated with two particular examples, namely, for linear and parabolic functions � ( t). the effective conductivity �eff for the linear �-function (10) assumes the value of the average ‘surface’ conductivity � � � �� �� � �eff t t * � � 1 2 1 2 . 
4 conclusion

a steady thermal state is characterised by a constant heat flux that can be determined by means of an effective thermal resistance and the 'surface' temperature difference of the wall under investigation. the expression for the effective thermal resistance depends – besides the wall thickness d – on the analytical form of λ(t). this paper has illustrated that within the steady thermal state it is always possible to find a constant resistance (in the conventional sense) regardless of the eventual temperature dependence λ(t):

reff = d/λeff. (31)

this fact has been illustrated with two particular examples, namely, for linear and parabolic functions λ(t). the effective conductivity λeff* for the linear λ-function (10) assumes the value of the average 'surface' conductivity

λeff* = [λ(t1) + λ(t2)]/2. (32)

for the parabolic λ-function (15) a more complicated combination of 'surface' conductivities determines the effective value

λeff = λ0 + (a/3)[(t1 − t0)² + (t1 − t0)(t2 − t0) + (t2 − t0)²]. (33)

as soon as temperature dependence appears in the thermal conductivity λ(t), the corresponding temperature profile t(x) inside a wall becomes non-linear. however, this non-linearity proves to be essential only for a sufficiently wide temperature range (t1 ≫ t2). such a situation can be encountered, e.g., with blast-furnace envelopes, which experience extreme temperature gradients, and therefore pronounced non-linearity of the corresponding temperature profiles must be expected. a pronounced non-linear profile (fig. 4), caused by an extreme thermal range and the parabolic conductivity function (15), has been illustrated and analytically described (28) in this paper.

fig. 4: temperature profile within a foam concrete wall in a wide temperature range. parameters: d = 0.2 m, t1 = 1500 k, t2 = 400 k, a = 6 × 10−7 wm−1k−3, t0 = 165 k, λ0 = 0.0570 wm−1k−1.

references

[1] halahyja, m.: stavebná tepelná technika. bratislava, svlt 1966 (in slovak)
[2] toman, j.: thermal conductivity of high performance concrete in wide temperature and moisture ranges. acta polytechnica, vol. 41, no. 1/2001, pp. 8–10
[3] glaser, h.: influence of temperature on the diffusion of vapour through dry insulation walls (in german). kältetechnik, vol. 9, no. 6/1957, pp. 158–159

assoc. prof. rndr. tomáš ficker, drsc., phone: +420 5 4114 7661, e-mail: fyfic@fce.vutbr.cz, department of physics
professor ing. arch. jiří myslín, csc.
ing. z. podešvová, ph.d. student, department of building structures
university of technology, faculty of civil engineering, žižkova 17, 662 37 brno, czech republic

some problems of the integration of heat pump technology into a system of combined heat and electricity production

g. böszörményi, l. böszörményi

the closure of a part of the municipal combined heat and power (chp) plant of košice city would result in the loss of 200 mw thermal output within a relatively short period of time. the long-term development plan for the košice district heating system concentrates on solving this problem. taking into account the extremely high (90 %) dependence of slovakia on imported energy sources and the desirability of reducing the emission of pollutants, the alternative of supplying 100 mw thermal output from geothermal sources is attractive. however, the indices of economic efficiency for this alternative are unsatisfactory. cogeneration of electricity and heat in a chp plant is the most efficient way of supplying heat to košice at the present time; if, as planned, geothermal heat were fed directly into the district heating network, this efficiency would be greatly reduced. an excellent solution of this problem would be a new conception, preferring the utilization of geothermal heat in support of a combined electricity and heat production process. the efficiency of geothermal energy utilization could be increased through a special heat pump. this paper deals with several aspects of the design of a heat pump to be integrated into the system of the chp plant.

keywords: geothermal energy, ejector, heat pump.

1 introduction

as a result of illogical trends in power prices, the competitiveness of district heating has greatly decreased in recent years in košice. this situation is aggravated by the fact that a part of the heating plant is already reaching the end of its working life. if this part were to be closed, it would result in a shortfall of 200 mw heat output. the long-term development plan for the košice district heating system focuses on solutions to this problem. it also takes into account the demands of the national electricity supply network, and is therefore considering three possibilities. an alternative is needed because of slovakia's extremely high (90 %) dependence on imported energy sources, and the need for a lower level of emission of pollutants. it is planned to establish a new gas-steam cycle-based chp plant module with a 100 mw thermal output, and to supply 100 mw thermal output from geothermal sources. however, the efficiency parameters (net present value, internal rate of return and payback period) indicate that the competitiveness of this alternative is rather low.

2 current plan for geothermal energy utilization in the district heating system for košice

the current plan for geothermal energy utilization is based on feeding this energy directly into the district heating system. this is justified by its relatively high temperature (115–130 °c at a depth of about 2 000–2 500 m and 130–150 °c at a depth of about 2 500–3 000 m).
according to the plan, the nearby villages of bidovce, ďurkov, ruskov and slanec will have two pairs of production and reinjection wells (doublets) and one heat exchanger plant. thermal water will be pumped into the heat exchanger plants from the production wells. the thermal water will transfer its thermal energy to the secondary mains water in the heat exchangers, and will then be returned to the earth's crust through the two reinjection wells. the thermal energy thus acquired will then be transported to the heating plant by the secondary mains water. this will be done with the help of pumping stations, and the chp plant will enable it to be used in the district heating system. according to the plan, more than 2 500 tj can be obtained this way to supply the town with heat from geothermal energy sources.

3 more rational utilization of the geothermal energy

an examination of this conception of geothermal energy utilization leads us to an important and to some extent surprising realization: according to the plan, the temperature of the thermal water compressed back into the reinjection wells should be higher than 60 °c. this means there will be an extremely low degree of exploitation of the available geothermal energy capacity. it seems justified to assume that an increase in the degree of utilization would be the most efficient way to improve the economic efficiency parameters. approximately 9 mil. usd could be saved on investment costs in the case of upgraded utilization of the available geothermal energy capacity. about 70 mw of heat output could be gained from direct utilization and about 30 mw from indirect utilization. the indirect utilization stage would be implemented by a heat pumping plant, which would utilize the heat output produced by cooling down the returning secondary mains water by about 20 k, at a higher temperature level. thus the temperature at which the thermal water is compressed back would decrease to a similar extent. this conception can lead to some improvement in the competitiveness of geothermal energy.
however, the main reason for the unsatisfactory economic efficiency (the conflict between cogenerated heat and geothermal heat) cannot be set aside, since it comes from the concept of geothermal energy utilization itself. cogeneration of electricity and heat, the basis of the relatively efficient heat supply of the town at the present time, would be very strongly limited if geothermal energy were fed directly into the district heating network. the conception of integrating geothermal energy sources into the system of combined electricity and heat production in the municipal chp plant would be an excellent solution to this problem. the principal scheme is illustrated in fig. 1. in this conception, the utilization of geothermal energy in support of the combined electricity and heat production process would be preferred. the stream of the secondary geothermal energy medium into the chp plant would be divided into two parts. the larger flow would be used for heating the feed water in the steam cycle. the smaller flow would be used directly in the district heating network:

• for domestic heating and hot water in the winter period,
• for absorption cooling in the summer period.

after that, the streams would be mixed and cooled down by about 20 k in a heat pumping plant. the resulting flow of the secondary geothermal energy medium could be used for cooling the condenser in the steam cycle. the advantage of this solution would be that the primary geothermal water would be injected back into the earth's crust at a lower temperature than without the heat pump. moreover, a part of the energy losses from the condenser would be stored with this water in the earth's crust. the thermal output of the heat pumping plant could be utilized in the following way:

• for domestic heating and hot water in two housing estates near the chp plant in the winter period,
• for domestic hot water preparation throughout the city in the summer period.

implementing this conception could reduce investment costs (by about 9 mil. usd) as well as operating costs, and could increase the benefits of geothermal energy utilization in the košice district heating system.

fig. 1: scheme of the integration of the geothermal source into the combined heat and electricity production system based on a gas-steam cycle

4 selection of the heat pumping technology

one of the main conditions for efficient geothermal energy utilization in this conception is the correct selection of the heat pump technology and its integration into the combined electricity and heat production system. heat pumping technology can be classified according to:

• the number of stages of evaporation and condensation – the most probable method would be to implement these processes in at least two stages.
• the coolant that is used. water can be used as a coolant. in addition to water, the refrigerants r134a and r717 (ammonia) were analyzed for lower dew and evaporation temperatures.
• the method of compression. steam can be compressed mechanically in turbocompressors in two stages, or by thermocompression in an ejector in a single stage. r134a and r717 vapors can be compressed mechanically in two stages.

this system could work in the following way. in summer, all the required heat power (~45 mw) would be produced by the heat pump; consequently, the gas-steam cycle would produce mostly electricity. in winter, only one housing estate in košice city would be supplied with heat power (~65 mw) by the heat pump, and most of the heat power would be produced in the gas-steam cycle.

one possible solution of the heat pump conception is the three-stage alternative shown in fig. 2. this alternative is a combination of direct (first stage) and indirect (second and third stage) processes of evaporation and condensation.
one possible solution of the heat pump conception is the three-stage alternative version in fig. 2. this alternative is a combination of direct (first stage) and indirect (second and third stage) processes of evaporation and condensation. in the first stage, steam is compressed in a one-stage ejector by the motive steam acquired from the steam turbine of the © czech technical university publishing house http://ctn.cvut.cz/ap/ 45 acta polytechnica vol. 41 no. 2/2001 fig. 1: scheme of the integration of the geothermal source into the combined heat and electricity production system based on a gas-steam cycle gas-steam cycle. the evaporation temperature in the first stage is 39 °c and the dew temperature is 80 °c. because of the direct processes in the first stage, the temperature difference in the heat exchangers is negligible. in the second acta polytechnica vol. 41 no. 2/2001 46 tv condenser i ·· mchiiimchii · mol · mol · mpl · mpl r a134 r a134 e 2 e 4 e 3e 1 45 °c 25 °c 45 °c 80 °c evaporator i fig. 2: the three-stage alternative version of the heat pump integrated into combined heat and electricity production using the geothermal source of the košice basin tv condenser i r a134 · moi · mk · moi · mcii e 1 e 2 80 °c 45 °c 45 °c 25 °c evaporator i fig. 3: the two-stage alternative version of the heat pump integrated into combined heat and electricity production using the geothermal source of the košice basin and third stages indirect processes are implemented. in this heat pump, the secondary water is cooled down from 45 °c to 25 °c and the water that is used for heating is warmed from 45 °c to 80 °c. this conception of a heat pump is advantageous due to possibility of heat power regulation. the heat power produced by this alternative is 55 mw. the coefficient of performance (cop) is relatively low because of the huge heat power of the motive steam. this energetic valuation does not take into consideration various exergetic qualities of the energies used for vapor compression. this fact could be respected by equal electrical power which is identical to the power we would get from the expansion of the motive steam. using this assessment the cop reaches significantly higher values. another possible solution of the heat pump is the two-stage alternative shown in fig. 3. analogous to the previous solution, this version consists of direct processes (in the first stage) and indirect processes (in the second stage) of evaporation and condensation. in the first stage, the steam is compressed mechanically in two stages with intercooling after the intermediate stage. because of the enormous specific volume of the steam, the process of compression could fake place in axial turbocompressors. in the second stage, vapors of coolant r134a are compressed mechanically. the temperatures of the water cooled down and heated up are as before. in this case the heating power is about 37 mw and the power required for compression is about 7 mw. clearly this variant has a higher cop. however, it has the great disadvantage of problematic steam compression in turbomachinery, resulting from the extremely low partial pressure of steam at low temperatures. 5 conclusions the three-stage variant has advantageous characteristics both in terms of heat-power regulation and in terms of economic performance. a simplified economic analysis, based on the difference between the selling price of the heat and the cost of providing it, shows the promise of this variant. 
5 conclusions

the three-stage variant has advantageous characteristics both in terms of heat-power regulation and in terms of economic performance. a simplified economic analysis, based on the difference between the selling price of the heat and the cost of providing it, shows the promise of this variant. it is too early to state that this variant will be the final version of the heat pump technology; it will be necessary to analyze the synergies between the steam turbine of the gas-steam cycle and the heat pump.

6 list of used symbols

he, e  heat exchanger
cp  condensate pump
hp  heat pump
c  condenser
gt  gas turbine cycle
st  steam turbine
g  generator
ss  service system
tv  throttle valve
m  mass flow

references

[1] böszörményi, g.: využití hydrogeotermálního potenciálu košické kotliny. diploma thesis, praha, 2000
[2] bussmann, w. (hrsg.): geothermie – wärme aus der erde. verlag c. f. müller gmbh, karlsruhe, 1991
[3] austmeyer, k. e.: mechanical vapour recompression. vdi-society for energy technology, düsseldorf, 1993

ing. gabriel böszörményi, e-mail: g.boszormenyi@sh.cvut.cz, czech technical university in prague, faculty of mechanical engineering, vaníčkova 5, koleje strahov bl. 6./301, 169 00 praha 6, czech republic
doc. ing. ladislav böszörményi, csc., e-mail: boszla@ccsun.tuke.sk, tel.: +421 95 6024241, fax: +421 95 6321558, technical university of košice, faculty of civil engineering, vysokoškolská 4, 042 01 košice, slovak republic
acta polytechnica 53(supplement):746–749, 2013, doi:10.14311/ap.2013.53.0746

moonlight: a new lunar laser ranging retroreflector and the lunar geodetic precession

m. martini (a,*), s. dell'agnello (a), d. currie (b), g. o. delle monache (a), r. vittori (c), s. berardi (a), a. boni (a), c. cantone (a), e. ciocci (a), c. lops (a), m. garattini (a), m. maiello (a),
g. patrizi (a), m. tibuzzi (a), n. intaglietta (a), j. chandler (d), g. bianco (e)

(a) infn–lnf, via e. fermi, 40 frascati (rome)
(b) university of maryland (umd), college park, college park md 20742 and nasa lunar science institute, usa
(c) aeronautica militare italiana (ami), viale dell'università 4, 00185 rome, asi, infn–lnf, italy, and esa-hsou
(d) harvard-smithsonian center for astrophysics (cfa), 60 garden street, cambridge, ma 02138, usa
(e) agenzia spaziale italiana (asi), centro di geodesia spaziale "g. colombo" (cgs), località terlecchia, p.o. box adp, 75100 matera, italy
(*) corresponding author: manuele.martini@lnf.infn.it

abstract. since the 1970s lunar laser ranging (llr) to the apollo cube corner retroreflector (ccr) arrays (developed by the university of maryland, umd) has supplied almost all significant tests of general relativity (alley et al., 1970; chang et al., 1971; bender et al., 1973): possible changes in the gravitational constant, gravitational self-energy, the weak equivalence principle, geodetic precession, and inverse-square force-law deviations. the lnf group has, in fact, just completed a new measurement of the lunar geodetic precession with the apollo arrays, with an accuracy of 9 × 10−3, comparable to the best measurement to date. llr has also provided significant information on the composition and origin of the moon. this is the only apollo experiment still in operation. in the 1970s the apollo llr arrays contributed a negligible fraction of the ranging error budget. since then the ranging capabilities of the ground stations have improved by more than two orders of magnitude, and now, because of the lunar librations, the apollo ccr arrays dominate the error budget. with the project moonlight (moon laser instrumentation for general relativity high-accuracy tests), in 2006 infn–lnf joined umd in the development and test of a new-generation llr payload made of a single, large ccr (100 mm diameter) unaffected by the effect of librations. with moonlight ccrs the accuracy of the measurement of the lunar geodetic precession can be improved by up to a factor of 100 compared to the apollo arrays. from a technological point of view, infn–lnf built and operates a new experimental apparatus (the satellite/lunar laser ranging characterization facility, scf) and created a new industry-standard test procedure (the scf-test) to characterize and model the detailed thermal behavior and the optical performance of ccrs in accurately laboratory-simulated space conditions, for industrial and scientific applications. our key experimental innovation is the concurrent measurement and modeling of the optical far-field diffraction pattern (ffdp) and the temperature distribution of retroreflector payloads under thermal conditions produced with a close-match solar simulator. the apparatus includes infrared cameras for non-invasive thermometry, thermal control and real-time payload movement to simulate the satellite orientation on orbit with respect to the solar illumination and laser interrogation beams. these capabilities provide unique pre-launch performance validation of the space segment of llr/slr (satellite laser ranging), and retroreflector design optimization to maximize the ranging efficiency and the signal-to-noise conditions in daylight. results of the scf-test of our ccr payload will be presented. negotiations are underway to propose our payload and scf-test services for precision gravity and lunar science measurements with the next robotic lunar landing missions.
dell'agnello and the japanese pi team of the llr instrument of the proposed selene-2 mission by jaxa (registered with infn protocol n. 0000242-03/feb/2012). the agreement foresees that, under no exchange of funds, the japanese single, large, hollow llr reflector will be scf-tested and that moonlight will be considered as a backup instrument.

keywords: lunar laser ranging, space test, cube corner reflector, laser technology, test of geodetic precession.

figure 1. the scf cryostat.

1. introduction

lunar laser ranging (llr) is mainly used to conduct high-precision measurements of ranges between a laser station on earth and a corner cube retroreflector (ccr) on the lunar surface. over the years, llr has benefited from a number of improvements both in observing technology and in data modeling, which have led to the current accuracy of postfit residuals of ∼ 2 cm. nowadays llr is a primary technique to study the earth–moon system and is very important for gravitational physics, geodesy, and studies of the lunar interior. since 1969 llr has supplied many tests of general relativity (gr): it has evaluated the geodetic precession [12], probed the weak and strong equivalence principles, determined the parametrized post-newtonian (ppn) parameters β and γ, and addressed the time change of the gravitational constant (g) and 1/r2 deviations. llr has also provided important information on the composition and origin of the moon through measurement of its rotations and tides. future lunar missions will expand this broad scientific program. initially, the apollo arrays contributed a negligible portion of the llr error budget. today the ranging accuracy of ground stations has improved by more than two orders of magnitude: the new apollo (apache point observatory lunar laser-ranging operation) station at apache point, usa, is capable of mm-level range measurements; mlro (matera laser ranging observatory), at the asi (agenzia spaziale italiana) space geodesy center in matera, italy, has restarted lr operations. now, because of lunar librations, the apollo arrays dominate the llr error budget, which is a few cm.

1.1. the satellite/lunar laser ranging characterization facility (scf)

in 2004, infn started building the satellite/lunar laser ranging characterization facility (scf) in frascati. the main purpose of this apparatus is the thermal and optical characterization of ccr arrays in simulated space conditions. a schematic view of the scf is shown in fig. 1. the size of the steel cryostat is approximately 2 m in length by 0.9 m in diameter. the inner copper shield is painted with the aeroglaze z306 black paint (0.95 emissivity and low out-gassing properties) and is kept at t = 77 k with liquid nitrogen. when the scf is cold, the vacuum is typically in the 10−6 mbar range. two distinct positioning systems at the top of the cryostat (one for rototranslation movements in the plane of the prototype and one for spherical rotations and tilts) hold and move the prototype in front of the earth infrared simulator (es, inside the scf), the solar simulator (ss), the infrared camera and laser, all located outside the scf. the ss beam enters through a quartz window (∼ 37 cm diameter, ∼ 4 cm thickness), transparent to the solar radiation up to 3000 nm.
a side germanium window at 45° with respect to the ss beam allows for the acquisition of thermograms of the laser retroreflector array (lra) with an ir digital camera (both during the es/ss illumination and the laser interrogation phases).

1.2. the moonlight – iln experiment

in 2006, infn proposed the moonlight – iln (moon laser instrumentation for general relativity high accuracy tests – international lunar network) technological experiment [5, 8], which has the goal of reducing the error contribution of llr payloads by more than two orders of magnitude. in tab. 1, the possible improvements in the measurement of gravitational parameters achievable by reaching a ranging accuracy of 1 mm or even 0.1 mm are reported [14–16]. after the building of the best station, apollo [11], in new mexico, the main uncertainty is due to the multi-ccr arrays. the apollo arrays now contribute a significant portion of the ranging errors. this is due to the lunar librations, which move the apollo arrays, since these have dimensions of 1 square meter for apollo 15 and half a square meter for apollo 11 and 14. in this paragraph we will describe the moonlight/llrra21 payload, designed to improve gravity tests and lunar science measurements [3, 6]. this project is the result of the collaboration of two teams: the llrra21 team in the usa, led by douglas currie of the university of maryland, and the italian one led by infn–lnf. we are exploring improvements of both the instrumentation and the modeling of the ccr. the main problem that affects the apollo arrays is the lunar libration in longitude, which results from the eccentricity of the moon's orbit around the earth. during the lunar cycle of 27 days, the moon's rotation alternately leads and lags its orbital position by about 8°. due to this phenomenon the apollo arrays are tilted so that one corner of the array is more distant than the opposite corner by several centimeters. because of the libration tilt, the arrays broaden the pulse coming back to the earth (fig. 2). the broadening of the pulse grows in proportion to the physical dimensions of the array and to the increase of the moon–earth distance.

table 1. narrowing of parameter bounds due to gains in the accuracy of ranging measurements by one or two orders of magnitude (reaching an accuracy of 1 mm or even 0.1 mm):

gravitational measurement | 1st gen. llr accuracy (∼ cm) | 2nd gen. llr accuracy (1 mm) | 2nd gen. llr accuracy (0.1 mm) | time scale
ep                  | < 1.4 × 10−13 | 10−14      | 10−15      | few years
sep                 | < 4.4 × 10−4  | 3 × 10−5   | 3 × 10−6   | few years
β                   | < 1.1 × 10−4  | 10−5       | 10−6       | few years
ġ/g                 | < 9 × 10−13   | 5 × 10−14  | 5 × 10−15  | ∼ 5 years
geodetic precession | 6.4 × 10−3    | 6.4 × 10−4 | 6.4 × 10−5 | few years
(1/r2) deviation    | < 3 × 10−11   | 10−12      | 10−13      | ∼ 10 years

figure 2. comparison between 1st and 2nd generation lras. the librations tilt the arrays (on the left), but the single big ccrs (on the right) are unaffected, so single short pulses come back when the moonlight payloads are used.

2. analysis of lunar laser ranging data

in order to analyze llr data we used the pep software, developed at the cfa by i. shapiro et al. starting from the 1970s. pep was designed not only to generate ephemerides of the planets and the moon, but also to compare the model with observations. one of the early uses of this software was the first measurement of the geodetic precession of the moon [12]. the pep software has enabled constraints on deviations from standard gr physics.
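as a rough illustration of the libration effect described in section 1.2 above, the near-to-far corner depth of a tilted flat array, and the resulting two-way spread of the returned pulse, can be estimated in a few lines of code. this is a minimal sketch of ours; the tilt values and the 1 m array size are illustrative assumptions for the example, not measurements from this paper:

```python
import math

C = 299_792_458.0  # speed of light [m/s]

def pulse_spread_ns(array_size_m: float, tilt_deg: float) -> float:
    """two-way time spread across a tilted flat retroreflector array."""
    depth = array_size_m * math.sin(math.radians(tilt_deg))  # near-to-far corner depth [m]
    return 2.0 * depth / C * 1e9  # round trip, in nanoseconds

# illustrative values: a ~1 m array (apollo 15 scale) tilted by a few degrees
for tilt in (2.0, 5.0, 8.0):
    depth_m = 1.0 * math.sin(math.radians(tilt))
    print(f"tilt {tilt:4.1f} deg -> depth {depth_m:.3f} m, spread {pulse_spread_ns(1.0, tilt):.2f} ns")
```

even a few degrees of tilt produce a corner-to-corner depth of several centimeters, i.e. a pulse spread of a few tenths of a nanosecond, which is why a single large ccr is insensitive to this effect.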
here we show the first determination of the relative deviation of the geodetic precession from the value expected in gr, kgp. we have used all the data available to us from the apollo ccr arrays: apollo 11, apollo 14 and apollo 15. the value obtained using only the old stations (grasse, mcdonald and mlr2) vs. the value obtained using the apollo station is shown in tab. 2. these preliminary measurements are to be compared with the best result published by jpl [13], obtained using a completely different software package. on the contrary, after the original 2 % kgp measurement by cfa in 1988, the use of pep for llr has been resumed only in the last few years, and it is still undergoing the necessary modernization and optimization.

table 2. values obtained for the geodetic precession:

station        | value obtained       | value in gr
old stations   | (9 ± 9) × 10−3       | 0
apollo station | (−9.6 ± 9.6) × 10−3  | 0

3. conclusions

the analysis of existing llr data with pep is making good progress, thanks to the important collaboration with cfa, as shown by the preliminary measurement of the geodetic precession (de sitter effect) with an accuracy at the 1 % level. in the future we are going to deepen our knowledge of the data and the software in order to better estimate the kgp uncertainty and other gr parameters. a possible way to improve the precision of llr measurements is to have llr stations range not only to the moon, but also to satellites around the earth, primarily to lageos, thus improving station intercalibration.

acknowledgements

references

[1] alley, c. o., chang, r. f., currie, d. g., mullendore, j., poultney, s. k., rayner, j. d., silverberg, e. c., steggerda, c. a., plotkin, h. h., williams, w., warner, b., richardson, h., bopp, b., science, volume 167, issue 3917, pp. 368–370, 01/1970; apollo 11 laser ranging retro-reflector: initial measurements from the mcdonald observatory
[2] bender, p. l., currie, d. g., dicke, r. h., eckhardt, d. h., faller, j. e., kaula, w. m., mulholland, j. d., plotkin, h. h., poultney, s. k., silverberg, e. c., wilkinson, d. t., williams, j. g., alley, c. o., science, volume 182, issue 4109, pp. 229–238, 10/1973; the lunar laser ranging experiment
[3] boni, a. et al, 'world-first scf-test of the nasa-gsfc lageos sector and hollow retroreflector', in proc. 17th international workshop on laser ranging, bad kötzting, germany, may 2011.
[4] chang, r. f., alley, c. o., currie, d. g., faller, j. e., space research xii, vol. 1, pp. 247–259, 00/1971; optical properties of the apollo laser ranging retro-reflector arrays
[5] d. currie, s. dell'agnello, g. delle monache, acta astronaut. 68 (2011) 667–680; a lunar laser ranging retroreflector array for the 21st century
[6] s. dell'agnello et al, advances in space research 47 (2011) 822–842; creation of the new industry-standard space test of laser retroreflectors for the gnss and lageos
[7] s. dell'agnello, 3rd international colloquium on scientific and fundamental aspects of the galileo programme, copenhagen, denmark, august 201; etrusco-2: an asi–infn project of technological development and 'scf-test' of gnss laser retroreflector arrays
[8] s. dell'agnello, d. currie, g. delle monache et al., paper # glex-2012.02.1.7x12545 in proceed. global lunar exploration conference, washington (usa), may 2012; moonlight: a lunar laser ranging retroreflector array for the 21st century
[9] r. march, g. bellettini, r. tauraso, s. dell'agnello, phys.
rev d 83, 104008 (2011); constraining spacetime torsion with the moon and mercury
[10] r. march, g. bellettini, r. tauraso, s. dell'agnello, gen relativ gravit (2011) 43:3099–3126, doi 10.1007/s10714-011-1226-2; constraining spacetime torsion with lageos
[11] t. w. murphy et al, publications of the astronomical society of the pacific, 121:29–40, 2009; the apache point observatory lunar laser-ranging operation (apollo): two years of millimeter-precision measurements of the earth-moon range
[12] i. i. shapiro, r. d. reasenberg, j. f. chandler, r. w. babcock, prl 61, 2643 (1988); measurement of the de sitter precession of the moon: a relativistic three-body effect
[13] j. g. williams, s. g. turyshev, and d. h. boggs, prl 93, 261101 (2004)
[14] williams, j. g., newhall, x. x., dickey, j. o., physical review d (particles, fields, gravitation, and cosmology), volume 53, issue 12, 15 june 1996, pp. 6730–6739, 06/1996; relativity parameters determined from lunar laser ranging
[15] williams, james g., newhall, x. x., dickey, jean o., planetary and space science, v. 44, pp. 1077–1080, 10/1996; lunar moments, tides, orientation, and coordinate frames
[16] williams, j. g., turyshev, s. g., boggs, d. h., ratcliff, j. t., adv. space res. 37(1), 67–71, 2006; lunar laser ranging science: gravitational physics and lunar interior and geodesy
[17] a scientific concept for the lgn has been developed by the international lunar network (see http://iln.arc.nasa.gov/). see the core instrument and communications working group final reports.

discussion

jim beall: do you have a planned date for deploying the retroreflector on the moon?

manuele martini: there are several scientific agreements for opportunities for robotic missions on the lunar surface that will deploy moonlight ccrs, but there is no firm approved date for deploying a ccr on the lunar surface.

acta polytechnica doi:10.14311/ap.2015.55.0366 acta polytechnica 55(6):366–372, 2015 © czech technical university in prague, 2015 available online at http://ojs.cvut.cz/ojs/index.php/ap

a hardware-in-the-loop simulator based on a real skoda superb vehicle and rt-lab/carsim

milan biros, karol kyslan∗, frantisek durovsky

technical university of kosice, department of electrical engineering and mechatronics, letna 9, kosice, slovak republic
∗ corresponding author: karol.kyslan@tuke.sk

abstract. this paper describes the design and realization of a hardware-in-the-loop simulator made from a real skoda superb vehicle. a combination of rt-lab and carsim software is used for real-time control and for handling the sensoric subsystems. the simulator provides almost realistic testing of driving cycles with on-line visualization. this unique device can be used in various fields of research.

keywords: hardware-in-the-loop, rt-lab, carsim, simulator, skoda, superb.

1. introduction

in recent years, vehicle manufacturers and developers have focused on vehicle-human interaction, which has been leading towards eliminating the driver and replacing him/her with modern intelligent embedded systems. manufacturers will have to establish new active safety systems due to mandatory european regulations and strict euro ncap criteria [1].
developments in this field involve camera-based and radar-based assistance systems. these systems continuously monitor the surroundings of the vehicle, and provide considerable help for the driver during driving manoeuvres through visual or acoustic warnings or through autonomous reactions within a fraction of a second. typical examples of such systems are the emergency braking assistant, the lane keeping assistant, the crossroads assistant, and automatic cruise control [2]. developments of this kind cannot be made without advanced simulation systems. hardware-in-the-loop (hil) simulation reduces development time and development costs, because it is not necessary to execute test rides with a real car in a real environment, and vehicle components can be tested separately. a controlled laboratory environment also improves accuracy and repeatability. vehicle simulators have been used for this kind of research, often for designing and evaluating a wide range of control algorithms. these simulators can be classified as in [3], [4] or [5].

according to their complexity:
• low-level: a car and a screen are fixed to the ground, a fixed base (fb) simulator
• mid-level: a car is accelerated (moved) in one degree of freedom (1dof)
• high-level: a car is moved in at least six degrees of freedom (6dof)

according to their use:
• training simulators: typically used to qualify for a driving license
• research simulators: a complex mechatronic unit is established for various research requirements

according to their cost:
• low-cost: using inexpensive graphic displays; the main applications are for screening the cognitive behavior of a driver
• mid-cost: employing real-time animation of a complete vehicle
• high-cost: used for evaluating electronic safety systems, or for accident reconstruction studies

according to the categories given above, this paper deals with a low-level, research, mid-cost driving simulator, based on a modified real vehicle. the aim is to connect a skoda superb vehicle to a real-time control system, and to use as many of the components of the vehicle as possible, resulting not only in a driving simulator, but also in a complex mechatronic device that can be used for a range of research and educational requirements.

2. hardware architecture

the essential component of the proposed simulator is the cockpit of a real skoda superb vehicle. most of the electronic components and all important mechanical components remain unchanged inside the cockpit. due to space limitations, the engine, the gearbox and the part of the vehicle behind the b-pillar were removed (see fig. 2). this involved removing some electronic devices, such as the engine ecu, sensors for the engine and speed sensors on the wheels, which had to be replaced. in addition, some interior parts were removed or were slightly modified to improve visibility and access to the wiring, the harness connectors and other important components. thanks to these modifications, users are able to observe the wiring loom and the placement of the components of a real vehicle in no time, without any tools. all adjustments were aimed at retaining the genuine arrangement of the electronics of the vehicle [6].
figure 1. hardware architecture of a vehicle simulator (cockpit inputs, can bus, daq, target and host pcs, visualization pc).

the carsim rt real-time version by mechanical simulation is used as the simulation software for simulating the dynamics of the vehicle. a set of non-linear equations of the vehicle dynamics is calculated here, and enables the results to be seen in the form of time responses of particular mechanical variables. likewise, the results can be animated by visualization in surfanim. when on-line visualization is needed, the liveanimator tool provides real-time simulation, including the steering inputs by the car driver. it should be noted here that liveanimator was not available when the simulator was being constructed. carsim includes vehiclesim solvers, which calculate motion equations for mathematical models of vehicles. vs solvers use vs lisp (originally named autosim) for symbolic solving of multi-body systems [7]. the current version of the carsim software includes a graphical user interface (gui), which parameterizes the modelled vehicle in great detail. the user therefore does not need special knowledge of vs commands to create basic models. the current version also contains a large number of examples of vehicles and scenarios, which familiarize the user with the software functions and with the influence of the parameters on the manoeuvrability of the vehicle. only the basic parameters of the vehicle were obtained from the officially available documentation for the skoda superb 3u with a 1.8 l turbocharged spark-ignited engine (code designation awt) and a 5-gear manual gearbox (code designation 01w-012). this is insufficient, due to the complexity of the carsim software that is used. a default example from carsim was therefore taken, and was supplemented by the known parameters found in the documentation. the vehicle was constructed on a platform designated as b5 or pl45, where p indicates a passenger car, l indicates longitudinal engine placement above the front axle, 4 is the size and 5 is the generation of the platform. this platform corresponds to the carsim dataset d-class, sedan.

figure 2. skoda superb vehicle simulator, rear view.

in order to ensure precise timing with the vehicle components, calculations in the carsim software must be executed in real time. the use of carsim rt together with rt-lab software creates a powerful hil system, which meets the requirements for controlling the proposed vehicle simulator [8].

figure 3. measuring the position of the gear knob: (a) with the analog input; (b) with the digital inputs.

the rt-lab system is an alternative to the widely-used dspace [9], [10]. the control system consists of 2 personal computers (see fig. 1). the host pc acts as a console for controlling the simulation. it handles starting and stopping the execution, and in addition, the simulation results are displayed and stored here for further off-line evaluation.
the mathematical model of the vehicle, developed or edited here in carsim rt, is exported to matlab/simulink as an s-function block, and is then compiled to c language and sent to the target pc. the target pc with the qnx real-time operating system executes real-time computations and provides data acquisition (daq). this computer is equipped with a national instruments 6024e daq card and a softing can-ac2 pci communication card. signal scaling and processing is also performed here. note that, with a few modifications, this hil-based simulator can easily be extended for use in research on hybrid electric vehicles [11]. without the carsim live animator option, it is possible only to display the time responses of particular vehicle variables. orienting the driver spatially while monitoring the behavior of the vehicle only in the form of time responses would be unwieldy. in order to avoid the use of live animator, a commercial video game with a console and a steering wheel was introduced to convert the analog and digital signals from the target pc to another visualization pc.

3. sensoric subsystem

the original superb dashboard was used to display some information necessary for the driver via can bus, e.g. engine rpm (see fig. 1). many sensors have to be modified according to the notes in this section. these variables are measured in a real vehicle:
• gear knob position
• throttle position
• brake pedal position
• steering wheel angle

figure 4. structure of a position sensor for the gear (micro-switches, spring, gear shift lever, h-gear pattern).

3.1. position of the gear knob

in most cases, vehicles with automatic transmission are equipped with a gear sensor. as was mentioned above, the skoda superb has a 5-gear manual gearbox, which is missing in the proposed simulator. the position sensor of the gear knob was constructed, but the main problem was where to place the knob. we mounted a small container with a spring at the bottom of the lever. the spring pushes the container against a plastic panel with slots and holes, which defines the range of movement of the lever. the holes define the end positions of the lever. we mounted another thin board above the plastic panel, with a hole in the shape of the h-gear pattern. it was equipped with micro-switches in the end positions of the lever. the new structure of the gear shift lever is shown in fig. 4. at first, the micro-switches were connected as a resistor divider, as shown in fig. 3(a). however, this was not ideal, because of the unidentified state in neutral gear. all the micro-switches were opened, so the residual voltage on the analog input of the daq card was gradually discharged. for example, when 1st gear was disengaged, the voltage slowly decreased from 5 v to 0 v. this means that all the gears were detected in sequence, which is absolutely unacceptable. another arrangement was therefore implemented, as in fig. 3(b), where each micro-switch is connected directly to one single digital input of the daq card. thanks to this, the functionality of the sensor was re-established. however, the need for 5 digital inputs for the gear plus one for the hand brake takes up 6 of the 8 available digital inputs of the daq card, so there was limited capacity for adding new functions.

3.2. throttle position

the skoda superb is equipped with an electronic acceleration pedal. its position is measured with a double-path potentiometer, which is also used in the simulator.
it is not critical to operate this pedal during simulation, so it does not have to be redundant and only one path of the potentiometer is sensed. this sensor is supplied from an external voltage source, because there is no engine ecu. in a real vehicle, pedal failure detection is implemented. the output voltage of the sensor is in the range of 0.75–3.57 v, so it is offset to the 0–5 v analog input range, and is then rescaled to the range of 0–1 for use in the carsim s-function. in order to eliminate noise voltage that would cause engine rpm fluctuations, quantization with a 0.01 step is applied to the processed sensor output signal.

3.3. position of the brake pedal

in a real vehicle, information about the position of the brake pedal is obtained only by a circuit closer, so only an on-off signal is available. it is convenient to measure the position of the pedal in the simulator, which is later used in carsim. for this purpose, the brake pedal system had to be equipped with a position sensor. instead of an optical or linear position sensor, which would be expensive, we used a linear potentiometer modified for mounting to the pedal system. the path of the potentiometer did not match the range of movement of the pedal, so the output voltage is processed in a similar way as for the accelerator pedal.

3.4. steering wheel

the original sensor from a real vehicle (g85) is used for measuring the steering wheel angle. it is an absolute quadrature encoder with a range of ±720°, directly connected to a can bus (can antrieb). the value of the steering wheel angle in the can message is represented by 2 bytes. the lower 15 bits represent the angle, and the highest bit represents the direction of rotation. it is necessary to preserve this form of the signal for additional signal processing, e.g. scaling and direction detection in rt-lab. in a real vehicle, this sensor is initialized after the key is switched to the first position and the steering wheel is turned by about 4.5°, by an initializing can message from the brake ecu. when the sensor is replaced or disconnected, it must be calibrated with the help of the vag-com diagnostic interface. the steering wheel angle is transmitted via can, then processed in rt-lab and sent out to the game steering wheel electronics via an analog output of the daq card.

4. visualization

as was mentioned above, the live animator original real-time animation extension of carsim was not available. an oem gamer steering wheel and a pc game were used for real-time animation and visualization. the gamer steering wheel was completely disassembled, and the wheel itself is not used. instead, its internal circuit board was taken out and modified to connect to the daq card of the control system. the internal variables of the game can be accessed only by using the corresponding circuit board. an advantage of this solution is that the steering wheel is connected to the visualization pc via usb, and is supplied with the existing usb driver, so it is not necessary to create a particular driver. however, the communication between the target pc and the visualization pc is only unidirectional, so the vehicle may behave slightly differently in the game and in the simulation. the main disadvantage of the oem steering wheel is that its angle range is only 180°. it is suitable for racing games, but is not appropriate for a driving simulator. the only way to remove this defect would be to use a different gaming device.
for this reason, the steering angle signal ϕ from the vehicle's steering wheel has to be limited to ±90° in software, and rescaled:

uin = (ϕ + 90) · 5/180

then the input voltage range uin of the electronics of the gamer wheel is 0–5 v over the whole angle, and the steering wheel in the visualization is centered. the original buttons on the gaming device were replaced by a position sensor made from the gear lever micro-switches, because the daq card had an insufficient number of outputs. classical gear shifting can therefore be used, if it is supported by the game. if not, two digital outputs of the daq card can be used and a sequence gear can be applied, realized in matlab/simulink (see fig. 5). then, however, there will be no other available digital output for sensing the position of the hand brake.

figure 5. sequence shifting algorithm.

if a sequence gear is chosen in the game, a model for the sequence gear, shown in fig. 5, is active. input 1, gear model, is the output of a manual gear model executed in carsim. input 2, seq set, is a signal for manual synchronization of the gear in the game and in the vehicle. during execution, the game has information only about gear up or gear down, and no information about the current gear number. it has to be manually synchronized with input 2. in this part of the model, a final output vector transmitted to the digital outputs of the daq card is created. the blue part of the model provides the main shifting algorithm. it has been verified experimentally that the gaming wheel buttons take at least 50 ms to detect a press, and this causes delays. this value is implemented here as the sample time. another delay is caused by adding detection of the neutral position during sequence gear. during shifting, e.g. from 1st gear to 2nd gear, the shifting in the game is sequenced as 1–0–1–2. due to this, a waiting loop (the yellow part of the model) was added to the algorithm. for the neutral position of the lever, the algorithm waits 5 samples and then shifts to 0. this waiting loop causes an additional delay. however, this is not so critical, because if a gear change occurs within the waiting time, it is executed immediately.

5. simulation results

the parameters of the vehicle model running in carsim are slightly different from the parameters of the real vehicle. only simulation results are presented in this section, and they are not compared with real vehicle measurements. the following simulation was performed with manual gear shifting, see fig. 6. the vehicle was driven by a user, and a real-time model of the vehicle was executed in the control system. gear shifting from 1st gear to 3rd gear can be observed in fig. 6(a), where the vehicle accelerates smoothly to a speed of above 100 km/h. the speed of the vehicle at time t = 0 s is not precisely zero, because the unbraked vehicle tends to creep forward, due to the idle rpm engine vibration. the center of gravity speed of the vehicle reached 100 km/h at around t = 11 s, which is about 1 s faster than the value specified for this vehicle by the manufacturer. from t = 2 s to t = 6 s, there was heavy overspeeding of the front axle velocity. the front wheels skidded when low gears were engaged. a small slip can even be observed at the shift into 3rd gear. this is because the vehicle has the best power transmission from the engine to the pavement at a particular slip value included in the vehicle model.
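the sequence-shifting logic of section 4, which explains the discontinuities visible in fig. 6(a), can be summarized in a minimal python sketch. the names and structure below are our illustrative assumptions; the actual implementation is the simulink model of fig. 5:

```python
NEUTRAL_WAIT = 5  # samples to confirm neutral before shifting the game gear to 0

class SequenceGear:
    """sketch of the sequence-shifting logic: one call per sample (>= 50 ms,
    the measured minimum button-press detection time of the gaming wheel)."""

    def __init__(self, seq_set=0):
        self.game_gear = seq_set  # 'input 2, seq set': manual sync with the game
        self.neutral_count = 0

    def step(self, gear_model):
        """gear_model is 'input 1', the gear from the carsim manual-gear model.
        returns the 'up'/'down' button pulses to send to the daq outputs."""
        if gear_model == 0:
            # waiting loop: accept neutral only after NEUTRAL_WAIT samples, so a
            # quick pass through neutral (e.g. 1 -> 0 -> 2) is not sequenced
            self.neutral_count += 1
            if self.neutral_count < NEUTRAL_WAIT:
                return []
        else:
            self.neutral_count = 0  # a gear change within the wait executes at once
        pulses = []
        while self.game_gear != gear_model:  # the game only understands up/down
            step = 1 if gear_model > self.game_gear else -1
            pulses.append("up" if step == 1 else "down")
            self.game_gear += step
        return pulses

# example: lever goes 1 -> (briefly) neutral -> 2; the brief neutral emits nothing
sg = SequenceGear(seq_set=1)
print(sg.step(0), sg.step(2))  # [] and then ['up']
```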
when the gear lever traverses the neutral position, discontinuous gear shifting behavior can be observed at the top of fig. 6(a). this causes a temporary drop in velocity, and a negative acceleration peak can be observed at t = 4.3 s. the time responses during a double line change (dlc) manoeuvre are shown in fig. 6(b). the acceleration of the vehicle is omitted, and only the dlc phase at an almost constant speed (60 km/h) is shown. at t = 12.1 s the steering wheel is steered to an angle of approx. −90°, and after one second it returns to zero. the vehicle has passed into another line. at t = 14.6 s the wheel is steered to an angle of 100° and after a while, the vehicle is back in the original line. the vertical forces in kn acting against the wheels are also shown in fig. 6(b), where 1r, 1l are the forces acting on the front right wheel and on the front left wheel, respectively; and 2r, 2l are the forces acting on the rear right wheel and on the rear left wheel, respectively. at a constant speed, the forces acting on the front axle are greater than the forces acting on the rear axle, due to the placement of the engine in the front part of the vehicle. centrifugal forces occur while the vehicle is steering to the left at t = 12.1–13.1 s. the forces acting on the wheels on the right side of the vehicle (1r, 2r) are therefore greater than the forces acting on the wheels on the left side (1l, 2l). similar behavior can be observed while the vehicle is being steered to the right at t = 14.5–15.5 s.

figure 6. simulation results: (a) acceleration from 0 to 100 km/h (gear shifting, cog velocity, front/rear axle velocities, and acceleration vs. time); (b) dlc manoeuvre (steering wheel angle, cog velocity, vertical forces 1r, 2r, 1l, 2l, and acceleration vs. time).

6. conclusion

the proposed vehicle simulator can be considered as a mid-cost, low-level research mechatronic device. it was built to verify its abilities when combined with a real vehicle. the aim was to use as many real vehicle components as possible. the main advantage of the simulator is that after the driving circuit geometry and the characteristics of the environment (pavement, weather conditions etc.) have been designed, the user can drive a model of the simulated vehicle from a real car. there is no need to design a test procedure for the driver's behavior. the use of carsim enables a huge variety of vehicle and environment parameters to be tested. however, the simulator has some drawbacks in its current state. the manual clutch from the real car is not available. in addition, the steering wheel is not equipped with force feedback. although the user can watch the behavior of the vehicle on a projector screen, this solution is not optimal. in the future, the visualization will be upgraded to carsim rt animator or to the matlab virtual reality toolbox, and the data acquisition channels will be extended to implement some new features.
acknowledgements

this work was supported by the scientific grant agency of the ministry of education of the slovak republic and of the slovak academy of sciences (vega), under the project code 1/0464/15.

references

[1] euro ncap. ncap, 2014. http://www.euroncap.com/technical.aspx.
[2] dspace. dspace magazine, 2014. https://www.dspace.com/en/pub/home/medien/product_info/eyes_on_the_road.cfm.
[3] e. blana. a survey of driving research simulators around the world. institute of transport studies, university of leeds, leeds, uk, 1996. http://eprints.whiterose.ac.uk/2110/.
[4] j. slob. state-of-the-art driving simulators, a literature survey. dtc report. department of mechanical engineering, eindhoven university of technology, eindhoven, netherlands, 2008. http://www.mate.tue.nl/mate/pdfs/9611.pdf.
[5] u. eryilmaz, h. s. tokmak, k. cagiltay, et al. a novel classification method for driving simulators based on existing flight simulator classification standards. transportation research part c: emerging technologies 42(0):132–146, 2014. doi:10.1016/j.trc.2014.02.011.
[6] m. biros. skoda superb vehicle simulator in carsim. master thesis (in slovak). department of electrical engineering and mechatronics, technical university of kosice, kosice, slovakia, 2014.
[7] mechanical simulation. vehicle sim technology, 2015. http://www.carsim.com/products/supporting/vehiclesim/index.php.
[8] j. j. kwon, t. w. hong, k. park, s. j. heo. robustness analysis of esc ecu to characteristic variation of vehicle chassis components using hils technique. international journal of automotive technology 15(3):429–439, 2014. doi:10.1007/s12239-014-0045-3.
[9] t. haubert, t. hlinovsky, p. mindl. modelling of electric vehicle dynamics using dspace dc 1103. transactions on electrical engineering 3(4):106–110, 2014. http://www.transoneleng.org/2014/20144f.pdf.
[10] c. m. kang, s.-h. lee, c. c. chung. lane estimation using a vehicle kinematic lateral motion model under clothoidal road constraints. ieee 17th international conference on intelligent transportation systems (itsc), pp. 1066–1071, 8-11 oct. 2014. doi:10.1109/itsc.2014.6957829.
[11] y. inaba, s. cense, t. o. bachir, h. yamashita. a dual high-speed pmsm motor drive emulator with finite element analysis on fpga chip with full fault testing capability. proceedings of the 14th european conference on power electronics and applications (epe-2011), aug. 30–sept. 1, 2011, birmingham, united kingdom.
acta polytechnica doi:10.14311/ap.2014.54.0394 acta polytechnica 54(6):394–397, 2014 © czech technical university in prague, 2014 available online at http://ojs.cvut.cz/ojs/index.php/ap

automorphisms of algebras and orthogonal polynomials

daniel gromada, severin pošta∗

department of mathematics, faculty of nuclear sciences and physical engineering, czech technical university in prague, trojanova 13, cz-120 00 prague, czech republic
∗ corresponding author: severin.posta@fjfi.cvut.cz

abstract. suitable automorphisms together with a complete classification of representations of some algebras can be used to generate some sets of orthogonal polynomials "at no cost". this is also the case for the nonstandard klimyk-gavrilik deformation u′q(so3), which is connected to q-racah polynomials.

keywords: orthogonal polynomials, algebra representation, automorphism.

1. introduction

the connection of orthogonal polynomials with lie algebras has been known for a long time (see [1–4]; for a nice introduction and a detailed historical survey see [5] and references therein). having available the classification of irreducible representations and making use of some automorphisms of algebras, one can obtain sets of orthogonal polynomials for free, without the need to prove their properties manually. the same is true for some of their q-analogs. we show this approach on a well-known example of the sl2 algebra and krawtchouk polynomials [5], and we then apply the same procedure to the nonstandard klimyk-gavrilik deformation u′q(so3), for which the complete classification of its irreducible representations is known (see [6–9]).

2. lie algebra sl2

let us first consider the lie algebra sl2 of 2×2 complex matrices with zero trace. it has the standard chevalley basis
$$e = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad f = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \quad h = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$
these matrices satisfy the commutation relations $[h,e] = 2e$, $[h,f] = -2f$, $[e,f] = h$, where $[x,y] = xy - yx$. let ϕ be a finite-dimensional irreducible representation of sl2 acting on the space $v_{N+1}$ of dimension $N+1$ with some fixed basis via the matrices
$$h = \varphi(h) = \begin{pmatrix} N & 0 & 0 & \cdots & 0 \\ 0 & N-2 & 0 & & 0 \\ 0 & 0 & N-4 & & \vdots \\ \vdots & & & \ddots & \\ 0 & 0 & 0 & \cdots & -N \end{pmatrix},$$
$$\varphi(e) = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 2 & & 0 \\ \vdots & & & \ddots & \\ & & & & N \\ 0 & 0 & 0 & \cdots & 0 \end{pmatrix}, \quad \varphi(f) = \begin{pmatrix} 0 & 0 & 0 & \cdots & 0 \\ N & 0 & 0 & & 0 \\ 0 & N-1 & 0 & & \vdots \\ & & \ddots & \ddots & \\ 0 & 0 & \cdots & 1 & 0 \end{pmatrix}.$$
now let us define a matrix s as
$$s = \varphi(e) + \varphi(f) = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ N & 0 & 2 & & 0 \\ 0 & N-1 & 0 & \ddots & \vdots \\ \vdots & & \ddots & & \\ 0 & 0 & 0 & \cdots & 0 \end{pmatrix}.$$
making use of an automorphism σ sending $h \to e + f$, $e \to \tfrac{1}{2}(h - e + f)$, $f \to \tfrac{1}{2}(h + e - f)$, and taking into account the classification of irreducible representations of the sl2 algebra, we see that the matrices h and s form a so-called leonard pair (a leonard pair is a pair of diagonalizable finite-dimensional linear transformations, each of which acts in an irreducible tridiagonal fashion on an eigenbasis of the other one). in particular, they have the same eigenvalues.
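this construction is easy to check numerically. the following is a minimal numpy sketch of ours (for illustration only, not part of the paper) that builds the representation matrices, verifies the commutation relations, and confirms that s has the same spectrum as h:

```python
import numpy as np

def sl2_rep(N):
    """(N+1)-dimensional irreducible representation of sl2 in the basis above."""
    h = np.diag([N - 2 * k for k in range(N + 1)]).astype(float)
    e = np.diag(np.arange(1, N + 1), k=1).astype(float)   # superdiagonal 1, 2, ..., N
    f = np.diag(np.arange(N, 0, -1), k=-1).astype(float)  # subdiagonal N, N-1, ..., 1
    return h, e, f

N = 5
h, e, f = sl2_rep(N)
comm = lambda x, y: x @ y - y @ x
assert np.allclose(comm(h, e), 2 * e)    # [h, e] = 2e
assert np.allclose(comm(h, f), -2 * f)   # [h, f] = -2f
assert np.allclose(comm(e, f), h)        # [e, f] = h

s = e + f
# h and s form a leonard pair; in particular they share the spectrum N, N-2, ..., -N
print(np.sort(np.linalg.eigvals(s).real))
```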
it follows that there exists a matrix p such that $s = p h p^{-1}$. we can consider the rows of p or the columns of $p^{-1}$ as coordinate vectors of a polynomial. the proposition is that
$$k_j(k;\, 1/2,\, N) = [p^{-1}]_{kj} = \binom{N}{j}^{-1} p_{jk} \tag{1}$$
where $k_n(x;\, p,\, N)$ is the n-th krawtchouk polynomial, $n = 0, 1, \dots, N$, with parameters $p \in (0,1)$ and $N \in \mathbb{N}_0$. the matrix p is defined by the similarity relation up to a multiplicative constant. equation (1) holds if we choose $p_{00} = 1$. the krawtchouk polynomials are defined by means of the hypergeometric function
$$k_n(x;\, p,\, N) = {}_2f_1\!\left(\begin{matrix} -n,\, -x \\ -N \end{matrix}\,\middle|\,\frac{1}{p}\right) = \sum_{j=0}^{n} \frac{(-n)_j(-x)_j}{(-N)_j}\,\frac{1}{p^j\, j!}$$
where $(a)_j = \prod_{k=0}^{j-1}(a+k)$ and $(a)_0 = 1$. to show (1), we can use the similarity relation $p^{-1}s = h p^{-1}$, i.e.,
$$[p^{-1}s]_{kj} = j\,[p^{-1}]_{k,j-1} + (N-j)\,[p^{-1}]_{k,j+1}, \qquad [h p^{-1}]_{kj} = (N-2k)\,[p^{-1}]_{kj},$$
construct the recurrence relation
$$j\,[p^{-1}]_{k,j-1} - (N-2k)\,[p^{-1}]_{k,j} + (N-j)\,[p^{-1}]_{k,j+1} = 0 \tag{2}$$
and compare it with the general recurrence for krawtchouk polynomials,
$$j(p-1)\,k_{j-1}(x) - \bigl(j(p-1) + (j-N)p + x\bigr)\,k_j(x) + (j-N)p\,k_{j+1}(x) = 0 \tag{3}$$
where $k_n(x) = k_n(x;\, p,\, N)$ for fixed p and N. one can show a similar result for the relation $sp = ph$. this also proves the relation between $p^{-1}$ and $p^*$, which can be written as $p^{-1} = p^* g^{-1}$, where $g_{jk} = \delta_{jk}\binom{N}{j}$. this means that p is orthogonal with respect to the inner product defined by the matrix g. thus, $p^{-1}$ is orthogonal and so the columns of $p^{-1}$ (the krawtchouk polynomials) are orthogonal with respect to this inner product, giving the orthogonality relation
$$\sum_{j=0}^{N} \binom{N}{j}\, k_m(j)\, k_n(j) = h_n \delta_{mn}. \tag{4}$$
the inner product is of course determined up to normalization ((4) can be multiplied by an arbitrary sequence $a_n$). this result corresponds with the general orthogonality relation for krawtchouk polynomials (see [10])
$$\sum_{j=0}^{N} \binom{N}{j}\, p^j (1-p)^{N-j}\, k_m(j)\, k_n(j) = \binom{N}{n}^{-1}\left(\frac{1-p}{p}\right)^{n}\delta_{mn}.$$
relation (4) can be derived, not just proven, using the properties of the leonard pair. we will make use of the fact that s has $N+1$ distinct eigenvalues and p diagonalizes s, so the columns of p are eigenvectors of s. if we find an inner product such that s is the matrix of a self-adjoint operator with respect to this product, then p will be orthogonal with respect to this product. we will try to find a diagonal matrix $a = \mathrm{diag}(a_0, \dots, a_N)$ such that $a^{-1}sa$ is a hermitian matrix, and so is the matrix of a self-adjoint operator in the case of the standard inner product. after the change of basis we can see that s is a self-adjoint operator in the case of the inner product defined by the matrix $g = a^* a = \mathrm{diag}(w_0, \dots, w_N)$. thus, we require $[a^{-1}sa]_{j-1,j} = [a^{-1}sa]_{j,j-1}$, which leads to the condition
$$|a_j|^2 = |a_{j-1}|^2\,\frac{\bar{s}_{j,j-1}}{s_{j-1,j}} = |a_{j-1}|^2\,\frac{N-j+1}{j}.$$
this requirement is fulfilled if we choose
$$w_j = |a_j|^2 = \prod_{k=1}^{j}\frac{N-k+1}{k} = \binom{N}{j}.$$

3. algebra u′q(so3)

now let us consider from the same point of view the algebra u′q(so3), a complex associative algebra generated by three elements $i_1$, $i_2$ and $i_3$ satisfying the relations
$$q^{1/2} i_1 i_2 - q^{-1/2} i_2 i_1 = i_3, \tag{5}$$
$$q^{1/2} i_2 i_3 - q^{-1/2} i_3 i_2 = i_1, \tag{6}$$
$$q^{1/2} i_3 i_1 - q^{-1/2} i_1 i_3 = i_2. \tag{7}$$
let us assume that q is not a root of unity and define matrices $\varphi(i_1)$, $\varphi(i_2)$, $\varphi(i_3)$ by
$$[\varphi(i_1)]_{j+1,j} = \frac{[2m-j]}{q^{-m+j} + q^{m-j}}, \qquad [\varphi(i_1)]_{j-1,j} = -\frac{[j]}{q^{-m+j} + q^{m-j}}, \qquad [\varphi(i_1)]_{jk} = 0 \ \text{for}\ k \neq j \pm 1,$$
$$[\varphi(i_3)]_{jk} = \mathrm{i}\,[-m+j]\,\delta_{jk},$$
where $m = N/2$ and $[\nu] = (q^{\nu} - q^{-\nu})/(q - q^{-1})$. (the matrix $\varphi(i_2)$ can be obtained from the third defining relation (7).)
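these formulas can likewise be checked numerically. the sketch below (again ours, for illustration; not from the paper) builds ϕ(i1) and ϕ(i3), obtains ϕ(i2) from relation (7), and verifies relations (5) and (6) together with the equality of the spectra of ϕ(i1) and ϕ(i3):

```python
import numpy as np

def qnum(nu, q):
    """symmetric q-number [nu] = (q^nu - q^-nu) / (q - q^-1)."""
    return (q**nu - q**(-nu)) / (q - 1 / q)

def classical_rep(N, q):
    """(N+1)-dimensional classical representation of u'_q(so3) defined above."""
    m = N / 2
    i1 = np.zeros((N + 1, N + 1), dtype=complex)
    for j in range(N):          # [phi(i1)]_{j+1,j} = [2m-j] / (q^(-m+j) + q^(m-j))
        i1[j + 1, j] = qnum(2 * m - j, q) / (q**(-m + j) + q**(m - j))
    for j in range(1, N + 1):   # [phi(i1)]_{j-1,j} = -[j] / (q^(-m+j) + q^(m-j))
        i1[j - 1, j] = -qnum(j, q) / (q**(-m + j) + q**(m - j))
    i3 = np.diag([1j * qnum(-m + j, q) for j in range(N + 1)])
    i2 = q**0.5 * i3 @ i1 - q**(-0.5) * i1 @ i3   # from relation (7)
    return i1, i2, i3

q, N = 1.3, 4
i1, i2, i3 = classical_rep(N, q)
assert np.allclose(q**0.5 * i1 @ i2 - q**(-0.5) * i2 @ i1, i3)  # relation (5)
assert np.allclose(q**0.5 * i2 @ i3 - q**(-0.5) * i3 @ i2, i1)  # relation (6)
# phi(i1) and phi(i3) share the purely imaginary spectrum i[-m], ..., i[m]
print(np.sort_complex(np.round(np.linalg.eigvals(i1), 10)))
print(np.sort_complex(np.round(np.linalg.eigvals(i3), 10)))
```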
this triple forms an irreducible, so-called classical, representation of u′q(so3) of dimension $N+1$. the matrices $\varphi(i_1)$ and $\varphi(i_3)$ have the same eigenvalues, which follows from the classification of all irreducible representations (there is one classical representation per dimension, see [9]) and from the existence of a rotational automorphism which sends $i_1 \to i_2$, $i_2 \to i_3$, $i_3 \to i_1$. thus, we can construct a matrix p such that $\varphi(i_1) = p\,\varphi(i_3)\,p^{-1}$. we will show that the matrix p corresponds to q-racah polynomials. the general q-racah polynomials are defined by means of basic hypergeometric series as
$$r_n\bigl(\mu(x);\, \alpha, \beta, \gamma, \delta \,|\, q\bigr) = {}_4\varphi_3\!\left(\begin{matrix} q^{-n},\ \alpha\beta q^{n+1},\ q^{-x},\ \gamma\delta q^{x+1} \\ \alpha q,\ \beta\delta q,\ \gamma q \end{matrix}\,\middle|\, q;\, q\right) = \sum_{k=0}^{\infty} \frac{[q^{-n}]_k [\alpha\beta q^{n+1}]_k [q^{-x}]_k [\gamma\delta q^{x+1}]_k}{[\alpha q]_k [\beta\delta q]_k [\gamma q]_k}\,\frac{q^k}{[q]_k}, \tag{8}$$
where $r_n(x;\, \alpha, \beta, \gamma, \delta \,|\, q)$ is the n-th q-racah polynomial with parameters α, β, γ, δ and with $n = 0, 1, \dots, N$, where N is a nonnegative integer,
$$\mu(x) = q^{-x} + \gamma\delta q^{x+1}, \qquad [a]_k = \prod_{j=0}^{k-1}(1 - aq^{j}), \qquad [a]_0 = 1.$$
the parameters must satisfy $\alpha q = q^{-N}$ or $\beta\delta q = q^{-N}$ or $\gamma q = q^{-N}$. in the definition of basic hypergeometric orthogonal polynomials it is usually assumed that $q \in (0,1)$. however, in this calculation it is sufficient to assume $q \in \mathbb{R}\setminus\{-1, 0, 1\}$. the correspondence has the following form
$$-\mathrm{i}^{\,j}\, r_j\bigl(\mu(k);\, \alpha, \beta, \gamma, \delta \,|\, q\bigr) = [p^{-1}]_{kj} = w_j^{-1}\,\bar{p}_{jk}, \tag{9}$$
where i is the imaginary unit. the weight sequence and the parameters are
$$w_j = \frac{[q^{-N}]_j[-q^{-N}]_j}{[q]_j[-q]_j}\,\frac{1+q^{-N+2j}}{(-q^{-N})^{j}(1+q^{-N})}, \qquad \alpha = \beta = -\gamma = -\delta = \mathrm{i}\,q^{\frac{-N-1}{2}}. \tag{10}$$
from now on, we will omit the parameters of the polynomials and write only $r_n(\mu(x))$ instead of $r_n(\mu(x);\, \alpha, \beta, \gamma, \delta \,|\, q)$. in order to prove (9), we construct the recurrence relation and compare it to the general form of the recurrence relation for q-racah polynomials. the equation $p^{-1}\varphi(i_1) = \varphi(i_3)p^{-1}$ gives us the relation
$$-(q^{-N+k} - q^{-k})\, r_j(\mu(k)) = -q^{-N}\,\frac{1-q^{2j}}{1+q^{-N+2j}}\, r_{j-1}(\mu(k)) + \frac{1-q^{-2N+2j}}{1+q^{-N+2j}}\, r_{j+1}(\mu(k)). \tag{11}$$
we can see that this form corresponds to the general recurrence (see [10])
$$-(1-q^{-x})(1-\gamma\delta q^{x+1})\, r_n(\mu(x)) = a_n r_{n+1}(\mu(x)) - (a_n + c_n)\, r_n(\mu(x)) + c_n r_{n-1}(\mu(x)), \tag{12}$$
where
$$a_n = \frac{(1-\alpha q^{n+1})(1-\alpha\beta q^{n+1})(1-\beta\delta q^{n+1})(1-\gamma q^{n+1})}{(1-\alpha\beta q^{2n+1})(1-\alpha\beta q^{2n+2})}, \qquad c_n = \frac{q(1-q^{n})(1-\beta q^{n})(\gamma-\alpha\beta q^{n})(\delta-\alpha q^{n})}{(1-\alpha\beta q^{2n})(1-\alpha\beta q^{2n+1})},$$
if the parameters are set up as in (10). the way of deriving the weight sequence is similar to the former case. we again try to find a diagonal matrix a. however, there is no diagonal matrix that transforms $\varphi(i_1)$ to a hermitian matrix. nevertheless, we can transform $\varphi(i_1)$ to a symmetric matrix and then show that the transformed matrix is normal. the elements of a have to satisfy
$$a_j^2 = a_{j-1}^2\,\frac{[\varphi(i_1)]_{j,j-1}}{[\varphi(i_1)]_{j-1,j}} = a_{j-1}^2\,\frac{-(q^{2m-j+1}-q^{-2m+j-1})(q^{-m+j}+q^{m-j})}{(q^{j}-q^{-j})(q^{-m+j-1}+q^{m-j+1})}.$$
the elements are determined up to a multiplicative constant as the product
$$a_j^2 = \prod_{k=1}^{j}\left(-\frac{(q^{2m-k+1}-q^{-2m+k-1})(q^{-m+k}+q^{m-k})}{(q^{k}-q^{-k})(q^{-m+k-1}+q^{m-k+1})}\right) = \prod_{k=1}^{j} q^{2m}\,\frac{(1-q^{-4m+2k-2})(1+q^{-2m+2k})}{(1-q^{2k})(1+q^{-2m+2k-2})} = \prod_{k=1}^{j}\frac{(1-q^{-N+k-1})(1+q^{-N+k-1})(1+q^{-N+2k})}{q^{-N}(1-q^{k})(1+q^{k})(1+q^{-N+2k-2})} = \frac{[q^{-N}]_j[-q^{-N}]_j}{[q]_j[-q]_j}\,\frac{1+q^{-N+2j}}{(q^{-N})^{j}(1+q^{-N})}.$$
if we assume $q \in (0,1)$ then for all $k \geq 1$ the factor $1-q^{-N+k-1}$ is negative whereas the other factors are positive. therefore, $|a_j|^2 = (-1)^j a_j^2$. it can easily be seen that this holds for all $q \in \mathbb{R}\setminus\{-1, 0, 1\}$ by similar reasoning. finally, we have $|a_j|^2 = w_j$. now we just need to verify that $b := a^{-1}\varphi(i_1)a$ is normal, using the fact that b is symmetric. thus, we have to verify $\sum_l b_{jl}\bar{b}_{kl} = \sum_l \bar{b}_{jl}b_{kl}$.
we can just show that for all indices j, k, l we have
$$b_{jl}\,\bar{b}_{kl} = a_j^{-1}\,[\varphi(i_1)]_{jl}\, a_l\, \bar{a}_k^{-1}\,[\varphi(i_1)]_{kl}\, \bar{a}_l \in \mathbb{R}.$$
since $\varphi(i_1)$ is real and $a_l\bar{a}_l = |a_l|^2$, we just have to decide whether $a_j a_k$ is real for indices j, k whose difference is even (otherwise $[\varphi(i_1)]_{jl}$ or $[\varphi(i_1)]_{kl}$ is zero due to its special form). considering that $a_j^2$ is real and alternates in sign, we see that $a_j$ and $a_k$ are both real or both purely imaginary. therefore, $a_j a_k \in \mathbb{R}$.

4. conclusion

on the example of the algebra u′q(so3) we have shown that the existence of a complete classification of representations, together with a suitable use of some automorphism, can produce as a by-product a set of orthogonal polynomials (see also [11]). because the algebra u′q(so3) is a special case of the askey-wilson algebra aw(3), introduced by zhedanov [12], it would be nice to generalize this approach to this case (see also [13–16]).

references

[1] w. miller, jr. lie theory and difference equations. i. j math anal appl 28:383–399, 1969.
[2] t. h. koornwinder. krawtchouk polynomials, a unification of two different group theoretic interpretations. siam j math anal 13(6):1011–1023, 1982. doi:10.1137/0513072.
[3] p. feinsilver. lie algebras and recurrence relations. i. acta appl math 13(3):291–333, 1988.
[4] y. i. granovskĭı, a. s. zhedanov. orthogonal polynomials on lie algebras. izv vyssh uchebn zaved fiz 29(5):60–66, 1986.
[5] k. nomura, p. terwilliger. krawtchouk polynomials, the lie algebra sl2, and leonard pairs. linear algebra appl 437(1):345–375, 2012. doi:10.1016/j.laa.2012.02.006.
[6] a. m. gavrilik, n. z. iorgov. q-deformed algebras uq(son) and their representations. methods funct anal topology 3(4):51–63, 1997.
[7] a. m. gavrilik, n. z. iorgov, a. u. klimyk. nonstandard deformation u′q(son): the embedding u′q(son) ⊂ uq(sln) and representations. in symmetries in science, x (bregenz, 1997), pp. 121–133. plenum, new york, 1998.
[8] a. m. gavrilik, n. z. iorgov. representations of the nonstandard algebras uq(so(n)) and uq(so(n − 1, 1)) in gel'fand-tsetlin basis. ukraïn fiz zh 43(6-7):791–797, 1998. international symposium on mathematical and theoretical physics (kyiv, 1997).
[9] m. havlíček, s. pošta. on the classification of irreducible finite-dimensional representations of u′q(so3) algebra. j math phys 42(1):472–500, 2001. doi:10.1063/1.1328078.
[10] r. koekoek, r. f. swarttouw. the askey-scheme of hypergeometric orthogonal polynomials and its q-analogue, 1996. arxiv:math/9602214.
[11] p. terwilliger. leonard pairs and the q-racah polynomials. linear algebra appl 387:235–276, 2004. doi:10.1016/j.laa.2004.02.014.
[12] a. s. zhedanov. "hidden symmetry" of askey-wilson polynomials. teoret mat fiz 89(2):190–204, 1991. doi:10.1007/bf01015906.
[13] p. terwilliger. the universal askey-wilson algebra. sigma symmetry integrability geom methods appl 7: paper 069, 24, 2011. doi:10.3842/sigma.2011.069.
[14] p. terwilliger. the universal askey-wilson algebra and the equitable presentation of uq(sl2). sigma symmetry integrability geom methods appl 7: paper 099, 26, 2011.
[15] p. terwilliger, a. žitnik. distance-regular graphs of q-racah type and the universal askey–wilson algebra. j combin theory ser a 125:98–112, 2014. doi:10.1016/j.jcta.2014.03.001.
[16] p. terwilliger. the universal askey-wilson algebra and daha of type (c^∨_1, c_1). sigma symmetry integrability geom methods appl 9: paper 047, 40, 2013.
acta polytechnica doi:10.14311/ap.2015.55.0237 acta polytechnica 55(4):237–241, 2015 © czech technical university in prague, 2015 available online at http://ojs.cvut.cz/ojs/index.php/ap

about the uniformity and the stability of a volume discharge in helium at near-atmospheric pressure

v. s. kurbanismailova, o. a. omarova, g. b. ragimkhanova, a. a. aliverdieva,b,∗

a dagestan state university, gadjieva str. 43a, 367025, makhachkala, russia
b dsc institute for geothermal research of the russian academy of sciences, pr. shamilya 39a, 367030, makhachkala, russia
∗ corresponding author: aliverdi@mail.ru

abstract. we present experimental electrical and time-spatial characteristics of a volume discharge and of the transition from a volume burning stage into a channel mode at near-atmospheric pressure. we show that the discharge uniformity rises with the increase of cathode spot density and gas pressure.

keywords: discharge formation; volume discharge; streamer; discharge plasma.

1. introduction

one of the applications of volume discharges in inert gases is gas laser pumping. in this case, in order to increase the power characteristics of gas lasers, we need (i) to improve the pumping methods and (ii) to optimize the excitation conditions [1,2,3,4,5,6,7]. a problem of pumping optimization consists in obtaining certain electrical characteristics of the discharge plasma with a constant spatial uniformity during the pumping. discharge instability results in the transition from volume burning into a channel stage (contracted discharge). there can be various physical mechanisms responsible for discharge instabilities, and they depend on the gas (or the gas mixture) [8,9,10,11,12]. therefore the study of volume discharge properties in pure gases has both basic and practical interest. toward this general aim we have experimentally investigated, under a wide range of initial conditions, the plasma characteristics of volume and contracted discharges, as well as the processes of discharge contraction and plasma torch creation in helium at atmospheric pressure.

2. experimental installation and research methods

the experimental setup and research methods are similar to those described in our previous papers [13,14,15]. the gap under study (about 1 cm in length) is irradiated either by a spark discharge through the grid anode or by a uv source placed in the same gas at a distance of 5–7 cm from the main gap axis. the diameter of the electrodes is 4 cm. we have used electrodes of various shapes (plane and hemispherical, r = 30 cm) made of different materials: aluminium, stainless steel and copper. the pulsed voltage source generates voltage pulses with a variable amplitude of up to 30 kv and a front duration of ∼ 10 ns. the discharge voltage and current are measured with digital oscilloscopes. frame photographs of the discharge glow (showing the distribution of radiation intensity both along and across the electrodes) are obtained using a fer-2 streak camera with the umi-92 image tube. when photographing a discharge in the frame mode, the scanning voltage of the fer-2 is switched off.
frame photographs are synchronized with the electrical characteristics of the discharge by simultaneously supplying the triggering voltage pulse to the fer-2 and the signal of the discharge current (or voltage) pulse to double-beam storage oscilloscopes. streak images of the discharge glow (discussed in [15]) are also obtained and synchronized with the discharge current (or voltage) pulse by applying the signal of the current (or voltage) pulse to the deflecting plates of the umi-92 image tube, simultaneously with the discharge scanning. time-integrated photographs of the discharge glow with a high spatial resolution are taken using a digital camera.

3. results and discussion

we have investigated the discharge transition from volume burning into a channel stage in helium with a discharge area s = 12 cm2, a distance between the electrodes d = 1 cm, a gas pressure p = 1–5 atm, and a discharge voltage u from the static breakdown voltage up to hundreds of percent of overvoltage. as we can see from fig. 1, photos 1–4, a homogeneous volume discharge burns at small external fields (e0 < ecritical = 6 kv/cm). the growth of uncompleted anode channels (which are attached to cathode spots with high conductivity) starts from a current density of about 40 a/cm2 (see fig. 1, photos 5–6). an increase in current density up to 60 a/cm2 (see fig. 1, photos 7–11) leads to further promotion of uncompleted anode channels, anode spotting, and also to the appearance of uncompleted cathode channels. when the current density surpasses 100 a/cm2, the anode and cathode channels merge (fig. 1, photo 12).

figure 1. time-integrated photos of the discharge glow at atmospheric pressure.

figure 2. typical oscillograms of current and voltage: (a) u0 = 8 kv, p = 1 atm, d = 1 cm; (b) u0 = 9 kv, p = 3 atm, d = 1 cm. here, t0 is the beginning of the applied voltage growth; t1 is the beginning of the first voltage drop; t3 is the beginning of the contraction of the volume discharge into a spark channel; t3 − t2 is the volume phase duration.

the instant at which the discharge interval is overlapped by a plasma channel (fig. 1, photo 12) is clearly visible, when the growth of the channel conductivity causes a second sharp voltage drop (see fig. 2). in the coordinated pump mode, a specific heat input of ∼ 0.1 j/cm3 is provided, which is the maximum for helium in the homogeneous burning stage. the duration of discharge uniformity is adjusted by the reduction of the current density or by the increase of gas pressure. at a pressure of 5 atm, the step on the pulse voltage is practically not visible. in this case the volume discharge duration is defined by the switching time of the discharge current. the voltage pulse on the discharge interval thus smoothly falls to the arc value. oscillograms (fig. 2) and discharge glow pictures, with fixed spatial (fig. 1) and temporal (see [15,16]) resolutions, show on the one hand the dynamics of discharge development, and on the other hand allow us to define the duration of the breakdown stages.
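a quick order-of-magnitude check of these numbers takes a couple of lines of code. the sketch below is purely our illustration: it estimates how long the volume stage can last before the quoted specific heat input of ∼ 0.1 j/cm3 is deposited, taking the critical current density quoted above and a field of the order of the e/p value given later in the paper (an assumption of ours, not a stated input of this estimate):

```python
# illustrative estimate (ours): power density in the volume stage is j * e,
# so the specific heat input w is reached after tau_b = w / (j * e).
j = 40.0     # a/cm^2, critical current density quoted above
e = 3000.0   # v/cm, of the order of the e/p value quoted in the conclusions
w = 0.1      # j/cm^3, specific heat input in the homogeneous burning stage
tau_b = w / (j * e)
print(f"implied volume-stage duration: {tau_b * 1e6:.2f} us")  # ~0.83 us
```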
dependences of the burning voltage of the volume discharge on pressure: (1) e/p = 3 kv/(cm atm); (2) e/p = 3.5 kv/(cm atm).

the reduction of the duration of the volume burning stage with growing pressure is the result of the uncompensated growth of the number of ionization processes relative to the recombination ones. the increase in gas pressure results in an increase of the voltage at the discharge column. this results in the growth of ionization due to shock ionization, which is caused by the strong dependence of the factor α on e0, as well as due to step ionization. the rapid uncompensated growth of the electron concentration then results in a growth of conductivity and in the drop of the voltage down to the arc value. afterwards, the discharge moves to a recombination mode and dies. increasing the overvoltage up to 300 % causes the appearance of a large number of plasma channels with a rather large diameter. the time-differentiation of the volume and high-current stages is well visible on the oscillograms of the discharge current or of the voltage on the plasma channel. at the volume stage of discharge burning the voltage $u_b = \mathrm{const.}$, and this constant value depends on pressure (see fig. 3), corresponding to the minimal breakdown voltage at the fixed value of $pd$, where $p$ is the gas pressure and $d$ is the gap length. to obtain a sparkless mode we need a full dissipation of the storage element energy ($c = 1.5 \cdot 10^{-8}$ f) during the burning time $\tau_b$ [1]. for volume discharges in he this requirement is reached at $u_0 = 2u_b$, where $u_b \approx 3000$ v is the volume discharge burning voltage (at p = 1 atm, d = 1 cm). the analysis of such measurements shows that the spark channel in this case is initiated by instabilities in the near-electrode areas [15,18]. these instabilities define the binding of narrow diffusive channels and cause the transition from a volume discharge mode into a spark mode. figure 4 shows the distributions of the radiation intensity both along the field and across the electrodes. from this figure it follows that the contraction process is governed by near-electrode phenomena. on the other hand, the bulk of the energy is deposited in the discharge during the quasi-stationary stages. then, for the energy density, it is possible to write

$$w = \frac{P\,\tau_b}{V} = \frac{I\,U\,\tau_b}{S\,d} = \frac{j\,U\,\tau_b}{d}, \quad (1)$$

where $\tau_b$ is the volume stage duration, $P$ is the deposited power, and $j$ is the current density. the energy delivered to the gas before the formation of the spark channel increases with the input power, although the burning duration $\tau_b$ decreases exponentially with the field growth [16]. finally, the volume discharge contracts into a spark channel at the critical current density $j \geq j_{critical} \approx 40$ a/cm² and extreme specific energy inputs of ≈ 0.1–0.2 j/cm³ [13]. in the voltage drop stage (when we have a quasineutral plasma column with cross-section area $S$ and electron density $n_e$ in the discharge gap), the resistance of this column is defined by

$$R = \frac{U}{I} = \frac{p\,d}{S\,e\,n_e\,k}, \quad (2)$$

where $\mu$ and $v_{dr} = kE/p = \mu E$ are the mobility and the drift speed of the electrons ($E$ is the electric field strength and $e$ the elementary charge); accordingly, $k = \mu p = 6.72 \cdot 10^{5}$ cm² torr/(v s) [1]. for our experiment s = 12 cm², p = 760 torr, d = 1 cm and $n_e \approx 10^{13}$–$10^{14}$ cm⁻³; the resistance ranges from 10 to 100 ω.
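as a quick consistency check of (1) and (2), the sketch below evaluates the plasma-column resistance for the quoted experimental values, and the volume-stage duration implied by the ∼ 0.1 j/cm³ specific heat input; it is a minimal illustration in python, using only the numbers quoted above (u_b ≈ 3000 v, j ≈ 40 a/cm², etc.), not additional measurements.

```python
# minimal consistency check of eqs. (1) and (2), using the values quoted
# in the text: s = 12 cm^2, d = 1 cm, p = 760 torr, u_b ~ 3000 v,
# j_critical ~ 40 a/cm^2, w ~ 0.1 j/cm^3, k = 6.72e5 cm^2*torr/(v*s).

E_CHARGE = 1.602e-19          # elementary charge [c]
K_MOBILITY = 6.72e5           # k = mu*p [cm^2 torr / (v s)]

def column_resistance(p_torr, d_cm, s_cm2, n_e_cm3):
    """eq. (2): r = p*d / (s * e * n_e * k), in ohms."""
    return p_torr * d_cm / (s_cm2 * E_CHARGE * n_e_cm3 * K_MOBILITY)

def burning_time(w_j_cm3, d_cm, j_a_cm2, u_volt):
    """eq. (1) solved for tau_b: tau_b = w*d / (j*u), in seconds."""
    return w_j_cm3 * d_cm / (j_a_cm2 * u_volt)

for n_e in (1e13, 1e14):
    r = column_resistance(760.0, 1.0, 12.0, n_e)
    print(f"n_e = {n_e:.0e} cm^-3 -> r = {r:.0f} ohm")
# roughly 59 ohm and 6 ohm: of the order of the 10-100 ohm range quoted

tau_b = burning_time(0.1, 1.0, 40.0, 3000.0)
print(f"implied volume-stage duration tau_b ~ {tau_b * 1e6:.1f} us")
```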
4. conclusions to sum up, we can observe a clear sequence of events: (1.) the occurrence of cathode spots in the initial stage of the discharge; (2.) the development of uncompleted anode channels; (3.) the formation of uncompleted cathode channels; and finally (4.) the merging of the counter channels and the growth of their conductivity. under conditions of strong preliminary ionization and over a wide range of initial voltages the discharge has a volume structure, and the duration of the volume stage decreases with the growth of the initial voltage and of the gas pressure. at an extreme specific heat input of ≈ 0.1 j/cm³ and a critical current density $j_{critical} \geq 40$ a/cm², the discharge contracts into a spark channel. the burning voltage $u_b$ at various values of e/p tends to reach the value at which $u_b/pd$ is constant; at the same time the ionization ability of the electrons, $\eta = \alpha/e_0$, is maximal and optimal for electron multiplication. the ratio e/p in the volume discharge plasma does not depend on the initial voltage (for p = 1 atm, e/p ≈ 3 · 10³ v/(cm atm)). the volume discharge burning voltage, determined basically by the gas pressure, depends on it linearly over a quite wide range.

acknowledgements the work was partially supported by rfbr (12-02-96505, 12-01-96500, 12-01-96501) and it was written within the framework of the state task 2.3142.2011 of the ministry of science and education of the russian federation for 2012–2014.

references
[1] korolev yu.d. and mesyats g.a. physics of pulsed gas breakdown. nauka, moscow, 1991 (in russian).
[2] mesyats g.a. and korolev yu.d. high-pressure volume discharges in gas lasers. sov. phys. usp. 29, 1986, 57–69. doi:10.1070/pu1986v029n01abeh003083
[3] woodard b.s., zimmerman j.w., benavides g.f., et al. demonstration of an iodine laser pumped by an air–helium electric discharge. j. phys. d: appl. phys. 43, 2010, 025208. doi:10.1088/0022-3727/43/2/025208
[4] meek j.m. and craggs j.d. electrical breakdown of gases. quarterly journal of the royal meteorological society 80 (344), 1954, 282–283. doi:10.1002/qj.49708034425
[5] panchenko a.n., tarasenko v.f., belokurov a.n., mendoza p., and rios i. planar krcl excilamp pumped by transverse self-sustained discharge with optical system for radiation concentration. phys. scr. 74, 2006, 108–113. doi:10.1088/0031-8949/74/1/013
[6] feng x., zhu s. investigation of excimer ultraviolet sources from dielectric barrier discharge in krypton and halogen mixtures. phys. scr. 74, 2006, 322–325. doi:10.1088/0031-8949/74/3/005
[7] lisenko a.a., lomaev m.i., skakun v.s., and tarasenko v.f. effective emission of xe2 and kr2 bounded by a dielectric barrier. phys. scr. 76 (3), 2007, 211–215. doi:10.1088/0031-8949/76/3/001
[8] raizer y.p. gas discharge physics. berlin, springer, 1991. doi:10.1007/978-3-642-61247-3
[9] pancheshnyi s., nudnova m. and starikovskii a. development of a cathode-directed streamer discharge in air at different pressures: experiment and comparison with direct numerical simulation. phys. rev. e 71 (1), 2005, 016407. doi:10.1103/physreve.71.016407
[10] morrow r. and lowke j. streamer propagation in air. j. phys. d: appl. phys. 30, 1997, 614–627. doi:10.1088/0022-3727/30/4/017
[11] clevis t.t.j., nijdam s. and ebert u. inception and propagation of positive streamers in high-purity nitrogen: effects of the voltage rise rate. j. phys. d: appl. phys. 46, 2013, 045202. doi:10.1088/0022-3727/46/4/045202
[12] bologa a., paur h.r., seifert h., and woletz k. influence of gas composition, temperature and pressure on corona discharge characteristics. international journal on plasma environmental science & technology 5, 2011, 110–116. http://www.iesj.org/html/service/ijpest/vol5_no2_2011/ijpest_vol5_no2_02_pp110-116.pdf
[13] kurbanismailov v.s. and omarov o.a. the behavior of volume discharge contraction in helium at atmospheric pressure. teplofiz. vys. temp. 33, 1995, 346. https://getinfo.de/app/the-behavior-of-volume-discharge-contraction-in/id/blse%3aen029241892
[14] bairkhanova m.g., gadzhiev m.h., kurbanismailov v.s., omarov o.a., ragimkhanov g.b., kata a.j. the front wave instability of cathode streamer ionization in the helium of high pressure. applied physics 5, 2009, 62–66. http://applphys.orion-ir.ru/appl-09/09-5/09-5-9e.htm
[15] omarov o.a., kurbanismailov v.s., arslanbekov m.a., gadjiev m.kh., ragimkhanov g.b., al-shatravi a.j.g. expansion of the cathode spot and generation of shock waves in the plasma of a volume discharge in atmospheric-pressure helium. plasma physics reports 38 (1), 2012, 22–28. doi:10.1134/s1063780x11120087
[16] kurbanismailov v.s., omarov o.a., ragimhanov g.b., and aliverdiev a.a. volume discharge in helium nearby atmospheric pressure: uniformity and stability. plasma physics and technology 1 (3), 2014, 112–114.
[17] kurbanismailov v.s., omarov o.a., ragimkhanov g.b., gadzhiev m.kh., bairkhanova m.g., and kattaa a.j. peculiarities of formation and development of initial stages of an impulse breakdown in argon. plasma physics reports 37 (13), 2011, 1166–1172. doi:10.1134/s1063780x11030068
[18] efendiev a.z. and aliverdiev a.a. multichannel helium discharge in a nonuniform field. radiophysics and quantum electronics 20 (8), 1977, 850–855. doi:10.1007/bf01038796

acta polytechnica doi:10.14311/ap.2013.53.0814 acta polytechnica 53(supplement):814–816, 2013 © czech technical university in prague, 2013 available online at http://ojs.cvut.cz/ojs/index.php/ap the na62 experiment at cern and the measurement of the ultra-rare decay k+ → π+νν̄ antonella antonelli∗, on behalf of the na62 collaboration i.n.f.n. – laboratori nazionali di frascati, via e. fermi 40, frascati (rome), italy ∗ corresponding author: antonella.antonelli@lnf.infn.it abstract.
the na62 experiment at cern aims at the very challenging task of measuring with 10 % relative error the branching ratio of the ultra-rare decay of the k+ into π+νν̄, which is expected to occur only in about 8 out of 10¹¹ kaon decays. this will be achieved by means of an intense hadron beam, an accurate kinematical reconstruction and a redundant veto system for identifying and suppressing all spurious events. good resolution on the missing mass in the decay is achieved using a high-resolution beam tracker to measure the kaon momentum and a spectrometer equipped with straw tubes operating in vacuum. hermetic veto (up to 50 mrad) of the photons from π0 decays is achieved with a combination of large-angle vetoes (with a creative reuse of the old opal lead-glass blocks), the na48 liquid krypton calorimeter and two small-angle calorimeters covering the angles down to zero. the identification of the muons and the consequent veto are performed by a fast hodoscope plane (used in the first level of the trigger to reduce the rate) and by a 17 meter, neon-filled rich counter which is able to separate pions and muons in the momentum interval between 15 and 35 gev. particle identification in the beam (k+ separation) is achieved with an h2 differential cherenkov counter. the trigger for the experiment is based on a multilevel structure, with a first level implemented in the readout boards and the subsequent levels done in software. the aim is to reduce the 10 mhz level-zero rate to a few khz sent to the cern computing centre. studies are underway to use gpu boards in some key points of the trigger system to improve the performance. keywords: na62, kaon, flavour. 1. introduction among the many flavour-changing neutral current rare k and b decays, the decays k → πνν̄ play a key role in the search for new physics. the standard model (sm) branching ratio can be computed to an exceptionally high degree of precision: the theoretical uncertainty comes mainly from the uncertainty on the ckm matrix elements, while the irreducible theoretical uncertainty amounts to less than 2.5 % for the neutral mode and 3.7 % for the charged one, and the latter could be further reduced by lattice calculations [1]. presently, the only existing measurement of k+ → π+νν̄ is based on seven signal events collected by bnl-ags e787(e949), which estimated a branching ratio of (1.73 +1.15/−1.05) · 10⁻¹⁰ [2]. however, only a measurement of the branching ratio with at least 10 % accuracy can be a significant test of new physics. 2. the na62 detector the requirement of 100 events leads to ∼ 10 % signal acceptance and to at least 10¹³ k+ decays. the required signal-to-background ratio demands a background suppression factor of at least 10¹². the principle of the experiment is a decay-in-flight technique: the signal is composed of an incoming mother particle (the k+) and an outgoing daughter particle (the π+) and nothing else, all the other k+ decay channels being background. the experiment will be housed in the cern north area high intensity facility (nahif) where na48 [3] was located, and it will use the same super proton synchrotron (sps) extraction line and target as na48 to produce a 75 gev/c (±1 %) positive hadron beam. two-body and three-body decay modes will be reduced by a factor of 10⁴ by cutting on the missing mass of the reconstructed candidates. for this purpose, a fast up-stream tracker of every particle in the beam is used to measure the incoming k+ momentum.
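to illustrate the kinematical handle, the sketch below computes the squared missing mass m²_miss = (p_k − p_π)² for the two-body background k+ → π+π0, and for k+ → µ+ν when the muon is (mis)assigned the pion mass; it is a schematic, textbook-level python example (collinear beam, no divergence), not na62 code.

```python
import math

M_K, M_PI, M_PI0, M_MU = 0.493677, 0.139570, 0.134977, 0.105658  # gev

def missmass2(p_k, p_pi):
    """squared missing mass (p_k - p_pi)^2 for 4-vectors (e, px, py, pz)."""
    e = p_k[0] - p_pi[0]
    px, py, pz = p_k[1] - p_pi[1], p_k[2] - p_pi[2], p_k[3] - p_pi[3]
    return e * e - px * px - py * py - pz * pz

def boost_z(p, beta):
    """boost a 4-vector (e, px, py, pz) along z with velocity beta."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    e, px, py, pz = p
    return (gamma * (e + beta * pz), px, py, gamma * (pz + beta * e))

pk = 75.0                               # 75 gev/c k+ along z
ek = math.sqrt(pk * pk + M_K * M_K)
k4 = (ek, 0.0, 0.0, pk)
beta = pk / ek

# k+ -> pi+ pi0: pion emitted at 90 deg in the kaon rest frame
pstar = math.sqrt((M_K**2 - (M_PI + M_PI0)**2)
                  * (M_K**2 - (M_PI - M_PI0)**2)) / (2 * M_K)
pi4 = boost_z((math.sqrt(pstar**2 + M_PI**2), pstar, 0.0, 0.0), beta)
print("k -> pi pi0:", missmass2(k4, pi4))   # ~ m_pi0^2 = 0.0182 gev^2

# k+ -> mu+ nu, with the muon given the pion mass hypothesis
pmu_star = (M_K**2 - M_MU**2) / (2 * M_K)
mu4 = boost_z((math.sqrt(pmu_star**2 + M_MU**2), pmu_star, 0.0, 0.0), beta)
mu4_as_pi = (math.sqrt(sum(x * x for x in mu4[1:]) + M_PI**2),) + mu4[1:]
print("k -> mu nu :", missmass2(k4, mu4_as_pi))  # negative: away from signal
```

two-body backgrounds thus cluster at fixed m²_miss values (m²_π0 for k → ππ0, negative values for k → µν), while the three-body signal populates the regions in between, which is what the missing-mass cut exploits.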
this beam spectrometer (called gigatracker [4]) consists of 3 silicon pixel stations matching the beam size. the 18 000 pixels, each 300 µm × 300 µm, are formed by a 200 µm thick sensor bump-bonded onto 100 µm thick readout chips, thus keeping the total thickness below 5 · 10⁻³ radiation lengths. in order to provide the timing of the mother kaon and keep the pile-up at the 10 % level in an 800 mhz hadron beam, a 200 ps time resolution is required. downstream of a 60 m long fiducial region for k decays, a straw-chamber magnetic spectrometer (fig. 1, left) is used to measure the daughter particle momenta with high resolution [5]. further rejection of the kµ2,3,4 and ke2,3,4 backgrounds will be obtained with a ring-imaging cherenkov counter (rich), used to efficiently and non-destructively identify daughter pions and disentangle them from muons and electrons. the rich should provide 3-sigma π/µ separation in the 15 ÷ 35 gev/c pion momentum range, and a time resolution better than 100 ps should be guaranteed, to efficiently match with the gigatracker information. this performance will be obtained by using a 17 m long, 3 m diameter volume, filled with 1 atm of neon gas acting as a cherenkov radiator. mirrors at the downstream side of the volume will focus the rings of cherenkov light onto two separated regions on the upstream side. these regions will be instrumented with 2000 18 mm photomultiplier tubes (pmts). dedicated beam-tests of a 400-pmt prototype demonstrated muon rejection better than 1 %, with an overall pion loss of a few per mille and a time resolution better than 100 ps [6]. since it is critical to achieve sufficient rejection of kµ2 decays, additional information will be provided by the muon veto: a sampling calorimeter placed after the existing 27 x0 liquid krypton electromagnetic calorimeter of the na48 experiment (lkr [7]). rejection of background from nuclear interactions of charged beam particles other than k+ will be guaranteed by a differential cherenkov counter (cedar) placed before the kaons enter the decay region: the cherenkov photons radiated in a 6 m long vessel, filled with helium gas, are focussed by an optimised optical system onto eight fast pmts. rejection of modes with neutral pions and/or (possibly radiative) photons will be provided by the lkr calorimeter, complemented by high-efficiency photon-veto detectors covering 0 ÷ 50 mrad emission angles. this has to provide a rejection factor of 10⁸ against k+ → π+π0. photons emitted at a very small angle, < 2 mrad, will be detected by compact calorimeters in the forward direction, with a required inefficiency of < 10⁻⁶ above 6 gev. in the angular range between 1 mrad and 8 mrad, the lkr has an inefficiency measured to be < 10⁻⁵ for photons above 6 gev. at large angles, between 8 mrad and 50 mrad, a new system, called the large angle veto (lav), will provide detection with an inefficiency < 10⁻⁴ above 100 mev, as measured in test beams performed at the daφne beam test facility in frascati, using positrons [3]. the lav system is constituted by 12 stations of increasing diameter (see fig. 1, right), placed at different positions along the vacuum decay tube. each station is composed of four or five layers of sf57 lead-glass blocks, formerly used in the barrel of the opal electromagnetic calorimeter, arranged radially to form a ring-shaped sensitive area.
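the 15 ÷ 35 gev/c window quoted above for π/µ separation can be motivated with simple cherenkov kinematics. the following python sketch assumes a refractive index n − 1 ≈ 6.7 · 10⁻⁵ for neon at 1 atm (a typical textbook value, not a number taken from this paper) and estimates the thresholds and ring angles.

```python
import math

N_NEON = 1.000067                 # assumed refractive index of neon at 1 atm
M_PI, M_MU = 0.139570, 0.105658   # masses [gev]

def threshold_momentum(m, n=N_NEON):
    """cherenkov threshold: beta > 1/n  <=>  p > m / sqrt(n^2 - 1)."""
    return m / math.sqrt(n * n - 1.0)

def ring_angle_mrad(p, m, n=N_NEON):
    """cherenkov angle cos(theta) = 1/(n*beta), in mrad (none below threshold)."""
    beta = p / math.hypot(p, m)
    c = 1.0 / (n * beta)
    return None if c > 1.0 else 1e3 * math.acos(c)

print("p_thr(pi) ~ %.1f gev/c" % threshold_momentum(M_PI))   # ~12 gev/c
print("p_thr(mu) ~ %.1f gev/c" % threshold_momentum(M_MU))   # ~9 gev/c
for p in (15.0, 25.0, 35.0):
    print(p, "gev/c:", ring_angle_mrad(p, M_PI), ring_angle_mrad(p, M_MU))
# the pi/mu ring-angle difference shrinks with momentum, which is why
# 3-sigma separation is only guaranteed up to ~35 gev/c
```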
the layers are staggered to guarantee that incident particles cross at least three blocks: the total thickness ranges from 29 to 37 x0. the cherenkov light is read out by 2-inch pmts. with 32 to 48 crystals per layer, a total of 2496 blocks will be used. a double time-over-threshold (tot) discriminator, with multiple adjustable thresholds [8], will be used in order to be able to reconstruct the charge of the wide-dynamic-range signals coming from 0.02 ÷ 20 gev photons. a scheme of the na62 experimental layout is drawn in fig. 2. the second lav station (a2) was tested in the t9 area at the cern ps by means of a beam composed of electrons, pions, and muons with energies between 0.3 and 10 gev. the timing performance was proven to be excellent [10]: the time resolution is ∼ 200 ps/√e[gev] for a single block. a measurement of the energy can be obtained by means of the two tot values; the energy resolution obtained in this fashion is σ(e)/e = 9.2 %/√e[gev] + 5 %/e[gev] + 2.5 %.

figure 1. a picture of the straws (left) and of the a1 station of the lav (right).

in order to extract a few interesting decays from a very intense flux, a complex, high-performance three-level trigger and data acquisition system (tdaq [11]) was designed. the tdaq is a unified, completely digital system: the readout data, stored in large buffers waiting for trigger decisions, are exactly the same data as used to construct the trigger primitives. the level 0 (l0) trigger algorithm is based on the presence of a charged particle in the rich and on veto conditions on the lkr, lav and muon veto detectors, and is performed by dedicated custom hardware modules, with a maximum output rate of 1 mhz and a maximum latency of 1 ms. the level 1 and level 2 software triggers are executed on a dedicated pc farm. the maximum level 1 (level 2) output rate is of the order of 100 (10) khz. due to the large amount of data to be processed in a reasonable time, the number of pc cores at l1 will be quite large. the use of a dedicated gpu-based farm is under evaluation [12]. after this level of selection the data from all the detectors will arrive at l2 through a network switch; the event will be fully reconstructed in order to apply a tighter selection based on the full kinematics. 3. status and plans the na62 collaboration has completed many of the intense r&d programs on the various sub-detectors, and is now progressing in building and installing the experiment: the new beam-line, the muon veto, and the magnetic straw-chamber spectrometer will be completed or close to completion by fall 2012. as far as the lav system is concerned, 8 stations out of 12 are successfully installed and tested. a technical run is planned for 2013 before the sps undergoes the shutdown for the lhc injection chain improvement. the readout of the liquid krypton calorimeter is being consolidated, and the tdaq and computing system development is currently under way. in fact, a test of the whole electronics has been performed in the experimental area.

figure 2. scheme of the na62 layout.
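as a quick numerical reading of the resolution formula above, the sketch below evaluates σ(e)/e at a few photon energies; it is a minimal python illustration of the quoted parametrisation, nothing more.

```python
import math

def lav_sigma_over_e(e_gev):
    """quoted lav tot-based resolution: 9.2%/sqrt(e) + 5%/e + 2.5% (e in gev)."""
    return 0.092 / math.sqrt(e_gev) + 0.05 / e_gev + 0.025

for e in (0.1, 0.5, 1.0, 5.0, 20.0):
    print(f"e = {e:5.1f} gev -> sigma(e)/e = {lav_sigma_over_e(e) * 100:.1f} %")
# the stochastic and noise terms dominate for soft photons; the constant
# 2.5 % term sets the floor for the highest-energy photons
```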
on the other hand, the rich and the gigatracker will be ready for the full physics run, right after the restart of the fixed-target program of cern subsequent to the long sps shutdown.

references
[1] j. brod et al., phys. rev. d83, 034030 (2011); c. amsler et al., phys. lett. b667, 1 (2008)
[2] j. brod and m. gorbahn, phys. rev. d78, 034006 (2008)
[3] p. valente et al., nucl. instrum. meth. a628, 411 (2011)
[4] m. fiorini et al., nucl. instrum. meth. a628, 292 (2011)
[5] p. lichard, jinst 5, c12053 (2010)
[6] m. lenti et al., nucl. instrum. meth. a639, 20 (2011)
[7] m. jeitler et al., nucl. instrum. meth. a494, 373 (2002)
[8] a. antonelli et al., jinst 7, c01097 (2012)
[9] na62 collaboration, cern-spsc-2007-035, spsc-m-760 (2007)
[10] f. ambrosino et al., ieee nss/mic (2011)
[11] c. avanzini et al., nucl. instrum. meth. a623, 543 (2010)
[12] g. lamanna et al., nucl. instrum. meth. a628, 457

acta polytechnica doi:10.14311/ap.2013.53.0641 acta polytechnica 53(supplement):641–645, 2013 © czech technical university in prague, 2013 available online at http://ojs.cvut.cz/ojs/index.php/ap agile data center at asdc and agile highlights carlotta pittori a,b,∗, on behalf of the agile collaboration — a asi science data center, esrin, i-00044 frascati (rm), italy; b inaf-oar, via frascati 33, i-00040 monte porzio catone (rm), italy; ∗ corresponding author: carlotta.pittori@asdc.asi.it abstract. we present an overview of the main agile data center activities and the agile scientific highlights during the first 5 years of operations. agile is an asi space mission in joint collaboration with inaf, infn and cifs, dedicated to the observation of the gamma-ray universe. the agile satellite was launched on april 23rd, 2007, and is devoted to gamma-ray astrophysics in the 30 mev ÷ 50 gev energy range, with simultaneous x-ray imaging capability in the 18 ÷ 60 kev band. despite its small size and budget, agile has produced several important scientific results, including the unexpected discovery of strong and rapid gamma-ray flares from the crab nebula over daily timescales. this discovery won the agile pi and the agile team the prestigious bruno rossi prize for 2012, an international award in the field of high-energy astrophysics. thanks to its sky monitoring capability and fast ground segment alert system, agile is substantially improving our knowledge of the gamma-ray sky, also making a crucial contribution to the study of the terrestrial gamma-ray flashes (tgfs) detected in the earth atmosphere. the agile data center, part of the asi science data center (asdc) located in frascati, italy, is in charge of all the science-oriented activities related to the analysis, archiving and distribution of agile data. keywords: gamma-rays: observations, catalogs. 1. introduction the agile satellite [23], launched on april 23rd, 2007, is the first of a new generation of high-energy space missions based on solid-state silicon technology, substantially improving our knowledge of various known gamma-ray sources, such as supernova remnants and black hole binaries, pulsars and pulsar wind nebulae, blazars and gamma-ray bursts. moreover, agile has contributed to the discovery and study of new galactic gamma-ray source classes, of peculiar star systems and of mysterious galactic gamma-ray transients, including the observation that the energy spectrum of terrestrial gamma-ray flashes (tgfs) extends well above 40 mev.
the 2012 bruno rossi international prize was awarded to the pi, marco tavani, and the agile team for the discovery of rapid gamma-ray flares over daily timescales from the crab nebula, previously thought to be a sufficiently steady source in all wavebands from the optical to gamma rays [15]. there is no evidence of a correlation between the rapid high-energy gamma-ray flares and the decennial variation (∼ 7 %) of the hard x-ray emission reported from 2005 on by several instruments [29, and references therein]. the agile observations challenge emission models of pulsar wind interaction and particle acceleration processes. 2. the agile instrument the agile instrument shown in fig. 1 (a cube of 60 cm side, weighing only about 100 kg) consists of two detectors using silicon technology: a gamma-ray imager (agile-grid) [4, 20] and a hard x-ray detector (superagile) [10] for the simultaneous detection and imaging of photons in the 30 mev ÷ 50 gev and in the 18 ÷ 60 kev energy ranges. the instrument is completed by a calorimeter (energy range 250 kev ÷ 100 mev) [13] and by an anti-coincidence system [16]. agile is characterized by a very large field of view (∼ 3 sr), good angular resolution (∼ 36 arcmin at 1 gev for the grid, and a spatial resolution of 6 arcmin for superagile, corresponding to a positional accuracy of ∼ 1 ÷ 2 arcmin for detections of significance above 10σ in the energy range 18 ÷ 60 kev), as well as a small dead time (100 µs). these features make it a very good instrument for studying persistent and transient gamma-ray sources even on very short timescales.

figure 1. the agile payload.

figure 2. whole-sky agile intensity map (ph cm⁻² s⁻¹ sr⁻¹) in galactic coordinates and aitoff projection, for energies e > 100 mev, accumulated during the first ∼ 5 years of observations, up to december 31, 2011 (pointing + spinning observing modes). green circles: agile first catalog source positions [18].

3. the agile data center and data flow the agile data center (adc) is in charge of all the science-oriented activities related to the analysis, archiving and distribution of agile data. it is part of the asi science data center (asdc) located in frascati, italy, and it includes scientific personnel from both the asdc and the agile team. agile telemetry raw data (level-0) are downlinked every ∼ 100 min to the asi malindi ground station in kenya and transmitted first to the telespazio mission control center at fucino, and then to the adc within ∼ 5 min after the end of each contact. raw data are routinely archived, transformed into fits format through the agile pre-processing system [27] and processed using the scientific data reduction software tasks developed by the agile instrument teams and integrated into an automatic quick-look pipeline system developed at the adc. the agile-grid ground segment alert system is distributed among the adc and the agile team institutes, and it combines the adc quick-look [19] with the grid science monitoring system [5] developed by the agile team. automatic alerts to the agile team are generated within ∼ 100 minutes after the tm down-link start (t0) at the ground station. grid alerts are sent via email (and sms), both on a contact-by-contact basis and on a daily timescale. this fast ground segment alert system is very efficient, and leads to alerts within ∼ (2 ÷ 2.5) hours from an astrophysical event.
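the quoted latencies are mutually consistent, as a toy budget shows: with a contact every ∼ 100 min and alerts generated within ∼ 100 min of the down-link start, an event occurring at a random time waits on average half a contact interval before its data reach the ground. the python sketch below uses only the ∼ 100 min figures stated above; it is an illustrative model, not the adc's actual scheduling logic.

```python
# toy latency budget for the grid alert chain, using only the numbers
# quoted in the text: contacts every ~100 min, alerts generated within
# ~100 min after the telemetry down-link start (t0).
CONTACT_INTERVAL_MIN = 100.0   # time between malindi contacts
PROCESSING_MIN = 100.0         # alert generation after down-link start

def alert_delay_min(wait_fraction):
    """delay for an event occurring a given fraction of a contact
    interval before the next down-link."""
    return wait_fraction * CONTACT_INTERVAL_MIN + PROCESSING_MIN

best = alert_delay_min(0.0)    # event just before a contact
mean = alert_delay_min(0.5)    # event at a random time (average wait)
worst = alert_delay_min(1.0)   # event just after a contact
print(f"best {best/60:.1f} h, mean {mean/60:.1f} h, worst {worst/60:.1f} h")
# mean ~2.5 h, matching the quoted ~(2-2.5) h alert latency
```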
refined manual analyses of the most interesting alerts are performed every day (quick-look daily monitoring). public agile data and software are available at the adc web pages at asdc (http://agile.asdc.asi.it). more details on the adc organization and tasks will be given in a forthcoming publication [19]. during the first ∼ 2.5 years agile was operated in “pointing observing mode”, characterized by long observations called observation blocks (obs), typically of 2–4 weeks duration, mostly concentrated along the galactic plane following a predefined baseline pointing plan. on november 4, 2009, agile scientific operations were reconfigured following a malfunction of the rotation wheel that occurred in mid october, 2009. the satellite is currently operating regularly in “spinning observing mode”, surveying a large fraction (about 70 %) of the sky each day. the instrument and all the detectors are operating nominally, producing data with quality equivalent to that obtained in pointing mode. in these new attitude conditions, agile continuously scans a much larger fraction of the sky, with a smaller exposure to each region, and the superagile x-ray data analysis pipeline had to be fundamentally modified [8]. in nominal pointing conditions, the superagile fluxes were estimated with an exposure of about 3 ks while, in spinning mode, longer integration times (3.5 and 7 days) are now required to obtain equivalent exposures. the agile guest observer program has not suffered any interruption. figure 2 shows the total gamma-ray intensity above 100 mev as observed by agile up to december 31, 2011, during the first ∼ 5 years of observations (pointing plus spinning). the agile collaboration has published 98 astronomical telegrams (atels) (42 in pointing + 56 in spinning) and 37 gcns up to may, 2012. following the success of the mission, the agile operational lifetime has currently been extended by asi at least up to june, 2013. 4. five years of agile: main discoveries and surprises thanks to its sky monitoring capability and fast ground segment alert system, agile is very effective in detecting bright gamma-ray flares from blazars [28, and references therein], gamma-ray bursts [9, and references therein], and galactic gamma-ray transients [7, and references therein]. we present here a selection of the main agile science highlights during the first five years of operations, up to may 2012. first detection of a colliding wind binary system in gamma-rays by agile. agile provided the first gamma-ray detection above 100 mev of a colliding wind binary (cwb) system in the η-carinae region, a phenomenon never observed before [24]. the agile satellite repeatedly pointed at the carina region for a total of ∼ 130 days during the time period 2007 july–2009 january. agile detected a gamma-ray source (1agl j1043-5931) consistent with the position of the cwb massive system η-car. a 2-day gamma-ray flaring episode was also reported on 2008 oct. 11–13, possibly related to a transient acceleration and radiation episode of the strongly variable shock in the system. detection of gamma-ray emission from the vela-x pulsar wind nebula with agile. agile provided the first experimental confirmation of gamma-ray emission (e > 100 mev) from a pulsar wind nebula (pwn). pulsars are known to power winds of relativistic particles that can produce bright nebulae by interacting with the surrounding medium.
the agile detection of gamma-ray emission from the pwn vela-x, described in a science paper [17], constrains the particle population responsible for the gev emission and establishes a class of gamma-ray emitters that could account for a fraction of the unidentified galactic gamma-ray sources. subsequently, the nasa fermi satellite confirmed the vela-x gamma-ray detection, and has also firmly identified 4 other pulsar wind nebulae plus a large number of candidates. agile detections of microquasars in the cygnus region. microquasars are accreting black holes or neutron stars in binary systems with associated relativistic jets. before agile and fermi they had never been unambiguously detected emitting high-energy gamma rays. episodic transient gamma-ray flaring activity from a source positionally consistent with the microquasar cygnus x-1 was reported twice by agile [6, 21, 22]. the agile extensive monitoring of cygnus x-1 in the energy range 100 mev ÷ 3 gev during the period 2007 july–2009 october confirmed the existence of a spectral cutoff between 1 ÷ 100 mev during the typical hard x-ray spectral state of the source. however, even in this state, cygnus x-1 is capable of producing episodes of extreme particle acceleration on 1-day timescales. the agile first detection of a gamma-ray flare above 100 mev adds to the even shorter-lived detection in the tev range by magic in 2006 [3]. remarkably, agile also detected several gamma-ray flares from the microquasar cygnus x-3, as well as, for the first time, a weak persistent emission above 100 mev from the source [25]. there is a clear pattern of temporal correlations between the gamma-ray flares and transitional spectral states of the radio-frequency and x-ray emission: the flares are all associated with special cygnus x-3 radio and x-ray/hard x-ray states. gamma-ray flares occur either in coincidence with low hard x-ray fluxes or during transitions from low to high hard x-ray fluxes, and usually appear before major radio flares. the agile findings have also been confirmed by fermi-lat, which also detected the orbital period (4.8 hours) in gamma-rays, an unambiguous temporal signature of the microquasar [1]. in the 9 days from december 2 to december 11, 2009, a long-lasting mystery was solved: cygnus x-3 is able to accelerate particles up to relativistic energies and to emit gamma-rays above 100 mev.

figure 3. the cygnus region in gamma-rays: agile intensity map from 100 mev to 10 gev. data taken in ∼ 2-year pointing observing mode, from nov. 2007 to oct. 2009, corresponding to ∼ 13 Ms (megaseconds) net exposure time. figure adapted from g. piano's presentation, 9th agile workshop, 2012.

evidence of proton acceleration from agile observations of supernova remnant w44. agile discovered a pattern of gamma-ray emission from the supernova remnant w44 that, combined with the observed multifrequency properties of the source, can be unambiguously attributed to accelerated protons interacting with nearby dense gas. the agile gamma-ray imager reaches its optimal sensitivity just at the energies (in the 50 mev ÷ a few gev range) at which neutral pions, produced by proton–proton interactions, radiate with an unambiguous signature. up to now, a direct identification of the sites in our galaxy where proton acceleration takes place had been elusive. this important agile result is reported in [12]. agile contribution to tgf science. the agile space mission, primarily focused on the study of the gamma-ray universe, can also detect phenomena originating in the earth atmosphere. the agile minicalorimeter is indeed detecting terrestrial gamma-ray flashes (tgfs) associated with tropical thunderstorms: flashes that typically last a few thousandths of a second and produce gamma rays up to 100 mev, on timescales as short as < 5 ms [14]. agile joins other satellites in detecting tgfs, but its unique ability to detect photons of the highest energies within the shortest timescales makes it an ideal instrument for studying these impulsive phenomena. the crucial agile contribution to tgf science is the discovery that the tgf spectrum extends well above 40 mev, and that the high-energy tail of the tgf spectrum is harder than expected. tgfs can also be localized from space using the high-energy photons detected by the agile-grid detector [11].

figure 4. the crab nebula flare in september 2010, as observed by agile at energies above 100 mev [26].

crab nebula variability. during 2010, the agile discovery of crab nebula variability above 100 mev astonished the scientific community [2, 26]. astronomers had long believed the crab to be a constant source at the level of a few percent from optical to gamma-ray energies, an almost ideal standard candle [15]. although small-scale variations in the nebula are well known in the optical and in x-rays, when averaged over the whole inner region (several arcminutes across) the crab nebula had been considered essentially stable. however, in september 2010 agile detected a rapid giant gamma-ray flare over a daily timescale (see fig. 4) and, thanks to its rapid alert system, made the first public announcement on september 22, 2010. this finding was confirmed the next day by the fermi observatory. agile had also detected a flare from the crab in october 2007, and in sect. 6.1 of the first agile catalog paper [18] it was reported that anomalous flux values observed from the crab in 2007 were under investigation. from 2005 onwards, several instruments (swift, rxte, integral, the fermi gbm, etc.) have also reported long-term flux variations in the hard x-ray range. combined observations carried out from 2008 to 2010 with different instruments in overlapping energy bands agree in observing a ∼ 7 % decline of the crab 15 ÷ 50 kev flux over a ∼ 3 year timescale, and a similar decline is also observed in the 3 ÷ 15 kev data [29, and references therein]. the pulsed flux measured since 1999 is consistent with the pulsar spin-down, indicating that the observed changes are nebular. no evidence of a correlation between the rapid high-energy gamma-ray flares and the decennial variation of the hard x-ray emission has been found.
gamma-ray emission from cosmic sources at these energies is intrinsically non-thermal, and the study of the wide variety of gamma-ray sources, such as galactic and extragalactic compact objects, and of impulsive gamma-ray events such as far away grbs and very near tgfs, provides a unique opportunity to test theories of particle acceleration, and radiation processes in extreme conditions and it may help to shed light on the foundations of physics itself. full exploitation of the agile sensitivity near and below 100 mev, for which the agile exposure is competitive with that of fermi, is in progress. acknowledgements we acknowledge support from asi grants i/089/06/2 and i/042/10/0. references [1] abdo, a. a., et al. 2009, science, 326, 1512 [2] abdo, a. a. et al., 2011, science vol. 331, 739 [3] albert j. et al, 2007, apj, 665, l51 [4] barbiellini, g. et al., 2001, aip conf. proc. 587, 754 [5] bulgarelli, a. et al, asp conference series, 411, 362 (2009) [6] bulgarelli, a. et al., 2010, atel #2512 [7] chen., a. et al., 2010, pos (texas 2010) 095 [8] del monte, e. et al., 2010, spie 7732, section 4. [9] del monte, e. et al., 2012, mem. s.a.it, vol. 83, 302 [10] feroci, m. et al., 2007, nim a 581, 728 [11] fuschini, f. et al., 2011, grl 38, l14806 [12] giuliani, a. et al., 2011, apj 742, l30 [13] labanti, c. et al., 2006, spie, volume 6266, 62663q [14] marisaldi, m. et al., 2010, jgr 115, a00e13 [15] meyer, m. et al., 2010, a&a, 523 [16] perotti, f. et al., 2006, nim a 556, 228 [17] pellizzoni, a. et al., 2010, science, vol. 327 no. 5966, 663 [18] pittori, c. et al., 2009, a&a 506, 1563–1574 [19] pittori, c. et al. 2012, in preparation [20] prest, m. et al., 2003, nim a, 501, 280 [21] sabatini, s. et al.,2010a, atel #2715 [22] sabatini, s. et al., 2010b, apj 712, l10 [23] tavani, m. et al., 2009a, a&a, 502, 1015 [24] tavani, m., et al., 2009b, apj 698, l142 [25] tavani, m. et al., 2009c, nature 462, 620–623 [26] tavani, m. et al., 2011, science vol. 331, 736–739 [27] trifoglio, m., bulgarelli, a., gianotti, f. et al., 2008, proceedings of spie, vol. 7011, 70113e [28] vercellone, s. et al., 2010, mem. s.a.it, vol. 81, 248 [29] wilson-hodge, c.a. et al., 2011, apj, 727, l40 644 vol. 53 supplement/2013 agile data center at asdc and agile highlights discussion matteo guainazzi — can the “standard candle” quality of the crab nebula be retained if flares are identified and removed? carlotta pittori — i recall that the pulsed gamma-ray emission of the crab does not vary. the crab steady state (pulsar plus nebula) flux above 100 mev detected by agile is: fsteady = (2.2 ± 0.1)10−6 ph cm−2 s−1. by looking at fig. 4, where the dotted line and band marked in grey color show the average crab flux and the 3σ uncertainty range, we can see that at this uncertainty level variations outside the flares cannot currently be identified within errors, but a refined analysis is ongoing on this delicate subject. in summary i would say that in the majority of exposure windows the total gamma-ray flux, carefully excluding periods of enhanced emission, is consistent within errors with the standard candle value. andrzej zdziarski — is there a difference in spectral index between agile and fermi observations of cygnus x-3? carlotta pittori — there is a paper in progress on the gamma-ray flaring behavior and spectral constraints of cygnus x-3 (piano et al. 2012, accepted for publication in a&a). 
for the moment i refer you to the 9th agile workshop presentation by giovanni piano, in which the agile-grid flaring spectrum of the source is shown. within errors, the spectra of the two instruments appear to be consistent.

acta polytechnica vol. 42 no. 1/2002 non-isothermal diffusion of water vapour in porous building materials t. ficker, z. podešvová abstract: non-isothermal diffusion is analysed using fick’s laws. the exact relations for non-isothermal diffusion flux and partial pressure profiles in porous building materials are derived and discussed. keywords: fick’s laws, non-isothermal diffusion, partial pressure profile.

1 introduction in the national thermal standards of various countries [1], [2] there are diffusion data that were measured under some standard isothermal and isobaric conditions, e.g. for temperature $t_a = 283$ k and atmospheric pressure $p_a = 98\,066.5$ pa. the diffusion data determined under such conditions are often used in cases for which neither isothermal nor isobaric conditions are fulfilled. for example, when the condensation of water vapour in building envelopes is estimated, an isothermal state can rarely be assumed. the condensation is usually estimated for winter conditions, when the temperatures of the interior $t_1$ and exterior $t_2$ are quite different. in this case the effective temperature $t^* = (t_1 + t_2)/2$ is introduced and is assumed to be close to the standard temperature $t_a = 283$ k. it is assumed that the effective temperature $t^*$ is common for the whole envelope, i.e., it is the isothermal approximation with the single temperature $t^*$ that enables a numerical estimation of the non-isothermal condensation of water vapour diffusing through the envelope to be performed. for this purpose the coefficient of diffusion permeability $\delta$ is defined as follows

$$\delta(t^*) = \frac{g_\mathrm{d}\,d}{p_1 - p_2}, \quad (1) \qquad t^* = \frac{t_1 + t_2}{2}, \quad (2)$$

where $d$ is the thickness of the envelope (wall), $g_\mathrm{d}$ is the diffusion flux and $p_1$, $p_2$ are the partial pressures of water vapour at the two surfaces of the wall. some researchers have attempted to “improve” the mentioned isothermal approximation (1, 2) by measuring $g_\mathrm{d}$ not in the isothermal state $t \approx t_a$ but in non-isothermal conditions $t_1 \neq t_2$. it is clear that in such a situation relations (1, 2) become incorrect and must be replaced by different relations holding for the real non-isothermal state. the aim of this paper is to derive the corresponding relations that govern the non-isothermal diffusion of water vapour through porous materials. the starting point for this derivation will be fick’s laws of diffusion.

2 non-isothermal diffusion of water vapour if there are no sources of diffusing particles and unidirectional steady-state diffusion through a porous medium with diffusion constant $D$ has been established, then fick’s general equations can be rewritten in the simpler form

$$q_\mathrm{d} = -D\,\frac{\mathrm{d}c}{\mathrm{d}x}, \quad (3) \qquad \frac{\mathrm{d}}{\mathrm{d}x}\left(D\,\frac{\mathrm{d}c}{\mathrm{d}x}\right) = 0, \quad (4)$$

where the concentration $c$ of diffusing particles, i.e. molecules of water vapour, can be replaced by the partial pressure $p$ and the absolute temperature $t$, according to the equation of thermodynamic state

$$pV = mrt, \quad c = \frac{m}{V} \;\Rightarrow\; c = \frac{1}{r}\,\frac{p}{t}, \qquad r = 462\ \mathrm{J\,kg^{-1}\,K^{-1}}. \quad (5)$$

following the classical work of schirmer [3] and krischer [4], the diffusion constant $D$ of a porous building material depends on the atmospheric pressure $p_a$, the temperature $t_a = t$ and the type of porous material, which is represented by the diffusion resistance factor $\mu$, a purely material constant. for the “standard” pressure $p_a = 98\,066.5$ pa, the diffusion constant $D$ can be expressed as follows

$$D(t) = \frac{k}{\mu}\,t^{\,n}, \qquad n = 1.81, \quad k = 8.9718 \cdot 10^{-10}\ \mathrm{m^2\,s^{-1}\,K^{-1.81}}. \quad (6)$$
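as a quick plausibility check of the constants in (6), the following python snippet evaluates d(t) for open air (μ = 1) at the standard temperature 283 k; the result, ≈ 2.4 · 10⁻⁵ m²/s, is the familiar textbook magnitude for water vapour diffusing in air. the μ value for a real porous material is a material constant and is not specified in the text, so the second call uses a purely hypothetical value.

```python
# evaluate eq. (6): d(t) = (k/mu) * t**n, with the constants quoted above.
K = 8.9718e-10   # [m^2 s^-1 K^-1.81]
N = 1.81

def diffusion_constant(t_kelvin, mu=1.0):
    """schirmer-type diffusion constant; mu = 1 corresponds to open air."""
    return (K / mu) * t_kelvin ** N

print(diffusion_constant(283.0))          # ~2.4e-5 m^2/s in air
print(diffusion_constant(283.0, mu=5.0))  # hypothetical porous material
```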
inserting (5) and (6) into (3) and (4), fick’s equations are modified to

$$q_\mathrm{d} = -\frac{D(x)}{r}\,\frac{\mathrm{d}}{\mathrm{d}x}\!\left(\frac{p(x)}{t(x)}\right), \quad (7) \qquad \frac{k}{\mu r}\,\frac{\mathrm{d}}{\mathrm{d}x}\!\left[t^{\,n}(x)\,\frac{\mathrm{d}}{\mathrm{d}x}\!\left(\frac{p(x)}{t(x)}\right)\right] = 0, \quad (8)$$

with the following boundary conditions belonging to a non-isothermal wall of thickness $d$

$$\frac{p(0)}{t(0)} = \frac{p_1}{t_1}, \qquad \frac{p(d)}{t(d)} = \frac{p_2}{t_2}. \quad (9)$$

a linear temperature profile $t(x)$ inside the wall can be assumed, as follows

$$t(x) = t_1 - \frac{t_1 - t_2}{d}\,x = a + bx. \quad (10)$$

inserting (10) into fick’s equations (7), (8) and taking into account the first boundary condition (9), the corresponding solution can be found:

$$\frac{k}{\mu r}\,(a + bx)^n\,\frac{\mathrm{d}}{\mathrm{d}x}\!\left(\frac{p}{t}\right) = -g_\mathrm{d} = \mathrm{const.}, \quad (11)$$

$$\int_{p_1/t_1}^{p(x)/t(x)} \mathrm{d}\!\left(\frac{p}{t}\right) = -\frac{g_\mathrm{d}\,\mu r}{k} \int_0^{x} \frac{\mathrm{d}x'}{(a + bx')^n}, \quad (12)$$

$$\frac{p(x)}{t(x)} = \frac{p_1}{t_1} + \frac{g_\mathrm{d}\,\mu r}{k\,b\,(n-1)}\left[(a + bx)^{1-n} - a^{1-n}\right]. \quad (13)$$

inserting the second boundary condition (9) into (13), we can express the steady-state diffusion flux $g_\mathrm{d}$ going through the non-isothermal wall with the linear temperature profile (10):

$$g_\mathrm{d} = \frac{k\,b\,(n-1)}{\mu r}\cdot\frac{p_1/t_1 - p_2/t_2}{a^{1-n} - (a + bd)^{1-n}}. \quad (14)$$

the symbols $a$, $b$ in (14) can be specified using (10):

$$g_\mathrm{d} = \frac{k\,(n-1)}{\mu r\,d}\cdot\frac{(p_1/t_1 - p_2/t_2)\,(t_1 - t_2)}{t_2^{\,1-n} - t_1^{\,1-n}}. \quad (15)$$

relation (15) can be rearranged and an effective diffusion resistance $r_\mathrm{d}^{*}$ and “conductivity” $D_\mathrm{eff}^{*}$ may be introduced:

$$g_\mathrm{d} = \frac{c_1 - c_2}{r_\mathrm{d}^{*}}\ \mathrm{[kg\,m^{-2}\,s^{-1}]}, \qquad r_\mathrm{d}^{*} = \frac{d}{D_\mathrm{eff}^{*}}\ \mathrm{[m^{-1}\,s]}, \qquad D_\mathrm{eff}^{*} = \frac{k\,(n-1)}{\mu}\cdot\frac{t_1 - t_2}{t_2^{\,1-n} - t_1^{\,1-n}}\ \mathrm{[m^2\,s^{-1}]}, \quad (16)$$

where $c_1 = p_1/(r t_1)$ [kg m⁻³] and $c_2 = p_2/(r t_2)$ [kg m⁻³]. by means of relations (16) the non-isothermal diffusion flux is easily available. as can be seen, the relevant potentials responsible for the diffusion movement are not the partial pressures $p_1$, $p_2$ but the concentrations $c_1$, $c_2$ of the water vapour at the two wall surfaces. the differences between the non-isothermal diffusion fluxes determined according to the rigorous relation (16) and those of the isothermal approximation (1, 2), utilizing the effective temperature $t^*$, range from several per cent to several tens of per cent, depending on the boundary conditions. for completeness, the partial pressure profile $p(x)$ inside the wall should be analyzed. the function $p(x)$ is given by eqs. (10), (13) and (14):

$$p(x) = \left[t_1 - \frac{t_1 - t_2}{d}\,x\right]\left\{\frac{p_1}{t_1} + \left(\frac{p_2}{t_2} - \frac{p_1}{t_1}\right)\frac{\left[t_1 - \frac{t_1 - t_2}{d}\,x\right]^{1-n} - t_1^{\,1-n}}{t_2^{\,1-n} - t_1^{\,1-n}}\right\}, \quad (17)$$

and for $n = 1.81$ relation (17) reads

$$p(x) = \left[t_1 - \frac{t_1 - t_2}{d}\,x\right]\left\{\frac{p_1}{t_1} + \left(\frac{p_2}{t_2} - \frac{p_1}{t_1}\right)\frac{\left[t_1 - \frac{t_1 - t_2}{d}\,x\right]^{-0.81} - t_1^{-0.81}}{t_2^{-0.81} - t_1^{-0.81}}\right\}. \quad (18)$$

at first sight it is obvious that the $p(x)$ profile is not linear. nevertheless, for the usual temperature and partial pressure differences between outdoor and indoor spaces in our climatic region, the graph of $p(x)$ closely follows linear behaviour, as depicted in fig. 1.
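a compact numerical illustration of (16) and (18) may help: the python sketch below evaluates the non-isothermal flux and the pressure profile for the winter boundary conditions of fig. 1 (t1 = 293 k, p1 = 934.8 pa, t2 = 263 k, p2 = 156 pa, d = 0.2 m). the diffusion resistance factor is set to μ = 10 purely as a placeholder, since the paper leaves μ material-dependent.

```python
# sketch of eqs. (16) and (18); mu = 10 is an assumed, material-dependent
# placeholder: the paper fixes only k, n and r.
K = 8.9718e-10   # [m^2 s^-1 K^-1.81]
N = 1.81
R = 462.0        # specific gas constant of water vapour [j kg^-1 K^-1]

def flux(t1, p1, t2, p2, d, mu):
    """non-isothermal diffusion flux g_d from eq. (16), in kg m^-2 s^-1.
    assumes t1 != t2 (the non-isothermal case treated in the text)."""
    c1, c2 = p1 / (R * t1), p2 / (R * t2)
    d_eff = (K * (N - 1) / mu) * (t1 - t2) / (t2**(1 - N) - t1**(1 - N))
    return (c1 - c2) * d_eff / d

def pressure_profile(x, t1, p1, t2, p2, d):
    """partial pressure p(x) from eq. (18); note it is independent of mu."""
    t_x = t1 - (t1 - t2) * x / d
    frac = (t_x**(1 - N) - t1**(1 - N)) / (t2**(1 - N) - t1**(1 - N))
    return t_x * (p1 / t1 + (p2 / t2 - p1 / t1) * frac)

t1, p1, t2, p2, d = 293.0, 934.8, 263.0, 156.0, 0.2
print("g_d =", flux(t1, p1, t2, p2, d, mu=10.0), "kg m^-2 s^-1")
for x in (0.0, 0.05, 0.10, 0.15, 0.20):
    print(f"x = {x:.2f} m -> p = {pressure_profile(x, t1, p1, t2, p2, d):.1f} pa")
# the profile reproduces p1 and p2 at the surfaces and deviates only
# slightly from the straight line between them, in line with fig. 1
```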
3 conclusion this paper has presented the exact procedure (16) for the calculation of the non-isothermal diffusion flux of water vapour going through porous materials. it has presented a comparison between the approximative procedure (1, 2), based on the concept of the effective temperature $t^*$, and the exact procedure respecting the real non-isothermal conditions. the corresponding non-isothermal pressure profile $p(x)$ inside the wall proved to be almost linear under climatic conditions that are normal in the central european region (viz fig. 1).

fig. 1: partial pressure profile p(x) inside a wall of thickness d = 0.2 m with winter boundary conditions: φ1 = 40 % rh, t1 = 293 k, p1 = 934.8 pa; φ2 = 60 % rh, t2 = 263 k, p2 = 156 pa.

references
[1] czech thermal standard čsn 73 0540, čs. normalizační institut, praha 1994
[2] german thermal standard din 4108, deutsches institut für normung, berlin 1999
[3] schirmer, r.: diffusionszahl von wasserdampf–luftgemischen und die verdampfungsgeschwindigkeit. z. vdi–beil., verfahrenstechnik, no. 6/1938, pp. 170–177
[4] krischer, o.: grundgesetze der feuchtigkeitsbewegung in trockengütern. kapillarwasserbewegung und dampfdiffusion. z. vdi–beil., verfahrenstechnik, no. 6/1938, pp. 373–380

assoc. prof. rndr. tomáš ficker, drsc., phone: +420 5 4114 7661, e-mail: fyfic@fce.vutbr.cz, department of physics; ing. zdenka podešvová, department of building structures, faculty of civil engineering, university of technology, žižkova 17, 662 37 brno, czech republic

acta polytechnica doi:10.14311/ap.2013.53.0712 acta polytechnica 53(supplement):712–717, 2013 © czech technical university in prague, 2013 available online at http://ojs.cvut.cz/ojs/index.php/ap precise cosmic rays measurements with pamela a. bruno a,∗, o. adriani b,c, g. c. barbarino d,e, g. a. bazilevskaya f, r. bellotti a,g, m. boezio h, e. a. bogomolov i, m. bongi c, v. bonvicini h, s. borisov j,k,l, s. bottai c, f. cafagna a, d. campana e, r. carbone e,k, p. carlson m, m. casolino j, g. castellini n, l. consiglio e, m. p. de pascale j,k, c. de santis j,k, n. de simone j,k, v. di felice j, a. m. galper l, w. gillard m, l. grishantseva l, g. jerse h,o, a. v. karelin l, m. d. kheymits l, s. v. koldashov l, s. y. krutkov i, a. n. kvashnin f, a. leonov l, v. malakhov l, l. marcelli j, a. g. mayorov l, w. menn p, v. v. mikhailov l, e. mocchiutti h, a. monaco a,c, n. mori b,c, n. nikonov i,j,k, g. osteria e, f. palma j,k, p. papini c, m. pearce m, p. picozza j,k, c. pizzolotto h, m. ricci q, s. b. ricciarini c, l. rossetto m, r. sarkar h, m. simon p, r. sparvoli j,k, p. spillantini b,c, y. i. stozhkov f, a. vacchi h, e. vannuccini c, g. vasilyev i, s. a. voronov l, y. t. yurkin l, j. wu m, g. zampa h, n. zampa h, v. g. zverev l — a infn, sezione di bari, i-70126 bari, italy; b university of florence, department of physics, i-50019 sesto fiorentino, florence, italy; c infn, sezione di florence, i-50019 sesto fiorentino, florence, italy; d university of naples “federico ii”, department of physics, i-80126 naples, italy; e infn, sezione di naples, i-80126 naples, italy; f lebedev physical institute, ru-119991, moscow, russia; g university of bari, department of physics, i-70126 bari, italy; h infn, sezione di trieste, i-34149 trieste, italy; i ioffe physical technical institute, ru-194021 st.
petersburg, russia; j infn, sezione di rome “tor vergata”, i-00133 rome, italy; k university of rome “tor vergata”, department of physics, i-00133 rome, italy; l nrnu mephi, ru-115409 moscow, russia; m kth, department of physics, and the oskar klein centre for cosmoparticle physics, albanova university centre, se-10691 stockholm, sweden; n ifac, i-50019 sesto fiorentino, florence, italy; o university of trieste, department of physics, i-34147 trieste, italy; p universität siegen, department of physics, d-57068 siegen, germany; q infn, laboratori nazionali di frascati, via enrico fermi 40, i-00044 frascati, italy. ∗ corresponding author: alessandro.bruno@ba.infn.it abstract. the pamela experiment was launched on board the resurs-dk1 satellite on june 15th 2006. the apparatus was designed to conduct precision studies of the charged cosmic radiation over a wide energy range, from tens of mev up to several hundred gev, with unprecedented statistics. in five years of continuous data taking in space, pamela accurately measured the energy spectra of cosmic-ray antiprotons and positrons, as well as protons, electrons and light nuclei, sometimes providing data in unexplored energy regions. these important results have shed new light on several astrophysical fields, such as the indirect search for dark matter, the search for cosmological antimatter (anti-helium), and the validation of acceleration, transport and secondary production models of cosmic rays in the galaxy. some of the most important items of solar and magnetospheric physics were also investigated. here we present the most recent results obtained by the pamela experiment. keywords: cosmic rays, dark matter, antimatter, solar modulation, trapped radiation. 1. introduction: the pamela mission pamela – a “payload for antimatter matter exploration and light-nuclei astrophysics” [1] – is a satellite-borne experiment conceived to study charged particles in the cosmic radiation in a wide energy interval, ranging from several tens of mev to some hundreds of gev, and with unprecedented precision and sensitivity. it has been in orbit since june 15th 2006, when it was launched from the baikonur cosmodrome on board the resurs-dk1 russian satellite. pamela has been continuously taking data for more than 6 years, corresponding to > 10⁹ registered triggers and > 25 tb of down-linked data. a detailed description of the apparatus and of the methodologies involved in the data analysis can be found in publications [1–8]. pamela was designed and optimized to measure the rare antimatter component in the cosmic radiation. antiprotons and positrons, assumed to be created mostly in the interactions of cosmic rays (crs) with the interstellar medium, are fundamental in studies of the production and propagation of crs in the galaxy and, together with electrons e−, provide significant details not available from the investigation of the nuclear cr component. above all, cr antiparticle measurements have long been considered one of the most promising tools for indirect dark matter (dm) searches. predicted p̄ and e+ fluxes from dm particles could be detectable above the background from nuclear interactions through a distortion of the measured spectra. pamela also investigates the global matter/antimatter symmetry of the universe: in case of the existence of anti-galaxies, signals of antimatter (z ≥ 2) could be detected in the extragalactic radiation. the pamela design sensitivity allows the h̄e/he flux ratio to be accurately measured in a wide rigidity range. protons and helium nuclei constitute the most abundant cr components, providing fundamental information for understanding the origin and propagation of crs. pamela is able to measure their spectra with high precision in the largest explored interval, significantly constraining models. finally, concomitant pamela scientific goals include the investigation of the solar modulation of cr (anti)particles during the 24th solar minimum, and the study of the geomagnetically trapped radiation. 2. pamela results 2.1. antiparticles 2.1.1. antiprotons pamela provided precise antiproton measurements in the kinetic energy range 60 mev ÷ 180 gev [2, 3], significantly improving on the data from previous experiments, thanks to the high statistical significance and the large explored interval. the results on the p̄ spectrum and the p̄/p ratio are reported in fig. 1, with data from contemporary experiments and some theoretical calculations for a pure secondary production of antiprotons during the propagation of crs in the galaxy [3, and references therein].

figure 1. the p̄ spectrum (top) and the p̄/p flux ratio (bottom) measured by pamela, compared with data from other contemporary experiments and calculations for purely secondary production of antiprotons in the galaxy [3, and references therein].

2.1.2. positrons figures 2 and 3 report the positron fraction e+/(e+ + e−) and the positron spectrum measured by pamela between 1.5 ÷ 100 gev. the data are compared with data from other contemporary experiments and with predictions of a secondary production model [4, 5, and references therein]. pamela measurements cover a large energy interval, significantly reducing experimental uncertainties. 2.1.3. discussion pamela antiproton data reproduce the expected peak around ∼ 2 gev in the antiproton spectrum,
in case of the existence of anti-galaxies, signals of antimatter (z ≥ 2) could be detected in the extragalactic radiation. pamela design sensitivity allows the h̄e/he flux ratio to be accurately measured in a wide rigidity range. protons and helium nuclei constitute the most abundant cr components, providing fundamental information to understand the origin and propagation of cr. pamela is able to measure their spectra with high precision in the largest explored interval, significantly constraining models. finally, concomitant pamela scientific goals include the investigation of solar modulation of cr (anti)particles during the 24th solar minimum, and the study of the geomagnetically trapped radiation. 2. pamela results 2.1. antiparticles 2.1.1. antiprotons pamela provided precise antiproton measurements in the kinetic energy range 60 mev ÷ 180 gev [2, 3], significantly improving data by previous experiments, thanks to the high statistical significance and the large explored interval. results about the p̄ spectrum and kinetic energy [gev] -1 10 1 10 2 10 -1 ] 2 s s r 2 a n ti p ro to n f lu x [ g e v m -6 10 -5 10 -4 10 -3 10 -2 10 -1 10 ams (m. aguilar et al.) bess-polar04 (k. abe et al.) bess1995-97 (s. orito et al.) caprice1998 (m. boezio et al.) caprice1994 (m. boezio et al.) pamela donato et al. 2001 ptuskin et al. 2006 kinetic energy [gev] -1 10 1 10 2 10 a n ti p ro to n -t o -p ro to n f lu x r a ti o -6 10 -5 10 -4 10 -3 10 bess 95-97 (s. orito et al.) bess-polar 2004 (k. abe et al.) caprice 1994 (m. boezio et al.) caprice 1998 (m. boezio et al.) heat-pbar 2000 (y. asaoka et al.) pamela donato et al. 2009 ptuskin et al. 2006 figure 1. the p̄ spectrum (top) and the p̄/p flux ratio (bottom) measured by pamela, compared with data from other contemporary experiments and calculations for purely secondary production of antiprotons in the galaxy [3, and references therein]. the p̄/p ratio are reported in fig. 1, with data from contemporary experiments and some theoretical calculations for a pure secondary production of antiprotons during the propagation of crs in the galaxy [3, and references therein]. 2.1.2. positrons figures 2 and 3 report the positron fraction e+/(e+ + e−) and the positron spectrum measured by pamela between 1.5 ÷ 100 gev. the data are compared with data from other contemporary experiments and predictions of a secondary production model [4, 5, and references therein]. pamela measurements cover a large energy interval, significantly reducing experimental uncertainties. 2.1.3. discussion pamela antiproton data reproduce the expected peak around ∼ 2 gev in the antiproton spectrum, 713 a. bruno et al. acta polytechnica energy [gev] 1 10 2 10 p o s it ro n f ra c ti o n 0.005 0.01 0.02 0.1 0.2 0.3 0.4 pamela fermi 2011 clem & evenson 2007 heat00 ams caprice94 heat94+95 ts93 mass89 muller & tang 1987 moskalenko and strong, apj 493, 694 (1998) figure 2. the pamela positron fraction e+/(e+ + e−) compared with other experimental data and with predictions of a secondary production model [4, 5, and references therein]. and appear to be consistent with pure secondary calculations, excluding an appreciable contribution from exotic processes in that energy range. on the other hand, pamela positron measurements exhibit significant features. the discrepancy with previous experimental positron data at low energy (< 5 gev) can be interpreted as a consequence of charge dependent solar modulation effects, affecting positrons and electrons differently. 
above ∼ 5 gev the measured positron fraction significantly deviates from the predictions of secondary production models, increasing with energy. other experimental data in this range, while affected by uncertainties too large to allow any firm conclusion, are consistent with the excess that is clearly shown by pamela. this unexpected rise of the positron fraction has triggered a considerable number of possible interpretations based on the existence of standard or exotic primary sources. these models are significantly constrained by the antiproton data which, by contrast, appear to be in agreement with the predictions of a purely secondary production. further limits are provided by the measurement of diffuse gamma rays. even when the large theoretical uncertainties affecting positron fraction estimates [10] are taken into account, the presence of an excess appears manifest and consistent.

as already proposed several years ago [11], a possible enhancement of the e± flux could be explained by astrophysical sources like nearby pulsars (e.g., see [12–14]). in that case no sizeable contribution to antiprotons is predicted, while counterparts in γ-rays are expected. alternatively, positrons can be created as secondary products of hadronic interactions inside supernova remnants (snrs); this secondary production takes place in the same region where the crs are being accelerated, and old snrs appear to be the best candidates [15]. however, according to this scenario, counterparts in the γ-ray and antiproton channels are expected [16], and an increase in the boron/carbon ratio should be observed at high energy [17].

figure 3. pamela preliminary results on the positron spectrum, compared with other experimental data (fermi, heat, ams, caprice), with the predictions of a standard secondary production model [9], and with a recent calculation assuming additional primary e± sources [14].

the dm possibility, with annihilations in the halo of the milky way providing the anomalous antiparticle flux, is of great interest from the particle physics viewpoint. minimal dm models can give a reasonably good fit to the pamela positron data, while the antiproton data put strong constraints on dm annihilations, disfavoring channels with gauge bosons, higgs bosons or quarks. nevertheless, the required hard spectrum would result from combining a very high dm particle mass (∼ 1 ÷ 10 tev) with a very efficient enhancement mechanism for the annihilation into charged gauge bosons [18]. further possibilities are provided by dm models assuming a dominant leptonic channel, which can fit the pamela positron and antiproton measurements as well [12, 18]. alternatively, wino-like neutralinos [19], kaluza–klein particles [20], and possibly radiative corrections [21] were proposed as candidates.

2.2. electrons

the electron (e−) spectrum measured by pamela in the kinetic energy interval 1 ÷ 625 gev is shown in fig. 4, together with other recent measurements of the electron and of the electron plus positron (e− + e+) flux [6, and references therein]. the pamela data cover the largest energy range ever achieved, with no atmospheric overburden.

2.2.1. discussion

discrepancies at low energies are partially due to solar modulation effects. the results do not show any significant spectral features and can be interpreted
in terms of conventional diffusive propagation models. in spite of the softer spectrum, there is no significant disagreement with the measurements of the atic and fermi experiments. the existing data are also consistent with calculations including new cr sources that could explain the growing positron component [14].

figure 4. the pamela electron (e−) spectrum compared with recent electron and electron plus positron (e− + e+) data [6, and references therein].

2.3. protons and helium nuclei

protons and helium nuclei represent by far the most abundant components of the cosmic radiation. their measurement constrains models of cr origin and propagation in the galaxy. the pamela collaboration has recently published an accurate measurement of the proton and helium spectra in the rigidity range between 1 gv and 1.2 tv. the results are shown in fig. 5, together with a compilation of recent measurements. the pamela data are consistent with those of other experiments, considering the statistical and systematic uncertainties of the various experiments [7, and references therein].

to gain a better understanding of the results, the h and he data were also analyzed in terms of rigidity instead of kinetic energy per nucleon (see fig. 6). several important features emerge. firstly, the proton and helium spectra are characterized by significantly different spectral indices, $\Delta\gamma_R = \gamma_R^{\mathrm H} - \gamma_R^{\mathrm{He}} = 0.101 \pm 0.0014\,(\mathrm{stat}) \pm 0.0001\,(\mathrm{sys})$. this aspect is also evident in fig. 7, where the proton-to-helium flux ratio is reported as a function of rigidity: it decreases smoothly with increasing rigidity. moreover, the pamela data significantly differ from a pure single power-law model: the spectra gradually soften in the rigidity range 30 ÷ 230 gv, and at 230 ÷ 240 gv they exhibit an abrupt spectral hardening (see fig. 8).

figure 5. proton and helium absolute fluxes measured by pamela above 1 gev/n, compared with a few of the previous measurements [7, and references therein]. the shaded area represents the estimated systematic uncertainty.

figure 6. proton (top points) and helium (bottom points) data measured by pamela in the rigidity range 1 gv ÷ 1.2 tv [7, and references therein]. the lines represent the fit with a single power law and the galprop [23] and zatsepin & sokolskaya [22] models.

2.3.1. discussion

while the differences with other experiments at low energies (< 30 gev) are explainable in terms of solar modulation effects, the hardening in the spectra observed by pamela around 200 gv challenges the standard cr scenario, and could be an indication of different populations of cr sources. as an example of a multi-source model, in figs. 6 and 7 the pamela results are compared with a calculation by zatsepin & sokolskaya [22], which assumes nova stars and explosions in super-bubbles as additional cr sources. the blue and red curves denote fits obtained by tuning the model parameters in order to match the atic-2 [24] and pamela data, respectively.

figure 7. the proton-to-helium flux ratio measured by pamela as a function of rigidity [7]. the shaded area represents the estimated systematic uncertainty. lines show the fit using one single power law (describing the difference of the two spectral indices), and the galprop [23] and zatsepin & sokolskaya models, with the original values of the paper [22] and fitted to the data.

figure 8. proton (left panel) and helium (right panel) spectra in the range 10 gv ÷ 1.2 tv [7]. the shaded area represents the estimated systematic uncertainty; the pink shaded area represents the contribution due to tracker alignment. the straight (green) lines represent fits with a single power law in the rigidity range 30 ÷ 240 gv. the red curves represent the fit with a rigidity-dependent power law (30 ÷ 240 gv) and with a single power law above 240 gv.
indeed, similar results were also reported by the cream experiment, which observed a change of slope for nuclei (z > 3), but at a higher rigidity than the pamela break in the helium spectrum [25].

2.4. anti-helium

pamela also places constraints on the existence of cosmologically significant amounts of antimatter by searching for anti-helium nuclei in the cosmic radiation. pamela is able to investigate the h̄e/he ratio over the largest rigidity interval ever achieved, extending the measurement up to several hundred gv. this is particularly relevant, since the predicted h̄e flux is expected to be strongly suppressed below a few gv, where most of the previous measurements were taken. preliminary results in the rigidity range 0.6 ÷ 600 gv have been provided, allowing an upper limit of 10^−7 ÷ 10^−6 to be put on the h̄e/he ratio.

2.5. solar modulation

crs entering the heliosphere are affected by the solar wind, a continuous flow of plasma (with speed ∼ 350 km/s) from the solar corona that carries the solar magnetic field out into the solar system. the resulting variations of the cr spectra depend on the solar activity; this effect is called solar modulation. the solar activity has a period of about 11 years, and at each maximum the polarity of the solar magnetic field reverses. thus, precise measurements of the energy spectra of a variety of cr particles over a wide rigidity range, from a few hundred mv to tens of gv (where the modulation effects are stronger), provide information on the interstellar spectra and on the effect of the solar modulation on charged particles of both signs.

the pamela analysis is based on data collected from july 2006 till december 2009, a period of solar minimum with negative phase (a < 0). as preliminary results, fig. 9 shows the time variations of the proton and electron fluxes (arbitrary units), normalized to the july 2006 data; the data are provided for different intervals of βr, where β and r denote the particle velocity and its rigidity, respectively.

figure 9. time variations of the proton (black) and electron (red) fluxes measured by pamela between july 2006 and december 2009. data (arbitrary units) are normalized to july 2006.

the interpretation of the pamela measurements requires more complex models of cr propagation in the heliosphere, invoking possible charge-sign dependent effects that affect positively and negatively charged particles in different ways, depending on the solar polarity.
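the text does not commit to any specific modulation model. for orientation only, a widely used first-order description (an assumption added here, not taken from the article) is the force-field approximation, which relates the flux j at earth to the local interstellar spectrum j_lis through a single time-dependent potential φ(t):

$J(E, t) \;=\; \frac{E\,(E + 2 m c^2)}{(E + \Phi)\,(E + \Phi + 2 m c^2)}\; J_{\mathrm{LIS}}(E + \Phi), \qquad \Phi = \frac{Z e}{A}\,\phi(t),$

where E is the kinetic energy per nucleon and m is the proton mass (Ze/A = e for protons); φ(t) is typically a few hundred mv near solar minimum. note that this approximation is blind to the sign of the charge, which is precisely why the charge-sign dependent effects mentioned above call for more refined, drift-based models.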
2.6. geomagnetically trapped antiprotons

thanks to the satellite orbit (70° inclination and 350 ÷ 610 km altitude), pamela is able to measure in detail the cosmic radiation in different regions of the terrestrial magnetosphere. in particular, the spacecraft orbit passes through the south atlantic anomaly (saa), allowing the observation of geomagnetically trapped particles from the radiation belts. pamela provided the first evidence of the existence of trapped antiprotons [8]. as reported in fig. 10, at the current solar minimum the trapped p̄ flux exceeds the galactic cr p̄ flux and the mean under-cutoff p̄ flux outside the radiation belts by 3 and 4 orders of magnitude, respectively.

figure 10. the geomagnetically trapped p̄ spectrum measured by pamela in the saa region (red circles), compared with the mean under-cutoff antiproton spectrum outside the radiation belts (blue triangles) [8] and the interplanetary cr antiproton spectrum (black squares) measured by pamela [3], together with a trapped antiproton calculation for the pamela satellite orbit (solid line) [26].

3. conclusions

the pamela experiment has been in orbit for more than six years, measuring cosmic ray particles with high precision and over a large energy interval. the results have significant implications for astrophysics, particle physics and cosmology. in particular, the pamela antiparticle data have put strong constraints on theoretical models of cr production and propagation and on the existence of exotic processes. in contrast with the antiproton results, the observed unambiguous positron excess appears inconsistent with the predictions of the standard cosmic ray model; the proposed scenarios invoke the existence of some standard or non-standard primary sources, or non-standard secondary production mechanisms. additional constraints were placed by the measurement of the electron spectrum: while the results agree with the predictions of conventional diffusive propagation models, they do not exclude possible primary e± contributions invoked to explain the rise of the positron fraction.

the measurement of the proton and helium spectra by pamela provides fundamental information for understanding the acceleration and propagation mechanisms of cosmic rays. the observed spectral features require improved models, possibly based on the existence of different source populations. pamela has provided an estimate of the h̄e/he ratio over a wide rigidity range, putting new limits on the existence of cosmological antimatter. finally, pamela has investigated the effect of solar modulation on the cosmic radiation, and it has also achieved significant results in the study of geomagnetically trapped particles.

acknowledgements

we acknowledge support from the italian space agency (asi), deutsches zentrum für luft- und raumfahrt (dlr), the swedish national space board, the swedish research council, the russian space agency (roscosmos) and the russian foundation for basic research.

references

[1] picozza, p., et al., astropart. phys. 27, 296 (2007)
[2] adriani, o., et al., phys. rev. lett. 102, 051101 (2009)
[3] adriani, o., et al., phys. rev. lett. 105, 121101 (2010)
[4] adriani, o., et al., nature 458, 607–609 (2009)
[5] adriani, o., et al., astropart. phys. 34, 1–11 (2010)
[6] adriani, o., et al., phys. rev. lett. 106, 201101 (2011)
[7] adriani, o., et al., science 332, no. 6025, 69–72 (2011)
[8] adriani, o., et al., astrophys. j. 737, l29 (2011)
[9] moskalenko, i. v., and strong, a. w., astrophys. j. 493, 694–707 (1998)
[10] delahaye, t., et al., astron. astrophys. 501, 821–833 (2009)
[11] boulares, a., et al., astrophys. j. 342, 807–813 (1989); atoyan, a. m., et al., phys. rev. d 52, no. 6, 3265–3275 (1995)
[12] grasso, d., et al., astropart. phys. 32 (2009)
[13] hooper, d., blasi, p., serpico, p., jcap 1, 25 (2009)
[14] delahaye, t., et al., astron. astrophys. 524, a51 (2010)
[15] blasi, p., phys. rev. lett. 103, 051104 (2009)
[16] blasi, p., and serpico, p. d., phys. rev. lett. 103, 081103 (2009)
[17] mertsch, p., and sarkar, s., phys. rev. lett. 103, 081104 (2009)
[18] cirelli, m., et al., nucl. phys. b 800, 204 (2008); cirelli, m., et al., nucl. phys. b 813, 1 (2009)
[19] kane, g., lu, r., and watson, s., phys. lett. b 681, 151 (2009)
[20] hooper, d., stebbins, a., and zurek, k. m. (2008)
[21] bergstrom, l., bringmann, t., and edsjo, j., phys. rev. d 78, 103520 (2008)
[22] zatsepin, v. i., sokolskaya, n. v., astron. astrophys. 458, 1 (2006)
[23] vladimirov, a. e., et al., arxiv:1008.3642v1 (2010)
[24] wefel, j. p., et al., proc. international cosmic ray conference (2008), vol. 2, pp. 31–34
[25] ahn, h. s., et al., astrophys. j. lett. 714, l89 (2010)
[26] selesnick, r. s., et al., geophys. res. lett. 34, 20 (2007)

acta polytechnica 54(2):85–92, 2014, doi:10.14311/ap.2014.54.0085
© czech technical university in prague, 2014, available online at http://ojs.cvut.cz/ojs/index.php/ap

the coupled-cluster approach to quantum many-body problem in a three-hilbert-space reinterpretation

raymond f. bishop^a, miloslav znojil^b,∗

a school of physics and astronomy, schuster building, the university of manchester, manchester, m13 9pl, uk
b nuclear physics institute ascr, 250 68 řež, czech republic
∗ corresponding author: znojil@ujf.cas.cz

abstract. the quantum many-body bound-state problem in its computationally successful coupled cluster method (ccm) representation is reconsidered. in conventional practice one factorizes the ground-state wave functions $|\psi\rangle = e^{S}\,|\phi\rangle$, which live in the “physical” hilbert space H^(p), using an elementary ansatz for |φ⟩ plus a formal expansion of S in an operator basis of multi-configurational creation operators $C_I^+$. in our paper a reinterpretation of the method is proposed. using parallels between the ccm and the so-called quasi-hermitian, alias three-hilbert-space (ths), quantum mechanics, the ccm transition from the known microscopic hamiltonian (denoted by the usual symbol H), which is self-adjoint in H^(p), to its effective lower-case isospectral avatar $\hat h = e^{-S} H\, e^{S}$, is assigned a ths interpretation. in the opposite direction, a ths-prescribed, non-ccm, innovative reinstallation of hermiticity is shown to be possible for the ccm effective hamiltonian ĥ, which only appears manifestly non-hermitian in its own (“friendly”) hilbert space H^(f). this goal is achieved via an ad hoc amendment of the inner product in H^(f), thereby yielding the third (“standard”) hilbert space H^(s). due to the resulting exact unitary equivalence between the first and third spaces, H^(p) ∼ H^(s), the indistinguishability of predictions calculated in these alternative physical frameworks is guaranteed.

keywords: quantum many-body problem, coupled cluster method, ad hoc inner product, alternative representation spaces.

1. introduction

the coupled cluster method (ccm) of construction, say, of the ground-state energies and wave functions of general quantum many-body systems works with virtual multi-particle excitations, and the linked-cluster nature of the contributions to the resulting estimates of measurable quantities is particularly emphasized [1–3].
the strategy leads, in practical calculations, to the replacement of the given, known, realistic and exact microscopic input hamiltonian (let us denote it by the dedicated symbol H) by its lower-case isospectral reparametrization

$\hat h = \Omega^{-1} H\, \Omega$ . (1)

an optimal similarity-mediating transformation operator Ω is then sought in an exponential, manifestly linked-cluster form $\Omega = \exp S$. the excitations themselves are usually assumed multi-configurational, multi-indexed and generated by a complete set of mutually commuting many-body creation operators $C_I^+ \equiv (C_I^-)^\dagger$ such that, conventionally, $C_0^+ \equiv 1$ and $C_0^- \equiv 1$, while $S = \sum_{I \neq 0} s_I\, C_I^+$. naturally, the quality of the variationally determined ccm coefficients $s_I$ translates into the quality of the predicted expectation values of any operator of an observable quantity.

in practice, a conflict expectedly emerges between the precision and the cost of the results. one is thus forced to find an optimal compromise between these two requirements by introducing various approximation schemes. in our present short paper we intend to describe one possible systematic approach to the abstract formulation of approximation hierarchies. our considerations will be inspired by the recent progress achieved in both the formal and the applied analyses of the isospectral partnerships ĥ ↔ H. in particular, we shall emphasize the innovative role played by various non-unitary mappings Ω, say, in their alternative time-independent or time-dependent forms, as described in the review papers [4] and [5], respectively.

once a decisive simplification of the hamiltonian is achieved by a non-unitary map Ω : H → ĥ, we have to start working with the less usual form ĥ of the hamiltonian, which becomes, in general, non-hermitian since

$\hat h^\dagger = \Omega^\dagger H\,(\Omega^{-1})^\dagger = \Omega^\dagger\Omega\;\hat h\;\Omega^{-1}(\Omega^{-1})^\dagger$ , $\qquad \Omega^\dagger\Omega \equiv \Theta \neq 1$ , (2)

i.e., $\hat h^\dagger = \Theta\,\hat h\,\Theta^{-1}$. in our present paper we intend to reveal and describe a deeper relationship between the ccm and the abstract framework provided by the mathematical theory of hamiltonians exhibiting the above property of quasi-hermiticity [6], alias crypto-hermiticity [7], with respect to the alternative hilbert-space metric operator Θ ≠ 1,

$\hat h^\dagger\,\Theta = \Theta\,\hat h$ . (3)

in section 2 we shall explain the abstract formalism of the three-hilbert-space (ths) representation of quantum systems. we shall make use of the notation conventions of the review paper [5], however, with the single, ccm-adapted exception of an interchange of the meaning of the lower- and upper-case symbols for the hamiltonian. for the sake of clarity, table 1 offers an explicit translation of the present notation conventions (as displayed in the first column) into the language of [5] (given in the second column). subsequently, in section 3 an overall review of the key ideas of the ccm constructions will be recalled, and their reinterpretation within the general ths scheme will be described. section 4 will finally summarize our observations and proposals.
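before proceeding, eqs. (1)–(3) can be made concrete in a finite-dimensional toy setting. the following sketch (python/numpy; the 4×4 matrices are random placeholders with no physical meaning) verifies that ĥ = Ω^{-1}HΩ is isospectral to H and satisfies the quasi-hermiticity relation (3) with Θ = Ω†Ω:

import numpy as np

# toy 4x4 check of eqs. (1)-(3); all matrices are random placeholders
rng = np.random.default_rng(0)
n = 4
a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
H = a + a.conj().T                                  # hermitian "microscopic" hamiltonian
Omega = np.eye(n) + 0.3 * rng.normal(size=(n, n))   # invertible, non-unitary map

h_hat = np.linalg.solve(Omega, H @ Omega)           # eq. (1): Omega^{-1} H Omega
Theta = Omega.conj().T @ Omega                      # positive-definite metric

# eq. (3): quasi-hermiticity, h_hat^dagger Theta = Theta h_hat
print(np.allclose(h_hat.conj().T @ Theta, Theta @ h_hat))       # True
# isospectrality: h_hat shares the real spectrum of H
print(np.allclose(np.sort(np.linalg.eigvals(h_hat).real),
                  np.sort(np.linalg.eigvalsh(H))))              # True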
2. ths representation of a quantum system

2.1. inspiration: fourier transform

the most elementary one-dimensional harmonic-oscillator hamiltonian $H^{(\mathrm{HO})} = -\frac{d^2}{dx^2} + x^2$ may be recalled as one of the best known examples of an operator representing a typical quantum observable. it enters the ordinary differential schrödinger equation

$H^{(\mathrm{HO})}\,\psi_n^{(p)}(x) = E_n^{(\mathrm{HO})}\,\psi_n^{(p)}(x)$ , $\quad \psi_n^{(p)}(x) \in L^2(\mathbb R)$ , $\quad n = 0, 1, \ldots$ (4)

for the “physical” wave functions $\psi_n^{(p)}(x)$. the solution of this eigenvalue problem yields the well known discrete spectrum of bound-state energies $E_0 = 1$, $E_1 = 3$, $E_2 = 5$, ..., while the related wave functions belong to the most common hilbert space of square-integrable complex functions of $x \in \mathbb R$. the argument x of the wave functions coincides with an admissible value of the position of the quantum particle in question. in other words, the (p)-superscripted complex functions $\psi_n^{(p)}(x)$ may be interpreted as yielding the probability density of finding the particle at the spatial point $x \in \mathbb R$. the wave functions in question live in the physical hilbert space $L^2(\mathbb R) \equiv H^{(p)}$.

formally, these functions may be represented as fourier transforms of elements of a, supposedly, “friendlier” hilbert space, $\psi_n^{(p)} = \mathcal F\,\psi_n^{(f)}$, $\psi_n^{(f)} \in H^{(f)}$. by construction, the latter space is also $L^2(\mathbb R)$, but the physical meaning of the argument $p \in \mathbb R$ of the new wave functions $\psi_n^{(f)}(p)$ is different. at the same time, the primary observable (i.e., the energy) remains unchanged. in practice, the harmonic oscillator appears equally well represented in both of the hilbert spaces $H^{(p,f)}$. whenever one moves to a more complicated model, however, one may find that one of these spaces is preferable. in other words, a unitary-mapping-mediated transition to a potentially friendlier hilbert space $H^{(f)}$ should be employed whenever it appears to lead, say, to a simplification of the calculation of the energies or of the wave functions. we only have to add here that the same recommendation remains valid even for mappings $H^{(p)} \leftrightarrow H^{(f)}$ which cease to be unitary. in this sense, our freedom of choosing between the upper- and lower-case hamiltonians, as expressed in eq. (1), may prove important, say, as a source of acceleration of the rate of convergence of various numerical or variational calculations (see, e.g., their review in [4]).
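for definiteness, with one standard convention for the fourier transform (a choice of ours; the text leaves it unspecified), the unitary map F reads

$\psi_n^{(p)}(x) = \big(\mathcal F\,\psi_n^{(f)}\big)(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{\,\mathrm i x p}\,\psi_n^{(f)}(p)\,\mathrm d p .$

for the harmonic oscillator the two representations even share the same functional form: the hermite-function solutions of eq. (4) are eigenvectors of F itself, with eigenvalues that are powers of i (up to the phase convention chosen for F).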
2.2. non-unitary mappings Ω = exp S

our present text is basically inspired by the recent growth in popularity of quantum models in which the ad hoc non-unitary isospectral transformations

$H \to \hat h = \Omega^{-1} H\, \Omega$ (5)

perceivably simplify the hamiltonian. thus, eq. (5) offers a path towards the feasibility of the evaluation of bound-state energies in complicated quantum systems via an Ω-mediated transition from a complicated “primary” hilbert space $H^{(p)}$ to a “friendlier” hilbert space $H^{(f)}$.

2.2.1. crypto-hermitian ibm method

one should distinguish between several non-equivalent applications of the above-outlined ideas. in one of the key references on the whole subject [4], the authors start from the knowledge of an overcomplicated H and from a qualified guess of a suitable simplification mapping $\Omega \neq (\Omega^\dagger)^{-1}$. for a persuasive illustration of the practical efficiency of such an approach the authors recalled the so-called interacting-boson-model (ibm) calculations of the spectra of heavy atomic nuclei. using the dyson–maleev choice of the boson-fermion mappings $\Omega^{(\mathrm{dyson})}$, this strategy was found to lead to successful and particularly computation-friendly forms of variational predictions of the measured energy levels [8].

the key condition of applicability of the latter ibm recipe may be seen in the feasibility of the construction of the ultimate “effective” hamiltonian ĥ of eq. (5). one arrives at a non-hermitian operator in general, ĥ ≠ ĥ†. it is worth adding that an exception may occur when the original self-adjoint hamiltonian H accidentally happens to commute with the operator-product symmetry $\Pi = \Omega\,\Omega^\dagger$; notice that $\Pi \neq \Theta$ unless we restrict attention to the mere normal-operator mappings Ω such that $\Omega^\dagger\Omega = \Omega\,\Omega^\dagger$.

whenever ĥ ≠ ĥ†, the practical determination of the eigenvalues of the transformed hamiltonian must remain easy and efficient. the reason is that, in comparison with the standard methods, one must replace the usual single time-independent schrödinger equation by the following doublet of conjugate eigenvalue problems,

$\hat h\,|\phi_n\rangle = E_n\,|\phi_n\rangle$ , $\quad \langle\tilde\phi_m|\,\hat h = E_m\,\langle\tilde\phi_m|$ , $\quad n, m = 0, 1, \ldots$ , (6)

using the respective action-to-the-right and action-to-the-left conventions. interested readers may consult review paper [5], in which a detailed discussion of further subtleties is given, first of all, for the far from trivial heisenberg-representation-like cases in which the non-unitary mapping Ω is also permitted to vary with time.

table 1. warning: opposite notation conventions.
concept | ccm [3] | ths [5]
initial hamiltonian (realistic, microscopic) | H (hermitian) | h (hermitian in H^(p))
transformation (non-unitary) | exp S (creation) | Ω : H^(f) → H^(p) (general invertible map)
transformed hamiltonian (assumed simplified) | ĥ = e^{-S} H e^{S} (non-hermitian) | H = Ω^{-1} h Ω (non-hermitian in H^(f) and hermitian in H^(s))
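the doublet (6) is easy to visualize in a finite dimension. the sketch below (python/numpy with scipy; the 3×3 matrix is an arbitrary placeholder with distinct eigenvalues) computes the right eigenvectors |φ_n⟩ and the left eigenvectors ⟨φ̃_m| of a non-hermitian matrix and checks their biorthogonality:

import numpy as np
from scipy.linalg import eig

# arbitrary non-hermitian 3x3 "hamiltonian" with distinct eigenvalues
h_hat = np.array([[1.0, 0.5, 0.0],
                  [0.0, 2.0, 0.3],
                  [0.2, 0.0, 3.0]])

evals, vl, vr = eig(h_hat, left=True, right=True)
# columns of vr solve h_hat |phi_n> = E_n |phi_n>;
# rows of vl.conj().T solve <phi~_m| h_hat = E_m <phi~_m|
print(np.allclose(h_hat @ vr, vr * evals))                             # right problem
print(np.allclose(vl.conj().T @ h_hat, evals[:, None] * vl.conj().T))  # left problem

# biorthogonality: <phi~_m | phi_n> vanishes for m != n
overlap = vl.conj().T @ vr
print(np.allclose(overlap, np.diag(np.diag(overlap))))

in the hermitian special case the left and right eigenvectors coincide and the familiar single schrödinger equation is recovered.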
2.2.2. pt-symmetric models

a reversal of the application of the simplification H → ĥ may be found promoted in the overall context of relativistic quantum field theory. in this entirely different domain of physics, bender and his coauthors were the first to advocate an alternative philosophy of first choosing a sufficiently elementary non-hermitian ĥ and of postponing the reconstruction of the overcomplicated self-adjoint operator H, sometimes even indefinitely.

the initial move is due to bender and boettcher, who published, in 1998, an influential letter [9]. in this work they noticed that certain elementary non-hermitian toy-model operators ĥ appeared to possess real and bound-state-like spectra, which were discrete, non-degenerate and bounded from below. in 2001, their observations were rigorously proved while, a few years later, some of these results were also complemented by approximate reconstructions of the necessary metric operator(s) $\Theta = \Theta(\hat h)$ (cf., e.g., review [10] for details).

on a model-independent level these developments finally resulted in a fully consistent innovative ths strategy in which one starts from a sufficiently elementary lower-case (i.e., non-hermitian) candidate for a “realistic-model” hamiltonian ĥ ≠ ĥ†. under a number of assumptions (cf., e.g., reviews [4, 11–13]) one is then able to reconstruct a suitable hilbert-space mapping $\Omega = \Omega(\hat h)$ and, via eq. (1), also a self-adjoint, textbook-compatible isospectral avatar H = H† of the hamiltonian living in $H^{(p)}$. in other words, from the initial knowledge of a quantum-dynamics-determining operator ĥ one is able to reconstruct, in principle at least, one or several tractable, textbook-compatible phenomenological quantum-mechanical and/or field-theoretical models.

naturally, the initial choice of a hamiltonian ĥ ≠ ĥ† acting in $H^{(f)}$ should guarantee that the pair of schrödinger eqs. (6) remains sufficiently easily solvable. this requirement is not so easily satisfied, and in practice people usually accept various independent and additional simplification assumptions. among them, a truly exceptional status belongs to the so-called pt-symmetry assumption or, more generally, to the assumption of the so-called pseudo-hermiticity property of ĥ (interested readers should consult, e.g., review [12] for more details).

2.2.3. towards the complex energy spectra

a third and still different implementation of the non-hermitian-observable ideas is much older than the previous two. it may be traced back to the traditional model-space projection technique of feshbach, in which one of the non-unitary mappings Ω and Ω^{-1} is chosen as a projector, so that the other one cannot exist. it is well known that the resulting simplified effective hamiltonians are restricted to a subspace while becoming energy-dependent in general. in this sense, feshbach's effective schrödinger eqs. (6) are de facto nonlinear. such a case certainly lies outside the scope of our present considerations.

still, it is worth noting that a number of papers have recently emerged in which the authors point out the existence of numerous links between the latter studies of resonances (i.e., of quantum hamiltonians possessing complex spectra) and their above-mentioned real-spectrum alternatives. interested readers may consult, e.g., monograph [14] to see a number of newly discovered connections between the physics of hermitian and/or non-hermitian effective hamiltonians and the related mathematics, which recommends, say, the use of the concept of kato's exceptional points, etc. one should also point out that even in the recent physics-based and experiment-oriented studies of the real-spectrum pseudo-hermitian and pt-symmetric models there has been a definite increase of interest in the interdisciplinary applications of the ths-related concepts of the spontaneous pt-symmetry breakdown and/or in explanations of the exceptional-point-related phase-transition mechanisms connected with the loss of the reality of the spectrum (cf., e.g., the recent quantum-theory-related review paper [15], or a sample [16] of a successful transfer of these ideas even beyond the realm of quantum theory itself).

3. ths interpretation of ccm constructions

having passed through the extensive list of motivating considerations, we are now getting very close to the key purpose of our present paper. for the construction of a concrete backward mapping Ω = Ω(H) in the ccm context, we see that some of the ths constructive techniques might be adopted directly. naturally, in the ccm framework we encounter the possibility of extending its philosophy and its range beyond the ground-state constructions. for this purpose we may decide to experiment with various ths-inspired alternatives to the basic (bi-)variational ccm ansätze.

in an introductory step let us return, therefore, to the ibm-motivated version of the ths approach, in which one assumes full knowledge of the realistic, albeit prohibitively complicated, hamiltonian H = H†, defined in some microscopic physical hilbert space $H^{(p)}$. a qualified guess or construction of Ω will then be vital for the success of the computations, i.e., first of all, for the success of the practically tractable construction and solution of the pair of schrödinger eqs. (6).

3.1. brief introduction to ccm constructions

in the ccm context, the generic, dyson-inspired non-unitary mapping $\Omega^{(\mathrm{ccm})}$ has traditionally been considered in the specific linked-cluster form of an exponential operator $\Omega^{(\mathrm{ccm})} = \exp S$.
in the literature (cf., e.g., [17] with further references) one may find a huge number of practical applications of the ccm strategy, by which the ground-state wave functions are sought in the form of the products

$|\psi\rangle = e^{S}\,|\phi\rangle$ . (7)

the ket vector |φ⟩ represents here a normalized state (usually called the model state or reference state), intended to be employed as a cyclic vector with respect to a complete set of mutually commuting multi-configurational creation operators $C_I^+ \equiv (C_I^-)^\dagger$. our use of the special symbol I for the index indicates that it is a multi-index that labels the set of all many-particle configurations. in other words, all states of the many-particle quantum system in question can be written as superpositions of the basis states $C_I^+|\phi\rangle$. the variational eigenkets (7) of the many-body self-adjoint hamiltonian H = H† are conveniently written in terms of the specific ccm operator ansatz

$S = \sum_{I \neq 0} s_I\, C_I^+$ . (8)

the fundamental ccm replacement (7) of an unknown vector |ψ⟩ by an unknown operator S is very well motivated from several independent points of view. one of the motivations is inherited from rayleigh–schrödinger perturbation theory, in which, at a certain stage of the construction, the operator schrödinger equation $H|\psi\rangle = E|\psi\rangle$ in question is replaced by its single bra-vector projection $\langle 0|H|\psi\rangle = E\,\langle 0|\psi\rangle$ or, more generally, by a finite multiplet of such projections $\langle 0_j|H|\psi\rangle = E\,\langle 0_j|\psi\rangle$. the key advantage of such a reduction lies in the possibility of a variationally optimal choice of the bra-vectors ⟨0_j|. by contrast, the property of hermiticity of the hamiltonian becomes, to a large degree, irrelevant.

thus, one transfers this experience to the ccm context by introducing a complementary, formally redundant concept of the left-action variational eigenvector ⟨ψ̃| of H. the nontrivial difference between the tilded and untilded eigenvectors ⟨ψ̃| and |ψ⟩ is motivated by the possibility of introducing an additional set $\{\tilde s_I\}$ of free parameters in the bra-vector

$\langle\tilde\psi| = \langle\phi|\,\tilde S\, e^{-S}$ , $\qquad \tilde S = 1 + \sum_{I \neq 0} \tilde s_I\, C_I^-$ . (9)

together with the condition of completeness of the basis,

$\sum_I C_I^+\,|\phi\rangle\langle\phi|\,C_I^- = 1 = |\phi\rangle\langle\phi| + \sum_{I \neq 0} C_I^+\,|\phi\rangle\langle\phi|\,C_I^-$ , (10)

and together with the usual properties of the creation and annihilation operators,

$C_I^-\,|\phi\rangle = 0 = \langle\phi|\,C_I^+ \quad \forall\, I \neq 0$ (11)

and

$[C_I^+, C_J^+] = 0 = [C_I^-, C_J^-]$ , (12)

we arrive at the standard version of the ccm formalism, in which one currently employs approximations that do not make use of the manifest hermiticity of the original eigenvalue problem. such approximations may entail keeping only a physically motivated subset of the multi-indices I in the otherwise exact expansions of the correlation operators S and S̃ in eqs. (7)–(9). as an immediate mathematical consequence, the ccm schrödinger equation for the ground state acquires the two different and mutually non-conjugate alternative forms

$\hat h\,|\phi\rangle = E\,|\phi\rangle$ , $\qquad \langle\phi|\,\tilde S\,\hat h = E\,\langle\phi|\,\tilde S$ , $\qquad \hat h = e^{-S} H\, e^{S}$ . (13)
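the step from the microscopic problem to the doublet (13) is a plain similarity transformation; writing it out once (our gloss, implicit in the text),

$H\,e^{S}|\phi\rangle = E\,e^{S}|\phi\rangle \;\Longrightarrow\; e^{-S} H\, e^{S}\,|\phi\rangle = E\,|\phi\rangle , \qquad \hat h := e^{-S} H\, e^{S} ,$

while the bra equation follows in the same way from the ansatz (9): $\langle\tilde\psi|\,H = E\,\langle\tilde\psi|$ gives $\langle\phi|\,\tilde S\, e^{-S} H\, e^{S} = E\,\langle\phi|\,\tilde S$.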
obviously, once the two sets of coefficients $\{s_I\}$ and $\{\tilde s_I\}$ are determined, all the ground-state properties of the many-body system in question may be considered as known. the ground-state expectation value of any given operator Λ should be evaluated from the asymmetric prescription

$\langle\tilde\psi|\,\Lambda\,|\psi\rangle = \langle\phi|\,\tilde S\, e^{-S}\,\Lambda\, e^{S}\,|\phi\rangle = \bar\Lambda(s, \tilde s)$ . (14)

this recipe keeps track of the artificial asymmetry introduced in eq. (13) which, in its turn, simplifies certain technical aspects of the global ccm approach. in particular, in the bi-variational spirit the energy expectation formula

$\langle\tilde\psi|\,H\,|\psi\rangle = \langle\tilde\phi|\,\hat h\,|\phi\rangle$ (15)

may now be minimized with respect to the full set of parameters $\{s_I, \tilde s_I\}$. two equations follow, viz.,

$\langle\phi|\,C_I^-\,\hat h\,|\phi\rangle = 0 \quad \forall\, I \neq 0$ (16)

and

$\langle\phi|\,\tilde S\,(\hat h - E)\,C_I^+\,|\phi\rangle = 0 \quad \forall\, I \neq 0$ . (17)
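for the reader's convenience, the first of these equations follows from a one-line variation (our gloss): since ĥ does not depend on the coefficients $\tilde s_I$ and S̃ is linear in them,

$0 = \frac{\partial}{\partial \tilde s_I}\,\langle\phi|\,\tilde S\,\hat h\,|\phi\rangle = \langle\phi|\,C_I^-\,\hat h\,|\phi\rangle , \qquad \forall\, I \neq 0 ,$

which is eq. (16). varying instead with respect to $s_I$, which does enter $\hat h = e^{-S} H e^{S}$, produces the commutator condition $\langle\phi|\,\tilde S\,[\hat h, C_I^+]\,|\phi\rangle = 0$ (the $C_I^+$ commute with S by eq. (12)); together with (16) and the completeness relation (10), this reduces to eq. (17).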
in their turn, these relations may be interpreted as a coupled algebraic set of equations that determine the parameters $\{s_I, \tilde s_I\}$. the consistency of the recipe may be reconfirmed by the derivation of the former relation (16) from the assumption of completeness of the set of states $\{\langle\phi|\,C_I^-\}$. similarly, eq. (17) may be perceived as a consequence of the completeness of the conjugate set $\{C_I^+\,|\phi\rangle\}$.

the coupled equations (16) and (17) are of the goldstone linked-cluster type. for this reason, all extensive variables, such as the energy, scale linearly with the number of particles at every level of approximation. this is another merit of the ccm construction. among the disadvantages we mention that the ground-state energy formula does not necessarily provide an upper bound, due to the intentional violation of manifest hermiticity in the problem. still, the recipe enables us to determine both the quickly convergent energies and the hamiltonian-dependent values of the parameters $\{s_I, \tilde s_I\}$ or, in various approximate schemes, of the respective truncated subsets of these values.

within the general framework of the ccm treatment of many-body quantum systems some of the above-mentioned assumptions and restrictions may be removed. the method may certainly be extended, say, to cover also excited states and/or certain time-dependent versions of the dynamics. in both of these directions, an implementation of ideas from the ths context might prove particularly helpful.

3.2. ccm–ths correspondence

the close mathematical relationship between the various variational ccm recipes and the universal three-hilbert-space (ths) representation of a generic quantum system has been largely overlooked till now. apart from a few rather inessential differences, one of the key obstacles may be seen in the differences in their notations, a first sample of which is displayed in table 1, where we see that for the hamiltonians the ccm and ths notation conventions are strictly opposite (so we have to re-emphasize that in our present paper we are using the first-column notation conventions).

with due care paid to the hermiticity or non-hermiticity of the hamiltonian, it seems equally important to spot the ccm–ths coincidences and/or differences in the definitions and meanings of the other concepts. for the ground-state wave functions, in particular, the parallels in the denotation of the same feature or quantity are displayed in table 2. an inspection of table 2 reveals that, in their respective current versions, the two formalisms are indeed far from equivalent. at the same time, they may both be found to suffer from certain specific weak points. in fact, our present considerations were originally motivated precisely by a parallel analysis of these respective weaknesses. after a deeper study we came to the conclusion (documented and emphasized also by the two respective compact reviews above) that a perceivable profit might be gained by modifying the two formalisms and/or methods of calculation and bringing them closer to each other.

on the side of the ccm formalism, for example, one may immediately notice an obvious contrast between the exponential ccm form of the mapping $\Omega^{(\mathrm{ccm})} = \exp S^{(\mathrm{ccm})}$ and the manifestly non-exponential, polynomial form of the tilded operator S̃ entering the second ccm ansatz (9). naturally, such a striking difference did not go unnoticed in the related literature, and the idea has been implemented in the so-called extended version of the ccm (eccm) formalism [1, 2]. on the side of the general ths formalism, in parallel, we may now recollect one of the very popular formalism-simplifying tricks, by which one works just with the special hermitian mappings $\Omega_S = \Omega_S^\dagger = \exp S_S$ [11, 12]. under this additional assumption one arrives at a fairly natural exponential form of the equally special but still sufficiently general subset of positive-definite metrics, $\Theta_S = \exp 2S_S$. in this manner, after the respective replacements $\tilde S \to \tilde S^{(\mathrm{eccm})}$ and $\Theta \to \Theta_S = \exp 2S_S$, the initially very different forms of the operators get closer.

once one stops feeling discouraged by similar, more or less purely formal differences, one has to reopen also the question of the respective roles of the operators S̃ and Θ in the purely numerical context. this is another type of difference which is, naturally, strongly dependent on the purpose of the calculation. traditionally, the ccm and ths calculation purposes are rather different. nevertheless, on the ccm side one immediately notices that the predominance of calculations of the ground-state characteristics does not exclude extensions, say, to the excited-state problem [18] or even to the description of systems which are allowed to exhibit a manifest time-dependence of their dynamics [19]. in this sense we are getting still closer to the respective time-independent and time-dependent non-hermitian versions of the general and universal ths formulation of abstract quantum mechanics as summarized, say, in refs. [4] and [5].

table 2. parallel notation conventions.
ground state | ccm [3] | ths [5]
purpose | bi-variationality | re-hermitization of H in H^(s)
assumptions | S̃ = annihilation | Θ = Ω†Ω, Ω = invertible
eigen-ket (simplified) | |φ⟩ | |0⟩ ∈ H^(f,s)
eigen-bra (conjugate) | ⟨φ| | ⟨0| ∈ H^(f)′
eigen-bra (amended) | ⟨φ̃| := ⟨φ|S̃ | ⟨⟨0| := ⟨0|Θ ∈ H^(s)′
microscopic ground state (first variational ansatz) | |ψ⟩ = e^S|φ⟩ | |0≻ = Ω|0⟩ ∈ H^(p)
left ground state (second variational ansatz) | ⟨ψ̃| = ⟨φ̃|e^{-S} | ≺0| = ⟨0|Ω† = ⟨⟨0|Ω^{-1} ∈ H^(p)′

figure 1. the three-hilbert-space diagram:
• the initial, given microscopic hamiltonian H = H† lives in the primary space H^(p); all is prohibitively complicated. one constructs the ccm operator Ω = exp S, i.e., the ccm map Ω^{-1} = exp(−S).
•• the friendly space H^(f) is false: in it, the new ĥ := Ω^{-1}HΩ is not self-adjoint, ĥ ≠ ĥ†.
••• after hermitization, the secondary space H^(s) is standard: in it, the same ĥ = Θ^{-1}ĥ†Θ =: ĥ‡ is found self-adjoint and diagonalizable, with the equivalence H^(s) ∼ H^(p).

4. discussion

4.1. a ccm–ths fusion?

in the language of mathematics the core of our present message may be summarized as follows: in fact, it need not be particularly difficult to search for a further enhancement of the parallels between the manifestly non-hermitian, annihilation-operator-type ccm choice of the tilded operator S̃ and the strictly hermitian and, in addition, strictly positive definite hilbert-space metric operator Θ = Ω†Ω.
in the terminology of physics this persuasion is supported by the observation that what is shared by both the abstract ccm and ths formalisms is the truly exciting idea of using nontrivial “redundant” operators S̃ or Θ in place of the common identity operator. in both formalisms, the rationale behind the use of the respective nontrivial operators S̃ and Θ is rather subtle though fairly persuasive and not too dissimilar. indeed, in both cases one starts from a well known while, unfortunately, prohibitively complicated initial self-adjoint hamiltonian (recall, once more, table 1). secondly, the choice and/or construction of the mapping Ω = exp S is motivated, in both approaches, by a more or less comparably successful simplification of the schrödinger eigenvalue problem. thirdly, both the ccm and ths re-arrangements of the quantum bound-state problem lead to the necessity of introducing the respective nontrivial operators S̃ and Θ, using comparably strong but, at the same time, different supportive arguments.

what now remains open is the truly challenging question as to whether, and in which sense, one could really achieve a complete coincidence of the respective (and, apparently, ideologically distant) ccm and ths recipes. firstly, an affirmative answer may be given (and the idea may be made to work) whenever the hilbert spaces of the system remain, for whatever reason (e.g., for approximation purposes), finite-dimensional. in such a very specific case the space for a compromise opens immediately once we move from the abstract formalism to any kind of practical variational calculation and/or numerical approximation. schematically speaking, any 2m-parametric array of the multi-indexed ccm variational coefficients $s_k$ and $\tilde s_k$, with k = 1, 2, ..., m, may be perceived as equivalent to the introduction of a 2m-parametric metric Θ = Ω†Ω. it should be noted, as a supportive argument, that even in the thorough ibm review [4] a large amount of space has been devoted to the study of finite-dimensional models and to the questions of the practical variational applicability of the ths scheme.

on this level of mathematics, the overall nature and structure of the above-indicated possibility of a complete unification (or, at least, of a strengthening of the ccm–ths parallelism) may be read out of the three-hilbert-space diagram in figure 1. by the blobs we mark there the three main constructive ccm–ths steps. in the first two steps (viz., • and ••) we may assume that we stay inside the usual ccm framework, in which the ground-state eigenvector |ψ⟩ of the quantum system in question is reparametrized in terms of the operator S. thus, the ccm–ths innovation only emerges, via the operators S̃ alias Θ, in the third step (•••, see table 2).

in this setting let us remind the readers that the (certainly, in general, existing) creation-operator components of $\Theta^{(\mathrm{ccm})}$ may be expected to play just a marginal role in the convergence. the reason is that the ccm choice of Ω = exp S is mainly aimed at the construction of the many-body ground states. thus, a lot of freedom is left for the introduction of more variational parameters via S̃ ≠ 1. in contrast, the balanced distribution of the attention of the universal ths formulae between the ground and excited states certainly lowers the latter freedom, because the ths recipe defines the metric in terms of Ω unambiguously.
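the third, hermitization step (•••) is again easy to emulate in a finite dimension. the sketch below (python/numpy with scipy; a random 4×4 toy model, not a ccm calculation) builds the metric Θ = Ω†Ω and checks that $h_s := \Theta^{1/2}\,\hat h\,\Theta^{-1/2}$ is manifestly hermitian and isospectral with the initial H:

import numpy as np
from scipy.linalg import sqrtm

rng = np.random.default_rng(3)
n = 4
a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
H = a + a.conj().T                                 # hermitian in H^(p)
Omega = np.eye(n) + 0.2 * rng.normal(size=(n, n))  # invertible, ccm-like map

h_hat = np.linalg.solve(Omega, H @ Omega)          # non-hermitian in H^(f)
Theta = Omega.conj().T @ Omega                     # metric
root = sqrtm(Theta)                                # positive square root of Theta

h_s = root @ h_hat @ np.linalg.inv(root)           # hermitian in H^(s)
print(np.allclose(h_s, h_s.conj().T))                                # True
print(np.allclose(np.linalg.eigvalsh(h_s), np.linalg.eigvalsh(H)))   # same spectrum

the one-line reason: $h_s^\dagger = \Theta^{-1/2}\,\hat h^\dagger\,\Theta^{1/2} = \Theta^{-1/2}\,(\Theta\,\hat h\,\Theta^{-1})\,\Theta^{1/2} = h_s$, by the quasi-hermiticity relation (3).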
4.2. towards the infinite-dimensional hilbert spaces

once we decide to leave the language of computing, and once we move to the exact description of realistic quantum systems and to the (say, separable) infinite-dimensional hilbert spaces, the search for the ccm–ths unification becomes perceivably more difficult. from the ths perspective, in particular, the key subtlety lies in the fact that whenever one decides to treat the two topological vector spaces $H^{(p)}$ and $H^{(f)}$ (naturally, still without any account of the definition of the inner products and of the metrics) as distinct, the map Ω = exp S will slightly change its meaning as well as its interpretation.

from the alternative (and also historically older) ccm point of view it is necessary to recall, first of all, the results of the important paper [20]. its author accepted the usual, above-described ccm linked-cluster parametrization, in its most general time-dependent form, as deduced from an appropriate action principle. in turn, this enforces a symplectic structure on the ensuing ccm phase space of the real-valued “degrees of freedom” $s_I$, $\tilde s_I$ of eq. (14). at this point the author of [20] was forced to discuss the emergence of the characteristic non-hermiticity of the average-value functionals $\bar\Lambda(s, \tilde s)$ of the physical observables, as well as of the action $\bar A(s, \tilde s)$ itself. in fact, our present idea of a possible ccm–ths correspondence also found another source of inspiration in his approach, so let us recall his key ideas in more detail.

firstly, he introduced the set of complex conjugate variables $s_I^*$, $\tilde s_I^*$ and showed how they could be used to enlarge the ccm phase space into a genuine complex manifold, but of too large a dimensionality. he further showed how the extra degrees of freedom could then be eliminated via the dirac bracket technique. a set of constraint functions was introduced which thereby selects the physical submanifold (alias the reduced phase space, or constraint surface) corresponding to the original hilbert space. subsequently, the reduced phase space was shown to be a (kähler) complex manifold with a symplectic structure, just like the original extended one. ultimately, the kähler manifold may be perceived as defining a positive, invertible, hermitian geometry in the reduced phase space. arponen [20] further shows that for a compound operator product $Q = \Lambda_1 \Lambda_2$, the ccm star product, which generates the expectation-value functional $\bar Q = \bar\Lambda_1 \star \bar\Lambda_2$ in terms of the individual expectation values $\bar\Lambda_1$ and $\bar\Lambda_2$ as given by eq. (14), can be well defined in the reduced (i.e., physical) phase space.

this result suggests that, besides starting from the ths scheme, one could also try to develop certain innovative and consequently hermiticity-preserving hierarchical approximation schemes strictly within the ccm framework. a judicious use of the on-shell star products seems capable of establishing another form of the ccm–ths parallels, and of doing so in an entirely general setting. in addition, some explicit and concrete constructive implementations of the concept of the metric Θ may be found directly in the generic ccm framework. naturally, a deeper analysis would require verification in terms of explicit constructions. further development of such a project lies beyond the scope of our present paper.

4.3. outlook

let us summarize that in the general ths framework one is expected to perform all of the practical computations of physical predictions inside the “friendliest” hilbert space $H^{(f)}$.
a real mathematical promise of the search for the new mutual ccm–ths correspondences is that even the standard probabilistic interpretation of the many-body wave functions need not require a return to the “unfriendly” space $H^{(p)}$. in all respects it becomes easier to replace the latter space by its (unitarily) equivalent alternative $H^{(s)}$. the reason is that the latter hilbert space only differs from its more friendly predecessor $H^{(f)}$ by an ad hoc amended inner product.

our present brief outline of a few explicit ccm–ths correspondences centered around the fact that the operator S̃ of the ccm formalism coincides with the hilbert-space metric operator Θ after a “translation of notation” into the ths-representation language of [5]. against the background of this comparison, the main potential innovation of the ccm was found in the ths-based possibility of distinguishing between the three separate hilbert spaces $H^{(p)}$, $H^{(f)}$ and $H^{(s)}$, which would all represent the same quantum many-body system.

the change of perspective revealed several ccm–ths parallels as well as differences. among the parallels, one of the most inspiring seems to lie in the emerging structural similarity between the ccm constructions and their ibm (= interacting boson model) counterparts. the project of our future development of such a ccm–ibm correspondence seems promising. in the language of physics it might enable us to keep the initial physical p-superscripted hilbert space fermionic, while rendering the other two, f- and s-superscripted, hilbert spaces, strictly in the generalized ibm spirit, carriers of another, generalized (e.g., pseudo-bosonic) statistics.

in the opposite direction, also the traditional ibm constructions of effective hamiltonians could find some new inspiration in their ccm analogues. in particular, the prospects of a simplification mediated by the non-unitary invertible mappings Ω = exp S need not necessarily stay bound by their traditional bosonic-image ibm restrictions. a new wealth of correspondences may be expected to become implementable between the auxiliary hilbert space $H^{(f)}$ and the, by assumption, prohibitively complicated physical hilbert space $H^{(p)}$ (hence, the superscript (p) may also mean “prohibitive”). ultimately, the technically most productive idea may be seen in the exceptional role of the f-superscripted hilbert space, in which the absence of an immediate physical interpretation (say, of the measurable aspects of the coupled clusters) appears more than compensated by the optimal suitability of this particular representation space for calculations of the, typically, variational ccm type.

acknowledgements

the participation of mz was supported by gačr grant p203/11/1433.

references

[1] j. arponen. variational principles and linked-cluster exp s expansions for static and dynamic many-body problems. ann. phys. (ny) 151: 311–382, 1983. doi:10.1016/0003-4916(83)90284-1
[2] j. s. arponen, r. f. bishop and e. pajanne. extended coupled-cluster method. i. generalized coherent bosonization as a mapping of quantum theory into classical hamiltonian mechanics. phys. rev. a 36: 2519–2538, 1987. doi:10.1103/physreva.36.2519
[3] r. f. bishop. an overview of coupled cluster theory and its applications in physics. theor. chim. acta 80: 95–148, 1991. doi:10.1007/bf01119617; r. f. bishop. the coupled cluster method. in microscopic quantum many-body theories and their applications, springer lecture notes in physics vol. 510, ed. j. navarro and a. polls, pp. 1–70. springer-verlag, berlin, 1998.
[4] f. g. scholtz, h. b. geyer and f. j. w. hahne. quasi-hermitian operators in quantum mechanics and the variational principle. ann. phys. (ny) 213: 74–101, 1992. doi:10.1016/0003-4916(92)90284-s
[5] m. znojil. three-hilbert-space formulation of quantum mechanics. sigma 5: 001, 2009 (arxiv overlay: 0901.0700). doi:10.3842/sigma.2009.001
[6] j. dieudonné. quasi-hermitian operators. proc. int. symp. lin. spaces, pergamon, oxford, 1961, pp. 115–122.
[7] a. v. smilga. cryptogauge symmetry and cryptoghosts for crypto-hermitian hamiltonians. j. phys. a: math. theor. 41(24): 244026, 2008. doi:10.1088/1751-8113/41/24/244026
[8] d. janssen, f. dönau, s. frauendorf and r. v. jolos. boson description of collective states: (i). derivation of the boson transformation for even fermion systems. nucl. phys. a 172: 145–165, 1971. doi:10.1016/0375-9474(71)90122-9
[9] c. m. bender and s. boettcher. real spectra in non-hermitian hamiltonians having pt symmetry. phys. rev. lett. 80: 5243–5246, 1998. doi:10.1103/physrevlett.80.5243
[10] c. m. bender. making sense of non-hermitian hamiltonians. rep. prog. phys. 70: 947–1018, 2007. doi:10.1088/0034-4885/70/6/r03
[11] p. dorey, c. dunning and r. tateo. the ode/im correspondence. j. phys. a: math. theor. 40: r205–r283, 2007. doi:10.1088/1751-8113/40/32/r01
[12] a. mostafazadeh. pseudo-hermitian representation of quantum mechanics. int. j. geom. meth. mod. phys. 7: 1191–1306, 2010. doi:10.1142/s0219887810004816
[13] f. bagarello and m. znojil. nonlinear pseudo-bosons versus hidden hermiticity. j. phys. a: math. theor. 44: 415305, 2011. doi:10.1088/1751-8113/44/41/415305
[14] n. moiseyev. non-hermitian quantum mechanics. cambridge university press, cambridge, 2011.
[15] i. rotter. a non-hermitian hamilton operator and the physics of open quantum systems. j. phys. a: math. theor. 42: 153001, 2009. doi:10.1088/1751-8113/42/15/153001
[16] m. v. berry. physics of nonhermitian degeneracies. czech. j. phys. 54: 1039–1047, 2004. doi:10.1023/b:cjop.0000044002.05657.04; c. e. rüter, r. makris, k. g. el-ganainy, d. n. christodoulides, m. segev and d. kip. observation of parity-time symmetry in optics. nature phys. 6: 192, 2010. doi:10.1038/nphys1515
[17] r. f. bishop and p. h. y. li. coupled-cluster method: a lattice-path-based subsystem approximation scheme for quantum lattice models. phys. rev. a 83: 042111, 2011. doi:10.1103/physreva.83.042111
[18] j. s. arponen, r. f. bishop and e. pajanne. extended coupled-cluster method. ii. excited states and generalized random-phase approximation. phys. rev. a 36: 2539–2549, 1987. doi:10.1103/physreva.36.2539
[19] j. s. arponen, r. f. bishop and e. pajanne. dynamic variational principles and extended coupled-cluster techniques. in aspects of many-body effects in molecules and extended systems (springer lecture notes in chemistry, vol. 50, ed. d. mukherjee), pp. 79–100. springer-verlag, berlin, 1989. doi:10.1007/978-3-642-61330-2_4
[20] j. arponen. constrained hamiltonian approach to the phase space of the coupled-cluster method. phys. rev. a 55: 2686–2700, 1997. doi:10.1103/physreva.55.2686
acta polytechnica 53(5):470–472, 2013, doi:10.14311/ap.2013.53.0470
© czech technical university in prague, 2013, available online at http://ojs.cvut.cz/ojs/index.php/ap

the number of orthogonal conjugations

armin uhlmann∗

university of leipzig, institute for theoretical physics, pb: 100920, d-04009 leipzig, germany.
∗ corresponding author: armin.uhlmann@t-online.de

abstract. after a short introduction to anti-linearity, bounds for the number of orthogonal (skew) conjugations are proved. they are saturated if the dimension of the hilbert space is a power of two. for other dimensions this is an open problem.

keywords: anti(conjugate) linearity, canonical hermitian form, (skew) conjugations.

submitted: 31 march 2013. accepted: 1 may 2013.

1. introduction

the use of anti-linear or, as mathematicians call them, conjugate linear operators in physics goes back to e. p. wigner [11]. wigner also discovered the structure of anti-unitary operators [12] in finite dimensional hilbert spaces. the essential difference from the linear case is the existence of 2-dimensional irreducible subspaces, so that the hilbert space decomposes, in general, into a direct sum of 1- and 2-dimensional invariant spaces. later on, f. herbut and m. vujičić were able to clarify the structure of anti-linear normal operators [4] by proving that such a decomposition also exists for them. while any linear operator allows for a jordan decomposition, i do not know a similar decomposition of an arbitrary anti-linear operator.

in the main part of the paper there is no discussion of what happens in the case of an infinite dimensional hilbert space. there are, however, several important issues both in physics and in mathematics: a motivation of wigner was the prominent application of (skew) conjugations (see the next section for definitions) to time reversal symmetry and related inexecutable symmetries. it is impossible to give credit to the many beautiful results in elementary particle physics and in minkowski quantum field theory in this domain.
however, it is perhaps worthwhile to note the following: the cpt operator, the combination of particle conjugation c, parity operator p, and time reversal t, is an anti-unitary operator acting on bosons as a conjugation and on fermions as a skew conjugation. it is a genuine symmetry of any relativistic quantum field theory in minkowski space; the proof is a masterpiece of r. jost [7]. a further remarkable feature of anti-linearity is shown by cpt. this operator is defined up to the choice of the point x in minkowski space on which pt acts as x → −x. calling this specific form cpt_x, one quite straightforwardly shows that the linear operator cpt_x cpt_y represents the translation by the vector 2(x − y). the particular feature in the example at hand is the splitting of an executable symmetry operation into the product of two anti-linear ones. this feature can also be observed in some completely different situations. an example is the possibility to write the output of quantum teleportation, as introduced by bennett et al. [1], [2, 8], as the action of the product of two anti-linear operators on the input state vector, see [3, 9, 10]. these few sketched examples may hopefully convince the reader that studying anti-linearity is quite reasonable — though the topic of the present paper is by far not so spectacular. the next two sections provide a mini-introduction to anti-linearity. in the last one it is proved that the number of mutually orthogonal (skew) conjugations is maximal if the dimension of the hilbert space is a power of two. it is conjectured that there are no other dimensions for which this number reaches its natural upper bound.

2. anti- (or conjugate) linearity

let h be a complex hilbert space of dimension d < ∞. its scalar product is denoted by 〈φ_b, φ_a〉 for all φ_a, φ_b ∈ h. the scalar product is assumed linear in φ_a; this is the "physical" convention going back to e. schrödinger. 1 is the identity operator.

definition 1. an operator ϑ acting on a complex linear space is called anti-linear or, equivalently, conjugate linear if it obeys the relation

ϑ(c_1 φ_1 + c_2 φ_2) = c_1* ϑφ_1 + c_2* ϑφ_2,  c_j ∈ ℂ.  (1)

as is common use, b(h) denotes the set (algebra) of all linear operators from h into itself. the set (linear space) of all anti-linear operators is called b(h)_anti. anti-linearity requires a special definition of the hermitian adjoint.

definition 2 (wigner). the hermitian adjoint ϑ† of ϑ ∈ b(h)_anti is defined by

〈φ_1, ϑ†φ_2〉 = 〈φ_2, ϑφ_1〉,  φ_1, φ_2 ∈ h.  (2)

a simple but important fact is seen by commuting ϑ and a = c1. one obtains (cϑ)† = c ϑ†, saying that ϑ → ϑ† is a complex linear operation,

(∑ c_j ϑ_j)† = ∑ c_j ϑ_j†.  (3)

this is an essential difference from the linear case: here, taking the hermitian adjoint is a linear operation. a similar argument shows that the eigenvalues of an anti-linear ϑ form circles around the null vector. if there is at least one eigenvalue and d > 1, let r be the radius of the largest such circle; the set of all values 〈φ, ϑφ〉, with φ running through all unit vectors, is the disk with radius r. see [6] for the more sophisticated real case. we need some further definitions.

definition 3. an anti-linear operator ϑ is said to be hermitian or self-adjoint if ϑ† = ϑ, and skew hermitian or skew self-adjoint if ϑ† = −ϑ. the linear spaces of all hermitian and skew hermitian anti-linear operators are denoted by b(h)+_anti and b(h)−_anti, respectively.
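the paper works basis-free, but the definitions above are easy to make concrete (the following matrix encoding is a standard illustration added here, not part of the original text): after fixing an orthonormal basis, an anti-linear ϑ can be encoded by a matrix a with ϑφ = a·conj(φ); with the scalar-product convention above, definition 2 then makes ϑ† correspond to the plain transpose aᵀ, so hermitian (skew hermitian) anti-linear operators correspond to complex symmetric (antisymmetric) matrices. a minimal numerical sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4
A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))  # encodes theta

def theta(phi):
    # anti-linear action: theta(phi) = A conj(phi)
    return A @ np.conj(phi)

def theta_dag(phi):
    # adjoint per definition 2; it corresponds to the transposed matrix
    return A.T @ np.conj(phi)

def inner(a, b):
    # scalar product <a, b>, linear in the second argument (the paper's convention)
    return np.vdot(a, b)

phi1 = rng.normal(size=d) + 1j * rng.normal(size=d)
phi2 = rng.normal(size=d) + 1j * rng.normal(size=d)

# check definition 2: <phi1, theta_dag phi2> == <phi2, theta phi1>
assert np.allclose(inner(phi1, theta_dag(phi2)), inner(phi2, theta(phi1)))

# anti-linearity, eq. (1): theta(c1 p1 + c2 p2) = c1* theta(p1) + c2* theta(p2)
c1, c2 = 1.5 - 2j, 0.3 + 1j
assert np.allclose(theta(c1 * phi1 + c2 * phi2),
                   np.conj(c1) * theta(phi1) + np.conj(c2) * theta(phi2))
```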
rank-one linear operators are as usually written (|φ′〉〈φ″|)φ := 〈φ″, φ〉φ′, and we define similarly

(|φ′〉〈φ″|)_anti φ := 〈φ, φ″〉φ′,  (4)

projecting any vector φ onto a multiple of φ′. note that we do not use 〈φ″| decoupled from its other part: we do not attach any meaning to 〈φ″|_anti as an expression standing alone (though one could do so, as a conjugate linear form). an anti-linear operator θ is called a unitary operator or, as wigner used to say, an anti-unitary, if θ† = θ⁻¹. a conjugation is an anti-unitary operator which is hermitian, hence fulfilling θ² = 1. the anti-unitary θ will be called a skew conjugation if it is skew hermitian, hence satisfying θ² = −1.

3. the invariant hermitian form

while the trace of an anti-linear operator is undefined, the product of two anti-linear operators is linear. the trace

(ϑ_1, ϑ_2) := tr ϑ_2ϑ_1  (5)

will be called the canonical hermitian form, or just the canonical form, on the space of anti-linear operators. an anti-linear ϑ can be written uniquely as a sum ϑ = ϑ_+ + ϑ_− of a hermitian and a skew hermitian operator with

ϑ_+ := (ϑ + ϑ†)/2,  ϑ_− := (ϑ − ϑ†)/2.  (6)

relying on (5) and (6) one concludes

(ϑ_+, ϑ_+) ≥ 0,  (ϑ_−, ϑ_−) ≤ 0,  (ϑ_+, ϑ_−) = 0.  (7)

in particular, equipped with the canonical form, b(h)+_anti becomes a hilbert space. completely analogously, −(·,·) is a positive definite scalar product on b(h)−_anti. bases of these two hilbert spaces can be obtained as follows: let φ_1, φ_2, . . . be a basis of h. then

(|φ_j〉〈φ_j|)_anti,  (1/√2)((|φ_j〉〈φ_k|)_anti + (|φ_k〉〈φ_j|)_anti),  (8)

where j, k = 1, . . . , d and k < j, is a basis of b(h)+_anti with respect to the canonical form. as a basis of b(h)−_anti one can use the anti-linear operators

(1/√2)((|φ_j〉〈φ_k|)_anti − (|φ_k〉〈φ_j|)_anti).  (9)

by counting basis lengths one gets

dim b(h)±_anti = d(d ± 1)/2.  (10)

it follows that the signature of the canonical hermitian form is equal to d = dim h. indeed,

dim b(h)+_anti − dim b(h)−_anti = dim h.  (11)

4. orthogonal (skew) conjugations

the anti-linear (skew) hermitian operators are the elements of hilbert spaces. their scalar products are restrictions of the canonical form (up to a sign in the skew case). it is therefore a legitimate question to ask for the maximal number of mutually orthogonal conjugations or skew conjugations. these two numbers depend only on the dimension d = dim h of the hilbert space. let us denote by n_+(d) the maximal number of orthogonal conjugations and by n_−(d) the maximal number of skew conjugations. by (10) it is

n_±(d) ≤ d(d ± 1)/2.  (12)

to get an estimate from below, one observes that the tensor product of two conjugations and the tensor product of two skew conjugations are conjugations. therefore

n_+(d_1d_2) ≥ n_+(d_1)n_+(d_2) + n_−(d_1)n_−(d_2)  (13)

and, similarly,

n_−(d_1d_2) ≥ n_+(d_1)n_−(d_2) + n_−(d_1)n_+(d_2)  (14)

because the direct product of two orthogonal (skew) conjugations is orthogonal. now consider the case that equality holds in (12) for d_1 and d_2. then one gets the inequality

n_+(d_1d_2) ≥ (1/4)(d_1(d_1 + 1)d_2(d_2 + 1) + d_1(d_1 − 1)d_2(d_2 − 1)),

and its right hand side yields d(d + 1)/2 with d = d_1d_2. hence equality holds in (13). a similar reasoning shows equality in (14) if equality holds in (12). hence: the set of dimensions for which equality takes place in (12) is closed under multiplication. to rephrase this result we call n_anti the set of dimensions for which equality holds in (12): if d_1 ∈ n_anti and d_2 ∈ n_anti then d_1d_2 ∈ n_anti. 2 ∈ n_anti will be shown by explicit calculations below.
hence every power of two is contained in n_anti. let us briefly look at dim h = 1: it is n_+(1) = 1 and n_−(1) = 0. indeed, any anti-linear operator in ℂ is of the form ϑ_a z = a z*. this is a conjugation if |a| = 1, and there are no skew conjugations. the canonical form reads (ϑ_a, ϑ_b) = a*b.

conjecture 1. n_anti consists of the numbers 2^n, n = 0, 1, 2, . . .

skew hermitian invertible operators exist in even dimensional hilbert spaces only. therefore, no odd number except 1 is contained in n_anti; this, however, is a rather trivial case. already for dim h = 3 the maximal number n_+(3) of orthogonal conjugations seems not to be known.

4.1. the case dim h = 2

to show that 2 ∈ n_anti one chooses a basis φ_1, φ_2 of the 2-dimensional hilbert space h and defines

τ_0(c_1φ_1 + c_2φ_2) = c_1*φ_2 − c_2*φ_1,
τ_1(c_1φ_1 + c_2φ_2) = c_1*φ_1 − c_2*φ_2,
τ_2(c_1φ_1 + c_2φ_2) = i c_1*φ_1 + i c_2*φ_2,
τ_3(c_1φ_1 + c_2φ_2) = c_1*φ_2 + c_2*φ_1.  (15)

for j, k ∈ {1, 2} and m ∈ {1, 2, 3} one gets 〈φ_j, τ_mφ_k〉 = 〈φ_k, τ_mφ_j〉, saying that these anti-linear operators are hermitian. one also has τ_m² = 1 for m ∈ {1, 2, 3}. altogether, τ_1, τ_2, τ_3 are conjugations. to see that they are orthogonal to one another we compute

τ_1τ_2τ_3 = iτ_0,  τ_2τ_1 = iσ_3,  (16)
τ_1τ_3 = iσ_2,  τ_2τ_3 = iσ_1,  (17)

and

τ_2τ_0 = σ_2,  τ_0τ_1 = σ_1,  τ_3τ_0 = σ_3.  (18)

the trace of any σ_j is zero. because of (16)–(18) we see that τ_1, τ_2, τ_3 is an orthogonal set of conjugations while τ_0 is a skew conjugation. now n_+(2) = 3 and n_−(2) = 1, as was asserted above.

acknowledgements

i would like to thank b. crell for helpful support.

references

[1] c. bennett, g. brassard, c. crepeau, r. jozsa, a. peres, w. wootters. teleporting an unknown quantum state via dual classical and einstein-podolsky-rosen channels. phys. rev. lett. 70: 1895–1898, 1993.
[2] i. bengtsson, k. życzkowski. geometry of quantum states. cambridge university press, cambridge, 2006.
[3] r. a. bertlmann, h. narnhofer, w. thirring. time-ordering dependence of measurements in teleportation. arxiv:1210.5646v1 [quant-ph]
[4] f. herbut, m. vujičić. basic algebra of antilinear operators and some applications. j. math. phys. 8: 1345–1354, 1966.
[5] r. a. horn, c. r. johnson. matrix analysis. cambridge university press, cambridge, uk, 1990.
[6] r. a. horn, c. r. johnson. topics in matrix analysis. cambridge university press, 1991.
[7] r. jost. the general theory of quantized fields. american math. soc., 1965.
[8] m. a. nielsen, i. l. chuang. quantum computation and quantum information. cambridge university press, 2000.
[9] a. uhlmann. quantum channels of the einstein-podolski-rosen kind. in: (a. borowiec, w. cegla, b. jancewicz, w. karwowski, eds.) proceedings of the xii max born symposium "fin de siecle", lecture notes in physics 539, 93–105, wroclaw 1998. springer, berlin, 2000. arxiv:9901027 [quant-ph]
[10] a. uhlmann. antilinearity in bipartite quantum systems and imperfect teleportation. in: (w. freudenberg, ed.) quantum probability and infinite-dimensional analysis 15, 255–268. world scientific, singapore, 2003. arxiv:0407244 [quant-ph]
[11] e. p. wigner. über die operation der zeitumkehr in der quantenmechanik. nachr. ges. wiss. göttingen, math.-physikal. klasse 1932, 31, 546–559.
[12] e. p. wigner. normal form of antiunitary operators. j. math. phys. 1: 409–413, 1960.
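as a numerical cross-check of the construction in section 4.1 (an added illustration, not part of the original paper): encoding each anti-linear τ by the matrix a with τφ = a·conj(φ), the product τ_jτ_k is linear with matrix a_j·conj(a_k), and the canonical form (5) becomes tr(a_k·conj(a_j)). the sketch below verifies that τ_1, τ_2, τ_3 are mutually orthogonal conjugations and that τ_0 is a skew conjugation orthogonal to them:

```python
import numpy as np

# matrices encoding the anti-linear maps of eq. (15): tau(phi) = A conj(phi)
A = {
    0: np.array([[0, -1], [1, 0]], dtype=complex),   # tau_0
    1: np.array([[1, 0], [0, -1]], dtype=complex),   # tau_1
    2: np.array([[1j, 0], [0, 1j]]),                 # tau_2
    3: np.array([[0, 1], [1, 0]], dtype=complex),    # tau_3
}

def compose(Aj, Ak):
    # tau_j tau_k is a linear operator with matrix Aj conj(Ak)
    return Aj @ np.conj(Ak)

def canonical_form(Aj, Ak):
    # (tau_j, tau_k) = tr(tau_k tau_j), cf. eq. (5)
    return np.trace(compose(Ak, Aj))

I2 = np.eye(2)
for m in (1, 2, 3):                            # conjugations: tau_m^2 = 1
    assert np.allclose(compose(A[m], A[m]), I2)
assert np.allclose(compose(A[0], A[0]), -I2)   # skew conjugation: tau_0^2 = -1

for j in range(4):                             # mutual orthogonality
    for k in range(j):
        assert np.isclose(canonical_form(A[j], A[k]), 0)

print("tau_1, tau_2, tau_3: orthogonal conjugations; tau_0: skew conjugation")
```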
acta polytechnica 53(5):444–449, 2013, doi:10.14311/ap.2013.53.0444

itineraries induced by exchange of two intervals

zuzana masáková, edita pelantová — department of mathematics fnspe, czech technical university in prague, trojanova 13, 120 00 praha 2, czech republic. corresponding author: edita.pelantova@fjfi.cvut.cz

abstract. we focus on the exchange t of two intervals with an irrational slope α. for a general subinterval i of the domain of t, the first return time to i takes three values. we describe the structure of the set of return itineraries to i. in particular, we show that it is equal to {r_1, r_2, r_1r_2, q}, where q is amicable with r_1, r_2 or r_1r_2.

keywords: interval exchange, first return map, return time. submitted: 27 march 2013. accepted: 22 april 2013.

1. introduction

we study the symbolic dynamical system given by the transformation t of the unit interval, t : [0, 1) → [0, 1),

t(x) = x − α for x ∈ [α, 1),  t(x) = x + 1 − α for x ∈ [0, α),  (1)

where α is a fixed number in [0, 1). the transformation t has only one discontinuity point; such a dynamical system is the simplest dynamical system with a discontinuous transformation. dynamical systems defined by continuous transformations f : j → j have a number of nice properties; for example, there exists a fixed point ρ ∈ j, f(ρ) = ρ. the famous theorem of sharkovskii [1] describes the structure of periodic points, i.e., fixed points of f^k for some k ∈ ℕ. if one chooses the parameter α in (1) irrational, the map t has no periodic point; in other words, the orbit {ρ, t(ρ), t²(ρ), . . .} is infinite for every ρ ∈ [0, 1). nevertheless, t has a weaker property, namely that although t^k(ρ) ≠ ρ for any k ∈ ℕ, one can get arbitrarily close to a point ρ with some of its iterations. more precisely,

∀ε > 0 ∃n ∈ ℕ, n ≥ 1 : |t^n(ρ) − ρ| < ε.  (2)

moreover, property (2) holds for every ρ ∈ [0, 1). it is well known that every point ρ ∈ [0, 1) can be uniquely represented using an infinite string of 0s and 1s which constitutes the binary expansion of the number ρ. the mapping t of (1) allows another type of representation of ρ, namely by the coding of the orbit of ρ under t. denote j_0 = [0, α), j_1 = [α, 1) and set

u_n = 0 if t^n(ρ) ∈ j_0,  u_n = 1 if t^n(ρ) ∈ j_1.

knowledge of the infinite word u_ρ := (u_n)_{n≥0} allows one to determine the number ρ, i.e., the mapping ρ ↦ u_ρ is one-to-one. the infinite words u_ρ defined above appear naturally in diverse mathematical problems; they were discovered and re-discovered several times and given different names. we will call the infinite word u_ρ a sturmian word with slope α and intercept ρ. let us point out one important difference between the binary expansion of numbers and their representation by sturmian words with a fixed slope α. every string of length n of letters 0 and 1 appears in the binary expansion of some real number ρ ∈ [0, 1); the number of such strings is obviously 2^n. by contrast, the list of all strings of length n appearing in the representations u_ρ of all ρ ∈ [0, 1) has exactly n + 1 elements. nevertheless, one can still represent a continuum of real numbers ρ. on the other hand, any type of representation using at most n strings of 0 and 1 of length n would allow representation of only countably many numbers.
in that sense, sturmian words represent real numbers in the most economical way. sturmian words have many other remarkable properties; for a review, see [2]. generalizations of sturmian words are treated in [3]. property (2) expresses the fact that the iterations t^n(ρ) return arbitrarily close to ρ. this allows one to define, for a subinterval i ⊂ [0, 1) of positive length, the so-called return time r : i → ℕ by

r(ρ) := min{n ∈ ℕ, n ≥ 1 : t^n(ρ) ∈ i}.

the return time represents the number of iterations needed for a point ρ to come back to the interval it comes from. the movement of the point ρ on its path from i back to i is recorded by the so-called i-itinerary of ρ, which we denote by R(ρ) (capital R, to distinguish it from the return time r). it is defined as the finite word w_0w_1 · · · w_{n−1} in the alphabet a = {0, 1} of length n = r(ρ) such that w_i = a if t^i(ρ) ∈ j_a, a ∈ a. equivalently, the i-itinerary R(ρ) of ρ is the prefix of the infinite word u_ρ of length r(ρ). in our considerations, the interval i is fixed. thus, for simplicity of notation, we avoid marking the dependence on i of the first return time and return itinerary, i.e., we write r(x), R(x) instead of r_i(x), R_i(x), respectively. the position of the point ρ ∈ i after its return to the interval i defines a new transformation t_i : i → i by

t_i(ρ) = t^{r(ρ)}(ρ),  (3)

which is usually called the first return map or induced map. the i-itineraries for special types of intervals i were studied in diverse contexts:

• if the boundary points of the interval i are neighbouring elements of the set {α, t^{−1}(α), . . . , t^{−n}(α)} for some n ∈ ℕ, then the set of i-itineraries R(ρ) for ρ ∈ i consists of only two words. this reformulates the result of vuillon [4] about the existence of exactly two return words to a fixed factor of a sturmian word.

• if the sturmian word u_ρ is invariant under a substitution 0 ↦ ϕ(0), 1 ↦ ϕ(1), then there exists an interval i ⊂ [0, 1), ρ ∈ i, such that the induced map t_i is homothetic to t, and the finite words ϕ(0), ϕ(1) are the i-itineraries. invariance of sturmian words under substitutions was studied by yasutomi [5].

• an abelian return word to a factor of a sturmian word is an i-itinerary for i = [0, β) or i = [β, 1) for some β ∈ [0, 1), see [6]. as follows from the result of [7], the intervals of the mentioned form have at most three itineraries r_1, r_2, r_3, and for their lengths one has |r_3| = |r_1| + |r_2|. in [8], we have shown that a stronger statement holds, namely that the word r_3 is a concatenation of the words r_1 and r_2.

the aim of this paper is to describe the structure of the set of i-itineraries for a general position and length of the subinterval i ⊂ [0, 1). the set of all i-itineraries R(x) for x ∈ i is denoted by it_i. for the description, we will use the notion of word amicability. we say that two finite words w and v over the alphabet {0, 1} are amicable if there exist words p, q ∈ {0, 1}* such that w = p01q and v = p10q, or w = p10q and v = p01q. in other words, v is obtained from w by interchanging the order of the letters 0 and 1 at two neighbouring positions i − 1, i. it follows from [9] that for every interval i there exist at most four i-itineraries, i.e., #it_i ≤ 4. we will show the following theorem.

theorem 1.1. let t be the transformation (1) for some irrational α ∈ (0, 1) and let i ⊂ [0, 1) be an interval.
then there exist words r_1, r_2 ∈ {0, 1}* such that for the set it_i of all i-itineraries one has

it_i ⊂ {r_1, r_2, r_1r_2, q},

where q is amicable with r_1, r_2 or r_1r_2.

from the proof of theorem 1.1 (at the end of section 2) one can see that in the generic case it_i = {r_1, r_2, r_1r_2, q}. in section 3 we discuss the possibilities for q if #it_i = 4 and determine the cases for which the set it_i has less than 4 elements.

2. interval exchange transformations

first, let us recall the definition and certain properties of k-interval exchange maps, which we use for k = 2 and 3.

definition 2.1. let j_0 ∪ j_1 ∪ · · · ∪ j_{k−1} be a partition of the interval j, where the j_i are intervals closed from the left and open from the right for every i = 0, . . . , k − 1. the transformation t : j → j is called a k-interval exchange if there exist constants c_0, c_1, . . . , c_{k−1} ∈ ℝ such that t(x) = x + c_j for x ∈ j_j, and t is a bijection on j.

since t is a bijection, the intervals t(j_j) for j = 0, 1, . . . , k − 1 form a partition of j. the order of the indices j, which determines the ordering of the intervals t(j_j) in j, is usually expressed by a permutation π. a trivial example of a k-interval exchange is the choice c_j = 0 for j = 0, . . . , k − 1; then t is the identity map and π is the identity permutation. the transformation t of (1) is a 2-interval exchange with permutation (21).

example 2.2. consider a, b ∈ (0, 1), a < b. put i_0 = [0, a), i_1 = [a, b), i_2 = [b, 1). then the transformation t : [0, 1) → [0, 1) given by

t(x) = x + 1 − a if x ∈ [0, a),  t(x) = x + 1 − a − b if x ∈ [a, b),  t(x) = x − b if x ∈ [b, 1),  (4)

is a 3-interval exchange with permutation π = (321), see figure 1.

from now on, we focus on the exchange t of two intervals given by the prescription (1) with an irrational slope α. we will study the first return map t_i defined by (3) on a subinterval i ⊂ [0, 1). in [10] it is shown how t_i depends on the length of the interval i. for an irrational α ∈ (0, 1) with continued fraction α = [0, a_1, a_2, . . . ] and convergents p_n/q_n, set

δ_{k,s} := |(s − 1)(p_k − αq_k) + p_{k−1} − αq_{k−1}|, for k ≥ 0, 1 ≤ s ≤ a_{k+1}.  (5)

for the numbers δ_{k,s} one has δ_{k,s} > δ_{k′,s′} if and only if k′ > k, or k′ = k and s′ > s.

figure 1. exchange of three intervals (the intervals i_0, i_1, i_2 and their images t(i_2), t(i_1), t(i_0)).

in [10], we study infinite words associated to cut-and-project sequences, which we show to be exactly the codings of exchanges of two or three intervals. the following proposition is a reformulation of statements of theorem 4.1 and proposition 4.5 of [10] in the framework of interval exchanges.

proposition 2.3. let t : [0, 1) → [0, 1) be an exchange of two intervals with irrational slope α and let i = [c, d) ⊂ [0, 1). for the induced map t_i one has:

(1.) if d − c = δ_{k,s} for some k, s defined in (5), then t_i is an exchange of two intervals.

(2.) otherwise, t_i is an exchange of three intervals with permutation (321).

moreover, the lengths of the intervals i_0, i_1, i_2 forming the partition of i depend only on d − c, and for the return times r(x_0), r(x_1), r(x_2) of points x_0 ∈ i_0, x_1 ∈ i_1, x_2 ∈ i_2, x_0 < x_1 < x_2, one has r(x_1) = r(x_0) + r(x_2).

remark 2.4. proposition 4.5 of [10] also allows one to determine the exact two or three values of the return time r(x) to i. in fact, if d − c = δ_{k,s}, then — keeping the notation of (5) — r(x) takes two values,

{r(x) : x ∈ i} = {q_k, s q_k + q_{k−1}}.

if d − c lies between δ_{k,s} and its successor in the decreasing sequence (δ_{k,s}), then r(x) takes three values,

{r(x) : x ∈ i} = {q_k, s q_k + q_{k−1}, (s + 1)q_k + q_{k−1}}.
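the return times and itineraries of remark 2.4 are easy to explore numerically. the following sketch (an added illustration, not part of the original paper; the function names are ours) iterates the exchange (1) and collects the return times and i-itineraries to a given interval; for α = (√5 − 1)/2 and the interval of example 3.1 below, it returns the three consecutive fibonacci return times {2, 3, 5} and the four itineraries computed there:

```python
from math import sqrt

def make_t(alpha):
    # the 2-interval exchange of eq. (1)
    return lambda x: x - alpha if x >= alpha else x + 1 - alpha

def return_data(x, alpha, c, d, max_iter=10**6):
    # first return time r(x) and i-itinerary R(x) for i = [c, d)
    t = make_t(alpha)
    word = []
    for n in range(1, max_iter + 1):
        word.append('0' if x < alpha else '1')   # coding over j_0, j_1
        x = t(x)
        if c <= x < d:
            return n, ''.join(word)
    raise RuntimeError("no return found")

alpha = (sqrt(5) - 1) / 2            # the slope sigma of section 3
s = alpha
c, d = s**4, s**4 + s**3 + s**6      # the interval of example 3.1
times, itineraries = set(), set()
for i in range(1, 2000):             # sample points of i
    x = c + (d - c) * i / 2000
    r, R = return_data(x, alpha, c, d)
    times.add(r)
    itineraries.add(R)
print(sorted(times))        # -> [2, 3, 5], consecutive fibonacci numbers
print(sorted(itineraries))  # -> ['001', '01', '010', '01001']
```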
the values of the return time are connected to the so-called three-distance theorem [11, 12]. another point of view on return times in sturmian words is presented in [13]. although the return time r(x) to a given interval i can take only three values, the set it_i of i-itineraries can have more than three elements. the following statement can be extracted from the proof of the theorem in [9, §2]; it is convenient to provide the demonstration here.

proposition 2.5. let t : [0, 1) → [0, 1) be an exchange of two intervals with irrational slope α and let i = [c, d) ⊂ [0, 1). then it_i has at most 4 elements.

proof. choose x ∈ i. denote by R(x) its i-itinerary and by r = r(x) its return time. let h ⊂ i be the maximal interval containing x such that for every x′ ∈ h one has R(x) = R(x′). for h, it holds that

(1.) t^i(h) ⊂ [0, α) or t^i(h) ⊂ [α, 1) for i = 0, 1, . . . , r − 1;
(2.) t^i(h) ∩ i = ∅ for i = 1, . . . , r − 1;
(3.) t^r(h) ⊂ i.

the theorem will be established by showing that there are only four candidates for the left end-point of the interval h = [c̃, d̃). obviously, one of them is c̃ = c. if that is not the case, maximality of h and properties (1.), (2.) and (3.) imply that c < c̃ < d̃ ≤ d and there exists

(a) l̃, with r − 1 ≥ l̃ ≥ 1, such that t^{l̃}(c̃) = d; or
(b) ñ, with r − 1 ≥ ñ ≥ 0, such that t^{ñ}(c̃) = α; or
(c) m̃, with r − 1 ≥ m̃ ≥ 1, such that t^{m̃}(c̃) = c.

suppose that possibility (a) happened; let us mention that it is possible only if d < 1. denote

l = min{k ∈ ℤ, k ≥ 1 : t^{−k}(d) ∈ i}.  (6)

since t^{−l̃}(d) = c̃ ∈ h ⊂ i, we have by the definition of l that l̃ ≥ l. we will show by contradiction that l̃ = l. if l̃ > l, then

t^{l̃−l}(c̃) = t^{−l}(t^{l̃}(c̃)) = t^{−l}(d) ∈ i,

and by the definition of the return time, r = r(c̃) ≤ l̃ − l. this contradicts the fact that l̃ ≤ r − 1. a similar discussion for possibilities (b) and (c) shows that the left end-point of the interval h is equal either to t^{−l}(d), where l is defined by (6), or to t^{−n}(α), where

n = min{k ∈ ℤ, k ≥ 0 : t^{−k}(α) ∈ i},  (7)

or to t^{−m}(c), where

m = min{k ∈ ℤ, k ≥ 1 : t^{−k}(c) ∈ i}.  (8)

this means that i is divided by the three (not necessarily distinct) points t^{−l}(d), t^{−n}(α), t^{−m}(c) into at most 4 subintervals h on which the i-itinerary is constant.

proposition 2.6. let it_i be the set of i-itineraries for the interval i = [c, d) ⊂ [0, 1) under an exchange of two intervals with irrational slope α. there exist neighbourhoods h_c and h_d of c, d, respectively, such that for every c̃ ∈ h_c and d̃ ∈ h_d with 0 ≤ c̃ < d̃ ≤ 1 one has it_ĩ ⊇ it_i, where ĩ = [c̃, d̃).

proof. let it_i = {r_1, . . . , r_p}. proposition 2.5 implies that p ≤ 4 and that, for every 1 ≤ i ≤ p, the elements x such that R(x) = r_i form an interval, say i_i. choose x_i ∈ i_i such that for q with 0 ≤ q ≤ r(x_i) − 1 = |r_i| − 1 one has t^q(x_i) ∉ {c, d, α} (it suffices to choose x_i ∉ ℤ[c, d, α]). denote m = {c, d, α} and n = {t^q(x_i) : i = 1, . . . , p, 0 ≤ q ≤ r(x_i) − 1}. put

ε := min{|a − b| : a ∈ m, b ∈ n}.
in particular, d− c 6= δk,s. from the proof of proposition 2.5, the interval i is divided into at most four subintervals with constant i-itinerary by points λ = t−l(d), l = min { k ≥ 1 : t−k(d) ∈ i } , µ = t−m(c), m = min { k ≥ 1 : t−k(c) ∈ i } , ν = t−n(α), n = min { k ≥ 0 : t−k(α) ∈ i } . moreover, λ and µ separate intervals with different return times. in particular, for sufficiently small ε, one has l = r(λ−ε) < r(λ + ε), m = r(µ + ε) < r(µ−ε). (9) by proposition 2.3, the induced map ti is an exchange of three intervals with permutation (321). let i = i0∪ i1 ∪ i2 be the corresponding partition of i, where for every x0 ∈ i0, x1 ∈ i1, x2 ∈ i2 one has x0 < x1 < x2. by the same proposition r(x1) = r(x0) + r(x2), which together with inequalities (9) implies that the right end-point of i0 is equal to λ, the left end-point of i2 is equal to µ, and r(x1) = l + m. since c,d,d− c /∈ z[α], we also have λ /∈ z[α], and thus one can choose ε sufficiently small, so that the interval [λ − ε,λ + ε] does not contain any of the points t−j(α) for 0 ≤ j ≤ l + m. this implies that tj ( [λ−ε,λ + ε] ) is an interval not containing α for any j = 0, 1, . . . , l + m − 1, and consequently, the prefix of length l + m of the infinite word uρ is the same for any ρ ∈ [λ−ε,λ + ε]. we have t l(λ−ε) = d−ε ∈ i, t l(λ + ε) = d + ε /∈ i. for the corresponding i-itineraries, we thus have r(λ + ε) = r(λ−ε)r(d−ε). we can set r1 = r(λ−ε), r2 = r(d−ε), to have iti ⊃{r1,r2,r1r2}. by proposition 2.5, the set iti may have four elements. let us determine the fourth element q. consider the point ν = t−n(α), n = min{k ≥ 0 : t−k(α) ∈ i}, which, by the proof of proposition 2.5 splits one of the intervals i0, i1, i2, into two, so that the i-itinerary on the new partition is constant. by the assumption that c,d /∈ z[α], we have ν 6= λ, ν 6= µ. consider the points ν−ε, ν +ε for sufficiently small ε. obviously, their return time coincides, r(ν −ε) = r(ν+ε) = r(ν), thus the i-itineraries r(ν−ε), r(ν+ε) are of the same length r(ν). since tn(ν) = α, we have tn+1(ν) = 0 /∈ i, and thus r(ν) ≥ n + 1. we can see that tn+1(ν + ε) = ε, tn+2(ν + ε) = 1 −α + ε, tn+1(ν −ε) = 1 −ε, tn+2(ν −ε) = 1 −α−ε, which implies that r(ν −ε) = u0 · · ·un−101un+2 · · ·ur(ν)−1, r(ν + ε) = u0 · · ·un−110un+2 · · ·ur(ν)−1. necessarily, r(ν−ε) and r(ν +ε) are amicable words. one of them is q, the other one is equal to r1, r2 or r1r2, according to whether the point ν belongs to i0, i1 or i2. 3. case study let us give several examples illustrating the possible outcomes for the set iti of i-itineraries for general subinterval i = [c,d) ⊂ [0, 1). according to our main theorem 1.1, we have iti ⊂{r1,r2,r1r2,q}, where q is a word amicable with one of r1,r2,r1r2. in fact, as we see in the following examples, we can have all possibilities. for simplicity in the examples, we always keep α = σ, where σ = 12 ( √ 5 − 1) is the reciprocal of the golden ratio. in calculations, we use the relation σ2 = σ + 1. first, we choose the most generic cases, namely examples where #iti = 4. let i = [c,d) where d − c = σ3 + σ6. since d − c 6= δk,s for any k,s, by proposition 2.3, the induced map ti is an exchange of three intervals with permutation (321), and, moreover, the lengths of exchanged intervals i0, i1, i2 do not depend on the position of the interval i. in the notation introduced in the proof of theorem 1.1, λ = c + σ4, µ = c + σ3. hence, in particular, i0 = [c,c + σ4), i1 = [c + σ4,c + σ3), i2 = [c + σ3,c + σ3 + σ6). 
independently of c, the return time r(x) to the interval i satisfies

r(x) = 3 if x ∈ i_0,  5 if x ∈ i_1,  2 if x ∈ i_2.

(in fact, for any subinterval i ⊂ [0, 1) the return time takes two or three values, for α = σ always equal to two or three consecutive fibonacci numbers.) we consider several examples of positions of the interval i.

example 3.1. let c = σ⁴. then ν = t^{−1}(α) = σ³ ∈ i_0 splits the interval i_0 into i_0 = i_0^l ∪ i_0^r, where i_0^l = [σ⁴, σ³), i_0^r = [σ³, σ³ + σ⁶). the i-itinerary satisfies

R(x) = 001 if x ∈ i_0^l,  010 if x ∈ i_0^r,  01001 if x ∈ i_1,  01 if x ∈ i_2.

we put r_1 = 01, r_2 = 001, r_1r_2 = 01001, q = 010, where q is amicable with r_2. note that we have another choice of notation: r_1 = 010, r_2 = 01, r_1r_2 = 01001, q = 001, where q is amicable with r_1.

example 3.2. let c = σ⁶. then ν = t^{−1}(α) = σ³ ∈ i_1 splits the interval i_1 into i_1 = i_1^l ∪ i_1^r, where i_1^l = [σ⁴ + σ⁶, σ³), i_1^r = [σ³, σ³ + σ⁶). the i-itinerary satisfies

R(x) = 001 if x ∈ i_0,  00101 if x ∈ i_1^l,  01001 if x ∈ i_1^r,  01 if x ∈ i_2.

we put r_1 = 001, r_2 = 01, r_1r_2 = 00101, q = 01001, where q = r_2r_1 is amicable with r_1r_2.

example 3.3. let c = σ³ + σ⁵ + σ⁷. then ν = t⁰(α) = σ ∈ i_2 splits the interval i_2 into i_2 = i_2^l ∪ i_2^r, where i_2^l = [σ² + σ⁴ + σ⁶ + σ⁹, σ), i_2^r = [σ, σ + σ⁷). the i-itinerary satisfies

R(x) = 010 if x ∈ i_0,  01010 if x ∈ i_1,  01 if x ∈ i_2^l,  10 if x ∈ i_2^r.

we put r_1 = 01, r_2 = 010, r_1r_2 = 01010, q = 10, where q is amicable with r_1; or r_1 = 010, r_2 = 10, r_1r_2 = 01010, q = 01, where q is amicable with r_2.

let us discuss the cases for which #it_i < 4. this can happen if d − c ≠ δ_{k,s} (i.e., t_i is still an exchange of three intervals) but ν ∈ {c, λ, µ}. it can be derived from the proof of theorem 1.1 that, in this case, the set of i-itineraries is of the form it_i = {r_1, r_2, r_1r_2}. note that c = 0 is a special case of such a situation: we have c = 0 = t(α), whence µ = t^{−m}(0) = t^{−m+1}(α) = ν. similarly, the case d = 1 corresponds to λ = ν.

example 3.4. let c = σ², d − c = σ³ + σ⁶. then ν = t⁰(α) = σ = µ. the i-itinerary satisfies

R(x) = 010 if x ∈ i_0,  01010 if x ∈ i_1,  10 if x ∈ i_2.

with r_1 = 010, r_2 = 10, we have it_i = {r_1, r_2, r_1r_2}.

consider the situation that d − c = δ_{k,s} for some k, s as defined in (5). by proposition 2.3, the induced map t_i is an exchange of two intervals, since λ = µ. the set of i-itineraries is then either it_i = {r_1, r_2}, which happens if ν ∈ {c, λ}, or it_i = {r_1, r_2, q}, where q is amicable with r_1 or with r_2, according to the position of ν in the interval i.

4. conclusions

notions such as return time, return itinerary, first return map, etc. for the exchange of two intervals have been studied by many authors; for an overview, see for example [14]. these notions occur in various contexts such as return words, abelian return words, or substitution invariance of the corresponding codings, i.e., sturmian words. the many equivalent definitions of sturmian words allow one to combine different points of view, which contributes substantially to the solution of such problems. a detailed solution of analogous questions for exchanges of more than two intervals is still unknown. we believe that at least for exchanges of three intervals one can obtain an explicit description of return times and return itineraries, since the corresponding codings are geometrically representable by cut-and-project sequences, in a similar way that sturmian words are identified with mechanical words.
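as a small machine check of the amicability claims in examples 3.1–3.3 above (an added illustration, not part of the original paper; the helper name is ours):

```python
def amicable(w, v):
    # w, v amicable: v arises from w by one swap of adjacent letters 01 <-> 10
    if len(w) != len(v) or w == v:
        return False
    diff = [i for i in range(len(w)) if w[i] != v[i]]
    return (len(diff) == 2 and diff[1] == diff[0] + 1
            and sorted(w[diff[0]:diff[0] + 2])
            == sorted(v[diff[0]:diff[0] + 2]) == ['0', '1'])

assert amicable('010', '001')      # example 3.1: q amicable with r2
assert amicable('01001', '00101')  # example 3.2: q = r2 r1 amicable with r1 r2
assert amicable('10', '01')        # example 3.3: q amicable with r1
assert not amicable('01', '0101')  # different lengths are never amicable
```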
acknowledgements

the results presented in this paper, as well as other results about exchanges of intervals, have been obtained with the use of a geometric representation of the associated codings in the frame of the cut-and-project scheme, see [10]. we were led to this topic from the study of mathematical models of quasicrystals, based on the rich collaboration of our institute with jiří patera from centre de recherches mathématiques in montréal. this collaboration was initiated, encouraged and is still maintained by prof. miloslav havlíček, in whose honour this issue of acta polytechnica is published. we acknowledge financial support from czech science foundation grant 13-03538s.

references

[1] o. m. sharkovskii. co-existence of cycles of a continuous mapping of the line into itself. ukrain. mat. ž. 16:61–71, 1964.
[2] j. berstel. sturmian and episturmian words (a survey of some recent results). in algebraic informatics, vol. 4728 of lecture notes in comput. sci., pp. 23–47. springer, berlin, 2007.
[3] l. balková, e. pelantová, v. starosta. sturmian jungle (or garden?) on multiliteral alphabets. rairo theor. inform. appl. 44(4):443–470, 2010.
[4] l. vuillon. a characterization of sturmian words by return words. european j. combin. 22(2):263–275, 2001.
[5] s. i. yasutomi. on sturmian sequences which are invariant under some substitutions. in number theory and its applications (kyoto, 1997), vol. 2 of dev. math., pp. 347–373. kluwer acad. publ., dordrecht, 1999.
[6] m. rigo, p. salimov, e. vandomme. some properties of abelian return words. j. int. sequences 16:13.2.5, 2013.
[7] s. puzynina, l. q. zamboni. abelian returns in sturmian words. j. combin. theory ser. a 120(2):390–408, 2013.
[8] z. masáková, e. pelantová. enumerating abelian returns to prefixes of sturmian words. in words 2013, vol. 8079 of lecture notes in comput. sci., pp. 193–204. springer, heidelberg, 2013.
[9] m. keane. interval exchange transformations. math. z. 141:25–31, 1975.
[10] l.-s. guimond, z. masáková, e. pelantová. combinatorial properties of infinite words associated with cut-and-project sequences. j. théor. nombres bordeaux 15(3):697–725, 2003.
[11] v. sós. on the distribution mod 1 of the sequence nα. ann. univ. sci. budapest eötvös sect. math. 1:127–134, 1958.
[12] n. b. slater. gaps and steps for the sequence nθ mod 1. proc. cambridge philos. soc. 63:1115–1123, 1967.
[13] m. kupsa. local return rates in sturmian subshifts. acta univ. carolin. math. phys. 44(2):17–28, 2003.
[14] p. kůrka. topological and symbolic dynamics, vol. 11 of cours spécialisés. société mathématique de france, paris, 2003.

acta polytechnica 55(2):101–108, 2015, doi:10.14311/ap.2015.55.0101

protection of wooden materials against biological attack by using nanotechnology

michal havrlik (department of physics, faculty of civil engineering, czech technical university in prague, thakurova 7, 166 29 prague 6, czech republic; corresponding author: havrlik.michal@seznam.cz), pavla ryparová (department of building structures, faculty of civil engineering, czech technical university in prague, thakurova 7, 166 29 prague 6, czech republic)

abstract.
this article is focused on the protection of wooden materials by nanofibrous textiles with a biocidal addition, and continues the work of a group at the center for nanotechnology at the faculty of civil engineering of the ctu. timber is a natural material predisposed to biodegradation and susceptible to biological corrosion, and it is therefore essential to study suitable and effective protection against microorganisms. the study compares the biocidal efficiency of a polymer solution applied as a coating with that of a layer of nanofiber textile. we used polyvinyl alcohol (pva) as the base polymer, enriched with active substances from the commercial product lignofix e-profi, with solutions of cuso4 · 5h2o and agno3, and finally with colloidal silver as an example of nanoparticles. the final concentration of the biocidal substance was 1 (v/wt) % in the fiber. the nanofiber textiles were produced on a nanospider ns lab 500 device (elmarco, cr) with a rotating cylinder electrode. the study was divided into two parts, the first being an agar plate test and the second a test on timber samples. a mixture of molds (alternaria tenuissima, pochonia bulbiosa, trichoderma viride and acremonium sclerotigenum) was used as the model organism. a comparison of the efficiency of the polymer paint and the nanofiber textiles on agar plates showed no difference; the best results there were obtained with pva doped with substances from the commercial biocidal treatment lignofix e-profi. on timber samples the results differed: the best protection was achieved by treatment with pva doped with silver nitrate. treatment with non-doped pva behaved as anticipated, showing no fungicidal protective effect.

keywords: pva; wood; fungicidal protection; mold; electrospinning; nanofiber textiles.

1. introduction

wood belongs among the most broadly applied organic building materials. timber without any protective treatment is degraded by many biological pests and by abiotic agents such as water, moisture or temperature changes, which alter its mechanical, aesthetic, physical and other properties. the important biological pests of wood are wood-decaying fungi, molds and bacteria. fungi are the most frequently occurring biological agents on timber, and wood-decaying fungi are the most dangerous pests of all: they can destroy a whole wooden structure. protection and remediation against wood-decaying fungi is a complicated and difficult process; once timber is infected, remediation of the timber in a building is nearly impossible without removing the infected wooden parts. in practice, however, microscopic filamentous fungi (commonly called molds) are usually found first, and this article focuses on them. molds develop and grow most often on the surface layer of organic or inorganic substrates [1]. the basic building unit of a mold is a filiform formation called a hypha, and a mycelium is formed by many hyphae. the mycelium is very unstable and can contaminate other surfaces and objects; this is the origin of its reproduction, followed by further growth of hyphae [2]. mold needs suitable ambient conditions for its growth. the first of them is a sufficiently wet substrate: the air humidity should be near 100 %. the suitable temperature for the development of mold is about 27–37 °c, and the ph between 5 and 7 [2]. currently, the problem of biological degradation of timber is very frequently discussed by experts in civil engineering and architecture.
this is caused by the interest in ecological and natural living and in the use of renewable resources and building materials. contemporary occupants demand a high standard of indoor well-being, and structural design alone cannot protect timber building elements without additional chemical biocidal treatment. this leads to the question of the balance between the efficiency of a biocidal treatment and human health — a question connected with the type of chemical compounds, their concentration, their anchorage in the building envelope, and so on. contemporary law demands the use of compounds that are safe for humans and that do not change the surrounding environment and species diversity.

figure 1. from left top: trichoderma viride, acremonium sclerotigenum; from left bottom: pochonia bulbiosa, alternaria tenuissima.

2. the experimental part

the protection of construction wood against biological degradation is realized by many conventional methods such as painting, dipping, impregnation, etc. a novel approach, which is being studied extensively, is nanotechnology. this technology enables another way of protecting building materials, including timber and cement-based materials [3], using lower concentrations of active agents that are leached out only gradually. we assume that some forms based on this principle might also be very suitable for the protection of wood. nowadays, many effective agents are prohibited because of their harmful consequences for the surrounding environment, and it is therefore necessary to look for novel protection methods. nanotechnology could be a new functional way to protect building materials [4, 5]. nanofiber textiles are used in medicine, in electro-optics and in other branches. they have begun to be used in civil engineering, particularly for their hydrophobic properties combined with high breathability [6]. a nanofiber textile can be used as a scaffold carrying a bioactive substance, which is incorporated by nucleation [7] or directly during production; the fiber may then take on the properties of the supplement. among the most cited biocidal agents against wood deteriogens are ions of silver and copper; their biocidal properties depend on their concentration [8]. silver ions have been widely used for their antibacterial properties for centuries; they are safe for the human organism and can be used as a base for biocidal treatment against fungi or bacteria [9]. these experiments focus on comparing the fungicidal properties of a polymer solution doped with biocidal additives, first in plate tests and subsequently on spruce building blocks. the second part deals with preparing nanofiber textiles from the same polymer solution and comparing their fungicidal properties in the same experimental setting.

3. materials and methods

3.1. production of pva solution

the base polymer used in this work is polyvinyl alcohol (pva), a white powdery substance of crystalline character with wide industrial application [10]. the basic polymer solution had the following composition: 375 g of 16 % pva (sloviol r16, fichema), 4.4 g of glyoxal, 3 g of 80 % h3po4, and demineralized water added up to 500 g [11].

figure 2. from left: mold inoculation through immersion, 5 days after inoculation, inoculation of mold with cut-outs.
table 1. list of the pva solutions, including biocidal agents.

signification | material | type of ions | mass concentration in solution [%]
ps (pure pva) | pva solution | – | –
as (pva + ag) | pva solution | ag+ | 0.25
cs (pva + cu) | pva solution | cu+ | 0.25
acs (pva + ag + cu) | pva solution | cu+ + ag+ | 0.25
coas (pva + col. ag) | pva solution | colloidal ag | 20 ppm
bs (pva + lignofix e-profi) | pva solution | lignofix e-profi | 0.25

table 2. list of the pva nanofiber textiles, including biocidal agents.

signification | material | type of ions | mass concentration in nanofiber [%]
pt (pure pva) | pva nanofiber textile | – | –
at (pva + ag) | pva nanofiber textile | ag+ | 2
ct (pva + cu) | pva nanofiber textile | cu+ | 2
act (pva + ag + cu) | pva nanofiber textile | cu+ + ag+ | 2
coat (pva + col. ag) | pva nanofiber textile | colloidal ag | 167 ppm
bt (pva + lignofix e-profi) | pva nanofiber textile | lignofix e-profi | 2

the biocidal agents (tab. 1) were stirred into the basic polymer solution. we selected the following agents: the commercial biocide lignofix e-profi, copper sulfate pentahydrate (cuso4 · 5h2o) as an example of copper ions, silver nitrate (agno3) as an example of silver ions, colloidal silver as an example of nanoparticles, and the combination of silver nitrate and copper sulfate pentahydrate for evaluation of a potential synergic enhancement of their biocidal properties. the final mass concentration of the agents in the basic solution was in the range of 0.25 wt %.

3.2. production of nanofiber textiles

the nanofiber textiles were produced using an electrostatic spinning device, nanospider ns lab 500 s (elmarco, czech republic), in the center for nanotechnology at the faculty of civil engineering of the czech technical university and in the joint laboratory of the institute of physics ascr and fce ctu. the setup of the device is the same as in the previous article [12]. the fabrics were produced in the form of monolayers with a weight per square meter of 5 g/m². the mass concentration of the biocidal agents in the fiber was around 2 %, except for the colloidal silver sample, where it was 167 ppm.

3.3. model organism

microscopic filamentous fungi were used in this experiment as the model organism. the organisms were obtained by swabbing in situ from buildings in the czech republic; the following consortium was identified: alternaria tenuissima, pochonia bulbiosa, trichoderma viride and acremonium sclerotigenum.

3.4. plate experiment

the evaluation of the fungicidal properties of the polymer solutions and nanofiber textiles was made on malt agar (merck, usa). the model organism was applied as 100 µl of the fungal mixture (alternaria tenuissima, pochonia bulbiosa, trichoderma viride and acremonium sclerotigenum) in physiological solution, spread over the whole plate; the samples were placed in the center.
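the relation between the 0.25 wt % dose in the solution (section 3.1) and the roughly 2 % concentration in the dry fiber (section 3.2) follows from the evaporation of water during electrospinning. a quick back-of-the-envelope check (an added illustration, not part of the original paper; the dry-mass bookkeeping is our assumption):

```python
# approximate dry-mass balance for 500 g of the basic solution (section 3.1)
pva_dry = 0.16 * 375      # 60 g of pva from 375 g of 16 % solution
glyoxal = 4.4             # g
h3po4   = 0.80 * 3        # 2.4 g of acid from 3 g of 80 % solution
biocide = 0.0025 * 500    # 1.25 g, i.e. 0.25 wt % of the whole solution

dry_total = pva_dry + glyoxal + h3po4 + biocide
fraction_in_fiber = biocide / dry_total
print(f"{100 * fraction_in_fiber:.1f} % in the dry fiber")  # about 1.8 %, i.e. "around 2 %"
```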
experiment on building block of timber the second experiment was used building block of timber. we had chosen spruce for their common usage in czech civil engineering. the wooden surface was inoculated by three methods. the first on was by immersing of sterile block into fungi mixture (as in plate experiment) for 5 s as. the second type of inoculation was by mixture of mold on malt agar. the cut had square shape with size 10 and 1 mm. the last method was used smear from the agar plate and it was placed on wooden samples. the cultivation was made in the plastic boxes with optimal conditions for mold growth, 28 ± 2 °c temperature and rh around 98 % for 5 days. these three methods have been implemented for monitoring the impact of agar for molds growth. the cut outs are shown in (fig. 2). four basic experiments have been performed to study antibacterial qualities of nanofiber textiles. 3.6. fungicidal activity of polymer solution on agar plate the aim of this experiment was demonstrate the fungicidal efficiency of the polymer solutions doped biocide ions. the prepared polymer solution was homogenized by ultrasonication for 20 minutes and placed into petri dish in volume 100 µl and spread over whole surface. the drying was in 37 °c for 1 hour. the agar plate without polymer solution was used as positive control. every agar plate was inoculated 20 µl mixture of model organisms and it was incubated at 28 ± 2 °c temperature and rh around 98 %. the evaluation was recorded every twelfth hour during 64 hours. the results are recorded in fig. 3. 3.7. fungicidal activity of nanofiber textiles on agar plate the next experiment want demonstrated the fungicidal efficiency nanofiber textiles doped biocide ions. the mixture of model organism was dripped on center on agar plate and covered by nanofieber textile with circle shape with diameter 20 mm. the results was classified as a percentage of increase of growth fungi on agar plate during 64 hours. the results are shown in fig. 4. 3.8. fungicidal activity of polymer solution on wooden blocks we used sterilized the building block of timber with size 50 × 30 × 15 mm. the biocidal treatment was made by paint of polymer solution (same as a 3.1, the samples named in tab. 1). the drying was in 24 ± 2 °c temperature and rh around 60 % during 24 hours. the experiment was made in three repeated from each type of solution. in respond to previously results, we did not applied colloidal silver (coas) 104 vol. 55 no. 2/2015 protection of wooden materials against biological attack figure 5. from left: macro picture of the sample, micro shot focuses on a clean part of the sample. figure 6. growth of mold on the outside of the nanofiber textile sample of pure pva day 32. for their ineffective fungicidal properties in this low concentration. the positive control was used the building block of wood without protection. further, the samples were inoculated by fungi cut with residue of malt agar. the incubation was under optimal condition (28 ± 2 °c; rh 98 %). the growth of molds was monitored during 39 days. 3.9. fungicidal activity of nanofiber textiles on wooden blocks the samples was inoculated by immersion into mixture of fungi and pre-incubated under same condition as a whole experiment for 5 days. the nanofiber textile fungicidal protection was applied after visual growth mycelium in 5th day (see fig. 2). the nanofiber textile samples was used same as above. the wooden samples was coated nanofiber textile. 
the coverage of the wood by nanofiber textiles was complicated due to the very small mechanical resistance of the fabric itself without a spunbond backing. the samples were placed under conditions of 28 ± 2 °c and rh 98 %. the whole experiment was observed for 39 days.

4. results

4.1. fungicidal activity of the polymer solutions on agar plates

the experiment was designed to demonstrate the fungicidal properties of the polymer solutions and of the equivalent nanofiber textiles. the evaluation was based on the change of fungal growth, calculated as a percentage increase. the highest protective efficiency was shown by the polymer solution doped with lignofix e-profi, which almost completely restricted fungal growth. the samples as and acs showed slower growth; the remaining samples showed small or no fungicidal effects. the samples with pure pva gave an interesting result: this polymer brought in nutrients for the fungi and therefore performed worse than wood without protection. this is due to the fact that pva contains carbon which the molds can metabolize. the results are shown in fig. 3.

4.2. fungicidal activity of the nanofiber textiles on agar plates

the most effective protection was achieved by the sample bt (lignofix e-profi). the less effective additive to pva appears to be colloidal silver (coat). the pure pva sample did not demonstrate any biocidal activity and works only as a barrier.

4.3. fungicidal activity of the polymer solutions on wooden blocks

in the first stage of the experiment the assumption was confirmed that the fastest fungal growth occurred in the experiment with the greatest amount of agar residue (cut-out 1 × 1 cm). the protected samples showed a greater increase with the 1 × 1 cm cut-out, because on unprotected wood the agar soaked into the wood. the fungicidal efficiency of the protection was evaluated by eye, since the mycelium grows at the macroscopic level. photo documentation was performed at defined intervals over 39 days; the final progress of growth is recorded in tab. 3.
the worst biocidal properties had showed copper sulphate pentahydrate. the most efficiency had showed solution with addition of silver ions. we had checked the sample with best protection by microscopy observation. the fungi had grown only on residue of agar but the wood sample was uninfected after 39 days. the acs sample showed the same properties on a macroscopic scale as a sample as. microscopic analysis of the sample acs showed fungal growth even after a clean timber. the visible of molds in macro scale was only a question of time. this result can indicate that copper in a low concentration can influences a fungicidal activity of the silver. we will interested in it in next exploration 4.4. fungicidal activity of nanofiber textiles on wooden blocks in the last experiment, we evaluated fungicidal properties of nanofiber textiles doped biocidal ions on wood samples which was inoculated by mold direct on wood. it was observed of penetration of fungi mycelium through textiles. the first symptom of growth through had shown sample pt after 32 days (see fig. 6) the textiles made from pure pva had shown no fungicidal activity identically previous results. the individual hyphae of fungi grown into structure of nanofiber textiles but did not get over it. this result supposed good efficiency that in case if this protection is used to healthy timber in contaminated environment. the wood with this protection will not be degraded. the growth mold on the outside of the sample pt is origin rather from the external environment. the more problem was shown with me106 vol. 55 no. 2/2015 protection of wooden materials against biological attack chanical stability of nanofiber textiles and with their application than with their efficiency. fungicidal properties of the pva fabric has not been shown. this nanofiber textile has used only as a scaffold for carry biocidal agents. it will be necessary repeat this experiment with longer duration than 40 days for confirmation of fungicidal activity as a remediation treatment. we demonstrated only barrier properties for 40 days. 5. discussion and conclusion 5.1. solutions and textiles the protection by solution is evinced lower antifungal efficiency to compare with equal based nanofiber textiles. it is depended on concentration of added ions. the solution contains the 0.25 % ions and the nanofiber textiles have 2 % due evaporated water during electrospinning preparation. the polymer solution has not limited the maximum amount of biocidal supplements. on the other hand, the nanofiber textiles basic solution has limited concentration up to 3.5 %. if it is re-counted to concentration in nanofiber, it will be up to 22 % in fiber. the other factor of usage nanofiber textiles is their difficult application in real construction. the polymer solution is applied very easily and it is stable. the disadvantage of polymer solution is in uniformly distribution in volume and the easy leaching. 5.2. pva the fungicidal properties of polymer solution is limited and it is determined by low concentration supplement and supply of carbon for microorganisms. this effect was found in both types of protections. 5.3. protective biocide ions the efficiency of copper (copper sulphate pentahydrate) was the biggest surprise of the experiment and it is identical with copper fungicidal properties presented in literature [5, 13]. the amount of fungicidal effect is depended on concentration and if it is used low concentration it will support growth of fungi. 
silver nitrate confirmed biocide activity. the activity of silver is monitored form very low concentration the last biocidal agent was lignofix e – profi. the manufacturer’s recommended concentration is 1 : 9. we used concentration around 1 : 100 and this concentration had still fungicidal activities. 5.4. summary the biocide efficiency of ions in solutions, and textiles were studied. the highest fungicidal effect was achieved support by lignofix e – profi in plate experiments. in some cases, the solution form has killing effect to fungi. other agents showed lower reactivity than lignofix e – profi, silver ions belong to the best from other ones. the pure pva does not confirmed biocide activity. the barrier properties of the fabric were not confirmed, due to of a strong presence of mold growth on agar plate. the second growth medium was spruce. the protection by solution doped silver ions achieved the best result. the pure pva has not fungicidal protective effect for wood samples. the experiment had proposed with suitable condition for growth fungi which is not occur in building construction. therefore it should not expect total killing effect for fungi. overall, this work confirmed fungicidal efficiency of selected agents in form of polymer solution as well as in textiles and it shown this activity even at low concentration. acknowledgements this work was supported by “grand application of advanced technologies in modern engineering“ sgs14/111/ohk1/2t/11. the work has been performed within framework of joint laboratory of nanofibers technology of institute of physics ascr and faculty of civil engineering ctu in prague. references [1] r. wasserbauer, biologické znehodnocení staveb [biological damage of structures]. abf arch: praha, prague, 2000. [2] o. fassatiová, plísně a vláknité houby v technické mikrobiologii:(příručka k určování). sntl, prague, 1979. [3] p. demo, a. sveshnikov, š. hošková, d. ladman, and p. tichá, “physical and chemical aspects of the nucleation of cement-based materials,” acta polytech., vol. 52, no. 6, 2012. [4] p. ryparová, z. rácová, p. tesárek, and r. wasserbauer, “the antibacterial activity of nanofiber based on poly-vinyl-alcohol (pva) doped by metal,” nanocon10, p. 23 – 25, 2012. [5] p. ryparová, r. wasserbauer, p. tesárek, and z. rácová, “preparation of antimicrobial treatment interiors using nanotextiles,”central europe towards sustainable building, 2013. [6] p. gibson, h. schreuder-gibson, and d. rivin, “transport properties of porous membranes based on electrospun nanofibers,” colloids surf. physicochem. eng. asp., vol. 187–188, p. 469–481, aug. 2001. [7] p. demo, z. kožíšek, and r. šášik, “analytical approach to time lag in binary nucleation,” phys. rev. e, vol. 59, no. 5, p. 5124, 1999. [8] a. c. s. hastrup, f. green iii, c. a. clausen, and b. jensen, “tolerance of serpula lacrymans to copper-based wood preservatives,” int. biodeterior. biodegrad., vol. 56, no. 3, p. 173–177, oct. 2005. [9] a. v. vegera and a. d. zimon, “synthesis and physicochemical properties of silver nanoparticles stabilized by acid gelatin,” russ. j. appl. chem., vol. 79, no. 9, p. 1403–1406, sep. 2006. [10] j. l. gardon, “encyclopedia of polymer science and technology,” mark hf, p. 833–863, 1965. [11] p. tesárek, v. nezerka, r. toupek, t. pachý, and p. ryparová, “macro mechanical testing of nanofibers: tensile strength,” proceedings of the 50th annual conference on experimental stress analysis, p. 465–468, 2012. 107 michal havrlik, pavla ryparová acta polytechnica [12] p. 
[12] p. tesárek, p. ryparová, z. rácová, v. králík, j. němeček, a. kromka, v. nežerka, "mechanical properties of single and double-layered pva nanofibers," key eng. mater. 586, p. 261–264, 2014.
[13] z. rácová, p. hrochová, p. ryparová, "treatment of timber by nanofiber fabric with biocide compound," advanced materials research 1000, p. 154–157, 2014.

acta polytechnica 55(2):101–108, 2015

acta polytechnica 55(2):76–80, 2015, doi:10.14311/ap.2015.55.0076

proton stopping power of different density profile plasmas
david casas(a,*), manuel d. barriga-carrasco(a), alexander a. andreev(b), matthias schnürer(b), roberto morales(a)
(a) e.t.s.i. industriales, universidad de castilla-la mancha, av. camilo josé cela s/n, 13071 ciudad real, spain
(b) max born institute, max born str. 2a, d-12489 berlin, germany
(*) corresponding author: david.casas2@alu.uclm.es

abstract. in this work, the stopping power of a partially ionized plasma is analyzed in terms of free-electron stopping and bound-electron stopping. for the former, the rpa dielectric function is used; for the latter, an interpolation between high and low projectile velocity formulas is used. the dynamical energy loss of an ion beam inside a plasma is estimated using an iterative scheme of calculation. the abel inversion is also applied when the plasma has radial symmetry. finally, we apply our methods to two kinds of plasmas. in the first, we estimate the energy loss in a plasma created by a laser prepulse, whose density is approximated by a piecewise function. in the second, a radial electron density is assumed and the stopping is obtained as a function of radius from the calculated lateral points. in both cases, the dependence on the density profile is observed.

keywords: stopping power, laser-accelerated protons, density profile targets, energy loss, bound electrons, free electrons, plasma physics.

1. introduction
nuclear fusion has a promising future as a clean, practically inexhaustible, and sustainable energy source for humankind. a large amount of energy can be obtained from a dense and highly energetic deuterium-tritium plasma, but producing such a dense and overheated plasma poses great challenges. one of the chosen methods is by means of energetic beams, either high power lasers or fast particles. in the latter case, it is important to study the energy loss of an ion beam that passes through a plasma target, in order to understand the interaction of swift particles with a nuclear fusion fuel pellet.
the proton is the lightest ion that can be accelerated, and it reaches a high velocity owing to its high charge-to-mass ratio. furthermore, the technique of laser-accelerated proton beams has undergone great development in recent years [1, 2]. the low longitudinal emittance of the beam, together with a continuous distribution of proton kinetic energies of a few mev, makes it possible to trace the temporal evolution of strong electric and magnetic fields in plasma foils [3]. a diagram of this process is shown in figure 1. the temporal evolution of the energy loss can be evaluated using proton streak deflectometry, where the proton energy, which encodes the time, is resolved using a magnetic spectrometer.

[figure 1: sketch of proton beam interaction with plasma target.]

electronic stopping is the main process that deposits the energy of an ion or proton beam on the plasma target. for partially ionized plasmas, this stopping is divided into two contributions: free electrons and bound electrons. the two are calculated by different methods: for the former, the random phase approximation (rpa) dielectric function is used, and for the latter, an interpolation formula between the limits of high and low projectile velocity, together with hartree-fock calculations of the atomic quantities, is utilized [4, 5]. the stopping power calculation methods are described in section 2. afterwards, the stopping power of plasmas with different density profiles is estimated for a proton beam in section 3, and finally a summary of this work is given in section 4. we use atomic units (a.u.), \(e = \hbar = m_e = 1\), to simplify the expressions.

2. theoretical methods
2.1. free electron stopping
the stopping power of free plasma electrons can be calculated using the rpa dielectric function (df). in this df, the effect of a swift charged particle passing through an electron gas is treated as a perturbation that loses energy proportionally to the square of its charge. the slowing-down is thereby reduced to a treatment of the properties of the medium only, and a linear description of these properties may be applied.

the rpa dielectric function is developed in terms of the wave number \(k\) and the frequency \(\omega\) provided by a consistent quantum mechanical analysis. the rpa analysis yields the expression [6]
\[ \varepsilon_{\mathrm{RPA}}(k,\omega) = 1 + \frac{1}{\pi^2 k^2} \int \mathrm{d}^3k'\, \frac{f(\vec k + \vec k\,') - f(\vec k\,')}{\omega + i\nu - (E_{\vec k + \vec k\,'} - E_{\vec k\,'})}, \tag{1} \]
where \(E_{\vec k} = k^2/2\). the temperature dependence is included through the fermi-dirac function
\[ f(\vec k) = \frac{1}{1 + \exp[\beta(E_k - \mu)]}, \]
where \(\beta = 1/k_B T\) and \(\mu\) is the chemical potential of the plasma with electron density \(n_e\) and temperature \(T\). in this part of the analysis, we assume the absence of collisions, so that the collision frequency tends to zero, \(\nu \to 0\).

the analytic rpa dielectric function for plasmas at any degeneracy can be obtained directly from (1) [7, 8]:
\[ \varepsilon_{\mathrm{RPA}}(k,\omega) = 1 + \frac{g(u+z) - g(u-z)}{4 z^3 \pi k_F}, \tag{2} \]
where \(g(x)\) corresponds to
\[ g(x) = \int_0^{\infty} \frac{y\,\mathrm{d}y}{\exp(D y^2 - \beta\mu) + 1}\, \ln\!\left(\frac{x+y}{x-y}\right); \]
\(u = \omega/(k v_F)\) and \(z = k/(2 k_F)\) are the common dimensionless variables [6], \(D = E_F \beta\) is the degeneracy parameter, and \(v_F = k_F = \sqrt{2 E_F}\) is the fermi velocity in a.u.

finally, the electronic stopping of free plasma electrons is calculated in the dielectric formalism as
\[ Sp_F(v) = \frac{2 Z_p^2}{\pi v^2} \int_0^{\infty} \frac{\mathrm{d}k}{k} \int_0^{k v} \mathrm{d}\omega\, \omega\, \mathrm{Im}\,\frac{-1}{\varepsilon_{\mathrm{RPA}}(k,\omega)}, \tag{3} \]
where \(Z_p\) is the charge and \(v\) is the velocity of the projectile; the equation is in atomic units.
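to make the structure of eq. (2) concrete, the sketch below evaluates \(g(x)\) and \(\varepsilon_{\mathrm{RPA}}\) by direct numerical quadrature in python. it is a minimal illustration under stated assumptions, not the authors' code: the reduced chemical potential \(\beta\mu\) is taken as an input (in practice it must be solved from the normalization of the fermi-dirac distribution), and a small imaginary frequency stands in for the collisionless limit \(\nu \to 0\).

```python
import numpy as np
from scipy.integrate import quad

def g(x, D, beta_mu):
    """g(x) of eq. (2) for complex argument x, via quadrature of real/imag parts."""
    def fermi(y):
        arg = np.clip(D * y**2 - beta_mu, -700.0, 700.0)   # avoid overflow in exp
        return 1.0 / (np.exp(arg) + 1.0)
    def integrand(y, part):
        val = y * fermi(y) * np.log((x + y) / (x - y))     # complex log
        return val.real if part == "re" else val.imag
    # the fermi factor cuts the integrand off; choose a safe finite upper limit
    y_max = max(2.0 * abs(x), np.sqrt(max((beta_mu + 40.0) / D, 1.0)))
    re = quad(integrand, 0.0, y_max, args=("re",), limit=300)[0]
    im = quad(integrand, 0.0, y_max, args=("im",), limit=300)[0]
    return re + 1j * im

def eps_rpa(k, w, kF, D, beta_mu, nu=1e-6):
    """eq. (2): rpa dielectric function at any degeneracy (atomic units)."""
    u = (w + 1j * nu) / (k * kF)     # v_F = k_F in a.u.; small nu mimics nu -> 0
    z = k / (2.0 * kF)
    return 1.0 + (g(u + z, D, beta_mu) - g(u - z, D, beta_mu)) / (4.0 * z**3 * np.pi * kF)

# energy loss function entering the double integral of eq. (3)
loss = lambda k, w, kF, D, bm: (-1.0 / eps_rpa(k, w, kF, D, bm)).imag
```

the outer double integral of eq. (3) can then be built on top of `loss` with any standard quadrature; as the next paragraph notes, doing this at every beam velocity is slow, which motivates the database interpolation.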
the calculation of the stopping power using (3) can be difficult and computationally slow in some cases. a fast and accurate alternative is interpolation from a database [9], where the variables to interpolate are temperature and density for every pair of coordinates \((v_i, Sp_i)\) on the interpolation grid. the database holds a set of rpa result files for different conditions of temperature and density. for the sake of simplicity, the bilinear method is used for the 2d interpolation; it is the extension of linear interpolation to functions of two variables on a regular 2d grid. in figure 2, we can see the difference between the direct calculation of the rpa stopping and its interpolation, for a free electron density of 3.6 × 10²¹ e⁻/cm³. both curves are similar, with a slight difference near the stopping maximum, where the interpolated rpa lies below the calculated one.

[figure 2: stopping power as a function of proton beam velocity. blue solid line: direct calculation of rpa. red dashed line: interpolation of rpa.]

2.2. bound electron stopping
the stopping power of a cold gas or a plasma for an ion with charge \(Z\) has often been calculated with the well-known bethe formula [10]
\[ Sp = \left(\frac{Z e\, \omega_p}{v_p}\right)^2 \ln\frac{2 m_e v_p^2}{I}, \tag{4} \]
where \(\omega_p^2 = 4\pi n_e e^2/m_e\) is the square of the plasma frequency and \(n_e\) denotes the bound electron density. \(I\) is the mean excitation energy, which averages all the energy exchanged in excitation and/or ionization processes between a fast charged particle and the target bound electrons. in atomic units, (4) simplifies to
\[ Sp = \left(\frac{Z \omega_p}{v_p}\right)^2 \ln\frac{2 v_p^2}{I}. \tag{5} \]
\(I\) can be estimated by several methods. a short expression is deduced in ref. [11],
\[ I = \sqrt{\frac{2K}{\langle r^2 \rangle}}, \tag{6} \]
where \(K\) is the electron kinetic energy and \(\langle r^2 \rangle\) is the average of the square of the radius. these quantities can be estimated for the whole atom/ion, or shell by shell, using atomic calculations.

the bethe equation has a disadvantage: when the argument of the logarithm in (5) is less than one, it yields a negative stopping, which has no physical meaning. to avoid this difficulty, an interpolation formula obtained in ref. [12] is used in this work. this expression interpolates the stopping number between the limits of high and low projectile velocity, which are separated by an intermediate velocity:
\[ L_b(v) = \begin{cases} L_h(v) = \ln\dfrac{2v^2}{I} - \dfrac{2K}{v^2} & \text{for } v > v_{\mathrm{int}}, \\[4pt] L_B(v) = \dfrac{\alpha v^3}{1 + G v^2} & \text{for } v \le v_{\mathrm{int}}, \end{cases} \tag{7} \]
\[ v_{\mathrm{int}} = \sqrt{3K + 1.5\, I}, \tag{8} \]
where \(G\) is fixed by the continuity condition \(L_h(v_{\mathrm{int}}) = L_B(v_{\mathrm{int}})\) and \(\alpha\) is the friction coefficient for low velocities. \(L_b(v)\) substitutes the logarithm in (5). using (5) and (7), the stopping power of bound electrons for a proton beam (\(Z = 1\)) is
\[ Sp_b = \frac{4\pi n_{\mathrm{at}}}{v^2}\, L_b(v). \tag{9} \]

2.3. energy loss in a thick plasma target
the energy loss of a proton beam in a material such as a plasma is a dynamical process. the beam enters with an initial energy \(E_{p0}\) and loses energy at a rate given by the stopping power function. this energy loss can be calculated with an iterative scheme: the plasma length is divided into segments, and the energy loss in the \(i\)-th step is evaluated as \(E_{l,i} = Sp_i\,\Delta x\), where \(Sp_i\) is the stopping in the \(i\)-th segment and \(\Delta x\) its length.
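a minimal sketch of this scheme, combining the bound-electron stopping of eqs. (7)-(9) with the segment-by-segment energy update, is given below. it is a simplified stand-in for the full calculation (the free-electron term of eq. (3) is omitted), and the material parameters \(I\), \(K\), \(\alpha\) and \(n_{\mathrm{at}}\) are placeholders the user must supply; everything is in atomic units.

```python
import numpy as np

M_P = 1836.15  # proton mass in atomic units

def l_b(v, I, K, alpha):
    """interpolated stopping number, eqs. (7)-(8)."""
    v_int = np.sqrt(3.0 * K + 1.5 * I)
    l_h_int = np.log(2.0 * v_int**2 / I) - 2.0 * K / v_int**2
    G = (alpha * v_int**3 / l_h_int - 1.0) / v_int**2   # continuity at v_int
    if v > v_int:
        return np.log(2.0 * v**2 / I) - 2.0 * K / v**2
    return alpha * v**3 / (1.0 + G * v**2)

def sp_bound(v, n_at, I, K, alpha):
    """bound-electron stopping power for z = 1, eq. (9)."""
    return 4.0 * np.pi * n_at / v**2 * l_b(v, I, K, alpha)

def depth_profile(E0, dx, n_at, I, K, alpha, max_steps=100000):
    """iterative scheme of sec. 2.3: E_{i+1} = E_i - Sp_i * dx."""
    E, out = E0, []
    for i in range(max_steps):
        if E <= 0.0:
            break
        v = np.sqrt(2.0 * E / M_P)
        sp = sp_bound(v, n_at, I, K, alpha)
        out.append((i * dx, E, sp))   # (depth, energy, stopping)
        E -= sp * dx
    return out
```

the bragg peak is read off as the depth at which the recorded stopping is largest; in the full calculation of figure 3, with the complete free-plus-bound stopping, the same loop places it at 0.169 cm for a 1.5 mev proton in the aluminum plasma considered there.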
applying this iterative calculation to a plasma stopping profile, it is possible to obtain the energy loss profile and the bragg peak of a proton beam that is stopped completely inside the target. in figure 3, both curves are calculated for a partially ionized aluminum plasma.

[figure 3: stopping power (thick line) and proton energy (dashed line) as a function of depth for a 1.5 mev proton beam. the bragg peak is placed at 0.169 cm.]

2.4. abel inversion
the abel inversion is a mathematical technique that has been used to analyze proton imaging data from inertial confinement fusion experiments [13, 14]. with this technique, a set of radial points is obtained from a corresponding set of lateral data points. the relationship between the measured lateral intensity \(I(y)\) and the desired radial intensity \(i(r)\) is shown schematically in figure 4 and is given by
\[ I(y) = \int_{-x_0}^{x_0} i(r)\, \mathrm{d}x. \tag{10} \]

[figure 4: the radial distribution i(r) cannot be measured directly, but only through the integral I(y) in the x-direction.]

in this expression the integral is taken along a strip at constant \(y\), with \(x^2 + y^2 = r^2\); \(x_0^2 = R^2 - y^2\) gives the \(x\) coordinate of the plasma edge at the given \(y\), and \(R\) is the radius beyond which \(i(r)\) is negligible. hence, assuming radial symmetry, (10) can also be written
\[ I(y) = 2 \int_y^R i(r)\, \frac{r}{\sqrt{r^2 - y^2}}\, \mathrm{d}r. \tag{11} \]
equation (11) is one form of abel's integral equation. the reconstruction of the unknown function \(i(r)\) from the measured data \(I(y)\) can be done analytically by means of the inverse of abel's integral equation,
\[ i(r) = -\frac{1}{\pi} \int_r^R \frac{\mathrm{d}I(y)}{\mathrm{d}y}\, \frac{\mathrm{d}y}{\sqrt{y^2 - r^2}}. \tag{12} \]
the experimental measurement of \(I(y)\) provides a discrete set of data points, so neither the differentiation nor the integration in (12) can be performed directly. for this reason, the nestor-olsen method [15] is used in section 3 to apply the abel inversion to a discrete set of stopping power points.

3. results
using the stopping power expressions (3) and (9), it is possible to estimate the energy loss of a proton beam for different target density distributions: a rectangular shape with constant density, and a piecewise approximation of a trapezium shape with a density profile given by [16]
\[ n_i(z) = \frac{2\, n_{i,\max}}{1 + \exp\!\left(\dfrac{2x\,\theta(x)}{l_r} - \dfrac{2x\,\theta(-x)}{l_{fr}}\right)}. \tag{13} \]
equation (13) is the density distribution obtained when a laser prepulse hits a thin target with a thickness of 1 micron or less. here \(x = z - 0.5\, l_r\) and \(\theta(x)\) is the heaviside step function. the parameters \(l_r\) and \(l_{fr}\) were obtained for different initial target thicknesses \(l_f\) from hydrocode calculations [16]; \(l_r\), \(l_{fr}\), and \(l_f\) are expressed in microns. for a solid aluminum target, \(n_{i,\max} = 6 \times 10^{22}\) cm⁻³. the different density profiles and the corresponding proton beam energies are shown in figure 5.

[figure 5: the target density profiles (top) and the corresponding energy loss functions (bottom).]

the energy loss of a proton beam has also been considered for a plasma with radial symmetry. in this case, an electron density rising from the external shells to the inner core has been simulated by means of a piecewise function, as can be seen in figure 6. the energy loss of the proton beam is obtained at every lateral position \(x_i\); then \(Sp(x_i)\) can be calculated, and applying the abel inversion to this lateral point set by means of the nestor-olsen method, it is possible to obtain the stopping as a function of radius, \(Sp(r_i)\), as shown in figure 7.
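before turning to the conclusions, a small numerical illustration of the inversion step used above may help. the sketch below discretizes eq. (12) directly with a finite-difference derivative and trapezoidal quadrature; it is a naive stand-in for the nestor-olsen scheme the authors actually use, shown only to make the lateral-to-radial reconstruction concrete.

```python
import numpy as np

def abel_invert(y, I_lat):
    """naive discretization of eq. (12); y must be increasing and I_lat(R) ~ 0."""
    dIdy = np.gradient(I_lat, y)            # finite-difference dI/dy
    i_rad = np.zeros_like(y)
    for j in range(len(y) - 1):
        yy = y[j + 1:]                       # start just above r_j, avoiding the singularity
        i_rad[j] = -np.trapz(dIdy[j + 1:] / np.sqrt(yy**2 - y[j]**2), yy) / np.pi
    return i_rad

# quick self-check on a uniform disk, i(r) = 1 for r < a, whose exact
# lateral projection from eq. (11) is I(y) = 2 sqrt(a^2 - y^2)
a, y = 0.8, np.linspace(0.0, 1.0, 400)
I_lat = 2.0 * np.sqrt(np.clip(a**2 - y**2, 0.0, None))
print(abel_invert(y, I_lat)[:5])            # should be close to 1 near the axis
```

the nestor-olsen coefficients replace the crude derivative-plus-trapezoid pair here and behave better on noisy experimental points, which is why they are preferred for the measured \(Sp(x_i)\) data.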
4. conclusions
the stopping power of a partially ionized plasma has been divided into two contributions. for free electrons, an rpa dielectric function obtained by interpolation from a database of discrete rpa calculations has been proposed; it has been shown that the differences between the calculated rpa stopping and the interpolated one are negligible.

[figure 6: electron density as a piecewise function of the radius.]
[figure 7: stopping power as a function of the lateral and radial point sets.]

in the case of bound electrons, a set of formulas for high and low projectile velocities has been proposed, which has the advantage of yielding positive values of the stopping for any proton velocity. the energy loss has been evaluated using an iterative scheme that provides an accurate value of the bragg peak and of the total stopping depth for a proton beam traversing an extended plasma. in the case of a plasma with radial symmetry, the abel inversion has been used to obtain radial parameters from lateral measurements.

finally, two kinds of plasma have been analyzed using the above methods. the first, created by a laser prepulse, is approximated by a piecewise function and compared with a rectangular profile containing the same number of particles. in both cases the final energy loss is practically the same, but there are some differences in the proton beam energy profile. the second kind of plasma has a symmetric radial distribution, with a density that decreases from the inside outwards. the abel inversion has been applied to the stopping estimated from the lateral measurements, yielding the stopping as a function of radius, which follows the proposed radial electron density more closely.

acknowledgements
this work is supported by the spanish ministerio de economía y competitividad (project mineco: ene2013-45661-c2-1-p). i would like to thank the university of castilla-la mancha and the diputación of ciudad real for providing my current doctoral grant. i would also like to express my gratitude to cátedra enresa, which supported my doctoral stay at the max born institute in berlin, and to my supervisors there for their invitation and kind treatment.

references
[1] a. macchi, m. borghesi, m. passoni. ion acceleration by superintense laser-plasma interaction. reviews of modern physics 85(2):751–793, 2013. doi:10.1103/revmodphys.85.751.
[2] h. daido, m. nishiuchi, a. s. pirozhkov. review of laser-driven ion sources and their applications. reports on progress in physics 75(5), 2012. doi:10.1088/0034-4885/75/5/056401.
[3] f. abicht, m. schnürer, j. braenzel, et al. coaction of strong electrical fields in laser irradiated thin foils and its relation to field dynamics at the plasma-vacuum interface. in laser acceleration of electrons, protons, and ions ii; and medical applications of laser-generated beams of particles ii; and harnessing relativistic plasma waves iii, vol. 8779 of proceedings of spie, prague, czech republic, apr. 15–18, 2013. doi:10.1117/12.2017395.
[4] d. casas, m. d. barriga-carrasco, j. rubio. evaluation of slowing down of proton and deuteron beams in ch2, lih, and al partially ionized plasmas. physical review e 88(3), 2013. doi:10.1103/physreve.88.033102.
[5] m. d. barriga-carrasco, d. casas. electronic stopping of protons in xenon plasmas due to free and bound electrons. laser and particle beams 31(1):105–111, 2013. doi:10.1017/s0263034612000900.
[6] j. lindhard.
on the properties of a gas of charged particles. matematisk-fysiske meddelelser kongelige danske videnskabernes selskab 28(8):1–57, 1954.
[7] c. gouedard, c. deutsch. dense electron-gas response at any degeneracy. journal of mathematical physics 19(1):32–38, 1978. doi:10.1063/1.523508.
[8] n. r. arista, w. brandt. dielectric response of quantum plasmas in thermal equilibrium. physical review a 29(3):1471–1480, 1984. doi:10.1103/physreva.29.1471.
[9] m. d. barriga-carrasco. pelo and pelos java programs. http://www.uclm.es/area/amf/manuel/programas.htm. accessed: 2013-03-01.
[10] h. bethe. the theory of the passage of fast corpuscular rays through matter. annalen der physik 5(3):325–400, 1930.
[11] x. garbet, c. deutsch, g. maynard. mean excitation energies for ions in gases and plasmas. journal of applied physics 61(3):907–916, 1987. doi:10.1063/1.338141.
[12] m. d. barriga-carrasco, g. maynard. a 3d trajectory numerical simulation of the transport of energetic light ion beams in plasma targets. laser and particle beams 23(2):211–217, 2005. doi:10.1017/s0263034605040097.
[13] j. l. deciantis, f. h. seguin, j. a. frenje, et al. proton core imaging of the nuclear burn in inertial confinement fusion implosions. review of scientific instruments 77(4), 2006. doi:10.1063/1.2173788.
[14] f. h. seguin, j. l. deciantis, j. a. frenje, et al. measured dependence of nuclear burn region size on implosion parameters in inertial confinement fusion experiments. physics of plasmas 13(8), 2006. doi:10.1063/1.2172932.
[15] o. nestor, h. olsen. numerical methods for reducing line and surface probe data. siam review 2(3):200–207, 1960. doi:10.1137/1002042.
[16] a. a. andreev, s. steinke, t. sokollik, et al. optimal ion acceleration from ultrathin foils irradiated by a profiled laser pulse of relativistic intensity. physics of plasmas 16(1), 2009. doi:10.1063/1.3054528.

acta polytechnica 53(supplement):524–527, 2013, doi:10.14311/ap.2013.53.0524

study of the time dependence of radioactivity
e. bellotti(a), c. broggini(b,*), g. di carlo(c), m. laubenstein(c), r. menegazzo(b)
(a) università degli studi di milano bicocca and istituto nazionale di fisica nucleare, sezione di milano, milano, italy
(b) istituto nazionale di fisica nucleare, sezione di padova, padova, italy
(c) istituto nazionale di fisica nucleare, laboratori nazionali del gran sasso, assergi, italy
(*) corresponding author: broggini@pd.infn.it

abstract. the activity of a 137cs source was measured using a germanium detector installed deep underground in the gran sasso laboratory.
in total, about 5100 energy spectra, with a measuring time of one hour each, were collected and used to search for time variations of the decay constant with periods from a few hours to 1 year. no signal with amplitude larger than 9.6 × 10⁻⁵ at 95 % c.l. was detected. these limits are more than one order of magnitude lower than the values of the oscillation amplitude reported in the literature. the same data give a value of 29.96 ± 0.08 years for the 137cs half-life, in good agreement with the world mean value of 30.05 ± 0.08 years.

keywords: radioactivity, beta decay, gran sasso.

1. introduction
interest in the time dependence of the decay constant of radioactive nuclei has increased strongly in recent times [1], since various experiments began to report evidence of such an effect. in particular, the constancy of the activity of a 226ra source was measured using an ionization chamber in [2]. an annual modulation of amplitude 0.15 % was observed, with the maximum in february and the minimum in august. the activity of a 152eu source was measured using a ge(li) detector, and an even larger annual modulation (0.5 %) was detected. alburger et al. [3] measured the half-life of 32si, which is interesting for many applications in earth science. data collected over a period of four years show an annual modulation with an amplitude of about 0.1 %.

in [1, 4] the existence of a new and unknown particle interaction was put forward to explain the yearly variation in the activity of radioactive sources [2, 3]. the authors correlate these variations with the sun-earth distance. however, the possibility that anti-neutrinos affect β+ decay was excluded in a recent reactor experiment [5]. the paper by jenkins et al. [1] triggered strong interest in the subject. old and recent data have been analyzed, or reanalyzed, to search for periodic and sporadic variations with time. some of the measurements and analyses confirm the existence of oscillations [6, 7], whereas others contradict this hypothesis [8–10]. for example, [6] presents the time dependence of the counting rate of 60co, 90sr and 90y sources, measured using geiger-müller detectors, and of a 239pu source, measured with silicon detectors. while the beta sources show annual (and monthly) variations with an amplitude of about 0.3 %, the count rate from the pu source is fairly constant.

many authors have called for more dedicated experiments to clarify whether the observed effects are physical or are due to systematic effects not taken into account. we therefore performed a dedicated experiment using a 137cs source. a time dependence of the order of 0.2 % with periods of 24 hours and 27 days has already been reported [11] in the decay constant of 137cs measured with a germanium detector during a 4-month experiment. the special feature of our experiment is the set-up installed deep underground in the infn gran sasso laboratory. the laboratory conditions are very favorable: the cosmic ray flux is reduced by a factor of 10⁶ and the neutron flux by a factor of 10³ with respect to above ground. as a consequence, we do not have to take possible time variations of these fluxes into account, since their contribution to the counting rate is completely negligible. moreover, the laboratory temperature is naturally constant, with a maximum variation of a few degrees celsius over the course of the year.

2. the measurements
the set-up is installed in the stella (subterranean low level assay) low background facility located in the underground laboratories of lngs.
the 137cs source, embedded in a plastic disk 1″ in diameter and 1/8″ in thickness, had an activity of 3.0 kbq at the beginning of the measurements (june 6, 2011), and it is firmly fixed to the copper end-cap of the germanium detector in order to minimize variations in the relative positions of the source and the detector. as a matter of fact, monte carlo simulations indicate that a variation of 1 micron in the source-detector distance would cause a variation of 5 × 10⁻⁵ in the counting efficiency.

[figure 1: measured γ-ray spectrum of the 137cs source after 5124 hours. a weak activity from 60co, 152eu and 40k contaminants is observed at high energy (see inset). the 40k line is also evident in the background spectrum. the peak at 1.323 mev results from the accidental coincidence of two 661.6 kev γ-lines when two 137cs decays occur within the pulse pair resolution of the spectroscopy amplifier.]

the germanium detector, a p-type high-purity germanium with 96 % relative efficiency, is powered at a nominal bias of 3500 v and is surrounded by at least 5 cm of copper followed by 25 cm of lead to suppress the laboratory gamma-ray background. finally, the shielding and the detector are housed in a polymethylmethacrylate box flushed with nitrogen at slight overpressure, which works as an anti-radon shield. the signal from the detector pre-amplifier goes first to an ortec amplifier (mod. 120-5f), where it is shaped with a 6 µs shaping time, and then to a multichannel analyser (easy-mca 8k ortec, 0.21 kev/channel). the busy and pile-up signals from the amplifier are also sent to the mca. the pile-up rejection circuit of the amplifier is able to recognize two signals if they are separated in time by at least 0.5 µs. in order to minimize the noise, the whole set-up, detector and electronics modules, is powered through a 3 kva ac–ac isolation transformer. finally, temperature and atmospheric pressure are the two important parameters that we have to monitor. the temperature is continuously measured with a pt100 sensor located in the shielding very close to the cs source, whereas the atmospheric pressure is measured in a nearby hall of the underground laboratory. the temperature at the position of the electronics modules is also continuously monitored, since it can affect the performance of the electronics.

2.1. data taking
spectra from the mca are collected every hour (real time provided by the mca), except while the detector is being refilled with liquid nitrogen. these refilling interruptions occur twice a week and last less than 2 hours. the quality of the measurement is monitored by checking the energy resolution at the cs line (661.6 kev). its value, 1.79 ± 0.02 kev, is stable, thus proving that the electronic noise does not change with time. this is also confirmed by the counting rate in the lowest mca channels, below 7 kev, where the noise is dominant and the rate remains constant with time. the position of the cs peak is also constant, within 0.21 kev, proving the stability of the electronics. finally, we monitored the dead time, provided by the mca. it changed smoothly from the initial value of 5.10 % to 4.98 % as a consequence of the decreasing activity of the cs source.

2.2. energy spectrum
the spectrum measured with the present set-up is shown in fig. 1. modest signals from 40k, 60co and 152eu are visible in the energy spectrum above 1 mev.
the 40k line is also present in the background spectrum, while the 60co and 152eu activities are related to the source. their contribution to the total count rate, estimated by monte carlo simulation, is 1.7 × 10⁻² hz for 60co and 2.7 × 10⁻³ hz for 152eu and 40k. the total count rate above the threshold of 7 kev is about 700 hz. the intrinsic background, i.e. the shielded detector without the cs source, was measured over a period of 70 days. thanks to the underground environment and the detector shielding, the intrinsic background is very low, down to about 40 counts/hour above the 7 kev threshold (0.01 hz). the signature of the 137cs source is given by the outstanding peak at 661.6 kev and by the peak at 1.323 mev due to the sum of two 661.6 kev gamma rays too close in time to be recognized by the pile-up circuit.

3. results
the data discussed here were collected continuously over 217 days, from june 6, 2011 to january 9, 2012. the activity of the source can be calculated for any interval of time that is a multiple of one hour. in particular, fig. 2 shows the activity per 4 days as a function of time, i.e. the integral from 7 kev to 1.7 mev of the mca spectra collected during 4 days and corrected for the dead time. a total error equal to the linear sum of the statistical uncertainty (6.4 × 10⁻⁵ relative) and the fluctuations of the dead time (2.8 × 10⁻⁵ relative) was assigned to each data point. the decrease in activity is due to the source decay.

[figure 2: detected activity of the 137cs source. the dead-time corrected data are summed over 96 hours. the first two points correspond to the beginning of data taking, when the set-up was stabilizing, and they are not considered in the analysis. dotted lines represent a 0.1 % deviation from the exponential trend. residuals (mid panel) of the measured activity with respect to the exponential fit. error bars include statistical uncertainties and fluctuations in the measured dead time. bottom panel: laboratory pressure as a function of time.]

the continuous line in fig. 2 is the exponential fit obtained with the mean life which minimizes the chi squared: 43.22 years (we take a year of 365.25 days of 86 400 seconds each). the reduced chi squared per degree of freedom is 1.02. during the 7-month period of data taking, the temperature at the cs source increased smoothly by 0.4 k, with a few larger variations of up to 0.7 k during one week. there is no effect on the data correlated with these temperature variations. in the same period, the pressure inside the laboratory varied in the range ±10 mbar. for these pressure changes we can also exclude any significant effect on the rate.

3.1. 137cs half-life
the 137cs half-life has been determined by many authors over a period of more than fifty years (for a critical review of the available data, see [12]). the data collected in this measurement allow a new and precise estimate of the 137cs half-life. the data are fitted with an exponential function leaving two parameters free: the initial decay rate and the half-life. the resulting half-life is 10 942 ± 30 days, to be compared with the recommended value of 10 976 ± 30 days.
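a two-parameter exponential fit of this kind is straightforward to reproduce. the sketch below does it in python on synthetic 4-day binned rates; the numbers are invented placeholders with roughly the relative error quoted above, not the experimental data.

```python
import numpy as np
from scipy.optimize import curve_fit

def decay(t, A0, T12):
    """two free parameters, as in the text: initial rate A0 and half-life T12 (days)."""
    return A0 * np.exp(-np.log(2.0) * t / T12)

# synthetic stand-in for the dead-time-corrected 4-day bins (hypothetical values)
rng = np.random.default_rng(1)
t = np.arange(8.0, 218.0, 4.0)                    # days since the start of data taking
rate = decay(t, 700.0, 10976.0) * (1.0 + 1e-4 * rng.standard_normal(t.size))
err = 1e-4 * rate                                  # ~1e-4 relative total error per point

popt, pcov = curve_fit(decay, t, rate, sigma=err, absolute_sigma=True, p0=(700.0, 11000.0))
print(f"half-life = {popt[1]:.0f} +/- {np.sqrt(pcov[1, 1]):.0f} days")
```

with errors of this size over 217 days, the fit pins the half-life to a few tens of days, consistent with the ±30 day uncertainty quoted in the text.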
3.2. time modulations
we searched for time variations with periods from 6 hours to 400 days. for short periods, up to 40 days, it is appropriate to use the discrete fourier transform of the hourly data, after subtracting the exponential trend. for longer periods, we performed a chi-squared analysis of the daily count rate, fitting with a superposition of the exponential decay function (with our mean-life estimate of 43.22 years) and a sine function of fixed period. the initial activity, the amplitude of the oscillation and its phase are left free. we scanned all periods from 40 to 400 days, studying in particular the period of 1 solar year. no significant improvement of the chi squared per degree of freedom was observed in this range. our analysis excludes any oscillation with amplitude larger than 9.6 × 10⁻⁵ at 95 % c.l. in particular, for an oscillation period of 1 year the amplitude is 3.1(2.7) × 10⁻⁵, well compatible with zero, and a limit of 8.5 × 10⁻⁵ at 95 % c.l. on the maximum allowed amplitude is set independently of the phase [13].

4. conclusions
the half-life of a 137cs source has been estimated to be 10 942 ± 30 days, in agreement with its recommended value. moreover, from our measurements we can exclude the presence of time variations in the source activity, superimposed on the expected exponential decay, larger than 9.6 × 10⁻⁵ at 95 % c.l. for oscillation periods in the range from 6 hours to 1 year. in particular, we exclude an oscillation amplitude larger than 8.5 × 10⁻⁵ at 95 % c.l. correlated to the variation of the sun-earth distance. data taking is now continuing to cover the full one-year period.

references
[1] j. h. jenkins et al., astropart. phys. 32 (2009) 42
[2] h. siegert et al., appl. radiat. isot. 49 (1998) 1397
[3] d. e. alburger et al., earth and planetary science lett. 78 (1986) 168
[4] e. fischbach et al., arxiv:1106.1470 (2011), rencontre de moriond 2011 (3-2011), la thuile
[5] r. j. de meijer et al., appl. radiat. isot. 69 (2011) 320
[6] a. g. parkhomov, arxiv:1004.1761 (2010)
[7] d. javorsek ii et al., astropart. phys. 34 (2010) 173
[8] j. c. hardy et al., arxiv:1108.5326 (2011)
[9] p. s. cooper, astropart. phys. 31 (2009) 267
[10] e. b. norman et al., astroparticle physics 31 (2009) 135
[11] y. a. baurov, mod. phys. lett. a 16 (2001) 2089
[12] r. g. helmer, v. p. chechev, bureau international des poids et mesures, monographie-5, vol. 1–6 (2011)
[13] e. bellotti et al., phys. lett. b 710 (2012) 114

discussion
smedar bressler: is it possible to plan an experiment to modify artificially variations in the decay time of radioactive nuclei?
carlo broggini: to my knowledge, the only decay which can be modified is electron-capture decay, by changing the pressure at the place where the source is.
lawrence jones: i wondered whether radioactive decay time might also have a vertical height dependence, i.e. analogous to the gravitational redshift of electromagnetic radiation.
carlo broggini: i think so, since the effect is due to time dilation in the gravitational field. however, even for a source placed on the surface of the sun the size of the relative effect, as compared to a source on earth, would only be about 2 × 10⁻⁶.
anatoly petrukhin: did you try to search for correlations between your effect and the annual cosmic ray variation or any other source of radiation?
carlo broggini: we do not see any time dependence of the decay constant. in addition, the relative contribution of the background, as compared to the signal, is about 10⁻⁵ in our experiment. even a 10 % variation of the background would be undetectable for us.
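the period scan of section 3.2 above can be mimicked with a few lines of python: at each fixed trial period the decay-plus-sine model is refitted with free activity, amplitude and phase, and the chi squared is recorded. this is a schematic reconstruction of the procedure described in the text, not the collaboration's analysis code.

```python
import numpy as np
from scipy.optimize import curve_fit

TAU_DAYS = 43.22 * 365.25   # fixed mean life from the exponential fit

def make_model(period):
    """exponential decay times a sinusoidal modulation of fixed period (days)."""
    def model(t, A0, amp, phase):
        return A0 * np.exp(-t / TAU_DAYS) * (1.0 + amp * np.sin(2.0 * np.pi * t / period + phase))
    return model

def scan_periods(t, rate, err, periods):
    """refit amplitude/phase at each trial period; record amplitude and chi^2."""
    results = []
    for p in periods:
        model = make_model(p)
        popt, _ = curve_fit(model, t, rate, sigma=err, p0=(rate[0], 1e-4, 0.0))
        chi2 = np.sum(((rate - model(t, *popt)) / err) ** 2)
        results.append((p, popt[1], chi2))
    return np.array(results)   # columns: period, fitted amplitude, chi squared
```

an absence of any trial period that significantly lowers the chi squared per degree of freedom is exactly the null result reported above.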
acta polytechnica 53(2):219–222, 2013

velikhov electrothermal instability cancellation by a modification of the electrical conductivity value in a streamer by magnetic confinement
jean p. petit(a), jean c. dore(b,*)
(a) research manager, lambda laboratory, france
(b) technical manager, lambda laboratory, france
(*) corresponding author: contact@lambda-laboratory.fr

abstract. we present a method, confirmed experimentally, for cancelling the velikhov instability by a local reduction of the magnetic field along a lane, which enhances the local electrical conductivity and the electron-gas collision frequency through a local passage into a coulomb collision regime and the subsequent lowering of the hall parameter below its critical value, close to 2.

keywords: non-equilibrium plasmas, mhd, velikhov instability.

1. introduction
the electrothermal instability was discovered by e. velikhov, a student of andrei sakharov. his paper [20], presented at the international mhd congress in newcastle in 1962, announced the complete ruin of the hopes of researchers and engineers working on electric power production by direct energy conversion of a fluid. the use of the enthalpy of a supersonic gas flow obtained from hydrocarbon combustion was considered first; theoretically, a 60 % conversion efficiency was expected. however, problems were caused by the technological constraints due to wall and electrode behaviour. even with a more easily ionizable substance, caesium, added to the fluid, the conductivity only allowed an acceptable output at temperatures above 3000 °c. when this route to mhd energy production was seen to have failed, the russians pushed the technology to its extremes at the kurchatov institute in moscow with the u-25 generator: its enormous mhd nozzle, equipped with zirconium oxide electrodes, only allowed a temperature of 2500 °c to be attained. given the non-linear increase of the ionization level with temperature, this difference from the 3000 °c target forced the abandonment of the concept, despite the great amount of work done by several countries, such as the usa, the uk, france, germany, italy, poland, china and others.

another formula, suggested by the american kerrebrock [2, 3], aimed at non-equilibrium operation with an electron temperature considerably higher than that of the gas, i.e., of the heavy species, as in a fluorescent tube. the mhd regime can then be described by the electron energy conservation equation
\[ (\vec v \times \vec B) \cdot \vec j_e = 3\, n_e k\, (T_e - T_g) \sum_{s \neq e} \frac{m_e}{m_s}\, \delta_s\, \langle \nu_{es} \rangle, \]
where \(\delta_s\) accounts for inelastic losses. this could not be applied to the gases issuing from hydrocarbon combustion, because the carbon dioxide present hinders the rise of the electron temperature by immediately absorbing the energy and maintaining thermodynamic equilibrium. it can only concern heat-carrying fluids made up of rare gases seeded with caesium. this formula is seductive, since it makes use of electromotive fields with a high \(vB\) product, but it involves high hall parameter values. in 1962, however, e.
velikhov predicted the appearance of a turbulence affecting the electron gas, making the plasma totally inhomogeneous within a time comparable to that of the establishment of the ionization itself. this theory was widely verified in laboratories.

2. experimental
2.1. experiment in hot gas
in 1961, bert zauderer [21] used a shock tube as a short-duration generator of hot, swift rare-gas flows driven into a faraday mhd converter, and obtained strong interaction parameters thanks to the high electrical conductivity of argon at 10 000 k and one bar pressure (3000 siemens/m). in 1967, using a similar installation at the institute of fluid mechanics in marseille, france, j. p. petit managed [17, 18] to obtain the first and only non-equilibrium electric power production. the test gas was a mixture of helium and argon, where argon played the role of a seed. the arrangement only allowed experiments of at most 50 µs, but this was sufficient for a conclusive experiment. the idea was that the development of ionization, driving the plasma into a coulomb-collision-dominated regime, would bring the hall parameter below its critical value of around 2 in such a regime. this method allowed stable operation to be attained at a pressure close to one bar, with a gas temperature of 4000 °c and an electron temperature up to 10 000 °c. however, calculations showed that it would not be possible to go lower.

[figure 1: velikhov instability, development and annihilation.]

the image (fig. 1) refers to an experiment in a shock-driven wind tunnel coupled to a faraday mhd converter. the series of spots at the top and bottom show the positions of the cathodes and anodes. the magnetic field, 2 t, is perpendicular to the plane of the figure. the picture was taken in 1965, using a trw electron-optic fast image converter, the first one available in france. the arrow shows the inlet of the fluid. the constant cross section of the channel is 5 × 5 cm, and the length of the mhd section is 10 cm. the image clearly shows that the instability begins to appear at the entrance of the mhd channel and is then absorbed by the development of the ionization itself, which raises the ion density. when the plasma becomes coulomb dominated, the hall parameter, whose value at the inlet is close to 8, falls below the critical value; this was discussed in a previous paper [10]. the rapid rise of the collision frequency, due to the large coulomb cross section, damps the hall parameter. if this passage to the collision-dominated regime is fast enough, it competes efficiently with the development of the electrothermal instability. in a collision-dominated, two-temperature plasma, the critical value of the hall parameter is close to 2; if β is driven below this value after the ionization is fully developed, the plasma becomes stable (see the right portion of the picture). formulas giving the critical value of the hall parameter and the growth rate of the instability are given in [10]. in our shock tube this cancellation of the velikhov instability could not be managed down to 4000 k: in a shock tube the temperature and the gas velocity are closely related, and at lower temperatures the induced electric field v × b was not strong enough to give the required ionization rate and too weak to drive the plasma into the coulomb-dominated regime, so this method was not technologically interesting.
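the mechanism invoked here, the collision frequency rising with ionization until β = ω_ce/ν drops below ≈ 2, is easy to illustrate numerically. the sketch below combines the standard electron gyrofrequency with a spitzer-type electron-ion collision rate; the 2.91 × 10⁻⁶ coefficient is the nrl plasma formulary expression, an outside assumption rather than a formula from this paper, and the constant electron-neutral cross section and all plasma parameters are illustrative placeholders.

```python
import numpy as np

E_OVER_ME = 1.7588e11          # electron charge-to-mass ratio [C/kg]
QE, ME = 1.602e-19, 9.109e-31  # elementary charge [C], electron mass [kg]

def hall_parameter(B, ne_cm3, Te_eV, nn_m3, sigma_en=1e-19, coulomb_log=10.0):
    """beta = omega_ce / (nu_ei + nu_en) for a two-temperature plasma."""
    omega_ce = E_OVER_ME * B                                   # gyrofrequency [rad/s]
    nu_ei = 2.91e-6 * ne_cm3 * coulomb_log * Te_eV**-1.5       # spitzer rate [1/s] (nrl formulary)
    v_e = np.sqrt(2.0 * Te_eV * QE / ME)                       # electron speed [m/s]
    nu_en = nn_m3 * sigma_en * v_e                             # electron-neutral rate [1/s]
    return omega_ce / (nu_ei + nu_en)

# as the electron density grows by two decades, coulomb collisions take over and
# beta collapses from ~40 to ~1, i.e. below the critical value ~2, at fixed B
for ne in (1e14, 1e15, 1e16):
    print(ne, hall_parameter(B=2.0, ne_cm3=ne, Te_eV=1.0, nn_m3=1e23))
```

this is exactly the qualitative behaviour exploited in the shock-tube run: fully developed ionization pulls β from about 8 at the inlet down through the stability threshold.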
2.2. the impossibility of continuing this research
in france at the end of the sixties, and in other countries at the beginning of the seventies, mhd electric power generation was progressively abandoned. budgets were completely cut, and the author, until retirement, shifted to the kinetic theory of gases, astrophysics and cosmology. after publishing some theoretical works about the electrothermal instability [16] (see also [1]), he discovered in 1981 a second method for velikhov instability cancellation, first published at the french academy of sciences [8] and then presented at the eighth international mhd meeting, held in moscow [6]; he supported this attendance with his own funding. years passed. for lack of funding, j. p. petit was unable to attend the meetings in tsukuba and beijing, where his communications were accepted [12, 15], and finally gave up. the rarity of publications about velikhov instability cancellation has the following explanation: it is a key problem for classified research devoted to mhd-controlled hypersonic flight at high altitude.

2.3. back to experiments in cold plasma
he had been retired for 5 years when younger colleagues suggested restarting private mhd research, outside any official institution. j. c. doré had some room in his small garage. the team obtained funding by selling a book by j. p. petit, and j. c. doré built a small low-pressure test installation; then the work restarted. thanks to this unusual funding system, the members of the lambda laboratory could buy material and attend international meetings. attention was concentrated on disk-shaped accelerator systems, following previously published papers; the set of ideas can be found in the proceedings of reference [11]. the final purpose is to show that shock waves and turbulence can be eliminated around disk-shaped flying machines by a suitable lorentz force field action [4, 5, 7, 12–15, 19]. the lack of other papers on this subject deserves the same explanation: in highly scientifically advanced countries (usa, russia) this corresponds to highly classified research. the successive experimental works achieved by j. c. doré in his small garage are necessary steps towards this central goal, and the control of the velikhov instability is just one of them. in 2010, j. c. doré and j. p. petit presented a successful wall confinement experiment, close to a disk-shaped machine, by magnetic pressure gradient inversion [9]. recently we achieved a stable spiral current pattern, which will be the subject of the next presentation and publication.

back to the present subject: the photo (fig. 2, taken in the lambda laboratory, ccd camera, f4, 1 s) shows the extremely inhomogeneous aspect connected with the development of the instability in low-pressure air. the magnetic field, produced by a large coil, is perpendicular to the plane of the figure. in the following photo (fig. 3, taken in the lambda laboratory, ccd camera, f4, 1 s), we attenuated the magnetic field along a lane by reducing it to 200 gs.

[figure 2: instability.]
[figure 3: controlled instability.]
the local hall parameter value becomes, locally, lower than this value, the plasma is then brought into a stable configuration. the experiment aims to create spiral currents in order to obtain a quasi radial force field, even though the local hall parameter value is low. the discharge geometry is then completely controlled by the magnetic geometry and not by the hall effect. 4. conclusion this electrothermal instability is one of the greatest problems of cold, non-equilibrium magnetized plasmas, particularly in relation to the development of equipment used at altitudes well above 30 km, where the flight regime is necessarily hypersonic and where gas flow control towards the ramjets can be done using an mhd controlled inlet. then velikhov instability must then be totally cancelled [11]. the intention of the lambda laboratory team is to extend the research to suck low density gas by lorentz force action, around a disk shaped mhd aerodyne model. beyond, we believe that strong sucking action on surrounding air can prevent the birth of shock waves and turbulence around a mhd aerodyne, moving in supersonic, and event hypersonic velocity in air. we think that silent and shockless flight is possible, at supersonic velocity, in dense air. acknowledgements this research is solely sponsored by private funds. the authors thank all those who support us by private donations and without whom this research would not be possible. the authors thanks their collaborators, members of the lambda laboratory: mathieu ader, xavier lafont and for their technical help: jacques legalland, jacques juan, . . . references [1] m. g. haines, a. h. nelson. analysis of the growth of electrothermal waves. in 10th symp. on engineering aspects of mhd. cambridge, 1969. [2] j. l. kerrebrock. non-equilibrium effects on conductivity and electrode heat transfer in ionized gases. tech. rep. technical note 4, afosr-165, guggenheim jet propulsion center, caltech, pasadena, california, 1960. [3] j. l. kerrebrock. conduction in gases with elevated electron temperature. in proc. 2nd symp. eng. aspects of mhd, pp. 327–346. 1962. [4] b. lebrun. approche theorique de la suppression des ondes de choc se formant autour d’un obstacle effile place dans un ecoulement supersonique d’argon ionise a l’aide de forces de laplace (theoretical study of shock wave annihilation around a flat wing in hot supersonic argon flow with lorentz forces). aix-marseille university; journal of mechanics, france, 1987. dir. by j. p. petit. engineer-doctor thesis. [5] j. p. petit. convertisseurs mhd d’un genre nouveau (new mhd converters). cras, french academy of sciences 281(11):157–160, 1975. [6] j. p. petit. cancellation of the velikhov instability by magnetic confinement. in 8th international conference on the mhd electrical power generation. moscow, russia, 1983. [7] j. p. petit. is supersonic flight without shock wave possible? in 8th international conference on mhd electrical power generation. moscow, russia, 1983. [8] j. p. petit, m. billiotte. method for eliminating velikhov instability. cras, academy of science 1981. [9] j. p. petit, j. c. dore. wall confinement technique by magnetic gradient inversion. in 3rd euro-asian pulsed power conference (eappc2010). jeju, korea, 2010. (and in acta polonica a 121(3):611, march 2012). 221 jean p. petit, jean c. dore acta polytechnica [10] j. p. petit, j. geffray. non equilibrium plasma instabilities. in 2nd euro-asian pulsed power conference (eappc2008). vilnius, lithuania, 2008. 
(and in acta physica polonica a 115(6):1170–1173, june 2009).
[11] j. p. petit, j. geffray, f. david. mhd hypersonic flow control for aerospace applications. in 16th aiaa/dlr/dglr international space planes and hypersonic systems and technologies conference (hytasp), bremen, germany, 2009. aiaa-2009-7348.
[12] j. p. petit, b. lebrun. shock wave cancellation in gas by lorentz force action. in 9th international conference on mhd electrical power generation, proceedings iii, part 14.e – mhd flow, pp. 1359–1368, tsukuba, japan, 1986.
[13] j. p. petit, b. lebrun. shock wave annihilation by mhd action in supersonic flow. quasi one dimensional steady analysis and thermal blockage. european journal of mechanics b/fluids 8(2):163–178, 1989.
[14] j. p. petit, b. lebrun. shock wave annihilation by mhd action in supersonic flows. two-dimensional steady non-isentropic analysis. anti-shock criterion, and shock tube simulations for isentropic flows. european journal of mechanics b/fluids 8(4):307–326, 1989.
[15] j. p. petit, b. lebrun. theoretical analysis of shock wave annihilation with mhd force field. in 11th international conference on mhd electrical power generation, proceedings iii, part 9 – fluid dynamics, pp. 748–753, beijing, china, 1992.
[16] j. p. petit, j. valensi. growth rate of electrothermal instability and critical hall parameter in closed-cycle mhd generators when the electron mobility is variable. cras, academy of sciences 269, 1969.
[17] j. p. petit, j. valensi, j. p. caressa. electrical characteristics of a converter using as a conversion fluid a binary mix of rare gases with non-equilibrium ionization. in 8th international conference on mhd electrical power generation, proceedings 3. international atomic energy agency, warsaw, poland, 1968.
[18] j. p. petit, j. valensi, j. p. caressa. theoretical and experimental study in shock tube of non-equilibrium phenomena in a closed-cycle mhd generator. in 8th international conference on mhd electrical power generation, proceedings 2, pp. 745–750. international atomic energy agency, warsaw, poland, 1968.
[19] j. p. petit, m. viton. convertisseurs mhd d'un genre nouveau. appareils à induction (new mhd converters: induction machines). cras, french academy of sciences 284:167–179, 1977.
[20] e. p. velikhov. hall instability of current-carrying slightly-ionized plasmas. in 1st international symposium on magnetoplasmadynamics electrical power generation, newcastle-upon-tyne, england, 1962. paper 47.
[21] b. zauderer. experimental study of non-equilibrium ionization in mhd generators. aiaa journal 4(6):701–707, 1968.

acta polytechnica 41(1), 2001

dynamic effect of discharge flow of a rushton turbine impeller on a radial baffle
j. kratěna, i. fořt, o. brůha

abstract: this paper presents an analysis of the mutual dynamic relation between the impeller discharge flow of a standard rushton turbine impeller and a standard radial baffle at the wall of a cylindrical mixing vessel under turbulent regime of flow of an agitated liquid. a portion of the torsional moment of the baffle corresponding to the region of the force interaction of the impeller discharge stream and the baffle is calculated under the assumption of constant angular momentum in the flow region between the impeller and the baffles. this theoretically obtained quantity is compared with the torsional moment of the baffles calculated from the experimentally determined distribution of the peripheral (tangential) component of dynamic pressure along the height of the radial baffle in pilot plant mixing equipment. it follows from the results of our calculations that for both investigated impeller off-bottom clearances the theoretically determined transferred torsional moment of the baffles in the area of interference between the impeller discharge flow and the baffles agrees fairly well with experimentally determined data and, moreover, that more than 2/3 of the transferred torsional moment of the baffles as a whole is located in the above mentioned interference area.

keywords: rushton turbine impeller, impeller discharge stream, dynamic pressure.

1 introduction
an axially located standard rushton turbine impeller in a cylindrical vessel with radial baffles exhibits two main force effects: radial and peripheral [1]. the distribution of the peripheral (tangential) component of dynamic pressure affecting a radial baffle at the wall of a cylindrical pilot plant vessel, with an axially located axial- or radial-flow rotary impeller under a turbulent regime of flow of the agitated liquid, was determined experimentally [2, 3, 4]. the experiments [3, 4] were carried out in a flat-bottomed cylindrical pilot plant mixing vessel with four baffles at its wall (see fig. 1)
of diameter t = 0.3 m, filled with water (μ = 1 mpa·s) or with water-glycerol solutions of dynamic viscosity μ = 3 mpa·s and μ = 6 mpa·s, respectively. the impeller was a standard rushton turbine disk impeller (see fig. 2) with six flat plane blades [5]. the impeller frequency of revolution was chosen in the interval n = 3.11–5.83 s⁻¹. the originally developed technique for measuring the peripheral component of dynamic pressure affecting the baffle [4] is illustrated in fig. 1. one of the baffles was equipped with a trailing target of height h_t and width b, which can rotate about an axis parallel to the vessel axis with a small eccentricity and is balanced by a couple of springs.

fig. 1: sketch of a flat-bottomed agitated pilot plant mixing vessel with four radial baffles at the wall and an axially located standard rushton turbine impeller, and sketch of the measurement of the local peripheral force affecting the trailing target (h/t = 1, h/t = 0.33, 0.48, b/t = 0.1, h_t = 10 mm, b = 28 mm)

fig. 2: standard rushton turbine disk impeller (d/t = 1/3, w/d = 0.2, d1/d = 0.75, l/d = 0.25)

eleven positions of the target along the height of the baffle were examined, above all in the region of the interference of the baffle and the impeller discharge flow. the angular displacement of the target is directly proportional to the peripheral force f acting on the balancing springs (see fig. 3). the flexibility of the springs was selected in such a way that the maximum target displacement was reasonably small compared with the vessel dimensions (no more than 5 % of the vessel perimeter). by means of a small photo-electronic device, composed of two photodiodes, the angular displacement was scanned and the output signal was treated, stored and analysed by the computer.
the vertical (axial) distribution of the peripheral component of dynamic pressure along the height of the radial baffle coincides with the flow pattern in an agitated system with a standard rushton turbine impeller [5]. the aim of this study was to analyse the force interaction of the impeller discharge stream and the corresponding part of the radial baffle, and to compare the results of such an analysis with the available experimental data.

2 theoretical
let us consider a flat-bottomed cylindrical vessel filled with a newtonian liquid and provided with four radial baffles at its wall. at the axis of vessel symmetry a standard rushton turbine rotates in such a way that the flow regime of the agitated liquid can be considered turbulent. in such a system the radially-tangential impeller discharge stream leaves the rotating impeller and reaches the baffles (see fig. 4). for the balance region considered between the impeller and the baffles we can assume that the flow of angular momentum is constant [6], i.e.,
\[ \rho\, Q\, \bar w_t\, r = \text{const.}, \quad r \in \langle d/2,\; t/2 \rangle. \tag{1} \]
eq. (1) can also be used for the relation between the values of the angular momentum in the impeller discharge stream and around the baffles,
\[ \frac{\bar w_{t,p}}{\bar w_{t,b}} = \frac{t}{d} \cdot \frac{Q_b}{Q_p}, \tag{2} \]
where index "b" characterises the baffle area and index "p" the flow leaving the rotating impeller (impeller discharge flow). eq. (2) can be rearranged into the form
\[ \rho\, Q_b\, \frac{t}{2}\, \bar w_{t,b} = M_{b,d,th}, \tag{3} \]
where the quantity \(M_{b,d,th}\) represents the theoretically considered portion of the time-averaged (mean) torsional moment of the baffles corresponding to the area of the force interaction of the impeller discharge stream and the baffle (see the shaded area below the curves in figs. 6 and 7). by combining eqs. (2) and (3) we can eliminate the unknown tangential component of the mean velocity \(\bar w_{t,b}\) in the baffle area, and so we have finally
\[ M_{b,d,th} = \frac{d}{2}\, \rho\, \bar w_{t,p}\, Q_p. \tag{4} \]
eq. (4) can be used for estimating the torsional moment of the baffles transferred via the impeller discharge flow. the tangential component of the mean velocity in the impeller discharge stream, \(\bar w_{t,p}\), is related to the radial component \(\bar w_{r,p}\) by the equation
\[ \bar w_{t,p} = \bar w_{r,p}\, \tan\alpha, \tag{5} \]
where \(\alpha\) is the angle between the horizontal velocity component in the impeller discharge stream and its radial component [7]. it can be calculated from the equation
\[ \alpha = \arcsin\frac{a}{d/2}, \tag{6} \]
where the parameter \(a\) is the radius of the cylindrical jet, i.e., of the virtual cylindrical source of the impeller discharge stream (see fig. 5).

fig. 3: results of mechanical calibration of the balancing springs
fig. 4: vertical cross section of the turbine impeller discharge stream region
fig. 5: turbine impeller as a cylindrical tangential jet

considering the relation between the impeller pumping capacity \(Q_p\) and the average value of the radial component of the mean velocity over the cross section of the impeller discharge flow,
\[ Q_p = \pi\, d\, w\, \bar w_{r,p}, \tag{7} \]
where \(d\) and \(w\) are the diameter of the impeller and the width of the impeller blade, respectively, we can rearrange eq. (4) into the form
\[ M_{b,d,th} = \frac{\rho\, Q_p^2}{2 \pi w}\, \tan\!\left(\arcsin\frac{2a}{d}\right). \tag{8} \]
let us consider the flow rate number [8, 9] expressing the quantity \(Q_p\) in dimensionless form,
\[ N_{Q_p} = \frac{Q_p}{n\, d^3}. \tag{9} \]
After substitution into Eq. (8) and a simple rearrangement we have

$$M^*_{b,d,th} = \frac{M_{b,d,th}}{\rho\, n^2\, d^5} = \frac{N_{Q_p}^2}{2\pi} \tan\left(\arcsin\frac{2a}{d}\right)\frac{d}{w}, \qquad (10)$$

where n is the impeller frequency of revolution.

3 Results and discussion

It follows from the experimental data that the radius of the virtual cylindrical jet [8] is

$$a = 0.34\, d \quad (d/T = 1/3,\ h/T = 1/3,\ 1/2), \qquad (11)$$

and the Rushton turbine impeller flow rate number is [8, 9]

$$N_{Q_p} = 0.80 \quad (d/T = 1/3,\ h/T = 1/3,\ 1/2). \qquad (12)$$

The ratio of the impeller blade width, according to the impeller geometry (see Fig. 2), corresponds to the relation [5]

$$\frac{w}{d} = \frac{1}{5}. \qquad (13)$$

Now we can substitute all the numerical values into Eq. (10) and finally obtain, in dimensionless form, the theoretically considered portion of the impeller torque transferred to the baffles via the impeller discharge stream:

$$M^*_{b,d,th} = 0.472 \quad (d/T = 1/3,\ h/T = 1/3,\ 1/2). \qquad (14)$$

The mean dimensionless pressure affecting the part of the baffle of dimensionless height $h_d^*$ can be calculated from the known experimental dependence (see Figs. 6 and 7)

$$p^*_{k,av} = p^*_{k,av}(h_t/T). \qquad (15)$$

Eq. (15) relates the dimensionless axial position of the target centre of gravity $h_t$, referred to the vessel diameter T, as the independent variable, to the dimensionless peripheral component of the local mean dynamic pressure affecting the baffle,

$$p^*_{k,av} = \frac{p_{k,av}}{\rho\, n^2\, d^2}, \qquad (16)$$

as the dependent variable. Then the average mean dynamic pressure corresponding to the hatched surface in Figs. 6 and 7, i.e., the region along the baffle between the lower and upper intersections of the curve $p^*_{k,av} = p^*_{k,av}(h_t/T)$ with the zero value of $p^*_{k,av}$ below and above its peak, is

$$p^*_{d,av} = \frac{1}{h_d^*} \int_{h^*_{b1}}^{h^*_{b2}} p^*_{k,av}\ \mathrm{d}(h_t/T), \qquad (17)$$

where

$$h_d^* = h^*_{b2} - h^*_{b1}. \qquad (18)$$

The dimensionless coordinates $h^*_{b1}$ and $h^*_{b2}$ denote the above mentioned intersections of the curve $p^*_{k,av} = p^*_{k,av}(h_t/T)$ with the zero values of the quantity $p^*_{k,av}$ (see Figs. 6 and 7). The total dimensionless mean peripheral force affecting the baffle along its interference region with the impeller discharge stream can be calculated from the relation

$$f^*_{d,av} = \frac{f_{d,av}}{\rho\, n^2\, d^4} = \left(\frac{T}{d}\right)^2 b^*\, h_d^*\, p^*_{d,av}, \qquad (19)$$

where the dimensionless width of the radial baffle is

$$b^* = \frac{B}{T}. \qquad (20)$$

Similarly, the total mean dimensionless peripheral force affecting the whole baffle can be calculated from the relation

$$f^*_{av} = \frac{f_{av}}{\rho\, n^2\, d^4} = \left(\frac{T}{d}\right)^2 b^*\, H^*\, p^*_{av}, \qquad (21)$$

where the dimensionless total liquid depth in the mixing vessel is

$$H^* = \frac{H}{T}. \qquad (22)$$

Fig. 6: Axial profile of the dimensionless peripheral component of dynamic pressure $p^*_{k,av}$ affecting a radial baffle along its height (h/T = 0.33)
Fig. 7: Axial profile of the dimensionless peripheral component of dynamic pressure $p^*_{k,av}$ affecting a radial baffle along its height (h/T = 0.48)

The average mean dynamic pressure over the whole baffle can be calculated by integration:

$$p^*_{av} = \frac{1}{H^*} \int_0^{H^*} p^*_{k,av}\ \mathrm{d}(h_t/T). \qquad (23)$$

From the momentum balance of the mechanically agitated system [1] it follows that the mean impeller torque

$$M = \frac{P}{2\pi\, n}, \qquad (24)$$

where P is the impeller power input, should correspond to the sum of the mean reaction moments of the baffles, bottom and walls.
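The chain from Eq. (10) to Eq. (14) can be verified in a few lines of code. The following Python fragment is our own minimal numerical check, not part of the original paper; it uses only the constants quoted in Eqs. (11)–(13):

    import math

    # Constants from Eqs. (11)-(13), valid for d/T = 1/3, h/T = 1/3 and 1/2
    N_Qp = 0.80          # flow rate number, Eq. (12)
    a_over_d = 0.34      # virtual jet radius a = 0.34 d, Eq. (11)
    d_over_w = 5.0       # reciprocal of the blade width ratio w/d = 1/5, Eq. (13)

    # Eq. (6): angle between the horizontal velocity component and its radial part
    alpha = math.asin(2.0 * a_over_d)

    # Eq. (10): dimensionless torsional moment transferred via the discharge stream
    M_bdth_star = N_Qp**2 / (2.0 * math.pi) * math.tan(alpha) * d_over_w
    print(f"M*_b,d,th = {M_bdth_star:.3f}")    # -> 0.472, as stated in Eq. (14)

Running the sketch returns M*_b,d,th ≈ 0.472, confirming that Eq. (14) follows directly from the quoted constants.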
The mean impeller torque can be expressed in dimensionless form as

$$M^* = \frac{M}{\rho\, n^2\, d^5} = \frac{P}{2\pi\, \rho\, n^3\, d^5} = \frac{Po}{2\pi}, \qquad (25)$$

where the standard Rushton turbine impeller power number, under the conditions Re > 10⁴ and d/T ∈ ⟨0.25; 0.70⟩, can be expressed by means of the equation [8, 10]

$$Po = \frac{P}{\rho\, n^3\, d^5} = 2.0 \left(\frac{x_1}{d}\right)^{-0.2} \left(\frac{T}{T_0}\right)^{0.065}, \qquad (26)$$

where $x_1$ is the thickness of the turbine disk and $T_0$ = 1 m.

From knowledge of the quantity $f^*_{av}$, the mean dimensionless torsional moment of the baffle can be calculated. We only need to know the arm corresponding to the centre of gravity of the linear dynamic pressure profile along the width of the baffle (see Fig. 8), measured from the axis of symmetry of the cylindrical mixing vessel:

$$r_b = \frac{T}{2} - \frac{2}{3}B. \qquad (27)$$

If we consider the total number of baffles $n_b$, the moment transferred by the baffles is

$$M_b = f_{av}\, r_b\, n_b, \qquad (28)$$

and finally, in dimensionless form,

$$M_b^* = \frac{M_b}{\rho\, n^2\, d^5} = n_b\, f^*_{av}\, \frac{r_b}{d}. \qquad (29)$$

Similarly, the portion of the reaction moment of the baffles corresponding to the mutual interference of the $n_b$ baffles and the impeller discharge stream can be expressed in dimensionless form as

$$M_{b,d}^* = n_b\, f^*_{d,av}\, \frac{r_b}{d}, \qquad (30)$$

where the quantity $f^*_{d,av}$ was calculated according to Eq. (19).

Table 1 contains a comparison of the impeller torque and the calculated reaction moment of the baffles, and a comparison of the experimentally determined quantity $M^*_{b,d}$ with the theoretically found quantity $M^*_{b,d,th}$ [see Eq. (14)]. The power number Po was calculated for the tested impeller from Eq. (26). It follows from Table 1 that the theoretical considerations about the character of the discharge flow leaving a standard turbine impeller were fairly well confirmed experimentally: the theoretically and experimentally determined values of the baffle reaction moments corresponding to the mutual interference of the baffles and the impeller discharge stream practically coincide. From the results summarised in Table 1 it also follows that most of the turbine impeller torque is transferred via the agitated liquid to the radial baffles: more than 3/4 of the impeller torque appears as a reaction moment of the baffles. Moreover, this reaction moment is concentrated mainly in the impinging region of the impeller discharge stream on the vessel wall: more than 2/3 of the total baffle reaction moment acts on this narrow part of the baffles, i.e., along one third of the baffle height. This knowledge plays a significant role in the design of industrial mixing units with a standard Rushton turbine impeller and baffles, where the maximum fatigue stress can be expected in the above mentioned region, with consequences for the baffle fixing and the corresponding welding technique.

Fig. 9 illustrates the axial profiles of the dimensionless peripheral component of dynamic pressure affecting a radial baffle for a pitched blade impeller and for a standard Rushton turbine impeller at the same off-bottom clearance, with the pitched blade impeller pumping liquid down towards the bottom. We can clearly distinguish the distributions of the force effect of the agitated liquid on the radial baffles: the pitched blade impeller exhibits the main effect at the bottom, while the turbine impeller acts mainly in the region around the horizontal plane of its separating disk.
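Eqs. (25)–(26) can likewise be checked against the values listed in Table 1 below. The short Python sketch that follows is ours, not the authors'; the coefficients of Eq. (26) were partially illegible in the source, so the reconstruction used here is an assumption:

    import math

    # Geometry of the tested pilot plant: d/T = 1/3, T = 0.3 m, disk x1 = 0.55 mm
    T, d, x1, T0 = 0.3, 0.1, 0.00055, 1.0

    # Eq. (26) as reconstructed: Po = 2.0 (x1/d)^-0.2 (T/T0)^0.065
    Po = 2.0 * (x1 / d) ** -0.2 * (T / T0) ** 0.065
    M_star = Po / (2.0 * math.pi)   # Eq. (25)

    print(f"Po = {Po:.2f}, M* = {M_star:.3f}")   # ~5.24 and ~0.83

This yields Po ≈ 5.2 and M* ≈ 0.83, in reasonable agreement with the tabulated Po = 5.289 and M* = 0.842; the small residual difference is consistent with rounding of the correlation coefficients.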
Table 1: Transfer of the impeller torque by radial baffles in an agitated system with a standard Rushton turbine disk impeller (thickness of the impeller separating disk x1 = 0.55 mm)

h/T    Po     M*     Mb*    Mb,d*  Mb,d,th*  hd*   Mb*/M*  Mb,d*/Mb*
0.33   5.289  0.842  0.713  0.464  0.472     0.33  0.847   0.651
0.48   5.289  0.842  0.641  0.475  0.472     0.30  0.762   0.741

Fig. 8: Radial profile of the baffle loading

4 Conclusions
• More than 3/4 of the turbine impeller torque is transferred via the agitated liquid to the baffles.
• More than 2/3 of the baffle reaction moment is located in the impinging region of the turbine impeller discharge stream on the vessel wall.
• The mean flow characteristics of the turbine impeller discharge stream allow us to estimate the mutual force interference of the baffles and the impeller discharge stream.

This research was conducted at the Faculty of Mechanical Engineering, CTU in Prague, and was supported by Grant No. OK316/2001 of the Czech Ministry of Education and by an internal grant of CTU in Prague.

List of symbols
a   radius of the virtual cylindrical jet, m
b   target width, m
B   baffle width, m
d   impeller diameter, m
d1  diameter of the impeller separating disk, m
f   peripheral component of the force affecting the radial baffle, N
H   total liquid depth, m
hd  height of the interference region of the impeller discharge stream with the baffles, m
ht  height of the target above the bottom, m
h   impeller off-bottom clearance, m
M   mean impeller torque, N·m
Mb  mean reaction moment of the baffles, N·m
Mb,d mean torsional moment of the baffles transferred via the impeller discharge flow, N·m
n   impeller speed, s⁻¹
NQp flow rate number
nb  number of baffles
P   impeller power input, W
Po  power number
p   peripheral component of the dynamic pressure affecting the radial baffle, Pa
Qp  impeller pumping capacity, m³·s⁻¹
r   radial coordinate, m
Re  impeller Reynolds number
rb  radial coordinate of the centre of gravity of the linear dynamic pressure profile, m
T   vessel diameter, m
T0  diameter of the standard cylindrical vessel, 1 m
x1  thickness of the impeller separating disk, m
z   number of impeller blades
w   width of the impeller blade, m
α   angle between the horizontal velocity component in the impeller discharge stream and its radial component, °
φ   target angular displacement, °
μ   dynamic viscosity of the agitated liquid, Pa·s
ρ   density of the agitated liquid, kg·m⁻³

Subscripts and superscripts
av  average value
b   related to the baffle
k   related to the position of the target
th  theoretically calculated value
*   dimensionless value
‾   mean (time-averaged) value

Fig. 9: Comparison of the axial profiles of the dimensionless peripheral component of dynamic pressure affecting a radial baffle along its height for a down-pumping pitched blade impeller and a standard Rushton turbine impeller (d/T = 1/3, h/T = 1/2; pitched blade impeller z = 4: p*av,max = 0.195, z = 6: p*av,max = 0.230; Rushton turbine impeller: p*av,max = 0.716)

References
[1] Standart, G.: General relations concerning forces and torques in mixing vessels. Collect. Czech. Chem. Commun. 23, 1958, pp. 1163–1168.
[2] Kipke, H. D.: Mechanische Belastung von strömungslenkenden Einbauten in gerührten Behältern. Chemie-Technik, 6 (11), 1977, pp. 467–472.
[3] Kratěna, J., Fořt, I., Brůha, O., Pavel, J.: Local dynamic effect of mechanically agitated liquid on a radial baffle. In: Proceedings of the 10th European Conference on Mixing, Delft (eds. H. E. A. van den Akker, J. J. Derksen), Amsterdam, Elsevier Science, 2000, pp. 369–376.
[4] Kratěna, J., Fořt, I., Brůha, O., Pavel, J.: Distribution of dynamic pressure along a radial baffle in an agitated system with standard Rushton turbine impeller. In: Proceedings of the 4th International Symposium on Mixing in Industrial Processes – ISMIP 4, Toulouse, in press.
[5] Turbine disk impeller with flat plane blades. Czech standard CVS 691021, Mixing equipment, Brno, 1997.
[6] Biting, H., Zehner, P.: Power and discharge numbers of radial flow impellers and baffles. Chem. Eng. and Proc., 33, 1994, pp. 295–301.
[7] Drbohlav, J., Fořt, I., Krátký, J.: Turbine impeller as a tangential cylindrical jet. Collect. Czech. Chem. Commun., 43, 1978, pp. 696–712.
[8] Rutherford, K., Mahmoudi, S. M. S., Lee, K. C., Yianneskis, M.: The influence of Rushton impeller blade and disk thickness on the mixing characteristics of stirred vessels. Trans. Inst. Chem. Engrs., 74 (Part A), 1996, pp. 369–378.
[9] Stoots, C. M., Calabrese, R. V.: Mean velocity field relative to a Rushton turbine blade. AIChE Jour., 41 (1), 1995, pp. 1–11.
[10] Bujalski, W., Nienow, A. W., Chatwin, S., Cooke, M.: The dependency on scale of power numbers of Rushton disk turbines. Chem. Eng. Sci., 42 (2), 1987, pp. 317–326.

Ing. Jiří Kratěna, Doc. Ing. Ivan Fořt, DrSc. (Dept. of Process Engineering), Doc. Ing. Oldřich Brůha, CSc. (Dept. of Physics). Phone: +420 2 2435 2716, fax: +420 2 2431 0292, e-mail: kratena@student.fsid.cvut.cz. Czech Technical University in Prague, Faculty of Mechanical Engineering, Technická 4, 166 07 Praha 6, Czech Republic.

Comparison of mathematical models for heat exchangers of unconventional CHP units
Peter Ďurčanský, Richard Lenhard, Jozef Jandačka
(Acta Polytechnica 55(4):223–228, 2015, doi:10.14311/ap.2015.55.0223)
University of Žilina, Research Centre and Faculty of Mechanical Engineering, Department of Power Engineering, Univerzitná 8215/1, Slovak Republic. Corresponding author: peter.durcansky@rc.uniza.sk

Abstract. An unconventional CHP unit with a hot air engine is designed as the primary energy source, with fuel in the form of biomass. The heat source is a furnace designed for the combustion of biomass, whether in the form of wood logs or pellets. The transport of the energy generated by the biomass combustion to the working medium of the hot-air engine is ensured by a special heat exchanger connected to this source. The correct operation of the hot-air engine largely depends on an appropriate design of the exchanger. The paper deals with the calculation of the heat exchanger for the applications mentioned, using criterion equations and CFD simulations.

Keywords: heat exchanger; CFD simulation; hot air engine.

1. Unconventional CHP unit
The burning of fossil fuels in internal combustion engines, in gas turbines, in steam power plants and in other facilities represents various possibilities for electricity production.
Except for units with internal combustion engines, these are large devices which require intensive operation and maintenance. Another problem is the dependence of such devices on fossil fuels. Hot air engines, namely the Stirling and Ericsson engines, offer a possible alternative to conventional internal combustion engines. The scheme of these machines is shown in Fig. 1. In the case of the Stirling engine, the double function of the regenerator is immediately obvious: the regenerator (R) functions as a heater or cooler and enables energy storage, while in the Ericsson engine the cooler and heater are separate heat exchangers, and the efficiency of the machine is not as significantly affected by the volume of the regenerator as in the case of the Stirling engine [1].

Figure 1. Stirling and Ericsson engine [1].

In the Ericsson cycle, air is compressed in the compression cylinder and then flows through the heat exchanger, where it receives heat energy at a fixed pressure. Subsequently, it is led into the expansion cylinder, where it expands adiabatically and performs work [2]. Part of the work is used to drive the compressor, and the rest can be used as mechanical work to drive an electric generator. The source of heat for the proposed system with a hot air engine is a furnace designed to burn biomass, whether in the form of wood logs or pellets. The transport of the energy generated by the combustion of biomass to the working medium of the engine is provided by a heat exchanger connected to that source.

We have also made another proposal, which uses the Stirling engine in a different setup. The commercially available Stirling engines use natural gas as their heat source: natural gas is burned in a burner, and the heat is led to the internal heat exchanger, which is part of the regenerator. The emissions from natural gas burning contain neither particles nor dust that could pollute the internal exchanger. A different situation occurs during biomass burning. The flue gases contain many particles, dust and ash, and direct use of the flue gases to heat up the internal heat exchanger is not the best solution, as the exchanger gets fouled. We therefore decided to use another heat exchanger, placed directly in the biomass furnace, as an intermediate component, as shown in Fig. 2. This enables us to replace the energy supplied to the internal exchanger of the engine: the energy from the combustion of natural gas is replaced by the energy of the working medium, obtained and heated by biomass combustion.

Figure 2. Proposed setup of the experimental hot air engine.

With such an arrangement it was also necessary to ensure the delivery of the working fluid from the heat exchanger to the regenerator chamber of the Stirling engine. Delivering the medium by a blower or compressor was chosen as the best solution. The compressor is connected to a closed circuit, which allows the use of a working medium at higher pressure. However, there is another problem: the need to keep the temperature of the working medium at the inlet to the compressor at the value specified by the manufacturer. Therefore, two heat exchangers were included. The second heat exchanger is used to return heat energy to the medium which flows from the compressor; this heat exchanger serves as a regenerator, recovering heat.
Next, a cooling exchanger is used for cooling the working medium prior to its entering the compressor at the desired temperature. This exchanger also supplies hot water for heating.

2. Design of the heat exchanger
When designing a new exchanger, it is necessary to take into account all the conditions imposed on the exchanger and its desired properties. The most frequently investigated feature of heat exchangers is the heat exchange surface, from which all the other parameters of the exchanger are derived. We decided to use a countercurrent heat exchanger of the gas-gas type for our application of the heat exchanger to a hot air engine. The gas which receives the heat is either dry air or one of the inert gases, such as nitrogen. The main requirements for this exchanger are therefore slightly different; one of the important conditions is its strength, needed even at high combustion gas temperatures [3]. The use of tubes with a meandering shape, adapted to the shape of the built-in volume, proved to be the best arrangement. The tubes have a common collector at the inlet and at the outlet [4]. The proposed design of the exchanger is shown schematically in Fig. 3. The air enters the upper collector tube, and heated air emerges from the lower header. In the space between the tubes, the gas flows vertically. The exchanger is designed as cross-countercurrent, which brings savings in terms of built-in volume. This arrangement also makes it possible to work with smaller temperature gradients. After a preliminary determination of the dimensions, we proceeded to calculate the heat exchanger using numerical methods.

Figure 3. Schematic drawing of the proposed heat exchanger.

3. Calculation of the heat exchanger using numerical methods
The aim of designing the heat exchanger using criterion equations is to determine the geometry of the heat exchanger. Verification of the heat transfer surface is also needed for the specified temperatures and the desired flow rate. A further aim of the analysis of the proposed exchanger, carried out using numerical methods, is the verification of both the correctness of the design and of the boundary conditions, including temperatures, flow rates and pressure losses. Because the heat transfer in the heat exchanger is a three-dimensional, time-independent process, it is described by a system of partial differential equations, which had to be solved with numerical methods. Our first step was to set up the working conditions of the combined heat and power plant. Our application with the hot air engine imposes a wide range of specifications, not only on the heat exchanger but also on the whole system. We assume that the temperature of the working fluid after expansion ranges from 240 °C to 320 °C [4]. We set the characteristic temperatures and physical properties for each working fluid, for the dry air in the tubes and for the exhaust gases outside the tubes; for the formulas we use the literature [6, 7]. The mean temperature difference for heat transfer through the pipes in the bundle is

$$\Delta t_{\ln} = \frac{\Delta t_{\max} - \Delta t_{\min}}{\ln\dfrac{\Delta t_{\max}}{\Delta t_{\min}}}. \qquad (1)$$
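As a quick illustration of Eq. (1), the following Python fragment (our sketch, not part of the paper; the terminal temperature differences are placeholder values chosen only for illustration) computes the logarithmic mean temperature difference of a counter-current arrangement:

    import math

    def lmtd(dt_max: float, dt_min: float) -> float:
        """Logarithmic mean temperature difference, Eq. (1)."""
        if dt_max == dt_min:            # limit case: the mean equals either difference
            return dt_max
        return (dt_max - dt_min) / math.log(dt_max / dt_min)

    # Assumed terminal temperature differences in kelvin, for illustration only
    print(lmtd(400.0, 150.0))   # -> about 255 K

Note that the logarithmic mean always lies below the arithmetic mean of the two terminal differences, which is why using the arithmetic mean would overestimate the transferred heat.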
If the tubes are in-line or staggered, the bundle geometry is characterized by dimensionless constants. If the tube bundle has a lateral spacing $s_1$ and a longitudinal spacing $s_2$, we can characterize the bundle with the constants

$$a = \frac{s_1}{d_0}, \qquad (2)$$

$$b = \frac{s_2}{d_0}, \qquad (3)$$

$$\psi = 1 - \frac{\pi}{4a}. \qquad (4)$$

Figure 4. Lateral and longitudinal spacing in the tube bundles.

We can also define the streamed length l, which can be expressed as the length of the flow path transverse over a single tube [7]:

$$l = \frac{\pi}{2}\, d_0. \qquad (5)$$

A further difference appears in the non-dimensional criteria. The Reynolds number characterizes the flowing medium and the type of flow; it depends on the flow velocity and also on the geometry. For heat transfer through tubes in the bundle, the following Reynolds number can be used:

$$Re = \frac{w\, l}{\psi\, \nu}. \qquad (6)$$

The Nusselt number Nu characterizes the heat transfer. If the turbulence in the inflowing medium is low, deviations in the Nusselt number may occur. The average Nusselt number in a cross-flow over a bundle of smooth tubes can be calculated from the Nusselt number in a cross-flow over a single tube. For our purpose we used the criterion equations according to [7, 8]. The heat transfer is described by two parts of the flow, the turbulent part and the laminar part near the walls:

$$Nu_{l,\mathrm{lam}} = 0.664\, \sqrt{Re_{\psi,l}}\ \sqrt[3]{Pr}, \qquad (7)$$

$$Nu_{l,\mathrm{turb}} = \frac{0.038\, Re_{\psi,l}^{0.8}\, Pr}{1 + 2.443\, Re_{\psi,l}^{-0.1} \left(Pr^{2/3} - 1\right)}. \qquad (8)$$

If Re > 10⁴, the flow in the pipe is turbulent. In the transition region of Reynolds numbers from 2300 to 10⁴, the type of flow is also influenced by the nature of the inlet stream and by the form of the pipe inlet. Tube bundles with in-line tubes behave more like parallel channels formed by the tube rows, and the increase in the heat transfer coefficient due to the turbulence enhancement caused by the tube rows does not occur [7]. Our application for the hot air engine will use a heat exchanger with staggered tubes as the primary heat exchanger. For this type of heat transfer through a tube bundle we can define, according to [7], the average Nusselt number of the bundle:

$$Nu_{0,\mathrm{bundle}} = \frac{1 + (n-1)\, f_A}{n}\, Nu_{l,0}, \qquad (9)$$

where

$$f_{A,\mathrm{stag}} = 1 + \frac{2}{3b}, \qquad (10)$$

$$Nu_{l,0} = 0.3 + \sqrt{Nu_{l,\mathrm{lam}}^2 + Nu_{l,\mathrm{turb}}^2}. \qquad (11)$$

An estimate of the heat transfer coefficient, depending on the Nusselt number, then follows:

$$\alpha = \frac{Nu_{\mathrm{bundle}}\, \lambda_{tm}}{l}. \qquad (12)$$

The heat transfer coefficient has been determined for both media, and the thermal conductivity of the wall is known. Knowing both sides of the equation, we can compare them and estimate the overall heat transfer coefficient, as well as the required heat transfer surface. The calculation of the heat exchanger based on the above-mentioned relationships was implemented in a spreadsheet processor; after taking into account the possibilities of structural arrangements, the heat exchanger was designed and then verified using a numerical finite element method.
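The correlation chain of Eqs. (2)–(12) is easy to reproduce in code. The sketch below is ours, not the authors' spreadsheet; the velocity, spacings, tube diameter, fluid properties and row count are placeholder values chosen only to exercise the formulas:

    import math

    # Placeholder geometry and flow data (assumed for illustration)
    d0 = 0.022              # tube outer diameter [m]
    s1, s2 = 0.044, 0.033   # lateral and longitudinal spacing [m]
    w = 6.0                 # approach velocity [m/s]
    nu = 45e-6              # kinematic viscosity of the flue gas [m^2/s]
    Pr = 0.70               # Prandtl number
    lam = 0.050             # thermal conductivity of the flue gas [W/(m K)]
    n_rows = 8              # number of tube rows

    a, b = s1 / d0, s2 / d0                # Eqs. (2), (3)
    psi = 1.0 - math.pi / (4.0 * a)        # Eq. (4), void fraction
    l = math.pi / 2.0 * d0                 # Eq. (5), streamed length
    Re = w * l / (psi * nu)                # Eq. (6)

    Nu_lam = 0.664 * math.sqrt(Re) * Pr ** (1.0 / 3.0)                # Eq. (7)
    Nu_turb = (0.038 * Re**0.8 * Pr
               / (1.0 + 2.443 * Re**-0.1 * (Pr**(2 / 3) - 1.0)))      # Eq. (8)
    Nu_l0 = 0.3 + math.sqrt(Nu_lam**2 + Nu_turb**2)                   # Eq. (11)

    fA = 1.0 + 2.0 / (3.0 * b)                                        # Eq. (10), staggered
    Nu_bundle = (1.0 + (n_rows - 1) * fA) / n_rows * Nu_l0            # Eq. (9)

    alpha = Nu_bundle * lam / l                                       # Eq. (12)
    print(f"Re = {Re:.0f}, Nu = {Nu_bundle:.1f}, alpha = {alpha:.0f} W/(m2 K)")

With these assumed inputs the script returns a gas-side coefficient of the order of 10² W/(m² K), the magnitude typical for gas cross-flow over tube bundles; the actual design values must of course come from the real operating data.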
4. Design and preparation of the numerical model
The ANSYS Fluent computing environment was used for the heat transfer simulation. The first step was to replace the exchanger constructed in a 3D design program with a simplified model, shown in Fig. 5. To speed up the calculation and facilitate the generation of a computing mesh, the model was simplified, and the parts which have a negligible effect on the result were not taken into account. Only the basic tube bundle and the outer volume connected to the heat exchange surface were left. The model consists of two volumes: one volume is the tube bundle, and the second is the outer volume that encloses the entire exchanger and represents the gas volume. The next step was to create a computational mesh, i.e., to divide the computational domain into interlinked volume elements or cells. The number of cells and the quality of the mesh are the main factors influencing the accuracy of the mathematical model and of the calculation. In practice, the number of cells in the computed domain is of the order of millions to tens of millions, and the calculation becomes more demanding as the grid contains more cell volumes. Due to the complicated geometry of this model, it is difficult to establish a computing mesh; the only possibility was to create very small elements. The aim was to create a simple mesh using automatic methods.

Figure 5. Simplified model of the heat exchanger.
Figure 6. Polyhedral mesh.

The complex geometry caused mesh-creation problems, because the elements must be small enough to fit into the narrow regions of the model; it was therefore necessary to create a large number of elements. A triangular mesh was created, and in the basic mesh sizing settings a minimal element size of 0.5 mm was defined; the other settings remained at their defaults. The required parameters for the calculation were prepared in three data models, using different approaches to the creation of the computational mesh. The formation of the mesh was treated in different ways, which is reflected in the quality of the mesh and in the results. A tetrahedron mesh and a polyhedral mesh were generated using the automated method, and subsequently another technique was applied. To raise the quality, square or cubic (hexahedral) elements can also be used; they speed up the calculation because they are the simplest of all element types. Because the geometry of the model is too complicated, it was only possible to create such a regular mesh in the exchanger tubes; the rest of the model was meshed with tetrahedron elements. The hexagonal mesh was created using the sweep method, which also required dividing the heat exchanger into separate tube volumes; the other settings of this method remained at their defaults.

The air-air type was chosen for the model, and efforts were made to simplify the model adequately. The specific heat capacity of air was taken to be linearly dependent on the temperature. Steel was chosen for the walls of the tubes. It was also necessary to assign a type of material, solid or fluid, to each volume; in this case there were two volumes of the fluid type, and in the editing options air was selected for both volumes. The model was designed as a time-independent calculation, and it takes the gravitational force into account. Boundary conditions were set as input data: flue gas and dry air were set as the transport media, and the mass flow rate, turbulence intensity and hydraulic diameter were specified at the pressure inlet. The input pressures of the air and of the flue gases were also selected.

5. Evaluation of the numerical simulation
The result of the CFD simulation is a mathematical model that describes the state variables and their changes in the heat exchanger. The calculation software allows these values to be viewed in any part of the heat exchanger, as well as in section planes. The first observed variable was the performance of the heat exchanger, 12.8 kW; when compared with the mathematical model using criterion equations, the deviation was only 6 %. Based on the CFD simulation, the profile of the flow velocity in the section plane was then visualized (see Fig. 7).

Figure 7. The velocity profile in section planes.
There we can see the velocity profile of the flow in the exchanger in the section planes. At first glance, a region with higher flow velocities is visible at the side of the heat exchanger. These velocity fields are undesirable, because there is no physical reason for a sudden change in velocity in these areas; it is therefore an error, or rather a deviation, of the calculation model in these regions. In the other areas we can observe a flow of flue gases that is partly turbulent. After a few rows of tubes we can also observe a normalization of the flow in the space between the tubes: the flow becomes regular and extends over the entire cross-section of the heat exchanger, which is very appropriate. Despite this, it will be necessary to use additional design elements and adjustments to condition the flow before it enters the exchanger.

In Fig. 8 we can see the temperature contours in section planes of the exchanger. The aim was to verify the predicted temperatures at the inlet and at the outlet. The only specified boundary conditions were the input temperature of the flue gas from the furnace and the inlet air temperature; the other temperatures were computed by the mathematical model. The calculation showed that the real temperature of the heated air may be higher than predicted, in this case about 600 °C, which is about 100 °C above the prediction. This points to an oversizing of the heat exchanger, which was taken into account when compiling the mathematical model. In practice, however, a different situation may arise: especially after long periods of operation, the heat exchanger can get clogged and its thermal resistance can increase. A reduction in the overall heat transfer coefficient can then lead to a decline in the performance of the heat exchanger.

Figure 8. Temperature overview in 3D cross section.

Figure 9 shows a view of the thermal field. Again, we can see that the temperature at the sides of the thermal field is higher than the temperature between the tubes. As mentioned above, the flow at the sides can be influenced by the free volume, where a greater amount of exhaust gas can flow due to the lower pressure resistance. When constructing the heat exchanger, these findings will be taken into account, and the exchanger wall will be situated in close proximity to the heat exchanger tubes.

Figure 9. Temperature overview in cross section.
Figure 10. Temperature overview in cross section.

6. Conclusions
The heat exchanger is a technical device whose function is to transfer heat from one environment to another, from one fluid to another, with as little loss as possible. The exchanger has to meet specific requirements in terms of lifetime: it must be able to withstand aggressive environmental influences, such as corrosion, and it must also be possible to at least partially restore its properties by cleaning or replacing exposed parts. The aim of our models was to verify the predicted temperatures at the inlet and the outlet. The calculation showed that the temperature of the heated air may be higher than anticipated, which points to an oversizing of the heat exchanger; this was taken into account when compiling the mathematical model.

Acknowledgements
This work is supported by "Výskum nových spôsobov premeny tepla z OZE na elektrickú energiu využitím nových progresívnych cyklov", ITMS 26220220117 (50 %).
This work is supported by the European Regional Development Fund and the state budget as part of the "Research Center of University of Žilina" project, ITMS 26220220183 (50 %).

References
[1] Creyx, M.: Energetic optimization of the performances of a hot air engine for micro-CHP systems working with a Joule or an Ericsson cycle. Elsevier, France, 2012.
[2] Kalčík, J., Sýkora, K.: Technická termodynamika. Praha: Academia Praha, 1973, pp. 301–318.
[3] Bonnet, S., Alaphilippe, M., Stouffs, P.: Energy, exergy and cost analysis of a micro-cogeneration system based on an Ericsson engine. Elsevier, France, 2011.
[4] Ďurčanský, P., Jandačka, J., Kapjor, A., Papučík, Š.: Návrh výmenníka tepla pre Ericsson-Braytonov motor. In: SKMTaT 2013 (ed. K. Kaduchová), Tatranská Lomnica, Slovakia, 2013, pp. 21–25.
[5] Verein Deutscher Ingenieure: VDI Heat Atlas. Berlin, Heidelberg: Springer-Verlag, 2010, pp. 720–740.
[6] Lenhard, R., Malcho, M.: Numerical simulation device for the transport of geothermal heat with forced circulation of media. Mathematical and Computer Modelling, 2013, Vol. 57, Iss. 1–2, pp. 111–125.
[7] Incropera, F. P., DeWitt, D. P., Bergman, T. L., Lavine, A. S.: Fundamentals of Heat and Mass Transfer, 6th edition, 2007. ISBN-10: 0470055545.
[8] Bonnet, S., Alaphilippe, M., Stouffs, P.: Study of a small Ericsson engine for household micro-cogeneration. Proceedings of the International Stirling Forum 2004, ECOS GmbH, Osnabrück, Germany, 2004.

ELIMED: medical application at ELI-Beamlines. Status of the collaboration and first results
Francesco Schillaci, Giuseppe A. P. Cirrone, George Korn, Mario Maggiore, Daniele Margarone, Luciano Calabretta, Salvatore Cavallaro, Giacomo Cuttone, Santo Gammino, Josef Krasa, Jan Prokupek, Andriy Velyhan, Marcella Renis, Francesco Romano, Barbara Tomasello, Lorenzo Torrisi, Mariapompea Cutroneo, Antonella Tramontana
(Acta Polytechnica 54(4):285–289, 2014, doi:10.14311/ap.2014.54.0285)
Affiliations: INFN-LNS, Catania, Italy; Institute of Physics of the ASCR, ELI-Beamlines project, Prague, Czech Republic; INFN-LNL, Legnaro, Italy; University of Catania, Catania, Italy; Centro Studi e Ricerche E. Fermi, Roma, Italy; Dip.to di Fisica, University of Messina, Messina, Italy. Corresponding author: schillacif@lns.infn.it

Abstract. ELI-Beamlines is one of the four pillars of the ELI (Extreme Light Infrastructure) pan-European project. It will be an ultrahigh-intensity, high repetition-rate, femtosecond laser facility whose main goal is to generate and apply high-brightness X-ray sources and accelerated charged particles. In particular, medical applications are treated by the ELIMED task force, which has been launched through collaboration between ELI and INFN researchers. ELIMED aims to demonstrate the clinical applicability of laser-accelerated ions. In this article, the state of the ELIMED project and the first scientific results are reported. The design and realisation of a preliminary beam handling system and of an advanced spectrometer for diagnostics of high-energy (multi-MeV) laser-accelerated ion beams will also be briefly presented.
Keywords: laser acceleration; cancer treatment; Thomson parabola; beam handling.

1. Introduction
Collimated multi-MeV proton beams can be generated by irradiating thin solid foils with ultraintense (10¹⁸ W/cm²) short-pulse (30 fs–10 ps) lasers. Nowadays, the highest proton energies with the best beam characteristics have been reached using the so-called target normal sheath acceleration (TNSA) mechanism [1]. In this regime, high-energy (MeV range) electrons are generated at the target front side by hot electron formation. The electrons have a mean free path larger than the target thickness [2–4], which is in the range of a few hundred microns or smaller. As a consequence, the electrons can cross the target, producing an intense electrostatic field (in the range of TV/m) due to the charge imbalance between the positive ions at rest in the target and the electron sheath expanding at its rear surface, along the laser incidence direction [5]. Laser-based accelerators can have performances comparable with, or perhaps higher than, huge and expensive conventional accelerators, and the device size could end up more compact (1/10–1/100) than present facilities [6]. It must be stressed that laser-produced proton beams show a large divergence angle [7] and a large energy spread (up to 100 %), with a cut-off at the maximum energy; the characteristics depend on the laser as well as on the target parameters [8]. Acceleration regimes other than TNSA have been identified and investigated with simulations, as the high-power lasers required for these regimes are not available yet. One of the most interesting is the so-called radiation pressure acceleration (RPA) regime, which differs for thick (hole boring) and thin (light sail) targets [9]. Radiation pressure effects become progressively more important at very high intensities, and finally dominate over TNSA at laser intensities I > 10²³ W/cm² [9]. The RPA regime could be achieved within the ELI-Beamlines project, whose main goal is to develop 1–10 PW class lasers with very short pulses (a few tens of femtoseconds) and excellent shot-to-shot reproducibility, in order to produce short X-ray sources and accelerate particles into the ultra-relativistic energy range [10].

2. ELI-Beamlines and ELIMED
ELI (Extreme Light Infrastructure) is a new type of European large-scale infrastructure developed within the European ESFRI process. It is based on four laser facilities located in four different countries. One of the facilities is being developed in Prague (Czech Republic) and is named ELI-Beamlines. ELI-Beamlines, as defined in the ELI White Book [10], will also be dedicated to applications of laser-produced photon and particle beams in various fields, from high-resolution X-ray imaging to hadrontherapy. One of the goals of ELI-Beamlines is to develop and realize a transport beam line to demonstrate the clinical applicability of laser-driven protons. Current hadrontherapy centers are based on conventional accelerators, huge and expensive machines in terms of both cost and human resources. Laser-based accelerators could be a competitive alternative, as they can be smaller in size and less expensive. The design of miniaturized systems, in which the laser-driven ion part is combined with a conventional beam transport system, has been proposed in the literature [11].
In this framework, and with the aim of developing a new approach to hadrontherapy based on laser-driven ions, the ELIMED (MEDical applications at ELI-Beamlines) task force has been launched by a group of ELI-Beamlines and INFN researchers. The project officially started with a Memorandum of Understanding (MoU) signed between ELI-Beamlines and INFN-LNS. The purpose of the MoU is to start a research program aimed mainly at studying, designing and realizing an irradiation facility for dosimetric and radiobiological studies with the high-energy proton/ion beams which will be produced at ELI-Beamlines [12]. LNS researchers have gained experience at CATANA (Centro di AdroTerapia e Applicazioni Nucleari Avanzate), the first Italian protontherapy facility [13, 14]. At CATANA a 62 MeV proton beam, accelerated by a superconducting cyclotron, is used for treating shallow tumors such as those in the ocular region. At the CATANA facility the clinical activity is still ongoing, and more than 200 patients have been treated since March 2002. The LNS tasks involve developing dosimetric techniques to be used for the laser-driven beams, designing and realising the transport medical beam line, and in-vitro radiobiological studies; these assignments are strictly linked with the CATANA activity. Moreover, LNS researchers are involved in beam diagnostics (using a new Thomson-like spectrometer, the first prototype of which was recently tested and will be discussed within this paper) and in energy selection, in order to produce a beam with the energetic and spatial characteristics required in clinical applications.

2.1. The ELIMED phases
ELIMED is planned to be realised in four phases:

Preparatory phase (2013–2015). During this phase the ELIMED facility will be built; even if no laser beams will be available, this phase is of fundamental importance for the final design of the transport beam line, from the target to the irradiation point. The design concerns the selection system, namely a magnetic selector able to operate with a 30–60 MeV proton beam in order to produce a beam with a final energy spread of about 5–20 %. An innovative Thomson spectrometer able to deflect particles with energies up to 100 MeV will be realised, starting from the spectrometer already developed at LNS. The in-air transport line will also be developed. A preliminary dosimetric system able to measure the beam fluence will be located along the beam line. The detectors are also planned to be tested with conventional beams at LNS and with laser-driven beams at other laser facilities.

First phase (2015–2016). The first laser beams will be available inside the ELIMED hall in order to test the selection and diagnostic systems. According to the particle-in-cell (PIC) simulations already performed, proton beams up to 20–30 MeV will be produced.

Second phase. Proton beams with energies up to 60 MeV will be produced.

Third phase. Proton beams with energies up to 100 MeV will be produced. The whole selection and diagnostic system will be working at its maximum performance. The transport line and the dosimetric systems will be commissioned, and the final radiobiological study of the beam will be realised. Inside the ELIMED facility an experimental hall for radiation therapy has already been assigned.

2.2. The LNS Thomson parabola spectrometer
The Thomson parabola (TP) spectrometer is a widely used beam diagnostic device.
Its working principle, widely described in the literature [15–18], is based on parallel electric and magnetic fields acting on a well-collimated ion beam propagating orthogonally to the fields themselves. The Lorentz force splits the different ion species of the beam according to their q/m ratio. On the detection plane, this deflection results in a set of parabolic traces, each corresponding to a well-determined ion species. Moreover, the electric and magnetic deflections, mutually perpendicular, are related to the energy and momentum of the particles; hence several types of information can be extracted simultaneously from a spectrogram.

A new model of TP has been developed at LNS. It permits the detection of high-energy ions (up to 20 MeV for protons) with a wide acceptance (of the order of 10⁻⁴ msr). Its deflection sector is made of partially overlapping electric and magnetic fields, giving a more compact system. The electric field is produced by two copper electrodes, and the magnetic field by an electromagnet with an H-shaped iron yoke. The resistive coils allow the dynamic range of the spectrometer to be increased, while the H-shaped yoke ensures high field uniformity despite the large gap between the coils. This technical solution is better than the previous one based on permanent magnets: with permanent magnets the dynamic range is constrained, and field uniformity requires a small gap, which limits the spectrometer acceptance. Figure 1 shows the field profiles along the direction of particle propagation.

Figure 1. Electric and magnetic field profiles along the direction of particle propagation.

The maximum achievable magnetic field intensity is 2500 gauss, corresponding to a coil current of 86.2 A; for current values higher than 65 A, a cooling system is necessary in order to avoid outgassing problems. The geometrical magnetic length is 15 cm, but its effective length is 18.82 cm. The copper electrodes that produce the electric field are 7 cm long (effective length 8.04 cm), and they start at 6 cm from the magnetic field centre. A scheme of the deflection sector is available in [13, 14]. The collimator, a 10 cm long cylinder with two pinholes of 1 mm and 100 µm, respectively, is located upstream of the deflection sector. Downstream of the deflection sector, after about 40 cm of drift, the imaging system for particle visualization is mounted. It consists of a microchannel plate detector in double chevron configuration, coupled to a phosphor screen and to a conventional reflex camera that is remotely controlled.
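The q/m separation described above can be illustrated with a small thin-lens ray-tracing sketch in Python. This is our own illustration, not the LNS analysis tool: the effective lengths and the drift are taken from the text, the field values are those quoted later for shot 42702, and the small-angle model itself is an approximation:

    import math

    B, L_B = 0.0234, 0.1882      # magnetic field [T] and effective length [m]
    E, L_E = 3.33e5, 0.0804      # electric field [V/m] and effective length [m]
    D = 0.40                     # drift from deflector to detector [m]

    def deflections(q, m, E_kin):
        """Small-angle magnetic (x) and electric (y) deflections at the detector."""
        v = math.sqrt(2.0 * E_kin / m)      # non-relativistic speed
        x = q * B * L_B * D / (m * v)       # magnetic deflection, scales as 1/v
        y = q * E * L_E * D / (m * v**2)    # electric deflection, scales as 1/v^2
        return x, y

    e, m_p = 1.602e-19, 1.673e-27
    for E_MeV in (0.2, 0.5, 1.0, 1.5):      # protons along one parabola
        x, y = deflections(e, m_p, E_MeV * 1e6 * e)
        print(f"{E_MeV:.1f} MeV: x = {1e3*x:.1f} mm, y = {1e3*y:.1f} mm")

Eliminating v gives y ∝ (m/q) x², so each charge-to-mass ratio draws its own parabola on the detector, which is exactly how the traces in the spectrograms discussed below are assigned to ion species.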
2.3. Design study of an energy selector
In view of the potential use of laser-driven protons for medical applications, a magnetic system is under study to select the produced particles in energy. The device consists of four dipoles based on permanent magnets, each producing a field of 0.7 T (see Figure 3). The second and the third magnetic fields are parallel to each other, but are oriented antiparallel to the first and the fourth field. This configuration increases the separation between the trajectories of particles of different energies at the central pair of magnets, where particles of the desired energy are selected by means of a slit. The energy spread and the number of particles passing through the slit depend on the size of the aperture: the lower the required energy spread, the smaller the slit and hence the fewer particles transported through the energy selector, and vice versa (see Figure 2). The energy of the proton beam can be tuned by moving the slit position transversely between 30 mm and 8 mm from the target normal axis. The roller-guide system on which the central twin magnets are placed allows the two magnets to be displaced radially in order to increase the transversal displacement and select the lowest-energy particles. In this way, the energy can be varied within the wide range of 5 MeV to 50 MeV. The energy spread reachable using a 1 mm slit aperture ranges from 3 % at low energy (1 MeV) up to 30 % at the highest energies (60 MeV). The whole magnet system is almost 600 mm long and will be placed in a dedicated vacuum chamber. Two additional collimators are placed upstream and downstream of the selector system in order to control the proton beam size.

Figure 2. Calculated energy spread as a function of the energy of the selected beam. The curves correspond to the values obtained using two slit sizes (0.5 and 1 mm).

Figure 3. Layout of the magnet system used as an energy selector for the beam produced by the laser-target interaction.
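The energy-displacement relation that such a dipole chicane exploits can be estimated from the magnetic rigidity. The sketch below is ours, a thin small-angle approximation with assumed effective dipole length and drift, not the actual ELIMED design calculation; it shows why lower-energy protons arrive at the central slit plane with a larger transverse offset:

    import math

    e, m_p = 1.602e-19, 1.673e-27
    B = 0.7          # dipole field [T], as quoted in the text
    L = 0.10         # assumed effective length of one dipole [m] (placeholder)
    drift = 0.10     # assumed drift between the first two dipoles [m] (placeholder)

    def offset_mm(E_MeV: float) -> float:
        """Transverse offset after one dipole plus drift, small-angle model."""
        p = math.sqrt(2.0 * m_p * E_MeV * 1e6 * e)   # non-relativistic momentum
        rho = p / (e * B)                            # bending radius p/(qB)
        theta = L / rho                              # bending angle
        return 1e3 * (L**2 / (2.0 * rho) + drift * math.tan(theta))

    for E in (5.0, 20.0, 50.0):
        print(f"{E:5.1f} MeV -> offset ~ {offset_mm(E):.1f} mm")

With these assumptions a 5 MeV proton is displaced roughly three times more than a 50 MeV one (about 33 mm versus 10 mm), so moving a slit transversely over a few tens of millimetres indeed scans the selected energy, consistent with the 8–30 mm range quoted above.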
tps icrs ln (180°) 1.5 ± 0.25 mev 1.53 mev pals (30°) 1.1 ± 0.011 mev 1.13 mev table 1. experimental results — maximal h + energy. the tp solving the differential equations of motion. computational errors are reduced using the ode23 solver, included in the matlab ode suite, and based on the stromer-verlet integration scheme. this ensures energy conservation for particles moving in a magnetic field [23]. a comparison between experimental and simulated proton parabolas for the shot under consideration is shown in figure 6. the data acquired with the tp has been further checked using an icr coupled with a fast digital oscilloscope, which provides information on the beam components time-of-flight (tof). the proton peak is usually identified just after the narrow x-ray photopeak. thus, once the proton peak is located, it is possible to obtain information on the corresponding tof [24, 25]. the tofs from the icr signals and the maximum energy estimation from the spectrograms are in good agreement. the pals tp data has been analysed with the same approach. in this case, lower proton energies are detected, in agreement with the backward direction [26] of the system position. for the shot under consideration, maximum energy values and the average tof values are shown in tab. 1. data from ircs are considered without errors. consistency between the icr signal and the tp result is evident. 4. conclusions the first experimental run performed with the lns tp has shown that the system is functioning well, also in comparison with other ion detectors, although further improvements are required. a new, off-axis and more precise collimation system is under study: it will allow better beam collimation, and a more reliable alignment procedure, an increase of the imaging detector field of view. 288 vol. 54 no. 4/2014 elimed: medical application at eli-beamlines the improved spectrometer will be tested with the cyclotron beam at lns to perform energy calibration and to verify the maximum energy limit. the spectrometer will also be upgraded in order to detect protons with energy up to 100 mev available within the future elimed beamline. acknowledgements this work has been performed in the framework of the lilia (laser induced light ion acceleration) project funded by infn and in the framework of the ministry of education, youth and sports the czech republic’s support for eli-beamlines (cz.1.05/1.1.00/483/02.0061), opvk (cz.1.07/2.3.00/20.0087) and opvk (cz.1.07/2.3.00/20.0279), and within the activities of the fyzikalni ustav, av cr, v.v.i. we offer our special thanks to the staff of pals for their support during the experimental measurements. references [1] wilks, s. c. et al.: energetic proton generation in ultra-intense laser-solid interactions. phys. plasmas 8, 542 (2001); doi:10.1063/1.1333697 [2] pucella, g., segre, s. e.: fisica dei plasmi. zanichelli editore s.p.a, bologna, 2010. [3] eliezer, s.: the interaction of high-power laser with plasmas. iop publishing ltd, 2002. [4] gibbon, p., short pulse laser interaction with matter. imperial college press, 2005. [5] passoni, m. et al.: target normal sheath acceleration: theory, comparison with experiments and future perspective. new j. phys. 12 045012 (2010); doi:10.1088/1367-2630/12/4/045012 [6] joshi, c., katsuleas, t.: plasma accelerators at the energy frontier and on tabletops. phys. today 56(6), 47 (2003); doi:10.1063/1.1595054 [7] c. t. zhou, c. t., he, x. t.: intense laser-driven energetic proton beams from solid density targets. optics letters, vol. 
4. Conclusions
The first experimental run performed with the LNS TP has shown that the system functions well, also in comparison with other ion detectors, although further improvements are required. A new, off-axis and more precise collimation system is under study: it will allow better beam collimation, a more reliable alignment procedure, and an increase of the field of view of the imaging detector. The improved spectrometer will be tested with the cyclotron beam at LNS to perform an energy calibration and to verify the maximum energy limit. The spectrometer will also be upgraded in order to detect protons with energies up to 100 MeV, as foreseen for the future ELIMED beamline.

Acknowledgements
This work has been performed in the framework of the LILIA (Laser Induced Light Ion Acceleration) project funded by INFN and in the framework of the Ministry of Education, Youth and Sports of the Czech Republic's support for ELI-Beamlines (CZ.1.05/1.1.00/483/02.0061), OPVK (CZ.1.07/2.3.00/20.0087) and OPVK (CZ.1.07/2.3.00/20.0279), and within the activities of the Fyzikální ústav, AV ČR, v.v.i. We offer our special thanks to the staff of PALS for their support during the experimental measurements.

References
[1] Wilks, S. C. et al.: Energetic proton generation in ultra-intense laser-solid interactions. Phys. Plasmas 8, 542 (2001); doi:10.1063/1.1333697.
[2] Pucella, G., Segre, S. E.: Fisica dei plasmi. Zanichelli Editore S.p.A., Bologna, 2010.
[3] Eliezer, S.: The Interaction of High-Power Lasers with Plasmas. IOP Publishing Ltd, 2002.
[4] Gibbon, P.: Short Pulse Laser Interactions with Matter. Imperial College Press, 2005.
[5] Passoni, M. et al.: Target normal sheath acceleration: theory, comparison with experiments and future perspectives. New J. Phys. 12, 045012 (2010); doi:10.1088/1367-2630/12/4/045012.
[6] Joshi, C., Katsouleas, T.: Plasma accelerators at the energy frontier and on tabletops. Phys. Today 56(6), 47 (2003); doi:10.1063/1.1595054.
[7] Zhou, C. T., He, X. T.: Intense laser-driven energetic proton beams from solid density targets. Optics Letters, Vol. 32, No. 16, August 15, 2007; doi:10.1364/ol.32.002444.
[8] Malka, V. et al.: Practicability of protontherapy using compact laser systems. Med. Phys. 31, 1587 (2004); doi:10.1118/1.1747751.
[9] Macchi, A., Benedetti, C.: Ion acceleration by radiation pressure in thin and thick targets. Nucl. Inst. Meth. Phys. Res. A (2010); doi:10.1016/j.nima.2010.01.057.
[10] ELI Extreme Light Infrastructure. Science and technology with ultra-intense lasers: Whitebook.
[11] Sakaki, H. et al.: Designing integrated laser-driven ion accelerator systems for hadron therapy at PMRC (Photo Medical Research Center). Proceedings of PAC09, Vancouver, BC, Canada, TU6PFP009.
[12] INFN, FZU: ELIMED: Memorandum of Understanding.
[13] Cirrone, G. A. P. et al.: Diagnostics for the radiotherapy use of laser-accelerated proton beams. Radiation Effects and Defects in Solids, Vol. 00, No. 00, January 2008, 1–7; doi:10.1080/10420151003731942.
[14] Maggiore, M. et al.: Design and realisation of a Thomson spectrometer for laser plasma facilities.
[15] Rhee, M. J.: Compact Thomson spectrometer. Rev. Sci. Instrum. 55 (8), August 1984; doi:10.1063/1.1137927.
[16] Schneider, R. F., Luo, C. M., Rhee, M. J.: Resolution of the Thomson spectrometer. J. Appl. Phys. 57 (1), 1 January 1985; doi:10.1063/1.335389.
[17] Harres, K. et al.: Development and calibration of a Thomson parabola with microchannel plate for the detection of laser-accelerated MeV ions. Rev. Sci. Instrum. 79, 093306 (2008); doi:10.1063/1.2987687.
[18] Jung, D. et al.: Development of a high resolution and high dispersion Thomson parabola. Review of Scientific Instruments, 82(1):013306, 2011; doi:10.1063/1.3523428.
[19] Jungwirth, K. et al.: The Prague Asterix Laser System. Physics of Plasmas, Volume 8, Number 5, May 2001; doi:10.1063/1.1350569.
[20] Margarone, D. et al.: New methods for high current fast ion beam production by laser-driven acceleration. Review of Scientific Instruments 83, 02B307 (2012); doi:10.1063/1.3669796.
[21] Torrisi, L. et al.: Proton emission from laser-generated plasmas at different intensities. Nukleonika 2012; 57(2):237–240.
[22] Torrisi, L. et al.: Proton emission from thin hydrogenated targets irradiated by laser pulses at 10¹⁶ W/cm². Rev. Sci. Instrum. 83, 02B315 (2012); doi:10.1063/1.3673506.
[23] Shampine, L. F., Reichelt, M. W.: The MATLAB ODE suite. SIAM Journal on Scientific Computing, Vol. 18, 1997, pp. 1–22.
[24] Margarone, D. et al.: Full characterization of laser-accelerated ion beams using Faraday cup, silicon carbide and single-crystal diamond detectors. Journal of Applied Physics 109, 103302 (2011); doi:10.1063/1.3585871.
[25] Margarone, D. et al.: Real-time diagnostics of fast light ion beams accelerated by a sub-nanosecond laser. Nukleonika 2011; 56(2):136–141.
[26] Psikal, J. et al.: Ion acceleration by femtosecond laser pulses in small multispecies targets. Physics of Plasmas 15, 053102 (2008); doi:10.1063/1.2913264.
Emittance characterization of ion beams provided by laser plasma
Luciano Velardi, Domenico Delle Side, Massimo De Marco, Vincenzo Nassisi
(Acta Polytechnica 53(2):254–258, 2013)
Laboratorio di Elettronica Applicata e Strumentazione (LEAS), Dipartimento di Matematica e Fisica, Università del Salento, Lecce, Italy; Istituto Nazionale di Fisica Nucleare (INFN), Section of Lecce, Lecce, Italy. Corresponding author: luciano.velardi@le.infn.it

Abstract. Laser ion sources offer the possibility of obtaining ion beams useful for particle accelerators. Nanosecond pulsed lasers at intensities of the order of 10⁸ W/cm², interacting with solid matter in a vacuum, produce plasma of high temperature and high density. To increase the ion energy, an external post-acceleration system can be employed by means of high-voltage power supplies of some tens of kV. In this work, we characterize the ion beams provided by a LIS source and post-accelerated. We calculated the produced charge and the emittance. Applying 60 kV of accelerating voltage and a laser irradiance of 0.1 GW/cm² on the Cu target, we obtain 5.5 mA of output current and a normalized beam emittance of 0.2 π mm mrad. The brightness of the beams was 137 mA (π mm mrad)⁻².

Keywords: LIS source, ion beam, emittance.

1. Introduction
It is known that the presence of specific doped ions can significantly modify the chemical-physical properties of many materials. Today many laboratories, including LEAS, are involved in developing accelerators of very contained dimensions, easy to install in small laboratories and hospitals. The use of ion sources facilitates the production of ion beams of moderate energy and with good geometric qualities. They are used for the production of innovative electronic and optoelectronic films [7], biomedical materials [8, 16], new radiopharmacy applications [14, 6], hadrontherapy applications [3], and to improve the oxidation resistance of many materials [13]. There are many methods for obtaining particle beams; application of the pulsed laser ablation (PLA) technique (the technique we adopt in this work) enables ions to be obtained from solid targets without any previous preparation, and their energy can easily be increased by post-acceleration systems [2, 12, 1]. In this way, plasma can be generated from many materials, including refractory materials [9, 15].
in this work, we characterize the ion beams provided by a laser ion source (lis) accelerator composed of two independent accelerating sectors, using an excimer krf laser to get pla from a pure cu target. using a home-made faraday cup and a pepper pot system, we studied the extracted charges and the geometric quality of the beams.
2. materials and methods
the lis accelerator consists of a krf excimer laser operating in the uv range to get pla from solid targets, and a vacuum chamber device for expanding the plasma plume and for extracting and accelerating its ion component. the maximum output energy of the laser is 600 mj. it works at 248 nm wavelength and the pulse duration is 25 ns. the angle formed by the laser beam with respect to the normal to the target surface is 70°. during our measurements, the laser spot area on the target surface is fixed at 0.005 cm^2. the laser beam strikes the solid target and generates plasma in an expansion chamber (ec), fig. 1. this chamber is closed around the target support (t) and is put at a positive high voltage (hv) of +40 kv. the length of the expansion chamber (18 cm) is sufficient to decrease the plasma density [5]. the plasma expands inside the ec and, since there is no electric field, breakdowns are absent. thanks to the plasma expansion, the charges reach the extremity of the expansion chamber, which is drilled with a 1.5 cm hole to allow ion extraction. a pierced ground electrode (ge) is placed at a 3 cm distance from the ec. in this way it is possible to generate an intense accelerating electric field between the ec and the ge. four capacitors of 1 nf, between the ec and ground, stabilize the accelerating voltage during fast ion extraction. after the ge, a third electrode (te) is placed 2 cm from it, connected to a power supply of negative bias voltage, −20 kv. it is also utilized as a faraday cup collector. the te is connected to the oscilloscope by an hv capacitor (2 nf) and a ×20 voltage attenuator, in order to separate the oscilloscope from the hv and to suit the electric signal to the oscilloscope input voltage. the value of the capacitors (4 nf) applied to stabilize the accelerating voltage and the value of the capacitors (2 nf) used to separate the oscilloscope from the hv are calculated assuming a storage charge higher than the extracted one. under this condition, the accelerating voltages during charge extraction are constant and the oscilloscope is able to record the real signal. the te is not able to support a suppressing electrode on the cup collector, so secondary electron emission, caused by the high ion energy, is present. in this configuration, we are aware that the output current values are about 20 % higher than the real values [11].
figure 1. schematic drawing of the lis accelerator (t: target support, ec: expansion chamber, ge: ground electrode, r: radiochromic, te: third electrode).
3. results and discussion
the value of the laser irradiance used to produce the ion beams was 1.0 × 10^8 w cm^−2 and the ablated target was a pure (99.99 %) disk of cu. figure 2 shows the time of flight (tof) spectra obtained at 60, 40 and 30 kv of total accelerating voltage from the cu target, detecting the ion emission with the faraday cup placed 23 cm from the target. the vertical axis represents the output current.
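the tof spectra can be converted into ion energies from the drift length and the arrival time. a minimal sketch of this conversion, assuming singly charged cu ions drifting at constant velocity over the full 23 cm target-to-cup distance and neglecting the time spent in the acceleration gaps (the function name and the example arrival time are illustrative, not taken from the paper):

```python
import numpy as np

AMU_KG = 1.66054e-27   # atomic mass unit [kg]
EV_J = 1.60218e-19     # 1 eV [J]

def tof_to_energy_ev(t_s, drift_m=0.23, m_amu=63.5):
    """kinetic energy of an ion arriving at time t_s, assuming it drifts
    the full target-to-cup distance at constant velocity."""
    v = drift_m / np.asarray(t_s)          # velocity [m/s]
    return 0.5 * m_amu * AMU_KG * v**2 / EV_J

# example: a cu ion arriving 2 microseconds after the laser shot
print(tof_to_energy_ev(2e-6))  # ~4.4e3 eV
```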
the maximum output current is reached applying 40 and 20 kv, respectively, on the first gap (ec–ge) and on the second gap (ge–te). the space charge effects are more evident at the lowest accelerating voltage applied on the first gap, although they are present anyway. so, considering only the tof curves free of charge-domination effects, we obtain the behaviour of the accelerated charge with respect to the accelerating voltage. from these results, we can observe the absence of a saturation phase; in fact, the curves in fig. 3 have a growing trend with respect to the applied voltage. the growing trend with the first gap voltage is larger than the one dependent on the second gap voltage. theoretically, we would expect a constant trend for the curves dependent principally on the voltage of the second acceleration gap, because the charge is already extracted and its value should be constant. we can ascribe this behaviour to the secondary emission of electrons from the cup collector, because we are not able to prevent this emission, owing to the absence of a suppressing electrode on the faraday cup. so we expect that the measured charge must be corrected by about 20 % for the voltage values used.
figure 2. waveforms of the output current at different accelerating voltages for a cu target; laser irradiance: 0.1 gw cm^−2.
fixing the voltage at 20 kv in the second gap, the extraction current increased with the voltage applied on the first gap, reaching 150 % at the maximum voltage of 40 kv with respect to the value obtained at 0 kv. this result implies a strong dependence of the extraction efficiency on the first stage voltage. in effect, simulation results performed in previous works have shown that the electric field strength near the ec hole increases with the applied voltage. this can enlarge the extracting volume inside the anode and, as a consequence, the extraction efficiency. the cu target at zero voltage produced ion beams containing 1.2 × 10^11 ions/pulse (0.7 × 10^11 ions cm^−2). instead, applying accelerating voltages of 40 and 20 kv in the first and second accelerating gap, respectively, we obtained an increase of the ion dose up to 3.4 × 10^11 ions/pulse (2 × 10^11 ions cm^−2).
figure 3. output charge versus accelerating voltage for the cu target; laser irradiance: 0.1 gw cm^−2.
for a geometric characterization of the beam, we performed emittance measurements by the pepper pot technique. we assume the propagation direction of the beam along the z axis. the x-plane emittance $\varepsilon_x$ is 1/π times the area $A_x$ occupied in the $xx'$ trace plane (tp$_x$) by the points representing the beam particles at a given value of z, namely
$$\varepsilon_x = \frac{A_x}{\pi}. \tag{1}$$
in the phase plane (pp$_x$), the area occupied by the particles is defined as [11]
$$A_{0x} = \iint_{f_{02} \neq 0} \mathrm{d}x\,\mathrm{d}p_x = m_0 c\,\beta\gamma\, A_x, \tag{2}$$
where $f_{02}$ is the distribution function of the particles, $m_0$ is the rest mass of the particles, $\beta = v/c$ and $\gamma = 1/(1-\beta^2)^{1/2}$. actually, it is necessary to define an invariant quantity of the motion, called the normalized emittance $\varepsilon_{nx}$, in the tp$_x$. by liouville's theorem, the area occupied by the particle beam in pp$_x$ is an invariant quantity, and the normalized emittance can assume the form
$$\varepsilon_{nx} = \beta\gamma\,\varepsilon_x. \tag{3}$$
figure 4 shows a sketch of the system used to measure the emittance value. the mask we used has 5 holes of 1 mm in diameter, and it was fixed on the ge. one hole is in the centre of the mask and 4 holes are at 3.5 mm from the centre.
we used gafchromic ebt radiochromic films (r) as the photosensitive screen, placed on the te. radiochromic detectors take advantage of the direct impression of a material by the absorption of energetic radiation, without requiring latent chemical, optical, or thermal development or amplification. a radiochromic film changes its optical density as a function of the absorbed dose. this property, and the relative ease of use, led to the adoption of these detectors as simple diagnostic tools for the transverse properties of ion beams. the ion beam behind the mask imprinted the radiochromic film, and it was then possible to measure the divergence of all beamlets. the divergence values allowed us to determine the beam area $A_x$ in tp$_x$. we applied 250 laser shots to imprint the radiochromic films. we measured the emittance for different accelerating voltage values. we fixed the accelerating voltage of the second gap at 20 kv, while the accelerating voltage of the first gap was put at 10, 20 and 40 kv. the obtained values of the area in tp$_x$ were 613, 545 and 435 mm mrad for 30, 40 and 60 kv of total accelerating voltage, respectively (fig. 5). considering eq. 3, we found the normalized emittance values. for all the applied voltage values, the normalized emittance was constant: $\varepsilon_{nx}$ = 0.2 π mm mrad. to estimate the total properties of the delivered beams, it is necessary to introduce the concept of brightness. brightness is the ratio between the current and the emittances along the x- and y-axes. generally, assuming $\varepsilon_x$ equal to $\varepsilon_y$, the normalized brightness becomes
$$B_n = \frac{I}{\varepsilon_{nx}^2}. \tag{4}$$
by eq. 4, at the current peak (5.5 ma), the brightness resulted in 137 ma (π mm mrad)^−2 at 60 kv of accelerating voltage. due to the low emittance and high current, this apparatus is very promising for feeding large accelerators. the challenge of the moment is to get accelerators with dimensions so small that they can be easily deployed in small laboratories and hospitals.
figure 4. sketch of the pepper pot system used to measure the emittance.
figure 5. emittance diagram in the trace plane for different accelerating voltage values.
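eqs. (1)–(4) can be checked numerically against the quoted trace-plane areas. a minimal sketch, assuming non-relativistic, singly charged cu ions whose kinetic energy equals the full accelerating voltage (the charge state is an assumption, not stated above):

```python
import numpy as np

M_CU_EV = 63.5 * 931.494e6   # cu rest energy [eV], ~63.5 u

def beta_gamma(E_eV, mc2_eV=M_CU_EV):
    # non-relativistic ions: beta*gamma ~ beta = sqrt(2E/mc^2)
    return np.sqrt(2.0 * E_eV / mc2_eV)

# trace-plane areas A_x [mm mrad] versus total accelerating voltage [V]
areas = {30e3: 613.0, 40e3: 545.0, 60e3: 435.0}
for V, Ax in areas.items():
    eps_n = beta_gamma(V) * Ax / np.pi   # eqs. (1) and (3), in pi mm mrad
    print(f"{V/1e3:.0f} kV: eps_nx = {eps_n:.2f} pi mm mrad")  # ~0.2 for each voltage

# eq. (4): normalized brightness at the 5.5 mA current peak
print(5.5 / 0.2**2)  # ~137 mA/(pi mm mrad)^2
```

under these assumptions all three voltages reproduce the constant value of 0.2 π mm mrad quoted in the text, and the brightness of 137 ma (π mm mrad)^−2 follows directly.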
4. conclusions
post-acceleration of ions emitted from laser-generated plasma can be developed to obtain small and compact accelerating machines. the output current increases readily with the accelerating voltage. the applied voltage can cause breakdowns, and for this reason the design of the chamber is very important (primarily its dimensions and morphology). we have also shown that two gaps of acceleration can efficiently increase the ion energy. by increasing the voltage of the first accelerating gap, we substantially increased the efficiency of the extracted current due to the rise of the electric field and the extracting volume inside the ec. the charge extracted without the electric fields was 0.7 × 10^11 ions/cm^2. at the maximum accelerating voltage the ion dose was 2 × 10^11 ions/cm^2, and the corresponding peak current was 5.5 ma. we measured the geometric characteristics of the beam utilising the pepper pot method. we found a low value for the normalized emittance of our beams, $\varepsilon_{nx}$ = 0.2 π mm mrad. the resulting brightness was therefore 137 ma (π mm mrad)^−2. this study has demonstrated that our apparatus can produce ion beams of good quality, i.e. with a low emittance value and high current. for this reason it is very promising for use in feeding large accelerators.
acknowledgements
the work was supported by the fifth national committee of infn in the lilia experiment.
references
[1] f. belloni, d. doria, a. lorusso, a. nassisi. study of particle acceleration of cu plasma. rev sci instrum 75(11):4763–4768, 2004.
[2] a. beloglazov, v. nassisi, m. primavera. excimer laser induced electron beams on an al target: plasma effect in a “nonplasma” regime. rev sci instrum 66(7):3883–3887, 1995.
[3] f. ciocci, f. della valle, a. doria, et al. spectroscopy of muonic hydrogen with a compact fel. in proc. epac '94, pp. 864–866. 1994.
[4] f. belloni, d. bleiner, a. bogaerts, v. nassisi. laser-induced plasmas from the ablation of metallic targets: the problem of the onset temperature, and insights on the expansion dynamics. j appl phys 101(8):83301, 2007.
[5] d. doria, f. belloni, a. lorusso, et al. recombination effects during the laser-produced plasma expansion. rad eff def solids 160(10–12):663–668, 2005.
[6] a. fortin, f. marion, b. l. stansfield, et al. efficiency of plasma-based ion implantation of radioisotopes (32p). surf and coatings technology 200(1–4):996–999, 2005.
[7] a. lorusso, v. nassisi, l. velardi, g. congedo. si nanocrystals formation by a new ion implantation device. nucl instrum meth b 266(10):2486–2489, 2008.
[8] a. lorusso, l. velardi, v. nassisi, et al. characteristic modification of uhmwpe by laser-assisted ion implantation. rad eff def solids 163(4–6):447–451, 2008.
[9] a. lorusso, l. velardi, v. nassisi, et al. polymer processing by a low energy ion accelerator. nucl instrum meth b 266(10):2490–2493, 2008.
[10] a. luches, m. martino, v. nassisi, et al. generation of self-pulsed multiple charged ions by an xecl excimer laser. nucl instrum methods phys res a 322(2):166–169, 1992.
[11] v. nassisi, e. giannico. characterization of high charge electron beams induced by excimer laser irradiation. rev sci instrum 70(8):3277–3281, 1999.
[12] v. nassisi, a. pedone. physics of the expanding plasma ejected from a small spot illumined by an ultraviolet pulsed laser. rev sci instrum 74(1):68–72, 2003.
[13] f. noli, a. lagoyannis, p. misaelides. oxidation resistance of y-implanted steel using accelerator based techniques. nucl instrum methods phys res b 266(10):2437–2440, 2008.
[14] h. takeda, j. h. billen, s. nath, et al. a compact high-power proton linac for radioisotope production. in proc. ieee part. accel. conf., vol. 2, pp. 8006–8013. 1995.
[15] l. torrisi, f. caridi, l. giuffrida. comparison of pd plasmas produced at 532 nm and 1064 nm by a nd:yag laser ablation. nucl instr and methods in physics res b 268(13):2285–2291, 2010.
[16] z. yu. study on the interaction of low-energy ions with organisms. surface & coatings technology 201(19–20):8006–8013, 2007.

acta polytechnica vol. 41 no. 4 – 5/2001
rapid prediction of configuration aerodynamics in the conceptual design phase
c. munro, p. krus
abstract: conceptual aircraft design is characterised by the requirement to analyse a large number of configurations rapidly and cost effectively. for unusual configurations such as those typified by unmanned combat air vehicles (ucavs), adequately predicting their aerodynamic characteristics through existing empirical methods is fraught with uncertainty. by utilising rapid and low cost experimental tools such as the water tunnel and subscale flight testing, it is proposed that the required aerodynamic characteristics can rapidly be acquired with sufficient fidelity for the conceptual design phase. furthermore, the initial design predictions can to some extent be validated using flight-derived aerodynamic data from subscale flight testing.
keywords: aircraft design, uav, water tunnel, flight testing.
1 introduction
the conceptual design phase is characterised by the evaluation of many disparate concepts. evaluation of these concepts requires the application of a combination of empirical, experimental and computational tools to predict the concept performance against requirements. however, this early design stage is also characterised by significant cost and time constraints. a significant level of risk must therefore be assumed because of insufficient time or financial resources to evaluate all the concepts comprehensively.
in the case of configuration development for ucavs there exists no empirical database and limited experience for predicting their aerodynamic characteristics. this is true not just because they are a recent concept in military aviation, but also because other requirements, most prevalently low observability, are forcing extremely novel configurations. this tendency towards novel configuration solutions is also evident in the wide range of configurations emerging in the uav field. the relevance of existing empirical tools and databases based on vastly different configurations is thus questionable. it is argued in this paper that a quasi-physical modelling process is required to be able to truly capture the performance of the radically differing concepts typical of ucavs. this approach attempts to combine the advantages of empirical, computational and experimental tools to enable the design team to better predict vehicle aerodynamic characteristics early in the design process with increased confidence, and hence to reduce risks.
2 background
the interest in ucavs has been driven by a number of factors, primarily the political necessity to minimise pilot casualties and to minimise cost at all levels. principal amongst these is the drive for affordable weapon systems. it has been proposed, see for example [1], that ucavs will lower acquisition, training and operational costs. goals established in us programs in this area are for vehicle acquisition costs in the order of one-third the cost of the upcoming joint strike fighter, and 50–80 % lower life cycle costs. these ambitious goals will be achieved through a number of means. amongst these will be weight reduction (weight remains an important parameter in vehicle costs) gained from removing the pilot and associated equipment (such as cockpit heating, ejection seat and so forth), and training for the mission operators that can be conducted almost entirely in simulation (the vehicle need only fly on operational missions, hence eliminating fatigue and associated flight-time-driven cost drivers). the operational cost for a military operation would be minimised through maximising the platform survivability. it is this factor, and the demand for low observability – together with the freedom that accrues from removing the pilot from the vehicle – that has led to a large number of innovative, unconventional platforms. one example of this is the saab sharc concept, shown in fig. 1.
fig. 1: saab sharc unmanned combat air vehicle concept (image courtesy saab ab)
while there are undoubtedly many issues to be resolved with ucavs regarding the level of autonomy, systems design, certification and interoperability with manned platforms, there also exist questions as to the aerodynamic performance of such vehicles. advances in flight control certainly allow for a wide variation in vehicle design that would not otherwise be possible. however, the fundamental requirement still exists that the configuration must provide sufficient lift with low drag, and that the control surfaces be able to respond to demands to manoeuvre and ensure controllability.
these aerodynamically derived characteristics are, however, rarely in concordance with low observability requirements. the focus of the work described in this paper is to enable the aerodynamic design team to identify critical aerodynamic issues (primarily as related to control and stability), predict performance, and perform trade studies to determine the aerodynamic influence of changes (be they either to the external aerodynamics or to the control system). it should be recognised that in the case of ucavs there is very strong coupling between the aerodynamics, stability and flight control functions, and hence a need to be able to evaluate the interactions between the three early in the design process. from a conceptual design standpoint, these requirements to meet both target performance and cost requirements place significant pressure on the initial analysis procedures. furthermore, much of the cost (particularly that component driven by vehicle weight) is determined very early on, as illustrated in fig. 2. hence, there is significant leveraging and longer term payback that can be gained from investing earlier in conceptual design studies. the level of modelling fidelity is lower at the conceptual design stage out of necessity. there must therefore be an acceptable level of fidelity that is required to enable concept selection to be undertaken and the requirement definition to be refined. the methods presented in this paper attempt to address these combined requirements of low cost and time while achieving an acceptable level of fidelity for conceptual design prediction.
fig. 2: influence of design stage on costs and the ability to influence the design
3 design analysis toolbox
the use of computational tools for the prediction of flight vehicle performance is rightly assuming a great deal of importance in current aeronautical research. the benefits of such tools are clear: significantly reduced cost and time can, at least in principle, be achieved, and often a greater insight into the underlying physics and multidisciplinary aspects can be obtained through the ability to exactly control test parameters which may well be coupled in the physical system. in the field of flight performance, the use of nonlinear six degree of freedom flight dynamics models is very well established as critical to every stage of vehicle development, from initial system definition and design right through to the training of aircrew. while the mathematical description of flight dynamics models is relatively simple and readily implementable in computer models, the same cannot be said for the underlying aerodynamics models. the complexity of fluids modelling means that computational fluid dynamics (cfd) is a relatively immature field that has not yet reached the point of acceptability in aircraft company conceptual design offices.
the reasoning is generally that the cycle time is too long when there are so many configurations to be studied (and so many test points to run) and, often, that there is a lack of confidence in the results. to address this weakness in aerodynamic modelling at the conceptual design stage, a method of building up the aerodynamic database using rapid experimental and computational tools around the central flight dynamics model is proposed, as shown in fig. 3. the role of the aerodynamic prediction methods presented is twofold: validation of simulation results, and as a complement to elements of the digital simulation.
fig. 3: conceptual design process with physical and virtual modelling elements
from a set of designs sketched by the conceptual design team (with a large input from low observability elements as regards the external configuration), a few may be selected on the basis of initial empirical analysis to proceed to more detailed modelling and experimental investigation. such activities may proceed in parallel. modelling will consist of initial computational analysis complemented by water tunnel quantitative data. this water tunnel data may be used to aid in generating the preliminary aerodynamic database that can be used as an input to the flight dynamics simulation model for initial operational/developmental evaluation. furthermore, the flow visualisation capabilities of the water tunnel provide an insight into the aerodynamic behaviour, and hence guide the design team to potential areas of improvement through the greater understanding of the dominant flow physics that this provides. the terms verification and validation used here conform to the test and evaluation community definitions, as detailed in [2]. verification need only be performed on major code changes, to ensure software robustness. validation, on the other hand, is the confirmation of simulation results by experimental or other means. in this instance, the validation can be provided by the subscale flight test elements, providing an independent check of the modelling and simulation elements that are themselves a combination of experimental (water tunnel), empirical and digital simulation. the modelling can be adjusted on the basis of the validation to improve the prediction capability. the design team can thus have more confidence in the analysis results when it comes to evaluating these results against design requirements. this in turn increases the confidence in the design, helps identify key problem areas requiring further work, and reduces the risk. conversely, this reduced risk could be exploited to allow more radical configurations to be evaluated with the same level of risk as without this approach. key elements of this design process are two relatively novel experimental tools: the water tunnel and subscale flight testing. it is the contention of this research that these methods both have significant potential in the conceptual design phase for ucav aerodynamics, although their limitations are significant and must be recognised.
4 novel experimental tools
while experimentation is undoubtedly a rather expensive and time consuming activity in comparison to simulation, the development of modern, cost effective and accurate instrumentation has driven down experimental costs and cycle time.
furthermore, questions remain in the conceptual design area over the feasibility in the near term of utilising complex, long-cycle-time simulations. the water tunnel and subscale flight test tools that are introduced here are good examples of this trend. both tools have their advantages and disadvantages in terms of time, cost and modelling fidelity, and these are introduced in this section.
5 water tunnel
the water tunnel has long been used as a diagnostics tool for external flows (see for example [3] and [4]). for ucavs, the demand for low observability tends to lead to chined fuselage geometries and swept, relatively sharp wing leading edges. the flow over such geometries at moderate to high angles of attack will in general be dominated by vortical flow, with fixed separation lines along the sharp wing leading edges and fuselage chines. this geometrically fixed separation, combined with the vortex-dominated separated flowfield, makes such flows reynolds insensitive ([5] and [6] are typical examples illustrating the reynolds number insensitivity of such flows) and hence amenable to analysis in low reynolds number facilities such as water tunnels. while flow visualisation has been well demonstrated in water tunnels, the use of strain gauge balances to allow force and moment measurement is relatively new ([7], [8]). initial test results performed as part of this work on generic delta wing models indicate excellent correlation of both lift curve slope and maximum lift coefficient with wind tunnel data, as shown in fig. 4. note that the reynolds number as listed here is based on root chord, and that the lift coefficient value does not account for the contribution of tangential force. hence, the true lift coefficient is slightly lower at high angles of attack than shown here. nonetheless, this result indicates the high level of accuracy that can rapidly be achieved while remaining within the time and cost constraints imposed at the conceptual design stage.
fig. 4: matching of force data for water and wind tunnel tests of a 65° delta wing
6 subscale flight testing
the use of subscale flying models for free dynamics testing has become increasingly possible in recent years due to the development of lightweight, low power instrumentation and telemetry systems. examples of projects that have been undertaken in this area include the boeing/nasa x-36 fighter 28 % scale demonstrator [9], a nasa experimental hypersonic configuration [10] and the saab t28 instrumented model (see fig. 5). subscale flight testing offers the benefit over ground-based tests of being fully six degrees of freedom, enabling realistic manoeuvres to be performed (although inertial scaling remains an issue requiring solution). particularly for high risk tests, such as at high angle of attack and departures such as spin, the subscale demonstrator offers the opportunity to explore these corners of the flight envelope in a safer manner than with a fullscale vehicle, and at much lower cost. furthermore, the subscale model can be built early in the design process (perhaps late in the conceptual phase), whereas the fullscale demonstrator can only be tested after design work is essentially complete. particularly for ucavs, such models can readily demonstrate technologies key to unmanned operation in addition to exploring a configuration's aerodynamic performance.
fig. 5: t28 trojan instrumented model aircraft
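one common way of handling the inertial scaling issue noted above is froude scaling, the usual dynamic-similarity rule for free-flying models. the paper does not state which scaling approach it uses, so the following sketch is illustrative only:

```python
def froude_scale_factors(n):
    """multipliers for a geometrically scaled, dynamically similar
    (froude-scaled) flight model of scale n (model/fullscale),
    assuming flight at equal air density and gravity."""
    return {
        "length": n,
        "speed": n**0.5,
        "time": n**0.5,
        "frequency": n**-0.5,
        "mass": n**3,
        "moment_of_inertia": n**5,
    }

# example: a 28 % scale demonstrator like the x-36
for quantity, factor in froude_scale_factors(0.28).items():
    print(f"{quantity}: x{factor:.4f}")
```

the n^3 mass and n^5 inertia requirements are what make inertial scaling difficult in practice: the instrumented model must not only look like the fullscale vehicle but also match its scaled mass distribution.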
advances in parameter identification methods, combined with the widespread availability of fast computers, have enabled the rapid identification of aerodynamic parameters from flight test data. the resulting set of aerodynamic and control derivatives can then readily be compared to simulation results for validation and improvement of these prediction approaches. furthermore, the possibility exists to perform tests rapidly with variations of platform geometry to assess alternative configurations and perform trade studies. initial testing has been performed to confirm that a complete set of inertial and air data can be readily recorded onboard and telemetered to the ground in real time for a payload weight of under 5 kg and at very low cost. work is currently underway on a small ucav-type vehicle for demonstration of the concept in a more realistic configuration. further, the work will be validated against fullscale flight test and wind tunnel data on a number of representative configurations.
7 methods integration
the two methods described in this paper are incorporated into the design process as shown in fig. 3. as shown in fig. 6, the two novel methods proposed here fit into a spectrum of experimental tools that can be utilised in the design process. through careful selection of these methods, with an underlying simulation capability that develops as the aircraft develops, the potential exists for more rapid and effective aerodynamic prediction. note also how the tendency is for the last 10 % of accuracy to cost much more than the initial, conceptual-design-level estimate. neither water tunnel testing nor subscale flight testing can replace the wind tunnel or fullscale flight testing in developing an aircraft to certification. however, at the very early design stages, where there are a large number of options to be studied and time and financial resources are limited, these tools offer some potential to reduce risk and increase the design knowledge. this in turn has a great leveraging effect on the overall development cost, as was indicated in fig. 2.
fig. 6: tradeoffs of accuracy and cost in methods selection
8 conclusion
the aerodynamic conceptual design of novel configurations such as ucavs presents significant problems to the aerodynamic designer, who is dependent on empirical (handbook) approaches combined with time consuming and costly wind tunnel and cfd analysis. it is proposed in the present research that useful additions to the design team's analysis toolbox may be the water tunnel and subscale flight testing. these two experimental methods fit into a simulation framework by enabling validation of simulation results and by complementing flight dynamics modelling, reducing the need for more costly aerodynamic prediction methods early in the design process. further work is currently underway to validate both the water tunnel and subscale flight testing against baseline test configurations for which readily available wind tunnel and fullscale flight test data exist. work is also underway to address the dynamic scaling issues relevant to subscale flight testing and how they can be resolved.
acknowledgments
the first author's work in this area is supported by the swedish engineering, design, research and education agenda (endrea). the authors also wish to acknowledge the support of saab ab in conducting this work.
references
[1] darpa system capability document – unmanned combat air vehicle operational system. http://www.darpa.mil/tto/downloadables/ucav/appendix.doc, 1998
[2] hall, d. h., kilikauskas, m., laack, d. k., muessig, p. r., o'neal, b., richardson, c., simecka, k., stewart, w., turner, s. w.: how to vv&a without really trying. us joint accreditation support activity jtcg/as-97-m-009, 1997
[3] del frate, j. h.: nasa dryden flow visualization facility. nasa tm-4631, 1995
[4] malcolm, g. n., nelson, r. c.: comparison of water and wind tunnel flow visualization results on a generic fighter configuration at high angles of attack. aiaa 87-2423, 1987
[5] erickson, g.: vortex flow correlation. afwal-tr-80-3142, 1981
[6] roos, f. w., kegelman, j. t.: an experimental investigation of sweep-angle influence on delta-wing flows. aiaa 90-0383, 1990
[7] suarez, c. j., malcolm, g. n.: water tunnel force and moment measurements on an f/a-18. aiaa 94-1802, 1994
[8] cunningham, jr., a. m., bushlow, t., mercer, j. r., wilson, t. a., schwoerke, s. n.: small scale model tests in small wind and water tunnels at high incidence and pitch rates. ad-a208647, vol. 1–3, 1989
[9] walker, l. a.: flight testing the x-36 – the test pilot's perspective. nasa cr-198058, 1997
[10] budd, g. d., gilman, r. l., eichstedt, d.: operational and research aspects of a radio-controlled model flight test program. journal of aircraft, 32, 1995, pp. 583–589
cameron munro, beng(aero), bbus, phone: +46 13 285783, fax: +46 13 130414, e-mail: cammu@ikp.liu.se
petter krus, ph.d., phone: +46 13 281792, fax: +46 13 130414, e-mail: petkr@ikp.liu.se
department of mechanical engineering, linköpings universitet, 581 83 linköping, sweden

acta polytechnica 55(5):306–312, 2015, doi:10.14311/ap.2015.55.0306
© czech technical university in prague, 2015, available online at http://ojs.cvut.cz/ojs/index.php/ap
condensation of water vapor in a vertical tube condenser
jan havlik∗, tomas dlouhy
department of energy engineering, faculty of mechanical engineering, czech technical university in prague, technická 4, 166 07 praha, czech republic
∗ corresponding author: jan.havlik@fs.cvut.cz
abstract. this paper presents an analysis of heat transfer in the process of condensation of water vapor in a vertical shell-and-tube condenser. we analyze the use of the nusselt model for calculating the condensation heat transfer coefficient (htc) inside a vertical tube, and the kern, bell-delaware and stream-flow analysis methods for calculating the shell-side htc from the tubes to the cooling water. these methods are experimentally verified for a specific condenser of waste process vapor containing air. the operating conditions of the condenser may be different from the assumptions adopted in the basic nusselt theory. modifications to the nusselt condensation model are theoretically analyzed.
keywords: condensation; vertical tube condenser; heat transfer coefficient.
1. introduction
condensers are used in a range of chemical, petroleum, processing and power facilities for distillation, for refrigeration and for power generation. most condensers used in the chemical process industries are water-cooled shell-and-tube exchangers and air-cooled tube or platen exchangers.
shell-and-tube condensers, which are used for condensing process vapors, are classified according to orientation (horizontal or vertical) and according to the placement of the condensing vapor (shell-side or tube-side) [1]. this paper deals with vertical shell-and-tube condensers with tube-side condensation. calculations of the overall heat transfer coefficient necessary for the design of the condenser heat transfer area are well described in the literature, but for limited operating conditions only. the nusselt condensation model, which is often recommended for calculating the condensing-side htc, is derived for conditions which need not be satisfied in real operation [2, 3]. the kern and bell-delaware methods and the stream-flow analysis method, which are commonly used for calculating the shell-side htc, provide different values [1]. the first step in an analysis of the condensation process is to test these methods in operating conditions different from the assumptions adopted in the basic nusselt theory. in the second step, the applicability of the methods will be proved, or modifications to them will be designed.
2. vertical tube-side downflow condenser
this configuration is often used in the chemical industry. it consists of a shell with fixed tubesheets. the lower head is over-sized to accommodate the condensate and a vent for non-condensable gases. the condensate flows down the tubes in the form of an annular film of liquid, thereby maintaining good contact with both the cooling surface and the remaining vapor. disadvantages are that the coolant, which is often more prone to fouling, is on the shell side, and that the use of finned tubes is precluded. the overall heat transfer coefficient $U$ is given by the equation
$$U = \frac{1/d_o}{\dfrac{1}{d_i h_c} + \dfrac{1}{2k}\ln\dfrac{d_o}{d_i} + \dfrac{1}{d_o h_w}}, \tag{1}$$
where $d_i$ is the inside diameter of the tubes, $h_c$ is the tube-side htc (vapor condensation), $k$ is the thermal conductivity of the tubes, $d_o$ is the outside diameter of the tubes, and $h_w$ is the shell-side htc.
3. nusselt theory of condensation on a vertical surface
the basic heat-transfer model for film condensation was first derived by nusselt to describe how a pure-component saturated vapor condenses on a vertical wall, forming a thin film of condensate that flows downward due to gravity [2, 3]. the following assumptions are made:
(1.) the flow in the condensate film is laminar.
(2.) the temperature profile across the condensate film is linear. this assumption is reasonable for a very thin film.
(3.) advection in the film is neglected.
(4.) the shear stress at the vapor-liquid interface is negligible.
(5.) the fluid velocity in the film is small.
(6.) the fluid properties are constant for the liquid film.
(7.) the wall is flat (no curvature).
(8.) the system is in a steady state.
then the following equation was introduced:
$$h_c = \frac{2\sqrt{2}}{3} \left( \frac{\rho_l\, g\, (\rho_l - \rho_p)\, h'_{fg}\, k_l^3}{\mu_l\, \Delta t_{sat}\, L} \right)^{1/4}, \tag{2}$$
where $\rho_l$ is the density of the condensate, $\rho_p$ is the density of the water vapor, $h'_{fg}$ is the latent heat of condensation, $k_l$ is the thermal conductivity of the condensate, $\mu_l$ is the dynamic viscosity of the condensate, $\Delta t_{sat}$ is the difference between the saturation temperature and the wall temperature, and $L$ is the wall length.
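as a numerical illustration of eq. (2), the following sketch evaluates the mean condensation htc for a film of saturated water at atmospheric pressure on a wall of the tube length used later in section 5. the property values are rounded textbook figures, and $h'_{fg}$ is approximated here by the plain latent heat:

```python
from math import sqrt

def nusselt_htc(dT, L, rho_l=958.0, rho_p=0.60, h_fg=2.257e6,
                k_l=0.68, mu_l=2.82e-4, g=9.81):
    """mean laminar-film condensation htc on a vertical wall, eq. (2);
    defaults: saturated water at ~100 degC (approximate properties)."""
    num = rho_l * g * (rho_l - rho_p) * h_fg * k_l**3
    return (2.0 * sqrt(2.0) / 3.0) * (num / (mu_l * dT * L))**0.25

# wall subcooling of 10 K over the 0.865 m tube length of the test condenser
print(nusselt_htc(dT=10.0, L=0.865))  # ~6.7e3 W/m^2K, i.e. 1/h_c ~ 1e-4 m^2K/W
```

the resulting thermal resistance of the order of 10^-4 m^2K/W is consistent with the magnitude quoted in section 5.1 below.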
4. shell-side htc
the kern and bell-delaware methods and the flow-stream analysis method are widely used for calculating the shell-side htc. hewitt et al. [1] present these methods very well. these methods take into account the following factors, which are specific to shell-and-tube heat exchangers:
(1.) tube layout and pitch;
(2.) the effect of production clearances on the temperature and velocity profiles.
the tube layout and the pitch influence the coefficients in the resulting equations (3), (6) and (9) of the methods. production clearances are included in the calculation by correction factors.
4.1. the kern method
based on data from industrial heat transfer operations and for a fixed baffle size (75 % of the shell diameter), the following equation was introduced:
$$h_w = \frac{k}{d_e}\, 0.36\, \mathrm{Re}^{0.55}\,\mathrm{Pr}^{0.33}, \tag{3}$$
where $k$ is the fluid thermal conductivity and $d_e$ is the equivalent diameter defined in (5). no change in viscosity from the bulk to the wall is assumed. the reynolds number $\mathrm{Re}$ is defined as
$$\mathrm{Re} = \frac{m'_s\, d_e}{\mu}, \tag{4}$$
where $m'_s$ is the shell-side mass velocity calculated from the total mass flow in the shell $m'_t$ and the cross-flow area at the diameter of the shell $S_s$ as $m'_s = m'_t/S_s$; the equivalent diameter $d_e$ is defined in the usual way as
$$d_e = \frac{4 \cdot \text{flow area}}{\text{wetted perimeter}}, \tag{5}$$
and $\mu$ is the fluid dynamic viscosity.
4.2. the bell-delaware method
in this method, correction factors for the following elements were introduced:
(1.) leakage through the gaps between the tubes and the baffles, and between the baffles and the shell;
(2.) bypassing of the flow around the gap between the tube bundle and the shell;
(3.) the effect of the baffle configuration (i.e., recognition of the fact that only a fraction of the tubes are in pure cross-flow);
(4.) the effect of the adverse temperature gradient on heat transfer in laminar flow.
the ideal cross-flow heat transfer coefficient is given by
$$h_{cf} = \frac{k}{d_o}\, 0.273\, \mathrm{Re}^{0.653}\,\mathrm{Pr}^{0.34}, \tag{6}$$
where the reynolds number $\mathrm{Re}$ is defined as
$$\mathrm{Re} = \frac{\rho\, v_{max}\, d_o}{\mu}, \tag{7}$$
where $\rho$ is the fluid density, $v_{max}$ is the maximum intertube velocity between tubes near the centerline of the flow [1], and $d_o$ is the external diameter of the tube. then the shell-side heat transfer coefficient is given by
$$h_w = h_{cf}\, J_c\, J_l\, J_b, \tag{8}$$
where $J_c$ is the correction factor for the baffle configuration, $J_l$ is the correction factor for leakage, and $J_b$ is the correction factor for the bypass in the bundle-shell gap [1].
4.3. the flow-stream analysis method
this method makes a detailed analysis of the flow in a heat exchanger. the fluid flows inside an exchanger via various routes. leakage flows occur between the tubes and the baffle, and between the baffle and the shell. part of the flow passes over the tubes in cross-flow, and part bypasses the bundle. the cross-flow and bypass streams combine to form a further stream that passes through the window zone. a correction factor $F_{cr}$, which adjusts the mass flow calculation (the reynolds number), takes these effects into account. then the shell-side heat transfer coefficient is given by
$$h_w = \frac{k}{d_o}\, 0.273\, \mathrm{Re}_{cr}^{0.653}\,\mathrm{Pr}^{0.34}, \tag{9}$$
where $\mathrm{Re}_{cr} = F_{cr}\,\mathrm{Re}$. the reynolds number $\mathrm{Re}$ is defined as in section 4.2, according to (7). generally speaking, the kern method offers the simplest route, the bell-delaware method is the most widely accepted method, and flow-stream analysis is the most realistic method.
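of the three methods, the kern correlation is the easiest to sketch in code. a minimal implementation of eqs. (3)–(5); the water properties are rounded values near the 85 °c cooling-water temperature quoted later, and the geometry arguments in the example call are placeholders, not the paper's shell layout:

```python
def kern_shell_htc(m_dot, S_s, d_e, k=0.67, mu=3.34e-4, Pr=2.1):
    """shell-side htc by the kern method.
    m_dot: total shell-side mass flow [kg/s], S_s: cross-flow area [m^2],
    d_e: equivalent diameter [m]; defaults: water near 85 degC (approximate)."""
    Re = (m_dot / S_s) * d_e / mu                    # eq. (4), m'_s = m_dot/S_s
    return 0.36 * (k / d_e) * Re**0.55 * Pr**0.33    # eq. (3)

# illustrative numbers only
print(kern_shell_htc(m_dot=1.0, S_s=0.01, d_e=0.02))  # ~1.8e3 W/m^2K at Re ~ 6e3
```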
figure 2. impact of the accuracy of the shell-side htc determination and of the tube-side htc determination on the overall htc.
figure 1. scheme of the testing loop.
5. experimental set-up
experiments for testing and for validating these methods for htc calculations are carried out on a vertical shell-and-tube heat exchanger, in which the condensing water vapor flows downwards in vertical tubes and the cooling water flows countercurrently in the shell part. the vapor outlet is open to the atmosphere, so the steam condensation pressure is close to atmospheric pressure. the tube bundle is formed by 49 tubes 865 mm in length, 28 mm in external diameter and 24 mm in internal diameter. the tubes are arranged in staggered arrays with a triangular tube pitch of 35 mm. the cross-section of the shell is rectangular in shape, 223 mm by 270 mm in size. seven segmental baffles (223 × 230 mm) are used in the shell section, the tube-to-baffle diametral clearance is 1 mm, and there is no shell-to-baffle diametral clearance. the material is stainless steel 1.4301 (aisi 304). the testing loop is shown in fig. 1. steam is produced in a steam generator. before the steam enters the vertical tube condenser, its parameters are reduced to the required values. hot water recirculation enables the cooling water temperature and the cooling water flow rate in the condenser to be regulated. the measured parameters are the inlet cooling water temperature $t_{w1}$, the temperature of the water after recirculation $t_{w2}$, the outlet cooling water temperature $t_{w3}$, the cooling water flow rate $q_w$, the inlet steam pressure $p_s$, the inlet steam temperature $t_s$ and the amount of steam condensate $m_c$.
5.1. analysis of overall htc sensitivity to the shell-side and tube-side htc values
determination of the htc is a complex problem, as there is often a large deviation between theoretical models and experiments. the htc is commonly evaluated with an accuracy within 20 % [1, 4]. the impact of the tolerance range of the pure steam condensation htc $h_c$ and of the determination of the shell-side htc $h_w$ on the value of the overall htc is shown in fig. 2. the overall htc values are calculated according to (1), where the value of the thermal resistance of the shell-side htc term $1/h_w$ (see fig. 3, approximately $10^{-3}$ m$^2$K/W) is significantly higher than the value of the thermal resistance of the condensation htc term $1/h_c$ (approximately $10^{-4}$ m$^2$K/W [3]). the sensitivity of the overall htc to the shell-side htc value is thus more significant than its sensitivity to the condensation htc. when calculating the overall htc, it is of greatest importance to determine the shell-side htc precisely.
6. verifying the shell-side htc calculation
the experiments are carried out on the condenser for the following operating parameters: heat exchanger thermal load, from 20 to 60 kw; logarithmic mean temperature difference, from 6 to 28 °c; and reynolds number values from 1200 to 6300 (70 measured states). condensation of saturated pure steam without air is tested.
figure 3. comparison between experimental and theoretical results for the shell-side htc.
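the evaluation described in the following paragraph extracts the shell-side htc from the measured overall htc by inverting eq. (1). a minimal sketch of that inversion, using the tube dimensions above and a wall conductivity of about 15 W/mK assumed here for stainless steel 1.4301 (the $U$ and $h_c$ inputs in the example are illustrative):

```python
from math import log

def shell_htc_from_overall(U, h_c, d_i=0.024, d_o=0.028, k_wall=15.0):
    """invert eq. (1) for h_w, given the overall htc U from the heat
    balance and the condensation htc h_c from the nusselt model."""
    rest = 1.0 / (U * d_o) - 1.0 / (d_i * h_c) - log(d_o / d_i) / (2.0 * k_wall)
    return 1.0 / (d_o * rest)

# e.g. U = 900 W/m^2K measured, h_c = 1.0e4 W/m^2K from eq. (2)
print(shell_htc_from_overall(900.0, 1.0e4))  # ~1.2e3 W/m^2K
```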
the thickness of the film does not exceed 0.5 % of the value of the tube diameter, so the difference in comparison with a flat plate is negligible. the vapor velocity is up to 1 m/s, and the maximum value of the reynolds number of the film does not exceed 100. the experimental shell-side htc values are calculated from the heat balance of the exchanger, using the measured data to determine the overall htc. then the shell-side htc is calculated from (1), assuming that the results of the nusselt condensation model for calculating the condensation htc are valid. the deviation of the measurements ranges from 4 % up to 7 %, depending on the operating parameters. a comparison between the experimental values and the values determined by the theoretical methods described in section 4 is shown in fig. 3. the bell-delaware method and the stream-flow analysis method achieve comparable results, while the kern method provides significantly higher values. the difference decreases from 40 % for a reynolds number of 1000 to 21 % for a reynolds number of 7000. the values of the shell-side htc calculated from the experimental results lie inside the range marked by the models, and do not correctly match any of them. our own approximation of the dependence of the shell-side htc on the reynolds number is proposed on the basis of the experimental results (see fig. 3), where the reynolds number is defined according to (4). for the bell-delaware method and the stream-flow analysis method, the reynolds number defined in (7) is used in the calculation of the htc, but the reynolds number used for fig. 3 is transformed to the value according to (4). the change in the prandtl number over the measurement range is estimated to be within ±3 %, so its influence on the change in the htc has been neglected. the proposed approximation
$$h_w = 17.61\, \mathrm{Re}^{0.52} \tag{10}$$
is valid for this specific tested condenser, for cooling water with a mean temperature of about 85 °c, with the coefficient of determination $r^2 = 0.81$. the total variance in the experimental results is to be expected, due to the complexity of the heat transfer process and the various operating parameters. fig. 4 compares the overall htc evaluated from the measured heat balance with the theoretical calculation using the nusselt theory for the condensation htc and the approximation (10) of the dependence of the shell-side htc on the reynolds number. the experimental results match the theoretical results with a range of tolerance mainly within ±10 %.
figure 4. comparison of the experimental and theoretical results for the overall htc.
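the coefficients of eq. (10) follow from a least-squares fit in log-log coordinates. a minimal sketch of such a fit; the data points below are synthetic, generated around eq. (10) for illustration, since the 70 measured states are not tabulated here:

```python
import numpy as np

def fit_power_law(Re, h_w):
    """fit h_w = C * Re^n by linear least squares in log-log space."""
    n, logC = np.polyfit(np.log(Re), np.log(h_w), 1)
    return np.exp(logC), n

# synthetic points scattered around eq. (10), for illustration only
rng = np.random.default_rng(0)
Re = np.linspace(1200.0, 6300.0, 70)
h_w = 17.61 * Re**0.52 * rng.normal(1.0, 0.05, Re.size)
print(fit_power_law(Re, h_w))  # ~ (17.6, 0.52)
```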
7. modifications to the nusselt theory
the operating conditions of the condenser may differ from the assumptions adopted in the basic nusselt theory. the following modifications of the nusselt condensation model, which may occur in the operation of the tested device, are theoretically analyzed:
(1.) condensation in vertical tubes;
(2.) the impact of steam velocity on laminar film flow;
(3.) condensation in the event of turbulent film flow;
(4.) the presence of non-condensable gases.
7.1. condensation in vertical tubes
the calculation is similar to the calculation for condensation on a vertical flat wall, assuming that the thickness of the film is very small in comparison with the external diameter of the tube [3]. the thickness of the film is evaluated according to the method included in the basic nusselt theory [2, 3]. the maximum value calculated for the operating conditions of the condenser does not exceed 0.5 % of the diameter of the tube. therefore, the difference in comparison with a flat plate wall is negligible.
7.2. impact of steam velocity on laminar film flow
during condensation inside vertical tubes, the steam acts on the surface of the film by a shear force. when the steam flows downward, the film flow accelerates and the condensation htc increases slightly. by contrast, when the steam moves upwards, it slows down the film flow [3, 4]. the influence of the steam velocity on the value of the local condensation htc is evaluated according to the blangetti method [5]. the results are shown in fig. 5. the velocity of the steam in the condenser ranges from 1 to 2 m/s. therefore, the effect of steam velocity is negligible.
figure 5. dependence of the local htc on steam velocity.
7.3. condensation in the case of turbulent film flow
at the bottom of high vertical walls or tubes, the thickness of the film can grow to such an extent that there is a transition to turbulence. this turbulent transport increases the htc [3, 4]. pure turbulent flow begins when the value of the critical reynolds number is 1800 [3]. the dependence of the local condensation htc on the reynolds number of the condensate film is evaluated according to the blangetti method [5], where the assumption of transition flow is introduced. an increase in the local htc begins at a reynolds number value $\mathrm{Re}_l$ of 1200 (see fig. 6). the reynolds number of the condensate film for the operating conditions of the condenser does not exceed a value of 100. therefore, the film flow in the condenser is laminar.
figure 6. impact of the reynolds number of the condensate film on the htc.
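the laminarity claim can be checked from the condensate flow. a minimal sketch, computing the film reynolds number at the tube outlet from the usual definition $\mathrm{Re} = 4\Gamma/\mu$ (this definition is an assumption here, as the text does not spell it out; the geometry is taken from section 5):

```python
from math import pi

def film_reynolds(m_cond, n_tubes=49, d_i=0.024, mu_l=2.82e-4):
    """reynolds number of the condensate film at the tube outlet,
    Re = 4*Gamma/mu, where Gamma is the condensate mass flow per unit
    wetted perimeter; water viscosity near 100 degC (approximate)."""
    gamma = m_cond / (n_tubes * pi * d_i)
    return 4.0 * gamma / mu_l

# 60 kW thermal load ~ 0.027 kg/s of condensate (h_fg ~ 2.26 MJ/kg)
print(film_reynolds(0.027))  # ~100, i.e. laminar (transition begins at ~1200)
```

the result of roughly 100 at the maximum thermal load agrees with the value quoted above, well below the transition threshold.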
7.4. presence of non-condensable gases
if the steam condenses in a mixture with inert gases, the steam molecules diffuse through the inert gas towards the vapor-liquid interface, and this causes a decrease in the partial pressure of the vapor and reduces the htc. to determine the htc value, it is necessary to take into account the mass transfer generated by the different partial pressures at the interface between the gas and liquid phases and in the bulk gas [3, 4]. in accordance with the analogy between mass transfer and heat transfer, the condensation htc is evaluated according to the nusselt model, combined with applying the dittus-boelter equation [2] for turbulent flow (velocity of 10 m/s) and the hansen equation [2] (velocity of 1 m/s) for laminar flow in the mass transfer. the results of the calculation for a mixture of steam with 10 % of air are shown in fig. 7. the value of the htc is significantly affected by the gas velocity. the decrease in the partial pressure also reduces the condensation temperature during the process (see fig. 8). the impact of the presence of a non-condensable gas on the condensation process in a vertical tube condenser is a subject for further research.
figure 7. effect of non-condensable gas on the condensation htc.
figure 8. dependence of the condensation temperature on the vapor concentration.
8. conclusions
this paper has dealt with calculating the heat transfer in a vertical tube condenser. we have analyzed the use of the nusselt model for calculating the condensation htc inside vertical tubes, and the kern, bell-delaware and stream-flow analysis methods for calculating the shell-side htc from the tubes to the cooling water. these methods have been verified experimentally for the specific condenser. the influence of the shell-side htc on the overall htc is more significant than the influence of the condensation htc. assuming that the nusselt condensation model is valid, the values of the shell-side htc evaluated from the experiments do not correctly match any of the methods mentioned here. we propose our own equation for the dependence of the shell-side htc on the reynolds number. the experimentally determined values of the overall htc match the results of the proposed theoretical approximation for determining the shell-side htc with a range of tolerance mainly within 10 %. a theoretical analysis has been made of the modifications to the nusselt condensation model that may occur in the operation of the tested device. the effect of a steam velocity in the range from 1 to 2 m/s is negligible. the thickness of the film does not exceed 0.5 % of the diameter of the tube, so the difference in comparison with a flat plate wall is negligible. the reynolds number of the condensate film does not exceed a value of 100, while transition and turbulent flow begin at a reynolds number value of 1200. therefore, the film flow in the condenser is laminar. the presence of a non-condensable gas reduces the steam condensation temperature and reduces the htc. this process is significantly affected by the gas velocity. this effect is a topic for further research.
acknowledgements
this work has been supported by the grant agency of the czech technical university in prague, grant sgs13/134/ohk2/2t/12.
references
[1] g. f. hewitt, g. l. shires, t. r. bott: process heat transfer, begell house, new york 1994.
[2] f. p. incropera, d. p. dewitt: introduction to heat transfer, wiley-academy, new york 1996.
[3] j. šesták, f. rieger: přenos hybnosti, tepla a hmoty (momentum, heat and mass transfer). vydavatelství čvut, praha 1993.
[4] h. d. bahr, k. stephan: heat and mass transfer, springer, berlin 1998. doi:10.1007/978-3-662-03659-4
[5] f. blangetti, r. krebs, e. u. schünder: condensation in vertical tubes – experimental results and modelling. chemical engineering fundamentals 1 (1982) 20–63.
acta polytechnica 53(3):295–301, 2013
© czech technical university in prague, 2013, available online at http://ctn.cvut.cz/ap/
complex covariance
frieder kleefeld∗
collaborator of centro de física das interacções fundamentais (cfif), instituto superior técnico (ist), edifício ciência, piso 3, av. rovisco pais, p-1049-001 lisboa, portugal
∗ corresponding author: kleefeld@cfif.ist.utl.pt
abstract. according to some generalized correspondence principle, the classical limit of a non-hermitian quantum theory describing quantum degrees of freedom is expected to be the well known classical mechanics of classical degrees of freedom in the complex phase space, i.e., some phase space spanned by complex-valued space and momentum coordinates. as special relativity was developed by einstein merely for real-valued space-time and four-momentum, we will try to understand how special relativity and covariance can be extended to complex-valued space-time and four-momentum. our considerations will lead us not only to some unconventional derivation of lorentz transformations for complex-valued velocities, but also to the non-hermitian klein-gordon and dirac equations, which are to lay the foundations of a non-hermitian quantum theory.
keywords: non-hermitian, quantum theory, relativity, complex plane.
1. introduction
as has been pointed out in various places (see e.g. [1–10] and references therein), a simultaneous causal, local, analytic and covariant formulation of physical laws requires a non-hermitian extension of quantum theory (qt), i.e. of quantum mechanics and quantum field theory. since causality, expressed in qt by the time-ordering operation, infers a small negative imaginary part in the self-energies appearing in causal propagators, which may be represented by some (in most cases infinitesimal) negative imaginary part in particle masses, even field operators representing electrically neutral causal particles have to be considered non-hermitian. this leads to the fact that their creation and annihilation operators are not hermitian conjugate to each other. during the attempt to find some spatial representation of non-hermitian creation and annihilation operators which preserves analyticity and covariance, it has turned out that the aforementioned non-hermitian causal field operators are functions of the complex spatial variable z instead of merely the real spatial variable x, while anticausal field operators are functions of the complex conjugate spatial variable z∗. consequently, a causal, local, analytic and covariant formulation of the laws of nature separates into a holomorphic causal sector and an antiholomorphic anticausal sector, which must not interact on the level of causal and anticausal field operators in the spatial representation. at least since gregor wentzel [11] (1926) there has existed a formalism (see also [12–18]), nowadays called e.g.
the "quantum Hamilton-Jacobi theory" (QHJT) or the "modified de Broglie-Bohm approach", which relates a field or wave function by some correspondence principle (see e.g. Eqs. (4.10)) to the trajectories of some "quantum particle" in the whole complex phase space. Wentzel's approach has recently even been fortified by A. Voros [19] (2012) by providing an "exact WKB method" allowing one to solve the Schrödinger equation for arbitrary polynomial potentials simultaneously in the whole complex z-plane. Moreover, there exists a rapidly increasing interest [23] of many theoretical and experimental researchers in studying solutions of the Schrödinger equation even for non-Hermitian Hamiltonians in the whole complex plane, due to a meanwhile confirmed conjecture of D. Bessis (and J. Zinn-Justin) in 1992 on the reality and positivity of spectra for manifestly non-Hermitian Hamiltonians, which was related by C.M. Bender and S. Boettcher in 1997 [20] to the PT-symmetry [21] of these Hamiltonians. Despite this enormous amount of activity to "make sense of non-Hermitian Hamiltonians" [21, 22], and the fact that we had managed [6] to construct even a Lorentz boost for complex-mass fields required to formulate non-Hermitian spinors and Dirac equations, there has remained — to our understanding — one crucial point neglected and unclear which is in the spirit of the QHJT and which will be the focus of the presented results: how can the general concept of "covariance", having been formulated by Albert Einstein (1905) [24] and colleagues (see also [25, 26]) merely for a phase space of real-valued spatial and momentum coordinates, be extended to a complex (i.e. complex-valued) phase space? An answer will be given to some extent in the following text. Certainly one might argue that there are already approaches, e.g. [27–31], which refer to seemingly covariant equations within a non-Hermitian framework. Nonetheless, it turns out that in all of these approaches there remain open questions to the reader as to what extent these equations are consistent with some of the aspects of causality, locality or analyticity, and to what extent the four-vectors formed by space-time and momentum-energy coordinates contained in these equations will transform consistently under Lorentz transformations, as it seems at first sight (e.g. [32, 33]) completely puzzling how to extend the framework of an inertial frame to the complex plane.

2. Space-time covariance for complex-valued velocities

The purpose of this section is to derive generalized Lorentz-Larmor-FitzGerald-Voigt (LLFV) [34, 35] transformations¹ relating the space-time coordinates $\vec z, t$ and $\vec z\,', t'$ of two inertial frames S and S′, respectively, which move with some constant 3-dimensional complex-valued relative velocity $\vec v \in \mathbb{C}\times\mathbb{C}\times\mathbb{C}$. While the 3-dimensional space vectors $\vec z$ and $\vec z\,'$ are assumed to be complex-valued, i.e. $\vec z, \vec z\,' \in \mathbb{C}\times\mathbb{C}\times\mathbb{C}$, our derivation will display how the preferably real-valued time coordinates t and t′ will be complexified. Without loss of generality, we will constrain ourselves throughout the derivation to one complex-valued spatial dimension, while the generalization to three complex-valued spatial dimensions appears straightforward.
Hence we consider in what follows, for simplicity, generalized LLFV transformations relating the space-time coordinates z, t and z′, t′ (with z, z′ ∈ ℂ) of two inertial frames S and S′, respectively, moving with some constant one-dimensional complex-valued relative velocity v ∈ ℂ. According to the standard definition, an inertial frame in the absence of gravitation is a system in which the first law of Newton holds. For our purposes we will rephrase this definition in a way which may be used even in some complexified space-time: an inertial frame in the absence of gravitation is a system whose trajectory in (even complexified) space-time is a straight line. Our definition of inertial frames implies directly that generalized LLFV transformations relating inertial frames must be linear. Making use of this observation, we can write down — as a first step in our derivation — the following linear ansatz for the generalized LLFV transformations between the inertial frames S and S′:
$$z' = \gamma\,z + \delta\,t + \varepsilon, \quad (2.1)$$
$$t' = \kappa\,z + \mu\,t + \nu. \quad (2.2)$$
Here γ, δ, ε, κ, µ, ν are yet unspecified, eventually complex-valued constants. In a second step we will perform — without loss of generality — a synchronization of the inertial frames S and S′ by imposing the following condition:
$$(z,t) = (0,0) \iff (z',t') = (0,0). \quad (2.3)$$
¹Major steps in the LLFV history: stepwise derivation of the transformations by Voigt (1887), FitzGerald (1889), Lorentz (1895, 1899, 1904), Larmor (1897, 1900); formalistic progress afterwards: Poincaré (1900, 1905) (discovery of the Lorentz-group properties and some invariants), Einstein (1905) (derivation of the LLFV transformations from first principles), Minkowski (1907–1908) (geometric interpretation of the LLFV transformations).
Obviously the synchronization yields ε = ν = 0. Inserting this result in Eqs. (2.1) and (2.2) leads to the following equations:
$$z' = \gamma\,z + \delta\,t = \gamma\Big(z + \frac{\delta}{\gamma}\,t\Big), \quad (2.4)$$
$$t' = \kappa\,z + \mu\,t. \quad (2.5)$$
In the third step we use the relative complex-valued velocity between the inertial frames S and S′: the spatial origin z′ = γz + δt = 0 of S′ moves in S with constant complex-valued velocity v ≡ z/t = −δ/γ, yielding for Eq. (2.4):
$$z' = \gamma\,(z - v\,t) \quad\text{with}\quad \gamma = \gamma(v). \quad (2.6)$$
In writing γ = γ(v) we point out that the constant γ could be a function of the complex-valued velocity v. The fourth step is the application of the principle of relativity, which states that — in the absence of gravity — there does not exist any preferred inertial frame of reference, implying in particular that the laws of physics take the same mathematical form in all inertial frames. This provides an essentially unique prescription of how to construct inverse generalized LLFV transformations: interchange the space-time coordinates z, t and z′, t′, respectively, and replace the complex-valued velocity v by −v. As a consequence we obtain for the corresponding "inverse" of Eq. (2.6):
$$z = \bar\gamma\,(z' + v\,t') \quad\text{with}\quad \bar\gamma \equiv \gamma(-v). \quad (2.7)$$
In order to determine the yet unknown, eventually complex-valued constants κ and µ in Eq. (2.5), we solve Eq. (2.7) for t′ and apply to the result the identity Eq. (2.6), i.e.:
$$t' = \frac{1}{v}\Big(\frac{z}{\bar\gamma} - z'\Big) = \frac{1}{v}\Big(\frac{z}{\bar\gamma} - \gamma\,(z - v\,t)\Big), \quad (2.8)$$
or — after some rearrangement —
$$t' = \gamma\Big(t - \frac{1}{v}\Big(1 - \frac{1}{\gamma\bar\gamma}\Big)z\Big). \quad (2.9)$$
Comparison of Eq. (2.9) with Eq. (2.5) yields $\kappa = -\frac{\gamma}{v}\big(1 - \frac{1}{\gamma\bar\gamma}\big)$ and µ = γ. By the same procedure leading from Eq. (2.6) to Eq. (2.7), the principle of relativity can be used to obtain the corresponding "inverse" of Eq. (2.9), i.e.:
$$t = \bar\gamma\Big(t' + \frac{1}{v}\Big(1 - \frac{1}{\gamma\bar\gamma}\Big)z'\Big). \quad (2.10)$$
Hence, the previous considerations result in the following four identities (with $\gamma \equiv \gamma(v)$, $\bar\gamma \equiv \gamma(-v)$):
$$z' = \gamma\,(z - v\,t), \quad (2.11)$$
$$t' = \gamma\Big(t - \frac{1}{v}\Big(1 - \frac{1}{\gamma\bar\gamma}\Big)z\Big), \quad (2.12)$$
$$z = \bar\gamma\,(z' + v\,t'), \quad (2.13)$$
$$t = \bar\gamma\Big(t' + \frac{1}{v}\Big(1 - \frac{1}{\gamma\bar\gamma}\Big)z'\Big). \quad (2.14)$$
In dividing Eq. (2.11) by Eq. (2.12) and Eq. (2.13) by Eq. (2.14), we obtain a generalized velocity addition law Eq. (2.15) and its inverse Eq. (2.16), respectively, i.e.:
$$\frac{z'}{t'} = \frac{\frac{z}{t} - v}{1 - \frac{1}{v}\big(1 - \frac{1}{\gamma\bar\gamma}\big)\frac{z}{t}}, \quad (2.15)$$
$$\frac{z}{t} = \frac{\frac{z'}{t'} + v}{1 + \frac{1}{v}\big(1 - \frac{1}{\gamma\bar\gamma}\big)\frac{z'}{t'}}. \quad (2.16)$$
A final fifth step makes use of the principle of constancy inferred by Albert Einstein, stating that light in vacuum propagates in all inertial frames with the same speed, independent of the movement of the light source and the propagation direction. For our purposes we will generalize and simplify this principle of constancy by simply claiming that the velocity addition law and its inverse possess an eventually complex-valued fixed point c whose modulus |c| coincides with the vacuum speed of light. Or, in other words: there exists some eventually complex-valued velocity c which is not modified by the application of the velocity addition law and its inverse, while the modulus |c| coincides with the vacuum speed of light. Application of this generalized principle of constancy to the addition laws Eq. (2.15) and Eq. (2.16) yields the following identity Eq. (2.17):
$$c = \frac{c \mp v}{1 \mp \frac{1}{v}\big(1 - \frac{1}{\gamma\bar\gamma}\big)\,c} \;\Longrightarrow\; \gamma\bar\gamma = \frac{1}{1 - \frac{v^2}{c^2}}. \quad (2.17)$$
As $\gamma\bar\gamma$ depends on c², i.e., the square of c, we can conclude that — besides the fixed point +c of the velocity addition law Eq. (2.15) and its inverse — there simultaneously exists a second fixed point −c. Eq. (2.17) can be used to transform Eqs. (2.11)–(2.16) to their final form. As LLFV transformations should reduce to the identity in the limit v → 0, we take the "positive" complex square root of Eq. (2.17), i.e., $(\gamma\bar\gamma)^{-1/2} = +\sqrt{1 - \frac{v^2}{c^2}}$, and invoke it together with $\gamma = \sqrt{\gamma\bar\gamma}\cdot\sqrt{\gamma/\bar\gamma}$ and $\bar\gamma = \sqrt{\gamma\bar\gamma}\cdot\sqrt{\bar\gamma/\gamma}$ in Eqs. (2.11)–(2.14), obtaining the following generalized LLFV transformations (with $\gamma \equiv \gamma(v)$, $\bar\gamma \equiv \gamma(-v)$):
$$z' = \sqrt{\tfrac{\gamma}{\bar\gamma}}\cdot\frac{z - v\,t}{\sqrt{1 - \frac{v^2}{c^2}}}, \quad (2.18)$$
$$t' = \sqrt{\tfrac{\gamma}{\bar\gamma}}\cdot\frac{t - \frac{v}{c^2}\,z}{\sqrt{1 - \frac{v^2}{c^2}}}, \quad (2.19)$$
$$z = \sqrt{\tfrac{\bar\gamma}{\gamma}}\cdot\frac{z' + v\,t'}{\sqrt{1 - \frac{v^2}{c^2}}}, \quad (2.20)$$
$$t = \sqrt{\tfrac{\bar\gamma}{\gamma}}\cdot\frac{t' + \frac{v}{c^2}\,z'}{\sqrt{1 - \frac{v^2}{c^2}}}, \quad (2.21)$$
and in Eqs. (2.15), (2.16), arriving at the generalized velocity addition law and its inverse:
$$\frac{z'}{t'} = \frac{\frac{z}{t} - v}{1 - \frac{v}{c^2}\cdot\frac{z}{t}}, \quad (2.22)$$
$$\frac{z}{t} = \frac{\frac{z'}{t'} + v}{1 + \frac{v}{c^2}\cdot\frac{z'}{t'}}. \quad (2.23)$$
Two comments are in order here: even though the previous Eqs. (2.18)–(2.23) look very similar to the well-known textbook equations appearing in the context of the standard formalism of special relativity, they are completely non-trivial, as they hold not only in a real-valued space-time and for real-valued velocities, but also in a complex-valued space-time and for complex-valued velocities. Moreover, the extension of the aforementioned equations to three spatial dimensions is achieved by replacing the complex-valued quantities z, z′ and v by complex-valued 3-dimensional vectors $\vec z$, $\vec z\,'$ and $\vec v$, respectively.

3. On the choice of the invariant velocities ±c and the complexification of time

As we allow complex-valued velocities, we face more freedom than Albert Einstein in defining the eventually complex-valued invariant velocities ±c.
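As a quick consistency check (not part of the original derivation), the fixed-point property expressed by Eq. (2.17) can be verified symbolically. The following minimal sympy sketch, with variable names of our own choosing, confirms that +c and −c are left invariant by the generalized addition law Eq. (2.22) even for complex-valued v and c:

```python
import sympy as sp

v, c = sp.symbols('v c')   # sympy symbols are complex-valued by default

# generalized velocity addition law, eq. (2.22): u' = (u - v)/(1 - v*u/c**2)
def add_velocity(u, v, c):
    return (u - v) / (1 - v*u/c**2)

# +c and -c are fixed points of the addition law, as claimed below eq. (2.17)
print(sp.simplify(add_velocity(c, v, c) - c))    # -> 0
print(sp.simplify(add_velocity(-c, v, c) + c))   # -> 0

# numerical spot check with complex-valued test values (arbitrary choices)
print(complex(add_velocity(c, v, c).subs({v: 0.3 + 0.1j, c: 1.0})))  # ~ (1+0j)
```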
We will discuss here two specific options for defining c, of which the first is our preferred choice due to the arguments given below:

• Option 1: choose $c = \pm|c|$ real-valued, with |c| = 299792458 m/s [38] being the vacuum speed of light, and set $\gamma = \bar\gamma$ (see the discussion of Eq. (4.14)!). Performing this choice, Eqs. (2.18)–(2.23) read:
$$z' = \frac{z - v\,t}{\sqrt{1 - \frac{v^2}{|c|^2}}}, \qquad z = \frac{z' + v\,t'}{\sqrt{1 - \frac{v^2}{|c|^2}}}, \quad (3.1)$$
$$t' = \frac{t - \frac{v}{|c|^2}\,z}{\sqrt{1 - \frac{v^2}{|c|^2}}}, \qquad t = \frac{t' + \frac{v}{|c|^2}\,z'}{\sqrt{1 - \frac{v^2}{|c|^2}}}, \quad (3.2)$$
$$\frac{z'}{t'} = \frac{\frac{z}{t} - v}{1 - \frac{v}{|c|^2}\cdot\frac{z}{t}}, \qquad \frac{z}{t} = \frac{\frac{z'}{t'} + v}{1 + \frac{v}{|c|^2}\cdot\frac{z'}{t'}}. \quad (3.3)$$
With the exception of the square roots, all these equations are manifestly analytic. On the worldline z = v·t of S′ in S we have, with t ∈ ℝ:
$$t' = \frac{t - \frac{v}{|c|^2}\cdot v\,t}{\sqrt{1 - \frac{v^2}{|c|^2}}} = t\cdot\sqrt{1 - \frac{v^2}{|c|^2}} \;\in\; \mathbb{C} \quad\text{for } v \in \mathbb{C}. \quad (3.4)$$
Hence, the attractive feature of analyticity would be obtained at the price of multiplying time in the boosted frame by some complex-valued constant.

• Option 2: choose c and v to be (anti)parallel in the complex plane, i.e., $c = \pm|c|\cdot\frac{v}{|v|} = \pm\big|\frac{c}{v}\big|\cdot v$, with |c| = 299792458 m/s [38] being the vacuum speed of light, and set $\gamma = \bar\gamma$ (see the discussion of Eq. (4.14)!). Performing this choice, Eqs. (2.18)–(2.23) read:
$$z' = \frac{z - v\,t}{\sqrt{1 - \big|\frac{v}{c}\big|^2}}, \qquad z = \frac{z' + v\,t'}{\sqrt{1 - \big|\frac{v}{c}\big|^2}}, \quad (3.5)$$
$$t' = \frac{t - \frac{v^*}{|c|^2}\,z}{\sqrt{1 - \big|\frac{v}{c}\big|^2}}, \qquad t = \frac{t' + \frac{v^*}{|c|^2}\,z'}{\sqrt{1 - \big|\frac{v}{c}\big|^2}}, \quad (3.6)$$
$$\frac{z'}{t'} = \frac{\frac{z}{t} - v}{1 - \frac{v^*}{|c|^2}\cdot\frac{z}{t}}, \qquad \frac{z}{t} = \frac{\frac{z'}{t'} + v}{1 + \frac{v^*}{|c|^2}\cdot\frac{z'}{t'}}. \quad (3.7)$$
All these equations are manifestly non-analytic. On the worldline z = v·t of S′ in S we have, with t ∈ ℝ:
$$t' = \frac{t - \frac{v^*}{|c|^2}\cdot v\,t}{\sqrt{1 - \big|\frac{v}{c}\big|^2}} = t\cdot\sqrt{1 - \Big|\frac{v}{c}\Big|^2} \;\in\; \mathbb{R} \quad\text{for } \Big|\frac{v}{c}\Big| \le 1. \quad (3.8)$$
Hence, the attractive feature of a real-valued time in S and S′ would be obtained at the price of introducing manifest non-analyticity into the theory.

4. Momentum-energy covariance for complex-valued velocities

It seems to be one of the greatest mysteries in theoretical physics that — to our understanding — the most straightforward derivation of Einstein's [36] famous, seemingly classical identity E = mc² is based on the correspondence between classical and quantum physics, finding its manifestation in the concept of Louis de Broglie's [37] (1923) particle-wave duality (see also [25, 26]). In our words:² in the process of quantisation, the point particle of classical mechanics propagating in complex-valued space-time is replaced by energy quanta (quantum particles) represented by some wave function ψ evolving also in complex-valued space-time. Quantum particles, i.e. energy quanta, can be — depending on the spatial spread of the wave function and circumstances — localized or delocalized. Moreover — according to Liouville's complementarity and Heisenberg's uncertainty principle — they display some simultaneous spread in complex-valued momentum space. In the interaction-free case, the wave function of a quantum particle with sharply defined momentum is a plane wave with angular frequency ω and wave number k (or wave vector $\vec k$ in more than one dimension).

²It should be stressed that the concept of "electromagnetic mass" [25, 26], involving names like J.J. Thomson (1881), FitzGerald, Heaviside (1888), Searle (1896, 1897), Lorentz (1899), Wien (1900), Poincaré (1900), Kaufmann (1902–1904), Abraham (1902–1905), Hasenöhrl (1905), had revealed already before Einstein (1905) a proportionality between energy and some (eventually velocity-dependent) mass, i.e. E ∝ mc². Einstein himself considered a moving body in the presence of e.m. radiation to derive E = mc².
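The contrast between the two options on the worldline z = v·t, Eqs. (3.4) and (3.8), is easy to see numerically. The short sketch below uses an arbitrary complex test velocity; the values are purely illustrative and not taken from the text:

```python
import numpy as np

c_abs = 299792458.0          # vacuum speed of light [m/s]
v = (0.3 + 0.2j) * c_abs     # an arbitrary complex-valued test velocity
t = 1.0                      # real-valued time in frame S [s]

# option 1 (analytic), eq. (3.4): t' = t*sqrt(1 - v**2/|c|**2) is complex
t1 = t * np.sqrt(1 - v**2 / c_abs**2)
# option 2 (non-analytic), eq. (3.8): t' = t*sqrt(1 - |v/c|**2) is real
t2 = t * np.sqrt(1 - abs(v / c_abs)**2)

print(t1)   # complex boosted time, roughly (0.977 - 0.061j)
print(t2)   # real boosted time, roughly 0.933
```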
For a real-valued space coordinate x, the functional behaviour of a plane wave is known to be $\psi(x,t) \propto \exp\big(i(kx - \omega t)\big)$, yielding obviously:
$$\omega = +i\,\frac{\partial \ln\psi}{\partial t}, \qquad k = -i\,\frac{\partial \ln\psi}{\partial x}. \quad (4.1)$$
As we extend our formalism to the complex plane, we replace the real-valued coordinate x by the complex-valued coordinate z (or the complex conjugate z∗). Instead of performing partial derivatives with respect to x, we will now perform partial derivatives with respect to z (or z∗), which are known as Wirtinger derivatives in one complex dimension and Dolbeault operators in several complex dimensions. They are used in the context of (anti)holomorphic functions and have the following fundamental properties, being a special case of the famous Cauchy-Riemann differential equations:
$$\frac{\partial z^*}{\partial z} = \frac{\partial z}{\partial z^*} = 0, \qquad \frac{\partial z}{\partial z} = \frac{\partial z^*}{\partial z^*} = 1. \quad (4.2)$$
On this formalistic ground we now denote generalized relations within a holomorphic framework to determine the eventually complex-valued angular frequency ω and wave number k for a plane wave propagating in some complexified phase space:
$$\omega = +i\,\frac{\partial \ln\psi}{\partial t}, \qquad k = -i\,\frac{\partial \ln\psi}{\partial z}. \quad (4.3)$$
Integration of these equations results in the following wave function for a plane wave in some holomorphic phase space:
$$\psi(z,t) \propto \exp\big(i(kz - \omega t)\big). \quad (4.4)$$
As a key postulate (let us call it e.g. the plane-wave-phase-covariance postulate (PWPCP)) in our derivation, we claim at this point that the eventually complex-valued phase of a plane wave should be a Lorentz scalar. Or, in other words: the eventually complex-valued phase of a plane wave should not change when boosted from one inertial frame to another.³ For the previously considered inertial frames S and S′ this implies in particular:
$$k\,z - \omega\,t = k'z' - \omega't'. \quad (4.5)$$
We may now insert into the left-hand side of this equation our generalized LLFV transformations Eqs. (2.20) and (2.21), i.e.:
$$k\cdot\sqrt{\tfrac{\bar\gamma}{\gamma}}\cdot\frac{z' + v\,t'}{\sqrt{1 - \frac{v^2}{c^2}}} - \omega\cdot\sqrt{\tfrac{\bar\gamma}{\gamma}}\cdot\frac{t' + \frac{v}{c^2}\,z'}{\sqrt{1 - \frac{v^2}{c^2}}} = k'z' - \omega't'. \quad (4.6)$$
Comparison of the left- and right-hand sides of this equation yields the following two equations:
$$k' = \sqrt{\tfrac{\bar\gamma}{\gamma}}\cdot\frac{k - \frac{v}{c^2}\,\omega}{\sqrt{1 - \frac{v^2}{c^2}}}, \qquad \omega' = \sqrt{\tfrac{\bar\gamma}{\gamma}}\cdot\frac{\omega - v\,k}{\sqrt{1 - \frac{v^2}{c^2}}}, \quad (4.7)$$
and, by application of the principle of relativity (interchanging k, ω and k′, ω′, respectively, and replacing v by −v), the two inverse equations:
$$k = \sqrt{\tfrac{\gamma}{\bar\gamma}}\cdot\frac{k' + \frac{v}{c^2}\,\omega'}{\sqrt{1 - \frac{v^2}{c^2}}}, \qquad \omega = \sqrt{\tfrac{\gamma}{\bar\gamma}}\cdot\frac{\omega' + v\,k'}{\sqrt{1 - \frac{v^2}{c^2}}}. \quad (4.8)$$
These four equations state that an eventually complex-valued frequency ω and some — 3-dimensionally generalized — wave vector $\vec k$ transform like a four-vector under inverse generalized LLFV transformations.

³One could use these considerations even to define inertial frames on the basis of quantum particles: an inertial frame in the absence of gravitation is some reference frame in complex-valued space-time in which the wave function describing a non-interacting quantum particle with sharply defined momentum has the mathematical form of a plane wave.
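The PWPCP can likewise be checked symbolically. The sketch below, our own illustration, assumes the limit $\gamma = \bar\gamma$ (so that the square-root γ-factors drop out) and verifies that the boosted coordinates and wave data leave the phase kz − ωt unchanged:

```python
import sympy as sp

z, t, k, w, v, c = sp.symbols('z t k omega v c')
g = 1 / sp.sqrt(1 - v**2 / c**2)

# boosted coordinates, eqs. (2.18)-(2.19), and boosted wave data, eq. (4.7),
# in the limit gamma(v) = gamma(-v), where the sqrt(gamma-ratio) factors cancel
zp = g * (z - v*t)
tp = g * (t - v*z / c**2)
kp = g * (k - v*w / c**2)
wp = g * (w - v*k)

# plane-wave-phase-covariance postulate, eq. (4.5): the phase is a Lorentz scalar
print(sp.simplify(kp*zp - wp*tp - (k*z - w*t)))   # -> 0
```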
While the wave representation of particles has brought us to the quantum formalism without even involving Planck's quantum of action ℏ = h/(2π) = 1.054571726(47) × 10⁻³⁴ J s [38], it is the following two highly non-trivial and fundamental identities, conjectured by Louis de Broglie [37] (1923) to be applicable even to massive particles, which will bring us back to the seemingly classical quantities momentum p (here for simplicity in one dimension) and energy E, i.e. (with k = 2π/λ):⁴
$$p = \hbar k, \qquad E = \hbar\omega, \quad (4.9)$$
and — when combined with our Eqs. (4.3) — to the following two fundamental identities representing, even for wave functions of interacting quantum particles (which are not of plane-wave form), the correspondence principle of QHJT, i.e.:
$$E = +i\hbar\,\frac{\partial \ln\psi}{\partial t}, \qquad p = -i\hbar\,\frac{\partial \ln\psi}{\partial z}. \quad (4.10)$$
In multiplying Eqs. (4.7) and (4.8) by ℏ, it is now straightforward to obtain via Eqs. (4.9) the seemingly classical generalized Lorentz-Planck (LP) transformations (see also Planck [39] (1906)), relating here some even eventually complex-valued momentum and energy in the inertial frames S and S′:⁵
$$p' = \sqrt{\tfrac{\bar\gamma}{\gamma}}\cdot\frac{p - \frac{v}{c^2}\,E}{\sqrt{1 - \frac{v^2}{c^2}}}, \qquad E' = \sqrt{\tfrac{\bar\gamma}{\gamma}}\cdot\frac{E - v\,p}{\sqrt{1 - \frac{v^2}{c^2}}}, \quad (4.11)$$
$$p = \sqrt{\tfrac{\gamma}{\bar\gamma}}\cdot\frac{p' + \frac{v}{c^2}\,E'}{\sqrt{1 - \frac{v^2}{c^2}}}, \qquad E = \sqrt{\tfrac{\gamma}{\bar\gamma}}\cdot\frac{E' + v\,p'}{\sqrt{1 - \frac{v^2}{c^2}}}. \quad (4.12)$$
Hence, a particle with zero momentum (p′ = 0) in S′ will have, in the frame S of a resting observer, the eventually complex-valued velocity v, and will appear with eventually complex-valued momentum p and energy E given by:⁶
$$p = \sqrt{\tfrac{\gamma}{\bar\gamma}}\cdot\frac{m_0\,v}{\sqrt{1 - \frac{v^2}{c^2}}}, \qquad E = \sqrt{\tfrac{\gamma}{\bar\gamma}}\cdot\frac{m_0\,c^2}{\sqrt{1 - \frac{v^2}{c^2}}}, \quad (4.13)$$
with $m_0 \equiv E'/c^2$ being — for the limit $\gamma = \bar\gamma$ seemingly favoured by experiment — some eventually complex-valued rest mass and Lorentz invariant, as there obviously holds the generalized dispersion relation:
$$E^2 - (pc)^2 = \frac{\gamma}{\bar\gamma}\,(m_0c^2)^2. \quad (4.14)$$

⁴It is of course known that the former identity E = hf (with f = ω/(2π)) had been derived earlier — using an energy-discretisation trick of Boltzmann — by Planck (1900) in the context of the e.m. radiation of a black body, and by Einstein (1905) to determine the energy of his massless photon, while the latter identity had been used in the form p = hf/c for massless photons for the first time by Stark (1909) and later by Einstein (1916, 1918), while Compton (1923) and Debye (1923) had finally confirmed the proportionality of the suggested three-momentum and wave vector of a massless photon by famous experiments.
⁵Without loss of generality, we display the equations here only for one complex-valued momentum dimension.
⁶In the limit $\gamma = \bar\gamma$ we recover the famous relativistic identities $\vec p = m\vec v$ (Planck [39] (1906)) and $E = mc^2$ (Einstein [36] (1905)) with $m = m_0/\sqrt{1 - (v/c)^2}$.

5. Non-Hermitian Klein-Gordon-Fock equation

At this point we would like to recall Eqs. (4.10), expressing the correspondence principle in QHJT and being valid even for interacting quantum particles:
$$E = +i\hbar\,\frac{\partial\psi}{\partial t}\cdot\frac{1}{\psi}, \qquad p = -i\hbar\,\frac{\partial\psi}{\partial z}\cdot\frac{1}{\psi}, \quad (5.1)$$
yielding obviously
$$E^2 = (+i\hbar)^2\Big(\frac{\partial\psi}{\partial t}\Big)^2\cdot\frac{1}{\psi^2}, \qquad p^2 = (-i\hbar)^2\Big(\frac{\partial\psi}{\partial z}\Big)^2\cdot\frac{1}{\psi^2}. \quad (5.2)$$
Simultaneously, there are the following two identities holding for non-interacting quantum particles described by a plane wave $\psi \propto \exp\big(\frac{i}{\hbar}(pz - Et)\big)$ (obtained by combining Eq. (4.4) with Eqs. (4.9)):
$$E^2 = (+i\hbar)^2\,\frac{\partial^2\psi}{\partial t^2}\cdot\frac{1}{\psi}, \qquad p^2 = (-i\hbar)^2\,\frac{\partial^2\psi}{\partial z^2}\cdot\frac{1}{\psi}. \quad (5.3)$$
Like Klein [40], Gordon [41] and Fock [42] in 1926, we can insert Eqs. (5.3) into the dispersion relation Eq. (4.14) to obtain the generalized Klein-Gordon-Fock (KGF) equation describing a non-interacting relativistic quantum particle:
$$\underbrace{(+i\hbar)^2\,\frac{\partial^2\psi}{\partial t^2}\cdot\frac{1}{\psi}}_{E^2} - \underbrace{(-i\hbar c)^2\,\frac{\partial^2\psi}{\partial z^2}\cdot\frac{1}{\psi}}_{(pc)^2} = \frac{\gamma}{\bar\gamma}\,(m_0c^2)^2 \quad (5.4)$$
$$\Longrightarrow\; (+i\hbar)^2\,\frac{\partial^2\psi}{\partial t^2} - (-i\hbar c)^2\,\frac{\partial^2\psi}{\partial z^2} = \frac{\gamma}{\bar\gamma}\,(m_0c^2)^2\,\psi \quad (5.5)$$
$$\Longrightarrow\; (+i\hbar)^2\,\frac{\partial^2\psi}{\partial t^2} = (-i\hbar c)^2\,\frac{\partial^2\psi}{\partial z^2} + \frac{\gamma}{\bar\gamma}\,(m_0c^2)^2\,\psi. \quad (5.6)$$
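That a free plane wave indeed satisfies the generalized KGF equation, with a residual equal to the dispersion relation Eq. (4.14), can be confirmed in a few lines of sympy. The sketch below, our own illustration, again takes the limit $\gamma = \bar\gamma$ for brevity; z and m₀ may be regarded as complex-valued:

```python
import sympy as sp

z, t = sp.symbols('z t')                 # z may be taken complex-valued
hbar, c = sp.symbols('hbar c', positive=True)
E, p, m0 = sp.symbols('E p m_0')         # eventually complex-valued

psi = sp.exp(sp.I/hbar * (p*z - E*t))    # free plane wave, eq. (4.4) with (4.9)

# generalized KGF equation, eq. (5.6), in the limit gamma = gamma-bar:
# (i hbar)^2 d^2 psi/dt^2 = (-i hbar c)^2 d^2 psi/dz^2 + (m0 c^2)^2 psi
lhs = (sp.I*hbar)**2 * sp.diff(psi, t, 2)
rhs = (-sp.I*hbar*c)**2 * sp.diff(psi, z, 2) + (m0*c**2)**2 * psi

# the residual is the dispersion relation eq. (4.14): E^2 - (pc)^2 - (m0 c^2)^2
print(sp.simplify((lhs - rhs) / psi))    # -> E**2 - c**2*p**2 - c**4*m_0**2
```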
As usual, the solution ψ = ψ⁽⁺⁾ + ψ⁽⁻⁾ of the KGF Eq. (5.6) can be decomposed into a sum of a retarded solution ψ⁽⁺⁾ and an advanced solution ψ⁽⁻⁾ solving not only the KGF Eq. (5.6), but also, respectively, the following relativistic retarded or advanced interaction-free Schrödinger [43] (1926) equations:
$$\pm i\hbar\,\frac{\partial\psi^{(\pm)}}{\partial t} = \sqrt{(-i\hbar c)^2\,\frac{\partial^2}{\partial z^2} + \frac{\gamma}{\bar\gamma}\,(m_0c^2)^2}\;\psi^{(\pm)} \approx \Big(\sqrt{\tfrac{\gamma}{\bar\gamma}}\,m_0c^2 - \sqrt{\tfrac{\bar\gamma}{\gamma}}\,\frac{\hbar^2}{2m_0}\,\frac{\partial^2}{\partial z^2} + \ldots\Big)\psi^{(\pm)}. \quad (5.7)$$
In the last line we performed the non-relativistic limit, well known for $\gamma = \bar\gamma$.

6. Non-Hermitian Dirac equation

Each of the four components of the Dirac spinor ψ of a non-interacting Dirac quantum particle should individually respect the KGF Eq. (5.4). Returning to three eventually complex-valued space and momentum dimensions, this condition is formally denoted by the following equivalent identities:
$$0 = \Big(E^2 - (\vec p\,c)^2 - \frac{\gamma}{\bar\gamma}\,(m_0c^2)^2\Big)\psi, \quad (6.1)$$
$$0 = \Big((+i\hbar)^2\,\frac{\partial^2}{\partial t^2} - (-i\hbar c)^2\,\frac{\partial}{\partial\vec z}\cdot\frac{\partial}{\partial\vec z} - \frac{\gamma}{\bar\gamma}\,(m_0c^2)^2\Big)\psi, \quad (6.2)$$
$$0 = \bigg(\Big[\beta\Big(+i\hbar\,\frac{\partial}{\partial t} - (-i\hbar c)\,\vec\alpha\cdot\frac{\partial}{\partial\vec z}\Big)\Big]^2 - \frac{\gamma}{\bar\gamma}\,(m_0c^2)^2\bigg)\psi, \quad (6.3)$$
$$0 = \Big(\beta\Big(+i\hbar\,\frac{\partial}{\partial t} - (-i\hbar c)\,\vec\alpha\cdot\frac{\partial}{\partial\vec z}\Big) - \sqrt{\tfrac{\gamma}{\bar\gamma}}\,m_0c^2\Big)\cdot\Big(\beta\Big(+i\hbar\,\frac{\partial}{\partial t} - (-i\hbar c)\,\vec\alpha\cdot\frac{\partial}{\partial\vec z}\Big) + \sqrt{\tfrac{\gamma}{\bar\gamma}}\,m_0c^2\Big)\psi. \quad (6.4)$$
Throughout the factorization of Eq. (6.2) we made use of the four well-known 4×4 Dirac matrices $\vec\alpha$ and β, which are defined as follows with the help of the Pauli matrices $\vec\sigma$, the 2×2 unit matrix 1₂ and the 2×2 zero matrix 0₂:
$$\vec\alpha \equiv \begin{pmatrix} 0_2 & \vec\sigma \\ \vec\sigma & 0_2 \end{pmatrix}, \qquad \beta \equiv \begin{pmatrix} 1_2 & 0_2 \\ 0_2 & -1_2 \end{pmatrix}. \quad (6.5)$$
By simple inspection of Eq. (6.4) and use of the identity β² = 1₄, it is now straightforward to denote the retarded and advanced Dirac [44] (1928) equations for the retarded component ψ⁽⁺⁾ and the advanced component ψ⁽⁻⁾ of the solution ψ = ψ⁽⁺⁾ + ψ⁽⁻⁾ of the interaction-free KGF Eq. (6.1), i.e.:
$$0 = \Big(\beta\Big(+i\hbar\,\frac{\partial}{\partial t} - (-i\hbar c)\,\vec\alpha\cdot\frac{\partial}{\partial\vec z}\Big) \mp \sqrt{\tfrac{\gamma}{\bar\gamma}}\,m_0c^2\Big)\psi^{(\pm)}, \quad (6.6)$$
$$+i\hbar\,\frac{\partial\psi^{(\pm)}}{\partial t} = \Big(-i\hbar c\,\vec\alpha\cdot\frac{\partial}{\partial\vec z} \pm \sqrt{\tfrac{\gamma}{\bar\gamma}}\,\beta\,m_0c^2\Big)\psi^{(\pm)}, \quad (6.7)$$
$$\pm i\hbar\,\frac{\partial\psi^{(\pm)}}{\partial t} = \Big(\pm(-i\hbar c)\,\vec\alpha\cdot\frac{\partial}{\partial\vec z} + \sqrt{\tfrac{\gamma}{\bar\gamma}}\,\beta\,m_0c^2\Big)\psi^{(\pm)}. \quad (6.8)$$
Once more we stress that these generalized Dirac equations, the generalized Schrödinger Eq. (5.7) and the generalized KGF Eqs. (5.6), (6.2) hold even in complex-valued space-time and for a complex-valued rest mass m₀.
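The factorization step from Eq. (6.3) to Eq. (6.4) rests entirely on the algebra of the matrices (6.5). A short numpy check, with illustrative numerical values of our own choosing and c = 1 units, confirms that $[\beta(E - c\,\vec\alpha\cdot\vec p)]^2$ reduces to $(E^2 - (pc)^2)\,1_4$:

```python
import numpy as np

# Pauli matrices and the 4x4 Dirac matrices of eq. (6.5)
s = [np.array([[0, 1], [1, 0]]), np.array([[0, -1j], [1j, 0]]),
     np.array([[1, 0], [0, -1]])]
alpha = [np.block([[np.zeros((2, 2)), si], [si, np.zeros((2, 2))]]) for si in s]
beta = np.diag([1, 1, -1, -1]).astype(complex)

# arbitrary test values for energy and momentum (c = 1 units)
E, p = 2.3, np.array([0.4, -1.1, 0.7])
M = beta @ (E * np.eye(4) - sum(pi * ai for pi, ai in zip(p, alpha)))

# [beta(E - alpha.p)]^2 = (E^2 - p^2) * identity, which justifies eq. (6.4)
print(np.allclose(M @ M, (E**2 - p @ p) * np.eye(4)))   # -> True
```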
7. Final remarks

The purpose of the considerations presented here has been to extend the concept of covariance to complex-valued space-time. It is remarkable that this can be achieved in some analytical way on the basis of, and in accordance with, the correspondence principle of QHJT. After extending the concept of inertial frames to the complex plane, we have constructed on one hand generalized LLFV and LP transformations relating the four-vectors of complex-valued space-time and momentum-energy between two inertial frames with an eventually complex-valued relative velocity, and on the other hand a complex generalization of Einstein's energy-mass equivalence E = mc². It turned out that the complexification of time is — in the limit $\gamma = \bar\gamma$ — not a severe problem, as a boost will multiply the time at most by a complex constant. Moreover, it has been possible to derive, on the basis of a generalized concept of covariance, generalized KGF, Schrödinger and Dirac differential equations, which can be used to formulate a non-Hermitian QT describing the apparently complex laws of physics. As has already been pointed out earlier (e.g. [2]), it is the advanced Schrödinger (or Dirac) equation which plays the role of Bender's hardly constructable CPT-transformed Schrödinger (or Dirac) equation required to construct some positive semidefinite CPT-inner product [45] for some PT-symmetric QT. The possibility to obtain — via covariance — directly the underlying advanced Schrödinger (or Dirac) equation, as described in the present paper, will make the tedious search for and construction of a unique CPT-inner product in non-Hermitian QT needless.

Acknowledgements

At this place we would like to thank Miloslav Znojil for the strong encouragement to publish the present — already comprehensive — results, despite the fact that — to our understanding — there remain still many questions open to be answered elsewhere in the future. This work was supported by the Fundação para a Ciência e a Tecnologia of the Ministério da Ciência, Tecnologia e Ensino Superior of Portugal, under contract CERN/FP/123576/2011, and by Czech project LC06002.

References

[1] F. Kleefeld, Czech. J. Phys. 56 (2006) 999, arXiv:quant-ph/0606070.
[2] F. Kleefeld, arXiv:hep-th/0408028; arXiv:hep-th/0408097.
[3] F. Kleefeld, arXiv:hep-th/0312027.
[4] F. Kleefeld, eConf C 0306234 (2003) 1367 [Proc. Inst. Math. NAS Ukraine 50 (2004) 1367], arXiv:hep-th/0310204.
[5] F. Kleefeld, Few Body Syst. Suppl. 15 (2003) 201, arXiv:nucl-th/0212008.
[6] F. Kleefeld, AIP Conf. Proc. 660 (2003) 325, arXiv:hep-ph/0211460.
[7] F. Kleefeld, E. van Beveren and G. Rupp, Nucl. Phys. A 694 (2001) 470, arXiv:hep-ph/0101247.
[8] F. Kleefeld, PhD thesis (Univ. of Erlangen-Nürnberg, Germany, 1999).
[9] F. Kleefeld, Proc. XIV ISHEPP 98 (17–22 August 1998, Dubna), eds. A.M. Baldin et al., JINR, Dubna, 2000, part 1, 69–77, arXiv:nucl-th/9811032.
[10] F. Kleefeld, Acta Phys. Polon. B 30 (1999) 981, arXiv:nucl-th/9806060.
[11] G. Wentzel, Z. Phys. 38 (1926) 518.
[12] C. D. Yang, Annals Phys. 319 (2005) 399, 444.
[13] M. V. John, Found. Phys. Lett. 15 (2002) 329, arXiv:quant-ph/0109093; quant-ph/0102087.
[14] A. E. Faraggi and M. Matone, Int. J. Mod. Phys. A 15 (2000) 1869, arXiv:hep-th/9809127.
[15] R. A. Leacock and M. J. Padgett, Phys. Rev. Lett. 50 (1983) 3.
[16] C. M. Bender, K. Olaussen, P. S. Wang, Phys. Rev. D 16 (1977) 1740.
[17] Chapter III in A. S. Dawydow, Quantenmechanik (7. Auflage), VEB Deutscher Verlag d. Wissenschaften 1987, ISBN 3-326-00095-2 (revision of the German translation of the Russian original edition, Moscow 1973).
[18] J. L. Dunham, Phys. Rev. 41 (1932) 713.
[19] A. Voros, arXiv:1202.3100.
[20] C. M. Bender, S. Boettcher, Phys. Rev. Lett. 80, 5243 (1998), arXiv:physics/9712001.
[21] C. M. Bender, Rept. Prog. Phys. 70 (2007) 947, arXiv:hep-th/0703096.
[22] C. M. Bender, D. W. Hook, P. N. Meisinger and Q. H. Wang, Annals Phys. 325 (2010) 2332, arXiv:0912.4659.
[23] See e.g. the PT-symmetry page http://ptsymmetry.net/ and, for PT-symmetry meetings, http://gemma.ujf.cas.cz/~znojil/conf/index.html.
[24] A. Einstein, Annalen Phys. 17 (1905) 891 [Ann. Phys. 14 (2005) 194].
[25] A. Pais, Raffiniert ist der Herrgott, Albert Einstein. Eine wissenschaftliche Biographie, Spektrum, Heidelberg 2000, ISBN 3-8274-0529-7 (German translation of: A. Pais, Subtle is the Lord: The Science and the Life of Albert Einstein, Oxford Univ. Press, New York 1982).
[26] Wikipedia, History of special relativity, http://en.wikipedia.org/wiki/history_of_relativity [August 30, 2012].
[27] N. Nakanishi, Phys. Rev. D 5 (1972) 1968.
[28] N. Nakanishi, Prog. Theor. Phys. Suppl. 51 (1972) 1.
[29] T. D. Lee and G. C. Wick, Phys. Rev. D 2 (1970) 1033.
[30] A. Mostafazadeh, Int. J. Mod. Phys. A 21 (2006) 2553, arXiv:quant-ph/0307059.
[31] C. M. Bender and P. D. Mannheim, Phys. Rev. D 84 (2011) 105038, arXiv:1107.0501.
[32] N. Nakanishi, Phys. Rev. D 3 (1971) 811.
[33] A. M. Gleeson, R. J. Moore, H. Rechenberg and E. C. G. Sudarshan, Phys. Rev. D 4, 2242 (1971).
[34] H.A. Lorentz, Proc. Roy. Netherlands Acad. of Arts and Sci. 1 (1899) 427.
[35] H. Poincaré, C.R. Hebd. Séances Acad. Sci. 140 (1905) 1504.
[36] A. Einstein, Annalen Phys. 18 (1905) 639.
[37] L. V. P. R. de Broglie, thesis, Paris, 1924; Annals Phys. 3 (1925) 22.
[38] P.J. Mohr, B.N. Taylor, D.B. Newell, arXiv:1203.5425.
[39] M. Planck, Verh. Deutsch. Phys. Ges. 4 (1906) 136; see also: Sitzungsber. Preuß. Akad. Wiss. (1907) 542; Annalen Phys. 26 (1908) 1.
[40] O. Klein, Z. Phys. 37 (1926) 895 [Surveys High Energ. Phys. 5 (1986) 241].
[41] W. Gordon, Z. Phys. 40 (1926) 117.
[42] V. Fock, Z. Phys. 38 (1926) 242; Z. Phys. 39 (1926) 226 [Surveys High Energ. Phys. 5 (1986) 245].
[43] E. Schrödinger, Phys. Rev. 28 (1926) 1049 (see also H. Feshbach and F. Villars, Rev. Mod. Phys. 30 (1958) 24).
[44] P. A. M. Dirac, Proc. Roy. Soc. Lond. A 117 (1928) 610; Proc. Roy. Soc. Lond. A 118 (1928) 351.
[45] C. M. Bender, D. C. Brody, H. F. Jones, eConf C 0306234 (2003) 617 [Phys. Rev. Lett. 89 (2002) 270401] [Erratum-ibid. 92 (2004) 119902], arXiv:quant-ph/0208076.

Acta Polytechnica 53(supplement):683–686, 2013
doi:10.14311/ap.2013.53.0683
© Czech Technical University in Prague, 2013
Available online at http://ojs.cvut.cz/ojs/index.php/ap

Modeling jet interactions with the ambient medium

J.H. Beall^{a,b,c,∗}, J. Guillory^{c}, D.V. Rose^{d}, Michael T. Wolff^{b}

a St. John's College, Annapolis, MD
b Space Sciences Division, Naval Research Laboratory, Washington, DC
c College of Sciences, George Mason University, Fairfax, VA
d Voss Scientific, Albuquerque, NM
∗ Corresponding author: beall@sjca.edu

Abstract. Recent high-resolution (see, e.g., [13]) observations of astrophysical jets reveal complex structures apparently caused by ejecta from the central engine as the ejecta interact with the surrounding interstellar material. These observations include time-lapsed "movies" of both AGN and microquasar jets, which also show that the jet phenomena are highly time-dependent. Such observations can be used to inform models of the jet–ambient-medium interactions.
Based on an analysis of these data, we posit that a significant part of the observed phenomena come from the interaction of the ejecta with prior ejecta as well as interstellar material. In this view, astrophysical jets interact with the ambient medium through which they propagate, entraining and accelerating it. We show some elements of the modeling of these jets in this paper, including energy loss and heating via plasma processes, and large-scale hydrodynamic and relativistic hydrodynamic simulations.

Keywords: jets, active galaxies, blazars, intracluster medium, non-linear dynamics, plasma astrophysics.

1. Introduction

Large-scale hydrodynamic simulations of the interaction of astrophysical jets with the ambient medium through which they propagate can be used to illuminate a number of interesting consequences of the jets' presence. These include acceleration and entrainment of the ambient medium, the effects of shock structures on star formation rates, and other effects originating from ram pressure and turbulence generated by the jet (see, e.g., [1, 10, 16, 23]). We present results for large-scale hydrodynamic simulations and initial relativistic hydrodynamic simulations in this paper.

2. Modeling the jet interaction with the ambient medium

Hydrodynamic simulations neglect an important species of physics: the microscopic interactions that occur because of the effects of particles on one another, and of particles with the collective effects that accompany a fully or partially ionized ambient medium (i.e. a plasma). While the physical processes (including plasma processes) in the ambient medium can be modeled in small regions by PIC (particle-in-cell) codes for some parameter ranges, simulations of the larger astrophysical jet structure with such a PIC code are not possible with current or foreseeable computer systems. For this reason, we have modeled these plasma processes in the astrophysical regime by means of a system of coupled differential equations which give the wave populations generated by the interaction of the astrophysical jet with the ambient medium through which it propagates. A detailed discussion of these efforts can be found, variously, in [6, 8, 18, 19, 22]. The system of equations used to solve for the normalized wave energies is very stiff. Solving the system of equations yields a time-dependent set of normalized wave energies (i.e., the ratio of the wave energy divided by the thermal energy of the plasma) that are generated as a result of the jet's interaction with the ambient medium. As we will show, these solutions can yield an energy deposition rate (dE/dt), an energy deposition length (dE/dx), and ultimately a momentum transfer rate (dp/dt) that can be used to estimate the effects of plasma processes on the hydrodynamic evolution of the jet. For this analysis, we posit a relativistic jet of either e±, p–e⁻, or, more generally, charge-neutral hadron–e⁻ composition, with a significantly lower density than the ambient medium. The primary energy loss mechanism for the electron–positron jet is via plasma processes, as Beall [8] notes. Kundt [11, 12] also discusses the propagation of electron–positron jets. The principal plasma waves generated by the jet–ambient-medium interaction can be characterized as follows: the two-stream instability waves, w₁, interact directly with the ambient medium, and are pre-dated (principally) by the oscillating two-stream instability waves, w₂, and the ion-density fluctuations or ion-acoustic waves, w_s.
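To give a feeling for how such a stiff wave-population system is integrated in practice, here is a schematic Python sketch. The three mode energies and the coupling pattern mirror the w₁, w₂, w_s structure described above, but the rate coefficients are invented placeholders, not the published rates of [18, 19]; only the stiff-solver workflow is the point:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Schematic stand-in for the coupled wave-population equations: w1 (resonant
# two-stream mode) pumps w2 (oscillating two-stream) and ws (ion-acoustic),
# both of which are damped. All coefficients are illustrative placeholders.
gamma1, c12, c1s, nu2, nus = 50.0, 8.0, 5.0, 200.0, 150.0

def rates(t, w):
    w1, w2, ws = w
    dw1 = gamma1 * w1 - c12 * w1 * w2 - c1s * w1 * ws   # growth vs. mode coupling
    dw2 = c12 * w1 * w2 - nu2 * w2                      # pumped by w1, damped
    dws = c1s * w1 * ws - nus * ws                      # pumped by w1, damped
    return [dw1, dw2, dws]

# stiff systems like this call for an implicit method such as BDF or Radau
sol = solve_ivp(rates, (0.0, 2.0), [1e-6, 1e-8, 1e-8], method='BDF',
                rtol=1e-8, atol=1e-12, dense_output=True)
print(sol.y[:, -1])   # late-time normalized wave energies
```

Depending on the chosen coefficients, such a system settles into damped oscillatory or persistently oscillatory behaviour, qualitatively like the two solution types mentioned below.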
The waves produced in the plasma by the jet produce regions of high electric field strength and relatively low density, the so-called "cavitons" (after solitons or solitary waves), which propagate like wave packets. These cavitons mix, collapse, and re-form, depositing energy into the ambient medium, transferring momentum to it, and entraining (i.e., dragging along and mixing) the ambient medium within the jet. The typical caviton size when formed is of order 10's of Debye lengths, where a Debye length is $\lambda_D = 7.43 \times 10^2\,\sqrt{T/n_p}$ cm, T is the electron temperature in units of eV, and $n_p$ is the number density of the ambient medium in units of cm⁻³.

2.1. Energy loss, energy deposition rate, and momentum transfer

In order to determine the energy deposition rate, the momentum transfer rate, and heating, we model the plasma interaction as a system of very stiff, coupled differential equations (see, e.g., [7]), which simulate the principal elements of the plasma processes that draw energy out of the jet. As a test of the fidelity of this method, we "benchmark" (see [15]) the wave-population code by using the PIC code in regions of the parameter space where running the PIC code simulation is practicable. We then use the wave-population code for regions of more direct astrophysical interest. A more detailed discussion of the comparisons between the PIC-code simulations and the wave-population model can be found in [20, 21]. The coupling of these instability mechanisms is expressed in the model through a set of rate equations. These equations are discussed in some detail by Rose et al. [18, 19] and Beall [8]. Beall et al. [7] illustrate two possible solutions for the system of coupled differential equations that model the jet–ambient-medium interaction: a damped oscillatory and an oscillatory solution. The Landau damping rate for the two-temperature thermal distribution of the ambient medium is used for these solutions. As noted in the figure caption, transitions toward chaotic solutions have been observed for very large growth rates of the two-stream instability. In order to benchmark the wave-population code, we use that code to calculate the propagation length of an electron–positron jet as described above. Specifically, we model the interaction of the relativistic jet with the ambient medium through which it propagates by means of a set of coupled differential equations which describe the growth, saturation, and decay of the three wave modes likely to be produced by the jet–medium interaction. First, the two-stream instability produces a plasma wave, w₁, called the resonant wave, which grows initially at a rate $\gamma_1 \le \big(\sqrt{3}/2\gamma\big)\,(n_b/2n_p)^{1/3}\,\omega_p$, where γ is the Lorentz factor of the beam, $n_b$ and $n_p$ are the beam and cloud number densities, respectively, in units of cm⁻³, and $\omega_p$ is the plasma frequency, as described more fully in [18]. The average energy deposition rate, $\langle d(\alpha\epsilon_1)/dt\rangle$, of the jet energy into the ambient medium via plasma processes can be calculated as $\langle d(\alpha\epsilon_1)/dt\rangle = n_p k T\,\langle W\rangle\,(\gamma_1/\omega_p)\,\omega_p$ erg cm⁻³ s⁻¹, where k is Boltzmann's constant, T is the plasma temperature, ⟨W⟩ is the average (or equilibrium) normalized wave energy density obtained from the wave-population code, γ₁ is the initial growth rate of the two-stream instability, and α is a factor that corrects for the simultaneous transfer of resonant wave energy into non-resonant and ion-acoustic waves.
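Plugging representative numbers into the formulas just quoted gives a sense of scale. In the sketch below all input values are illustrative assumptions of our own, and the cgs electron plasma frequency estimate $\omega_p \approx 5.64 \times 10^4 \sqrt{n_p}$ rad/s is a standard formula not quoted in the text:

```python
import numpy as np

# Illustrative inputs only (the paper quotes the formulas, not these values)
T_eV  = 10.0       # electron temperature [eV]
n_p   = 1.0        # ambient number density [cm^-3]
n_b   = 1e-4       # beam number density [cm^-3]
gam   = 10.0       # beam Lorentz factor
W_avg = 1e-3       # assumed equilibrium normalized wave energy <W>

# Debye length: lambda_D = 7.43e2 * sqrt(T/n_p) cm, with T in eV
lam_D = 7.43e2 * np.sqrt(T_eV / n_p)

# electron plasma frequency (standard cgs estimate, not from the text)
omega_p = 5.64e4 * np.sqrt(n_p)                              # [rad/s]

# two-stream growth rate bound: gamma_1 <= (sqrt(3)/2 gamma)(n_b/2n_p)^(1/3) omega_p
gamma_1 = np.sqrt(3.0) / (2.0 * gam) * (n_b / (2.0 * n_p))**(1.0/3.0) * omega_p

# average energy deposition rate: n_p kT <W> (gamma_1/omega_p) omega_p
kT_erg = T_eV * 1.602e-12                                    # [erg]
dEdt = n_p * kT_erg * W_avg * (gamma_1 / omega_p) * omega_p  # [erg cm^-3 s^-1]

print(f"lambda_D = {lam_D:.3e} cm")
print(f"gamma_1  = {gamma_1:.3e} 1/s")
print(f"dE/dt    = {dEdt:.3e} erg cm^-3 s^-1")
```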
The energy loss scale length, $dE_{\rm plasma}/dx = -(1/n_b v_b)\,(d\alpha\epsilon_1/dt)$, can be obtained by determining the change in γ by a factor of 2 with the integration $\int d\gamma = -\int [d(\alpha\epsilon_1)/dt]\,dl/(v_b n_b m' c^2)$, as shown in [8, 17], where m′ is the mass of the beam particle. Thus, $l_p = \big(\tfrac{1}{2}\gamma c\,n_b m c^2\big)/\big(d\alpha\epsilon_1/dt\big)$ cm is the characteristic propagation length for collisionless losses for an electron or electron–positron jet, where $d\alpha\epsilon_1/dt$ is the normalized energy deposition rate (in units of thermal energy) from the plasma waves into the ambient plasma. In many astrophysical cases, this is the dominant energy loss mechanism. We can therefore model the energy deposition rate (dE/dt) and the energy loss per unit length (dE/dx), and ultimately the momentum loss per unit length (dp/dx), due to plasma processes. Beall, Guillory, and Rose (2009) have compared the results of a PIC code simulation of an electron–positron jet propagating through an ambient medium of an electron–proton plasma with the solutions obtained by the wave-population model code, and have found good agreement between the two results (see Fig. 1 from that paper). At the same time, that paper demonstrates that the ambient medium is heated and entrained into the jet. That analysis also shows that a relativistic, low-density jet can interpenetrate an ambient gas or plasma. Initially, and for a significant fraction of its propagation length, the principal energy loss mechanisms for such a jet interacting with the ambient medium are via plasma processes [8, 18]. As part of our research into the micro-physics of the interaction of jets with an ambient medium, we continue to investigate the transfer of momentum from the jet. Understanding how such a transfer is accomplished is essential to understanding the manner in which the ambient medium (for example, from interstellar clouds) is accelerated and eventually entrained into the large-scale astrophysical jet. In order to proceed to a more detailed analysis of the issue of momentum transfer, we have used modern PIC code simulations to study the dynamics of caviton formation, and have confirmed the work of Robinson and Newman (1990) in terms of the cavitons' formation, evolution, and collapse. We are in the process of developing a multi-scale code which uses the energy deposition rates and momentum transfer rates from the PIC and wave-population models as source terms for the highly parallelized hydrodynamic code currently running on the NRL SGI Altix machine.

Figure 1. Simulation using a highly parallelized version of the VH-1 hydrodynamics code. The figure shows the x–z cross-section (Figure 1a) and the y–z cross-section (Figure 1b) for a fully 3-dimensional hydrodynamic simulation of a jet with v = 1.5 × 10⁹ cm s⁻¹. The simulation length is approximately 64 kpc on the long axis.

Figure 2. 2D simulation of an axially symmetric, relativistic hydrodynamic jet with v/c = 0.995, showing jet density (top) and axial velocity along the jet axis and radially (bottom), using the PLUTO code (Mignone et al. [14]).

Plasma effects can have observational consequences. Beall [8] has noted that plasma processes can slow the jets rapidly, and Beall and Bednarek [4] have shown that these effects can truncate the low-energy portion of the γ-ray spectrum (see their Fig. 3). In the interests of brevity, we do not go into the (reasonable) assumptions for these calculations; please see [4] for a detailed discussion.
A similar effect will occur for neutrinos, and could also reduce the expected neutrino flux from AGN. The presence of plasma processes in jets can also greatly enhance line radiation species by generating high-energy tails on the Maxwell–Boltzmann distribution of the ambient medium, thus abrogating the assumption of thermal equilibrium. An analytical calculation of the boost in energy of the electrons in the ambient medium needed to produce such a high-energy tail, with $E_{\rm het} \sim 30{-}100\,kT$, is confirmed by PIC-code simulations. Aside from altering the Landau damping rate, such a high-energy tail can greatly enhance line radiation over that expected for a thermal-equilibrium calculation (see [5, 7] for a detailed discussion). If the beam is significantly heated by the jet–cloud interaction, the beam will expand transversely as it propagates, and will therefore have a finite opening angle. These "warm beams" result in different growth rates for the plasma instabilities, and therefore produce somewhat different propagation lengths (see, e.g., [7, 9, 20]). A "cold beam" is assumed to have little spread in momentum. The likely scenario is that the beam starts out as a cold beam and evolves into a warm beam as it propagates through the ambient medium. This scenario is clearly illustrated by the PIC simulations we have used to benchmark the wave-population codes appropriate for the astrophysical parameter range (see, e.g., [5, 21]).

3. Hydrodynamic simulations

As an example of the large-scale hydrodynamic simulations we are conducting, we show a jet simulation in the x–z cross-section (Figure 1a) and the y–z cross-section for a 3-dimensional hydrodynamic simulation of a jet with v = 1.5 × 10⁹ cm s⁻¹. The simulation length is approximately 64 kpc on the long axis. The simulation shows the detailed structure of the shocks generated by the jet, as well as the Kelvin–Helmholtz instabilities of the jet itself, which generate additional shock structures. These shocks produce Jeans-length structures which will ultimately collapse to form stars, which in turn may feed the central engine in the AGN. In Fig. 2, we show the density and axial velocity plotted radially and along the jet axis for a 2D simulation of a relativistic jet using the PLUTO code [14]. We plan to develop a multi-scale code with the plasma processes as momentum and energy source terms for these codes.

Acknowledgements

HB and MTW gratefully acknowledge the support of the Office of Naval Research for this project. Thanks also to colleagues at various institutions for their continued interest and collaboration, including Kinwah Wu, Curtis Saxton, S. Schindler and W. Kapferer, and S. Colafrancesco.

References

[1] Basson, J.F., and Alexander, P., 2002, MNRAS, 339, 353
[2] Beall, J.H., et al., 1978, ApJ, 219, 836
[3] Beall, J.H., and Rose, W.K., 1981, ApJ, 238, 539
[4] Beall, J.H., and Bednarek, W., 1999, ApJ, 510, 188
[5] Beall, J.H., Guillory, J., & Rose, D.V., 1999, Journal of the Italian Astronomical Society, 70, 1235
[6] Beall, J.H., Guillory, J., & Rose, D.V., 2003, C.J.A.A., Vol. 3, Suppl., 137
[7] Beall, J.H., Guillory, J., Rose, D.V., Schindler, S., Colafrancesco, S., 2006, C.J.A.A., Supplement 1, 6, 283
[8] Beall, J.H., 1990, "Energy Loss Mechanisms for Fast Particles," in Physical Processes in Hot Cosmic Plasmas (Kluwer Academic Publishers: Dordrecht/Boston/London; W. Brinkmann, A.C. Fabian, and F. Giovannelli, eds.)
[9] Kaplan, S.A., & Tsytovich, V.N., 1973, Plasma Astrophysics (Pergamon Press: Oxford, New York)
[10] Krause, M., & Camenzind, M., 2003, New Astron. Rev. 47, 573–576, proceedings of the conference in Bologna, "The Physics of Relativistic Jets in the Chandra and XMM Era," Sep. 2002
[11] Kundt, W., 1987, in Astrophysical Jets and Their Engines: Erice Lectures, ed. W. Kundt (Reidel: Dordrecht, Netherlands)
[12] Kundt, W., 1999, in Multifrequency Behaviour of High Energy Cosmic Sources, F. Giovannelli & L. Sabau-Graziati (eds.), Mem. S.A.It. 70, 1097
[13] Lister et al., 2009, AJ, 137, 3718
[14] Mignone et al., 2007, ApJS, 170, 228
[15] Oreskes, N., Shrader-Frechette, K., and Belitz, K., 1996, Science, 263, 641
[16] Perucho et al., 2012, MmSAI, 83, 297
[17] Rose, D.V., 1997, PhD dissertation, George Mason University, Fairfax, VA, USA
[18] Rose, W.K., Guillory, J., Beall, J.H., & Kainer, S., 1984, ApJ, 280, 550
[19] Rose, W.K., Beall, J.H., Guillory, J., & Kainer, S., 1987, ApJ, 314, 95
[20] Rose, D.V., Guillory, J., & Beall, J.H., 2002, Physics of Plasmas, 9, 1000
[21] Rose, D.V., Guillory, J., & Beall, J.H., 2005, Physics of Plasmas, 12, 014501
[22] Scott, J.H., Holman, G.D., Ionson, and Papadopoulos, K., 1980, ApJ, 239, 769
[23] Zanni, C., Murante, G., Bodo, G., Massaglia, S., Rossi, P., Ferrari, A., 2005, Astronomy and Astrophysics, 429

Discussion

Maria Diaz Trigo — Two questions: 1. Given that winds are slower than jets, do you expect that the interaction with the ambient medium, and therefore the energy deposition, is more efficient for winds? 2. At which temperatures is the ambient medium heated after the interaction with the jet?

Jim Beall — As to the first question, the winds being slower than the jet, I think they would tend to be less energetic, and therefore would tend to have a less dramatic effect on the ambient medium than do the jets. Regarding the temperatures the ambient medium can reach, PIC code simulations as well as analytical estimates indicate that the plasma processes can heat a 10⁴ K ambient medium to 10⁵ K using the plasma processes alone.

Maurice van Putten — As you know, knotted structures are observationally quite generic. In your simulations in hydro, they do not form. In my simulations with relativistic MHD (RMHD) in ApJ, 467, L57 (1996), they form by dynamically significant magnetic field strengths. Adding "test" or "tracer" magnetic fields to hydro simulations does not seem to be a valid approximation.

Jim Beall — I agree that we need to move to MHD and RMHD simulations, and it is our plan of research to do this. But if you look closely at the structure of the jets in our simulations, you can see knotted structures along the jet's longitudinal axis that are similar to those observed in the sky.

Acta Polytechnica 54(2):149–155, 2014
doi:10.14311/ap.2014.54.0149
© Czech Technical University in Prague, 2014
Available online at http://ojs.cvut.cz/ojs/index.php/ap

Laplace-Runge-Lenz vector in quantum mechanics in noncommutative space

Peter Prešnajder∗, Veronika Gáliková, Samuel Kováčik

Comenius University Bratislava, Faculty of Mathematics, Physics and Informatics
∗ Corresponding author: presnajder.peter@gmail.com

Abstract.
The object under scrutiny is the dynamical symmetry connected with the conservation of the Laplace-Runge-Lenz (LRL) vector in the hydrogen atom problem, solved by means of noncommutative quantum mechanics (NCQM). The considered noncommutative configuration space has such a "fuzzy" structure that the rotational invariance is not spoilt. An analogy with the LRL vector in NCQM is brought in to present our results, together with a comparison against the standard QM predictions.

Keywords: noncommutative space, Coulomb-Kepler problem, symmetry.

1. Introduction

Our main goal is to investigate the existence of a dynamical symmetry of the Coulomb-Kepler problem in quantum mechanics in noncommutative space, and possibly to find the generalization of the so-called Laplace-Runge-Lenz (LRL) vector for this case. Before actually starting, we briefly look at the history of the LRL vector. We call this the Laplace-Runge-Lenz vector, as it is commonly named nowadays, but as far as we know, the first ones to make a mention of it were Jakob Hermann and Johann Bernoulli in the letters they exchanged in 1710; see [1], [2]. So the name "Hermann-Bernoulli vector" would be more proper. Much later, in 1799, the vector was rediscovered by Laplace in his celestial mechanics [3]. Then it appeared in a popular German textbook on vectors by C. Runge [4], which was referenced by W. Lenz in his paper on the (old) quantum mechanical treatment of the Kepler problem or hydrogen atom [5].

Now back to physics. The Coulomb-Kepler problem is all about the motion of a particle in a field of a central force proportional to r⁻². The corresponding Newton equation for a body of mass m reads
$$m\,\dot{\vec v} = -q\,\frac{\vec r}{r^3}. \quad (1)$$
Here q denotes a constant which specifies the magnitude of the force applied. A system with a central force definitely is rotationally symmetric, and the orbital momentum,
$$\vec L = m\,\vec r \times \vec v, \quad (2)$$
is conserved in any central field. However, as to the symmetries, more has to be said in this case, due to the fact that not only is the force central, but in addition it has the inverse-square dependence on the distance. Besides the components of $\vec L$, the Coulomb-Kepler problem has three additional integrals of motion, namely those represented by the conserved LRL vector,
$$\vec A = \vec L \times \vec v + q\,\frac{\vec r}{r}. \quad (3)$$
When the motion of a planet around the Sun is considered, the conservation of the given quantity has to do with the constant eccentricity of the orbit and the position of the perihelion. Another well-known system characterized by a Coulomb potential is the hydrogen atom. Obviously a need for the use of quantum mechanics arises here. There are, however, several ways to address the issue. In 1926 Wolfgang Pauli published his paper on the subject [6]. He used the LRL vector to find the spectrum of the hydrogen atom using modern quantum mechanics and the hidden dynamical symmetry of the problem, without knowledge of the explicit solution of the Schrödinger equation. It turned out that the LRL vector can be found among the Hermitian operators acting in the Hilbert space considered, in almost complete analogy with the classical case, the only subtlety to deal with being the fact that the cross product needs to be properly symmetrized, resulting in
$$A_k^{QM} = \frac{1}{2}\,\varepsilon_{ijk}\,(L_i v_j + v_j L_i) + q\,\frac{x_k}{r}, \quad (4)$$
where $v_j = -\frac{i\hbar}{m}\,\partial_j$ stands for the velocity operator. Importantly, the operators $L_i$ and $A_i$ commute with the Hamiltonian, i.e. are conserved with respect to time evolution; and as to their mutual commutation relations, they would form a closed algebra so(4) were it not for the commutator $[A_i, A_j] \propto L_k H$. However, restricting ourselves to the subspace $\mathcal{H}_E$ spanned by eigenvectors of H corresponding to the eigenvalue E, H can be replaced by its eigenvalue, which is a c-number. Besides enabling the algebra to close, we have also dragged the energy into the very definition of the algebra generators. This, together with the theory related to the relevant Casimir operators, has a direct impact on the H-atom energy spectrum. In this way Pauli found the correct formulas for the hydrogen atom spectrum even before Schrödinger.
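Before moving to the noncommutative setting, the classical conservation law can be illustrated numerically: integrating Eq. (1) for a bound orbit and evaluating Eq. (3) along it shows a constant LRL vector up to integration error. This sketch uses arbitrary units and initial data of our own choosing, purely for illustration:

```python
import numpy as np
from scipy.integrate import solve_ivp

m, q = 1.0, 1.0    # mass and force constant of eq. (1), arbitrary units

def kepler(t, y):
    r, v = y[:3], y[3:]
    return np.concatenate([v, -q * r / (m * np.linalg.norm(r)**3)])

# a bound, eccentric orbit (v < circular speed)
y0 = np.array([1.0, 0.0, 0.0, 0.0, 0.8, 0.0])
sol = solve_ivp(kepler, (0.0, 50.0), y0, rtol=1e-10, atol=1e-12,
                dense_output=True)

def lrl(y):
    r, v = y[:3], y[3:]
    L = m * np.cross(r, v)                              # eq. (2)
    return np.cross(L, v) + q * r / np.linalg.norm(r)   # eq. (3)

# A stays fixed along the orbit (here [0.36, 0, 0] up to integration error)
print(lrl(sol.sol(0.0)), lrl(sol.sol(40.0)))
```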
are conserved with respect to time evolution, and as to their mutual commutation relations, there would be pretty good views of them forming a closed algebra so(4), if it were not for the commutator [ai,aj] ∝ lkh. however, restricting ourselves to the subspace he spanned by eigenvectors of h corresponding to the eigenvalue e, h can be replaced by its eigenvalue, which is a c-number. besides enabling the algebra to close, we have also dragged the energy into the very definition of the algebra generators. this, together with the theory related to the relevant casimir operators, has a direct impact on the h-atom energy spectrum. in this way pauli found the correct formulas for the hydrogen atom spectrum even before schrödinger. now the search for the analogy in noncommutative quantum mechanics (ncqm) begins. the rest of this paper is organized in the following way: in section 2 149 http://dx.doi.org/10.14311/ap.2014.54.0149 http://ojs.cvut.cz/ojs/index.php/ap p. presnajder, v. galikova, s. kovacik acta polytechnica we give a brief introduction to the quantum mechanics in a spherically symmetric noncommutative space. the hilbert space of wave functions in nc space and the nc generalizations of important operators (hamiltonian, angular momentum, coordinate and velocity) are introduced. when we use these, a generalization of the dynamical symmetry of the coulomb-kepler problem in ncqm is presented in section 3. section 4 provides conclusions. we skip all the detailed and lengthy calculations that can be found in our recently published paper [7]. 2. basics of noncommutative quantum mechanics to begin, it should be made clear how to introduce some uncertainty principle into the configuration space for the coulomb problem without spoiling the key feature which allows us to find the exact solution the rotational invariance. the uncertainty is expressed as nontrivial commutation relations for the nc analogs of the former cartesian coordinates (obviously we have to abandon c-numbers), defined in a spherically symmetrical way. the coordinates in the nc configuration space r3λ are realized in terms of 2 pairs of boson annihilation and creation operators aα, a+α , α = 1, 2, satisfying [aα,a+β ] = δαβ, [aα,aβ] = [a + α ,a + β ] = 0. (5) they act in an auxiliary fock space f spanned by the normalized vectors |n1,n2〉 = (a+1 ) n1 (a+2 ) n2 √ n1! n2! |0〉. (6) the normalized vacuum state is denoted as |0〉 ≡ |0, 0〉. the noncommutative coordinates xj, j = 1, 2, 3, and the nc analog of the euclidean distance from the origin are given as xj = λa+σja ≡ λσ j αβa + αaβ, r = λ(n + 1), (7) where σj are the pauli matrices, n = a+αaα is the number operator in the fock space, and λ is a length parameter. its magnitude is not fixed within our model. naturally it has a connection with the smallest distance relevant in the given noncommutative configuration space denoted as r3λ. the key rotationally invariant relations in the theory are [xi,xj] = 2iλεijkxk, [xi,r] = 0, r2−x2j = λ 2. (8) the first equation defines a noncommutative or fuzzy sphere that appeared a long time ago in various contexts [8], e.g., quantization on a sphere = nonflat phase space, a simple model of nc manifolds. all these models considered a single fuzzy sphere. here we deal with an infinite sequence of fuzzy spheres dynamically via an nc analog of the (radial) schrödinger equation that is introduced below. we remark that while the first equation in (8) is postulated, the other two follow from the construction of r3λ. 
While constructing NC quantum mechanics, we first have to decide on a Hilbert space $\mathcal{H}_\lambda$ of states (see also [9]). The suitable choice is a linear space of normally-ordered analytic functions containing the same number of creation and annihilation operators:
$$\psi = \sum c_{m_1 m_2 n_1 n_2}\,(a^+_1)^{m_1}(a^+_2)^{m_2}\,a_1^{n_1} a_2^{n_2}, \quad (9)$$
which possess a finite weighted Hilbert-Schmidt norm
$$\|\psi\|^2 = 4\pi\lambda^2\,\mathrm{Tr}\big[\,r\,\psi^\dagger\psi\,\big]. \quad (10)$$
(The summation in (9) is over nonnegative integers satisfying $m_1 + m_2 = n_1 + n_2$.) Our NC wave functions ψ are themselves operators on the Fock space mentioned above, so the relations which occur are to be looked at as operator equalities (we could put $|n_1, n_2\rangle$ on both sides of every such equation). We can move on to the definition of the operators acting on NC wave functions $\psi \in \mathcal{H}_\lambda$. To avoid potential confusion, we have decided to leave the NC coordinates and the NC wave functions ψ (operators on the Fock space) as they are, and to denote the operators acting on ψ with a hat from now on. The generators of rotations in $\mathcal{H}_\lambda$, the orbital momentum operators, are defined as
$$\hat L_j \psi = \frac{1}{2\lambda}\,(x_j \psi - \psi x_j), \qquad j = 1, 2, 3. \quad (11)$$
They are Hermitian and obey the usual commutation relations
$$[\hat L_i, \hat L_j]\psi \equiv (\hat L_i \hat L_j - \hat L_j \hat L_i)\psi = i\,\varepsilon_{ijk}\,\hat L_k\,\psi. \quad (12)$$
The standard eigenfunctions $\psi_{jm}$, j = 0, 1, 2, …, m = −j, …, +j, satisfying
$$\hat L_i^2\,\psi_{jm} = j(j+1)\,\psi_{jm}, \qquad \hat L_3\,\psi_{jm} = m\,\psi_{jm}, \quad (13)$$
are given by
$$\psi_{jm} = \sum_{(jm)} \frac{(a^+_1)^{m_1}(a^+_2)^{m_2}}{m_1!\,m_2!}\,R_j(r)\,\frac{a_1^{n_1}(-a_2)^{n_2}}{n_1!\,n_2!}. \quad (14)$$
The summation goes over all nonnegative integers that satisfy $m_1 + m_2 = n_1 + n_2 = j$, $m_1 - m_2 - n_1 + n_2 = 2m$. For any fixed $R_j(r)$, equation (14) defines a representation space for a unitary irreducible representation with spin j. The NC analog of the usual Laplace operator is
$$\hat\Delta_\lambda \psi = -\frac{1}{\lambda r}\,[a^+_\alpha, [a_\alpha, \psi]] = -\frac{1}{\lambda^2(N+1)}\,[a^+_\alpha, [a_\alpha, \psi]]. \quad (15)$$
As to the operator $\hat U$, the NC analog of the central potential, it is defined simply as multiplication of the NC wave function by U(r):
$$(\hat U\psi)(r) = U(r)\,\psi = \psi\,U(r). \quad (16)$$
Since any term of $\psi \in \mathcal{H}_\lambda$ consists of the same number of creation and annihilation operators (any commutator of such a term with r is zero), there is no difference between left and right multiplication by U(r). So finally, here is the definition of our NC Hamiltonian:
$$\hat H\psi = \frac{\hbar^2}{2m\lambda r}\,[a^+_\alpha, [a_\alpha, \psi]] - \frac{q}{r}\,\psi. \quad (17)$$
The coordinate operator $\hat X_j$ acts on ψ symmetrically, as
$$\hat X_j \psi = \frac{1}{2}\,(x_j \psi + \psi x_j). \quad (18)$$
As to the velocity operator, clearly it should be in some relation with the evolution of the coordinate operator. The NC analog of the time derivative is proportional to the commutator of the quantity considered with H; so the components of the velocity operator are given by
$$\hat V_j \psi = -i\,[\hat X_j, \hat H]\,\psi. \quad (19)$$
Both sets of NC observables, $\hat V_j$ and $\hat X_j$, have been introduced in [10]. As we will see below, they are well adapted to the construction of the NC analog of the LRL vector. Based on what has been briefly summarized above, the NC analog of the Schrödinger equation with the Coulomb potential in $\mathbb{R}^3_\lambda$ can be postulated:
$$\frac{\hbar^2}{2m\lambda r}\,[a^+_\alpha, [a_\alpha, \psi]] - \frac{q}{r}\,\psi = E\,\psi. \quad (20)$$
To avoid overloading the formulas, we usually set m = 1, ℏ = 1 below.
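A similar matrix experiment illustrates that the left-right action (11) indeed generates rotations, Eq. (12). The construction and the truncation safeguard are again our own illustrative choices:

```python
import numpy as np

n_max, lam = 6, 1.0
a = np.diag(np.sqrt(np.arange(1, n_max + 1)), k=1)
I = np.eye(n_max + 1)
A = [np.kron(a, I), np.kron(I, a)]
sigma = [np.array([[0, 1], [1, 0]]), np.array([[0, -1j], [1j, 0]]),
         np.array([[1, 0], [0, -1]])]
x = [lam * sum(sigma[j][al, be] * (A[al].conj().T @ A[be])
               for al in range(2) for be in range(2)) for j in range(3)]

def Lhat(j, psi):
    # orbital momentum acting on an NC wave function psi, eq. (11)
    return (x[j] @ psi - psi @ x[j]) / (2 * lam)

# random psi supported safely below the Fock cutoff, where x_j act exactly
P = np.diag([1.0 if i + j <= n_max - 1 else 0.0
             for i in range(n_max + 1) for j in range(n_max + 1)])
rng = np.random.default_rng(0)
R = rng.standard_normal(P.shape) + 1j * rng.standard_normal(P.shape)
psi = P @ R @ P

# [L1, L2] psi = i L3 psi, eq. (12)
lhs = Lhat(0, Lhat(1, psi)) - Lhat(1, Lhat(0, psi))
print(np.allclose(lhs, 1j * Lhat(2, psi)))   # -> True
```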
our task is to find sensible analogs of the three components ai of the lrl vector in such a way that all the requirements regarding commutation relations are met (the commutator with the hamiltonian has to be zero because of the conservation law and relations among all components of ~a and ~l are supposed to correspond to the relevant symmetry). recall the subtlety which had to be taken into account when the standard qm version of the lrl vector had been built based on the classical model. the cross product of velocity and angular momentum needed symmetrization due to their nonvanishing commutator. the nc operators that we are going to use when constructing the analog of the cross product part, i.e. v̂i, l̂i, do not commute either, so some adjustment of this sort will also have to be made. however, there is also another, “potential” part of the lrl vector, which is proportional to ~r/r. the corresponding nc analogs of xi and ψ do not commute either, so we resolve the ordering in a similar way as in the cross product case — we take x̂k instead of xk: âk = 1 2 εijk(l̂iv̂j + v̂jl̂i) + q x̂k r . (21) it has turned out that besides coping with the ordering dilemma, nothing more needs to be done that is, except for doing the calculations to show that our definition of âk has been a good choice. so now we are going to take the nc analogs of the hamiltonian, velocity, angular momentum and position operators, and build the nc lrl vector according to (21). then we have to move to the next task evaluate the commutator [âi,ĥ], examine the commutation relations between âk and l̂k, searching for the signs of a higher dynamical symmetry. as soon as the symmetry group is recognized, we can construct the corresponding casimir operators. all these crucial operators: the hamiltonian, velocity, angular momentum and position operators, have been defined already in terms of creation and annihilation operators a+α , aα; knowing the commutation relations for these, one can calculate all that is required. however, after writing it all down and trying to make heads and tails of it, we soon realize that the problem is not assigned in the most friendly way. this definitely seems to be a case in which introducing some auxiliary quantities may help. there are certain combinations of a+α , aα that occur often in our expressions, and separating them the right way makes the calculations more manageable. 3.1. auxiliary operators we have to examine how the operators considered here act on the wave function ψ. they are expressed in terms of â+α , âα. in general it makes a difference whether the creation and annihilation operators act from the right or the left and the following notation seems to be useful to keep track of it: âαψ = aαψ, b̂αψ = ψaα, (22) â+α ψ = a + α ψ, b̂ + α ψ = ψa + α . (23) an advantage of this notation is the fact that now we do not have to drag ψ into the formulas just to make clear which side the operators act from. the relevant commutation relations are (see (5)) [âα, â+β ] = δαβ, [b̂α, b̂ + β ] = −δαβ. (24) the other commutators are zero. this, when kept in mind, spares a lot of paper during the calculations. 151 p. presnajder, v. galikova, s. kovacik acta polytechnica as it was already mentioned, we use the position operator in the form x̂iψ = 1 2 (xiψ + ψxi) = λ 2 σiαβ(â + α âβ + b̂βb̂ + α )ψ, r̂ψ = 1 2 (rψ + ψr) = λ 2 ((â+α âα + 1) + (b̂αb̂ + α + 1))ψ. 
(25) the following sequences of operators appear often and their role is important enough to admit that they deserve some notation on their own: ŵk = σkαβ(â + α âβ − â + α b̂β − âβb̂ + α + b̂ + α b̂β), ŵ = â+α âα − â + α b̂α − b̂ + α âα + b̂ + α b̂α, ŵ ′k = ŵk − 2λex̂k, ŵ ′ = ŵ − 2λer̂, (26) where e is energy and λ is the nc parameter already mentioned. note that the only difference between ŵ ′k and ŵk is the constant multiplying one of their terms. ŵ ′ and ŵ are related similarly. 3.2. nc operators revisited now we rewrite the hamiltonian, the velocity operator and the nc lrl vector in terms of the new auxiliary operators which have been introduced. ĥ = 1 2λr̂ (â+α âα + b̂ + α b̂α − â + α b̂α − âαb̂ + α ) − q r̂ = 1 2λr̂ ŵ − q r̂ , (27) v̂i = −i[x̂i,ĥ] = i 2r̂ σiαβ(â + α b̂β − âβb̂ + α ) (28) âk = 1 2 εijk(l̂iv̂j + v̂jl̂i) + q x̂k r̂ = 1 2r̂λ (r̂ŵ ′k − x̂k(ŵ ′ − 2λq)). (29) deriving equations (28) and (29) involves somewhat laborious calculations. the details can be found in [7]. this gives us an opportunity to write the nc schrödinger equation in the following way: ( 1 2λr̂ ŵ− q r̂ −e ) ψe = 1 2λr̂ (ŵ ′−2λq)ψe = 0. (30) ψe belongs to heλ , i.e. to the subspace spanned by the eigenvectors of the hamiltonian. the important quantity for us is âk|he λ , the lrl vector as it acts on the solutions of the schrödinger equation: âk|he λ = 1 2r̂λ (r̂ŵ ′k−x̂k (ŵ ′ − 2λq)︸ ︷︷ ︸ see eq. (30) ) = 1 2λ ŵ ′k. (31) when dealing with calculations related to the conservation of âk, we just need to ascertain whether the following commutator with the hamiltonian vanishes: ˙̂ w ′k = i [ ĥ0 − q r̂ ,ŵ ′k ] = i [ 1 2r̂λ ŵ ′ − q r̂ ,ŵ ′k ] = i 2r̂λ [ ŵ ′,ŵ ′k ] + i [1 r̂ ,ŵ ′k ](ŵ ′ 2λ −q ) = 0. (32) the second term in the second-to-last line vanishes when acting on vectors from heλ (and we are not so interested in the rest of hλ). the first term proportional to [ŵ ′,ŵ ′k] does not contribute either. the calculations proving this involve more steps and can be found in [7]. the equation above encourages one to search for the underlying so(4) symmetry, since the lrl vector conservation makes its components suitable candidates for half of its generators, the remaining three consisting of the components of the angular momentum. once again we have to ask the reader either to check [7] for details or simply to believe that the following holds: [âi, âj] = iεijk(−2e + λ2e2)l̂k (33) there is nothing but a constant in the way, as long as we let the operator [âi, âj] act upon the vectors from heλ with the energy fixed. eq. (33) and [l̂i, l̂j] = iεijkl̂k, [l̂i, âj] = iεijkâk (34) define lie algebra relations corresponding to a particular symmetry group, the actual form of which depends on the sign of the e-dependent factor in (33). the relevant relations for l̂i have already been mentioned, the formula for the mixed commutator [l̂i, âj] follows from the fact that âj, j = 1, 2, 3, are components of a vector. there are three independent cases: • so(4) symmetry: −2e + λ2e2 > 0 ⇐⇒ e < 0 or e > 2/λ2; • so(3, 1) symmetry: −2e + λ2e2 < 0 ⇐⇒ 0 < e < 2/λ2; • e(3) euclidean group: −2e + λ2e2 = 0 ⇐⇒ e = 0 or e = 2/λ2. the admissible values of e should correspond to the unitary representations of the symmetry in question. this requirement guarantees that the generators l̂j and âj are realized as hermitian operators, and consequently correspond to physical observables. the casimir operators in the mentioned cases are ĉ′1 = l̂jâj, ĉ′2 = âiâi + (−2e + λ 2e2)(l̂il̂i + 1). (35) 152 vol. 54 no. 
2/2014 lrl vector in quantum mechanics in noncommutative space the prime indicates that we are not using the standard normalization of casimir operators. now, we need to calculate their values in heλ . the first casimir vanishes in all cases due to the fact that ĉ′1ψe ∼ rψe − ψer = 0. the second casimir operator is somewhat more demanding. according to (31) we have ĉ′2ψe = (ŵ ′iŵ ′i 4λ2 + (−2e + λ2e2)(l̂il̂i + 1) ) ψe = 1 4λ2 ŵ ′ŵ ′ψe, (36) where we used the quadratic identity ŵ ′iŵ ′ i + 4λ 2(−2e + λ2e2)(l̂il̂i + 1) = ŵ ′2. (37) according to the schrödinger equation, (ŵ ′)2ψe = 4λ2q2ψe, and we are left with ĉ′2ψe = q 2ψe. (38) since both casimir operators take constant values ĉ′1 = 0 and ĉ′2 = q2 in heλ , we are dealing with irreducible representations of the dynamical symmetry group g that are unitary for particular values of energy. in all the cases considered, g = so(4), so(3, 1),e(3), the unitary irreducible representations are well known. the corresponding systems of eigenfunctions that span the representation space have been found in [9]. here we do not repeat their construction, but we restrict ourselves to brief comments pointing out some interesting aspects. 3.3. bound states – the case of so(4) symmetry −2e + λ2e2 > 0 in this case we rescale the lrl vector as k̂j = âj√ −2e + λ2e2 = ŵ ′j 2λ √ −2e + λ2e2 . (39) after this step eqs. (33), (34) turn into the following relations: [l̂i, l̂j] = iεijkl̂k, [l̂i,k̂j] = iεijkk̂k, [k̂i,k̂j] = iεijkl̂k. (40) thus we have got the representation of the so(4) algebra. the relevant normalized casimir operators are ĉ1 = l̂jk̂j, ĉ2 = k̂ik̂i + l̂il̂i + 1. (41) as we have stated already, the ĉ1 acting on an eigenfunction of the hamiltonian returns zero. as to ĉ2, we know that for so(4), under the condition that the first casimir is zero, the second casimir has to be equal to n2 for some integer n = j + 1,j + 2, . . . (with j(j + 1) corresponding to the square of the angular momentum). at the same time, according to (38) it is related to the energy: k̂ik̂i + l̂il̂i + 1 = q2 λ2e2 − 2e = n2. (42) now solving the quadratic equation for energy we obtain two discrete sets of solutions depending on n: e = 1 λ2 ∓ 1 λ2 √ 1 + κ2n, κn = qλ n . (43) the first set of eigenfunctions of the hamiltonian in (17) for energies e < 0 (i.e. negative sign in front of the square root in (43)) has been found for coulomb attractive potential, i.e. q > 0 in (17): eiλn = 1 λ2 − 1 λ2 √ 1 + κ2n. (44) these eigenvalues possess a smooth standard limit for λ → 0 eiλn = 1 λ2 − 1 λ2 ( 1 + 1 2 κ2n − 1 24 κ4n + · · · ) →− q2 2n2 = − q2m 2n2~2 . (45) this spectrum coincides (in the commutative limit λ → 0) with the spectrum for coulomb attractive potential, q > 0, that was found by pauli using algebraic methods prior to solving schrödinger equation for the hydrogen atom. the full set of eigenfunctions of (17) for energies e < 0 was constructed in [9] by explicitly solving the nc schrödinger equation. the radial nc wave functions defined in (14) are given in terms of the hypergeometric function riλn = (ωn) nf(−n,−n, 2j + 2,−2κnω−1n ), ωn = κn − √ 1 + κ2n + 1 κn + √ 1 + κ2n − 1 , (46) where n = a+αaα controls the radial nc variable. the second set of very unexpected solutions corresponds to energies (43) with positive sign eiiλn = 1 λ2 + 1 λ2 √ 1 + κ2n > 2 λ2 . (47) the corresponding radial nc wave functions have been found in [9] solving nc schrödinger equation for a coulomb repulsive potential, q < 0 in (17). 
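the commutative limit (45) is easy to verify with a computer algebra system. a minimal sympy check (our own sketch, not from the paper; m = ħ = 1 as in the text) expands the bound-state energies (44) around λ = 0 and recovers the standard hydrogen spectrum; the λ² term comes out with coefficient q⁴/(8n⁴), which suggests the 1/24 printed in the expansion above is a typographical slip for 1/8.

```python
import sympy as sp

lam, q, n = sp.symbols('lambda q n', positive=True)
kappa = q * lam / n
E = (1 - sp.sqrt(1 + kappa**2)) / lam**2   # the bound-state energies (44)

# expanding around lambda = 0 recovers the standard spectrum -q^2/(2 n^2)
print(sp.series(E, lam, 0, 3))
# -> -q**2/(2*n**2) + q**4*lambda**2/(8*n**4) + O(lambda**3)
```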
the radial nc wave functions of this second set are closely related to those given above:
$$r^{II}_{\lambda n} = (-\omega_n)^n\,F(-n,-n,2j+2;\,2\kappa_n\omega_n^{-1}).\qquad(48)$$
both so(4) representations — the representation for the coulomb attractive potential with $e^{I}_{\lambda n}<0$ and that for ultra-high energies $e^{II}_{\lambda n}>2/\lambda^2$ for the coulomb repulsive potential — are unitarily equivalent, as in both representations the casimir operators take the same values, $\hat c_1=0$ and $\hat c_2=n^2$. however, physically they are quite distinct: in the commutative limit $\lambda\to 0$ the first set of bound states persists and reduces to the standard one, while the extraordinary bound states at ultra-high energies disappear from the hilbert space.

3.4. coulomb scattering – the case $2e-\lambda^2e^2>0$

in this case we rescale the lrl vector as
$$\hat k_j = \frac{\hat a_j}{\sqrt{2e-\lambda^2e^2}} = \frac{\hat w'_j}{2\lambda\sqrt{2e-\lambda^2e^2}}.\qquad(49)$$
after this step we obtain the relations
$$[\hat l_i,\hat l_j]=i\varepsilon_{ijk}\hat l_k,\quad [\hat l_i,\hat k_j]=i\varepsilon_{ijk}\hat k_k,\quad [\hat k_i,\hat k_j]=-i\varepsilon_{ijk}\hat l_k,\qquad(50)$$
so this time we have obtained a representation of the so(3,1) algebra. the relevant normalized casimir operators are
$$\hat c_1=\hat l_j\hat k_j,\qquad \hat c_2=\hat k_i\hat k_i-\hat l_i\hat l_i.\qquad(51)$$
in our case $\hat c_1=0$, so we are dealing with so(3,1) unitary representations labeled by the value of the second casimir operator $\hat c_2$. rewriting (38) in terms of $\hat k_j$, we obtain a relation between the energy $e$ and the eigenvalue $\tau$ of $\hat c_2$:
$$\hat k_i\hat k_i-\hat l_i\hat l_i = 1+\frac{q^2}{2e-\lambda^2e^2}=\tau>1.\qquad(52)$$
thus we are dealing with the spherical principal series of so(3,1) unitary representations, see e.g. [11]. the scattering nc wave functions have been constructed in [9] for any admissible energy $e\in(0,2/\lambda^2)$, and from their asymptotic behavior the partial-wave s-matrix has been derived:
$$s^\lambda_j(e)=\frac{\Gamma\bigl(j+1-\frac{iq}{p}\bigr)}{\Gamma\bigl(j+1+\frac{iq}{p}\bigr)},\qquad p=\sqrt{2e-\lambda^2e^2}.\qquad(53)$$
it can easily be seen that this s-matrix possesses poles at the energies $e=e^{I}_{\lambda n}$ for the coulomb attractive potential and at $e=e^{II}_{\lambda n}$ for the coulomb repulsive potential, where $e^{I}_{\lambda n}$ and $e^{II}_{\lambda n}$ coincide with (44) and (47) given above. as at the energies
$$e_{\lambda\mp}=\frac{1}{\lambda^2}\Bigl(1\mp\sqrt{1-\frac{\lambda^2q^2}{\tau-1}}\Bigr)\qquad(54)$$
the casimir operator values coincide, the corresponding representations are unitarily equivalent. this relates the scattering at low energies $0<e<1/\lambda^2$ to that at high energies $1/\lambda^2<e<2/\lambda^2$. we skip the limiting cases of scattering at the edges $e=0$ and $e=2/\lambda^2$ of the admissible interval of energies, where the so(3,1) group contracts to the euclidean group $e(3)=so(3)\ltimes t(3)$ of isometries of 3d space. the corresponding nc hamiltonian eigenstates are given in [9].
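the statement about the poles of (53) can be verified with a few lines of computer algebra (our own check, not from the paper; m = ħ = 1). a pole of the numerator Γ(j + 1 − iq/p) requires j + 1 − iq/p to be a non-positive integer, i.e. p² = 2e − λ²e² = −q²/n² for some integer n > j; solving this for e reproduces both discrete sets (43):

```python
import sympy as sp

lam, q, n = sp.symbols('lambda q n', positive=True)
E = sp.symbols('E')

# gamma-function pole condition of the partial-wave s-matrix: p^2 = -q^2/n^2
sols = sp.solve(sp.Eq(2*E - lam**2 * E**2, -q**2 / n**2), E)
print(sols)
# two roots, equal to (1 -+ sqrt(1 + lambda**2*q**2/n**2))/lambda**2,
# i.e. the bound-state energies e_I of (44) and e_II of (47)
```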
4. conclusions

this paper deals with the coulomb-kepler problem in noncommutative space. we have found the nc analog of the lrl vector; its components, together with those of the nc angular momentum operator, supply the algebra of generators of a symmetry group. it is interesting that the formula for the nc version of the lrl vector looks very much like the one from standard qm when written in terms of the proper nc observables: nc angular momentum, nc velocity, symmetrized nc coordinate and nc radial distance. it is quite remarkable that the so(4) symmetry has appeared twice: firstly, not so surprisingly, when addressing the problem of bound states at negative energies in the case of the attractive coulomb potential. these have an analog in standard quantum mechanics, and our result for the negative-energy bound states indeed coincides with the well-known qm prediction when the commutative limit λ → 0 is applied.

the second appearance of the so(4) symmetry is probably less expected: we have found a set of bound states at positive energies above a certain ultra-high value in the case of a repulsive potential. however, there is again no discrepancy between qm and ncqm, since these unexpected ultra-high-energy bound states disappear from the hilbert space in the above-mentioned commutative limit. when examining the scattering (relevant for the interval of energies between zero and the mentioned critical ultra-high value), so(3,1) is the symmetry to be considered. the scattering is usually characterized by the s-matrix; in the nc version of the problem this object has exactly those poles in the complex energy plane which correspond to the bound states (of both kinds) mentioned above. this goes, however, beyond the scope of this paper.

to summarize, there are basically two ways of examining the hydrogen energy spectrum: by solving a differential equation in schrödinger's fashion, or by looking for an underlying symmetry and using an algebraic approach à la pauli. both possibilities have been tried in ncqm (the aim of this paper has been mainly to outline the latter option); for details see [7], [9]. we are glad to find that both approaches (which agree in standard qm) lead to the same outcomes in ncqm as well.

acknowledgements

the author v. g. is indebted to comenius university for support received from grant no. uk/545/2013.

references

[1] j. hermann, giornale de letterati d'italia 2 (1710) 447; j. hermann, extrait d'une lettre de m. herman à m. bernoulli datée de padoüe le 12. juillet 1710, histoire de l'academie royale des sciences (paris) 1732: 519.
[2] j. bernoulli, extrait de la réponse de m. bernoulli à m. herman datée de basle le 7. octobre 1710, histoire de l'academie royale des sciences (paris) 1732: 521.
[3] p. s. laplace, traité de mécanique céleste, tome i (1799), premiere partie, livre ii, pp. 165ff.
[4] c. runge, vektoranalysis, volume i, leipzig: hirzel (1919).
[5] w. lenz, über den bewegungsverlauf und quantenzustände der gestörten keplerbewegung, zeitschrift für physik 24 (1924) 197.
[6] w. pauli, über das wasserstoffspektrum vom standpunkt der neuen quantenmechanik, zeitschrift für physik 36 (1926) 336.
[7] v. gáliková, s. kováčik, p. prešnajder, laplace-runge-lenz vector in quantum mechanics in noncommutative space, j. math. phys. 54, 122106 (2013). doi: 10.1063/1.4835615
[8] f. a. berezin, commun. math. phys. 40 (1975) 153, doi: 10.1007/bf01609397; j. hoppe, elem. part. res. j. 80 (1989) 145; j. madore, j. math. phys. 32 (1991) 332, doi: 10.1063/1.529418; h. grosse, p. prešnajder, lett. math. phys. 28 (1993) 239, doi: 10.1007/bf00745155; h. grosse, c. klimčík, p. prešnajder, int. j. theor. phys. 35 (1996) 231, doi: 10.1007/bf02083810.
[9] v. gáliková, p. prešnajder, coulomb problem in non-commutative quantum mechanics, j. math. phys. 54, 052102 (2013). doi: 10.1063/1.4803457
[10] s. kováčik, p. prešnajder, the velocity operator in quantum mechanics in noncommutative space, j. math. phys. 54, 102103 (2013). doi: 10.1063/1.4826355
[11] a. o. barut, r. rączka, theory of group representations and applications, polish scientific publishers, warsaw, 1977.
acta polytechnica 53(5):399–404, 2013. doi: 10.14311/ap.2013.53.0399 © czech technical university in prague, 2013. available online at http://ojs.cvut.cz/ojs/index.php/ap

extremal vectors for verma type representation of b2

čestmír burdík (a,∗), ondřej navrátil (b)

a department of mathematics, czech technical university in prague, faculty of nuclear sciences and physical engineering, trojanova 13, 120 00 prague 2, czech republic
b department of mathematics, czech technical university in prague, faculty of transportation sciences, na florenci 25, 110 00 prague, czech republic
∗ corresponding author: burdices@kmlinux.fjfi.cvut.cz

abstract. starting from the verma modules of the algebra b2, we explicitly construct factor representations of the algebra b2 which are connected with the unitary representation of the group so(3,2). we find a full set of extremal vectors for representations of this kind, so we can explicitly resolve the problem of the irreducibility of these representations.

keywords: verma modules, highest-weight representation, reducibility, extremal vectors.

submitted: 16 july 2013. accepted: 28 july 2013.

1. introduction

representations of lie algebras are important in many physical models; it is therefore useful to study various methods for constructing them. the general method of construction of the highest-weight representation for a semisimple lie algebra was developed in [1, 2]. the irreducibility of such representations (now called verma modules) was studied by gelfand in [3]. the theory of these representations is included in dixmier's book [4]. in the 1970s prof. havlíček and his coworkers dealt with the construction of realizations of the classical lie algebras, see [5].

our aim in this paper is to show how realizations of a lie algebra can be used to construct the so-called extremal vectors of the verma modules. to work with a specific lie algebra, we choose the lie algebra so(3,2), which plays an important role in physics, e.g. in ads/cft theory, see [6, 7]. in the construction of the verma modules for b2, the representations depend on the parameters (λ1, λ2). for the connection with irreducible unitary representations of so(3,2) we take λ2 ∈ n0, and in section 3 we explicitly construct the factor-verma representation. further, we construct a full set of extremal vectors; such vectors are called subsingular vectors in [8]. in this paper we use an almost elementary partial differential equation approach to determine the extremal vectors in any factor-verma module of b2. it should be noted that our approach differs from a similar one used in [9]. first, we identify the factor-verma module with a space of polynomials, and the action of b2 on the verma module is identified with differential operators on the polynomials. any extremal vector in the factor-verma module then becomes a polynomial solution of a system of variable-coefficient second-order linear partial differential equations.
2. the root system for lie algebra b2

in the lie algebra g = b2 we take a basis composed of the elements h1, h2, ek and fk, k = 1, . . . , 4, which fulfill the commutation relations

[h1, e1] = 2e1, [h1, e2] = −e2, [h1, e3] = e3, [h1, e4] = 0,
[h2, e1] = −2e1, [h2, e2] = 2e2, [h2, e3] = 0, [h2, e4] = 2e4,
[h1, f1] = −2f1, [h1, f2] = f2, [h1, f3] = −f3, [h1, f4] = 0,
[h2, f1] = 2f1, [h2, f2] = −2f2, [h2, f3] = 0, [h2, f4] = −2f4,
[e1, e2] = e3, [e1, e3] = 0, [e1, e4] = 0, [e2, e3] = 2e4, [e2, e4] = 0, [e3, e4] = 0,
[f1, f2] = −f3, [f1, f3] = 0, [f1, f4] = 0, [f2, f3] = −2f4, [f2, f4] = 0, [f3, f4] = 0,
[e1, f1] = h1, [e1, f2] = 0, [e1, f3] = −f2, [e1, f4] = 0,
[e2, f1] = 0, [e2, f2] = h2, [e2, f3] = 2f1, [e2, f4] = −f3,
[e3, f1] = −e2, [e3, f2] = 2e1, [e3, f3] = 2h1 + h2, [e3, f4] = f2,
[e4, f1] = 0, [e4, f2] = −e3, [e4, f3] = e2, [e4, f4] = h1 + h2.

we can take as h the cartan subalgebra with basis h1 and h2. we denote λ = (λ1, λ2) ∈ h∗, for which we have λ(h1) = λ1, λ(h2) = λ2. the root system of g = b2 with respect to the basis h1, h2 is r = { ±αk ; k = 1, 2, 3, 4 }, where α1 = (2, −2), α2 = (−1, 2), α3 = α1 + α2 = (1, 0), α4 = α1 + 2α2 = (0, 2). if we choose the positive roots r+ = { α1, α2, α3, α4 }, the basis of the root system r is b = { α1, α2 }. if we define h3 = 2h1 + h2 and h4 = h1 + h2, the relations [hk, ek] = 2ek, [hk, fk] = −2fk, [ek, fk] = hk hold for every k = 1, . . . , 4.

3. the extremal vectors for verma type representation

we denote by n+ and n− the lie subalgebras generated by the elements ek and fk, respectively, k = 1, . . . , 4, and set b+ = h + n+. for λ = (λ1, λ2) ∈ h∗ let us consider the one-dimensional representation τλ of the lie algebra b+ such that for any h ∈ h and e ∈ n+, τλ(h + e)|0〉 = λ(h)|0〉. the element |0〉 will be called the lowest-weight vector. let further

w(λ) = u(g) ⊗_{u(b+)} c|0〉,

where the b+-module c|0〉 is defined by τλ. it is clear that w(λ) ≅ u(n−)|0〉, and it is the u(g)-module for the left regular representation; it will be called the verma module. (in dixmier's book the verma module m(λ) is defined with respect to τ_{λ−δ}, where δ = (1/2) Σ_{k=1}^{4} αk = (1, 1); so we have w(λ) = m(λ + δ).) it is a well-known fact that every u(g)-submodule of the module w(λ) is isomorphic to a module w(µ), where µ = λ − n1α1 − n2α2 for n1, n2 ∈ n0 = {0, 1, 2, . . .}. the lowest-weight vector |0〉µ of the representation w(µ) ⊂ w(λ) fulfills h|0〉µ = µ(h)|0〉µ for h ∈ h, and e|0〉µ = 0 for e ∈ n+. such vectors |0〉µ will be called extremal vectors of w(λ). from the well-known result for verma modules we know that the verma module w(λ) is irreducible iff λ1 ∉ n0, λ2 ∉ n0, λ1 + λ2 + 1 ∉ n0 and 2λ1 + λ2 + 2 ∉ n0. if λ1 ∈ n0, resp. λ2 ∈ n0, then the extremal vectors are f1^{λ1+1}|0〉 = |0〉µ1, resp. f2^{λ2+1}|0〉 = |0〉µ2, where

µ1 = λ − (λ1 + 1)α1 = (−λ1 − 2, 2λ1 + λ2 + 2),
µ2 = λ − (λ2 + 1)α2 = (λ1 + λ2 + 1, −λ2 − 2). (1)

if w(µ) is a submodule of w(λ), we define the u(g)-factor-module w(λ|µ) = w(λ)/w(µ), and we can study the reducibility of such a representation. again, an extremal vector is any nonzero vector v ∈ w(λ|µ) for which there exists ν ∈ h∗ such that

hk v = νk v, ek v = 0, k = 1, 2. (2)

it is then clear that ek v = 0 for k = 1, 2, 3, 4. in this paper we find all such extremal vectors in the space w(λ|µ2), where λ2 ∈ n0 and µ2 is given by (1).
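the commutation table of section 2 is conveniently checked by machine. the following sketch (our own illustration, not part of the paper) encodes the nonzero brackets as structure constants, extends them by antisymmetry and bilinearity, and verifies the jacobi identity [x,[y,z]] + [y,[z,x]] + [z,[x,y]] = 0 over all basis triples:

```python
from itertools import product

B = ['h1', 'h2', 'e1', 'e2', 'e3', 'e4', 'f1', 'f2', 'f3', 'f4']

# nonzero brackets [x, y] from the table, as {basis element: coefficient};
# everything not listed (or implied by antisymmetry) vanishes
table = {
    ('h1','e1'): {'e1': 2}, ('h1','e2'): {'e2': -1}, ('h1','e3'): {'e3': 1},
    ('h2','e1'): {'e1': -2}, ('h2','e2'): {'e2': 2}, ('h2','e4'): {'e4': 2},
    ('h1','f1'): {'f1': -2}, ('h1','f2'): {'f2': 1}, ('h1','f3'): {'f3': -1},
    ('h2','f1'): {'f1': 2}, ('h2','f2'): {'f2': -2}, ('h2','f4'): {'f4': -2},
    ('e1','e2'): {'e3': 1}, ('e2','e3'): {'e4': 2},
    ('f1','f2'): {'f3': -1}, ('f2','f3'): {'f4': -2},
    ('e1','f1'): {'h1': 1}, ('e1','f3'): {'f2': -1},
    ('e2','f2'): {'h2': 1}, ('e2','f3'): {'f1': 2}, ('e2','f4'): {'f3': -1},
    ('e3','f1'): {'e2': -1}, ('e3','f2'): {'e1': 2},
    ('e3','f3'): {'h1': 2, 'h2': 1}, ('e3','f4'): {'f2': 1},
    ('e4','f2'): {'e3': -1}, ('e4','f3'): {'e2': 1},
    ('e4','f4'): {'h1': 1, 'h2': 1},
}

def bracket(x, y):
    """[x, y] for basis elements, extended by antisymmetry."""
    if (x, y) in table:
        return table[(x, y)]
    if (y, x) in table:
        return {k: -c for k, c in table[(y, x)].items()}
    return {}

def bracket_lin(x, comb):
    """[x, sum_k c_k b_k] as a coefficient dictionary."""
    out = {}
    for k, c in comb.items():
        for m, d in bracket(x, k).items():
            out[m] = out.get(m, 0) + c * d
    return out

bad = 0
for x, y, z in product(B, repeat=3):
    acc = {}
    for term in (bracket_lin(x, bracket(y, z)),
                 bracket_lin(y, bracket(z, x)),
                 bracket_lin(z, bracket(x, y))):
        for m, c in term.items():
            acc[m] = acc.get(m, 0) + c
    bad += any(c != 0 for c in acc.values())
print('jacobi violations:', bad)   # 0
```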
4. differential equations for extremal vectors

let λ2 ∈ n0 and µ2 be given by equation (1). it is easy to see that a basis of the space w(λ|µ2) is given by the vectors

|n〉 = |n1, n3, n4, n2〉 = (λ2 − n2)! f1^{n1} f3^{n3} f4^{n4} f2^{n2} |0〉,

where n1, n3, n4 ∈ n0 and n2 = 0, 1, . . . , λ2. (if λ2 ∉ z, we can use a similar construction with the basis |n〉 = Γ(λ2 − n2 + 1) f1^{n1} f3^{n3} f4^{n4} f2^{n2} |0〉, where n1, n2, n3, n4 ∈ n0.) now by direct calculation we obtain

h1|n〉 = (λ1 − 2n1 + n2 − n3)|n〉,
h2|n〉 = (λ2 + 2n1 − 2n2 − 2n4)|n〉,
e1|n〉 = n1(λ1 − n1 + n2 − n3 + 1)|n1 − 1, n3, n4, n2〉 − (λ2 − n2)n3|n1, n3 − 1, n4, n2 + 1〉 + n3(n3 − 1)|n1, n3 − 2, n4 + 1, n2〉,
e2|n〉 = n2|n1, n3, n4, n2 − 1〉 + 2n3|n1 + 1, n3 − 1, n4, n2〉 − n4|n1, n3 + 1, n4 − 1, n2〉. (3)

it is possible to rewrite this action in terms of second-order differential operators (see [10, 11]) acting on polynomials in the variables z1, z2, z3 and z4 of degree at most λ2 in the variable z2. if we put

|n1, n3, n4, n2〉 = (λ2 − n2)! f1^{n1} f3^{n3} f4^{n4} f2^{n2} |0〉 ↔ z1^{n1} z2^{n2} z3^{n3} z4^{n4},

we obtain from equations (3) the following action on a polynomial f = f(z1, z2, z3, z4):

h1 f = λ1 f − 2z1 f1 + z2 f2 − z3 f3,
h2 f = λ2 f + 2z1 f1 − 2z2 f2 − 2z4 f4,
e1 f = λ1 f1 − z1 f11 + z2 f12 − z3 f13 − λ2 z2 f3 + z2² f23 + z4 f33,
e2 f = f2 + 2z1 f3 − z3 f4, (4)

where fk = ∂f/∂zk and fkl = ∂²f/∂zk∂zl. the conditions (2) for extremal vectors now read

λ1 f − 2z1 f1 + z2 f2 − z3 f3 = ν1 f,
λ2 f + 2z1 f1 − 2z2 f2 − 2z4 f4 = ν2 f,
λ1 f1 − z1 f11 + z2 f12 − z3 f13 − λ2 z2 f3 + z2² f23 + z4 f33 = 0,
f2 + 2z1 f3 − z3 f4 = 0, (5)

where ν1 and ν2 are complex numbers. the condition on the degree of the polynomial f(z1, z2, z3, z4) in the variable z2 can be rewritten as

∂^{λ2+1} f / ∂z2^{λ2+1} = 0.

5.
×zn2+n41 z n2 2 z n−n2−2n4 3 z n4 4 , where dn2,n4 =   ∑m−n4 n=[ 12 (n2+1)] 2n+n2+2n4 (2`2−1)!!(2`2−2n−1)!! × m!(2n−n2)! (m−n−n4)!, λ2 even,∑m−n4 n=[ 12 n2] 2n+n2+2n4 (2`2−1)!!(2`2−2n−1)!! × m!(2n−n2+1)! (m−n−n4)!, λ2 odd, and the extremal vectors are v = m∑ n4=0 n−2n4∑ n2=0 (−1)n2 (λ2 −n2)! n2! n4! dn2,n4 × fn2+n41 f n−n2−2n4 3 f n4 4 f n2 2 |0〉. 6. appendix: polynomial solutions of differential equations to obtain extremal vectors we need to find the polynomial solutions f(z1,z2,z3,z4) = ∑ n1,n2,n3,n4≥0 cn1,n2,n3,n4z n1 1 z n2 2 z n3 3 z n4 4 of the system of equations (5), which are of less degree than (λ2 + 1) in the variable z2. to simplify the solution of the first equations, we put f(z1,z2,z3,z4) = z−ρ21 (4z1z4 + z 2 3 ) ρ2+ρ1/2g(t,x1,x2,x3), where ρ1 = λ1 −ν1, ρ2 = 12 (λ2 −ν2), x2 = z1, x3 = z2 and t = (2z1z2 −z3)2 4z1z4 + z23 , x1 = 2z1z2 −z3 z3 , 401 č. burdík, o. navrátil acta polytechnica or z1 = x2, z2 = x3 and z3 = 2x2x3 1 + x1 , z4 = x2x 2 3(x21 − t) t(1 + x1)2 . the first order equations are equivalent to the conditions gx1 = gx2 = gx3 = 0, and so g(t,x1,x2,x3) = g(t). the equations of the second order give the system of three equations (2λ1ρ1 + 2λ1ρ2 + λ2ρ1 + 2λ2ρ2 −ρ21 − 2ρ1ρ2 − 2ρ 2 2 + 3ρ1 + 4ρ2)g = 0, (2λ1 + λ2 −ρ1 − 2ρ2 + 3)(1 − t)g′ + ρ2(λ1 −ρ1 −ρ2 + 1)g = 0, 4t(1 − t)g′′ + 2 ( 1 + (2λ1 + 2λ2 + 1)t ) g′ + (2λ1ρ1 −ρ21 + 3ρ1 + 2ρ2)g = 0. (6) as we want to obtain polynomial solutions f(z1,z2,z3,z4), which are in variable z2 of less or equal degree λ2 ∈ n0, there must be solution g(t) of the system (6), which is the polynomial in √ t of less or equal degree λ2. if we exclude derivatives of g from the second and the third equations,we find that nonzero solutions can exist only in the following six cases: (1.) ρ1 = 0, ρ2 = 0; (2.) ρ1 = 2λ1 + 2, ρ2 = −λ1 − 1; (3.) ρ1 = 0, ρ2 = λ1 + λ2 + 2; (4.) ρ1 = 2λ1 + 2, ρ2 = λ2 + 1; (5.) ρ1 = 2λ1 + λ2 + 3, ρ2 = 0; (6.) ρ1 = −λ2 − 1, ρ2 = λ1 + λ2 + 2. case 1 (ρ1 = ρ2 = 0). a function that corresponds to the extremal vector is f(z1,z2,z3,z4) = g(t), where g(t) is the solution of the system (2λ1 + λ2 + 3)(1 − t)g′ = 0, 2t(1 − t)g′′ + ( 1 + (2λ1 + 2λ2 + 1)t ) g′ = 0. (7) for each λ1 and λ2 this system has the solution g(t) = 1 which corresponds to the extremal vector f(z1,z2,z3,z4) = 1. but for 2λ1 + λ2 + 3 = 0 we obtained for g(t) the equation 2t(1 − t)g′′ + ( 1 + (λ2 − 2)t ) g′ = 0, which also has a non-constant solution g(t) = g( √ t), where g(x) = ∫ ( 1−x2 )(λ2−1)/2dx. however this solution does not give a polynomial function f(z1,z2,z3,z4) for any λ2. case 2 (ρ1 = 2λ1 + 2, ρ2 = −λ1 −1). the function that corresponds to the extremal vector is in this case f(z1,z2,z3,z4) = zλ1+11 g(t), where g(t) is the solution of system (7). as in event 1 we find that the non-constant polynomial solutions f(z1,z2,z3,z4) = zλ1+11 get only λ1 ∈ n0. case 3 (ρ1 = 0, ρ2 = λ1 + λ2 + 2.). the function for the extremal vectors is f(z1,z2,z3,z4) = ( 4z1z4 + z23 z1 )λ1+λ2+2 g(t), where g(t) is the solution of the system (λ2 + 1) ( (1 − t)g′ + (λ1 + λ2 + 2)g ) = 0, 2t(1 − t)g′′ + ( 1 + (2λ1 + 2λ2 + 1)t ) g′ +(λ1 + λ2 + 2)g = 0. (8) as we assume that λ2 ∈ n0,for each λ1,λ2 this system has the solution g(t) = (1 − t)λ1+λ2+2. this solution corresponds to the function f(z1,z2,z3,z4) = ( z4 + z2z3 −z1z22 ) λ1+λ2+2, (9) which is a non-constant polynomial for λ1 + λ2 + 1 ∈ n0. this function is a polynomial in the variable z2 of degree 2λ1 + 2λ2 + 2. it gives sought solutions for 2λ1 + λ2 + 4 ≤ 0. 
thus, function (9) provides a permissible solution for the λ2 ∈ n0 only if λ1 ∈ z, −λ2 − 1 ≤ λ1 ≤ −12λ2 − 2, from which follows λ2 ≥ 2. case 4 (ρ1 = 2λ1 + 2, ρ2 = λ2 + 1.). in this case, the function that can match the extremal vector is f(z1,z2,z3,z4) = z−λ2−11 (4z1z4 + z 2 3 ) λ1+λ2+2g(t), where g(t) is the solution of system (8). so f(z1,z2,z3,z4) = zλ1+11 ( z4 + z2z3 −z1z22 ) λ1+λ2+2. to give a polynomial solution, which we have found, to this function there must be λ1 ∈ n0 and λ1 + λ2 + 1 ∈ n0. but in this case, the degree of polynomial f in the variable z2 is greater than λ2 and, therefore, is not a permissible solution. 402 vol. 53 no. 5/2013 extremal vectors for verma type representation of b2 case 5 (ρ1 = 2λ1 + λ2 + 3, ρ2 = 0.). the function corresponding to the possible extremal vectors is f(z1,z2,z3,z4) = (4z1z4 + z23 ) λ1+ 12 λ2+ 3 2 g(t), where function g(t) meets the equation 4t(1 − t)g′′ + 2 ( 1 + (2λ1 + 2λ2 + 1)t ) g′ −λ2(2λ1 + λ2 + 3)g = 0. (10) this equation has two linearly independent solutions g1(t) = f ( −12λ2,−λ1 − 1 2λ2 − 3 2 ; 1 2 ; t ) , g2(t) = √ tf (1 2 − 1 2λ2,−λ1 − 1 2λ2 − 1; 3 2 ; t ) , where f(α,β; γ; t) is the hypergeometric function f(α,β; γ; t) = ∞∑ n=0 (α)n(β)n n! (γ)n tn, where (α)n = γ(α + n) γ(α) = α(α + 1) . . . (α + n− 1). these solutions correspond to the functions f1 = ∞∑ n=0 (−12λ2)n(−λ1 − 1 2λ2 − 3 2 )n n! ( 12 )n × (2z1z2 −z3)2n(4z1z4 + z23 ) λ1+ 12 λ2−n+ 3 2 , f2 = ∞∑ n=0 ( 12 − 1 2λ2)n(−λ1 − 1 2λ2 − 1)n n! ( 32 )n × (2z1z2 −z3)2n(4z1z4 + z23 ) λ1+ 12 λ2−n+1. for at least one of these functions to be a nonconstant polynomial, must be 2λ1+λ2+3 ∈ n, i.e. 2λ1+λ2+2 ∈ n0. if 2λ1 + λ2 + 3 is even, we get the solution f1 = λ1+ 12 λ2+ 3 2∑ n=0 (−12λ2)n(−λ1 − 1 2λ2 − 3 2 )n n! ( 12 )n × (2z1z2 −z3)2n(4z1z4 + z23 ) λ1+ 12 λ2−n+ 3 2 , and for 2λ1 + λ2 + 3 odd, we have the solution f2 = λ1+ 12 λ2+1∑ n=0 ( 12 − 1 2λ2)n(−λ1 − 1 2λ2 − 1)n n! ( 32 )n × (2z1z2 −z3)2n+1(4z1z4 + z23 ) λ1+ 12 λ2−n+1. if 2λ1 + λ2 + 3 is even and λ2 is even, then λ1 is a half integer, i.e. λ1 = `1 − 12 , where `1 ∈ z, counts in f1 only to n ≤ 12λ2, i.e. f = min( 12 λ2,λ1+ 1 2 λ2+ 3 2 )∑ n=0 (−12λ2)n(−λ1 − 1 2λ2 − 3 2 )n n! ( 12 )n × (2z1z2 −z3)2n(4z1z4 + z23 ) λ1+ 12 λ2−n+ 3 2 , and, therefore, f is in the variable z2 of a polynomial of degree not exceeding λ2. if 2λ1 + λ2 + 3 is even and λ2 is odd, i.e. λ1 is an integer, the function f = λ1+ 12 λ2+ 3 2∑ n=0 (−12λ2)n(−λ1 − 1 2λ2 − 3 2 )n n! ( 12 )n × (2z1z2 −z3)2n(4z1z4 + z23 ) λ1+ 12 λ2−n+ 3 2 , in the variable z2 is a polynomial of degree 2λ1 +λ2 +3. thus admissible solutions get only λ1 ≤−2. if 2λ1 + λ2 + 3 is odd, then solution f2 comes into play. if 12 (λ2−1) ∈ n0, i.e. for odd λ2 and half integer λ1 sum in f2 only n ≤ 12 (λ2 − 1), then the solutions are f = min( 12 λ2− 1 2 , λ1+ 12 λ2+1)∑ n=0 ( 12 − 1 2λ2)n(−λ1 − 1 2λ2 − 1)n n! ( 32 )n × (2z1z2 −z3)2n+1(4z1z4 + z23 ) λ1+ 12 λ2−n+1 in the z2 polynomial of degree not exceeding λ2. but for 2λ1 + λ2 + 3 odd and λ2 even, i.e. λ1 ∈ z, the solution is f = λ1+ 12 λ2+1∑ n=0 ( 12 − 1 2λ2)n(−λ1 − 1 2λ2 − 1)n n! ( 32 )n × (2z1z2 −z3)2n+1(4z1z4 + z23 ) λ1+ 12 λ2−n+1. in the variable z2 it is a polynomial of degree 2λ1 + λ2 + 3. therefore we get a permissible solution for 2λ1 + λ2 + 3 ≤ λ2, i.e. λ1 ≤−2. case 6 (ρ1 = −λ2 − 1, ρ2 = λ1 + λ2 + 2.). in this case, f(z1,z2,z3,z4) = z−λ1−λ2−21 (4z1z4 + z 2 3 ) λ1+ 12 λ2+ 3 2 g(t), where function g(t) is the solution of equation (10). for this function f to be polynomial, must be 2λ1 + λ2 + 3 ∈ n0 and −λ1 − λ2 − 2 ∈ n0. 
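the closed-form solutions listed in this appendix are easy to verify symbolically. the sketch below (our own illustration, not part of the paper; the sample weights are chosen to satisfy the stated admissibility conditions) applies the operators (4) to the case-2 and case-3 functions and confirms that e1 and e2 annihilate them:

```python
import sympy as sp

z1, z2, z3, z4 = sp.symbols('z1 z2 z3 z4')

def e1(f, l1, l2):
    """the operator e1 from (4)."""
    return (l1*sp.diff(f, z1) - z1*sp.diff(f, z1, 2) + z2*sp.diff(f, z1, z2)
            - z3*sp.diff(f, z1, z3) - l2*z2*sp.diff(f, z3)
            + z2**2*sp.diff(f, z2, z3) + z4*sp.diff(f, z3, 2))

def e2(f):
    """the operator e2 from (4)."""
    return sp.diff(f, z2) + 2*z1*sp.diff(f, z3) - z3*sp.diff(f, z4)

# case 2: lambda1 in N0, f = z1^(lambda1+1); e.g. lambda1 = 2, lambda2 = 3
f_case2 = z1**3
print(sp.simplify(e1(f_case2, 2, 3)), sp.simplify(e2(f_case2)))    # 0 0

# case 3: lambda1+lambda2+1 in N0 and 2*lambda1+lambda2+4 <= 0;
# e.g. lambda1 = -4, lambda2 = 3, so f = (z4 + z2*z3 - z1*z2**2)^1
f_case3 = z4 + z2*z3 - z1*z2**2
print(sp.simplify(e1(f_case3, -4, 3)), sp.simplify(e2(f_case3)))   # 0 0
```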
these conditions, however, are not fulfilled simultaneously for any λ2 ∈ n0.

acknowledgements

the work of č.b. and o.n. was supported in part by grant gacr p201/10/1509 and by research plan msm6840770039.

references

[1] d. verma. structure of certain induced representations of complex semisimple lie algebras. yale university, 1966.
[2] d. verma. structure of certain induced representations of complex semisimple lie algebras. bull. amer. math. soc. 74:160–166, 1968.
[3] i. bernstein, i. m. gel'fand, s. i. gel'fand. structure of representations generated by highest weight vectors. funct. anal. appl. 5:1–8, 1971.
[4] j. dixmier. enveloping algebras. new york: north holland, 1977.
[5] p. exner, m. havlíček, w. lassner. canonical realizations of classical lie algebras. czech. j. phys. 26b:1213–1228, 1976.
[6] e. witten. anti–de sitter space and holography. adv. theor. math. phys. 2:253–291, 1998.
[7] o. aharony, s. s. gubser, j. maldacena, h. ooguri, y. oz. large n field theories, string theory and gravity. phys. rept. 323:183–386, 2000.
[8] v. dobrev. subsingular vectors and conditionally invariant (q-deformed) equations. j. phys. a: math. gen. 28:7135–7155, 1995.
[9] x. xu. differential equations for singular vectors of sl(n). arxiv:math/0305180v1.
[10] č. burdík. realization of the real semisimple lie algebras: method of construction. j. phys. a: math. gen. 15:3101–3111, 1985.
[11] č. burdík, p. grozman, d. leites, a. sergeev. realizations of lie algebras and superalgebras via creation and annihilation operators i. theoretical and math. physics 124(2), 2000.

acta polytechnica 53(1):30–32, 2013. © czech technical university in prague, 2013. available online at http://ctn.cvut.cz/ap/

czech contribution to loft

rené hudec (a,b,∗), vojtěch petráček (c), dalimil šnita (d)

a astronomical institute, academy of sciences of the czech republic, cz-25165 ondřejov, czech republic
b czech technical university in prague, faculty of electrical engineering, technická 2, cz-16627 prague, czech republic
c czech technical university in prague, faculty of nuclear engineering, břehová 78/7, cz-11000 prague, czech republic
d institute of chemical technology, technická 5, cz-16628 prague, czech republic
∗ corresponding author: rhudec@asu.cas.cz

abstract. we describe the current status of the czech contribution to the esa loft space mission, with emphasis on technical aspects. expertise available in the czech republic will play a positive role in the loft project and related developments.

keywords: x-rays, x-ray astronomy, x-ray satellites, loft.

1. introduction: the loft mission

loft, the large observatory for x-ray timing, is a newly proposed space mission intended to provide answers to fundamental questions about the motion of matter orbiting close to the event horizon of a black hole, and the state of matter in neutron stars (feroci et al., 2011). loft was recently selected by esa as one of the four space mission concepts of the cosmic vision programme that will compete for a launch opportunity at the beginning of the 2020s.
the esa loft m-class mission candidate is specifically designed to exploit the diagnostics of very rapid x-ray flux and spectral variability that directly probe the motion of matter down to distances very close to black holes and neutron stars, and also the physical state of ultradense matter (feroci et al., 2011). loft will investigate variability from submillisecond qpos to transient outbursts lasting years. the loft lad has an effective area ∼ 20 times larger than its largest predecessor (the proportional counter array on board rossixte) and much improved energy resolution. the loft wfm will discover and localise x-ray transients and impulsive events, and will monitor spectral state changes, triggering follow-up observations and providing important science in its own right. the basic technologies for loft include: (1.) large area silicon drift detectors; (2.) capillary plate x-ray collimators (feroci et al., 2011). the large area detector (lad) for loft has a fully modular and redundant design: there are 16 independent detectors per module, 21 independent modules per detector panel, and 6 independent detector panels per lad. the wide field monitor for loft is based on the same type of si detectors as the large area detector, but with a finer pitch (300 µm): < 60 µm position resolution in 1d and coarse (∼ 3 mm) resolution in 2d. it has 8 cameras in 4 units and a rectangular fov of 180° × 90° (at zero response; each camera/unit covers 90° × 90°). the capillary-plate collimator is an important part of the loft instrumentation. lead-glass microcapillary plates are commercially available and can be customized. the loft baseline is as follows: fov to ∼ 43′ fwhm (2 mm thickness, 25 µm hole diameter, 28 µm pitch, open area ratio 80 %). the heritage comes from microchannel plates (mcp, e.g. on the chandra satellite). for more details on loft and its onboard experiments, see feroci et al., 2011.

2. czech participation in loft

the czech republic has been a full esa member state for almost 4 years. the loft mission is the 2nd esa high-energy satellite with official czech participation (after integral). the following consortium is expected to handle all aspects of the czech contribution to loft.

2.1. the czech loft team/consortium

the following czech institutes are involved in the esa loft mission and related studies: (1.) czech technical university in prague (ctu); (2.) astronomical institute, academy of sciences of the czech republic, ondřejov (ai); (3.) institute of chemical technology (icht); (4.) silesian university at opava (su). the czech loft consortium has the following members: rené hudec ctu & ai, vojtěch petráček ctu, miroslav finger mff uk, václav vrba fzu av čr, dalimil šnita icht, vladimír karas ai, vojtěch šimon ai, and zdeněk stuchlík su. the wider czech loft consortium has 3 sections: scientific, technical, and industrial; the members are as follows. scientific group: v. karas, z. stuchlík, m. bursa, m. dovciak, j. horak, v. šimon, r. hudec and g. torok. technical group: v. petráček, d. šnita, l. pína, r. hudec, m. finger, v. vrba and o. gedeon. industrial group: v. maršíková, a. inneman, m. holl and p. bareš.
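the headline instrument numbers quoted above are easy to sanity-check with two lines of arithmetic (our own check, not from the paper; the quoted 43′ fwhm corresponds to the simple geometric angle atan(d/t) of the collimator holes):

```python
import math

# lad modularity: 16 detectors/module, 21 modules/panel, 6 panels
print(16 * 21 * 6)   # 2016 silicon drift detectors in total

# capillary collimator: 25 um holes in a 2 mm thick plate
d, t = 25e-6, 2e-3
print(math.degrees(math.atan(d / t)) * 60)   # ~43 arcmin, the quoted fwhm
```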
2.2. goals of czech participation

the following points indicate the anticipated goals of czech participation in the loft mission.
• contribute to the science of the loft mission – x-ray and bh (black hole) astrophysics.
• participate in the study, design and development of a major onboard experiment (the large area detector), and contribute to silicon drift detector design, development and tests.
• possibly participate in studying and designing an alternative glass capillary plate technology for the large area detector.
• complete the study and design of an additional small czech lobster eye x-ray all sky monitor.
• ensure participation by czech industry in various onboard experiments and satellite parts and systems, hardware and software, if the mission is selected.
in this paper we focus on technical participation in the loft project; scientific aspects will be addressed in a separate forthcoming paper.

2.3. expertise available (for loft)

expertise in the following fields is available in the participating czech institutions:
• si drift detectors (czech technical university);
• polycapillary glass plates (institute of chemical technology);
• wide-field lobster eye x-ray monitors;
• x-ray and bh astronomy and astrophysics (astronomical institute, silesian university at opava, charles university);
• space industry – hardware and software (czech space alliance, small companies and larger companies);
• data analyses (isdc participation).

2.4. polycapillary glass plates

polycapillary glass plates form a significant part of the onboard instrumentation. below we offer a brief description of a possible option based on experience available in czech institutions.
• a possible alternative solution to mcp: a recent version of our photolithographic method produced glass plates 0.2–1.6 mm in thickness and 100 × 100 mm in size, with holes as small as 50 microns.
• developed by d. šnita, institute of chemical technology in prague.
• the technology will need to be further developed if it is to be used in the loft project.

2.5. silicon drift detectors

silicon drift detectors (sdd) are key instrumentation for loft. the following expertise is available in czech institutions:
• extended czech participation in detectors for the alice ground-based experiment;
• expertise in design, development, manufacture and tests of sdd;
• ctu (czech technical university) in prague, the institute of physics, and on semiconductor czech republic are the main czech institutions involved;
• the sdd was invented by the czech scientist pavel rehak.

3. sdd in prague – the team's qualifications and resources

in this section we briefly introduce the qualifications and resources of the czech sdd teams.

3.1. team composition (experience with sdd)

the expert team is concentrated in the center for physics of ultra-relativistic heavy ion collisions, consisting of teams from ctu in prague and from the institute of nuclear physics of the academy of sciences of the czech republic. the main partners are the following. faculty of nuclear sciences and physical engineering, czech technical university in prague: experience in detector design, development, prototype production, detector testing, experimental operation of sdd, radiation damage tests, detector defect simulation, calibration, data analysis, and development of the control system for sdd (wa98, na45-2 ceres and alice experiments). nuclear physics institute: expertise in detector assembly, operation, testing, data analysis (na45-2 ceres), sdd calibration and analysis, and sdd laboratory tests.
3.2. laboratory infrastructure currently available for the sdd program

the following laboratory infrastructure is available to support sdd-related studies and developments: fnspe ctu prague — clean room facility, probe station, ac-dc test equipment (lrc), laser charge injection, bonding station (small, manual), dark box; npi ascr — clean room, probe station, test equipment, bonding station, irradiation facility (n-source, cyclotron); cern — beam tests (if needed).

3.3. production expertise

the collaboration teams also have the following production expertise:
• prototype production of sdd for the alice experiment – icm prague;
• post-prototyping contacts at on semiconductor, roznov, czech republic (together with our team, the company tested double-sided technology and later successfully produced many of the pixel detectors for the atlas detector at lhc);
• selection and detector tests – participation in detector construction (trieste and torino);
• design and production of the low-voltage power supply system for the alice sdd.

4. the le (lobster-eye) all sky monitor

in addition, we plan to further exploit the modular concept of the le all-sky x-ray monitor (hudec, 2011; tichy et al., 2011) as a possible low-energy addition to the recently considered loft onboard instrumentation. the modular le concept offers easy modification for exist, hxmt or other satellites. one module can be as small as 5×5×30 cm and ∼ 2 kg, and hence can be considered as a supplementary onboard instrument for loft. the basic estimated parameters for an array of 30 modules are given below:
• daily limiting flux 10−12 erg/s/cm2;
• one module: 2×195 plates of 78×11.5×0.1 mm, 0.3 mm spacing;
• detector pixel size 150 microns;
• total front area 1825 cm2;
• energy range 0.1–10 kev;
• fov 180 × 6 degrees (30 modules of 6 × 6 degrees);
• angular resolution 3–4 arcmin;
• total mass < 200 kg (for 30 modules).

5. conclusions

after esa athena was not selected for the esa l mission, loft is the only remaining x-ray astronomy mission under consideration by esa for the future. for the czech teams, loft is (after integral) the 2nd esa high-energy astrophysical mission with official czech participation. this paper has provided a brief review of possible czech participation in the loft mission and the contribution that czech institutes are able to make. the exact extent of our participation will depend on the available funding. despite extensive efforts, no funding was available at the time when this paper was written. however, recent changes in the organization of czech space activities and the plan to establish a czech national space agency give good hope that the situation will improve soon, enabling a more effective czech contribution to the loft mission.

acknowledgements

we acknowledge the support provided by the grant agency of the czech republic within the framework of grants 102/09/0997 and 13-33324s "lobster eye x-ray monitor".

references

[1] m. feroci, l. stella, m. van der klis. the large observatory for x-ray timing (loft). experimental astronomy, 2011.
[2] r. hudec. new types of astronomical x-ray optics. vulcano workshop 2010: frontier objects in astrophysics and particle physics, conference proceedings, italian physical society, vol. 103, pp. 643–648, 2011.
[3] v. tichý, m. barbera, a. collura, et al. lobster eye optics for nano-satellite x-ray monitor. euv and x-ray optics: synergy between laboratory and space ii, edited by hudec, rené; pina, ladislav,
proceedings of the spie, volume 8076, pp. 80760c–80760c-13 (2011).

acta polytechnica vol. 42 no. 1/2002

thermal comfort and optimum humidity, part 1

m. v. jokl

abstract: the hydrothermal microclimate is the main component in indoor comfort. the optimum hydrothermal level can be ensured by suitable changes in the sources of heat and water vapor within the building, changes in the environment (the interior of the building) and in the people exposed to the conditions inside the building. a change in the heat source and the source of water vapor involves improving the heat insulating properties and the air permeability of the peripheral walls and especially of the windows. the change in the environment will bring human bodies into balance with the environment. this can be expressed in terms of an optimum or at least an acceptable globe temperature, an adequate proportion of radiant heat within the total amount of heat from the environment (defined by the difference between air and wall temperature), and uniform cooling of the human body by the environment, defined a) by the acceptable temperature difference between head and ankles, and b) by acceptable temperature variations during a shift (location unchanged), or during movement from one location to another without a change of clothing. finally, a moisture balance between man and the environment is necessary (defined by acceptable relative air humidity). a change for human beings means a change of clothes, which, of course, is limited by social acceptance in summer and by inconvenient heaviness in winter. the principles of optimum heating and cooling, humidification and dehumidification are presented in this paper. hydrothermal comfort in an environment depends on the heat and humidity flows (heat and water vapor) occurring in a given space in a building interior and affecting the total state of the human organism.

keywords: thermal comfort, optimum humidity, hygienic standards.

1 sources of heat and humidity

the various sources of heat and water vapor, especially in a cramped space, under the roof etc., can lead to hydrothermal comfort disturbance within an interior, especially if the thermal-insulation properties and ventilation of the building are not on a proper level.

1.1 sources of heat and cold

the outdoor climatic situation is the greatest source of heat and cold for the interior of a building. heat and cold are transferred indoors through the outer walls of a building. windows play a major role in this field: they are the greatest source of heat losses in winter and of heat gains in summer.

fig. 1: heat losses from a family house

for example, the heat loss from a detached family house is 20 % through the windows, and only 16 % through the walls (fig. 1), [10], [15]. thus very great attention should be given to the quality of windows, especially at times when energy prices are rising rapidly.
if not, windows will function more or less as holes through which heat escapes from the building. well-insulated windows are also useful in summer: they decrease the heat flow into the interior from outdoors, thus improving thermal comfort. from the thermal point of view it does not matter whether the windows are made of plastic or of wood. easy maintenance is the main advantage of plastic windows (no painting is necessary), but their lower mechanical strength (plastic window frames must be more massive) causes loosening of the fittings after some time, and their almost perfect air tightness can decrease the natural air exchange in an interior below the tolerable limit. wooden windows (several layers of wood glued together, so-called eurowindows) are now widespread in europe, following the period of plastic windows. eurowindows do not suffer from deformation and, thanks to the application of synthetic resin for gluing, maintenance is not demanding.

various human activities, especially cooking and ironing (the input from these appliances contributes to the heat gain in the room), and also people themselves (especially if several people are present in a room), constitute heat sources within a building. heat (so-called basal metabolic heat) is produced by the liver of a person at rest (e.g. asleep) in relation to age (children generate more heat than adults) and sex (females produce less heat than males, so they need a higher room temperature) (fig. 2). heat production increases with bodily activity, and the heat is then produced mainly by the muscles (so-called net metabolic heat, in the range from 40 to 170 w/m2 while working at home; basal metabolic heat must be added to these values, of course) (table 1).

fig. 2: basal heat production by human beings (basal metabolic rate q̇m,b), for the standard man of average weight and height (75 kg, 175 cm; body surface ≈ 1.9 m2)

table 1: net metabolic heat q̇m,net in the course of various typical activities (ashrae fundamentals handbook, 1985); each entry gives [met] / [w·m−2]:
resting: sleeping 0 / 0; reclining 0.1 / 6; seated, quiet 0.3 / 18; standing, relaxed 0.5 / 29.
walking on the level: 0.89 m/s 1.3 / 76; 1.34 m/s 1.9 / 111; 1.79 m/s 3.1 / 181.
miscellaneous occupations: bakery (e.g., cleaning tins, packing boxes) 0.7–1.3 / 11–76; brewery (e.g., filling bottles, loading beer boxes onto a belt) 0.5–1.7 / 29–99; bricklaying (bricking a wall) 1.8 / 106; carpentry: machine sawing (table) 1.1–1.5 / 64–87, sawing by hand 3.3–4.1 / 192–239, planing by hand 4.9–5.7 / 285–332; foundry work: using a pneumatic hammer 2.3–2.7 / 134–157, tending furnaces 4.3–6.3 / 250–367; garage work (e.g., replacing tires, raising cars by jack) 1.5–2.3 / 87–134; general laboratory work 0.7–1.1 / 41–64; machine work: light (e.g., electrical industry) 1.3–1.7 / 76–99, heavy (e.g., steel work) 2.8–3.8 / 163–221; shop assistant 1.3 / 76; teacher 0.9 / 52; lecture in a large hall 0.8–2.2 / 46–129; watch repairer, seated 0.4 / 23.
vehicle driving: car on a highway 0.7 / 39; car in a city center (busy period) 2.1 / 124; car (average) 0.8 / 47; motorcycle 1.3 / 76; heavy vehicle 2.5 / 146.
aircraft flying: routine 0.7 / 41; instrument landing 1.1 / 64; combat flying 1.7 / 99.
maximal physical output (short-term): 16.9 / 982.
domestic work: house cleaning 1.3–2.7 / 76–157; cooking 0.9–1.3 / 52–76; washing by hand and ironing 1.3–2.9 / 76–169; making a fire in an open fireplace or in a stove 1.4–1.6 / 79–90; shopping 0.7–1.1 / 41–64.
office work: typing 0.5–0.7 / 29–41; miscellaneous office work 0.4–0.6 / 23–35; drafting 0.4–0.6 / 23–35.
leisure activities: stream fishing 0.5–1.3 / 29–76; calisthenics exercise 2.3–3.3 / 134–192; jogging (long-distance) 6.6 / 381; table-tennis, singles 3.4 / 196; swimming 7.5–8.7 / 436–506; biking (long-distance) 7.2–8.0 / 419–463; dancing: social 1.7–3.7 / 99–215, waltz 3.9 / 226, polka 7.2 / 419; tennis, singles 2.9–3.9 / 169–227; squash, singles 4.3–6.3 / 250–378; basketball, half court, indoors 4.3–6.9 / 250–402; wrestling, competitive or intensive 6.3–8.0 / 367–466; golf, swinging and walking 0.7–1.9 / 41–111; golf, swinging and golf cart 0.7–1.1 / 41–64.
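to turn table 1 into actual heat loads, the met values must be converted to watts and the basal share added back, as the text notes. a small sketch of this conversion (our illustration; the 58.2 w/m² met definition is the standard ashrae value, while the 0.7 met basal share and the 1.9 m² body surface of the standard man are assumptions consistent with the figures quoted here):

```python
MET_W_M2 = 58.2      # 1 met in w/m^2 (standard ashrae definition)
BASAL_MET = 0.7      # assumed basal share; table 1 lists *net* values above basal
BODY_AREA = 1.9      # m^2, "standard man" of 75 kg / 175 cm (cf. fig. 2)

def total_heat_w(net_met):
    """whole-body heat output = (basal + net metabolic rate) * body surface."""
    return (BASAL_MET + net_met) * MET_W_M2 * BODY_AREA

print(round(total_heat_w(0.3)))   # seated, quiet: ~111 w
print(round(total_heat_w(1.1)))   # cooking (upper range): ~199 w
```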
1.2 sources of water vapor

the water vapor content in an interior is determined both by the water vapor content in the outdoor air and by the water vapor sources inside the building. in winter, as a result of low air temperatures, the water vapor content in the outdoor air is low (the vapor condenses or even freezes and falls to the earth). thus the air coming indoors is very dry after warming to the indoor temperature, and the relative humidity may even drop below 20 %. in summer, when the air temperatures are high, the water vapor content in the outdoor air is high, due to the higher saturation of the air by water vapor [6], [11], [12]. thus the air coming indoors is almost saturated with water vapor after cooling to the indoor temperature, and the relative air humidity can even reach 100 %. water vapor indoors also results mostly from various human activities, especially having a shower (about 2600 g/h), cooking (up to 1500 g/h), drying linen (up to 500 g/h), pot plants (up to 20 g/h), and also from human beings themselves (30 up to 300 g/h) (see table 2).

table 2: sources of water vapor in an apartment:
humans: light activity 30–60 g/h; medium heavy work 120–300 g/h; heavy work 200–300 g/h.
bathroom: with bath about 700 g/h; with shower about 2600 g/h.
kitchen: during cooking 600–1500 g/h; daily average 100 g/h.
drying linen (washer for 4.5 kg): centrifugal 0–200 g/h; drip 100–500 g/h.
pools (free water surface): about 40 g/m2h.
plants: indoor flowers, e.g. violet (viola) 5–10 g/h; flowers in a pot, e.g. fern (comptonia asplenifolia) 7–15 g/h; fig plant of medium size (ficus elastica) 10–20 g/h; water plants, e.g. water-lily (nymphaea alba) 6–8 g/h; young trees (2–3 m tall), e.g. beech (fagus) 2–4 kg/h; grown trees (25 m), e.g. spruce (picea) 2–3 m3/h.
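the source strengths in table 2 translate directly into a ventilation requirement via a steady-state moisture balance: the vapor released must be carried away by the air exchange. a minimal sketch with illustrative numbers (our own example, not from the paper):

```python
RHO_AIR = 1.2   # air density, kg/m^3

def required_airflow_m3h(source_g_per_h, x_indoor_g_kg, x_outdoor_g_kg):
    """airflow that removes the vapor source at a steady indoor moisture level."""
    return source_g_per_h / (RHO_AIR * (x_indoor_g_kg - x_outdoor_g_kg))

# cooking at 1000 g/h, indoor air held at 8 g/kg, outdoor air at 3 g/kg:
print(round(required_airflow_m3h(1000, 8, 3)))   # ~167 m^3/h of ventilation air
```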
2 impact of heat, cold and water vapor on human beings

the hydrothermal constituent of the environment manifests itself in various ways. particular attention must be paid to the impact of heat and cold, and to the impact of water vapor.

2.1 impact of heat and cold on human beings

humans are so-called "homoiotherm organisms", i.e. they need a constant temperature at their core (for their inner bodily organs). for this purpose they are provided with a thermoregulatory center situated in the hypothalamus (see fig. 3), located at the center of an imaginary line connecting the auditory canals. the central location of the thermoregulatory center, with all-round protection from the head, indicates its importance. the heat center, and at the same time the terminal center, is in the anterior (front) hypothalamus (which controls vasodilatation and perspiration), while the center of cold, and at the same time the "heat maintenance center", is in the posterior (rear) hypothalamus. however, the response to both warmth and cold is controlled by the terminal anterior center, also called the "temperature eye" or the "human thermostat", because it is able to maintain the body temperature at a strictly determined "set" temperature [6], [5].

fig. 3: thermoregulatory centers

thermal equilibrium between the human organism and the environment is the basic condition for maintaining body temperature [6], [16]; i.e., the heat produced by a human must be transferred to his environs. if more heat is transferred than is produced (as a result of an excessively cold environment), the thermal balance is disturbed downwards, followed by a feeling of coldness. if less heat is transferred than is produced (as a result of an excessively warm environment), the thermal balance is disturbed upwards, followed by a feeling of warmth. in a warm or hot environment the heat balance can be restored by sweating (the body is cooled by sweat evaporation), but in a very cold environment there is no equivalent mechanism (except shivering, which rarely occurs), so excessive cooling leads to hypothermia and a decrease in body temperature. (taumo is practiced in tibet: young men compete in a contest to dry out wet bed sheets at a temperature of −32 °c. they did this with such ease that even a member of a british expedition joined in and survived!) for this reason, most hygiene regulations permit up to 4 liters of sweat production in a hot workplace, but take no account of shivering in cold workplaces.

overheating of the human organism, followed by an increase in body temperature, can also occur if the quantity of evaporated sweat is not sufficient, or if the sweat cannot be evaporated (in a humid environment, in waterproof clothing). the human body temperature can be increased without risk only slightly above 38 °c; higher temperatures are dangerous. the extent to which the human body temperature can be decreased is still under discussion; until recently 28 °c had been considered to be the critical value. however, in 1999 norwegian doctors in tromso were able to revive a 29-year-old woman, anna bågenholm from sweden, who had fallen into a river near narvik while on a cross-country skiing run. she spent more than an hour under the river ice before being rescued and transported to tromso hospital – her body temperature measured there was only 13.8 °c. the difference between undercooling and catching cold depends only on one's state of health, i.e., on one's immunobiological resistance.

in everyday practice, it is important that people are protected against excessive heat by sweating, but there is no protection against excessive cold – there is a danger of undercooling and also of catching cold. thus when a person feels cold, immediate action is necessary: increased thermal/insulating properties of clothing, a higher air temperature, intensive physical activity, etc.

however, the heat balance of the human organism is not a sufficient condition for hydrothermal comfort. radiant comfort must be added, i.e., the heat balance of the body should be provided by external radiant heat (we are used to solar radiation), while heat should be released by convection to the outside (cooling by wind in nature). this physiological fact is expressed by the so-called radiant comfort coefficient rcc, which expresses the ratio of the radiant and convective heat flows:

rcc = body radiation / body convection ≥ 1.
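since the total heat released by the body is fixed by the metabolism, while rcc only prescribes how that total is split between radiation and convection (the text notes the sum is the same in the cellar and fireplace examples below), the split can be computed directly. a minimal sketch; the 100 w total for a seated person is an assumed round number:

```python
def heat_flow_split(q_total_w, rcc_value):
    """split a fixed total body heat flow into its radiant and convective
    parts for a given rcc = q_rad / q_conv (sec. 2.1)."""
    q_conv = q_total_w / (1.0 + rcc_value)
    return q_total_w - q_conv, q_conv  # (q_rad, q_conv)

# optimum rcc = 1 for a seated person releasing ~100 w:
# 50 w by radiation, 50 w by convection
print(heat_flow_split(100.0, 1.0))
```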
rcc should be higher than or equal to one for optimum conditions. this limit enables another hygienic requirement for interiors to be estimated: the acceptable difference between the air temperature and the surface (wall) temperature, and the corresponding difference between the air temperature and the globe temperature in winter. it is clear that in a cellar with low wall temperatures, where the optimum limit for rcc cannot be achieved, we feel less comfortable than in front of an open fireplace, where the optimum limit is provided; the sum of the convective and radiative heat is, of course, the same in both cases.

besides thermal equilibrium and radiant comfort, the human organism is also sensitive to uniform heat transfer to the surroundings, both in space and in time. in space, the cooling of the ankles must not be too different from the cooling of the head, i.e., the temperatures at ankle and head level must not be too different, and air streaming must also be taken into account (drafts should be avoided). in time, the temperature oscillations during a day in the same place, or when passing from one room to another without a change of clothes [7], [8], [9], [11], [12], [13], must not exceed the thermoregulatory range of the human body.

some locations on the human body surface are extremely sensitive to undercooling and therefore need to be carefully protected (see fig. 4). they are: the neck (the thyroid gland is a part of the thermoregulatory system), the area below the shoulder blade (purine agents sediment in this area), the kidneys (kidney belts made of special fur seem to be useful), the small of the back (the area where people most often feel pain; local cold stress may result in painful lumbago), and the ankles, which are directly connected to the upper breathing passages (when the ankles are immersed in cold water, the temperature of these passages, e.g. the inside of the nose, immediately drops, and microbes multiply, leading to sneezing). everybody knows that wet shoes will soon lead to sneezing.

temperature changes when passing from one environment to another are a chronic problem in hot countries, i.e., in the subtropics and tropics. in the persian gulf, where 50 °c and 100 % relative humidity can occur in summer, the air temperature is maintained at about 20 °c inside hotels and many other buildings. foreign visitors are liable to catch colds, and residents are liable to rheumatism followed by arthritis.

2.2 impact of water vapor on human beings

there are problems with low relative air humidity in winter, when the air coming from outdoors contains just a small quantity of water, and with high relative air humidity in summer, especially during rainy weather. a sensation of dryness is caused by drying out of the mucosa in the upper breathing passages, leading to a disturbance of the protective nasal function, which acts as a sort of filter for aerosols (especially dust, including allergens) (see fig. 5). the cilia in the mucosa are continuously in motion and prevent dust sedimentation, as can be seen under a microscope. according to evert [14], mucus (the slimy substance) production depends mainly on the relative humidity of the inspired air: mucus production and cilia movement decrease rapidly below a relative humidity of 40 %. this produces optimum conditions for microbe multiplication.
low air humidity also has a negative impact on the skin and eyes, and it promotes static electricity production.

fig. 4: locations on the human body sensitive to cold
fig. 5: mucosa of the upper respiratory pathways in normal condition (a), and dried out (b)

relative humidity not only affects feelings of comfort but also has a direct impact on human health. green investigated the occurrence of influenza and runny noses in children at school [6]: there was a significant difference between classrooms with humidified and nonhumidified air. the results obtained by ritzel in five kindergartens are even more significant: in the investigated classrooms with air humidification, the occurrence of upper respiratory infections was not much more than half the occurrence in the classrooms of the control group without humidification (table 3). the explanation may be found in fig. 6: adenoviruses, which produce colds, have their lowest survival rate at about 60 % relative air humidity. particles causing severe allergies (e.g., pollen, domestic dust often containing hairs and particles from the skin of pets, particles of mites, etc.) spread most freely in conditions of low air humidity.

high relative humidity, over 70 % (over 50 % for people with high sensitivity), in combination with high air temperature produces a feeling of sultriness, and can even lead to health problems. it enables the spread of airborne molds and the multiplication of mites (see fig. 7), thus producing breathing problems, sore throats, headaches, runny noses and nervous problems in adults, and especially in children. the number and intensity of these problems increases in proportion to the humidity in the accommodation the children live in (according to dr. platt, from the epidemiological unit of medical research at the royal edinburgh hospital, u.k.).

mites are the main source of allergies in denmark, according to research by gullev [3]: they cause allergies in 200 000 danes. cat and dog hair, mold spores, and cigarette smoke cause a smaller number of allergies. the use of a vacuum cleaner is very effective against mites in carpets and other textiles, which feed on the remains of human skin (immediately after use, the filter must be emptied out). decreasing the relative air humidity below 45 % for a period of several days is also effective. special natural sprays have been developed against mites: allersearch, developed in australia, contains plant substances that repel mites for at least six months. these natural substances are not harmful to humans, since they act by changing the structure of the proteins in the bodies of the mites. insecticides, which used to be used, are harmful to the human body and are no longer very effective against mites. artilin is applied with a brush; it kills mites on contact and also destroys the molds that feed mites and give them shelter. the painting is effective for at least five years.

3 hydrothermal microclimate standards

two hydrothermal microclimate standards are particularly influential in the world today: the us ansi/ashrae national standard 55-1992, with addendum ansi/ashrae 55a-1995, thermal environmental conditions for human occupancy, and the european international standard iso 7730, moderate thermal environments.
3.1 ansi/ashrae standard 55-1992, thermal environmental conditions for human occupancy

table 3: effects of relative air humidity on upper respiratory infections in kindergartens:
humidified classrooms (rh = 49 %, ta = 22 °c): 9306 total student study days; 195 days absent (colds); absenteeism 3.0 %
unhumidified classrooms (rh = 40 %, ta = 22 °c): 5910 total student study days; 138 days absent (colds); absenteeism 5.7 %

fig. 6: the percentage of microorganisms surviving in the air as a function of relative air humidity
fig. 7: a mite living in a building interior (magnified; real size about 0.1 mm)

the purpose of this standard is to specify the combinations of indoor space environment and personal factors that will produce hydrothermal environmental conditions acceptable to 80 % or more of the occupants within the space. the environmental factors addressed are temperature, thermal radiation, humidity and air speed; the personal factors are those of activity and clothing. the standard specifies the hydrothermal environmental conditions acceptable for healthy people, at atmospheric pressures equivalent to altitudes up to 3000 m, in indoor spaces designed for human occupancy for periods of not less than 15 minutes.

clothing, through its insulative properties, is an important modifier of body heat loss and comfort. clothing or garment insulation is quantified in clo units, where 1 clo = 0.155 m2·°c/w (0.88 ft2·h·°f/btu). in this standard, the intrinsic insulation value (icl) is used to describe the insulation provided by clothing ensembles: the resistance to sensible heat transfer provided by a clothing ensemble (i.e., more than one garment), described as the intrinsic insulation from the skin to the clothing surface, not including the resistance provided by the air layer around the clothed body. clothing insulation values for selected ensembles are given in table 4. a heavy business suit with its accompanying garments usually has a clo value of 1.

the operative temperature (t0) is the uniform temperature of an imaginary black enclosure in which an occupant would exchange the same amount of heat by radiation plus convection as in the actual nonuniform environment. operative temperature is numerically the average of the air temperature (ta) and the mean radiant temperature (tr), weighted by their respective heat transfer coefficients (hc and hr):

t0 = (hc·ta + hr·tr) / (hc + hr).

the operative temperature is approximately equal to the globe temperature (fig. 8). the operative temperatures and clothing insulation values corresponding to the sensation of neutral and to the 10 % dissatisfaction criterion are given in fig. 9.

table 4: clothing insulation values icl [clo] for typical ensembles:
1. briefs; knit, short-sleeve sport shirt; walking shorts; belt; calf-length socks; hard-soled shoes – 0.4
2. panties; broadcloth, short-sleeve shirt; a-line, knee-length skirt; pantyhose; thongs/sandals – 0.5
3. briefs; broadcloth, long-sleeve shirt; long fitted trousers; calf-length socks; hard-soled shoes – 0.6
4. panties; full slip; broadcloth, short-sleeve shirt; belted a-line dress; long-sleeve cardigan sweater; pantyhose; hard-soled shoes – 0.7
5. panties; broadcloth, long-sleeve shirt; sleeveless round-neck sweater; thick walking shorts; belt; thick knee socks; hard-soled shoes – 0.7
6. panties; half-slip; broadcloth, long-sleeve blouse; single-breasted suit jacket; a-line, knee-length skirt; pantyhose; thongs/sandals – 1.0
7. briefs; thermal long underwear top; thermal long underwear bottoms; flannel, long-sleeve shirt; overalls; calf-length socks; hard-soled shoes – 1.0
8. briefs; broadcloth, long-sleeve shirt; single-breasted suit jacket; tie; straight, long fitted trousers; calf-length socks; hard-soled shoes – 1.0
9. briefs; t-shirt; broadcloth, long-sleeve shirt; long-sleeve, round-neck sweater; thick, straight, long, loose trousers; belt; calf-length socks; hard-soled shoes – 1.0
10. panties; broadcloth, long-sleeve shirt; thick vest; thick, single-breasted suit jacket; thick, a-line, knee-length skirt; pantyhose; hard-soled shoes – 1.0
11. briefs; t-shirt; broadcloth, long-sleeve shirt; thick vest; thick, single-breasted suit jacket; thick, straight, long, loose trousers; belt; calf-length socks; hard-soled shoes – 1.2
12. briefs; t-shirt; flannel, long-sleeve shirt; work jacket; belt; work pants; calf-length socks; hard-soled shoes – 1.3
13. flannel, long-sleeve, long nightgown; thick, long-sleeve, wrap, long robe; slippers – 1.7

the acceptable range of operative temperatures and humidities for winter and summer is given in table 5, and is further defined in the psychrometric chart in fig. 10. comfort conditions for clothing levels different from those given above can be determined approximately by lowering the temperature ranges of table 5 or fig. 10 by 0.6 °c (1 °f) for each 0.1 clo of increased clothing insulation. however, at lower temperatures comfort depends on the maintenance of a reasonably uniform distribution of clothing insulation over the entire body, and in particular over the hands and feet. for sedentary occupancy of more than an hour, the minimum operative temperature should not be less than 18 °c (65 °f).

within the thermally acceptable temperature ranges of table 5 and fig. 10 there is no minimum air speed necessary for thermal comfort. for sedentary persons it is essential to avoid drafts; requirements on avoiding drafts are given in the section dealing with nonuniformity of the heat load on humans (see later). the temperature may be increased above the level allowed for the comfort zone if a means is also provided for elevating the air speed. the benefit that can be gained by increasing the air speed depends on clothing, activity, and the difference between the surface temperature of the clothing/skin and the air temperature: fig. 11 shows the air speed required for the clothing and activities that correspond to the summer comfort zone in fig. 10. for sedentary people this option may not be used to increase the temperature by more than 3 °c (5.4 °f) above the comfort zone, and it may not be used if the required speed is more than 0.8 m/s (160 fpm). if this option is used, the air speed and/or its direction in work locations must be under the direct control of the affected occupants and adjustable in steps no greater than 0.25 m/s (50 fpm).
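the operative temperature definition and the clothing correction quoted above translate directly into code. a minimal sketch; the winter example reuses the table 5 range, while the 0.6 °c shift per 0.1 clo is the rule stated in the text:

```python
def operative_temperature(t_a, t_r, h_c, h_r):
    """t0 = (h_c*t_a + h_r*t_r) / (h_c + h_r): the average of air and mean
    radiant temperature weighted by the heat transfer coefficients."""
    return (h_c * t_a + h_r * t_r) / (h_c + h_r)

def shift_comfort_range(t_low_c, t_high_c, icl_clo, icl_ref_clo):
    """lower a table-5 comfort range by 0.6 °c for each 0.1 clo of
    clothing insulation added above the reference ensemble."""
    shift = 6.0 * (icl_clo - icl_ref_clo)  # 0.6 °c per 0.1 clo
    return t_low_c - shift, t_high_c - shift

# winter range 20-23.5 °c at 0.9 clo, re-evaluated for a 1.2 clo ensemble:
print(shift_comfort_range(20.0, 23.5, 1.2, 0.9))  # (18.2, 21.7) °c
```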
table 5: optimum and acceptable ranges of operative temperature for people during light, primarily sedentary activity (≤ 1.2 met), at 50 % relative humidity and a mean air speed of 0.15 m/s (30 fpm)*:
winter, typical clothing heavy slacks, long-sleeve shirt and sweater (icl = 0.9 clo): optimum operative temperature 22 °c (71 °f); acceptable range (10 % dissatisfaction criterion) 20–23.5 °c (68–75 °f)
summer, typical clothing light slacks and short-sleeve shirt (icl = 0.5 clo): optimum 24.5 °c (76 °f); acceptable range 23–26 °c (73–79 °f)
summer, minimal clothing (icl = 0.05 clo): optimum 27 °c (81 °f); acceptable range 26–29 °c (79–84 °f)
*other than clothing, there are no adjustments for season or sex to the temperatures in table 5. for infants, certain elderly people, and individuals who are physically disabled, the lower limits of table 5 should be avoided.

fig. 8: the vernon–jokl globe thermometer (1 thin copper globe, 2 thermometer, 3 polyurethane, 4 fastening stand)
fig. 9: the recommended range of clothing insulation providing acceptable thermal conditions at a given operative temperature for people during light, primarily sedentary activity (≤ 1.2 met), according to ansi/ashrae standard 55-1992; the limits are based on a 10 % dissatisfaction criterion

3.1.1 nonsteady state

problems of the nonsteady state concern mainly temperature cycling and temperature drifts or ramps.

temperature cycling: if the peak cyclic variation (time period less than 15 minutes) in the operative temperature exceeds 1.1 °c (2 °f), the rate of temperature change shall not exceed 2.2 °c/h (4 °f/h). there are no restrictions on the rate of temperature change if the peak-to-peak difference is 1.1 °c (2 °f) or less.

temperature drifts or ramps: these are monotonic, steady, noncyclic operative temperature changes; drifts refer to passive temperature changes of the enclosed space, and ramps refer to actively controlled temperature changes. the maximum allowable drift or ramp from a steady-state starting temperature between 21 °c and 23.3 °c (70 °f and 74 °f) is a rate of 0.5 °c/h (1 °f/h). the drift or ramp should not extend beyond the upper operative temperature limit of the comfort zone (specified in fig. 10) by more than 0.5 °c (1 °f), and should not remain beyond this temperature zone for longer than one hour.
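the cycling and drift/ramp rules of section 3.1.1 amount to a few numeric limits, so a compliance check is a one-liner each. a minimal sketch (the function signatures are, of course, our own, not part of the standard):

```python
def cycling_ok(peak_to_peak_c, rate_c_per_h):
    """temperature cycling (period < 15 min): if the peak-to-peak variation
    exceeds 1.1 °c, the rate of change must not exceed 2.2 °c/h."""
    return peak_to_peak_c <= 1.1 or rate_c_per_h <= 2.2

def ramp_ok(rate_c_per_h, overshoot_c, hours_beyond_zone):
    """drifts/ramps from a 21-23.3 °c start: at most 0.5 °c/h, at most
    0.5 °c beyond the comfort-zone limit, for at most one hour."""
    return (rate_c_per_h <= 0.5 and overshoot_c <= 0.5
            and hours_beyond_zone <= 1.0)

print(cycling_ok(1.5, 2.0), ramp_ok(0.4, 0.3, 0.5))  # True True
```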
3.1.2 nonuniformity

nonuniformity of the hydrothermal environment (vertical temperature difference, radiant temperature asymmetry, warm or cold floors, and draft; see fig. 12 for an overview) may cause local discomfort.

vertical air temperature difference: the air temperature in an enclosed space generally increases from floor to ceiling. if this increment is sufficiently large, local warm discomfort can occur at the head and/or cold discomfort at the feet, although the body as a whole is thermally neutral. to prevent local discomfort, the vertical air temperature difference within the occupied zone, measured at the 0.1 m (4 in) and 1.7 m (67 in) levels, shall therefore not exceed 3 °c (5 °f) (fig. 13).

radiant temperature asymmetry: asymmetric radiation from hot and cold surfaces and from direct sunlight can cause local discomfort and reduce the thermal acceptability of the space. in general, people are more sensitive to asymmetric radiation caused by a warm ceiling than to that caused by hot and cold vertical surfaces. thus, to limit local discomfort, the radiant temperature asymmetry shall be less than 5 °c (9 °f) in the vertical direction and less than 10 °c (18 °f) in the horizontal direction. the radiant temperature asymmetry in the vertical direction is the difference in plane radiant temperature of the upper and lower parts of the space with respect to a small horizontal plane 0.6 m (2 ft) [seated] or 1.1 m (3.6 ft) [standing] above the floor; in the horizontal direction, it is the difference in plane radiant temperatures in opposite directions from a small vertical plane at the same heights above the floor.

floor temperatures: to minimize foot discomfort, the surface temperature of the floor for people wearing typical indoor footwear shall be between 18 °c (65 °f) and 29 °c (84 °f). for floors that people may occupy with bare feet, the optimum floor temperature depends on the type of floor material.

draft: air movement may cause unwanted local cooling of the body, defined as draft. the risk of draft depends on the mean air speed, the turbulence intensity, and the temperature of the air. sensitivity to draft is greatest where skin is exposed, at the head and ankles. fig. 14 shows the mean air speed limits at these locations needed to limit the draft risk; higher air speeds may be acceptable if the person has individual control of the local air speed.

fig. 10: acceptable ranges of operative temperature and humidity for people in typical summer and winter clothing during light, primarily sedentary activity (≤ 1.2 met), according to ansi/ashrae standard 55-1992; the operative temperature ranges are based on a 10 % dissatisfaction criterion
fig. 11: air speed required to offset increased temperature according to ansi/ashrae standard 55-1992. the air speed increases in the amount necessary to maintain the same total heat transfer from the skin. the figure applies to increases in temperature above those allowed in the summer comfort zone, with tr and ta increasing equally. the starting point of the curves at 0.2 m/s corresponds to the recommended air speed limit for the summer comfort zone at 26 °c (79 °f) and typical ventilation (i.e., turbulence intensity between 30 % and 60 %). acceptance of the increased air speed requires occupant control of the local air speed.
fig. 12: types of nonuniformity (nun) of the hydrothermal environment
fig. 13: acceptable air temperature decrease at ankle level and at head level
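the nonuniformity limits above can likewise be collected into a single screening routine. a minimal sketch of the four ashrae 55-1992 local-discomfort criteria quoted in this section:

```python
def local_discomfort_violations(dt_vertical_c, asym_ceiling_c,
                                asym_wall_c, t_floor_c):
    """check a room state against the quoted nonuniformity limits and
    return the list of violated criteria (empty list = compliant)."""
    v = []
    if dt_vertical_c > 3.0:
        v.append("vertical air temperature difference > 3 °c (0.1 vs 1.7 m)")
    if asym_ceiling_c >= 5.0:
        v.append("radiant asymmetry, vertical (warm ceiling) >= 5 °c")
    if asym_wall_c >= 10.0:
        v.append("radiant asymmetry, horizontal (walls/windows) >= 10 °c")
    if not 18.0 <= t_floor_c <= 29.0:
        v.append("floor surface temperature outside 18-29 °c")
    return v

print(local_discomfort_violations(2.0, 6.0, 4.0, 17.0))
```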
3.1.3 people at different activity levels

the temperatures given in table 5 and fig. 10 shall be decreased when the average steady-state activity level of the occupants is higher than light, primarily sedentary (> 1.2 met). the optimum operative temperature for activity depends both on the time-weighted average activity level (e.g., over about 30 to 60 minutes) and on the clothing insulation; it can be found from fig. 15. people with higher activity levels are less sensitive to drafts.

3.2 international standard iso 7730, moderate thermal environments

hydrothermal comfort is defined as that condition of mind which expresses satisfaction with the hydrothermal constitution of the environment. dissatisfaction may be caused by warm or cold discomfort for the body as a whole, as expressed by the pmv (predicted mean vote) and ppd (predicted percentage dissatisfied) indices. hydrothermal dissatisfaction may also be caused by unwanted heating or cooling of one particular part of the body (local discomfort; more exactly, nonuniformity of the hydrothermal environment). due to individual differences it is impossible to specify a hydrothermal environment that will satisfy everybody; there will always be a percentage of dissatisfied occupants. iso 7730 therefore specifies recommended comfort requirements that are known to be acceptable to at least 80 % of the occupants.

3.2.1 light, mainly sedentary activity during winter conditions (heating period)

clothing of 1 clo = 0.155 m2·°c/w is assumed. the conditions are the following:
a) the operative temperature shall be between 20 and 24 °c (i.e. 22 ± 2 °c), with relative humidity about 50 %.
b) the vertical air temperature difference between 1.1 m and 0.1 m above the floor (head level and ankle level) shall be less than 3 °c.
c) the surface temperature of the floor shall normally be between 19 and 26 °c, but floor heating systems may be designed for 29 °c.
d) the mean air velocity shall be less than 0.15 m/s.
e) the radiant temperature asymmetry from windows or other cold vertical surfaces shall be less than 10 °c (in relation to a small vertical plane 0.6 m above the floor).
f) the radiant temperature asymmetry from a warm (heated) ceiling shall be less than 5 °c (in relation to a small horizontal plane 0.6 m above the floor).

3.2.2 light, mainly sedentary activity during summer conditions (cooling period)

clothing of 0.5 clo = 0.078 m2·°c/w is assumed. the conditions are the following:
a) the operative temperature shall be between 23 and 26 °c (i.e. 24.5 ± 1.5 °c), with relative humidity about 50 %.
b) the vertical air temperature difference between 1.1 m and 0.1 m above the floor (head and ankle level) shall be less than 3 °c.
c) the mean air velocity shall be less than 0.25 m/s.

3.2.3 other activities

for both winter and summer conditions, the optimum operative temperature as a function of various activities and clothing is presented in fig. 16.

3.2.4 drafts

draft, especially cold draft, seems to be the most difficult hydrothermal comfort problem in building interiors nowadays. the european standard "ventilation of buildings, design criteria for the indoor environment" cen/tc 156 therefore introduces a new criterion, the so-called draft rating (dr), which can be estimated, e.g., by the swema air 300 instrument with an swa 01 probe (fig. 17). dr is defined as the percentage of persons who perceive the draft in the investigated place as dissatisfying.

fig. 14: allowable mean air speed (v) as a function of air temperature (ta) and turbulence intensity (tu) according to ansi/ashrae standard 55-1992. the turbulence intensity may vary between 30 % and 60 % in conventionally ventilated spaces; in rooms with displacement ventilation or without ventilation, the turbulence intensity may be lower. the diagram is based on a 15 % acceptability level and on the sensation at the head/feet level, where people are most sensitive. higher air speeds may be acceptable if the affected occupants have control of the local air speed.
fig. 15: optimum operative temperatures for active people in environments with a low air speed, v ≤ 0.15 m/s (30 fpm), according to ansi/ashrae standard 55-1992
dr depends on the air velocity, the air turbulence (vortices in the air) and the air temperature.
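the text does not reproduce the dr formula itself. the correlation commonly used in the cen/iso documents (fanger's draft model) is sketched below as an assumption, to show how the three quantities enter:

```python
def draft_rating(t_a_c, v_mean_m_s, tu_percent):
    """draft rating [%], commonly written (fanger / iso 7730) as
    dr = (34 - ta) * (v - 0.05)**0.62 * (0.37*v*tu + 3.14).
    this exact form is an assumption; the text only states that dr
    depends on air velocity, turbulence and air temperature."""
    v = max(v_mean_m_s, 0.05)  # below 0.05 m/s the model predicts no draft
    dr = (34.0 - t_a_c) * (v - 0.05) ** 0.62 * (0.37 * v * tu_percent + 3.14)
    return min(dr, 100.0)

# 22 °c air, 0.20 m/s mean speed, 40 % turbulence: ~23 % dissatisfied
print(draft_rating(22.0, 0.20, 40.0))
```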
fig. 16: the optimum operative temperature as a function of activity and clothing according to iso 7730
fig. 17: swema air 300 with the swa 01 probe for draft rating estimation

3.3 a new standard based on space research

a new standard for optimum hydrothermal environment conditions (the hydrothermal constituent of the microenvironment), based on soviet space research findings, has been introduced in prague, czech republic. class one of this standard (see table 6), with a total metabolic rate up to 80 w/m2, can be applied to a residential environment. the optimum operative temperature range is based partly on temperatures just before the onset of sweating (upper limit) and partly on temperatures just before the onset of shivering (lower limit) [6], [16], [17]. in addition to the optimum values, admissible values are also introduced.

the nonuniformity solution arouses great interest. it seems that in both the u.s. and the european standards, the problem of moderate heat stress nonuniformity is not resolved in a proper way. first, radiant nonuniformity, known as radiant asymmetry, should be evaluated on the basis of human physiology: the human body is able to tolerate even high thermal radiation in cool streaming air, but only very low thermal radiation in hot streaming air, i.e., thermal radiation and convection must be taken into account simultaneously. second, the physiological radiant comfort coefficient must also be taken into account (see section 2.1). radiant asymmetry must be evaluated from the physiological point of view, simultaneously with the cooling effect of the surrounding air. in a uniform environment this fact is respected by the globe temperature or, more exactly, by the operative temperature (the globe temperature is the temperature of a globe resulting from warming by radiation and cooling by convection). in a non-uniform environment it is respected by the stereotemperature, which is the temperature of only a part of the measuring globe, changed from the technical point of view into a so-called stereothermometer (fig. 18). thermal asymmetry can thus be expressed exactly as the difference of two stereotemperatures: the radiating side minus the side without the investigated radiation [7], [8], [9], [11], [12]. for the optimum and admissible values, see table 6.

if this new criterion is applied, e.g., in a living room, it is evident that the heating body, as a source of radiant heat, must be located under the window, i.e., close to the source of radiant cold (see part 2, section 1.2.1). as presented above, thermal comfort depends not only on the heat equilibrium of the human body, but also on the way in which it is reached, i.e., on the proportions of the individual heat flows, expressed by the radiant comfort coefficient (see section 2.1). from the limit value of rcc we get, for example, the maximum difference between the air temperature and the indoor wall surface temperature: e.g., 2 k for a man sitting and watching tv. of course, some further optimum and admissible values are introduced in this new standard, e.g., an operative temperature decrease from head level to ankle level of up to 3 °c and an operative temperature increase of up to +9 °c (table 6).

table 6: the ranges of optimum and admissible hydrothermal parameters for the interiors of buildings prescribed in the czech republic (work class i, for rcc = 1); rh = relative air humidity, tg,stereo = stereotemperature measured by a stereothermometer:
cold period of the year, optimum: t0 = 20–23 °c; air velocity ≤ 0.1 m/s; t0,ankles − t0,head = −3.0 to +9.0 k; tair − tglobe = 0.4 k; tair − tg,stereo = 1.2 k; rh = 30–70 %
cold period of the year, admissible: t0 = 18–24 °c; air velocity ≤ 0.1 m/s; t0,ankles − t0,head = −4.5 to +13.4 k; tair − tglobe = 0.4 k; tair − tg,stereo = 1.2 k; rh = 30–70 %
warm period of the year, optimum: t0 = 23–26 °c; air velocity 0.1–0.2 m/s; t0,ankles − t0,head = −3.0 to +9.0 k; tair − tglobe = 0.4 k; tair − tg,stereo = 1.2 k; rh = 30–70 %
warm period of the year, admissible: t0 = 20–28 °c; air velocity 0.1–0.2 m/s; t0,ankles − t0,head = −3.5 to +10.0 k; tair − tglobe = 0.4 k; tair − tg,stereo = 1.2 k; rh = 30–70 %
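table 6 again reduces to range checks. a minimal sketch for the optimum class-i values of the cold period (the warm period and the admissible values would only change the constants; the table's single values for the globe and stereotemperature differences are read here as maxima, which is an interpretation):

```python
def meets_optimum_cold(t0, v, dt_ankle_head, d_air_globe, d_air_stereo, rh):
    """check a room state against the optimum class-i values of table 6
    for the cold period of the year."""
    return (20.0 <= t0 <= 23.0 and v <= 0.1
            and -3.0 <= dt_ankle_head <= 9.0
            and abs(d_air_globe) <= 0.4
            and abs(d_air_stereo) <= 1.2
            and 30.0 <= rh <= 70.0)

print(meets_optimum_cold(22.0, 0.08, 2.0, 0.3, 1.0, 45.0))  # True
```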
to be continued.

references
[1] centnerová, l.: ventilation of a family house. topenářství, instalace, no. 3/1999, pp. 64–65
[2] chlum, m., jokl, m., papež, k.: progressive ways of residential ventilation. společnost pro techniku prostředí, praha 1999, 75 pp.
[3] gullev, g.: allergy and indoor air climate. scanvac, no. 1/1999, pp. 8–9
[4] hensen, j., kabele, k.: application of system simulation to wch boiler selection. in: proceedings of the 5th int. ibpsa conference, prague 1997, pp. 141–147
[5] jirák, z., jokl, m. v., bajgar, p.: long-term and short-term tolerable work-time in a hot environment: the limit values verification. int. j. of environmental health research, no. 7/1997, pp. 33–46
[6] jokl, m. v.: microenvironment: the theory and practice of indoor climate. thomas, illinois, u.s.a. 1989, 416 pp.
[7] jokl, m. v.: the stereothermometer: a new instrument for hydrothermal constituent nonuniformity evaluation. ashrae transactions 96, no. 3435/1990, pp. 13–15
[8] jokl, m. v.: stereothermometer for the evaluation of a hydrothermal microclimate within buildings. heizung/lüftung, klima, haustechnik 42, no. 1/1991, pp. 27–32
[9] jokl, m. v.: stereothermometer, an instrument for assessing the non-uniformity of the environmental hydrothermal constituent. čs. hygiena 36, no. 1/1991, pp. 14–24
[10] jokl, m. v.: some new trends in power conservation by thermal insulating properties of buildings. acta polytechnica, no. 8/1990, pp. 49–63
[11] jokl, m. v.: hydrothermal microclimate: a new system for evaluation of non-uniformity. building serv. eng. technol. 13, no. 4/1992, pp. 225–230
[12] jokl, m. v.: theory of the non-uniformity of the environment at hydrothermal constituent. stavební obzor 1, no. 4/1992, pp. 16–19
[13] jokl, m. v.: internal microclimate. czech technical university in prague, prague 1992, 182 pp.
[14] jokl, m. v.: the theory of the indoor environment of a building. czech technical university in prague, prague 1993, 261 pp.
[15] jokl, m. v.: energetics of building environsystems. czech technical university in prague, prague 1993, 148 pp.
[16] jokl, m. v., moos, p.: optimal globe temperature respecting human thermoregulatory range. ashrae transactions 95, no. 3288/1989, pp. 329–335
[17] jokl, m. v., moos, p.: the nonlinear solution of the thermal interaction of the human body with its environment. technical papers, tu prague, building construction series, ps no. 6/1992, pp. 15–24
[18] kabele, k., kadlecová, m., matoušovic, t., centnerová, l.: application of complex energy simulation in competition design of the czech embassy in ottawa. in: proceedings of the 6th int. ibpsa conference, kyoto, japan 1999, pp. 249–255
[19] kadlecová, m.: internal microclimate in museums, galleries and exhibition rooms. dissertation, czech technical university in prague, prague 1992, 47 pp.
[20] kopřiva, m.: buildings saving energy. topenářství, instalace 33, no. 5/1999, pp. 98–99
[21] lunos lüftungsfibel. berlin 1999, 21 pp.
[22] papež, k.: ventilation and air conditioning, exercises. czech technical university in prague, prague 1993, 116 pp.
[23] sandberg, m.: hybrid ventilation, new word, new approach. swedish building research, no. 4/1999, pp. 2–3

miloslav v. jokl, ph.d., sc.d., university professor
phone: +420 2 2435 4432, fax: +420 2 3333 9961, e-mail: miloslav.jokl@fsv.cvut.cz
czech technical university in prague, faculty of civil engineering, thákurova 7, 166 29 prague 6, czech republic

fig. 18: stereothermometer (1–6 switches of selected segments, 7 globe temperature)

acta polytechnica 53(2):138–141, 2013

thomson parabola spectrometer for energetic ions emitted from sub-ns laser generated plasmas

mariapompea cutroneo (a,b,*), lorenzo torrisi (a,c), lucio ando' (c), salvatore cavallaro (c), jiri ullschmied (d), josef krasa (e), daniele margarone (e), andreji velyhan (e), eduard krousky (d,e), miroslav pfeifer (d,e)
a) department of physics, university of messina, messina, italy; b) csfnsm, centro siciliano di fisica nucleare e struttura della materia, catania, italy; c) infn, laboratori nazionali del sud, catania, italy; d) institute of plasma physics, ascr, za slovankou 3, 182 00 prague, czech republic; e) institute of physics, ascr, na slovance 2, 182 21 prague, czech republic
* corresponding author: m.cutroneo@tiscali.it

abstract. laser-generated plasmas were obtained in high vacuum by irradiating micrometric thin films (au, au/mylar, mylar) with the asterix laser at the pals research infrastructure in prague. irradiations at the fundamental wavelength, with 300 ps pulse duration and intensities up to about 10^16 w/cm2, enabled ions to be accelerated in the forward direction with kinetic energies of the order of 2 mev per charge state. protons above 2 mev were obtained in the direction orthogonal to the target surface in self-focusing conditions. gold ions up to about 120 mev and 60+ charge state were detected. ion collectors and semiconductor sic detectors were employed in a time-of-flight arrangement in order to measure the ion velocities as a function of the angle around the normal direction to the target surface. a thomson parabola spectrometer (tps) with a micro-channel-plate detector was used to separate the different ion contributions to the charge emission in single laser shots, and to obtain information on the ion charge states, energies and proton acceleration. the experimental tps spectra were compared with accurate tosca simulations of the tps parabolas.

keywords: thomson parabola spectrometer, high intensity laser, focal position, opera 3d tosca code, simulation.

1. introduction

ion acceleration driven by laser-generated plasma is a major topic in various scientific fields, from ion sources to ion implantation, nuclear physics and biomedicine. in investigations of the macroscopic and microscopic effects occurring when a laser interacts with matter, one of the most important parameters is the product iλ^2, where i is the laser intensity and λ is the laser wavelength.
it is well known that increasing the iλ^2 parameter increases the ponderomotive energy transferred to the plasma electrons and, consequently, the ion acceleration [10]. the formation of the laser plasma and its dynamics differ between low and high laser power densities. at low power densities, the plasma in a high-vacuum chamber expands along the normal direction to the irradiated target surface with a non-relativistic velocity, and the energy of the produced ion beam is of the order of 200 ev per charge state [9]. at high power densities, the ion beam expands with a relativistic velocity, the typical ion energies being about 2 mev per charge state [1]. the investigated ion beam must be characterized in terms of the maximum ion energy, the charge states, the energy distribution, etc., as well as the shot-to-shot reproducibility. various detector systems have been exploited to investigate ion acceleration in laser-produced plasmas, one of the most useful being the thomson parabola spectrometer (tps). in a tps, the ions are deflected by combined magnetic and electric fields, producing a mass/charge and charge-state separation. much information on the physics of ion acceleration can be obtained by analyzing the parabolas imaged on a micro-channel plate (mcp) based detector coupled to a phosphorus screen [12].

2. material and methods

at the infn national laboratories in catania, several kinds of specimens were irradiated by ir laser light from an nd:yag laser operating at 1–10 hz, with an intensity of 5 × 10^10 w/cm2, 9 ns pulse duration and a beam diameter of 500 µm. it was possible to detect the plasma produced in the backward direction as a consequence of thick-target irradiation [6]. several experiments have been carried out recently at the pals laboratory in prague, where an iodine laser with a laser intensity of 7 × 10^16 w/cm2, wavelengths of 1315 nm (1ω) and 438 nm (3ω), 300 ps pulse duration and a beam diameter of 70 µm in single-shot mode [8] has been employed for irradiating both thick and thin targets. the hottest plasmas were the forward-expanding ones, generated by the interaction of the laser with thin targets. much effort was focused on the choice of target materials, such as thick and thin polymers (polyethylene and mylar), metals (cu and au), and targets with synthesized nanoparticles.

various tools based on time-of-flight (tof) techniques and on magnetic and electric ion deflection, including the thomson parabola spectrometer, have been used for ion detection. ion collectors (ic) were fixed at a known distance and angle from the target, and their outputs were connected to a fast storage oscilloscope. the ic current density signal depends on the ion charge and is given by [3]

j_ic = e z_i n_i v_i,  (1)

where e is the electron charge, z_i is the ion charge state, n_i is the ion density and v_i is the ion velocity. at low laser intensity, the use of an ion energy analyzer (iea) permits the energy per charge state (e/z) of the ions emitted from the plasma to be evaluated. the iea consists of two coaxial metallic plates that deflect the ions electrostatically by 90° towards a windowless electron multiplier (wem) detector, selecting them in terms of e/z. the ratio e/z is given by the relation

e/z = 2 k v,  (2)

where k is the gain parameter (k = 10) depending on the wem, and v is the voltage applied to both deflection plates [7].
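equations (1) and (2) are simple enough to evaluate inline. a minimal sketch; the example voltages are chosen only to show that a 10–100 v sweep covers the ~200 ev per charge state ions mentioned above:

```python
E_CHARGE = 1.602e-19  # c, elementary charge

def iea_energy_per_charge(voltage_v, k_gain=10.0):
    """ion energy per charge state selected by the 90-degree electrostatic
    analyzer, e/z = 2*k*v (eq. 2), in ev per charge state."""
    return 2.0 * k_gain * voltage_v

def ic_current_density(z, n_i_m3, v_i_m_s):
    """ion collector current density j = e*z*n*v (eq. 1), in a/m2."""
    return E_CHARGE * z * n_i_m3 * v_i_m_s

# plate voltages of 10 v and 100 v select 200 ev/z and 2 kev/z ions
print(iea_energy_per_charge(10.0), iea_energy_per_charge(100.0))
```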
by varying the bias voltage it was possible to select different e/z values and finally to plot the ion energy and charge state distributions. sic semiconductor detectors were adopted due to their fast response, their insensitivity to visible light (energy gap 3.2 ev) and their response proportional to the energy of the detected ions. their current density signal is given by [3]

j_sic = e (n_i e_i / ε) µ_eff (u_d / d),  (3)

where e_i is the ion energy, ε is the energy necessary for electron–hole pair creation, µ_eff is the quantum efficiency, which takes into account the energy loss in the detector, u_d is the bias applied to the semiconductor detector, and d is the thickness of the semiconductor sensitive layer.

at high laser intensity, a thomson parabola spectrometer was placed along the normal to the target surface in the forward direction. the model of the tps employed at the pals laboratory in prague is shown in fig. 1a.

figure 1: model of the thomson parabola spectrometer (a); profiles of the magnetic fields (b) and electric fields (c) provided by opera 3d/tosca

two pinholes collimate the input ions; the nearest is 1 mm in diameter and the other, 10 cm distant, is 100 µm in diameter. the latter is placed at a distance of 5 mm from the magnetic plates. the applied magnetic fields ranged between 0.05 and 0.2 t, and an electric voltage of 1.0 to 3.0 kv was applied across the two deflecting plates, producing an electric field orthogonal to the direction of the incident ions. the charged particles were deflected by the electrostatic and magnetic fields towards the mcp, fixed at a distance of 16.5 cm from the electrostatic plates. the mcp was equipped with a phosphor screen 2 cm in diameter coupled to a ccd camera [5]. a comparison between the experimental images and simulations carried out using the opera 3d/tosca code and matlab software [4] enabled us to evaluate the mass per charge state, the charge state and the energies of the detected ions. figures 1b, c show the profiles of the magnetic and electric fields calculated with tosca, in order to evaluate the effects of the field edges and gradients on the ion trajectories.
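in the small-deflection approximation, the parabola pattern can be reproduced from first principles: the magnetic section displaces an ion proportionally to ze/(mv), the electric section proportionally to ze/(mv^2). the sketch below uses the field strengths and the 16.5 cm drift quoted above, but the field-region length and plate gap are assumed placeholder values, so it illustrates the m/z separation rather than the exact pals geometry:

```python
import math

E_CHARGE = 1.602e-19   # c
AMU = 1.66054e-27      # kg

def tps_deflections(mass_amu, z, energy_ev, b_t=0.1, u_v=2000.0,
                    gap_m=0.02, l_field_m=0.05, drift_m=0.165):
    """small-angle deflections (x: magnetic, y: electric) at the mcp plane.
    b_t, u_v and drift_m follow the ranges quoted in the text; l_field_m
    and gap_m are assumed values."""
    m = mass_amu * AMU
    v = math.sqrt(2.0 * energy_ev * E_CHARGE / m)
    e_field = u_v / gap_m
    lever = l_field_m * (0.5 * l_field_m + drift_m)  # thin-field lever arm
    x = z * E_CHARGE * b_t * lever / (m * v)          # magnetic deflection
    y = z * E_CHARGE * e_field * lever / (m * v ** 2) # electric deflection
    return x, y

# protons of 1 and 2 mev lie on the same parabola, the faster ion
# landing closer to the undeflected (zero-order) spot:
print(tps_deflections(1.0, 1, 1.0e6))
print(tps_deflections(1.0, 1, 2.0e6))
```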
3. results

the measurements carried out at low laser intensities, with thick targets, provided evidence that ions are emitted from the plasma in the backward direction with energies of the order of 200 ev per charge state. for au, the maximum charge state is of the order of 10+, thus the maximum kinetic energy is about 2 kev. the ion energy distributions follow coulomb–boltzmann shifted functions, and the charge state distributions are inversely proportional to the ionization potentials of the atomic species. the investigations performed at high laser intensities, with thin targets, showed that ions emitted from the plasma in the forward direction can have much higher energies, of the order of 2 mev per charge state. for au, the maximum charge state is of the order of 70+, thus the maximum kinetic energy is about 150 mev, as seen from the tps spectra.

the image shown in fig. 2a represents a typical tps spectrum obtained by irradiating a 0.6 µm thin au target. the spectrum contains a bright circular zone, caused by undeflected photons and neutral particles arriving at the mcp, and a number of parabolas emerging from this circle. figure 2b shows the conversion of the experimental spectrum into gray scale. figure 2c shows the simulation data, obtained with the opera 3d/tosca code and matlab software, overlapped with the experimental data. the lowermost parabola corresponds to the deflected protons, while the other parabolas correspond to au ions with high charge states. the closer the parabola points are to the circular zone, the higher the energy of the ions. the maximum proton energy determined from the proton parabolas was as high as 2.5 mev. the maximum energies of the au ions increase with the charge state: values of 160 kev, 20 mev, 60 mev and 130 mev were evaluated for the charge states au2+, au20+, au40+ and au60+, respectively.

figure 2: experimental parabolas for an au target 0.6 µm in thickness (a), transformation into gray scale (b), and identification of the parabolas (c)
figure 3: experimental parabolas obtained for au targets of three different thicknesses irradiated by the iodine laser

several homogeneous au samples, ranging in thickness from 0.6 to 50 µm, were irradiated under the same experimental and geometrical conditions of laser energy, focal position and magnetic-electric deflections. figure 3 compares three experimental tps spectra obtained for different au sample thicknesses, from 0.6 up to 50 µm. this experiment showed that the maximum kinetic energy of the protons emitted from the laser-produced plasma in the forward direction increases with the target thickness: proton energies from 2.5 mev (au 0.6 µm) up to 3.5 mev (au 50 µm) were observed. these results can be explained on the basis of the plasma electron density, which increases with the thickness of the target, producing an enhancement of the electric field driving the ion acceleration. of course, for au targets thicker than about 100 µm, the flux of charged particles transmitted to the rear side of the target drops strongly due to the high energy loss, and practically no plasma is obtainable in the forward direction. a large number of forward-accelerated protons with very high kinetic energies, above 2 mev, can thus be generated in the high-electron-density plasmas produced when high-intensity laser pulses interact with thin targets made of a heavy material, such as gold.

figure 4: sic-tof spectrum relative to the irradiation of the 0.6 µm au target

with the aim of comparing the experimental results and the simulations, we also analyzed the data obtained using sic detectors fixed in the forward direction in a tof arrangement. they agree well with the tps measurements, as seen from the sic spectrum in fig. 4; in this case, the corresponding proton peak energy is about 1.9 mev. the proton, carbon and gold peaks can be interpreted as a convolution of coulomb–boltzmann shifted distributions, according to the literature [2].
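the tof evaluation behind fig. 4 is a direct conversion of the arrival time into kinetic energy. a minimal sketch; the 1 m flight path and the example arrival time are assumed values, chosen only to reproduce a ~1.9 mev proton peak:

```python
E_CHARGE = 1.602e-19  # c
AMU = 1.66054e-27     # kg

def tof_energy_ev(flight_time_s, flight_path_m, mass_amu=1.0):
    """kinetic energy e = m*(l/t)**2 / 2 of a particle reaching a
    time-of-flight detector after time t over path l, in ev."""
    v = flight_path_m / flight_time_s
    return 0.5 * mass_amu * AMU * v * v / E_CHARGE

# a proton peak at ~52 ns on a detector 1 m from the target (assumed
# path) corresponds to a kinetic energy of ~1.9 mev:
print(tof_energy_ev(52.4e-9, 1.0) / 1e6)
```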
4. conclusions

thomson parabola spectrometry can provide useful and detailed information on the physical processes, ion species and charge states generated in single-shot laser experiments. a combination of measurements and simulations helps to identify the ion species, the charge states, the maximum ion energies, and the intensity distributions. the reported results show that the energy of the protons accelerated in the forward direction is higher than the energy of the protons accelerated in the backward direction, and that it increases with the thickness of the specimen (for micrometric thicknesses). in addition, we observed an increase of the plasma temperature with the electron density of the sample; this effect was evaluated, as a first approximation, from the maximum charge states measured through the tps. in shots with heavy metallic targets, the observed plasma temperatures and ion-driving acceleration potentials were higher than those observed when using light metals or polymeric targets. the same behavior was observed for all the irradiated targets at the different laser intensities and wavelengths used. the results of the experiments at pals also demonstrated that the plasma temperature, the maximum kinetic ion energy and the maximum charge state depend on the parameter iλ^2, as expected [10, 11].

references
[1] m. borghesi, j. fuchs, s. v. bulanov, et al.: fast ion generation by high-intensity laser irradiation of solid targets and applications. fusion science and tech. 49(3):412–439, 2006.
[2] l. laska, l. ryc, j. badziak, et al.: correlation of highly charged ion and x-ray emission from the laser-produced plasma in the presence of non-linear phenomena. rad. eff. & def. in solids 160(10–12):557–566, 2005.
[3] d. margarone, j. krasa, l. giuffrida, et al.: full characterization of laser-accelerated ion beams using faraday cup, silicon carbide, and single-crystal diamond detectors. journal of applied physics 109(10):103302, 2011.
[4] opera 3d/tosca, online: http://www.technosoft.biz/ns/tosca.php
[5] l. torrisi, a. borrielli, f. caridi, et al.: optical spectroscopy in laser-generated plasma at a pulse intensity of 10^10 w/cm2. in: proc. 35th eps-eca, vol. 32d, p. 2.144, 2008.
[6] l. torrisi, f. caridi, l. giuffrida: protons and ion acceleration from thick targets at 10^10 w/cm2 pulse intensity. laser and particle beams 29(1):29–37, 2011.
[7] l. torrisi, m. cutroneo: energy analysis of protons emitted from nd:yag laser generated plasmas. radiation effects and defects in solids 167(6):436–447, 2012.
[8] l. torrisi, m. cutroneo, s. cavallaro, et al.: proton emission from laser-generated plasmas at different intensities. nukleonika 57(2):237–240, 2012.
[9] l. torrisi, l. giuffrida, m. rosinski, c. schallhorn: ge and ti post-ion acceleration from ion source. nucl. instr. and methods in phys. res. b 268(17–18):2808–2814, 2010.
[10] l. torrisi, d. margarone, l. laska, et al.: self-focusing effect in au-target induced by high power pulsed laser at pals. laser and particle beams 26(3):379–387, 2008.
[11] l. torrisi, t. minniti, l. giuffrida: data elaboration of proton beams produced by high-energy laser-generated plasmas. rad. effects & defects in solids 165(6):721–729, 2010.
[12] e. woryna, p. parys, j. wolowski, w. mroz: corpuscular diagnostics and processing methods applied in investigation of laser-produced plasma as a source of highly ionized ions. laser and part. beams 14(3):293–321, 1996.

acta polytechnica vol. 41 no. 1/2001

thermal conductivity of high performance concrete in wide temperature and moisture ranges

j. toman, r. černý

abstract. the thermal conductivity of two types of high performance concrete was measured in the temperature range from 100 °c to 800 °c and in the moisture range from dry material to the saturation water content. a transient measuring method, based on an analysis of the measured temperature fields, was chosen for the high temperature measurements, and a commercial hot-wire device was employed for the room temperature measurements of the effect of moisture on thermal conductivity. the measured results reveal that both temperature and moisture exhibit significant effects on the values of thermal conductivity, and these effects are quite comparable from the point of view of the magnitude of the observed variations.

keywords: concrete, thermal conductivity, moisture content, temperature.

1 introduction

the thermal conductivity k of porous materials is generally a function of a number of external environmental parameters, the most important of them being the temperature t and the moisture u. under usual performance conditions, when relatively narrow temperature and moisture ranges occur, the dependence of k on t and u can be neglected. however, in certain circumstances the dependence of k on t in particular may become very relevant; a typical example is the behavior of a building structure during a fire (see [1]).
employing classical methods for measuring the thermal conductivity of porous materials at high temperatures may lead to certain difficulties, due to the necessity of using large-scale specimens and high-temperature resistant probes. in addition, most of these methods (e.g., [2–5]) enable only measurements of a constant thermal conductivity, which makes the determination of the k(t) relation very time consuming and leads to a loss of precision due to the necessary averaging over some temperature range. a good alternative to the classical treatments is to solve an inverse problem of heat conduction and to determine the temperature dependence of the thermal conductivity from the measured temperature field. in this paper, an integral method using a double-integration treatment for solving the inverse problem of heat conduction in porous materials is employed. this method has been shown to deal with the problems of numerical instability and lack of universality known from the commonly used differential methods [6].

2 measuring methods

for the high temperature measurements of thermal conductivity we employed a double integration method developed earlier by our group [6]. for the convenience of the reader, we show the main features of the derivation of the method. we have the one-dimensional heat conduction equation in the form

ρc ∂T/∂t = ∂/∂x (k ∂T/∂x),  (1)

where k is the thermal conductivity, ρ the density and c the specific heat of the material, and T(x,t) the temperature. we suppose T(t) and T(x) to be monotonous functions and choose a constant value of temperature, Θ = T(x,t). there must then exist one-to-one parametrizations x = x0(Θ,t), t = t0(Θ,x), where both x0 and t0 are monotonous functions. considering this fact, integration of the heat conduction equation (1) by x and t leads to

∫_{t1}^{t2} ∫_0^{x0(Θ,t)} ρc (∂T/∂t) dx dt = ∫_{t1}^{t2} [k ∂T/∂x]_{x=x0(Θ,t)} dt + ∫_{t1}^{t2} jq(0,t) dt,  (2)

where the heat flux at x = 0 reads

jq(0,t) = −k (∂T/∂x)(0,t).  (3)

since the temperature equals Θ along the curve x = x0(Θ,t), the thermal conductivity there is the constant k(Θ), which can be taken in front of the first integral on the right-hand side. the left-hand side of equation (2) can be modified by accounting for the shape of the integration area:

LS = ∫_{t1}^{t2} ∫_0^{x0(Θ,t)} ρc (∂T/∂t) dx dt = ∫_0^{x0(Θ,t1)} ∫_{t1}^{t2} ρc (∂T/∂t) dt dx + ∫_{x0(Θ,t1)}^{x0(Θ,t2)} ∫_{t0(Θ,x)}^{t2} ρc (∂T/∂t) dt dx.  (4)

denoting I(T) = ∫ ρc(T′) dT′, we obtain

LS = ∫_0^{x0(Θ,t2)} I(T(x,t2)) dx − ∫_0^{x0(Θ,t1)} I(T(x,t1)) dx − I(Θ) [x0(Θ,t2) − x0(Θ,t1)].  (5)

substituting (5) into (2) we obtain
k(Θ) = [ ∫_0^{x0(Θ,t2)} I(T(x,t2)) dx − ∫_0^{x0(Θ,t1)} I(T(x,t1)) dx − I(Θ) (x0(Θ,t2) − x0(Θ,t1)) − ∫_{t1}^{t2} jq(0,t) dt ] / ∫_{t1}^{t2} (∂T/∂x)|_{x=x0(Θ,t)} dt.  (6)

in eq. (6), some difficulties may arise in determining the heat flux jq(0,t). basically, there are two ways to deal with this problem: either to measure jq(0,t) directly, or to calculate it from the known temperature field T(x,t). in the high temperature range that we are interested in, measuring jq(0,t) is relatively difficult. therefore, we choose the numerical treatment and, supposing that in the initial state the boundary condition for the temperature on the opposite side of the one-sidedly heated sample does not yet influence the temperature field, we calculate jq(0,t) from the formula

jq(0, t ∈ (t1,t2)) = 1/(t2 − t1) ∫_0^L [ρ c(T) T(x,t2) − ρ c(T) T(x,t1)] dx,  (7)

where L is the length of the one-dimensional sample. for the room temperature measurements, we employed the shotherm qtm (showa denko) commercial hot-wire device.
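numerically, eq. (6) is evaluated by tracing the isotherm x0(Θ,t) through the measured temperature field and applying a quadrature rule to each integral. the routine below is a schematic illustration of this procedure for a constant ρc (so that I(T) = ρc·T), not the authors' implementation:

```python
import numpy as np

def trapz(y, x):
    """trapezoidal rule for samples y on the grid x."""
    y, x = np.asarray(y, float), np.asarray(x, float)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * (x[1:] - x[:-1])))

def conductivity_at_level(theta, x, t, T, rho_c, jq0):
    """schematic evaluation of k(theta) from eq. (6) for a monotonous
    temperature field T[ix, it] (decreasing in x, increasing in t) on the
    grids x, t; rho_c is assumed constant, and jq0 is the heat flux at
    x = 0 taken from eq. (7) or from a measurement."""
    nt = len(t)
    # isotherm position x0(theta, t_j) by linear interpolation in x
    x0 = np.array([np.interp(theta, T[::-1, j], x[::-1]) for j in range(nt)])
    # denominator: time integral of dT/dx taken along the isotherm
    dTdx = np.gradient(T, x, axis=0)
    grad_iso = np.array([np.interp(x0[j], x, dTdx[:, j]) for j in range(nt)])
    denom = trapz(grad_iso, t)
    # numerator: the spatial terms of eq. (6) minus the boundary-flux term
    def enthalpy_integral(j):
        mask = x <= x0[j]
        return trapz(rho_c * T[mask, j], x[mask])
    num = (enthalpy_integral(-1) - enthalpy_integral(0)
           - rho_c * theta * (x0[-1] - x0[0])
           - jq0 * (t[-1] - t[0]))
    return num / denom
```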
3 materials and samples

the experimental work was done with two types of high performance concrete used in nuclear power plants: penly concrete and temelin concrete. penly concrete was used for a concrete containment building in a nuclear power plant in france (samples were obtained from m. dugat, bouygues company, france). it had a dry density of 2290 kg/m3, and consisted of the following components: cement cpa hp le havre (290 kg m−3), sand 0/5 size fraction (831 kg m−3), gravel sand 5/12.5 size fraction (287 kg m−3), gravel sand 12.5/25 size fraction (752 kg m−3), calcareous filler piketty (105 kg m−3), silica fume (30 kg m−3), water (131 kg m−3), retarder chrytard 1.7, super-plasticizer resine gt 10.62. the maximum water saturation was 4 % kg/kg. the temelin concrete used for the concrete containment building of the temelin nuclear power plant in the czech republic had a dry density of 2200 kg/m3 and a maximum water saturation of 7 % kg/kg. the composition was as follows: cement 42.5 r mokrá (499 kg m−3), sand 0/4 size fraction (705 kg m−3), gravel sand 8/16 size fraction (460 kg m−3), gravel sand 16/22 size fraction (527 kg m−3), water (215 kg m−3), plasticizer 4.5 l m−3. for high temperature measurements of thermal conductivity, we used cubic specimens 71 × 71 × 71 mm, and for room temperature measurements prismatic specimens 40 × 40 × 160 mm.

4 experimental results

4.1 temperature dependence of thermal conductivity

the high temperature measurements of thermal conductivity are summarized in figs. 1a, b (fig. 1a: dependence of the thermal conductivity of temelin concrete on temperature; fig. 1b: dependence of the thermal conductivity of penly concrete on temperature). the two materials exhibited very similar behavior. in the temperature range to 400 °c, we observed a characteristic decrease of thermal conductivity which is well known for instance for monocrystalline metals or semiconductors. for temperatures above 400 °c, the thermal conductivity began to increase, which is an unexpected feature. this unusual behavior of both materials can be explained by structural changes in this temperature range, which are mainly due to the loss of crystallically bonded water and dehydration of some components.

4.2 moisture dependence of thermal conductivity

the measured results are shown in figs. 2a, b, where the different markers denote different samples (fig. 2a: dependence of the thermal conductivity of temelin concrete on the moisture content; fig. 2b: dependence of the thermal conductivity of penly concrete on the moisture content). the measured results exhibit significant changes in thermal conductivity due to the moisture variations, which are quite comparable to those due to high temperature exposure. the data for high values of moisture, close to the saturation moisture content, are split, which might be a consequence of structural damage caused by the water saturation process.

5 conclusions

the thermal conductivity of the two types of high performance concrete was determined in wide temperature and moisture ranges. both temperature and moisture effects were found to be very important. they could result in variations of thermal conductivity as high as 50 or more per cent compared to the reference values.

acknowledgement

this paper is based on work supported by the ministry of education of the czech republic, under contract no. cez:j04/98:210000003.

references

[1] toman, j.: influence of external conditions on building materials and constructions. thesis, ctu prague, 1986
[2] dickerson, jr., r. w.: food technol., 19, 198 (1965)
[3] rao, m. a., barnard, j. and kenny, j. f.: trans. asae, 16, 1143 (1975)
[4] dreyer, j. and rogass, h.: proc. of the 4th bauklimatisches symposium, dresden, 1982, p. 53
[5] venzmer, h. and černý, r.: stavebnícky časopis, 38, 105 (1990) (in czech)
[6] černý, r. and toman, j.: proc. of international symposium on moisture problems in building walls, v. p. de freitas, v. abrantes (eds.), p. 299, univ. of porto, 1995

prof. mgr. jan toman, drsc., department of physics, phone: +420 2 2435 4694, e-mail: toman@fsv.cvut.cz
prof. ing. robert černý, drsc., department of structural mechanics, phone: +420 2 2435 4429, e-mail: cernyr@fsv.cvut.cz
ctu, faculty of civil engineering, thákurova 7, 166 29 prague 6, czech republic

acta polytechnica doi:10.14311/ap.2015.55.0007, acta polytechnica 55(1):7–13, 2015, © czech technical university in prague, 2015, available online at http://ojs.cvut.cz/ojs/index.php/ap

monte-carlo re-deposition model during terrestrial measurements of ion thrusters

julia duras (a,∗), oleksander kalentev (a), ralf schneider (a,b), konstantin matyash (b), karl felix lüskow (a), jürgen geiser (a)
(a) institute of physics, ernst-moritz-arndt university greifswald, felix-hausdorff-str. 6, d-17498 greifswald, germany
(b) computing center, ernst-moritz-arndt university greifswald, felix-hausdorff-str. 12, d-17498 greifswald, germany
(∗) corresponding author: julia.duras@physik.uni-greifswald.de

abstract. for satellite missions, thrusters have to be qualified in large vacuum vessels to simulate the space environment. one caveat of these experiments is the possible modification of the beam properties due to the interaction of the energetic ions with the vessel walls. impinging ions can produce sputtered impurities or secondary electrons from the wall. these can stream back into the acceleration channel of the thruster and produce co-deposited layers. over a long operation time of thousands of hours, these layers can modify the optimized geometry and induce changes in the ion beam properties, e.g., broadening of the angular distribution and thrust reduction. a monte carlo code for simulating the interaction of ion thruster beams with vessel walls was developed to study these effects.
back-fluxes of a spt-like ion thruster for two different test-setups and vessel geometries are calculated.

keywords: ion thruster, monte-carlo method, terrestrial testing of thrusters, interaction with vessel walls.

1. introduction

ion thrusters, where the propellant is ionized and the ions are accelerated by electric fields, are of increasing importance for scientific and commercial space missions. compared to commonly used chemical thrusters they have a 5 to 10 times higher specific impulse [1]. this results in a considerably reduced propellant budget, and a significant reduction of spacecraft launch mass by some 100 to 1000 kg can be achieved. one concept for this electric propulsion involves grid-less ion thrusters, which are based on magnetic confinement of the plasma electrons, where the trapped electrons both ionize the propellant and provide the potential drop for ion acceleration. due to their low complexity in terms of system architecture, they are becoming of increasing interest, in particular for commercial satellites. in order to reduce the development and qualification costs, it is therefore necessary to set up and apply a series of different modeling tools which can quantitatively describe the plasma physics within the thruster, and also the interactions of the thruster with the testing environment and finally with the satellite. the integrated modeling strategy should include several modular components in a consistent way in order to provide the complexity and accuracy required for the problem [2]. ions created in the thruster discharge may impinge on the surrounding surfaces, which can induce sputter erosion and redeposition of eroded material. depending on the surface region, this may affect the operational and performance characteristics of the thruster itself, of the ion thruster module, or even of the whole satellite. for the simulation, one can distinguish:

(a) the impact on the inner thruster surface by ions generated in the inner thruster discharge;
(b) the impact on the exit-side surface of the thruster and the neutralizing electron source by ions generated in the plasma plume downstream of the thruster exit;
(c) the impact on the satellite surface, producing erosion and redeposition;
(d) the impact on the vacuum chamber walls during testing and life-time qualification, creating redeposition onto the thruster and thruster module surfaces.

the proposed multi-scale modeling strategy is well suited to address these ion impingement effects. the outline of the paper is as follows: the modeling strategy is described and the problems of artifacts during terrestrial qualifications are outlined. as one example of this modeling, the influence of the test-setups on particle back-fluxes towards the ion thruster channel is studied with a monte carlo model for an spt-like ion thruster. finally, the results are summarized.

2. modeling strategy

the most complete model resolving all time scales of ion thrusters would be a direct coupling of a kinetic plasma model with a molecular dynamics model for the walls. this would allow a fully self-consistent analysis of the complete system, including plasma dynamics, possible erosion of the thruster walls and the interaction of the exhausted ions with the surrounding satellite surfaces or, during testing and qualification, with the testing environment, like the vacuum chamber walls.
this type of solution is not feasible, due to the tremendous computational costs and the high complexity of this combined model. instead, we propose to use a hierarchical multi-scale set of models, in which the parameterization for a lower hierarchy model can be deduced from a higher level model. for example, a 3d particle-in-cell (pic) model can deliver a parameterization of turbulence effects by appropriate anomalous transport coefficients. transport coefficients based on these runs could then be used in a 2d pic, which is more practical for production runs. to get a correct description of both the thruster and the plume plasma, one has to solve a kinetic problem for the whole region of interest, including all significant physical processes. these are collisions, turbulence effects, surface driven sheath instabilities and breathing modes. a pic model is therefore a natural choice for this problem. in addition, similarity scaling is applied to further reduce the calculation costs [3]. in order to describe erosion-redeposition processes, one can use various approximation levels of the model. the most thorough description is given by the full molecular dynamics model. however, this would be far too time-consuming, because it resolves each individual atom and their interactions. the next level can be represented by the binary collision cascade model, which assumes an amorphous target and describes the interaction of particles with the solid on the basis of heavy particle collisions with ions, with additional losses to electrons acting as a viscous force. this model can use the detailed information about flux distributions provided by the pic code, and can then, on this basis, predict the erosion response of the materials. the crudest approximation is given by a monte-carlo (mc) procedure simulating erosion-redeposition on the basis of sputter yield tables calculated from the binary-collision cascade or molecular dynamics model, together with information about the plasma fluxes. this model is particularly useful, due to its simplicity and flexibility, for quantifying the lifetime of ion thrusters.

3. artifacts during terrestrial measurements of an ion thruster

terrestrial qualification of a thruster differs significantly from outer space exploitation, in that it is held in a limited vessel, which can create various artifacts in the measured thruster properties. for example, the back-scattered flux from the vessel walls can be deposited on the walls of the thruster, and can in that way form a conducting layer influencing the thruster operating regimes. these measurements are taken in large vacuum vessels, up to 10 times larger than the thruster itself, in order to provide a space-like environment. despite these dimensions, however, interactions of exhausted particles with residual gas and vessel walls still take place and can modify the measurements. one source of differences between measurements in space and during terrestrial testing is the re-deposition of sputtered particles inside the thruster channel or, for the grid thruster, at the thruster walls. the accelerated ions impinge on the vessel walls and produce sputtered impurities. these can stream back towards the acceleration channel of the thruster and produce co-deposited layers. over a long operation time of thousands of hours, these layers can modify the optimized geometry of the thruster channel or grids and the inner wall surface.
this induces changes in the ion beam properties, e.g., broadening of the angular distribution and thrust reduction, as observed in the test campaigns of hemp-t [4] and the next grid thruster [5]. a reduced back-flux is therefore important to minimize artifacts in the plume measurements. due to the large size difference between the thruster and the vessel, it is possible to parameterize the backscattered flux from the vessel walls as an effective source for the mc erosion-deposition code. this paper will show that the position of the thruster inside the vessel, the wall material and the vessel geometry play important roles and can influence the plume measurement results.

4. description of the monte carlo model

the monte carlo (mc) method is a common approach for plasma-wall problems; for example, mc simulations of sputtering and re-deposition are well established in fusion-oriented studies [6], as is magnetron sputtering [7, 8]. the idea of the monte carlo model is to sample the primary distribution of the ions with respect to energies, angles and species. these pseudoparticles are followed as they hit the vessel walls and generate sputtered particles, based on sputter rates calculated by a binary collision cascade code. their angular distributions are sampled and the back-flow of the eroded particles from the vessel walls towards the ion thruster acceleration channel is calculated. in this work, we assume that particles move along rays according to their source distribution. in this 3d model, the vacuum chamber is assumed to be a cylinder with two spherical caps attached to its ends. the angular source distributions of the mean ion energy, the current and the species fraction were generated with respect to the emission angle θ, see fig. 3. as an example, ion current and energy distributions similar to those published for spt-100 [9] are used. the fraction of xe2+ to xe+ ions is taken arbitrarily. all source distributions used here are shown in fig. 1(a)–(c) (figure 1: distribution of mean ion energy (a), current density (b), xe2+ fraction (c) and the resulting emitted flux (d), similar to spt-100), sampled as a point source at the thruster exit, within angular steps of ∆θ = 5°. the resulting emitted flux

\[ \gamma(\theta) = \frac{j(\theta)}{e}\,\big(1 + f_{em}(\theta)\big) \]

is shown in fig. 1(d), where j(θ) is the current density, e is the elementary charge and \( f_{em}(\theta) \) is the emitted xe2+ fraction. an equal distribution of the poloidal angle is used. this distribution represents an emitted beam of xenon ions with a mean emission angle of θ_em = 0°. the sputter yield for the impinging ions on the vessel walls is taken from sdtrimsp [10] simulations. while the thruster is operating, ions with energies larger than the sputter threshold form a micro-roughness on the thruster surfaces. due to shadowing, this micro-roughness modifies the real angle of incidence, effectively reducing its range to values between 20° and 50°. for this angular range, the sputter yields vary only slightly with angle. we therefore assume that the sputter yields depend only on the energy of the impinging ions. it is also assumed that the sputtered particles obey a cosine law [11] for their angular distribution. due to their low energies, the sputtered particles are assumed to have a sticking coefficient of 1. therefore, only particles with a direction towards the thruster exit are followed. a detailed description of the monte carlo model and its validation with analytic calculations can be found in [12].
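the sampling steps of such a model can be illustrated by a short sketch (python/numpy): inverse-cdf sampling of the polar emission angle from a 5° tabulation of γ(θ), a uniform poloidal angle, the cosine law for the sputtered particles, and a ray-cylinder intersection for the vessel wall. this is an illustration under our own assumptions, not the authors' code; in particular, the tabulated shape below is a placeholder, not the spt-100 data of fig. 1.

    import numpy as np

    rng = np.random.default_rng(1)

    # placeholder 5-degree tabulation standing in for gamma(theta) of fig. 1(d)
    edges = np.radians(np.arange(0.0, 95.0, 5.0))
    gamma = np.exp(-np.degrees(edges[:-1]) / 15.0)

    # bin weight: gamma(theta) times the solid angle of each polar ring
    w = gamma * (np.cos(edges[:-1]) - np.cos(edges[1:]))
    cdf = np.cumsum(w) / w.sum()

    def sample_source(n):
        # polar angle by inverse-cdf table lookup, poloidal angle uniform
        k = np.searchsorted(cdf, rng.random(n))
        theta = rng.uniform(edges[k], edges[k + 1])
        phi = rng.uniform(0.0, 2.0 * np.pi, n)
        return theta, phi

    def sample_cosine():
        # polar angle about the wall normal for a cosine-law re-emission
        return np.arcsin(np.sqrt(rng.random()))

    def hit_cylinder(p, d, rc):
        # point where the ray p + s*d meets the cylinder wall x^2 + y^2 = rc^2
        a = d[0] ** 2 + d[1] ** 2
        b = 2.0 * (p[0] * d[0] + p[1] * d[1])
        c = p[0] ** 2 + p[1] ** 2 - rc ** 2
        s = (-b + np.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
        return p + s * d

the spherical end caps and the exit-plane tally are handled analogously; consistent with the sticking coefficient of 1, only re-emitted rays pointing towards the thruster exit need to be followed.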
in the following, the influence of the thruster position inside the vessel is studied. this is important, since ion thrusters are qualified within various test-setups.

5. back-flux for two different test-setups

measurements of plume parameters are taken in two different test-setups: a 'performance test', where a single thruster is placed in the center of the circular cross-section of the vessel, and an 'end-to-end test', where four thrusters are assembled as a cluster. performance tests are typically carried out to test and qualify single thrusters, while end-to-end tests give the characteristics of a whole cluster of thrusters, as it is applied on satellites where only one of the thrusters is operating. the following comparison of the two test setups shows a strong dependency of the back-flux on the thruster position. for the performance test, the ion thruster is placed co-axially in the vessel. in the following, the lvtf-1 vessel at aerospazio [13] in siena, italy, was taken as a reference. it has a cylinder length of z_c = 7.7 m and a radius of r_c = 1.9 m. the spherical cap at the end has a radius of r_sp = 2.7 m. for simplicity, the ion thruster is approximated by a cylinder with a length of l = 9.0 cm and a diameter of d = 9.0 cm. in most test chambers, graphite-coated walls are used in order to reduce the back-fluxes of sputtered particles, since graphite has a lower sputter yield than aluminum. however, in the case of carbon the release of hydrocarbons is a major problem, linked to the sponge-like characteristics of carbon with respect to its interaction with hydrogen. the sputtered hydrocarbons produce a co-deposited layer inside the thruster channel, which can become conductive and can therefore change the potentials and produce subsequent problems. when carbon is replaced by metal walls, the rates of physical sputtering are larger, but the evaporation in the hot parts of the channel will prevent deposition inside the thruster [2]. aluminum walls are therefore studied with the monte-carlo model. within this model, the back-flux is collected on a circle, which represents the thruster exit. the calculated back-flux fraction towards the source is shown by a blue line in fig. 2. it is given by the number of particles hitting the thruster exit in a certain angular range [θ; θ + ∆θ], with ∆θ = 1°, divided by the total number of source particles. for the chosen parameters of the vessel, the back-flux for θ = [0; 14°] originates from the spherical cap, while for θ = [14°; 90°] it comes from the cylindrical walls of the vessel. the flux fraction shows a pronounced peak at around 10° and a broader peak at about 45°.

figure 2. calculated back-flux towards the thruster exit in total (total flux fraction of 1.97 · 10−5).
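the back-flux fraction f plotted in fig. 2 is a plain histogram quantity; a minimal sketch of the tally, with hypothetical array names, could read:

    import numpy as np

    def backflux_fraction(theta_hit, n_source, dtheta=1.0):
        # hits on the exit plane per source particle, binned in [theta, theta+dtheta]
        bins = np.radians(np.arange(0.0, 90.0 + dtheta, dtheta))
        counts, _ = np.histogram(theta_hit, bins=bins)
        return np.degrees(bins[:-1]), counts / float(n_source)

summing the second return value over all bins gives the total flux fraction quoted in the figure captions (1.97 · 10−5 for the performance test).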
these structures are determined by the combination of the mean ion energy distribution of the emitted xenon ions, the sputter yield and the cosine distribution of the sputtered particles. the first peak at 10° is dominated by the maximum of the mean ion energy, which occurs at the same emitting angle. the second peak is given by the combination of the decreasing mean ion energy with increasing θ and the increasing back-flow as given by the cosine law. for a zero degree emission angle, less flux is seen, due to the small number of emitted particles in this angular region, because of its small circular area for θ ∈ [0; 1°].

for the 'end-to-end' simulations the same vacuum chamber was taken as a reference. a sketch of the implemented geometry is shown in fig. 3.

figure 3. sketch of a thruster cluster assembled in the lvtf-1 vessel within an end-to-end test set-up. '0' to '3' indicate the different thrusters.

in order to reduce the artifacts further, all thrusters are pointing in the same direction. thruster '0' was chosen as the operating thruster. the computed back-flux fractions towards the thruster exit planes of all four thrusters are shown in fig. 4.

figure 4. calculated back-flux to four thruster channels with a total flux fraction of f = 8.0 · 10−5.

the integral flux fraction for each thruster is given in table 1.

table 1. calculated back-flux to four thruster channels with a total flux fraction of f = 8.0 · 10−5.
  thruster   ∫ f(θ) dθ
  '0'        2.11 · 10−5
  '1'        2.00 · 10−5
  '2'        1.91 · 10−5
  '3'        2.00 · 10−5

most of the back-flux is measured for thruster '0', since it is the emitting source. one can see that the integral deposited flux decreases with distance to the source. therefore, the equal flux distribution for the symmetrically placed thrusters '1' and '3' is reasonable. since the emitting source is not placed co-axially in the vessel, one cannot deduce where the sputtered particles originate from. in addition, the back-flux is no longer equally distributed in the poloidal direction. in fig. 5, the back-flux on the simplified inner thruster channel wall is given with respect to the depth z′ and the poloidal angle ϕ of the thruster.

figure 5. re-deposited flux in s−1 m−2 inside the four 9 cm long thruster channels within an end-to-end test setup at the lvtf-1 vessel. total re-deposited flux γ = 4.3 · 10+12 m−2 s−1.

here the z′-axis is the symmetry axis of the cylinder, where z′ = 0 cm is at the anode and z′ = 9 cm is at the thruster exit. as expected, the flux is slightly higher in the thruster exit region and decreases towards the thruster bottom, see fig. 6, which shows the measured flux summed over the poloidal angle.

figure 6. re-deposited flux along the four thruster channels.

in the poloidal direction, the distribution varies, and the angle with maximum flux, \( \varphi = \arg\max_{\varphi} \int n(\varphi, z')\,\mathrm{d}z' \), is given in each plot. thrusters '0' and '2' show approximately the same maximum angle, while for the others the angle is shifted by ±10°. this can be explained by the symmetric thruster positions within the vessel with respect to the emitting source. in total, the re-deposition distribution pattern is nearly the same for all four thrusters, due to the large vessel geometry in comparison with the thruster size, and the narrow beam-like emission distribution. for the purposes of comparison, the same simulation was carried out for a smaller vessel: the ulan facility [14] in ulm was modeled. it has a cylinder length of z_c = 2.9 m and a radius of r_c = 1.2 m.
as in aerospazio, the thruster cluster is placed co-axially in the vessel. the back-flux fractions at the exit planes are shown in fig. 7, and the integral flux fractions are given in table 2.

figure 7. calculated back-flux to four thruster channels with a total flux fraction of f = 5.7 · 10−4 at the ulan vessel.

table 2. calculated back-flux to four thruster channels with a total flux fraction of f = 5.7 · 10−4 at the ulan vessel.
  thruster   ∫ f(θ) dθ
  '0'        1.61 · 10−4
  '1'        1.42 · 10−4
  '2'        1.26 · 10−4
  '3'        1.42 · 10−4

the total back-flux fraction f = 5.7 · 10−4 is about 14 times higher than for the larger aerospazio vessel (f = 8.0 · 10−5). a clearer distinction between the four thrusters and a different back-flux pattern develop. the maximum back-flux is measured for the emitting thruster '0', while it decreases with distance from the source, as can be seen in table 2. these differences can be explained by the thruster position closer to the cylinder walls of the vessel. the back-flux for each channel is given in fig. 8.

figure 8. re-deposited flux in s−1 m−2 inside the four 9 cm long thruster channels within an end-to-end test setup at the ulan vessel. total re-deposited flux γ = 2.4 · 10+13 m−2 s−1.

here, too, the re-deposition decreases with channel depth, as expected, but more pronounced re-deposition areas with parts of practically no re-deposition build up. in comparison with the larger vessel, the maximum peak is approximately one order higher. while for thruster '0' the re-deposition is almost equally distributed within the channel, a peak builds up with increasing distance from the emitting source, resulting in the highest maximum flux for thruster '2'. also, the position of the maximum back-flux varies more in the poloidal angle ϕ. here, too, the symmetric thruster positions within the vessel with respect to the emitting source are important. these distribution characteristics correspond to observations during testing of the hemp-t (b. van reijen, personal communication, june 2014). summarizing these results, a complex re-deposition profile appears due to the non-central position of the source within the vessel. therefore, the source particles do not hit the vessel walls equally distributed in ϕ, which destroys the poloidal symmetry of the emitted flux hitting the vessel walls. in addition, the distributions of the sputtered particles overlap, which gives no poloidal symmetry of the re-deposited particles, although the test-setup has such a simple geometry. the size of the vessel not only influences the amount of re-deposited particles, but also gives a more pronounced re-deposition pattern.

6. conclusion

a monte carlo model using a ray approximation for the particles allows us to calculate the back-flux towards the thruster exit generated by sputtered particles at the vessel walls. it has shown the influence of the test set-up and the vessel size, which affects the re-deposition pattern inside the thruster channels. a non-centered emitting source leads to a complex re-deposition profile within the thruster channels. this effect can be diminished with a larger vacuum vessel, which reduces the back-flux and smoothes the re-deposition patterns inside the channel. the emission distribution of the thruster itself also plays an important role. the results represent a worst case scenario, since the emission distribution of the thruster was assumed to be beam-like and aluminum was taken as the vessel wall material. for broader emission distributions and back-flux reducing modifications, e.g., carbon walls or baffles [2], the effects are reduced. effects like secondary electron emission at the vessel walls, which influences the plume potential, collisions of propellant ions with residual gas, regions of magnetized electrons in the plume and changes in the thruster potential due to re-deposited layers inside the channel are not considered within this model. in a future modeling step, the estimated back-flux distribution can be used for simulating the erosion and re-deposition on the thruster surfaces.
this could clarify more precisely how terrestrial conditions influence the thrust measurements in total.

acknowledgements

this work was supported by the german space agency dlr through project 50 rs 1101.

references

[1] n. koch, h.-p. harmann, g. kornfeld (eds.). status of the thales high efficiency multi stage plasma thruster development for hemp-t 3050 and hemp-t 30250, iepc-2007-110. 30th international electric propulsion conference, 2007.
[2] o. kalentev, k. matyash, j. duras, et al. electrostatic ion thrusters — towards predictive modeling. contributions to plasma physics 54:235–248, 2014. doi:10.1002/ctpp.201300038.
[3] k. matyash, r. schneider, a. mutzke, et al. kinetic simulations of spt and hemp thrusters including the near-field plume region. ieee transactions on plasma science 38(9, part 1):2274–2280, 2010. doi:10.1109/tps.2010.2056936.
[4] a. genovese, a. lazurenko, n. koch, et al. (eds.). endurance testing of hempt-based ion propulsion modules for smallgeo, iepc-2011-141. 32nd international electric propulsion conference, 2011.
[5] r. shastry, d. a. herman, g. c. soulas, m. j. patterson (eds.). status of nasa's evolutionary xenon thruster (next) long-duration test as of 50,000 h and 900 kg throughput, iepc-2013-121. 33rd international electric propulsion conference, 2013.
[6] r. behrisch, w. eckstein (eds.). sputtering by particle bombardment. springer-verlag, 2007.
[7] v. serikov, k. nanbu. monte carlo numerical analysis of target erosion and film growth in a three-dimensional sputtering chamber. journal of vacuum science & technology a: vacuum, surfaces, and films 14(6):3108–3123, 1996. doi:10.1116/1.580179.
[8] c. shon, j. lee. modeling of magnetron sputtering plasmas. applied surface science 192(1–4):258–269, 2002. doi:10.1016/s0169-4332(02)00030-2.
[9] l. king. transport-property and mass spectral measurements in the plasma exhaust plume of a hall-effect space propulsion system. ph.d. thesis, university of michigan, 1998.
[10] a. rai, a. mutzke, r. schneider. modeling of chemical erosion of graphite due to hydrogen by inclusion of chemical reactions in sdtrimsp. nuclear instruments and methods in physics research b 268(17–18):2639–2648, 2010. doi:10.1016/j.nimb.2010.06.040.
[11] w. eckstein. computer simulation of ion-solid interactions, vol. 10 of springer series in materials science. springer, 1991.
[12] o. kalentev, l. lewerentz, j. duras, et al. infinitesimal analytical approach for the backscattering problem. journal of propulsion and power 29(2):495–498, 2013.
[13] aerospazio tecnologie s.r.l., http://www.aerospazio.com [2014-02-01].
[14] h.-p. harmann, n. koch, g. kornfeld (eds.). the ulan test station and its diagnostic package for thruster characterization, iepc-2007-119. 30th international electric propulsion conference, 2007.
acta polytechnica doi:10.14311/ap.2014.54.0068, acta polytechnica 54(1):68–73, 2014, © czech technical university in prague, 2014, available online at http://ojs.cvut.cz/ojs/index.php/ap

nox prediction for fbc boilers using empirical models

jiří štefanica∗, jan hrdlička
ctu in prague, faculty of mechanical engineering, department of energy engineering, technická 4, 160 00 prague 6
∗ corresponding author: jiri.stefanica@fs.cvut.cz

abstract. reliable prediction of nox emissions can provide useful information for boiler design and fuel selection. recently used kinetic prediction models for fbc boilers are overly complex and require large computing capacity. even so, there are many uncertainties in the case of fbc boilers. an empirical modeling approach for nox prediction has been used exclusively for pcc boilers. no reference is available for modifying this method for fbc conditions. this paper presents possible advantages of empirical modeling based prediction of nox emissions for fbc boilers, together with a discussion of its limitations. empirical models are reviewed, and are applied to operation data from fbc boilers used for combusting czech lignite coal or coal-biomass mixtures. modifications to the model are proposed in accordance with theoretical knowledge and prediction accuracy.

keywords: nox prediction, fbc, empirical models.

1. introduction

fluidized bed combustion (fbc) technology provides an efficient and ecological way for low quality fuel combustion. the fuel is combusted in a bed of inert material that is brought into a fluidized state by air passing through it, which leads to very intensive mixing of gases and solids inside the bed. a high degree of mixing enhances the heat and mass transfer by orders of magnitude compared to other combustion technologies. intensive heat transfer is beneficial for keeping the combustion temperature low and uniform throughout the bed. the mass transfer helps to keep combustion efficiency high even for low-quality fuels, and facilitates emission control. nitrogen oxides no and no2 (referred to as nox) are pollutant gases that cause photochemical smog, respiratory problems and damage to organisms. emissions of these gases are therefore monitored and must be kept at a minimal level. although transportation (internal combustion engines) is the major source of nox, control of nox emissions is efficient only in stationary combustion sources. under typical combustion conditions of a solid fuel, about 95 % of the total nox is in the form of no, and just 5 % is in the form of no2, which is much more noxious. emissions of nitrogen oxides are influenced by fuel properties, combustion conditions and combustor design. authors of nox prediction models for fbc usually combine a kinetic modelling approach with fbc hydrodynamic models. however, all existing models suffer from inaccuracy, overcomplexity, or both.
a much simpler approach can be found for pulverized coal combustors (pcc), where the application of empirical models leads to very simple correlations that can achieve good agreement with experiments. however, these correlations are used exclusively for pcc, and no reference is available for modifying this method for fbc conditions [1, 2].

2. theory

2.1. formation of nitrogen oxides

many authors have already written in detail about the formation of nitrogen oxides, see e.g. [2]. in general, there are three mechanisms of no formation that are generally accepted: thermal, prompt and fuel. no2 is formed through oxidation of no by ho2 radicals that are present in low temperature regions of the flame. n2o is formed from no by reaction with nco or ammonia radicals. in fbc conditions, the vast majority of no has its origin in the fuel. thermal and prompt no formation mechanisms are insignificant, due to the low temperature in a fluidized bed. a further reduction compared to pcc can often be achieved, because most of the no is reduced to n2 or n2o. homogeneous reduction occurs both in the freeboard and in the bed by reaction with co and volatiles. heterogeneous reduction takes place on the surface of devolatilized char particles inside the bed. ash and bed material can have a catalytic effect on no reduction. the low combustion temperature enhances the reduction of no to n2o (with the exception of biomass combustion, as is the case, for example, in waste combustion) [2, 3].

2.2. prediction of nox for fbc boilers

the complexity of nox chemistry and the large number of influencing parameters make an accurate prediction very difficult. the most common approach for predicting the emissions of nitrogen oxides of fbc boilers is kinetic modelling combined with a detailed fb hydrodynamic model of the bed and freeboard. recent models taking into account all occurring phenomena contain hundreds of reversible chemical reactions, and divide the bed into control volumes that can respect different flow patterns and hydrodynamics in different parts of the bed. these arrangements increase the complexity beyond acceptable limits. an undisputed advantage of kinetic models is the prediction of nitrogen oxide concentration profiles through the bed and freeboard, which can be used to identify and validate the detailed chemistry. however, this information is not necessary for predicting stack emissions [2, 4, 5].

table 1. pohl's correlation coefficients for pcc boilers.
          premixed flame   diffusion flame   staged combustion
  k1      285              340               150
  k2      1280             835               80
  k3      180              20                −30
  k4      −840             −395              100

2.3. empirical modelling approach

the main advantage of empirical models is their simplicity. the data required is usually easy to obtain through proximate and ultimate analysis of the fuel and combustion parameters. by contrast with kinetic models, there is no need to solve an extensive equation system, or to have high computation capacity available. the prediction is based on experimentally derived correlations accounting for the dependency of the emissions on the influencing parameters. the parameters that have been identified to have the largest influence, and that are used in empirical models, can be classified within three groups:

• fuel related (nitrogen content, volatile matter content, etc.)
• boiler design related (staged/unstaged combustion, extent of fuel-air mixing, etc.)
• boiler operation related (excess air, combustion temperature, etc.)

the influence of individual parameters can be observed experimentally by keeping the other parameters constant. however, this approach presumes independent effects of the parameters, and this is not necessarily valid for all fuels and combustion conditions. the main disadvantage of empirical models is uncertainty originating from a lack of input data; e.g., ash composition can promote no reduction under certain conditions, and petrographic composition can significantly influence the devolatilization and char formation process. another consideration is the extent of mixing of fuel and combustion air. to minimize the uncertainty, correct parameters must be used in the model in order to cover all important factors, and at the same time not to increase the complexity.

figure 1. pohl's correlation coefficients for pcc boilers [9].

influencing parameters not included in the input data are taken into account via constants, and their applicability determines the limitations of the model. input parameters and the selection of constants should be carefully considered. nevertheless, deviations in nox concentration can be measured in the flue gas stream due to inhomogeneity, so prediction reliability of ±50 ppm can be considered acceptable. from the models found in the literature, only pohl's and ibler's were chosen and applied to boiler data, because they assume general applicability [1, 6].

2.4. pohl's model

a simple correlation was developed by pohl et al. [1] to estimate no emissions for controlled mixing conditions (various types of pcc flames, cf. table 1):

\[ \mathrm{NO\,[ppm]} = k_1 + k_2\,\frac{N^{daf}}{1.5} + k_3\,\frac{VM}{40}\,\frac{NO_{eq}}{3200} + k_4\,\frac{FC}{60}\,\frac{NO_{eq}}{3200}, \tag{1} \]

where \( NO_{eq} \) [ppm] is the maximum emission of no provided that all fuel nitrogen converts to no, \( N^{daf} \) [%] is the nitrogen content in the combustible, VM [%] is the combustible volatile matter, and FC [%] is the fixed carbon content. \( NO_{eq} \) can be calculated from the nitrogen content in the fuel, the dry flue gas volume \( V_{fd} \) [n m3/kg], the ash content \( A^r \) [–] and the water content \( W^r \) [–], by

\[ NO_{eq}\,[\mathrm{ppm}] = \frac{2.1422\,N^{daf}\,(1 - A^r - W^r)}{V_{fd}} \cdot 10^6. \tag{2} \]

a different set of constants will presumably be needed for fbc conditions. as can be seen from figure 1, only three combustion regimes are accounted for, and other fuel-air mixing regimes are not defined. pohl's model was constructed on the basis of a wide range of experimental data from pcc boilers (diffusion flame) [1, 6–8].

2.5. ibler's model

ibler et al. [10] proposed the following correlation for predicting fuel nitrogen conversion to no:

\[ \frac{NO}{NO_{max}}\,[-] = 7\cdot 10^{-5}\,k\,c_{O_2}\,\sqrt[3]{t - 1025}, \tag{3} \]

where k [–] is a fuel related constant (ibler recommended using values of constant k between 4 and 6 for czech coals), \( c_{O_2} \) [%] is the flue gas oxygen concentration and t [k] is the combustion temperature. the predicted concentration in ppm can be calculated by multiplying the fuel nitrogen conversion by \( NO_{eq} \) from equation (2). the constant 7 · 10−5 in equation (3) represents the pcc conditions, and a different constant will presumably be needed for fbc conditions. as can be seen from equation (3), ibler's model is targeted more at combustion conditions than at fuel properties, which are characterized by the constant k only.
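for orientation, both correlations are easy to code; the sketch below (python) follows eqs. (1)–(3) as reconstructed above, with the staged-combustion column of table 1 as a default. the argument units follow the text (contents in %, v_fd in n m3/kg, ash and water as fractions, t in k); this is an illustration, not the authors' implementation.

    STAGED = {"k1": 150.0, "k2": 80.0, "k3": -30.0, "k4": 100.0}

    def no_eq(n_daf, a_r, w_r, v_fd):
        # eq. (2): no concentration if all fuel nitrogen converted to no
        return 2.1422 * n_daf * (1.0 - a_r - w_r) / v_fd * 1e6

    def no_pohl(n_daf, vm, fc, noeq, k=STAGED):
        # eq. (1), pohl's correlation
        return (k["k1"] + k["k2"] * n_daf / 1.5
                + k["k3"] * vm / 40.0 * noeq / 3200.0
                + k["k4"] * fc / 60.0 * noeq / 3200.0)

    def no_ibler(k_fuel, c_o2, t_bed, noeq, c0=7e-5):
        # eq. (3) times noeq: fractional conversion -> predicted ppm
        return c0 * k_fuel * c_o2 * (t_bed - 1025.0) ** (1.0 / 3.0) * noeq

note the different sensitivities: a bed at 850 °c (1123 k) gives a cube root of 98, so the ibler prediction reacts visibly to bed temperature, while in pohl's form the combustion state enters only through the constants.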
3. experimental

the main aim of this paper is to make an evaluation of real measured nox emissions data from two large-scale fluidized bed boilers, and to make a comparison with the nox levels predicted by pohl's model and by ibler's model.

3.1. komořany i chpp

the k3 fbc boiler with a bubbling bed at the komořany i combined power plant was used as the first reference. the lower part of the combustion chamber containing the bed is lined and contains an in-bed evaporator. the upper part contains wall and grid parts of the evaporator. the convection part, which follows the combustion chamber, contains the superheaters (primary, secondary and output) and the economizer. a tube-type air heater with a separate part for fluidization and secondary air is the last heat transfer surface of the boiler. the boiler is equipped with a bed material recirculation system as well as bed height control. the combustion process is controlled by the fluidization air flow rate and the fuel input. the steam parameters are adjusted by feed water injection before the last superheater. the steam nominal parameters are shown in table 2.

table 2. steam nominal parameters for komořany i chpp.
  steam output        125 t/hour
  steam temperature   490 °c
  steam pressure      7.3 mpa

the lignite coal used was analysed before each combustion test. the results of the analysis were coupled with the nox emissions for the model predictions. the average parameters of the coal are shown in table 3.

table 3. fuel parameters for komořany i chpp.
  lhv     13 mj/kg
  w r     28 %
  a r     25 %
  n daf   1 %

3.2. mladá boleslav chpp

the k90 fbc boiler with a circulating bed at the mladá boleslav combined heat and power plant was used as the second reference. this boiler is designed for hard coal combustion, but the recent fuel is a mixture of hard and lignite coal with the addition of biomass. the combustion chamber with a lined lower part contains the membrane-wall type evaporator. after the combustion chamber there is a cyclone for coarse particle separation. the second duct contains a membrane wall, tube and wall type superheaters and an economizer, followed by a hopper. the third duct contains a tube-type air preheater, which is the last heat transfer surface of the boiler. the steam parameters are controlled by feed water injection before the second and last superheater. the steam nominal parameters (for hard coal) are shown in table 4.

table 4. steam nominal parameters for mladá boleslav chpp.
  steam output        140 t/hour
  steam temperature   535 °c
  steam pressure      12.5 mpa

the hard coal, lignite coal and biomass that were used were analysed before each combustion test. the results of the analysis were coupled with the nox emissions for the model predictions. the average fuel parameters are shown in table 5.

table 5. fuel parameters for mladá boleslav chpp.
                hard coal   lignite coal   biomass pellets
  lhv [mj/kg]   24.31       18.77          15.23
  w r [%]       13.2        28.18          13.66
  a r [%]       11.67       6.37           4.51
  n daf [%]     0.89        1.38           1.9

4. results

experimental data from combustion tests on these boilers was taken from [11] and [12]. combustion tests were carried out in these boilers covering the combustion conditions described in table 6.

table 6. combustion parameters.
                                               komořany    mladá boleslav
  oxygen concentration after economizer [%]   3.8–4.7     4–5
  fluidized bed temperature [°c]              815–866     873
  boiler load [%]                             100         75–100
  fuel mixture by mass [%] –
  lignite coal, biomass, hard coal            100, 0, 0   40–85, 0–25, 0–50
  co concentration [mg/m3]                    79–222

4.1. pohl's model results

the coefficients for the staged combustion model were adopted as a basis for nox prediction using pohl's model for fbc boilers.
three options were explored. the first option used pohl's original model, as it was proposed by the authors in equation (1). as expected, the reliability was very low, see figure 2 (nox prediction reliability using the original pohl's method). in the second option, coefficient k1 was optimized by the least squares method for a better fit with the x = y line. the best fit, with determination index r2 = 79.43 %, was found for k1 = 19.87; see figure 3 (nox prediction reliability using pohl's method with modified constant k1). the third option incorporated the temperature and excess oxygen dependency proposed by ibler into coefficient k1:

\[ k_1 = 0.17\,c_{O_2}^{2}\,\sqrt[3]{t - 1025}. \tag{4} \]

the modified constant k1 was consecutively optimized by the least squares method for the best fit with the x = y line, see figure 4 (nox prediction reliability using pohl's method with the incorporated temperature and excess oxygen dependency). the modified version of pohl's method showed slightly better results, with determination index r2 = 79.54 %.

4.2. ibler's model results

the prediction results using ibler's model were not in good agreement with the measured data, see figure 5 (nox prediction reliability using ibler's method). to increase the reliability, the combustion constant was modified from 7 · 10−5 to 2.9 · 10−4 and the fuel constants k were optimised to the values presented in table 7, in both cases using the least squares method fitting the x = y line. see the results in figure 6 (nox prediction reliability using ibler's method with optimized constants), with r2 = 81.3 %.

table 7. values of fuel constant k.
  biomass          −2.2
  hard coal        3.1
  lignite coal 1   5
  lignite coal 2   3

5. conclusions

this paper has discussed the advantages and limitations of empirical prediction of nox emissions from fbc boilers. with careful choice of input parameters and constants, empirical modelling can be in very good agreement with experimental data while keeping the model simple and the input data easy to obtain. given the inhomogeneity of the flue gas stream, prediction accuracy of ±50 ppm can be considered reliable. for fbc conditions, neither pohl's model nor ibler's model for pcc boilers provided satisfactory results without modifications. the models were adapted by least squares methods to fit the experimental data from two fbc boilers, in komořany (combusting lignite coal) and in mladá boleslav (combusting a coal-biomass mixture). the modified pohl model for staged combustion with k1 = 19.87, which uses only fuel parameters as input data, shows relatively good agreement with the measured data, with r2 = 79.43 %. the prediction accuracy increases to r2 = 79.54 % with the adoption of a modification to coefficient k1 for temperature and excess air dependency taken from ibler's model. however, most of the predicted value originates from the other constants, so the nox prediction is limited to a quite narrow range (approx. 80–120 ppm), irrespective of the combustion conditions, and does not follow the measured trend. ibler's model, which focuses more on combustion parameters (temperature and oxygen concentration), and accounts for fuel properties by the constant k only, shows better agreement, with r2 = 81.3 % for the combustion constant 2.9 · 10−4 and fuel constants k taken from table 7. the predicted nox emissions span a much wider range (50–180 ppm), and seem to follow the experimentally observed trend well. the prediction results from the modified models are in almost all cases within the 50 ppm limit, and can be considered reliable. however, the modified ibler model has higher prediction accuracy and seems to be more suitable for fbc conditions.
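the least-squares adaptations of sections 4.1 and 4.2 reduce to one-parameter fits with closed-form solutions. the sketch below shows the idea under our simplifying assumption that only one constant is fitted at a time (the paper also fits the per-fuel constants k of table 7); the function names are ours.

    import numpy as np

    def fit_additive_k1(no_meas, no_model_rest):
        # best k1 in no = k1 + rest: the mean residual minimizes the sum of squares
        return float(np.mean(no_meas - no_model_rest))

    def fit_multiplicative_c0(no_meas, shape):
        # best c0 in no = c0 * shape (ibler's combustion constant):
        # least-squares projection of the measurements onto the model shape
        return float(np.dot(shape, no_meas) / np.dot(shape, shape))

fitting against the x = y line of figs. 2–6 is exactly this minimization of residuals between predicted and measured values.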
references

[1] j. pohl, s. chen, m. heap and d. pershing: correlation of nox emissions with basic physical and chemical characteristics of coal, proc. joint symposium on stationary combustion nox control, 1983.
[2] j. a. miller and c. t. bowman: mechanism and modeling of nitrogen chemistry in combustion, prog. energy combust. sci., vol. 15, pp. 287–338, 1989.
[3] j. kers, k. priit, a. aruniit, v. laurmaa, p. križan, l. šooš, ü. kask: determination of physical, mechanical and burning characteristics of polymeric waste material briquettes, estonian journal of engineering, vol. 16, no. 4, pp. 307–316, issn 1736-6038, 2010.
[4] d. kunii and o. levenspiel: fluidization engineering, butterworth-heinemann, 1991.
[5] a. gungor: prediction of so2 and nox emissions for low-grade turkish lignites in cfb combustors, chemical engineering journal 146, pp. 388–400, 2009.
[6] j. pohl, g. dusatko, p. orban and r. mcgraw: the influence of fuel properties and boiler design and operation on nox emissions, joint symposium on stationary combustion nox control, pp. 24-1–24-28, 1987.
[7] l. juniper and d. holcombe: formation and control of nox emissions from coal fired boilers, aie seminar on clean use of coal, 1992.
[8] p. bennet: nox prediction research report 20, final report of the cooperative research centre for black coal utilization, 2001, http://www.ccsd.biz/publications/files/rr/rr%2020%20nox%20prediction.pdf
[9] j. pohl et al.: correlation of nox emissions with basic physical and chemical characteristics of coal, proc. joint symposium on stationary combustion nox control, epri cs-3182, vol. ii, 1983.
[10] z. ibler, j. karták, j. mertlová, z. ibler: technický průvodce energetika (technical guide for power engineering; in czech), nakladatelství ben, praha.
[11] m. mák: posouzení změny paliva u fluidního kotle (assessment of a fuel change for a fluidized bed boiler; in czech), ctu in prague, faculty of mechanical engineering, department of energy engineering, diploma thesis, 2011.
[12] t. dlouhý et al.: spalovací zkoušky na fluidním kotli k3 (combustion tests on the k3 fluidized bed boiler; in czech), unpublished report.

acta polytechnica vol. 41 no. 4 – 5/2001

involvement of thermodynamic cycle analysis in a concurrent approach to reciprocating engine design

j. macek, m. takáts

a modularised approach to thermodynamic optimisation of new concepts of volumetric combustion engines concerning efficiency and emissions is outlined. levels of primary analysis using a computerised general-change entropy diagram and detailed multizone, 1 to 3-d finite volume methods are distinguished. the use of inverse algorithms based on the same equations is taken into account.

keywords: internal combustion engine, thermodynamic optimisation, simulation, configuration design.

1 introduction

ever increasing demands on efficiency and minimisation of unwanted side effects (emissions, waste heat, noise, vibrations, etc.) of engine operation may be fulfilled only if modern methods of optimisation are used as an integral part of the design process. two conditions should be respected:

• strongly limited time for development, which calls for a concurrent approach using a hierarchy of comprehensive models during all stages of development; nevertheless, these methods are fully developed for parametric design optimisation only in the sense of the terminology used in [1];
• the need for deeper configuration optimisation satisfying ever increasing competition in the prime mover market; today the reciprocating engine is king; what if it is replaced by fuel cells or by other thermal engines (including steam engines) in the future?
the base for every parametric optimisation is a properly calibrated model of sufficient breadth (i.e., comprehensiveness of the description of the engine and accessories system, or even involvement of the engine in the vehicle system) and depth (i.e., involvement of phenomena and independent co-ordinates). in all usable cases, the knowledge acquired and the computer power available limit the depth. therefore the universally valid laws of conservation have to be supplemented by empirical closures. they have a different range of validity, starting with an equation of state, fourier's or fick's laws and ending with turbulence and chemical kinetic models or even purely empirical correcting functions (e.g., rate-of-heat-release function, rohr, heat transfer coefficient correlations) or correcting factors (e.g., loss or discharge coefficients). changing the closures, the model should be calibrated for the considered class of engines or optimisation tasks. the structure of a model must be transparent, otherwise the calibration is dubious and the results are unreliable. in this case "less means more" very often. accumulation of too many closures (especially if they come from different research sources) tends to unreliability and fuzziness.

concerning configuration design, the concept under consideration should be divided into simple sub-systems and new combinations of them should be studied. this procedure gives an apparently countable but very high number of variants that must be sorted before detailed calculations are made. the description of the system elements consists in algorithms that are already developed. nevertheless, it is not suitable to apply them for all possible cases, because it consumes too much time. a preliminary sorting procedure must be applied. it is to some extent heuristic. on a higher level, some new elementary sub-systems may even be added. then the process is also heuristic at a higher level. the aim of this paper is to present an analysis of thermodynamic model division into elements and possibilities of acquiring simple first-approximation sorting aids for further synthesis, to show some unavoidable associations with engine and vehicle structure dynamics, and to stress the results of inverted algorithms for model calibration.

2 ways to improve the engine concept

analyses of combustion have shown that a compromise between the carnot limits of cycle efficiency and the temperature effect on nox formation should be sought for. homogeneous combustion without stepwise heating and compression of the burnt zones would be desirable if feasible, if firing pressure is not too high and if a lean mixture causing low specific power is used. removal of the impacts caused by a limited expansion ratio would open another well-known possibility of higher efficiency followed, however, by decreased mechanical efficiency and higher cooling losses due to the low specific power of an engine. conversely, engine downsizing by turbocharging is valuable from the standpoint of specific power, i.e., heat cooling losses and mechanical efficiency, but in most cases it requires higher piston work during gas exchange and very high firing or compression pressures. thus, most measures have an ambiguous impact if multi-criterion optimisation is taken into account.
moreover, all of them result in increased complexity of engine design and the need for better control algorithms, preferentially with adaptive and predictive capabilities. therefore, improved predictive tools are essential, and the interaction between thermodynamics and mechanical efficiency must be taken into consideration. an example of a design network involving these factors is presented in [1].

3 tools of standard and configuration thermodynamic analysis

thermodynamic simulation is based on the laws of conservation, constitutive equations and empirical closures based on goal-directed experiments. the hierarchy of models covers a wide range from an idealised thermodynamic cycle over 0d, 1d and zone approaches [2] to comprehensive system models with 3d cfd simulation. a modular approach used very widely by gamma technologies (see, e.g., [2]) has been further elaborated to 3-d in [3], [4]. the breadth of the models is inversely proportional to their depth, of course, so that all of the models mentioned may play their proper role in engine cycle analysis. the layout of a current state-of-the-art 1-d model is presented in fig. 1 (layout of a standard 1-d engine model and examples of unconventional reciprocating engines with porous medium combustion (middle) – [5] or exhaust gas heat regeneration (right) – [6]), together with examples of new engine concepts not included in a standard model – [5], [6]. a parametric optimisation – [1] – is solved fully by this model, today often with automated procedures – [2] – and interactions to engine gear mechanics – e.g., [7], [8]. this link is of utmost importance not only because of the impact of the pressure-volume dependence on mechanical losses (especially using high compression ratios and boost pressure levels for turbocharged engines) but also due to the power input for accessories, especially injection equipment (ultra-high-pressure direct injection, common rail injection) or variable valve gear timing with advanced electric actuators of rather limited mechanical efficiency. however, new configuration design tools should be sought for, because conventional tools are not adapted to the flexibility required by changes in volumetric engine layout. the general approach used is to divide complex models into elementary parts and to allow for the construction of an arbitrary set of these modules into a new concept. a "thermodynamic structure kit" has to be developed for these purposes. using procedures and data (especially substance properties), a thermodynamic structure kit may be used even for formulating the inverse algorithms suitable for evaluating experiments and calibration of models – [9]. moreover, an initial sorting-out tool should be provided.

4 simple configuration design tool

starting with the last issue, a versatile simplest model suitable not only for current tasks has been developed by computerising the old entropy (t-s) diagram approach. it is based on general reversible polytropic changes for the high-pressure part of a cycle, where the mass of gas inside a cylinder is constant. the model contains, in the most general manner (see fig. 3, left):
• polytropic compression (1–2), which can be divided (1–12 and 12–2) into heated (the only case in the figure) and cooled parts, if necessary, taking internal heat regeneration from hot components (if present) into account;
• constant-volume (isochoric) heat supply caused by regenerated heat at the end of the compression stroke (2–22, if present); this part ends when the unsteady surface temperature of the surrounding walls is achieved;
• constant-volume heat supply by combustion (22–23), optionally with simultaneous heat loss (covered by fuel heat supply) to engine components with a temperature lower than that of the gas;
• constant pressure (isobaric) combustion (23–3), optionally amended as in the two previous issues depending on the gas temperature reached after the constant-volume phase;
• polytropic combustion (heated expansion due to afterburning) (3–34), optionally amended as in the previous issues; in most cases an isothermal combustion gives a suitable approximation;
• polytropic expansion (34–4) (usually cooled), optionally amended as in the previous issue (34–344 and 344–4); this can again be divided into cooled (high gas temperature) and heated parts (the case of gas temperature below the surface temperature is presented in the figure);
• constant-volume heat rejection substituting the real gas expansion to an exhaust system; the first part of it (4–55) concerns the part of the heat stored in a hot component heat capacity, e.g., for the next change 2–22; these changes can be applied only if a suitable engine lay-out is used;
• constant-volume heat rejection, in reality provided by the exhaust process – gas expansion and scavenging (55–6, where state 6 should be identical with state 1).

fig. 1: layout of a standard 1-d engine model and examples of unconventional reciprocating engines with porous medium combustion (middle) – [5] or exhaust gas heat regeneration (right) – [6]

this model uses realistic heat capacities dependent on temperature and gas composition. heat supply from fuel chemical energy is calculated in 3 or 4 steps (i.e., 2–34), taking overall cooling losses determined by experiments or by more detailed cycle simulation into account, so detailed model calibration is provided, avoiding the very rough standard approach of using an overall correcting coefficient only. on the other hand, all limiting values are maintained at a prescribed level during calculation, using nested iterations. irreversible changes in an exhaust system, turbine and compressor are modelled in a simplified manner for turbocharged engines, to take into account changes in the energy transfer to the turbine caused by high-pressure cycle changes. the model is precise if all limiting values and the engine cooling loss are known – fig. 2. some examples of results of cycle optimisation are presented in fig. 3. the importance of this model consists in the fact that not only are quantitative results obtained during optimisation, but the shape of the t-s diagram also generates instructions for improvements. generally speaking, all measures enlarging the "entropy length" of the diagram are harmful if no internal heat regeneration at a suitable temperature difference is used. this approach therefore provides for parameters and fixes – [1] – simultaneously. nevertheless, a more profound analysis is necessary after the first reasoning has been finished.
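the computerised entropy diagram reduces to evaluating Δs point-wise along each change; a minimal sketch (python, with air and a crude placeholder cp(t) standing in for the real composition-dependent heat capacities) of a single polytropic branch is:

    import numpy as np

    R = 287.0  # j/(kg k); air as a stand-in working gas

    def cp(T):
        # placeholder temperature-dependent heat capacity, not calibrated
        return 1002.5 + 0.21 * (T - 300.0)

    def ds(T1, T2, p1, p2, n=200):
        # ideal-gas entropy change with variable cp: ds = cp dT/T - r dp/p
        T = np.linspace(T1, T2, n)
        return np.trapz(cp(T) / T, T) - R * np.log(p2 / p1)

    def polytropic_branch(T1, p1, v_ratio, n_exp, m=50):
        # (T, s) points along p v^n = const, e.g. the compression 1-2
        v = np.linspace(1.0, v_ratio, m)
        p = p1 * v ** (-n_exp)
        T = T1 * v ** (1.0 - n_exp)
        s = np.array([ds(T1, Tk, p1, pk) for Tk, pk in zip(T, p)])
        return T, s

chaining such branches with the isochoric and isobaric heat-supply steps, and iterating the polytropic exponents until the prescribed limits (firing pressure, temperature) are met, reproduces diagrams of the kind shown in figs. 2 and 3.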
the elements of a "thermodynamic structure kit" are therefore described now.

fig. 3: entropy diagram of a general cycle with heat regeneration and a turbocharger, and examples of some interesting cycles with limited pressure and temperature and comparatively high thermal efficiency if heat regeneration between 34–344 and 12–2(–22) is used (three panels annotated comp. ratio 14.0 / imep 1.600 mpa, comp. ratio 11.0 / imep 1.599 mpa, and a regenerated cycle with states 1–2–22–23–3–34–344–4–55–6; axes: entropy [j kg⁻¹ k⁻¹] vs. temperature [k])

fig. 2: entropy diagram of a real cycle with a lean mixture and an ignition amplifier (a pre-chamber), limited pressure and temperature, low nox emission, high thermal efficiency; a comparison of an idealised cycle and a real cycle (error in thermal efficiency of 0.5 %) (panel annotated comp. ratio 10.3 / imep 1.941 mpa; axes: entropy [j kg⁻¹ k⁻¹] vs. temperature [k])

5 basic "boundary condition" equations

before the concrete form of the conservation laws for a modular model is presented, the structure of the model and its boundary conditions must be described. in reality, partial differential equations describe the continuum mechanics inside an engine. the model represents the real continuum using concentrated parameters in finite volumes (fv), dividing manifolds into parts where necessary. therefore it can be described by a system of ordinary differential equations (odes). in this case the boundary conditions of the original set of partial differential equations are transformed into algebraic linking equations that occur in the right-hand side vector of the ode set. the model consists of a complicated structure of m such finite volumes with 1-d or sometimes 0-d unsteady flows, connected mutually by throttling devices (ports, valves, nozzles, diffusers, etc.) and featuring moveable walls in general. the structure itself can be described by an m×m finite-volume index matrix, in which linked finite volumes are indicated by non-zero terms. in the case of a serial model structure the index matrix is tri-diagonal. it is used for defining the extent of the sum operations over neighbouring volume fluxes in the resulting ode set. an example of a general model element (finite volume) with a 1-d co-ordinate in its axis, mass fluxes, acting pressure forces, a heat source and a power sink (due to a moveable wall) is presented in fig. 4. the key terms are those describing volumetric fluxes. the volume flux is oriented by the result of the scalar product of the velocity vector w and the outer-normal oriented surface vector a at the linking surface. only for an incompressible fluid can it be dissociated from the up-wind procedure described further.
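the index-matrix idea can be sketched in a few lines; the serial five-volume chain, the plain 0/1 entries and the helper name are illustrative assumptions (a real code would store the linking-element data, effective areas and discharge coefficients, rather than flags):

import numpy as np

M = 5                                  # number of finite volumes
link = np.zeros((M, M), dtype=int)     # m x m structure (index) matrix
for j in range(M - 1):                 # serial chain -> tri-diagonal
    link[j, j + 1] = link[j + 1, j] = 1

def neighbours(j):
    """volumes i over which the flux sums in the ode right-hand
    sides run for volume j (non-zero entries of row j)."""
    return np.nonzero(link[j])[0]

print(neighbours(0))   # -> [1]
print(neighbours(2))   # -> [1 3]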
since the compressibility of gases may not be neglected for typical flows of reciprocating engines (even if they are in most cases subsonic), the volumetric flux must be related to the upstream gas density. nevertheless, using the volumetric flux remains suitable for versatile flux formulae, and therefore it will be used in the following. the sign of the velocity (i.e., the upstream and downstream fv) is obtained basically by comparing the static pressures in neighbouring volumes (a corrective procedure taking inertia into account is outlined further). thus, using index a for the effective throat cross-section (including flow contraction), j for the fv under consideration and i for the neighbouring ones, the volumetric flux from i to j yields

$$\dot V_{j,i} = \vec w_{a\,j,i}\cdot\vec A_{j,i}\,;\qquad \dot\Phi_{j,i} = \dot V_{j,i}\left[\rho_i\,\phi_i\,\frac{1+\operatorname{sign}\dot V_{j,i}}{2} + \rho_j\,\phi_j\,\frac{1-\operatorname{sign}\dot V_{j,i}}{2}\right]. \tag{1}$$

the second equation easily determines all the fluxes $\dot\Phi$ in consideration (using the mass-specific $\phi$ value for species, momentum and energy), which are calculated using up-winding according to the sign of $\dot V$. the up-winding terms "inside" and "out-of" – the fractions containing the sign function – are denoted $\alpha_{j,i}$ and $\beta_{j,i}$, respectively, in the following. although formulae for the compressible-flow volumetric or mass flux seem to be well known, two modifications should be involved for the current case – [10]. in most cases, the port and throat length (b.c.l.) is not negligible in comparison with the fv length, but it cannot be reduced simply to the fv itself. the steady-flow energy equation ("total enthalpy conservation") can be supplemented by an approximation of an unsteady term expressing the work of local inertia forces, in a similar manner as in the case of the unsteady term in the bernoulli equation. it yields for the correcting term i, using a reduction of the active gas column lengths $l_n$ (k being the index of the total state, up meaning the upstream position),

$$h_{k\,up} - h_a = \frac{w_a^2}{2} + i\,;\qquad i = \frac{\mathrm d w_a}{\mathrm d t}\,l_{b.c.l.} + \sum_{n}\frac{\mathrm d w_n}{\mathrm d t}\,l_n\,\frac{A_a}{A_n}\,;\qquad \frac{\mathrm d w_a}{\mathrm d t} \cong k\,\frac{\mathrm d}{\mathrm d t}\sqrt{\frac{2\,(p_{k\,up}-p_a)}{\rho_{k\,up}}}. \tag{2}$$

this assumes negligible density changes due to inertia effects. the last expression uses bernoulli's approximation of the flow rate, which simplifies the derivative of the pressure function. the factor k can be solved iteratively by comparing the precise and bernoulli solutions, which is quite suitable because of the iterative essence of the whole procedure, as will be demonstrated further at equation (3). the second remark concerns the complex influence of the velocity loss coefficient $\varphi$. it is widely used for ideal (isentropic) velocity correction (see the right-hand expression in the following formulae), but without taking into consideration other impacts of kinetic energy dissipation, namely on the critical pressure ratio and the throat density. the more precise expressions yield

$$\pi_a = \max\!\left(\frac{p_a}{p_{k\,up}},\ \pi_{crit}\right),\qquad \pi_{crit} \cong \left[1-\frac{\kappa-1}{(\kappa+1)\,\varphi^{2}}\right]^{\frac{\kappa}{\kappa-1}},\qquad w_{j,i} = \varphi\,\sqrt{\frac{2\kappa}{\kappa-1}\,r\,T_{k\,up}\left[1-\pi_a^{\frac{\kappa-1}{\kappa}}\right]} - i\,; \tag{3}$$

an inertia correction term is included.

fig. 4: finite volume as an element of a modular simulation tool
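the flux evaluation of equations (1) and (3) can be sketched as follows; the upstream side is picked by pressure comparison and the loss coefficient phi modifies the critical pressure ratio. the pi_crit expression follows the reconstruction above and, like the gas data and function names, should be read as an assumption, not as the authors' code:

import math

KAPPA, R = 1.35, 287.0

def orifice_flux(p_i, t_i, p_j, t_j, area, phi=0.85):
    """signed volumetric flux from volume i to volume j, m^3/s."""
    sign = 1.0 if p_i >= p_j else -1.0            # upstream by pressure
    p_up, t_up = (p_i, t_i) if sign > 0 else (p_j, t_j)
    p_dn = p_j if sign > 0 else p_i
    pi_crit = (1.0 - (KAPPA - 1.0) / ((KAPPA + 1.0) * phi**2)) \
        ** (KAPPA / (KAPPA - 1.0))
    pi = max(p_dn / p_up, pi_crit)                # choked below pi_crit
    w = phi * math.sqrt(2.0 * KAPPA / (KAPPA - 1.0) * R * t_up
                        * (1.0 - pi ** ((KAPPA - 1.0) / KAPPA)))
    return sign * w * area

def upwind(vdot):
    """alpha ("inside") and beta ("out-of") weights of equation (1)."""
    return ((1 + math.copysign(1, vdot)) / 2,
            (1 - math.copysign(1, vdot)) / 2)

print(orifice_flux(2.0e5, 900.0, 1.0e5, 800.0, area=4e-4))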
with it, the limits for the subsonic region must be tested iteratively, using the pressure ratio limit as the first approximation and checking the resulting flow velocity against the local sonic velocity (this is calculated involving also the dissipation of kinetic energy, using a temperature term in the denominator of the density expression). problems may, paradoxically, occur if the pressure difference is very near to zero. these cases must be treated carefully. the deduced formulae are used in the linking terms of the convective fluxes involved in the right-hand sides of all conservation equations (6), (10), (11). values with index k up have to be substituted by the concrete total state upstream of the throttling device; index a means the static pressure downstream of it. in most cases the latter means the static pressure in the following fv, but in some cases it is worthwhile to differentiate between them, if a diffuser effect downstream of a bottleneck takes place. it can be computed even in the dissipative case of a borda loss.

6 conservation equations for a fv

the general form of the conservation equations is used in accordance with the literature – see, e.g., [3]. a column vector of s species is designated by $\{s\}$; weighted sums of mass-specific quantities are calculated using the scalar product written in matrix notation as a transposed (row) vector times a column one, i.e., $\{s\}^{\mathsf T}\{s\}$; $\sigma$ stands for mass fractions, $\rho$ for density, index ch for chemical reactions (primarily combustion). all chemistry is described for r reactions by the stoichiometric transformation matrix C – tab. 1 and [9]. the chemical transformation of species yields for the concentration changes, using the stoichiometric transformation matrix together with the guldberg-waage and arrhenius laws (5),

$$V\,\frac{\mathrm d\{c_s\}}{\mathrm d t}\bigg|_{chem} = {}_{s}\mathbf C_{r}\;V\,\frac{\mathrm d\{c_r\}}{\mathrm d t}\,, \tag{4}$$

$$\frac{\mathrm d c_r}{\mathrm d t} = k_r\,T^{\,b}\,e^{-E_r/(R\,T)}\,c_x^{\,x}\,c_y^{\,y}\,c_z^{\,z}. \tag{5}$$

here $k_r$ stands for a reaction constant (the individual constants create a vector $\{k_r\}$) and $E/R$ is an individual activation energy for a certain reaction. together with the exponents b, x, y, z they are determined from experiments for the reactions described by the columns of C – tab. 1. liquid fuel evaporation may be taken into consideration in a similar way; the up-wind coefficients $\alpha$, $\beta$ have already been mentioned. then the conservation of species yields

$$\frac{\mathrm d\{m_s\}_j}{\mathrm d t} = \sum_{i=1}^{n}\dot V_{j,i}\Big[\alpha_{j,i}\,\rho_i\{\sigma_s\}_i + \beta_{j,i}\,\rho_j\{\sigma_s\}_j\Big] + \frac{\mathrm d\{m_s\}_{ch,j}}{\mathrm d t}\,;\qquad \frac{\mathrm d\{m_s\}_{ch,j}}{\mathrm d t} = {}_{s}\mathbf C_{r}\,\frac{\mathrm d\{m_r\}_j}{\mathrm d t}. \tag{6}$$

the volumetric fluxes associate this equation (6) with the energy equations, since they include pressures both inside zone (fv) j and in the neighbouring fvs. in the neighbourhood there may be fictive zones as well (e.g., a plenum with steady-state pressure and temperature). before the other conservation equations are presented, the equation of state, which supplements the ode set to a solvable system, has to be transformed to a suitable form. the thermodynamic state equation can be written in the form $p = f(T, \{m_s\}, V)$ after molar-specific quantities are simply recalculated to mass-specific quantities. in the case of real gas mixtures, the empirical coefficients have to be determined. correlations for them are available (e.g., nonlinear molar-weighted sums in the case of a suitable b-w-r equation of state).
tab. 1: an example of a part of the transformation matrix C for some combustion products of octane and for nitrogen oxide formation according to zeldovich. columns describe reactions designated by the main reaction component (h2, h2o, co, co2, c8h18, n2, n·, no); rows are the species h2, h2o, co, co2, c8h18, n2, n·, no, o·, o2, and the entries are signed mass-ratio stoichiometric coefficients (e.g., −16/2, +16/18, −16/28, +16/44, −400/114, −32/14 and +32/30 in the o2 row; the full row-and-column layout of the matrix was lost in extraction).

thus,

$$p = p\big(T,V,\{m_s\}\big):\qquad \frac{\mathrm d p_j}{\mathrm d t} = \frac{\partial p}{\partial T}\,\frac{\mathrm d T_j}{\mathrm d t} + \frac{\partial p}{\partial V}\,\frac{\mathrm d V_j}{\mathrm d t} + \sum_{i=1}^{s}\frac{\partial p}{\partial m_i}\,\frac{\mathrm d m_{i,j}}{\mathrm d t}. \tag{7}$$

the basic (temperature) part of the thermal state equation, e.g., for the static enthalpy $h_{id} = f(T,\{m_s\})$, yields with equation (6) the relation between enthalpy and the thermodynamic state quantities (pressure, temperature, etc.)

$$h = h\big(T,p,\{m_s\}\big):\qquad \frac{\mathrm d h_j}{\mathrm d t} = \frac{\partial h}{\partial T}\,\frac{\mathrm d T_j}{\mathrm d t} + \frac{\partial h}{\partial p}\,\frac{\mathrm d p_j}{\mathrm d t} + \sum_{i=1}^{s}\frac{\partial h}{\partial m_i}\,\frac{\mathrm d m_{i,j}}{\mathrm d t}\,, \tag{8}$$

in which $\mathrm d T_j/\mathrm d t$ is eliminated by means of the differentiated state equation (7). the thermal effects of chemical reactions are involved automatically if the enthalpy considers reaction terms for combustible components (see the species term of equation (8), in which chemical changes may be involved). a versatile formulation of energy conservation uses the total energy (internal + kinetic $k$ + turbulent kinetic $k_t$) or the total enthalpy. then, using the differential equation of state (8), the set of odes for an arbitrary number of fvs is solvable. energy conservation thus yields for the static enthalpy

$$m_j\,\frac{\mathrm d h_j}{\mathrm d t} = \sum_{i=1}^{n}\dot V_{j,i}\Big[\alpha_{j,i}\,\rho_i\big(h_i+k_i+k_{t,i}\big) + \beta_{j,i}\,\rho_j\big(h_j+k_j+k_{t,j}\big)\Big] + \sum_{i}\alpha_i\,A_{j,i}\big(T_i-T_j\big) + \frac{\mathrm d Q_j}{\mathrm d t} + V_j\,\frac{\mathrm d p_j}{\mathrm d t} - \big(h_j+k_j+k_{t,j}\big)\frac{\mathrm d m_j}{\mathrm d t} - m_j\,\frac{\mathrm d (k_j+k_{t,j})}{\mathrm d t}. \tag{9}$$

the thermodynamic state quantities (pressure, temperature, etc.) are evaluated iteratively, combining equations (8) and (9), in the case of a mixture of real gases. heat transfer to neighbouring fvs is taken into consideration by newton's equation and by the terms $\dot Q_k$ if necessary; the turbulent kinetic energy $k_t$ may often be neglected at the current level of precision, and in the case of a big flow cross-section the same is valid for $k$. nevertheless, for use in association with experiments, where pressure measurements are often the single output, or for multi-zone models with equal pressure and different temperatures inside the fvs, it is better to write energy conservation in a form that provides the pressure derivative explicitly. this is possible in the case of an ideal gas. the equation has to be supplemented by closures concerning heat release (reaction velocities) and heat transfer. a corrective heat flux $\dot Q_{corr}$ (e.g., for heat radiation) and a corrective enthalpy flux $\dot H_{corr}$ may be used if necessary; their meaning is important as corrections in inverse algorithms. then energy conservation yields

$$\frac{V_j}{\kappa_j-1}\,\frac{\mathrm d p_j}{\mathrm d t} = \sum_{i=1}^{n}\dot V_{j,i}\Big[\alpha_{j,i}\,\rho_i\big(h_i+k_i\big) + \beta_{j,i}\,\rho_j\big(h_j+k_j\big)\Big] + \sum_{i}\alpha_i\,A_{j,i}\big(T_i-T_j\big) - \frac{\kappa_j}{\kappa_j-1}\,p_j\,\frac{\mathrm d V_j}{\mathrm d t} + \frac{\mathrm d Q_j}{\mathrm d t} + \dot Q_{corr} + \dot H_{corr} - \{h\}_j^{\mathsf T}\,\frac{\mathrm d\{m_s\}_j}{\mathrm d t} - \frac{\mathrm d\big(m_j(k_j+k_{t,j})\big)}{\mathrm d t}. \tag{10}$$
in the case of small cross-sections and big velocities, momentum conservation has to be considered. this vector equation can be written for a concrete coordinate system in the form of several equations, the number of which respects the number of coordinates. thus, in the case of cartesian coordinates ($\tau$ stands for friction losses at walls; in this case another closure equation is needed for them, e.g., stokes's and reynolds's hypotheses, and a turbulence model):

$$\frac{\mathrm d\big(m_j w_j\big)}{\mathrm d t} = \sum_{i=1}^{n}\dot V_{j,i}\Big[\alpha_{j,i}\,\rho_i\,w_i + \beta_{j,i}\,\rho_j\,w_j\Big] + \sum_{i=1}^{n} p_i\,A_{j,i} - \tau_j\,A_{w,j}. \tag{11}$$

it is worth mentioning that this is not necessary for 1-d problems when velocities are low, unlike in 2- and 3-d cases, where energy and mass conservation no longer give a complete set of equations and momentum conservation has to be used in any case. in the case of curvilinear coordinates, appropriate inertia terms must be added.

7 modularised structure of the code

the general fv cfd methods have thus been re-applied to the 1-d or 0-d engine models. unlike in full cfd approaches, the conditions at the fv boundaries must be described by 1-d gas dynamics expressions. the description of large engine structures comes into these models by connection through convective terms, describing the engine structure by means of the structure matrix of the whole simulated system. closures may be involved in this open system according to the applicator's needs. they can be supplemented by inertia corrective terms, but in this case an iterative solution of odes with implicitly involved derivatives is needed. because of the danger of stiffness, implicit, variable-step or even iterative ode solvers have to be applied. in the latter case there is no problem in implementing inertia corrections. code subroutines for the ode right-hand sides have been programmed as an open system in which closures may be implemented. function tests for some typical examples have been done. other implementations and – based on them – automatic structuring of the source code applied to a concrete system will be done in the future.

8 model calibration based on experiments

the concept of procedures inverted to the described analytical algorithms is used for engine experiment evaluation (rate-of-heat-release, nox formation, valve aerodynamic parameters, turbine efficiency, etc.). the main advantage of well-structured, transparent algorithms for thermodynamic calculations is the fact that they can be inverted, and a measured quantity pattern can be used for computing other values (in most cases the pressure traces inside a cylinder or manifold create the source data). concerning rate-of-heat-release (rohr), the aim is to find the main reaction component of fuel combustion, i.e., one of the components $\mathrm d\{m_s\}/\mathrm dt$ in a certain reaction. the other $\mathrm d\{m\}/\mathrm dt$ for the chosen reaction are coupled by stoichiometry, using only the appropriate column of the ${}_s\mathbf C_r$ matrix – [9], i.e., the transformation vector $\{s;\,r = main\,c\}$ of the chosen reaction is used. other (side) reactions are calculated according to the model chosen. the absolute values of the correction terms $\dot Q_{corr}$ and/or $\dot H_{corr} = \dot m\,h$ in the pressure differential equation should be minimised if the main reaction component is correct and other influences (i.e., leakage) are small. this procedure is typical for the rohr computations used to calibrate model closures (see the example in fig. 5).
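as a sketch of how the stoichiometric matrix C and the arrhenius law (5) enter the species balance (6) in such a modular right-hand side, consider the following fragment; the two-species "chemistry" and all rate constants are illustrative stand-ins, not values from the paper:

import numpy as np

# one reaction, two species: a trivial stand-in for the matrix of tab. 1
C = np.array([[-1.0],    # "fuel" consumed, kg per kg of the main component
              [+1.0]])   # "product" formed

def reaction_rate(t, c_fuel, k0=2.0e3, b=0.0, e_over_r=1.2e4):
    """guldberg-waage / arrhenius law of eq. (5), first order in fuel."""
    return k0 * t**b * np.exp(-e_over_r / t) * c_fuel

def species_rhs(m_s, t, v, conv_flux):
    """eq. (6): precomputed convective (up-winded) fluxes plus the
    chemical transformation term C * dc_r/dt scaled by the volume."""
    c_fuel = m_s[0] / v                       # mass concentration, kg/m^3
    dc_r = np.array([reaction_rate(t, c_fuel)])
    return conv_flux + v * (C @ dc_r)

print(species_rhs(np.array([0.02, 0.0]), t=1800.0, v=5e-4,
                  conv_flux=np.zeros(2)))

inverting the same relation, as described above, means prescribing the measured pressure trace and solving for the single unknown column of reaction rates instead.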
after it, the enthalpy correction term may thus be attributed to the product of the known specific in-cylinder enthalpy and an unknown mass flux caused by piston-ring leakage. then it can be used during combustion as a correction. in other cases, e.g., during charge exchange, this term is useful in obtaining the essential mass flux in the valves. the flow through a single set of valves and the appropriate discharge coefficient may be checked during pure inlet or exhaust. another way is to substitute the complete pressure equation by the measured pressure and to calculate only the rest of the system described by the other equations. this procedure can be used to analyse more complicated problems (e.g., pre-chamber combustion, where mass flux and heat release are coupled), to avoid amplification of errors caused by an imperfect part of the system model. examples of the rohr traces evaluated from experiments on a natural-gas, spark-ignition engine are shown in fig. 5. they create a suitable base for the extrapolation of engine parameters during design optimisation. some supplementary measured data (fuel and air flow rates) and amending assumptions concerning the temperature at the start of compression had to be used during the evaluation. it is important that these supplementary conditions are again based on the same equations during both the evaluation and the simulation processes.

conclusions

an efficient way to maximum use of internal combustion engine potential involves a combination of thermodynamic analysis and engine gear modelling, starting with an idealised cycle in the case of configuration design. the computerised entropy diagram provides an initial procedure for the thermodynamic optimisation. the advanced methods (1-d, cfd) must be properly structured and equipped with suitable interfaces, which enable the user to describe a complicated engine structure. nevertheless, even these advanced methods always contain some empirical closure equations. methods for model calibration based on experiments are therefore essential and useful tools, if the differential equations of the simulation model are used and solved by an inverted algorithm.

fig. 5: evaluation of in-cylinder pressure records for a spark-ignition, natural-gas fuelled engine at different ignition advance, using a measured pressure signal and a measured mean mixture flow rate, supplemented by a simulated temperature at the start of compression

these models are presented here, creating a hierarchical structure from the simple algebraic model, providing transparent and instructive results, to the most complicated models, which can be used for final optimisation. the modular structure of the latter models enables an appropriate model of an unconventional engine layout to be constructed.

acknowledgements

this research has been supported by research project msm 212200008 of the ministry of education, czech republic.

references
[1] macek, j., valášek, m.: initial methodology for concurrent engineering. in: "proc. of first international conference on advanced engineering design" (editors: fiala, p., smrček, l.), prague, čvut 1999, pp. 286–290, isbn 80-01-02055-x
[2] ciesla, ch., keribar, r., morel, t.: engine/powertrain/vehicle modeling tool applicable to all stages of the design process. sae paper 2001 (in review)
[3] macek, j., steiner, t.: advanced multizone multidimensional models of engine thermoaerodynamics. in: "21st intl.
congress on combustion engines" (editor m. k. eberle), london, cimac, 1995, pp. d10/1–d10/18
[4] macek, j., polášek, m.: advanced eulerian multizone model – versatile tool in moveable boundary problem modeling. in: "8th international symposium on computational fluid dynamics", proceedings on cd-rom, bremen, zarm 1999, pp. 1–20
[5] durst, f., weclas, m.: a new type of internal combustion engine based on the porous medium combustion technique. journal of automobile engineering 2001 (in press)
[6] macek, j.: compression or heat regeneration in the cycle of reciprocating engine – new results of old methods. in: "koka 2000" (editor hlavňa, v.), žilina, technical university of žilina, 2000, pp. 117–122, isbn 80-7100-736-6
[7] macek, j.: a simplified model of a valve gear with friction. in: "sborník z xxx. mezinárodní konference" (editors macek, j., baumruk, p.), praha, čvut 1999, pp. 17–27, isbn 80-01-01972-1
[8] macek, j., remek, b.: simplified model of losses in crank gear. in: "motorsympo '99" (editors: macek, j., kroebl, l.), praha, čvut, fsi, 1999, pp. 11–21, isbn 80-01-01985-3
[9] macek, j., kozel, k., baumruk, p., takáts, m., hatschbach, p., bendl, f.: computational fluid dynamics or zone models – pros and cons in ic engine research. in: "world automotive congress of fisita 1996" (editors m. apetaur, m. hanke), praha, čsat, 1996, technical papers on cd-rom, pap. p14.02, pp. 28
[10] macek, j., polášek, m.: general method for modeling of quasi-1d subsonic flows and its application in cfd boundary conditions. in: "topical problems of fluid mechanics 2000" (editors: kozel, k., příhoda, m.), prague, academy of sciences of the czech republic, institute of thermomechanics, 2000, pp. 57–60, isbn 80-85918-55-2

prof. ing. jan macek, drsc., phone: +420 2 2435 2504, fax: +420 2 2435 2500, e-mail: macek@fsid.cvut.cz
doc. ing. michal takáts, csc., phone: +420 2 2039 5127, fax: +420 2 2435 2500, e-mail: takats@fsij.fsid.cvut.cz
josef božek research centre, czech technical university in prague, faculty of mechanical engineering, technická 4, 166 07 praha 6, czech republic

1 list of symbols
chp combined heating and power plant
gt gas turbine cycle
st steam turbine
whb waste heat boiler
c condenser
he heat exchanger
hp heat pump
ṁfwh mass flow of secondary geothermal water used for heating feed water
ṁdh mass flow of secondary geothermal water used in the dhn
ṁscgh mass flow of the secondary carrier of geothermal heat

2 introduction

revolutionary structural changes will be needed as worldwide energy consumption increases with rising populations and higher living standards. in response to the higher impacts on the environment (especially the threat of climate change caused by the greenhouse effect) and decreasing reserves of fossil fuels, other sources will be needed. the utilization of renewable sources is expected to increase above all between 2010–2040. in this time period, the proportion of power supplied from renewable sources will probably rise from 5 % to approximately 35 %. around 2050 the proportions from non-renewable and renewable sources will be equal, and after 2060 the proportion of power from renewable sources will be about 40 %. for slovakia, which is extremely poor in fossil fuels, the prospect of such an evolution should provide strong motivation for more intensive utilization of sustainable energy sources, especially solar and geothermal energy.
their current usage (6 % and 2 %) is much lower than it could be. while solar energy is suitable for use in projects in the range of some kw, due to its low concentration, the power output from geothermal projects is significantly higher, and can be used for major projects. the plan for the geothermal project in košice is to utilize about 100 mw of heat output from 8 geothermal doublets. about 2500 tj of heat could be obtained in this way. this is undoubtedly one of the boldest initiatives of its type in the world today.

3 feasibility of utilizing geothermal heat directly in the district heating network (dhn)

the simple, direct use of geothermal heat in the district heating network (dhn) cannot be a feasible way of replacing the existing chp plant under the current economic circumstances, due to its low efficiency. the main reason for this low efficiency is the conflict between geothermal heat and heat produced in cogeneration, which means that the heat produced by the cogeneration unit would be replaced by geothermal heat. limiting the cogenerated heat production would cause a decrease in power production, which plays a key role in the efficiency of this unit. therefore this alternative must be rejected. the energy policy of the government, which supports the utilization of renewable sources, accepts the plan to establish a cogeneration unit in 2004–2005 to replace a part of the chp plant that is reaching the end of its working life. cogeneration based on a gas-steam cycle is one of the most efficient ways of converting fossil fuels to other profitable energy forms, and for this reason it is one of the priorities in the energy policy. its efficiency is based on the longest possible utilization in the course of a year, as in the case of geothermal energy. due to the low demand for heating in the summer period, it is impossible to fulfil this condition. the way to improve the efficiency of the geothermal project in košice is by changing the basic philosophy. it is also important to reduce investment costs and to decrease the amount of geothermal heat used in a year.

a geothermal energy supported gas-steam cogeneration unit as a possible replacement for the old part of a municipal chp plant (teko)

l. böszörményi, g. böszörményi

the need for more intensive utilization of local renewable energy sources is indisputable. under the current economic circumstances their competitiveness in comparison with fossil fuels is rather low, if we do not take into account environmental considerations. integrating geothermal sources into combined heat and power production in a municipal chp plant would be an excellent solution to this problem. this concept could lead to an innovative type of power plant – a gas-steam cycle based, geothermal energy supported cogeneration unit.
keywords: geothermal energy, combined heat and power plant, heat pump, waste heat boiler, district heating network.

4 integrating the geothermal potential of the košice basin into the combined heat and power production system in teko

motivated by a vision of supplying energy for košice with a major proportion of clean and renewable energy, which could qualify for membership of the prestigious "energie-cités – association of european municipalities for a local sustainable energy policy" network, the authors of this paper sought a feasible concept in which geothermal energy would be utilized in a system of combined heat and power production, rather than only directly in the dhn. the combined heat and power production would not be limited but would be supported by geothermal heat. this solution would be advantageous for the two companies teko and geoterm, and also for the end users. the initial results were published in summarized form. the aim of these publications was to draw the attention of engineers and above all of business people to the fact that the low efficiency of only direct utilization of geothermal energy in the dhn will not be compensated for by the increasing price of fossil fuels. approximately 60 % of the cost of geothermal heat is accounted for by energy costs, and the increasing cost of fossil fuels would influence all its components, so the desired effect would not be achieved. moreover, the current rising price of electricity increases the advantages of cogeneration, but reduces the advantages of geothermal heating. for this reason, the use of hybrid chp plants with the simultaneous use of fossil and geothermal sources was considered. this can be done using existing machinery, but a special design for the planned gas-steam cycle seems to be desirable. a gas-steam cycle based cogeneration unit with integrated support of geothermal energy could be implemented to supply the city with heat.

5 the concept of a gas-steam cogeneration unit supported by geothermal energy

there are many possible technical solutions of the gas-steam cogeneration source with integrated support of geothermal energy. in this concept geothermal heat is used mainly for heating the feed water in the steam cycle, so condensation production of electricity should be dominant in each proposal. the highest energy efficiency and the lowest environmental pollution can be achieved by the concept illustrated schematically in fig. 1. in this proposal the stream of secondary geothermal water ṁscgh will be divided into two parts in teko:
• ṁhfw for heating the feed water (condensate) in the heat exchanger of the steam cycle,
• ṁdh for direct utilization in the dhn.
because geothermal heat is more efficient for heating the feed water, the flow ṁhfw should be higher than ṁdh. after mixing the returning flows, which have been cooled down to different degrees, the heat will be pumped from the resulting flow, using the heat pump hp, for indirect utilization in the dhn.

fig. 1: principal technological scheme of the gas-steam cogeneration unit with geothermal support

the heating water in the cycle of the heat pump's condenser can be overheated as required in the waste heat boiler whb. in the evaporator of the heat pump the secondary geothermal water can be cooled down to such an extent that it can be used for cooling the condenser c of the steam turbine st.
in condenser c the secondary geothermal water will be heated by 8 to 10 k by a part of the heat loss. finally, in the waste heat boiler the secondary geothermal water can be heated by the heat losses from the outgoing flue gases to the same temperature as is assumed in the case of only direct utilization in the dhn. the flue gases can be cooled down in this way below the dew temperature, and in addition to the sensible heat the latent heat of the flue gases can also be recovered. in this way not only the thermal pollution load but also the emission load of the exhaust will be reduced, because the moisture will clean the outgoing flue gases. the heat losses of the waste heat boiler and the condenser, which constitute a substantial part of the heat losses, will be accumulated through the geothermal water being reinjected into the earth's crust. the parameters of the steam cycle will be adjusted to the processes analyzed above. the electric power of the steam turbine will also be determined mainly by the amount of condensate, and thus indirectly by the stream of the secondary geothermal water used for heating the feed water. taking into account the anticipated abundance of geothermal doublets, such a stream is expected that at least two pressure levels of superheated steam must be dealt with. there are other possible ways of designing the gas turbine, depending on the degree of overheating. in the extreme case of a gas-steam cycle without overheating, which is the most efficient variant, the power output of the gas cycle will be unnecessarily high, considering the current aims for the utilization of geothermal heat. for this reason it is important to turn our attention to overheating. the degree of overheating should be adjusted to the selected gas turbine. the ideal power output is when no additional air for combustion needs to be overheated. it is important to pay attention to the concept of the waste heat boiler, which will be different from the standard double-pressure design, with a (probably) high degree of overheating by the last heat exchanger, where the flue gases will be cooled to below the dew point. the exiting of the flue gases into the atmosphere must also be adjusted to it. the heat pump integrated into the combined heat and power production system, which pumps the heat from the returning secondary geothermal water into the chp, plays a key role. if the parameters of the heat pump are chosen properly, its application will be effective, because the heating and cooling output would be used all year round. these parameters should therefore be chosen in such a way that the heat pump can supply the whole city with heat for domestic warm water production.

6 simplified analysis of winter period operation of a geothermal energy supported gas-steam cycle based cogeneration unit

as shown in the scheme in fig. 1, the operation of the gas-steam cycle based cogeneration source was analyzed to compare the effects of the possible ways of utilizing geothermal energy with the original, still officially supported idea, which involves direct feeding into the dhn. the main parameters of the analysis were the following:
• geothermal heat is distributed by secondary water to teko from 6 doublets. if one doublet produces 60 kg/s of mains water and the streams on the primary and secondary sides of the heat exchangers of the transmission line are equal, there is a total of 360 kg/s of secondary geothermal water. the input temperature into teko is approximately 120 °c.
• the stream of secondary mains water is divided in the ratio 2:1. this means that in the case of year-round operation 240 kg/s of water will be used for heating the feed water (from 4 doublets) and 120 kg/s of water will be used directly in the dhn.
• parameters of the high-pressure steam – pressure: 10 mpa, temperature: 500 °c, stream: 160 kg/s
• parameters of the low-pressure steam – pressure: 0.5 mpa, temperature: 300 °c, stream: 80 kg/s
• condensation pressure: 0.004 mpa
• parameters of the primary water in the dhn: 120 °c/50 °c
• heat output of the heat pump: 45 mw
• temperature of the secondary water at the output of teko: 60 °c

these parameters indicate that it would be possible to obtain about 245 mw of electric power from the steam cycle, and the total heat output of the source would be from 190 to 200 mw. the total electric power of the source would depend on the selected gas turbine and on the degree of overheating, which is relatively independent of the utilization of geothermal heat. it is in principle possible to imagine a variant in which the cycle of the steam turbine would operate in the interval of the basic load and the gas cycle would be used as a source for the regulation of electric power. this source would provide geoterm with 2500 tj/year from 6 doublets instead of 8, which means a saving on investment costs of about 14 mil. usd, and also lower distribution costs. these savings could be invested in teko. this variant seems more advantageous for teko, which would pay for about 2500 tj/year but would in fact be able to utilize 4000 tj/year as heat (the difference represents the heat losses leaving the condenser of the steam cycle and the waste heat boiler) and more than 1000 tj/year as cooling capacity. moreover, the company would decrease its emissions, and reduce its impact on the environment. world bank studies have shown that the following savings can result from decreasing emissions and reducing environmental damage (tab. 1).

table 1: savings on investments by decreasing emissions
type of emission – environmental profit [usd/t]
co2 – 48
nox – 1100
sox – 2200
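a back-of-envelope check of the stream split and of the yearly heat figures quoted above, assuming plain water heat capacity and the stated 60 °c common return temperature; the implied utilisation factor is an inferred quantity, not stated in the source:

CP = 4.19e3                               # J/(kg K), water

def q_mw(mdot, t_in, t_out):
    return mdot * CP * (t_in - t_out) / 1e6

q_fwh = q_mw(240.0, 120.0, 60.0)          # feed-water heating branch
q_dhn = q_mw(120.0, 120.0, 60.0)          # direct dhn branch
total = q_fwh + q_dhn
print(q_fwh, q_dhn, total)                # ~60.3 + 30.2 = ~90.5 mw

tj_full_year = total * 1e6 * 8760 * 3600 / 1e12
print(tj_full_year)                       # ~2854 tj if run all year
print(2500.0 / tj_full_year)              # ~0.88 implied utilisation

the ~90 mw total agrees well with the 6-doublet supply, and the 2500 tj/year figure then corresponds to a high but plausible year-round utilisation, consistent with the paper's argument for using the heat pump all year.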
the main aim of the energy policy of the european union is to increase the proportion of renewable energy source utilization from 6 % to 12 % in 2010, and to reduce the emissions of greenhouse gases by 8 % in comparison with 1990. slovakia has serious ambitions to join the eu, and will have to participate in this programme. if any version of the geothermal project is realized, it will form a major part of slovakia’s renewable clean energy programme. the interest of slovak engineers in the new concept is surprisingly low, paradoxically lower than that of engineers in foreign countries. there has been no adequate response from the state authorities dealing with regional energy and environmental policy was presented. as a result of this neglect, no local funding has yet been made available for this work. references [1] böszörményi, l.: možnosti zlepšenia ekonomických ukazate� ov košického geotermálneho projektu. konferencia s medzinárodnou účas�ou „podnikanie v energetike“. dt zsvts košice, košice 1998. [2] böszörményi, l.: úvahy o využívaní hydrogeotermálneho potenciálu košickej kotliny pri kombinovanej výrobe tepla a elektriny. časopis ee, 5, 1999, č. 6 [3] böszörményi, l.: geotermálna podpora kombinovanej výroby elektriny a tepla. seminár kogeneračné zdroje, dt zsvts bratislava, bratislava, 2000 [4] böszörményi, l.: optimierte geothermienutzung bei der gekoppelten stromund wärmeerzeugung in einer gud-anlage. international conference world sustainable energy day 2000, wels/austria, 2000 [5] böszörményi, l.: kraft-wärme-kälte-kopplung mit geothermischer unterstützung. vdi-berichte 1594, vdi verlag gmbh, düsseldorf, 2001 [6] geotermálna energia pre centrálne zásobovanie teplom v meste košice. geoterm košice, košice, 1999 [7] riešenie náhrady zastaralých zdrojov tepla v teko košice. výskumný ústav energetický egú bratislava, bratislava, 1996 [8] vdi nachrichten magazin. sonderbeilage zur expo 2000, vdi verlag gmbh, düsseldorf, 2000 doc. ing. ladislav böszörményi, csc. phone: 00421-95-6024241 fax.: 00421-95-6321558 e-mail: boszla@ccsun.tuke.sk department of building structures technical university of košice faculty of civil engineering vysokoškolská 4, 042 01 košice, slovak republic ing. gabriel böszörményi e-mail: g.boszormenyi@sh.cvut.cz department of fluid dynamics and power engineeringdivision of compressors, refrigeration and hydraulic machines czech technical university in prague faculty of mechanical engineering technická 4, 166 07 praha 6, czech republic © czech technical university publishing house http://ctn.cvut.cz/ap/ 17 acta polytechnica vol. 41 no. 6/2001 acta polytechnica doi:10.14311/ap.2013.53.0890 acta polytechnica 53(6):890–894, 2013 © czech technical university in prague, 2013 available online at http://ojs.cvut.cz/ojs/index.php/ap effect of slip velocity on the performance of a short bearing lubricated with a magnetic fluid rachana u. patel, g. m. deheri∗ department of mathematics, sardar patel university, vallabh vidyanagar – 388 120, gujarat state, india. ∗ corresponding author: gm.deheri@rediffmail.com abstract. this paper aims at analyzing the effect of velocity slip on the behavior of a magnetic fluid based infinitely short hydrodynamic slider bearing. solving the reynolds’ equation, the expression for pressure distribution is obtained. in turn, this leads to the calculation of the load carrying capacity. further, the friction is also computed. it is observed that the magnetization paves the way for an overall improved performance of the bearing system. 
however, the magnetic fluid lubricant fails to alter the friction. it is established that the slip parameter needs to be kept at a minimum to achieve better performance of the bearing system, although the effect of the slip parameter on the load carrying capacity is, in most situations, negligible. it is found that for large values of the aspect ratio, the effect of slip is increasingly significant. of course, the aspect ratio plays a crucial role in this improved performance. lastly, it is established that the bearing can support a load even in the absence of flow, which does not happen in the case of a conventional lubricant.

keywords: short bearing, magnetic fluid, slip velocity, load carrying capacity.

1. introduction

pinkus and sternlicht [1] presented an analysis for the hydrodynamic lubrication of slider bearings. exact solutions of the reynolds' equation for slider bearings with several film geometries have been treated in numerous books and research papers (cameron [2], archibald [3], lord rayleigh [4], charnes and saibel [5], basu, sengupta and ahuja [6], majumdar [7], hamrock [8], gross, matsch, castelli, eshel, vohr and wildmann [9], prakash and vij [10]). patel and gupta [11] considered the effect of slip velocity on the hydrodynamic lubrication of a porous slider bearing. they showed that velocity slip decreased the load carrying capacity. all these above studies considered conventional lubricants. agrawal [12] dealt with the configuration of prakash and vij [10] with a magnetic fluid lubricant, and found that the performance was better than with a conventional lubricant. bhat and deheri [13] modified and extended the analysis of agrawal [12] by considering a magnetic fluid based porous composite slider bearing with its slider consisting of an inclined pad and a flat pad. bhat and deheri established that the magnetic fluid increased the load carrying capacity, did not affect the friction, decreased the coefficient of friction, and shifted the centre of pressure towards the inlet. patel et al. [14] analyzed the performance of a magnetic fluid based infinitely short bearing. it was shown that the magnetization sharply increased the load carrying capacity. the friction remained unchanged due to magnetization. prajapati [15] investigated the performance of a magnetic fluid based porous inclined slider bearing with velocity slip, and concluded that the magnetic fluid lubricant minimized the negative effect of the velocity slip.

figure 1. configuration of the bearing system.

recently, the hydrodynamic lubrication of short bearings has been subjected to investigations in patel et al. [23], vakis and polycarpou [24] and patel and deheri [25]. the present study discusses the performance of a magnetic fluid based short bearing system with slip effect, while the magnitude of the magnetic field is represented by a cosine function.

2. analysis

figure 1 shows the configuration of the bearing system, which is infinitely short in the z-direction. the slider runs with uniform velocity u in the x-direction. the length of the bearing is l, and the breadth b is in the z-direction, where b ≪ l. the pressure gradient ∂p/∂x can be neglected, because the pressure gradient ∂p/∂z is much larger as a consequence of b being very small. the magnetic fluid is a suspension of solid magnetic particles, approximately 3–10 nanometers in diameter, stabilized by a surfactant in a liquid carrier.
with the help of an external magnetic field these fluids can be confined, positioned, shaped and controlled as desired. for details, see bhat [22]. the magnetic field is taken to be oblique to the stator, as in agrawal [12]. following bhat [22] and prajapati [16], the magnetic field is taken as

$$\bar H = \big(H(z)\cos\varphi,\ 0,\ H(z)\sin\varphi\big)\,;\qquad \varphi = \varphi(x,z), \tag{1}$$

where the inclination angle of the magnetic field is described by the partial differential equation

$$\cot\varphi\,\frac{\partial\varphi}{\partial z} + \frac{\partial\varphi}{\partial x} = \frac{1}{H}\,\frac{\mathrm d H}{\mathrm d z}. \tag{2}$$

in view of the deliberations carried out in prajapati [16], verma [17] and bhat and deheri [18], the magnitude of the magnetic field is assumed to be of the form

$$H^2 = k\,b^2\cos\frac{\pi z}{b}\,,$$

where k is chosen to suit the dimensions of both sides and the strength of the magnetic field. under the usual assumptions of hydrodynamic lubrication, and employing the beavers and joseph [19] model for slip, the governing reynolds' equation (agrawal [12], prajapati [16], patel et al. [14]) turns out to be

$$\frac{\mathrm d^2}{\mathrm d z^2}\left(p - \frac{\mu_0\bar\mu H^2}{2}\right) = \frac{6\mu u\,(2+sh)}{h^3\,(4+sh)}\,\frac{\mathrm d h}{\mathrm d x}\,, \tag{3}$$

where $\mu_0$ is the magnetic susceptibility, $\bar\mu$ is the free space permeability, $\mu$ is the lubricant viscosity and m is the aspect ratio. the associated boundary conditions are

$$p = 0\ \text{at}\ z=\pm\frac b2\qquad\text{and}\qquad \frac{\mathrm d p}{\mathrm d z}=0\ \text{at}\ z=0. \tag{4}$$

the expression for the pressure distribution is obtained by integrating equation (3) with the boundary conditions (4), as

$$p = \frac{\mu_0\bar\mu k b^2}{2}\cos\frac{\pi z}{b} - \frac{3\mu u\,m}{l\,h_2^2\,t^3}\;\frac{2+s h_2 t}{4+s h_2 t}\;\frac{2+2 s h_2 t+s^2 h_2^2 t^2}{(1+s h_2 t)^2}\left(z^2-\frac{b^2}{4}\right), \tag{5}$$

where $t = 1 + m(1-x/l)$, and the aspect ratio m comes from $m = (h_1-h_2)/h_2$. introduction of the dimensionless quantities

$$X=\frac xl,\quad \bar p=\frac{h_2^3\,p}{\mu u b^2},\quad \mu^*=-\frac{h_2^3\,k\,\mu_0\bar\mu}{\mu u},\quad Y=\frac yh,\quad Z=\frac zb,\quad S=s h_2$$

leads to the expression for the non-dimensional pressure distribution, obtained as

$$\bar p = \frac{\mu^*}{2}\cos\pi Z - \frac{3 m h_2}{l\,T^3}\;\frac{2+ST}{4+ST}\;\frac{2+2ST+S^2T^2}{(1+ST)^2}\left(Z^2-\frac14\right). \tag{6}$$

then the load carrying capacity per unit width is determined from

$$w = \frac{\mu_0\bar\mu k b^2 l}{\pi} + \frac{\mu u b^2}{2 h_2^2}\bigg(\frac{19}{16}\,s^2h_2^2\ln(m+1) - \frac34\,s h_2\,\frac{m}{m+1} + \frac12\,\frac{m(m+2)}{(m+1)^2} + \frac{5}{144}\,s^2h_2^2\ln\frac{4+sh_2(m+1)}{4+sh_2} - \frac{11}{9}\,s^2h_2^2\ln\frac{1+sh_2(m+1)}{1+sh_2} - \frac13\,s^2h_2^2\Big(\frac{1}{1+sh_2}-\frac{1}{1+sh_2(m+1)}\Big)\bigg). \tag{7}$$

thus, the dimensionless load carrying capacity of the bearing system comes out to be

$$\bar w = \frac{h_2^3\,w\,\pi}{\mu u b^4} = \pi\int_{-1/2}^{1/2}\!\!\int_0^1 \bar p(X,Z)\,\mathrm dX\,\mathrm dZ = \frac{\mu^* l}{b\pi} + \frac{h_2}{2b}\bigg(\frac{19}{16}\,S^2\ln(m+1) - \frac34\,S\,\frac{m}{m+1} + \frac12\,\frac{m(m+2)}{(m+1)^2} + \frac{5}{144}\,S^2\ln\frac{4+S(m+1)}{4+S} - \frac{11}{9}\,S^2\ln\frac{1+S(m+1)}{1+S} - \frac13\,S^2\Big(\frac1{1+S}-\frac1{1+S(m+1)}\Big)\bigg). \tag{8}$$

the frictional force f per unit width of the lower plane of the moving plate is obtained as

$$f = \int_{-1/2}^{1/2}\bar\tau\,\mathrm dZ, \tag{9}$$

where

$$\bar\tau = \frac{h_2}{\mu u}\,\tau \tag{10}$$

is the non-dimensional shearing stress, while

$$\tau = \frac{\mathrm d p}{\mathrm d z}\left(y-\frac h2\right) + \frac{\mu u}{h}. \tag{11}$$

a little computation indicates that

$$\bar\tau = \frac{\mu^*\pi b}{2}\sin(\pi Z)\,T\,\frac{2+ST}{1+ST}\left(Y-\frac12\right) - \frac{6 m h_2 Z}{l}\,\frac{(2+ST)^2\,(2+2ST+S^2T^2)}{T\,(1+ST)^3\,(4+ST)}\left(Y-\frac12\right) + \frac{1+ST}{T\,(2+ST)}\,, \tag{12}$$

where $T = 1 + m(1-X)$. at y = 0 (moving plate), one computes that

$$\bar\tau = -\frac{\mu^*\pi b}{4}\sin(\pi Z)\,T\,\frac{2+ST}{1+ST} + \frac{3 m h_2 Z}{l}\,\frac{(2+ST)^2\,(2+2ST+S^2T^2)}{T\,(1+ST)^3\,(4+ST)} + \frac{1+ST}{T\,(2+ST)}. \tag{13}$$

figure 2. variation of load carrying capacity with respect to µ∗ and m.
figure 3. variation of load carrying capacity with respect to µ∗ and s.

therefore, the friction force in non-dimensional form at the moving plate is calculated as

$$\bar f_0 = \frac{1+ST}{T\,(2+ST)}. \tag{14}$$
next, at y = 1 (fixed plate), one concludes that

$$\bar\tau = \frac{\mu^*\pi b}{4}\sin(\pi Z)\,T\,\frac{2+ST}{1+ST} - \frac{3 m h_2 Z}{l}\,\frac{(2+ST)^2\,(2+2ST+S^2T^2)}{T\,(1+ST)^3\,(4+ST)} + \frac{1+ST}{T\,(2+ST)}\,, \tag{15}$$

which transforms to the non-dimensional form

$$\bar f_1 = \frac{1+ST}{T\,(2+ST)}. \tag{16}$$

it is clearly seen from equations (14) and (16) that

$$\bar f_0 = \bar f_1. \tag{17}$$

3. results and discussion

equations (6) and (8), respectively, present the variation of the non-dimensional pressure distribution and the load carrying capacity, while the frictional force is determined from equation (9). comparison with the conventional lubricant indicates that the non-dimensional pressure increases by $\frac{\mu^*}{2}\cos\pi Z$, while the load carrying capacity is enhanced by $\frac{\mu^*(l/h_2)}{b/h_2}$.

figure 4. variation of load carrying capacity with respect to µ∗ and l/h2.
figure 5. variation of load carrying capacity with respect to µ∗ and h2/b.

for lower values of the slip parameter, the load carrying capacity estimated here is approximately three times greater than the load calculated from the investigation of patel [20]. it is interesting to note that the friction remains unchanged in spite of the presence of slip, which is clear from equation (17). however, for large values of the aspect ratio, the effect of slip on friction is significant. the distribution of load carrying capacity with respect to magnetization µ∗ for various values of m, s, l/h2 and h2/b is presented in figures 2–5. all these figures make it clear that the load carrying capacity increases due to magnetization. further, the load carrying capacity increases for increasing values of m, l/h2 and h2/b, while it decreases with increasing slip velocity values. however, the effect of m and s on µ∗ is negligible so far as the load carrying capacity is concerned (figures 2 and 3). figures 6–8 show the variation of load carrying capacity with respect to slip velocity s for different values of m, l/h2 and h2/b, respectively. it is clearly seen from these figures that the load carrying capacity decreases with increasing slip velocity values. however, the decrease remains nominal, as can be seen from figures 6–8.

figure 6. variation of load carrying capacity with respect to s and m.
figure 7. variation of load carrying capacity with respect to s and l/h2.

figures 9 and 10 deal with the distribution of load carrying capacity with respect to m. the load carrying capacity increases with increasing values of m. figure 11 confirms that the rate of increase in load carrying capacity with respect to l/h2 increases with increasing values of h2/b. thus, the combined effect of the two ratios l/h2 and h2/b is significantly positive. from figure 12, it is found that the friction decreases with respect to the aspect ratio, whereas it increases with increasing slip parameter values.

4. conclusions

this paper underlines that, from the point of view of the life time of the bearing, the slip parameter needs to be put at a minimum value. a comparison of our paper with the discussions of patel et al. [21] indicates that the load carrying capacity remains almost identical for lower aspect ratio values. the industrial importance of this work is that it offers an additional degree of freedom from the design point of view, in terms of the form of the magnitude of the magnetic field. it is suggested that the adverse effect of slip velocity can be compensated to a large extent by the magnetic fluid lubricant, when a suitable aspect ratio value is chosen.
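equation (6) is easy to evaluate numerically; the sketch below computes the dimensionless film pressure at a chosen point. the parameter values (µ*, m, S, l/h2) are illustrative assumptions, not values read from the figures:

import math

def p_bar(x_rel, z_rel, mu_star=0.5, m=1.0, s_slip=0.1, l_over_h2=100.0):
    """dimensionless film pressure of eq. (6); x_rel = X, z_rel = Z."""
    t = 1.0 + m * (1.0 - x_rel)
    poly = (2 + s_slip * t) / (4 + s_slip * t) \
        * (2 + 2 * s_slip * t + (s_slip * t) ** 2) / (1 + s_slip * t) ** 2
    return mu_star / 2.0 * math.cos(math.pi * z_rel) \
        - 3.0 * m / (l_over_h2 * t ** 3) * poly * (z_rel ** 2 - 0.25)

print(p_bar(0.5, 0.0))   # film centre, bearing centreline

integrating this expression over X in [0, 1] and Z in [-1/2, 1/2], as in equation (8), reproduces the load carrying capacity trends discussed above.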
figure 8. variation of load carrying capacity with respect to s and h2/b.
figure 9. variation of load carrying capacity with respect to m and l/h2.
figure 10. variation of load carrying capacity with respect to m and h2/b.
figure 11. variation of load carrying capacity with respect to l/h2 and h2/b.
figure 12. variation of friction with respect to s and m.

list of symbols
h fluid film thickness at any point [mm]
m aspect ratio
p lubricant pressure [n/mm²]
u uniform velocity in x-direction
w load carrying capacity [n]
b breadth of the bearing [mm]
f frictional force
l length of the bearing [mm]
p̄ dimensionless pressure
w̄ non-dimensional load carrying capacity
h1 maximum film thickness [mm]
h2 minimum film thickness [mm]
f̄ dimensionless frictional force
f̄0 non-dimensional frictional force (moving plate)
f̄1 non-dimensional frictional force (fixed plate)
h magnitude of the magnetic field
h̄ magnetic field
φ inclination angle of the magnetic field
µ lubricant viscosity [n s/mm²]
τ shear stress [n/mm²]
µ0 magnetic susceptibility
µ̄ free space permeability
µ∗ dimensionless magnetization parameter
τ̄ dimensionless shear stress

references
[1] pinkus, o., sternlicht, b., theory of hydrodynamic lubrication, mcgraw-hill, new york, 1961.
[2] cameron, a., the principles of lubrication, longmans, london, 1996.
[3] archibald, f. r., a simple hydrodynamic thrust bearing, asme 72, 1950, p. 393.
[4] lord rayleigh, notes on the theory of lubrication, phil. mag. 35, 1918, pp. 1–12.
[5] charnes, a., saibel, e., on the solution of the reynolds' equation for slider bearing lubrication, part 1, asme 74, 1952, p. 867.
[6] basu, s. k., sengupta, s. n., ahuja, b. b., fundamentals of tribology, prentice-hall of india private limited, new delhi, 2005.
[7] majumdar, b. c., introduction to tribology of bearings, s. chand and company limited, new delhi, 2008.
[8] hamrock, b. j., fundamentals of fluid film lubrication, mcgraw-hill, inc., new york, 1994.
[9] gross, w. a., matsch, lee a., castelli, v., eshel, a., vohr, j. h., wildmann, m., fluid film lubrication, a wiley-interscience publication, john wiley and sons, new york, 1980.
[10] prakash, j., vij, s. k., hydrodynamic lubrication of porous slider, journal of mechanical engineering and science 15, 1973, pp. 232–234.
[11] patel, k. c., gupta, j. l., hydrodynamic lubrication of a porous slider bearing with slip velocity, wear 85, 1983, pp. 309–317.
[12] agrawal, v. k., magnetic fluid-based porous inclined slider bearing, wear 107, 1986, pp. 133–139.
[13] bhat, m. v., deheri, g. m., porous composite slider bearing lubricated with magnetic fluid, japanese journal of applied physics 30, 1991, pp. 2513–2514.
[14] patel, r. m., deheri, g. m., vadher, p. a., performance of a magnetic fluid-based short bearing, journal of applied sciences 7(3), 2010, pp. 63–78.
[15] prajapati, b. l., magnetic fluid-based porous inclined slider bearing with velocity slip, prajna, 1994, pp. 73–78.
[16] prajapati, b. l., on certain theoretical studies in hydrodynamics and electro-magnetohydrodynamic lubrication, dissertation, s.p. university, vallabh vidhyanagar, 1995.
[17] verma, p. d. s., magnetic fluid-based squeeze films, int. j. eng. sci. 24(3), 1986, pp. 395–401.
[18] bhat, m. v., deheri, g. m., squeeze film behavior in porous annular discs lubricated with magnetic fluid, wear 151, 1991, pp. 123–128.
[19] beavers, g. s., joseph, d. d., boundary conditions at a naturally permeable wall, jour. fluid mechanics 30, 1967, pp. 197–207.
[20] patel, n. s., analysis of magnetic fluid-based hydrodynamic slider bearing, thesis m. tech., sardar vallabhbhai national institute of technology, surat, 2007.
[21] patel, n. d., deheri, g. m., effect of surface roughness on the performance of a magnetic fluid based parallel plate porous slider bearing with slip velocity, journal of the serbian society for computational mechanics 5, 2011, pp. 104–118.
[22] bhat, m. v., lubrication with magnetic fluid, team spirit (india) pvt. ltd., 2003.
[23] patel, n. s., vakharia, d. p., deheri, g. m., a study on the performance of a magnetic-fluid-based hydrodynamic short journal bearing, isrn mechanical engineering, 2012, article id 603460.
[24] vakis, a. i., polycarpou, a. a., an advanced rough surface continuum-based contact and sliding model in the presence of molecularly thin lubricant, tribology letters, 49(1), 2013, pp. 227–238.
[25] patel, j. r., deheri, g. m., a comparison of porous structures on the performance of a magnetic fluid based rough short bearing, tribology in industry, 35(3), 2013, pp. 177–189.

acta polytechnica doi:10.14311/ap.2013.53.0007, acta polytechnica 53(5), 2013, © czech technical university in prague, 2013, available online at http://ojs.cvut.cz/ojs/index.php/ap

three quarters a century

foreword to special issue dedicated to prof. miloslav havlíček

such a number of years makes one think, as the famous phrase in some like it hot goes. in cases like the jubilee that this festschrift celebrates, there is indeed a lot to think about, because the life of professor miloslav havlíček has been anything but boring. this is clear already from his curriculum. one of the first graduates of the newly-founded faculty of technical and nuclear physics in 1961, he started his mathematical career teaching algebra. at the same time, he worked on applications of algebraic methods in quantum theory, under the strong influence of professor václav votruba, the nestor of czech theoretical physics. infinite-dimensional lie algebras were the topic of his phd thesis, and algebraic problems have remained his favourite subject throughout his life. however, his interests have been broader than that. he has dealt with problems in functional analysis, differential equations, foundations of quantum mechanics, and other topics. since 1973, he has spent several years at the laboratory of theoretical physics, jinr, in dubna, where he also wrote and defended his higher doctoral thesis on canonical realizations, a certain algebraic representation theory of lie algebras. back in prague, he worked at charles university, but in the second half of the 1980s his research came under pressure from those who wielded power but had no understanding, and he decided to return to the department of mathematics of his alma mater. the great political rift of 1989 changed his life. in particular, it removed the obstacles that had prevented him from obtaining the academic laurea that he deserved. he obtained his habilitation in 1990, and three years later he became a full professor. in 1990, he was also elected dean of the faculty, and he served in this capacity for two terms; he was the principal spiritus movens in the transformation of the faculty to the conditions of a free society.
one of his memorable achievements was to found the doppler institute, which cemented and boosted the existing informal collaboration of several groups working on mathematical physics in prague. these are the dry facts. however, we who have known him for about two thirds of his present age can testify that there is much more in miloslav. his professional – and not only professional – life has been ornamented by an array of stories too numerous to be told in this short preface. some are anecdotal, like the story of his doctoral thesis being confiscated by a soviet border guard, who suspected subversion in a text on lie algebras. some are rather complex, like the story of our textbook on linear operators in quantum physics, tenderly known as blue death by our students, which started out four decades ago as a lecture note project to utilize an orphaned chapter of a manuscript, passed through years of reworking under the eye of mean editors, and through administrative hurdles so typical for certain periods in the history of this country, not to mention the bankruptcy of a publishing house, to achieve one czech edition and two english editions with three different publishers – and who knows where it will go from here. it is a good custom in our field that we honour a colleague and a teacher as he celebrates a jubilee by putting forward some results which, we think, might please him. in this volume, we have collected short papers from miloslav's friends, collaborators and students, in the hope that he will enjoy them.

pavel exner

acta polytechnica 54(1):19–21, 2014, doi:10.14311/ap.2014.54.0019 © czech technical university in prague, 2014. available online at http://ojs.cvut.cz/ojs/index.php/ap

current ways to harvest energy using a computer mouse

frantisek horvat(a,∗), michal cekan(a), lukas soltes(a), peter biath(a), branislav hucko(a), igor jedinak(b)
(a) slovak university of technology, namestie slobody 17, bratislava
(b) igor jedinak, sturova 1305/36, kysucke nove mesto
(∗) corresponding author: frantisek.horvat@stuba.sk

abstract. this paper deals with the idea of an energy harvesting (eh) system that uses the mechanical energy from finger presses on the buttons of a computer mouse by means of a piezomaterial (pvf2). the piezomaterial is placed in the mouse at the interface between the button and the body. this paper reviews the parameters of the pvf2 piezomaterial and tests their possible implementation into eh systems utilizing these types of mechanical interactions. the paper tests the viability of two eh concepts: a battery management system, and a semi-autonomous system. a statistical estimate of the button operations is performed for various computer activities, showing that an average of up to 3300 mouse clicks per hour was produced for gaming applications, representing a tip frequency of 0.91 hz on the pvf2 member. this frequency is tested on the pvf2 system, and an assessment of the two eh systems is reviewed. the results show that fully autonomous systems are not suitable for capturing low-frequency mechanical interactions, due to the parameters of current piezomaterials, and the resulting very long startup phase. however, a hybrid eh system which uses available power to initiate the circuit and eliminate the startup phase may be explored for future studies.

keywords: energy harvesting, computer mouse, piezomaterial, battery management.
1. introduction

many studies involving energy harvesting (eh) from human activity have concentrated on mounting an eh device to the body itself and using human motion to generate electrical energy [1]. however, there are other possible ways to obtain energy from human activity. whether for work or for play, humans interact extensively with their personal computer (pc), and this interaction can be considered for eh. interfacing with a pc typically involves a mouse and a keyboard, and depending on the activity, a computer mouse offers unique possibilities for eh, particularly in the mechanisms that are involved. studies have already been made for the potential application of eh circuits based on a piezomaterial, and have shown the possibility of applying it for battery management of cardiostimulators [2]. similarly, the buttons on a computer mouse can be set up to introduce ideal displacements on a given piezomaterial for optimal energy generation. considering the enormous number of people who interact with a pc every day, for work and for entertainment, on a large scale the energy that is generated offers potential for eh [3].

2. study

a piezomaterial can generate electricity, but the quantity of energy produced is a function of the frequency and the tip mass of the displaced material. to determine, on average, how many interactions humans perform with a computer mouse, a statistical analysis was carried out for various activities performed by 10 volunteers. using a computer program that monitors mouse clicks per hour [4], each volunteer was asked to measure their interactions over one hour of typical work. following this analysis, the same volunteers were asked to perform a measurement after one hour of gaming. the results are shown in figure 1.

figure 1. statistical analysis of mouse clicks for various computer activities: 1 – administration, 2 – gaming, 3 – programming.

the data show that, on average, administrative work produces 170 mouse clicks per hour, which is essentially useless for eh. however, gaming produces a very much greater number of interactions: an average of 3300 clicks per hour. the values are highly dependent on the type of work being performed or on the game being played. it is obvious that gaming offers the greatest potential for generating energy. a suitable eh model, material, and configuration must also be considered.

3. method

the most widely-used piezomaterial for eh is based on lead zirconate titanate (pb(zr_x ti_{1-x})o_3, or pzt). these structures offer very good performance, and they were therefore used in this study [5]. the cantilever loaded by the dirac function in figure 2 shows how the piezomaterial functions. the pvf2 piezomaterial is placed at the root of this beam, and a load and a frequency excite the beam. the load at the end of the beam displaces the internal crystal structure, and this generates a charge. higher tip masses are used to fine tune the region of resonance in which the piezomaterial operates [5]. the corresponding voltage that is generated is used as an input into the eh circuit.

figure 2. cantilever beam loaded by a dirac function.

there are two eh methods, shown in figure 3, that can be implemented into the mouse circuit. model 1 shows a system used for battery management (battery recharging or a charge extender), and model 2 shows the direct use of the generated energy in a semi-autonomous system.
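as a quick cross-check of the click statistics from section 2, the quoted hourly counts can be converted into the average actuation frequency seen by the piezo member. the snippet below is only an illustrative sketch in plain python; the activity labels and counts are the values quoted in the text, not data from the authors' monitoring program.

```python
# convert the measured click statistics to an average actuation frequency;
# the counts are the per-hour averages quoted in the study above
clicks_per_hour = {"administration": 170, "gaming": 3300}

for activity, cph in clicks_per_hour.items():
    freq_hz = cph / 3600.0          # clicks per second = tip frequency in hz
    print(f"{activity}: {cph} clicks/h -> {freq_hz:.2f} Hz")

# gaming: 3300 clicks/h -> ~0.92 Hz, consistent with the 0.91 Hz
# tip frequency used for the pvf2 member in the text
```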
in the analysis, a piezomaterial with dimensions of 13 × 25 × 0.3 mm was considered. the pvf2 material that was used indicated higher impulse voltages at an optimal tip deflection of ±1.5 mm and a higher tip mass. the piezomaterial is optimized for 30 hz resonance; a higher tip mass reduces the frequency required to induce energy. the eh circuit from linear technology was used, and the calculations were based on this circuit [6].

figure 3. possible energy harvesting systems, pvf2 – piezomaterial, lt – circuit required for eh, c – capacitor, b – battery, s – input to mouse system.

4. measurement and results

the pvf2 piezo-member was glued to the body of the mouse so that the tip of the pvf2 beam was positioned at the interface between the mouse button and body, as shown in figure 4. care was taken to ensure consistent tip deflection coinciding with the optimal deflection of the piezomaterial. the eh circuit was constructed according to lt specifications, with an output set at 1.8 v. this circuit was placed in an open cavity of the mouse body, as is also shown in figure 4. the mouse was then assembled, and the left button of the mouse was actuated to determine whether the eh system was functioning. measurements were taken at the input and output of the lt-based eh circuit. on average, the piezomaterial generated 320 mv of input voltage per click, which is consistent with an approximately 30 % reduction in expected values. at 0.91 hz (gaming activities), the circuit start-up profile required approximately 2850 mouse clicks to accumulate enough energy to open the lt circuit. after this point, energy is passed through the circuit, and an output of 1.77 v is generated.

figure 4. implementation of the pvf2 piezomaterial on the left mouse button and the lt eh circuit.

5. conclusions

it must be stated that the frequency of mouse clicks is highly dependent on the type of activity. some games require more use of the mouse than others. the same is true for certain working activities. the piezo-member was designed to operate at 30 hz. because of this, the results show that the start-up process is too long, even when considering gaming activities (approximately 51 minutes at 0.91 clicks per second). fully autonomous systems are therefore currently not possible, due to the long start-up profile and the inherently high voltage dropout as charge is built. it should be stated that once the eh circuit overcomes the regulator start-up profile, it begins to release energy to the circuit output. therefore, any further study could involve an eh mouse circuit operating as a hybrid system, where the regulator start-up profile is initiated by the available on-board power, and thus any clicking directly adds energy to the output of the circuit. however, the character of the voltage drop suggests that this kind of system is not possible with the given piezomaterial and with low frequency input. it should be stated that the measurements were carried out on the left mouse button alone. a series of piezo-members mounted on both mouse buttons, or combined integration into other peripherals, e.g. the keyboard, could possibly sustain the charge in the system. finally, current piezomaterial parameters are not suitable for such low frequency operations, and it is difficult to induce the required 30 hz resonance on these members in these operations, due to the inconsistency of the impulses.
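the start-up figure quoted above can be reproduced from the measured numbers. a minimal sketch, assuming only the values stated in the text (2850 clicks to open the lt regulator, 0.91 clicks per second during gaming):

```python
# start-up time of the eh circuit, estimated from the measured values above
clicks_to_open = 2850      # clicks needed before the lt regulator opens
click_rate_hz = 0.91       # average click rate during gaming

startup_s = clicks_to_open / click_rate_hz
print(f"start-up: {startup_s:.0f} s = {startup_s / 60:.0f} min")
# ~3130 s, i.e. roughly 52 min, in line with the approximately
# 51 minutes quoted in the conclusions
```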
references

[1] riemer, r., shapiro, a.: biomechanical energy harvesting from human motion: theory, state of the art, design guidelines, and future directions, journal of neuroengineering and rehabilitation, 2011, 8:22
[2] karami, a., inman, d.: powering pacemakers from heartbeat vibrations using linear and nonlinear energy harvesters, american institute of physics, 2012, lett. 100, 042901
[3] horvat, f., cekan, m., soltes, l., hucko, b.: myska – prostriedok alebo zdroj?, in abstracts, preveda, 2013, isbn: 978-80-970712-4-0
[4] codeplex, application usage statistics, project hosting for open source software, www.usagestats.codeplex.com
[5] setter, n.: piezoelectric materials in devices, epfl swiss federal institute of technology, 2002, pp. 6–26
[6] linear technology: piezoelectric energy harvesting power supply, data sheet, online: www.linear.com [2014-01-27]

acta polytechnica 55(3):177–186, 2015, doi:10.14311/ap.2015.55.0177 © czech technical university in prague, 2015. available online at http://ojs.cvut.cz/ojs/index.php/ap

effect of the flow field deformation in a wind tunnel on aerodynamic coefficients

dušan maturkanič, vzlú, beranových 130, prague, czech republic; correspondence: maturkanic@centrum.cz

abstract. the quality of the flow field is highly significant when wind tunnel measurements are being made. when an air flow field is formed by a fan, the entire flow field rotates. moreover, the flow field is deformed at the bends of a wind tunnel with closed circulation. although wind tunnels are equipped with devices that eliminate these non-uniformities, in most cases the air flow field does not have ideal parameters in the test section. in order to evaluate the measured results of the model in the wind tunnel, it is necessary to characterize the deformation of the flow field. the following text describes the possible general forms of the flow field non-uniformity, and their effect on the calculation of the aerodynamic coefficients.

keywords: flow field, speed profile, rotation, vortex, specific speed, deformation of the flow field, wind tunnel, aerodynamic coefficient.

1. introduction

the requirement of a uniform air flow field in the wind tunnel is met when the axial speed component has an identical value in each point of the test section that influences the measured model. the radial speed component, which is perpendicular to the axial axis of the wind tunnel, should be zero. the level of the flow field non-uniformity in the wind tunnel is screened, for example, by a measuring probe, at defined points in the test section. for this, the measuring space should be divided into four sectors in the chosen radial plane (i.e. the plane perpendicular to the axial axis of the wind tunnel). the marking and numbering of these points is presented in fig. 1. a constant vertical and horizontal distance among these points is suitable for other analyses. checking measurements in the defined points can be performed in other planes. the described flow field non-uniformity analysis is justified in the core of the flow field, which is intended for wind tunnel tests on air and ground applications at a sufficient distance from the tunnel wall, where there is no influence of the boundary layer.
if the method described here is repeated in the other radial planes of the test section, the character of the development of the turbulence in the axial direction can be analyzed. the turbulent pulsation level can be assessed by measuring the parameters at one point in the sector over longer periods of time and recording the changes in these parameters. the following text focuses on the general flow field deformation in one radial plane and without time-sensitive effects.

figure 1. definition of the probe positions.

2. description of the flow field

the air flow speed in a wind tunnel is considered as the average value of the speed distribution in a defined cross-section of the test section. the speed value then expediently meets the real conditions for the measured model. for a better description, the non-uniformity of the flow field will be classified into speed relations in the longitudinal direction of the wind tunnel (axial speed) and speed relations in the lateral direction of the wind tunnel (radial speeds).

3. speed profile

the speeds of the air flow distributed in the longitudinal direction of the wind tunnel form the speed profile. for simplicity, the speed profile in one plane parallel with the axis of the wind tunnel will be considered in the text that follows. generally, there are two kinds of speed profile: (1) the mirror speed profile; and (2) the diagonal speed profile. the dissimilarity between these profiles is presented in fig. 2, where the curve illustrates the oscillations of the local speeds compared with the defined speed of the air flow.

figure 2. two kinds of speed profile in the wind tunnel.

the mirror speed profile is characterized by the $v_A$ speed, and the local speeds in the flow field scarcely influence the results of measurements on the model, so the influence of the non-uniformity of the speed profile is insignificant. the diagonal speed profile is characterized by the $v_B$ speed, and no similar statement can be made, since the sum of the local speeds on the left and right parts of the speed profile from the center still rises, and an expressive imbalance of the aerodynamic forces acts on the measured model. in connection with this scheme, if the difference among the corresponding local speeds in defined sectors is lower than a specific value, then the non-uniformity of the flow field has no effect on the measured parameters. the definition of the non-uniformity of the speed profile is based on the fact that speed $v_s$ represents the real air flow speed in the wind tunnel. the consideration of the opposite sectors according to fig. 1 (i.e. sectors a and d) and their local speeds leads to the expression of the difference between the speeds in the two sectors
$$\bar V_{sym(DA)} = \frac{\sum_{i=1}^{k} |v_{Di} - v_{Ai}|}{k\,v_s}. \quad (1)$$
the speed difference $(v_{Di} - v_{Ai})$ is valid for the corresponding suffixes according to fig. 1, where the speed field asymmetry can be presented. the specific speed $\bar V_{sym(DA)}$ between these opposite sectors then expresses the level of deformation of the speed profile. based on the character of the definition of the speed profile, the specific speed $\bar V_{sym}$ between the sectors expresses the rate of the speed profile asymmetry. despite the symmetry of the flow field between the opposite sectors, however, marked changes in one sector can take place without being indicated by the $\bar V_{sym}$ specific speed.
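relation (1) is straightforward to evaluate from the probe data. the following sketch is not from the paper; it assumes the local speeds of the two opposite sectors are stored as equally long arrays, indexed so that corresponding probe positions match (numpy is used for brevity):

```python
import numpy as np

def v_sym(v_a, v_d, v_s):
    """specific speed of eq. (1): mean absolute difference between the
    local speeds of two opposite sectors, normalized by the real speed v_s."""
    v_a = np.asarray(v_a, dtype=float)
    v_d = np.asarray(v_d, dtype=float)
    k = len(v_a)                        # number of corresponding probe points
    return np.abs(v_d - v_a).sum() / (k * v_s)

# a perfectly mirrored flow field gives v_sym = 0
print(v_sym([39.8, 40.1, 40.0], [39.8, 40.1, 40.0], v_s=40.0))   # 0.0
# a one-sided speed defect is flagged immediately
print(v_sym([36.0, 36.0, 40.0], [40.0, 40.0, 40.0], v_s=40.0))   # ~0.067
```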
these marked changes in the speed of the flow field should be represented by expressive differences of the local speeds against the average speed or the real speed of the flow field $v_s$, and this character of the flow field can be expressed by the specific speed
$$\bar V_{avr(DA)} = \frac{\sum_{i=1}^{k} \left( |v_{Ai} - v_s| + |v_{Di} - v_s| \right)}{2k\,v_s}, \quad (2)$$
which presents the rate of the speed non-uniformity in each sector. although the speed distribution in the profile is symmetrical, the speed profile does not have to be suitable for measuring. a similar case is the asymmetry of the speed profile according to the following relation:
$$\bar V_{asm} = \frac{\sum_{i=2}^{k} \left( |v_i - v_{i-1}|_A + |v_i - v_{i-1}|_D \right)}{(k-1)\,v_s}. \quad (3)$$
these asymmetric speed profiles, however, form the slope of the air flow accompanied by the radial speed. this case can be described by the characteristics of the flow field in the radial plane.

4. rotation of the flow field

in the radial plane, the non-uniformity of the flow field depends on the radial speed component. this speed component acts in a plane which is perpendicular to the axis of the wind tunnel, and it causes the flow field to rotate or produces vortices in the flow field (the case when one component of the radial speed is zero and the motion of the flow field is only a translation is not considered here). if the local radial speed corresponds to the character of the other local speeds in the considered plane, it creates an overall rotation of the air flow. in the second case, the characters of the local radial speeds differ, and then local non-uniformities of the air flow rotation are created, or local vortices are produced. a similar process can be used for evaluating the radial speed effect. the rotation of the flow field will be defined by analogy with the speed profile (fig. 2). the rotation of the flow field also has two profiles: (1) the rotary flow field, with radial speed $v_r$; and (2) the vortex flow field, with local radial speed $v_q$ (fig. 3). for the following analyses, it is more useful to translate the measured radial speed components in tunnel coordinates into the tangential speed component and the normal speed component relative to the axis of the wind tunnel.

figure 3. character of the rotary flow field.

a required condition for the rotary flow field is that the normal speed component is insignificant in each point of the flow field, $v_{n(i)} \to 0$. in a flow field that is rotating uniformly, the local tangential speeds rise evenly from the axis of the wind tunnel. in accordance with the speed profile, the requirement for the rotary flow field can be defined by the non-uniformity of the rotation of the flow field
$$\bar V_{rot} = \frac{R}{v_s} \sum_{i=2}^{k} \left| \frac{\Delta v_{t(i,i-1)}\,\Delta y_{k,1} - \Delta v_{t(k,1)}\,\Delta y_{i,i-1}}{\Delta y_{i,i-1}\,\Delta y_{k,1}} \right|, \quad (4)$$
where $\Delta v_{t(\ell,m)} = v_{t(\ell)} - v_{t(m)}$, $\Delta y_{\ell,m} = y_\ell - y_m$ and $R$ is the diameter of the wind tunnel. in this relation, parameter $y$ represents the perpendicular distance from the axis of the wind tunnel to the point of tangential speed $v_t$. of course, the flow field must have the same character in all sectors (fig. 1). another factor in the rotations of the flow field, i.e. the rotary flow field and the vortex flow field, is not only the value but also the orientation of the normal speed component $v_n$. in principle, there are three cases: (1) outer rotation, when $v_n > 0$; (2) inner rotation, when $v_n < 0$; (3) steady rotation, when $v_n = 0$. in the last case, if $v_n = 0$ over the whole cross-section, a rotary flow field is formed. however, there are situations in the vortex flow field when the local normal speed is zero.
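the specific speeds of eqs. (2)-(4) can be coded the same way as eq. (1). the sketch below makes the same assumptions as before (equally spaced probe points per sector; for eq. (4) the tangential speeds and their distances from the tunnel axis are given within one sector, with strictly increasing distances); none of this is the author's code:

```python
import numpy as np

def v_avr(v_a, v_d, v_s):
    # eq. (2): mean absolute deviation of both sectors from v_s
    v_a, v_d = np.asarray(v_a, float), np.asarray(v_d, float)
    k = len(v_a)
    return (np.abs(v_a - v_s) + np.abs(v_d - v_s)).sum() / (2 * k * v_s)

def v_asm(v_a, v_d, v_s):
    # eq. (3): summed neighbour-to-neighbour speed steps in sectors a and d
    v_a, v_d = np.asarray(v_a, float), np.asarray(v_d, float)
    k = len(v_a)
    return (np.abs(np.diff(v_a)).sum()
            + np.abs(np.diff(v_d)).sum()) / ((k - 1) * v_s)

def v_rot(v_t, y, v_s, R):
    # eq. (4): deviation of the tangential-speed profile from the straight
    # line joining the innermost and outermost probe points; it vanishes
    # for a uniformly rotating flow field
    v_t, y = np.asarray(v_t, float), np.asarray(y, float)
    dv_glob, dy_glob = v_t[-1] - v_t[0], y[-1] - y[0]
    total = 0.0
    for i in range(1, len(v_t)):
        dv_loc, dy_loc = v_t[i] - v_t[i - 1], y[i] - y[i - 1]
        total += abs((dv_loc * dy_glob - dv_glob * dy_loc)
                     / (dy_loc * dy_glob))
    return R / v_s * total

# tangential speeds rising evenly from the axis -> v_rot = 0
print(v_rot([0.5, 1.0, 1.5, 2.0], [0.25, 0.5, 0.75, 1.0], v_s=40.0, R=3.0))
```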
the angle $\upsilon_2$ between the radial speed $v_r$ and the normal speed $v_n$ is the characteristic parameter of the vortex flow field. the intensity of the local vortex will then be defined by the cosine of this angle, in the range from −1 to +1. the intensity of the vortex flow field is now defined by the following relation:
$$\bar V_{vor} = \frac{\sqrt{(\Delta y)^2 + (\Delta z)^2}}{R} \left( \frac{|v_{r(i)} - v_{r(i-1)}|}{v_s} + \frac{|\upsilon_{2(i)} - \upsilon_{2(i-1)}|}{2\pi} \right), \quad (5)$$
where a constant distance between the measured positions $\Delta y$ and $\Delta z$ is expected.

5. deformation character of the flow field

the deformation of the flow field in the wind tunnel described above is based on the change of the speed profile in the axial direction and the rotation or the vortex flow field in the radial direction. for the speed profile, there are the mirror speed profile, with flow symmetry, and the diagonal speed profile, with some unsymmetrical level. the specific speed $\bar V_{sym}$ indicates the level of symmetry or asymmetry. however, a disproportion of the local speed can occur in the symmetric speed profile. the specific speed $\bar V_{avr}$ indicates these local speed disproportions in the air flow. there are the rotation of the flow field and the vortex flow field in the radial direction. the normal speed component value determines the range of each case. for the rotation of the flow field, the specific speed $\bar V_{rot}$ indicates the non-uniformity of the rotation of the flow field. this specific speed depends on the diameter of the wind tunnel, and is related to the defined speed of the flow field in the tunnel, or the average speed of the flow field. if the flow field has a vortex character, then the specific speed $\bar V_{vor}$ indicates the intensity of the vortex; this specific speed depends on the diameter of the wind tunnel, and is also related to the defined speed of the flow field.

6. asymmetry of the speed profile

the specific speed $\bar V_{sym}$ indicates the size of the corresponding difference between the local speeds and the defined flow field speed in the wind tunnel. in comparison with the value of the specific speed $\bar V_{sym}$, the local speed non-uniformity in the flow field can be considered as insignificant. however, the same value of this specific speed could represent various kinds of flow field non-uniformities. in the following text, the difference between the local speed and the defined speed of the wind tunnel $v_s$ will be considered. the size or the shape of the model also plays a role. for most basic measurements, the longitudinal axis of the model is consistent with the axis of the wind tunnel, so the type of flow field non-uniformity and the size of the model are important for the error analysis.

figure 4. basic kinds of speed profile asymmetry.
figure 5. basic kinds of speed profile disproportion.

fig. 4 shows the three basic types of speed profile asymmetry:
• sym-1 speed asymmetry refers to the case when the change in speed occurs only in one part of the speed profile, or only in one sector (fig. 1). if two model sizes are considered, the asymmetry is more marked in the bigger model.
• sym-2 speed asymmetry refers to the case when the direction of the local speed differs between the opposite parts of the speed profile. however, the speed profile asymmetry does not affect the smaller model in this case.
• sym-3 speed asymmetry describes the case when the non-symmetrical distribution of the local speed changes in the opposite parts of the speed profile can have an entirely different effect for big models and for small models.
the dark color in fig. 4 depicts the local speed differences affecting the measurement accuracy for both models. the light color depicts the local speed differences for the bigger model only.
according to this scheme, the size of the model and the position of the most extensive local speed differences should be included with the flow field non-uniformity for measurements in the wind tunnel. in this connection, the positions of the speed differences related to characteristic parts of the model are very important for the error measurement analyses.

7. speed profile disproportion

it was mentioned that the symmetric speed profile can include significant discrepancies among the local speeds, which can have an influence on the results. as in the case of speed profile asymmetry, there are three basic kinds of symmetric speed profile disproportion (fig. 5):
• the avr-1 speed disproportion refers to the case when the speed differences are extensive only on the smaller model, since the deformed air flows around the whole small model, whereas the speed differences affect only the region near to the axis of the bigger model.
• the avr-2 speed disproportion refers to the case when the smaller model is scarcely affected by the speed differences. however, the effect of the differences in speed acting on the bigger model can be great enough to affect the usefulness of the measured data.
• the avr-3 speed disproportion describes the case when the differences in speed have different orientations on smaller models and on bigger models. according to fig. 5, the smaller model has a flow speed that is lower than the defined speed in the wind tunnel, whereas the bigger model has a flow speed that is higher than the defined speed in the wind tunnel.
the definition of the smaller model and the bigger model has been mentioned as an example of the difference between the train model, where the height almost corresponds with the width of the model, and the airplane model, where the span has a significantly greater value than the height and a significantly greater value than the diameter of the fuselage.

figure 6. basic kinds of disproportion of the rotation of the flow field.

8. disproportion of the flow field rotation

the requirement for the rotation of the flow field is the condition defined above, when the normal speed component in the radial plane is zero or has a value near to zero. then, there is a way similar to the speed profile solution. according to the value of the specific speed $\bar V_{rot}$, the rotation of the flow field can be considered as a steady or unsteady rotation. however, the value of this specific speed does not indicate the particular rotation non-uniformities, so the rotation effects on the measured model data are essential. the characteristic attribute of a rotating flow field is that the local radial speeds (i.e. the tangential speeds only, because the normal speeds are insignificant) at the same distance from the axis of the wind tunnel and in opposite sectors have the same value but opposite orientations. this can be called an antisymmetrical speed profile. in this case, the model is also located in the center of the wind tunnel, and there are two model sizes, the small model and the big model. four basic kinds of the flow field rotation can therefore be defined (fig. 6):
• the rot-0 rotation disproportion describes the case when the tangential speeds rise steadily from the axis of the wind tunnel.
this kind of rotation creates a similar force and moment for all model sizes, and is mentioned because the rotation is considered as a deformation of the flow field.
• the rot-1 rotation disproportion describes the case when the positive augmentation of the tangential speed component has a different effect on small models and on big models. the measurements on the small model can be incorrect, while the inaccuracy of the measurements on the big model can be acceptable, as the edge surfaces of the big model are not affected by the differences in speed.
• the rot-2 rotation disproportion describes the case when the small model is almost unaffected by the rotation of the flow field, whereas the rotation character of the flow field could make the measurement results on the big model totally unsuitable.
• the rot-3 rotation disproportion describes the case when the deformation of the flow field has different effects on the small model and on the big model. again, this case is characterized by the positive and negative differences of the tangential speed.
disproportionate speeds when the flow field is rotating (i.e. the rot-1 to rot-3 kinds) in fact create separate rotations with a joint axis, which is the axis of the wind tunnel. the radial speed of the rotation therefore has only the tangential speed component, which does not rise proportionally from the axis, as it does in the typical rot-0 case. the flow field rotates in its layers, and even for the rot-3 type there are tangential speeds with opposite orientations between some layers.

9. vorticity of the flow field rotation

the vortex flow field represents a certain generalization of the rotating flow field, where the axis of rotation is not located in the center of the wind tunnel, but in a random position in the radial speed plane. the radial speeds therefore have a random direction in relation to the axis of the wind tunnel. the complexity of the description of the flow field was indicated in the definition of the specific speed $\bar V_{vor}$, where the directions of these speeds are considered. simultaneously, the direction of the local speeds defined by the angle $\upsilon_2$ is important for assessing whether the calculated difference between the local speeds is related to the same vortex. the vortex flow field is in principle unsymmetrical, unlike the rotary flow field, where the speeds in opposite sectors (fig. 1) are antisymmetric. however, a particular case can also be a symmetrical vortex flow field with the same vortex pattern in the opposite sectors. since this case is highly improbable, it will not be considered below. unlike in the previous cases, the situation will be described on the whole sector, designated as the vortex field. for a description of the vortex field, it is necessary to define the characteristics of the vortex. the vortex can be of high intensity, with a high (tangential) speed on the margin of the vortex, or of low intensity, down to the commotion vortex (i.e. a vortex for which the time between its creation and its collapse is negligibly small). in addition, the region affected by the vortex must be distinguished, and there are large-range vortices and small-range vortices. a vortex of immeasurable range can occur in the situation when the chosen measured distance between neighboring positions of the probe is greater than the diameter of the vortex, and the location of the whole area of the vortex with its margin is unmeasurable.
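for the pairwise assessment just described, the specific speed of eq. (5) can be evaluated between neighbouring probe positions. a minimal sketch, assuming the radial speed magnitudes and the angles υ2 (in radians) have already been extracted from the probe data, and that the probe spacing Δy, Δz is constant, as eq. (5) requires:

```python
import math

def v_vor(vr, vr_prev, ups2, ups2_prev, dy, dz, v_s, R):
    """local vortex intensity of eq. (5); vr = radial speed magnitude,
    ups2 = angle between the radial and normal speed components [rad]."""
    geom = math.hypot(dy, dz) / R              # probe spacing relative to R
    return geom * (abs(vr - vr_prev) / v_s
                   + abs(ups2 - ups2_prev) / (2 * math.pi))

# identical neighbouring radial speeds and angles -> no vortex indication
print(v_vor(1.2, 1.2, 0.6, 0.6, dy=0.1, dz=0.1, v_s=40.0, R=3.0))   # 0.0
```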
this situation of an unmeasurable vortex can have a negative effect on the results of the measured model, especially if there are critical parts of the model design. the vortex flow fields are therefore distinguished according to the characteristics of the individual local vortices and their dominant parameters:
• a flow field with low-intensity vortices;
• a flow field with high-intensity vortices;
• a flow field with vortices of a limited range;
• a flow field with vortices beyond the limited range.
if the flow field is described in the defined sectors (fig. 1), the vortex distribution in the flow field is important, together with the individual vortex characteristics. generally, the following cases are observed:
• a flow field with a random vortex; this represents the case when an isolated vortex occurs in the flow field, eventually with some vortices of low intensity and with a limited range.
• a flow field with a random fixed vortex; this represents the case when the center of the vortex is exactly known, for example from repeated measurements, or the vortex is considered as stationary.
• a flow field with a random free vortex; this represents the case when the vortex was found on the basis of the radial speeds, and the center of the vortex was calculated, or the axis of the vortex moves in relation to the axis of the wind tunnel.
• a flow field with independent vortices; there are several vortices which do not affect each other.
• a flow field with dependent vortices; there are several vortices which affect each other.
the line between the individual cases depends on the parameters of the individual vortices related to the defined data, i.e. the vortex intensity related to the speed of the flow field in the wind tunnel, or the vortex range related to the area of the flow field, etc. these cases are taken into consideration in the description of the flow field for the radial plane. however, this description can be extended to the three-dimensional situation; then, for example, the vortex space with the free axis of the vortex is defined as the case when the axis of the vortex changes its position (or slope) in relation to the axis of the wind tunnel. these situations ensue when measurements are made in more than one radial plane. on the basis of the definition of the vortex field, the flow field is characterized into several kinds, unlike the other cases mentioned in the previous paragraphs. unlike the previous paragraphs, where only the tangential speed component was considered, the normal speed component has a fundamental meaning in the vortex flow field. remarks were made about the value and the orientation of the normal speed in the description of the rotation of the flow field above. the value and the orientation of the normal speed do not need to represent unequivocally the concept of the vortex orientation, so it is necessary to explore whether the neighboring normal speeds are parts of the same vortex. the effect of the vortex flow field on the measured model therefore cannot be constructed according to the separated local radial speeds. this situation is illustrated in fig. 7, where a pair of radial speeds is able to create a random vortex image. these two speeds can be included in a joint vortex, they can be included in two separate vortices with different circulations or with the same circulations, or they can be included in two vortices with different ranges and intensities.
however, the measured speeds do not have to be located on the margin of the vortices, and are probably not located on the margin. the vortex flow field is a characteristic situation where the model is influenced by the flow field non-uniformity only on parts of its surface. this situation can be considered as the unsymmetrical case. the random number and possible positions of the vortices in the flow field then exclude any definition of types of vorticity similar to the cases mentioned above. the unique requirement for vorticity and for an analysis of its effect on the measured model is knowledge of the value and the position of the radial speed components in the flow field related to critical parts of the model. for the analyses below, the complex mechanism of the vortex flow field needs to be taken into consideration, but an explanation is beyond the scope of this paper [1].

10. a summary of flow field characteristics

the non-uniformity of the flow field was described above on the basis of the speed profile and the flow field rotation. these characteristics were essential for defining the deformation effect of the flow field on the accuracy of the measurements in the wind tunnel. there are two specific speed deformations for the speed profile: the speed non-uniformity, which is divided into three types, whereby the position and the model size related to this non-uniformity can be taken into consideration; and the speed disproportion, which is also divided into three types with similar characters. for the rotation of the flow field, disproportions of three kinds are defined.

figure 7. an example of the potential vortices for two measured radial speeds.

the rot-0 disproportion represents the rotation of the whole flow field, where the local speeds rise proportionally from the center of the rotation (i.e. the axis of the wind tunnel). in principle, the deformations of the flow field can generally be classified into three types:
• type 1 – the deformation of the flow field is demonstrated on the model without the scale effect. this effect can be estimated according to a certain character, and the measurement results can be corrected (represented by the sym-1, avr-1, rot-1 types).
• type 2 – the deformation of the flow field is demonstrated only in the measured space. there is also a possible case where the small model is not affected by the change in the flow field, but the large model is affected (represented by the sym-2, avr-2, rot-2 types).
• type 3 – the deformation of the flow field affects similar models (differing in scale only) differently, and the standard corrections cannot be used (represented by the sym-3, avr-3, rot-3 types).
in this connection, the described effects of non-uniformities of the flow field are valid for steady speeds. the change of the speed in time was not taken into consideration, since this character of the flow field is not in accordance with the classical conditions for measurements on the models in the wind tunnel. the vortex flow field refers to the unsymmetrical situation, where the radial speeds of more than one vortex create a non-uniform speed profile without a unique relation to the size of the model. the vortex flow field was defined for analyses where the range, the intensity and the number of vortices are monitored via the local radial speeds. the center of the vortex can be identified on the basis of two local radial speeds, but these two speeds do not provide any idea about the characters of the local vortices (fig. 7).
the vortex flow field is thus acceptable for the measurements when there are a limited number of low-intensity vortices. the character of the vortex flow field cannot affect the space where the critical parameters of the model are evaluated. for a comparison of the types and kinds of the speed profile characteristics, the three basic profiles were considered. these profiles are distinguished from each other by the area in which the speeds are different from the defined speed $v_s$. for simplicity, the speed profiles were designed with a square differential area, as illustrated in the corresponding graphs (fig. 8). the speed difference $b$ and the position $a$ determine this area. the $a$ parameter expresses the range of the deformation of the speed profile as a percentage ($a$ = 80 % means that the deformation extends over 80 % of the speed profile; it is drawn in fig. 8, where the deformations are designed as symmetrical according to the axis of the speed profile). the $b$ parameter expresses the size of the speed difference as a percentage of the defined speed ($b$ = 20 % means that the difference of the speed in the region of the deformation described by the $a$ parameter is 20 % in comparison with the defined speed of the flow field). generally, an increase in the specific speed value means a higher stage of non-uniformity of the flow field. for the sym-1, sym-2, avr-1, and avr-2 profiles, the specific speed grows linearly with parameters $a$ and $b$. for the rotating rot-1 and rot-2 profiles, the specific speed increases in the range of the $a$ parameter up to 20 % for rot-1 and 40 % for rot-2, and the specific speed decreases in the range of the $a$ parameter from 60 % for rot-1 and 80 % for rot-2. a similar case is shown for the sym-3 profile in fig. 8, where the specific speed increases up to 50 % of the $a$ parameter, and then the specific speed decreases. the decreased specific speed for a higher value of the $a$ parameter indicates a more uniform flow field, but the real speed of the flow field differs in relation to the defined speed $v_s$; this difference between the real speed of the flow field and the defined speed therefore creates the avr profiles. the avr-3 and rot-3 profiles create two different areas in one sector, with positive and negative differential speeds. if position $a_1$ or $a_2$ is greater than 50 %, the results are the same for the avr-3 profiles (hence the corresponding graph in fig. 8 does not distinguish parameters $a_1$ or $a_2$), but the results are different for the rot-3 profiles. the complex characteristics of the non-uniformity of the flow field are based on all the specific speeds described here, which represent the types of deformation. these profiles are marked on the basis of their parameters: the first letter refers to the type of profile (sym, avr, rot), the second number refers to the kind of profile (sym-1, sym-2, sym-3, etc.), and the last two numbers refer to the difference in speed, as a percentage (parameter $b$). the rot marking is followed by an index tagging the position of the speed difference. for example, profile r380-a1 indicates a rot-3 type analysis with 80 % of the speed difference in both directions (positive and negative speed differences), with the variable parameter for the position of the speed difference being the length range of $a_1$.

figure 8. the comparison of the basic speed profiles (panels: specific speed versus the a parameter for the sym-1 to sym-3, avr-1 to avr-3 and rot-1 to rot-3 profiles, each for speed differences b of 20 %, 50 % and 80 %; the rot-3 panel distinguishes the a1 and a2 positions).
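the statement that the specific speed grows linearly with a and b for the sym-1 profile can be checked numerically on the square-area profiles just described. the sketch below builds a synthetic sym-1-style profile (one sector carries a square speed defect of relative depth b over a fraction a of the points, the opposite sector is undisturbed) and evaluates eq. (1); the construction itself is mine, not the author's:

```python
import numpy as np

def square_profile(k, a, b, v_s):
    # square differential area: a fraction a of the points deviate by b * v_s
    v = np.full(k, v_s)
    n = int(round(a * k))
    v[:n] -= b * v_s
    return v

v_s, k = 40.0, 100
for a in (0.2, 0.5, 0.8):
    for b in (0.2, 0.5, 0.8):
        v_a = square_profile(k, a, b, v_s)   # deformed sector (sym-1 case)
        v_d = np.full(k, v_s)                # undisturbed opposite sector
        vsym = np.abs(v_d - v_a).sum() / (k * v_s)
        print(f"a={a:.0%} b={b:.0%}  V_sym={vsym:.2f}")   # equals a*b exactly
```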
11. consequences for the aerodynamic coefficients

the classification of the flow fields presented above defines the applicability of the results from the measurements due to the flow field deformation. the character of the flow field produces reactions of the force and moments of the measured model, which are recorded by the tensometric balance. the values of the forces or moments depend on the parameters of the flow field included in the well-known formula
$$X = c_X \frac{\varrho v^2}{2} l^x, \quad (6)$$
where $X$ is the drag, lift or side force component, with the parameter $x = 2$, or the pitching, yawing or rolling moment component, with the parameter $x = 3$. the aerodynamic coefficients $c_X$ are calculated on the basis of the measured forces or moments $X$ and the flow field characteristics $\varrho$ and $v$ on the model with characteristic length $l$. in connection with this, the speed of the flow field is the most important parameter for the accuracy of the calculation of the aerodynamic coefficients.

12. deformed speed of the flow field in the model

the ideal flow field in the wind tunnel has the axial speed component $v_s$ only. the deformation of the flow field produces the radial speed component, which influences the results of the measured model with additional forces and moments. in this case, the real speed of the flow field is the deformed speed $v_{def}$ (fig. 9).

figure 9. deformed speed of the flow field.

matrix notation is used to express the deformed speed:
$$\begin{pmatrix} v_s \\ v_y \\ v_z \end{pmatrix} = v_{def} \begin{pmatrix} \cos\beta\cos\alpha \\ \sin\beta\cos\upsilon_1 \\ \sin\alpha\sin\upsilon_1 \end{pmatrix}, \quad (7)$$
and after transformation to the tangential and normal speed components, the relation is as follows:
$$\begin{pmatrix} v_s \\ v_t \\ v_n \end{pmatrix} = v_{def} \begin{pmatrix} \cos\beta\cos\alpha \\ \sin\beta\sin\upsilon_2 \\ \sin\alpha\cos\upsilon_2 \end{pmatrix}. \quad (8)$$
the angles $\alpha$ and $\beta$ are stated on the basis of the measured speed components:
$$\tan\alpha = \frac{v_z}{v_s} \quad \text{and} \quad \tan\beta = \frac{v_y}{v_s}. \quad (9)$$
the speeds $v_s$, $v_y$ and $v_z$ are registered when the measurements of the flow field are checked, and their expression is suitable for connecting with the deformed speed via the specific speeds:
$$\bar V_i = \frac{v_i}{v_{def}}, \quad \text{where } i = s, y, z. \quad (10)$$
these specific speeds present only the changes in the angles of the total speed of the flow field, according to relation (7) or (8).

13. aerodynamic coefficients

the ideal situation is a flow field with no deformation. in this case, the flow field has one speed component in the direction of the axis of the wind tunnel, which creates the drag force only for the symmetric model. if the drag force is registered by the tensometric balance, relation (6) can be modified:
$$(c_D)_{vyp} = c_D \frac{\frac{1}{2}\varrho v_{def}^2 l^2}{\frac{1}{2}\varrho v_s^2 l^2} = \frac{c_D}{\bar V_s^2}. \quad (11)$$
the inaccuracy of the drag coefficient calculation is defined as $\delta c_D = c_D - (c_D)_{vyp}$, and this means
$$\delta c_D = (c_D)_{vyp}\left(\cos^2\beta\cos^2\alpha - 1\right). \quad (12)$$
the same process can be used for the lift and side-force coefficients, and the relation is as follows:
$$\delta c_i = (c_D)_{vyp}\left(1 - \frac{1}{\bar V_i^2}\right), \quad \text{where } i = y, z. \quad (13)$$
relation (13) can be used for the moment coefficients, whereby the appreciated speed produces a force acting on the real length. if the characteristic length is equal to the real length, the relations for the aerodynamic moment coefficients correspond to relations (13). if the symmetric model is measured, only the drag coefficient is non-zero for the ideal flow field. the graph in fig. 10 presents the increasing inaccuracy of the drag coefficient with the deformation of the flow field through the angles $\alpha$ and $\beta$.

figure 10. the inaccuracy of the drag coefficient for the deformation of the flow field.
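relation (12) can be tabulated directly, which is essentially what fig. 10 plots. a short sketch (the angle grid is arbitrary, chosen only for illustration):

```python
import numpy as np

def delta_cd_rel(alpha_deg, beta_deg):
    # eq. (12): delta c_D / (c_D)_vyp = cos^2(beta) * cos^2(alpha) - 1
    a, b = np.radians(alpha_deg), np.radians(beta_deg)
    return np.cos(b) ** 2 * np.cos(a) ** 2 - 1.0

for alpha in (0.0, 1.0, 2.0, 5.0):
    row = [f"{delta_cd_rel(alpha, beta):+.4f}" for beta in (0.0, 1.0, 2.0, 5.0)]
    print(f"alpha={alpha:3.0f} deg:", "  ".join(row))
# a 5-degree inclination in both angles already biases the drag
# coefficient by about -1.5 %
```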
14. conclusion

this paper has presented a study of flow field non-uniformity in connection with measurements in a wind tunnel. because the study of the flow field deformation was described on a general level, the paper mentioned all cases, including events that are improbable under the conditions in a wind tunnel. however, these situations provide a complete survey of flow field deformation. due to the complications for the vortex flow field mentioned here, a general description of this problem requires deeper analysis. the effect of the flow field deformation on the aerodynamic coefficients was analyzed on the basis of a classification of the non-uniformity of the flow field. this effect was designated as the inaccuracy of the calculation of the aerodynamic coefficients, and is presented in a graph for the drag coefficient.

list of symbols
a  length of the speed difference area [%]
b  value of the speed difference area [%]
c_X  aerodynamic coefficient [–]
l  characteristic length [m]
R  diameter of the wind tunnel [m]
v  speed of the flow field in the wind tunnel [m s⁻¹]
V̄  specific speed of the speed profile [–]
y, z  distance between positions of the probe at the check measuring [m]
δc_X  inaccuracy of the aerodynamic coefficient [–]
ϱ  air density [kg m⁻³]

references
[1] barlow, j. b., rae, w. h., jr., pope, a.: low-speed wind tunnel testing, third edition, john wiley & sons, inc., new york, 1999
[2] pátek, z.: zkušební proud vzduchu v aerodynamickém tunelu ∅ 3 m, zpráva vzlú r-3401/02, 2002
[3] maturkanič, d.: nejistoty měření v aerodynamickém tunelu, zpráva vzlú r-5778, 2013
[4] hora, a., mráz, v.: metodika měření v tunelu ∅ 3 m, zkušební postup zp-anr-02, 2002
[5] hull, d. g.: fundamentals of airplane flight mechanics, springer, london, 2007
[6] aiaa: assessment of experimental uncertainty with application to wind tunnel testing, aiaa standard s-071a-1999
[7] jcgm: evaluation of measurement data – guide to the expression of uncertainty in measurement, 2008

acta polytechnica 54(2):127–129, 2014, doi:10.14311/ap.2014.54.0127 © czech technical university in prague, 2014. available online at http://ojs.cvut.cz/ojs/index.php/ap

dirac and hamilton

p. g. l. leach(a,b)
(a) department of mathematics and statistics, university of cyprus, lefkosia, republic of cyprus
(b) school of mathematical sciences, university of kwazulu-natal, private bag x54001, durban 4000, republic of south africa
correspondence: leachp@ukzn.ac.za

abstract. dirac devised his theory of quantum mechanics and recognised that his operators resembled the canonical coordinates of hamiltonian mechanics. this gave the latter a new lease of life. we look at what happens to dirac's quantum mechanics if one starts from hamiltonian mechanics.

keywords: dirac, quantum mechanics, hamiltonian mechanics.

1. introduction

when dirac was developing his theory of quantum mechanics, see [1], he observed that the operators he had introduced reminded him of something from mechanics. he looked to his personal collection in his study and could find nothing suitably advanced, such as whittaker's treatise [2], which would contain such information. it was a sunday afternoon, and naturally in those more civilised days the college library would not be open. dirac had to wait until monday morning to find that in fact the very properties he had in his operators, apart from the occasional i, were precisely those of the poisson brackets of hamiltonian mechanics, which had been established and developed some 120 years before by poisson and incorporated into the new mechanics some 30 years later. it is a little difficult to draw a line in the history of the development of mechanics and to claim what is the terminus a quo, but to keep this narrative compact we start with the mechanics of newton [3], for it is on his laws of motion that the subsequent evolution of mechanics has been based. one of the problems in practice is the solution of equations, and it soon became apparent that the number of problems which could be solved at any particular epoch was rather limited. the theoreticians were always trying to think of a new way to solve the unsolvable. the calculus of variations provided one way to look for solutions from a different direction. developing upon the work of euler, lagrange in his mécanique analytique of 1788 introduced his equations of motion based upon the variation of a functional called the action integral. as far as classical mechanics was concerned, lagrange's equations of motion were of the second order.
in principle these equations reduced to the corresponding newtonian equations, but lagrange had introduced the idea of generalised coordinates and generalised forces, which always have the possibility of giving equations which look simpler than the naive newtonian equations. then along came hamilton, who decided to reduce mechanics to a system of first-order equations. in a sense he was returning to the original formulation of newton ii. as the hamiltonian, as it came to be called, was derived by introducing a momentum as a derivative of the lagrangian and then obtained by using a legendre transformation, it inherited the generalised coordinates and generalised forces of lagrangian mechanics, except that now there were generalised coordinates and generalised momenta. one of the attractions of hamilton's approach was the introduction of a theory of transformations whereby one could transform from one hamiltonian to another by means of transformations which obeyed certain rules. hamiltonian mechanics was and is a marvellous theoretical construct which does have some practical uses. however, as a general tool in elementary mechanics it is not of much practical use, and could well have faded into near-oblivion had it not been for the observation of dirac. now hamiltonian mechanics provided a lodestone to the new quantum mechanics.

2. a problem

we have remarked that the basis of these theoretical constructs can be seen in the equations of motion of newton, particularly in the perturbation theory of orbital mechanics. the construction of a hamiltonian from a lagrangian is a quite definite procedure. however, the construction of a lagrangian which leads to the proper newtonian equation of motion, which in itself begs the question of properness, is by no means unique. we take a very elementary example, namely the simple harmonic oscillator in one space dimension. the newtonian equation of motion is
$$\ddot q + q = 0, \quad (1)$$
in which we have scaled the time to remove a distracting $\omega^2$. equation (1) has a plethora of lagrangians; for a modest sampling see [4]. here we consider just three, namely
$$L_1 = \tfrac{1}{2}\left(\dot q^2 - q^2\right), \quad (2)$$
$$L_2 = \frac{1}{2\sin^2 t\,(\dot q\sin t - q\cos t)} \quad \text{and} \quad (3)$$
$$L_3 = \frac{1}{2\cos^2 t\,(\dot q\cos t + q\sin t)}. \quad (4)$$
these lagrangians share a common property in that they all possess five point noether symmetries, which is the maximal number for a one-degree-of-freedom system. in each case the algebra of the symmetries is $sl(2,r) \oplus_s 2a_1$, which is a subalgebra of the $sl(3,r)$ algebra of (1). the standard formalism leads to three hamiltonians. these are
$$H_1 = \tfrac{1}{2}\left(p^2 + q^2\right), \quad (5)$$
$$H_2 = pq\cot t + i\sqrt{2p}\,\sin^{-3/2} t \quad \text{and} \quad (6)$$
$$H_3 = -pq\tan t + i\sqrt{2p}\,\cos^{-3/2} t, \quad (7)$$
with the canonical momentum in each case being
$$L_1:\; p = \dot q, \qquad L_2:\; p = -\frac{\operatorname{cosec} t}{2(\dot q\sin t - q\cos t)^2} \qquad \text{and} \qquad L_3:\; p = -\frac{\sec t}{2(\dot q\cos t + q\sin t)^2}.$$
the 'quantisation procedure' for $H_1$ (5) is well known and leads to the equation
$$2iu_t = -u_{qq} + q^2 u, \quad (8)$$
about which any competent undergraduate can write at length. in principle we may follow the same procedure for $H_2$ and $H_3$. however, there are two problems. the first is the presence of $\sqrt{p}$, which doubtless makes the process a little complicated. it is true that there exist procedures for dealing with nonstandard forms, but this is not the place to deal with such things, as we are considering a very elementary problem.
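before moving on, the claim that all three lagrangians reproduce eq. (1) can be checked symbolically. a sketch using sympy (my own check, not the author's mathematica computation), with the lagrangians as reconstructed here; solving the euler-lagrange equation for the acceleration should return −q in each case, on the assumption that the singular factors do not vanish:

```python
import sympy as sp
from sympy.calculus.euler import euler_equations

t = sp.symbols('t')
q = sp.Function('q')(t)
qd = q.diff(t)

L1 = sp.Rational(1, 2) * (qd**2 - q**2)
L2 = 1 / (2 * sp.sin(t)**2 * (qd * sp.sin(t) - q * sp.cos(t)))
L3 = 1 / (2 * sp.cos(t)**2 * (qd * sp.cos(t) + q * sp.sin(t)))

for name, L in (('L1', L1), ('L2', L2), ('L3', L3)):
    el = euler_equations(L, [q], t)[0]      # euler-lagrange equation for L
    qdd = sp.solve(el, q.diff(t, 2))        # isolate the acceleration
    print(name, '->  qddot =', sp.simplify(qdd[0]))   # expect -q(t)
```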
the other difficulty with quantising $H_2$ and $H_3$ is that, even neglecting the fractional power, the resulting schrödinger equation would be linear in the spatial derivative. the question is what to do? there are various possibilities:
(1.) a mistake was made in the calculation of the second and third hamiltonians. this is unlikely, as the algorithm is particularly simple and mathematica is much better at arithmetic than i am.
(2.) the process was initiated under false pretences. one recalls that the simple harmonic oscillator has three linearly independent quadratic integrals. they are
$$I_1 = \tfrac{1}{2}\left(\dot q^2 + q^2\right),$$
$$I_2 = \tfrac{1}{2} e^{2it}\left\{\left(\dot q^2 - q^2\right) - 2iq\dot q\right\} \quad \text{and}$$
$$I_3 = \tfrac{1}{2} e^{-2it}\left\{\left(\dot q^2 - q^2\right) + 2iq\dot q\right\}.$$
the first integral, $I_1$, corresponds to the hamiltonian and equally leads to the schrödinger equation given above. $I_2$ and $I_3$ can also be used to construct schrödinger equations, but the question is that of meaning [5]. one feature of the schrödinger equation for $H_1$ is that it is a parabolic equation which has the same specific algebra as $L_1$. one could take the noether symmetries of $L_2$, respectively $L_3$, and construct the corresponding schrödinger equation with the additional requirement that it be linear. the meaning of the outcome could be of interest [6]. one could conclude that the similarity of the operators introduced by dirac and their identification with those of hamiltonian mechanics is an accident, and one should read dirac's book carefully.

3. conclusion

this simple example already shows that there is the potential for ambiguity in the interpretation of the properties of a classical hamiltonian, in this case that of the simple harmonic oscillator. a critical aspect is the interpretation of how one should progress from classical mechanics to quantum mechanics. in the approach adopted by dirac, the hamiltonian corresponding to the 'standard' lagrangian, i.e. a lagrangian of the form $L = T - V$ in the case of simple systems, was used to construct an operator which fitted into the expectation for the corresponding quantum mechanical system. if one views the newtonian equation of motion, (1), as the fundamental source of the problem, that equation has eight lie point symmetries. without going into anything fanciful, one can construct a large number of lagrangians using these symmetries, two at a time, with the vector field of the equation of motion to determine jacobi last multipliers, and the relationship $\partial^2 L/\partial \dot x^2 = M$, where $M$ is a last multiplier, to calculate a whole pile of lagrangians for the single newtonian equation, (1). one can then apply noether's theorem to each of these lagrangians in turn to determine the noether symmetries. these vary in number up to a maximum of five. there are three such lagrangians [7], the $L_1$, $L_2$ and $L_3$ listed above. the first, $L_1$, is the one which is usually used to construct the hamiltonian for the simple harmonic oscillator and hence a schrödinger equation. why is it that this lagrangian should be chosen when there are two other lagrangians equally well endowed with noether symmetries? one can reasonably claim that this choice leads to physically acceptable results. however, this does not really gel well with the idea of symmetry and operators. consequently one must argue that the choice of lagrangian, and hence of the hamiltonian and the corresponding operators for quantum mechanics, must be predicated on other considerations. in dirac's book he writes of the energy being the source of the operator to be used for quantisation.
it is an accident that in elementary mechanics the energy is the hamiltonian in the usual meaning of the word. that the operators required for dirac's quantum mechanics had essentially the same properties as the canonically conjugate variables of classical hamiltonian mechanics is perhaps a cause for jumping on a bandwagon without checking to see if the horses had been harnessed.

acknowledgements

this paper was prepared while pgll was enjoying the hospitality of professor christodoulos sophocleous and the department of mathematics and statistics, university of cyprus. the continuing support of the university of kwazulu-natal and the national research foundation of south africa is gratefully acknowledged. any opinions expressed in this talk should not be construed as being those of the latter two institutions.

references

[1] p. dirac. the principles of quantum mechanics. first edition. cambridge, at the clarendon press, 1932.
[2] e. whittaker. a treatise on the analytical dynamics of particles and rigid bodies. fourth edition, first american printing. dover, new york, 1944.
[3] i. newton. principia. fourth edition, second printing. editor: cajori f, university of california press, berkeley, vols i & ii (translation by motte a), 1962.
[4] nucci m. c., leach p. g. l. lagrangians galore. journal of mathematical physics 48(123510):1–16, 2007.
[5] nucci m. c., leach p. g. l. classical integrals as quantum mechanical differential operators: a comparison with the symmetries of the schrödinger equation. in preparation.
[6] nucci m. c., leach p. g. l. lagrangians of the free particle and the simple harmonic oscillator of maximal noether point symmetry and their corresponding evolution equations of schrödinger type. in preparation.
[7] p. winternitz. subalgebras of lie algebras. example of sl(3, r). centre de recherches mathématiques crm proceedings and lecture notes 34:215–227, 2004.

acta polytechnica 53(2):110–112, 2013, © czech technical university in prague, 2013, available online at http://ctn.cvut.cz/ap/

computer simulation of a plasma vibrator antenna

nikolay n. bogachev (a,∗), irina l. bogdankevich (b), namik g. gusein-zade (b), vladimir p. tarakanov (b)
(a) moscow state technical university of radio engineering, electronics and automation, moscow, russia
(b) prokhorov general physics institute ras, moscow, russia
(∗) corresponding author: bgniknik@yandex.ru

abstract. the use of new plasma technologies in antenna technology is widely discussed nowadays. the plasma antenna must receive and transmit signals in the frequency range of a transceiver. many experiments have been carried out with plasma antennas to transmit and receive signals. due to lack of experimental data and because experiments are difficult to carry out, there is a need for computer (numerical) modeling to calculate the parameters and characteristics of antennas, and to verify the parameters for future studies. our study has modeled plasma vibrator (dipole) antennas (pda) and metal vibrator (dipole) antennas (mda), and has calculated the characteristics of pdas and mdas in the full karat electro-code. the correctness of the modeling has been tested by calculating a metal antenna using the mmana program.

keywords: computer simulation, plasma antenna, karat code, mmana program.

1. introduction

many analytical and experimental studies of plasma antennas have been made in recent years [1, 3, 4].
these studies have investigated problems of the formation and excitation of plasma by an rf field, the transmission and receiving of tem-waves by plasma antennas, the efficiency of plasma antennas, etc. a group of scientists from the prokhorov general physics institute, russian academy of sciences has made two fairly comprehensive studies [3, 4]. the most promising approach is the use of antennas with discharge plasma – an asymmetric dipole of length l = λ/4 (see fig. 1). it is not always possible to investigate plasma antennas with analytical or experimental methods. the best available option is to use computer (numerical) models of plasma antennas. thin metal vibrators can easily be calculated theoretically in an approximation to an ideal thin vibrator. plasma dipole antennas with r ≈ 1 cm cannot be considered when approximating an ideal thin vibrator. for the reasons mentioned above, and due to the features of plasma antennas (gas ionization and excitation of plasma), computer simulation is necessary.

2. simulation

modern free software for simulating metal antennas does not allow us to establish a plasma environment. in particular, the mmana software package [2] is convenient for modeling metal antennas of various configurations (stationary mode). researchers are therefore aiming to design a plasma antenna of finite thickness that transmits a modulated pulse of finite duration. the karat electro-code [5] was chosen as the software for modeling plasma antennas. the karat code allows us to simulate a metal antenna and a plasma antenna, and the transmission or receiving of a modulated pulse of finite duration. the karat code solves the maxwell equations using the finite-difference time-domain method. the material equations of environments may be represented by various models, by both phenomenological and pic (particle-in-cell) methods. in our studies, the plasma model is phenomenological. in the simulations, we modeled the injection of a tem wave with the given frequency through a ring window (radii 1 cm and 2 cm) at the left boundary z = 0 of the coaxial line. this window is one of the elements of the computation region, which is chosen to be the quadrant shown in fig. 2. the other elements in the computation region are the short coaxial line 5 cm in length, a radiation output window at the end of the coaxial line, and a finite spatial zone surrounding the plasma column. the outer wall of the coaxial waveguide is adjacent to the transverse conducting screen in the form of a disc. the correctness of the modeling is confirmed by a comparison of the metal antenna results using the karat code and using the mmana program. the calculations of the radiation pattern (rp) for a metal antenna of length l = 17 cm and a tem wave with frequency f = 400 mhz show that the two models are quantitatively similar (see fig. 3).

figure 1: scheme of the experimental device; 1 – dielectric tube with plasma, 2 – metal screen, 3 – coaxial line, 4 – coaxial tee, 5 – coaxial cable, 6 – transmitter.
figure 2: scheme of the computer model in karat code: 1 – coaxial cable, 2 – plasma antenna, 3 – metal screen, 5 – absorber.
figure 3: radiation pattern for models of mda in the mmana program (green) and in the karat code (red).
figure 4: radiation power p from the plasma antenna vs. electron plasma density ne.
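the antenna length used throughout (l = 17 cm) sits close to the quarter-wave resonance at the 400 mhz drive frequency; a quick check (illustrative numbers only, taking c ≈ 3 × 10⁸ m/s):

```python
# quarter-wave length at the drive frequency used in the simulations
c = 3.0e8   # speed of light [m/s]
f = 400e6   # tem-wave frequency [hz]
lam = c / f
print(f"lambda   = {lam * 100:.2f} cm")      # 75.00 cm
print(f"lambda/4 = {lam / 4 * 100:.2f} cm")  # 18.75 cm
```

the simulated resonance at l = 17 cm is a few percent below λ/4 = 18.75 cm, consistent with the usual shortening of a real (finite-radius) monopole against the ideal thin vibrator.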
3. results and discussion

the well-known dispersion equation for an unmagnetized plasma is
$$\varepsilon(\omega) = \varepsilon_0\left(1 + \frac{\omega_p^2}{\omega\,(i\nu_e - \omega)}\right), \tag{1}$$
where $\omega_p = \sqrt{n_e e^2/(m_e\varepsilon_0)}$, $\varepsilon_0$ is the permittivity of free space, $n_e$ is the plasma electron density, $e$ is the electron charge, $m_e$ is the mass of an electron and $\nu_e$ is the average collision frequency of the electrons. throughout the simulation we use the plasma density as the plasma parameter. the dependence of the radiation power on the plasma density for the plasma antenna (length l = 17 cm, frequency f = 400 mhz) is presented in fig. 4. the graph shows that the radiation power reaches its maximum for ne ≥ 10¹² cm⁻³ and is almost constant as the density increases further. for the mda and the pda, the dependence of the radiation power p on the antenna length l was calculated for various radii of the antenna (fig. 5; frequency f = 400 mhz). the graphs show a resonance in the dependence of p on l. the resonance shifts to short waves with increasing radius r, because the capacitance of the equivalent oscillatory circuit of the antenna also increases. the region of the resonance expands as the input impedance decreases. two cases were considered for the pda: 1) a plasma column of variable length in a dielectric tube of fixed length 35 cm; 2) a plasma column in a dielectric tube of variable length. case 1) corresponds to an extension of the coil of the equivalent oscillatory circuit of the antenna (shift of the resonance and a weak decrease in the radiation power). for the pda (l = 17 cm, r = 1 cm, ne = 3 × 10¹² cm⁻³ and f = 400 mhz) the radiation pattern was calculated on a sphere of radius ρ = 270 cm (≈ 3÷4 λ). the rp of the pda and the rp of the mda (fig. 6) are similar in form. however, the rp value for the plasma antenna is substantially smaller than the rp value for the metal antenna, because there are bigger losses in matching the coaxial cable to the antenna. the frequency dependences of the radiation power for the pda (l = 17 cm, r = 1 cm) for various densities ne are illustrated in fig. 7.

figure 5: radiation power p of mda (a) and pda (b) vs. length l of the antenna.
figure 6: radiation patterns of metal (red) and plasma (blue) vibrator antennas in karat code.
figure 7: radiation power p of the mda and pda vs. the frequency.
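to make eq. (1) concrete, the sketch below (not from the paper; collisionless case $\nu_e = 0$ for simplicity) evaluates the relative permittivity at f = 400 mhz for several densities. strongly negative values, reached for $n_e \gtrsim 10^{12}$ cm⁻³, mean the plasma column reflects the wave essentially like a metal, which is consistent with the saturation of the radiated power in fig. 4:

```python
import numpy as np

EPS0 = 8.854e-12   # vacuum permittivity [f/m]
QE = 1.602e-19     # elementary charge [c]
ME = 9.109e-31     # electron mass [kg]

def eps_rel(ne_cm3, f_hz, nu_e=0.0):
    """relative permittivity of an unmagnetized plasma, eq. (1)."""
    ne = ne_cm3 * 1e6                  # cm^-3 -> m^-3
    w = 2 * np.pi * f_hz
    wp2 = ne * QE**2 / (ME * EPS0)     # plasma frequency squared [rad^2/s^2]
    return 1 + wp2 / (w * (1j * nu_e - w))

for ne in (1e10, 1e11, 1e12, 3e12):
    print(f"ne = {ne:.0e} cm^-3 -> eps_r = {eps_rel(ne, 400e6).real:.0f}")
```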
4. conclusions

the graphs show that:
• the pda is identical with the mda when the plasma density is ne > 2 × 10¹² cm⁻³;
• for densities 10¹¹ < ne < 10¹² cm⁻³, the dependences have two smooth resonances with almost the same power value. this indicates that it may be possible to tune the working frequency with almost zero delay and without loss of effectiveness.

a change in the plasma density can control the characteristics of a plasma antenna, for example by switching between two working frequencies. this property can be used actively to create a system of phased array antennas.

acknowledgements

the study was supported by the ministry of education and science of russia, project 8392.

references

[1] i. alexeff, t. anderson, i. farshi, et al. experimental and theoretical results with plasma antennas. physics of plasmas 15:057104, 2008.
[2] i. v. goncharenko. kompyuternoe modelirovanie antenn (computer-aided antenna modeling). radiosoft, moscow, 2002. (in russian).
[3] e. n. istomin, d. m. karfidov, i. m. minaev, et al. plasma asymmetric dipole antenna excited by a surface wave. plasma physics reports 32(5):388, 2006.
[4] i. m. minaev, n. g. gusein-zade, k. z. rukhadze. a plasma receiving dipole antenna. plasma physics reports 36(10):914, 2010.
[5] v. p. tarakanov. user's manual for code karat. berkley research associates, inc., springfield, va, 1992.

doi:10.14311/ap.2014.54.0320, acta polytechnica 54(5):320–324, 2014, © czech technical university in prague, 2014, available online at http://ojs.cvut.cz/ojs/index.php/ap

structuring of diamond films using microsphere lithography

mária domonkos (a,b,∗), tibor ižák (a), lucie štolcová (c), jan proška (c), pavel demo (a,b), alexander kromka (a)
(a) institute of physics, academy of sciences of the czech republic v.v.i., cukrovarnická 10/112, 162 53 praha, czech republic
(b) department of physics, faculty of civil engineering, czech technical university in prague, thákurova 7, 166 29 praha 6, czech republic
(c) department of physical electronics, faculty of nuclear sciences and physical engineering, czech technical university in prague, břehová 7, 115 19 praha, czech republic
(∗) corresponding author: domonkos@fzu.cz

abstract. in this study, the structuring of micro- and nanocrystalline diamond thin films is demonstrated. the diamond films are structured using the technique of microsphere lithography followed by reactive ion etching. specifically, this paper presents a four-step fabrication process: diamond deposition (microwave plasma assisted chemical vapor deposition), mask preparation (by the standard langmuir-blodgett method), mask modification and diamond etching. a self-assembled monolayer of monodisperse polystyrene (ps) microspheres with close-packed ordering is used as the primary template. then the ps microspheres and the diamond films are processed in capacitively coupled radiofrequency plasma using various plasma chemistries. this fabrication method illustrates the preparation of large arrays of periodic and homogeneous hillock-like structures. the surface morphology of the processed diamond films is characterized by scanning electron microscopy and with the use of an atomic force microscope. the potential applications of these diamond structures in various fields of nanotechnology are also briefly discussed.

keywords: nanostructuring, diamond thin films, polystyrene microspheres, reactive ion etching, scanning electron microscopy.

1. introduction

1.1. diamond and its applications

the most commonly used methods for synthesizing diamond are high-pressure, high-temperature (hpht) synthesis [1] and chemical vapour deposition (cvd) methods [2]. other methods include explosive formation (forming detonation nanodiamonds), sonication of graphite solutions (ultrasound cavitation), laser ablation, etc. due to a unique combination of properties (e.g. extreme hardness, high thermal conductivity, wide band gap, negative electron affinity, high mechanical strength, chemical inertness, resistance to particle bombardment, and biocompatibility), diamond is a promising material for applications in various fields of electronics, bioelectronics, sensorics, etc. [3]. the potential applications of diamond depend not only on its intrinsic physical and chemical properties, but also on its surface geometry. the defined surface structuring allows these unique properties to be tuned and exploited for a wider range of applications. for example, structuring of films enhances the surface-to-volume ratio and therefore increases the sensitivity and other performances of fabricated devices [4].
for example, increasing the surface area-to-volume ratio of diamond films improves the field-emission properties by introducing the enhancement effect of the local field near the tips [5]. generally, various nanostructures can be fabricated from diamond films, known as nanowires, nanorods, nanoneedles, etc. therefore, there is still great interest in developing methods for obtaining diamond nanostructures with high area density, high aspect ratio (depth/width), good uniformity and controlled geometry over large areas [6].

1.2. structuring of diamond films

on the basis of the fabrication or structuring method, two basic concepts can be defined: a) the bottom-up strategy and b) the top-down strategy. for structuring diamond films, the bottom-up strategy (e.g. selective area deposition [7]) is rarely used because of its greater complexity and low process reliability. wet chemical etching is not applicable due to the high-temperature stability, super hardness and chemical inertness of diamond. among the top-down strategies, only dry reactive ion plasma etching through a mask can be used. the advantages are greater reliability, greater reproducibility, and a broader family of possible masking materials (metals, polymers, oxides or nitrides) than when the bottom-up strategy is used. the mask, which defines the required geometrical patterns, is formed using standard lithographic processes (photolithography, electron beam lithography, nanoimprinting, etc.). however, these processes are expensive or time-consuming. a possible mask preparation process would involve utilizing a monolayer of plasma-treated microspheres with controllable size and gap. in this study, we present structuring of diamond thin films using the technique of microsphere lithography.

table 1: process parameters for cvd diamond deposition and diamond etching using the ps template as a direct mask (for ps1500).

step 1a) mcd deposition: mw power 3000 w; total gas pressure 7000 pa; ch4/h2 5 %; co2/h2 1.5 %; temperature 800 °c; process time 4 h (pre-deposition: 2 % of ch4/h2 without co2 for 10 min).
step 1b) ncd deposition: mw power 3000 w; total gas pressure 7000 pa; ch4/h2 5 %; co2/h2 1.5 %; temperature 800 °c; process time 3 h (pre-deposition: 2 % of ch4/h2 without n2 for 10 min).
step 2a) rie process (o2 plasma): rf power 100 w; total pressure 12 pa; o2 flow rate 50 sccm (for 7 min); self-bias voltage −57 v.
step 2b) rie process (cf4/o2 plasma): rf power 100 w; total pressure 12 pa; (1.) o2 flow rate 50 sccm (for 2 min), (2.) o2/cf4 flow rate 40/10 sccm (for 8 min); self-bias voltage −57 v.

2. experimental section

a schematic illustration of the main technological steps for diamond structuring is shown in fig. 1:

(1.) diamond deposition. two types of diamond films were grown in a focused microwave chemical vapor deposition reactor (aixtron p6) on a silicon substrate 1 × 1 cm² in size (for the plasma parameters see rows 1 and 2 in table 1).

(2.) ps mask preparation. uniform periodic arrays of polystyrene (ps) microspheres were achieved by the standard langmuir-blodgett method, i.e. self-assembly of microspheres in a hexagonally close-packed monolayer at the water-air interface. the initial diameter of the ps microspheres was 1500 nm. details about the preparation of the mask can be found in ref. [8].

(3.) mask modification – ps etching.
the preparation of the hexagonally close-packed monolayer was followed by reactive ion etching. the ps microspheres were modified in a capacitively coupled plasma system (ccp-rie, phantom iii, trion technology) in an oxygen atmosphere (for the etching parameters see table 1, row 3).

(4.) diamond etching. the samples were subsequently also treated by the ccp-rie system. two different gas mixtures were used: pure o2 and 20 % of cf4 in o2 gas (see row 4 in table 1).

figure 1: schematic drawing of diamond film structuring using a ps microsphere array.

the plasma parameters were chosen on the basis of our previous studies on ps and diamond etching [9, 10]. the samples were systematically characterized after each technological step. the morphology of the cvd-grown and plasma-etched samples was characterized by scanning electron microscopy (sem, e_line writer, raith gmbh) and with the use of an atomic force microscope (veeco dimension 3100, nt-mdt ntegra). raman spectroscopy (renishaw invia reflex) with an excitation wavelength of 325 nm was employed to determine the chemical character (i.e. sp3 versus sp2 carbon bonds) of the diamond films.

3. results and discussions

figure 2 shows the surface morphology and the topography of the diamond films before the mask was prepared, taken by scanning electron microscopy. the diamond film with larger grain sizes (∼ 1 µm) was labelled as mcd (microcrystalline, fig. 2a), and the diamond film with smaller grain sizes (< 50 nm) was labelled as ncd (nanocrystalline, fig. 2b). both diamond films were about 3 µm in thickness. figure 2c,d shows the 3d topography of the diamond films (before the mask was prepared), provided by an atomic force microscope (afm).

figure 2: top-view sem and afm (2 × 2 µm) images of a), c) mcd and b), d) ncd diamond thin films before the mask was prepared.
figure 3: raman spectra of a) mcd and b) ncd films measured at an excitation wavelength of 325 nm.

the raman spectra clearly confirmed the diamond character of both films (fig. 3a,b). for the mcd film, the spectrum is dominated by a sharp peak located at 1332 cm⁻¹, which is the characteristic line for the phonon mode of the sp3 crystalline diamond phase. two broad bands located at frequencies ∼ 1374 cm⁻¹ and ∼ 1575 cm⁻¹ are attributed to the d-band (defects) and to the g-band (graphite), which represent the non-diamond carbon bonds (sp2 phase). a weak band centered at ∼ 1154 cm⁻¹ corresponds to the trans-polyacetylene-like groups at the grain boundaries [11]. this band is more intensive for ncd films, where the crystals are much smaller and more sp2 carbon bonds are present at grain boundaries. thus, the intensity of the g-band increases and the intensity of the diamond peak (at 1332 cm⁻¹) decreases, but it is still resolvable in the raman spectrum. right at the beginning of the plasma treatment (step 2a in table 1) we observed homogeneous etching of the ps over the whole sample.
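for orientation, the geometry of the undisturbed mask fixes the pattern that the etching transfers: a hexagonally close-packed monolayer of d = 1500 nm spheres has period d and about half a feature per µm². a small check (illustrative, not from the paper):

```python
import math

d = 1.5  # ps sphere diameter [um], as used for the mask

density = 2 / (math.sqrt(3) * d**2)      # spheres per um^2 (hexagonal lattice)
coverage = math.pi / (2 * math.sqrt(3))  # areal fill fraction of the monolayer

print(f"array period      : {d * 1000:.0f} nm")
print(f"feature density   : {density:.2f} per um^2")
print(f"monolayer coverage: {coverage:.1%}")   # ~90.7 %
```

this matches the ∼1500 nm period of the hillock arrays reported below.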
the close-packed arrays of ps microspheres were converted into a non-close-packed template with the preserved period of the initial microsphere array (not shown here) [10]. the surface morphology of the diamond films after the ccp-rie processes (steps 2a and 2b, table 1) is shown in figure 4. periodic hillock-like structures with a period of ∼ 1500 nm were observed. their height was estimated from the afm image to be ∼ 600 nm for mcd and ∼ 800 nm for ncd.

figure 4: tilted-angle view (45°) sem images of etched mcd diamond films (columns 1, 2) and ncd diamond films (columns 3, 4) prepared using a ps microsphere mask (ps 1500 nm) by etching in oxygen plasma (a, c, e, g) and with the addition of cf4 (b, d, f, h). (note: the upper row and the bottom row differ only in magnification.)

the fabricated hillock-like features were better recognized (i.e. their geometrical border was better defined) for the ncd films. this is attributed to the nanocrystalline character itself. in the case of mcd films, the fabricated structures reveal non-sharp edges (borders), which we assign to the different etching rates of each crystal facet of the diamond. this means that the (111) facets were etched at different rates than the (100)-oriented diamond crystals, etc. [12]. the mcd films consist of randomly oriented and well-faceted crystals, and the surface roughness is higher for the thicker films. from this point of view, the ncd films do not reveal such dependences or nonhomogeneities on the microscopic scale, i.e. the grain size and the surface roughness do not vary with the film thickness. the surfaces of the mcd and ncd films etched in pure oxygen plasma (columns 1 and 3 of fig. 4) correspond well to our previous studies [9]. rie etching in o2 led to the formation of diamond needle-like structures (or so-called whiskers). the main reason for this effect is that the ions are vertically accelerated to the substrate, and not only chemical etching but also physical sputtering takes place. moreover, it is well known that oxygen etches sp3 diamond bonds much faster than sp2 carbon bonds [13], resulting in the formation of needle-like structures. in ncd films, more sp2 carbon bonds at the grain boundaries correspond to the formation of needle-like structures [14]. however, the addition of cf4 into o2 resulted in flattening/smoothing of the diamond surface.

4. conclusion

in summary, we have demonstrated that microsphere lithography is a promising technique for structuring diamond thin films in the so-called top-down strategy. diamond films were grown using a focused microwave plasma cvd process, and their crystallographic character was controlled by the gas mixture that was used. the mcd films were grown from a co2 + ch4 + h2 gas mixture, and the ncd films were grown from a n2 + ch4 + h2 gas mixture. self-assembled, hexagonally close-packed ps microsphere arrays obtained by the langmuir-blodgett method were used as the masking material. first, the ps layer was treated in plasma to predefine the final geometry of the diamond structures. during continued plasma etching, the primary ps spheres were removed. using this cost-effective and time-efficient method, highly periodic and homogeneous hillock-like structures were achieved both for mcd films and for ncd films. it is believed that diamond structures fabricated over large areas will have a positive impact on their further applications as photonic crystals in optics or as active functional surfaces in sensorics, bioelectronics, biomedicine and electrochemistry.

acknowledgements

this work was supported by grants from the czech science foundation p108/12/g108 (t. i., a. k.). the work was carried out in the framework of the lnsm infrastructure.

references
[1] bundy, f. p. et al.: man-made diamonds. nature, 176, 1955, pp. 51–54. doi:10.1038/176051a0
[2] may, p. w.: diamond thin films: a 21st-century material. philos. trans. r. soc. lond. ser. math. phys. eng. sci., 358 (1766), 2000, pp. 473–495.
[3] koizumi, s. et al.: physics and applications of cvd diamond. weinheim: wiley-vch, 2008. doi:10.1002/9783527623174
[4] babchenko, o. et al.: nanostructuring of diamond films using self-assembled nanoparticles. cent. eur. j. phys., 7 (2), 2009, pp. 310–314. doi:10.2478/s11534-009-0026-8
[5] uetsuka, h. et al.: icp etching of polycrystalline diamonds: fabrication of diamond nano-tips for afm cantilevers. diam. relat. mater., 17 (4-5), 2008, pp. 728–731. doi:10.1016/j.diamond.2007.12.071
[6] demo, p. et al.: analytical approach to time lag in binary nucleation. phys. rev. e, 59 (5), 1999, pp. 5124–5127. doi:10.1103/physreve.59.5124
[7] babchenko, o. et al.: toward surface-friendly treatment of seeding layer and selected-area diamond growth. phys. status solidi b, 247 (11-12), 2010, pp. 3026–3029. doi:10.1002/pssb.201000124
[8] stolcova, l. et al.: periodic arrays of metal nanobowls as sers-active substrates. nanocon 2011. brno: tanger, 2011, pp. 737–741.
[9] domonkos, m. et al.: mask-free surface structuring of micro- and nanocrystalline diamond films by reactive ion plasma etching. adv. sci. eng. med., 6 (7), 2014, pp. 780–784. doi:10.1166/asem.2014.1573
[10] domonkos, m. et al.: controlled structuring of self-assembled polystyrene microsphere arrays by two different plasma systems. nanocon 2013. brno: tanger, 2013, pp. 34–38.
[11] kuzmany, h. et al.: the mystery of the 1140 cm⁻¹ raman line in nanocrystalline diamond films. carbon, 42 (5), 2004, pp. 911–917. doi:10.1016/j.carbon.2003.12.045
[12] neves, a. j., nazaré, m. h.: properties, growth and applications of diamond. iet, 2001.
[13] wang, q. et al.: chemical gases sensing properties of diamond nanocone arrays formed by plasma etching. j. appl. phys., 102 (10), 2007, p. 103714. doi:10.1063/1.2817465
[14] zou, y. s. et al.: fabrication of diamond nanopillars and their arrays. appl. phys. lett., 92 (5), 2008, p. 053105. doi:10.1063/1.2841822

doi:10.14311/ap.2013.53.0817, acta polytechnica 53(supplement):817–820, 2013, © czech technical university in prague, 2013, available online at http://ojs.cvut.cz/ojs/index.php/ap

constraining spacetime torsion with the moon, mercury and lageos

riccardo march (a,c), giovanni bellettini (b,c,∗), roberto tauraso (b,c), simone dell'agnello (c)
(a) istituto per le applicazioni del calcolo, cnr, via dei taurini 19, 00185 roma, italy
(b) dipartimento di matematica, università di roma tor vergata, via della ricerca scientifica 1, 00133 roma, italy
(c) infn – laboratori nazionali di frascati (lnf), via e. fermi 40, frascati 00044 roma, italy
(∗) corresponding author: giovanni.bellettini@lnf.infn.it

abstract.
we consider an extension of einstein general relativity where, beside the riemann curvature tensor, we suppose the presence of a torsion tensor. using a parametrized theory based on symmetry arguments, we report on some results concerning the constraints that can be put on torsion parameters by studying the orbits of a test body in the solar system.

keywords: riemann-cartan spacetime, torsion, autoparallel trajectories, geodetic precession, perihelion advance, frame dragging, lunar laser ranging, planetary radar ranging.

1. introduction

one of the various generalizations of the einstein theory of general relativity is the so-called einstein–cartan theory [11]: it consists of a spacetime endowed with a locally lorentzian metric $g_{\lambda\mu}$ and with a compatible connection $\Gamma^{\nu}{}_{\lambda\mu}$, not assumed to be symmetric, so that $\Gamma^{\nu}{}_{\lambda\mu}$ is not equal, in general, to $\Gamma^{\nu}{}_{\mu\lambda}$. the connection $\Gamma^{\nu}{}_{\lambda\mu}$ has a curvature tensor, and its lack of symmetry amounts to the additional presence of a torsion tensor [4]
$$s_{\lambda\mu}{}^{\nu} := \tfrac{1}{2}\left(\Gamma^{\nu}{}_{\lambda\mu} - \Gamma^{\nu}{}_{\mu\lambda}\right).$$
(explicit examples of torsion of a connection compatible with a riemannian metric in r², r³ or in a surface embedded in r³ can be found for instance in [1, 2, 20]; see also [21] for a discussion on nonsymmetric connections.) $\Gamma^{\nu}{}_{\lambda\mu}$ is determined uniquely by $g_{\mu\nu}$ and by the torsion tensor, as follows:
$$\Gamma^{\nu}{}_{\lambda\mu} = \left\{{}^{\nu}_{\lambda\mu}\right\} + s_{\lambda\mu}{}^{\nu} + s^{\nu}{}_{\mu\lambda} + s^{\nu}{}_{\lambda\mu},$$
where $\{\cdot\}$ is the levi-civita connection. (we recall that for a connection it is possible to choose coordinates so that $\Gamma^{\nu}{}_{\lambda\mu} + \Gamma^{\nu}{}_{\mu\lambda} = 0$ at a point [13, prop. 8.4].) usually, the torsion tensor is related to the intrinsic spin of matter [10, 11]; in [10] the reader can find an interesting analogy between torsion and the density of dislocations in crystals. since the spins of elementary particles in macroscopic matter are usually oriented in a random way, such theories predict a negligible amount of torsion generated by massive bodies. as a consequence, spacetime torsion would be observationally negligible in the solar system. on the other hand, as pointed out in [15], the existence or nonexistence of a torsion tensor in the solar system should be tested experimentally (in the teleparallel theory of hayashi and shirafuji [9] a massive body generates a torsion field, and gravitational forces are due entirely to spacetime torsion and not to curvature): for this reason the authors of [15] developed a parametrized theory based on symmetry arguments and, computing the precessions of gyroscopes, put constraints on torsion with the gravity probe b experiment. the aim of this short paper is to report on the results of [16, 17] concerning the constraints on torsion that can be put by studying the orbits of a test body in the solar system, following the nonstandard parametrized approach of [15]. the computations concern torsion corrections to: (i) the orbital geodetic (or de sitter) precession, (ii) the precession of the pericenter of a body (a planet) orbiting around a central mass, (iii) the orbital frame-dragging (or lense–thirring) effect. we then use the measured moon geodetic precession, mercury's perihelion advance and the data from the measurements of the lageos satellites to put constraints on torsion, looking at the secular perturbations of the orbits. (remember that in general relativity the secular precession of the perihelion of mercury is 43′′/century, the geodetic precession is 19.2 mas/yr, and the lense–thirring effect on the longitude of the nodes of the lageos satellites is around 31 mas/yr.)

2. working assumptions and autoparallel trajectories

our main working hypotheses are the following:

(i) weak field approximation and slow motion of the test bodies, assumptions probably sufficiently accurate for solar system experiments.
(ii) spherical or axial symmetry, depending on the situation at hand (in the case of the lense–thirring effect, we suppose also the earth uniformly rotating, while in the computation of the de sitter effect the earth and the sun are supposed to be nonrotating). for example, if $m$ is the mass of the body, the torsion tensor around a spherically symmetric body (sun/earth) in spherical coordinates $(t,r,\theta,\phi)$ can be parametrized to second order in $m/r \ll 1$ as
$$s_{tr}{}^{t} = t_1\frac{m}{2r^2} + t_3\frac{m^2}{r^3}, \qquad s_{r\theta}{}^{\theta} = s_{r\phi}{}^{\phi} = t_2\frac{m}{2r^2} + t_4\frac{m^2}{r^3},$$
where $t_1, t_2, t_3, t_4$ are dimensionless torsion parameters (all other components of the torsion tensor vanish; moreover, it turns out that $t_4$ will not enter into our computations). all torsion parameters are independent of the ppn parameters appearing in the expression of the metric. in the case of a uniformly rotating spherical body, the expression of the torsion tensor reads (to second order in $m/r$) as
$$s_{tr}{}^{t} = t_1\frac{m}{2r^2}, \qquad s_{r\phi}{}^{t} = w_1\frac{j}{2r^2}\sin^2\theta, \qquad s_{t\phi}{}^{r} = w_3\frac{j}{2r^2}\sin^2\theta, \qquad s_{tr}{}^{\phi} = w_5\frac{j}{2r^4},$$
$$s_{r\theta}{}^{\theta} = s_{r\phi}{}^{\phi} = t_2\frac{m}{2r^2}, \qquad s_{\theta\phi}{}^{t} = w_2\frac{j}{2r}\sin\theta\cos\theta, \qquad s_{t\phi}{}^{\theta} = w_4\frac{j}{2r^3}\sin\theta\cos\theta, \qquad s_{t\theta}{}^{\phi} = -w_4\frac{j}{2r^3}\frac{\cos\theta}{\sin\theta},$$
where $w_1,\ldots,w_5$ are further torsion parameters, $m$ is the mass of the rotating body and $j$ its angular momentum.

(iii) bodies move along (causal) autoparallel trajectories, namely they satisfy
$$\frac{d^2x^{\nu}}{d\tau^2} + \Gamma^{\nu}{}_{\lambda\mu}\frac{dx^{\lambda}}{d\tau}\frac{dx^{\mu}}{d\tau} = 0,$$
and not along geodesics. this latter assumption is more questionable; on the other hand, assuming that the test bodies move along geodesics does not give any constraint on the torsion parameters. for example, indicating with $m_{\odot}$ the mass of the sun, the system of autoparallel trajectories (of the moon) in the case of the computation of the geodetic precession reads as
$$\frac{d^2x^{\alpha}}{dt^2} + m_{\odot}\left(\frac{x^{\alpha}}{\Delta^3} - \frac{\xi^{\alpha}}{\rho^3}\right) = 2(\beta - t_3)m_{\odot}^2\left(\frac{x^{\alpha}}{\Delta^4} - \frac{\xi^{\alpha}}{\rho^4}\right) + (t_2 + 2)m_{\odot}\left(\frac{\dot{\Delta}\dot{x}^{\alpha}}{\Delta^2} - \frac{\dot{\rho}\dot{\xi}^{\alpha}}{\rho^2}\right) + 3\gamma m_{\odot}\left(\frac{x^{\alpha}\dot{\Delta}^2}{\Delta^3} - \frac{\xi^{\alpha}\dot{\rho}^2}{\rho^3}\right) - (2\gamma + t_2)m_{\odot}\left(\frac{x^{\alpha}\sum_{\sigma}(\dot{x}^{\sigma})^2}{\Delta^3} - \frac{\xi^{\alpha}\sum_{\sigma}(\dot{\xi}^{\sigma})^2}{\rho^3}\right),$$
where $\xi^{\alpha}, \rho$ are the heliocentric rectangular coordinates of the earth, $x^{\alpha}, \Delta$ are the heliocentric rectangular coordinates of the moon, and $x^{\alpha}, r$ are the geocentric rectangular coordinates of the moon. in the case of the lense–thirring effect, the system of autoparallels (of a satellite) reads as
$$\frac{d^2x}{dt^2} = -\frac{m_{\oplus}}{r^3}x + \frac{j}{r^5}\left[(d + a)xy\frac{dx}{dt} + \left(-dx^2 + ay^2 + bz^2\right)\frac{dy}{dt} + (a - b)yz\frac{dz}{dt}\right],$$
$$\frac{d^2y}{dt^2} = -\frac{m_{\oplus}}{r^3}y - \frac{j}{r^5}\left[(d + a)xy\frac{dy}{dt} + \left(ax^2 - dy^2 + bz^2\right)\frac{dx}{dt} + (a - b)xz\frac{dz}{dt}\right],$$
$$\frac{d^2z}{dt^2} = -\frac{m_{\oplus}}{r^3}z + \frac{j}{r^5}(d + b)z\left[y\frac{dx}{dt} - x\frac{dy}{dt}\right],$$
where $m_{\oplus}$ is the mass of the earth, $j$ its angular momentum, and
$$a = 1 + \gamma + \frac{\alpha_1}{4} + w_1 - w_3, \qquad b = -2\left(1 + \gamma + \frac{\alpha_1}{4}\right) + w_2 - w_4, \qquad d = -\left(1 + \gamma + \frac{\alpha_1}{4}\right) - w_1 - w_5.$$
it should be noted that from assumption (iii) it follows that the antisymmetric part of the torsion tensor cannot be measured (the torsion tensors that we will consider are not totally antisymmetric).

(iv) in the computation of the geodetic precession we assume that we can superimpose linearly the metric and torsion fields of the sun and the earth to obtain the global fields.

(v) all computations have been performed by taylor expanding in $m/r$ at the required order. other assumptions that we make are the following: (vi) existence of the newtonian limit, which fixes $t_1 = 0$; (vii) all ppn parameters different from $\gamma$, $\beta$ (and $\alpha_1$ in the case of the lense–thirring effect) are negligible; (viii) test bodies, such as planets, are supposed to be pointwise, in particular structureless; (ix) in the computation of secular effects, we perform time averages over suitably chosen time intervals.
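the remark on the antisymmetric part can be made explicit. with the index conventions reconstructed above, substituting the decomposition of $\Gamma^{\nu}{}_{\lambda\mu}$ into the autoparallel equation and using the symmetry of $\dot{x}^{\lambda}\dot{x}^{\mu}$ gives
$$\frac{d^2x^{\nu}}{d\tau^2} + \left\{{}^{\nu}_{\lambda\mu}\right\}\frac{dx^{\lambda}}{d\tau}\frac{dx^{\mu}}{d\tau} = -\left(s^{\nu}{}_{\mu\lambda} + s^{\nu}{}_{\lambda\mu}\right)\frac{dx^{\lambda}}{d\tau}\frac{dx^{\mu}}{d\tau},$$
since $s_{\lambda\mu}{}^{\nu}$ is antisymmetric in $\lambda\mu$ and drops out of the contraction; a totally antisymmetric torsion therefore leaves the trajectories unaffected, which is the content of the observation following assumption (iii).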
3. description of the results

using perturbative methods in celestial mechanics, in the case of the three-body problem we can constrain the torsion parameters with the moon as follows. the secular precession of the node $\Omega$ of the satellite orbiting around the earth turns out to be
$$(\delta\Omega)_{\mathrm{sec}} = \frac{1}{2}\frac{m_{\odot}\nu_0}{\rho}\left(1 + 2\gamma + \frac{3}{2}t_2\right)t, \tag{1}$$
where $\nu_0$ is the angular velocity of the earth and $t$ is time. we observe that $(\delta\Omega)_{\mathrm{sec}}$ is independent of the details of the satellite. this remark is important, since it allows us to use the result also in the case of the lageos satellites, and to decouple the geodetic precession from the lense–thirring precession. the same right hand side of eq. 1 is obtained for the secular precession of the lunar perigee $(\delta\tilde{\omega})_{\mathrm{sec}}$. from these computations we find
$$b \equiv \frac{\text{geodetic precession with torsion}}{\text{geodetic precession in gr}} = \frac{1}{3}(1 + 2\gamma) + \frac{t_2}{2}.$$
using the lunar laser ranging data giving the relative deviation from gr [23] we find $|b - 1| < 0.0064$. using the cassini measurement $\gamma = 1 + (2.1 \pm 2.3)\times 10^{-5}$ [3] gives
$$|t_2| < 0.0128. \tag{2}$$
using a two-body computation, we can now constrain the torsion parameters with mercury as follows. if $\tilde{\omega}$ is the longitude of the pericenter, the secular contribution $(\delta\tilde{\omega})_{\mathrm{sec}}$ reads as
$$(\delta\tilde{\omega})_{\mathrm{sec}} = (2 + 2\gamma - \beta + 2t_2 + t_3)\,\frac{m_{\odot}}{a(1 - e^2)}\,v,$$
where $a$ is the semimajor axis of the orbit, $e$ is the eccentricity and $v$ the true anomaly. then
$$b \equiv \frac{\text{perihelion precession with torsion}}{\text{perihelion precession in gr}} = \frac{1}{3}(2 + 2\gamma - \beta + 2t_2 + t_3).$$
using the planetary radar ranging data giving the relative deviation from gr of $10^{-3}$ [22], we find $|b - 1| < 0.001$. using the cassini measurement one gets $|1 - \beta + t_3| < 0.0286$. if in addition we assume $\beta = 1 + (1.2 \pm 1.1)\times 10^{-4}$ [23], then
$$|t_3| < 0.0286. \tag{3}$$
in the case of the lense–thirring effect, the precession of the node $\Omega$ of lageos reads as
$$(\delta\Omega)_{\mathrm{sec}} = \frac{j}{a^3(1 - e^2)^{3/2}}\left(1 + \gamma + \frac{\alpha_1}{4} - \frac{w_2 - w_4}{2}\right)t,$$
where $a$ is the semimajor axis of the satellite, $e$ the eccentricity of the orbit and $t$ is time. we then have
$$b_{\Omega} \equiv \frac{\text{precession of the node with torsion}}{\text{precession of the node in gr}} = \frac{1}{2}\left(1 + \gamma + \frac{\alpha_1}{4}\right) - \frac{w_2 - w_4}{4}.$$
using $|\alpha_1| < 10^{-4}$ [24] and the measurements of [5], we get $|b_{\Omega} - 0.99| < 0.10$ and
$$-0.36 < w_2 - w_4 < 0.44. \tag{4}$$
reasoning similarly for the perigee, we eventually have
$$-0.22 < 0.11w_1 - 0.20w_2 - 0.06w_3 + 0.20w_4 + 0.06w_5 < 0.42. \tag{5}$$
it is worth noting that eqs. 4 and 5 are obtained making use also of the previously obtained estimates eqs. 2 and 3.

4. discussion

beside the assumptions listed in section 2, the bounds eqs. 2, 3, 4 and 5 rely on the combination of a certain number of experimental estimates of the ppn parameters and of the various precessions. they can be improved as far as the experimental estimates improve. estimates eqs. 2–5 should be coupled with the estimates obtained in [15] by analyzing the precession of the gyroscopes of gpb. these latter estimates turn out to be different from ours; see [16, 17] for the details.
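the arithmetic behind the quoted bounds can be retraced directly. the sketch below is our own consistency check, not from the paper: it reproduces eqs. 2 and 3 up to rounding of the intermediate bounds, and, with nominal lageos orbit values (a ≈ 12270 km, e ≈ 0.0045) and the earth angular momentum j ≈ 5.86 × 10³³ kg m²/s assumed for the last step, the ∼31 mas/yr gr node drift quoted in the introduction:

```python
import math

db_moon = 0.0064           # |b - 1| from lunar laser ranging [23]
db_mercury = 0.001         # |b - 1| from planetary radar ranging [22]
dgamma = 2.1e-5 + 2.3e-5   # cassini: worst-case |gamma - 1| [3]

# eq. (2): b = (1 + 2*gamma)/3 + t2/2  =>  |t2| <= 2*|b-1| + 4*|gamma-1|/3
t2_max = 2 * db_moon + 4 * dgamma / 3
print(f"|t2|            < {t2_max:.4f}")   # ~0.0128, cf. eq. (2)

# eq. (3): b = (2 + 2*gamma - beta + 2*t2 + t3)/3
# => |1 - beta + t3| <= 3*|b-1| + 2*|gamma-1| + 2*|t2|
t3_max = 3 * db_mercury + 2 * dgamma + 2 * t2_max
print(f"|1 - beta + t3| < {t3_max:.4f}")   # ~0.0286, cf. eq. (3)

# gr lense-thirring node drift of lageos: 2*G*J / (c^2 a^3 (1-e^2)^{3/2})
G, c = 6.674e-11, 2.998e8
J = 5.86e33                 # earth angular momentum [kg m^2/s] (nominal)
a, e = 1.227e7, 0.0045      # lageos semimajor axis [m], eccentricity (nominal)
rate = 2 * G * J / (c**2 * a**3 * (1 - e**2)**1.5)   # [rad/s]
mas_per_yr = rate * 3.156e7 * math.degrees(1) * 3600e3
print(f"lageos node drift ~ {mas_per_yr:.0f} mas/yr")  # ~31, as quoted
```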
5. future prospects

before the end of the decade, robotic missions on the lunar surface could deploy new scientific payloads which include laser retroreflectors, and thus extend the lunar laser ranging reach for new physics (and possibly for torsion). in particular, the single, large, fused-silica retroreflector design developed by the university of maryland and infn-lnf [6] could improve on the performance of the current apollo arrays by a factor of 100 or more. after the end of this decade, results from the bepicolombo mercury orbiter are expected to improve the classical test of the perihelion advance [19]. the latter measurement can be cross-checked by new planetary radar ranging data taken simultaneously with bepicolombo's ranging data. mercury's special role in the search for new physics effects, and for spacetime torsion in particular, is due to the relatively large value of its eccentricity and to its short distance to the sun. eventually we observe that the recently approved juno mission to jupiter [18] will make it possible, in principle, to attempt a measurement of the lense–thirring effect through juno's node. hence such a mission may yield an opportunity to improve the constraints on torsion parameters.

6. conclusions

estimates eqs. 2–5 give an order of magnitude of the torsion tensor; they neither prove nor disprove the existence of a non-vanishing torsion tensor in the solar system. a more definite answer could be given by refining these estimates, taking advantage of future missions.

references

[1] agricola, i., the srní lectures on non-integrable geometries with torsion, archivum mathematicum (brno) 42 (2006), supplement, 5–84
[2] agricola, i., thier, c., the geodesics of metric connections with vectorial torsion, ann. global anal. geom. 26 (2004), 321
[3] bertotti, b., iess, l., tortora, p., a test of general relativity using radio links with the cassini spacecraft, nature 425 (2003), 374
[4] cartan, e., sur les variétés à connexion affine, et la théorie de la relativité généralisée (première partie), ann. ec. norm. sup. 40 (1923), 325
[5] ciufolini, i., pavlis, e. c., a confirmation of the general relativistic prediction of the lense-thirring effect, nature 431 (2004), 958
[6] currie, d. g. et al., "60th international astronautical congress", daejeon, korea, october 12–16, 2009, paper n. iac-09.a2.1.11
[7] de sitter, w., on einstein's theory of gravitation, and its astronomical consequences, monthly notices of royal astr. soc. 77 (second paper) (1916), 155
[8] de sitter, w., planetary motion and the motion of the moon according to einstein's theory, knaw, proceedings, 19 i, amsterdam (1917), 367
[9] hayashi, k., shirafuji, t., new general relativity, phys. rev. d 19 (1979), 3524
[10] hehl, f. w., von der heyde, p., spin and structure of space-time, ann. inst. h. poincaré, section a 19 (1973), 179
[11] hehl, f. w. et al., general relativity with spin and torsion: foundations and prospects, rev. mod. phys. 48 (1976), 393
[12] hehl, f. w., obukhov, y. n., annal. fondation louis de broglie 32, 157 (2007)
[13] kobayashi, s., nomizu, k., foundations of differential geometry, j. wiley & sons inc., 1963
[14] lense, j., thirring, h., phys. z. 19, 156 (1918), translated in: b. mashhoon, f. w. hehl, d. s. theiss, gen. rel. grav. 16 (1984)
[15] mao, y. et al., constraining torsion with gravity probe b, phys. rev. d 76, 1550 (2007)
[16] march, r. et al., constraining spacetime torsion with the moon and mercury, phys. rev. d 83, 104008-18, 2011
[17] march, r. et al., constraining spacetime torsion with lageos, gen. rel. gravitation 43 (2011), 3099–3126
[18] matousek, s., the juno new frontiers mission, acta astronautica 61, 932 (2007)
[19] milani, a. et al., testing general relativity with the bepicolombo radio science experiment, phys. rev. d 66, 082001 (2002)
[20] nakahara, m., geometry, topology and physics, graduate student series in physics, institute of physics publishing, bristol and philadelphia, 2003
[21] schrödinger, e., space-time structure, cambridge univ. press, 1963
[22] shapiro, i. i., gravitation and relativity 1989, edited by n. ashby, d. f. bartlett, and w. wyss, cambridge university press, cambridge, england, 1990, p. 313
[23] williams, j. g., turyshev, s. g., boggs, d. h., phys. rev. lett. 93 (2004), 261101
[24] will, c. m., the confrontation between general relativity and experiment, living rev. relativity 9, 3 (2006), http://www.livingreviews.org/lrr-2006-3

doi:10.14311/ap.2013.53.0790, acta polytechnica 53(supplement):790–792, 2013, © czech technical university in prague, 2013, available online at http://ojs.cvut.cz/ojs/index.php/ap

neutrinoless double beta decay: an extreme challenge

fernando ferroni (∗)
department of physics, sapienza university & infn, roma, italy
(∗) corresponding author: fernando.ferroni@roma1.infn.it

abstract. neutrino-less double beta decay is the only known way to possibly resolve the nature of the neutrino mass. the chances to cover the mass region predicted by the inverted hierarchy require a step forward in detector capability. a possibility is to make use of scintillating bolometers. these devices shall have a great power in distinguishing signals from alpha particles from those induced by electrons. this feature might lead to an almost background-free experiment. here the lucifer concept will be introduced and the prospects related to this project will be discussed.

keywords: neutrino physics, majorana mass, bolometers.

1. introduction

mysteries about neutrinos are several and of different nature. we know that they are neutral particles with an extraordinarily little mass compared to the one of all the other particles. although they are massive, we have not succeeded yet in measuring their mass. we do not know if the neutrino is a particle different from its antiparticle or rather, as conjectured by majorana [1], they are the same particle. majorana observed that the minimal description of spin-1/2 particles involves only two degrees of freedom and that such a particle, absolutely neutral, coincides with its antiparticle. if the majorana conjecture holds, then it will be possible to observe an extremely fascinating and rare process that takes the name of neutrinoless double beta decay (0ν2β). the net effect of this ultra-rare process will be to transform two neutrons in a nucleus into two protons and simultaneously to emit two electrons. since no neutrinos will be present in the final state, the sum of the energy of the two electrons will be a monochromatic line. the rate of this, so far, unobserved phenomenon will also allow a determination, although not precise, of the neutrino mass. neutrinoless double-beta decay is an old subject, well discussed for example by avignone, elliot & engel [2].
what is new is the fact that, recently, neutrino oscillation experiments have unequivocally demonstrated that neutrinos do have a non-zero mass and that the neutrino mass eigenstates do mix. indeed, the massive nature of neutrinos is a key element in resurrecting the interest for the majorana conjecture.

2. the physics

the practical possibility to test the majorana nature of neutrinos is indeed in detecting the process shown in fig. 1, the double beta decay (dbd) without emission of neutrinos.

figure 1: neutrino-less double beta decay process.

the rate for the 0ν2β process will go as
$$1/\tau = g(q,z)\,|m|^2\,m_{\beta\beta}^2,$$
where $g$ is the easily calculable phase space factor and $m$ is the challenging nuclear matrix element, which is known [3] with still large uncertainties. the effective neutrino mass ($m_{\beta\beta}$) is a combination of neutrino masses, mixing angles and majorana phases. the experimental investigation of this process definitely requires a large amount of dbd emitter, in low-background detectors with the capability of selecting reliably the signal from the background. the sensitivity of an experiment will go as
$$s^{0\nu} = a\,\varepsilon\,\sqrt{\frac{mt}{b\,\delta e}}.$$
from this formula it is clear that isotopic abundance ($a$) and efficiency ($\varepsilon$) will end up in a linear gain, while mass ($m$) and time ($t$) enter only as the square root. also the background level ($b$) and the energy resolution ($\delta e$) behave as a square root. in the case of the neutrinoless decay searches, the detectors should therefore have at least very sharp energy resolution and possibly other discriminating mechanisms. the key, however, is in the background index value ($b$).

figure 2: radiative nuclear transitions. left: above the 208tl line this contribution to the dbd background becomes negligible. right: cuoricino background in the dbd region and above, clearly showing the dominance of degraded α's.

the two components are the natural radiation and the degraded α's. the cuoricino experiment (see [4]) has been a clear cornerstone for identifying the nature of the problem.

3. the problem

the challenge is in the very fact that the sensitivity of this kind of experiment, as previously seen, improves only with the square root of the selected isotope mass, the running time, the decrease of the background index and the improvement of the energy resolution. not much choice is left for deciding where to go when designing a superior experiment. once you have reached the practical limit (say one ton of mass, five years running time and a few kev energy resolution), there is nothing else left than to work hard on background reduction. in fig. 2 the main problem with the background, at least for the calorimetric experiments, is elucidated.

4. the future

one (high)way open, at least for getting to the possibility of testing the entire region allowed by the inverted hierarchy case, is to combine the superior energy resolution of the bolometric technique (moreover applicable to almost all of the isotopes suited for the search) with the information provided by scintillation light, in a way to use the different yield generated by α particles with respect to electrons. the best features of bolometric detectors are that they can contain the candidate nuclei with a favorable mass ratio, they can be massive, and they exhibit spectacular energy resolution.
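for a feel of the numbers, the rate formula of section 2 can be inverted for the half-life at a given effective mass. the sketch below is purely illustrative: the phase-space factor g and the matrix element are placeholder values of a plausible order for a ⁷⁶ge-like isotope, not values taken from the text (see [3] for real matrix elements):

```python
def half_life(m_bb_ev, g=2.4e-15, m_nme=4.0, m_e_ev=0.511e6):
    """half-life [yr] from 1/tau = g |m|^2 (m_bb/m_e)^2.

    g [1/yr] and the nuclear matrix element m_nme are assumed
    order-of-magnitude placeholders, not values from the text.
    """
    return 1.0 / (g * m_nme**2 * (m_bb_ev / m_e_ev)**2)

# inverted-hierarchy scale: m_bb of a few tens of mev
for m_bb in (0.02, 0.05):
    print(f"m_bb = {m_bb * 1e3:.0f} mev -> t_1/2 ~ {half_life(m_bb):.1e} yr")
```

half-lives of order 10²⁷–10²⁸ yr are what makes the background index b the decisive parameter discussed next.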
this parameter is crucial since the signal is a peak in the energy spectrum of the detector positioned exactly at the q-value of the reaction. this peak must be discriminated over the background, and therefore the narrower the better. besides, bolometers can be built in a way to be characterized by low intrinsic background. scintillating bolometers bring in an enormous added value by providing a substantial α/β discrimination power (by difference of quenching factor). further, there is some flexibility in the choice of crystal type, which allows the use of most of the high q-value candidates. a first demonstration is given in [5]. when the energy absorber in a bolometer is an efficient scintillator at low temperatures, a small but significant fraction of the deposited energy (up to a few percent) is converted into scintillation photons, while the remaining dominant part is detected as usual in the form of heat. the simultaneous detection of the scintillation light is a very powerful tool to identify the nature of the interacting particle. the principle of operation of a scintillating bolometer is shown in fig. 3.

figure 3: schematic structure of a double read-out scintillating bolometer. right: schematic scatter plots of light signals vs. heat signals corresponding to events occurring in the scintillating bolometer. in both circumstances (positive/negative qf) α-induced events can be efficiently rejected.

the most suited scintillating crystals are based on cd, mo and se, with the serious drawback of the need for an isotopic enrichment that brings their natural abundances (less than 10 %) to a much higher value. a lot of pros and cons have been evaluated for the three materials; without going into details, we say that the final decision has favoured se in the form of znse crystals. one of the most striking features of znse is the abnormal qf, higher than 1, unlike all the other studied compounds. although not really welcome, this unexpected property does not degrade substantially the discrimination power (see [6]) of this material compared to the others, and makes it compatible with the requirements of a high sensitivity experiment. an additional very useful feature is the possibility to perform α/β discrimination on the basis of the temporal structure of the signals, both in the heat and light channels, as seen in fig. 4.

figure 4: results from a run on a znse crystal with double (heat and light) readout exposed to radioactive sources. left: scatter plot light vs. heat. right: decay time of the scintillation light from α's and electrons at the 208tl line.

the detector configuration proposed for lucifer [7] resembles closely the one selected and extensively tested for cuore [8], with an additional light detector, designed according to the recipes developed during the scintillating-bolometer r&d and consisting of an auxiliary bolometer, opaque to the light emitted by the znse crystals. in tab. 1 we give a rough indication of a merit factor for experiments ongoing (gerda [9] and exo [10]) and in preparation (cuore and lucifer). the merit refers only to the capability (real or claimed) of background discrimination through energy resolution and background rejection. a realistic projection shall of course include nuclear matrix elements, tonnage and, alas, cost.

table 1: merit factors due to background rejection for some experiments running or in preparation.

experiment | δe [kev] | b [counts/(kev kg y)] | b·δe [counts/(ton y)]
gerda      | 4.5      | 0.02                  | 90
exo        | 80       | 0.0015                | 120
cuore      | 5        | 0.02                  | 100
lucifer    | 10       | 0.001                 | 10
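the last column of tab. 1 is just the product of the two preceding ones, converted from kev·kg to ton; a two-line check:

```python
# (dE [kev], b [counts/(kev kg y)]) per experiment, from tab. 1
experiments = {"gerda": (4.5, 0.02), "exo": (80.0, 0.0015),
               "cuore": (5.0, 0.02), "lucifer": (10.0, 0.001)}
for name, (de, b) in experiments.items():
    # 1 ton = 1000 kg, so the merit is b * dE * 1000
    print(f"{name:8s}: b*dE = {b * de * 1000:.0f} counts/(ton y)")
```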
5. conclusions

the search for understanding the nature of the neutrino mass is ongoing. experiments of the present generation are unable to explore the entire region allowed by the inverted mass hierarchy hypothesis. one of the possible breakthroughs for a future generation experiment is the use of scintillating crystals exploited bolometrically. a conclusive demonstration of the validity of this approach is still missing. the lucifer experiment will be a cornerstone in this respect.

acknowledgements

the project lucifer has received funding from the european research council under the european union's seventh framework programme (fp7/2007–2013)/erc grant agreement n. 247115.

references

[1] majorana, e.: 1937, il nuovo cimento 14, 171
[2] avignone, f. t., elliot, r. s., engel, j.: 2008, rev. mod. phys. 80, 481
[3] simkovic, f., et al.: 2009, phys. rev. c 79, 055501
[4] arnaboldi, c., et al.: 2008, phys. rev. c 78, 035502
[5] alessandrello, a., et al.: 1998, phys. lett. b 420, 109
[6] arnaboldi, c., et al.: 2011, astropart. phys. 34, 344
[7] ferroni, f.: 2010, nuovo cim. c033n5, 27
[8] ardito, r., et al.: 2011, hep-ex/0501010
[9] schonert, s., et al.: 2005, nucl. phys. b145, 242
[10] ackerman, n., et al.: 2011, phys. rev. lett. 107, 212501

doi:10.14311/ap.2013.53.0854, acta polytechnica 53(6):854–861, 2013, © czech technical university in prague, 2013, available online at http://ojs.cvut.cz/ojs/index.php/ap

quantification of respiratory sinus arrhythmia with high-framerate electrical impedance tomography

christoph hoog antink (∗), steffen leonhardt
philips chair for medical information technology, helmholtz-institute for biomedical engineering, rwth aachen university, pauwelsstr. 20, 52074 aachen, germany
(∗) corresponding author: hoog.antink@hia.rwth-aachen.de

abstract. respiratory sinus arrhythmia, the variation in the heart rate synchronized with the breathing cycle, forms an interconnection between cardiac-related and respiratory-related signals. it can be used by itself for diagnostic purposes, or by exploiting the redundancies it creates, for example by extracting the respiratory rate from an electrocardiogram (ecg). to perform quantitative analysis and patient-specific modeling, however, simultaneous information about ventilation as well as cardiac activity needs to be recorded and analyzed. the recent advent of medically approved electrical impedance tomography (eit) devices capable of recording up to 50 frames per second facilitates the application of this technology. this paper presents the automated selection of a cardiac-related signal from eit data and quantitative analysis of this signal. it is demonstrated that beat-to-beat intervals can be extracted with a median absolute error below 20 ms. a comparison between ecg and eit data shows a variation in peak delay time that requires further analysis. finally, the known coupling of heart rate variability and tidal volume can be shown and quantified using global impedance as a surrogate for tidal volume.

keywords: electrical impedance tomography, respiratory sinus arrhythmia, cardio-respiratory coupling, heart-rate variability.

1. introduction

electrical impedance tomography (eit) is a powerful imaging tool. it seeks to reconstruct the impedance distribution inside a patient from measurements on the boundary. these measurements are non-invasive, painless and have no known side effects.
this makes eit an ideal tool for long-term monitoring of patients. since the electrical impedance of lung tissue varies greatly with air content, the most common use for medical eit is in pulmonary monitoring. here it serves to visualize and analyze the regional distribution of ventilation, which in turn can be used, for example, to automatically optimize the respirator settings of mechanically ventilated patients [1]. although the most prominent changes in conductivity in the thorax originate from respiration, a signal that is roughly an order of magnitude smaller can be attributed to cardiac activity. a question still unanswered is the optimal electrode configuration to maximize the quality of the cardiac-related signal [2]. in mechanically ventilated patients, the distribution of the ventilation is a very important parameter. if water accumulates in the dorsal lung regions, these regions may collapse and thus may not be ventilated at all. at the same time, the ventral regions may be over-ventilated, leading to potentially lethal lung damage. however, distribution of ventilation is not the only parameter of importance; perfusion is as well. if the pressure is set very high to optimize the ventilation, this may simultaneously hinder cardiac activity, thus leading to a reduction in gas exchange. monitoring cardiac functionality is therefore a task of equal importance. in addition to the heart rate (hr), heart rate variability (hrv), i.e. the change in timespan between two consecutive heartbeats, has received increasing attention recently. it has been examined as an indicator for stress [3], for sleep stages [4] and even as a predictor for septic shock [5]. respiratory sinus arrhythmia (rsa) is an oscillation of the heart rate in sync with respiration. in the inspiration phase, an increase in heart rate can be observed, whereas a decrease in heart rate occurs in the expiration phase. this fact can be exploited to extract the respiratory rate (rr) from an electrocardiogram (ecg) [6]. at the same time, if an individual model for the cardio-respiratory coupling can be calculated or learned, the breathing signal could be used to improve the estimation of the cardiac-related signal in a multi-sensor data fusion setting. eit is widely used in medical research, and its introduction into standard clinical practice is now imminent. the world's first commercial medical eit device was released in 2011 by dräger medical gmbh, under the name pulmovista® 500 [7]. this device works with a 16-electrode belt and is capable of recording up to 50 frames per second (fps). with such a high frame rate, the noninvasive nature of the measurement, and the possibility to measure lung- and cardiac-related signals simultaneously, the device seems an ideal tool for analyzing cardio-respiratory coupling. this paper presents a proof-of-concept study consisting of simultaneous eit and ecg recordings. in section 2, technical details are presented. section 3 presents the results and a discussion, followed by the conclusions in section 4.

figure 1: electrode position in the experiment: a, eit electrode belt; b, eit reference electrode; c1–3, ecg electrodes.

2. materials and methods

2.1. hardware

eit measurements were obtained using the pulmovista® 500 eit device, operating at a rate of 50 fps.
additionally, a single-channel ecg was recorded using the somnolab 2 psg standard by weinmann medical technology. designed for polysomnography studies, this device features a variety of sensors. for this study, only the ecg functionality, sampling at 256 hz, was used.
2.2. trial setup
the study was conducted as a self-experiment. no recent history of pulmonary or cardiac-related diseases is known. electrodes were positioned according to figure 1. ten measurements were conducted in a sitting position: in the first 8 runs, the respiratory frequency was controlled to a specific value and eit data was recorded for two minutes, see also table 1. in run 9, breath was held after expiration for as long as possible (25 seconds), and in run 10 the same was done after inspiration.
2.3. data preprocessing
data was recorded and exported using the manufacturers' respective software tools, and was imported into matlab. in each trial, a single eit file was recorded; the ecg was recorded continuously. the "forward problem" in eit is governed by the equations
$$\nabla \cdot \vec{j}(\vec{x}) = -\nabla \cdot \sigma_{el}(\vec{x})\,\nabla \phi(\vec{x}) = 0 \qquad (1)$$
for $\vec{x} \in \Omega \setminus \partial\Omega$, and
$$j(\vec{x}) = -\vec{j}(\vec{x}) \cdot \vec{n}(\vec{x}) = \sigma_{el}(\vec{x})\,\nabla\phi(\vec{x}) \cdot \vec{n}(\vec{x}) \qquad (2)$$
for $\vec{x} \in \partial\Omega$, see also [8]. here, $\vec{x}$ is a coordinate inside a body $\Omega$ or on its boundary $\partial\Omega$ with normal vector $\vec{n}$, $\vec{j}$ is the current density, $\phi$ is the electric potential, $j$ is the scalar current density on the boundary, and $\sigma_{el}$ is the conductivity distribution inside the body. if $j(\vec{x})$ and $\sigma_{el}(\vec{x})$ are known, the partial differential equation (1) has a unique solution $\phi(\vec{x})$. the "inverse problem" in eit is where the underlying conductivity distribution $\sigma_{el}(\vec{x})$ is unknown and needs to be inferred from current injections and voltage measurements on the boundary. this is a much harder problem to solve due to its ill-posed nature, which means that at least one of the following criteria is violated:
• existence of a solution,
• uniqueness of the solution,
• continuity of the solution.
to still be able to solve the problem, a-priori knowledge in the form of regularization is incorporated. this usually results in a spatially smooth reconstructed distribution $\sigma_{el}$. the eit system used here works with an electrode belt with n = 16 electrodes, which leads to n(n − 3) = 208 voltage measurements per frame, of which, due to reciprocity, only 104 are linearly independent. the device's internal linear method of reconstruction was used, leading to a sequence of impedance images z with 32 by 32 pixels at 50 fps. to extract the cardiac signal, a single pixel z was selected. in a first proof of concept, this selection process was performed manually and yielded reasonable results [9]. as a subsequent development, the selection process was optimized: for this, the time course of each pixel of the first trial was converted into the frequency domain using the fast fourier transform (fft). next, the ratio of high-frequency (hf, 0.99 to 1.12 hz, cardiac activity) to low-frequency components (lf, 0.08 to 0.16 hz, respiratory activity) was calculated.
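a minimal numpy sketch of this selection step is given below, assuming the image sequence is available as an array of shape (frames, 32, 32). the band edges and the small offset follow the text; the function name, array layout and default values are our own illustrative choices, not the original matlab implementation.

```python
import numpy as np

def select_cardiac_pixel(frames, fs=50.0, hf=(0.99, 1.12), lf=(0.08, 0.16), eps=1e-6):
    """pick the pixel whose spectrum is dominated by cardiac (hf) rather than
    respiratory (lf) activity.  frames: array of shape (t, 32, 32)."""
    t = frames.shape[0]
    freqs = np.fft.rfftfreq(t, d=1.0 / fs)
    # magnitude spectrum of every pixel's (mean-free) time course
    spec = np.abs(np.fft.rfft(frames - frames.mean(axis=0), axis=0))
    hf_power = spec[(freqs >= hf[0]) & (freqs <= hf[1])].sum(axis=0)
    lf_power = spec[(freqs >= lf[0]) & (freqs <= lf[1])].sum(axis=0)
    ratio = hf_power / (lf_power + eps)   # small offset avoids division by (nearly) zero
    iy, ix = np.unravel_index(np.argmax(ratio), ratio.shape)
    return iy, ix, ratio
```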
to avoid division by (nearly) zero, a small offset was added to the low-frequency component. in figure 2, the hf/lf ratio is spatially visualized, clearly distinguishing the heart region from the background. the coordinates of the maximum value were used in the subsequent experiments. no spatial averaging was performed, since the reconstructed eit impedance image data is intrinsically "spatially low-pass filtered", as argued above.
figure 2: ratio of hf to lf components in the eit signal. the heart region can be clearly distinguished from the background.
the difference between the time course of the global impedance (gi) and the selected pixel is visualized in figure 3. while the average impedance shows changes mainly due to respiration, a strong cardiac-related signal (crs) can be deduced in the time course of a single pixel.
figure 3: time course of the global impedance (top) and of a single pixel in the cardiac region (bottom).
this signal was resampled to 256 hz to match the ecg sampling rate and was high-pass filtered to remove the residual respiratory component, resulting in crs∗. the same high-pass was applied to the raw ecg signal ecg′, resulting in ecg. crs∗ and ecg were then cross-correlated; a very distinct peak could be observed, which was used to calculate tsync, which in turn was used to synchronize the eit and ecg data by selecting the appropriate part of the ecg stream. to remove high-frequency artifacts, crs∗ was low-pass filtered before peak detection, yielding crs. no further processing was applied to ecg. the algorithm is outlined in figure 4. the resulting arrays tpeak,eit and tpeak,ecg were used in the subsequent analysis.
figure 4: outline of the algorithm for extracting heartbeats from eit and ecg and for synchronizing the raw data: pixel selection on z (50 hz, 32 × 32), resampling to 256 hz and high-pass filtering to crs∗, high-pass filtering of ecg′ to ecg, cross-correlation yielding tsync, and low-pass filtering plus peak detection yielding tpeak,eit and tpeak,ecg.
2.4. data analysis
first, beat-to-beat intervals (bbi) were calculated from the arrays containing the peaks. since these time intervals are by definition unevenly sampled and thus located on an irregular grid, the bbi derived from crs were linearly interpolated to the locations of the bbi obtained from ecg. then, the root mean square error (rmse), the median absolute error (mae) and the correlation coefficient r were calculated. to validate the rr, the global impedance signal was evaluated in the frequency domain using the fft. to evaluate rsa, the lomb periodogram was calculated. this allows the bbi to be used directly, without interpolating them to a regular grid. the respiratory rate was identified as the peak in the periodogram above 1 bpm. the heart rate was calculated as the inverse of the median bbi. additionally, the delay between peaks derived via ecg and eit was calculated. here, values below 150 and above 400 milliseconds were considered to be outliers, and were discarded. a sketch of this interval analysis is given below.
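the following sketch implements the two analysis steps just described: comparing the interpolated crs-derived intervals against the ecg-derived ones, and reading the respiratory rate from a lomb periodogram of the irregular bbi series. the function names are ours; `scipy.signal.lombscargle` expects angular frequencies, hence the conversion from breaths per minute.

```python
import numpy as np
from scipy.signal import lombscargle

def compare_bbi(t_peak_eit, t_peak_ecg):
    """rmse, mae and r between bbi from crs and from ecg; the crs intervals
    are linearly interpolated onto the ecg interval locations (all in seconds)."""
    bbi_eit = np.diff(t_peak_eit)                    # unevenly sampled by definition
    bbi_ecg = np.diff(t_peak_ecg)
    bbi_eit_i = np.interp(t_peak_ecg[1:], t_peak_eit[1:], bbi_eit)
    err = bbi_eit_i - bbi_ecg
    rmse = np.sqrt(np.mean(err ** 2))
    mae = np.median(np.abs(err))                     # median absolute error
    r = np.corrcoef(bbi_eit_i, bbi_ecg)[0, 1]
    return rmse, mae, r

def rr_from_bbi(t_beats, f_max_bpm=40.0, n_freq=2000):
    """respiratory rate as the dominant lomb-periodogram peak above 1 bpm,
    computed directly on the irregular beat grid."""
    bbi = np.diff(t_beats)
    f_bpm = np.linspace(1.0, f_max_bpm, n_freq)      # candidate rates [1/min]
    omega = 2.0 * np.pi * f_bpm / 60.0               # angular frequency [rad/s]
    power = lombscargle(t_beats[1:], bbi - bbi.mean(), omega)
    return f_bpm[np.argmax(power)]
```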
it is known that the tidal volume (tv) influences the degree of rsa: it was found in [10] that the change in bbi shows an almost linear dependence on tv. additionally, the coefficient was found to be frequency dependent, being almost constant up to a corner frequency, which was found to be different for each subject, with a mean value of 7.1 ± 1.5 bpm. for higher breathing frequencies, it showed an exponential roll-off of around 20.4 ± 2.4 db per decade ([10], table 2). in this study, only the breathing frequency was actively controlled, with the help of an acoustic and visual signal. no such device was employed to keep tv constant. since the global impedance correlates very well with tv, it was used as a surrogate. visual inspection of the time course of the global impedance showed a small variation in tv over time, see figure 3. for exact analysis, all breaths were identified using a simple peak detector. next, the difference between the maximum and the minimum in global impedance, as well as the difference between the maximum and minimum bbi in a small window around each peak, were determined.
3. results and discussion
tables 1 and 2 present the values described above.

run | rr [bpm] | rr_eit [bpm] | rr_crs [bpm] | rr_ecg [bpm]
----|----------|--------------|--------------|-------------
 1  |  6       |  6.00        |  5.99        |  5.95
 2  | 10       | 10.00        | 10.00        | 10.00
 3  | 14       | 14.00        | 14.01        | 14.05
 4  | 18       | 18.00        | 18.09        | 18.09
 5  | 20       | 20.00        | 20.11        | 20.10
 6  | 16       | 16.00        | 15.99        | 16.00
 7  | 12       | 12.00        | 12.05        | 12.05
 8  |  8       |  8.00        |  7.99        |  7.99

table 1: respiratory rate according to the protocol (rr), calculated from eit (rr_eit), and calculated via bbi periodogram analysis from crs (rr_crs) and ecg (rr_ecg). runs 9 and 10 were breath-hold experiments.
additionally, figures 5 to 7 present the time course of the global impedance change, the bbi derived from crs and ecg, as well as the ecg-to-crs peak delay.
figure 5: time course for run # 5 (rr = 20 bpm): global impedance, bbi from crs and ecg (rmse = 23.04 ms, r = 0.914), and ecg-to-eit peak delay (mean 203.7 ms).
figure 6: time course for run # 7 (rr = 12 bpm): global impedance, bbi from crs and ecg (rmse = 33.49 ms, r = 0.939), and ecg-to-eit peak delay (mean 218.0 ms).
figure 7: time course for run # 10 (breath hold after inspiration): global impedance, bbi from crs and ecg (rmse = 18.38 ms, r = 0.983), and ecg-to-eit peak delay (mean 223.1 ms).
one can observe that high-frame-rate eit is capable of analyzing cardiac function with great precision. table 1 shows that the effect of rsa can clearly be inferred from the bbi signal derived via eit, since the calculated respiratory rates are almost identical. figures 5 to 7 also show very good agreement between bbi calculated via eit and ecg. a careful analysis of the raw data shows that the relatively high rmse stems from singular artifacts, especially from the misclassification of beats in crs. these errors greatly influence rmse, whereas mae is influenced only minimally. this is especially prominent in run 9, where the correlation coefficient is below 0.7 and rmse is above 90 ms, while mae shows a very low value, below 10 ms, proving that singular outliers influence the result. combining all experiments, rmse is 34.31 ms and mae is 15.91 ms. this is a promising observation, considering that the eit device used here has a frame rate of 50 fps, resulting in a sampling time of 20 ms. an interesting observation can be made from figures 6 and 7. in general, the ecg and crs peaks are not synchronous, since the ecg stems from the electrical activity of the heart, whereas a peak in crs stems from a maximum in contraction, i.e. a mechanical activity.
the delay of roughly 200 milliseconds between the ecg peak (r-wave) and the crs peak is consistent with values in the literature relating the ecg and the blood volume inside the heart [11]. it is very interesting to observe a slow drift in figure 7, peaking just before expiration, i.e. just before maximum discomfort was reached. figure 6, however, shows a sharp jump. while at first glance an artifact seems the most reasonable explanation, the consistency of the bbi derived from crs and from ecg does not indicate an obvious source of error.

run | hr_crs [bpm] | hr_ecg [bpm] | rmse [ms] | mae [ms] | r
----|--------------|--------------|-----------|----------|------
 1  | 69.50        | 69.03        | 28.92     | 12.60    | 0.97
 2  | 74.56        | 75.29        | 30.14     | 16.98    | 0.96
 3  | 77.58        | 79.59        | 37.46     | 23.13    | 0.91
 4  | 87.77        | 88.79        | 31.39     | 18.13    | 0.86
 5  | 85.81        | 85.81        | 23.04     | 15.06    | 0.91
 6  | 78.37        | 80.21        | 29.68     | 17.20    | 0.94
 7  | 75.29        | 76.42        | 33.49     | 21.24    | 0.94
 8  | 74.38        | 76.04        | 24.89     | 11.72    | 0.96
 9  | 65.08        | 67.22        | 98.08     |  7.32    | 0.68
10  | 61.69        | 61.69        | 18.38     |  6.14    | 0.98

table 2: heart rate calculated from crs (hr_crs) and from ecg (hr_ecg); additionally, root-mean-square error (rmse), median absolute error (mae) and correlation coefficient (r) of bbi calculated from ecg and crs.
an analysis of the tidal volume dependence of rsa is presented in table 3.

run | rr [bpm] | std(Δgi) [a.u.] | r(Δgi, Δbbi) | p(Δgi, Δbbi) | a [ms/a.u.] | b [ms]
----|----------|-----------------|--------------|--------------|-------------|-------
 1  |  6       | 0.13            |  0.91        | 0.0002       |  871        | −596
 2  | 10       | 0.05            |  0.19        | 0.4445       |  200        |  −83
 3  | 14       | 0.08            |  0.01        | 0.9741       |    3        |   20
 4  | 18       | 0.05            |  0.10        | 0.6757       |   63        |  −29
 5  | 20       | 0.04            |  0.22        | 0.2893       |  184        | −112
 6  | 16       | 0.04            | −0.29        | 0.1593       | −398        |  366
 7  | 12       | 0.06            |  0.09        | 0.6938       |   58        |   18
 8  |  8       | 0.08            |  0.80        | 0.0011       |  341        | −223

table 3: respiratory rate according to the protocol (rr), standard deviation of the global impedance peaks (std(Δgi)), the correlation coefficient r(Δgi, Δbbi) of the change in gi and the change in bbi with the corresponding p-value, and the parameters a and b of the linear fit Δbbi = a · Δgi + b.
it can be seen that although the change in tv is relatively small, a correlation greater than 0.9 could be measured in run 1, which is the one with the lowest rr. the run with the second lowest rr (run 8) also showed a relatively high correlation. beyond an rr of 8, no correlation could be determined. this is probably due to the low variation in tv in combination with the exponential decay of the slope parameter a discovered in [10]; see also its determined values for rr of 6 and 8 in table 3. figures 8 and 9 plot the time course as well as the direct comparison of the change in global impedance and the change in bbi, clearly showing the correlation.
figure 8: time course of the change in gi and the change in bbi for run # 1 (rr = 6 bpm, r = 0.913).
figure 9: the change in bbi over the change in gi for run # 1, measurement and linear fit (rr = 6 bpm, r = 0.913).
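the per-run coupling figures in table 3 amount to an ordinary least-squares line plus a pearson correlation; a minimal sketch is given below. the wrapper name is ours, and the underlying `numpy.polyfit` and `scipy.stats.pearsonr` calls are standard library routines.

```python
import numpy as np
from scipy import stats

def tv_rsa_fit(delta_gi, delta_bbi):
    """linear fit dBBI = a * dGI + b, plus pearson r and p-value as in table 3."""
    a, b = np.polyfit(delta_gi, delta_bbi, 1)   # least-squares slope and intercept
    r, p = stats.pearsonr(delta_gi, delta_bbi)
    return a, b, r, p
```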
4. conclusions and outlook
this paper has proved the general feasibility of eit as a tool for analyzing cardio-respiratory coupling. hrv could be measured with an mae below 20 ms, allowing the calculation of rr from bbi analysis. this is a redundant task here, since the global impedance can be used with less effort; however, it serves to prove the accuracy of the crs extracted from eit. moreover, it shows the redundancy in the measured data that could be used in a model for multi-sensor data fusion, exploiting the cardio-respiratory coupling. interesting observations can be made on the basis of an analysis of the relationship between crs and ecg. slow drifts could be observed in the time interval between the ecg peak (r-wave) and the crs peak (maximum contraction of the heart) in the breath-hold experiment. this observation needs further investigation and might allow non-invasive analysis of traditional parameters such as blood pressure, or the derivation of novel surrogates for cardiac health. finally, the known coupling of heart rate variability and tidal volume could be shown using the change in global impedance as a surrogate for the latter.
since this study presents only a proof of concept, some limitations do exist: only a single subject participated in the experiment, and a larger cohort needs to be examined to test robustness. moreover, only the respiratory rate was controlled. additionally, a simple peak detector is likely to be only a suboptimal approach to identifying the true maximum of crs. firstly, this may lead to a systematic offset if the shape of the peak changes, which could be an explanation for the jump observed in figure 6. secondly, it leads to some misclassifications, which are responsible for outliers. while small in number, their high amplitude leads to serious degradation of non-robust metrics like rmse and the correlation coefficient. finally, other sources of information, e.g. the crs waveform or its slope, should be examined. this could lead the way to new, non-invasive tools for cardio-respiratory model verification, and potentially to new methods of diagnosis.
acknowledgements
the author would like to thank dipl.-ing. boudewijn venema for helping with the somnolab equipment.
references
[1] t. meier, h. luepschen, j. karsten, et al. assessment of regional lung recruitment and derecruitment during a peep trial based on electrical impedance tomography. intensive care medicine 34(3):543–50, 2008.
[2] j. nasehi tehrani, c. jin, a. l. mcewan. modelling of an oesophageal electrode for cardiac function tomography. computational and mathematical methods in medicine 2012:585786, 2012.
[3] m. pagani, g. mazzuero, a. ferrari, et al. sympathovagal interaction during mental stress. a study using spectral analysis of heart rate variability in healthy control subjects and patients with a prior myocardial infarction. circulation 83(4):ii43–51, 1991.
[4] m. h. bonnet, d. l. arand. heart rate variability: sleep stage, time of night, and arousal influences. electroencephalography and clinical neurophysiology 102(5):390–6, 1997.
[5] w.-l. chen, c.-d. kuo. characteristics of heart rate variability can predict impending septic shock in emergency department patients with sepsis. academic emergency medicine 14(5):392–7, 2007.
[6] g. moody, r. mark. derivation of respiratory signals from multi-lead ecgs. computers in cardiology 12:113–116, 1985.
[7] e. teschner, m. imhoff. elektrische impedanztomographie: von der idee zur anwendung des regionalen beatmungsmonitorings. dräger medical gmbh, 2010.
[8] w. lionheart, n. polydorides, a. borsic. the reconstruction problem. in d. holder (ed.), electrical impedance tomography: methods, history and applications, pp. 3–64. institute of physics publishing, 2004.
[9] c. hoog antink. analyzing cardio-respiratory coupling with high-framerate eit: a proof of concept. in l. husník (ed.), proceedings of the 17th international scientific student conference poster 2013. czech technical university in prague, 2013.
[10] j. a. hirsch, b. bishop. respiratory sinus arrhythmia in humans: how breathing pattern modulates heart rate. the american journal of physiology 241(4):h620–9, 1981.
[11] r. klinke, h.-c. pape, s. silbernagl. physiologie. thieme, 5th edn., 2005.

acta polytechnica vol. 41 no. 1/2001
evaluation of water resistance and diffusion properties of paint materials
j. drchalová, j. poděbradská, j. maděra, r. černý
a simple method is presented for evaluating the water-proofness quality of paints on lining materials. the method is based on measuring the integral capillarity in dependence on time, and then comparing this value to the value determined for the basic lining material. measurements of the effective water vapor permeability then provide information on the risk of condensation, which may increase after applying the paint. a practical application of the method is performed with four karlocolor paints on glass concrete substrates. all the karlocolor paints are found to be very effective materials for protection against driven rain. the diffusion properties of all the paints are found to be excellent.
keywords: water-proofness, diffusion coefficient, paints.
1 introduction
the water-proofness quality of paint materials can be evaluated by various methods. a first look into the evaluation can provide information about two material parameters characteristic of moisture transport in porous materials, namely the diffusivity of liquid moisture $\kappa$ and the water vapor diffusivity $D$. however, these parameters may not be known, because it is particularly difficult to measure $\kappa$ for materials containing small pores only; in addition, neither $\kappa$ nor $D$ themselves tell us anything about the influence of paint thickness or about the properties of the contact surface between the paint and the substrate. therefore, direct testing of the paint-substrate system is more desirable.
one of the physical quantities characterizing the behavior of capillary-porous materials in contact with water is the height of capillary rise, i.e., the maximum height $h_{max}$ of the water column in the material above the main water level. however, measurements of $h_{max}$ are very time consuming and, in the main, inaccurate [1]. a more suitable quantity for evaluating the quality of water-proofness is the capillarity $c$, defined by the relation
$$c = \frac{1}{S}\,\frac{\mathrm{d}m}{\mathrm{d}t}, \qquad (1)$$
where $S$ is the surface of the specimen which is in contact with water, $m$ is the mass of the moistened specimen, and $t$ is the time. as a matter of fact, the capillarity defined by (1) is identical with the water flux in the material. the value of $c$ is an instantaneous quantity which does not provide any information on the history of the moistening process. therefore, it appears reasonable to define the integral capillarity $c_{int}$ (see, e.g., [2–4])
$$c_{int}(t) = \int_0^t c\,\mathrm{d}\tau = \frac{m(t) - m_d}{S}, \qquad (2)$$
where $m$ is the mass of the moist specimen, and $m_d$ is the mass of the dried specimen. the integral capillarity can express not only the absolute amount of water in the specimen but also the time history of the moistening process, which is particularly useful in comparing the effectiveness of various paints on a specified substrate. therefore, we employ the $c_{int}(t)$ function as the main parameter in evaluating the water-proofness of paint materials throughout this paper.
measuring the water vapor diffusion properties of paint-substrate systems can provide useful complementary information on water-repellent paints. paints applied on external linings should protect the underlying layers from water penetration but, on the other hand, they should not oppose water transport from the interior to the exterior, in order to avoid the formation of condensation zones in walls. therefore, the effective diffusion coefficient of the paint-substrate system is the second important parameter in evaluating the quality of the paint materials in this paper.
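for orientation, eqs. (1) and (2) reduce to a few lines of numpy once a weighing series is available; the sketch below computes $c_{int}(t)$ directly from the definition and recovers the instantaneous capillarity as a numerical time derivative. the function name and argument layout are our own illustrative choices.

```python
import numpy as np

def integral_capillarity(t, m, m_dry, area):
    """eq. (2): c_int(t) = (m(t) - m_d) / S from a weighing series;
    the instantaneous capillarity c of eq. (1) is its time derivative.
    t [s], m and m_dry [kg], area S [m^2]."""
    c_int = (np.asarray(m, dtype=float) - m_dry) / area
    c = np.gradient(c_int, t)   # numerical (1/S) dm/dt
    return c_int, c
```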
2 materials and samples
in our experimental work, we studied the hygric properties of selected karlocolor masonry paints on glass concrete substrates. the glass concrete was produced by vush, a. s., brno. the following basic properties of the samples were determined: density 1800 kg/m³, saturation water content 13.5 % kg/kg. the following karlocolor masonry paints (producer: karlomix, s. r. o., czech republic) were tested: 10-001, 11-001, 20-001, 30-001. karlocolor 10-001 is a disperse acrylate paint consisting of styrene-acrylate, quartz sand, inorganic pigments, some special additives and water. karlocolor 11-001 has a similar composition to 10-001 but contains pure acrylate instead of styrene-acrylate. karlocolor 20-001 is a silicon paint which consists of silicon emulsion, acrylate dispersion, quartz sand, inorganic pigments, some special additives and water. karlocolor 30-001 is a silicate paint consisting of potassium water glass, acrylate dispersion, quartz sand, inorganic pigments, some special additives and water.
3 measuring integral capillarity
in order to measure the integral capillarity, we used samples of substrate materials that were cylindrical in shape, with a diameter of 110 mm and a height of 10 mm. the lateral area of all specimens was water- and vapor-proof insulated by epoxy resin, and the specimens were placed with their face side into the vessel with water on a soft sponge, so that the upper side of the sponge was just at the water level. the mass of the specimens absorbing water was then determined at specified time intervals, and the experiment was stopped after a period of five days. during the experiment, the level of water in the vessel was kept constant. finally, the dependence of the integral capillarity on time was determined according to the definition formula given above.
the integral capillarities $c_{int}$ of the paint-substrate systems described above are shown in figs. 1a–d, where various time scales were applied for better clarity. figs. 2a–d show details of the differences between the particular paints on the same time scales. fig. 1a shows that all applied paints exhibited very good water repellent properties during short-term water exposure. the integral capillarity after 45 minutes was approximately equal to 25 % of its value for the basic material without surface treatment. the same situation was observed for 180 minutes' exposure (fig. 1b), and even after 9 hours the value of the integral capillarity was still about one third of the value for the basic material (fig. 1c).
fig. 1d shows that the paints lost their water-protective function after water exposure longer than 70 hours. this is a very good result, because building facades are only very exceptionally exposed to such long water exposure, for instance due to wind-driven rain. the detailed figs. 2a–d show that the best water repellent properties among the applied paints were exhibited by the 11-001 and 20-001 paints, i.e. those based on pure acrylate and silicon emulsion. the worst results were obtained for the silicate-based paint 30-001. however, it should be noted that this is only a relative result compared to the other paints, and that the water repellent properties of the 30-001 paint are still satisfactory.
fig. 1a: integral capillarity of paint-glass concrete systems for t ∈ [0, 45 min]
fig. 1b: integral capillarity of paint-glass concrete systems for t ∈ [0, 180 min]
fig. 1c: integral capillarity of paint-glass concrete systems for t ∈ [0, 540 min]
fig. 1d: integral capillarity of paint-glass concrete systems for t ∈ [0, 6000 min]
fig. 2a: integral capillarity of paint-glass concrete systems for t ∈ [0, 45 min], a detail
fig. 2b: integral capillarity of paint-glass concrete systems for t ∈ [0, 180 min], a detail
4 measuring the effective diffusion coefficient of water vapor
when evaluating the water vapor transport properties, we preferred the diffusion coefficient $D$, defined as
$$\vec{j} = -D\,\mathrm{grad}\,\rho_c, \qquad (3)$$
rather than the diffusion permeability $\delta$, which is defined as follows:
$$\vec{j} = -\delta\,\mathrm{grad}\,p_v. \qquad (4)$$
in eqs. (3) and (4), $\vec{j}$ is the flux of water vapor, $\rho_c$ is the mass of water vapor per unit volume of the porous material, and $p_v$ is the partial pressure of water vapor. we point out that under isothermal conditions, the following relation between the diffusion coefficient and the permeability is valid:
$$D = \delta\,\frac{RT}{M}. \qquad (5)$$
some other coefficients widely employed in building physics, namely the water vapor diffusion resistance number $\mu$, the water vapor resistance $Z$ and the equivalent air layer thickness $s_d$, are defined as follows:
$$\mu = \frac{D_a}{D}, \qquad (6)$$
$$Z = \frac{d}{\delta}, \qquad (7)$$
$$s_d = \mu\,d = \frac{D_a}{D}\,d, \qquad (8)$$
where $D_a$ is the diffusion coefficient of water vapor in air, and $d$ is the thickness of the specimen of porous material.
for measuring the effective diffusion coefficient of water vapor $D$, we chose a steady-state method that is commonly used for experimental work on other materials. the measuring apparatus consists of two airtight glass chambers separated by the sample of the measured material, which is typically board-type. in the first chamber a state near to 100 % relative humidity is maintained (achieved with the help of a cup of water), while in the second chamber there is a state close to 0 % relative humidity (set up using some absorption material, such as silica gel). after a certain time, the measurement is interrupted, and the changes in the mass of water in the cup, $\Delta m_w$, and of the silica gel, $\Delta m_a$, during the chosen time interval $[0, \tau]$ are determined. if $|\Delta m_w| = |\Delta m_a|$, i.e., if a steady state is established within the measuring system, the experiment is terminated. otherwise, the measurement continues in the same way as before. the experiment is carried out under isothermal conditions.
the diffusion coefficient at temperature $T$ can be calculated using the following formula:
$$D = \frac{\Delta m_w}{\tau}\,\frac{d}{S}\,\frac{RT}{M}\,\frac{1}{p_{v,s}(T)}, \qquad (9)$$
where $d$ is the thickness of the board-type specimen, $M$ is the molar mass of water vapor, $R$ is the universal gas constant, and $a$, $b$, $c$ are the constants in the empirical formula describing the dependence of the saturated water vapor pressure $p_{v,s}$ on temperature,
$$p_{v,s}(T) = 10^{\,b - a/T}\,T^{\,c}, \qquad (10)$$
with a = 2900 k, b = 24.738, c = −4.65.
the material specimens for measuring the diffusion properties of the paint-substrate systems were prepared in the same way as for the measurements of the integral capillarity, and their dimensions were also the same.
fig. 2c: integral capillarity of paint-glass concrete systems for t ∈ [0, 540 min], a detail
fig. 2d: integral capillarity of paint-glass concrete systems for t ∈ [0, 6000 min], a detail

paint  | d [m]      | D [m² s⁻¹]  | δ [s]       | μ [–] | Z [m s⁻¹] | s_d [m]
-------|------------|-------------|-------------|-------|-----------|--------
none   | 9.5 × 10⁻³  | 1.34 × 10⁻⁶ | 0.99 × 10⁻¹¹ | 17.2  | 0.96 × 10⁹ | 0.16
10-001 | 12.2 × 10⁻³ | 1.47 × 10⁻⁶ | 1.08 × 10⁻¹¹ | 15.7  | 1.13 × 10⁹ | 0.19
11-001 | 11.6 × 10⁻³ | 1.29 × 10⁻⁶ | 0.94 × 10⁻¹¹ | 17.9  | 1.23 × 10⁹ | 0.21
20-001 | 12.9 × 10⁻³ | 1.26 × 10⁻⁶ | 0.91 × 10⁻¹¹ | 18.3  | 1.41 × 10⁹ | 0.24
30-001 | 12.1 × 10⁻³ | 1.48 × 10⁻⁶ | 1.07 × 10⁻¹¹ | 15.5  | 1.12 × 10⁹ | 0.19

table 1: effective diffusion parameters of selected paint-glass concrete systems
the effective diffusion parameters of the analyzed paint-glass concrete systems are summarized in table 1. the properties of all the paints analyzed were found to be excellent for systems of this kind. the differences in the diffusion properties between the systems with the particular paints and the basic material were less than 10 %, which is within the error bar of the experimental method. therefore, within this measurable range, no differences from the basic material were observed.
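a minimal sketch of eqs. (9) and (10) and of the derived coefficients (5)–(8) follows. the constants are those quoted above; the value taken for $D_a$ and the reference temperature in the helper are our own assumptions, not values given in the paper.

```python
A, B, C = 2900.0, 24.738, -4.65   # constants of the empirical formula, eq. (10)
R = 8.314                         # universal gas constant [J mol^-1 K^-1]
M = 0.018                         # molar mass of water vapor [kg mol^-1]

def p_vs(T):
    """saturated water vapor pressure, eq. (10): 10**(B - A/T) * T**C  [Pa]"""
    return 10.0 ** (B - A / T) * T ** C

def diffusion_coefficient(dm_w, tau, d, S, T):
    """eq. (9): D [m^2 s^-1] from the steady-state mass change dm_w [kg] over
    tau [s], specimen thickness d [m], face area S [m^2], temperature T [K]."""
    return (dm_w / tau) * (d / S) * (R * T / M) / p_vs(T)

def derived_coefficients(D, d, T=293.15, D_a=2.5e-5):
    """mu, delta, Z and s_d from eqs. (5)-(8); D_a (water vapor in air) and the
    temperature are assumed values for illustration only."""
    mu = D_a / D                  # eq. (6)
    delta = D * M / (R * T)       # isothermal relation, eq. (5)
    Z = d / delta                 # eq. (7)
    s_d = mu * d                  # eq. (8)
    return mu, delta, Z, s_d
```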
5 conclusions
in this paper, we analyzed several paint-glass concrete systems from the point of view of their water repellent and diffusion properties. all the karlocolor paints were found to be very effective water repellent materials, because in short-term water exposure they decreased the integral capillarity of the system to about 25 % of its value for the basic material, and they retained their water repellent function for more than 70 hours, which is quite sufficient for protection against wind-driven rain. the diffusion properties of all the paints were found to be excellent. the decrease in diffusion permeability due to the paint application was within the error bar of the applied measuring method, i.e., lower than 10 %.
acknowledgement
this research was supported by the ministry of education of the czech republic, under contracts no. cez: j04/98:210000003 and cez: j04/98:210000004.
references
[1] mrlík, f.: moisture-induced problems of building materials and constructions (in slovak). alfa, bratislava, 1985.
[2] hansen, k. k., bunch-nielsen, t.: capillary rise of water in insulating materials and in gravel and stone, part i. mineral wool and expanded polystyrene. proc. of the 3rd symp. building physics in the nordic countries, vol. 2, copenhagen, 1993, pp. 761–768.
[3] hansen, k. k., bunch-nielsen, t.: capillary rise of water in insulating materials and in gravel and stone, part ii. products of lightweight expanded clay aggregates. proc. of the 3rd symp. building physics in the nordic countries, vol. 2, copenhagen, 1993, pp. 769–776.
[4] hansen, k. k., bunch-nielsen, t.: capillary rise of water in insulating materials and in gravel and stone, part iii. gravel and stone. proc. of the 3rd symp. building physics in the nordic countries, vol. 2, copenhagen, 1993, pp. 777–782.
rndr. jaroslava drchalová, csc., department of physics, phone: +420 2 2435 4586, e-mail: drchalov@fsv.cvut.cz
prof. ing. robert černý, drsc., ing. jitka poděbradská, ing. jiří maděra, department of structural mechanics, phone: +420 2 2435 4429, e-mail: cernyr@fsv.cvut.cz
czech technical university in prague, faculty of civil engineering, thákurova 7, 166 29 praha 6, czech republic

doi:10.14311/ap.2015.55.0081
acta polytechnica 55(2):81–85, 2015
initial follow-up of optical transients with colores using the bootes network
m. d. caballero-garcia (a,∗), m. jelinek (b), a. castro-tirado (b), r. hudec (a, c), r. cunniffe (b), o. rabaza (d), l. sabau-graziati (e)
a: czech technical university in prague, technická 2, 166 27 praha 6 (prague), czech republic
b: instituto de astrofísica de andalucía (iaa-csic), p.o. box 03004, e-18080, granada, spain
c: astronomical institute, as cr, 251 65 ondrejov, czech republic
d: área de ingeniería eléctrica, dpto. de ingeniería civil, univ. de granada, spain
e: división de ciencias del espacio (inta), torrejón de ardoz, madrid, spain
∗ corresponding author: cabalma1@fel.cvut.cz
abstract. the burst observer and optical transient exploring system (bootes) is a network of telescopes that allows the continuous monitoring of transient astrophysical sources. it was originally devoted to the study of the optical emission from gamma-ray bursts (grbs) that occur in the universe. in this paper we show the initial results obtained using the spectrograph colores (mounted on bootes-2) when observing optical transients (ots) of diverse nature.
keywords: telescopes, gamma-ray burst: general, stars: variables: general.
1. introduction
the majority of the sources visible in the sky have variable emission. some of them vary on a long timescale compared to a human life, but most of them show shorter variability and, because of that, are continuously observed by telescopes both on earth and in space. the study of variability provides one of the most direct clues to the size and physical characteristics of the emitting object. in the highest-energy regime (x-rays, γ-rays) the scales are related to orbits around compact objects (i.e., the results of the death of massive stars). at longer wavelengths, variability is related to larger regions and/or bigger stars. the most extreme case is the study of microwaves, which allows the study of large structures in the universe. this shows that it is important for astrophysicists to continuously observe our changing universe, from the smallest to the largest scales, in order to understand it.
the study of variable sources in the optical regime can be performed from the earth's surface. furthermore, it is a window for the study of many complex physical processes occurring at intermediate space/time scales (see the previous paragraph). in this regime we can study (variable) stars (e.g., erupting variables, novae, cataclysmic variables, etc.), the most violent physical processes in the universe (supernovae, hereafter sne, and gamma-ray bursts), the relatively quiet active galactic nuclei (agn) and quasars (qsos), and the outer discs and stellar companions around compact objects (neutron stars and black holes).
nevertheless, the window is still open to discoveries of new types of transients. recently, thanks to the advance of devices and detectors, new phenomena are being studied (gravitational lenses, binary mergers, stellar tidal disruptions by black holes). it is natural to wonder whether there are new (undiscovered) types of physical processes (sources) giving rise to new kinds of observed emission.
observations in the optical are performed by big and medium-sized telescopes on earth. the former are not well suited for performing the rapid follow-up needed for the study of optical transients (as we will explain hereafter). these transient events are typically of short duration (from fractions of a second to a few days), because the physical processes that originate them are of limited duration/spatial extent. smaller robotic telescopes are very well suited for performing such studies. this is due to several factors: their observing flexibility, their rapid response and slew times, and the fact that they can be located worldwide and operated remotely (therefore allowing continuous monitoring). of course, additional observations might be triggered with large x-ray/optical observatories after the transient has been detected. in this way we can perform deep studies of the nature of these sources.
1.1. the burst observer and optical transient exploring system and its spectrographs
bootes (acronym of the burst observer and optical transient exploring system) is a world-wide network of robotic telescopes. it originated from a spanish-czech collaboration that started in 1998 [1, 2]. the telescopes are located at huelva (bootes-1), malaga (bootes-2), granada, auckland (bootes-3) and yunnan (bootes-4), in spain, new zealand and china, respectively. there are plans to extend this network even further (mexico, south africa, chile, etc.). these telescopes are medium-sized (d = 30–60 cm), autonomous and very versatile. they are very well suited for the continuous study of the fast variability of sources of astrophysical origin (gamma-ray bursts, hereafter grbs, and optical transients, hereafter ots). currently two spectrographs are built and working properly in the network, at malaga and granada (in the optical and infra-red, respectively). in section 2 we will show preliminary results obtained so far with colores at bootes-2 in the field of ots of astrophysical origin.
1.2. colores
colores stands for compact low resolution spectrograph [3]. it is a spectrograph designed to be lightweight enough to be carried by the high-speed 60 cm robotic telescope (bootes-2). it works in the wavelength range of 3800–11500 å and has a spectral resolution of 15–60 å. the primary scientific target of the spectrograph is prompt grb follow-up, particularly the estimation of redshift. colores is a multi-mode instrument that can switch from imaging a field (target selection and precise pointing) to spectroscopy by rotating wheel-mounted grisms, slits and filters. the filters and the grisms (only one is mounted at the moment) are located in standard filter wheels, and the optical design comprises a four-element refractive collimator and an identical four-element refractive camera.
as a spectrograph, the instrument can use different slits to match the atmospheric seeing, and different grisms in order to select the spectral resolution according to the needs of the observation. the current detector is a 1024 × 1024-pixel device with 13-micron pixels. the telescope is a rapid and lightweight design, and a low instrument weight was a significant constraint in the design, as was the need to be automatic and autonomous. for further details on the description and operation of colores, we refer to m. jelinek's phd thesis and references therein.
1.3. scientific goals
the bootes scientific goals are multifold, and are detailed in the following.
observations of grb optical counterparts. there have been several grbs for which the optical counterpart has been detected simultaneously with the gamma-ray event, with magnitudes in the range 5–10. these observations provide important results on the central engine of these sources. the fast-slewing 0.6 m bootes telescopes are providing important results in this field [4].
the detection of ots of cosmic origin. these events could be unrelated to grbs and could constitute a new type of astrophysical phenomenon (perhaps associated with qsos/agn). if some of them are related to grbs, the most recent grb models predict that there should be a large number of bursting sources in which only transient x-ray/optical emission should be observed, but no gamma-ray emission. the latter would be confined to a jet-like structure pointing towards us only in a few cases.
monitoring a range of astronomical objects. these are astrophysical objects ranging from galactic sources such as comets, cataclysmic variables, recurrent novae and compact objects in x-ray binaries to extragalactic sources such as distant sne and agn. in the latter case, there are hints that sudden and rapid flares occur.
2. scientific results using bootes-2/colores
in the following we present some important scientific results obtained using bootes-2 and its low-resolution spectrograph colores since the beginning of its operation (summer of 2012).
2.1. grb 130606a
since first light in 1998, more than a hundred grbs have been observed with bootes, some of them only ≈ 30 s after the onset of the gamma-ray event. the majority of the results have been published in circulars and refereed journals¹.
¹ we refer to http://bootes.iaa.es for further information on the bootes network and scientific results.
here we focus on the recent event that occurred on 6th june 2013. a ≈ 275 s cosmic gamma-ray burst (grb 130606a) was recorded by swift and konus-wind on 6th june 2013, 21:04:34 u.t. (t0) [5, 6], displaying a bright afterglow (the emission at other wavelengths following the gamma-rays) in x-rays, but no apparent optical transient emission [7] in the range of the uvot telescope aboard swift. the bootes-2 station automatically responded to the alert, and an optical counterpart was identified [8], thanks to the spectral response of the detector up to 1 µm, longer than that of swift/uvot (0.17–0.65 µm). we refer to [9] for details on the observations and results.
2.2. tcp j17154683-3128303 = nova scorpius 2014
following the discovery on 26th march 2014 of a 10 mag new source in scorpius, dubbed tcp j17154683-3128303 by [10] and also detected by swift on march 27 [11], [12] reported that an optical spectrum was obtained with the colores spectrograph at the 0.6 m robotic telescope at the bootes-2 astronomical station (see figure 1).
the spectrum, covering the range 3800–9200 å, was taken on 30th march, 04:37 ut, and shows broad emission lines of the balmer series, he i 501.6, 587.8, 706.5, and probably o i 844.6, suggesting a nova in an early phase (nova scorpius 2014), thus confirming the earlier suggestion by [13].
figure 1: optical spectrum in the range 3800–9200 å from nova scorpii 2014, obtained with bootes-2/colores on 30th march, 04:37 ut.
2.3. optical spectra from red-dwarf flaring stars
the swift team reported the detection of a superflare from one of the stars in the close visual dm4e+dm4e flare star binary system dg cvn. the burst alert telescope (bat) triggered on dg cvn on 23rd april 2014 at 21:07:08 ut (t0) [14]. bootes-2/colores has been observing dg cvn since the beginning of the superflare, and new interesting optical spectral variability following the x-ray superflare evolution has been observed (we refer to [15] for further details).
additionally, following the detection and subsequent monitoring of the new outburst from the rs cvn-type binary ux ari by swift and maxi [17, 18], the 0.6 m robotic telescope at the bootes-2 astronomical station obtained optical 4000–9000 å spectra starting at 19th july 2014, 01:32:24.382 ut and ending at 04:25:55.652 ut. it was reported [16] that the optical spectra contain broad molecular tio, ca i, mg i and na i lines plus a red continuum (see figure 2). these spectra lack any significant balmer lines in emission. these spectral features are indicative of a late-type star spectrum (as previously reported). nevertheless, there are no indications of important chromospheric activity, which might have disappeared by the time of our observations.
3. discussion and conclusions
the era of ots is about to start. since the beginning of modern times, big telescopes have been the only resource for astronomers to study astrophysical sources. in spite of constituting the best tool for deep studies of individual targets, they are not well suited for the discovery of optical transient sources. their big size limits the speed at which they can cover the entire sky, and their time for overheads might be longer than that of medium-sized telescopes. many factors make them difficult to fully automatize, and indeed currently none is completely robotic and autonomous. medium-sized telescopes (i.e., d ≤ 1 m) can move much more quickly from target to target, and their time overheads are usually very small. therefore, robotic medium-sized telescopes are currently the best ones for the follow-up and study of the long-term variability of astrophysical transient sources. bootes-2 constitutes one step forward with respect to any robotic telescope on earth that has existed so far.
this does not allow only to perform early follow-up and to measure the redshift of grbs, with cosmic origin, but also to perform early follow-up (both photometric and spectroscopic) of transient sources, often located much more nearby. these ots might have been reported at other wave-lengths (typically at xand γ-rays), which would create an alert to the scientific community (often through an astronomer telegram). in such a case the observer (remotely) sends the telecommands to start the observation with bootes-2, that re-observes the target every time it is visible during the following nights. apart from the intensive campaign of follow-up of grbs performed by the bootes network (≈ 100 grbs have been observed so far), bootes-2 and its spectrograph (colores) are providing excellent results in the field of ots, too. in this paper we mention a few of them, obtained during the last 1.5 yr (since the spectrograph was mounted on the telescope). but this is only the beginning and we look forward to follow many ots for understanding better their physical properties and may be also to discover and follow-up new kinds of ots. this will prepare us for the advent of the large synoptic survey telescope (lsst), the biggest telescope ever built on earth for the study of the entire sky, that will allow the discovery of many kinds of new astrophysical sources (if they exist) and follow-up of ots, planned to start operations on 2023. acknowledgements mcg is supported by the european social fund within the framework of realizing the project "support of inter-sectoral mobility and quality enhancement of research teams at czech technical university in prague", cz.1.07/2.3.00/30.0034. mj and ajct thank the support of the spanish ministry projects aya2009-14000-c03-01 and aya2012-39727-c03-01. rh acknowledges gacr grant 13-33324s. references [1] castro-tirado, a. j., soldán, j., bernas, m. et al. the burst observer and optical transient exploring system (bootes). a&as, 138, 583, 1999. [2] castro-tirado, a. j., jelínek, m., gorosabel, j. et al. building the bootes world-wide network of robotic telescopes. asi conf. ser. 7, 313, 2012. [3] rabaza, o., jelinek m., castro-tirado a. j., et al. compact low resolution spectrograph (colores), an imaging and long slit spectrograph for robotic telescopes. review of scientific instruments 84 (11), 114501, 2014. [4] jelínek, m., castro-tirado, a. j., de ugarte postigo, a., kubánek, p. et al. four years of real-time grb followup by bootes-1b (2005-2008). advances in astronomy, 85, 2010. 84 vol. 55 no. 2/2015 optical transients observed with colores [5] barthelmy, s. d., baumgartner, w. h., cummings, j. r., et al. grb 130606a, swift-bat refined analysis. grb coordinates network, 14819, 2013. [6] golenetskii, s., aptekar, r., mazets, e. et al. konus-wind observation of grb 130606a. grb coordinates network, 14808, 2013. [7] ukwatta, t. n. and barthelmy, s. d. and gehrels, n. et al. grb 130606a: swift detection of a burst. grb coordinates network, 14781, 2013. [8] jelinek, m., gorosabel, j., castro-tirado, a. j. et al. grb 180606a: optical afterglow with bootes-2/telma and 1.23m caha. grb coordinates network, 14782, 2013. [9] castro-tirado, a. j., sánchez-ramírez, r., ellison, s. l. et al. grb 130606a within a sub-dla at redshift 5.91. submitted to a&a, astro-ph/1312.5631, 2013. [10] nishiyama, k. & kabashima, f. cbat transient object follow-up reports. [11] kuulkers, e., page, k. l., saxton, r. d., ness, j.-u., et al. 
[12] jelinek, m., cunniffe, r., castro-tirado, a. j., rabaza, o., et al. tcp j17154683-3128303 = nova scorpius 2014. the astronomer's telegram, 6025, 2014.
[13] noguchi, t., kojima, t., pearce, a., et al. nova scorpii 2014 = tcp j17154683-3128303. central bureau electronic telegrams, 3841, 2014.
[14] d'elia, v., gehrels, n., holland, s. t., et al. swift trigger 596958 is probably the flare star dg cvn. grb coordinates network, 16158, 2014.
[15] caballero-garcia, m. d., et al. in preparation, 2014.
[16] caballero-garcia, m. d., castro-tirado, a. j., jelinek, m., on behalf of a larger collaboration. optical spectra of ux ari with bootes-2. the astronomer's telegram, 6337, 2014.
[17] kawagoe, a., tsuboi, y., negoro, h., et al. maxi/gsc detection of a big flare from ux ari. the astronomer's telegram, 6315, 2014.
[18] krimm, h. a., drake, s. a., hagen, l. m. z., et al. swift observations of a flare from ux ari. the astronomer's telegram, 6319, 2014.

doi:10.14311/ap.2014.54.0378
acta polytechnica 54(6):378–382, 2014
injection molding and structural analysis in metal to plastic conversion of bolted flange joint by cae
marián blaško, viktor tittel∗, antonín náplava, michal ondruška
slovak university of technology in bratislava, faculty of materials science and technology in trnava, institute of production technologies, j. bottu 25, 917 24 trnava, slovakia
∗ corresponding author: viktor.tittel@stuba.sk
abstract. this contribution deals with the replacement of metal parts by plastic products. there are several benefits to this approach: minimized part cost, corrosion resistance, integration of several components into one part, etc. material selection depends on the design of the plastic part, which has to withstand the same load as the metal part. fiber-reinforced plastics fulfill this requirement. it is also convenient to substitute thick wall sections with a ribbed structure. the mechanical properties of such a part can be significantly affected by fiber orientation. the fiber orientation results can be used in stress analysis for a better prediction of the response to mechanical load. in this study, such an analysis is performed on a bolted flange joint.
keywords: cae, fiber reinforced plastics, injection molding simulation, metal to plastic conversion.
1. introduction
many metal parts in various applications are being replaced by plastic parts. there are several reasons for this, depending on the actual application: minimized part cost, enhanced corrosion resistance, integration of several components into one single part, etc. [1–3]. the most important steps in a metal-to-plastic conversion are material selection and plastic part design. the plastic part has to withstand the same load as the metal part. to fulfill this requirement, fiber-reinforced engineering plastics are often used. the mechanical properties of a fiber-reinforced part are highly affected by the fiber orientation resulting from the flow. however, fiber-reinforced materials are often treated as isotropic materials, which can lead to potential problems with final dimensions and warpage, and/or to poor mechanical properties in highly stressed areas of the part. assuming a fiber-reinforced plastic part to be isotropic is convenient in the preliminary design phase and in the preliminary stress analysis, to find out where the critical, most demanding areas are in terms of stress. these areas should be redesigned to meet the stress requirements; however, a reduced stiffness and strength should be considered in this phase. injection molding simulation follows after the preliminary stress analysis [4]. for parts with heavy wall thickness and with thickness variations, it is necessary to use 3d injection molding simulation to capture all the phenomena related to fiber orientation. important considerations in the injection molding analysis are the manufacturability of the part (the part should be void free, and it should comply with warpage limits, dimensions, etc.) and the fiber orientation in the part, which should comply with the mechanical loading of the part in order to obtain the best possible mechanical properties for the given application. fiber orientation is mostly affected by the gate location and by the plastic part design [4, 5]. the final step in the cae approach is a stress analysis with anisotropic material properties resulting from the fiber orientation.
the motivation for this study was an existing design of an injection molded flange made of pa66 gf30. it is a dn80 flange, pressure class pn10. this design was converted from a metal (ductile iron) flange. both designs are shown in fig. 1.
figure 1: ductile iron and pa66 gf30 flange.
the flanges from pa66 gf30 are injection molded with a cold runner system with one gate. the pa66 gf30 flanges failed to pass the load test. during the load test, two fittings were coupled with flanges and bolted together. after tightening the bolts to the prescribed torque of 40 nm, the flanges cracked. the crack occurs mostly near the hole farthest from the gate. the location of the crack is in the thickest area of the part; the wall thickness in this area is 17 mm. the crack location and a crack detail are shown in fig. 2.
figure 2: crack and crack detail.
however fiber reinforced materials are often treated as isotropic materials what could lead to potential problems with final dimensions, warpage and/or poor mechanical properties in highly stressed areas of part. assuming fiber reinforced plastic part as isotropic is convenient in preliminary design phase and preliminary stress analysis to find out where are the critical most demanding areas in terms of stress. these areas should be redesigned to meet the stress requirements, however a reduced stiffness and strength should be considered in this phase. injection molding simulation follows after the preliminary stress analysis [4]. for parts with heavy wall thickness and with thickness variations it is necessary to use 3d injection molding simulation to capture all phenomena related to fiber orientation. important considerations in injection molding analysis are manufacturability of the part — the part should be void free, it should comply with warpage limits, dimensions etc. and the fiber orientation in the part should comply with mechanical loading of part to get the best possible mechanical properties for given applications. fiber orientation is mostly affected by gate location and plastic part design [4, 5]. final step in cae approach is stress analysis with anisotropic material properties resulting from fiber orientation. figure 1. ductile iron and pa66 gf30 flange. figure 2. crack and crack detail. motivation for this study was an existing design of injection molded flange made of pa66 gf30. it is dn80 flange, pressure class pn10. this design was converted from metal — ductile iron flange. both designs are shown in fig. 1. flanges from pa66 gf30 are injection molded with cold runner system with one gate. pa66 gf30 flanges failed to pass the load test. during the load test, two fittings were coupled with flanges and bolted together. after tightening the bolts to prescribed torque 40 nm, flanges cracked. crack occurs mostly near the hole farthest from the gate. the location of the crack is in the thickest area of the part, wall thickness in this area is 17 mm. crack location and crack detail are 378 http://dx.doi.org/10.14311/ap.2014.54.0378 http://ojs.cvut.cz/ojs/index.php/ap vol. 54 no. 6/2014 injection molding and structural analysis figure 3. details of fracture surfaces. figure 4. melting core after packing phase. shown in fig. 2. voids and a foam like structure are visible on fracture surfaces (fig. 3). this implies that the packing phase was not sufficient. injection molding analysis was performed on actual flange model with corresponding feed system and injection molding parameters. material in thickest locations is still far above melting point, while the cold runner system is already frozen (fig. 4). volumetric shrinkage can not be further compensated by adding new material, thus voids form in these locations. the largest void would be in the area with largest volumetric shrinkage, which is the thick section near the hole farthest from the gate as shown in fig. 5 — this is also the area where material started to crack during figure 5. areas with highest volumetric shrinkage. figure 6. weld line formation. figure 7. weld line fiber orientation. load test. however it is not only the void factor that contributed to the crack formation, but also the weld line in this location. strength of pa66 gf30 decreases in weld lines about 15 to 17 % [4]. weld line formation and resulting fiber orientation are shown in figures 6 and 7 respectively. 379 m. blaško, v. tittel, a. náplava, m. 
2. design, materials and analysis approach
the goal in this study is to redesign the flange so that it will meet the structural requirements (the proposed design must withstand the load test) and also the manufacturability requirements (the proposed design must be void free after injection molding). the following cae approach is used in this case study:
• redesign of the flange,
• preliminary stress analysis of the new design, design corrections, material considerations,
• injection molding analysis,
• stress analysis with consideration of fiber orientation.
the newly proposed design of the flange is shown in fig. 8. the thick sections were cored out and more stiffening ribs were added to support the structure.
figure 8: flange redesign.
the preliminary stress analysis setup is shown in fig. 9.
figure 9: setup for preliminary stress analysis.
a clamping force of 8.3 kn was applied to each bolt, which corresponds to the tightening torque of 40 nm.
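for orientation, the torque-to-clamping-force pairing quoted above can be sketched with the common short-form bolt relation f = t / (k d). the paper does not state the bolt size or friction conditions, so the nut factor and diameter below are purely illustrative assumptions chosen to reproduce the quoted numbers.

```python
def bolt_preload(torque, d_nominal, nut_factor=0.2):
    """short-form relation F = T / (K * d): torque [N*m], nominal bolt
    diameter [m]; K (the 'nut factor') lumps thread and under-head
    friction and typically lies around 0.15-0.3."""
    return torque / (nut_factor * d_nominal)

# illustrative only: 40 N*m -> 8.3 kN implies K*d ~ 4.8 mm,
# e.g. K ~ 0.30 for an assumed M16 bolt (bolt size is not stated in the paper)
print(bolt_preload(40.0, 0.016, 0.30))   # ~8.3e3 N
```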
the part design and gating options also have to be reviewed, since the highly stressed locations have poor fiber orientation in relation to the loading of this part.

4. conclusion
in this contribution, a cae approach for designing a reinforced plastic part was presented. the mechanical properties of fiber reinforced parts are strongly influenced by the fiber orientation resulting from the injection molding process. the orientation is mainly affected by the gating of the part and by the part design itself. an "isotropic" approach is not sufficient for predicting part behavior under load; it can be convenient in preliminary design to find out the stress requirements and then redesign the part accordingly or select a different material. injection molding simulation is vital to avoid defects in the molded part, such as voids, and, in the case of fiber reinforced materials, to analyze the resulting fiber orientation and obtain anisotropic material properties. the anisotropic material properties from the injection molding simulation can be mapped onto a structural mesh, and a stress analysis of the molded part can then be performed. anisotropic stress analysis gives better insight into how the part will perform under load and what deflections can be expected. however, since the orientation differs from place to place, it is difficult to tell what the allowable stress levels are, and thus whether the part will fail or not. the cae analyses were used for redesigning a new type of injection molding tool; practical verification and mechanical testing of the plastic parts will follow.

references
[1] g. pötsch, w. michaeli: injection molding — an introduction. hanser/gardner, 1995.
[2] j. beaumont: runner and gating design handbook — tools for successful injection molding. 2nd edition, hanser/gardner, 2007.
[3] e. campo: the complete part design handbook — for injection molding of thermoplastics. hanser/gardner, 2006.
[4] moldex 3d user manual, version r10.
[5] a. y. peng, w. yang, c. david: seamless integration of injection molding and structure analysis tools. antec proceedings, 2005.

improving aircraft design robustness with scenario methods
a. strohmayer
compared to other industries, the aerospace sector is characterized by long product cycles in a very complex environment. the aircraft manufacturer has to base his product strategy on a long-term view of risks and opportunities in the transport industry, but he cannot predict the development of relevant factors in this market environment with any certainty. in this situation, scenario methods offer a pragmatic way to limit the uncertainties and to work them up methodically, in order to derive recommendations for cost-intensive strategic decisions, such as the go-ahead for a new aircraft concept. by including scenario methods in the aircraft design cycle, the 'design robustness' can be improved, i.e. the design is not optimised for a prognosticated operating environment, but can cope with various possible future developments. the paper will explain the three fundamental aspects of applying scenario planning to the aircraft design process: requirement definition, design evaluation and technology identification. for each aspect, methods will be shown which connect the rather qualitative results of a scenario process with aircraft design, which typically demands a quantitative input.

keywords: aircraft design, scenario methods, requirements, design evaluation, technology strategy.

1 introduction
the time horizon of a manufacturer should reach at least as far as the period needed for the identification, development and introduction of a new product. as this time span can reach up to 15 years in the aircraft industry, the aircraft manufacturer has to be particularly engaged with future markets. the airlines, as his primary customers, are only of limited help in doing so, since their time horizon hardly extends beyond a few years. therefore, any development of a future product strategy that goes beyond simple suggestions for the improvement of existing products has to be carried out by the aircraft manufacturer himself. as already stated by steiner [1], aircraft programs always represent complex risk experiences for the manufacturer, as market conditions, competitive actions and technological alternatives are constantly changing during the relatively long program period. in recent decades, the aircraft market in the segment of more than 100 seats has changed dramatically, as aircraft manufacturers have improved their product strategies: whereas in the past several manufacturers offered quite different aircraft, today there remain only two competitors sharing the market almost equally and offering products which hardly show significant differences, either technically or economically.
however, the recent projects and proposals of the two big aircraft manufacturers show different views of the future: where the one sees a market need for giant 'megaliners' mainly operating on hub-and-spoke networks, the other offers a high speed 'sonic cruiser' for long range point-to-point service instead. the two designs are based on different assumptions about future developments in the air transport market, and the question is whether each design will be robust enough to succeed in the competitor's view of the future market. according to kazmer and roser [2], a design can only be called robust if it fulfils two requirements: the system performance should be within the customer specifications, and it should be as far as possible independent of input variance. for a robust aircraft design this means that its performance should meet the airlines' needs at entry-into-service, and that its characteristics should allow a successful launch more or less independent of variations in the future markets. therefore, in today's competitive situation, the aircraft designer has to foresee which design characteristics are robust enough, yet can still achieve decisive competitive advantages in the different future markets. to improve the robustness of future projects, he needs methods which can cope with the uncertainties in the development of relevant factors in the air transport industry. in this paper, the implementation of scenario methods in the aircraft design process is outlined as a useful means of complementing today's forecasting techniques.

2 the aircraft design process
as described by ehrlenspiel [3], the design synthesis cycle in general consists of three steps from a given task to a solution, see figure 1: clarification of the task, search for solutions, and selection of a specific solution. as indicated in the figure by the grey background, the variety of concepts increases in the first two steps and is reduced in the third, with an 'optimum' solution as a result. loops back to preceding steps allow intermediate results to be reconsidered. transferred to the aircraft project design process, as outlined for example by jenkinson et al [4], these steps can be assigned to the conceptual and the preliminary design phase at the very beginning of a new project. to initiate the design process, first of all a market need has to be identified. this need may come from customer requirements or a market analysis, leading to the further development of an existing product line. it may also result from the introduction and exploitation of new technologies and an innovation linked to them. extensive and detailed market forecasts are undertaken, considering social and economic trends, fuel prices, developments in the infrastructure of airports and air traffic control, and changes in the legislation relevant to air transport. for the first idea initiating a new project, requirements and regulations have to be clarified in order to state an aircraft specification. with regard to market and mission, payload and range requirements, cruise speed and costs have to be
quantified, environmental regulations concerning noise and emissions as well as airport compatibility requirements have to be met, and airworthiness, i.e. controllability and integrity within the whole flight envelope, has to be assured. these considerations are quantified as far as possible, or at least described precisely, in a 'standards & requirements' document, as a basis for the concept studies. for the subsequent selection of a baseline configuration out of the different concepts, the design effectiveness has to be assessed. the characteristics of the new design have to be analysed with the aim of easy and profitable integration into the operations of the target customers. for the manufacturer as well as for the customer, 'return on investment' may be the main decision criterion, but other economic, technological, political and environmental aspects must also be considered. as stated in jenkinson et al [4], at the early design stages there will always be insufficient knowledge of the future situation to make an accurate prediction of the effects of these aspects on the aircraft design. however, the aircraft design evaluation has to be based on a prognosticated market environment, and therefore its robustness towards changes in this environment depends strongly on the quality of these forecasts. the considerations of a project design process can only partly be quantified and thus analysed with computer based tools. instead, many of the disciplines involved are ill-adapted to quantification, but depend on intuition, sound judgement and creativity. and it is essentially these aspects that determine the scenarios within which the aircraft designer has to work, and that thus exert a strong influence on the design characteristics. again, the robustness of the new design depends strongly on the ability of the design team to integrate these disciplines satisfactorily in the project design process, in order to set up the right requirements and to identify the most promising concept, which can then be handed over to detail design.

3 the scenario method
how the factors which characterize today's air transport industry will merge to create the world of tomorrow is impossible to predict with any certainty. the examination of future developments is subject to the important principle that the future cannot be known.
answers to questions concerning future developments can be given in the form of hypotheses or assumptions only. the uncertainty of future developments with regard to economic, technological and political factors increases all the more, the further we look into the future. in such a situation – coming to a decision a long time ahead with sometimes considerable uncertainties about the development of relevant factors – scenario methods offer a pragmatic way to limit the uncertainties and to work them up methodically, in order to derive recommendations for action which are comprehensible, plausible and systematic. for this, a complex analysis is needed to structure the task and the relevant influencing factors precisely, and this is the goal of the scenario method: complex problems are grasped systematically, the mutual influences and networks are analysed, and finally the consequences are reflected upon. scenarios help to think in alternatives; the results are a staging of alternative future worlds, a description of the events leading to these worlds, and a definition of the driving forces in these systems, as shown for example in [5]. it is necessary to get an idea of the anticipated environment which is plausible enough to use as a base for cost-intensive strategic decisions like the go-ahead for a new aircraft concept.

fig. 1: the project design process (clarification of the task → search for solutions → selection of a solution; from project initiation via standards & requirements, concept studies, baseline design evaluations and parametric design studies to the selection of a baseline configuration and detail design)

4 scenario methods in aircraft design
scenario methods can be used for a wide variety of problems. in the following, it will be shown how these methods can be applied effectively in the aircraft design process. as shown in figure 2, there are essentially three fundamental aspects of scenario implementation in aircraft design: definition of market requirements, design evaluation, and identification of the level of technology to implement in the new design. in figure 2 these basic considerations are assigned to the project milestone structure, shown here from the first idea to the go-ahead, as defined in the airbus concurrent engineering (ace) project [6]. before m0, the business in general is observed in order to identify market opportunities and to initiate a new project. refined market analyses follow between m0 and m2, including first feedback from the customer. at this stage, not only the airlines' needs have to be analysed, but also the development of the infrastructural, economic and political situation. in a 'top down' approach, scenarios in different markets are analysed in different degrees of refinement, resulting in a robust set of requirements as an input for the design process. between m2 and m3, the results of the conceptual design studies are evaluated in order to identify at this early stage the most promising aircraft concept as a baseline configuration, which in turn will be varied and optimised between m3 and m4. for a scenario based design evaluation, the design concepts are evaluated 'bottom up' in various future market scenarios.
as the resulting baseline configuration has to hold up not only in a prognosticated environment of a key market, but also in alternative market developments, the robustness of the design will be improved. technology requirements are identified mainly in the feasibility and early concept phase. appropriate technologies for inclusion in the new or modified design concepts are selected and, as resources usually are limited, market-driven research priorities are defined. a decision on service readiness for new technologies has to be taken at the latest in the definition phase, and at this point the level of advanced technology for the project has to be frozen. a good technology strategy has the potential to succeed in a variety of potential future markets, whose priorities can be resolved with scenario methods. an additional use of scenario results can be seen after the authorization to offer (m6): as product success is tied closely to the marketing strategy, the evaluation results have to be communicated to the marketing organization, where arguments from different scenarios are used to respond to particular customers' needs.

fig. 2: implementation of scenario activities in the aircraft design process (project milestones m0–m7, from market opportunities identified through the feasibility, concept and definition phases to go-ahead, with the scenario activities market requirements, design evaluation, technology identification and marketing support assigned to them)

in the following, the various aspects of scenario implementation in aircraft design will be explained in more detail.

4.1 requirement definition
as outlined in chapter 2, the aircraft manufacturer runs an early market survey to identify air transport demands and operational issues concerning the environment, air traffic control, regulations and airports. based on this survey, a product idea will emerge, and an intensified analysis in the market segment will be run to determine a first set of requirements. this initial design definition for a passenger aircraft usually consists of seating capacity, range and operating cost levels. secondary issues in the set of requirements are performance and comfort standards, the number of engines, technology levels, infrastructure needs and commonality demands. later in the pre-launch period, specific requirements of key airlines can be included in the set of requirements. the conceptual design leads to a geometric definition, an engine evaluation and special features.
figure 3 shows how scenario methods lead from an analysis of the system environment to a robust set of standards and requirements: the environment of the air transport system, and thus of the aircraft design, is divided into search fields with the aim of finding all relevant factors in the system. for the future development of these factors, mostly qualitative projections are given; for example, ticket tariffs could rise, stagnate or decrease. in a matrix, a cross-impact analysis of all of these alternative assumptions can be carried out, which can be solved with computer based scenario tools, resulting in plausible, consistent scenarios. a number of scenarios, in the majority of cases between two and four, will be selected and, as they represent varying future markets, different scenario specific standards and requirements will be derived. in this step, the 'soft' scenario statements have to be translated into the 'hard' facts required in a specification. to get a robust set of requirements as a result of the process, each element of the requirements list has to be cross-checked in the alternative scenarios. requirements which are promising in each scenario can be included in the set; those which are in contradiction with most other scenarios have to be reconsidered, resulting possibly in a compromise.

fig. 3: from the market environment to robust requirements (system environment → search fields → relevant factors → projections → scenarios → scenario specific standards & requirements → robust requirements)
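a minimal computational sketch of the cross-impact step just described: enumerate one projection per factor, score each bundle with a pairwise consistency matrix, and keep the most consistent bundles as raw scenarios. the factors, projections and scores below are invented placeholders (only the ticket tariff example comes from the text); real scenario tools use richer consistency scales and cluster the resulting bundles.

```python
# minimal consistency-based scenario generation, in the spirit of section 4.1.
# ASSUMPTION: factors, projections and pairwise consistency scores are invented
# for illustration; the paper does not publish its cross-impact matrix.
from itertools import product

projections = {
    "ticket tariffs": ["rise", "stagnate", "decrease"],
    "fuel price":     ["high", "moderate"],
    "hub congestion": ["severe", "relieved"],
}
# consistency of projection pairs: +1 supportive, 0 neutral, -1 contradictory
consistency = {
    ("rise", "high"): 1, ("decrease", "high"): -1,
    ("decrease", "severe"): 1, ("rise", "relieved"): 0,
}

def score(bundle):
    """sum pairwise consistency over all projection pairs in one bundle."""
    s = 0
    for i, a in enumerate(bundle):
        for b in bundle[i + 1:]:
            s += consistency.get((a, b), consistency.get((b, a), 0))
    return s

bundles = list(product(*projections.values()))
for b in sorted(bundles, key=score, reverse=True)[:3]:
    print(score(b), b)  # the most consistent bundles become raw scenarios
```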
4.2 design evaluation
the acquisition of an aircraft usually follows a detailed evaluation of the concepts which fit the requirements of the customer. in order to pass this selection procedure, the aircraft manufacturer has to evaluate his design critically against the requirements, as well as against competing aircraft, already in the project design phase. he should know the characteristics of the competitors exactly when starting an entirely new design. consequently, he must be able to prove that his design can meet the future selection criteria and requirements of potential customers, and that it has, in addition, operational advantages over its competitors. the evaluation of competing products usually turns on economic factors such as seat mile costs versus range, or on technical factors such as fuel burn and field performance. as nowadays, however, the differences between new designs and in-production aircraft are very small, other criteria have to be considered as well in order to win the fierce competition for an airline's purchase decision. therefore, the current trend in aircraft evaluation is to consider the commonality effects and additional capability characteristics that result in an economic advantage for an airline but are not directly related to the operating costs. such additional evaluation criteria are, for example, cabin comfort aspects, operational flexibility, compatibility with the infrastructure, or environmental viability. the economic surplus value for the operator manifests itself in an increase in utilization, load factor and customer acceptance, lower costs for crew, maintenance and transition, smaller crew load and environmental fees, a higher residual value and better product support. in contrast to the purely economic and technical evaluation criteria, significant differences can sometimes be found between competing products for these factors. in addition, the relative importance of these factors may change dramatically in the future, as operation responds to new market needs, and a high degree of uncertainty surrounds their future development.

varying results of an aircraft evaluation are much more related to the market environment in which a particular airline will operate than to any inherent economic differences between competing designs. and it is exactly this airline environment that is worked up methodically in a scenario process. in different market environments, the relative significance of the various evaluation criteria will change in the view of an aircraft operator. a scenario based design evaluation therefore has to connect market drivers with design parameters by means of adapting the relative criteria weights to the alternative future worlds, as shown in figure 4. as in the case of most value benefit analyses, design data and criteria definitions first have to be set, but instead of reckoning them up in a fixed scheme, the criteria weighting will change for each scenario. the result is a set of evaluations which show the strong and weak points of a design concept in different future worlds. a deeper analysis of the evaluation results will show, across a range of markets, which design characteristics to keep and which to improve. this scenario based aircraft evaluation process can complement the standard comparisons in order to obtain a robust baseline configuration.

fig. 4: scenario based aircraft design evaluation (environment analysis → scenario specific criteria weights → scenario specific evaluation of monetary, technical and subjective criteria → robust baseline configuration)
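the weighting logic of fig. 4 can be sketched in a few lines: the same criterion scores for each concept are combined with scenario-specific weights, and a robust concept is one whose worst-case scenario result remains acceptable. all criteria, weights and scores below are invented placeholders; the paper publishes no numbers, and the concept names merely echo the examples from the introduction.

```python
# scenario-specific value-benefit evaluation, as sketched in fig. 4.
# ASSUMPTION: criteria, weights and concept scores are illustrative placeholders.
criteria = ["seat-mile cost", "comfort", "flexibility", "environment"]

weights = {  # relative criteria weights per scenario (each row sums to 1)
    "scenario a": [0.50, 0.15, 0.20, 0.15],
    "scenario b": [0.30, 0.30, 0.15, 0.25],
    "scenario i": [0.25, 0.20, 0.35, 0.20],
}
concepts = {  # normalised criterion scores (0..1) per design concept
    "megaliner":     [0.9, 0.6, 0.4, 0.5],
    "sonic cruiser": [0.5, 0.8, 0.8, 0.4],
}

for name, scores in concepts.items():
    per_scenario = {s: sum(w * c for w, c in zip(ws, scores))
                    for s, ws in weights.items()}
    worst = min(per_scenario.values())  # robustness: look at the weakest scenario
    print(name, {s: round(v, 2) for s, v in per_scenario.items()},
          "worst case:", round(worst, 2))
```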
4.3 technology identification
with every new aircraft design, a technological advancement is demanded by the customer, forced by competition, or pressed by political regulations, very often environmentally motivated, and stimulated by operational fees from airports and air traffic control. technologies will be introduced to a certain level which allows a market-orientated definition of a competitive new product and guarantees a return on investment for the manufacturer as well as for the airline. unexpected costs and risks have to be evaluated carefully, and the technology readiness level has to be assured. as outlined by steiner [1], there are three main phases of technology development: basic research, assembly of the body of technology, and application to a specific aircraft design. scenario methods can help mainly to identify basic research needs, but also to identify specific technologies to be applied in a specific new design. transferring the suggestions of fahey and randall [7] to the project design process, the scenario specific baseline configurations are analysed in order to identify key technologies, the necessary tools and methods for design, and operational requirements across a range of products, see figure 5. an evaluation of the identified needs across all scenarios leads to a robust strategy for a competitive new generation aircraft design, with improvements for example in ground and flight operations, mission flexibility, noise and emission characteristics, and economics. in addition, a recommendation for the long term orientation of technology research programmes can be derived.

fig. 5: from scenarios to robust strategies
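in its simplest form, the cross-scenario evaluation of technology needs described above reduces to counting in how many scenario-specific baselines a given technology appears. the sketch below shows this bookkeeping; the technology names and scenario lists are invented placeholders, as the paper gives no concrete examples.

```python
# ranking technology needs across scenarios (section 4.3): a technology that is
# demanded by every scenario-specific baseline is a robust research priority.
# ASSUMPTION: all technology names and scenario contents are invented.
from collections import Counter

needs = {
    "scenario a": {"laminar flow control", "lightweight structures"},
    "scenario b": {"lightweight structures", "noise reduction"},
    "scenario i": {"lightweight structures", "laminar flow control",
                   "noise reduction"},
}

votes = Counter(t for techs in needs.values() for t in techs)
for tech, n in votes.most_common():
    tag = "robust priority" if n == len(needs) else "scenario-dependent"
    print(f"{tech}: demanded in {n}/{len(needs)} scenarios -> {tag}")
```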
5 conclusion
as a consequence of integrating product strategy, customer requirements and research & development in a series of scenario processes, as proposed in this paper, the probability of having the 'right' product at the end increases. a meaningful factor for the success of this new product is timing, and accordingly the process of accurately assessing changes in airline fleet strategy and needs. an understanding of real market requirements and opportunities is absolutely necessary before starting a new project, especially with regard to the uncertainties in future developments. as this phase of analysis and planning should take up to two years for a successful project, the time required for a series of scenario processes is surely available and will lead to a deeper understanding of the real market needs. scenario planning is an appropriate method to derive robust market requirements and to evaluate the long-term viability of current design studies or technology investments in the early design phases.

references
[1] steiner, j. e.: how decisions are made – major considerations for aircraft programs. aiaa 1982 wright brothers lectureship in aeronautics, 1982.
[2] kazmer, d., roser, c.: evaluation of product and process design robustness. research in engineering design, vol. 11, no. 1, springer, 1999, pp. 20–30.
[3] ehrlenspiel, k.: integrierte produktentwicklung – methoden für prozeßorganisation, produkterstellung und konstruktion. carl hanser verlag, 1995.
[4] jenkinson, l. r., simpkin, p., rhodes, d.: civil jet aircraft design. arnold, 1999.
[5] strohmayer, a., jost, p.: air transport scenario 'flight unlimited 2015 – how to avoid operational limitations for large civil jet aircraft'. lt-tb-99/4, tu münchen, 1999.
[6] ace core team: airbus concurrent engineering. airbus industrie, 1999.
[7] fahey, l., randall, r. (editors): learning from the future – competitive foresight scenarios. 1st edition, john wiley & sons, 1998.

dipl.-ing. andreas strohmayer
phone: +49 89 289 15984, fax: +49 89 289 15982
e-mail: strohmayer@llt.mw.tu-muenchen.de
chair of aeronautical engineering, technische universität münchen, 85747 garching, germany

education for production and operations management
m. kavan
the department of mechanical engineering enterprise management at the faculty of mechanical engineering of the czech technical university in prague has its own doctoral programme, and runs postgraduate and master's courses. the department is engaged in a great deal of research in the field of marketing, financial and mainly operations management. a new production and operations management programme was started in 1997. the programme consists of: management of change and the importance of innovations, forecasting and operations strategy, design of work systems, total quality management and inventory control, material requirements planning and just-in-time systems, logistics and practical exercises. the study programme is organised in two stages, winter and summer semesters, and has a strong international orientation. the teaching goal is to prepare students for dealing with real-world settings and implementing the most effective up-to-date practices. the department aspires to lead in research and in developing modern concepts and tools. research is being conducted in the mechanical engineering industry under a grant from the eu leonardo programme. we invite you to email with questions or to schedule a visit to the department at any time.

keywords: production and operations management, study programme, department of mechanical engineering enterprise management, czech business and industry, czech technical university in prague, education, lean operations management, toyota production system.

1 introduction
the department of management at the faculty of mechanical engineering of the czech technical university in prague was set up in 1960, following the university reform act, bringing together various groups within the university that had previously worked separately. the department is headed by professor karel macík, and has a staff of some 20 lecturers. it has its own doctoral programme, and runs postgraduate and master's courses. the department is engaged in a great deal of research, mainly in the field of marketing, financial and operations management. it offers a general study programme in economics and management in mechanical engineering, as well as a special programme in production and operations management. in 1994 i successfully completed a four-week instructors training course at saint mary's university in halifax, sponsored by the government of canada. my present post is as a lecturer in production and operations management. the new programme in production and operations management was started in 1997, and is linked to research work and technology transfer within the department. the programme comprises courses in: management of change and the importance of innovations, forecasting and operations strategy, design of work systems, total quality management and inventory control, material requirements planning and just-in-time systems, logistics and practical exercises. the programme is organised in two stages, in the winter and summer semesters. the courses taught in each semester are compulsory for all students of the department, and provide a wide overview of the subject. the programme has a strong international orientation. the participation of students from a wide range of countries provides an opportunity to exchange ideas and experiences, and enriches the intellectual and social life of the department. european exchange programmes add further diversity to our student community, as visiting students study in prague and czech students have the opportunity to study at overseas universities. i have an internationally ranked programme of research, teaching and corporate contacts in operations management. this programme brings together theory and practice in a broad overview of operations management (om), dealing with the production process, interactions with other business functions, and also business strategy. production and operations management (pom) is the study of order, structure and relationship. it is a powerful tool for solving practical problems, and a highly creative field of study, combining logic and precision with intuition and imagination. an understanding of production and operations management is extremely valuable in today's technologically oriented workplace. a very wide range of employment opportunities are available to students who can handle operational concepts.
employers in czech business and industry like to hire students and graduates with a background in this type of management, because they are able to think and reason critically, logically and analytically. my teaching goal is to prepare students for dealing with real-world settings and for implementing the most effective up-to-date practices in the area of operations management. as part of this effort, students are required to participate in a field project. i aim to integrate the latest theoretical findings and problem solving tools into coursework. i engage in active course development, which leverages the joint professional, teaching and research experience of the department. at our department we aspire to lead in research, and in developing modern concepts and tools that aid executives. we are particularly interested in management activities involving: designing and operating production systems, strategy decisions for operations, forecasting and the decision making process, value analysis, process selection and capacity planning, layout of facilities, design of work systems, modern quality management, manufacturing planning and control, aggregate planning, master scheduling, inventory management, material requirements planning, just-in-time systems, materials management and purchasing, logistics and computer integrated manufacturing. students are required to gain 10 credits during a year of production and operations management study. my key benchmark in assessing the success of the programme is its impact on management practice. increasing importance has been attached in recent years to various aspects of education as the complex of systematic facts and knowledge gained either through formal education (learning) at universities and institutes of education or through informal training at one's place of work.
qualitative changes and ongoing trends of development in education are closely connected with scientific and technical progress, or – expressed more dramatically – with the scientific and technical revolution. this holds true especially for such a complicated phenomenon as operations management of products and services to meet the needs of the customer-consumer. scientific and technical innovation forms the framework within which this shift in the importance of operations management for economic and social life is taking place. one cannot complain that questions of education and training in this particular field have been neglected; on the contrary, proof of the interest in them is provided by this article, the main theme of which is education and training. as a professional teacher, i am often faced with the difficult problem of deciding what to include in my lectures for future mechanical engineers, who will be engaged in the design, technology and management of production in engineering plants, so that they will have the best preparation for their jobs from the point of view of production and operations management. this is not only a completely new discipline for them, but also a new approach to their jobs. we can no longer make do with only industrial statistics and operational research, both of which are, of course, essential tools in the new concept. to be an effective teacher, one must not only know what to teach, but also how to teach it. my teaching process has several steps, and along the way there are several questions i should ask myself as a teacher:
• what am i going to teach? (curriculum)
• how am i going to teach it? (lesson planning)
• how will i know when i've taught it? (assessment)
• how can i teach it better next time? (reflection)
if one does not ask the third question, there is no way to know if the teaching is truly effective. asking the fourth question is the key to staying current, fresh, and enthusiastic about teaching. for this reason the students not only do formal coursework, but also participate in a series of management development seminars, which provide an overview of czech business concepts and practices. business and industry field trips, seminars with czech executives, and other special activities provide a further dimension to the programme. the course is intended to be intensive, flexible, and adapted to individual needs. competency is assessed by comprehensive written and oral examinations. the final programme requirement is the successful oral presentation and defence of a thesis. engineering is a very large profession. many people are employed in this country as engineers, and the field will continue to expand as long as there are technical problems to solve. my students are trained to be problem-solvers who invent new products and make things work better, more efficiently, more quickly and less expensively. they will turn ideas into products and services. engineering graduates have excellent prospects for finding employment in private industry, government, or academia. we welcome students from around the world, and i have a growing number of international students joining my classes taught in the english language.

2 principles and practices of modern manufacturing
modern manufacturing is a total business philosophy built on four major principles: value analysis, flow, just-in-time, and perfection.
students learn how the integrated application of these principles can raise a company's performance – in terms of value delivery to customers, total cost of production, product quality, lead times, inventory turnaround, production flexibility, floor space needs, in-house technology development, labour utilization, safety, and employee satisfaction – to world class levels. i attempt to cover the lean philosophy in depth, along with the key supporting tools and practices for successful implementation. these concepts and their application are studied through hands-on demonstrations, real world case studies, and numerous exercises. students learn to:
• identify waste in an existing operation and develop countermeasures to eliminate it.
• articulate the shortcomings of traditional manufacturing thinking, and express the ways in which lean thinking overcomes them.
• develop a production improvement strategy for an existing operation.
• establish visibility, efficiency, and control through effective "5s" (which refers to the five words that encapsulate the principles for maintaining an effective, efficient workplace) and visual management systems.
• make real progress towards zero defects by enhancing the quality systems with lean tools, such as poka yoke and self/successor inspection.

3 lean operations management
operations management has seen a revolution in recent years, becoming a topic of critical importance in business today. demands for quality, time-based competition, and international production have demonstrated that superior management of the operations function is vital to the survival of a firm. success in any venture requires the ability to identify what potential customers need, and then to produce the products or services that satisfy this need better, faster, and cheaper than the competitors can. students concentrate on the critical operational functions that allow an organization to do just that. they explore the roots of the toyota production system, learning leadership, design management, quality management, and workplace management principles. through a progressive series of class discussions and hands-on practical applications, students develop problem solving strategies aimed at continuously improving their business systems. students learn:
• how to apply the principles and practices required to effectively operate a lean system.
• the basic concepts of operations management and lean manufacturing (customer focus, process orientation vs. results orientation, value, value stream, flow, pull, and perfection, systems thinking).
• to focus on improving people as the most important resource in any continuous improvement plan.

4 organizational learning for lean manufacturing
the core issues of transforming existing organizations into lean enterprises focus on learning and continuous improvement. presentation segments provide an overview of important concepts and practices of organizational learning, based on a comparison of craft, mass production, and lean work design principles. students discover how all work is embedded in and managed through meaningful communication. this is followed by a closer investigation of the role of organizational change and personal change agency, leading up to a design strategy for the learning organization.
a description of specific knowledge management tools for the lean enterprise, such as models and scenarios, and of the necessary policy and network support, concludes the programme. students learn to:
• appreciate and develop the creative potential in all workers.
• better understand the role of knowledge and learning in their work environment.
• target education needs more specifically.
• improve motivation by developing a new sense of work ownership.

5 management for a lean system
improvement of the process – from the factory floor to the offices and throughout an enterprise – is critical to the success of lean implementation. i guide discussions on how to perform job instruction training, facilitate problem solving using teams, involve suppliers and customers, and manage cost parameters based on processes. in a discovery workshop, students establish teams to develop management solutions to tackle actual situations. through a series of self-guided exercises, the teams learn how to use several effective problem-solving techniques, including kaizen methods. students learn to:
• assess the current state of a factory and develop an overall lean implementation plan, identifying key supporting kaizen activities.
• define product families and machine groups as a basis for implementing work cells and focused factories to achieve flow.
• design group technology work cells, including lot sizing, equipment layout, workstation design, and work-design strategies to achieve high labour utilization.
• train team members and leaders in job instruction methods to develop multifunction workers for all process operations.
• build effective employee involvement teams (kaizen teams, quality control circles, work teams).
• develop creative problem-solving techniques.
• involve suppliers and customers in continuous improvement activities (supply chain management).
• manage the financial aspects of a lean enterprise (target costing, kaizen costing, building a cost management system).
this course aims to expand students' understanding of the lean enterprise. it demonstrates the need for a profound transformation of the organization according to lean principles, and argues against an implementation approach focusing on tools alone. such a transformation requires a deep understanding of the human system, of culture, and of leadership strategies and behaviours. both the corporate and the operational level of the enterprise are reviewed from this perspective. the course concludes with a two-hour, applied transformation strategy exercise in which teams develop specific blueprints for their lean future. i have more than 20 years of experience in university education. my current research takes a broad look at speed and flexibility in manufacturing and logistic systems – more specifically, at how these systems develop the attributes necessary to respond quickly and efficiently to changing customer demand. this research is conducted primarily in the mechanical engineering industry under a grant from the eu leonardo programme. in particular, much of my work focuses on the interface between manufacturing and retail organizations. i am a member of the czech association for mechanical engineers. prior to embarking on an academic career, i spent five years in various managerial positions in czech companies. one of today's realities is that knowledge is a continually moving target: what was considered current as recently as five years ago is now, in many disciplines, considered hopelessly out-of-date.
whether a student is considering enrolling for an undergraduate or a graduate level certificate or diploma programme, or for a professional development course, he or she can be sure that whatever we offer will be consistent with what is happening in today's marketplace. at the department of management at the faculty of mechanical engineering of the czech technical university, we are constantly updating our courses and programmes to reflect not just today's reality, but also, as best we can anticipate it, the reality of the decade ahead. one example is the set of new diploma programmes in the rapidly evolving field of production and operations management. we are also working hard to ensure that our physical facilities will enhance students' learning experience. our computer laboratories have been upgraded and expanded to provide state-of-the-art equipment that reflects the real world in which students will be working. designed for those who wish to pursue an interest or develop a new understanding, courses are also offered in the english language. every year, more than eighty students at the department study economics and management at bachelor's, master's and doctoral levels. the department offers tuition and training comparable to well-known academic institutions abroad, and occupies a leading position among all schools of its kind in the czech republic. the academic staff includes numerous personalities who are at the top of their fields, both as teachers and as researchers, a number of whom have been recognized abroad for their scholastic achievements. the department maintains an ongoing relationship with state and other public sector institutions, as well as important contacts with the business community. this strategy will continue to pay dividends in strengthening our position as a desirable partner for foreign institutions. in principle, we distinguish between two basic forms of education and training. the first concerns formal education for those who are still preparing themselves for a certain profession and gaining qualifications for it; the aim of this type of education is not just to acquire new knowledge, but above all to learn how to think. the second form provides employees already working in organisations and enterprises with new knowledge and skills, through training at work or externally. this can be achieved in several ways: for instance, through the employee's ongoing contact with his surroundings, with problems at work, and with his supervisors; by individual study; or by training offered by the employer, including sending the employee to a post-graduate study programme at an institution of higher education. from the point of view of an enterprise, the training of employees may be considered as an internal form of education. an external form of education, on the other hand, involves the training of people who are not employees of the firm but are important to it as the producer of products or services. operations activities such as forecasting, choosing a location for an office or plant, allocating resources, designing products and services, scheduling activities, and assuring quality are core activities of most business organizations. very often, most of the employees and assets of an organization are controlled by the production/operations function. historically, production and operations management techniques developed in manufacturing organizations.
however, as time went on, it became more and more apparent that nonmanufacturing organizations have to contend with problems similar to those encountered in manufacturing settings. consequently, the scope of production and operations management has been expanded to cover both manufacturing and service organizations; moreover, many of the techniques can be applied directly to both areas without modification. i have always found operations management to be the most relevant and enjoyable part of my own business studies. it deals with the fundamental essence of a firm: how its products are made, and how its services are delivered to customers. it involves everything from strategic concerns, such as aggregate planning, plant location, and service capacity expansion, to tactical issues, such as daily order scheduling, statistical quality control, and inventory control. studying such a broad range of topics helps a student achieve a balance between skilful use of the necessary analytical tools and an understanding of the underlying conceptual issues. production and operations management lies at the heart of the great changes sweeping through today's business environment. the competitive pressures for higher quality, quicker response time, superior service, and total customisation can only be met through more intelligently run business operations. even the recent enthusiasm for corporate re-engineering is fundamentally about better management of operations. i invite you to email me at any time with questions, or to schedule a visit to the department of management at the faculty of mechanical engineering of the czech technical university in prague.

ing. michal kavan, csc.
phone/fax: +420 2 2435 9286
e-mail: kavanm@fsih.cvut.cz
department of management, czech technical university in prague, faculty of mechanical engineering, horská 3, 120 00 praha 2, czech republic

fact — longterm monitoring of bright tev blazars
katja meier (a,*), a. biland (b), t. bretz (b), j. buß (c), d. dorner (a), s. einecke (c), d. eisenacher (a), d. hildebrand (b), m. l. knoetig (b), t. krähenbühl (b), w. lustermann (b), k. mannheim (a), d. neise (c), a.-k. overkemping (c), a. paravac (a), f. pauss (b), w. rhode (c), m. ribordy (d), t. steinbring (a), f. temme (c), j. thaele (c), p. vogler (b), r. walter (e), q. weitzel (b), m. zänglein (a) (fact collaboration)
(a) universität würzburg, germany — institute for theoretical physics and astrophysics, 97074 würzburg
(b) eth zurich, switzerland — institute for particle physics, schafmattstr. 20, 8093 zurich
(c) technische universität dortmund, germany — experimental physics 5, otto-hahn-str. 4, 44221 dortmund
(d) epf lausanne, switzerland — laboratory for high energy physics, 1015 lausanne
(e) university of geneva, switzerland — isdc data center for astrophysics, chemin d'ecogia 16, 1290 versoix
(*) corresponding author: kmeier@astro.uni-wuerzburg.de

abstract. the first g-apd cherenkov telescope (fact), located on the canary island of la palma, has been taking data since october 2011. fact has been optimized for longterm monitoring of bright tev blazars, to study their variability time scales and flare probability. g-apd photo-sensors allow for observations even under strong moonlight conditions, and the telescope can be operated remotely. the monitoring strategy of fact is discussed, and preliminary results of the flare of mrk501 in june 2012 are shown.

keywords: cherenkov astronomy; gamma astronomy; monitoring; agn; blazar.

1. the fact telescope
the first g-apd cherenkov telescope (fact) is situated on the canary island of la palma, at the observatorio del roque de los muchachos, at 2200 meters above sea level. for the experiment, the former mount of the hegra ct3 telescope was refurbished: the massive steel structure was taken over, but a new drive system was installed, and the mirrors of the former hegra ct1 were repolished and newly coated, giving a total mirror area of 9.5 m2. the setup of the telescope was designed for the detection of very high energy (vhe) cosmic gamma rays, ranging from several hundred gev up to approximately 10 tev, applying the imaging atmospheric cherenkov technique. the telescope started taking data in october 2011. an outstanding feature of fact is the possibility to operate the telescope remotely. fact is the first cherenkov telescope using geiger-mode avalanche photodiodes (g-apds), instead of photomultiplier tubes, for photon detection in regular observation. the camera consists of 1440 pixels, each with an opening angle of 0.11 degrees, which results in a total field of view of 4.5 degrees. each pixel has a solid light concentrator to allow for maximum area compression for photons arriving from the mirror, to reduce light loss due to fresnel refraction, and also to shield the sensors from background photons not reflected by the mirrors. a great advantage of g-apds is their ability to operate even under strong moonlight conditions, which significantly increases the possible duty cycle of the telescope. more details on the applied technique and the design of the telescope can be found in [1]. the fact collaboration was founded in 2008 with the aim of examining these silicon photodetectors for use in cherenkov telescopes and of monitoring bright tev blazars in the long term.

2. longterm monitoring of bright tev blazars
longterm lightcurves with consistent sampling are needed to study such highly variable objects as blazars, which show variations in their fluxes on timescales ranging from minutes to years. a more detailed explanation of blazars and their related physics is given in section 3. longterm monitoring in the vhe range also offers the opportunity to combine these data with observations at other wavelengths in multiwavelength (mwl) campaigns. such complete sampling allows for a deep insight into the fundamental acceleration processes and related physics that cause the observed radiative phenomena. to complete this process of continuous monitoring, the idea of the dwarf project (dedicated multiwavelength agn research facility) is to position small, inexpensive telescopes around the world, to ensure monitoring all around the clock. hence fact is the first telescope of this project, and the first cherenkov telescope using g-apds in regular operation.
one major task of the large currently operating cherenkov telescopes is to search for new sources; examples are magic (major atmospheric gamma-ray imaging cherenkov telescopes), h.e.s.s. (high energy stereoscopic system) and veritas (very energetic radiation imaging telescope array system). their observing time is too expensive for longterm monitoring; only small, inexpensive telescopes are ideally suited for this task.

figure 1. excess rate in events per hour versus time in modified julian date, for the blazar markarian 421. the excess rate is plotted in black, while the background rate is in blue. a major flare in april 2013 can be seen.
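the quantity plotted in figs. 1 to 3, a background-subtracted excess rate in events per hour, can be illustrated with a toy on/off bookkeeping sketch. the binning and the on/off normalisation alpha below are assumptions for illustration; the actual fact data check and analysis chain are described in [10] and are more involved.

```python
# toy excess-rate lightcurve: background-subtracted event rate per time bin.
# ASSUMPTION: simple on/off counting with normalisation alpha; the real fact
# data check and analysis chain are described in [10].
def excess_rate(n_on, n_off, alpha, hours_per_bin):
    """per-bin excess rate in events/hour, as plotted in figs. 1-3."""
    return [(on - alpha * off) / hours_per_bin
            for on, off in zip(n_on, n_off)]

# e.g. three nightly bins of 4 h each, with alpha = 0.2 (invented counts)
print(excess_rate([120, 150, 600], [400, 420, 450], 0.2, 4.0))
# -> [10.0, 16.5, 127.5]  (the last bin mimics a flare night)
```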
markarian 501 is a blazar located in the constellation of hercules. it was discovered in 1996 for photon energies above 300 gev by whipple [5]. it is one of the brightest and nearest objects in the vhe regime, and therefore one of the most studied blazars. its historical flux maximum in the vhe range was detected in 1997 [6–8]. markarian 501 has a redshift of z = 0.034. markarian 421 is located in the constellation of ursa major and has a redshift of z = 0.031. it was detected in the very high energy range in 1992 by whipple [9].

4. results of fact

as a result of the longterm monitoring of bright tev blazars by fact, the excess rates of markarian 501 and markarian 421 are presented here. for markarian 421, the time range in figure 1 is from may 2012 until june 2013. the shown data includes a flare in april 2013. for all plots the excess rate is plotted in black and the background rate in blue. the data check used for these plots is described in [10] in full detail. the gaps in which no data was taken are due to high zenith angles, the source being below the horizon, or bad weather.

the shown excess rate of markarian 501 starts in may 2012 and lasts until june 2013. two enhanced flux states of the blazar can clearly be seen in figure 2, the first in june 2012 and the second in february 2013. figure 3 gives an insight into the nights of the first flare, over a time range from may 2012 until june 2012. the highest bin shows a rise of the flux to nearly six times the previous value. this highest flux was measured in the night of 8/6/2012 during a major outburst of markarian 501.
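assuming the standard on/off prescription behind such excess-rate plots, the significance of an excess is conventionally computed with the formula of li & ma (1983). the sketch below (our own illustration with purely hypothetical counts; the actual fact analysis chain is described in [10]) shows the order of magnitude of a "5 sigma" detection:

```python
from math import log, sqrt

def li_ma_significance(n_on, n_off, alpha):
    """significance of an excess following li & ma (1983), eq. (17).

    n_on   counts in the on-source region
    n_off  counts in the off-source (background) region(s)
    alpha  ratio of on-source to off-source exposure
    """
    s_on = n_on * log((1 + alpha) / alpha * n_on / (n_on + n_off))
    s_off = n_off * log((1 + alpha) * n_off / (n_on + n_off))
    return sqrt(2 * (s_on + s_off))

# hypothetical counts, chosen only to illustrate the order of magnitude of a
# "5 sigma in 5 minutes" detection as quoted in the conclusions below.
print(li_ma_significance(n_on=80, n_off=200, alpha=0.2))   # -> about 5.0
```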
5. conclusions

studying agns requires longterm monitoring of very high energy sources, as their fluxes are highly variable. dwarf (dedicated multiwavelength agn research facility) is a project to enable such constant monitoring with small, inexpensive telescopes around the world. the first g-apd cherenkov telescope (fact) is the first telescope of this project. it is also the first cherenkov telescope that uses geiger-mode avalanche photodiodes (g-apds) instead of photo-multiplier tubes in regular observations. g-apds have the great advantage of being operable even under strong moonlight conditions, which allows for a considerably larger duty cycle. fact has been successfully taking data for more than 1.5 years. two major flares of markarian 501 and one of markarian 421 have been detected with high significance so far. for the highest bin of the measured flux during the first flare of markarian 501 the significance was around 5 sigma in 5 minutes. this nicely demonstrates that fact is able to send flare alerts to other telescopes on short timescales.

figure 3. the first flare of markarian 501 in more detail, for a time range from may 2012 until june 2012. for the highest bin of the excess rate, the flux rose to a value six times higher than the previous one.

acknowledgements

the important contributions from eth zurich grants eth-10.08-2 and eth-27.12-1 as well as the funding by the german bmbf (verbundforschung astro- und astroteilchenphysik) are gratefully acknowledged. we are thankful for the very valuable contributions from e. lorenz, d. renker and g. viertel during the early phase of the project. we thank the instituto de astrofisica de canarias for allowing us to operate the telescope at the observatorio roque de los muchachos in la palma, the max-planck-institut für physik for providing us with the mount of the former hegra ct3 telescope, and the magic collaboration for their support. we also thank the group of marinella tose from the college of engineering and technology at western mindanao state university, philippines, for providing us with the scheduling web interface.

references

[1] h. anderhub, et al. design and operation of fact – the first g-apd cherenkov telescope. jinst 8:p06008, 2013. arxiv:1304.1710, doi:10.1088/1748-0221/8/06/p06008.
[2] m. f. cawley, et al. detection of tev gamma rays from the crab nebula using the atmospheric cherenkov imaging technique. irish astronomical journal 19:51–54, 1989.
[3] a. a. abdo, et al. gamma-ray flares from the crab nebula. science 331:739, 2011. arxiv:1011.3855, doi:10.1126/science.1199705.
[4] c. m. urry, et al. unified schemes for radio-loud active galactic nuclei. pasp 107:803, 1995. arxiv:astro-ph/9506063, doi:10.1086/133630.
[5] j. quinn, et al. detection of gamma rays with e > 300 gev from markarian 501. apj 456:l83, 1996. doi:10.1086/309878.
[6] a. djannati-atai, et al. very high energy gamma-ray spectral properties of mkn 501 from cat čerenkov telescope observations in 1997. a&a 350:17–24, 1999. arxiv:astro-ph/9906060, doi:10.1086/308558.
[7] m. amenomori, et al. detection of multi-tev gamma rays from markarian 501 during an unforeseen flaring state in 1997 with the tibet air shower array. apj 532:302–307, 2000. arxiv:astro-ph/0002314.
[8] f. a. aharonian, et al. the temporal characteristics of the tev gamma-radiation from mkn 501 in 1997. i. data from the stereoscopic imaging atmospheric cherenkov telescope system of hegra. a&a 342:69–86, 1999. arxiv:astro-ph/9808296.
[9] m. punch, et al. detection of tev photons from the active galaxy markarian 421. nature 358:477, 1992. doi:10.1038/358477a0.
[10] t. bretz, et al. fact – the first g-apd cherenkov telescope (first results). in f. a. aharonian, et al. (eds.), american institute of physics conference series, vol. 1505, pp. 773–776. 2012.

recent developments in ultra-high energy neutrino astronomy

peter k. f. grieder∗

physikalisches institut, university of bern, switzerland
∗ corresponding author: peter.grieder@space.unibe.ch

abstract. we outline the current situation in ultrahigh energy (uhe) cosmic ray physics, pointing out the remaining problems, in particular the puzzle concerning the origin of the primary radiation and the role of neutrino astronomy for locating the sources. various methods for the detection of uhe neutrinos are briefly described and their merits compared.
we give an account of the achievements of the existing optical cherenkov neutrino telescopes, outline the possibility of using air fluorescence and the particle properties of air showers to identify neutrino induced events, and discuss various pioneering experiments employing radio and acoustic detection of extremely energetic neutrinos. the next generation of space, ground and sea based neutrino telescopes now under construction or in the planning phase are listed.

keywords: neutrino astronomy, neutrino telescopes, neutrino detection.

1. introduction

the principal aim of neutrino astronomy is to locate the sources of the uhe component of the cosmic radiation (cr). cr is predominantly of hadronic nature. it is therefore expected that uhe hadronic interactions take place within the sources and in their immediate vicinity, copiously producing pions, kaons and other particles that are subject to decay, yielding a corresponding number of photons and neutrinos of different flavors. consequently, an uhe hadron source is also expected to emit uhe neutrinos that are signatures of the hadronic processes. their trajectories are not affected by magnetic fields; they point directly at the source and should be detectable.

in recent years cosmic ray physics has made great progress, in particular as concerns the primary all-particle spectrum. in the high energy regime (e ≥ 10^14 ev), where air showers are the only source of information from which the properties of the primary radiation can be extracted in conjunction with simulations, the results of different experiments had deviated significantly from one another until recently. after re-scaling the energy spectra of all major experiments of recent years, there is now good agreement with respect to the shape of the spectrum, i.e., the spectral index, and the intensity to within about 20 percent or better up to ∼ 5 × 10^19 ev. beyond this energy even the two most recent and largest experiments, the telescope array (ta) [48] in the northern hemisphere and the pierre auger observatory (pao) [5] in the southern hemisphere, show increasing differences between their respective spectra with increasing energy as they enter the region of the expected greisen–zatsepin–kuzmin (gzk) cutoff [32, 59] (see fig. 1a).

unfortunately, the situation concerning the primary composition remains very unsatisfactory. the energy dependence of the $x_{\max}$ distributions and of other primary mass sensitive observables recorded by the ta and the pao manifest different trends at uhe. in general, large differences exist between the compositions obtained by the different experiments in the air shower energy domain of the primary spectrum, and they get worse with increasing energy.

some progress can be reported concerning the correlation between the arrival direction of the most energetic events (air showers) and astrophysical objects. however, so far no object could definitely be identified as a source of uhe cosmic rays, and the three most relevant experiments carrying out anisotropy studies (pao, ta, and hires, now shut down) yield inconclusive results [5, 22, 48].

neutrino astronomy is expected to solve the cosmic ray source puzzle, as mentioned before, provided an adequate flux of uhe neutrinos exists and can be detected. if no neutrino point source can be found but only a diffuse isotropic flux of uhe neutrinos, this would be additional evidence, besides the drop of the all-particle spectrum beyond ∼ 5 × 10^19 ev and the increasing gamma ray fraction observed at uhe (fig. 1b), for the existence of the gzk process
$$p + \gamma_{2.7\,\mathrm{K}} \to \Delta^{+} \to n + \pi^{+}; \qquad \pi^{+} \to \mu^{+} + \nu_{\mu}, \quad \mu^{+} \to e^{+} + \bar{\nu}_{\mu} + \nu_{e}, \quad n \to p + e^{-} + \bar{\nu}_{e},$$
and similar reactions, which cause the cutoff.
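to put the cutoff energy into perspective, the threshold of the above reaction follows from simple relativistic kinematics (a standard textbook estimate, not part of the original text). for a proton colliding with a cmb photon of typical energy $E_\gamma \approx 6 \times 10^{-4}$ ev under an angle θ,
$$s = m_p^2 + 2E_p E_\gamma (1 - \cos\theta) \;\overset{!}{\geq}\; m_\Delta^2 \qquad (c = 1),$$
so that for a head-on collision (θ = π)
$$E_p \gtrsim \frac{m_\Delta^2 - m_p^2}{4E_\gamma} \approx \frac{(1232^2 - 938^2)\,\mathrm{MeV}^2}{4 \times 6 \times 10^{-4}\,\mathrm{eV}} \approx 3 \times 10^{20}\,\mathrm{eV}.$$
averaging over the blackbody photon spectrum and the collision angles brings the effective onset of the suppression down to the quoted ∼ 5 × 10^19 ev.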
2. neutrino reaction signatures and detection

a common feature of all high energy neutrino interactions, be it charged or neutral current reactions, initiated by any flavor (νe, νµ or ντ and their antiparticles), is a hadron cascade emerging from the point of collision. the latter takes on average about 20 % of the incident energy at energies beyond ∼ 10^6 gev, whereas the bulk of the energy is taken by the forward going electron, muon or tau lepton emerging from the respective reaction, or by the scattered incident neutrino.

figure 1. a) re-scaled uhe primary all-particle spectra from the six major air shower experiments [52]. b) photon fraction as deduced from different pao measurements and predictions from various models. gzk-γ shows the contribution from the gzk process (for details see [7]).

figure 2. characteristic effects caused by a high energy neutrino initiated interaction in a dense target medium (hadron cascade, optical and radio cherenkov emission cones, and the acoustic shock disc). analogous effects occur in νe and ντ triggered reactions. note that in principle the cherenkov process generates optical as well as radio emission.

the common signature of all neutrino reactions in a dense target medium is illustrated, in principle, in fig. 2 on the basis of a νµ interaction. the emerging muon must be replaced by an emerging electron or tau for νe or ντ initiated events, respectively. such events can be detected in principle by an array of optical, radio or acoustic sensors in a suitable medium (water, ice, rock, or a salt dome). since neutrino events are rare, background rejection and shielding are of paramount importance. depending on the physical properties of the target medium, the propagation of parts of the cherenkov emission (optical or radio) may be suppressed.

in the atmosphere, uhe neutrino initiated events cause air showers with particular characteristics that can be identified as such (late starting, downward going hadron poor showers; earth skimming upward going showers; or, typical for ντ initiated events, showers that emerge from mountain sides or start in the air, whose axis projected backward points toward a mountain slope).

since neutrinos have extremely small reaction cross sections, and since uhe astrophysical or cosmogenic neutrino fluxes are expected to be extremely low, as can be estimated from the cr spectrum and from various cr source and propagation models, huge detector systems (targets) are required to collect a statistically significant number of events. consequently the attenuation length of the agent used to record, reconstruct and identify the events, i.e., optical or radio photons, or acoustic shock waves, is of prime importance (see tab. 1).
it determines the layout of the sensor matrix (fine or coarse meshed), the size, and probably the price of a detector telescope.

technique          target medium   frequency/wavelength   attenuation length    reference
optical cherenkov  water           400 nm                 36 m                  [43]
                   water           470 nm                 55 m                  [43]
                   ice             405 nm                 ∼ 50–100 m            [35]
air fluorescence   atmosphere      355 nm                 ≈ 14 km               [25]
                   atmosphere      355 nm                 30 (+16/−10) km       [57]
radio cherenkov    ice             380 mhz                1450 (+300/−150) m    [19]
                   ice             250–400 mhz            400–700 m             [23]
                   ice             100–300 mhz            495 ± 15 m            [34]
                   lunar regolith  1 ghz                  ∼ 20 m                [33]
                   rock salt       94 mhz                 330 m                 [24]
                   rock salt       1 ghz                  > 250 m               [29]
acoustic shock     sea water       10 khz                 ∼ 5 km                [11]
                   sea water       20 khz                 ∼ 1 km                [11]
                   ice             10–30 khz              312 (+68/−47) m       [2]

table 1. detection technique and corresponding attenuation length.

3. neutrino telescopes

3.1. initial efforts and prototypes

the first attempt to search for cosmic neutrinos was made in japan in the early 1960s, using an air shower array and looking for so-called hadron poor horizontal air showers as a neutrino signature [47], however without much success. the decisive step which eventually led to the solution of the two major problems in the search for uhe cosmic neutrinos, their small reaction cross section and the expected low intensity, was made by markov [45]. he suggested using the ocean as a neutrino target and installing a giant three-dimensional matrix of optical sensors at great depth, to look for upward directed cherenkov light trajectories of energetic muons emerging from uhe upward propagating muon neutrino initiated interactions.

this idea became the guideline for an international collaboration that was formed early in 1981 to develop the dumand (deep underwater muon and neutrino detector) project, a giant detector matrix intended to be deployed in the pacific, at great depth near hawaii. the pioneering efforts of this collaboration eventually led to a very successful prototype system [17] that became the template for all subsequent deep water or deep ice optical cherenkov neutrino telescopes. unfortunately, the dumand project had to be abandoned in 1995 because of lack of funds. a russian collaboration built a similar prototype in the early 1980s, which was successfully operated in lake baikal and has been continuously expanded until now.

3.2. deep-water/ice optical cherenkov neutrino telescopes

since the cherenkov track of a muon emerging from a high energy νµ or ν̄µ initiated interaction is the most easily and clearly identifiable signature of all neutrino reactions, the initial searches for uhe astrophysical or cosmogenic neutrinos were focused on muon neutrinos, using huge optical sensor matrices at great depth, for good shielding from downward going atmospheric muons, in large bodies of water or ice. today the list of large deep-water/ice optical detector matrices currently operating as neutrino telescopes comprises, besides nt-200 at lake baikal (since 1998), the antares telescope in the mediterranean (since 2007) and the giant 1 km³ icecube matrix in the deep ice at the south pole (completed in december 2010), with the high resolution deep core detector embedded within it. (amanda, which began operation in 2000 at the location of icecube, had been shut down some time ago.) all these detectors are fine-meshed arrays with a typical sensor spacing of the order of about half of the attenuation length of the cherenkov light in the respective media, and they yield a fair amount of reaction details.
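the practical consequence of the attenuation lengths in table 1 can be made concrete with a back-of-the-envelope sensor count (our own illustration; the half-attenuation-length spacing is taken from the remark above, the rounded attenuation lengths from table 1):

```python
# back-of-the-envelope illustration (ours, not from the text): with sensors
# spaced at about half the attenuation length, how many instrument 1 km^3?
att_length = {
    "optical cherenkov, ice (405 nm)": 75.0,      # m, ~50-100 m in table 1
    "radio cherenkov, ice (100-300 mhz)": 495.0,  # m
    "acoustic, ice (10-30 khz)": 312.0,           # m
}
side = 1000.0  # m, edge of a 1 km^3 cube
for technique, lam in att_length.items():
    spacing = lam / 2.0
    n_sensors = (side / spacing + 1) ** 3         # nodes of a cubic grid
    print(f"{technique}: ~{spacing:.0f} m spacing -> ~{n_sensors:.0f} sensors")
```

the orders of magnitude (tens of thousands of optical sensors versus of order a hundred radio antennas per km³) are what drives the cost argument made in section 3.4 below.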
the pointing accuracy, which increases with energy, depends also on the detector type and configuration, and on the kind of neutrino reaction chosen to identify and reconstruct the event. as an example, the scattering angle between the reconstructed muon trajectory and the incident νµ in a charged current reaction is approximately 5 mr or less for an incident 20 tev νµ (fig. 2). apart from environmental data, these experiments have yielded a wealth of data on the cosmic ray muon flux, on muon physics and on atmospheric neutrino fluxes. unfortunately no uhe cosmic neutrinos could be identified so far; only upper limits could be established, except for two pev events in icecube [37]. nevertheless, the present data could already rule out some of the production models. the data on diffuse ν fluxes obtained from these experiments are presented in fig. 3. the energy estimation of the events is a very difficult task and is not discussed here.

the major disadvantage of the optical cherenkov technique is the relatively short attenuation length of light in water and ice, requiring densely instrumented detectors that make large volume telescopes extremely costly and impose ultimate limits. generally speaking, deployment of optical detectors in the deep open ocean (pacific) has proven to be difficult and hazardous. deployment in the calm mediterranean is probably less problematic.

3.3. air fluorescence detection of neutrinos & ν-initiated showers

neutrino-induced air showers exhibit specific features that are easily detectable with fly's eye type air fluorescence detectors under certain conditions, as mentioned in section 2 (late starting, highly inclined downward going showers; earth skimming ντ initiated events, emerging from the ground or mountain slopes). installations like the pao are well suited and very promising for such tasks. they offer a huge atmospheric target volume because of the very long optical attenuation length of air, are relatively cost effective, and have a high discovery potential for uhe cosmic tau neutrinos. the pointing accuracy of such telescopes is similar to that for hadronic showers. the upper limits of uhe ντ intensities from the pao experiment are given in fig. 3.
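for orientation, a parametrization of the mean kinematic angle between the incident νµ and the outgoing muon that is often quoted in the literature, though not given in this text, is
$$\langle\theta_{\nu\mu}\rangle \simeq 0.7^\circ \left(\frac{E_\nu}{1\,\mathrm{TeV}}\right)^{-0.7} \quad\Rightarrow\quad \langle\theta\rangle \approx 0.09^\circ \approx 1.5\,\mathrm{mr} \quad \text{at } E_\nu = 20\,\mathrm{TeV},$$
which is consistent with the bound of approximately 5 mr quoted in section 3.2.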
3.4. radio detection of neutrinos

the negative results so far obtained with the existing optical detector systems in the search for astrophysical neutrinos have motivated several investigators to explore the so-called askar'yan radio emission effect [15], which is caused by the negative charge excess in electromagnetic cascades in dense media. the charge excess is due to compton scattering, positron annihilation and other minor contributing effects. since radio waves have a much longer attenuation length at some frequencies in a variety of dense media, such as ice, certain rocks and pure salt in so-called salt domes (see tab. 1), a given number of radio detection elements (antennas) can instrument a much larger ν-target volume than the same number of optical sensors recording the optical component of the cherenkov emission in water or ice (rice, kravchenko et al. [40]).

moreover, huge thick homogeneous surface layers or bodies of suitable target material, such as the antarctic ice shelf or the giant greenland ice cap, can be surveyed with balloon borne (anita, gorham et al. [31]) or satellite borne antenna systems (forte, lehtinen et al. [42]) that can record radio signals from neutrino induced cascades in the target from large distances, because of the excellent propagation of radio waves in air and vacuum. a much larger target for uhe cosmic rays as well as neutrinos of all flavors is the vast layer of regolith on the lunar surface. this layer can be surveyed from a satellite based antenna system orbiting the moon, or, for higher threshold energies, with radio telescopes from earth, recording lunar surface skimming uhe events. the latter approach has been explored by the glue [30], lunaska [38], numoon [54, 55] and lofar [56] projects. radio detection experiments serve mainly to explore the energy spectrum and yield fewer details than the optical cherenkov telescopes. the upper limits from these experiments are also plotted in fig. 3.

3.5. acoustic neutrino detection

the hadron recoil cascade resulting from uhe neutrino interactions in dense media causes a thermal shock, as outlined in section 2 [14, 41]. several attempts were made in water and ice, using the existing infrastructures of the optical cherenkov telescopes, to deploy microphones to explore the phenomenon and to interpret the signals in terms of neutrino interactions (antares/amadeus, aguilar et al. [11]; spats, abbasi et al. [4]). the method is still in its exploratory phase.

4. next generation telescopes

the obvious lesson that we have learned so far in our exploration of the cosmos in search of the sources of uhe cosmic rays using neutrinos is that even larger detection systems are required, employing partly new concepts. within the context of this paper we can only list the next generation experiments that appear likely to be operational within the current decade, without going into details. apart from planned extensions of existing installations, new projects, some of which are well under way, comprise the arianna radio detection array to be installed on the ross ice shelf in antarctica [21] and jem-euso, a large air fluorescence detector to be installed on the international space station [27]. in addition there is the giant km3net water cherenkov array project in the mediterranean [39] and the lunar orbiting radio detector, lord [53], which are presently in the r&d phase.

5. concluding remarks

the lack of a positive result in our search for uhe astrophysical neutrinos with the present large deep water/ice optical cherenkov telescopes, and the promising exploratory work with the more economical radio detection systems, strongly suggests that future efforts should be oriented in this direction. however, the sensitivity of the method needs to be improved. even though the radio method does not seem to yield the details that the optical cherenkov matrices can yield, establishing the energy scale and (approximate) arrival direction of uhe messengers should have priority over details. the acoustic technique, too, may be worth exploring further, but it seems to be less promising.

acknowledgements

i am grateful to the organizers of the very stimulating vulcano workshop for the kind invitation to participate.
figure 3. compilation of the upper bounds of cosmic neutrino intensities as obtained by different experiments, together with corresponding predictions: a, atmospheric neutrino intensity and uncertainty region [49]; b, fréjus, νµ + ν̄µ [51]; c, macro, νµ + ν̄µ [12]; d, baikal nt-200, (νe + νµ + ντ)/3 [16]; e, amanda-ii uhe 2000–02, limits for (νe + νµ + ντ)/3 [9]; f, amanda-ii, νµ + ν̄µ limit [8]; g, antares 2007–09, νµ + ν̄µ [10]; h, full icecube, 3 years, all flavors [36]; i, auger differential, ντ + ν̄τ [6]; j, hires, νe + ν̄e [1]; k, hires, ντ + ν̄τ [46]; l, rice, all flavors [40]; m, auger integral, ντ + ν̄τ [6]; n, o, anita lite, all flavors [20]; p, anita-08, all flavors [31]; q, glue-04, all flavors [30]; r, forte-04, all flavors [42]; s, t, icecube ic22, all flavors [3]; u, wsrt/numoon, 20 hours, all flavors [55]; v, w, lunaska, all flavors [38]; x, topological defects [50]; mpr/2, agn based model [44], intensity/2; w&b/2, model intensities/2 [18, 58]; gzk-ν, models [13, 28].

references

[1] abbasi, r. et al.: 2008, ap. j. 684, 790
[2] abbasi, r. et al.: 2011a, astrop. phys. 34, 382
[3] abbasi, r. et al.: 2011b, p.r. d 83, 092003
[4] abbasi, r. et al.: 2012, astrop. phys. 35, 312
[5] abraham, j. et al.: 2008a, astrop. phys. 29, 188
[6] abraham, j. et al.: 2008b, p.r.l. 100, 211101
[7] abraham, j. et al.: 2009, astrop. phys. 31, 399
[8] achterberg, a. et al.: 2007, p.r. d 76, 042008
[9] ackermann, m. et al.: 2008, ap. j. 675, 1014
[10] aguilar, j. et al.: 2011a, p.l. b 696, 16
[11] aguilar, j. et al.: 2011b, arxiv:1009.4179
[12] ambrosio, m. et al.: 2003, astrop. phys. 19, 1
[13] anchordoqui, l. et al.: 2007, p.r. d 76, 123008
[14] askar'yan, g.: 1957, sov. j. at. energy 3, 921
[15] askar'yan, g.: 1962, sov. phys. jetp 14, 441
[16] avrorin, a. et al.: 2009, astron. lett. 35, 651
[17] babson, j. et al.: 1990, p.r. d 42, 3613
[18] bahcall, j., waxman, e.: 2001, p.r. d 64, 023002
[19] barwick, s. et al.: 2005, j. glaciology 51, 173
[20] barwick, s. et al.: 2006, p.r.l. 96, 171101
[21] barwick, s. w. et al.: 2011, proc. icrc, he2.3, 236
[22] belz, j.: 2009, n.p. b (proc. suppl.) 190, 5
[23] cheng, e. et al.: 2011, proc. icrc, he2.3, 267
[24] chiba, m. et al.: 2001, proc. first ncts workshop, kenting, taiwan, p. 90, world scientific
[25] chikawa, m. et al.: 1999, proc. icrc 5, 17
[26] descamps, f.: 2009, proc. icrc
[27] ebisuzaki, t. et al.: 2009, proc. icrc
[28] engel, r. et al.: 2001, p.r. d 64, 093010
[29] gorham, p. et al.: 2002, n.i.m. a 490, 476
[30] gorham, p. et al.: 2004, p.r.l. 93, 041101
[31] gorham, p. et al.: 2010, arxiv:1003.2961v3
[32] greisen, k.: 1966, p.r.l. 16, 748
[33] gusev, g. et al.: 2006, dokl. phys. 51, 22
[34] hanson, j. et al.: 2011, proc. icrc, he2.3, 168
[35] icecube: 2011a, proc. icrc, he2.3, 160
[36] icecube: 2011b, proc. icrc, he2.3, 204
[37] ishihara, a.: plenary talk at nu2012
[38] james, c. et al.: 2010, p.r. d 81, 042003
[39] katz, u. f., for the km3net collaboration: 2011, n.i.m. a 639, 50
[40] kravchenko, i. et al.: 2006, p.r. d 73, 082002
[41] learned, j. g.: 1979, p.r. d 19, 3293
[42] lehtinen, n. et al.: 2004, p.r. d 69, 013008
[43] mangano, s.: 2011, proc. icrc, he2.3, 118
[44] mannheim, k. et al.: 2000, p.r. d 63, 023003
[45] markov, m. a.: 1960, proc. internat. conf. on high energy physics, rochester (univ. of rochester/interscience, rochester, n.y.), p. 578
[46] martens, k.: 2007, arxiv:0707.4417v1
[47] matano, t. et al.: 1965, p.r.l. 15, 594
[48] matthews, j. n.: 2011, n.p. b (proc. suppl.) 212, 79
[49] münch, k. et al.: 2005, proc. icrc 5, 17
[50] protheroe, r., stanev, t.: 1998, p.r. d 59, 043504
[51] rhode, w. et al.: 1996, astrop. phys. 4, 217
[52] risse, m. et al.: 2012, proc. uhecr-2012 conf., cern (in print)
[53] ryabov, v. a. et al.: 2009, n.p. b (proc. suppl.) 196, 458
[54] scholten, o. et al.: 2009, p.r.l. 103, 191301
[55] scholten, o. et al.: 2011, proc. icrc, 0086
[56] singh, k. et al.: 2009, proc. icrc, 1077
[57] tomida, t. et al.: 2011, n.i.m. a 654, 653
[58] waxman, e., bahcall, j.: 1998, p.r. d 59, 023002
[59] zatsepin, g. t., kuzmin, v. a.: 1966, jetp lett. 4, 78

stationary and dynamical solutions of the gross-pitaevskii equation for a bose-einstein condensate in a pt symmetric double well

holger cartarius∗, dennis dast, daniel haag, günter wunner, rüdiger eichler, jörg main

institut für theoretische physik 1, universität stuttgart, pfaffenwaldring 57, 70550 stuttgart, germany
∗ corresponding author: holger.cartarius@itp1.uni-stuttgart.de

abstract. we investigate the gross-pitaevskii equation for a bose-einstein condensate in a pt symmetric double-well potential by means of the time-dependent variational principle and numerically exact solutions. a one-dimensional and a fully three-dimensional setup are used. stationary states are determined and the propagation of the wave function is investigated using the time-dependent gross-pitaevskii equation. due to the nonlinearity of the gross-pitaevskii equation the potential depends on the wave function, and its solutions decide whether or not the hamiltonian itself is pt symmetric. stationary solutions with real energy eigenvalues fulfilling exact pt symmetry are found, as well as pt broken eigenstates with complex energies. the latter describe decaying or growing probability amplitudes and are not true stationary solutions of the time-dependent gross-pitaevskii equation. however, they still provide qualitative information about the time evolution of the wave functions.

keywords: bose-einstein condensates, pt symmetry, gross-pitaevskii equation, stationary states, dynamics.

1. introduction

bose-einstein condensates can, at extremely low temperatures, be described by the gross-pitaevskii equation [1, 2], which reads in a particle number scaled form and in appropriate units
$$i\dot{\psi}(\mathbf{x},t) = \left(-\Delta + V(\mathbf{x}) - g|\psi(\mathbf{x},t)|^2\right)\psi(\mathbf{x},t). \tag{1}$$
this is the hartree approximation of the corresponding many-particle equation for the dilute atomic gas, where the assumption is used that all atoms are in the quantum mechanical ground state. an external potential, in which the atom cloud is trapped, is described by $V(\mathbf{x})$. additionally, the atoms interact via the short-range van der waals force. in the dilute gas it can be described to sufficient accuracy by an s-wave scattering process, which leads in the mean-field approximation to the nonlinear contribution $-g|\psi(\mathbf{x},t)|^2$. the strength g is determined by the s-wave scattering length.
using magnetic fields acting on the hyperfine levels of the atoms and exploiting feshbach resonances, the scattering length can be tuned, and g is a true parameter of the system which can be varied.

bose-einstein condensates are a promising candidate for a first experimental realization of a special class of non-hermitian hamiltonians in quantum mechanics, viz. systems which possess a pt symmetry in spite of their non-hermitian nature. as was shown first by bender and boettcher, these hamiltonians exhibit remarkable properties [3] such as stationary states with real eigenvalues in the presence of loss and gain terms in the potential. with the operators of spatial reflection, $\mathcal{P}: x \to -x,\ p \to -p$, and of time reversal, $\mathcal{T}: x \to x,\ p \to -p,\ i \to -i$, the pt symmetry of the hamiltonian can be expressed in terms of the commutator relation
$$[\mathcal{PT}, H] = 0. \tag{2}$$
since the kinetic energy term in the hamiltonian is always pt symmetric, one obtains in a simple calculation from (2) the necessary condition
$$V^*(-x) = V(x) \tag{3}$$
for the potential: applying the pt operator to $H = -\Delta + V(x)$ replaces the potential by $V^*(-x)$, so the commutator (2) can only vanish if (3) holds.

whereas the experimental realization of pt symmetry has successfully been achieved [4, 5] for optical wave guides [6–12], a verification in a genuine quantum system is still lacking. however, a proposal for a bose-einstein condensate with a pt symmetric external potential was given by klaiman et al. [11], who suggested a double-well setup where atoms are coherently incoupled in one well and outcoupled from the other. in this article we will solve the gross-pitaevskii equation of such a system and show that it indeed supports pt symmetric solutions. since the model describes a true quantum system, our investigations provide a good starting point for an experimental realization of this quantum system with pt symmetry.

however, in the case of the nonlinear gross-pitaevskii equation (1) one has to address one additional critical question. the wave function has an influence on the nonlinear hamiltonian's symmetry: along with the external potential, the interaction term $-g|\psi(x,t)|^2$ has to fulfill the condition (3). the hamiltonian is only pt symmetric if the square modulus of its solution $|\psi(x,t)|^2$ is a symmetric function of x. this is always fulfilled for pt symmetric wave functions. thus, one may state that the nonlinear hamiltonian of the gross-pitaevskii equation preserves its own symmetry in the case of exact pt symmetry. but do such pt symmetric solutions survive the nonlinearity? wave functions describing pt broken eigenstates usually do not possess a symmetric square modulus and also destroy the pt symmetry of the hamiltonian.

previous studies of pt symmetric systems with nonlinearity indicate that the nonlinearity does not destroy the relevant effects known from linear pt symmetric hamiltonians. an investigation of a bose-einstein condensate in a two-mode approximation using a non-hermitian bose-hubbard model was made by graefe et al. [13–15]. furthermore, the combination of pt symmetry and nonlinearity has been studied in quantum mechanical model potentials [10, 16], in optics [17], and for a bose-einstein condensate with an idealized double-δ trap [18], a system whose linear variant already attracted much interest [19–23].

to investigate bose-einstein condensates in a pt symmetric double well, we first introduce our numerical approach in section 2. then we present and discuss the numerical results for the energy eigenvalues, the stationary wave functions and the time evolution of non-stationary initial wave packets in section 3. conclusions are drawn in section 4.
2. numerical approach to bose-einstein condensates in a pt symmetric double well

2.1. gross-pitaevskii equation

we consider a bose-einstein condensate of particles with mass m in a double-well setup described by the external potential
$$V(\mathbf{x}) = \frac{m}{2}\omega_x^2 x^2 + \frac{m}{2}\omega_{y,z}^2 (y^2 + z^2) + V_0 e^{-\sigma x^2} + i\gamma x e^{-\rho x^2}, \tag{4}$$
where a three-dimensional harmonic trap is superimposed by a gaussian barrier in the x direction (cf. figure 1). the trapping frequencies are $\omega_x$ for the direction of the double-well structure and $\omega_{y,z}$ for the two remaining directions. the barrier has its maximum at x = 0 and the height $V_0$, and its width is determined by σ. an outcoupling (incoupling) of atoms is reflected by a negative (positive) imaginary potential contribution in the left (right) well. its strength is determined by the gain-loss parameter γ. since it affects the probability amplitude of the whole condensate, the physical interpretation is a coherent out-/incoupling. with this ansatz we are not considering individual atoms but the macroscopic wave function of the condensed phase.

figure 1. pt symmetric external potential. the real part (solid line) defines the confinement of the condensed atom cloud and the imaginary part (dashed line) describes the gain-loss contributions due to the coherent in- and outcoupling of atoms.

the potential (4) has been chosen such that it fulfills the condition (3), i.e. its real part is a symmetric function of x, while its imaginary part is antisymmetric. thus, the linear external potential (4) is pt symmetric. introducing the length scale $a_0 = \sqrt{\hbar/m\omega_{y,z}}$ defined by the trap frequency in the directions perpendicular to the double-well shape, and the unit of energy $E_0 = \hbar^2/2ma_0^2$, the dimensionless potential reads
$$V(\mathbf{x}) = \omega_x^2 x^2 + y^2 + z^2 + V_0 e^{-\sigma x^2} + i\gamma x e^{-\rho x^2}, \tag{5}$$
and the evolution of the condensate is described by the gross-pitaevskii equation (1). with the chemical potential µ and the usual separation ansatz $\psi(\mathbf{x},t) = \phi(\mathbf{x})e^{-i\mu t}$, one obtains in the units defined above the time-independent gross-pitaevskii equation
$$\left(-\Delta + V(\mathbf{x}) - g|\phi(\mathbf{x})|^2\right)\phi(\mathbf{x}) = \mu\phi(\mathbf{x}). \tag{6}$$
in our calculations we use the potential parameters $\omega_x = 0.5$, $V_0 = 4$, and $\sigma = 0.5$. the width parameter
$$\rho = \frac{\sigma}{2\ln(V_0\sigma/\omega_x^2)} \tag{7}$$
of the gain-loss potential is chosen such that the extrema of the real and imaginary potential parts coincide, cf. figure 1. if only the x direction is considered and all y and z terms are removed from the potential and the wave function, we can reduce the model to one dimension that contains the relevant pt symmetric information.
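a minimal numerical sketch of the one-dimensional potential may be helpful here (our own illustration; it assumes the reading of eq. (7) as $\rho = \sigma/(2\ln(V_0\sigma/\omega_x^2))$, which indeed makes the extrema of the real and imaginary parts coincide):

```python
import numpy as np

# sketch of the 1d version of potential (5) with the parameters from the text
# (omega_x = 0.5, v0 = 4, sigma = 0.5); gamma = 0.03 is one of the values used
# later in the paper. rho follows eq. (7) as reconstructed above.
omega_x, v0, sigma, gamma = 0.5, 4.0, 0.5, 0.03
rho = sigma / (2.0 * np.log(v0 * sigma / omega_x ** 2))

def v(x):
    """complex 1d double-well potential: real trap plus imaginary gain-loss."""
    return (omega_x ** 2 * x ** 2 + v0 * np.exp(-sigma * x ** 2)
            + 1j * gamma * x * np.exp(-rho * x ** 2))

# the minima of re(v) sit at x^2 = ln(v0*sigma/omega_x^2)/sigma, the extrema
# of im(v) at x^2 = 1/(2*rho); with rho as above both coincide.
x_re = np.sqrt(np.log(v0 * sigma / omega_x ** 2) / sigma)
x_im = np.sqrt(1.0 / (2.0 * rho))
print(x_re, x_im)   # both ~2.04, so the choice (7) does what the text claims
```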
2.2. two methods: variational gaussian and numerically exact

we use two methods to solve the time-dependent and time-independent gross-pitaevskii equations (1) and (6). the gaussian variational method has been shown to provide highly precise solutions with low numerical effort [24–26]. since in our work it is applied for the first time to bose-einstein condensates in a pt symmetric complex potential, we compare it to numerically exact solutions of the gross-pitaevskii equation in the one-dimensional case.

the idea of the gaussian variational method consists of the restriction to a gaussian shaped wave function
$$\psi(z,\mathbf{x}) = \sum_{k=1}^{2} e^{-\left[a_x^k (x-q_x^k)^2 + a_{y,z}^k (y^2+z^2)\right]}\, e^{i p_x^k (x-q_x^k) - \varphi^k} \tag{8}$$
described by a small set of variational parameters, viz.
$$z(t) = \left\{ a_x^k(t),\, a_{y,z}^k(t),\, q_x^k(t),\, p_x^k(t),\, \varphi^k \right\}. \tag{9}$$
in the case of the pt symmetric double-well setup it is reasonable to start with two gaussian wave functions, each of them located in one of the wells. the widths of the gaussians are determined by the complex parameters $a_x^1$, $a_x^2$, $a_{y,z}^1$ and $a_{y,z}^2$. since the relevant dynamics only affects the x coordinate and the trap was assumed to be symmetric in the y and z directions, we include the same symmetry in our ansatz for the gaussian wave functions. the positions and the corresponding momenta of both gaussians are determined by the real coordinates $q_x^1$, $q_x^2$, $p_x^1$ and $p_x^2$. the amplitudes and phases are introduced via the complex variables $\varphi^1$ and $\varphi^2$. this leads in total to 16 real parameters completely defining the condensate wave function. reducing the model to one dimension, we end up with twelve real variables.

certainly, the ansatz (8) cannot solve the time-dependent gross-pitaevskii equation (1) exactly. one way to find the "best" approximative solution with a wave function restricted to the gaussian form (8) is the application of the mclachlan time-dependent variational principle [27],
$$\delta I = \delta \left\| i\chi(z(t),\mathbf{x}) - H\psi(z(t),\mathbf{x}) \right\|^2 \overset{!}{=} 0. \tag{10}$$
in this procedure the variation with respect to the parameters in the wave function χ is performed such that the functional I is minimized. in a second step one sets $\dot{\psi} \equiv \chi$ and obtains the equations of motion, which in our case are of the form
$$\dot{a}_x^k = -4i\,(a_x^k)^2 + iV_{2;x}^k, \tag{11a}$$
$$\dot{a}_{y,z}^k = -4i\,(a_{y,z}^k)^2 + iV_{2;y,z}^k, \tag{11b}$$
$$\dot{q}_x^k = 2p_x^k + s_x^k, \tag{11c}$$
$$\dot{p}_x^k = -\operatorname{Re} V_{1;x}^k - 2\operatorname{Im}(a_x^k)\, s_x^k - 2\operatorname{Re}(V_{2;x}^k)\, q_x^k, \tag{11d}$$
$$\dot{\varphi}^k = iV_0^k + 2i\left(a_x^k + a_{y,z}^k\right) - i(p_x^k)^2 - i p_x^k s_x^k + i q_x^k V_{1;x}^k + i q_x^k V_{2;x}^k q_x^k, \tag{11e}$$
with
$$s_x^k = \frac{1}{2}\left(\operatorname{Re} a_x^k\right)^{-1}\left(\operatorname{Im} V_{1;x}^k + 2\operatorname{Im}(V_{2;x}^k)\, q_x^k\right). \tag{11f}$$
the equations of motion (11a)–(11f) contain effective potential terms $V = (V_0^1,\ldots,V_{1;x}^1,\ldots,V_{2;x}^1,\ldots)$, which are obtained from a system of linear equations, $KV = r$, where the matrix K contains weighted overlap integrals of the gaussians and the vector r consists of weighted gaussian averages of all potential terms including the nonlinearity. the detailed form can be found in [28].

the dynamics of the condensate wave function is found by solving the ordinary differential equations (11a)–(11f) with a runge-kutta algorithm [29]. stationary states, i.e. solutions of the time-independent gross-pitaevskii equation, are found by the requirements $\dot{a}_x^k = \dot{a}_{y,z}^k = \dot{q}_x^k = \dot{p}_x^k = 0$ (12 conditions for real numbers) and $\dot{\varphi}^1 = \dot{\varphi}^2$ (2 conditions). these states have to be normalized, $\|\psi\| = 1$, which is important in the nonlinear gross-pitaevskii equation and adds an additional constraint. due to the arbitrary global phase, one of the 16 gaussian parameters introduced above is free, and 15 parameters must be varied to fulfill the 15 conditions. this is done with a 15-dimensional root search applying a powell hybrid method [29]. if we consider a one-dimensional condensate, the root search reduces to 11 conditions, which have to be fulfilled by 11 appropriately chosen parameters. this small difference exemplifies the high scalability of the variational gaussian method: an increase of the dimension or the flexibility of the wave function leads only to a moderate increase of the numerical effort.

the numerically exact integrations of the gross-pitaevskii equations (1) and (6) presented in this paper are carried out for the one-dimensional setup, where the computational costs are reasonable. the stationary wave functions are integrated outward from x = 0 in positive and negative direction using a runge-kutta algorithm with initial values $\operatorname{Re}\psi(0)$, $\psi'(0) \in \mathbb{C}$, and $\mu \in \mathbb{C}$. they have to be chosen such that the five conditions $\psi(\infty) \to 0$, $\psi(-\infty) \to 0$, and $\|\psi\| = 1$ are satisfied. these conditions define square-integrable and normalized wave functions, i.e. the stationary states we are interested in. note that the arbitrary global phase has been exploited by setting $\operatorname{Im}\psi(0) = 0$ for the initial values of the integration. for dynamical calculations the split operator method is used, as explained in [28].
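for illustration, a minimal split-operator propagator for the one-dimensional version of eq. (1) can be sketched as follows (grid, time step and strang splitting are our own choices; the actual implementation is described in [28]):

```python
import numpy as np

# minimal split-step fourier propagation of the 1d gross-pitaevskii equation
# (1) with a complex potential; a sketch, not the implementation of [28].
n, length, dt, g = 1024, 40.0, 1e-3, 0.2
x = np.linspace(-length / 2, length / 2, n, endpoint=False)
k = 2.0 * np.pi * np.fft.fftfreq(n, d=length / n)

def split_step(psi, v_of_x):
    """one time step dt for i d/dt psi = (-d^2/dx^2 + v - g|psi|^2) psi."""
    # half step with the (complex) potential and the nonlinearity
    psi = psi * np.exp(-1j * (v_of_x - g * np.abs(psi) ** 2) * dt / 2)
    # full kinetic step in momentum space (-laplacian -> k^2)
    psi = np.fft.ifft(np.exp(-1j * k ** 2 * dt) * np.fft.fft(psi))
    # second half step with the updated density
    psi = psi * np.exp(-1j * (v_of_x - g * np.abs(psi) ** 2) * dt / 2)
    return psi
```

with a complex potential the norm of the wave function is not conserved by this propagation, which is precisely the gain-loss effect discussed in section 3.3 below.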
3. numerical solutions

3.1. energy eigenvalues

figure 2 shows a typical example of the eigenvalues of the time-independent gross-pitaevskii equation (6).

figure 2. eigenvalues of the time-independent gross-pitaevskii equation (6) for different values of the nonlinearity g (g = 0, 0.1, 0.2, 0.3) in dependence on the gain-loss parameter γ: (a) real and (b) imaginary parts of µ. with increasing g the real part of the energies decreases. the gaussian approximation (solid lines) and the numerically exact solutions (dashed lines) are shown; vanishing imaginary parts are not plotted. two solutions with real eigenvalues are obtained up to a value γep ≈ 0.04, where they merge in an exceptional point. two additional solutions with complex eigenvalues are obtained, starting at a critical value γc, where γc < γep for g ≠ 0.

first, one observes that the linear system (g = 0) reveals the known features of complex hamiltonians with pt symmetry. two real eigenvalue solutions are found below a value γep, where they merge in an exceptional point. for larger values of γ, two complex and mutually complex conjugate solutions are found. the agreement between the gaussian approximation (solid lines) and the numerically exact solution (dashed lines) is excellent.

an important result is the persistence of real eigenvalue solutions for nonvanishing values of the gain-loss parameter γ in the case g ≠ 0. this demonstrates that the gross-pitaevskii equation with a nonlinearity in the potential still supports real eigenvalue solutions, i.e. non-decaying states. there are, however, a few crucial differences between the linear and the nonlinear system. the complex eigenvalue solutions are now born at a value γc < γep, whereas at the exceptional point γep only the real eigenvalue states vanish and no new complex solutions appear. the bifurcation scenario of the linear system seems to be split between the emergence of complex eigenvalue solutions at γc and the disappearance of the real eigenvalue states at γep ≠ γc. it is possible to explain these unusual characteristics by the non-analyticity of the gross-pitaevskii equation [28]. for sufficiently high values of g the complex eigenvalues exist down to γ → 0 and only lose their imaginary contribution for γ = 0.

the one-dimensional model already contains the whole potential shape that is important for the analysis of a pt symmetric condensate.
thus, one can expect that it includes all important features of the eigenvalue structure. this is confirmed by the comparison of the eigenvalues obtained with the one- (dashed) and three-dimensional (solid) potentials in figure 3.

figure 3. real (a) and nonvanishing imaginary (b) parts of the energy eigenvalues of the time-independent gross-pitaevskii equation in the one- (dashed lines) and three-dimensional (solid lines) models, for $g_{3d} = 0$, $0.2\pi$, $0.4\pi$, $0.6\pi$. the comparison shows that, by applying an appropriate rescaling of the nonlinearity parameter g and an energy shift for the energy contributions from the two additional directions, the one-dimensional model already provides a very good quantitative description of the fully three-dimensional setup. it is almost impossible to identify the differences in the graph.

the three-dimensional setup contains two additional directions in which an external harmonic oscillator potential is present. this leads to the assumption that, in the ground state, the energy is shifted by a value of ∆µ = 2 in the units introduced above. furthermore, the spatial extension in the y and z directions leads to additional energy contributions from the contact interaction. the difference can be estimated from the expectation value of the contact energy. we demand that the expectation values of both models are identical,
$$\int_{\mathbb{R}^3} dx\,dy\,dz\; g_{3d}\,|\psi_{3d}(\mathbf{x})|^4 \;\overset{!}{=}\; \int_{\mathbb{R}} dx\; g_{1d}\,|\psi_{1d}(x)|^4, \tag{12}$$
and obtain the condition
$$g_{3d} = 2\pi\, g_{1d} \tag{13}$$
for a rescaled $g_{1d}$ of the one-dimensional model which leads to the same contact energy as a three-dimensional wave function with $g_{3d}$. in this calculation we used a product ansatz $\psi_{3d}(\mathbf{x}) \approx \psi_{1d}(x)\psi_0(y)\psi_0(z)$ for the wave function, where $\psi_0$ describes the harmonic oscillator ground state [28]. this assumption is only correct in the linear case without contact interaction. the remarkable agreement of both calculations in figure 3 shows, however, that it still approximates the nonlinear case very well, even for considerable nonlinearities $g_{1d} \approx 0.3$. due to the excellent quantitative agreement between the one- and three-dimensional calculations, we will only consider the one-dimensional variant in the following sections.
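the factor 2π in (13) can be made explicit as a worked step: in the dimensionless units introduced in section 2.1, the transverse harmonic oscillator ground state is $\psi_0(y) = \pi^{-1/4} e^{-y^2/2}$ with energy 1 per direction (hence ∆µ = 2), and
$$\int_{\mathbb{R}} |\psi_0(y)|^4\,dy = \frac{1}{\pi}\int_{\mathbb{R}} e^{-2y^2}\,dy = \frac{1}{\sqrt{2\pi}},$$
so that
$$\int_{\mathbb{R}^3} |\psi_{1d}(x)\psi_0(y)\psi_0(z)|^4\,dx\,dy\,dz = \frac{1}{2\pi}\int_{\mathbb{R}} |\psi_{1d}(x)|^4\,dx \quad\Rightarrow\quad g_{3d} = 2\pi\,g_{1d}.$$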
3.2. wave functions

we have already seen that real eigenvalue solutions do appear in the nonlinear gross-pitaevskii equation. but we still do not know whether we have found truly pt symmetric solutions. as explained in the introduction, the hamiltonian is only pt symmetric if the square modulus of the eigenstate is a symmetric function of x. this question will be answered in this section. figure 4 shows the wave functions belonging to the real eigenvalues for g = 0.2 and γ = 0.03.

figure 4. wave functions of the ground (a) and excited (b) eigenstates with real eigenvalues in the case g = 0.2 and γ = 0.03. only the gaussian solutions are shown, since they are almost identical with the numerically exact values. the square moduli of both wave functions are symmetric functions of x, preserving the pt symmetry of the nonlinear hamiltonian.

obviously, the square modulus of both wave functions is a symmetric function of x. this preserves the hamiltonian's pt symmetry; it is not destroyed by the nonlinearity. in other words, the nonlinear hamiltonian picks as eigenstates wave functions which render the hamiltonian itself pt symmetric.

the situation is, however, completely different for the wave functions of the states with complex eigenvalues. in linear pt symmetric models the complex eigenvalue solutions belong to the pt broken case, in which the wave functions do not reflect the pt symmetry of the hamiltonian. this is also the case for the solutions of the gross-pitaevskii equation, as can be seen in figure 5.

figure 5. wave functions of the complex eigenvalue solutions with negative (a) and positive (b) imaginary part in the case g = 0.2 and γ = 0.03. again only the gaussian wave functions are shown. the square moduli are not symmetric functions of x and destroy the hamiltonian's pt symmetry.

the wave functions of the solutions with complex eigenvalues are not pt symmetric. their square moduli are not symmetric functions of x, and thus even the pt symmetry of the hamiltonian is destroyed, a circumstance which is not possible in a linear quantum system. the wave function belonging to the eigenvalue with negative imaginary part has a higher amplitude in the left well with loss, whereas the probability amplitude of the state with positive imaginary part of the energy is shifted more to the right well with gain.

a pt symmetric bose-einstein condensate will only be observable if the eigenstates are stable with respect to quantum fluctuations. we performed a linear stability analysis to check the behavior of the eigenstates with real eigenvalues. a linearization of the equations of motion (11a)–(11f) of the gaussian parameters around the stationary states, and a solution of the bogoliubov-de gennes equations for the numerically exact wave functions, have been carried out [28]. we found that the excited state is stable for all combinations of g and γ as long as it exists. the ground state, however, becomes unstable as soon as the pt broken branches with complex eigenvalues emerge in the energy diagram of figure 2.

3.3. temporal evolution and the significance of "stationary" solutions with complex eigenvalues

we first want to investigate the temporal evolution of condensate wave functions close to the stationary real eigenvalue solutions, for values of γ below the appearance of the pt broken states. two examples of numerically exact propagations using the split operator method are given in figure 6 for the initial wave packet
$$\psi(x,t=0) = \frac{1}{\sqrt{2}}\left(\phi_{gs}(x) + e^{i\varphi}\phi_{es}(x)\right) \tag{14}$$
with the ground state $\phi_{gs}(x)$ and the excited state $\phi_{es}(x)$.

figure 6. evolution of the wave packet (14) in the pt symmetric double well for nonvanishing nonlinearity g = 0.2, ϕ = π/2, and the gain-loss contributions γ = 0 (a) and γ = 0.02 (b).

in this case one expects for linear systems that an oscillation of the probability amplitude between the two wells sets in, whose frequency decreases with increasing γ until the oscillation stops at γ = γep [11].
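the decrease of the frequency has a simple origin, which we add here for clarity: for two stationary states with real chemical potentials, the density of the superposition (14) oscillates as a beat,
$$|\psi(x,t)|^2 = \tfrac{1}{2}\Big[|\phi_{gs}|^2 + |\phi_{es}|^2 + 2\operatorname{Re}\big(\phi_{gs}\phi_{es}^*\, e^{i[(\mu_{es}-\mu_{gs})t - \varphi]}\big)\Big], \qquad T = \frac{2\pi}{\mu_{es}-\mu_{gs}},$$
and since the two real eigenvalues merge at the exceptional point, $\mu_{es} - \mu_{gs} \to 0$ and the oscillation period diverges for γ → γep.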
this behavior is reproduced in figure 6, i.e. the qualitative pattern of the motion is conserved in the nonlinear gross-pitaevskii equation (1). a quantitative analysis shows that the nonlinearity leads to slightly higher oscillation frequencies.

however, with the emergence of the additional pt broken states we observe a qualitative change between the linear and nonlinear systems. as was mentioned above, the ground state becomes unstable in this regime. the relevance of this unstable character can be seen in figure 7a, where the initial wave packet (14) is evolved for g = 0.2, γ = 0.03, and the phase ϕ = π. already during the first oscillation a complete deformation of the typical pattern is observed. then the oscillation stops and changes into an unrestricted growth of the probability amplitude in the right well; the wave function "explodes".

figure 7. evolution of the wave packet (14) in the pt symmetric double well for nonvanishing nonlinearity g = 0.2 and the gain-loss contributions γ = 0.03 with phase ϕ = π (a) and γ = 0.04 with ϕ = π/2 (b).

this behavior is not surprising. one of the superimposed states in (14) is unstable, and the two pt broken solutions with complex eigenvalues exist. one can expect that, in the nonlinear system, there is a considerable overlap of the time evolved wave function (14) with the pt broken eigenstates, cf. also figures 4 and 5. then the eigenstate with positive imaginary part of the energy will always dominate for long times, since it increases and determines the further evolution. in a realistic situation an infinite growth of the probability amplitude will not appear. it has its origin in the pt symmetric potential (4) and requires an infinite reservoir of atoms; at some point in time this description will break down.

the instability does not necessarily lead immediately to a destruction of the original oscillating behavior, as is shown for the example with g = 0.2 and γ = 0.04 in figure 7b, i.e. very close to γep. even for a gain-loss influence larger than that chosen for figure 7a, it was possible to find a stable oscillation for the phase ϕ = π/2. the behavior known from linear systems re-emerges: the probability amplitude pulsates in both wells with a low frequency.

from the dynamical point of view, the solutions of the time-independent gross-pitaevskii equation (6) with complex eigenvalues have to be considered with some care. strictly speaking, they lose their physical relevance. due to the decay or growth of the probability amplitude mentioned above, the nonlinear term $-g|\psi|^2$ in the hamiltonian becomes explicitly time dependent. thus, the states are not true stationary solutions of the time-dependent gross-pitaevskii equation (1). they are, however, still useful to indicate the temporal evolution of the condensate. for example, it was possible above to explain the explosion of the condensate wave function in figure 7a by the presence of the growing pt broken state. furthermore, it is possible to show that the complex eigenvalue solutions can still indicate the actual temporal evolution of an initial wave packet.

figure 8. temporal evolution of the norm $N^2$ of an initial complex eigenvalue state with im µ > 0, compared with the exponential prediction from the complex chemical potential. the parameters are g = 0.2 and γ = 0.03.
in particular, for small times the prediction from the "stationary" complex eigenvalue solutions for the norm $N^2 = \int |\psi|^2\,dx$ describes the true onset of the growth very well, as can be seen for an initial state with im µ > 0 in figure 8.

to investigate the correspondence of the complex eigenvalue solutions with the exact time integration for longer times, we introduce the difference
$$d = \frac{1}{N}\left(\sqrt{\int_{\text{right well}} |\psi|^2\,dx} - \sqrt{\int_{\text{left well}} |\psi|^2\,dx}\right) \tag{15}$$
of the wave function's norm in the right and left wells. it tells us how the probability density is distributed between the two wells. a positive value signals a higher probability density in the right well with gain, whereas a negative value indicates a concentration of the condensate's probability density in the left well with loss.

in figure 9 we compare the norm difference d of the correct temporal evolution with that of the complex eigenvalue solution with positive imaginary part, i.e. the growing state for the same parameters. since the norm of the time evolved wave function changes due to the gain-loss part of the potential, we have to adapt the effective g in the stationary gross-pitaevskii equation (6),
$$g \to g N^2, \tag{16}$$
such that the two norm differences are comparable.

figure 9. comparison of the norm difference d of the correct temporal evolution with that of complex eigenvalue solutions. in (a) the same situation as in figure 7 is depicted, whereas (b) shows the evolution of the complex eigenvalue state with positive imaginary part. in both cases the norm difference follows the growing complex eigenvalue state for long times. γ was chosen to be 0.03 and the initial g was 0.2.

in figure 9a we plot d for the situation depicted in figure 7. with increasing time the state first decays and the norm drops. at low values of the norm the effective g (16) assumes values at which only the true stationary states with real eigenvalues and d = 0 exist; the latter property can be seen in the figure. however, at t ≈ 12 the overlap with the growing eigenstate wins and the norm starts to grow. after some oscillations of d, one observes for times t > 120 that the norm differences of the two calculations agree more and more. this indicates that the wave function, initially prepared in a superposition of two eigenstates, evolves into the shape of the growing eigenstate for long times. that is, the complex eigenvalue solution with positive imaginary part still has a meaning for long times, although it is not a true stationary solution of the time-dependent gross-pitaevskii equation (1). a similar behavior can be observed in figure 9b, where the initial state was the growing complex eigenvalue solution of (6). since we start with the increasing norm solution, the agreement between both calculations sets in earlier.
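a small helper evaluating $N^2$ and the norm difference (15) on a grid may clarify the quantities used above (our own discretization; the paper does not specify one):

```python
import numpy as np

# evaluates the squared norm n^2 and the norm difference d of eq. (15) for a
# wave function psi sampled on a grid x, e.g. the grid of the propagation
# sketch in section 2.2; "right/left well" is read as x > 0 / x < 0.
def norm_and_d(psi, x):
    dx = x[1] - x[0]
    dens = np.abs(psi) ** 2
    n2 = np.sum(dens) * dx                  # n^2 = int |psi|^2 dx
    right = np.sum(dens[x > 0]) * dx        # probability in the gain well
    left = np.sum(dens[x < 0]) * dx         # probability in the loss well
    d = (np.sqrt(right) - np.sqrt(left)) / np.sqrt(n2)
    return n2, d
```

when comparing with the "stationary" solutions, the effective nonlinearity is then rescaled as g → g N², cf. (16).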
a fully three-dimensional calculation of the condensate is possible, but is not necessary to describe the effects appearing due to the non-hermitian gain-loss potential. it is even possible to extract quantitative values from a one-dimensional description of the condensate containing only the relevant direction, in which the pt symmetric potential acts. solutions of the time-independent gross-pitaevskii equation with complex energy eigenvalues are found as well, and belong to eigenstates with broken pt symmetry, destroying the hamiltonian's symmetry. they have no direct physical meaning, since they are not true stationary states of the time-dependent gross-pitaevskii equation due to their complex energy eigenvalues indicating a growth or decay of the probability amplitude. however, one can observe that they influence the ground state. for nonvanishing nonlinearity g the complex eigenvalue solutions bifurcate from the ground state when a critical value of the gain-loss parameter γ is exceeded. at this point the ground state becomes unstable, whereas the excited real eigenvalue solution is not affected by the appearance of the new states and stays stable as long as it exists.

the time evolution of the condensate showed that the eigensolutions of the time-independent gross-pitaevskii equation with complex eigenvalues may help to estimate the true temporal behavior of a condensate wave function. for small times, the imaginary part of the energy correctly describes the onset of the wave function's growth or decay. for large times, all initial wave functions tend to the state with positive imaginary part of the energy, which is located predominantly in the well with gain.

certainly, there are a number of questions which still have to be answered. a better understanding of the nonlinearity's influence on the solutions is important. analytically solvable matrix models might help to get a better insight. furthermore, the relation of the three- and one-dimensional results should be investigated for higher values of the nonlinearity g. a full analytical continuation of the non-analytical gross-pitaevskii equation should help to understand the change in the number of solutions observed in the eigenvalue diagrams. in a proper extension, the critical values of γ at which solutions appear or vanish should turn out to be bifurcation points. it would be desirable to understand how a coherent in- or outcoupling of atoms can be described on a microscopic level. this will be important for a realistic description of experimental situations. it is also possible to extend the model such that it can be understood as an embedding in a chain of potential wells. here the gain-loss contributions can result from an effective description of a transport effect when only two wells somewhere in the middle of the chain are taken into account. furthermore, it would be interesting to see how a gain-loss potential interacts with long-range inter-atomic interactions such as the dipole-dipole interaction, which often leads to qualitatively new effects [30].

references

[1] gross, e. p. structure of a quantized vortex in boson systems. nuovo cimento 20, 454 (1961).
[2] pitaevskii, l. p. vortex lines in an imperfect bose gas. sov. phys. jetp 13, 451 (1961).
[3] bender, c. m. and boettcher, s. real spectra in non-hermitian hamiltonians having pt symmetry. phys. rev. lett. 80, 5243 (1998).
[4] guo, a., et al. observation of pt-symmetry breaking in complex optical potentials. phys. rev. lett. 103, 093902 (2009).
[5] rüter, c. e., et al. observation of parity-time symmetry in optics. nat. phys. 6, 192 (2010).
[6] makris, k. g., et al. beam dynamics in pt symmetric optical lattices. phys. rev. lett. 100, 103904 (2008).
[7] makris, k. g., et al. pt-symmetric optical lattices. phys. rev. a 81, 063807 (2010).
[8] ruschhaupt, a., et al. physical realization of pt-symmetric potential scattering in a planar slab waveguide. j. phys. a 38, l171 (2005).
[9] el-ganainy, r., et al. theory of coupled optical pt-symmetric structures. opt. lett. 32, 2632 (2007).
[10] musslimani, z., et al. optical solitons in pt periodic potentials. phys. rev. lett. 100, 030402 (2008).
[11] klaiman, s., et al. visualization of branch points in pt-symmetric waveguides. phys. rev. lett. 101, 080402 (2008).
[12] driben, r. and malomed, b. a. stability of solitons in parity-time-symmetric couplers. opt. lett. 36, 4323 (2011).
[13] graefe, e. m., et al. a non-hermitian pt symmetric bose-hubbard model: eigenvalue rings from unfolding higher-order exceptional points. j. phys. a 41, 255206 (2008).
[14] graefe, e. m., et al. mean-field dynamics of a non-hermitian bose-hubbard dimer. phys. rev. lett. 101, 150408 (2008).
[15] graefe, e.-m., et al. quantum-classical correspondence for a non-hermitian bose-hubbard dimer. phys. rev. a 82, 013629 (2010).
[16] musslimani, z. h., et al. analytical solutions to a class of nonlinear schrödinger equations with pt-like potentials. j. phys. a 41, 244019 (2008).
[17] ramezani, h., et al. unidirectional nonlinear pt-symmetric optical structures. phys. rev. a 82, 043803 (2010).
[18] cartarius, h. and wunner, g. model of a pt-symmetric bose-einstein condensate in a δ-function double-well potential. phys. rev. a 86, 013612 (2012).
[19] jakubský, v. and znojil, m. an explicitly solvable model of the spontaneous pt-symmetry breaking. czech. j. phys. 55, 1113 (2005).
[20] mostafazadeh, a. delta-function potential with a complex coupling. j. phys. a 39, 13495 (2006).
[21] mostafazadeh, a. and mehri-dehnavi, h. spectral singularities, biorthonormal systems and a two-parameter family of complex point interactions. j. phys. a 42, 125303 (2009).
[22] mehri-dehnavi, h., et al. application of pseudo-hermitian quantum mechanics to a complex scattering potential with point interactions. j. phys. a 43, 145301 (2010).
[23] jones, h. f. interface between hermitian and non-hermitian hamiltonians in a model calculation. phys. rev. d 78, 065032 (2008).
[24] rau, s., et al. pitchfork bifurcations in blood-cell-shaped dipolar bose-einstein condensates. phys. rev. a 81, 031605(r) (2010).
[25] rau, s., et al. variational methods with coupled gaussian functions for bose-einstein condensates with long-range interactions. i. general concept. phys. rev. a 82, 023610 (2010).
[26] rau, s., et al. variational methods with coupled gaussian functions for bose-einstein condensates with long-range interactions. ii. applications. phys. rev. a 82, 023611 (2010).
[27] mclachlan, a. d. a variational solution of the time-dependent schrödinger equation. mol. phys. 8, 39 (1964).
[28] dast, d., et al. a bose-einstein condensate in a pt symmetric double well. fortschr. phys. 61, 124 (2013).
[29] press, w. h., et al. numerical recipes in fortran 77. cambridge university press, new york, second edition (1992).
[30] lahaye, t., et al. the physics of dipolar bosonic quantum gases. rep. prog. phys. 72, 126401 (2009).
acta polytechnica vol. 41 no. 2/2001

determining the permeable efficiency of elements in transport networks

v. svoboda, d. šiktancová

abstract: the transport network is simulated by a directed graph. its edges are evaluated by length (in linear units or time units), by permeability and by the cost of driving through in a transport unit. its peaks (nodes) are evaluated in terms of permeability, the time of driving through the node in time units and the cost of driving a transport unit (set) through this node. for such a conception of the transport network a role of optimisation and disintegration of transport flow was formulated, defined by a number of transport units (transport sets). these units enter the network at the initial node and exit the network (or vanish at the defined node). the aim of optimization was to disintegrate the transport flow so that the permeability was not exceeded in any element of the network (edge, node), so that the relocation of the defined transport flow was completed in a prearranged time, and so that the cost of driving through the transport net between the entry and exit nodes was minimal.

keywords: the transport networks, elements in transport network, disintegration of transport flow in transport networks, permeability of elements in transport network, permeability of networks, determining of the permeability of edges, deterministic and stochastic work regimes, systems of queuing theory, a transcendental equation.

1 problem formulation

the optimization problem of the net was studied. in the solution it was necessary to define an evaluation of the net elements, i.e., an evaluation of the edges and nodes. the values of the net elements are defined:
• by the permeability of the edge or the node,
• by the cost of the transport unit or of the transport set driving through the element of the transport network,
• by a time value representing the transport unit or the transport set driving through an element of the transport network. for the edges of the transport network it is the duration of the journey between two nodes. for the nodes it is the time value of the entry to the node, the service of the transport unit or the transport set at the node, and the exit from the node.

the transport nets defined by the plane network chart (i.e., mainly nets for road transport, nets for railway transport and nets for systems of multimodal and combined transport as heterogeneous nets) have to be solved in a dual work mode:
• in a deterministic work mode,
• in a stochastic work mode.

in some cases it is also necessary to solve hybrid systems, for example if fixed lines are kept for transport services in personal transport and random lines for cargo transport. this refers to all the above-mentioned types of transport. taking into account the importance as well as the complexity of determining the first index of transport net element evaluation, this stage was dedicated to an investigation to determine the permeability of edges and nodes in both service regimes.

a major transport network with a substantial number of nodes and edges is assumed in the basic optimization role. therefore no simulation methods were applied, only analytic methods; simulative methods are much more time-consuming, and could hardly be accomplished for such a large complex. the use of graphical-analytical methods for deterministic work regimes seems to be sufficient for methods that use some form of elaboration of graphs or graphic work models for each element. for the graph edges we can use the forms of layout transport order. for the nodes we can use a graphical model of individual activities, the deterministically given duration of which is known.

in order to determine the permeability of the transport net element with a stochastic work regime, operations research methods were mainly used, in particular queuing theory for typical service systems and stock theory for systems where the transport sets are made up of transport elements. stock theory procedures were applied to create the sets and optimize the time for assembly. it should be pointed out that classical systems of queuing theory mainly solve the economic evaluation of a work system, i.e., the mean load of the system, the mean number of requirements refused by the system, the mean length of the queue and, finally, the mean time of the customer's downtime in the queue. it is, however, necessary to determine the permeability of the system for a given optimization function.

the speculation that the gaps between entries of subsequent requirements can take on values that are multiples of the time of service in the system forms the basis for the solution. for this reason, we can if necessary insert another service requirement into these gaps without it being rejected. it may be assumed that if we determine the number of gaps thus arising, added to the mean load of the system, it will also be possible to define the permeability of the whole system. however this assumption cannot be accomplished, because on the one hand the entry requirement does not need to be available at the stochastic entry while, on the other hand, at the moment of system occupation there can be several entry requirements that are rejected. it can be proved that if the average value of the entry flow of requirements to the system increases, the number of gaps into which it is possible to insert other services decreases. conversely, the probability of the amount of refused entries increases. the point at which the number of gaps into which another service can be inserted is equal to the number of refused requirements can be considered as the permeability of the system, because during every increase in average entry flow the number of refused requirements is greater than the number of those additionally inserted, and the system works with losses.

2 conclusion

the solution was accomplished for single-line systems with a limited queue and for a-line systems with rejection. since the solution is a transcendental equation soluble only by iteration steps, graphs were developed from which values of permeability can with sufficient accuracy be directly read off.
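the transcendental equation itself is not reproduced in the paper, but the balance condition of section 1 (insertable idle gaps equal to refused requirements) already produces one in the simplest toy setting. the following python sketch is entirely an illustrative model, not the authors' system: for a single-line m/m/1/1 loss process, counting an idle gap as insertable when it exceeds one mean service time, the balance condition reduces to a = exp(−a) for the offered load a = λ/µ, which is solved by fixed-point iteration.

```python
import math

# toy single-line loss model (m/m/1/1), an illustrative assumption:
# idle gaps ~ exp(lambda) alternate with busy periods ~ exp(mu), so
#   insertable gaps per unit time: cycle_rate * p(gap > 1/mu) = cycle_rate * exp(-a)
#   refused requirements per unit time: lambda * p(busy) = cycle_rate * a
# equating the two gives the transcendental equation a = exp(-a), a = lambda/mu.

def balance_offered_load(tol=1e-12, max_iter=200):
    """solve a = exp(-a) by fixed-point iteration (converges: |d/da e^-a| < 1)."""
    a = 0.5  # initial guess for the offered load
    for _ in range(max_iter):
        a_next = math.exp(-a)
        if abs(a_next - a) < tol:
            return a_next
        a = a_next
    raise RuntimeError("iteration did not converge")

print(balance_offered_load())  # ~0.567143: the permeability point of the toy model
```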
references

[1] jirava, p. et al.: models of the transport and operation of the transport processes. report j04/99:212600025, j04/00:212600025, praha, fd čvut, 1999–2000

prof. ing. vladimír svoboda, csc.
phone: +420 2 2435 9175
e-mail: svoboda@fd.cvut.cz

ing. denisa šiktancová
phone: +420 2 2435 9160
e-mail: xsiktancova@fd.cvut.cz

czech technical university in prague
faculty of transportation sciences
horská 3, 128 03 praha 2, czech republic
acta polytechnica 54(2):93–100, 2014. doi:10.14311/ap.2014.54.0093. © czech technical university in prague, 2014. available online at http://ojs.cvut.cz/ojs/index.php/ap

eigenvalue collision for pt-symmetric waveguide

denis borisov a,b
a institute of mathematics of ufa scientific center of ras, chernyshevskogo str. 112, 450008, ufa, russian federation
b bashkir state pedagogical university, october st. 3a, 450000, ufa, russian federation
correspondence: borisovdi@yandex.ru

abstract. we consider a model of a planar pt-symmetric waveguide and study the phenomenon of the eigenvalue collision under perturbation of the boundary conditions. this phenomenon was discovered numerically in previous works. the main result of this work is an analytic explanation of this phenomenon.

keywords: pt-symmetric operator, eigenvalues, perturbation, asymptotics.

1. introduction and main results

in this paper we study a problem in the theory of pt-symmetric operators, which have been studied rather intensively after the pioneering works [12–21]. our model is introduced as follows. let $x = (x_1,x_2)$ be cartesian coordinates in $\mathbb{R}^2$, let ω be the strip $\{x : -d < x_2 < d\}$, $d > 0$, and let $\alpha = \alpha(x_1)$ be a function in $W_\infty^1(\mathbb{R})$. we consider the operator $H_\alpha$ in $L_2(\Omega)$ acting as $H_\alpha u = -\Delta u$ on the functions $u \in W_2^2(\Omega)$ satisfying the non-hermitian boundary conditions
$$\left(\frac{\partial}{\partial x_2} + i\alpha\right)u = 0 \quad \text{on } \partial\Omega. \qquad (1.1)$$
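to get a concrete feel for such non-hermitian boundary conditions before the analysis, one can discretize the transverse problem. the sketch below is an illustration under the simplifying assumption of a constant α (in which case the x1 direction separates and the transverse operator is $-d^2/dx_2^2$ on $(-d,d)$ with $u'(\pm d) + i\alpha u(\pm d) = 0$); the discretization and all parameter values are ours, not the paper's.

```python
import numpy as np

# finite-difference sketch of the transverse problem for constant alpha:
# ghost-node elimination of u'(+-d) + i*alpha*u(+-d) = 0 gives a complex
# non-hermitian matrix; for small alpha its eigenvalues stay real, the
# hallmark of unbroken pt symmetry.

def transverse_eigenvalues(alpha, d=1.0, n=400):
    h = 2.0 * d / n
    m = np.zeros((n + 1, n + 1), dtype=complex)
    for j in range(1, n):                     # interior rows: -u'' by central differences
        m[j, j - 1] = m[j, j + 1] = -1.0 / h**2
        m[j, j] = 2.0 / h**2
    # bottom row (x2 = -d): ghost node from (u_1 - u_{-1})/(2h) + i*alpha*u_0 = 0
    m[0, 0] = (2.0 - 2.0j * alpha * h) / h**2
    m[0, 1] = -2.0 / h**2
    # top row (x2 = +d): ghost node from (u_{n+1} - u_{n-1})/(2h) + i*alpha*u_n = 0
    m[n, n] = (2.0 + 2.0j * alpha * h) / h**2
    m[n, n - 1] = -2.0 / h**2
    ev = np.linalg.eigvals(m)
    return ev[np.argsort(ev.real)][:5]

print(transverse_eigenvalues(0.3))  # lowest eigenvalues; imaginary parts ~ 0
```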
it was shown in [1] that this operator is m-sectorial, densely defined, and pt-symmetric, namely,
$$\mathcal{PT}H_\alpha = H_\alpha\mathcal{PT}, \qquad (1.2)$$
where $(\mathcal{P}u)(x) = u(x_1,-x_2)$, and $\mathcal{T}$ is the operator of complex conjugation, $\mathcal{T}u = \bar{u}$. it was also proven in [1] that
$$H_\alpha^* = H_{-\alpha}, \qquad H_\alpha^* = \mathcal{T}H_\alpha\mathcal{T} = \mathcal{P}H_\alpha\mathcal{P}. \qquad (1.3)$$

a non-trivial question related to hα is the behavior of its eigenvalues. when α(x1) is a small regular localized perturbation of a constant function, sufficient conditions were obtained in [1] for the existence and absence of isolated eigenvalues near the threshold of the essential spectrum. similar results for both regularly and singularly perturbed models were obtained in [2–6]. numerical experiments performed in [6, 7] provided a very non-trivial picture of the distribution of the eigenvalues. an interesting phenomenon discovered numerically in [6, 7] was the eigenvalue collision. namely, let t ∈ ℝ be a parameter; then as t increases, the operator $H_{t\alpha}$ can have two simple real isolated eigenvalues meeting at some point. two cases are then possible. in the first of them, these eigenvalues stay real as t increases and they just pass along the real line. in the second case, the eigenvalues become complex as t increases and they are located symmetrically w.r.t. the real axis. the present paper is devoted to an analytic study of this phenomenon.

suppose λ0 ∈ ℝ is an isolated eigenvalue of hα, ε is a small real parameter, and β ∈ $W_\infty^2(\mathbb{R})$ is some function. denote $\Gamma_\pm := \{x : x_2 = \pm d\}$. our first main result describes the case when λ0 is an eigenvalue of geometric multiplicity two.

theorem 1.1. assume λ0 ∈ ℝ is a double eigenvalue of hα and ψ±0 are the associated eigenfunctions satisfying
$$(\psi_0^\pm,\mathcal{T}\psi_0^\pm)_{L_2(\Omega)} = 1, \qquad (\psi_0^+,\mathcal{T}\psi_0^-)_{L_2(\Omega)} = 0. \qquad (1.4)$$
suppose also
$$(B_{11}-B_{22})^2 + 4B_{12}^2 \neq 0, \qquad (1.5)$$
$$B_{11} = i\int_{\Gamma_+}\beta(\psi_0^+)^2\,dx_1 - i\int_{\Gamma_-}\beta(\psi_0^+)^2\,dx_1, \quad B_{22} = i\int_{\Gamma_+}\beta(\psi_0^-)^2\,dx_1 - i\int_{\Gamma_-}\beta(\psi_0^-)^2\,dx_1, \quad B_{12} = i\int_{\Gamma_+}\beta\psi_0^+\psi_0^-\,dx_1 - i\int_{\Gamma_-}\beta\psi_0^+\psi_0^-\,dx_1. \qquad (1.6)$$
then for all sufficiently small ε the operator hα+εβ has two simple isolated eigenvalues λ±ε converging to λ0 as ε → 0. these eigenvalues are holomorphic w.r.t. ε and the first terms of their taylor series are
$$\lambda_\varepsilon^\pm = \lambda_0 + \varepsilon\lambda_1^\pm + O(\varepsilon^2), \qquad \lambda_1^\pm = \tfrac{1}{2}(B_{11}+B_{22}) \pm \tfrac{1}{2}\big((B_{11}-B_{22})^2 + 4B_{12}^2\big)^{1/2}. \qquad (1.7)$$

the second main result is devoted to the case when the geometric multiplicity of λ0 is one but the algebraic multiplicity is two.

theorem 1.2. let λ0 ∈ ℝ be a simple eigenvalue of hα and let ψ0 be the associated eigenfunction. assume that the equation
$$(H_\alpha-\lambda_0)\phi_0 = \psi_0 \qquad (1.8)$$
is solvable and there exists a solution satisfying
$$(\phi_0,\mathcal{T}\psi_0)_{L_2(\Omega)} \neq 0, \qquad (\phi_0,\psi_0)_{L_2(\Omega)} = 0. \qquad (1.9)$$
then the eigenfunction ψ0 can be chosen so that
$$(\phi_0,\mathcal{T}\psi_0)_{L_2(\Omega)} = 1, \qquad (\phi_0,\psi_0)_{L_2(\Omega)} = 0, \qquad (1.10)$$
$$\psi_0 = \mathcal{PT}\psi_0, \qquad \phi_0 = \mathcal{PT}\phi_0. \qquad (1.11)$$
suppose then that this eigenfunction obeys the inequality
$$\int_{\Gamma_+}\beta\,\mathrm{re}\,\psi_0\,\mathrm{im}\,\psi_0\,dx_1 \neq 0. \qquad (1.12)$$
then for all sufficiently small ε the operator hα+εβ has two simple isolated eigenvalues λ±ε converging to λ0 as ε → 0. these eigenvalues are real if
$$\varepsilon\int_{\Gamma_+}\beta\,\mathrm{re}\,\psi_0\,\mathrm{im}\,\psi_0\,dx_1 < 0 \qquad (1.13)$$
and are complex if
$$\varepsilon\int_{\Gamma_+}\beta\,\mathrm{re}\,\psi_0\,\mathrm{im}\,\psi_0\,dx_1 > 0. \qquad (1.14)$$
the eigenvalues λ±ε are holomorphic w.r.t. ε^{1/2} and the first terms of their taylor series read as
$$\lambda_\varepsilon^\pm = \lambda_0 + \varepsilon^{1/2}\lambda_{1/2}^\pm + O(\varepsilon), \qquad \lambda_{1/2}^\pm = \pm 2\left(-\int_{\Gamma_+}\beta\,\mathrm{re}\,\psi_0\,\mathrm{im}\,\psi_0\,dx_1\right)^{1/2}. \qquad (1.15)$$

let us discuss the results of these theorems. the typical situation of the eigenvalue collision is that two simple eigenvalues of hα+εβ converge to the same limiting eigenvalue λ0 of hα as ε → 0.
then it is a general fact from regular perturbation theory that the algebraic multiplicity of λ0 should be two. the above theorems address two possible situations. in the first of them the geometric multiplicity of λ0 is two, i.e., there exist two associated linearly independent eigenfunctions. as we see from theorem 1.1, in this situation the perturbed eigenvalues are holomorphic w.r.t. ε and their first terms in the taylor series are given by the second formula in (1.7). the numbers λ±1 are some fixed constants and they can be either complex or real. but an important issue is that here, when changing the sign of ε, the eigenvalues cannot bifurcate from the real line to the complex plane or vice versa. this fact is implied by the second formula in (1.7): namely, if λ±1 are complex numbers, then λ±ε are also complex for both ε < 0 and ε > 0. thus, in this case we do not face the above-mentioned phenomenon of the eigenvalue collision discovered numerically in [6], [7]. if λ±1 are real, then we need to calculate the next terms of their taylor series to see whether they are complex or real. once all the terms in the taylor series are real, we deal with two real eigenvalues which just pass one through the other, staying on the real line. nevertheless, in view of formulae (1.6) we believe that by choosing an appropriate β we can get almost any value for the quantity in (1.5). in the particularly interesting case β = α the author does not know a way of identifying the sign of $(B_{11}-B_{22})^2 + 4B_{12}^2$ or of proving the reality of the eigenvalues λ±ε.

theorem 1.2 treats the case when the geometric multiplicity of λ0 is one. then the taylor series for the perturbed eigenvalues are completely different from theorem 1.1, and here the expansions are made w.r.t. ε^{1/2}. the presence of this power perfectly explains the studied phenomenon: once ε is positive, the same is true for ε^{1/2}, while for negative ε the square root ε^{1/2} is purely imaginary. this is exactly what is needed: once ε changes sign, real eigenvalues become complex and vice versa. unfortunately, we cannot even analytically prove the existence of such eigenvalues for our model. we can just state that once λ0 has geometric multiplicity one and the associated eigenfunction ψ0 satisfies the identity $(\psi_0,\mathcal{T}\psi_0)_{L_2(\Omega)} = 0$, then equation (1.8) is solvable (see lemma 2.1). the numerical results in [6], [7] show that this is quite a typical situation. our next main result provides another criterion identifying the solvability of equation (1.8).

theorem 1.3. suppose λ0 is a simple eigenvalue of hα and the associated eigenfunction satisfies the estimate
$$\sum_{\gamma\in\mathbb{Z}_+^2,\ |\gamma|\le 2}\left|\frac{\partial^\gamma\psi_0}{\partial x^\gamma}(x)\right| \le \frac{C}{1+|x_1|^3}, \qquad x\in\Omega. \qquad (1.16)$$
then equation (1.8) is solvable if and only if
$$\int_{\mathbb{R}^2}K(x_1,y_1)\big(\alpha(x_1)-\alpha(y_1)\big)\,\mathrm{re}\,\psi_0(x_1,d)\,\mathrm{im}\,\psi_0(y_1,d)\,dx_1\,dy_1 = 0, \qquad (1.17)$$
where
$$K(x_1,y_1) := \begin{cases} x_1 & \text{if } y_1 < x_1,\\ -y_1 & \text{if } y_1 > x_1.\end{cases}$$
here ψ0 is chosen so that it satisfies the first identity in (1.11).

assumption (1.16) is not very restrictive, since eigenfunctions associated with isolated eigenvalues of elliptic operators usually decay exponentially at infinity. the main condition here is (1.17). as we shall show later in lemma 2.1, equation (1.8) is solvable if and only if $(\psi_0,\mathcal{T}\psi_0)_{L_2(\Omega)} = 0$, and we rewrite this identity as (1.17) by calculating $(\psi_0,\mathcal{T}\psi_0)_{L_2(\Omega)}$. the left-hand side in (1.17) is simpler in the sense that it involves only boundary integrals, while $(\psi_0,\mathcal{T}\psi_0)_{L_2(\Omega)}$ is in fact an integral over the whole strip ω.
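the ε^{1/2} mechanism of theorem 1.2 is already visible in the simplest 2×2 caricature of an algebraically double, geometrically simple eigenvalue; the following toy numerical check (our illustration, not the waveguide operator) shows the real-to-complex transition as the sign of the perturbation flips.

```python
import numpy as np

# toy model: m(eps) = [[lam0, 1], [c*eps, lam0]] has a jordan block at eps = 0
# (algebraic multiplicity two, geometric multiplicity one); its eigenvalues are
# lam0 +/- sqrt(c*eps), i.e. they split like eps**(1/2): real on one side of
# eps = 0 and a complex-conjugate pair on the other, as in theorem 1.2.

lam0, c = 2.0, 1.0
for eps in (+1e-2, -1e-2):
    m = np.array([[lam0, 1.0], [c * eps, lam0]])
    print(eps, np.sort_complex(np.linalg.eigvals(m)))
# eps > 0: two real eigenvalues lam0 +/- 0.1
# eps < 0: complex pair   lam0 +/- 0.1i
```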
2. proofs of main results

in $L_2(\Omega)$ we introduce the unitary operator $(\mathcal{U}_{\varepsilon\beta}f)(x) := e^{-i\varepsilon\beta(x_1)x_2}f(x)$. then it is easy to see that the spectra of $H_{\alpha+\varepsilon\beta}$ and $\mathcal{U}_{\varepsilon\beta}^{-1}H_{\alpha+\varepsilon\beta}\mathcal{U}_{\varepsilon\beta}$ coincide and
$$\mathcal{U}_{\varepsilon\beta}^{-1}H_{\alpha+\varepsilon\beta}\mathcal{U}_{\varepsilon\beta} = H_\alpha - \varepsilon\mathcal{L}_\varepsilon, \qquad (2.1)$$
$$\mathcal{L}_\varepsilon := -2i\beta' x_2\frac{\partial}{\partial x_1} - 2i\beta\frac{\partial}{\partial x_2} - \varepsilon\beta^2 - \varepsilon(\beta')^2 x_2^2 - i\beta'' x_2. \qquad (2.2)$$
in the proofs of the main results we shall make use of several auxiliary lemmata.

lemma 2.1. under the hypothesis of theorem 1.2 the equation
$$(H_\alpha-\lambda_0)u = f \qquad (2.3)$$
is solvable if and only if
$$(f,\mathcal{T}\psi_0)_{L_2(\Omega)} = 0. \qquad (2.4)$$
under the hypothesis of theorem 1.1 equation (2.3) is solvable if and only if
$$(f,\mathcal{T}\psi_0^\pm)_{L_2(\Omega)} = 0. \qquad (2.5)$$

proof. by (1.3) we see that under the hypotheses of both theorems 1.1 and 1.2, λ0 is an eigenvalue of $H_\alpha^*$ with the associated eigenfunction(s) $\mathcal{T}\psi_0$ or $\mathcal{T}\psi_0^\pm$. the lemma then follows from [8, ch. iii, sec. 6.6, rem. 6.23].

lemma 2.2. suppose the hypothesis of theorem 1.2. then the eigenfunction ψ0 can be chosen so that relations (1.10), (1.11), and
$$(\psi_0,\mathcal{T}\psi_0)_{L_2(\Omega)} = 0 \qquad (2.6)$$
hold true. the functions re ψ0 and re φ0 are even w.r.t. x2, and im ψ0 and im φ0 are odd w.r.t. x2.

proof. identity (2.6) follows directly from (2.4) applied to equation (1.8). since λ0 is a real simple eigenvalue and equation (1.8) has a unique solution satisfying the second identity in (1.10), by (1.2) we have (1.11), and thus re ψ0 and re φ0 are even, while im ψ0 and im φ0 are odd w.r.t. x2. employing this fact and (1.8), we obtain
$$(\phi_0,\mathcal{T}\psi_0)_{L_2(\Omega)} = -\int_\Omega \phi_0(\Delta+\lambda_0)\phi_0\,dx = i\int_{\Gamma_+}\alpha\phi_0^2\,dx_1 - i\int_{\Gamma_-}\alpha\phi_0^2\,dx_1 + \int_\Omega\left(\Big(\frac{\partial\phi_0}{\partial x_1}\Big)^2 + \Big(\frac{\partial\phi_0}{\partial x_2}\Big)^2 - \lambda_0\phi_0^2\right)dx$$
$$= -4\int_{\Gamma_+}\alpha\,\mathrm{re}\,\phi_0\,\mathrm{im}\,\phi_0\,dx_1 + \int_\Omega\big(|\nabla\,\mathrm{re}\,\phi_0|^2 - |\nabla\,\mathrm{im}\,\phi_0|^2\big)\,dx - \lambda_0\int_\Omega\big(|\mathrm{re}\,\phi_0|^2 - |\mathrm{im}\,\phi_0|^2\big)\,dx \in \mathbb{R}. \qquad (2.7)$$
hence, multiplying the functions ψ0 and φ0 by an appropriate constant, we can easily get the first identity in (1.10) without spoiling the other established properties of φ0 and ψ0.

lemma 2.3. suppose the hypothesis of theorem 1.2. then for λ close to λ0 the resolvent $(H_\alpha-\lambda)^{-1}$ can be represented as
$$(H_\alpha-\lambda)^{-1} = \frac{P_{-2}}{(\lambda-\lambda_0)^2} + \frac{P_{-1}}{\lambda-\lambda_0} + R_\alpha(\lambda), \qquad (2.8)$$
$$P_{-2} = \psi_0\ell_2, \quad P_{-1} = \phi_0\ell_2 + \psi_0\ell_1, \quad \ell_2 f := -(f,\mathcal{T}\psi_0)_{L_2(\Omega)}, \quad \ell_1 f := -\big(f,\mathcal{T}(\phi_0-\psi_0)\big)_{L_2(\Omega)}, \qquad (2.9)$$
where $R_\alpha(\lambda)$ is the reduced resolvent, a bounded operator holomorphic in λ.

proof. we know by [8, ch. iii, sec. 6.5] (see also the remark on the space m′(0) in the proof of theorem 1.7 in [8, ch. vii, sec. 1.3]) that $(H_\alpha-\lambda)^{-1}$ can be expanded into the laurent series
$$(H_\alpha-\lambda)^{-1} = \sum_{n=1}^{N}\frac{P_{-n}}{(\lambda-\lambda_0)^n} + R_\alpha(\lambda),$$
where N is a fixed number independent of λ and $R_\alpha$ is the reduced resolvent, a bounded operator holomorphic in λ. given any $f\in L_2(\Omega)$, we then have
$$u = (H_\alpha-\lambda)^{-1}f = \sum_{n=1}^{N}\frac{u_{-n}}{(\lambda-\lambda_0)^n} + \sum_{n=0}^{\infty}(\lambda-\lambda_0)^n u_n.$$
we substitute this formula into the equation $(H_\alpha-\lambda)u = f$ and equate the coefficients at like powers of $(\lambda-\lambda_0)$:
$$(H_\alpha-\lambda_0)u_{-N} = 0, \quad (H_\alpha-\lambda_0)u_{-k} = u_{-k-1},\ k=1,\dots,N-1, \quad (H_\alpha-\lambda_0)u_0 = f + u_{-1}, \quad (H_\alpha-\lambda_0)u_1 = u_0. \qquad (2.10)$$
this implies that $u_{-N} = \psi_0\ell_2 f$ and $u_{-N+1} = \phi_0\ell_2 f + \psi_0\ell_1 f$, where $\ell_i$ are some functionals on $L_2(\Omega)$. if N > 2, then by (1.9) and lemma 2.1 the equation for $u_{-N+2}$ is unsolvable. hence, we can assume N = 2. writing then the solvability condition (2.4) for equations (2.10) and taking into consideration the identity in (1.10), we easily arrive at the formula for ℓ2 in (2.9) and
$$\ell_1 f := -(u_0,\mathcal{T}\psi_0)_{L_2(\Omega)}, \qquad (2.11)$$
where u0 is the solution to the equation
$$(H_\alpha-\lambda_0)u_0 = f + \psi_0\ell_2 f \qquad (2.12)$$
satisfying
$$(u_0,\psi_0)_{L_2(\Omega)} = 0. \qquad (2.13)$$
it follows from (1.3) and (1.8) that
$$(u_0,\mathcal{T}\psi_0)_{L_2(\Omega)} = \big(u_0,\mathcal{T}(H_\alpha-\lambda_0)\phi_0\big)_{L_2(\Omega)} = \big(u_0,(H_\alpha-\lambda_0)^*\mathcal{T}\phi_0\big)_{L_2(\Omega)} = \big((H_\alpha-\lambda_0)u_0,\mathcal{T}\phi_0\big)_{L_2(\Omega)} = (f+\psi_0\ell_2 f,\mathcal{T}\phi_0)_{L_2(\Omega)}.$$
these identities, the formula for ℓ2 obtained above, and (2.6), (2.11) imply the formula for ℓ1 in (2.9).

lemma 2.4. suppose the hypothesis of theorem 1.1. then for λ close to λ0 the resolvent $(H_\alpha-\lambda)^{-1}$ can be represented as
$$(H_\alpha-\lambda)^{-1} = \frac{P_{-1}}{\lambda-\lambda_0} + R_\alpha(\lambda), \qquad (2.14)$$
$$P_{-1} = \psi_0^+\ell_+ + \psi_0^-\ell_-, \qquad \ell_\pm f := -(f,\mathcal{T}\psi_0^\pm)_{L_2(\Omega)}, \qquad (2.15)$$
where $R_\alpha(\lambda)$ is the reduced resolvent, a bounded operator holomorphic in λ.

the proof of this lemma is similar to that of lemma 2.3; we just have to bear in mind that due to (1.4) and lemma 2.1 the equations $(H_\alpha-\lambda_0)u = \psi_0^\pm$ are unsolvable.

we proceed to the proofs of theorems 1.1, 1.2, 1.3.

proof of theorem 1.2. the proof is based on the modified version of the birman–schwinger principle suggested in [9], in the form developed in [10]. in view of (2.1), the eigenvalue equation for $H_{\alpha+\varepsilon\beta}$ is equivalent to the same equation for $H_\alpha-\varepsilon\mathcal{L}_\varepsilon$. the latter equation can be written as
$$(H_\alpha-\lambda_\varepsilon)\psi_\varepsilon = \varepsilon\mathcal{L}_\varepsilon\psi_\varepsilon. \qquad (2.16)$$
we then invert the operator $(H_\alpha-\lambda_\varepsilon)$ by lemma 2.3 and obtain
$$\psi_\varepsilon = \varepsilon\frac{P_{-2}\mathcal{L}_\varepsilon\psi_\varepsilon}{(\lambda_\varepsilon-\lambda_0)^2} + \varepsilon\frac{P_{-1}\mathcal{L}_\varepsilon\psi_\varepsilon}{\lambda_\varepsilon-\lambda_0} + \varepsilon R_\alpha(\lambda_\varepsilon)\mathcal{L}_\varepsilon\psi_\varepsilon.$$
by lemma 2.3 the operator $R_\alpha(\lambda)$ is bounded uniformly in λ close to λ0, and hence the inverse $A(z,\varepsilon) := \big(I-\varepsilon R_\alpha(\lambda_0+z)\mathcal{L}_\varepsilon\big)^{-1}$ is well-defined and uniformly bounded for all λ close to λ0 and for all sufficiently small ε. we apply this operator to the latter equation and get
$$\psi_\varepsilon = \frac{\varepsilon}{z_\varepsilon^2}A(\lambda_0+z_\varepsilon,\varepsilon)P_{-2}\mathcal{L}_\varepsilon\psi_\varepsilon + \frac{\varepsilon}{z_\varepsilon}A(\lambda_0+z_\varepsilon,\varepsilon)P_{-1}\mathcal{L}_\varepsilon\psi_\varepsilon, \qquad (2.17)$$
where we denote $z_\varepsilon := \lambda_\varepsilon-\lambda_0$. then we apply the functionals $\ell_2\mathcal{L}_\varepsilon$, $\ell_1\mathcal{L}_\varepsilon$ to the obtained equation, which results in
$$\left(\frac{\varepsilon}{z_\varepsilon}a_{11}(z_\varepsilon,\varepsilon)-1\right)x_1 + \frac{\varepsilon}{z_\varepsilon^2}\big(a_{11}(z_\varepsilon,\varepsilon)+z_\varepsilon a_{12}(z_\varepsilon,\varepsilon)\big)x_2 = 0,$$
$$\frac{\varepsilon}{z_\varepsilon}a_{21}(z_\varepsilon,\varepsilon)x_1 + \left(\frac{\varepsilon}{z_\varepsilon^2}\big(a_{21}(z_\varepsilon,\varepsilon)+z_\varepsilon a_{22}(z_\varepsilon,\varepsilon)\big)-1\right)x_2 = 0, \qquad (2.18)$$
where $x_i = \ell_i\mathcal{L}_\varepsilon\psi_\varepsilon$ and
$$a_{i1}(z,\varepsilon) := \ell_i\mathcal{L}_\varepsilon A(\lambda_0+z,\varepsilon)\psi_0, \qquad a_{i2}(z,\varepsilon) := \ell_i\mathcal{L}_\varepsilon A(\lambda_0+z,\varepsilon)\phi_0, \qquad i=1,2.$$
the obtained system of equations is linear w.r.t. $(x_1,x_2)$. we need a non-zero solution to this system, since otherwise by (2.17) we would get ψε = 0 and ψε then could not be an eigenfunction. system (2.18) has a nonzero solution if its determinant vanishes. this implies the equation
$$z_\varepsilon^2 - \varepsilon\big(a_{11}(z_\varepsilon,\varepsilon)+a_{22}(z_\varepsilon,\varepsilon)\big)z_\varepsilon - \varepsilon a_{21}(z_\varepsilon,\varepsilon) + \varepsilon^2\big(a_{11}(z_\varepsilon,\varepsilon)a_{22}(z_\varepsilon,\varepsilon) - a_{12}(z_\varepsilon,\varepsilon)a_{21}(z_\varepsilon,\varepsilon)\big) = 0,$$
which is equivalent to the following two equations
$$z_\varepsilon = g_\pm(z_\varepsilon,\varepsilon^{1/2}), \qquad (2.19)$$
where
$$g_\pm(z,\kappa) := \frac{\kappa^2}{2}\big(a_{11}(z,\kappa^2)+a_{22}(z,\kappa^2)\big) \pm \kappa\left(a_{21}(z,\kappa^2) + \frac{\kappa^2}{4}\big(a_{11}(z,\kappa^2)-a_{22}(z,\kappa^2)\big)^2 + \kappa^2 a_{12}(z,\kappa^2)a_{21}(z,\kappa^2)\right)^{1/2}. \qquad (2.20)$$
here the branch of the square root is fixed by the restriction $1^{1/2}=1$. it is clear that the functions $a_{ij}$ are jointly holomorphic w.r.t. sufficiently small z and ε. moreover, by (2.2),
$$a_{21}(0,\varepsilon) = \ell_2\mathcal{L}_\varepsilon A(0,\varepsilon)\psi_0 = i\ell_2\left(-2\beta' x_2\frac{\partial}{\partial x_1} - 2\beta\frac{\partial}{\partial x_2} - \beta'' x_2\right)\psi_0 + O(\varepsilon). \qquad (2.21)$$
to calculate the first term on the right-hand side of this identity, we first observe that by the equation for ψ0 we have
$$-\left(2\beta' x_2\frac{\partial}{\partial x_1} + 2\beta\frac{\partial}{\partial x_2} + \beta'' x_2\right)\psi_0 = -(\Delta+\lambda_0)\beta x_2\psi_0 =: g.$$
now we find $i\ell_2 g$ by integration by parts:
$$i\ell_2 g = i\int_\Omega \psi_0(\Delta+\lambda_0)\beta x_2\psi_0\,dx = i\int_{\Gamma_+}\left(\psi_0\frac{\partial}{\partial x_2}\beta x_2\psi_0 - \beta x_2\psi_0\frac{\partial\psi_0}{\partial x_2}\right)dx_1 - i\int_{\Gamma_-}\left(\psi_0\frac{\partial}{\partial x_2}\beta x_2\psi_0 - \beta x_2\psi_0\frac{\partial\psi_0}{\partial x_2}\right)dx_1 = i\int_{\Gamma_+}\beta\psi_0^2\,dx_1 - i\int_{\Gamma_-}\beta\psi_0^2\,dx_1. \qquad (2.22)$$
together with lemma 2.2 this implies
$$i\ell_2 g = -4\int_{\Gamma_+}\beta\,\mathrm{re}\,\psi_0\,\mathrm{im}\,\psi_0\,dx_1. \qquad (2.23)$$
hence, by (2.20), (2.22), (1.12), and the properties of the functions $a_{ij}$, we conclude that the functions $g_\pm$ are jointly holomorphic w.r.t. sufficiently small z and κ. applying the rouché theorem as in [10, sec. 4], we conclude that for all sufficiently small κ each of the functions $z\mapsto z-g_\pm(z,\kappa)$ has a simple zero $z_\pm(\kappa)$ in a small neighborhood of the origin.
by the implicit function theorem these zeroes are holomorphic w.r.t. κ. thus, the desired solutions to equations (2.19) are $z_\pm(\varepsilon^{1/2})$, and these functions are holomorphic w.r.t. $\varepsilon^{1/2}$. moreover, it follows from (2.19), (2.20), (2.21), (2.22), (2.23) that
$$z_\pm(\varepsilon^{1/2}) = g_\pm(0,\varepsilon^{1/2}) + O(\varepsilon) = \pm\varepsilon^{1/2}a_{21}^{1/2}(0,\varepsilon) + O(\varepsilon),$$
and then the sought eigenvalues are $\lambda_\varepsilon^\pm = \lambda_0 + z_\pm(\varepsilon^{1/2})$. these eigenvalues are holomorphic w.r.t. $\varepsilon^{1/2}$ and obey (1.15).

let us prove that these eigenvalues are real when (1.13) holds true and complex when (1.14) is satisfied. the latter statement follows easily from formulae (1.15), since in this case $\varepsilon^{1/2}\lambda_{1/2}^\pm$ are two imaginary numbers. to prove the reality, as one can easily make sure, it is sufficient to prove that the functions $g_\pm(z,\kappa)$ are real for real z and κ; then the existence of a real root is implied easily by the implicit function theorem for real functions. in view of definition (2.20) of $g_\pm$, the desired fact is yielded by the corresponding reality of the $a_{ij}$. let us prove the latter.

it follows from lemma 2.3 that for each $f\in L_2(\Omega)$ the function
$$R_\alpha(\lambda)f = (H_\alpha-\lambda)^{-1}f - \frac{P_{-2}f}{(\lambda-\lambda_0)^2} - \frac{P_{-1}f}{\lambda-\lambda_0}$$
solves the equation
$$(H_\alpha-\lambda)R_\alpha(\lambda)f = f + \psi_0\ell_1 f + \phi_0\ell_2 f. \qquad (2.24)$$
employing definition (2.2) of $\mathcal{L}_\varepsilon$, we easily check that $\mathcal{PT}\mathcal{L}_\varepsilon = \mathcal{L}_\varepsilon\mathcal{PT}$. this identity and (1.11), (2.24) yield that for $z\in\mathbb{R}$, $\kappa\in\mathbb{R}$,
$$\mathcal{PT}\mathcal{L}_\varepsilon A(\lambda_0+z,\kappa)\psi_0 = \mathcal{L}_\varepsilon A(\lambda_0+z,\kappa)\psi_0, \qquad \mathcal{PT}\mathcal{L}_\varepsilon A(\lambda_0+z,\kappa)\phi_0 = \mathcal{L}_\varepsilon A(\lambda_0+z,\kappa)\phi_0.$$
using (1.11) once again, for $z\in\mathbb{R}$, $\kappa\in\mathbb{R}$ we get
$$a_{11}(z,\kappa) = \big(\mathcal{PT}\mathcal{L}_\varepsilon A(\lambda_0+z,\kappa)\psi_0,\mathcal{P}\psi_0\big)_{L_2(\Omega)} = \big(\mathcal{T}\mathcal{L}_\varepsilon A(\lambda_0+z,\kappa)\psi_0,\mathcal{T}\psi_0\big)_{L_2(\Omega)} = \overline{a_{11}(z,\kappa)}.$$
the reality of the other functions $a_{ij}$ can be proven in the same way. the proof is complete.

proof of theorem 1.1. the main ideas here are the same as in the proof of theorem 1.2, so we focus only on the main milestones. we again begin with (2.1) and invert $(H_\alpha-\lambda_\varepsilon)$ by lemma 2.4. this leads us to an analogue of equation (2.17),
$$\psi_\varepsilon = \frac{\varepsilon}{z_\varepsilon}A(\lambda_0+z_\varepsilon,\varepsilon)P_{-1}\mathcal{L}_\varepsilon\psi_\varepsilon, \qquad (2.25)$$
where the operator A is introduced in the same way as above. we then apply the functionals $\ell_\pm\mathcal{L}_\varepsilon$ to this equation:
$$\left(\frac{\varepsilon}{z_\varepsilon}b_{11}(z_\varepsilon,\varepsilon)-1\right)x_1 + \frac{\varepsilon}{z_\varepsilon}b_{12}(z_\varepsilon,\varepsilon)x_2 = 0, \qquad \frac{\varepsilon}{z_\varepsilon}b_{21}(z_\varepsilon,\varepsilon)x_1 + \left(\frac{\varepsilon}{z_\varepsilon}b_{22}(z_\varepsilon,\varepsilon)-1\right)x_2 = 0, \qquad (2.26)$$
$$b_{11}(z,\varepsilon) := \ell_+\mathcal{L}_\varepsilon A(\lambda_0+z,\varepsilon)\psi_0^+, \quad b_{12}(z,\varepsilon) := \ell_+\mathcal{L}_\varepsilon A(\lambda_0+z,\varepsilon)\psi_0^-, \quad b_{21}(z,\varepsilon) := \ell_-\mathcal{L}_\varepsilon A(\lambda_0+z,\varepsilon)\psi_0^+, \quad b_{22}(z,\varepsilon) := \ell_-\mathcal{L}_\varepsilon A(\lambda_0+z,\varepsilon)\psi_0^-.$$
the determinant of system (2.26) should again vanish, which implies the equation
$$z_\varepsilon^2 - \varepsilon\big(b_{11}(z_\varepsilon,\varepsilon)+b_{22}(z_\varepsilon,\varepsilon)\big)z_\varepsilon + \varepsilon^2\big(b_{11}(z_\varepsilon,\varepsilon)b_{22}(z_\varepsilon,\varepsilon) - b_{12}(z_\varepsilon,\varepsilon)b_{21}(z_\varepsilon,\varepsilon)\big) = 0,$$
which splits into the two equations
$$z_\varepsilon = q_\pm(z_\varepsilon,\varepsilon), \qquad (2.27)$$
$$q_\pm(z,\varepsilon) := \frac{\varepsilon}{2}\big(b_{11}(z,\varepsilon)+b_{22}(z,\varepsilon)\big) \pm \frac{\varepsilon}{2}\Big(\big(b_{11}(z,\varepsilon)-b_{22}(z,\varepsilon)\big)^2 + 4b_{12}(z,\varepsilon)b_{21}(z,\varepsilon)\Big)^{1/2}.$$
here the branch of the square root is fixed by the restriction $1^{1/2}=1$. let us prove that this square root is jointly holomorphic w.r.t. z and ε. integrating by parts as in (2.22) and employing (1.1), one can easily make sure that
$$b_{ii}(0,\varepsilon) = B_{ii} + O(\varepsilon),\ i=1,2, \qquad b_{12}(0,\varepsilon) = B_{12} + O(\varepsilon), \qquad b_{21}(0,\varepsilon) = B_{12} + O(\varepsilon). \qquad (2.28)$$
hence, by assumption (1.5), the functions $q_\pm$ are jointly holomorphic w.r.t. z and ε. proceeding now as in the proof of theorem 1.2, we arrive at the statement of theorem 1.1.

proof of theorem 1.3. denote
$$\Psi(x) := \frac{1}{2}\int_{-\infty}^{x_1} t\,\psi_0(t,x_2)\,dt.$$
in view of (1.16) this function is well-defined. throughout the proof we shall deal with several integrals of this kind, and all of them are well-defined due to (1.16); in what follows we shall not stress this fact any more.
employing the equation for ψ0, integrating by parts, and bearing in mind estimates (1.16), we get
$$(\Delta+\lambda_0)\Psi = \frac{1}{2}\psi_0 + \frac{1}{2}x_1\frac{\partial\psi_0}{\partial x_1} + \frac{1}{2}\int_{-\infty}^{x_1} t\left(\frac{\partial^2}{\partial x_2^2}+\lambda_0\right)\psi_0(t,x_2)\,dt = \frac{1}{2}\psi_0 + \frac{1}{2}x_1\frac{\partial\psi_0}{\partial x_1} - \frac{1}{2}\int_{-\infty}^{x_1} t\,\frac{\partial^2\psi_0}{\partial x_1^2}(t,x_2)\,dt = \psi_0.$$
the proven equation for Ψ allows us to integrate once again:
$$\int_\Omega \psi_0^2\,dx = \int_\Omega \psi_0(\Delta+\lambda_0)\Psi\,dx = \int_{\Gamma_+}\left(\psi_0\frac{\partial\Psi}{\partial x_2} - \Psi\frac{\partial\psi_0}{\partial x_2}\right)dx_1 - \int_{\Gamma_-}\left(\psi_0\frac{\partial\Psi}{\partial x_2} - \Psi\frac{\partial\psi_0}{\partial x_2}\right)dx_1 = \int_{\Gamma_+}\psi_0\left(\frac{\partial\Psi}{\partial x_2} + i\alpha\Psi\right)dx_1 - \int_{\Gamma_-}\psi_0\left(\frac{\partial\Psi}{\partial x_2} + i\alpha\Psi\right)dx_1.$$
now we employ identity (1.11) and boundary condition (1.1) for ψ0 to simplify the sum of these integrals:
$$\int_\Omega \psi_0^2\,dx = -\int_{\Gamma_+}dx_1\,\mathrm{re}\,\psi_0(x_1,d)\,x_1\int_{-\infty}^{x_1}\big(\alpha(x_1)-\alpha(y_1)\big)\,\mathrm{im}\,\psi_0(y_1,d)\,dy_1 - \int_{\Gamma_+}dx_1\,\mathrm{im}\,\psi_0(x_1,d)\,x_1\int_{-\infty}^{x_1}\big(\alpha(x_1)-\alpha(y_1)\big)\,\mathrm{re}\,\psi_0(y_1,d)\,dy_1$$
$$= -\int_{\Gamma_+}dx_1\,\mathrm{re}\,\psi_0(x_1,d)\,x_1\int_{-\infty}^{x_1}\big(\alpha(x_1)-\alpha(y_1)\big)\,\mathrm{im}\,\psi_0(y_1,d)\,dy_1 + \int_{\Gamma_+}dx_1\,\mathrm{re}\,\psi_0(x_1,d)\,x_1\int_{x_1}^{+\infty}\big(\alpha(x_1)-\alpha(y_1)\big)\,\mathrm{im}\,\psi_0(y_1,d)\,dy_1$$
$$= -\int_{\mathbb{R}^2}K(x_1,y_1)\big(\alpha(x_1)-\alpha(y_1)\big)\,\mathrm{re}\,\psi_0(x_1,d)\,\mathrm{im}\,\psi_0(y_1,d)\,dx_1\,dy_1.$$
by (2.4) we then conclude that equation (1.8) is solvable if and only if identity (1.17) holds true.

remark 2.5. the idea of the latter proof was borrowed from the proof of lemma 2.2 in [11]; see also the proof of lemma 3.6 in [10].

acknowledgements

the author thanks m. znojil for valuable discussions that stimulated him to write this paper. the work is partially supported by rfbr, by a grant of the president of russia for young scientists – doctors of science (md-183.2014.1), and by the dynasty foundation fellowship for young mathematicians.

references

[1] d. borisov, d. krejčiřík. pt-symmetric waveguide. integral equations and operator theory. 2008. v. 62, no. 4, p. 489–515. doi: 10.1007/s00020-008-1634-1
[2] d. borisov. on a pt-symmetric waveguide with a pair of small holes. proceedings of steklov institute of mathematics. 2013. v. 281, no. 1 supplement, p. 5–21; translated from trudy instituta matematiki i mekhaniki uro ran. 2012. v. 18, no. 2, p. 22–37. doi: 10.1134/s0081543813050027
[3] d. borisov. discrete spectrum of thin pt-symmetric waveguide. ufa mathematical journal. 2014. v. 6, no. 1, p. 29–55. doi: 10.13108/2014-6-1-29
[4] d. borisov. on a quantum waveguide with a small pt-symmetric perturbation. acta polytechnica. 2007. no. 2–3, p. 59–61.
[5] d. borisov, d. krejčiřík. the effective hamiltonian for thin layers with non-hermitian robin-type boundary conditions. asymptotic analysis. 2012. v. 76, no. 1, p. 49–59. doi: 10.3233/asy-2011-1061
[6] d. krejčiřík and p. siegl. pt-symmetric models in curved manifolds. journal of physics a: mathematical and theoretical. 2010. v. 43, no. 48, id 485204. doi: 10.1088/1751-8113/43/48/485204
[7] d. krejčiřík and m. tater. non-hermitian spectral effects in a pt-symmetric waveguide. journal of physics a: mathematical and theoretical. 2008. v. 41, no. 24, id 244013. doi: 10.1088/1751-8113/41/24/244013
[8] t. kato. perturbation theory for linear operators. classics in mathematics, springer-verlag, berlin. 1995.
[9] r. r. gadyl'shin. local perturbations of the schrödinger operator on the axis. theoretical and mathematical physics. 2002. v. 132, no. 1, p. 976–982. doi: 10.4213/tmf349
[10] d. borisov. discrete spectrum of a pair of non-symmetric waveguides coupled by a window. sbornik mathematics. 2006. v. 197, no. 4, p. 475–504. doi: 10.1070/sm2006v197n04abeh003767
[11] d. i. borisov. on a model boundary value problem for laplacian with frequently alternating type of boundary condition. asymptotic analysis. 2003. v. 35, no. 1, p. 1–26.
[12] c. m. bender, s. boettcher. real spectra in non-hermitian hamiltonians having pt symmetry. physical review letters. 1998. v. 80, no. 24, p. 5243–5246. doi: 10.1103/physrevlett.80.5243
[13] a. mostafazadeh. pseudo-hermiticity versus pt-symmetry: the necessary condition for the reality of the spectrum of a non-hermitian hamiltonian. journal of mathematical physics. 2002. v. 43, no. 1, p. 205–214. doi: 10.1063/1.1418246
[14] a. mostafazadeh. pseudo-hermiticity versus pt-symmetry ii: a complete characterization of non-hermitian hamiltonians with a real spectrum. journal of mathematical physics. 2002. v. 43, no. 5, p. 2814–2816. doi: 10.1063/1.1461427
[15] a. mostafazadeh. pseudo-hermiticity versus pt-symmetry iii: equivalence of pseudo-hermiticity and the presence of antilinear symmetries. journal of mathematical physics. 2002. v. 43, no. 8, p. 3944–3951. doi: 10.1063/1.1489072
[16] a. mostafazadeh. on the pseudo-hermiticity of a class of pt-symmetric hamiltonians in one dimension. modern physics letters a. 2002. v. 17, no. 30, p. 1973–1977. doi: 10.1142/s0217732302008472
[17] m. znojil. exact solution for morse oscillator in pt-symmetric quantum mechanics. physics letters a. 1999. v. 264, no. 2, p. 108–111. doi: 10.1016/s0375-9601(99)00805-1
[18] m. znojil. non-hermitian matrix description of the pt-symmetric anharmonic oscillators. journal of physics a: mathematics and general. 1999. v. 32, no. 42, p. 7419–7428. doi: 10.1088/0305-4470/32/42/313
[19] m. znojil. pt-symmetric harmonic oscillators. physics letters a. 1999. v. 259, no. 3–4, p. 220–223. doi: 10.1016/s0375-9601(99)00429-6
[20] g. levai and m. znojil. systematic search for pt-symmetric potentials with real energy spectra. journal of physics a: mathematics and general. 2000. v. 33, no. 40, p. 7165–7180. doi: 10.1088/0305-4470/33/40/313
[21] c. m. bender. making sense of non-hermitian hamiltonians. reports on progress in physics. 2007. v. 70, no. 6, p. 947–1018. doi: 10.1088/0034-4885/70/6/r03

acta polytechnica 53(supplement):560–572, 2013. doi:10.14311/ap.2013.53.0560. © czech technical university in prague, 2013. available online at http://ojs.cvut.cz/ojs/index.php/ap

the sz effect in the planck era: astrophysical and cosmological impact

sergio colafrancesco a,b,*
a university of the witwatersrand, johannesburg (south africa)
b inaf-oar, monteporzio (italy)
* corresponding author: sergio.colafrancesco@wits.ac.za

abstract. the sunyaev–zel'dovich effect (sze) is a relevant probe for cosmology and particle astrophysics. the planck era marks a definite step forward in the use of this probe for astrophysics and cosmology.
astrophysical applications to galaxy clusters, galaxies, radiogalaxies and large-scale structures are discussed. cosmological relevance for the dark energy equation of state, modified gravity scenarios, dark matter search, cosmic magnetism and other cosmological applications is also reviewed. future directions for the study of the sze and its polarization are finally outlined.

keywords: cosmology, cmb, dark matter, dark energy, cosmic magnetism, cosmic structures: galaxy clusters, galaxies, radio galaxies.

1. introduction

comptonization of the cmb photons by electrons in the plasma confined in the atmospheres of cosmic structures (hereafter referred to as the sz effect: sze) is a powerful probe of the energetics, the spectra, and the stratification of their overall electronic distribution, because the spectral and spatial characteristics of this process are sensitive to the properties (spectrum, spatial distribution, energy density, pressure) of the electron population. this makes the sze a powerful astrophysical probe. due to its redshift-independent nature, the sze is also a powerful cosmological probe. to this aim, the sze in cosmic structures must be determined with very good accuracy in order to derive reliable and unbiased cosmological probes. therefore, a detailed astrophysical study of the various sources of the sze has to be carried out before their vast cosmological applications are set to work.

2. the physics of the sz effect

the sze is produced by the inverse compton scattering (ics) of cmb photons off the electrons confined in the atmospheres of cosmic structures. observable quantities of the sze include: i) spectral distortions of the cmb due to upscattering of cmb photons induced by high-e electrons (thermal, non-thermal and relativistic sze); ii) spectral distortion of the cmb due to a bulk motion of the electronic plasma w.r.t. the hubble flow (kinematic sze); iii) polarization of the cmb due to dynamical and plasma effects (sze polarization).

2.1. the sz effect

the spectral distortion of the cmb spectrum observable in the direction of a galaxy cluster can be written [7, 14, 71] as
$$\delta I(x) = \frac{2(kT_{\rm CMB})^3}{(hc)^2}\,y\,g(x), \qquad (1)$$
where $\delta I(x) = I(x) - I_0(x)$, $I(x)$ is the up-scattered cmb spectrum in the direction of the cluster and $I_0(x)$ is the unscattered cmb spectrum in the direction of a sky area contiguous to the cluster. here $x \equiv h\nu/kT_{\rm CMB}$, h is the planck constant, k is the boltzmann constant, $T_{\rm CMB} = 2.726$ k is the cmb temperature today, and ν is the observing frequency. the comptonization parameter y is
$$y = \frac{\sigma_T}{m_e c^2}\int P_e\,d\ell \qquad (2)$$
in terms of the pressure $P_e$ contributed by the electronic population. here $\sigma_T$ is the thomson cross section, $m_e$ the electron mass, and c the speed of light. the spectral function g(x) of the sze is [14]
$$g(x) = \frac{m_e c^2}{\langle\varepsilon_e\rangle}\left\{\frac{1}{\tau_e}\left[\int_{-\infty}^{+\infty} i_0(x e^{-s})\,P(s)\,ds - i_0(x)\right]\right\} \qquad (3)$$
in terms of the photon redistribution function P(s) and of
$$i_0(x) = I_0(x)\Big/\frac{2(kT_{\rm CMB})^3}{(hc)^2} = \frac{x^3}{e^x-1}. \qquad (4)$$
the quantity
$$\langle\varepsilon_e\rangle \equiv \frac{\sigma_T}{\tau_e}\int P_e\,d\ell = \int_0^\infty dp\,f_e(p)\,\frac{1}{3}\,p\,v(p)\,m_e c, \qquad (5)$$
where $f_e(p)$ is the normalized electron momentum distribution function, is the average energy of the electron plasma [14].

figure 1. the sze spectral shape g(x) is shown as a function of the nondimensional frequency x for different electronic populations residing in a cluster: thermal with $k_B T_e = 8.2$ kev (blue); warm with $k_B T_e = 1$ kev (cyan); secondary electrons from dm annihilation with $m_\chi = 20$ gev (red); relativistic electrons which fit the coma radio halo spectrum (yellow). the kinematic sze with a negative peculiar velocity (green) is also shown for comparison. the amplitudes of the various curves have been artificially renormalized to highlight their frequency dependence.
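as a quick numerical companion to (1)–(5): in the non-relativistic (kompaneets) limit the thermal spectral function reduces to the familiar shape $x^4 e^x/(e^x-1)^2\,\big(x\coth(x/2)-4\big)$; the python sketch below (an illustration using this limiting form rather than the covariant expression (3)) reproduces the null near 217 ghz visible in figure 1.

```python
import numpy as np

# non-relativistic (kompaneets) limit of the thermal sze; the covariant
# spectral function (3) reduces to this shape for low electron temperatures.

def x_of_nu(nu_ghz, t_cmb=2.726):
    """nondimensional frequency x = h*nu/(k*t_cmb); h/k = 0.0479924 k/ghz."""
    return 0.0479924 * nu_ghz / t_cmb

def g_kompaneets(x):
    """thermal sze spectral shape: x^4 e^x/(e^x-1)^2 * (x coth(x/2) - 4)."""
    ex = np.exp(x)
    return x**4 * ex / (ex - 1.0)**2 * (x * (ex + 1.0) / (ex - 1.0) - 4.0)

def delta_i(nu_ghz, y):
    """cmb distortion in units of 2(k t_cmb)^3/(hc)^2, cf. eq. (1)."""
    return y * g_kompaneets(x_of_nu(nu_ghz))

# locate the null of the thermal sze by a coarse scan (expected near 217 ghz)
nu = np.linspace(100.0, 400.0, 3001)
signal = delta_i(nu, y=1e-4)                  # y value is illustrative
print("null at ~%.0f ghz" % nu[np.argmin(np.abs(signal))])
```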
the optical depth along the line of sight ℓ of the electron population with number density $n_e$ is
$$\tau_e = \sigma_T\int d\ell\,n_e. \qquad (6)$$
the photon redistribution function P(s), with $s = \ln(\nu'/\nu)$ in terms of the cmb photon frequency increase factor $\nu'/\nu$, can be calculated by repeated convolution of the single-scattering redistribution function, $P_1(s) = \int dp\,f_e(p)\,P_s(s;p)$, where $P_s(s;p)$ depends on the physics of inverse compton scattering. the previous description is relativistically covariant and general enough to be applied to both thermal and nonthermal plasmas, as well as to a combination of the two (see fig. 1 and [14, 23, 51, 52] for details).

2.2. kinematic sz effect

the velocity (or kinematic) sze (hereafter ksze) arises if the plasma causing the thermal, or nonthermal, sze is moving relative to the hubble flow. in the reference frame of the scattering particle the cmb radiation appears anisotropic, and the effect of the ics is to re-isotropize the radiation slightly. back in the rest frame of the observer the radiation field is no longer isotropic, but shows a structure towards the scattering atmosphere with amplitude $\propto \tau_e v_t/c$, where $v_t$ is the component of the peculiar velocity of the scattering atmosphere along the line of sight [66, 70]. the brightness change of the cmb due to the ksze is given by
$$\frac{\delta I}{I} = -\tau_e\,\beta_t\,\frac{x e^x}{e^x-1} \qquad (7)$$
with $\beta_t \equiv v_t/c$ [54, 70]. a general relativistic description of the ksze has been given in the framework of the general boltzmann equation [40] and in the relativistic covariant formalism [51, 52].

2.3. sze polarization

the ics process naturally yields a polarized up-scattered radiation field (see, e.g., [10] and references therein). the polarization π of the sze arises from various dynamical and plasma effects [5, 27, 44, 68]: the transverse motion of galaxy clusters ($\pi_k \propto \beta_t^2\tau$ in the rayleigh–jeans, rj, regime), transverse motions of plasma within the cluster ($\pi_v \propto \beta_t\tau^2$ in the rj regime) and multiple scattering between electrons and cmb photons within the cluster ($\pi_{th} \propto \theta\tau^2$ in the rj regime for the thermal sze, with $\theta = kT_e/m_e c^2$). a general, covariant, relativistic derivation of the sze polarization for thermal, non-thermal and relativistic plasmas is possible [27] and generalizes the non-relativistic derivation [68] in a way similar to the general derivation of the sze [14] previously discussed.

3. astrophysical and cosmological impact

studying the sze in various cosmic atmospheres provides many insights into their energetics, pressure and dynamical structure. the combination of the sze with other emission mechanisms related to the same particle distribution (i.e., synchrotron, high-e ics emission, bremsstrahlung emission) provides further information on the radiation, matter and magnetic fields that are co-spatial with the electrons producing the sze. these properties of the sze concern various cosmic structures, from galaxy clusters to radiogalaxy lobes, from galaxy halos to superclusters and the whim (see sect. 6 below). the redshift-independent nature of the sze allows this effect to be used as a powerful cosmological probe, via both the redshift evolution of cluster abundance and direct probes of cosmological parameters.
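the kinematic distortion (7) is equally direct to evaluate, being linear in both $\tau_e$ and $\beta_t$; a one-line sketch, with made-up illustrative parameter values:

```python
import numpy as np

def ksze_fractional(x, tau_e, beta_t):
    """kinematic sze fractional brightness change, eq. (7)."""
    return -tau_e * beta_t * x * np.exp(x) / (np.exp(x) - 1.0)

# illustrative values: tau_e = 5e-3, peculiar velocity 300 km/s along the line of sight
print(ksze_fractional(x=1.0, tau_e=5e-3, beta_t=300.0 / 299792.458))
```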
the sze has a wide range of cosmological applications: it can be used to determine the main cosmological parameters and the dark energy (de) equation of state, and also to set constraints on modified gravity scenarios and on the properties of primordial magnetic fields (see sect. 7 below). observations of the sze in cosmic structures have been performed in the last two decades with increasing sensitivity and spatial resolution, but in the limited frequency bands accessible from the ground; the availability of the planck surveyor is now opening its study over a wider frequency range. we discuss in the following the main achievements in sze physics in the pre-planck era and in the planck era, and we also outline some of the future possibilities of exploiting the large amount of astrophysical and cosmological information contained in the sze.

4. the past: pre-planck era

the sze has been searched for in galaxy clusters since it was originally proposed [70, 71], using various techniques. three distinct techniques for the measurement of the thermal sze in clusters of galaxies have yielded reliable results: single-dish radiometric observations, bolometric observations, and interferometric observations (we refer to [7] for a discussion of the weaknesses and strengths of each technique and the types of systematic error from which they suffer). no concerted results for measuring the polarization sze have yet been obtained. the milestones of the most relevant sze observations before the planck era are listed below:
– 1983. the owens valley radio observatory (ovro) first detects the sze at 30 ghz from clusters of galaxies.
– 1993. the ryle telescope is the first telescope to image a cluster of galaxies in the sze.
– 1998. the first sub-mm observation of the sze was obtained by the pronaos 2 m stratospheric telescope towards abell 2163 [43].
– 2001. the first four-band spectrum of the sze was obtained for the coma cluster with the mito 3.5 m telescope [32].
– 2003. the wmap spacecraft maps the cmb over the whole sky with some (limited) all-sky sensitivity to the sze.
– 2005. the atacama pathfinder experiment (apex) sze camera saw first light and shortly after began pointed observations of galaxy clusters.
– 2005. the arcminute microkelvin imager (ami) and the sz array (sza) each begin surveys for very high redshift clusters of galaxies using the sze.
– 2007. the south pole telescope (spt) saw first light on february 16, 2007, and began science observations in march of that same year.
– 2007. the atacama cosmology telescope (act) saw first light on june 8, 2007, and began an sze survey of galaxy clusters.
– 2008. spt discovers for the first time galaxy clusters in blind-survey mode via the sze.
– 2012. act finds statistical evidence for the ksze [38].

the pioneering era of the sze study has been dotted with numerous technological and scientific successes. ground-based sze experiments (e.g., spt, apex, act, ami, gbt, among others) provided excellent results in terms of imaging and blind-search surveys with their low-frequency, multiple-band observations, but they have neither true spectroscopic capabilities nor a wide spectroscopic frequency band, and they are not sensitive to the high-ν range (≳ 400 ghz) of the sze signals, which is crucial to exploit the astrophysical information contained in the sze (see figs. 1 and 2).

figure 2. the thermal sze spectrum of typical galaxy clusters for different plasma temperatures (kt = 7, 10, 15 and 20 kev) and the same value of the optical depth. the low-ν part (ν ≲ 220 ghz) of the spectrum depends mostly on the total compton parameter $y \propto \int d\ell\,P_{\rm tot}$, with no strong spectral dependence on the temperature. the high-ν part of the spectrum (at ≳ 300 ghz) shows a strong spectral dependence on the plasma temperature [23]. the usual frequency bands where the sze is observed from the ground are shown as blue-cyan bands, while the region accessible only from space observations is shown by the gray shaded area.
the region around the null of the thermal sze, i.e. at ∼ 220 ghz, is rich in astrophysical information, because the frequency location of the null of the sze is directly dependent on the total pressure of the plasma [14]; but this frequency displacement is rather difficult to measure due to the low amplitude of the sze signal at these frequencies and to the heavy biases and uncertainties (due to the level of the cmb subtraction, the kinematic sze, and possible sources of non-thermal sze) in the measurement (see [23, 31]). in fact, ground-based instruments widely improved the source statistics (crucial to obtain cosmological information using the sze) and the angular resolution of sze images (crucial to disentangle the extended sze signal from point-source contamination), but they add little to the physical specification of the detected sze sources, and therefore they need x-ray and optical follow-up to determine the characteristics of the physical parameters extracted from sze observations.

5. the present: planck era

the planck surveyor satellite was launched in may 2009, together with the herschel satellite, and set in an l2 orbit. planck has a 1.5 m gregorian telescope and receivers covering 9 frequency bands, from 30 to 857 ghz.
the information collected so far strengthens our overall view of the icm properties and mass content in galaxy clusters, and is helping to close the long-standing issue of the “missing hot baryons” through the excellent agreement between the observed sze compton parameter y_sze and x-ray-based predictions. most of these results are discussed in the early and intermediate papers [55–60], and more detailed analyses are still coming. the herschel satellite (co-eval with planck) has been able to observe the sze in a few pointed clusters with the spire instrument, equipped with an fts spectrometer working in the frequency range ∼ 600 ÷ 1200 ghz. the possibility of having sensitive spectroscopic measurements in these high-frequency bands opens the way to the deep astrophysical exploitation of the sze. as an example, two additional data points on the sze spectrum of the bullet cluster observed with herschel-spire [81] immediately made it possible to establish a number of properties of the thermal and non-thermal plasma superposition in the atmosphere of this strong merging cluster (see, e.g., [24, 63, 64]). this has been possible because the very high-ν part of the sze spectrum contains detailed information on the relativistic effects on the single thermal plasma and on the presence of additional plasmas of either thermal or non-thermal nature (see discussion in [24, 26]). planck and herschel observations of the sze are in fact opening a rich field of investigation that will fully blossom in the coming years with the full exploitation of spatially-resolved spectroscopic sze observations. in the following sect. 6 we discuss some of the astrophysical studies possible with the sze, and in sect. 7 we address their cosmological relevance. 6. astrophysical impact studying the sze in various cosmic atmospheres provides many insights into their energetics, pressure and dynamical structure. the combination of the sze with other emission mechanisms related to the same particle distribution (i.e., synchrotron, high-e ics emission, bremsstrahlung emission) provides further information on the radiation, matter and magnetic fields that are co-spatial with the electrons producing the sze. these properties of the sze concern various cosmic structures. 6.1. galaxy clusters galaxy clusters are the largest containers of dark matter (dm); of diffuse hot (and probably also warm) thermal plasma, as probed by x-ray emission from the icm; of cosmic rays and diffuse non-thermal plasma, as indicated by diffuse radio emission; and of ∼ µg amplitude magnetic fields, as indicated by diffuse radio emission and faraday rotation measures. they also have cool cores, likely heated by cosmic rays ejected and/or accelerated in radiogalaxy (rg) jets/lobes, whose late evolution might produce large icm cavities filled with relativistic or non-thermal plasma [21]. the sze spectra of the various electron populations (see fig. 1) show quite different shapes that reflect the different electron spectra and pressures (energy densities), and their analysis can be used to disentangle the plasma stratification of the cluster atmosphere. precise observations of the sze at microwave and mm wavelengths are crucial for unveiling the detailed structure of cluster atmospheres, their temperature distribution, and the possible presence of suprathermal and/or non-thermal plasma, because the high-frequency part (i.e., at ν ≳ 350 ghz or x ≳ 6) of the sze spectrum is more sensitive to the relativistic effects of the electron momentum distribution [14, 19, 23].
this is even more so for galaxy clusters with a complex plasma distribution, as found for powerful merging clusters, like the exemplary case offered by the bullet cluster (1es0657−56) [24]. powerful merging events in galaxy clusters can, in fact, produce an additional high-t plasma distribution (if the electron acceleration time scale at the merging shocks is longer than their equilibration time scale [78]), or an additional non-thermal population (produced either in a merging process with a very short acceleration time scale, or by secondary electrons from p–p collisions after the high-e protons have been accelerated by the merging and accumulate in the cluster region on long time scales [79]). the quasi-stationary case provided by the competition between particle thermalization, stochastic acceleration and momentum diffusion [35] can develop a subrelativistic electron distribution tail and can produce suprathermal (or nearly non-thermal) regions in the cluster atmosphere. a quantitative estimate of the temperature inhomogeneity (stratification) along the line of sight is possible using sze data alone, which provide a measure of the temperature standard deviation of the cluster plasma along the line of sight. we found that the bullet cluster has a temperature standard deviation of 10.6 ± 3.8 kev [65]. this result (obtained for the first time with sze measurements) shows that the temperature distribution in the bullet cluster is strongly inhomogeneous, and provides a new method for studying galaxy clusters in depth. a study of the multifrequency (from ∼ 30 to ∼ 850 ghz) sze signal observed in the bullet cluster shows, in fact, the presence of a thermal plasma at ∼ 13.9 kev coexisting with a second plasma component, either at higher temperature (≈ 25 kev) or, more plausibly, of a non-thermal origin [24] (see fig. 3). figure 3. the sze spectrum at the bullet cluster center, modeled with a thermal plus non-thermal plasma: thermal plasma with kt = 13.9 kev and τ = 1.1 × 10^−2 (dot-dashed); non-thermal plasma with p1 = 1, s = 2.7 and τ = 2.3 × 10^−4 (dotted); total sze produced by the sum of the two plasmas (solid). additional observations of the bullet cluster at ν ∼ 400 ghz, with a precision ≲ 1 % of the expected signal, will be able to further distinguish between the two cases of a non-thermal power law or a suprathermal tail [24]. sze observations over a wide frequency range, and especially with high sensitivity in the high-ν range, can also add relevant information on the electron distribution function (df) in the icm, a subject that – even though relevant for a proper analysis of the sze – has not been addressed in detail so far. the relativistic kinetic theory on which the df derivation is based is still a subject of numerous debates (see discussion in [63]). sze observations can separate sze spectral changes caused by a departure from the diffusive approximation based on the kompaneets approach [42] from those due to using a relativistically correct df instead of a maxwell–boltzmann df (see fig. 4), and can therefore set constraints on the actual electron df [63]. figure 4. the sze intensity spectra for a massive cluster with a temperature of kt_e = 15.3 kev for jüttner (solid) and maxwell–boltzmann (dashed) dfs. the non-relativistic sze spectrum is also shown for comparison. figure from [63].
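to illustrate why the choice of df matters at these temperatures, the following python sketch – an illustration, not the actual machinery of [63] – compares the relativistic maxwell–jüttner momentum distribution with the non-relativistic maxwell–boltzmann one at kt_e = 15.3 kev, the temperature quoted in figure 4, writing both in the dimensionless momentum p = γβ.

```python
import numpy as np
from scipy.special import kn  # modified bessel function K_n of integer order

def f_juttner(p, theta):
    """maxwell-juttner distribution in dimensionless momentum p = gamma*beta,
    theta = kT / (m_e c^2); normalized so the integral over p is 1."""
    return p**2 * np.exp(-np.sqrt(1.0 + p**2) / theta) / (theta * kn(2, 1.0 / theta))

def f_maxwell_boltzmann(p, theta):
    """non-relativistic maxwell-boltzmann distribution in the same variable."""
    return np.sqrt(2.0 / np.pi) * p**2 * np.exp(-p**2 / (2.0 * theta)) / theta**1.5

theta = 15.3 / 511.0          # kT_e = 15.3 keV, m_e c^2 = 511 keV
p = np.linspace(1e-4, 2.0, 20001)
print("normalizations:", np.trapz(f_juttner(p, theta), p),
      np.trapz(f_maxwell_boltzmann(p, theta), p))
# the relativistic df carries more weight at high momenta:
for pi in (0.3, 0.6, 0.9):
    print(f"p = {pi}: juttner/mb = {f_juttner(pi, theta)/f_maxwell_boltzmann(pi, theta):.2f}")
```

the relativistic df is noticeably enhanced at high momenta, which is precisely the part of the electron population that shapes the high-ν sze spectrum.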
this analysis is best carried out in hot massive clusters, because the sze intensity changes due to using a relativistically correct df instead of a maxwell–boltzmann df are much larger there, relativistic sze corrections scaling as ∝ t^{5/2}. a method used to derive the electron df from sze multifrequency observations of massive hot clusters [63] makes use of a fourier representation of the approximate electron df, whose parameters are best fitted using observations in the (optimal) frequency channels at 375, 600, 700 and 857 ghz. a morphological analysis of the sze observed at various frequencies adds relevant information for assessing the pressure and energy density structure of cluster atmospheres. morphological sze differences are particularly evident for clusters undergoing violent mergers, which create large inhomogeneities of the electron df. sze intensity maps of merging clusters obtained from hydrodynamical simulations show that the morphology of the maps observable with laboca (at 345 ghz) and herschel-spire (at 857 ghz) is rather different [64] (see fig. 5). for a bullet-like cluster, the sze intensity map at 857 ghz has a spatial feature caused by the presence of the cold bullet-like substructure also seen in the x-ray surface brightness map. however, this cold substructure is not present in the sze intensity map at 345 ghz. this is a consequence of the relativistic effects of the sze, and shows that observations of the sze intensity maps at very high frequencies can reveal complex pressure substructures within the atmospheres of massive galaxy clusters. for the cluster a2219, the sze intensity map at 857 ghz (obtained by using the chandra density and temperature maps [49]) shows evidence for a large-scale shock-heated region and a very hot region coincident with the peak of the x-ray surface brightness (see fig. 5). this result shows that the analysis of the sze signal at 857 ghz, correlated with lower-ν observations, offers a promising method for unveiling high-t regions in massive merging clusters using available experiments like, e.g., laboca and herschel-spire. in more relaxed clusters, spectroscopic measurements of the sze over a wide frequency band make it possible to derive precise information on the temperature distribution and on the cool-core nature independently of x-ray priors [23], and hence to reconstruct the full set of cluster physical parameters [31]. polarization measurements of the (thermal and non-thermal) sze can add further information on the transverse plasma motions within the cluster and on the pressure substructure of the plasma. sze polarization signals in galaxy clusters are quite low, typically below the mjy (or µk) level even for high-t clusters (see fig. 6). figure 6. the thermal sze polarization spectrum π_th (blue solid) from a stacking analysis of 20 clusters with kt > 10 kev and τ > 0.03, compared with the same analysis for the ksze polarization spectrum π_k (red dotted). the statistical uncertainties refer to a stacking analysis produced with core [29]. it is interesting, however, to note that the sze polarization in clusters has quite different spectra from the intensity sze spectrum.
combining intensity and polarization observations of the sze can uncover unique details of the 3d (projected and along the line of sight) velocity structure of the icm, of its 3d pressure structure, and of the influence of a structured magnetic field on the stratification of the icm, and therefore provides a full tomography of cluster atmospheres. analogously, the combination of intensity and polarization observations of the kinematic sze (and its frequency dependence) can yield crucial information on the 3d distribution of the cosmological velocity field traced by galaxy clusters. specifically, the ratio δi_th/π_th yields direct information on the plasma optical depth τ, and the ratio δi_th/π_v on the combination τ β_t, thus allowing intensity and polarization sze measurements to be used to fully disentangle the pressure and velocity structure of cluster atmospheres. sze polarization measurements are quite difficult to obtain with present-day experiments, and they are also at the limit of next-generation experiments. the sensitivity required to disentangle these signals at a high (∼ 3σ) confidence level should be of order ∼ 10 µjy. however, a stacking analysis of even small samples (of order ∼ 20) of hot and dense galaxy clusters observed at multiple frequencies would make it possible to determine statistically the polarization signals of the thermal sze for clusters with kt > 10 kev and τ > 0.03 (see fig. 6) in the optimal frequency range ≈ 90 ÷ 250 ghz. 6.2. cluster cavities the atmospheres of galaxy clusters often show the presence of bubbles filled with high-e particles and magnetic fields that are sites of bright radio emission and produce cavities in the cluster x-ray emission. cavities with diameters ranging from a few to a few hundred kpc have been observed by chandra in the x-ray emission maps of several galaxy clusters and groups [8, 47]. while the properties of these cavities and of the relativistic plasma they contain are usually studied by combining x-ray and radio observations, an alternative and efficient strategy is to study the sze produced by the high-energy electrons filling the cavities [16, 53], whose amplitude, spectral and spatial features depend on the overall pressure and energetics of the relativistic plasma in the cavities. as an example, the overall sze observable along the line of sight (los) through a cluster containing cavities (see fig. 7 for the case of the cluster ms0735.6+7421) is the combination of the non-thermal sze produced by the cavity and the thermal sze produced by the surrounding icm. due to the different ν-dependence of the thermal and non-thermal sze, the non-thermal sze from a cluster cavity shows up uncontaminated at frequencies ν ≈ 220 ghz: at this frequency, in fact, the overall sze from the cluster reveals only the ics of the electrons residing in the cavities, without the intense thermal sze that dominates at lower and higher frequencies. the cavity’s sze becomes dominant again at very high ν (x ≳ 14 or ν ≳ 800 ghz), where the non-thermal electrons dominate the overall ics emission (see fig. 7). the cavity’s sze is more spatially concentrated than the overall cluster sze, because it emerges only from the cavity regions: this fact allows the overall energetics and pressure structure of the cavity’s high-e particle population and the b-field structure to be studied in combination with x-ray and radio images.
figure 5. from top to bottom: the sze signal-to-noise ratio maps for the cluster 1e0657−558 at 345 ghz, smoothed to the resolution of laboca, and at 857 ghz, smoothed to the resolution of herschel-spire; analogous maps for the cluster a2219. figures from [64]. the observation of the crossover of the non-thermal sze from the cavities (which depends on the value of e_min(p1) or, equivalently, on the value of p_cavity) provides a way to determine the total pressure, and hence the nature, of the electron population within the cavity [16], evidence which adds crucial, complementary information to the x-ray and radio analysis. alternative studies have been performed by assuming that cluster cavities contain a high-t plasma (∼ 10^9 ÷ 10^10 k) [62]. in this case, the sze fluxes from cocoons in the central part of a distant elliptical and of a nearby galaxy cluster are of the same order. for a high-t plasma, the cocoon’s sze spectrum is rather flat at high ν, resembling the shape of the non-thermal sze from cavities. in this high-t plasma model, however, no strong radio emission at ν ≳ 1 ghz (as is instead observed) is expected from the cocoon, unless the cocoon’s b-field is very high, b ≳ 10^3 µg. 6.3. radiogalaxy lobes studies of (giant) radio-galaxy (rg) lobes (see, e.g., [9, 30, 39, 41] and references therein) have shown that these extended structures contain relativistic electrons that are currently available to produce both low-ν synchrotron radio emission and ics of the cmb (as well as of other background radiation) photons. as a consequence, an sze from the lobes of rgs is inevitably expected [20]. such a non-thermal, relativistic sze has a specific spectral shape that depends on the shape and energy extent of the spectrum of the electrons residing in rg lobes. the sze emission from rg lobes is expected to be co-spatial with the corresponding ics x-ray emission [20], to which its spectral properties are also related. in fact, the spectral slope of the ics x-ray emission, α_x = (α − 1)/2 (where f_ics ∼ e^−α_x), can be used to set the electron energy spectral slope α (where n_e ∼ e^−α) necessary to compute the sze spectrum, and to check its consistency with the synchrotron radio spectral index α_r = (α − 1)/2 (where f_synch ∼ e^−α_r), which is expected to have the same value [20, 25]. figure 7. top: the geometry of the cavities in the cluster ms0735.6+7421. bottom: the sze spectrum computed at a projected radius of ≈ 125 kpc from the cluster center, where the los passes through the center of the northern cavity. the thermal sze (blue), the non-thermal sze from the cavity (black) and the total sze (red) are shown. the non-thermal sze is normalized to the cavity pressure p = 6 × 10^−11 erg cm^−3, and is shown for various values of p1. figure from [16]. the sze in rg lobes has not been detected yet: only loose upper limits have so far been derived on the sze from these sources (see [7] for a review; see also a recent attempt [80] to detect this effect at radio wavelengths in the giant radio galaxy b1358+305). a detection of the sze from rg lobes can provide a determination of the total energy density and pressure of the electron population in the lobes [20], allowing the value of e_min to be determined once the slope of the electron spectrum is known from radio and/or x-ray observations.
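as a small worked example of the consistency check just described, the sketch below (python; the index values are hypothetical, chosen only for illustration) recovers the electron slope α from either observable and compares the two estimates.

```python
def electron_slope_from_xray(alpha_x):
    """electron energy spectral slope alpha (n_e ~ E^-alpha) from the
    ics x-ray spectral index alpha_x = (alpha - 1) / 2."""
    return 2.0 * alpha_x + 1.0

def electron_slope_from_radio(alpha_r):
    """same slope from the synchrotron radio spectral index alpha_r = (alpha - 1) / 2."""
    return 2.0 * alpha_r + 1.0

# hypothetical measured indices for an rg lobe (illustrative values only)
alpha_x, alpha_r = 0.85, 0.80
a_x = electron_slope_from_xray(alpha_x)
a_r = electron_slope_from_radio(alpha_r)
print(f"alpha from ics x-rays: {a_x:.2f}; alpha from radio: {a_r:.2f}")
# agreement of the two estimates supports a single electron population
# producing both the ics x-rays and the synchrotron radio emission
```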
sze measurements provide much more accurate estimates of the electron pressure/energy density than other techniques, like ics x-ray emission or synchrotron radio emission, since the former can only provide an estimate of the electron energetics in the high-energy part of the electron spectrum, and the latter is sensitive to the degenerate combination of the electron spectrum and the magnetic field in the radio lobes. the combination of sze observations (which depend on the electron distribution and on the known cmb radiation field) and radio observations (which depend on the combination of the electron distribution and the magnetic field distribution) provides an unbiased estimate of the overall b-field in the lobe by using the ratio f_radio/f_sze ≈ e_b/e_cmb (the ratio of the magnetic to the cmb radiation energy density), which is more reliable than that obtained from the combination of ics x-ray (or gamma-ray) and radio emission [25].
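a minimal python sketch of the energy-density argument behind this ratio follows, treating a measured f_radio/f_sze as a direct estimate of u_b/u_cmb, with u_cmb = a t^4, t = t_0(1 + z) and u_b = b^2/8π; the flux ratios used are hypothetical, illustrative values.

```python
import numpy as np

A_RAD = 7.5657e-15   # radiation constant [erg cm^-3 K^-4]
T_CMB0 = 2.725       # present-day CMB temperature [K]

def b_field_from_flux_ratio(ratio, z):
    """magnetic field [gauss] implied by f_radio/f_sze ~ u_B/u_cmb:
    u_cmb = a * T^4 with T = T0 * (1+z); u_B = B^2 / (8 pi)."""
    u_cmb = A_RAD * (T_CMB0 * (1.0 + z))**4
    return np.sqrt(8.0 * np.pi * ratio * u_cmb)

# hypothetical flux ratios for a lobe at z = 0.3 (illustrative only)
for ratio in (0.1, 1.0, 10.0):
    b_ug = b_field_from_flux_ratio(ratio, 0.3) * 1e6
    print(f"f_radio/f_sze = {ratio:5.1f}: B = {b_ug:.2f} microgauss")
```

at z = 0 and a ratio of unity this recovers the familiar ∼ 3.2 µg cmb-equivalent field, which is a useful sanity check on the scaling.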
the spatially resolved study of the sze and synchrotron emission in rg lobes also provides an indication of the radial behaviour of both the leptonic pressure and the magnetic field, from the inner parts to the boundaries of the lobes. the study of the pressure evolution in rg lobes can provide crucial indications on the transition from radio lobe environments to the atmospheres of the giant cavities observed in galaxy cluster atmospheres (see, e.g., [48] for a review), which seem naturally related to the penetration of rg jets/lobes into the icm (see fig. 7). a substantial sze polarization is also expected in rg lobes, due both to coherent transverse motions of the plasma along the jet/lobe direction and to the electron pressure substructures induced by, e.g., plasma inhomogeneities and magnetic field turbulence. the transverse velocity-induced polarization is π_v ∝ τ_rel(β_t τ_rel), and the multiple-scattering-induced polarization is π_τ ∝ τ_rel p_rel, where p_rel is the pressure of the relativistic electron distribution. observations of the sze and its polarization in rg lobes can therefore yield direct information on the electrons’ τ_rel and β_t in the rg lobe. 6.4. galaxies hot gas trapped in a dm halo can produce an sze. a typical galaxy halo might hence show an integrated thermal sze at the level of ≲ 0.5 mjy arcmin^−2 from a plasma with t ∼ 10^6 k and density n_e ∼ 10^−3 cm^−3, extended over ∼ 50 kpc in the inter-stellar medium (ism). due to this fact, it has been claimed that the halo of m31 may be one of the brightest integrated sze sources in the sky [73]: for various realistic gas distributions consistent with current x-ray limits, the integrated sze decrement from m31 could be comparable to decrements already detected in more distant sources, provided that its halo contains an appreciable quantity of hot gas. a measurement of a galaxy halo sze would provide direct information on the mass, spatial distribution and thermodynamic state of the plasma in a low-mass galactic halo, and could place important constraints on current models of galaxy formation. detecting such an extended, low-amplitude signal will be challenging, but possible with all-sky sze maps from planck. an sze is also expected from galaxy outflows swept by galaxy (hyper)winds. a thermal sze is expected to arise from the shocked bubble plasma in a strong galaxy wind described by a simple, spherical blast wave model [67]. however, such a simple recipe for the sze from galaxy winds must be modified by the presence of cosmic rays and magnetic fields in the expanding wind [28], leading on the one hand to a more complex sze spectrum, and on the other to an amplification of the overall sze at high frequencies, thus increasing the detection probability. given its low amplitude and small spatial extent, the sze from galaxy winds will only be observable with high-sensitivity, high-resolution telescopes like alma and ska. 6.5. superclusters and the whim on the very large scales of the universe, the sze can be used to trace the distribution of baryons in the supercluster environment and in the warm-hot intergalactic medium (whim). a strong temperature decrement in the cmb towards the core of the corona borealis supercluster has been detected at 33 ghz [37]. multifrequency observations with vsa and mito suggest the existence of a thermal sze component in the spectrum of this cold spot, with y = 7.8 (+4.4, −5.3) × 10^−6 [4], which would account for roughly 25 % of the total observed temperature decrement towards this supercluster. n-body marenostrum universe sph simulations have been further used to study the thermal and kinematic sze associated with superclusters of galaxies, and in particular superclusters with characteristics (i.e., total mass, overdensity and number density of cluster members) similar to those of the corona borealis supercluster [36]. these simulations show, however, that the whim lying in the inter-cluster regions within the supercluster produces a thermal sze much smaller than the value observed by mito/vsa. 7. cosmological relevance the redshift-independent nature of the sze allows this effect to be used as a powerful cosmological probe [2, 3, 7, 12, 34, 50], through both the redshift evolution of the cluster abundance and direct probes of cosmological parameters. the sze has a wide range of cosmological applications, which we briefly review. 7.1. cosmological scenarios galaxy cluster surveys are powerful tools for studying the nature of de. in principle, the equation-of-state (p = wρ) parameter w of the de and its time evolution can be extracted from large solid angle, high-yield surveys that can deliver tens of thousands of clusters [50]. this is possible with sze cluster surveys like the act, spt and planck surveys (see fig. 8). the sensitivity of the cluster redshift distribution to the parameter w for a large sze cluster survey has been widely discussed (see, e.g., [50]). figure 8. left: forecasts for the geometry constraints from the spt+des galaxy cluster survey, the snap sne ia mission, and planck. right: the expected redshift distribution (top) and quantified differences (bottom) for models where only the de equation-of-state parameter w is varied; these models are normalized to produce the same local abundance of galaxy clusters. figures from [50]. the sensitivity to the cosmological volume dominates at intermediate redshifts z ∼ 0.5, and the sensitivity to the growth rate dominates at z > 1, where it is easier to disentangle the various forms of the de equation of state. there are several requirements which must be in place to achieve precise cosmological constraints from a cluster survey: i) a firm theoretical understanding of the formation and evolution of massive dm halos; ii) a clean and well-understood selection of large numbers (≳ 10^4) of massive dm halos (i.e. galaxy clusters) over a wide redshift range; iii) a cluster survey observable y that correlates with the cluster halo mass m (i.e., an m–y relation; a simple sketch of such a scaling is given below); and iv) redshift estimates for each cluster.
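to make requirement iii) concrete, here is a minimal python sketch of the self-similar expectation y ∝ m^(5/3) e(z)^(2/3) for such an m–y relation; the normalization y_star and the pivot mass m_pivot are hypothetical placeholders, and this is the self-similar scaling only, not the calibrated relation of any actual survey.

```python
import numpy as np

OMEGA_M, OMEGA_L = 0.3, 0.7  # assumed flat lambda-cdm background

def e_of_z(z):
    """dimensionless hubble rate E(z) for a flat lambda-cdm cosmology."""
    return np.sqrt(OMEGA_M * (1.0 + z)**3 + OMEGA_L)

def y_self_similar(mass_msun, z, y_star=1e-4, m_pivot=3e14):
    """self-similar m-y scaling, y ~ m^(5/3) * E(z)^(2/3);
    y_star and m_pivot are hypothetical normalizations."""
    return y_star * (mass_msun / m_pivot)**(5.0 / 3.0) * e_of_z(z)**(2.0 / 3.0)

for m, z in [(2e14, 0.1), (5e14, 0.5), (1e15, 1.0)]:
    print(f"M = {m:.1e} Msun, z = {z}: y = {y_self_similar(m, z):.2e} (arbitrary units)")
```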
cluster cosmology is pursued by using a wide range of cluster observables, including the x-ray emission from the icm, the sze, the optical light and number of cluster galaxies, and the weak lensing shear signature of clusters. generally speaking, cluster selection is easier and the m–y relations are tightest in the x-ray; there is, however, a strong theoretical prejudice (supported by the results of hydrodynamical simulations) that the situation will be even better at mm wavelengths (i.e., using sze observations) [50]. the sze can check for potential traces of cluster evolution better than other cosmological probes [1, 33], in particular for galaxy clusters in the medium-mass regime. in addition, new techniques are being introduced that make improved use of other, less well studied cosmological tools, like, e.g., the combination of the sze and weak lensing properties of clusters. as an example, the combined spt+des cluster survey should provide a strong handle on the nature of the de [50]: in this case, cluster finding and masses should arise primarily from the sze data, and the des optical data will provide photometric redshifts. we further emphasize here that cluster surveys can be used to constrain far more general models than the constant-w de models, like a time variation of the de equation-of-state parameter [76, 77] or modified gravity scenarios [74]. the possibility of probing modified gravity scenarios derives from the fact that the high-mass tail of the cluster mass function is enhanced w.r.t. the standard de-cdm model, and this yields both an increase in the number of high-m clusters and a modification of their cosmological evolution with redshift. two methods have been proposed to use sze observations to set constraints on these models: i) studies of the redshift evolution of high-m clusters and of their y–m relation [24]; ii) studies of the sze angular power spectrum using small-scale cmb observation data [74]. new results and tighter cosmological constraints are expected from the planck, spt and act cluster surveys. 7.2. dark matter search dark matter (dm) annihilation in the halos of galaxies and galaxy clusters has relevant astrophysical implications (see, e.g., [22] for a review). neutralinos (χ) which annihilate inside a dm halo produce quarks, leptons, vector bosons and higgs bosons, depending on their mass and physical composition. electrons are then produced from the decay of the final heavy fermions and bosons, providing a continuous spectrum of secondaries (apart from the monochromatic electrons, with energy of about m_χ, coming from the direct channel χχ → e±). the different composition of the χχ annihilation final state will in general affect the form of the final secondary electron spectrum [13, 17, 18]. the sze induced by secondary electrons produced in dm (χ) annihilation (sze_dm) is an unavoidable consequence of the presence and of the nature of dm in large-scale structures, and its spectral shape depends on the neutralino mass and composition [15]. a substantial sze_dm is likely to dominate the overall sz signal of a galaxy cluster at frequencies x ≳ 3.8 ÷ 4.5, providing a negative total sze that is at variance w.r.t.
the positive or null thermal sze from the cluster icm, and allows constraints to be set in the 〈σv〉–m_χ exclusion plot for dm models [15]. it is, however, necessary to stress that in such a frequency range there are other possible contributions to the sze, like the ksze and a possible non-thermal sze (see fig. 1), which could provide additional biases. nonetheless, the spectral shape of the sze_dm is quite different from that of the ksze and of the thermal sze, and this allows it to be disentangled from the overall sze signal. an appropriate multifrequency analysis of the overall sze, based on observations performed over a wide spectral range (from the radio to the sub-mm region), is required to separate the various sze contributions and to provide an estimate of the dm-induced sze [15]. the sze_dm signal also does not strongly depend on the assumed dm density profile at intermediate angular distances from the cluster center, or on the dm clumpiness, since y_dm is the integral of the total dm pressure p_dm along the line of sight (see eq. 2). the sze_dm is one of the cleanest probes for indirect dm searches, since it only depends on the dm particle nature and on the cmb photon background. an sze_dm is expected in every dm-dominated cosmic structure, from the most massive ones (galaxy clusters) to the smallest bound dm halos, thus producing temperature decrements (at ν ≲ 400 ghz) in all dm halos which do not host thermal or relativistic plasma (e.g., dwarf spheroidal galaxies and/or satellites and sub-clumps of normal galaxies), and an integrated comptonization of the cmb radiation plus additional anisotropies in the microwave and mm cosmic backgrounds. this effect represents the minimum guaranteed ics emission (from radio to sub-mm frequencies, and also at high-e up to γ-rays) of dm halos in a universe dominated by χ dm. the multi-ν nature of the dm-induced emission [17, 18, 21], together with the analysis of the sze_dm and the relative temperature anisotropy spectrum, provides crucial complementary probes of the presence and the nature of dm. 7.3. the magnetized universe after recombination, primordial magnetic fields generate additional density fluctuations that enhance the number of formed galaxy clusters, so that cluster counts and their z-distribution can be used to set constraints on the structure of the primordial b-field. the results of the chandra x-ray cluster survey exclude the existence of primordial b-fields with amplitude larger than ∼ 1 ng [72]. future sze cluster surveys have an enhanced sensitivity to constrain primordial magnetic fields, because of the z-independence of the sze and of the smaller biases in the cluster mass reconstruction. the planck and spt cluster surveys have the potential to constrain primordial b-fields with sub-ng amplitude [72] (see fig. 9). 7.3.1. cosmological velocity field the ksze is a powerful probe of the 3d velocity distribution of large-scale structures and of the distribution of the missing baryons in the universe. however, the ksze signal is overwhelmed by various contaminations, and its cosmological application is hampered by the loss of redshift information due to the projection effect. a ksze tomography method can nonetheless alleviate these problems, with the aid of galaxy spectroscopic redshift surveys.
specifically, it has been proposed to estimate the large-scale peculiar velocity through the 3d galaxy distribution, weigh it by the 3d galaxy density, and adopt the product projected along the line of sight, with a proper weighting, as an estimator of the true ksze temperature fluctuation [69]. figure 9. the sze cluster number counts expected for planck and for spt. in both panels, the solid, dashed, dash-dotted and dotted lines represent the number counts for the cases b = 3 ng, b = 1 ng, b = 0.5 ng and b = 0 ng, respectively. the number counts for b = 0 ng and σ8 = 0.95 (corresponding to the σ8 for b = 3 ng) are also shown (thin solid line) for comparison. figure from [72]. first evidence for a non-zero mean pairwise cluster momentum due to the ksze signal was obtained by act [38], but it can also be interpreted as a measure of baryons on cluster length scales. 7.4. other cosmological applications detailed models of the structure formation processes and reionization history of the universe (including a consistent description of galaxies, quasars and matter density fluctuations) make it possible to compute the secondary cmb anisotropies generated by the ksze [75]. the contribution due to patchy reionization seems, however, to be negligible except at large angular scales (θ ≳ 3.5 arcmin) and small wavenumbers (ℓ < 10^3). over this range, which corresponds to scales larger than the typical size of hii regions, the signal actually comes from the cross-correlation of ionized bubbles, induced by the correlations of the rare radiation sources. on smaller scales, the intergalactic medium (igm) contribution is governed by the fluctuations of the matter density field itself. however, over this range the secondary anisotropies should be dominated by the contribution from galactic halos, which are characterized by smaller scales than the igm (and larger densities). this leads to a cutoff of the cmb anisotropy power spectrum at a large wavenumber ℓ ∼ 10^6, while at low wavenumbers ℓ < 10^3 a white-noise power spectrum is recovered. thus, observations of these secondary cmb anisotropies should probe the correlation properties of the underlying matter density field, through the correlations of the hii regions and the small-scale density fluctuations. the (thermal and kinematic) sze and its polarization are important tools to test the homogeneity of our universe through probes of the copernican principle (cp) [46]. in fact, large variations of the thermal sze, induced by cmb photons that have a temperature significantly different from the blackbody temperature we observe directly and that arrive at a galaxy cluster from points inside our light cone, could indicate a violation of the cp and of homogeneity. analogously, large variations of the cmb dipole measured by the cluster ksze could also indicate a violation of the cp and of homogeneity. the sze polarization contains more refined information on the cmb temperature, and thus it can also provide a powerful probe of the cp and homogeneity. thus, observations of a large sze and of its polarization w.r.t. the expectations for an sze produced from a pure blackbody cmb spectrum might provide indications of a non-flrw universe. finally, the evolution of the cmb temperature can be probed out to substantial redshifts by combining sze observations of galaxy clusters [45] (setting constraints at low-moderate z) with sze observations of rg lobes (able to set constraints even at high z) [20].
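this test is commonly expressed through the parametrization t(z) = t_0 (1 + z)^(1−β), where β = 0 recovers the standard adiabatic scaling; the sketch below (python) compares hypothetical sze-derived temperatures – illustrative numbers, not real measurements – with that scaling.

```python
T0 = 2.725  # present-day CMB temperature [K]

def t_cmb(z, beta=0.0):
    """cmb temperature evolution, parametrized as T(z) = T0 * (1+z)^(1-beta);
    beta = 0 is the standard adiabatic scaling."""
    return T0 * (1.0 + z)**(1.0 - beta)

# hypothetical sze-derived temperatures (z, T_measured [K], sigma [K]);
# illustrative values only, not real data
measurements = [(0.2, 3.28, 0.15), (0.55, 4.25, 0.30), (0.9, 5.15, 0.45)]
for z, t_meas, sigma in measurements:
    pull = (t_meas - t_cmb(z)) / sigma
    print(f"z = {z}: measured {t_meas} K, standard {t_cmb(z):.2f} K, pull {pull:+.2f} sigma")
```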
these sze-derived temperature data will provide further constraints on the energy release mechanisms possibly occurring at early cosmic ages. 8. future directions the impact of sze observations for astrophysical and cosmological applications has been steadily increasing since the advent of dedicated experiments and large-scale surveys in the microwave and mm frequency range. the quality and the spatial resolution of sze images are reaching the level of x-ray images, and the spectral coverage is systematically extending into the mm region. there is a strong theoretical motivation (supported by analytical calculations, hydrodynamical simulations and some preliminary data) that the situation will be even better when the sze study is extended to sub-mm wavelengths, where the highest spatial resolutions can be achieved, with a wide spectral coverage able to decipher the physical details of the electron distribution in the atmospheres of cosmic structures. new paths of theoretical investigation are underway, and concern both the detailed study of the ics mechanism in cosmic atmospheres and the impact of the various plasma structures and fields in cosmic structures. the possibility of performing precise measurements of the various sze signals, and of extracting the relevant astrophysical information, depends crucially on the capability of obtaining spatially-resolved spectral observations of sze sources over a wide ν band, from radio to sub-mm. in particular, the important condition for such studies is to have wide-band continuum spectroscopy, and especially good spectral coverage and sensitivity in the high-frequency band, where most of the astrophysical effects reveal themselves most clearly. spatially resolved spectroscopic and polarimetric observations of the sze in the frequency range from ∼ 100 ghz to ∼ 1 thz are the key to improving our understanding of the structure of cosmic atmospheres through analysis of the intensity and polarized sze signals, and will allow this technique to be used to probe the structure of the universe. acknowledgements s.c. acknowledges useful and enriching discussions with p. de bernardis, v. dogiel, r. maartens, p. marchegiani, s. masi and d. prokhorov. s.c. acknowledges support by the south african research chairs initiative of the department of science and technology and national research foundation and by the square kilometre array (ska). references [1] ade, p. et al. 2011, arxiv:1101.2024 [2] aghanim, n. et al. 1997, a&a, 325, 9 [3] barbosa, d. et al. 1996, a&a, 314, 14 [4] battistelli, e. et al. 2006, apj, 645, 826 [5] baumann, d. & cooray, a. 2003, new astron. rev., 47, 839 [6] benson, b.a. et al. 2004, apj, 617, 829 [7] birkinshaw, m. 1999, physics reports, 310, 97 [8] birzan, l. et al. 2004, apj, 607, 800 [9] blundell, k.m. et al. 2006, apj, 644, l13 [10] bonometto, s. et al. 1970, a&a, 7, 292 [11] challinor, a. & lasenby, a. 1998, apj, 499, 1 [12] colafrancesco, s. et al. 1997, apj, 479, 1 [13] colafrancesco, s. & mele, b. 2001, apj, 562, 24 [14] colafrancesco, s., marchegiani, p. & palladino, e. 2003, a&a, 397, 27 [15] colafrancesco, s. 2004, a&a, 422, l23 [16] colafrancesco, s. 2005, a&a, 435, l9 [17] colafrancesco, s., profumo, s. & ullio, p. 2006, a&a, 455, 21 [18] colafrancesco, s., profumo, s. & ullio, p. 2007, phrvd, 75, 3513 [19] colafrancesco, s. 2007, new astronomy reviews, 51, 394 [20] colafrancesco, s. 2008, mnras, 385, 2041 [21] colafrancesco, s. 2010, in “multi-frequency behaviour of high-energy cosmic structures”, mmsait, 81, 104 [22] colafrancesco, s.
2010, in “astrophysics and cosmology after gamow”, aipc, 1206, 5 [23] colafrancesco, s. & marchegiani, p. 2010, a&a, 520, 31 [24] colafrancesco, s., marchegiani, p. & buonanno, r. 2011, a&a, 527, l1 [25] colafrancesco, s. & marchegiani, p. 2011, a&a, 535, 108 [26] colafrancesco, s., marchegiani, p., de bernardis, p. & masi, s. 2012, a&a in press [27] colafrancesco, s. & tullio, m. 2012, preprint [28] colafrancesco, s. et al. 2011, in preparation [29] the core collaboration 2011, a white paper, arxiv:1102.2181 [30] croston, j.h. et al. 2005, apj, 626, 733 [31] de bernardis, p., colafrancesco, s. et al. 2012, a&a, 538, 86 [32] de petris, m. et al. 2002, apj, 574, l119 [33] delsart, p., barbosa, d. & blanchard, a. 2010, a&a, 524, 81 [34] diego, j.m., martinez, e., sanz, j.l., benitez, n. & silk, j. 2002, mnras, 331, 556 [35] dogiel, v., colafrancesco, s., ko, c.m. et al. 2007, a&a, 461, 433 [36] flores-cacho, i. et al. 2009, mnras, 400, 1868 [37] genova-santos, r. et al. 2008, mnras, 391, 1127 [38] hand, n. et al. 2012, arxiv:1203.4219 [39] harris, d.e. & krawczynski, h. 2002, apj, 565, 244 [40] itoh, n., kohyama, y. & nozawa, s. 1998, apj, 502, 7 [41] kataoka, j. et al. 2003, a&a, 410, 833 [42] kompaneets, a.s. 1957, soviet phys. jetp, 4, 730 [43] lamarre, f. et al. 1998, apj, 507, l5 [44] lavaux, g. et al. 2004, mnras, 347, 729 [45] luzzi, g. et al. 2009, apj, 705, 1122 [46] maartens, r. 2011, arxiv:1104.1300 [47] mcnamara, b.r. et al. 2000, aas, 32.13211 [48] mcnamara, b.r. & nulsen, p.e.j. 2007, arxiv:0709.2152 [49] million, e.t. & allen, s. 2009, mnras, 399, 1307 [50] mohr, j. 2004, in “observing dark energy”, arxiv:astro-ph/0408484v1 [51] nozawa, s., kohyama, y. & itoh, n. 2010a, arxiv:1009.2311 [52] nozawa, s., kohyama, y. & itoh, n. 2010b, phrvd, 82, 3009 [53] pfrommer, c., ensslin, t. & sarazin, c.l. 2005, a&a, 430, 799 [54] phillips, p.r. 1995, apj, 455, 419 [55] planck collaboration 2011, a&a, 536, a8 [56] planck collaboration 2011, a&a, 536, a9 [57] planck collaboration 2011, a&a, 536, a10 [58] planck collaboration 2011, a&a, 536, a11 [59] planck collaboration 2011, a&a, 536, a12 [60] planck collaboration 2011, arxiv:1112.5595p [61] pointecouteau, e. 2012, in proceedings of the moriond meeting ‘cosmology’ [62] prokhorov, d., antonuccio-delogu, v. & silk, j. 2010, arxiv:1006.2564 [63] prokhorov, d., colafrancesco, s. et al. 2011a, a&a, 529, 39 [64] prokhorov, d., colafrancesco, s. et al. 2011b, mnras, 416, 302 [65] prokhorov, d. & colafrancesco, s. 2012, mnras, 424, l49 [66] rephaeli, y. & lahav, o. 1991, apj, 372, 21 [67] rowe, b. & silk, j. 2010, mnras in press, arxiv:1005.4234v2 [68] sazonov, s. & sunyaev, r. 1999, mnras, 310, 765 [69] shao, j. et al. 2010, arxiv:1004.1301 [70] sunyaev, r.a. & zel’dovich, y.b. 1972, comm. astrophys. space phys., 4, 173 [71] sunyaev, r.a. & zeldovich, ia.b. 1980, ara&a, 18, 537 [72] tashiro, h. et al. 2010, arxiv:1010.4407v1 [73] taylor, j.e. et al. 2003, mnras, 345, 1127 [74] tsutomu, k. & hiroyuki, t. 2009, mnras, 398, 477 [75] valageas, p., balbi, a. & silk, j. 2001, a&a, 367, 1 [76] wang, s. et al. 2004, phys. rev. d, 70, 123008 [77] weller, j., battye, r.a. & kneissl, r. 2002, phys. rev. lett., 88, 231301 [78] wolfe, b. & melia, f. 2006, apj, 638, 125 [79] wolfe, b. & melia, f. 2008, apj, 687, 193 [80] yamada, m. et al. 2010, aj, 139, 2494 [81] zemcov, m. et al.
2010, a&a, 518, l16. acta polytechnica 53(supplement):560–572, 2013. acta polytechnica doi:10.14311/ap.2015.55.0187 acta polytechnica 55(3):187–192, 2015 © czech technical university in prague, 2015 available online at http://ojs.cvut.cz/ojs/index.php/ap an experimental investigation into moisture-induced expansion of plasters radoslav sovjáka,∗, tomáš koreckýb, aleksander gundersenc a czech technical university in prague, faculty of civil engineering, experimental centre, thákurova 7, cz-166 29 prague b czech technical university in prague, faculty of civil engineering, department of materials engineering and chemistry, thákurova 7, cz-166 29 prague c norwegian university of science and technology, faculty of engineering science and technology, høgskoleringen 6, no-7491 trondheim ∗ corresponding author: sovjak@fsv.cvut.cz abstract. this paper presents an experimental study on the moisture-induced expansion of selected plasters. a contactless measurement is introduced, and a coefficient of moisture expansion for different building plasters is established. it is found that the stresses which might develop in building materials due to moisture variations are equal to or higher than the stresses which might be caused by temperature variations. keywords: adsorption; absorption; hygric expansion; hydric expansion; plasters; experimental measurement. 1. introduction building materials and structures are subjected to a number of climate actions. many buildings become disfigured soon after their completion, or later after many repetitions of climate actions, or as a result of synergies with other physical or chemical effects. typical defects include the cracking of finishes and the spalling of surfaces. many authors agree that the mechanisms responsible for such failures are usually associated with length or volume changes of porous materials due to increased moisture content or to temperature changes [1–3]. deformation caused by moisture changes is called hygric in the range between 0 % rh and 95 % rh, and hydric when the material is in contact with or immersed in water [4]. hygric deformation is due to the predisposition of the material to adsorb water. adsorption is a surface-based process, while hydric deformation is initiated when water is being absorbed into the sample, i.e., when a specimen is immersed in water; absorption involves the whole volume of the material. determining the purely hygroscopic strain can sometimes be difficult, especially in the direction parallel to the reinforcing fibres in the case of composite materials [5, 6]. in addition, hygric expansion is often not taken into account in practical measurements and calculations, although a higher content of moisture, particularly in the liquid state, can lead to hygric stresses in the same range as thermal stresses [7]. a higher coefficient of moisture expansion generally means higher stress and a higher demand on the mechanical properties [8].
the present study assesses the coefficient of moisture-induced expansion of plasters under both hygric and hydric loading. in addition, this study presents the long-term environmental effect on the water sorption capability of the plasters. 2. experimental program the hygric expansion was determined in the moisture range from dry material to 95 % rh. hygric wetting experiments were carried out in climatic chambers, and the hydric expansion was monitored after the samples were immersed in a water tank. hygric loading was simulated by increasing the relative humidity in the chamber to 40 %, 60 %, 80 % and 95 %. the relative humidity was kept constant at every humidity level for six weeks (fig. 1). the temperature was kept constant at 21 °c at all times. the common principle of using a laser sensor for dilatation measurements was utilized for monitoring the hygric and subsequent hydric expansion (fig. 2). the sensors measured distance without any contact with the front surface of the specimen. the specimen was placed between two laser sensors on a special stainless-steel mount developed for the purpose of this study (fig. 3). the measuring range of the laser sensors was 5 mm, they worked with a resolution of 0.01 % of the full-scale output, and their measuring rate was 1.5 khz. the weight of the specimen was determined at the end of each loading sequence on a digital scale with 0.01 g accuracy. the laser triangulation displacement sensor operated with a laser diode which projected a visible light spot onto the surface of the measurement target. “triangulation” refers to the measuring of a distance by calculating an angle. the light reflected from the spot was imaged by an optical receiving system onto a position-sensitive element. figure 1. climatic loading. figure 2. principle of the laser sensor. figure 3. measurement unit. if the light spot changed its position, this change was imaged on the receiving element and evaluated. the laser sensor used a semiconductor laser with a wavelength of 670 nm. the maximum optical output power was 1 mw. the sensor was classified as laser class ii. the laser beam was strongly focused, using a special lens design, to a spot only a few micrometres in diameter on the measured object. this was a particular benefit in the case of very small measured objects. even when measurements on structured surfaces are required, a small spot size is often advantageous. the basic characteristics of the laser triangulation displacement sensor used in this study are shown in tab. 1. table 1. basic parameters of the laser sensor: measuring range 5 mm; start/end of measuring range 20/25 mm; operation temperature 0–50 °c; operation relative humidity 5–95 %; measuring rate 1.5 khz; static resolution 0.6 µm. 3. materials the materials used in this study were specimens of commonly used plasters which are routinely in contact with outdoor environments. the moisture-induced strains were determined for the reference samples and also for samples which had been subjected to an outdoor environment for one year. the history of the temperature actions and the relative humidity actions on the one-year-old samples is shown in figs. 4 and 5, respectively. the specimens of plasters used in this study were 160 mm in length; the side of the rectangular cross-section was 40 mm.
the selected plasters used in this study are commercially available in the czech republic. the basic properties of these plasters are shown in tab. 2. a detailed characterisation of the plasters used in this study can be found in [9]. the consistency of the plasters was determined by the standard flow table test described in čsn en 1015-3, and all plasters reached a consistency of 160/160 mm. the coefficient of linear thermal expansion for plasters can usually be found in the range from 6.2 × 10^−6 k^−1 to 15 × 10^−6 k^−1 [8]. 4. results the sorption isotherms of the analysed plasters are presented in fig. 6. the highest value of water sorption was achieved by the lightweight plaster p2, in both the reference and the one-year-old measurements. it is interesting to note that all the water sorption isotherms gained after one year in outdoor environments presented a reduced capacity for water sorption. table 2. characterisation of the commercial plasters used in this study [9]; columns: commercial name, composition, w/c, bulk density [kg/m^3]. plaster p1: baumit mpa 35, plaster with lime and cement, 0.22, 1244. plaster p2: baumit thermo putz, lightweight plaster with perlite, 0.4, 452. plaster p3: baumit sanova omítka w, renovation plaster, 0.31, 1183. plaster p4: baumit sanova pufferová omítka, renovation plaster (brown coat), 0.34, 1118. plaster p5: baumit mvr uni, plaster suitable for aerated-concrete walls, 0.24, 1292. figure 4. the temperature history. table 3. fitting parameters for the reference samples; columns: a [10^−6], w0 [10^−3], r^2. plaster p1: −37.0, 23.5, 0.999. plaster p2: −72.5, 20.0, 0.999. plaster p3: −36.5, 19.6, 0.999. plaster p4: −47.2, 29.7, 0.997. plaster p5: −28.9, 22.4, 0.996. an approximation and smoothing of the measured moisture-induced strains for the individual plasters (fig. 7) was described, as follows, as a function of moisture content: ε(w) = a/(w + w0) − a/w0, (1) where ε is the moisture-induced strain, w is the moisture content by volume, and a and w0 are fitting constants. this equation was chosen in order to fit the measured data with the best reliability, as presented in tab. 3 and tab. 4. the fitting parameters presented in tab. 3 and tab. 4 were determined by regression analysis, using the least-squares method with the assumption that the data are independently and normally distributed with a common standard deviation. the maximum moisture contents by volume for the individual samples, and the maximum strains obtained after the wetting experiments, are listed in tab. 5. figure 5. the relative humidity history. table 4. fitting parameters for the one-year-old samples; columns: a [10^−6], w0 [10^−3], r^2. plaster p1: −4.61, 8.79, 0.999. plaster p2: −21.7, 10.9, 0.997. plaster p3: −8.96, 10.1, 0.994. plaster p4: −18.2, 18.1, 0.996. plaster p5: −11.2, 15.3, 0.991. the values are presented for the reference samples as well as for the samples which were subjected to environmental loading for one year. the data presented in tab. 5 show the maximum moisture content and the related moisture-induced strain. both parameters decreased when the plasters were subjected to an outdoor environment for one year prior to the testing. the value of the maximum moisture content for plaster p1, plaster p2, plaster p3, plaster p4 and plaster p5 was reduced by 13.4 %, 9.59 %, 23.9 %, 22.7 % and 8.82 % relative to the reference samples, respectively. the value of the maximum moisture-induced strain for plaster p1, plaster p2, plaster p3, plaster p4 and plaster p5 was reduced by 65.6 %, 41.5 %, 55.7 %, 35.0 % and 40.6 % relative to the reference samples, respectively.
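as a concrete illustration of the regression described above, the following python sketch fits eq. (1) by non-linear least squares to synthetic data generated from the plaster p1 reference parameters of tab. 3 (a = −37.0 × 10^−6, w0 = 23.5 × 10^−3); the data are illustrative only, not the measured dataset.

```python
import numpy as np
from scipy.optimize import curve_fit

def strain(w, a, w0):
    """eq. (1): moisture-induced strain as a function of moisture content w."""
    return a / (w + w0) - a / w0

# synthetic data from the plaster p1 reference parameters of tab. 3
# (a = -37.0e-6, w0 = 23.5e-3); illustrative only, not the measured data
a_true, w0_true = -37.0e-6, 23.5e-3
w = np.linspace(0.0, 0.472, 25)   # moisture content by volume [m^3/m^3]
eps = strain(w, a_true, w0_true) + np.random.default_rng(1).normal(0.0, 2e-5, w.size)

(a_fit, w0_fit), _ = curve_fit(strain, w, eps, p0=(-1e-5, 1e-2))
print(f"fitted a = {a_fit:.3e}, w0 = {w0_fit:.3e}")
# the maximum strain at w = 0.472 should be close to the 1.49e-3 of tab. 5
print(f"strain at w = 0.472: {strain(0.472, a_fit, w0_fit):.3e}")
# derivative of eq. (1), i.e. the coefficient of moisture expansion
# introduced below as eq. (2)
print(f"alpha_h at w = 0.1: {-a_fit / (0.1 + w0_fit)**2:.3e}")
```

the recovered maximum strain of ≈ 1.5 × 10^−3 reproduces the tab. 5 value for plaster p1, and the derivative of the fitted function gives the coefficient of moisture expansion discussed next.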
the coefficient of moisture expansion αh [–], including both hygric and hydric deformations, was determined as the derivative of the ε(w) function presented in (1). figure 6. sorption isotherms of the studied plasters. the coefficient can therefore be expressed as follows: αh = −a/(w + w0)^2, (2) where a, w and w0 represent the same parameters as described in (1). the coefficient of moisture expansion for the individual plasters is presented in fig. 8. 5. conclusions and further outlook the coefficients of moisture-induced expansion, covering both hygric and hydric strains, were established in this study for different plasters. the moisture-induced strains demonstrated the significant influence of the moisture content on the deformation which might be induced in plasters due to moisture changes. these changes might be equal to or higher than the stresses caused by temperature variations. table 5. maximum moisture content by volume and moisture-induced strain; columns: moisture [m^3 m^−3], strain [10^−3]. plaster p1 – reference: 0.472, 1.49; plaster p1 – one year old: 0.409, 0.514; plaster p2 – reference: 0.649, 3.49; plaster p2 – one year old: 0.587, 2.04; plaster p3 – reference: 0.536, 1.82; plaster p3 – one year old: 0.408, 0.806; plaster p4 – reference: 0.555, 1.53; plaster p4 – one year old: 0.429, 0.996; plaster p5 – reference: 0.432, 1.23; plaster p5 – one year old: 0.393, 0.731. figure 7. strain development of the studied plasters. in addition, it was verified experimentally that both the strain and the moisture content decreased when the samples were subjected to outdoor environments prior to the testing. the maximum moisture content of the one-year-old plasters which were subjected to environmental actions was reduced by 8 % to 24 %, and the maximum strain was reduced by 35 % to 66 %, relative to the reference samples. the reference samples and the one-year-old samples were presented in this study. it is important to note that it is a fairly complex and highly time-consuming task to make proper measurements of moisture expansion. this is the main reason why no experimental work has been performed on older samples so far. it is therefore highly desirable to extend the scope of the study presented here, and to verify the moisture-induced expansion on samples subjected to outdoor environments for more than a year. acknowledgements the authors gratefully acknowledge the support provided by the czech science foundation under project number gap 105/12/g059. the authors would also like to acknowledge the assistance given by the technical staff of the experimental centre, faculty of civil engineering, ctu in prague, and by the students who participated in the project. references [1] baker, m. c.: thermal and moisture deformations in building materials, canadian building digest, 56, 1964, p. 1-4. [2] ritchie, t.: moisture expansion of clay bricks and brickwork, national research council canada, division of building research, 1975. [3] mcneilly, t., brick, c.: moisture expansion of clay bricks: an appraisal of past experience and current knowledge, brick development research institute, 1985. figure 8. coefficient of moisture expansion for the studied plasters.
[4] siegesmund, s., dürrast, h.: physical and mechanical properties of rocks, in: stone in architecture, springer, 2014, pp. 97-224. doi:10.1007/978-3-642-45155-3_3 [5] ramezani-dana, h., casari, p., perronnet, a., fréour, s., jacquemin, f., lupi, c.: hygroscopic strain measurement by fibre bragg gratings sensors in organic matrix composites – application to monitoring of a composite structure, composites part b: engineering, 58, 2014, p. 76-82. doi:10.1016/j.compositesb.2013.10.014 [6] schulgasser, k.: moisture and thermal expansion of wood, particle board and paper, international paper physics conference, cppa tech. section, 1987, p. 53-63. [7] toman, j., černý, r.: coupled thermal and moisture expansion of porous materials, int. j. thermophys., 17, 1996, p. 271-277. doi:10.1007/bf01448229 [8] černý, r., kunca, a., tydlitát, v., drchalová, j., rovnaníková, p.: effect of pozzolanic admixtures on mechanical, thermal and hygric properties of lime plasters, constr. build. mater., 20, 2006, p. 849-857. doi:10.1016/j.conbuildmat.2005.07.002 [9] čáchová, m., koňáková, d., vejmelková, e., keppert, m., polozhiy, k., černý, r.: heat and water vapor transport properties of selected commercially produced plasters, advanced materials research, trans tech publ, 2014, p. 90-93. www.scientific.net/amr.982.90 [2015-06-01]. acta polytechnica 55(3):187–192, 2015. acta polytechnica doi:10.14311/ap.2015.55.0415 acta polytechnica 55(6):415–421, 2015 © czech technical university in prague, 2015 available online at http://ojs.cvut.cz/ojs/index.php/ap verification of census devices in transportation research jan petru∗, michal kludka, vladislav krivda, ivana mahdalova, karel zeman vsb – technical university of ostrava, faculty of civil engineering, department of transport constructions, ostrava-poruba, czech republic ∗ corresponding author: jan.petru@vsb.cz abstract. the article presents a comparison of three devices and two methods that are used to count traffic flow. all measurements were carried out at a roundabout in ostrava, where the following devices were used: viacount ii, icoms tms-sa, and nu-metrics nc-200 traffic analyzers. the methods of manual counting of vehicles and of counting vehicles from video footage were also used. the article provides a comparison of the results obtained, namely in terms of traffic intensity and of the measured vehicle lengths and speeds. further, we evaluate the results and explore the deviations from reality and the reasons why they occur. the article concludes with a recommended procedure designed to eliminate the identified problems, in order to ensure the most accurate results, with no significant deviations. keywords: road transport; traffic engineering; traffic flow; intelligent transport system; traffic census devices. 1. introduction obtaining relevant traffic data is one of the most important phases of the work of a traffic engineer. the most exact information about the intensities of traffic flows, their composition, speed, variability, etc., is very important for follow-up analyses, prognoses, etc.
at present, there are two basic ways of getting traffic data: from manual counting and from automated counting (or their combinations). each of these methods has its advantages and disadvantages, and the choice always depends on the purpose of the measurements and the required accuracy. during our professional work and cooperation with practice, we have encountered a problem in terms of the suitability or unsuitability of certain types of automated census devices. some devices are able to measure vehicles only in one lane, others in two or more lanes, and some devices are able to measure vehicles coming in the opposite direction. when devices are used for complicated measurements, the device must first be calibrated, for example according to the distance of the device from the lane. the procedures are described in the relevant manuals [1], but based on our experience and the expertise of practitioners, such procedures are not always applicable. this article evaluates and subsequently compares data obtained from various counting devices measuring the intensity of traffic on a roundabout in the city of ostrava. the measurements were made using classic manual traffic counting, counting from video footage, a nu-metrics nc-30x traffic analyzer placed in the roadway, a nu-metrics nc-200 traffic analyzer placed on the road surface, and two radars, the viacount ii and icoms tms-sa devices [2]. the raw data obtained from all types of measurements were then processed into a form enabling a mutual comparison of the measured results, so that a conclusion determining the reliability or error rate of the respective counting methods and devices could be made. this article is based on measurements made for the sgs "verification of recording methods in traffic monitoring" project.

2. location of measurement
the measurements took place on the outskirts of the city of ostrava, in its plesna district. this site is located in the north-western part of the city, as can be seen in fig. 1. a roundabout with three arms is located on the i/11 road – opavska street, about 600 meters from the city borders. the through direction of this intersection is formed by the already mentioned i/11 road; the third arm is prubezna street, which serves as a collecting local road for the ostrava-pustkovec district and also provides access to the globus hypermarket. this intersection is located on one of the main access routes into the city, and the intensity measured by the measuring instruments corresponds to this fact. in the morning, the traffic intensity in the direction to the city dominates, whereas in the afternoon the intensity in the opposite direction is higher.

figure 1. location of the intersection for the purpose of the measurements (the directions to opava and to the center, and the globus hypermarket, are marked).

3. devices and methods used
3.1. nu-metrics nc-200 and nu-metrics nc-30x traffic analyzer
the nu-metrics nc-200 traffic analyzer is used for measuring vehicle speed and the intensity and composition of the traffic flow on roads. the analyzer is installed in the middle of the reference lane, directly on the roadway. the analyzer is then covered with a special rubber cover that is attached by eight screws into the road surface.
to measure the number, speed and type of vehicles, the nu-metrics nc-200 uses vmi technology (vehicle magnetic imaging). the device enables the allocation of vehicles to 13 length classification groups and 15 speed groups. the nu-metrics nc-200 can detect vehicles traveling at speeds from 13 to 193 kph. the nu-metrics nc-30x is the older type of the above-mentioned traffic analyzer. the card is installed in the middle of the reference lane into a 10 mm wide slot in the road surface, cut by a milling machine, which is then covered by a rubber strip. the device placed in the slot is sealed in a plastic bag that protects it against unfavourable conditions. the traffic analyzers used for our measurement were borrowed from the ostravske komunikace a.s. company; see [3] (p. 16) to learn more about the use of these devices by the company.

3.2. viacount ii
viacount ii is a device used for counting vehicles in road transport. it is designed to measure vehicles traveling either in one lane in one direction, or in two lanes in opposite directions. for each vehicle, viacount ii records its speed, a value proportional to the length of the vehicle, and the time lag between two vehicles. the date of the measurement in the "dd.mm.yy" format and the time in the "hh.mm.ss" format are assigned to a block of measured data and stored in the memory of the detector. this counting device consists of a 24.165 ghz doppler radar, integrated ram, a serial rs 232 output and a 12 v/18 ah lead-gel battery. this battery enables approximately 15 days of continuous measurement.

3.3. icoms tms-sa
the tms-sa is an individual portable device designed for temporary counting of road traffic, classification of vehicles and measurement of their speed. the functionality is provided by a doppler radar (24.125 ghz) powered by a 6 v/12 ah rechargeable battery, providing energy for a three-week-long measurement. the weight of the radar is 6.4 kg, and a four-point mount at the rear of the device is used for installation. the vehicle speed range for measurement is 10 to 255 kph. the device is able to measure the traffic in one lane or in two lanes of opposite directions. the device was borrowed from the ostravske komunikace a.s. company, which also downloaded the data from the device [4].

3.4. manual counting
vehicles were also counted using the classical method of manual counting in the field into prepared forms. during the counting, different types of vehicles and their directions were distinguished. the records were made in quarter-hour intervals by four people. during the counting phase, the intersection was recorded by a video camera. another counting was subsequently made from the recording, with the same focus and into the same forms.

3.5. hand radar
a handheld speed ii radar was used. it is a model with a measurement accuracy of ±2 kph in the speed range from 16 kph to 322 kph. the device measures at distances from 27 m, for small and slowly moving targets, up to 457 m. in our measurements, it was used to verify the proper operation of the viacount ii and tms-sa devices at the beginnings of the measurements.

figure 2. location of the respective devices in the field.
3.6. location of devices in the field
seven counting devices and methods were used at the roundabout. the location of the respective elements in the field was as shown in fig. 2.

3.7. evaluation of the data
the evaluation was carried out in several stages. the first stage consisted of obtaining raw data from all devices. for this step, specific software provided by the manufacturer of each device was used. during the second stage, the raw data from the individual counting methods and devices were unified to the same format; it was necessary to unify the numerical outputs so that they could be subsequently compared. the last stage consisted of entering all collected data into ms excel, which offers the possibility to create a visual comparison of the values measured by the individual devices or methods. measuring by the devices took place throughout the whole day; the manual counting method and the video record (video analysis) were made in the morning and afternoon rush hours [5]. as an example, the counting leg of prubezna street will be used in this article. this leg was selected because the largest number of counting methods was used there.

3.8. the intensity of vehicles
the first compared criterion was the value of traffic intensity, as measured by the respective measuring devices and methods. traffic levels have a major impact on the design of the roadway and the parameters of the subsoil [5, 6]. the measured value was subsequently compared with the actual value, which was determined based on the video record. the video record was analysed independently by three persons, in order to count the true number of vehicles and eliminate the error that an individual can make. the difference between the values measured by the respective methods or devices used and the actual number of vehicles is noticeable in tab. 1.

method                      measured intensity   difference from reference   percentage difference
video record (reference)    363                  0                           0.00 %
icoms tms-sa                352                  −11                         −3.13 %
manual counting             351                  −12                         −3.42 %
nu-metrics nc-200           343                  −20                         −5.83 %
viacount ii                 339                  −24                         −7.08 %

table 1. the measured values on the prubezna street leg – intensity for the direction from the globus hypermarket.

the difference between the highest and the lowest value is 24 vehicles, which at the detected hourly intensity of 363 vehicles means 7.08 %. the measured value most similar to reality was the value of the icoms tms-sa radar, which differed by 11 vehicles. conversely, the biggest difference between the measured values, and therefore the worst result, was from the viacount ii device.
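the percentage differences in tab. 1 can be reproduced in a few lines of python (values copied from the table); note that the published percentages relate the difference to each device's own count rather than to the reference value:

    # sketch: differences of each counting method from the video-record reference
    reference = 363  # vehicles per hour, video record on the prubezna street leg
    measured = {
        "icoms tms-sa": 352,
        "manual counting": 351,
        "nu-metrics nc-200": 343,
        "viacount ii": 339,
    }
    for method, n in measured.items():
        diff = n - reference
        print(f"{method}: {diff:+d} vehicles, {100.0 * diff / n:+.2f} %")
    # reproduces the percentages quoted in tab. 1 (up to rounding)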
3.9. the speed of vehicles
the second comparison looked at the vehicle speeds determined by the following three devices: viacount ii, icoms tms-sa and the nu-metrics nc-200 traffic analyzer. for better clarity, the results were divided into five speed categories. for each category, the number of vehicles that moved at this speed was determined. the measurement was based on the values on the prubezna street leg as of july 17, 2014, between 10 a.m. and 5 p.m. all the measuring devices were placed in such a way that they were focused on the same spot on the road, so as to ensure the most accurate comparison possible. from the three acquired values, the arithmetic average was determined, and it was then taken into account in the mutual comparison of the devices. tab. 2 and fig. 3 show the number of vehicles that were moving at a certain speed during the measurement.

device              0–10 kph   11–20 kph   21–30 kph   31–40 kph   41–50 kph   total
viacount ii         30         109         831         1302        158         2430
icoms tms-sa        22         219         785         1237        218         2481
nu-metrics nc-200   94         142         738         1208        271         2453
average value       49         157         785         1249        216

table 2. comparison of the number of vehicles in different speed categories.

figure 3. comparison of the number of vehicles in different speed categories.

the values show that the devices mostly match in the speed range of 20 to 40 kph. there are significant differences between the devices at low speeds. the traffic analyzer recorded a much larger number of vehicles traveling at a speed of up to 10 kph than both radar devices. also worth mentioning are the significantly higher number of vehicles traveling at a speed of 11 to 20 kph measured by the icoms tms-sa device and, on the other hand, the significantly lower number of vehicles traveling at a speed of 41 to 50 kph measured by the viacount ii device. tab. 3 shows the percentage difference of the measured values from the average. the sign indicates whether the device counted more or fewer vehicles than the average measured value. the total inaccuracy of each device compared to the average value was then calculated from all its deviations.

device              0–10 kph   11–20 kph   21–30 kph   31–40 kph   41–50 kph   average deviation
viacount ii         −38.36 %   −30.43 %    5.90 %      4.24 %      −26.74 %    21.13 %
icoms tms-sa        −54.79 %   39.79 %     0.04 %      −0.96 %     1.08 %      19.33 %
nu-metrics nc-200   93.15 %    −9.36 %     −5.95 %     −3.28 %     25.66 %     27.48 %

table 3. the percentage difference from the average number of vehicles.

figure 4. the percentage difference from the average number of vehicles.

the smallest deviation, namely 19.33 %, which is, however, still a relatively high value, was measured by the icoms tms-sa device. in general, we can say that outside the speed range of 20 to 40 kph, all the devices differ quite a lot from each other in their measurements, as can be seen in fig. 4.
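the average deviations in tab. 3 follow from tab. 2 as the mean absolute percentage deviation of each device from the three-device average; a short python sketch of this arithmetic (counts copied from tab. 2):

    # sketch: per-category deviation from the three-device average and the
    # mean absolute deviation per device (the arithmetic behind tab. 3)
    counts = {  # vehicles per speed category, 0-10 ... 41-50 kph
        "viacount ii":       [30, 109, 831, 1302, 158],
        "icoms tms-sa":      [22, 219, 785, 1237, 218],
        "nu-metrics nc-200": [94, 142, 738, 1208, 271],
    }
    averages = [sum(col) / len(counts) for col in zip(*counts.values())]
    for device, row in counts.items():
        devs = [100.0 * (c / avg - 1.0) for c, avg in zip(row, averages)]
        mean_abs = sum(abs(d) for d in devs) / len(devs)
        print(device, [f"{d:+.2f} %" for d in devs], f"mean |dev| = {mean_abs:.2f} %")
    # viacount ii: 21.13 %, icoms tms-sa: 19.33 %, nu-metrics nc-200: 27.48 %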
3.10. the length of the vehicles
the last comparison looked at the viacount ii and icoms tms-sa devices and the nu-metrics nc-200 traffic analyzer with respect to the measured length of the vehicles. the viacount ii device measures the so-called electronic length; for further use, it was necessary to convert it to the length in meters, using tables provided by the manufacturer of the device. the measured length of the vehicles allows classification of the vehicles into five categories. tab. 4 shows the number of vehicles in each category, where the vehicles were divided according to the measured length of the vehicle. with the nu-metrics nc-200 traffic analyzer, the figure for bicycles and motorcycles is missing, because the device is not able to measure these vehicles unless they pass directly over it.

device              bicycle, motorbike   car    lorry   bus, truck   truck with semi-trailer   total
viacount ii         39                   2007   332     35           17                        2430
icoms tms-sa        43                   2038   341     40           19                        2481
nu-metrics nc-200   –                    2101   303     33           16                        2453
average value       41                   2049   325     36           17

table 4. comparison of the number of vehicles in different length categories.

the comparison of the measured lengths of the vehicles showed that in this case viacount ii came out as the best device, having a total deviation of 2.73 %. the icoms tms-sa radar and the traffic analyzer had deviations of around 6.25 %, as can be seen in tab. 5. the radar counted more vehicles than average, whereas the traffic analyzer counted fewer. fig. 5 provides a clear presentation.

device              bicycle, motorbike   car       lorry     bus, truck   truck with semi-trailer   average deviation
viacount ii         −4.88 %              −2.03 %   2.05 %    −2.78 %      −1.92 %                   2.73 %
icoms tms-sa        4.88 %               −0.52 %   4.82 %    11.11 %      9.62 %                    6.19 %
nu-metrics nc-200   –                    2.55 %    −6.86 %   −8.33 %      −7.69 %                   6.36 %

table 5. the percentage difference from the average number of vehicles.

figure 5. the percentage difference from the average number of vehicles.

4. discussion
the obtained measurement results pose several questions about the reliability and suitability of the respective devices for a particular measurement. before the measurement commences, it is necessary to ask how long the actual measurement will last, so that relevant results can be achieved; based on the final evaluation, we can then choose the method that will work best. this article only discusses a measurement using the devices on one roundabout. however, the devices were tested at several roundabouts with a similar placement of the devices, and similar deviations occurred. it would be advisable to also place the devices on a different type of intersection or on straight sections, to subsequently compare the results from the obtained data, and to evaluate the placement of the respective devices on the given type of intersection, or to recommend only a certain type of device for the counting itself. currently, further measurements and testing of the devices are taking place, for example in [7, 8]. the devices are placed on roads with two and more lanes, with variable distances and heights of the mounted devices. the results of these projects will subsequently be analyzed and published in other articles.

5. errors in measurement
during the measurement, we observed several situations that caused measurement inaccuracies, and this section describes the most important of them. the problem with the method of manual counting of vehicles into field forms consists mainly in the fact that, with the increasing length of the measurement, the attention of the person who performs the counting decreases. also, if the intensity of the traffic increases significantly, the person carrying out the counting is no longer able to write down all the vehicles and put them into the correct categories.
in our experimental measurement, this method worked best, but the measurement took place at a leg with a relatively low intensity of traffic, and the duration of the measurement in the afternoon peak was only two hours. similar problems occur with the method of manual counting from a video record, since there is a possibility that the view is blocked, for example, by a large vehicle. with the nu-metrics nc-200 and nu-metrics nc-30x traffic analyzers, the accuracy of measuring the traffic intensity is also high; inaccuracy occurs when a vehicle does not pass directly over the device. this happens while overtaking another vehicle or when the vehicle passes the card only with a tire. if there is a traffic jam and a vehicle stops above the device, this also results in an inaccurate measurement, especially in terms of speed and length, which are determined from the time the vehicle spends above the card. with the radar-based systems (viacount ii, icoms tms-sa), there is a problem when the transmitted beam is blocked. this occurs most often when a bigger vehicle is passing at a low speed through the radar beam. the radar beam does not reach the farther lane through the bigger vehicle in the closer lane, and the vehicles that pass there at that time are not included.

6. recommendations
to ensure the most accurate measurements, it is necessary to consider the above-mentioned findings, whether they concern the detected accuracy of measurement or the problems causing inaccuracy. in the case of a long-term measurement, it is definitely beneficial to use one of the available devices. however, it is necessary to choose a good spot to place the device and to take into consideration the possibility of the occurrence of traffic jams, or of conditions that do not enable accurate measurement. we recommend choosing the station for a video camera in such a way that the view of the measured section is not blocked. the same applies to the station of the people performing the manual counting. for the manual counting method, it is also important to ensure a sufficient number of people to carry out the measurements, and to make sure that they are properly trained in how the survey is conducted. the location of the nu-metrics traffic analyzers should be chosen so that they are positioned in the middle of the lane, and the analyzers should also be placed at a sufficient distance from the intersection to prevent any stopping of vehicles above them. sections with no overtaking, or sections where the roadway is narrowed so that the position of the traffic analyzer does not allow vehicles to pass the analyzer in the wrong way, are suitable. this device is not suitable for counting bicycles and motorbikes [9]. devices using radar should be placed on structures next to the roadway that provide a direct view of the place of interest. it is necessary to avoid places where vehicles go too slowly, or even stand still. if the device is used for counting traffic flow in both directions, it is necessary to take into account the fact that if the transmitted beam is blocked in the adjacent lane (for example, by a standing truck with a semi-trailer), the vehicles passing in the second lane are not recorded at all, and there is sometimes a considerable distortion of the actual number of vehicles. it seems advisable to place the device above the roadway (for example, on a bridge structure), where an independent view of both lanes can be ensured [10].
acknowledgements
the work was supported by the student grant competition vsb-tuo. the project registration number is sp2014/207.

references
[1] vaisala: nu-metrics portable traffic analyzer nc200. vaisala, 2010. http://www.approachnavigation.com/docs/portable-traffic-analyzer_brochure.pdf [2015-07-20].
[2] laboratory of traffic engineering. http://kds.vsb.cz/ldi/ [2015-01-09].
[3] laza, j.: information about traffic in ostrava 2014 (in czech: informace o dopravě v ostravě 2014). ostravske komunikace a.s., department of transportation engineering, 2014, published by the city of ostrava. http://okas.cz/userfiles/rocenka_doprava_2013.pdf
[4] ostravske komunikace a.s. http://www.okas.cz/ [2015-07-26].
[5] krivda, v.: video-analysis of conflict situations on selected roundabouts in the czech republic. communications – scientific letters of the university of zilina, 2011, vol. 13, no. 3, p. 77–82. http://www.uniza.sk/komunikacie
[6] krivda, v.: analysis of conflict situations in road traffic on roundabouts. promet – traffic & transportation: scientific journal on traffic and transportation research, university of zagreb, faculty of transport and traffic sciences, 2013, vol. 25, no. 3, p. 295–303. doi:10.7307/ptt.v25i3.296, http://www.fpz.unizg.hr/traffic/index.php/promtt
[7] senkova, e.: comparison of various types of traffic counters. thesis, vsb – technical university of ostrava.
[8] hohn, p.: ostrava sequentional, ostravske komunikace [email communication], 2014 [2015-07-26].
[9] celko, j., decky, m., kovac, m.: an analysis of vehicle–road surface interaction for classification of iri in frame of slovak pms. maintenance and reliability, polish maintenance society, no. 1 (41), 2009, p. 15–21.
[10] decky, m., drusa, m., pepucha, l., zgutova, k.: earth structures of transport constructions. pearson education limited, edinburgh gate, harlow, essex cm20 2je, 2013. edited by martin decky, 180 pp.

acta polytechnica 53(supplement):724–727, 2013, doi:10.14311/ap.2013.53.0724
© czech technical university in prague, 2013, available online at http://ojs.cvut.cz/ojs/index.php/ap

do we see the ‘iron peak’?

anatoly d. erlykin a,b,∗, arnold w. wolfendale b
a p.n. lebedev physical institute, moscow, russia
b department of physics, durham university, durham, uk
∗ corresponding author: erlykin@sci.lebedev.ru
abstract. recent measurements of the cosmic ray (cr) energy spectrum in the pev region and above have confirmed the remarkable sharpness of the knee and revealed another structure at about 70 pev, which we call the ‘iron peak’. the position and the shape of this structure lead us to associate its likely origin with the same single source responsible for the knee. we have analysed the shape of the single source spectrum and concluded that its mass composition is rather similar to that of the bulk of cr in the tev ÷ pev region. since it is generally accepted that these cr originate mainly in supernova explosions, this gives an additional argument in favour of our single source being a supernova remnant.

keywords: cosmic rays, knee, single source, iron peak.

1. introduction
the origin of cosmic rays (cr) is usually studied by analysing the general shape of the energy spectrum, the mass composition and the anisotropy. additional information can, however, be obtained from the study of the detailed shape, or the ‘fine structure’, of the cr energy spectrum. the ‘structure’ can be divided into two broad and overlapping categories: sharp discontinuities (fine structure) and slow trends. anything other than a simple power law spectrum with an energy-independent exponent can be termed a ‘structure’. many simulations indicate that the dominant contribution to the observed cr is from nearby sources, where the non-uniformity of their space distribution plays a significant role. if the production of cr by these sources has an explosive character, as from supernovae (sn) and subsequent remnants (snr), then their random explosions make the non-uniformity of the cr space-time distribution even stronger. this has to result in the appearance of a fine structure in the cr spectrum at some level.

2. evidence for a fine structure in the knee region
the most prominent structure in the cr spectrum is the knee, at about 3 ÷ 4 pev. although it was discovered more than half a century ago, its origin is still debated. more than a decade ago we put forward a model in which the remarkable sharpness of the knee can be understood if a ‘single source’ is largely responsible [8, 9]. the cr component that makes the dominant contribution at the knee was assumed to be helium (he) nuclei [12]. an additional argument in favour of the single source model of the knee was the observation of a small bump in the 10 ÷ 20 pev energy interval, attributed to the cno group of nuclei. the sharpness of the knee is caused by the following: the source is in fact single, the spectrum of he nuclei is peak-like and has a sharp cut-off above the maximum acceleration rigidity of ∼ 1.5 ÷ 2 pv, and the amount of li, be, b nuclei between the he and cno peaks is negligible. in our paper [12], we assumed the possible existence of a further structure, a peak at an energy above 50 pev associated with the iron (fe) group of nuclei, but at that time we could not claim to have observed it. however, very recently several experiments which have good energy resolution and statistical accuracy have revealed irregular behavior of the spectrum in the 10 ÷ 100 pev energy range, with the possible existence of the bump above 50 pev. these experiments are gamma [15, 20], tunka-25,133 [4, 19], icetop [24], kascade-grande [6] and yakutsk [18].

3. origin of the fine structure
in the following, we construct a model of the primary cr spectrum (icr) in the knee region and above as composed of a smooth background (ibgrd) and a contribution from the single source (iss).
the background is produced by a multitude of various sources and has the shape of two power laws [23], far below the knee and above the knee. in the knee region, these two power laws are connected with each other by a smooth transition line with a sharpness of 0.3, inherent to the galactic diffusion model (gdm) [9]. in order to find the average contribution of the single source above the background, we have determined this contribution for 10 individual energy spectra and have averaged them. the results are shown in figs. 1 and 2.

figure 1. energy spectra of primary cr, measured by the following eas arrays: tibet-iii (a) [2], kascade (b) [3], gamma (c) [15], yakutsk (d) [17], maket-ani (e) [7], tunka-25 (f) [19], gamma2002 (g) [14], msu (h) [25], kascade-grande (i) [6] and andyrchi (j) [22]. the symbols ‘e’, ‘µ’ and ‘c’ in brackets indicate the measured eas components: electromagnetic, muon or cherenkov light, respectively. the upper full lines are fits of these spectra by the ter-antonyan and haroyan formula [23]. the lower full lines are the spectra expected if the best-fit sharpness obtained in the above fit is replaced by 0.3 – the value expected in the galactic diffusion model. the dashed lines are extrapolations of these fits to energies above the fitted range.

figure 2. fine structure of the primary cosmic ray energy spectrum, shown as the difference between the measured intensities and the ‘background’ (see the text): (a) the individual excesses; the key to the symbols associated with the arrays is given in the box below the dotted zero line, and the arrays are denoted by the same signs as in the panels of fig. 1. (b) the unweighted mean profile of the excess above the gdm with the likely charge assignments. the knee is at log(e/ek) = 0.

the excess over the background at log(e/ek) = 1.1 ÷ 1.3 is clearly seen in fig. 2b. if the knee position is at ek ≈ 4 pev and is caused by a dominant contribution of he nuclei, then the observed excess corresponds to energies of 50 ÷ 80 pev and to the contribution of the fe-group nuclei, if their cut-off energies are proportional to the nuclear charge z. since we adopted the approximation that icr = ibgrd + iss, the excess ∆ shown in fig. 2b is equal to

∆ = log(icr/ibgrd) = log(1 + iss/ibgrd),

and then

iss = ibgrd (10^∆ − 1). (1)

the mean background was taken of the same form [23] as used in the analysis of the 10 individual spectra. this background has a differential exponent of 2.7 at low energies, log e < 5, and 3.1 at high energies, log e > 8. the sharpness of the knee is s = 0.3 at log ek = 6.5. the absolute intensity has been taken at log e = 5 from [26].
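purely as an illustration of eq. (1), and not code from the paper, the sketch below converts a logarithmic excess ∆ into a single-source intensity; the smoothly broken power law is a simplified stand-in whose exact parameterization may differ from the ter-antonyan and haroyan form [23]:

    import numpy as np

    # background: differential exponents 2.7 (log e < 5) and 3.1 (log e > 8)
    # joined at log ek = 6.5; the sharpness s controls the transition width
    # (illustrative parameterization, normalization arbitrary)
    def log_i_bgrd(log_e, log_ek=6.5, g1=2.7, g2=3.1, s=0.3):
        y = (log_e - log_ek) / s
        return -g1 * log_e - (g2 - g1) * s * np.log10(1.0 + 10.0 ** y)

    # eq. (1): the single-source intensity from the measured excess delta
    def i_ss(log_e, delta):
        return 10.0 ** log_i_bgrd(log_e) * (10.0 ** delta - 1.0)

    # example: an excess delta at log(e/ek) = 1.2, i.e. near the 50-80 pev
    # region discussed in the text (delta value hypothetical)
    print(i_ss(6.5 + 1.2, 0.04))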
the energy spectrum of the single source was derived from this background and the excess ∆ (fig. 2b), using eq. 1, and it is shown by open squares in fig. 3.

figure 3. the energy spectrum of cr from the single source and its interpretation. the full line denoted as bgrd is the background spectrum used to convert the excess in fig. 2b to the single source spectrum (represented by open squares). the dotted lines are the best-fit contributions from 5 cr mass groups: p, he, cno, m and fe. the full line denoted as ss is the sum of these 5 components. full and open circles are from tunka [4, 19], asterisks from hires [1]. the dashed line above 50 pev in the upper plot is the spectrum expected for just the background and single source contributions, without the vela pulsar. the possible contribution of the vela pulsar was calculated assuming that it is an isolated pulsar [11] with 1 % of its rotation energy loss converted to the emission of just fe nuclei.

the relative mass composition has been derived by fitting 5 individual components, p, he, cno (carbon, nitrogen and oxygen), m (neon, magnesium, silicon and sulphur) and fe (argon, calcium and iron), to this single source spectrum. the shape of the energy spectrum for each mass group was taken as typical for a monogem-like snr. the maximum rigidity of the nuclei accelerated in the single source is taken to be the same for all the nuclei, so that the maximum energy is proportional to the charge z. since the energy spectra of all 5 mass groups in cr from the single source have an essentially non-power-law shape ‘on arrival’, it is unreasonable to describe the mass composition in terms of the relative fractions of their flux at a fixed energy per particle (or per nucleon): these fractions would vary strongly with energy. instead, we derive the abundances as the relative fractions of the energy content contained in each mass group with respect to the total energy carried by cr from the single source. the abundances obtained for the 5 mass groups in the cr from the single source are: p – 0.477, he – 0.406, cno – 0.081, m – 0.010, fe – 0.026, with an accuracy of typically 15 % for he and cno and 30 % for p, m and fe. for the ambient primary cr, with their steep power law energy spectra for all 5 mass groups, the mass composition at lower energies coincides with that of the background spectrum, since iss ≪ ibgrd, and there is no difference between the two definitions, since both give the same values for the abundances.

if all cr sources in the galaxy are the same, then the average injection abundances (at a fixed energy per nucleus) can be derived by dividing the observed cr abundances by the mean lifetime against escape from the galaxy. direct measurements of cosmic ray energy spectra have only been made up to about 10^5 gev/nucleus. as a datum, we use the directly measured intensities (fractions) from the atic measurements [21] at a lower energy of 10^3 gev/nucleus, where they have good accuracy. the necessary mean lifetimes for escape are taken from the ‘leaky box model’ of particle diffusion, in which the mean lifetime is proportional to z. the energy dependence of the lifetime τ(e) is taken as τ(e) = t0 e^−δ + τ0, where δ = 0.5 for the anomalous diffusion model [10], t0 = 4 × 10^7 years, τ0 = 1.1 × 10^4 years, and e is in gev. the ambient mass composition from panov et al. [21] at log e = 3, and the ‘effective’ injected mass composition recalculated from it, are shown in fig. 4 together with our single source composition.

figure 4. the ‘ambient’ cr mass composition taken from panov et al. [21] at log e = 3, compared with the ‘injected’ cr mass composition recalculated from the ‘ambient’ one using the ‘leaky box model’, and with our ‘single source’ composition. the similarity of the injected and single source compositions (apart from the m-component) is remarkable.

it is remarkable to observe the general similarity of our single source injection and the ‘injected’ mass compositions in the basic nuclei.
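the recalculation from ambient to injected abundances can be sketched as follows; the ambient fractions and the representative charges below are placeholders (the actual atic values are in [21]), and the lifetime normalization cancels in the relative result:

    # sketch: 'injected' abundances obtained by dividing ambient abundances by
    # the leaky-box escape time tau(e) = t0 * e**(-delta) + tau0 (e in gev),
    # taken proportional to the nuclear charge z as stated in the text
    t0, tau0, delta = 4e7, 1.1e4, 0.5      # years, years, diffusion index

    def tau_escape(e_gev, z):
        return z * (t0 * e_gev ** (-delta) + tau0)

    charge = {"p": 1, "he": 2, "cno": 7, "m": 12, "fe": 26}             # representative z
    ambient = {"p": 0.5, "he": 0.3, "cno": 0.1, "m": 0.06, "fe": 0.04}  # placeholders

    e = 1e3  # gev per nucleus, the atic datum energy used in the text
    injected = {k: ambient[k] / tau_escape(e, z) for k, z in charge.items()}
    norm = sum(injected.values())
    for k, v in injected.items():
        print(k, round(v / norm, 3))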
if indeed the bulk of the observed cosmic rays originate from snr, the similarity of the general ‘injected’ mass composition and our injected ‘single source’ mass composition gives an additional argument that our single source is also an sn and its subsequent snr.

4. discussion
haungs [16] interpreted the observed spectral flattening above 20 pev as due to the transition from the cno group to the m group of sub-iron nuclei (ne, mg, si and s), and the steepening above 100 pev as due to the end of galactic cr and the gradual transition to extragalactic cr. measurements of the shower age parameter by gamma [20] and of the eas maximum depth by the tunka-133 [5] experiment, as well as the spectrum of muon-rich showers by kascade-grande [13], show that the cr mass composition above 20 pev becomes progressively heavier, approaching one in which fe nuclei dominate. looking at fig. 3, we agree that such a structure at 20 pev exists. we can also say that the observed behaviour of the mass composition can be seen in our single source model. as for the steepening beyond 100 pev, measurements of the energy spectrum by gamma and tunka-133 give support to the existence of this structure, although with less statistical accuracy. they also hint that the steepening could occur at about 200 pev (fig. 3). an additional argument in favour of the end of the dominant fe is the trend towards a lighter mass composition above 100 ÷ 200 pev. we think that the interpretation of the gradual steepening above 100 pev in terms of the end of galactic cr is principally the same as in the galactic diffusion model, which ignores the evident sharpness of the knee. we prefer the scenario in which the steepening of the spectrum in this energy region means the end of the contribution of the single source, and the origin of cr above 100 pev is still galactic, but from a background of other, more powerful sources.

5. conclusion
we have examined the fine structure of the cr energy spectrum in the knee region and above, and have demonstrated how the analysis of this structure can help in studying the origin of cr in this energy region. we have shown that the new data give strong support to the conclusion that this fine structure is caused by the contribution of a single nearby and recent source, most likely an snr. the steepening of the cr energy spectrum beyond the ‘iron peak’ most likely indicates the end of the contribution of the single source.

acknowledgements
the authors are grateful to the dr. r. kohn foundation for financial support.

references
[1] abbasi r.u. et al., 2009, astropart. phys., 32, 53
[2] amenomori m. et al., 2008, astrophys. j., 678, 1165
[3] apel w.d. et al., 2009, astropart. phys., 31, 86
[4] berezhnev s.f. et al., 2011a, 32nd int. cosm. ray conf., beijing, 1, 209
[5] berezhnev s.f. et al., 2011b, 32nd int. cosm. ray conf., beijing, 1, 197
[6] bertaina m. et al., 2011, astrophys. space sci. trans., 7, 229
[7] chilingarian a.a. et al., 2007, astropart. phys., 28, 58
[8] erlykin a.d., wolfendale a.w., 1997, j. phys. g, 23, 979
[9] erlykin a.d., wolfendale a.w., 2001, j. phys. g, 27, 1005
[10] erlykin a.d. et al., 2003, astropart. phys., 19/3, 351
[11] erlykin a.d., wolfendale a.w., 2004, astropart. phys., 22/1, 47
[12] erlykin a.d., wolfendale a.w., 2006, j. phys. g, 32, 1
[13] fuhrmann d. et al., 2011, 32nd int. cosm. ray conf., beijing, 1, 227
[14] garyaka a.p. et al., 2002, j. phys. g, 8, 2317
[15] garyaka a.p. et al., 2008, j. phys. g, 35, 115201 (18pp)
[16] haungs a., 2011, 32nd int. cosm. ray conf., beijing, 1, 263
[17] ivanov a.a. et al., 2009, new j. phys., 11, 065008
[18] knurenko s.p., sabourov a., 2011, 32nd int. cosm. ray conf., beijing, 1, 189
[19] korosteleva e.e. et al., 2007, nucl. phys. b (proc. suppl.), 165, 74
[20] martirosov r.m. et al., 2011, 32nd int. cosm. ray conf., beijing, 1, 178
[21] panov a.d. et al., 2009, bull. rus. acad. sci., 73, 564
[22] petkov v.b., 2009, 31st int. cosm. ray conf., lodz, inv. rapp. high. papers, 12
[23] ter-antonyan s.v., haroyan l.s., 2000, arxiv: hep-ex/0003006
[24] the icetop coll., 2011, 32nd int. cosm. ray conf., beijing, 1, 279
[25] vishnevskaya e.a. et al., 2002, bull. rus. acad. sci., 66, 74
[26] watson a.a., 1997, proc. 25th int. cosm. ray conf., durban, 8, 257

discussion
bozena czerny: from the energetics point of view, is a single supernova enough to supply the required number of particles?
anatoly erlykin: the brief answer is ‘yes’, if the standard supernova with an explosion energy of ∼ 10^51 erg converted ∼ 10 % of it into cosmic rays. a more detailed answer: the number of particles from the standard supernova depends on distance and age. our estimates of the needed distance and age are ∼ 300 pc and ∼ 10^5 years.

acta polytechnica 55(2):123–127, 2015, doi:10.14311/ap.2015.55.0123
© czech technical university in prague, 2015, available online at http://ojs.cvut.cz/ojs/index.php/ap

simultaneous fits in isis on the example of gro j1008–57

m. kühnel a,∗, s. müller a, i. kreykenbohm a, f.-w. schwarm a, c. großberger a, t. dauser a, m. a. nowak b, k. pottschmidt d,c, c. ferrigno g, r. e. rothschild e, d. klochkov f, r. staubert f, j. wilms a
a remeis-observatory & ecap, universität erlangen-nürnberg, sternwartstr. 7, 96049 bamberg, germany
b mit kavli institute for astrophysics, cambridge, ma 02139, usa
c cresst, center for space science and technology, umbc, baltimore, md 21250, usa
d nasa goddard space flight center, greenbelt, md 20771, usa
e center for astronomy and space sciences, university of california, san diego, la jolla, ca 92093, usa
f institut für astronomie und astrophysik, universität tübingen, sand 1, 72076 tübingen, germany
g isdc data center for astrophysics, chemin d’écogia 16, 1290 versoix, switzerland
∗ corresponding author: matthias.kuehnel@sternwarte.uni-erlangen.de

abstract. parallel computing and steadily increasing computation speed have led to a new tool for analyzing multiple datasets and datatypes: fitting several datasets simultaneously. with this technique, physically connected parameters of individual data can be treated as a single parameter by implementing this connection directly into the fit. we discuss the terminology, implementation, and possible issues of simultaneous fits based on the interactive spectral interpretation system (isis) x-ray data analysis tool. while all data modeling tools in x-ray astronomy in principle allow data from multiple datasets to be fitted individually, the syntax used in these tools is often not well suited for this task. applying simultaneous fits to the transient x-ray binary gro j1008–57, we find that the spectral shape depends only on the x-ray flux. we determine time-independent parameters, e.g., the folding energy efold, with unprecedented precision.

keywords: methods: data analysis; x-rays: binaries; stars: pulsars: individual: gro j1008−57.
1. motivation
most data analyses in x-ray astronomy concentrate on describing single datasets or on characterizing samples with the results of fits of individual datasets. once a good description of an example dataset is found, an analysis of comparable datasets follows. finally, the results of all those individual analyses are compared and interpreted. for example, a particular parameter is found to depend on other parameters. instead of going back to the data analysis and fitting this dependency directly, to enhance the precision of the parameter or to break degeneracies (feasible through the reduced degrees of freedom), the dependency is then analyzed on its own. alternatively, the former analysis is indeed repeated, but with this parameter fixed according to the dependency that has been discovered. furthermore, if parameters cannot be constrained well, these parameters are commonly fixed to a certain standard value. we therefore cannot gain any physical information from such a fixed parameter and, more importantly, systematic effects may arise. the reason for not adopting more sophisticated approaches is usually a lack of computation power. implementing parameter correlations or dependencies would require all the data to be analyzed at the same time. however, since computer power has increased and parallel computation using several computers is possible, the situation has now changed. in other words, fitting data simultaneously has become feasible, even when large numbers of datasets (e.g., 50–100 pointings at a single source) are to be considered. in section 2 we introduce an implementation of simultaneous data analysis into the interactive spectral interpretation system (isis) [1], which has been designed to “facilitate the interpretation and analysis of high resolution x-ray spectra”.1 in section 3, we present ideas for possible applications of simultaneous data analysis, and we further demonstrate the power of this method on the example of the transient x-ray binary gro j1008–57 in section 4. finally, we discuss questions and issues which arise by comparing the advantages and disadvantages of simultaneous fits (table 1).

1 http://space.mit.edu/cxc/isis/ [2015-04-01]

2. implementation into isis
isis [1] was developed to fit x-ray spectra, but it can also be used to analyze almost all kinds of data, due to its strong customization capability [2] compared to, e.g., xspec [3, 4]. for example, user-defined fit-functions, as well as complex correlations between data and models, can be implemented. however, no functions are yet available for handling these correlations for a large number of parameters and datasets in an easy way.

table 1. advantages and disadvantages of fitting several datasets simultaneously.
advantages:
• fixed parameters can be determined correctly
• complicated parameter correlations can be implemented and tested
• different types of data can be combined
• parameter degeneracies can be broken
• reduced number of degrees of freedom
disadvantages:
• increased runtime of fits and uncertainty calculations
• large memory is needed → multi-cpu calculations required
• statistical weights of datasets have to be chosen
• careful handling of fit-parameters required

figure 1. terminology of simultaneous fits in isis. there are n and m simultaneous datasets, forming datagroup a and datagroup b, respectively. each dataset has its own group parameters, resulting from a model with p parameters. some of the group parameters are the same for both datagroups and are called global parameters.
before we describe the technical realization of simultaneous data analysis in isis, we introduce the new notation used by the implemented functions.

2.1. terminology
the parameters of a model which is fitted to data act either on all datasets loaded into isis or on an individual dataset. by defining parameters for each dataset and tying them to each other, parameters can be linked to multiple datasets, similar to the approach chosen, e.g., in xspec. multiple datasets that are to be fitted with the same set of parameters are called a datagroup. the corresponding parameters are called group parameters. global parameters denote parameters which act on all datagroups. figure 1 illustrates these definitions. in this example, a dataset requires a model with p parameters. simultaneous data which can be described by the same parameters are available from n detectors. these datasets define datagroup a, with p free parameters. another group of data was recorded by m detectors. these datasets define an individual datagroup b, again with p free parameters. during the analysis of the two groups, however, it turns out that a specific parameter seems to be equal for both datagroups within the uncertainties. as a result, the two individual values for this parameter are tied to each other, resulting in a global parameter. this reduces the number of free parameters by one, and the remaining group parameters can be constrained better.
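the parameter bookkeeping of this example can be made explicit; the short sketch below is in python purely for illustration (isis itself is scripted in s-lang), with hypothetical values for n, m and p:

    # sketch: counting fit parameters in the two-datagroup example of fig. 1
    n, m, p = 4, 3, 5           # hypothetical: n and m datasets, p model parameters

    total = (n + m) * p         # parameters instantiated over all datasets
    free = 2 * p - 1            # one set of group parameters per datagroup,
                                # minus one parameter tied globally across groups

    print(f"instantiated parameters: {total}")   # (n + m) * p
    print(f"free fit parameters:     {free}")    # 2p - 1, cf. section 2.2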
2.2. data- and analysis functions
since simultaneous fits can have large numbers of fit parameters connected by a complicated logic, we provide a collection of all functions necessary to initialize and perform simultaneous fits in isis.2 a simultaneous fit is initialized via

simfit = simultaneous_fit();

where simultaneous_fit returns a structure (struct_type), which has to be assigned to a variable, here simfit. the structure contains several functions and fields to handle simultaneous fits. the documentation of each function is available using the help-qualifier. some important functions are described in the following.

simfit.add_data(filenames);
this defines a datagroup and loads the spectra given by filenames, which must be an array of strings. the function also allows other data than spectra to be loaded or defined.

simfit.fit_fun(model);
the string model defines the fit-function to be used for all datasets. here, the placeholder % can be used instead of a component instance. in this case, individual group parameters are applied automatically to each datagroup.

simfit.set_par_fun(parameter, function);
this is probably one of the most useful functions. as with the isis-intrinsic function, the value of the parameter is determined by the given function. the %-placeholder can be used within the string parameter to apply the function to the corresponding parameter of each datagroup. however, the function may also contain other parameters or even a single parameter name. in the latter case, if the function is also applied to all datagroups (using the %), the single parameter is treated as a global parameter from then on.

2 these functions are available as part of the isisscripts, a collection of useful functions, which can be downloaded at http://www.sternwarte.uni-erlangen.de/isis [2015-05-06].

because a simultaneous fit results in a large number of parameters, a single call to a fit-routine (fit_counts) will take a long time. in the example from the previous section, the final model fitted to the data consists of (n + m) × p parameters, of which only 2p − 1 are free. to reduce the runtime of a fit, two fit-routines are implemented within the simultaneous-fit structure.

simfit.fit_groups(groupid);
instead of performing a χ2-minimization of all parameters and datasets, this function loops over all datagroups and fits only the associated parameters (group parameters). if a group is specified by the optional groupid, then only the group parameters of this particular group are fitted.

simfit.fit_global();
instead of fitting the group parameters, this function fits the global parameters only. since all defined datagroups have to be taken into account, the fit usually takes longer than fitting the group parameters.

2.3. uncertainty calculations
as already mentioned, a simultaneous fit takes longer than fitting a single dataset. thus, uncertainty calculations of parameters, where the range of a certain parameter has to be found corresponding to a change in χ2, are affected dramatically by the long runtime. furthermore, it is necessary to distinguish between group parameters and global parameters. we recommend computing the uncertainty intervals for each parameter on a different machine, e.g., by using [5] or mpi_fit_pars and the slmpi module.3 we compared the runtime of a parallel uncertainty calculation in isis with a serial approach in xspec. estimating the uncertainties of 10 parameters in parallel (i.e., on 10 cores) is faster by a factor of more than 3 (21 ks vs. 60 ks). additionally, the calculations in isis resulted in a better χ2red, because the parameter ranges being scanned are larger in the parallel approach.

3 http://www.sternwarte.uni-erlangen.de/git.public/?p=slmpi.git [2015-04-01]

group parameters depend on a single datagroup only. as a consequence, all other datagroups, and therefore all other group parameters, can be ignored during the uncertainty calculation. unfortunately, this is not the case for global parameters. during the analysis of gro j1008–57 (see section 4), the uncertainties of the global parameters were calculated by revealing the χ2-landscape of each global parameter by individual fits. subsequently, each landscape was interpolated to find the ∆χ2-value of interest (e.g., ∆χ2 = 2.71, corresponding to the 90 % confidence level). in this way, the runtime of an uncertainty calculation of a single global parameter could be reduced significantly. note that, depending on the model and on the amount of data, such a computation can take up to several days.
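the interpolation step can be sketched as follows (python for illustration only; a well-sampled, single-minimum χ2-landscape is assumed):

    import numpy as np

    # sketch: confidence interval from a sampled chi-square landscape by
    # interpolating where the curve crosses chi2_min + delta on both sides
    # (delta = 2.71 corresponds to the 90 % confidence level)
    def conf_interval(par, chi2, delta=2.71):
        par, chi2 = np.asarray(par, float), np.asarray(chi2, float)
        i = chi2.argmin()
        target = chi2[i] + delta
        # np.interp needs increasing x, so reverse the branch left of the minimum
        lo = np.interp(target, chi2[: i + 1][::-1], par[: i + 1][::-1])
        hi = np.interp(target, chi2[i:], par[i:])
        return lo, hi

    # example with a parabolic landscape chi2 = (p - 3)**2: interval 3 -/+ 1.65
    grid = np.linspace(0.0, 6.0, 601)
    print(conf_interval(grid, (grid - 3.0) ** 2))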
3. applications
there are various applications of simultaneous fits and data analysis. besides determining specific parameters that seem to be constant over time from all available data, more physical questions can be tackled. for example, a physical property of the object of interest may result in multiple observables:
• the geometry of the accretion column in accreting neutron star x-ray binaries affects the line shape of cyclotron resonant scattering features (crsf) [6] and also the pulse profile shape (falkner et al., in prep.).
furthermore, instead of deriving physical properties from parameters after fits have been performed, these properties can be directly fitted to the data by implementing the dependency on the model parameters:
• the components in radio maps of jets in active galactic nuclei move at certain velocities. assuming a constant velocity of the jet components, the velocity itself could be a global fit parameter [7].
• in the sub-critical accretion regime of neutron stars, the spectrum is believed to harden with increasing luminosity [8]. any possible dependency between power-law shape and luminosity could be fitted simultaneously with multiple spectra.

4. the example gro j1008–57
as an example of a successful simultaneous fit, we briefly summarize the results of our analysis of gro j1008–57 using almost all available x-ray spectra and lightcurves. this transient high-mass x-ray binary consists of a neutron star orbiting a be-type optical companion. for further details of the system, and also for the results of the analysis, see [9] and references therein. since sources are only visible for a small fraction of their full orbit, it is challenging to obtain the orbital parameters of transient x-ray binaries by analyzing, e.g., the pulse arrival times [10, 11]. thus, an observed shift in the orbital phase with respect to the initial orbital parameters can be fitted with either a different orbital period or with a different time of periastron passage. this leads to a parameter degeneracy, which can be visualized by a contour map of the χ2-landscape of these parameters. the resulting contour map shows that both parameters are statistically degenerate (i.e., the ellipsoidal contour lines are tilted). however, the outburst times of the source are clearly connected to the periastron passage. performing a simultaneous fit of the pulse arrival times and the outburst times reduces the parameter degeneracy and results in much better constrained parameters (by a factor of about 2–3), as is shown by the recalculated contour map.

initial fits of the spectra of three outbursts of gro j1008–57 in 2005, 2007 and 2011 with an absorbed cutoff power law and an additional black body component showed that the folding energy efold, as well as the black body temperature kt, are independent of time within the uncertainties. in particular, it seems that they do not change between different outbursts, i.e., these parameters are constant properties of the source. these parameters are therefore set as global parameters using simfit.set_par_fun, and their values are determined by all available data. in addition, further parameters can be treated as global parameters [see 9, for more details]. finally, each observation is described by 3 group parameters only (≈ degrees of freedom for each datagroup; the global parameters contribute marginally), which are the power-law flux fpl, the black body flux fbb, and the photon index γ. the latter two parameters correlate strongly with fpl, but show no dependency on the outburst time or the outburst shape. this correlation can be fitted to describe the spectrum of gro j1008–57 by only one parameter: the power-law flux fpl. the fit is shown in fig. 2 and its values are given in section 4.2 of [9].

figure 2. a fit (black lines) of the power-law index γ and black body flux fbb as functions of the power-law flux fpl (15–50 kev, in 10^−9 erg s−1 cm−2). the different colors correspond to different outbursts.

as already mentioned in section 2.3, the runtime for the uncertainty calculations of the global parameters increases dramatically. in the case of this analysis, the χ2-landscape produced by taking all 68 spectra into account was interpolated to estimate the uncertainties. the calculation of a single global parameter took ∼7 days on 100 cpus (16320 cpu-hours).

5. outlook
although simultaneous fits have already been applied successfully to real data (see section 4 and [9]), the routines and functions are still under development. we recommend pulling the isisscripts git repository2 regularly to be up-to-date. there are, however, some caveats according to table 1 that one should be aware of (as with any routine, not just our isis implementation). in particular, the runtime still has to be reduced. one way to achieve this is by performing the fit on multiple cpus, e.g., one cpu handling one dataset or datagroup. this has not been implemented yet, because the dependencies of the datasets on each other require data exchange between the processes on the different machines. additionally, the question of weighting the data is currently under discussion. the weight depends on the number of datapoints available in each dataset (or -group), as well as on their uncertainties, but what does this mean for its importance, i.e., for its effect on the model parameters? these remaining issues have to be clarified, and the respective solutions will be published in the future.

acknowledgements
m. kühnel was supported by the bundesministerium für wirtschaft und technologie under deutsches zentrum für luft- und raumfahrt grant 50or1113. we thank john e. davis for developing the slxfig package, which was used to create all figures shown in this paper.

references
[1] j. c. houck, et al.: isis: an interactive spectral interpretation system for high resolution x-ray spectroscopy. in n. manset, et al. (eds.), astronomical data analysis software and systems ix, vol. 216 of astron. soc. of the pacific conf. series, p. 591, 2000.
[2] m. s. noble, et al.: beyond xspec: toward highly configurable astrophysical analysis. publ. of the astron. soc. of the pacific 120:821–837, 2008.
[3] r. a. schafer: xspec, an x-ray spectral fitting package: version 2 of the user’s guide. european space agency, 1991.
[4] k. a. arnaud: xspec: the first ten years. in g. h. jacoby, et al. (eds.), astronomical data analysis software and systems v, vol. 101 of astron. soc. of the pacific conf. series, p. 17, 1996.
[5] m. s. noble, et al.: using the parallel virtual machine for everyday analysis. in c. gabriel, et al. (eds.), astronomical data analysis software and systems xv, vol. 351 of astron. soc. of the pacific conf. series, p. 481, 2006.
[6] g. schönherr, et al.: a model for cyclotron resonance scattering features. a&a 472:353–365, 2007.
[7] c. grossberger, et al.: a&a, in prep.
[8] p. a. becker, et al.: spectral formation in accreting x-ray pulsars: bimodal variation of the cyclotron energy with luminosity. a&a 544:a123, 2012.
[9] m. kühnel, et al.: gro j1008−57: an (almost) predictable transient x-ray binary. a&a 555:a95, 2013.
[10] j. e. deeter, et al.: pulse-timing observations of hercules x-1. apj 247:1003–1012, 1981.
[11] p. e. boynton, et al.: vela x-1 pulse timing. i. determination of the neutron star orbit. apj 307:545–563, 1986.
i. determination of the neutron star orbit. apj 307:545–563, 1986.
acta polytechnica 53(supplement):606–611, 2013, doi:10.14311/ap.2013.53.0606, © czech technical university in prague, 2013, available online at http://ojs.cvut.cz/ojs/index.php/ap
supernova 1987a: celebrating a silver jubilee
nino panagia a,b,c,∗
a space telescope science institute, 3700 san martin dr., baltimore, md 21218, usa
b inaf-osservatorio astrofisico di catania, via s. sofia 78, i-95123 catania, italy
c supernova ltd., oyv #131, northsound road, virgin gorda, british virgin islands
∗ corresponding author: panagia@stsci.edu
abstract. the story of the sn 1987a explosion is briefly reviewed. although this supernova was somewhat peculiar, the study of sn 1987a has clarified quite a number of important aspects of the nature and the properties of supernovae, such as the confirmation of the core collapse of a massive star as the cause of the explosion, as well as the confirmation that the decays 56ni–56co–56fe at early times and 44ti–44sc at late times are the main sources of the energy radiated by the ejecta. still, we have not been able to ascertain whether the progenitor was a single star or a binary system, nor have we been able to detect the stellar remnant, a neutron star that should be produced in the core collapse process.
keywords: supernovae: general, supernovae: 1987a, neutrinos, binaries: general, stars: neutron.
1. introduction
supernova 1987a was discovered on february 24, 1987 by ian shelton [29] in the large magellanic cloud. thanks to the modern instruments and telescopes available, it was possible to observe sn 1987a in great detail and with high accuracy, so that this event has been a first in many aspects (e.g. neutrino flux, progenitor identification, gamma-ray flux), and definitely the best studied event ever. the early evolution of sn 1987a has been highly unusual. it brightened much faster than any other known supernova: in about one day it jumped from 12th up to 5th magnitude at optical wavelengths (a factor of ∼ 1000!). however, equally soon its rise leveled off and took a much slower pace, indicating that this supernova would never reach such high peaks in luminosity as the astronomers were expecting. similarly, in the ultraviolet, the flux initially was very high, even higher than in the optical. but since the very first observation, made with the international ultraviolet explorer (iue, in short) satellite less than fourteen hours after the discovery [28, 59], the ultraviolet flux declined very quickly, by almost a factor of ten per day for several days. it looked as if it was going to be a quite disappointing event, but eventually it became apparent that sn 1987a has been the most valuable probe to test our ideas about the explosion of supernovae. reviews of both early and recent observations and their implications can be found in [3, 36, 37, 42]. the proceedings of the 2007 aspen conference supernova 1987a, twenty years later: supernovae and gamma-ray bursters (eds. s. immler, r. mccray, k.w. weiler, aip, conf. proc. vol. 937, 2007) provide an extensive update on all sn 1987a studies. here, i summarize some of the early findings on sn 1987a and discuss some of the new results obtained in recent years.
figure 1.
true color picture (hst-wfpc2) of sn 1987a, its companion stars, and the circumstellar rings [credit: peter challis (harvard)]. 2. neutrino emission from sn 1987a for the first time, particle emission from a supernova was directly measured from earth: on february 23, around 7:36 greenwich time, the neutrino telescope “kamiokande ii” recorded the arrival of 9 neutrinos within an interval of 2 seconds and 3 more 9 to 13 seconds after the first one. simultaneously, the same event was revealed by the imb detector and by the “baksan” neutrino telescope which recorded 8 and 5 neutrinos, respectively, within a few seconds from 606 http://dx.doi.org/10.14311/ap.2013.53.0606 http://ojs.cvut.cz/ojs/index.php/ap vol. 53 supplement/2013 supernova 1987a: celebrating a silver jubilee each other. this makes a total of 25 neutrinos from an explosion that allegedly produces about 1058 of them! but two dozens neutrinos were enough to verify and confirm the theoretical predictions made for the core collapse of a massive star that becomes a neutron star (e.g. [3, and references therein]). this process was believed to be the cause of the explosion of massive stars at the end of their lives, and sn 1987a has provided the experimental proof that the theoretical model was sound and correct, promoting it from a nice theory to the description of the truth. at the same time we cannot discard other evidence that may reveal puzzling aspects of this supernova explosion. in particular, about five hours before the kamiokande event, the mont blanc neutrino detector recorded a series of five neutrinos grouped within 7 seconds from each other [1]. such an event appeared to be highly significant (99.9 % significance) but was not revealed by the other detectors (possibly because of the different detection thresholds among the various experiments) and was not consistent with the timing of the light curve rise in the optical and in the uv. barring an exceptionally high (and rare) fluctuation as discussed by galeotti [19], the reality of this event would imply that the supernova progenitor underwent a double collapse, in which only the second one rebounded so as to generate the outward shock that, breaking out at the stellar surface, started the uv/optical burst (e.g. [14, 57]). a possible model of such phenomenon has been put forward by imshennik and ryazhskaya [22] who show that a rotating collapsar may be expected to explode in a two-stage collapse with a phase difference of ∼ 5 hours. we are inclined to argue that this phenomenon is more likely to occur with a binary system progenitor because such a system would naturally provide the opportunity of having two merging stellar cores to collapse in a multi-step process. 3. sn 1987a progenitor star from both the presence of hydrogen in the ejected matter and the conspicuous flux of neutrinos, it was clear that the star which had exploded was quite massive, about twenty times more massive than our sun. and all of the peculiarities were due to the fact that just before the explosion the supernova progenitor was a blue supergiant star instead of being a red supergiant as expected. there is no doubt about this explanation because sn 1987a is exactly at the same position as that of a well known blue supergiant, sk −69°202. and the iue observations indicated that such a star was not shining any more after the explosion: the blue supergiant star unambiguously was the sn progenitor. this heretic possibility was first suggested in panagia et al. 
[39] and confirmed by the more detailed analyses presented by gilmozzi et al. [20] and sonneborn, altner & kirshner [52]. on the other hand, the presence of narrow emission lines of highly ionized species, detected in the sn 1987a short wavelength spectrum since late may 1987, has provided evidence for the progenitor having been a red supergiant before coming back toward the blue side of the hr diagram [17]. also, the detection of early radio emission that decayed in a few weeks [56] indicated that the ejecta were expanding within a circumstellar environment whose properties were a perfect match to the expected wind of a blue supergiant progenitor [9]. such an evolution for a star with mass of ∼ 20 m� was not expected, and theorists have struggled quite a bit to find a plausible explanation for it. as summarized by podsiadloswki [46], in order to explain all characteristics of sn 1987a, rotation has to play a crucial role, thus limiting the possibilities to models involving either a rapidly rotating single star [30], or a stellar merger in a massive binary system [46]. more recently, podsiadlowski & morris [47] pointed out that while sn 1987a anomalies have long been attributed to a merger between two massive stars that occurred some 20 000 years before the explosion, so far there has been no observational proof that this merger took place. however, their detailed 3-d hydrodynamical simulations of the mass ejection associated with such a merger has shown that such scenario not only could account for the unusual evolution of the massive progenitor but also it appears to accurately reproduce the properties of the triple-ring nebula surrounding the supernova. 4. explosive nucleosynthesis the optical flux reached a maximum around midmay, 1987, and declined at a quick pace until the end of june, 1987, when rather abruptly it slowed down, settling on a much more gentle decline of about 1 % a day [48]. such a behaviour was followed for about two years quite regularly: a perfect constant decay with a characteristic time of 111 days, just the same as that of the radioactive isotope of cobalt, 56co, while transforming into iron. this is the best evidence for the occurrence of nucleosynthesis during the very explosion: 56co is in fact the product of 56ni decay and this latter can be formed at the high temperatures which occur after the core collapse of a massive star. thus, not only are we sure that such a process is operating in a supernova explosion, but we can also determine the mass of 56ni produced in the explosion, slightly less than 8/100 of a solar mass or ∼ 1 % of the mass of the stellar core before the explosion. the detection of hard x-ray emission since july 1987, and the subsequent detection of gammaray emission have confirmed the reality of such a process and provided more detailed information on its distribution within the ejecta (e.g. [3, and references therein]). eventually, the detection of coii lines in the mid-infrared [4] confirmed the light curve result and provided the first direct evidence of the production of 56ni in supernova explosions. 607 nino panagia acta polytechnica figure 2. b and r band light curves of sn 1987a debris (dots) and the equatorial ring (squares) [credit: sins team, and peter challis (harvard)]. 5. 
energetics of the emitted radiation sn 1987a ultraviolet spectra obtained with iue over the period 1987 february 24 [day 1.6] through 1992 june 9 [day 1567]) [48] show that the uv flux plummeted during the earliest days of observations because of the drop in the photospheric temperature and the increase in opacity. however, after reaching a minimum of 0.04 % of the total uvoir bolometric luminosity on day 44, the uv flux increased by about 200 times in its relative contribution to 7 % of the bolometric luminosity at day 800. a study of the uv colors reveals that the supernova started to get bluer in the uv around the time when dust started to form in the ejecta (about 500 days after the explosion; [12]). these results are consistent with the possibility that the condensed dust be metal-rich and of small size. in the early ’90s the sn light curve decay appears to slow down (see fig. 2), decreasing at a rate that is consistent with the decay of 44ti into 44sc in an amount as expected by explosive nucleosynthesis [13, 27]. on the other hand, around 2001 the optical flux was observed to increase again mimicking the behaviour shown by the inner ring radiation. as discussed by larsson et al. [31] this increase results from heat deposited by x-rays produced as the ejecta interact with the surrounding material. 6. hst observations the hubble space telescope (hst) was not in operation when the supernova exploded, but it did not miss its opportunity in due time and its first images, taken with the esa-foc on august 23 and 24, 1990, revealed the inner circumstellar ring in all its “glory” and detail [23]. subsequent observations made with the foc allowed jakobsen et al. [24, 25] to measure the angular expansion of the supernova ejecta. the results confirmed the validity of the expansion models put forward on the basis of spectroscopy. additional observations, made with the wfpc2 (burrows et al. 2005) on the refurbished hst confirmed the early trend of the expansion and revealed the presence of structures that had never been seen before [26, 61]. hst-fos spectroscopic observations of sn 1987a, made over the wavelength range 2000÷8000 å on dates 1862 and 2210 days after the supernova outburst, indicate that at late times the spectrum is formed in a cold gas that is excited and ionized by energetic electrons from the radioactive debris of the supernova explosion [60]. the profiles are all asymmetric, showing redshifted extended tails with velocities up to 10 000 km s−1 in some strong lines. the blueshift of the line peaks is attributed to dust that condensed from the sn 1987a ejecta and is still distributed in dense opaque clumps. 7. the circumstellar rings important clues to the nature of sn 1987a are provided by the study of its circumstellar rings, i.e. an equatorial ring (the “inner ring”) about 0.86′′ in radius and inclined by about 45 degrees, plus two additional “outer rings” which are approximately but not exactly, symmetrically placed relative to the equatorial plane, approximately co-axial with the inner ring, and have sizes 2 ÷ 2.5 larger than the inner ring. the presence of the inner ring was originally revealed with the iue detection of narrow emission lines [17]. detailed studies of the rings, mostly based on spectroscopy and imaging done with hst, concluded that the rings display a strong n overabundance and a moderate he enhancement [17, 32, 40, 41, 53]. 
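to put the angular sizes quoted above on a physical scale, a one-line small-angle conversion suffices; the snippet below assumes a distance to the lmc of roughly 50 kpc, a conventional value that is an assumption here rather than a number taken from this text.

# rough physical size of the inner circumstellar ring (illustration only;
# the LMC distance of ~50 kpc is an assumed, conventional value)
D_PC = 50_000.0                  # distance to the LMC [pc] (assumption)
THETA_ARCSEC = 0.86              # angular radius of the inner ring
ARCSEC_PER_RADIAN = 206_265.0

r_pc = D_PC * THETA_ARCSEC / ARCSEC_PER_RADIAN   # small-angle approximation
print(f"inner ring radius ≈ {r_pc:.2f} pc ≈ {r_pc * 3.2616:.2f} light years")
# -> about 0.21 pc (~0.7 ly); the outer rings are then ~2-2.5 times larger.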
most likely, they were ejected in two main episodes of paroxysmal mass loss which occurred approximately 10 000 (the inner ring) and 20 000 years (the outer rings) before the supernova explosion, respectively [34, 41]. 8. interaction of the ejecta with the equatorial ring since mid-1997 hst has observed the high-velocity material from the supernova explosion starting to overtake and crash into the slow-moving inner ring. figure 3 shows dramatic evidence of these collisions. the circumstellar ring started to develop bright spots in 1997, and in late 2006 one can identify at least twenty bright spots. these bright spots are the result of the fast moving component of the ejecta (at a speed of about 15 000 km s−1) colliding with the stationary equatorial ring (e.g. [38, 54]). independent evidence for an interaction whose strength is quickly increasing with time is provided by both radio emission (e.g. [18, 55]) and x-ray emission (e.g. [2, 43–45]). actually, bouchet et al. [5, 6] emphasized that a comparison of their gemini 11.7 µm image with chandra x-ray images, hst uv-optical images, and atca radio 608 vol. 53 supplement/2013 supernova 1987a: celebrating a silver jubilee figure 3. images of sn 1987a and its inner ring obtained with hst-wfpc2 over the years 1996–2006, during which time the ring has developed at least twenty hot spots. [credit: sins team, peter challis (harvard) and nasa] synchrotron images shows generally good correlation across all wavelengths. on the other hand, a good correlation does not necessarily imply a one-to-one correspondence. for example, gaensler et al. [18] stressed that an asymmetric brightness distribution is seen in radio images at all atca epochs. the eastern and western rims have higher fluxes than the northern and southern regions, indicating that most of the radio emission comes from the equatorial plane of the system, where the progenitor star’s circumstellar wind is thought to be densest. the eastern lobe is brighter than, and further away from the supernova site than the western lobe, suggesting an additional asymmetry in the initial distribution of supernova ejecta. similar asymmetries are also found at x-ray wavelengths, but in the optical it is the west side that appears to become brighter at late times. right now, we are witnessing the transition of sn 1987a from a supernova proper, in which the energetics are dominated by the ejecta themselves, into a full-fledged supernova remnant where most of the energy is produced by the interaction of the ejecta with a surrounding medium. over the next decades, as the bulk of the ejecta reach the ring, more spots will light up and the whole ring will shine as it did in the first several months after explosion (e.g. [37]). eventually, the ejecta will completely sweep the ring up, clearing the circumstellar space of that beautiful remnant of the pre-supernova wind activity. 9. dust associated with sn 1987a discussing the physical correlation between ir and x-ray emission, bouchet et al. [6] pointed out that if the dust responsible for the ir emission resides in the diffuse x-ray-emitting gas then it is collisionally heated. in this case, the ir emission can be used to derive the plasma temperature and density, which they found to be in good agreement with those inferred from the x-rays. alternatively, the dust could reside in the dense uv-optical knots and be heated by the radiative shocks that are propagating through the knots. bouchet et al. 
[6] conclude that in either case the dust-to-gas mass ratio in the csm around the supernova appears to be significantly lower than that in the general interstellar medium of the lmc, suggesting either a low condensation efficiency in the wind of the progenitor star or an efficient destruction of the dust by the sn blast wave. recent midand far-ir observations made with the spitzer and the herschel space telescopes have provided a wealth of new information about the properties and the nature of the dust associated with sn 1987a [15, matsuura et al. 2011]. spitzer observations in the mid-ir (3.6 to 24 µm) show a spectrum that peaks around 20 µm and whose luminosity indicates a mass of dust in the inner ring of about 1.2 × 10−6m� (dwek at al. 2011). compared to the estimated total mass of the ring (∼ 6 × 10−2m� [35], such a dust mass implies a gas to dust ratio as high as 5 × 104, i.e. much higher than an average value of 300 appropriate for the lmc. on the other hand, the spectrum measured with herschel peaks around 160 µm, suggesting dust temperatures around 20 k (matsuura et al. 2011). the mass inferred from these data depends on assumptions about the nature of the grains, but all estimates range between 0.34 and 3.4m� and values around 0.5m� are deemed to be most reasonable according to the authors. this is a large amount even accepting the idea of matsuura et al. 2011 that we are dealing with dust formed in the ejecta: considering that the total mass of the ejecta is estimated to be about 10m� even the lowest value of 0.34m� would require both a very high metal abundance in the progenitor’s outher layers and a very high efficiency for dust formation during the expansion of the ejecta. future observations in the fir, hopefully made with high angular resolution telescopes (perhaps alma?) will be able to verify these claims and to clarify if sne are actually efficient dust producers. 10. a missing neutron star? hst observations both in spectroscopy (stis, 1140 ÷ 10 266 å) and in imaging (acs 2900 ÷ 9650 å) have failed to detect any point source inside the sn 1987a remnant [21], implying an uvoir luminosity below 8 × 1033 erg s−1. the presence of bright young pulsars such as kes 75 or the crab pulsar is excluded by optical and x-ray limits on sn 1987a. while nonplerionic x-ray point sources have optical luminosities 609 nino panagia acta polytechnica similar to the above limits, among all young pulsars known to be associated with snrs, those with ages less than about 5000 years are much brighter in x-rays than the limits on sn 1987a. discussing theoretical models, graves et al. [21] find that spherical accretion onto a neutron star is firmly ruled out and that spherical accretion onto a black hole is possible only if the dust absorption in the remnant is considerably higher than expected. in the case of thin-disk accretion, the flux limit requires a small disk (< 1010 cm in size), with an accretion rate lower than 0.3 times the eddington accretion rate. possible ways to hide a surviving compact object include the removal of all surrounding material at early times by a photon-driven wind, a small accretion disk, or very high levels of dust absorption in the remnant. graves et al. [21] conclude that it will not be easy to improve substantially on the present optical-uv limit for a point source in sn 1987a, although one can hope that a better understanding of the thermal infrared emission will provide a more complete picture of the possible energy sources at the center of sn 1987a. 11. 
conclusions it is clear that sn 1987a constitutes an ideal laboratory for the study of supernovae, and of explosive events, in general. as summarized above, a great deal of observations have been made and quite a number of aspects have been clarified and understood, such as confirming that the core collapse of a massive star was the cause of the explosion, as well ascertaining that the decays 56ni–56co–56fe and 44ti–44sc are the main sources of the energy radiated at early and at late times, respectively. on the other hand, there are still important points that need clarification and further study, as well as more observations. for example, the stellar remnant left behind by the explosion has eluded our detection so far and its nature remains a complete mystery. we still debate whether the sn progenitor was a single rotating star or a binary system. also, the detection of an early interaction of the supernova ejecta with the inner circumstellar ring has opened a new chapter in the study of this supernova, which is expected to culminate in about ten years, when the colliding materials will become the brightest objects in the lmc, with a display of fireworks at x-ray, uv and optical wavelengths that defy our most vivid imagination. acknowledgements this project was funded in part by an stsci ddrf grant d0001.82435. references [1] aglietta, m. et al. 1987, europhys. lett., 3, 1321 [2] aschenbach, b. 2007, in supernova 1987a: twenty years after: supernovae and gamma-ray bursters, eds s. immler, r. mccray & k.w. weiler (aip, new york), aip conf. proc., vol. 937, pp. 33–42 [3] arnett, w.d. et al. 1989, ara&a, 27, 629 [4] bouchet, p., danziger, i.j. 1988, iauc 4575 [5] bouchet, p. et al. 2004, apj, 611, 394 [6] bouchet, p. et al. 2006, apj, 650, 212 [7] burrows, c.j. et al. 1995, apj, 452, 680 [8] chevalier, r.a. 1986, apj, 308, 225 [9] chevalier, r.a., dwarkadas, v.v. 1995, apj, 452, l45 [10] crotts, a. 1988, iauc 4561 [11] crotts, a., kunkel, w.e., mccarthy, p.j. 1989, apj, 347, l61 [12] danziger, i.j. et al. 1989, iauc 4746 [13] diehl, r., timmes, f.x. 1998, pasp, 110, 637 [14] de rujula, a. 1987, phlb, 193, 514 [15] dwek, e. et al. 2010, apj, 722, 425 [16] dwek 2011. [17] fransson, c. et al. 1989, apj, 336, 429 [18] gaensler, b.m. et al. 2007, in supernova 1987a: twenty years after: supernovae and gamma-ray bursters, eds s. immler, r. mccray & k.w. weiler (aip, new york), aip conf. proc., vol. 937, pp. 86–95 [19] galeotti, p., pallottino, g.v., pizzella, g. 2009, in “frontiers objects in astrophysics and particle physics”, eds. f. giovannelli & g. mannocchi (sif, bologna), conf. proc. vol. 98, pp. 233–242 [20] gilmozzi, r. et al. 1987, nature, 328, 318 [21] graves, g. et al. 2005, apj, 629, 944 [22] imshennik, v.s., ryazhskaya, o.g., 2004, astronomy letters, 30, 14 [23] jakobsen, p. et al. 1991, apj, 369, l63 [24] jakobsen, p., macchetto, f.d., panagia, n. 1993 apj, 403, 736 [25] jakobsen, p. et al. 1994, apj, 435, l47 [26] jansen, r.a., jakobsen, p. 2001, a&a, 370, 1056 [27] jerkstrand, a., fransson, c, kozma, c. 2011, a&a, 530, 45 [28] kirshner, r.p. et al. 1987, apj, 320, 602 [29] kunkel, w., madore, b. 1987, iauc 4316 [30] langer, n. 1991, a&a, 243, 155 [31] larsson, j. et al. 2011, nature, 474, 484 [32] lundqvist, p., fransson, c. 1996, apj, 464, 924 [33] manchester, r.n. et al. 2002, pasau, 19, 207 [34] maran, s.p. et al. 2000, apj, 545, 390 [35] mattila, s. et al. 2010, apj, 717, 1140 [36] mccray, r. 1993, ara&a, 31, 175 [37] mccray, r. 2004, in iau colloquium #192 supernovae, eds. 
j.m. marcaide & k.w. weiler, (berlin: springer-verlag), p. 77–87 [38] michael, e. et al. 2003, apj, 593, 809 [39] panagia, n. et al. 1987, a&a, 177, l25 [40] panagia, n. et al. 1991, apj, 380, l23 [41] panagia, n. et al. 1996, apj, 457, 604 610 vol. 53 supplement/2013 supernova 1987a: celebrating a silver jubilee [42] panagia, n. 2005, in 1604–2004: supernovae as cosmological lighthouses, eds. m. turatto, s. benetti, l. zampieri, & w. shea (san francisco: asp) asp-conference series, 342, 78 [43] park, s. et al. 2002, apj, 567, 314 [44] park, s. et al. 2006, apj, 646, 1001 [45] park, s. et al. 2007, in supernova 1987a: twenty years after: supernovae and gamma-ray bursters, eds s. immler, r. mccray & k.w. weiler (aip, new york), aip conf. proc., vol. 937, pp. 43–50 [46] podsiadlowski, p. 1992, pasp 104, 717 [47] podsiadlowski, p., and morris, t. 2007, sci. 315, 1103 [48] pun, c.s.j. et al. 1995, apjs, 99, 223 [49] pun, c.s.j. et al. 2002, apj, 572, 906 [50] rosa, m. 1988, iauc 4564 [51] schaefer, b.e. 1987, apj, 323, l47 [52] sonneborn, g., altner, b., kirshner, r.p. 1987, apj, 323, l35 [53] sonneborn, g. et al. 1997, apj, 477, 848 [54] sonneborn, g. et al. 1998, apj, 492, 139 [55] staveley-smith, l. et al. 2007,in supernova 1987a: twenty years after: supernovae and gamma-ray bursters, eds s. immler, r. mccray & k.w. weiler (aip, new york), aip conf. proc., vol. 937, pp. 96–101 [56] turtle, a.j. et al. 1987, nature, 327, 38 [57] voskresensky, d.n. et al. 1987, apss, 138, 421 [58] wampler, j.e. et al. 1990, apj, 362, l13 [59] wamsteker, w. et al. 1987, a&a, 177, l21 [60] wang, l. et al. 1996, apj, 466, 998 [61] wang, l. et al. 2002, apj, 579, 671 611 acta polytechnica 53(supplement):606–611, 2013 1 introduction 2 neutrino emission from sn 1987a 3 sn 1987a progenitor star 4 explosive nucleosynthesis 5 energetics of the emitted radiation 6 hst observations 7 the circumstellar rings 8 interaction of the ejecta with the equatorial ring 9 dust associated with sn 1987a 10 a missing neutron star? 11 conclusions acknowledgements references acta polytechnica acta polytechnica 53(4):329–337, 2013 © czech technical university in prague, 2013 available online at http://ctn.cvut.cz/ap/ steel fibre reinforced concrete for tunnel lining – verification by extensive laboratory testing and numerical modelling jaroslav beňoa,b,∗, matouš hilara,c a faculty of civil engineering, czech technical university in prague, thákurova 7, 166 29 prague, czech republic b metrostav a.s., koželužská 2246, 180 00 prague, czech republic c 3g consulting engineers s.r.o., na usedlosti 16, 147 00 prague, czech republic ∗ corresponding author: jaroslav.beno@metrostav.cz abstract. the use of precast steel fibre reinforced concrete (sfrc) for tunnel segments is a relatively new application of this material. it was first applied in italy in the 1980s. however, it did not begin to be widely applied until after 2000. the czech technical university in prague (ctu), together with metrostav, carried out a study to evaluate the use of this new technology for tunnels in the czech republic. the first tests were carried out on small samples (beams and cubes) produced from sfrc to find an appropriate type and an appropriate dosage of fibres. the tests were also used to verify other factors affecting the final product (e.g. production technology). afterwards, sfrc segments were produced and then tested at the klokner institute of ctu. successful test results confirmed that it was possible to use sfrc segments for czech transport tunnels. 
consequently a 15 m-long section of segmental lining generated from sfrc without steel rebars was constructed as part of line a of the prague metro. keywords: steel fibre reinforced concrete, segmental tunnel lining, mechanical excavation, laboratory testing. 1. introduction steel fibre reinforced concrete (sfrc) is a relatively new structural material. in the past, it was used mainly for construction of industrial floors. nowadays, sfrc often replaces steel bar reinforced concrete; the tunnel lining from precast segments is one of many examples. in tunnelling sfrc is used in three basic modifications – sprayed concrete for primary outer lining, in situ cast concrete for the secondary inner lining, and precast segments for tunnels constructed with tunnelling shields. the main advantages of sfrc over widely-used steel bar reinforced concrete (rc) segments are as follows: simple production, higher durability, lower consumption of steel, lower number of defects, etc. a major disadvantage of sfrc segments without steel rebars is their lower bending capacity. uniformly distributed and randomly oriented fibres transform brittle plain concrete into ductile sfrc. sfrc can be used to replace plain concrete or reinforced concrete. the steel fibres improve the mechanical properties of the concrete. the fibres carry local tensions caused by three-dimensional stress between aggregate particles. fibres in concrete enhance its compressive strength, but especially sfrc tensile strength is higher. sfrc has high resistance against the development of microcracks. this feature is related to sfrc’s high resistance against dynamic load and resistance against sudden temperature changes. the physical and mechanical properties of sfrc are influenced by several factors: the material of the fibres, the shape of the fibres (especially adjustment of the ends), the amount of added fibres, and the composition of the concrete mixture. it is necessary to ensure uniform distribution of fibres in the concrete, and to ensure that they are well coated with cement mortar. sfrc cannot be generated just by adding fibres to a standard concrete mix. the composition of the sfrc mix has to be designed taking into account the increased volume of the aggregate caused by fibres. the dosage of fibres is determined by mckee’s theory. the minimum amount of fibres depends on their length and thickness. some types of fibres can generate conglomerates during the mixing process; they should therefore be added to the mix using dosing and dispersing equipment before mixing. the length of the fibres length should be approximately three times the maximum aggregate size. this will bridge cracks located on the border of the grains, and avoiding pulling out of fibres during the development of these cracks. the strength of sfrc parameters depend mainly on the aspect ratio of the fibres (length to diameter) and on the dosage of fibres (when there is the same concrete matrix). a higher aspect ratio (or a higher fibre content) generally means better performance of the sfrc. sfrc is especially suitable for structures loaded in more than one direction, where traditional bar reinforcement is problematic. this means that sfrc 329 http://ctn.cvut.cz/ap/ j. beňo, m. hilar acta polytechnica figure 1. spalling of unreinforced edges of traditionally reinforced segments [1] is not appropriate for unidirectionally loaded structures, because the randomly oriented fibres would be largely unloaded. 
sfrc is therefore suitable for tunnel segments loaded in various directions during their production, installation and permanent function. polypropylene (pp) fibres cannot be used as a reinforcement for concrete structures, because they have a low modulus of elasticity (lower than concrete), which means big deformations. moreover, pp fibres lose their mechanical properties at 50°c and melt at 165°c. pp fibres can be used in sfrc structures or in reinforced concrete structures to increase their fire resistance. 2. segmental tunnel lining tunnel linings constructed from precast reinforced concrete segments started to be used in the 1950s, when they gradually replaced steel and cast-iron segments. reinforced concrete linings from precast segments were widely used in the construction of the metro in prague, where the lining was not waterproof due to the manufacturing tolerances. water-tightness of the lining was reached by additional grouting of the joints between segments, but the risk of leakage was relatively high. modern segmental linings have manufacturing tolerances of ±0.5 mm, and water tightness is guaranteed by impermeable concrete together with epmd (ethylene propylene diene monomer) gaskets between the segments. sfrc segments can have complicated details, as every part of the segment is uniformly reinforced by fibres (fig. 1). there is therefore less damage of segments during transportation and installation, which means that there is a lower risk of leakage and fewer repairs are required. nowadays segmental lining is used only in mechanical tunnelling with tunnelling shields (tunnel boring machine technology – tbm). a permanent lining is installed directly behind the tunnel face under the shield protection. the lining has a circular shape composed of several precast concrete segments. each segment is placed in the required position by the erector (a hydraulic arm at the rear part of the shield). one ring usually has dimensionally identical segments; the final segment (the key segment) has a different shape and is smaller in size. individual segments are connected by bolts. the space between the lining and the ground is grouted. segments are loaded by a wide range of loading conditions during their lifetime. similarly to other precast structures, the segments have to endure demoulding, manipulation, storage and transportation to the construction site. they also have to carry loads during their installation, namely they should be able to carry the axial forces generated by the hydraulic rams that push the shield into the ground. the load generated by shield rams is often decisive for the design of the segments. all load cases mentioned here are only temporary. the segments are loaded only by ground pressure and hydrostatic pressure during tunnel operation. this load case should correspond with the reinforcement of the segments. 3. a comparison of traditionally reinforced segments and sfrc segments sfrc segments are suitable for structures without high bending moments. a tunnel lining constructed by tunnelling shields is circular in shape, which is advantageous due to the low bending moments. in standard geotechnical conditions, the segments are loaded mainly by compression with relatively small eccentricity (i.e. without bending and without tensile stress). after reaching tensile strength, the deformation of sfrc does not increase abruptly, but thanks to the uniformly distributed fibre deformations it increases gradually and causes a greater number of small cracks. 
this is because the fibres are activated and are continuously pulled out from the concrete. the cracks are relatively small, and the crack openings remain small. the total value of the tensile strength of the segments 330 vol. 53 no. 4/2013 steel fibre reinforced concrete for tunnel lining is significantly lower than in the case of bar-reinforced concrete segments. the behaviour of reinforced concrete is different. after reaching tensile strength, the increment of the deformation develops until full activation of the bar reinforcement. this results in wider cracks than in the case of sfrc, and one main crack usually appears. afterwards, however, the deformation stabilizes and grows approximately linearly, until it reaches the yield strength of steel bars. this is significantly higher than the strength of steel fibre reinforced concrete in tension. destruction of the structure occurs after reaching the ultimate strength of steel in tension or crushing of concrete in compression. sfcr structures are better protected against corrosion. this is mainly because there are lower crack openings than in the case of traditionally reinforced concrete structures. adding fibres into concrete reduces crack openings in the same segment from 1 mm to less than 0.2 mm. minimal opening of cracks allows them to close and self-heal. for this reason, sfrc segments are considered to have about 20 % longer durability than traditionally reinforced structures. durability of 120 years is expected for some sfrc structures (e.g. tunnels on the channel tunnel rail link in the united kingdom). minimum cover of the steel rebars is required for traditionally reinforced segments. the minimum cover prevents corrosion of the steel bar reinforcement. the required cover is usually 40–50 mm in thickness; it depends on the aggressiveness of the external and internal environment (e.g. underground water). no cover is required for sfrc segments, as no reinforcement corrosion is expected. the fibres are completely protected by the alkaline environment of the concrete. in addition, no stray current corrosion can occur in sfrc segments, because the fibres are short and they do not touch each other. therefore the thickness of sfrc segments can be reduced in some cases. the production cost of sfrc segments is often slightly lower than the production cost of rc segments, although the actual material (steel fibres) is more expensive than a conventional steel bar reinforcement. savings can be achieved by lower labour costs and by reducing the amount of steel that is used. the fibres are added directly into the mix. sfrc segments can therefore be produced faster, and the plant can have a higher capacity. the final cost is also reduced by the fact that fewer segments are damaged during installation, so that expensive repairs or replacements of damaged segments are very rare in the case of sfrc segments. the production process also eliminates the need for accurate placement of the reinforcement. a disadvantage of sfrc structures is the absence of structural design standards. european standards (eurocodes – ec) still cover only the field of laboratory testing and production of fibres. there are only various guidelines and recommendations for structural design. this means that there is an absolute lack of uniformity in the design methods and required tests. a european standard for the design of sfrc structures is currently being prepared. the standard will be based on ec 2 for concrete structures; parts focused on sfrc will be amended. 4. 
tests on sfrc samples
laboratory tests on sfrc are similar to plain concrete tests. in contrast to plain concrete, the tensile strength or the equivalent tensile strength after cracking of sfrc structures can be taken into consideration. the effect of the added fibres on the cube compressive strength, the tensile splitting strength and the equivalent flexural strength is not identical. due to the added steel fibres, the tensile strength usually increases more than the compressive strength. sfrc therefore cannot be classified simply according to cube compressive strength, as plain concrete is. this is because the high tensile strength would not be fully utilized in the design of structures. laboratory tests of sfrc should therefore take this factor into account.
laboratory tests of sfrc samples were carried out at the czech technical university in prague (ctu). the major objective of the tests was to verify the properties of various sfrc mixtures, the impact of various types of fibres, and the impact of various dosages of fibres. all specimens were cast in the plant of the slovak company doprastav, in senec. in the first stage, the samples were produced with dramix rc – 80/60 bn fibres, produced by the belgian company bekaert, and with czech fibres tritreg. the fibre dosage was 70 kg for 1 m3 of concrete. a total of 30 cubes were produced with a side length of 150 mm, and 30 beams with dimensions of 150 × 150 × 700 mm. the samples were tested in three independent laboratories to increase the objectivity of the test results. in the second stage, 12 beams and 12 cubes were produced with the same dimensions as in the first stage. the specimens were produced with dramix rc – 80/60 bn fibres with a fibre dosage of 50 kg for 1 m3 of concrete. the second stage of the tests served mainly to compare the impact of various fibre dosages.
the samples were used for four types of tests; all samples were tested after 28 days. cube compressive strength and tensile splitting strength were tested on cubes according to čsn en 12390-3 and čsn en 12390-6, respectively. the beams were tested for flexural strength according to the german dbv-merkblatt guideline (four-point bending tests) and for residual equivalent flexural strength on a beam with a notch in the middle (three-point bending test), according to čsn en 14651. the four-point bending tests were executed with controlled deflection. the curves of load dependence on deflection were evaluated by calculating the area under this curve. the results at a low beam deflection can be used for the serviceability limit state; the results at a high beam deflection can be used for the ultimate limit state. the three-point bending test can be used to evaluate the post-crack behaviour and to obtain the equivalent tensile strength at the determined crack opening [4]. the applied forces depending on the crack mouth opening displacement (cmod) were recorded during the test.

dosage of fibres    fct,l [mpa]   fr,1 [mpa]   fr,2 [mpa]   fr,3 [mpa]   fr,4 [mpa]
50 kg/m3            6.364         6.759        7.012        6.196        5.511
70 kg/m3            6.977         8.930        8.916        6.892        5.306
table 1. comparison of residual flexural strengths calculated according to čsn en 14651 (three-point bending test)

figure 2. comparison of four-point bending test results (blue: 70 kg/m3 of dramix fibres; orange: 50 kg/m3 of dramix fibres).
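as a small, self-contained illustration of the evaluation just described – integrating the recorded load–deflection curve and turning the mean load into an equivalent flexural strength – consider the sketch below. the record, the beam geometry and the conversion formula f_eq = (d/δ)·l/(b·h2) are assumptions chosen for the demonstration; the normative evaluation in the dbv guideline and in čsn en 14651 differs in detail.

# illustration only: evaluate a (made-up) load-deflection record by the
# area under the curve, and convert the mean load to an equivalent
# flexural strength for a beam of span L and cross-section b x h.
import numpy as np

deflection = np.array([0.0, 0.5, 1.0, 2.0, 3.0, 3.5])   # [mm], toy data
load = np.array([0.0, 30.0, 28.0, 26.0, 24.0, 23.0])    # [kN], toy data

L_mm, b_mm, h_mm = 600.0, 150.0, 150.0                   # assumed geometry

def area_under_curve(x, y, up_to):
    # trapezoidal area of y(x) for x <= up_to (here: toughness in kN*mm)
    m = x <= up_to
    xs, ys = x[m], y[m]
    return float(np.sum(0.5 * (ys[1:] + ys[:-1]) * np.diff(xs)))

for delta in (0.5, 3.5):        # SLS-like and ULS-like deflection limits
    d_work = area_under_curve(deflection, load, delta)   # [kN*mm]
    f_mean = d_work / delta                              # mean load [kN]
    f_eq = f_mean * 1e3 * L_mm / (b_mm * h_mm**2)        # [MPa]
    print(f"delta = {delta} mm: work = {d_work:.1f} kN*mm, "
          f"f_eq = {f_eq:.2f} MPa")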
the tensile strength of sfrc was calculated from the graph at the point where the first crack appeared (lop – limit of proportionality), and the post-crack residual flexural strength was calculated with the help of the values given in the standard. four residual strength values for sfrc were recorded (fr,1, fr,2, fr,3 and fr,4) for four values of cmod (cmod1 = 0.5 mm, cmod2 = 1.5 mm, cmod3 = 2.5 mm and cmod4 = 3.5 mm). this test is obligatory for manufacturers of fibres, because they have to state how many kg of fibres should be added to 1 m3 of concrete to obtain a residual strength of 1.0 mpa at cmod = 3.5 mm. this approach is not appropriate for the design of sfrc structures.
the results of the three-point bending tests that were carried out are presented in tab. 1. they show an increase of about 25 % in flexural strength between the 50 kg/m3 and 70 kg/m3 dosages of fibres. the character of the resulting curves is presented in fig. 2. a higher dosage of fibres leads to an increase in the tensile residual strength after cracking up to a deflection of about 3 mm, and then both curves are almost identical. the results show that tests with a lower dosage of fibres have a lower scatter of properties, due to the better workability of the mix with 50 kg/m3 of fibres. however, samples with higher fibre dosages tend to have better final properties.
figure 3. comparison of four-point bending tests on beams with a fibre dosage of 70 kg/m3 (red: dramix, green: tritreg).
the performed tests could not provide an objective recommendation for the type of fibres for the production of sfrc segments. further tests with various dosages of various fibres would be required to achieve higher confidence. sfrc beams with dramix fibres generally had slightly better properties (fig. 3). they had higher compressive strength and flexural strength. they also showed lower scatter, due to better distribution of the fibres. sfrc beams with dramix fibres also showed higher residual strength after cracking. the results of the laboratory tests showed that sfrc can be used for segments produced for the construction of czech transport tunnels. the tests indicated that sfrc has the properties that are important for segment design (especially compressive strength and flexural strength). in addition, the tests confirmed that a homogeneous sfrc mixture can be cast in the precast plant in senec. dramix fibres with a dosage of 50 kg/m3 have been recommended for the production and testing of sfrc segments.
5. numerical modelling of sfrc beams
many software products designed for numerical modelling of various civil structures are available on the market. it should be noted that a numerical model is only a specific approximation of the real behaviour of a structure. modelling of sfrc structures should be performed in software that uses nonlinear fracture mechanics, because the post-cracking behaviour is important for sfrc structures. the czech software atena was used for modelling the laboratory testing of sfrc beams. back analysis was used for calibrating the sbeta material model. this problem cannot be solved using sophisticated methods (e.g. neural networks); however, the meaning of the individual parameters of the sbeta model is known, and the parameters are almost independent, so they can be derived directly by gradually eliminating the differences between the test results and the modelling results.
figure 4. comparison of testing and modelling (black: laboratory tests, orange: simulation, using atena software).

modulus of elasticity [gpa]                 40.66
poisson's ratio [–]                          0.20
tensile strength [mpa]                       4.20
compressive strength [mpa]                 −57.2
specific fracture energy [n/m]              7500
critical compressive deformation [m]      −0.0005
table 2. parameters of the sbeta model derived by back analysis of the beams in atena software.

the sbeta material model describes the behaviour of concrete in 2d models by fracture mechanics [6]: it uses the crack-band concept with a nonlinear traction-separation law. the properties of concrete in tension are simulated by nonlinear fracture mechanics in combination with the width of the cracked zone. the main parameters of the model are: the tensile strength, the fracture energy, and the shape of the curve defined by the relationship between the stress and the crack opening. the parameters were derived by back analysis. the four-point bending test of a beam without a notch was chosen for deriving the material parameters. this test simulates the behaviour of sfrc better than the test of the beam with a notch, which is better suited for calculating the fracture energy. the geometry, the support and the loading of the beam were the same as for the laboratory tests. the deflection at mid span and the value of the force were recorded in the simulation to compare the modelling results with the tests that were carried out. a comparison between the laboratory test and the simulation achieved by back analysis in atena software is presented in a graph (fig. 4). the curve derived by modelling does not match perfectly with the test results (the peaks are not included), but the prepared model can be used for a conservative prediction of the behaviour of the material. the material parameters derived from back analysis of the laboratory tests of sfrc beams (tab. 2) served for a numerical simulation of the behaviour of an sfrc segment. tests and mathematical simulations of entire sfrc segments should confirm the appropriateness of the derived sbeta parameters. verification of the numerical model should be very useful for future projects (as it will enable expensive laboratory tests on segments to be avoided).
figure 5. tests performed on sfrc segments: (a) compressive strength of a key segment, (b) segment bending perpendicular to the segment plane, (c) compressive strength of the segment, (d) segment bending in the segment plane.
6. tests on sfrc segments
tests on segments were executed at the klokner institute of ctu [3]. the segments were produced in senec (slovakia), where the moulds for the construction of the prague metro line a extension are located. two complete rings of sfrc segments were produced. the dosages of fibres were 40 kg/m3 for one ring and 50 kg/m3 for the second ring. two major design cases were verified – the long-term load on the installed lining (ground pressure and hydrostatic pressure) and the short-term load generated by the pressure of the shield rams. other temporary load cases can be adjusted in order not to increase the reinforcement. the ultimate limit state is generally not crucial for the design of segments. the serviceability limit state plays a more important role, due to the importance of the watertightness of the tunnel lining. impermeability of the segmental tunnel lining is generally guaranteed by waterproof concrete and gaskets between segments.
the limiting factor for the design of the segmental tunnel lining is therefore a crack running through a segment allowing water leakage through the segment. sfrc is advantageous from this point of view, because macro-cracks appear later than in traditionally reinforced concrete. on the other hand reinforced concrete has significantly higher flexural 333 j. beňo, m. hilar acta polytechnica figure 6. key segment compression (left: segment installed in the test machine; right: detail of the fractured surface). figure 7. segment bending (left: segment installed in the test machine; right: a detail of the fractured surface). strength, but this is not important for the serviceability limit state. sfrc segments were subjected to the following laboratory tests (fig. 5). 6.1. key segment compression first, an sfrc key segment was tested in compression in the direction parallel to the longitudinal tunnel axis (fig. 5a). the load simulated pressure generated by shield rams caused by penetrating the shield machine to the rock mass. the applied force was recorded during the test. deformations were measured by potentiometric sensors and by resistance tensiometers glued to the surface of the segments. the laboratory test was conducted in a wpm 6000 kn hydraulic machine (fig. 6) with gradually increasing compressive force. the applied force was increased in steps of 300 kn, and the segment was unloaded to 90 kn after each load step. the load was continuous from a value of 4800 kn. the first crack appeared at an applied force of 4200 kn. the crack ran through the whole thickness of the segment. the maximum applied force was 6000 kn, when the available capacity of the testing machine was reached. the maximum load capacity of the key segment was not detected, so the segment had to be tested in the stronger amsler 10000 kn test machine. the key segment was tested by continuous loading until destruction, which occurred with a sudden break at a force of about 7250 kn. a test of an rc key segment in compression was also executed. the dimensions of the segment were identical with the sfrc segment, and the setting of the test was also the same. the only difference was that the rc segment included a gasket (a sealing rubber) on its surface. the first crack occurred at a force of 3300 kn. the crack ran through the whole thickness of the segment. the maximum applied force was about 5870 kn. at this force, separation of a large part of cover at the inner surface of key stone occurred. 6.2. segment bending segments were also tested under a load perpendicular to the longitudinal tunnel axis (fig. 5b). the test simulated segment bending caused by ground pressure. the segment was laid in convex position, and the lower edges were supported by sliding mats, allowing horizontal movement and preventing vertical movement. the segment was loaded with a uniform load along the whole length of the crown (fig. 7). 334 vol. 53 no. 4/2013 steel fibre reinforced concrete for tunnel lining figure 8. compression of a uniformly supported segment. figure 9. compression of a non-uniformly supported segment (left: segment installed in the test machine; right: detail of the fractured segment). the test was controlled by increasing the deflection to obtain the whole stress–strain diagram, including the decreasing section. the test was finished once the segment disintegrated under its self-weight. the first cracks emerged on the bottom surface of the segment in a relatively wide zone under the applied load. 
cracks gradually spread in this zone, then one dominant crack appeared, and then the failure of the segment occurred. the load capacity was relatively low, and the segment failed at a maximum load between 100 kn and 150 kn. a total of four sfrc segments were tested using this type of test. 6.3. compression of a uniformly supported segment this test simulated a uniform load of the shield rams. segments were loaded by a force parallel to the longitudinal tunnel axis. the segment was loaded in two points in the centre plane (fig. 5c). the applied force was increased in steps of 1200 kn; the segment was unloaded to 400 kn after each step. two sfrc segments were subjected to this test. the first crack appeared in both cases under a force of 3600 kn. subsequently cracks developed between two applied forces, mainly on the external surface of the segment. a crack running through the whole thickness of the segment appeared under loads of 6000 kn and 6600 kn. the maximum force that was reached was 9000 kn and 9300 kn. both segments failed by splitting in the area between the applied forces. 6.4. compression of a non-uniformly supported segment this test was prepared to simulate segment bending in the segment plane (i.e. high cantilever bending). this kind of load occurs if the geometric assembly of the preceding ring is inaccurate, and the ring loaded by rams is not supported uniformly. the segments were supported by two supports (central and side support), support of the one side was omitted. the segment was loaded in the centre plane on the unsupported side. the tested segment was supported and loaded as a high cantilever (fig. 5d). the segment was loaded by increasing the force without unloading, with force increments of 100 kn, up to failure of the segment. four sfrc segments were tested using this test; the test was also performed 335 j. beňo, m. hilar acta polytechnica segment dosage increments unload appearance crack maximum of fibers of applied force to the force throught the segment applied force [kg/m3] [kn] [kn] thickness [kn] [kn] loading by bending perpendicular to the segment plane (fig. 5b) s1 40 continually not unloaded — 115 s2 50 continually not unloaded — 106 s3 40 continually not unloaded — 124 s4 50 continually not unloaded — 154 loading unaxial compression of segment (fig. 5a, 5c) k 50 300 90 4200 7247 s1–l 40 600 200 6000 6600 s2–l 50 600 200 4800 7500 s3–l 40 600 200 6000 6600 s3–p 40 600 200 6000 7480 s4–l 50 600 200 5400 8300 s4–p 50 600 200 6000 7900 s5 40 1200 400 6000 9000 s6 50 1200 400 6000 9300 loading by bending in the segment plane (fig. 5d) s11 50 100 not unloaded 200 500 s12 50 100 not unloaded 300 753 s13 40 100 not unloaded 300 629 s14 40 100 not unloaded 300 610 table 3. results of laboratory tests on sfrc segments. with traditionally reinforced segments. the first cracks appeared under a load of about 300 kn. the segment began to break almost in the middle, where the segments are weakened by a niche for the connecting bolt. many tiny cracks parallel to the axis of the tunnel also appeared. subsequently, the segment cracked in the middle. the maximum force that was attained was of about 600 kn. a similar maximum force of about 600 kn was achieved for the rc segments. a crack through the whole thickness of the segment formed in the sfrc segment later than in the rc segment. the corresponding force was 500 kn for the sfrc segment, and 400 kn for the rc segment. 6.5. 
evaluation of the tests on sfrc segments the test results gave a realistic idea about the behaviour of the sfrc segments, and also about their bending capacity and serviceability. the biggest advantage of sfrc segments is the larger number of small cracks in initial stages of loading. this is beneficial from the serviceability limit state point of view. a crack through the whole thickness of the segment appears later in sfrc segments than in rc segments. the risk of water leakage is therefore lower for sfrc segments. the tests showed that sfrc segments can replace widely-used rc segments. the results of our laboratory tests of sfrc segments are summarized in tab. 3. 7. a section of sfrc segmental lining installed in the prague metro the successful laboratory tests on sfrc segments led to the decision to install sfrc segmental lining in the prague metro. sfrc segments were installed in the construction of the line a extension. rc segments were normally used on the running tunnels of this project excavated by two epb shields. the rc segments were produced at the precast plant in senec, slovakia. the sfrc segments were produced in the same factory. dramix fibres were used, with a fibre dosage of 40 kg/m3. a total of ten sfrc rings were assembled (i.e. 15 meters of lining – see fig. 10). the 336 vol. 53 no. 4/2013 steel fibre reinforced concrete for tunnel lining segments were placed in the right tunnel tube in june 2012 between metro stations veleslavín and červený vrch. no major problems occurred during production and assembly of the sfrc segments. 8. conclusion sfrc is increasingly used as a structural material for precast segmental tunnel linings excavated by tunnelling machines. in some cases sfrc is supplemented by steel bar reinforcement, while in other cases sfrc is used without steel cages. sfrc segments can bring many benefits during tunnel construction and operation. the following advantages of sfrc segments can be mentioned: • possible price reduction (less steel is used, and there is faster production); • easier production (less manual work, no problems with the shape and the position of cages); • simpler placing of tunnel equipment (no risk of drilling to steel bars); • reduced risk of segment damage during transportation and installation (the edges are reinforced by fibres); • longer durability (no problems with corrosion). successful laboratory test results have confirmed the possibility of using sfrc segments in czech transport tunnels. this finding has been confirmed by their application in the prague metro. our study supports future applications of sfrc segments in the czech republic. acknowledgements financial support from the czech science foundation (gačr) grant no. p104/10/2023 is gratefully acknowledged. references [1] rivaz b.: steel fibre reinforced concrete (sfrc): the use sfrc in precast segment for tunnel lining, wtc 2008, agra, india, 2007–2017 figure 10. section with sfrc segmental lining installed in prague metro line a extension. [2] poh1 j., tan k. h, peterson g. 
acta polytechnica 54(4):301–304, 2014, doi:10.14311/ap.2014.54.0301, © czech technical university in prague, 2014, available online at http://ojs.cvut.cz/ojs/index.php/ap

beyond the standard model of the disc–line spectral profiles from black hole accretion discs

vjačeslav sochora∗, vladimír karas, jiří svoboda, michal dovčiak
astronomical institute, academy of sciences of the czech republic, boční ii 1401, prague, czech republic
∗ corresponding author: sochora@astro.cas.cz

abstract. the strong gravitational field of a black hole has distinct effects on the observed profile of a spectral line from an accretion disc near a black hole. the observed profile of the spectral line is broadened and skewed by the fast orbital motion and redshifted by the gravitational field. these effects can help us to constrain the parameters of a system with a black hole, both in active galactic nuclei and in stellar-mass black holes. here we explore the fact that the emission of an accretion disc can be mathematically imagined as a superposition of radiating accretion rings that extend from the inner edge to the outer rim of the disc, with some radially varying emissivity. in our work, we show that the characteristic double-horn profiles of several radially confined (relatively narrow) accretion rings or belts could be recognized by the planned instruments onboard future satellites (such as the proposed esa large observatory for x-ray timing).

keywords: black hole physics, accretion discs, galactic nuclei.

1. introduction
observations of spectral lines from the inner regions of an accretion disc around a black hole, both in active galactic nuclei (agn) [1, 2] and in galactic black holes [3, 4], give us information about matter in extreme conditions. these spectral lines are broadened and skewed by the fast orbital motion and redshifted by the strong gravitational field.
according to the standard scenario [5, 6], the line emissivity is assumed to be a simple power-law of the radius. with a typically moderate inclination angle of the source, a broad profile is formed, with an extended red wing and a dominant, well-defined blue peak. however, the radial emissivity of an astrophysically realistic accretion disc cannot be a simple smooth function of the radius. instead, it is expected to have peaks of enhanced emissivity occurring at particular radii, e.g. due to localized irradiation by magnetic flares [7, 8]. we address the question whether the emission excesses on top of the standard emission profile can be resolved in observed spectra and used to constrain the black hole spin to better precision. we discussed in [9] whether the proposed large observatory for x-ray timing (loft) [10, 11] will have the necessary capability to reconstruct the parameters from a model spectrum. we produced artificial data with appropriate properties and then analyzed them by using a preliminary response file for loft. two scientific instruments form the payload of the satellite: the lad (large area detector), with a large effective area (designed to reach ≈ 12 m2) and an energy resolution of about 200–300 ev; and the wfm (wide field monitor), which will observe about 50 % of the sky available to the lad in the same energy band at any time. in this paper, we compare the results obtained in [9] and discuss how variability of the background can affect the spectrum. correct modeling of the background is crucial for the success of the measurements, because loft does not contain any telescope that could measure the background from the neighborhood of the observed object.

2. model spectrum and analysis by loft
2.1. test case
we took our fiducial model (figure 1, left panel) from [9]: it was phabs * (powerlaw + 4 * kyrline), i.e., a photo-absorbed power-law continuum and four line components blurred by relativistic effects (we used xspec v. 12.6.0). one of the kyrline components originates over the entire disc surface, and it has been fixed to its default parameters (risco ≤ r ≤ 400 rg, radial emissivity index α = 3). we set the model parameters to: a = 0.93 (a rapidly spinning kerr black hole in prograde rotation), i = 30 deg (a moderate inclination typical for a seyfert 1 nucleus), and three rings of width 0.5 rg at the positions r1 = 3 rg, r2 = 4 rg and r3 = 6 rg. we produced the simulated spectrum (figure 1, right panel) by assuming a source flux of approximately 1.3 mcrab (≈ 3 × 10−11 erg/cm2/s in the energy range 2–10 kev), a photo-absorbed power-law continuum (photon index γ = 1.9, hydrogen column density nh = 4 × 1021 cm−2) and a rest energy erest = 6.4 kev. the exposure time was set to 20 ksec.
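to get a feeling for the photon statistics behind such a simulation, one can estimate the total number of detected counts from the quoted flux, the lad effective area and the exposure. the sketch below is a rough back-of-the-envelope estimate, not the xspec simulation itself; the mean photon energy and the constant effective area of 10 m2 are assumptions made here for illustration.

```python
ERG_PER_KEV = 1.602e-9

flux = 3e-11             # erg/cm2/s in 2-10 keV (about 1.3 mCrab, see text)
mean_e_kev = 5.0         # assumed mean photon energy of the 2-10 keV band
area_cm2 = 10.0 * 1e4    # assumed ~10 m2 effective area (design goal ~12 m2)
exposure_s = 20e3        # 20 ksec exposure

photon_flux = flux / (mean_e_kev * ERG_PER_KEV)   # photons/cm2/s
counts = photon_flux * area_cm2 * exposure_s
print(f"{photon_flux:.2e} ph/cm2/s -> ~{counts:.1e} counts")
# -> 3.75e-03 ph/cm2/s -> ~7.5e+06 counts
```

an exposure of a few million counts is what makes the narrow ring residuals in figure 1 detectable at all.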
figure 1. left panel: the complete theoretical model and the model components: a power-law continuum and the individual line profiles from which the energy shifts of the components are derived. right panel: simulated data and the ratio to the baseline model consisting of the power-law and the disc-line components. residuals related to the three additional narrow rings are clearly visible. taken from [9].

ring | gmin | gmax | rin (a = 0.76) | rin (a = 1.00) | rout (a = 0.76) | rout (a = 1.00)
1 | 0.36 | 0.81 | 3.1 | 2.8 | 3.7 | 3.4
2 | 0.48 | 0.91 | 4.1 | 3.9 | 4.9 | 4.7
3 | 0.59 | 0.98 | 5.8 | 5.6 | 7.1 | 6.9

table 1. parameters of the model inferred from the energy positions of the spectral peaks in the test spectrum from figure 1. we identified the visible features with the horns of the line components. we imposed the same inclination i = 30 deg for all three rings and required the inferred spin values to be consistent with each other. the spin turns out to be constrained only partially, with values from 0.77 up to 1.00 being consistent with the positions of the peaks in the model spectrum when the radius is set appropriately. the fiducial test spectrum was generated for rings positioned at radii rin = 3 rg, 4 rg, and 6 rg, respectively. the tabulated values demonstrate the accuracy of the fitting procedure. see the text for details. taken from [9].

2.2. determination of shifts from the spectral profile
to determine the relativistic energy shifts of photons, we adopt the method described in our recent paper [12], where we considered the propagation of photons from the source in the limit of geometrical optics in the kerr metric [13]. there is a partial degeneracy of the parameter values. in our case this exhibits itself in the fact that, in order to obtain the red peaks of the line at the right positions, the spin has to be greater than the lower limit of a = 0.76; the upper bound, however, remains undetermined. for 0.76 ≤ a ≤ 1, i.e., up to the maximum spin of the kerr black hole, we can reproduce the peaks by rearranging the ring radii. this is shown in table 1 by giving the two possible values of rin and rout that are simultaneously consistent with the mentioned minimum and maximum spin values. one can see that the uncertainty in the inferred radii is below 10 %, while for the spin the relative error is about 25 %.
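the connection between the ring parameters and the g-values in table 1 can be illustrated with a toy calculation of the extremal energy shifts. the sketch below uses the keplerian orbital velocity in the kerr metric but neglects light bending (the photon angular momentum is taken in the flat-space limit), so it only roughly reproduces the tabulated g ranges; the full treatment is the one given in [12].

```python
import numpy as np

def g_extremes(r, a, incl_deg):
    """approximate extremal energy shifts g = E_obs/E_em for a narrow
    keplerian ring at boyer-lindquist radius r (units GM/c^2), spin a
    and inclination incl_deg; no light bending is included."""
    s = np.sin(np.radians(incl_deg))
    omega = 1.0 / (r**1.5 + a)                 # keplerian angular velocity
    # time component of the orbiting matter's 4-velocity
    ut = (r**1.5 + a) / (r**0.75 * np.sqrt(r**1.5 - 3.0 * r**0.5 + 2.0 * a))
    lam = r * s                                # photon L/E, flat-space approximation
    g_red = 1.0 / (ut * (1.0 + omega * lam))   # receding side of the ring
    g_blue = 1.0 / (ut * (1.0 - omega * lam))  # approaching side of the ring
    return g_red, g_blue

for r in (3.0, 4.0, 6.0):                      # ring radii of the test model
    gmin, gmax = g_extremes(r, a=0.93, incl_deg=30.0)
    print(f"r = {r:.0f} rg: g roughly in ({gmin:.2f}, {gmax:.2f})")
```

for the innermost ring the toy formula gives g ≈ 0.41–0.67, compared with 0.36–0.81 in table 1; the difference is mainly the neglected light bending, which widens the observable range of shifts.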
3. model spectrum considering background
3.1. lad background
the lad background has been analyzed and computed by [14], using monte carlo simulations of a mass model of the whole loft spacecraft and of all known radiation sources in the loft orbit; see figure 2. the simulations make it obvious that the background is dominated (> 70 %) by high-energy photons of the cosmic x-ray background and by earth albedo leaking and scattering through the collimator structure, which becomes less efficient at high energies. these two sources are stable and predictable, although small modulations of these components exist due to the orbital motion of the satellite around the earth. one of the varying sources is the particle-induced background (< 6 % of the overall background). the largest modulation of the total lad background is estimated as < 20 %, and it can be effectively described by a geometrical model that should predict the background at the level of 1 % or better (1–20 kev).

figure 2. the lad background and its various contributions: the results of a study in which the behavior of each background component was modeled along the satellite orbit. the spectrum of a 10 mcrab source is also shown [14].

3.2. data to model ratio
we considered the model presented in the previous section and tested the expected impact of background contamination by applying corrnorm to the background file in xspec. figure 3 shows the data to model ratio with the background randomized at the 5, 7 and 10 % levels. the blue peaks can be recognized in the first two graphs, while the red peaks are lost in the noise; a 10 % inaccuracy of the background degrades the visibility of the peaks. figure 4 demonstrates the expected accuracy with which the model parameters are constrained. the results can seem counter-intuitive (the 10 % level appears better than the 5 % level); this is caused by the method of fitting: in our case we know the parameters of the system in advance, and the constraints would look similar. the study of the background deserves more investigation.

figure 3. the ratio between the simulated data and the baseline model with a background level of 5 % (top), 7 % (middle) and 10 % (bottom).

figure 4. constraints on the best-fit model with a background level of 5 % (top), 7 % (middle) and 10 % (bottom). the constraints on the best-fit model parameters are derived from the simulated data; confidence contours (1, 2, and 3σ) of the inner ring radius rin vs. the dimensionless spin a are shown.
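the effect of a mis-modelled background can be mimicked in a few lines of numpy. the sketch below is only a schematic of the procedure described above (the actual analysis used the corrnorm correction in xspec); the toy power-law and the flat background shape are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
e = np.linspace(2.0, 10.0, 160)                 # energy grid [keV]
src = 2e4 * e**-1.9                             # toy source counts per bin
bkg = 1e5 * np.ones_like(e)                     # toy (flat) background counts

for frac in (0.05, 0.07, 0.10):
    # the 'true' data contain the real background; the model subtracts a
    # background whose normalization is wrong by up to +-frac
    data = rng.poisson(src + bkg).astype(float)
    bkg_model = bkg * (1.0 + rng.uniform(-frac, frac))
    ratio = (data - bkg_model) / src            # analogue of the data/model ratio
    print(f"{frac:.0%} background error: ratio spread "
          f"{ratio.min():.2f} .. {ratio.max():.2f}")
```

because the background dominates the raw counts, even a few-percent error in its normalization translates into ratio residuals comparable to the narrow ring features themselves, which is the point made by figure 3.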
4. conclusions
in this paper, we have applied the background response file of the proposed loft satellite to the test model from [9], and we have studied the variability of the model spectrum. our result is that a 10 % inaccuracy is the limiting value of the background, and it should not be exceeded. an accuracy of the background at the 5 % level is sufficient to recognize and fit the structures in the model spectrum and the peaks of the energy shifts, and also to determine the parameters, as described in [12].

acknowledgements
we would like to acknowledge support from gačr 1300070j.

references
[1] a. c. fabian, k. iwasawa, c. s. reynolds, a. j. young. broad iron lines in active galactic nuclei. pasp 112:1145–1161, 2000. doi:10.1086/316610.
[2] c. s. reynolds, m. a. nowak. fluorescent iron lines as a probe of astrophysical black hole systems. physics reports 377:389–466, 2003. doi:10.1016/s0370-1573(02)00584-7.
[3] j. m. miller, a. c. fabian, r. wijnands, et al. resolving the composite fe kα emission line in the galactic black hole cygnus x-1 with chandra. apj 578:348–356, 2002. doi:10.1086/342466.
[4] j. e. mcclintock, r. a. remillard. compact stellar x-ray sources. cambridge university press, cambridge, 2006.
[5] i. d. novikov, k. s. thorne. black holes. gordon and breach publishers, new york, 1973.
[6] d. n. page, k. s. thorne. disk-accretion onto a black hole. time-averaged structure of accretion disk. apj 191:499–506, 1974. doi:10.1086/152990.
[7] b. czerny, a. różańska, m. dovčiak, et al. the structure and radiation spectra of illuminated accretion disks in agn. ii. flare/spot model of x-ray variability. a&a 420:1–16, 2004. doi:10.1051/0004-6361:20035741.
[8] d. a. uzdensky, j. goodman. statistical description of a magnetized corona above a turbulent accretion disk. apj 682:608–629, 2008. doi:10.1086/588812.
[9] v. sochora, v. karas, j. svoboda, m. dovčiak. black hole accretion rings revealed by future x-ray spectroscopy. mnras 418:276–283, 2011. doi:10.1111/j.1365-2966.2011.19483.x.
[10] m. feroci, l. stella, a. vacchi, et al. loft: a large observatory for x-ray timing. in society of photo-optical instrumentation engineers (spie) conference series, vol. 7732, 2010. arxiv:1008.1009, doi:10.1117/12.857337.
[11] m. feroci, l. stella, m. van der klis, et al. the large observatory for x-ray timing (loft). experimental astronomy 34:415–444, 2012. arxiv:1107.0436, doi:10.1007/s10686-011-9237-2.
[12] v. karas, v. sochora. extremal energy shifts of radiation from a ring near a rotating black hole. apj 725:1507–1515, 2010. doi:10.1088/0004-637x/725/2/1507.
[13] s. kato, j. fukue, s. mineshige. black-hole accretion disks. kyoto university press, kyoto, 1998.
[14] r. campana, m. feroci, e. del monte, et al. the loft (large observatory for x-ray timing) background simulations. in society of photo-optical instrumentation engineers (spie) conference series, vol. 8443, 2012. arxiv:1209.1661, doi:10.1117/12.925999.

acta polytechnica 53(2):223–227, 2013, © czech technical university in prague, 2013, available online at http://ctn.cvut.cz/ap/

positioning of the precursor gas inlet in an atmospheric dielectric barrier reactor, and its effect on the quality of the deposited tiox thin film surface

jan pichal∗, julia klenko
czech technical university in prague, faculty of electrical engineering, technicka 2, prague 6, czech republic
∗ corresponding author: pichal@fel.cvut.cz

abstract. thin film technology has become pervasive in many applications in recent years, but it remains difficult to select the best deposition technique. a further consideration is that, due to ecological demands, we are forced to search for environmentally benign methods. one such method might be the application of cold plasmas, and there has already been rapid growth in studies of cold plasma techniques. plasma technologies operating at atmospheric pressure have been attracting increasing attention. the easiest way to obtain low-temperature plasma at atmospheric pressure seems to be atmospheric dielectric barrier discharge (adbd). we used the plasma enhanced chemical vapour deposition (pecvd) method with atmospheric dielectric barrier discharge (adbd) plasma for tiox thin film deposition, employing titanium isopropoxide (ttip) and oxygen as reactants and argon as the working gas. adbd was operated in filamentary mode. the films were deposited on glass. we studied the quality of the deposited tiox thin film surface for various precursor gas inlet positions in the adbd reactor. the best thin film quality was achieved when the precursor gases were brought close to the substrate surface directly through an inlet placed in one of the electrodes. high hydrophilicity of the samples was proved by contact angle (ca) tests. the film morphology was tested by atomic force microscopy (afm). the thickness of the thin films varied in the range of (80 ÷ 210) nm in dependence on the composition of the reactor atmosphere. xps analyses indicate that the composition of the films is closer to that of tioxcy.
keywords: afm, atmospheric dielectric barrier discharge, chemical composition, chemical vapour deposition, inlet position, precursor gas, surface quality, thin film, tiox, tioxcy, xps.

1. introduction
thin film deposition techniques and technologies have undergone serious development and cultivation in recent decades. a vast number of deposition methods are now available [6, 7], but it still remains arduous to select the best deposition method which is, at the same time, environmentally friendly. the application of cold plasmas sustained at atmospheric pressure, in combination with the chemical vapour deposition method, seems to be a promising approach. cold plasma is often produced in plasma jets, plasma torches and adbd (for details about adbd, see e.g. [5]). however, research in this area is still mostly restricted to the laboratory stage. for practical reasons, the most tested films are siox and tiox coatings. this paper summarizes results obtained when tiox films are deposited on glass substrates using the pecvd method in an adbd plasma reactor, with ttip (less toxic than ticl4, which was used e.g. in [2]) applied as the precursor, namely the connection between the precursor gas inlet position in the adbd reactor and the deposited film quality. some preliminary results have been published in [3, 4].

2. experimental
2.1. the reactor and the experimental conditions
film deposition was performed with a discharge power of about 350 mw (14 kv, 50 hz). the experiments were carried out in an open, flow-through type plasma reactor with dimensions (90 × 79 × 41) mm. the scheme of the plasma reactor is shown in fig. 1.

figure 1. scheme of the plasma reactor; 1 – ground electrode, 2 – dielectric barrier, 3 – substrate, 4 – hv electrode, 5 – evaporator, 6 – mass flow controller, 7 – hv supply, a, b, c, d – inlet positions.

plasma was sustained between two brass electrodes [a (45 × 8 × 18) mm hv electrode and a (40 × 17 × 18) mm ground electrode] placed within the reactor. a barrier manufactured from a glass plate ((70 × 46 × 1) mm) covered the ground electrode. the distance between the electrodes was fixed at 4 mm. three types of hv electrode were used in the experiments, all of them with identical external dimensions, but differentiated by the hole (diameter 3 mm) leading into the inter-electrode region. the first electrode was without a hole, the hole of the second type was connected with one inlet (c) only, and the third type had two inlets, c and d (see fig. 1; for simplicity, all four inlet positions are drawn there, although only one pair of inlets was used in each experiment). adbd was sustained in filamentary mode. thin films were obtained at atmospheric pressure using titanium(iv) isopropoxide as the precursor (ttip, ti[och(ch3)2]4, 97 % purity). ttip volatilized at temperatures of (30.0 ± 0.5) °c. it was mixed with argon in the evaporator, transported into the reactor, and reacted with oxygen (or, in the first experiments, merely with dry air, when atmospheric oxygen took part in the reaction). the gas flow rates were adjusted by means of the mass flow controllers. deposition tests were performed with ttip/ar flows of (0.5 ÷ 4.0) slm. the oxygen/dry air flow was controlled within (2.5 ÷ 10) slm. the outer atmosphere was air with a relative humidity of (36 ÷ 47) % and a room temperature of about 20 °c. the deposition time was 10 minutes in all experiments.
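the composition of the reactor atmosphere follows directly from the flow-controller settings, and a small helper makes the arithmetic explicit. the ttip vapour flow itself is not stated in the text, so it enters the sketch only as an assumed, user-supplied value; the 0.00175 slm used in the example is back-computed from the 0.05 % ttip content quoted later in section 3 and is purely illustrative.

```python
def mixture_fractions(q_ttip_ar_slm, q_ox_slm, q_ttip_slm=0.0):
    """volume fractions of the reactor feed for given flow rates [slm];
    q_ttip_slm is the (assumed) ttip vapour flow carried by the argon."""
    total = q_ttip_ar_slm + q_ox_slm
    return {
        "ttip/ar fraction": q_ttip_ar_slm / total,
        "oxidizer fraction": q_ox_slm / total,
        "ttip content [%]": 100.0 * q_ttip_slm / total,
    }

# e.g. one optimum setting from section 3: 1 slm ttip/ar, 2.5 slm oxygen
print(mixture_fractions(1.0, 2.5, q_ttip_slm=0.00175))  # -> ~0.05 % ttip
```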
unfortunately, the reactor has no "clean interface", so surface-related chemical reactions, or contamination by ambient air species, began while the films were being removed from the reactor, before any test was initialized. these chemical reactions also proceeded while the films were being stored. the deposited films were stored in darkness at room temperature (20 ÷ 23) °c and relative humidity (30 ÷ 40) %, in plastic boxes in air at atmospheric pressure.

2.2. film analysis
the surface morphology of the films was examined using the atomic force microscopy (afm) technique in non-contact mode, performed under ambient conditions on an frt afm scanning probe microscope at the technical university of liberec. all (2 × 2) µm scans were processed by the gwyddion software for spm (scanning probe microscopy) data visualization and analysis. to analyse the hydrophilicity, a contact angle (ca) test was applied. the ca was measured by the sessile drop technique, a constant time (30 s) after depositing a drop of distilled water about 0.5 µl in volume. the ca of each sample was measured at 5 different positions, at room temperature. the surface chemical composition of the films was investigated by x-ray photoelectron spectroscopy (xps). a multi-channel hemispherical electrostatic analyser (phoibos 100, specs) was used. the al kα line (1486.6 ev) was used, with an x-ray incidence angle of 45° to the surface plane. the analyser was operated in retarding-field mode, applying a pass energy of 40 ev for the survey scans and 10 ev for all core level data. the xps peak positions were referenced to the aliphatic carbon component at 285.0 ev. due to inadequate equipment, we were unfortunately not able to clean the film profile by sputtering and perform xps profile tests, so our research had to focus on film surface tests only.
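charge referencing of the xps energy scale, as described above, amounts to a uniform shift of all measured binding energies; a minimal sketch (with made-up raw peak positions) looks as follows.

```python
def charge_correct(peaks_ev, c1s_measured_ev, c1s_ref_ev=285.0):
    """shift all measured binding energies so that the aliphatic c 1s
    component falls at the 285.0 eV reference used in this work."""
    shift = c1s_ref_ev - c1s_measured_ev
    return {name: be + shift for name, be in peaks_ev.items()}

# hypothetical raw peak positions [eV] before referencing
raw = {"c 1s": 285.6, "o 1s (tio2)": 531.1, "ti 2p3/2": 459.4}
print(charge_correct(raw, c1s_measured_ev=285.6))
# -> c 1s at 285.0, o 1s at 530.5, ti 2p3/2 at 458.8 (cf. the values below)
```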
3. results
deposition of the films proceeds in substrate-surface limited reactions. the films were deposited in adbd sustained in filamentary mode. the deposited surface was in general irregularly corrugated and hummocky, with growing thickness in spots where filaments bridged the electrodes. the growing/diminishing thickness of the film area reflects local electric field inhomogeneities associated with the existence of filaments. the surface topography was similar to layers deposited with the ticl4 precursor [2, 10]. the gassing of the reactor was performed through various pairs of gas inlets (see fig. 1). we used the following combinations:
(1.) a (air at atm. pressure) and b (ttip/ar);
(2.) a (air at atm. pressure) and c (ttip/ar);
(3.) c (ttip/ar) and d (oxygen);
(4.) c (oxygen) and d (ttip/ar).
for each combination of inlets we tested the effect of various ttip and o2 concentrations in the ttip/ar/o2 mixture on the film characteristics. only the best results are mentioned in the following summary.

1. the reactor was gassed both with ttip/ar (0.5 slm) and with dry air through inlets a and b in the walls of the reactor. several different positions of both inlets were tested, but only powder-like structures were deposited on the glass substrate for all the combinations of a and b used.

2. to reduce pulverization, the reactor was gassed through inlet a in the wall of the reactor with dry air, and ttip/ar (0.5 slm) was fed through inlet c (the hole, 3 mm in diameter, in the hv electrode). the film that formed on the glass substrate was very thin and barely detectable. we suppose that the ttip/oxygen reaction was weak, due to the low concentration of oxygen atoms (from the air) in the reactor atmosphere; the precursor molecules then flew out from the inter-electrode space and reacted with oxygen atoms later (and we observed a tiox thin film deposited on the inner walls of the reactor). afm measurements revealed the existence of some salient parts on the uniform powder-like film surface (fig. 2). small hummocks were also visible. ca measurements were almost impossible due to the inhomogeneity and the thinness of the film, and the results were not reproducible. only powder-like tiox structures were deposited on the substrate for ttip/ar flows higher than 0.5 slm.

figure 2. topography of the tiox films; inlets: a (dry air from the atmosphere), c (ttip/ar, 0.5 slm).

the high-resolution xps spectra for the main elements in the films are shown in figs. 3–5. figure 3 represents the ti 2p spectrum, which consists of the 2p3/2 and 2p1/2 spin-orbit components located at 458.8 ev and 464.6 ev, respectively. this position of the peak maxima indicates that the main titanium compound is tio2. the small components on the lower binding energy side correspond to sub-stoichiometric titanium oxide tiox, x < 2. the o 1s spectrum is shown in fig. 4. the peak consists of two distinct components, at 530.5 ev and 532.7 ev. the first component is classified as tio2, whereas the second component corresponds to carbon-oxygen species. the carbon-oxygen species are about three times more abundant than the oxygen bound in titanium (24 % against 76 %). we suppose that the carbon contamination was probably only superficial. this contamination may have originated both during the deposition process, from impurities in the air, and from post-discharge reactions and adsorption of various species from the ambient air atmosphere after the film was removed from the reactor.

figure 3. ti 2p xps spectrum of the film, inlets a, c.
figure 4. o 1s xps spectrum of the film, inlets a, c.
figure 5. c 1s xps spectrum of the film, inlets a, c.

3. to improve the quality of the films and optimize the deposition conditions, both components were introduced through the inlets in the hv electrode, and the atmospheric air was replaced by oxygen (c (ttip/ar) and d (oxygen)). nevertheless, problems still persisted. afm analysis proved that the deposited films were not fully homogeneous (fig. 6), and there were problems with the creation of powder-like structures. for further details, see [3].

figure 6. inlets: c (ttip/ar, 1 slm), d (o2, 2.5 slm).

4. the best quality films were deposited in the same arrangement, but, unlike the previous combination of inlets, oxygen (2.5 slm) was fed through inlet c, and the ttip/ar mixture (1 slm or 2 slm), i.e. a ttip content of (0.05 or 0.10) %, was fed in through inlet d. the mixture entered through the hole (diameter 3 mm) in the hv electrode into the inter-electrode space.

figure 7. topography of the tiox films; inlets: c (ttip/ar, 1 slm), d (o2, 2.5 slm).

figure 7 is an afm scan of the film surface deposited under optimum conditions. the surface is similar to the surface described in [10]. it is characterised by higher hummocks than in fig. 6, where film deposition in surface-limited reactions was accompanied by volume-limited, dust-generating reactions.
the ca tests proved that all samples were hydrophilic immediately after deposition. for an ar/ttip flow rate of 1 slm, the ca value attained 5° immediately after deposition. the hydrophilicity of the films remained almost invariable in the first 7 days after deposition. later, the wettability worsened, and within 28 days after deposition the ca value of all samples exceeded 40°. films deposited with a ttip/ar flow of 2 slm changed more rapidly from hydrophilic to hydrophobic. the chemical composition of the films (ttip/ar flows (0.5 ÷ 4.0) slm, oxygen (2.5 ÷ 10) slm) was almost constant. high-resolution spectra for the main elements in the films are shown in figs. 8–10. the relatively high hydrocarbon contamination on the film surface was again most probably produced by post-discharge reactions and by the adsorption of various species from the ambient air atmosphere after the film was removed from the reactor.

figure 8. c 1s xps spectrum of the film; inlets c (oxygen, 2.5 slm), d (ttip/ar, 2 slm).
figure 9. o 1s xps spectrum of the film; inlets c (oxygen, 2.5 slm), d (ttip/ar, 2 slm).

the carbon (contamination) is partially bonded with titanium (the c−ti bond is at about 283 ev). carbon forms mostly c−c (285 ev) backbone chains, some of which are partly oxidized (fig. 8). the o 1s spectra are shown in fig. 9. note that the fwhm (full width at half maximum) is more than 2 ev. this broadening is evidently caused by substoichiometric titanium oxides [1]. in addition, the second peak, at 532.8 ev, can be considered a contribution from single and double oxygen-carbon bonds [9]. titanium is a reactive element and easily forms oxides and carbides, which can be seen in the ti 2p curve (fig. 10). the location of the strongest peaks at 458.8 ev and 465 ev indicates that the main titanium species is tio2 [8]. the small peaks at 457 ev and 462.6 ev indicate a mixed presence of substoichiometric titanium oxides tixoy and titanium carbides tic.

figure 10. ti 2p xps spectrum of the film; inlets c (oxygen, 2.5 slm), d (ttip/ar, 2 slm).

the xps spectra demonstrate that the titanium in the near-surface area is strongly oxidized, the dominant species being tio2 and substoichiometric titanium oxides. the deposition process more likely produced tioxcy films instead of the primarily desired tiox films. for more details see [4].
4. conclusion
film deposition on glass substrates was performed by the pecvd method in the adbd plasma reactor. the plasma reactor was of an open, flow-through type. adbd was sustained in filamentary mode. the reactor atmosphere consisted either of a ttip/ar/dry air or of a ttip/ar/oxygen mixture. we studied the quality of the deposited tiox thin film surface for various precursor gas inlet positions in the adbd reactor, and for various precursor/oxidizer mixture compositions. the best film quality was achieved when the precursor and the oxidizer entered the discharge region immediately after they were mixed, through the hole adjacent to the substrate. the surface topography was influenced by the non-equilibrium character of adbd, leading to an irregularly corrugated and hummocky film surface. ca tests proved the high hydrophilicity of the samples immediately after deposition. later, the wettability of the films diminished, and the ca value of all samples exceeded 40 degrees after 28 days; the changes were probably related to chemical reactions between the surface of the film and the chemical groups present in the air atmosphere. xps tests indicate that the deposition process more likely produced tioxcy films instead of the primarily desired tio2 or tiox films. all samples exhibited contamination with carbon, probably caused by post-discharge reactions and by adsorption of various species from the ambient air atmosphere after the film was removed from the reactor. some problems of this deposition method are related to the two different chemical processes that take place during deposition: surface-related chemical processes, resulting in conventional pecvd film deposition, and undesired volume-related chemical processes, resulting in dust production. the dust-producing mechanism prevails under certain working conditions (e.g. higher oxygen flow rates). dust particles, when created, remain in the discharge region, and their layer(s) on the substrate hinder the effective formation of a more homogeneous film and influence the quality of the film. another problem of pecvd thin film deposition with adbd seems to be the filamentary character of the adbd in some applications, leading to the generation of hummocks that form a rough film surface.

acknowledgements
this work was supported by czech technical university in prague grant no. sgs10/266/ohk3/3t/13.

references
[1] i. bertoti, m. mohai, j. l. sullivan, s. o. saied. surface characterisation of plasma-nitrided titanium: an xps study. appl. surf. sci. 84:357–371, 1995.
[2] lan-bo di, xiao-song li, chuan shi, et al. atmospheric-pressure plasma cvd of tio2 photocatalytic films using surface dielectric barrier discharge. j. phys. d: appl. phys. 42:032001, 2009.
[3] y. klenko, j. pichal. deposition of tio2 thin films using atmospheric dielectric barrier discharge. problems of atomic science and technology, series: plasma physics 48:177–179, 2008.
[4] y. klenko, j. pichal. tiox films deposited by the plasma enhanced chemical vapour deposition method in atmospheric dielectric barrier discharge plasma. plasma chemistry and plasma processing 32:1215–1225, 2012.
[5] u. kogelschatz. dielectric-barrier discharges: their history, discharge physics and industrial applications. plasma chemistry and plasma processing 23:1–46, 2003.
[6] krishna seshan (ed.). handbook of thin-film deposition. elsevier, oxford, 2012.
[7] d. m. mattox. handbook of physical vapor deposition (pvd) processing. elsevier (william andrew), oxford, 2010.
[8] s. w. ryu, e. j. kim, s. k. ko, s. h. hahn. effect of calcination on the structural and optical properties of m/tio2 thin films by rf magnetron co-sputtering. materials letters 58:582–587, 2004.
[9] t. solomun, a. schimanski, h. sturm, e. illenberger. reactions of the amide group with fluorine as revealed with surface analytics. chemical physics letters 387:312–316, 2004.
[10] x. w. zhang, g. r. han. microporous textured titanium dioxide films deposited at atmospheric pressure using dielectric barrier discharge assisted chemical vapor deposition. thin solid films 516:6140–6144, 2008.
acta polytechnica vol. 41 no. 1/2001, © czech technical university publishing house, http://ctn.cvut.cz/ap/

thermal and hygric expansion of high performance concrete

j. toman, r. černý

the linear thermal expansion coefficient of two types of high performance concrete was measured in the temperature range from 20 °c to 1000 °c, and the linear hygric expansion coefficient was determined in the moisture range from dry material to saturation water content. comparative methods were applied for the measurements of both coefficients. the experimental results show that both the effect of temperature on the values of the linear thermal expansion coefficient and the effect of moisture on the values of the linear hygric expansion coefficient are very significant and cannot be neglected in practical applications.

keywords: concrete, linear thermal expansion, linear hygric expansion, moisture, temperature.

1 introduction
length changes of porous materials can be affected by changes of both temperature and moisture content [1]. hygric expansion is often not taken into account in practical measurements and calculations, although a higher content of moisture, particularly in the liquid state, can lead to hygric stresses which are at least comparable with thermal stresses [2]. in this paper, we measure both the linear thermal expansion coefficient and the linear hygric expansion coefficient of two types of high performance concrete.

2 thermal and hygric expansion parameters
the infinitesimal change of length due to a change of temperature is defined by
$\mathrm{d}l_T = l_0\,\alpha_T\,\mathrm{d}T, \qquad (1)$
where $l_0$ is the length at the reference temperature $T_0$, and $\alpha_T$ is the linear thermal expansion coefficient. in an analogous way, the linear hygric expansion coefficient $\alpha_u$ can be defined,
$\mathrm{d}l_u = l_0\,\alpha_u\,\mathrm{d}u, \qquad (2)$
where
$u = \dfrac{m_m - m_d}{m_d}\cdot 100 \qquad (3)$
is the moisture content in %, $m_m$ is the mass of the moistened material, and $m_d$ is the mass of the dried material. applying, in the first approximation, the superposition principle to the length changes due to temperature and moisture, we arrive at
$\mathrm{d}l = \mathrm{d}l_T + \mathrm{d}l_u = l_0\,(\alpha_T\,\mathrm{d}T + \alpha_u\,\mathrm{d}u). \qquad (4)$
defining the relative elongation as
$\epsilon = \dfrac{l - l_0}{l_0}, \qquad (5)$
we obtain
$\epsilon(u, T) = \epsilon(u_0, T_0) + \displaystyle\int_{T_0}^{T}\alpha_T(T, u)\,\mathrm{d}T + \int_{u_0}^{u}\alpha_u(T, u)\,\mathrm{d}u. \qquad (6)$
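equation (6) is evaluated in practice by numerical integration of the measured coefficients. a minimal sketch follows, assuming hypothetical tabulated values of α_T(T) and α_u(u); the placeholder numbers below are illustrative only and are not the measured data of this paper.

```python
import numpy as np

# placeholder tabulated coefficients (illustrative values only)
T_tab = np.array([20.0, 200.0, 400.0, 600.0, 800.0, 1000.0])    # [degC]
aT_tab = np.array([10.0, 11.0, 13.0, 16.0, 14.0, 12.0]) * 1e-6  # [1/degC]
u_tab = np.array([0.0, 1.0, 2.0, 4.0])                          # [% kg/kg]
au_tab = np.array([4.0, 3.0, 2.0, 1.5]) * 1e-3                  # [1/%]

def relative_elongation(T, u, T0=20.0, u0=0.0):
    """eq. (6) with eps(u0, T0) = 0: integrate alpha_T over temperature
    and alpha_u over moisture content (superposition approximation)."""
    Tg = np.linspace(T0, T, 201)
    ug = np.linspace(u0, u, 201)
    return (np.trapz(np.interp(Tg, T_tab, aT_tab), Tg)
            + np.trapz(np.interp(ug, u_tab, au_tab), ug))

print(f"eps(600 degC, dry)      = {relative_elongation(600.0, 0.0):.2e}")
print(f"eps(25 degC, 4 % kg/kg) = {relative_elongation(25.0, 4.0):.2e}")
```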
it had a dry density of 2290 kg/m3, and consisted of the following components: cement cpa hp le havre (290 kgm�3), sand 0/5 size fraction (831 kgm�3), gravel sand 5/12.5 size fraction (287 kgm�3), gravel sand 12.5/25 size fraction (752 kgm�3), thermal and hygric expansion of high performance concrete j. toman, r. černý the linear thermal expansion coefficient of two types of high performance concrete was measured in the temperature range from 20 °c to 1000 °c, and the linear hygric expansion coefficient was determined in the moisture range from dry material to saturation water content. comparative methods were applied for measurements of both coefficients. the experimental results show that both the effect of temperature on the values of linear thermal expansion coefficients and the effect of moisture on the values of linear hygric expansion coefficients are very significant and cannot be neglected in practical applications. keywords: concrete, linear thermal expansion, linear hygric expansion, moisture, temperature. calcareous filler piketty (105 kgm�3), silica fume (30 kgm�3), water (131 kgm�3), retarder chrytard 1.7, super-plasticizer resine gt 10.62. the maximum water saturation was 4 %kg/kg. the temelin concrete used for the concrete containment building of the temelin nuclear power plant in the czech republic had a dry density of 2200 kg/m3 and maximum water saturation 7 %kg/kg. the composition was as follows: cement 42.5 r mokrá (499 kgm�3), sand 0/4 size fraction (705 kgm�3), gravel sand 8/16 size fraction (460 kgm�3), gravel sand 16/22 size fraction (527 kgm�3), water (215 kgm�3), plasticizer 4.5 lm�3. the measurements of both linear thermal expansion and linear hygric expansion were performed on 12 samples each. the dimensions of the samples were 40 × 40 × 120 mm, and the centers of the 40 × 40 mm faces were provided with contact seats for use with the contact comparator. 5 experimental results 5.1 linear thermal expansion coefficient figs. 1a, b show the measured linear thermal expansion coefficient of both types of high performance concrete in the temperature range from 20 °c to 1000 °c. the experimental results for temelin concrete show an abrupt change in the character of the �(t ) function at approximately 500 °c. the course of the � (t ) function for penly concrete is even more dramatic, with several maxima and minima. this is apparently a consequence of structural changes in the concrete due to the chemical processes taking place in the studied temperature range. 5.2 linear hygric expansion coefficient the measured results are summarized in figs. 2a, b. both types of high performance concrete exhibited a very similar behavior, and their linear hygric expansion coefficient decreased with the moisture content. 6 conclusions the linear thermal expansion of two types of high performance concrete was determined in a wide temperature range, and the linear hygric expansion coefficient in wide moisture range. the changes both of linear thermal expansion coefficient with temperature and of linear hygric expansion coefficient with moisture were found to be very significant, so that they cannot be neglected in practical applications. acknowledgement this paper is based on work supported by the ministry of education of the czech republic, under contract no. cez: j04/98: 210000004. © czech technical university publishing house http://ctn.cvut.cz/ap/ 3 acta polytechnica vol. 41 no.1/2001 fig. 
fig. 1a: dependence of the linear thermal expansion coefficient of temelin concrete on temperature
fig. 2a: dependence of the linear hygric expansion coefficient of temelin concrete on moisture
fig. 1b: dependence of the linear thermal expansion coefficient of penly concrete on temperature
fig. 2b: dependence of the linear hygric expansion coefficient of penly concrete on moisture

references
[1] schulgasser, k.: moisture and thermal expansion of wood, particle board and paper. paper and timber, no. 6, 3 (1988)
[2] toman, j., černý, r.: coupled thermal and moisture expansion of porous materials. int. j. thermophysics, vol. 17, 271 (1996)
[3] toman, j., černý, r.: measuring the thermal expansion of building materials at high temperatures. proceedings of the seminar: research activities of departments of physics in the czech and slovak republic, p. 51, stu bratislava, 1997 (in czech)

prof. mgr. jan toman, drsc., department of physics, phone: +420 2 2435 4694, e-mail: toman@fsv.cvut.cz
prof. ing. robert černý, drsc., department of structural mechanics, phone: +420 2 2435 4429, e-mail: cernyr@fsv.cvut.cz
ctu, faculty of civil engineering, thákurova 7, 166 29 prague 6, czech republic

acta polytechnica 56(2):138–146, 2016, doi:10.14311/ap.2016.56.0138, © czech technical university in prague, 2016, available online at http://ojs.cvut.cz/ojs/index.php/ap

experimental levelling at the interface of optical environments

h. sirůčková
department of special geodesy, faculty of civil engineering, czech technical university in prague, thákurova 7, praha 166 29
correspondence: hana.siruckova@fsv.cvut.cz

abstract. the article discusses the problems of refraction and its impact on levelling at the interface of optical environments. it describes the influence of vertical refraction and shows the results of an investigation of the effect of refraction in the course of levelling at the interface of optical environments. the results of the experiment were obtained by levelling through the building of the national technical library in prague dejvice.

keywords: refraction, levelling, temperature gradient, digital thermometer.

1. terrestrial refraction
refraction is defined as the deflection of a light ray passing through an inhomogeneous medium. as a consequence, target b will be viewed by the observer at the point of observation a in the tangential direction of the generally curved spatial path of the ray, which is deviated from the direct line by the refractive angle ρ. a simplified illustration is presented in figure 1.

figure 1. refractive angle.

terrestrial refraction is caused by the passing of a light ray through the lowest layers of the atmosphere, known as the microclimate. the boundary of the microclimate has not yet been clearly determined. according to [4], this environment extends to a height of 2 m above the ground, while, according to [5], it extends to a height of 3 m above the ground. most geodetic measurements are performed in this very complex and time-varying optical medium. a light ray, on the path from the source to the receiver, penetrates from one environment in the atmosphere to another. according to snell's law of refraction, light is refracted when passing through the boundary of each environment, and thus changes its original direction. the sight line is tangential to the generally curved spatial path of the ray, deviating from the direct line by the refraction angle ρ.
in geodesy, refraction can be divided, in terms of the character of the measurable quantities, into:
• the component affecting the measured horizontal directions — horizontal refraction;
• the component affecting the measured zenith angles — vertical refraction.
in open spaces, the vertical refraction is generally considerably larger than the horizontal refraction: the temperature gradient is more distinctive in the vertical than in the horizontal direction (caused by the arrangement of layers of air of varying temperature that are nearly parallel to the surface). however, under certain circumstances, the refractive angle in the horizontal direction is greater than the one in the vertical direction. this is caused by the influence of the environment, e.g., measuring through a forest or past a building. the influence of a building causes the temperature to vary more in the horizontal than in the vertical direction. the extent of the effect differs depending on the type and colour of the surface — stone, plaster, glass (a white stucco, marble or glass surface will each affect the measurement differently).

2. vertical refraction
refraction will generally have the greatest effect on the measured values in the vertical plane when measuring zenith angles, especially with the sight line extending close to the ground. in terms of the optical environmental properties, temperature and pressure conditions play the biggest role, as they decisively affect the density, and hence the refractive index, of the atmospheric layers. in the past, efforts were made to specify a universal refraction coefficient, the introduction of which would eliminate the effect of refraction from measurements. under our conditions, the best known and most widely used average value of the refraction coefficient, k = 0.1306, was determined by gauss between 1823 and 1826 from the adjustment of what is known as the hanover arc measurements between göttingen and altona. in a free space fairly high above the ground ([3] regards at least 30 m above the ground as an adequate measuring height), the temperature gradient is 0.01 °c m−1, which corresponds to a refraction coefficient of 0.14. however, near the ground, the temperature gradient varies considerably, according to the intensity of solar radiation and the type of the surface over which the sight line passes.
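the quoted link between the temperature gradient and the refraction coefficient can be checked numerically. one common textbook expression — an assumption introduced here, not a formula from this paper — is k = 503 p/T² (0.0343 + dt/dh), with p in hpa and t in kelvin, and with dt/dh negative when the temperature decreases with height. with a decrease of 0.01 °c per metre near standard conditions it gives k ≈ 0.15, close to the value of 0.14 mentioned above, and a gradient of about −0.013 °c m−1 roughly reproduces the gauss value k ≈ 0.13.

```python
def refraction_coefficient(dt_dh, p_hpa=1013.25, t_kelvin=288.15):
    """textbook estimate (assumed expression, see above) of the refraction
    coefficient k from the vertical temperature gradient dt_dh [degC/m],
    negative = temperature decreasing with height."""
    return 503.0 * p_hpa / t_kelvin**2 * (0.0343 + dt_dh)

print(f"{refraction_coefficient(-0.010):.2f}")   # -> 0.15, cf. k = 0.14 in the text
print(f"{refraction_coefficient(-0.013):.2f}")   # -> 0.13, cf. the gauss value
```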
experiments in slovakia are described in [8]. that article deals with trigonometric determination of heights with sight lines of tens of meters up to 300 meters; to reduce the influence of refraction, a refraction model built from measured meteorological data is used. during the experiment, the changes in temperature, humidity and barometric pressure of the air were measured. the trigonometric measurements were carried out in several places (bratislava, kleština, haje and gabčíkovo) and over several days and nights between the years 1975 and 2000. the observed points were monumented by bored piles half a meter in diameter and 6 to 10 m deep, extending above the ground as pillars 1.3 m and 3 m high. the pillars were fitted with forced centering for the instruments and targets. the height difference between the points was determined twice by high-precision levelling, at the beginning and at the end of the experiment. the lengths between the points were measured by electronic distance meters with an empirical rmse of up to 5 mm. the zenith angles were measured with wild t3, elta s10 and geodimeter 600 instruments. the measurements were organized in groups repeated at 1–2 hour intervals. the changes in temperature, humidity and barometric pressure of the air were measured simultaneously. in the measurements on the 8th and 9th of june 1976, the difference between the height difference measured by precision levelling and the trigonometrically calculated one gave group averages of −4.3 to +3.3 mm. in the case of the adjusted group averages, this deviation ranged from −2.8 to +3.3 mm.

another way to determine the effect of vertical refraction is mentioned in [9]. the author describes a process for removing the influence of vertical refraction based on measurements of temperature, pressure and relative humidity. he also recommends placing multiple sensors for the temperature measurement, while it is sufficient to measure the atmospheric pressure and the humidity at one place only, because the pressure does not change significantly within a small locality and the humidity has only a negligible effect. the calculation should be done for each sight line, in order to obtain several determining profiles for calculating the refractive index. if the target lies outside the border of the locality, the profiles are derived from the model. another article [10] summarizes mathematical methods for the elimination of refraction using measured values: a method of differential determination of the refractive index from radio waves, based on the relationship between the gradients of temperature and humidity; special equipment for automatic determination of the vertical refraction; refractometer studies in antarctica; and distance measurements from satellites are mentioned there.

3. levelling
unlike most systematic errors, the refractive effect cannot be excluded by geometric levelling from the centre. according to [2], the systematic refractive error can reach values of 0.05 to 0.1 mm per metre of height difference in levelling with 50 m sight lines, and its size is directly proportional to the square of the length of the sight line. in open space, air is warmed and cooled mainly from the ground, and therefore it can be presumed that, in the microclimate, atmospheric layers of the same temperature are approximately parallel to the ground surface. refraction in the method of geometric levelling from the centre generally manifests itself on inclined terrain, because the sight line that is closer to the ground curves more, and therefore the level of curvature differs between the back and fore sights. this error is not excluded by geometric levelling from the centre; it is systematic in nature and is known as differential refraction. when levelling on practically horizontal ground, the influence of refraction can be almost completely excluded by geometric levelling from the centre, due to basically the same conditions along the entire length of the sight line, and its influence is therefore negligible. in our case, the ground was almost horizontal, but the space was not open: there was a significant change in temperature when moving from the exterior of the building towards the interior. in addition, the temperatures mixed when the door was opened. it can therefore be assumed that there is a large variability of temperature gradients due to the air flow.
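the square-law dependence quoted from [2] is easy to tabulate; a minimal sketch follows, taking the upper figure of 0.1 mm per metre of height difference for 50 m sights as the reference value (the choice of the upper rather than the lower bound is an assumption).

```python
def refraction_error_mm(height_diff_m, sight_m, ref_mm_per_m=0.1):
    """systematic refraction error in levelling, scaled from the 50 m
    reference sight length by the square of the sight length [2]."""
    return ref_mm_per_m * height_diff_m * (sight_m / 50.0) ** 2

for s in (20.0, 50.0, 100.0):
    print(f"sight {s:5.0f} m: {refraction_error_mm(1.0, s):.3f} mm "
          f"per metre of height difference")
```

for the roughly 20 m sights used in the experiment below, the open-terrain figure would be only a few hundredths of a millimetre per metre of height difference; the point of the experiment is that the interface of optical environments can produce considerably larger effects.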
the general rule for measurement at the interface of different optical environments (e.g., when entering a building, a mine, etc.) is that the levelling staff must always stand at the interface. the reason is for the back and fore sights to pass through the same optical environment for as long as possible. in our case, the opposite scenario was applied, with the levelling instrument situated at the interface. thus, the back sight passes through a completely different optical environment than the fore sight. the aim was to find out how non-compliance with the "levelling staff stands at the interface" principle would affect the measurement in extreme cases.

4. experimental levelling through the building of the national technical library in prague dejvice
measurements were carried out in the national technical library and its surroundings in prague dejvice. the building of the national technical library seemed exceptionally appropriate for the experiment, because it is detached and accessible from all sides, making it possible to build a closed levelling line around it. moreover, the approximate shape of the building is a super-ellipse (a more detailed study of the shape and properties of the library building is given in [6]). the building has entrances on all four sides and a large atrium in the interior, through which levelling lines can be led between all the entrances. the layout of the library is depicted in fig. 2.

figure 2. layout of the library and configuration of the individual levelling points.

the experiment was carried out every sunday, in order to avoid excessive movement of persons through the individual entrances. on sundays, only the 24-hour study room, via entrance no. 3, is open; the rest of the entrances are locked. the author would like to express her gratitude to the library management for unlocking the entrances, thus enabling the execution of the entire experiment.

5. instruments
5.1. temperature measurement
the temperatures were measured with the digital thermometer btm-42083d, equipped with 12 temperature sensors. during the measurement on 3rd november 2013, all 12 sensors were functional; from 1st december 2013, only 7 sensors were functional. a one-minute interval was set for the temperature measurement.

5.2. levelling instruments
the measurements were carried out with the levelling instrument koni 007, serial number 150972. throughout the experiment, the 1.8 m long staff no. 49356 was used, together with a weiss tripod. the second levelling instrument was a digital dna 03, serial number 337893, art. number 723289. throughout the experiment, the staff with serial number 35713 and a leica tripod were used with it. identical levelling staffs and devices were used for all measurements.

figure 3. dna 03 levelling instrument; measurement on 8 february 2015.

5.3. instrument calibration
the staff used with the levelling instrument koni 007 was calibrated, and the laboratory measurement revealed that the scale of the staff is 0.999999. the calibration of the dna 03 found the following values:

system | scale | scale rms
1 | +17 ppm | 3 ppm
2 | +17 ppm | 1 ppm
3 | +14 ppm | 2 ppm

5.4. calibration of temperature sensors
the temperature sensors were immersed in water to eliminate short-term environmental influences, so that they were all in the same conditions. measurements were conducted for approximately 1 hour, with temperature readings every minute. a mean was computed from the values measured in this way. one sensor was taken as fixed, and corrections for the other sensors were calculated relative to it.
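the sensor calibration described in section 5.4 amounts to computing per-sensor additive offsets against one fixed sensor; a minimal sketch with a made-up reading matrix looks as follows.

```python
import numpy as np

def sensor_corrections(readings, fixed=0):
    """readings: array of shape (epochs, sensors) taken with all sensors
    immersed in the same water bath; returns additive corrections that
    bring every sensor to the scale of the 'fixed' one."""
    means = readings.mean(axis=0)        # per-sensor mean over ~1 h of minutes
    return means[fixed] - means          # correction = 0 for the fixed sensor

# hypothetical one-minute readings [degC] of 4 sensors over 5 epochs
r = np.array([[20.01, 20.05, 19.98, 20.10],
              [20.02, 20.06, 19.99, 20.11],
              [20.00, 20.04, 19.97, 20.09],
              [20.01, 20.05, 19.98, 20.10],
              [20.01, 20.05, 19.98, 20.10]])
print(sensor_corrections(r))   # -> [ 0.   -0.04  0.03 -0.09]
```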
6. weather
on 3rd november 2013, the weather was mostly rainy. it rained in the morning, and around midday it only drizzled. throughout the day it was windy and cloudy. the humidity ranged from 46 % to 49 %, and the pressure ranged from 971 mbar to 973 mbar. temperatures ranged from 20 °c to 24 °c inside the building and from 9 °c to 13 °c outside the building.
on 1st december 2013, it was cloudy all day, without rain and without sunshine. there were no significant changes in the weather. the humidity ranged from 29 % to 33 %, and the pressure was 994 mbar. temperatures ranged from 18 °c to 19 °c inside the building and from 7 °c to 9 °c outside the building.
on 9th february 2014, the weather was similar to the weather on 1st december 2013. it was cloudy, with no wind and no significant changes in the weather. the humidity ranged from 25 % to 30 %, and the pressure was 969 mbar. temperatures ranged from 17 °c to 19 °c inside the building and from 4 °c to 7 °c outside the building.
on 6th april 2014, it was overcast and cloudy in the morning. at midday the sun began to shine, and in the afternoon the skies were already clear. the humidity was 35 % and the pressure 985 mbar. the temperature ranged from 19 °c to 21 °c inside the building and from 14 °c to 18 °c outside the building.
on 27th july 2014, the weather was sunny all day. the skies were completely clear and the sunshine was very strong. during the measurement, the sun was shining on entrance no. 1 all the time and it was windy; entrance no. 2 was in half-shade; entrance no. 3 was in the shade all day; and when entrance no. 4 was measured, it was sunny and slightly windy. the humidity was 53 % and the pressure 984 mbar. the indoor temperature ranged from 24 °c to 27 °c, and the outside temperature ranged from 29 °c to 33 °c.
on 8th february 2015, the weather was changeable. when entrance no. 4 was measured, the sun was shining; entrances no. 2 and no. 3 were in the shade; and when entrance no. 1 was measured, it was snowing, with wind gusts carrying snow. the humidity was 20 % and the pressure 987 mbar. the indoor temperature ranged from 17 °c to 22 °c, and the outside temperature ranged from 1 °c to 3 °c.
From the temperature gradients determined in this way, the estimated errors from the effect of refraction are calculated. By comparing the differences between the height differences determined from the levelling lines measured around the building and those measured through the building with the calculated errors from the influence of refraction, firstly, the correct use of the formulas can be verified, and secondly, any mistake in the measurement can be detected; such a mistake can happen in the extreme case when the instrument, and not a staff, stands at the optical environment interface.

A screw in a metal plug, set in a concrete fissure, was monumented in front of each entrance outside the building. These points were not moved during the entire course of the experiment (from November 2013 to February 2015); see the building layout, Fig. 2, points A1–A4. Only one point was moved (between the measurements in December 2013 and February 2014), probably due to mechanical snow removal. Nuts with spherical caps were placed within the building, fixed to the floor with double-sided adhesive tape. These nuts were placed in position in the morning, at the beginning of the measurement, and were always removed at the end of the day. A total of 6 such nuts were positioned, 4 of which were between the internal doors of the building and 2 in the centre of the atrium. There was always one shared point for entrances 1 and 3 and one for entrances 2 and 4; see the building layout, Fig. 2, points C1–C4 and E1–E4.

Measurements were carried out separately for each entrance: the temperature sensors were positioned, the levelling set was measured between the exterior point and a point in the centre of the atrium, and the entire apparatus was then moved to another entrance. The measuring procedure was as follows: the levelling instrument was positioned at the interior doors of the building, and sight lines were led to the permanently monumented point in front of the building and to the point in the centre of the building for the given levelling line. The layout of the measurement at one entrance is shown in Fig. 4. The sight lines were approx. 20 m long. The entire levelling line, from a permanently monumented point in front of the building, through the building, to the permanently monumented point on the other side of the building, was divided into 2 sets.

Arrangement of the temperature sensors: the 7 functioning temperature sensors were fixed on 3 range poles along the sight lines (Fig. 5: fixing of the temperature sensors on a range pole). The sensors on one range pole were at heights of 1 m, 1.5 m and 2 m above the ground; the other two range poles had sensors at heights of 1 m and 2 m above the ground. The temperature gradient was calculated from the temperatures measured by these sensors at the different heights.

Arrangement of the range poles: the range pole with 3 temperature sensors was placed outside the building; one range pole with 2 temperature sensors was positioned between the exterior and interior doors of the building; and the last range pole with 2 temperature sensors was positioned inside the building. After completing the measurement at each entrance, the temperature sensors were moved to another entrance. The position of the temperature sensors at entrance no. 1 is shown in Fig. 6.
8. Formulas used
The correction dL is calculated according to [7]:

$$ dL = \frac{1}{2}\,\gamma\, s^2 \left(1 - \cot\beta \cdot \cot z\right) \frac{dt}{dh}, \qquad \gamma = 0.000294\,\frac{p}{760}\cdot\frac{0.00367}{1 + 0.00367\,t}, $$

where s (m) is the sight line length; β is the zenith angle of the atmospheric layers; z is the zenith angle of the sight line; dt/dh (°C m⁻¹) is the temperature gradient; p (torr) is the atmospheric pressure; and t (°C) is the temperature. For levelling, z = 100 gon, so cot z = 0 and the formula simplifies to

$$ dL = \frac{1}{2}\,\gamma\, s^2\, \frac{dt}{dh}. \tag{1} $$

The permissible deviation of the measurement was calculated according to [1]; the lengths of the levelling lines are 300 m around the building and 84 m through the building:

$$ \Delta_{\mathrm{mm}} = 3\sqrt{R_{\mathrm{km}}}. \tag{2} $$

For the levelling line around the building Δ_max = 1.64 mm; for the levelling line through the building Δ_max = 0.87 mm.

9. Results
Closed levelling line around the library: point A2 is the basic reference elevation, 0 mm. Due to the movement of point A4, the measurements of the levelling lines were divided into 2 sections: the first is derived from the measurements on 3 November and 1 December 2013, the second from the measurements from 1 August 2014 to 13 February 2015. In all cases of the measurement of the levelling line around the library, the measurement accuracy fulfilled the limits for precise levelling; no standard deviations were exceeded.

Closures of the levelling lines around the library: the height differences of points A1–A3 and A2–A4 calculated from the average values are stated in Tables 1 and 2. The height differences between points A1–A3 and A2–A4 were also calculated from the levelling lines led through the building. The difference between the value found from the levelling line measured around the building and the value measured through it, and the correction dL, were calculated according to the above-mentioned formulas. The values from Tables 5 and 6 are displayed in Figures 7 and 8 (charts of the values from Tables 5 and 6). These charts clearly show that introducing the corrections brings the height differences measured through the building closer to the averaged height differences measured around the building. It can be supposed that the assumption of the experiment was verified, and that the results gained by introducing the corrections calculated from the temperature gradients are improved in most cases. Table 7 shows the gradient values outside the building at the individual entrances; these values are displayed in Figure 9.

Table 1 (first section of measurements; elevations relative to A2 = 0 mm):
Date, instrument | A1 (mm) | A4 (mm) | A3 (mm)
3 November 2013, Koni 007 | 7.55 | −34.10 | 28.25
3 November 2013, Koni 007 | 7.60 | −34.50 | 28.65
1 December 2013, Koni 007 | 7.95 | −34.10 | 28.15
1 December 2013, Koni 007 | 7.75 | −33.75 | 28.35
Mean | 7.71 | −34.11 | 28.36
Standard error of the mean | 0.09 | 0.15 | 0.10

Table 2 (second section of measurements; elevations relative to A2 = 0 mm):
Date, instrument | A1 (mm) | A4 (mm) | A3 (mm)
1 August 2014, Koni 007 | 7.70 | −32.25 | 28.30
13 February 2015, Koni 007 | 7.95 | −32.50 | 28.15
9 February 2014, DNA 03 | 8.14 | −32.47 | 28.66
9 February 2014, DNA 03 | 8.16 | −32.34 | 28.69
6 April 2014, DNA 03 | 7.73 | −32.59 | 28.63
27 July 2014, DNA 03 | 7.62 | −32.49 | 28.65
8 February 2015, DNA 03 | 7.49 | −32.93 | 28.84
Mean | 7.83 | −32.51 | 28.56
Standard error of the mean | 0.10 | 0.08 | 0.09

Table 3 (polygon closures):
Koni 007, polygon closure (mm): 3 November 2013: 0.275; 3 November 2013: −0.4; 1 December 2013: −0.025; 1 December 2013: −0.075; 1 August 2014: 0.25; 13 February 2015: 0.25.
DNA 03, polygon closure (mm): 9 February 2014: 0.14; 9 February 2014: 0.13; 6 April 2014: −0.205; 27 July 2014: 0.19; 8 February 2015: 0.265.

Table 4 (averaged height differences):
Period | A1–A3 (mm) | A2–A4 (mm)
By February 2014 | 20.87 | −34.11
From February 2014 | 20.73 | −32.51
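As a quick sanity check of Eqs. (1) and (2), the following sketch evaluates the refraction correction and the permissible deviations. The formulas are the reconstructed ones above; the default pressure and temperature are merely plausible values, not data from the paper. The two printed deviations match the Δ_max values quoted above.

import math

def gamma_coefficient(p_torr, t_celsius):
    # gamma = 0.000294 * (p/760) * 0.00367 / (1 + 0.00367*t), as above
    return 0.000294 * (p_torr / 760.0) * 0.00367 / (1.0 + 0.00367 * t_celsius)

def refraction_correction_mm(s_m, dt_dh, p_torr=735.0, t_celsius=15.0):
    # Eq. (1): dL = 0.5 * gamma * s^2 * dt/dh, converted to millimetres
    return 0.5 * gamma_coefficient(p_torr, t_celsius) * s_m ** 2 * dt_dh * 1000.0

def permissible_deviation_mm(r_km):
    # Eq. (2): Delta_mm = 3 * sqrt(R_km)
    return 3.0 * math.sqrt(r_km)

print(permissible_deviation_mm(0.300))  # around the building: ~1.64 mm
print(permissible_deviation_mm(0.084))  # through the building: ~0.87 mm
# one 20 m sight with a gradient of 1 degC/m contributes roughly 0.2 mm:
print(refraction_correction_mm(s_m=20.0, dt_dh=1.0))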
Table 5 (line A1–A3):
Date, instrument | A1–A3 (mm) | Difference from the levelling line around (mm) | dL (mm) | Difference minus the dL correction (mm)
3 November 2013, Koni 007 | 20.30 | 0.57 | 0.23 | 0.34
1 December 2013, Koni 007 | 20.50 | 0.37 | 0.41 | −0.04
27 July 2014, Koni 007 | 21.50 | −0.77 | −0.28 | −0.49
8 February 2015, Koni 007 | 21.90 | −1.17 | −0.74 | −0.43
9 February 2014, DNA 03 | 20.69 | 0.04 | 0.27 | −0.23
6 April 2014, DNA 03 | 20.60 | 0.13 | −0.31 | 0.44
27 July 2014, DNA 03 | 21.63 | −0.89 | −0.63 | −0.26
8 February 2015, DNA 03 | 21.75 | −1.02 | −0.47 | −0.55

Table 6 (line A2–A4):
Date, instrument | A2–A4 (mm) | Difference from the levelling line around (mm) | dL (mm) | Difference minus the dL correction (mm)
3 November 2013, Koni 007 | −35.40 | 1.29 | 0.34 | 0.95
1 December 2013, Koni 007 | not measured | | |
27 July 2014, Koni 007 | −35.90 | 1.89 | 0.15 | 1.74
8 February 2015, Koni 007 | −31.80 | −0.71 | −0.89 | 0.18
9 February 2014, DNA 03 | −32.64 | 0.13 | 0.23 | −0.10
6 April 2014, DNA 03 | −33.35 | 0.84 | 0.20 | 0.64
27 July 2014, DNA 03 | −35.59 | 1.58 | 0.12 | 1.46
8 February 2015, DNA 03 | −32.72 | 0.21 | −0.33 | 0.54

Table 7 (temperature gradients outside the building, °C m⁻¹):
Date, instrument | Entrance 1 | Entrance 2 | Entrance 3 | Entrance 4
3. 11. 2013, Koni 007 | 1.04 | 0.33 | 0.41 | 1.04
1. 12. 2013, Koni 007 | 1.11 | 0.29 | 1.20 | –
9. 2. 2014, DNA 03 | 1.59 | 0.92 | 1.57 | 1.00
6. 4. 2014, DNA 03 | −2.13 | 0.89 | 0.45 | 0.77
27. 7. 2014, Koni 007 | −0.79 | 0.97 | −2.07 | −0.70
27. 7. 2014, DNA 03 | −2.08 | 0.45 | −3.73 | −0.34
8. 2. 2015, Koni 007 | 0.57 | 0.22 | −0.38 | 0.53
8. 2. 2015, DNA 03 | 1.33 | 0.23 | −0.59 | 0.29
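The arithmetic behind the last two columns of Tables 5 and 6 is just a subtraction; the sketch below reproduces the first row of Table 5 to make the convention explicit.

# First row of Table 5 (3 November 2013, Koni 007, line A1-A3):
mean_around = 20.87   # averaged A1-A3 height difference around the building (mm)
through = 20.30       # A1-A3 height difference measured through the building (mm)
dl = 0.23             # refraction correction dL for this line (mm)

difference = mean_around - through  # 0.57 mm (column 3 of Table 5)
residual = difference - dl          # 0.34 mm (column 5 of Table 5)
print(difference, residual)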
10. Conclusion
The differences between the mean of the levelling lines measured around the library and the individual levelling lines measured through the building range from −1.17 mm to 1.89 mm. The permissible deviation for the respective length of the levelling line is 0.87 mm; it was therefore exceeded several times. The refraction error calculated according to the formulas approximates the differences between the levelling lines measured around the library and through the building. Correcting the levelling lines measured through the building with the calculated refraction error would, in most cases, lead to results significantly closer to the results obtained by levelling around the building. Exceptions are the measurements of 27 July 2014: the differences between points A2–A4 are 1.58 mm with the DNA 03 instrument and 1.89 mm with the Koni 007, yet the correction of the influence of refraction calculated from the temperature gradients is only slight; it is therefore probable that there was an erroneous measurement at point A4. Further differences were identified with the DNA 03 instrument in levelling line A1–A2 on 9 February 2014 and in levelling line A2–A4 on 6 April 2014 and 8 February 2015, when the implementation of the correction would have produced a greater deviation from the values obtained by levelling around the building; in two cases, the corrected values even have the opposite sign. This could be caused by short-term fluctuations in the environment that were not recorded by the sensors, or simply by the fact that the values are at the border of the measurement accuracy.

In order to identify any significant systematic error during the measurement, whether of the instrument or of the sensor during reading, the measurement was carried out with two instruments, the optical-mechanical Koni 007 and the digital DNA 03. After comparing the measured results, it can be stated that no serious errors were manifested, nor any errors in the accuracy of the reading with the Koni 007. The sizes of the standard errors and the calculated deviations are comparable for both instruments.

References
[1] Blažek, R., Skořepa, Z.: Geodézie 30. Výškopis. Prague, CTU, 1999. ISBN 80-01-01598-X.
[2] Hradilek, L.: Vysokohorská geodézie. Prague, Academia, 1984.
[3] Hauf, M. et al.: Geodézie – technický průvodce 42. Prague, SNTL, 1989. ISBN 80-03-00142-0.
[4] Pospíšil, J.: Vliv atmosféry na šíření laserového záření při metodě záměrné přímky. Geodetický a kartografický obzor 26(68), no. 1, 1980, pp. 9–14. ISSN 0016-7096.
[5] Vyskočil, P.: Studium možností snížení vlivu mikroklimatu na nivelační měření. Candidate dissertation. Prague, CTU in Prague, 1966.
[6] Dušek, R., Skořepa, Z.: Národní technická knihovna z pohledu superelipsy. Geodetický a kartografický obzor 58/100, no. 3, 2012, pp. 45–50. ISSN 0016-7096.
[7] Vyskočil, P.: Refraction in levelling. Collection of VÚGTK research works, vol. 14, 1982.
[8] Sokol, Š., Ježko, J.: Možnosti eliminácie vplyvu refrakcie na trigonometrické meranie výšok. Acta Montanistica Slovaca 10, no. 2, 2005, pp. 218–227. ISSN 1335-1788.
[9] Urban, R., Štroner, M.: Modeling of the influence of vertical refraction on the precise geodetic measurements based on discrete measurement of atmospheric parameters. In: QUAERE 2013. Hradec Králové: Magnanimitas, 2013, vol. 1, pp. 2624–2631.
[10] Ostrovskij, A. L.: Dostiženija i zadači refraktometrii. Lvov: Geoprof 1/2008, Lvovskaja politechnika, pp. 6–15. ISSN 2306-8736.

A small transfer and distribution system for liquid nitrogen
Č. Šimáně, M. Vognar, J. Kříž, V. Němec

A system for remotely controlled filling of small dewars with liquid nitrogen from a central storage dewar vessel is described, consisting of a plunger-type pump with an electromechanical driver and electromechanical ball-type valves for the distribution of liquid nitrogen. The preset nitrogen level in the small dewars is kept constant by automatic refilling. The delivery is adjustable in steps by frequency change from 2.5 to 25 cm³/s, and a delivery height of up to 2 meters is assured.

Keywords: liquid nitrogen, transfer system, solenoid pump, solenoid valve, automatic level control.

1 Introduction
The system was proposed for multiple, remotely controlled selective filling of one of three small dewars with liquid nitrogen (LN) from a storage dewar vessel. The liquid nitrogen level in the selected dewar is to be kept at a preset height by automatic refilling. The liquid nitrogen is transferred by a specially constructed electromechanical pump from a standard non-pressurized storage dewar. We avoided transferring the LN from the storage dewar by pressurized nitrogen gas provided either by an electric heater situated in the dewar, by self-pressurization, or by a dry nitrogen supply from outside. The pump itself, consisting of a cylinder, a plunger and valves, is immersed in LN above the bottom of the storage dewar vessel, the electromagnetic driving system being situated outside. The travel of the core in the solenoid of the driving system is transferred to the plunger by a rod. In this way the Joule heat from the driver, whose input power is about 10 W, does not contribute to the evaporation of the liquid nitrogen. The distribution of LN between the three small dewars is accomplished by a three-way valve with electromechanical actuators.

2 Electromechanical LN pump
All parts of the pump (Fig. 1) in contact with LN are made from stainless steel. The delivery valve is of a diaphragm type, and the intake valve is of a ball type. The diaphragm of the delivery valve is held in the closed position by a spring, the ball of the intake valve by gravity and by the hydrostatic pressure of the liquid nitrogen column above it.
The intake valve operates successfully even when the pressure in the cylinder volume is reduced during suction. Due to the short delivery period, the plunger need not be provided with special sealing against the cylinder wall to prevent leakage. (Fig. 1: pump for liquid nitrogen.)

The electromagnetic solenoid-type driver of the pump, sketched in Fig. 1, has an axially sliding iron core. The coil has 1400 windings, a resistance of 15 Ω and an inductance of 0.5 H (when the core is in the central position). Without current, the core is held in the extreme upper position by a spring; during the current pulse, the core moves to the central position, delivering the pumping work. The static characteristic of the system, representing the resultant of the magnetic force at a coil current of 1.7 A and of the elastic force of the spring acting on the core, is shown in Fig. 2 as a function of the plunger position (Fig. 2: force P of the driver unit as a function of the plunger position x). This force compensates the inertial forces of the mechanical parts of the system and of the column of the transferred LN in the exhaust tubing during the delivery period, as well as the hydrostatic pressure. The full stroke is maintained up to a 10 Hz working frequency of the pump.

The solenoid of the pump driver is excited by rectangular current pulses: peak current 1.7 A, duty cycle adjustable from 0.12 to 0.25, and repetition rate 1 to 10 Hz. The necessary voltage for a peak current of 1.7 A is 25.5 V; the maximum ampere-turns equal 2380 A (1400 turns × 1.7 A). An integrated circuit 555 is used in the pulse current generator, directly switching a Darlington-type transistor. The delivery period of the pump corresponds to the peak current, the intake period to zero current in the coil. The return of the plunger during the intake period is assured by the spring in the driver. At a 1 cm plunger stroke, the suction volume is 2.5 cm³. The delivery is adjustable in steps by frequency change from 2.5 to 25 cm³/s, and a delivery height of up to 2 meters is assured. The use of pulse excitation of the electromagnet with an adjustable duty cycle enables the energy to be concentrated in a short delivery period, thus reducing the mean heat dissipation in the electromagnet coil.
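The quoted drive figures can be cross-checked with elementary formulas. The sketch below is not from the paper, but it reproduces the delivery range (stroke volume times repetition rate) and shows why the mean coil dissipation stays near the stated 10 W despite a peak electrical power of about 43 W.

stroke_volume_cm3 = 2.5   # suction volume at a 1 cm plunger stroke
for f_hz in (1, 5, 10):   # repetition rate 1 to 10 Hz
    print(f"{f_hz:2d} Hz -> delivery {stroke_volume_cm3 * f_hz:4.1f} cm3/s")

peak_voltage_v = 25.5     # voltage needed for the 1.7 A peak current
peak_current_a = 1.7
for duty in (0.12, 0.25): # adjustable duty cycle of the current pulses
    mean_power_w = peak_voltage_v * peak_current_a * duty
    print(f"duty {duty:.2f} -> mean coil power {mean_power_w:4.1f} W")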
3 Three-way LN valve
The distribution of LN to the three dewars is enabled by a three-way solenoid valve (Fig. 3: three-way valve) consisting of individual ball valves with a common body. Each ball valve is actuated by a solenoid through a rod in contact with the ball of the valve, which is held in the closed position by a spring. Without current in the solenoid coil, the core is held in the extreme upper position by a spring. When the magnet coil is excited, the core shifts towards the central position, and only after having gained a certain amount of kinetic energy does it strike the rod in contact with the ball. In this way the ball is set free even if it happens to have frozen to the seal. The input power of each of the electromagnets is 3.6 W, with z = 2500 turns, coil resistance 105 Ω and current I = 0.185 A.

The switching circuitry ensures that, when any one of the valves is open, the other two remain closed. As long as the valve is open, the LN pump is in action, unless the LN level in the corresponding dewar is already at the preset height. The levels in the small dewar flasks are controlled by thermocouple sensors placed at the preset heights. When the LN comes into contact with the thermocouple sensor, the current in the pump is interrupted; it is renewed when the LN level descends. The unwanted cooling of the thermocouple sensor by cold LN vapours during the filling period is compensated by heat supplied to the active part of the sensor by conduction from the warmer parts of the apparatus, keeping the sensor temperature above that of the LN.

The common body of the valve is made from teflon, to minimize the evaporation of liquid nitrogen and at the same time to insulate the valves thermally from the solenoids. For reliable action of the system, air moisture must be prevented from entering the inner parts of the system that are in contact with LN. This is accomplished by rubber gaskets, both in the pumping system and in the distribution valve, situated at places remaining at all times at or near room temperature.

The transfer of LN to one of the three dewars is started by pushing the button switch of the corresponding valve. The starting of the pump is bound to the opening of the valve. The filling, and the maintenance of the preset LN level in the dewar, are fully automatic.

4 Conclusion
No effects of atmospheric moisture on the function of the distribution valves were observed. In particular, the system did not manifest any tendency to freezing of the nitrogen and clogging of the transfer lines or the distribution valves.

Prof. Ing. Čestmír Šimáně, DrSc., Ing. Miroslav Vognar, Jiří Kříž, Vladimír Němec
Dept. of Dosimetry and Application of Ionizing Radiation
phone: +420 2 2323657, +420 2 2315212, fax: +420 2 2320861, e-mail: vognar@br.fjfi.cvut.cz
Czech Technical University in Prague, Faculty of Nuclear Sciences and Physical Engineering, Břehová 7, 115 19 Praha 1, Czech Republic

Analytical model of modified traffic control in an ATM computer network
J. Filip

The ABR class of ATM computer networks uses feedback information that is generated by the net switches and the destination and is sent back to the source of data to control the net load. A modification of the standard traffic management of the ATM network is presented, based on the idea of using RM cells to inform the source about congestion. An analytical model of a net switch is designed and mathematical relations are presented; the probability of queueing is considered. The analytical model of the switch was constructed for the purpose of analysing the network behaviour, and it is used for investigating cells passing through the network. We describe the model below. The submitted analytical results show that this method reduces congestion and improves throughput.

Keywords: throughput, congestion control, modelling, RM cell, cell refusal.

1 Introduction
The ATM computer network consists of switches, sources and destinations of the transported data. The ATM network is based on switching packets of the same length, called cells. Switches can transmit packets from their input to their output, and can also gather information about the intensity of the traffic. Network traffic management is a significant process used for the proper functioning of the network; it is based on a specific algorithm. To avoid congestion, current ATM networks use the techniques of timeout, duplicate acknowledgement or explicit congestion notification. For monitoring the state of the network, special cells called RM (resource management) cells are used in the ATM network.
RM cells inform the data sources about the maximum rate of flow acceptable by the network. The load of the network varies in time, and the interference of different loads at the same point can cause congestion. When this happens, the memories of the switches are overloaded and incoming cells have to be refused, i.e. removed. In this paper a new method is proposed for an earlier reaction to congestion. An analytical model of a switch was constructed for the analysis of the network behaviour; this model is used for analysing the traffic through the network. A description of the model is given in the sections below.

1.1 Standard traffic management
There are several classes of traffic in the ATM network. This paper considers the class called available bit rate (ABR), used for the transmission of files. The ABR class of ATM computer networks uses feedback information on the data transport from the switches and the destination to the source to control the load. The feedback mechanism is essential for a good throughput of the network, enabling an earlier reaction of the sources when congestion of the switches occurs. The transported data are arranged in packets of cells. These packets have the same length, and a special cell called the RM cell precedes every packet; the traffic management is based on the RM cells. RM cells pass from the source to the destination and collect information about the maximum rate of generated load which is acceptable by the dedicated path. After their acceptance by the destination, they are sent back in order to bring the feedback information to the source. The rate of the traffic load is adjusted in the source according to the value received from the RM cell. The flow of data cells together with RM cells between switches SW(i) and SW(i+1) is shown in Fig. 1 (the flow of counted RM cells). The RM cells are pictured as black rectangles, and the other cells as white rectangles. The packets of data cells with the RM cell number 8 are sent from the data source, i.e. forwards; only RM cells 2 and 3 are sent backwards, to bring the feedback information from the destination.

The disadvantage of this method is that the source of data obtains information about the current rate of the load, but it does not know which particular data was lost by refusal within the switches. This information can be obtained only later, from the destination. In other words, the RM cells always have to go along the whole path from the data source to the destination and back; see Figs. 5 and 7.

1.2 Proposed traffic model
In the proposed model, all RM cells are numbered by the source. The RM cells are returned by the switches at the moment of their congestion. If congestion occurs, the packets are refused, but the associated RM cells are immediately sent back to inform the source of the loss of data. In this model, the source is informed about the congestion in the network earlier, because the RM cells do not pass along the whole path; see Figs. 5 and 8.

2 Analytical model of the switch
In this part, the behaviour of the basic network components, the switches, is studied. The behaviour of a switch can be described as a stationary queueing process. Let (M|M|1|m) be a queueing process according to the Kendall notation.
The symbols denote: M – the distribution function of the cell interarrival time is exponential with mean 1/λ, λ > 0, and the arrival times are independent; M – the distribution function of the cell service time is exponential with mean 1/μ, μ > 0, and the service times are independent; 1 – a system with one place in service; m – a system with queue length m.

Notation used in the following text: λ – average arrival intensity; μ – average service rate; ρ = λ/μ – traffic intensity.

Let us assume that the cells coming to a switch are stored in a switch queue of length m. Cells are refused if all of the m waiting places are occupied, and further arriving cells are cancelled. The situation is shown in Fig. 2 (the structure of a switch): the cell with index 0 is being transmitted, the following cells are waiting in the queue, and further incoming cells, with an index equal to or greater than m + 1, are refused. The behaviour of this model is described by the probabilities of all possible queue lengths and by the cell waiting time in the queueing system. The probability that there are just k waiting cells in the queue is

$$ p_k = \frac{(1-\rho)\,\rho^k}{1-\rho^{m+2}} \ \ \text{for } \rho \neq 1, \qquad p_k = \frac{1}{m+2} \ \ \text{for } \rho = 1, \qquad k = 0, 1, \dots, m+1. \tag{1} $$

The probability p_z of cell refusal (all places in the queue are occupied) is p_z = p_{m+1}, and its value equals

$$ p_z = \frac{(1-\rho)\,\rho^{m+1}}{1-\rho^{m+2}} \ \ \text{for } \rho \neq 1, \qquad p_z = \frac{1}{m+2} \ \ \text{for } \rho = 1. \tag{2} $$

Fig. 3 (probability of cell refusal) shows the dependence of the probability p_z on the value m for several values of the traffic intensity ρ. The cell waiting time w in this queueing system is a random variable. For the description of the analytical model, the mean value E(w) of the waiting time w is used:

$$ E(w) = \sum_{k=0}^{m} \frac{k+1}{\mu}\, p_k. \tag{3} $$

Fig. 4 (mean value of the waiting time in the queue) shows the dependence of the mean value E(w) on the value m for several values of the traffic intensity ρ. We can also use the probability of service p_p,

$$ p_p = 1 - p_z. \tag{4} $$

Other formulas describing the behaviour of the queueing system can be found in [5]. The behaviour of the system will be studied for values of the traffic intensity ρ close to 1; in this case the refusal of cells occurs more frequently.
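The following sketch evaluates Eqs. (1)-(4) directly (with the service rate μ normalised to 1, so E(w) is in units of the mean service time). It reproduces the trends of Figs. 3 and 4: for ρ < 1 the refusal probability p_z falls with growing m, while for ρ ≥ 1 the mean waiting time keeps growing roughly linearly with m.

def switch_model(m, rho, mu=1.0):
    """Return (pz, Ew, pp) for the (M|M|1|m) switch of Section 2."""
    if rho == 1.0:
        p = [1.0 / (m + 2)] * (m + 2)                 # Eq. (1), rho = 1
    else:
        norm = (1.0 - rho) / (1.0 - rho ** (m + 2))
        p = [norm * rho ** k for k in range(m + 2)]   # Eq. (1), rho != 1
    pz = p[m + 1]                                     # Eq. (2), refusal probability
    ew = sum((k + 1) / mu * p[k] for k in range(m + 1))  # Eq. (3), mean waiting time
    return pz, ew, 1.0 - pz                           # Eq. (4), service probability

for m in (100, 300, 600):
    print(m, switch_model(m, 0.98), switch_model(m, 1.0))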
3 Path of cell transport
It is supposed that the path of cell transport from the source S to the destination D includes a sequence of n switches SW(i), 1 ≤ i ≤ n. The same path is used both for the forward transport of data cells together with RM cells and for the backward transport of RM cells; in the backward transport, the RM cells pass through the switches in the reverse order, and these switches are marked SW(i), n + 1 ≤ i ≤ 2n. Both paths are shown in Fig. 5 (path of cell transport). The following notation is used in Fig. 5: S – source of cells; D – destination of the transport; SW(i) – the i-th switch in the path; t_i – the time period needed for cell transport along the line connecting switches SW(i) and SW(i+1).

3.1 Characterization of cell transport
The following cases of cell transport from the source to the destination were studied:
• the data cell is transported without being refused;
• the data cell is refused just once during the transport.

The formulas for the transport time will be stated below, and we will use the weighted averages of the mean values of these time periods to compare the two models; we choose the probabilities of the possible events as the weight coefficients. We do not consider the other cases of cell transport (when the number of refusals is greater than 1), because earlier studies [1, 2] have shown that the associated probabilities are negligible. The probabilities of cell refusal in a switch are shown in Fig. 3. We use the following notation: p_z(i) – the probability of cell refusal in the i-th switch due to a full switch buffer; p_p(i) = 1 − p_z(i) – the probability of service in the i-th switch; w(i) – the waiting time in the queue of the i-th switch; E(w(i)) – the mean value of w(i); t_i – the time period needed for cell transport along the line connecting switches SW(i) and SW(i+1).

The sample space of the considered events consists of the independent elementary events {0, 1, 2, …, n}. The symbols denote: 0 – the cell is transported without being refused in any switch; i – the cell is refused just once, within the i-th switch SW(i), i = 1, 2, …, n. We assume that the associated request (the RM cell transmitted from the destination to the source) and the repeated sending of the data cells are delivered without refusal. The probabilities of the described events are conditional probabilities, and their values can be approximated:

$$ p(0) = p_p(1)\,p_p(2)\cdots p_p(n), \qquad p(i) = p_p(1)\cdots p_p(i-1)\,p_z(i), \quad i = 1, 2, \dots, n. $$

The values of p_p(i), i = 1, …, n, are close to 1 and the values p_z(i) ≪ 1, i = 1, …, n. Therefore we can approximate the probabilities p(i), i = 0, …, n, as

$$ p_0 = 1 - \sum_{i=1}^{n} p_z(i), \qquad p_1(i) = p_z(i), \quad i = 1, \dots, n. \tag{5} $$

We use the values p_0, p_1(i), i = 1, …, n, as the probabilities of the events {0, 1, …, n}; we note that p_0 + Σ_{i=1}^{n} p_1(i) = 1.

We consider the individual cases of the cell transport denoted by the symbols {0, 1, …, n} and we calculate their transport times. We denote: ts_t – the total time period of cell transport in the standard model; ts_t^0 – the total time period of cell transport without refusal in the standard model; ts_t^1(i) – the total time period of cell transport with refusal in the i-th switch in the standard model; tp_t – the total time period of cell transport in the proposed model; tp_t^0 – the total time period of cell transport without refusal in the proposed model; tp_t^1(i) – the total time period of cell transport with refusal in the i-th switch in the proposed model.

3.2 Cell transport without refusal
The value p_0 is the probability of cell transport without refusal within any switch of the path; it is actually the probability of a successful transmission without the loss of any cell.
The influence of the proposed modification does not appear in this case, and the transport times in the standard and proposed models are the same. In both models the cells pass along the whole path from the source S to the destination D; this path is shown in Fig. 6 (path of a cell transport without refusal). The total time periods ts_t^0 and tp_t^0 needed for the transport of a particular cell along the whole path in the standard and proposed models are equal:

$$ ts_t^0 = tp_t^0 = \sum_{k=1}^{n+1} t_k + \sum_{k=1}^{n} w(k). \tag{6} $$

The mean value of the times ts_t^0 and tp_t^0 is

$$ E(ts_t^0) = E(tp_t^0) = \sum_{k=1}^{n+1} t_k + \sum_{k=1}^{n} E(w(k)), \tag{7} $$

because only the time periods w(k) are random quantities.

3.3 Cell transport with refusal of cells in the standard model
We calculate the transport time ts_t^1(i). The assumption is that a cell is refused in the i-th switch SW(i), and the destination must call for the repeated sending of the refused cell; the probability p_1(i) = p_z(i) of this case is given in (5). The path of the cell transport in the standard model is shown in Fig. 7 (path of a cell transport with refusal in the standard model). The switch SW(i) is congested, and that is why the cell is refused. The information about the cell being refused is not known before the incomplete packet is received by the destination D. Such a situation results in a request for a repeated transmission of the data cells, until the complete packet arrives at its destination. The cell and the request have to pass the path three times, and the total time period needed for the cell transport ts_t^1(i), i = 1, …, n, is

$$ ts_t^1(i) = 2\left[\sum_{k=1}^{n+1} t_k + \sum_{k=1}^{n} w(k)\right] + \sum_{k=n+2}^{2n+2} t_k + \sum_{k=n+1}^{2n} w(k). \tag{8} $$

The mean value of the time period ts_t^1(i) is

$$ E(ts_t^1(i)) = 2\left[\sum_{k=1}^{n+1} t_k + \sum_{k=1}^{n} E(w(k))\right] + \sum_{k=n+2}^{2n+2} t_k + \sum_{k=n+1}^{2n} E(w(k)). \tag{9} $$

3.4 Cell transport with refusal of cells in the proposed model
In the proposed modification, the cell transport path and the associated request are shown in Fig. 8 (path of a cell transport with refusal in the proposed model). The request for the repeated sending is sent immediately from the congested switch SW(i), so the cell and the associated request do not have to pass along the whole way from the source to the destination. The cell is refused in the i-th switch with the probability p_1(i) = p_z(i). The time periods needed for the transport along the individual parts of the whole path are:

1. $\sum_{k=1}^{i} t_k + \sum_{k=1}^{i-1} w(k)$; this is the time period of cell transport from the source S to the switch SW(i);
2. $\sum_{k=2n+3-i}^{2n+2} t_k + \sum_{k=2n+2-i}^{2n} w(k)$; this is the time period needed for the transmission of the repeated-sending request from the switch SW(i) to the source S;
3. $\sum_{k=1}^{n+1} t_k + \sum_{k=1}^{n} w(k)$; this is the time period needed for the transport of the cell from the source S to the destination D.

The total time period tp_t^1(i) needed for the transport is

$$ tp_t^1(i) = \sum_{k=1}^{i} t_k + \sum_{k=1}^{i-1} w(k) + \sum_{k=2n+3-i}^{2n+2} t_k + \sum_{k=2n+2-i}^{2n} w(k) + \sum_{k=1}^{n+1} t_k + \sum_{k=1}^{n} w(k). \tag{10} $$

The mean value of the time period tp_t^1(i) is

$$ E(tp_t^1(i)) = \sum_{k=1}^{i} t_k + \sum_{k=1}^{i-1} E(w(k)) + \sum_{k=2n+3-i}^{2n+2} t_k + \sum_{k=2n+2-i}^{2n} E(w(k)) + \sum_{k=1}^{n+1} t_k + \sum_{k=1}^{n} E(w(k)). \tag{11} $$
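Under the identical-switch assumption introduced in the next section, the difference between (9) and (11) telescopes to a closed form. The numerical sketch below checks the reconstructed means against that closed form; time is measured in arbitrary units, and the two functions encode the reconstructed equations rather than code from the paper.

def e_ts1(n, t_bar, ew):
    # Eq. (9) with identical switches: the full path is passed three times
    return 3 * (n + 1) * t_bar + 3 * n * ew

def e_tp1(i, n, t_bar, ew):
    # Eq. (11) with identical switches: S -> SW(i), back to S, then the full path
    return (n + 2 * i + 1) * t_bar + (n + 2 * i - 2) * ew

n, t_bar, ew = 10, 1.0, 5.0
total = sum(e_ts1(n, t_bar, ew) - e_tp1(i, n, t_bar, ew) for i in range(1, n + 1))
print(total, n * (n + 1) * (t_bar + ew))  # both 660.0, matching Eq. (13)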
4 Comparison of cell transports
We now compare the total time periods of the cell transport in the standard model and in the proposed model. The transport times in the individual cases denoted by the indices {0, 1, …, n} are random variables, and the corresponding probabilities p_0, p_1(i) are given in (5). As the mean values of the total transport time periods ts_t and tp_t we use the weighted averages of the mean values E(ts_t^0), E(ts_t^1(i)) and E(tp_t^0), E(tp_t^1(i)); the coefficients of the weighted averages are the probabilities p_0, p_1(i). If we denote by E(ts_t) the mean value of the total time period of the cell transport in the standard model, and by E(tp_t) the same quantity in the proposed model, we obtain

$$ E(ts_t) = E(ts_t^0)\,p_0 + \sum_{i=1}^{n} E(ts_t^1(i))\,p_1(i), \qquad E(tp_t) = E(tp_t^0)\,p_0 + \sum_{i=1}^{n} E(tp_t^1(i))\,p_1(i). $$

The effect of the modification of the standard model is expressed by the difference

$$ E(ts_t) - E(tp_t) = \sum_{i=1}^{n} \left[ E(ts_t^1(i)) - E(tp_t^1(i)) \right] p_1(i), \tag{12} $$

where we use the equality (7), E(ts_t^0) = E(tp_t^0). The values E(ts_t^1(i)) and E(tp_t^1(i)) are given in (9) and (11). The last expression depends on many parameters; we determine its value for one simple case. In the sequel we assume that the switches are identical, i.e. the waiting times w(k) are identically distributed and the times t_k are equal. We denote

$$ t_k = \bar{t}, \quad k = 1, \dots, 2n+2; \qquad E(w) = E(w(k)), \quad k = 1, \dots, 2n; \qquad p_z = p_z(k), \quad p_p = p_p(k), \quad k = 1, \dots, n. $$

In this case we get

$$ E(ts_t^1(i)) = 3(n+1)\,\bar{t} + 3n\,E(w), \qquad E(tp_t^1(i)) = (n+2i+1)\,\bar{t} + (n+2i-2)\,E(w), \quad i = 1, \dots, n, $$

$$ E(ts_t) - E(tp_t) = p_z \sum_{i=1}^{n} 2(n+1-i)\left[\bar{t} + E(w)\right] = n(n+1)\,p_z\left[\bar{t} + E(w)\right]. \tag{13} $$

The value E(ts_t) − E(tp_t) depends on the parameters of the switch (E(w), p_z) and on the value $\bar{t}$. We can suppose that $\bar{t} \ll E(w)$, and therefore the comparison of the two models is demonstrated by the expression

$$ v = p_z\, E(w)\, n(n+1). $$

The values p_z, E(w) and p_p are defined in (2), (3) and (4). The following table contains the values of v for some path transport parameters. Figs. 9 and 10 show the dependence of v on the number n of the switches for several values of the traffic intensity ρ; Fig. 11 shows this dependence for several values of the buffer storage m and traffic intensity ρ = 1. The value of the expression v depends on the product of the probability p_z and the mean value of the waiting time E(w); Figs. 12 and 13 show the dependence of the product p_z E(w) on the value m for several values of the traffic intensity ρ.
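Combining the switch model of Section 2 with the expression for v, the sketch below reproduces the ρ = 1 column of Table 1 (and the other columns to within rounding); μ is again normalised to 1.

def queue_stats(m, rho):
    """(pz, Ew) for the (M|M|1|m) switch, Eqs. (1)-(3), with mu = 1."""
    if rho == 1.0:
        p = [1.0 / (m + 2)] * (m + 2)
    else:
        norm = (1.0 - rho) / (1.0 - rho ** (m + 2))
        p = [norm * rho ** k for k in range(m + 2)]
    return p[m + 1], sum((k + 1) * p[k] for k in range(m + 1))

for m in (32, 64, 96, 128):
    pz, ew = queue_stats(m, 1.0)
    row = [round(pz * ew * n * (n + 1), 1) for n in (10, 20, 50)]
    print(m, row)  # m = 32 gives [53.4, 203.8, 1237.5], matching Table 1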
Table 1 (influence of the modification on the transport quality):
m | n | v (ρ = 1.01) | v (ρ = 1) | v (ρ = 0.99) | v (ρ = 0.98)
32 | 10 | 66.3 | 53.4 | 42.4 | 33.1
32 | 20 | 253 | 204 | 162 | 126
32 | 50 | 1536 | 1237 | 928 | 767
64 | 10 | 81.4 | 53.7 | 33.9 | 18.8
64 | 20 | 311 | 205 | 129 | 71.8
64 | 50 | 1887 | 1244 | 785 | 436
96 | 10 | 98.6 | 54.5 | 26.8 | 11.7
96 | 20 | 376 | 208 | 102 | 44.8
96 | 50 | 2285 | 1262 | 625 | 272
128 | 10 | 120 | 54.7 | 21.2 | 6.8
128 | 20 | 459 | 209 | 81 | 26
128 | 50 | 2788 | 1267 | 492 | 158

(Figs. 9, 10 and 11: dependence of v on the parameter n of the path. Figs. 12 and 13: dependence of p_z E(w) on the value m.)

Conclusion
The influence of the proposed modification of the cell transport in an ATM computer network was tested. Analytical models of the cell transport through the standard and the proposed modification were described, and formulas for the time period needed for the cell transport in both models were presented. The comparison was made by the weighted averages of the mean values of the time period needed for the cell transport, E(ts_t) and E(tp_t); as the coefficients of the weighted values we chose the probabilities of the particular cases, and we assumed transports without refusal and with one refusal only. The differences of the mean values E(ts_t) and E(tp_t) in (13) are approximated by the expression v. Its values in Table 1 and the graph dependences in Figs. 9, 10 and 11 show that the effect of the proposed modification grows with increasing values of the traffic intensity ρ. From Table 1 and Figs. 12 and 13 it follows that if the traffic intensity ρ < 1, then the effect of the proposed modification decreases with increasing values of the buffer storage size m, while for ρ ≥ 1 the effect of the proposed modification grows. This is caused by the fact that when m increases, the probability p_z of cell refusal decreases, but for ρ ≥ 1 the mean waiting time E(w) increases faster, hence the product p_z E(w) increases. If ρ < 1 this does not happen; we see from Table 1 that in this case the effect of the modification actually becomes smaller as we increase m.

References
[1] Filip, J.: Data transfer with modified RM cells. In: Proceedings of Workshop 97, Prague, CTU, 1997.
[2] Filip, J.: Switching, an alternative for high-speed networks. In: Proceedings of Workshop 96, Prague, CTU, 1996.
[3] Filip, J.: Data transports in ATM networks. In: Proceedings of Workshop 2000, Prague, CTU, 2000.
[4] Brandt, A., Fraenken, P., Lisek, B.: Stationary stochastic models. Akademie-Verlag, Berlin, 1990.
[5] Zitek, F.: Malé zastavení času. SPN, Praha, 1975.
[6] ATM Forum Technical Committee: Traffic management specification version 4.0, April 1996.

Ing. Jiří Filip, e-mail: fofis@post.cz
Ministerstvo zahraničních věcí České republiky (Ministry of Foreign Affairs of the Czech Republic), Loretánské náměstí 5, 118 00 Praha 1, Czech Republic

Acta Polytechnica 53(Supplement):646–651, 2013, doi:10.14311/ap.2013.53.0646, © Czech Technical University in Prague, 2013, available online at http://ojs.cvut.cz/ojs/index.php/ap

ARGO-YBJ: status and highlights
Giuseppe Di Sciascio, on behalf of the ARGO-YBJ Collaboration
INFN – Sezione di Roma Tor Vergata, Viale della Ricerca Scientifica 1, I-00133 Roma; corresponding author: disciascio@roma2.infn.it

Abstract. The ARGO-YBJ experiment has been gathering data steadily since November 2007 at the Yangbajing Cosmic Ray Laboratory (Tibet, P.R.
China, 4300 m a.s.l., 606 g/cm²). ARGO-YBJ is confronting various open problems in cosmic ray (CR) physics. The search for CR sources is carried out by observing TeV gamma-ray sources, both galactic and extra-galactic. The CR spectrum, composition and anisotropy are measured in a wide energy range (TeV ÷ PeV), thus overlapping direct measurements for the first time. This paper summarizes the current status of the experiment and describes some of the scientific highlights since 2007.

Keywords: cosmic rays, extensive air showers, gamma-ray astronomy.

1. Introduction
Exploiting the full coverage approach at high altitude, the ARGO-YBJ experiment is an air shower array able to detect the cosmic radiation with an energy threshold of a few hundred GeV. The detector has been gathering data steadily since November 2007, with a duty cycle larger than 86 %; the trigger rate is 3.5 kHz. The detector characteristics and performance are described in detail in [1, 2], and the main results obtained by ARGO-YBJ are described in [3]. This paper summarizes the status of the experiment and reviews some highlights.

2. Gamma-ray astronomy
From November 2007 the ARGO-YBJ experiment collected about 4 × 10^11 events in 1543 days of total effective observation time. Five known VHE γ-ray sources were detected with a statistical significance greater than 5 standard deviations (s.d.): the Crab Nebula, Mrk421, Mrk501, MGRO J1908+06 and MGRO J2031+41. A number of flares from Mrk421 and Mrk501 were observed and studied in detail; evidence of TeV flaring activity from the Crab Nebula in coincidence with AGILE/Fermi observations is also reported. Details of the analysis procedure (e.g., data selection, background evaluation, systematic errors) are discussed in [4, 5].

2.1. Crab Nebula
With all the data recorded in 3.5 years, ARGO-YBJ observed a TeV signal from the Crab Nebula with a statistical significance of about 17 s.d., proving that the cumulative sensitivity of the detector reached 0.3 Crab units. The observed flux is consistent with steady emission, and the observed differential energy spectrum in the 0.3 ÷ 30 TeV range can be described by dN/dE = (3.0 ± 0.3) × 10^−11 (E/TeV)^(−2.57±0.09) photons cm^−2 s^−1 TeV^−1, in good agreement with other observations. According to MC simulations, 84 % of the detected events come from primary photons with energies greater than 300 GeV, while only 8 % come from primaries above 10 TeV. We evaluate the systematic error on the flux to be less than 30 %, mainly due to the background estimation and to the uncertainty on the absolute energy scale.

According to the AGILE and Fermi data, 4 major flaring episodes at energies E > 100 MeV occurred during the ARGO-YBJ data acquisition [6–9].

Flare 1: starting time MJD 54857 (Feb. 2009), duration Δt ∼ 16 days, maximum flux f_max ∼ 5 times larger than the steady flux [7]. During this flare no excess is present in our data, for any multiplicity threshold.

Flare 2: starting time MJD 55457 (Sept. 2010), duration Δt ∼ 4 days, maximum flux f_max ∼ 5 times larger than the steady flux [6, 7]. According to the temporal data analysis, the γ-ray emission is concentrated in 3 narrow peaks of ∼ 12 hours duration each [8, 10]. Integrating the 3 transits, we observe for N_pad > 40 an excess of 3.1 s.d. over the expected steady flux (0.55 s.d.). If the excess were due to a flare, the γ-ray flux would be higher by a factor of ∼ 5 with respect to the steady flux at energies around 1 TeV. Integrating the data over 10 transits (from MJD 55456/57 to MJD 55465/66), the signal significance is 4.1 s.d.
(pre-trial), while 1.0 s.d. is expected from the steady flux [11]. No measurement from Cherenkov telescopes is available in coincidence with our observations and with the various spikes to confirm this excess; sporadic measurements performed by the MAGIC and VERITAS telescopes at different times, from MJD 55456.45 to MJD 55459.49, show no evidence of flux variability [12, 13].

Flare 3: starting time MJD 55660 (Apr. 2011) [14], duration Δt ∼ 6 days, maximum flux f_max ∼ 14 times larger than the steady flux [8]. Integrating the ARGO-YBJ data over the 6 days in which AGILE detected a flux enhancement, i.e. from MJD 55662 to MJD 55668, we find evidence of an excess for events with N_pad > 100 at a level of about 3.5 s.d. No measurement from Cherenkov telescopes is available during these days, due to the presence of the Moon during the Crab transits.

Flare 4: starting time MJD 56111 (July 3, 2012) [9], duration Δt ∼ 3 days. The daily-averaged emission doubled from (2.4 ± 0.5) × 10^−6 ph cm^−2 s^−1 on July 2 to (5.5 ± 0.7) × 10^−6 ph cm^−2 s^−1 on July 3. A preliminary analysis of the ARGO-YBJ data shows an excess of events with a statistical significance of about 4 s.d. from a direction consistent with the Crab Nebula on July 3, corresponding to a flux ≈ 8 times higher than the average emission at a median energy of ∼ 1 TeV [15]; the expected steady flux corresponds to 0.33 s.d. No significant excess is detected in the following days, from July 4 to 6. Once again, no measurement from Cherenkov telescopes is available during these days.

In conclusion, the ARGO-YBJ data show marginal evidence of a TeV flux increase correlated with the MeV ÷ GeV flaring activity, with insufficient statistical significance to draw a firm conclusion (the post-trial probability is of order 10^−3 for each flare). Nevertheless, the probability of observing 3 flares out of 4 in coincidence with the satellites by chance is very low. A detailed analysis of the Crab Nebula TeV emission is under way, and a paper is in preparation.

2.2. Blazar Mrk421
ARGO-YBJ is monitoring Mrk421 above 0.3 TeV, studying the correlation of the TeV flux with the X-ray data. We observed this source with a total significance of about 14 s.d., averaging over quiet and active periods. It is well known that this AGN is characterized by strong flaring activity both in X-rays and in TeV γ-rays, and many flares are observed simultaneously in both bands. The γ-ray flux observed by ARGO-YBJ has a clear correlation with the X-ray flux; no lag between the X-ray and γ-ray photons longer than 1 day is found. The evolution of the spectral energy distribution (SED) is investigated by measuring the spectral indices at four different flux levels; spectral hardening is observed in both the X-ray and the γ-ray bands. The γ-ray flux increases quadratically with the simultaneously measured X-ray flux. All these observational results strongly favour the synchrotron self-Compton (SSC) process as the underlying radiative mechanism. The results of the Mrk421 long-term monitoring are summarized in [16].
2.3. Blazar Mrk501
The long-term observation of the Mrk501 TeV emission by ARGO-YBJ can be described by the differential energy spectrum dN/dE = (1.92 ± 0.44) × 10^−12 (E/2 TeV)^(−2.59±0.27) photons cm^−2 s^−1 TeV^−1, corresponding to 0.312 ± 0.076 Crab units above 1 TeV. The largest flare since 2005 started in October 2011. During the brightest γ-ray flaring episodes, from October 17 to November 22, 2011, an excess of the event rate over 6 s.d. was detected by ARGO-YBJ, corresponding to an increase of the γ-ray flux above 1 TeV by a factor of 6.6 ± 2.2 with respect to the steady emission [17]. During the flare, the differential flux is dN/dE = (2.92 ± 0.52) × 10^−12 (E/4 TeV)^(−2.07±0.21) photons cm^−2 s^−1 TeV^−1, corresponding to 2.05 ± 0.48 Crab units above 1 TeV. Remarkably, γ-rays with energies above 8 TeV are detected with a statistical significance of about 4 s.d., which had not happened since the 1997 flare. The average SED for the steady emission is well described by a simple one-zone SSC model; however, the detection of γ-rays above 8 TeV during the flare challenges this model, due to the hardness of the spectra [17].

2.4. MGRO J1908+06
The γ-ray source MGRO J1908+06 was discovered by Milagro at a median energy of ∼ 20 TeV and confirmed by HESS at energies above 300 GeV. The Milagro and HESS energy spectra are in disagreement, the Milagro result being higher by a factor of about 3 at 10 TeV. ARGO-YBJ observed a TeV emission from MGRO J1908+06 with a maximum significance of about 7.3 s.d. for N_pad ≥ 20 in 6867 hours on-source [18]. The intrinsic extension is determined to be σ_ext = 0.49° ± 0.22°, consistent with the HESS measurement (σ_ext = 0.34°, +0.04°/−0.03°). The best-fit spectrum obtained is dN/dE = (6.1 ± 1.4) × 10^−13 (E/4 TeV)^(−2.54±0.36) photons cm^−2 s^−1 TeV^−1, in the energy range 1 ÷ 20 TeV (see Fig. 1: gamma-ray flux from MGRO J1908+06 measured by ARGO-YBJ compared to other measurements; the dashed area represents the 1 s.d. error, and the plotted errors are purely statistical for all the detectors; for details and references, see [18]). The measured γ-ray flux is consistent with the Milagro results, but is ∼ 2 ÷ 3 times larger than the flux derived by HESS at energies of a few TeV.
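The power-law fits quoted in this section are easy to compare numerically. The sketch below is only an illustration of that comparison: it evaluates the ARGO-YBJ best-fit spectra of MGRO J1908+06 and of the Crab Nebula (Section 2.1) at 1 TeV, giving a ratio of about 0.69, consistent with the ∼67 % of the Crab flux quoted just below.

def power_law(k, e0_tev, gamma):
    """dN/dE = k * (E / e0)**(-gamma), in photons cm^-2 s^-1 TeV^-1."""
    return lambda e_tev: k * (e_tev / e0_tev) ** (-gamma)

crab = power_law(3.0e-11, 1.0, 2.57)    # ARGO-YBJ Crab fit (Section 2.1)
j1908 = power_law(6.1e-13, 4.0, 2.54)   # ARGO-YBJ MGRO J1908+06 fit

print(f"J1908 / Crab at 1 TeV: {j1908(1.0) / crab(1.0):.2f}")  # ~0.69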
Given the reduced significance of the excess at high energies, we are not able to constrain the shape of the spectrum above 10 TeV and to definitively rule out a possible high-energy cutoff. The continuity of the Milagro and ARGO-YBJ observations and the stable excess rate observed by ARGO-YBJ throughout 4 years of data collecting support the identification of MGRO J1908+06 as a stable extended source, likely the TeV nebula of PSR J1907+0602, with a flux at 1 TeV ∼ 67 % that of the Crab Nebula. Assuming a distance of 3.2 kpc, the integrated luminosity above 1 TeV is ∼ 1.8 times that of the Crab Nebula, making MGRO J1908+06 one of the most luminous galactic γ-ray sources at TeV energies [18].

2.5. MGRO J2031+41 and the Cygnus region
The Cygnus region contains a large column density of interstellar gas and is rich in potential CR acceleration sites, such as Wolf–Rayet stars, OB associations and supernova remnants. Several VHE γ-ray sources have been discovered within this region in the past decade, including two bright extended sources detected by the Milagro experiment.

The γ-ray source MGRO J2031+41, detected by Milagro at a median energy of ∼ 20 TeV, is spatially consistent with the source TeV J2032+4130 discovered by the HEGRA collaboration and likely associated with the Fermi pulsar 1FGL J2032.2+4127. The extension measured by Milagro, 3.0° ± 0.9°, is much larger than that initially estimated by HEGRA (about 0.1°). The bright unidentified source MGRO J2019+37 is the most significant source in the Milagro data set apart from the Crab Nebula. This is an enigmatic source, as its high flux has not been confirmed by other VHE γ-ray detectors. Recently, a deep survey carried out by VERITAS with a sensitivity of ∼ 1 % Crab units showed a complex emitting region, with different faint sources inside the MGRO J2019+37 extension; the estimated flux is much weaker than that determined by Milagro.

The Cygnus region has been studied by ARGO-YBJ with the data collected in a total effective observation time of 1182 days [19]. The results of the data analysis are shown in Figs. 2 and 3. (Figure 2: significance map of the Cygnus region as observed by ARGO-YBJ for N_pad > 20. The four known VHE γ-ray sources are reported. The errors on the MGRO source positions are marked with crosses, while the circles indicate their intrinsic sizes. The cross for VER J2019+407 indicates its extension; the source VER J2016+372 is marked with a small circle without position errors. The small circle within the errors of MGRO J2031+41 indicates the position and extension of the source TeV J2032+4130, as estimated by the MAGIC collaboration. The open stars mark the locations of the 24 GeV sources in the second Fermi LAT catalog. For details and references, see [19].)

A TeV emission from a position consistent with MGRO J2031+41/TeV J2032+4130 is found, with a significance of 6.4 s.d. Assuming the background spectral index −2.8, the intrinsic extension is determined to be σ_ext = (0.2 +0.4/−0.2)°, consistent with the estimations by the MAGIC and HEGRA experiments, i.e., (0.083 ± 0.030)° and (0.103 ± 0.025)°, respectively. The differential flux is dN/dE = (1.40 ± 0.34) × 10^−11 (E/TeV)^(−2.83±0.37) ph cm^−2 s^−1 TeV^−1, in the energy range 0.6 ÷ 7 TeV (Fig. 3: energy spectrum of MGRO J2031+41/TeV J2032+4130 as measured by ARGO-YBJ, compared with the spectral measurements of HEGRA, MAGIC and Milagro). Assuming σ_ext = 0.1°, the integral flux is 31 % that of the Crab at energies above 1 TeV, which is higher than the flux of TeV J2032+4130 as determined by HEGRA (5 %) and MAGIC (3 %). Again, this measurement is in fair agreement with the Milagro result.
The reason for the large discrepancy between the fluxes measured by the Cherenkov telescopes and by the EAS arrays (ARGO-YBJ and Milagro) is still unclear. Possible contributions from the diffuse γ-ray emission, from nearby sources, and from systematic uncertainties are not enough to explain the discrepancy [19].

No evidence of a TeV emission above 3 s.d. is found at the location of MGRO J2019+37, and flux upper limits at the 90 % c.l. are set. At energies above 5 TeV, the ARGO-YBJ exposure is still insufficient to reach a firm conclusion, while at lower energies the ARGO-YBJ upper limit is marginally consistent with the spectrum determined by Milagro. The observation by ARGO-YBJ is about five years later than that by Milagro, so a flux variation over the whole extended region cannot be completely excluded. If the flux variation were dominated by a smaller region in the source area, the picture could be more reasonable; in such a scenario, however, identifying MGRO J2019+37 as a PWN could be a dilemma, because it otherwise should have a steady flux.

3. Cosmic ray physics
Several interesting results have been obtained by ARGO-YBJ in CR physics, as discussed in [3]. In the following sections, the measurements of the anisotropy in the CR arrival direction distribution and of the light component (p + He) spectrum of CRs are described.

3.1. Large scale CR anisotropy
The observation of the CR large scale anisotropy by ARGO-YBJ is shown in Fig. 4 as a function of the primary energy, up to about 25 TeV (Figure 4: large scale CR anisotropy observed by ARGO-YBJ as a function of the energy; the color scale gives the relative CR intensity). The so-called "tail-in" and "loss-cone" regions, correlated with an enhancement and a deficit of CRs, respectively, are clearly visible with a statistical significance greater than 20 s.d. The tail-in broad structure appears to break up into smaller spots with increasing energy. In order to quantify the scale of the anisotropy, we studied the 1-D R.A. projections, integrating the sky maps inside a declination band given by the field of view of the detector. For this, we fitted the R.A. profiles with the first two harmonics. The resulting amplitude and phase of the first harmonic are plotted in Figs. 5 and 6, where they are compared to other measurements as a function of the energy (Figure 5: amplitude of the first harmonic as a function of the energy, compared with other measurements; Figure 6: phase of the first harmonic as a function of the energy, compared with other measurements). The ARGO-YBJ results are in agreement with those of other experiments, suggesting a decrease of the anisotropy first harmonic amplitude at energies above 10 TeV.
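The harmonic analysis of the R.A. profiles can be illustrated with a few lines of linear least squares. The profile below is synthetic (the real ARGO-YBJ profiles are not tabulated here); the model, I(α) = A0 + A1 cos(α − φ1) + A2 cos(2α − φ2), is the standard two-harmonic choice described above.

import numpy as np

ra = np.radians(np.arange(0.0, 360.0, 10.0))             # R.A. bin centres
profile = 1.0 + 1e-3 * np.cos(ra - np.radians(40.0))     # synthetic relative intensity

design = np.column_stack([np.ones_like(ra),
                          np.cos(ra), np.sin(ra),           # first harmonic
                          np.cos(2 * ra), np.sin(2 * ra)])  # second harmonic
coef, *_ = np.linalg.lstsq(design, profile, rcond=None)

amplitude = np.hypot(coef[1], coef[2])                   # first-harmonic amplitude
phase_deg = np.degrees(np.arctan2(coef[2], coef[1]))     # first-harmonic phase
print(amplitude, phase_deg)                              # ~1e-3 and ~40 degrees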
3.2. medium scale anisotropy
figure 7 shows the argo-ybj sky map in equatorial coordinates containing about 2 × 10¹¹ events reconstructed with a zenith angle ≤ 50°. the zenith cut selects the declination region δ ∼ −20° ÷ 80°; according to simulations, the median energy of the isotropic cr proton flux is $E_p^{50} \approx 1.8$ tev (mode energy ≈ 0.7 tev). the most evident features are observed by argo-ybj around the positions α ∼ 120°, δ ∼ 40° and α ∼ 60°, δ ∼ −5°, positionally coincident with the excesses detected by milagro [20]. these regions, named "1" and "2", are observed with a statistical significance of about 14 s.d. the deficit regions parallel to the excesses are due to a known effect of the analysis, which also uses the excess events when evaluating the background, thereby overestimating it.

figure 6. phase of the first harmonic as a function of the energy, compared with other measurements.

figure 7. medium scale cr anisotropy observed by argo-ybj. the color scale gives the statistical significance of the observation in standard deviations.

the left side of the sky map seems to be full of few-degree excesses not compatible with random fluctuations (the statistical significance is more than 6 s.d. post-trial). the observation of these structures is reported here for the first time, and together with that of regions 1 and 2 it may open the way to an interesting study of the tev cr sky. to figure out the energy spectrum of the excesses, the data have been divided into five independent shower-multiplicity sets. the number of events collected within each region is computed for the event map (ev) as well as for the background map (bg), and the relative excess (ev − bg)/bg is computed for each multiplicity interval. the result is shown in fig. 8. region 1 seems to have a spectrum harder than that of isotropic crs, with a cutoff around 600 shower particles (proton median energy $E_p^{50} = 8$ tev). on the other hand, the excess hosted in region 2 is less intense and seems to have a spectrum more similar to that of isotropic crs. we point out that, in order to filter the global anisotropy, we used a method similar to that used by milagro and icecube; further studies using different approaches are under way.

figure 8. size spectrum of regions 1 and 2. the vertical axis represents the relative excess (ev − bg)/bg; the upper scale shows the corresponding proton median energy. the shadowed areas represent the 1σ error band.

figure 9. light component (p + he) spectrum of primary crs measured by argo-ybj compared with other experimental results (macro & eas-top, ams, bess, caprice, cream, jacee, runjob, and the hörandel (2003) parametrization).

3.3. light component (p + he) spectrum of crs
requiring quasi-vertical showers (θ < 30°) and applying a selection criterion based on particle density, a sample of events mainly induced by protons and helium nuclei, with shower cores inside a fiducial area of radius ∼ 28 m, has been selected. the contamination by heavier nuclei is found to be negligible. an unfolding technique based on the bayesian approach has been applied to the strip multiplicity distribution in order to obtain the differential energy spectrum of the light component (p + he nuclei) in the energy range (5 ÷ 200) tev [23]. the spectrum measured by argo-ybj is compared with other experimental results in fig. 9. systematic effects due to different hadronic models and to the selection criteria do not exceed 10 %.
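the unfolding step mentioned above can be illustrated with a toy d'agostini-style bayesian iteration. everything in this sketch (response matrix, true spectrum, counts) is synthetic and assumes unit efficiency; it only shows the type of algorithm, not the actual argo-ybj analysis.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

# toy response matrix R[j, i] = P(measured bin j | true bin i), columns normalized
R = np.zeros((n, n))
for i in range(n):
    R[:, i] = np.exp(-0.5 * ((np.arange(n) - i) / 0.8) ** 2)
    R[:, i] /= R[:, i].sum()

true = np.array([1e5, 4e4, 1.5e4, 6e3, 2e3])       # assumed true spectrum
meas = rng.poisson(R @ true).astype(float)          # smeared, fluctuated counts

est = np.full(n, meas.sum() / n)                    # flat starting prior
for _ in range(10):
    post = R * est / (R @ est)[:, None]             # P(true i | measured j)
    est = post.T @ meas                             # updated estimate

print(np.round(est))                                # should approach 'true'
```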
the argo-ybj data agree remarkably well with the values obtained by adding up the p and he fluxes measured by cream, concerning both the total intensities and the spectral index [21]. the value of the spectral index of the power-law fit to the argo-ybj data is −2.61 ± 0.04, which should be compared with γp = −2.66 ± 0.02 and γhe = −2.58 ± 0.02 obtained by cream. the present analysis does not allow the determination of the individual p and he contributions to the measured flux, but the argo-ybj data clearly exclude the runjob results [22]. we emphasize that, for the first time, direct and ground-based measurements overlap over a wide energy range, thus making it possible to cross-calibrate the different experimental techniques.

4. conclusions
the argo-ybj detector, exploiting the full-coverage approach and the high segmentation of the read-out, is imaging the front of atmospheric showers with unprecedented resolution and detail. the digital and analog read-out will allow a deep study of cr phenomenology in the wide tev ÷ pev energy range. the results obtained in the low energy range (below 100 tev) indicate an excellent capability to address a wide range of important issues in astroparticle physics.

references
[1] aielli, g. et al.: 2006, nim a562, 92
[2] bartoli, b. et al.: 2011a, phys. rev. d84, 022003
[3] d'ettorre piazzoli, b.: 2011, in 32nd icrc proc., highlight talk
[4] aielli, g. et al.: 2010a, apj 714, l208
[5] bartoli, b. et al.: 2011b, apj 734, 110
[6] tavani, m. et al.: 2011, science 331, 736
[7] abdo, a. a. et al.: 2011, science 331, 739
[8] striani, e. et al.: 2011, arxiv:1105.5028
[9] ojha, r. et al.: 2012, astron. telegram 4239
[10] balbo, m. et al.: 2011, a&a 527, l4
[11] aielli, g. et al.: 2010b, astron. telegram 2921
[12] mariotti, m. et al.: 2010, astron. telegram 2967
[13] ong, r. et al.: 2010, astron. telegram 2968
[14] buehler, r. et al.: 2011, astron. telegram 3276
[15] bartoli, b. et al.: 2012a, astron. telegram 4258
[16] bartoli, b. et al.: 2011c, apj 734, 110
[17] bartoli, b. et al.: 2012b, apj in press
[18] bartoli, b. et al.: 2012c, apj in press, arxiv:1207.6280
[19] bartoli, b. et al.: 2012d, apjl 745, l22
[20] abdo, a. a. et al.: 2008, phys. rev. lett. 101, 221101
[21] yoon, y. s. et al.: 2011, apj 728, 122
[22] derbina, v. a. et al.: 2005, apjl 628, l41
[23] bartoli, b. et al.: 2012e, phys. rev. d 85, 092005

acta polytechnica 53(supplement):646–651, 2013

acta polytechnica 53(2):63–69, 2013
© czech technical university in prague, 2013, available online at http://ctn.cvut.cz/ap/

in memory of alois apfelbeck: an interconnection between cayley-eisenstein-pólya and landau probability distributions

vladimír vojta∗
chotovická 12, 182 00 praha 8
∗ corresponding author: vojta@karneval.cz

abstract. the interconnection between the cayley-eisenstein-pólya distribution and the landau distribution is studied, and possibly new transform pairs for the laplace and mellin transforms, as well as integral expressions for the lambert w function, have been found.
keywords: cayley-eisenstein-pólya distribution, landau distribution, lambert function, laplace transform, mellin transform, mellin multiplier, hadamard fractional integral, liouville fractional integral.
ams mathematics subject classification: 33e99, (44a10, 44a15, 26a33).

1. introduction
in their seminal paper on queueing theory [1], abate and whitt studied a cumulative probability distribution function (c.d.f.) $F(x)$ with a pertinent probability density function (p.d.f.) $f(x)$, $x > 0$, such that its moment generating function
$$g(s) = \int_0^\infty e^{-sx}\,dF(x) = \int_0^\infty e^{-sx} f(x)\,dx \tag{1.1}$$
satisfies the functional equation
$$g(s) = e^{-s\,g(s)}. \tag{1.2}$$
it was shown in [1] that the moments of $F(x)$ are
$$m_n = (n+1)^{n-1},\quad n \ge 0. \tag{1.3}$$
the authors of [1] named this probability distribution the cayley-eisenstein-pólya (c.e.p.) distribution. the solution of the functional equation (1.2) was not carried out in [1]. the primary effort of the author was to solve this problem, with the aim of obtaining an explicit formula for the probability density function $f(x)$. during the calculation of $f(x)$ by three methods, it was found that the c.e.p. distribution is related to the landau distribution function [2].

theorem 1.1. the solution of the functional equation (1.2) is the function $g(s) = W(s)/s = e^{-W(s)}$, $s \in \mathbb{C}$, where $W(s)$ is the lambert function [3] and $\mathbb{C}$ is the set of all complex numbers.

proof. let $u(s) = s\,g(s)$. then eq. (1.2) can be rewritten as $u(s)\,e^{u(s)} = s$. but this is the definition of the lambert w function as a solution of the functional equation $W(s)\,e^{W(s)} = s$. for our case, we select the real branch $W_0(s)$ of $W(s)$, which is a real function for $s \ge -1/e$. the series expansion of $W_0(s)$ at $s = 0$ is
$$W_0(s) = \sum_{n=1}^{\infty} (-n)^{n-1}\,\frac{s^n}{n!}, \tag{1.4}$$
with the radius of convergence equal to $1/e$ [3]. after some algebraic operations, we obtain the moment generating function $g(s)$ in the form of the series
$$g(s) = \frac{W_0(s)}{s} = \sum_{n=0}^{\infty} (n+1)^{n-1}\,\frac{(-s)^n}{n!},\qquad g(0) = 1, \tag{1.5}$$
in accordance with (1.3). the radius of convergence is also equal to $1/e$. it should be noted that the moment generating function is often taken as $m(s) = g(-s)$, because in this case the generating function has the form of an (at least formal) power series.

2. inversion of the moment generating function
to obtain the p.d.f. $f(x)$, we had to invert the laplace transform (1.1). three procedures were used: direct, mellin and stieltjes.

2.1. direct procedure
theorem 2.1. the explicit form of the cayley-eisenstein-pólya p.d.f. is
$$f(x) = \frac{1}{\pi}\int_0^{\pi}\big(y^2 + (1 - y\cot y)^2\big)\,e^{-x\,y\csc y\,e^{-y\cot y}}\,dy,\quad x > 0. \tag{2.1}$$

proof. the proof is based on the relation [4]
$$\frac{W_0(s)}{s} = \frac{1}{\pi}\int_0^{\pi}\frac{y^2 + (1 - y\cot y)^2}{s + y\csc y\,e^{-y\cot y}}\,dy,\quad s \in \mathbb{C}\setminus(-\infty,-1/e), \tag{2.2}$$
and on direct application of the bromwich inversion formula for the laplace transform and of the fubini theorem:
$$f(x) = \frac{1}{2i\pi}\int_{c-i\infty}^{c+i\infty} e^{xs}\,\frac{W_0(s)}{s}\,ds = \frac{1}{\pi}\int_0^{\pi}\big(y^2 + (1-y\cot y)^2\big)\left(\frac{1}{2i\pi}\int_{c-i\infty}^{c+i\infty}\frac{e^{xs}}{s + y\csc y\,e^{-y\cot y}}\,ds\right)dy,\quad c > -1/e,\ x > 0. \tag{2.3}$$
the bromwich integral in parentheses gives $e^{-x\,y\csc y\,e^{-y\cot y}}$, because the inverse laplace transform of the function $1/(s+a)$ is $e^{-ax}$, and we finally obtain eq. (2.1).
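the moment property (1.3) provides a direct numerical check of eq. (2.1): integrating $x^n e^{-a x}$ over $x$ gives $n!/a^{n+1}$, so the $n$-th moment reduces to a single quadrature. the sketch below, assuming scipy, should reproduce $(n+1)^{n-1} = 1, 1, 3, 16$ for $n = 0,\dots,3$.

```python
import numpy as np
from math import lgamma, pi
from scipy.integrate import quad

def log_a(y):
    # log of the decay rate a(y) = y*csc(y)*exp(-y*cot(y)) from eq. (2.1)
    return np.log(y / np.sin(y)) - y / np.tan(y)

def weight(y):
    # the weight y^2 + (1 - y*cot(y))^2 from eqs. (2.1)-(2.2)
    return y * y + (1.0 - y / np.tan(y)) ** 2

def moment(n):
    # m_n = (1/pi) * int_0^pi weight(y) * n! / a(y)^(n+1) dy
    f = lambda y: weight(y) * np.exp(lgamma(n + 1) - (n + 1) * log_a(y))
    val, _ = quad(f, 1e-9, pi - 1e-9, limit=200)
    return val / pi

for n in range(4):
    print(n, round(moment(n), 6), (n + 1) ** (n - 1))   # cf. eq. (1.3)
```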
2.2. mellin procedure
the mellin procedure is based on the "cannibalistic" feature of the mellin transform: the mellin transform of the laplace transform (and of many other transforms as well) of some function is given by the mellin transform of that function only [5]. symbolically:
$$\mathcal{M}\big[\mathcal{L}[h(t);p];s\big] = \Gamma(s)\,\mathcal{M}[h(t);1-s],\qquad \mathcal{M}[h(t);s] = \mathcal{M}\big[\mathcal{L}[h(t);p];1-s\big]\big/\Gamma(1-s), \tag{2.4}$$
where $\mathcal{M}$ and $\mathcal{L}$ represent the mellin transform and the laplace transform, and $\Gamma(s)$ is the gamma function of argument $s$.

lemma 2.2. the following mellin transform pair holds:
$$s^s = \int_0^\infty x^{s-1} f_0(x)\,dx,\quad s > 0,\qquad\text{where}\quad f_0(x) = \frac{1}{\pi}\int_0^\infty x^y\,y^{-y}\sin\pi y\,dy,\quad x > 0. \tag{2.5}$$

proof. in accordance with the bromwich inversion formula for the mellin transform, the function $f_0(x)$ can be recovered by the contour integrals
$$f_0(x) = \frac{1}{2i\pi}\int_{c-i\infty}^{c+i\infty} x^{-s} s^s\,ds = \frac{1}{2i\pi}\int_{-\infty}^{(0+)} x^{-s} s^s\,ds, \tag{2.6}$$
where the first integral is along the bromwich contour and the second integral is along the left anticlockwise hankel contour. the equivalence of these two integrals is based on the cauchy integral theorem and on the fact that the integrand has two branch points $s = 0$, $s = -\infty$ and a branch cut in the $s$-plane along the interval $(-\infty,0)$, and no other singularity. then
$$f_0(x) = \frac{1}{2i\pi}\int_{-\infty}^{(0+)} x^{-s}s^s\,ds = \frac{1}{2i\pi}\int_0^\infty x^y e^{-y(\ln y - i\pi)}\,dy - \frac{1}{2i\pi}\int_0^\infty x^y e^{-y(\ln y + i\pi)}\,dy + \frac{1}{2i\pi}\lim_{r\to 0}\int_{c_r} x^{-s}s^s\,ds = \frac{1}{\pi}\int_0^\infty x^y y^{-y}\sin\pi y\,dy,\quad x > 0, \tag{2.7}$$
where $c_r$ is a small circle with radius $r$ around the origin.

theorem 2.3. the cayley-eisenstein-pólya p.d.f. $f(x)$ is represented by the integral
$$f(x) = \int_x^\infty \ln\frac{z}{x}\,\frac{f_0(z)}{z}\,dz,\quad x > 0, \tag{2.8}$$
where the function $f_0(x)$ is given by eq. (2.7).

proof. we start the proof from the known relationship [3]
$$\int_0^\infty u^{s-1}\,W_0(u)\,du = (-s)^{-s}\,\frac{\Gamma(s)}{s},\quad -1 < \Re s < 0,$$
which, combined with (2.4), yields the mellin transform of the p.d.f.,
$$\int_0^\infty x^{s-1} f(x)\,dx = \frac{s^s}{s^2}. \tag{2.11}$$
inversion of the mellin transform (2.11) can be divided into two steps. the first step is the inversion of $s^s$, giving the auxiliary function $f_0(x)$ according to lemma 2.2, eq. (2.7). the second step, according to [5, entry 1.17, p. 12] combined with [5, entry 1.3, p. 11], gives the inversion $f_1(x)$ of $s^s/s$ as
$$f_1(x) = \int_x^\infty \frac{f_0(z)}{z}\,dz,\quad x > 0, \tag{2.12}$$
and, repeating this procedure once more, we obtain the p.d.f. $f(x)$ as
$$f(x) = \int_x^\infty \frac{f_1(z)}{z}\,dz,\quad x > 0. \tag{2.13}$$
the two consecutive integrations (2.12) and (2.13) can be replaced by the single integration (2.8).

2.3. stieltjes procedure
this procedure is based on the fact that the function $W_0(s)/s$ is a stieltjes function, and can be represented as the stieltjes transform [4]:
$$\frac{W_0(s)}{s} = \frac{1}{\pi}\int_{1/e}^\infty \frac{\Im W_0(-u)}{u}\,\frac{1}{s+u}\,du,\quad s \in \mathbb{C}\setminus(-\infty,-1/e), \tag{2.14}$$
where $W_0(-u) = \lim_{\theta\to 0^+} W_0(-u+i\theta)$ for $u > 1/e$ and $\Im(\cdot)$ means the imaginary part of the argument.

theorem 2.4. the cayley-eisenstein-pólya p.d.f. $f(x)$ is represented by the integral
$$f(x) = \frac{1}{\pi}\int_{1/e}^\infty e^{-xu}\,\frac{\Im W_0(-u)}{u}\,du,\quad x > 0. \tag{2.15}$$

proof. the proof is straightforward, because the stieltjes transform is equivalent to the iterated laplace transform.

conversely, by applying the euler differential operator $-x\,d/dx$ to the p.d.f. $f(x)$ from eq. (2.1) or eq. (2.15), respectively, we obtain the function $f_1(x)$:

theorem 2.5. the function $f_1(x)$ is represented by the integrals
$$f_1(x) = -x\,\frac{d}{dx}f(x) = \frac{1}{\pi}\int_0^\pi \big(y^2 + (1-y\cot y)^2\big)\,x\,y\csc y\;e^{-x\,y\csc y\,e^{-y\cot y}\,-\,y\cot y}\,dy = \frac{1}{\pi}\Big(-x\frac{d}{dx}\Big)\int_{1/e}^\infty e^{-xu}\,\frac{\Im W_0(-u)}{u}\,du = \frac{x}{\pi}\int_{1/e}^\infty e^{-xu}\,\Im W_0(-u)\,du,\quad x > 0. \tag{2.16}$$

proof. by applying the differential operator $-x\,d/dx$ to the p.d.f. $f(x)$ from eq. (2.1) and eq. (2.15), we obtain the function $f_1(x)$. in both cases the conditions for the interchange of differentiation and integration are fulfilled.
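the direct and stieltjes representations can be compared numerically. the sketch below, assuming scipy's lambertw (whose principal branch carries a positive imaginary part on the cut, matching $\Im W_0(-u)$), evaluates eqs. (2.1) and (2.15) at a few points; the two quadratures should agree to within the integration tolerance.

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import lambertw

np.seterr(over='ignore')          # the direct integrand underflows to 0 near pi

def f_direct(x):
    # eq. (2.1)
    def g(y):
        a = y / np.sin(y) * np.exp(-y / np.tan(y))
        return (y * y + (1 - y / np.tan(y)) ** 2) * np.exp(-x * a)
    v, _ = quad(g, 1e-9, np.pi - 1e-9, limit=200)
    return v / np.pi

def f_stieltjes(x):
    # eq. (2.15), using Im W0(-u) on the branch cut u > 1/e
    g = lambda u: np.exp(-x * u) * lambertw(-u).imag / u
    v, _ = quad(g, 1 / np.e, np.inf, limit=200)
    return v / np.pi

for x in (0.5, 1.0, 2.0):
    print(x, f_direct(x), f_stieltjes(x))   # the two columns should match
```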
by two successive applications of the differential operator $-x\,d/dx$, i.e. by applying the euler differential operator $x\,d/dx + x^2\,d^2/dx^2$ to the p.d.f. $f(x)$ from eq. (2.1), we obtain the function $f_0(x)$:

theorem 2.6. the function $f_0(x)$ is represented by the integrals
$$f_0(x) = \Big(x\frac{d}{dx} + x^2\frac{d^2}{dx^2}\Big)f(x) = \frac{1}{\pi}\int_0^\pi\big(y^2 + (1-y\cot y)^2\big)\,m(x,y)\,dy,\quad x \ge 0, \tag{2.17}$$
where
$$m(x,y) = x\,y\csc y\;e^{-x\,y\csc y\,e^{-y\cot y}\,-\,2y\cot y}\,\big(x\,y\csc y - e^{y\cot y}\big), \tag{2.18}$$
and
$$f_0(x) = \frac{1}{\pi}\Big(x\frac{d}{dx} + x^2\frac{d^2}{dx^2}\Big)\int_{1/e}^\infty e^{-xu}\,\frac{\Im W_0(-u)}{u}\,du = \frac{x}{\pi}\int_{1/e}^\infty e^{-xu}\,\Im W_0(-u)\,(xu - 1)\,du,\quad x > 0. \tag{2.19}$$

proof. by two successive applications of the differential operator $-x\,d/dx$, i.e. by applying the euler differential operator $x\,d/dx + x^2\,d^2/dx^2$ to the p.d.f. $f(x)$ from eq. (2.1) and eq. (2.15), we obtain the function $f_0(x)$.

after the substitution $x = e^{-\lambda}$, the resulting integral on the right hand side of eq. (2.7) gives the well-known landau p.d.f. [2] (more precisely its universal part), which describes the energy loss of a fast charged particle by ionization as it passes through a thin layer of matter:
$$\varphi(\lambda) = \frac{1}{\pi}\int_0^\infty e^{-\lambda y}\,y^{-y}\sin\pi y\,dy,\quad \lambda > -\infty. \tag{2.20}$$
from eq. (2.17) and eq. (2.20) it follows that the landau p.d.f. has an alternative integral representation
$$\varphi(\lambda) = \frac{1}{\pi}\int_0^\pi\big(y^2 + (1-y\cot y)^2\big)\,m(e^{-\lambda},y)\,dy. \tag{2.21}$$

theorem 2.7. the interconnection between the cayley-eisenstein-pólya p.d.f. and the landau p.d.f. is given by
$$f(x) = -\int_{-\infty}^{-\ln x}(u + \ln x)\,\varphi(u)\,du,\quad x > 0. \tag{2.22}$$

proof. the differential equation in (2.17) is the euler nonhomogeneous differential equation for the unknown function $f(x)$ and given $f_0(x)$, with a particular integral (2.8). after substituting $x = e^{-\lambda}$ into this equation we obtain the following simple differential equation:
$$\frac{d^2}{d\lambda^2}\varphi_2(\lambda) = \frac{d^2}{d\lambda^2}f(e^{-\lambda}) = f_0(e^{-\lambda}) = \varphi(\lambda),\quad \lambda > -\infty, \tag{2.23}$$
where $\varphi_2(\lambda) = f(e^{-\lambda})$. the particular solution of eq. (2.23) is
$$\varphi_2(\lambda) = \int_{-\infty}^{\lambda}\varphi(u)\,(\lambda - u)\,du, \tag{2.24}$$
conformable to eq. (2.8). after substituting $\lambda = -\ln x$ into eq. (2.24), we obtain eq. (2.22).

3. integrals and integral transforms
the integral operator $T$ defined by eq. (2.12), $T(f_0)(x) = \int_x^\infty \frac{f_0(z)}{z}\,dz$, $x > 0$, is an example of a mellin multiplier operator with multiplier $1/s$ [6]. this means that the fractional powers of the operator $T$ are [6]
$$T^\alpha(f_0)(x) = \frac{1}{\Gamma(\alpha)}\int_x^\infty \Big(\ln\frac{z}{x}\Big)^{\alpha-1}\,\frac{f_0(z)}{z}\,dz,\quad x > 0,\ \alpha > 0,\qquad T^0 = I, \tag{3.1}$$
where $I$ is the identity operator, and that for the mellin transform of eq. (3.1) it holds that
$$\mathcal{M}\big[T^\alpha(f_0)(x);s\big] = s^{-\alpha}\,\mathcal{M}\big[f_0(x);s\big]. \tag{3.2}$$
we thus have a one-parameter family of functions $f_\alpha(x)$, $0 \le \alpha \le 2$, $f_2(x) \equiv f(x)$, given by
$$f_\alpha(x) = T^\alpha(f_0)(x),\quad x > 0. \tag{3.3}$$
moreover, the integral operator $T^\alpha(f_0)(x)$ in eq. (3.1) is the hadamard right-sided fractional integral of order $\alpha$ of the function $f_0(x)$ [7]. by analogy, there exists a one-parameter family of functions $\varphi_\alpha(x)$, $0 \le \alpha \le 2$, $\varphi_0(x) \equiv \varphi(x)$, given by
$$\varphi_\alpha(x) = U^\alpha(\varphi_0)(x),\quad x > -\infty, \tag{3.4}$$
where
$$U^\alpha(\varphi_0)(x) = \frac{1}{\Gamma(\alpha)}\int_{-\infty}^x (x-z)^{\alpha-1}\,\varphi_0(z)\,dz,\quad x > -\infty,\ \alpha > 0,\qquad U^0 = I, \tag{3.5}$$
is the liouville left-sided fractional integral [7] of order $\alpha$ of the landau p.d.f. equation (2.24) is the special case for $\alpha = 2$. because $W_0(s)/s$ is the laplace transform of the p.d.f. $f(x)$, it is natural to ask what the laplace transforms of the functions $f_0(x)$ and $f_1(x)$ are.

theorem 3.1. we have
$$\int_0^\infty e^{-sx}\,f_1(x)\,dx = \frac{d}{ds}W_0(s) = \frac{W_0(s)}{s\,(1+W_0(s))} = \frac{e^{-W_0(s)}}{1+W_0(s)},\quad \Re s > -1/e, \tag{3.6}$$
$$\int_0^\infty e^{-sx}\,f_0(x)\,dx = -\frac{d}{ds}W_0(s) + \frac{d^2}{ds^2}\big(s\,W_0(s)\big) = \frac{W_0(s)}{s\,(1+W_0(s))^3},\quad \Re s > -1/e. \tag{3.7}$$

proof. with the aid of the standard rules for the laplace transform, e.g. [8, entry 1.24, p. 6], applied to eq. (2.16) and eq. (2.17), respectively, we obtain the results.
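the closed forms in (3.6) and (3.7) can be sanity-checked against finite differences of the lambert function; the sketch below, assuming scipy, only verifies the right-hand-side identities, not the laplace integrals themselves.

```python
import numpy as np
from scipy.special import lambertw

s, h = 0.7, 1e-6
w = lambertw(s).real
dw = (lambertw(s + h).real - lambertw(s - h).real) / (2 * h)   # W0'(s)
print(dw, w / (s * (1 + w)), np.exp(-w) / (1 + w))             # three forms, eq. (3.6)

g = lambda x: x * lambertw(x).real                             # s*W0(s)
h2 = 1e-4
d2 = (g(s + h2) - 2 * g(s) + g(s - h2)) / h2**2                # (s W0(s))''
print(-dw + d2, w / (s * (1 + w) ** 3))                        # both sides, eq. (3.7)
```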
corollary 3.2. the following mellin transform pairs hold:
$$\int_0^\infty u^{s-1}\,\frac{W_0(u)}{u\,(1+W_0(u))}\,du = \Gamma(s)\,(1-s)^{-s},\quad 0 < \Re s < 1. \tag{3.11}$$

proof. the function $\frac{W_0(s)}{s(1+W_0(s))}$ is a stieltjes function [4]. from the definition of the stieltjes transform and from eq. (3.6), it follows that
$$\int_0^\infty e^{-sx}\,f_1(x)\,dx = \frac{1}{\pi}\int_{1/e}^\infty \Im\!\left(\frac{W_0(-u)}{u\,(1+W_0(-u))}\right)\frac{1}{s+u}\,du,\quad \Re s > -1/e, \tag{3.12}$$
and after inversion of the laplace transform we obtain eq. (3.10). equation (3.11) is a consequence of the relation $f_0(x) = -x\,\frac{d}{dx}f_1(x)$. the integral in eq. (3.10) also converges for $x = 0$.

theorem 3.4. we have
$$\int_0^\infty e^{-sx}\,\frac{f_0(x)}{x}\,dx = \frac{1}{1+W_0(s)},\quad \Re s > -1/e. \tag{3.13}$$

proof. because
$$\frac{d}{ds}\int_0^\infty e^{-sx}\,\frac{f_0(x)}{x}\,dx = -\int_0^\infty e^{-sx}\,f_0(x)\,dx = -\frac{W_0(s)}{s\,(1+W_0(s))^3},\quad \Re s > -1/e, \tag{3.14}$$
and
$$\frac{d}{ds}\,\frac{1}{1+W_0(s)} = -\frac{W_0(s)}{s\,(1+W_0(s))^3}, \tag{3.15}$$
we obtain eq. (3.13).

corollary 3.5. we have
$$\frac{1}{\pi}\int_0^\infty s^{-y}\,y^{-y}\,\sin\pi y\;\Gamma(1+y)\,dy = \frac{W_0(s)}{(1+W_0(s))^3},\quad s \in \mathbb{C}\setminus D_{1/e}, \tag{3.16}$$
$$\frac{1}{\pi}\int_0^\infty s^{-y}\,y^{-y}\,\sin\pi y\;\Gamma(y)\,dy = \frac{1}{1+W_0(s)},\quad s \in \mathbb{C}\setminus D_{1/e}, \tag{3.17}$$
where $D_{1/e}$ is the open disc of complex numbers with absolute value $r < 1/e$.

proof. substitution of $f_0(x)$ from eq. (2.7) into eq. (3.7) and eq. (3.13) gives eq. (3.16) and eq. (3.17), respectively.

after substituting $s = e^p$ into eq. (3.16) and eq. (3.17) we obtain the laplace transform pairs
$$\int_0^\infty e^{-py}\,y^{-y}\,\sin\pi y\;\Gamma(1+y)\,dy = \frac{\pi\,W_0(e^p)}{(1+W_0(e^p))^3},\quad \Re p > -1,\ -\pi < \Im p \le \pi, \tag{3.18}$$
$$\int_0^\infty e^{-py}\,y^{-y}\,\sin\pi y\;\Gamma(y)\,dy = \frac{\pi}{1+W_0(e^p)},\quad \Re p > -1,\ -\pi < \Im p \le \pi. \tag{3.19}$$
the fact that the abscissa of convergence of the two previous laplace integrals is equal to $-1$ can be verified by the stirling formula. it should be emphasized, however, that the right hand sides of eq. (3.18) and eq. (3.19) are not holomorphic functions in the half-plane $\Re p > -1$, $\Im p$ otherwise arbitrary. this means that there exist functions holomorphic in the half-plane $\Re p > -1$, defined by the integrals on the left hand sides of equations (3.18) and (3.19), and equivalent to the functions on the right hand sides in the region defined in eq. (3.18) and eq. (3.19). regarding eq. (3.18) or eq. (3.19) as an inverse problem requires applying the real methods of laplace transform inversion.

theorem 3.6. we have
$$\frac{\Gamma(s)}{\pi}\int_0^\infty y^{-s}\,y^{-y}\,\sin\pi y\;\Gamma(y)\,dy = \int_0^\infty \frac{y^{s-1}}{1+W_0(e^y)}\,dy,\quad 0 < \Re s < 1. \tag{3.21}$$

proof. from the mellin transform of eq. (2.15), taking eq. (2.11) and eq. (2.4) into account, it follows that
$$\frac{s^s}{s^2} = \mathcal{M}\!\left[\frac{1}{\pi}\int_{1/e}^\infty e^{-xu}\,\frac{\Im W_0(-u)}{u}\,du;\,s\right] = \frac{\Gamma(s)}{\pi}\int_{1/e}^\infty u^{-s}\,\frac{\Im W_0(-u)}{u}\,du,\quad \Re s > 0. \tag{3.22}$$

theorem 3.8. we have
$$f_1(x) = 1 - \frac{1}{\pi}\int_0^\infty x^y\,y^{-y-1}\,\sin\pi y\,dy,\quad x \ge 0. \tag{3.23}$$

proof. from eq. (3.13) it follows that
$$\int_0^\infty \frac{f_0(x)}{x}\,dx = \frac{1}{1+W_0(0)} = 1. \tag{3.24}$$
eq. (2.12) thus can be rewritten in the form
$$f_1(x) = 1 - \int_0^x \frac{f_0(z)}{z}\,dz = 1 - \frac{1}{\pi}\int_0^\infty y^{-y}\,\sin\pi y\int_0^x z^{y-1}\,dz\,dy, \tag{3.25}$$
where we substituted $f_0(z)$ from eq. (2.7). because $\int_0^x z^{y-1}\,dz = x^y/y$, $y > 0$, we obtain eq. (3.23).

corollary 3.9. we have
$$f_1(x) = 1 - \int_{-\ln x}^\infty \varphi(u)\,du = \int_{-\infty}^{-\ln x}\varphi(u)\,du,\quad x \ge 0, \tag{3.26}$$
where $\varphi(u)$ is the landau p.d.f. (2.20).

proof. equation (2.12) can also be rewritten in the form
$$f_1(x) = 1 - \int_0^x \frac{f_0(z)}{z}\,dz = 1 - \int_{-\ln x}^\infty \varphi(u)\,du, \tag{3.27}$$
where the substitution $z = e^{-u}$ is used. the integral on the right of eq. (3.26) is a direct consequence of the same substitution in eq. (2.12).

corollary 3.10. we have
$$f(x) = -\ln x\;f_1(x) - \int_{-\infty}^{-\ln x} u\,\varphi(u)\,du,\quad x > 0. \tag{3.28}$$

proof. equation (3.28) is a consequence of eq. (2.22) and of eq. (3.26).

theorem 3.11. we have
$$f_1(x) = \int_0^x f(x-u)\,\frac{f_0(u)}{u}\,du,\quad x > 0. \tag{3.29}$$

proof. $W_0(s)/s$ is the laplace transform of the p.d.f. $f(x)$. from equations (3.6) and (3.13) it immediately follows that the function $f_1(x)$ is the laplace convolution of the functions $f(x)$ and $f_0(x)/x$.

theorem 3.12. we have
$$\int_0^x \big(1 + f(x-u)\big)\,\frac{f_0(u)}{u}\,du = 1,\quad x > 0. \tag{3.30}$$

proof. according to eq. (3.25) and eq. (3.29), we find that
$$\int_0^x \big(1 + f(x-u)\big)\frac{f_0(u)}{u}\,du = \int_0^x \frac{f_0(u)}{u}\,du + \int_0^x f(x-u)\,\frac{f_0(u)}{u}\,du = \big(1 - f_1(x)\big) + f_1(x) = 1. \tag{3.31}$$

4. conclusion
defining the moment generating function of probability distributions by means of the laplace or laplace-stieltjes transform, as proposed in [1], makes it possible to obtain the interconnection between the cayley-eisenstein-pólya probability distribution, which is native to queueing theory, and the landau distribution, which originates in atomic physics. this interrelation is given by an integral relation/equation of the first kind, eq. (2.22), and in principle by the euler differential relation/equation (2.17), the differential relation (2.23) and the laplace convolution (3.30). from the formal point of view, the absolute convergence of the laplace transform integral, and moreover its uniform convergence, generally constitute an advantage in the calculations and in the process of deriving the formulae. as regards the integral transform pairs that are a byproduct of this study, the author cannot guarantee their novelty; however, he has not found them elsewhere.

acknowledgements
many thanks to both referees for their valuable comments and suggestions.

references
[1] abate, j., whitt, w.: an operational calculus for probability distributions via laplace transforms, adv. appl. probab. 28 (1), 1996, p. 75–113.
[2] landau, l. d.: on the energy loss of fast particles by ionisation, j. phys. u.s.s.r. 8, 1944; also in: ter haar, d. (ed.), collected papers of l. d. landau, oxford: pergamon press, 1965, p. 417–424.
[3] corless, r. m. et al.: on the lambert w function, adv. comput. math. 5, 1996, p. 329–359.
[4] kalugin, g. a. et al.: bernstein, pick, poisson and related integral expressions for lambert w, to appear in integral transforms spec. funct.
[5] oberhettinger, f.: tables of mellin transforms, new york: springer-verlag, 1974.
[6] rooney, p. g.: a survey of mellin multipliers, in: fractional calculus (eds. a. c. mcbride, g. f. roach), london: pitman publishing, 1985, p. 176–187.
[7] kilbas, a. a. et al.: theory and applications of fractional differential equations, amsterdam: elsevier, 2006.
[8] oberhettinger, f., badii, l.: tables of laplace transforms, new york: springer-verlag, 1973.

acta polytechnica 53(2):63–69, 2013
acta polytechnica 55(5):342–346, 2015, doi:10.14311/ap.2015.55.0342
© czech technical university in prague, 2015, available online at http://ojs.cvut.cz/ojs/index.php/ap

electrodynamic loudspeaker-driven acoustic compressor

martin šoltés∗, milan červenka
czech technical university in prague, faculty of electrical engineering, technická 2, prague, czech republic
∗ corresponding author: soltes.mail@gmail.com

abstract. an acoustic compressor is built using an acoustic resonator whose shape was optimized for a maximum acoustic pressure amplitude, together with a low-cost compression driver. the acoustic compressor is built by installing a suction port in the resonator wall where the standing wave has its pressure node, and a delivery port with a valve in the resonator wall where the standing wave has its pressure anti-node. different reeds, serving as delivery valves, are tested and their performance is investigated. it is shown that the performance of such a simple compressor is comparable to or better than that of the acoustic compressors built previously by other researchers using non-optimally shaped resonators with more sophisticated driving mechanisms and valve arrangements.

keywords: acoustic compressor, acoustic resonator, high amplitude acoustic field.

1. introduction
high amplitude standing waves in closed cavities have been studied extensively by many researchers (see e.g. [1–4]; for a review of the research done on the subject up to 1996, see [5]). much of the effort has been devoted to the study of the special case of piston-driven constant cross-section resonators driven at or close to one of their resonance frequencies. however, no matter how strong the excitation, the maximum acoustic pressure amplitudes obtained in constant cross-section resonators were limited. it has been observed, and is now a well-known fact, that when resonators of cylindrical shape are excited at their resonant frequency, acoustic energy is transferred from the fundamental resonance to higher harmonics due to the non-linear properties of the fluid, eventually leading to the formation of a shock wave. because the dissipation of acoustic energy grows strongly once a shock forms, shock formation sets the upper limit for the maximum acoustic pressure in such a resonator. gaitan and atchley [3] showed that it is possible to prevent the formation of shock waves in tubes with variable cross-section, where the energy transfer from the fundamental resonance to the higher resonance modes is significantly reduced. in 1998, lawrenson et al.
[4] published their experimental paper in which they introduced the concept of resonant macrosonic synthesis (rms). they showed that the relative phases and amplitudes of the harmonics can be controlled by the resonator geometry, resulting in shock-free waveforms of extremely high amplitudes. the acoustic pressure amplitudes that they obtained were more than an order of magnitude larger than had been possible before. ilinskii et al. [6] presented, in their theoretical paper, a one-dimensional nonlinear model equation for the description of high-amplitude standing waves in axi-symmetric, arbitrarily shaped acoustic resonators. li et al. [7] presented a method for the optimization of resonator shapes based on a nonlinear wave equation; they considered only resonator shapes described by smooth elementary functions with adjustable parameters. červenka et al. [8] proposed an evolutionary algorithm-based method for optimizing the shape of acoustic resonators for achieving high-amplitude shock-free acoustic fields. the method is based on a linear model that includes losses due to turbulence in the boundary layer. they used a more general approach of parametrizing the resonator shape using control points interconnected with cubic splines.

with the progress in both the theoretical description and the experimental results in the field of high-amplitude acoustic fields, a number of different practical applications emerged, e.g. acoustic compressors, plasma-chemical reactors [9], thermoacoustic devices [10], etc. the possibility of constructing an acoustic compressor has been investigated by several authors (bodine [11], lucas [12], el-sabbagh [13], masuda and kawashima [14] and hossain et al. [15]). acoustic compressors offer several advantages over the more traditional ones. most importantly, they do not contain moving parts that require oils to reduce friction and wear; this is important in applications where mixing of oil with the compressed fluid is undesirable. moving parts also reduce the reliability of a compressor, since they are subject to mechanical fatigue and failure. another advantage is that acoustic compressors allow a valveless construction [12]. in all of the experiments, researchers used either piston-driven resonators or entirely-driven resonators (resonators in which acoustic energy is introduced into the resonator cavity by shaking the whole resonator along its axis). in this paper we aim to experimentally demonstrate the practical utilisation of high-amplitude acoustic fields made possible by the optimized resonator described in our previous paper [8] by constructing a working example of an acoustic compressor.

table 1. coordinates of the ten control points for cubic-spline interpolation describing the shape of the resonator used in the experiments.

x [mm]: 0, 33.3, 66.7, 100, 133.3, 166.7, 200, 233.3, 266.7, 300
r [mm]: 5, 5, 5, 5.4, 7.6, 10.6, 15.1, 23.1, 25, 25

figure 1. shape of the resonator cavity. grey crosses denote the positions of the ten control points for cubic-spline interpolation. the part of the resonator to the right of the dashed line represents the internal waveguide inside the selenium driver. the resonator is terminated by a rigid wall at its narrow end (at x = 0 mm) and by the loudspeaker diaphragm at its wide end (at x = lt).
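for reference, the wall profile of table 1 is obtained by cubic-spline interpolation of the control points. the sketch below uses scipy's CubicSpline with its default boundary conditions, which are not necessarily those used in [8], so the interpolant near the ends is only approximate.

```python
import numpy as np
from scipy.interpolate import CubicSpline

# control points from table 1 (axial position x and wall radius r, in mm)
x = np.array([0.0, 33.3, 66.7, 100.0, 133.3, 166.7, 200.0, 233.3, 266.7, 300.0])
r = np.array([5.0, 5.0, 5.0, 5.4, 7.6, 10.6, 15.1, 23.1, 25.0, 25.0])

profile = CubicSpline(x, r)        # interpolated resonator wall radius r(x)

for xi in np.linspace(0.0, 300.0, 7):
    print(f"x = {xi:6.1f} mm   r = {profile(xi):5.2f} mm")
```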
we use a commonly available loudspeaker driver to excite the resonator, since it is much cheaper and smaller than the traditionally used electromagnetic or electrodynamic shakers, which increases the economic attractiveness of the proposed solution.

2. acoustic compressor construction
the experiments were conducted on a system consisting of an electrodynamic loudspeaker and an external resonator cavity. the compression driver selenium dt-405ti, from which the phase-plug was removed, was used along with a 300 mm long external resonator. (the resonating cavity of the whole system is formed by the short internal waveguide inside the driver and the external resonator attached to it; the total resonator length lt is therefore given by lt = lint + lext, where lint is the length of the internal waveguide and lext is the length of the external waveguide.) the external resonator, machined from two pieces of duralumin block which were joined together, has an axisymmetrical cavity whose shape was optimized for use with the selenium loudspeaker so as to maximize the acoustic pressure amplitude at the closed end of the resonator (at x = 0 mm). the optimization procedure, described in our previous paper [8], is an evolutionary strategy-based method that optimizes the shapes of resonators subject to a given set of constraints (minimum and maximum radius, minimum and maximum resonance frequency, resonator length). the resonator shape is defined by a set of n control points (serving as parameters for the optimization) and is obtained by cubic-spline interpolation. we used n = 10 control points and the following constraints: rmin = 5 mm, rmax = 25 mm, lext = 300 mm; resonance frequency constraints were not applied. the resulting resonator shape is shown in figure 1, and the coordinates of the control points (shown as grey crosses in figure 1) are listed in table 1.

figure 2. valve housing configuration (labels: plastic reed, transparent window, valve housing, clamp, delivery port, rotameter, resonator, pressure gauge, laser vibrometer, microphone, hose clamp).

the acoustic compressor was built by installing a suction port in the resonator wall where the standing wave has its pressure node (at x = 272 mm from the closed end of the resonator in our case) and a delivery port in the resonator wall where the standing wave has its anti-node (at x = 0 mm). with this arrangement alone it is already possible to produce a one-way air flow [12]; this is possible due to the nonlinear properties of the fluid: with rising acoustic pressure amplitude inside the resonator, a small dc component emerges at the pressure anti-node, which creates a static pressure gradient inside the resonator, causing air flow between the suction and delivery ports. however, the value of the air flow-rate obtained this way is very small. for this reason the delivery port was fitted with a valve, which rectifies the flow of the medium and therefore allows much higher values of the air flow-rate to be achieved. in our experiments we used a reed-type passive valve, due to its simplicity and, more importantly, its ability to operate at high frequencies. the valve opens when the acoustic pressure inside the resonator cavity rises above the static ambient pressure on the other side of the reed, and closes when the acoustic pressure falls below this static pressure level, resulting in a one-way air flow. the delivery port, a circular hole of 4 mm diameter centered on the resonator axis of symmetry, was drilled in the cap closing the resonator at its narrow end (at x = 0 mm). the delivery valve, enclosed inside a small metal box, was installed in the way shown in figure 2.
this box (the delivery valve housing) features two outlets for connecting the pressure gauge and the air-flow meter. it also features a transparent glass window in its top wall, which can be used for measuring the reed displacement with a laser vibrometer. two holes of 2 mm diameter, drilled in opposite sides of the resonator wall at the pressure node, serve as the suction ports.

figure 3. the acoustic compressor experimental setup. the selenium driver with the resonator attached to it is in the left part of the picture. the delivery valve housing is attached on top of the resonator, and the laser vibrometer is placed above it. one of the hoses connects the pressure gauge, and the other one, fitted with a hose clamp, leads to the flow-meter. a microphone is connected from the side of the resonator.

3. results
3.1. experimental setup
all measurements were performed at room temperature and atmospheric pressure, using air as the compressed medium. the electrical signal driving the loudspeaker was generated with a computer in the labview environment and amplified with an akiyama amd400 amplifier. acoustic pressure measurements were made in the labview environment using an ni pci-6251 data acquisition card and a g.r.a.s. 12aa preamplifier with a g.r.a.s. 40dp 1/8" microphone, which was attached from the side of the resonator as shown in figure 2. static pressure was measured using a ptl prematlak 2010 pressure gauge, the air volume flow-rate was measured using a rheotest medingen pg05 rotameter, and a micro-epsilon optoncdt ild2300 laser vibrometer was used for the reed displacement measurement. figure 3 shows a photograph of the experimental setup.

figure 4. the valve reeds used in the experiment (reeds 1–4); dimensions are in millimetres. circles in the middle of each reed denote the position of the delivery port.

figure 5. measurement of the displacement versus time for the four different valve reeds shown in figure 4. the displacement was measured with the laser beam pointing at the center of the delivery port.

3.2. static pressure with no flow
when acoustic pressure is present inside the resonator, the delivery valve acts as a rectifier, increasing the static pressure on the other side of the valve. by closing the hose clamp that controls the air flow (see figure 2), air is prevented from leaving the delivery valve housing, resulting in a static pressure build-up. since there is no air flow out of the delivery valve housing, an optimally performing valve would eventually produce a static pressure (inside the delivery valve housing) equal to the acoustic pressure amplitude (inside the resonator). the displacement of the reed should decrease with rising static pressure, and the reed should eventually stop vibrating once the steady state is reached, i.e. when the static pressure reaches the value of the acoustic pressure amplitude. this is not possible in our simple arrangement, where the reed is not allowed to vibrate symmetrically, but we can expect better performing reeds to have smaller displacement. a number of different valve reeds were tried, and it was found that the dimensions of the reed have a significant impact on its performance.
the results for the four different reeds shown in figure 4 are presented below. all reeds were made from a 130 µm thick pvc foil. figure 5 shows a measurement of the steady-state reed displacement versus time for the four reeds. the resonator was driven with a sine-wave signal at the resonance frequency fres = 551 hz with an input voltage amplitude |uin| = 10 v. it can be observed that valve reed 1 tracks the acoustic pressure well: it opens only for a short interval during each cycle, with a maximum displacement around 0.1 mm. valve reed 4 does not track the acoustic pressure very well; it is open during most of the cycle, and its maximum displacement is roughly seven times larger than that of valve reed 1. valve reeds 2 and 3 behave somewhere between these two extremes.

table 2. static pressures measured inside the delivery valve housing, with the resonator driven at resonance with the input voltage amplitude |uin| = 10 v.

reed number: 1, 2, 3, 4
pressure [kpa]: 20, 16, 7, 0

the static pressures inside the delivery valve housing measured under these conditions are summarized in table 2. it is clear that the valve reeds which behave closer to the ideal (exhibiting smaller displacement and tracking the acoustic pressure more accurately, i.e. responding faster) produce higher static pressures. we use valve reed 1 in all of the experiments described below.

figure 6. the valve reed 1 displacement amplitude versus input voltage.

figure 6 shows the valve displacement versus input voltage characteristic. it can be observed that the valve reed displacement grows roughly proportionally with the input voltage, suggesting good, i.e. linear, behaviour of the valve.

figure 7. the acoustic pressure amplitude inside the resonator, measured near the delivery port (as shown in figure 2), and the static pressure measured inside the delivery valve housing, versus the input voltage.

figure 7 shows the measured static pressure and the acoustic pressure amplitude versus the input voltage. it can be observed that the system possesses the desired characteristic: most of the acoustic pressure amplitude is rectified into static pressure.

3.3. resonator with air flow
by gradually releasing the hose clamp, air is allowed to escape the valve housing, which results in a decrease of the static (delivery) pressure and a one-way air flow through the resonator.

figure 8. the air volume flow-rate versus the delivery pressure, measured for three different input voltage amplitudes (|uin| = 5 v, 10 v, 15 v).

figure 8 shows the air volume flow-rate versus delivery pressure characteristic measured at the three input voltage amplitudes. it was observed that the frequency at which the maximum air flow is achieved increases as the hose clamp is opened; however, it does not depend on the absolute value of the air flow rate or on the delivery pressure. figure 9 illustrates this frequency shift.
the vertical axis represents the frequency at which the air flow is maximal (for a given hose clamp setting), while the horizontal axis represents this air flow rate as a percentage of the maximum possible air flow rate (with the hose clamp removed). this is measured for three different input voltage amplitudes. in other words, figure 9 illustrates how the air flow rate, and the frequency at which it is maximal, change with the opening of the hose clamp. it can be observed that the frequency shifts similarly (in relative terms) for all three input voltages, irrespective of the absolute air flow rate or static pressure. it seems that the frequency at which the air flow rate is maximal depends only on how much the hose clamp is opened. a possible explanation of this behaviour is that as the clamp is opened, the effective geometry of the cavity behind the delivery valve changes; since this cavity itself forms an acoustic system, a change in its geometry could affect the function of the delivery valve. similar behaviour was also observed by masuda and kawashima [14].

figure 9. the shift of the driving signal frequency at which the air flow-rate is maximal as the hose clamp is opened. a description of the figure is given in the text above.

figure 10. frequency characteristic of the air volume flow-rate at input voltage |uin| = 15 v with the hose clamp removed.

the measured frequency characteristic of the air volume flow-rate at the input voltage |uin| = 15 v with the hose clamp removed is shown in figure 10. it can be observed that under these conditions, adjusting the driving frequency from fres = 551 hz to f = 580 hz results in a roughly 14 % increase of the air volume flow-rate.

4. conclusions
we have shown that, using the optimized resonator described in our previous paper [8], the construction of an acoustic compressor is possible even with a relatively inexpensive and simple driving mechanism: a compression driver. the air volume flow-rates and delivery pressures we have been able to achieve (figure 8) are comparable to or better than the ones reported by other authors [13–15]. it was observed that the dimensions of the reed which acts as the delivery valve have a significant impact on the performance of the compressor, with better performing reeds exhibiting smaller displacement and faster response. the performance of the described compressor could possibly be further enhanced by using a different (active) valve and by placing a suction port with a valve in the resonator wall where the standing wave has its pressure anti-node. moreover, the shape of the optimized resonator, and the corresponding maximum acoustic pressure amplitude inside it, depend strongly on the constraints used in the optimization procedure. by choosing different constraints (especially the minimum radius rmin), the optimization procedure would produce a different resonator shape, which could yield a higher acoustic pressure amplitude and better performance of the acoustic compressor.

acknowledgements
this work was supported by gacr grant p101/12/1925.

references
[1] b. coppens, j. v. sanders: finite amplitude standing waves in rigid walled tubes. jasa 52:1024–1034, 1968.
[2] d. b. cruikshank: experimental investigation of finite-amplitude acoustic oscillations in a closed tube. jasa 52:1024–1036, 1972.
[3] d. f. gaitan, a. a. atchley: finite amplitude standing waves in harmonic and anharmonic tubes. jasa pp. 2689–2495, 1993.
[4] c. lawrenson, b. lipkens, t. s. lucas, et al.: measurements of macrosonic standing waves in oscillating closed cavities. jasa 104:623–636, 1998.
[5] m. ilgamov, r. zaripov, r. galiullin, v. repin: nonlinear oscillations of a gas in a tube. applied mechanics reviews 49:137–154, 1996.
[6] y. a. ilinskii, b. lipkens, t. s. lucas, et al.: nonlinear standing waves in an acoustical resonator. jasa 104:2664–2674, 1998.
[7] x. li, j. finkbeiner, g. raman, et al.: optimized shapes of oscillating resonators for generating high-amplitude pressure waves. jasa 116:2814–2821, 2004.
[8] m. červenka, m. šoltés, m. bednařík: optimal shaping of acoustic resonators for the generation of high-amplitude standing waves. jasa 136:1003–1012, 2014.
[9] t. nakane: discharge phenomenon in high-intensity acoustic standing wave field. ieee trans. plasma sci.
[10] g. w. swift: thermoacoustic engines. jasa 84:1145–1180, 1988.
[11] a. g. bodine: resonant gas compressor and method, 1952. us patent 2,581,902.
[12] c. macrosonix: standing wave compressor, 1991. ep patent app. ep19,910,301,934.
[13] a. el-sabbagh: gas-filled axisymmetric acoustic resonators. vdm verlag dr. muller aktiengesellschaft & co. kg, 2008.
[14] m. masuda, s. kawashima: a study on effects of the valves of acoustic compressors on their delivery flow rate. international congress on acoustics, madrid, september 2007.
[15] a. hossain, m. kawahashi, t. nagakita, et al.: application of finite amplitude oscillation of air-column in closed tube to design acoustic compressor. proceedings of the 18th international congress on acoustics, pp. 381–384, 2004.

acta polytechnica 55(5):342–346, 2015

acta polytechnica vol. 41 no. 1/2001

critical length of a column in view of bifurcation theory

m. kopáčková

the paper investigates a nonlinear eigenvalue problem for a vertical homogeneous rod loaded with its own weight only. the critical length of the rod, for which the rod loses its stability, is found by use of bifurcation theory. the dependence of the maximal deflections of the rods on their lengths is given.

keywords: critical length of column, eigenvalue problem, bifurcation theory, approximation of solution, bessel function.

1 introduction
the linear formulation of the eigenvalue problem for a vertical rod loaded with its own weight has apparent drawbacks, e.g., a) all the critical lengths of the rod (i.e., the eigenvalues) compose a discrete set, outside of which all lengths are noncritical, i.e., nonzero deflections are impossible, and b) all the critical deflections (i.e., the eigenfunctions) are of arbitrary magnitude. to a certain extent, these drawbacks may be eliminated using the precise (nonlinear) form of the curvature. the nonlinear problem was solved long ago by l. euler (1744) and by j. l. lagrange (1773), who gave the solution of the simplest problem for a homogeneous rod loaded with a longitudinal force (neglecting its own weight) in the form of elliptic functions. a detailed analysis of the problem was published e.g. in [1], [2], and it is mentioned in the first part of this paper as an introduction to the bifurcation problem. the second part of the paper contains the formulation of the nonlinear eigenvalue problem for a vertical homogeneous rod loaded with its own weight only, and the corresponding linear problem is solved. the last part is devoted to applications of bifurcation theory (see e.g. [3], [4]) to the nonlinear problem, and an approximate solution of the problem is found. all the computations and pictures were performed with the use of the maple v software.
2 critical load of a column in view of nonlinear theory
the problem of finding the critical vertical load applied to the free end of a vertical homogeneous rod of uniform cross section fixed at the bottom is a well-known eigenvalue problem that has been explicitly solved in many mechanics textbooks. the example is mentioned here to illustrate the differences between the results of the linear and nonlinear theories. if we denote the horizontal deflection of the column by $w$ (only such displacements are taken into account), its curvature by $\varkappa$, the variable length measured from the free end by $s$, the length of the whole rod by $l$, and the angle between the $x$-axis and the tangent line at the point $s$ of the rod by $\theta(s)$, the equation of the moments is
$$EI\,\varkappa + P\,\big(w(s) - w(0)\big) = 0,$$
where $E$ is young's modulus, $I$ is the moment of inertia and $P$ is the above mentioned load (in the direction $x$ of the column). differentiating the equation (by $s$) and substituting the relations
$$\varkappa = \frac{d\theta}{ds},\qquad \frac{dw}{ds} = \sin\theta \tag{1}$$
into it, we get
$$\frac{d^2\theta}{ds^2} + p^2\sin\theta = 0, \tag{2}$$
where $p^2 = P/(EI)$. in the case of small deflections, the equation may be approximated by the linear equation
$$\frac{d^2\theta}{ds^2} + p^2\theta = 0,$$
having (together with the appropriate boundary conditions) the discrete set of eigenvalues
$$p_k^2 = \left(\frac{(2k-1)\,\pi}{2l}\right)^2,\quad k = 1, 2, \dots,$$
and corresponding eigenfunctions of arbitrary magnitude. but these results do not correspond to reality (see the introduction). for this reason many engineers and mathematicians have tried to solve the nonlinear equation (2). using the identity
$$\frac{d^2\theta}{ds^2} = \frac{1}{2}\,\frac{d}{d\theta}\left(\frac{d\theta}{ds}\right)^2$$
on the left hand side of (2), integrating over $\theta$ and taking into account the boundary conditions $\theta(0) = \theta_0$ and vanishing curvature at the free end, we get
$$\frac{1}{2}\left(\frac{d\theta}{ds}\right)^2 = p^2\,(\cos\theta - \cos\theta_0). \tag{3}$$
computing $d\theta/ds$ from (3), we must choose the sign in front of the root. as the most important eigenvalue is the smallest one, we choose the minus sign on the whole interval $(0, \theta_0)$, and then we consider the smallest critical force and its right neighborhood. integrating the root of (3) and taking into account the boundary condition $s(0) = l$, we get the implicit form of the function $\theta = \theta(s)$:
$$l - s = \frac{1}{p}\int_0^{\theta}\frac{d\varphi}{\sqrt{2\,(\cos\varphi - \cos\theta_0)}}. \tag{4}$$
the value $\theta_0$ is obtained by use of the substitution $\sin\frac{\varphi}{2} = \sin\frac{\theta_0}{2}\,\sin\psi$ in (4) and limiting the result for $\theta \to \theta_0$, $s \to 0$:
$$lp = \int_0^{\pi/2}\frac{d\psi}{\sqrt{1 - \sin^2\frac{\theta_0}{2}\,\sin^2\psi}}, \tag{5}$$
which is the implicit form of the function $\theta_0(p)$. it is seen that the smallest $p$ satisfying (5) is the smallest eigenvalue $p_1$ of the linear problem, with the corresponding $\theta_0 = 0$. the dependence of the maximum deflection $w_0 \equiv w(0)$ on the force $P$, resp. $p$, is given by the formula
$$w_0(p) = \frac{2}{p}\,\sin\frac{\theta_0}{2},$$
which is obtained by integrating the second equality of (1) with the use of (3) and the fact
$$\frac{dw}{ds} = \frac{dw}{d\theta}\,\frac{d\theta}{ds},$$
and limiting the result for $\theta \to \theta_0$, $s \to 0$ (see fig. 1).
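for a given rod length, eq. (5) and the formula for $w_0$ can be evaluated numerically: eq. (5) is the complete elliptic integral of the first kind with modulus $k = \sin(\theta_0/2)$, so $\theta_0(p)$ follows from solving $K(k) = lp$. the sketch below, assuming scipy, reproduces the type of curve shown in fig. 1 for $l = \pi/2$ (where $p_1 = 1$).

```python
import numpy as np
from scipy.special import ellipk      # K(m), with parameter m = k^2
from scipy.optimize import brentq

L = np.pi / 2                          # rod length, as in fig. 1
for p in (1.001, 1.05, 1.2, 1.5):      # loads just above p1 = pi/(2L) = 1
    # solve K(k^2) = p*L for k = sin(theta0/2), cf. eq. (5)
    k = brentq(lambda k: ellipk(k * k) - p * L, 1e-9, 1 - 1e-9)
    theta0 = 2 * np.degrees(np.arcsin(k))
    w0 = 2 * k / p                     # maximum deflection w0(p)
    print(f"p = {p:5.3f}   theta0 = {theta0:6.2f} deg   w0 = {w0:.4f}")
```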
the dependence of the ratio $r(p) = \big(l - x(l)\big)/l$ on $p$ is given by the following formula:
$$r(p) = 1 - \frac{1}{lp}\int_0^{\pi/2}\frac{1 - 2\sin^2\frac{\theta_0}{2}\,\sin^2\psi}{\sqrt{1 - \sin^2\frac{\theta_0}{2}\,\sin^2\psi}}\,d\psi,$$
which is calculated (similarly to $w_0$) by integrating the equality $dx/ds = \cos\theta$ and limiting the result for $\theta \to \theta_0$, $s \to 0$ (see fig. 2). two improvements compared with the linear theory are evident:
1. if $p$ tends to the first eigenvalue $p_1 = \pi/(2l)$ of the linear problem from the right, the maximum deflection $w_0$ tends to zero, whereas no deflection of the rod exists for $p \le p_1$.
2. for every $p > p_1$ there exists a unique solution $w(\theta)$, $\theta \in (0, \theta_0]$, with maximum deflection $w_0(p)$.
the same considerations hold for $p_k$, $k = 2, 3, \dots$

3 critical lengths of a column bent with its own weight: nonlinear formulation and some results of linear theory
in the further discussion we answer the question of which length of a homogeneous column of constant cross section is bent by its own weight only, i.e., we find the smallest $l$ for which there exists a nonzero deflection $w$ of the column. let us consider the equality of the moments in the form
$$EI\,\varkappa(s) = fS\int_0^s\big(w(\sigma) - w(s)\big)\,d\sigma,\quad s \in (0,l),$$
where $f$ is the density of the column and $S$ is its cross section (the other notation coincides with that of part 1). differentiating the equation and substituting the relations $\varkappa = d\theta/ds$, $dw/ds = \sin\theta$ into it, we get
$$\frac{d^2\theta}{ds^2} + \frac{fS}{EI}\,s\,\sin\theta = 0.$$
to eliminate the unknown length $l$ of the column, we transform $s$ to the new variable $s/l$ (denoting it again $s$) in the last equation, and we denote the unknown function $\theta$ in the new variable by $u(s)$. thus we get the equation
$$u''(s) + \lambda\,s\,\sin u(s) = 0,\quad s \in (0,1), \tag{6}$$

fig. 1: dependence of the maximum deflection w0 on the force p (l = π/2)
fig. 2: dependence of the relative extension r on the force p (l = π/2)

where
$$\lambda = \frac{fS\,l^3}{EI} \tag{7}$$
is the unknown eigenvalue of the problem consisting of equation (6) and the boundary conditions
$$u'(0) = 0,\qquad u(1) = 0. \tag{8}$$
the first condition follows from the assumption that the upper end of the column is free of tension, i.e., $1/\rho = 0$, and the second condition expresses the fixed end at the bottom. in the case of small deflections, equation (6) may be linearized using the approximations $\sin\theta \approx \theta \approx w'(s)$. thus we get the linear eigenvalue problem
$$v''(s) + \lambda\,s\,v(s) = 0,\quad s \in (0,1),\qquad v'(0) = 0,\quad v(1) = 0, \tag{9}$$
where we denoted $v(s) = w'(s)$. the solutions of (9) are the discrete set of eigenvalues
$$\lambda_n = \frac{9\,z_n^2}{4},\quad n = 0, 1, 2, \dots,$$
and the corresponding eigenfunctions
$$v_n(s) = c\,\sqrt{s}\;J_{-1/3}\big(z_n\,s^{3/2}\big),\quad n = 0, 1, \dots,$$
where $z_n$ ($n = 0, 1, \dots$) are the positive roots of the bessel function $J_{-1/3}(z)$ and $c$ is an arbitrary constant. the first three eigenfunctions $v_0$, $v_1$, $v_2$ of the problem (where $c$ is chosen so that $v_i(0) = 1$, $i = 0, 1, 2$) are given in fig. 3, whereas the corresponding deflections $w_0$, $w_1$, $w_2$ are drawn in fig. 4.

fig. 3: three eigenfunctions v0, v1, v2 of the linear problem
fig. 4: three deflections of the column

the critical lengths of the column are determined by the eigenvalues $\lambda_n$ and the relation (7), i.e.,
$$l_n^3 = \frac{9\,EI\,z_n^2}{4\,fS}.$$
the length for which the homogeneous column loses its stability (i.e., a nonzero deflection of the column exists without any force apart from its own weight),
$$l_0 = \left(\frac{9\,EI\,z_0^2}{4\,fS}\right)^{1/3},$$
corresponds to the smallest eigenvalue $\lambda_0$ of the above problem, where
$$\lambda_0 = \frac{9\,z_0^2}{4} = 7.83734744. \tag{10}$$
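the constants in (10) can be recovered numerically from the first positive root of $J_{-1/3}$, and the eigenfunction can be checked against the linearized equation (9). the sketch below assumes scipy; the bracket $[1, 3]$ for the root was chosen by inspection of $J_{-1/3}$.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.special import jv

z0 = brentq(lambda z: jv(-1/3, z), 1.0, 3.0)    # first positive root of J_{-1/3}
lam0 = 9 * z0**2 / 4
print(z0, lam0)                                  # ~1.86635086 and ~7.83734744

# check that v0(s) = sqrt(s) J_{-1/3}(z0 s^{3/2}) satisfies v'' + lam0*s*v = 0
v0 = lambda s: np.sqrt(s) * jv(-1/3, z0 * s**1.5)
s, h = np.linspace(0.1, 0.9, 9), 1e-5
vdd = (v0(s + h) - 2 * v0(s) + v0(s - h)) / h**2
print(np.max(np.abs(vdd + lam0 * s * v0(s))))    # small residual (~1e-5)
print(v0(1.0))                                   # ~0: fixed-end condition v(1) = 0
```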
the corresponding eigenfunction is of the form
$$v_0(s) = \sqrt{s}\;J_{-1/3}\big(1.86635086\,s^{3/2}\big),\quad s \in (0,1). \tag{11}$$

4 the solution of the bifurcation problem
the assumption of small deflections, and hence the validity of the linear theory, implies the same insufficiency as that mentioned in the introduction. to remove it, we have to take into account the nonlinear equation (6), representing together with the boundary conditions (8) the problem of a bifurcation of the nonlinear operator $L(u) - F(u,\tau)$ (given below) at the point $[0,0]$ (where $L(0) - F(0,\tau) = 0$), i.e., there exists a nonzero solution of the equation (14) (see below) in every neighborhood of $[0,0]$. every eigenvalue $\lambda_n$ gives a bifurcation point $[0,\lambda_n]$, and the solutions of the problem behave similarly for each $n = 0, 1, \dots$; we choose the smallest eigenvalue $\lambda_0$, because this number determines the loss of stability of the column. equation (6) may be rewritten in the form
$$u''(s) + \lambda_0\,s\,u(s) = \lambda_0\,s\,\big(u(s) - \sin u(s)\big) - (\lambda - \lambda_0)\,s\,\sin u(s),\quad s \in (0,1). \tag{12}$$
let us denote by $L(u)$ the linear operator defined as the closure (in $H^1(0,1)$) of the operator given by the values
$$L(u)(s) = u''(s) + \lambda_0\,s\,u(s)$$
for the functions satisfying the boundary conditions (8), and
$$F(u,\tau)(s) = \lambda_0\,s\,\big(u(s) - \sin u(s)\big) - \tau\,s\,\sin u(s), \tag{13}$$
where $\tau = \lambda - \lambda_0$. now we may reduce the problem of finding nonzero solutions of (6), (8) for $\lambda$ near $\lambda_0$ to solving the equation
$$L(u) = F(u,\tau) \tag{14}$$
(in $L^2(0,1)$) for sufficiently small $\tau$. all the solutions of the homogeneous equation $L(u) = 0$ form a one-dimensional subspace $N$ of the eigenfunctions $v_0(s)$ given by formula (11), corresponding to the eigenvalue $\lambda_0$. any solution $u(s)$ of the equation
$$L(u)(s) = f(s),\quad s \in (0,1), \tag{15}$$
exists and is of the form
$$u(s) = \frac{1}{b}\int_0^s f(\sigma)\,\big(v_0(\sigma)\,u_0(s) - v_0(s)\,u_0(\sigma)\big)\,d\sigma + c\,v_0(s) \tag{16}$$
if and only if the function $f$ is orthogonal to $v_0$ in $L^2(0,1)$, i.e.,
$$\int_0^1 f(s)\,v_0(s)\,ds = 0, \tag{17}$$
where $c$ is an arbitrary constant, $b$ is the (constant) wronskian of the pair $v_0$, $u_0$, and $u_0$ is a solution of $u''(s) + \lambda_0\,s\,u(s) = 0$ representing together with $v_0$ a fundamental system of solutions of this equation. formula (16) is easily obtained by solving the linear problem (15), (8) with the use of variation of parameters. as the operator $L$ is not invertible, we split equation (14) into two equations:
$$Q\big(L(u)\big) = Q\big(F(u,\tau)\big), \tag{18}$$
$$(I - Q)\,\big(L(u) - F(u,\tau)\big) = 0, \tag{19}$$
where $Q$ is the projector of $L^2(0,1)$ onto
$$N^{\perp} = \Big\{f \in L^2(0,1) : \int_0^1 f(s)\,v_0(s)\,ds = 0\Big\}.$$
due to the uniqueness of the solution of (15) in $N^{\perp}$, the operator $Q(L(u))$ is invertible on $N^{\perp}$. the solution $u(s)$ of (14) may then be written in the form $u = u_1(u_2) + u_2$, where $u_1(u_2)$ is the solution of (18) for fixed but arbitrary $u_2$, and $u_2$ is afterwards obtained as a solution of (19) into which $u_1 = u_1(u_2)$ was substituted. now we will compute $u_1$, $u_2$ by solving problems (18), (19), or, more precisely, their approximations. let us introduce a sufficiently small parameter $t$ and use the series
$$\tau = \tau_1 t + \tau_2 t^2 + \tau_3 t^3 + \cdots,\qquad u(s) = v_0(s)\,t + u_2(s)\,t^2 + u_3(s)\,t^3 + \cdots,\qquad \sin u = u - \frac{u^3}{3!} + \frac{u^5}{5!} - \cdots \tag{20}$$
in (12).
Comparing the coefficients corresponding to the same power $t^n$, $n = 1, 2, 3, \ldots$, we get an infinite system of equations, three of which are written below:

$$ v_0''(s) + \lambda_0\, s\, v_0(s) = 0, \tag{21} $$
$$ u_2''(s) + \lambda_0\, s\, u_2(s) = -\mu_1\, s\, v_0(s), \tag{22} $$
$$ u_3''(s) + \lambda_0\, s\, u_3(s) = s\Bigl(\frac{\lambda_0}{6}\,v_0^3(s) - \mu_1 u_2(s) - \mu_2 v_0(s)\Bigr). \tag{23} $$

Other equations may easily be obtained by continuing the computations. Equation (21) is automatically satisfied due to (11). Any solution of (22) is of the form (16) for $f(s) = -\mu_1 s\, v_0(s)$ satisfying (17), which implies $\mu_1 = 0$. In this case, the only solution $u_2 \in N^\perp$ is $u_2(s) \equiv 0$. Substituting these values into (23), we get the equation for $u_3$:

$$ u_3''(s) + \lambda_0\, s\, u_3(s) = s\Bigl(\frac{\lambda_0}{6}\,v_0^3(s) - \mu_2 v_0(s)\Bigr), \qquad s \in (0, 1). \tag{24} $$

And again, any solution of (24) satisfying (8) is of the form (16) under the assumption (17) on the right-hand side of (24), which determines the unique $\mu_2$. The constant $c$ from (16) is determined by the condition $u_3 \in N^\perp$. The computation of $\mu_n$, $u_n(s)$ ($n = 2, 3, \ldots$) gives approximations of the solution $u(s)$ of (14), whereas $\mu_n$ and $u_{n+1}(s)$ vanish for odd indices $n$. The approximations

$$ u(s) = v_0(s)\,t + u_3(s)\,t^3 \tag{25} $$

for the values of the parameter $t = \pm 0.5, \pm 1.0, \pm 1.5$ are given in Fig. 5. The dependence of two approximations of $\lambda$ (denoted by $l_1$, $l_2$ in Fig. 6),

$$ \lambda = \lambda_0 + \mu_2 t^2, \qquad \lambda = \lambda_0 + \mu_2 t^2 + \mu_4 t^4, \tag{26} $$

is shown in Fig. 6. Excluding the parameter $t$ from (25) and (26), we get the maximum angle $\varphi_0$ (denoted by $u_0$ in Fig. 7) as a function of the length $l$ of the column (see Fig. 7).

Fig. 5: The approximate solutions $u(s)$ for $t = \pm 0.5, \pm 1.0, \pm 1.5$
Fig. 6: Two approximations of $\lambda$: $l_1$, $l_2$
Fig. 7: Graph of the maximum angle $\varphi_0(l)$
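The coefficient $\mu_2$ is determined by the solvability condition (17) applied to the right-hand side of (24), which gives $\mu_2 = \lambda_0 \int_0^1 s\,v_0^4\,\mathrm{d}s \,/\, \bigl(6\int_0^1 s\,v_0^2\,\mathrm{d}s\bigr)$. A small quadrature sketch of this value and of the first approximation in (26), assuming (24) as reconstructed above:

```python
# mu_2 from the orthogonality condition (17) applied to (24), by quadrature.
import numpy as np
from scipy.special import jv
from scipy.integrate import quad

z0 = 1.86635086
lam0 = 2.25 * z0**2
v0 = lambda s: np.sqrt(s) * jv(-1.0 / 3.0, z0 * s**1.5)   # eigenfunction (11)

num, _ = quad(lambda s: s * v0(s)**4, 0.0, 1.0)
den, _ = quad(lambda s: s * v0(s)**2, 0.0, 1.0)
mu2 = lam0 * num / (6.0 * den)
print(f"mu2 = {mu2:.6f}")

for t in (0.5, 1.0, 1.5):                  # first approximation in (26)
    print(f"t = {t:3.1f}:  lambda ~ {lam0 + mu2 * t**2:.4f}")
```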
5 Conclusions

The last figure illustrates the fact that the nonlinear theory gives reasonable results:
a) the deflections of the column loaded by its own weight are zero for $\lambda \le \lambda_0$ (the first eigenvalue of the linear problem),
b) the maximum deflection increases continuously with increasing length of the column.
Formula (25) represents the solution of the problem as the sum of the first eigenfunction $v_0(s)$ of the linear problem multiplied by the parameter $t$ ($t \to 0$ for $\lambda \to \lambda_0$) and "a perturbation", which is orthogonal to $v_0$ in $L^2$. The above-mentioned method is applicable to other stability problems described by nonlinear ordinary and partial differential equations, e.g., lateral buckling.

Acknowledgement
The author is grateful to Prof. J. Šejnoha for his stimulating discussions and valuable advice. The research of the author was supported by the Foundation of the Ministry of Education in the project "Výzkumný záměr No. J04/98:210000001".

RNDr. Marie Kopáčková, CSc.
Department of Mathematics, Czech Technical University in Prague, Faculty of Civil Engineering
Thákurova 7, 166 29 Praha 6, Czech Republic
phone: +420 2 2435 4385, e-mail: marie.kopackova@fsv.cvut.cz

Acta Polytechnica 56(1):62–66, 2016, doi:10.14311/app.2016.56.0062
© Czech Technical University in Prague, 2016

Comparison of axial fan rotor experimental data with CFD simulation

Aleš Prachař
Aerospace Research and Test Establishment, Beranových 130, 199 05 Praha–Letňany, Czech Republic
correspondence: prachar@vzlu.cz

Abstract. Data obtained from experiments on a new test rig for axial fans are compared to a CFD simulation. The Edge solver is used and the development needed for the simulation (boundary conditions, free-stream consistency) is described. Adequate agreement between the measured and calculated data is observed.

Keywords: axial fan; CFD; Edge software; boundary conditions.
AMS Mathematics Subject Classification: 65N08, 76M12.

1. Introduction

Axial flow fans are used in many applications, ranging from the cooling of computer CPUs to the propulsion of wind tunnels or the cooling towers of power plants, to emphasize the extreme scales. For a better understanding of the flow phenomena it is usual, and today almost inevitable, to include CFD in the design and testing. It is, however, necessary to develop sufficiently accurate computational tools and to validate them extensively.

A test rig for axial fans has been designed and constructed in recent years. Compared to the previous state, it allows automatic flow regulation and data acquisition. From the point of view of the CFD user, it serves as a valuable source of validation data for computational methods.

The Edge CFD solver package [1] has been used so far mainly for external aerodynamics problems, with encouraging results [2]. Since the solver includes many of the ingredients needed to solve problems in rotating domains (turbomachinery), it was decided to test its ability to deal with this kind of problem. Several updates, corrections and generalizations of the existing code had to be developed for the required functionality. This paper starts with a brief description of the experimental device and the data obtained; however, the main part is dedicated to the development of the Edge flow solver and to a comparison with the experimental data.

2. Experimental set-up and measured data

A test rig for axial fans, see Fig. 1, has been designed. Since the electric motor has been placed upstream of the fan rotor and the shaft is relatively short to prevent problems with vibrations, the inlet channel has been designed to change the direction of the incoming air from radial (centripetal) to axial. The shaft is fitted with a dynamometer to measure torque. The hub and the shroud (casing) are fixed together by struts which are placed upstream (airfoil shape) and downstream of the rotor. The diameter of the shroud at the rotor position is 630 mm and the hub-to-shroud diameter ratio is 0.55. To vary the volume flow rate by aerodynamic resistance, an adjustable choking element is mounted at the outflow area.

Figure 1. Test stand scheme. A pair of 5-hole Pitot-static probes is placed upstream and downstream of the fan rotor.

The properties of the flow (total and static pressure, velocity vector) are measured at planes approximately 130 mm upstream and downstream of the fan rotor by a pair of 5-hole Pitot-static probes (see Fig. 1) traversing along the radial direction. The test rig has a constant cross section between those two planes. The measured quantities, obtained as functions of the radial coordinate, were later utilized to calculate integral values (volume flow rate, average total or static pressure).
3. CFD simulation

For the CFD simulation the Edge software, see [3], was used; it is a compressible flow solver for unstructured grids with arbitrary elements. Edge is based on the node-centred finite volume formulation of the Euler or the Navier-Stokes equations in two or three dimensions using dual grids.

3.1. Solver description

The solver has been used in its parallel version, utilizing convergence acceleration techniques like multigrid, low speed preconditioning [4], implicit residual smoothing and local time stepping. The flow equations, formulated in a reference frame rotating around an arbitrary axis $\omega$ with angular velocity $|\omega|$, are written with the aid of the Einstein summation convention as

$$ \frac{\partial u}{\partial t} + \frac{\partial f_i}{\partial x_i} + \frac{\partial g_i}{\partial x_i} = s. \tag{1} $$

Figure 2. Primary (dotted) and dual grid (solid), the vectors representing facets (blue) and faces (red).

The definition of the unknowns $u$, the convective fluxes $f_i$ and the source term $s$ is given by

$$ u = \begin{pmatrix} \rho \\ \rho u_1 \\ \rho u_2 \\ \rho u_3 \\ \rho e_r \end{pmatrix}, \qquad f_i = \begin{pmatrix} \rho w_i \\ \rho u_1 w_i + \delta_{i1} p \\ \rho u_2 w_i + \delta_{i2} p \\ \rho u_3 w_i + \delta_{i3} p \\ (\rho e_r + p) w_i \end{pmatrix}, \qquad s = \begin{pmatrix} 0 \\ \rho\,\omega \times u \\ 0 \end{pmatrix}, \tag{2} $$

where $u$ denotes the absolute velocity vector and $w$ stands for the relative velocity. The exact form of the viscous fluxes $g_i$ is omitted here for brevity. Let us note that the total energy in the energy equation contains a contribution from the rotation,

$$ e_r = e - u \cdot (\omega \times r). \tag{3} $$

Since the grid movement (rotation) is included in the governing equations, they can be solved in a steady-state manner. For our simulation various $k$-$\omega$ turbulence models have been tested, and the low Reynolds number model [5] has been used for the presented results. In the finite volume method, (1) is solved in the integral form

$$ \frac{\mathrm{d}}{\mathrm{d}t}\int_V u + \int_{\partial V} (f_i + g_i)\cdot n = \int_V s. \tag{4} $$

For the convective fluxes the second order central scheme with the Jameson-Schmidt-Turkel type artificial dissipation term has been used. The data structure for the finite volume flux calculations is edge-based and the reduced scheme is used, i.e., the normals representing the facets between two adjacent dual cells are summed into a single one, which is kept in the data structure, see Fig. 2. Hence, only one flux evaluation between two adjacent cells is needed. It was noted by Raichle [6] that a discretization error is introduced if the grid velocity $\omega \times r$ at the cell face is averaged from the cell centres. He proposed a formulation assuring exact integration and, hence, free-stream consistency for rotating flows. The scheme, still based on the reduced formulation, was implemented. The only additional requirement is that along with the computed normals representing the cell faces, an additional vector for each face has to be stored in the data structure. For a typical case we observed memory demands increased by up to 10-15 % compared to the inexact reduced scheme. The increase in CPU time is negligible.
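As a concrete illustration of the quantities in (1)-(3), the following sketch evaluates the relative velocity, the rotation-corrected total energy and the source term at a single grid node. All state values are illustrative placeholders in SI units; this is our own sketch, not Edge code:

```python
# Rotating-frame quantities of eqs. (1)-(3) at one node (illustrative values).
import numpy as np

omega = np.array([394.0, 0.0, 0.0])   # rotation axis, |omega| in rad/s
r     = np.array([0.0, 0.20, 0.10])   # node position [m]
u     = np.array([30.0, -5.0, 2.0])   # absolute velocity [m/s]
rho, p, e = 1.2, 1.0e5, 2.5e5         # density, pressure, specific energy

w  = u - np.cross(omega, r)           # relative velocity
er = e - u @ np.cross(omega, r)       # rotation-corrected energy, eq. (3)
s  = np.concatenate(([0.0], rho * np.cross(omega, u), [0.0]))  # source, eq. (2)

def flux(i):
    """Convective flux f_i of eq. (2) in coordinate direction i (0-based)."""
    f = rho * w[i] * np.array([1.0, u[0], u[1], u[2], er + p / rho])
    f[1 + i] += p                     # Kronecker-delta pressure term
    return f

print(s)
print(flux(0))
```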
3.2. Boundary conditions

For internal aerodynamics it is usually natural to use the pair of boundary conditions prescribing total quantities (total pressure, total temperature and flow direction) at the inlet and the static pressure at the outlet of the computational domain. We consider the inlet boundary condition adequate for our case. However, the exit static pressure is influenced by the performance of the rotor at various volume flow rates. If we inspect the dependence of the exit static pressure on the volume flow rate, see Fig. 3, we find that the static pressure alone does not uniquely determine the flow regime. We therefore prescribe the average boundary pressure $\tilde p_b$ according to

$$ \tilde p_b = p_{\mathrm{ref}} + \tfrac{1}{2}\, k_L\, \tilde\rho\, \tilde u_{\mathrm{ax}}^2, \tag{5} $$

where $p_{\mathrm{ref}}$ denotes a reference pressure and $k_L$ is a free parameter. The calculation was carried out for a range of $k_L$ values to cover the various flow regimes.

Figure 3. Outlet static pressure (relative) and the corresponding $k_L$ parameter for a typical case.

Moreover, since the flow exhibits large circumferential (peripheral) velocities as a consequence of the absence of stator blades, the outlet boundary condition is implemented to satisfy the radial equilibrium condition, which we assume at the outlet and which is considered in a simple form, cf. [7],

$$ \frac{\mathrm{d}p_b}{\mathrm{d}r} = \frac{\rho u_\theta^2}{r}. \tag{6} $$

Finally, $p_b$ is updated to vary along the radial coordinate, taking into account both (5) and (6). Another type of boundary used in our simulation is a solid wall. Here we need to distinguish between boundaries steady with respect to the fixed coordinates (e.g., the casing), where $u = 0$ is prescribed, and boundaries moving with the reference frame (e.g., the rotor blade), with $w = 0$. In Edge, weak formulations of these boundary conditions are used [8]. Since no information about the turbulence level was known, the inlet turbulence intensity was set to 1 % of the inlet velocity and the free-stream viscosity ratio (turbulent to dynamic viscosity) was set to the default value equal to 1.
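A minimal sketch of this outlet update, combining the prescribed average (5) with the radial equilibrium (6). The swirl profile $u_\theta(r)$ and the operating values below are assumed for illustration, while the hub and shroud radii follow from the rig geometry quoted in Section 2:

```python
# Outlet boundary pressure: radial equilibrium (6) shifted to the average (5).
import numpy as np

r_shroud = 0.315                      # 630 mm shroud diameter
r_hub = 0.55 * r_shroud               # hub-to-shroud ratio 0.55
r = np.linspace(r_hub, r_shroud, 50)

rho, u_ax, p_ref, k_L = 1.2, 25.0, 1.0e5, 0.5   # assumed operating point
u_theta = 15.0 * r / r_shroud                   # assumed swirl profile [m/s]

p_avg = p_ref + 0.5 * k_L * rho * u_ax**2       # eq. (5)

dpdr = rho * u_theta**2 / r                     # eq. (6)
p = np.concatenate(([0.0],
    np.cumsum(0.5 * (dpdr[1:] + dpdr[:-1]) * np.diff(r))))  # trapezoid rule

p_b = p + (p_avg - np.mean(p))        # shift so the mean level matches (5)
print(p_b[0], p_b[-1])
```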
3.3. CFD geometry

For the CFD calculation the model was simplified considerably. First, the struts were removed and only the rotor blades were preserved. This makes the problem rotationally periodic; only one rotor blade was modelled and the rotational periodic boundary condition was applied. The CFD geometry models the constant cross-section part of the test rig, prolonged upstream and downstream. A hybrid (tetrahedral and hexahedral elements) computational grid was used as the primary grid, with approximately 2.3 million nodes. The layout of the simulation and of the planes where the flow properties were evaluated is indicated in Fig. 4.

Figure 4. Layout of the CFD simulation.

Structured hexahedral blocks were placed upstream and downstream of the rotor. A structured block with O-grid topology was used around the rotor blade, with 100 points on each side of the blade surface, 35 points in the normal direction and 120 radial grid points. The boundary layer was fully resolved around the solid walls and in the tip gap to keep the parameter $y^+$ below 1. The cell expansion ratio from the wall was kept close to 1.2. The rest of the space between the blade O-grid block and the periodic boundaries was filled with tetrahedral elements.

4. Comparison of experimental data with CFD results

To match the experimental data, the flow field obtained from the CFD simulation was evaluated at the planes upstream and downstream of the rotor. The volume flow rates and the averages of static and total pressure are the basic data which, together with the torque at the rotor blades, are used to calculate integral values. The experimental data were obtained for three settings of the blade angle (25°, 35° and 45°), constant angular velocity ($|\omega| = 394$ rad/s, $u_t = 124$ m/s) and for a range of choking element settings from fully open to and beyond the blade stall. The non-dimensional pressure and flow coefficients are used for the comparison, see Fig. 5. The notation is similar to [9].

We observe satisfactory agreement of the experimental and CFD data. Let us note that the computation was performed for several settings of the solver. The choice of turbulence model had a notable effect on the prediction of the blade stall. In our cases the low Reynolds number $k$-$\omega$ turbulence model [5] gave the best results in terms of capturing the blade stall. The cases in the stall region for the given blade settings are averaged values, since oscillatory behaviour was observed; an unsteady solution could be used there to improve accuracy. Extensive validation of the turbulence models is necessary for the class of problems considered in this paper, since the testing of their implementation in Edge was done mainly for external flows. Various other simplifications and idealizations can also cause differences, e.g., the tip clearance height estimate, which is never perfectly uniform in the experiment, the effect of the struts, etc.

Figure 5. Axial fan performance in non-dimensional parameters for various blade angles. The full line denotes the experiment, dashed lines with triangles represent the CFD calculation.

The (relative) isentropic efficiency for the considered cases is shown in Fig. 6. We can also consider the agreement of these data acceptable. The slightly higher measured efficiency for higher volume flow rates (especially for the blade setting of 35°) can be partly explained by the turbulence modelling: the flow is considered fully turbulent in CFD, whereas turbulent transition could cause an improvement.

Figure 6. Relative efficiency for three blade settings. The full line denotes the experiment, dashed lines with triangles represent the CFD calculation.
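For reference, the coefficients used in Figs. 5 and 6 are straightforward to evaluate from the integral data. A short sketch with an assumed (not measured) operating point, using the definitions from the list of symbols below:

```python
# Non-dimensional fan coefficients: phi = Q/(A u_t), psi = 2*dpt/(rho u_t^2),
# eta = Q*dpt/(Mk*|omega|). Operating-point numbers are illustrative only.
import numpy as np

r_shroud, hub_ratio = 0.315, 0.55
A = np.pi * r_shroud**2 * (1.0 - hub_ratio**2)   # annulus flow area [m^2]
u_t, omega, rho = 124.0, 394.0, 1.2

Q, dpt, Mk = 6.0, 1500.0, 60.0    # assumed flow rate, pressure rise, torque

phi = Q / (A * u_t)               # flow coefficient
psi = 2.0 * dpt / (rho * u_t**2)  # pressure coefficient
eta = Q * dpt / (Mk * omega)      # isentropic efficiency
print(f"phi = {phi:.3f}, psi = {psi:.3f}, eta = {eta:.3f}")
```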
5. Conclusion

The main result of this paper is that it is possible to use the Edge software, a code primarily designed for external aerodynamics, for the problem under consideration. Various enhancements were necessary for appropriate functionality; however, the changes to the code were compatible with the solver structure. An acceptable agreement between the measured and calculated data was observed. Local quantities could be compared in a next step. Future research will focus on the rotor-stator configuration and the modelling of the interface between the blade rows.

List of symbols
$A$ flow area (cross-section) [m²]
$M_k$ torque moment [N m]
$p$ fluid pressure [Pa]
$\tilde p_t$ total pressure (iso), $\tilde p_t = \tilde p + \frac{1}{2}\tilde\rho (Q/A)^2$ [Pa]
$Q$ volume flow rate [m³ s⁻¹]
$r$ radial coordinate [m]
$u$ absolute velocity vector [m s⁻¹]
$u_{\mathrm{ax}}$ axial component of velocity [m s⁻¹]
$u_\theta$ peripheral component of velocity [m s⁻¹]
$u_t$ blade tip velocity [m s⁻¹]
$w$ relative velocity vector, $w = u - \omega \times r$ [m s⁻¹]
$\delta_{ij}$ Kronecker delta
$\eta$ isentropic efficiency, $\eta = Q\Delta\tilde p_t/(M_k|\omega|)$
$\rho$ fluid density [kg m⁻³]
$\varphi$ flow coefficient, $\varphi = Q/(A u_t)$
$\psi$ pressure coefficient, $\psi = 2\Delta\tilde p_t/(\tilde\rho u_t^2)$
$\omega$ rotation axis, angular velocity $|\omega|$ [rad/s]
$\tilde{(\ )}$ average of the quantity across the boundary
$(\ )_b$ boundary value

Acknowledgements
This research was supported by the Ministry of Industry and Trade of the Czech Republic for long-term strategic development. Access to computing and storage facilities owned by parties and projects contributing to the National Grid Infrastructure MetaCentrum, provided under the "Projects of Large Infrastructure for Research, Development, and Innovations" programme (LM2010005), is greatly appreciated.

References
[1] P. Eliasson. Edge, a Navier-Stokes solver for unstructured grids. In Proceedings to Finite Volumes for Complex Applications III, pp. 527–534. ISTE Ltd., London, 2002.
[2] P. Eliasson, S.-H. Peng, L. Tysell. Computations from the fourth drag prediction workshop using the Edge solver. Journal of Aircraft 50(5):1646–1655, 2013. doi:10.2514/1.C032225.
[3] Edge theoretical formulation. Tech. Rep. 03-2870, Swedish Defence Research Agency (FOI), 2007.
[4] A. Prachař. Local low speed preconditioning in rotating reference frame. Applied Mathematical Sciences 9(5):209–218, 2015. doi:10.12988/ams.2015.411885.
[5] S.-H. Peng, L. Davidson, S. Holmberg. A modified low-Reynolds-number k-ω model for recirculating flows. ASME J Fluids Eng 119:867–875, 1997. doi:10.1115/1.2819510.
[6] A. Raichle. Extension of the unstructured TAU-code for rotating flows. In New Results in Numerical and Experimental Fluid Mechanics V, vol. 92 of Notes on Numerical Fluid Mechanics and Multidisciplinary Design (NNFM), pp. 136–143. Springer Berlin Heidelberg, 2006. doi:10.1007/978-3-540-33287-9_17.
[7] D. L. Tweedt, R. V. Chima, E. Turkel. Preconditioning for numerical simulation of low Mach number three-dimensional viscous turbomachinery flows. In Proceedings of 28th Fluid Dynamics Conference, 1997. doi:10.2514/6.1997-1828.
[8] P. Eliasson, S. Eriksson, J. Nordström. The influence of weak and strong solid wall boundary conditions on the convergence to steady-state of the Navier-Stokes equations. In Proceedings of 19th AIAA CFD Conference, 2009. doi:10.2514/6.2009-3551.
[9] V. Cyrus, J. Cyrus, P. Wurst, P. Panek. Aerodynamic performance of advanced axial flow fan for power industry within its operational range. In Proceedings of ASME Turbo Expo, 2014. doi:10.1115/GT2014-25339.

Acta Polytechnica 53(5):450–456, 2013, doi:10.14311/ap.2013.53.0450
© Czech Technical University in Prague, 2013

Note on Verma bases for representations of simple Lie algebras

Severin Pošta*, Miloslav Havlíček
Department of Mathematics, Faculty of Nuclear Sciences and Physical Engineering, Czech Technical University in Prague, Trojanova 13, CZ-120 00 Prague, Czech Republic
* corresponding author: severin@km1.fjfi.cvut.cz

Abstract. We discuss the construction of the Verma basis of the enveloping algebra and of finite dimensional representations of the Lie algebra $a_n$. We give an alternate proof of the so-called Verma inequalities to the one given in [1] by P. Littelmann.

Keywords: Verma basis, enveloping algebra, Lie algebra.
Submitted: 7 May 2013. Accepted: 2 July 2013.

1. Introduction

The theory of simple Lie groups and their representations (and the corresponding representations of simple Lie algebras) has long been at the center of interest of modern mathematics, because it has many relationships with other areas of mathematics and physics. The simple Lie algebras over the field of complex numbers were classified in the famous works of Killing and Cartan at the end of the 19th century.
Since then we have known that there are four infinite series $a_n$, $b_n$, $c_n$, $d_n$, which are called the classical Lie algebras, and five Lie algebras $e_6$, $e_7$, $e_8$, $f_4$ and $g_2$, which we call exceptional Lie algebras. The structure of these Lie algebras is described in terms of special finite sets of elements in a Euclidean space, called roots, which generate a root system. Weyl's theorem assures that each finite dimensional representation of such a Lie algebra is completely reducible. Therefore, in the theory of finite dimensional representations of the semisimple Lie algebras, which are direct sums of simple ones, it is sufficient to restrict to irreducible finite dimensional representations. The complete classification of these irreducible finite dimensional representations is known: they are parametrized by vectors of nonnegative integers called highest weights. Moreover, the characters and dimensions of such irreducible finite dimensional representations are explicitly known thanks to the Weyl formula [2–5].

Practical use of the simple Lie groups and Lie algebras, which serve as a fundamental tool for studying the symmetries of systems examined in physics, often involves constructing bases of the spaces on which their finite dimensional representations act. The best known example of such constructions are two works of Gelfand and Tsetlin. In two famous papers (see [6] and [7]) they gave an explicit construction of bases for the general linear Lie algebra $gl(n, \mathbb{C})$ (resp. the special linear Lie algebra $a_n$) and for the orthogonal Lie algebras $b_n$ and $d_n$ (for detailed comments see also [8]). These papers contain no comments on, and no methods for, deriving the explicit formulas; they also contain no references (the hint that one has to verify the commutation relations by direct calculation is not very useful for a proof). It is therefore no wonder that their formulas were re-derived and verified by other authors. Verification and/or an independent derivation of these formulas was given in the papers by Baird and Biedenharn (see [9, 10]) and also by other authors [11–14].

After Gelfand and Tsetlin's construction of representations, in the second half of the 20th century and later, a range of different approaches were developed and many techniques were adopted to construct bases of the representations of the classical series of Lie algebras. We can mention here the Gould paper [15], which made use of polynomial identities satisfied by the generators of the corresponding Lie group, an approach which was then generalized to Kac-Moody algebras [16], and the approach of Asherova, Smirnov and Tolstoy involving projection operators [17, 18], which proved their usefulness also in the field of Lie superalgebras and quantum algebras [19]. The results of Tarasov and Nazarov [20] also belong to this group. Another approach, based on using the Weyl realization [21] of the representations of the corresponding groups in tensor spaces, was developed in many papers [22–26]. So-called special bases were constructed by De Concini and Kazhdan [27], and their q-analogs by Xi [28]. Proper bases were constructed by Gelfand and Zelevinsky [29] and Retakh and Zelevinsky [30], and similar good bases were constructed by Mathieu [31]. Another well known group of bases are crystal bases. They were constructed by Lusztig [32, 33], Kashiwara [34], Du [35, 36], Kang [37], and others [38–41].
2. Verma bases

Besides these approaches, an important role is played by bases which have special properties as bases of the universal enveloping algebra of a given simple Lie algebra, and which can then be restricted, by taking a suitable subset, to a basis of a given representation of that Lie algebra. Such bases are called monomial bases, and were constructed in the standard monomial theories developed by Lakshmibai, Musili and Seshadri [42] and by Littelmann [1, 43], with q-analogs by Chari and Xi [44] and others. One of the advantages of these bases is that the basis vectors are eigenvectors of the Cartan subalgebra, and therefore such a basis is suitable for various modifications. On the other hand, there is no explicit form of the matrix elements of operators expressed in these bases.

Verma bases as introduced in [45] are of this type. These bases were constructed for the Lie algebras $a_n$ in [46] and for some concrete examples of other Lie algebras of low rank (see also [47]). In [48, 49] a proof of the so-called Verma conjecture for the Lie algebra $a_n$ was given by Raghavan and Sankaran. Note that the basis of the enveloping algebra from which we can obtain the corresponding Verma basis by restriction was given in [1]. The basis vectors of the Verma basis are constructed from the highest weight vector (vacuum state) by the action of certain specified sequences of the elements corresponding to simple roots; the admissible sequences are given by a certain set of inequalities (called Verma inequalities).

Let us briefly describe the main result for the Lie algebra $a_n$. Let $a_n = sl(n+1, \mathbb{C}) = n_+ \oplus h \oplus n_-$ be the decomposition of the Lie algebra $sl(n+1, \mathbb{C})$ into strictly upper triangular, diagonal and strictly lower triangular matrices. Denote by $U(a_n)$, $U(a_n^+)$ and $U(a_n^-)$ the corresponding enveloping algebras of $a_n$, $n_+$ and $n_-$. Let $\Phi$ be the root system of $a_n$ and fix $h$ such that $\Phi = \Phi^+ \cup \Phi^-$, where $n_+ = \bigoplus_{\beta\in\Phi^+} g_\beta$. For a positive root $\beta \in \Phi^+$ denote by $f_\beta \in g_\beta$ and $e_\beta \in g_{-\beta}$ fixed elements of the Chevalley basis of $a_n$. For a fixed ordering of simple roots $\{\beta_1, \ldots, \beta_n\}$ denote $f_{\beta_j}$ by $f_j$ and the corresponding $e_{\beta_j}$ by $e_j$, and put $h_j = [e_j, f_j]$. Then the set of the following monomials (so-called Verma monomials),

$$ f_1^{a_1^1}\,\bigl(f_2^{a_2^2} f_1^{a_1^2}\bigr)\,\bigl(f_3^{a_3^3} f_2^{a_2^3} f_1^{a_1^3}\bigr)\cdots\bigl(f_n^{a_n^n} f_{n-1}^{a_{n-1}^n}\cdots f_2^{a_2^n} f_1^{a_1^n}\bigr), \qquad \text{where } a_k^c \le a_{k+1}^c, \tag{1} $$

is a linear basis of $U(a_n^-)$. A similar basis consisting of vectors generated by appropriate sequences of the $e_j$'s spans $U(a_n^+)$; together with the enveloping algebra of $h$ one can obtain a basis of the whole $U(a_n)$. If we now restrict to the elements generated by sequences fulfilling the Verma inequalities

$$ 0 \le a_k^c \le \min\{a_{k-1}^c + \lambda_{n-c+k},\; a_{k+1}^{c+1}\}, $$

where $a_k^{n+1} = +\infty$ and $a_0^c = 0$ for all $c$, acting on the highest weight vector $|0\rangle$ (vacuum state) with highest weight $(\lambda_1, \ldots, \lambda_n)$, where the $\lambda_j$ are nonnegative integers, we obtain a basis of the representation space of the corresponding finite dimensional representation.
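For $n = 2$ the Verma inequalities read $0 \le a_1^2 \le \lambda_1$, $0 \le a_2^2 \le a_1^2 + \lambda_2$ and $0 \le a_1^1 \le \min\{\lambda_2, a_2^2\}$, so the admissible exponent triples are easy to enumerate. The following sketch (our own check, under the index conventions stated above) counts them and compares the count with the Weyl dimension formula for $sl(3)$, $\dim V(\lambda_1, \lambda_2) = \frac{1}{2}(\lambda_1+1)(\lambda_2+1)(\lambda_1+\lambda_2+2)$:

```python
# Count Verma monomials f1^{a11} (f2^{a22} f1^{a21}) |0> for sl(3)
# against the Weyl dimension formula.
def verma_count(lam1, lam2):
    count = 0
    for a21 in range(lam1 + 1):              # 0 <= a21 <= lambda_1
        for a22 in range(a21 + lam2 + 1):    # 0 <= a22 <= a21 + lambda_2
            count += min(lam2, a22) + 1      # 0 <= a11 <= min(lambda_2, a22)
    return count

def weyl_dim(lam1, lam2):
    return (lam1 + 1) * (lam2 + 1) * (lam1 + lam2 + 2) // 2

for lam1 in range(5):
    for lam2 in range(5):
        assert verma_count(lam1, lam2) == weyl_dim(lam1, lam2)
print("Verma counts match Weyl dimensions for all tested highest weights")
```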
3. Verma monomials inequalities

As a contribution to the above discussion, we give an alternate proof of (1) to the one given in [1].

Lemma 3.1. For any $n, m \ge 1$ and $k \ge 0$ we have

$$ f_i^n f_{i-1}^k f_i^m \in \operatorname{span}\{ f_{i-1}^k f_i^{n+m},\; f_{i-1}^{k-1} f_i^{n+m} f_{i-1},\; \ldots,\; f_i^{n+m} f_{i-1}^k \}, \tag{2} $$
$$ f_{i-1}^{n+m} f_i^n \in \operatorname{span}\{ f_{i-1}^n f_i^n f_{i-1}^m,\; f_{i-1}^{n-1} f_i^n f_{i-1}^{m+1},\; \ldots,\; f_i^n f_{i-1}^{m+n} \}, \tag{3} $$

and (2), (3) in which $f_i$ and $f_{i-1}$ are interchanged.

Proof. To show (2) we first prove the following identity: for any $m \ge 1$ and $i = 2, 3, \ldots, n$ we have

$$ f_i f_{i-1} f_i^m = \frac{1}{m+1}\bigl( f_i^{m+1} f_{i-1} + m\, f_{i-1} f_i^{m+1} \bigr). \tag{4} $$

For $m = 1$, (4) follows from the fact that $[f_i, [f_i, f_{i-1}]] = 0$. Now let us assume validity for $m$ and calculate

$$ f_i f_{i-1} f_i^{m+1} = \tfrac{1}{2}\bigl( f_i^2 f_{i-1} + f_{i-1} f_i^2 \bigr) f_i^m = \frac{1}{2(m+1)}\, f_i \bigl( f_i^{m+1} f_{i-1} + m f_{i-1} f_i^{m+1} \bigr) + \tfrac12 f_{i-1} f_i^{m+2} $$
$$ = \frac{1}{2(m+1)}\, f_i^{m+2} f_{i-1} + \frac{m}{2(m+1)}\, f_i f_{i-1} f_i^{m+1} + \tfrac12 f_{i-1} f_i^{m+2}, $$

from which, isolating the term $f_i f_{i-1} f_i^{m+1}$, we get the desired result. We now generalize formula (4) to the form

$$ f_i f_{i-1}^k f_i^m = \frac{1}{m+1}\bigl( k\, f_{i-1}^{k-1} f_i^{m+1} f_{i-1} + (m - k + 1)\, f_{i-1}^k f_i^{m+1} \bigr). \tag{5} $$

This formula is proved similarly, by induction on $k$. Multiplying both sides of (5) by $f_{i-1}$, we obtain

$$ f_{i-1} f_i f_{i-1}^k f_i^m = \frac{1}{m+1}\bigl( k\, f_{i-1}^k f_i^{m+1} f_{i-1} + (m-k+1)\, f_{i-1}^{k+1} f_i^{m+1} \bigr), $$
$$ f_{i-1} f_i f_{i-1}^k f_i^m = \tfrac12\bigl( f_{i-1}^2 f_i + f_i f_{i-1}^2 \bigr) f_{i-1}^{k-1} f_i^m = \tfrac12 f_{i-1}^2\,\frac{1}{m+1}\bigl( (k-1)\, f_{i-1}^{k-2} f_i^{m+1} f_{i-1} + (m-k+2)\, f_{i-1}^{k-1} f_i^{m+1} \bigr) + \tfrac12 f_i f_{i-1}^{k+1} f_i^m. $$

Extracting the term $f_i f_{i-1}^{k+1} f_i^m$, we obtain (5) for $k+1$. The last step is to prove the following identity: for any $n, k, m \ge 0$ we have

$$ f_i^n f_{i-1}^k f_i^m = \frac{1}{\binom{m+n}{n}} \sum_{l=0}^{n} \binom{k}{l} \binom{m-k+n}{n-l}\, f_{i-1}^{k-l}\, f_i^{m+n}\, f_{i-1}^{l}. \tag{6} $$

This can be shown by induction on $n$. To show (3) we apply Dixmier's antiisomorphism (see [3], 2.2.18, p. 73) to (6) to obtain the relation

$$ f_i^m f_{i-1}^k f_i = \frac{1}{m+1}\bigl( k\, f_{i-1} f_i^{m+1} f_{i-1}^{k-1} + (m-k+1)\, f_i^{m+1} f_{i-1}^k \bigr), $$

which allows us to prove inductively

$$ f_{i-1}^{n+1} f_i^n = \sum_{l=1}^{n+1} (-1)^{l+1} \binom{n+1}{l}\, f_{i-1}^{n+1-l}\, f_i^n\, f_{i-1}^{l}. \tag{7} $$

Multiplying (7) by $f_{i-1}$ and repeatedly applying (7) to the right-hand side, we finally obtain (3). Replacing $f_i$ by $f_{i-1}$ and vice versa and using a similar approach, we subsequently obtain the following formulas:

$$ f_{i-1} f_i f_{i-1}^m = \frac{1}{m+1}\bigl( f_{i-1}^{m+1} f_i + m\, f_i f_{i-1}^{m+1} \bigr), $$
$$ f_{i-1} f_i^k f_{i-1}^m = \frac{1}{m+1}\bigl( k\, f_i^{k-1} f_{i-1}^{m+1} f_i + (m-k+1)\, f_i^k f_{i-1}^{m+1} \bigr), $$
$$ f_{i-1}^n f_i^k f_{i-1}^m = \frac{1}{\binom{m+n}{n}} \sum_{l=0}^{n} \binom{k}{l} \binom{m-k+n}{n-l}\, f_i^{k-l}\, f_{i-1}^{m+n}\, f_i^{l}, $$
$$ f_i^{n+1} f_{i-1}^n = \sum_{l=1}^{n+1} (-1)^{l+1} \binom{n+1}{l}\, f_i^{n+1-l}\, f_{i-1}^n\, f_i^{l}. $$

Due to the Poincaré-Birkhoff-Witt theorem, the ordered monomials

$$ e_{12}^{s_{12}} e_{13}^{s_{13}} \cdots e_{1,n+1}^{s_{1,n+1}}\; e_{23}^{s_{23}} e_{24}^{s_{24}} \cdots e_{2,n+1}^{s_{2,n+1}} \cdots e_{n-1,n+1}^{s_{n-1,n+1}}\; e_{n,n+1}^{s_{n,n+1}}, \qquad s_{ij} \ge 0, \tag{8} $$

where

$$ e_{i,i+1} = f_i \quad\text{and}\quad e_{ik} = [e_{i,k-1}, e_{k-1,k}], \quad i + 1 < k, \tag{9} $$

form a basis of $U(a_n^-)$. Let us consider any such monomial and denote it by $v$. The relations (9) express any generator $e_{ik}$, $i < k$, as a commutator of the simple ones $f_1, \ldots, f_n$. Therefore $v$ is a linear combination of (unordered) monomials in these simple generators, and such monomials can be written in the form

$$ v' = v_1 f_n^{r_1} v_2 f_n^{r_2} \cdots v_m f_n^{r_m} v_{m+1}, \tag{10} $$

where $v_i \in U(a_{n-1}^-) \subset U(a_n^-)$ and the monomials $v_2, v_3, \ldots, v_m \notin U(a_{n-2}^-)$ (i.e., they contain the generator $f_{n-1}$; otherwise we use the relation $f_n^r v_j f_n^s = f_n^{r+s} v_j$ and the product can be shortened).

Theorem 3.2. Let us denote

$$ V_{r,m} = \operatorname{span}\bigl\{ v_1 f_n^{r_1} v_2 f_n^{r_2} \cdots v_m f_n^{r_m} v_{m+1} \;\big|\; v_i \in U(a_{n-1}^-),\; r_1 + r_2 + \cdots + r_m = r \bigr\}. \tag{11} $$

Then we have $v' \in V_{r,1}$.

Proof. By induction on $n$. If $n = 2$, we have

$$ v' = f_1^{s_1} f_2^{r_1} f_1^{s_2} f_2^{r_2} \cdots f_1^{s_m} f_2^{r_m} f_1^{s_{m+1}}. \tag{12} $$

If $m \ge 2$, we use formula (2) from Lemma 3.1 applied to the product $f_2^{r_1} f_1^{s_2} f_2^{r_2}$ and we obtain $v' \in V_{r,m-1}$. In the general case we write $v_i = w_i f_n^{s_i} w_i'$, $w_i, w_i' \in U(a_{n-1}^-)$, and therefore $[w_i, f_{n+1}] = [w_i', f_{n+1}] = 0$. For the monomial $v'$ we can write

$$ v' = v_1 f_{n+1}^{r_1} w_2 f_n^{s_2} w_2' f_{n+1}^{r_2} \cdots = v_1 w_2\, f_{n+1}^{r_1} f_n^{s_2} f_{n+1}^{r_2}\, w_2' \cdots $$

and we use the same argument as in the case $n = 2$.
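Since the identities (4) and (6) hold in $U(sl(3))$, they hold in any representation, which gives a cheap numerical sanity check of the reconstructed formulas. The sketch below (our own verification, not from the paper) realizes $f_1$, $f_2$ on a fourfold tensor power of $\mathbb{C}^3$, so that high powers of each generator remain nonzero, and tests both identities with NumPy:

```python
# Numerical check of identities (4) and (6) of Lemma 3.1 in a representation
# of sl(3): f1 = E21, f2 = E32 lifted to (C^3)^{(x)4} via Kronecker sums.
import numpy as np
from math import comb
from functools import reduce

def E(i, j):
    m = np.zeros((3, 3)); m[i, j] = 1.0; return m

def lift(x, copies=4):
    """x acting in each tensor slot in turn; a Lie algebra representation."""
    return sum(reduce(np.kron, [x if t == k else np.eye(3)
                                for t in range(copies)])
               for k in range(copies))

f1, f2 = lift(E(1, 0)), lift(E(2, 1))
mp = np.linalg.matrix_power

# identity (4): f2 f1 f2^m = (f2^{m+1} f1 + m f1 f2^{m+1}) / (m+1)
for m in range(1, 4):
    lhs = f2 @ f1 @ mp(f2, m)
    rhs = (mp(f2, m + 1) @ f1 + m * f1 @ mp(f2, m + 1)) / (m + 1)
    assert np.allclose(lhs, rhs)

# identity (6) with i = 2, n = 2, k = 3, m = 2
n, k, m = 2, 3, 2
lhs = mp(f2, n) @ mp(f1, k) @ mp(f2, m)
rhs = sum(comb(k, l) * comb(m - k + n, n - l)
          * mp(f1, k - l) @ mp(f2, m + n) @ mp(f1, l)
          for l in range(n + 1)) / comb(m + n, n)
print("identities (4) and (6) verified:", np.allclose(lhs, rhs))
```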
It follows from the above theorem that $U(a_n^-)$ is spanned by monomials of a special type. When $n = 2$ these monomials are of the form $\{ f_1^{s_1} f_2^{r_1} f_1^{s_2} \mid s_1, s_2, r_1 \ge 0 \}$. Due to formula (3) from Lemma 3.1, the monomials $f_1^{s_1} f_2^{r_1} f_1^{s_2}$ with $s_1 > r_1$ are linearly dependent on those having $s_1 \le r_1$; therefore we can restrict to the set

$$ \{ f_1^{s_1} f_2^{r_1} f_1^{s_2} \mid s_1, s_2, r_1 \ge 0,\; s_1 \le r_1 \}. \tag{13} $$

We can generalize this assertion for $n > 2$ as follows.

Theorem 3.3.

$$ U(a_n^-) = \operatorname{span}\bigl\{ f_1^{k_{1n}} f_2^{k_{2n}} \cdots f_n^{k_{nn}}\; f_1^{k_{1,n-1}} f_2^{k_{2,n-1}} \cdots f_{n-1}^{k_{n-1,n-1}} \cdots f_1^{k_{12}} f_2^{k_{22}}\; f_1^{k_{11}} \;\big|\; k_{ij} \le k_{i+1,j},\; i = 1, \ldots, n-1,\; j = 1, \ldots, n \bigr\}. \tag{14} $$

(We call the monomials appearing on the right-hand side of this equality Verma monomials.)

Proof. We proceed by induction on $n$. For $n = 2$ the assertion is true; now we assume validity for $n$ and prove it for $n + 1$. It follows from the preceding lemma that it is sufficient to consider $v \in U(a_{n+1}^-)$ such that it can be written as $v_1 f_{n+1}^r v_2$, where $v_1, v_2 \in U(a_n^-)$ and $v_1$ is a Verma monomial of the form

$$ v_1 = f_1^{k_{1n}} f_2^{k_{2n}} \cdots f_n^{k_{nn}}\; f_1^{k_{1,n-1}} f_2^{k_{2,n-1}} \cdots f_{n-1}^{k_{n-1,n-1}} \cdots f_1^{k_{12}} f_2^{k_{22}} f_1^{k_{11}}. $$

Therefore

$$ v_1 f_{n+1}^r v_2 = f_1^{k_{1n}} f_2^{k_{2n}} \cdots f_n^{k_{nn}} f_{n+1}^r\; \underbrace{f_1^{k_{1,n-1}} f_2^{k_{2,n-1}} \cdots f_{n-1}^{k_{n-1,n-1}} \cdots f_1^{k_{12}} f_2^{k_{22}} f_1^{k_{11}} v_2}_{v'} $$

and, due to the induction hypothesis, $v'$ is a linear combination of Verma monomials. Now if $r \ge k_{nn}$, then the product $f_1^{k_{1n}} f_2^{k_{2n}} \cdots f_n^{k_{nn}} f_{n+1}^r$ is already a Verma monomial and the proof is finished. When $r < k_{nn}$, we rewrite the product $f_n^{k_{nn}} f_{n+1}^r$ using (2) from Lemma 3.1 in the form

$$ f_n^{k_{nn}} f_{n+1}^r = a_0 f_n^r f_{n+1}^r f_n^{k_{nn}-r} + a_1 f_n^{r-1} f_{n+1}^r f_n^{k_{nn}-r+1} + \cdots + a_r f_{n+1}^r f_n^{k_{nn}}, $$

the $a_i$ being suitable complex constants. From this we conclude that

$$ f_1^{k_{1n}} f_2^{k_{2n}} \cdots f_n^{k_{nn}} f_{n+1}^r \in \operatorname{span}\{ u_1 f_{n+1}^r w_1,\; u_2 f_{n+1}^r w_2,\; \ldots,\; u_n f_{n+1}^r w_n \}, \tag{15} $$

where $u_i$, $w_i$ are Verma monomials from $U(a_n^-)$ and the highest degree of the simple root $f_n$ in $u_i$ is less than or equal to $r$. Therefore the product (15) is a linear combination of Verma monomials from $U(a_{n+1}^-)$, as desired.

The linear independence of Verma monomials can be shown as follows. We make use of the commuting operators $\operatorname{ad} h_i : U(a_n^-) \to U(a_n^-)$ defined by

$$ \operatorname{ad} h_i\, v = [h_i, v], \qquad i = 1, \ldots, n. \tag{16} $$

The algebra $U(a_n^-)$ decomposes into the direct sum $U(a_n^-) = \bigoplus_{z_1,\ldots,z_n} V(z_1, \ldots, z_n)$ of common eigenspaces of the operators $\operatorname{ad} h_i$,

$$ V(z_1, \ldots, z_n) = \bigl\{ v \in U(a_n^-) \;\big|\; \operatorname{ad} h_i\, v = z_i v,\; i = 1, \ldots, n \bigr\}. $$

Two vectors belonging to different subspaces are linearly independent.

Lemma 3.4.
(1.) The PBW monomials (8) of $U(a_n^-)$ are eigenvectors of all $\operatorname{ad} h_i$.
(2.) $v \in V(z_1, \ldots, z_n)$ iff

$$ s_{12} + s_{13} + \cdots + s_{1,n+1} = m_1, \quad s_{23} + \cdots + s_{2,n+1} = m_2 + s_{12}, \quad \ldots, \quad s_{n,n+1} = m_n + s_{1n} + s_{2n} + \cdots + s_{n-1,n}, \tag{17} $$

where

$$ m = -\frac{1}{n+1}\, C_1 z, \tag{18} $$

with $m = (m_1, \ldots, m_n)^T$, $z = (z_1, \ldots, z_n)^T$ and

$$ C_1 = \begin{pmatrix} -n & -(n-1) & -(n-2) & \cdots & -2 & -1 \\ 1 & -(n-1) & -(n-2) & \cdots & -2 & -1 \\ 1 & 2 & -(n-2) & \cdots & -2 & -1 \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ 1 & 2 & 3 & \cdots & -2 & -1 \\ 1 & 2 & 3 & \cdots & n-1 & -1 \end{pmatrix}. $$

Proof. (1.) The assertion is a direct consequence of the fact that the generators $e_{jk}$, $j < k$, are eigenvectors of $\operatorname{ad} h_i$.
(2.) System (17) was obtained using the generators $e_{ii}$,

$$ e_{ii} = \frac{1}{n+1}\Bigl( c_1 - \sum_{j=1}^{n+1-i} j\, h_{n+1-j} + \sum_{j=n+2-i}^{n} (n+1-j)\, h_{n+1-j} \Bigr), $$

$$ \operatorname{ad} e_{ii}\, v = \frac{1}{n+1}\Bigl( -\sum_{j=1}^{n+1-i} j\, z_{n+1-j} + \sum_{j=n+2-i}^{n} (n+1-j)\, z_{n+1-j} \Bigr) v \equiv m_i v. $$

Here $c_1 = e_{11} + \cdots + e_{nn}$ stands for the Casimir operator.
Note that the matrix which appears in (18) is the inverse of the Cartan matrix of the algebra $a_n$. From equations (17) we see that the $m_i$ are all nonnegative integers. The dimension of $V(z_1, \ldots, z_n)$ is finite, since for fixed right-hand sides of equations (18) there is only a finite number of decompositions of $m_1$ into the sum of $s_{12}, s_{13}, \ldots, s_{1,n+1}$, etc. For each of the possibilities

$$ (s_{12}, \ldots, s_{1,n+1},\, s_{23}, \ldots, s_{2,n+1},\, \ldots,\, s_{n,n+1}) \tag{19} $$

we obtain a basis vector $v \in V(z_1, \ldots, z_n)$. By exhausting all these possibilities we obtain a basis of $V(z_1, \ldots, z_n)$.

Lemma 3.5. The Verma monomial

$$ v = \bigl( f_1^{l_{1n}} \cdots f_{n-1}^{l_{n-1,n}} f_n^{k_n} \bigr)\bigl( f_1^{l_{1,n-1}} \cdots f_{n-2}^{l_{n-2,n-1}} f_{n-1}^{k_{n-1}} \bigr) \cdots \bigl( f_1^{l_{12}} f_2^{k_2} \bigr)\bigl( f_1^{k_1} \bigr) \in V(z_1, \ldots, z_n) \tag{20} $$

iff

$$ \begin{pmatrix} z_1 \\ z_2 \\ \vdots \\ z_{n-1} \\ z_n \end{pmatrix} = \begin{pmatrix} -2 & -1 & 0 & \cdots & 0 & 0 \\ -1 & -2 & -1 & \cdots & 0 & 0 \\ \vdots & & \ddots & & & \vdots \\ 0 & 0 & 0 & \cdots & -2 & -1 \\ 0 & 0 & 0 & \cdots & -1 & -2 \end{pmatrix} \begin{pmatrix} l_{1n} + l_{1,n-1} + \cdots + l_{12} + k_1 \\ l_{2n} + l_{2,n-1} + \cdots + l_{23} + k_2 \\ \vdots \\ l_{n-1,n} + k_{n-1} \\ k_n \end{pmatrix}. \tag{21} $$

Proof. By direct calculation using the relations $[f_i, h_j] = c_{ij} f_i$, where $c_{ij} = 2\delta_{ij} - \delta_{i,j+1} - \delta_{i,j-1}$ and $c_{ij} = 0$ for $|i-j| > 1$, and, consequently, $[f_i^\alpha, h_j] = \alpha\, c_{ij} f_i^\alpha$.

Note that by inverting the Cartan matrix and the matrix from eq. (21) we can rewrite the system (21) in the form

$$ l_{1n} + l_{1,n-1} + \cdots + l_{12} + k_1 = m_1, \quad l_{2n} + l_{2,n-1} + \cdots + k_2 = m_1 + m_2, \quad \ldots, \quad l_{n-1,n} + k_{n-1} = m_1 + \cdots + m_{n-1}, \quad k_n = m_1 + \cdots + m_n. \tag{22} $$

Theorem 3.6. If the numbers $m_1, \ldots, m_n$ are fixed, then there is a bijective mapping between the set of all solutions (19) of (17) and the set of all solutions $(l_{1n}, l_{1,n-1}, \ldots, l_{12}, k_1,\, l_{2n}, l_{2,n-1}, \ldots, l_{23}, k_2,\, \ldots,\, l_{n-1,n}, k_{n-1})$ of the system (22).

Proof. The explicit form of this bijection is

$$ s_{12} = k_1, \qquad s_{1t} = l_{1,t-1}, \quad t = 3, \ldots, n+1, $$
$$ s_{r,r+1} = k_r - l_{r-1,r}, \qquad s_{rt} = l_{r,t-1} - l_{r-1,t-1}, \quad t = r+2, \ldots, n+1, $$
$$ s_{n,n+1} = m_n + k_{n-1}. $$

Bijectivity is a consequence of the fact that the Verma monomials form a spanning set of $U(a_n^-)$.

Corollary 3.7. All Verma monomials are linearly independent.

4. Conclusions

The problems of a unified construction of Verma bases for the other series of simple Lie algebras (namely orthogonal and symplectic) and of an effective determination of matrix elements in these cases are still open. In [50, 51], Kang and Lee developed the notion of Gröbner-Shirshov pairs. In this way the reduction problem in representation theory was solved, and monomial bases of representations of various associative algebras could be constructed; the algebra $a_n$ was among the first examples. Note that the bases obtained there are different from Verma bases. It is an interesting question whether Verma bases can be derived this way.

Acknowledgements
S. P. acknowledges support from grant no. P201/10/1509, a project of the Grant Agency of the Czech Republic.

References
[1] P. Littelmann. An algorithm to compute bases and representation matrices for SL_{n+1}-representations. J Pure Appl Algebra 117/118:447–468, 1997. Algorithms for algebra (Eindhoven, 1996).
[2] N. Bourbaki. Éléments de mathématique. Fasc. XXXIV. Groupes et algèbres de Lie. Chapitre IV: Groupes de Coxeter et systèmes de Tits. Chapitre V: Groupes engendrés par des réflexions. Chapitre VI: Systèmes de racines. Actualités Scientifiques et Industrielles, No. 1337. Hermann, Paris, 1968.
[3] J. Dixmier. Algèbres enveloppantes. Les Grands Classiques Gauthier-Villars. Éditions Jacques Gabay, Paris, 1996. Reprint of the 1974 original.
[4] J. E. Humphreys. Introduction to Lie Algebras and Representation Theory, vol. 9 of Graduate Texts in Mathematics. Springer-Verlag, New York, 1978. Second printing, revised.
[5] K. Erdmann, et al. Introduction to Lie Algebras. Springer Undergraduate Mathematics Series. Springer-Verlag London Ltd., London, 2006.
[6] I. M. Gel'fand, et al. Finite-dimensional representations of the group of unimodular matrices. Doklady Akad Nauk SSSR (NS) 71:825–828, 1950.
[7] I. M. Gel'fand, et al. Finite-dimensional representations of groups of orthogonal matrices. Doklady Akad Nauk SSSR (NS) 71:1017–1020, 1950.
[8] A. I. Molev. Gelfand-Tsetlin bases for classical Lie algebras. In Handbook of Algebra, vol. 4, pp. 109–170. Elsevier/North-Holland, Amsterdam, 2006.
[9] L. C. Biedenharn. On the representations of the semisimple Lie groups. I. The explicit construction of invariants for the unimodular unitary group in n dimensions. J Math Phys 4:436–445, 1963.
[10] G. E. Baird, et al. On the representations of the semisimple Lie groups. II. J Math Phys 4:1449–1466, 1963.
[11] D. P. Želobenko. Compact Lie Groups and Their Representations. American Mathematical Society, Providence, R.I., 1973. Translated from the Russian, Translations of Mathematical Monographs, vol. 40.
[12] J. G. Nagel, et al. Operators that lower or raise the irreducible vector spaces of U_{n-1} contained in an irreducible vector space of U_n. J Math Phys 6:682–694, 1965.
[13] H. Pei-Yu. Orthonormal bases and infinitesimal operators of the irreducible representations of group U_n. Sci Sinica 15:763–772, 1966.
[14] F. Lemire, et al. Formal analytic continuation of Gel'fand's finite-dimensional representations of gl(n, C). J Math Phys 20(5):820–829, 1979.
[15] M. D. Gould. The characteristic identities and reduced matrix elements of the unitary and orthogonal groups. J Austral Math Soc Ser B 20(4):401–433, 1977/78.
[16] M. D. Gould, et al. Characteristic identities for Kac-Moody algebras. Lett Math Phys 22(2):91–100, 1991.
[17] R. M. Ašerova, et al. Projection operators for simple Lie groups. II. General scheme for the construction of lowering operators. The case of the groups SU(n). Teoret Mat Fiz 15(1):107–119, 1973.
[18] R. M. Ašerova, et al. Projection operators for simple Lie groups. Teoret Mat Fiz 8(2):255–271, 1971.
[19] Y. F. Smirnov. Projection operators for Lie algebras, superalgebras, and quantum algebras. In Latin-American School of Physics XXX ELAF (Mexico City, 1995), vol. 365 of AIP Conf. Proc., pp. 99–116. Amer. Inst. Phys. Press, Woodbury, NY, 1996.
[20] M. Nazarov, et al. Representations of Yangians with Gelfand-Zetlin bases. J Reine Angew Math 496:181–212, 1998.
[21] H. Weyl. The Classical Groups. Their Invariants and Representations. Princeton University Press, Princeton, N.J., 1939.
[22] R. C. King, et al. Standard Young tableaux and weight multiplicities of the classical Lie groups. J Phys A 16(14):3153–3177, 1983.
[23] A. Berele. Construction of Sp-modules by tableaux. Linear and Multilinear Algebra 19(4):299–307, 1986.
[24] R. C. King, et al. Construction of orthogonal group modules using tableaux. Linear and Multilinear Algebra 33(3–4):251–283, 1993.
[25] K. Koike, et al. Young-diagrammatic methods for the representation theory of the classical groups of type B_n, C_n, D_n. J Algebra 107(2):466–511, 1987.
[26] P. Exner, et al. Canonical realizations of classical Lie algebras. Czech J Phys 26B(11):1213–1228, 1976.
[27] C. De Concini, et al. Special bases for S_n and GL(n). Israel J Math 40(3–4):275–290, 1981.
[28] N. H. Xi. Special bases of irreducible modules of the quantized universal enveloping algebra U_v(gl(n)). J Algebra 154(2):377–386, 1993.
[29] I. M. Gel'fand, et al. Multiplicities and proper bases for gl_n. In Group Theoretical Methods in Physics, vol. II (Yurmala, 1985), pp. 147–159. VNU Sci. Press, Utrecht, 1986.
[30] A. V. Zelevinskiĭ, et al. The fundamental affine space and canonical basis in irreducible representations of the group Sp_4. Dokl Akad Nauk SSSR 300(1):31–35, 1988.
[31] O. Mathieu. Good bases for G-modules. Geom Dedicata 36(1):51–66, 1990.
[32] G. Lusztig. Canonical bases arising from quantized enveloping algebras. J Amer Math Soc 3(2):447–498, 1990.
[33] G. Lusztig. Canonical bases arising from quantized enveloping algebras. II. Progr Theoret Phys Suppl 102:175–201, 1990. Common trends in mathematics and quantum field theories (Kyoto, 1990).
[34] M. Kashiwara. Crystalizing the q-analogue of universal enveloping algebras. Commun Math Phys 133(2):249–260, 1990.
[35] J. Du. Canonical bases for irreducible representations of quantum GL_n. Bull London Math Soc 24(4):325–334, 1992.
[36] J. Du. Canonical bases for irreducible representations of quantum GL_n. II. J London Math Soc (2) 51(3):461–470, 1995.
[37] S.-J. Kang. Representations of quantum groups and crystal base theory. In Algebra and Topology 1992 (Taejŏn), pp. 189–210. Korea Adv. Inst. Sci. Tech., Taejŏn, 1992.
[38] K. Misra, et al. Crystal base for the basic representation of U_q(sl(n)). Commun Math Phys 134(1):79–88, 1990.
[39] Y. M. Zou. Crystal bases for U_q(sl(2,1)). Proc Amer Math Soc 127(8):2213–2223, 1999.
[40] G. Cliff. Crystal bases and Young tableaux. J Algebra 202(1):10–35, 1998.
[41] D. P. Zhelobenko. Crystal bases and the problem of reduction in classical and quantum modules. In Lie Groups and Lie Algebras: E. B. Dynkin's Seminar, vol. 169 of Amer. Math. Soc. Transl. Ser. 2, pp. 183–202. Amer. Math. Soc., Providence, RI, 1995.
[42] V. Lakshmibai, et al. Geometry of G/P. IV. Standard monomial theory for classical types. Proc Indian Acad Sci Sect A Math Sci 88(4):279–362, 1979.
[43] P. Littelmann. Cones, crystals, and patterns. Transform Groups 3(2):145–179, 1998.
[44] V. Chari, et al. Monomial bases of quantized enveloping algebras. In Recent Developments in Quantum Affine Algebras and Related Topics (Raleigh, NC, 1998), vol. 248 of Contemp. Math., pp. 69–81. Amer. Math. Soc., Providence, RI, 1999.
[45] S. P. Li, et al. Verma bases for representations of classical simple Lie algebras. J Math Phys 27(3):668–677, 1986.
[46] J. Patera. Verma bases for representations of rank two Lie and Kac-Moody algebras. In Proceedings of the 14th ICGTMP (Seoul, 1985), pp. 289–292. World Sci. Publishing, Singapore, 1986.
[47] M. E. Hall. Verma Bases of Modules for Simple Lie Algebras. ProQuest LLC, Ann Arbor, MI, 1987. Thesis (Ph.D.), The University of Wisconsin–Madison.
[48] K. N. Raghavan, et al. On Verma bases for representations of sl(n, C). J Math Phys 40(4):2190–2195, 1999.
[49] K. N. Raghavan, et al. Erratum: "On Verma bases for representations of sl(n, C)". J Math Phys 41(5):3302, 2000.
[50] S.-J. Kang, et al. Gröbner-Shirshov bases for representation theory. J Korean Math Soc 37(1):55–72, 2000.
[51] S.-J. Kang, et al. Gröbner-Shirshov bases for irreducible sl_{n+1}-modules. J Algebra 232(1):1–20, 2000.
Genetic algorithm optimisation of a ship navigation system

E. Alfaro-Cid, E. W. McGookin, D. J. Murray-Smith

Abstract: The optimisation of the PID controllers' gains for separate propulsion and heading control systems of CyberShip I, a scale model of an oil platform supply ship, using genetic algorithms is considered. During the initial design process both PID controllers were manually tuned to improve their performance; however, this tuning approach is a tedious and time-consuming process. A solution to this problem is the use of optimisation techniques based on genetic algorithms to optimise the controllers' gain values. This investigation has been carried out through computer-generated simulations based on a non-linear hydrodynamic model of CyberShip I.

Keywords: ship control, PID controller, genetic algorithm optimisation.

1 Introduction

In order to ensure the safe navigation of surface vessels, their motion has to be controlled accurately. This control can be provided through the application of control theory. In general, control theory provides design strategies that allow a better understanding of the system being controlled and a mechanism to regulate the way it operates. There are various control theories or methodologies, each with its own structure. Unfortunately, not all of these methods perform their controlling duties satisfactorily, due to inherent limitations imposed by the controller structure. In addition, the performance of these controllers depends on the values of the controllers' parameters. Conventionally, the designer manually tunes these parameters to find an acceptable solution. However, this relies on an ad hoc approach to tuning, which depends on the experience and qualitative judgement of the designer. This process can be very slow, and there is no guarantee that the designed solution will perform satisfactorily. A solution to this problem is to use optimisation techniques that tune such parameters automatically. The most powerful of these techniques are based on genetic algorithms (GAs) [1], [2], [3].

GAs are stochastic search methods that mimic the way species evolve in nature. They operate on a population of potential solutions, applying the Darwinian principle of "survival of the fittest" to produce better and better possible solutions to a given problem. At each generation, a new set of candidate solutions is created by selecting solutions that are better than others and breeding them together, using operators borrowed from natural sexual reproduction, in order to obtain populations of solutions that are better than the solutions they were created from, just as in natural adaptation.

This paper covers the optimisation of control systems for the propulsion and navigation of an oil platform supply ship using a GA. The particular vessel used in this study is a scale model called CyberShip I [4], [5], which is the test vehicle for the Guidance, Navigation and Control laboratory at the Norwegian University of Science and Technology in Trondheim. Computer-generated simulations based on a non-linear hydrodynamic model of CyberShip I are used in the optimisation studies. These simulations have proven to be sufficiently representative of the full-scale manoeuvring dynamics of such a vessel; hence the optimised controllers could be used to control the actual scale model without the need for further modification.

The investigation presented in this paper represents part of a study into the optimisation of controller designs based on a number of different control methodologies. In this case the particular methodology considered is classical PID, a very simple and widespread controller. PID control is used to provide the structure for both the propulsion and the navigation controller. The goal of this study is to obtain controller solutions that satisfactorily track the desired heading and propulsion responses while keeping actuator usage to a minimum. The results obtained from this study illustrate the benefits of using GAs to optimise propulsion and navigation controllers for surface ships.
The accuracy of the resulting simulations allows meaningful evaluation of the optimised controllers' performance.

2 Supply ship mathematical model

The dynamics of the vessel can be represented by the kinetic and kinematic equations. When the kinetic and kinematic equations are combined, the following matrix equation is produced [4]:

$$ \begin{bmatrix} \dot\nu \\ \dot\eta \end{bmatrix} = \begin{bmatrix} -M^{-1}\bigl(C(\nu) + D(\nu)\bigr) & 0 \\ J(\eta) & 0 \end{bmatrix} \begin{bmatrix} \nu \\ \eta \end{bmatrix} + \begin{bmatrix} M^{-1} \\ 0 \end{bmatrix} \tau, \tag{1} $$

which corresponds to a form of state space equation (i.e., $\dot x = A(x)x + B\tau$). Here $M$, $C$, $D$ and $J$ are the mass/inertia, Coriolis, damping and Euler matrices, $\nu = [u, v, r]^T$ is the body-fixed linear and angular velocity vector, $\eta = [x, y, \psi]^T$ denotes the position and orientation vector in the earth-fixed frame, and $\tau = [\tau_1, \tau_2, \tau_3]^T$ is the input force vector, where $\tau_1$ is the thrust component along the body-fixed x-axis, $\tau_2$ is the thrust component along the body-fixed y-axis and $\tau_3$ is the thrust component acting about the body-fixed z-axis.

These three force components, which constitute the inputs to the vessel model, are provided by four thrusters. Two of them are placed at the stern, symmetric with respect to the body-fixed x-axis, while the other two are placed at the bow, along the body-fixed x-axis [4]. Each thruster is represented by the force ($F_i$) it produces and the angle ($\alpha_i$) defining its direction, and the amplitude limits for each thruster's force and angle are $\pm 0.9$ N and $\pm\pi$ radians respectively [4]. The three input force components ($\tau$) can be related to the individual forces produced by the thrusters ($F_i$) using trigonometric relationships [5]. Hence, the ship's model has 6 states (i.e., $u$, surge velocity; $v$, sway velocity; $r$, yaw rate; $\psi$, yaw angle; $x_p$, x-position on earth; and $y_p$, y-position on earth) and 3 inputs (i.e., $\tau_1$, $\tau_2$ and $\tau_3$). Since CyberShip I is a scale model of an actual type of supply vessel, the simulations performed are scaled in order to generate results comparable to the response of full-scale ships.
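A minimal time-stepping sketch of the combined model (1) follows. This is our own illustration: the $M$, $C$, $D$ matrices below are simple placeholders, not the CyberShip I hydrodynamic coefficients.

```python
# One explicit-Euler integrator for eq. (1): nu = [u, v, r], eta = [x, y, psi].
import numpy as np

M = np.diag([20.0, 25.0, 3.0])        # mass/inertia matrix (assumed)
D = np.diag([2.0, 7.0, 0.5])          # linear damping matrix (assumed)

def C(nu):                            # Coriolis matrix, assumed rigid-body form
    u, v, r = nu
    return np.array([[0.0, 0.0, -M[1, 1] * v],
                     [0.0, 0.0,  M[0, 0] * u],
                     [M[1, 1] * v, -M[0, 0] * u, 0.0]])

def J(psi):                           # Euler (planar rotation) matrix
    c, s = np.cos(psi), np.sin(psi)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def step(nu, eta, tau, dt=0.05):
    nu_dot = np.linalg.solve(M, tau - (C(nu) + D) @ nu)
    return nu + dt * nu_dot, eta + dt * J(eta[2]) @ nu

nu, eta = np.zeros(3), np.zeros(3)
for _ in range(400):                  # constant surge thrust, no steering
    nu, eta = step(nu, eta, np.array([0.9, 0.0, 0.0]))
print("nu =", nu.round(3), " eta =", eta.round(3))
```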
3 Automatic control system

For most surface vessels there are two main sets of dynamics: propulsion and heading. Two classical PID controllers [6] have been used to provide the structure for the propulsion controller (governing surge velocity) and the navigation controller (governing heading). These controllers operate on the error signal, which is the difference between the desired output and the actual output, and generate the actuation signal (i.e., $\tau_1$ and $\tau_3$) which drives the plant (i.e., the vessel). They have three basic modes of operation: proportional action, in which the actuation signal is proportional to the error signal; integral action, where the actuation signal is proportional to the time integral of the error signal; and derivative action, where the actuation signal is proportional to the time derivative of the error signal. Their transfer function acting on the error signal $E(s)$ is given by

$$ D(s) = \Bigl( K + \frac{K_i}{s} + K_d\, s \Bigr) E(s). \tag{2} $$

These "three-term controllers" have been found to be so effective that PID is the standard control method used in the process industries. To design a particular control loop, the three parameters ($K$, proportional gain; $K_i$, integral gain; and $K_d$, derivative gain) have to be "tuned" to arrive at acceptable performance.

4 Genetic algorithm optimisation

GAs are optimisation techniques that mimic the way species evolve in nature. In natural evolution many organisms evolve by means of two mechanisms: natural selection and sexual reproduction. The concept of natural selection is described by the Darwinian theory of survival of the fittest, while sexual reproduction allows the offspring to inherit features from both parents. GAs emulate this process by evolving a population of parameter solutions through a number of generations. They initiate this process by randomly generating an initial population of possible parameter sets (suitably encoded). The performance of each solution is evaluated, and a new generation is produced according to the three main operators of the GA: selection, crossover and mutation. Selection determines which solutions are chosen for mating according to the principle of survival of the fittest (i.e., the better the performance of the solution, the more likely it is to be chosen for mating and the more offspring it produces). Crossover allows an improvement of the species in terms of the evolution of new solutions that are fitter than any seen before, and mutation reintroduces values that might have been lost through selection or crossover, or creates totally new features. The cycle is performed until a predetermined number of generations is met [1], [2].

In order to search the space of possible solutions, the GA uses a string of digits (called a chromosome) as a representation of the elements of the space of possible solutions (i.e., the possible solutions are suitably encoded). In this study, the encoding of the parameters has been chosen to allow a range of possible solutions from $0.001 \times 10^{-2}$ to $9.999 \times 10^{3}$. Each controller parameter value is encoded as a string of five genes,

$$ \text{parameter encoding } abcde \;\longrightarrow\; \text{parameter value} = (a + 0.1\,b + 0.01\,c + 0.001\,d) \times 10^{\,e-2}. \tag{3} $$

As there are three parameters to optimise in each controller (i.e., $K$, $K_i$ and $K_d$), each possible solution is represented by a chromosome that is a string of 30 genes (i.e., 15 genes represent the propulsion controller and 15 represent the heading controller). These genes, instead of being binary bits (as they used to be in the traditional GA), are integers within the interval [0, 9], in order to allow a wider range of possible solutions with smaller chromosomes [4].

Once an initial population of chromosomes is generated at random, the chromosomes are decoded to get the corresponding parameters, and these are introduced into the controllers. A simulation is run, and the result obtained for each set of controller parameters in the population is evaluated using the cost function to be minimized. Based on this cost function, the selection procedure, used to draw chromosomes from the evaluated population, takes place. There are three main types of algorithms for this (i.e., roulette wheel selection, tournament selection and rank-based selection [4]), but they all share the feature that the probability of selecting a chromosome for reproduction is a decreasing function of the chromosome's cost score. Roulette wheel selection allows sub-optimal solutions to have a chance of being accepted; it ensures a good mix of good and bad solutions and prevents premature convergence to a local minimum, but it also leads to a very slow convergence rate. Therefore, to avoid this slow convergence, the selection scheme chosen for this study is a variation of roulette wheel selection. The population is sorted according to the actual cost, and a new cost value is assigned to each chromosome depending only on its position in the chromosome ranking and not on the actual cost value. Let nchrom be the number of chromosomes in the population, pos the position of a chromosome in this population and sp the selective pressure (i.e., the probability of the best chromosome being selected compared to the average probability of selection of all chromosomes). The new cost is calculated as [7]:

$$ \mathrm{cost}(pos) = 2 - sp + \frac{2\,(sp - 1)\,(pos - 1)}{nchrom - 1}. \tag{4} $$
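The gene decoding of (3) and the ranking of (4) are simple to express in code. In the sketch below the exact decoding formula and the selective pressure value are assumptions made for illustration (the paper states only the admissible range and the ranking formula):

```python
# Gene decoding (3) and rank-based cost reassignment (4).
import numpy as np

def decode(genes):
    """Five integer genes a,b,c,d,e in [0,9] -> parameter value."""
    a, b, c, d, e = genes
    return (a + 0.1 * b + 0.01 * c + 0.001 * d) * 10.0 ** (e - 2)

def rank_cost(raw_costs, sp=1.8):     # sp = selective pressure (assumed 1.8)
    order = np.argsort(raw_costs)[::-1]   # worst chromosome gets pos = 1
    nchrom = len(raw_costs)
    ranked = np.empty(nchrom)
    for pos, idx in enumerate(order, start=1):
        ranked[idx] = 2 - sp + 2 * (sp - 1) * (pos - 1) / (nchrom - 1)
    return ranked                     # used to size the roulette-wheel slots

print(decode([9, 9, 9, 9, 5]))        # ~9.999e3, the upper end of the range
print(rank_cost(np.array([3.2, 0.7, 1.5])))   # best (0.7) gets weight sp
```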
based on this cost function the selection procedure, used to draw chromosomes from the evaluated population, takes place. there are three main types of algorithms for this (i.e. roulette wheel selection, tournament selection and rank-based selection [4]), but they all share the feature that the probability of selecting a chromosome for reproduction is a decreasing function of the chromosome’s cost score. roulette wheel selection allows sub-optimal solutions to have a chance of being accepted. it ensures a good mix of good and bad solutions and prevents premature convergence to a local minimum, however it also leads to a very slow convergence rate. therefore, to avoid this slow convergence, the selection scheme chosen for this study is a variation of roulette wheel selection. the population is sorted according to the actual cost and a new cost value is assigned to each chromosome depending only on its position in the chromosome’s rank and not on the actual cost value. consider nchrom the number of chromosomes in the population, pos the position of a chromosome in this population and sp the selective pressure (i.e. the probability of the best chromosome being selected compared to the average probability of selection of all chromosomes). the new cost is calculated as [7]: � � � � � � � �cost pos sp sp pos nchrom � � � � �2 2 1 1 1 . (4) then a biased roulette wheel is created where each chromosome in the population has a roulette wheel slot sized in proportion to its cost. to reproduce, the roulette wheel thus 14 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 41 no. 4 – 5/2001 defined is spinned as many times as chromosomes are in the population [3]. the fact of assigning this new cost value to each chromosome (instead of creating a biased roulette wheel straight from the actual cost values obtained after the simulation) allows a bigger differentiation among the chromosomes. in a even population where most chromosomes have approximately the same cost value, assigning new cost values depending only in the position of the chromosome once the population has been ranked, increases the chances of the fittest chromosomes being chosen for reproduction. this way the convergence to the minimum is faster [7]. once the new population has been obtained, chromosomes are ready for mating and mutation. the crossover operator combines the features of two parents to create new solutions. one or several crossover points are selected at random and then complementary fractions from the two parents are spliced together to form a new chromosome [4]. in this study two-point crossover has been used because multi-point crossover encourages a greater amount of variation in the new chromosomes produced and two points do not greatly increase the complexity of the algorithm. the mutation operator alters a copy of a chromosome. one or more locations are selected on the chromosome and replaced with new randomly generated values. mutation is used to help insure that all areas of the search space remain reachable. the mutation probability chosen has been 5 % [4]. 5 manual tuning results this investigation has been carried out through simulation studies in matlab. the accuracy of computer-generated simulations based on this mathematical model permits the use of such simulations as a method for evaluating the performance of the controllers. in the vessel simulation two pid controllers have been integrated: one for the heading subsystem (i.e. 
5 manual tuning results

this investigation has been carried out through simulation studies in matlab. the accuracy of computer-generated simulations based on this mathematical model permits the use of such simulations as a method for evaluating the performance of the controllers. in the vessel simulation two pid controllers have been integrated: one for the heading subsystem (generating the input τ3) and one for the propulsion subsystem (generating the input τ1). the parameters of both controllers (k, proportional gain; ki, integral gain; kd, derivative gain) need to be tuned in order to obtain as accurate a response of the system as possible. the manual tuning yields the parameters shown in table 1. the resulting controllers generate the simulated results shown in figure 1 when a 45° overdamped second-order step input is applied in the heading and a 0.4 m/s overdamped second-order step input in the propulsion.

table 1: manually tuned controllers parameters
        heading controller   propulsion controller
  k     9                    200
  ki    1.5                  2
  kd    20                   8

fig. 1: manually tuned controllers response

although the results obtained in this case are considered to be quite good, manual tuning is a tedious job and not very reliable, since sometimes the values of the controller gains can be particularly difficult to obtain. therefore a ga has been used to optimise these gains.

6 genetic algorithm optimisation results

the results presented in this section are the best cases obtained from several runs (i.e. 10 runs) of the ga. the rest of the runs converge to the same minimum area, and the controllers obtained show similar characteristics, although they do not perform as well as the best case. since it is desirable that the population size is approximately equal to the number of genes in each chromosome, and that the number of generations doubles this [4], in every run the population consisted of 60 chromosomes and the optimisation cycle was performed for 120 generations. thus, the ga initialisation is carried out by generating an initial population matrix of uniformly distributed random integers, whose size is 60 × 30 (i.e. number of chromosomes × number of genes in a chromosome).

the optimisation design criterion is defined by the cost function. in addition to this there is a desired response that the controller must track. the desired heading and propulsion manoeuvres used in the ga optimisation are the same as those used for the manually tuned controllers, so that both studies can be compared (see figure 1). since the objective of the controllers is to make the vessel track desired dynamic responses with the minimum actuator effort, the cost function has two terms for each controller [4]:

c = Σ (i = 0 … tot) [ Δψi² + λ1 τ3i² + Δui² + λ2 τ1i² ]. (5)

here Δψi is the ith heading angle error between the desired and obtained heading, τ3i is the ith yaw thrust force, Δui is the ith surge velocity error between the desired and obtained surge velocity, τ1i is the ith surge thrust force, tot is the total number of iterations, and λ1 and λ2 are scaling factors. since the optimisation process attempts to minimise the value of this function, it is easy to see that Δψ, Δu, τ3 and τ1 will be minimised too. therefore, the quantities Δψ and Δu give an indication of how well the controllers are operating, by showing the tracking between the actual and the desired heading and surge velocity, and the input components τ3 and τ1 are used to keep the actuator movement to a minimum, so that the actuators can operate well within their operating limits. as the input force and torque are always larger than the output errors near the optimum, they dominate the cost values in this area. this leads to solutions that provide very small thruster effort but very poor tracking of the desired responses. in order to avoid this, τ3 and τ1 are scaled by the two constants λ1 and λ2, so that an equally balanced trade-off between the elements is obtained and the four terms of the cost function are equally optimised. several scaling factors have been considered, resulting in the choice of λ1 and λ2 equal to 0.001 as the most appropriate.
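with the symbols defined above, evaluating this cost for one simulated run takes only a few lines of python (a sketch: the signal arrays stand in for outputs of the vessel simulation, and the fabricated example signals are purely illustrative):

```python
import numpy as np

def tracking_cost(psi_err, tau3, u_err, tau1, lam1=0.001, lam2=0.001):
    """cost function (5): summed squared tracking errors plus scaled
    squared thrust inputs, sampled once per simulation iteration."""
    psi_err, tau3 = np.asarray(psi_err), np.asarray(tau3)
    u_err, tau1 = np.asarray(u_err), np.asarray(tau1)
    return np.sum(psi_err**2 + lam1 * tau3**2 + u_err**2 + lam2 * tau1**2)

# toy signals standing in for one simulation run
t = np.linspace(0.0, 100.0, 1001)
psi_err = np.exp(-0.1 * t)        # decaying heading error
u_err = 0.1 * np.exp(-0.05 * t)   # decaying surge velocity error
tau3 = 50.0 * np.exp(-0.1 * t)    # yaw thrust force
tau1 = 200.0 * np.ones_like(t)    # surge thrust force
print(tracking_cost(psi_err, tau3, u_err, tau1))
```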
fig. 2: ga optimisation convergence

figure 2 shows the fast convergence of the ga for these scaling factors. the best solution converges to a solution very close to the minimum in less than 20 generations. the total cost converges to a low-cost region within 5 generations, although after that there are some peaks of higher cost due to the mutation operator. in the final generation there are 52 chromosomes that have similar cost functions (i.e. 86.6 % of the population). this means that the amount of saturation is high, which provides confidence in the final solution being near optimal. the parameters shown in table 2 are typical of those obtained as a solution by this optimisation.

when these parameters are implemented, the simulated responses shown in figure 3 are obtained. as can be seen from the figure, the tracking of the desired overdamped response in both subsystems, propulsion and heading, is excellent, although the surge velocity response shows a very small steady-state error (approximately 0.01 m/s). obviously, τ1 increases while the ship is accelerating and stays constant once the ship has reached the desired surge speed. on the other hand, τ3 initially shows a positive peak, corresponding to the beginning of the manoeuvre, and tends to zero as the ship completes the turn. both input signals, τ3 and τ1, are kept well within the operational limits, although τ3 shows a very oscillatory behaviour. this leads to unnecessary wear and tear of the actuators that shortens their life. a new cost function has been considered to try to reduce the oscillations in the actuators:

c = Σ (i = 0 … tot) [ Δψi² + λ1 τ3i² + Δui² + λ2 τ1i² + γ1 ((τ3i − τ3,i−1)/Δt)² + γ2 ((τ1i − τ1,i−1)/Δt)² ]. (6)

these two new terms introduce a measure of the rate at which the inputs increase or decrease. in the minimisation process these two terms will also be minimised, leading to a smoother input response. both terms are scaled by two coefficients, γ1 and γ2. after several trials, the coefficients γ1 and γ2 that seem to fulfil the role of reducing the wear and tear in the actuators without worsening the tracking are γ1 equal to 0.01 and γ2 equal to 0. this value of γ1 gives a good trade-off between good tracking of the desired heading and a significant reduction in the oscillations of the input τ3. also, any value of γ2 different from 0 leads to a very slow response, often with a big steady-state error, caused by the slowing down of τ1. this is to be expected, since the previously optimised controller did not present oscillations in the input τ1. hence, although the inclusion of the input rate term has provided a less oscillatory response for the input τ3 while keeping a good tracking, in the case of τ1 it leads to a much worse tracking with only slight improvements in the input response.
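extending the previous sketch with the input-rate terms of equation (6) is straightforward (again a sketch; Δt is the simulation sampling interval, and np.diff forms the discrete input-rate estimate (τ_i − τ_{i−1})/Δt):

```python
import numpy as np

def tracking_cost_with_rates(psi_err, tau3, u_err, tau1, dt,
                             lam1=0.001, lam2=0.001, gam1=0.01, gam2=0.0):
    """cost function (6): cost (5) plus penalties on the rates of change
    of the thrust inputs."""
    psi_err, tau3 = np.asarray(psi_err), np.asarray(tau3)
    u_err, tau1 = np.asarray(u_err), np.asarray(tau1)
    base = np.sum(psi_err**2 + lam1 * tau3**2 + u_err**2 + lam2 * tau1**2)
    rate3 = np.diff(tau3) / dt    # (tau3[i] - tau3[i-1]) / dt
    rate1 = np.diff(tau1) / dt
    return base + gam1 * np.sum(rate3**2) + gam2 * np.sum(rate1**2)
```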
it seems necessary to have prior knowledge of the actuator behaviour in order to include the input rate term in the cost function. introducing the input rate term in the cost function when there is no wear and tear of the actuator will just result in a slower input and less accurate tracking.

considering the convergence of the ga method through the total and best cost, the total cost converges to a low-cost region within 3 generations, and the best cost converges to a solution close to the final cost value in the 10th generation. hence, the convergence in this case has been faster than in figure 2. also, observing the amount of saturation in the final generation, it can be verified that the final solution is near optimal, due to the high number of similar individuals in the final solution (i.e. 45 individuals, 75 % of the population).

fig. 3: ga controllers response using cost function (5)

table 2: ga controllers parameters
        heading controller   propulsion controller
  k     25.36                223
  ki    0.0025               0.0031
  kd    37.03                519.9

when the pid controllers are optimised using (6), the values shown in table 3 are typical of those obtained. the results obtained for the pid controller gains using this latter cost function (6) differ quite a lot from those obtained with the previous one (5), especially ki. as a new term has been included in the cost function (6) to reduce the oscillations in the input signal generated by the heading controller (i.e. τ3), the ga has done so by decreasing k and kd, to reduce the gain of the system and the noise amplification. also, in order not to get a poorer tracking response after reducing k, ki has been increased to improve the tracking. although the terms of the cost function (6) concerning the propulsion controller have not changed, the reduction of the oscillation in τ3 affects τ1, and this leads to a different propulsion controller. once these parameters are implemented, the simulated responses shown in figure 4 are obtained. as can be seen from these plots, the tracking of the desired surge velocity response is again excellent, and even the slight steady-state error has disappeared. the tracking of the desired heading response is a bit worse than it was with the previous controller, especially in the transient (i.e. the first 10 seconds), but thanks to this the goal of this new optimisation, which was to reduce the wear and tear in the yaw thrusters, has been accomplished.

7 execution time

comparing the execution times of both methods, manual tuning and ga optimisation, it can be seen in table 4 below that the time spent tuning the controller manually was longer than the time it takes to run a ga simulation. it is quite difficult to reckon the time a designer will spend doing a manual tuning, because it highly depends on his experience, engineering judgement and luck. although it may seem that the simulation time is quite long and there is no real advantage in it, we must take into account that while the optimisation is running the designer does not need to be present.

8 conclusions

this paper has presented the findings of a study of pid controller gain optimisation for an oil platform supply ship's propulsion and heading dynamics. the ga process has performed well as an optimisation technique. it has obtained optimal solutions for both propulsion and navigation pid controllers without a-priori knowledge of the optimal region.
as this a-priori knowledge is not necessary, ga techniques are very suitable when the designer is unfamiliar with the plant to be controlled. the key factor in the ga optimisation is the choice of the cost function. as the objective of the optimised controller is to make the vessel follow the desired track as accurately as possible while avoiding excessive wear and tear in the actuators, these two terms (i.e. the difference between the desired and actual responses, and the thrust forces) must be included in the cost function in order to be minimised.

table 3: ga controllers parameters
        heading controller   propulsion controller
  k     4.28                 885
  ki    5.126                96.34
  kd    10.04                560.2

fig. 4: ga controllers response using cost function (6)

table 4: execution time
  manual tuning     15 h
  ga optimisation   8 h

numerous simulations considering different scaling factors show that there is a trade-off between the accuracy of following the desired track and the actuator usage, and one of them cannot be improved without making the other worse. this leads to the necessity of a balance, so that the overall performance is good. in addition, the inclusion in the cost function of a term considering the rate of variation of the thrust forces along the z-axis leads to a significant reduction in the oscillation of this input while keeping a good tracking of the desired heading. this is a considerable advantage, since it avoids unnecessary wear and tear of the actuators, which could shorten their operational lifespan. the results obtained illustrate the benefits of using gas to optimise propulsion and navigation controllers for surface ships. the accuracy of the obtained simulations allows confidence in the good performance of these controllers once they are used to control the manoeuvring of the actual scale model.

references

[1] holland, j.: genetic algorithms. scientific american, 1992
[2] ellis, c.: a bluffer guide to genetic algorithms. engineering design newsletter, science and engineering research council newsletter, 1993
[3] dutton, k., thompson, s., barraclough, b.: the art of control engineering. harlow, addison-wesley, 1997
[4] mcgookin, e. w.: optimisation of sliding mode controllers for marine applications: a study of methods and implementation issues. phd thesis, uk, 1997
[5] fossen, t. i.: guidance and control of ocean vehicles. chichester, john wiley & sons ltd., 1994
[6] goldberg, d.: genetic algorithms in searching optimisation and machine learning. reading, addison-wesley, 1989
[7] pohlheim, h.: genetic and evolutionary algorithm toolbox for use with matlab. http://www.systemtechnik.tu-ilmenau.de/~pohlheim

eva alfaro-cid, meng
phone: +44 141 330 6031, fax: +44 141 330 6004, e-mail: alfaro@elec.gla.ac.uk
euan w. mcgookin, meng, phd
david j. murray-smith, bsc, msc, phd
centre of systems and control & dept. of electronics and electrical engineering, university of glasgow, glasgow, g12 8lt, uk

acta polytechnica 53(2):148–151, 2013
© czech technical university in prague, 2013, available online at http://ctn.cvut.cz/ap/

inactivation of candida albicans by corona discharge: the increase of inhibition zones area after far subsequent exposition

vladyslava fantova, karol bujaček, vítězslav kříha, jaroslav julák
a czech technical university in prague, faculty of electrical engineering, department of physics, prague, czech republic
b charles university in prague, first faculty of medicine, institute of immunology and microbiology, czech republic
∗ corresponding author: demchvla@fel.cvut.cz

abstract. the cold atmospheric pressure plasma generated by a negative corona discharge has an inhibiting effect on microorganism growth. this effect is well known, and it can be demonstrated on the surface of cultivation agar plates by the formation of inhibition zones. in this study we exposed cultures of candida albicans to the negative corona discharge plasma in a special arrangement: equal doses of plasma were applied subsequently, twice or four times, to the same culture on one petri dish, while the distance between the exposed points was varied. only small differences were observed in the decontaminated zone areas for the twice-exposed agar at the shortest distance between exposed points (1.5 cm). in the case of the four-times-exposed agars, we observed significant differences in the inhibition zone areas, dependent not only on the exposition site distances but also on the exposition order. the largest inhibition zone was observed for the first exposition, decreasing towards the fourth one. to check the relevancy of these dependencies, we plan to conduct a further set of experiments with a lower yeast concentration. in conclusion, a significant difference in partial inhibition zone sizes appeared only when four expositions on one petri dish were carried out, whereas no significant difference was observed for two subsequent expositions. a possible explanation of this effect is the action of the subsequent remote exposition(s), when minute amounts of scattered active particles act on the previously exposed areas; the influence of diffused ozone may also play a role.

keywords: cold atmospheric plasma decontamination, candida albicans, direct and indirect treatment.

1. introduction

a novel discipline, plasma medicine, including the topics of bio-decontamination and sterilization, is developing rapidly nowadays. there are many papers concerning the interaction between cold atmospheric pressure plasma (cap) and biological materials (bacteria, yeasts, spores, human and/or animal tissue). it is widely accepted that uv radiation, charged particles, and reactive nitrogen and oxygen species are the main plasma-generated factors that affect biological materials [4, 3]. nevertheless, there is still a lot of ambiguity about the exact inactivation mechanisms, the dominant factors and their mutual effects, and the influence of uv radiation is the least explored factor. on the one hand, it is well known that uv radiation has a germicidal effect and may cause double-strand dna breaks. on the other hand, some papers report only a small or marginal impact of the uv radiation generated in cap, and some even report that uv-free plasmas have germicidal properties; however, the synergy of uv radiation and reactive species gives better decontamination results [9, 5, 6]. there are also efforts to separate and study the particular factors, or at least to separate direct and indirect cap treatment. during direct treatment, the contaminated object is in direct contact with the plasma active (light-emitting) zone, while during indirect treatment the object is in contact only with the long-living products of plasma chemical reactions [10, 2]. current experiments assume mutual or separated influence of the individual factors.
in our experiments we focused on separating the near and far decontamination effects, using several expositions with different spatial and temporal arrangements. common antimicrobial susceptibility tests (the antimicrobial gradient method, the disc diffusion test) are often realized for several concentrations of antibiotics on one petri dish [1]. using only one petri dish is beneficial, as it guarantees the same conditions (inoculum concentration, cultivation medium properties, incubation period, etc.), as well as saving cost and labor. it is important to know whether similar susceptibility test arrangements can be realized with cap treatment instead of antibiotic application.

the effect of the particular factors and components of cap decreases with distance. according to this, we can distinguish near and far treatment. near treatment means direct interaction between the cap and the treated surface, whereas far treatment means interaction with long-living or highly mobile species outside the active plasma region. the near and far treatment effects cannot be distinguished when performing only one exposition on a single petri dish, while performing subsequent expositions at different places on a single petri dish, with distances between the exposed areas greater than the inhibition zone of a single exposition, enables us to study the superposition of near and far treatment. the experiments presented in this paper aim to investigate the mutual influence of several expositions on a single petri dish. the impact of the chosen treatment procedure (near treatment followed by the far one, and vice versa) also has to be considered. it seems that there is a distinguishable effect of the chosen order, so near and far treatments are not commutable in decontamination processes.

2. samples

candida albicans yeast was used as a model microorganism. c. albicans occurs as a commensal in the human digestive tract and other parts of the human body. in most cases it is non-pathogenic; potential overgrowth can be observed if the immune system of the host is weakened, for example in young, old, hiv-infected or hospitalized people. this microorganism can be grown in laboratory conditions without any considerable difficulties. sabouraud agar, developed for fungi and yeast cultivation, was used as the growth medium. a c. albicans suspension in sterile water was spilled onto the agar surface, creating a more or less uniform layer. the resulting concentration of c. albicans in this layer was estimated to be of the order of 10⁸ cfu cm⁻² (colony forming units per square cm). the inoculum concentration and colony confluence are not crucial for the inhibition zone areas, as described in [7] or [8]. from the practical point of view, higher inoculum concentrations are preferable, as they provide sharp zone edges, whereas lower ones yield zones that are not sharply delineated and are thus hard to measure.

3. experiments

3.1. apparatus

low-temperature plasma was generated by a dc negative corona discharge in a simple point-to-plane electrode system (figs. 1, 2). the high-voltage power supply utes ht 2103 and a ballast resistor (discharge stabilization, 10 mω) were used. the plane electrode was realized by the sabouraud agar in a petri dish (diameter 9 cm), which is an ion-conducting medium; it was connected to the electric circuit by an immersed metallic strip. the point electrode was realized by a sharpened brass rod. the polarity of the point electrode was negative, i.e. a negative corona discharge was ignited in the electrode gap.
figure 1. experiment scheme: ammeter a, voltmeter v, power supply hv, ballast resistor 10 mω, point metallic electrode 1 and agar surface as an electrically conductive plane electrode 2.

figure 2. point electrode 1, petri dish with ion-conductive medium 2.

the petri dish was placed on a rotating stand, enabling us to perform subsequent expositions at a fixed radial distance on the same petri dish.

3.2. exposition

presuming that multiple expositions affect each other, two types of experiments with variable distance between expositions were arranged:

(1.) two subsequent expositions on one petri dish (fig. 3a). five different distances between opposite exposition locations were used.

(2.) four subsequent expositions on one petri dish (fig. 3b).

the following experimental conditions were used: the gap between the electrodes was 7 mm, the exposition time was 5 min, the voltage was set to 10 kv (measured on the power supply) and the resulting electric current was approximately 50 µa. the current varied due to changes in the gap size, caused by the ion wind deforming the agar surface. the exposed dishes were cultivated overnight at 37 °c. the area of the inhibition zones was measured manually using a millimeter paper grid.
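these operating conditions imply only a very small electrical power delivered to the discharge, as the short python estimate below shows (a sketch: it assumes the 10 kv reading is taken at the power supply, so that the ballast resistor, read here as 10 MΩ, drops part of the voltage):

```python
U_SUPPLY = 10e3      # supply voltage [V], measured on the power supply
I = 50e-6            # discharge current [A], approximate
R_BALLAST = 10e6     # ballast resistor [ohm]; 10 Mohm is our reading
T_EXPO = 5 * 60      # exposition time [s]

u_ballast = I * R_BALLAST        # voltage dropped on the ballast resistor
u_gap = U_SUPPLY - u_ballast     # voltage actually across the electrode gap
p_gap = u_gap * I                # electrical power delivered to the discharge
energy = p_gap * T_EXPO          # energy per exposition

print(f"gap voltage ~ {u_gap/1e3:.1f} kV")   # ~9.5 kV
print(f"gap power   ~ {p_gap*1e3:.0f} mW")   # ~475 mW
print(f"dose energy ~ {energy:.0f} J")       # ~142 J per 5 min exposition
```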
4. results and discussion

the inhibition zones were compared for both experiments and are presented in the following paragraphs. the experiments presented here were repeated three times, but the run-to-run variation was smaller than the uncertainty of the inhibition zone evaluation. the combined uncertainty was therefore dominated by the type b uncertainty, and it is not specified in the graphs.

figure 3. exposition sequence scheme; the numbers indicate the order of exposition.

figure 4. the inhibition zone sizes for areas 1 and 2 (inhibition zone area in cm² versus the distance between zones).

4.1. two expositions interaction

the differences between the two inhibition zone areas were not significant, but a decreasing tendency with increasing distance between the exposed areas was still observed (fig. 4). the inhibition zone size was determined using computer-aided filling of the area by a regular grid. sporadic colonies appeared inside the zones and at their edges, indicating incomplete c. albicans inhibition; the edge of continual growth was considered as the border of the particular inhibition zone. we used distances of 1, 2, 3, 4 and 5 cm for the experiment. overlapping of the inhibition zones was observed for the distance of 1 cm, where it was difficult to distinguish the particular inhibition zones, therefore this sample was excluded from the data processing.

4.2. four expositions with symmetric clockwise sequence order

this experiment comprises four subsequent expositions, for distances of 1, 2, 3, 4 and 5 cm between opposite exposed sites. overlapping of the inhibition zones was observed for small distances (1 and 2 cm), therefore these samples were excluded from the data processing (fig. 5). in this geometric arrangement the first exposed zone is affected by the far treatment of the second and the fourth exposition. the third exposition is farther away, so its effect is expected to be smaller; similar considerations apply to the other expositions.

figure 5. overlapping of inhibition zones at small distances between exposition sites; left: 2 cm, right: 4 cm distance between opposite expositions.

figure 6. the inhibition zone sizes for various distances between expositions (inhibition zone area in cm² versus the distance between zones, for zones 1 to 4).

the following mechanisms are present in both experiments: near treatment, far pre-treatment and far post-treatment. we suppose that all of them contribute to the overall decontamination at a certain rate. it can be seen from both experiments that the first exposed zone was always the biggest one on the petri dish. this indicates that the far pre-treatment is not as important as the far post-treatment of a previously exposed area. the far post-treatment could be mediated by stable particles produced by the discharge; ozone o3 seems to be the most probable candidate. an additional effect of uv radiation may also be considerable. nevertheless, the concentration of ozone was not measured here; concerning its production, see, e.g., [2]. the same holds for uv radiation: we had no method at our disposal to measure its intensity.

5. conclusions

the growth of the yeast could be inhibited by both near and far discharge exposition. the aim of this study was to test serially the gradual change of the candida albicans growth inhibition zone size along an increasing sequence of discharge expositions. the first exposed zone was the biggest one on the petri dish. it is obvious that the near treatment is the most important; however, the impact of the far post-treatment is also significant: the inhibition zone area can be enlarged by up to tens of percent by far post-treatment. we can also conclude that the far and near effects are not commutative. it seems that the far pre-treatment affects the yeasts only a little. the near treatment by high-energy species naturally has the biggest impact on the decontamination; however, the low-energy species which reach the previously near-treated area seem to be sufficient to finish the candida albicans cells off. this experiment also showed us that using a single petri dish for several expositions and subsequently comparing the inhibition areas may introduce methodical errors into the results.

acknowledgements

this work was supported by the czech technical university in prague grant sgs13/194/ohk3/3t/13 and charles university in prague research programs prvouk-p25/lf1/2 and svv-2012-264506.

references

[1] j. jorgensen, m. ferraro. antimicrobial susceptibility testing: a review of general principles and contemporary practices. clinical infectious diseases 49(11):1749–55, 2009.
[2] j. julák, v. scholtz, s. kotúčová, o. janoušková. the persistent microbicidal effect in water exposed to the corona discharge. physica medica 28(3):230–239, 2012.
[3] m. kong, g. kroesen, g. morfill, et al. plasma medicine: an introductory review. new journal of physics 11(11):115012, 2009.
[4] m. laroussi. evaluation of the roles of reactive species, heat, and uv radiation in the inactivation of bacterial cells by air plasmas at atmospheric pressure. international journal of mass spectrometry 233(1–3):81–86, 2004.
[5] z. machala, l. chladekova, m. pelach. plasma agents in bio-decontamination by dc discharges in atmospheric air. journal of physics d: applied physics 43(22):222001, 2010.
[6] s. schneider, j. lackmann, d. ellerweg, et al. the role of vuv radiation in the inactivation of bacteria with an atmospheric pressure plasma jet. plasma processes and polymers 9(6):561–568, 2012.
[7] v. scholtz, j. julák, v. kříha. the microbicidal effect of low-temperature plasma generated by corona discharge: comparison of various microorganisms on an agar surface or in aqueous suspension. plasma process polym 7(3–4):237–243, 2010.
[8] v. scholtz, j. julák, v. kříha, et al. decontamination effects of low-temperature plasma generated by corona discharge. part ii: new insights. prague med rep 108(2):128–146, 2007.
[9] t. shimizu, t. nosenko, g. e. morfill, et al. characterization of low-temperature microwave plasma treatment with and without uv light for disinfection. plasma processes and polymers 7(3–4):288–293, 2010.
[10] e. stoffels, y. sakiyama, d. graves. cold atmospheric plasma: charged species and their interactions with cells and tissues. ieee transactions on plasma science 36(4):1441–1457, 2008.

acta polytechnica 53(supplement):687–692, 2013, doi:10.14311/ap.2013.53.0687
© czech technical university in prague, 2013, available online at http://ojs.cvut.cz/ojs/index.php/ap

investigating the ep,i – eiso correlation

lorenzo amati, simone dichiara
a istituto nazionale di astrofisica – iasf bologna, via p. gobetti 101, i-40129 bologna, italy
b university of ferrara, department of physics, via saragat 1, 44100 ferrara, italy
∗ corresponding author: amati@iasfbo.inaf.it

abstract. the correlation between the spectral peak photon energy, ep, and the radiated energy or luminosity (i.e., the "amati relation" and other correlations derived from it) is one of the central and most debated topics in grb astrophysics, with implications for the physics and geometry of the prompt emission, the identification and understanding of various classes of grbs (short/long, xrfs, sub-energetic), and grb cosmology. fermi is exceptionally suited to provide, also in conjunction with swift observations, a significant step forward in this field of research. indeed, one of the main goals of fermi/gbm is to make accurate measurements of ep, by exploiting its unprecedentedly broad energy band from ∼ 8 kev to ∼ 30 mev; in addition, for a small fraction of grbs, the lat can extend the spectral measurements up to the gev energy range, thus allowing a reliable estimate of the bolometric radiated energy/luminosity. we provide a review, an update and a discussion of the impact of fermi observations in the investigation, understanding and testing of the ep,i–eiso ("amati") relation.

keywords: gamma-rays: observations, gamma-rays: bursts.

1. introduction

despite the huge observational advances and theoretical efforts of the last 20 years, the gamma-ray burst (grb) phenomenon is still far from being fully understood.
open issues include the fraction and peculiar characteristics (mass, rotational speed, metallicity, core-collapse physics) of the highly energetic type ic sne ("hyper-novae") producing long grbs; the progenitors of short grbs (coalescence of ns–ns or bh–ns binary systems, magnetars, etc.); the mechanisms through which the gravitational energy of the central engine is converted into an ultra-relativistically expanding plasma, and the kinetic (or magnetic) energy of this "fireball" (or "firejet") is converted into x- and gamma-rays; the explanation of the early afterglow phenomenology (steep decay, plateau, flares) and of the properties of the gev emission; the degree of collimation of the emission and the structure of the jet; and other topics. after several years of deep investigation of the multi-wavelength properties of the early and late afterglow emission, following the discoveries of bepposax and swift, the focus of the community is returning to the physics of the prompt emission, prompted by the very high-energy measurements by fermi and based on refined re-analyses of the batse, bepposax, hete-2 and konus/wind data.

in this respect, one of the most intriguing and most investigated pieces of observational evidence is the correlation between the photon energy at which the νfν spectra of grbs peak, ep,i, and their radiated energy, eiso (or other grb intensity indicators, e.g. the average luminosity or the peak luminosity). indeed, this correlation can provide useful constraints on the models for the prompt emission physics and geometry. it can also be used to identify and understand the different sub-classes of grbs (short/long, sub-energetic, x-ray flashes) and to standardize these sources for cosmological investigations. thanks to its unprecedented capability to measure the grb prompt emission from ∼ 8 kev up to ∼ 30 mev for hundreds of grbs, and up to tens of gev for a few grbs per year, the fermi satellite is making a major contribution to this field of research. in this paper, after reviewing the basic properties, implications and uses of the ep,i–eiso correlation, we show how fermi measurements are confirming and extending it, providing further evidence of its reliability.

2. the ep,i – eiso correlation in grbs

2.1. observations

grb spectra are typically described by the empirical smoothed broken power law introduced by [1], with parameters α (low-energy index), β (high-energy index) and e0 (break energy). in terms of νfν, they show a peak at the photon energy ep = e0 × (2 + α). this quantity is a relevant parameter in most models of grb prompt emission, see, e.g., [2]. presently, more than 250 grbs have a measured redshift, and about 40 ÷ 50 % of them have well-measured spectra. from the measured spectrum and the measured redshift it is then possible to compute two fundamental quantities in the cosmological rest frame of these sources: the intrinsic spectral peak energy, ep,i, and the radiated energy in the assumption of isotropic emission, eiso. both ep,i and eiso span several orders of magnitude, with a distribution that can be described by a gaussian plus a low-energy tail (intrinsic xrfs and sub-energetic events).

figure 1. location of long grbs in the ep,i–eiso plane as of july 2011. fermi grbs are marked in red (gbm detection) and blue (gbm + lat detection). the continuous line and the dotted lines show the best-fit power law of the ep,i–eiso correlation and its ±2σ limits, respectively, as determined by amati et al. [6].
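the position of the νfν peak can be checked numerically from the band function itself; the python sketch below evaluates the standard band form (parameter values are purely illustrative) and confirms that the νfν maximum falls at ep = e0 (2 + α):

```python
import numpy as np

def band(e_kev, alpha=-1.0, beta=-2.5, e0=200.0, norm=1.0):
    """band et al. (1993) photon spectrum n(e) [photons/(cm^2 s keV)];
    the parameter values used here are illustrative only."""
    e = np.asarray(e_kev, dtype=float)
    e_break = (alpha - beta) * e0
    low = norm * (e / 100.0) ** alpha * np.exp(-e / e0)
    high = (norm * (e_break / 100.0) ** (alpha - beta)
            * np.exp(beta - alpha) * (e / 100.0) ** beta)
    return np.where(e < e_break, low, high)

e = np.logspace(0, 4, 20000)          # 1 keV .. 10 MeV
nufnu = e ** 2 * band(e)              # nu F_nu in arbitrary units
ep_numeric = e[np.argmax(nufnu)]
ep_analytic = (2 + (-1.0)) * 200.0    # e_p = (2 + alpha) * e0 = 200 keV
print(ep_numeric, ep_analytic)        # both ~200 keV
```

for typical long-grb values of β < −2 the νfν peak falls on the low-energy branch, which is why the analytic expression ep = (2 + α) e0 applies.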
in 2002, based on a small sample of bepposax grbs with known redshift, it was discovered [3] that a very significant correlation exists between ep,i and eiso (fig. 1). the ep,i–eiso correlation for grbs with known redshift was then confirmed and extended in the subsequent years by the measurements of all other grb detectors with spectral capabilities [4–6]. these include the ep,i values for swift grbs measured by konus/wind, suzaku/wam, fermi/gbm and swift/bat itself (the latter only when ep is inside or close to the 15 ÷ 150 kev band). despite its strength, the correlation is characterized by a significant dispersion of the data around the best-fit power law; the distribution of the residuals can be fitted by a gaussian with σ(log ep,i) ∼ 0.2. this extra-poissonian scatter of the data can be quantified by performing a fit with a maximum likelihood method which accounts for the sample variance and for the uncertainties on both the x and y quantities [7]. this method, with ep,i expressed in kev and eiso in units of 10⁵² erg, gives an extrinsic scatter σext(log ep,i) of 0.19 ± 0.02 and an index of 0.54 ± 0.03 [6, 8]. in recent years, definite evidence has been found that short grbs do not follow the ep,i–eiso correlation, thus showing that the ep,i–eiso plane can be used as a tool for distinguishing between short and long events and for obtaining clues on their different nature [5, 9]. finally, the only long grb outlier to the correlation is grb 980425, an event which is peculiar in several respects: it has a very low redshift (z = 0.0085), it is sub-energetic, and it is inconsistent with most other grb properties.
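a maximum likelihood fit of this kind, following [7], can be sketched compactly in python (a sketch with toy data: only the slope and the extrinsic scatter values are taken from the fit results quoted above; the intercept and the simulated sample are illustrative):

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_likelihood(params, x, y, sx, sy):
    """likelihood in the style of [7] for y = m*x + c with extrinsic
    scatter sigma_ext; x = log10(Eiso/1e52 erg), y = log10(Ep,i/keV)."""
    m, c, sig_ext = params
    var = sig_ext**2 + sy**2 + (m * sx)**2
    return 0.5 * np.sum(np.log(var) + (y - m * x - c)**2 / var)

# toy data standing in for a real grb sample
rng = np.random.default_rng(1)
x = rng.uniform(-2, 2, 100)
y = 0.54 * x + 2.0 + rng.normal(0, 0.19, 100)   # slope/scatter from the text
sx, sy = np.full(100, 0.05), np.full(100, 0.05)
fit = minimize(neg_log_likelihood, x0=[0.5, 2.0, 0.2],
               args=(x, y, sx, sy), bounds=[(0, 2), (0, 4), (1e-3, 1)])
print(fit.x)   # recovered (m, c, sigma_ext), close to (0.54, 2.0, 0.19)
```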
2.2. implications and uses

the physics of the prompt emission of grbs is still not settled, and various scenarios have been proposed: synchrotron emission in internal shocks (the ssm), inverse-compton (ic) dominated internal shocks, external shocks, photospheric-emission dominated models, kinetic-energy or poynting-flux dominated fireballs, and more. the existence and properties of the ep,i–eiso correlation can be used to discriminate among different models and to constrain the physical parameters within each model [2]. in addition, the extension of the correlation over several orders of magnitude, from the brightest events to the softest xrfs, provides challenging evidence for models in which the observed properties of grbs depend strongly on the jet structure and the viewing angle [4, 10]. an ep,i–eiso correlation with properties consistent with those observed is also predicted by alternative scenarios like the "cannonball" model [11], the "fireshell" model [12] and the "precessing jet" model [13].

as mentioned above, the ep,i–eiso plane is also a useful tool for identifying sub-classes of grbs, first of all short and long grbs. only very recently have redshift estimates for short grbs become available, thanks to the observational progress. the estimates and limits on ep,i and eiso for short grbs show that they are inconsistent with the ep,i–eiso correlation holding for long grbs. in addition, a long weak soft emission following the short spike has been observed in some cases; intriguingly, this component is consistent with the correlation, showing that the ep,i–eiso plane can be used to identify and understand not only short and long grbs but also "hybrid" grbs.

another issue concerns sub-energetic grbs. indeed, the only long grb not following the correlation is grb 980425, which is not only the prototype event of the grb/sn connection but is also the closest grb (z = 0.0085) and a very sub-energetic event (eiso ∼ 10⁴⁸ erg). moreover, grb 031203, the case most similar to grb 980425, being very close (z = 0.105), associated with a bright type ic sn (sn2003lw) and sub-energetic, may also be inconsistent with the correlation (however, only an upper limit to ep,i is available for this burst). the most common explanation for the (apparent?) sub-energetic nature of grb 980425 and grb 031203 and for their violation of the ep,i–eiso correlation is that they are "normal" events seen very off-axis [10]. grb 060218, very close (z = 0.033, second only to grb 980425), with a prominent association with sn2006aj and a very low eiso (6 × 10⁴⁹ erg), is very similar to grb 980425 and grb 031203 but, contrary to these two events, it is consistent with the ep,i–eiso correlation. this provides evidence that it is a truly (and not apparently) sub-energetic grb, pointing to the likely existence of a population of under-luminous grbs detectable only in the local universe.

finally, the ep,i–eiso correlation can also provide clues to a better understanding of the grb–sn connection. except for the peculiar sub-energetic grbs 980425 and 031203, associated with sn1998bw and sn2003lw, respectively, grb 060218 and the other grbs with the firmest evidence of an sn association are consistent with the ep,i–eiso correlation. in particular, the location of these grbs in the ep,i–eiso plane seems to be independent of the magnitude of the associated sn. furthermore, grb 060614, a long grb with a very deep lower limit to the magnitude of any associated sn, is also consistent with the correlation. these pieces of evidence support the hypothesis that the grb properties are not, or are only weakly, linked to those of the sn explosions which produced them. recently, swift detected an x-ray transient associated with sn 2008d at z = 0.0064, showing a light curve and duration similar to grb 060218. the properties of this event gave rise to a debate: are we facing a very soft/weak grb or an sn shock breakout? based on swift xrt and uvot data, it can be found that the peak energy limits and energetics of this transient (also named xrf 080109) are consistent with a very-low-energy extension of the ep,i–eiso correlation. this provides evidence that the transient may really be a very soft and weak grb, thus confirming the existence of a population of sub-energetic grbs.

2.3. grb cosmology

an interesting aspect of the ep,i–eiso correlation is that it can be used to infer limits or ranges of redshift for long grbs. redshift estimates, based on optical spectroscopy, have become available only for a small fraction of the grbs that have occurred in the last 15 years. pseudo-redshift estimates for the large number of grbs without measured redshift would provide us with fundamental insights into the grb luminosity function, the star formation rate evolution up to z > 6, etc. in addition, in some cases the optical measurements provide more than one possible value for the redshift.
the most straightforward method for using the ep,i–eiso correlation for pseudo-redshift estimates, or for disentangling different possible redshifts from optical spectroscopy/photometry, is to study the track in the ep,i–eiso plane as a function of z, i.e., to compute, based on the measured fluence and spectral parameters, the values of ep,i and eiso for each possible value of the redshift and see for which range of redshift the grb would be consistent with the correlation. this method often does not provide precise z estimates, but it is nevertheless useful for low-z grbs and, in general, when combined with optical measurements. an outstanding case is that of grb 090429b, for which photometric analysis pointed to a redshift of ∼ 9.4, but also left a very small probability that the grb was at very low redshift [14]. the consistency of this grb with the ep,i–eiso correlation only for z > 1 further supported the very high redshift estimate from the photometric analysis.

however, one of the most intriguing, debated and investigated issues about the ep,i–eiso correlation, and the other spectrum–energy correlations derived from it, is their use for grb cosmology. all grbs with measured redshift (∼ 250 up to now, including a few short grbs) lie at cosmological distances (z ∼ 0.033 ÷ 9.4), except for the peculiar grb 980425 at z = 0.0085. this fact, combined with the huge radiated power of these events, would make them very powerful cosmological probes. nevertheless, the isotropic luminosities and radiated energies of grbs span several orders of magnitude, so these sources are (unfortunately) not standard candles. given that it links a quantity, ep,i, which can be derived from the observables based only on the redshift, and a quantity, eiso, whose computation requires the assumption of a cosmology, the ep,i–eiso correlation can, in principle, be used to "standardize" grbs. indeed, it can be found [8, 15] that a fraction of the extrinsic scatter of the correlation is due to the cosmological parameters used to compute eiso. in particular, assuming, e.g., a standard λcdm flat universe, the scatter is minimized for ωm ∼ 0.25 ÷ 0.3, in very good agreement with the estimates coming from other cosmological probes (sn ia, cmb, bao, clusters). more generally, this simple analysis provides evidence, independent from sn ia and other cosmological probes, that, if we live in a flat λcdm universe, as indicated by the cmb analysis, then ωm is lower than 1. by using a maximum likelihood method, the extrinsic scatter can be parametrized and quantified. for example, [8] found ωm constrained to 0.04 ÷ 0.43 (68 %) and 0.02 ÷ 0.71 (90 %) for a flat λcdm universe (ωm = 1 being excluded at 99.9 % c.l.), and that significant constraints on both ωm and ωλ are expected from sample enrichment. indeed, the analysis of an updated sample of 109 grbs shows significant improvements in the constraints on ωm (0.06 ÷ 0.36 at 68 % and 0.03 ÷ 0.59 at 90 %) with respect to the sample of 70 grbs (0.04 ÷ 0.43 at 68 % and 0.02 ÷ 0.71 at 90 %), providing evidence of the reliability and of the prospects of using the ep,i–eiso correlation for estimating cosmological parameters.
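for illustration, the redshift track used above for pseudo-redshift estimates, and the cosmology dependence of eiso exploited in this section, can be computed with a few lines of python (a sketch: the fluence, the observed peak energy and the cosmological parameters are illustrative, the luminosity distance is obtained by direct numerical integration, and the spectral k-correction needed for a strictly bolometric eiso is omitted for brevity):

```python
import numpy as np

C_KM_S = 2.99792458e5   # speed of light [km/s]
MPC_CM = 3.0857e24      # one megaparsec in cm

def lum_distance_cm(z, h0=70.0, om=0.3):
    """luminosity distance in a flat lambda-cdm universe, by numerical
    integration of 1/E(z); h0 in km/s/Mpc."""
    zs = np.linspace(0.0, z, 2000)
    ez = np.sqrt(om * (1 + zs) ** 3 + (1 - om))
    dc = C_KM_S / h0 * np.trapz(1.0 / ez, zs)   # comoving distance [Mpc]
    return (1 + z) * dc * MPC_CM

def track(ep_obs_kev, fluence_erg_cm2, zgrid):
    """rest-frame (ep_i [keV], eiso [erg]) along a grid of trial redshifts,
    using ep_i = ep*(1+z) and eiso = 4*pi*dl^2*fluence/(1+z)."""
    ep_i = ep_obs_kev * (1 + zgrid)
    eiso = np.array([4 * np.pi * lum_distance_cm(z) ** 2 * fluence_erg_cm2
                     / (1 + z) for z in zgrid])
    return ep_i, eiso

zgrid = np.linspace(0.1, 9.4, 50)
ep_i, eiso = track(ep_obs_kev=100.0, fluence_erg_cm2=1e-5, zgrid=zgrid)
# each (ep_i, eiso) point can then be compared with the best-fit relation
# and its +-2 sigma band to find the redshift range allowed by the correlation
```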
2.4. reliability

different grb detectors are characterized by different detection and spectroscopy sensitivities as functions of the grb intensity and spectrum, see, e.g., [16]. this may introduce relevant selection effects/biases into the observed ep,i–eiso and other correlations. in the past, there were claims that a high fraction (70 ÷ 90 %) of batse grbs without redshift would be inconsistent with the correlation for any redshift [17]. however, this would imply implausibly large selection effects in the sample of grbs with known redshift. in addition, other authors [9, 18, 19] have shown that most batse grbs with unknown redshift are consistent with the ep,i–eiso correlation. moreover, [6] showed that the normalization of the correlation varies only marginally when using grbs measured by individual instruments with different sensitivities and energy bands, which provides further evidence that the instrumental limits do not have a significant impact.

selection effects in the process leading to the redshift estimate may also play a role. thanks to its capability of providing quick and accurate localizations of grbs, swift is reducing the selection effects in the sample of grbs with measured redshift; swift grbs are thus expected to provide a robust test of the ep,i–eiso correlation. considering the ep,i values of swift grbs measured by konus/wind, suzaku/wam, fermi/gbm and swift/bat (the latter only when ep is inside or close to the 15 ÷ 150 kev band), together with the values provided by the swift/bat team, it can be found that they are consistent with the ep,i–eiso correlation. finally, based on time-resolved analysis of batse, bepposax and fermi grbs, it was found that the ep,i–liso correlation also holds within a good fraction of individual grbs [20], which is robust evidence of a physical origin and also provides clues to its explanation.

figure 2. location of fermi grbs in the ep–fluence plane based on the analysis reported by nava et al. [21]. the left and right panels show those grbs for which the spectral parameters and the fluence were derived by fitting the data with a cut-off power law and with the band function, respectively. the grbs have been divided according to their durations: short (red points), intermediate (cyan points) and long (black points). the red and blue lines represent the limits above which a grb would be inconsistent with the ep,i–eiso correlation at 2σ or 3σ, respectively, for any redshift.

3. the fermi contribution

the key features of fermi for the study of grbs are: the detection, arcmin localization and study of grbs in the gev energy range through the fermi/lat instrument, a dramatic improvement with respect to cgro/egret; and the detection, rough localization (within a few degrees) and accurate determination of the shape of the spectral continuum of the prompt emission of grbs from 8 kev up to 30 mev through the fermi/gbm instrument. the investigation of the ep,i–eiso correlation with fermi can thus proceed along the following lines: a) locating in the ep,i–eiso plane the grbs with known z (most of which were detected and localized by swift) and with ep accurately measured by the gbm (a direct test); b) testing the ep,i–eiso correlation by analyzing the location of hundreds of gbm grbs in the ep–fluence plane (as was done with batse grbs); c) exploiting the joint analysis of gbm and lat spectra to investigate the impact of extending the spectral–energetic analysis from 10 mev up to > 1 gev.

3.1. fermi grbs in the ep,i – eiso plane

up to now, the gbm has detected several hundreds of grbs, providing accurate ep estimates for ∼ 90 % of them. however, only ∼ 15 % of these events were simultaneously detected by swift, leading to a final ∼ 5 % with both ep and z estimates.
the grb fluences and spectral data of fermi grbs are presently available from four main data sets: the gcns (preliminary results for most grbs, by the fermi collaboration); [21] (430 grbs); [22] (52 bright grbs, by the fermi collaboration); and [23] (32 grbs with known redshift, by the fermi collaboration). based on the gcn data, [6] showed that all fermi/gbm long grbs with known z are fully consistent with the ep,i–eiso correlation as determined by previous experiments (fig. 1), a further confirmation that instrumental effects are not relevant. in addition, the analysis of the fermi/gbm grb 090510 further confirms that short grbs do not follow the correlation. very recently, [23] (an official fermi team analysis) performed a refined analysis of the updated sample of fermi/gbm grbs with known z, confirming that the long ones are consistent with the ep,i–eiso correlation, while the short ones are not. the slightly higher normalization and dispersion of the ep,i–eiso correlation found by them with respect to previous analyses is possibly due to the use, for some grbs, of the cut-off power-law model instead of the band model, which leads to an overestimate of ep,i and an underestimate of eiso.

figure 3. location in the ep–fluence plane of grbs simulated by assuming the existence of the ep,i–eiso correlation and the sensitivity limits of fermi/gbm. the right panel also includes in the simulations the effect of the spectral evolution ep,i ∝ l^0.5 observed for most grbs. the red and blue lines represent the limits above which a grb would be inconsistent with the ep,i–eiso correlation at 2σ or 3σ, respectively, for any redshift.

3.2. fermi grbs without redshift

as mentioned above, besides the small sample of grbs with measured redshift, fermi/gbm is providing a large sample of hundreds of grbs without redshift but with accurate measurements of the spectral peak energy ep and of the fluence f. this sample can be used to test the reliability of the ep,i–eiso correlation, similarly to what was done in the past with batse grbs, but taking advantage of the better accuracy of the spectral parameters allowed by the unprecedentedly wide energy band of this instrument (8 kev ÷ 30 mev). given that we are considering grbs without measured redshift, this analysis requires a conversion from the ep,i–eiso correlation in the cosmological rest-frame plane to an ep–f correlation in the observer plane. considering the ep,i–eiso correlation in the form ep,i = k × eiso^m, and taking into account that ep,i = ep × (1 + z) and eiso = f × 4πdl²/(1 + z), where dl is the luminosity distance to the source, we obtain ep = f(z) × k × f^m, where f(z) = (4πdl²)^m/(1 + z)^(m+1). given that f(z) shows a maximum at z ∼ 4, we can convert the best fit and the 2σ or 3σ upper limits of the ep,i–eiso correlation into lines in the logarithmic ep–f plane above which a grb would be inconsistent with the correlation, at the corresponding confidence level, for any redshift (see figs. 2 and 3).
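a python sketch of these limit lines is given below (illustrative only: the index m = 0.54 is taken from the fit quoted in section 2.1, while the normalization k and the width of the 2σ band are placeholders to be taken from the actual fit):

```python
import numpy as np

C_KM_S, MPC_CM = 2.99792458e5, 3.0857e24

def lum_distance_cm(z, h0=70.0, om=0.3):
    """flat lambda-cdm luminosity distance [cm] by numerical integration."""
    zs = np.linspace(0.0, z, 2000)
    ez = np.sqrt(om * (1 + zs) ** 3 + (1 - om))
    return (1 + z) * C_KM_S / h0 * np.trapz(1.0 / ez, zs) * MPC_CM

M = 0.54              # correlation index from the fit quoted above
K_PLACEHOLDER = 100.0 # placeholder normalization: Ep,i [keV] at Eiso = 1e52 erg
TWO_SIGMA_DEX = 0.4   # placeholder half-width of the 2-sigma band [dex]

def f_of_z(z, m=M):
    """f(z) = (4 pi d_l^2)^m / (1 + z)^(m + 1), with eiso in 1e52 erg units."""
    dl = lum_distance_cm(z)
    return (4 * np.pi * dl ** 2 / 1e52) ** m / (1 + z) ** (m + 1)

# f(z) peaks near z ~ 4, so maximizing over a z grid gives, for each
# fluence, the highest ep still consistent with the correlation band
fmax = max(f_of_z(z) for z in np.linspace(0.05, 20.0, 400))

def ep_limit_kev(fluence_erg_cm2, n_sigma_dex=TWO_SIGMA_DEX):
    """observer-frame ep above which a burst of the given fluence would be
    inconsistent with the correlation at ~2 sigma for any redshift."""
    return fmax * K_PLACEHOLDER * fluence_erg_cm2 ** M * 10 ** n_sigma_dex

print(ep_limit_kev(1e-5))
```

maximising f(z) over the redshift grid reflects the fact, noted above, that f(z) peaks near z ∼ 4, so a burst lying above the resulting line cannot be reconciled with the correlation at any redshift.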
we applied the above method to fermi grbs, using the different data sets described above for the spectral parameters and the fluence. in all cases we found that most (90 ÷ 95 %) of the long grbs are potentially consistent with the ep,i–eiso correlation, whereas most short ones are not. in addition, we found that, when considering only those grbs with well-measured spectral parameters and fluence, properly modeled with the band function instead of the cut-off power law, and with integration times not shorter than 75 % of the total duration of the event, all long fermi grbs are potentially consistent with the ep,i–eiso correlation. as an example, fig. 2 shows the impact of the fitting model (band function vs. cut-off power law) for the sample of [21].

in addition, we performed monte carlo simulations aimed at evaluating the impact, on the location of grbs in the ep–fluence plane, of the combination of spectral evolution with detector sensitivity. indeed, time-resolved analysis of grbs generally shows that ep is correlated with the flux: the higher the flux, the higher the spectral peak energy. this means that if we detect only the brightest part of a grb, we will overestimate ep and underestimate the fluence. in order to evaluate this effect, we generated thousands of fake grbs by assuming the existence and the measured parameters of the ep,i–eiso correlation, accounting for the observed distributions of the relevant parameters (eiso, z, eiso vs. z). we also attributed to each grb a specific light curve and a spectral evolution of the type ep,i ∝ l^n, where n is between 0.4 and 0.6, as observed in several grbs [20]. then, by accounting for cosmological effects and for the fermi/gbm instrumental sensitivity as a function of ep,i [16, 24], we computed for each grb the ep and fluence that would be measured by the gbm. as can be seen in fig. 3, when the spectral evolution is accounted for, the observed small fraction of outliers in the ep–fluence plane is reproduced.

3.3. extremely energetic fermi grbs

thanks to its sensitivity and its huge energy band, fermi is detecting, and characterizing from ∼ 10 kev up to several gev, the sub-class of very energetic grbs also detected by the lat. as pointed out by [6], grb 080916c, the most energetic grb ever (eiso ∼ 10⁵⁵ erg in the 1 kev ÷ 10 gev band), and the other extremely energetic grbs 090323 and 090902b are fully consistent with the ep,i–eiso correlation (fig. 1). thus, fermi is providing a further extension of the correlation and evidence that the physics behind the x-ray and soft gamma-ray emission of extremely energetic events with gev emission is similar to the physics of normal events. in addition, based on the fact that grb 080916c showed a spectrum extending up to tens of gev without any excess or cut-off, [6] investigated the impact on the correlation of extending the energy range over which eiso is computed from the canonical upper bound of 10 mev up to 10 gev, finding no significant change in the slope and dispersion of the correlation. it also has to be cautioned that, given that for a few events the joint gbm plus lat spectral fitting shows an additional power-law component with respect to the simple band function, the extrapolation of the spectrum to energies higher than 10 mev is not straightforward.

4. conclusions

the ep,i–eiso correlation in long grbs is one of the most robust pieces of observational evidence in the grb field. the implications and uses of the ep,i–eiso correlation include: the physics and geometry of the prompt emission, the identification and understanding of sub-classes of grbs (e.g., short, sub-energetic), and grb cosmology. refined analyses of large samples of grbs without redshift, combined with simulations, further support the reliability of the correlation.
the fermi observatory is providing a significant contribution to investigations of the properties and reliability of this correlation. first of all, the gbm is significantly increasing the number and the accuracy of ep,i estimates for grbs with known redshift. it is found that gbm long grbs in the ep,i–eiso plane follow the same correlation as measured by previous/other instruments; as expected, short grbs do not. furthermore, the analysis of the spectral peak energies and fluences of hundreds of gbm grbs without redshift, combined with monte carlo simulations, confirms that the ep,i–eiso correlation is not significantly affected by instrumental effects. finally, extremely energetic fermi long grbs with significant gev emission detected by the lat (e.g., 080916c, 090323) further confirm and extend the correlation.

references
[1] band, d., et al.: 1993, apj 413, 281
[2] zhang, b., & meszaros, p.: 2002, apj 581, 1236
[3] amati, l., et al.: 2002, a&a 390, 81
[4] lamb, d. q., donaghy, t. q., & graziani, c.: 2005, apj 620, 355
[5] amati, l.: 2006, mnras 372, 233
[6] amati, l., frontera, f., & guidorzi, c.: 2009, a&a 508, 173
[7] d'agostini, g.: 2005, arxiv:physics/0511182
[8] amati, l., et al.: 2008, mnras 391, 577
[9] ghirlanda, g., et al.: 2008, mnras 387, 319
[10] yamazaki, r., ioka, k., & nakamura, t.: 2004, apj 606, l33
[11] dado, s., dar, a., & de rujula, a.: 2007, apj 663, 400
[12] guida, r., et al.: 2008, a&a 487, l37
[13] fargion, d., & d'armiento, d.: 2010, mem. sait 81, 44
[14] cucchiara, a., et al.: 2011, apj 736, 7
[15] amati, l.: 2012, int. j. mod. phys. s 12, 19
[16] band, d.: 2003, apj 588, 945
[17] nakar, e., & piran, t.: 2005, mnras 360, l73
[18] ghirlanda, g., ghisellini, g., & firmani, c.: 2005, mnras 361, l10
[19] bosnjak, z., et al.: 2008, mnras 384, 599
[20] ghirlanda, g., nava, l., & ghisellini, g.: 2010, a&a 511, 43
[21] nava, l., et al.: 2011, a&a 530, 21
[22] bissaldi, e., et al.: 2011, apj 733, 97
[23] gruber, d., et al.: 2011, a&a 528, a15
[24] band, d., et al.: 2009, apj 701, 1673

acta polytechnica 54(6):414–419, 2014, doi:10.14311/ap.2014.54.0414
© czech technical university in prague, 2014, available online at http://ojs.cvut.cz/ojs/index.php/ap

effective use of rotary furnace shell heat

julius lisuch, dusan dorcak, jan spisak
workplace for developing and implementing raw materials extraction and treatment, technical university of kosice, nemcovej 32, 040 02 kosice, slovak republic
∗ corresponding author: julius.lisuch@tuke.sk

abstract. a significant proportion of the total energy expenditure for the heat treatment of raw materials ends up as heat losses through the shell of the rotary furnace. currently, this waste heat is not used in any way and escapes into the environment. a controlled cooling system for the rotary furnace shell (ccsrf) is a new solution, integrated into the technological process, aimed at reducing the heat loss of the furnace shell. simulations and experiments show the effect of controlled cooling of the shell on the operation of the rotary furnace. the proposed solution is cost-effective and operationally undemanding.

keywords: rotary furnace, furnace, magnesite, control shell cooling, shell.
1. introduction

according to the heat balance, the heat losses in rotary furnaces account for as much as 60 % of the total heat supplied to the workspace. due to its temperature potential, the use of this heat for technological purposes is limited to low-temperature processes, such as the drying of materials. the potential of this heat for direct use in the firing process of magnesite clinker, in the form of reduced losses through the furnace shell and preheated combustion air, is much greater. such use will reduce fuel consumption for heating and will also increase the quality of the clinker burning. the aim of the analysis conducted here is to acquire knowledge of the system for controlled cooling of a rotary furnace shell, to determine its impact on the work of the rotary furnace, and to find optimal conditions for making use of it.

2. rotary furnace

rotary furnaces are continuously working furnaces. they are used in many branches of industry and in various technologies, such as the cement industry (cement clinker burning, lime production) and the refractory materials industry (magnesite clinker burning, dolomite clinker and fireclay shales). they are also used in the production of perlite and caustic magnesite for burning below 1000 °c, in preparing ores for the iron industry, burning pellets for blast furnaces and steel mills, roasting sulphide ores, oxidizing roasting, magnetizing roasting, drying, and for drying refractory clays and sands. rotary furnaces are cylindrical in shape, with a diameter of 1–7 m and a length of 20–120 m and more, mounted almost horizontally with a slight inclination of 2–6 % (fig. 1). the batch and the fuel are usually fed into the furnace from opposite ends, so the furnace operates continuously in countercurrent. as a result of the inclination of the furnace and its rotation, the batch passes through the furnace with a spiraling motion.

figure 1. rotary furnaces at smz jelsava, a.s. for magnesite thermal treatment [1].
figure 2. composition of the lining of a rotary furnace [3].

co-current movement of the combustion gases and the batch is used especially for drying. the furnace consists of a steel shell, a lining, supporting equipment, a propulsion device, towing heads (a charging installation, cold) and the cooler. in addition, the furnace may have a cross-poking device, heat exchangers, or special equipment for feeding solid and gaseous substances into the individual zones of the furnace through holes in the shell and lining. a preheating device for the batch is often used before the furnace. the shell is lined with fireclay, magnesia or high-alumina fittings or bricks (figure 2) [2].

one of the key aspects of the operation of industrial furnaces is heat exchange. heat exchange in the working space of the furnace is divided into:
• external heat exchange: heat transfer from the flue gas and the walls to the batch,
• movement of heat inside the batch.
both mechanisms are mutually dependent [4, 5].

figure 3. sankey diagram [4].

2.1. heat losses of rotary furnaces

heat release consists mainly of (figure 3):
• workspace losses,
• outgoing flue gas losses,
• heat losses through the furnace shell.
the heat balance is described by the equations [4–6]:
$$q_{tb} = q_{te}, \qquad (1)$$
$$q_{chh} + q_{hpa} + q_{hpf} + q_{exn} = q_{hw} + q_{uh} + q_{fl} + q_{ofg} + q_{fhl}, \qquad (2)$$
where $q_{tb}$ is the total brought heat [w], $q_{te}$ – total exhaust heat [w], $q_{chh}$ – chemical heat of fuel [w], $q_{hpa}$ – heat of preheated air [w], $q_{hpf}$ – heat of preheated fuel [w], $q_{exn}$ – heat involving exothermic and endothermic reactions in the furnace area [w], $q_{hw}$ – heat of the workspace [w], $q_{uh}$ – useful heat necessary for material heating [w], $q_{fl}$ – furnace workspace heat loss [w], $q_{ofg}$ – outgoing flue gas heat loss [w], $q_{fhl}$ – furnace shell heat loss [w].

the heat removed from the furnace can be divided into:
• useful heat – the heat that is needed for heating the batch (endothermic reactions, etc.);
• heat losses – consisting of heat loss:
  . through the furnace openings (doors),
  . accumulated in the lining,
  . by radiation through the openings,
  . in the cooling water,
  . carried out with the material (in the product),
  . through the furnace shell ($q_{pl}$), led through the walls into the furnace casing (furnace shell),
  . other losses (loss due to waste flue gases and solid waste heat loss, heat loss by incomplete combustion of fuel).

the flue gas waste heat loss ($q_{fgw}$ [w]) is the heat content of the flue gas leaving the furnace workspace:
$$q_{fgw} = v_{gas}\, c_{gas}\, t_{gas}, \qquad (3)$$
where $v_{gas}$ is the flue gas volume flow [m$^3$/h], $c_{gas}$ – mean specific heat of the flue gas [j m$^{-3}$ °c$^{-1}$], $t_{gas}$ – temperature of the flue gas leaving the furnace [°c].

the shell heat losses ($q_{shl}$ [w]) are determined by the relation [5], illustrated in figure 4:
$$q_{shl} = \frac{s\,(t_1 - t_2)}{\dfrac{1}{\alpha_1} + \dfrac{s_1}{\lambda_1} + \dfrac{1}{\alpha_2} + \dfrac{s_2}{\lambda_2}}, \qquad (4)$$
where $s$ is the wall surface [m$^2$], $\lambda_{1,2}$ – thermal conductivity [w m$^{-1}$ k$^{-1}$], $t_{1,2}$ – surface temperature [°c], $s_{1,2}$ – wall thickness of the individual layers [m].

figure 4. lining heat losses.
figure 5. the surface temperature of rotary furnace shell no. 2.

2.2. analysis of the current state

the heat losses through the shell of a rotary furnace are a significant proportion of the losses for rotary furnace no. 2 at smz jelsava, a.s.: 14 % of the total energy expenditure for the burning process. the waste heat from the rotary furnace shell has until now been used only as a secondary effect for water heating [7, 8], or for drying. a draft implementation of controlled cooling of the furnace was prepared to improve the operational, economic and environmental indicators in the production of sintered magnesite in rotary furnaces. the operating conditions were used for a simulation model (table 1).

table 1. inputs and outputs of the reference state of rp no. 2.
input material: batch 28130 kg, bulk density 1600 kg/m3
output material: product 9743 kg, flue dust 4620 kg
natural gas: volume 2774 m3, calorific value 34.325 mj/m3
output flue gases: volume 38653 m3, temperature 516 °c
primary air: proportion 30 %, volume 8572.2 m3
secondary air: proportion 70 %, volume 20001.8 m3

table 2. parameters of the simulation model.
fuel input: 2774 m3/h
layer thickness: 30 cm
flame length: 15 m
furnace performance: 9.81 t/h
furnace shell heat losses: 13.46 gj/h

on the basis of the surface temperature of the rotary furnace shell, it was possible to determine the quantity of heat escaping from the shell of the furnace (figure 5). the heat of the shell along the length of the furnace was inhomogeneous due to changes such as: changes in the lining thickness, a change in the lining material, and the thickness and quality of the batch [9–12].
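to make eq. (4) concrete, here is a small numerical sketch of the composite-wall expression. all numbers in it (shell area, temperatures, film coefficients, layer thicknesses and conductivities) are illustrative assumptions, not measured values for rp no. 2.

```python
# a minimal sketch of eq. (4): steady heat loss through a two-layer furnace
# wall (lining + steel shell) with convection on both sides. all parameter
# values below are assumptions chosen only for illustration.

def shell_heat_loss(area, t_in, t_out, alpha_in, alpha_out, layers):
    """layers: list of (thickness [m], conductivity [w/(m*k)]) tuples."""
    resistance = 1.0 / alpha_in + 1.0 / alpha_out          # convective films
    resistance += sum(s / lam for s, lam in layers)        # conductive layers
    return area * (t_in - t_out) / resistance              # heat flow [w]

q = shell_heat_loss(
    area=300.0,            # m^2 of shell surface (assumed)
    t_in=1500.0,           # °c, hot face of the lining (assumed)
    t_out=30.0,            # °c, ambient air (assumed)
    alpha_in=150.0,        # w/(m^2 k), gas-side film coefficient (assumed)
    alpha_out=20.0,        # w/(m^2 k), outer film coefficient (assumed)
    layers=[(0.20, 2.5),   # magnesia lining: 20 cm, ~2.5 w/(m k) (assumed)
            (0.03, 45.0)], # steel shell: 3 cm, ~45 w/(m k) (assumed)
)
print(f"shell heat loss ~ {q/1e6:.2f} mw")
```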
on the basis of a computer simulation (table 2), figure 6 shows the generated progression of the temperatures of the flue gas and of the material passing through the rotary furnace, along the length of the furnace. the values for the heat input into the process and the amounts of heat rejection are shown in table 3. the simulations (figure 7) show that the shell losses are 14 % of the total heat rejection. heat loss from the shell therefore has a great effect on the economic and environmental indicators.

figure 6. simulation of the reference state of rotary furnace no. 2.

table 3. thermal balance of rp no. 2.
detained heat [mj] ([%]): q material in 508 (0.53); q medium in 518 (0.54); q burner 95335 (98.84); sum 96361 (100).
escaped heat [mj] ([%]): q h2o evaporated 654 (0.68); q decomposition mgco3 31308 (32.45); q decomposition caco3 2498 (2.58); q decomposition feco3 453 (0.47); q medium out 35485 (36.78); q material out 12632 (13.09); q furnace shell losses 13460 (13.95); sum 92484 (100).

3. a controlled shell cooling system for a rotary furnace (ccsrf)

ccsrf (figures 8 and 9) consists of a second shell installed over the original shell of the furnace. between the original shell and the installed shell there is an air gap, through which the cooling air passes and removes heat from the shell of the rotary furnace. controlled cooling removes heat from the furnace shell so that its temperature reaches the maximum operating temperature allowed by the material properties of the shell. a carbon steel shell can have a maximum surface temperature of 350 °c; for steel alloys, the temperature can reach values up to approx. 600 °c. by raising the temperature of the outside of the shell in this way, the temperature gradient in the furnace wall is reduced (the difference between the inner surface temperature and the temperature of the outer shell of the furnace), thereby reducing the heat flow through the furnace shell, i.e. the shell heat loss itself. the reduced heat losses increase the useful heat [13–17].

figure 7. heat balance of the reference state of rotary furnace no. 2.
figure 8. the current rotary furnace.
figure 9. rotary furnace with controlled shell cooling.

the gaseous medium enters the cooled shell, in which heat is transferred to the inner wall of the furnace. from the burner, the heat proceeds by radiation and convection to the lining layer and to the batch layer. the heat passes through the lining to the inner surface of the shell by conduction, and the shell transfers it to the surroundings. after installing the double shell, the inner shell of the furnace transmits heat by radiation and convection to the sampled air; the heated air carries heat by convection away from the outer surface of the inner shell and from the inner surface of the double shell (figure 10).

for the heat flow ($q$ [w]) by convection from the inner environment at temperature $t_1$ to the lining surface at temperature $t_{s1}$, the following holds [15, 18, 19]:
$$q = \alpha_1\, s\, (t_1 - t_{s1}), \qquad (5)$$
where $\alpha_1$ is the heat transfer coefficient [w m$^{-2}$ k$^{-1}$], $s$ – surface of the shell [m$^2$], $t_1$ – flue gas temperature [°c], $t_{s1}$ – temperature of the inner surface of the lining [°c].

the heat transfer ($q$ [w]) from the inner environment by radiation to the lining is determined by the relationship:
$$q = c\,s\left[\left(\frac{t_1}{100}\right)^4 - \left(\frac{t_{s1}}{100}\right)^4\right], \qquad (6)$$
where $c$ is the radiation coefficient, $s$ – surface of the shell [m$^2$], $t_1$ – flue gas temperature [°c], $t_{s1}$ – temperature of the inner surface of the lining [°c].
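eqs. (5) and (6) can be combined in a few lines to compare the convective and radiative contributions on the gas side of the lining. the coefficient values below, a film coefficient $\alpha_1$ and a radiation coefficient $c$ of order ε · 5.67 in the (t/100)^4 convention, are again assumptions chosen only for illustration.

```python
# a rough sketch combining eqs. (5) and (6): convective plus radiative heat
# flow from the flue gas at t1 to the lining surface at ts1. the radiation
# coefficient c follows the (t/100)^4 convention used in the text; its value,
# like all other numbers here, is an assumption for illustration only.

def gas_to_lining(area, t1_c, ts1_c, alpha1=150.0, c=4.5):
    q_conv = alpha1 * area * (t1_c - ts1_c)                   # eq. (5) [w]
    t1_k, ts1_k = t1_c + 273.15, ts1_c + 273.15               # to kelvin
    q_rad = c * area * ((t1_k / 100.0)**4 - (ts1_k / 100.0)**4)  # eq. (6) [w]
    return q_conv, q_rad

q_conv, q_rad = gas_to_lining(area=10.0, t1_c=1800.0, ts1_c=1500.0)
print(f"convection {q_conv/1e3:.0f} kw, radiation {q_rad/1e3:.0f} kw")
```

at sintering temperatures the radiative term dominates, which is why the radiation path from the burner is treated explicitly in the text.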
figure 10. the principle of controlled shell furnace cooling.

the heat flow ($q$ [w]) conducted through the lining is expressed by the equation for heat conduction through a multilayer cylindrical wall; for each layer:
$$q = \frac{2\pi\lambda l}{\ln\dfrac{r_2}{r_1}}\,(t_{s1} - t_{s2}), \qquad (7)$$
where $t_{s1}$ – temperature of the inner surface of the lining [°c], $t_{s2}$ – temperature of the rp shell surface [°c], $\lambda$ – thermal conductivity of the lining [w m$^{-1}$ k$^{-1}$], $r_{1,2}$ – radius (inner, outer) [m], $l$ – length of the measured part [m].

the relationship for the convective heat flow ($q$ [w]) from the rp shell surface into the surroundings is:
$$q = \alpha_2\, s\, (t_{s2} - t_2), \qquad (8)$$
where $\alpha_2$ – heat transfer coefficient [w m$^{-2}$ k$^{-1}$], $s$ – surface of the shell [m$^2$], $t_{s2}$ – temperature of the shell surface [°c], $t_2$ – ambient temperature [°c].

the heat loss ($q$ [w]) through the wall of the rotary furnace is then calculated as:
$$q = \frac{2\pi\lambda l}{\dfrac{1}{\alpha_1} + \ln\dfrac{r_2}{r_1} + \dfrac{1}{\alpha_2}}\,(t_1 - t_2), \qquad (9)$$
where $\alpha_1$ – heat transfer coefficient from the inner environment to the wall [w m$^{-2}$ k$^{-1}$], $\alpha_2$ – heat transfer coefficient from the surface to the surroundings [w m$^{-2}$ k$^{-1}$], $t_1$ – flue gas temperature [°c], $t_2$ – rp ambient temperature [°c], $r_{1,2}$ – radius (inner, outer) [m].

shell waste heat in the form of pre-heated combustion air can be used for:
• increasing the flame temperature, which causes:
  . an increase in heat transfer,
  . an increase in the sintering temperature,
  . an increase in the quality of the burned clinker;
• a reduction of losses, which causes:
  . an increase in useful heat,
  . an increase in the capacity of the furnace,
  . an increase in the quality of the sintering.

3.1. simulation of the application of ccsrf to rotary furnace no. 2

table 4. simulated process parameters.
fuel input: 2550 m3/h
layer thickness: 30 cm
flame length: 15 m
furnace performance: 9.81 t/h
furnace shell heat losses: 4.83 gj/h
heat returned to the process through the primary air: 5 gj/h

figure 11. simulation model of rotary furnace no. 2 with ccsrf.

a computer simulation (table 4) generated the temperatures of the combustion gases and of the material passing through the rotary furnace, and also the temperature along the rotary furnace shell (figure 11), compared to the reference state (table 3). controlled cooling achieved (table 5, figure 12) a reduction in fuel consumption from 2774 m3/h to 2550 m3/h of natural gas, which represents a saving of 8 % with an unchanged performance of 9.81 tons/hour. the heat loss of the rotary furnace shell decreased from 13.46 gj/h to 4.83 gj/h, and 5 gj/h of heat can be used for heating the combustion air. preventing heat loss by installing a double shell will increase the useful heat value by 1.54 %, which means an increase in the efficiency of magnesite clinker firing (table 6).

4. discussion

in magnesite sintering, the clinker quality is determined by the beginning of calcination, the completion of calcination and the maximum calcination temperature. cooling treatment of the furnace can thereby influence the process of decomposition and sintering (speeding it up or slowing it down).
by the intensity of the controlled shell cooling we can affect the flow of useful heat in the rp segment, and thus control the intensity of the thermal processes taking place in the furnace. controlled shell furnace cooling provides a new way to manage processes in a furnace, which can contribute significantly to achieving the required criteria for the processes taking place in the rotary furnace.

table 5. heat balance of rp no. 2 with ccsrf.
detained heat [mj] ([%]): q material in 508 (0.54); q medium in 518 (0.55); q brought from the furnace shell through the primary air 5000 (5.34); q burner 87600 (93.56); sum 93626 (100).
escaped heat [mj] ([%]): q h2o evaporated 654 (0.71); q decomposition mgco3 31304 (33.82); q decomposition caco3 2498 (2.70); q decomposition feco3 453 (0.49); q medium out 34934 (37.74); q material out 12890 (13.93); q furnace shell losses 4825 (5.21); q brought from the furnace shell through the primary air 5000 (5.40); sum 92558 (100).

figure 12. heat balance after installation of ccsrf.

5. conclusions

the effect of the proposed system of controlled shell furnace cooling has been verified by simulations on a mathematical model. the proposed innovative measure fulfills the defined basic objectives of rotary furnace optimization primarily by:
• minimizing the total cost of magnesite clinker production in a rotary furnace: the cost of burning to produce one ton of clinker is reduced by 7 %,
• increasing the immediate performance of the rotary furnace by increasing the theoretical combustion temperature by about 8 %,
• reducing the energy consumption for magnesite clinker burning, reducing the specific fuel consumption by approx. 8 %,
• ensuring the temperature conditions for furnace shell monitoring,
• increasing the preheating of the primary air and gas to the burner,
• reducing the consumption of combustion air, and thereby reducing the emissions of co2 and nox.

table 6. comparison of the reference and the proposed status.
alternative 1 (reference): power consumption 2774 m3/h; performance 9.81 t/h; specific consumption 282 m3/t; material temp. 1615 °c; flue gas temp. 1825 °c; savings –.
alternative 2 (ccsrf): power consumption 2550 m3/h; performance 9.81 t/h; specific consumption 259 m3/t; material temp. 1610 °c; flue gas temp. 1796 °c; savings 8 %.

the analysis of the impact of ccsrf on the technological process and on the economics of the innovation demonstrated the usefulness of this measure. the proposed solution is based on self-regulatory principles and provides a significant opportunity to improve the operation of rotary furnaces. the proposed solution gains heat that is used directly in the technological process of the rotary furnace, and it also creates synergies with other measures (the diffusion burner, the self-batch feeder, the rolling rotary furnace sealing). by combining these measures it is possible to achieve fuel savings of about 30 %.

acknowledgements
this paper is the result of the project under the title development of a joint research and development and innovation centre and streamlining its use in thermal processing of raw materials, supported by the research & development operational programme funded by the erdf (itms: 26220220151).

references
[1] j. mikula. mathematical modelling of thermal processes for virtual reality environment. dissertation thesis, tu, kosice, slovakia, 2009.
[2] f. t. j. staron. refractory materials: production, properties and usage. second printing. radovan mlynarik-media, banska bystrica, slovakia, 2000.
[3] feeco int. rotary kiln refractory. photo [2014-12-01], http://www.flickr.com/photos/68660976@n06/6322419239.
[4] a. varga. thermal technology in metallurgy. first printing. technical university, kosice, slovakia, 1999.
[5] h. brunklaus. construction of industry furnaces. first printing. state publishing technical literature, praha, czechoslovakia, 1966.
[6] v. vitek. industry furnaces ii. first printing. slovak publishing technical literature, bratislava, czechoslovakia, 1965.
[7] a. c. caputo, p. m. pelagagge, p. salini. performance modeling of radiant heat recovery exchangers for rotary kilns. applied thermal engineering 31(14-15):2578–2589, 2011. doi:10.1016/j.applthermaleng.2011.04.02.
[8] j. s. j. lisuch. technological logistic: innovative trend in rotary furnaces. 5th international conference loado 2009, logistics and transport, strbske pleso, slovakia, issn: 1451-107x, pp. 139–143, 2009.
[9] j. s. j. lisuch, j. mikula. rotary kiln shell controlled cooling system. iccc 2007: proceedings of the 8th international carpathian control conference, high tatras, slovak republic, pp. 426–429, 2007.
[10] m. r. et al. thermal calculations and optimization of the lining of industrial furnaces. publishers of technical literature sntl, praha, czechoslovakia, 1975.
[11] k. michalikova-frajtova. effective process management in the conditions of globalization. logistic monitor, rajecke teplice, slovakia, isbn: 979-80-969745-1-0, pp. 126–132, 2008.
[12] b. chakrabarti. investigations on heat loss through the kiln shell in magnesite dead burning process: a case study. applied thermal engineering 22(12):1339–1345, 2002. doi:10.1016/s1359-4311(02)00051-0.
[13] k. z. j. herman, m. hezina. industrial innovations. first printing. economic university praha, praha, 2002.
[14] a. r. j. spisak, j. lisuch. technological logistics: a tool for the optimalization of heat treatment processes of raw materials. metal 2011: 20th anniversary international conference on metallurgy and materials, brno, czech republic, isbn: 978-80-87294-24-6, pp. 1–8, 2011.
[15] j. terpak. modelling and control of technological processes. habilitation thesis, tu, kosice, slovakia, 2002.
[16] m. michejev. fundamentals of heat sharing. second printing. industrial publishing, praha, czechoslovakia, 1953.
[17] i. i. l. komorova. thermodynamics in metallurgy. alfa, bratislava, slovakia, isbn: 800501077x, 1992.
[18] i. kostial. lectures on modeling and simulation of technological processes. tu, kosice, slovakia, 1998.
[19] l. kuna. refractory linings of industrial furnaces. first printing. slovak publishing technical literature, bratislava, czechoslovakia, 1967.

acta polytechnica 53(2):174–178, 2013

instability growth rate dependence on input parameters during the beam–target plasma interaction

miroslav horký
department of physics, faculty of electrical engineering, czech technical university in prague, czech republic
corresponding author: horkymi1@fel.cvut.cz

abstract. the two-stream instability without magnetic field is described by the well-known buneman dispersion relation.
for more complicated situations we need to use the generalized buneman dispersion relation derived by kulhánek, břeň, and bohata in 2011, which is a polynomial equation of 8th order. the maximal value of the imaginary part of the individual dispersion branches ωn(k) is very interesting from the physical point of view. it represents the instability growth rate, which is responsible for the onset of the turbulence mode and subsequent reconnection on the ion radius scale, accompanied by strong plasma thermalization. the paper presented here is focused on the dependence of the instability growth rate on various input parameters, such as the magnitude of the magnetic field and the sound velocity. the results are presented in well-arranged plots and can be used for a survey of the plasma parameters close to which the strong energy transfer and thermalization between the beam and the target occurs.

keywords: buneman instability, numerical simulations, plasma, dispersion relation.

1. introduction

two-stream instabilities are the most common instabilities in plasmas. they originate on the microscopic scale and can develop into macroscopic phenomena, like thermal radiation from strong thermalization or non-thermal radiation from reconnections. if we consider that both streams have parallel velocities, we talk about the buneman instability [1], and if we consider intersecting directions of the velocities together with an anisotropy of temperatures, we talk about the weibel instability [8]. the dispersion relation for the two-stream instability without magnetic field in a cold plasma is described by the relation
$$\sum_{\alpha=1}^{2}\frac{\omega_{p\alpha}^2}{\bigl(\omega - \mathbf{k}\cdot\mathbf{u}_{0\alpha}\bigr)^2} = 1, \qquad (1)$$
where ω is the wave frequency, ωpα is the plasma frequency of the first and second stream respectively, k is the wave vector, and u0α is the velocity vector of the first and second stream respectively. the simplest situation in which we can use this relation is the interaction of two identical streams moving in opposite directions. equation 1 then has the simple one-dimensional form [4]
$$\frac{\omega_p^2}{(\omega - k u_0)^2} + \frac{\omega_p^2}{(\omega + k u_0)^2} = 1. \qquad (2)$$
the two-stream instabilities are usually used for studying the origin of observed macroscopic phenomena (e.g. particle acceleration in relativistic plasma shocks [6]). this paper is focused on a general study of the plasma jet interaction on the microscopic scale (not only on the study of the origin of one particular phenomenon), and for this case the general dispersion relation is needed. two generalizations of the two-stream instability dispersion relation have been made in recent years: the first by kulhánek, břeň and bohata [5] in 2011, and the second by pokhotelov and balikhin [7] in 2012. in this paper we do all the calculations from [5], because that generalization is more rigorous and precise. the authors called it the generalized buneman dispersion relation (gbdr), and it is described by the equation
$$\prod_{\alpha=1}^{2}\Biggl\{\omega_\alpha^4 - \omega_\alpha^2\Bigl[\frac{i\,\mathbf{f}^{(0)}_\alpha\cdot\mathbf{k}}{m_\alpha} + c_{s\alpha}^2 k^2 + \omega_{p\alpha}^2 + \omega_{c\alpha}^2\Bigr] - \frac{\omega_\alpha\,\omega_{c\alpha}}{m_\alpha}\bigl(\mathbf{f}^{(0)}_\alpha\times\mathbf{k}\bigr)\cdot\mathbf{e}_b + \omega_{c\alpha}^2(\mathbf{k}\cdot\mathbf{e}_b)\Bigl[\frac{i\,\mathbf{f}^{(0)}_\alpha\cdot\mathbf{e}_b}{m_\alpha} + \bigl(c_{s\alpha}^2 k^2 + \omega_{p\alpha}^2\bigr)\frac{\mathbf{k}\cdot\mathbf{e}_b}{k^2}\Bigr]\Biggr\} - \prod_{\alpha=1}^{2}\frac{\omega_{p\alpha}^2}{k^2}\Bigl[\omega_\alpha^2 k^2 - \omega_{c\alpha}^2(\mathbf{e}_b\cdot\mathbf{k})^2\Bigr] = 0, \qquad (3)$$
where
$$\omega_\alpha = \omega - \mathbf{k}\cdot\mathbf{u}^{(0)}_\alpha, \qquad (4)$$
ωcα is the cyclotron frequency, $\mathbf{f}^{(0)}_\alpha$ is the lorentz magnetic force, $\mathbf{e}_b$ is the unit vector in the direction of the magnetic field, and csα is the sound velocity. for b = 0 and in the cold plasma limit csα = 0, the generalized relation reduces to eq. 1. for its analysis it is useful to convert this relation to a non-dimensional form to ensure the scale invariance of the results.
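before the general 8th-order solver is introduced, the symmetric special case of eq. (2) above already shows the mechanics. the following sketch, our illustration rather than code from the paper, clears the denominators of eq. (2) by hand, which gives the quartic ω⁴ − 2(a + ωp²)ω² + a(a − 2ωp²) = 0 with a = (ku₀)², and reads off the growth rate as the largest imaginary part of its roots; the analytic values γmax = ωp/2 at ku₀ = (√3/2)ωp, which follow from solving the biquadratic, serve as a check.

```python
import numpy as np

def growth_rate(k, u0=1.0, wp=1.0):
    """max im(omega) of eq. (2) for one wavenumber k (cold symmetric beams)."""
    a = (k * u0) ** 2
    # eq. (2) cleared of denominators:
    # w^4 - 2(a + wp^2) w^2 + a (a - 2 wp^2) = 0
    roots = np.roots([1.0, 0.0, -2.0 * (a + wp**2), 0.0, a * (a - 2.0 * wp**2)])
    return roots.imag.max()

ks = np.linspace(0.01, 2.0, 400)
gam = np.array([growth_rate(k) for k in ks])
print(f"max growth rate {gam.max():.3f} (analytic wp/2 = 0.5) "
      f"at k u0 = {ks[gam.argmax()]:.3f} (analytic sqrt(3)/2 ~ 0.866)")
```

the instability window ku₀ < √2 ωp and the maximum at ku₀ = (√3/2)ωp come out of the scan, which is a useful sanity check before trusting the general machinery.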
the system of coordinates used in the solution is shown in fig. 1. the directions of the vectors uα, b and k are presented in fig. 1 for our coordinate system: the wavevector can have any direction, the magnetic field lies only in the (x–z) plane, and the velocity vectors of the streams point only along the x-axis. these vectors have the coordinates
$$\mathbf{u}_\alpha = (u_\alpha,\,0,\,0), \quad \mathbf{b} = (b\sin\theta_b,\,0,\,b\cos\theta_b), \quad \mathbf{k} = (k\cos\varphi\sin\theta_k,\,k\sin\varphi\sin\theta_k,\,k\cos\theta_k).$$

figure 1. system of coordinates used in the simulations.

1.1. the non-dimensional form

the non-dimensional variables are defined by the relations [2]
$$\bar c_{s1} \equiv \frac{c_{s1}}{u_1}, \quad \bar c_{s2} \equiv \frac{c_{s2}}{u_1}, \quad \bar\omega_{c1} \equiv \frac{\omega_{c1}}{\omega_{p2}}, \quad \bar\omega_{c2} \equiv \frac{\omega_{c2}}{\omega_{p2}}, \quad \bar\omega_{p1} \equiv \frac{\omega_{p1}}{\omega_{p2}}, \quad \bar\omega_{p2} \equiv \frac{\omega_{p2}}{\omega_{p2}} = 1,$$
$$\bar u_2 \equiv \frac{u_2}{u_1}, \quad \bar u_1 \equiv \frac{u_1}{u_1} = 1, \quad \bar k \equiv \frac{k u_1}{\omega_{p2}}, \quad \bar\omega \equiv \frac{\omega}{\omega_{p2}},$$
where index 1 denotes the jet and index 2 denotes the background. under these definitions, we can convert eq. 3 into the non-dimensional form [2]
$$\Bigl[\bar\omega_1^4 + i\,\bar\omega_1^2\bar\omega_{c1}\bar k\,g_1 - \bar\omega_1^2\bigl(\bar c_{s1}^2\bar k^2 + \bar\omega_{p1}^2\bigr) - \bar\omega_1^2\bar\omega_{c1}^2 - \bar\omega_1\bar\omega_{c1}^2\bar k\,g_3 + \bigl(\bar\omega_{c1}^2\bar k^2\bar c_{s1}^2 + \bar\omega_{c1}^2\bar\omega_{p1}^2\bigr)g_2^2\Bigr]$$
$$\cdot\Bigl[\bar\omega_2^4 + i\,\bar\omega_2^2\bar\omega_{c2}\bar k\bar u_2\,g_1 - \bar\omega_2^2\bigl(\bar c_{s2}^2\bar k^2 + 1\bigr) - \bar\omega_2^2\bar\omega_{c2}^2 - \bar\omega_2\bar\omega_{c2}^2\bar k\bar u_2\,g_3 + \bigl(\bar\omega_{c2}^2\bar k^2\bar c_{s2}^2 + \bar\omega_{c2}^2\bigr)g_2^2\Bigr]$$
$$-\,\bar\omega_{p1}^2\bigl(\bar\omega_1^2 - \bar\omega_{c1}^2 g_2^2\bigr)\bigl(\bar\omega_2^2 - \bar\omega_{c2}^2 g_2^2\bigr) = 0, \qquad (5)$$
where we have denoted
$$g_1 = \cos\theta_b\sin\varphi\sin\theta_k, \quad g_2 = \cos\varphi\sin\theta_k\sin\theta_b + \cos\theta_k\cos\theta_b, \quad g_3 = \cos^2\theta_b\cos\varphi\sin\theta_k - \cos\theta_b\cos\theta_k\sin\theta_b,$$
$$\bar\omega_1 = \bar\omega - \bar k\cos\varphi\sin\theta_k, \quad \bar\omega_2 = \bar\omega - \bar k\,\bar u_2\cos\varphi\sin\theta_k.$$
the main goal is to find the solution for the dependence of ω on k. equation 5 is a polynomial equation of 8th order.

2. numerical solution

a classical newton's algorithm for finding the roots of polynomial equations has one big disadvantage: it does not specify the initial points (the points where the algorithm starts the iterations), so it does not guarantee that all the roots are found. in 2001 hubbard, schleicher and sutherland published the article "how to find all roots of complex polynomials with newton's method", where they demonstrated how to determine the initial points so as to find all the roots of a polynomial equation [3].

2.1. principle of the algorithm fundamentals

the basic principles are described in [4]. for each k we have a polynomial equation of the type
$$c_0 + c_1\omega + c_2\omega^2 + c_3\omega^3 + c_4\omega^4 + c_5\omega^5 + c_6\omega^6 + c_7\omega^7 + c_8\omega^8 = 0. \qquad (6)$$
at first we must rescale the polynomial, so we have to find
$$a_{\max} = 1 + \max_k\left\{\left|\frac{c_k}{c_n}\right|\right\}. \qquad (7)$$
from now on we will work with the polynomial
$$q(z) \equiv \sum_{k=0}^{n}\bar c_k z^k, \qquad (8)$$
where
$$z \equiv \frac{\omega}{a_{\max}}, \quad \bar c_k \equiv c_k\,a_{\max}^k. \qquad (9)$$
the second step is to determine the initial points where the algorithm will start the iterations. the net of initial points is determined by radii and angles in the complex plane:
$$r_l \equiv \bigl(1 + \sqrt{2}\bigr)\left(\frac{n-1}{n}\right)^{(2l-1)/(4L)}, \qquad (10)$$
$$l = 1,\dots,L, \qquad (11)$$
$$L \equiv \bigl\lceil 0.26632\,\ln n\bigr\rceil, \qquad (12)$$
$$\xi_m \equiv \frac{2\pi m}{M}, \qquad (13)$$
$$m = 0,\dots,M-1, \qquad (14)$$
$$M \equiv \bigl\lceil 8.32547\,n\ln n\bigr\rceil. \qquad (15)$$
the net of initial points is then
$$z_{lm} = r_l\,\exp(i\xi_m), \qquad (16)$$
$$l = 1,\dots,L, \qquad (17)$$
$$m = 0,\dots,M-1. \qquad (18)$$
the initial net thus contains exactly L·M points, and from these points the algorithm starts the iterations. the number of iterations O is defined through the accuracy ε by
$$O \equiv \left\lceil\frac{\ln(1+\sqrt{2}) - \ln\varepsilon}{\ln n - \ln(n-1)}\right\rceil. \qquad (19)$$
the bracket ⌈x⌉ denotes the ceiling function (the first integer which is greater than or equal to x). solutions which do not satisfy $|q(z_O)| < \varepsilon$ are not roots of the polynomial. after finding all the roots of the rescaled polynomial we have to do the backscaling
$$\omega_O = a_{\max}\,z_O. \qquad (20)$$
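a compact implementation of the whole procedure of section 2.1 might look as follows; this is our illustrative sketch, not the author's code. it uses numpy's polynomial class, and the test polynomial at the end is an arbitrary cubic with known roots 1, −2 and 3i.

```python
import numpy as np

def all_roots(coeffs, eps=1e-12):
    """newton iterations from the hubbard-schleicher-sutherland grid,
    eqs. (7)-(20); coeffs = [c0, c1, ..., cn] of the polynomial."""
    c = np.asarray(coeffs, dtype=complex)
    n = len(c) - 1
    a_max = 1.0 + max(abs(c[k] / c[n]) for k in range(n))           # eq. (7)
    cbar = c * a_max ** np.arange(n + 1)                            # eq. (9)
    q = np.polynomial.Polynomial(cbar)
    dq = q.deriv()
    L = int(np.ceil(0.26632 * np.log(n)))                           # eq. (12)
    M = int(np.ceil(8.32547 * n * np.log(n)))                       # eq. (15)
    O = int(np.ceil((np.log(1 + np.sqrt(2)) - np.log(eps))
                    / (np.log(n) - np.log(n - 1))))                 # eq. (19)
    roots = []
    for l in range(1, L + 1):
        r = (1 + np.sqrt(2)) * ((n - 1) / n) ** ((2 * l - 1) / (4 * L))
        for m in range(M):
            z = r * np.exp(2j * np.pi * m / M)                      # eq. (16)
            for _ in range(O):
                z = z - q(z) / dq(z)                                # newton step
            if abs(q(z)) < eps:                                     # acceptance test
                roots.append(a_max * z)                             # eq. (20)
    return np.array(roots)

# quick check on (w - 1)(w + 2)(w - 3i) = 0,
# i.e. w^3 + (1 - 3i) w^2 - (2 + 3i) w + 6i = 0
r = all_roots([6j, -(2 + 3j), 1 - 3j, 1])
print(np.unique(np.round(r, 6)))
```

applied to eq. (5), one such root search is performed for every value of k̄ on the grid, and the growth rate is the maximum imaginary part over the accepted roots.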
2.2. example of the solution

the first numerical solution was made in [5] for the situation of two identical opposite plasma beams in a magnetic field. the example of dispersion branches given here is for a more complicated situation: one plasma beam penetrates into a plasma background, and the magnetic field has both perpendicular and parallel components. the parameters of this simulation are in tab. 1 and the graphical result is in fig. 2. the result is depicted in a well arranged plot where blue dots represent the real branches and red dots the imaginary branches of the solution. the maximal value of the imaginary branch, the so-called plasma instability growth rate (pigr), is marked "max".

table 1. parameters used in the example of the solution: ωc1 = ωc2 = 0.5; cs1 = cs2 = 0.1; u2 = 0; ωp1 = 1; θk = π/2; ϕ = 0; θb = π/4.

figure 2. example of a solution of the gbdr with the maximal value of the imaginary part marked.

3. pigr dependence on various input parameters

3.1. dependence on cyclotron frequencies

at first, the pigr dependence on both the jet and the background cyclotron frequencies was found.

3.1.1. results for ωc1

the parameters used in the simulations are presented in tab. 2 and the results are depicted in fig. 3. it is obvious that the pigr grows linearly from the value ωc1 = 0.6.

table 2. parameters used in the simulations with varying ωc1: ωc1 ∈ 〈0.5, 3〉; ωc2 = 0.5; cs1 = cs2 = 0.1; u2 = 0; ωp1 = 1; θk = π/2; ϕ = 0; θb = π/4.

figure 3. the pigr dependence on ωc1.

3.1.2. results for ωc2

the parameters used in the simulations are presented in tab. 3 and the results are shown in fig. 4. from these results we can see a local minimum of the pigr, which originates from a bifurcation of the solution. the bifurcation is depicted in a three dimensional plot whose axes are k, ω and ωc2 (see fig. 5).

table 3. parameters used in the simulations with varying ωc2: ωc1 = 0.5; ωc2 ∈ 〈0.5, 3〉; cs1 = cs2 = 0.1; u2 = 0; ωp1 = 1; θk = π/2; ϕ = 0; θb = π/4.

figure 4. the pigr dependence on ωc2.
figure 5. imaginary part of the solution in three dimensions with an observable bifurcation.

3.2. dependence on sound velocities

subsequently, the pigr dependence on both the jet and the background sound velocities was found.

3.2.1. results for cs1

the parameters used in the simulations are presented in tab. 4 and the results are depicted in fig. 6. it is obvious that beyond the value cs1 = 1 there is no imaginary branch of the solution, so there are no instabilities.

table 4. parameters used in the simulations with varying cs1: ωc1 = ωc2 = 0.5; cs1 ∈ 〈0.1, 2〉; cs2 = 0.1; u2 = 0; ωp1 = 1; θk = π/2; ϕ = 0; θb = π/4.

figure 6. the pigr dependence on cs1.

3.2.2. results for cs2

the parameters used in the simulations are presented in tab. 5 and the results are depicted in fig. 7. figure 7 presents a bifurcation point similar to the one in fig. 4.

table 5. parameters used in the simulations with varying cs2: ωc1 = ωc2 = 0.5; cs1 = 0.1; cs2 ∈ 〈0.1, 1.5〉; u2 = 0; ωp1 = 1; θk = π/2; ϕ = 0; θb = π/4.

figure 7. the pigr dependence on cs2.

3.3. results overview

we found the pigr dependence on the four parameters ωc1, ωc2, cs1, and cs2. the main dissimilarity between the dependencies on the cyclotron frequencies is caused by the zero velocity of the background.
since the jet has non-zero velocity with a component perpendicular to the magnetic field, the jet particles react to a change of the magnetic field more strongly than the background particles. the dissimilarity between the dependencies on the sound velocities has the same origin: because of the non-zero velocity of the jet, the jet can become subsonic and can therefore be in a state with no instabilities.

4. conclusions and future work

first of all, the gbdr was converted into a non-dimensional form, which ensures the scale invariance of the problem and means that the results can be used both for laboratory and for astrophysical plasmas. the dispersion relation was then solved for the angular frequency via the algorithm suggested by hubbard, schleicher, and sutherland. in every solution branch the real and imaginary parts were separated, and the plasma instability growth rate was subsequently found numerically. finally, the pigr dependence on the four input parameters ωc1, ωc2, cs1 and cs2 was found. all these numerical calculations were done on the microscopic scale and in the linear approximation. these results can be used for a lookup of the plasma parameters close to which strong energy transfer and thermalization between the beam and the target occur, which will be the first part of the future work. another part will be particle-in-cell simulations of the origin of plasma turbulence in the vicinity of the pigr maximum.

acknowledgements
research described in the paper was supervised by prof. p. kulhánek from the fee ctu in prague and supported by the ctu grants sgs10/266/ohk3/3t/13, sgs12/181/ohk3/3t/13.

references
[1] o. buneman. dissipation of currents in ionized media. phys rev 115(3):503–517, 1959.
[2] m. horky. numerical solution of the generalized buneman dispersion relation. in proceedings of poster 2012. prague, 2012.
[3] j. hubbard, d. schleicher, s. sutherland. how to find all roots of complex polynomials with newton's method. inventiones mathematicae 146:1–33, 2001.
[4] p. kulhanek. uvod do teorie plazmatu. aga, prague, 1st edn., 2011. (in czech).
[5] p. kulhanek, d. bren, m. bohata. generalized buneman dispersion relation in longitudinally dominated magnetic field. isrn condensed matter physics 2011, 2011. article id 896321.
[6] k.-i. nishikawa, p. hardee, c. b. hededal, et al. particle acceleration, magnetic field generation, and emission in relativistic shocks. advances in space research 38:1316–1319, 2006.
[7] o. a. pokhotelov, m. a. balikhin. weibel instability in a plasma with nonzero external magnetic field. ann geophys 30:1051–1054, 2012.
[8] e. s. weibel. spontaneously growing transverse waves in a plasma due to an anisotropic velocity distribution. phys rev lett 2(3):83–84, 1959.

acta polytechnica vol. 41 no. 4–5/2001

the validation of computer-based models in engineering: some lessons from computing science

d. j. murray-smith

questions of the quality of computer-based models and the formal processes of model testing, involving internal verification and external validation, are usually given only passing attention in engineering reports and in technical publications. however, such models frequently provide a basis for analysis methods, design calculations or real-time decision-making in complex engineering systems. this paper reviews techniques used for external validation of computer-based models and contrasts the somewhat casual approach which is usually adopted in this field with the more formal approaches to software testing and documentation recommended for large software projects. both activities require intimate knowledge of the intended application, a systematic approach and considerable expertise and ingenuity in the design of tests. it is concluded that engineering degree courses dealing with modelling techniques and computer simulation should put more emphasis on model limitations, testing and validation.

keywords: computer-based models, simulation, model limitations, testing and validation, software engineering tools.

1 introduction

mathematical modelling is central to many aspects of engineering design.
this is particularly true in the case of control systems engineering, where models of the system to be controlled ("plant" models) play an important part in design. in some forms of control algorithm, dynamic models are actually explicitly incorporated within the controller. plant models used in the design or in the implementation of control systems are often linear in form, because most control system design methods are based upon linear theory. on the other hand, evaluation of the overall system performance for a system with a controller designed on a linear basis usually necessitates the use of a nonlinear description of the plant which incorporates some features which, although neglected at the initial design stage, have to be taken into account in performance evaluation studies prior to commissioning of the system.

although it has long been accepted that an essential part of the modelling and simulation process involves establishing the credibility of a simulation model in the context of the intended application, there is much evidence that, in practice, this aspect of modelling is often treated in a superficial way [1]. apart from some specific safety-critical applications, little attention appears to be given to model testing and to establishing the quality of models in terms of their useful range and limits of accuracy. the lack of attention given to external validation of models of engineering systems can lead to expensive redesign at late stages in the development cycle, and there are many examples which can be used to illustrate this. modelling errors are especially important in the design of high-performance automatic control systems, where model uncertainties can make it impossible for a control system to meet given performance specifications. it is now becoming accepted that in some application areas dependence upon linear perturbation models for control system design may not be sufficient for high-performance applications [2]. one good example of this is in helicopter flight control systems design, where it is now recognised that the success of optimal control and other synthesis methods has been limited by the range and accuracy of available models of the vehicle [3, 4].

developments in the theory of modelling and simulation [5] provide a useful methodological framework within which to consider issues of verification and validation of simulation models. the concept of an experimental frame, which is separate from the underlying system model and the simulation program, is central to present-day theory of modelling and simulation methods. an experimental frame can be divided into generator, acceptor and transducer elements. the generator is used to stimulate the system and the model with identical input sequences or trajectories (e.g. steady state values, steps, ramps, periodic signals etc.). the acceptor is the element through which the user can specify the conditions which are of interest (e.g. steady states, transient behaviour) and limit the observation of behaviour to the specified conditions. the transducer element of the experimental frame post-processes the output time histories and extracts the measures of interest.

2 methods for external validation of computer-based models

external validation of computer-based models involves testing the underlying mathematical model to ensure that its behaviour is consistent with that of the real system that it represents. it can also involve establishing the operating range over which the model is appropriate for the intended application. this process of external validation has to be distinguished carefully from the process of internal verification, which involves testing the computer simulation program to establish that it is consistent with the mathematical model in terms of its structure and also that it is algorithmically correct in the sense that simulation solutions accurately represent model solutions.
this process of external validation has to be distinguished carefully from the process of internal verification which involves testing the computer simulation program to establish that it is consistent with the mathematical model in terms of its structure and also that it is algorithmically correct in the sense that simulation solutions accurately represent model solutions. acta polytechnica vol. 41 no. 4 – 5/2001 45 the validation of computer-based models in engineering: some lessons from computing science d. j. murray-smith questions of the quality of computer-based models and the formal processes of model testing, involving internal verification and external validation, are usually given only passing attention in engineering reports and in technical publications. however, such models frequently provide a basis for analysis methods, design calculations or real-time decision-making in complex engineering systems. this paper reviews techniques used for external validation of computer-based models and contrasts the somewhat casual approach which is usually adopted in this field with the more formal approaches to software testing and documentation recommended for large software projects. both activities require intimate knowledge of the intended application, a systematic approach and considerable expertise and ingenuity in the design of tests. it is concluded that engineering degree courses dealing with modelling techniques and computer simulation should put more emphasis on model limitations, testing and validation. keywords: computer-based models, simulation, model limitations, software testing, external validation, software engineering tools. external validation presents a much greater challenge than internal verification to those involved in computer-based modelling and simulation of engineering systems. the approaches to external validation that are most commonly used are based upon comparisons of response data from the model and from the real system, together with appropriate model performance measures (the transducer element within the experimental frame). for example, many different deterministic measures of model quality have been proposed [1] and statistical techniques [6] based on the fitting of auto-regressive moving-average models or stochastic models to time series of the model and real system variables can also be used. if the two models are the same the two time series are equivalent in some respects. techniques of step-wise regression have also been applied successfully to model structure assessment [7]. system identification and parameter estimation techniques [8, 9] provide an approach to validation that involves comparisons between system and model which are somewhat less direct than in the methods outlined above. comparisons of parameter estimates for linear empirical models obtained from experiments on the real system with equivalent quantities derived by linearisation of a nonlinear theoretical model may allow conclusions to be reached about the overall validity of the theoretical model and possible sources of error. the concept of identifiability [10, 11] is itself of great significance in the establishment of experimental frames for external validation and the interpretation of estimates of unknown parameters. other tools of potential value for external validation include parameter sensitivity analysis [12, 13], inverse modelling [1, 14] and model distortion methods [15]. 
whatever the chosen methodology for external validation, experimental design is of vital importance since the information content of the response data is central to the validity of the conclusions reached regarding the suitability or otherwise of a particular model for a given application. if the frequency or amplitude ranges chosen for the generator are inappropriate the conclusions reached will be of little value. at best the model will be restricted to the range of conditions over which it has been tested. documentation of the complete model development process, including all testing, is very important. a complete record should be kept of the aims and objectives of the work, the purpose of the model, the detailed specifications of the model, all assumptions, simplifications and approximations used, tests applied for external validation, experimental records obtained for validation testing and the associated analysis techniques and results. the rationale used in the decision to accept or reject a given model must also appear in the documentation. in most engineering projects which depend on computer-based models in the design, implementation or application phases, additional evidence of the suitability or otherwise of the model will continue to be accumulated throughout the whole duration of the project. documentation must therefore be maintained throughout the application phase. within design organisations it may well be helpful to retain model documentation beyond the lifetime of the real system as this may provide generic information which may be useful for subsequent design projects involving other systems of a similar kind. 3 the relevance of software engineering testing principles in the early days of computer programming, testing of software was viewed as “debugging” and was carried out by the programmers themselves as a post-development activity. by the 1980s the term “software engineering” was being used to describe the software development process and software testing began to be recognised as a separate activity to be performed by independent testers using appropriate testing tools. many definitions of testing are available. one which is particularly appropriate is “verifying that a system satisfies its specified requirements or identifying differences between expected and actual results” [16]. this definition gives emphasis to the fact that during testing one needs to be able to anticipate what is supposed to happen and to compare what actually does happen with that prediction. it is now recognised that in software development projects testing is more than just a phase of work which occurs towards the end of the development cycle [16, 17]. the testing process starts at the stage at which requirements are defined and occurs again at every subsequent stage of the development cycle through design and implementation to operation and maintenance. clearly the cost of software errors can be minimised if the errors are detected at the stage of development in which they are introduced. in general it is vitally important to prevent the migration of errors from one phase of software development to a later phase. most testing involves a bottom-up process in which low-level modules are tested first, with emphasis placed initially at the unit or module level. higher-level testing, involving integration testing and complete system testing, is carried out at a later stage in the development cycle. in general this makes it easier to establish the cause of any failure. 
whatever approach to software testing is adopted, every new version of a unit or product should be retested after modification through a process known as regression testing. this involves re-execution of some or all of the tests carried out during the initial (or progressive) testing process. regression testing puts special emphasis on the need for good documentation of software testing procedures as part of the complete documentation of the software development process. testing may be used to show that errors are present, but never to prove that they are absent. effective testing removes errors, but in complex applications it may be difficult to know how much testing is appropriate, and some form of risk assessment and management may be called for in establishing what is needed. critical software (as defined by the ieee/ansi standards [16]) is software which could have an impact on safety or could cause large financial or social losses if it failed.

4 discussion

even a superficial review of recent publications on software testing suggests that the similarities between this activity and the internal verification and external validation of computer-based models are significant. almost every statement written above about software testing could also be presented in the context of good practice for model development, and becomes especially clear when viewed from the perspective of the experimental frame. for example, testing at the module or unit level is just as important in computer-based modelling and simulation as it is in software engineering. as with software, complete testing of a model is impossible because the domain of possible inputs is too large to test exhaustively, and realistic test planning involves selecting a small number of test cases which are designed to detect errors. careful consideration has to be given to the generator, acceptor and transducer elements within the experimental frame. the amount of testing to be carried out, and the extent to which exhaustive regressive testing is applied, depends on the consequences of possible model errors or failure of the model and is therefore a matter for risk assessment. in safety-critical applications it is clear that model testing procedures and model documentation processes are already much more rigorous than in other areas.

while parallels exist between software testing and model testing, it is clear that computer professionals are generally more aware of testing methodologies than are most engineers who use modelling and computer simulation tools. although lip-service may be paid frequently to issues of model quality in applications in which models have a central role in design, relatively little attention appears to be given, in practice, to systematic model testing and to model documentation. in spite of the enormous advances of software engineering in the past thirty years, the testing process is still relatively immature in most organisations, and testing still does not receive the attention which it should within the academic world. nevertheless, the situation in terms of systematic testing of software is very much more satisfactory than is the case for systematic and exhaustive testing within the model development process. the apparent lack of awareness of engineers about the processes of testing and documentation of models points to underlying problems in their education.
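translated into the modelling context, a regression test can be as simple as re-running an archived test case and comparing the output with a stored baseline response within a stated tolerance. the sketch below is a schematic, pytest-style illustration under assumed file names, tolerances and a trivial stand-in model; it is not a prescription from the cited testing literature.

```python
import json
import numpy as np

def simulate(t, tau):
    """stand-in for the simulation model under version control."""
    return 1.0 - np.exp(-np.asarray(t) / tau)

def make_baseline(path="baseline_step.json", tau=2.0):
    """archive the accepted response of the current model version."""
    t = np.linspace(0.0, 10.0, 101)
    base = {"t": t.tolist(), "tau": tau, "y": simulate(t, tau).tolist()}
    with open(path, "w") as f:
        json.dump(base, f)

def test_step_response_regression(path="baseline_step.json", rtol=1e-6):
    """re-run the archived test case and compare with the stored baseline."""
    with open(path) as f:
        base = json.load(f)
    y_new = simulate(base["t"], base["tau"])
    assert np.allclose(y_new, base["y"], rtol=rtol), \
        "model output has drifted from the archived baseline"

make_baseline()
test_step_response_regression()
print("regression test passed")
```

kept under the same version control as the model itself, such tests document exactly which behaviour each model version was accepted against.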
it appears that most courses which deal in some way with modelling and simulation issues at university level put most emphasis on model formulation and on numerical methods. more difficult questions about model accuracy and fitness for purpose are very often neglected or treated in a superficial way. some aspects of the problem could be handled more easily if there were a wider understanding of the tools used by software engineers within the software development process. for example, version control techniques are routinely used in software development, but are seldom applied in a rigorous fashion for simulation model development, for maintaining the model throughout its life cycle [18], and for documentation. the tools are readily available and could be applied in modelling just as effectively as in software engineering.

the tasks involved in developing a simulation model extend far beyond the technical processes of constructing a computer-based description of a set of mathematical equations and the conduct of simulation experiments. the processes involved in investigating the accuracy and limitations of a model may include analysis of linearised descriptions derived from a more general nonlinear model; storage, retrieval and quantitative comparison of simulation and experimental results for a wide range of test conditions; system identification and parameter estimation; sensitivity analysis; experimental design; post-processing; visualisation; and documentation (not only of the model itself but of all the associated external validation experiments). these wide-ranging requirements strongly suggest that there are potential benefits to be obtained from the use of a properly integrated set of software tools covering continuous system simulation, optimisation routines, database software for experimental and simulation model response records, and facilities for visualisation [19]. such an integrated set of software tools should be made available within a well defined and properly managed software engineering environment. there is already awareness of these methods in safety-critical areas of engineering such as aircraft flight control system design, flight simulator development and the nuclear industry, where they are already being applied successfully. the systematic approach to model development and maintenance could and should be extended to other areas of engineering application, where it could offer significant benefits in terms of reduced development time, reduced risk and potentially better performance. achieving these benefits is largely a matter of education of those who are likely to be engaged in the development and use of computer-based models, and re-education of many who are in this field already but are unaware of the potential benefits of using a properly integrated and controlled software environment and more appropriate techniques for model validation.

5 conclusions

it is accepted that the quality of a software system is primarily determined by the design specification, by the quality and effectiveness of the development process, and by the commitment of all concerned in the project to excellence. systematic testing of software is an essential element of that development process. those engaged in software development activities are well served by excellent computing environments which can assist greatly in the management of complex development processes. many of the same issues arise in the development and testing of models used within engineering applications.
the final quality and suitability of a model depends upon the appropriateness of the specification (in the context of the intended application), the nature of the development process used and the skills of those involved, especially in terms of testing. the qualities required by computing professionals involved in software development and testing and by engineers engaged in the development and testing of models are very similar. it is believed that engineering degree programmes should give increased emphasis to the modelling process, including external validation principles. documentation of models should be emphasised, and the dangerous consequences of inadequate documentation should be stressed during the training of students. development tools and principles widely used for software development projects should be adapted for engineering applications which involve the development and use of computer-based models.

references
[1] murray-smith, d. j.: methods for the external validation of continuous system simulation models: a review. mathematical and computer modelling of dynamical systems, vol. 4, 1998, pp. 5–31
[2] murray-smith, d. j.: issues of model accuracy and external validation for control systems design. acta polytechnica, vol. 40, 2000, pp. 8–13
[3] tischler, m. b.: system identification requirements for high-bandwidth rotorcraft flight control systems. j. guidance, control and dynamics, vol. 13, 1990, pp. 835–841
[4] tischler, m. b.: system identification methods for aircraft flight control development and validation. in m. b. tischler (ed.), "advances in aircraft flight control", taylor and francis, london, 1996, pp. 35–69
[5] zeigler, b. p., praehofer, h., tag gon kim: theory of modeling and simulation (2nd edition). academic press, san diego, 2000
[6] kleijnen, j. p. c.: verification and validation of simulation models. european j. operational research, vol. 82, 1995, pp. 145–162
[7] draper, n. r., smith, h.: applied regression analysis. wiley, new york, 1966
[8] goodwin, g. c., payne, r. l.: dynamic system identification: experiment and design. academic press, new york, 1977
[9] beck, j. v., arnold, k. j.: parameter estimation in engineering and science. wiley, new york, 1977
[10] bellman, r., astrom, k. j.: on structural identifiability. math. biosci., vol. 7, 1970, pp. 329–339
[11] murray-smith, d. j., bradley, r., leith, d.: identifiability analysis and experimental design for simulation model validation. in maceri, f. and iazeolla, g. (eds.), eurosim'92 simulation congress (proceedings of the 1992 eurosim conference, capri, 28 sept.–4 oct. 1992), north holland, amsterdam, 1992, pp. 15–20
[12] tomovic, r.: sensitivity analysis of dynamic systems. mit press, 1962
[13] eslami, m.: theory of sensitivity in dynamic systems. springer-verlag, berlin, 1994
[14] bradley, r., padfield, g. d., murray-smith, d. j., thomson, d. g.: validation of helicopter mathematical models. trans. inst. measurement and control, vol. 12, 1990, pp. 186–196
[15] cameron, r. g.: model validation by the distortion method: linear state space systems. iee proc.-d., vol. 139, 1992, pp. 296–300
[16] kit, e.: software testing in the real world. addison-wesley, harlow, england, 1995
[17] kaner, c., falk, j., hung quoc nguyen: testing computer software (2nd edition). wiley, new york, 1999
[18] sisle, m. e.: validation throughout the simulation life cycle. in proc. 1985 summer computer simulation conference, july 22–24, 1985, chicago, illinois, the society for computer simulation, san diego, 1985, p. 168
[19] murray-smith, d. j.: enhanced environments for the development and validation of dynamic system models. mathematics and computers in simulation, vol. 39, 1995, pp. 459–464

prof. d. j. murray-smith
e-mail: djms@elec.gla.ac.uk
centre for systems and control and department of electronics and electrical engineering, university of glasgow, glasgow g12 8qq, scotland, uk

acta polytechnica 54(2):156–172, 2014, doi:10.14311/ap.2014.54.0156

exact renormalization group for point interactions

osman teoman turgut, cem eröncel
bogazici university, department of physics, 34342 bebek, istanbul
corresponding author: cem.eroncel@boun.edu.tr

abstract. renormalization is one of the deepest ideas in physics, yet its exact implementation in any interesting problem is usually very hard. in the present work, following the approach by glazek and maslowski in the flat space, we will study the exact renormalization of the same problem in a nontrivial geometric setting, namely in the two dimensional hyperbolic space. the delta function potential is an asymptotically free quantum mechanical problem, which makes it resemble nonabelian gauge theories, yet it can be treated exactly in this nontrivial geometry.

keywords: point interactions, exact renormalization group, harmonic analysis, hyperbolic spaces.

1. introduction

most problems of deep significance in the world of interacting many-particles are formulated by singular theories. typically, these are plagued by divergences, which reflects our ignorance of the physics beyond the scales defined by our original theory. a deep insight into this behavior came from wilson [23–28]. he argued that the physics beyond the scales of interest should be incorporated into lower energies by some effective interactions. as we will show in section 2.2, for a system defined by a hamiltonian, if one calculates the form of the effective hamiltonian at some energy scale $\lambda$ specifying the cutoff, the result will be $H^{\lambda}_{\mathrm{eff}} = PHP + X_{\lambda}$, where $PHP$ is the projection of the hamiltonian onto the subspace where the momentum eigenvalues are bounded by the cutoff $\lambda$, and the operator $X_{\lambda}$ depends on the higher degrees of freedom. since the cutoff is totally arbitrary, the effective hamiltonian should not depend on it. the essence of the wilsonian renormalization group, or the exact renormalization group (erg), is modifying the parameters of the theory without altering the energy eigenvalues, so that the effective hamiltonian becomes cutoff-independent. this is done as follows: we start by specifying the system at some high energy scale $\Lambda$, called the bare scale. then we introduce another scale $\lambda$, called the effective scale, such that $1 \ll \lambda \ll \Lambda$. the erg procedure consists of integrating out the degrees of freedom between these two scales. this integrating out procedure is not performed in a single step; in each step one integrates over an infinitesimal momentum shell. this transformation, which is called a renormalization group (rg) transformation, creates a trajectory, called an rg trajectory, in the space of theories, or in this particular case in the space of hamiltonians.
the hamiltonians at different scales are related by the requirement that the eigenvalues do not change as one changes the scale. in other words, the rg trajectory is determined by the condition that all the hamiltonians on this trajectory give the same set of eigenvalues as the unrenormalized theory. there is no systematic non-perturbative approach to implement this idea yet, but many interesting problems can be solved by means of some approximation method. the literature in this direction is immense, and we do not feel competent enough to cite all the relevant works; we will mention just a few things related to the present work. a perturbative approach to the renormalization group for effective hamiltonians in light-front field theory is given in [10]. another renormalization procedure for light-front hamiltonians is called the similarity renormalization group, where the bare hamiltonian, with an arbitrarily large but finite cutoff, is transformed by a similarity transformation which makes the hamiltonian band diagonal [9, 11]. a pedagogical treatment can be found in [7]. one of the main challenges is to understand renormalization in a system where the interactions lead to the appearance of bound states. indeed, quantum chromodynamics is the main example we have in mind: the theory is formulated in terms of variables that are physically unobservable at ordinary energy scales, and as a result of the interactions only their bound states become physical particles. since one is interested in understanding the formation of these bound states and calculating the resulting masses, it is, in principle, most natural to work with the hamiltonian directly. of course this is a very hard problem. as a result, it is valuable and interesting to learn more about renormalization and its non-perturbative aspects even in very simple systems using the hamiltonian formalism. this has been done by glazek and maslowski for the dirac-delta function in two dimensions [8]. in the present work, we will consider the same problem on a nontrivial manifold, the two-dimensional hyperbolic space. this is interesting because the gauge theory problem also has a nontrivial geometry when it is formulated on the space of connections modulo gauge-equivalent configurations [19]. hence, it is a nice exercise to see what type of complications may arise when the underlying geometry is nontrivial. we shall start by reviewing point interactions on the euclidean plane and explain why this problem requires renormalization. following [8], we will review the renormalization of point interactions in the euclidean plane using the wilsonian rg scheme and derive the flow equation. as an addendum to glazek and maslowski, we also investigate the range of renormalizability using the banach contraction principle. in the next section we shall analyze the same problem on the hyperbolic plane and show that the flow equation has the same form. finally, we will discuss a puzzle: this procedure fails at a technical level if one studies the same problem on a compact manifold, namely the two-dimensional sphere, s².
2. exact renormalization group on the euclidean plane

2.1. formulation of the problem

the schrödinger equation for the simplest type of point interaction on a d-dimensional euclidean space $\mathbb{R}^d$ is given, in units $\hbar = 1$ and $2m = 1$, as
$$\left(-\Delta_{\mathbb{R}^d} - g\,\delta^d(x)\right)\psi(x) = E\,\psi(x), \qquad (2.1)$$
where $\Delta_{\mathbb{R}^d}$ is the laplacian operator on $\mathbb{R}^d$ and $g$ is a real, positive parameter which determines the strength of the point interaction. if we parametrize the bound state energy by $E = -\nu^2$, then the schrödinger equation for the bound state of the system becomes
$$\left(-\Delta_{\mathbb{R}^d} - g\,\delta^d(x)\right)\phi(x) = -\nu^2\,\phi(x), \qquad (2.2)$$
where $\phi(x)$ is the bound state wavefunction. this expression can be written in momentum space as
$$(p^2 + \nu^2)\,\tilde\phi(p,\omega) = \frac{g}{(2\pi)^d}\int_{S^{d-1}}\!\mathrm{d}\omega\int_0^\infty \mathrm{d}p'\,p'^{\,d-1}\,\tilde\phi(p',\omega), \qquad (2.3)$$
where $\tilde\phi(p,\omega)$ is the fourier transform of $\phi(x)$. note that we have also switched to spherical coordinates in momentum space, where $p$ and $\omega$ denote the radial and angular coordinates respectively. from (2.3), $g^{-1}$ can be solved as
$$\frac{1}{g} = \frac{\mathrm{vol}\left(S^{d-1}\right)}{(2\pi)^d}\int_0^\infty \mathrm{d}p'\,\frac{p'^{\,d-1}}{p'^2 + \nu^2}, \qquad (2.4)$$
where $\mathrm{vol}\left(S^{d-1}\right)$ denotes the volume of the unit sphere $S^{d-1}$. it is easy to see that the integral diverges for $d \geq 2$, so regularization and renormalization are needed to obtain physical results. renormalization of point interactions has been studied by many authors: in position space [12, 14, 17], and in momentum space [2, 5, 6, 18, 20]. the renormalization group equations were derived in [1, 2]. instead of the conventional approach, we shall perform the renormalization using the exact renormalization group (erg) method.

2.2. renormalization of hamiltonians

in this section we perform the renormalization of a point interaction on the euclidean plane $\mathbb{R}^2$. this part is mainly a review of the lecture notes by głazek and maslowski [8], included for the sake of completeness; however, our approach is slightly different, and as an addendum we also investigate the range of renormalizability using the banach contraction principle. before any kind of regularization or renormalization, the schrödinger equation for the bound state can be written as
$$H\,|\phi\rangle = -\nu^2\,|\phi\rangle. \qquad (2.5)$$
we want to calculate the effective hamiltonian $H^\lambda_{\mathrm{eff}}$ at some energy scale $\lambda$, where $\lambda \gg 1$. this is done by integrating out the degrees of freedom above $\lambda$. to this end we introduce the operators $P$ and $Q$, which are projections onto the subspaces where the momentum eigenvalue takes the values $0 \leq p \leq \lambda$ and $p > \lambda$ respectively. let us also define $|\phi\rangle_P \equiv P|\phi\rangle$ and $|\phi\rangle_Q \equiv Q|\phi\rangle$. by using $P + Q = I$ and $PQ = 0$ one can split (2.5) as
$$PHP\,|\phi\rangle_P + PHQ\,|\phi\rangle_Q = -\nu^2\,|\phi\rangle_P, \qquad (2.6)$$
$$QHP\,|\phi\rangle_P + QHQ\,|\phi\rangle_Q = -\nu^2\,|\phi\rangle_Q. \qquad (2.7)$$
from (2.7) we find
$$|\phi\rangle_Q = (-\nu^2 - QHQ)^{-1}QHP\,|\phi\rangle_P. \qquad (2.8)$$
if we substitute this result back into (2.6) we get
$$\left(PHP + PHQ(-\nu^2 - QHQ)^{-1}QHP\right)|\phi\rangle_P = -\nu^2\,|\phi\rangle_P, \qquad (2.9)$$
and this implies that the effective hamiltonian at the scale $\lambda$ is given by
$$H^\lambda_{\mathrm{eff}} = PHP + PHQ(-\nu^2 - QHQ)^{-1}QHP \equiv PHP + X_\lambda. \qquad (2.10)$$
we note that, although we are working at the scale $\lambda$, the effective hamiltonian contains the term $X_\lambda$, called a counterterm, which depends on the higher degrees of freedom. as we shall see, we will use this counterterm to define the effective coupling constant at the scale $\lambda$. let us write the hamiltonian as $H = H_0 + V$, where $H_0$ is the free hamiltonian and $V$ denotes the point interaction, i.e. $\langle x|V|\phi\rangle = -g\,\delta^2(x)\,\phi(x)$. then (2.9) can be written in momentum space as
$$(p^2 + \nu^2)\,\tilde\phi_P(p) + \int_{\mathbb{R}^2}\mathrm{d}^2p'\,\langle p|PVP|p'\rangle\,\tilde\phi_P(p') + \int_{\mathbb{R}^2}\mathrm{d}^2p'\,\langle p|X_\lambda|p'\rangle\,\tilde\phi_P(p') = 0, \qquad (2.11)$$
where $\tilde\phi_P(p) = \langle p|\tilde\phi_P\rangle$.
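as an aside, the structure of (2.9) and (2.10) can be checked numerically on a finite-dimensional toy model before continuing; this check is an addition for illustration only and does not appear in the original article. for any hermitian matrix $H$ and any eigenpair $(E,\phi)$, the projected component $\phi_P$ must satisfy $\left(PHP + PHQ(E - QHQ)^{-1}QHP\right)\phi_P = E\,\phi_P$, which is exactly (2.9) with $E = -\nu^2$:

import numpy as np

# toy check of eqs. (2.9)-(2.10): here p projects onto the first k basis
# vectors of a random hermitian matrix, a stand-in for the momentum cutoff.
rng = np.random.default_rng(0)
n, k = 8, 3
h = rng.standard_normal((n, n))
h = (h + h.T) / 2.0                        # make it hermitian (real symmetric)
evals, evecs = np.linalg.eigh(h)
e, phi = evals[0], evecs[:, 0]             # pick the lowest eigenpair

hpp, hpq = h[:k, :k], h[:k, k:]
hqp, hqq = h[k:, :k], h[k:, k:]
x = hpq @ np.linalg.solve(e*np.eye(n - k) - hqq, hqp)   # the counterterm x
print(np.allclose((hpp + x) @ phi[:k], e*phi[:k]))      # -> true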
by defining xλ(p, p′) ≡ (2π)2 〈p|xλ |p′〉 and using 〈p|v |p′〉 = ∫ r4 d2x d2x′ 〈p| x〉〈x|v |x′〉〈x′| p′〉 = − g (2π)2 , (2.12) we get (p2 + ν2)φ̃p(p) − 1 (2π)2 ∫ r2 d2p′θλ(p) (g −xλ(p, p′)) φ̃p(p′) = 0, (2.13) where θλ(p) is the step function. we see that the g −xλ(p, p′) term plays the role of the effective coupling constant. from now on we denote it by gλ(p, p′). the counterterm xλ(p, p′) acts like a correction to the initial theory and by using it we have defined the renormalized coupling constant gλ(p, p′) at the scale λ. 2.3. applying the erg procedure now we are in a position to perform the erg analysis of our theory. since the original problem is rotationally symmetric, we want to keep the rotational symmetry intact. therefore we assume that the renormalized coupling constant gλ does not depend on ω. at the bare scale λ we can write the following equation: (p2 + ν2)φ̃(p,ω) = θλ(p) (2π)2 ∫ λ 0 dp′p′gλ(p,p′)ϑ(p′), (2.14) where ϑ(p) ≡ ∫ s1 dω φ̃(p,ω). (2.15) we remark that we have switched to the unprojected wavefunction φ̃(p,ω) and compensate this change by putting the step function θλ(p) in front of the integral, which ensures that (2.14) is valid for p ≤ λ. following the erg procedure, we write the analog of (2.14) at the infinitesimally lower scale λ −dλ. (p2 + ν2)φ̃(p,ω) = θλ−dλ(p) (2π)2 ∫ λ−dλ 0 dp′p′gλ−dλ(p,p′)ϑ(p′). (2.16) we can rewrite (2.14) as (p2 + ν2)φ̃(p,ω) = θλ(p) (2π)2 (∫ λ−dλ 0 dp′p′gλ(p,p′)ϑ(p′) + dλ λ gλ(p, λ)ϑ(λ) ) . (2.17) for p = λ this will give us (λ2 + ν2)φ̃(λ,ω) = 1 (2π)2 (∫ λ−dλ 0 dp′p′gλ(λ,p′)ϑ(p′) + dλ λ gλ(λ, λ)ϑ(λ) ) , (2.18) 158 vol. 54 no. 2/2014 exact renormalization group for point interactions and from this we can read of φ̃(λ,ω) as φ̃(λ,ω) = 1 (2π)2(λ2 + ν2) ∫ λ−dλ 0 dp′p′gλ(λ,p′)ϑ(p′), (2.19) where we have ignored the term which is proportional to dλ. if we substitute this result into (2.15) and perform the ω integral we find ϑ(λ) = 1 (2π)(λ2 + ν2) ∫ λ−dλ 0 dp′p′gλ(λ,p′)ϑ(p′). (2.20) finally we put this result into (2.17) to obtain (p2 + ν2)φ̃(p,ω) = θλ(p) (2π)2 ∫ λ−dλ 0 dp′p′ ( gλ(p,p′) + dλ λ 2π(λ2 + ν2) gλ(p, λ)gλ(λ,p′) ) ϑ(p′). (2.21) clearly, we can replace θλ(p) by θλ−dλ(p) and write (p2 + ν2)φ̃(p,ω) = θλ−dλ(p) (2π)2 ∫ λ−dλ 0 dp′p′ ( gλ(p,p′) + dλ λ 2π(λ2 + ν2) gλ(p, λ)gλ(λ,p′) ) ϑ(p′) (2.22) now comparing this equation with (2.16) gives us an equation for the coupling constant gλ−dλ(p,p′) = gλ(p,p′) + dλ λ 2π(λ2 + ν2) gλ(p, λ)gλ(λ,p′), (2.23) which can be put into differential form as − dgλ(p,p′) dλ = λ 2π(λ2 + ν2) gλ(p, λ)gλ(λ,p′). (2.24) this equation determines the rg trajectory of the coupling constant. to find the effective coupling at the effective scale λ we integrate this from λ to λ and find gλ(p,p′) = gλ(p,p′) + 1 2π ∫ λ λ ds s s2 + ν2 gs(p,s)gs(s,p′) (2.25) or gλ(p,p′) = g −xλ(p,p′) + 1 2π ∫ λ λ ds s s2 + ν2 gs(p,s)gs(s,p′). (2.26) although this is an ordinary differential equation with three variables and we have one initial condition, there is also the requirement that gλ(p,p′) should not depend on λ when we take the λ →∞ limit. this can be satisfied by the appropriate choice of the counterterm xλ(p,p′). we try an iteration procedure to obtain a solution. at the first order we choose g (1) λ (p,p ′) = g so that x(1)λ = 0. (2.27) after substituting these choices to (2.26) we get g (2) λ (p,p ′) = g −x(2)λ (p,p ′) + g2 2π ∫ λ λ ds s s2 + ν2 . 
(2.28) the integral diverges in the λ →∞ limit, therefore we choose the counterterm as x (2) λ (p,p ′) = g2 2π ∫ λ λ0 ds s s2 + ν2 , (2.29) where λ0 is an another energy scale chosen such that 1 � λ0 < λ � λ. now the effective coupling at the second order is finite and given by g (2) λ (p,p ′) = g − g2 2π ∫ λ λ0 ds s s2 + ν2 . (2.30) 159 osman teoman turgut, cem eröncel acta polytechnica we note that it is independent of p and p′. if we repeat this procedure, then by induction it is straightforward to see that g(n)λ and x (n) λ are independent of p and p ′ for all n. at the order n + 1, the effective coupling becomes g (n+1) λ = g −x (n+1) λ + 1 2π ∫ λ λ ds s s2 + ν2 ( g(n)s )2 . (2.31) we choose the counterterm as x (n+1) λ = 1 2π ∫ λ λ0 ds s s2 + ν2 ( g(n)s )2 , (2.32) hence we find g (n+1) λ = g − 1 2π ∫ λ λ0 ds s s2 + ν2 ( g(n)s )2 . (2.33) it is not trivial to conclude that this iteration process has a limit. we shall deal with this later in this section and for now we assume that g and λ0 are chosen such that the sequence {g (n) λ } ∞ n=1 has a limit given by lim n→∞ g (n+1) λ = gλ. (2.34) after taking the limit we can write for the effective coupling gλ = g − 1 2π ∫ λ λ0 ds s s2 + ν2 g2s, (2.35) which immediately implies g = gλ0 . this equation can be put into the following form∫ λ λ0 ds dgs ds = − 1 2π ∫ λ λ0 ds s s2 + ν2 g2s, (2.36) which implies dgs g2s = − 1 2π sds s2 + ν2 . (2.37) after integrating this equation from λ0 to λ and solving for gλ, we obtain the final answer. gλ = gλ0 1 + gλ04π log ( λ2+ν2 λ20+ν2 ). (2.38) this result is in agreement with the one given in [1]. we also note that as λ → ∞, gλ → 0, so the theory is asymptotically free. 2.4. estimating the range of renormalizability now we shall try to investigate under which conditions the sequence {g(n)λ } ∞ n=1 has a limit. however we do not have a closed form expression for g(n)λ , (2.31) tells us that g (n) λ depends on g (n−1) λ , hence we could not obtain a rigorous result for the convergence radius of the series {g(n)λ } ∞ n=1 in this way. an alternative way is to investigate under which circumstances the integral equation given by gλ = gλ0 − 1 2π ∫ λ λ0 ds s s2 + ν2 g2s (2.39) has a unique solution. this can be done by using the theory of ordinary differential equations. we begin by defining a compact interval i = [λ0, λ̃] ⊂ r where 1 � λ0 < λ < λ̃ � λ. let c(i) denote the space of continuous functions on i. it becomes a vector space if the vector space operations are defined pointwise. moreover it is well known that it is a banach space if we define a norm on c(i) by ‖g‖ = sup s∈i |gs| . (2.40) 160 vol. 54 no. 2/2014 exact renormalization group for point interactions we note that we use g, h as elements of c(i) to avoid confusion with the unrenormalized coupling constant g, that is, we made the definition g(s) ≡ gs. now we introduce a map t : c(i) → c(i) defined by t(g)(λ) = gλ0 − 1 2π ∫ λ λ0 ds s s2 + ν2 g2s. (2.41) then (2.39) can be expressed as g = tg, in other words the solution of (2.39) is also a fixed point of t. the existence and uniqueness of a solution to g = tg can be proved using the contraction principle which can be stated as follows: let d be a nonempty closed subset of a banach space b. if a map t : d → b is a contraction and maps d into itself, i.e. t (d) ⊆ d, then t has exactly one fixed point g which is in d [22]. t is a contraction, wich means that there exist a positive constant θ < 1 such that ‖t(g) −t(h)‖≤ θ‖g−h‖ for g,h ∈ d. 
(2.42) if t is a contraction, then the sequence {g(n)}∞n=1 defined by g(n) = t(g(n−1)) with g(1) = t(g0), (2.43) where g0 is an arbitrary element of d, converges to the fixed point g, that is lim n→∞ ‖g(n) −g‖ = 0. (2.44) therefore to estimate the range of renormalizability of our theory, we need to estimate under which cases the map t defined as in (2.41) is a contraction. first of all we need a closed subset of c(i). from (2.39) we can conclude that if g is a solution, then it should be monotone decreasing on i = [λ0, λ̃]. thus it is natural to choose our closed subset as d = {g ∈ c(a) | ‖g‖≤ gλ0} . (2.45) then ‖t(g)‖ = sup λ∈i |t(g)(λ)| = sup λ∈i ∣∣∣∣gλ0 − 12π ∫ λ λ0 ds s s2 + ν2 g2s ∣∣∣∣ = gλ0, (2.46) thus t(d) ⊆ d. so it remains to show that t is a contraction. let g,h ∈ d. then we have the estimate |t(g) −t(h)| = 1 2π ∫ λ λ0 ds s s2 + ν2 ( (hs)2 − (gs)2 ) ≤ 1 2π sup s∈[λ0,λ] ∣∣(hs)2 − (gs)2∣∣∫ λ λ0 ds s s2 + ν2 ≤ 1 2π sup s∈[λ0,λ] ∣∣(hs)2 − (gs)2∣∣∫ λ λ0 ds 1 s = 1 2π sup s∈[λ0,λ] |(hs + gs)(hs −gs)| log ( λ λ0 ) . (2.47) by taking the supremum of both sides we find ‖t(g) −t(h)‖≤ 1 2π sup s∈i |(hs + gs)(hs −gs)| log ( λ̃ λ0 ) = 1 2π ‖g + h‖‖g−h‖ log ( λ̃ λ0 ) ≤ 1 2π (2gλ0 )‖g−h‖ log ( λ̃ λ0 ) . (2.48) this tells us that t is a contraction if gλ0 π log ( λ̃ λ0 ) < 1. (2.49) if we interpret the interval i = [λ0, λ̃] as the range of renormalizability, then from (2.49) we can see that it is directly related to the coupling at the energy scale λ0. for a small coupling gλ0 � 1, we can shift up λ̃ considerably without breaking the contraction property of t . however for couplings gλ0 ∼ 1, the range is quite small or we may not even prove the existence of a solution by this approach. 161 osman teoman turgut, cem eröncel acta polytechnica 2.5. bound state solution we can check that with the coupling constant given as in (2.38) we get a finite answer for the bound state energy. for this we plug (2.38) into (2.14) to find (p2 + ν2)φ̃(p,ω) = θλ(p) (2π)2 gλ0 1 + gλ04π log ( λ2+ν2 λ20+ν2 ) ∫ λ 0 dp′p′ ∫ s1 dω′ φ̃(p′,ω′). (2.50) by defining n = ∫ λ 0 dp′p′ ∫ s1 dω′ φ̃(p′,ω′), (2.51) we obtain φ̃(p,ω) = θλ(p) (2π)2 gλ0 1 + gλ02π log ( λ λ0 ) n p2 + ν2 . (2.52) substituting this result into (2.51) and dividing both sides by n gives us 1 = 1 4π gλ0 1 + gλ04π log ( λ2+ν2 λ20+ν2 ) log (λ2 + ν2 ν2 ) . (2.53) from this equation we can solve for ν2 in the λ →∞ limit and find eb = lim λ→∞ −ν2 = −λ20 e−4π/gλ0 1 −e−4π/gλ0 , (2.54) which is finite. 3. exact renormalization group on the hyperbolic plane we will begin this section by constructing the spectral representation of the laplacian on the hyperbolic plane h2. by using this construction we shall perform the erg analysis of a point interaction on the hyperbolic plane. 3.1. the geometry and spectra of the hyperbolic plane we shall do the construction by using ideas given in [15] and [21]. there are various models for the hyperbolic plane. we will use the upper half-plane model, where h2 is realized as the set h2 = {z = (x,y) | x ∈ r ,y ∈ [0,∞)} , (3.1) with the riemannian metric gh2 given by gh2 = r2 y2 ( 1 0 0 1 ) , (3.2) where −r−2 is the constant sectional curvature. the riemannian volume element is given by dvh2 = √ det gh2 dx∧dy = dxdy y2/r2 , (3.3) and the laplacian is ∆h2 = y2 r2 ( ∂2 ∂x2 + ∂2 ∂y2 ) . (3.4) the eigenfunctions can be found by solving the closed eigenvalue problem on l2(h2,dvh2 ) expressed by (∆h2 + λ)f(z) = 0, (3.5) where λ ∈ r. for notational simplicity, let us define ∆̃h2 ≡ r2∆h2 and λ̃ ≡ r2λ. 
then (3.5) will be equivalent to (∆̃h2 + λ̃)f(z) = 0, (3.6) 162 vol. 54 no. 2/2014 exact renormalization group for point interactions since ∆̃h2 is separable in (x,y) coordinates we can use separation of variables. so we choose f(z) = v(x)w(y) and put this into (3.6) to obtain ∂2v ∂x2 1 v(x) + ∂2w ∂y2 1 w(y) + λ̃ y2 = 0. (3.7) this implies that there is a constant ξ2 such that ∂2v ∂x2 1 v(x) = −ξ2 and ∂2w ∂y2 1 w(y) + λ̃ y2 = ξ2. (3.8) the x-part can be solved easily as v(x) = eiξx. to solve the y-part we introduce a new function by u(y) ≡ y−1/2w(y). after substituting this into the y-part of (3.8) and making some rearrangements we get y2 ∂2u ∂y2 + y ∂u ∂y − ( y2ξ2 + 1 4 − λ̃ ) u(y) = 0. (3.9) the eigenvalues of the laplacian on h2 start with λ̃0 = 14 [4]. therefore 1 4 − λ̃ ≤ 0 so we introduce a new variable τ ∈ [0,∞) such that 14 − λ̃ = (iτ) 2. then (3.9) becomes y2 ∂2u ∂y2 + y ∂u ∂y − [ (yξ)2 + (iτ)2 ] u(y) = 0. (3.10) there are two linearly independent solutions which are the modified bessel functions iiτ (|yξ|) and kiτ (|yξ|). since iiτ (|yξ|) is singular at infinity, we exclude it from our solution space. moreover kiτ (|yξ|) has a singularity at ξ = 0 given by [21] kiτ (|yξ|) ∼ 2iτ−1γ(iτ) |yξ| −iτ + 2−iτ−1γ(−iτ) |yξ|iτ as ξ → 0+. (3.11) so we choose u(y) = |ξ|iτ kiτ (|yξ|) as a solution to (3.10) and write the eigenfunctions of ∆h2 as e0(z; τ,ξ) = 1 √ 2π eiξx √ y |ξ|iτ kiτ (|yξ|), (3.12) where we have put an extra (2π)−1/2 to simplify our construction of the spectral representation. we note that this does not alter the spectrum of the eigenvalues. to obtain the spectral representation of ∆h2 we introduce the following transform, (kψ)(τ,ξ) ≡ ψ̃(τ,ξ) = 1 r2 ∫ h2 dvh2 ψ(z)e0(z; τ,ξ) (k−1ψ̃)(z) = 2 π2 ∫ ∞ 0 dτ τ sinh(πτ) ∫ r dξ ψ̃(τ,ξ)e0(z; τ,ξ). (3.13) which is a combination of the fourier transform in (x,ξ) variables and the kontorovich-lebedev transform in (y,τ) variables [16]. the range of k can be formulated by considering l2(r,dξ) as the hilbert space corresponding to ξ, and then taking a direct integral of it with respect to the measure space (0,∞). so the map k can be expressed formally as k : l2(h2,dvh2 ) → ∫ ⊕ (0,∞) l2(r,dξ). (3.14) to prove that k provides a spectral representation of ∆h2 , we need two identities involving modified bessel functions which are given by [15] 2 π2 ∫ ∞ 0 dτ τ sinh(πτ)kiτ (u)kiτ (v) = vδ(u−v), (3.15) and 2 π2 ∫ ∞ 0 du u kiτ (u)kiτ′(u) = δ(τ − τ′) √ ττ′ √ sinh(πτ) sinh(πτ′) . (3.16) 163 osman teoman turgut, cem eröncel acta polytechnica moreover we shall also use the property kiτ (u) = k−iτ (u). using (3.16) one can show that k diagonalizes the laplacian, that is for all ψ̃ ∈ ∫⊕ (0,∞) l 2(r,dξ) we have −(k∆̃h2k−1ψ̃)(τ,ξ) = ( τ2 + 1 4 ) ψ̃(τ,ξ), (3.17) and therefore −(k∆h2k−1ψ̃)(τ,ξ) = 1 r2 ( τ2 + 1 4 ) ψ̃(τ,ξ), (3.18) so the spectrum of ∆h2 is given by σ(∆h2 ) = [ 1 4r2 ,∞ ) . (3.19) by using (3.15) it is straightforward to show that k is an isomorphism, i.e. for all ψ ∈ l2(h2,dvh2 ) we have (k−1kψ)(z) = ψ(z). (3.20) therefore, the transformation k with the eigenfunctions given as in (3.12) provide a complete spectral representation of ∆h2 . 3.2. formulation of the problem as in the flat case we consider a particle of mass m interacting with a dirac-delta potential on the hyperbolic plane h2. let z0 = (x0,y0),y0 6= 0 denote the location of the dirac-delta potential. the corresponding schrödinger equation for the bound state is written in coordinates ~ = 1, 2m = 1 as (−∆h2 −gδh2 (z,z0)) φ(z) = −ν2φ(z). 
(3.21) on a riemannian manifold (m,g) the dirac delta function δg(z,z0) is defined such that∫ m dvg(z) δg(z,z0) = 1 for all z0 ∈m. (3.22) thus δh2 (z,z0) is given by δh2 (z,z0) = y2 r2 δ(x,x0)δ(y,y0). (3.23) since k is an isomorphism we can write (3.21) as −k∆h2k−1φ̃(τ,ξ) −gkδh2 (z,z0)k−1φ̃(τ,ξ) = −ν2φ̃(τ,ξ). (3.24) the first term can be read directly from (3.17), which is −k∆h2k−1φ̃(τ,ξ) = 1 r2 ( τ2 + 1 4 ) φ̃(τ,ξ). (3.25) the second term can be easily calculated using δh2 (z,z0). the result is gkδh2 (z,z0)k−1φ̃(τ,ξ) = 2g π2r2 ∫ ∞ 0 dτ′τ′ sinh(πτ′) ∫ r dξ′e0(z0,τ,ξ)e0(z0,τ′,ξ′)φ̃(τ′,ξ′). (3.26) if we put (3.25) and (3.26) into (3.21) and rearrange the terms we obtain ( τ2 + a2 ) φ̃(τ,ξ) = 2g π2 ∫ ∞ 0 dτ′τ′ sinh(πτ′) ∫ r dξ′e0(z0,τ,ξ)e0(z0,τ′,ξ′)φ̃(τ′,ξ′), (3.27) where we made the definition a2 ≡ 14 + ν 2r2. our next goal is to determine the type and the cause of the divergence. to this end we make an attempt to solve (3.27). we define n ≡ ∫ ∞ 0 dτ′τ′ sinh(πτ′) ∫ r dξ′e0(z0,τ′,ξ′)φ̃(τ′,ξ′). (3.28) 164 vol. 54 no. 2/2014 exact renormalization group for point interactions then φ̃(τ,ξ) becomes φ̃(τ,ξ) = 2g π2 ne0(z0,τ,ξ) ( τ2 + a2 )−1 . (3.29) we put this result back into (3.28) and find 1 g = 2 π2 ∫ ∞ 0 dτ′τ′ sinh(πτ′) ( τ′2 + a2 )−1 ∫ r dξ′e0(z0,τ′,ξ′)e0(z0,τ′,ξ′). (3.30) let us denote the ξ-integral by υ(τ′). using the explicit form of the eigenfunctions as given in (3.12) we can write υ(τ′) as υ(τ′) = y0 2π ∫ r dξ′kiτ′(y0 |ξ|)k−iτ′(y0 |ξ|) = y0 π ∫ ∞ 0 dξ′kiτ′(y0ξ)kiτ′(y0ξ) (3.31) to evaluate the integral we will use the following integral representation of the modified bessel functions [3]: kν(z) = ∫ ∞ 0 due−z cosh u cosh(νu), rez > 0 (3.32) therefore υ(τ′) becomes υ(τ′) = y0 π ∫ ∞ 0 dξ ∫ ∞ 0 du ∫ ∞ 0 dv e−y0ξ(cosh u+cosh v) cosh(iτ′u) cosh(iτ′v) = y0 π ∫ ∞ 0 du ∫ ∞ 0 dv cos(τ′u) cos(τ′v) ∫ ∞ 0 dξ e−y0ξ(cosh u+cosh v) = 1 π ∫ ∞ 0 du cos(τ′u) ∫ ∞ 0 dv cos(τ′v) cosh u + cosh v . (3.33) the dv integral can be evaluated using the definite integral [13] ∫ ∞ 0 cos(ax)dx b cosh(βx) + c = π sin ( a β cosh−1 ( c b )) β √ c2 − b2 sinh ( aπ β ), for c > b > 0. (3.34) in our case a = τ′, b = 1, β = 1, c = cosh u and cosh u ≥ 1 > 0, for all u ∈ [0,∞) so we can use (3.34). hence dv integral becomes∫ ∞ 0 dv cos(τ′v) cosh u + cosh v = π sin ( τ′ cosh−1(cosh u) )√ cosh2 −1 sinh(τ′π) = π sin(τ′u) sinh(u) sinh(τ′π) . (3.35) then we have υ(τ′) = 1 sinh(τ′π) ∫ ∞ 0 du cos(τ′u) sin(τ′u) sinh u . (3.36) finally we will use [13] ∫ ∞ 0 dx sin(αx) cos(βx) sinh(γx) = π sinh ( πa γ ) 2γ ( cosh ( απ γ ) + cosh ( βπ γ )) (3.37) for im (α + β) < reγ. in our case α = β = τ′, γ = 1 and τ′ ∈ r, therefore im (α + β) = 0 < 1. thus∫ ∞ 0 du cos(τ′u) sin(τ′u) sinh u = π sinh(πτ′) 2 (cosh(πτ′) + cosh(πτ′)) = π 4 tanh(πτ′), (3.38) and υ(τ′) becomes υ(τ′) = ∫ r dξ′e0(z0,τ′,ξ′)e0(z0,τ′,ξ′) = π 4 tanh(πτ′) sinh(πτ′) . (3.39) finally we put this result into (3.30) and obtain 1 g = 1 2π ∫ ∞ 0 dτ′τ′ tanh(πτ′) ( τ′2 + a2 )−1 . (3.40) for large values of τ′, tanh(πτ′) ≈ 1 and the integrand behaves as 1/τ′ so, as in the flat case, we face with a logarithmic divergence. this analysis also shows us that there is no divergence in the ξ term. therefore we only need to concern ourselves with the renormalization of τ. 165 osman teoman turgut, cem eröncel acta polytechnica 3.3. applying the erg procedure we start by writing the eigenvalue equation at the bare scale λ. ( τ2 + a2 ) φ̃(τ,ξ) = 2 π2 θλ(τ) ∫ λ 0 dτ′gλ(τ,τ′)τ′ sinh(πτ′)ϑ(τ,τ′; ξ), (3.41) where gλ(τ,τ′) = g −xλ(τ,τ′) and ϑ(τ,τ′; ξ) ≡ ∫ r dξ′e0(z0,τ,ξ)e0(z0,τ′,ξ′)φ̃(τ′,ξ′). 
(3.42) at an infinitesimally lower scale λ −dλ we write ( τ2 + a2 ) φ̃(τ,ξ) = 2 π2 θλ−dλ(τ) ∫ λ−dλ 0 dτ′gλ−dλ(τ,τ′)τ′ sinh(πτ′)ϑ(τ,τ′; ξ). (3.43) we can rewrite (3.41) as ( τ2 + a2 ) φ̃(τ,ξ) = 2 π2 θλ(τ) (∫ λ−dλ 0 dτ′gλ(τ,τ′)τ′ sinh(πτ′)ϑ(τ,τ′; ξ) + dλ gλ(τ, λ)λ sinh(πλ)ϑ(τ, λ; ξ) ) , (3.44) and for τ = λ we obtain ( λ2 + a2 ) φ̃(λ,ξ) = 2 π2 (∫ λ−dλ 0 dτ′gλ(λ,τ′)τ′ sinh(πτ′)ϑ(λ,τ′; ξ) + dλ gλ(λ, λ)λ sinh(πλ)ϑ(λ, λ; ξ) ) . (3.45) then φ̃(λ,ξ) becomes φ̃(λ,ξ) = 2 π2 ( λ2 + a2 )−1 ∫ λ−dλ 0 dτ′gλ(λ,τ′)τ′ sinh(πτ′)ϑ(λ,τ′; ξ), (3.46) where we have again ignored the term proportional to dλ. from this result we can find ϑ(τ, λ; ξ) as ϑ(τ, λ; ξ) = 2 π2 ( λ2 + a2 )−1 ∫ λ−dλ 0 dτ′gλ(λ,τ′)τ′ sinh(πτ′) ∫ r dξ′e0(z0,τ,ξ)e0(z0, λ,ξ′)ϑ(λ,τ′; ξ′). (3.47) by putting the explicit expression for ϑ(λ,τ′; ξ′) we get ϑ(τ, λ; ξ) = 2 π2 ( λ2 + a2 )−1 ∫ λ−dλ 0 dτ′gλ(λ,τ′)τ′ sinh(πτ′) × ∫ r dξ′e0(z0,τ,ξ)e0(z0, λ,ξ′) ∫ r dξ′′e0(z0, λ,ξ′)e0(z0,τ′,ξ′′)φ̃(τ′,ξ′′) = 2 π2 ( λ2 + a2 )−1 ∫ λ−dλ 0 dτ′gλ(λ,τ′)τ′ sinh(πτ′) × ∫ r dξ′e0(z0, λ,ξ′)e0(z0, λ,ξ′) ∫ r dξ′′e0(z0,τ,ξ)e0(z0,τ′,ξ′′)φ̃(τ′,ξ′′) = 1 2π ( λ2 + a2 )−1 tanh(πλ) sinh(πλ) ∫ λ−dλ 0 dτ′gλ(λ,τ′)τ′ sinh(πτ′)ϑ(τ,τ′; ξ). (3.48) if we put this result back into (3.44) we find ( τ2 + a2 ) φ̃(τ,ξ) = 2θλ(τ) π2 ∫ λ−dλ 0 dτ′τ′ sinh(πτ′)ϑ(τ,τ′; ξ) ( gλ(τ,τ′) + λ tanh(πλ) 2π(λ2 + a2) gλ(λ,τ′)gλ(τ, λ) ) . (3.49) by comparing this with (3.43) we arrive at an equation for the coupling constant gλ−dλ(τ,τ′) = gλ(τ,τ′) + λ tanh(πλ) 2π(λ2 + a2) gλ(λ,τ′)gλ(τ, λ), (3.50) which can be put into differential form as − dgλ(τ,τ′) dλ = λ tanh(πλ) 2π(λ2 + a2) gλ(λ,τ′)gλ(τ, λ), (3.51) 166 vol. 54 no. 2/2014 exact renormalization group for point interactions and by integrating from λ to λ we find gλ(τ,τ′) = g −xλ(τ,τ′) + 1 2π ∫ λ λ ds s s2 + a2 tanh(πs)gs(s,τ′)gs(τ,s). (3.52) to obtain a solution we shall use the same procedure we used in the flat case. we begin with g (1) λ (τ,τ ′) = g so that x(1)λ (τ,τ ′) = 0. (3.53) then g(2)λ becomes g (2) λ = g −x (2) λ (τ,τ ′) + g2 2π ∫ λ λ ds s tanh(πs) s2 + a2 . (3.54) we choose the counterterm as x (2) λ (τ,τ ′) = g2 2π ∫ λ λ0 ds s tanh(πs) s2 + a2 , (3.55) so that the effective coupling at the second order is now finite and given by g (2) λ (τ,τ ′) = g − g2 2π ∫ λ λ0 ds s tanh(πs) s2 + a2 . (3.56) just like the flat case, g(n)λ (τ,τ ′) is independent of τ and τ′ for all n. at the order n + 1, the effective coupling becomes g (n+1) λ = g −x (n+1) λ + 1 2π ∫ λ λ ds s tanh(πs) s2 + a2 ( g(n)s )2 . (3.57) by choosing the counterterm as x (n+1) λ = 1 2π ∫ λ λ0 ds s tanh(πs) s2 + a2 ( g(n)s )2 , (3.58) we find g (n+1) λ = g − 1 2π ∫ λ λ0 ds s tanh(πs) s2 + a2 ( g(n)s )2 . (3.59) if we assume the existence of lim n→∞ g (n+1) λ = gλ, we can write for the effective coupling gλ = g − 1 2π ∫ λ λ0 ds s tanh(πs) s2 + a2 g2s, (3.60) which again implies g = gλ0 . this expression can be put into the following form − ∫ λ λ0 dgs dgs g2s = 1 2π ∫ λ λ0 ds s tanh(πs) s2 + a2 . (3.61) by evaluating the integral on the lhs and expressing tanh(πs) as tanh(πs) = 1 − 2 e2πs + 1 , we obtain 1 gλ = 1 gλ0 + 1 2π ∫ λ λ0 ds s s2 + a2 − 1 2π ∫ λ λ0 ds s s2 + a2 2 e2πs + 1 . (3.62) 167 osman teoman turgut, cem eröncel acta polytechnica let us define function α(s) by α−1(s) ≡ 1 2π ∫ s 0 ds′ s′ s′2 + a2 2 e2πs ′ + 1 . (3.63) then (3.62) becomes 1 gλ = 1 gλ0 + 1 2π ∫ λ λ0 ds s s2 + a2 − 1 α(λ) + 1 α(λ0) . 
(3.64) by redefining the coupling constant by g̃−1s ≡ g−1s + α−1s and evaluating the integral in (3.64) we obtain 1 g̃λ = 1 g̃λ0 + 1 4π log ( λ2 + a2 λ20 + a2 ) , (3.65) which can be solved as g̃λ = g̃λ0 1 + g̃λ04π log ( λ2+a2 λ20+a2 ). (3.66) we see that by slightly modifying the coupling constant we can obtain the same solution as for the flat case. to show that this modification brings no problems at high energies let us try to estimate α−1(λ). α−1(λ) = 1 2π ∫ λ 0 ds s s2 + a2 2 e2πs + 1 ≤ 1 π ∫ λ 0 ds se−2πs s2 + a2 . (3.67) using cauchy-schwartz inequality we find α−1(λ) ≤ 1 π [∫ λ 0 ds s2 (s2 + a2)2 ]1/2 [∫ λ 0 dse−4πs ]1/2 = 1 π [ 1 2 ( tan−1 ( λ a ) a − λ λ2 + a2 )]1/2 [ 1 4π ( 1 −e−4πλ )]1/2 . (3.68) we see that for high energies λ � 1, this correction term behaves like a constant. 3.4. estimating the range of renormalizability the conditions under which the sequence {g(n)λ } has a limit is investigated in an identical manner to the one presented in the flat case in section 2.4. in this case we define the map t : c(i) → c(i) as t(g)(λ) = gλ0 − 1 2π ∫ λ λ0 ds s tanh(πs) s2 + a2 g2s. (3.69) then we have the estimate |t(g) −t(h)| = 1 2π ∫ λ λ0 ds s tanh(πs) s2 + ν2 ( (hs)2 − (gs)2 ) ≤ 1 2π sup s∈[λ0,λ] ∣∣(hs)2 − (gs)2∣∣∫ λ λ0 ds s tanh(πs) s2 + ν2 ≤ 1 2π sup s∈[λ0,λ] ∣∣(hs)2 − (gs)2∣∣∫ λ λ0 ds 1 s = 1 2π sup s∈[λ0,λ] |(hs + gs)(hs −gs)| log ( λ λ0 ) . (3.70) as in the flat case, t is a contraction if gλ0 π log ( λ̃ λ0 ) < 1. (3.71) 3.5. bound state solution to find the bound state energy we rewrite (3.40) in terms of renormalized coupling. 1 gλ = 1 2π ∫ λ 0 dτ′τ′ tanh(πτ′) ( τ′2 + a2 )−1 , (3.72) 168 vol. 54 no. 2/2014 exact renormalization group for point interactions where 1 gλ = 1 g̃λ − 1 α(λ) = 1 g̃λ0 + 1 4π log ( λ2 + a2 λ20 + a2 ) − 1 2π ∫ λ 0 ds s s2 + a2 2 e2πs + 1 . (3.73) thus (3.72) becomes 1 g̃λ0 + 1 4π log ( λ2 + a2 λ20 + a2 ) − 1 2π ∫ λ 0 ds s s2 + a2 2 e2πs + 1 = 1 2π ∫ λ 0 dτ′ τ′ τ′2 + a2 − 1 2π ∫ λ 0 dτ′ τ′ τ′2 + a2 2 e2πτ ′ + 1 . (3.74) the integrals on both sides vanish so we are left with 1 g̃λ0 + 1 4π log ( λ2 + a2 λ20 + a2 ) = 1 4π log ( λ2 + a2 a2 ) , (3.75) with the solution in the λ →∞ limit lim λ→∞ −ν2r2 = −λ20 e−4π/g̃λ0 1 −e−4π/g̃λ0 + 1 4 . (3.76) 4. exact renormalization group on the sphere 4.1. formulation of the problem considering the same problems as in the previous sections we write the eigenvalue equation for the bound state as (−∆s2 + ν2)φ(ω) = gδs2 (ω, ω0)φ(ω), (4.1) where ω0 ∈ s2 is the location of the dirac-delta potential. since the spherical harmonics y ml (ω) form a complete set, we can expand φ(ω) in terms of them and using the eigenvalue relation −∆s2y ml (ω) = r −2l(l + 1)y ml (ω), we find ∞∑ l=0 l∑ m=−l cml [ r−2l(l + 1) + ν2 ] y ml (ω) = gδs2 (ω, ω0) ∞∑ l=0 l∑ m=−l cml y m l (ω). (4.2) now if we multiply both sides by y m′l′ (ω) and integrate over r −2 ∫ s2 dvs2 we obtain cml [ l(l + 1) + ν2r2 ] = g ∞∑ l′=0 l′∑ m′=−l′ cm ′ l′ y m l (ω0)y m′ l′ (ω0), (4.3) where we have also used the orthogonality relation of the spherical harmonics. the next step is to determine the type of divergence. for this we define n = ∞∑ l′=0 l′∑ m′=−l′ cm ′ l′ y m′ l′ (ω0), (4.4) so that cml is given by cml = n gy ml (ω0) l(l + 1) + ν2r2 . (4.5) by plugging this result into (4.4) we find g−1 as 1 g = 1 4π ∞∑ l′=0 2l′ + 1 l′(l′ + 1) + ν2r2 , (4.6) where we have used l′∑ m′=−l′ y m ′ l′ (ω0)y m′ l′ (ω0) = 2l′ + 1 4π . 
(4.7) by using the maclaurin-cauchy integral test we can see that the l′ sum in (4.6) is logarithmically divergent and the divergence is caused by the large l′ values. 169 osman teoman turgut, cem eröncel acta polytechnica 4.2. applying the erg procedure we begin by writing the eigenvalue equation at the bare scale λ. cml [ l(l + 1) + ν2r2 ] = θλ(l) λ∑ l′=0 gλ(l, l′)ϑ(l, l′; m), (4.8) where ϑ(l, l′; m) ≡ l′∑ m′=−l′ cm ′ l′ y m l (ω0)y m′ l′ (ω0). (4.9) since the eigenvalue spectrum is discrete, we take the second cutoff as λ − 1 instead of λ −dλ. we write cml [ l(l + 1) + ν2r2 ] = θλ−1(l) λ−1∑ l′=0 gλ−1(l, l′)ϑ(l, l′; m). (4.10) we rewrite (4.8) as cml [ l(l + 1) + ν2r2 ] = θλ(l) (λ−1∑ l′=0 gλ(l, l′)ϑ(l, l′; m) + gλ(l, λ)ϑ(l, λ; m) ) , (4.11) so that by substituting l = λ we get an expression for cmλ : cmλ = 1 λ(λ + 1) + ν2r2 (λ−1∑ l′=0 gλ(λ, l′)ϑ(λ, l′; m) + gλ(λ, λ)ϑ(λ, λ; m) ) . (4.12) we note that due to the discrete spectrum we could not ignore the second term. using (4.9) and (4.12) we can write ϑ(l, λ; m) = λ∑ m′=−λ cm ′ λ y m′ λ (ω0)y ml (ω0) = λ∑ m′=−λ y m ′ λ (ω0)y m l (ω0) λ(λ + 1) + ν2r2 (λ−1∑ l′=0 gλ(λ, l′)ϑ(λ, l′; m) + gλ(λ, λ)ϑ(λ, λ; m) + gλ(λ, λ) λ∑ m′=−λ y m ′ λ (ω0)y ml (ω0)ϑ(λ, λ; m ′) ) . (4.13) by putting explicit expressions for ϑ(λ, l′; m′) and ϑ(λ, λ; m′) we get ϑ(l, λ; m) = 1 λ(λ + 1) + ν2r2 (λ−1∑ l′=0 gλ(λ, l′) λ∑ m′=−λ y m ′ λ (ω0)y m ′ λ (ω0) l′∑ m′′=−l′ cm ′′ l′ y m′′ l′ (ω0)y ml (ω0) + gλ(λ, λ) λ∑ m′=−λ y m ′ λ (ω0)y m ′ λ (ω0) λ∑ m′′=−λ cm ′′ λ y m′′ λ (ω0)y ml (ω0) ) = 1 4π 2λ + 1 λ(λ + 1) + ν2r2 (λ−1∑ l′=0 gλ(λ, l′)ϑ(l, l′; m) + gλ(λ, λ)ϑ(l, λ; m) ) . (4.14) from this, ϑ(l, λ; m) can be solved as ϑ(l, λ; m) = ( 4π λ(λ + 1) + ν2r2 2λ + 1 −gλ(λ, λ) )−1 λ−1∑ l′=0 gλ(λ, l′)ϑ(l, l′; m). (4.15) by putting this result back into (4.11) we get cml [l(l + 1) + ν 2r2] = θλ(l) λ−1∑ l′=0 [ gλ(l, l′) + ( 4π λ(λ + 1) −m 2λ + 1 −gλ(λ, λ) )−1 ×gλ(l, λ)gλ(λ, l′) ] ϑ(l, l′; m), (4.16) 170 vol. 54 no. 2/2014 exact renormalization group for point interactions and by comparing this result with (4.10) we obtain a recursion relation for the effective coupling constant. gλ−1(l, l′) = gλ(l, l′) + ( 4π λ(λ + 1) + ν2r2 2λ + 1 −gλ(λ, λ) )−1 gλ(l, λ)gλ(λ, l′). (4.17) from this relation we can express the effective coupling at the effective scale λ as gλ(l, l′) = gλ(l, l′) + λ∑ s=λ+1 ( 4π s(s + 1) + ν2r2 2s + 1 −gs(s,s) )−1 gs(l,s)gs(s,l′), (4.18) or gλ(l, l′) = g −xλ(l, l′) + 1 4π λ∑ s=λ+1 ( s(s + 1) + ν2r2 2s + 1 − gs(s,s) 4π )−1 gs(l,s)gs(s,l′). (4.19) by applying the same iteration procedure as in the previous cases we can obtain g (n+1) λ = g − 1 4π λ∑ s=λ0 ( s(s + 1) + ν2r2 2s + 1 − g (n) s 4π )−1 (g(n)s ) 2. (4.20) this is a very complicated recursive relation, and unlike previous cases, we could not convert it to a differential equation from which we can solve for gλ in the n →∞ limit. 5. conclusion in this paper we have investigated a non-perturbative renormalization of point interactions on the two-dimensional hyperbolic space, using the exact renormalization group method. we have shown that the theory is asymptotically free and the flow equations have the same form as in the flat case. acknowledgements o. t. turgut would like to thank prof. p. exner and prof. m. znojil for the kind invitation to the aamp xi meeting which was held in villa lanna, and for the kind support for his lodging there. references [1] sadhan k. adhikari and t. frederico. renormalization group in potential scattering. physics review letters, 74:4572–4575, 1995. 
doi: 10.1103/physrevlett.74.4572
[2] sadhan k. adhikari and angsula ghosh. renormalization in non-relativistic quantum mechanics. journal of physics a: mathematical and general, 30(18):6553, 1997. doi: 10.1088/0305-4470/30/18/029
[3] george b. arfken and hans j. weber. mathematical methods for physicists, 6th edition. elsevier academic press, burlington, ma, usa, 2005.
[4] david borthwick. spectral theory of infinite-area hyperbolic surfaces. birkhäuser boston, new york, ny, usa, 2007.
[5] r. m. cavalcanti. exact green's functions for delta-function potentials and renormalization in quantum mechanics. e-print: arxiv:quant-ph/9801033v2, 2000.
[6] carlos f. de araujo, lauro tomio, sadhan k. adhikari, and t. frederico. application of renormalization to potential scattering. journal of physics a: mathematical and general, 30(13):4687, 1997. doi: 10.1088/0305-4470/30/13/020
[7] stanislaw d. glazek. renormalization group and bound states. acta phys. polon. b, 39:3395–3421, 2008. arxiv:0810.5258 [hep-th].
[8] stanisław d. głazek and tomasz maslowski. renormalization of hamiltonians. lecture notes distributed at the sixth international school and workshop on light-front quantization and non-perturbative qcd, iowa state university, ames, may 6 – june 14, 1996.
[9] stanisław d. głazek and kenneth g. wilson. renormalization of hamiltonians. phys. rev. d, 48:5863–5872, dec 1993. doi: 10.1103/physrevd.48.5863
[10] stanislaw d. glazek and kenneth g. wilson. perturbative renormalization group for hamiltonians. phys. rev. d, 49:4214–4218, apr 1994. doi: 10.1103/physrevd.49.4214
[11] stanisław d. głazek and kenneth g. wilson. asymptotic freedom and bound states in hamiltonian dynamics. phys. rev. d, 57:3558–3566, mar 1998. doi: 10.1103/physrevd.57.3558
[12] p. gosdzinsky and r. tarrach. learning quantum field theory from elementary quantum mechanics. american journal of physics, 59(1):70–74, 1991. doi: 10.1119/1.16691
[13] i. s. gradshteyn and i. m. ryzhik. table of integrals, series and products, 7th edition. elsevier academic press, burlington, ma, usa, 2007.
[14] michel hans. an electrostatic example to illustrate dimensional regularization and renormalization group technique. american journal of physics, 51(8):694–698, 1983. doi: 10.1119/1.13148
[15] peter d. hislop. the geometry and spectra of hyperbolic manifolds. proceedings of the indian academy of sciences – mathematical sciences, 104:715–776, 1994.
[16] n. n. lebedev. special functions and their applications. prentice-hall inc., englewood cliffs, nj, usa, 1965.
[17] lawrence r. mead and john godines. an analytical example of renormalization in two-dimensional quantum mechanics. american journal of physics, 59(10):935–937, 1991. doi: 10.1119/1.16675
[18] indrajit mitra, ananda dasgupta, and binayak dutta-roy. regularization and renormalization in scattering from dirac delta potentials. american journal of physics, 66(12):1101–1109, 1998. doi: 10.1119/1.19051
[19] p. k. mitter and c.-m. viallet. on the bundle of connections and the gauge orbit manifold in yang-mills theory. communications in mathematical physics, 79(4):457–472, 1981. doi: 10.1007/bf01209307
[20] su long nyeo. regularization methods for delta-function potential in two-dimensional quantum mechanics. american journal of physics, 68(6):571–575, 2000. doi: 10.1119/1.19485
[21] audrey terras. harmonic analysis on symmetric spaces and applications, volume i. springer-verlag, new york, ny, usa, 1985.
[22] wolfgang walter. ordinary differential equations. springer-verlag, new york, ny, usa, 1998.
[23] kenneth g. wilson. model of coupling-constant renormalization. phys. rev. d, 2:1438–1472, oct 1970. doi: 10.1103/physrevd.2.1438
[24] kenneth g. wilson. renormalization group and critical phenomena. i. renormalization group and the kadanoff scaling picture. phys. rev. b, 4:3174–3183, 1971. doi: 10.1103/physrevb.4.3174
[25] kenneth g. wilson. renormalization group and critical phenomena. ii. phase-space cell analysis of critical behavior. phys. rev. b, 4:3184–3205, 1971. doi: 10.1103/physrevb.4.3184
[26] kenneth g. wilson. the renormalization group: critical phenomena and the kondo problem. rev. mod. phys., 47:773–840, 1975. doi: 10.1103/revmodphys.47.773
[27] kenneth g. wilson. the renormalization group and critical phenomena. rev. mod. phys., 55:583–600, 1983. doi: 10.1103/revmodphys.55.583
[28] kenneth g. wilson and j. kogut. the renormalization group and the ε expansion. physics reports, 12(2):75–199, 1974. doi: 10.1016/0370-1573(74)90023-4

acta polytechnica 55(5):324–328, 2015, doi: 10.14311/ap.2015.55.0324, © czech technical university in prague, 2015, available online at http://ojs.cvut.cz/ojs/index.php/ap

cfd analysis of the spacer grids and mixing vanes effect on the flow in a chosen part of the tvsa-t fuel assembly

jakub juklíček a,∗, václav železný b
a czech technical university in prague, technická 4, praha 6, 166 07, czech republic
b power engineering department, czech technical university in prague, technická 4, praha 6, 166 07, czech republic
∗ corresponding author: jakub.juklicek@gmail.com

abstract. cfd is a promising and widespread tool for flow simulation in nuclear reactor fuel assemblies. one of the limiting factors is the complicated geometry of the spacer grids, which leads to a computational mesh with a high number of cells and a possible decrease in quality. an approach therefore has to be chosen which simulates the flow as precisely as possible and, at the same time, at a reasonable computational expense.
the goal of the following cfd analysis is to obtain a detailed velocity field in a precise geometry of a chosen part of the tvsa-t fuel assembly. this kind of simulation provides data for comparison that can be applied in many situations, for instance, for comparison with simulations when a porous media boundary condition is applied as a replacement for the spacer grid. the tvsa-t fuel assembly is equipped with combined spacer grids. a combined spacer grid has two functions. the first function is to support the fuel pins as a part of the assembly skeleton. the second function is to ensure coolant mixing with the mixing vanes. the support part is geometrically very complicated. therefore it is impossible to prepare a good quality computational mesh there. it is also difficult to create a mesh in the support part and the mixing part joint area because of the inaccurate connection between these two parts. a representative part of the tvsa-t fuel assembly with a combined spacer grid segment was chosen to perform the cfd simulation. some inevitable simplifications of the spacer grid geometry were performed. these simplifications were as insignificant as possible to preserve the flow character and to make it possible to prepare a quality mesh at the same time. a steady state cfd simulation was performed with the k-ε realizable turbulence model. heat transfer was not simulated and only the velocity field was investigated. the detailed flow characterization which was obtained from this calculation has shown that mixing vanes already affect the flow in the support part of the grid thanks to the suction effect. the vortex structures disappear approximately 50 mm behind the mixing vanes but the basic spiral character of the flow is preserved in the whole area between two following spacer grids. keywords: tvsa-t fuel assembly; spacer grid; mixing vanes; cfd; turbulent flow. 1. introduction thermalhydraulic analysis is needed to predict the flow and temperature distributions in fuel assemblies to ensure the safe operation of nuclear reactors. analysis of this kind can be performed by commercial cfd codes. a limiting factor in these calculations is the large computational domain of the fuel assembly which leads to unrealistic computational time or to a large computational mesh which is not computable with current hardware options available in most research facilities. the geometrically most complicated parts are the inlet of the fuel assembly, the spacer grids and the fuel assembly outlet. these parts have to be simplified to create a computational geometry where calculation is feasible. tvsa-t fuel is currently used in the temelin npp. the goal of this work is to simulate the flow in a chosen part of the tvsa-t fuel assembly with a spacer grid segment. the tvsa-t spacer grid has two functions: support of the assembly skeleton and coolant mixing. the spacer grid is therefore divided into two connected parts – the support part and the mixing vanes. both parts had to be simplified. the connection between these parts was also simplified and a computational mesh was successfully created. a steady state cfd simulation was performed to examine the velocity field. the velocity field was evaluated to describe the spacer grid effect on the flow character in the fuel assembly. 2. tvsa-t spacer grid geometry the support part is made of cells which provide support for the fuel pin in three points. the mixing vanes are placed on the top edge of metal strips forming the mixing part of the spacer grid. 
both parts are welded together and form a complete tvsa-t spacer grid, as illustrated in fig. 1. the red line indicates the transition between those two parts. figure 1. tvsa-t spacer grid with the detail of the support/mixing part transition. figure 2. symmetrical geometry of the mixing part. the support part had to be simplified, for example in places where the grid touches the fuel cladding. here, some problematic radii were replaced by straight edges to prevent high skewness of the cells. a simplification was also performed in the transition area between the two parts of the grid. the mixing part directly follows the simplified support part. slight modifications were also made to the mixing vanes geometry.
2.1. original geometry
the original geometry of the mixing part consists of a symmetrical configuration of the mixing vanes. the mixing vanes are directed around the fuel pin to create the spiral character of the coolant flow. this configuration is illustrated in fig. 2, where the identification by different colours shows that the configuration is periodically repeated every 120 degrees. fig. 3 illustrates the transition between the support part (yellow) and the mixing part (red). this transition is indicated by the red line in fig. 1, as mentioned before. the blue colour indicates a small cross-section area that has been simplified.
2.2. simplified geometry
the comparison of the original and simplified geometry of the support cell is illustrated in fig. 4. the radii were replaced by straight edges. the support point was preserved as the connection point between the fuel cladding and the support cell. this provision helped to enlarge the angle between the fuel pin and the support cell, which consequently enabled the creation of the computational mesh in this part of the grid. the support point was the most complicated part in terms of meshing. figure 3. details in the transition between the support and mixing part. figure 4. original (left) and simplified (right) geometry of the support cell in the fuel pin support plane. fig. 5 illustrates the final computational geometry used for cfd analysis. the support part and the mixing part are directly connected (the small flow cross sections illustrated in fig. 3 are not modelled). the arrows indicate the use of the periodic boundary condition, reflecting the already mentioned 120-degree symmetry. the volumes for the flow development in front of and behind the spacer grid are also part of the computational geometry. these volumes are described in more detail in the following section.
3. cfd analysis
3.1. computational mesh
the computational mesh was created using the gambit 2.4.6 software. the computational geometry was divided into three main parts, with a transition volume between each part: the volume for the flow development before the spacer grid (200 mm), the spacer grid area, and the volume for the flow development behind the spacer grid (500 mm). the distance recommended to develop the flow (20 hydraulic diameters) is 200 mm, and the distance between two following spacer grids is 500 mm. the mesh in those volumes is illustrated in fig. 6. moreover, the computational mesh in the spacer grid area is divided into two parts: the support part and the mixing vanes part.
this mesh division is just a logical approach to creating the mesh, and no interfaces are present in the geometry. the mesh consists of 5593797 cells. the most problematic part is the support cell area around the support point, with 2421032 cells (43 % of the cells), as illustrated in fig. 7. this area was meshed with tetrahedrons, and the maximum equisize skew is 0.95. the high equisize skew did not affect the solution convergence. the transition volumes and the volumes in the mixing vanes area were also meshed with tetrahedrons. the other parts were meshed with hexahedral or wedge cells. the mesh in the volumes to develop the flow is structured. figure 5. final simplified computational geometry. figure 6. mesh in the volumes for the flow development. figure 7. mesh in the fuel pin support plane. figure 8. results location.
3.2. boundary conditions
the boundary conditions correspond to the operational conditions of the vver-1000 nuclear reactor in the temelin npp [1]. the operational pressure is 15.7 mpa and the temperature is 290 °c. the water properties were set accordingly: density ρ = 746.5 kg/m³ and dynamic viscosity η = 9.25 · 10⁻⁵ pa s. the velocity inlet boundary condition was set at the inlet with velocity v = 5.536 m/s. the hydraulic diameter at the inlet is dh = 0.0106 m. the pressure outlet boundary condition was set at the outlet. the periodic boundary condition was set as shown before in fig. 5.
3.3. models and calculation settings
the cfd simulation was performed using the ansys fluent 14.5 cfd code. the k-ε realizable turbulence model was used in the calculation. the turbulence intensity i = 3 % was set according to the reynolds number re = 473576 [2]. the range of the dimensionless wall distance y+ was 30 < y+ < 300. the solution converged for the steady-state simulation. second-order upwind spatial discretization was used. heat transfer was not simulated. (a short numerical cross-check of these settings is given after the conclusions.)
4. results
fig. 8 illustrates the location of the following cross sections, where some of the results are shown. the blue colour indicates the volumes to develop the flow in front of the spacer grid and behind it. the velocity field obtained from the calculation was analysed, and the spacer grid influence on the coolant flow was evaluated. fig. 9 illustrates the velocity field in the plane with the support point of the fuel pin (axial velocity contours). wall functions are not applicable in the cells closest to the support point.
cross-sections in the mixing vanes area show the formation of the vortex structures as indicated also in fig. 10b and fig. 10c. these vortex structures are formed alongside the entire mixing vane. the vortex structures are also directed opposite to the axial flow, which explains the negative values of the velocity. fig. 11 illustrates the velocity vectors in the cross flow direction. the vortex structures are illustrated in fig. 11a. these vortex structures disappear approximately 50 mm behind the spacer grid, as can be seen in fig. 12, where the average vorticity in horizontal cross sections is illustrated. the 0 mm level in fig. 12 corresponds to the outflow edge of the mixing vanes, and 500 mm is the distance between the two following spacer grids, as was already mentioned. the vorticity remains constant after the 50 mm level and further behind the spacer grid. fig. 11b illustrates that the spiral character of the flow is preserved in the whole volume behind the spacer grid. the velocity vectors in fig. 11b illustrate the plane just before the following spacer grid (499 mm from the mixing vanes outflow edge). axial flow is already strongly dominant in this area but the arrows indicate that the flow character is still affected by the effect of the mixing vanes. 5. conclusions a spacer grid affected velocity field in the tvsa-t fuel assembly was obtained in the cfd simulation after the original geometry was successfully simplified and meshed. only necessary simplifications without any major changes were made in the grid geometry to mesh the geometry of the spacer grid qualitatively. it was shown that precise modelling is feasible in terms of meshing, but a large number of cells is consequently reached. the support point between the fuel pin and the support cell was preserved, but a finer mesh with a lower number of cells could be achieved through a greater simplification without significantly affecting 327 jakub juklíček, václav železný acta polytechnica figure 11. vectors of cross-flow velocity [m/s]. figure 12. average vorticity ω in horizontal cross sections behind the spacer grid. the flow character. cfd analysis showed that the spiral character of the flow is preserved in the whole volume between the two following spacer grids. simulation had also shown that mixing vanes already affect the flow in the support part of the grid, thanks to the suction effect. mixing vanes form vortex structures alongside their geometry. these vortex structures disappear approximately 50 mm behind the spacer grid. it should be stated that this simulation does not correspond to the real flow conditions in the tvsa-t fuel assembly (e.g., flow affected by the fuel assembly inlet or by the previous spacer grid). also, if a larger area (more fuel pins) is simulated, the spiral character of the flow is limited and the cross-flow between the subchannels takes on a more significant role (this occurs approximately 100 mm and further behind the spacer grid) [3]. nevertheless, this kind of simulation provides comparative data for more significant simplifications of the fuel assembly geometry or for replacement of the part or whole spacer grid geometry by the porous media boundary condition in cfd simulation. computational mesh could also serve as input for calculations of larger areas with more fuel pins and, e.g., two spacer grids if appropriate hardware and software is available in the research or industrial facility. 
acknowledgements
this work was done as partial fulfilment of the requirements for the degree of master of science in nuclear power devices at the czech technical university in prague, faculty of mechanical engineering.

references
[1] temelín, hlavní technické údaje [temelín, main technical data], 2013. http://www.cez.cz/cs/vyroba-elektriny/jaderna-energetika/jaderne-elektrarnycez/ete/technologie-a-zabezpeceni/2.html [2015-03-20]
[2] ansys fluent 12.0 user's guide. ansys, inc., 2009.
[3] železný, v., zácha, p.: úvodní simulace proudění skrze oblast distančních mřížek a turbulizujících lopatek ve vybrané skupině palivových kanálů souboru tvsa-t [initial simulation of the flow through the spacer grid and mixing vane region in a selected group of fuel channels of the tvsa-t assembly]. řež, 2011.

acta polytechnica 53(6):895–900, 2013, doi: 10.14311/ap.2013.53.0895, © czech technical university in prague, 2013, available online at http://ojs.cvut.cz/ojs/index.php/ap

robust control of end-tidal co2 using the h∞ loop-shaping approach

anake pomprapa a,∗, berno misgeld a, verónica sorgato a, andré stollenwerk b, marian walter a, steffen leonhardt a
a philips chair for medical information technology, helmholtz institute for biomedical engineering, rwth aachen university, pauwelsstrasse 20, d-52074 aachen, germany
b embedded software laboratory, chair of computer science 11, rwth aachen university, ahornstrasse 55, d-52074 aachen, germany
∗ corresponding author: pomprapa@hia.rwth-aachen.de

abstract. mechanically ventilated patients require appropriate settings of the respiratory control variables to maintain acceptable gas exchange. to control the carbon dioxide (co2) level effectively and automatically, system identification based on a human subject was performed using a linear affine model and a nonlinear hammerstein structure. subsequently, a robust controller was designed using the h∞ loop-shaping approach, which synthesizes the optimal controller for a specific objective while achieving stability with guaranteed performance. for demonstration purposes, the closed-loop ventilation control system was successfully tested in a human volunteer. the experimental results indicate that the blood co2 level may indeed be controlled noninvasively by measuring end-tidal co2 from expired air. keeping the limited amount of experimental data in mind, we conclude that h∞ loop-shaping may be a promising technique for the control of mechanical ventilation in patients with respiratory insufficiency.

keywords: closed-loop ventilation, embedded system, system identification, h∞ loop-shaping control design, biomedical application, control of etco2.

1. introduction
in an intensive care unit, patients with respiratory insufficiency due to lung disease or injury, or patients undergoing a surgical procedure, may require support from a mechanical ventilator to maintain appropriate gas exchange: oxygenation and carbon dioxide (co2) elimination from the blood circulation [1].
concerning co2 exchange, if an abnormal value of co2 pressure in arterial blood (paco2) persists for a long period of time, it can cause an imbalance of the ph value, which may be life-threatening. therefore, it is essential to regulate paco2 during ventilation therapy in order to avoid both hypercapnia and hypocapnia. assuming good diffusion conditions (as in a healthy lung), paco2 can be approximated by end-tidal co2 (etco2), the co2 partial pressure at the end of expiration, which can be measured noninvasively from the exhaled air. currently, no accurate mathematical model of the cardiopulmonary system is available that would allow etco2 to be estimated. therefore, in this work, we propose to define a model structure and to parametrize this model using system identification [2]. to this end, we have simplified the problem and assumed a single-input single-output (siso) system. minute ventilation (mv; in l/min) was used to control the value of etco2 [3–7]. to identify the parameters of a hammerstein model, our results from a grey-box identification are presented. for this, we assumed a nonlinear steady-state (or static) characteristic of the controlled plant (the patient) together with some linear time dynamics. note that the nonlinear hammerstein model is composed of a serial interconnection of a linear time-invariant system and a static nonlinearity; this is also classified as a block-oriented structure [8, 9]. the advantage of the block-oriented structure is that it is able to represent input and output multiplicities; this makes it well suited for application to a cardiopulmonary system, which shows a similar behaviour of input and output multiplicities [10]. in addition, a linear affine model was identified in order to compare it with the nonlinear hammerstein modelling results. after model validation, a robust controller was designed by the h∞ loop-shaping approach [11, 12]. note that this method guarantees closed-loop stability whilst offering performance and robustness trade-offs [13]. our goal for the h∞ loop-shaping approach was to tune the singular value of the open-loop gain (the open-loop transfer-function gain), to increase the bandwidth of the system and to eliminate steady-state errors. the advantage of this method is that it adds performance possibilities while providing an exact solution in the h∞-optimal sense. for this design method, we present simulation results evaluating the control performance under various conditions of model uncertainty. finally, a patient-in-the-loop ventilation system was connected to a human volunteer (the first author) to test the etco2 control algorithm in vivo. the remainder of this article is organized as follows.

figure 1. configuration of the system for closed-loop ventilation: the patient (with metabolism) connected to a mechanical ventilator (fio2, pip, peep, rr, i:e ratio) and monitored by a co2smo+ gas analyser (ph, etco2) and a pulse oximeter (spo2); a dspace control unit communicates via can-bus and rs232 with two microcontrollers and a medical panel pc setting fio2 and mv.

section 2 describes a unified approach to system identification, including a physiological description; here, all the models are parametrized and validated. section 3 presents the robust control design based on the h∞ loop-shaping technique and the simulation results for the controller under various conditions of model uncertainty.
section 4 presents a discussion of this work, and the conclusions are presented in section 5.

2. system modeling

2.1. system configuration

the proposed closed-loop system is composed of a medical panel pc for process monitoring, user interface and data storage, a mechanical ventilator (ventilogic ls, weinmann geräte für medizin gmbh, germany), a capnograph (co2smo+, philips gmbh, germany) for etco2 monitoring, a microautobox ii dspace control unit, and two arm-based microcontrollers. the system configuration is presented in figure 1. the data communication protocol was designed based on the can (controller area network) protocol. can-bus is a serial fieldbus, which allows additional devices to be connected to the system architecture using this topological arrangement. a data transfer rate of 1 mbit/s can be obtained, and collision avoidance between the messages is achieved for all connected devices based on priority assignment. therefore, the proposed closed-loop system is suitable for real-time automatic control of mechanical ventilation.

2.2. static nonlinearity

to serve as an example, system identification was conducted on one male volunteer with healthy homogeneous lungs and a body mass index within the normal range (23.3 kg/m²). this person was connected to a mechanical ventilator operating in pressure-controlled mode. all ventilation settings were adjusted manually to extract cardiopulmonary information from the subject. based on the static characteristics of the patient, etco2 is a nonlinear function of mv [4]. its static nonlinearity is the so-called "metabolic hyperbola" (figure 2).

figure 2. static nonlinearity between etco2 and mv: measured data from a human subject and the estimated curve of the relationship.

to extract the co2 information, the mv input was increased stepwise from 10 l/min to 25 l/min, and etco2 was measured at steady state. we could indeed confirm that etco2 is a decreasing function of the mv input at steady state: the more mv applied to the lungs, the less etco2 is measured from the subject. this relationship is important and can be employed for controlling etco2 with the mv input. a mathematical description may approximate the nonlinearity as a parabolic equation, as provided in (1):

$$\mathrm{etco_2} = N[\mathrm{mv}] = a \cdot \mathrm{mv}^2 + b \cdot \mathrm{mv} + c, \tag{1}$$

where $N[\mathrm{mv}]$ denotes a nonlinear function of mv. in this particular example, $a = 0.05$, $b = -2.55$ and $c = 52.80$ are the best parameters according to a least-squares fit.

2.3. linear affine model

as an initial estimate, a simple dynamic model can be applied to the complex input-output relationship of the system in order to evaluate to what extent such a simple model can represent the real system (figure 3). such an affine model is formulated as a linear combination of primitive variables, as provided in (2):

$$y(k) = y_o + \sum_{i=1}^{p} a(i)\,y(k-i) + \sum_{j=1}^{q} b(j)\,u(k-j), \tag{2}$$

where $y(k)$ represents etco2 at sampling point $k$, and $u(k)$ denotes the input $\Delta p$ at sampling point $k$. the sampling time for this study was 4.28 s. also, $p$ and $q$ are finite order parameters with $p \geq q$, and $y_o$ represents a constant, the so-called offset. the unknown parameters ($y_o$, $a(i)$ and $b(j)$) can be estimated using a least-squares algorithm based on the input and output measurements, as follows. according to the experimental data and the model structure from (2), the linear affine model can
be expressed in terms of a vector and matrices, as in (3):

$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}, \tag{3}$$

where

$$\mathbf{y} = \begin{bmatrix} y(k_o+1) & y(k_o+2) & \cdots & y(m) \end{bmatrix}^{\mathrm{T}},$$

$$\mathbf{X}^{\mathrm{T}} = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ y(k_o) & y(k_o+1) & \cdots & y(m-1) \\ \vdots & \vdots & & \vdots \\ y(k_o-p) & y(k_o+1-p) & \cdots & y(m-1-p) \\ u(k_o) & u(k_o+1) & \cdots & u(m-1) \\ \vdots & \vdots & & \vdots \\ u(k_o-q) & u(k_o+1-q) & \cdots & u(m-1-q) \end{bmatrix},$$

$$\boldsymbol{\beta} = \begin{bmatrix} y_o & a_1 & \cdots & a_p & b_1 & \cdots & b_q \end{bmatrix}^{\mathrm{T}}$$

for $k_o \geq p$ and $k_o \geq q$, and $\boldsymbol{\varepsilon}$ is an unknown disturbance vector. the vector $\boldsymbol{\beta}$ of unknown parameters can then be estimated using the ordinary least-squares algorithm that minimizes the sum of squared errors, as provided in (4) [2]:

$$\boldsymbol{\beta} = (\mathbf{X}^{\mathrm{T}}\mathbf{X})^{-1}\mathbf{X}^{\mathrm{T}}\mathbf{y}. \tag{4}$$

figure 3. input-output relationship used for system identification, with the results of model identification: measured etco2 [mmhg] compared with the 1st-order, 2nd-order and 2nd-order-with-zero linear models and the 1st-order hammerstein model, together with the ∆p input [hpa] over time [s].

in order to relax the muscles involved in respiratory breathing, a fixed respiratory rate (rr) of 14 bpm was introduced to allow the subject to minimize the work of breathing [5]. since the mechanical ventilator was set to pressure-controlled mode, mv is obtained as the product of tidal volume (vt) and respiratory rate (rr). because of the fixed rr of 14 bpm, the input for this particular system is transformed from mv to the difference in driving pressure ($\Delta p = \mathrm{pip} - \mathrm{peep}$) between inspiration and expiration, which has a direct influence on vt.

2.4. nonlinear hammerstein model

providing a representation of the signal flow using a block-oriented structure, the hammerstein model comprises a static nonlinearity $N[\cdot]$ at the input $u(k)$, cascaded with a linear dynamic model $H(z)$:

$$u(k) \longrightarrow N[\cdot] \xrightarrow{\;v(k)\;} H(z) \longrightarrow y(k)$$

the static nonlinearity maps the input $u(k)$ into the intermediate variable $v(k)$ (see (5)), and the linear dynamic model maps the intermediate variable $v(k)$ into the output $y(k)$ (see (6)). the model can be represented by

$$v(k) = N[u(k)], \tag{5}$$

$$y(k) = \sum_{i=1}^{p} a(i)\,y(k-i) + \sum_{j=1}^{q} b(j)\,v(k-j). \tag{6}$$

rearranging (5) and (6), the hammerstein model can be described as shown in (7):

$$y(k) = \sum_{i=1}^{p} a(i)\,y(k-i) + \sum_{j=1}^{q} b(j)\,N[u(k-j)]. \tag{7}$$

it can be seen from (7) that the hammerstein model is very similar to the linear dynamic model. because the qualitative behaviour of the transient response is entirely determined by the discrete transfer function of the linear subsystem $H(z)$, it can be used as an alternative to the linear model. this model can exhibit input multiplicities if the static nonlinearity is of the input-multiplicity form. according to the experimental data and the model structure, the unknown parameters ($a(i)$ and $b(j)$) can be estimated by a constrained least-squares algorithm, as provided in (3) and (4).

2.5. evaluation of model structure

both the linear and nonlinear model structures were used to describe this system. the performance index used in our evaluation was the root-mean-square error (rmse). table 1 presents the results of the comparison, divided into an estimation dataset and a validation dataset for the different model structures. based on the validation dataset, the first-order hammerstein model provides the best result of all the listed models. nevertheless, the first-order linear model also offers the best rmse of all the linear models. in addition, a qualitative comparison of the selected models is provided in figure 3.
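to make the identification step concrete, here is a small numerical sketch of the ordinary least-squares estimation (3)-(4) in python; the measured sequences below are invented placeholder values, not the study's data, and the hammerstein case is obtained by simply replacing the input samples with $N[u(k)]$ from (1).

```python
import numpy as np

def build_regressors(y, u, p, q):
    """stack the regression matrix x of (2)-(3): a column of ones
    (offset), p lagged outputs and q lagged inputs."""
    k0 = max(p, q)
    rows = []
    for k in range(k0, len(y)):
        row = [1.0]
        row += [y[k - i] for i in range(1, p + 1)]
        row += [u[k - j] for j in range(1, q + 1)]
        rows.append(row)
    return np.array(rows), y[k0:]

# hypothetical measured sequences (etco2 in mmhg, delta-p in hpa),
# sampled every 4.28 s as in the paper; for the hammerstein model,
# pass n(u) = 0.05*u**2 - 2.55*u + 52.80 instead of u
y = np.array([34.1, 33.8, 33.2, 32.9, 32.5, 32.4, 32.1, 32.0])
u = np.array([6.0, 6.0, 8.0, 8.0, 8.0, 8.0, 8.0, 8.0])

X, Y = build_regressors(y, u, p=1, q=1)
beta, *_ = np.linalg.lstsq(X, Y, rcond=None)   # solves (4)
y0, a1, b1 = beta
print(f"offset = {y0:.3f}, a(1) = {a1:.3f}, b(1) = {b1:.3f}")
```

with the real data, the rmse of each candidate structure would then be evaluated on the held-out validation segment, as reported in table 1.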
in the following section, the design of the h∞ controller and the simulations are conducted using the first-order linear model. it should be emphasized that capnography has an accuracy of ±2 mmhg for values in the 0–40 mmhg range, ±5 % for values in the 41–70 mmhg range, and ±8 % for values in the 71–150 mmhg range [14].

table 1. root-mean-square error (rmse) evaluation of the different model structures.

  model                estimated rmse   validated rmse
  1st linear affine    2.2475           2.2880
  2nd linear affine    2.2116           2.2988
  2nd affine w. zero   2.1597           2.4093
  1st hammerstein      2.1988           1.6709
  2nd hammerstein      2.1680           1.7804
  2nd ham. w. zero     2.1351           1.8085

figure 4. the h∞ control structure with the left coprime factorization of a plant g (controller k and blocks n, m⁻¹ with the perturbations ∆n, ∆m).

3. design of robust controller

3.1. loop-shaping design using h∞

this method combines the principle of bode's sensitivity integral [15] with the h∞ optimization technique, minimizing the h∞ norm in the presence of uncertainty. in designing the h∞ controller, both stability and performance are taken into account, with bounded differences between the nominal model and the real nonlinear plant. given a nominal discrete-time model of a plant $G$, it can be represented using a normalized left coprime factorization (lcf),

$$G = M^{-1}N, \tag{8}$$

where $M$ and $N$ are coprime matrices in $\mathcal{RH}_\infty$ (figure 4). a perturbed model associated with the lcf representation of the plant $G$, with perturbations assumed to be bounded, is given by (9):

$$G_p = (M + \Delta_M)^{-1}(N + \Delta_N), \qquad \big\| \begin{bmatrix} \Delta_M & \Delta_N \end{bmatrix} \big\|_\infty < \epsilon. \tag{9}$$

the objective is to find a robust controller $K$ that stabilizes $G_p$ and minimizes (10):

$$\gamma_{\min} = \left\| \begin{bmatrix} K(I - GK)^{-1}M^{-1} \\ (I - GK)^{-1}M^{-1} \end{bmatrix} \right\|_\infty. \tag{10}$$

the solution of this h∞-norm problem can be obtained using the algorithm proposed by mcfarlane and glover [11]. define $[A, B, C, 0]$ to be a minimal state-space realization of the plant $G$; the suboptimal h∞ controller $K$ can then be computed from discrete algebraic riccati equations [16]. the loop-shaping objective is to design a robust controller $K$ so that $\bar{\sigma}(GK) \gg 1$, or $|GK(j\omega)| \gg 1$ in the siso case, at low frequencies (minimizing the effect of output disturbances), and $\bar{\sigma}(GK) \ll 1$, or $|GK(j\omega)| \ll 1$, at high frequencies (minimizing the effect of sensor noise and providing robustness against additional uncertainty). the singular value of the open-loop gain $GK$ is shaped according to this design criterion.

figure 5. shaping the singular value of the single-input single-output (siso) system: singular values [db] of the plant σ(g), the desired loop shape, and the designed loop gain σ(gk) as functions of frequency [rad/s].

based on the singular value of the plant in figure 5, integral action was chosen to ensure zero steady-state error. in addition, the cut-off frequency was designed to be 0.28 rad/s, compared with the very low cut-off frequency of the plant at 0.03 rad/s. in this way, the bandwidth is increased by a factor of approximately 10. referring to [11], γ = 1.86, or ε = 0.5376, indicates an allowable proportional uncertainty in n and m of approximately 50 % in the crossover frequency range of the shaped plant.

3.2. step response simulation

a step response, shown in figure 6, represents the control performance of the h∞ loop-shaping controller for both the linear affine and the nonlinear hammerstein models. etco2 is controlled to a target of 34 mmhg before t = 0 s and is forced to 35 mmhg after t = 0 s.
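before turning to the time-domain results, the following sketch illustrates how the robustness margin of the normalized coprime factorization can be computed from the two riccati equations of the mcfarlane–glover theory. it uses the continuous-time form for brevity (the paper's design uses the discrete formulae of [16]), and the first-order plant is a placeholder whose pole merely mimics the 0.03 rad/s cut-off quoted above, not the identified model.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# illustrative first-order plant g(s) = 1/(s + 0.03); placeholder values
A = np.array([[-0.03]])
B = np.array([[1.0]])
C = np.array([[1.0]])

# generalized control and filter riccati equations of the normalized
# left coprime factorization (continuous-time form):
#   a'x + xa - x b b' x + c'c = 0   and   a z + z a' - z c'c z + b b' = 0
X = solve_continuous_are(A, B, C.T @ C, np.eye(1))
Z = solve_continuous_are(A.T, C.T, B @ B.T, np.eye(1))

# gamma_min = sqrt(1 + rho(xz)); 1/gamma_min bounds the admissible
# coprime-factor uncertainty epsilon of (9)
gamma_min = np.sqrt(1.0 + np.max(np.abs(np.linalg.eigvals(X @ Z))))
print(f"gamma_min = {gamma_min:.3f}, "
      f"max coprime-factor uncertainty = {1.0 / gamma_min:.3f}")
```

for the actual shaped plant, the reported γ = 1.86 corresponds in the same way to the ε = 0.5376 uncertainty bound discussed above.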
in both models, the rise time is 14 s and the settling time is within 60 s. the response has no steady-state error. the robust h∞ loop-shaping controller gives good transient and steady-state performance for both the linear affine and the nonlinear hammerstein models. for all model parameters, a satisfactory step response can also be achieved with a parameter uncertainty of about 12 %. the controller is therefore robust in coping with model and parameter uncertainty, and with disturbance.

figure 6. step response of the h∞ loop-shaping controller for the linear affine and the hammerstein models.
figure 7. details of the controller performance obtained from a human volunteer.

3.3. evaluation of closed-loop control ventilation

after simulating the controller performance, closed-loop control ventilation was implemented and tested on a human volunteer. the resulting control performance is presented in figure 7. robustness with a step response can be achieved for the required etco2 of 35 mmhg under real disturbance and measurement uncertainty. within approximately 60 s, the target etco2 is reached. the response lies in an acceptable range for clinical application. it should be emphasized that relatively good performance is obtained by the robust h∞ loop-shaping controller.

4. discussion

for an abnormal lung condition, like acute respiratory distress syndrome (ards), etco2 does not correspond to paco2, and invasive measurement of paco2 is therefore required. consequently, our identification and control design for etco2 is only valid for patients after treatment with the open-lung recruitment maneuver [17], or for patients whose etco2 reading appropriately reflects the true paco2 (as, for example, in chronic obstructive pulmonary disease) [18]. in such cases, the mean value of co2 pressure in arterial blood (paco2) is approximately 9 mmhg higher than the value of etco2, and the required value of etco2 should be adapted based on calibration against paco2, which can be obtained from blood-gas analysis. the second-order model with one zero corresponds to the pharmacological two-compartment model proposed in [7]. etco2 represents the output from the lung compartment and is one of the state variables of the model. the zero position depends on the gas transport from the tissues to the lung. our findings on the pole and zero positions based on animal experiments correlate well with the findings derived from 18 patients [7]: one pole is located near the origin of the unit circle and another pole is near the point (1, 0) in the unit circle. the identified parameters are subject to measurement errors due to the limitations of capnography: its accuracy for etco2 monitoring is ±2 mmhg for the 0–40 mmhg range, ±5 % for the 41–70 mmhg range, and ±8 % for the 71–150 mmhg range, and the resolution is 1 mmhg [14]. therefore, the identified parameters will not perfectly reflect the underlying parameters of the plant. in other words, parameter uncertainty also exists because of the limitations in the accuracy and resolution of the measuring device itself.
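to illustrate the pole configuration just described, the sketch below builds a second-order discrete transfer function with one pole near the origin of the unit circle and one near (1, 0); the zero location, gain and pole values are invented illustrative numbers (normalized to unit steady-state gain), not the identified parameters.

```python
import numpy as np
from scipy import signal

# assumed illustrative values: fast pole near the origin, slow pole
# near (1, 0), one zero; dt is the 4.28 s sampling time from the paper
z0, p1, p2, dt = 0.5, 0.05, 0.97, 4.28
k = (1 - p1) * (1 - p2) / (1 - z0)   # unit dc gain at z = 1
num = k * np.poly([z0])
den = np.poly([p1, p2])

sys = signal.dlti(num, den, dt=dt)
t, (y,) = signal.dstep(sys, n=40)
# the slow pole dominates the open-loop dynamics
print(f"dominant time constant ~ {-dt / np.log(p2):.0f} s, "
      f"y at t = {t[-1]:.0f} s: {y[-1, 0]:.3f}")
```

a slow open-loop pole of this kind is exactly what the integral-action loop shaping of section 3.1 speeds up to achieve the ~60 s closed-loop settling time.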
considering the input applied to the system, mv gives a better result than inverse minute ventilation (imv) for both the linear affine and the hammerstein model. the design problem is simplified to a siso system by regarding mv as the input and etco2 as the output. in a pressure-controlled ventilation mode, tidal volume cannot be adjusted directly; thus, the pressure difference must be changed in order to meet the required tidal volume. the hammerstein model provided better numerical results than the affine model, especially for the validation dataset (table 1). note that the hammerstein model has been successfully applied in several other biomedical applications [19], including the stretch-reflex emg [20] and heart-rate regulation [21]. in our clinical application, the complex nonlinear cardiopulmonary system can be modelled better by a hammerstein model than by an affine model. it should be noted that the block-oriented narmax models [8], which offer a modelling of output multiplicities, did not give acceptable results when they were tested; those results are therefore not presented or discussed here. regarding patient safety, an mv that is too low leads to low oxygenation and a risk of mortality. on the other hand, an extremely high mv carries a high risk of trauma and possible lung damage. thus, actuator saturation should be introduced into our system design with the aid of an anti-windup technique, and should be considered in future research work.

5. conclusion

in a clinical application, etco2 needs to be feedback-controlled to a certain value to minimize the risk of hypercapnia or hypocapnia. to realize this task, we propose a model-based approach, identifying the model parameters of a complex nonlinear cardiopulmonary system using a block-oriented structure with the linear affine and the hammerstein models. a robust control design was implemented using the h∞ loop-shaping approach based on the derived affine model. the simulation results indicate a good control performance of the h∞ loop-shaping controller for both the linear affine and the nonlinear hammerstein models, including possible parameter variations of up to 12 %. finally, for demonstration purposes, the controller was tested in a task to control etco2 in a healthy volunteer, and a positive result was achieved with the robust h∞ loop-shaping controller.

acknowledgements

the authors acknowledge financial support from the german federal ministry of education and research (bundesministerium für bildung und forschung, bmbf) within the oxivent project under grant number 16sv5605.

references

[1] a. pomprapa, d. schwaiberger, b. lachmann, s. leonhardt. predictive 3d model of co2 elimination for optimal pressure-controlled ventilation. am j respir crit care med 185(3):a1717, 2012.
[2] l. ljung. system identification: theory for the user. prentice hall, new jersey, 2nd edn., 1999.
[3] f. w. chapman, j. c. newell, r. j. roy. a feedback controller for ventilatory therapy. ann biomed eng 13:359–372, 1985.
[4] r. rudowski, c. spanne, g. matell. computer simulation of a patient end tidal co2 controller system. comput meth prog bio 28:243–248, 1989.
[5] a. pomprapa, m. walter, c. goebel, et al. l1 adaptive control of end-tidal co2 by optimizing the muscular power for mechanically ventilated patients. 9th ifac symposium on nonlinear control systems, 2013.
[6] s. mersmann. smartcare: automatizing clinical guidelines. biomed tech 54:283–288, 2009.
[7] j.
hahn, g. dumont, j. m. ansermino. system identification and closed-loop control of end-tidal co2 in mechanically ventilated patients. ieee trans inf technol biomed pp. 1–9, 2012.
[8] m. pottmann, r. k. pearson. block-oriented narmax models with output multiplicities. aiche journal 44(1):131–140, 1998.
[9] r. k. pearson, m. pottmann. gray-box identification of block-oriented nonlinear models. j process contr 10(4):301–315, 2000.
[10] a. pomprapa, r. pikkemaat, h. luepschen, et al. self-tuning adaptive control of the arterial oxygen saturation for automated open lung maneuvers. 3. dresdner medizintechnik-symposium, 2010.
[11] d. c. mcfarlane, k. glover. a loop-shaping design procedure using h∞-bounded uncertainty. ieee trans autom control 37(6):759–769, 1992.
[12] r. a. hyde. h∞ aerospace control design: a vstol flight application. springer verlag, 1995.
[13] d. c. mcfarlane, k. glover. robust controller design using normalized coprime factor plant descriptions. springer verlag, 1989.
[14] novametrix medical systems inc. co2smo+ respiratory profile monitor service manual, model 8000. catalog no. 9758-90-02 (rev 2), p. 67, 2001.
[15] h. w. bode. network analysis and feedback amplifier design. d. van nostrand inc., 1945.
[16] d. w. gu, p. h. petkov, m. m. konstantinov. formulae for discrete h∞ loop-shaping design procedure controllers. 15th ifac world congress 15(1):351, 2002.
[17] j. j. haitsma, r. a. lachmann, b. lachmann. open lung in ards. acta pharmacol sin 12:1304–1307, 2003.
[18] m. kartal, e. goksu, o. eray, et al. the value of etco2 measurement for copd patients in the emergency. eur j emerg med 18:9–12, 2011.
[19] i. w. hunter, m. j. korenberg. the identification of nonlinear biological systems: wiener and hammerstein cascade models. biol cybern 55:135–144, 1986.
[20] d. t. westwick, r. e. kearney. identification of a hammerstein model of the stretch reflex emg using separable least squares. ieee embs 3:1901–1904, 2000.
[21] s. w. su, s. huang, l. wang, et al. nonparametric hammerstein model based model predictive control for heart rate regulation. ieee embs pp. 2984–2987, 2007.

acta polytechnica 53(6):895–900, 2013

doi:10.14311/ap.2014.54.0022
acta polytechnica 54(1):22–27, 2014, © czech technical university in prague, 2014, available online at http://ojs.cvut.cz/ojs/index.php/ap

size analysis of solid particles at the experimental device for multi-stage biomass combustion

michaela hrnčířová∗, michal špiláček, jiří pospíšil

energy institute, department of power engineering, faculty of mechanical engineering, brno university of technology, technická 2896/2, 616 69 brno
∗ corresponding author: zarybnicika@fme.vutbr.cz

abstract. this paper presents the results of an analysis of ash particles produced by biomass combustion in an experimental device. the main parts of the device are the water heater, the gasifying chamber, the air preheater, and the fuel feeder. the device can be modified for combustion in an oxygen-enriched atmosphere. sawdust and wood chips, laid loosely into the device, were used as fuel. ash specimens were extracted from various parts of the device.
for the measurements themselves, we used the analysette 22 microtec plus universal laser diffraction device manufactured by the fritsch company, in the size range from 0.08 µm to 2000 µm. the device utilizes laser diffraction for particle size analysis.

keywords: laser diffraction, emission, biomass combustion, exhaust pipe.

1. introduction

it is essential to know the size of particles in many fields of industry and science, e.g. materials science, medicine, biology, and the power industry. the size of a particle [4] is taken to be the diameter (radius) of a perfectly spherical particle. for any other particle shape, the size parameter is its length, which must be defined according to the measurement method used. laser diffraction [3] is currently the most widely-used method for particle size measurements. the physical principle it utilizes has been known since the beginning of the 20th century, but the method was developed only after the invention of suitable laser devices and computers. nowadays this method is replacing other particle size measurement methods, due to its flexibility and readiness.

2. description of the equipment

the analysette 22 microtec plus (fig. 1) is a universal analysing device for particle measurement of suspensions, emulsions and solid matter by laser diffraction. the device consists of a central measuring unit and a dispersion module. there are two semiconductor lasers in the central measuring unit, each with an output of 7 mw and with wavelengths of 532 nm and 940 nm, respectively. the measuring range is from 0.008 µm up to 2000 µm. the dispersion unit is an ultrasound water bath (organic fluids or saturated inorganic salt solutions can be used on a short-term basis) with a maximum output of 50 w and a frequency of 40 khz. any fluid comes into contact with chemically stable materials only [2].

figure 1. laser analyser.
figure 2. concept of laser beam shielding.

the extent of laser beam diffraction and the way the beam is reflected depend on the size and the optical attributes of each particle that the beam strikes. the diffracted light strikes a fourier lens, behind which a sensor (photodiode) measures the distribution of the intensity of the diffracted light in the focal plane as a function of the angle of incidence. the sizes of the particles and their distribution are calculated from the sensor results. the concentration of the particles introduced into the device must be low enough to avoid multiple diffraction of the laser light. at the same time, the concentration must be high enough for the particles to diffract enough laser light for the sensor to detect. the optimum shielding of the laser beam for wet dispersion is 10–15 % [2]. after an optimum sample amount is inserted into the device, the measurement starts automatically. fig. 2 illustrates the principle of laser beam shielding: in the upper section, the laser beam is not shielded at all and strikes the photodiode with its full intensity; in the lower section, a sample is introduced and the intensity striking the photodiode decreases due to the shielding of the laser beam by the introduced particles.

figure 3. diagram of the experimental device.

the device from which the samples were collected consists of a fuel feeder, a gasifying chamber, a water heater, an air preheater, an optional mobile oxygen separator and a cyclone separator.
the water heater, with a nominal heat output of 110 kw, was originally designed for the combustion of solid fuels and has been modified to combust syngas from the gasifying chamber. the gasification chamber [5, 6] is designed for the combustion of loose wood, mostly wood chips and other similar biomass material. sawdust and wood chips were used for the experiments. figure 3 shows a diagram of the device with the extraction points highlighted.

3. measurement results

a total of four ash specimens were selected to undergo size analysis in the laser analyser. the laser analyser presents the results in data sheets, which display the frequency of each range of particle sizes within the whole sample. all results are transparently arranged in the form of a diagram (frequency curve, cumulative or distribution curve, see figure 4). the frequency curve represents the particle size distribution [4] related to particle volume. the cumulative distribution curve gives the percentage of particles in a sample that are smaller than the selected size.

figure 4 and table 1 show the results of the ash analysis from the gasifying chamber. it can be seen from figure 4 that the specimen contains a large proportion of big non-combusted particles. however, it also contains a significant amount of smaller particles. thanks to the quality dispergation of the specimen in the water bath, even these small particles are detected, since they are not agglomerated; agglomeration is typical for sieve analysis.

figure 4. results from the gasifying chamber.

table 1. results from the gasifying chamber.
  x [µm]   y [%]
  20       19.7
  25       24
  45       37.4
  63       45.5
  90       53.4
  125      60
  180      67.3
  250      75.1
  355      84.7
  500      93.6
  710      98.8
  1000     100

figure 5 presents an analysis of a specimen taken from the water heater: ash that was carried by the gas products from the gasifying chamber into the water heater. in comparison with the previous figure, it is evident that the number of large particles has decreased considerably and that the number of particles smaller than 100 µm has increased.

figure 5. results from the water heater.

table 2. results from the water heater.
  x [µm]   y [%]
  1        2.8
  1.3      3.1
  10       14.9
  25       33.2
  50       64.3
  100      82.6
  315      99.5
  500      100

sample 3 was collected at a sampling spot located in the middle section of the flue gas pipe. the graph clearly shows that the measured sample contains only small particles; the larger particles sediment as they pass through the flue gas pipe. the results are presented in figure 6 and in table 3.

figure 6. results from the flue gas pipe.

table 3. results from the flue gas pipe.
  x [µm]   y [%]
  1        0.7
  1.3      0.9
  10       5.7
  25       22.9
  50       50.6
  100      81.1
  315      100
  500      100

the last sample was collected in the ash bin located at the end of the cyclone separator. the main functional parameters of a cyclone are separability and pressure loss. in general, the greater the pressure loss, the better the separability, but this may not always be the case. the particle size distribution of the gas admixture and the geometric configuration have the greatest influence on cyclone separability. when determining separability, we assume that:

• the particles are spherical in shape, and their final speed is governed by stokes' law;
• the concentration and the particle fall speed are uniform over the cyclone inlet section;
• the particles do not interfere with each other, i.e. they neither agglomerate nor break up;
• no turbulence or secondary flow occurs.

our experience with calculations and the assumptions stated above imply that this cyclone can provide a particle separability of 30 µm. figure 8 states the percentage results for 30 µm: labels a and b stand for the percentage frequency ratios, and areas c and d mark the percentage weight ratios. for the conversion to a weight ratio (c, d), all particles were considered spherical in shape, with a density ρ = 1200 kg/m³. the results show that particles larger than 30 µm are the main weight carrier, despite the fact that their frequency in the measured sample is significantly lower than that of particles smaller than 30 µm. all this is shown in figure 7 and table 4.

figure 7. results from the cyclone separator.
figure 8. overview of all measurements.

table 4. results from the cyclone separator.
  x [µm]   y [%]
  0.1      0
  2.5      5.7
  4        9.3
  8        20.9
  15       42.1
  30       75
  55       95.8
  100      100

table 5. frequency and weight portions at 30 µm.
          frequency portions      weight portions
  [µm]    a [%]      b [%]        c [%]      d [%]
  30      75         25           13.15      86.85

particles leaving the device for the atmosphere will be analysed in a future extension of our research.

4. conclusions

this paper has presented a detailed analysis of ash specimens extracted from biomass combustion. the modern laser diffraction method used for the analysis fully replaces earlier methods, such as sedimentation and sieve analysis. the major advantage of this method is particle detection in a water environment, where good and sufficient dispergation of the sample occurs. the goal was to show the size spectrum of the particles taken from each sampling spot in the device. our evaluation of these samples shows the decrease in the size of the particles as they travel through the whole device. the largest particles are captured right at the beginning, when the fuel is being gasified. afterwards, the particles gradually sediment as they travel through the device. figure 8 presents a comparison of all measurement results, in which the differences in particle size are clearly visible.

acknowledgements

the work presented in this paper has been supported by the european regional development fund in the framework of the netme centre research project within the research and development for innovation operational programme. our work has also received financial support from the brno university of technology under grant fsi-j-132090.

references

[1] pabst, w., gregorová, e. charakterizace částic a částicových soustav [characterization of particles and particle systems]. prague: vysoká škola chemicko-technologická, 2007, pp. 1–22.
[2] fritsch company manual for the analysette 22 microtec plus device.
[3] wanogho, s., gettinby, g., caddy, b. particle size distribution analysis of soils using laser diffraction. forensic science international, vol. 33, 1987, pp. 117–128.
[4] tiehm, a., herwig, v., neis, u. particle size analysis for improved sedimentation and filtration in waste water treatment. water science and technology, vol. 39, 1999, pp. 99–106.
[5] lisý, m., baláš, m., moskalík, j., pospíšil, j. research into biomass and waste gasification in atmospheric fluidized bed.
proceedings of the 3rd wseas international conference on energy planning, energy saving, environmental education (epese '09), renewable energy sources (res '09), waste management (wwai '09), 2009, pp. 363–368.
[6] lisý, m., baláš, m., moskalík, j., štelcl, o. biomass gasification primary methods for eliminating tar. acta polytechnica, 52(3), pp. 66–70, 2012.

acta polytechnica 54(1):22–27, 2014

acta polytechnica 53(2):113–116, 2013, © czech technical university in prague, 2013, available online at http://ctn.cvut.cz/ap/

the process of plasma chemical photoresist film ashing from the surface of silicon wafers

siarhei bordusau∗, siarhei madveika, anatolii dostanko

belorussian state university of informatics and radioelectronics, minsk, belarus
∗ corresponding author: bordusov@bsuir.by

abstract. at present, research into new technical methods of treating materials with plasma, including the development of energy- and resource-saving technologies for microelectronic manufacturing, is particularly topical. in order to improve the efficiency of microwave plasma chemical ashing of photoresist films from the surface of silicon wafers, a two-stage treatment process was developed. the idea of the developed process is that the wafers coated with photoresist are pre-heated by microwave energy. this occurs because the microwave energy is initially not spent on the excitation and maintenance of a microwave discharge but is absorbed by the silicon wafers, which have a high tangent of dielectric losses. in the next step, after the excitation of the microwave discharge, the interaction of the oxygen plasma with the pre-heated photoresist film proceeds more intensively. delaying the start of the plasma-forming process in the vacuum chamber of the plasmatron with respect to the beginning of microwave energy generation by the magnetron increases the total rate of photoresist ashing from the surface of silicon wafers approximately 1.7 times. the advantage of this method of microwave plasma chemical processing of semiconductor wafers is the possibility of intensifying the process without changing the design of the microwave discharge module and without increasing the input microwave power supplied to the discharge.

keywords: microwave plasma, microwave plasmatron, plasma chemical ashing of photoresist.

1. introduction

at present, the technology of plasma chemical photoresist ashing from the surface of semiconductor wafers mostly uses high-frequency or microwave discharges, the main drawback of which is the inertia of the process at the beginning of plasma formation. the ashing of photoresist begins with a delay connected with the warming up of the reaction-discharge chamber's construction elements and of the wafers, the duration of the delay depending on the temperature of the reaction-discharge chamber and the wafers and, consequently, on the size and volume occupied by the semiconductor wafers [8]. the transition of the microelectronic industry to 200 mm and even larger diameter wafers requires reaction-discharge chambers of more than 0.005 m³, which significantly affects the duration of photoresist ashing with plasma.
the formation of high-volume, large-area microwave discharges leads to certain construction difficulties, as industrial microwave plasma sources are designed for a frequency of 2.45 ghz, characterized by a small electromagnetic wavelength (12.2 cm in free space [12]) and a small depth of penetration of the electromagnetic field into the plasma. experiments on ashing photoresist from the surface of silicon wafers during batch processing revealed a specific character of the "loading" effect, which manifests itself in partial absorption of the electromagnetic wave power entering the microwave resonator by the material with a high dielectric loss tangent (silicon) [1, 4, 5]. these peculiar features of the interaction between microwave energy and silicon wafers should be taken into consideration, and exploited, when developing their processing in a plasma microwave discharge. the essence of plasma chemical photoresist ashing lies in the interaction of active particles of, as a rule, oxygen plasma (atoms, excited molecules, ions, radicals) with the photoresist molecules, which are organic compounds [10]. the speed of these reactions is determined by the fluxes of particles onto the wafer and by their temperature. hence, with an increase in the reaction-discharge chamber's dimensions and in the quantity and diameter of the semiconductor wafers treated in the chamber, it is necessary to increase the supplied microwave power. since medium-power microwave magnetrons for technological applications have a maximum power of about 0.8–1.5 kw [6], it becomes necessary to create additional stimulating effects on the process of batch microwave plasma chemical treatment of materials. the methods of controlling plasma chemical treatment of semiconductor wafers using a gas-plasma microwave discharge may be conventionally divided into two groups: changing the electrical modes of treatment (power, frequency and form of the electric pulses, etc.) and changing the non-electrical operation modes (pressure, type of gas, speed of the gas stream, etc.) [3]. these approaches are not always efficient enough, and since the increase in the rate of photoresist ashing from the wafer surface is proportional to the total power supplied to the discharge from various energy sources, stimulating the treatment process with effects such as bombardment by inert gas ions, laser radiation, an electron flow, thermal heating before the treatment, heating with microwave radiation, etc., is of great interest [9].
the purpose of the performed research was to study the application of this method in developing highly effective technologies and the possibility of its use in microwave plasma chemical etching processes using the equipment already available at factories. 2. method of experiment the research was carried out on the base of microwave discharge installation with a quartz tunnel-flow reactor 200 mm in diameter and 320 mm in length with a volume about 0.009 m3 placed in the center of a resonator. an m-112 microwave magnetron was used for generating electromagnetic radiation of at least 650 w, the efficiency coefficient being equal to 60 %. the total measured power consumed by the installation was equal to 1300 w. monocrystal silicon wafers, 76 mm in diameter and 0.3 mm thick, coated with of 1.4 ± 0.1 µm thick s1813g2sp15 photoresist film, treated in accordance with the standard modes of lithography, were used as objects for treating. during the experiments to study the rate of photoresist ashing the wafers were placed in the reaction discharge chamber in pairs. the moment of terminating the microwave plasma chemical photoresist ashing from the surface of silicon wafers was controlled with the spectrometer sl 40-2-2048 isa with 777.96 nm oxygen line intensity. o2 from oxygen cylinders was used as plasma forming gas. the temperature of silicon wafers was measured with the help of a pyrometer testo 830-t1 (an infrared thermometer). 3. results and their discussion in the result of experiments it was found out that the delay of plasma forming process in a plasmatron vacuum chamber with respect to the beginning of microwave energy generated by a magnetron leads to the increased rate of photoresist ashing from the surface of silicon wafers approximately by 1.7 times and reaches 40 nm/s. it means that the efficiency of plasma ashing of photoresist from the surface of silicon wafers can be increased with two stage method of treatment. the first stage is warming up of a semiconductor wafers with microwave energy, the second is interaction between microwave plasma and warmed up photoresist. the essence of the developed two-stage method of plasma chemical photorest ashing from the surface of semiconductor wafers lies in a preliminary warming up of photoresist with microwave energy. initially, microwave energy is not spent on exciting and maintaining microwave discharge but is completely absorbed by silicon wafers with photoresist having high tangent angle of dielectric losses thanks to which they are warmed up. during further interaction of oxygen plasma with preliminarily warmed up photoresist the process of oxidizing destruction of photoresist is more intensive and is accompanied by volatile components of reaction products, which are removed from the reaction chamber with a vacuum pump. the delay of plasma formation with respect to the beginning of microwave energy generation by a magnetron was performed in the following way. the reaction discharge chamber is pumped out to a sufficient pressure and then is filled with plasma forming gas to reach the pressure one/two times higher than the working pressure at which plasma will not ignite under the influence of microwave energy. then the magnetron microwave generation is switched on. the wafers being warmed up for a certain time under the influence of microwave energy, the required pressure of plasma forming gas is set, which provides microwave discharge excitation and the microwave plasma chemical treatment process is performed. 
the semiconductor wafers are kept in the plasma until the photoresist is completely removed from their surface. figures 1 and 2 show the dependences of the full cycle duration (microwave warming-up plus plasma treatment) and of the rate of photoresist ashing from the surface of the silicon wafers, calculated with respect to the total treatment duration, on the delay of plasma formation with respect to the beginning of microwave generation by the magnetron. the photoresist was ashed from the surface of two si wafers. the presented dependences have an extreme (non-monotonic) character. we believe this may be connected with the structural changes going on in the photoresist film under the influence of both the temperature factor [7, 13] and the processes caused by the microwave field [11].

figure 1. dependence of the total duration t of photoresist ashing from the surface of si wafers on the delay time t of the beginning of plasma formation with respect to the beginning of microwave energy generation.
figure 2. dependence of the calculated speed v of photoresist ashing from the surface of si wafers on the delay time t of the beginning of plasma formation with respect to the beginning of microwave energy generation.

the experiments investigating the silicon wafer temperature as a function of the duration of treatment under the influence of the microwave field (fig. 3) showed that, in the time range up to 20 seconds, the dependence is close to linear. according to the data shown in fig. 3, it is evident that during microwave warming-up of the silicon wafers their temperature reaches and exceeds the threshold working temperature for this type of photoresist (130–140 °c).

figure 3. temperature of si wafers θ depending on the time t of their warming up under the influence of microwave energy.

analysing the data of figs. 2 and 3, it is possible to conclude that, under the conditions of the experiment, structural changes in the photoresist film may take place after 15 s of microwave energy application. this results in an increased resistance of the film to plasma destruction [2] and, consequently, in a reduction of the plasma chemical photoresist ashing rate. in order to determine how the process-acceleration effect manifests itself in the case of preliminary microwave warming-up of si wafers, a series of experiments was carried out to investigate the "loading" effect under the treatment conditions studied here. the experiments were carried out loading 2, 5, 10 and 15 wafers into the reaction-discharge chamber. the delay of plasma formation with respect to the beginning of microwave generation was chosen, in accordance with the data of fig. 2, to be 15 seconds.

figure 4. rate of photoresist film ashing from the surface of si wafers depending on the number of si wafers in the reactor (delay of the beginning of plasma formation with respect to the beginning of microwave energy generation t = 15 s).

figure 4 shows the data on the rate of photoresist ashing from the surface of si wafers depending on the quantity of si wafers in the reaction-discharge chamber.
we suppose in both cases the process of photoresist film ashing takes place in accordance with the same mechanism but with a higher (up to 1.7 times) rate in case of preliminary activation owing to the microwave warming up. the experimental data on the investigation of dependence of the rate of plasma chemical photoresist film ashing from the surface of si wafers on the pressure of o2 in the reaction discharge chamber is shown in fig. 5. figure 5 shows that the investigated two-stage process of microwave plasma chemical photoresist film ashing is characterized by the presence of an evident extreme value of photoresist film ashing rate in the range of o2 pressure which qualitatively coincides with the analogous dependence obtained in [5]. 115 siarhei bordusau, siarhei madveika, anatolii dostanko acta polytechnica figure 5. rate values of photoresist film ashing from the surface of two silicon wafers depending on the pressure o2 (the delay duration of the beginning of plasma formation with respect to the beginning of microwave energy generation t = 15 s). these results can also confirm the conclusion about the identity of microwave plasma chemical photoresist film coating processes’ mechanisms for the studied two-stage method of microwave treatment and a standard single stage process. 4. conclusions the performed experimental investigation shows that microwave energy can be successfully employed for intensifying plasma chemical processes running at plasma chemical photoresist film coating ashing during the production of electronic devices. the effect of acceleration the treatment is achieved by absorption of microwave energy in silicon plates and their heating during processing without plasma. the advantage of such microwave plasma chemical treatment of semiconductor wafers is the possibility of a significant reduction of the process duration without changing the construction of microwave discharge module for technological application and without increasing the microwave power supplied to the discharge. references [1] c. boisse-laporte, j. marec (eds.). microwave discharges: fundamentals and applications. 3rd international workshop, abbaye royale de fontevraud, 20 – 25 april, 1997. les ulis, france, 1998. [2] s. v. bordusov. influence of the microwave plasma processing on the photoresist material. electronic materials processing 5(217):78–80, 2002. [3] s. v. bordusov. microwave plasma technologies in the productions of electronic devices. bestprint, minsk, 2002. edited by a.p. dostanko. [4] s. v. bordusov, s. i. madveyko. investigation of the influence of the effect of “loading” the discharge chamber on the optical characteristics of the microwave resonator type plasmatron. proceedings of polotsk state university 8:103–106, 2010. [5] s. v. bordusov, s. i. madveyko. investigation of the influence of electrical regimes of plasma formation on the local chemical activity of the microwave discharge plasma. proceedings of polotsk state university 3:119–123, 2012. [6] a. n. didenko, b. v. zverev. microwave energy. science, moscow, 2000. [7] a. v. dinaburg, i. g. erusalimchik, v. s. zelova, et al. investigation of physico-chemical properties of positive photoresists. electronic equipment sor 2 semiconductor devices 7(71):17–49, 2009. [8] v. m. dolgopolov, v. i. ivanov, v. a. krotkov, et al. spectral control indicator of process removing the photoresist in oxygen plasma. electronic engineering series 7 technology, production organization and equipment 5(114):27–30, 1982. [9] a. p. dostanko, et al. 
intensification of solid-state structures production by concentrated energy flows. bestprint, minsk, 2005. [10] a. p. dostanko, s. p. kundas, s. v. bordusov, et al. plasma processes in the productions of electronic devices. fuainform, minsk, 2001. [11] v. i. dubkova, s. v. bordusov, a. p. dostanko, et al. investigation of the influence of microwave field on the curing process and properties of epoxy compositions. journal of engineering physics 70(6):1014–1019, 2009. [12] a. macdonald. microwave breakdown in gases. mir, moscow, 1969. [13] w. moro. microlithography. mir, moscow, 1990. 116 acta polytechnica 53(2):113–116, 2013 1 introduction 2 method of experiment 3 results and their discussion 4 conclusions references acta polytechnica doi:10.14311/ap.2014.54.0363 acta polytechnica 54(5):363–366, 2014 © czech technical university in prague, 2014 available online at http://ojs.cvut.cz/ojs/index.php/ap resonant switch model of hf qpos and equations of state of neutron stars and quark stars zdeněk stuchlík∗, martin urbanec, andrea kotrlová, gabriel török, kateřina goluchová institute of physics, faculty of philosophy and science, silesian university in opava, bezručovo nám. 13, cz-74601 opava, czech republic ∗ corresponding author: zdenek.stuchlik@fpf.slu.cz abstract. the mass and spin estimates of the 4u 1636–53 neutron star obtained by the resonant switch (rs) model of high-frequency quasi-periodic oscillations (hf qpos) are tested by a large variety of equations of state (eos) governing the structure of neutron stars. neutron star models are constructed under the hartle–thorne theory of slowly rotating neutron stars calculated using the observationally given rotational frequency frot = 580 hz (or alternatively frot = 290 hz) of the neutron star at 4u 1636– 53. it is demonstrated that only two variants of the rs model are compatible with the parameters obtained by modelling neutron stars for the rotational frequency frot = 580 hz. the variant giving the best fit with parameters m ∼ 2.20 m� and a ∼ 0.27 agrees with high precision with the prediction of one of the skyrme eos [1]. the variant giving the second best fit with parameters m ∼ 2.12 m� and a ∼ 0.20 agrees with lower precision with the prediction of the gandolfi eos [2]. keywords: neutron stars, x-ray variability, theory, observations. 1. introduction a new alternative to the standard models of hf qpos has been proposed recently in [3, 4]. the resonant switch (rs) model of twin-peak hf qpos observed in low-mass x-ray binaries (lmxbs) containing a neutron star is based on the switch of twin oscillations at a resonant point, where one pair of oscillating modes changes to some other pair due to non-linear resonant phenomena. the rs model has been applied to the atoll source 4u 1636–53, where we assume two resonant points observed at frequency ratios νu : νl = 3 : 2 and 5 : 4 [3]. the range of allowed values of dimensionless spin a and mass m of the neutron star was determined by fitting the pairs of oscillatory modes admitted by the rs model to the observed data in the regions related to the resonant points [15]. among acceptable variants of the rs model the most promising are those combining relativistic precession and total precession frequency relations or modifications to them, when the precision of the fits increases strongly (the χ2 test is improved by almost one order) in comparison to the fits realized by individual frequency pairs along the whole data range [15]. 
here we present preliminary results of testing the rs model by various models of eos. 2. resonant switch model the rs model [3, 4] is based on the idea that the twin oscillatory modes creating the sequences of lower and upper hf qpos can switch at a resonant point where the frequencies of the upper and lower oscillations νu and νl become commensurable. it is expected that at the resonant point non-linear resonant phenomena will excite a new oscillatory mode (or two new oscillatory modes) and dump one of the previously acting modes (or both the previously acting modes), i.e., switching from one pair of oscillatory modes (corresponding to a specific model of hf qpos) to the other pair, which will act up to the next relevant resonant point. in the simplest version of the rs model, we assume two resonant points at disc radii rout and rin, with observed frequencies νoutu , ν out l and ν in u , ν in l , being in commensurable ratios pout = nout : mout and pin = nin : min. observations put restrictions on νinu > ν out u and p in < pout. in the region covering the resonant point at rout we assume twin oscillatory modes with the upper (lower) frequency determined by the function νoutu (r,m,a) (ν out l (r,m,a)). near the inner resonant point at rin different oscillatory modes generally occur with the upper and lower frequency relation functions νinu (r,m,a) and νinl (r,m,a). we assume all the frequency functions to be given by combinations of the orbital and epicyclic frequencies of the geodesic motion in the kerr backgrounds. such a simplification is correct with high precision for near-maximum-mass neutron (quark) stars in a slow rotation regime related to all known atoll sources [5, 6]. in the kerr spacetime, the epicyclic frequencies νθ and νr and the keplerian (orbital) frequency νk 363 http://dx.doi.org/10.14311/ap.2014.54.0363 http://ojs.cvut.cz/ojs/index.php/ap z. stuchlík et al. acta polytechnica model relations rp νl = νk − νr νu = νk rp1 νl = νk − νr νu = νθ tp νl = νθ − νr νu = νθ tp1 νl = νθ − νr νu = νk td νl = νk νu = νk + νr wd νl = 2 (νk − νr) νu = 2νk − νr table 1. frequency relations corresponding to individual qpo models. depend only on the spacetime parameter m (mass) and a (spin) [7–10]. the frequency-relation functions have to meet the observationally given resonant frequencies that can be determined by the “energy switch effect” [3, 11]. in the framework of the simple rs model this requirement enables direct determination of the kerr background parameters describing the exterior of the neutron (quark) star [3, 4]. independence of the frequency ratio on the mass parameter m implies that the conditions νoutu (x; a) : ν out l (x; a) = p out , (1) νinu (x; a) : ν in l (x; a) = p in (2) determine the relations for spin a in terms of the dimensionless radius x = r/(gm/c2) and the resonant frequency ratio p. they can be expressed in the form aoutp (x) and ainp (x), or in an inverse form xoutp (a) and xinp (a). at the resonant radii, the conditions νoutu = ν out u (x; m,a) , ν in u = ν in u (x; m,a) (3) are satisfied along the functions moutpout (a) and m in pin (a) which can be obtained by using the functions xoutp (a) and xinp (a). the parameters of the neutron (quark) star are then given by the condition [3, 4] moutpout (a) = m in pin (a). (4) condition (4) determines m and a with precision given by the error in determining the resonant frequencies by the energy switch effect. 
we consider the pairs of frequency relations given by the relativistic precession (rp) model [9], the total precession (tp) model [12], and their modifications rp1 and tp1, combined also with the tidal disruption (td) model [13] and the warped disc oscillations (wd) model [14]. the frequency relations are summarized in table 1. for each of the frequency relations under consideration, the frequency resonance functions and the resonance conditions determining the resonant radii x_{n:m}(a) are given in [3].

combination of models    χ²min   a      m [m⊙]
rp1(3:2) – rp(5:4)       55      0.27   2.20
tp(3:2) – rp(5:4)        55      0.52   2.87
rp1(3:2) – tp1(5:4)      61      0.20   2.12
rp1(3:2) – tp(5:4)       62      0.45   2.46
tp(3:2) – tp1(5:4)       68      0.31   2.39
rp(3:2) – tp1(5:4)       72      0.46   2.81
wd(3:2) – td(5:4)        113     0.34   2.84

table 2. the best fits and the corresponding spin and mass parameters of the neutron star located in the 4u 1636–53 source.

3. application to the atoll source 4u 1636–53

in [3], the rs model was applied to the atoll source 4u 1636–53, where the observational data clearly demonstrate the possible existence of two resonant points, with frequency ratios 3 : 2 and 5 : 4, at which the energy switch effect occurs. the mass m and spin a ranges of the 4u 1636–53 neutron star predicted by the rs model with resonant frequencies given by the energy switch effect are very large (see table 1 in [3]). however, the ranges can be strongly restricted by fitting the observational data near the resonant points by the pairs of frequency relations corresponding to the twin oscillatory modes. in the fitting procedure we apply those switched twin frequency relations predicted by the rs model that are acceptable according to the neutron (quark) star structure theory [3]. in fitting the observational data we use the standard least-squares (χ²) method. the resulting limits on the mass m and spin a of the 4u 1636–53 neutron star implied by the data fitting procedure realized in the framework of the rs model of hf qpos are presented in table 2. the fitting procedure is shown to be almost one order of magnitude more precise than fitting realized by individual pairs along the whole range of the observational data [15]. the best fit, obtained for the rs model with the frequency relation pair rp1–rp, gives χ² ∼ 55 and χ²/dof ∼ 2.5 [15]. the results of the fitting procedure for the best fit are presented in figure 1. the best fit occurs for a combination of the rp1 and rp models, where the rp1 model has to be related to the outer resonant point, while the rp model is related to the inner resonant point; it predicts neutron star parameters m ∼ 2.20 m⊙ and a ∼ 0.27, which are quite acceptable according to the neutron star theory and can be considered the best prediction of the rs model. the second best fit (with χ² = 61) is obtained for the frequency pair rp1–tp1, where the rp1 model has to be related to the outer resonant point, while the tp1 model is related to the inner resonant point; it predicts the parameters m ∼ 2.12 m⊙ and a ∼ 0.20, which are again acceptable according to the neutron star structure theory.

figure 1. results of fitting the data of twin-peak hf qpos in the atoll source 4u 1636–53 by the procedure of the rs model for the combination of the rp1(3:2) and rp(5:4) frequency relations. left panel: profile of the lowest χ² for a given m (χ² versus m/m⊙); the thick vertical lines give the mean value of m as determined by the rs model from the frequency ratios governed by the energy switch effect, and the grey region corresponds to the precision of the fit. right panel: the pair of frequency relations rp1–rp (νu versus νl, in hz) obtained for the best fit to the observational data (with χ² ∼ 55), for m = 2.20 m⊙ and a = 0.27.
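the least-squares step can be sketched as follows, reusing the kerr frequency functions from the previous sketch; the data triples (νl, νu, σu) below are made up for illustration, and the real fits of [15] are considerably more involved.

```python
import numpy as np

def chi2_for_pair(pair, data, m, a, xgrid=np.linspace(5.3, 25.0, 2000)):
    """chi-square of one frequency-relation pair against twin-peak data.
    data is an iterable of (nu_l, nu_u, sigma_u) triples; for every observed
    nu_l the model curve is inverted (by interpolation) to get the
    predicted nu_u at the same radius."""
    curve = np.array([pair(x, m, a) for x in xgrid])
    lo, up = curve[:, 0], curve[:, 1]
    order = np.argsort(lo)                 # np.interp needs increasing abscissae
    chi2 = 0.0
    for nu_l, nu_u, sigma in data:
        pred_u = np.interp(nu_l, lo[order], up[order])
        chi2 += ((pred_u - nu_u) / sigma) ** 2
    return chi2

# scan a small (m, a) grid with made-up data points near the 3:2 cluster
fake_data = [(600.0, 900.0, 15.0), (650.0, 950.0, 15.0), (700.0, 1010.0, 15.0)]
best = min((chi2_for_pair(rp_pair, fake_data, m, a), m, a)
           for m in np.arange(1.8, 2.6, 0.05)
           for a in np.arange(0.1, 0.5, 0.02))
print(best)   # (chi2_min, m, a) of the crude grid search
```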
4. equations of state for the neutron star in the source 4u 1636–53: testing the rs model

we compare the results obtained in [15] with models of rotating neutron stars calculated using the hartle–thorne approximation [16, 17], which describes slowly rotating neutron stars. we construct models of rotating neutron stars using a large variety of acceptable eos and with the rotation frequency 580 hz (or 290 hz) observed for the source 4u 1636–53 [18]. in figure 2, the results of the hartle–thorne model are illustrated by appropriately denoted curves in the m–a plane, calculated for the eos under consideration.

we can see that no eos enables a model of the neutron star that can fit the rs model data if we assume the rotational frequency of the 4u 1636–53 neutron star frot ∼ 290 hz. for the rotational frequency frot ∼ 580 hz, the neutron star models give very interesting restrictions that are in significant agreement with the results of fitting the hf qpo data in the framework of the rs model. a neutron star model using one of the skyrme eos (sv) [1] meets with high precision the prediction of the rp1–rp version of the rs model, which gives the best fit to the twin-peak hf qpo data observed in the 4u 1636–53 source, for neutron star parameters m ∼ 2.20 m⊙ and a ∼ 0.27. the neutron star model based on the gandolfi eos [2] meets with acceptable precision the prediction of the rp1–tp1 version of the rs model, which gives the second best fit to the observational data of the hf qpos in 4u 1636–53, for a neutron star having parameters m ∼ 2.12 m⊙ and a ∼ 0.20. note that the second best rs model fit is marginally touched by another parameterized skyrme eos (gs) [1] for the neutron star parameters m ∼ 2.12 m⊙ and a ∼ 0.20. this result demonstrates that the 4u 1636–53 neutron star could be in a state very close to instability with respect to radial perturbations, corresponding to the maximum mass predicted by the eos. all the other predictions of the rs model are located in m–a plane positions that are evidently outside the range of all the eos considered in the present paper – we can expect that this is true even for all variants of the presently known eos.

figure 2. hartle–thorne models of neutron stars with a variety of eos: 1–9: skyrme [1], 10: ubs [19], 14: apr [20], 16: bbb2 [21], 17: bpal12 [22], 18: balbn1h1 [23], 19: glendnh3 [24], 35: apr2 [20], 79: gandolfi [2]. the models are constructed for frot ∼ 290 hz (left panel) and frot ∼ 580 hz (right panel).

5. conclusions

we can conclude that the eos considered in our study strongly restrict the versions of the rs model; only two of them (rp1–rp and rp1–tp1) remain acceptable. conversely, it is quite interesting that the rs model can put strong restrictions on the acceptable eos – it seems that only three of those considered here can be taken as plausible.

acknowledgements

we would like to express our gratitude to the czech grant agency for supporting the project gačr 202/09/0772, and for the internal grants of the silesian university in opava sgs/11/2013 and sgs/23/2013. the authors further acknowledge the project supporting integration with the international theoretical and observational research network in relativistic astrophysics of compact objects, reg. no. cz.1.07/2.3.00/20.0071, supported by the operational programme education for competitiveness funded by the structural funds of the european union and the state budget of the czech republic.

references

[1] j. r. stone, et al. nuclear matter and neutron-star properties calculated with the skyrme interaction. phys rev c 68(3):034324, 2003.
[2] s. gandolfi, et al. microscopic calculation of the equation of state of nuclear matter and neutron star structure. monthly notices roy astronom soc 404:l35–l39, 2010. arxiv:0909.3487 [nucl-th].
[3] z. stuchlík, et al. resonant switch model of twin peak hf qpos applied to the source 4u 1636–53. acta astronom 62(4):389–407, 2012. arxiv:1301.2830 [astro-ph.he].
[4] z. stuchlík, et al. multi-resonance orbital model of high-frequency quasi-periodic oscillations: possible high precision determination of black hole and neutron star spin. astronomy and astrophysics 552(a10), 2013. arxiv:1305.3552 [astro-ph.he].
[5] g. török, et al. on mass constraints implied by the relativistic precession model of twin-peak quasi-periodic oscillations in circinus x-1. astrophys j 714(1):748–757, 2010. arxiv:1008.0088 [astro-ph.he].
[6] m. urbanec, et al. quadrupole moments of rotating neutron stars and strange stars. monthly notices roy astronom soc (published online), 2013. arxiv:1301.5925 [astro-ph.sr].
[7] a. n. aliev, et al. radiation from relativistic particles in non-geodesic motion in a strong gravitational field. gen relativity gravitation 13:899–912, 1981.
[8] s. kato, et al. black-hole accretion disks. kyoto university press, kyoto, japan, 1998.
[9] l. stella, et al. lense–thirring precession and quasi-periodic oscillations in low-mass x-ray binaries. astrophys j lett 492:l59–l62, 1998. arxiv:astro-ph/9709085.
[10] g. török, et al. radial and vertical epicyclic frequencies of keplerian motion in the field of kerr naked singularities. comparison with the black hole case and possible instability of naked singularity accretion discs. astronomy and astrophysics 437(3):775–788, 2005. arxiv:astro-ph/0502127.
[11] g. török. reversal of the amplitude difference of khz qpos in six atoll sources. astronomy and astrophysics 497(3):661–665, 2009. arxiv:0812.4751 [astro-ph].
[12] z. stuchlík, et al. on a multi-resonant origin of high frequency quasiperiodic oscillations in the neutron-star x-ray binary 4u 1636–53. arxiv e-prints, 2007. arxiv:0704.2318 [astro-ph].
[13] u. kostić, et al. tidal effects on small bodies by massive black holes. astronomy and astrophysics 496(2):307–315, 2009. arxiv:0901.3447 [astro-ph.he].
[14] s. kato. resonant excitation of disk oscillations by warps: a model of khz qpos. publ astronom soc japan 56(5):905–922, 2004. arxiv:astro-ph/0409051.
[15] z. stuchlík, et al. test of the resonant switch model by fitting the data of twin-peak hf qpos in the atoll source 4u 1636–53. acta astronom 64(1):45–64, 2014.
[16] j. b. hartle, et al. slowly rotating relativistic stars ii. models for neutron stars and supermassive stars. astrophys j 153:807–834, 1968.
[17] s. chandrasekhar, et al. on slowly rotating homogeneous masses in general relativity. monthly notices roy astronom soc 167:63–79, 1974.
[18] t. e. strohmayer, et al. evidence for a millisecond pulsar in 4u 1636–53 during a superburst. astrophys j 577:337–345, 2002. arxiv:astro-ph/0205435.
[19] m. urbanec, et al. observational tests of neutron star relativistic mean field equations of state. acta astronom 60(2):149–163, 2010. arxiv:1007.3446 [astro-ph.sr].
[20] a. akmal, et al. equation of state of nucleon matter and neutron star structure. phys rev c 58:1804–1828, 1998. arxiv:nucl-th/9804027.
[21] m. baldo, et al. microscopic nuclear equation of state with three-body forces and neutron star structure. astronomy and astrophysics 328:274–282, 1997. arxiv:astro-ph/9707277.
[22] i. bombaci. an equation of state for asymmetric nuclear matter and the structure of neutron stars. in i. bombaci, a. bonaccorso, a. fabrocini, et al. (eds.), perspectives on theoretical nuclear physics, pp. 223–237, 1995.
[23] s. balberg, et al. an effective equation of state for dense matter with strangeness. nuclear phys a 625:435–472, 1997. arxiv:nucl-th/9704013.
[24] n. k. glendenning. neutron stars are giant hypernuclei? astrophys j 293:470–493, 1985.

acta polytechnica 53(supplement):821–824, 2013, doi:10.14311/ap.2013.53.0821, © czech technical university in prague, 2013, available online at http://ojs.cvut.cz/ojs/index.php/ap

moonlight: a new lunar laser ranging retroreflector instrument

m. garattini, s. dell'agnello, d. currie, g. o. delle monache, m. tibuzzi, g. patrizi, s. berardi, a. boni, c. cantone, n. itaglietta, c. lops, m. maiello, m. martini, r. vittori, g. bianco, r. march, g. bellettini, r. tauraso

a: istituto nazionale di fisica nucleare – laboratori nazionali di frascati, via e. fermi 40, 00044 frascati (roma), italy
b: department of physics, university of maryland (umd), college park, md 20742 and nlsi (nasa lunar science institute), usa
c: european space agency (esa-hso) and ami – aeronautica militare italiana
d: centro di geodesia spaziale g. colombo (asi-cgs), matera, italy
corresponding author: marco.garattini@roma1.infn.it

abstract. since 1969, lunar laser ranging (llr) to the apollo cube corner reflector (ccr) arrays has supplied several significant tests of gravity: geodetic precession, the strong and weak equivalence principles (sep, wep), the parametrized post-newtonian (ppn) parameter β, the time change of the gravitational constant (g), 1/r² deviations, and new gravitational theories beyond general relativity (gr), like the unified braneworld theory (g. dvali et al., 2003). now a new generation of llr can do better, using evolved laser retroreflectors developed from a tight collaboration between my institution, infn–lnf (istituto nazionale di fisica nucleare – laboratori nazionali di frascati), and douglas currie (university of maryland, usa), one of the fathers of llr.
the new lunar ccr is being developed and characterized at the "satellite/lunar laser ranging characterization facility" (scf) in frascati, following our new industry-standard space test procedure, the "scf-test"; this work contains the experimental results of the scf-test applied to the new lunar ccr, and all the new payload developments, including the future scf tests. the international lunar network (iln) research project considers our new retroreflector as one of the possible "core instruments".

keywords: lunar laser ranging, general relativity, gravitation, equivalence principle, geodetic precession, gravitational constant variation, ppn parameters, apollo station, international lunar network, cube corner retroreflector.

1. introduction

the time-of-flight measurement of a laser pulse sent by a station on the earth towards a laser retroreflector deployed on the lunar surface, and sent back again to the station, is commonly known as lunar laser ranging; it is the most accurate and cheapest distance measurement in space. generally realized in fused silica, this special kind of laser mirror is a solid ccr which, assembled in passive, maintenance-free, light-weight laser retroreflector arrays (lra), gives exceptional performance for several decades, thanks to the choice of thermal design and materials. since 1969, nasa's apollo 11, 14 and 15 missions, designed by a team led by c. o. alley, d. currie, p. bender and j. faller [2, 3], and the soviet missions luna 17 and 21, have deployed lras on the moon's surface, giving a perfect demonstration of functionality and performance. the accuracy of laser ranging from the earth to passive targets on the moon's surface commonly reaches the centimeter level.

in recent years, many upgrades in observing technology and data modeling have improved llr. at this time, llr is one of the best methods for achieving high-accuracy tests of gr, including tests of the wep and sep, of geodetic precession, of time-variability in g, of the ppn parameter β and of the inverse-square law of gravity [4] (j. g. williams et al., 2008 & m. martini, acta polytechnica 53 supplement, 2013).

the apollo (apache point observatory lunar laser-ranging operation) station is located at the apache point observatory (apo) in southern new mexico, at an altitude of 2800 m. this is a unique position for the first-class apo 3.5 m telescope astronomical facility to perform a fundamental physics experiment [5]. before the advent of the new apparatus of the apollo station, the accuracy limit in llr measurements had been about 20 mm. now apollo seems able to provide 1 mm accuracy, though this is difficult to verify because the software models do not yet have sufficient precision.

figure 1. concept of the 2nd generation of lunar laser ranging.

the relative orientation of the moon changes by up to ten degrees during a month with respect to the line of sight from the earth station to the lunar array. this phenomenon is called "lunar libration", and it produces the primary range error. considering the apollo 15 array, the length from corner to corner is about 120 cm, so librations generate a difference in distance between the extremes of about 200 mm (≈ 120 cm × sin 10°). this simple geometrical effect is shown clearly by the apollo observations, which confirm the variation of the shape and full width of the return signal as a function of the libration angle.
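the libration arithmetic quoted above is easy to check; a minimal estimate, using the 10° amplitude and the 120 cm footprint from the text:

```python
import math

C = 299_792_458.0                  # speed of light [m/s]
footprint = 1.20                   # apollo 15 corner-to-corner size [m]
libration = math.radians(10.0)     # monthly libration amplitude

spread = footprint * math.sin(libration)           # one-way path spread [m]
print(f"one-way spread: {spread*1e3:.0f} mm")      # ~208 mm, i.e. the ~200 mm quoted
print(f"round-trip time spread: {2*spread/C*1e9:.2f} ns")
```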
the widest base, standing for corner-to-corner, is 1.1 ns (round-trip), or 165 mm one-way difference. intrinsic to the multiccr structure of the arrays, this causes an uncertainty widening of the return pulse, thus imposing a limit on the statistical error. this can add over 50 mm to the measurement error per photon, in a root-meansquare sense. it is by far the principal source of range uncertainty and, to achieve the millimeter level, one must collect thousands of photons. 2. 2nd generation of llr the general concept of the second generation of llr is to consider a number (notionally eight) of large single cube corner retroreflectors (ccrs). each of these will produce a light echo that, with a single photoelectron detection system such as the current apollo system, can be used to improve the ranging beyond the limit of accuracy determined by the librational effects of current arrays and the laser pulse length. when single ccrs are used, the return is unaffected by the libration, that is, there is no increased widening of the fwhm caused by the librational effects and by the ccr itself. in this way, an accuracy improvement on the time of flight measurement of 1 ns (roundtrip) will be obtained. if two such single reflectors are deployed with a relative distance of tens of meters, their return of light will be recorded separately and can be recognized by comparison with the nominal orbit of the moon and the rotational parameters of the earth [6]. this idea is illustrated schematically in fig. 1. figure 2. section and exploded of the moonlight/ llra-21st ccr in its housing. 3. the new maryland/frascati payload the university of maryland (umd) and infn–lnf are now proposing a new approach to the lunar laser ranging array (llra) technology, the experiment moonlight (moon laser instrumentation for general relativity high-accuracy tests), parallel to the llra-21st (lunar laser ranging array for the 21st century) project, led by douglas currie [7]. we currently use a 100 mm ccr realized in a special kind of glass, suprasil 1 (formerly suprasil t19), the same material as was used both in old llra and in lageos (laser geodynamics satellites). this will be mounted in an aluminum housing, thermally shielded from the lunar environment, in order to maintain a relatively constant temperature during the succession of lunar day and night. moreover the ccr is also isolated from the housing by two coaxial “gold cans”, to ensure that the ccr receives relatively little thermal input, as an effect of the low temperature of the lunar night and the high temperature of the lunar day. the mounting of the ccr inside the housing is shown in fig. 2. for this mounting, one could use a ring of kel-f, a special kind of plastic material that is already used in lageos satellites, due to its good insulating, non-hygroscopic and low out-gassing properties. the ccr and its housing will be set by a rod into the lunar regolith to a depth of 1 m [8]. 3.1. thermal and optical tests in frascati scf (satellite/lunar laser ranging characterization facility), at lnf/infn in frascati, is a cryostat where we are able to reproduce the space environment: cold (77 k with liquid nitrogen), vacuum, and the sun spectra. the scf includes a sun simulator1, that provides a 40 cm diameter beam with a close spectral match to the am0 standard of 1 sun in space (1366.1 w/m2), with uniformity better than ±5 % over an area of 35 cm diameter. 
(1) www.ts-space.co.uk

next to the cryostat we have an optical table, where we can reproduce the laser path from the earth to the moon and back, studying the far field diffraction pattern (ffdp) coming back from the ccr to the laser station; this is useful for understanding how good the optical behavior of the ccr is. the scf-test [9] is a new test procedure, never performed before, for characterizing and modelling the detailed thermal behavior and the optical performance of laser retroreflectors in space for industrial and scientific applications. we perform an scf-test on the moonlight ccr to evaluate its thermal and optical performance in a space environment. for the thermal measurements we use both an infrared (ir) camera and temperature probes, which give real-time measurements of all the components of the ccr and its housing. in particular, we look at the temperature difference between the front face and the tip of the ccr, studying how the ffdp changes during the different thermal phases; this is the best representative of the thermal distortion of the return beam to the earth. various configurations and designs of the ccr and the housing have been tested, and are being tested, in the scf facility, with the solar simulator, temperature data recording, the ir camera and measurements of the ffdp [10].

4. experimental results

figure 3 shows the moonlight/llrra-21st flight ccr ffdp intensity variation at the moon velocity aberration (2v/c) during the key points of the scf-test: (1) in air, (2) in vacuum, (3) during cooling of the chamber's shields, (4) sun on, orthogonal to the ccr's face, with the housing temperature controlled at t = 310 k, (5) sun on at 30° of inclination (no "break-through"), (6) sun on at −30° of inclination ("break-through"), (7) sun on, orthogonal, with the housing temperature left floating, (8) sun off. from this graph we can deduce that the intensity decreases during non-orthogonal illumination of the ccr, in particular when the sun enters the housing cavity during the "break-through" phase. this effect is due to a strong increase of the "tip–face" thermal gradient during this phase of the test. when the housing temperature is left floating, the intensity increases slightly because the "tip–face" gradient decreases.

although by default this configuration of the moonlight/llrra-21st ccr seems to work in the right way, the first scf-test brought to light some important considerations that need to be explored, understood and, in many cases, improved. first of all, fig. 3 clearly shows that a major loss of signal intensity at the useful velocity aberration is a consequence of the solar "break-through", when the ccr is illuminated in a non-orthogonal way. to limit this effect, we have designed and realized an aluminum "solar shade" which, mounted on the frontal ring of the housing, blocks the solar radiation during the "break-through" phase (fig. 4). the scf-test of this "shade configuration" has already been done (from 24/03/2010 to 27/03/2010), in the same way as the previous measurements.

figure 3. moonlight/llrra-21st flight ccr ffdp intensity variation.
figure 4. the moonlight/llrra-21st ccr in the "shade configuration", ready to be tested.

the use of the "solar shade" has two additional important motivations. first of all, it acts as a protection against possible dust deposition on the ccr frontal face, preventing a decline of the efficiency of the reflector in time.
in addition, the shade provides protection from ultra violet light, which could change the optical properties of the reflector. measurements of this kind will provide information on the ccr behavior in every phase of its lunar life, giving a complete and definitive characterization of the payload, which already seems to work quite well for our purposes. in this context, our moonlight/llrra-21st ccr, is becoming one of the most important candidates to be the laser retroreflector core instrument of the iln project [11]. 5. conclusions lunar laser ranging remains one of the most powerful and competitive of all methods and technologies for investigations of this kind, and our work has been contributing to the development of the llr, from several points of view. 823 marco garattini et al. acta polytechnica first of all, a new winning philosophy has been shown for a second generation of llr, proposing a new prototype for a laser retroreflector. the experimental results indicate that this new payload can be a real candidate for improving the performance of llr. in this regard, the iln project offers a real opportunity to bring these new payloads on the lunar surface. acknowledgements llr gratefully acknowledges the major contribution to its work that has come from tight collaboration with the university of maryland, personified by douglas currie, and from the infn–lnf research group, led by simone dell’agnello. references [1] dvali, g.; a.gruzinov and m.zaldarriaga, the accelerated universe and the moon, phys. rev. d 68 024012, 2003 [2] alley, c. o., et al. 1969, laser ranging retroreflector, in apollo 11: preliminary science report (tech. rep. sp-214; washington: nasa). 1970, science, 167, 368 [3] faller, j. e., et al. 1971, laser ranging retroreflector, in apollo 14: preliminary science report (tech. rep. sp-272; washington: nasa). 1972, laser ranging retroreflector, in apollo 15 preliminary science report (tech. rep. sp-289; washington: nasa) [4] j.g. williams, s. turyshev and d.h. boggs, progress in lunar laser ranging tests of relativistic gravity, phys. rev. lett.93.261101 (2008) [5] t.w. murphy, e.g. adelberger, j.b.r. battat et al., apollo: the apache point observatory lunar laser-ranging operation: instrument description and first detections, publ.astron.soc.pac.120:20–37, 2008 [6] s. dell’agnello et al., ieee aerospace conference, next generation lunar laser ranging and its gnss applications, big sky (mt), usa, 2010, and references therein [7] currie d.g. et al., a lunar laser ranging retro-reflector array for nasa’s manned landings, the international lunar network, and the proposed asi lunar mission magia, 16th international workshop on laser ranging instrumentation, october 13–17, 2008 [8] d.currie et al., a lunar laser ranging retroreflector array for the 21st century, “acta astronautica” 68 (2011) 667–680 [9] s.dell’agnello et al., creation of the new industry-standard space test of laser retroreflectors for the gnss and lageos, galileo issue in journal of advances in space research, scientific application of galileo navigation satellite system, 47 (2011) 822–842 [10] a. boni et al., optical far field diffraction pattern test of laser retroreflectors for space applications in air and isothermal conditions at infn-lnf. 
infn-lnf report lnf-08/26(ir), 2008.
[11] dell'agnello et al., international lunar network core instruments working group, final report, study period: june 2008 to july 2009, and references therein (http://iln.arc.nasa.gov).

acta polytechnica 55(1):50–58, 2015, doi:10.14311/ap.2015.55.0050, © czech technical university in prague, 2015, available online at http://ojs.cvut.cz/ojs/index.php/ap

biperiodic fibonacci word and its fractal curve

josé l. ramírez (departamento de matemáticas, universidad sergio arboleda, bogotá, colombia), gustavo n. rubiano (departamento de matemáticas, universidad nacional de colombia, bogotá, colombia)
corresponding author: josel.ramirez@ima.usergioarboleda.edu.co

abstract. in the present article, we study a word-combinatorial interpretation of the biperiodic fibonacci sequence of integer numbers (f_n^(a,b)). this sequence is defined by the recurrence relation a f_{n−1}^(a,b) + f_{n−2}^(a,b) if n is even, and b f_{n−1}^(a,b) + f_{n−2}^(a,b) if n is odd, where a and b are any real numbers. this sequence of integers is associated with a family of finite binary words, called finite biperiodic fibonacci words. we study several properties, such as the number of occurrences of 0 and 1, and the concatenation of these words, among others. we also study the infinite biperiodic fibonacci word, which is the limiting sequence of finite biperiodic fibonacci words. it turns out that this family of infinite words are sturmian words of slope [0, a, b]. finally, we associate to this family of words a family of curves with interesting patterns.

keywords: fibonacci word, biperiodic fibonacci word, biperiodic fibonacci curve.

1. introduction

the fibonacci numbers and their generalizations have many interesting properties and combinatorial interpretations, see, e.g., [13]. the fibonacci numbers fn are defined by the recurrence relation fn = fn−1 + fn−2, for all integers n ≥ 2, with initial values f0 = 1 = f1.

many kinds of generalizations of the fibonacci sequence have been presented in the literature. for example, edson and yayenie [12] introduced the biperiodic fibonacci sequence {f_n^(a,b)}. for any two nonzero real numbers a and b, it is defined recursively by f_0^(a,b) = 0, f_1^(a,b) = 1, and

f_n^(a,b) = a f_{n−1}^(a,b) + f_{n−2}^(a,b), if n ≥ 2 is even,
f_n^(a,b) = b f_{n−1}^(a,b) + f_{n−2}^(a,b), if n ≥ 2 is odd.

to avoid cumbersome notation, let us denote f_n^(a,b) by qn. the first few terms are {qn} = {0, 1, a, ab + 1, a²b + 2a, a²b² + 3ab + 1, a³b² + 4a²b + 3a, . . .}. note that if a = b = 1, then qn is the nth fibonacci number. a binet-like formula for the biperiodic fibonacci sequence is

qn = (a^(1−ξ(n)) / (ab)^⌊n/2⌋) · (αⁿ − βⁿ)/(α − β),   (1)

where α = (ab + √((ab)² + 4ab))/2, β = (ab − √((ab)² + 4ab))/2, and ξ(n) := n − 2⌊n/2⌋.

on the other hand, there exists a well-known word-combinatorial interpretation of the fibonacci sequence. let fn be a binary word defined inductively as follows: f0 = 1, f1 = 0, fn = fn−1fn−2 for n ≥ 2. it is clear that |fn| = fn, i.e., the length of the word fn is the nth fibonacci number. the words fn are called finite fibonacci words. the infinite fibonacci word, f = 0100101001001010010100100101 · · ·, is defined as the limit of the sequence {fn}.
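the recurrence and the binet-like formula (1) are easy to cross-check numerically; a short sketch (the test values a = 2, b = 3 are arbitrary):

```python
import math

def q_seq(a, b, n):
    """q_0..q_n of the biperiodic fibonacci sequence via the recurrence."""
    q = [0, 1]
    for k in range(2, n + 1):
        q.append((a if k % 2 == 0 else b) * q[-1] + q[-2])
    return q[:n + 1]

def q_binet(a, b, n):
    """closed form (1): a^(1-xi(n))/(ab)^floor(n/2) * (alpha^n - beta^n)/(alpha - beta)."""
    d = math.sqrt((a * b) ** 2 + 4 * a * b)
    alpha, beta = (a * b + d) / 2, (a * b - d) / 2
    return a ** (1 - n % 2) / (a * b) ** (n // 2) * (alpha ** n - beta ** n) / d

a, b = 2, 3
print(q_seq(a, b, 8))                                # [0, 1, 2, 7, 16, 55, 126, 433, 992]
print([round(q_binet(a, b, n)) for n in range(9)])   # identical values
```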
the infinite word f is the archetype of a sturmian word [2, 14], and one of the most studied examples in the combinatorial theory of infinite words; see, e.g., [3, 5, 6, 8, 10, 15, 17]. the word f can be associated with a curve via a drawing rule, and this curve has geometric properties that derive from the combinatorial properties of f [4, 16]. the curve produced depends on the rule given: we read the symbols of the word in order and, depending on what we read, draw a line segment in a certain direction; this is the same idea as is used in l-systems [18]. in this case, the drawing rule is called the "odd-even drawing rule" [16], defined as follows:

symbol   action
1        draw a line forward.
0        draw a line forward; if the 0 is in an even position, turn left, and if it is in an odd position, turn right.

table 1. odd-even drawing rule.

the nth curve of fibonacci, denoted by f_n, is obtained by applying the odd-even drawing rule to the word f_n. the fibonacci word fractal f is defined as f = lim_{n→∞} f_n. for example, figure 1 shows the curves f_10 and f_17, where

f10 = 01001010010010100101001001010010010100101001001010010100100101001001010010100100101001001.

the graphics in the present article were generated using the mathematica 9.0 software [19, 21].

figure 1. fibonacci curves f_10 and f_17 corresponding to the words f_10 and f_17.

ramírez et al. [22] introduced a generalization of the fibonacci word, the i-fibonacci word. specifically, the (n, i)-fibonacci words are words over {0,1} defined inductively by f^[i]_0 = 0, f^[i]_1 = 0^(i−1)1, and f^[i]_n = f^[i]_{n−1} f^[i]_{n−2} for n ≥ 2 and i ≥ 1. the infinite word f^[i] := lim_{n→∞} f^[i]_n is called the i-fibonacci word; for i = 2 we recover the classical fibonacci word. note that |f^[i]_n| = f^[i]_n, where f^[i]_n is the integer sequence defined recursively by f^[i]_0 = 1, f^[i]_1 = i, and f^[i]_n = f^[i]_{n−1} + f^[i]_{n−2} for n ≥ 2 and i ≥ 1; for i = 1, 2 these are the fibonacci numbers.

ramírez and rubiano [20] studied a binary word similar to f^[i]_n, denoted by f_{k,n}. the initial values are the same, i.e., f_{k,0} = 0 and f_{k,1} = 0^(k−1)1, but the recurrence relation is f_{k,n} = f_{k,n−1}^k f_{k,n−2} for n ≥ 2 and k ≥ 1. the infinite word f_k is the limit of the sequence {f_{k,n}}; for k = 1 we have the word f̄ = 1011010110110 . . ., where the overline is shorthand for the morphism that maps 0 to 1 and 1 to 0. it is clear that |f_{k,n}| = f_{k,n+1}, where f_{k,n+1} is the integer sequence defined recursively by f_{k,0} = 0, f_{k,1} = 1, and f_{k,n+1} = k f_{k,n} + f_{k,n−1} for n ≥ 1.

by analogy with the fibonacci word fractal, applying the odd-even drawing rule to the words f^[i]_n and f_{k,n} yields the nth word fractals f^[i]_n and f_{k,n}, respectively; moreover, we have the curves f^[i] = lim_{n→∞} f^[i]_n and f_k = lim_{n→∞} f_{k,n}. in table 2 we show some curves f^[i]_16 and f_{k,n} and their associated words:

f^[1] = 1011010110110 · · · = f̄     f^[2] = 0100101001001 · · · = f     f^[3] = 0010001001000 · · ·
(curves f^[1]_16, f^[2]_16, f^[3]_16)
f_1 = 1011010110110 · · · = f̄       f_5 = 0000100001000 · · ·           f_6 = 0000010000010 · · ·
(curves f_{1,18}, f_{5,6}, f_{6,6})

table 2. some curves f^[i]_16 and f_{k,n}.

in this paper, we study a word-combinatorial interpretation of the biperiodic fibonacci sequence [12]; this problem was recently proposed by ramírez et al. [22]. we study a family of infinite words f(a,b) that generalize the fibonacci word and the word f_k.
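the odd-even drawing rule of table 1 is straightforward to implement as a turtle walk; the sketch below returns the vertices of the curve f_n (positions are counted from 1, and a turn is applied after the segment is drawn, as in the rule):

```python
import math

def fib_word(n):
    """finite fibonacci word: f0 = 1, f1 = 0, fn = f(n-1) f(n-2)."""
    u, v = "1", "0"
    for _ in range(n - 1):
        u, v = v, v + u
    return v if n >= 1 else u

def odd_even_curve(word, theta=90.0):
    """vertices of the curve obtained by the odd-even drawing rule: every
    symbol draws a unit segment; a 0 in an even position turns left by
    theta degrees, a 0 in an odd position turns right by theta degrees."""
    x = y = heading = 0.0
    pts = [(x, y)]
    for i, s in enumerate(word, start=1):
        x += math.cos(math.radians(heading))
        y += math.sin(math.radians(heading))
        pts.append((x, y))
        if s == "0":
            heading += theta if i % 2 == 0 else -theta
    return pts

print(len(odd_even_curve(fib_word(10))) - 1)   # 89 segments: |f10| = F_10
```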
specifically, the nth biperiodic fibonacci words are words over {0,1} defined inductively as follows:

f(a,b,0) = ε, f(a,b,1) = 0, f(a,b,2) = 0^(a−1)1,
f(a,b,n) = f(a,b,n−1)^a f(a,b,n−2), if n ≥ 3 is even,
f(a,b,n) = f(a,b,n−1)^b f(a,b,n−2), if n ≥ 3 is odd,

for all a, b ≥ 1. it is clear that |f(a,b,n)| = f_n^(a,b) = qn. the infinite word f(a,b) := lim_{n→∞} f(a,b,n) is called the biperiodic fibonacci word. in addition to this definition, we investigate some new combinatorial properties, and we associate with this family of words a family of curves with interesting geometric properties; these properties are obtained from the combinatorial properties of the word f(a,b).

2. definitions and notation

the terminology and notation are mainly those of lothaire [14] and allouche and shallit [1]. let σ be a finite alphabet, whose elements are called symbols. a word over σ is a finite sequence of symbols from σ. the set of all words over σ, i.e., the free monoid generated by σ, is denoted by σ*. the identity element ε of σ* is called the empty word. for any word w ∈ σ*, |w| denotes its length, i.e., the number of symbols occurring in w; the length of ε is taken to be 0. if a ∈ σ and w ∈ σ*, then |w|_a denotes the number of occurrences of a in w. for two words u = a1a2 · · · ak and v = b1b2 · · · bs in σ*, we denote by uv the concatenation of the two words, that is, uv = a1a2 · · · ak b1b2 · · · bs; if v = ε, then uε = εu = u. moreover, by u^n we denote the word uu · · · u (n times). a word v is a factor or subword of u if there exist x, y ∈ σ* such that u = xvy; if x = ε (y = ε), then v is called a prefix (suffix) of u. the reversal of a word u = a1a2 · · · an is the word u^r = an · · · a2a1, and ε^r = ε. a word u is a palindrome if u^r = u. an infinite word over σ is a map u : ℕ → σ, written u = a1a2a3 . . .; the set of all infinite words over σ is denoted by σ^ω.

example 1. let p = (pn)_{n≥1} = 0110101000101 · · · be the infinite word where pn = 1 if n is a prime number and pn = 0 otherwise. the word p is called the characteristic sequence of the prime numbers.

let σ and Δ be alphabets. a morphism is a map h : σ* → Δ* such that h(xy) = h(x)h(y) for all x, y ∈ σ*. it is clear that h(ε) = ε; furthermore, a morphism is completely determined by its action on single symbols. for instance, the fibonacci word f satisfies lim_{n→∞} σⁿ(1) = f, where σ : {0,1} → {0,1}* is the morphism defined by σ(0) = 01 and σ(1) = 0; this morphism is called the fibonacci morphism. the fibonacci word f can be defined in several different ways, see, e.g., [3].

there is a special class of infinite words, with many remarkable properties, the so-called sturmian words. these words admit several equivalent definitions (see, e.g., [1, 2, 14]). let w ∈ σ^ω. we define p(w, n), the complexity function of w, to be the map that counts, for each integer n ≥ 0, the number of subwords of length n in w. an infinite word w is a sturmian word if p(w, n) = n + 1 for all integers n ≥ 0. since p(w, 1) = 2 for any sturmian word, sturmian words are over two symbols. the word p of example 1 is not a sturmian word, because p(p, 2) = 4.

given two real numbers θ, ρ ∈ ℝ with θ irrational and 0 < θ < 1, 0 ≤ ρ < 1, we define the infinite word w = w1w2w3 · · · by w_n = ⌊(n+1)θ + ρ⌋ − ⌊nθ + ρ⌋. the numbers θ and ρ are called the slope and the intercept, respectively.
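the sturmian condition p(w, n) = n + 1 can be checked on finite prefixes by brute force; a sketch, reusing fib_word from the earlier block (counting factors of a finite prefix only bounds the true complexity from below, but the prefix used here is long enough for small n):

```python
def complexity(w, n):
    """number of distinct factors of length n occurring in the finite word w."""
    return len({w[i:i + n] for i in range(len(w) - n + 1)})

prefix = fib_word(20)[:5000]        # long prefix of the fibonacci word
print([complexity(prefix, n) for n in range(1, 9)])   # [2, 3, 4, ..., 9] = n + 1
```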
words of this form are called lower mechanical words and are known to be equivalent to sturmian words [14]. as a special case, when ρ = 0, we obtain characteristic words.

definition 2. let θ be an irrational number with 0 < θ < 1. for n ≥ 1, define w_θ(n) := ⌊(n+1)θ⌋ − ⌊nθ⌋ and w(θ) := w_θ(1) w_θ(2) w_θ(3) · · ·. then w(θ) is called the characteristic word with slope θ.

note that every irrational θ ∈ (0, 1) has a unique continued fraction expansion

θ = [0, a1, a2, a3, . . .] = 1/(a1 + 1/(a2 + 1/(a3 + · · ·))),

where each a_i is a positive integer. let θ = [0, 1 + d1, d2, . . .] be an irrational number with d1 ≥ 0 and dn > 0 for n > 1. we use the continued fraction expansion of θ as a directive sequence in the following way: we associate with it a sequence (sn)_{n≥−1} of words defined by

s−1 = 1, s0 = 0, sn = s_{n−1}^{d_n} s_{n−2}, for n ≥ 1.

such a sequence of words is called a standard sequence. this sequence is related to characteristic words in the following way. observe that, for any n ≥ 0, sn is a prefix of s_{n+1}, which gives meaning to lim_{n→∞} sn as an infinite word. in fact, one can prove [14] that each sn is a prefix of w(θ) for all n ≥ 0 and

w(θ) = lim_{n→∞} sn.   (2)

3. biperiodic fibonacci words

let f(a,b,n) and f(a,b) be the nth biperiodic fibonacci word and the infinite biperiodic fibonacci word, respectively. note that for a = b = 1 we have the word f̄ = 1011010110110 . . ., and if a = b = k, we obtain the k-fibonacci words, i.e., f(k,k) = f_k.

definition 3. the (a,b)-fibonacci morphism σ(a,b) : {0,1} → {0,1}* is defined by σ(a,b)(0) = (0^(a−1)1)^b 0 and σ(a,b)(1) = (0^(a−1)1)^b 0^a 1.

theorem 4. for all n ≥ 0, σ(a,b)ⁿ(0) = f(a,b,2n+1) and σ(a,b)ⁿ(0^(a−1)1) = f(a,b,2n+2). hence the biperiodic fibonacci word f(a,b) satisfies lim_{n→∞} σ(a,b)ⁿ(0) = f(a,b).

proof. we prove the two assertions about σ(a,b)ⁿ by induction on n. they are clearly true for n = 0, 1. now assume the result holds for n, and let us prove it for n + 1:

σ(a,b)^(n+1)(0) = σ(a,b)ⁿ((0^(a−1)1)^b 0) = (σ(a,b)ⁿ(0^(a−1)1))^b σ(a,b)ⁿ(0) = f(a,b,2n+2)^b f(a,b,2n+1) = f(a,b,2n+3),

and

σ(a,b)^(n+1)(0^(a−1)1) = σ(a,b)ⁿ(((0^(a−1)1)^b 0)^(a−1) (0^(a−1)1)^b 0^a 1) = σ(a,b)ⁿ(((0^(a−1)1)^b 0)^a 0^(a−1)1) = ((σ(a,b)ⁿ(0^(a−1)1))^b σ(a,b)ⁿ(0))^a σ(a,b)ⁿ(0^(a−1)1) = (f(a,b,2n+2)^b f(a,b,2n+1))^a f(a,b,2n+2) = f(a,b,2n+3)^a f(a,b,2n+2) = f(a,b,2n+4).

example 5. in table 3 we show some biperiodic fibonacci words for specific values of a and b.

a   b   f(a,b)
1   2   1101110111011011101110111011011 · · ·
2   1   0100100101001001001010010010010 · · ·
2   3   0101010010101001010101001010100 · · ·
3   2   0010010001001000100100010010010 · · ·
3   5   0010010010010010001001001001001 · · ·

table 3. some biperiodic fibonacci words.

proposition 6. the numbers of occurrences of 1 and 0 in the finite biperiodic fibonacci words are given by pn and hn, where pn and hn satisfy the following recurrences:

p0 = 0, p1 = 0, p2 = 1,
pn = a p_{n−1} + p_{n−2}, if n ≥ 3 is even,
pn = b p_{n−1} + p_{n−2}, if n ≥ 3 is odd.   (3)

moreover,

pn = (b^(1−ξ(n−1)) / (ab)^⌊(n−1)/2⌋) · (α^(n−1) − β^(n−1))/(α − β), for n ≥ 1,   (4)

and

h0 = 0, h1 = 1, h2 = a − 1,
hn = a h_{n−1} + h_{n−2}, if n ≥ 3 is even,
hn = b h_{n−1} + h_{n−2}, if n ≥ 3 is odd.   (5)

moreover, for n ≥ 1,

hn = (a − 1) f_{n−1}^(b,a) + (a/b)^(ξ(n−1)) f_{n−2}^(b,a).   (6)

proof. recurrences (3) and (5) are clear from the definition of f(a,b,n). equation (4) is clear from the binet-like formula, see equation (1); we obtain equation (6) from [12, theorem 8].

proposition 7. one has

(1.) lim_{n→∞} |f(a,b,n)| / |f(a,b,n)|_1 = α/b = (ab + √((ab)² + 4ab))/(2b).

(2.)
limn→∞ |f(a,b,n)| |f(a,b,n)|0 = a(α+1) α(a−1)+a. (3.) limn→∞ |f(a,b,n)|0 |f(a,b,n)|1 = a ( 1 + 1 α ) − 1. proof. (1.) from equations (1) and (4) we obtain lim n→∞ |f(a,b,n)| |f(a,b,n)|1 = lim n→∞ qn pn = lim n→∞ ( a1−ξ(n) (ab)b n 2 c ) αn−βn α−β( b1−ξ(n−1) (ab)b n−1 2 c ) αn−1−βn−1 α−β = lim n→∞ (ab)b n−1 2 c(a1−ξ(n))(αn −βn) (ab)bn2 c(b1−ξ(n−1))(αn−1 −βn−1) . if n is even, then lim n→∞ |f(a,b,n)| |f(a,b,n)|1 = 1 b lim n→∞ αn −βn αn−1 −βn−1 = 1 b lim n→∞ α 1 − ( β α )n 1 − ( β α )n−1 = αb . if n is odd, the limit is the same. 53 josé l. ramírez, gustavo n. rubiano acta polytechnica (2.) from equations (1) and (6) we obtain lim n→∞ |f(a,b,n)| |f(a,b,n)|0 = lim n→∞ qn pn = lim n→∞ ( a1−ξ(n) (ab)b n 2 c ) αn−βn α−β (a− 1)b(n) + ( a b )ξ(n−1) b(n− 1) , where b(n) = ( b1−ξ(n) (ab)b n−2 2 c ) αn−2−βn−2 α−β . if n is even, then lim n→∞ |f(a,b,n)| |f(a,b,n)|0 = a lim n→∞ αn −βn (a−1)(ab)(αn−1−βn−1)+a2b(αn−2−βn−2) = a 1 1 α (a− 1)(ab) + a2b 1 α2 = a(α + 1) α(a− 1) + a . if n is odd, the limit is the same. (3.) the proof runs like in the previous two items. the previous proposition is a particular result related to the incidence matrix of a substitution, see, e.g., [7]. proposition 8. the biperiodic fibonacci word and the nth biperiodic fibonacci word satisfy the following properties. (1.) word 11 is not a subword of the biperiodic fibonacci word, for a ≥ 2, and b ≥ 1. (2.) let xy be the last two symbols of f(a,b,n). for n,a ≥ 2, and b ≥ 1, we have xy = 01 if n is even, and xy = 10 if n is odd. (3.) the concatenation of two successive biperiodic fibonacci words is “almost commutative”, i.e., f(a,b,n−1)f(a,b,n−2) and f(a,b,n−2)f(a,b,n−1) have a common prefix of length qn−1 + qn−2 − 2, for all n ≥ 3. proof. (1.) it suffices to prove that 11 is not a subword of f(a,b,n), for all n ≥ 0. we proceed by induction on n. for n = 0, 1, 2 it is clear. now, assume that it is true for an arbitrary integer n ≥ 2. if n+ 1 is even, we know that f(a,b,n+1) = fa(a,b,n)f(a,b,n−1), so by the induction hypothesis we have that 11 is not a subword of f(a,b,n) and f(a,b,n−1). therefore, the only possibility is that 1 is a suffix and a prefix of f(a,b,n) or 1 is a suffix of f(a,b,n) and a prefix of f(a,b,n−1), but these are impossible by the definition of the word f(a,b,n). if n + 1 is odd the proof is analogous. (2.) we proceed by induction on n. for n = 2 it is clear. now, assume that it is true for an arbitrary integer n ≥ 2. if n + 1 is even, we know that f(a,b,n+1) = fa(a,b,n)f(a,b,n−1), and by the induction hypothesis the last two symbols of f(a,b,n−1) are 01, therefore the last two symbols of f(a,b,n+1) are 01. analogously, if n + 1 is odd. (3.) we proceed by induction on n. for n = 3, 4 it is clear. now, assume that it is true for an arbitrary integer n ≥ 4. if n is even, then by definition of f(a,b,n), we have f(a,b,n−1)f(a,b,n−2) = fb(a,b,n−2)f(a,b,n−3) ·fa(a,b,n−3)f(a,b,n−4) = (f a (a,b,n−3)f(a,b,n−4)) b ·fa(a,b,n−3)f(a,b,n−3)f(a,b,n−4), and f(a,b,n−2)f(a,b,n−1) = fa(a,b,n−3)f(a,b,n−4) ·f b (a,b,n−2)f(a,b,n−3) = fa(a,b,n−3)f(a,b,n−4) · (f a (a,b,n−3)f(a,b,n−4)) b ·f(a,b,n−3) = (fa(a,b,n−3)f(a,b,n−4)) b fa(a,b,n−3)f(a,b,n−4)f(a,b,n−3). hence the words have a common prefix of length b(aqn−3 + qn−4) + aqn−3 = bqn−2 + aqn−3. by the induction hypothesis f(a,b,n−3)f(a,b,n−4) and f(a,b,n−4)f(a,b,n−3) have a common prefix of length qn−3 +qn−4−2. therefore the words have a common prefix of length bqn−2 + aqn−3 + qn−3 + qn−4 −2 = qn−1 + qn−2 −2. if n is odd the proof is analogous. 
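the statements above lend themselves to direct experiment; the sketch below builds f(a,b,n) from the recursive definition and checks the limits of proposition 7 and the common-prefix length of proposition 8(3) (a = 2, b = 3 is an arbitrary test case):

```python
import math

def biperiodic_word(a, b, n):
    """nth biperiodic fibonacci word f(a,b,n) from the recursive definition."""
    if n == 0:
        return ""
    words = ["", "0", "0" * (a - 1) + "1"]    # f(a,b,0..2)
    for k in range(3, n + 1):
        power = a if k % 2 == 0 else b        # a-th power for even k, b-th for odd
        words.append(words[-1] * power + words[-2])
    return words[n]

def common_prefix_len(u, v):
    n = 0
    for x, y in zip(u, v):
        if x != y:
            break
        n += 1
    return n

a, b = 2, 3
alpha = (a * b + math.sqrt((a * b) ** 2 + 4 * a * b)) / 2
w9, w10 = biperiodic_word(a, b, 9), biperiodic_word(a, b, 10)

# proposition 7: |f|/|f|_1 -> alpha/b and |f|_0/|f|_1 -> a(1 + 1/alpha) - 1
print(len(w10) / w10.count("1"), alpha / b)
print(w10.count("0") / w10.count("1"), a * (1 + 1 / alpha) - 1)

# proposition 8(3): f(n-1)f(n-2) and f(n-2)f(n-1) agree on q(n-1)+q(n-2)-2 symbols
print(common_prefix_len(w10 + w9, w9 + w10), len(w10) + len(w9) - 2)
```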
the above proposition is a particular result related to sturmian words, see, e.g., [14]. definition 9. let φ : {0, 1}∗ → {0, 1}∗ be a map such that φ deletes the last two symbols, i.e., φ(a1a2 · · ·an) = a1a2 · · ·an−2, if n > 2, and φ(a1a2 · · ·an) = � if n ≤ 2. corollary 10. the nth biperiodic fibonacci word satisfies for all n ≥ 2, and x,y ∈{0, 1} that (1.) φ(f(a,b,n−1)f(a,b,n−2)) = φ(f(a,b,n−2)f(a,b,n−1)). (2.) φ(f(a,b,n−1)f(a,b,n−2)) = f(a,b,n−2)φ(f(a,b,n−1)) = f(a,b,n−1)φ(f(a,b,n−2)). (3.) if f(a,b,n) = φ(f(a,b,n))xy, then φ(f(a,b,n−2))xyφ(f(a,b,n−1)) = f(a,b,n−1)φ(f(a,b,n−2)). (4.) if f(a,b,n) = φ(f(a,b,n))xy, then φ(f(a,b,n)) =   φ(f(a,b,n−2))(10φ(f(a,b,n−1)))a, if n is even. φ(f(a,b,n−2))(01φ(f(a,b,n−1)))b, if n is odd. proof. (1.) and (2.) follow immediately from proposition 8.(3.) and |f(a,b,n)| ≥ 2 for all n ≥ 2. for (3.), in 54 vol. 55 no. 1/2015 biperiodic fibonacci word and its fractal curve symbol action 1 draw a line forward. 0 draw a line forward and if the symbol 0 is in a position even, then turn θ degree and if 0 is in a position odd, then turn −θ degrees. table 4. odd-even drawing rule with turn angle θ. fact, if f(a,b,n) = φ(f(a,b,n))xy, then from proposition 8.(2.) we have f(a,b,n−2) = φ(f(a,b,n−2))xy. hence φ(f(a,b,n−2))xyφ(f(a,b,n−1)) = f(a,b,n−2)φ(f(a,b,n−1)) = f(a,b,n−1)φ(f(a,b,n−2)). item (4.) is clear from (3.) and the definition of f(a,b,n). theorem 11. φ(f(a,b,n)) is a palindrome for all n,a ≥ 2 and b ≥ 1. proof. we proceed by induction on n. if n = 2, 3, then φ(f(a,b,2)) = 0a−1 and φ(f(a,b,3)) = (0a−11)b0 are palindromes. now, assume that it is true for an arbitrary integer n ≥ 3. if n is even, then from corollary 10 (φ(f(a,b,n)))r = (φ(fa(a,b,n−1)f(a,b,n−2))) r = (fa(a,b,n−1)φ(f(a,b,n−2))) r = φ(f(a,b,n−2))r(fa(a,b,n−1)) r = φ(f(a,b,n−2))(fr(a,b,n−1)) a = φ(f(a,b,n−2))(φ(f(a,b,n−1)10)r)a = φ(f(a,b,n−2))(01φ(f(a,b,n−1)))a = (φ(f(a,b,n))). if n is odd, the proof is analogous. corollary 12. (1.) if f(a,b,n) = φ(f(a,b,n))xy, then yxφ(f(a,b,n))xy is a palindrome. (2.) if u is a subword of the biperiodic fibonacci word, then so is its reversal, ur. the above propositions are particular results related to palindromes of sturmian words, see, e.g., [9]. theorem 13. let ζ = [0,a,b] be an irrational number, with a and b positive integers, then w(ζ) = f(a,b). proof. let ζ = [0,a,b] be an irrational number, then its associated standard sequence is s−1 = 1, s0 = 0, s1 = sa−10 s−1 = 0 a−11, and sn = { sbn−1sn−2, if n ≥ 2 is even, san−1sn−2, if n ≥ 2 is odd. hence {sn}n≥0 = {f(a,b,n+1)}n≥0 and from equation (2), we have w(ζ) = lim n→∞ sn = f(a,b). remark. note that ζ = [0,a,b] = 1 a+ 1 b+ 1 a+ 1··· = −ab+ √ (ab)2+4ab 2a = − α 2 . from the above theorem, we conclude that biperiodic fibonacci words are sturmian words. a fractional power is a word of the form z = xny, where n ∈ z+, x ∈ σ+ and y is a prefix of x. if |z| = p and |x| = q, we say that z is a p/q-power, or z = xp/q. in the expression xp/q, the number p/q is the power’s exponent. for example, 01201201 is an 8/3-power, 01201201 = (012)8/3. the index of an infinite word w ∈ σω is defined by ind(w) := sup{r ∈ q≥1 : w contains an r-power.} for example, ind(f ) > 3 because the cube (010)3 occurs in f at position 6. mignosi and pirillo [15] proved that ind(f ) = 2 + φ ≈ 3.618, where φ is the golden ratio. ramírez and rubiano [20] proved that the index of the k-fibonacci word is given by ind(fk) = 2+k+1/rk,1, where rk,1 = (k+ √ k2 + 4)/2. 
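such index values can be estimated empirically: the sketch below searches a long prefix of f for maximal fractional powers with small periods, reusing fib_word from the earlier block. this is only a brute-force lower bound, but it approaches ind(f) = 2 + φ as the prefix and the period range grow.

```python
import math

def max_power_at(w, q):
    """largest exponent p/q of a fractional power with period q in w:
    extend each q-length block as long as w stays q-periodic (brute force)."""
    best = 0
    for i in range(len(w) - q):
        j = i + q
        while j < len(w) and w[j] == w[j - q]:
            j += 1
        best = max(best, j - i)          # length of the q-periodic run from i
    return best / q

f_prefix = fib_word(25)[:20000]          # long prefix of the fibonacci word
est = max(max_power_at(f_prefix, q) for q in range(1, 15))
print(est, 2 + (1 + math.sqrt(5)) / 2)   # estimate vs ind(f) = 2 + phi = 3.618...
```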
a general formula for the index of a sturmian word was given by damanik and lenz [11]. theorem 14 [11]. if w is a sturmian word of slope θ = [0,a1,a2,a3, . . . ], then ind(w) = sup n≥0 {2 + an+1 + rn−1 − 2 rn }, where rn is the denominator of θ = [0,a1,a2, a3, . . . ,an] and satisfies r−1 = 0,r0 = 1,rn+1 = an+1rn + rn−1. corollary 15. the index of the biperiodic fibonacci word is ind(f(a,b)) = max { 2 + a + a α , 2 + b + b α } , (7) where α = (ab + √ (ab)2 + 4ab)/2. proof. the word f(a,b) is a sturmian word of slope ζ = [0,a,b], then from the above theorem rn = qn+1, and ind(f(a,b)) = supn≥0{hn}, where hn = { 2 + b + qn−1−2 qn , if n is even, 2 + a + qn−1−2 qn , if n is odd. then ind(f(a,b)) = max { sup n≥0 { 2 + a + q2n−2 q2n+1 } , sup n≥0 { 2 + b + q2n−1−2 q2n }} . since q2n+1/q2n → α/a and q2n/q2n−1 → α/b as n → ∞, then equation (7) follows. 55 josé l. ramírez, gustavo n. rubiano acta polytechnica f(2,5)9 , θ = 90 ◦ f(5,5)7 , θ = 90 ◦ f(6,6)6 , θ = 90 ◦ f(2,5)10 , θ = 90 ◦ f(2,3)10 , θ = 60 ◦ f(6,6)5 , θ = 60 ◦ table 5. some curves f(a,b)n with an angle θ. 4. the biperiodic fibonacci word curve the odd-even drawing rule can be extended from a parameter θ, where θ is the turn angle, see table 4. if θ = 90◦, then we obtain the drawing rule in table 1. definition 16. the nth-biperiodic curve of fibonacci, denoted by f(a,b)n , is obtained by applying the odd-even drawing rule to the word f(a,b,n). the biperiodic fibonacci word fractal f(a,b), is defined as f(a,b) := limn→∞f (a,b) n . in table 5, we show some curves f(a,b)n with a given angle θ. proposition 17. the biperiodic fibonacci word curve and the curve f(a,b)n have the following properties: (1.) the biperiodic fibonacci curve f(a,b) is composed only of segments of length 1 or 2. (2.) the curve f(a,b)n is symmetric. (3.) the number of turns in the curve f(a,b)n is hn. proof. (1.) it is clear from proposition 8.(1.), because 110 and 111 are not subwords of f (a,b)n . (2.) it is clear from theorem 11, because f(a,b,n) = φ(f(a,b,n))xy, where φ(f(a,b,n)) is a palindrome. (3.) it is clear from the definition of the odd-even drawn rule and because |f(a,b,n)|0 = hn; see proposition 6. proposition 18 (monnerot-dumaine [16]). the curve f(1,1)n = fn is similar to the curve f (1,1) n+3 = fn+3. proposition 19 (ramírez and rubiano [20]). if a is even, then the curve f(a,a)n = fa,n is similar to the curve f(a,a)n+2 = fa,n+2. if n is odd, then the curve f(a,a)n = fa,n is similar to the curve f (a,a) n = fa,n+3. theorem 20. (1.) if a is even, then the curve f(a,b)n is similar to the curve f(a,b)n+2 , i.e., they have the same shape except for the number of segments. (2.) if a 6= b, and a,b are odd, then the curve f(a,b)n is similar to the curve f(a,b)n+6 . (3.) if a is odd, and b is even, then the curve f(a,b)n is similar to the curve f(a,b)n+4 . proof. suppose a is even. then it is clear that σ(a,b)(f(a,b,n)) = f(a,b,n+2); see theorem 4. we are going to prove that σ(a,b) preserves the odd-even alternation required by the odd-even drawing rule. in fact, σ(a,b)(0) = (0a−11)b0, and σ(a,b)(1) = (0a−11)b0a1. as a is even, then |σ(a,b)(0)| and |σ(a,b)(1)| are odd. 56 vol. 55 no. 1/2015 biperiodic fibonacci word and its fractal curve figure 2. curves f(2,6)7 ,f (2,6) 9 ,f (2,6) 11 with θ = 72 ◦. hence if |w| is even (odd), then |σ(a,b)(w)| is even (odd). since σ(a,b) preserves parity of length then any subword in a biperiodic fibonacci word preserves parity of position. 
finally, let a(w) be the function that gives the ending or resulting angle of an associate curve to the word w through the odd-even drawing rule of angle θ. we have to prove that the resulting angle of a curve must be preserved or inverted by σ(a,b). note that a(00) = 0, a(01) = −θ and a(10) = +θ. therefore • a(σ(a,b)(00)) = a((0a−11)b0(0a−11)b0) = a((0a−11)b) + a(0a) + a((10a−1)b−1) + a(10) = a((0a−11)b) + a(0a) + a(10) + a(0a−2) + a(10) + a(0a−2) + · · ·+a(10) = b(−θ) + 0 + θ + 0 + θ + · · · + θ = 0. • a(σ(a,b)(01)) = a((0a−11)b0(0a−11)b0a1) = a((0a−11)b) + a(0a) + a(10) + a(0a−2) + · · · + a(10)a(0a−11) = 0 −θ = −θ. • a(σ(a,b)(10)) = a((0a−11)b0a1(0a−11)b0) = a((0a−11)b) + a(0a) + a(10) + a(0a−2) + a(10) + a(0a−2) +· · ·+a(10) = bθ+ 0 +θ+ 0 +θ+· · ·+θ = bθ + (b + 1)θ = +θ. then σ(a,b) preserves the resulting angle, i.e., a(w) = a(σ(a,b)(w)) for any word w of even length. therefore the image of a pattern by σ(a,b) is the rotation of this pattern by an angle of +θ. since σ(a,b)(f(a,b,n)) = f(a,b,n+2), then the curve f(a,b,n) is similar to the curve f(a,b,n+2). if a 6= b, and a,b are odd the proof is similar, but using σ3(a,b). if a is odd, and b is even the proof is similar, but using σ2(a,b). an open problem is try to find a characterization of similar word curves in terms of words. example 21. in figure 2, f(2,6)7 is similar to f(2,6)9 ,f (2,6) 11 and so on. in figure 3, f (3,4) 5 is similar to f(3,4)9 and so on. acknowledgements the authors thank the anonymous referees for their careful reading of the manuscript and their fruitful comments and suggestions. this research is for the observatorio de restituciãşn y regulaciãşn de derechos de propiedad agraria. moreover, it was supported in part by the equipment donation from the german academic exchange service-daad to the faculty of science at the universidad nacional de colombia. 57 josé l. ramírez, gustavo n. rubiano acta polytechnica figure 3. curves f(3,4)5 ,f (3,4) 9 with θ = 120 ◦. references [1] allouche, j., shallit, j., automatic sequences, cambridge university press, cambridge, 2003. [2] baláži, p., various properties of sturmian words, acta polytech. 45(5), 2002, 19–23. [3] berstel, j., fibonacci words-a survey, in: g. rosenberg, a. salomaa (eds.), the book of l, springer, berlin, 1986, 11–26. [4] blondin-massé, a., brlek, s., garon, a., labbé, s., two infinite families of polyominoes that tile the plane by translation in two distinct ways, theoret. comput. sci. 412, 2011, 4778–4786. [5] cassaigne, j., on extremal properties of the fibonacci word, rairo theor. inf. appl., 42(4), 2008, 701–715. [6] chuan, w., fibonacci words, fibonacci quart., 30(1), 1992, 68–76. [7] fuchs, c., tijdeman, r., substitutions, abstract number systems and the space filling property, ann. inst. fourier, 56(7), 2006, 2345–2389. [8] de luca, a., a division property of the fibonacci word, inform. process. lett., 54, 1995, 307–312. [9] de luca, a., mignosi, f., some combinatorial properties of sturmian words, theoret. comput. sci. 136, 1994, 361–385. [10] droubay x., palindromes in the fibonacci word, inform. process. lett., 55, 1995, 217–221. [11] damanik, d., lenz, d., the index of sturmian sequences, european j. combin., 23(1), 2002, 23–29. [12] edson, m., yayenie, o., a new generalization of fibonacci sequence and extended binet’s formula, integers, 9(6), 2009, 639–654. [13] koshy, t., fibonacci and lucas numbers with applications, a wiley-interscience publication, 2001. 
[14] lothaire, m., algebraic combinatorics on words, encyclopedia of mathematics and its applications, cambridge university press, cambridge, 2002. [15] mignosi, f., pirillo, g., repetitions in the fibonacci infinite word, rairo inform. theor. appl., 26, 1992, 199–204. [16] monnerot-dumaine, a., the fibonacci word fractal, preprint, http: //hal.archives-ouvertes.fr/hal-00367972/fr/, 2009 [2014-12-01]. [17] pirillo, g., fibonacci numbers and words, discrete math., 173, 1997, 197–207. [18] prusinkiewicz, p., lindenmayer, a.: the algorithmic beauty of plants, springer-verlag. nueva york, 2004. [19] ramírez, j., rubiano, g., generating fractals curves from homomorphisms between languages [with mathematicar] (in spanish), revista integración 30(2), 2012, 129–150. [20] ramírez, j., rubiano, g., on the k-fibonacci words, acta univ. sapientiae infor., 5(2), 2013, 212–226. [21] ramírez, j., rubiano, g., properties and generalizations of the fibonacci word fractal. exploring fractal curves, the mathematica journal, 16, 2014. [22] ramírez, j., rubiano, g., de castro, r., a generalization of the fibonacci word fractal and the fibonacci snowflake, theoret. comput. sci., 528, 2014, 40–56. 58 http://hal.archives-ouvertes.fr/hal-00367972/fr/ http://hal.archives-ouvertes.fr/hal-00367972/fr/ acta polytechnica 55(1):50–58, 2015 1 introduction 2 definitions and notation 3 biperiodic fibonacci words 4 the biperiodic fibonacci word curve acknowledgements references ap1_02.vp 1 introduction a number of numerical methods may be applied when estimating the bearing capacity of existing as well as planned buildings with random properties of structural elements, especially of vertical and horizontal joints. at present, probabilistic methods can be broadly classified into two major categories – methods using a statistical approach and methods using a nonstatistical approach [1]. statistical methods are based on simulation. the direct monte carlo method and the latin hypercube sampling (lhs) technique are fairly known, as well as improved simulation methods known as “importance sampling” and “adaptive sampling”. nonstatistical methods include numerical integration, the method of second order moments and the probabilistic finite element method. the horizontal and vertical joints of precast buildings and their properties are structural elements of the utmost importance. calculations based on statistical methods and taking into account the random material properties of joints and panels, as well as the random properties of loading, especially due to temperature impact, are rather complicated and time consuming. that is why a different approach using reliability index � is preferred to the direct determination of failure probability. it is well known that very low values of � are attained (� � 2) when deterioration of the joint due to an extreme inelastic deformation and/or due to a certain type of cyclic loading is developed to such an extent that the consecutive static stiffness approaches its residual value. a typical loading path of a reinforced vertical joint published in [5] is displayed in fig. 1. based on this observation, the proposed procedure is as follows. index � is determined using the second order reliability method. in the parts of joints where values of � are rather low, the initial stiffnesses of the joints are reduced to 20 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 41 no. 2/2001 reliability – based design of precast buildings m. kalousková, e. novotná, j. 
the horizontal and vertical joints of precast buildings and their properties are structural elements of the utmost importance. calculations based on statistical methods, taking into account the random material properties of joints and panels as well as the random properties of loading, especially due to temperature impact, are rather complicated and time consuming. that is why a different approach, using the reliability index β, is preferred to the direct determination of the failure probability. it is well known that very low values of β are attained (β ≤ 2) when deterioration of the joint, due to an extreme inelastic deformation and/or due to a certain type of cyclic loading, has developed to such an extent that the consecutive static stiffness approaches its residual value. a typical loading path of a reinforced vertical joint published in [5] is displayed in fig. 1.

fig. 1: loading paths for a vertical joint

based on this observation, the proposed procedure is as follows. the index β is determined using the second order reliability method. in the parts of joints where the values of β are rather low, the initial stiffnesses of the joints are reduced to their residual values. the aim of this paper is to demonstrate that
• this simple algorithm, which does not require an examination of the whole loading path from fig. 1, makes it possible to describe the propagation of the deteriorated regions of joints,
• the image of these regions is similar to that obtained by the well-tried finite element deterministic solutions.

2 reliability analysis of joints

probabilistic analysis is carried out using the nasrel (numerical analysis of structures for reliability) code. nasrel is the high performance finite element nascom (numerical analysis of structures and combined objects) code integrated with comrel (componental reliability analysis). the second order reliability method (sorm) is used to determine the reliability index β at selected points of the joints. under the assumption that the failure domain $g(\mathbf{u}) \le 0$, $\mathbf{u}$ being the normalized basic uncertainty variables, is twice differentiable, the failure surface $g(\mathbf{u}) = 0$ in the vicinity of the critical point $\mathbf{u}^*$ with the distance $\beta = \|\mathbf{u}^*\|$ to the origin is approximated by its supporting hyperparaboloid. expanding the function g into the taylor series up to the second order terms and introducing certain orthogonal transformations, the failure surface can be written as

$g(\mathbf{u}) \approx \beta - u_n + \frac{1}{2}\sum_{i=1}^{n-1} \kappa_i u_i^2 = 0.$ (1)

the parameters $\kappa_i$, $i = 1, 2, \dots, n-1$, stand for the second order derivatives in the principal directions of the failure surface. an expression for the failure probability can be found in [6] in this form:

$P\big[g(\mathbf{u}) \le 0\big] \approx \Phi(-\beta)\prod_{i=1}^{n-1}\big(1 + \beta\kappa_i\big)^{-1/2},$ (2)

where Φ is the laplace function. the rackwitz/fiessler optimization procedure is used to find the design point.
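once β and the principal curvatures κ_i at the design point are known, the sorm estimate (2) is a few lines of code; a minimal sketch using only the python standard library (the curvature values in the example are hypothetical):

```python
import math

def sorm_failure_probability(beta, kappas):
    """breitung-type sorm estimate, eq. (2):
    P[g(u) <= 0] ~ Phi(-beta) * prod_i (1 + beta * kappa_i)^(-1/2),
    with Phi the standard normal cdf (the 'laplace function')."""
    phi_minus_beta = 0.5 * (1.0 + math.erf(-beta / math.sqrt(2.0)))
    correction = 1.0
    for kappa in kappas:
        correction /= math.sqrt(1.0 + beta * kappa)
    return phi_minus_beta * correction

# a joint at the reduction threshold beta = 1.5 used in section 5,
# with hypothetical principal curvatures of the failure surface
print(sorm_failure_probability(1.5, [0.10, -0.05]))  # ~ 0.065
```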
3 model of a precast building

the fem code nascom is called when analyzing the state of stress in the walls and joints of a precast structure. in this paper, 3d elements have been used for panels and joints, beam elements to model a continuous footing, and beam and truss elements for the equivalent subgrade structures. a 3d coulomb condition describes the failure envelope in joints [3] as

$\sigma_i - \sigma_j \cot^2\!\Big(\frac{\pi}{4} - \frac{\varphi}{2}\Big) \le \sigma_{\mathrm{red}}, \qquad i, j = 1, 2, 3,\ i \ne j,$ (3)

$\sigma_{\mathrm{red}} = \frac{2c\,\cos\varphi}{1 - \sin\varphi},$ (4)

where $\sigma_i$, $\sigma_j$ are the principal stresses, c is the cohesion coefficient and φ the friction angle. if $\sigma_1 \ge \sigma_2 \ge \sigma_3$, condition (3) reduces to

$\sigma_1 - \sigma_3 \cot^2\!\Big(\frac{\pi}{4} - \frac{\varphi}{2}\Big) \le \sigma_{\mathrm{red}}.$ (5)

fig. 2: interaction diagram for a joint

regarding the fact that the material of the joints is plastically anisotropic, condition (5) can be written as [4]

$\sigma_1 - m\,\sigma_3 \le \sigma_c,$ (6)

where $m = R_c/R_t$ is the ratio of the strength in compression to the strength in tension. the following three loadings are combined – the dead load, the loading transferred from the ceiling panels, and the temperature. all of these are supposed to be randomly distributed. in the deterministic solution the 2d finite elements are preferred to the 3d formulation discussed above. fig. 2 shows the material model describing the interaction between the shear and normal stresses. the σ–ε diagram has been derived from experimental results for a layer of lower strength [2]. the model is characterized by the following parameters – the characteristic strength $f_{ck}$ and the ultimate strains $\varepsilon_{ck}$ and $\varepsilon_c$. the proportional limit $\sigma_e$ is equal to $0.4 f_{ck}$. the determination of the reduced stiffnesses $E_{\mathrm{red}}$, $G_{\mathrm{red}}$ at the n-th iteration step is based on the assumption that the strains $\varepsilon^{(n-1)}$ and $\gamma^{(n-1)}$ from the preceding iteration step are known. the algorithm starts by determining the reduced stiffness $E_{\mathrm{red}}$. next, the corresponding ultimate shear strength of the joint $\tau_{\max}$ is determined. finally, $G_{\mathrm{red}}$ is assigned to the known $\gamma^{(n-1)}$. this algorithm is implemented in the finite element code feat, where contact elements are used to model the joints. for more details, see [2].

4 model of a subgrade

the proposed analysis of a precast structure takes into account the structure–subgrade interaction. a straightforward way to solve this problem with the nascom code is to use 3d elements for both the structure and the subsoil. an alternative and more effective approach is based on the winkler–pasternak model with two parameters, which is not implemented in the nascom code. this model is described in a concise manner in what follows. the stiffness of the subgrade is replaced by the stiffness of an equivalent construction composed of truss and beam elements, as shown in fig. 3. as for the model of the subgrade, noninteracting foundation structures are considered [1]. three basic types of elements are used (fig. 4), and the deformation of each of them is given by the vertical displacements of the end-points 1 and 2.

fig. 3: beam and truss construction
fig. 4: basic types of subgrade elements ("flange", "web")

a) inner element i. the stiffness matrix of the subgrade element is expressed [1] as

$[K_{wp}]^{\mathrm I} = \begin{bmatrix} \frac{2bc_1^*\ell}{3} + \frac{2bc_2^*}{\ell} & \frac{bc_1^*\ell}{3} - \frac{2bc_2^*}{\ell} \\ \frac{bc_1^*\ell}{3} - \frac{2bc_2^*}{\ell} & \frac{2bc_1^*\ell}{3} + \frac{2bc_2^*}{\ell} \end{bmatrix},$ (7)

where 2b is the width of the foundation, ℓ the length of the element, $c_1$, $c_2$ the stiffness parameters of the winkler–pasternak model, and $c_1^*$, $c_2^*$ the modified parameters accounting for the subgrade beyond the footing, as defined in [1].

the corresponding stiffness matrix $[K_{bt}]$ of the equivalent beam and truss element (see fig. 5) is given by

$[K_{bt}] = \begin{bmatrix} \frac{1}{2}\frac{E_tA_t}{h} + \frac{12E_bI_b}{(1+\Phi)\ell^3} & -\frac{12E_bI_b}{(1+\Phi)\ell^3} \\ -\frac{12E_bI_b}{(1+\Phi)\ell^3} & \frac{1}{2}\frac{E_tA_t}{h} + \frac{12E_bI_b}{(1+\Phi)\ell^3} \end{bmatrix},$ (8)

where $A_t$, $A_b$ are the cross-section areas of the truss and beam, respectively, $E_t$, $E_b$ the young moduli of the truss and beam, respectively, ℓ the length of the beam element, h the length of the truss, $I_b$ the moment of inertia of the beam cross section, and $\Phi = \frac{6E_bI_b}{kG_bA_b\ell^2}$ a coefficient expressing the influence of shear.

comparing the equivalent stiffness matrices (7) and (8) gives

$\frac{1}{2}\frac{E_tA_t}{h} + \frac{12E_bI_b}{(1+\Phi)\ell^3} = \frac{2bc_1^*\ell}{3} + \frac{2bc_2^*}{\ell},$ (9)

$-\frac{12E_bI_b}{(1+\Phi)\ell^3} = \frac{bc_1^*\ell}{3} - \frac{2bc_2^*}{\ell}.$ (10)

the determination of the beam and truss characteristics is then evident.

fig. 5: equivalent beam and truss construction – elements of types i, ii

b) end-point element ii. the stiffness matrix of the subgrade element is obtained from (7) by adding the complementary matrix

$[\Delta K_{wp}]^{\mathrm{II}} = \begin{bmatrix} 0 & 0 \\ 0 & 2b\sqrt{c_1c_2} \end{bmatrix}.$ (11)
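the "evident" determination can be spelled out: eq. (10) gives the beam term directly, and substituting it back into eq. (9) leaves $E_tA_t/h = 2bc_1^*\ell$. the sketch below rests on our reconstruction of (7)–(10), neglects shear (Φ = 0 by default) and, for illustration only, identifies $c_1^*$, $c_2^*$ with the sandy-loam parameters of section 5; the dimensions b and ℓ are hypothetical.

```python
def equivalent_members(b, ell, c1s, c2s, phi=0.0):
    """solve the matching conditions (9)-(10) for the equivalent
    truss stiffness E_t*A_t/h and beam stiffness E_b*I_b."""
    beam_term = 2.0 * b * c2s / ell - b * c1s * ell / 3.0  # = 12 EI / ((1+phi) l^3)
    if beam_term <= 0.0:
        raise ValueError("element too long for this c1*/c2* ratio")
    EI = beam_term * (1.0 + phi) * ell**3 / 12.0
    EA_over_h = 2.0 * b * c1s * ell
    return EA_over_h, EI

# half-width b = 0.5 m, element length 0.5 m, c1 = 15 MN/m^3, c2 = 5 MN/m
print(equivalent_members(0.5, 0.5, 15.0e6, 5.0e6))
```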
c) inner corner element iii. the interaction of the crossing beams cannot be neglected. substituting the following expression for the displacement of the subgrade in the vicinity of the inner corner,

$w(x, y) = w_0\, e^{-\sqrt{c_1/c_2}\,x}\, e^{-\sqrt{c_1/c_2}\,y},$

into the principle of virtual displacements yields formulas for the shear forces $q_x$, $q_y$ acting along the crossing beams (a unit corner displacement $w_0$ is considered):

$q_x = \sqrt{c_1c_2}\,\chi(y), \qquad q_y = \sqrt{c_1c_2}\,\chi(x),$ (12)

where

$\chi(y) = 1 - \frac{1}{4}\,e^{-2\sqrt{c_1/c_2}\,y}.$ (13)

for y = 0 or x = 0 we have

$q_x = q_y = \frac{3}{4}\sqrt{c_1c_2}.$ (14)

for $y \ge \ell_0$ or $x \ge \ell_0$ ($\ell_0$ being the length of the shear depression),

$q_x = q_y = \sqrt{c_1c_2}.$ (15)

fig. 6: distribution of shear forces in the vicinity of the inner corner

applying equations (12) through (15) to the elements in the vicinity of the inner corner (fig. 6) yields

$[K_{wp}]^{\mathrm{III}} = \begin{bmatrix} k_{11} & k_{12} \\ k_{21} & k_{22} \end{bmatrix},$

where for the "flange" shown in fig. 6

$k^{\mathrm f}_{11} = k^{\mathrm f}_{22} = b\big(c_1 + \sqrt{c_1c_2}\,\chi(x_s)\big)\frac{2\ell}{3} + \frac{2bc_2^*}{\ell},$ (16)

$k^{\mathrm f}_{12} = k^{\mathrm f}_{21} = b\big(c_1 + \sqrt{c_1c_2}\,\chi(x_s)\big)\frac{\ell}{3} - \frac{2bc_2^*}{\ell},$ (17)

and for the "web"

$k^{\mathrm w}_{11} = k^{\mathrm w}_{22} = b\big(c_1 + \sqrt{c_1c_2}\,\chi(y_s)\big)\frac{2\ell}{3} + \frac{2bc_2^*}{\ell},$ (18)

$k^{\mathrm w}_{12} = k^{\mathrm w}_{21} = b\big(c_1 + \sqrt{c_1c_2}\,\chi(y_s)\big)\frac{\ell}{3} - \frac{2bc_2^*}{\ell}.$ (19)

5 numerical example

part of a seven-story precast building of type g57 was analysed (fig. 7). the construction with the continuous footing lies on a sandy loam subgrade ($c_1$ = 15 mnm⁻³, $c_2$ = 5 mnm⁻¹, $e_0$ = 35 mpa). the statistical properties of the basic variables applied in the reliability analysis are listed in table 1. the aim of this paper is to demonstrate the propagation of the deteriorated regions rather than to describe truthfully the random properties of the building. for simplicity, all variables except for the friction angle, which is a constant, are supposed to be normally distributed with the coefficient of variation 0.1. the temperature loading is caused by exposing one side of the building to thermal radiation from the sun. the maximum value of the temperature change, 10 k, is considered on the outside surface and conducted to the inner wall (the thermal conductivity coefficient λ being 1.43 wm⁻¹k⁻¹).

fig. 7: ground plan and analyzed part of building g57
fig. 8: distribution of the failed joints

table 1: basic variables

  variable                               dimension  mean
  cohesion c                             mpa        2.5
  friction angle φ                       rad        0.52
  dead load – material density ρ         kgm⁻³      2300
  loading transferred from ceilings q    knm⁻²      6.67
  temperature increment Δt₁              k          4
  temperature increment Δt₂              k          4
  temperature increment Δt₃              k          2

three temperature levels were used together with the dead load and the loading transferred from the ceilings (table 1). at the first temperature level (Δt₁ = 4 k) the values of β attained in the whole structure were greater than 5. at the second level (total temperature increment Δt = 8 k) the stiffnesses in the regions of joints with β ≤ 1.5 were reduced to their residual values and the procedure was repeated. in this example, 10 % of the initial stiffness $k_{\mathrm{in}}$ has been chosen for the residual stiffness $k_{\mathrm{res}}$, even though this value somewhat overestimates the values obtained experimentally [5].
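the degradation procedure just described can be condensed into a short loop. this is a schematic sketch, not the actual nasrel implementation; run_sorm_analysis is a hypothetical stand-in for the fe + sorm evaluation, and the joint fields are our own.

```python
def run_sorm_analysis(joints):
    """placeholder for the nascom/comrel run that returns one
    reliability index per joint for the current stiffnesses."""
    return [j["beta"] for j in joints]  # stub: betas supplied externally

def degrade_joints(joints, beta_limit=1.5, residual_factor=0.10):
    """repeatedly reduce the stiffness of every joint with
    beta <= beta_limit to k_res = residual_factor * k_in and
    re-analyse, until the set of failed joints stops growing."""
    changed = True
    while changed:
        changed = False
        for joint, beta in zip(joints, run_sorm_analysis(joints)):
            if beta <= beta_limit and not joint["failed"]:
                joint["k"] = residual_factor * joint["k_initial"]
                joint["failed"] = True
                changed = True
    return joints

joints = [{"k": 1.0, "k_initial": 1.0, "failed": False, "beta": b}
          for b in (5.2, 1.4, 0.9)]
degrade_joints(joints)  # the two joints with beta <= 1.5 are degraded
```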
the vertical joints in precast buildings of this type are not equipped with reinforcing bars. their stiffness is assured by the ceiling panels, which overlap the vertical fissures between the wall panels. the resulting distribution of the failed joints is drawn in fig. 8a (solid lines). the third temperature level (total increment Δt = 10 k) caused the failure distribution demonstrated in fig. 8b. the detailed distribution of the deteriorated regions at the top of the building is displayed in fig. 8c. when comparing the results obtained in this way with a deterministic non-linear solution by the 2d fem mentioned in section 3, nearly the same images of the deteriorated regions of joints were reached. it should be pointed out that the two images become different when the coefficient of variation increases.

6 conclusion

this paper discusses a model describing the failure of a precast construction with random properties of joints and loading by means of the index β. it is evident that introducing the residual stiffnesses in joints with β < 1.5 leads to results that are comparable with the deterministic solution, provided the failure condition is of an adequate type. it appears that the results are almost the same when the level of the index β used to reduce the individual stiffnesses varies from 1 to 2. nevertheless, a fully probabilistic approach, using for example the monte carlo method, especially in conjunction with the response surface method, will provide more complex information about the construction behaviour and its reliability.

acknowledgments

the financial support was provided by gačr 103/99/0944 and by research project j04/98:210000001.

references

[1] bittnar, z., šejnoha, j.: numerical methods in structural mechanics. asce press, new york, thomas telford, london, 1996
[2] blažek, v., fajman, p., šejnoha, j.: východiska nelineární analýzy konstrukcí panelových budov (theoretical background of nonlinear analysis of precast structures). beton a zdivo (in print)
[3] ducháček, j.: nauka o pružnosti a pevnosti ii (theory of elasticity ii). sntl praha, 1964
[4] chen, w. f.: plasticity in reinforced concrete. mcgraw-hill, new york, 1982
[5] pume, d.: structural models of joints between concrete wall elements. ctu report, no. 2/1997
[6] strurel, theoretical manual. rcp consult, münchen, 1996

ing. marie kalousková, csc., phone: +420 2 2435 4489, e-mail: kalousko@fsv.cvut.cz
ing. eva novotná, phone: +420 2 2435 4483, e-mail: novotnae@fsv.cvut.cz
prof. ing. jiří šejnoha, drsc., phone: +420 2 2435 4492, e-mail: sejnoha@fsv.cvut.cz
dept. of structural mechanics, czech technical university in prague, faculty of civil engineering, thákurova 7, 166 29 praha 6, czech republic
erosita: agn science, background determination, and optical follow-up spectroscopy

thomas boller
max-planck-institut für extraterrestrische physik, psf 1312, 85741 garching, germany
corresponding author: bol@mpe.mpg.de

acta polytechnica 53(supplement):793–798, 2013, doi:10.14311/ap.2013.53.0793

abstract. more than 20 years after the highly influential rosat all-sky survey in the soft x-ray spectral range, we are close to the next major x-ray all-sky surveys with erosita. erosita will be the primary instrument on-board the russian "spectrum–roentgen–gamma" (srg) satellite, which will be launched from baikonur in 2014 and placed in an l2 orbit. it will perform the first imaging all-sky survey in the medium energy x-ray range up to 10 kev with an unprecedented spectral and angular resolution. the erosita all-sky x-ray survey will take place in a very different context than the rosat survey. there is now a wealth of completed, ongoing and planned surveys of the sky in a broad range of wavelengths from the gamma-ray and x-ray to the radio. a significant amount of science can be accomplished through the multi-frequency study of the erosita agn and cluster sample, including optical confirmation and photometric redshift estimation of the erosita extended sources and agns. optical spectroscopy has been, and will for the foreseeable future be, one of the main tools of astrophysics, allowing studies of a large variety of astronomical objects over many fields of research. to fully capitalize on the erosita potential, a dedicated spectroscopic follow-up program is needed.
4most is the ideal instrument to secure the scientific success of the erosita x-ray survey and to overcome the small sample sizes, together with the selection biases, that plagued past samples. the aim is to have the instrument commissioned in 2017, well matched to the data releases of erosita and gaia. the design and implementation of the 4most facility simulator aimed to optimize the science output for erosita is described in the necessary detail.

keywords: x-ray mission: new, spectroscopy.

1. erosita agn science

erosita will be the primary instrument on-board the russian "spectrum–roentgen–gamma" (srg) satellite which will be launched from baikonur in 2014 and placed in an l2 orbit (see [1] for the most recent update). erosita offers a unique possibility to understand the agn phenomenon more generally by extending agn observations within an all-sky survey and subsequent detailed pointed observations.

1.1. relativistic fe k emission lines, the strong field limit and black hole spin determination

the study of relativistically broadened fe k lines in agns can be used as a diagnostic tool of the geometry of the accreting matter at the innermost stable orbit as well as of the spin of the black hole. since the discovery of optically thick matter in the vicinity of a black hole in the form of a fe k line [2–4] and the resolving of the relativistic fe k line in mcg–6-30-15 [5], broad iron lines have so far been found only in a few other seyfert galaxies [6]. a recent study of nls1 galaxies revealed emission very close to the central black hole in three prototype objects, confirming the presence of soft x-ray reflection and moderate black hole spin values [7]. erosita, both during its survey and pointed operations, will offer a unique possibility to further study the behavior of matter under strong gravity on a much larger sample of agns showing relativistic line emission. these studies are of long-term importance for identifying complete samples for more detailed studies by future x-ray missions like athena [9]. these studies have to be accompanied by e.g. ground-based optical surveys and spectroscopy to uniquely identify the source type.

1.2. erosita black hole growth studies and multi-wavelength follow-up work

the study of black hole growth over cosmic time is one of the key science issues of relevance for the core science of the erosita mission. narrow-line seyfert 1 (nls1) galaxies, with their proven high accretion rates exceeding the eddington limit by a factor of 10–20, are most probably the objects with the highest black hole growth rates in the nearby universe and are therefore ideally suited for such studies. while liner galaxies accrete at very low fractions of their eddington accretion rates, with values ranging from about 10⁻⁶ to 10⁻² [10], the accretion rate increases with decreasing mass of the black hole. kollmeier et al. [11] have studied the eddington luminosity ratios of broad-line agn and found a relatively narrow scatter in the eddington accretion rates of 0.3 dex with a peak at about 0.25. nls1 galaxies exhibit the lowest black hole masses and the highest accretion rates [12, 13]. the most probable explanation for this trend is that nls1 galaxies are agns just in formation (e.g. mathur 2000), with low black hole masses and still a large gas reservoir to feed the black hole. the black hole growth due to accretion is expected to be the fastest in nls1 galaxies.
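for orientation, the eddington ratios quoted above follow from the standard eddington luminosity, L_Edd ≈ 1.26 × 10³⁸ (M/M_⊙) erg s⁻¹; a small helper (ours, using this textbook constant, not from the paper):

```python
def eddington_luminosity(m_bh_solar):
    """eddington luminosity in erg/s for a black hole mass given
    in solar masses: L_Edd ~ 1.26e38 * (M / M_sun) erg/s."""
    return 1.26e38 * m_bh_solar

def eddington_ratio(l_bol, m_bh_solar):
    """bolometric luminosity in units of the eddington luminosity."""
    return l_bol / eddington_luminosity(m_bh_solar)

# an nls1-like object: low black hole mass, high accretion rate
print(eddington_ratio(l_bol=1.0e45, m_bh_solar=1.0e6))  # ~ 8, super-eddington
```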
in addition, intense star-forming processes are expected, which lead to the enrichment of the circumnuclear material with metals. studying the accretion rate and the observational properties of nls1 galaxies with erosita provides a unique possibility to search for signatures of rapid black hole growth. the black hole growth due to accretion results in changes in the observational parameters of agns. the fwhm of the permitted lines of the blr will increase with time and is inversely correlated with the accretion rate. an increasing black hole mass will result in a lower ionizing continuum, flatter soft and hard x-ray continua and a higher emissivity from the nlr. optical and near-infrared spectroscopic follow-up work is essential here to uniquely apply the definition of nls1 galaxies based on their optical fe ii multiplet emission and narrow hβ fwhm line widths [14]. basically, this connects the nls1 research to a large-scale spectroscopic survey, which is a challenging and crucial aspect for the erosita mission. all these parameter changes will be tested by studying especially nls1 galaxies with erosita over a substantial amount of cosmic time. confirming the black hole growth and the change in the observational parameters over the lifetime of active galaxies in their nls1 phase will be of importance for the astrophysical community at large. with the capabilities of erosita we will achieve an important step forward in understanding growing black holes in a cosmological context.

1.3. accretion disc physics and nls1 science

xmm-newton observations appear to have decomposed the x-ray spectra of agn into a strong, relativistically blurred reflection plus a power law component (e.g. [15]). while the relativistic reflection model appears to be supported by the detection of both fe k and fe l emission with a normalization of the photon flux in the ratio of 20:1 in 1h 0707−495, in agreement with atomic physics, and by the presence of a 30 s time delay between the soft and hard light curves [16], the modelling of the high-energy parts of the spectra might require further improvements to account for the sharp spectral drops in 1h 0707−495 [17, 18] and iras 13224−3809 [19]. recently miller et al. [20] have questioned the reverberation interpretation associated with the inner disc. their results are consistent with a partial covering interpretation, with a substantial fraction of scattered x-rays passing through an absorbing medium whose opacity decreases with increasing energy. it is argued that both absorption and scattering are required and that the x-ray emitting material might be associated with an accretion disc wind. further modelling and observations are required to make progress. the erosita all-sky survey and pointed observations will open a new window on the study of nls1 galaxies. as rosat and xmm-newton provided new and fascinating observational results on nls1 galaxies, the further study of this class will be striking in terms of even more exciting discoveries.

2. synergy between erosita and multi-frequency missions

the erosita all-sky x-ray survey will take place in a very different context than the rosat survey. there is now a wealth of completed, ongoing and planned surveys of the sky in a broad range of wavelengths from the gamma-ray and x-ray to the radio.
a significant amount of science can be accomplished through the multi-frequency study of the erosita agn and cluster sample, including optical confirmation and photometric redshift estimation of the erosita extended sources and agns. the sdss survey is already public in the north and will provide excellent confirmation and redshift estimation of extended sources out to redshifts approaching z = 0.5. the pan-starrs1 survey will push around 0.75 magnitudes deeper than sdss and over a larger fraction of the sky, allowing studies of clusters and agn over essentially the full northern sky. the first two large-aperture, large solid angle multiband optical surveys, the dark energy survey in the south and the hyper suprime-cam survey in the north, will also be underway as the erosita data come in. both surveys will enable cluster confirmation and cluster and agn photometric redshift estimation out to z = 1 and beyond. the des survey is also coupled to a deep nir survey with vista (visible and infrared survey telescope for astronomy) that should further extend the redshift grasp of the erosita mission. these datasets will be available, either publicly or through targeted partnerships, but it will require a significant effort to couple them to the erosita source lists. the scientific payoff in studies of large-scale structure, cluster and agn populations and constraints on the cosmic acceleration will, however, be profound. in addition, a large-scale spectroscopic survey would deliver additional new science, including precise cluster redshifts to enrich large-scale structure and cosmology studies, as well as agn spectra that provide direct constraints on the physical state of the gas surrounding the central black holes in unobscured erosita x-ray sources. a large-scale spectroscopic survey is a challenging and crucial aspect for erosita, and the 4most (4-meter multi-object spectroscopic telescope) will play a very important role in the follow-up observations of erosita agns and erosita clusters. other surveys like fermi, planck and the spt will enable x-ray and sze studies of clusters of galaxies and agns. lsst (large synoptic survey telescope) will provide much deeper data than the des and the hsc once it comes along. with alma, agn studies in great detail are expected. ska (square kilometre array) will constrain the evolution of gas-rich galaxies out to z = 1, will probe matter in the strong field limit using black holes and pulsars, and will use the hydrogen emission from galaxies to measure the properties of dark energy. these surveys will be very complementary to erosita and are therefore of general interest for the synergy aspect. the multi-frequency approach is one of the key science drivers for the erosita mission.

3. erosita background determination

the erosita background has been simulated based on photon and high-energy particle spectral components. the cosmic diffuse photon x-ray background has been adopted from lumb et al. 2011 (http://arxiv.org/abs/astro-ph/0204147), and the high-energy particle background, which does not interact with the mirror system, has been calculated by tenzer et al. [21]. the expected background count rate has been compared with xmm-newton observations, which might provide the best test of the x-ray background for erosita around the lagrangian point l2 before real x-ray background data become available.

3.1. spectral components included in the erosita background simulations
3.1.1. the photon x-ray background

the photon x-ray background has been modeled based on the assumptions made by lumb et al. 2011. the photon x-ray background originates from two major components, the optically thin diffuse emission and the extragalactic unresolved background emission. the optically thin diffuse emission arises from the local hot bubble, the galactic disk and the galactic halo. the optically thin background emission from these components is modeled with two mekal models (an emission spectrum from hot diffuse gas, cf. [22]) with temperatures of kt₁ = 0.204 kev and kt₂ = 0.074 kev. the corresponding normalizations are n₁ = 7.59 photons kev⁻¹ cm⁻² s⁻¹ sr⁻¹ and n₂ = 116.0 photons kev⁻¹ cm⁻² s⁻¹ sr⁻¹, respectively. the extragalactic unresolved background emission is modeled with a power law model with a photon index γ of 1.42 and a normalization of n = 9.03 photons kev⁻¹ cm⁻² s⁻¹ sr⁻¹.

3.1.2. the high energy particle background

the high energy particle background has been modeled according to c. tenzer et al. [21], based on geant4 simulation studies of the erosita detector background. the high energy particle background, which is not interacting with the mirrors, is modeled with a flat power law model with a photon index γ of 0.0 and a normalization of n = 1151 photons kev⁻¹ s⁻¹ sr⁻¹.

3.2. comparison of the simulated erosita photon and particle background with the observed xmm-newton background

table 1 shows the comparison between the observed xmm-newton count rate for the medium filter (without particle background, cf. table 7 of [23]) and the simulated photon plus particle erosita background count rates. the overall diffuse cosmic photon x-ray emission from the lumb et al. 2011 model is in good agreement with the observations obtained by xmm-newton. at energies above 5 kev the high energy particle background dominates the spectral energy distribution of the simulated erosita background. based on the spectral model parameters described above, the mean erosita background count rate model is shown in fig. 1.

table 1: simulated erosita photon and particle count rates and the comparison with the observed xmm-newton background count rates (all rates in 10⁻³ counts s⁻¹ arcmin⁻²)

  energy band [kev]                        0.2–0.5      0.5–2.0      2.0–4.5      4.5–7.5      7.5–12
  simulated particle count rate            0.028        0.15         0.24         0.29         0.44
  simulated photon and particle rate       1.71         2.19         0.36         0.31         0.44
  observed xmm-newton background           1.13 ± 0.50  2.04 ± 0.94  0.72 ± 0.36  0.64 ± 0.36  0.68 ± 0.48
  (pn, medium filter)

figure 1: mean erosita background spectrum for a fov of 1 arcmin² (counts s⁻¹ kev⁻¹ against energy in kev), showing the photon x-ray background, the particle background and the total background (boller, erosita science book).
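the quoted spectral parameters can be tabulated directly. the sketch below is not an official erosita tool: it evaluates only the two power-law components, since the mekal components require a plasma code (e.g. the mekal model in xspec), and no instrument response is folded in, so the photon and particle components, which carry different units, are kept separate.

```python
import numpy as np

energies = np.logspace(np.log10(0.2), np.log10(10.0), 200)  # keV

# unresolved extragalactic photon background: power law, Gamma = 1.42
extragalactic = 9.03 * energies ** -1.42    # photons keV^-1 cm^-2 s^-1 sr^-1

# high-energy particle background: flat power law, Gamma = 0.0
particle = 1151.0 * np.ones_like(energies)  # photons keV^-1 s^-1 sr^-1

# the two thermal components (mekal, kT = 0.204 keV and 0.074 keV,
# norms 7.59 and 116.0) dominate below ~1 keV and are omitted here
```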
4. erosita optical follow-up spectroscopy with 4most

optical spectroscopy has been, and will for the foreseeable future be, one of the main tools of astrophysics, allowing studies of a large variety of astronomical objects over many fields of research. to fully capitalize on the erosita potential, a dedicated spectroscopic follow-up program is needed. 4most is the ideal instrument to secure the scientific success of the x-ray surveys and to overcome the small sample sizes, together with the selection biases, that plagued past samples. the aim is to have the instrument commissioned in 2017, well matched to the data releases of erosita and gaia. here i shortly describe the design and implementation of a facility simulator aimed to optimize the science output for erosita. the 4most facility simulator (4fs) has several roles: firstly, to optimize the design of the instrument; secondly, to devise a survey strategy for the wide-field design reference surveys that are proposed for 4most; and thirdly, to verify that 4most, as designed, can indeed achieve its primary science goals. in order to optimize the science output from 4most, a dedicated simulation software package, the 4most facility simulator (4fs), has been developed [24]. the 4fs not only optimizes the science output from the 7 design reference surveys (drs), two of which, the agn and the cluster drs, are dedicated to erosita follow-up, but it is also used to understand and optimize the instrument behavior. the 4fs consists of three major components, the operations simulator (opsim), the throughput simulator (ts), and the data quality control tools (dqct), each of which has specified tasks. the opsim component is located at mpe, garching, the ts at gepi, paris, and the dqct at ioa, cambridge. data flow between the major components is carried out over the internet through an rsync directory system located at mpe, garching. the 4fs accepts input data in the format of mock catalogues of targets, together with template spectra of the targets. each mock catalogue represents one drs. the operation of the 4fs is controlled through parameter files issued by the systems engineer, which define the set-up of each system that is to be simulated. figure 2 illustrates, in a graphical way, the summary statistics internally generated by the 4fs opsim for each simulation that has been carried out. the plots show the fraction of input objects that have been assigned a fiber in at least one tile of the simulated survey. note that this is not the same measure as the fraction of input objects that were 'successfully' observed reported by the 4fs dqct. however, the allocated fractions reported here do provide strong upper bounds on the successfully observed fractions. the survey footprint is defined as the locus of all points on the sky that lie within the hexagonal bounds of at least one field in which at least one tile was executed. due to the fov shape, the survey footprint has slightly 'ragged' northern and southern edges. therefore, some points of the footprint extend slightly outside the nominal declination limits of the survey. the plots illustrate the relative ability of each combination of telescope, positioner, fov, number of fibers, and high/low resolution fiber pattern that has been simulated. the red bars show the fraction of objects within the survey footprint that were allocated a fiber in at least one tile. the blue bars show the fraction of all input objects (within the declination limits −70° < dec < +20°) that were allocated a fiber in at least one tile.
the total number of objects within the −70° < dec < +20° range is given in the top right-hand corner of the plot. the red and blue bars should be read off against the left-hand y-axis scale. the green bars show the number of objects that were allocated a fiber in each simulation, with a scale that should be read off the right-hand y-axis. note that the blue bar is simply the number of sources allocated fibers (indicated by the green bar) divided by the total number of objects, indicated in the top right-hand corner of each plot. the allocation fraction for the erosita agn mock catalogue is best for large fovs for ntt and vista and for both types of positioner concepts. a complete description of the allocation fraction for the other drs is given in [24].

figure 2: fraction of input objects that have been assigned a fiber in at least one tile of the simulated agn survey [24]. 14 individual combinations of a telescope selection, ntt or vista, and fiber positioner concepts, the potsdam positioner (potzpos) and the munich positioner (mupos), have been simulated. other parameters that have been varied are the fov of the telescope, the number of high and low resolution fibers, and the ratio between high and low resolution fibers. see [24] for a detailed description of the 4fs simulations for the telescope trade-off selection.

acknowledgements

tb is grateful to peter predehl, the pi of erosita, for intensive discussions in preparation of the erosita talk given at the 2012 vulcano workshop.

references

[1] merloni, a., predehl, p., becker, w., böhringer, h., boller, th. et al., astro-ph, arxiv:1209.3114, 2012
[2] nandra, k., pounds, k. a., stewart, g. c. et al., mnras 236, 39, 1989
[3] pounds, k. a., nandra, k., stewart, g. c. et al., nature 344, 132, 1990
[4] nandra, k., pounds, k. a., mnras 268, 405, 1994
[5] tanaka, y., nandra, k., fabian, a. c. et al., nature 375, 659, 1995
[6] nandra, k., o'neill, p. m., george, i. m. et al., mnras 382, 194, 2007
[7] boller, th., in: exploring fundamental issues in nuclear physics: nuclear clusters – superheavy, superneutronic, superstrange, of anti-matter, ed. debades bandyopadhyay (saha institute of nuclear physics, india), world scientific publishing co. pte. ltd., isbn 978-981-4355-76-6, pp. 44–52, 2012a
[8] boller, th., soft x-ray reflection and strong and weak field limit determination in narrow-line seyfert 1 galaxies, in: exploring fundamental issues in nuclear physics, proceedings of the symposium on advances in nuclear physics in our time, ed. debades bandyopadhyay, world scientific publishing co. pte. ltd., isbn 978-981-4355-72-8, 2012
[9] nandra, k., athena: the advanced telescope for high energy astrophysics, in: the x-ray universe 2011, presentations of the conference held in berlin, germany, 27–30 june 2011, available online at http://xmm.esac.esa.int/external/, article id. 022, 2011
[10] balestra, i., tozzi, p., ettori, s. et al., a&a 462, 429, 2007
[11] kollmeier, j. a., onken, c. a., kochanek, c. s. et al., apj 648, 128, 2006
[12] boller, th., in: growing black holes: accretion in a cosmological context, proceedings of the mpa/eso/mpe/usm joint astronomy conference, garching, germany, 21–25 june 2004, eds. a. merloni, s. nayakshin, r. a. sunyaev, eso astrophysics symposia, berlin: springer, isbn 978-3-540-25275-7, pp. 170–174, 2005
[13] tanaka, y., boller, th., gallo, l., in: growing black holes: accretion in a cosmological context, proceedings of the mpa/eso/mpe/usm joint astronomy conference, garching, germany, 21–25 june 2004, eds. a. merloni, s. nayakshin, r. a. sunyaev, eso astrophysics symposia, berlin: springer, isbn 978-3-540-25275-7, pp. 290–295, 2005
[14] osterbrock, d. e., pogge, r. w., apj 297, 166, 1985
[15] miniutti, g., fabian, a. c., mnras 349, 1435, 2004
[16] fabian, a. c. et al., nature 459, 540, 2009
[17] boller, th., fabian, a. c., sunyaev, r. et al., mnras 329, l1, 2002
[18] zoghbi, a., fabian, a. c., uttley, p., miniutti, g. et al., mnras 401, 2419, 2010
[19] boller, th., tanaka, y., fabian, a. et al., mnras 343, l89, 2003
[20] miller, l., turner, t. j., reeves, j. n. et al., mnras 408, 1928, 2010
[21] tenzer, ch., warth, g., kendziorra, e. et al., proceedings of spie, 7742, 2010
[22] mewe, r., gronenschild, e. h. b. m., van den oord, g. h. j., a&a 62, 197, 1985
[23] read, a. m., ponman, t. j., a&a 409, 395, 2003
[24] boller, th., dwelly, t., böhringer, h., proceedings of spie, 2012b, in press

employment, production and consumption with random update: non-equilibrium stationary state equations

hynek lavička (a,b,*), jan novotný (c,d)
(a) czech technical university in prague, faculty of nuclear sciences and physical engineering, department of physics, břehová 7, cz-115 19 praha 1, czech republic
(b) bogolyubov laboratory of theoretical physics, joint institute of nuclear research, 141980 dubna, russia
(c) centre for econometric analysis, faculty of finance, cass business school, city university london, 106 bunhill row, london, ec1y 8tz, united kingdom
(d) cerge-ei, charles university, politickych veznu 936/7, 11000 prague 1-new town, czech republic
(*) corresponding author: lavicka@fjfi.cvut.cz

acta polytechnica 53(6):847–853, 2013, doi:10.14311/ap.2013.53.0847

abstract. in this work, we investigate the model of employment, production and consumption, as introduced in a series of papers by i. wright [1–3], from the perspective of statistical physics, and we focus on the presence of equilibrium. the model itself belongs to the class of multi-agent computational models, which aim to explain macro-economic behavior using explicit micro-economic interactions. based on the mean-field approximation, we form the fokker-planck equation(s) and then formulate the conditions forming the stationary solution, which results in a system of non-linear integral-differential equations. this approximation then allows the presence of non-equilibrium stationary states, where the model is a mixed additive-multiplicative model.

keywords: multi-agent based model, ness, equilibrium, langevin equations.

1. introduction

in recent years, the emergence of cheap computational power and progress in big data science have resulted in the appearance of multi-agent computational models in the economic and social sciences, see [4]. these models aim to model the macroscopic properties of a socioeconomic system using explicit modeling of the microscopic interactions. such interactions are mimicked explicitly as a large-scale experiment, where thousands of agents are equipped with a set of rules and their mutual interaction is explicitly modeled using monte carlo procedures, see [5] for a number of explicit examples.
large-scale multi-agent computational models are, by their nature, black-box computational models, with no direct relation between the variables of interest. this is the main critique in comparison with mainstream economic tools, which rather aim to find an explicit and analytically tractable link between the variables. by contrast, multi-agent computational models have to be perceived as complex systems, see [5–7], and this requires a specific set of tools. in particular, there is a question about the existence of "equilibrium", which is a key component of many economic models. the contribution of our paper is two-fold. first, the paper aims to establish a serious discussion about investigating the presence and the type of equilibrium in complex socioeconomic systems. to the best of the authors' knowledge, such a discussion is missing in the literature, and thus there is no bridge between standard economic models and complex multi-agent based models. second, we illustrate our point with the model of employment, production and consumption employed in [1–3, 8–13], which mimics the economic activity in a society. for this model, we utilize the mean-field approximation, which illustrates the presence of non-equilibrium stationary states and contradicts the conventional economic intuition. an answer to the question about the qualitative properties of the states that the economy reaches precedes any attempt to find quantitative solutions. the paper is organized as follows. in section 2, we define the model of employment, production and consumption. in section 3, we employ the mean-field approximation and derive the qualitative properties of the solution. section 4 provides conclusions.

2. definition of the model

the model of employment, production and consumption (epc, henceforth) was originally proposed in a series of papers by ian wright [1–3]. the model is based on the social stratification of society according to an individual's holding and utilization of capital, as a means of production and wealth. the model assumes and explicitly models two distinct types of agents: citizens and companies. the number of citizens is held constant in the model, while the number of companies varies dynamically and endogenously over time. the model itself represents a grand-canonical ensemble with respect to the agents in the economy, as the number of companies varies endogenously over time. economic activity in the economy is undertaken by companies, which are the only agents who can compete on the market. companies, however, are endogenously created by citizens and cannot operate without them. thus, every citizen in the economy is characterized by her status with respect to companies: she can at any time be either unemployed, employed or the owner of a company. both citizens and companies are characterized by their holding of capital, serving the role of wealth. the model is specified through the properties of the individual agents, and the macroscopic properties of the system, e.g. the rate of employment or the distribution of capital among the different groups of citizens, are the result of the individual interactions among agents. it is worth noting that the model can be highly non-linear with respect to the parameters: a change in the exogenous conditions can produce a non-linear response, as has been illustrated by [8].
the definition of the model is in the form of langevin equations with discrete time steps, with a multi-step procedure driving the dynamics. the economic activity mimics a procedure in which citizens demand goods to consume, which is in turn satisfied by the economic activity of companies, which supply the goods to consume. the concept of goods is implicit, though, and all the economic transactions are proxied in terms of capital. each micro-state of the system is described by the parameters of each of the n citizens and the volume of demanded goods d. the state of the corporate sector can be deduced from the knowledge of the citizens. in particular, a citizen i carries three variables $\{m_i, e_i, \eta_i\}$, where $m_i$ stands for the total amount of money owned, $e_i$ specifies one of the three states with respect to the corporate sector, and, finally, $\eta_i$ is a wage expectation. the state of the system at time t is thus described by $S_t \equiv \big\{\{m_i, e_i, \eta_i\}_{i\in\{1,\dots,N\}}, D\big\}$. in particular, if $e_i$ points to the same agent, it denotes an owner of a company. if it points toward another agent, she is an employee of the other agent. and if it is empty, she is unemployed. the commercial cycle which propels the dynamics consists of four turns:

• hiring turn;
• demand turn;
• revenue turn;
• wage turn;

where during each turn a certain part of the system is updated. passing through all sub-steps evolves the system one step ahead in time. as the system passes through the intermediate states, we equip it with the superscripts h, d and r, indicating the result of the hiring turn, the demand turn and the revenue turn, respectively. the end of the wage turn coincides with the end of the entire step. this model was originally defined by wright in [2], following his research in [1, 3, 14, 15]. in this paper, we generalize his definitions using a general notation for the functions $f_A(w)$, $f_D(w)$ and $f_R(d)$ of the individual probability densities employed during the consecutive turns, each of which depends on one parameter, and we also make changes such that the model is a fully random-update model.¹ in the definitions below, we assume that the random variables are mutually uncorrelated, undergoing a certain well-defined probability distribution, if not specified otherwise.

¹ article [2] proposes a model where a wage is paid to all agents at the same time. this, however, represents a multi-body interaction, which introduces an unnecessary degree of complexity. we replace this step by a sequential two-body interaction, which means that the wage is paid sequentially, agent by agent.

2.1. the hiring turn

during the first turn, an unemployed citizen can either become employed or can set up her own venture and become an employer. each unemployed citizen evaluates her attractiveness towards each of the existing employers and the other unemployed citizens (this set is denoted h). attractiveness is a function of the other agent's wealth. each unemployed citizen then chooses a potential employer, indexed by h, with probability

$p(w_h) = \frac{f_A(w_h)}{\sum_{i\in H} f_A(w_i)},$

where $p(w_h)$ serves as a weight to choose a potential employer h. then the unemployed person decides whether she will turn her initial inclination into a contract. the decision is based on the following rule: she draws a random number $\eta^h_c \sim U(\eta_c, 2\eta_c)$, where $\eta_c$ is the agent's wage expectation, and if $\eta^h_c \le w_h$ then she either joins the existing company of agent h or initiates the creation of a new venture by agent h (the creation of a company is induced by the demand for employment). employment increases the future wage expectation, as $\eta_c \leftarrow \eta^h_c$. otherwise, she remains unemployed and her wage expectation for the next turn decreases as $\eta_c \leftarrow U(0, \eta_c)$.
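a minimal monte carlo rendering of the hiring turn may make the update rules concrete. the agent fields and helper names below are ours, and the default weight f_A(w) = w anticipates wright's choice quoted in section 3.5; this is a sketch, not wright's original code.

```python
import random

def hiring_turn(agents, f_a=lambda w: w):
    """one sweep of the hiring turn: every unemployed citizen picks a
    target h among employers and the other unemployed with probability
    f_A(w_h) / sum_{i in H} f_A(w_i), draws a wage demand from
    U(eta, 2*eta) and signs the contract only if the target can pay."""
    for c in agents:
        if c["status"] != "unemployed":
            continue
        pool = [a for a in agents
                if a is not c and a["status"] in ("unemployed", "employer")]
        weights = [f_a(a["w"]) for a in pool]
        if not pool or sum(weights) <= 0.0:
            continue
        h = random.choices(pool, weights=weights)[0]
        demand = random.uniform(c["eta"], 2.0 * c["eta"])
        if demand <= h["w"]:                  # contract signed
            c["status"], c["employer"] = "employee", h
            h["status"] = "employer"          # possibly a new venture
            c["eta"] = demand                 # expectations ratchet up
        else:                                 # stays unemployed
            c["eta"] = random.uniform(0.0, c["eta"])
    return agents
```

in a full simulation this sweep is followed by the demand, revenue and wage turns, sketched after section 2.4 below.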
2.2. the demand turn

in the next turn, citizens consume goods available on the market and thus create demand for them. each citizen c spends a share $d^d_c$ of her endowment on goods. the amount to spend is expressed as $d^d_c \sim f_D(w_c)$, where $f_D(w)$ is some probability distribution and is a function of the agent's wealth. after this spending on goods, she owns $w^d_c = w_c - d^d_c$ and the pool of demanded goods increases as $D^d = D + d^d_c$.

2.3. the revenue turn

then the companies aim to exploit the pool of demanded goods and compete for citizens. each citizen c from the pool of employees and company owners works for her company and supplies a share $d^r_c$ of the demanded goods. the supplied amount of goods is expressed as $d^r_c \sim f_R(D^d)$, where $f_R(d)$ is a probability distribution. consequently, the demand is decreased as $D^r = D^d - d^r_c$. for an employee c, $d^r_c$ is sent to the company owner's budget as $w^r_{e_c} = w^h_{e_c} + d^r_c$, where $e_c$ is a pointer to the appropriate employer. for an employer, on the other hand, the rule reads $w^r_c = w^h_c + d^r_c$.

2.4. the wage turn

finally, each employee is paid for her work for the company. employee c obtains her wage $\eta_c$, which is based on the initial wage expectation that was agreed in the employment contract. if employer $e_c$ has enough resources to provide this wage, the wage is paid and thus $w_c = w^d_c + \eta_c$ and $w_{e_c} = w^d_{e_c} - \eta_c$. if $w_{e_c} < \eta_c$, on the other hand, the employer does not have enough resources, the employee receives what remains, $w_c = w^d_c + w_{e_c}$, and then becomes unemployed. if the employer loses all her employees, she becomes unemployed as well.
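the remaining turns can be sketched in the same style, again with wright's uniform choices f_D(d, w) = 1/w and f_R = 1/D (see section 3.5); the shared state dict is ours, and the rule that an employer who loses all her employees becomes unemployed herself is omitted for brevity.

```python
import random

def demand_turn(agents, state):
    """each citizen spends d ~ U(0, w) on goods (f_D(d, w) = 1/w);
    the pool of demanded goods D grows accordingly."""
    for c in agents:
        d = random.uniform(0.0, c["w"])
        c["w"] -= d
        state["D"] += d

def revenue_turn(agents, state):
    """every employee or employer captures r ~ U(0, D) of the current
    demand pool (f_R = 1/D); an employee's revenue goes to her employer."""
    for c in agents:
        if c["status"] == "unemployed":
            continue
        r = random.uniform(0.0, state["D"])
        state["D"] -= r
        owner = c["employer"] if c["status"] == "employee" else c
        owner["w"] += r

def wage_turn(agents):
    """employers pay the agreed wage eta; if the till runs dry the
    employee takes what is left and becomes unemployed."""
    for c in agents:
        if c["status"] != "employee":
            continue
        e = c["employer"]
        paid = min(c["eta"], e["w"])
        e["w"] -= paid
        c["w"] += paid
        if paid < c["eta"]:
            c["status"], c["employer"] = "unemployed", None
```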
3. derivation of the stationary state equations

the model introduced in the preceding section is a markov process that evolves micro-states of the system described by 4n + 1 variables in total, captured by $S_t \equiv \big\{\{m_i, e_i, \eta_i\}_{i\in\{1,\dots,N\}}, D\big\}$ at time t. the time coordinate can point to any turn of each step; however, we will mainly refer to the time t after all four turns have been completed. each time step, the state of the system evolves following the transition probabilities $P(S_t|S'_t)$, which move the system among the micro-states, using the rule

$P(S_{t+\Delta t}|S_0) = P(S_{t+\Delta t}|S'_t)\cdot P(S'_t|S_0).$

the solution $P(S_t|S_0)$ of the system with the initial micro-state $S_0$ can be very complex, and even impossible to express with a system of analytically solvable equations. therefore, we rather focus on the ground stationary state $P(S) \equiv \lim_{t\to+\infty} P(S_t|S_0)$, which is a solution to the equation

$P(S) = P(S|S')\,P(S').$ (1)

thus the stationary state is an eigenvector with eigenvalue 1 of the transition probabilities, if the equations are linear. our model, in contrast to such classical models as the ising model, asep and brownian motion (see, for example, [16] and [17]), is described by a set of different sub-transition probabilities following each described turn, composing the total transition probability as follows:

$P(S|S') = P_W(S|S^{III})\,P_R(S^{III}|S^{II})\,P_D(S^{II}|S^{I})\,P_H(S^{I}|S'),$ (2)

where $P_H(S^{I}|S')$, $P_D(S^{II}|S^{I})$, $P_R(S^{III}|S^{II})$ and $P_W(S|S^{III})$ stand for the hiring, demand, revenue and wage turns, respectively.

we simplify the problem (1) using a set of ansatzes, which we impose on the stationary solution. each particular ansatz decreases the complexity of the problem, assuming the independence of certain variables. first, we assume that $P(S) \equiv P(\{w_i, e_i, \eta_i\}, D) \simeq P(\{w_i, e_i, \eta_i\})\,P_D(D)$, i.e., independence of the level of demand from the particular state of the society. second, we assume the independence of the wage expectations from the state of the entire system, or $P(\{w_i, e_i, \eta_i\}) \simeq P(\{w_i, e_i\})\,P_\eta(\eta)$. third, we employ the mean-field approximation of employment contracts, and thus assume $P(\{w_i, e_i\}) \simeq P(x, w_x)$. then, the stationary state of the model is identified, using the above set of ansatzes, in the form

$P(S) \equiv P(\{w_i, e_i, \eta_i\}, D) \simeq P_D(D)\,P_\eta(\eta)\,P(x, w_x),$ (3)

where the entire system is described by the variables x and $w_x$, where $x \in \{U, E, C\}$ denotes the status with respect to companies and $w_x$ the accumulated wealth. the state of the economy is then described by 3 variables, d, η and m, which describe the aggregated demand, the wage expectation and the income, respectively. we assume (3) for each variable in each turn of the system in the decomposition (2). during each of the turns of economic activity the system changes, and in each turn we thus equip the variables with a superscript (h, d, or r) denoting the particular turn. we first define the function

$W(w) = \int_w^{+\infty} f_A(w')\,\big(P(U, w') + P(C, w')\big)\,dw'$

and a constant

$C_1 = \frac{\int_0^{+\infty} W(\eta)\,P_\eta(\eta)\,d\eta}{W(0)}.$

3.1. the hiring turn

during the hiring turn, the variables of the system evolve as follows:

$P^h(E, w_E) = P(E, w_E) + P(U, w_E)\,C_1,$ (4)

$P^h(C, w_C) = P(C, w_C) + P(U, w_C)\,\frac{f_A(w_C)\int_0^{w_C} P_\eta(\eta)\,d\eta}{W(0)},$ (5)

$P^h(U, w_U) = P(U, w_U) - P(U, w_U)\,C_1 - P(U, w_U)\,\frac{f_A(w_U)\int_0^{w_U} P_\eta(\eta)\,d\eta}{W(0)},$ (6)

$P^h_\eta(\eta) = C_1\int_{\eta/2}^{\eta} \frac{P_\eta(\eta')}{\eta'}\,d\eta' + (1 - C_1)\int_{\eta}^{+\infty} \frac{P_\eta(\eta')}{\eta'}\,d\eta'.$ (7)

3.2. the demand turn

consequently, during the demand turn, the following update is performed:

$P^d(x, w_x) = \int_0^{+\infty} P^h(x, w_x + d)\,f_D(d, w_x + d)\,dd,$ (8)

$P^d_D(D) = \int_0^{D}\!\iiint_{\mathbb{R}^3_+} \prod_{x\in\{U,E,C\}} f_D(w^d_x, w_x)\,P^h(x, w_x)\,\delta\Big(\sum_{x\in\{U,E,C\}} w^d_x - d\Big)\,dw_x\; P_D(D - d)\,dd,$ (9)

where the delta function $\delta(x - x_0)$ in (9) effectively decreases the dimension of integration by 1, and $\mathbb{R}_+$ is the set of positive real numbers.

3.3. the revenue turn

during the revenue turn, the system changes as

$P^r(C, w_C) = C_2\int_0^{w_C} P^d(C, w' - d')\int_0^{+\infty} P^d_D(d')\,f_R(m', d')\,dd'\,dw' + (1 - C_2)\,P^d(C, w_C),$ (10)

$P^r_D(D) = C_2\int_0^{+\infty} P^d_D(d' + D)\,f_R(d', d' + D)\,dd' + (1 - C_2)\,P^d_D(D),$ (11)

where $C_2 = \int_0^{+\infty}\big(P(E, w) + P(C, w)\big)\,dw$.

3.4. the wage turn

finally, the wage turn evolves according to the equations

$P(E, w_E) = -P^d(E, w_E)\int_0^{+\infty}\!\!\int_0^{\eta} P^h_\eta(\eta)\,P^r(C, w')\,dw'\,d\eta + \int_0^{w_E} P^d(E, w_E - w')\int_0^{+\infty} P^h_\eta(w')\,P^r(C, w'' + w')\,dw''\,dw',$ (12)

$P(C, w_C) = P^d(C, w_C)\left(1 - \frac{\int_0^{+\infty} P(C, w')\,dw'}{\int_0^{+\infty} P(E, w')\,dw'}\int_0^{+\infty}\!\!\int_0^{\eta} P^h_\eta(\eta)\,P^r(C, w')\,dw'\,d\eta\right) + \int_0^{+\infty} P^h_\eta(w')\,P^r(C, w_C + w')\,dw',$ (13)

$P(U, w_U) = P^d(U, w_U) + P^d(E, w_U)\int_0^{+\infty}\!\!\int_0^{\eta} P^h_\eta(\eta)\,P^r(C, w')\,dw'\,d\eta + P^d(C, w_U)\,\frac{\int_0^{+\infty} P(C, w')\,dw'}{\int_0^{+\infty} P(E, w')\,dw'}\int_0^{+\infty}\!\!\int_0^{\eta} P^h_\eta(\eta)\,P^r(C, w')\,dw'\,d\eta,$ (14)

$P_\eta(\eta) = \int_0^{+\infty} P^h_\eta(\eta + w')\,\frac{P^d(C, w')}{\int_0^{+\infty} P^d(C, w'')\,dw''}\,dw' + P^h_\eta(\eta)\left(1 - \frac{\int_0^{\eta} P^d(C, w')\,dw'}{\int_0^{+\infty} P^d(C, w'')\,dw''}\right),$ (15)

where we have omitted the superscripts, as the result coincides with the final stage of the economy.

3.5. discussion of the properties of the solution

the system of equations (4)–(15) describes the ground stationary state of a non-equilibrium macro-economic model (the transition probabilities for the macro-state variables) based on the micro-economic characteristics of individual agents. the model is a fokker-planck equation using the mean-field approximation. the equations constitute a closed non-linear system.
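as a small building block for any numerical treatment of the system above, the function W(w) and the constant C_1 can be evaluated on a grid by trapezoidal quadrature; a sketch (ours) under the assumption that the densities are tabulated on a common grid whose last point truncates the +∞ limits:

```python
import numpy as np

def W_and_C1(w, P_U, P_C, P_eta, f_A=lambda w: w):
    """evaluate W(w) = int_w^inf f_A(w') (P(U,w') + P(C,w')) dw' and
    C_1 = int_0^inf W(eta) P_eta(eta) d eta / W(0) for densities
    tabulated on the grid w (the upper limit is truncated at w[-1])."""
    g = f_A(w) * (P_U + P_C)
    chunks = 0.5 * (g[1:] + g[:-1]) * np.diff(w)       # panel integrals
    W = np.concatenate([np.cumsum(chunks[::-1])[::-1], [0.0]])
    C1 = np.trapz(W * P_eta, w) / W[0]
    return W, C1

# exponential trial densities as a smoke test: C1 must lie in (0, 1)
w = np.linspace(0.0, 50.0, 2001)
expo = np.exp(-w)
W, C1 = W_and_C1(w, 0.4 * expo, 0.1 * expo, expo)
print(C1)  # ~ 0.75 for these trial densities
```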
the system of equations is formulated in general terms, and it may then be simplified for a certain class of microscopic characteristics captured by $f_a(w)$, $f_r(d)$ and $f_d(d, w)$. in the case of wright's epc model, see [2], the mean-field approximation can be obtained using $f_a(w) = w$, $f_r(m, d) = 1/d$ and $f_d(d, w) = 1/w$. let us note that the system of equations (4)–(15) is the simplest formulation that exhibits the micro-structure of particular economic clusters of citizens and their mutual interactions. generally, more simplifications can be made at the expense of reducing the accuracy of the model and its connection to microscopic properties.

to calculate the "excited stationary states" of the system, we have to modify the system of equations (4)–(15) with a prefactor constant $\lambda$ on the left-hand side of equations (4)–(7). the prefactor must be determined prior to the solution of the system. the excited states are eigenvectors (in the linear case) of a composite operator $\{p^{(n)}(x, w_x), p_d^{(n)}(d), p_\eta^{(n)}(\eta)\}$ associated with eigenvalues $\lambda_n$, and the system of equations (4)–(15) then becomes a matrix equation. the eigenvalues are the roots of the characteristic polynomial $p(\lambda) = 0$ associated with such a matrix system. the uniqueness of the stationary state is an open question, but non-uniqueness would allow the existence of different phases of the system with systematically different states of the economy, where a change of the stationary state would require the action of an exogenous field; in the economic case, a policy action by a government or by a policy-making body.

finally, in the case of non-equilibrium stationary states there exists a non-zero expected macroscopic current (or currents), while equilibrium states do not allow any non-zero macroscopic current. in the context of our model, such a current can be represented by the flow of capital or by the structure of the society with respect to the corporate sector. to illustrate the existence of such currents, let us consider the hiring turn, where the number of employees, i.e. the size of the corporate sector, increases due to random fluctuations in the model. there is therefore a directed flow of agents into employment or corporate ownership or, equivalently, an outflow of agents from unemployment. during the wage turn, on the other hand, we observe a similar flow but with the opposite sign. on the scale of the entire commercial cycle, however, we observe an equilibrium. this follows from the fact that there are no exogenous driving forces, active boundaries or external reservoirs. in the case of an economic system, such exogenous factors can be illustrated by the support of a particular industry or subsidies to newly set up ventures (driving forces), by a minimum wage and a pension system (active boundaries), and by trade with foreign economies with an active international trade balance sheet (external reservoirs). the model presented here is therefore clearly of the non-equilibrium stationary state type.
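returning to the eigen-structure discussed above, the idea can be illustrated on a toy finite-state chain: the ground stationary state is the eigenvalue-1 eigenvector of a (here linear, column-stochastic) transition matrix, found by power iteration, while the sub-dominant eigenvalues play the role of the "excited" modes. the 3×3 matrix is purely illustrative and has nothing to do with the actual epc operator.

```python
import numpy as np

T = np.array([[0.90, 0.15, 0.05],    # column-stochastic transition matrix p(s|s')
              [0.08, 0.80, 0.15],
              [0.02, 0.05, 0.80]])

p = np.ones(3) / 3                   # arbitrary initial macro-state distribution
for _ in range(500):
    p = T @ p                        # iterate until p = T p, i.e. eq. (1) holds
print('stationary state:', np.round(p, 4))

lam = np.linalg.eigvals(T)
print('eigenvalue moduli:', np.round(np.sort(np.abs(lam))[::-1], 4))   # leading one is 1
```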
4. conclusions and discussion
in this paper, we have discussed the characteristics of the model of employment, production and consumption presented in [2], which mimics the economic activity of a society using a multi-agent computational model. we have provided a novel derivation using mean-field theory, and we have obtained a set of non-equilibrium ground stationary state equations for the model, initially described by fokker-planck equations.

in contrast to simple models like one-step brownian motion, see [17], various versions of the ssep and asep models, see [16] and references therein, or various examples of kinetics, see [18], we have derived a set of equations that drives the mean-field approximation and is of non-linear integro-differential type. the set of identified dynamic equations is too complex for a full analytic treatment to be feasible. the general complexity lies in the conditional interactions, which produce non-linear terms; another source of complexity lies in the combination of multiplicative and additive processes during the commercial cycle, where the wage expectations undergo a multiplicative process (whose mean is governed by extreme events) while the rest undergoes an additive process (with means governed by the most likely events), as elaborated in [19]. our solution can thus be explained from the perspective of [20, 21], while the non-linear multi-dimensional case present in our model is a novelty.

our paper provides an initial discussion of the analytic treatment of the epc multi-agent computational model. the results suggest several extensions. first, the numerical solution of the approximate analytic description can be compared with a direct monte carlo simulation of the system, and the loss we suffer in the mean-field approximation can thus be evaluated. second, the solutions suggest that sudden changes of the currents related to the stationary state are possible due to a change in the exogenous parameters. however, wright's original model does not have any external parameter, in contrast to the parallel formulation of the epc model, see [8]. the formulation of [8] is thus a natural extension of the study presented here. finally, from the thermodynamic perspective, the model resembles the mimkes model of a productive economy, based on the carnot cycle with two reservoirs, see [13]. this further stresses the possible existence of a non-equilibrium stationary state of the economy, which contradicts the recent economic paradigm based on equilibrium concepts. all these extensions are left for future research.

in conclusion, the analyzed model is in equilibrium on the large scale; however, it is composed of non-equilibrium intermediate states. multi-agent computational models are, in turn, very likely to exhibit such non-equilibrium behavior in the intermediate states. the dynamics of multi-agent computational models is therefore more general than the dynamics of models based on equilibrium assumptions in every intermediate step, and thus allows more complex socio-economic behavior to be mimicked.

acknowledgements
we would like to express our thanks for material and financial support from gacr grant no. p402/12/2255, and from grant rvo68407700 of the czech ministry of education in support of the development of the sunrise cluster, where some of the calculations were executed. jn acknowledges funding from the european community's seventh framework program fp7-people-2011-ief under grant agreement number pief-ga-2011-302098 (price jump dynamics).

references
[1] i. wright. the social structure of capitalism. physica a 346:589–620, 2005.
[2] i. wright. implicit microfoundations for macroeconomics. economics: the open-access, open-assessment e-journal 3(2009-19), 2009.
[3] i. wright. the emergence of the law of value in a dynamic simple commodity economy. review of political economy 20:367–391, 2008.
[4] d. helbing.
economics 2.0: the natural step towards a self-regulating, participatory market society, 2013. preprint, arxiv:1305.4078.
[5] l. tesfatsion, k. l. judd. handbook of computational economics, vol. 2: agent-based computational economics (1st ed.). amsterdam, nl: elsevier, 2006.
[6] h. lavička. simulations of agents on social network. lap lambert academic publishing, 2010.
[7] p. w. anderson, k. j. arrow, d. pines. the economy as an evolving complex system. addison-wesley, reading, ma, 1988.
[8] h. lavička, l. lin, j. novotný. employment, production and consumption model: patterns of phase transition. physica a 389:1708–1720, 2010.
[9] h. dawid, s. gemkow, p. harting, et al. the eurace@unibi model: an agent-based macroeconomic model for economic policy analysis. working paper, universität bielefeld, 2012.
[10] g. dosi, g. fagiolo, a. roventini. schumpeter meeting keynes: a policy-friendly model of endogenous growth and business cycles. journal of economic dynamics & control 24:1748–1767, 2010.
[11] t. assenza, g. d. delli. e pluribus unum: macroeconomic modelling for multi-agent economies. gredeg working papers, gredeg cnrs, university of nice-sophia antipolis, 2012.
[12] l. lin. some extensions to the social architecture model. in j. wells, e. sheppard, i. wright (eds.), proceedings of probabilistic political economy: law of chaos in the 21st century. kingston university, 2008.
[13] j. mimkes. the complex networks of economic interactions, vol. 567, chap. concepts of thermodynamics in economic growth, pp. 139–152. springer berlin heidelberg, 2006.
[14] i. wright. a conjecture on the distribution of firm profit. economía: teoría y práctica 20, 2004.
[15] i. wright. the duration of recessions follows an exponential not a power law. physica a 345:608–610, 2004.
[16] b. derrida. non-equilibrium steady states: fluctuations and large deviations of the density and of the current. j stat mech p07023, 2007.
[17] a. einstein. über die von der molekularkinetischen theorie der wärme geforderte bewegung von in ruhenden flüssigkeiten suspendierten teilchen. annalen der physik 17:549–560, 1905.
[18] p. krapivsky, s. redner, e. ben-naim. a kinetic view of statistical physics. cambridge university press, 2010.
[19] s. redner. random multiplicative processes: an elementary tutorial. am j phys 58(3):267–273, 1990.
[20] d. sornette, r. cont. convergent multiplicative processes repelled from zero: power law and truncated power laws. j phys i france 7:431–444, 1997.
[21] d. sornette. multiplicative processes and power laws. phys rev e 57:4811–4813, 1998.
acta polytechnica 53(6):847–853, 2013

acta polytechnica
doi:10.14311/ap.2014.54.0221
acta polytechnica 54(3):221–224, 2014
© czech technical university in prague, 2014
available online at http://ojs.cvut.cz/ojs/index.php/ap

optimizing the welding of plastics with the use of differential scanning calorimetry and thermogravimetric analysis

michal ondruška (a, corresponding author: michal.ondruska@stuba.sk), marián drienovský (b), roman čička (c), milan marônek (a), antonín náplava (b)
a slovak university of technology in bratislava, faculty of materials science and technology in trnava, department of welding and foundry, j. bottu 25, 917 24 trnava, slovakia
b slovak university of technology in bratislava, faculty of materials science and technology in trnava, department of materials engineering, j. bottu 25, 917 24 trnava, slovakia
c slovak university of technology in bratislava, faculty of materials science and technology in trnava, department of physics, j. bottu 25, 917 24 trnava, slovakia

abstract. plastics have different thermal stability, depending on the structure of the polymer chains. it is therefore very important to know their thermal properties, which influence the temperature regime of the processing equipment. this paper presents examples of differential scanning calorimetry (dsc) and thermogravimetry (tg) analyses of selected plastics in an ar protective atmosphere and also in an oxidative atmosphere of static air. these analyses can be used for proposing a welding temperature regime.
keywords: plastics; sep polymers; welding; differential scanning calorimetry; thermogravimetric analysis.

1. introduction
plastics offer many distinct advantages, e.g., light weight, good thermal and electrical insulation properties, corrosion resistance, chemical inertness, high strength and dimensional stability, absorption of mechanical shocks, good dyeability, potential for decorative surface effects, and low production costs [1]. future progress in plastics technology may depend mainly on the same factors that set up the recent fast growth. some of the main factors include [2]:
• improved understanding of the characteristics of plastics, especially of their long-term behaviour and their behaviour under combined stresses (that is, under combined mechanical, thermal, and chemical effects).
• development and utilization of new materials and combinations of materials, especially in reinforced plastics or composites.
• steady reduction in material costs relative to competing materials, taking advantage of low energy requirements for processing and of economies of scale.
• invention and commercialization of new processes.
• continued improvement in quality, in part due to further automation and in-line measurement and control.
• advances in recycling technology to reduce the environmental consequences of wider use of non-degradable materials.
over the past few years, thermoplastic polymers have been progressively replacing metals in the automotive and aerospace industries, medicine, packaging, electronics, construction, etc. welding increases the versatility and applicability of thermoplastic polymers. welding techniques for thermoplastics are divided into three categories [3].
thermal methods are based on conducting heat to the welding surface. a heat source is placed between the joining surfaces. when the surfaces are melted, the heat source is removed. the surfaces are put into contact under pressure until the weld solidifies [4]. these methods include:
(1.) heated tool welding;
(2.) hot gas welding;
(3.) extrusion welding;
(4.) infrared welding;
(5.) laser beam welding.
thermal welding methods have a number of advantages and disadvantages. heated tool welding and hot gas welding are simple and economic, but they place high demands on the skills of the operator. extrusion welding has shorter processing times than heated tool and hot gas welding. the advantages of infrared welding include a short processing time, the absence of contact with the welding surfaces, and suitability for high-temperature thermoplastics. on the other hand, it is a two-step process, and some kinds of thermoplastics absorb infrared radiation. laser beam welding is based on the absorption of radiation. some pigments absorb laser radiation, which may influence the colour of the thermoplastics being welded [3].
friction methods involve rubbing the parts together under a pressure force. these methods include:
(1.) vibration welding;
(2.) ultrasonic welding;
(3.) spin welding;
(4.) stir welding.
vibration welding is suitable for parts of relatively high stiffness, and offers the advantages of robust machinery, simple processing without accessories, and minimal polymer degradation [5]. ultrasonic welding is suitable for joining parts of low strength. it entails higher-frequency vibration (up to 40 khz) and requires better surface preparation than vibration welding [5]. spin welding is applicable only to symmetrical, circular cross-section components. friction stir welding (fsw) is technically based on the heat produced by surface friction, which causes intense motion in the molecular structure of the material to be welded. however, an exit hole remains when the tool is removed [3].
electromagnetic methods include:
(1.) resistance welding;
(2.) microwave welding;
(3.) induction welding;
(4.) radiofrequency welding.
resistance welding has shown performance and cost benefits over other joining techniques, and is being used in current applications [6]. microwave welding has several unique benefits, for example a short processing time and usability for complex geometries. the risk of material degradation for polymers with polar groups is its main disadvantage [3, 4]. induction welding has a short process time and is usable for complex geometries, but the machinery is quite expensive [3]. radiofrequency welding is the most commonly used. it is a relatively fast process, with typical cycle times from less than 2 seconds to 5 seconds. no special joint designs are required [4].
during production, processing and application, plastics are often subjected to temperature-dependent structural changes. thermal analysis for the characterization of plastics is widely practiced in research and industry today [1]. there are three basic thermal analysis techniques for polymer analysis: differential scanning calorimetry (dsc), thermogravimetric analysis (tg) and thermomechanical analysis (tma) [2]. dsc instruments are widely used for the thermal characterization of plastics [7]. dsc is a technique that measures the heat flow into or out of a material as a function of time or temperature.
the required sample size is relatively small, and very little sample preparation is required. this leads to a fast analysis time. in a dsc measurement, information about the thermal and mechanical history of a sample (processing influences, crystallinity and curing, service temperature) is revealed by the first heating curve. for a forensic comparison of chemically similar samples, the thermal history plays an important role, because subsequent controlled cooling creates a "new" known specimen history, which gives the same characteristic properties to all materials [8]. the advantage of dsc in comparison with other calorimetric techniques lies in the broad dynamic range of heating and cooling rates, including isothermal and temperature-modulated operation [9]. dsc is a useful tool for characterizing thermoplastics by determining the glass transition temperatures. this technique is especially useful in characterizing copolymers and blends, where this information may be directly applied to determining the formulation changes required to improve the physical properties [10, 11].

2. method
differential scanning calorimetry (dsc) and thermogravimetry (tg) measurements were employed in this study. both measurements were performed on a netzsch sta 409cd simultaneous thermal analysis apparatus (fig. 1). this instrument is able to record both measured values (dsc and tg) at the same time. the samples were tested under non-isothermal conditions at the same scanning rate of 10 k/min in two steps: heating and cooling. the temperature range of the measurements was from room temperature to 350 °c. the specimens, i.e., high density polyethylene (hdpe) and polyamide 66 (pa66), were measured in an inert atmosphere and also in an oxidizing atmosphere. the first measurements were conducted in a protective atmosphere of pure ar (99.9999 vol.%). the furnace was evacuated and purged with ar before the measurements. the gas flow of ar during the measurements was 60 ml/min. the measurements in an oxidizing atmosphere were conducted in static air, in a furnace without active gas flow. the samples were loaded onto an aluminium pan covered by an aluminium lid. the weight of the samples varied between 3.9 mg and 8.7 mg. thermal properties such as the glass transition temperature (tg), the melting temperature (tm) and the solidification temperature (ts) were measured. the thermal stability of the samples was investigated with tg.

figure 1: netzsch sta 409cd, simultaneous thermal analysis apparatus.

3. results
the dsc-tg results for the hdpe sample (4.5 mg) tested in an ar protective atmosphere are shown in fig. 2. the weight of the hdpe sample for the dsc-tg measurement in the static air atmosphere was 3.8 mg; the dsc-tg record (heating and cooling) is shown in fig. 3. the axes again show tg (left) and dsc (right) versus the time coordinate. fig. 4 shows the simultaneous dsc-tg measurement of the pa-66 sample (4.6 mg) performed under an ar protective atmosphere. dsc-tg measurements of the same pa-66 material, but in an air atmosphere, are shown in fig. 5. the mass of this sample was 2.0 mg.

figure 2: dsc-tg record of hdpe (4.5 mg) measured in argon.
figure 3: dsc-tg record of hdpe (3.8 mg) measured in air.
figure 4: dsc-tg record of pa-66 (4.6 mg) measured in argon.
figure 5: dsc-tg record of pa-66 (2.0 mg) measured in air.
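as an aside on how numbers such as the melting enthalpies quoted in the discussion below are extracted from records like fig. 2, the following sketch subtracts a baseline under an endothermic peak and integrates the excess heat flow over time. the gaussian "peak", the baseline drift and all constants are synthetic stand-ins for a measured curve, not netzsch data.

```python
import numpy as np

rate = 10.0 / 60.0                          # heating rate: 10 k/min in k/s
T = np.linspace(100.0, 160.0, 3601)         # sample temperature [degrees c]
t = (T - T[0]) / rate                       # corresponding time axis [s]

baseline = 0.10 + 0.001 * (T - T[0])        # slowly drifting instrument baseline [mw]
peak = 9.0 * np.exp(-((T - 131.0) / 4.0) ** 2)   # synthetic endothermic melting peak [mw]
heat_flow = baseline + peak

# integrate (signal - baseline) over time between two points bracketing the peak;
# here the baseline is known exactly, in practice it is interpolated linearly
i0, i1 = np.searchsorted(T, [115.0, 150.0])
excess = heat_flow[i0:i1] - baseline[i0:i1]                             # [mw]
enthalpy = np.sum((excess[:-1] + excess[1:]) / 2 * np.diff(t[i0:i1]))   # [mj]

mass_mg = 4.5                               # sample mass, as for the hdpe specimen
print('specific melting enthalpy: %.1f j/g' % (enthalpy / mass_mg))     # mj/mg == j/g
```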
4. discussion
the red dsc curve in fig. 2 (hdpe sample) again shows only the endothermic changes in the sample during heating (10 k/min). the sample started to melt at about 100 °c; however, the major component of the sample started to melt at a higher temperature of 123.8 °c, and the sample was completely melted at 137.5 °c. the specific heat required for melting was 153.5 j/g. the melted sample was stable during further heating. during cooling, an exothermic reaction occurred at 115.3 °c. fig. 3 shows that this sample began to melt at 122.5 °c in an air atmosphere. at 259 °c, however, exothermic reactions occurred in at least three steps, related to a mass loss of about 11.3 %. the exothermic reaction representing solidification at 115.1 °c is shown in the dsc cooling curve. this measurement demonstrates that during heating in an air atmosphere both a melting reaction and oxidative degradation of the hdpe sample occurred.

the red curve in fig. 4 represents the dsc signal of the pa-66 sample during heating. two extrapolated onsets, at 236.1 °c and 245.2 °c, can be evaluated in the melting peak. the specific heat consumed by the melting of the sample was 46 j/g. the blue curve represents the dsc signal during cooling. the extrapolated onset of solidification of the sample was shifted to a lower temperature due to undercooling. the thermogravimetric curve (green) shows a negligible mass change during heating. the simultaneous dsc-tg record of the pa-66 sample measured in an air atmosphere is plotted against time (fig. 5). endo- and exothermic peaks related to melting and solidification again appeared during heating (red curve) and cooling (blue curve), respectively. moreover, an exothermic reaction between 300–350 °c was visible in the heating curve. the total mass loss during the whole measurement was up to 8 %.

5. conclusions
this paper has demonstrated that simultaneous tg-dsc measurements can be utilized for determining melting temperatures, and also for determining thermal degradation in a protective atmosphere and thermal oxidation in an air atmosphere during heating. the measured values can determine the optimal temperature parameters for plastic welding. the difference between these two plastics relates to the diversity of their macromolecular structures. the results of the measurements presented here introduce important parameters for the subsequent processing of plastics by welding.

acknowledgements
the authors express their thanks for the financial support provided by the vega grant agency under contract no. 1/0339/11. this paper is an outcome of the project: ce for development and application of advanced diagnostic methods in processing of metallic and non-metallic materials, itms:26220120048, supported by the research & development operational programme, funded by the erdf.

references
[1] netzsch. analyzing and testing: thermal characterization of polymers. www.netzsch-thermal-analysis.com, 2012.
[2] h. belofsky. plastics: product design and process engineering. hanser/gardner publishers, cincinnati, ohio, 1995, 648 pp.
[3] n. amanat, n. l. james, d. r. mckenzie. welding methods for joining thermoplastic polymers for the hermetic enclosure of medical devices. medical engineering & physics, 2010, pp. 690–699. doi: 10.1016/j.medengphy.2010.04.011
[4] d. a. grewell, a. benatar, j. b. park. plastics and composites welding handbook. münchen: carl hanser verlag, 2003, 407 pp.
[5] b. patham, p. h. foss. thermoplastic vibration welding: review of process phenomenology and processing–structure–property interrelationships. polymer engineering & science, 2011, pp.
1–22. doi: 10.1002/pen.21784
[6] m. dube, p. hubert, a. yousefpour, j. denault. resistance welding of thermoplastic composites skin/stringer joints. composites, 2007, pp. 2541–2552. doi: 10.1016/j.compositesa.2007.07.014
[7] g. ehrenstein, g. riedel, p. trawiel. thermal analysis of plastics. theory and practice. hanser/gardner publishers, cincinnati, ohio, 2004, 359 pp.
[8] m. sajwan, s. aggarwal, r. b. singh. forensic characterization of hdpe pipes by dsc. forensic science international, vol. 175, 2007, pp. 130–133. doi: 10.1016/j.forsciint.2007.05.020
[9] c. schick. differential scanning calorimetry (dsc) of semicrystalline polymers. analytical and bioanalytical chemistry, vol. 359, 2009, pp. 1589–1611.
[10] measuring the glass transition of amorphous engineering thermoplastics. ta instruments. www.tainstruments.co.jp, 2012.
[11] p. k. gallagher. handbook of thermal analysis and calorimetry. elsevier, 1998, 1032 pp.

acta polytechnica 54(3):221–224, 2014

a mathematical model of the drying process
a. k. haghi

a convective drying model is proposed which may be used to describe the drying behavior of leather. using this model, the calculated transient leather temperature agrees well with experimental values. variations in the temperature and moisture content distribution are solved using the finite-difference method. the effects of operating parameters, such as the temperature and humidity in the dryer, the initial moisture content of the leather, and the heat and mass transfer coefficients, are examined using the model.
keywords: convective drying model, leather, transient temperature field.

1 introduction
among the many processes that are performed in the leather industry, convective drying has an essential role. leather fabrication has become an important industrial development worldwide, similar to other technologically advanced process industries. some of the unit operations involved in this industry, especially the drying process, are still based on empiricism and tradition, with very little use of scientific principles [1–4]. many researchers have studied the drying process through a variety of mathematical models [5–10]. beard [5] assumed that leather consists of two layers, one dry and one wet. however, his analysis did not describe the details of what was going on inside the leather, and he used two experimental constants to fit his data to the experimental results of the measured temperature variation inside the dryer. in this study, the mathematical model developed by nordon [9] has been modified to determine the transient temperature and moisture concentration distribution of leather in a dryer. the distributions of temperature and moisture concentration are calculated using the finite-difference method. the effects of several operating parameters, such as the dryer temperature, the humidity, and the initial moisture content of the leather, have been examined.

2 mathematical model
the mathematical model derived by nordon is used with small modifications. the resulting differential equations are

$d\,\frac{\partial^2 c_a}{\partial x^2} = \varepsilon\,\frac{\partial c_a}{\partial t} + (1 - \varepsilon)\,\frac{\partial c_f}{\partial t}$ (1)

and

$k\,\frac{\partial^2 T}{\partial x^2} = \rho c_p\,\frac{\partial T}{\partial t} - \lambda\,(1 - \varepsilon)\,\frac{\partial c_f}{\partial t}$. (2)

the boundary conditions for convective heat and mass transfer at the leather surface are

$q = h_e\,(T_e - T)$ (3)

and

$m = h_m\,(c_e - c_a)$. (4)

the driving force determining the rate of mass transfer inside the fabric is the difference between the relative humidities of the air in the pores and of the leather. in this study, the rate of moisture exchange is assumed to be proportional to the relative humidity difference. thus, the rate equation for mass transfer is

$(1 - \varepsilon)\,\frac{\partial c_f}{\partial t} = k\,(y_a - y_f)$. (5)

also, the relative humidity of the air in the pores and the relative humidity of the leather are taken as

$y_a = \frac{c_a\,r\,T}{p_s}$ (6)

and

$y_f = f\!\left(\frac{c_f}{1 - \varepsilon}\right)$, (7)

i.e., the relative humidity of the leather is a (sorption isotherm) function of its moisture content.
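a compact explicit finite-difference sketch of equations (1)–(5) with the convective boundary conditions (3) and (4) is given below. the grid, the time step, the sorption isotherm standing in for $y_f$, the saturation-pressure fit and all material constants are illustrative placeholders rather than fitted leather data, so the printed numbers only indicate the qualitative behaviour.

```python
import numpy as np

nx, L = 21, 1.2e-3                 # nodes over the half thickness [m]
dx = L / (nx - 1)
dt = 0.01                          # explicit time step [s]
D, kc = 1.0e-8, 1.0                # diffusion coeff. [m2/s], rate constant k [1/s]
kT, rho, cp = 0.1, 500.0, 2000.0   # conductivity, density, specific heat (placeholders)
lam, eps = 2.3e6, 0.9              # latent heat [j/kg], porosity
he, hm = 70.0, 0.08                # heat / mass transfer coefficients (table 1)
Te, Ce = 450.0, 0.02               # dryer temperature [k] and moisture [kg/m3]

T = np.full(nx, 293.0)             # temperature across the thickness
Ca = np.full(nx, 0.0086)           # pore-air moisture, chosen so y_a ~ y_f initially
Cf = np.full(nx, 5.0)              # moisture held in the leather (y_f = 0.5, i.e. 50 %rh)

def y_air(Ca, T):                  # eq. (6), with a crude magnus-type fit assumed for ps(T)
    ps = 610.0 * np.exp(17.27 * (T - 273.15) / (T - 35.85))
    return Ca * 461.5 * T / ps

def y_fibre(Cf):                   # stand-in sorption isotherm for eq. (7)
    return np.clip(Cf / (1.0 - eps) / 100.0, 0.0, 1.0)

for step in range(200000):         # 2000 s of drying
    dCf = kc * (y_air(Ca, T) - y_fibre(Cf)) / (1.0 - eps)          # eq. (5)
    lapC = np.zeros(nx); lapT = np.zeros(nx)
    lapC[1:-1] = (Ca[2:] - 2 * Ca[1:-1] + Ca[:-2]) / dx**2
    lapT[1:-1] = (T[2:] - 2 * T[1:-1] + T[:-2]) / dx**2
    Ca += dt * (D * lapC - (1.0 - eps) * dCf) / eps                # eq. (1)
    T += dt * (kT * lapT + lam * (1.0 - eps) * dCf) / (rho * cp)   # eq. (2)
    Cf += dt * dCf
    Ca[0] = (D / dx * Ca[1] + hm * Ce) / (D / dx + hm)   # flux balance, eq. (4)
    T[0] = (kT / dx * T[1] + he * Te) / (kT / dx + he)   # flux balance, eq. (3)
    Ca[-1], T[-1], Cf[-1] = Ca[-2], T[-2], Cf[-2]        # symmetry at the centre

print('surface T = %.1f k, centre moisture = %.3f kg/m3' % (T[0], Cf[-1]))
```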
the rate constant $k$ in equation (5) is an unknown empirical constant, and its effect can be examined. the value of the rate constant was varied from k = 0.1 to k = 10, and the resulting calculated leather surface temperatures are compared in fig. 1 (effect of the rate constant on the leather surface temperature). when the rate constant is small, the evaporation rate is so small that the moisture content decreases very slowly: the surface temperature initially increases rapidly, but this rise later declines. when k is greater than 1.0, however, the effect of the rate constant on the surface temperature distribution is not significant. this indicates that when the rate constant is greater than 1.0, the evaporation rate is high and the drying process is mainly controlled by the moisture diffusion mechanism inside the fabric. we have therefore assumed the rate constant to be 1.0 in the following calculations.

3 model prediction
the temperature and moisture content were calculated using this model. in these calculations, the parameters used for the base condition are shown in table 1.

table 1: values of parameters for the base condition
parameter               unit     value
dryer temperature       k        450.00
heat transfer coeff.    w/m²k    70.00
mass transfer coeff.    m/s      0.08
leather thickness       mm       1.20
porosity                –        0.90
initial moisture        %rh      50.00
drying air moisture     kg/m³    0.02

first, the transient temperatures of the surface and the center of the leather were calculated using the data shown in table 1. from fig. 2 (temperature variation of the surface and center of the leather) we see that the surface and center temperatures increase rapidly in the initial stage, up to the saturation temperature, at which point the moisture in the leather starts to evaporate. from that point, the difference between the surface temperature and the center temperature increases due to the different moisture contents of the surface and the center. in this stage, the leather starts to dry from the surface, and the moisture in the interior is transferred to the leather surface; the moisture content then decreases as the leather dries. thereafter, the surface and center temperatures converge to the external air temperature. the moisture variations of the surface and the center of the leather were also calculated, and are shown in fig. 3 (moisture content of the surface and center of the leather). initially, the surface moisture content decreases rapidly, but this rate later declines, because moisture is transferred to the external air from the leather surface. the center moisture content remains constant for a short time, and then decreases rapidly, because the moisture content difference between the surface and the interior of the leather becomes large. after drying out, both the center and the surface moisture contents converge to the external air moisture content.
the mathematical model is used to predict the effects of many parameters on the temperature variation of the leather. these parameters include the operating conditions of the dryer, such as the initial moisture content of the leather, the heat and mass transfer coefficients, the drying air moisture content, and the dryer air temperature. fig. 4 (effect of the initial moisture content of the fabric) shows the calculated results for the effect of the initial moisture content of the leather. when the initial moisture content is high, the temperature rise is relatively small and drying takes a long time. this may be because a higher moisture content needs much more heat for evaporation from the leather. also, the saturation temperature for a higher moisture content is lower, and thus the temperature rise in the initial stage is comparatively small. the leather temperature was also calculated to investigate the effect of the heat and mass transfer coefficients. in these calculations, an analogy was assumed between heat and mass transfer, and both transfer coefficients were determined using this assumption. the calculated results are compared in fig. 5. when the heat and mass transfer coefficients are high, the leather temperature rise is great and the time required for drying is relatively short. the effect of the drying air moisture content was investigated, and the calculated results of the model are shown in fig. 6. when the moisture content is high, the initial temperature rise of the leather also becomes high. this may be because the saturation temperature in the initial stage largely depends on the drying air moisture content. after the initial temperature rise, however, the temperature increase is relatively small, and thus the time required for complete drying is comparatively long. the effect of the dryer air temperature was also investigated, and the calculation results are shown in fig. 7. when the dryer air temperature is high, the temperature rise of the leather is great.
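the trends in figs. 4 to 7 can be checked against a back-of-the-envelope energy balance in which all convective heat arriving at the surface is spent on evaporation, giving a rough drying time per unit area. the water load, the wet-surface temperature and the other numbers below are illustrative placeholders, not outputs of the finite-difference model.

```python
lam = 2.3e6         # latent heat of evaporation [j/kg]
he = 70.0           # heat transfer coefficient [w/m2 k]
T_wet = 330.0       # assumed wet-surface (saturation) temperature [k]
thickness = 1.2e-3  # leather thickness [m]

for Te in (400.0, 450.0, 500.0):              # dryer air temperature
    for rh0 in (0.3, 0.5, 0.7):               # initial moisture level
        water = rh0 * 100.0 * thickness       # hypothetical water load [kg/m2]
        t_dry = water * lam / (he * (Te - T_wet))   # all heat goes to evaporation
        print('dryer %3.0f k, initial moisture %2.0f %%: t ~ %5.0f s'
              % (Te, rh0 * 100, t_dry))
```

the printout reproduces the qualitative statements above: a higher dryer temperature shortens the drying time, while a higher initial moisture content lengthens it.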
4 conclusion
the mathematical model developed in this study can be used to predict transient variations in the temperature and moisture content distribution of leather in the dryer with reasonable accuracy. using the model, the effects of the temperature and humidity of the dryer, the initial moisture content of the leather, and the heat and mass transfer coefficients can be predicted. with the model predictions, energy consumption can potentially be reduced by optimizing the drying conditions of the dryer.

5 nomenclature
c_a   moisture content of air in leather pores [kg/m³]
c_e   moisture content of external air [kg/m³]
c_f   moisture content in leather [kg/m³]
c_p   specific heat [kj/kg k]
d     diffusion coefficient [m²/s]
g     mass flowrate [kg/m² s]
h_e   heat transfer coefficient [w/m² k]
h_m   mass transfer coefficient [m/s]
k     rate constant [1/s]
k     thermal conductivity [w/m k]
m     mass transfer rate [kg/m² s]
p_s   saturation pressure [pa]
r     gas constant [kj/k]
q     convective heat transfer rate [w/m²]
T     temperature [k]
T_e   external air temperature [k]
t     time [s]
y_a   relative humidity of air in pores of leather
y_f   relative humidity of leather
ε     porosity
λ     latent heat of evaporation [kj/kg]
ρ     density [kg/m³]

fig. 5: effect of the heat and mass transfer coefficients.
fig. 6: effect of the drying air moisture content.
fig. 7: effect of the dryer air temperature.

references
[1] arma, c. r., gortary, j. c.: experimental data and preliminary design of a non-conventional dryer of leather. sixth int. drying symp., ids 85, versailles, france, 1988, pp. 59–63
[2] bienkiewicz, k. j.: physical chemistry of leather making. robert e. krieger publication co., 1983
[3] tomas, s., skansi, d., sokele, m.: drying technology. vol. 11, no. 6/1993, p. 1353
[4] skansi, d. et al: experimental evaluation of the microwave drying of leather. journal of the society of leather technologists and chemists, vol. 79, 1993, pp. 171–177
[5] beard, j. n.: more efficient tenter frame operations through mathematical modeling. textile chem. colorist, no. 3/1976, pp. 47–50
[6] carnahan, b., luther, h. a., wilkes, j. o.: applied numerical methods. john wiley & sons, ny, 1969
[7] farnworth, b.: a numerical model of combined diffusion of heat and water vapor through clothing. textile res. j., no. 56/1986, pp. 653–655
[8] henry, p. s.: diffusion in absorbing media. proc. r. soc., 171a, 1986, pp. 215–655
[9] nordon, p., david, h. g.: coupled diffusion of moisture and heat in hygroscopic textile materials. int. j. heat mass trans., no. 10/1967, pp. 853–866
[10] treybal, r. e.: mass transfer operations. 2nd ed., mcgraw-hill book co., 1968

dr a. k. haghi
e-mail: haghi@kadous.gu.ac.ir
university of guilan
p.o. box 3756
rasht, iran
acta polytechnica vol. 41 no. 4–5/2001

credibility of design procedures
j. marková, m. holický

theory of structural reliability enables comprehensive analysis of structural elements with respect to various limit states, and provides valuable insights into the methodology of applied standards. in addition to the reliability analysis of the structural element, a new concept of the credibility of the theoretical models used to calculate the design values of basic variables is introduced. the presented example of structural verification for the limit state of cracking shows that the credibility of commonly applied formulas and the reliability of a reinforced concrete slab have a great scatter and are in some cases inadequate.
keywords: credibility of design procedures, reliability of elements, basic variables, model uncertainties, probability of failure, reliability index.

1 introduction
construction works are commonly designed using various operational (deterministic) methods specified in national or international standards or other prescriptive documents. consequently, the actual reliability of a designed structure depends on the applied standards, their principles and application rules, including the quality requirements specified for the design and verification of the structure with respect to the ultimate and serviceability limit states. previous experience and the performed analyses show that the ultimate resistance, serviceability or durability of a given structure designed in accordance with various standards is to be expected within a broad range. the actual structural resistance may depend not only on the theoretical models used, but also on appropriate detailing and other rules recommended in the applied standards. moreover, the theoretical models specified in various standards for determining the structural resistance provide considerably different probabilities of exceeding the calculated design value. the theory of structural reliability and mathematical statistics enable a comprehensive analysis of structural elements with respect to both the ultimate and serviceability limit states. it is shown that the credibility of the design formulas and the reliability of structural elements may have a great scatter and are in some cases inadequate. in particular, the credibility of the design formulae with respect to the serviceability limit states appears to be very low.

2 design procedure
two limit states should generally be considered in the design of construction works: the ultimate and serviceability limit states.
reaching the ultimate limit states leads to structural failure, e.g., due to a loss of overall equilibrium, by reaching critical strain conditions in a certain part of the cross-section, or by fatigue of the construction materials. the serviceability limit states characterise a structural condition in which, when it is reached, the specified requirements are no longer satisfied. this can be caused by cracking, deformation or sensitivity to vibration. in the case of reinforced or prestressed structures, the stresses in the concrete and in the reinforcing and prestressing steel should be limited. uncontrolled stresses in concrete under serviceability conditions can lead to excessive cracking, high levels of creep, and plastic deformations negatively influencing the properties of the whole structure. tensile stresses in the reinforcement should be checked to avoid inelastic strain, unacceptable cracking and deformations.

to evaluate the reliability or probability of failure of a structure, it is necessary to specify the basic variables describing the load and resistance parameters and their relationship corresponding to the relevant performance criterion of the structure. this relationship, called the performance function (limit state function) $g$, is given as

$g = g(\mathbf{x})$ (1)

where the vector of basic variables $\mathbf{x}$ represents random variables, which may be time dependent. the limit state of the structure for a random realisation $\mathbf{x}$ of the vector of basic variables can be defined as $g(\mathbf{x}) = 0$. it represents a state beyond which the structure can no longer fulfil the function for which it was designed.

the method of partial factors (level i design method), used in most current european countries for structural design and also forming the basis of the new european standards (eurocodes), deals with the influences of the uncertainties and randomness of the basic variables by means of design values assigned to the variables. the design limit state function is expressed in terms of the design values of the basic variables as

$g_d = g(\mathbf{x}_d)$ (2)

where $\mathbf{x}_d$ is the vector of design values of the basic variables represented, e.g., by the design values of actions $F_d$, the design material properties $f_d$, the design models of uncertainties $\theta_d$, the design values of geometrical quantities $a_d$, the serviceability constraints $c_d$ and the importance coefficients $\gamma_n$, in accordance with iso 2394 [1]. the design condition is expressed as

$g(\mathbf{x}_d) \geq 0$ (3)

and the design vector of basic variables $\mathbf{x}_d$ can be obtained on the basis of the characteristic values of the variables and of a set of partial factors $\gamma$ for actions and material properties. the values of the partial factors depend on the design situation and on the considered limit state. the design procedure, the recommended values of the partial factors and the other reliability elements are described in the various standards for structural design used throughout the world. the partial factors are based on previous experience and on calibration using methods of structural reliability.
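for illustration, the design condition (3) can be reduced to a few lines of arithmetic for a bending member: the design action effect from factored characteristic loads is compared with the design resistance from a factored characteristic strength. the partial factors below are typical eurocode-style values, and the section data are invented; this is a sketch of the level i format, not a verification from any particular standard.

```python
gamma_G, gamma_Q, gamma_M = 1.35, 1.50, 1.15   # typical partial factors

Gk, Qk, span = 5.0, 3.0, 5.0                   # characteristic loads [kn/m], span [m]
Ed = (gamma_G * Gk + gamma_Q * Qk) * span**2 / 8   # design bending moment [knm]

fyk = 500.0                                    # characteristic steel strength [mpa]
As, z = 565e-6, 0.20                           # reinforcement area [m2], lever arm [m]
Rd = (fyk * 1e6 / gamma_M) * As * z / 1e3      # design resistance [n m -> knm]

g_d = Rd - Ed                                  # design condition (3): g(x_d) >= 0
print('Ed = %.1f knm, Rd = %.1f knm -> %s'
      % (Ed, Rd, 'satisfied' if g_d >= 0 else 'not satisfied'))
```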
knowledge about the reliability level of a structure designed according to current standards, and also about the credibility of the calculation models recommended by the standards, can be used for the optimisation of design procedures.

3 reliability
in accordance with traditional reliability theory, a structure can be considered reliable if the following condition is satisfied:

$p_f \leq p_t$, or $\beta \geq \beta_t$ (4)

where the probability of failure $p_f$ is given as

$p_f = \mathrm{p}\{g(\mathbf{x}) \leq 0\} = \int_{g(\mathbf{x}) \leq 0} \varphi_{\mathbf{x}}(\mathbf{x})\,\mathrm{d}\mathbf{x}$ (5)

where $\varphi_{\mathbf{x}}(\mathbf{x})$ is the joint probability density function of the vector $\mathbf{x}$ of basic variables. the probability of failure $p_f$ can be expressed by the reliability index $\beta = -\Phi^{-1}(p_f)$, where $\Phi$ is the distribution function of the standardised normal variable. the probability $p_t$ and the reliability index $\beta_t$ are the specified (target) values that should not be exceeded during the intended period of time, as shown by haldar and mahadevan [2]. if, for example, the reliability of a structural element stems from a comparison between the action effect $e$ and the resistance $r$, then the probability of structural failure can be expressed as

$p_f = \mathrm{p}(e > r) = \int_0^{+\infty} \varphi_e(\xi)\,f_r(\xi)\,\mathrm{d}\xi$ (6)

where $\varphi_e(\cdot)$ is the density function of the variable $e$, $f_r(\cdot)$ is the distribution function of the variable $r$, and $\xi$ denotes a generic point of $e$ and $r$.

4 credibility
the accuracy of the calculation models given in standards can be examined using the credibility analysis proposed in [10]. credibility is defined as the probability $p_c$ of the design value $g(\mathbf{x}_d)$ being exceeded by the random variable $g(\mathbf{x})$. thus, the probability $p_c$ is given as

$p_c = \mathrm{p}\{g(\mathbf{x}) > g(\mathbf{x}_d)\} = \int_{g(\mathbf{x}) > g(\mathbf{x}_d)} \varphi_{\mathbf{x}}(\mathbf{x})\,\mathrm{d}\mathbf{x}$. (7)

thus, similar general principles can be used to determine the credibility of prescriptive formulae and the reliability of the structure. it should be mentioned that unfavourable changes in the properties of a significant basic variable can dramatically influence the credibility as well as the reliability of the element in both the ultimate and serviceability limit states. an example of the reliability analysis of a reinforced concrete element with respect to crack width, and of the credibility analysis of selected theoretical models recommended for verification of the limit state of cracking, is presented in the following for the sake of illustration.
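equation (6) is straightforward to evaluate numerically. the sketch below does so for a normally distributed action effect and resistance, and checks the result against the closed-form index $\beta = (\mu_r - \mu_e)/\sqrt{\sigma_r^2 + \sigma_e^2}$, which is valid in this normal-normal case; the moments are invented example values.

```python
import numpy as np
from scipy import stats

mu_e, sig_e = 30.0, 4.0      # action effect e
mu_r, sig_r = 50.0, 5.0      # resistance r

xi = np.linspace(0.0, 120.0, 20001)                         # generic points of e and r
y = stats.norm.pdf(xi, mu_e, sig_e) * stats.norm.cdf(xi, mu_r, sig_r)
pf = float(np.sum((y[:-1] + y[1:]) / 2 * np.diff(xi)))      # trapezoid rule for eq. (6)

beta = -stats.norm.ppf(pf)
beta_exact = (mu_r - mu_e) / np.hypot(sig_r, sig_e)
print('pf = %.2e, beta = %.3f (closed form: %.3f)' % (pf, beta, beta_exact))
```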
5 verification of crack width regarding selected standards
cracking in reinforced concrete elements due to load effects can be controlled by applying the calculation models recommended in various standards or by fulfilling appropriate practical rules (e.g., for the position of the reinforcement, the size of the bars, the area of reinforcement). many theoretical models exist for predicting crack width; almost every standard for the design of concrete structures contains some calculation formula, as is also shown in structural concrete [3]. the following condition should be satisfied in the process of crack width verification:

$w(\mathbf{x}_k) \leq w_{\lim}$ (8)

where $w(\mathbf{x}_k)$ is the calculated crack width and $w_{\lim}$ is the crack width limit. most current standards recommend various theoretical models for crack width. the structural element is designed and verified using the specific methodology provided by the relevant set of national or international standards. it is known that the vectors of characteristic values $\mathbf{x}_k$ and design values $\mathbf{x}_d$ of the basic variables may differ from country to country (e.g., due to different geometrical requirements, material properties defined through non-consistent classes of concrete, or different properties of reinforcement). the design and verification of structures is influenced not only by the prescribed values of the basic variables, but also by the specified design procedures (e.g., different load combinations used for checking the ultimate and serviceability criteria) and by detailing rules.

results for the credibility of the theoretical models given in the prestandard env 1992-1-1 [4], in the ceb fip model code 1990 [5] and also in its previous proposal (marked prmc 90 in all figures), in the working draft of the new operational document pren 1992-1-1 [6], in bs 8110 [7], in aci 318-89 [8] and in čsn 73 1201 [9] are presented here, based on previous work by marková and holický [10, 11].

a reinforced concrete slab subjected to a bending moment is selected to analyse the theoretical models for crack width. the slab is from 0.19 to 0.29 m in depth, with a span of 5 m. it is loaded by one permanent load and one imposed load. note that in all cases the same material and concrete reinforcement cover are assumed. the following alternatives are considered:
1. the reinforced concrete slab is designed for the ultimate limit states according to the eurocodes. the crack width is verified taking into account the theoretical models for crack width recommended in the above-mentioned standards, considering the quasi-static combination of actions of the eurocodes. the calculated crack width is compared with the crack width limit $w_{\lim} = 0.3$ mm (0.2 mm for long-term load effects in the case of čsn 73 1201 [9]). the requirement of the standards for the crack width limit is almost identical for similar types of environment; these study cases thus differ only in the theoretical models for crack width. the resulting quantities are shown in figs. 1 to 3 and fig. 5 (the symbols of the models used are indicated without brackets).
2. considering three standards (bs 8110 [7], aci 318-89 [8] and čsn 73 1201 [9]), the slab is designed for the ultimate limit state using the appropriate loading requirements (including partial factors and combinations of actions) and verified for the serviceability limit states using the recommended loading and theoretical crack width model. the results are also shown in figs. 1 to 3 and fig. 5 (the symbols of the standards are indicated in brackets); in all cases the limit crack width is $w_{\lim} = 0.3$ mm.
the resulting crack width $w(\mathbf{x}_k)$, shown for a slab 0.25 m in depth in fig. 1, lies in a broad range. it is obvious that the crack width $w(\mathbf{x}_k)$ represents only a theoretical value, calculated on the basis of different assumptions using the influence coefficients considered in the standards. however, these calculated values of the crack width $w(\mathbf{x}_k)$, determined on the basis of a broad range of normative recommendations, are compared with the same limiting value $w_{\lim}$. classic deterministic methods do not enable a deeper analysis of the particular influences and, consequently, a detailed determination of the structural reliability with respect to crack width.
6 credibility analysis of the crack width model
the credibility of the design value of the crack width $w_d$ is verified using methods of structural reliability. the probability that the random variable $w(\mathbf{x})$ exceeds the calculated value of the crack width $w(\mathbf{x}_k)$, determined in accordance with a particular standard, is expressed as

$p_c = \mathrm{p}\{\theta_w\,w(\mathbf{x}) - w(\mathbf{x}_k) > 0\}$ (9)

where $\mathbf{x}_k$ is the vector of characteristic values of the basic variables, and the coefficient $\theta_w$ represents the uncertainties of the action effects and the inaccuracy of the theoretical model for crack width. the probabilistic models of the basic variables entering equation (9) are recommended on the basis of the working materials of the jcss (joint committee for structural safety) and previous reliability analyses. some of the models applied in the reliability analyses are assumed to be deterministic values, while the others are considered as random variables having a normal, lognormal, beta or gamma distribution. the statistical properties of the basic variables are described using the moment characteristics (mean, standard deviation) and lower and upper bounds; they are listed in marková and holický [10, 11]. the significant basic variable influencing the resulting crack width is the concrete reinforcement cover. its probabilistic model is based on measurements carried out at the klokner institute of ctu in prague and in the united kingdom, and on the working materials of the jcss [12] and vrouwenvelder et al. [13].

the theoretical models for crack width presented in current standards are based on various presumptions. they are often based on physical models modified by influence coefficients taking experimental data into account. in some cases they are assessed on the basis of experimental research, or in combination with an empirical relationship based on previous experience. the selected theoretical models assume different probabilities of exceeding the design value of the crack width, or the maximum crack spacing. the probability of exceeding the calculated crack width $w(\mathbf{x}_k)$ is 5 % according to env 1992-1-1 [4] and čsn 73 1201 [9], 10 % in the ceb fip model code 1990 [5] and pren 1992-1-1 [6], and 20 % in bs 8110 [7]. the following relationship between the average crack width $w_m$ and the characteristic crack width $w_k$ can be considered:

$w_k = w_m\,(1 + u_p\,v)$ (10)

where $v$ is a coefficient of variation expressing up to 40 % variability of the crack width, and $u_p$ is the upper fractile of the standardised normal variable for the probability $p$, based on the assumption of a normal distribution of crack widths. the relationship for calculating the average crack width introduced in aci 318-89 [8] was derived on the basis of experimental measurements, and is given in marková [10].

the probability $p_f$ of exceeding the design crack width $w_d$ according to relationship (9) is determined by the form method using the comrel software, and is expressed here by the reliability index $\beta_c$, as illustrated in fig. 2. the sorm method and the importance sampling method were also used to check the results of the reliability analysis. the analysis of the credibility of the design values of the crack width shows that the reliability index $\beta_c$ determined for a slab depth from 0.19 m to 0.29 m is low for the theoretical model introduced in the american standard ($\beta_c$ about 0.3), in the british standard ($0.5 \leq \beta_c \leq 0.9$) and in the working draft of eurocode 2 ($0.4 \leq \beta_c \leq 1.4$), while the credibility is high for the czech standard (about 2.9). the credibility of the theoretical models seems to be sufficient for env 1992-1-1 [4], and in most cases also for the ceb fip model code 1990 [5], being greater than the reliability index $\beta_d = 1.5$ recommended for serviceability limit states in pren 1990 [14].
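the credibility integral (9) is also easy to estimate by plain monte carlo once the crack-width model and the input distributions are fixed. the sketch below does so for a deliberately simplified toy crack-width function and invented distributions of the cover, the concrete tensile strength, the load effect and the model uncertainty $\theta_w$; it is not the env or mc 90 formula, and form (as used with comrel above) would be the efficient alternative for small probabilities.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
n = 200_000

def crack_width(cover_mm, fct_mpa, load_knm):
    # toy model: width grows with cover and load, shrinks with tensile strength
    return 0.004 * cover_mm * load_knm / (25.0 * fct_mpa)

w_k = crack_width(25.0, 2.9, 35.0)            # value from characteristic inputs

cover = rng.normal(25.0, 6.0, n)              # reinforcement cover [mm]
fct = rng.lognormal(np.log(2.9), 0.18, n)     # concrete tensile strength [mpa]
load = rng.normal(30.0, 4.0, n)               # quasi-static load effect [knm]
theta = rng.lognormal(0.0, 0.20, n)           # model uncertainty theta_w

p_c = float(np.mean(theta * crack_width(cover, fct, load) > w_k))
print('p_c = %.3f, credibility index beta_c = %.2f' % (p_c, -norm.ppf(p_c)))
```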
the credibility of the theoretical models seems to be sufficient for env 1992-1-1 [4], and in most cases also for ceb fip model code 1990 [5], being greater than the reliability index $\beta_d = 1.5$ recommended for serviceability limit states in pren 1990 [14].

fig. 1: crack width $w(\mathbf{x}_k)$ for a reinforced concrete slab 0.25 m in depth calculated according to the selected theoretical models, considering two alternatives: 1. the slab is designed and verified following the loading requirements of env [4] (the symbols of the standards are given without brackets); 2. the slab is designed and verified for the loading recommendations of bs [7], aci [8] and čsn [9] (the symbols of the standards are given in brackets). [bar chart; vertical axis 0.00 to 0.40; categories env, prmc 90, mc 90, pren, bs, (bs), aci, (aci), čsn, (čsn)]

fig. 2: the credibility of the design value of crack width $w_d$ for the selected theoretical models, for the slab depth of 0.25 m and the two considered alternatives. [bar chart; vertical axis 0 to 3; same categories as fig. 1]

analysis of the sensitivity factors $\alpha$ of the basic variables indicates that the significant basic variables influencing the credibility of the design crack width include the permanent and variable loads, the thickness of the reinforcement cover, the tensile strength of concrete, and the influence coefficients (e.g., expressing bond strength, duration of loading, and the shape of the strain across the cross-section). however, the theoretical models for crack width give different significance to some basic variables, as shown in fig. 3.

7 time-dependent credibility analysis of crack width model

the time-dependent credibility of the design value of crack width is based on a relationship similar to (9). the short-term and long-term effects of the imposed load and the time-effects of creep are taken into account. the probability that the random process $w(\mathbf{X}, \boldsymbol{j}(t))$ exceeds in time $t$ the calculated value of crack width $w(\mathbf{x}_k, t)$, determined in accordance with the relevant standards, is expressed as

$$p_c = P\{\xi_w\, w(\mathbf{X}, \boldsymbol{j}(t)) - w(\mathbf{x}_k, t) > 0\} \qquad (11)$$

where $\boldsymbol{j}(t)$ is the rectangular wave renewal vector process, represented here by the short-term and long-term components of the imposed load. an example of the time-dependent credibility of the design crack width for a time period from 10 to 50 years according to env 1992-1-1 [4] is shown in fig. 4, for the three considered depths of the slab (0.19 m, 0.25 m and 0.29 m).

8 reliability analysis of reinforced concrete slab regarding crack width

the time-independent reliability analysis of the slab for the limit state of cracking, considering the selected standards, deals with the probability $p_f$ of the random crack width $w(\mathbf{X})$ exceeding the required constraint $w_{\lim}$, expressed by

$$p_f = P\{w(\mathbf{X}) - \xi_{\lim}\, w_{\lim} > 0\} \qquad (12)$$

where $\mathbf{X}$ is the vector of basic variables and $\xi_{\lim}$ is the model uncertainty for the required crack width limit $w_{\lim}$ (it is considered that $w_{\lim} = 0.3$ mm for a quasi-static load combination, or for a serviceability load combination according to the relevant national standards; the names of the standards are introduced in parentheses in fig. 5). the results of the reliability analyses show that the reliability of the element depends on the theoretical model that is used. in most cases the reliability index $\beta$ is greater than the level of 1.5 recommended for serviceability limit states in pren 1990 [14].
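as a sketch of how relation (12) can be evaluated, the exceedance probability can also be estimated by simple monte carlo instead of form/sorm; the distribution parameters below are illustrative, not those of [10, 11]:

# toy monte carlo estimate of (12), p_f = P{w(X) - xi_lim * w_lim > 0};
# distribution parameters are invented for illustration
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
n = 1_000_000
w_lim = 0.30                                   # required constraint, mm
xi_lim = rng.normal(1.0, 0.1, n)               # hypothetical model uncertainty
w_x = rng.lognormal(np.log(0.15), 0.45, n)     # hypothetical random crack width, mm

p_f = np.mean(w_x - xi_lim * w_lim > 0.0)
print(f"p_f = {p_f:.4f}, reliability index beta = {-norm.ppf(p_f):.2f}")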
fig. 5 shows that the reliability index $\beta$ is in a broad range from 1.5 to 4.9, and only in the case of thicker slabs does the index $\beta$ decrease to the value 1. the reliability of the slab is high according to the british standard (about 4.3) and the czech standard (about 3.8).

9 conclusions

1. deterministic methods of structural analysis commonly used for the verification of structures do not enable an objective evaluation of structural reliability.

2. it is shown that the theory of structural reliability and mathematical statistics enable a comprehensive analysis of the reliability of a structure and an assessment of the credibility of theoretical models.

3. a practical example of the verification of a reinforced concrete slab with respect to the limit state of cracking shows that the same limit value of crack width is, in practical applications, compared with different design values of crack widths obtained on the basis of a broad range of normative recommendations.

4. the methods of structural reliability enable a realistic analysis of concrete elements with respect to crack width. the reliability indices $\beta_c$ assessed in the analysis of the credibility of the design crack width formulas, and the reliability of a reinforced concrete slab with respect to the limit crack width, have a great scatter and are in some cases inadequate.

5. it is shown that the credibility of a theoretical model for crack width is independent of the previous design of a reinforced concrete slab for the ultimate limit states of the eurocodes or the relevant national standards.

6. analysis of the sensitivity factors $\alpha$ indicates the significant basic variables influencing the reliability index and the credibility of the design crack width: the permanent and variable loads, the thickness of the reinforcement cover, the tensile strength of the concrete, and the influence coefficients (expressing bond strength and duration of loading).

7. our paper indicates that probabilistic methods can be used effectively for the development and calibration of new theoretical models applied in the design and verification of structures.

fig. 3: sensitivity factor $\alpha$ of reinforcement cover c according to the selected theoretical models and the two considered alternatives. [bar chart; vertical axis 0 to 1; categories env, prmc 90, mc 90, pren, bs, (bs), aci, (aci), čsn, (čsn)]

fig. 4: time-dependent credibility of the design crack width according to env 1992-1-1 for a time period from 10 to 50 years. [curves: upper and lower bounds (ub, lb) for slab depths 0.19, 0.25 and 0.29 m; vertical axis 0 to 2; horizontal axis t = 10 to 50 years]

fig. 5: reliability index $\beta$ of a slab of depth h for the limit state of cracking considering the selected standards and the two considered alternatives. [curves env, mc 90, pren, bs, (bs), aci, (aci), čsn, (čsn); horizontal axis h = 0.19 to 0.29 m; vertical axis 0 to 5]

acknowledgement

this research has been conducted at the klokner institute of the czech technical university in prague, czech republic, as a part of research project cez: j04/98/210000029 “risk engineering and reliability of technical systems”.

references

[1] iso 2394, general principles on reliability for structures. no. 6/1996, p. 115
[2] haldar, a., mahadevan, s.: probability, reliability and statistical methods in engineering design. john wiley & sons, 2000, p. 304
[3] structural concrete, vol. ii., textbook on behaviour, design and performance. updated knowledge of the ceb fip model code 1990, 1999, switzerland, vol. 2, p. 305
[4] env 1992-1-1, design of concrete structures – part 1: general rules and rules for buildings. 1995, p. 253
[5] ceb-fip model code 1990. design code, thomas telford, switzerland, 1993, p. 437
[6] pren 1992-1-1, working draft of design of concrete structures – part 1: general rules and rules for buildings, 2000
[7] bs 8110, structural use of concrete, part 2: code of practice for design and construction. british standards institution, london, 1985, and amendment no. 1, 1989
[8] aci 318-89, manual of concrete practice. reported by aci committee 301, american concrete institute, detroit, 1989
[9] čsn 73 1201, design of concrete structures, 1986, amendment 1, 1989, and amendment 2. prague, csni, 1994, p. 93
[10] marková, j.: reliability of reinforced concrete elements with respect to crack width. thesis, 8/2000, prague, czech republic, p. 165
[11] marková, j., holický, m.: probabilistic analysis of crack width. acta polytechnica vol. 40, no. 2/2000, prague, czech republic, pp. 56–60
[12] jcss probabilistic model code, part 2, load models, draft, 1999
[13] vrouwenvelder, t. et al.: jcss working document on eurocode random variable models. jcss-vrou-06-2-95, delft, 1995
[14] pren 1990, basis of structural design, european committee for standardisation. final draft, july 2001, p. 88

ing. jana marková, ph.d., phone: +420 2 24353501, fax: +420 2 24355232, e-mail: markova@klok.cvut.cz
assoc. prof. milan holický, drsc., ph.d., phone: +420 2 24353842, e-mail: holicky@vc.cvut.cz
czech technical university in prague, klokner institute, šolínova 7, 166 08 prague 6, czech republic

acta polytechnica 53(3):268–270, 2013

the two-dimensional harmonic oscillator on a noncommutative space with minimal uncertainties

sanjib dey∗, andreas fring (department of mathematical science, city university london, northampton square, london ec1v 0hb, uk; ∗ corresponding author: sanjib.dey.1@city.ac.uk)

abstract. the two dimensional set of canonical relations giving rise to minimal uncertainties previously constructed from a q-deformed oscillator algebra is further investigated. we provide a representation for this algebra in terms of a flat noncommutative space and employ it to study the eigenvalue spectrum for the harmonic oscillator on this space. the perturbative expression for the eigenenergy indicates that the model might possess an exceptional point at which the spectrum becomes complex and its pt-symmetry is spontaneously broken.

keywords: noncommutative space, non-hermitian operators, 2d-systems.

in [1] we demonstrated how canonical relations implying minimal uncertainties can be derived from a q-deformed oscillator algebra for the creation and annihilation operators $a_i^\dagger$, $a_i$

$$a_i a_j^\dagger - q^{2\delta_{ij}} a_j^\dagger a_i = \delta_{ij}, \quad [a_i^\dagger, a_j^\dagger] = 0, \quad [a_i, a_j] = 0, \quad \text{for } i,j = 1,2,3;\ q \in \mathbb{R}, \qquad (1)$$

as investigated for instance in [2–6]. starting from the general ansatz

$$X = \hat\kappa_1(a_1^\dagger + a_1) + \hat\kappa_2(a_2^\dagger + a_2) + \hat\kappa_3(a_3^\dagger + a_3), \qquad (2a)$$
$$Y = i\hat\kappa_4(a_1^\dagger - a_1) + i\hat\kappa_5(a_2^\dagger - a_2) + i\hat\kappa_6(a_3^\dagger - a_3), \qquad (2b)$$
$$Z = \hat\kappa_7(a_1^\dagger + a_1) + \hat\kappa_8(a_2^\dagger + a_2) + \hat\kappa_9(a_3^\dagger + a_3), \qquad (2c)$$
$$P_x = i\check\kappa_{10}(a_1^\dagger - a_1) + i\check\kappa_{11}(a_2^\dagger - a_2) + i\check\kappa_{12}(a_3^\dagger - a_3), \qquad (2d)$$
$$P_y = \check\kappa_{13}(a_1^\dagger + a_1) + \check\kappa_{14}(a_2^\dagger + a_2) + \check\kappa_{15}(a_3^\dagger + a_3), \qquad (2e)$$
$$P_z = i\check\kappa_{16}(a_1^\dagger - a_1) + i\check\kappa_{17}(a_2^\dagger - a_2) + i\check\kappa_{18}(a_3^\dagger - a_3), \qquad (2f)$$

with $\hat\kappa_i = \kappa_i\sqrt{\hbar/(m\omega)}$ for $i = 1,\dots,9$ and $\check\kappa_i = \kappa_i\sqrt{m\omega\hbar}$ for $i = 10,\dots,18$,
we constructed some particular solutions and investigated the harmonic oscillator on these spaces. here we provide an additional two dimensional solution previously reported in [6]. setting $\kappa_3 = \kappa_6 = \kappa_7 = \kappa_{12} = \kappa_{15} = \kappa_{16} = \kappa_{17} = \kappa_{18} = 0$ in equations (2a)–(2f), and employing the constraints reported in [6] together with the subsequent nontrivial limit $q \to 1$, the deformed oscillator algebra

$$[X,Y] = i\theta(1 + \hat\tau Y^2), \quad [P_x,P_y] = i\hat\tau\frac{\hbar^2}{\theta}Y^2, \quad [X,P_x] = i\hbar(1 + \hat\tau Y^2), \quad [Y,P_y] = i\hbar(1 + \hat\tau Y^2), \quad [X,P_y] = 0, \quad [Y,P_x] = 0, \qquad (3)$$

was obtained, with $\hat\tau = \tau m\omega/\hbar$ having the dimension of an inverse squared length. by the same reasoning as provided in [1, 5–9], we find the minimal uncertainties

$$\Delta X_{\min} = |\theta|\sqrt{\hat\tau + \hat\tau^2\langle Y\rangle_\rho^2}, \quad \Delta Y_{\min} = 0, \quad \Delta(P_x)_{\min} = 0, \quad \Delta(P_y)_{\min} = \hbar\sqrt{\hat\tau + \hat\tau^2\langle Y\rangle_\rho^2}, \qquad (4)$$

where $\langle\cdot\rangle_\rho$ denotes the inner product on a hilbert space with metric $\rho$ in which the operators $X$, $Y$, $P_x$ and $P_y$ are hermitian. so far no representation for the two dimensional algebra (3) has been provided. here we find that it can be represented by

$$X = x_0 + \hat\tau y_0^2 x_0, \quad Y = y_0, \quad P_x = p_{x_0}, \quad P_y = p_{y_0} - \hat\tau\frac{\hbar}{\theta}y_0^2 x_0, \qquad (5)$$

where $x_0, y_0, p_{x_0}, p_{y_0}$ satisfy the common commutation relations for the flat noncommutative space

$$[x_0,y_0] = i\theta, \quad [x_0,p_{x_0}] = i\hbar, \quad [x_0,p_{y_0}] = 0, \quad [p_{x_0},p_{y_0}] = 0, \quad [y_0,p_{y_0}] = i\hbar, \quad [y_0,p_{x_0}] = 0, \quad \text{for } \theta \in \mathbb{R}. \qquad (6)$$

clearly there exist many more solutions that one may construct in this systematic manner from the ansatz (2a)–(2f), which will not be our concern here. instead we will study a concrete model, i.e. the two-dimensional harmonic oscillator on the noncommutative space described by algebra (3). using representation (5), the corresponding hamiltonian reads

$$H_{\rm ncho}^{\rm 2d} = \frac{1}{2m}(P_x^2 + P_y^2) + \frac{m\omega^2}{2}(X^2 + Y^2) = H_{\rm fncho}^{\rm 2d} + \frac{\hat\tau}{2}\Big[m\omega^2\{y_0^2 x_0, x_0\} - \frac{\hbar}{m\theta}\{y_0^2 x_0, p_{y_0}\}\Big] + \frac{\hat\tau^2}{2}\Big[m\omega^2 + \frac{\hbar^2}{m\theta^2}\Big]y_0^2 x_0 y_0^2 x_0, \qquad (7)$$

where we used the standard notation for the anticommutator $\{A,B\} := AB + BA$. evidently this hamiltonian is non-hermitian with regard to the standard inner product, but it respects an antilinear symmetry $\mathcal{PT}_\pm: x_0 \to \pm x_0$, $y_0 \to \mp y_0$, $p_{x_0} \to \mp p_{x_0}$, $p_{y_0} \to \pm p_{y_0}$, $i \to -i$. this suggests that its eigenvalue spectrum might be real, or at least real in parts [10–12]. let us now investigate the spectrum perturbatively around the solution of the standard harmonic oscillator. in order to perform such a computation we need to convert the flat noncommutative space into the standard canonical variables $x_s$, $y_s$, $p_{x_s}$ and $p_{y_s}$. this is achieved by means of a so-called bopp-shift $x_0 \to x_s - \frac{\theta}{\hbar}p_{y_s}$, $y_0 \to y_s$, $p_{x_0} \to p_{x_s}$ and $p_{y_0} \to p_{y_s}$. the hamiltonian in (7) then acquires the form

$$H_{\rm ncho}^{\rm 2d} = H_{\rm ho}^{\rm 2d} + \frac{m\theta^2\omega^2}{2\hbar^2}p_{y_s}^2 - \frac{m\theta\omega^2}{2\hbar}\{x_s,p_{y_s}\} + \frac{\hat\tau}{2}\Big[m\omega^2\{y_s^2 x_s, x_s\} - \frac{\hbar}{m\theta}\{y_s^2 x_s, p_{y_s}\}\Big] + \frac{\hat\tau}{2}\Big[\Big(\frac{1}{m} + \frac{m\theta^2\omega^2}{\hbar^2}\Big)\{y_s^2 p_{y_s}, p_{y_s}\} - \frac{m\theta\omega^2}{\hbar}\big(\{y_s^2 p_{y_s}, x_s\} + \{y_s^2 x_s, p_{y_s}\}\big)\Big] - \frac{\hat\tau^2}{2}\Big[\frac{m\theta\omega^2}{\hbar} + \frac{\hbar}{m\theta}\Big]\big(y_s^2 p_{y_s} y_s^2 x_s + y_s^2 x_s y_s^2 p_{y_s}\big) + \frac{\hat\tau^2}{2}\Big[\frac{1}{m} + \frac{m\theta^2\omega^2}{\hbar^2}\Big]y_s^2 p_{y_s} y_s^2 p_{y_s} + \frac{\hat\tau^2}{2}\Big[m\omega^2 + \frac{\hbar^2}{m\theta^2}\Big]y_s^2 x_s y_s^2 x_s = H_{\rm ho}^{\rm 2d}(x_s,y_s,p_{x_s},p_{y_s}) + H_{\rm nc}^{\rm 2d}(x_s,y_s,p_{x_s},p_{y_s}). \qquad (8)$$

in this formulation we may now proceed to expand perturbatively around the standard two dimensional fock space harmonic oscillator solution with normalized eigenstates

$$|n_1 n_2\rangle = \frac{(a_1^\dagger)^{n_1}(a_2^\dagger)^{n_2}}{\sqrt{n_1!\,n_2!}}|00\rangle, \quad a_i^\dagger|n_1 n_2\rangle = \sqrt{n_i+1}\,\big|(n_1+\delta_{i1})(n_2+\delta_{i2})\big\rangle, \quad a_i|00\rangle = 0, \quad a_i|n_1 n_2\rangle = \sqrt{n_i}\,\big|(n_1-\delta_{i1})(n_2-\delta_{i2})\big\rangle, \qquad (9)$$

for $i = 1,2$, such that $H_{\rm ho}^{\rm 2d}|nl\rangle = E_{nl}^{(0)}|nl\rangle$.
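before moving on to the spectrum, representation (5) can be cross-checked numerically against algebra (3) on a truncated two-mode fock space. the sketch below works in units $\hbar = m = \omega = 1$ with illustrative values of $\theta$ and $\hat\tau$; because of the truncation, agreement only holds away from the top of the fock space, so the comparison is restricted to a projected low-occupation block:

# numerical sanity check of representation (5) against algebra (3);
# units hbar = m = omega = 1, theta and tau are illustrative values
import numpy as np

n = 14                                    # fock-space truncation per mode
a = np.diag(np.sqrt(np.arange(1, n)), 1)  # annihilation operator matrix
I = np.eye(n)
a1, a2 = np.kron(a, I), np.kron(I, a)

xs  = (a1 + a1.T) / np.sqrt(2)            # standard canonical variables
ys  = (a2 + a2.T) / np.sqrt(2)
pys = 1j * (a2.T - a2) / np.sqrt(2)

theta, tau = 0.1, 0.05
x0, y0, py0 = xs - theta * pys, ys, pys   # bopp shift for the flat nc space

X  = x0 + tau * y0 @ y0 @ x0              # representation (5)
Y  = y0
Py = py0 - (tau / theta) * y0 @ y0 @ x0

ident = np.eye(n * n)
# keep only states with both occupation numbers well below the cutoff
keep = [i * n + j for i in range(n - 4) for j in range(n - 4)]
blk = np.ix_(keep, keep)

c1 = (X @ Y - Y @ X)[blk] - (1j * theta * (ident + tau * Y @ Y))[blk]
c2 = (Y @ Py - Py @ Y)[blk] - (1j * (ident + tau * Y @ Y))[blk]
print(np.allclose(c1, 0, atol=1e-10), np.allclose(c2, 0, atol=1e-10))  # True True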
the energy eigenvalues for the hamiltonian $H_{\rm ncho}^{\rm 2d}$ then result to

$$E_{nl}^{(p)} = E_{nl}^{(0)} + E_{nl}^{(1)} + E_{nl}^{(2)} + \mathcal{O}(\tau^2) = E_{nl}^{(0)} + \langle nl|H_{\rm nc}^{\rm 2d}|nl\rangle + \sum_{p,q\,:\; p+q \neq n+l} \frac{\langle nl|H_{\rm nc}^{\rm 2d}|pq\rangle\langle pq|H_{\rm nc}^{\rm 2d}|nl\rangle}{E_{nl}^{(0)} - E_{pq}^{(0)}} + \mathcal{O}(\tau^2) = \omega\hbar(n+l+1) + \frac{1}{16}\hbar\omega\Omega\big[2n - (2l+1)\Omega + 10l + 6\big] + \frac{1}{8}\hbar\tau\omega\big[\Omega(8nl + 4n + 6l^2 + 10l + 5) + 10nl + 5n + 5l^2 + 10l + 5\big] + \mathcal{O}(\tau^2), \qquad (10)$$

where $\Omega = m^2\theta^2\omega^2/\hbar^2$. we note the minus sign in one of the terms, which might be an indication for the existence of an exceptional point [13, 14] in the spectrum. naturally, it would be very interesting to obtain a more precise expression for the eigenenergies, but nonetheless, as has turned out to be the case in the one dimensional setting [15], the first order approximation is very useful for the computation of coherent states [16].

acknowledgements. s.d. is supported by a city university research fellowship.

references
[1] s. dey, a. fring, and l. gouba: pt-symmetric noncommutative spaces with minimal volume uncertainty relations, j. phys. a: math. theor. 45 (2012) 385302.
[2] l. c. biedenharn: the quantum group su(2)q and a q-analogue of the boson operators, j. phys. a22 (1989) l873–l878.
[3] a. j. macfarlane: on q-analogues of the quantum harmonic oscillator and the quantum group su(2)q, j. phys. a22 (1989) 4581–4588.
[4] c.-p. sun and h.-c. fu: the q-deformed boson realisation of the quantum group su(n)q and its representations, j. phys. a22 (1989) l983–l986.
[5] b. bagchi and a. fring: minimal length in quantum mechanics and non-hermitian hamiltonian systems, phys. lett. a373 (2009) 4307–4310.
[6] a. fring, l. gouba, and b. bagchi: minimal areas from q-deformed oscillator algebras, j. phys. a43 (2010) 425202.
[7] a. kempf: uncertainty relation in quantum mechanics with quantum group symmetry, j. math. phys. 35 (1994) 4483–4496.
[8] a. kempf, g. mangano, and r. b. mann: hilbert space representation of the minimal length uncertainty relation, phys. rev. d52 (1995) 1108–1118.
[9] a. fring, l. gouba, and f. g. scholtz: strings from dynamical noncommutative space-time, j. phys. a43 (2010) 345401(10).
[10] c. m. bender and s. boettcher: real spectra in non-hermitian hamiltonians having pt symmetry, phys. rev. lett. 80 (1998) 5243–5246.
[11] a. mostafazadeh: pseudo-hermiticity versus pt symmetry: the necessary condition for the reality of the spectrum of a non-hermitian hamiltonian, j. math. phys. 43 (2002) 202–212.
[12] c. m. bender: making sense of non-hermitian hamiltonians, rept. prog. phys. 70 (2007) 947–1018.
[13] c. m. bender and t. t. wu: anharmonic oscillator, phys. rev. 184 (1969) 1231–1260.
[14] t. kato: perturbation theory for linear operators, springer, berlin (1966).
[15] s. dey, a. fring, l. gouba, and p. g. castro: time-dependent q-deformed coherent states for generalised uncertainty relations, phys. rev. d 87 (2013) 084033.
[16] s. dey and a. fring: squeezed coherent states for noncommutative spaces with minimal length uncertainty relations, phys. rev. d 86 (2012) 064038.
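as a postscript to the preceding paper, the perturbative eigenenergy (10) above is easy to transcribe and explore numerically; the parameter values below are illustrative only, and the expression is valid only to the orders shown:

# perturbative eigenenergy (10) in units hbar = omega = 1;
# Om = m^2 theta^2 omega^2 / hbar^2 and tau are illustrative values
def e_nl(n, l, Om, tau):
    e0 = n + l + 1
    e_theta = (1 / 16) * Om * (2 * n - (2 * l + 1) * Om + 10 * l + 6)
    e_tau = (1 / 8) * tau * (Om * (8 * n * l + 4 * n + 6 * l**2 + 10 * l + 5)
                             + 10 * n * l + 5 * n + 5 * l**2 + 10 * l + 5)
    return e0 + e_theta + e_tau

for n, l in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    print((n, l), round(e_nl(n, l, Om=0.2, tau=0.01), 5))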
acta polytechnica 53(supplement):528–533, 2013, doi:10.14311/ap.2013.53.0528

cp violations (and more) after the first two years of lhcb

giulio auriemma (università degli studi della basilicata, potenza, italy; infn sezione di roma, roma, italy; corresponding author: giulio.auriemma@cern.ch)

abstract. the most interesting cosmological open problems (baryon asymmetry, dark matter, inflation and dark energy) are not explained by the standard model of particle physics (sm). the final goal of the large hadron collider is an experimental verification of the sm in the higgs sector, and also a search for evidence of new physics beyond it. in this paper we report some of the results obtained in 2010 and 2011 by the lhcb experiment, which is dedicated to the study of cp violations and rare decays of heavy quarks.

1. introduction

the standard model (sm) of particle physics is at present the most advanced and comprehensive phenomenological theory of all the elementary particles and forces known, with the exclusion of gravity [1]. the sm is already a very successful theory, because it can predict particle properties and their interactions very accurately [2], but it needs to be completed in the higgs sector¹. with data taking at the large hadron collider (lhc) at √s = 8 tev in progress, the sm (if completely confirmed) will be the best candidate description for the physics of the universe in the time span from ∼ 10⁻¹⁰ s to 13.7 gy after the big bang (see fig. 1). however, the progress of observative cosmology made in the last decades, especially from space, has radically reshaped our vision of the universe, in a way that seems to contradict the completeness of the sm. the following is a time-ordered list of problems in cosmology that do not appear compatible with the sm, at least in its minimal version.

¹ during the preparation of this manuscript, cern announced officially that a significant excess of events at mh = 125.5 ± 0.6 gev/c² has been observed in the data by both atlas [3] and cms [4].

• baryon asymmetry: since the discovery of the existence of antimatter in the early 1940’s, the predominance of matter over antimatter in the whole visible universe has been a puzzle [5]. in the late 1960’s, sakharov [6] showed that the puzzle could be solved by the recently discovered charge and parity violation (cpv) in the decay of k0 mesons [7]. a few years later, kobayashi & maskawa found that the extension of cabibbo mixing [8] to three families could easily explain cpv if one of the elements of the 3×3 quark mixing matrix was complex [9] (see §2 below).

• dark matter: constitutes (23 ± 2) % of the mass of the universe in the λcdm model, while baryons account for only (4.6 ± 0.2) % [10]. the formation of large scale structures in the universe gives a clue to the nature of this matter, which is currently understood to be formed by weakly interacting massive particles (wimps), relics from the big bang (see e.g. [11] and references therein). an ideal candidate for wimps is the lightest supersymmetric particle (lsp) of the supersymmetric theories (susy), stable if r-parity is conserved [12]. the lhc is expected to be able either to prove the validity of susy, or to exclude its realization in specific models [13, 14] (see §4).

• inflation: the prototype for the “inflaton” [15] for alan guth was originally the higgs field. however, it was realized very soon [16] that in order to have the required shape, the self-coupling of the higgs should be of the order of λ ≲ 10⁻¹³, too small to be compatible with ew physics. however, the sm higgs itself could provide the inflationary potential if its coupling to gravity is non-minimal [17].
in this case the observation of the cosmic microwave background (cmb) fluctuations sets limits on the higgs mass in the range 120 ÷ 140 gev/c², compatible with the present limits of the tevatron and lhc experiments (see e.g. [18]).

• dark energy: about fifteen years ago, two independent groups tracking the expansion of the universe with type ia supernovae (snia) with the hubble space telescope discovered that the expansion was accelerating [20]. the present understanding is that about 73 % of the mass-energy content of the universe is in the form of a vacuum energy very similar to the famous “cosmological constant” λ0 introduced by einstein [21]. from the point of view of hep, the problem with λ0 is its smallness, compared to the huge energy density of quantum fluctuations in the vacuum of the sm, which can be estimated to be about 128 orders of magnitude larger than the critical density [22]. perfect symmetry between bosons and fermions, or in other words an unbroken susy, offers a perfect cancellation of this energy density, because the contribution of bosons to the vacuum energy has an opposite sign to that of fermions (see e.g. §5.1 of ref. [21]). since susy is broken at present, the discrepancy is not eliminated but only mitigated by susy, to the level of about 60 orders of magnitude. this situation indicates a serious unsolved problem for particle physics theory [23].

figure 1: a schematic view of the expansion of the universe. [log-log sketch of the scale r(t) versus time, marking the epochs t_pl, t_susy(?), t_gut, t_ew, t_qgp, t_bbn, t_rec, the unification of the gravitational, strong, electromagnetic and weak (electroweak, gut) interactions, the pre-planck era, the lhc reach, inflation, and the present age of 13.7 gy]

finally, it should be emphasized that all these problems are more or less directly related to our present lack of knowledge of the higgs sector of the sm. in fact, while the fermion sector of the sm includes at least three families of quarks and leptons, and four gauge bosons, its higgs structure consists only of a single doublet. in the coming years, the lhc experiments will certainly be able to set constraints on the various possible alternative theories that have been proposed.

2. the sakharov mechanism for baryon asymmetry

in his seminal paper, sakharov [6] pointed out that even if the initial universe was matter–antimatter symmetric, the observed present asymmetry could be originated if at a certain point of the evolution of the universe: (1.) the baryon number is violated; (2.) charge and parity are violated; (3.) there is an exit from thermodynamical equilibrium. in fact, condition (1.) is obviously needed to go from the initial $B = 0$ to $B \neq 0$, with $B = \frac{1}{3}\sum_q (n_q - n_{\bar q})$; condition (2.) produces asymmetry because the decay rates of particles are different from those of antiparticles (see e.g. fig. 3); finally, condition (3.) is needed because otherwise the annihilation reactions $q\bar q \rightleftharpoons \gamma\gamma$ would keep $B = 0$ in force of the cpt theorem. these conditions could be present at different stages of the evolution of the universe [24].
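a toy illustration of how conditions (1.)–(3.) combine: the sketch below lets equal populations of a hypothetical heavy particle x and its antiparticle decay out of equilibrium with a small cp-violating offset ε between their branching fractions into baryons. every number is invented for illustration; this is not a quantitative baryogenesis model.

# toy out-of-equilibrium decay: equal populations of x and xbar decay with
# branchings r and rbar = r - eps into baryons and antibaryons respectively;
# the cpv offset eps leaves a net baryon number behind (condition 3 is assumed:
# no re-equilibration by annihilation). all values are illustrative.
n_x = 1_000_000            # symmetric initial populations of x and xbar
r, eps = 0.40, 1e-3        # hypothetical branching fraction and cpv asymmetry

baryons = n_x * r
antibaryons = n_x * (r - eps)
print(f"net baryon number: {baryons - antibaryons:.0f}")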
in the sm, there is a possibility that baryon asymmetry could be produced at the electroweak phase transition (ewpt) [25], which is the transition induced at a temperature $T_{\rm ew} \sim 200$ gev [26] by spontaneous symmetry breaking (ssb) from the full $su(2)_L \otimes u(1)_Y$ symmetry of the early universe to the present state, in which fermions and bosons get mass by interacting with the higgs field. for $T \gg T_{\rm ew}$, the early universe was in the state with vacuum expectation value (vev) of the higgs field $\langle\phi\rangle = 0$, while at $T \ll T_{\rm ew}$ the higgs field has the present value $\langle\phi\rangle = v_0 = 246$ gev. in this transition, vector bosons and fermions acquired masses.

the first sakharov condition can be realized in the sm because only the difference $B - L$, where $L = \sum_\ell (n_\ell - n_{\bar\ell})$, is strictly conserved. anomalous processes that change both $B$ and $L$, keeping $\Delta B = \Delta L$, are possible as tunneling through the higgs field potential barrier. in the present state of the universe these processes are strongly suppressed, but this is not true when $T \ge T_{\rm ew}$.

in the sm, cpv occurs only via the ckm mechanism [8, 9], which arises from quark mixing. this mechanism with 3 families can operate if and only if there is a complex phase $\delta_{cp}$ in the mixing matrix with $\sin\delta_{cp} \neq 0$, and if the masses of quarks with equal electric charge but different flavor are not degenerate, namely $m_u \neq m_c \neq m_t$ and $m_d \neq m_s \neq m_b$. it is evident that the ckm mechanism will be switched off in the symmetric phase of the primordial plasma at $T > T_{\rm ew}$, when all fermions and bosons were massless. in the minimal sm, a sizable baryon asymmetry can therefore be generated at the ewpt only if the ssb proceeds through a strong first order transition [27, 28]. in this case the vev of the higgs field changes non-adiabatically at $T \approx T_{\rm ew}$, by nucleation of bubbles with $\langle\phi\rangle = v_{T_{\rm ew}}(T_{\rm ew}) \neq 0$ inside the supercooled bulk with null vev. in fact, cpv will be active inside the bubbles and can produce baryon asymmetry. a first order transition occurs only if the effective potential of the higgs field $v_{\rm eff}(\phi,T)$ [29] has a pronounced second local minimum with $\xi = v_{T_{\rm ew}}(T_{\rm ew})/T_{\rm ew} > 1$. recent calculations [30] of $v_{\rm eff}$ in the minimal sm show that this is possible only if the higgs mass is $m_h \lesssim 114$ gev/c², already excluded by the combined tevatron limits [31].

susy completely changes this scenario, for two reasons: 1) the ewpt can be strong even with a mass of the higgs compatible with the lhc experiments [30, 32]; and 2) cpv is not switched off for $T \approx T_{\rm ew}$. the minimal susy extension of the sm (mssm) [33] includes two higgs complex doublets of opposite hypercharge:

$$\phi_u = \begin{pmatrix}\phi_u^+ \\ \phi_u^0\end{pmatrix} \quad\text{and}\quad \phi_d = \begin{pmatrix}\phi_d^0 \\ \phi_d^-\end{pmatrix}, \qquad (1)$$

a combination of eight real fields that couple separately to the heavier u, c, t and the lighter d, s, b quarks (and leptons). the expectation values of the two neutral fields will be respectively the heavier $\langle\phi_u^0\rangle = v_u/\sqrt{2}$ and the lighter $\langle\phi_d^0\rangle = v_d\, e^{i\theta}/\sqrt{2}$, which originate “spontaneous” cp violating phases in the mixing matrix [34]. in order to conserve the strength of the sm weak interactions, the vevs of the two fields must obviously satisfy

$$v_u^2 + v_d^2 = v_0^2 = (246\ {\rm gev})^2, \qquad (2)$$

with $\tan\beta = v_u/v_d \gg 1$ being a free parameter of the theory. in the physical realization of this theory, the eight degrees of freedom correspond to three massless goldstone bosons and five massive higgs fields: a cp-odd neutral scalar a0, two charged scalars h±, and two cp-even neutral scalars (a lighter and a heavier h0), the first being identical to the sm one.
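relation (2) fixes both vevs once tan β is chosen; a one-line numerical check (the tan β values are illustrative choices):

# vevs from relation (2): v_u^2 + v_d^2 = v_0^2 with tan(beta) = v_u / v_d
import math

v0 = 246.0  # gev
for tan_beta in (2.0, 10.0, 50.0):  # illustrative values
    v_d = v0 / math.sqrt(1.0 + tan_beta**2)
    v_u = tan_beta * v_d
    print(f"tan(beta) = {tan_beta:4.0f}: v_u = {v_u:7.2f} gev, v_d = {v_d:6.2f} gev")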
if the mass of the heavier higgs is $m_H \lesssim$ a few tev, it will be possible to detect the existence of this new type of cpv as small deviations from the prediction of the ckm ansatz, generically indicated as “new physics” (np), using precision measurements of b meson physics by lhcb and the other lhc experiments, as we will discuss in §3 below.

more recently, a different way of establishing baryon asymmetry has been proposed [35]. as we said before, in the sm only the difference $B - L$ is conserved, while both the baryon number $B$ and the lepton number $L$ can change with the constraint $\Delta B = \Delta L$. leptonic asymmetry can easily be generated from the cp violating decay of right-handed massive neutrinos $N$ at $T \gg m_N \gg T_{\rm ew}$. when $m_N \le T \le T_{\rm ew}$, the difference $B - L \neq 0$ would be enforced by the baryon number violating interactions of the sm. the existence of heavy majorana neutrinos can also explain the mass of the ordinary light neutrinos [36]. in §4 we will give the results of indirect searches for susy and majorana neutrinos in b meson rare decays.

3. cpv after the first two years of lhcb

the lhcb detector, located at intersection point 8 of the lhc, is a single arm spectrometer covering the very forward cone $30 \le \theta \le 300$ mrad, optimized for the reconstruction of heavy meson decays [37]. as shown in fig. 2, lhcb does not look like a regular collider experiment (e.g. compare with the atlas or cms layouts shown in ref. [18]). the core of the detector is the vertex locator (velo), a high resolution silicon tracker with cylindrical geometry, which allows the reconstruction of the position of the decay vertices with a resolution of ∼ 10 µm. two ring cherenkov detectors (rich1 & rich2) allow the identification of charged particles, whose momentum is determined from the magnet deflection measured by the tracking stations located upstream (tt) and downstream (t1, t2, t3) of the magnet. energy measurements are performed by the electromagnetic (ecal) and hadronic (hcal) calorimeters. finally, energetic muons are identified by the muon detector chambers (m1–m5), interleaved into the 4 m iron filter.

figure 2: layout of the lhcb detector at the intersection point 8 of the lhc [37].

the lhcb detector ran at √s = 7 tev from april 2010 to september 2011, collecting an integrated luminosity of about 1 fb⁻¹ (corresponding to ≈ 10¹⁴ pp collisions and ≈ 8 × 10¹⁰ bb̄ pairs inclusively produced in the lhcb acceptance [38]). among the many interesting results obtained by the lhcb collaboration, it is worth mentioning the following.

• direct cpv in the b0s,d system: the ckm mechanism manifests itself in two ways in neutral meson decays: time dependent “indirect” cpv, which takes place during the $b^0_{s,d} \leftrightarrow \bar b^0_{s,d}$ oscillations, and “direct” cpv, which gives a decay rate of the $b^0_{s,d}$ different from that of the $\bar b^0_{s,d}$, and which could originate baryon asymmetry. lhcb has given the first evidence of direct cpv in the charmless two body decay $b_s^0 \to k^-\pi^+$ at 3.3σ [39]. figure 3 makes a direct comparison of the distribution of the invariant mass of $k^-\pi^+$ pairs (on the left) with that of $k^+\pi^-$.

• mixing and indirect cpv in the charmed mesons: evidence for indirect cpv in d mesons has been reported for the first time by lhcb [40]. evidence for the mixing $d^0 \leftrightarrow \bar d^0$ has been observed at the b-factories. belle has published the mixing parameters derived from the 3-body decay $d^0 \to k^0_s\pi^+\pi^-$, obtaining a cp violation phase of $-0.2 \pm 0.3$, consistent with no cpv. lhcb has investigated the decay of $d^0$ and $\bar d^0$ into a pair of charged hadrons using data taken in 2010,
obtaining $y_{cp}^{\rm lhcb} = [5.5 \pm 6.3\,({\rm stat}) \pm 4.1\,({\rm syst})] \times 10^{-3}$, a value still compatible with no cpv. a significant improvement in sensitivity and systematic uncertainty is expected from an improved treatment of background events, which will be possible for the data taken in 2011 [41].

figure 3: visual illustration of direct cpv in the b-meson system [39]. top and bottom plots show the same data; only the vertical scale of the bottom is multiplied ×15. [panels labelled $b_d$, $\bar b_d$, $b_s$, $\bar b_s$]

an important test of the sm is the check of the unitarity of the ckm matrix $v_{\rm ckm}$ extracted from measurements [42]. in the complex plane $(\rho,\eta)$, where $\eta$ is related to the cpv phase, the unitarity condition is represented by a closed triangle (see §12.3 of ref. [2]). figure 4 shows the various constraints used for the fit of the unitary triangle, together with the uncertainty over the closure of the triangle.

figure 4: the present status of the fit to the unitary triangle. the yellow colour indicates the 95 % c.l. contour of the sm fit; the angles α, β, γ are marked (adapted from ref. [42]).

a detailed study of the parameters of the oscillations $b_s^0 \leftrightarrow \bar b_s^0$ has been proposed in the past as a sensitive test of deviations from the predictions of the sm that could be explained by np (see e.g. [43] and references therein). particularly:

• bs anomaly: the cpv phase $\phi_s$ of the $|\Delta f| = 2$ transitions is predicted to be $\phi_s^{\rm sm} = -0.036 \pm 0.002$ rad in the sm. previous measurements of this phase at the tevatron gave values only marginally compatible with the sm [44, 45]. lhcb has significantly improved the study of the decay $\bar b_s^0 \to j/\psi\,\phi$ followed by $\phi \to k^+k^-$ [46]. using 0.37 fb⁻¹ of data taken during 2011 at √s = 7 tev, the lhcb collaboration has obtained

$$\phi_s^{j/\psi\phi} = 0.15 \pm 0.18\,({\rm stat}) \pm 0.06\,({\rm syst}), \qquad (3)$$

only 1σ larger than the sm. hopefully the collection of the larger amount of data expected at √s = 8 tev in the 2012 runs of the lhc can reduce the statistical error to a level suitable for understanding the situation. in the case of the very similar decay $\bar b_s^0 \to j/\psi\,f_0(980)$ with $f_0 \to \pi^+\pi^-$ [47], lhcb derived a value

$$\phi_s^{\rm comb} = -0.44 \pm 0.44\,({\rm stat}) \pm 0.02\,({\rm syst}). \qquad (4)$$

this value is compatible with the sm, but it still leaves room for np deviations. a statistical study using both $b_s^0 \leftrightarrow \bar b_s^0$ and $b_d^0 \leftrightarrow \bar b_d^0$ lhcb data [48] concludes that the sm predictions have only ≈ 0.7 % c.l.

4. the search for new physics in rare b decays

• search for the decay $b_{d,s}^0 \to \mu^+\mu^-$: this is the golden channel in the search for np, because in the sm this decay is possible only through the transition $b \to u, c$ or $t$ shown by the graph in fig. 5 (left), which allows a precise calculation of the branching ratio [49], being $br(b_d^0 \to \mu^+\mu^-)_{\rm sm} = (1.0 \pm 0.1) \times 10^{-10}$ and $br(b_s^0 \to \mu^+\mu^-)_{\rm sm} = (3.2 \pm 0.2) \times 10^{-9}$. the best experimental limits obtained until now are set by lhcb [50] using 1.0 fb⁻¹ of integrated luminosity, which are respectively $br(b_d^0 \to \mu^+\mu^-)_{\rm lhcb} \le 1.0 \times 10^{-9}$ and $br(b_s^0 \to \mu^+\mu^-)_{\rm lhcb} \le 4.5 \times 10^{-9}$ at 95 % c.l.

figure 5: graphs for $b^0_{d,s} \to \mu^+\mu^-$. left: sm; right: mssm. [diagram labels: b, d̄/s̄, u/c/t, w⁻, z⁰, µ⁻µ⁺ on the left; b̄, d̄/s̄, t̃, χ̄, h⁰/a⁰, µ⁻µ⁺ on the right]

in theories beyond the sm with an extended higgs sector, additional graphs are expected to contribute to these decays. an example is the graph shown in fig. 5 (right), where this essential role is played by the neutralino, supposed to be a good candidate for dark matter [51].
the enhancement of the branching ratio with respect to the sm is

$$r_s = \frac{br(b_s^0 \to \mu^+\mu^-)_{\rm lhcb}}{\overline{br}(b_s^0 \to \mu^+\mu^-)_{\rm sm}} \le 1.2 \ \text{at 95 % c.l.}, \qquad (5)$$

where $\overline{br}$ is the time averaged theoretical branching ratio [52]. the complete mssm has about 100 free parameters, making any comparison with experimental results practically impossible. some indications can be obtained from the “constrained” mssm (cmssm), in which all the masses of the scalar partners of fermions (squarks, sleptons, etc.) are assumed to have the same mass $m_0$, while all the fermionic partners of gauge bosons (gauginos) are assumed to have mass $m_{1/2}$. in the cmssm it is possible to make predictions about the amplitude of np deviations from the sm, for example in the rare decays of b mesons. in the cmssm it is expected [53]:

$$br(b_s^0 \to \mu^+\mu^-)_{\rm mssm} \propto \frac{m_b m_\mu}{m_A^4}\tan^6\beta. \qquad (6)$$

therefore the limit on $r_s$ from lhcb can already exclude a substantial fraction of the cmssm parameter space, as shown in fig. 6.

figure 6: exclusion region for the two cmssm mass parameters, derived from the lhcb upper limit on $b_s \to \mu^+\mu^-$ for $\tan\beta = 50$. solid lines indicate the direct limits from cms with 4/fb (adapted from ref. [54]).

• search for heavy majorana neutrinos: the decay $b^- \to d^+\mu^-\mu^-$ (if it exists) violates the conservation of the lepton number [55]. its graph, shown schematically in fig. 7, is equivalent to the nuclear neutrinoless double beta decay [56]. the limit on the branching ratio is ≤ 10⁻⁸. figure 7 shows the limit on the coupling of the majorana neutrino to the muon for $m_N \le 5$ gev/c².

figure 7: the graph of the decay $b^- \to d^+\mu^-\mu^-$ mediated by a majorana neutrino n, and the lhcb limit on the coupling $|v_{\mu 4}|^2$ as a function of the majorana neutrino mass (0 to 5000 mev).

• lepton flavour violation: more recently, lhcb has also searched for the decay $\tau^- \to \mu^+\mu^-\mu^-$, which violates lepton flavour conservation and is expected in the sm with an extremely small branching ratio, $br(\tau^- \to \mu^+\mu^-\mu^-)_{\rm sm} \approx 10^{-40}$ [57]. lhcb has obtained the upper limit $br(\tau^- \to \mu^+\mu^-\mu^-)_{\rm lhcb} \le 6.3 \times 10^{-8}$ (90 % c.l.). the sensitivity of this search has been calibrated with the control channel $d_s^- \to \phi(\mu^+\mu^-)\pi^-$, for which the branching ratio measured by lhcb is $(1.33 \pm 0.8) \times 10^{-5}$ [58].

5. summary

• all the results obtained up to now exclude a mass of the lighter higgs particle smaller than 115 gev/c² or greater than 126 gev/c² (up to ≈ 800 gev/c²). the atlas and cms data are at present not inconsistent with a sm higgs mass in the range $115 \le m_h \le 126$ gev/c² (see note 1), which is favored by ew precision measurements [2] and is compatible with the cmssm [59].

• cpv phenomenology is very well established in the transitions of the s and b quarks, while only some indirect evidence has finally been found by lhcb for charmed hadrons. in all respects, the ckm mechanism is the dominant way in which cpv is realized, as shown by the most accurate fit of the unitary triangle made possible by the lhcb measurements. some large anomalies in $b_s^0$ semileptonic decays claimed by past experiments have not been confirmed, even if the situation is still ambiguous.

• rare decays of b mesons that are theoretically well constrained in the sm, e.g. $b_s^0 \to \mu^+\mu^-$, have been shown to be very effective in setting constraints on beyond-sm effects. lhcb, increasing the collected integrated luminosity, will soon be able to detect this decay at the level predicted by the sm, or earlier if it is enhanced as predicted by the cmssm. at present, both the scalar mass scale $m_0$ and the gaugino mass $m_{1/2}$ are above 1 tev/c² for the large $\tan\beta$ fit of the cmssm [60].

• the 2012 data taking period of the lhc, started last spring, is expected to yield in lhcb an integrated luminosity of at least 5 fb⁻¹, which is very promising for the detection or exclusion of new physics phenomenology.

references
[1] g. altarelli, in encyclopedia of mathematical physics, j.-p. francoise et al. eds., academic press, oxford (2006) 3238.
[2] j. beringer et al., phys. rev. d86 (2012) 010001.
[3] atlas collaboration, http://cdsweb.cern.ch/record/1460439.
[4] cms collaboration, http://cdsweb.cern.ch/record/1460438.
[5] r. a. alpher and r. herman, science 128 (oct. 1958) 904.
[6] a. d. sakharov, jetp letters 5 (1967) 24–27.
[7] j. christenson et al., phys. rev. lett. 13 (1964) 138–140.
[8] n. cabibbo, phys. rev. lett. 10 (1963) 531–533.
[9] m. kobayashi and t. maskawa, prog. theor. phys. 49 (1973) 652–657.
[10] e. komatsu et al., astrophys. j. suppl. 192 (2011) 18.
[11] s. colafrancesco, these proceedings (2012).
[12] g. bertone et al., phys. rev. d85 (2012) 055014.
[13] h. baer, v. barger, and a. mustafayev, phys. rev. d85 (2012) 075010.
[14] s. sekmen et al., jhep 1202 (2012) 075.
[15] a. h. guth, phys. rev. d23 (1981) 347–356.
[16] a. d. linde, phys. lett. b129 (1983) 177–181.
[17] f. bezrukov and m. shaposhnikov, phys. lett. b659 (2008) 703–706.
[18] a. strassner, these proceedings (2012).
[19] k. nakayama and f. takahashi, phys. lett. b707 (2012) 142–145.
[20] n. panagia, these proceedings (2012).
[21] j. frieman, m. turner, and d. huterer, ann. rev. astron. astrophys. 46 (2008) 385–432.
[22] s. weinberg, arxiv:astro-ph/0005265.
[23] g. altarelli, arxiv:1206.1476 [hep-ph].
[24] a. g. cohen et al., ann. rev. nucl. part. sci. 49 (1993) 27–70; m. dine and a. kusenko, rev. mod. phys. 76 (2004) 1; g. auriemma, in frontier objects in astrophysics and particle physics, f. giovannelli and g. mannocchi, eds. (2007), p. 97.
[25] v. a. kuzmin, v. a. rubakov, and m. e. shaposhnikov, phys. lett. b155 (1985) 36.
[26] d. a. kirzhnits and a. d. linde, phys. lett. b42 (1972) 471–474.
[27] m. e. shaposhnikov, phys. lett. b277 (1992) 324–330.
[28] g. w. anderson and l. j. hall, phys. rev. d45 (1992) 2685–2698.
[29] s. weinberg, phys. rev. d9 (1974) 3357–3378; l. dolan and r. jackiw, phys. rev. d9 (1974) 3320–3341.
[30] a. noble and m. perelstein, phys. rev. d78 (2008) 063518.
[31] tevnph working group, arxiv:1203.3774 [hep-ex].
[32] j. r. espinosa, arxiv:hep-ph/9706389; j. r. espinosa, nucl. phys. b475 (1996) 273–292.
[33] s. p. martin, in perspectives on supersymmetry, g. l. kane, ed., p. 1, 1998.
[34] t. lee, phys. rev. d8 (1973) 1226–1239; n. haba, m. matsuda, and m. tanimoto, phys. rev. d54 (1996) 6928–6935.
[35] m. fukugita and t. yanagida, phys. lett. b174 (1986) 45.
[36] h. murayama and a. pierce, phys. rev. d67 (2003) 071702.
[37] a. a. alves et al., jinst 3 (2008) s08005.
[38] r. aaij et al., phys. lett. b694 (2010) 209–216.
[39] r. aaij et al., phys. rev. lett. 108 (2012) 201601.
[40] r. aaij et al., phys. rev. lett. 108 (2012) 111602.
[41] r. aaij et al., jhep 1204 (2012) 129.
[42] m. ciuchini et al., jhep 0107 (2001) 013, and updates from http://www.utfit.org.
[43] a. lenz et al., phys. rev. d83 (2011) 036004.
[44] v. m. abazov et al., phys. rev. d85 (2012) 032006.
[45] t. aaltonen et al., phys. rev. d85 (2012) 072002.
[46] r. aaij et al., phys. rev. lett. 108 (2012) 101803.
[47] r. aaij et al., phys. lett. b707 (2012) 497–505.
[48] a.
lenz et al., arxiv:1203.0238 [hep-ph].
[49] a. j. buras et al., jhep 1010 (2010) 009.
[50] r. aaij et al., phys. rev. lett. 108 (2012) 231801.
[51] r. dermisek et al., jhep 0304 (2003) 037.
[52] k. de bruyn et al., phys. rev. lett. 109 (2012) 041801.
[53] g. kane, c. kolda, and j. e. lennon, arxiv:hep-ph/0310042.
[54] n. mahmoudi, in 47th rencontres de moriond: qcd and high energy interactions, la thuile, in press (2012).
[55] r. aaij et al., phys. rev. d85 (2012) 112004.
[56] f. ferroni, these proceedings (2012).
[57] w. j. marciano, t. mori, and j. m. roney, ann. rev. nucl. part. sci. 58 (2008) 315–341.
[58] lhcb collaboration, https://cdsweb.cern.ch/record/14344562.
[59] g. buchmueller et al., eur. phys. j. c72 (2012) 1878.
[60] g. buchmueller et al., arxiv:1207.7315 [hep-ph], 2012.

acta polytechnica 54(5):358–362, 2014, doi:10.14311/ap.2014.54.0358

effective fracture energy of ultra-high-performance fibre-reinforced concrete under increased strain rates

radoslav sovják∗, jana rašínová, petr máca (czech technical university in prague, faculty of civil engineering, experimental centre, thákurova 7, 166 29 prague, czech republic; ∗ corresponding author: sovjak@fsv.cvut.cz)

abstract. the main objective of this paper is to contribute to the development of ultra-high-performance fibre-reinforced concrete (uhpfrc) with respect to its effective fracture energy. this paper investigates the effective fracture energy, considering various fibre volume fractions and various strain rates. it is concluded that the effective fracture energy is dependent on the strain rate. in addition, it is found that higher fibre volume fractions tend to decrease the sensitivity of uhpfrc to increased strain rates.

keywords: effective fracture energy; uhpfrc; quasi-static loading; increased strain rates; micro fibres; fibre volume fraction.

1. introduction

ultra-high-performance fibre-reinforced concrete (uhpfrc) is an advanced cementitious composite with enhanced mechanical and durability properties. nowadays, uhpfrc is not considered as an alternative to conventionally used materials; rather, it outperforms conventionally used concretes. the reasons for the present situation are the higher initial costs, and also a certain inertia in the present-day building industry that has led to the continued use of conventional concretes. the application of uhpfrc is therefore limited to special applications, such as energy-absorbing facade panels and key elements of building structures that may be exposed to increased loading rates resulting from earthquakes, impacts or blasts [1, 2]. in the event of increased loading rates, a large deformation of the structural member is expected, while the exposed member is required to continue to possess some residual capacity to carry the load.
the capacity of the member to absorb the energy can be quantified via the effective fracture energy, which determines the overall energy that a material can absorb per square meter. the energy absorption capacity is the main material property that benefits from fibre reinforcement. the effective fracture energy ($g_f$) is a key parameter for evaluating the ability of a material to withstand an increased loading rate, and also to redistribute the load from the exposed structure to its surrounding parts. in addition, different behaviour of uhpfrc in terms of $g_f$ can be expected at higher loading rates, as the action is shorter by a magnitude than for quasi-static loading (see figure 1).

figure 1: strain rates for various load events.

2. effective fracture energy

the effective fracture energy ($g_f$) of a material is defined as the energy required to open a unit crack surface area. $g_f$ is governed by the tensile mechanism of the material, and represents the amount of energy consumed when a crack propagates through the beam. the fracture energy is expressed as the work of external forces acting on the beam related to the actual depth of the crack. the overall work of the external forces related to the final crack depth is considered as the average fracture energy, the so-called effective fracture energy (figure 2).

figure 2: effective fracture energy of a notched specimen.

the effective fracture energy ($g_f$) is strain rate dependent, and it is assumed that $g_f$ also increases with increasing strain rate. this dependence is usually described by the dynamic increase factor (dif), which expresses the ratio of $g_f$ measured under increased strain rate loading conditions to $g_f$ measured under quasi-static loading conditions:

$${\rm dif}_{g_f} = \frac{g_f^{\rm increased\ strain\ rate}}{g_f^{\rm quasi\text{-}static\ strain\ rate}}.$$

$g_f$ was determined in this study on the basis of recommendations given by the rilem technical committee [3] and also by other studies [4, 5]:

$$g_f = \frac{w_f + m g u_u}{b(h - a_0)},$$

where $g_f$ is the effective fracture energy, $w_f$ is the work of external forces (i.e., the area beneath the l–d diagram), and $m g u_u$ is the contribution of the weight of the beam. in detail, $m$ is the weight of the beam, $g$ is the gravity acceleration, $u_u$ is the ultimate deflection of the beam, $b$ is the width of the beam, $h$ is the height of the beam, and $a_0$ is the height of the notch.

3. material

the uhpfrc tested in this study was developed on the basis of components widely available in the czech republic (see table 1). the material design process has been fully described elsewhere [6–8].

table 1: mixture design of the uhpfrc used in this study (all values are in kg/m³).
component | fibre content 1 % | 2 % | 3 %
cement cem i 52.5r | 800 | 800 | 800
silica fume | 200 | 200 | 200
silica powder | 200 | 200 | 200
water | 176 | 176 | 176
superplasticizer | 39 | 39 | 39
fine sand 0.1/0.6 mm | 336 | 336 | 336
fine sand 0.3/0.8 mm | 720 | 640 | 560
fibres 0.22 × 13 mm | 80 | 160 | 240

table 2: mechanical properties of the uhpfrc used in this study (all values are in mpa, except the modulus of elasticity, which is in gpa).
property | fibre content 1 % | 2 % | 3 %
compressive strength | 150 | 152 | 150
tensile strength | 7.8 | 9.9 | 11.7
modulus of rupture | 15.8 | 25.6 | 33.8
splitting tensile strength | 14.9 | 20.5 | 26.6
modulus of elasticity (gpa) | 45.1 | 56.3 | 51.5

figure 3: micro fibres used in this study.
figure 4: a) pull-out failure mode. b) fibre failure mode.

briefly, the uhpfrc was mixed in conventional mixers, and the beams were cured in water tanks.
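as an aside, the work-of-fracture definition of $g_f$ given in section 2 above maps directly onto a few lines of code. in the sketch below the load–deflection record and the beam mass are invented placeholders, while the geometry (b = h = 0.1 m, a0 = 0.03 m) matches the beams described in section 4 below:

# effective fracture energy per the rilem work-of-fracture formula above;
# the load-deflection points and beam mass are invented for illustration
import numpy as np

d = np.array([0.0, 0.2, 0.5, 1.0, 2.0, 5.0, 10.0]) * 1e-3   # deflection [m]
f = np.array([0.0, 30.0, 26.0, 20.0, 12.0, 4.0, 0.0]) * 1e3 # load [n]

b, h, a0 = 0.100, 0.100, 0.030     # beam width, height, notch height [m]
m_beam, g = 13.0, 9.81             # assumed beam mass [kg] and gravity [m/s^2]

w_f = np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(d))   # area under the l-d curve
u_u = d[-1]                                         # ultimate deflection
g_f = (w_f + m_beam * g * u_u) / (b * (h - a0))
print(f"g_f = {g_f:.0f} j/m^2")    # of the order of 10^4 j/m^2 for these inputs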
the mixture contained a high volume of cement and silica fume, and the water-to-binder ratio was 0.18. in this study, the strain rate and the fibre volume fraction (i.e., the fibre content) were selected as the main test variables. the high-strength steel fibres used in this study were 13 mm in length and 0.22 mm in diameter (see figure 3). the fibres were straight, with a tensile strength of 2800 mpa. the high tensile strength of the fibres was chosen in order to achieve the pull-out failure mode. the pull-out failure mode (see figure 4a) is a much more energy-consuming mode than the fibre failure mode (see figure 4b). straight fibres also provided a good trade-off between workability and the mechanical properties of the resulting mixture. as shown in table 2, the compressive strength measured on cylinders 200 mm in height and 100 mm in diameter was around 150 mpa. the compressive strength did not vary with increasing fibre content. however, the uniaxial tensile strength, the modulus of rupture and the splitting tensile strength showed a linear dependence on the actual fibre content (see table 2). the maximum tensile strength was determined to be 11.7 mpa when the fibre content was 3 % by volume [9]. the uhpfrc mixture was placed in the moulds along the length of the beam, and this caused the fibres to be aligned along the length of the beam [10]. this led to fibre alignment in the direction of the tensile stress. no other technique was used to align the fibres. all beams were tested 28 days after casting, in order to avoid the effect of ageing, which may also influence the results [11].

4. experimental program

experiments were performed on beams 100 × 100 × 550 mm in size, with a clear span of 500 mm. the beams had a notch in their bottom edge, 30 mm in height and 5 mm in width (figure 5). three different fibre volume fractions were tested, covering 1 %, 2 % and 3 % of fibre volume content. each fibre volume fraction was tested under quasi-static conditions and under increased strain-rate conditions. quasi-static conditions were simulated by a deformation controlled test with a cross-head speed of 0.2 mm/min. this speed corresponded to a strain rate of 5.6 × 10⁻⁶ s⁻¹, which is considered as quasi-static [12, 13]. an increased strain rate was simulated by the greatest possible motion of the cross-head of the hydraulic testing machine used in this study. the cross-head developed a speed of 200 mm/min, corresponding to a strain rate of 5.6 × 10⁻³ s⁻¹. this level of strain rate is typical for dynamic loading, e.g., for earthquakes. during the experimental program, the force acting on the beam and the deflection measured by two lvdt (linear variable differential transformer) sensors were recorded with 5 hz and 1 khz frequency during quasi-static loading and increased strain rate loading, respectively. steel yokes were implemented in the experimental setup as mounts for the lvdt sensors, in order to subtract the settlement of the supports from the measured deflections [14].

figure 5: experimental setup.
figure 6: load–deflection diagrams for uhpfrc beams under various strain rates and with various fibre contents.

5. results and discussion

load–deflection (l–d) diagrams were plotted for all beams, including the various fibre volume fractions tested under the two strain rates.
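for bookkeeping, the test matrix implied by the experimental program (and confirmed by the specimen count stated below) is easy to enumerate; a minimal sketch:

# enumeration of the test matrix: 3 fibre contents x 2 strain rates x 3 beams
from itertools import product

fibre_contents = ("1 %", "2 %", "3 %")
strain_rates = (5.6e-6, 5.6e-3)          # s^-1: quasi-static and increased
replicates = (1, 2, 3)

specimens = list(product(fibre_contents, strain_rates, replicates))
print(len(specimens))                    # -> 18 beams in total
for vf, rate, k in specimens[:3]:
    print(f"beam {k} | v_f = {vf} | strain rate = {rate:.1e} s^-1")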
three beams were tested for each fibre content and each strain rate, making a total of 18 tested beams. the zuz-200 hydraulic testing machine, with a closed loop deformation control system and a maximal capacity of 100 kn, was used for both loading rates. the tests were deformation controlled, based on the movement of the cross-head, which was either 0.2 mm/min for quasi-static loading or 200 mm/min for the increased strain rate (see figure 6).

several authors have suggested that for low strain rates the fracture energy is constant, and is therefore not dependent on the loading rates [15, 16]. however, in our study it was found that for increasing fibre contents, and also for increasing strain rates, the effective fracture energy also increases (see figure 7). this is because in the case of quasi-static loading the crack propagates along the line with the least resistance, which leads to minimal fracture energy. in the case of increased strain rates, the crack does not have enough time to seek the lowest energy consumption path, and goes straight through the beam, which is a more energy-consuming procedure [17]. other authors have suggested that the rate effect for low strain rates can also be attributed to viscous effects, which mainly originate due to the presence of free water in voids and in porous structures [18]. for higher fibre contents, the sensitivity to strain rates decreases due to the group effect of fibres that interact together. a denser fibre concentration decreases the pull-out capacity of the fibres, because the matrix surrounding the fibre will not be sufficient to keep the interfacial bonding as strong as in the single pull-out case [17]. when the pull-out capacity is lower, the sensitivity to loading rates is also lower. this is indicated by the dynamic increase factor (dif), as shown in table 3 and figure 8.

figure 7: development of the effective fracture energy.
figure 8: development of the dynamic increase factor.

table 3: effective fracture energy $g_f$ [j/m²] of the uhpfrc (standard deviations in parentheses).
fibre content | strain rate 5.6 × 10⁻⁶ s⁻¹ | strain rate 5.6 × 10⁻³ s⁻¹ | dif
1 % | 12000 (2100) | 15300 (3200) | 1.28
2 % | 17900 (1200) | 20300 (1500) | 1.13
3 % | 25300 (700) | 26900 (1900) | 1.06

the effective fracture energy values indicated in table 3 are averages from three beams. the value in parentheses gives the standard deviation from the tested beams.
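the dif column of table 3 follows directly from the mean values in the two strain-rate columns; a quick recomputation:

# recomputing the dif column of table 3 from the tabulated mean values of g_f
g_f = {  # fibre content: (quasi-static, increased strain rate), j/m^2
    "1 %": (12000, 15300),
    "2 %": (17900, 20300),
    "3 %": (25300, 26900),
}
for vf, (gf_qs, gf_inc) in g_f.items():
    print(f"{vf}: dif = {gf_inc / gf_qs:.3f}")
# -> 1.275, 1.134, 1.063, i.e. the 1.28, 1.13, 1.06 of table 3 after rounding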
6. conclusions and further outlook

the effective fracture energy was determined on a total of 18 beams, which were tested under various strain rates and with various fibre contents. the fibre volume fraction ranged from 1 % to 3 % by volume, and the strain rate was either 5.6 × 10⁻⁶ s⁻¹ or 5.6 × 10⁻³ s⁻¹. the following conclusions can be drawn on the basis of the experimental outcomes derived from this study:

(1.) the effective fracture energy ($g_f$) increases as the fibre volume fraction increases. higher scatter in the experimental outcomes was observed for lower fibre contents. the pull-out capacity of a fibre in a higher fibre volume content is lower than the pull-out capacity of a fibre in a lower fibre volume content. each fibre plays a more significant role in terms of $g_f$ in beams with a lower fibre content, because its pull-out capacity is higher. thus each mismatch in the fibre distribution in lower fibre contents will scatter the results more.

(2.) $g_f$ is dependent on the strain rate. it was verified experimentally that $g_f$ increases as the strain rate increases. in addition, it was found that higher fibre volume fractions attenuate the dependence of $g_f$ on the strain rate. this is because when there is a higher fibre volume fraction, the maximum pull-out capacity of each individual fibre is reduced; its sensitivity to higher strain rates is therefore also reduced.

(3.) two strain rates were considered in this study, both of which can be classified as low strain rates. the increased strain rate was simulated as the maximum speed developed by the hydraulic testing machine. it is important to note that making proper measurements of the effective fracture energy is a fairly complex and highly time-consuming task. this is the main reason why no experimental work was performed at other strain rate levels. it is therefore highly desirable to extend the scope of the study presented here, and to verify the effective fracture energy of fibre-reinforced cementitious composites under higher strain rates, above 10⁻² s⁻¹.

acknowledgements

the authors gratefully acknowledge the support provided by the czech science foundation under project number gap 105/12/g059. the authors would like to acknowledge michaela kostelecká, from the klokner institute, for her assistance with the microscopic investigation of the fibres used in this study. the authors would also like to acknowledge the assistance given by the technical staff of the experimental centre, faculty of civil engineering, ctu in prague, and by students who participated in the project.

references
[1] millon, o., riedel, w., mayrhofer, c., thoma, k.: fiber-reinforced ultra-high performance concrete – a material with potential for protective structures. proceedings of the first international conference of protective structures, manchester, 2010, p. 013.
[2] máca, p., sovják, r., konvalinka, p.: mix design of uhpfrc and its response to projectile impact. int. j. impact eng., 2013. doi:10.1016/j.ijimpeng.2013.08.003
[3] rilem draft recommendation: determination of the fracture energy of mortar and concrete by means of three-point bend tests on notched beams. mater. struct., 18, 1985, pp. 285–290.
[4] bažant, z. p., kazemi, m. t.: size dependence of concrete fracture energy determined by rilem work-of-fracture method. int. j. fract., 51, 1991, pp. 121–138. doi:10.1007/bf00033974
[5] hu, x., wittmann, f.: fracture energy and fracture process zone. mater. struct., 25, 1992, pp. 319–326. doi:10.1007/bf02472590
[6] maca, p., zatloukal, j., konvalinka, p.: development of ultra high performance fiber reinforced concrete mixture. ieee symposium on business, engineering and industrial applications (isbeia), ieee, 2012, pp. 861–866. doi:10.1109/isbeia.2012.6423015
[7] máca, p., sovják, r., konvalinka, p.: mixture design and testing of ultra high performance fiber reinforced concrete. malaysian journal of civil engineering, 25 special issue (1), 2013, pp. 74–87.
[8] sovják, r., vogel, f., beckmann, b.: triaxial compressive strength of ultra high performance concrete. acta polytechnica, 53, 2013. doi:10.14311/ap.2013.53.0901
[9] máca, p., sovják, r., vavřiník, t.: experimental investigation of mechanical properties of uhpfrc. procedia engineering, 65, 2013, pp. 14–19. doi:10.1016/j.proeng.2013.09.004
[10] fornůsek, j., tvarog, m.: influence of casting direction on fracture energy of fiber-reinforced cement composites. key eng. mat., 594, 2014, pp. 444–448. doi:10.4028/www.scientific.net/kem.594-595.444
[11] Holčapek, O., Vogel, F., Vavřiník, T., Keppert, M.: Time progress of compressive strength of high performance concrete. Applied Mechanics and Materials, 486, 2014, pp. 167–172. doi:10.4028/www.scientific.net/amm.486.167.
[12] Li, Q., Reid, S., Wen, H., Telford, A.: Local impact effects of hard missiles on concrete targets. Int. J. Impact Eng., 32, 2005, pp. 224–284. doi:10.1016/j.ijimpeng.2005.04.005.
[13] Beckmann, B., Hummeltenberg, A., Weber, T., Curbach, M.: Strain behaviour of concrete slabs under impact load. Struct. Eng. Int., 22, 2012, pp. 562–568. doi:10.2749/101686612x13363929517893.
[14] Banthia, N., Trottier, J.: Test methods for flexural toughness characterization of fiber reinforced concrete: some concerns and a proposition. ACI Mater. J., 92, 1995, doi:10.14359/1176.
[15] Birkimer, D. L., Lindemann, R.: Dynamic tensile strength of concrete materials. ACI Journal Proceedings, ACI, 1971, doi:10.14359/11293.
[16] Schuler, H., Mayrhofer, C., Thoma, K.: Spall experiments for the measurement of the tensile strength and fracture energy of concrete at high strain rates. Int. J. Impact Eng., 32, 2006, pp. 1635–1650. doi:10.1016/j.ijimpeng.2005.01.010.
[17] Tran, T. K., Kim, D. J.: High strain rate effects on direct tensile behavior of high performance fiber reinforced cementitious composites. Cement and Concrete Composites, 45, 2014, pp. 186–200. doi:10.1016/j.cemconcomp.2013.10.005.
[18] Zhang, X., Ruiz, G., Yu, R., Tarifa, M.: Fracture behaviour of high-strength concrete at a wide range of loading rates. Int. J. Impact Eng., 36, 2009, pp. 1204–1209. doi:10.1016/j.ijimpeng.2009.04.007.

Limit on UHE neutrino fluxes from the Pierre Auger Observatory

S. Maldera (INAF – Osservatorio Astrofisico di Torino, C. Fiume 4, Torino, Italy; corresponding author: maldera@to.infn.it), for the Pierre Auger Collaboration (Av. San Martín Norte 304, (5613) Malargüe, Prov. de Mendoza, Argentina; full author list: http://www.auger.org/archive/authors_2012_06.html)

Acta Polytechnica 53(Supplement):756–759, 2013. doi:10.14311/ap.2013.53.0756. © Czech Technical University in Prague, 2013. Available online at http://ojs.cvut.cz/ojs/index.php/ap

Abstract. The surface detector of the Pierre Auger Observatory is sensitive to ultra-high-energy (UHE) neutrinos. Neutrinos of all flavors can interact in the atmosphere, producing inclined showers near the ground. Moreover, ultra-high-energy Earth-skimming tau neutrinos can be observed through the detection of showers induced by the decay of tau leptons created by interactions in the Earth's crust. In both cases, neutrino showers can be identified through the time structure of the signals in the surface detector stations.
Two sets of identification criteria have been designed to search for down-going and up-going neutrinos in the recorded data, with no candidates found. We discuss the identification criteria used, and we present the corresponding limits on the diffuse and point-source neutrino fluxes.

Keywords: UHE neutrinos, Pierre Auger Observatory.

1. Introduction

All the proposed models for the origin of ultra-high-energy cosmic rays (UHECR, as cosmic rays with E > 10¹⁸ eV are usually named) predict a flux of high-energy neutrinos, mainly via charged pion decay following interactions on matter and radiation. Such interactions can occur at the acceleration site itself ("astrophysical neutrinos") or during propagation in the background radiation field ("cosmogenic neutrinos") [1, 2]. The detection of UHE neutrinos would open a new observation window on the universe.

The Pierre Auger Observatory is located in the province of Mendoza, Argentina, at a mean altitude of 1400 m a.s.l. It uses two different techniques to detect air showers: an array of about 1660 water Cherenkov detectors, placed at a distance of 1.5 km from each other, samples the particles at ground level over an area of about 3000 km² (surface detector, SD [3]), while a fluorescence detector (FD [4]) observes the ultraviolet light emitted by atmospheric nitrogen excited by the particles of the shower. The FD consists of 27 telescopes located at four sites at the edges of the ground array. As it operates only during moonless nights, it has a duty cycle of ∼ 15 %, while the SD has a duty cycle of almost 100 %.

Although the main goal of the Auger Observatory is the detection of extensive air showers produced by UHECRs, it has good detection and identification capability for neutrinos with energies above 10¹⁷ eV. At such energies, neutrinos of all flavors can interact in the atmosphere, inducing a "down-going" (DG) shower that can be detected at ground. In addition, Earth-skimming (ES) tau neutrinos can undergo charged-current interaction inside the Earth, generating a tau lepton that can emerge and decay in the atmosphere, giving an upward-going air shower. Even if tau neutrinos are not predicted to be produced at astrophysical sources, the flux at Earth will be equally distributed between all neutrino flavors due to neutrino oscillations.

The neutrino search is based on the data from the surface detector. Each SD station consists of a cylindrical polyethylene tank, 3.6 m in diameter and 1.2 m tall, containing 12 tons of purified water. Three large-photocathode photomultipliers detect the Cherenkov light emitted by relativistic particles crossing the water volume, and their signals are digitized with a sampling rate of 40 MHz. Two different trigger modes are implemented in the stations: a simple threshold trigger, and a time-over-threshold trigger (ToT) requiring that at least 13 samples are over a lower threshold within a sliding window of 3 µs (120 samples).
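A hedged sketch of the ToT condition just described, under the stated numbers (13 samples above threshold within a 120-sample window at 40 MHz); the trace format and names are illustrative, not the Auger trigger implementation itself:

```python
# Sliding-window time-over-threshold check: return True if any window of
# 120 consecutive samples (3 us at 40 MHz) holds at least 13 samples above
# a given low threshold. 'trace' is a list of ADC-like sample values.
def tot_trigger(trace, threshold, window=120, min_samples=13):
    above = [1 if s > threshold else 0 for s in trace]
    count = sum(above[:window])          # occupancy of the first window
    if count >= min_samples:
        return True
    for i in range(window, len(above)):
        count += above[i] - above[i - window]   # slide the window by one sample
        if count >= min_samples:
            return True
    return False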
2. Identification of ν candidates

Neutrino-induced showers have to be extracted from a large background of cosmic-ray showers (protons or heavier nuclei). At large zenith angles (θ > 75°), nucleonic showers at the ground are dominated by the muonic component, as the electromagnetic one is almost completely absorbed in the atmosphere ("old" showers). On the other hand, neutrino-induced showers (DG or ES) initiated near the detector are rich in the electromagnetic component ("young" showers). The key point for the neutrino discrimination is therefore the selection of deeply interacting (young) inclined showers.

At very large zenith angles (above 90° in the ES case) the standard event reconstruction algorithms are not reliable; therefore, in order to select very inclined events, some more general shower characteristics are exploited: the shape of the shower footprint at ground and the apparent ground speed. The pattern of the triggered SD stations for an inclined shower is an elongated ellipse, with the major axis (L) along the shower arrival direction and the minor axis (W) perpendicular to that direction (see [5] for the details of the definitions of L and W). A very inclined event has a large L/W ratio. The ground speed is defined as v_ij = d_ij/Δt_ij, where d_ij is the distance between two stations participating in the event, and Δt_ij is the difference between the start times of the signal in the two stations. The average ground speed (obtained by averaging over all the pairs of stations in the event) is close to the speed of light for a horizontal shower, while it is much larger in the case of a vertical event.

A set of cuts (listed in Tab. 1) on L/W, the average signal speed (V) and its dispersion (σ_V) has been implemented for the DG and ES channels, to select showers in the zenith-angle ranges 75° < θ < 90° and 90° < θ < 96°, respectively. In addition, we require at least 3 and 4 triggered stations for the ES and DG channels, respectively, and in the case of the DG analysis a further cut on the reconstructed zenith angle is used.

The selection of events with a large electromagnetic component is based on the time structure of the signals recorded by the ground detectors. A shower with a significant electromagnetic component at the ground produces a signal in the triggered SD stations which is broader in time (hundreds of nanoseconds) than the one given by "old", muon-dominated showers (tens of nanoseconds). The main observable is the ratio of the integrated signal charge collected by the photomultipliers to its peak height (area over peak, AoP), normalized to that of isolated muons (periodically collected for calibration and monitoring purposes), which is sensitive to the time spread of the signal. In addition, stations with a broad signal usually satisfy the time-over-threshold (ToT) local trigger condition.

The selection criteria have been optimized using Monte Carlo simulations to estimate the expectations from neutrino-induced showers, while the background is estimated using a subsample of the SD data (training data), to take into account the actual primary cosmic-ray composition and all possible detector effects that may not be reproduced by the simulations. In the Earth-skimming analysis, the selection of young showers is done by applying a cut on the fraction of stations satisfying the ToT trigger condition and having AoP > 1.4. In the down-going case, the discrimination is performed using the Fisher discriminant method [6] to improve the separation of neutrinos and background. The variables used are the AoP of the first (time-ordered) 4 triggered stations and some combinations of them (their squares and their product), and the difference in AoP between "early" and "late" stations.
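A minimal sketch of the footprint observables defined above (mean ground speed and its dispersion over all station pairs); the input format is an assumption made for illustration:

```python
# Mean ground speed <V> and dispersion sigma_V over all station pairs.
# 'stations' is a list of ((x, y), t_start) with positions in metres and
# signal start times in nanoseconds, so speeds come out in m/ns
# (the speed of light is ~0.3 m/ns, matching the Table 1 cuts).
from itertools import combinations
from math import dist, sqrt

def ground_speed_stats(stations):
    speeds = []
    for (p_i, t_i), (p_j, t_j) in combinations(stations, 2):
        if t_i != t_j:                                   # skip degenerate pairs
            speeds.append(dist(p_i, p_j) / abs(t_j - t_i))  # v_ij = d_ij / |dt_ij|
    v_mean = sum(speeds) / len(speeds)
    sigma_v = sqrt(sum((v - v_mean) ** 2 for v in speeds) / len(speeds))
    return v_mean, sigma_v
```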
The cut on the Fisher discriminant is fixed by requiring 1 background event in 20 years.

                   Down-going (DG)                 Earth-skimming (ES)
N. stations        ≥ 4                             ≥ 3
θ_rec              > 75°                           –
L/W                > 3                             > 5
V                  < 0.313 m ns⁻¹                  0.29 m ns⁻¹ < V < 0.31 m ns⁻¹
σ_V                σ_V/V < 0.08                    σ_V < 0.08 m ns⁻¹
ν discrimination   Fisher discriminant             fraction of stations with ToT
                   based on AoP                    trigger and AoP > 1.4 greater than 60 %

Table 1. Criteria used for the selection of inclined events (upper part) and for the neutrino discrimination (lower part).
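For readability, the inclined-event selection of Table 1 can be restated as code; the cut values are quoted from the table, while the event fields are hypothetical names:

```python
# Geometric selection of inclined events per Table 1 (upper part only; the
# subsequent neutrino discrimination via the Fisher discriminant or the
# ToT/AoP fraction is not sketched here). Speeds are in m/ns.
def passes_inclined_selection(ev, channel):
    if channel == "DG":      # down-going
        return (ev["n_stations"] >= 4 and ev["theta_rec"] > 75.0
                and ev["L_over_W"] > 3.0 and ev["v_mean"] < 0.313
                and ev["sigma_v"] / ev["v_mean"] < 0.08)
    if channel == "ES":      # Earth-skimming
        return (ev["n_stations"] >= 3 and ev["L_over_W"] > 5.0
                and 0.29 < ev["v_mean"] < 0.31 and ev["sigma_v"] < 0.08)
    raise ValueError(channel)
```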
In the Earth-skimming case the training data subsample goes from 1 Nov 2004 to 31 Dec 2004, while it goes from 1 Jan 2004 to 31 Oct 2007 for DG events (as DG ν-induced showers are more similar to cosmic-ray showers, more statistics is required for the evaluation of the background). The remaining events (search sample) were not used until all the selection algorithms were defined, and only at that point were they "unblinded".

3. Exposure and limits to diffuse flux

The neutrino selection criteria were applied to the search samples (from 1 Jan 2004 to 31 May 2010 for ES, and from 1 Nov 2007 to 31 May 2010 for DG) for both the down-going and Earth-skimming searches, resulting in zero candidates being found. To be able to give an upper limit on the neutrino flux, we need to compute the neutrino exposure of the surface detector. This is obtained by folding the array aperture with the efficiency of the neutrino identification (given by the combination of the trigger probability of the array and the selection cuts) and the ν interaction probability, and integrating over time. For down-going neutrinos the identification efficiency depends on the neutrino flavor and interaction channel, its energy, the incident zenith angle θ, and the depth of the interaction. In the case of Earth-skimming ν_τ it depends mainly on the neutrino energy and on the height above ground of the shower induced by the tau decay.

The exposure is computed by means of Monte Carlo simulations, taking into account the time evolution of the SD array. The first interaction between a neutrino and a nucleon is simulated with the HERWIG package [7], the development of the shower in the atmosphere with AIRES [8], and the response of the SD detectors using the Auger simulation framework. In the case of ν_τ CC interactions, the decay of the τ lepton is simulated with TAUOLA [9]. For the DG ν_τ channel a detailed description of the topography around the detector was used to estimate the contribution from ν_τ interacting in the Andes. The accumulated exposures corresponding to the search periods for the Earth-skimming and down-going analyses are shown in Fig. 1.

Figure 1. Exposure of the surface detector for ES and DG neutrino-induced showers, corresponding respectively to 3.5 and 2 years of the complete and fully efficient surface detector (the deployment of the SD was concluded by the end of 2008).

                Earth-skimming                          Down-going
k               < 3.2 × 10⁻⁸ GeV cm⁻² s⁻¹ sr⁻¹          < 1.7 × 10⁻⁷ GeV cm⁻² s⁻¹ sr⁻¹
Energy range    ∼ 0.16 < E_ν [EeV] < 20.0               ∼ 0.1 < E_ν [EeV] < 100.0

Table 2. 90 % C.L. upper limits to the single-flavor neutrino flux and the corresponding energy ranges, for the Earth-skimming and down-going searches.

The main sources of uncertainty for DG neutrinos come from the ν-induced shower simulation and the hadronic models used (+9 %, −33 %), and from the neutrino cross-sections (±7 %). For ES neutrinos the dominant uncertainties come from the tau energy losses (+25 %, −10 %), shower simulation (+20 %, −5 %), and topography (+18 %, 0 %).

Assuming a neutrino flux f(E_ν) = k E^(−2) and complete neutrino mixing (flavor ratio 1:1:1), the upper limits to the single-flavor neutrino flux are derived using a semi-Bayesian extension [10] of the Feldman–Cousins approach [11], which allows one to include in the limits the systematic uncertainties on the exposure cited above. The 90 % C.L. limits are shown in Tab. 2, together with the energy range in which they are derived. The resulting flux upper limits are also shown in Fig. 2, together with the results from the IceCube neutrino observatory [12] and the ANITA experiment [13], and flux predictions for cosmogenic neutrinos under different assumptions [14, 15]. The limits on the differential flux are also shown, assuming a spectrum of the form E^(−2) in each energy bin.

Figure 2. Integrated (solid lines) and differential (dashed lines) 90 % C.L. upper limits to the single-flavor diffuse flux of down-going ν and Earth-skimming ν_τ, as a function of neutrino energy. Limits from the IceCube [12] and ANITA [13] experiments and predictions for cosmogenic neutrinos under different assumptions [14, 15] are also shown.

4. Limits to point-like sources

As the neutrino search is limited to large zenith angles (75° to 96°), a point-like source can be seen only for a fraction of the sidereal day, depending on its declination. In fact, a point in the sky with declination δ and right ascension α at a given sidereal time t is seen at the observatory latitude (λ = −35.2°) with a zenith angle θ given by

cos θ = sin λ sin δ + cos λ cos δ sin(2πt/T − α),   (1)

where T is the duration of the sidereal day (a numerical sketch of this visibility geometry is given at the end of this section). The declination range accessible with this analysis is between −65° and 55°, while the regions near the terrestrial poles are not observable. The point-source exposure is evaluated in a similar way to the diffuse exposure, but without integrating over the solid angle [16].

Assuming a differential flux f(E_ν) = k_PS E^(−2) and a 1:1:1 neutrino flavor ratio, the upper limits to k_PS are derived as a function of the source declination, in the same way as in the diffuse case. The 90 % C.L. upper limits for the DG and ES analyses are shown in Fig. 3 as a function of declination. In both analyses there is a region of about 100° in declination where the sensitivity is almost constant, and the limits on k_PS are at the levels of ≈ 5 × 10⁻⁷ GeV cm⁻² s⁻¹ for the ES analysis and ≈ 2.5 × 10⁻⁶ GeV cm⁻² s⁻¹ for the DG analysis. The shapes of the declination-dependent limits are determined mainly by the fraction of time during which a source is within the zenith-angle range of the DG and ES searches. The upper limits are derived for neutrino energies from 1.6 × 10¹⁷ to 2.0 × 10¹⁹ eV and from 1 × 10¹⁷ to 1 × 10²⁰ eV for the Earth-skimming and the down-going analyses, respectively.

In Fig. 4 we show the limits on k_PS for the particular case of the active galaxy Centaurus A (δ = −43°), a potential acceleration site for UHECRs. Neutrino flux predictions for three different models of UHE ν production in the jets and in the core of Cen A are also shown [17–19]. The expected number of events for a flux as in [17] is about 0.1 for the ES and 0.02 for the DG searches, respectively; they are about one order of magnitude smaller for the flux predicted in [18].
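The promised sketch of the visibility geometry of Eq. (1): a simple numerical scan over one sidereal day (our own illustration, not the collaboration's exposure calculation), assuming the latitude λ = −35.2° quoted above:

```python
# Fraction of a sidereal day during which a source at declination 'delta'
# is seen within a given zenith-angle window, from Eq. (1).
import numpy as np

LAT = np.radians(-35.2)          # Auger Observatory latitude

def visible_fraction(delta_deg, theta_min, theta_max, alpha=0.0, n=100_000):
    delta = np.radians(delta_deg)
    t = np.linspace(0.0, 1.0, n)                 # t/T over one sidereal day
    cos_theta = (np.sin(LAT) * np.sin(delta)
                 + np.cos(LAT) * np.cos(delta) * np.sin(2 * np.pi * t - alpha))
    theta = np.degrees(np.arccos(cos_theta))
    return float(np.mean((theta > theta_min) & (theta < theta_max)))

# e.g. Centaurus A (delta = -43 deg) in the down-going window 75-90 deg:
print(visible_fraction(-43.0, 75.0, 90.0))       # ~0.16 of a sidereal day
```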
Figure 3. Upper limits (90 % C.L.) on the integral flux of single-flavor neutrinos from a point-like source as a function of the source declination.

Figure 4. Upper limits (90 % C.L.) on the integral flux of neutrinos from Centaurus A, together with the limits from other experiments (IceCube [20], LUNASKA [21]) in different energy ranges, and theoretical predictions.

References

[1] F. Halzen et al., Reports on Progress in Physics 65, (2002) 1025
[2] L. Anchordoqui, T. Montaruli, Annual Review of Nuclear and Particle Science 60, (2010) 129–162
[3] I. Allekotte et al., Nuclear Instruments and Methods in Physics Research Section A 586, (2008) 409
[4] J. Abraham et al., the Pierre Auger Collaboration, Nuclear Instruments and Methods in Physics Research Section A 620, (2010) 227
[5] J. Abraham et al., the Pierre Auger Collaboration, Physical Review D 79, (2009) 102001
[6] R. Fisher, Annals of Eugenics 7, (1936) 179
[7] G. Corcella et al., JHEP 101, (2001) 10
[8] S. Sciutto, AIRES: a system for air shower simulations, arXiv:astro-ph/9911331
[9] S. Jadach et al., Computer Physics Communications 76(3), (1993) 361–380
[10] J. Conrad et al., Physical Review D 67, (2003) 12002
[11] G. J. Feldman, R. D. Cousins, Physical Review D 57, (1998) 3873
[12] R. Abbasi et al., IceCube Collaboration, Physical Review D 83, (2011) 092003
[13] P. Gorham et al., ANITA Collaboration, Physical Review D 82, (2010) 022004
[14] M. Ahlers et al., Astroparticle Physics 34, (2010) 106
[15] K. Kotera et al., Journal of Cosmology and Astroparticle Physics 10, (2010) 013
[16] P. Abreu et al., Pierre Auger Collaboration, Astrophysical Journal Letters 755, (2012) L4
[17] A. Cuoco et al., Physical Review D 78, (2008) 023007
[18] M. Kachelriess et al., New Journal of Physics 11, (2009) 065017
[19] L. A. Anchordoqui, arXiv:1104.0509 [hep-ph]
[20] R. Abbasi et al., the IceCube Collaboration, Astrophysical Journal 732, (2011) 18
[21] C. W. James et al., Monthly Notices of the Royal Astronomical Society 410, (2011) 885–889

Comparison between atmospheric turbidity coefficients of desert and temperate climates

Hamdy K. Elminir, U. A. Rahuma, V. Benda

Acta Polytechnica Vol. 41 No. 2/2001

Knowledge of the solar radiation available on the Earth's surface is essential for the development of solar energy devices and for estimating their performance efficiencies. For this purpose it is helpful to study the attenuation of direct normal irradiance by the atmosphere in terms of fundamental quantities, including the optical thickness, relative optical air mass, water vapor content, and aerosol amount. In the present article we will not deal with cloudy atmospheres, because of their great variability in space and time, but will focus our attention on atmospheres characterized by the complete absence of condensed water. The objectives of this article are to report data on the aerosol optical depth and atmospheric turbidity coefficients for a desert climate, and to compare them with those of a temperate climate. The aerosol optical depth, the Linke turbidity factor T_L, and the Ångström turbidity coefficient β are calculated from broadband filter measurements at Helwan, Egypt, which has a desert climate. A linear regression model is determined between the Linke factor and the Ångström turbidity coefficient. This relation is compared with similar relations reported for a temperate climate (Prague, Czech Republic). The comparison is made to determine whether a universal relation exists between these two important coefficients, or whether the relation is location dependent.

Keywords: energy demand, environmental management, industrial wastes, sulfur dioxide, carbon dioxide, aerosol optical depth, Linke turbidity factor, Ångström turbidity coefficient.

1 Introduction

The increase in terrestrial applications of solar radiant energy has given impetus to the study of solar energy availability in many areas of the world. When passing through the Earth's atmosphere, extraterrestrial solar radiation is subjected to attenuation due to scattering by air molecules and aerosols, and due to absorption by various atmospheric components, mainly ozone, water vapor, oxygen and carbon dioxide. The extinction of the radiation is strongly dependent on the state of the sky, whether cloudy or not, the cleanliness of the atmosphere, and the amount of gaseous absorbers.
Theoretical analysis of the attenuation of solar radiation passing through clouds requires a great deal of information regarding the instantaneous thickness, position and number of layers of clouds, as well as their optical properties. However, for the technological utilization of solar energy, a study of solar radiation under cloudless skies is very important, particularly for solar systems using concentrators. The attenuation of radiation through a real atmosphere versus that through a clean dry atmosphere gives an indication of the atmospheric turbidity. Several atmospheric turbidity coefficients have been introduced during the past decades in order to quantify the influence of the atmospheric aerosol content on the direct radiation received at the Earth's surface. The most currently used are the Linke turbidity factor, T_L [1], and the Ångström turbidity coefficient, β [2]. The Linke turbidity factor refers to the whole spectrum, i.e., the overall spectrally integrated attenuation, which includes the presence of gases, water vapor and aerosols, and indicates the number of ideal (clean and dry) atmospheres that would produce the same extinction of the extraterrestrial solar beam as the real atmosphere. The Ångström turbidity coefficient, on the other hand, is obtained from spectral measurements and is an indication only of the amount of aerosols in the atmosphere.

In the present work the Helwan site was used as a sampling station for collecting atmospheric aerosol samples over several years. The examined concentrations were compared with other data representative of source areas influencing the Helwan atmosphere. A substantial part of the anthropogenic emissions of primary particles in Helwan are fly-ash particles from solid fuel combustion and inorganic particles from iron and steel production, cement production and a variety of industrial processes. The question of the existence and tracer power of regional elemental characteristics reflecting the structure of emission sources at a given location has been treated in a number of publications. As summarized in some reference papers [3, 4, 5], single-element tracers or ratios of elemental concentrations can be used for studying the nature of the major emission sources in the region, as well as for pinpointing the source areas of aerosols transported to the site of observation.

Rizk H. F. et al. [6] studied the effect of pollutant aerosols on the spectral atmospheric transmissivity in Cairo, using a Volz sun photometer in the period July 1981–June 1982. They found that the annual losses in solar energy absorption in Cairo due to pollutant aerosols, relative to a dust-free atmosphere, were 37 %, 21 %, and 19 % for the blue, green and red bands, respectively. Also, the radiation loss due to pollutant aerosols is strongly wavelength dependent: shorter wavelengths are much more seriously affected than longer wavelengths. Fathy A. M. [7] found that the turbidity factor had reached three times the value found before industries came to Helwan; the pollution reduced the integrated ultraviolet direct solar radiation by 50 %, due to cement exhaust in the atmosphere. Rahoma U. A. [8] revealed a decrease of direct solar radiation by 30–45 % in comparison with the results of 1922–1927, and by 20 % in comparison with 1967; moreover, the intensity of direct solar radiation was about 50 % lower than the extraterrestrial solar radiation.
At the Prague site, which is not particularly influenced by local industrial processes, the primary inorganic aerosol particles may not account for more than a few percent (5–10 %) of the total particle mass. Moreover, due to more efficient emission controls, the concentrations of calcium and other inorganic primary-particle components have been decreasing substantially over the last decades [9, 10].

In this work we document the general tendency of the atmospheric turbidity by means of the variations, at selected wavelengths, of the aerosol optical depth and its spectral characteristics during the measured period, together with a short statistical analysis. All this analysis gives a good representation of the aerosol turbidity characteristics in our study areas. In Section 2 we briefly review the experimental measurements, and the procedures are revised in Section 3. A summary of the factors affecting atmospheric turbidity is given in Section 4, and the results are discussed in Section 5.

2 Apparatus and measurements

The measuring apparatus was installed on the terrace of the research laboratory at the National Research Institute of Astronomy and Geophysics in Helwan, Egypt (latitude 29° 52' N, longitude 31° 20' E), located on a hilltop site about 30 km south of Cairo in desert surroundings. In this study, the broadband filter method was used to measure quantities of normal radiation in different bands. The filters used in this study are Schott filters (2 mm thick), whose cutoff wavelengths were determined using a spectrophotometer. These filters were arranged on a rotatable disk and mounted on an Eppley normal-incidence pyrheliometer. Their main characteristics (interval bands in µm and reduction factor) are given in Table 1.
Total solar radiation intensity was monitored with a high-precision pyranometer, which is sensitive in the wavelength range from 300 to 3000 nm. Sky diffuse radiation was measured by a pyranometer equipped with a special shading device to exclude direct radiation from the sun. Due to the lack of measured tilted-surface solar radiation data, models were employed to estimate the radiation incident on a tilted surface from the measured horizontal radiation. The results of these calculations are tabulated and plotted against the angle of tilt for summer, winter and all-year-round intended use.

Table 1: Filter characteristics

Old name   Filter reference   Interval band [µm]   Filter factor, f
OG1        OG530              0.530–0.630          1.082
RG2        RG630              0.630–0.695          1.068
RG8        RG695              0.695–2.900          1.042
clear      –                  0.250–2.800          1.080

Meteorological instrumentation was used to provide the necessary information about the weather. This data was used to determine the stability class of the atmosphere, from which the rate of dust deposition was calculated [11]. The concentration of dust in the atmosphere was monitored by means of a portable air sampler. The physical design of this sampler is based on aerodynamic principles, which result in the collection of particles of 100 microns (Stokes equivalent diameter) and less. To measure the concentration, air was drawn into the sampler and, by virtue of their inertia, the particles were deposited on membrane filters. The filters were weighed before and after sampling to determine the mass collected. The weight was divided by the surface area from which it was collected to give the dust deposition density in µg/m².

The proton-induced X-ray emission analytical method was applied to deduce multielemental absolute concentration data on the elemental constituents of the samples.

2.1 Database preparation

A routine quality-control procedure described in [12] has been implemented for data from the Helwan station since its inception. A daily validation test was instituted to eliminate certain days from further consideration. Days that were mostly overcast were rejected, primarily because little beam radiation occurs on such days and, secondarily, because it is difficult to verify pyrheliometer data on such days. For this daily screening, only periods with solar altitudes greater than 6° were considered, in order to avoid the consequent refraction effects on tracking accuracy. Next, the hourly databases were subjected to three types of data checks, to identify missing data, data that clearly violate physical limits, and extreme data. Hours when the data were known to be "bad" or "missing" were omitted. Then, any hour with an observation that violated a physical limit or a conservation principle was eliminated from the data set, including reported hours with a diffuse fraction greater than 1, or with beam radiation exceeding the extraterrestrial beam radiation. The final data set was constructed from the measured data that passed all of the quality-control checks.
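A hedged sketch of the hourly screening just described (the record fields are illustrative names, not the station's actual data format):

```python
# Drop hours whose diffuse fraction exceeds 1 or whose beam radiation
# exceeds the extraterrestrial beam value, as in Section 2.1.
def passes_quality_control(hour):
    if hour["diffuse"] > hour["global_horizontal"]:       # diffuse fraction > 1
        return False
    if hour["beam"] > hour["beam_extraterrestrial"]:      # violates physical limit
        return False
    return True

hours = [
    {"diffuse": 120.0, "global_horizontal": 480.0,
     "beam": 650.0, "beam_extraterrestrial": 1320.0},     # plausible hour: kept
    {"diffuse": 510.0, "global_horizontal": 480.0,
     "beam": 650.0, "beam_extraterrestrial": 1320.0},     # diffuse fraction > 1: dropped
]
clean = [h for h in hours if passes_quality_control(h)]
print(len(clean))   # -> 1
```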
3 Mathematical formulations

The amount of aerosols present in the atmosphere in the vertical direction is represented in terms of the number of particles per cubic meter, or their mass in micrograms per cubic meter. However, it is more usual to represent the amount of aerosols by an index of turbidity. Two popular indices of turbidity are the Linke turbidity factor, T_L, and the Ångström turbidity coefficient, β. Both of these are used to quantify the influence of atmospheric aerosols on the direct solar radiation at the Earth's surface. In the following sections we present the mathematical approaches to evaluate T_L and β.

3.1 The Linke approach

The Linke turbidity factor, T_L, is an index of the number of clean dry atmospheres that would be necessary to produce the attenuation of the extraterrestrial radiation that is produced by the real atmosphere. The direct normal irradiance over the whole solar spectrum at the Earth's surface is expressed in terms of T_L by

I_n = E_0 I_sc exp(−τ_r T_L m_a),   (1)

in which I_sc is the solar constant, corrected by the eccentricity factor E_0 due to the variation in the Sun–Earth distance; τ_r is the spectrally integrated optical thickness of the clean dry atmosphere; and m_a is the relative optical air mass. m_a depends on the zenith angle θ_z, on the actual pressure p at the site, and consequently on the altitude. The following equation has been used for expressing the relative optical air mass in this work:

m_a = (p/1013.25) [cos θ_z + 0.15 (93.885 − θ_z)^(−1.253)]^(−1).   (2)

It can be seen from Equation 1 that the smallest value of T_L is 1, obtained when the atmosphere is fully clean and dry. The optical thickness of such an atmosphere is then equal to τ_r, which accounts for the attenuation due to scattering by air molecules (Rayleigh scattering) and absorption by ozone and other gaseous absorbers. According to Equation 1, the Linke turbidity factor can be derived from pyrheliometric measurements of the direct normal irradiance at ground level, I_n, as

T_L = (1/(τ_r m_a)) ln(E_0 I_sc / I_n).   (3)

The evaluation of T_L from I_n requires the knowledge of τ_r. The value of τ_r originally given by Feussner and Dubois, presented in [13] as

τ_r = (9.4 + 0.9 m_a)^(−1),   (4)

has generally been used for calculating the Linke turbidity factor in most recent work. In 1986, a determination of τ_r based on more accurate values of the spectral extraterrestrial solar irradiance and of the extinction coefficients of the various attenuators was carried out by Louche [14], who proposed the following algorithm to evaluate the optical thickness of the clean dry atmosphere from the relative air mass:

τ_r = (6.5567 + 1.7513 m_a − 0.1202 m_a² + 0.0065 m_a³ − 0.00013 m_a⁴)^(−1).   (5)

The values of τ_r obtained by using Equation 5 are clearly different from those obtained from Kasten's formula, Equation 4, and lead to T_L values that are also quite different. In Fig. 1 the optical thickness of the clean dry atmosphere obtained from Equations 4 and 5 is plotted. It is evident from this diagram that the older values of τ_r, represented by Kasten's formula, Equation 4, are lower than those obtained from Equation 5; these differences are larger when m_a < 5 than when m_a ≥ 5.

Fig. 1: Optical thickness of the clean, dry atmosphere computed by various authors (Kasten's and Louche's formulas, plotted against the optical air mass).
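A minimal sketch putting Equations (2)–(5) together to compute T_L from a pyrheliometric reading; the input values, including the eccentricity-corrected solar constant E_0 I_sc = 1353 W/m², are illustrative assumptions:

```python
# Linke turbidity factor from a direct-normal irradiance measurement.
from math import cos, log, radians

def air_mass(theta_z_deg, p_hpa=1013.25):                     # Eq. (2)
    return (p_hpa / 1013.25) / (cos(radians(theta_z_deg))
                                + 0.15 * (93.885 - theta_z_deg) ** -1.253)

def tau_r_louche(m):                                          # Eq. (5)
    return 1.0 / (6.5567 + 1.7513 * m - 0.1202 * m**2
                  + 0.0065 * m**3 - 0.00013 * m**4)

def linke_factor(i_n, theta_z_deg, e0_isc=1353.0, p_hpa=1013.25):  # Eq. (3)
    m = air_mass(theta_z_deg, p_hpa)
    return log(e0_isc / i_n) / (tau_r_louche(m) * m)

# e.g. I_n = 750 W/m^2 at a 48 deg zenith angle gives T_L ~ 3.5:
print(linke_factor(i_n=750.0, theta_z_deg=48.0))
```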
3.2 The Ångström approach

The Ångström turbidity coefficient is a dimensionless index that represents the amount of aerosols. It appears in an equation called the Ångström formula, aimed at determining the spectral optical thickness relative to aerosol scattering, τ_aλ:

τ_aλ = β λ^(−α),   (6)

in which λ is the wavelength (expressed in micrometers), and α is the wavelength exponent, which is representative of the aerosol size distribution. It is considered that α = 1.3 ± 0.2 is a reasonable average value [15], although Ångström had shown that in a polluted atmosphere, for instance after volcanic outbreaks or forest fires, α may be as low as 0.5 or less. The experimental determination of the Ångström turbidity coefficient β requires measurements of the spectral direct normal irradiance at wavelengths, e.g., 0.530 µm and 0.630 µm, in a part of the solar spectrum where absorption is negligible. The Ångström turbidity coefficient can vary from 0.0 for an absolutely clean atmosphere to about 0.5 for very high aerosol amounts.

The total amount of water vapor in the atmosphere in the vertical direction is highly variable and depends on the instantaneous local conditions. However, this amount, generally expressed as the precipitable water thickness w, can be readily computed from a number of standard routine atmospheric observations, such as the relative humidity r, the ambient temperature T, or the vapor pressure. The precipitable water vapor thickness can vary from 0.0 to 5 cm. Iqbal [16] has summarized some of the most commonly used methods of computing the precipitable water vapor thickness. In this study, Leckner's formula is used to obtain w:

w = 0.493 (r/T) exp(26.23 − 5416/T).   (7)
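A one-function sketch of Eq. (7) as reconstructed above, assuming the usual conventions for Leckner's formula (r as a fraction, T in kelvins, w in cm):

```python
# Precipitable water from relative humidity and air temperature, Eq. (7).
from math import exp

def precipitable_water(r, T):
    """w [cm] from relative humidity r (0-1) and temperature T (K)."""
    return 0.493 * (r / T) * exp(26.23 - 5416.0 / T)

print(precipitable_water(r=0.50, T=294.0))   # ~2 cm for a warm, humid day
```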
4 Site and climate dependence of solar energy collection

4.1 Helwan site details

A very common weather condition in Helwan is characterized by abundant sunshine (> 3000 hours per year), calm or light air, and increased humidity during the cold season. The daily average temperature in Helwan ranges between 35.2 °C in July and 13.1 °C in January. The relative humidity fluctuates between 39 % in June and 56 % in December, with visibility of about 5 km. This restricted visibility is the result of the presence of solid particles in the atmosphere, some of which act as condensation nuclei. During such weather conditions, the reduction of solar radiation is sometimes due to an increased quantity of water vapor, sometimes to the presence of increased quantities of aerosol particles, and sometimes to both of these influences. Table 2 and Fig. 2 show the annual variation of the mean monthly solar radiation and the climatological data for Helwan, respectively.

Over the last 40 years, air pollution has become a serious problem in Helwan. Air quality has decreased rapidly as a result of industrialization and an increase in the number of motor vehicles. The great number of automobiles traveling on the town's narrow roads has resulted in a significant loading of the atmosphere with both solid and gaseous pollutants. A brownish cloud of air pollution forms over Helwan; it can be seen by the naked eye. This cloud denotes a high level both of the NO₂ pollutant and of man-made aerosols. High concentrations of particulate matter are found in Helwan, emanating from natural sources such as desert dust and from industries such as cement plants. The local industry includes four factories located from Tura in the north to El-Tebeen in the south, engineering industries (an automobile factory, a pipe and tube factory) and an iron and steel works. The prevailing wind direction is from the N and NE, which represents about 50 % of the total; this leads to an important result, namely that the Tura and Helwan cement factories contribute 50 % of the pollution at the Helwan site. Fig. 3 demonstrates this fact and shows the close match between pure cement and the dust settling onto the surface of the flat plate at the Helwan station.

Dust fallout is a rather informative and well-recognized indicator of air pollution. It is measured by a routine deposition method, where a calibrated vessel is placed outdoors for one month to collect dust, and the collected sample is assessed gravimetrically. It should be noted that, according to the Environmental Protection Agency (EPA) regulations, the primary air standard is limited to less than 7 mg/m² per month.

Table 2: Annual variations in solar radiation data for Helwan

Month      G     D     I     F1    F2    F3
January    3.09  0.97  4.67  4.2   5.0   5.2
February   3.40  1.21  4.25  4.8   5.9   6.0
March      3.68  1.69  3.32  5.4   6.7   6.7
April      5.11  1.96  4.47  5.5   7.0   7.1
May        5.25  2.50  3.78  5.5   7.1   7.4
June       5.81  2.69  4.09  5.9   8.0   8.5
July       6.20  2.18  5.29  6.2   8.4   8.9
August     6.46  1.86  6.63  6.3   8.3   8.5
September  5.64  1.47  6.44  5.8   7.3   7.4
October    4.35  1.11  5.80  5.4   6.8   6.9
November   3.65  0.91  5.79  4.6   5.5   5.7
December   2.78  0.87  4.47  4.0   4.8   5.1
Average    4.62  1.62  4.92  5.3   6.73  6.95

where the monthly means in Table 2 are (all in kWh/m²/day):
G   global solar radiation on a horizontal surface,
D   diffuse solar radiation,
I   direct normal-incidence solar radiation,
F1  solar radiation on a flat plate facing south, tilted by the latitude angle of the site,
F2  solar radiation on a 1-axis tracking flat plate with a north–south axis,
F3  solar radiation on a 2-axis tracking flat plate.

Fig. 2: Annual variations in climatological data for Helwan (actual sunshine duration, maximum air temperature, clearness index, relative humidity, wind speed).

Fig. 3: Comparison between the dust deposition on the flat plate and pure cement (relative concentration of polluting materials, %).

As shown in Table 3 and Table 4, the dust fallout in the Helwan region is far beyond the air standard.

4.2 Prague site details

The Czech Republic is a hilly country. Sixty-six percent of the total area is at an altitude of up to 500 m above sea level, 33 % of the area lies between 500–1000 m above sea level, and 1 % lies more than 1000 m above sea level. The climate is temperate. The annual average air temperature is 6.5–8 °C, and the average annual precipitation amounts to 500–650 mm. Table 5 and Fig. 4 show the annual variation of the climatological data and the mean monthly solar radiation in Prague, respectively.

In the framework of former Czechoslovakia, the area currently called the Czech Republic was extremely industrial.
After the First World War and the formation of the first Czechoslovak Republic, the former industrial regions in central Bohemia, northern Bohemia, northern Moravia, etc., were further developed and upgraded.

Table 3: Classification of the world standard for dust fallout

Degree of pollution   Dust fallout [mg/m² month]
light                 < 7
medium                7–14
high                  14–35
very high             > 35

Table 4: Annual variation of the monthly dust fallout (mg/m² month) at selected sites in Helwan

Month      National Cement Co.   Portland Cement Co.   Tura Cement Co.   Helwan Observatory
January    252                   103                   327               21
February   215                   307                   311               43
March      183                   236                   422               31
April      147                   94                    204               55
May        320                   112                   1222              94
June       138                   38                    243               13
July       68                    47                    236               23
August     75                    82                    116               25
September  42                    48                    178               36
October    47                    50.7                  269               12
November   50                    40                    189               19
December   53                    50                    287               26
Average    132.5                 100.64                333.67            33.17

Table 5: Climatological data for Prague [lat. 50° 04', long. 14° 25']

      Jan.   Feb.   Mar.   Apr.    May     Jun.    Jul.    Aug.    Sep.    Oct.   Nov.   Dec.
T     0.30   4.50   5.40   12.90   17.10   19.70   17.10   20.80   14.90   12.10  6.50   2.20
W     17.40  16.10  50.50  40.30   40.80   28.00   53.60   31.70   25.30   65.00  24.70  9.20
S     61.00  85.80  80.40  205.00  285.00  282.00  108.00  252.00  145.00  83.10  71.40  42.00
DS    4.00   3.00   2.00   0.00    0.00    0.00    0.00    0.00    0.00    0.00   0.00   2.00
WS    2.36   2.84   1.99   1.79    1.68    1.50    1.57    1.54    1.56    1.92   1.72   2.87
R     86.90  96.00  94.10  96.70   90.20   93.90   94.10   94.30   94.60   93.50  95.10  95.40

where:
T   ambient air temperature [°C],
W   average precipitable water [mm],
S   actual sunshine duration [hours],
DS  depth of the snow cover [cm],
WS  wind speed [m s⁻¹],
R   relative humidity [%].
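A tiny helper restating the Table 3 classification (assuming the bands are contiguous, with the "high" band read as 14–35 mg/m² month):

```python
# Classify a monthly dust fallout value (mg/m^2 month) per Table 3.
def dust_pollution_degree(fallout):
    if fallout < 7:
        return "light"
    if fallout <= 14:
        return "medium"
    if fallout <= 35:
        return "high"
    return "very high"

# e.g. the Helwan Observatory annual average from Table 4:
print(dust_pollution_degree(33.17))   # -> "high", far beyond the air standard
```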
ninety percent of the sulphur dioxide generated over the period 1980 to 1998 arose from the combustion of fossil fuels in the energy sector. the annual mass emission of sulphur dioxide has fallen over this period, due to the reductions in output from coal and petroleum-fired plants. the decline in sulphur dioxide emissions arising from petroleum is attributed to the reduced demand for this product, coupled with the reduced consumption of high sulphurbearing fuel oil by the manufacturing and power generating industries, and the dramatic increase in the demand for the light low sulphur bearing fuels used in the transport sector. the annual so2 emissions in the czech republic declined from 1867 to 598 thousand tonnes, i.e., more than 32% reduction, which is equivalent to a 1.9 % annual fall. yearly average means of dust fallout range from 2 to 5 gm�2/month in southern and southwestern parts of prague to 10 gm�2/month in the center of the city and industrial zones. local peak values are probably due to nearby building activities, local sources of pollution, heavy transport and secondary dust pollution. the average dust fallout in 1995 was 6.83 gm�2/month. the level of dust fallout in prague has been reduced considerably in the last decade, as shown in fig. 6. other details concerning potential sources of air pollution and their location in the czech republic are mentioned in a czech moss survey [19]. © czech technical university publishing house http://ctn.cvut.cz/ap/ 53 acta polytechnica vol. 41 no. 2/2001 0 1 2 3 4 5 6 7 8 9 10 11 12 jan. feb. mar. apr. may jun. jul. aug. sep. oct. nov. dec. 0 1 2 3 4 5 6 fixed tilt = 0 fixed tilt = lat. 15 fixed tilt = lat. fixed tilt = lat. +15 fixed tilt = 90 fig. 4: annual variation in estimated average daily radiation for a flat plate collector facing south for latitude 50°, and a ground reflectance of 0.2 [17] 0 50 100 150 200 250 300 350 88 89 90 91 92 93 94 95 96 øeporyje libuš radotín al�írská antala staška rytíøská klementinum svornosti p b fig. 5: lead concentrations in airborne particulate matter at selected sites in prague [18] tatrovka 0 10 20 30 40 50 60 radotín podìbradská rytíøská legerova plynární 8384858687888990919293949596 fig. 6: dust fallout on selected sites in prague between 1983 and 1996 [19] 5 discussion of results 5.1 solar radiation reduction by aerosol in the area of helwan one way of estimating the atmospheric pollution in the town is by comparing values of total solar radiation measured in the town and values measured outside the town during a day or number of days characterized by clear sky. regarding the global solar radiation data for helwan and cairo, it is found that the helwan annual mean value was 5.48 kwh/m2/day, which is higher than the cairo value, which was 5.03 kwh/m2/day. the total suspended particles (tsp) annual mean value for helwan was 960 �g/m3 as against 583 �g/m3 for cairo. also, the smoke annual mean value for helwan was 52 �g/m3, while it was 132 �g/m3 for cairo, as shown in table 6. table 6: annual mean values of g, tsp and smoke for cairo and helwan region g [kwh/m2/day] tsp [�g/m3] smoke [�g/m3] cairo 5.03 583 132 helwan 5.48 960 52 table 6 shows that the global solar radiation value is higher for helwan than for cairo by 8.2 %, the tsp value is higher for helwan than for cairo by 55 %, and the smoke value is higher for cairo than helwan by 61 %. 
We found that the higher level of global solar radiation for Helwan than for Cairo is due to the presence of a higher value of TSP, which contains particles of large size such as sand, calcium and iron (see Fig. 7). The presence of these particles tends to scatter the beam solar radiation into diffuse solar radiation, which is added to the global solar radiation value and compensates for the reduction caused by the decrease in direct solar radiation. In Cairo, by contrast, it is the smoke that makes the absorption of beam radiation greater than its scattering; this leads to a reduction in the global solar radiation value. The realistic reduction of direct solar radiation due to the presence of large quantities of aerosol particles in the atmospheric mass covering Helwan is about 43 %, the reduction of global radiation is 19 %, and the increase in diffuse radiation is 72 %.

The clearness index, K_i, is another parameter that can describe the state of the atmospheric mass from the point of view of aerosols. K_i is usually defined as the ratio of the global horizontal irradiation G to the extraterrestrial horizontal irradiance G_o. The clearness index is generally less than unity because of extinction by air molecules (Rayleigh scattering) and by suspended solid or liquid particles (aerosols) [20]. An alternative presentation is to consider the ratio of the beam normal irradiation, I_bn, to the extraterrestrial normal irradiation, I_on. This ratio should probably be called the "beam transmittance of the atmosphere", and can be correlated against the clearness index. Correlations of this form have a special intuitive appeal, since one expects the beam transmittance to increase monotonically with the clearness index. A pioneering effort with this format is the work of Boes E. C. et al. [21]. However, the variations of this factor with wavelength are generally unknown, especially in tropical conditions. The beam transmittance may be expected to follow a Bouguer's-law dependence on the atmospheric extinction coefficient k and the air mass m, such that

τ_b = exp(−k m).   (8)

Table 7 shows the annual variation of the mean monthly values of the atmospheric transparency factor exp(−k m) for selected interval bands (b1 0.290–0.530 µm, b2 0.530–0.630 µm, b3 0.630–0.695 µm, and b4 0.695–2.800 µm) [22].

Fig. 7: Results of the proton-induced X-ray emission analysis of polluting particles at Helwan (relative concentrations of Na, Al, Si, S, Cl, Ca, K, Ti, Fe, Zn, Ce, Mg; March–August).

Table 7: Monthly mean of the extinction coefficient (k) and the transmissivity (τ_b) due to aerosols for Helwan

Band        Jan.  Feb.  Mar.  Apr.  May   Jun.  Jul.  Aug.  Sep.  Oct.  Nov.  Dec.
b1   k      0.32  0.36  0.47  0.41  0.48  0.39  0.24  0.38  0.39  0.26  0.36  0.48
     τ_b    0.55  0.55  0.46  0.58  0.48  0.59  0.71  0.57  0.58  0.64  0.55  0.40
b2   k      0.41  0.42  0.59  0.54  0.73  0.56  0.49  0.48  0.49  0.41  0.49  0.50
     τ_b    0.43  0.52  0.38  0.48  0.42  0.47  0.52  0.50  0.51  0.49  0.41  0.35
b3   k      0.36  0.41  0.47  0.41  0.59  0.45  0.38  0.39  0.35  0.41  0.42  0.45
     τ_b    0.49  0.51  0.49  0.58  0.51  0.54  0.61  0.56  0.61  0.52  0.46  0.39
b4   k      0.21  0.25  0.36  0.31  0.45  0.31  0.24  0.24  0.26  0.23  0.27  0.33
     τ_b    0.64  0.65  0.56  0.66  0.59  0.65  0.72  0.70  0.69  0.67  0.59  0.49
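A small helper relating the two rows of Table 7 through Eq. (8); the air mass m = 1 below is an illustrative choice, since the tabulated τ_b values correspond to the actual air masses of the observations, which are not listed:

```python
# Transparency from the extinction coefficient, and the inverse, per Eq. (8).
from math import exp, log

def transparency(k, m=1.0):
    """tau_b = exp(-k m)."""
    return exp(-k * m)

def extinction_coefficient(tau_b, m=1.0):
    """Invert Eq. (8): k = -ln(tau_b) / m."""
    return -log(tau_b) / m

print(transparency(0.32))             # band b1, January k, at unit air mass
print(extinction_coefficient(0.55))   # band b1, January tau_b, at unit air mass
```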
5.2 Determination of different atmospheric turbidity parameters

Stagnating air led to a polluted atmosphere over Helwan during the measurement period. The early morning and late afternoon observations correspond to periods for which there were smaller temperature and relative humidity variations. Thus, the air temperature varied only slightly, from 19 to 21 °C, while the humidity remained almost constant at 62–58 % within the same period. After this time interval, the air temperature increased steadily to reach 28 °C by midday, while the relative humidity fell steadily to about 40 %. Light winds of 1.3–2 m s⁻¹ were blowing in the early morning, strengthening later to about 3–3.5 m s⁻¹. Such patterns were observed over the whole measurement period.

Atmospheric turbidity coefficients were obtained from pyrheliometric measurements of direct solar radiation. The monthly variation of the atmospheric turbidity coefficients was calculated from their respective monthly values and is listed in Appendices A and B. The seasonal (summer and winter) and annual average values of T_L and β were also computed from Appendix A and are presented in Appendix C. All of them show similar evolutions, with higher values in summer and lower values in winter. We note that the autumn and spring values are rather high, and closer to the summer values than to the winter values. The higher values in summer are due to the higher average precipitable water during those months (see Fig. 8). The atmospheric humidity strongly influences the large scatter shown in this diagram, an aspect that can be explained in the following manner. Consider an instant when the relative humidity and the atmospheric turbidity are both high. As the solar altitude increases, the increasing sunshine will evaporate the liquid particles of the aerosols, which will decrease the turbidity and increase the precipitable water. Now consider an instant when the relative humidity and the atmospheric turbidity are both low. As the solar altitude increases, the increased sunshine evaporates water from the soil and the river, thereby increasing the precipitable water, and forms liquid particles in the atmosphere, which contribute to increased turbidity.

Fig. 8: Precipitable water vapour for the Helwan and Prague sites.

5.3 Relationship between the Linke factor and the Ångström coefficient

The plot of the 108 computed values of β versus T_L for both sites (Fig. 9) shows a linear relationship. Using a linear regression technique, the following models were found from the measured data, with correlation coefficients equal to 0.84 and 0.71, respectively:

β = −0.194933 + 0.0620059 T_L   (Helwan),
β = −0.162108 + 0.0449825 T_L   (Prague).   (9)

The relations of Equation (9) are similar to the model reported for Avignon, France,

β = −0.103 + 0.052 T_L,   (10)

and to the relationship of Hinzpeter for Potsdam, Germany [23],

β = −0.100 + 0.050 T_L.   (11)

Fig. 9: Plot of β versus T_L for both sites (Ångström turbidity coefficient against Linke turbidity factor, Prague and Helwan).

Equations (9–11) indicate that the linear regression model fitted to β versus T_L for Helwan, a desert climate, is similar to the models reported for Prague, Czech Republic, Avignon, France, and Potsdam, Germany, all temperate climates. The coefficients of the Helwan model are different from the approximately equal coefficients of the Prague, Avignon and Potsdam models.
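The fit behind Equation (9) can be sketched with ordinary least squares; the five (T_L, β) pairs below are made-up stand-ins for the 108 measured values:

```python
# Linear regression of beta against T_L, of the form beta = a + b * T_L.
import numpy as np

t_l = np.array([3.0, 4.2, 5.1, 6.3, 7.0])       # hypothetical inputs
beta = np.array([0.00, 0.07, 0.13, 0.19, 0.24])

b, a = np.polyfit(t_l, beta, 1)                 # slope, intercept
r = np.corrcoef(t_l, beta)[0, 1]                # correlation coefficient
print(f"beta = {a:.4f} + {b:.4f} * T_L   (r = {r:.2f})")
# Coefficients of roughly this size (negative intercept, slope ~0.05-0.06)
# are what Eqs. (9)-(11) report for the measured data sets.
```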
Conclusions

The results obtained with the proton-induced X-ray emission analytical procedure, and the statistical evaluations of the data sets collected and presented above, provide new information on the aerosol load of the Egyptian and Czech atmospheres. The nature of the contributing sources has been investigated, and some attempts have been made to indicate the role played by neighboring regions in determining the air quality at the sites mentioned. The study of an experimental set of direct spectral irradiances measured during the test period enables the retrieval of the spectral aerosol optical depth and its use as a turbidity index at selected wavelengths. The difficulties in making a direct comparison of our data with other data sets at other measurement sites are mainly due to the use of different instruments and techniques. However, the obtained values of these turbidity parameters serve as reference data for estimating the climatological behavior of atmospheric aerosols in these areas of study. The variation in the monthly average values of β and T_L at Helwan shows a similar trend to that of Prague; however, Helwan shows higher values of the atmospheric turbidity coefficients than Prague, due to the influence of the desert climate.

References

[1] Linke, L.: Transmission Koeffizient und Trübungsfaktor. Beitr. Atmos., 1922, Vol. 10, pp. 91
[2] Ångström, A.: Techniques of determining the turbidity of the atmosphere. Tellus, 1961, Vol. 13, pp. 214
[3] Mosalam, M. A.: Solar radiation and air pollution in Cairo. Proceedings of the 3rd Arab International Solar Energy Conference, Baghdad, Iraq, 1988, pp. 53
[4] Mosalam Shaltout, M. A., Ghoniem, M. M., Hassan, A. H.: Measurements of the air pollution effects on the color portions of solar radiation at Helwan, Egypt. Renewable Energy, Vol. 9, No. 1–4/1996, pp. 1279
[5] Mosalam Shaltout, M. M., Hassan, A. H., Rahoma, U. A.: Measurements of suspended particles and aerosols in the atmosphere of Helwan, Egypt. Proceedings of the 6th International Conference on Energy and Environment, Cairo, Egypt, 1998, Vol. 2, pp. 567
[6] Rizk, H. F., Farag, S. A., Ateia, A. A.: Effect of pollutants on spectral atmospheric transmissivity in Cairo. Environment Int., 1986, Vol. 11, pp. 243
[7] Fathy, A. M.: Ultraviolet solar radiation at Helwan and its dependence on atmospheric conditions. M.Sc. thesis, Faculty of Science, Helwan University, 1992, Ch. 5
[8] Rahoma, U. A.: Atmospheric transparency at Helwan from solar radiation measurements and its correlation with air pollution. M.Sc. thesis, Faculty of Science, Helwan University, 1992, pp. 101
[9] Hedin, L., Granat, L., Likens, E., Buishand, A.: Steep declines in atmospheric base cations in regions of both Europe and North America. Nature, 1994, Vol. 36, pp. 351
[10] Heintzenberg, J., Müller, K., Birmili, W., Spindler, G., Widensohler, A.: Mass related aerosol properties over the Leipzig basin. J. Geophys. Res., 1998, Vol. 103, pp. 13125
[11] Draxler, R.: Estimating vertical diffusion from routine meteorological measurements. Atmospheric Environment, 1979, Vol. 13, pp. 1559
[12] Phan, C. N.: Procedures for quality control and analysis of data from a solar meteorological monitoring station. M.Sc. thesis, School of Mechanical Engineering, Georgia Institute of Technology, 1980
[13] Kasten, F.: A simple parameterization of the pyrheliometric formula for determining the Linke turbidity factor. Meteor. Rdsch. 33, 1980, pp. 124–127
[14] Louche, A., Peri, G., Iqbal, M.: An analysis of Linke turbidity factor. Solar Energy, Vol. 37, No. 6/1986, pp. 393
[15] fröhlich, c.: extraterrestrial solar radiation, course on physical climatology for solar and wind energy. international center for theoretical physics, trieste, italy, 1986
[16] iqbal, m.: an introduction to solar radiation. academic press, new york, 1983
[17] elminir, h. k., rabab, h. a., fathy, a. m., benda, v.: solar radiation on tilted south oriented surfaces – i: validation of transfer model. 6th international conference of solar energy stored and applied photochemistry, solar'01, cairo, egypt, 2001
[18] svoboda, k., čermák, j., hrtman, m.: source of heavy metal emissions in the czech republic and ways of decreasing emissions – emissions from combustion of coal and wastes, part i. ochrana ovzduší, vol. 10, no. 4/1998, pp. 6
[19] sucharová, j., suchara, i.: results of the international biomonitoring program 1995. research institute of ornamental gardening (vuoz), průhonice, 1998, pp. 183
[20] casiniere, a. l., grenier, j. c., cabot, t., werneck-faga, m.: altitude effect on the clearness index in the french alps. solar energy, 1993, vol. 51, pp. 93
[21] boes, e. c., anderson, h. e., hall, i. j., prairie, r. r., stromberg, r. t.: availability of direct, total and diffuse solar radiation to fixed and tracking collectors in the u.s.a. report no. sand 77-0885, sandia national laboratories, 1977
[22] mosalam shaltout, m. m., tadros, m. t., el-metwally, m.: studying the extinction coefficient due to aerosol particles at different spectral bands in some regions at great cairo. renewable energy, 2000, vol. 19, pp. 597
[23] katz, m., baille, a., mermier, m.: atmospheric turbidity in a semi-rural site – i: evaluation and comparison of different atmospheric turbidity coefficients. solar energy, vol. 28, no. 4/1982, pp. 323

eng. hamdy kamal elminir
phone: +420 2 2435 2212
fax: +420 2 2435 3949
e-mail: ehamdy@hotmail.com

doc. ing. vítězslav benda, csc.
e-mail: benda@feld.cvut.cz

department of electrotechnology
czech technical university in prague
faculty of electrical engineering
technická 2, 166 27 praha 6, czech republic
appendix (a): monthly mean variations of the linke turbidity factor, tl, and the ångström turbidity coefficient, β, for different bands (b1, b2, b3) in the visible range at helwan; rows are local mean time.

january / february

time   jan tl (b1 b2 b3)   jan β (b1 b2 b3)    feb tl (b1 b2 b3)   feb β (b1 b2 b3)
 9     5.7 5.6 5.0         0.04 0.15 0.13      4.9 4.6 3.8         0.03 0.10 0.10
10     5.7 5.9 5.6         0.06 0.15 0.16      5.4 5.1 4.5         0.06 0.18 0.13
11     6.3 7.6 5.1         0.09 0.22 0.15      5.8 6.3 4.9         0.08 0.18 0.18
12     6.2 6.7 5.6         0.09 0.20 0.17      6.0 6.8 5.9         0.09 0.19 0.21
13     6.2 6.6 5.7         0.09 0.19 0.18      5.8 6.3 5.9         0.08 0.23 0.19
14     6.0 7.2 6.1         0.08 0.21 0.19      6.7 6.6 5.7         0.12 0.21 0.20
15     6.4 6.5 5.7         0.09 0.18 0.17      7.1 6.6 6.4         0.13 0.18 0.20
16     5.7 5.9 5.2         0.06 0.15 0.15      6.9 6.9 6.8         0.11 0.16 0.20
17     5.4 5.6 4.2         0.03 0.12 0.10      6.0 5.3 4.6         0.06 0.11 0.13

march / april

time   mar tl (b1 b2 b3)   mar β (b1 b2 b3)    apr tl (b1 b2 b3)   apr β (b1 b2 b3)
 9     3.7 4.1 3.6         0.01 0.08 0.90      5.7 6.5 4.8         0.09 0.21 0.16
10     4.5 6.3 4.4         0.03 0.18 0.13      6.8 7.1 5.6         0.15 0.25 0.21
11     4.5 6.1 3.6         0.04 0.18 0.11      5.9 7.1 6.5         0.12 0.26 0.25
12     4.6 6.2 3.5         0.04 0.19 0.11      6.2 7.4 6.4         0.14 0.28 0.26
13     5.1 7.1 3.8         0.07 0.23 0.12      7.3 8.2 6.6         0.19 0.32 0.27
14     5.6 6.7 3.1         0.08 0.21 0.10      6.7 8.7 6.4         0.16 0.33 0.26
15     5.1 6.1 4.7         0.06 0.18 0.15      6.6 8.5 6.9         0.15 0.32 0.27
16     4.9 5.8 5.7         0.04 0.16 0.18      6.7 7.2 6.1         0.14 0.25 0.23
17     4.3 5.0 3.6         0.05 0.11 0.10      7.4 7.2 5.9         0.15 0.23 0.20

may / june

time   may tl (b1 b2 b3)   may β (b1 b2 b3)    jun tl (b1 b2 b3)   jun β (b1 b2 b3)
 9     6.4 7.0 5.8         0.13 0.24 0.21      5.9 6.8 4.8         0.13 0.25 0.19
10     6.3 6.8 5.8         0.15 0.25 0.23      5.8 6.9 5.1         0.14 0.27 0.22
11     7.2 7.2 5.8         0.20 0.28 0.24      5.9 7.5 5.0         0.16 0.31 0.22
12     7.2 7.4 5.9         0.21 0.30 0.25      6.9 7.7 5.2         0.21 0.33 0.24
13     6.2 7.3 6.4         0.16 0.30 0.28      6.2 7.5 4.5         0.18 0.32 0.21
14     6.7 7.6 6.3         0.18 0.31 0.27      6.2 8.1 5.9         0.18 0.34 0.27
15     6.8 7.5 6.2         0.18 0.30 0.26      6.2 8.0 5.9         0.17 0.33 0.26
16     7.2 7.9 6.2         0.19 0.30 0.24      6.8 8.1 5.9         0.18 0.32 0.25
17     7.5 7.8 6.1         0.18 0.28 0.23      6.7 7.8 6.0         0.16 0.29 0.23
appendix (a) – continued

july / august

time   jul tl (b1 b2 b3)   jul β (b1 b2 b3)    aug tl (b1 b2 b3)   aug β (b1 b2 b3)
 9     5.7 6.4 4.5         0.15 0.23 0.18      6.1 6.1 4.6         0.12 0.21 0.17
10     5.8 7.1 5.3         0.14 0.28 0.23      6.0 5.9 4.4         0.14 0.22 0.18
11     6.1 7.6 4.8         0.17 0.31 0.22      6.4 6.9 5.2         0.17 0.28 0.23
12     6.3 7.3 5.2         0.18 0.31 0.24      6.3 6.0 5.1         0.18 0.28 0.23
13     6.0 7.8 5.6         0.17 0.34 0.26      6.2 7.0 5.4         0.17 0.29 0.24
14     6.1 7.8 5.5         0.17 0.33 0.26      6.2 7.4 5.6         0.17 0.31 0.25
15     5.9 7.7 5.7         0.16 0.32 0.25      6.4 7.5 5.7         0.17 0.03 0.25
16     6.1 7.2 5.2         0.15 0.28 0.22      6.5 6.8 5.8         0.16 0.26 0.24
17     6.3 7.9 4.2         0.14 0.29 0.17      7.4 7.0 5.7         0.18 0.24 0.22

september / october

time   sep tl (b1 b2 b3)   sep β (b1 b2 b3)    oct tl (b1 b2 b3)   oct β (b1 b2 b3)
 9     6.2 6.1 4.8         0.01 0.18 0.16      5.8 5.5 4.3         0.07 0.14 0.12
10     5.8 6.0 5.0         0.01 0.19 0.18      5.7 6.0 5.4         0.08 0.18 0.17
11     6.4 6.9 4.3         0.14 0.24 0.15      6.1 6.3 5.6         0.11 0.20 0.19
12     6.2 6.9 5.5         0.13 0.25 0.21      6.2 7.5 6.0         0.11 0.25 0.21
13     6.4 7.0 5.8         0.14 0.26 0.22      6.1 7.2 4.9         0.11 0.24 0.17
14     6.6 .2  6.2         0.15 0.26 0.24      6.1 7.0 5.5         0.11 0.23 0.19
15     6.9 7.4 6.4         0.15 0.26 0.24      6.6 7.2 5.2         0.12 0.24 0.17
16     7.4 7.3 8.2         0.16 0.25 0.31      7.0 7.1 6.0         0.13 0.22 0.20
17     7.1 7.1 8.7         0.14 0.22 0.30      6.0 6.6 5.4         0.10 0.18 0.16

november / december

time   nov tl (b1 b2 b3)   nov β (b1 b2 b3)    dec tl (b1 b2 b3)   dec β (b1 b2 b3)
 9     5.1 4.5 4.1         0.03 0.10 0.10      5.0 5.4 3.8         0.02 0.11 0.08
10     5.6 5.9 4.8         0.06 0.16 0.13      5.9 6.0 4.8         0.60 0.15 0.13
11     6.0 5.6 4.2         0.08 0.15 0.12      6.7 7.0 5.6         0.10 0.20 0.17
12     6.1 5.5 5.4         0.09 0.15 0.17      7.2 8.0 6.3         0.13 0.24 0.20
13     6.2 6.5 5.4         0.09 0.19 0.17      7.6 7.3 6.4         0.14 0.22 0.20
14     6.9 6.2 5.8         0.12 0.18 0.18      7.9 6.8 6.6         0.15 0.20 0.21
15     7.5 6.7 6.8         0.14 0.20 0.22      7.9 8.0 6.5         0.15 0.24 0.20
16     6.9 6.6 5.7         0.11 0.18 0.17      7.3 7.6 6.0         0.11 0.21 0.17
17     6.1 5.7 4.8         0.06 0.13 0.13      6.9 5.9 5.3         0.08 0.13 0.14

appendix (b): atmospheric turbidity parameters in the visible band at prague, czech republic

       jan.   feb.   mar.   apr.   may    jun.   jul.   aug.   sept.  oct.   nov.   dec.
tl     2.75   3.52   3.34   4.29   4.75   5.42   4.83   5.21   4.33   5.17   3.94   3.13
β      0.062  0.073  0.091  0.122  0.152  0.190  0.158  0.168  0.131  0.168  0.128  0.087

appendix (c): seasonal (summer and winter) and annual average values of tl and β for the helwan site

                     tl (b1 b2 b3)      β (b1 b2 b3)
summer (apr–sep)     6.45 7.26 5.87     0.15 0.27 0.23
winter (oct–mar)     6.02 6.32 5.17     0.09 0.18 0.17
annual mean          6.23 6.79 5.42     0.12 0.23 0.20
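the seasonal and annual means of the kind listed in appendix (c) are simple averages over the monthly values. as a minimal sketch, the python fragment below reproduces such averages for the prague series of appendix (b), with the summer/winter split following the april–september / october–march convention used above:

    import numpy as np

    # monthly mean turbidity parameters for prague (appendix b)
    months = ["jan", "feb", "mar", "apr", "may", "jun",
              "jul", "aug", "sep", "oct", "nov", "dec"]
    tl   = np.array([2.75, 3.52, 3.34, 4.29, 4.75, 5.42,
                     4.83, 5.21, 4.33, 5.17, 3.94, 3.13])
    beta = np.array([0.062, 0.073, 0.091, 0.122, 0.152, 0.190,
                     0.158, 0.168, 0.131, 0.168, 0.128, 0.087])

    summer = [months.index(m) for m in ("apr", "may", "jun", "jul", "aug", "sep")]
    winter = [i for i in range(12) if i not in summer]

    print("summer tl: %.2f  beta: %.3f" % (tl[summer].mean(), beta[summer].mean()))
    print("winter tl: %.2f  beta: %.3f" % (tl[winter].mean(), beta[winter].mean()))
    print("annual tl: %.2f  beta: %.3f" % (tl.mean(), beta.mean()))

for prague this gives clearly higher summer than winter means, consistent with the seasonal behaviour described in the text.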
acta polytechnica doi:10.14311/ap.2013.53.0728 acta polytechnica 53(supplement):728–731, 2013 © czech technical university in prague, 2013 available online at http://ojs.cvut.cz/ojs/index.php/ap

detection of a change of slope in the spectrum of heavy mass cosmic rays primaries by the kascade-grande experiment

a. chiavassa (a,*), w.d. apel (b), j.c. arteaga-velázquez (c), k. bekk (b), m. bertaina (a), j. blümer (b,d), h. bozdog (b), i.m. brancus (e), e. cantoni (a,f), f. cossavella (d,l), k. daumiller (b), v. de souza (g), f. di pierro (a), p. doll (b), r. engel (b), j. engler (b), m. finger (d), b. fuchs (d), d. fuhrmann (h), f. garino (d), h.j. gils (b), r. glasstetter (h), c. grupen (i), a. haungs (b), d. heck (b), j.r. hörandel (j), d. huber (d), t. huege (b), k.-h. kampert (h), d. kang (d), h.o. klages (b), k. link (d), p. łuczak (k), m. ludwig (d), h.j. mathes (b), h.j. mayer (b), m. melissas (d), j. milke (b), b. mitrica (e), c. morello (f), j. oehlschläger (b), s. ostapchenko (a,m), n. palmieri (d), m. petcu (e), t. pierog (b), h. rebel (b), m. roth (b), h. schieler (b), s. schoo (d), f.g. schröder (b), o. sima (l), g. toma (e), g.c. trinchero (f), h. ulrich (b), a. weindl (b), j. wochele (b), m. wommer (b), j. zabierowski (k)

a dipartimento di fisica, università degli studi di torino, italy
b institut für kernphysik, kit – karlsruher institut für technologie, germany
c universidad michoacana, instituto de física y matemáticas, morelia, mexico
d institut für experimentelle kernphysik, kit – karlsruher institut für technologie, germany
e national institute of physics and nuclear engineering, bucharest, romania
f osservatorio astrofisico di torino, inaf torino, italy
g universidade são paulo, instituto de física de são carlos, brasil
h fachbereich physik, universität wuppertal, germany
i department of physics, siegen university, germany
j dept. of astrophysics, radboud university nijmegen, the netherlands
k national centre for nuclear research, department of cosmic ray physics, lodz, poland
l department of physics, university of bucharest, bucharest, romania
m now at: max-planck-institut für physik, münchen, germany
n now at: university of trondheim, norway
* corresponding author: andrea.chiavassa@to.infn.it

abstract. kascade-grande is an extensive air shower experiment devoted to the study of cosmic rays in the 10^16 ÷ 10^18 ev energy range. the array comprises various detectors allowing independent measurements of the number of muons (nµ) and charged particles (nch) of extensive air showers (eas). these two observables are then used to study the primary energy spectrum, separating the events into two samples on the basis of the shower size ratio, corrected for attenuation in the atmosphere, ln nµ/ln nch. the two samples represent the light and heavy mass groups of the primaries. in the studied energy range, only the spectrum of heavy primaries shows a significant change of slope. the energy (estimated using the qgsjet ii hadronic interaction model) of this feature is in agreement with the expectations of a rigidity-dependent knee.

keywords: cosmic rays, energy spectrum, extensive air showers.

1. introduction

knee-like structures have been found in the spectra of the light and medium components of cosmic rays in the 10^15 ÷ 10^17 ev energy range [1]. these kinks produce the overall feature in the all-particle cosmic ray spectrum known as the knee, which was discovered by kulikov and khristiansen more than fifty years ago [2]. several hypotheses have been proposed to explain the origin of this feature [3–5]. a first class of models attributes this spectral feature to astrophysical mechanisms. the most favoured scenario explains this radiation by galactic sources (e.g. supernova remnants), and the spectral feature of the knee either by the limit of acceleration in galactic sources or by the limit of containment inside local magnetic fields. both scenarios predict the knee of the primary spectrum at an energy in agreement with the measured energy and scaling, for different elements, with the atomic number. alternative scenarios explain the knee by a change of the hadronic interactions originating the extensive air showers; these models predict an energy dependence of the knee on the primaries' mass number. both kinds of models predict that there should also be a knee-like structure in the energy spectrum of the heavy component of cosmic rays at about 10^17 ev. experiments that have studied the knee in the last decade have focused on the 10^14 ÷ 10^16 ev energy range, and have therefore not been able to detect the change in the slope of heavy primaries. the kascade-grande experiment was designed to address the problem of the iron knee by studying cosmic rays in the 10^16 ÷ 10^18 ev energy range [6]. measurements of the primary spectrum of at least two mass groups can only be obtained if the experiment has a high resolution for all the eas components needed to reach this goal. in kascade-grande, the observables used are the muon (nµ) and charged particle (nch, defined as the sum of the electrons and muons in the shower) numbers in the eas at detection level.
the ratio between the number of muons and charged particles has been identified as a variable with enough resolution to separate (with less than 10 % contamination) the two event samples. the technique is briefly described and the results are presented. the energy scale of the results depends on the hadronic interaction model used in the eas simulations (e.g. qgsjet ii [7]), while the spectral features are always detected in the electron-poor event sample (i.e. a high value of the above-mentioned ratio) irrespective of the exact value used to separate the events.

2. the kascade-grande experiment

the kascade-grande detector is located on the north campus of the karlsruhe institute of technology, germany (110 m a.s.l.). two arrays are the main components of the experiment; the layout is shown in fig. 1. the first detector, called grande, covers an area of 700 × 700 m^2 and uses 37 plastic scintillator detectors of 10 m^2 each to sample the charged particle density at different distances from the eas core. these measurements are used to estimate nch, the eas arrival direction and the core location. the second detector, i.e. the former kascade experiment array, comprises 252 stations of unshielded e/γ liquid scintillator detectors and shielded µ plastic scintillation detectors. this array, covering a smaller (200 × 200 m^2) surface, independently measures the muon number nµ. a complete description of the event reconstruction and of the achieved experimental resolutions can be found in [6]. in this short note, it is important to point out that nch and nµ are detected with 15 % and 25 % resolution, respectively.

figure 1. experimental layout of the kascade-grande experiment (x and y coordinates in m; the grande stations, the kascade array, the mtd and the cd are marked).

3. analysis and results

the analysis was performed on a particular subset of data with zenith angles (θ) below 40°, with reconstructed cores in a fiducial area of 1.52 × 10^5 m^2 inside the central region of grande, and with shower sizes log nch > 6.0 and log nµ > 5.0. for these selection criteria, full trigger and reconstruction efficiency is achieved at an energy of log(e/gev) ≥ 7.4. the present analysis is based on 1173 days of data taking, for an exposure of 2 × 10^13 m^2 s sr. the analysis technique is described in detail in [8]. using the constant intensity cut method (cic, described in [9]), the attenuation curves of nch and nµ in the atmosphere are obtained, and can be used to calculate the equivalent particle numbers at a reference zenith angle (θref = 21.5°, selected as the mean value of the events' zenith angle distribution). the cic method allows the eas propagation in the atmosphere to be taken into account in a model-independent way. having converted the measured nch(θ) and nµ(θ) to the values at the reference angle, the ratio

y_cic = ln nµ(θref) / ln nch(θref) (1)

is calculated. the values of the ratio y_cic for different primaries are then investigated using a full shower and detector simulation (sampling the primary energy on a power-law spectrum between 10^15 and 10^18 ev). the mean value of y_cic is almost energy-independent, and it increases with the mass of the primary nucleus, as shown in fig. 2 (for a qgsjet ii based simulation).

figure 2. mean values of y_cic for five primaries (from bottom to top: h, he, c, si, fe) vs. primary energy. error bars represent the rms of the y_cic distributions.
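as an illustration of the ratio-based separation described above, the following python sketch computes y_cic and classifies events; the cut value 0.84 and the exposure figures are those quoted in the text, while the flat-detector acceptance formula π sin^2 θ_max used in the cross-check is our own assumption:

    import numpy as np

    Y_CUT = 0.84   # separation value of the ratio used in the text

    def y_cic(n_mu_ref, n_ch_ref):
        # shower-size ratio of eq. (1): ln N_mu(theta_ref) / ln N_ch(theta_ref)
        return np.log(np.asarray(n_mu_ref)) / np.log(np.asarray(n_ch_ref))

    def classify(n_mu_ref, n_ch_ref, cut=Y_CUT):
        # electron-poor (heavy primaries) for y >= cut, electron-rich otherwise
        return np.where(y_cic(n_mu_ref, n_ch_ref) >= cut,
                        "electron-poor", "electron-rich")

    # cross-check of the quoted exposure: fiducial area x live time x acceptance
    area = 1.52e5                                     # m^2
    livetime = 1173 * 86400.0                         # s
    acceptance = np.pi * np.sin(np.radians(40.0))**2  # sr, flat-detector assumption
    print(f"exposure ~ {area * livetime * acceptance:.1e} m^2 s sr")  # ~2e13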
selecting events with y_cic ≥ 0.84, we separate the electron-poor and electron-rich (y_cic < 0.84) samples. heavy elements (si and fe) are representatives of the first group, and light elements (h and he) are representatives of the second. the fraction of misclassified events for protons and iron nuclei is lower than 15 % for energies above full efficiency, and does not depend on the reconstructed energy. the primary energy (log(e/gev)) is attributed to each event following a procedure based on the nch and nµ observables that is described in detail in [8]. the measured spectra of the light and heavy mass groups are shown in fig. 3. it can easily be observed that the spectrum of the heavy component shows a clear knee-like structure [8]. fitting the spectrum with a broken power-law expression [8], the change of the spectral slope is δγ = −0.48, from γ1 = 2.76 ± 0.02 to γ2 = 3.24 ± 0.05, with the break position at log(e/gev) = 7.92 ± 0.04. the statistical significance of this change of slope is 3.5σ. however, it should be pointed out that the light component of cosmic rays is still present at energies between 10^17 and 10^18 ev, though its relative abundance is smaller than that of the heavy component.

figure 3. reconstructed energy spectrum of the electron-poor and electron-rich components together with the all-particle spectrum for the angular range 0° < θ < 40°. the error bars show the statistical uncertainties.

4. conclusions

applying a cut based on the ratio between the muon and charged particle numbers (calculated at a reference zenith angle), the kascade-grande experiment is able (due to its unprecedented resolution in this energy range) to separate the measured events into samples generated by light and heavy primaries. the energy spectra of both components were measured; the energy spectrum of the heavy mass group shows a knee-like feature at an energy of log(e/gev) = 7.92 ± 0.04. this primary energy value depends heavily on the hadronic interaction model used for the shower simulation, e.g. qgsjet ii, while the knee-like structure in the spectrum is detected for different cut values of y_cic (i.e. different hadronic interaction models). the spectral steepening occurs at an energy where the charge-dependent knee of primary iron is expected, if the knee at about 3 ÷ 5 × 10^15 ev is assumed to be caused by a decrease in the flux of primary protons and/or helium nuclei.

acknowledgements

the authors would like to thank the members of the engineering and technical staff of the kascade-grande collaboration, who have contributed to the success of the experiment. the kascade-grande experiment is supported by the bmbf of germany, the miur and inaf of italy, the polish ministry of science and higher education, and the romanian authority for scientific research uefiscdi (pnii-idei grants 271/2011 and 17/2011).

references

[1] apel w. d. et al. (kascade collaboration): 2005, astropart. phys. 24, 1
[2] kulikov g. v., khristiansen g. b.: 1959, sov. phys. jetp 35, 441
[3] hörandel j. r.: 2004, astropart. phys. 21, 241–265
[4] peters b.: 1961, il nuovo cimento 22, 800; binns w. r. et al.: 2008, new astronomy reviews 52, 427–430
[5] ptuskin v. s. et al.: 1993, astron. astrophys. 268, 726; ptuskin v. s.: 2005, aip conf. proc. 745, 14
[6] apel w. d. et al.
(kascade-grande collaboration): 2010, nima 620, 202
[7] ostapchenko s. s.: 2006, nucl. phys. b (proc. suppl.) 151, 143–147; ostapchenko s. s.: 2006, phys. rev. d 74, 014026
[8] apel w. d. et al. (kascade-grande collaboration): 2011, prl 107, 171104
[9] kang d. et al. (kascade-grande collaboration): 2011, proc. of the 31st icrc, lodz (poland), id1044; arteaga-velázquez j. c. et al. (kascade-grande collaboration): 2011, proc. of the 31st icrc, lodz (poland), id805

acta polytechnica doi:10.14311/ap.2016.56.0472 acta polytechnica 56(6):472–477, 2016 © czech technical university in prague, 2016 available online at http://ojs.cvut.cz/ojs/index.php/ap

simulation of surface heating for arbitrary shape's moving bodies/sources by using r-functions

sergiy plankovskyy, olga shypul, yevgen tsegelnyk*, oleg tryfonov, ivan golovin

national aerospace university, "kharkiv aviation institute" named after n. ye. zhukovsky, faculty of aircraft engineering, department of aircraft manufacturing technologies, chkalova 17, 61070, kharkiv, ukraine
* corresponding author: y.tsegelnyk@gmail.com

abstract. the purpose of this article is to propose an efficient algorithm, based on methods of analytical geometry, for determining the zone of action of a heat source with a given motion law on a body of arbitrary shape. the solution of this problem is an important part of the modeling of laser, plasma and ion beam treatment. it can also be used for mass transfer problems, such as the simulation of coating, sputtering, painting, etc. the problem is solved by the r-functions method, which is used to define the shapes of the test body and the heat source and to determine the shadowed zones analytically. as an example, we consider the optimization of ion cleaning parameters under temperature limitations. the application of r-functions can significantly reduce the amount of computation required by ray tracing algorithms. the numerical realization of the proposed method requires an accurately built numerical mesh; the best results in terms of the accuracy of determining the zone of action of the source can be expected when adaptive tunable meshes are applied. if the r-functions are integrated into a cad system, the use of the proposed method is simple enough. the proposed method determines the zone of action of the source by an expression that is constructed only once for a body and a source of arbitrary geometric shape moving according to an arbitrary law; this distinguishes the proposed approach from all known ray tracing algorithms. the method can also be used for time-dependent multiple sources of arbitrary shape moving in different directions.

keywords: numerical methods; moving heat sources; ray tracing; r-functions method.

1. introduction

moving body/source heating analysis has applications in several manufacturing processes [1, 2]. in most well-known studies of this formulation, the problem is considered for a circular heat source moving in a straight line [3, 4]. however, there are papers that treat the temperature distribution in a half-space for moving heat sources of complex shape [5] or for sources moving according to more complex laws [6]. the problem of heating bodies of finite dimensions by moving heat sources is considered in fewer studies.
the objects of such studies are usually bodies of simple shape (cylinders or parallelepipeds). for example, a number of studies were carried out in [7] to investigate the temperature distribution in a rotating cylinder heated by a laser heat source. this problem may be associated with the calculation of laser hardening regimes, laser-assisted machining, etc. at the same time, it is often necessary to calculate the temperature of an arbitrarily shaped body heated by a moving heat source, where the law of motion and the shape of the heat source can be quite complex. such a problem is typical for cases when the body is heated by an energy flux (radiation, laser, ion or electron beam processing). in this case, the definition of the borders of the heated area is a complicated problem. in this paper, we propose an analytical method for the solution of this problem using r-functions. the geometric shapes of the body and the source, and the law of their relative motion, can be arbitrary. after the action zone of the heat source has been defined, the temperature is calculated numerically by a finite element method.

2. mathematical formulation

the problem of determining the action zone of a moving heat source is very similar to the problem of determining the location of shadow zones, which is typical in computer graphics. beginning with the studies [8, 9], such problems are solved by various ray tracing algorithms. this requires casting a large number of rays from points of the body surface in the direction of the radiation source, and thus a large amount of computation. at the same time, for bodies of complex shape, the precise definition of the shaded area presents considerable difficulties in the case of moving sources that change the intensity of the radiation. we assume that the geometry of the source and of the body is known. it can be independent of time, or change according to a known law. the source intensity may change arbitrarily in time and over its cross section. the law of the source movement is also known.

figure 1. description of geometric shapes with r-functions.

the solution of the problem is reduced to the construction of a switch-function, which is equal to 1 on the heated surface and to 0 on the rest of the surface of the part. for this purpose, it is enough to create a function with the following properties:

ω(x,y,z,t) > 0 inside the heated part of the surface,
ω(x,y,z,t) = 0 on the border of the heated part of the surface, (1)
ω(x,y,z,t) < 0 outside the heated part of the surface.

such a relation can be described using r-functions [10]. the r-function method was developed as an improvement of ritz methods for solving boundary-value problems, and it has been used for solving heat conduction problems [11]. however, in this paper it will be used only as a tool of analytical geometry. the r-functions are functions of continuous real arguments whose sign is determined by the signs of their arguments. if, instead of the signs "+" and "−", we use the values 1 and 0, r-functions are equivalent to certain boolean logic functions; the boolean function equivalent to a particular r-function is called its companion function. in almost every cad system, the geometric shape of a complex object is created by using boolean operations on geometric primitives.
these primitives can be defined by algebraic equations or inequalities. for example, the inequality ω1 = (r^2 − x^2 − y^2 ≥ 0) in the plane x0y defines a disk of radius r centered at the origin. it is proved in [12] that if the geometry of a complex object is created by boolean operations (∨, ∧, ¬) on geometric primitives described by the inequalities ωi ≥ 0, replacing the companion boolean function with r-functions allows one to obtain an inequality ω = f(ω1, ..., ωn) ≥ 0 with the properties (1). the simplest complete system of these r-functions is the following:

¬f ≡ −f (logical negation);
f1 ∧ f2 = f1 + f2 − √(f1^2 + f2^2) (logical conjunction);
f1 ∨ f2 = f1 + f2 + √(f1^2 + f2^2) (logical disjunction).

figure 2. definition of shadow zones.

let it be required to create an expression ω(x,y) ≥ 0 for the area shown in figure 1. the following geometric primitives are selected: ω1 = (1 − |x/a| − |y/b| ≥ 0), the part of the plane inside a diamond with vertices (±a, 0), (0, ±b); and ω2 = (x^2 + y^2 − r^2 ≥ 0), the part of the plane outside the circle of radius r centered at the origin. the shape of the region is determined by ω = ω1 ∧ ω2, or, after replacing the boolean operation with its r-function:

ω = (1 − |x/a| − |y/b| + x^2 + y^2 − r^2 − √((1 − |x/a| − |y/b|)^2 + (x^2 + y^2 − r^2)^2) ≥ 0).

further, for simplicity, the symbols ∧r, ∨r will be used instead of the expanded forms of the r-functions.

the method of analytical determination of the coverage of the source is illustrated on a two-dimensional problem. we introduce the dependences describing the geometric shape of the heat source, ωsource ≥ 0, and of the body, ωbody = r(ωi) ≥ 0, where r(ωi) is a system of r-functions and ωi, i = 1, ..., n, are geometric primitives (figure 2). we use the expression gi = −(ωi ∧r ωsource)^2, which is equal to zero in the surface regions γi where the primitives fall into the domain ωsource, and is less than zero at all other points. we also use the expression gbody = −(ωbody ∧r ωsource)^2, which is equal to zero for all γi. the expression ψi = 4h(gi) × h(gbody), where h is the heaviside function (with the convention h(0) = 1/2), is equal to one for all points on the surface of the body lying on the γi, and to zero for the remaining points. this means that ray tracing is needed only for the points where ψi is equal to one. without loss of generality, we assume that the action of the source is directed along the axis ysource. for points on the surface lying on γi, we use the lines fi = xsource − xpoint. the expression h(−fi^2) × ψi × |ysource| determines the distance between the source and the point of γi lying on fi. the point of the line fi that lies closest to the source has the coordinate

yi,min_source = min_body (h(−fi^2) × ψi × |ysource|).

the sought function, equal to 1 in the areas affected by the source and to 0 on the rest of the γi, can be written as:

φ = 2h(yi,min_source − ψi × |ysource|). (2)

with a known law of motion of the body connecting the coordinates (xbody, ybody) and (xsource, ysource), the functions h(gbody) and φ fully define the zone of action of the source for a body of arbitrary shape given by the expression ωbody. the way equation (2) is constructed may seem cumbersome; however, this expression is created only once. this distinguishes the proposed approach from all known ray tracing algorithms. the proposed method can also be used for time-dependent multiple sources of arbitrary shapes moving in different directions.
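as a minimal numerical sketch of the r-function formalism above (the values of a, b and r are chosen only for illustration), the region of figure 1 can be encoded and tested as follows:

    import numpy as np

    # r-conjunction and r-disjunction from the simplest complete system
    def r_and(f1, f2):
        # > 0 only where both f1 > 0 and f2 > 0
        return f1 + f2 - np.sqrt(f1**2 + f2**2)

    def r_or(f1, f2):
        # > 0 where at least one of f1, f2 is > 0
        return f1 + f2 + np.sqrt(f1**2 + f2**2)

    # primitives of the example (a, b, r are illustrative values)
    a, b, r = 2.0, 1.0, 0.5
    omega1 = lambda x, y: 1 - np.abs(x / a) - np.abs(y / b)  # inside the diamond
    omega2 = lambda x, y: x**2 + y**2 - r**2                 # outside the circle

    def omega(x, y):
        # implicit description of the region: diamond minus the central disk
        return r_and(omega1(x, y), omega2(x, y))

    # sign check: inside (>0), on the removed disk (<0), outside the diamond (<0)
    for p in [(1.0, 0.0), (0.0, 0.0), (3.0, 0.0)]:
        print(p, "inside" if omega(*p) > 0 else "outside/boundary")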
when solving such problems by the finite element method, the quality of the finite element mesh of the domain has a serious influence on the accuracy of locating the source's zone of action. if the finite element mesh of the object is not built accurately enough, the mesh generator determines the coordinates of the surface nodes with an error; in this case, the surface elements heated by the source are identified incorrectly. this influence can be seen in the following test problem: the determination of the site of action of a moving heat source for a body of complex shape, where a rectangular source with a cut performs a reciprocating motion along the axis xbody and rotates around the axis ysource (figure 3). for this problem, the coordinate systems of the source and of the body are connected by the expressions:

xsource = (xbody + l − 2l sin φ̇1t) cos φ̇2t + zbody sin φ̇2t,
ysource = ybody,
zsource = −(xbody + l − 2l sin φ̇1t) sin φ̇2t + zbody cos φ̇2t.

figure 3. design scheme for the test problem.

figure 4 shows the results of numerical calculations for two cases: using automatic generation of an unstructured mesh without additional conditions, and using a mesh obtained by forced association of the mesh nodes with the body surface. in the first case, the area of the source is defined incorrectly on curved surfaces and on planes parallel to them.

figure 4. the results of numerical simulations using an unstructured mesh without additional conditions (left) and a mesh obtained by forced association of the mesh nodes with the body surface (right).

3. numerical simulation result and discussion

the simulation of gas turbine engine blade heating during ion sputtering has been performed using the proposed method. the process is carried out on an ion-plasma device and is a preparatory stage before coating. sputtering is a slow process, so, to improve performance, a set of parts is treated at the same time: a batch of blades is set on a turntable which rotates with an angular velocity φ̇1 (figure 5). for more uniform sputtering and coating, each blade also rotates around its own axis with an angular velocity φ̇2. the energy of the ion beam, the sputtering time, and the rotation speeds of the table and of the blades must be assigned so as to provide a predetermined quality of treatment and to prevent the blade material from overheating above the phase transition temperature. for the heat-resistant alloys most commonly used in turbine blade manufacturing, this temperature is equal to 1270 k. the intersection of the ion beam with the blade surface causes the blade heating. it was assumed that the ion flux occupies a cylindrical area of radius r, located perpendicularly to the plane z0x and displaced along the axis z from the plane of the table by a distance h; the movement of ions occurs along the positive direction of the axis y (figure 5). the blade heating occurs near the ion source; therefore, the shape of the ion beam was set by the expression ωsource = (r^2 − x^2 − (z − h)^2) ∧r (−y). the coordinate systems of the body and of the source are related by the expressions:

zsource = zbody,
xsource = (xbody + r cos φ̇1t) cos(−φ̇2t) + (ybody + r sin φ̇1t) sin(−φ̇2t),
ysource = −(xbody + r cos φ̇1t) sin(−φ̇2t) + (ybody + r sin φ̇1t) cos(−φ̇2t).

the blade surface geometry was defined by a point cloud. the heat flux is determined on the basis of the energy balance at the surface between the input heat qin and the heat loss qout. for plasma processes, qin is described by the expression [13]:

qin = jrad,in + jch + jn + jads + jreact,in + jext,in.
here jrad,in is the heat radiation towards the surface; jch is the power transferred by charge carriers (electrons and ions); jn is the contribution of neutral species of the background gas and the neutral particles; jads and jreact,in are the energies released by absorption or condensation and the reaction energy of exothermic processes, including molecular surface recombination; jext,in is an input heat flux from external sources that influences the thermal balance of the substrate. cleaning by ion sputtering is carried out at low pressure using high-purity neutral gases, without additional heating and cooling. in this case, the expression for determining qin can be simplified [13]:

qin = jrad,in + jch = σ(ε t_rad^4 − ε_s t_s^4) + j_i e_ion + (4 k_c m_i m_s / (m_i + m_s)^2) sin^2(θ/2) e_i v_bias,

where ε is the spectral emittance of the radiation source at a temperature t_rad; ε_s represents the spectral absorbance of the substrate surface at a temperature t_s; σ denotes the stefan–boltzmann constant; j_i is the ion flux density at the surface; e_ion is the ionization potential of the incident ion; k_c is the energy transfer coefficient; m_i, m_s are the masses of the colliding particles (ion and surface atom); θ is the angle of incidence; e_i is the ion charge; v_bias is the sum of the plasma potential and the substrate potential.

figure 5. the equipment for sputtering and coating – turntable with blades, and the design scheme for the task of blade heating under ion sputtering.

the heat loss qout of the substrate during plasma processing consists of the following terms [13]:

qout = jrad,out + jparticle + jdes + jreact,out + jext,out.

here jrad,out is the energy radiated from the substrate at a temperature t_s; jparticle is the energy transport from the substrate due to the sputtering of surface atoms and the secondary electron emission; jdes is the energy sink due to the desorption of particles into the gas phase and the diffusion into the solid bulk; jreact,out is the reaction energy of exothermic processes, including molecular surface recombination; jext,out is the heat loss caused by external cooling. under the conditions of ion cleaning, the expression for determining qout can be written as:

qout = jrad,out + jparticle ≈ jrad,out + jsputtering = σ(ε_s t_s^4 − ε_env t_env^4) + k_sput j_i e_coh,

where ε_env is the emissivity of the environment; t_env is the environmental temperature (reactor walls, etc.); k_sput is the sputtering coefficient, which depends on the ion energy and the angle of incidence; e_coh is the cohesive energy of the sputtered material's atoms.
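the simplified balance above is straightforward to encode. the python sketch below mirrors the printed expressions for qin and qout; the unit conventions and all parameter values are left to the user (they are not specified here and would follow the source data of [13]):

    import numpy as np

    SIGMA = 5.670374419e-8  # stefan-boltzmann constant [w m^-2 k^-4]

    def q_in(eps, t_rad, eps_s, t_s, j_i, e_ion, k_c, m_i, m_s, theta, e_i, v_bias):
        # simplified input flux for ion cleaning: radiation + charge-carrier terms
        kinetic = 4.0 * k_c * m_i * m_s / (m_i + m_s)**2
        return (SIGMA * (eps * t_rad**4 - eps_s * t_s**4)
                + j_i * e_ion
                + kinetic * np.sin(theta / 2.0)**2 * e_i * v_bias)

    def q_out(eps_s, t_s, eps_env, t_env, k_sput, j_i, e_coh):
        # simplified heat loss: re-radiation plus sputtered-atom energy transport
        return SIGMA * (eps_s * t_s**4 - eps_env * t_env**4) + k_sput * j_i * e_coh

    def w_surface(phi, source_term, eps_s, t_s, eps_env, t_env):
        # net surface flux: re-radiation everywhere, the bracketed source term
        # (input minus sputtering loss) switched on by the 0/1 function phi of (2)
        return SIGMA * (eps_s * t_s**4 - eps_env * t_env**4) + phi * source_term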
for the function ω∗i ≥ 0, where ω∗i = ωi√ ω2i + (grad ωi)2 , the expressions ∂ω(ω∗i )/∂x, ∂ω(ω ∗ i )/∂y, ∂ω(ω ∗ i )/∂z will determine the direction cosines of the interior normal to the corresponding coordinate axes [12]. parameters of sputtering (such as value of the ions energy, rotation speed of the turntable and the blade) have been determined with respect to the temperature limit, so the maximum surface temperature should not exceed the temperature of the material phase transition. additionally, the condition , in which the minimum value of the sputtered material layer should not be less than a predetermined value throughout surface of the blade, was applied. the graphs of the maximum temperature change in a checkpoint of the blade at a various ion beam energy is shown in figure 6. based on the simulation results, recommendations on the choice parameters of the sputtering of the blades have been made. criterion of choice was the minimum time of the sputtering while respecting the temperature limitations. 476 vol. 56 no. 6/2016 simulation of surface heating 4. conclusions the use of r-functions can significantly reduce the amount of tracing computations. the abovementioned effect is due to the analytical determination of points falling within the scope of the source. the point clouds produced by the 3d scanners can be used for this purpose. the numerical realization of the proposed method requires an accurate build numerical mesh. the best results in terms of accuracy of determination the scope of the source can be expected, when applying adaptive tunable meshes. the proposed approach can be applied to cases with an arbitrary number of heat sources considering that the law of body motion is known. it can also be used in mass transfer problems, such as simulation coating, painting or clearing bodies with complex geometric shapes. in case of integration of the r-functions in cad system, the use of the proposed method would be simple enough. references [1] komanduri, r., hou, z.b. thermal modeling of the metal cutting process – part ii: temperature rise distribution due to frictional heat source at the tool–chip interface. international journal of mechanical sciences 43:57-88, 2001. doi:10.1016/s0020-7403(99)00104-6 [2] cline, h.e., anthony, t.r. heat treating and melting material with a scanning laser or electron beam. journal of applied physics 48(9):3895-900, 1977. doi:10.1063/1.324261 [3] rosenthal, d. the theory of moving sources of heat and its application to metal treatments. transaction of the american society of mechanical engineers 68:849-66, 1946. [4] liu, s., lannou, s., wang, q., keer, l. solutions for temperature rise in stationary/moving bodies caused by surface heating with surface convection. journal of heat transfer 126:776-85, 2004. doi:10.1115/1.1795234 [5] akbari, m., sinton, d., bahrami, m. geometrical effects on the temperature distribution in a half-space due to a moving heat source. journal of heat transfer 133:064502-1-10, 2011. doi:10.1115/1.4003155 [6] zhou, h. temperature rise induced by a rotating or dithering laser beam. advanced studies in theoretical physics 5(10):443-68, 2011. http://hdl.handle.net/10945/25571 [7] jung, j.w., lee, c.m. cutting temperature and laser beam temperature effects on cutting tool deformation in laser-assisted machining. in proceedings of the international multiconference of engineers and computer scientists (imecs 2009) vol. ii, hong kong, march 18-20, 2009, pp. 1817-22. [8] kay, d.s. 
transparency, refraction and ray tracing for computer synthesized images. master's thesis, program of computer graphics, cornell university, usa, 1979.
[9] whitted, t.: an improved illumination model for shaded display. communications of the acm 23(6):343–49, 1980. doi:10.1145/965103.807419
[10] rvachev, v.l., sheiko, t.i., shapiro, v.: the r-function method in boundary-value problems with geometric and physical symmetry. journal of mathematical sciences 97(1):3888–99, 1999. doi:10.1007/bf02364929
[11] voronyanskaya, m.e., maksimenko-sheiko, k.v., sheiko, t.i.: mathematical modeling of heat conduction processes for structural elements of nuclear power plants by the method of r-functions. journal of mathematical sciences 170(6):776–93, 2010. doi:10.1007/s10958-010-0120-x
[12] rvachev, v.l.: on the analytical description of some geometric objects. reports of ukrainian academy of sciences 153(4):765–67, 1963. (in russian)
[13] kersten, h., deutsch, h., steffen, h., kroesen, g.m.w., hippler, r.: the energy balance at substrate surfaces during plasma processing. vacuum 63:385–431, 2001. doi:10.1016/s0042-207x(01)00350-5

acta polytechnica doi:10.14311/ap.2013.53.0483 acta polytechnica 53(supplement):483–496, 2013 © czech technical university in prague, 2013 available online at http://ojs.cvut.cz/ojs/index.php/ap

what is new in astroparticle physics

franco giovannelli*

inaf – istituto di astrofisica e planetologia spaziali, area di ricerca di tor vergata, via del fosso del cavaliere, 100 – i00177 roma, italy
* corresponding author: franco.giovannelli@iaps.inaf.it

abstract. in this brief review paper i will point to the most important steps that have been made in recent decades toward a better understanding of the physics governing our universe. because of the limited length of this paper, i have selected only a few results that, in my opinion, have been of crucial importance.

keywords: photonic astrophysics, particle astrophysics, neutrino astrophysics.

1. introduction

astroparticle physics is a new field of physics that emerged roughly twenty years ago, when high energy astrophysicists and particle physicists began to collaborate. within this relatively short period of time, astroparticle physics has developed strongly through the study of cosmic sources that are emitters of photons, charged particles and neutrinos. these sources are considered as frontier objects between astrophysics and particle physics. results coming from the study of cosmic sources using various techniques have stimulated the scientific community to work toward a unifying scheme for a general comprehension of the physics governing our universe. the wilkinson microwave anisotropy probe (wmap) mission, described by bennett et al. [25], determined that the universe is 13.7 ± 0.2 gyr old. the combination of wmap and 2dfgrs data favors a flat universe, from which it follows that the mean energy density in the universe is equal to the critical density [158]. this is equivalent to a mass density of 9.9 × 10^−30 g cm^−3.
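this figure follows from the critical density ρ_c = 3h_0^2/(8πg). as a quick check (a sketch; a value of h_0 ≈ 71 km s^−1 mpc^−1 is assumed here, since the text does not quote one):

    import numpy as np

    G = 6.674e-11    # gravitational constant [m^3 kg^-1 s^-2]
    MPC = 3.0857e22  # metres per megaparsec

    def critical_density(h0_kms_mpc):
        """critical density rho_c = 3 H0^2 / (8 pi G), returned in g cm^-3."""
        h0 = h0_kms_mpc * 1.0e3 / MPC               # H0 in s^-1
        rho_si = 3.0 * h0**2 / (8.0 * np.pi * G)    # kg m^-3
        return rho_si * 1.0e-3                      # 1 kg m^-3 = 1e-3 g cm^-3

    print(f"rho_c ~ {critical_density(71.0):.2e} g cm^-3")  # ~9.5e-30 for H0 = 71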
of this total density, we now know the breakdown to be 4.6 % baryons, 0.4 % neutrinos, 23 % cold dark matter (cdm), and 72 % dark energy ([81] and references therein). until now, we have had no unquestionable experimental proof of the existence of dark matter and dark energy, though some results on the presence of dm particles in the galactic halo have been claimed by bernabei et al. [26]. bernabei et al. [27] (this volume) point out that the dama/libra data show model-independent evidence of the presence of dm particles in the galactic halo at 8.9σ c.l. if this is so, it will greatly improve our knowledge of the universe. however, ackermann et al. [7], observing the satellite galaxies of the milky way – the most promising targets for dark matter searches in gamma rays – did not find any positive signal. in their search for dark matter consisting of weakly interacting massive particles – applying a joint likelihood analysis to 10 satellite galaxies with 24 months of data from the fermi large area telescope – no dark matter signal was detected. including the uncertainty in the dark matter distribution, robust upper limits are placed on the dark matter annihilation cross sections: the 95 % confidence level upper limits range from about 10^−26 cm^3 s^−1 at 5 gev to about 5 × 10^−23 cm^3 s^−1 at 1 tev, depending on the dark matter annihilation final state.

all cosmic sources, both discrete and diffuse, are variable in intensity and in spectral shape at different time scales. in this sense, we can affirm that no single source is sufficiently stable to be considered a standard candle. for this reason, multifrequency observations, possibly simultaneous, are mandatory for a proper comprehension of the behaviour of a target cosmic source (e.g. [81]). in this paper i will discuss – following my knowledge and feelings – the most relevant results obtained in the recent past that significantly improved our knowledge of the physics governing our universe. deeper discussions about astroparticle physics can be found in the review papers by giovannelli [79, 81]. in their review papers, giovannelli & sabau-graziati [92, 94] and de angelis, mansutti & persic [53] have discussed the multifrequency behaviour of high energy cosmic sources and very high energy (vhe) γ-ray astrophysics. the results from egret have been extensively discussed by thompson [170] in his review of γ-ray astrophysics.

2. the main pillars of astroparticle physics

high energy astrophysics is generally approached through the study of cosmic rays; the reason for this is historical in nature. since the discovery of this extraterrestrial radiation by victor hess [103], an enormous amount of scientific research has been devoted to discovering its nature, and as a result many separate research fields have developed. before particle accelerators came into operation, high energy cosmic rays were the laboratory tools for investigations of elementary particle production, and to date they are still the only source of particles with energies greater than 10^12 ev. research into the composition of the radiation led to studies of the astrophysical environment using the information in the charge, mass, and energy spectra; this field is also known as particle astrophysics.
the discovery of high energy photons near the top of earth's atmosphere was of great importance, and led to the development of new astronomical fields such as x-ray and γ-ray astronomy. however, many of these high energy photons have their origin in interactions of high energy charged particles with cosmic matter, light, or magnetic fields. the research fields of particle astrophysics and astronomy have found in this fact a bond to join their efforts in trying to understand the high energy processes which occur in astrophysical systems.

2.1. cosmic rays

the modern picture of cosmic rays is that of a steady rain of particles moving at speeds close to that of light. the particles are primarily nuclei with atomic weights less than 56, as well as a few nuclei of heavier elements, some electrons and positrons, and a few γ-rays and neutrinos. the energy spectrum extends over 12 orders of magnitude (∼ 10^8 ÷ 10^20 ev) and the particle flux rapidly decreases with increasing energy. figure 1 shows the energy spectrum of cosmic rays. up to energies of ∼ 10^9 ev, crs are of galactic origin and suffer a strong solar modulation due to their relatively low energy. from ∼ 10^9 ev to ∼ 10^15 ev, crs are still of galactic origin and are probably mainly accelerated by supernova remnants (snrs). from ∼ 10^15 ev to ∼ 10^19 ev, crs show some hints of galactic anisotropy at ∼ 10^18 ev, and a composition changing from heavy to light elements. above ∼ 10^19 ev, the ultra high energy (uhe) crs are very few in number, and their origin and composition are mostly unknown.

figure 1. the whole particle spectrum of cosmic rays (after the talk by mostafa, 2012 [132]). energies below or above ≈ 10^15 ev divide the two ranges in which direct and indirect measurements of the cr spectrum are possible.

figure 1 shows clearly that the cr flux falls from 1 cr m^−2 s^−1 at ∼ 10^11 ev, to 1 cr m^−2 yr^−1 at ∼ 10^15÷16 ev, to 1 cr km^−2 yr^−1 at ∼ 10^19 ev. there is evidence of a break in the spectrum around 10^15 ÷ 10^16 ev; this break is also called the knee. the knee was observed in the late 1950s, initially as a steepening in the extensive air shower (eas) size spectrum [120]. more than 60 years have since passed, but its origin is still a challenge for cosmic ray physics. a possible interpretation of the knee is that it represents the energy at which cosmic rays can escape more freely from the galaxy, or it may indicate a transition between two different acceleration mechanisms. in the first case, one might expect an anisotropy effect in the distribution of arrival directions above this energy if the cosmic rays originated within the galaxy. a 2nd knee is present at about 10^18 ev. its origin is not yet completely clear: it could be due to the dispersion of sns, to the re-acceleration of particles, or to an early transition to extragalactic cosmic rays [134]. in summary, at gev energies the cosmic ray spectrum for protons is close to e^−2.75, and for he and higher elements it is close to e^−2.65 below the knee at ≈ 5 × 10^15 ev, where the spectrum turns down to ∼ e^−3.1 and flattens out again near 10^18.5 ev, at the so-called ankle (e.g., [123, 133, 183]). the origin of the highest energy crs remains an open question; for a general review of high energy crs, see [160]. the greisen–zatsepin–kuzmin (gzk) cutoff, due to the microwave background, is near 10^21 ev.
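as an illustrative sketch of the spectral shape just summarized (the normalization is arbitrary and the knee position is an indicative value, not a fit):

    import numpy as np

    E_KNEE = 4.0e15               # ev, indicative knee position (text: ~(3-5)e15 ev)
    GAMMA_1, GAMMA_2 = 2.7, 3.1   # spectral indices below/above the knee

    def differential_flux(e_ev, norm=1.0):
        """broken power law dj/de joined at the knee; 'norm' is an arbitrary
        overall scale (the absolute flux is not modelled here)."""
        e = np.asarray(e_ev, dtype=float)
        below = norm * (e / E_KNEE) ** (-GAMMA_1)
        above = norm * (e / E_KNEE) ** (-GAMMA_2)
        return np.where(e < E_KNEE, below, above)

    # steepening: flux drop over one decade above vs below the knee
    print(differential_flux(E_KNEE * 10) / differential_flux(E_KNEE))   # 10**-3.1
    print(differential_flux(E_KNEE) / differential_flux(E_KNEE / 10))   # 10**-2.7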
once we succeed in identifying the origin of the highest energy events beyond 5 × 10^19 ev, and if we can establish the nature of their propagation through the universe to us, then we will obtain a tool for doing physics at eev energies [28]. dirac's famous statement, from his nobel lecture, is: "we must regard it rather as an accident that the earth (and presumably the whole solar system) contains a preponderance of negative electrons and positive protons. it is quite possible that for some of the stars it is the other way about, these stars being built up mainly of positrons and negative protons." this statement still poses a fundamental question – open to experimental observations – about the baryonic symmetry or asymmetry of the universe. this topic can be investigated either 'indirectly', by measuring the spectrum of the cosmic diffuse gamma-rays, or 'directly', by searching for anti-nuclei and by measuring the energy spectra of antiprotons and positrons. the pamela satellite, designed to perform accurate measurements of cosmic rays, has revealed antiprotons and positrons in the range ∼ 10 ÷ 100 gev, providing important constraints on the existence of exotic processes and information about the mechanisms of production, acceleration and propagation of crs in the galaxy (e.g. [69]). as annihilating dark matter particles in many models are predicted to contribute to the cosmic ray positron spectrum in this energy range, a great deal of interest has resulted from this observation. however, pulsars could be an alternative source of this signal [106]. the excesses recently observed in the cosmic ray positron and electron spectra by the pamela and atic experiments could be produced by a nearby clump of 600 ÷ 1000 gev neutralinos [107]. shaviv, nakar & piran [154] showed that the inhomogeneity of cr sources, due to the concentration of supernova remnants (snrs) towards the galactic spiral arms, can provide a natural explanation for the anomalous increase in the positron/electron ratio observed by pamela.

the propagation of charged crs is influenced by the magnetic field in the galaxy and, for the lowest energy particles, also in the solar system. the result is that the distribution of arrival directions as the radiation enters earth's atmosphere is nearly isotropic, so it is not possible to identify the sources of the cosmic rays by detecting them. however, in the high energy interactions produced at the source, electrically neutral particles such as photons, neutrons, and neutrinos are also produced, and their trajectories are not deviated, being directed from their point of origin to the observer (e.g. [23, 71, 135]). owing to their short lifetime, neutrons cannot survive the path length to the earth (decay length ∼ 9 pc at 1 pev), and neutrinos do not interact efficiently in the atmosphere. it is in this context that gamma ray astronomy has demonstrated itself to be a powerful tool. the observations made to date have detected γ-rays from many astronomical objects, e.g. neutron stars, interstellar clouds, the center of our galaxy and the nuclei of active galaxies (agns). one might expect very important implications for high energy astrophysics from observations of extragalactic sources at energies greater than 10^11 ev (e.g. [105]). the fluxes of γ-rays at these energies are attenuated because of their interactions with the cosmic radio, microwave, infrared and optical radiation fields (e.g. [8, 151]).
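before turning to the flux attenuation, the quoted neutron decay length is easy to verify: the mean decay length of a relativistic neutron is γcτ, with γ = e/(m_n c^2). a quick check with standard values of the constants:

    M_N_EV = 939.565e6   # neutron rest energy [ev]
    TAU_N = 879.4        # neutron mean lifetime [s]
    C = 2.998e8          # speed of light [m/s]
    PC = 3.0857e16       # metres per parsec

    def neutron_decay_length_pc(energy_ev):
        """mean decay length gamma * c * tau of a relativistic neutron, in pc."""
        gamma = energy_ev / M_N_EV
        return gamma * C * TAU_N / PC

    print(f"decay length at 1 pev: {neutron_decay_length_pc(1e15):.1f} pc")  # ~9 pc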
measurements of the flux attenuation can then provide important information on the distribution of these fields. for example, the threshold energy for pair production in reactions of photons with the 2.7 k background radiation is reached at 10^14 ev, and the absorption length is of the order of ∼ 7 kpc. for the infrared background, maximum absorption is reached at energies greater than 10^12 ev. the monograph entitled “origin of cosmic rays” [78] strongly influenced subsequent papers, mainly because of its suggestive title. more than 1000 papers discussing the origin of crs have appeared since then. recently, drury [58] presented a critical discussion of the problem, since we are still searching for the origin of crs. the discussion involves the place of their origin (galactic or extragalactic), the origin of the accelerated particles, the origin of the energy, the acceleration sites and mechanisms, the physical limits on the accelerator, a possible synthesis, and observational tests.

2.2. lhc results

in the past, it was impossible to accelerate particles in earth-bound laboratories to energies of the order of tev. now, the large hadron collider (lhc) – described by straessner et al. [163] – is able to reach tev energies, with p–p interactions at 7 tev used to search for the higgs boson with the atlas [76] and cms [171] detectors. these detectors provided ranges of exclusion for the higgs boson mass and, together with the data coming from other experiments, e.g. the tevatron (craig group talk, 2012 [49]), it has been possible to restrict the possible mass of the higgs boson to the range 115 ÷ 131 gev (95 % c.l.). with this result, and a first hint in the mass window 124 gev < m_h < 126 gev, elias-miró et al. [60] presented a detailed investigation of the stability of the standard model (sm) vacuum under the hypothesis 124 gev < m_h < 126 gev, assuming the validity of the sm up to very high energy scales. for a higgs mass in the range 124 ÷ 126 gev, and for the current central values of the top mass and strong coupling constant, the higgs potential develops an instability around 10^11 gev, with a lifetime much longer than the age of the universe. however, taking into account theoretical and experimental errors, stability up to the planck scale cannot be excluded. stability at finite temperature implies an upper bound on the reheat temperature after inflation, which depends critically on the precise values of the higgs and top masses. a higgs mass in the range 124 ÷ 126 gev is compatible with very high values of the reheating temperature, without conflict with baryogenesis mechanisms such as leptogenesis. elias-miró et al. [60] derived an upper bound on the mass of heavy right-handed neutrinos by requiring that their yukawa couplings do not destabilize the higgs potential. a historic result comes from combining the results obtained with the cms and atlas experiments at the lhc. the result from the cms experiment gives a mass of the observed state decaying to di-photon and four-lepton final states, with a statistical significance of 5σ, of 125.3 ± 0.6 gev [113]. the result from the atlas experiment gives an excess of events at m_h ∼ 126.5 gev with local significance 5.0σ [77]. within the current uncertainties, these observations are compatible with the sm higgs boson. however, this historic milestone marks only the beginning of a new exciting challenge for frontier physics.

2.3. big bang and diffuse extragalactic background radiation

after the big bang the universe started to expand, cooling rapidly.
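this adiabatic cooling obeys t_cmbr(z) = t_cmbr(0)(1 + z); a one-line check, anticipating the z = 2.34 measurement quoted in the next paragraph (the cobe value of t_cmbr(0) is the one given there):

```python
T0 = 2.726                    # cobe value of t_cmbr(0) [K]
z = 2.34
print(f"T_CMBR({z}) = {T0 * (1 + z):.1f} K")   # -> 9.1 K, inside the measured 6.0-14.0 K
```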
the cosmic radiation now observed is probably a blend of various components which had their origin in various stages of the evolution, as the result of different processes. this is the diffuse extragalactic background radiation (debra), which, if observed in different energy ranges, allows the study of many astrophysical, cosmological, and particle physics phenomena [148]. debra has witnessed the whole history of the universe, from the big bang to the present time. this history is marked by three main experimental witnesses supporting the big bang theory (e.g. [90]): the light element abundances [37]; the cmbr temperature at various redshifts, as determined by srianand, petitjean & ledoux [159] and the references therein; and the cmbr at z = 0 as measured by cobe (t_cmbr(0) = 2.726 ± 0.010 k), which is well fitted by a black body spectrum [125]. at z ≈ 2.34, the cmbr temperature is 6.0 k < t_cmbr(2.34) < 14.0 k. the prediction from the hot big bang, t_cmbr(z) = t_cmbr(0) (1 + z), gives t_cmbr(2.34) = 9.1 k, which is consistent with the measurement [159]. new measurements in various energy regions are improving our knowledge about debra. it can be affirmed that the big bang theory has been proved, and in the light of recent lhc results, the standard model is definitively correct.

2.4. reionization of the universe

after the epoch of recombination (last scattering), between ≈ 3.8 × 10^5 and ≈ 2 × 10^8 yr (z ≈ 1000 ÷ 20), the universe experienced the so-called dark ages, when the dark matter halos collapsed and merged until the appearance of the first sources of light. this brought the dark ages to an end. the ultraviolet light from the first sources of light also changed the physical state of the gas (hydrogen and helium) that filled the universe, from a neutral state to a nearly fully ionized state. this was the reionization era, when the population iii stars formed and, as feedback, the first sne and grbs occurred. this happened between ≈ (2 ÷ 5) × 10^8 yr (z ≈ 20 ÷ 10). soon after, population ii stars started to form and probably a second wave of reionization occurred, stopping at ≈ 9 × 10^8 yr (z ≈ 6) after the big bang, and then the evolution of galaxies started (e.g. [56, 57]). quasars – the brightest and most distant known objects – offer a window on the reionization era, because neutral hydrogen gas absorbs their ultraviolet light. reionization drastically changes the environment for galaxy formation and evolution, and in a hierarchical clustering scenario the galaxies responsible for reionization may be the seeds of the most massive galaxies in the local universe. reionization is the last global phase transition in the universe. the reionization era is thus a cosmological milestone, marking the appearance of the first stars, galaxies and quasars. there is an apparent contradiction between the wmap5 data [59] and quasar (qso) absorption spectra data [65]. the wmap5 data are consistent with an epoch of reionization z_rei ≈ 11, while the sdss observations suggest z_rei ≈ 6. long grbs may constitute a complementary way to study the reionization process, possibly probing z_rei ≈ 6 [99, 150, 167]. moreover, an increasing number of lyman alpha emitters are routinely found at z > 6 [161]. recent results obtained by ouchi et al. [140] make an important contribution to the solution of this problem. indeed, from the lyα luminosity function (lf), clustering measurements, and lyα line profiles based on the largest sample to date of 207 lyα emitters at z = 6.6 in the 1 deg^2 sky of the subaru/xmm-newton deep survey field, ouchi et al.
[140] found that the combination of various reionization models and observational results about the lf, clustering, and line profiles indicates that there would exist a small decrease of the intergalactic medium’s (igm’s) lyα transmission owing to reionization, but that the hydrogen igm is not highly neutral at z = 6.6. their neutral-hydrogen fraction constraint implies that the major reionization process took place at z ≳ 7. the discovery of the qsos sdss j1148+5251 at a redshift of 6.41 (≈ 12.6 × 10^9 yr ago) [56], j114816.64+525150.3 at z = 6.43 [66], and ulas j1120+0641 at z = 7.085 (770 myr after the big bang) [131] does not contradict the results found for the epoch of reionization. the discovery of a qso at z > 7, together with the discovery of a grb at z = 9.4 [51], strongly indicates that the epoch of reionization occurred at redshift ≈ 11, as suggested by the wmap5 data. however, the search for the epoch of reionization is still one of the most important open problems for understanding the formation of the first stars, galaxies and quasars. indeed, janus, a mission proposed to nasa, would be able to detect tens of qsos at z > 7, and euclid, an esa mission, will probably be able to detect faint qsos at z > 8.

2.5. clusters of galaxies

the problems of the production and transport of heavy elements seem to have been resolved. indeed, thermally driven galactic winds, such as that from m82, have shown that only active galaxies with an ongoing starburst can enrich the icm with metals. the amount of metals in the icm is at least as high as the sum of the metals in all the galaxies of the cluster (e.g. [173]). several clusters of galaxies having strong radio emission have been associated with egret sources. this is an important step in clarifying the nature of many unknown egret sources [47]. however, no γ-ray emission from any of the monitored cgs was detected in the first 11 months of operation of the fermi lat monitoring program of cgs [6]. it is argued that cooling flows efficiently form dark matter. this has wider implications for the formation of dark matter in massive galaxies (e.g. [63, 64]). although many important results have come from the satellites of the last decade, the hierarchical distribution of the dark matter and the role of the intergalactic magnetic fields in cgs are still open problems.

2.6. dark energy and dark matter

various methods used to determine the mass of galaxies have revealed a discrepancy that suggests ∼ 95 % of the universe is in a form not easily detected by our instruments and/or experiments. this unknown content of the universe is the sum of dark energy (de) and dark matter (dm). for a thorough discussion about the new cosmology see [48]. the discovery of the nature of the dark energy may provide an invaluable clue for understanding the nature and the dynamics of our universe. however, ∼ 30 % of the content of the universe is matter, most of it dark, and it still requires a detailed explanation. baryonic dm consisting of machos (massive astrophysical compact halo objects) can yield only some fraction of the total amount of dark matter required by cmb observations. wimps (weakly interacting massive particles) (non-baryonic dm) can yield the needed cosmological amount of dm and its large scale distribution, provided that it is “cold” enough.
several options have been proposed so far, such as: i) light neutrinos with mass in the range m_ν ∼ 10 ÷ 30 ev; ii) light exotic particles like axions, with mass in the range m_axion ∼ 10^−5 ÷ 10^−2 ev, or weakly interacting massive particles like neutralinos, with mass in the range m_χ ∼ 10 ÷ 1000 gev, this last option being favored at the present time (see, e.g., [61]). eros and macho, two experiments based on gravitational microlensing, were developed. two lines of sight have been probed intensively: the large magellanic cloud (lmc) and the small magellanic cloud (smc), located 52 kpc and 63 kpc respectively from the sun [142]. with 6 years of data towards the lmc, the macho experiment published a most probable halo fraction between 8 and 50 % in the form of 0.2 m⊙ objects [11]. most of this range is excluded by the eros exclusion limit, and in particular the macho preferred value of 20 % of the halo. among the experiments searching for wimp dark matter candidates there is pamela, devoted to the search for dark matter annihilation, antihelium (primordial antimatter) and new matter in the universe (strangelets?), and to the study of cosmic-ray propagation (light nuclei and isotopes), the electron spectrum (local sources?), solar physics and solar modulation, and the terrestrial magnetosphere. a comparison of pamela expectations with many other experiments has been discussed by morselli [130]. bruno [34] discussed some results from pamela. the search for dm is one of the main open problems in present day astroparticle physics.

2.7. the galactic center

the galactic center (gc) is one of the most interesting places for testing theories in which frontier physics plays a fundamental role. there is an excellent review by mezger, duschl & zylka [128], which discusses the physical state of stars and interstellar matter in the galactic bulge (r ∼ 0.3 ÷ 3 kpc from the dynamic center of the galaxy), in the nuclear bulge (r < 0.3 kpc) and in the sgr a radio and gmc complex (the central ∼ 50 pc of the milky way). this review also reports a list of review papers and conference proceedings related to the galactic center, with bibliographic details. multifrequency gc behaviour is also discussed in the review paper by giovannelli & sabau-graziati ([85] and references therein). larosa et al. [122] presented a wide-field, high dynamic range, high-resolution, long-wavelength (λ = 90 cm) vla image of the gc region. this is the most accurate image of the gc. the gc is highly obscured in the optical and soft x-rays; it hosts a central compact object – a black hole candidate – with m ∼ 3.6 × 10^6 m⊙ [72], which coincides with the compact radio source sgr a* [r.a. 17 45 41.3 (hh mm ss); dec. −29 00 22 (dd mm ss)]. sgr a* is highly variable in x-rays/infrared [73]. the gc is also a good candidate for indirect dark matter observations. indeed, cesarini et al. [41], using the γ-ray source detected by egret at the gc, pointed out that the spectral features of that source are compatible with the γ-ray flux induced by pair annihilations of dark matter wimps. the observation of a gamma-ray line in cosmic-ray fluxes would be a smoking-gun signature for dark matter annihilation or decay in the universe. weniger [180], analyzing 43 months of fermi-lat data in regions close to the gc, found a 4.6σ indication of a γ-ray line at e ≈ 130 gev. however, the evidence for the signal is based on about 50 photons; it will take a few years of additional data to clarify its existence. bringmann et al.
[33] and hooper, kelso & queiroz [108] analyzed and discussed the same fermi-lat data from the direction of the inner galaxy, and derived robust yet stringent upper limits on the annihilation cross section of the dark matter, with the warning that the set of data is still poor. after the reports of a γ-ray line feature at ∼ 130 gev, buckley & hooper [35] developed a model-independent approach and discussed a number of possibilities for dark matter candidates which could potentially generate a feature of this kind. ghez et al. [74] measured the proper motions of 17 stars within 0.4′′ of the galaxy’s central dark mass that reveal orbital solutions. the orbits were derived simultaneously, so that they jointly constrain the central dark object’s properties: its mass, its position, and, for the first time using orbits, its motion on the plane of the sky. the estimated central dark mass from the orbital motions is ∼ 3.7 × 10^6 [r_0/8 kpc]^3 m⊙. thus the study of stellar motions near the gc will provide good tests of general relativity at intermediate relativity parameters. despite the widespread belief that the gc contains a black hole with a mass m ∼ 3.7 × 10^6 m⊙, kundt ([121], this volume, and references therein) shows a number of maps of our gc, not all of which are easily found in the standard literature. all of these maps require a burning disk as their central engine, rather than a black hole.

2.8. gamma-ray bursts

the theoretical description of grbs is still an open and strongly controversial question. several models have been proposed: the fireball (fb) model [127, 146], the cannon ball (cb) model [52], the spinning-precessing jet (spj) model [67, 68], and the fireshell model [114], which comes directly from the electromagnetic black hole (embh) model (e.g. [149] and the references therein); however, each of these models conflicts with the others. important implications for the origin of the highest redshift grbs are coming from the detection of grb 080913 at z = 6.7 [99], grb 090423 at z ∼ 8.2 [168], and grb 090429b at z = 9.4 [51]. this means that we are really approaching the possibility of detecting grbs from the end of the dark era, when the first pop iii stars appeared. izzo et al. [114] successfully discussed a theoretical interpretation of grb 090423 within their fireshell model. wang & dai [176] studied the high-redshift star formation rate (sfr) up to z ≈ 8.3, considering the swift grbs tracing the star formation history and the cosmic metallicity evolution in different background cosmological models, including the λcdm, quintessence, quintessence with a time-varying equation of state, and brane-world models. λcdm is the preferred model when compared with the others. dust plays an important role in understanding the formation and evolution of galaxies in the course of cosmic history. dust absorbs and scatters the uv and optical light from young stars, making it difficult to work out how stars formed in galaxies using optical and nir observations, which sample the redshifted uv/optical light of high redshift objects. it is therefore important to know how dust extinguishes the uv/optical light at high redshift, in order to understand how stars and galaxies formed in the early universe. at high redshift, the universe was so young that core-collapse sne are suspected to be the dominant source of dust production. a crucial help in understanding dust production at high redshifts came from the analysis of the afterglow of grb 071025, placed at z ∼ 5 [115, 145].
its red color and the inflected shape of the afterglow spectral energy distribution (sed) suggest dust extinction dominated by sn dust. in order to determine which kind of sne can produce such dust, jang et al. [115] – using their independent optical/near-infrared data of grb 071025 at different epochs – tested sn-dust models with different progenitor masses and dust destruction efficiencies to constrain the dust formation mechanisms. by searching for the best-fit model of the afterglow sed, jang et al. [115] confirmed the previous claim that the dust in grb 071025 most likely originated from sne. they also found that the sn-dust model with a progenitor of 13 or 25 m⊙ without dust destruction fits the extinction properties of grb 071025 best, while pair-instability sn models with a 170 m⊙ progenitor fit the data poorly. the results of jang et al. [115] thus indicate that, at least in some systems at high redshift, sne with intermediate masses within 10 ÷ 30 m⊙ were the main contributors to the dust enrichment, and that the dust destruction effect due to the reverse shock was negligible. although great progress has been made in recent years, grb theory needs further investigation in the light of the experimental data coming from old and new satellites, often coordinated, such as bepposax, batse/rxte, asm/rxte, ipn, hete, integral, swift, agile, fermi and maxi.

2.9. extragalactic background light

space is filled with diffuse extragalactic background light (ebl), which is the sum of the starlight emitted by galaxies through the history of the universe. high energy γ-rays traversing cosmological distances are expected to be absorbed through their interactions with the ebl via γ_vhe + γ_ebl → e^+ + e^−. the γ-ray flux φ is then suppressed while travelling from the emission point to the detection point, as φ = φ_0 e^−τ(e,z), where τ(e,z) is the opacity. the distance at which the flux is reduced by a factor of e [τ(e,z) = 1] defines the gamma ray horizon (e.g. [29, 124]). direct measurement of the ebl is difficult at optical to infrared wavelengths because of the strong foreground radiation originating in the solar system. however, the measurement of the ebl is important for vhe gamma-ray astronomy, as well as for astronomers modelling star formation and galaxy evolution. second only in intensity to the cosmic microwave background (cmb), the optical and infrared (ir) ebl contains the imprint of galaxy evolution since the big bang. this includes the light produced during the formation and reprocessing of stars. current measurements of the ebl are reported in the paper by schroedter ([152] and references therein), who used the available vhe spectra from six blazars. later, the redshift region over which the gamma ray horizon can be constrained by observations was extended up to z = 0.536, and an upper ebl limit based on 3c 279 data was obtained [10]. the universe is more transparent to vhe gamma rays than expected. thus many more agns could be seen at these energies. indeed, abdo et al. [1] observed a number of tev-selected agns during the first 5.5 months of observations with the large area telescope (lat) on board the fermi gamma-ray space telescope. redshift-dependent evolution is revealed in the spectra of the objects detected at gev and tev energies. the most reasonable explanation for this is absorption on the ebl and, as such, it would represent the first model-independent evidence for the absorption of γ-rays on the ebl. abdo et al.
[5], using a sample of γ-ray blazars with redshifts up to z ∼ 3 and grbs with redshifts up to z ∼ 4.3 measured by fermi/lat, placed upper limits on the γ-ray opacity of the universe at various energies and redshifts, and compared these with the predictions of well-known ebl models. they found an ebl intensity in the optical-ultraviolet wavelengths that rules out the “baseline” model of stecker, malkan & scully [162] with high confidence.

2.10. relativistic jets

rotating massive cosmic sources can produce jets. indeed, relativistic jets have been found in numerous galactic and extragalactic cosmic sources at different energy bands. the emitted spectra of jets from cosmic sources of different nature depend strongly on the angle formed by the beam axis and the line of sight, and obviously on the lorentz factor of the particles (e.g. [24] and the references therein, and [15–22]). thus observations of jet sources at different frequencies can provide new inputs for the comprehension of such extremely efficient carriers of energy, as in the case of cosmological grbs. the discovered analogy among µ-qsos, qsos, and grbs is fundamental for studying the common physics governing these different classes of objects via µ-qsos, which are galactic and therefore apparently brighter, with all processes occurring on time scales accessible to our experiments (e.g. [42]). chaty [43] noted the importance of multifrequency observations of jet sources by means of measurements of grs 1915+105. dermer et al. [55] suggested that ultra-high energy cosmic rays (uhecrs) could come from the black hole jets of radio galaxies. spectral signatures associated with uhecr hadron acceleration in studies of radio galaxies and blazars with the fermi observatory and ground-based γ-ray observatories can provide evidence for cosmic-ray particle acceleration in black hole plasma jets. also in this case, γ-ray multifrequency observations (mev–gev–tev), together with observations of pev neutrinos, could confirm whether black-hole jets in radio galaxies accelerate the uhecrs. despite their frequent outburst activity, microquasars had never been unambiguously detected emitting high-energy gamma rays. the fermi/lat has now detected a variable high-energy source coinciding with the position of the x-ray binary and microquasar cygnus x-3. its identification with cygnus x-3 is secured by the detection of its orbital period in gamma rays, as well as by the correlation of the lat flux with the radio emission from the relativistic jets of cygnus x-3. the γ-ray emission probably originates from within the binary system [2]. the microquasar ls 5039 has also been unambiguously detected by fermi/lat through its 3.9-day modulated emission. analyzing the spectrum, which varies with orbital phase and shows a cutoff, abdo et al. [3] concluded that the γ-ray emission of ls 5039 is magnetospheric in origin, like that of the pulsars detected by fermi. this experimental evidence of gev emission from microquasars opens an interesting window onto the formation of relativistic jets.

2.11. cataclysmic variables

the detection of cvs with the integral observatory [14] has recently renewed the interest of high energy astrophysicists in such systems, and subsequently involved the low-energy astrophysical community again.
the detection of cvs having orbital periods inside the so-called period gap between 2 and 3 hours, which separates polars (experiencing gravitational radiation) from intermediate polars (experiencing magnetic braking), renders attractive the idea of a physical continuity between the two classes. further investigations of this important problem are necessary. for recent reviews on cvs see the papers by giovannelli [80] and giovannelli & sabau-graziati [95]. type ia supernovae (sne ia) are essential astrophysical tools for the study and exploration of fundamental properties of the cosmos. the first suggestion of the possibility that the progenitors of sne ia in late-type galaxies were cv systems was given by della valle & livio [54]. since then, the identification of the progenitors of sne ia has remained controversial. it is now generally accepted that sne ia originate in binary star systems in which at least one component is a carbon–oxygen white dwarf (wd). these systems belong to the general class of cvs. current theories for sne ia progenitors hold that, either via roche lobe overflow of the companion or via a wind, the wd accumulates hydrogen- or helium-rich material, which is then burned to c and o on the wd’s surface. however, the specifics of this scenario are far from being understood. within this framework, kafka [117] discussed the latest attempts to identify and study those controversial sne ia progenitors. she also introduced the most promising progenitors in hand and presented observational diagnostics that can reveal more members of the category. taani et al. [165] discussed the origin of the progenitors of millisecond pulsars (msps) and concluded that some fraction of isolated msps originate from the conversion of wds to msps via the accretion-induced collapse (aic) process. taani et al. [166] mainly discussed the massive binary wds (m ≥ 1.0 m⊙) in cvs, which could potentially evolve to reach the chandrasekhar limit, thereafter collapsing to become msps. the sn ia–cv connection is one of the hottest problems of present-day astrophysics (e.g. [96]).

2.12. high mass x-ray binaries

for general reviews see e.g. giovannelli & sabau-graziati [84, 85] and van den heuvel [104] and the references therein. hmxbs are young systems, with ages ≤ 10^7 yr, mainly located in the galactic plane (e.g. [100]). a compact object – the secondary star – mostly a magnetized neutron star (x-ray pulsar), orbits around an early type star (o, b, be) – the primary – with m ≥ 10 m⊙. the optical luminosity of the system is dominated by the early type star. such systems are the best laboratories for the study of accretion processes, thanks to their relatively high luminosity over a large part of the electromagnetic spectrum. because of the strong interactions between the optical companion and the collapsed object, low and high energy processes are strictly related. in x-ray/be binaries the mass loss processes are due to the rapid rotation of the be star, the stellar wind and, sporadically, the expulsion of casual quantities of matter, essentially triggered by gravitational effects close to the periastron passage of the neutron star. a long orbital period (> 10 days) and a large eccentricity of the orbit (> 0.2), together with transient hard x-ray behavior, are the main characteristics of these systems. the whole sample of hmxbs contains 128 x-ray pulsars in the magellanic clouds [39] and 114 in the milky way [116]. only a few of them have been extensively studied.
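as a numerical aside on the magnetic-field diagnostic for these pulsars – the cyclotron lines discussed in the next paragraph – the sketch below uses the standard “12-b-12” relation, e_cyc ≈ 11.6 keV · b/(10^12 g)/(1 + z_g); the gravitational redshift z_g ≈ 0.3 and the example line energy are assumptions for illustration, not values from the text.

```python
def b12_from_ecyc(e_cyc_kev, z_g=0.3):
    """estimate the surface field (in units of 1e12 G) from a cyclotron
    line energy, inverting e_cyc ≈ 11.6 keV * b12 / (1 + z_g)."""
    return e_cyc_kev * (1.0 + z_g) / 11.6

# e.g. a hypothetical fundamental line at 50 keV:
print(f"B ≈ {b12_from_ecyc(50.0):.1f} x 1e12 G")   # ~5.6e12 G, inside 1e12-1e13 G
```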
among these x-ray pulsars, the system a 0535+26/hde 245770 is the best known, thanks to concomitant favorable causes which rendered possible thirty-seven years of coordinated multifrequency observations, most of them discussed by e.g. giovannelli & sabau-graziati [83, 90] and burger et al. [36]. cyclotron lines have been detected in 21 x-ray pulsars ([39] and the references therein), allowing the direct determination of the magnetic field intensity at the surface of the neutron star, of ∼ 10^12 ÷ 10^13 g. accretion powered x-ray pulsars usually capture material from the optical companion via the stellar wind, since the primary star generally does not fill its roche lobe. however, under some specific conditions (e.g. the passage at the periastron of the neutron star) and in particular systems (e.g. a 0535+26/hde 245770), the formation of a temporary accretion disk around the neutron star, behind the shock front of the stellar wind, is possible. this enhances the efficiency of the process of mass transfer from the primary star onto the secondary collapsed star, as discussed by giovannelli & ziółkowski [82] and by giovannelli et al. [87] in the case of a 0535+26. giovannelli & sabau-graziati [93] discussed the history of the discovery of optical indicators of high energy emission in the prototype system a0535+26/hde 245770 ≡ flavia’ star, updated to the march–april 2010 event, when strong optical activity occurred roughly 8 days before the x-ray outburst [38] that was predicted by giovannelli, gualandi & sabau-graziati [91]. this optical indicator of x-ray outbursts, together with the whole history of the a0535+26 system, allows the conclusion that the periastron passage of the neutron star recurs every 110.856 days (the optical orbital period), and that the anomalous and casual x-ray outbursts are triggered starting from that moment and occur roughly 8 days later – the transit time for the material expelled from the primary to reach the secondary. by contrast, the normal outbursts, triggered by the ‘steady’ stellar wind of the be star – in a state of ‘quiescence’ – occur at the periastron. an alternative explanation of such a delay between the optical and x-ray flares could be the presence of a non-stationary accretion disk around the ns, related to the motion of a high-mass-flux region from the outer boundary of the ns roche lobe to the alfvén surface under the action of the α-viscosity, as suggested by bisnovatyi-kogan, giovannelli & klepnev [30], who constructed a quantitative model of such an event. for bright outbursts the 8-day delay is obtained with α = 0.1. however, how x-ray outbursts are triggered in x-ray pulsars is still an important, open and controversial problem in astrophysics. important data are also coming in from gev observations of hmxbs. indeed, abdo et al. [4] presented the first results from the observations of ls i +61°303 using fermi/lat data obtained between 2008 august and 2009 march. their results indicate variability that is consistent with the binary period, with the emission being modulated at 26.6 days. this constitutes the first detection of orbital periodicity in high-energy γ-rays (20 mev ÷ 100 gev). the light curve is characterized by a broad peak after periastron, as well as a smaller peak just before apastron. the spectrum is best represented by a power law with an exponential cutoff, yielding an overall flux above 100 mev of ≈ 0.82 × 10^−6 ph cm^−2 s^−1, with a cutoff at ∼ 6.3 gev and photon index γ ∼ 2.21. there is no significant spectral change with orbital phase.
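a minimal sketch of the quoted spectral shape, a power law with an exponential cutoff, dn/de = n0 e^−γ exp(−e/e_c); the normalization is fixed here so that the integral above 100 mev reproduces the quoted flux, and the 300 gev upper integration bound is an illustrative assumption.

```python
import numpy as np
from scipy.integrate import quad

GAMMA, E_CUT = 2.21, 6.3e3        # photon index and cutoff [MeV], from the text
F100 = 0.82e-6                    # flux above 100 MeV [ph cm^-2 s^-1], from the text

def dnde(e_mev):
    """unnormalized dn/de: power law with exponential cutoff."""
    return e_mev ** (-GAMMA) * np.exp(-e_mev / E_CUT)

raw, _ = quad(dnde, 100.0, 3e5)   # integrate from 100 MeV up to 300 GeV
n0 = F100 / raw                   # normalization [ph cm^-2 s^-1 MeV^(gamma-1)]
print(f"n0 = {n0:.3e}")
print(f"check: integral flux = {n0 * raw:.2e} ph cm^-2 s^-1")
```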
the phase of maximum emission, close to periastron, hints at inverse compton scattering as the main radiation mechanism. however, previous very high-energy gamma ray (> 100 gev) observations by magic and veritas showed peak emission close to apastron. this, and the energy cutoff seen with fermi, suggest that the link between the he and vhe gamma rays is nontrivial. this is an open problem for future investigation.

2.12.1. obscured sources and supergiant fast x-ray transients

one of the most important contributions of the integral satellite has been its scans of the galactic plane and bulge, which have led to the discovery of a number of previously unknown transient x-ray binary sources (e.g. [31]). many of these sources have a high absorbing column density (n_h ≈ 10^23 cm^−2), making the integral hard x-ray response crucial for their discovery. these sources are in the course of being unveiled by means of multi-wavelength optical, near- and mid-infrared observations. other sources discovered by integral are the supergiant high-mass x-ray binaries (sgxbs) (e.g. [45]). they are believed to be rare objects, as stars in the supergiant phase have a very short lifetime, and to date only about two dozen of them have been discovered. they are known to be persistent and bright x-ray sources. integral changed this classical picture, as its observations revealed the presence of a new subclass of sgxbs that have been labeled supergiant fast x-ray transients (sfxts), since they are strongly characterized by fast x-ray outbursts lasting less than a day, typically a few hours [153], and by extreme x-ray luminosity dynamic ranges (10^3 ÷ 10^5) [156]. sfxts are one of the most intriguing (and unexpected) results of the integral mission. they are a new class of high mass x-ray binaries. they are composed of a massive ob supergiant star, as the donor companion, and a compact object. at least four sfxts host a neutron star, because x-ray pulsations have been discovered, while for the others a black hole cannot be excluded [156]. in a recent review, sidoli [155] discussed the latest progress on sfxts and future directions. the importance of the discovery of this new population is based on the constraints it places on the formation and evolution of hmxbs [43, 44, 46, 147]: does the dominant population of short-lived systems – born with two very massive components – occur in rich star-forming regions? what will happen when the supergiant star dies? are the primary progenitors of ns/ns or ns/bh mergers good candidates for gravitational wave emitters? can we find a link with short/hard γ-ray bursts?

2.13. ultra-compact double-degenerated binaries

ultra-compact double-degenerated binaries (ucds) consist of two compact stars, which can be black holes, neutron stars or white dwarfs. in the case of two white dwarfs revolving around each other with an orbital period p_orb ≤ 20 min, the separation is very small. the separation of the two components for a ucd with p_orb ≈ 10 min or shorter is smaller than the diameter of jupiter. smak [157] was the first to suggest that am cvn was such a system, and paczyński [141] discussed the possibility of gravitational wave emission from it. later, many papers appeared about ucd systems (e.g. [50, 136, 164, 174, 175, 177, 178]). these ucds are the evolutionary remnants of low-mass binaries, and they are numerous in the milky way. the discovery of ucds provides interesting prospects for the possible detection of gravitational waves with the lisa observatory.
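the claim that a p_orb ≈ 10 min double white dwarf is more compact than jupiter can be checked with kepler’s third law; the component masses below (two ∼ 0.6 m⊙ wds) are an illustrative assumption, not from the text.

```python
import math

G = 6.674e-11          # gravitational constant [m^3 kg^-1 s^-2]
M_SUN = 1.989e30       # solar mass [kg]
D_JUP = 1.43e8         # jupiter's diameter [m]

m_tot = 1.2 * M_SUN    # assumed total mass of the two white dwarfs
p_orb = 10 * 60.0      # orbital period [s]

# kepler's third law: a^3 = G m_tot p^2 / (4 pi^2)
a = (G * m_tot * p_orb ** 2 / (4 * math.pi ** 2)) ** (1 / 3)
print(f"separation a = {a:.2e} m = {a / D_JUP:.2f} jupiter diameters")
```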
moreover, ucds are extremely important for solving the problem of the sne ia progenitors. indeed, following giovannelli & sabau-graziati [95], it is well accepted by the community that type ia sne are the result of the explosion of a carbon–oxygen wd that grows to near the chandrasekhar limit (m_ch ≈ 1.4 m⊙) in a close binary system [109]. but the debate is focused on the different kinds of progenitors. indeed, in the past, two families of progenitor models have been proposed. they differ in the mode of wd mass increase. the first family is the so-called single degenerate (sd) model [181], in which the wd accretes and burns hydrogen-rich material from the companion. the second family is the so-called double degenerate (dd) model, in which the merging of two wds in a close binary triggers the explosion [112, 179]. the two scenarios produce different delay times from the birth of the binary system to the explosion. thus it is hopefully possible to discover the progenitors of type ia sne by studying their delay time distribution (dtd). the dtd can be determined empirically from the lag between the cosmic star formation rate and the type ia sn birthrate (e.g. [137]). the dd scenario, in which two co wds can produce a sn ia while merging if their combined mass is larger than m_ch, has recently been discussed by toonen, nelemans & portegies zwart [172].

2.14. magnetars

the discovery of magnetars (anomalous x-ray pulsars – axps – and soft gamma-ray repeaters – sgrs) is another very exciting result of recent years ([126, 143] and e.g. the review by giovannelli & sabau-graziati [85] and the references therein). indeed, with magnetic field intensities of order 10^14 ÷ 10^15 g, a question naturally arises: what kind of sn produces such axps and sgrs? are the collapsed objects in axps and sgrs really neutron stars? (e.g. [111, 144]). with such a high magnetic field intensity, an almost ‘obvious’ consequence can be derived: the corresponding dimension of the source must be ∼ 10 m [86]. this could be the dimension of the acceleration zone in supercompact stars. could they be quark stars? ghosh [75] discussed some of the developments in quark star physics, along with the consequences of a possible hadron-to-quark phase transition, in a high-density scenario of neutron stars, and their implications for astroparticle physics. important consequences could be derived from an experimentally demonstrated continuity among rotation-powered pulsars, magnetars, and millisecond pulsars [110, 119]. however, the physical reason for this continuity is not yet clear. several papers have appeared about the possibility of having quark stars (e.g. [97, 138]). strange quark matter could be found in the core of neutron stars or could form strange quark stars. quarks and electrons interact with the magnetic field via their electric charges and anomalous magnetic moments. in contrast to the magnetic field value of 10^19 g obtained when the anomalous magnetic moments are not taken into account, gonzález felipe et al. [97] found the upper bound b ≲ 8.6 × 10^17 g for the stability of the system. a phase transition could be hidden for fields greater than this value. an analytical model of a magnetar as a high-density magnetized quark bag was discussed by orsaria, ranea-sandoval & vucetich [138]. they considered the effect of strong magnetic fields (b > 5 × 10^16 g) on the equation of state.
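to get a feeling for why fields of this order start to matter in the equation of state, one can compare the magnetic energy density u_b = b²/8π with the rest-mass energy density of matter at nuclear saturation; the saturation density used below is a standard value assumed for illustration, not taken from the text.

```python
import math

B = 5e16                 # field threshold quoted above [G]
u_B = B ** 2 / (8 * math.pi)          # magnetic energy density [erg cm^-3]

rho_sat = 2.8e14         # nuclear saturation density [g cm^-3] (assumed)
c = 2.998e10             # speed of light [cm s^-1]
u_nuc = rho_sat * c ** 2              # rest-mass energy density [erg cm^-3]

print(f"u_B = {u_B:.2e} erg/cm^3, u_nuc = {u_nuc:.2e} erg/cm^3")
print(f"ratio u_B/u_nuc = {u_B / u_nuc:.1e}  (grows as B^2)")
```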
orsaria et al. [138] found an analytic expression for the mass–radius (m–r) relationship from the energy variational principle in general relativity, and the results were compared with observational evidence of possible quark and/or hybrid stars. the m–r relationship, gravitational redshift and rotational kepler periods of magnetized quark-hybrid stars were compared with those of standard neutron stars [139].

3. cross sections of nuclear reactions in stars

knowledge of the cross sections of the nuclear reactions occurring in stars is one of the most crucial points in all of astroparticle physics. direct measurements of the cross sections of the 3he(4he, γ)7be and 7be(p, γ)8b reactions of the p–p chain and of the 14n(p, γ)15o reaction of the cno cycle will allow a substantial improvement in our knowledge of stellar evolution. the luna collaboration has already measured the key reactions d(p, γ)3he, 3he(d, p)4he and 3he(4he, γ)7be with good accuracy. these measurements substantially reduce the theoretical uncertainties of the d, 3he and 7li abundances. the d(4he, γ)6li cross section, which is the key reaction for the determination of the primordial abundance of 6li, will be measured in the near future [101, 102]. there is a proposed mission (dual) for nuclear astrophysics studies, which naturally addresses the requirement for medium-sensitivity large-scale exposures and very deep pointed observations [12]. dual would study the origin and evolution of the elements and explore new frontiers of physics: extreme energies that drive powerful stellar explosions and accelerate particles to macroscopic energies; extreme densities that modify the laws of physics around the most compact objects known; and extreme fields that influence matter in a way that is unknown on earth. the roadmap of high energy astronomy shows a white spot on the future track of nuclear astrophysics – i.e. in the energy range 100 kev ÷ 100 mev – which will be explored by the dual mission [13].

4. neutrino astronomy

for a short discussion about neutrino astronomy see e.g. the paper by giovannelli ([87] and the references therein), as well as all the papers of the neutrino astronomy sessions which appeared in the proceedings of the vulcano workshops 2006, 2008, 2010, and 2012 [88, 89]. however, it is important to note that several papers have appeared about: i) the sources of he neutrinos [9] and diffuse neutrinos in the galaxy [62]; ii) potential neutrino signals from galactic γ-ray sources [118]; iii) galactic cosmic-ray pevatrons with multi-tev γ-rays and neutrinos [70]; iv) results achieved with amanda: 32 galactic and extragalactic sources have been detected [182]; the diffuse neutrino flux from the inner galaxy [169]; and a discussion about vhe neutrino astronomy experiments [40]. important news and references can be found in the proceedings of les rencontres de physique de la vallée d’aoste [98]. news about neutrino oscillations has been reported by mezzetto [129]. the angle θ13 differs from zero: sin^2 θ13 = 0.013. this result opens the door to cp violation searches, with profound implications for our understanding of the matter–antimatter asymmetry in the universe. this result is extremely important in view of our understanding of the physics governing the universe. indeed, it confirms the predictions of boyarsky, ruchayskiy & shaposhnikov [32] at the end of their comprehensive overview of an extension of the standard model that contains three right-handed (sterile) neutrinos with masses below the electroweak scale (the neutrino minimal standard model, νmsm).
they considered the history of the universe from the inflationary era until the present day, and demonstrated that most of the observed phenomena beyond the standard model can be explained within the framework of this model. they reviewed the mechanism of the baryon asymmetry of the universe in the νmsm and discussed a dark matter candidate that can be warm or cold and satisfies all existing constraints. from the viewpoint of particle physics, the model provides an explanation for neutrino flavor oscillations.

5. conclusions and reflections

this review is far from complete, but i will now conclude with some comments about the topics discussed here. (1.) many ground-based and space-based experiments are exploring the whole energy range of the cr spectrum, from ∼ 1 gev to ∼ 10^12 gev, and many experiments are programmed for the near future. a significant improvement has been made in the definition of the cr spectrum for protons, electrons, positrons, antiprotons, and all ions. better results are expected in the near future. particular interest is devoted to the knowledge of the extreme he energy crs. (2.) many experiments are exploring cosmic sources along the whole electromagnetic spectrum, and new space-based and ground-based experiments are developing a tendency to explore processes at higher and higher energies, which are directly linking photonic astrophysics with particle astrophysics. (3.) particular attention is necessary at the highest energies, where the cosmic ray spectrum extends to 10^20 ev (see fig. 1). however, the origin of these spectacularly high-energy particles remains obscure. particle energies of this magnitude imply that near their acceleration sites a range of elementary particle physics phenomena is present which is beyond the ability of present day particle accelerators to explore. vhe γ-ray astronomy may catch a glimpse of these phenomena. it is becoming increasingly clear that the energy régime covered by vhe γ-ray astronomy will be able to address a number of significant scientific questions, which include: i) what parameters determine the cutoff energy for pulsed γ-rays from pulsars? ii) what is the role of shell-type supernovae in the production of cosmic rays? iii) at what energies do agn blazar spectra cut off? iv) are gamma blazar spectral cutoffs intrinsic to the source or due to intergalactic absorption? v) is the dominant particle species in agn jets leptonic or hadronic? vi) can the intergalactic absorption of the vhe emission of agns be a tool to calibrate the epoch of galaxy formation, the hubble parameter, and the distance to γ-ray bursts? vii) are there sources of γ-rays which are ‘loud’ at vhes, but ‘quiet’ at other wavelengths? multifrequency astrophysics and multienergy particle astrophysics observations allow us to complement one observation with others, thus improving our potential for deciphering and understanding the problems. there are many problems in performing simultaneous multifrequency, multienergy, multisite, multi-instrument, multi-platform measurements, due to: i) objective technological difficulties; ii) sharing common scientific objectives; iii) problems of scheduling and budgets; iv) the political management of science. in spite of the many ground-based and space-based experiments, which provide an impressive quantity of excellent data in various energy regions, many open problems still remain.
i believe that only a drastic change in the philosophy of the experiments will enable the open problems to be solved more rapidly. for example, in the case of space-based experiments, it is necessary to support small satellites, dedicated to specific missions and problems and providing the possibility of scheduling very long-term observations, because they are faster to prepare, easier to manage and less expensive to run than medium-size and large satellites. in addition, because they can be prepared faster, it is possible to use more recent technologies in small satellites than in bigger satellites, in which, typically, 15-year-old (or older) technologies are used. the international community also needs to be able to persuade the various national and international space agencies to give strong support to arrays of small satellites, each of them specialized in a particular energy band or in the study of a specific physical process. i strongly believe that in the coming decades space-based, ground-based and maybe lunar-based passive physics experiments will be the most suitable probes for sounding the physics of the universe. active physics experiments have probably already reached the maximum dimensions compatible with a reasonable cost/benefit ratio, with the obvious exception of neutrino-astronomy experiments.

acknowledgements

many thanks to professor giora shaviv for helpful suggestions that have improved the quality of this paper. this research has made use of nasa’s astrophysics data system.

references

[1] abdo, a.a. et al.: 2009a, apj 707, 1310 [2] abdo, a.a. et al.: 2009b, sci. 326, 1512 [3] abdo, a.a. et al.: 2009c, apj 706, l56 [4] abdo, a.a. et al.: 2009e, apj 701, l123 [5] abdo, a.a. et al.: 2010, apj 723, 1082 [6] ackermann, m. et al.: 2010, apj 717, l71 [7] ackermann, m. et al.: 2011, phys. rev. lett. 107, 241302 [8] aharonian, f.a.: 2004, very high energy cosmic gamma radiation: a crucial window on the extreme universe, world scientific publishing co. pte. ltd., 99–135 [9] aharonian, f.a.: 2007, sci. 315, 70 [10] albert, j. et al. (magic collaboration): 2008, sci. 320, 1752 [11] alcock, c. et al.: 2000, apj 542, 281 [12] von ballmoos, p., takahashi, t., boggs, s.e.: 2010, nucl. instr. & meth. phys. res. 623, 431 [13] von ballmoos, p. et al. (the dual consortium): 2012, exp. astron. 34, 583 [14] barlow, e.j. et al.: 2006, mnras 372, 224 [15] beall, j.h.: 2002, in multifrequency behaviour of high energy cosmic sources, f. giovannelli & l. sabau-graziati (eds.), mem. sait 73, 379 [16] beall, j.h.: 2003, chja&as 3, 373 [17] beall, j.h.: 2008, chja&as 8, 311 [18] beall, j.h.: 2009, in frontier objects in astrophysics and particle physics, f. giovannelli & g. mannocchi (eds.), sif, ed. compositori, bologna, italy, 98, 283 [19] beall, j.h., guillory, j., rose, d.v.: 1999, in multifrequency behaviour of high energy cosmic sources, f. giovannelli & l. sabau-graziati (eds.), mem. sait 70, 1235 [20] beall, j.h. et al.: 2006, chja&as1 6, 283 [21] beall, j.h. et al.: 2007, in frontier objects in astrophysics and particle physics, f. giovannelli & g. mannocchi (eds.), sif, ed. compositori, bologna, italy, 93, 315 [22] beall, j.h., guillory, j., rose, d.v.: 2009, in frontier objects in astrophysics and particle physics, f. giovannelli & g. mannocchi (eds.), sif, ed. compositori, bologna, italy, 98, 301 [23] beatty, j.j.
et al.: 2011, cern web page, rpp2011-rev-cosmic-rays [24] bednarek, w., giovannelli, f., karakula, s., tkaczyk, w.: 1990, a&a 236, 268 [25] bennett, c.l. et al.: 2003, apj 583, 1 [26] bernabei, r. et al.: 2011, in frontier objects in astrophysics and particle physics, f. giovannelli & g. mannocchi (eds.), sif, ed. compositori, bologna, italy, 103, 157 [27] bernabei, r. et al.: 2013, this volume [28] biermann, p.l.: 1999, astrophys. & space sci. 264, 423 [29] blanch, o., martinez, m.: 2005, astrop. phys. 23, 588 [30] bisnovatyi-kogan, g.s., giovannelli, f., klepnev, a.: 2012, invited talk presented at cospar 2012, e1.2-0025-12 [31] bodaghee, a. et al.: 2007, a&a 467, 585 [32] boyarsky, a., ruchayskiy, o., shaposhnikov, m.: 2009, ann. rev. nucl. part. sci. 59, 191 [33] bringmann, t. et al.: 2012, j. cosm. & ap. phys., issue 07, id. 054 [34] bruno, a.: 2011, in frontier objects in astrophysics and particle physics, f. giovannelli & g. mannocchi (eds.), sif, ed. compositori, bologna, italy, 103, 139 [35] buckley, m.r., hooper, d.: 2012, phys. rev. d 86, 043524 [36] burger, m. et al.: 1996, in multifrequency behaviour of high energy cosmic sources, f. giovannelli & l. sabau-graziati (eds.), mem. sait 67, 365 [37] burles, s., nollet, k.m., turner, m.s.: 2001, apjl 552, l1 [38] caballero, i. et al.: 2010, atel no. 2541 [39] caballero, i., wilms, j.: 2012, in multifrequency behaviour of high energy cosmic sources, f. giovannelli & l. sabau-graziati (eds.), mem. sait 83, 230 [40] cao, z.: 2008, nucl. phys. b (proc. suppl.) 175–176, 377 [41] cesarini, a. et al.: 2004, ap. phys. 21, 267 [42] chaty, s.: 1998, ph.d. thesis, university paris xi [43] chaty, s.: 2007, in frontier objects in astrophysics and particle physics, f. giovannelli & g. mannocchi (eds.), sif, ed. compositori, bologna, italy, 93, 329 [44] chaty, s.: 2008, chja&as 8, 197 [45] chaty, s.: 2010, in high energy phenomena in massive stars, j. martí, p.l. luque-escamilla & j.a. combi (eds.), asp conf. ser. 422, 243 [46] chaty, s., filliarte, p.: 2005, chja&as 5, 104 [47] colafrancesco, s.: 2002, a&a 396, 31 [48] colafrancesco, s.: 2003, in frontier objects in astrophysics and particle physics, f. giovannelli & g. mannocchi (eds.), sif, ed. compositori, bologna, italy, 85, 141 [49] craig, h. group: 2012, talk presented at the aps april meeting on 100 years of cosmic ray physics [50] cropper, m., ramsay, g., wu, k., hakala, p.: 2004, in magnetic cataclysmic variables, iau coll. 190, sonja vrielmann & mark cropper (eds.), asp conf. proc. 315, 324 [51] cucchiara, a. et al.: 2011, apj 736, 7 [52] dar, a., de rújula, a.: 2004, phys. rep. 405, 203 [53] de angelis, a., mansutti, o., persic, m.: 2008, il n. cim. 31 n. 4, 187 [54] della valle, m., livio, m.: 1994, apj 423, l31 [55] dermer, c.d., razzaque, s., finke, j.d., atoyan, a.: 2009, new j. of phys. 11, 1 [56] djorgovski, s.g.: 2004, nature 427, 790 [57] djorgovski, s.g.: 2005, in the tenth marcel grossmann meeting, m. novello, s. perez bergliaffa & r. ruffini (eds.), world scientific publishing co., p. 422 [58] drury, l.o’c.: 2012, arxiv:1203.3681v1 [59] dunkley, j. et al.: 2009, apj 701, 1804 [60] elias-miró, j. et al.: 2012, phlb 709, 222 [61] ellis, j.: 2002, astro-ph 4059 (arxiv:hep-ex/0210052) [62] evoli, c., grasso, d., maccione, l.: 2007, astro-ph 0701856 [63] fabian, a.c.: 1994, ann. rev. a&a 32, 277 [64] fabian, a.c., nulsen, p.e.j., canizares, c.r.: 1991, a&a rev. 2, 191 [65] fan, x., carilli, c.l., keating, b.: 2006, ara&a 44, 415 [66] fan, x.
et al.: 2003, aj 125, 1649 [67] fargion, d.: 2003, chja&as 3, 472 [68] fargion, d., grossi, m.: 2006, chja&as1 6, 342 [69] di felice, v. et al.: 2010, in 38th cospar scientific assembly, p. 5 [70] gabici, s., aharonian, f.a.: 2007, apjl 665, l131 [71] gaisser, t.k., stanev, t.: 2000, the eur. phys. j. c – particles and fields 15 (ns 1–4), 150 [72] genzel, r. et al.: 2003a, apj 594, 812 [73] genzel, r. et al.: 2003b, nature 425, 934 [74] ghez, a.m. et al.: 2005, apj 620, 744 [75] ghosh, s.k.: 2009, in frontier objects in astrophysics and particle physics, f. giovannelli & g. mannocchi (eds.), sif, ed. compositori, bologna, italy, 98, 243 [76] gianotti, f. (atlas collaboration): 2011, talk at cern public seminar, dec. 13, 2011; atlas-conf-2011-163 [77] gianotti, f. (atlas collaboration): 2012, talk at cern public seminar, july 4, 2012 [78] ginzburg, v.l., syrovatskij, s.i.: 1966, sov. phys. uspekhi 9 (n. 2), 223 [79] giovannelli, f.: 2007; 2009; 2011, in frontier objects in astrophysics and particle physics, f. giovannelli & g. mannocchi (eds.), sif, ed. compositori, bologna, italy, 93, 3; 98, 3; 103, 3 [80] giovannelli, f.: 2008, chinese j. a&a s 8, 237 [81] giovannelli, f.: 2012, in 9th ibws 2012 proceedings, acta polytechnica (in press) [82] giovannelli, f., ziółkowski, j.: 1990, aca 40, 95 [83] giovannelli, f., sabau-graziati, l.: 1992, space sci. rev. 59, 1 [84] giovannelli, f., sabau-graziati, l.: 2001, ap&ss 276, 67 [85] giovannelli, f., sabau-graziati, l.: 2004, space sci. rev. 112, 1 [86] giovannelli, f., sabau-graziati, l.: 2006, chja&as1 6, 1 [87] giovannelli, f., bernabei, s., rossi, c., sabau-graziati, l.: 2007, a&a 475, 651 [88] giovannelli, f., mannocchi, g. (eds.): 2007; 2009; 2011, proc. vulcano workshops on frontier objects in astrophysics and particle physics, sif, ed. compositori, bologna, italy, vols. 93; 98; 103 [89] giovannelli, f., mannocchi, g. (eds.): 2013, this volume [90] giovannelli, f., sabau-graziati, l.: 2008, chja&as 8, 1 [91] giovannelli, f., gualandi, r., sabau-graziati, l.: 2010, atel no. 2497 [92] giovannelli, f., sabau-graziati, l.: 2010, in multifrequency behaviour of high energy cosmic sources, f. giovannelli & l. sabau-graziati (eds.), mem. sait 81 n. 1, 18 [93] giovannelli, f., sabau-graziati, l.: 2011, acta polyt. 51 n. 2, 21 [94] giovannelli, f., sabau-graziati, l.: 2012a, in multifrequency behaviour of high energy cosmic sources, f. giovannelli & l. sabau-graziati (eds.), mem. sait 83 n. 1, 17 [95] giovannelli, f., sabau-graziati, l.: 2012b, in the golden age of cataclysmic variables and related objects, f. giovannelli & l. sabau-graziati (eds.), mem. sait 83 n. 2, (in press) [96] giovannelli, f., sabau-graziati, l. (eds.): 2012c, the golden age of cataclysmic variables and related objects, mem. sait 83 n. 2, (in press) [97] gonzález felipe, r., pérez martínez, a., pérez rojas, h., orsaria, m.: 2008, phys. rev. c 77, issue 1, 015807 [98] greco, m. (ed.): 2009, 2010, le rencontres de physique de la vallée d’aoste: results and perspectives in particle physics, frascati phys. ser. vol. l, vol. li [99] greiner, j. et al.: 2009, apj 693, 1610 [100] grimm, h.-j.: 2003, phd thesis, ludwig-maximilians-universität, münchen, germany [101] gustavino, c.: 2007; 2009; 2011, in frontier objects in astrophysics and particle physics, f. giovannelli & g. mannocchi (eds.), sif, ed.
compositori, bologna, italy, 93, 191; 98, 77; 103, 657 [102] gustavino, c.: 2013, this volume [103] hess, v.f.: 1912, physik zh. 13, 1084 [104] van den heuvel, e.p.j.: 2009, ap&ss library 359, 125 [105] hillas, a.m., johnson, a.p.: 1990, proc. 21st intern. cosmic ray conf. (adelaide) 4, 19 [106] hooper, d., blasi, p., serpico, p.d.: 2009, j. cosm. & ap. phys., issue 01, 1 [107] hooper, d., stebbins, a., zurek, k.m.: 2009, phys. rev. d 79, 103513 [108] hooper, d., kelso, c., queiroz, f.s.: 2012, arxiv:1209.3015v1 [109] hoyle, f., fowler, w.a.: 1960, apj 132, 565 [110] hunter et al.: 2009, astro2010: the astronomy and astrophysics decadal survey, science white papers, no. 137 [111] hurley, k.: 2008, chja&as 8, 202 [112] iben, i., jr., tutukov, a.v.: 1984, apjs 54, 335 [113] incandela, j. (cms spokesperson): 2012, talk at cern public seminar, july 4, 2012 [114] izzo, l. et al.: 2010, j. korean phys. soc. 57, no. 3, 551 [115] jang, m. et al.: 2011, apj 741, l20 [116] johnstone, wm.r.: 2005, http://www.johnstonsarchive.net/relativity/binpulstable.html [117] kafka, s.: 2012, j. astron. space sci. 29(2), 163, http://dx.doi.org/10.5140/jass.2012.29.2.163 [118] kappes, a., hinton, j., stegman, c., aharonian, f.a.: 2007, apj 656, 870 [119] kuiper, l.: 2007, talk presented at the frascati workshop on multifrequency behaviour of high energy cosmic sources [120] kulikov, g.v., khristiansen, g.b.: 1958, jetp 35, 63 [121] kundt, w.: 2013, this volume [122] larosa, t.n., kassim, n.e., lazio, t.j.w., hyman, s.d.: 2000, aj 119, 207 [123] lawrence, m.a., reid, r.j.o., watson, a.a.: 1991, j. phys. g: nucl. part. phys. 17, 733 [124] martinez, m.: 2007, ap&ss 309, 477 [125] mather, j.c. et al.: 1994, apj 420, 439 [126] mereghetti, s., stella, l.: 1995, apjl 442, l17 [127] meszaros, p., rees, m.j.: 1992, apj 397, 570 [128] mezger, p.g., duschl, w.j., zylka, r.: 1996, a&a rev. 7, 289 [129] mezzetto, m.: 2011, journal of physics: conference series 335, 012005 [130] morselli, a.: 2007, in high energy physics ichep ’06, y. sissakian, g. kozlov & e. kolganova (eds.), world sci. pub. co., p. 222 [131] mortlock, d.j. et al.: 2011, nature 474, 616 [132] mostafa, m.: 2012, talk presented at the aps april meeting on 100 years of cosmic ray physics [133] nagano, m. et al.: 1992, j. phys. g: nucl. part. phys. 18, 423 [134] nagano, m., watson, a.a.: 2000, rev. mod. phys. 72, 689 [135] nakamura, k. et al. (particle data group): 2010, j. phys. g: nucl. part. phys. 37, 075021 [136] nelemans, g., jonker, p.g.: 2010, new astron. rev. 54, 87 [137] nelemans, g., toonen, s., bours, m.: 2012, arxiv:1204.2960v2 [138] orsaria, m., ranea-sandoval, i.f., vucetich, h.: 2011, apj 734, 41 [139] orsaria, m., ranea-sandoval, i.f., vucetich, h., weber, f.: 2011, int. j. mod. phys. e 20, 25 [140] ouchi, m. et al.: 2010, apj 723, 869 [141] paczyński, b.: 1967, aca 17, 277 [142] palanque-delabrouille, n.: 2003, in frontier objects in astrophysics and particle physics, f. giovannelli & g. mannocchi (eds.), sif, ed. compositori, bologna, italy, 85, 131 [143] van paradijs, j., taam, r.e., van den heuvel, e.p.j.: 1995, a&a 299, l41 [144] pérez martínez, a., pérez rojas, h., mosquera cuesta, h.j.: 2003, the european phys. j. c 29, issue 1, 111 [145] perley, d.a. et al.: 2010, mnras 406, 2473 [146] piran, t.: 1999, phys. rep.
314, 575 [147] rahoui, f., chaty, s., lagage, p.-o, pantin, e.: 2008, a&a 484, 801 [148] ressel, m.t., turner, m.s.: 1990, comm. astrophys. 14, 323 [149] ruffini, r. et al.: 2003, aip conf. proc. 668, 16 [150] salvaterra, r. et al.: 2009, mem, sait. 80, 26 [151] schlickeiser, r.: 2003, cosmic ray astrophysics, springer-verlag berlin heidelberg new york [152] schroedter, m.: 2005, apj 628, 617 [153] sguera, v. et al.: 2006, apj 646, 452 [154] shaviv, n.j., nakar, e., piran, t.: 2009, phrvl 103, 111302. [155] sidoli, l.: 2011, adv. space res. 48, 88 [156] sidoli, l. et al.: 2010, arxiv:1001.3234v1 [157] smak, j.: 1967, aca 17, 255 [158] spergel, d.n. et al.: 2003, apjs 148, 175 [159] srianand, r., petitjean, p., ledoux, c.: 2000, nature 408, 931 [160] stanev, t.: 2010, high energy cosmic rays, springer praxis books, springer-verlag berlin heidelberg [161] stark, d.p., loeb, a., ellis, r.s.: 2007, apj 668, 627 [162] stecker, f.w., malkan, m.a., scully, s.t.: 2006, apj 648, 774 [163] straessner, a. (on behalf of the atlas collaboration): 2011, in frontier objects in astrophysics and particle physics, f. giovannelli & g. mannocchi (eds.), sif, ed. compositori, bologna, italy, 103, 43 [164] strohmayer, t.e.: 2005, apj 627, 920 [165] taani, a. et al.: 2012a, astron. nachr/an 333, no. 1, 53 [166] taani, a. et al.: 2012b, arxiv:1201.3779v2 [167] tagliaferri, g. et al.: 2005, a&a 443, 1 [168] tanvir, n.r. et al.: 2009, nature 461, 1254 [169] taylor, a.m. et al.: 2008, in high energy gamma-ray astronomy, aip conf. proc. 1085, 384 [170] thompson, d.j.: 2008, rep. progr. phys. 71, issue 11, pp. 116901 [171] tonelli, g. (cms collaboration): 2011, talk at cern public seminar, dec. 13, 2011 [172] toonen, s., nelemans, g., portegies zwart, s.: 2012, arxiv1208.6446 [173] tozzi, p. et al.: 2003, apj 593, 705 [174] tutukov, a., yungelson, l.y.: 1996, mnras 280, 1035 [175] ulla, a.: 1994, space sci. rev. 67, 241 [176] wang, f.y., dai, z.g.: 2009, mnras 400, 10 [177] warner, b.: 1995, ap&ss 225, 249 [178] warner, b., robinson, e.l.: 1972, mnras 159, 101 [179] webbink, r.f.: 1984, apj, 277, 355 [180] weniger, c.: 2012, j. cosm. & ap. phys. issue 08, article id. 007 [181] whelan, j., iben, i. jr.: 1973, apj, 186, 1007 [182] xu, x.w. (icecube collaboration): 2008, n. phys. b 175-176, 401 [183] zatsepin, v.i.: 1995, j. phys. g: nucl. part. phys. 21, issue 5, l31 496 acta polytechnica 53(supplement):483–496, 2013 1 introduction 2 the main pillars of astroparticle physics 2.1 cosmic rays 2.2 lhc results 2.3 big bang and diffuse extragalactic background radiation 2.4 reionization of the universe 2.5 clusters of galaxies 2.6 dark energy and dark matter 2.7 the galactic center 2.8 gamma-ray bursts 2.9 extragalactic background light 2.10 relativistic jets 2.11 cataclysmic variables 2.12 high mass x-ray binaries 2.12.1 obscured sources and supergiant fast x-ray transients 2.13 ultra-compact double-degenerated binaries 2.14 magnetars 3 cross sections of nuclear reactions in stars 4 neutrino astronomy 5 conclusions and reflections acknowledgements references acta polytechnica doi:10.14311/ap.2013.53.0573 acta polytechnica 53(supplement):573–578, 2013 © czech technical university in prague, 2013 available online at http://ojs.cvut.cz/ojs/index.php/ap the first stars daniel j. whalen∗ mcwilliams fellow, department of physics, carnegie mellon university, pittsburgh, pa 15213 ∗ corresponding author: dwhalen@lanl.gov abstract. 
pop iii stars are the key to the character of primeval galaxies, the first heavy elements, the onset of cosmological reionization, and the seeds of supermassive black holes. unfortunately, in spite of their increasing sophistication, numerical models of pop iii star formation cannot yet predict the masses of the first stars. because they also lie at the edge of the observable universe, individual pop iii stars will remain beyond the reach of observatories for decades to come, and so their properties are unknown. however, it will soon be possible to constrain their masses by direct detection of their supernovae, and by reconciling their nucleosynthetic yields to the chemical abundances measured in ancient metal-poor stars in the galactic halo, some of which may bear the ashes of the first stars. here, i review the state of the art in numerical simulations of primordial stars and attempts to directly and indirectly constrain their properties.

keywords: early universe, galaxies: high redshift, stars: early type, supernovae: general, radiative transfer, hydrodynamics, shocks.

1. the simulation frontier

unlike with star formation in the galaxy today, there is little disagreement over the initial conditions of star formation in the primordial universe. the original numerical simulations suggested that pop iii stars formed in small pregalactic structures known as cosmological minihalos at z ∼ 20 ÷ 30, or ∼ 200 myr after the big bang [1–4, 24]. these models predicted that the stars formed in isolation, one per halo, and that they were 100 ÷ 500 m⊙. pop iii stars profoundly transformed the halos that gave birth to them, expelling their baryons in supersonic ionized flows and later exploding as supernovae (e.g. [21, 38, 39, 44]). radiation fronts from these stars also engulfed nearby halos, either promoting or suppressing star formation in them, thereby regulating the rise of the first stellar populations [11, 15, 30, 33, 34, 40, 41]. the original estimates of pop iii stellar masses were not obtained by modeling the actual formation and evolution of the stars. they were derived by comparing infall rates at the center of the halo at early stages of collapse to kelvin–helmholtz contraction times to place upper limits on the final mass of the star. later simulation campaigns in the same vein revealed a much broader range of final masses for pop iii stars, 30 ÷ 300 m⊙ [27], and that they could form as binaries in a fraction of the halos [35]. heroic numerical efforts have only recently achieved the formation of a hydrostatic protostar at the center of the halo [51] and the collapse of the central flow into an accretion disk [6, 9, 10, 31, 32]. in particular, the disk calculations indicate that the disks are unstable to fragmentation, raising the possibility that pop iii stars may have only been tens of solar masses, not hundreds, and that they may have formed in small swarms of up to a dozen at the centers of primeval halos.

figure 1. the formation and fragmentation of a pop iii protostellar disk in the arepo code [10].

computer models of ionizing uv breakout in the final stages of pop iii protostellar disks have also found that the i-front of the nascent star exits the disk in bipolar outflows that terminate accretion onto the star and mostly evaporate the disk by the time the star reaches ∼ 40 m⊙ [13]. this result reinforces the sentiments of some in the community that while the pop iii imf was top-heavy, primordial stars may only have been 10 ÷ 40 m⊙.
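the infall-rate argument above can be made concrete with a back-of-envelope estimate. the sketch below is our illustration, not a calculation from the paper: it evaluates the characteristic accretion rate ṁ ∼ c_s³/g of a collapsing isothermal cloud, with the temperature and mean molecular weight being assumed values for h2-cooled primordial gas.

```python
import math

# physical constants (SI)
k_B, m_H = 1.380649e-23, 1.6726e-27   # boltzmann constant, proton mass
G, M_sun = 6.674e-11, 1.989e30        # gravitational constant, solar mass
year = 3.156e7                        # seconds per year

# assumed gas state for H2-cooled primordial gas (illustrative, not from the paper)
T, mu, gamma = 1000.0, 1.2, 5.0 / 3.0

c_s = math.sqrt(gamma * k_B * T / (mu * m_H))  # sound speed [m/s]
mdot = c_s**3 / G                              # characteristic infall rate [kg/s]

print(f"c_s  ≈ {c_s / 1e3:.1f} km/s")
print(f"mdot ≈ {mdot * year / M_sun:.1e} M_sun/yr")
# of order 1e-3..1e-2 M_sun/yr; sustained over a ~1e4-1e5 yr contraction
# time this admits final masses of tens to hundreds of solar masses,
# which is the spirit of the early upper limits quoted above
```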
1.1. high-mass or low-mass pop iii stars?

in spite of their increasing sophistication, these simulations should be taken to be very preliminary for several reasons. first, pop iii accretion disks form in smoothed-particle hydrodynamics (sph) models but not in adaptive mesh refinement (amr) simulations, although the amr models have not evolved the collapse of the halo to the times achieved by sph calculations. this raises the question of whether one technique better captures the transport of angular momentum out of the center of the cloud than the other, and whether accretion is ultimately spherical or through a disk. second, the stability of the disk itself remains an open question because although the simulations can now fully resolve the disk they do not yet incorporate all of its relevant physics. in particular, they lack high-order radiation transport, which regulates the thermal state of the disk and its tendency to fragment. furthermore, the role of primordial magnetic fields in the formation and evolution of the disk is not well understood [36, 48]. the use of sink particles to represent disk fragments in the original sph simulations of pop iii protostellar disk formation called into question the longevity of the fragments. once they are created in the simulations they are never destroyed, unlike real fragments that could be torn apart by gravitational torques and viscous forces [25]. more recent moving mesh simulations performed with the arepo code that do not rely on sink particles find that the fragments persist, but they only evolve the disk for 10 ÷ 20 yr [10]. perhaps most importantly, no simulation has followed the disks for more than a few centuries, far short of the time required to assemble a massive star. thus, it remains unclear if the fragments in the disk remain distinct objects or merge with the largest one at the center, building it up into a very massive star over time through protracted, clumpy accretion.

1.2. accretion cutoff and the final masses of pop iii stars

this latter point directly impacts estimates of final masses for pop iii stars inferred from numerical simulations that attempt to model how ionizing uv from the star reverses infall and evaporates the accretion disk. at the heart of such models is a simple recipe for the evolution of the protostar that provides a prescription for its radius and luminosity as a function of time and acquired mass. the hosokawa et al. [13] 2d calculations take the growth of the protostar to be relatively steady, in which case it contracts and settles onto the main sequence at ∼ 30 m⊙. at this point the star becomes extremely luminous in ionizing uv radiation that halts accretion onto the star in a few hundred kyr at a final mass of ∼ 40 m⊙. if accretion instead turns out to be clumpy, the protostar could remain puffy and cool and reach much larger masses before burning off the disk. a finer point is that all current accretion cutoff simulations evolve both radiation and hydrodynamics on the courant time, a practice which is known to lead to serious inaccuracies in i-front propagation in density gradients [43]. such coarse time steps may result in premature i-front breakout and accretion cutoff, and hence underestimates of the final mass of the star.
three dimensional simulations with more accurate radiation–matter coupling schemes, both steady and clumpy accretion scenarios, more realistic prescriptions for protostellar evolution based on nucleosynthesis codes such as kepler [37, 49] and a variety of halo environments may better constrain the pop iii imf. however, in judging the power of such simulations to model the masses of the first stars, it should be remembered that no simulations realistically bridge the gap in time between the formation and fragmentation of a protostellar disk and its photoevaporation up to a myr later. we note in passing that fragments can also stop accreting if they are ejected from the disk by 3-body gravitational effects [9, 17]. these fragments could become very low-mass pop iii stars (∼ 1 m⊙); if so, some of them may live today.

2. constraining the pop iii imf with stellar archaeology

unfortunately, because they lie at the edge of the observable universe, individual pop iii stars will remain beyond the reach of direct detection for decades to come, even with their enormous luminosities [29] and the advent of the next generation of near infrared (nir) observatories such as the james webb space telescope (jwst) and the thirty-meter telescope (tmt). however, there have been attempts to indirectly constrain the masses of pop iii stars by comparing the cumulative elemental yield of their supernovae to the fossil chemical abundances found in ancient metal-poor stars in the galactic halo, some of which may be second-generation stars. stellar evolution models indicate that 15 ÷ 40 m⊙ primordial stars died in core collapse (cc) supernovae (sne) and that 40 ÷ 140 m⊙ stars collapsed to black holes, perhaps with violent pulsational mass loss prior to death [12]. pop iii stars between 140 and 260 m⊙ can die in pair-instability (pi) sne with energies up to 100 times those of type ia sne, which completely unbind the star and leave no compact remnant (chatzopoulos & wheeler 2012 have recently discovered that rotating pop iii stars down to 65 m⊙ can also die as pi sne). these explosions were the first great nucleosynthetic engines of the universe, expelling up to half the mass of the progenitor in heavy elements into the early igm. primordial stars above 260 m⊙ collapsed directly to black holes, with no mass loss. joggerst et al. [18] recently calculated the chemical imprint of low-mass pop iii sne on later generations of stars by modeling mixing and fallback onto the central black hole in 15 ÷ 40 m⊙ pop iii core collapse explosions with the castro amr code. as shown in fig. 2, a simple power-law imf average of the elemental yields of these explosions is in good agreement with the fossil abundances in a sample of 130 extremely metal-poor stars with z < 10⁻⁴ z⊙ [5, 22].

figure 2. comparing pop iii sn yields to the chemical abundances of three of the most metal-poor stars (left panel) and the extremely metal-poor (emp) stars in the cayrel et al. [5] and lai et al. [22] surveys (right panel). in the left panel, the abundances in he0557-4840 agree well with the yields from sn model z15g in the joggerst et al. [18] study. in the right panel we show that higher explosion energy rotating z = 0 stars reproduce emp abundances well. the existence of 15 m⊙ pop iii stars is required to produce this good agreement with observations.
although these results suggest that low-mass pop iii stars shouldered the bulk of the chemical enrichment of the early igm, 40 ÷ 60 m⊙ hypernova explosions, whose energies are intermediate between those of cc and pi sne, may also have contributed metals at high redshifts [16]. to date, the telltale odd-even nucleosynthetic signature of pi sne has not been found in the fossil abundance record, leading some to assert that pop iii stars could not have been very massive. however, the odd-even effect may have been masked by observational bias in previous surveys [19]. reconciling pop iii sn yields to the elemental patterns in metal-poor stars is still in its infancy for several reasons. first, only small numbers of extremely metal-poor stars have been discovered to date, and larger sample sizes would better constrain early sn yields. second, measurements of some elements in low-metallicity stars are challenging and in the past have been subject to systematic error. finally, there are many intervening hydrodynamical processes between the expulsion of the first metals and their uptake into new stars that are not yet understood.

3. finding the first cosmic explosions

detection of pop iii sne would unambiguously probe the masses of primordial stars for the first time. since these explosions are 100,000 times brighter than either their progenitors or the primitive galaxies that host them, they could be found by jwst or the wide-field infrared survey telescope (wfirst). however, unlike the type ia sne used to constrain cosmic acceleration, light from primeval supernovae must traverse the vast cosmic web of neutral hydrogen that filled the universe prior to the epoch of reionization. lyman absorption by this hydrogen removes or scatters most of the light from ancient supernovae out of our line of sight, obscuring them. whalen et al. [45–47] have calculated jwst nir light curves for pop iii pair-instability sne with the los alamos rage and spectrum codes [7, 8], which are shown for z = 10, 15, 20 and 30 in fig. 3. these simulations include radiation hydrodynamical calculations of the sn light curve and spectra in the local frame, cosmological redshifting, and lyman absorption by intergalactic hydrogen. jwst detection limits at 2 ÷ 4 µm are ab magnitude 31 ÷ 32, so it is clear that jwst will be able to detect the first cosmic explosions in the universe if they are pi sne (and even perform spectroscopy on them). even given jwst’s very narrow fields of view at high redshifts, recent calculations indicate that at least a few pi sne should be present in any given jwst survey [14]. also, because wfirst detection limits will be ab magnitude 26.5 at 2.2 µm, it is clear from fig. 3 that pop iii pi sne will be visible to wfirst out to z ∼ 15 ÷ 20. since it is an all-sky survey, and because this redshift range may favor the formation of very massive pop iii stars because of the rise of lyman-werner uv backgrounds [28], wfirst will detect much larger numbers of pop iii sne.
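for readers unfamiliar with the ab system, the detection limits quoted above can be turned into physical flux densities with the standard definition m_ab = −2.5 log₁₀ f_ν − 48.60, with f_ν in erg s⁻¹ cm⁻² hz⁻¹. the short sketch below is our illustration of that conversion, not part of the paper:

```python
def ab_to_flux(m_ab: float) -> float:
    """flux density [erg s^-1 cm^-2 Hz^-1] for a given AB magnitude."""
    return 10.0 ** (-0.4 * (m_ab + 48.60))

# detection limits quoted in the text
for label, m in [("JWST  (2-4 um) ", 31.5), ("WFIRST (2.2 um)", 26.5)]:
    f_nu = ab_to_flux(m)
    nJy = f_nu * 1e23 * 1e9     # 1 Jy = 1e-23 erg s^-1 cm^-2 Hz^-1
    print(f"{label}: m_AB = {m:4.1f} -> f_nu ≈ {f_nu:.1e} cgs ≈ {nJy:6.1f} nJy")
# AB 31.5 corresponds to roughly 1 nJy, AB 26.5 to roughly 100 nJy
```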
figure 3. pop iii pi sn light curves for the jwst nircam (ab magnitude versus time in days for 150, 175, 200, 225 and 250 m⊙ progenitors). clockwise from the upper left panel, the redshifts are z = 10, 15, 20 and 30. the optimum filter for each redshift is noted on the y-axis labels and the times on the x-axes are for the observer frame. f200, f277 and f356 are at 2.0, 2.77 and 3.56 µm, respectively.

could pop iii sne be detected at later stages by other means? whalen et al. [40] found that most of the energy of pop iii cc sne is eventually radiated away as h and he lines as the remnant sweeps up and shocks surrounding gas. at later epochs this energy would instead be lost to fine structure cooling by metals. in both cases the emission is too dim, redshifted and drawn out over time to be detected by any upcoming instruments. however, pi sne deposit up to half of their energy into the cosmic microwave background (cmb) by inverse compton scattering at z ∼ 20 [21, 40] and could impose excess power on the cmb at small scales [26]. the resolution of current ground-based cmb telescopes such as the atacama cosmology telescope and the south pole telescope approaches that required to directly image sunyaev–zel’dovich (sz) fluctuations from individual pop iii pi sn remnants, so future observatories may detect them. unlike pi sne, cc sne deposit little of their energy into the cmb at z ∼ 20, and even less at lower redshifts because the density of cmb photons falls with cosmological expansion. the extreme nir luminosities of primordial pi sne could contribute to a nir background excess, as has been suggested for pop iii stars themselves, e.g. [20]. new calculations reveal that enough synchrotron emission from cc sn remnants is redshifted into the 21 cm band above z ∼ 10 to be directly detected by the square kilometer array (ska) [23]. somewhat more energetic hypernovae could be detected by existing facilities such as the extended very large array (evla) and e-merlin. pi sn remnants generally expand into ambient media that are too diffuse to generate a detectable synchrotron signal. pop iii sn event rates make it unlikely that they will be found in absorption at 21 cm at z > 10. the detection of the first cosmic explosions will be one of the most spectacular discoveries in extragalactic astronomy in the coming decade, opening our first observational window on the era of first light and the end of the cosmic dark ages at z ∼ 30. they will unveil the nature of primordial stars and constrain scenarios for early cosmological reionization, the process whereby the universe was gradually transformed from a cold, dark, featureless void into the vast, hot, ionized expanse of galaxies we observe today. at somewhat lower redshifts (z ∼ 10 ÷ 15), detections of pop iii supernovae will probe the era of primitive galaxy formation, marking the positions of nascent galaxies on the sky that might otherwise have eluded detection by jwst.
finally, finding the first supernovae could also reveal the masses of the seeds of the supermassive black holes lurking at the centers of massive galaxies today.

acknowledgements

djw was supported by the bruce and astrid mcwilliams center for cosmology at carnegie mellon university. work at lanl was done under the auspices of the national nuclear security administration of the u.s. dept of energy at los alamos national laboratory under contract no. de-ac52-06na25396. all numerical simulations were performed on institutional computing (ic) and yellow network platforms at lanl (conejo, lobo and yellowrail).

references

[1] abel, t., bryan, g. & norman, m. l.: 2000, apj, 540, 39
[2] abel, t., bryan, g. & norman, m. l.: 2002, science, 295, 93
[3] bromm, v., coppi, p. s. & larson, r. b.: 1999, apjl, 527, l5
[4] bromm, v., coppi, p. s. & larson, r. b.: 2002, apj, 564, 23
[5] cayrel, r. et al.: 2004, a&a, 416, 1117
[6] clark, p. c. et al.: 2011, science, 331, 1040
[7] frey, l. h. et al.: 2012, apjs, submitted, arxiv:1203.5832
[8] gittings, m. et al.: 2008, comp. sci. & disc., 1, 015005
[9] greif, t. h. et al.: 2011, apj, 737, 75
[10] greif, t. h. et al.: 2012, mnras, 424, 399
[11] hasegawa, k. et al.: 2009, mnras, 445
[12] heger, a. & woosley, s. e.: 2002, apj, 567, 532
[13] hosokawa, t. et al.: 2011, science, 334, 1250
[14] hummel, j. et al.: 2012, apj, submitted, arxiv:1112.5207
[15] iliev, i. et al.: 2005, mnras, 361, 405
[16] iwamoto, n. et al.: 2005, science, 309, 451
[17] johnson, j. l. & khochfar, s.: 2011, mnras, 413, 1184
[18] joggerst, c. c. et al.: 2010, apj, 709, 11
[19] karlsson, t., johnson, j. l. & bromm, v.: 2008, apj, 679, 6
[20] kashlinsky, a. et al.: 2005, nature, 438, 45
[21] kitayama, t. & yoshida, n.: 2005, apj, 630, 675
[22] lai, d. k. et al.: 2008, apj, 681, 1524
[23] meiksin, a. & whalen, d. j.: 2012, mnras, submitted, arxiv:1209.1915
[24] nakamura, f. & umemura, m.: 2001, apj, 548, 19
[25] norman, m. l.: 2010, in first stars and galaxies: challenges in the coming decade, aip conf. ser. 1294, 17
[26] oh, s. p., cooray, a. & kamionkowski, m.: 2003, mnras, 342, l20
[27] o’shea, b. l. & norman, m. l.: 2007, apj, 654, 66
[28] o’shea, b. l. & norman, m. l.: 2008, apj, 673, 14
[29] schaerer, d.: 2002, a&a, 382, 28
[30] shapiro, p. r. et al.: 2004, mnras, 348, 753
[31] smith, r. j. et al.: 2011, mnras, 414, 3633
[32] stacy, a. et al.: 2010, mnras, 403, 45
[33] susa, h. & umemura, m.: 2006, apjl, 645, l93
[34] susa, h. et al.: 2009, apj, 702, 480
[35] turk, m. j. et al.: 2009, science, 325, 601
[36] turk, m. j. et al.: 2012, apj, 745, 154
[37] weaver, t. a., zimmerman, g. b. & woosley, s. e.: 1978, apj, 225, 1021
[38] whalen, d. j., abel, t. & norman, m. l.: 2004, apj, 610, 14
[39] whalen, d. j. et al.: 2008a, apj, 682, 49
[40] whalen, d. j. et al.: 2008b, apj, 679, 925
[41] whalen, d. j. et al.: 2010, apj, 712, 101
[42] whalen, d. j. & fryer, c. l.: 2012, apjl, 756, l19
[43] whalen, d. j. & norman, m. l.: 2006, apjs, 162, 281
[44] whalen, d. j. & norman, m. l.: 2008, apj, 673, 664
[45] whalen, d. j. et al.: 2012a, apj, submitted
[46] whalen, d. j. et al.: 2012b, apj, submitted
[47] whalen, d. j. et al.: 2012c, apj, submitted
[48] widrow, l. m. et al.: 2012, ssr, 166, 37
[49] woosley, s. e., heger, a. & weaver, t. a.: 2002, rev. mod. phys., 74, 1015
[50] yorke, h. & welz, a.: 1996, a&a, 315, 555
[51] yoshida, n., omukai, k. & hernquist, l.: 2008, science, 321
discussion

wolfgang kundt — you spoke of a mass range (> 40 m⊙) for primordial stars for which they would collapse to black holes. where are the nearest of them today?

daniel whalen — this is an excellent question. we recently published a letter (whalen & fryer 2012) in which we found that most 20 ÷ 40 m⊙ pop iii black holes would be ejected from the cosmological halos that gave birth to them at velocities of 500 ÷ 1000 km s⁻¹ by natal kicks due to asymmetries in the core-collapse engine. such velocities are far above the escape velocity of any halo they would encounter for over a hubble time, so there is a good chance that these black holes would be exiled to the voids between galaxies today. black holes above 40 m⊙ are unlikely to be born with kicks and remain in the halo, accreting and growing over cosmic time. these black holes are much more likely to reside in the galaxies into which their host halos were taken, a few of which could become the supermassive black holes found in the sdss quasars today.

maurice van putten — what fraction of the gas in a cosmological halo ends up in primordial stars?

daniel whalen — it is currently thought that the minimum halo mass for forming a pop iii star is ∼ 10⁵ m⊙ and that 1–10 stars are formed with masses of 30 ÷ 300 m⊙. thus, a conservative estimate is that 0.1 ÷ 1 % of the baryons in the halo are converted into stars, and that the rest are evicted from the halo by strong ionized flows over the life of the stars.

acta polytechnica doi:10.14311/ap.2014.54.0101 acta polytechnica 54(2):101–105, 2014 © czech technical university in prague, 2014 available online at http://ojs.cvut.cz/ojs/index.php/ap

semiclassical asymptotics of eigenvalues for non-selfadjoint operators and quantization conditions on riemann surfaces

anna i. esina^{a,b}, andrei i. shafarevich^{a,∗}

a m.v. lomonosov moscow state university, leninskie gory, 1, moscow, russia
b institute for problems in mechanics, russian academy of sciences, prospekt vernadskogo, 101, moscow, russia
∗ corresponding author: shafarev@yahoo.com

abstract. this paper reports a study of the semiclassical asymptotic behavior of the eigenvalues of some nonself-adjoint operators that are important for applications. these operators are the schrödinger operator with complex periodic potential and the operator of induction. it turns out that the asymptotics of the spectrum can be calculated using the quantization conditions. these can be represented as the condition that the integrals of a holomorphic form over the cycles on the corresponding complex lagrangian manifold, which is a riemann surface of constant energy, are integers. in contrast to the real case (the bohr–sommerfeld–maslov formulas), in order to calculate a chosen spectral series, it is sufficient to assume that the integral over only one of the cycles takes integer values, and different cycles determine different parts of the spectrum.

keywords: semiclassical asymptotics, quantization conditions, riemann surface, spectral graph.

1. introduction

one of the main problems of the semiclassical theory (see, for example, [1]) is the description of the asymptotic behavior of the spectrum of operators of the form $\hat H = H\bigl(x, -\imath h \tfrac{\partial}{\partial x}\bigr)$, $h \to 0$.
in this case, the problem can naturally be divided into two subproblems, namely:

(1.) to solve the spectral equation approximately, i.e., to find numbers λ and functions ψ satisfying the following equation for some n > 1:
$$\hat H \psi = \lambda \psi + O(h^n); \quad (1)$$

(2.) to choose numbers of the form λ that approach spectral points of the operator $\hat H$, i.e., to choose points λ such that
$$|\lambda - \lambda_0| = O(h^n) \quad (2)$$
for some point λ₀ of the spectrum of operator $\hat H$.

if operator $\hat H$ is self-adjoint, then the estimate (2) automatically follows from equation (1) (see, e.g., [1–3]). at the same time, the first problem is highly nontrivial and is related to the study of invariant sets of the corresponding classical hamiltonian system. recall how to solve this problem (1) in the integrable case. let $H(x,p) : \mathbb{R}^{2n} \to \mathbb{R}$ be a smooth function, and let the hamiltonian system defined by the function $H$ be liouville integrable. let $F_1 = H, \dots, F_n$ be the commuting first integrals; consider the domain of the phase space smoothly fibered into liouville tori Λ which are the compact connected components of the common level sets of the form $F_j = c_j$. we assume that the weyl operator $\hat H$ is self-adjoint in $L^2(\mathbb{R}^n_x)$. the following theorem is due to v. p. maslov.

theorem 1. suppose that a liouville torus Λ satisfies the following conditions (the so-called bohr–sommerfeld–maslov quantization rules, see [1, 2, 4, 5]):
$$\frac{1}{2\pi h} \int_\gamma (p, dx) = m + \frac{\mu(\gamma)}{4}, \quad (3)$$
where $m = O(1/h) \in \mathbb{Z}$, γ is an arbitrary cycle on Λ, and µ(γ) is the maslov index of the cycle. then there is a function $\psi \in L^2(\mathbb{R}^n)$, $\|\psi\| = 1$, such that $\hat H \psi = \lambda \psi + O(h^2)$, $\lambda = H|_\Lambda$.

remark 1. the function ψ mentioned in the theorem can be described in a computable way, namely, it is of the form $K(1)$, where $K$ stands for the maslov canonical operator on the liouville torus Λ. the integer m can be chosen in the form $m = [1/h]_{\mathrm{int}} + m_0$, where $[1/h]_{\mathrm{int}}$ stands for the integral part of the real number $1/h$ and $m_0$ does not depend on h.

remark 2. as was already noted above, it follows automatically from the statement of the theorem that the point λ is at a distance of the order of $O(h^2)$ from the spectrum of operator $\hat H$.

remark 3. we stress that the topological condition (3) must be satisfied for all cycles of the torus Λ (in other words, the quantization condition is the condition that the cohomology class $\frac{1}{2\pi h}[\theta] - \frac{1}{4}[\mu]$ is integer, where $[\theta]$ stands for the class of the form $(p, dx)$ and $[\mu]$ for the maslov class).

remark 4. in action–angle variables $(I_1, \dots, I_n, \varphi_1, \dots, \varphi_n)$, the quantization conditions and the formula for the spectrum have a simple form (see e.g. [2]):
$$I_j = h\left(m_j + \frac{\mu_j}{4}\right), \quad \lambda = H(I_1, \dots, I_n).$$

the nonself-adjoint case has been investigated less, and quite incompletely; however, spectral problems for nonself-adjoint operators arise in many important physical applications (like the theory of hydrodynamic stability, a description of magnetic fields of the earth and of galaxies, pt-symmetric quantum theory, statistical mechanics of coulomb gases, and many other problems; see, for example, [6–11]). in our paper, we consider two classes of nonself-adjoint operators, namely, the one-dimensional schrödinger operator with complex potential and the operator of magnetic induction on a two-dimensional symmetric surface.
the spectrum of these operators, in the semiclassical limit, is concentrated in the $O(h^2)$-neighborhood of some curves in the complex plane E; these curves form the so-called spectral graph. it turns out that each edge of the spectral graph corresponds to a certain cycle on the riemann surface defined by the classical complex hamiltonian system (this is a surface of constant energy). the asymptotics of the eigenvalues can be calculated by using complex equations which are similar to the bohr–sommerfeld–maslov quantization conditions on the riemann surface. however, in contrast to the self-adjoint case, in order to evaluate the eigenvalues, it is required to satisfy the corresponding condition on only one cycle, and it turns out that different cycles determine different parts of the spectrum (and different edges of the spectral graph).

2. schrödinger equation with a complex potential

the spectral problem for the schrödinger equation on a circle with a purely imaginary potential
$$-h^2 \psi'' + \imath V(x)\psi = \lambda \psi, \quad \psi(x + 2\pi) = \psi(x) \quad (4)$$
arises, in particular, as a model problem for the orr–sommerfeld operator in the theory of hydrodynamic stability (see, e.g., [12–20]). a close problem appears in the statistical mechanics of the coulomb gas (see [11]). here $h \to 0$ is a small parameter and $V(x)$ is a trigonometric polynomial. the asymptotic behavior of the spectrum of this operator for different trigonometric polynomials V as $h \to 0$ was calculated in [21–25]; it turns out here that the numbers λ satisfying (1) fill a half-strip in the complex plane entirely, while the actual spectrum is discrete and concentrates near some graph. the results of these papers can be reformulated in terms of the quantization rules on riemann surfaces as follows. consider a riemann surface Λ in the complex phase space $\Phi = (\mathbb{C}/2\pi\mathbb{Z}) \times \mathbb{C}$ with coordinates $(x, p)$, where Λ is given by the equation $p^2 + \imath V(x) = \lambda$; this surface is obtained by gluing together two cylinders of the variable x along finitely many cuts, namely, the zeros of the trigonometric polynomial V are joined to one another and to the points at infinity. the results of the papers mentioned above imply the following assertion.

theorem 2. the spectrum of the schrödinger operator concentrates in the $O(h^2)$-neighborhood of the set given by the family of equations
$$\frac{1}{2\pi h} \int_\gamma p\, dx = m + \frac{\mu}{4}, \quad (5)$$
where γ is some cycle on the surface Λ, $\mu \in \{0, 2\}$, and $m = O(1/h)$ is an integer.

remark 5. in contrast to the self-adjoint case, the quantization condition must hold on only one cycle in the given family of cycles, and different cycles determine different parts of the spectrum.

remark 6. separating the real and imaginary parts in equations (5), we obtain the system
$$\Im \int_\gamma p\, dx = 0, \quad (6)$$
$$\Re\, \frac{1}{2\pi h} \int_\gamma p\, dx = m + \frac{\mu}{4}. \quad (7)$$
the first equation does not depend on h. the combination of these equations for different cycles defines a set of analytical curves in the complex plane λ, the so-called spectral graph. the second equation defines a discrete set of asymptotic eigenvalues; for a fixed cycle γ, these eigenvalues are concentrated near the corresponding edge of the spectral graph.

remark 7. in [21–25], examples of spectral graphs for specific surfaces Λ are presented. in particular, if $V(x) = \cos x$, then the surface Λ is homeomorphic to a torus with two punctures; the corresponding spectral graph consists of three edges corresponding to the three cycles in the surface and has the shape shown in fig. 1.
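equations (6) and (7) lend themselves to direct numerical evaluation. the sketch below is our illustration for the case $V(x) = \cos x$, not code from the paper: it computes the action integral between two zeros of the integrand by quadrature along a straight segment in the complex x-plane and solves the full complex condition $I(\lambda) = \pi h(m + \mu/4)$ for λ by a newton iteration; the function `action`, the naive branch handling and all numerical parameters are our assumptions.

```python
import cmath
import numpy as np

def action(lam: complex, n: int = 2000) -> complex:
    """I(lam) = integral of sqrt(lam - i cos x) dx between two zeros of the
    integrand (cf. eq. (8) with V = cos x). straight-line path in the complex
    x-plane; the principal sqrt branch is used, which is adequate for this
    illustration but is not a careful treatment of the riemann surface."""
    x1 = cmath.acos(-1j * lam)      # zero of lam - i cos x
    x2 = -x1                        # the symmetric zero (cos is even)
    t = np.linspace(0.0, 1.0, n)
    x = x1 + (x2 - x1) * t
    f = np.sqrt(lam - 1j * np.cos(x))
    dx = (x2 - x1) / (n - 1)
    return np.sum(0.5 * (f[:-1] + f[1:])) * dx   # trapezoid rule

def eigenvalue(m: int, h: float = 0.05, mu: int = 2,
               lam0: complex = 1.0 + 0.5j, steps: int = 40) -> complex:
    """solve I(lam) = pi*h*(m + mu/4) by a complex newton iteration;
    I is holomorphic in lam, so a real finite-difference step suffices."""
    target = cmath.pi * h * (m + mu / 4)
    lam = lam0
    for _ in range(steps):
        g = action(lam) - target
        dg = (action(lam + 1e-6) - action(lam - 1e-6)) / 2e-6
        lam -= g / dg
    return lam

# asymptotic eigenvalues along one edge of the spectral graph; note that
# the genuine spectral series has m of order 1/h
for m in (8, 10, 12):
    print(m, eigenvalue(m))
```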
if $V = \cos x + \cos 2x$, then the surface is homeomorphic to a pretzel with two punctures (a sphere with two handles and with two disks removed); the corresponding spectral graph is shown in fig. 2 and consists of five edges (note that the one-dimensional homology of Λ is five-dimensional in this case).

remark 8. the equations for the asymptotic eigenvalues can be represented by the explicit formulas
$$\int_{x_j}^{x_k} \sqrt{\lambda - \imath V(x)}\, dx = \pi h (m_{kj} + \mu/4), \quad (8)$$
where $m_{kj}$ are integers, $\mu \in \{0, 2\}$, and $x_k$ and $x_j$ are zeros of the integrand. in this case, the equation
$$\Im \int_{x_j}^{x_k} \sqrt{\lambda - \imath V(x)}\, dx = 0 \quad (9)$$
defines the edges of the spectral graph, and the spectral points are defined by the equations
$$\Re \int_{x_j}^{x_i} \sqrt{\lambda - \imath V(x)}\, dx = \pi h (m_{ij} + \mu/4). \quad (10)$$

figure 1. spectral graph for the case v = cos x

remark 9. the integer µ is the analog of the maslov index; however, the definition of this number is quite different. namely, µ(γ) equals the index of intersection of the cycle γ with the pull-back of the real circle $\Im x = 0$ with respect to the projection $(x, p) \to x$.

3. equation of magnetic induction

the spectral problem for the operator of induction,
$$h^2 \Delta B - \{v, B\} = -\lambda B, \quad (11)$$
$$\operatorname{div} B = 0, \quad (12)$$
arises when describing the magnetic field in a conductive liquid (in particular, the magnetic fields of planets, stars, and galaxies, see, e.g., [9]). here, v stands for a given smooth divergence-free field on a riemannian manifold M, ∆ for the laplace–beltrami operator, and B is the desired vector field (the magnetic field). the parameter h characterizes the resistance in the liquid, and the passage to the limit as $h \to 0$ corresponds to high conductivity. clearly, the spectrum of the operator of induction depends substantially on the manifold M and on the field v and can be computed efficiently in special situations only. below we consider a special case of this kind, namely, a two-dimensional surface of revolution with the flow along the parallels. this case was discussed in detail in [24] (see also [26, 27]); we present the main results only. recall that a two-dimensional compact surface of revolution is diffeomorphic either to a torus or to a sphere.

3.1. torus

the torus is obtained by rotating a smooth closed curve around an axis that does not intersect the curve, and the metric is of the form $ds^2 = dz^2 + u^2(z)\, d\varphi^2$, where z stands for the arc length parameter on the rotating curve, u(z) for the distance of the point to the axis of rotation (we assume that u is a trigonometric polynomial), and ϕ for the angle of rotation. we assume that the field v is directed along the parallels, $v = a(z)\frac{\partial}{\partial\varphi}$, where a is a trigonometric polynomial, in which case the variables in the spectral equation can be separated and the asymptotic behavior of the spectrum can be calculated by using equations similar to (5). the riemann surface Λ is given by the equation $p^2 + \imath n a(z) = \lambda$ (n is an integer constant entering the separation of variables), and the spectral graph is defined from equations (9), in which $V = na$.

figure 2. spectral graph for the case v = cos x + cos 2x

3.2. sphere

the sphere is obtained by rotating a smooth curve (the graph of a function f(z)) around the z axis, which intersects the curve at two points at which the tangent to the curve is perpendicular to the axis of rotation (the poles of the surface). we assume that $f(z) = \sqrt{(z - z_1)(z - z_2)}\, k(z)$, where $z_1$ and $z_2$ are the poles of the surface, k(z) is a polynomial, and $k(z) > 0$ for $z \in [z_1, z_2]$.
as far as the field v is concerned, it is assumed that $v = a(z)\frac{\partial}{\partial\varphi}$, where a(z) is a polynomial. the riemann surface is given in $\mathbb{C}^2$ by the equation $p^2 f(z)^2 + \imath n a(z) = \lambda$; it is punctured not only at the points at infinity but also at the zeros of f (i.e., at the poles of M). the asymptotics of the spectrum is still defined by equation (5); the analytical equations (8) are replaced by the equations
$$\int_{z_j}^{z_k} \sqrt{(f_z^2 + 1)(\imath n a(z) + \lambda)}\, dz = \pi h (m_{ij} + \mu/4),$$
where $z_i$ and $z_j$ are the zeros and poles of the integrand (in particular, the poles of the surface of revolution M can be taken as the limits of integration).

figure 3. cycles on the riemann surface

as an example, consider the simplest case of the standard sphere ($f = \sqrt{1 - z^2}$) and take $a(z) = z$. in this case, the riemann surface is homeomorphic to the torus with three punctures, namely, at the points $z = \pm 1$ and at the point at infinity. the cycles are depicted in fig. 3. cycle $\gamma_1$ goes around the points −1 and 1; the cycles $\gamma_2$ and $\gamma_3$ go around the points $\imath\lambda/n$, −1 and the points $\imath\lambda/n$, 1, respectively. every cycle defines the corresponding quantization condition, which is of the form
$$\frac{1}{\pi h} \int_{-1}^{1} \sqrt{\frac{\imath n z - \lambda}{1 - z^2}}\, dz = \frac{1}{2} + m_1$$
for cycle $\gamma_1$,
$$\frac{1}{\pi h} \int_{-1}^{\imath\lambda/n} \sqrt{\frac{\imath n z - \lambda}{1 - z^2}}\, dz = m_2$$
for cycle $\gamma_2$, and
$$\frac{1}{\pi h} \int_{1}^{\imath\lambda/n} \sqrt{\frac{\imath n z - \lambda}{1 - z^2}}\, dz = m_3$$
for cycle $\gamma_3$. to every quantization condition, there corresponds its own sequence of eigenvalues.

remark 10. in contrast to the preceding section, the quantization conditions corresponding to a surface of revolution involve an integer n (the constant arising in the course of the separation of variables). the asymptotic eigenvalues and the edges of the spectral graph depend on n; thus, the graph now consists of countably many edges. for the standard sphere and for $a = z$, this graph is shown in fig. 4.

figure 4. spectral graph with countably many edges

4. conclusions

we have studied the asymptotic behavior of the eigenvalues of the schrödinger operator with complex periodic potential and of the induction operator on the surface of revolution. both appear in concrete physical problems (a study of the stability of a viscous fluid, a description of the magnetic fields in stars and galaxies, statistical mechanics of the coulomb gas, etc.). we show that the semiclassical asymptotics of the spectrum can be computed with the help of quantization conditions on the corresponding riemann surface. we discuss the relation of these equations to the standard ebk–maslov quantization: the equations should be considered for different cycles of the surface separately, and the maslov index should be replaced by the index of intersection with the pull-back of the real circle.

acknowledgements

our research was supported by a grant from the government of the russian federation for state support for scientific research carried out under the supervision of leading scientists at the lomonosov moscow state university federal budget educational institution of higher professional education, according to agreement 11.g34.31.0054, and also by the russian foundation for basic research under grants 09-01-12063-ofi-m, 11-01-00937-a, and 13-01-00664, by the program in support of leading scientific schools (under grant 3224.2010.1), and a grant “my first grant” project 12-01-31235. the authors thank the referees for very useful recommendations.

references

[1] v. p. maslov. asymptotic methods and perturbation theory. mgu, 1965.
[2] v. p. maslov, m. v. fedoryuk.
quasiclassical approximation for the equations of quantum mechanics. nauka, 1976.
[3] e. b. davies. pseudospectra of differential operators. operator theory 43:243–262, 2000.
[4] m. a. evgrafov, m. v. fedorjuk. asymptotic behavior of solutions of the equation $w'' - p(z, \lambda)w = 0$ as $\lambda \to \infty$ in the complex z-plane. uspekhi mat. nauk 21(2):3–50, 1966.
[5] m. v. fedoryuk. asymptotic analysis: linear ordinary differential equations. springer-verlag, 1993.
[6] i. t. gohberg, m. g. krein. introduction to the theory of linear nonself-adjoint operators. american mathematical society, 1969.
[7] l. n. trefethen. pseudospectra of linear operators. isiam 95: proceedings of the third int. congress of industrial and applied math, pp. 401–434, 1996.
[8] r. g. drazin, w. h. reid. hydrodynamic stability. cambridge, 1981.
[9] y. b. zel’dovich, a. a. ruzmaikin. the hydromagnetic dynamo as the source of planetary, solar, and galactic magnetism. uspekhi fiz. nauk 152(2):263–284, 1987.
[10] c. m. bender, d. c. brody, h. f. jones, b. k. meister. faster than hermitian quantum mechanics. phys. rev. lett. 98, 2007. doi:10.1103/physrevlett.98.040403
[11] t. gulden, m. janas, p. koroteev, a. kamenev. statistical mechanics of coulomb gases as quantum theory on riemann surfaces. jetp 144(9), 2013. doi:10.7868/s0044451013090125
[12] s. a. stepin. nonself-adjoint singular perturbations: a model of the passage from a discrete spectrum to a continuous spectrum. russ. math. surv. 50(6):1311–1313, 1995.
[13] a. a. shkalikov. on the limit behavior of the spectrum for large values of the parameter of a model problem. math. notes 62(5):796–799, 1997. doi:10.4213/mzm1688
[14] a. a. arzhanov, s. a. stepin. semiclassical spectral asymptotics and the stokes phenomenon for the weber equation. dokl. akad. nauk 378(1):18–21, 2001.
[15] a. a. shkalikov, s. n. tumanov. on the limit behaviour of the spectrum of a model problem for the orr–sommerfeld equation with poiseuille profile. izv. math. 66(4):829–856, 2002. doi:10.4213/im399
[16] a. v. d’yachenko, a. a. shkalikov. on a model problem for the orr–sommerfeld equation with linear profile. funktsional. anal. i prilozhen. 36(3):228–232, 2002. doi:10.4213/faa208
[17] a. a. shkalikov. spectral portraits of the orr–sommerfeld operator with large reynolds numbers. j. math. sci. 124(6):5417–5441, 2004. doi:10.1023/b:joth.0000047362.09147.c7
[18] s. a. stepin, v. a. titov. on the concentration of spectrum in the model problem of singular perturbation theory. dokl. math. 75(2):197–200, 2007.
[19] v. i. pokotilo, a. a. shkalikov. semiclassical approximation for a nonself-adjoint sturm–liouville problem with a parabolic potential. math. notes 86(3):442–446, 2009. doi:10.4213/mzm8506
[20] l. k. kusainova, a. z. monashova, a. a. shkalikov. asymptotics of the eigenvalues of the second-order nonself-adjoint differential operator on the axis. mat. zametki 93(4):630–633, 2013. doi:10.4213/mzm10173
[21] s. v. galtsev, a. i. shafarevich. spectrum and pseudospectrum of nonself-adjoint schrödinger operators with periodic coefficients. mat. zametki 80(3):456–466, 2006. doi:10.4213/mzm2821
[22] s. v. galtsev, a. i. shafarevich. quantized riemann surfaces and semiclassical spectral series for a nonself-adjoint schrödinger operator with periodic coefficients. theor. math. phys. 148(2):206–226, 2006. doi:10.4213/tmf2081
[23] a. i. esina, a. i. shafarevich.
quantization conditions on riemannian surfaces and the semiclassical spectrum of the schrödinger operator with complex potential. mat. zametki 88(2):209–227, 2010. doi:10.4213/mzm8803
[24] a. i. esina, a. i. shafarevich. asymptotics of the spectrum and the eigenfunctions of the operator of magnetic induction on a two-dimensional compact surface of revolution. mat. zametki (in print), 2013. doi:10.4213/mzm10424
[25] a. i. esina, a. i. shafarevich. analogs of bohr–sommerfeld–maslov quantization conditions on riemann surfaces and spectral series of nonself-adjoint operators. russian journal of mathematical physics 20(2):172–181, 2013. doi:10.1134/s1061920813020052
[26] h. roohian, a. shafarevich. semiclassical asymptotics of the spectrum of a nonself-adjoint operator on the sphere. russ. j. math. phys. 16(2):309–315, 2009. doi:10.1134/s1061920809020150
[27] h. roohian, a. i. shafarevich. semiclassical asymptotic behavior of the spectrum of a nonself-adjoint elliptic operator on a two-dimensional surface of revolution. russ. j. math. phys. 17(3):328–334, 2010. doi:10.1134/s1061920810030064

partial discharge measurements in hv rotating machines in dependence on pressure of coolant

i. kršňák, i. kolcunová

the influence of the pressure of the coolant used in high voltage rotating machines on partial discharges occurring in stator insulation is discussed in this paper. the first part deals with a theoretical analysis of the topic. the second part deals with the results obtained on a real generator in industrial conditions. finally, theoretical assumptions and obtained results are compared.

keywords: partial discharges, stator insulation, high voltage rotating machine, phase resolved partial discharge analysis.

1 introduction

theoretical assumptions as well as practical experience of diagnostic measurements on high voltage cables, transformers or rotating machines show that if we intend to compare results, it is necessary to carry out all measurements under the same internal conditions (atmospheric humidity, ambient temperature, air pressure, etc.). for example, the influence of temperature on the loss dissipation factor of the insulating system of high voltage equipment is well known: after measurements, it must be recalculated to a temperature of 20 °c. in the technical and scientific literature, the influence of pressure on partial discharge measurement has not been handled very often. because diagnosing high voltage rotating machines using the partial discharge method has become very popular, and our experience of partial discharge measurements has shown that the pressure of the coolant is a very important factor influencing the measurements, we consider it useful to deal with this matter more precisely.

2 theoretical analysis

at normal temperature and pressure, gases are excellent insulators. in higher fields, charged particles such as electrons or positive ions may gain sufficient energy between collisions to cause ionisation. ionisation by electron impact is the most important process leading to breakdown of gases, because electrons usually gain higher energy than relatively slow atoms. this corresponds closely with the fact that the mean free path $\bar\lambda_{me}$ of electrons is longer than the mean free path of atoms, $\bar\lambda_{me} \approx 5.66\,\bar\lambda_{ma}$ [1].
it can therefore be said that the effectiveness of ionisation by electron impact depends upon the energy that an electron can gain along the mean free path in the direction of the electrical field. considering only collisions by electrons, if $\bar\lambda_{me}$ is the mean free path in the field direction of strength E, then the average energy gained over a distance $\bar\lambda_{me}$ is
$$w = eE\bar\lambda_{me}. \quad (1)$$
to cause ionisation on impact, the energy w must be at least equal to or higher than the ionisation energy $w_i$. collisions of particles in a gas are random events. hence, a free path (which is defined as the distance molecules or particles travel between collisions [1]) is a random quantity and will have a distribution about a mean value. in the case of a simple “ballistic” model the mean free path is given by
$$\bar\lambda_m = \frac{1}{n\sigma_i}, \quad (2)$$
where n is the gas density and $\sigma_i$ is the collision cross-section of two particles. according to the universal gas law $pV = NkT$, or $p = nkT$, where p is the gas pressure, $n = N/V$ (N is the number of particles, V is the volume) is the gas density, k is the universal boltzmann constant and T is the absolute temperature, we can evaluate n as follows:
$$n = \frac{p}{kT}. \quad (3)$$
inserting (3) into (2) we can evaluate the mean free path as
$$\bar\lambda_m = \frac{1}{\sigma_i}\,\frac{kT}{p}. \quad (4)$$
if the free path of a particle $\lambda \geq \lambda_i$, where $\lambda_i$ is the free path on which a particle, at a certain level of field strength E, gains the ionisation energy $w_i$, from (1) it is possible to write
$$\lambda_i = \frac{w_i}{eE}. \quad (5)$$
the mean number of collisions caused by an electron per unit length is $1/\bar\lambda_m$. this electron initiates α ionisations, where α is townsend’s first ionisation coefficient, defined as the number of electrons produced by an electron per unit length of path in the direction of the field. townsend’s first ionisation coefficient can be evaluated as
$$\alpha = \frac{1}{\bar\lambda_m}\, e^{-\lambda_i/\bar\lambda_{me}}. \quad (6)$$
coefficient α plays a key role in the process of multiplication of the free charge carriers. according to (4), the mean free path (provided that the collision cross-section, gas density and temperature are constant) is inversely proportional to the pressure of the gas. according to (5), we can say that $\lambda_i$ is inversely proportional to the strength of the electric field E, and we can write
$$\frac{\alpha}{p} = A\, e^{-Bp/E}, \quad (7)$$
where $A = \frac{\sigma_i}{kT}$ and $B = \frac{w_i \sigma_i}{e k T}$. equation (7) points to those physical quantities of the gas which interact with the ionisation coefficient α. the value α/p is mainly affected by the coefficient B, which comprises the ionisation energy $w_i$. the ionisation energy is different for each type of gas.
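to make the pressure dependence in (4) and (7) concrete, the short sketch below is our numerical illustration, not a calculation from the paper: it evaluates eq. (7) with air-like constants of the kind tabulated in the high-voltage literature (cf. [1]); the particular values of a and b are indicative assumptions only.

```python
import math

# commonly quoted textbook constants for air; treat them as indicative
# assumptions, valid roughly for E/p of 100-800 V/(cm*Torr)
A = 15.0    # 1/(cm*Torr)
B = 365.0   # V/(cm*Torr)

def alpha_over_p(E_over_p: float) -> float:
    """townsend first ionisation coefficient per unit pressure, eq. (7):
    alpha/p = A * exp(-B / (E/p))."""
    return A * math.exp(-B / E_over_p)

# doubling the pressure at a fixed field E halves E/p and suppresses alpha/p:
for E_over_p in (200.0, 400.0):
    print(f"E/p = {E_over_p:5.0f} V/(cm*Torr) -> alpha/p = "
          f"{alpha_over_p(E_over_p):6.3f} 1/(cm*Torr)")
```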
if we know the dependence of the ionisation coefficient α on the strength of the electrical field E and the pressure of the gas p, then according to the condition of a self-sustained discharge we can derive an equation describing the inception breakdown voltage in a homogeneous electrical field, which is dependent on the gas pressure and the distance of the electrodes (paschen’s law)
$$u_{br} = f(p\,d). \quad (8)$$
in non-uniform electrical fields it is necessary to use the generalized paschen’s law, the law of similarity of discharges in gas, which considers the geometry of the electrode set-up:
$$u_{br} = f\left(p\,d, \frac{r}{d}\right). \quad (9)$$
such a curve has its characteristic minimum $u_{br\,\min}$ at the critical value $(p\,d)_{\min}$. the minimum breakdown value is dependent on the material of the electrodes and the type of the gas. for the electro-insulating purposes of gases in high voltage equipment it always holds that $p\,d > (p\,d)_{\min}$. for this reason we can consider only the “right wing” of the paschen curve. this means that when the pressure of the gas increases (provided that the distance d is constant), the mean free path of electrons $\bar\lambda_{me}$ shortens, which causes a decrease in their energy w, see eq. (1). this leads to an increase in the electric breakdown strength of the gas. surface discharges on the interface of the gas-solid dielectric occur in cases when both normal and tangential components of the electrical field are present. according to [2], the type of insulating material does not essentially affect the discharge formation. for this reason the same laws that have been described for discharges in a gas are valid both for pure discharges in air and for surface discharges on the gas-solid dielectric interface.

3 partial discharge measurements on a high voltage rotating machine

off-line partial discharge measurements were made on the stator windings of a synchronous generator with stator insulation of thermal class f. each phase was measured separately. all measurements were carried out using a partial discharge detector with digital data recording. the testing voltage was increased gradually until the inception of partial discharges. at this voltage, the discharge data was recorded for the first time. then the testing voltage was increased in 1 kv steps, up to the nominal phase voltage of 8 kv. the partial discharge signal was measured at each voltage level. the first partial discharge measurement was performed when the pressure of the coolant (hydrogen) was 108.9 kpa. the inception voltage of partial discharges in each phase was:

phase   inception voltage (kv)
l1      2.1
l2      2.8
l3      2.0

the next measurement was carried out 6 months later. the pressure of the coolant (hydrogen) was higher – 196 kpa. the inception voltage of partial discharges in each phase was much higher than in the previous measurement:

phase   inception voltage (kv)
l1      6.0
l2      5.5
l3      5.5

partial discharge phase analysis shows that at the lower hydrogen pressure (108.9 kpa), see fig. 1, much higher values of apparent charge were obtained (qmax ≈ 80 000 pc, qavg ≈ 10 000 pc) than when partial discharge measurements were performed at the higher hydrogen pressure (196 kpa), see fig. 2, where qmax ≈ 4 500 pc and qavg ≈ 500 pc. all the above mentioned apparent charge values are for a nominal phase voltage of 8 kv.
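the measured trend can be put next to the theory of section 2 with a few lines of arithmetic; the sketch below is our illustration using the inception voltages quoted above, not an analysis from the paper.

```python
# quick look at the measured trend: inception voltage rises with coolant pressure
p1, p2 = 108.9, 196.0                     # coolant pressure [kPa]
u1 = {"l1": 2.1, "l2": 2.8, "l3": 2.0}    # inception voltage [kV] at p1
u2 = {"l1": 6.0, "l2": 5.5, "l3": 5.5}    # inception voltage [kV] at p2

print(f"pressure ratio p2/p1 = {p2 / p1:.2f}")
for phase in u1:
    print(f"phase {phase}: u2/u1 = {u2[phase] / u1[phase]:.2f}")
# the voltage ratios (about 2.0-2.9) are comparable to or larger than the
# pressure ratio (1.8), in line with operation on the right wing of the
# paschen curve, where breakdown strength grows with pressure
```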
fig. 1: partial discharge fingerprints: ut = 8 kv, pressure of coolant p = 108.9 kpa

on the other hand, phase distributions very similar to those obtained at a coolant pressure of 196 kpa and a testing voltage of 8 kv were measured in the first case, at the lower coolant pressure, at a testing voltage of 5 kv, see fig. 3.

4 conclusion

both theoretical assumptions and partial discharge measurements on real stator insulation in high voltage rotating machines have shown that the coolant pressure significantly affects both the inception voltage of partial discharges and their apparent charge values. for transparency in the process of evaluating the results, it is therefore necessary to perform partial discharge measurements in high voltage rotating machines with the coolant at the same pressure.

references

[1] kuffel, e., zaengl, w. s.: high voltage engineering. pergamon press, 1988
[2] beyer, m., boeck, w., möller, k., zaengl, w.: hochspannungstechnik. springer-verlag, 1986
[3] razevig, d. v., sokolova, m. v.: raschet nachalnych i razrjadnych naprjazenij gazovych promezutkov. energia, 1977
[4] kolcunová, i., kršňák, i.: diagnostika elektrických točivých strojov metódou fázovej analýzy vybraných veličín čiastkových výbojov. journal ee, vol. 5, no. 1, pp. 8–10
[5] záliš, k.: complex evaluation system for partial discharge measurement. proc. of workshop’99, vol. 3, prague, 1999, p. 282

dr. ing. igor kršňák
e-mail: krsnak@ktvn1.tuke.sk
doc. ing. iraida kolcunová, ph.d.
tel./fax: +421556225060
mobile: +421907571507
dept. of high voltage engineering
technical university of košice
mäsiarska 74
042 10 košice, slovakia

fig. 2: partial discharge fingerprints: ut = 8 kv, pressure of coolant p = 196 kpa
fig. 3: partial discharge fingerprints: ut = 5 kv, pressure of coolant p = 108.9 kpa
acta polytechnica 55(2):86–95, 2015, doi:10.14311/ap.2015.55.0086
© czech technical university in prague, 2015, available online at http://ojs.cvut.cz/ojs/index.php/ap
comparison of schemes for cooling high heat flux components in fusion reactors
phani kumar domalapally∗, slavomir entler
centrum výzkumu řež, řež u prahy, czech republic
∗ corresponding author: p_kumar.domalapally@cvrez.cz
abstract. some components of fusion reactors receive high heat fluxes either during startup and shutdown, or while the machine is in operation. this paper analyzes various ways of enhancing heat transfer using helium and water for cooling these high heat flux components. conclusions are then drawn to decide on the best choice of a coolant for use in the near term and in the long term.
keywords: high heat flux components; helium cooling; water cooling; critical heat flux.
1. introduction
in fusion reactors such as the international thermonuclear experimental reactor (iter) and the demonstration fusion power reactor (demo), the heat fluxes on the plasma facing components (pfcs), such as the divertor and the first wall, can be of the order of 1 to 20 mw/m2, depending on the location [1, 2]. if not properly cooled, these components will fail due to problems caused by excessive temperature. it is therefore necessary to design an active cooling system to prevent damage to these components. the design of the active cooling system includes selecting a proper heat sink and a proper coolant. the heat sink should have high thermal conductivity, high mechanical resistance, compatibility with the coolant and with the armoring materials, resistance to radiation damage, adequate resources and ease of fabrication [3].
at the same time, the selected coolant should have high heat capacity, be non-toxic, safe to use, non-corrosive, chemically and electrically inert, and if possible not affected by magnetic fields. the final cooling system should therefore provide reliable cooling under all adverse conditions, with minimum pumping power. based on these requirements, three different coolants have been proposed: liquid metals (e.g., k, na, lipb), helium, and water [4]. this paper discusses and analyses the advantages and disadvantages of the different coolants, and then discusses various ways of increasing the effective heat transfer coefficient (htc) of the fluids. finally, conclusions are drawn on the basis of the details provided here.
2. cooling schemes
as discussed in the previous section, the proposed coolants are liquid metals, helium, and water. liquid metals are proposed as coolants because of their high thermal conductivity and high heat capacity, which enable them to accommodate high heat fluxes (hhf) with relatively low velocities. if they are used in two-phase flow, very low pumping speeds, and in turn low pumping power, are required. the operating life of pressurized feed pipes can be extended due to the ability of a liquid metal to absorb a huge amount of energy without producing high pressure. however, if liquid metals are used, electrically insulating coatings have to be applied. liquid metals can be chemically active, and corrosion can be a problem. in addition, magnetohydrodynamic effects may decrease the heat transfer capability, and excessive driving force may be needed. if excessive driving force is used in a two-phase regime, there is a possibility of critical heat flux (chf) [5].
helium cooling is very advantageous, as helium is an inert gas that is already present in the system. tritium can easily be separated from helium, and it will not act as an additional contaminant as it is already present in the reactor. helium is favored over water for safety reasons if liquid metal blankets are used. it offers single phase heat transfer without the possibility of chf, the machine can be operated at a higher temperature, and so higher thermal efficiency can be obtained when combined with the brayton cycle. however, helium has very low heat capacity, so in order to absorb very high heat fluxes we have to use heat transfer enhancement methods coupled with high pressures and velocities [6].
water has several advantages, such as good heat transport properties with relatively small pumping power. very high htcs are achieved when water is used in a two-phase flow regime, and water is cheaper than the other coolants. however, water is not compatible with the liquid metal tritium breeder, the low outlet temperature leads to low thermal efficiency, a leak can be dangerous, and a large vapor fraction may result in chf [5].
although each of the coolants has some advantages and disadvantages, preference is given to helium and water, because there are still several problems associated with using liquid metals as a coolant [5]. a simple channel/tube without any enhancement technique can be used for fusion applications to handle very high heat fluxes, but in this case very high pumping speeds and pressures are required. it is therefore necessary to use heat transfer enhancement techniques, which improve the heat transfer efficiency. in the following sections these techniques are discussed in brief.
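the scale of the problem for a plain channel can be illustrated with the standard dittus-boelter correlation for smooth-tube forced convection, nu = 0.023 re^0.8 pr^0.4. the sketch below is illustrative only; the helium properties, geometry and allowed wall-coolant temperature difference are representative assumptions, not values taken from the cited designs.

```cpp
// smooth-tube baseline for helium cooling via dittus-boelter
// (nu = 0.023 re^0.8 pr^0.4). all property and geometry values are
// representative assumptions (he at ~4 mpa, ~400 c), not design data.
#include <cmath>
#include <cstdio>

int main() {
    const double rho = 2.9;      // helium density [kg/m3], assumed
    const double mu  = 3.5e-5;   // dynamic viscosity [pa s], assumed
    const double k   = 0.30;     // thermal conductivity [w/m k], assumed
    const double cp  = 5193.0;   // specific heat [j/kg k]
    const double pr  = cp * mu / k;

    const double d  = 0.01;      // tube diameter [m], assumed
    const double q  = 10.0e6;    // target heat flux [w/m2]
    const double dT = 400.0;     // allowed wall-coolant difference [k], assumed

    const double hNeeded = q / dT;   // required htc [w/m2 k]
    for (double v = 50.0; v <= 300.0; v += 50.0) {
        const double re = rho * v * d / mu;
        const double nu = 0.023 * std::pow(re, 0.8) * std::pow(pr, 0.4);
        const double h  = nu * k / d;
        std::printf("v = %5.0f m/s  re = %9.0f  h = %7.0f w/m2k (need %7.0f)\n",
                    v, re, h, hNeeded);
    }
    return 0;
}
```

under these assumptions even 300 m/s in a smooth tube falls short of the required htc, which is the motivation for the enhancement techniques discussed next.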
figure 1. porous metal (copper) heat exchanger test article [8].
figure 2. target plate cross section of the porous medium (refractory material) concept with improved flow path design [10].
3. helium cooling
several methods have been proposed in the literature for increasing the heat transfer capability using helium as a coolant, e.g., a porous medium or extended surfaces, and jet cooling. these cooling schemes will be discussed in this section.
porous metal exchangers provide a large surface area, which directly influences the heat transfer. the pressure drop in porous media depends on the porosity of the medium, and the htc depends on the specific surface area [7]. several porous medium heat exchangers have been proposed in the literature, with cu, w, ta etc. used as the porous materials [8, 9]. the test article, as shown in figure 1, is made of dispersion strengthened copper, with the copper porous metal brazed in the middle. the test article was optimized parametrically with respect to the gas flow rate, the flow channel diameter, the spacing and depth below the surface, the particle size of the metal powder, and the system pressure, and htcs ranging from 15–18 kw/m2°c were achieved when tested at the sandia national laboratory (snl) [8]. figure 2 shows an advanced version of the previous concept, with a circumferential flow passage that reduces the pressure drop compared to the axial inlet and exhaust passages [9, 10]. this concept has two tubes surrounded by a porous medium.
figure 3. the helium divertor module is a dual-channel, circumferential flow, porous metal (oxygen-free copper) device [11].
figure 4. scheme of a porous medium (refractory foam) heat exchanger [7, 12].
figure 3 shows a dual channel circumferential flow porous metal design. the objective of this design was to mitigate the flow bypass while still maintaining low pressure drops. the porous medium was made of oxygen-free copper spheres brazed to the outer shell, forming ∼ 40 % porosity. the test article was able to withstand 29.5 mw/m2 with helium flowing at 20 g/s and 4 mpa [11].
the design shown in figure 4 also allows for shorter flow paths than the axial flow by allowing circumferential flow. in this concept, refractory foams are used, whose porosity and specific surface area can be optimized. the porous foam can be manufactured using tungsten (w), which is of particular interest in fusion applications. in order to accommodate a 5 mw/m2 heat flux and maintain the temperature of the tungsten below 1400 °c, the required velocities are 11, 15 and 53 m/s for porosity values of 50 %, 80 % and 95 %, with pressure drops of 0.32, 0.07 and 0.07 mpa, respectively. using this concept it is also possible to accommodate 30 mw/m2, but in order to keep the surface temperature below 1400 °c a velocity of 120 m/s with 80 % porosity is necessary, which leads to a pressure drop of 2 mpa [7, 12].
figure 5. sketch of the cvd mockup. the inset is an sem photo of tungsten porous foam [6].
the tungsten foam in the tube concept is shown in figure 5, where the chemical vapor deposition (cvd) technique is used for forming the w foam. this has an advantage over a packed bed of balls or irregular solids, as the relatively large interconnected free volume around the ligaments is available for helium flow, which considerably reduces the pressure drop.
articles made using this concept were tested at snl, where they were able to remove 22.4 mw/m2 with helium flowing at 27 g/s, at 4 mpa. the tests also showed that a linear pressure drop of 1.87 mpa/m is obtained with helium flow at 4 mpa and 24.8 g/s [6].
figure 6. assembly drawing of the short flow path foam-in-tube (sofit) design [13].
the short flow path foam-in-tube (sofit) design (figure 6) uses axial flow from an inlet tube towards the high heat flux surface through a slot, where helium flows circumferentially through a foam from a 2 mm wide slot separating the inlet and outlet plenums. as the flow path through the foam is very short, the pressure drop is much lower than for the other concepts. this device is able to handle 10 mw/m2 before reaching 800 °c on the surface, with 28 g/s of helium at 4 mpa and an inlet temperature of 40 °c [13].
extended surfaces increase the effective htc by the following mechanisms: a reduced flow area increases the flow velocity, the hydraulic diameter is smaller, and finally the heat transfer area is increased. extended surfaces can be provided by pin fins, by 2d or 3d surface roughness, or by using ribs. table 1 gives a general idea of how various heat transfer enhancement methods contribute to an increase in htc and in the friction factor (ff) over a smooth tube [14].
table 1. heat transfer enhancement methods [14].
method | htc over smooth pipe | ff over smooth pipe
microfins | 8 | 30
porous media | 5 | 20
jet impingement | 3 | 7
particulate addition | 10 | 30
swirl tape | 2 | 4
2d roughness | 1.8 | 3.8
3d roughness | 3 | 7
swirl rod insert | 2.5 | 5
swirl rod insert with 2d roughness | 3.5 | 7
2d and 3d roughness increases the htc by breaking the laminar boundary layer near the wall [14]. a swirl rod insert (sri) uses a rod with fins in place of a twisted tape. this reduces the hydraulic diameter and produces higher effective flow velocities. the increase in htc and ff when using this method can be found in table 1. a combination of sri and 2d roughness is very effective, and tests conducted at snl showed that it can easily withstand 5 mw/m2 [14].
figure 7. helium-cooled divertor module [15].
using pin fins increases the effective htc to a very high value, but the increase depends on the type of coolant, the material, and also the geometry and orientation of the fins. figure 7 shows a test module fabricated using dispersion strengthened copper, designed by general atomics and tested at snl. the test results show that the module was able to withstand 10 mw/m2 at 4 mpa and a 22 g/s flow rate without any damage, and the pumping power required was < 1 % of the total power removed [15].
figure 8. illustration of a square-ribbed annular channel [16].
figure 8 shows the scheme of the annular flow square ribbed channel, which is used to increase the turbulent heat transfer. square ribbed channels can provide a 200–300 % increase in htc in comparison with smooth channels. for a pitch-to-height ratio of 10 and with a reynolds number of 65000, the channel was able to remove a 1.5 mw/m2 heat flux [16].
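the multipliers collected in table 1 can be compared on a common basis by penalizing the friction increase; a frequently used criterion is the thermal performance factor eta = (htc ratio)/(ff ratio)^(1/3), which compares methods at roughly equal pumping power. the sketch below applies it to the tabulated values; the criterion itself is a textbook convention, not one used in [14].

```cpp
// rank the enhancement methods of table 1 by the constant-pumping-power
// performance factor eta = (htc ratio) / (ff ratio)^(1/3).
// the multipliers are copied from table 1 [14]; the ranking criterion is a
// common textbook convention, not taken from that reference.
#include <cmath>
#include <cstdio>

struct Method { const char* name; double htcRatio; double ffRatio; };

int main() {
    const Method methods[] = {
        {"microfins",                    8.0, 30.0},
        {"porous media",                 5.0, 20.0},
        {"jet impingement",              3.0,  7.0},
        {"particulate addition",        10.0, 30.0},
        {"swirl tape",                   2.0,  4.0},
        {"2d roughness",                 1.8,  3.8},
        {"3d roughness",                 3.0,  7.0},
        {"swirl rod insert",             2.5,  5.0},
        {"swirl rod insert + 2d rough.", 3.5,  7.0},
    };
    for (const Method& m : methods) {
        const double eta = m.htcRatio / std::cbrt(m.ffRatio);
        std::printf("%-30s eta = %.2f\n", m.name, eta);
    }
    return 0;
}
```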
figure 9. he-cooled modular divertor design with pin arrays (hemp) [17].
the concept of a helium-cooled modular pin array (hemp) is shown in figure 9. a modular design, instead of large plate structures, is advantageous with respect to thermal stress when designing large structures such as the divertor. the hemp design consists of w tiles brazed into a thimble (finger) made of molybdenum alloy (tzm). the fingers are inserted into a front plate made of tzm. in the study carried out by e. diegele et al. [17], tzm was assumed to be the structural material, but rafm-ods steels are also used as the structural material of the back plate. the inlet temperature of the helium gas is 620 °c at 10 mpa; the inlet temperature was dictated by the ductile-brittle transition temperature of the materials. to improve the heat convection at the top of the finger, a plate with a pin fin array is inserted (by brazing), which dictates the performance of the design. this concept has the potential to remove a 15 mw/m2 heat flux, and an htc in excess of 60 kw/m2k is estimated, where the pumping power is < 5 % of the thermal power [17].
jet impingement cooling is an effective technique for achieving very high htcs with much lower pressure drops than other techniques. when helium at high pressure (∼ 10 mpa) leaves the hole or slot and the jet hits the surface exposed to hhf, the outlet velocity of the jet is so high that the flow is immediately turbulent, resulting in increased local turbulent mixing and very high htcs [18]. various techniques have been proposed for achieving this, namely an impinging jet with particulate addition, helium-cooled modular jet cooling (hemj), helium-cooled modular slot array cooling (hems), the t-tube design, the plate type design, and the integrated design. the addition of sic particles to the helium impinging jet is expected to give an htc up to 20000 w/m2k, which can handle heat fluxes of the order of 10 mw/m2 [19].
figure 10. schematics of hemj and hems divertor finger concepts [20].
figure 11. cross-section of the mid-size t-tube aries divertor concept [18].
the finger concept or modular concept (hemj, hems) is designed to minimize the thermal stresses and accommodate high heat fluxes. figure 10 illustrates these concepts. these configurations include small hexagonal armor tiles made of w, brazed to a w alloy thimble on which the jet impinges, cooling the plasma-facing surface. the cooling unit uses helium at 10 mpa and 600/700 °c as the inlet temperature [20]. this cooling structure is joined to the remaining body using transition pieces; a detailed discussion can be found elsewhere. these modular concepts are designed to withstand heat fluxes of at least 10 mw/m2. the experiments showed that the hemj concept can handle higher heat fluxes with a lower pressure drop than the hems concept [21].
the t-tube is a mid-size concept in comparison with the modular and plate type concepts. this concept is illustrated in figure 11, where it is shown along with its dimensions. on the plasma facing side there is castellated w armoring, which is joined to the w alloy outer tube. helium gas impinges onto this from the w alloy inner tube through a thin slot cut into that tube. these two w alloy tubes can be attached through transition pieces to the base structure made of steel, to reduce the thermal stresses. this heat sink can accommodate heat fluxes in excess of 10 mw/m2 [18]. the inlet conditions are the same as for the hemj concept.
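several of the helium concepts above quote the pumping power as a small fraction of the removed heat (< 1 % for the pin-fin module, < 5 % for hemp). inverting that budget gives the allowable pressure drop, dp ≈ f·q·rho·eta/mdot for a target fraction f; the sketch below uses illustrative numbers on the scale of the finger concepts, not design data.

```cpp
// invert the pumping-power budget: for a target fraction f of the removed
// heat, the allowable pressure drop is dp_max ~ f * q * rho * eta / mdot.
// all numbers are illustrative assumptions on the scale of the finger
// concepts discussed in the text, not design data.
#include <cstdio>

int main() {
    const double f    = 0.05;    // pumping-power budget (5 % of thermal power)
    const double q    = 1.8e3;   // heat removed per finger [w]
                                 // (~10 mw/m2 on ~1.8 cm2), assumed
    const double rho  = 5.5;     // helium density at ~10 mpa, ~600 c [kg/m3], assumed
    const double eta  = 0.85;    // circulator efficiency, assumed
    const double mdot = 6.8e-3;  // helium mass flow per finger [kg/s], assumed

    const double dpMax = f * q * rho * eta / mdot;
    std::printf("allowable pressure drop per finger: %.0f kpa\n", dpMax / 1e3);
    return 0;
}
```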
figure 12. cross-section of a unit cell, and a drawing of the divertor plate module [13].
figure 13. longitudinal section through the target plate channel [13].
the cross and longitudinal sections of plate type divertors are shown in figures 12 and 13. thanks to its large size, a plate heat sink minimizes the number of units needed. the plate is made of w-alloy with castellated w armoring. the inlet and outlet sections are tapered in order to obtain a uniform velocity distribution. this concept can accommodate heat fluxes up to 10 mw/m2, with a helium inlet temperature of ∼ 600/700 °c at 10 mpa [13].
table 2 compares the three jet impingement concepts introduced above for a divertor area of 150 m2.
table 2. a comparison of various divertor concepts for a tokamak with an assumed divertor area of 150 m2 [13].
divertor concept | characteristic unit dimensions | number of units per typical tokamak | allowable incident heat flux (mw/m2)
finger | 1.5 cm in diameter | ∼ 535000 | > 12
t-tube | 10 × 1.5 cm | ∼ 110000 | ∼ 10–12
plate | 100 × 200 cm | ∼ 750 | ∼ 8–10
figure 14. integrated plate + finger unit [20].
it is known that the resulting heat flux footprint on the divertor is gaussian-shaped, so only a part of the divertor will see high heat fluxes. in order to minimize the number of parts and increase the reliability of the joints, integrated units are designed as shown in figure 14. in this design, finger units are used where the expected heat fluxes are about 10 mw/m2, and a plate type design is used where the expected heat fluxes are lower, thus optimizing the overall design [20].
figure 15. helium-cooled divertor for demo (cooling of the coolant) [4].
the concept for cooling the coolant is presented in figure 15, which is designed to use cheaper eurofer steel as much as possible, rather than costlier eurofer ods steel, as the outlet temperature for the concepts discussed above will be greater than 650 °c. this is achieved by mixing in helium at 400 °c through a mixing tap at the outlet of the heated region, which brings the outlet temperature down to 550 °c, so that eurofer can be used there. in this concept, in the region where high heat fluxes are estimated to occur, drillings are made in the cartridge to exploit the jet impingement phenomenon [4].
4. water cooling
water cooling gives very high htcs when it is operated in the subcooled boiling regime and stays below the chf limit. details about the boiling phenomenon can be found elsewhere [22]. in the literature, several cooling techniques have been developed to increase the chf limit, such as swirl tubes, hypervapotrons, screw tubes, etc. in the fusion community, three design options are available when water is used as a coolant. these three options, based on bonding the plasma facing component to the heat sink [3], are the flat tile, saddle tile and monoblock designs, which are discussed in [3]. here we will focus on techniques for improving heat transfer.
adding ice particulates to the water greatly increases the heat transfer. the heat transfer and chf are increased because the inlet subcooling is increased to its maximum value: the inlet temperature is at the ice melting point, thus maximizing the rise in the subcooling temperature. this allows additional high heat flux removal through ice melting, to an extent dependent on the latent heat of fusion of ice and on the ice particulate mass flow fraction.
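the gain from the ice admixture follows from a simple energy balance per kilogram of coolant: the ice fraction contributes its latent heat of fusion on top of the sensible heat of the subcooled liquid. a minimal sketch with textbook water/ice properties, interpreting the mass flow quoted in the test below as a mass flux in kg/m2s:

```cpp
// energy absorbed per kg of (water + ice) coolant: latent heat of the ice
// fraction plus sensible heating of the whole stream. property values are
// textbook figures; the flow numbers follow the test conditions quoted in
// the text (2 bar, 15000 kg/m2s interpreted as mass flux, 30 % ice).
#include <cstdio>

int main() {
    const double xIce = 0.30;     // ice mass fraction
    const double lf   = 334e3;    // latent heat of fusion of ice [j/kg]
    const double cp   = 4186.0;   // specific heat of water [j/kg k]
    const double dT   = 100.0;    // heating from 0 c towards saturation
                                  // at 2 bar (~120 c) [k], assumed
    const double g    = 15000.0;  // mass flux [kg/m2 s]
    const double d    = 0.01;     // pipe diameter [m]

    const double ePerKg = xIce * lf + cp * dT;     // [j/kg]
    const double area   = 3.14159 / 4.0 * d * d;   // flow cross-section [m2]
    const double mdot   = g * area;                // [kg/s]
    std::printf("absorbed energy: %.0f kj/kg\n", ePerKg / 1e3);
    std::printf("mass flow:       %.2f kg/s\n", mdot);
    std::printf("heat capacity:   %.0f kw\n", mdot * ePerKg / 1e3);
    return 0;
}
```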
tests that were conducted show that for a water pressure of 2 bar and a mass flux of 15000 kg/m2s through a 1 cm pipe with an ice fraction of 30 %, if complete ice meltdown occurs, a 20 mw/m2 heat flux can easily be accommodated [23].
figure 16. various systems for impinging planar jet cooling: (a) a free jet on a flat surface; (b) a confined jet on a curved surface [25].
figure 17. cooling the divertor by jet cooling [25].
jet impingement using water is another way to increase chf. figure 16 shows various impinging planar jet cooling methods. if jet cooling is used on a flat surface, very high chf is obtained in the limited central region right under the jet nozzle, and the chf decreases abruptly in the region away from the center. this drawback is overcome by a jet on a curved heated surface, as shown in figure 16(b). using this technique, a maximum chf of 38 mw/m2 was obtained locally under conditions of jet velocity v = 14.6 m/s and tsub = 80 k, on a curved cooling surface with a radius of curvature r = 24.5 cm [24]. figure 17 shows various divertor designs using jet cooling, namely free jet and confined jet techniques. the confined planar jet allows a compact cooling system that fits the heat flux distribution more suitably, by changing the flow velocity, and it can operate under elevated pressure. free planar jet cooling has the advantage of keeping the pressure drop in the cooling system small [25].
figure 18. heat and working fluid flow in a heat pipe [26].
figure 19. swirl tube [29].
porous media can be used in various ways for increasing heat transfer. mechanically pumped porous media heat exchangers, capillary pumped exchangers (heat pipes) and porous coatings are discussed in the literature. pumped single phase porous media heat exchangers using water, which can remove heat fluxes in the range of 100 mw/m2, were tested at snl [26]. there are several reasons for preferring heat pipes (figure 18) for removing heat. heat pipes can be designed to work against gravity. they are passive, i.e. they require no mechanical pump to function. they also require only a relatively small amount of working fluid, so failure of a heat pipe will only allow a relatively small amount of working fluid to leak out. they can also be used to control heat removal at a uniform or constant temperature, or to spread a high heat flux to a larger area to allow heat removal by conventional techniques. water heat pipes were evaluated for cooling the faraday shield that protects the radio frequency (rf) antennae used for plasma heating. tests on these devices have shown their ability to absorb localized heat fluxes in excess of 10 mw/m2 [26].
a porous coating is another way to enhance chf. it is interesting because it increases the chf by approximately 40–60 % over a smooth channel, while the pressure drop is almost of the same order as that of a smooth channel [27]. the tests conducted at snl showed that ichf = 24.5 mw/m2 is obtained with water flowing at 4 mpa, 100 °c and 10 m/s, with a pressure drop of 0.19 mpa/m [27]. erosion and corrosion of the porous coating have kept this method from wide acceptance [28].
a swirl tube uses a tape which is twisted and inserted into a circular tube, as shown in figure 19. the increase in chf is obtained due to turbulence and the superimposed vortex motion (swirl flow), which cause a thinner boundary layer and consequently a high heat transfer coefficient and nusselt number, due to repeated changes in the twisted tape geometry [29].
increases in chf values of > 50 % compared to the values for smooth tubes have been obtained using a swirl tube in various experiments. several experiments have been conducted internationally on the swirl tube as a chf enhancer, and it is the reference design for cooling the vertical target of the divertor at iter, with a monoblock configuration [30], where it has been shown that swirl tubes can easily accommodate a 20 mw/m2 heat flux [30, 31].
figure 20. scheme of the outer vertical target [30].
figure 21. scheme of an annular flow tube (a swirl tape with a solid center piece) [27].
figure 20 shows the scheme of the iter outer vertical target (ovt), with the location of the swirl tubes. the same figure shows the cross section of the tubes used for the inner vertical target (ivt) and for the outer vertical target (ovt). the cross section of the twisted tape used for the ovt has the form of a parallelogram with the sharp edges chopped off. its sheet is thicker than the one used for the ivt, in order to sustain critical heat flux performance equivalent to that of the ivt under a lower coolant flow rate [30].
the scheme of an annular flow swirl tube is presented in figure 21. it is similar to a swirl tube, except that it has a solid center piece. the annular swirl is not preferred, as relatively little data is available for it; its chf is no better than the chf for the swirl tape, and there is a higher pressure drop for a given flow rate [27].
figure 22. hypervapotron module configuration [34].
figure 23. schematic view of a screw tube made of f82h and the heating condition [36].
the hypervapotron consists of fins placed perpendicular to the direction of flow, with slots at each end of the fins parallel to the direction of flow, as shown in figure 22. when sufficient heat flux is applied to the wall in contact with the hypervapotron fins, the liquid between two adjacent fins boils. when the slot between the fins is full of steam, the steam rapidly condenses in the bulk subcooled liquid. the slots are then empty, and are ready to be replenished with fresh cold liquid. this phenomenon of continuous boiling and condensation between the slots increases the critical heat flux [32]. the chf values were > 50 % higher than for smooth tubes when the hypervapotron was used in various experiments. it can easily withstand 20 mw/m2, as has been verified by several experiments [33, 34]. it is used in the dome liner and also in some places in the first wall at iter to deal with heat fluxes of ∼ 5 mw/m2. for an equivalent flow, the hypervapotron has the highest chf limit, and the pressure drop is also lower than for swirl tubes [27, 34]. a comparison made by smid et al. [35] proved that the hypervapotron has better thermal hydraulic performance than a swirl tube, but that the swirl tube has better thermomechanical performance. in order to gain the performance advantages of both the hypervapotron (on the thermohydraulic side) and the swirl tube (on the thermomechanical side), the best concept could be a combination of the two: a circular channel with helical fins, which is just a screw tube. a screw tube is a cooling tube with a helical triangular fin on its inner cooling surface, as shown in figure 23. the nut-like inner surface both enlarges the heat transfer area and acts as a turbulence promoter for the cooling water near the surface, enhancing the heat transfer [36]. heat fluxes in excess of 20 mw/m2 can be accommodated using a heat sink made with this geometry.
it has been reported that a screw tube made of pure cu achieves twice as much heat removal as a smooth tube [36]. it was assumed that the screw would act as a crack initiator, but experiments carried out at jaeri found that this is not what happens. however, further investigation under cyclic loading is needed [37].
5. conclusions
various cooling schemes relevant for fusion applications have been presented, focusing on the use of helium and water as working fluids for cooling hhf components, together with the advantages and disadvantages of each scheme. one of the main problems encountered today comes from the materials used for pfcs and heat sinks: the properties of these materials can differ, leading to thermal stresses and to the failure of a component. however, an optimized thermohydraulic design not only results in safe operation of the reactor but also minimizes the pfc temperatures, the coolant flow and the pumping power requirements, while extending the life of the reactor.
it has been mentioned that if simple channels are used (without turbulence enhancers), very high velocities and pressures are required. this would not be economical, so turbulence enhancers are essential for cooling against hhf. the main criteria in choosing a particular fluid and turbulence enhancer are the cooling capacity of the fluid and the pressure drop (which in turn decides the pumping speed).
helium cooling is very advantageous because of its safety features and its operating temperatures, and it can be directly combined with the brayton cycle to produce very high thermal efficiency. among the heat transfer enhancers, jet cooling has undergone the most experiments and has received much attention, as it can deal with heat fluxes of the order of 10 mw/m2 with lower pressure drops. the heat flux footprint on the divertor is gaussian in shape, so the integrated plate and finger type cooling system is very advantageous, as the number of units can be decreased. however, the sofit and hemp concepts need extensive study, as they have very high potential for removing hhf. in addition, further research is needed to find suitable materials. helium cooling will be used in future fusion reactors, as helium is a very attractive coolant for achieving greater plant efficiency.
water cooling is attractive, as it has been used for a long time for cooling various components in various fields, and there is a very large database of experiments to draw on. at the same time, water can handle very high heat fluxes with lower pressures and velocities than helium. the main caveat of a water cooling system is the chf limit, and several ways of increasing this limit have been mentioned here. the swirl tube with a monoblock design is used for hhf applications at iter to handle very high heat fluxes of the order of 10–20 mw/m2, as can be seen in the vertical target of the divertor. the hypervapotron will be used to handle heat fluxes of the order of 5 mw/m2, as can be seen in the dome of the divertor and in the enhanced flux design of the first wall. the screw tube has the advantages of both the swirl tube and the hypervapotron. experiments conducted by a japanese team have shown that the screw tube has better cooling capability than the swirl tube, but the experimental database on this scheme is limited in size, and needs to be improved before the screw tube is ready for use in future reactors. for the near term, water cooling is very attractive for fusion reactors.
however, further research and development can make helium cooling advantageous in the longer term.
references
[1] raffray, a.r., merola, m.: overview of the design and r&d of the iter blanket system. fusion engineering and design, 87 (5–6), 2012, pp. 769–776.
[2] norajitra, p., et al.: progress of he-cooled divertor development for demo. fusion engineering and design, 86 (9–11), 2011, pp. 1656–1659.
[3] chappuis, ph., et al.: possible divertor solutions for a fusion reactor. part 2. technical aspects of a possible divertor. fusion engineering and design, 36, 1997, pp. 109–117.
[4] reiser, j., rieth, m.: optimization and limitations of known demo divertor concepts. fusion engineering and design, 87 (5–6), 2012, pp. 718–721.
[5] dobran, f.: fusion energy conversion in magnetically confined plasma reactors. progress in nuclear energy, 60, 2012, pp. 89–116.
[6] youchison, d.l., et al.: high heat flux testing of a helium-cooled tungsten tube with porous foam. fusion engineering and design, 82 (15–24), 2007, pp. 1854–1860.
[7] raffray, a.r., pulsifer, j.e.: merlot: a model for flow and heat transfer through porous media for high heat flux applications. fusion engineering and design, 65 (1), 2003, pp. 57–76.
[8] rosenfeld, j.h., et al.: cooling of plasma facing components using helium-cooled porous metal heat exchangers. fusion technology, 27 (14), 1994, pp. 255–258.
[9] wong, c.p.c., et al.: helium-cooled refractory alloys first wall and blanket evaluation. fusion engineering and design, 49–50, 2000, pp. 709–717.
[10] hermsmeyer, s., malang, s.: gas-cooled high performance divertor for a power plant. fusion engineering and design, 61–62, 2002, pp. 197–202.
[11] youchison, d.l., et al.: thermal performance and flow instabilities in a multi-channel, helium-cooled, porous metal divertor module. fusion engineering and design, 49–50, 2000, pp. 407–415.
[12] pulsifer, j.e., raffray, a.r.: structured porous media for high heat flux fusion applications. proceedings of the 19th ieee/npss symposium on fusion engineering, 2002, pp. 352–355.
[13] tillack, m.s., et al.: recent us activities on advanced he-cooled w-alloy divertor concepts for fusion power plants. fusion engineering and design, 86 (1), 2011, pp. 71–98.
[14] baxi, c.b., wong, c.p.c.: review of helium cooling for fusion reactor applications. fusion engineering and design, 51–52, 2000, pp. 319–324.
[15] baxi, c.b.: evaluation of helium cooling for fusion divertors. fusion engineering and design, 25 (1–3), 1994, pp. 263–271.
[16] takase, k.: forced convective heat transfer in square-ribbed coolant channels with helium gas for fusion power reactors. fusion engineering and design, 49–50, 2000, pp. 349–354.
[17] diegele, e., et al.: modular he-cooled divertor for power plant application. fusion engineering and design, 66–68, 2003, pp. 383–387.
[18] ihli, t., et al.: design and performance study of the helium-cooled t-tube divertor concept. fusion engineering and design, 82 (3), 2007, pp. 249–264.
[19] yamazaki, s., et al.: design study of helium-solid suspension cooled blanket and divertor plate for a tokamak power reactor. fusion engineering and design, 25 (1–3), 1994, pp. 227–238.
[20] raffray, a.r., malang, s., wang, x.: optimizing the overall configuration of a he-cooled w-alloy divertor for a power plant. fusion engineering and design, 84 (7–11), 2009, pp. 1553–1557.
[21] norajitra, p., et al.: he-cooled divertor development for demo. fusion engineering and design, 82 (15–24), 2007, pp. 2740–2744.
[22] ishii, m., hibiki, t.: thermo-fluid dynamics of two-phase flow. springer, 2011.
[23] gorbis, z.r., raffray, a.r., abdou, m.a.: high heat flux removal by phase-change fluid and particulate flow. fusion technology, 23, 1993, pp. 435–441.
[24] inoue, a., et al.: two-dimensional impinging jet cooling of high heat flux surfaces in magnetic confinement fusion reactors. fusion engineering and design, 28, 1995, pp. 81–89.
[25] inoue, a., et al.: studies on a cooling of high heat flux surface in fusion reactor by impinging planar jet flow. fusion engineering and design, 51–52, 2000, pp. 781–787.
[26] rosenfeld, j.h., et al.: advances in porous media heat exchangers for fusion applications. proceedings of the 19th symposium on fusion technology, 1996, pp. 487–490.
[27] raffray, a.r., et al.: critical heat flux analysis and r&d for the design of the iter divertor. fusion engineering and design, 45 (4), 1999, pp. 377–407.
[28] baxi, c.b.: thermal hydraulics of water cooled divertors. fusion engineering and design, 56–57, 2001, pp. 195–198.
[29] bournonville, y., et al.: numerical simulation of swirl-tube cooling concept, application to the iter project. fusion engineering and design, 84 (2–6), 2009, pp. 501–504.
[30] suzuki, s., et al.: development of the plasma facing components in japan for iter. fusion engineering and design, 87 (5–6), 2012, pp. 845–852.
[31] gavila, p., et al.: high heat flux testing of mock-ups for a full tungsten iter divertor. fusion engineering and design, 86 (9–11), 2011, pp. 1652–1655.
[32] cattadori, g., et al.: hypervapotron technique in subcooled flow boiling chf. experimental thermal and fluid science, 7, 1993, pp. 230–240.
[33] milnes, j.: computational modelling of the hypervapotron cooling technique for nuclear fusion applications. phd thesis, 2010.
[34] wang, z., song, y., huang, s.: design of the hypervapotron module for the east device. fusion engineering and design, 87 (5–6), 2012, pp. 868–871.
[35] smid, i., et al.: comparison between various thermal hydraulic tube concepts for the iter divertor. proceedings of the 19th symposium on fusion technology, lisbon, portugal, 1996, pp. 263–266.
[36] ezato, k., et al.: critical heat flux testing on screw cooling tube made of rafm-steel f82h for divertor application. fusion engineering and design, 75–79, 2005, pp. 313–318.
[37] ezato, k., et al.: thermal fatigue experiment of screw cooling tube under one-sided heating condition. journal of nuclear materials, 329–333, 2004, pp. 820–824.
acta polytechnica 53(4):338–343, 2013
© czech technical university in prague, 2013, available online at http://ctn.cvut.cz/ap/
developing control and monitoring software for the data acquisition system of the compass experiment at cern
martin bodláka, vladimír jarýa,∗, igor konorovb, alexander mannb, josef novýa, severine paulb, miroslav viriusa
a faculty of nuclear sciences and physical engineering, czech technical university in prague, břehová 7, 115 19 prague 1, czech republic
b physik department, technische universität münchen, james-franck-str. 1, 85748 garching, germany
∗ corresponding author: vladimir.jary@cern.ch
abstract. this paper focuses on the analysis, design and development of software for the new data acquisition system of the compass experiment at cern.
in this system, the data flow is controlled by custom hardware; the software will therefore be used only for run control and for monitoring. the requirements on the software have been analyzed, and the functionality of the system has been defined. the system consists of several distributed nodes; communication between the nodes is based on a custom protocol and the dim library. a minimal version of the system has already been implemented. preliminary results of performance and stability tests have shown that the system fulfills the defined requirements, and is stable. in the next phase of development, the system will be tested on the real hardware. it is expected that the system will be ready for deployment in 2014.
keywords: data acquisition, remote control, monitoring, compass.
1. introduction
modern high energy physics experiments depend strongly on the computer systems that are used for the simulations, data acquisition, data storage, and data analysis. this paper focuses on the software development for a new data acquisition system for the compass experiment. first, the experiment is briefly introduced. then the existing daq system, based on the alice date package, is explained, and the problems with it that triggered the development of the new system are analyzed. next, a new system featuring hardware controlled data flow is presented, and the results of a requirements analysis are summarized. then, a proposal for a minimal version of the software that fulfills these requirements is presented. the first results of performance and scalability tests are then summarized. finally, the plans for future development are presented.
2. the compass experiment
compass (the common muon and proton apparatus for structure and spectroscopy) is a high-energy fixed-target experiment situated at the super proton synchrotron at cern, built for studying the gluon and quark structure and for spectroscopy of hadrons using high intensity muon and hadron beams [1]. the scientific program was approved by the cern scientific council in 1997; it includes research on hadron structure and spectroscopy with high-energy hadron and muon beams. after several years of preparation and commissioning, data gathering began in 2002. the experiment is currently entering its second phase, known as compass-ii, which covers studies of primakoff scattering, the drell-yan effect, and generalized parton distributions [2].
3. existing daq architecture
the compass spectrometer is composed of detectors that are used to track and identify particles and to measure particle energies. the data acquisition (daq) system is used to read out data from the detectors, to assemble full events from fragments from different detector channels, and to store events into permanent storage. the daq system also provides control and monitoring. the compass daq system consists of several layers [3]. the front-end electronics that form the lowest layer continuously preprocess and digitize analog data from the detectors. there are approximately 250000 detector channels. data from multiple channels is read out and assembled by the concentrator modules called catch (compass accumulate, transfer, and control hardware) and gesica (gem and silicon control and acquisition module). these modules receive the signals from the time and trigger system; when the trigger signal arrives, the readout is performed. the subevent is created by adding the time stamp and the event identification to the data. the subevents are transferred using the optical bus s-link to the readout buffers that form the following layer of the architecture.
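the subevent framing described above, detector data plus a time stamp and event identification, can be pictured with a toy structure; the field names and widths below are purely illustrative and are not the actual compass data format.

```cpp
// toy sketch of a subevent as described in the text: digitized detector
// data plus a time stamp and event identification added by the concentrator
// modules. field names and widths are illustrative only, not the real
// compass format.
#include <cstdint>
#include <vector>

struct SubEventHeader {
    uint32_t eventId;    // event identification assigned at readout
    uint32_t sourceId;   // which catch/gesica module produced the fragment
    uint64_t timeStamp;  // time stamp from the time and trigger system
    uint32_t dataSize;   // payload size in 32-bit words
};

struct SubEvent {
    SubEventHeader header;
    std::vector<uint32_t> payload;  // digitized front-end data
};
```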
the subevents are transferred using the optical bus s-link to the readout buffers that form the following layer of the 2compass accumulate, transfer, and control hardware 3gem and silicon control and acquisition module 338 http://ctn.cvut.cz/ap/ vol. 53 no. 4/2013 developing control and monitoring software figure 1. dim name server. architecture. readout buffers are standard servers equipped with custom pci cards called spillbuffers. spillbuffers are used for buffering subevents, which allows the load to be distributed across the full cycle of the sps accelerator. finally, the subevents are sent over the gigabit ethernet to the event builders that assemble full events and transfer them to permanent tape storage. the remaining cpu power of the event building servers is used for data quality monitoring and for filtering purposes. the data acquisition is powered by the date software [4], which has been adopted from the alice experiment. this package is designed to perform data acquisition tasks in a distributed environment; each node must be powered by intel-compatible hardware and a gnu/linux operating system. date is a very flexible and scalable system – at the alice experiment, it is deployed on hundreds of nodes, but it can also be used in a small laboratory experiment on a single node. from the functionality point of view, the date package provides data flow control, event building, data quality monitoring, event sampling, information reporting, interactive configuration, and run control. 4. motivation for upgrade during the first year of data taking (i.e. 2002), the system collected 200 tb of data [5], and this amount rose almost 10 times to 2 pb in 2010. the requirements on the daq system increase as the trigger rate and the beam intensity increase, and the existing system has already experienced performance problems and increased dead time. additionally, as the hardware gets older, its failure rate also increases. one possible solution is to upgrade the hardware of the event builders and the readout buffers. unfortunately, the spill buffer cards in the readout buffers are based on the deprecated pci technology. it would therefore be necessary to develop and produce a pci express version of these cards. however, no software development (with the exception of the kernel driver for the new spill buffer card) would be necessary in this scenario. another scenario proposes replacing the event building network with custom hardware based on fpga programmable circuits4 [6]. this hardware would perform the readout and the event building, so the software would be used only for control and monitoring. this system would consist of fewer components, so it should be more reliable and easier to maintain. as an additional benefit, the existing readout buffers and event builders could be used as an online filtering farm. 5. requirements on the new software the control and monitoring software is to be deployed in a distributed heterogeneous environment. some nodes (e.g. the user interface application) will be installed on standard intel-compatible hardware, but the control and the monitoring nodes will be deployed on custom cards with softcore processors. we have evaluated the possibility of using the date package on the new daq architecture; unfortunately this idea has been rejected – mainly because date requires intel-compatible hardware and, moreover, the system is too complex. 
however, the date data format must remain unchanged in order to make the new architecture compatible with the tools for the physics analysis. additionally, some date components, e.g. the online filter, could be reused. in order to retain compatibility with the detector control system, the dim library must be used for network communication. moreover, the system must support remote control, and multiple users should be able to access the system. however, only one user can take control of the system at any time; the other users can monitor the behavior of the system. data acquisition does not require real time operation, so it is not necessary to depend on a real time operating system and special libraries.
figure 1. dim name server.
6. dim library
dim (distributed information management) is a software package that provides asynchronous, one-to-many communication in a heterogeneous network environment [7]. the library is based on the tcp/ip protocols; the client-server paradigm is extended by the concept of a dim name server (dns). in order to publish an information service or a command, a server (publisher) must register it at the name server. when a client (subscriber) wishes to subscribe to some service, it asks the name server which server has published that service. the communication with the name server is transparently provided by the library; the user only has to provide the location of the dns (usually by exporting an environment variable). dim is a c library, but interfaces to the c++, java, and python languages also exist. the performance of the library was measured for different message sizes, and the c++ and java interfaces were compared. we found that, for larger messages, the dim library can saturate the network [8]. the java version of the library calls the native c code through the java native interface. the performance hit caused by jni calls is about 20 % for smaller messages; for larger messages, the performance hit can be neglected. additionally, it has been found that the java version of the dim library is still incomplete. it was therefore decided to focus on c++ and to abandon java. however, the python language can be used for some auxiliary tasks, such as waking up the slaves.
7. overview of the software architecture
figure 2. software layers.
using the results of the requirements analysis, we designed the structure of the control and monitoring system. the system is deployed on several distributed nodes. communication between nodes is based on the dim package. the nodes can be divided into several categories, according to their purpose. according to figure 2, one node acts as the master. the master node receives commands from applications that provide the graphical user interface, and forwards these commands to the slave nodes. in turn, the slave nodes generate monitoring data and confirmation messages that are sent back to the master. at any one time, multiple user interface applications can receive data from the master; however, only one instance can issue commands. in addition, communication between the master node and the user interface is based on the dim library, which means that remote control is possible. the configuration of the system is stored in an online mysql database; the mysql database was chosen for its compatibility with the existing daq system. this configuration includes the lists of slave nodes for different scenarios, e.g. calibrating or data taking.
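the publish/subscribe pattern of section 6 looks roughly as follows in the dim c++ interface. this is a minimal sketch following the standard dim examples (dimservice, dimserver::start, diminfo), with a made-up service name; the exact api should be checked against the dim documentation [7].

```cpp
// minimal dim publish/subscribe sketch following the standard dim c++
// examples; class names are from the dim library [7], but treat exact
// usage as an assumption to verify against its documentation. the service
// name "DAQ/RUN_NUMBER" is a made-up illustration.
#include <dis.hxx>    // dim server side
#include <dic.hxx>    // dim client side
#include <unistd.h>
#include <cstring>
#include <cstdio>

// client side: receives updates asynchronously once the name server has
// resolved which server publishes the service.
class RunNumberClient : public DimInfo {
public:
    RunNumberClient() : DimInfo("DAQ/RUN_NUMBER", -1) {}   // -1 = "no link" value
    void infoHandler() override { std::printf("run number: %d\n", getInt()); }
};

int main(int argc, char** argv) {
    if (argc > 1 && std::strcmp(argv[1], "server") == 0) {
        // server side: register the service at the dim name server,
        // then push updates to all subscribers.
        int runNumber = 0;
        DimService service("DAQ/RUN_NUMBER", runNumber);
        DimServer::start("DAQ_MASTER");
        while (true) { ++runNumber; service.updateService(); sleep(5); }
    }
    RunNumberClient client;   // client role (a separate process in practice)
    while (true) sleep(1);    // updates arrive in infoHandler()
}
```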
when the system starts up, the master loads the appropriate list of slaves from the database, connects to them (via ssh), and wakes up the slave processes. the system configuration is propagated from the master to all slaves using dim services. in this way, only the master process needs access to the database.
one node, called the message logger, is used to collect messages generated during important events (messages of an informative character, e.g. a change of the current state of a node) or unexpected events (i.e. errors) produced on other nodes. these messages are temporarily stored in a memory buffer, and under specified conditions (i.e. the number of messages in the buffer, or the time elapsed since the first message was stored in the buffer) are written to the mysql database. this message logger is de facto a dim server that receives messages sent by other nodes (see figure 2). a message is sent directly to the dim server as continuous text, and is parsed on the server side according to the log protocol. messages already stored in the database can be viewed using a message browser application. the message logger and message browser replace the functionality of the infologger and infobrowser applications from the date package.
it was decided to use the qt framework for implementing the slave, the master, and the gui parts. qt extends the object model of the c++ language, and it also provides a large class library that covers graphical components (widgets), networking, database access, multithreading, and data manipulation. additionally, the framework supports all the major platforms: windows, macos, and gnu/linux with x11. the slave and the master were successfully tested on qt version 4.2.1, and the gui on version 4.6.1.
the slave includes both the dim server and dim client parts. for all outgoing communication, the slave uses only dim services; for incoming communication it uses dim commands and dim services. the master is also a combination of a dim server and a dim client. it uses dim services for regular, more frequent or multicast messages, and dim commands for targeted sending of messages to control the state of a single node. the gui has only the client part of the dim library, so it communicates by sending commands to the master and receiving status messages from the master's services.
8. transport protocol overview
we proposed and implemented a custom protocol that is used for exchanging information between nodes. this protocol defines a common frame for transporting messages and commands, which makes it easier to manipulate the information they carry. a precise description of this protocol is summarized in table 1.
table 1. transport protocol.
part | field | size | meaning
header | 1. data size | 4 bytes | size of data in 32b words (= header_size + body_size + trailer_size)
header | 2. version | 4 bytes | version of data protocol
header | 3. sender id | 4 bytes | unique id of message's sender (from db)
header | 4. message number | 4 bytes | number of message
header | 5. receiver id | 4 bytes | unique id of message's receiver (0 = multicast) (from db)
header | 6. message id | 4 bytes | message id
header | 7. time | 4 bytes | time stamp 1/2
header | 8. time | 4 bytes | time stamp 2/2
body | 9. body | 0–4n bytes | body of message (can be empty)
trailer | 10. reserved | 4 bytes | 0x00000000
trailer | 11. reserved | 4 bytes | 0x00000000
trailer | 12. message number | 4 bytes | number of message (the same as in the header)
trailer | 13. message number | 4 bytes | number of message (the same as in the header)
the program uses the qbytearray class for assembling the frames.
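assembling one frame of the protocol from table 1 with qbytearray might look as follows; the helper function is a hypothetical illustration (the byte order, in particular, is an assumption, as the table does not specify it), not code from the compass system.

```cpp
// hypothetical sketch of assembling one transport-protocol frame (table 1)
// into a QByteArray. field order and widths follow the table; the helper
// itself, and the big-endian byte order, are illustrative assumptions.
#include <QByteArray>
#include <QDataStream>
#include <QDateTime>
#include <QIODevice>

QByteArray buildFrame(quint32 version, quint32 senderId, quint32 msgNumber,
                      quint32 receiverId, quint32 messageId,
                      const QByteArray& body)   // body assumed padded to 32b words
{
    QByteArray frame;
    QDataStream out(&frame, QIODevice::WriteOnly);
    out.setByteOrder(QDataStream::BigEndian);

    const quint32 words = (32 + body.size() + 16) / 4;   // header + body + trailer
    const quint64 time  = QDateTime::currentMSecsSinceEpoch();

    out << words << version << senderId << msgNumber           // header fields 1-4
        << receiverId << messageId                              // fields 5-6
        << quint32(time >> 32) << quint32(time & 0xffffffff);   // time stamp 1/2, 2/2
    out.writeRawData(body.constData(), body.size());            // body (may be empty)
    out << quint32(0) << quint32(0)                             // trailer: reserved
        << msgNumber << msgNumber;                              // message number repeated
    return frame;
}
```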
9. performance tests
the architecture described above was thoroughly tested during the winter shutdown of the experiment. the dim name server, the master, and the slave processes were deployed on the existing event builders located in the compass experimental hall; the message logger, the database server, and the graphical user interface were deployed on computers in the remote control room [9]. communication between all nodes is provided by gigabit ethernet, which limits the theoretical transfer speed to 128 mb/s.
figure 3. transfer speed.
several tests were performed for different message sizes and for different numbers of slaves. as shown in figure 3, the system can almost saturate the network for messages larger than 1 kb. for smaller messages, the performance bottleneck is caused by the communication with the dim name server. at these speeds, the system is able to exchange approximately 90000 1 kb messages per second. as real time operation is not required, this result is promising: it is expected that the final system will require a transfer rate of thousands of messages per second at most.
the stability of the system over time was also evaluated; the results are summarized in figure 4. the system exchanged messages continuously between the master and 10 slave nodes over a period of 20 hours. during this period, no memory leaks were detected, and the transfer speeds remained constant. the observed peaks were probably caused by time synchronization with the ntp server. however, this issue is not yet fully understood, and it requires further investigation and testing.
10. conclusion and outlook
the existing data acquisition system of the compass experiment has been analyzed. the readout part of the data acquisition is based on deprecated pci technology, so two different upgrade scenarios were considered. the first scenario involves developing a pci express version of the spillbuffer cards, while the second scenario proposes replacing the readout buffers and event builders with a new architecture based on fpga technology. we analyzed the requirements posed on the software for this new hardware architecture, and we proposed and implemented a minimal control and monitoring application framework. the preliminary performance and stability test results have been discussed; the performance of the system should meet the expected requirements. during the rest of the winter shutdown, further tests are scheduled. in order to test the system under more realistic conditions, the slave processes need to be deployed on the gesica concentrator modules equipped with the prototypes of the new hardware. on a longer timescale, the system specification needs to be extended and finalized in 2012, so that the system will be ready for final testing during the expected annual shutdown of all cern accelerators in 2013. it is planned to deploy the new architecture for the 2014 run.
acknowledgements
this work has been supported by czech ministry of education, youth and sports grants la08015 and sgs 11/167.
references
[1] p. abbon et al. (the compass collaboration). the compass experiment at cern. nucl. instrum. methods phys. res. a 577 (3), 2007, pp. 455–518.
[2] ch. adolph et al. (the compass collaboration). compass-ii proposal. cern-spsc-2010-014; spsc-p-340, may 2010.
[3] l. schmitt et al. the daq of the compass experiment. 13th ieee-npss real time conference 2003, montreal, canada, 18–23 may 2003, pp. 439–444.
[4] t. anticic et al. (alice daq project). alice daq and ecs user's guide. cern edms 616039, january 2006.
figure 4. system stability.
[5] t. nagel. cinderella: an online filter for the compass experiment. münchen: technische universität münchen, january 2009.
[6] a. mann, f. goslich, i. konorov, s. paul. an advancedtca based data concentrator and event building architecture. 17th ieee-npss real-time conference 2010, lisboa, portugal, 24–28 may 2010.
[7] p. charpentier, m. dönszelmann, c. gaspar. dim, a portable, light weight package for information publishing, data transfer and inter-process communication. available at: http://dim.web.cern.ch [2013-08-20]
[8] v. jarý, t. liška, m. virius. developing a new daq software for the compass experiment. 37th software development, ostrava: všb – technická univerzita ostrava, 2011, isbn 978-80-248-2425-3, pp. 35–41.
[9] m. bodlák, v. jarý, t. liška, f. marek, j. nový, m. plajner. remote control room for compass experiment. 37th software development, ostrava: všb – technická univerzita ostrava, 2011, isbn 978-80-248-2425-3, pp. 1–9.
[10] compass page [online]. 2010. available at: http://wwwcompass.cern.ch [2013-08-20]
acta polytechnica 55(6):401–406, 2015, doi:10.14311/ap.2015.55.0401
© czech technical university in prague, 2015, available online at http://ojs.cvut.cz/ojs/index.php/ap
operating specifications of catalytic cleaning of gas from biomass gasification
martin lisý∗, marek baláš, michal špiláček, zdeněk skála
dep. power engineering, energy institute, faculty of mechanical eng., brno university of technology, technická 2896/2, brno 616 69, czech republic
∗ corresponding author: lisy@fme.vutbr.cz
abstract. the paper focuses on the theoretical description of the cleaning of syngas from biomass and waste gasification using catalytic methods, and on the verification of the theory through experiments. the main obstruction to using syngas from fluid gasification of organic matter is the presence of various high-boiling point hydrocarbons (i.e., tar) in the gas. the elimination of tar from the gas is a key factor in the subsequent use of the gas in other technologies for cogeneration of electrical energy and heat. the application of a natural or artificial catalyst for catalytic destruction of tar is one of the methods of secondary tar elimination from syngas. in our experiments, we used a natural catalyst (dolomite, or calcium magnesium carbonate) from horní lánov, with mechanical and catalytic properties well suited to our purposes. the advantages of natural catalysts, in contrast to artificial catalysts, include their availability, low purchase prices and higher resilience to so-called catalyst poisons. natural calcium catalysts may also capture undesired sulphur and chlorine compounds. our paper presents a theoretical description and analysis of the catalytic destruction of tar into combustible gas components, and of the impact of dolomite calcination on its efficiency. the efficiency of the technology is verified in laboratories.
the facility used for verification was a 150 kw pilot gasification unit with a laboratory catalytic filter. the efficiency of tar elimination reached 99.5 %, the tar concentration complied with the limits for use of the gas in combustion engines, and the tar content reached approximately 35 mg/m3n. the results of the measurements conducted in laboratories helped us design a pilot technology for catalytic gas cleaning.
keywords: biomass; gasification; gas cleaning; dolomite.
1. introduction
thermochemical gasification is a conversion of organic matter into a gas with a low lower heating value (co, h2, ch4, co2, n2, and h2o) at high temperatures (750–1000 °c). the partial oxidation of the gasified material (gasification using air, oxygen, steam) commonly supplies the heat for the endothermic reactions. the prevailing technology utilizes air: there are then no costs or hazards concerning oxygen production and utilization, and none of the costs and complexity of reactors for gasification in steam and pyrolysis, which require two reactors. the produced gas is suitable for the operation of boilers, engines and turbines; however, it is not suitable for transfer via gas lines, due to its low energy density (4–7 mj/m3n). the gas comprises trace amounts of higher hydrocarbons such as ethane and ethene, small particles of charcoal and ashes, tar and other substances.
tar is a complex and heterogeneous mixture of hydrocarbons with a wide range of molar weights, yet for a long time there was no exact definition [1]. therefore, several institutes cooperated to create a unified definition, the so-called tar protocol, which introduces the following delimitation: "tar includes all organic materials which have a higher boiling point than benzene (i.e., 80.1 °c)". in addition, the tar protocol presents a uniform methodology for sampling and analysis of tar, which helps increase the comparability of published results. almost every gas produced from the gasification of biomass contains at least a minimum amount of tar, and this creates serious problems for its subsequent use; due to high concentrations of tar, several biomass gasification projects were discontinued. successful elimination of tar from the produced gas requires information about the gas composition, its physical-chemical properties, the sampling conditions and a tar sample analysis.
2. gas cleaning methods
tar production from wood gasification is much higher than tar production from coal and/or peat gasification, and the tar is composed of heavier and more stable aromatic substances [2], which means that technologies developed for the elimination of tar from coal gasification may not be transferrable to the elimination of tar from biomass gasification. therefore, our research focuses on efficient elimination of tar from biomass gasification.
the methods leading to a decrease in tar concentrations in the produced gas may be classified according to various criteria. the fundamental classification distinguishes between primary and secondary measures for tar elimination. current research suggests that the use of primary methods may decrease the tar content; however, these methods are inefficient for complete tar elimination, at least for large-scale gasification systems [3–5].
2.1. secondary measures
secondary measures focus on tar elimination in subsequent filtration routes. there are various methods for the elimination of tar from gas [6], and no single method is unambiguously the best. selecting a particular method for a particular process of tar elimination is always a result of optimization and a compromise between several important factors such as efficiency, pressure drop, energy intensity, reliability, universality, investment and operating indicators, waste production, etc.
2.2. use of catalysis
since the mid-1980s, many research institutes have been interested in the use of catalysts for the modification of syngas, especially for tar elimination. the requirements for catalyst properties are [3]:
(1.) the catalyst must be highly efficient in tar elimination.
(2.) if syngas is to be produced, the catalyst must be able to reform methane.
(3.) the catalyst should produce h2 : co in a suitable ratio.
(4.) the catalyst should be resistant to deactivation, fouling and fusing.
(5.) the catalyst should regenerate easily.
(6.) the catalyst must be sturdy and resistant to scratching.
(7.) the catalyst should be cheap.
tar reduction on the surface of the catalyst occurs with steam or with co2:
cnhm + n h2o ←→ n co + (n + m/2) h2, (1)
cnhm + n co2 ←→ 2n co + (m/2) h2. (2)
in addition to dry and steam reforming, hydrogenation, hydrocracking, catalytic pyrolysis and polymerization also participate in tar elimination under specific conditions [7]. all these reactions occur with catalysts. a detailed and accurate description of the reactions is not yet available; the reactions differ for individual tar components, and depend on the content of h2, h2o and co2 in the gas, and on the temperature [7]. studies conducted in laboratory conditions proved that dry reforming reactions (2) prevail at temperatures exceeding 850 °c. this type of reaction requires high temperatures, but it is less energy intensive.
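to make the stoichiometry of reactions (1) and (2) concrete, the short sketch below evaluates the product yields for a model tar compound (naphthalene, c10h8); the function name is ours and complete conversion is assumed, for illustration only:

```python
# product yields of steam (1) and dry (2) reforming for a model tar
# molecule CnHm; illustrative only (assumes complete conversion)

def reforming_products(n: int, m: int):
    steam = {"CO": n, "H2": n + m / 2}   # CnHm + n H2O -> n CO + (n+m/2) H2
    dry = {"CO": 2 * n, "H2": m / 2}     # CnHm + n CO2 -> 2n CO + (m/2) H2
    return steam, dry

# naphthalene C10H8, a typical heavy tar component
steam, dry = reforming_products(10, 8)
print(steam)  # {'CO': 10, 'H2': 14.0}
print(dry)    # {'CO': 20, 'H2': 4.0}
```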
natural materials, such as dolomites, zeolites and limestones, are commonly used for tar elimination. various industrial metallic catalysts based on ni, mo, co, pt, ru and other elements are used as well. numerous factors influence the catalyst, and they may all worsen its catalyzing properties; some of these factors act slowly, others may destroy the catalyst in a relatively short period of time [8]. the main causes of decline in catalyst function include thermal instability of the catalyst, fouling, and catalyst poisoning.
2.3. natural catalysts
in contrast with industrial catalysts, natural catalysts do not eliminate tar as efficiently [3, 9]. another disadvantage is their high operating temperature: the minimum operating temperature is around 800 °c [3], and the optimum temperature is around 900 °c and higher [5, 10]. yet natural catalysts have numerous advantages and may be adopted for tar elimination from syngas [3, 11]:
(1.) they are cheap and readily available.
(2.) they are not prone to catalyst poisoning and thermal instability, as opposed to metallic catalysts.
(3.) they easily regenerate when fouled.
(4.) they are relatively mechanically resistant.
another indisputable advantage of calcic catalysts is their ability to eliminate sulphur and chlorine compounds; for metallic catalysts, on the other hand, these substances may be totally destructive. diverse types of natural materials, e.g., dolomite, olivine, limestone, zeolite, magnesite and others, have been tested [7–10]. dolomite (camg(co3)2) and olivine (femg(sio4)2) seem to be the most efficient. limestone and magnesite may also work; however, they are not as active as dolomite [3].
dolomite (camg(co3)2) is the most common and most widely used natural material for the catalytic elimination of tar [1]. its exact chemical composition varies depending on the site of extraction. in general, dolomite contains about 30 wt% cao, 21 wt% mgo, and 45 wt% co2 [3], as well as other admixtures, especially metal oxides (iron, aluminum), alkali metal oxides, silicon oxides, etc. dolomite is not active in the steam reforming of methane, and therefore the lower heating value of the gas does not drop significantly through the cracking of lower hydrocarbons [9]. the efficiency of tar elimination reaches up to 99 % [3, 9], depending on the operating conditions (retention time, temperature, and various other agents). the optimum temperature ranges from 800–900 °c, and the retention time ranges from 0.3–0.8 seconds. dolomite calcination is extremely important for the proper elimination of tar: if dolomite is calcined, its activity rises tenfold [3].
figure 1. atmospheric fluidized bed gasifier biofluid.
2.4. calcination
dolomite calcination is a complex process, occurring at high temperatures, that transforms the original material and comprises two distinct stages [14]. the calcination process depends on several factors: temperature, grain size, dolomite composition, heating speed, and ambient conditions (partial pressure of co2) [12, 15, 16]. the less stable mgco3 decomposes in the presence of co2 at temperatures exceeding 600 °c [15–17]. the reaction creates so-called "half-calcined" dolomite (mgo · caco3), which is stable if the temperature does not change and if the ambient partial pressure of co2 is lower than the corresponding equilibrium partial pressure of co2. if the temperature rises, calcination of caco3 occurs as well; an increase in co2 results in a higher calcination temperature of caco3. the activity of calcined dolomite depends on the size of the crystal particles and on the porosity of the stone, which is directly influenced by calcination. so-called re-calcination, the reverse reaction of calcium oxide back to carbonate, accompanied by a temperature drop and/or by a rise in the partial pressure of co2, is another disadvantage of the technology [16].
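the two stages can be summarized by the following reactions (a standard textbook formulation, written out here for clarity rather than taken from [14–17]):

```latex
% two-stage calcination of dolomite: half-calcination, then full calcination
\begin{align}
  \mathrm{CaMg(CO_3)_2} &\longrightarrow \mathrm{MgO\cdot CaCO_3} + \mathrm{CO_2}
    && (T \gtrsim 600\,^{\circ}\mathrm{C}) \\
  \mathrm{MgO\cdot CaCO_3} &\longrightarrow \mathrm{MgO\cdot CaO} + \mathrm{CO_2}
    && (\text{higher } T,\ \text{low } p_{\mathrm{CO_2}})
\end{align}
```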
3. experimental measuring in biofluid 100
the research of gas cleaning at the energy institute of the faculty of mechanical engineering in brno has focused on wet scrubbing technologies. researchers have been successful in the use of water as a scrubbing fluid, and in the use of an organic solvent (methyl ester of rapeseed oil); both technologies reach values below 50 mg/m3n of tar in the gas. the research has gradually shifted to dry catalytic cracking, which may bring an outstanding and more comprehensive solution to gas cleaning. the objective here is to assemble a pilot verification route with a combustion engine. the experiments were conducted in the atmospheric fluidized-bed gasifier biofluid 100, which has been running since 2000 [18].
figure 2. simplified layout of the gasifier connections. measured quantities: (t101–103) temperatures in the gasifier; (t104–105) temperatures under the cyclone; (t106) temperature inside the cyclone; (t107) temperature of the incoming primary air; (t108) gas temperature at jacket outlet; (f1–3) air flows; (f4) gas flow; (pstat) outlet gas pressure; (pstat1) tank pressure; (dp1) fluidized bed pressure difference.
the equipment has a stationary fluidized bed and may be operated in gasification and/or combustion mode. reactor specifications:
• power output (in produced gas) 100 kwt;
• power input (in fuel) 150 kwt;
• wood consumption max. 40 kg/h;
• air flow rate max. 50 m3n/h.
the fuel is supplied from a fuel storage tank equipped with a shovel, and it is fed into the reactor via a screw conveyor with a frequency converter. compressed air is led into the reactor under the grate (primary air); secondary and tertiary air is supplied at two height levels. wood pellets or high-quality pure wood sawdust of 20–30 % humidity are ideal fuels for the fluidized-bed gasifier. constant humidity [19], low ash content and the shape stability [20] of wood pellets are their main assets. however, in order to imitate real operations as closely as possible, we used sawdust from spruce (2–3 cm) with 30 % humidity.
table 1. chemical composition of dolomite from h. lánov [wt%]: na2o 0; k2o 0.24; mgo 17.6; cao 32.87; sio2 2.44; al2o3 1.34; fe2o3 0.31; co2 45.03; others 0.14.
figure 3. laboratory verification route.
3.1. methods of measurement on the experimental unit biofluid
gas quality measurements are usually carried out in two ways. one consists of on-line monitoring of the gas composition with simultaneous gas sampling into gastight glass sample containers; the samples are subsequently analysed using a gas chromatograph. tar sampling is carried out in line with the iea methodology [21] by capturing tar in a solution that is subsequently analysed by a gas chromatograph with a mass spectrometer. the presence of hcl, hf and nh3 in the gas is examined by trapping them in an naoh solution. the operating parameters are monitored during operation and continuously recorded by the control computer. they include, in particular, the mass flow of fuel, the temperatures at various points of the unit, the pressure difference in the fluidized bed, the gas flow and pressure, and the temperature and flow of air.
figure 4. diagram of the verification route. measured quantities and scheme description: (tir c2–5) temperatures; (pi1) inlet gas pressure; (∆p1) gas pressure difference; (a2–5) sampling points; (r2–4) regulation of electric heating.
4. verification of conditions of catalytic cracking on the stand
the verification of the ability of dolomite to function as a catalyst in tar cracking, and the verification of the operating conditions, were conducted on a stand with a 5 l/min gas flow rate. dolomite was selected as the catalyst thanks to its availability, its low purchase price and the fact that the first tar cracking occurs at a temperature of around 700 °c. the material grain size was chosen on the basis of a literature search and of calculations made during the filter design. dolomite from horní lánov was purchased for the verification (see table 1); the grain size of the material used was about 1–1.5 mm. the measured gas was heated to 800–900 °c. the gas then entered a dolomite filter with electric heating, which regulates the temperature from 800 to 1200 °c. prior to sampling, the dolomite was calcined for 3–4 hours at 950 °c, and it was continuously blown through with air. the filter design impedes continuous replacement of the catalyst; therefore a new filling of dolomite was used for every temperature,
and max. 3–4 samples of tar and 4–5 samples of gas were taken for each dolomite filling.
specifications of the verification equipment:
• diameter 3.0 cm;
• height 0.3 m;
• flow rate 5 l/min = 8.3 · 10−5 m3n/s;
• gas temperature 800–900 °c;
• superficial velocity ⟨v⟩ = vp/s = vp/(πd2/4) = 0.48 m/s, where vp is the volumetric flow rate of the gas through the filter at operating temperature [m3/s] and s is the cross-section of the filter [m2];
• retention time τs = h/⟨v⟩ = 0.61 s, where h is the height of the filter bed.
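the quoted values can be reproduced from the geometry and flow rate; the sketch below is our own check, not part of the paper's toolchain, and it assumes ideal-gas expansion of the normal-condition flow to a mean filter temperature of about 850 °c:

```python
import math

# reproduce the operating parameters of the verification filter;
# assumes ideal-gas expansion of the 0 degC normal-condition flow
# to a mean filter temperature of ~850 degC (our assumption)

d, h = 0.030, 0.30               # filter diameter and bed height [m]
q_n = 5 / 1000 / 60              # 5 l/min at normal conditions [m3n/s]
t_filter = 850 + 273.15          # mean filter temperature [K]

q_hot = q_n * t_filter / 273.15  # actual volumetric flow [m3/s]
area = math.pi * d**2 / 4        # filter cross-section [m2]
v = q_hot / area                 # superficial velocity -> ~0.48 m/s
tau = h / v                      # retention time -> ~0.61 s

# tar-removal efficiency at 941 degC from table 2 (inlet/outlet, mg/m3n)
eff = (1 - 35 / 8185) * 100      # -> ~99.6 %
print(f"v = {v:.2f} m/s, tau = {tau:.2f} s, eff = {eff:.1f} %")
```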
the results are shown in table 2 and prove that if the temperature rises above 900 °c, the amount of tar drops sufficiently and the gas may be used in engines. table 3 presents the changes in gas composition. the indicated temperature is an average of tir c2 and tir c3. samples of gas and tar were taken at the a5 sampling spot (see the diagram in fig. 4). gas from the gasification of spruce wood scobs (20 % moisture content) was used in the experiments. the temperature in the gasifier reached 800–820 °c; the excess pressure compared to atmospheric pressure was about 400 pa, the gasification ratio was e = 0.35–0.4, and 35 m3n/h of gas was produced.

table 2. reduction of tar in the verification route:
temperature in reactor     855 °c            936 °c            941 °c
                           inlet   outlet    inlet   outlet    inlet   outlet
σ btx [mg/m3n]             24425   4626      23305   72        15865   235
σ tar [mg/m3n]             11625   623       10980   2         8185    35
tar red. efficiency            94.64 %           99.98 %           99.57 %
lhv [mj/m3n]               7.450   6.931     7.723   6.267     7.482   6.152

table 3. gas composition [vol%]:
temperature in reactor     855 °c            936 °c            941 °c
                           inlet   outlet    inlet   outlet    inlet   outlet
co2                        15.28   17.66     15.13   17.13     15.35   18.56
h2                         13.98   15.44     14.11   19.55     13.95   19.14
co                         16.35   20.8      16.53   20.31     16.67   20.86
ch4                        2.99    2.73      3.05    1.58      2.93    1.05
n2                         49.53   42.53     49.41   41.35     49.48   40.26
cxhy                       1.78    0.75      1.72    0.01      1.58    0.02

5. results, evaluation and conclusions
we reached 99.5 % efficiency in tar elimination, which corresponds with data published by some international authors [3, 22]. simell and his team conducted a series of studies using model compounds and tar substitutes to test the efficiency of dolomite and other carbonate rocks at 900–1000 °c. the catalysts were calcined at 900 °c and operated at 900 °c; the tar elimination efficiency ranged from 86 to 99 %. dolomite efficiency increased with a rising ca : mg ratio and with a rising iron content in the gasified material [23, 24]. delgado et al. obtained similar results [25]. most of the results of tar elimination testing using natural catalysts published in the research literature are based on laboratory applications and tested a model gas, not a real gas from biomass gasification. our results, obtained on the gasification unit in real-operation conditions, prove that dolomite catalysts are suitable for the cleaning of gas from biomass gasification, namely for the elimination of tar from the gas. the purity of the gas reached 2 and 35 mg/m3n, which complies with the requirements on the purity of gas for use in cogeneration units (the maximum admissible tar content is commonly 50 mg/m3n and/or zero tar condensate in the gas [26]). our tar elimination method presents several advantages compared to other types of gas cleaning. natural catalysts are cheap and are not subject to so-called slow-degrading catalyst poisoning. they also eliminate sulphur and chlorine compounds. a proper equipment design allows a filter to be implemented as a barrier separator of solids. some of the disadvantages of natural catalysts include high operating temperatures and the fact that the relevant equipment must be operated together with a gas generator in order to minimize heat losses.
acknowledgements
the presented results were obtained within the framework of the netme centre plus (lo1202) project, created with financial support from the ministry of education, youth and sports of the czech republic under the "national sustainability programme i".
references
[1] btg – biomass technology group; tar & tar removal, http://www.btgworld.com/technologies/tar-removal.html#tar, (5.11.2005).
[2] bridgewater, a. v.: the technical and economic feasibility of biomass gasification for power generation. energy research group, aston university, birmingham, (1995) fuel 74:5.
[3] sutton d. et al.: review of literature on catalysts for biomass gasification, fuel processing technology 73, 155–173, (2001). doi:10.1016/s0378-3820(01)00208-9
[4] lisý, m., baláš, m., moskalík, j., štelcl, o. (2012). biomass gasification – primary methods for eliminating tar. acta polytechnica, 52(3), 66–70.
[5] baláš, m., lisý, m., štelcl, o. (2012). the effect of temperature on the gasification process. acta polytechnica, 52(4), 7–11.
[6] milne t. a., evans r. j., abatzoglou n.: biomass gasifier "tars": their nature, formation and conversion, nrel/tp-570-25357, usa, (1998).
[7] skoblia, s.: modification of gas composition for biomass gasification, phd thesis, všcht praha, (august 2004).
[8] simell, p.: catalytic hot gas cleaning of gasification gas, vtt 1997, phd thesis, (1997).
[9] gusta e., dalai a. k., uddin m. a., sasaoka e.: catalytic decomposition of biomass tars with dolomite, revised manuscript, energy & fuels, 27.3.2009. http://pubs.acs.org
[10] myrén c., hörnel ch., björnbom e., sjöström k.: catalytic tar decomposition of biomass pyrolytic gas with a combination of dolomite and silica, biomass and bioenergy 23, 217–227, (2003).
[11] ordorica j. m. a., cabanillas a.: methods to reduce tars production during the gasification process, http://www.mam.gov.tr/bigpower/workshop2/16.ppt 24.3.2009.
[12] mcintosh, r. m., sharp, j. h., wilburn, f. w.: the thermal decomposition of dolomite, thermochimica acta, 165, pp. 281–296, (1990). doi:10.1016/0040-6031(90)80228-q
[13] corella j., narváez i., orío a.: criteria for selection of dolomites and catalysts for tar elimination from biomass gasification gas; kinetic constants, new catalysts for clean environment, the 2nd symposium of the vtt research programme on chemical reaction mechanisms, espoo, (january 1996).
[14] hehl, m., helmrich, h., schugerl, k.: dolomite decomposition in a high temperature fluidized bed reactor, chem. technol. biotechnol. 1983, 33a, 730–737.
[15] hartman m., trnka o., veselý v., svoboda k.: predicting the rate of thermal decomposition of dolomite, chemical engineering science, vol. 51, no. 23, pp. 5229–5232, (1996). doi:10.1016/s0009-2509(96)00363-6
[16] wiedemann h. g., bayer, g.: note on the thermal decomposition of dolomite, thermochimica acta, 121, 479–485, (1987). doi:10.1016/0040-6031(87)80195-8
[17] boyton r. s.: chemistry and technology of lime and limestone, pp. 159–170, wiley, new york, usa, (1980).
[18] lisý, m., baláš, m., moskalík, j., & pospíšil, j. (2009). research into biomass and waste gasification in atmospheric fluidized bed.
paper presented at the proceedings of the 3rd wseas international conference on energy planning, energy saving, environmental education, epese '09, renewable energy sources, res '09, waste management, wwai '09, 363–368.
[19] križan, p., matúš, m., beniak, j., kováčová, m.: "stabilization time as an important parameter after densification of solid biofuels." acta polytechnica 54(1):35–41, (2014). doi:10.14311/ap.2014.54.0052
[20] menind, a., križan, p., šooš, l., matúš, m., kers, j.: optimal conditions for valuation of wood waste by briquetting. paper presented at the proceedings of the international conference of daaam baltic "industrial engineering", 187–192, (2012).
[21] van paasen, s. v. b. et al.: guideline for sampling and analysis of tar and particles in biomass producer gases. final report documenting the guideline, r&d work and dissemination 2002, ecn-c-02-090.
[22] chłond, r., najser, j.: analysis of gas purification technology from biomass gasification based on work of ceramic filter. [analiza technologii oczyszczania gazu z procesu zgazowania biomasy na przykładzie pracy filtru ceramicznego] rynek energii, 88(3), 107–112.
[23] tuomi, s., kurkela, e., simell, p., reinikainen, m. (2014). behaviour of tars on the filter in high temperature filtration of biomass-based gasification gas. fuel, 139, 220–231. doi:10.1016/j.fuel.2014.08.051
[24] simell p. a., hakala n. a. k., haario h. e., krause, a. o. i.: catalytic decomposition of gasification gas tar with benzene as the model compound, industrial & engineering chemistry research 1997 36 (1), 42–51. doi:10.1021/ie960323x
[25] delgado j., aznar m. p., corella j.: biomass gasification with steam in fluidized bed: effectiveness of cao, mgo, and cao–mgo for hot raw gas cleaning, industrial & engineering chemistry research 1997 36 (5), 1535–1543. doi:10.1021/ie960273w
[26] najser, j., ochodek, t., chłond, r.: functioning of installation for a biomass gasification and economic aspects of electricity generation. [charakter pracy instalacji służącej do zgazowania biomasy a aspekty ekonomiczne procesu generacji energii elektrycznej] rynek energii, 85(6), 68–74, (2009).

acta polytechnica 53(supplement):612–616, 2013, doi:10.14311/ap.2013.53.0612
galactic and extragalactic supernova remnants as sites of particle acceleration
manami sasaki∗
institute for astronomy and astrophysics, university of tübingen, germany
∗ corresponding author: sasaki@astro.uni-tuebingen.de
abstract.
supernova remnants, owing to their strong shock waves, are likely sources of galactic cosmic rays. studies of supernova remnants in x-rays and gamma rays provide us with new insights into the acceleration of particles to high energies. this paper reviews the basic physics of supernova remnant shocks and the associated particle acceleration and radiation processes. in addition, the study of supernova remnant populations in nearby galaxies and the implications for the galactic cosmic ray distribution are discussed.
keywords: supernova remnants, shock waves, cosmic rays.
1. introduction
supernova remnants (snr) are among the most beautiful astronomical objects, presenting various kinds of morphologies. at the same time they are important astrophysical sites, being responsible for the chemical enrichment, the energy budget, and the dynamics of the interstellar medium (ism), and thus they drive the chemical and dynamical evolution of galaxies. in the optical, snrs were identified as nebulae formed at the sites of historical supernovae (sne). most of the snrs in our galaxy were discovered as non-thermal, extended radio sources with a flux spectrum described by a power-law s_ν ∝ ν^α with α ≈ −0.5. this emission was explained as synchrotron radiation emitted by a non-thermal, power-law population of relativistic electrons with n(e) = k e^−s electrons cm^−3 erg^−1. these electrons produce a power-law photon distribution with α = (1 − s)/2, thus s = 2 for α = −0.5.
2. snrs as a cosmic particle accelerator
electrons that emit the radio synchrotron emission observed in radio snrs typically have energies in the gev range. particles of such energies and higher are known as cosmic rays (crs). since the first detection of crs on the earth, many observations have been performed to study their nature; they have shown that the all-particle spectrum of crs can be described by a power-law ∝ e^−2.7 up to ∼ 3 × 10^15 ev (the knee) and becomes steeper at higher energies. in general, crs up to the knee are believed to be of galactic origin, while extragalactic sources are discussed for crs at higher energies. fermi [1] suggested back in 1949 that particles can be accelerated in a diffusive process through magnetic mirroring of charged particles at moving magnetized interstellar clouds. likely sites of particle acceleration in interstellar space are the strong shock waves of snrs, which are the remains of supernova explosions; they heat the ambient ism and distribute heavy elements that were processed inside the progenitor star throughout the galaxy. the blast waves of shell-type snrs have kinetic energy of the order of 10^51 erg. the rate at which sne occur in our galaxy is estimated to be about 3 sne per century, and each of these events can convert ∼ 10 % of its kinetic energy into cosmic-ray energy. snrs are thus believed to be the primary sources of cosmic rays below 3 × 10^15 ev (e.g., [2]).
2.1. evolution of snrs
after the initial explosion, snrs go through the following evolutionary stages:
(1.) the ejecta-dominated, free expansion phase in the first hundreds of years, in which the mass of the supernova ejecta, m_ej, dominates over the mass of the swept-up ism.
(2.) the sedov–taylor phase or adiabatic phase, in which m_sw > m_ej. radiative losses are still negligible. in this phase, which typically lasts a few 1000–10,000 yr, the snr expands adiabatically.
(3.) the subsequent pressure-driven or snow-plough phase, in which radiative cooling becomes dynamically important; the evolution of the shock radius is then best described using momentum conservation.
(4.) finally, in the merging phase, the shock velocity and the temperature behind the shock become comparable to the turbulent velocity and temperature of the ism, respectively. the snr becomes fainter and fainter and merges with the ism.
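for reference, the adiabatic stage (2.) admits the well-known self-similar solution (a standard result, quoted here for completeness rather than derived in this paper): for explosion energy e and ambient density ρ0, the shock radius grows as

```latex
% Sedov-Taylor similarity solution for the adiabatic phase
R_s(t) \simeq 1.15 \left( \frac{E\,t^{2}}{\rho_{0}} \right)^{1/5},
\qquad
v_s(t) = \frac{\mathrm{d}R_s}{\mathrm{d}t} = \frac{2}{5}\,\frac{R_s}{t} \propto t^{-3/5}.
```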
figure 1 (left) shows a three-colour x-ray composite image of tycho's snr, which is 440 years old and is believed to be in the transition between the free expansion phase and the sedov phase. the outer blast wave is well seen in hard x-rays (blue/violet), while the interior red-greenish, softer x-ray emission is due to hot ejecta.
figure 1. left: tycho's snr observed with the chandra x-ray observatory. the image is a composite of three x-ray images at the following energies: 0.95 ÷ 1.26 kev (red), 1.63 ÷ 2.26 kev (green), 4.1 ÷ 6.1 kev (blue). (nasa/cxc/rutgers/j. warren & j. hughes et al.) middle: composite image of the remnant of sn 1006 consisting of x-ray data from chandra in blue, optical data from the university of michigan's 0.9 m curtis schmidt telescope at the cerro tololo inter-american observatory (yellow) and the digitized sky survey (orange and light blue), and radio data from the very large array and green bank telescope in red. (x-ray: nasa/cxc/rutgers/g. cassam-chenaï, j. hughes et al.; radio: nrao/aui/nsf/gbt/vla/dyer, maddalena, & cornwell; optical: middlebury college/f. winkler, noao/aura/nsf/ctio schmidt & dss.) right: chandra spectra of the outer rim (1) and two interior regions (2, 3) of the northeastern limb of sn 1006 by long et al. [7]. while the spectrum of the outermost region is a simple power-law, the spectra of the inner regions have significant thermal components.
2.2. shock waves in snrs
the strong expansion caused by supernovae forms waves with high-amplitude disturbances in the surrounding medium, resulting in a discontinuous flow. discontinuities of velocity v, pressure p, and density ρ in non-stationary flows are produced by shock waves. in the rest frame of the shock front, the gas moves perpendicular to the shock from side 1 (upstream) to side 2 (downstream); in reality the shock front propagates into region 1. at the shock front, the conservation laws of mass, energy, and momentum flow must be fulfilled. for a monoatomic gas, the ratio of the specific heats is γ = 5/3. for a mach number m with m^2 ≫ 1, one obtains
r = ρ2/ρ1 = u1/u2 = (γ + 1)/(γ − 1) = 4 and kt2 = (3/16) µ2 m_p u1^2,
where µ is the mean mass per particle (µ2 ≈ 0.6 for a fully ionized gas of cosmic composition downstream). upstream, µ1 may be 1.4 in the case of a (mostly diatomic) neutral gas of cosmic composition. however, these relations are correct only if the energy loss due to radiation or other causes is negligible. if efficient acceleration of particles in shocks occurs, the energy loss due to escaping particles can become significant. in addition, a relativistic particle population will lower the mean adiabatic index from 5/3 towards its fully relativistic value of 4/3.
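as a quick numerical illustration of the jump condition for kt2 (with example shock speeds of our own choosing, not values from this paper):

```python
# post-shock temperature behind a strong adiabatic shock,
# kT2 = (3/16) * mu2 * m_p * u1**2 ; example values for illustration

M_P = 1.6726e-27   # proton mass [kg]
EV = 1.602e-19     # J per eV
MU2 = 0.6          # mean mass per particle, ionized gas of cosmic composition

def kT2_keV(u1_km_s: float) -> float:
    """post-shock temperature in keV (full thermal equilibration assumed)."""
    u1 = u1_km_s * 1e3
    return (3 / 16) * MU2 * M_P * u1**2 / EV / 1e3

# a young SNR shock at 1000 km/s gives kT2 of roughly 1.2 keV,
# the order of magnitude seen in X-ray spectra of such remnants
print(f"{kT2_keV(1000):.2f} keV")
```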
the primary mechanism discussed for producing energetic particles in snr shocks is diffusive shock acceleration (dsa, [3–5], etc.). in a collisionless shock wave like that in an snr, the velocities of particles are randomized by scattering at inhomogeneities in the magnetic field. a considerable fraction of the particles may scatter back upstream, hence particles can cross the shock front many times and gain energy. owing to these processes, the energy distribution of the particles develops a non-thermal power-law tail in addition to a maxwellian distribution.
particles accelerated in snr shocks will also have a dynamical effect on the shock. the incoming gas (in the shock frame) will be decelerated as accelerated particles diffuse ahead and scatter at inhomogeneities in the magnetic field in the upstream medium. therefore, the upstream flow is gradually slowed by the accelerated particles, while the thermal shock in which the gas is heated still persists. in total, the compression ratio becomes larger than in the dsa-free case. particles with higher energies get farther ahead of the shock front, where the compression ratio is higher. this results in a harder spectrum, and a concave upward curvature of the particle spectrum is expected. in addition, accelerated particles that get ahead of the shock can themselves excite magnetohydrodynamic waves [6]. the interstellar magnetic field can thus be amplified at the shocks of young snrs through these self-generated magnetohydrodynamic waves, which can again scatter particles.
2.3. observations of snrs
reynolds & chevalier [8] first proposed that dsa might cause x-ray synchrotron emission in sn 1006, in which predominantly non-thermal emission was detected (fig. 1, middle and right). in addition, two synchrotron-dominated shell-type x-ray snrs were discovered in the 1990s: rx j1713.7−3946 (g347.3−0.5, [9, 10]) and vela jr. (g266.2−1.2, [11, 12]). synchrotron-emitting thin filaments were also found in the remnants of the historical sne rcw 86 (sn 185), tycho (sn 1572), kepler (sn 1604), and cas a (sn ∼1667).
figure 2. chandra image of snr rx j1713.7–3946 with h.e.s.s. contours [13].
2.3.1. magnetic field in the non-thermal filaments
as the synchrotron flux depends on the magnetic field strength, the non-thermal filaments observed in x-rays can be used to derive the magnetic flux density b. the very thin synchrotron filaments found in x-rays parallel to the remnant edge indicate a fast drop of the synchrotron emissivity behind the shock, far too fast to be explained by adiabatic expansion of the electrons and the magnetic field. therefore, either the x-ray emitting electrons must disappear through radiative losses, or the magnetic field must decay through some damping mechanism. if electron losses are the cause, the filament width corresponds to the distance that electrons convect in the post-shock flow within the synchrotron loss time. typical filament thicknesses of ∼ 0.01 pc imply immediate post-shock magnetic field strengths between 50 and 200 µg for the historical snrs. non-thermal x-ray hot spots were found in snr rx j1713.7–3946, which brightened and decayed on a one-year timescale (fig. 2, [13]). the rapid variability of the x-ray hot spots shows that the acceleration of the ultrarelativistic synchrotron-emitting electrons is associated with an amplification of the magnetic field by a factor of more than 100.
2.3.2. supernova remnants interacting with molecular clouds
another scenario in which the acceleration of particles results in traceable emission at or around snrs is the acceleration of heavier particles, which then interact with the ambient dense gas and produce pions through inelastic scattering on thermal protons.
the neutral pion π0 decays to gamma rays, producing a spectrum that peaks at energy m_π/2 ≈ 68 mev and continues to higher energies with the same shape as the primary ion spectrum. snrs located in an inhomogeneous ism and interacting with molecular clouds, like the snrs ic 443, w28 and w51c, are likely sources of π0-decay emission. at the positions of these snrs, co clouds or oh masers have been found, which indicate that the snr shock wave is interacting with a high-density medium (fig. 3).
figure 3. left: fermi lat 2 – 10 gev count maps with vla radio contours (ellipse: shocked co clump, crosses: oh masers, [14]). right: radio and γ-ray spectra of snr w44 with models [15].
3. studies of extragalactic snrs
supernova remnants can best be studied in soft x-rays, since they mainly consist of very hot plasma (10^6 ÷ 10^7 k) and radiate copious thermal line and continuum emission. however, due to absorption by interstellar matter, it is often difficult to study these soft x-ray sources, with predominant emission below 10 kev, in our own galaxy.
3.1. snr 1e 0102.2–7219 in the small magellanic cloud
snr 1e 0102.2–7219 is the brightest x-ray snr in the small magellanic cloud (smc); it shows strong line emission of highly ionized o, ne, and mg. optical filaments have also been detected, with radial velocities of −2500 to +4000 km s^−1 w.r.t. the smc [16]. the chandra grating spectrum indicates velocities of ±1000 km s^−1 [17]. hughes et al. [18] studied the chandra data of snr 1e 0102.2–7219 and concluded that the optical filaments and the extent measured from the chandra images indicate a higher expansion velocity than would correspond to the post-shock electron temperature derived from the x-ray spectra. even taking into account that the electrons and heavier particles in the shock have not reached thermal equilibrium, which corresponds to non-equilibrium ionization, is not sufficient to explain the observed inconsistency. therefore, the authors suggest that cosmic rays must have carried a significant fraction of the snr shock energy away. theoretical calculations have shown that non-linear effects in the particle acceleration mechanisms can yield a mean post-shock temperature of ∼ 1 kev for snr 1e 0102.2–7219 [19, 20], consistent with the observed x-ray temperature, e.g., [18, 21].
3.2. snrs in nearby galaxies
as supernova remnants are believed to be the primary sources of galactic cosmic rays, the distribution of snrs is essential for understanding the injection and propagation of galactic cosmic rays. however, the galactic position of the earth is not ideal for studying the source distribution in our galaxy; to understand the distribution of snrs in a spiral galaxy, it is necessary to study other nearby spiral galaxies. we have obtained a new complete sample of x-ray snrs in m 31 based on the xmm-newton large programme survey data [22]. in fig. 4 we compare the radial number distribution of x-ray snrs in m 31 to that of galactic snrs and of x-ray snrs in m 33 [23]. the snr distributions in m 31 and m 33 are comparable but slightly flatter than in the milky way. the distribution of snrs in m 31 is correlated with the distribution of gas in m 31, which shows ring-like structures with the most prominent ring at a radius of ∼ 10 kpc. the flatter snr distribution in m 33 is consistent with m 33 being a typical flocculent spiral galaxy with discontinuous spiral arms and prominent on-going star formation regions in these arms.
4. summary
supernova remnants are the aftermath of stellar explosions, which inject large amounts of energy into the ism, carving out new structures and transferring kinetic energy to the ism. in addition, particles can be accelerated in snr shock waves through diffusive shock acceleration. recent gev to tev observations of snrs have confirmed the existence of ultrarelativistic particles in snrs. the primary agents of the acceleration process are the magnetic fields, which can also be modified and amplified by the relativistic particles. measurements of x-ray synchrotron emission in non-thermal snrs allow us to derive the magnetic field and its configuration around the shock. the relativistic particles in the snr shock also modify the shock itself by carrying kinetic energy away from the snr. in addition, snrs interacting with dense molecular clouds are likely to produce γ-rays via π0 production with subsequent decay. while observations of galactic snrs emitting non-thermal x-ray emission or γ-rays help us to improve our understanding of cr acceleration in snr shocks, observations of extragalactic snrs allow us to perform statistical studies of a representative sample of the sources within a galaxy. the spatial distribution of snrs in a galaxy will shed light on the propagation of cosmic rays in the galaxy.
figure 4. surface density of snrs and candidates in m 31 [22] and m 33 [23] plotted over the radius normalised to half of the apparent major isophotal diameter d25/2. the radial distance in kpc is given along the upper x-axis. dotted lines show the fitted model function f(x) = c x^α exp(−βx), dashed lines show the model function for the milky way [24] normalised to the maximum of m 31 or m 33.
references
[1] fermi e., 1949, phys. rev. 75, 1169
[2] hillas a. m., 2005, j. phys. g: nucl. part. phys. 31, r95
[3] drury l. o'c., 1983, rep. progr. phys. 46, 973
[4] blandford r. d., eichler d., 1987, phys. rep. 154, 1
[5] jones f. c., ellison d. c., 1991, space sci. rev. 58, 259
[6] bell a. r., 1978, mnras 182, 147
[7] long k. s., et al., 2003, ap. j. 586, 1162
[8] reynolds s. p., chevalier r. a., 1981, ap. j. 245, 912
[9] koyama k., et al., 1997, publ. astron. soc. jpn. 49, l7
[10] slane p., et al., 1999, ap. j. 525, 357
[11] aschenbach b., 1998, nature 396, 141
[12] slane p. o., et al., 2001, ap. j. 548, 814
[13] uchiyama y., et al., 2007, nature 449, 576
[14] uchiyama y. on behalf of the fermi lat collaboration, 2011, arxiv:1104.1197
[15] thompson d. j., baldini l., uchiyama y., 2012, arxiv:1201.0988
[16] tuohy i. r., dopita m. a., 1983, ap. j. 268, l11
[17] flanagan k. a., et al., 2004, ap. j. 605, 230
[18] hughes j. p., rakowski c. e., decourchelle a., 2000, ap. j. 543, l61
[19] decourchelle a., ellison d. c., ballet j., 2000, ap. j. 543, l57
[20] ellison d. c., 2000, aip conf. proc. 528, 386
[21] sasaki m., et al., 2006, ap. j. 642, 260
[22] sasaki m., et al., 2012, arxiv:1206.4789
[23] long k. s., et al., 2010, ap. j. s. 187, 495
[24] case g. l., bhattacharya d., 1998, ap. j. 504, 761
discussion
carlotta pittori's comment — i have a comment on the snr w44. you just quoted γ-ray fermi data. one of the most important recent agile results is about this snr (giuliani et al., 2011, apjl, 742, 30). agile data extend to the lowest part of the γ-ray spectral energy distribution, around 100 mev, and allow us to discriminate between leptonic and hadronic models, providing the first clear evidence of π0-decay and of proton acceleration.
acta polytechnica 53(2):228–232, 2013
estimation of amount of scattered neutrons at devices pfz and git-12 by mcnp simulations
ondřej šíla∗, pavel kubeš, josef kravárik, karel řezáč, daniel klír, jakub cikhardt
department of physics, faculty of electrical engineering, czech technical university in prague, czech republic
∗ corresponding author: silaondr@fel.cvut.cz
abstract. our work is dedicated to the pinch effect occurring during a current discharge in deuterium plasma, and our results are connected with two devices – the plasma focus pfz, situated at the faculty of electrical engineering, ctu, prague, and the z-pinch git-12, situated at the institute of high current electronics, tomsk. during the fusion reactions that proceed in the plasma during the discharge, neutrons are produced. we use neutrons as an instrument for plasma diagnostics. despite the advantage that neutrons do not interact with the electric and magnetic fields inside the device, they are inevitably scattered by the materials placed between their source and the probe, and the information they carry about the plasma is distorted. for estimating the rate of neutron scattering we use the mcnp code.
keywords: plasma focus, pinch effect, d−d reaction, mcnp.
1. introduction
one of the most remarkable phenomena of plasma physics is the pinch effect, which occurs if a sufficiently high current passes through a plasma. such a high current generates a magnetic force that compresses the current layer radially, which leads to a great increase of temperature in the pinched region. if the plasma is made of a suitable gas like deuterium or tritium, we can observe fusion reactions during the pinch effect. a plasma focus is a device composed mainly of two cylindrical electrodes placed in a vessel filled with an isotope of hydrogen (mostly deuterium). a current layer is formed between these electrodes (in most cases by a discharge of capacitors) and moves to the front of the inner electrode, where it is compressed and the pinch effect proceeds. in some cases, another cylindrical electrode is placed in front of the inner electrode in order to support the pinch effect – this experimental setup appears, for instance, at the pfz device [1]. the device git-12 is situated at the institute of high current electronics, tomsk; its operation is based on the gas-puff principle. both devices have a vessel filled with deuterium, so d−d reactions proceed during the discharge. detection of d−d fusion neutrons is performed using scintillation probes with the scintillator bicron bc-408 [5]. for estimating the neutron energy we use the time-of-flight method. because neutrons from the d−d fusion reaction have a velocity of approximately 0.072c, we use the simple non-relativistic equation for estimating the neutron energy en from its velocity vn,
en = (1/2) m vn^2, (1)
where m is the neutron mass.
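the time-of-flight relation is easy to invert; the sketch below is our own illustration (the flight path matches the 10.12 m git-12 distance quoted later, while the timing value is an example):

```python
# time-of-flight estimate of neutron energy, E = (1/2) m v^2;
# illustrative only - the timing below is an example value

M_N = 1.6749e-27   # neutron mass [kg]
MEV = 1.602e-13    # J per MeV
C = 2.998e8        # speed of light [m/s]

def tof_energy_mev(path_m: float, flight_time_s: float) -> float:
    v = path_m / flight_time_s
    return 0.5 * M_N * v**2 / MEV

# a 2.45 MeV D-D neutron moves at ~0.072c, so it needs ~468 ns
# to cover a 10.12 m flight path
v_dd = (2 * 2.45 * MEV / M_N) ** 0.5
print(f"v/c = {v_dd / C:.3f}, t = {10.12 / v_dd * 1e9:.0f} ns")
print(f"E = {tof_energy_mev(10.12, 468e-9):.2f} MeV")
```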
2. mcnp results
mcnp (an abbreviation of "monte carlo n-particle") is used for calculating the energies and positions of a variety of particles (for instance electrons, photons, neutrons and positrons) during and after their transport from place a to place b, as they interact with specific materials of a given shape. the main difficulty in defining the parameters for a simulation in mcnp is to define the geometry of the experimental setup accurately enough to get as realistic a model as possible. the user of this program creates an input file in which the considered geometry, the materials, the particle source, the type of results and the number of iterations are defined [3].
2.1. parameters of our simulations
we created mcnp input files for the devices pfz and git-12. in both cases, we tried to define the geometry of the problem sufficiently similar to reality. a two-dimensional view of the considered models is pictured in figs. 1 and 2. in our simulations, we used a mono-energetic point source with the mean energy of a d−d fusion neutron, 2.45 mev. we placed this source at the place which we considered the most probable site of fusion neutron production (we are able to determine this place with an inaccuracy of circa 5 cm [2]). endf cross-section data libraries were used to simulate the neutron transport. the type of result of the simulation is defined by a so-called tally card. we chose the tally f4 card, so our result was the average energy of neutrons in a cell which we defined as cylindrical, with the same dimensions as our scintillator; we located this cell at the same position and distance as the scintillation probe in our experiment. it is quite important to choose a sufficiently large number of iterations of the simulation to get accurate enough results; for all the cases mentioned here, 5 × 10^7 iterations gave us acceptably accurate results.
figure 1. parts of the pfz device which were implemented in the mcnp input file.
figure 2. parts of the git-12 device which we considered in the mcnp simulation.
when creating the geometry of the problem, there is a question whether the floor and walls should be included in the geometry or not. for each of these two devices we have a different answer. in the case of pfz it is not necessary to take the walls and floor into account, because neutrons reflected from them do not manage to reach the probe in the time range of the observed neutron pulse; in the experimental setup of pfz the probes are placed not more than 5 meters from the source of fusion neutrons. on the other hand, git-12 is a device in which the probes are placed at much greater distances from the place of fusion neutron production, and it is very likely that these reflected neutrons appear in the observed neutron signal, so it is necessary to include the floor and walls in the geometry [4]. we performed our simulations in version mcnp5 of this program.
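the choice of 5 × 10^7 histories can be motivated by the usual 1/√n scaling of monte carlo statistical errors; the sketch below is a generic illustration of that scaling (the reference error at 10^6 histories is an assumed value, not an mcnp output):

```python
# generic Monte Carlo scaling: relative statistical error ~ 1/sqrt(N);
# illustrates why tens of millions of histories give ~percent-level errors

def relative_error(n_histories: float, err_at_1e6: float = 0.01) -> float:
    """scale a reference error assumed at 1e6 histories (assumption)."""
    return err_at_1e6 * (1e6 / n_histories) ** 0.5

for n in (1e6, 1e7, 5e7):
    print(f"N = {n:.0e}: ~{relative_error(n) * 100:.2f} % relative error")
```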
figure 3. 3-d view of the parts of the pfz device which were implemented in the mcnp input file.
figure 4. deployment of the radial detectors d2 and d3.
2.2. mcnp results for the pfz device
2.2.1. neutron energy spectrum in place of three scintillation probes in the pfz experimental setup
we calculated the neutron energy spectrum at the three places where we put probes in our experimental setup. the deployment of the detectors is illustrated in fig. 4. in fig. 5, we present the results of our simulation. we can see that the greatest number of scattered neutrons is observed in probe d1. we attribute this result to the fact that neutrons flying in the axial direction have to overcome a substantially thicker layer of metal than neutrons flying in the radial direction, so they scatter more.
figure 5. percentage of neutrons that remained unscattered during the flight to probes d1, d2 and d3.
2.2.2. replacement of steel in the pfz device by another metal
as can be seen in fig. 6, removing the steel material from our experimental setup would have a big influence on reducing the amount of scattered neutrons. we performed simulations in which the steel was replaced by other candidate metals. firstly, we tried to replace the steel by aluminum. with respect to the 2 % relative error of our simulation, we cannot say that a replacement of steel by aluminum would have any remarkable effect on reducing the amount of scattered neutrons. we came to the same conclusion for the case when the steel was replaced by copper.
figure 6. red line – complete experimental setup, blue line – experimental setup without steel.
figure 7. red line – experimental setup with steel, blue line – steel replaced by aluminum.
figure 8. red line – experimental setup with steel, blue line – steel replaced by copper.
2.2.3. correspondence between simulation and experiment
naturally, there is a question whether our computer simulations correspond with the reality of the experimental setup. the neutron energy spectrum in the results of our simulations may not correspond with the measured neutron energy spectrum, because the physics of neutron generation may be much more complicated in reality than we supposed in our simulation – specifically, because of the choice of a mono-energetic point source in our input file.¹ however, we can make an experiment where we change the experimental setup in a way that produces a change in the measured neutron spectrum, and also make a computer simulation which implements the same change in the parameters of the input file. we realized this idea by putting two probes next to each other in the experiment, and placing a thin plexiglass plate in front of one of them. in the shielded probe a we measured a smaller area x under the neutron signal than in the unshielded probe b. in the computer simulation, we can find the equivalents ya and yb of the areas xa and xb. we estimated the values of xa/xb and ya/yb as
xa/xb = 1.7 ± 0.3 and ya/yb = 1.3 ± 0.1.
this result is in favor of the hypothesis that our simulations match reality quite well.
figure 9. experimental setup with shielded and unshielded probe.
¹nevertheless, for our purposes, when we compare the scattering of neutrons flying in a specific direction, it is not necessary to consider the non-isotropic distribution of neutrons that is surely present in our experiment.
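whether 1.7 ± 0.3 and 1.3 ± 0.1 are compatible can be checked with a standard significance estimate (our own quick check, assuming independent and roughly gaussian uncertainties):

```python
# crude consistency check of the measured and simulated area ratios,
# assuming independent, approximately Gaussian uncertainties

def sigma_distance(a, da, b, db):
    """difference of two values in units of the combined uncertainty."""
    return abs(a - b) / (da**2 + db**2) ** 0.5

# 1.7 +/- 0.3 (experiment) vs 1.3 +/- 0.1 (simulation):
# they differ by ~1.3 sigma, i.e. they are statistically compatible
print(f"{sigma_distance(1.7, 0.3, 1.3, 0.1):.1f} sigma")
```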
figure 10. area under the neutron signal of the unshielded probe.
figure 11. area under the neutron signal of the shielded probe.
2.3. mcnp results for the git-12 device
we simulated the neutron energy spectrum at the locations of two scintillation probes, one placed axially 10.12 m above the neutron source (d1), and one placed radially at the same distance (d2). the relative inaccuracy of our results was 0.2 %.
figure 12. neutron energy spectrum in place of the two scintillation probes.
probe d2 is placed near large boxes with marx generators, which are filled with oil. the oil may have a greater influence on the scattering of neutrons entering probe d2, so we removed it from our simulation and observed how the neutron energy spectra changed. if we remove the oil in the boxes, we get an approximately 1 % greater amount of unscattered neutrons in d2. this leads us to conclude that there is approximately the same amount of scattered neutrons flying in the axial and radial directions if we remove the oil from our experimental setup.
figure 13. position of probe d2, which is situated between the boxes with marx generators (green blocks).
figure 14. neutron energy spectrum in place of probe d2; we consider the case of removing the scattering medium from our experimental setup.
3. conclusion
3.1. summary of the results at the pfz device
the greatest amount of scattered neutrons fly in the axial direction; we see the reason for this in the fact that they have to overcome a substantially thicker layer of metal. the steel used in our device seems to be a convenient material for our purposes – in our simulations, other metals would have the same or a greater influence on the scattering of neutrons. comparing the simulation with the experiment gave us feedback which confirmed our hypothesis that our simulations correspond with reality.
3.2. conclusions for the git-12 device
there are 1 % fewer scattered neutrons flying to the axial probe than to the radial probe. our results showed that this difference may be caused by the boxes with oil which are placed near the radial probe.
acknowledgements
this work was supported by the research programs msmt no. la08024, me09087 and gacr p205/12/0454, iaea rc 14817 and ctu sgs 10/266/ohk3/3t/13.
references
[1] d. klir. the study of a fibre z-pinch. ctu fee, prague, 2005. doctoral thesis.
[2] d. klir, p. kubes. neutron emission generated during wire array z-pinch implosion onto deuterated fiber. physics of plasmas 15(1–13), 2008.
[3] m. kralik, j. krasa. application of a bonner sphere spectrometer for determination of the energy spectra of neutrons generated by ≈ 1 mj plasma focus. review of scientific instruments 81(1–5), 2010.
[4] j. krasa, m. kralik. anisotropy of the emission of d−d fusion neutrons caused by the plasma-focus vessel. plasma physics and controlled fusion 50(1–10), 2008.
[5] o. sila. energy spectrum of neutrons from d(d,n)3he reactions in plasma focus discharge. ctu fee, prague, 2011.

acta polytechnica vol. 41 no. 3/2001
evolved representation and computational creativity
ashraf fouad hafez ismail
abstract: advances in science and technology have influenced designing activity in architecture throughout its history. observing the fundamental changes to architectural designing due to the substantial influence of the advent of the computing era, we now witness our design environment gradually changing from conventional pencil and paper to digital multi-media. although designing is considered to be a unique human activity, there has always been a great dependency on design aid tools. one of the greatest aids to architectural design, amongst the many conventional and widely accepted computational tools, is the computer-aided object modeling and rendering tool, commonly known as a cad package. but even though conventional modeling tools have provided designers with fast and precise object handling capabilities that were not available in the pencil-and-paper age, they normally show weaknesses and limitations in covering the whole design process.
in any kind of design activity, the design worked on has to be represented in some way. for a human designer, designs are for example represented using models, drawings, or verbal descriptions. if a computer is used for design work, designs are usually represented by groups of pixels (paintbrush programs), lines and shapes (general-purpose cad programs) or higher-level objects like 'walls' and 'rooms' (purpose-specific cad programs). a human designer usually has a large number of representations available, and can use the representation most suitable for what he or she is working on. humans can also introduce new representations and thereby represent objects that are not part of the world they experience with their sensory organs, for example vector representations of four- and five-dimensional objects. in design computing, on the other hand, the representation or representations used have to be explicitly defined. many different representations have been suggested, often optimized for specific design domains or design methods, but each individual computational design system has only one or very few different representations available.
whatever the choice of the representation, it is likely to influence the outcome of the design process. in any representation, some designs may be more difficult to represent than others, and some designs may not be representable at all. the same applies if the design process is implemented in a computer program. if a design cannot be represented with a given representation, it cannot be the outcome of a design process using this representation. as is the case for human designers, it is also possible that the representation influences a computational design process such that it is easier for the program to find some designs than others. depending on the design process used, this might make those designs a more likely outcome of the design process. this is for example the case with stochastic optimization processes, like evolutionary systems and simulated annealing. in these cases, the representation is likely to introduce a bias into the design process.
the selection of the representation is therefore of high importance in the development of a computational design system. obviously, while choosing the representation the programmer has to ensure that all, or as many as possible, potentially 'interesting' designs can be represented. but it is also generally desirable to minimize the bias introduced by the representation. in contrast to the user-provided design criteria, the bias caused by the representation influences the outcome of the design process in an implicit way which is not obvious to the user, and is difficult to predict and control. the idea developed in this research is that it is possible to turn the bias caused by the representation into a virtue, by deliberately choosing or modifying the representation to influence the design process in a certain desired way. the resulting 'focusing' of the search process is connected to the idea of 'expansion of search spaces', a notion used in some definitions of computational creativity. both 'focusing' and 'expansion of search space' will be explored in this research.
keywords: computational creativity, evolved representation, design processes, evolutionary algorithms, search space.
1 evolved representations
as described in the abstract, the choice of a representation will have an influence on the result of a design process using this representation. any particular representation might make some designs impossible to generate, and some designs less likely to be produced than others. the first of these effects is often used to restrict the size of the search space; the second, however, is generally avoided, because it is implicit and difficult to predict and control. in this research, it will be shown that it is possible to create representations that bias a search process in a predictable and controllable way. this means that the representation introduces a user-controllable 'focus' into the design process.
designs inside this focus, those showing user-defined features, have a higher probability of being the result of the design process than designs from other areas of the search space.
1.1 using the representation to focus design processes
to make use of the influence that representations can have on a design process, any implicit bias introduced by the representation has to be replaced by a bias that is both predictable and controllable by the user. the goal is therefore to find a way to create a representation for an evolutionary system that transforms the search space in such a way that designs showing certain preferred attributes are more likely to be generated. this goal can be divided into two parts.
• identify a method to influence a design process in a predictable way by modifying the representation used in this process. many different representations are used for design, and variations on these representations influence
the design process in different ways. the method should be applicable to different representations in different applications.
• design a mechanism that allows the creation of such a representation for particular design problems. ideally, this mechanism would require little user-interaction.
the following section presents an intuitive view of the algorithm that has been developed, and the sections following describe how this algorithm achieves the two parts of the goal described here.

1.2 basic implementation schema
the general schema of the algorithm is shown in figure 1. the first step is the same as in any design computing application: the definition of a representation. in this case however, this is only an initial representation, referred to as the 'basic representation'. it is designed to allow a very large search space, including as many potentially interesting designs as possible, and is therefore usually very basic and low-level. in the example, the basic representation is based on squares that can be connected to form larger shapes (figure 1(a)).

in the second step, the system is put into a training situation, where a search algorithm using the initial representation is set to solve a simple design task: to produce phenotypes that are (partial) copies of a set of examples given to the system. during this phase, a meta-level process observes how the basic representation is used. it identifies patterns in the genotypes of the individuals that are particularly successful, and modifies the representation used by the system by adding symbols for these patterns.
the result is a new, 'complex' or 'evolved' representation, biased in favor of common features in the designs produced in the training session. in figure 1, l-shaped shapes appear in the design examples (figure 1(b)); therefore the representation is expanded by adding a symbol for this shape (figure 1(c)). the designs produced in this step that are copies of the examples are discarded. however, the evolved representation provides the required focus, centered on the examples. a regular search algorithm, using this evolved representation, can then be used to produce new designs that are likely to be similar to the examples. the effect of the evolved representation depends on whether the basic representation is replaced by the evolved representation or whether the evolved representation is added to the basic representation. in this example, the l-shape is only added, and the new designs can use both the original square and the l-shape (figure 1(d)).

fig. 1: basic concept of the use of evolved representations in creative design; panels: (a) basic representation, (b) design examples, (c) creation of the evolved representation, (d) new design using the basic and evolved representation

this basic algorithm solves the following two parts of the goal.
• the representation is manipulated by adding new symbols to the alphabet used in the representation. these additional symbols represent certain features, and by introducing them into the representation, designs containing these features are favored; in other words the additional symbols create a focus in the search space.
• the additional symbols can be identified automatically, using machine learning. no user interaction is required, only the provision of a set of example designs.
both points will be elaborated in the following sections. a system as described here also allows for the transformation of the focus thus created, simply by modification of the representation. this feature is important in connection with creative design.

1.2.1 use of evolutionary algorithms
the creation of an evolved representation from a training situation requires two main features: the ability to produce copies or partial copies of the examples without additional knowledge or supervision, and a flexible and easily modifiable representation. at the same time, the evolved representation is intended to be used to produce new designs. therefore, it has to be compatible with the search method that will be used to produce these new designs. evolutionary algorithms seem to fit these requirements very well. they require only a fitness value that can easily be calculated from a comparison between the phenotypes and the design examples. as will be shown in the following sections, they also allow for the manipulation of the representation during the search. finally, as the large body of existing design systems using evolutionary algorithms shows, they are also very well suited to the generation of new designs. this means that the same type of algorithm can be used for the creation and for the use of the evolved representation, and the compatibility of the representation is therefore ensured.

1.3 influencing search space using evolved representations
in evolutionary algorithms, a bias towards particular designs can be introduced either in the genotype representation, or in the genotype-phenotype transformation.
for example, using evolutionary algorithms with variable-length genotypes, individuals with short genotypes are generally easier to find than individuals with long genotypes. similarly, if the genotype-phenotype transformation were such that particular phenotypes can be represented by many genotypes, these phenotypes would be expected to be easier to find than phenotypes that are represented by only one genotype. in the method presented here, the biggest influence on the search process comes from the first effect: designs with certain desired features are represented with shorter genotypes, and are therefore easier to find. however, the method also introduces new ways to represent these designs, which again improves the chances of these designs in the design process.

to illustrate the creation of a focus, an evolutionary system with a string representation is used. in such a system, the genotypes are strings of fixed or variable length, constructed from symbols of a predefined alphabet. to create a focus in the search space of such an evolutionary algorithm, the representation used in this algorithm is modified by the introduction of additional symbols to the original alphabet. to distinguish the introduced symbols, they will be referred to as 'evolved genes', while the original symbols will be called 'basic genes'. evolved genes can be used together with the basic genes to produce new genotypes. as a result, two different kinds of genotypes can be distinguished: genotypes that use only basic genes (referred to as 'basic-level genotypes'), and genotypes that also use evolved genes (referred to as 'evolved-level genotypes'). this introduces another representation level, as the genotype representation is now split into a 'basic genotype representation' (or 'basic representation') and an 'evolved genotype representation' (or 'evolved representation'), as shown in figure 2.

the evolved genes are defined such that each evolved gene represents a certain combination of basic genes. as figure 2 shows, evolved-level genotypes can therefore be transformed into basic-level genotypes by replacing each evolved gene with the set of basic genes it represents. for example, if gene a in the figure appears in an evolved-level genotype, this indicates that in the corresponding basic-level genotype, this and the next position are filled by basic gene 0; evolved gene c indicates a sequence of four copies of basic gene 2. the original genotype-phenotype transformation can then be used to generate a phenotype. in the general case, the evolved-level genotype to basic-level genotype transformation is an n-to-one transformation; there will be many different ways to represent the same basic-level genotype using evolved genes.

1.3.1 transformation of the search space
since evolved genes are represented by single symbols in the genotype, they are 'atoms' for evolutionary operations. at the same time, they represent a combination of basic genes, effectively encapsulating this gene combination. as a result, this gene combination cannot be 'broken up' by any genetic operation. it can still be removed as a whole, but its chances of surviving a genetic operation are much higher than for all other gene combinations that occupy the same positions on the basic-level genotype, but are not encapsulated into an evolved gene. the more basic genes an evolved gene encapsulates, the stronger is this effect.
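as a minimal sketch of this two-level representation (our own illustration, not code from the paper), the expansion can be written as a simple symbol substitution; the gene definitions a = <0 0> and c = <2 2 2 2> follow the example in figure 2:

```python
# evolved genes (single symbols) expand into fixed combinations of
# basic genes; the basic alphabet here is 0, 1, 2, 3
EVOLVED_GENES = {
    "a": "00",    # gene a = <0 0>
    "c": "2222",  # gene c = <2 2 2 2>
}

def to_basic_level(genotype: str) -> str:
    """transform an evolved-level genotype into its basic-level
    equivalent by replacing each evolved gene with the basic genes
    it represents."""
    return "".join(EVOLVED_GENES.get(g, g) for g in genotype)

# the transformation is n-to-one: different evolved-level genotypes
# can represent the same basic-level genotype
print(to_basic_level("aa1111c3333"))                                # 0000111122223333
print(to_basic_level("00001111c3333") == to_basic_level("aa1111c3333"))  # True
```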
similarly, if evolved genes are used in the creation of an initial population, then the gene combinations represented in the evolved genes will have a higher chance of being represented in an individual than any other, random combinations of basic genes. the effect of the introduction of evolved genes is, therefore, that certain combinations of basic genes will be advantaged in the genetic search.

fig. 2: additional layer of representation caused by the use of evolved genes (example gene definitions: a = <0 0>, c = <2 2 2 2>)

it follows that evolved genes can be used to bias the search of the evolutionary system in favor of a feature if combinations of basic genes can be identified such that the probability that this feature is present in the phenotype is higher if the gene combination is present in the genotype than if it is not present.

the introduction of evolved genes can be seen as a transformation of the search space, as illustrated in figure 3. the example assumes a variable-length representation where each basic gene in the genotype is directly translated into a movement of a pen in a certain direction. the original alphabet therefore has four members, shown in the figure as a, b, c and d. the genotype-phenotype transformation transforms each letter into the movement of a pen, for example each occurrence of the symbol a in a genotype results in an upward movement of the pen of one unit length. the genotype dacb describes a simple square, constructed by the movement of the pen one unit to the left, one unit upward, one unit to the right, and one unit down. the genotype dbca represents the same square, however the pen ends up at a different corner of the square. in the figure, the endpoint of the movement is indicated with an arrow.

the search space can be illustrated by a number of concentric circles, each defining the space of designs that can be defined by a genotype of a certain length. the inner circle contains the designs represented by genotypes of length one, in other words the basic genes translated into phenotypes. the further away a design (or part of a design) is from the center, the larger is the genotype required to represent it, and also the larger the space that has to be searched to arrive at this design. the original search space is illustrated in figure 3(a), with the four basic genes in the center. the second circle shows all designs that can be derived from genotypes of length two (i.e., using two vectors). the other circles give some examples of designs using genotypes of length three, four and five.

every time an evolved gene is created, the structure of the search space is changed. the state of the new gene in the search space is moved into the center, all design states in the next circle that can be derived from that state are moved into the second circle, and so on. for example, if an evolved gene is introduced for each of the combinations of four consecutive basic genes that represent the two closed shapes in the fourth circle, the search space changes as shown in figure 3(b). the squares are now represented directly by an evolved gene, and the shapes on the fifth circle that are derived from the squares can now be found in the second circle.
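the pen-movement example can be made concrete with a short sketch (again our own illustration; the moves for a, b, c and d are derived from the square genotype dacb described in the text, i.e. a = up, b = down, c = right, d = left):

```python
# each basic gene moves the pen one unit; the mapping follows the
# square example 'dacb' (left, up, right, down) in the text
MOVES = {"a": (0, 1), "b": (0, -1), "c": (1, 0), "d": (-1, 0)}

def decode(genotype: str, start=(0, 0)) -> list:
    """genotype-phenotype transformation: the polyline drawn by the pen."""
    points = [start]
    x, y = start
    for gene in genotype:
        dx, dy = MOVES[gene]
        x, y = x + dx, y + dy
        points.append((x, y))
    return points

# 'dacb' draws a closed unit square and returns the pen to its start
print(decode("dacb"))  # [(0, 0), (-1, 0), (-1, 1), (0, 1), (0, 0)]
```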
the greater the number of evolved genes a design state involves, the more it is moved towards the center. for example, a shape with the four squares that is now on the fifth circle (that is, can be constructed from genotypes of length five) would have been on the fourteenth circle before (fourteen vectors, because the shape cannot be drawn without drawing two lines twice). since the introduction of a new gene increases the size of the alphabet of the representation, more genotypes of a given evolved-level genotype length exist, and the size of the search space for a given length increases. this is illustrated by using larger circles in figure 3(b).

fig. 3: example of an evolved representation: (a) original representation; (b) representation with evolved genes. some of the corresponding genotypes are given, capital letters denote evolved genes. the transformation from phenotype to genotype is not always unique, e.g., the genotypes 'abc' and 'bac' produce the same phenotype. arc segments indicate that only part of the space is shown.

however, the reduction in genotype length has a much stronger, search-space-reducing effect, as can be shown for the four-square shape. to produce this shape using basic genes only, the search space consists of basic-level genotypes of length fourteen, with four basic genes, containing 4^14 = 268 435 456 elements. to produce the same designs using basic and evolved genes, the search space would consist of all genotypes of length five with 6 symbols in the alphabet, containing 6^5 = 7 776 elements.

1.4 creating evolved representations
the previous section showed how evolved genes can be used to influence an evolutionary search in such a way that certain features are favored. the second task is to find a way to identify the appropriate combinations of basic genes, so that the evolved representation can be created. creating an appropriate evolved representation is straightforward in the case where the features that are intended to be included are explicitly known, and the genotype-phenotype representation is such that it is possible to directly map those features onto gene combinations. however, neither of those conditions is usually fulfilled. explicitly enumerating all desired features requires a high amount of user input, and the genotype-phenotype translation can be such that a reverse translation is difficult or impossible. it is therefore necessary to find a different method to create the evolved representation. machine learning can provide such a method.

figure 4 shows a schematic outline of a system employing machine learning to create the evolved genes. the central element is a user-provided example. the features present in this example will provide the center of the focus created by the evolved representation. the only user input required is the provision of this example. the loop on the left of figure 4 is based on a conventional evolutionary system. individuals are taken from the population (which is initially generated randomly), offspring are produced, fitnesses are calculated and the new individuals are either discarded or introduced into the population. the fitness function in this system is a comparison between the phenotypes produced and the example.
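one possible shape of such a comparison, sketched under the assumption that phenotypes are encoded as sets of occupied unit squares (a hypothetical encoding; the paper leaves the implementation domain-dependent):

```python
def similarity_fitness(phenotype: set, example: set) -> float:
    """fraction of the example that is reproduced, penalizing
    spurious parts; returns 1.0 only for an identical copy."""
    matched = len(phenotype & example)
    spurious = len(phenotype - example)
    return matched / (len(example) + spurious)

example = {(0, 0), (1, 0), (0, 1), (1, 1)}              # a 2x2 block of squares
print(similarity_fitness({(0, 0), (1, 0)}, example))    # 0.5 (partial copy)
print(similarity_fitness(example, example))             # 1.0 (identical copy)
print(similarity_fitness(example | {(5, 5)}, example))  # 0.8 (spurious square)
```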
this comparison returns a high value for phenotypes that are similar to the example, and lower values for phenotypes that are less similar; the exact implementation depends on the application domain. this fitness will be referred to as the 'similarity fitness'. at the start, the individuals produced will have hardly any similarities to the example; at the end the system might have found an identical copy of the example. in between, the system produces a high number of individuals that are in some features similar to the example. the goal of this system is not to produce the final individual, but to generate a range of individuals that contain a large variety of features from the example in a variety of combinations. some additional control may therefore be necessary to prevent convergence of the population; this control usually influences the fitness and the way new individuals are inserted into the population.

the population generated by the evolutionary system is then used as a pool of samples to create the evolved representation. this is done in the right loop in figure 4. gene combinations that appear predominantly in sample individuals that are very similar to the example can be used to create the set of evolved genes.

fig. 4: schematic representation of the proposed system to create evolved genes (a sample creation cycle on the left and a gene creation cycle on the right, coupled through the population and the user-provided example)

the assumption behind this is that the genotype-phenotype transformation is defined in such a way that features in the phenotype are correlated to subsets of genes on the genotype. this does not have to be a direct mapping, it is sufficient when:
• the probability that the feature exists in the phenotype is higher if the gene combination can be found in the genotype than if not.
• the probability that the gene combination can be found in the genotype is higher if the feature exists in the phenotype than if not.
a result of the probabilistic nature of the evolutionary process is that the evolved representation created for an example is not unique. instead, different runs will produce different evolved representations, each creating a focus around the example, but each slightly different from the others.

to reduce the computational cost in identifying the best new gene combination, it is possible to take only combinations of two existing genes into account. these existing genes can be either basic genes or evolved genes; any new evolved gene can therefore be composed of two basic genes, two evolved genes, or a basic gene and an evolved gene. in cases where the genotype-phenotype transformation allows gene combinations to be converted directly into features, this construction of high-order genes can be interpreted as a creation of large 'building blocks' by combining smaller ones. figure 5 shows how a building block with seven elements can be assembled in four steps, from both basic blocks and lower-order building blocks. the building block that is added in the third step would in turn have been created from basic blocks in a similar process.
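a rough sketch of one gene-extraction step under the pairwise restriction just described (our own illustration; the paper's actual statistics may differ): count adjacent pairs of existing genes, weighted by the similarity fitness of the genotypes they occur in, and promote the best pair to a new evolved gene.

```python
from collections import Counter

def extract_evolved_gene(population, evolved, next_symbol):
    """promote the most common adjacent gene pair in fit samples to a
    new evolved gene (a single new symbol).

    population: list of (genotype, similarity_fitness) pairs
    evolved: dict mapping evolved-gene symbols to their expansions
    """
    pairs = Counter()
    for genotype, fitness in population:
        for i in range(len(genotype) - 1):
            pairs[genotype[i:i + 2]] += fitness  # weight by similarity
    if pairs:
        best, _ = pairs.most_common(1)[0]
        # the new gene may combine basic and/or already-evolved genes,
        # so complex genes are built bottom-up from simpler ones
        evolved[next_symbol] = best
    return evolved

# e.g. if '00' dominates high-fitness genotypes, introduce gene 'a' = <0 0>
samples = [("0011", 0.9), ("0022", 0.8), ("1212", 0.1)]
print(extract_evolved_gene(samples, {}, "a"))  # {'a': '00'}
```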
creating complex evolved genes by combining simpler evolved genes can be described as a 'bottom-up' process, in the sense of artificial life research, where complex behaviors and structures are the result of the interaction of a number of simpler behaviors or structures.

fig. 5: interpretation of the creation of high-order evolved genes as the combination of building blocks

1.4.1 feedback into the evolutionary system
it is possible to run the evolutionary system until a sufficient number of samples are generated, and then run the gene extraction to create all evolved genes. however, given the bottom-up construction of complex evolved genes, a different approach offers itself: phases of sample creation and gene extraction can be interwoven. the evolved genes created in the gene extraction can be added to the representation and introduced into the population. in the early stages, where the individuals produced contain only little knowledge about the examples, simple evolved genes are produced and introduced; in later stages, more complex evolved genes will be introduced. the evolved genes thereby continually improve the representation used in the evolutionary process, helping it to produce larger and better fitting individuals. if this strategy is adopted, it is especially important to calculate the similarity fitness in a way that ensures that a gene combination is in fact related to a feature, because the first gene extractions will occur in early phases of the run of the evolutionary system, where the individuals still differ strongly from the examples. any gene combination, even if it occurs in high-fitness individuals, could otherwise simply reflect random influence from the initial population, instead of features in the example.

1.5 creating new designs using evolved representations
when a set of evolved genes has been created, it can be used to provide a focus for the generation of new designs. evolved genes are used to create a new representation, and a new random initial population is created using this representation. a conventional evolutionary system can then be run to produce new designs, using a fitness function that represents user-defined design criteria. depending on how the evolved genes are used, their effect on the search space can be different. if the evolved genes are added to the basic representation, the system can still use basic genes at any place in the genotype if the fitness requires it. therefore, the set of genotypes in the basic-level genotype search space and the resulting set of designs in the phenotype space are not changed, only the probability that some designs are found; this will be referred to as 'soft focus'. if, on the other hand, the evolved genes replace the basic genes, only some basic-level genotypes can be produced, and therefore the set of designs in the phenotype space is restricted to a subset of all phenotypes possible with the basic representation. this will be referred to as 'hard focus'. in most applications, the 'soft focus' approach is more appropriate, since it still allows the adaptation of the design to any specific design criteria. in these cases, two 'forces' influence the outcome of the design: the influence of evolved genes on the initial population and on the genetic operations, and the selection for or against phenotypes containing certain features.
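the difference between the two kinds of focus can be reduced to which alphabet the initial population (and the genetic operators) may draw genes from; a hypothetical sketch:

```python
import random

BASIC = list("0123")
EVOLVED = {"a": "00", "c": "2222"}  # hypothetical evolved genes

def random_genotype(length: int, focus: str = "soft") -> str:
    """soft focus: evolved genes are added to the basic alphabet, so
    every basic-level genotype stays reachable but the encapsulated
    combinations are favored. hard focus: evolved genes replace the
    basic ones, restricting the reachable phenotypes to a subset."""
    alphabet = BASIC + list(EVOLVED) if focus == "soft" else list(EVOLVED)
    return "".join(random.choice(alphabet) for _ in range(length))

print(random_genotype(5, "soft"))  # e.g. '3a0c1'
print(random_genotype(5, "hard"))  # e.g. 'acaca'
```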
kauffman [10] shows that on general rugged, multi-peaked fitness landscapes, often large bands of near-constant fitness exist, where selection therefore has no influence on the population. released from random points in the fitness landscape, the population usually ends up in those bands. inside a band, other forces, usually weaker than selection, can influence the population. local optima inside regions favored by such a force are then more likely to be reached than other local optima in other regions. rugged fitness landscapes usually result when the influence of a gene in the genotype on the fitness of the phenotype depends on a number of other genes in the genotype, a condition that certainly holds for most situations where evolutionary algorithms are used in design. the evolved representations can then be seen as a force that controls the 'neutral drift' of the population towards the focus in the search space. the evolved genes will only introduce those features that are positive or neutral with respect to the user-provided fitness. if no basic genes are used in the evolved representation, the evolutionary design system has no choice but to use the evolved genes. the choice of evolved genes however is again dependent on how the use of the specific evolved genes interacts with the fitness function.

2 computational creativity
when can a computational process be called 'creative'? it seems there are two ways to give a claim of creativity some foundation. one is to derive the process directly from observations of particular processes of human creative design, for example the use of analogies and emergence. both of these processes are assumed to play a role in human creative behavior, and a number of computational processes have subsequently been developed that use analogy ([20], [2], [22], [21]) and emergence ([7], [16], [18], [8]) to create or facilitate creative design. the second way is to try to define a general characterization in computational terms of human creative design activity, and to use this to guide the development of a computational process. this 'top-down approach' allows the use of computational techniques and methods that are not related to any specific human cognitive behavior, as long as they correspond to the general characterization. it might, for example, allow the use of evolutionary algorithms and neural networks, which most likely do not play any role in human creativity. the resulting definitions involve concepts related to transformation or expansion of search spaces (for computers) and conceptual spaces (for humans). neither of the definitions, however, says anything about how such a behavior can be achieved. where do new rules come from? how can the conceptual space be transformed? or, in computational terms, where do the new variables come from? this research will look at search spaces in the context of a finite system, and at how a computational process can expand a search space.

2.1 finite systems and closed worlds
the fact that a process is running as a program inside a computer introduces a set of theoretical limitations. the two most important in the context of design computing are:
1. the size of the memory available to the program is limited, which means that the total number of different states that the program can assume is limited (finite system).
2. the computing power available to run the program is limited, which means that the number of different states a program can assume in any specific time span is limited.
these limitations have a strong influence on what a computational design process can do, and on what is impossible for it. the main implications are:
1. since each different design produced is connected to a different state of the machine, and the number of states is limited (limitation one), the total set of different designs that a program can generate is limited and defined a priori. another way of putting this restriction is that every design has to be represented by design variables, and each variable has to be stored in memory, limiting the number of variables and therefore designs.
2. due to limitation two, the set of different designs that a program can produce and evaluate in an acceptable time frame is limited. in practice, this number is usually much smaller than the set of possible designs. this means that the space searched usually has to be much smaller than the space of possible designs.
3. the set of designs a program could produce in a limited time frame is not generally known a priori. as langton [12] observes, turing's halting theorem can be expanded to show that "it is impossible in the general case to determine any nontrivial property of the future behavior of a sufficiently powerful computer from a mere inspection of its program and initial state alone." even with a knowledge of the representation and the program, it might therefore not be possible to predict which points in the search space the system can assume, and therefore which designs can possibly be the outcome of the design process.
4. due to limitations one and two, the evaluation of the design can take into account only a certain limited set of interactions between a design and its environment. for example, individuals in an artificial life application can develop 'vision' only if (a) the individuals have access to some kind of optical sensory organs, and (b) in every time-step of the evolution, a simulation is run to calculate what each individual would 'see' (as is done for example in [23]). however, even this would not allow for individuals developing, for example, flight.

the main consequences of the limitations are therefore that the total set of designs that can be produced is fixed, and that usually only a small part of it can be tested by the design process in an acceptable time. if the design process is seen as a search process, it means that the search will always proceed inside a predetermined search space, which can be referred to as the meta search space; and that of the designs in this meta search space, the search process can test only a small fraction. however, there are still a number of methods by which this search space can be searched by the design process. the following discussion and the illustration in figure 6 assume that one design exists in the meta search space that can be considered the optimum in terms of design performance; but the same methods are also applicable if more than one design with equally acceptable performance exists.
1. a process can search only a small search space, accepting the outcome of this search, even if it represents only a local optimum, and better designs lie outside the search space. for example, the search space can be restricted to designs where methods to optimize them analytically are known. this method is illustrated in figure 6(a).
2. using domain knowledge, a search space can be created that is known to contain the desired design.
this could be the case in a situation where, say, theoretical analysis shows that all designs outside a certain space give results that violate one or more of the design restrictions. this method is illustrated in figure 6(b).
3. searching for the global optimum in a large space without heuristics. this could either use a random search or attempt to enumerate all possible designs (figure 6(c)).
4. using forms of domain knowledge, including 'soft' knowledge and heuristics, to guide the search for the best design in a very large search space. search strategies such as following the local gradient (hill-climbing), simulated annealing ([11]) and evolutionary algorithms fall under this criterion, as well as guaranteed methods such as logic programming. the arrows in figure 6(d) illustrate this use of local knowledge.
5. focusing the search onto a sub-space. the sub-space is searched with any of the other methods. domain knowledge is used to change the focus inside the total search-space, as shown in figure 6(e).

fig. 6: methods to search a large search space (black dot: global design optimum; broken line: meta search space, bounded by restriction one; continuous line: search space searched): (a) focus on a subspace, possibly excluding the optimum; (b) use domain knowledge to set the focus to include the optimum; (c) search the whole search space; (d) search the whole search space using domain knowledge; (e) focus on a subspace, but move the focus

fig. 7: hierarchy of search spaces

in practice, the methods will be combined, so that what appears as the meta search space in any of the last three methods is in fact a subset of the set of theoretically possible designs, resulting from the limitation of the search space by either method one or two. with focusing, the search becomes a three-layered process, as shown in figure 7: the search space is restricted from the meta search space to a smaller search space, usually using domain knowledge. in this search space, focusing creates smaller sub-spaces, which in turn are searched using local knowledge.

while the limitations of a finite state process are related to the concept of a 'closed world', as used in artificial life (see for example [1]), where it implies total reproducibility, and logic programming (see for example [9]), where the term implies complete knowledge, it is important to note that a closed world is not necessary for the above restrictions to apply. for example, it would be possible to exchange the pseudo-random generator used in many programs for a physical device, based on thermal noise or on radioactive decay. this results in a process that is neither reproducible, nor allows complete knowledge; however the rest of the computer is still a finite state machine, and the openness of the world will have no effect on the qualitative outcome of the process.

2.2 focusing: creativity in a finite system
the focusing in method 5 is very similar to the 'introduction of new variables' and 'modification of search space'. in fact, the difference in definition may be seen as a difference in perspective. looking at a process as a local observer, who only sees the currently accessible part of the search space, the move of focus in this method appears as a move of the search space. however, a global observer will be able to tell that all successively searched search spaces are in fact part of the larger meta search space.
an example from the literature can be used to illustrate this point. in [6], an evolutionary system is used to generate beam sections, with perimeter and moment of inertia as two competing design criteria. the sections are represented using a shape grammar; the initial search space s0 is the set of all designs that can be generated using this grammar. the authors then allow the shape grammar itself to change, and at the end of the evolutionary process a new shape grammar is learned. with this new grammar, a different set of sections can be produced; the system is therefore using a new search space, sn. the authors observe that s0 ≠ sn and s0 ∩ sn ≠ ∅, and therefore argue that the change in the shape grammar led to a substitutive change in the search space. however, a global observer would be able to see that both s0 and sn are in fact part of a larger search space s*, with s0 ⊂ s* and sn ⊂ s*, which may also hold many other designs that are part of neither s0 nor sn. in terms of focusing, s0 and sn both represent a focus inside the space of all designs that can be represented by all possible sets of shape grammars, s*. another example can be seen in the variable addition shown in figure 2.7(a). if the space of all possible pentagons is considered the original search space s0, then adding a variable and thereby introducing hexagons creates a new, expanded search space sn. but it is also possible to argue that both are a subset of the space of all possible polygons, s*.

2.2.1 is focusing creative?
in terms of [5], most of the methods illustrated in figure 6 would have to be classified as 'routine design', since the search space that is searched remains constant. focusing on a sub-space, however, requires that some of the total set of variables are restricted in their range or set to constant values. moving the focus, then, introduces new variables, and/or uses variables with values outside their current scope. this fulfils the condition for 'creative' or 'innovative' design, not for the entire process, but for the local view onto the focused search. the position of a search using focusing in poon and maher's [18] transformation-exploration matrix (figure 8) depends on the way the sub-space is searched. as argued above, the process certainly can be classified as 'novel' or 'original' in terms of transformation. if then for example an evolutionary search is used inside the focus area, strong diverging elements are introduced, giving a value of 'novel' or 'original' for 'exploration'. the resulting process is then inside the area of processes with the potential for creativity. a simple hill-climbing inside the focus area, on the other hand, would be entirely convergent; the value for 'exploration' would therefore be 'mundane'. such a process would then be classified as having only a low potential for creativity.

fig. 8: framework to classify design processes in terms of exploration and transformation (from [14]); the axes exploration (e) and transformation (t) each range over mundane (m), novel (n) and original (o), dividing the matrix into creative, transition and non-creative zones

2.2.2 soft focus versus hard focus
the focus does not have to be as clear-cut as in figure 6. it is also possible to have a 'soft' focus, where all variables can be modified, but some are much more likely to be modified than others, and/or variables are much more likely to assume values in one range than in a different range.
in other words, certain points in the search space, those inside the focus, are much more likely to be found than other points. moving the focus would then correspond to changing the probabilities of the search process. in figure 9, the soft focus is shown as regions of higher and lower probability inside a search space. as the figure shows, the regions of higher and lower probability do not have to be connected; indeed it is also possible that neighboring designs have very different probabilities, and no distinct regions exist at all. in the literal sense, moving or expanding a soft focus constitutes neither an addition of variables nor an expansion of the search space, not even from a local perspective, since every design inside the meta search space can always be produced, independent of the position and shape of the focus. however, this is only a difference in degree: a system where some designs that had previously been impossible to find are now available (moving a hard focus) will behave very similarly to one where some designs were very unlikely to be found and are now much more likely (moving a soft focus). for this reason, it can be argued that moving a soft focus equally well fulfils this requirement for creativity.

2.2.3 moving the focus
neither gero [4] nor maher, boulanger, poon and gomez [14] define any necessary attributes of the mechanism that drives the transformation or exploration. an entirely random mechanism seems not very useful: if there are only a few acceptable designs in the meta search space, then a randomly positioned focus will have a low probability of containing one of them. this point is also made by boden when she says that without hunches, a creative robot "would waste a lot of time in following up new ideas that 'anyone could have seen' would lead to a dead end" [3]. two sources for such 'hunches' have already been discussed: the use of analogy and of emergence. other sources seem possible; the important aspect is that some connection exists between the current and the transformed focus that improves the probability of finding good designs in the new focus above that of random moves.

2.2.4 humans and finite systems
the previous section has argued that the possibilities for expanding and moving the search space in a computational process are limited by the fact that they have to work inside a finite system. it is an ongoing philosophical debate whether the human mind is essentially nothing more than a complex computational process, or if other, fundamentally different, processes are involved (see for example [17]). however, it is possible to argue that human designers also only focus onto subspaces of a larger meta search space, without requiring any assumptions on the fundamental nature of the human mind. for example, as argued in [13], a very good knowledge both in breadth and in depth about a field is a necessary condition for creativity in humans. apart from limited knowledge, humans are also restricted by the number of design alternatives they can consider in a limited time. for complex design tasks this might easily be more restricting for a human than for a computer. focusing in human design processes can be directly shown using protocol analysis. in [15], the authors analyze the design behavior of designers during conceptual design. among other things, they classify the design activities into 'analysis' of the problem, 'synthesis' of solutions, and 'evaluation' of the solutions with regard to the problem.
only during the analysis phase, where the problem is taken into account, does the designer define the search space. the authors report that, as could be expected, the designers observed usually proceeded from analysis to synthesis, and from synthesis to evaluation. however, even in the early stages of the design process, the designers often went from evaluation back to synthesis without a new problem analysis, and therefore without a change in search space. after the initial phase, the designers were about five to six times more likely to proceed from evaluation to synthesis than to analysis. this can be interpreted as a focused search of the search space, interrupted by analysis phases, which allow the focus to be changed. evidence of focusing occurring in human design can also be found in the observation that designers tend to reproduce both adequate and inadequate design features of examples if they are given such examples in a design brief, a phenomenon referred to as 'design fixation' [19]. seeing an example is sufficient to create a focusing effect in the following design activity.

fig. 9: transforming a soft focus inside a search space; darker shades represent a higher probability for a design to appear as the result of the search

the notion of focusing as an essential component in a creative computational process, as presented in this research, has been derived entirely from general ideas about creativity and computational processes, quite specifically without looking at particular instances of human creative behavior. it is therefore especially encouraging to find this evidence of focusing in human creative design. as in the computational processes, the focus can be 'clear-cut', for example as a result of a decision not to modify some design variables, or 'soft', as a tendency to or as a preference for certain types of design.

2.3 requirements for a creative design process
from the previous sections, a number of criteria can be identified that a potentially creative computational process should fulfill. the process should:
• be able to define a non-trivial sub-space, or focus, inside a very large search-space (the meta search space).
• use divergent and convergent elements in a search process, such that the search is either entirely bounded by the focus, or that designs inside the focus are far more likely to be sampled by the search than designs outside the focus.
• allow for transformations of the focus inside the meta search space.
• allow for goal-oriented control of the modifications of the focus.
this characterization differs from those used by [5] and [14] in three points:
• it specifically acknowledges that every search space searched will be a sub-space of a larger, predefined search space, the meta search space.
• it specifically allows for a 'soft' focus.
• it requires that the control for the modification of the focus is not random.
the requirements form a set of necessary conditions for a creative computational process. whether they are also sufficient conditions depends on what measure is used to judge the creativity. they are sufficient according to the two definitions used, but as mentioned, maher, boulanger, poon and gomez [14] carefully limit the definition to the potential for creativity. it is very likely that processes exist that fulfill the criteria, but where the designs produced appear not to be creative.
additional requirements, which narrow down the definition, might emerge in future research.

references
[1] ackley, d. h.: a network of worlds for research. in c. g. langton and k. shimohara (eds), proceedings of the 5th international workshop on artificial life: synthesis and simulation of living systems (alife-96), mit press, cambridge, 1997, pp. 116–123
[2] bhatta, s., goel, a., prabhakar, s.: innovation in analogical design. in j. s. gero and f. sudweeks (eds), artificial intelligence in design, kluwer academic publishers, dordrecht, the netherlands, 1994, pp. 57–74
[3] boden, m. a.: could a robot be creative – and would we know? in k. m. ford, c. glymour and p. j. hayes (eds), android epistemology, mit press, cambridge, ma, 1995, pp. 51–72
[4] gero, j. s.: design prototypes: a knowledge representation schema for design. ai magazine vol. 11, no. 4/1990, pp. 26–36
[5] gero, j. s.: towards a model of exploration in computer-aided design. in j. s. gero and e. tyugu (eds), formal design methods for cad, north-holland, amsterdam, 1994, pp. 315–336
[6] gero, j. s., louis, s. j., kundu, s.: evolutionary learning of novel grammars for design improvement. artificial intelligence for engineering design, analysis and manufacturing aiedam, vol. 8, no. 2/1994, pp. 83–94
[7] gero, j. s., jun, h. j.: visual semantics emergence to support creative designing: a computational view. in j. s. gero, m. l. maher and f. sudweeks (eds), preprints computational models of creative designs, key centre of design computing, university of sydney, 1995, pp. 87–116
[8] grabska, e., borkowski, a.: assisting creativity by composite representation. in j. s. gero and f. sudweeks (eds), artificial intelligence in design ’96, kluwer academic publishers, dordrecht, the netherlands, 1996, pp. 743–760
[9] jäger, g.: annotations on the consistency of the closed world assumption. journal of logic programming, vol. 8, no. 3/1990, pp. 229–247
[10] kauffman, s. a.: the origins of order. oxford university press, new york, 1993
[11] kirkpatrick, s., gelatt, c., vecchi, m.: optimization by simulated annealing. science 220, 1983, pp. 671–680
[12] langton, c. g.: artificial life. in c. g. langton (ed.), artificial life, vol. vi of sfi studies in the sciences of complexity, addison-wesley, reading, 1988, pp. 1–47
[13] maher, m. l., zhao, f., gero, j. s.: creativity in humans and computers. in j. s. gero and t. oksala (eds), knowledge-based systems in architecture, acta scandinavica, helsinki, 1989, chapter 13, pp. 129–141
[14] maher, m. l., boulanger, s., poon, j., gomez, a.: exploration and transformation in computational models for creative design processes. in j. s. gero, m. l. maher and f. sudweeks (eds), preprints computational models of creative design, key centre of design computing, university of sydney, 1995, pp. 233–265
[15] mcneill, t., gero, j. s., warren, j.: understanding conceptual electronic design using protocol analysis. research in engineering design 10/1998 (to appear)
[16] nagasaka, i., yamagishi, a., taura, t.: a methodology of emergent shapes for creative design. in j. s. gero, f. sudweeks and m. l. maher (eds), preprints computational models of creativity, key centre of design computing, university of sydney, 1995, pp. 117–130
[17] penrose, r.: the emperor's new mind. oxford university press, new york, 1989
[18] poon, j., maher, m. l.: emergent behavior in co-evolutionary design. in j. s. gero and f.
sudweeks (eds), artificial intelligence in design ’96, kluwer academic publishers, dordrecht, the netherlands, 1996, pp. 703–722
[19] purcell, a., gero, j. s., edwards, h. m., matka, e.: design fixation and intelligent design aids. in j. s. gero and f. sudweeks (eds), artificial intelligence in design ’94, kluwer academic publishers, dordrecht, the netherlands, 1994, pp. 483–496
[20] qian, l., gero, j. s.: a design support system using analogy. in j. s. gero (ed.), artificial intelligence in design ’92, kluwer academic publishers, dordrecht, the netherlands, 1992, pp. 795–816
[21] qian, l., gero, j. s.: an approach to design exploration using analogy. preprints computational models of creative design, key centre of design computing, university of sydney, 1995, pp. 3–36
[22] wolverton, m., hayes-roth, b.: finding analogues for innovative design. preprints computational models of creative design, key centre of design computing, university of sydney, 1995, pp. 59–84
[23] yaeger, l.: computational genetics, physiology, metabolism, neural systems, learning, vision and behaviour or polyworld: life in a new context. in c. g. langton (ed.), artificial life iii, vol. xvii of sfi studies in the sciences of complexity, santa fe institute, addison-wesley, reading, 1992, pp. 263–298

ashraf fouad hafez ismail
phone: +420 602882128
fax: +420 2 6517811
e-mail: hafez@fa.cvut.cz
department of design theory
czech technical university in prague
faculty of architecture
thákurova 7, 166 29 praha 6, czech republic

acta polytechnica 53(2):75–78, 2013
© czech technical university in prague, 2013
available online at http://ctn.cvut.cz/ap/

fractal dimension estimation in diagnosing alzheimer's disease
václav hubata-vacek a,∗, jaromír kukal a, robert rusina b, marie buncová c
a ctu, faculty of nuclear sciences and physical engineering, department of software engineering in economics, břehová 7, 115 19 praha 1, czech republic
b thomayer's hospital, vídeňská 800, 140 59 praha 4–krč, czech republic
c institute for clinical and experimental medicine, vídeňská 1958/9, 140 21 praha 4–krč, czech republic
∗ corresponding author: hubatvac@fjfi.cvut.cz

abstract. estimated entropies from a limited data set are always biased. consequently, it is not a trivial task to calculate the entropy in real tasks. in this paper, we used a generalized definition of entropy to evaluate the hartley, shannon, and collision entropies. moreover, we applied the miller and harris estimations of shannon entropy, which are well-known bias-correction approaches based on taylor series. finally, these estimates were improved by bayesian estimation of individual probabilities. these methods were tested and used for recognizing alzheimer's disease, using the relationship between entropy and the fractal dimension to obtain fractal dimensions of 3d brain scans.

keywords: entropy, fractal dimension, alzheimer's disease, boxcounting, rényi entropy.

1. introduction
before explaining the relationship between entropy and dimension, we have to introduce the term dimension. let $d \in \mathbb{N}$ be the dimension of a euclidean space in which a $d$-dimensional unit hypercube is placed. let $m \in \mathbb{N}$ be the resolution and $a = 1/m$ be the edge length of the covering hypercubes of the same dimension $d$. the number of covering elements is given by $n = n(a) = a^{-d}$.
knowledge of $n$ for fixed $a$ enables direct calculation of the hypercube dimension according to
$$\ln n(a) = -d \ln a \quad\Longrightarrow\quad d = \frac{\ln n(a)}{\ln \frac{1}{a}}. \qquad (1)$$
the very popular boxcounting method [1] is based on the generalization of (1) to the form $\ln n(a) = a_0 - d_0 \ln a$ and its application to the boundary of any set $f \subset \mathbb{R}^d$. as will be shown in the next section, the quantity $\ln n(a)$ is an estimate of the hartley entropy.

2. rényi entropy
using a natural logarithm instead of a binary logarithm, we can proceed to the definition of the rényi entropy. let $k \in \mathbb{N}$ be the number of events, $p_j > 0$ be their probabilities for $j = 1, \ldots, k$ satisfying $\sum_{j=1}^{k} p_j = 1$, and $q \in \mathbb{R}$. we can define the rényi entropy [2] as
$$h_q = \frac{\ln \sum_{j=1}^{k} p_j^q}{1 - q},$$
which is a generalization of the shannon entropy. for particular values of $q$, we obtain the specific entropies:
• hartley entropy [3] for $q = 0$ as $h_0 = \ln \sum_{p_j > 0} 1 = \ln k = \ln n(a)$;
• shannon entropy [4] for $q \to 1$ as $h_1 = \lim_{q \to 1} h_q = -\sum_{j=1}^{k} p_j \ln p_j$;
• collision entropy [2] for $q = 2$ as $h_2 = -\ln \sum_{p_j > 0} p_j^2$.
the resulting theoretical entropies can be used for defining the rényi dimension [2] as
$$d_q = \lim_{a \to 0^+} \frac{h_q}{\ln \frac{1}{a}},$$
which corresponds to the relationship
$$h_q \approx a_q - d_q \ln a \qquad (2)$$
for small covering size $a > 0$.

3. entropy estimates
there are several approaches to entropy estimation from experimental data sets. assuming that the number of experiments $n \in \mathbb{N}$ is finite, we can count the events and obtain $n_j \in \mathbb{N}_0$ as the event frequencies for $j = 1, \ldots, k$. the first approach is naive estimation. we directly estimate $k$ and $p_j$ as $k_n = \sum_{n_j > 0} 1 \le k$ and $p_{j,n} = n_j / n$. these biased estimates also produce biased entropy estimates
$$h_{0,n} = \ln k_n, \quad h_{1,n} = -\sum_{n_j > 0} p_{j,n} \ln p_{j,n}, \quad h_{2,n} = -\ln \sum_{n_j > 0} p_{j,n}^2.$$
the second approach is based on bayesian estimation of the probabilities $p_j$ as $p_{j,b} = \frac{n_j + 1}{n + k_n}$. this technique is called here semi-bayesian estimation. we obtain other, but also biased, entropy estimates
$$h_{1,s} = -\sum_{n_j > 0} p_{j,b} \ln p_{j,b}, \quad h_{2,s} = -\ln \sum_{n_j > 0} p_{j,b}^2.$$
the estimate $h_{2,s}$ can be improved as $h_{2,s2} = -\ln \sum_{n_j > 0} u_j$, where $u_j = \frac{(n_j + 2)(n_j + 1)}{(n + k_n + 1)(n + k_n)}$ is a bayesian estimate of $p_j^2$. a direct bayesian estimate of $h_1$ was also calculated as
$$h_{1,b} = -\sum_{i=1}^{k_n} \frac{n_i + 1}{n + k_n} \bigl(\psi(n_i + 2) - \psi(n + k_n + 1)\bigr),$$
where $\psi$ is the digamma function.

4. bias reduction
miller [5] modified the naive estimate $h_{1,n}$ using a first-order taylor expansion, which produces
$$h_{1,m} = h_{1,n} + \frac{k_n - 1}{2n}.$$
later, harris [5] improved the formula to
$$h_{1,h} = h_{1,n} + \frac{k_n - 1}{2n} + \frac{1}{12n^2} \Bigl(1 - \sum_{p_j > 0} \frac{1}{p_j}\Bigr).$$
from the theoretical point of view, it is prohibited to estimate $p_j$ by its estimates. however, we are trying to investigate biased estimates of $h_1$ in the forms
$$h_{1,hn} = h_{1,n} + \frac{k_n - 1}{2n} + \frac{1}{12n^2} \Bigl(1 - \sum_{n_j > 0} \frac{1}{p_{j,n}}\Bigr),$$
$$h_{1,hs} = h_{1,n} + \frac{k_n - 1}{2n} + \frac{1}{12n^2} \Bigl(1 - \sum_{n_j > 0} \frac{1}{p_{j,b}}\Bigr),$$
$$h_{1,hb} = h_{1,n} + \frac{k_n - 1}{2n} + \frac{1}{12n^2} \Bigl(1 - \sum_{n_j > 0} r_j\Bigr),$$
where $r_j = \frac{n + k_n - 1}{n_j}$ is a bayesian estimate of $\frac{1}{p_j}$.

5. estimation methodology
naive, semi-bayesian, bayesian and corrected entropy estimates were subjected to testing on 2d and 3d structures with a known hausdorff dimension. the list of involved estimates is included in tab. 1. a sierpinski carpet with $d_q = 1.8928$ for any $q \ge 0$ and of size $81 \times 81$ is a typical 2d fractal set model. using the estimates from tab. 1 and the linear regression model (2), we estimated the rényi dimensions $\hat{d}_q$ and then evaluated their z-scores as a relative measure of bias,
$$z_{\text{score}} = \frac{\hat{d}_q - d_q}{s_{\hat{d}_q}}.$$
the results are included in tab. 2.
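for illustration (our own sketch, not the authors' code), the naive, miller and harris-type shannon estimates can be computed from the event frequencies $n_j$ as follows:

```python
import math

def entropy_estimates(freqs):
    """naive, miller and harris-corrected shannon entropy estimates
    from a list of event frequencies n_j (see the formulas above)."""
    n = sum(freqs)
    nz = [x for x in freqs if x > 0]
    kn = len(nz)                         # observed number of events
    p_naive = [x / n for x in nz]
    h1n = -sum(p * math.log(p) for p in p_naive)
    h1m = h1n + (kn - 1) / (2 * n)       # miller correction
    # harris correction with naive plug-in probabilities (h_{1,hn})
    h1hn = h1m + (1 - sum(1 / p for p in p_naive)) / (12 * n ** 2)
    # harris with the bayesian estimate r_j = (n + kn - 1)/n_j (h_{1,hb})
    h1hb = h1m + (1 - sum((n + kn - 1) / x for x in nz)) / (12 * n ** 2)
    return h1n, h1m, h1hn, h1hb

print(entropy_estimates([40, 30, 20, 10]))
```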
table 1. entropy estimates.

method                       h0      h1      h2
naive                        h0,n    h1,n    h2,n
semi-bayesian (pj)           *       h1,s    h2,s
semi-bayesian (pj^2)         *       *       h2,s2
bayesian                     *       h1,b    *
miller                       *       h1,m    *
harris                       *       h1,hn   *
harris semi-bayesian (pj)    *       h1,hs   *
harris bayesian (1/pj)       *       h1,hb   *

table 2. dimension estimates via various entropy estimates.

           sierpinski carpet (dq = 1.8928)      five-box fractal (dq = 2.3219)
estimate   d̂q       s(d̂q)    zscore            d̂q       s(d̂q)    zscore
h0,n       1.8158   0.0064   −12.0577          2.0897   0.0284   −8.1757
h1,n       1.8472   0.0059   −7.7116           2.1853   0.0320   −4.2690
h2,n       1.8578   0.0076   −4.6212           2.1949   0.0298   −4.2568
h1,s       1.8515   0.0058   −7.0853           2.2367   0.0315   −2.7012
h2,s       1.8657   0.0072   −3.7494           2.2927   0.0298   −0.9798
h2,s2      1.7898   0.0077   −13.4269          2.1189   0.0268   −7.5904
h1,b       1.8170   0.0060   −12.6863          2.1654   0.0297   −5.2638
h1,m       1.8930   0.0059   0.0306            2.3315   0.0349   0.2730
h1,hn      1.8921   0.0059   −0.1203           2.3208   0.0347   −0.0332
h1,hs      1.8921   0.0059   −0.1164           2.3226   0.0347   0.0196
h1,hb      1.8920   0.0059   −0.1328           2.3182   0.0346   −0.1084

the best estimates, with $|z_{\text{score}}| \le 1.960$, are $h_{1,m}$ followed by the harris estimates $h_{1,hn}$, $h_{1,hs}$, $h_{1,hb}$. a five-box fractal structure with $d_q = 2.3219$ and size $128 \times 128 \times 128$ was then used for 3d testing, and the results are also included in tab. 2. the best estimators are $h_{1,hs}$, $h_{1,hn}$, $h_{1,hb}$, $h_{1,m}$, $h_{2,s}$.

6. alzheimer's disease diagnosis from fractal dimension estimates
alzheimer's disease (ad) is the most common form of dementia, and is characterised by a loss of neurons and their synapses. this loss is caused by an accumulation of amyloid plaques between nerve cells in the brain. morphologically, the affected areas produce rounded clusters of destroyed brain cells, which are visible on brain scans. on the other hand, amyotrophic lateral sclerosis (als) is a disease of the motor neurons, and it is not visible on brain scans. in this sense, brain scans of als patients look like brain scans of healthy patients. these entropy estimators were used for diagnosing alzheimer's disease. we tried to separate two different groups of samples of human brains. in the first group, there were brain scans of patients with alzheimer's disease (ad) and in the second group brain scans of patients with amyotrophic lateral sclerosis (als). we carried out tests on 21 samples (11 for ad and 10 for als), represented by $128 \times 128 \times 128$ matrices of thresholded images ($\theta = 40\,\%$). we used a two-sample t-test; the null and alternative hypotheses were
$$h_0 : \mathrm{e}\,\hat{d}_q(\text{ad}) = \mathrm{e}\,\hat{d}_q(\text{als}), \qquad h_a : \mathrm{e}\,\hat{d}_q(\text{ad}) \neq \mathrm{e}\,\hat{d}_q(\text{als}).$$
the results are included in tab. 3. the most significant differences between ad and als were observed for $h_{0,n}$, $h_{1,s}$, $h_{1,b}$.

table 3. diagnostic power.

estimate   e d̂q (ad)   e d̂q (als)   p-value
h0,n       1.9745      2.0315       0.017486
h1,n       2.0649      2.1096       0.025128
h2,n       2.0687      2.1034       0.067814
h1,s       2.0968      2.1471       0.018828
h2,s       2.1458      2.1903       0.031375
h2,s2      1.9274      1.9666       0.036419
h1,b       2.0011      2.0506       0.018873
h1,m       2.2607      2.3115       0.041142
h1,hn      2.2428      2.2931       0.037608
h1,hs      2.2452      2.2957       0.037729
h1,hb      2.2366      2.2868       0.035800
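the dimension estimation and the group comparison can be sketched as follows (our own illustration with made-up numbers, not the study data):

```python
import numpy as np
from scipy import stats

def renyi_dimension(box_sizes, entropy_values):
    """fit the linear model (2), hq = aq - dq * ln(a), and return the
    dimension estimate d̂q together with its standard error s(d̂q)."""
    res = stats.linregress(np.log(box_sizes), entropy_values)
    return -res.slope, res.stderr

# two-sample t-test separating the ad and als groups (cf. tab. 3);
# the dimension values below are illustrative only
ad_dims = [1.95, 1.97, 1.99, 1.96, 1.98, 1.94]
als_dims = [2.02, 2.04, 2.03, 2.05, 2.01]
t_stat, p_value = stats.ttest_ind(ad_dims, als_dims)
print(t_stat, p_value)
```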
which performed worst. on the basis of these results, entropy can be used for diagnosing alzheimer's disease in the future, considering that the methods can still be improved, especially by better estimation of $k_N$ or by image filtering.

acknowledgements

this paper was created with support from ctu in prague grant sgs11/165/ohk4/3t/14.

references

[1] theiler, j.: estimating fractal dimension. journal of the optical society of america a, vol. 7, no. 6, 1990, pp. 1055–1073.
[2] rényi, a.: on measures of entropy and information. berkeley symposium on mathematical statistics and probability, vol. 1, 1961, p. 547.
[3] hartley, r. v. l.: transmission of information. bell system technical journal, vol. 7, 1928, p. 535.
[4] shannon, c. e.: a mathematical theory of communication. bell system technical journal, 1948.
[5] harris, b.: the statistical estimation of entropy in the non-parametric case. mrc technical summary report, 1975.
[6] gomez, c., mediavilla, a., hornero, r., abasolo, d., fernandez, a.: use of higuchi's fractal dimension for the analysis of meg recordings from alzheimer's disease patients. medical engineering & physics, vol. 31, no. 3, april 2009, pp. 306–313.
[7] jouny, c. c., bergey, g. k.: characterization of early partial seizure onset: frequency, complexity and entropy. clinical neurophysiology, vol. 123, no. 4, april 2012, pp. 658–669.
[8] lopes, r., betrouni, n.: fractal and multifractal analysis: a review. medical image analysis, vol. 13, no. 4, august 2009, pp. 634–649.
[9] polychronaki, g. e., ktonas, p. y., gatzonis, s., siatouni, a., asvestas, p. a., tsekou, h., sakas, d., nikita, k. s.: comparison of fractal dimension estimation algorithms for epileptic seizure onset detection. journal of neural engineering, vol. 7, 2010.

acta polytechnica 53(2):152–154, 2013
© czech technical university in prague, 2013
available online at http://ctn.cvut.cz/ap/

systems of electronic overcurrent protection in pulse power generator operating on plasma load

dmitriy v. godun, sergey v. bordusov∗, anatoliy p. dostanko

belorussian state university of informatics and radioelectronics, minsk, belarus
∗ corresponding author: bordusov@bsuir.by

abstract. schematic peculiarities of a pulsed power source and modulator-shaper working on a highly unstable plasma load are discussed. several levels of overcurrent protection should be provided in such a design. first of all, electronic protection modules should be integrated into the driver control system of the igbt modules and must disconnect the power switches quickly when the allowable pulse current values are exceeded. the next level of overcurrent protection in the pulse power generator is protection against overcurrent in the load circuit. the operating threshold of the current protection in this case must be set to the maximum value of the current in the secondary circuit. in order to limit stray voltage spikes on the power pulses at the moment the power switches commutate, restrictive rc snubbers must be installed in parallel with the collectors and emitters of the transistors. software-controlled configuration of the electrical power at the output of the pulsed power supply is also appropriate.
keywords: plasma, high-voltage source, pulse voltage.

1. introduction

the electronic system of complex protection of a pulse power generator is used to protect the key igbt transistors against excess of the permissible pulse current values and temperature during gas discharge formation at low pressure. a peculiarity of the application of the pulse power generator during gas discharge formation is the wide range of operating parameter settings and, consequently, the instability of the plasma as the load [3]. the fast-acting complex protection system of the pulse power generator has to ensure its blocking within a time not exceeding 100 nanoseconds.

2. main part

the complex electronic protection system of the pulse power generator is shown in fig. 1.

figure 1. structure chart of the complex protection system of the pulse generator.

the pulse generator [1], operating into a plasma load, comprises four current protection circuits and two thermal protection circuits for the power semiconductor devices. the system has been designed in such a way that the thermal and current protections are interconnected and operate simultaneously. their joint operation ensures a quick switch-off of the power generator in case the permissible pulse current values or the temperature of the power semiconductor devices are exceeded. the thermal protection circuit is required to control the temperature of the igbt transistors operating in pulse mode. according to the producer's technical documentation, when the temperature of the igbt transistors rises to 100 °c, the permissible pulse current of the device decreases by a factor of two [5]. determining the threshold of the pulse current limit as a function of the transistor case temperature is therefore a task of immediate importance. the electronic protection system is divided into several circuits. the main circuit of electronic protection (the first circuit) is implemented as part of the driver control system of the igbt transistors (the inverter). the electronic protection system includes temperature sensors with a response threshold of 70 °c. according to the operation algorithm of the electronic protection system, when the response temperature is reached the power transforming cascades are switched off. the control system constantly monitors the temperature with digital sensors. the data are processed by the microcontroller and, in accordance with the computed temperature, the actuation threshold of the current protection system is corrected with the help of digital potentiometers [2]. the system operates in a phased control mode and ensures a stable dynamic correlation between the value of the pulse current flowing through the key element and the temperature of its case. the current protection module within the inverter is implemented around a fast-acting comparator whose input receives a signal from the current sensor rs1. this signal is compared with the reference signal, and if they are equal the comparator switches over. the trigger t has been introduced into the electronic protection circuit to exclude transients at the moment of the comparator's switching and to keep the system's last logic status. the trigger is reset either by a signal from the microcontroller or when the power of the generator is turned off. when a logic-1 level is applied to the sd input of the bridge converter driver microcircuit, the generator is rapidly turned off.
this signal is formed by the current and temperature protection modules when the preset limits of the pulse current and the temperature of the igbt transistors are reached. the aim of the second circuit of current protection of the pulse power supply is to determine the value of the current in the load circuit (the discharge system) and, in case it goes beyond the permissible limit, to turn off the generator instantly. the protection response threshold in the second circuit is set 10–15 % higher than the protection response threshold in the third circuit, and corresponds to the maximum permissible value of the current in the load circuit (the discharge system). the third circuit is software operated and makes it possible to maintain the preset power level at the output of the generator. it operates in such a way that a signal from the current sensor rs2 goes to a dedicated input of the pulse generator's control system, forming a negative current feedback loop. the plasma discharge, as the load, is especially unstable and permits a short-term excess of the pulse current when the output parameters of the pulse generator are changed. if the permissible level of the pulse current in the load (the discharge system) is exceeded, the software decreases the amplitudes of the power pulses and the power transmitted to the discharge is stabilized. when the value of the pulse current in the load (the discharge system) drops back within the permissible limits, the amplitude of the power pulses is restored. in elaborating the current protection systems, optical data transfer systems with a switching time of up to 75 ns were applied, which ensures the retention interval necessary to turn the generator off. to obtain improved frequency characteristics, the modulator's key power cascade is made of several mosfet transistors connected in series (fig. 2).

figure 2. structure chart of the modulator.

the application of mosfet transistors is preferable because of their rather high switching frequency of the power voltage combined with a relatively small gate capacitance, which ensures the formation of fast switching trajectories [4]. each transistor is controlled by its own driver device. the driver of the lower transistor includes a current protection device which limits the pulse current in the load (the discharge system), and when the preset current limit is reached it turns off the system of control signal formation. the transistors must then be safely closed and the power circuit must be open. in order to ensure safe closing of the transistors, blocking resistors r1–r4 are installed between their gates and emitters. the limitation of the power pulse amplitude and protection against a potential breakdown of the gate-emitter junction is ensured by placing suppressors vd1–vd4 in parallel with the resistors. to limit the spurious discharge during the commutation of the modulator's power keys, a limiting snubber consisting of elements r6 and c2 has been installed. another purpose of the snubber is to limit the switching speed of the power transistors [6]. the switching speed of the transistors has to be limited because the driver control circuit is based on complementary field-effect transistor structures which, under certain conditions, may change to an unmanaged state, leading to a thermal or current breakdown of the driver device.
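the temperature-dependent correction of the current trip threshold can be illustrated with a short sketch. this is our own illustration of the idea, not the authors' firmware; the numbers follow the 2× derating at 100 °c and the 70 °c sensor response point quoted above, and the linear interpolation between them is our assumption:

```python
I_MAX_25C = 100.0   # permissible pulse current at 25 °c, amperes (illustrative value)

def current_threshold(case_temp_c: float) -> float:
    """derate the pulse-current trip threshold with case temperature:
    full rating at 25 °c, half rating at 100 °c (linear in between, assumed)."""
    if case_temp_c <= 25.0:
        return I_MAX_25C
    if case_temp_c >= 100.0:
        return 0.5 * I_MAX_25C
    frac = (case_temp_c - 25.0) / 75.0        # 0 at 25 °c, 1 at 100 °c
    return I_MAX_25C * (1.0 - 0.5 * frac)

def protection_trip(i_pulse: float, case_temp_c: float) -> bool:
    """block the generator when the sensed pulse current exceeds
    the temperature-corrected threshold."""
    return i_pulse > current_threshold(case_temp_c)

print(current_threshold(70.0))  # threshold at the 70 °c sensor response point
```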
3. conclusion

a fast-acting complex protection system for a pulse power generator operating on a plasma load has been elaborated. special attention has been paid to the circuit implementation of the protection paths against a potential breakdown and to the quick action of the device. the application of fast-acting optical data transfer systems within the current and thermal protection circuits makes it possible to block the operation of the pulse power generator within a time not exceeding 100 nanoseconds. the analysis of the correlation between the temperature of the transistor cases and the value of the pulse current flowing through them is carried out by a microprocessor control system, which makes it possible to correct the actuation threshold of the complex protection system during device operation.

references

[1] d. v. godun, s. v. bordusov, a. p. dostanko. the software-controlled generator of high-voltage impulses for plasma technological application. acta technica 56:332–337, 2011.
[2] d. v. godun, a. p. dostanko. control of high-speed igbt modules in a converter cascade of a high-power pulsed inverter. in: modern facilities of communication materials: works of the xvi international scientific and technical conference, p. 67. minsk, belarus, 2011.
[3] d. v. godun, a. p. dostanko, s. v. bordusov. stabilization and regulation of the output voltage of high-voltage sources for forming gas-discharge devices with the combined analogue-digital method of transformation. in: proceedings of the iii international scientific and technical conference "modern methods and technologies of creation and processing of materials", pp. 233–235. minsk, belarus, 2008.
[4] a. kolpakov. schematic capture of high-voltage power converters. power electronics 2:28–32, 2007.
[5] a. polischuk. schematic capture of modern high power sources for telecommunications equipment and systems of industrial automation. power electronics 2(1–4):70–74, 2005.
[6] k. suker. power electronics. developer's guide. dodeka-xxi, moscow, 2008.

acta polytechnica vol. 42 no. 1/2002
© czech technical university publishing house, http://ctn.cvut.cz/ap/

thermal comfort and optimum humidity part 2

m. v. jokl
the hydrothermal microclimate is the main component of indoor comfort. the optimum hydrothermal level can be ensured by suitable changes in the sources of heat and water vapor within the building, changes in the environment (the interior of the building) and in the people exposed to the conditions inside the building. a change in the heat source and the source of water vapor involves improving the heat insulating properties and the air permeability of the peripheral walls, and especially of the windows. a change in the environment will bring human bodies into balance with the environment. this can be expressed in terms of an optimum, or at least acceptable, globe temperature; an adequate proportion of radiant heat within the total amount of heat from the environment (defined by the difference between air and wall temperature); and uniform cooling of the human body by the environment, defined a) by the acceptable temperature difference between head and ankles, b) by acceptable temperature variations during a shift (location unchanged), or during movement from one location to another without a change of clothing. finally, a moisture balance between man and the environment is necessary (defined by acceptable relative air humidity). a change for human beings means a change of clothes, which of course is limited by social acceptance in summer and by inconvenient heaviness in winter. the principles of optimum heating and cooling, humidification and dehumidification are presented in this paper. hydrothermal comfort in an environment depends on the heat and humidity flows (heat and water vapor) occurring in a given space in a building interior and affecting the total state of the human organism.

keywords: thermal comfort, optimum humidity, hygienic standards.

1 hydrothermal microclimate optimization

an optimum hydrothermal microclimate can be ensured by suitable changes in a) the source of heat, cold and humidity, b) the environment, and c) the exposed subject, the user of the building. an analysis of the current situation by a computer simulation program is a suitable approach to the solution of the problem [4], [18].

1.1 changes in the source of heat, cold and humidity

the most effective method for stabilising indoor conditions is better insulation of the outer walls, because the outdoor environment is the greatest source of heat (in summer), cold (in winter) and humidity (all year round). outer wall insulation improves the thermal insulating properties and also the air tightness [20].

1.1.1 thermal insulating properties and air permeability of outer walls

the thermal properties of both walls and windows must be taken into account. windows, in particular, are usually the source of leaks through which heat escapes in winter and comes in during summer – see part 1, section 2.2. well insulated windows should be supplemented by outdoor louvres, which prevent excessive solar radiation into the interior in summer and decrease heat losses in winter. they form an insulating air layer in front of the window during the night, when the losses are greatest, and can reduce energy consumption by as much as 40 % [15] (fig. 1). louvres also protect the interior from external noise and can be used to improve indoor lighting (see fig. 2).

fig. 1: incorrect location of louvres (a) and the correct location (b) – from the outside

fig. 2: impact of the application of horizontal louvres on natural interior illumination (a without louvres, b with louvres)

thermal insulation also plays an important role in indoor air humidity. if it is inadequate, water vapor condensation occurs in low-temperature (below dew point) locations. at the same time, windows must allow sufficient ventilation, either by exfiltration, i.e. water vapor delivery to the exterior through windows that are not tight, or by controlled window opening, or by infiltration, i.e. delivery of outdoor air to the interior (see also the section on dehumidification).

1.2 changes in the environment

these changes are more expensive (in terms of capital and running costs). they involve heating and humidification in winter, and cooling and dehumidification in summer.

1.2.1 optimum heating

optimum heating in winter (more exactly, during the cold period of the year) (see table 6 in part 1) depends on the heat losses from a room (heat escaping from indoors to outdoors). optimum heating provides hydrothermal comfort, i.e., heating a) without drafts, b) with sufficient heat radiation, c) with individual control of the heat output.

heating without drafts

the provision of heating without drafts depends on the type of heating applied: heating with radiators or warm air heating.
the basic principle for heating with radiators is that the source of heat (the heating body) must be placed close to the source of cold (e.g., the window) in such a way that the impact of cold on the people in the room is eliminated. in practice, this involves placing the radiator under the window, not on an internal wall (fig. 3). from a heating body on an internal wall, the thermal (convective) stream goes up to the ceiling, bends around it and falls down along the outer wall, where it is cooled by the window. thus it changes into a cold draft that cannot be avoided by sealing the window. placing the heating body under the window changes this situation dramatically: the air cooled by the window drops onto the heating body, is warmed up and changes into a warm thermal stream rising upwards toward the ceiling. from the ceiling it creeps along the internal wall and floor, and a cold draft is avoided. of course, the heating body must extend along the width of the window. if the floor or ceiling is used as a heating body (with built-in heating pipes), the heating output must be increased considerably near the window (e.g., by increasing the density of the heating pipes). despite this provision a cold draft often occurs; that is the main problem of this type of heating in apartments and offices.

fig. 3: the impact of the location of a heating body on draft formation (a location without draft, b draft is coming from the window)

for warm air heating, the above pattern of air streams within a room is the best way to prevent the formation of a cold draft. the basic principle for warm air heating is that warm air must be released from above and must then go down, i.e., it must be supplied at the ceiling of a room. otherwise the warm air from the floor goes up, the stream is typically constricted, and cold air follows it into the room (fig. 4).

fig. 4: warm air inlet location for room heating (a correct, b incorrect, because the rising warm air streams are constricted)

heating with sufficient heat radiation

optimum heating also involves providing an adequate portion of radiant heat, i.e., the radiant comfort coefficient (rcc) must be at least one. this condition is fulfilled draft-free by respecting the basic principle of draft-free heating by radiators, i.e., locating them next to the source of cold. it is harder to fulfil the optimum rcc condition with warm air heating if convection heaters are not applied. thus the application of additional radiating sources is recommended with warm air heating, e.g., by installing an open fireplace in a living room.

heating with individual control

individual control means the ability to control the heating system in a room according to the individual requirements of the room user. for example, owing to their lower metabolic rate (heat production), women need a higher room temperature, while people suffering from high blood pressure require a lower room temperature. thermostatic valves enable the required temperature to be set and automatically maintained. personal control is a new type of individual control which maintains the required temperature in a given location in the room, e.g., in a working place. this can be done in two ways: by placing the heating body directly in the required location (mostly by using a remotely controlled air-conditioning unit) or by placing personally controlled warm air heating inlets in this area.
1.2.2 optimum air humidification

to prevent low air humidity, overheating of the room must be avoided: decreasing the air temperature into the optimum range is often sufficient to achieve the lower relative air humidity limit of 30 %. if it is necessary to achieve the optimum value (e.g., for children allergic to dry air), air humidification must be applied. special instruments, known as humidifiers, are used. humidifiers produced by a reputable company should be preferred, because some instruments produce microbes together with the water vapor. that is what may happen when a saucer with water is placed on a heating body; in addition, a saucer does not usually provide enough water for conditioning the air in the room [22].

1.2.3 optimum cooling

optimum cooling (usually a part of air conditioning) in summer (more exactly, during the warm period of the year) (see table 6 in part 1) depends on neutralising the heat gains in a room (neutralising the heat coming from outdoors to indoors) to provide hydrothermal comfort of the environment, i.e., cooling a) without drafts, b) with individual control of the cooling output. if possible, a ventilation system is preferred for cooling, by applying so-called hybrid ventilation. the radiant heat portion causes no problem in summer, rcc being automatically respected in almost all cases.

cooling without drafts

the provision of cooling without drafts depends on the type of cooling applied: cooling by air conditioning (packaged) units, cooling by a cooled ceiling, or cooling by a central air conditioning system. the basic principle for cooling by air conditioning units is that the units must be placed close to the source of warmth (e.g., a window) in such a way that the impact of warmth on the human body is eliminated. this means, in practice, placing the unit under the window (or above or to the side, but close to the window) in such a way that the heat radiated from the window onto people is compensated by the cold stream falling on the irradiated surface of the body. an air conditioning unit blowing cold air onto a non-irradiated body surface as a result of its unsuitable location can even increase the discomfort, because it increases the difference in the heat load on the body between the irradiated and non-irradiated sides. if the ceiling is used as a cooling body, warm air coming from the window near the ceiling is gradually cooled and then falls. this is more favourable than a central air conditioning system, because the air quality is not decreased (e.g., the aeroions are not damaged), no space is necessary for air ducts and machine rooms, and no power is needed for the fans, resulting in energy savings of about 20 % in comparison with air conditioning. for cooling by an air handling system, the pattern of air streams within the room is decisive for avoiding draft formation. the basic principle is that cold air must be let in from below and must move upwards, i.e., it must be supplied at the floor of a room. otherwise the cold air from the ceiling drops, the stream is constricted, and warm air follows it into the room (fig. 5).

fig. 5: cold air inlet location for room cooling (a correct, b incorrect, because the falling cold air streams are constricted)

cooling with individual control

individual control means being able to control the cooling system in a room according to the individual requirements of the room user (see also the section on heating with individual control). the required temperature is set and kept automatically, e.g., on a thermostatic valve located on the cooling water pipe before it enters the ceiling acting as a cooling body.
personal control is a new type of individual control which maintains the required temperature in a given location in a room, e.g., in a working place. this can be done in two ways: by placing a remotely controlled air conditioning unit directly in the required location, or by placing the inlets of the personally controlled central air conditioning system in this area. the operation of the unit can often be programmed, i.e., the required temperature can be changed during the day or week.

cooling by hybrid ventilation

hybrid ventilation is a combination of natural and mechanical ventilation which uses the advantages of both systems. during low outdoor temperatures (up to +7 °c without wind) it works as a quiet natural ventilation system, while during warm weather the fan automatically goes into operation, enabling full efficiency of the system [23] (fig. 6).

fig. 6: the principle of hybrid ventilation. it works either naturally – during a cold period, or mechanically – by operating a fan during a warm period to accumulate coolness at night. a solar collector can be applied to increase the chimney effect.

hybrid ventilation can also be used for storing coolness in the warm period of the year: if it is in operation all night, the interior of the building is cooled by the cold night air. thus in the morning and for a part of the day the rooms are pleasantly cool. to achieve successful cooling, the glazing of the facade should be reduced to 40 % or less, outdoor louvres should be installed, and the air should be changed about six times per hour at night.

1.2.4 optimum air dehumidification

special devices (known as dehumidifiers) are applied to remove humidity (e.g., for air conditioning of valuable works of art) [19], but in most cases ventilation is sufficient, i.e. the necessary air change within an interior. it is particularly important to ventilate bathrooms and kitchens in apartments (see table 2 in part 1).
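the air change rates used here (six changes per hour for night cooling, and the window-opening values in table 1 below) are simply the supplied airflow divided by the room volume. a minimal sketch of this arithmetic, assuming a 2.5 m ceiling height over the 80 m² floor area quoted in the table footnote (the ceiling height is our assumption):

```python
FLOOR_AREA_M2 = 80.0     # from the footnote of table 1
CEILING_HEIGHT_M = 2.5   # assumed; not stated in the source
ROOM_VOLUME_M3 = FLOOR_AREA_M2 * CEILING_HEIGHT_M  # 200 m^3

def air_change_per_hour(flow_m3_per_h: float) -> float:
    """air change rate = supplied air volume per hour / room volume."""
    return flow_m3_per_h / ROOM_VOLUME_M3

# a hung window with a 2 cm gap supplies up to 50 m^3/h -> 0.25 changes
# per hour, matching the first row of table 1
print(air_change_per_hour(50.0))
```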
the heating bodies or cooling units must be turned off (for about 30 minutes) before ventilation, especially if thermostatic valves have been installed (when a window is open, the heating or cooling water flow rises to its maximum level). ventilation (10 to 15 minutes) is recommended three or four times a day, with the window fully open (fig. 7). if an air handling system has been installed, there is no need for ventilation by windows. heat recovery is recommended: it uses the outcoming air to warm up the incoming air in winter and to cool it down in summer. in such a way about 50 % of the heat can be saved. however, some recovery systems reduce the indoor air quality. 1.3 human changes the simpliest way to achieve comfort is simply by changing the heat insulating properties of people clothing, i.e., by putting on or taking off the appropriate amount of clothing. however, the possibilities are limited because clothing cannot be decreased below the level accepted from a social point of view in summer, and too heavy clothing can detract from personal comfort in winter. this situation can be dramatically changed by the results of nasa space research. there is a new high-technology insulation layer consisting of small globes of a manmade material that are smaller than pin-heads. they are filled with paraffines mixtures of hydrocarbons. the phase of the paraffines changes into a solid or a liquid state, depending on the temperature. when they absorb heat, the paraffines melt (like a melting piece of ice), and when emanating heat they solidify. the temperature of the phase change can be set by the composition of the paraffine in the range from 0 °c up to 132 °c, i.e., a special layer can be designed for every purpose. two american companies have already started to put this technology application into everday use. frisby technologies © czech technical university publishing house http://ctn.cvut.cz/ap/ 31 acta polytechnica vol. 42 no. 1/2002 fig. 6: the principle of hybrid ventilation. it works either naturally – during a cold period, or mechanically – by operating a fan during a warm period to accumulate coolness at night. a solar collector can be applied to increase the chimney effect. window type (size 1 × 1.2 m) window opening air quantity m3/h air change per hour* hung window gap 2 cm to 50 0.25 hung window gap 6 cm to 130 0.65 hung window gap 12 cm to 220 1.1 swing window gap 6 cm to 180 0.9 swing window gap 12 cm to 280 1.4 swing window opening 90° to 800 4.0 opposite windows fully opened (cross ventilation) to 40 * for 80 m2 floor area table 1: air quantity and air change during window opening fig. 7: ventilation should be brief and intensive puts these small globes into a foam envelope, while outlast technologies have given them a textile cover. the resulting capsules must be designed in such a way that for all possible circumstances they remain paraffineproof. the first area of commercial application is for winter sportswear, especially gloves for snowboarders. the paraffine mixture is chosen in such a way that it is melted by the body temperature and solidifies at 0 °c. the hands produce heat during the sporting activity, and the capsule content slowly liquefies, and heat gradually accumulates. white going uphill on a drag lift the capsule content is slowly cooled, liquefies, and thus solidified with heat emanation to the skin, and the gloves heat the hands. before the capsule content solidifies even a long uphill ride can be completed. 
going down the hill, the capsule content is heated again by the body heat that is produced, and it melts. then the cycle begins again. the van de sport company in tettnagen, bavaria, offered special leisure wear, the winter collection of outlast technologies, at the international fair of sporting goods in munich. the company makes equipment for snowboarders, skiers and mountain climbers. despite the high prices (ski sportswear costing between 330 and 360 eur), the company was satisfied with its sales. frisby technologies in colorado has introduced some new applications: a special foam cover for vertical take-off airplane surfaces, turbine blade protection against high temperatures using microcapsules (so that expensive thermally stable alloying is not necessary), etc. this new technology clearly has great potential.

references

[1] centnerová, l.: ventilation of a family house. topenářství, instalace, no. 3/1999, pp. 64–65
[2] chlum, m., jokl, m., papež, k.: progressive ways of residential ventilation. společnost pro techniku prostředí, praha 1999, p. 75
[3] gullev, g.: allergy and indoor air climate. scanvac, no. 1/1999, pp. 8–9
[4] hensen, j., kabele, k.: application of system simulation to wch boiler selection. in: proceedings of 5th int. ibpsa conference, prague 1997, pp. 141–147
[5] jirák, z., jokl, m. v., bajgar, p.: long-term and short-term tolerable work-time in a hot environment: the limit values verification. int. j. of environmental health research, no. 7/1997, pp. 33–46
[6] jokl, m. v.: microenvironment: the theory and practice of indoor climate. thomas, illinois, u.s.a. 1989, p. 416
[7] jokl, m. v.: the stereothermometer: a new instrument for hydrothermal constituent nonuniformity evaluation. ashrae transactions 96, no. 3435/1990, pp. 13–15
[8] jokl, m. v.: stereothermometer for the evaluation of a hydrothermal microclimate within buildings. heizung/lüftung, klima, haustechnik 42, no. 1/1991, pp. 27–32
[9] jokl, m. v.: stereothermometer – an instrument for assessing the non-uniformity of the environmental hydrothermal constituent. čs. hygiena 36, no. 1/1991, pp. 14–24
[10] jokl, m. v.: some new trends in power conservation by thermal insulating properties of buildings. acta polytechnica, no. 8/1990, pp. 49–63
[11] jokl, m. v.: hydrothermal microclimate: a new system for evaluation of non-uniformity. building serv. eng. technol. 13, no. 4/1992, pp. 225–230
[12] jokl, m. v.: theory of the non-uniformity of the environment at hydrothermal constituent. stavební obzor 1, no. 4/1992, pp. 16–19
[13] jokl, m. v.: internal microclimate. czech technical university in prague, prague 1992, p. 182
[14] jokl, m. v.: the theory of the indoor environment of a building. czech technical university in prague, prague 1993, p. 261
[15] jokl, m. v.: energetics of building environsystems. czech technical university in prague, prague 1993, p. 148
[16] jokl, m. v., moos, p.: optimal globe temperature respecting human thermoregulatory range. ashrae transactions 95, no. 3288/1989, pp. 329–335
[17] jokl, m. v., moos, p.: the nonlinear solution of the thermal interaction of the human body with its environment. technical papers, tu prague, building construction series, ps no. 6/1992, pp. 15–24
[18] kabele, k., kadlecová, m., matoušovic, t., centnerová, l.: application of complex energy simulation in competition design of the czech embassy in ottawa. in: proceedings of 6th int. ibpsa conference, kyoto, japan 1999, pp.
249–255
[19] kadlecová, m.: internal microclimate in museums, galleries and exhibition rooms. dissertation, czech technical university in prague, prague 1992, p. 47
[20] kopřiva, m.: buildings saving energy. topenářství, instalace 33, no. 5/1999, pp. 98–99
[21] lunos lüftungsfibel. berlin 1999, p. 21
[22] papež, k.: ventilation and air conditioning – exercises. czech technical university in prague, prague 1993, p. 116
[23] sandberg, m.: hybrid ventilation, new word, new approach. swedish building research, no. 4/1999, pp. 2–3

miloslav v. jokl, ph.d., sc.d., university professor
phone: +420 2 2435 4432
fax: +420 2 3333 9961
e-mail: miloslav.jokl@fsv.cvut.cz
czech technical university in prague
faculty of civil engineering
thákurova 7
166 29 prague 6, czech republic

acta polytechnica 53(1):38–43, 2013
© czech technical university in prague, 2013
available online at http://ctn.cvut.cz/ap/

status and perspectives of the mini-megatortora wide-field monitoring system with high temporal resolution

sergey karpov a,∗, grigory beskin a, sergey bondar b, alexey perkov b, evgeny ivanov b, adriano guarnieri c, corrado bartolini c, giuseppe greco d, andy shearer e, vyacheslav sasyuk f

a special astrophysical observatory of the russian academy of sciences
b institute for precise instrumentation, russia
c bologna university, italy
d astronomical observatory of bologna, inaf, italy
e national university of ireland, galway, ireland
f kazan federal university, kazan, russia
∗ corresponding author: karpov@sao.ru

abstract. here we briefly summarize our long-term experience of constructing and operating wide-field monitoring cameras with sub-second temporal resolution to look for optical components of grbs, fast-moving satellites and meteors. the general hardware requirements for these systems are discussed, along with algorithms for real-time detection and classification of various kinds of short optical transients. we also give a status report on the next generation, the megatortora multi-objective and transforming monitoring system, whose 6-channel (mini-megatortora-spain) and 9-channel (mini-megatortora-kazan) prototypes we have been building at sao ras. this system combines a wide field of view with subsecond temporal resolution in monitoring regime, and is able, within fractions of a second, to reconfigure itself to follow-up mode, which has better sensitivity and simultaneously provides multi-color and polarimetric information on detected transients.

keywords: gamma-ray bursts, high time resolution, wide field photometry.

1. introduction

the systematic study of night sky variability on subsecond time scales remains an important but practically unsolved problem. the need for such a study for the purpose of searching for non-stationary objects with unknown localization was noted by [10]. such studies have been performed [24, 25], but due to technical limitations it has only been possible either to achieve a high temporal resolution of tens of milliseconds in monitoring of 5′–10′ fields, or to use 5–10 second time resolution in wider fields. the wide-field monitoring systems currently in operation, such as widget [27], raptor [11], bootes [13] and pi of the sky [12], while having good sky coverage and limiting magnitude, lack temporal resolution, which significantly lowers their performance in the study of transient events of subsecond duration.
optical transients of unknown localization may be very short. for example, the rise times of flashes of some uv cet-like stars may be as short as 0.2–0.5 seconds [26], 30 % of grbs are shorter than 2 seconds in duration, and details of their light curves may be seen on time scales shorter than 1 ms [20]. also of great interest are observations of very fast meteors (faster than 200 km/s), which may be of extra-solar-system origin [1]. slower meteors should also be studied with high temporal resolution, due to their short duration (typically just a fraction of a second) and the necessity to observe them on at least two consecutive frames to reliably recover their angular velocity. monitoring of near-earth space is another task that requires wide-field observations with high temporal resolution. a number of satellites, and also a vast amount of small pieces of space debris, have rapidly evolving trajectories, and are therefore difficult to observe by typical narrow-field satellite tracking methods based on extrapolation of previously known trajectories. high temporal resolution is needed here due to the fast motion of these objects in the sky, up to several degrees per second for low-orbit objects, and due to the fast decrease in the detection limit when there is significant motion of the source over the exposure time, as its flux is spread across several pixels along the trail. it has been proposed [9] that the large low-quality mosaic mirrors of air cerenkov telescopes should be used to study the variability of large sky areas on such time scales. however, we have demonstrated in [16, 30] that subsecond temporal resolution can be achieved in a reasonably wide field with small telescopes equipped with fast ccds, to perform fully automatic searching and classification of fast optical transients. in addition, a two-telescope scheme [3, 15] has been proposed which is able to study these transients within a very short time after detection. in accordance with these ideas, we created the prototype of the favor fast wide-field camera [16] and the tortora camera as part of the tortorem two-telescope complex [21], and we have operated them over a period of several years. the recent discovery of the brightest ever grb, grb 080319b (the naked-eye burst [23]), by several wide-field monitoring systems (tortora [17], raptor [28] and pi of the sky [14]), and the subsequent discovery of its fast optical variability [7] on time scales from several seconds down to the sub-second domain [8], have demonstrated that the ideas behind our efforts in fast temporal resolution wide-field monitoring are correct.

name            fov (degrees)   τ (seconds)   limit (magnitudes)
widget          62 × 62         5             10m
raptor a/b      40 × 40         60            12m
raptor q        180 × 180       10            10m
π of the sky*   40 × 40         10            12.5m
aroma-w         25 × 35         5–100         10.5m–13m
master-vwf      20 × 21         5             11.5m
master-net      30 × 30         1             9m
favor           16 × 24         0.13          10m–11.5m
tortora         24 × 32         0.13          9m–10.5m
mmt             30 × 30         0.13–1300     12.5m–17.7m

* field of view is for a 4-objective unit in wide-field mode; there is also a 2-objective unit in operation now.

table 1. wide-field monitoring cameras currently in operation, with field of view (fov) size, temporal resolution τ and the detection limit claimed by the authors. for favor, tortora and mini-megatortora the limits correspond to 3σ detection on a single frame, and may differ from their real-time operational values due to non-ergodic pixel statistics when using the image intensifier.
2. general requirements for wide-field monitoring

typical follow-up observations performed for a detailed study of newly discovered transients require no more than a good robotic telescope with fast repointing. however, instruments of this kind will inevitably only begin to capture data after the event has been in progress for a few seconds or tens of seconds. to obtain information from the start of the event, which is essential for understanding the nature and properties of transients, one needs to observe the position of the transient before it appears. as transients occur in unpredictable places, systematic monitoring of large sky regions is therefore an important task. for monitoring of this kind, one needs to select the optimal set of mutually exclusive parameters: the angular size of the field of view, the limiting magnitude and the temporal resolution. indeed, the area of the sky $\Omega$ covered by an objective with diameter $d$ and focal length $f$, equipped with an $n \times n$ pixel ccd with pixel size $l$ and exposure time of $\tau$ seconds, is

$$\Omega \propto \frac{n^2 l^2}{f^2}, \quad (1)$$

while the faintest detectable object flux, for sky background noise dominating over the ccd read-out noise, is

$$\mathrm{flux}_{\min} \propto \left(\frac{d}{f}\right) d^{-2}\, l\, \tau^{-1/2}. \quad (2)$$

for the case of ccd read-out noise $\sigma$ domination, the limit is

$$\mathrm{flux}_{\min} \propto \frac{\sigma}{d^2 \tau}. \quad (3)$$

the number of detectable events, uniformly distributed in euclidean space, is

$$\mathrm{number} \propto \Omega \cdot \mathrm{flux}_{\min}^{-3/2} = d \left(\frac{d}{f}\right)^{1/2} \tau^{3/4}\, n^2\, l^{1/2} \quad (4)$$

when the duration of the event $t$ (or, more accurately, the duration of its peak, where the flux may be treated as a constant) is longer than the exposure, and

$$\mathrm{number} \propto \Omega \cdot \mathrm{flux}_{\min}^{-3/2} \left(\frac{\tau}{t}\right)^{-3/2} = d \left(\frac{d}{f}\right)^{1/2} t^{3/2}\, \tau^{-3/4}\, n^2\, l^{1/2} \quad (5)$$

when it is shorter; as $\mathrm{flux}_{\min}$ decreases, one can detect a larger number of events in a greater volume. thus, high temporal resolution is essential in detecting and analysing short optical transients. however, it requires the application of fast ccd matrices, which usually have large read-out noise, and this limits the detection of faint objects. most of the general-purpose wide-field monitoring systems currently in operation, listed in table 1, have chosen a large field of view while sacrificing temporal resolution to achieve a decent detection limit. by contrast, our cameras, starting with the favor prototype [16, 30], chose high temporal resolution as the key parameter.
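the trade-off encoded in (4) and (5) is easy to examine numerically. the following sketch (our own illustration; units are arbitrary, since only proportionalities are given) compares the relative number of detectable events as the exposure τ is varied around a fixed event duration t, with the optics held fixed:

```python
def relative_event_rate(tau: float, t: float) -> float:
    """relative number of detectable events for fixed optics, eqs. (4)-(5):
    tau**(3/4) while the event outlasts the exposure, t**(3/2) * tau**(-3/4) otherwise."""
    if t >= tau:
        return tau ** 0.75            # eq. (4): longer exposures still help
    return t ** 1.5 * tau ** -0.75    # eq. (5): longer exposures now hurt

# a 0.5 s transient observed with different exposure times:
# the rate peaks when the exposure matches the event duration
for tau in (0.13, 0.5, 5.0, 60.0):
    print(f"tau = {tau:6.2f} s -> relative rate {relative_event_rate(tau, 0.5):.3f}")
```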
the only possible way to further improve the parameters simultaneously is to design a multi-objective monitoring system, where the detection limit is improved by decreasing the angular pixel size [6] and the field of view is improved by pointing several identical channels towards different regions of the sky. to operate in a sky background dominated regime, the ccd read-out noise may be suppressed by a high quantum efficiency image intensifier, or by using low-noise em-ccd or scmos as a detector. multi-objective design also gives freedom in the regimes of operation, as the fields of view of the channels may be either separated or combined, either with the same photometric (or even polarimetric) filter or with a combination of different filters. the megatortora project [5] is being developed along these lines. it utilizes a modular design and consists of a set of basic units, 9 objectives each, installed on separate mounts. each objective in a unit is placed inside the gimbal suspension with remotely-controlled micro-motors, and may therefore be oriented independently from the others. in addition, each objective possesses a set of color and polarization filters that can be installed before the objective on the fly. this enables modes of observation to be changed on the fly, from routine wide-field monitoring in the color band, which provides the best signal-to-noise ratio (or in white light, with no filters installed), to the narrowfield follow-up regime, when all objectives are pointed towards the same point, i.e., a newly-discovered transient, and observe it simultaneously in different colors and for different polarization plane orientations, to acquire all possible kinds of information on the transient (see figure 1). there can also be simultaneous observation of the transient by all objectives in white light, inorder to obtain better photometric accuracy by co-adding frames. each objective is equipped with the fast em-ccd, which has low readout noise even for high frame rates when internal amplification is in effect. the data from each channel of this system, which is roughly 20 megabytes per second, is collected by a dedicated rackmount pc, which stores it in its hard-drive and also processes the data in real time in a way similar to the current processing pipeline of the favor and tortora cameras, which currently operate under a similar data flow rate. the whole system is coordinated by the central server, which acquires the transient data from data-processing pcs and controls the pointing and mode of operation of all objectives in response to them. 4. mini-megatortora as a megatortora prototype as a prototype of the megatortora concept, we designed the mini-megatortora, which is basically a model of a 3 × 3 unit. the main design choice was to use the celostate in a gimbal suspension for fast repointing of each channel. this decision allows us to significantly loosen the requirements for the structural, dynamical and precision parameters. we are building two variants of mini-megatortora with different detectors (an image intensifier with fast ccd for mmt-spain, and a low-noise scmos for mmt-kazan) and, therefore, with slightly different parameters. both variants use the canon ef85 f/1.2 lens as the main objective and celostate mirrors for fast (faster than 0.3 s) repointing in the ±20° region of the sky. the optical design of the first variant is analogous to the design used in the favor [16] and tortora [21] systems, but with a non-scaling image intensifier, see figure 2. 
for the second variant, the design is simplified, and lacks the image intensifier and the transmission optics. the detector of the first variant image intensifier for spain is based on a fast sony ix285al ccd chip with a 6.4 µm pixel and 0.13 s exposure in a continuous acquisition regime, which gives 7.5 1392 × 1036 40 vol. 53 no. 1/2013 mini-megatortora wide-field monitoring system figure 2. optical scheme of a single channel for the mmt-spain variant. the main objective is surrounded by two menisci to compensate the optical distortions of the thick glass on the input of the image intensifier. frames per second with 12-bit depth. the non-scaling image intensifier has quantum efficiency of about 25 %, and the amplified image from its output window is transferred to ccd by transmission optics, which downscales it 1.7 times; the resulting pixel scale is 25′′ per pixel, and the total field of view of a channel is about 100 square degrees. the high image intensifier amplification (of ∼ 150) overcomes the ccd read-out noise, but induces its own spatially-correlated and highly non-poissonian shot-noise due to ions striking the photocathode events. the resulting limiting magnitude in differential imaging mode is about b ∼ 12m; it is somewhat worse in the direct imaging regime, due to the spatial correlation of the dominant image intensifier noise. in addition, direct imaging suffers from the non-uniform spatial sensitivity of the image intensifier microchannel plates, which makes it very important to perform proper flatfielding. each channel is therefore equipped with its own flatfielding module, consisting of a dull surface on the inner part of the lid and dedicated photodiodes. due to financial limitations, we are building only 6 channels for this variant, which will simultaneously provide imaging in only two photometric filters (which, of course, may be selected arbitrarily from the three available filters) and three polarimetric filters. the mechanical scheme of the channel for this variant is shown in figure 3. mmt-spain will be installed at the el-arenosillo atmospheric station in huelva, spain in fall 2013. the second variant, mmt-kazan, is equipped with andor neo scmos, which has 2560 × 2160 6.4µm pixels with 16-bit depth. due to limitations of the pc processing power, and also limited available hard drive space, we decided to operate it in a 10 frames per second regime, which still provides us with ∼ 3 tb of data per night. the quantum efficiency is about 55 %, with read-out noise as low as 1e−. the pixel scale is about 15′′ per pixel, and the channel field of view is about 100 square degrees. the limiting magnitude of the channel will be about b ∼ 12.5m in 0.1 s, in both differential and direct imaging. mmt-kazan will be installed at the engelgardt obfigure 3. scheme of the single channel mechanical design of the mmt-spain variant. servatory of kazan federal university, kazan, russia in summer 2013. both variants of mmt will use custom fork mounts based on a skywatcher eq6 head, each carrying two channels simultaneously. each channel will be controlled by a dedicated linux-based pc performing data acquisition (due to the absence of linux drivers, for mmt-spain we use one more windows-based image grabber pc per channel, connected to data processing one via dedicated gigabit ethernet cable), raw data storage, real-time data processing, as well as controlling the state of the filters and the celostate of the channel. 
the operation of the complex as a whole is controlled by a central server which collects information on transients detected by the channels and issues commands for repointing all of them towards a transient for follow-up, choosing an appropriate combination of color and polarimetric filters based on the brightness of the transient. it also checks the hardware state of the channels, the weather conditions and the day/night status, starting and stopping the routine monitoring accordingly.

5. strategy of mini-megatortora operation

mini-megatortora will perform routine observations of all the available sky in wide-field regime, which gives a ∼ 900 square degree field of view for mmt-kazan. it will spend up to 20 minutes on each spot, selected to follow as much as possible the fields of view of space-borne gamma-ray telescopes according to information from the gcn network [2], while avoiding regions currently being observed by other monitoring systems such as "pi of the sky" [19], regions close to the moon or the horizon, and regions recently observed by the complex itself. in 8 hours of a typical dark night it will cover up to ∼ 20000 square degrees, nearly half of the whole sky, and will return to each spot in about one day. on each spot, each channel will collect about 10000 frames, which will make it possible to study its variability on different time scales with different limits by co-adding consecutive frames (this co-addition will not be subject to coordinate re-binning and varying spatial sensitivity problems, as all frames are collected consecutively on the same detector imaging the same sky region with sufficiently good telescope tracking); co-adding of 100 frames may improve the limit by up to 2.5m, while co-adding of 10000 frames may improve the limit by up to 5m, depending on the temporal stability of the detector and the sky conditions, and also on the quality of the flatfielding and dark frames. frames co-added by 100 will be stored permanently to form a time-domain atlas of the sky for further study, along with a time-domain photometric catalogue formed by measurements by means of fast aperture photometry (on a 100 frames / 10 s time scale, down to b ∼ 15m for mmt-kazan) or slower psf-fitting photometry (on a 10000 frames / 1000 s time scale, down to b ∼ 17.5m for mmt-kazan). this catalogue will make it possible to study the variability of various classes of objects on time scales from 10 seconds to years, and also to detect slowly moving objects. compared with existing data from the asas-3 [22] and nsvs [29] surveys, which have similar detection limits, we may expect up to 15–20 million objects to be covered, with ∼ 100000 of them variable, and probably new classes of variable objects to be discovered due to the better temporal resolution and cadence. real-time data processing, based on fast differential imaging and interlinking of events on several consecutive frames [4, 18], will allow us to detect both fast flashes (with durations longer than 0.3 s) and rapidly moving satellites (with velocities up to half a degree per second), as well as meteors (even meteors appearing on a single frame, as they are selected on the basis of their elongated shape), and roughly classify them on the fly. for transients, the light curve and coordinates will be stored, while for satellites the trajectories will also be stored for further processing by more sophisticated methods in the daytime.
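the co-addition gains quoted above follow from √n averaging of the background noise: the limiting-magnitude improvement is 2.5 log10 √n = 1.25 log10 n in the ideal case. a one-line check of the 100-frame and 10000-frame figures:

```python
import math

def coadd_gain_mag(n_frames: int) -> float:
    """ideal limiting-magnitude gain from co-adding n frames (snr grows as sqrt(n))."""
    return 2.5 * math.log10(math.sqrt(n_frames))

print(coadd_gain_mag(100))    # 2.5 mag, the 10 s co-added limit quoted above
print(coadd_gain_mag(10000))  # 5.0 mag, the 1000 s co-added limit
```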
if the transient is bright enough, and is not coincident with a known satellite or a bright star, the complex may be reconfigured to follow it up, pointing all the channels towards it and installing some combination of color and polarimetric filters to acquire both photometric and polarimetric information. if all 9 channels are equipped with the same color filter, frame co-addition may add up to 1m to the complex sensitivity, while in three-color mode it may add up to 0.6m. in polarimetric mode the limit is nearly the same as in the single-channel regime, due to the light losses on the polarimetric filters. the expected accuracy of the polarimetry is about 10 % at 10m and about 1 % at 5m.

acknowledgements

this work was supported by bologna university progetti pluriennali 2003, by grants of crdf (no. rp1-2394-mo02), rfbr (no. 04-02-17555, 06-02-08313, 09-02-12053 and 12-02-00743-a), intas (04-78-7366), by the presidium of the russian academy of sciences program, by a grant of the president of the russian federation in support of young russian scientists and by a european union grant (fp7 grant number 283783, gloria project). the construction of mmt-kazan is being financed by kazan federal university. s.k. has also been supported by a grant from the dynasty foundation. g.b. thanks landau network-centro volta and the cariplo foundation for a fellowship and the brera observatory for hospitality. we thank emilio molinari, stefano covino and cristiano guidorzi for technical help in organizing tortora observations and for discussions of the results.

references

[1] v. l. afanasiev, v. v. kalenichenko, i. d. karachentsev. detection of an intergalactic meteor particle with the 6-m telescope. astrophysical bulletin 62:301–310, 2007.
[2] s. d. barthelmy. observing strategies using gcn. american institute of physics conference series 428:129–133, 1998.
[3] g. beskin, v. bad'in, a. biryukov, et al. favor (fast variability optical registration) – a two-telescope complex for detection and investigation of short optical transients. nuovo cimento c 28:751–754, 2005.
[4] g. beskin, a. biryukov, s. bondar, et al. software for detection of optical transients in observations with rapid wide-field camera. astronomische nachrichten 325(6):676–676, 2004.
[5] g. beskin, s. bondar, s. karpov, et al. from tortora to megatortora – results and prospects of search for fast optical transients. advances in astronomy 2010:171569, 2010.
[6] g. beskin, v. de-bur, s. karpov, et al. search for optical signals from extra-terrestrial intelligence at sao ras: past, present and future. bulletin of the special astrophysical observatory 60-61:217–225, 2007.
[7] g. beskin, s. karpov, s. bondar, et al. tortora discovery of naked-eye burst fast optical variability. american institute of physics conference series 1065:251–254, 2008.
[8] g. beskin, s. karpov, s. bondar, et al. fast optical variability of a naked-eye burst – manifestation of the periodic activity of an internal engine. apj 719:l10–l14, 2010.
[9] g. m. beskin, v. plokhotnichenko, c. bartolini, et al. catching the light curve of flaring grbs: the opportunity offered by scanning telescopes. a&as 138:589–590, 1999.
[10] h. bondi. astronomy of the future. qjras 11:443, 1970.
[11] k. n. borozdin, s. p. brumby, m. c. galassi, et al. real-time detection of optical transients with raptor. proceedings of the spie 4847:344–353, 2002.
[12] a. burd, m. cwiok, h. czyrkowski, et al.
pi of the sky – all-sky, real-time search for fast optical transients. new astronomy 10:409–416, 2005.
[13] a. j. castro-tirado, j. soldán, m. bernas, et al. the burst observer and optical transient exploring system (bootes). a&as 138:583–585, 1999.
[14] m. cwiok, w. dominik, g. kasprowicz, et al. grb 080319b prompt optical observation by pi-of-the-sky. grb coordinates network circular 7439:1, 2008.
[15] s. karpov, d. bad'in, g. beskin, et al. favor (fast variability optical registration) two-telescope complex for detection and investigation of short optical transients. astronomische nachrichten 325:677–677, 2004.
[16] s. karpov, g. beskin, a. biryukov, et al. optical camera with high temporal resolution to search for transients in the wide field. nuovo cimento c 28:747–750, 2005.
[17] s. karpov, g. beskin, s. bondar, et al. grb 080319b: tortora synchronous observation. grb coordinates network circular 7452:1, 2008.
[18] s. karpov, g. beskin, s. bondar, et al. wide and fast: monitoring the sky in subsecond domain with the favor and tortora cameras. advances in astronomy 2010:784141, 2010.
[19] s. karpov, m. sokolowski, e. gorbovskoy. all sky coordination initiative – simple service for wide-field monitoring systems to cooperate in searching for fast optical transients. astronomical society of india conference series, in press, 2012.
[20] s. mcbreen, f. quilligan, b. mcbreen, et al. temporal properties of the short gamma-ray bursts. a&a 380:l31–l34, 2001.
[21] e. molinari, s. bondar, s. karpov, et al. tortorem: two-telescope complex for detection and investigation of optical transients. nuovo cimento b 121:1525–1526, 2006.
[22] g. pojmanski. the all sky automated survey. catalog of variable stars. i. 0h–6h quarter of the southern hemisphere. acta astronomica 52:397–427, 2002.
[23] j. l. racusin, n. gehrels, s. t. holland, et al. grb 080319b: swift detection of an intense burst with a bright optical counterpart. grb coordinates network circular 7427:1, 2008.
[24] b. schaefer. celestial optical flash rate predictions and observations. aj 11:1363–1369, 1985.
[25] b. schaefer. optical flash background rates. a&a 174:338–343, 1987.
[26] v. f. shvartsman, g. m. beskin, r. e. gershberg, et al. minimum rise times in uv-ceti type flares. soviet astronomy letters 14:97, 1988.
[27] t. tamagawa, f. usui, y. urata, et al. the search for optical emission on and before the grb trigger with the widget telescope. nuovo cimento c 28:771–774, 2005.
[28] p. wozniak, w. t. vestrand, j. wren, h. davis. grb 080319b: raptor observations of a naked eye burst. grb coordinates network circular 7464:1, 2008.
[29] p. r. woźniak, w. t. vestrand, c. w. akerlof, et al. northern sky variability survey: public data release. aj 127:2436–2449, 2004.
[30] i. zolotukhin, g. beskin, a. biryukov, et al. optical camera with high temporal resolution to search for transients in the wide field. astronomische nachrichten 325:675–675, 2004.

acta polytechnica 53(3):302–305, 2013
© czech technical university in prague, 2013, available online at http://ctn.cvut.cz/ap/

emmy noether and linear evolution equations
p. g. l. leach^{a,b,*}
a school of mathematical sciences, university of kwazulu-natal, private bag x54001, durban 4000, republic of south africa
b department of mathematics and statistics, university of cyprus, lefkosia, republic of cyprus
* corresponding author: leachp@ukzn.ac.za

abstract. noether's theorem relates the action integral of a lagrangian with the symmetries which leave it invariant and the first integrals consequent upon the variational principle and the existence of the symmetries. these each have an equivalent in the schrödinger equation corresponding to the lagrangian and, by extension, in linear evolution equations in general. the implications of these connections are investigated.

keywords: lagrangian, hamiltonian, schrödinger, black-scholes-merton.

1. noether's theorem

in 1918 emmy noether [13] presented a theorem in the festschrift to mark the diamond jubilee of the thesis of felix klein. her presentation dealt with a functional of several independent variables. given our present interest, we quote the theorem for one independent variable, one dependent variable and a first-order lagrangian. if the action integral

$$a = \int_{t_0}^{t_1} l(t, x, \dot{x})\,\mathrm{d}t$$

is invariant under the infinitesimal transformation generated by $\gamma = \tau\partial_t + \eta\partial_x$, then there exists a function $f(t,x)$ such that

$$\dot{f} = \tau\frac{\partial l}{\partial t} + \eta\frac{\partial l}{\partial x} + (\dot{\eta} - \dot{x}\dot{\tau})\frac{\partial l}{\partial \dot{x}} + \dot{\tau}\,l.$$

in the present context the functions τ and η are to be considered as depending upon t and x only, but in a more general context they can also depend upon ẋ. the function f(t,x) is the variation induced in a by the variation in the limits of the action integral; thus it is called a boundary term, despite the earnest efforts of successors to describe it as a gauge function. when the variational principle is invoked, it follows that there is a first integral given by

$$i = f - \Bigl(\tau l + (\eta - \dot{x}\tau)\frac{\partial l}{\partial \dot{x}}\Bigr).$$

over the approximately 90 years since noether presented her theorem there have been many efforts to misquote it. as a consequence of these various failures to understand the quite clear exposition in noether's paper, various stratagems have been advanced to remedy the perceived deficiencies of the theorem. generally speaking, these advances were not necessary.

2. quantisation

around 1835 hamilton essentially reduced the study of second-order equations to first-order equations by the introduction – indeed, in light of newton's second law, reintroduction – of momentum as a variable conjugate to position. the momentum was defined according to

$$p = \frac{\partial l}{\partial \dot{x}}$$

and a new function, nowadays called the hamiltonian, was introduced according to the formula $h = p\dot{x} - l$. a good deal of interesting theory was developed and became the substance of hamiltonian mechanics. however, our present interest is the observation by dirac in 1926 of the resemblance between the operators needed for quantum mechanics and the canonical variables of hamiltonian mechanics with their poisson brackets. thus the schrödinger equation could be written as

$$i\hbar\,\frac{\partial u}{\partial t} = \hat{h}u,$$

where ĥ is the operator obtained by replacing the variables x and p by the operators x̂ and p̂. a matter which should be of some interest is the nature of the function h which can be used as an operator in the schrödinger equation. this is not a trivial question, for it must be borne in mind that hamiltonian mechanics is a reformulation of lagrangian mechanics, which is based upon newtonian mechanics, and the fundamental object of the last is newton's equation of motion.
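as a simple worked instance of the theorem of section 1 (supplied here for concreteness; the computation is standard), take the free-particle lagrangian and the generator of time translations:

```latex
% free particle: $l = \tfrac{1}{2}\dot{x}^{2}$, generator $\gamma = \partial_t$,
% so $\tau = 1$, $\eta = 0$.  the boundary-term condition gives
\dot{f} = \tau\frac{\partial l}{\partial t} + \eta\frac{\partial l}{\partial x}
        + (\dot{\eta}-\dot{x}\dot{\tau})\frac{\partial l}{\partial \dot{x}}
        + \dot{\tau}\,l = 0
\quad\Longrightarrow\quad f = 0,
% and the first integral is
i = f - \Bigl(\tau l + (\eta-\dot{x}\tau)\frac{\partial l}{\partial \dot{x}}\Bigr)
  = -\bigl(\tfrac{1}{2}\dot{x}^{2} - \dot{x}^{2}\bigr)
  = \tfrac{1}{2}\dot{x}^{2}.
```

the integral is the conserved energy, which is precisely the quantity dirac takes as the hamiltonian in the discussion that follows.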
whether one applies hamilton's equations or the euler–lagrange equation, the end result should be newton's equation. it is evident from dirac's book [4] that he considered the appropriate hamiltonian to be the energy, and a conserved energy at that. this marks a strong constraint upon the nature of the hamiltonian and hence the lagrangian. as a simple example consider the two lagrangians for the free particle given by

$$l_1 = \tfrac{1}{2}\dot{x}^2 \qquad \text{and} \qquad l_2 = \frac{1}{\dot{x}}.$$

both give rise to the newtonian equation ẍ = 0, and both have the maximal number of noether symmetries, five, for the lagrangian of a one-degree-of-freedom system. the former represents the energy. why choose one above the other? if we do the usual trick of setting planck's constant to unity in the schrödinger equation for the free particle, obtained by using the hamiltonian operator representing the energy, the equation is

$$2i\,\frac{\partial u}{\partial t} + \frac{\partial^2 u}{\partial x^2} = 0.$$

a calculation of the lie point symmetries results in

$$\gamma_1 = \partial_t, \quad \gamma_2 = 2t\partial_t + x\partial_x, \quad \gamma_3 = 2t^2\partial_t + 2tx\partial_x + (ix^2 - t)u\partial_u, \quad \gamma_4 = \partial_x, \quad \gamma_5 = t\partial_x + ixu\partial_u, \quad \gamma_6 = u\partial_u, \quad \gamma_\infty = f(t,x)\partial_u,$$

where the coefficient function in γ∞ represents a solution of the original equation and shall not be of further interest here. the symmetry γ6 is a consequence of the homogeneity of the equation in u. the remaining five symmetries are closely related to the five noether point symmetries of the lagrangian $l = \tfrac{1}{2}\dot{x}^2$; the latter, with their boundary terms, are given in table 1.

table 1. noether point symmetries (symmetry / boundary term).

  σ1 = ∂t               f1 = 0
  σ2 = t∂t + ½x∂x       f2 = 0
  σ3 = t²∂t + tx∂x      f3 = ½x²
  σ4 = ∂x               f4 = 0
  σ5 = t∂x              f5 = x
                        f6 = 1

the two sets are not quite the same, in that the boundary terms and the coefficients of u∂u are not identical, but they are suggestive of some connection. this connection is made clearer when one writes

$$\gamma = a(t)\partial_t + \bigl(\tfrac{1}{2}\dot{a}(t)x + b(t)\bigr)\partial_x + \bigl(\tfrac{1}{4}i\ddot{a}(t)x^2 + i\dot{b}(t)x + c(t)\bigr)u\partial_u$$

and

$$\sigma = a(t)\partial_t + \bigl(\tfrac{1}{2}\dot{a}(t)x + b(t)\bigr)\partial_x \quad \text{with} \quad f(t,x) = \tfrac{1}{4}\ddot{a}(t)x^2 + \dot{b}(t)x + c(t),$$

where in the latter case c(t) = c₀ and in the former case c(t) = c₀ − ¼ȧ(t). there is a precise connection.

3. the heat equation

the schrödinger equation for the free particle,

$$2i\,\frac{\partial u}{\partial t} + \frac{\partial^2 u}{\partial x^2} = 0,$$

may be rewritten as

$$\frac{\partial u}{\partial (it/2)} - \frac{\partial^2 u}{\partial x^2} = 0,$$

so that the change of variable it/2 → t̃ gives the classical heat equation

$$\frac{\partial u}{\partial \tilde{t}} - \frac{\partial^2 u}{\partial x^2} = 0.$$

since the transformation is a point transformation, the number of lie point symmetries is unchanged. specifically they are

$$\Delta_1 = \partial_t, \quad \Delta_2 = 2t\partial_t + x\partial_x, \quad \Delta_3 = 4t^2\partial_t + 4tx\partial_x - (2t + x^2)u\partial_u, \quad \Delta_4 = \partial_x, \quad \Delta_5 = 2t\partial_x - xu\partial_u, \quad \Delta_6 = u\partial_u, \quad \Delta_\infty = f(t,x)\partial_u,$$

where, as has been the case above, f(t,x) is any solution of the heat equation. the lie algebra of the symmetries is $\{sl(2,r) \oplus_s w_3\} \oplus_s \infty a_1$. the symmetries ∆1 and ∆2 play a role similar to the creation and annihilation operators of the quantal simple harmonic oscillator; they have been used to generate heat polynomials [8].

4. the wonderful world of finance

approximately 40 years ago black and scholes [1, 2] and, independently, merton [9–12] developed a model for the pricing of options. at the time it was remarked that the subject under investigation was of no great importance, since trading in options was a minor feature of the financial markets. a later observation, that the model applies to any financial instrument whose future value is uncertain at the present time, certainly extended the relevance of the equation proposed.
the black-scholes-merton equation,

$$u_t + \tfrac{1}{2}\sigma^2 x^2 u_{xx} + rxu_x - ru = 0, \qquad (*)$$

was originally presented by black and scholes, but merton's name is usually added to indicate his contribution to the development of this part of the theory of financial mathematics. this model is the precursor of the many evolution partial differential equations which have been derived in the modelling of various financial processes. basically it has to do with the pricing of options, but anything vaguely connected, such as corporate debt, is equally grist for its mill. the symmetry analysis of (∗) was first undertaken by gazizov and ibragimov [5]. after determining the symmetries, they obtained the solution for an initial condition given by a delta function, which is a typical initial condition for the heat equation.¹ a more typical problem is the solution of (∗) subject to what is known as a terminal condition, i.e. $u(t,x) = U$ when $t = T$. the lie point symmetries of (∗) are

$$\gamma_1 = x\partial_x,$$
$$\gamma_2 = 2tx\partial_x + \Bigl\{t - \frac{2}{\sigma^2}(rt - \log x)\Bigr\}u\partial_u,$$
$$\gamma_3 = u\partial_u,$$
$$\gamma_4 = \partial_t,$$
$$\gamma_5 = 8t\partial_t + 4x\log x\,\partial_x + \Bigl\{4tr + \sigma^2 t + 2\log x + \frac{4r}{\sigma^2}(rt - \log x)\Bigr\}u\partial_u,$$
$$\gamma_6 = 8t^2\partial_t + 8tx\log x\,\partial_x + \Bigl\{-4t + 4t^2 r + \sigma^2 t^2 + 4t\log x + \frac{4}{\sigma^2}(rt - \log x)^2\Bigr\}u\partial_u,$$
$$\gamma_\infty = f(t,x)\partial_u,$$

where γ∞ is the infinite subset of solutions of (∗). the algebra of the finite subset is $sl(2,r) \oplus_s w_3$, where w₃ is the three-dimensional heisenberg-weyl algebra. the important thing to note is that the algebra of the symmetries presented above is just that of the classical heat equation, which we have seen is related to the schrödinger equation for the free particle and so to the noether point symmetries of a classical lagrangian. a question could of course be posed as to the identity of the classical lagrangian. equation (∗) is not quite in the form of the classical heat equation, for which we could easily make a definite identification of the corresponding lagrangian. under the transformations

$$x \mapsto \exp[\sigma y], \qquad u(t,x) \mapsto w(t,y)\exp\Bigl[\tfrac{1}{2}\sigma\Bigl(1 - \frac{2r}{\sigma^2}\Bigr)y + \frac{1}{8\sigma^2}(\sigma^2 + 2r)^2\,t\Bigr]$$

we obtain

$$2w_t + w_{yy} = 0.$$

one can detect a certain degree of irony in the identification of the most famous equation of financial mathematics with the free particle.

the black-scholes-merton equation is but one of a number of evolution equations to be found in financial mathematics which are rather blessed with an abundance of symmetry. admittedly not all of them possess the richness of the algebra $\{sl(2,r) \oplus_s w_3\} \oplus_s \infty a_1$, but one can be successful in the resolution of an equation with fewer symmetries. these algebraic properties are not confined to 1 + 1 evolution equations. an interesting example is to be found in the model of the pricing of commodities developed by eduardo schwartz [3, 6, 14], which was examined from the viewpoint of symmetry in [15]. schwartz considered models with one, two and three 'spatial' variables. in terms of the type of model that he proposed, the number of dimensions becomes irrelevant, since there is a sufficient increase in symmetry with the increase in the number of dimensions.

¹ one would hope that this initial condition would not apply in financial matters! unfortunately there are some instances of financial instability in which such an initial condition is far too accurate a model. note that the paper [7], with more realistic conditions, appeared earlier, but the content of [5] had already been presented at a seminar in the department of physics, the university of the witwatersrand, johannesburg, in 1996.
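the reduction to $2w_t + w_{yy} = 0$ can be checked symbolically; a minimal sketch using sympy (the variable names are choices made here, and the exponent is written out with the factors of σ and t that the stated target equation requires):

```python
import sympy as sp

t, y, sigma, r = sp.symbols('t y sigma r', positive=True)
w = sp.Function('w')(t, y)

x = sp.exp(sigma * y)                                    # x -> exp(sigma*y)
alpha = sp.Rational(1, 2) * sigma * (1 - 2 * r / sigma**2)
beta = (sigma**2 + 2 * r)**2 / (8 * sigma**2)
u = w * sp.exp(alpha * y + beta * t)                     # the substitution

u_t = sp.diff(u, t)
u_x = sp.diff(u, y) / (sigma * x)      # chain rule with y = log(x)/sigma
u_xx = sp.diff(u_x, y) / (sigma * x)

bsm = u_t + sp.Rational(1, 2) * sigma**2 * x**2 * u_xx + r * x * u_x - r * u
# strip the exponential factor and compare with (2*w_t + w_yy)/2
residual = sp.simplify(bsm / sp.exp(alpha * y + beta * t)
                       - (sp.diff(w, t) + sp.diff(w, y, 2) / 2))
print(residual)   # 0
```

the printed residual is 0, confirming that (∗) becomes $2w_t + w_{yy} = 0$ under the substitution.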
it is rather intriguing that so many of the equations which arise in the mathematics of finance should be so richly endowed with symmetry. naturally the possession of symmetry is an indispensable aid to the ready resolution of a differential equation, which may explain the comment of a learned referee that any competent applied mathematician can solve these equations without recourse to the arcane methods of a norwegian mathematician.

5. initial/boundary conditions

thus far we have considered only the differential equations and not those pesky little things which accompany them when one wants to state the problem correctly, i.e. the initial and/or boundary conditions. in the case of classical mechanics with a single independent variable the situation is quite simple: one solves the equation by one means or another and then evaluates the constants of integration using the initial conditions. to be quite honest, i have never thought of applying the methodology to be discussed below to an initial value problem in classical mechanics! when it comes to partial differential equations the situation is not so simple, for one can have both boundary conditions and initial conditions. actually, in the case of financial problems it is usually a terminal condition rather than an initial condition, but that is merely a matter of one's attitude to the arrow of time. typically the problem with options is to determine the price which one should pay now, say, to purchase some stock in the future at a now-determined price.

i suspect that we all have our favourite ways to write the one-parameter elements of the lie symmetries of a differential equation. nevertheless it is well that from time to time we should be reminded that the differential equation admits a single symmetry. if we are fortunate, the single symmetry will be a multiparameter symmetry which gives some freedom when it comes to dealing with the dreaded boundary/initial conditions. consider the case of the black-scholes-merton equation, (∗), and the terminal condition $u(t,x) = U$ when $t = T$. we need a symmetry which is compatible with these conditions.
however, when it comes to more degrees of freedom – as in the commodities model of schwartz – there may be some benefit in the investigation of the properties that the classically conserved quantities may have for these evolution equations. finally there are the questions of alternate lagrangians and of evolution equations which are not presumed to be linear. acknowledgements this contribution was prepared while pgll was enjoying the hospitality of professor mc nucci and the dipartimento di matematica e informatica, università di perugia. the continuing support of the university of kwazulunatal and the national research foundation of south africa is gratefully acknowledged. any opinions expressed in this talk should not be construed as being those of the latter two institutions. references [1] black fischer & scholes myron (1972) the valuation of option contracts and a test of market efficiency journal of finance 27 399–417. [2] black fischer & scholes myron (1973) the pricing of options and corporate liabilities journal of political economy 81 637–659. [3] brennan mj & schwartz es (1985) evaluating natural resource investments journal of business 58 133–155. [4] dirac pam (1932) the principles of quantum mechanics (cambridge, at the clarendon press). [5] gasizov r & ibragimov nh (1998) lie symmetry analysis of differential equations in finance nonlinear dynamics 17 387–407. [6] gibson r & schwartz es (1990) stochastic convenience yield and the pricing of oil contingent claims journal of finance 45 959–976. [7] ibragimov nh & wafo soh c (1997) solution of the cauchy problem for the black-scholes equation using its symmetries modern group analysis, international conference at the sophus lie conference centre ibragimov nh, naqvi kr & straume e edd (mars publishers, norway). [8] leach pgl (2006) heat polynomials and lie point symmetries journal of mathematical analysis and applications 322 288–297. [9] merton rc (1969) lifetime portfolio selection under uncertainty: the continuous-time case review of economics and statistics 51 247–257. [10] merton rc (1971) consumption and portfolio rules in a continuous time model journal of economic theory 3,4 373–413. [11] merton robert c (1973) theory of rational option pricing bell journal of economics and management science 4 141–183. [12] merton rc (1974) on the pricing of corporate data: the risk structure of interest rates journal of finance 29 449–470. [13] noether emmy (1918) invariante variationsprobleme königlich gesellschaft der wissenschaften göttingen nachrichten mathematik-physik klasse 2 235–267. [14] schwartz es (1997) the stochastic behaviour of commodity prices: implications for valuation and hedging the journal of finance 52 923–973. [15] sophocleous c, leach pgl & andriopoulos k (2008) algebraic properties of evolution partial differential equations modelling prices of commodities mathematical methods in the applied sciences 31 679–694. 305 acta polytechnica 53(3):302–305, 2013 1 noether's theorem 2 quantisation 3 the heat equation 4 the wonderful world of finance 5 initial/boundary conditions 6 tantum adesse acknowledgements references ap01_6.vp 1 introduction the atm computer network consists of switches, sources and destinations of the transported data. the atm network is based on switching packets of the same length called cells. switches can transmit packets from their input to their output and also get information about the intensity of the traffic. 
network traffic management is a significant process for the proper functioning of the network, and it is based on a specific algorithm. to avoid congestion, current atm networks use techniques such as timeout, duplicate acknowledgement or explicit congestion notification. for monitoring the state of the network, special cells called rm (resource management) cells are used in the atm network; rm cells inform the data sources about the maximum rate of flow acceptable to the network. the load of the network varies in time, and the interference of different loads at the same point can cause congestion. when this happens, the memories of the switches are overloaded and incoming cells have to be refused, i.e. removed. in this paper a new method is proposed for an earlier reaction to congestion. an analytical model of a switch was constructed for the analysis of the network behaviour; this model is used for analysing the traffic through the network. a description of the model is given in the sections below.

1.1 standard traffic management

there are several classes of traffic in the atm network. this paper considers the class called available bit rate (abr), used for transmitting files. the abr class of atm computer networks uses feedback information on the data transport from the switches and the destination to the source to control the load. the feedback mechanism is essential for a good throughput of the network, enabling an earlier reaction of the sources when congestion of the switches occurs. the transported data are arranged in packets of cells; these packets have the same length, and a special cell called the rm cell precedes every packet. the traffic management is based on the rm cells. rm cells pass from the source to the destination and collect information about the maximum rate of generated load which is acceptable to the dedicated path. after their acceptance by the destination they are sent back in order to bring the feedback information to the source, and the rate of the traffic load is adjusted in the source according to the value received from the rm cell. the flow of data cells together with rm cells between switches sw(i) and sw(i+1) is shown in fig. 1: the rm cells are pictured as black rectangles and the other cells as white rectangles. the packets of data cells with the rm cell number 8 are sent from the data source, i.e. forwards; only rm cells 2 and 3 are sent backwards, to bring the feedback information from the destination. the disadvantage of this method is that the source of data obtains information about the current rate of the load, but it does not know which particular data were lost by refusal within the switches; this information can be obtained only later, from the destination. in other words, the rm cells always have to go along the whole path from the data source to the destination and back, see figs. 5 and 7.

1.2 proposed traffic model

in the proposed model all rm cells are numbered by the source, and the rm cells are returned by the switches at the moment of their congestion. if congestion occurs, the packets are refused, but the associated rm cells are immediately sent back to inform the source of the loss of data. in this model the source is informed about the congestion in the network earlier, because the rm cells do not pass along the whole path, see figs. 5 and 8.

2 analytical model of the switch

in this part the behaviour of the basic network components – the switches – is studied. the behaviour of a switch can be described as a stationary queueing process.
let (m|m|1|m) be a queueing process in the kendall notation. the symbols denote:

• m – the distribution function of the cell interarrival time is exponential with mean $1/\lambda$, $\lambda > 0$, and the arrival times are independent;
• m – the distribution function of the cell service time is exponential with mean $1/\mu$, $\mu > 0$, and the service times are independent;
• 1 – a system with one place in service;
• m – a system with queue length m.

fig. 1: the flow of counted rm cells

notation used in the following text: λ – average arrival intensity; μ – average service rate; ρ = λ/μ – traffic intensity.

let us assume that cells coming to the switch are stored in a switch queue of length m. cells are refused if all of the m waiting places are occupied, and further arriving cells are cancelled. the situation is shown in fig. 2: the cell with index 0 is being transmitted, the following cells are waiting in the queue, and incoming cells with an index equal to or greater than m + 1 are refused.

fig. 2: the structure of a switch

the behaviour of this model is described by the probabilities of all possible queue lengths and by the cell waiting time in the queueing system. the probability that there are just k waiting cells in the queue is

$$p_k = \frac{(1-\rho)\rho^k}{1-\rho^{m+2}} \ \text{for}\ \rho \ne 1 \qquad \text{and} \qquad p_k = \frac{1}{m+2} \ \text{for}\ \rho = 1, \qquad 0 \le k \le m+1. \qquad (1)$$

the probability $p_z$ of cell refusal (all places in the queue are occupied) is $p_z = p_{m+1}$ and its value equals

$$p_z = \frac{(1-\rho)\rho^{m+1}}{1-\rho^{m+2}} \ \text{for}\ \rho \ne 1 \qquad \text{and} \qquad p_z = \frac{1}{m+2} \ \text{for}\ \rho = 1. \qquad (2)$$

fig. 3 shows the dependence of the probability $p_z$ on the value m for several values of the traffic intensity ρ.

fig. 3: probability of cell refusal ($p_z$ versus m for ρ = 0.95, 0.98, 1 and 1.01)

the cell waiting time w in this queueing system is a random variable. for the description of the analytical model, the mean value e(w) of the waiting time w is used:

$$\mathrm{e}(w) = \sum_{k=0}^{m} \frac{k+1}{\mu}\,p_k. \qquad (3)$$

fig. 4 shows the dependence of the mean value e(w) on the value m for several values of the traffic intensity ρ.

fig. 4: mean value of the waiting time in the queue (e(w) versus m for ρ = 0.95, 0.98, 1 and 1.01)

we can also use the probability of service $p_p$,

$$p_p = 1 - p_z. \qquad (4)$$

other formulas describing the behaviour of the queueing system can be found in [5]. the behaviour of the system will be studied for values of the traffic intensity ρ close to 1; in this case the refusal of cells occurs more frequently. a short numerical sketch of formulas (1)–(4) follows.
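a minimal numerical sketch of the queue formulas (1)–(4) in python; the function name and the normalization μ = 1 (time measured in mean service times) are choices made here for illustration:

```python
def queue_stats(rho, m, mu=1.0):
    """probabilities and mean waiting time of the (m|m|1|m) switch model.

    implements eqs. (1)-(4): the queue-length distribution p_k, the
    refusal probability p_z = p_{m+1}, the mean waiting time e(w) and
    the service probability p_p = 1 - p_z."""
    if rho == 1.0:
        p = [1.0 / (m + 2)] * (m + 2)                  # eq. (1), rho = 1
    else:
        norm = (1.0 - rho) / (1.0 - rho ** (m + 2))    # eq. (1), rho != 1
        p = [norm * rho ** k for k in range(m + 2)]
    p_z = p[m + 1]                                         # eq. (2)
    e_w = sum((k + 1) / mu * p[k] for k in range(m + 1))   # eq. (3)
    return p_z, e_w, 1.0 - p_z                             # eq. (4)

p_z, e_w, p_p = queue_stats(rho=1.0, m=32)
print(round(p_z, 4), round(e_w, 1))   # 0.0294 16.5
```

for ρ = 1 and m = 32 this gives $p_z = 1/34 \approx 0.029$ and $\mathrm{e}(w) = 16.5/\mu$, consistent with the curves in figs. 3 and 4.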
3 path of cell transport

it is supposed that a path of cell transport from source s to destination d includes a sequence of n switches sw(i), 1 ≤ i ≤ n. the same path is used both for the forward transport of data cells together with rm cells and for the backward transport of rm cells. in the backward transport the rm cells pass through the switches in the reverse order; these switches are marked sw(i), n + 1 ≤ i ≤ 2n. both paths are shown in fig. 5. the following notation is used in fig. 5:

• s – source of cells;
• d – destination of a transport;
• sw(i) – i-th switch in the path;
• t_i – time period needed for cell transport along the line connecting switches sw(i) and sw(i+1).

3.1 characterization of cell transport

the following cases of cell transport from the source to the destination were studied:

• the data cell is transported without being refused;
• the data cell is refused just once during the transport.

the formulas for the transport time will be stated below, and we will use the weighted averages of the mean values of these time periods to compare the two models; we choose the probabilities of the possible events as weight coefficients. we do not consider the other cases of cell transport (when the number of refusals is greater than 1), because earlier studies [1, 2] have shown that the associated probabilities are negligible. the probabilities of cell refusal in a switch are shown in fig. 3. we use the following notation:

• p_z(i) – probability of cell refusal in the i-th switch due to a full switch buffer;
• p_p(i) = 1 − p_z(i) – probability of service in the i-th switch;
• w(i) – waiting time in the queue of the i-th switch;
• e(w(i)) – mean value of w(i);
• t_i – time period needed for cell transport along the line connecting switches sw(i) and sw(i+1).

the sample space of the considered events consists of the independent elementary events {0, 1, 2, …, n}, where the symbols denote:

• 0 – the cell is transported without being refused in any switch;
• i – the cell is refused just once, within the i-th switch sw(i), i = 1, 2, …, n.

we assume that the associated request (the rm cell transmitted from the destination to the source) and the repeated sending of data cells are delivered without refusal. the probabilities of the described events are conditional probabilities and their values can be approximated as

$$p(0) = \prod_{j=1}^{n} p_p(j), \qquad p(i) = p_z(i)\prod_{j=1}^{i-1} p_p(j), \quad i = 1, 2, \ldots, n.$$

the values p_p(i), i = 1, …, n, are close to 1 and the values p_z(i) ≪ 1, i = 1, …, n. therefore we can approximate the probabilities p(i), i = 0, …, n, as

$$p_0 = 1 - \sum_{i=1}^{n} p_z(i), \qquad p_1(i) = p_z(i), \quad i = 1, \ldots, n. \qquad (5)$$

we use the values p₀, p₁(i), i = 1, …, n, as the probabilities of the events {0, 1, …, n}; note that $p_0 + \sum_{i=1}^{n} p_1(i) = 1$ (a short numerical sketch of (5) follows). we consider the individual cases of the cell transport denoted by the symbols {0, 1, …, n} and we calculate their transport times.
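a small continuation of the earlier sketch, evaluating the approximation (5) for a path of identical switches (the reuse of queue_stats() from the previous sketch and the example values are illustrative):

```python
def transport_probabilities(p_z_list):
    """event probabilities of eq. (5): p0 for a loss-free transport and
    p1(i) ~ pz(i) for a single refusal in switch i (valid while pz(i) << 1)."""
    p0 = 1.0 - sum(p_z_list)
    p1 = list(p_z_list)          # p1[i-1] corresponds to pz(i)
    return p0, p1

# ten identical switches with the queue parameters used above (rho = 1, m = 32)
p_z, _, _ = queue_stats(rho=1.0, m=32)
p0, p1 = transport_probabilities([p_z] * 10)
print(round(p0, 3), round(p0 + sum(p1), 3))   # 0.706 1.0
```

the probabilities sum to 1 by construction, in agreement with the remark after (5).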
we denote:

• ts_t – total time period of cell transport in the standard model;
• ts_t⁰ – total time period of cell transport without refusal in the standard model;
• ts_t¹(i) – total time period of cell transport with refusal in the i-th switch in the standard model;
• tp_t – total time period of cell transport in the proposed model;
• tp_t⁰ – total time period of cell transport without refusal in the proposed model;
• tp_t¹(i) – total time period of cell transport with refusal in the i-th switch in the proposed model.

3.2 cell transport without refusal

the value p₀ is the probability of cell transport without refusal within any switch of the path; it is actually the probability of a successful transmission without the loss of any cell. the influence of the proposed modification does not appear in this case, and the transport times in the standard and proposed models are the same: in both models cells pass along the whole path from source s to destination d. this path is shown in fig. 6. the total time periods ts_t⁰ and tp_t⁰ needed for the transport of a particular cell along the whole path in the standard and proposed models are equal,

$$ts_t^0 = tp_t^0 = \sum_{k=1}^{n+1} t_k + \sum_{k=1}^{n} w(k). \qquad (6)$$

fig. 5: path of cell transport (switches sw(1), …, sw(n) with waiting times w(1), …, w(n) on the forward path and sw(n+1), …, sw(2n) with w(n+1), …, w(2n) on the backward path)
fig. 6: path of a cell transport without refusal

the mean value of the times ts_t⁰ and tp_t⁰ is

$$\mathrm{e}(ts_t^0) = \mathrm{e}(tp_t^0) = \sum_{k=1}^{n+1} t_k + \sum_{k=1}^{n} \mathrm{e}(w(k)), \qquad (7)$$

because only the time periods w(k) are random quantities.

3.3 cell transport with refusal of cells in the standard model

we calculate the transport time ts_t¹(i). the assumption is that a cell is refused in the i-th switch sw(i), so the destination must call for a repeated sending of the refused cell. the probability p₁(i) = p_z(i) of this case is given in (5). the path of the cell transport in the standard model is shown in fig. 7. the switch sw(i) is congested, and that is why a cell is refused. information about a cell being refused is not known before an uncompleted packet is received by destination d. such a situation results in a request for a repeated transmission of data cells until the complete packet arrives at its destination. the cell and the request have to pass the path three times, and the total time period needed for the cell transport, ts_t¹(i), i = 1, …, n, is (with the backward lines and switches indexed as in fig. 5)

$$ts_t^1(i) = 2\left(\sum_{k=1}^{n+1} t_k + \sum_{k=1}^{n} w(k)\right) + \sum_{k=n+2}^{2n+2} t_k + \sum_{k=n+1}^{2n} w(k). \qquad (8)$$

the mean value of the time period ts_t¹(i) is

$$\mathrm{e}(ts_t^1(i)) = 2\left(\sum_{k=1}^{n+1} t_k + \sum_{k=1}^{n} \mathrm{e}(w(k))\right) + \sum_{k=n+2}^{2n+2} t_k + \sum_{k=n+1}^{2n} \mathrm{e}(w(k)). \qquad (9)$$

3.4 cell transport with refusal of cells in the proposed model

in the proposed modification, the cell transport path and the associated request are shown in fig. 8. the request for the repeated sending is sent immediately from the congested switch sw(i). we can see that in this case the cell and the associated request for its repeated sending usually do not have to pass all the way from the source to the destination. the cell is refused in the i-th switch with the probability p₁(i) = p_z(i). the time periods needed for the transport along the individual parts of the whole path are:

1. $\sum_{k=1}^{i} t_k + \sum_{k=1}^{i-1} w(k)$ – the time period of cell transport from source s to switch sw(i);
2. $\sum_{k=2n+3-i}^{2n+2} t_k + \sum_{k=2n+2-i}^{2n} w(k)$ – the time period needed for the transmission of the repeated-sending request from switch sw(i) back to source s;

3. $\sum_{k=1}^{n+1} t_k + \sum_{k=1}^{n} w(k)$ – the time period needed for the transport of the cell from source s to destination d.

the total time period tp_t¹(i) needed for the transport is

$$tp_t^1(i) = \sum_{k=1}^{i} t_k + \sum_{k=1}^{i-1} w(k) + \sum_{k=2n+3-i}^{2n+2} t_k + \sum_{k=2n+2-i}^{2n} w(k) + \sum_{k=1}^{n+1} t_k + \sum_{k=1}^{n} w(k). \qquad (10)$$

the mean value of the time period tp_t¹(i) is

$$\mathrm{e}(tp_t^1(i)) = \sum_{k=1}^{i} t_k + \sum_{k=1}^{i-1} \mathrm{e}(w(k)) + \sum_{k=2n+3-i}^{2n+2} t_k + \sum_{k=2n+2-i}^{2n} \mathrm{e}(w(k)) + \sum_{k=1}^{n+1} t_k + \sum_{k=1}^{n} \mathrm{e}(w(k)). \qquad (11)$$

4 comparison of cell transports

we now compare the total time period of the cell transport in the standard model and in the proposed model. the transport times in the individual cases denoted by the indices {0, 1, …, n} are random variables, and the corresponding probabilities p₀, p₁(i) are given in (5). as the mean value of the total transport time periods ts_t and tp_t we use the weighted averages of the mean values e(ts_t⁰), e(ts_t¹(i)) and e(tp_t⁰), e(tp_t¹(i)); the coefficients of the weighted averages are the probabilities p₀, p₁(i). if we denote by e(ts_t) the mean value of the total time period of the cell transport in the standard model and by e(tp_t) the same quantity in the proposed model, we obtain

$$\mathrm{e}(ts_t) = \mathrm{e}(ts_t^0)\,p_0 + \sum_{i=1}^{n} \mathrm{e}(ts_t^1(i))\,p_1(i), \qquad \mathrm{e}(tp_t) = \mathrm{e}(tp_t^0)\,p_0 + \sum_{i=1}^{n} \mathrm{e}(tp_t^1(i))\,p_1(i).$$

fig. 7: path of a cell transport with refusal in the standard model
fig. 8: path of a cell transport with refusal in the proposed model

the effect of the modification of the standard model is expressed by the difference

$$\mathrm{e}(ts_t) - \mathrm{e}(tp_t) = \bigl(\mathrm{e}(ts_t^0) - \mathrm{e}(tp_t^0)\bigr)p_0 + \sum_{i=1}^{n}\bigl(\mathrm{e}(ts_t^1(i)) - \mathrm{e}(tp_t^1(i))\bigr)p_1(i) = \sum_{i=1}^{n}\bigl(\mathrm{e}(ts_t^1(i)) - \mathrm{e}(tp_t^1(i))\bigr)p_1(i), \qquad (12)$$

where we use the equality $\mathrm{e}(ts_t^0) = \mathrm{e}(tp_t^0)$ from (7). the values e(ts_t¹(i)) and e(tp_t¹(i)) are given in (9) and (11). the last expression depends on many parameters; we determine its value for one simple case. in the sequel we assume that the switches are identical, i.e. the waiting times w(k) are identically distributed and the times t_k are equal. we denote

$$\Delta t = t_k,\ k = 1, \ldots, 2n+2; \qquad \mathrm{e}(w) = \mathrm{e}(w(k)),\ k = 1, \ldots, 2n; \qquad p_z = p_z(k),\ p_p = p_p(k),\ k = 1, \ldots, 2n.$$

in this case we get

$$\mathrm{e}(ts_t^1(i)) = 3(n+1)\,\Delta t + 3n\,\mathrm{e}(w), \qquad \mathrm{e}(tp_t^1(i)) = (n+2i+1)\,\Delta t + (n+2i-2)\,\mathrm{e}(w), \quad i = 1, \ldots, n,$$

$$\mathrm{e}(ts_t) - \mathrm{e}(tp_t) = \sum_{i=1}^{n} p_z\,2(n+1-i)\bigl(\Delta t + \mathrm{e}(w)\bigr) = p_z\,n(n+1)\bigl(\Delta t + \mathrm{e}(w)\bigr). \qquad (13)$$

the value (13) depends on the parameters of the switch (e(w), p_z) and on the value Δt. we can suppose that Δt ≪ e(w), and therefore the comparison of the two models is demonstrated by the expression

$$v = p_z\,\mathrm{e}(w)\,n(n+1).$$

the values p_z, e(w) and p_p are defined in (2), (3) and (4). tab. 1 below contains the values of v for some path transport parameters; the short sketch that follows reproduces several of its entries.
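building on queue_stats() from the earlier sketch, the value of v can be evaluated directly (for ρ ≠ 1 the rounding of the published table may differ slightly):

```python
def v_metric(rho, m, n, mu=1.0):
    """comparison expression v = pz * e(w) * n * (n + 1) from eq. (13);
    uses queue_stats() defined in the sketch of section 2."""
    p_z, e_w, _ = queue_stats(rho, m, mu)
    return p_z * e_w * n * (n + 1)

print(round(v_metric(1.0, 32, 10), 1))   # 53.4  (tab. 1: m = 32, n = 10, rho = 1)
print(round(v_metric(1.0, 32, 20), 0))   # 204.0 (tab. 1: m = 32, n = 20, rho = 1)
```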
the value of expression v depends on the product of the probability pz and the mean value of the waiting time e (w). figs. 12 and 13 show the dependence of the product pze (w) on value m for several values of traffic intensity �. conclusion the influence of the proposed modification of cell transport in an atm computer network is tested. analytical models of cell transport through standard and proposed modification are described. formulas for the time period needed for cell transport for both models are presented. © czech technical university publishing house http://ctn.cvut.cz/ap/ 49 acta polytechnica vol. 41 no. 6/2001 v m n � � 1.01 � � 1 � � 0.99 � � 0.98 32 10 66.3 53.4 42.4 33.1 32 20 253 204 162 126 32 50 1536 1237 928 767 64 10 81.4 53.7 33.9 18.8 64 20 311 205 129 71.8 64 50 1887 1244 785 436 96 10 98.6 54.5 26.8 11.7 96 20 376 208 102 44.8 96 50 2285 1262 625 272 128 10 120 54.7 21.2 6.8 128 20 459 209 81 26 128 50 2788 1267 492 158 tab. 1: influence of modification on the transport quality 0 1000 2000 3000 4000 5000 6000 20 40 60 80 100 v n fig. 9: dependence of v on parameter n of the path 0 1000 2000 3000 4000 5000 6000 7000 20 40 60 80 100 v n fig. 10: dependence of v on parameter n of the path 0 1000 2000 3000 20 40 60 80 100 v n fig. 11: dependence of v on parameter n of the path a comparison is made by the weighted averages of the mean values of the time period needed for cell transport e (t st), e (t pt). as the coefficient of weighted values, we choose the probabilities of particular cases. we assumed transports without refusal and with one refusal only. the differences of the mean values e (t st) and e (t pt) (13) are approximated by the expression v. its values in table 1 and the graph dependences in figs. 9, 10 and 11 show that the effect of the proposed modification grows with increased values of traffic intensity �. from table1 and fig.12, 13 it follows that if traffic intensity � <1, then the effect of the proposed modification decreases with increased values of buffer storage size m while for � � 1 the effect of the proposed modification grows. this is caused by the fact that when m increases, then the probability pz of cell refusal decreases, but for � � 1 the mean of the waiting time e (w) increases faster, hence the product pze (w) increases. if � <1 then this does not happen. we see from tab.1 that in this case the effect of the modification actually becomes smaller as we increase m. references [1] filip, j.: data transfer with modified rm cells. in: proceedings of workshop 97, prague, ctu, 1997 [2] filip, j.: switching an alternative for high-speed networks. in: proceedings of workshop 96, prague, ctu, 1996 [3] filip, j.: data transports in atm networks. in: proceedings of workshop 2000, prague, ctu, 2000 [4] brandt, a., fraenken, p., lisek, b.: stationary stochastic models. akademie – verlag berlin, 1990 [5] zitek, f.: malé zastavení času. spn praha, 1975 [6] atm forum technical committee “traffic management specification version 4.0”, april 1996 ing. jiří filip e-mail: fofis@post.cz ministerstvo zahraničních věcí české republiky ministry of foreign affairs of the czech republic loretánské náměstí 5, 118 00, praha 1, czech republic 50 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 41 no. 6/2001 0 0.02 0.04 0.06 0.08 0.1 0.12 0.14 0.16 0.18 0.2 0.22 0.24 0.26 0.28 0.3 40 60 80 100 120 140 160 m p e wz ( ) fig. 
acta polytechnica 58(5):271–278, 2018
doi:10.14311/ap.2018.58.0271, © czech technical university in prague, 2018, available online at http://ojs.cvut.cz/ojs/index.php/ap

intelligent controller for tracking a 14-inch celestron telescope

abdel-fattah attia^{a,b}
a kafrelsheikh university, faculty of engineering, department of electrical engineering, p.o. 33516 kafrelsheikh, egypt
b intelligent system research group (isrg), kafrelsheikh university, kafrelsheikh, egypt
correspondence: aheliel@eng.kfs.edu.eg

abstract. the paper proposes a design of two fuzzy controllers and a proportional-integral-derivative (pid) controller for the position tracking of the 14′′ celestron telescope. the position responses, right ascension and declination, are shaped in such a way that the integral absolute error (itae) is minimized using a modified particle swarm optimization (mpso). based on the structure of the mpso, the acceleration coefficients of the particle swarm optimization are adapted dynamically by minimizing the system error with the iteration index. the adaptive control tool combines the fuzzy controller and the mpso to produce a powerful controller for the system (flc-pso). the parameters of the membership functions and the pid gains are tuned simultaneously based on the mpso, which is an efficient and simple tool for this multidimensional problem. the simulation results for both controllers are analyzed and compared on the basis of the time response specifications.

keywords: static fuzzy controller; adaptive fuzzy controller; improved particle swarm algorithm.

1. introduction

at present there are many evolutionary computation algorithms; particle swarm optimization (pso) is one of them. the pso has many advantages, such as simplicity of implementation and fast convergence, as in van den bergh [1]. despite these advantages, the standard pso algorithm easily yields local solutions when solving complex optimization problems. recently, many articles have been presented to overcome this weakness and improve the performance of the standard pso: adaptive pso by xie, zhang and yang [2], fully informed pso by mendes, kennedy and neves [3], and pso with a disturbance term by he and han [4]. moreover, the performance of the particle swarm optimization becomes slower as the search space dimensionality increases. recently, many articles have covered optimization approaches for tuning methods [5]; in these articles, particle swarm optimization (pso) and the genetic algorithm (ga) are used to improve the input/output membership functions (mfs). the pso is simple to implement and faster than the ga for tuning the mfs, which enhances the performance of the fuzzy controllers [5]. the authors of [6–11] tried to improve different optimization methods for tuning the mfs of flcs and the coefficients of linear controllers, to make them faster and more reliable for real applications. nowadays, other methods are also used to enhance the performance of traditional controllers, such as fractional order systems [19, 20].

the pointing, tracking and imaging of a modern telescope are very important processes. the driving system of the telescope depends on two dc motors working together to track a predefined position; the mechanisms to be controlled are therefore the right ascension (ra) and declination (dec) drives, and a precise control system is needed to solve this problem [1].
the purpose of this research is to describe and implement the algorithms used in the control system for the celestron telescope. these control algorithms must be robust, accurate and easy to implement. this research introduces a new modification of the original pso to improve the global performance of the basic pso. two fuzzy controllers and a classically tuned pid controller are proposed to control the electric motors driving the celestron telescope. the classical fuzzy controller has a fixed fuzzy rule base, like the one proposed in previous work [12, 14]. the proposed fuzzy controller uses an improved pso algorithm to adapt the controller dynamically for a better stability, settling time, rise time, maximum overshoot and integral absolute error. in this paper, the performance of the pid tuning algorithms is analyzed and compared to both the fixed and the adaptive fuzzy logic controllers on the basis of the time response specifications.

the remainder of the article is organized as follows. section 2 provides the 14′′ celestron telescope model. the pid controller is described in section 3. the components of the static fuzzy controller are introduced and discussed in section 4. section 5 proposes an improved particle swarm optimization algorithm. the application of the enhanced pso algorithm to adjusting the fuzzy logic controllers is described in section 6. section 7 illustrates the results of the proposed control techniques, and a conclusion and extensions are addressed in the last section.

figure 1. the 14′′ celestron telescope [15].

2. mathematical description of the telescope model

the 14′′ celestron telescope is shown in figure 1 [15]. the right ascension (ra) and declination (dec) coordinates have two motor drives for the telescope movements on both axes [15]. the nonlinear differential equations of the celestron telescope model are expressed mathematically as follows:

$$m(\theta)\ddot{\theta} + c(\theta, \dot{\theta}) + g(\theta) = \tau, \qquad (1)$$
$$m(\theta)\ddot{\theta} + n(\theta, \dot{\theta}) + \tau_d = \tau, \qquad (2)$$

where θ is the joint angular position, θ̇ the velocity and θ̈ the acceleration term, each represented as

$$\theta = \begin{bmatrix}\theta_1\\\theta_2\end{bmatrix}, \qquad \dot{\theta} = \begin{bmatrix}\dot{\theta}_1\\\dot{\theta}_2\end{bmatrix}, \qquad \ddot{\theta} = \begin{bmatrix}\ddot{\theta}_1\\\ddot{\theta}_2\end{bmatrix}.$$

the unknown dynamics are represented by a constant disturbance τ_d. the input torques τ₁ and τ₂ control the outputs θ₁ and θ₂ of the coupled telescope model. it is required to introduce a compensator to decouple this model; the decoupled system then consists of two independent single-variable systems. the compensated telescope model represents a linear system, which allows a linear pid controller to be used [13]. the nonlinear model consists of a compensator plus two parallel pid controllers; the controller is a hybrid pid-pso controller for driving the dc motor, consisting of a state feedback compensator and a process with a state vector.

3. pid controller

the position errors of each axis are measured and the joint velocities are then estimated. based on the position errors and the joint speeds, the pid controllers are defined by using an additional control signal u. the output of the pid controller is described as [13]

$$u = -k_p e_\theta - k_d \dot{e}_\theta - k_i \epsilon, \qquad (3)$$

and the total input of the nonlinear model is described as

$$\tau = m(\theta)\bigl(k_p e_\theta + k_d \dot{e}_\theta + k_i \epsilon\bigr) + n(\theta, \dot{\theta}), \qquad (4)$$

where ε(t) is the integral of the control error e(t). a minimal discrete-time sketch of this control law is given after table 1.

table 1. look-up table for fuzzy rules (rows: ėθ1, columns: eθ1; referenced in section 4; the pl row, missing in the source text, is filled in from the table's diagonal symmetry).

  ėθ1\eθ1   nl   ns   z    ps   pl
  nl        nl   nl   nl   ns   z
  ns        nl   nl   ns   z    ps
  z         nl   ns   z    ps   pl
  ps        ns   z    ps   pl   pl
  pl        z    ps   pl   pl   pl
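returning to the control law (3), a minimal discrete-time sketch in python; the class name, the sampling period dt and the gain values here are illustrative choices, not values from the paper:

```python
class PID:
    """discrete realization of eq. (3): u = -(kp*e + kd*de/dt + ki*int(e))."""
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0      # epsilon(t), the integral of e(t)
        self.prev_error = 0.0

    def step(self, error):
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt   # approximates de/dt
        self.prev_error = error
        return -(self.kp * error
                 + self.kd * derivative
                 + self.ki * self.integral)

# one controller per decoupled axis (ra and dec), as in figure 2
ra_pid = PID(kp=8.0, ki=0.5, kd=1.2, dt=0.01)
dec_pid = PID(kp=8.0, ki=0.5, kd=1.2, dt=0.01)
u_ra = ra_pid.step(error=0.1)
```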
the controller parameters are proportional gain kp, integral gain ki, and derivative gain kd. the position error eθ is the difference between the required angle (θd) and the final position angle (θ) of the telescope in a given direction for the right ascension or declination positions. enhanced gains from the pid controllers [kp1,kd1,ki1,kp2,kd2,ki2]t are then determined by using the modified pso algorithm. figure 2 shows the pid controllers for the decoupled telescope model. the coefficients kp1, ki1, kd1, kp2, ki2 and kd2 are the core parameters in the pid controllers for both the ra and dec axis. the mpso algorithm will be used to select the optimal efficiency factor to make the output values meet the required control specification. 4. fuzzy logic controller (flc) the main components of the static fuzzy controller with a fixed membership functions (sflc) consists of four components: [6, 14]. figure 3 shows the flc controllers for the telescope model. (1.) fuzzification interface. the global input variables of the sflc controllers are the position deviation error eθ and its rate of change ėθ. τ1 is the output variable. five fixed fuzzy sets (ffs) are designed for each input and output variables of the sflc. the fuzzy input vector of each flc-pso for input/output variables consists of the previous variables used in the sflc with five linguistic variables using adaptive fuzzy sets (afs). in figure 4, the solid lines represent the fixed fuzzy sets of the sflc. while the dashed lines represent the adaptive fuzzy sets of the flc-pso. (2.) data base. the database has the definitions of the mfs defined for every fuzzy control variable. the linguistic variables such as the pl (positive large), ps (positive small), z (zero), ns (negative small) and nl (negative large) are shown in figure 4 and indicated in table 1. (3.) the inference system. there are 25 rules constructed based on five gaussian mfs groups selected for i / o variables in the fuzzy control unit. these fuzzy rules define control objectives using fuzzy terms as shown in table 1. (4.) defuzzification stage. the center average defuzzifier is used to compute the fuzzy output, which represents the defuzzification stage [6–8]. 272 vol. 58 no. 5/2018 intelligent controller for tracking a 14-inch celestron telescope figure 2. pid controllers for the decoupled telescope model. figure 3. flc controllers for the nonlinear telescope model. 5. proposed improved particle swarm optimization algorithm (mpso) in particle swarm optimization (pso), at the beginning of the iterations, the particles are randomly distributed. the best solutions are scattered in the search space. the variety of acceleration parameters in the pso provides faster convergence than fixed coefficients. starting the acceleration parameter c1 at a high value will create the best solutions through the iterations. this means that the population converges to a smaller subset of the search area, and the value of c1 will greatly decrease with the iteration index: c1 = c1ie−t/tmax −c1f, (5) where t is the current iteration index, and tmax is the maximum number of iterations. in the early iterations, particle members are very random. the acceleration parameter c2 increases significantly with the iteration index to exploit the enhanced pso in the zone created by the best current solutions: c2 = c2iet/tmax. (6) the dynamic operator’s idea was discussed and explained previously in [17, 18]. 273 abdel-fattah attia acta polytechnica figure 4. 
figure 4. (a) membership functions (mfs) of the position error for the sflc and flc-pso controllers; (b) mfs of the rate of error for the sflc and flc-pso controllers; (c) mfs of the control action for the sflc and flc-pso controllers; (d) the output surface of the ra & dec sflc controller.

figure 5. flowchart of the improved parameters for the membership functions of the flc-pso controllers.

table 2. optimized parameters of flc-pso for the ra axis (30 in total).

  variable     e_θ1               ė_θ1               torque τ₁
  parameters   c1,σ1, …, c5,σ5    c1,σ1, …, c5,σ5    c1,σ1, …, c5,σ5
  count        2 × 5              2 × 5              2 × 5

table 3. optimized parameters of flc-pso for the dec axis (30 in total).

  variable     e_θ2               ė_θ2               torque τ₂
  parameters   c1,σ1, …, c5,σ5    c1,σ1, …, c5,σ5    c1,σ1, …, c5,σ5
  count        2 × 5              2 × 5              2 × 5

6. adaptive fuzzy controllers using modified pso

this section provides a brief overview of the modified pso control design and the fuzzy logic controllers. the flc-pso, using adaptive afs based on the pso, has the same inputs and outputs as the sflc. there are 25 rules for both the flc-pso and sflc controllers; the rules can be expressed as follows: if e_θ is ns and ė_θ is z, then τ is ns. there are 30 parameters to be optimized for each flc-pso controller, based on their limitations, forming a set of n particles. the membership functions (mfs) of the flc-pso controller are represented by the dashed lines shown in figure 3. tables 2 and 3 show the mfs of the flc-pso controller for the ra and dec axes, respectively; these 60 parameters will be optimized using the mpso. the flowchart in figure 5 illustrates the sequence of steps carried out in the pso algorithm for improving the membership function parameters of the flc-pso controllers for the ra and dec axis drives. the following steps explain the details of optimizing the mf parameters of the fuzzy controllers:

(1.) initialization. initially, the controllers work with the parameters of the static fuzzy controllers. acceptable constraints are set for each gaussian mf parameter (centre parameter ∆c = [c_min, c_max] and width parameter ∆σ = [σ_min, σ_max]); the acceptable restrictions are determined using the 2nd-order fuzzy sets method explained in [17]. the whole system operation is defined based on the traditional fuzzy controllers (sflcs) and the pso parameters in table 4.

table 4. pso parameters.

  particles number (swarm size)           n        50
  inertia weight (initial value)          w_max    0.9
  inertia weight (final value)            w_min    0.25
  maximum number of iterations            t_max    40
  acceleration param. 1 (initial value)   c_1i     2
  acceleration param. 2 (initial value)   c_2i     0.5
  acceleration param. 1 (final value)     c_1f     0.5
  acceleration param. 2 (final value)     c_2f     2
  maximum velocity                        v_1max   2.0
  minimum velocity                        v_1min   −1.5

(2.) generation. generate n particles randomly within the acceptable limits, starting at time counter t = 0, {x_j(0), j = 1, …, n}, with velocities {v_j(0), j = 1, …, n}.

(3.) objective function evaluation. determine the fitness function f for each particle in the initial population. the parameters of the fuzzy sets (vector x) are used to create the flc-pso controllers for the ra axis and the dec axis, and the best initial individuals are found among the population. in this article, the integrated time and absolute error (itae) is used as the objective function,

$$f = \int_0^\infty t\,|e(t)|\,\mathrm{d}t, \qquad (7)$$

where e(t) is the position error in the ra and dec positions for each flc-pso controller, as shown in figure 3.

(4.) set t = t + 1.

(5.) weight updating. the new inertia weight w(t) is

$$w(t) = w_{min} + \frac{t_{max} - t}{t_{max}}\,(w_{max} - w_{min}). \qquad (8)$$
(6.) velocity updating. based on the global best and the individual best of each particle, the velocity of the j-th particle is updated:

$$v_j(t) = w(t)\,v_j(t-1) + c_1 r_1\bigl(x^*_j(t-1) - x_j(t-1)\bigr) + c_2 r_2\bigl(x^{**}(t-1) - x_j(t-1)\bigr), \qquad (9)$$

where r₁ and r₂ are random numbers in [0, 1], and c₁ and c₂ are the acceleration parameters varying with the iteration index according to the pso improvements (5) and (6).

(7.) position updating. the updated velocity moves each particle to its new position:

$$x_j(t) = x_j(t-1) + v_j(t). \qquad (10)$$

(8.) best individual updating. evaluate each particle at its updated location using the objective function in (7). if f_j < f*_j, j = 1, …, n, then update the individual best as x*_j(t) = x_j(t) and f*_j = f_j; in either case go to step 9.

(9.) global best updating. look for the minimum value f_min among the f*_j, where min is the index of the particle with the minimum objective function, min ∈ {1, …, n}. if f_min < f**, then update the global best as x**(t) = x_min(t) and f**(t) = f_min(t); in either case go to step 10.

(10.) stopping criteria. the particle swarm update is repeated until the stopping criterion or the maximum number of iterations is reached.

a compact sketch of this loop is given below.
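a compact, self-contained sketch of the mpso loop of steps (2.)–(10.), combining the schedules (5), (6) and (8) with the updates (9)–(10); the defaults follow table 4, while the quadratic test fitness and the clamping of positions to the search bounds are illustrative choices standing in for the itae criterion (7) and the mf-parameter constraints:

```python
import math
import random

def mpso(fitness, bounds, n=50, t_max=40, w_max=0.9, w_min=0.25,
         c1i=2.0, c1f=0.5, c2i=0.5, v_min=-1.5, v_max=2.0):
    dim = len(bounds)
    x = [[random.uniform(lo, hi) for lo, hi in bounds] for _ in range(n)]
    v = [[0.0] * dim for _ in range(n)]
    p_best = [xi[:] for xi in x]                      # individual bests, x*_j
    f_best = [fitness(xi) for xi in x]
    g = min(range(n), key=lambda j: f_best[j])
    g_best, f_g = p_best[g][:], f_best[g]             # global best, x**

    for t in range(1, t_max + 1):
        w = w_min + (t_max - t) / t_max * (w_max - w_min)   # eq. (8)
        c1 = c1i * math.exp(-t / t_max) - c1f               # eq. (5)
        c2 = c2i * math.exp(t / t_max)                      # eq. (6)
        for j in range(n):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                vel = (w * v[j][d]
                       + c1 * r1 * (p_best[j][d] - x[j][d])
                       + c2 * r2 * (g_best[d] - x[j][d]))   # eq. (9)
                v[j][d] = max(v_min, min(v_max, vel))       # table 4 limits
                lo, hi = bounds[d]
                x[j][d] = max(lo, min(hi, x[j][d] + v[j][d]))  # eq. (10)
            f = fitness(x[j])
            if f < f_best[j]:                               # step (8.)
                f_best[j], p_best[j] = f, x[j][:]
                if f < f_g:                                 # step (9.)
                    f_g, g_best = f, x[j][:]
    return g_best, f_g

# toy usage: a quadratic bowl in place of the itae fitness of eq. (7)
best, val = mpso(lambda p: sum(pi ** 2 for pi in p), bounds=[(-5.0, 5.0)] * 6)
```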
because a fixed ccd camera is mounted on the telescope during the tracking of a sky object, an overshoot in the response is impermissible. based on the mpso algorithm, different sets of gains were tested for the pid controllers against the control specifications. we have seen that if the pid gains are increased to improve the rise time, there is an inherent tendency toward an overshoot in the response. this is impermissible in the case of telescopes with a fixed ccd camera. these requirements are achieved for the telescope movements by the flc-pso and sflc controllers for the ra and dec positions. figure 7 shows the improved rise times, settling times and stability periods, and the increased overshoot of the responses of the pid-pso controllers.

figure 7. position responses: (a) ra axis and (b) dec axis.

figure 8. velocity responses: (a) ra axis and (b) dec axis.

therefore, the application of the flc-pso controller improves the dynamic response of the overall system of the astronomical telescope. although the pid-pso controller shows the fastest rise and settling times compared to the flc-pso controller, the flc-pso controller suppresses the excess response with the best coefficient of the damping factor. the error values for the speed and position become zero when the system reaches the required references for the ra and dec, respectively, as shown in figure 8. the results confirmed that the mpso algorithm performed effectively to improve the flc-pso and pid-pso performance compared with the sflc and classical pid controllers for the ra and dec axes of the celestron telescope.

8. conclusions

in this paper, dynamic acceleration parameters were introduced for the pso algorithm. this modification of the pso routine improves its global search. the acceleration parameters speed up the pso approach and prevent the pso from stalling at a locally optimal solution. the flc-pso membership function parameters are automatically updated based on the mpso. the proposed adaptive flc-pso controller, tuned by the mpso, improves the stability and settling time and reduces the damping factor of the celestron telescope model. the results show the superiority of the flc-pso compared with the sflc and classical pid controllers, even when the pid is optimized, for the performance of the ra and dec axes of the celestron telescope.

references

[1] van den bergh, frans. "an analysis of particle swarm optimizers (pso)." pretoria, university of pretoria (2001): 78-85.
[2] xie, x.-f., zhang, w.-j. and yang, z.-l., (2002), "adaptive particle swarm optimization on individual level", int. conf. on signal processing (icsp), pp. 1215-1218
[3] mendes, r., kennedy, j., and neves, j., (2004) "the fully informed particle swarm: simpler, maybe better," ieee trans. on evolutionary comput., vol. 8 (3), pp. 204–210.
[4] he, q. and c. han, 2006. an improved particle swarm optimization algorithm with disturbance term. comput. intell. bioinfo., 4115: 100-108.
[5] fereidouni, alireza, mohammad a. s. masoum, and moayed moghbel. "a new adaptive configuration of pid type fuzzy logic controller." isa transactions 56 (2015): 222-240.
[6] gharghory, sawsan, and hanan kamal. "modified pso for optimal tuning of fuzzy pid controller." ijcsi international journal of computer science issues 10.2 (2013): 462-471.
[7] abdo mm, vali ar, toloei ar, arvan mr. stabilization loop of a two axes gimbal system using self-tuning pid type fuzzy controller.
isa trans 2014; 53:591–602. [8] cheng, p. c., peng, b. r., liu, y. h., cheng, y. s., huang, j. w. (2015). optimization of a fuzzy-logiccontrol-based mppt algorithm using the particle swarm optimization technique. energies, 8(6), 5338-5360. [9] vlachogiannis jg, lee ky. contribution of generation to transmission system using parallel vector evaluated particle swarm optimization. ieee trans power syst 1765–1774;205(4):20. [10] ghosh, s.; kundu, d.; suresh, k.; das, s.; abraham, a.; panigrahi, b.k.; snasel, v.;on some properties of the lbest topology in particle swarm optimization . proceedings of ninth international conference on hybrid intelligent systems, ieee computer society press. 2009 page(s):370 – 375 [11] chopra s, mitra r, kumar v. auto tuning of fuzzy pi type controller using fuzzy logic. int j comput cogn 2008; 6:12–8. [12] hussein f. soliman, abdel-fattah a. attia, mohammed a. l. badr, anas m. osman and abdul a. i. gamaleldin, “fuzzy logic controller for the electric motor driving the astronomical telescope”, proc. spie 3351, 415 (1998); doi:10.1117/12.308827 [13] attia a.-f. hierarchical fuzzy controllers for an astronomical telescope tracking (2009) applied soft computing journal, 9 (1), pp. 135-141. [14] attia, abdel-fattah, jan j., p. horáček, and f. soliman, “fuzzy control for astronomical telescope tracking”, 16th international conference on production research, prague, czech republic, 2001. [15] celestron 11/14 operating manual. celestron international co. pp. 1-26, 1992. [16] tang, k.s, man, k.f., chen, g. and kwong, s., “an optimal fuzzy pid controller”, ieee transactions on industrial electronics, 48, no. 4, pp. 757-765, 2001. [17] attia, abdel-fattah, and p. horáček. “modeling nonlinear systems by a fuzzy logic neural network using genetic algorithms.” acta polytechnica 41, no. 6 (2001). [18] attia, abdel-fattah, genetic algorithms for optimizing fuzzy and neuro-fuzzy systems. p. 107, cvut praha, (czech republic), 2002. [19] padula, f., visioli, a. (2011). tuning rules for optimal pid and fractional-order pid controllers. journal of process control, 21(1), 69-81. [20] luo, y., chen, y. q., wang, c. y., pi, y. g. (2010). tuning fractional order proportional integral controllers for fractional order systems. journal of process control, 20(7), 823-831. 278 http://dx.doi.org/10.1117/12.308827 acta polytechnica 58(5):271–278, 2018 1 introduction 2 mathematical description of the telescope model 3 pid controller 4 fuzzy logic controller (flc) 5 proposed improved particle swarm optimization algorithm (mpso) 6 adaptive fuzzy controllers using modified pso 7 results and discussions 8 conclusions references acta polytechnica doi:10.14311/ap.2013.53.0652 acta polytechnica 53(supplement):652–658, 2013 © czech technical university in prague, 2013 available online at http://ojs.cvut.cz/ojs/index.php/ap measuring supermassive black hole spins in agn laura brenneman∗ harvard-smithsonian center for astrophysics, 60 garden st., ms-67, cambridge, ma 02138 usa ∗ corresponding author: lbrenneman@cfa.harvard.edu abstract. measuring the spins of supermassive black holes (smbhs) in active galactic nuclei (agn) can inform us about the relative role of gas accretion vs. mergers in recent epochs of the life of the host galaxy and its agn. recent theoretical and observation advances have enabled spin measurements for ten smbhs thus far, but this science is still very much in its infancy. 
herein, i discuss how we measure black hole spin in agn, using recent results from a long suzaku campaign on ngc 3783 to illustrate this process and its caveats. i then present our current knowledge of the distribution of smbh spins in the local universe. i also address prospects for improving the accuracy, precision and quantity of these spin constraints in the next decade and beyond with instruments such as nustar, astro-h and future large-area x-ray telescopes. keywords: black holes, active galaxies, x-rays, spectroscopy, xmm-newton, suzaku, nustar, ngc 3783. 1. introduction measurements of the spins of supermassive black holes (smbhs) in active galactic nuclei (agn) can contribute to the understanding of these complex and energetic environments in three principal ways: • they offer a rare probe of the nature of the spacetime proximal to the event horizon of the black hole (bh), well within the strong-field gravity regime [12, 19]; • they can shed light on the relation of a black hole’s angular momentum to its outflow power in the form of winds and jets (e.g., [26, 33, 37]; • they can inform us about the relative role of gas accretion vs. mergers in recent epochs of the life of the host galaxy and its agn [3]. for these reasons, developing a theoretical and observation framework in which to measure black hole spin accurately and precisely is of critical importance to our understanding of how galaxies form and evolve over cosmic time. advances in theoretical modeling as well as observational sensitivity in the chandra/xmmnewton/suzaku era are finally producing robust constraints on the spins of a handful of smbhs. computationally, new algorithms developed within the past decade [2, 6, 9, 11] have made it possible to perform fully relativistic ray-tracing of photon paths emanating from the accretion disk close to the bh, keeping the bh spin as a variable parameter in the model. when such models are fit to high signal-to-noise (s/n) x-ray spectra from the innermost accretion disk, they yield vital physical information about the bh/disk system, including constraints on how fast – and in what direction – the bh is rotating. if spin (formally denoted a ≡ cj/gm2, where c is the speed of light, j is the bh angular momentum, g is newton’s constant and m is the mass of the bh) is known to within δa ≤ 10%, then meaningful correlations between spin and other environmental variables (e.g., jet power, history of the accretion flow) can be drawn. in this proceeding, i discuss our current knowledge of the distribution of smbh spins in the local universe and future directions of bh spin research. i begin in §2 with an examination of the spectral modeling techniques used to measure bh spin in agn. i then discuss the application of these techniques to a deep observation of ngc 3783, and the caveats that must be considered in §3. i describe the results of these and other investigations of bh spin in bright, nearby type 1 agn in §4, examining our current knowledge of the spin distribution of local smbhs and its implications. conclusions and future directions for this field of research are presented in §5. 2. measuring black hole spin in principle, there are at least five ways that the spin of a single (i.e., non-merging) bh can be measured, electromagnetically. 
all five are predicated on the assumption that general relativity provides the correct description of the spacetime near the bh, and that there is an easily-characterized, monotonic relationship between the radius of the innermost stable circular orbit (isco) of the accretion disk and the bh spin (see fig. 1). these five methods are listed below.

• thermal continuum fitting (e.g., mcclintock et al. [21]).
• inner disk reflection modeling (e.g., brenneman & reynolds [6]).
• high frequency quasi-periodic oscillations (e.g., strohmayer [34]).
• x-ray polarimetry (e.g., tomsick et al. [38]).
• imaging the event horizon shadow (e.g., broderick et al. [7]).

figure 1. radius of the isco (solid line) and of the event horizon (dotted line), in units of rg, as a function of bh spin a. spin values to the left of the dashed line indicate a bh spinning in the retrograde direction relative to the accretion disk, while spins to the right of the dashed line indicate prograde bh spin relative to the disk.

there are currently limitations in applying the last three methods listed above, and the continuum fitting method is only viable for stellar-mass bhs, due principally to the difficulty in finding agn in a thermally dominant state analogous to the high/soft state seen in black hole x-ray binaries (e.g., czerny et al. 2011). we are therefore restricted to using only the reflection modeling method for constraining the spins of smbhs in agn at this time. this method assumes that the high-energy x-ray emission (≥ 2 kev) is dominated by thermal disk emission which has been comptonized by hot electrons in the "corona," whether this structure represents the disk atmosphere, the base of a jet or some alternative geometry. some of the scattered photons will depart the system and form the power-law continuum characteristic of typical agn in x-rays. a certain percentage of the photons, however, will be scattered back down ("reflected") onto the surface of the disk again, exciting a series of fluorescent emission lines of various species ≤ 7 kev, along with a "compton hump" shaped by the fe k absorption edge at ∼ 7.1 kev and by downscattering at ∼ 20÷30 kev. the most prominent of the fluorescent lines produced is fe kα at a rest-frame energy of 6.4 kev, due largely to its high fluorescent yield. as such, the fe kα line is the most important diagnostic feature of the inner disk reflection spectrum; its shape is altered from the typical near-delta function expected in a laboratory, becoming highly broadened and skewed due to the combination of doppler and relativistic effects close to the bh (see fig. 2). the energy at which the "red" wing of this line manifests is directly linked to the location of the isco, and therefore the spin of the bh (see [23, 31] for comprehensive reviews of the reflection modeling technique).

figure 2. change in the shape of the fe kα line (photon flux in photons cm−2 s−1 kev−1 vs. energy in kev) as a function of bh spin. the black line represents a = +0.998, the red line shows a = 0.0 and the blue line shows a = −0.998.

3. applying the reflection modeling method

an agn must satisfy three important requirements in order to be a viable candidate for obtaining spin constraints.
firstly, it must be bright enough in order to be observed with the necessary s/n in x-rays to accurately separate the reflection spectrum from the continuum and any intrinsic absorption within the host galaxy. typically one must obtain ≥ 200 000 counts from 2÷10 kev [17], though in practice the required number can be substantially higher in sources with complex absorption. secondly, the agn must possess a broad fe kα line of sufficient strength relative to the continuum to allow its red wing to be successfully located; usually this corresponds to a line equivalent width of hundreds of ev (e.g., mcg–6-30-15, [6, 8, 24]). not all type 1 agn have been observed to possess such features. recent surveys of hundreds of agn with xmm-newton have concluded that broadened fe kα lines are only present in ∼ 40 % of all bright, nearby type 1 agn [10, 25], and some broad iron lines have been ephemeral, appearing and disappearing in the same object observed during different epochs (e.g., ngc 5548, [5]). thirdly, the fe kα line in question must be relativistically broad in order to be able to constrain bh spin; that is, it must have a measured inner disk edge – assumed to correspond to the isco – of rin ≤ 9 rg, where rg ≡ gm/c². taking all these points into consideration, the potential sample size of spin measurements for agn in the local universe is ∼ 30–40 sources [23]. most of these agn are type 1, lacking significant obscuration by dust and gas along the line of sight to the inner disk.

figure 3. suzaku xis-fi (front-illuminated; black crosses) and pin (red crosses) data from the 210 ks observation of ngc 3783 in 2009, ratioed against a simple power-law model for the continuum affected by galactic photoabsorption. black and red solid lines connect the data points and do not represent a model. the green line represents a data-to-model ratio of unity. data from the xis back-illuminated ccd (xis-bi) are not shown for clarity.

the reflection spectrum from the inner disk can be self-consistently reproduced by models such as reflionx [32] or xillver [14]. these models simulate not only the broad fe kα line, but all other fluorescent emission species at lower energies, as well as the compton hump at higher energies. in order to incorporate the effects of relativity and doppler shift, this static reflection spectrum must then be convolved with a smearing kernel which computes the photon trajectories and energies during transfer from the accretion disk to the observer. several free-spin smearing algorithms are currently available for use (see §1). the kerrconv algorithm of brenneman and reynolds [6] is the only one of these models that is currently built into xspec, though it limits bh spin to prograde scenarios only. a more recent improvement is the relconv model of dauser et al. [9], which generalizes the possible spins to incorporate retrograde bhs. in the following section, i describe the practicalities of using the reflionx and relconv models to measure the spin of the smbh in ngc 3783 using a 210 ks suzaku observation.

3.1. case study: ngc 3783

the type 1 agn ngc 3783 (z = 0.00973) was the subject of a deep suzaku observation from 2009 as part of the suzaku agn spin survey key project (pi: c. reynolds, lead co-i: l. brenneman).
the source was observed with an average flux of fx = 6.04 × 10−11 erg cm−2 s−1 from 2÷10 kev during the observation, yielding a total of ∼ 940 000 photon counts over this energy range in the xis instruments (s/n = 35) and ∼ 45 000 photon counts for the pin instrument from 16÷45 kev (s/n = 5), after background subtraction [4].

figure 4. a zoomed-in view of the fe k region in the 2009 suzaku observation of ngc 3783, ratioed against a simple power-law continuum. note the prominent narrow fe kα emission line at 6.4 kev and the blend of fe kβ and fe xxvi at ∼ 7 kev. xis-fi is in black, xis-bi in red.

the spectrum is shown in fig. 3 as a ratio to the power-law continuum and galactic photoabsorbing column in order to illustrate the various residual spectral features present. the compton hump is readily apparent ≥ 10 kev, though its curvature is relatively subtle compared with more prominent features of its kind (e.g., in mcg–6-30-15). the 6÷7 kev band of the spectrum is dominated by narrow and broad fe k features, including a narrow fe kα emission line at 6.4 kev and a blend of fe kβ and fe xxvi emission at ∼ 7 kev. the broad fe kα line manifests as an elongated, asymmetrical tail extending redwards of the narrow fe kα line to ∼ 4÷5 kev. the fe k region can be seen in more detail in figs. 4 and 5. at energies below ∼ 3 kev the spectrum becomes concave due to the presence of complex, ionized absorbing gas within the nucleus of the galaxy; the gas is ionized enough that some contribution from this absorber is seen at ∼ 6.7 kev in an fe xxv absorption line. there is a rollover back to a convex shape below ∼ 1 kev, however, where the soft excess emission dominates. the models described above were used by brenneman [4] to fit the 0.7÷45 kev suzaku spectrum of ngc 3783 with a statistical quality of χ²/ν = 917/664 (1.38). most of the residuals in the best-fit model manifested below ∼ 3 kev in the region dominated by the warm absorber and soft excess, as is typical for type 1 agn. because the s/n of the xis detectors is highest at lower energies due to the higher collecting area there, small residuals in the spectral modeling of this region can have an exaggerated effect on the overall goodness-of-fit.

figure 5. the broad fe kα line at 6.4 kev becomes more obvious when the two more prominent narrow emission lines are modeled out, in addition to the power-law continuum.

excluding energies below 3 kev in our fit, we achieved χ²/ν = 499/527 (0.95). no significant residuals remained. the best-fit parameters of the bh/inner disk system included a spin of a ≥ +0.98, a disk inclination angle of i = 22 (+3, −8) deg to the line of sight, a disk iron abundance of fe/solar = 3.7 ± 0.9 and an ionization of ξ ≤ 8 erg cm s−1 (errors are quoted at 90 % confidence for one interesting parameter). these parameters remained consistent, within errors, when energies ≤ 3 kev were ignored in the fit, negating the importance of the soft excess emission in driving the fit to these parameter values. by contrast, patrick et al. [27] analyzed the same data separately and reached a strikingly different conclusion regarding the spin of the bh in ngc 3783, with approximately equivalent goodness-of-fit to that obtained by brenneman [4]: a ≤ −0.04.
this discrepancy illustrates the importance of assumptions and modeling choices in influencing the derived bh spin and other physical properties of the bh/disk system. patrick et al. [27] made three critical assumptions that differed from [4]: (1.) that the iron abundance of the inner disk is fixed to the solar value; (2.) that the warm absorber has a high-turbulence (vturb = 1000 km/s), high-ionization (ξ ∼ 7400 erg cm s−1) component not reported by brenneman [4]; (3.) that the soft excess originates entirely through comptonization, with the comptonizing medium having a temperature of kt ≥ 9.5 kev and an optical depth of τ = 1.9 ± 0.1. using a markov chain monte carlo approach, reynolds et al. [29] demonstrated that a solar iron abundance is significantly detrimental to the global goodness-of-fit (δχ2 = +36) in ngc 3783 when compared with allowing the iron abundance of the inner disk to fit freely. brenneman [4] found no need to include a high-turbulence component in their fit to the suzaku data, and noted no evidence for such a component in the higher-resolution 2001 chandra/hetg data. finally, reynolds et al. [29] note that there is no physical reason to assume that the soft excess originates entirely from comptonization, as other processes within the agn might contribute (e.g., photoionized emission, scattering, thermal emission). reynolds et al. [29] attempted several different model fits to the soft excess and found not only a much smaller contribution to the overall model for the soft excess component than patrick et al. [27], but also no statistical difference between fits using different models (e.g., blackbody vs. comptt). it should be noted, however, that modeling the soft excess with a comptonization component of high temperature, high optical depth and high flux, as patrick et al. [27] have done, requires the comptt component to possess significant curvature up into the fe k band, reducing the need for the relativistic reflection to account for this same curvature seen in the data and thereby eliminating the requirement of high bh spin. clearly, different modeling approaches can lead to vastly different conclusions regarding bh spin and careful consideration should be given to the models used and their allowed parameter ranges. obtaining high s/n x-ray spectra over a broad energy range (e.g., by using nustar simultaneously with xmm-newton or suzaku) will also help break the degeneracy between models (see §5). 4. results and implications in the previous two sections we have noted the importance of both adequate data (i.e., high s/n) and a physically self-consistent modeling approach to constraining smbh spins in agn. we have also stressed the importance of one very critical assumption that must be made in order to calculate bh spin: namely, that the inner edge of the accretion disk is at the isco. if the optically-thick disk is truncated further out, then any spin derived using this assumption and the reflection modeling technique will be a lower limit. if there is some emission produced inside the isco, this will lead to a systematic error on the bh spin measurement that can be ≥ 20 % above the actual value of spin for non-spinning or retrograde bhs, but is ≤ 2 % higher than the real spin for bhs with spins a ≥ +0.9 [30]. 
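the isco–spin mapping invoked throughout this section is the standard result for circular equatorial orbits in the kerr metric (bardeen, press & teukolsky 1972). the short python sketch below reproduces the monotonic relation plotted in figure 1; it is added here purely for orientation and is not part of the analysis of [4] or [27].

```python
import numpy as np

def r_isco(a):
    """isco radius in units of r_g = GM/c^2 for dimensionless spin a
    (bardeen, press & teukolsky 1972); a < 0 denotes retrograde spin."""
    z1 = 1 + (1 - a**2)**(1/3) * ((1 + a)**(1/3) + (1 - a)**(1/3))
    z2 = np.sqrt(3 * a**2 + z1**2)
    sign = 1.0 if a >= 0 else -1.0     # prograde orbits take the minus branch
    return 3 + z2 - sign * np.sqrt((3 - z1) * (3 + z1 + 2 * z2))

for a in (-0.998, 0.0, 0.9, 0.998):
    print(f"a = {a:+.3f} -> r_isco = {r_isco(a):.2f} r_g")
# -0.998 -> ~9 r_g, 0 -> 6 r_g, +0.998 -> ~1.24 r_g: the monotonic
# relation that lets a measured inner-disk radius be converted to spin.
```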
the models currently used to represent both the accretion disk and the relativistic smearing also have their inherent limitations and uncertainties.

table 1. summary of black hole spin measurements derived from smbh spectra. data are taken with suzaku except for 1h0707−495, which was observed with xmm-newton, and mcg–6-30-15, in which the data from xmm and suzaku are consistent with each other. for references, see [4]. spin (a) is dimensionless, as defined previously. wkα denotes the equivalent width of the broad iron line relative to the continuum in units of ev. m is the mass of the black hole in solar masses, and lbol/ledd is the eddington ratio of its luminous output. host denotes the galaxy host type. host types for 1h0707−495 and swift j2127.4+5654 are unknown. the spin values of mcg–6-30-15 and ngc 3783 are disputed by patrick et al. [27].

    agn                  a                  wkα [ev]         log m              lbol/ledd          host
    mcg–6–30–15          ≥ 0.98             305 ± 20         6.65 ± 0.17        0.40 ± 0.13        e/s0
    fairall 9            0.52 +0.19/−0.15   130 ± 10         8.41 ± 0.11        0.05 ± 0.01        sc
    swift j2127.4+5654   0.6 ± 0.2          220 ± 50         7.18 ± 0.07        0.18 ± 0.03        —
    1h0707–495           ≥ 0.98             1775 +511/−594   6.70 ± 0.40        ∼ 1.0 (−0.6)       —
    mrk 79               0.7 ± 0.1          377 +47/−34      7.72 ± 0.14        0.05 ± 0.01        sbb
    mrk 335              0.70 +0.12/−0.01   146 ± 39         7.15 ± 0.13        0.25 ± 0.07        s0a
    ngc 7469             0.69 ± 0.09        91 +9/−8         7.09 ± 0.06        1.12 ± 0.13        sab(rs)a
    ngc 3783             ≥ 0.98             263 ± 23         7.47 ± 0.08        0.06 ± 0.01        sb(r)ab
    ark 120              0.94 ± 0.1         105 +26/−24      8.18 ± 0.05        0.04 ± 0.01        sb/pec
    3c 120               ≤ −0.1             48 ± 10          7.74 +0.20/−0.22   0.31 +0.20/−0.19   s0

the reflionx and xillver models both assume that the disk has a constant density and ionization structure throughout, which cannot be the case, physically. there is also some question about whether a limb-brightening vs. limb-darkening algorithm should be used to represent the directionality of the reflected emission from the disk when convolved with the smearing kernel [35]. the nature of the disk emissivity profile itself is also an active topic of research; though the disk is thought to dissipate energy as a function of radius (ε ∝ r^−q), the emissivity index (q) likely varies as a function of radius as well [40]. taking all these caveats into account, one can begin to appreciate why, to date, there are only ten different agn with published values for their smbh spins. these agn, and their properties, are listed in tab. 1. it is difficult to draw any robust statistical inferences from a sample size of ten objects. there may also be selection biases in play which make it more likely that we measure higher spin values [4]. the only pattern that is readily apparent in tab. 1 is that nine out of ten agn have relatively high, prograde smbh spins. the one retrograde spin value is for 3c 120, which is the one agn in the sample that is radio-loud. this may not be a coincidence; garofalo [16] postulated that jet power is maximized for rapidly-rotating retrograde bhs (though this idea is not without controversy, e.g., [37]). more work needs to be done to assess bh spin and jet power independently in order to prove or disprove this conjecture. if the trend toward large prograde spins continues to hold as our sample size increases, we might ultimately infer that the growth of bright, nearby agn in recent epochs has been driven primarily by prolonged, prograde accretion of gas.
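as a trivial worked illustration of the tally just quoted (nine prograde spins out of ten), the spin column of table 1 can be encoded with its limits kept explicit. this fragment was written for this text and carries no information beyond the table itself:

```python
# spin column of table 1; limits are tagged rather than treated as values
spins = {
    "mcg-6-30-15": (0.98, "lower limit"), "fairall 9": (0.52, None),
    "swift j2127.4+5654": (0.60, None),   "1h0707-495": (0.98, "lower limit"),
    "mrk 79": (0.70, None),               "mrk 335": (0.70, None),
    "ngc 7469": (0.69, None),             "ngc 3783": (0.98, "lower limit"),
    "ark 120": (0.94, None),              "3c 120": (-0.1, "upper limit"),
}
retrograde = [name for name, (a, _) in spins.items() if a < 0]
print(f"{len(spins) - len(retrograde)} prograde, "
      f"{len(retrograde)} retrograde: {retrograde}")
# -> 9 prograde, 1 retrograde: ['3c 120']
```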
if the overall distribution of smbh spins in the local universe begins to drift toward intermediate values, it is likely that the role of mergers has been more significant than that of ordered gas accretion. similarly, if the distribution tends toward low values of spin, we can infer that episodes of randomly-oriented accretion have been the dominant means of smbh and galaxy growth [3]. 5. conclusions and future directions measuring bh spin is painstaking work, even with the best data from current observatories such as xmmnewton and suzaku. long observations (c. hundreds of kiloseconds) of bright agn are needed, and multiepoch, multi-instrument data should be analyzed jointly whenever possible in order to assess the physical nature and variability of all of the components in a given x-ray spectrum. a broad energy range is also desirable in order to constrain the properties of the continuum and complex absorption, particularly, and to distinguish these components from any signatures of inner disk reflection. only by isolating the broad fe kα line and its associated compton hump can we measure bh spin with the accuracy and precision necessary to begin constructing a spin distribution for local agn. we can then begin to draw inferences regarding the dominant growth mechanism of these smbhs over cosmic time, and to understand the role of spin in jet production and agn feedback. our current sample of ten agn with measured, published smbh spins must be extended in order to accomplish these goals. the suzaku spin survey is 656 vol. 53 supplement/2013 measuring supermassive black hole spins in agn ongoing, and is expected to provide new spin constraints for 3c 120 and mrk 841 within the next year. additionally, many valuable datasets from the xmm and suzaku archives are currently being analyzed with an eye toward measuring spin (e.g., walton et al., in prep.). nustar [18] will also benefit this science, providing an invaluable high-energy (∼ 5 ÷ 80 kev) complement to xmm-newton and suzaku when used simultaneously with either observatory. accessing this high energy range with low background and high s/n will enable the continuum and absorption in the spectrum to be more accurately modeled, allowing the signatures of inner disk reflection to be isolated and yielding more accurate, precise spin constraints. in addition to nustar’s role in this work, astro-h [36], scheduled for launch in 2014, will bring the science of micro-calorimetry to x-ray astronomy with a spectral resolution of δe ∼ 7 ev over the 0.3÷12 kev range. the calorimeter will be the unique strength of this mission, enabling the broad and narrow fe k emission and absorption features to be definitively disentangled and the telltale signatures of complex intrinsic absorption to be identified and modeled correctly. in order to achieve the order-of-magnitude increase in sample size necessary to begin assessing the spin distribution of smbhs in the local universe from a statistical perspective, future large-area (≥ 1 m2) x-ray observatories are needed. proposed concepts such as ixo [39], athena [1] and the extreme physics explorer (epe) [15] would all offer the necessary collecting area and spectral resolution to extend our sample of measured smbh spins to several hundred agn using the reflection modeling method. 
loft [13] would also bring precise timing resolution into play along with large collecting area, allowing measurements of spin to be made on the orbital timescale of many agn by tracing individual hot spots in the inner accretion disk. the science of determining bh spin is very much in its infancy. though the past decade has seen great strides in our ability to constrain spin through long x-ray observations coupled with detailed spectral modeling, much work remains to be done in terms of improving the precision and accuracy of these measurements, as well as the sample size. the next decade will see an improvement in the quality of bh spin science, but a significant advance in the quantity of this work in the decades beyond will depend critically on the amount of international funding available for large-area x-ray missions. acknowledgements lb is indebted to the vulcano conference organizers for their kind invitation, and to nasa grant # nnx10ar44g, which funded her travel. references [1] barcons, x. et al. : 2012, arxiv:1207.2745 [2] beckwith, k. & done, c.: 2005, mon. not. r. astr. soc., 359, 1217 [3] berti, e. & volonteri, m.: 2008, astrophys. j., 684, 822 [4] brenneman, l. et al. : 2011, astrophys. j., 736, 103 [5] brenneman, l. et al. : 2012, astrophys. j., 744, 13 [6] brenneman, l. & reynolds, c.: 2006, astrophys. j., 652, 1028 [7] broderick, a. et al. : 2011, astrophys. j., 735, 57 [8] chiang, c.y. & fabian, a.: 2011, mon. not. r. astr. soc., 414, 2345 [9] dauser, t. et al. : 2010, mon. not. r. astr. soc., 409, 1534 [10] de le calle pérez et al. : 2010, astr. astrophys., 524, 50 [11] dovčiak, m., karas, v. & yaqoob, t.: 2004, astrophys. j. suppl., 153, 205 [12] fabian, a. et al. : 1989, mon. not. r. astr. soc., 238, 729 [13] feroci, m. et al. : 2010, soc. pho. inst. eng., 7732e, 57 [14] garcia, j. & kallman, t.: 2010, astrophys. j., 718, 695 [15] garcia, m. et al. : 2011, soc. pho. inst. eng., 8147e, 55 [16] garofalo, d.; 2009, astrophys. j., 699, 400 [17] guainazzi, m. et al. : 2006, astron. nachr., 327, 1032 [18] harrison, f. et al. : 2005, exp. astr., 20, 131 [19] laor, a.: 1991, astrophys. j., 376, 90 [20] malizia, a. et al. : 2008, mon. not. r. astr. soc., 389, 1360 [21] mcclintock, j. et al. : 2006, astrophys. j., 652, 518 [22] mchardy, i. et al. : 2005, mon. not. r. astr. soc., 359, 1469 [23] miller, j.: 2007, ann. rev. astr. astrophys., 45, 441 [24] miniutti, g. et al. : 2007, publ. astr. soc. japan, 59s, 315 [25] nandra, k. et al. : 2007, mon. not. r. astr. soc., 382, 194 [26] narayan, r. & mcclintock, j.: 2012, mon. not. r. astr. soc., 419l, 69 [27] patrick, a. et al. : 2011b, mon. not. r. astr. soc., 416, 2725 [28] peterson, b. et al. : 2004, astrophys. j., 613, 682 [29] reynolds, c. et al. : 2012, arxiv:1204.5747 [30] reynolds, c. & fabian, a.: 2008, astrophys. j., 679, 1181 657 laura brenneman acta polytechnica [31] reynolds, c. & nowak, m.: 2003, phys. reports, 377, 389 [32] ross, r. & fabian, a.: 2005, mon. not. r. astr. soc., 358, 211 [33] steiner, j. et al. : 2012, astrophys. j., 745, 136 [34] strohmayer, t.: 2001, astrophys. j., 552l, 49 [35] svoboda, j. et al. : 2010, amer. inst. phys. conf., 1248, 515 [36] takahashi, t. et al. , 2010, soc. pho. inst. eng., 7732e, 27 [37] tchekhovskoy, a. & mckinney, j.; 2012, mon. not. r. astr. soc., 423l, 55 [38] tomsick, j. et al. : 2009, arxiv:0902.4238 [39] white, n. et al. : 2010, amer. inst. phys. conf., 1248, 561 [40] wilkins, d. & fabian, a.: 2010, mon. not. r. astr. soc., 414, 1269 [41] woo, j.h. 
& urry, c.m.: 2002, astrophys. j., 579, 530
[42] zoghbi, a. et al.: 2010, mon. not. r. astr. soc., 401

1 introduction

high-speed rotary impellers are generally used for mixing low-viscosity liquids and dispersions under a turbulent regime of flow of an agitated batch [1, 2, 3]. their main advantage is that they can reduce investment costs for complex gear boxes, because their high speed involves transferring the power input from the driving motor to the impeller predominantly via the frequency of revolution at a moderate level of torque. the main types of high-speed rotary impellers suitable both for blending pure liquids and for mixing solid-liquid suspensions, gas-liquid dispersions and liquid-liquid emulsions are the standard rushton turbine impeller (fig. 1) and the pitched blade impeller (see figs. 2 and 3). their design is very well defined and their manufacture is rather simple. the power input of the standard rushton turbine impeller (srti) and of pitched blade impellers (pbi) has in many cases been studied experimentally, and the results of the experiments have been published in dimensionless form by means of similarity criteria [2] and [3]:

Po = Po(Re, geometry of the system), (1)

where the power number

Po = P/(ρ n³ D⁵) (2)

and the reynolds number (modified for the impeller)

Re = n D² ρ/μ. (3)

the geometry of the agitated system is characterised by simplexes of geometric similarity, e.g. H/T, D/T, h/T, b/T, nb, α and t/D (see list of symbols), necessary for scaling up the impeller power input from the pilot plant to industrial size. for reynolds numbers greater than ten thousand the power number is independent of this criterion and eq. (1) can be simplified into the form

Po = Po(H/T, D/T, h/T, b/T, nb, t/D). (4)

published experimental studies have dealt with a quantitative description (usually in the power form) of dependence (4) both for the srti [3, 4, 5, 6] and the pbi [6, 7, 8, 9]. the results of these studies confirm the strong effect of various simplexes of geometric similarity on the impeller power input, e.g., the relative thickness of the turbine separating disc t/D or the pitch angle of the pbi. on the other hand, the experimental studies enable us to neglect the influence of certain simplexes on the impeller power input, e.g. the vessel diameter/impeller diameter ratio T/D of the srti or the relative thickness of the blades of the pbi. the aim of this study is to compare the experimentally investigated power input of the srti and pbi determined in two mixing vessels of different sizes under a turbulent regime of an agitated liquid and to look for the influence of the main geometric characteristics of the two impellers on the impeller power input. the experimental results will be expressed by means of the above introduced dimensionless variables, and the statistically calculated power correlations will be compared with results published in the literature.

power input of high-speed rotary impellers

k. r. beshay, j. kratěna, i. fořt, o.
brůha this paper presents the results of an experimental investigation of the power input of pitched blade impellers and standard rushton turbine impellers in a cylindrical vessel provided with four radial baffles at its wall under a turbulent regime of flow of an agitated liquid. the influence of the geometry of the pitched blade impellers (pitch angle, number of blades) and the off-bottom impeller clearance of both high-speed impellers tested on the impeller power input is determined in two sizes of the cylindrical vessel (0.3 m and 0.8 m diameter of vessel). a strain gauge torquemeter is used in the small vessel and a phase shift mechanical torquemeter is used in the large vessel. all results of the experiments correspond to the condition that the reynolds number modified for the impeller exceeds ten thousand. the results of this study show that the significant influence of the separating disk thickness of the turbine impeller corresponds fairly well to the empirical equations presented in the literature. both the influence of the number of impeller blades and the blade pitch angle of the pitched blade impeller were expressed quantitatively by means of the power dependence of the recently published correlations: the higher the pitch angle and the number of blades, the higher the values of the impeller power input. finally, it follows from results of this study that the impeller off-bottom clearance has a weak influence on the power input of the rushton turbine impeller, but with decreasing impeller off-bottom clearance the power input of the pitched blade impeller increases significantly. keywords: impeller power input, rushton turbine, pitched blade impeller. 60° l w d fig. 1: sketch of rushton turbine impeller (czech standard cvs 691021) d/t = 1/3, w/d = 0.2, d1/d = 0.75, l/d = 0.25 2 experimental the power input of the srti and the pbi was investigated through measurements in two experimental test rigs having the same geometrical parameters as shown in fig. 3. the first (larger) test rig, shown in fig. 4, had a flat-bottom cylindrical vessel 0.8 m in diameter, made of perspex and equipped with four equally-spaced baffles mounted on the wall and with a width of b = 0.1t. the working fluid used throughout the measurements was water, at a temperature of 20 °c, whose height h was equal to the vessel diameter t. the torque was measured by a phase-shift mechanical torquemeter whose output, in the form of an electrical signal, was fed to a data acquisition system. the torquemeter was mounted, under the variable-speed driving motor, above the tank. the rotational speed was measured by means of a photoelectric cell and a notched disc mechanism. in the smaller-size test rig shown in fig. 5, a flat-bottom, cylindrical vessel with four baffles at its wall, was again used. the vessel had a diameter of t = 0.3 m and was filled with water at temperature 20 °c to a height of h = t. a strain gauge torquemeter, mounted on the motor shaft, was used to measure the torque, and the signal was fed to a data acquisition system. the rotational speed was measured, as in the case of the larger-size test rig, by a photoelectric mechanism. table 1 shows the geometrical details of the impellers used in this study. for the large-vessel test rig three impellers were investigated, namely: a standard rushton turbine impeller (srti), a 6-blade pitched blade impeller with blade angle � = 45° (6pbi45) and a 4-blade pitched blade impeller with � = 45° (4pbi45). 
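the reduction of the raw torque measurements to the dimensionless form of eqs. (2) and (3) is straightforward; the following python fragment is a minimal sketch of it, assuming water at 20 °c (ρ ≈ 998 kg/m³, μ ≈ 1.0 × 10⁻³ pa·s). the torque value in the example is hypothetical, chosen only to give a power number of the order reported below for the srti; this is not the authors' data-processing code.

```python
import math

RHO, MU = 998.0, 1.0e-3   # water at 20 degC: density [kg/m3], viscosity [Pa.s]

def power_number(torque, n, d, rho=RHO):
    """po = p/(rho n^3 d^5), eq. (2), with shaft power p = 2*pi*n*torque."""
    return 2.0 * math.pi * n * torque / (rho * n**3 * d**5)

def reynolds(n, d, rho=RHO, mu=MU):
    """impeller reynolds number re = n d^2 rho / mu, eq. (3)."""
    return n * d**2 * rho / mu

# hypothetical reading from the small rig: srti with d = 0.100 m at 500 rpm
n = 500.0 / 60.0          # rotational speed [1/s]
m_net = 0.60              # net torque after subtracting passive torque [N.m]
print(f"re = {reynolds(n, 0.100):.3g}")             # ~8.3e4, turbulent regime
print(f"po = {power_number(m_net, n, 0.100):.2f}")  # ~5.4, srti-like value
```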
for the small-vessel test rig, two types of impellers were investigated. these were the rushton turbine impeller (srti) and pitched blade impellers with the following geometry: 6 blades and pitch angle � = 45° (6pbi45), 4 blades, � = 45° (4pbi45), 4 blades, � = 30° (4pbi30), 4 blades, � = 20° (4pbi20), 3 blades, � = 45° (3pbi45), 3 blades, � = 35° (4pbi35) and 3 blades, � = 24° (4pbi24). all had a relative diameter d/t = 0.33. all experiments were made at three levels of off-bottom clearance, namely: h/t = 0.2, 0.35, 0.5 except for the srti, where measurements were made at only two off-bottom clearances: h/t = 0.33 and 0.5. the off-bottom clearance was measured from the bottom of the vessel to the lower edge of the impeller using a ruler, with a precision of ± 1 mm. the error in measuring the blade angle of the pitched blade impellers can be considered ± 1°. a sufficiently wide range of reynolds number was obtained by varying the rotational speed of the impeller (n). this quantity was kept constant during measurements to within ± 2 rpm. therefore, the maximum experimental error © czech technical university publishing house http://ctn.cvut.cz/ap/ 19 acta polytechnica vol. 41 no. 6/2001 w w w w d d d /2 � � � a) b) c) fig. 2: sketch of pitched blade impellers (czech standard cvs 691020) d/t = 1/3, w/d = 0.2, � = 45o h b n h d t fig. 3: geometrical parameters of the agitated vessel. (h/t = 1, b/t = 0.1, d/t = 1/3) encountered in measuring speed reached 2 % for low reynolds numbers ( n = 200 rpm), while this error fell to less than 0.25 % for higher reynolds numbers ( n = 1000 rpm). the main factors affecting the accuracy of the impeller power input measurements were the static and dynamic frictions. the static friction effect was reduced to its minimal value by using a proper bearing and driving motor. the most dominant factor, i.e., the dynamic friction, was closely monitored during the experiment to ensure that steady state values of the torque were recorded. for this purpose, the motor was allowed to run before each set of experiments at high speed (1000 rpm) for one hour, in order to reach the working temperatures of the bearings and motor. the motor was then set to the lowest speed for the particular impeller and allowed to run for 5 minutes till steady state conditions prevailed. passive torque (torque readings without installing the impeller) was then recorded for 3 minutes. then the speed was increased by an increment of 40 rpm and the passive torque was recorded again for 3 minutes, and so on until the maximum allowable rotational speed was reached. the motor was then stopped and the impeller was attached and positioned at an off-bottom clearance level. the motor started once again at 20 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 41 no. 6/2001 fig. 4: layout of the large size test rig impeller t [mm] d [mm] d/t l/d w/d d1/d nb � [deg] large vessel stri 2.50 260 0.33 0.25 0.2 0.75 6 90 6pbi45 – 255 0.32 – 0.2 – 6 45 4pbi45 – 260 0.33 – 0.2 – 4 45 small vessel srti 1.55 100 0.33 0.25 0.2 0.75 6 90 6pbi45 – 100 0.33 – 0.2 – 6 45 4pbi45 – 100 0.33 – 0.2 – 4 45 4pbi30 – 100 0.33 – 0.2 – 4 30 4pbi20 – 100 0.33 – 0.2 – 4 20 3pbi45 – 100 0.33 – 0.2 – 3 45 3pbi35 – 100 0.33 – 0.2 – 3 35 3pbi24 – 100 0.33 – 0.2 – 3 24 table 1: geometrical specifications of impellers the lowest measuring speed and was left for 5 minutes before recording the torque. 
torque data was then collected for 5 minutes, and three collections were recorded at each speed and off-bottom clearance to obtain the average power input. the passive torque was measured once again at the end of each speed range. this experimental technique was repeated for each impeller and at each off-bottom clearance for both test rigs. the average power number of each impeller across a speed range was then calculated. the average relative standard deviation was calculated and found to be in the range 2.3–16 %. this experimental methodology was adopted to completely eliminate the dynamic resistance of the equipment. 3 results and discussion the mean power number defined as the average power number across the speed range of the impeller (from low speed to the speed at which aeration occurs) was measured by the large vessel test rig. the results for the three impellers investigated are reported in table 2, at the above mentioned off-bottom clearances. the corresponding results obtained using the small vessel test rig are given in table 3 for the eight impellers studied. the experimental results for the srti obtained by the large test rig where the reynolds number was in the range 1.5�105 < re < 3�105 and those obtained using the small test rig in the range 3�104 < re < 6�104 show more or less constant power input over reynolds number range. only a 4 % change of po is observed in the recorded date, in accordance © czech technical university publishing house http://ctn.cvut.cz/ap/ 21 acta polytechnica vol. 41 no. 6/2001 fig. 5: layout of the small size test rig impeller power input measured at srti h/t = 0.33 h/t = 0.5 6.51 6.32 h/t = 0.2 h/t = 0.35 h/t = 0.5 6pbi45 1.63 1.52 – 4pbi45 1.48 1.31 1.20 table 2: average values of po ( t = 0.8 m) impeller power input measured at h/t = 0.33 h/t = 0.5 srti 5.41 5.44 h/t = 0.2 h/t = 0.35 h/t = 0.5 6pbi45 1.84 1.69 1.55 4pbi45 1.44 1.28 1.22 4pbi30 0.71 0.64 0.59 4pbi20 0.32 0.29 0.28 3pbi45 1.16 1.04 0.97 3pbi35 0.75 0.69 0.65 3pbi24 0.39 0.36 0.33 table 3. average values of po ( t = 0.3 m) with [3], although a change of 10 % [5] and even larger changes [6] have been reported. another significant result obtained from tables 2 and 3 is that the power input of srti varies very slightly (almost constant) with off-bottom clearance. only a change of 3 % is observed in po due to change of c/t from 0.33 to 0.5. this agrees well with the literature [3, 9]. now, the power inputs of the present study are compared with those obtained from the empirical correlation proposed by bujalski et al [4], relating the power number to the relative blade thickness ( t/d) and the vessel diameter, via: po t d t t � � � � � � � � � � 2 512 0 195 0 0 063 . . . , (5) where t0 = 1 m. good agreements are obtained, e.g., for the large test rig po = 6.2 compared to an averaged value of 6.4 of the present study, and for the small test rig the agreement is even better, i.e., po = 5.3 compared for 5.425 for the present work. similar findings were also reported by rutherford et al [5]. the effect of the off-bottom clearance was investigated in the present work for the pitched blade impellers in both the large and small vessel test rigs. it is clear from the values reported in tables 2 and 3 that po decreases, as expected, with increasing h/t, in agreement with previous investigation [7]. as the pitched blade impellers pump the flow downwards, it creates a region with large turbulence underneath it. 
for the smaller bottom clearances, the fluid velocities in the region bounded by the vessel bottom and the impeller are higher, and thus higher friction and a higher rate of mechanical energy dissipation are generated, which leads to higher power numbers. for higher off-bottom clearances, lower turbulence leads to lower power numbers. the effect on the power input of the number of blades of the pitched blade impeller at α = 45° and the effect of changing α at nb = 4 and 3 were also investigated in the present work at different off-bottom clearances. the power input data are depicted in table 3. these and other data for the pitched blade impeller were used to obtain a correlation in the form:

Po = 0.996 nb^0.682 (h/D)^−0.178 (sin α)^1.995. (6)

this formula correlates the power input of the pitched blade impeller to the different parameters investigated. the average error (the difference between the measured values and those obtained from the correlation) was found to be 1.49 %. this correlation can be compared with a similar general correlation found by medek [7], given in the form:

Po = 1.507 nb^0.701 (h/D)^−0.165 (T/D)^−0.365 (H/T)^−0.140 (sin α)^2.077, (7)

which was reduced, according to the present geometry, i.e., T/D = 3 and H/T = 1.0, to the form:

Po = 1.0092 nb^0.701 (h/D)^−0.165 (sin α)^2.077. (8)

a close inspection of eq. (6) and eq. (8) reveals that the present results correlate well with those of medek [7]. in fact the two correlations are almost identical, since the variations in the constant and in the various exponents are negligibly small.

4 conclusions

the power input of the pitched blade impellers and the standard rushton turbine impellers was investigated using two cylindrical vessels of different diameters. the effect of several parameters, including the vessel diameter, the off-bottom clearance, the number of blades of the pitched blade impellers and the pitch angle, was studied. it has been found, for the srti, that Po remains almost constant over the tested reynolds number range and that it also remains constant at off-bottom clearances greater than 1/3. a correlation relating the power input of the pitched blade impeller to the number of blades, the off-bottom clearance and the pitch angle was obtained from the measured data, compared with the corresponding correlation from the literature, and good agreement was obtained.

acknowledgement

this research has been subsidised by research project of the ministry of education of the czech republic j04/98: 212200008.

list of symbols

b    baffle width, m
D    impeller diameter, m
h    off-bottom clearance, m
H    total liquid depth, m
l    width of rushton turbine discs, m
n    rotational speed, rps
nb   number of blades
P    power required by the impeller, W
Po   power number
Re   reynolds number
t    thickness of rushton turbine disk, m
T    vessel diameter, m
w    height of blades, m
α    pitch angle, °
μ    dynamic viscosity of agitated liquid, pa·s
ρ    density of agitated liquid, kg/m³

references

[1] holland, f. a., chapman, f. s.: liquid mixing and processing in stirred tanks. new york, reinhold publ. corp., 1966
[2] nagata, sh.: mixing principles and applications. tokyo, new york, kodansha ltd., john wiley and sons, 1975
[3] strek, f.: mixing and mixing equipment (in czech). prague, sntl, 1997
[4] bujalski, w., nienow, a. w., chatwin, s., cooke, m.: the dependency on scale of power numbers of rushton disk turbine. chem. eng. sci., 42 (2), 1987, pp. 317–326
[5] rutherford, k., mahmoudi, s. m. s., lee, k.
c., yianneskis, m.: the influence of rushton impeller blade and disk thickness on the mixing characteristics of stirred vessel. trans. inst. chem. engrs., 74 (part a), 1996, pp. 369–378
[6] chapple, d., afacan, a., kresta, s. k.: the effect of impeller and tank geometry on power number for a pitched blade turbine. trans. inst. chem. engrs, (in press)
[7] medek, j.: power characteristics of agitators with flat inclined blades. int. chem. eng., 20 (4), 1980, pp. 664–672
[8] medek, j.: hydrodynamic characteristics of pitched blade impellers (in czech). technika v chemii, 74, 1991, pp. 22–28
[9] ibrahim, s., nienow, a. w.: power curves and flow patterns for a range of impellers in newtonian fluids: 40

next by solving (f²(x) − x)/(f(x) − x) = 0, we obtain a quadratic equation, from which

x1,2 = ((a + 1)/2a) [1 ± √((a − 3)/(a + 1))], a > 3.

now we anticipate that (f³(x) − x)/(f(x) − x) = 0 is a cubic equation, from which 3 roots are to be found. we now construct Q(x) ≡ (f³(x) − x)/(f(x) − x) = 0 in a new variable t = ax for simplification [5]:

Q(t) = t⁶ − (3a + 1)t⁵ + (3a² + 4a + 1)t⁴ − (a³ + 5a² + 3a + 1)t³ + (2a³ + 3a² + 3a + 1)t² − (a³ + 2a² + 2a + 1)t + (a² + a + 1). (2)

to our surprise it is a sextic equation. since there are no known formulas for solving any equations of degree equal to or higher than 5, at first glance we seem to have arrived at a brick wall. perhaps Q(t) is not a general sextic equation but a special one. if so, it might be solvable in some special way. if ek denotes an equation of degree k, e6 would be solvable at least in principle if it could be expressed as e2 × e2 × e2 or e3 × e3. looking at (2), we see at once that Q(t) must be real if t is real, since its coefficients are all real. this means that the 6 roots can only be either 3 pairs of complex conjugates or two sets of 3 real roots, corresponding to two solvable forms for Q(t). this impels us to look for an indication that Q(t) is not general. if (2) is expressed as

Q(t) = Σ_{k=0}^{6} (−1)^k δ_k t^k, (3)

we find that Δ ≡ Σ_{k=0}^{5} (−1)^k δ_k = 0, referred to as the delta sum rule. this sum rule is a clear indication that Q(t) is not a general sextic equation.

4. solving the sextic equation: preliminaries

if a = 0 in (2), which still satisfies the delta sum rule, the roots are exp(±iπ/7), exp(±i3π/7) and exp(±i5π/7), 3 pairs of complex conjugates lying on the unit circle. if the roots are to be real, none of them may lie on the negative real axis of t, since the coefficients of t in odd powers have '−' signs. the six roots must all lie on the positive real axis. if a → ∞, one of the roots goes as 1/a asymptotically. thus as a increases from zero, the three pairs of complex roots must move toward the positive real axis. at some value of a = ã, say, each complex-conjugate pair becomes two identical real roots simultaneously. thereafter each degenerate real root splits into two real roots. this overall behavior is consistent with the two solvable forms for Q(t). we cannot yet know, however, whether ã < 4 or > 4. the existence of 3-cycle in the logistic map requires that ã be less than 4 and also that at least one set of real roots lies in the interval t = (0, a), or x = (0, 1). otherwise, 3-cycle does not exist in it. if a > ã, there must be two sets of 3 positive real roots. hence, we shall write Q(t) = q(t) × q′(t), where q and q′ are cubic polynomials with (as yet undetermined) roots, resp., t_k and t′_k, k = 1, 2, 3.
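the root behaviour described above (three complex-conjugate pairs for small a, six positive real roots above the transition) is easy to confirm numerically. the fragment below, written for this text as an illustration only, builds the coefficients of Q(t) from eq. (2) and, for a value above the transition, checks that the corresponding x = t/a values are genuine 3-cycle points of the logistic map:

```python
import numpy as np

def sextic_coeffs(a):
    """coefficients of Q(t) from eq. (2), highest degree first."""
    return [1.0,
            -(3*a + 1),
            3*a**2 + 4*a + 1,
            -(a**3 + 5*a**2 + 3*a + 1),
            2*a**3 + 3*a**2 + 3*a + 1,
            -(a**3 + 2*a**2 + 2*a + 1),
            a**2 + a + 1]

for a in (3.0, 3.9):
    roots = np.roots(sextic_coeffs(a))
    print(f"a = {a}: real roots ->", np.sum(np.abs(roots.imag) < 1e-8))
# a = 3.0 gives 0 real roots (three conjugate pairs); a = 3.9 gives 6.

a = 3.9
f = lambda x: a * x * (1 - x)
x = np.sort(np.roots(sextic_coeffs(a)).real) / a    # back to x = t/a
print("all in (0, 1):", bool(np.all((x > 0) & (x < 1))))
print("f^3(x) = x on all six:", np.allclose(f(f(f(x))), x, atol=1e-8))
```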
thus one can express q(t) as

q(t) = t³ − αt² + βt − γ, (4)

where α = t₁ + t₂ + t₃, β = t₁t₂ + t₂t₃ + t₃t₁, γ = t₁t₂t₃, and q′(t) similarly, with primes written on. they (α, β, γ) will be referred to as trigonals. to understand why Q is special, we look for some structural relations between the trigonals of the same kind contained in the structure of the logistic map. indeed we find that [7]

α − β + γ = 0 and α′ − β′ + γ′ = 0,

together called the trigonal relation. to no surprise, the trigonal relation implies the delta sum rule. in addition, β = (a + 1)α − (a² + a + 1) and γ = aα − (a² + a + 1), leaving only α and α′ independent, i.e., only two unknowns in the problem. the transition condition e2 × e2 × e2 = e3 × e3, together with the trigonal relation, yields σ = 0, where σ = (a² − 2a − 7)^{1/2}. hence ã = 1 + √8 = 3.828427125 · · · is the transition value. since ã < 4, 3-cycle can exist in the logistic map. there are two sets of 3 positive real roots, or two forms of 3-cycle, that had not been implied by the theorem of sharkovskii, but that evidently exist in the logistic map. unless their relationship is found, the two cubic equations are still left with one unknown each, α or α′. how to distinguish the two remains an obstacle to the final solution.

5. solving the sextic equation: an internal degree of freedom

if there are two forms of 3-cycle, they must be distinguishable by some internal degree of freedom in q and q′. it cannot appear in Q, meaning that it cannot appear in the δ's, the coefficients of t in Q, e.g. δ₅ = 3a + 1 = α + α′. since the two forms must be equivalent, they are like two states of parity, which could be represented by a double-valued function. for 3-cycle in the logistic map it happens to be σ = (a² − 2a − 7)^{1/2} [7]. thus, we shall assume

α = δ₅/2 + kσ, (5)
α′ = δ₅/2 − kσ, (6)

where k = 1/2 yields all the coefficients (δ's) exactly. therewith we finally obtain:

α = (3a + 1 + σ)/2, (7)
β = (a² + 2a − 1 + (a + 1)σ)/2, (8)
γ = (a² − a − 2 + aσ)/2, (9)

and α′, β′, γ′ by taking −σ in (7)–(9). hence, if q = q(σ), then q′ = q(−σ), and it is sufficient to solve only one of them. even beforehand, it is possible to prove that t_k and t′_k, k = 1, 2, 3, all lie in the interval (0, a). the proof goes as follows. let t_k = a − ε. if ã ≤ a ≤ 4, t_k > 0 as already proved; hence a − ε > 0. this gives ε < a, an upper bound on ε. to determine a lower bound on ε, we consider the trigonal relation with t_k = a − ε for any k = 1, 2 or 3. by (7)–(9) we find that

α − β + γ = −r(ε)/(a − ε), (10)

where

r(ε) = ε³ − Aε² + Bε − 1, (11)

and A and B are real and positive if ã ≤ a ≤ 4. the trigonal relation requires that r(ε) = 0. this is possible if and only if ε > 0, which gives a lower bound on ε. thus 0 < ε < a, and this implies that t_k = a − ε must lie in the interval t = (0, a). this proof agrees with the cubic solutions obtained from q = 0 and q′ = 0. we conclude that the 3-cycle exists continuously in the logistic map from a = ã to a = 4. throughout this domain there exists chaos, because there are, according to the theorem of sharkovskii, infinitely many cycles of all kinds.

6. conclusion: aleph cycle and chaotic trajectories

the theorem of sharkovskii asserts that if 3-cycle exists in a domain, all other cycles exist in that domain. for the logistic map we take this to mean that there are infinitely many cycles of all varieties in the domain ã ≤ a ≤ 4. thus, in this domain the fixed points form a dense set of points in x = (0, 1), a strip made up of irrational points of all different kinds.
it has a finite measure. if $x_1$ is an initial value that belongs to this set of points of finite measure, $f^n(x_1) = x_{n+1}$, $n \to \infty$, termed an aleph cycle. evidently an aleph cycle describes a chaotic trajectory. if $x_0$ is an initial value that does not belong to this set, and thus belongs to a set of points of measure 0, then $f^m(x_0) = x_0$, $m < \infty$, a finite cycle, giving a periodic trajectory. this assertion can be justified for some special values of $a$ such as $a = 4$, for which one can obtain a large number of fixed points from which a set of points of finite measure can be constructed [6]. an aleph cycle has a very close connection to an ergodic trajectory in statistical mechanics according to the ergometric theory of the ergodic hypothesis [8]. it is thus possible to gain via an aleph cycle an insight into the ergodic theory of chaos that has been proposed [9]. as we stated at the outset, while it would be nearly impossible to study chaos analytically, the theorem of sharkovskii gives a window through which something can be done analytically, as we have described in this short note.

acknowledgements
i thank dr miloslav znojil of the doppler institute for mathematical physics and applied mathematics, rez, czech republic, for suggesting that i write this article based on a talk presented at vila lanna, prague, in october 2013.

references
[1] b. hu, phys. rep. 91, 234 (1982) doi:10.1016/0370-1573(82)90057-6
[2] d. gulick, encounters with chaos (mcgraw-hill, ny 1992).
[3] a. n. sharkovskii, ukr. math. z. 16, 61 (1964).
[4] t.-y. li and j. a. yorke, am. math. monthly 82, 985 (1975).
[5] m. h. lee, acta phys. pol. b 42, 1071 (2011).
[6] m. h. lee, acta phys. pol. b 43, 1053 (2012).
[7] m. h. lee, acta phys. pol. b 44, 925 (2013).
[8] m. h. lee, phys. rev. lett. 87, 250601 (2001) doi:10.1103/physrevlett.87.250601; phys. rev. lett. 98, 110403 (2007) doi:10.1103/physrevlett.98.110403
[9] j.-p. eckmann and d. ruelle, rev. mod. phys. 57, 617 (1985) doi:10.1103/revmodphys.57.617

acta polytechnica 53(6):906–912, 2013, doi:10.14311/ap.2013.53.0906

cfd simulation of a stirred dished bottom vessel

petr vlček, jan skočilas∗, tomáš jirout
czech technical university in prague, faculty of mechanical engineering, department of process engineering, technicka 4, 166 07 prague 6, czech republic
∗ corresponding author: jan.skocilas@fs.cvut.cz

abstract. this paper deals with simulation of the fluid flow in a stirred curved-bottom vessel equipped with an impeller with three curved blades. the power number and the impeller flow rate number are dimensionless characteristics of the system determined from simulation results and compared with relevant experimental data or data from the literature. the model of the system was created in the conventional gambit and fluent programs. the system is solved for two designs — for an unbaffled vessel, and for a baffled vessel.
the vessel is filled with water and the impeller speed is 100 min−1. three turbulent models were used for the solution: k-ε, k-ω and rsm. the results were compared with experimental data or data from the literature. the k-ε model had the smallest demands on processor time, and the results compared satisfactorily with the experimental data. the model provides comprehensive information about the characteristics of the system. keywords: cfd, fluent, mixing, curved bottom, stirred vessel. 1. introduction the main requirements to be met by an engineer specializing in the design of new equipment or in analyzing and optimizing existing equipment are to minimize the time taken to solve the assigned problem and to minimize the cost of solving the problem. when there is no contemporary equivalent of the required equipment, and when design verification experiments are too expensive, a cheap and quick alternative is to make simulations of the processes and equipment. mathematical models based on fundamental physical principles, and even on empirics that can predict the behaviour of the system with sufficient (engineering) accuracy, provide a modern and robust tool for developers and for other users. however, the results of the numerical simulations generated by mathematical models need to be verified experimentally or by approximate analytical calculations. if the prediction corresponds with the experimental data or with calculations, the model of the virtual equipment becomes a powerful tool that enables various options of structural or technological modifications of the equipment to be checked easily, without expensive prototyping and experimentation. an advantage of applying a verified equipment model is that the design of a series of pieces of equipment of the same type but with various dimensions can be enhanced. this is referred to as scale up, and is a typical example of the production program of engineering companies that specialize in a single specific branch of industry. this study deals with simulation of the flow in stirred vessels. the objectives of the paper are to determine complex dimensionless characteristics — power number and impeller flow rate number — by carrying out investigations of the velocity field within the stirred vessel using a simulation of the equipment. the aim is to propose a model that will predict the value of dimensionless characteristics with sufficient accuracy (in comparison with experimental data). then this model can be used to determine the characteristics of the system, e.g. with a different rotation speed or with a different distance of the impeller from the vessel bottom. the system will be solved for two designs — for an unbaffled vessel and for a baffled vessel. the vessel is filled with water, and the impeller speed is set to 100 min−1. 2. problem analysis in terms of flow simulation, a stirred vessel is a special case of the flow problem (without any inlets or outlets), and it requires a slightly different approach from usual types of simulated processes, e.g. fluid flow in a channel, body circumfluence, or mixing of two fluids in a pipe or in a nozzle. in our case, it is necessary to ensure the movement of the impeller in the tank, and to adjust the boundary conditions of the problem (the volume of liquid is constant; the system has no inflows or outflows). our task is to determine the torque acting on the blades of the impeller and the impeller flow rate, the power number and the impeller flow rate number. 
these parameters can be defined by equations (1) and (2). power number:

$$Po = \frac{2\pi M_k}{\rho n^2 d^5}, \quad (1)$$

where $M_k$ is the torque acting on the impeller [N m], $\rho$ is the density of the liquid [kg m$^{-3}$], $n$ is the rotation speed of the impeller [s$^{-1}$], and $d$ is the diameter of the impeller [m].

  D        H/D   H2      D/d   h/d    t/d    d        R       h         t
  300 mm   1     15 mm   2     0.15   0.03   150 mm   50 mm   22.5 mm   5 mm

table 1. geometrical parameters and dimensions of the vessel and the impeller with curved blades.

figure 1. scheme and geometry of the impeller and the vessel.

impeller flow rate number:

$$N_Q = \frac{Q}{n d^3}, \quad (2)$$

where $Q$ is the impeller pumping capacity of the fluid [m³ s⁻¹]. in terms of the mathematical formulation, only the navier-stokes equations together with the continuity equation need to be resolved for the flow of a newtonian fluid in a vessel. the flow regime of the liquid in the vessel has to be verified with respect to the specified parameters; a simple calculation of the reynolds number will determine the flow regime. reynolds number for stirred vessels:

$$Re = \frac{n d^2 \rho}{\mu} \approx 37500, \quad (3)$$

where the working fluid is water, $\mu$ is the dynamic viscosity [Pa s], and the impeller speed is 100 min⁻¹. the calculated reynolds number value indicates that the flow regime is turbulent. we will use the maximum simplification of the model that is compatible with obtaining relevant results comparable with the experimental data. we will consider a 3d geometric model for the steady flow regime. due to the reynolds number value, we consider a turbulence model for turbulent flow. we use rans models for a quick analysis of the problem and to obtain the required outputs (the global dimensionless characteristics of the system). for the same reason, we will simulate the interaction between the stirrer and the vessel wall using the mrf approach. we also neglect the formation of the central vortex at the water level, because the rotation speed of the impeller, 100 min⁻¹, is low, and the vortex that forms is minimal (verified by experiment). thus, we use symmetry for the water level boundary condition.

figure 2. final layout of the geometric model with the sub-volumes and a grid for the unbaffled vessel.

3. model geometry and grid

the basic geometry of the model was created in the autodesk inventor 3d modeller. the unbaffled vessel, the baffled vessel and the impeller geometry were created separately. the impeller (manufactured by the tenez company) has three curved blades with rounded edges. the baffled vessel is equipped with four baffles. the vessel has a curved bottom. the basic dimensions are shown in figure 1 and table 1. the separate models were combined into the required model in the gambit 2.4 program [6]. the resulting volume was subsequently divided into rotating and stationary parts, as required by the mrf method. the model was further decomposed into sub-volumes to form an unstructured grid, see figure 2, which completely describes our stirred vessel. the model contains 2.5 million cells for the system without baffles and 3.1 million cells for the system with baffles.

4. processor set up and calculation

the solutions for the two cases with the unbaffled vessel and the baffled vessel were carried out in fluent v14.0 [5]. the solver was adjusted according to table 3.
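before turning to the run statistics, a quick numeric sanity check of the dimensionless groups (1)–(3) at the operating point above may be useful. this is a minimal sketch of mine: the water properties are assumed textbook values, since the paper does not list them, and the torque is back-computed from the measured power number quoted later in tables 4–5 rather than taken from any source.

```python
import math

d = 0.150                # impeller diameter [m], table 1
n = 100.0 / 60.0         # rotation speed: 100 min^-1 -> [s^-1]
rho, mu = 998.0, 1.0e-3  # water density [kg m^-3], viscosity [Pa s] (assumed)

def reynolds(n, d, rho, mu):
    """eq. (3): mixing reynolds number."""
    return n * d**2 * rho / mu

def power_number(Mk, rho, n, d):
    """eq. (1): Po from the torque Mk [N m] acting on the impeller."""
    return 2.0 * math.pi * Mk / (rho * n**2 * d**5)

def flow_rate_number(Q, n, d):
    """eq. (2): NQ from the impeller pumping capacity Q [m^3 s^-1]."""
    return Q / (n * d**3)

print(f"Re = {reynolds(n, d, rho, mu):.0f}")   # ~3.7e4, turbulent, cf. eq. (3)

# torque implied by the measured unbaffled power number Po = 0.46
Mk = 0.46 * rho * n**2 * d**5 / (2.0 * math.pi)
print(f"Mk = {Mk*1e3:.1f} mN m -> Po = {power_number(Mk, rho, n, d):.2f}")
```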
the final number of iterations of the calculation for k-ε and k-ω was approximately 15000 for the first calculation with lower accuracy settings, and then 20000 iterations for the calculation with higher accuracy settings. the average computation time was 48 hours when using eight parallel computing threads. for rsm, the total number of iterations was 50000 and the computation time was four days. a total of six models were calculated — the unbaffled vessel with three turbulence models, and the baffled vessel with three turbulence models. the power number was calculated from the torque acting on the impeller. the torque was determined directly from the fluent software using the in-built function which calculates the torque from the shear stress acting on a user-selected surface. the torque acting on the shaft was neglected due to the small radius of the shaft. the impeller flow rate number was evaluated from the impeller pumping capacity, determined by integrating the velocity field on a surface located above (or below) the impeller (the radial flow rate is neglected). this surface is circular in shape, with a radius one millimetre greater than that of the impeller, and it was formed together with the geometry and the mesh.

  unbaffled vessel                         baffled vessel
  element type   elements   proportion     element type   elements   proportion
  hexahedron     611000     24.7 %         hexahedron     1032000    33.2 %
  pyramid        37000      1.5 %          pyramid        45000      1.5 %
  tetrahedron    1822000    73.8 %         tetrahedron    2027000    65.3 %
  summary        2471000    100 %          summary        3104000    100 %

table 2. number of elements in the grid of the stirred vessel model.

  flow regime                                   turbulent
  turbulence models                             k-ω, k-ε, rsm
  fluid in the stirred vessel                   water
  boundary conditions (rotation)                mrf, rotation speed 100 min⁻¹
  under-relaxation factors                      default
  discretization — pressure                     second order
  discretization — momentum                     second order upwind
  discretization — turbulent kinetic energy     second order upwind
  discretization — turbulent dissipation rate   second order upwind
  discretization — reynolds stresses            second order upwind
  residuals                                     10⁻⁴

table 3. solver setting for the stirred vessel.

5. simulation results — postprocessor

this section presents the results for the two types of stirred vessel with an agitator for the various turbulence models. first, we concentrate on comparing the distributions of the monitored quantities in the cross-sections of the vessel for first and second order accuracy. the comparison of the solution results for the unbaffled stirred vessel using the k-ε model is successively shown in figures 3 to 6. the results of the calculation with first order accuracy are always shown on the left side, and the results for second order accuracy on the right side. there is a comparison of the static pressure contours in the vessel and the velocity magnitude contours. the figures show that when a higher order of accuracy is used, the contours of the parameters are smoother and some instability fields are suppressed, see for example the velocity distribution in the vessel viewed in vertical section in figure 5, e.g. the area in the vicinity of the wall at the transition between the cylindrical part and the arched part of the vessel. three turbulence models were used for the solution. the results are compared in the following figures for the contours of the velocity magnitude. all figures are presented for a calculation of second order accuracy.
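the pumping-capacity evaluation just described (axial velocity integrated over a circular surface slightly larger than the impeller, radial flow neglected) is easy to reproduce outside fluent once a radial profile of the axial velocity has been exported. a minimal sketch of mine; the profile below is synthetic, only the integration step is meaningful:

```python
import numpy as np

def pumping_capacity(r, uz):
    """Q = 2*pi * integral of uz(r)*r dr over the circular evaluation
    surface above (or below) the impeller; radial flow is neglected."""
    return 2.0 * np.pi * np.trapz(uz * r, r)

r = np.linspace(0.0, 0.076, 200)      # radius [m]: impeller radius + 1 mm
uz = 0.2 * (1.0 - (r / r[-1])**2)     # made-up axial velocity profile [m/s]

Q = pumping_capacity(r, uz)
n, d = 100.0 / 60.0, 0.150
print(f"Q = {Q:.3e} m^3/s, NQ = {Q / (n * d**3):.3f}")   # NQ as in eq. (2)
```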
there are no significant differences in the results obtained from the various turbulence models. the maximum and minimum values are also at least of the same order (except for rsm, where the maximum pressure is four times higher). a better overall comparison of the turbulence model predictions can be obtained with the measured data, especially for the calculated power numbers and impeller flow rate numbers, see tables 4 to 7. the stated value of the impeller flow rate number, $N_{Qe} = 0.3$, was obtained for a pfaudler impeller from the literature [4] (not for the investigated geometry of an impeller); however, the system geometry (flat or curved bottom vessel, impeller distance from bottom, etc.) corresponding to the stated value was not described. the place at which the flow rate was determined is also not clear. the $N_{Qc}$ values therefore serve rather for mutual comparison of the models.

figure 3. unbaffled vessel. vertical cut of the vessel. static pressure contours in the vessel. left — first order accuracy, right — second order accuracy.
figure 4. unbaffled vessel. horizontal cut of the vessel in the plane of the impeller. static pressure contours in the vessel. left — first order accuracy, right — second order accuracy.
figure 5. unbaffled vessel. vertical cut of the vessel. velocity magnitude contours in the vessel. left — first order accuracy, right — second order accuracy. k-ε model.
figure 6. unbaffled vessel. horizontal cut of the vessel in the plane of the impeller. velocity magnitude contours in the vessel. left — first order accuracy, right — second order accuracy. k-ε model.
figure 7. unbaffled vessel. vertical cut. contours of velocity magnitude. three turbulence models: k-ε, k-ω and rsm.
figure 8. baffled vessel. vertical cut. contours of velocity magnitude. three turbulence models: k-ε, k-ω and rsm.
figure 9. unbaffled vessel. horizontal cut in the plane of the impeller. contours of velocity magnitude. three turbulence models: k-ε, k-ω and rsm.
figure 10. baffled vessel. horizontal cut in the plane of the impeller. contours of velocity magnitude. three turbulence models: k-ε, k-ω and rsm.

         NQc    NQe   deviation    Poc    Poe    deviation
  k-ε    0.23   0.3   23 %         0.56   0.46   −22 %
  k-ω    0.19   0.3   37 %         0.51   0.46   −11 %
  rsm    0.21   0.3   30 %         0.43   0.46   7 %

table 4. comparison of the power numbers and the impeller flow rate numbers for the unbaffled stirred vessel, first order discretization. indexes: c — calculated, e — measured.

         NQc    NQe   deviation    Poc    Poe    deviation
  k-ε    0.19   0.3   37 %         0.52   0.46   −13 %
  k-ω    0.16   0.3   47 %         0.51   0.46   −11 %
  rsm    0.12   0.3   60 %         0.30   0.46   35 %

table 5. the same comparison for the unbaffled stirred vessel, second order discretization.

         NQc    NQe   deviation    Poc    Poe    deviation
  k-ε    0.43   0.3   −43 %        0.79   1.4    44 %
  k-ω    0.38   0.3   −27 %        0.75   1.4    46 %
  rsm    0.32   0.3   −7 %         0.71   1.4    49 %

table 6. the same comparison for the baffled stirred vessel, first order discretization.

         NQc    NQe   deviation    Poc    Poe    deviation
  k-ε    0.44   0.3   −47 %        0.81   1.4    42 %
  k-ω    0.42   0.3   −40 %        0.85   1.4    39 %
  rsm    0.31   0.3   −3 %         0.61   1.4    56 %

table 7. the same comparison for the baffled stirred vessel, second order discretization.
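the deviation columns in tables 4–7 appear to follow the convention (measured − calculated)/measured, positive when the simulation underpredicts; this is my reading, the paper does not state it explicitly. a quick check against the printed k-ε rows:

```python
def deviation(calculated, measured):
    # inferred sign convention of tables 4-7
    return 100.0 * (measured - calculated) / measured

# table 4, k-eps row: NQ 0.23 vs 0.3 and Po 0.56 vs 0.46
print(round(deviation(0.23, 0.30)), round(deviation(0.56, 0.46)))   # 23, -22
# table 6, k-eps row: NQ 0.43 vs 0.3 and Po 0.79 vs 1.4
print(round(deviation(0.43, 0.30)), round(deviation(0.79, 1.40)))   # -43, 44
```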
the power number po serves better for comparing the results with the measured values, because we investigated it directly. the results of the k-ε and k-ω models are comparable, but the rsm model predicts mostly lower values of the parameters, especially for second order accuracy of the discretization. the reason for the different results obtained with the rsm model was evidently problems with the convergence of the simulations: especially when second-order discretization accuracy was used, the residuals and the flow rate fluctuated significantly, and so the flow rate value depends on the care given by the authors to stabilisation of the calculation. in addition, the rsm model requires more computational time for the solution than the k-ε and k-ω models. the k-ω model provided slightly smaller deviations of the power numbers for second order, but it showed small oscillations of the residuals and therefore also had more problems with convergence than k-ε. on the basis of these conclusions, we recommend using the robust k-ε model, which made the smallest demands on processor time, while its results were sufficiently comparable with the experimental data when second order discretization accuracy was used. the deviations of the power numbers are larger for the baffled vessel, but the predictions of the models are similar. this difference is probably due to the difference in the shape of the end of the stirrer blades, where most of the power dissipation takes place. the impeller for which the power number was determined experimentally has a straight end of the blades, while the model for the simulation has an oblique end of the blades, so the model impeller has a smaller blade area; the difference is at the largest radius. the dissipated power depends on the fifth power of the radius, and the predicted power numbers for all models are therefore smaller than the experimentally-derived value.

6. conclusion

• our model provides comprehensive information about the integral characteristics of the system. the values of the parameters agree with the experimental findings in terms of orders of magnitude. the distributions of the pressure, the velocity and the other variables can be regarded as credible. in particular, the results for the unbaffled vessel, where the deviation is around 10 per cent, can be considered suitable for engineering design.

• this model can be directly used for calculating the flow characteristics inside the vessel for various rotation speeds. the model can also be used to predict the distribution of characteristics for various geometrical configurations — impeller displacement from the bottom of the vessel, more baffles or a different layout of the baffles, etc. when there is a change in the geometry, it is necessary to reconstruct the geometric model and the grid of finite volumes, which requires a considerable investment of time. however, if exactly the same procedure for creating the grid and setting up the simulations is applied, the results of such a new model can be considered accurate to at least within the same order.

• however, this model has limitations that must be kept in mind: the boundary condition for the fluid surface inside the vessel, namely the symmetry condition, neglects the effect of the central vortex, which grows as the speed of the impeller, and hence the reynolds number, rises.
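since the conclusion proposes reusing the model at other rotation speeds, it is worth noting that eq. (1) rearranges, with the shaft power P = 2πnMk, to P = Po·ρ·n³·d⁵. a converged Po therefore scales the power draw to other speeds, under the usual assumption (mine here, not the paper's claim) that Po stays roughly constant in the fully turbulent regime:

```python
def shaft_power(Po, rho, n, d):
    """P = Po * rho * n^3 * d^5 [W]; follows from eq. (1) with P = 2*pi*n*Mk."""
    return Po * rho * n**3 * d**5

rho, d = 998.0, 0.150
Po = 0.52                      # k-eps, second order, unbaffled (table 5)
for rpm in (100, 200, 400):
    n = rpm / 60.0
    print(f"{rpm} min^-1 -> P = {shaft_power(Po, rho, n, d):.2f} W")
```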
list of symbols
D vessel diameter [m]
d impeller diameter [m]
H height of fluid surface [m]
H2 impeller distance from the bottom [m]
h height of the impeller blade [m]
t thickness of the impeller blade [m]
R radius of the curvature of the blades [m]
Mk torque [N m]
n rotation speed of the impeller [s⁻¹]
Q impeller pumping capacity [m³ s⁻¹]
NQ impeller flow rate number [–]
Po power number [–]
Re reynolds number [–]
ρ density [kg m⁻³]
µ dynamic viscosity [Pa s]

acknowledgements
this research was supported by the technology agency of the czech republic under grant ta02011251 “optimization of enameled mixing equipment according to the technological needs of end users”.

references
[1] wilcox d.c.: turbulence modeling for cfd. 1st ed. california: dcw industries, inc., 1993, 460 p. isbn 09-636-0510-0.
[2] joshi j.b., nere n.k., rane ch.v., murthy b.n., mathpati c., patwardhan a.w., ranade v.v.: cfd simulation of stirred tanks: comparison of turbulence models. part ii: axial flow impellers, multiple impellers and multiphase dispersion. can. j. chem. eng., 89, 2011, pp. 754–816.
[3] ceres d., jirout t., rieger f.: mixing of suspensions with curved blade turbine. inzynieria i aparatura chemiczna, 50, 2011, pp. 7–8. issn 0368-0827.
[4] paul e.l., atiemo-obeng v.a., kresta s.m.: handbook of industrial mixing: science and practice. hoboken, n.j.: wiley-interscience, 2004, 1377 p. isbn 04-712-6919-0.
[5] fluent inc., 2008. fluent 6.3 user’s guide.
[6] fluent inc., 2004. gambit 2.4 tutorial guide.

acta polytechnica 56(2):99–105, 2016, doi:10.14311/ap.2016.56.0099

integral methods for describing pulsatile flow

david hromádka^{a,∗}, hynek chlup^a, rudolf žitný^b
a department of mechanics, biomechanics and mechatronics, czech technical university in prague, technická 4, 166 07 praha 6, czech republic
b department of process engineering, czech technical university in prague, technická 4, 166 07 praha 6, czech republic
∗ corresponding author: david.hromadka@fs.cvut.cz

abstract. this paper presents an approximate solution of the pulsatile flow of a newtonian fluid in the laminar flow regime in a rigid tube of constant diameter. the model is represented by two ordinary differential equations. the first equation describes the time evolution of the total flow rate, and the second equation characterizes the reverse flow near the wall. these equations are derived from the momentum balance equation and from the kinetic energy equation, respectively. the accuracy of the derived equations is compared with a solution in which the finite difference method is applied to a partial differential equation.

keywords: pulsatile flow, integral energy, integral momentum, rigid pipe.

1. introduction

pulsatile flow is a time-dependent flow that consists of a constant part (hagen-poiseuille flow, with a parabolic velocity profile, fully controlled by the viscous forces) and an oscillatory part, which is controlled by the viscous forces at the wall and by the inertial forces at the core [1]. pilot studies of time-dependent laminar flow were reported by [2], and were followed by [3] and [4].
sexl [2] developed an analytical solution to the momentum equation of newtonian fluid flow subjected to a pressure gradient dp/dx = ceiωt. szymanski [3] employed a similar technique, and obtained an analytical solution to the momentum equation for the flow driven by a pressure gradient dp/dx = 0 for t ≤ 0 and dp/dx = c for t > 0, where c was constant. an extended review of the literature dealing with laminar pulsatile flow, which presents the current state of theoretical and experimental knowledge was thoroughly drafted by [5] and supplemented in [6]. the most cited work on linearization of the basic equations [7] was carried out by womersley in the late 1950’s, and was published in a series of papers [4], [8] and [9], [10]. most of these materials appear in a comprehensive report [11]. another option for solving non-stationary flow is to employ the integral momentum and integral energy method, which leads to a different form of equations. in this method, it is not the navier-stokes equations themselves that are solved, but the integrated form of the stream wise momentum or energy equation. the form of the velocity profiles is assumed a priori with unknown coefficients characterizing the profile [12]. a list of integral methods that are used, variants of velocity profiles, the ability to capture the reverse flow and specific features of the method are presented in (tab. 1). the methods in the list describe unidirectional the pulsatile flow and the impulsively-initiated flow of a newtonian fluid within a rigid tube of constant diameter, unless otherwise stated in the column “specific”. to the best of the author’s knowledge, there is no integral energy solution for pulsatile flow that takes into account the reverse flow. only some of the listed methods are able to solve both, the periodic flow and the impulsively-initiated flows for arbitrary initial conditions (using time marching algorithms). additionally, there is a controversy over the comparative accuracy of the energy method and momentum integral methods; the results depend on the specific test and on the selected accuracy criterion. for example, elkouh [21] tested his method in the special case of a sinusoidally oscillating newtonian fluid (the method did not take into account the reverse flow near the wall). a comparison with the exact solution [22] showed, that the integral energy method is more accurate than the integral momentum method. the methods described in this paper anticipate quite opposite conclusions. the method proposed in our paper was applied in the newly developed experimental method for identifying viscoelastic properties of blood vessels and grafts using a transient water hammer experiment [24]. the experimental setup consists of the tested elastic tubular sample connected to a long glass capillary filled with an oscillating column of water. when the standard balance momentum equation with a parabolic velocity profile [25], [26] is used to describe the pulsation of the column of water, there is an error in the predicted frequency. the error is reduced by using a new model, which replaces the standard balance momentum equation by two ordinary differential equations: one for the total flow rate and one for the reverse flow rate. the model is derived from the momentum balance and from the macroscopic kinetic energy balance. 99 http://dx.doi.org/10.14311/ap.2016.56.0099 http://ojs.cvut.cz/ojs/index.php/ap d. hromádka, h. chlup, r. 
žitný acta polytechnica author solution method velocity profile form reverseflow specific [12] integral momentum u(r,t) = u(t) 4∑ i=0 ai ri ri for a1 = 0 yes flow in a tube with and without longitudinal vibration of the wall compared with [11] exact solution for α = 0–10 [13] integral momentum u(r,t) = u(t) 4∑ i=0 ai ri ri for a1 = a3 = 0 yes flow of non-newtonian fluid with vibrating wall [14] integral momentum u(r,t) = u(t) 4∑ i=0 ai ri ri for a1 = 0 yes flow of micropolar fluid with and without longitudinal vibration of the wall [15] combination of integral energy and integral momentum u(r) = u 4∑ i=0 ai ri ri for a1 = a3 = 0 and flat profile yes steady flow through rigid stenosis, results were compared with an experiment [16] integral momentum u(r) = u 4∑ i=0 ai ri ri for a0 = 0 yes steady flow through rigid stenosis, results were compared with experiment [17] [18] integral momentum u∗(r∗,x∗, t∗) = 3∑ i=0 ui(x∗, t∗)r∗i for u0 = u1 = 0 yes flow through compliant and rigid tube. solution in rigid tube was compared with [11] for α = 1, 3, 4.72, 6.67, 15 [19] integral momentum u(r,t) = eiωta0 ( 1− r 2 r2 )( a1 − r 2 r2 ) yes solution only for harmonic p, compared with [20] exact solution for α= 2, 4, 6 [21] integral momentum,integral energy u(r,t) = 2ū(t) 3n+1 n+1 ( 1 − r n+1 n r n+1 n ) tested only for the special case n = 1 no flow of power law and newtonian fluid compared with [22] exact solution for α = 0–10 [23] integral momentum u(r,t) = 2u(t) ( 1 − r 2 r2 ) no method proposed compared with [22] exact solution for α = 0–9 table 1. list of integral methods used for describing pulsatile flow. 2. methods 2.1. mathematical model – weak formulation the exact solution of pulsatile flow in a rigid tube of constant radius r is fully described by the momentum balance in the x axial direction, assuming rotational symmetry and spatially fully-developed flow of a noncompressible fluid ρ ∂u ∂t = − ∂p ∂x + µ 1 r ∂ ∂r ( r ∂u ∂r ) , (1) where ρ is constant density and µ is dynamic viscosity. pressure p is independent of the radial coordinate, as follows from the momentum balance in the radial direction. the pressure gradient can be considered as a known (and arbitrary) function p(t) p(t) = ∂p ∂x . (2) the equation (1) multiplied by velocity u represents the kinetic energy balance, which forms the basis of the mechanical energy equation ρ 1 2 ∂u2 ∂t = −up + uµ 1 r ∂ ∂r ( r ∂u ∂r ) . (3) an exact solution of (1), (3) has to provide the same results, but the results of an approximate solution may differ slightly. an approximation of the velocity profile u(r,t) can be represented as a linear combination of polynomial basis functions u(r,t) = n∑ i=1 ui(t)ni(r) = n∑ i=1 ui(t) ( r2(i−1) r2(i−1) − r2i r2i ) . (4) the term ui(t) is the amplitude of the velocity profile, and the symmetric basis functions ni(r) satisfy the boundary conditions (u(r,t) = 0,∂u(0, t)/∂r = 0). 100 vol. 56 no. 2/2016 integral methods for describing pulsatile flow an approximation of the velocity profile for n = 1 corresponds to a parabolic function (hagen poiseuille) and further terms with a higher power exponent can be called corrective functions that provide a description of more extrema (for example the second basis function n2(r) represents a typical radial velocity profile with backflow at the wall). first, we derive approximate solutions (5), (6) from the balance momentum equation (1) and from the kinetic energy equation (3) respectively. the galerkin weighted residual method is applied. 
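before the integrals are written out in full in the next paragraphs, the following sympy sketch (mine, not from the paper) carries out the procedure just described for the single-term parabolic profile; it reproduces the coefficients of the two ordinary differential equations, eqs. (12) and (13), that the derivation below arrives at:

```python
import sympy as sp

r, R = sp.symbols('r R', positive=True)
rho, mu, p, u1, du1 = sp.symbols('rho mu p u1 du1')   # du1 stands for d(u1)/dt

N1 = 1 - r**2/R**2                    # parabolic basis function (i = 1)
u, ut = u1*N1, du1*N1                 # velocity and its time derivative
lap = sp.diff(r*sp.diff(u, r), r)/r   # (1/r) d/dr (r du/dr)

weighted = lambda expr: sp.integrate(expr * N1 * 2*sp.pi*r, (r, 0, R))

mom = weighted(rho*ut + p - mu*lap)                    # galerkin form of eq. (1)
kin = sp.cancel(weighted(u*(rho*ut + p - mu*lap))/u1)  # eq. (3), factor u1 removed

# switch from the centreline amplitude to the flow rate: q = pi R^2 u1 / 2
q, dq = sp.symbols('q dq')
to_q = {u1: 2*q/(sp.pi*R**2), du1: 2*dq/(sp.pi*R**2)}
print(sp.expand(mom.subs(to_q)*sp.Rational(3, 2)))  # rho*dq + 3*pi*R**2*p/4 + 6*mu*q/R**2
print(sp.expand(kin.subs(to_q)*2))                  # rho*dq + 2*pi*R**2*p/3 + 16*mu*q/(3*R**2)
```

the two printed expressions are exactly eqs. (12) and (13); as noted below, they coincide only for stationary flow.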
the integration of the residual res and the weight function wj is over area 2πr dr, and for the momentum balance results in r∫ 0 res · wj dr = r∫ 0 [ ρ ∂u ∂t + p − µ 1 r ∂ ∂r ( r ∂u ∂r )] nj (r)2πr dr = ρ n∑ i=1 ∂ui (t) ∂t r∫ 0 ni(r)nj (r)2πr dr ︸ ︷︷ ︸ m b − µ n∑ i=1 ui(t) r∫ 0 nj (r) 1 r ∂ ∂r ( r ∂ni(r) ∂r ) 2πr dr ︸ ︷︷ ︸ kb + p r∫ 0 nj (r)2πr dr ︸ ︷︷ ︸ nb = 0, (5) r∫ 0 [ ρ 1 2 ∂u2 ∂t + up − uµ 1 r ∂ ∂r ( r ∂u ∂r )] nj (r)2πr dr = ρ n∑ i=1 ∂u2i (t) ∂t r∫ 0 1 2 n2i (r)nj (r)2πr dr ︸ ︷︷ ︸ m k − µ n∑ i=1 u2i (t) r∫ 0 ni(r)nj (r) 1 r ∂ ∂r ( r ∂ni(r) ∂r ) 2πr dr ︸ ︷︷ ︸ kk + n∑ i=1 ui(t)p r∫ 0 ni(r)nj (r)2πr dr ︸ ︷︷ ︸ n k = 0. (6) subscript i is the summation index, and index j corresponds to the j-th weight function (corresponding to the j-th equation). it is more convenient to work with flow rates. we therefore define the total flow rate within the tube q(t) = 2π n∑ i=1 ui(t) r∫ 0 rni(r) dr = πr2 n∑ i=1 ui(t) i(i + 1) = q1(t)+q2(t) + · · · + qn(t).res. (7) subscript i = 1 corresponds to the poiseuille flow, while indices i ≥ 2 to n denote correction functions that are able to describe more extrema of the velocity profile qi(t) = 2πui(t) r∫ 0 rni(r) dr = πr2 ui(t) i(i + 1) . (8) it is now simple to express the velocity ui (t) as a function of the flow rate  u1 (t) ui (t) ... un (t)   ︸ ︷︷ ︸ u = 1 πr2   2 −2 −2 −2 0 i(i+1) · · · 0 ... 0 ... 0 0 0 0 n(n+1)   ︸ ︷︷ ︸ a   q (t) qi (t) ... qn (t)   ︸ ︷︷ ︸ q . (9) equations (5) and (6) can be rewritten in terms of the flow rate in matrix notation as ρ dq dt + m −1b a −1nbp −µm −1b a −1kbaq = 0, (10) ρ d dt (q2)+m −1k a −2n kaqp −µa−2m −1k kka 2q2 = 0. (11) let us restrict the approximation to the first term in the polynomial series (4) (parabolic velocity profile). the momentum balance reduces into the equation ρ dq dt + 3 4 pr2π + 24 4 µ r2 q = 0 (12) and the mechanical energy equation formulated from (3) ρ dq dt + 2 3 pr2π + 16 3 µ r2 q = 0. (13) these two equations describe the same physical problem. however, they are not identical as they differ from each other by the coefficients. the equations are identical only for stationary flow, and both lead to hagen-poiseuille flow. the reader has to ask himself which of the equations – (12) or (13) – provides a better description of the real movement of a fluid. however, the answer to this question is left until an analysis of the test examples has been made. we consider the velocity profile as the fourth order polynomial function (the first basis function describes the poiseuille flow, and the second basis function describes 101 d. hromádka, h. chlup, r. žitný acta polytechnica the reverse flow, the backflow in the vicinity of the wall) u(r,t) = u1(t) ( 1 − r2 r2 ) + u2(t) ( r2 r2 − r4 r4 ) . (14) equations (9), (10) were applied to (14), and this leads to a system of ode’s. after a simple manipulation we obtain a standard formulation of the momentum balance for the flow rates ρ   dq dt dq2 dt   + πr2   8 9 5 9  p + µ r2   64 9 80 9 40 9 320 9  (q q2 ) = 0. (15) and by applying (9), (11), respectively, to the velocity profile mentioned above, we obtain the formulation of kinetic energy for the flow rates ρ   dq dt dq2 dt   + πr2   1 6 8q2q + 12q22 − 35q2 2q22 + 2q2q − 7q2 7 12 −5q2 + 2q22 2q22 + 2q2q − 7q2  p+ µ r2   4 3 −20q2q2 +24q32 +4q22q−35q3 2q22 +2q2q−7q2 14 3 −5q3 +2q22q−40q2q2 +16q32 2q22 +2q2q−7q2   = 0. 
(16) the reader will notice that the system of equations (15) and (16) deals with greater differences than for (12) and (13). ordinary differential equations (15) are linear, while (16) is a system of nonlinear differential equations. systems (12), (13), or (15), (16) can be solved with the matlab ode toolbox for an arbitrary time course of p(t) (continuous as well as discontinuous pressure gradients) and for arbitrary initial conditions prescribed in the form of the total flow rate q and the reverse flow rate q2 at time zero. 2.2. test example differences between the solution of the flow rate realized by the finite difference method qfd with a very fine mesh (an implicit method with a central difference scheme), the standard balance momentum (12), the mechanical energy equation (13) and the proposed two-equation approximations (15), (16) were tested using the model with harmonic driving force (pressure gradient p(t)), consisting of a stationary part pstat = −100 pa m−1 and an oscillatory part pamp = −500 pa m−1 in a rigid tube of constant radius r = 2 mm p(t) = pstat + pamp sin(ωt). (17) stationary flow is assumed at time t = 0 as initial conditions, therefore q(0) = − πr4 8µ p(0), q2 (0) = 0. (18) the frequency ω can be expressed in a dimensionless form either strouhal or womersley number α α = r √ ωρ µ . (19) 2.3. determining the deviation the deviation (20) is computed as the standard euclidean norm of two flow functions, which is calculated from the finite difference method and from the flow rate using the equation (12), (15), (13), (16), respectively, and which are integrated over time with the period t = 2π/ω error =   1 t t∫ 0 (q(t) − q(t)fd) 2 max (q(t)2fd) dt   1 2 . (20) the error defined in this way is dimensionless, and it is a suitable measure for deviations in the total calculated flow rates and also in the back flow rates. 2.4. wall shear stress the wall shear stress is calculated from the following equation τ = µ ∂u ∂r . (21) 3. results the deviation of the suggested balance momentum (12),(15), the energy equations (13), (16) is demonstrated by (fig. 1), which illustrates the deviation from the exact solution computed by the equation (20). 0 0.01 0.02 0.03 0.04 0.05 0.06 0 2 4 6 8 10 12 14 16 18 20 e r r o r [− ] α[−] figure 1. the graph presents the deviation of flow rate q from qfd. the black curves correspond to the formulation from the kinetic energy balance (13), (16) while the grey curves correlate with the weak solution of the balance momentum (12), (15). the solid curves are consistent with the parabolic velocity profile approximation and the dashed curves are equivalent to the approximation of the velocity profile with a fourth order polynomial function. three representative samples of the evolution of the velocity profile and the course of the wall shear 102 vol. 56 no. 2/2016 integral methods for describing pulsatile flow stress are shown in the figures bellow. pairs of velocity profiles are presented in (fig. 2). a womersley number is assigned to each pair of velocity profiles in two different time points t1, t2 and also for the wall shear stress. the results of the wall shear stress can be found in (fig. 3) below. -0.15 -0.1 -0.05 0 0.05 -2-1.5-1-0.5 0 0.5 1 1.5 2 u [m s − 1 ] r[mm] α = 3 t1 = 0.7t t2 = 0.9t 0 0.05 0.1 0.15 -2-1.5-1-0.5 0 0.5 1 1.5 2 u [m s − 1 ] r[mm] α = 5 t1 = 0.7t t2 = 0.9t 0 0.05 0.1 0.15 -2-1.5-1-0.5 0 0.5 1 1.5 2 u [m s − 1 ] r[mm] α = 7 t1 = 0.7t t2 = 0.9t figure 2. 
the black lines are representatives of the velocity profiles obtained from the finite difference method on the right side of the figures. the dashed lines were obtained by calculating (16) (black color), (15) (grey color). 4. conclusion the method proposed here is suitable especially for the situation where the pressure gradient cannot be decomposed into fourier series, which corresponds to [24]. the exact solution through the bessel function is therefore not applicable. the effect of assuming a parabolic profile and applying the galerkin method is shown in (fig. 1). the maximum error when using the momentum integral method is 2.8 percent. the maximum error when using the energy integral method is 5.5 percent. the discrepancy between elkouh’s results, which assume the steady state of the velocity profile, and our results is caused by the use a different method for finding the solution (in our case, the galerkin method), by the specific test, and by the selected accuracy criterion. elkouh reported that better results were obtained from the integral energy method with the well-known steady state velocity profile (without reverse flow) for α = 0–10 [21]. the method proposed here is -2 0 2 4 0 0.5 1 1.5 2 2.5 τ (t ) τ (0 )[ − ] t[s] α = 3 0 1 2 3 0 0.2 0.4 0.6 0.8 1 τ (t ) τ (0 )[ − ] t[s] α = 5 0 1 2 0 0.1 0.2 0.3 0.4 0.5 τ (t ) τ (0 )[ − ] t[s] α = 7 figure 3. the black lines correspond to the shear stress at the wall calculated using the finite difference method. the dashed lines mark the wall shear stress of the approximate solution. the color designation for the approximate solution is the same as in previous figures. designed to take the reverse flow into account, and it investigates the error of the integral energy method and the integral momentum method. the maximum error over a cycle was 0.3 percent for the integral momentum solution while the maximum error caused using the integral energy method was 1.3 percent. the higher error of the system of mechanical energy equations derived from the integral energy (16) is caused by nonlinearities. the integral solution (15) is in excellent agreement with the finite difference solution in the region α = 1–20 which is a part of the intermediate region of pulsatile flow [5] , [6]. another type of gradient p (a symmetric triangular wave) was applied to the system of equations (15), (16) and also to the finite difference method. the same amplitude of the stationary and nonstationary part of the pressure gradient was used as in the test example (17). similar error was calculated as in (fig. 1) based on (20). the maximum error from the flow calculation (15), (16) in comparison with the finite difference solution, was 1.4, 2.4 percent, respectively. the evolution of the approximate velocity profiles is in good agreement with the velocity profile calculated from the finite difference method (fig. 2). the wall shear stress calculated from the momentum balance (15) describes the shear stress at the 103 d. hromádka, h. chlup, r. žitný acta polytechnica wall more precisely than the shear stress that was evaluated from the mechanical energy equation (16). other variational methods were tested (least square integral energy and momentum, an expert estimate of the weight function with integration of momentum, and also the energy equation) together with various integrations of differential equations (over the radius and over the area). the lowest error found was using the galerkin integral momentum method with integration over the area of rigid pipe. 
this method has therefore been presented in this paper. 5. appendix for illustration, let us derive the balance momentum with the backflow (15). we recall equations (1), (5), (14) and we obtain a set of integral equations (22), (23) r∫ 0 { ρ [ du1(t) dt ( 1 − r2 r2 ) + du2(t) dt ( r2 r2 − r4 r4 )] + p − µ r [ −u1(t) 2r r2 + u2(t) ( 2r r2 − 4r3 r3 )] − µ [ −u1(t) 2 r2 + u2(t) ( 2 r2 − 12r2 r4 )]} ( 1 − r2 r2 ) 2πr dr = 0, (22) r∫ 0 { ρ [ du1(t) dt ( 1 − r2 r2 ) + du2(t) dt ( r2 r2 − r4 r4 )] + p − µ r [ −u1(t) 2r r2 + u2(t) ( 2r r2 − 4r3 r3 )] − µ [ −u1(t) 2 r2 + u2(t) ( 2 r2 − 12r2 r4 )]} ( r2 r2 − r4 r4 ) 2πr dr = 0. (23) the inner product of differential equations and weighted basis functions integrated over the area yield two ordinary differential equations (24), (25) for unknown center-line velocity u1(t), u2(t) 1 3 r2ρ du1(t) dt + 1 12 r2ρ du2(t) dt + 2µu1 (t) + 2 3 µu2(t) + 1 2 r2p = 0, (24) 1 12 r2ρ du1(t) dt + 1 30 r2ρ du2(t) dt + 2 3 µu1(t) + 2 3 µu2(t) + 1 6 r2p = 0. (25) we can formulate the center-line velocity in terms of flowrates. this can be done easily through (9) u1(t) = 1 πr2 ( 2q(t) − 2q2(t) ) , (26) u2(t) = 1 πr2 ( 6q2(t) ) . (27) combining (24),(25) with (26), (27) leads to a set of a differential equations for describing the non-stationary flowrates within a rigid pipe of constant diameter ρr2 ( 4 −1 5 1 ) dq dt dq2 dt  +πr4(35 ) p +µ ( 24 0 40 80 )( q q2 ) = 0. (28) simple manipulation of (28) yields the standard set of equations (15). equations (16) are derived in a similar manner. list of symbols α womersley number [–] ρ water density [kg m−3] τ wall shear stress [pa] µ dynamic viscosity [pa s] ω angular frequency [rad s−1] i i-th index [–] i imaginary unit [–] n n-th index [–] ni basis function [–] p pressure [pa] p pressure gradient [pa m−1] pamp amplitude of the nonstationary part of gradient pressure [pa m−1] q total flowrate [m3 s−1] qi corrective flowrate [m3 s−1] qfd total flowrate calculated from finite difference method [m3 s−1] r radial coordinate [m] r∗ dimensionless radial coordinate [–] r radius of the tube [m] t time [s] t∗ dimensionless time [–] u fluid velocity [m s−1] u∗ dimensionless velocity [–] ui velocity amplitude [m s−1] ū average velocity amplitude [m s−1] wj j-th weight function [m] x axial coordinate [m] x∗ dimensionless axial coordinate [–] res residual [pa m] error deviation of the flowrate [–] a matrix flowrate coefficients [–] kb stiffnes matrix of the momentum balance [–] kk stiffnes matrix of the kinetic balance [–] m b mass matrix of the momentum balance [m2] m k mass matrix of the kinetic balance [m2] n b coefficients of driving force [m2] n k coefficients of driving force [m2] q flowrate vector [m3 s−1] 104 vol. 56 no. 2/2016 integral methods for describing pulsatile flow acknowledgements this work has been supported by czech technical university in prague grant no. sgs13/176/ohk2/3t/12 and by ministry of health project nos. cr nt 13302 and 15-27941a. references [1] m. zamir. the physics of pulsatile flow. springer science business media, 2000. doi:10.1007/978-1-4612-1282-9. [2] t. sexl. über den von e. g. richardson entdeckten annulareffekt. zeitschrift für physik 61(5-6):349–362, 1930. doi:10.1007/bf01340631. [3] p. j. szymanski. quelques solutions exactes des équations de l’hydrodynamique du fluide visqueux dans le cas d’un tube cylindrique. j math pures appl 9(11):67–107, 1932. [4] j. r. womersley. method for the calculation of velocity, rate of flow and viscous drag in arteries when the pressure gradient is known. 
the journal of physiology 127(3):553– 563, 1955. doi:10.1113/jphysiol.1955.sp005276. [5] m. y. gundogdu, m. o. carpinlioglu. present state of art on pulsatile flow theory. part 1. laminar and transitional flow regimes. jsme international journal series b 42(3):384–397, 1999. doi:10.1299/jsmeb.42.384. [6] m. o. carpinlioglu, m. y. gundogdu. a critical review on pulsatile pipe flow studies directing towards future research topics. flow measurement and instrumentation 12(3):163–174, 2001. doi:10.1016/s0955-5986(01)00020-6. [7] g. rudinger. review of current mathematical methods for the analysis of blood flow. asme biomedical fluid mechanics symposium pp. 1–33, 1966. [8] j. r. womersley. oscillatory flow in arteries: the constrained elastic tube as a model of arterial flow and pulse transmission. physics in medicine and biology 2(2):178–187, 1957. doi:10.1088/0031-9155/2/2/305. [9] j. r. womersley. oscillatory flow in arteries. iii: flow and pulse-velocity formulae for a liquid whose viscosity varies with frequency. physics in medicine and biology 2(4):374–382, 1958. doi:10.1088/0031-9155/2/4/307. [10] j. r. womersley. oscillatory flow in arteries. ii: the reflection of the pulse wave at junctions and rigid inserts in the arterial system. physics in medicine and biology 2(4):313–323, 1958. doi:10.1088/0031-9155/2/4/301. [11] j. r. womersley. an elastic tube theory of pulse transmission and oscillatory flow in mammalian arteries. tech. rep. tr56–614, wadg technical report, 1957. [12] l. e. hooks, r. m. nerem, t. j. benson. a momentum integral solution for pulsatile flow in a rigid tube with and without longitudinal vibration. international journal of engineering science 10(12):989– 1007, 1972. doi:10.1016/0020-7225(72)90021-3. [13] j. misra, b. kar. unsteady flow of blood through arteries in vibration environments. mathematical and computer modelling 13(4):7–17, 1990. doi:10.1016/0895-7177(90)90049-s. [14] s. parvathamma, r. devanathan. microcontinuum approach to the pulsatile flow in tubes with and without longitudinal vibration. bulletin of mathematical biology 45(5):721–737, 1983. doi:10.1007/bf02460045. [15] b. e. morgan, d. f. young. an intergral method for the analysis of flow in arterial stenoses. bulletin of mathematical biology 36(1):39–53, 1974. doi:10.1007/bf02461189. [16] j. h. forrester, d. f. young. flow through a converging-diverging tube and its implications in occlusive vascular disease — i. journal of biomechanics 3(3):297–305, 1970. doi:10.1016/0021-9290(70)90031-x. [17] j. h. forrester, d. f. young. flow through a converging-diverging tube and its implications in occlusive vascular disease — ii. journal of biomechanics 3(3):307–316, 1970. doi:10.1016/0021-9290(70)90032-1. [18] f. k. tsou, p. c. chou, s. n. frankel, a. w. hahn. an integral method for the analysis of blood flow. the bulletin of mathematical biophysics 33(1):117–128, 1971. doi:10.1007/bf02476669. [19] e. f. blick, p. d. stein. variational solution for pulsatile flow in rigid tubes. journal of biomechanical engineering 106(1):89, 1984. doi:10.1115/1.3138464. [20] p. lambossy. oscillations forcées d’un liquide imcompressible et visqueux dans un tube rigide et horizontal. calcul de la force de trottement. helv physica acta 25, 1957. [21] a. f. elkouh. approximate solution for pulsatile laminar flow in a circular rigid tube. journal of fluids engineering 100(1):131, 1978. doi:10.1115/1.3448588. [22] s. uchida. 
the pulsating viscous flow superposed on the steady laminar motion of incompressible fluid in a circular pipe. journal of applied mathematics and physics (zamp) 7(5):403–422, 1956. doi:10.1007/bf01606327.
[23] j. mcfeeley, r. patel, k. jolls. approximate low-frequency solution for pulsatile laminar flow in a tube. chemical engineering science 28(11):2105–2107, 1973. doi:10.1016/0009-2509(73)85058-4.
[24] d. hromádka, h. chlup, r. žitný. identification of relaxation parameter of a physical model of vein from fluid transient experiment. epj web of conferences 67:02039, 2014. doi:10.1051/epjconf/20146702039.
[25] r. bird, w. stewart, e. lightfoot. transport phenomena. wiley international edition. wiley, 2007.
[26] i. idelchik. handbook of hydraulic resistance. jaico publishing house, 2005.

acta polytechnica 53(supplement):693–697, 2013, doi:10.14311/ap.2013.53.0693

ultrahigh energy cosmic rays: review of the current situation

todor stanev∗
bartol research institute and department of physics and astronomy, university of delaware, newark, de 19716, u.s.a.
∗ corresponding author: stanev@bartol.udel.edu

abstract. we describe the current situation of the data on the highest energy particles in the universe – the ultrahigh energy cosmic rays. the new results in the field come from the telescope array experiment in utah, u.s.a. for this reason we concentrate on the results from this experiment and compare them to the measurements of the other two recent experiments, the high resolution fly’s eye and the southern auger observatory.

keywords: high energy cosmic rays, hadronic interaction at very high energy, origin of the highest energy cosmic rays.

1. introduction

two years ago i was asked to review at this meeting the new results of the measurements of the ultrahigh energy cosmic rays (uhecr). at that time there were two experiments performing such measurements: the high resolution fly’s eye (hires) in utah, u.s.a., and the auger southern observatory (auger) in mendoza, argentina. hires is a detector that measures the fluorescent light emitted by the nitrogen in the atmosphere when its atoms are excited by the numerous electrons of such large air showers. its two fluorescent telescopes are able to detect showers that hit the ground up to distances of 40 km from the detectors.
the two telescopes of hires can observe the air showers separately or in stereo mode with both telescopes. auger is a hybrid experiment that combines four fluorescent detectors (fd) with a huge surface array (sd) that covers 3000 km2. the surface array consists of 1600 water cherenkov tanks on a triangular matrix with an average distance between the tanks of 1500 m. the cherenkov tanks are deep enough (almost three radiation lengths) to detect electrons, gamma rays, and muons, and thus measure the energy flow of the air shower. a brief summary of the results at that time is that both detectors observed the gzk feature in the uhecr energy spectrum [5, 14]: the steep decline in the uhecr energy spectrum above energy of 4 × 1019 ev due to the energy loss in cosmic ray propagation from their presumably extragalactic sources to us. the two measured spectra have very similar shapes and agree with each other within systematic errors of about 20 %. the two experiments, however, did disagree on the chemical composition of uhecr: hires interpretation of the measured depth of shower maximum (xmax) and its fluctuations was that all uhecr are hydrogen nuclei (protons) [10], while auger interpreted its results as a chemical composition becoming increasingly heavier with energy above 2 × 1018 ev [6]. the interpretation of the chemical composition from the xmax measurement depends on the hadronic interaction model used, which creates a significant systematic error. auger also saw a correlation of their highest energy events (above 55 eev = 5.5 × 1019 ev) with nearby agn, while the smaller hires statistics did not show any correlation. these results have not changed during the last two years. 1.1. telescope array the new results come from a new detector, the telescope array (ta), which is a hybrid detector that started collecting data in 2009 in utah, usa, at 39°n, 120°w and an altitude of 1500 m. its surface array (sd) consists of 607 scintillator counters on a square grid with dimension of 1.2 km. each scintillator detector consists of two layers of thickness 1.2 cm and area of 3 m2. the phototube of each layer is connected to the scintillator via 96 wavelength shifting fibers, which make the response of the scintillator more uniform. each station is powered by a solar panel that charges a lead-acid battery. the total area of the surface array is 762 km2. the surface array is divided into three parts that communicate with three control towers where the waveforms are digitized and triggers are produced. each second, the tower collects the recorded signals from all stations and a trigger is produced when three adjacent stations coincide within 8 µs. the sd reaches full efficiency at 1018.7 ev for showers with zenith angle less than 45° [9]. this angle corresponds to sd acceptance of 1600 km2 sr. the fluorescence detector (fd) consists of three fluorescence stations. two of them are new and consist of 12 telescopes with field of view from elevations of 3° to 31°. the total horizontal field of view of each station is 108°. the third station has 14 telescopes that use cameras and electronics from hires-i and 693 http://dx.doi.org/10.14311/ap.2013.53.0693 http://ojs.cvut.cz/ojs/index.php/ap todor stanev acta polytechnica 50 km 23 km figure 1. comparison of the sizes of the surface arrays of the telescope array and the auger southern observatory. the positions of the ta fluorescent detectors are indicated with small arcs. mirrors from hires-ii. 
the fluorescent telescopes are calibrated with n2 lasers, xe flashers, and an electron linear accelerator [11]. the atmosphere is monitored for clouds by ir cameras and with the use of the central laser facility, which is in the center of the array at 20.85 km from each station. the fluorescent stations are positioned in such a way that they cover the whole area of the surface detector. the mono acceptance of the fd is 1830 km2 sr and the stereo acceptance is 1040 km2 sr. the total energy resolution is 25 % and the xmax resolution is 17 g/cm2. 2. new results the new results come from the telescope array. they were reported at the 2011 international cosmic ray conference in beijing. two papers also appeared in the arxiv a couple of months ago. figure 1 compares the size of the ta to that of auger – it is almost four times smaller. in addition, the water cherenkov tanks have the same effective area up to a shower zenith angle of 60°which means that their exposure is higher than that of the scintillator counters. for these reasons the new ta results are based on smaller statistics and should be considered preliminary. 2.1. uhecr energy spectrum figure 2 shows the energy spectrum measured by the telescope array [3] compared to the spectra of auger and the hires experiments. at first glance at the figure, we see that the spectrum measured by ta is extremely close to that of hires. one should say here that there is a big difference between the way the energy spectrum is measured by the two detectors. the telescope array uses the method of measuring the energy spectrum with the surface array introduced by auger. fluorescent telescopes can work only on clear moonless nights with good atmospheric conditions (about 10 % of the time) while surface arrays are 1023 1024 1025 1018 1019 1020 e 3 d n /d e , e v 2 (m 2 . s. sr )1 e, ev auger hr stereo ta figure 2. energy spectrum of the uhecr measured by ta, hires and auger. the particle flux is multiplied by e3 to show better the shape of the energy spectrum. active all the time. in addition, the energy estimates with the surface array depend heavily on the hadronic interaction model used in the shower analysis. to increase the statistics, one can correlate the particle density in the surface array at a certain distance from the shower core (800 m for ta and 1000 m for auger) with the energy estimate from the fluorescent detectors (which does not need the hadronic monte carlo) and then use the surface density to obtain the spectrum. the telescope array energy spectrum paper [3] also fits the shape of the spectrum with a broken power law. the ankle of the spectrum, where it becomes less steep, is at (4.8 ± 0.1) × 1018 ev. the power law index α before the ankle is 3.33 ± 0.04, at the ankle it is 2.68 ± 0.04, and at the gzk decline it is 4.2 ± 0.7. the statistics is, of course, quite small but there is no doubt that the spectrum becomes steeper, as predicted by greisen and zatsepin&kuzmin. it is indeed remarkable that using very different methods for observation of the spectrum the data of ta and hires agree so well. one has to admit that the shape of the energy spectrum detected by ta is also very similar to that of auger in spite of the different normalization. all three spectra shown in fig. 2 are consistent within the systematic errors claimed by the experiments, which are of the order of 20 %. 2.2. chemical composition of uhecr the measurement of the chemical composition of cosmic rays is through the interpretation of the depth of shower maximum xmax. 
the position of the shower maximum for proton showers becomes deeper in the atmosphere with energy, because showers continue developing until the average energy of their particles decreases below 80 mev. showers caused by heavy nuclei have xmax higher in the atmosphere because, to first approximation, they are the superposition of a nucleon showers, each of energy e/a, where a is the mass number. at energies above 10^18 ev the difference between xmax of proton and iron showers is about 100 g/cm². the primary mass of the particle interacting in the atmosphere also affects the fluctuations of xmax per energy bin. showers caused by heavy nuclei would have smaller fluctuations, as in the simplest model (superposition) the fluctuations in such showers should decrease by √a. in monte carlo calculations the difference is smaller, varying from about 60 g/cm² for proton showers to about 20 g/cm² for fe showers.
figure 3. depth of shower maximum measurements by the telescope array, hires and auger. the lines show the energy behavior for proton and iron showers for three hadronic interaction models (epos, qgsjet ii, sibyll 2.1).
figure 3 compares the xmax measurements of the telescope array [12] presented at the 2011 international cosmic ray conference (beijing) with the results of hires and auger. the interpretation of the xmax measurement by the ta experiment is that the uhecr composition is light, consisting mostly of protons and very light nuclei. it is not easy to understand the very different interpretations of hires and ta, on one hand, and auger, on the other hand, when the data look so similar to the naked eye. the explanation of the previous disagreement between hires and auger was that they used different event selection. it is not obvious now what exactly the ta event selection is. one has to have in mind that the two highest-energy points in its data set contain only three events and one event, respectively, and the average xmax could be different when more statistics is collected. the telescope array also presented [12] the distributions of xmax in the energy bins shown in fig. 3. at relatively low energy, the widths of the distributions were more similar to proton showers, while at high energy the statistics is not enough to judge the distributions.

2.3. identifying the sources of uhecr
in 2007 the auger collaboration published a paper where a correlation of their highest energy events (> 55 eev) with agn was discussed [2]. at that time the collaboration had seen 27 such events. eighteen of these events had an angle of less than 3.2° from the positions of nearby (redshift z < 0.018, distance less than 75 mpc) agn from the véron-cetty and véron catalog (vcv) [13]. the correlation was even stronger if events close to the galactic plane were excluded. although the vcv catalog contains mostly not very powerful seyfert-2 agn, they may have marked the distribution of the real sources. this paper had a huge readership, and many scientists were convinced that the sources of uhecr would be discovered soon. the hires data (13 events) did not confirm this correlation [1], and papers discussing the different fields of view (auger in the south and hires in the north) appeared in press. since the southern auger observatory was completed at that time, it did not take long to significantly increase the statistics.
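the correlation test described above reduces to counting events whose arrival directions fall within a fixed angle of a catalog position. a minimal sketch of this bookkeeping is given below; the 3.2° search radius is the published value, while the array names and the input format are illustrative assumptions, not the auger analysis code.

```python
import numpy as np

def angular_separation(ra1, dec1, ra2, dec2):
    """great-circle separation in degrees between two sky directions."""
    ra1, dec1, ra2, dec2 = map(np.radians, (ra1, dec1, ra2, dec2))
    cospsi = (np.sin(dec1) * np.sin(dec2)
              + np.cos(dec1) * np.cos(dec2) * np.cos(ra1 - ra2))
    return np.degrees(np.arccos(np.clip(cospsi, -1.0, 1.0)))

def correlated_fraction(events, agns, psi_max=3.2):
    """fraction of events lying within psi_max degrees of at least one agn.
    events, agns: arrays of shape (n, 2) with (ra, dec) in degrees."""
    n_corr = 0
    for ra, dec in events:
        sep = angular_separation(ra, dec, agns[:, 0], agns[:, 1])
        if sep.min() <= psi_max:
            n_corr += 1
    return n_corr / len(events)
```

applied to the 2007 data set, this fraction was 18/27 ≈ 0.69; for the later 69-event set, discussed next, it dropped to about 0.38.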
in 2009 the correlation of 69 high energy events with the same agn catalog was published. the correlation had decreased to about 38 % of the events. the previous result happened to be a typical disappearing 3σ result. the disagreement between auger and hires on the correlation of the arrival directions of their highest energy events with agn is also strange because of their results on the chemical composition of uhecr. if the composition is indeed heavy, as interpreted by auger, one expects that the heavy nuclei would scatter more in the intergalactic and galactic magnetic fields and show no anisotropy. figure 4 shows the arrival directions of the highest energy events of auger, hires and ta. having in mind the dimensions of auger and the ta (see fig. 1) and the fact that the ta field of view is restricted to zenith angles less than 45°, it is difficult to believe that the ratio of their statistics is less than three. we hope that auger has more than 100 such events by now. the 20 % difference in the energy assignment may also play a role in this issue. it is not easy to judge what the new data set says about the correlation of the uhecr arrival directions with powerful astrophysical sources. one way would be to judge the possible direction of the sources by close-by arrival directions of groups of the highest energy events. we looked at pairs of events at an angular distance of less than 5° from each other. there are 11 such pairs in the auger 69-event data set. six such pairs are within 18 degrees of cen a. an isotropic monte carlo in the auger field of view creates on average 11 pairs, the same number as in the data. there are three pairs consisting of hires and auger events and one ta–auger pair. there are also two pairs consisting of ta events, as shown in fig. 4. it is not possible to run an isotropic monte carlo for the new events because the exposure of the telescope array is not as well defined as those of auger and hires.
figure 4. arrival directions of the 69 auger events (> 55 eev), the 13 hires events and the 25 ta events in galactic coordinates. the colored area shows the part of the galaxy that auger does not see. the six areas defined within the auger field of view have equal exposures. the events that form a pair at an angular distance of less than 5° are circled.
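the isotropic monte carlo mentioned above can be sketched as follows, reusing the angular_separation helper from the previous sketch. the declination weighting is the standard geometric exposure formula for a full-efficiency ground array (sommers 2001); the site latitude and the 60° zenith cut are assumptions of this toy, which reproduces the right order of magnitude (∼ 10 pairs for 69 events) but not the detector's true exposure.

```python
import numpy as np

def relative_exposure(dec_deg, site_lat_deg=-35.2, theta_max_deg=60.0):
    """time-averaged relative exposure of a ground array versus declination
    (geometric formula for a fully efficient array); angles in degrees."""
    lat = np.radians(site_lat_deg)
    dec = np.radians(np.asarray(dec_deg, dtype=float))
    tmax = np.radians(theta_max_deg)
    xi = (np.cos(tmax) - np.sin(lat) * np.sin(dec)) / (np.cos(lat) * np.cos(dec))
    am = np.arccos(np.clip(xi, -1.0, 1.0))  # maximum visible hour angle
    return np.cos(lat) * np.cos(dec) * np.sin(am) + am * np.sin(lat) * np.sin(dec)

def mean_isotropic_pairs(n_events=69, psi_pair=5.0, n_trials=1000, seed=1):
    """average number of pairs separated by < psi_pair degrees among
    n_events drawn isotropically and weighted by the exposure above."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(-89.9, 89.9, 1800)
    bound = relative_exposure(grid).max() * 1.01  # rejection-sampling bound
    totals = []
    for _ in range(n_trials):
        decs = []
        while len(decs) < n_events:
            d = np.degrees(np.arcsin(rng.uniform(-1.0, 1.0)))  # isotropic in sin(dec)
            if rng.uniform(0.0, bound) < relative_exposure(d):
                decs.append(d)
        decs = np.asarray(decs)
        ras = rng.uniform(0.0, 360.0, n_events)
        n_pairs = 0
        for i in range(n_events - 1):
            sep = angular_separation(ras[i], decs[i], ras[i + 1:], decs[i + 1:])
            n_pairs += int(np.sum(sep < psi_pair))
        totals.append(n_pairs)
    return float(np.mean(totals))
```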
in this case a wrong expectation for the number of shower muons makes a smaller contribution to the energy assignment. by far the biggest controversy in the results is the interpretation of the xmax measurement by the three experiments shown in fig. 3. the results of the measurements do not seem to be as different to the eye as the interpretation is. hires and ta interpret the results as an almost purely proton composition, while auger interprets the measurements as a composition becoming increasingly heavier with energy. in the review of uhecr [8], suspicion fell on the different event selection in auger and hires. we do not know much about the selection in ta yet, and this question is still open. there is some theoretical contradiction between the chemical composition derived by auger and the anisotropy it has measured, including the large number of events coming from the vicinity of cen a. lemoine & waxman [7] suggested that if the composition were heavy there would be protons from nuclear photodissociation that would show the same anisotropy at significantly lower energy. such anisotropy at about 10^18 ev has not been seen by the auger experiment. this is not an argument against the heavy composition derived by auger, but an interesting argument for further measurements and observations. the new data on the arrival direction distribution of uhecr that come from ta did not contribute to the source identification. it is very good, though, to have an active experiment in the northern hemisphere. auger and ta are able to increase the statistics by a factor close to five during the next four years. this statistics may not be sufficient to identify the sources of the ultrahigh energy cosmic rays, but it will certainly be an improvement over the current situation. the good news is that at the international symposium on future directions in uhecr physics at cern in february 2012 the two collaborations started to work together on all of the topics discussed above. working groups consisting of members of both collaborations were created and gave talks at the symposium. all of us hope that the working groups will make a good study of the differences in the shower reconstruction and data analysis, and will at least discover the reasons for the contradictory results. if this happens, we will know much more about this exciting field in a couple of years.

acknowledgements
the author thanks the organizers of the vulcano workshop for the invitation to this excellent and useful meeting. his work is supported in part by doe grant de-fg0291er40626.

references
[1] abbasi, r. u. et al. (hires collaboration), 2008, astropart. phys. 30, 175
[2] abraham, j. et al. (auger collaboration), 2007, science 318, 938
[3] abu-zayyad, t. et al. (telescope array), arxiv:1205.5067
[4] abu-zayyad, t. et al. (telescope array), arxiv:1205.5984
[5] greisen, k., 1966, phys. rev. lett. 16, 748
[6] kampert, k.-h., unger, m., 2012, astropart. phys. 35, 660
[7] lemoine, m., waxman, e., 2009, jcap 0911, 009
[8] letessier-selvon, a., stanev, t., 2011, rev. mod. phys. 83, 907
[9] nonaka, t. et al. (telescope array), 2009, nucl. phys. b (proc. suppl.) 190, 26
[10] sokolsky, p. (hires collaboration), 2011, nucl. phys. b (proc. suppl.) 212–213, 74
[11] tokuno, h. et al. (telescope array), 2009, nucl. instrum.
meth. a601, 364
[12] tsunesada, y., 2011, in proceedings of the 32nd icrc, beijing, 12, 58
[13] véron-cetty, m.-p., véron, p., 2006, astron. astrophys. 445, 773
[14] zatsepin, g. t., kuzmin, v. a., 1966, jetp lett. 4, 78

discussion
peter grieder — concerning the differences in composition between auger and the telescope array: the two experiments see different sources. these may be of different nature. please comment.
todor stanev — the fields of view of auger and ta are different, but it is difficult to imagine that the cosmic ray composition differs that much. the fields of view of auger and hires overlapped by about 30 %, so hires should have seen some heavy nuclei. i do not believe that this is the reason for the disagreement.
laurence jones — we now know that the total p–p cross section rises to about 100 mb near 1 eev. do the monte carlo models used to determine the mass include the cross section rise?
todor stanev — the hadronic monte carlo models used for shower analysis have a rising cross section. the cross section of sibyll 2.1 is higher than the one measured at lhc. all interaction models are now being revised to match the measurements.
anatoly erlykin — will the extreme sharpness of the ankle in the published telescope array surface array energy spectrum be evidence against the dip model of its origin?
todor stanev — the first point of the ta energy spectrum is indeed quite high. since it is only one point at the detector threshold, where the detector is not fully efficient, i have not paid much attention to it.

acta polytechnica 53(2):117–122, 2013 © czech technical university in prague, 2013
the formation of a power multi-pulse extreme ultraviolet radiation in the pulse plasma diode of low pressure
ievgeniia v. borgun (a,*), mykola o. azarenkov (a), ahmed hassanein (b), oleksandr f. tseluyko (a), vasyl i. maslov (a), dmitro l. ryabchikov (a)
a karazin kharkiv national university, pl. svobody 4, kharkov 61022, ukraine
b school of nuclear engineering, purdue university, 400 central drive, west lafayette, in 47907–2017, usa
* corresponding author: ievgeniia.borgun@mail.ru
abstract. in this paper results are presented on experimental studies of the temporal characteristics of spike extreme ultraviolet (euv) radiation in the spectral range of 12.2 ÷ 15.8 nm from the anode region of high-current (i = 40 ka) pulsed discharges in tin vapor. it is observed that the intense multi-spike radiation in this range arises at an inductive stage of the discharge. it has been shown that the radiation spikes correlate with a sharp increase of the active resistance and of the pumped power, due to plasma heating by an electron beam formed in the double layer of charged particles. it has been observed that for a large number of spikes the conversion efficiency of pumped energy into radiation at double layer formation is essentially higher in comparison with collisional heating.
keywords: euv sources, plasma discharge, z-pinch, euv radiation, conversion efficiency.

1. introduction
this study is devoted to the investigation of phenomena occurring in a high-current pulse plasma diode, in which plasma with multi-charged tin ions generates extreme ultraviolet (euv) radiation.
one of the methods for the creation of high-power euv sources for nanolithography is the application of high-current discharges in tin vapor [3, 11]. an emission source based on the dense plasma of multi-charged tin ions has an advantage over gas-filled systems because of the expected higher conversion efficiency [7], and it is capable of operating at ultralow pressures. the latter is important for reducing the probability of parasitic breakdowns and the losses of radiation in the optical paths. since such a source is required to have a rather high output power [9], increasing the conversion efficiency of the supplied electric power into radiation is an important goal. a large conversion efficiency of the source is predicted for tin [9]. for high-contrast nonlinear photoresists operating in the euv range, used for nanolithography, above-threshold pulse intensities are required [9]. these can be provided by narrow peak pulses (spikes) of radiation. this work is aimed at investigating the processes affecting the efficiency of radiation generation in high-current tin-vapor discharges. the results are presented on studies of the generation of intense radiation in the 12.2 ÷ 15.8 nm wavelength range from an extended plasma diode operating in the regime of a self-sustained plasma-beam discharge [1]. the dense high-temperature plasma of multi-charged tin ions is produced by pulsed evaporation of the anode material, fast ionization of the vapor, and heating of the resulting plasma by the current and by the electron beam formed by the double electric layer in the anode region of the diode.

2. experiments
the experimental setup dedicated to studies of the euv yield from the plasma of multi-charged tin ions is presented in fig. 1. the setup consists of the pulsed high-current plasma diode with the igniting electrode, the photo-detector for measurement of the integral radiation intensity, and the semiconductor detector axuv-20 for measuring the radiation intensity in the selected wavelength range. the axuv-20 detector (fabricated by international radiation detectors inc., california), with an input mo–si optical filter for the 12.2 ÷ 15.8 nm wavelength range, was intensity-calibrated with the help of synchrotron radiation. the damped alternating current in the diode is excited between the cylindrical electrodes due to the discharge of the low-inductance capacitor bank c0 of capacity 2.0 µf at a starting pressure of 2 × 10⁻⁶ torr. the features of the applied discharge gap scheme are the use of electrodes with working surfaces of different sizes and the initial application of a positive voltage to the electrode with the small effective surface (the anode) and of a negative voltage to the electrode with the larger effective surface (the cathode). at a discharge current of 10 ÷ 40 ka and an effective anode surface of 2 ÷ 20 mm², the current density reaches values of about 0.2 ÷ 2.0 ma cm⁻². this leads to dense plasma formation near the anode and to the necessary stabilization of the position of this plasma. in addition, the small effective anode area provides suitable conditions for electrical double layer formation exactly in the near-anode region.
figure 1. scheme of the experimental setup: vc – vacuum chamber, sd – detector axuv-20, pd – photoelectron detector, a – anode, c – cathode, id – rogowski coil, vd – voltage divider, vig – igniting electrode.
the double layer formation has been observed in [10, 12].
in those articles, when the discharge current reaches a certain critical value, the whole voltage becomes concentrated in a narrow region and a high-current electron beam carrying the whole discharge current is formed. this narrow region divides the discharge into two parts. the conditions facilitating the formation of the double layer and the high-current electron beam in this type of discharge were studied further by the authors of [4, 5, 6]. in our article the double layer formation and its dynamics are not investigated in detail. based on the known methods of forming and stabilizing the double layer, we created experimental conditions such that the double layer forms near the electrode with the small effective surface. the double layer is an effective method for local plasma heating. thus the presence of the dense plasma and of its heating source in the same location makes it possible to form dense plasma with multi-charged ions in the near-anode region. the length of the discharge space can be varied from 3 to 10 cm, the diameter of the cathode is 10 mm, and the anode diameter equals 1.5 mm, 2.5 mm or 5 mm. the working surfaces of the electrodes are covered with a 0.5 mm thick tin layer. the side surface of the rod anode is enveloped by a tubular ceramic insulator to increase the current density at the anode. the discharge voltage is from 4 to 15 kv, the current amplitude from 10 to 40 ka, the current density on the anode reaches 0.2 ÷ 2.0 ma cm⁻², and the half-period of the current oscillations is 1.7 µs. the discharge current and voltage are measured with the rogowski coil id and the balanced voltage divider vd, respectively. the current through the diode is excited after the discharge interval is filled by preliminary plasma due to a surface discharge on the cathode, using an igniting electrode. a pulsed voltage of 0.2 ÷ 5.0 kv is supplied to the igniting electrode from a 0.025 µf capacitor through a thyratron and an inductance of 400 µh. the integral radiation of the plasma is monitored by measuring the photoelectron current. the photoelectrons are collected by a fine-mesh grid placed at a distance of 0.5 mm from the photocathode. the grid is grounded, and the photocathode is at a negative potential of −50 v, supplied from an autonomous voltage source. special measures are taken when measuring the radiation intensity in the selected wavelength range. to protect the axuv-20 detector from the plasma and charged particle beams, the inlet diaphragm channel is located in a transverse magnetic field (strength 0.2 t, length 25 cm). to exclude the effect of the photoelectron current on the detector signal, we use a set of shading diaphragms and the detector is biased by +20 v with respect to the channel. the diode discharge develops in two stages. the first stage begins with a surface breakdown at the cathode and finishes when the primary plasma reaches the anode. in the first stage, which lasts for 2 ÷ 6 µs (for an electrode gap length of 5 cm), the discharge operates in the regime of a vacuum diode with a plasma emitter. in this case, the discharge current is carried by the electron beam. thus the working surface of the anode is preheated by the beam and the initial vapor envelope is formed. in the second stage, when the dense plasma occupies the discharge gap, the discharge switches to the plasma diode regime, in which the discharge current is determined by the parameters of the plasma and the discharge circuit.
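in the plasma diode regime the external circuit behaves approximately as a damped lc oscillator, so the quoted bank capacity (2.0 µf) and half-period (1.7 µs) fix the effective circuit inductance. a minimal sketch under this series-rlc assumption is given below; the resistance and the charging voltage are illustrative values, not measured parameters of the experiment.

```python
import numpy as np

# series rlc model of the capacitor-bank discharge (a toy sketch;
# c and the half-period are the values quoted in the text, r and u0 are illustrative)
C = 2.0e-6                        # capacitor bank, f
T = 2 * 1.7e-6                    # full oscillation period, s (half-period 1.7 us)
L = (T / (2 * np.pi)) ** 2 / C    # implied circuit inductance, about 0.15 uh
R = 0.05                          # ohm, illustrative damping
U0 = 8e3                          # initial charging voltage, v

t = np.linspace(0.0, 6 * 1.7e-6, 2000)
gamma = R / (2 * L)
omega = np.sqrt(1.0 / (L * C) - gamma ** 2)   # damped angular frequency
# underdamped discharge current: i(t) = (u0 / (omega l)) exp(-gamma t) sin(omega t)
i = U0 / (omega * L) * np.exp(-gamma * t) * np.sin(omega * t)
print(f"L = {L * 1e9:.0f} nH, peak current = {np.abs(i).max() / 1e3:.0f} kA")
```

with u0 = 8 kv this toy gives a peak current near 30 ka, consistent with the quoted 10 ÷ 40 ka range.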
probe measurements have shown that between the vacuum diode and plasma diode stages there exists a transition regime, in which the electric double layer is formed near the anode surface. an intense electron beam is accelerated in the layer and acts on the anode surface. the high-current electric double layer exists for 0.5 µs. under these conditions, the material is intensely evaporated from the anode surface, the vapor is quickly ionized and the plasma is rapidly heated due to the beam–plasma interaction. the total energy pumped into the anode plasma and the anode itself during the first half-period of the discharge current reaches 80 % of the energy stored in the capacitor bank. the formation of the double layer in the transition regime is determined by the inability of the discharge gap plasma to provide the high discharge current density [1, 5]. as soon as the dense plasma builds up, the current is no longer limited and the double layer disappears. then the discharge operates in the inductive phase.

3. results and discussion
from the results of the experiments it follows that the intensive radiation in the 12.2 ÷ 15.8 nm wavelength range arises both in the transition regime and at the inductive stage of the discharge. the peculiarity of this radiation is that powerful (up to 1 mw) short (100 ÷ 200 ns) spikes appear against the background of wide radiation pulses with a duration of about a half-period of the discharge current oscillations. the transverse dimension of the region of generation is comparable with the diameter of the anode, and its length changes depending on the discharge voltage and equals 4 ÷ 7 mm. typical waveforms of the discharge current and voltage, the radiation intensity in the selected wavelength range and the integral radiation versus time are shown in fig. 2. it is seen that in the selected wavelength range there are several radiation spikes, correlating with the corresponding half-periods of the discharge current. in the first half-period there are both a wide pulse with a relatively small amplitude and a powerful spike, whereas the pulses emitted during the second and third half-periods have only the form of powerful, shorter, narrow spikes. it is clearly seen that the radiation spikes in the selected wavelength range coincide with spike pulses of the integral radiation (see fig. 2c and 2d). note that during the first half-cycle the narrow spike of duration 200 ns is observed only at discharge voltages above 7 kv. its intensity grows with increasing discharge voltage. more than 70 % of the energy radiated during the first half-cycle is concentrated in this peak pulse. the time of occurrence of the narrow peak pulse depends on the ignition voltage (for a higher voltage the peak pulse appears earlier). in the second and third half-cycles these radiation spikes are always observed near the maximum of the current. the radiation spikes are registered at current amplitudes above 10 ka. their intensities grow with increasing discharge voltage. note that in the second half-cycle of the current oscillations an additional spike-satellite of duration 200 ns is observed at discharge voltages of 5 ÷ 8 kv (fig. 2c). this pulse follows the basic spike by 200 ns. the intensity of the spike-satellite also grows with increasing discharge voltage. at voltages above 8 kv the spike-satellite disappears. two similar spikes during one half-period have been observed in [8].
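picking such 100 ÷ 200 ns spikes out of a digitized radiation trace and checking that they fall near the current maxima is a simple signal-processing task. a hedged sketch follows; it uses scipy's find_peaks, and the baseline estimate, the 5σ level and all names are illustrative choices, not the authors' analysis.

```python
import numpy as np
from scipy.signal import find_peaks

def find_radiation_spikes(t, intensity, min_width_ns=100, max_width_ns=300):
    """locate short intense spikes in a radiation waveform;
    t in seconds (uniform sampling), intensity in arbitrary units."""
    dt = t[1] - t[0]
    # a spike must stand well above the slowly varying wide pulse
    baseline = np.median(intensity)
    peaks, props = find_peaks(
        intensity,
        height=baseline + 5 * np.std(intensity),
        width=(min_width_ns * 1e-9 / dt, max_width_ns * 1e-9 / dt),
    )
    return t[peaks], props["peak_heights"]

def near_current_maximum(t_spike, half_period=1.7e-6, tolerance=0.3e-6):
    """true if a spike time lies near the centre of a half-period, i.e. near
    a maximum of |i(t)| for a damped sine that starts at zero."""
    phase = np.mod(t_spike, half_period)
    return np.abs(phase - half_period / 2) < tolerance
```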
effective generation of the radiation in the selected wavelength range requires additional energy pumping. in our case, the additional energy pumping is provided by the electron beam, generated by the electric double layer which is formed periodically. in fig. 3 the time dependence of the power pumped into the discharge and the corresponding radiation pulses are shown. one can see that the spikes of radiation coincide with the spikes of the power pumped into the discharge. the different time behavior of the radiation in this wavelength range in the first and subsequent half-periods can be explained as follows.
figure 2. waveforms of the (a) discharge current, (b) discharge voltage, (c) radiation intensity in the 12.2 ÷ 15.8 nm wavelength range, and (d) integral plasma radiation versus time; da = 2.5 mm, vdis = 8 kv, ldis = 5 cm.
in the first half-period, there is still an influx of neutral atoms into the anode region due to intense evaporation of the anode material. moreover, since the discharge current is high, the energy distribution of the plasma electrons is fairly wide. this leads to a significant collisional energy pumping; the broadening of the distribution function of ions over charge states; and, accordingly, the generation of ordinary recombination radiation within a wide spectral range with a relatively low intensity. by the end of the first half-period, the number of ions in the charge states corresponding to radiation in the 12.2 ÷ 15.8 nm wavelength range decreases. in the second and third half-periods the plasma already exists. it is only necessary to increase the charge states (i.e., the plasma temperature) of the ions up to the necessary ones. moreover, the plasma is relatively dense and focused during the previous half-period(s). in the second and third half-periods the neutral atom flux into the anode region is much smaller than in the first half-period, because the existing plasma shades the anode from electron beam bombardment.
figure 3. temporal dependences of (a) the power pumped into the discharge and (b) the radiation intensity in the 12.2 ÷ 15.8 nm wavelength range; da = 2.5 mm, vdis = 8 kv, ldis = 5 cm.
thus in the subsequent half-periods the energy expenditure on the formation of dense plasma with the necessary charge states of ions is essentially smaller than in the first half-period. after the formation of plasma with the necessary charge states of ions, in the second and third half-periods the distribution function of ions over charge states is narrower than in the first half-period. we examined how the radiation intensity and the conversion efficiency of the stored energy into radiation depend on the external conditions. fig. 4a shows how the total (within the 2π solid angle) energy of the radiation pulses (all pulses: spike and wide) in the 12.2 ÷ 15.8 nm wavelength range depends on the energy stored in the capacitor bank. the dependence is seen to reach a maximum at stored energies larger than 140 j. it is necessary to note that this maximum in the dependence of the radiation energy on the energy stored in the capacitor bank is observed for anodes of different diameters (from 1.5 to 5 mm). as shown in fig. 5, for a larger anode diameter this maximum shifts to the region of larger stored energy, and for a smaller anode diameter to the region of smaller stored energy. in fig. 4 the diameter of the anode equals 2.5 mm.
figure 4a also shows the energies of the radiation pulses emitted during the first, second, and third half-periods of the discharge current. at low stored energies, the radiation is mainly emitted during the first half-period. as the stored energy increases, the energies of the radiation pulses emitted in different half-periods approach one another. at stored energies exceeding 120 j, they become nearly equal. figure 4b shows the relative energies of the radiation pulses emitted in each half-period of the discharge current as functions of the stored energy.
figure 4. (a) radiation energies (spike and wide pulse energies) in the 12.2 ÷ 15.8 nm wavelength range during the first (w1r), second (w2r) and third (w3r) half-periods; (b) fractions of the energy radiated in the selected wavelength range in different half-periods of the discharge current as functions of the stored energy w0. here w0r is the total radiation energy emitted in the selected wavelength range during a discharge into the 2π solid angle; w1r/w0r, w2r/w0r, w3r/w0r are the energy fractions emitted in the first, second, and third half-periods, respectively; da = 2.5 mm, ldis = 5 cm.
figure 6a shows the relative energy pumped in each half-period of the discharge current as a function of the stored energy. it is seen that the energy is mainly pumped in the first half-period, during which the anode material is intensely evaporated and the neutral atoms are multiply ionized. in other words, the stored energy is mainly spent on the production of dense plasma. in comparison with the plasma formation energy, very little energy is spent on sustaining the plasma during the next half-periods. therefore, when the total stored energy is low, the rest of the energy pumped in the subsequent half-periods is low. however, the radiation intensity in those half-periods is relatively high. as the stored energy increases, the current in the second and third half-periods increases. the energy fraction pumped in the subsequent half-periods increases, thereby leading to an increase in the radiation energy. the equalization of the absolute and relative radiation energies emitted over one half-period of the discharge current with increasing energy, as well as the fact that the radiation intensity in the second and third half-periods is high in spite of a much lower (as compared to the first half-period) energy pump, indicates that the number of particles in the dense radiating plasma remains nearly constant for a fairly long time (approximately 6 µs).
figure 5. radiation energies (spike and wide pulse energies) in the 12.2 ÷ 15.8 nm wavelength range as functions of the stored energy for different anode diameters: da = 1.5, 2.5, 5.0 mm.
this is confirmed by the results obtained in [2], where long-lived dense plasma was also observed. thus, after the plasma has formed in the first half-period, it is supplied with energy in the subsequent half-periods, a fraction of the supplied energy being instantaneously converted into radiation. a detailed analysis of the energy pumped into the discharge during one half-period and the energy radiated over the same time permits finding the energy conversion efficiency for each half-period. figure 6b shows the conversion efficiency of the energy pumped into the discharge during each half-period into the radiation energy in the 12.2 ÷ 15.8 nm wavelength range, and the total energy conversion efficiency η0 = w0r/w0, as a function of the total stored energy.
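the per-half-period bookkeeping behind η_i = w_i^r/w_i can be written down directly: integrate the pumped power u(t)·i(t) and the radiated power over each half-period and take the ratio. a minimal sketch follows, assuming the three waveforms are available on a common uniform time grid; this illustrates the definition, not the authors' actual analysis code.

```python
import numpy as np

def half_period_efficiencies(t, u, i, p_rad, half_period=1.7e-6, n_half=3):
    """eta_k = w_k^r / w_k for the first n_half half-periods, where
    w_k is the pumped energy (integral of u*i) and w_k^r the radiated
    energy (integral of p_rad) over the k-th half-period.
    t in s (uniform sampling), u in v, i in a, p_rad in w."""
    dt = t[1] - t[0]
    etas = []
    for k in range(n_half):
        mask = (t >= k * half_period) & (t < (k + 1) * half_period)
        w_pumped = np.sum(u[mask] * i[mask]) * dt
        w_rad = np.sum(p_rad[mask]) * dt
        etas.append(w_rad / w_pumped)
    return etas
```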
it is seen that the conversion efficiency in the second and third half-periods reaches a few percent, the energy pumped into the discharge being 10 j. for a 2.5 mm diameter anode, there is an optimum at a stored energy of 80 j (see fig. 6b). a comparison of the radiation intensity in the selected wavelength range and the integral intensity shows that, when the stored energy rises above a certain value, the integral radiation intensity increases sharply, whereas the radiation intensity in the selected wavelength range increases insignificantly. presumably, this is related to the increasing number of tin ions with higher charge states, which leads to the generation of harder radiation and, therefore, to extra energy consumption.
figure 6. (a) energy fractions pumped during one half-period (w1, w2, and w3 are the pumped energies and w1/w0, w2/w0, and w3/w0 are the energy fractions pumped in the first, second, and third half-periods of the discharge current, respectively); (b) conversion efficiency ηi = wir/wi (i = 1, 2, 3) of the energy wi, pumped during one half-period, into the radiation energy wir in the 12.2 ÷ 15.8 nm wavelength range, as functions of the stored energy w0; w0r is the radiation energy in the selected wavelength range during all half-periods; da = 2.5 mm, ldis = 5 cm.

4. conclusions
multi-spike radiation in the 12.2 ÷ 15.8 nm wavelength range from a tin vapor discharge, important for the use of high-contrast nonlinear photoresists for nanolithography, has been observed. the duration of the radiation spikes is found to be much shorter than the half-period of the discharge current. in a narrow energy range a spike-satellite is observed in the second half-cycle. an analysis of the experimental data indicates the presence of long-lived dense plasma. when this plasma is optimally supplied with energy, radiation spikes are generated in the selected wavelength range. the analysis of the experimental data indicates that the intensity of these spikes is mainly determined by the discharge current and by the energy pumped into the discharge. for the development of an efficient radiation source it is expedient that the radiating plasma be used repeatedly, because most of the stored energy is spent on plasma formation. in order to achieve quasi-steady emission, the discharge should be supplied with optimal portions of energy. the possibility of generating powerful (several mw) extreme ultraviolet radiation in the form of a train of spikes of duration 200 ns in the high-current pulse plasma diode, in the inductive stage of the discharge development, at current densities at the anode of 0.2 ÷ 2.0 ma cm⁻², has been shown. the powerful radiation spikes are generated under conditions of plasma heating by an electron beam. the electron beam is formed by the electric double layer. the conversion efficiency of pumped energy into radiation energy at double layer formation (beam mechanism of plasma heating) is essentially higher in comparison with ordinary heating by the current, because in the first case approximately all the energy is pumped into a very small volume. the use of a plasma diode with an anode of small dimension (in comparison with the cathode dimension) helps to control and spatially stabilize the dense plasma localization and the position of the double layer formation. under this condition the region of dense plasma localization coincides with the source of its heating.

references
[1] v. n. borisko et al. the formation of low-sized high density plasma structures in the self-maintained plasma-beam discharge.
probl. at. sci. technol. ser. plasma phys. 12(6):225, 2006.
[2] v. a. burzev et al. about recombination nonequilibrium of plasma in low-inductive capillary discharges. jetp letters 33(1), 2007.
[3] a. hassanein et al. simulation and optimization of dpp hydrodynamics and radiation transport for euv lithography devices. proc. of spie 5374:413–422, 2004.
[4] e. i. lutsenko, n. d. sereda, a. f. tseluyko. dynamic double layers in high-current plasma diodes. journal of technical physics 58(7):1299–1309, 1988. (in russian).
[5] e. i. lutsenko, n. d. sereda, a. f. tseluyko, a. a. bizukov. the dynamic characteristics of the beam-plasma self-discharge. ukrainian physics journal 33(5):730–736, 1988. (in russian).
[6] v. i. maslov. properties and evolution of nonstationary double layers in nonequilibrium plasma (invited lecture). in proc. of 4th symp. on double layers and other nonlinear structures in plasmas, pp. 82–92. innsbruck, 1992.
[7] m. masnavi et al. estimation of optimum density and temperature for maximum efficiency of tin ions in z discharge extreme ultraviolet sources. j. appl. phys. 101:033306, 2007.
[8] g. niimi et al. observation of multi-pulse soft x-ray lasing in a fast capillary discharge. j. phys. d: appl. phys. 34:2123, 2001.
[9] r. seysyan. nano-lithography slis by extreme ultraviolet radiation (review). journal of technical physics 75(5):1–13, 2005. (in russian).
[10] v. k. suladze, b. a. tshadaya, a. a. plutto. features of the formation of intense beams of electrons confined by plasma. jetp letters 10(6):282–285, 1969. (in russian).
[11] y. tao et al. investigation of the interaction of a laser pulse with a preformed gaussian sn plume for an extreme ultraviolet lithography source. j. appl. phys. 101:023305, 2007.
[12] b. a. tshadaya, a. a. plutto, k. v. suladze. the formation of pulse ion beams in the high-current plasma diode. journal of technical physics 44(8):1779–1780, 1974. (in russian).

acta polytechnica 53(supplement):799–802, 2013 © czech technical university in prague, 2013, doi:10.14311/ap.2013.53.0799
grb investigations by esa gaia and loft
rené hudec (a,b,*), vojtěch šimon (a), lukáš hudec (b)
a astronomical institute, academy of sciences of the czech republic, cz-25165 ondřejov, czech republic
b czech technical university in prague, faculty of electrical engineering, prague, czech republic
* corresponding author: rhudec@asu.cas.cz
abstract. the possibility of studying grbs with the esa gaia and loft missions is briefly addressed. the esa gaia satellite, to be launched in november 2013, will focus on high precision astrometry of stars and all objects down to a limiting magnitude of 20. the satellite will also provide photometric and spectral information and hence important inputs for various branches of astrophysics, including the study of grbs and related optical afterglows (oas) and optical transients (ots). the strength of gaia in grb analyses will be the fine spectral resolution (spectro-photometry and ultra-low dispersion spectroscopy), which will allow the correct classification of related triggers. an interesting feature of the gaia bp and rp instruments will be the study of highly redshifted triggers. similarly, the low dispersion spectroscopy provided by various plate surveys can also supply valuable data for investigations of high-energy sources.
the esa loft candidate mission, now in the assessment study phase, will also be able to detect grbs and be used in their study, with emphasis on the low-energy (x-ray) emission.
keywords: gamma ray bursts, satellites: gaia, spectroscopy: low-dispersion spectra, loft.

1. introduction
this paper briefly discusses the potential of the esa gaia and loft missions for studying gamma-ray bursts (grbs). the gaia mission will chart a three-dimensional map of our galaxy, the milky way, in the process revealing its composition, formation and evolution. the satellite is expected to provide unprecedented positional and radial velocity measurements with the accuracies needed to produce a stereoscopic and kinematic census of about one billion stars in our galaxy and throughout the local group. combined with astrophysical information for each star, provided by on-board multi-color photometry/low-dispersion spectroscopy, these data will have the precision necessary to quantify the early formation, and the subsequent dynamical, chemical and star formation evolution of the galaxy [6]. gaia will provide several advantages for studies of the optical counterparts of gamma-ray bursts (grbs). first, it will have a deep limiting magnitude of 20 mag [5], much deeper than most previous studies and global surveys. secondly, the time period covered by gaia observations, i.e. 5 years, will also allow some studies requiring long-term monitoring, recently provided mostly by astronomical plate archives and by small or magnitude-limited sky ccd surveys. but perhaps the most important benefit of gaia for these studies will be the color (spectral) resolution provided by the low-resolution (prism) gaia photometer. this will allow detailed studies involving analyses of the color and spectral changes that were not possible before. the studies of the optical counterparts of high-energy sources are described in detail in the dedicated sub-workpackages within the specific objects studies workpackage of gaia cu7 [3, 4]. the main objective is to investigate the optical counterparts of high-energy astrophysical sources, including high-mass x-ray binaries, low-mass x-ray binaries, x-ray transients, x-ray novae, microquasars, optical transients (ots) and optical afterglows (oas) related to x-ray flashes and grbs, etc. we put emphasis on the photometric mode rp/bp and its potential use for analyses of the optical counterparts of grbs. the use of the dispersive element (prism) generates ultra-low dispersion spectra. one disperser, called bp (blue photometer), operates in the wavelength range of 330 ÷ 660 nm; the other disperser, called rp (red photometer), covers the wavelength range of 650 ÷ 1000 nm. the dispersion is higher at short wavelengths, and ranges from 4 to 32 nm/pixel for bp and from 7 to 15 nm/pixel for rp. the esa loft satellite, now in the assessment phase, will also have a potential for grb investigations, with emphasis on their x-ray emission, as briefly discussed below.

2. gaia and grbs: photometry
ots and oas of grbs display characteristic light curves. these events usually reach their peak optical luminosity in the initial phase, shortly (several minutes) after the gamma-ray emission, which typically lasts from a fraction of a second to several minutes. in the later, much longer phase, which can last for several (even more than 10) days, oas usually display a characteristic power-law fading profile.
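this power-law fading determines how long an oa stays above the gaia detection limit. a minimal sketch is given below; the decay index α = 1.2 and the 17 mag reference brightness one hour after the burst are illustrative assumptions (not gaia specifications), chosen to match the typical behavior quoted from zhang [13] just below.

```python
import numpy as np

def afterglow_mag(t_days, m_ref=17.0, t_ref_days=1.0 / 24.0, alpha=1.2):
    """apparent magnitude at time t_days after the burst for a power-law
    decay f ~ t**(-alpha); m_ref is the magnitude at t_ref_days."""
    return m_ref + 2.5 * alpha * np.log10(t_days / t_ref_days)

# how long does such an oa stay brighter than gaia's ~20 mag limit?
limit = 20.0
t = 10 ** np.linspace(-2, 2, 400)          # 0.01 to 100 days
visible = t[afterglow_mag(t) < limit]
print(f"above the limit until ~{visible.max():.1f} days after the burst")
```

with these assumed numbers the oa drops below 20 mag within about half a day, which is why the early phase and the scanning cadence matter for gaia detections.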
a sequence of observations mapping this oa light curve is therefore necessary. according to zhang [13], most oas are fainter than about 18 mag already about 1 day after the grb, although some of them can even be brighter than 14 mag in the early phase. gaia is therefore definitely able to detect these oas in their early phase. however, the sampling provided by gaia is not optimal, so only rarely can we expect an oa of a grb to be detected on the basis of this type of data alone. additional data can be provided by ground-based robotic telescopes (supplementary sos observations wp in gaia cu7). we will show that the low-dispersion spectroscopy provided by gaia will be very helpful in identifying new sources as oas, even without an available grb detection.

3. gaia and grbs: spectro-photometry/low-dispersion spectroscopy
the gaia instrument consists of two low-resolution fused-silica prisms dispersing all the light entering the field of view (fov). two ccd strips are dedicated to photometry, one for bp and one for rp. both strips cover the full astrometric fov in the across-scan direction. all bp and rp ccds are operated in tdi (time-delayed integration) mode. the ccds have 4500 (for bp) or 2900 (for rp) tdi lines and 1966 pixel columns (10 × 30 micron pixels). the spectral resolution is a function of wavelength as a result of the natural dispersion curve of fused silica. the bp and rp dispersers have been designed in such a way that the bp and rp spectra have similar sizes (of the order of 30 pixels along the scan). bp and rp spectra will be binned on-chip in the across-scan direction; no along-scan binning is foreseen. rp and bp will be able to reach object densities in the sky of at least 750 000 objects deg⁻². the obtained images can be simulated by the gibis simulator (fig. 1).
figure 1. bp (top) and rp (bottom) images of the same sky field, simulated by the gibis simulator.
despite the low dispersion, the major strength of gaia for many scientific fields will be the fine spectrophotometry, as the low dispersion spectra may be transferred to numerous well-defined color filters. we have shown previously [11, 12] that the individual oas of grbs display quite specific and remarkably similar color indices, with negligible changes during the first several days after the grb (an example of such a color-color diagram is shown in fig. 2). this feature is important for distinguishing oas from other types of astrophysical objects. it suggests that although oas possess a large range of redshifts z, they display very similar spectra in the observer frame for z < 3.5. this gives us a chance to resolve whether an optical event is related to a grb even without an available gamma-ray detection. it will be possible to classify optical transients using this method. this also means that it will be possible to classify oas of grbs in one photometric shot.
figure 2. example of the color–color diagram of oas of long grbs. the data for the time interval < 10.2 d after the burst in the observer frame, corrected for the galactic reddening, are displayed. multiple indices of the same oa are connected by lines for convenience. the mean colors (centroid) of the whole ensemble of oas (except for grb 000131 and sn 1998bw) are marked by the large cross. the representative reddening paths for e(b−v) = 0.5 mag and the positions of the main-sequence stars are also shown. adapted from [11, 12].
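the classification idea above can be sketched as follows: synthesize two color indices from a low-dispersion spectrum and test the proximity of the resulting point to the oa centroid in the color-color plane. the band edges, the centroid and the acceptance radius below are illustrative placeholders, not the calibrated values of [11, 12].

```python
import numpy as np

def synthetic_color(wl_nm, flux, band1, band2):
    """color index (in magnitudes, up to a zero point) between two
    wavelength bands of a low-dispersion spectrum; bands are (lo, hi) in nm.
    wl_nm and flux are numpy arrays sampling the spectrum."""
    def mean_flux(lo, hi):
        m = (wl_nm >= lo) & (wl_nm <= hi)
        return flux[m].mean()
    return -2.5 * np.log10(mean_flux(*band1) / mean_flux(*band2))

def looks_like_oa(wl_nm, flux, centroid=(0.5, 0.4), radius=0.3):
    """crude oa test: is the point in the color-color plane close to the
    oa centroid? centroid and radius are illustrative placeholders."""
    c1 = synthetic_color(wl_nm, flux, (330, 550), (550, 660))    # 'blue' index from bp
    c2 = synthetic_color(wl_nm, flux, (650, 800), (800, 1000))   # 'red' index from rp
    return np.hypot(c1 - centroid[0], c2 - centroid[1]) < radius
```

in a real pipeline the bands would be replaced by the calibrated filters derived from the bp/rp passbands, and the centroid and radius by the values measured for the oa ensemble of [11, 12].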
4. low-dispersion spectral databases
before gaia, low dispersion spectra were frequently taken in the 20th century by various photographic telescopes with objective prisms. the motivation for plate survey low-dispersion spectral studies is as follows: (1) compare the simulated gaia bp/rp images with images obtained from digitized schmidt spectral plates (both using dispersive elements) for selected test fields, and (2) study the feasibility of applying the algorithms developed for the plates to gaia. the dispersion is an important parameter: (1) gaia bp: 4 ÷ 32 nm/pixel, i.e. 400 ÷ 3200 nm/mm (9 nm/pixel, i.e. 900 nm/mm at hγ); rp: 7 ÷ 15 nm/pixel, i.e. 700 ÷ 1500 nm/mm; with a psf fwhm of ∼ 2 px, the spectral resolution is ∼ 18 nm; (2) schmidt sonneberg plates (typical mean values): the dispersion is 10 nm/mm at hγ for the 7 deg prism and 23 nm/mm at hγ for the 3 deg prism; the scan resolution is 0.02 mm/px, thus about 0.2 and 0.5 nm/px, respectively; (3) bolivia expedition plates: 9 nm/mm, with a calibration spectrum; (4) hamburg qso survey: 1.7 deg prism, 139 nm/mm at hγ, spectral resolution of 4.5 nm at hγ; (5) byurakan survey: 1.5 deg prism, 180 nm/mm at hγ, resolution 5 nm at hγ. it emerges that the gaia bp/rp dispersion is ∼ 5 to 10 times lower than that of a typical digitized spectral prism plate, and the spectral resolution ∼ 3 to 4 times lower. note that for the plates the spectral resolution is seeing-limited, hence these values represent the best values; it is only ∼ 2 times lower on plates affected by bad seeing.

5. gaia and grbs: proposed strategy and detection rate
as the duration of most oas is about 10–20 days in the observer frame, they are likely to be detected by gaia during its scans even without rapid pointing at the grb position. however, this assumes that they will occur in the fov of the gaia telescopes. as already indicated, an oa can be recognized from several features, even without information on the time profile. the following features appear to be important: (1) unique color indices, (2) rapid rise (a new object appears between two scans), (3) the host galaxy of the grb at the position of the oa – this galaxy can be detected later by ground-based observations. it should be noted that even a search for so-called orphan afterglows will be possible with gaia. missing gamma-ray emission, with only an oa remaining, can also indicate this important type of event. the gamma-ray emission from many grbs remains unobservable because the jet is not pointing towards the observer, but the late-time oa is less beamed and can reach us [8, 10]. failed grbs can also contribute to the population of orphan afterglows [1]. the estimated gaia detection rate for oas of grbs, including orphans, is expected to be up to ∼ 100 over the whole lifetime of gaia (5 years). this low rate is due to the small fov of the gaia telescopes (∼ 0.36 deg² each). a higher detection rate is expected in the plate lds surveys mentioned in this paper (due to a much larger fov), in which analogous strategies (e.g. high-redshift triggers) can be applied.

6. gaia lds and the highly redshifted universe
the redshifted lyman alpha line/break can be used to measure the value of z. this was, e.g., the idea of the proposed janus space mission with a coverage range of 0.7 ÷ 1.7 microns (gaia rp has a coverage of 0.65 ÷ 1.0 microns). grbs are located at cosmological distances, often with z > 0.5 (e.g. [9]), and the lyman break is shifted to the optical band for objects at z larger than about 3.5.
this break manifests itself as a sharp decrease of the flux in the blue part of the spectrum. this feature is prominent in the smooth spectral profile of an oa. the oa will therefore appear shifted from its true position because of the lack of the blue part of its spectrum. a comparison of the accurate position of the oa obtained by gaia in astrometric mode with the blue edge of its spectrum can be used to easily distinguish objects occurring in our galaxy from those located at cosmological distances. it will also be possible to determine z, and hence the distance necessary for determining the luminosity: the observed wavelength of the break gives the redshift directly via λ_break ≈ (1 + z) × 121.6 nm, so the red limit of rp at ∼ 1000 nm corresponds to z ∼ 7. digitized plate surveys can also be used for discovering and identifying oas, especially those taken in the red-ir range (numerous surveys in this region have been made in the past for red objects, such as carbon stars).

7. esa loft
the large observatory for x-ray timing (loft) is a proposed esa space mission to be launched around 2022, which will be devoted to the study of neutron stars, black holes and other compact objects by means of their very rapid x-ray variability [14]. the mission was submitted in the esa cosmic vision m3 call for proposals, and was selected, together with three other missions, for the initial assessment phase. two onboard experiments are proposed.

7.1. loft lad and grbs
the lad (large area detector) is a collimated pointed experiment, so it is very improbable that it will detect a grb by chance. however, it can be used for grb follow-up by pointing at the grb positions. the energy range (2 ÷ 60 kev) is consistent with x-ray afterglows of grbs, and with x-ray flares. the obvious preference is for the fine time resolution (5 µs) together with the energy resolution (< 200 ev), expected to provide new insights into grb physics, including spectroscopy (possible emission lines, etc.).

7.2. loft wfm and grbs
the loft wfm (wide field monitor) has a very large instantaneous field of view (> 3 sr), and also an unprecedented capability for detecting rare, short-lived, bright sources. the energy range of 2 ÷ 50 kev is consistent with the range in which x-ray afterglows and the x-ray prompt emission of grbs, x-ray rich grbs and x-ray flashes radiate. this offers a unique possibility to study very soft x-rays from these objects. the wfm has moderate sensitivity (0.2 crab for 1 s and 5σ detection) and energy resolution (< 200 ev). the instrument will localize triggers (with few-arcmin accuracy). the soft x-ray coverage (∼ 1 kev) will allow grbs at very high z to be detected and investigated. the preliminary approximate expected rates are 150 grbs and 30 xrfs/year.

8. conclusions
the esa gaia satellite will contribute to scientific investigations of grbs not only by providing long-term photometry but also by the ultra-low dispersion spectra provided by the bp and rp photometers. these data will present a new challenge for astrophysicists and for informatics in general. in the field of grb study, the advantages of gaia and loft can be briefly summarized as follows.
• gaia: a unique opportunity to provide early or simultaneous lds for grbs (so far, lds is mostly late)
• gaia: an opportunity to recognize/classify oas and ots of grbs using lds and/or color information, even in their later phases and without a known grb
• gaia: an opportunity to detect/study orphan oas of grbs
• gaia: an opportunity to estimate redshifts up to ∼ 7
• loft: wfm & lad: unique tools for detecting and studying grbs and xrfs, with emphasis on (soft) x-rays

acknowledgements
the scientific part of the study is linked to the grant 102/09/0997 provided by the grant agency of the czech republic (gacr). the analyses of astronomical plates are supported by the gacr grant 13-39464j.

references
[1] huang, y. f., dai, z. g., lu, t., 2002, mnras 332, 735
[2] hudec, l., algorithms for spectral classification of stars, bsc. thesis, charles university, prague, 2007
[3] hudec, r., šimon, v., 2007a, specific object studies for cataclysmic variables and related objects, esa gaia reference code gaia-c7-tn-aio-rh-001-1
[4] hudec, r., šimon, v., 2007b, specific object studies for optical counterparts of high energy sources, esa gaia reference code gaia-c7-tn-aio-rh-002-1
[5] jordi, c., carrasco, j. m., in the future of photometric, spectrophotometric and polarimetric standardization, asp conference series 364, 215, 2007
[6] perryman, m. a. c., in astrometry in the age of the next generation of large telescopes, asp conference series 338, 3, 2005
[7] perryman, m., et al., gaia overall science goals, http://sci.esa.int/gaia/, 2006
[8] rhoads, j., 1997, apj 487, l1
[9] robertson, b. e., ellis, r. s., 2012, apj 744, 95
[10] rossi, e. m., perna, r., daigne, f., 2008, mnras 390, 675
[11] šimon, v., hudec, r., pizzichini, g., masetti, n., 2001, a&a 377, 450
[12] šimon, v., hudec, r., pizzichini, g., masetti, n., 2004, in gamma-ray bursts: 30 years of discovery: gamma-ray burst symposium, aip conference proceedings 727, 487–490
[13] zhang, b., 2007, chjaa 7, 1
[14] feroci, m. et al., 2012, proceedings of spie, vol. 8443, paper no. 8443 (arxiv:1209.1497)

acta polytechnica 53(supplement):825–828, 2013 © czech technical university in prague, 2013, doi:10.14311/ap.2013.53.0825
the auger engineering radio array
klaus weidenhaupt (a,*), for the pierre auger collaboration (b)
a 3. physikalisches institut a, rwth aachen university, aachen, germany
b observatorio pierre auger, av. san martín norte 304, 5613 malargüe, argentina (full author list: http://www.auger.org/archive/authors_2012_06.html)
* corresponding author: weidenhaupt@physik.rwth-aachen.de
abstract. the auger engineering radio array currently measures mhz radio emission from extensive air showers induced by high energy cosmic rays with 24 self-triggered radio detector stations. its unique site, embedded into the baseline detectors and extensions of the pierre auger observatory, allows air showers to be studied in great detail and the radio emission to be calibrated.
in its final stage aera will expand to an area of approximately 20 km² to explore the feasibility of the radio-detection technique for future cosmic-ray detectors. the concept and hardware design of aera, as well as strategies to enable self-triggered radio detection, are presented. radio emission mechanisms are discussed based on a polarization analysis of the first aera data.
keywords: aera, radio detection, cosmic rays, air shower, pierre auger observatory.

1. introduction
during the last years significant progress has been made in the radio detection of cosmic rays, both in experiments and in theoretical calculations. experiments such as lopes [1] and codalema [2] use arrays of radio antennas to detect coherent vhf (10 ÷ 100 mhz) radio emission from extensive air showers induced by cosmic rays. geomagnetic radio emission has been identified as the dominating process, and subdominant effects are currently being studied. by including information from particle detectors on the ground, the characteristics of the radio pulse are investigated with respect to the fundamental properties of the primary cosmic rays. on the theoretical side, various approaches such as reas [3] and mgmr [4] simulate the radio emission from air showers in detail. they predict a quadratic scaling of the emitted radio power with the energy of the primary cosmic ray. furthermore, a sensitivity of the lateral distribution of the pulse amplitude to the chemical composition of the primary cosmic ray was found. experimental evidence for this phenomenon has recently been observed by the lopes collaboration [5]. these results suggest that the radio-detection technique, with its high duty cycle close to 100 %, is well suited for measuring all relevant parameters of cosmic rays. as a next step, the auger engineering radio array (aera) now explores radio detection at ultra-high energies (e ≥ 10^18 ev).

2. aera
aera is a radio extension of the pierre auger observatory located in argentina. the pierre auger observatory follows a hybrid concept to detect ultra-high-energy cosmic rays via their induced air showers.
figure 1. layout of aera. each marker denotes the position of a radio detector station.
an array of 1660 water-cherenkov tanks covering an area of 3000 km² detects secondary shower particles at ground level. this surface detector array is overseen by 27 fluorescence telescopes at 4 sites, which track the emission of fluorescence light during shower development in the atmosphere. aera is situated within the field of view of the fluorescence telescopes coihueco and heat [6] and is co-located with the extension amiga [7], a combination of muon detectors and additional water-cherenkov tanks. thus, a unique environment is formed to study air showers with complementary detection techniques and to calibrate the radio detectors. the deployment of aera started in 2010 with a dense core of 24 radio detector stations arranged on a triangular grid with a spacing of 150 m (fig. 1). since spring 2011, these stations have been operational and successfully measure the radio emission from air showers in a self-trigger mode. more than 100 cosmic-ray events have been recorded in coincidence with the surface detector, and several “super-hybrid” events have been observed with the fluorescence detector of the pierre auger observatory as well.
in its final stage aera will consist of 161 radio-detector stations with a spacing of 150 m (the deployed 24 stations), 250 m (52 stations) and 375 m (85 stations), covering nearly 20 km². for the final layout, several thousand cosmic-ray events per year with energies above 10¹⁷ ev are expected [8]. the layout of aera aims to fulfill the following scientific goals:
• calibration of the radio emission from air showers by super-hybrid measurements. this includes the understanding of the underlying dominant and subdominant emission mechanisms.
• exploring the feasibility of radio detection for future cosmic-ray experiments, and evaluating the quality of the angular, energy and primary mass measurements.
• composition measurements in the transition region from galactic to extragalactic cosmic rays.
2.1. radio detector stations and calibration
the hardware design of the autonomous radio-detector stations (rdss) evolved from various prototype setups at the pierre auger observatory. these allowed us to define the technical requirements for successful self-triggering and to examine the environmental demands at the site in the argentinean pampas. the stations of the dense core (fig. 2) utilize dual-polarized log-periodic dipole antennas, aligned north–south and east–west. the bandwidth from 30 to 80 mhz corresponds to the relatively radio-quiet region between the short-wave and fm bands. the received signals pass a two-stage amplification chain and band-pass filters before they are digitized. the custom-built digitizers are based on 12-bit adc units at a 200 mhz sampling rate. the data is transmitted via optical fibers or a wireless link to the central radio station, which hosts the daq system. the stations are powered by a photovoltaic system. to prevent triggering on noise generated by the station electronics itself, all critical components are shielded by a radio-frequency-tight chamber.
the physical quantity of interest, the time-dependent electric field emitted by the air shower, has to be reconstructed from the voltage traces measured by the rdss. this requires an end-to-end calibration of the entire signal chain, including the antenna. all relevant electronic components of the aera stations (amplifiers, cables and digitizers) were calibrated by individual measurements. the antenna gain has been measured as a function of the zenith angle at the aera site. for this purpose, a calibrated signal source attached to a helium balloon was used in the far-field region of the antenna. the results are in fair agreement with the corresponding simulations.
figure 2. photograph of an aera radio-detector station.
2.2. self-triggering
to demonstrate the feasibility of radio detection as a stand-alone technique, aera is capable of self-triggering on the measured voltage traces. this is a technical challenge, as one has to deal with various man-made noise sources. even in the relatively radio-quiet environment in argentina, sources of transient noise have been discovered. by assuming that signals propagate as spherical waves, the location of the sources can be reconstructed by a spherical wave fit if the signal was measured at at least four different positions. the reconstructed source positions of self-triggered radio events are plotted in a map of the aera site in fig. 3. various hot spots are visible. coincidences with the positions of potential man-made noise sources, such as transformer stations, can be observed. various methods have been developed for rejecting these noise sources.
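to make the spherical-wave step concrete, the sketch below shows one way such a fit could be set up: given pulse arrival times at four or more stations, the source position and emission time are found by least squares. the station layout, the constant propagation speed c and the synthetic timing data are assumptions for illustration; this is not the actual aera reconstruction code.

```python
import numpy as np
from scipy.optimize import least_squares

C = 3.0e8  # assumed constant signal propagation speed [m/s]

def spherical_fit(stations, t_arr, guess=None):
    """fit a point source (x, y, z, t0) to measured pulse arrival times.

    stations : (n, 3) array of station positions [m], n >= 4
    t_arr    : (n,) array of measured arrival times [s]
    """
    stations = np.asarray(stations, float)
    t_arr = np.asarray(t_arr, float)
    if guess is None:
        # start at the array barycentre, emission just before the first arrival
        guess = np.append(stations.mean(axis=0), t_arr.min() - 1e-6)

    def residuals(p):
        src, t0 = p[:3], p[3]
        # spherical wave: arrival time = emission time + distance / c
        return t0 + np.linalg.norm(stations - src, axis=1) / C - t_arr

    return least_squares(residuals, guess).x

# synthetic check with an assumed station layout and noise-source position
stations = np.array([[0, 0, 0], [150, 0, 0], [75, 130, 0],
                     [225, 130, 0], [-75, 130, 0]], float)
true_src, true_t0 = np.array([300.0, 50.0, 0.0]), 1.0e-5
t_meas = true_t0 + np.linalg.norm(stations - true_src, axis=1) / C
print(spherical_fit(stations, t_meas))  # ~ [300, 50, 0, 1e-5]; z is poorly constrained by a planar array
```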
the rds electronics host a field-programmable gate array (fpga), where sophisticated real-time algorithms can be implemented. before triggering on a radio pulse, the signal-to-noise ratio is increased by filtering out narrow-band transmitters with digital notch filters. the trigger itself consists of a threshold in the time domain and also considers various quantities which are sensitive to the shape of the pulse expected from cosmic-ray induced air showers [10]. events often occur with a certain periodicity, e.g. with a frequency of 100 hz, matching twice the frequency of the power grid in argentina. these events are rejected directly at the station level by an algorithm which dynamically adjusts to the phase drift of these periodic pulses. at the central radio station a fast plane wave fit is performed to reconstruct the arrival direction of events. events coming from known hot spots can then be excluded.
figure 3. map of the aera site (the stations of the dense core are in the center) with directions of 2.35 × 10⁶ reconstructed radio events. each symbol represents the reconstructed source point of a self-triggered radio event with a zenith angle larger than 60°. color-coded is the density of events. in addition, suspected sources of the transient radio pulses are marked. from [9].
3. polarization studies
although the emission of mhz radiation from air showers is not fully understood yet, two different emission mechanisms are currently considered: geomagnetic emission and emission due to the charge-excess effect. a robust observable for investigating different emission mechanisms is the polarization signature of the electric field emitted by air showers. due to its dual-polarized antennas and extensive calibration, aera is a suitable tool for measuring these observables with high precision.
3.1. geomagnetic emission
charged secondary shower particles are deflected in the earth's magnetic field by the lorentz force, which causes the emission of electromagnetic radiation. the superposition of the radiation from single shower particles produces an electromagnetic pulse which appears to be coherent in the radio frequency region. the polarization of the electric field e is then given by the direction of the lorentz force:
e ∝ n × b (1)
where n is the direction of the shower axis and b is the local magnetic field vector. the polarization signature of aera data has been investigated based on radio events measured in coincidence with the surface detector, as these events likely result from cosmic-ray induced air showers. by cutting on at least three triggered rdss and on zenith angles smaller than 55°, 29 events were selected.
figure 4. polar skyplot of measured polarizations in comparison to the n × b calculation. all vectors are normalized to unity. note that at least 3 black vectors are plotted on top of each other for a specific incoming direction, according to the multiple measurements of the electric field vector at the different rdss triggered in the event. the dashed line indicates incoming directions perpendicular to the magnetic field vector (red star) at the aera site.
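as a small numerical companion to eq. 1, the following sketch evaluates the predicted geomagnetic polarization direction n × b for a given arrival direction and compares it with a measured field vector. the magnetic field direction and the "measured" vector used here are hypothetical stand-ins, not calibrated aera values.

```python
import numpy as np

def shower_axis(zenith_deg, azimuth_deg):
    """unit vector n along the propagation direction of a downward shower."""
    th, ph = np.radians(zenith_deg), np.radians(azimuth_deg)
    return np.array([np.sin(th) * np.cos(ph),
                     np.sin(th) * np.sin(ph),
                     -np.cos(th)])

def geomagnetic_polarization(zenith_deg, azimuth_deg, b_field):
    """predicted polarization direction e ∝ n × b, normalized to unity."""
    v = np.cross(shower_axis(zenith_deg, azimuth_deg), b_field)
    return v / np.linalg.norm(v)

# hypothetical stand-in for the local geomagnetic field direction
b = np.array([0.0, 0.20, -0.98])
b /= np.linalg.norm(b)

predicted = geomagnetic_polarization(40.0, 30.0, b)
measured = np.array([0.55, -0.82, 0.15])      # hypothetical measured e-field vector
measured /= np.linalg.norm(measured)
# polarization is an axis, so compare directions up to an overall sign
deviation = np.degrees(np.arccos(abs(np.dot(predicted, measured))))
print(f"predicted {predicted}, angular deviation {deviation:.1f} deg")
```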
the electric field traces were reconstructed at each triggered rds with the radio extension of the standard auger reconstruction framework offline [11]. afterwards, the electric field vector was extracted at the pulse position. in fig. 4, the east-west and north-south components of the measured electric field vectors are plotted in a skyplot corresponding to their arrival direction given by the surface detector. all measured electric field vectors corresponding to one cosmic-ray event point essentially in the same direction. this implies that all stations measure approximately the same linear polarization. furthermore, the measured polarization is in good agreement with the predictions of the calculation following eq. 1. this gives evidence that, for those air-shower events and within the bandwidth of our setup, the radio emission is mainly of geomagnetic origin. the angular distribution of radio events shown in fig. 4 supports this conclusion. equation 1 implies that the strongest radio pulses are emitted if the shower axis is perpendicular to the earth's magnetic field vector. if the radio detector triggers above a threshold for the pulse amplitudes, an anisotropy in the arrival directions is expected. this anisotropy is clearly observed in fig. 4. the dominant role of geomagnetic emission is well known and has recently been observed at aera prototype setups [12], and by lopes [1] and codalema [2]. since, for the measured events and the present setup, the geomagnetic emission dominates the emitted signal strength, an estimator for the primary cosmic ray's energy can be formulated by correcting for the arrival direction using eq. 1 and taking into account an exponential lateral distribution function. a calibration of this radio-energy estimator versus the energy given by the surface detector was recently presented at the arena conference [13].
3.2. charge-excess emission
askaryan first predicted that a negative net charge in the air shower can also lead to a coherent emission of mhz radiation [14]. this emission mechanism implies an essentially different polarization signature compared to the geomagnetic emission. the electric field is polarized radially with respect to the shower axis. observers placed at different positions with respect to the shower core would thus measure different polarizations. in a polarization analysis of data from an aera prototype setup, experimental evidence was found that the charge-excess effect contributes to the total radio emission [12]. similar analyses are being performed with aera data and will be presented in a forthcoming publication.
4. conclusions
in its initial stage, aera has successfully detected cosmic rays by self-triggering on mhz radio emission. various strategies have been implemented to reject the background of transient man-made noise and to enable stable data taking. the end-to-end calibrated detector stations allow for precise polarization-sensitive measurements of the radio emission. analyses of the first aera data provide an insight into the underlying emission mechanisms and support the dominating role of geomagnetic emission. super-hybrid measurements in coincidence with the baseline detectors and extensions of the pierre auger observatory provide the calibration of the radio emission and will determine the sensitivity of radio detection to the primary energy and mass composition. in its final stage aera will expand the energy range of radio detection and will provide new access to air shower physics and fundamental parameters of cosmic rays.
references
[1] t. huege et al. [lopes collaboration], nucl. instrum. meth. a 662, s72 (2012)
[2] d. ardouin et al. [codalema collaboration], astropart.
phys. 31, 192–200 (2009)
[3] m. ludwig and t. huege, astropart. phys. 34, 438 (2011)
[4] k. de vries, o. scholten, k. werner, nucl. instr. meth. a 662, s175–s178 (2012)
[5] w. d. apel et al. [lopes collaboration], phys. rev. d 85, 071101 (2012)
[6] t. h. mathes for the pierre auger collaboration, proc. 32nd icrc, beijing, china, 2011
[7] f. sánchez for the pierre auger collaboration, proc. 32nd icrc, beijing, china, 2011
[8] s. fliescher for the pierre auger collaboration, nucl. instr. meth. a 662, s124–s129 (2012)
[9] l. mohrmann, master's thesis, rwth aachen university (2011)
[10] j. l. kelley for the pierre auger collaboration, proc. 32nd icrc, beijing, china, 2011
[11] the pierre auger collaboration, nucl. instr. meth. a 635, s92–s102 (2011)
[12] b. revenu for the pierre auger collaboration, proc. 32nd icrc, beijing, china, 2011
[13] c. glaser for the pierre auger collaboration, arena 2012, erlangen, germany
[14] g. a. askaryan, soviet physics jetp 14, 441 (1962)
discussion
peter grieder: how problematic is the contribution from transition radiation?
klaus weidenhaupt: this has not been studied with aera data yet. as the antennas of the dense core are essentially insensitive to the ground, it should be a second-order effect.
conceptual design of a hydraulic valve train system
j. pohl, a. warell, p. krus, j.-o. palmberg
variable valve train systems have been brought into focus during recent years as a means to decrease fuel consumption in tomorrow's combustion engines. in this paper an integrated approach, called simulation driven experiments, is utilised in order to aid the development of such highly dynamic systems. through the use of systematic design methodology, a number of feasible concepts are developed. critical components are subsequently identified using simulation. in this approach, component behaviour is simulated and validated by measurements on prototype components. these models are unified with complete system models of hydraulically actuated valve trains. in the case of the valve train systems studied here, component models could be validated using comparably simple test set-ups. these models enable the determination of non-critical design parameters in an optimal sense. this results in a number of optimised concepts, facilitating an impartial functional concept selection.
keywords: simulation, simulation based optimisation, conceptual design, valve train system.
1 introduction
in recent years fuel consumption of today's vehicle engines has been brought into focus. one means to increase the overall efficiency is fully variable valve actuation, where valve events and lifts for both intake and exhaust valves can be varied independently. this is especially true for part load operation, where the pumping losses due to the throttle blade can be decreased significantly [1], [4]. currently, a wide range of systems with varying degrees of actuation flexibility are under development or at a prototype stage. two different types of actuation devices are already used in production engines of today. such devices are commonly add-on systems to camshaft-based valve actuation. with cam phasers, intake and/or exhaust valve overlap can be adjusted, keeping a constant valve duration. for the other type of system, valve actuation is influenced by using two different cam lobes depending on engine speed. with these comparably simple add-on approaches, a significant increase in efficiency can be gained. however, these two actuation principles are not sufficient to reduce pumping losses at part load [1]. this paper treats variable valve actuation from a conceptual point of view. the focus is on the actuation system itself rather than the influences on combustion and emissions. the usual procedure for concept validation of variable valve train systems is to build a prototype of the entire system, which can be both costly and time consuming. such systems are in general difficult to evaluate analytically due to their highly dynamic nature. using the simulation driven experiments approach presented here, it is possible to identify bottlenecks or components jeopardising a proper system function early in the concept evaluation phase. using numerical simulation, the valve train system is consequently decomposed into a few minor prototypes, resulting in a number of less complex experiments. thus, design cycle time and development cost of new systems are reduced.
2 conceptual system design
the focus of this paper is twofold: simulation driven experiments, which will be explained in detail later on, and the conceptual design of an electro-hydraulic valve train system. in this case the issue of conceptual design is system design, i.e. the composition of components into a system in order to fulfil a required objective. the conceptual design phase constitutes only a small part of the entire design process but has a major influence on the final result. as suggested by roozenburg and eekels [6], the conceptual design phase itself can conveniently be broken down into three different tasks.
2.1 specification
the focus of the specification task is to state the desired properties and functions of the final product. the requirement specification serves two separate purposes: it provides guidelines for the development process as well as information required for product evaluation. the requirement specification is a working document and is both altered and updated during the entire process. as the product evolves, previously unknown information will become available and must be added to the product specification. consequently, the specification changes from undetailed and solution-independent to specific and solution-dependent. the starting-point for a design specification should be the analysis of the product life cycle. between origination and disposal a product goes through several processes, such as manufacturing, assembly, distribution, installation, operation, maintenance, use, re-use, and disposal. each of these processes influences the design specification. at the preliminary design stage, it may be impossible to find the answer to all these questions, especially for complex products. this paper focuses on the performance and functional issues of the design specification.
2.1.1 performance / function
• what is the main purpose of the product? the main purpose of the product is to control the movement of the intake and exhaust valves of an otto engine, in order to improve overall engine efficiency together with the ability to influence engine emissions.
• what additional function(s) does the product have to fulfil? what are the parameters by which the functional characteristics will be assessed (speed, power, strength...)?
valve trains in production engines of today are camshaft based, which drastically restricts the variability in valve lift, duration and timing. by far the largest benefit of a fully variable valve train is the ability of non-throttled load control, as explained earlier. variable valve lift may not be needed at any price, but may be compensated for by valve deactivation at low engine speeds and loads. the valve seating velocity is important, as it must not be too high, for noise and strength reasons.
• what performance properties are required? the maximum engine speed is restricted to 6000 rev/min at a valve lift of 8 mm. valve weight is assumed not to exceed 60 g. the energy consumption of the system should not exceed the energy consumption of present systems.
• what outer dimensions are desired? the valve train height must not exceed that of today's systems; a decrease in height is opportune. the width is not as critical as the height, but preferably not larger than in today's systems.
2.2 functional decomposition
the second task involves an investigation into what the product is meant to do, i.e. to establish the technical process, also called a "black-box model" (stating the operand to be transformed, the transformation taking place, and the different states of the process). following the functional decomposition, a function/means-tree is developed. in a function/means-tree, the main functions derived from the technical process model are listed together with possible means of realising the required functions.
fig. 1: a technical process model including sub-tasks for the variable valve train
fig. 2: function/means-tree developed for the variable valve train system. sub-functions are derived from the technical process model.
the generation of means is generally regarded as the most demanding phase in conceptual design, since the final solution is only composed of function carriers from the function/means-tree. the function/means-tree is used in order to explore the solution space for all the sub-problems in a systematic manner and to divide it into several distinct classes. a means could be a solution principle, a geometrical model describing a part of the technical system, etc. the function/means-tree for the valve train is shown in figure 2. the sub-functions are extracted from the technical process model and means for these functions are generated. a means combination matrix was utilised in order to represent the combinatory alternatives for the function "accelerate valve downwards". as shown in figure 2, such a matrix efficiently reduces the size of the function/means-tree.
2.3 synthesis
the final phase is to generate a number of concepts.
from the function/means-tree, a morphological matrix containing all generated means for the respective functions is formed. by choosing one function carrier for each function, several concepts are generated. each concept represents a unique combination of means; an enumeration sketch is given after figure 3 below. the aim is now to select the optimal concept for further development, given a set of evaluation criteria. for complex dynamic systems, this evaluation is traditionally done by experiments using a prototype system for each competing concept. however, this procedure is both costly and time consuming. an alternative evaluation methodology, denoted "simulation driven experiments", is therefore proposed in the following section.
3 the simulation driven experiment approach
the simulation driven experiments methodology utilises a simulation model of the entire system that delivers results equivalent to the experiment in terms of the information needed for concept selection. the overall procedure of the approach can be seen in figure 4. as opposed to typical industrial practice, where simulation is used as an analysis tool rather than a design tool, in this approach the simulation drives the selection of experiments. in other words, the simulation serves the dual purpose of identifying bottlenecks and providing input to the component experiments. in this manner, numerical simulation is used in a first step to identify bottlenecks. studying the dynamic behaviour of the bottleneck with specialised simulation tools (for instance a cfd tool for fluid flow calculations) is often time consuming at the conceptual design stage, and capturing the complete dynamic behaviour cannot be guaranteed. this necessitates the use of prototyping for the validation of component models. the final aim of the approach presented here is a component model, validated by experiments, that can be used for subsequent evaluation of the overall concept. however, the approach does not specify how the experiment should be conducted in terms of fidelity. since competing concepts often share identical components, the validated component model can favourably be reused. designing the test set-up can also be facilitated by the simulation driven experiments approach. cause and effect of the test set-up may be altered, since a physics-based component model is used rather than a black-box model. the task of the valve train system is to accomplish a certain valve opening. this is synonymous with a certain oil flow through, for instance, the check valve. consequently, it is insignificant for the check-valve whether this oil flow is produced by a "real" valve train or by any other means.
fig. 3: table of concepts for the variable valve train system. the means combination matrix was used to represent the alternative means for the "accelerate valve downwards" function.
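to make the synthesis step mechanical, the sketch below enumerates concepts from a cut-down morphological matrix. the means lists are abbreviated from figure 2, and the compatibility rule filtering out combinations marked "not possible" in figure 3 is a hypothetical example of such a constraint.

```python
from itertools import product

# abbreviated morphological matrix: sub-function -> candidate means (from fig. 2)
matrix = {
    "change opening duration":   ["mechanical cam phaser", "electrical control signal"],
    "accelerate valve":          ["spring", "pressure"],
    "restrict seating velocity": ["non-return valve", "damping"],
    "keep valve closed":         ["spring", "pressure"],
    "vary valve lift":           ["variable pressure level", "control signal"],
}

def feasible(concept):
    # hypothetical compatibility rule: a purely mechanical cam phaser is taken
    # to be incompatible with lift control via a variable pressure level
    return not (concept["change opening duration"] == "mechanical cam phaser"
                and concept["vary valve lift"] == "variable pressure level")

functions = list(matrix)
concepts = [dict(zip(functions, choice))
            for choice in product(*matrix.values())
            if feasible(dict(zip(functions, choice)))]

for i, c in enumerate(concepts, start=1):
    print(f"concept {i}: {c}")
```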
for the latter approach, the test stand is simulated and optimisation is used to adjust the computed system characteristics to the measured characteristics, see [5]. thus all state variable of the component need not to be measured, as this can be a quite difficult task. the validated component model is then implemented directly in the overall system simulation model, allowing system performance to be computed. 3.1 example 1: check-valve the check valve is an essential component of many hydraulic valve train systems. they are mainly used for energy recovery purposes in order to decrease the valve train’s energy consumption. check valves for fluid power applications usually consist of a spring and a ball in a housing. © czech technical university publishing house http://ctn.cvut.cz/ap/ 23 acta polytechnica vol. 41 no. 4 – 5/2001 keep valve closed start function/ means tree technical process model requirements list actuate valve hydraulic valve actuation concept optimisation component model concept selection validated models test setup function value measured data objective function optimisation strategy component model d s1 m s k s a s x s b s p(n2) l s f sff simulated data component parameters variable pressure level accelerate valve change opening duration mechanical cam phaser keep valve closed accelerate valve pressure restrict seating velocity variable valve lift energy recovery non-return valve mechanical cam phaser spring non-return valve mechanical cam phaser pressure damping non-return valve mechanical cam phaser pressure not possible not possible mechanical cam phaser pressure non-return valve concept 1 concept 2 concept 3 concept 4 concept 5 control signal non-return valve concept 6 variable pressure level control signal non-return valve concept 7 springpressure variable pressure level morphological matrix component model validation simulation driven experiments identify bottlenecks simulation model fig. 4: the overall structure of the simulation driven experiments approach plunger cam volume pressure sensor reverse flow check valve spring forward flow check valve p x p=const 8.00 7.00 6.00 5.00 4.00 3.00 4.00 2.00 0.00 -2.00 -4.00 30.010.00.0 time [msec] 20.0 simulated pressure measured pressure forward flow reverse flow p re ss u re p [m p a ] o il f lo w q [l /m in ] fig. 5: simulated and measured pressure signals and corresponding oil flows through the check valves together with the spring the ball constitutes a 2nd order system, but the break frequency is strongly influenced by the impedance of the oil column in the valve itself, as well as flow forces. flow forces are usually difficult to obtain from technical component data, which makes complementary tests necessary. the dynamic behaviour of the valves can be studied using a test set-up, such as the one shown in figure 5. in the test set-up a cam acts on a plunger moving upand downwards in an oil volume. during the downwards stroke, oil flows through the forward check valve and during the upwards stroke through the reverse flow valve. by measuring the pressure signal inside the volume, the back-flow through either valve can be estimated. the diagram in figure 5 shows the measured and calculated pressure signals in the volume together with the corresponding oil flows through the valves. the latter ones were calculated using a simulation model of the test set-up. 
3.2 example 2: solenoid-valve the solenoid model has to handle the transformation of an input current to an electro-magnetic force on the spool of the valve, covering all the intermediate system states. material properties such as hysteresis and saturation have shown to be important in solenoid modelling, see [5]. in figure 6 the oil flow has been calculated with the two-microphone method, see [8]. in this method, two dynamic pressure signals are measured in a specially designed measurement pipe. together with a description of the pipe in the frequency domain, the oil flow through the pipe in the time domain is calculated. the dashed line represents the simulate oil flow. 4 competing valve train concepts in order to clarify the approach presented in the previous section, two valve train concepts from the morphological matrix were studied. characteristics of critical components have been obtained using two specially designed yet simple test set-ups. the validated component models are implemented in the valve train system models to be studied. by numerical simulation two critical components could be identified, namely the solenoid and the check valves. according figure 4 these models are implement in the overall simulation models of the two valve train concepts in question 24 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 41 no. 4 – 5/2001 ps mesurement pipe l = 0.5 m dynamic pressure transducerstatic pressure transducer solenoid valve i hall element simulated (dashed) and calculated (solid) oil flows ps = 100 bar vs = 12 v test stand simulation model magnetic circuit & hysteresis h b coil inductance l electromagnetic force m ks bl springmass system oil flowhvs b fm x q volume v0 time [msec] o il f lo w [l /m in ] fig. 6: from the two pressure signals the oil flow was calculated using the 2-microphone method. the dashed line represents the oil flow simulated by the model. fig. 7: the flumes valvetrain system a system that has been developed by the ford motor company [7] and a concept developed by the division of fluid and mechanical engineering systems (flumes) at linköping university. figure 7 shows the principal layout of the flumes-system. the modelling and simulation of both these systems was done in the hopsan simulation package [3], developed at the same division. the valve train system consists of a hydraulic piston which is mounted directly onto the engines valve stem. the oil flow is controlled by a hydraulic main stage working as a flow amplifier since the flow capacity of the solenoid valves itself is not sufficient. if there should exist fast switching valves with sufficient flow capacity in future, this main stage can be omitted. in this respect, the valve train system presented constitutes a very flexible solution. the valve train system works as follows: a valve opening is initiated by the high-pressure solenoid valve, so that the main stage is switched to its lower position. in this manner, the a and b port of the hydraulic piston will be connected and oil is flowing from the high pressure side through the lower piston chamber in to the upper piston chamber. as a consequence, the engine valve moves downwards. after approximately half the strokelength the piston cut the oil supply from the high pressure side and a connection to the low pressure side is established, see figure 7. 
due to the valve's inertia, it will continue to move downwards until the kinetic energy has been used to draw oil from the low-pressure side into the upper piston chamber. during the entire downwards stroke, oil from the high-pressure side flows through the check-valve. the advantage of this type of valve train system is that, due to the check-valve, the hydraulic piston remains in the extended position irrespective of the solenoid valve's state. in this manner, the engine valve's movement is only initiated by the solenoid valve, which may be activated for longer than the actual engine valve stroke. thus the solenoid valve's closing characteristic is not a critical parameter. the engine valve's closing is initiated by the low-pressure switching valve, which makes the main stage move into the upper position. both the b and c ports are then connected to tank.
figure 8 shows the basic layout of ford's system, but unlike the system presented in [7], a main stage for flow amplification purposes, as in the flumes-system, has been added. in this study the ford system is meant to use the same solenoid and check valves as the system in figure 7, in order to make the concept selection be based on similar premises.
fig. 8: ford's solution to the valve train problem. in the background a drawing of the concept from [7] is shown.
such a system consists of two switching valves and two check valves, as well as a hydraulic piston that is mounted directly on the engine valve stem. in this system, the downward stroke is initiated with the high-pressure switching valve. the valve is closed after a certain distance, about half the stroke distance, and oil is sucked from the low-pressure side via the corresponding check valve. the engine valve is thus retarded and finally reaches the stroke length. the upward stroke is a reversed procedure of the downward stroke, initiated via the low-pressure switching valve, but here the retardation is done by feeding back high-pressure oil into the high-pressure line. in this manner, the high- and low-pressure check-valves are vital components for reducing the energy consumption of the system, as valuable high-pressure oil is only used during a fraction of the engine valve stroke.
5 concept optimisation
according to the process map shown in figure 4, concept optimisation should be conducted before proceeding to the concept selection stage. figure 9 shows how optimisation based on numerical simulation can be done. unlike in classic mathematical optimisation problems, the objective function is here not given directly, but rather as the result of a simulation. in this manner the simulation model is used to produce the input data for the objective function, which is used by the optimisation strategy to rank different parameter sets. formulating an objective function can be a quite difficult task when several objectives are to be met. in this case the valve train system has to actuate the engine valve according to a certain cam profile while at the same time keeping the energy consumption low. the cam profile in question is shown in figure 10 and is a standard cam profile as used in many of today's production engines. the valve lift is shown as a function of time for an engine speed of 6000 rpm.
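the loop of figure 9 can be condensed into a few lines: a parameter set (here, hypothetical solenoid opening and closing times) is run through a system simulation, the resulting lift curve and oil consumption enter a weighted objective function, and an optimisation strategy ranks the parameter sets. the one-degree-of-freedom actuation model and the weighting factor are illustrative assumptions; the actual studies used full hopsan models.

```python
import numpy as np
from scipy.optimize import minimize

T = np.linspace(0.0, 0.02, 400)                               # one valve event at 6000 rpm
TARGET = 8e-3 * np.clip(np.sin(np.pi * T / 0.02), 0.0, None)  # cam-like lift profile [m]

def simulate_lift(t_open, t_close, tau=1.2e-3):
    """toy actuation model: first-order rise after t_open, decay after t_close."""
    lift = 8e-3 * (1.0 - np.exp(-np.maximum(T - t_open, 0.0) / tau))
    lift *= np.exp(-np.maximum(T - t_close, 0.0) / tau)
    oil_use = (T > t_open) & (T < t_close)   # high-pressure oil is drawn here
    return lift, oil_use.mean()

def objective(p):
    lift, oil = simulate_lift(*p)
    tracking = np.mean((lift - TARGET) ** 2)   # follow the desired cam profile
    return tracking + 1e-7 * oil               # weighted energy-use penalty

best = minimize(objective, x0=[0.002, 0.015], method="Nelder-Mead").x
print("optimised opening/closing times:", best)
```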
figure 11 shows the valve opening of the flumes-system for a given input signal obtained by optimisation. the correspondence between the prescribed valve opening and the simulated one (dashed) is quite good. in figure 12 the opening of the check-valve, main stage and solenoid valves is shown (left diagram), as well as the oil flow from the high-pressure side and the check-valve together with the valve opening. the high-pressure oil flow per engine cycle is a measure of the energy consumption of the system.
fig. 9: principal procedure for simulation-based optimisation
fig. 10: the desired valve lift as a time function for an engine speed of 6000 rpm
fig. 11: simulated (dashed) and given valve opening for the flumes-system as well as input signals to the solenoid valves for an engine speed of 6000 rpm
in figure 13 the valve opening together with the input signals to the solenoid valves is shown for the ford system. the capacity of this system to follow a certain opening profile is obviously worse than for the other concept. however, what is most important for a valve train system is that the opening and closing times are as prescribed; the opening and closing gradients are of secondary importance. in figure 14 the valve opening for the check-valves, solenoid valves and main stage is shown; to the left the corresponding oil flows are shown. the flumes-system is more suitable when a certain opening profile of the engine valves is required, as can be seen from figures 11 and 13. this is mainly due to the solenoid valve used in this study. in the case of the ford-system, both the opening and closing characteristics of the solenoid valve influence the engine valve opening. for the flumes-system the solenoid valve closing is irrelevant, as this is handled by internal grooves of the valve piston. the energy recovery function as presented in [7] is comparable to a hydraulic pendulum, where kinetic energy is transformed into potential energy, and vice versa. this demands very fast solenoid valves for higher engine speeds, as these must activate the flow-amplifying main stage sufficiently fast in order to make energy recovery possible on the downwards stroke. in this study the high-pressure solenoid valve was not in use (see figure 15), which means that energy recovery was not possible on the downwards stroke. the energy consumption for both the ford and flumes systems lies around 3 kw for an engine with 20 valves. this was without any force due to combustion; combustion may make an increase in system pressure necessary, which will raise the energy consumption. from a design and manufacturing point of view the ford system appears to be the simpler design solution, as the main stage is kept simpler and the cylinder head design becomes simpler due to fewer channels. on the other hand, the other system is, as already explained earlier, not dependent on the closing characteristics of the solenoid valves.
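the quoted power level can be cross-checked with elementary hydraulics: the average power is the supply pressure times the mean high-pressure oil flow, summed over all valves. the numbers below (supply pressure, oil volume drawn per valve event) are assumed round figures, chosen only to show that roughly 3 kw is a plausible order of magnitude; they are not values taken from the measurements.

```python
# back-of-envelope hydraulic power estimate with assumed round numbers
p_supply = 100e5          # supply pressure [pa] (100 bar, assumed)
v_stroke = 0.3e-6         # high-pressure oil drawn per valve event [m^3] (assumed)
n_valves = 20
engine_rpm = 6000
events_per_s = engine_rpm / 60.0 / 2.0   # one event per valve every 2nd revolution

# energy per event = pressure * volume; power = energy * event rate * valve count
p_total = p_supply * v_stroke * events_per_s * n_valves
print(f"estimated valve train power: {p_total / 1e3:.1f} kW")
```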
fig. 12: opening of check-valve, main stage and solenoid valves and corresponding oil flows for the flumes-system
fig. 13: simulated (dashed) and given valve opening for ford's system as well as input signals to the solenoid valves for an engine speed of 6000 rpm
fig. 14: opening of check-valve, main stage and solenoid valves and corresponding oil flows for ford's system
closing is, for many solenoid valve designs, guaranteed by a spring force. if the valve is not fully pressure balanced, the closing speed may be pressure dependent. this makes the control strategy more difficult in the case of the ford system.
6 conclusions
the purpose of this work is twofold. first, it shows how conceptual design techniques can be used in variable valve train development. secondly, the feasibility of using the simulation driven experiments approach in conceptual design is studied. simulation is used both to identify critical components in a valve train system and for model adaptation together with optimisation. the component models which have been validated against measured data are fed back into the overall simulation model. this significantly raises the confidence level of the overall simulation model. in this paper, two competing valve train concepts, developed using well-known conceptual design techniques, were studied. both systems were optimised according to exactly the same objective function. thus the objective function value is a direct measure of system performance. here energy consumption and the ability to provide certain valve opening characteristics were taken as the quality criteria. in this manner, a functional concept selection can take place more impartially. as demonstrated in this paper, the use of the simulation driven experiments approach may significantly reduce the amount of, and the need for, full-scale hardware testing. the approach shows significant potential for industry in terms of reduced design cycle time and development costs.
references
[1] adamis, p., gnegel, p.: variable ventilsteuerung und deren einfluss auf verbrauch und emissionen. vdi berichte nr. 1170, verein deutscher ingenieure, germany, 1994
[2] heywood, j. b.: internal combustion engine fundamentals. mcgraw hill, 1988
[3] hopsan: hopsan, a simulation package, user's guide. technical report lith-ikp-r-704, division of fluid and mechanical engineering systems, linköping university, sweden, 1991
[4] miller, r. h., davis, g. c., newman, c. e., levin, m. b.: unthrottled camless valvetrain strategy for spark-ignited engines. 1997 fall technical conference, asme, usa, 1997
[5] pohl, j., sethsson, m., krus, p., palmberg, j.-o.: modelling and simulation of a fast 2/2 switching valve for hydraulic valve train applications. lith-ikp-r1135, division of fluid and mechanical engineering systems, linköping university, sweden, 2000
[6] roozenburg, n. f. m., eekels, j.: product design: fundamentals and methods. john wiley & sons, 1995
[7] schlechter, m. m., levin, m. b.: camless engine. sae paper 960581, society of automotive engineers, usa, 1996
[8] weddfelt, k.: on modelling, simulation and measurements of fluid power pumps and pipelines – with special reference to flow pulsations. dissertation, linköping university, sweden, 1992
dr jochen pohl
e-mail: jpohl12@volvocars.com
prof. petter krus
prof. jan-ove palmberg
phone: +46 13 281198
fax: +46 13 130414
fluid and mechanical engineering systems
linköping university
s-581 83 linköping, sweden
lic.-eng. anders warell
industrial design engineering
chalmers university of technology
s-412 96 göteborg, sweden
acta polytechnica doi:10.14311/ap.2013.53.0457 acta polytechnica 53(5):457–461, 2013 © czech technical university in prague, 2013 available online at http://ojs.cvut.cz/ojs/index.php/ap
maximal subsets of pairwise summable elements in generalized effect algebras
zdenka riečanováa,∗, jiří jandab
a department of mathematics, faculty of electrical engineering and information technology, slovak university of technology, ilkovičova 3, sk-812 19 bratislava, slovak republic. e-mail: zdenka.riecanova@stuba.sk
b department of mathematics and statistics, faculty of science, masaryk university, kotlářská 2, cz-611 37 brno, czech republic. e-mail: 98599@mail.muni.cz
∗ corresponding author: zdenka.riecanova@stuba.sk
abstract. we show that in any generalized effect algebra (g;⊕, 0) a maximal pairwise summable subset is a sub-generalized effect algebra of (g;⊕, 0), called a summability block. if g is lattice ordered, then every summability block in g is a generalized mv-effect algebra. moreover, if every element of g has an infinite isotropic index, then g is covered by its summability blocks, which are generalized mv-effect algebras in the case that g is lattice ordered. we also present the relations between summability blocks and compatibility blocks of g. counterexamples, providing the required contradictions in some cases, are given.
keywords: (generalized) effect algebra, mv-effect algebra, summability block, compatibility block, linear operators in hilbert spaces.
submitted: 7 march 2013. accepted: 10 april 2013.
1. introduction and some basic definitions
in a hilbert space formalization of quantum mechanics, g. birkhoff and j. von neumann proposed the concept of quantum logics (in 1936 as modular ortholattices, and later as orthomodular lattices, discovered by husimi in 1937). nevertheless, in the set p(h) of all projection operators in a separable hilbert space (used as a model for orthomodular lattices) every event satisfies the non-contradiction principle. thus the set p(h) is not the set of all possible events in quantum theory. in 1994, d. foulis introduced algebraic structures called effect algebras. equivalent structures, in some sense, are the d-posets introduced by kôpka and chovanec in 1994. the prototype for the axiomatic system of effect algebras was the set e(h) of all positive linear operators dominated by the identity operator in a hilbert space.
events in e(h), called effects, do not satisfy the non-contradiction law (meaning that there exist unsharp events x and non-x which are not disjoint). they represent unsharp measurements or observations on a quantum mechanical system in a hilbert space h. moreover, a special kind of effect algebras are mv-algebras, which are the algebraic basis for multivalued logic, as a generalization of boolean algebras. effect algebras are very suitable algebraic structures for being carriers of probability measures when events may be unsharp or pairwise non-compatible. mutually equivalent generalizations (unbounded versions) of effect algebras were introduced in 1994 by several authors: d. foulis and m. k. bennett, g. kalmbach and z. riečanová, j. hedlíková and s. pulmannová, and f. kôpka and f. chovanec. on the other hand, all intervals in these generalized effect algebras are effect algebras. recently, operator representations of abstract effect algebras (i.e. their isomorphism with sub-effect algebras of the standard effect algebra e(h) mentioned above) have been studied. it was proved in [14] that the set vd(h) of all positive linear operators in an infinite-dimensional complex hilbert space h, with the partially defined sum of operators (which coincides with the usual sum) restricted to the common domains of operators, forms a generalized effect algebra. this generalized effect algebra vd(h) is a union of sub-generalized effect algebras of maximal subsets of pairwise summable operators. moreover, all intervals are effect algebras isomorphic to sub-effect algebras of the standard effect algebra e(h) for some hilbert space h (see [13]). we are going to show that in a generalized effect algebra g without elements with finite isotropic indices (which corresponds to the operator case) the maximal subsets of pairwise summable elements are sub-generalized effect algebras. moreover, such a g is covered by those sub-generalized effect algebras.
definition 1 ([3]). a partial algebra (e;⊕, 0, 1) is called an effect algebra if 0, 1 ∈ e are two distinguished elements and ⊕ is a partially defined binary operation on e which satisfies the following conditions for any x, y, z ∈ e:
(ei) x ⊕ y = y ⊕ x if x ⊕ y is defined,
(eii) (x ⊕ y) ⊕ z = x ⊕ (y ⊕ z) if one side is defined,
(eiii) for every x ∈ e there exists a unique y ∈ e such that x ⊕ y = 1 (we put x′ = y),
(eiv) if 1 ⊕ x is defined then x = 0.
the basic references for the present text are the books by dvurečenskij and pulmannová [2], and blank, exner and havlíček [1], where unexplained terms and notation concerning the subject can be found. in 1994 a generalization of effect algebras without a top element was also introduced by several authors ([3, 5, 6, 8]).
definition 2. a partial algebra (e;⊕, 0) is called a generalized effect algebra if 0 ∈ e is a distinguished element and ⊕ is a partially defined binary operation on e which satisfies the following conditions for any x, y, z ∈ e:
(gei) x ⊕ y = y ⊕ x, if one side is defined,
(geii) (x ⊕ y) ⊕ z = x ⊕ (y ⊕ z), if one side is defined,
then ≤ is a partial order on e under which 0 is the least element of e. a generalized effect algebra (e;⊕, 0) is called a lattice generalized effect algebra if e with respect to induced partial order ≤ is a lattice. definition 3. let (e;⊕, 0, 1) be an effect algebra ((e;⊕, 0) be a generalized effect algebra). a subset q ⊆ e is called a sub-effect algebra (sub-generalized effect algebra) of e iff (si) 1 ∈ q (0 ∈ q), (sii) if a, b, c ∈ q with a ⊕ b = c and out of a, b, c at least two elements are in q then a, b, c ∈ q. let (e;⊕, 0, 1) be an effect algebra and f ⊆ e, by the symbol ⊕/f we will denote a restriction of ⊕ to f , i.e. for a, b ∈ f , a ⊕/f b is defined if and only if a ⊕ b is defined and a ⊕/f b = a ⊕ b. it is easy to see that sub-effect algebra (subgeneralized effect algebra) q of (e;⊕, 0, 1) ((e;⊕, 0)) is an effect algebra (q;⊕/q, 0, 1) (generalized effect algebra (q;⊕/q, 0)) in its own right. definition 4. let (e;⊕, 0) be a generalized effect algebra. for any x ∈ e, if there exists a natural number ord(x) ∈ n such that ord(x)·x = x⊕x⊕. . .⊕x (ord(x)-times) is defined, but (ord(x) + 1) · x is not defined, is called an isotropic index of x. if such natural number does not exist, we set ord(x) = ∞. definition 5. elements a, b ∈ e of an effect algebra (e;⊕, 0, 1) (generalized effect algebra (e;⊕, 0)) are called compatible (we write a ↔ b) if there exist a1, c, b1 ∈ e such that a1 ⊕ c ⊕ b1 is defined and a = a1 ⊕ c, b = b1 ⊕ c. in [11] it was proved that in any lattice effect algebra e for a, b ∈ e we have a ↔ b iff (a (a ∧ b)) ⊕ (b (a ∧ b)) is defined in e. moreover, we call every maximal subset of pairwise compatible elements of e a compatibility block of e. every lattice effect algebra e is a set-theoretical union of its compatibility blocks [10, theorem 3.2]. a lattice effect algebra possessing a unique block is called an mv-effect algebra (hence a ↔ b for all a, b ∈ e). 2. pairwise summable generalized effect algebras recall that elements a, b of a generalized effect algebra (g;⊕, 0) are called summable if a ⊕ b exists in g. a nonempty subset f of a generalized effect algebra (g;⊕, 0) is called a pairwise summable subset of g if a⊕b exists for every not necessarily different elements a, b ∈ f and a ⊕ b ∈ f (hence f is closed under the partial operation ⊕). evidently, in this case, every a ∈ f has the infinite isotropic index ord(a) = ∞. a (sub-) generalized effect algebra e of (g;⊕, 0) is called pairwise summable if it is a pairwise summable subset of g. we are going to show that every maximal subset of pairwise summable elements of a generalized effect algebra g is a sub-generalized effect algebra of g. moreover, we study further properties of these pairwise summable sub-generalized effect algebras. theorem 1. let (g;⊕, 0) be a generalized effect algebra. let a non-empty subset f of g satisfy the following conditions: (1.) for every a, b ∈ f there exists a ⊕ b ∈ f , (2.) if a ∈ g and a ⊕ e exists for every e ∈ f then a ∈ f . then f is a sub-generalized effect algebra of g. proof. by (2.) we obtain 0 ∈ f , since 0 ⊕ e exists for all e ∈ f . suppose now that a ⊕ b = c, for a, b, c ∈ g. if a, b ∈ f then c = a ⊕ b ∈ f by (1.). further, in the case a, c ∈ f , we have b = c a ≤ c. since c ⊕ e exists for all e ∈ f , also b = (c a) ⊕ e exists for all e ∈ f . thus by (2.) b = c a ∈ f . that is, f is a sub-generalized effect algebra of (g;⊕, 0). remark 1. condition (1.) 
remark 1. condition (1.) on a subset f ⊆ g of a generalized effect algebra (g;⊕, 0) in the above theorem guarantees that f is a pairwise summable subset of g. condition (2.) then provides that f is a maximal pairwise summable subset of g.
definition 6. let (g;⊕, 0) be a generalized effect algebra and let f ⊆ g be a subset of g that satisfies conditions (1.) and (2.) from theorem 1. then f is called a summability block of g.
corollary 1. every maximal pairwise summable subset f ⊆ g (a summability block) of elements of any generalized effect algebra (g;⊕, 0) is a sub-generalized effect algebra of g.
example 1. let h be an infinite-dimensional complex hilbert space and d = {d ⊆ h | d is a dense sub-space of h}. let vd(h) = {a : d(a) → h | (ax, x) ≥ 0 for all x ∈ d(a), d(a) ∈ d, d(a) = h if a is bounded} be the set of densely defined positive linear operators on h. in [14] it was shown that vd(h) with the partial binary operation ⊕d, defined for every a, b ∈ vd(h) by a ⊕d b = a + b (the usual sum) if a or b is bounded or d(a) = d(b) if a, b are both unbounded, forms a generalized effect algebra (vd(h);⊕d, 0). moreover, for every d ∈ d the set gd(h) = {a ∈ vd(h) | a is bounded, or d(a) = d} is a sub-generalized effect algebra of vd(h) (see [14]). for every a, b ∈ gd(h), d ∈ d, condition (1.) is satisfied by the definition of ⊕. let us assume that there exists c ∈ vd(h) such that c ⊕ a is defined for all a ∈ gd(h). further, there exists some b ∈ gd(h) with d(b) = d ≠ h (if not, then d = h), hence by the hellinger–toeplitz theorem b is unbounded. since c ⊕ b is defined, we have d(c) = d(b), that is c ∈ gd(h). therefore the sets gd(h) are, for d ≠ h, maximal pairwise summable sub-generalized effect algebras.
example 2. according to [1], every positive linear operator a ∈ vd(h) uniquely determines a positive sesquilinear form ta on d(ta) = d(a) by ta(x, y) = (ax, y). let us denote the set of all such sesquilinear forms by fd(h), namely fd(h) = {t : d(t) × d(t) → c | there exists a ∈ vd(h) with d(a) = d(t) and t(x, y) = (ax, y) for all x, y ∈ d(t)}. on the set fd(h), we can define a partial sum t ⊕ s for any t, s ∈ fd(h) in the following way: t ⊕ s exists whenever d(t) = d(s) or t or s is bounded (then d(t ⊕ s) = d(t) ∩ d(s)), by (t ⊕ s)(x, y) = t(x, y) + s(x, y) for all x, y ∈ d(t) ∩ d(s). it is easy to show that (fd(h);⊕, 0) is a generalized effect algebra isomorphic to (vd(h);⊕d, 0). as in the previous example, the maximal pairwise summable subsets are md(h) = {t ∈ fd(h) | t is bounded, or d(t) = d}, hence they are sub-generalized effect algebras of fd(h).
example 3. let us consider chang's effect algebra (e;⊕, 0, 1), which is defined by e = {0, a, 2a, . . . , (2a)′, a′, 1}. consider its subset f = {0, a, 2a, . . .} ⊆ e. clearly f satisfies condition (1.). since any element of the form (n₀a)′ is summable only with the elements na for n ≤ n₀, we have (n₀a)′ ∉ f, which gives that (2.) is satisfied as well.
3. intervals in pairwise summable generalized effect algebras
a significant property of any generalized effect algebra (g;⊕, 0) is the fact that for every non-zero element q ∈ g the interval [0, q]g = {a ∈ g | there exists c ∈ g with a ⊕ c = q} is an effect algebra ([0, q]g;⊕q, 0, q). the partial operation ⊕q is defined by: a ⊕q b exists iff a ⊕ b ≤ q, and then a ⊕q b = a ⊕ b.
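the interval construction just defined is easy to animate on a toy case: take the generalized effect algebra (n; +, 0) of natural numbers (an illustrative assumption) and restrict the sum to an interval [0, q]. the sketch checks that ⊕q then has a top element, commutativity and cancellation, in the spirit of the definition above.

```python
# interval [0, q] of the generalized effect algebra (N, +, 0):
# a ⊕q b is defined iff a + b <= q, and then equals a + b.
Q = 7
INTERVAL = range(Q + 1)

def oplus_q(a, b):
    s = a + b
    return s if s <= Q else None   # undefined outside the interval

# every a has a unique "complement" c with a ⊕q c = q, i.e. q acts as a top element
for a in INTERVAL:
    comps = [c for c in INTERVAL if oplus_q(a, c) == Q]
    assert comps == [Q - a]

# commutativity and the cancellation law on the interval
for a in INTERVAL:
    for b in INTERVAL:
        assert oplus_q(a, b) == oplus_q(b, a)
        for c in INTERVAL:
            if oplus_q(a, b) is not None and oplus_q(a, b) == oplus_q(a, c):
                assert b == c
print(f"[0, {Q}] with ⊕q forms an effect algebra with top element {Q}")
```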
further, let us investigate intervals in pairwise summable generalized effect algebras. namely, we are going to show that if g with the derived ≤ is a lattice, then these intervals are mv-effect algebras (hence they can be organized into mv-algebras). we start with the observation that every pairwise summable generalized effect algebra (g;⊕, 0) is a generalized mv-effect algebra if and only if (g,≤) is a lattice. recall that a non-void subset i ⊆ l of a partially ordered set (l,≤) is an order ideal if a ∈ l, b ∈ i and a ≤ b implies a ∈ i.
let (p;≤, 0) be a generalized effect algebra. let p∗ be a set disjoint from p with the same cardinality. consider a bijection a → a∗ from p onto p∗ and let us denote p ∪̇ p∗ by e. further, define a partial binary operation ⊕∗ on e by the following rules. for a, b ∈ p:
(1.) a ⊕∗ b is defined if and only if a ⊕ b is defined, and then a ⊕∗ b = a ⊕ b,
(2.) b∗ ⊕∗ a and a ⊕∗ b∗ are defined if and only if b ⊖ a is defined, and then b∗ ⊕∗ a = (b ⊖ a)∗ = a ⊕∗ b∗.
theorem 2 ([2, p. 18]). for every generalized effect algebra p and e = p ∪̇ p∗, the structure (e;⊕∗, 0, 0∗) is an effect algebra. moreover, p is a proper order ideal in e closed under ⊕∗, and the partial order induced by ⊕∗, when restricted to p, coincides with the partial order induced by ⊕. the generalized effect algebra p is a sub-generalized effect algebra of e, and for every a ∈ p, a ⊕ a∗ = 0∗.
since the definition of ⊕∗ on e = p ∪̇ p∗ coincides with the ⊕-operation on p, it will cause no confusion if from now on we use the notation ⊕ also for its extension to e. to avoid undue repetition on generalized mv-effect algebras, we recall the following statements from [11], giving their equivalent definitions.
theorem 3 ([11, theorem 3.2]). for a generalized effect algebra p the following conditions are equivalent:
(1.) p is a generalized mv-effect algebra,
(2.) e = p ∪̇ p∗ is an mv-effect algebra.
theorem 4 ([11, theorem 3.3]). a generalized effect algebra p is a generalized mv-effect algebra iff the following conditions are satisfied:
(1.) p is a lattice,
(2.) for all a, b, c ∈ p the existence of a ⊕ c and b ⊕ c implies the existence of (a ∨p b) ⊕ c,
(3.) ∨{c ∈ p | a ⊕ c exists and c ≤ b} exists in p, for all a, b ∈ p,
(4.) (a ⊖ (a ∧p b)) ⊕ (b ⊖ (a ∧p b)) exists for all a, b ∈ p.
lemma 1. let (g;⊕, 0) be a generalized mv-effect algebra. then for every q ∈ g the interval [0, q]g ⊆ g is an mv-effect algebra.
proof. clearly, for every q ∈ g, the interval [0, q]g ⊆ g is lattice ordered, as for every a, b ∈ [0, q]g we have a ∨g b, a ∧g b ≤ q in g. moreover, for every a, b ∈ [0, q]g ⊆ g, the element (a ⊖ (a ∧g b)) ⊕ (b ⊖ (a ∧g b)) exists in g by theorem 4. since (a ⊖ (a ∧g b)) ⊕ (b ⊖ (a ∧g b)) = [(a ⊖ (a ∧g b)) ∨g (b ⊖ (a ∧g b))] ⊕ [(a ⊖ (a ∧g b)) ∧g (b ⊖ (a ∧g b))] = (a ⊖ (a ∧g b)) ∨g (b ⊖ (a ∧g b)) ≤ a ∨g b ≤ q, we have that (a ⊖q (a ∧q b)) ⊕q (b ⊖q (a ∧q b)) also exists in [0, q]g (for the inequalities see [2, p. 70]). this proves that a, b are compatible elements of the lattice effect algebra ([0, q]g;⊕q, 0, q). hence ([0, q]g;⊕q, 0, q) is an mv-effect algebra.
the converse of this lemma does not, in general, hold, as can be seen in the following example.
example 4. let us have a generalized effect algebra (g;⊕, 0) given by g = {0, a, b, c, a ⊕ c, b ⊕ c} (fig. 1). consider e = g ∪̇ g∗ (fig. 2).
figure 1. to example 4: (g;⊕, 0).
figure 2. to example 4: e = g ∪̇ g∗.
(1.) clearly, a is not compatible with b: since a ∧g b = 0, we have a ↔ b if and only if (a ⊖ (a ∧g b)) ⊕ (b ⊖ (a ∧g b)) = a ⊕ b is defined, which it is not.
(2.) e = g ∪̇ g∗ is a lattice.
(3.) g is a prelattice generalized effect algebra, but it is not a generalized mv-effect algebra, since e = g ∪̇ g∗ is not an mv-effect algebra.
(4.) every interval of g is an mv-effect algebra: namely, [0, a ⊕ c], [0, b ⊕ c] are boolean algebras, which are mv-effect algebras, and [0, a], [0, b] and [0, c] are finite chains, which are mv-effect algebras as well. nevertheless, g is not a generalized mv-effect algebra, since e = g ∪̇ g∗ is not an mv-effect algebra.

using the previous theorems we obtain statements for pairwise summable lattice ordered generalized effect algebras.

theorem 5. let (g;⊕, 0) be a pairwise summable lattice ordered generalized effect algebra. then
(1.) (g;⊕, 0) is a generalized mv-effect algebra,
(2.) e = g ∪̇ g∗ is an mv-effect algebra,
(3.) for every q ∈ g the interval [0, q]g ⊆ g is an mv-effect algebra.

proof. since g with the derived partial order ≤ is a lattice and every pair of elements of g is summable, g satisfies all conditions (1.)–(4.) of theorem 4. further, (2.) follows by theorem 3 and (3.) by (1.) and lemma 1.

4. blocks of pairwise summable elements in generalized effect algebras
in [10] it was shown that every lattice effect algebra is a set-theoretical union of blocks of compatible elements. in this section we present an analogous statement for blocks of pairwise summable elements in generalized effect algebras. let (p,≤) be a poset (e.g. a generalized effect algebra). we call (p,≤) inductive if every chain in p has an upper bound.

theorem 6. let (g;⊕, 0) be a generalized effect algebra such that for every element a ∈ g, its isotropic index ord(a) = ∞. then g is a set-theoretical union of its summability blocks, which are sub-generalized effect algebras of g.

proof. let a ⊆ g be a non-empty set of pairwise summable elements (i.e., a ⊕ b exists for every a, b ∈ a) and let a = {b ⊆ g | a ⊆ b, b is a set of pairwise summable elements}. then for every chain b ⊆ a (i.e., for x, y ∈ b we have either x ⊆ y or y ⊆ x) we show that ⋃ b ∈ a. let us have x, y ∈ ⋃ b. then there exist bx, by ∈ b such that x ∈ bx, y ∈ by. by the assumptions, bx ⊆ by or by ⊆ bx, hence x, y ∈ bx or x, y ∈ by, i.e. x ⊕ y exists. therefore ⋃ b ∈ a, hence a is inductive. thus, by zorn's lemma, a maximal element m exists in a, and m is clearly a summability block of g. for every a ∈ g, a ≠ 0, there exists a pairwise summable subset a = {0, a, 2a, . . .} ⊆ g. by the above, the subset a ⊆ g is contained in some summability block m. thus g is a set-theoretical union of summability blocks m. blocks are sub-generalized effect algebras by theorem 1 (resp. corollary 1).

hence any generalized effect algebra (g;⊕, 0) without elements of finite isotropic index is covered by its summability blocks.

example 5. we now turn to example 1. let h be an infinite-dimensional complex hilbert space. consider the generalized effect algebra vd(h) and its sub-generalized effect algebras gd(h) from example 1. then for any d ∈ d, d ≠ h, gd(h) forms a summability block of vd(h). note that the sub-generalized effect algebras gd(h) are also compatibility blocks (see [14]); hence, in this case, compatibility and summability blocks coincide.

theorem 7. let (g;⊕, 0) be a lattice generalized effect algebra such that for every element a ∈ g, its isotropic index ord(a) = ∞. then g is a set-theoretical union of its blocks, which are sub-generalized effect algebras of g and generalized mv-effect algebras in their own right.

proof. this follows from theorems 5 and 6.
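for small finite structures, the maximality argument of theorem 6 needs no zorn's lemma: the blocks can be enumerated directly. the sketch below (python) does this for a hypothetical toy partial algebra in the spirit of example 4; the sum table is an illustration invented here, not data from the paper, and only distinct pairs are tested (elements of finite isotropic index would fail an x ⊕ x test).

```python
from itertools import combinations

# hypothetical toy partial algebra in the spirit of example 4: atoms a, b, c
# with a+c and b+c defined and a+b undefined.
ELEMS = ['0', 'a', 'b', 'c', 'a+c', 'b+c']
SUMS = {('a', 'c'), ('b', 'c')}          # nontrivial defined sums

def summable(x, y):
    # 0 is summable with everything; otherwise consult the sum table.
    return '0' in (x, y) or (x, y) in SUMS or (y, x) in SUMS

def pairwise_summable(s):
    return all(summable(x, y) for x, y in combinations(s, 2))

# enumerate all pairwise summable subsets containing 0, keep maximal ones.
candidates = {frozenset(set(s) | {'0'})
              for r in range(len(ELEMS) + 1)
              for s in combinations(ELEMS, r)
              if pairwise_summable(set(s) | {'0'})}
blocks = [sorted(s) for s in candidates
          if not any(s < t for t in candidates)]
print(blocks)   # the maximal pairwise summable subsets (the blocks)
```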
corollary 2. for every maximal pairwise summable subset f of a lattice ordered generalized effect algebra (g;⊕, 0) and any q ∈ g, q ≠ 0, the intervals [0, q]f = [0, q]g ∩ f are mv-effect algebras in their own right.

proof. the identical mapping ϕ : [0, q]g → [0, q]g restricted to [0, q]f = [0, q]g ∩ f is an embedding of [0, q]f into [0, q]g. thus if g is a lattice ordered generalized effect algebra, then [0, q]g ∩ f is a sub-effect algebra of [0, q]g (see [7, section 3]) and consequently it is an mv-effect algebra in its own right.

example 6. consider chang's effect algebra (e;⊕, 0, 1) mentioned in example 3. it is not covered by summability blocks, since it has elements with finite isotropic index. there exists only one summability block, f = {0, a, 2a, . . .} ⊆ e. on the other hand, since e is linearly ordered by the induced partial order ≤, all of its elements are pairwise compatible, hence the only compatibility block is e itself (e is an mv-effect algebra). that is, compatibility and summability blocks need not coincide.

acknowledgements
zdenka riečanová gratefully acknowledges support by the science and technology assistance agency under contract apvv-0178-11, bratislava, sr, and under vega-grant of mš sr no. 1/0297/11. jiří janda gratefully acknowledges support from masaryk university, grant muni/a/0838/2012, and esf project cz.1.07/2.3.00/20.0051 algebraic methods in quantum logic of the masaryk university.

references
[1] blank j., exner p., havlíček m., hilbert space operators in quantum physics, 2nd ed., springer, berlin, (2008).
[2] dvurečenskij a., pulmannová s., new trends in quantum structures, kluwer acad. publ., dordrecht / ister science, bratislava, (2000).
[3] foulis d. j., bennett m. k., effect algebras and unsharp quantum logics, found. phys. 24, (1994), 1331–1352.
[4] gudder s., d-algebras, found. phys. 26, no. 6, (1996), 813–822.
[5] hedlíková j., pulmannová s., generalized difference posets and orthoalgebras, acta math. univ. comenianae lxv, (1996), 247–279.
[6] kalmbach g., riečanová z., an axiomatization for abelian relative inverses, demonstratio math. 27, (1994), 769–780.
[7] janda j., riečanová z., intervals in generalized effect algebras, to appear in soft computing, (2013), doi:10.1007/s00500-013-1083-x.
[8] kôpka f., chovanec f., d-posets, math. slovaca 44, (1994), 21–34.
[9] riečanová z., subalgebras, intervals and central elements of generalized effect algebras, inter. j. theor. phys. 38, (1999), 3209–3220.
[10] riečanová z., generalization of blocks for d-lattices and lattice-ordered effect algebras, inter. j. theor. phys. 39, (2000), no. 2, pp. 231–237.
[11] riečanová z., marinová i., generalized homogeneous, prelattice and mv-effect algebras, kybernetika 41, (2005), no. 2, pp. 129–142.
[12] riečanová z., zajac m., intervals in generalized effect algebras and their sub-generalized effect algebras, acta polytechnica 53, (2013), no. 3, pp. 314–316.
[13] riečanová z., zajac m., hilbert space effect-representations of effect algebras, rep. math. phys. 70, (2012), no. 2, pp. 283–290.
[14] riečanová z., zajac m., pulmannová s., effect algebras of positive linear operators densely defined on hilbert spaces, rep. math. phys. 68, (2011), 261–270.
a comparison of the tensile strength of plastic parts produced by a fused deposition modeling device

juraj beniak∗, peter križan, miloš matúš
institute of manufacturing systems, environmental technology and quality management, faculty of mechanical engineering, slovak university of technology in bratislava, nam. slobody 17, 812 31 bratislava, slovak republic
∗ corresponding author: juraj.beniak@stuba.sk
acta polytechnica 55(6):359–365, 2015, doi:10.14311/ap.2015.55.0359, © czech technical university in prague, 2015, available online at http://ojs.cvut.cz/ojs/index.php/ap

abstract. rapid prototyping systems are nowadays increasingly used in many areas of industry, not only for producing design models but also for producing parts for final use. we need to know the properties of these parts. when we talk about the fused deposition modeling (fdm) technique and fdm devices, there are many possible settings for devices and models which could influence the properties of a final part. in addition, devices based on the same principle may use different operational software for calculating the tool path, and this may have a major impact. the aim of this paper is to show the tensile strength value for parts produced from different materials on the fused deposition modeling device when the horizontal orientation of the specimens is changed.

keywords: rapid prototyping; fdm; fused deposition modeling; 3d printer; additive manufacturing.

1. introduction
rapid prototyping refers to a group of techniques used for rapid production of scaled models, real parts or assemblies based on a 3d model designed by a cad system [1]. rapid prototyping systems are coming to the forefront at great speed. although the name rapid prototyping suggests that prototype production is the primary aim, these devices are ever more frequently used directly in the manufacturing process. they are not able to operate in producing large series, but have their place in short-run or medium series production. these devices are still expensive, but we can now observe fused deposition modeling technology spreading widely on the market, with devices much cheaper than those of the primary producers. this is because the validity of the patent protection for this technology has expired, and production has been able to expand in the global market. this has rapidly reduced the price of fdm devices. rapid prototyping devices are very widely applied, not only in production, and not only in mechanical engineering industries. in addition, there is a wide range of technologies for creating prototypes. a feature of all of them is so-called additive manufacturing where, in contrast with classical conventional manufacturing methods, material is added to a workpiece, not removed from it. conventional technologies are based on the principle that material is removed from a predefined semi-product (raw material) until the final required shape and dimensions are achieved. in additive manufacturing, the action is in the opposite direction: the material is added step-by-step
and layer-by-layer, and in this way a totally new part, a prototype, is formed. figure 1 illustrates how, in most rapid prototyping technologies, the parts are built up layer by layer.

figure 1. the additive manufacturing process – layer manufacturing [3].

2. fdm technology overview
in this paper, we will concentrate on fused deposition modeling (fdm) technology. fdm is a technique that uses two types of materials, one for modeling and one for support [2]. first, the modeling material is used to build a model. then, the support material is used to build a support structure in areas where the modeling material will overhang the rest of the model [5]. this technique works on a principle similar to that of the fuse gun [4]. the material is unspooled from the spool to the fuse head, where it is melted and deposited on the working table. after the model has been completed, the support material is broken away or dissolved in a special bath. using this technique, the prototype is built up layer by layer (fig. 1). a range of materials are used for making prototypes. the most widely used materials are abs plastics and polycarbonate (pc). more recently, a broad spectrum of pla plastic modifications and composite materials have been introduced, consisting of pla polymer and other material particles (wood, ceramic, metal, and others). each type of plastic has certain advantages and disadvantages [6]. from the point of view of the designer, the mechanical properties of the selected material are very important, e.g. its tensile strength.

3. tensile test of fdm samples
if we want to use prototypes or real parts in practical applications, or if we need to test them under a load, it is necessary to know their material properties and to be able to compare the properties of a prototype with parts manufactured in the conventional way. if we know the ratio between the table values of the material properties of conventionally produced parts and the real values measured on samples produced by fdm technology, we can estimate the properties of printed parts in advance; conversely, if we test the prototype and find its properties, we can make a reverse estimate of the properties of the real part. we will measure the tensile strength of tested samples (fig. 2), working only with parts produced by fdm.

figure 2. dimensions of the specimen for tensile strength measurements.

the tensile test specimens were made on different devices, because each device is suitable for processing a single material type. three material types were chosen: polycarbonate (pc), acrylonitrile butadiene styrene (abs) and polylactic acid (pla), which is an environment-friendly polymer sourced from corn, a renewable material. these materials were selected on the basis of the availability of fdm devices for processing the materials, and also the availability of the materials themselves. all of the selected devices were set up in the same way, in order to make the experiment suitable for comparing the measured values. the polycarbonate plastic specimens are produced on the fortus 360mc professional device [7], the abs plastic parts are made on the dimension sst, and the pla plastic parts are made on the 3d profimaker device.
the pc and abs plastic samples are pre-processed using catalyst software, and the pla samples are pre-processed with g3dmaker software. each of the selected materials has a different preferred processing temperature for this technology, so it is not possible to use the same temperature for all the materials for experimental purposes. the other settings are the same for all devices. the interior space of the model is filled with plastic fibers to provide the maximum possible material content (the maximum relative specimen density). the model layer thickness is set to 0.25 mm. the specimens lie flat in the horizontal plane. the only driven factor is the orientation of the specimen in the horizontal plane, which has three levels: 0°, 45° and 90° (fig. 3, table 1).

figure 3. orientation of the samples in the workspace of the fdm device.

table 1. factor specification for the experiments:
factor | level 1 | level 2 | level 3
a      | 0 deg   | 45 deg  | 90 deg
b      | pc      | abs     | pla

the aim of this experiment is to find the tensile strength of the fdm specimens, with reference to the orientation of the model in the workspace of the device. this data is important for users of fdm devices, for proper production of parts, and also for preparing further experiments involving other factors. it is also important for correct selection of the testing device for tensile tests and for defining the dimensions and the parameters of test samples, because all tensile test devices are limited by their maximum possible load. basically we have a two-factor experiment in which each factor has three levels, so we are able to prepare a complete experiment (table 2). the specimens are tested on the inspekt desk 5 kn universal testing device (fig. 4), which enables a maximum load of 5 kn to be applied to the specimen.

figure 4. the inspekt desk 5 kn universal testing device.

table 2. design of the experiment:
exp. no. | a | b
1 | 1 | 1
2 | 1 | 2
3 | 1 | 3
4 | 2 | 1
5 | 2 | 2
6 | 2 | 3
7 | 3 | 1
8 | 3 | 2
9 | 3 | 3
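the complete two-factor design in table 2 is small enough to generate and check mechanically. the following sketch (python) reproduces the run list; the replicate count of five per cell is taken from the text below, everything else is plain enumeration of tables 1 and 2.

```python
from itertools import product

# factor levels as in table 1 (a: orientation, b: material).
orientations = {1: 0, 2: 45, 3: 90}      # degrees
materials = {1: 'pc', 2: 'abs', 3: 'pla'}
REPLICATES = 5                            # five specimens per combination

# full factorial design: every (a, b) combination, as in table 2.
runs = [(no, a, b)
        for no, (a, b) in enumerate(product(orientations, materials), 1)]
for no, a, b in runs:
    print(f"exp. {no}: a={a} ({orientations[a]} deg), "
          f"b={b} ({materials[b]}), n={REPLICATES}")
```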
4. measured tensile strength values
from each combination introduced in the design of the experiment (table 2), we produced five specimens, to make a statistical evaluation of the measured values possible. the measured values are the maximum tensile force fm (n) and the tensile strength rm (mpa), which is given by the control software and is calculated by formula (3) below. the situation is the same for the values of elongation ε (%), which are displayed by the control software but are calculated by the known formulas (1) and (2). table 3 also shows the variance s²(rm) of the measured tensile strength values. the tensile strength was measured on the universal tensile testing machine, controlled by a microprocessor with automatic recording and scoring of the measured values. all samples were tested under the same conditions, except where there were changes due to the design of the experiment (table 2 and table 3). the measured values are displayed in table 3. as was mentioned above, the tensile testing machine and its control software are able to calculate all the necessary data, but the basic relations need to be known: the young modulus of elasticity e and the tensile strength rm were calculated by means of formulas (1) and (3), introduced below.

young's modulus e can be calculated by dividing the tensile stress by the extensional strain in the elastic (initial, linear) portion of the stress–strain curve:

e = \frac{\text{tensile stress}}{\text{extensional strain}} = \frac{r_m}{\varepsilon} = \frac{f_{\max}/a_0}{\Delta l/l_0} = \frac{f_{\max}\, l_0}{a_0\, \Delta l}, (1)

where e is the young's modulus (modulus of elasticity); fmax is the force exerted on an object under tension; a0 is the original cross-sectional area through which the force is applied; ∆l is the amount by which the length of the object changes; l0 is the original length of the object; rm is the tensile strength; ∆rm is the percentage deviation from the table values for tensile strength.

the strain ε can be measured by integrating the change in unit current length. this measure of strain is called the true strain, or the logarithmic strain [8]:

\varepsilon = \int_{l_0}^{l} \frac{1}{l}\,\mathrm{d}l = \ln\frac{l}{l_0}. (2)

the ultimate tensile strength is calculated as follows:

r_m = \frac{f_{\max}}{a_0}. (3)

the average tensile strength values rm (mpa) are displayed in fig. 5. the first three columns are for polycarbonate material, the next three are for abs plastic, and the last three columns present the average tensile strength values for pla plastic. fig. 5 also shows horizontal lines indicating the tensile strength measured for conventionally-produced samples, for comparison.

table 3. measured tensile strength values:
exp. no. | a | b | rm (mpa) | s²(rm) (mpa) | fm (n) | ε (%)
1 | 1 | 1 | 56.12 | 160 | 3591.7 | 5.19
2 | 1 | 2 | 43.53 | 1   | 2785.9 | 2.88
3 | 1 | 3 | 57.21 | 198 | 3661.4 | 5.78
4 | 2 | 1 | 35.47 | 152 | 2270.1 | 3.37
5 | 2 | 2 | 37.88 | 101 | 2424.3 | 3.44
6 | 2 | 3 | 35.78 | 200 | 2289.9 | 3.15
7 | 3 | 1 | 45.81 | 562 | 2931.8 | 3.17
8 | 3 | 2 | 45.31 | 294 | 2899.8 | 3.03
9 | 3 | 3 | 43.25 | 354 | 2768.0 | 3.29

table 4. results of an analysis of variances (anova):
factor | f (calculated) | p (signification) | ss | mse | ftab 0.95
a   | f(2,18) = 70.4 | < 10⁻⁶ | 68.74   | 0.49 | 3.56
b   | f(2,18) = 1107 | < 10⁻⁶ | 1140.51 | 0.49 | 3.56
a*b | f(4,18) = 153  | < 10⁻⁶ | 299.28  | 0.49 | 2.93

figure 5. measured tensile strength values.
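a quick numerical cross-check of eqs. (1)–(3) against table 3 can be scripted. the specimen cross-section a0 is not legible in this extracted text, so the sketch below (python) infers it from the published fmax/rm pairs; that inferred area is a reconstruction, not a value quoted by the paper.

```python
import math

# (fmax [N], rm [MPa]) pairs from table 3; 1 MPa = 1 N/mm^2.
rows = [(3591.7, 56.12), (2785.9, 43.53), (3661.4, 57.21),
        (2270.1, 35.47), (2424.3, 37.88), (2289.9, 35.78),
        (2931.8, 45.81), (2899.8, 45.31), (2768.0, 43.25)]

# eq. (3) backwards: a0 = fmax / rm; every row should give the same area.
areas = [f / r for f, r in rows]
print(f"inferred cross-section a0 ≈ {sum(areas) / len(areas):.1f} mm^2")

# eq. (2): true (logarithmic) strain vs. engineering strain for the
# largest elongation at break in table 3 (5.78 %).
eng = 0.0578
true = math.log(1 + eng)   # ln(l/l0) with l = l0 * (1 + eng)
print(f"engineering strain {eng:.4f} -> true strain {true:.4f}")
```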
5. discussion
we prepared an exact statistical evaluation by analysis of variances (anova), and the results are presented in table 4. the anova results show that all the factors and their interaction are significant at a level of p = 0.05. the most significant result is for factor b (material type). this makes sense, because these polymers also show such differences in tensile strength when the specimens are produced by conventional technologies. the maximum tensile strength values of specimens produced by conventional technologies are shown as horizontal lines in fig. 5. polycarbonate has a tensile strength value of 68 mpa. the value for abs plastic is 40 mpa, while pla eco-plastic material has a value of 50 mpa. for each of the materials, our measured tensile strength value is lower. for polycarbonate, the measured value is 84 % of the conventional tensile strength value. for abs plastic, the measured value is 94.7 % of the value for a conventional material, and for pla plastic the measured value is 91.6 % of the conventional material value. we see that factor a (orientation of the model in the x–y horizontal plane) is also significant, but less significant than factor b. the measured values (table 3, fig. 5) point to a big difference between the different sample orientations for the polycarbonate material. for the other two materials (abs and pla polymers) the gap is smaller, but in each case the distribution between the 0-degree, 45-degree and 90-degree orientations is also different.

these results may be due to the calculation and the distribution of the tool path. as we have pointed out, the different material specimens were produced on different fdm devices. each device has its own software for generating the control program and for setting the basic parameters of the device. each software has its own logic for generating the tool path and the direction of the fibers in each layer of the model.

fig. 6 shows the structure of the fibers in successive layers of the model. in each layer there is a different orientation of the fibers. these four orientations change from the bottom to the top of the specimen. the direction of the fibers in the 0-degree orientation is the same as for the 90-degree orientation in the horizontal x–y plane.

figure 6. direction of the fibers on different layers of pc specimens – 0 degrees and 90 degrees.

fig. 7 shows successive layers and their fiber orientation if the specimen is oriented at 45 degrees within the horizontal x–y plane. we see mostly short fibers, which are closer to the normal direction of the specimen. this causes the tensile strength to be lower in a 45-degree orientation than in a 0-degree or 90-degree orientation.

figure 7. direction of the fibers on different layers of pc specimens – 45 degrees.

fig. 8 and fig. 9 show the orientation of the fibers for the devices that produced the pla and abs material specimens. it is clear that there are just two possible orientations, which change across the layers of the specimen. this basic difference between the structure of the pc specimen and the structure of the abs/pla specimen also causes a significant difference between the measured tensile strength values.

figure 8. direction of the fibers on different layers of pla and abs specimens – 0 degrees and 90 degrees.

figure 9. direction of the fibers on different layers of the pla and abs specimens – 45 degrees.

fig. 10 compares the development of the tensile test for the three different materials. the measurements shown reach tensile strength values similar to the average values presented in table 3. the development of the tensile tests presented in fig. 10 shows that the yield point is almost completely missing, and the sample simply breaks. this provides evidence of the fragility and the brittleness of the solidified extruded plastic material. by contrast, the samples formed by conventional technology display marked plastic deformation, a neck is formed, and then the sample breaks. according to the material properties list for abs plastic, the elongation-at-break is about 30 %, whereas for our samples the elongation-at-break is only about 3.44 %. this again points to the brittleness of the model produced from abs plastic material using fdm technology. there is a similar situation for the other materials, and the calculated elongation of the fdm samples is also lower than the elongation of samples produced conventionally from the same material (table 5).

table 5. elongation-at-break for plastic materials [9]:
material | pc | abs | pla
fdm sample max. elongation εfdm (%) | 5.78 | 3.44 | 3.29
conventional sample average elongation εcon (%) | 95 | 24.3 | 28.5

figure 10. development of the tensile test on different samples (rm [mpa] versus elongation l [mm] for pc, abs and pla).

6. conclusions
the measured tensile strength values shown in table 3 and fig. 5 show that the materials tested here achieve lower values than conventionally-produced specimens of pc, abs and pla plastic materials. this is primarily due to the way in which the tested samples were formed. the values in the material data sheets are for injected parts. by contrast, the samples produced by fdm technology were formed by depositing thin semi-melted fibers side by side. even if the fibers are deposited closely side-by-side for maximum density, this technology cannot achieve such a density of the part as is achieved using conventional production methods, which produce a material with a homogeneous structure. we can also see, based on the outputs (table 3), that there are in some cases bigger differences in tensile strength values between single samples. this is caused by the different fiber orientation and also by the different layer structure of each specimen. the result of the presented experiment is that the specimen orientation within the horizontal x–y plane also affects the measured values. the reason why the proportions of the measured values for each material differ has to be investigated in a separate experiment, in which the layer structure and its effect on the measured values will be examined more closely.

acknowledgements
the research presented in this paper is an outcome of project no. apvv-0857-12 “tools durability research of progressive compacting machine design and development of adaptive control for compaction process”, funded by the slovak research and development agency.
the device was designed for laboratory experiments, especially with thermal decomposition of stable environmentally harmful substances. the first part of the device, the arc heater with intensively blasted electric arc serves as a source of heated working gas, which is then mixed up with a decomposed substance in the second part, reacting chambers, where decomposition itself takes place. the parameters of the device are given both by its geometry, both by working gas used. the influence of some parameters (e.g. channel length, gas flow rate, the sort of working gas) was studied in previous works of the authors [6]. as working medium, mostly pure gas is used to be studied. for instance, isakaev et al. in [4] deal with the arc heater with segmented stepwise extended anode channel operated on argon. similarly, the work [6] compares the behavior of an arc heater with pure argon or pure nitrogen. gas mixtures are studied rarely. according to our previous conclusions, the sort of working gas strongly affects operational parameters of the device which seem to be more beneficial with nitrogen. unfortunately, reaction with nitrogen may produce toxic compounds. that is why utilization of argon with low admixtures of nitrogen deserves attention. this work presents the results of numerous experiments carried out on the mentioned high-temperature device with the arc heater operated on argon with up to 11 % of nitrogen. altogether with the measured data (arc current and voltage, power loss of individual segments of the device, etc.), the computed mean temperatures and velocities of the working medium in significant cross-sections are presented. the aim is to show and judge the influence of concentration of nitrogen on basic operational characteristics of the device with two modified configurations of the anode. especially the applicable input power, the distribution of power loss along the device, the efficiency, the mean temperature and velocity are compared and conclusions useful for practice are given. 2. material and methods figure 1 shows the main parts of the device in general. the arc heater includes a copper cathode with a tungsten tip, a cathode shell, an anode channel, and an anode. electric arc is stabilized by intensive blasting of working gas. steep decrease of the gas velocity at the anode extension creates suitable conditions for stable attachment of the anode spot there. the reaction chamber is assembled of three parts; stepwise extension of the first part insures thorough mixing of the working gas with the substance to be decomposed. the device is designed as a modular structure in order to make it possible to reconfigure all parts of the device and to separately measure parameters of individual segments. both the arc heater and the reaction chambers are set up of hollow copper rings cooled with water flowing through them. the flow rate and temperature of cooling water are measured to determine power loss of individual segments. 179 http://ctn.cvut.cz/ap/ ivana jakubova, josef senk, ilona laznickova acta polytechnica ch a r2 r3r1 channel anode reactor 1 reactor 2 reactor 3 arc heater reaction chamber working gas cooling water a on b off figure 1. the main parts of the experimental device. the experiments presented here are focussed on studying the influence of nitrogen mass concentration in argon/nitrogen mixture serving as working gas. that’s why other parameters of the device remain unchanged for all the tests: the anode channel is 16 mm in diameter and 109 mm in length. 
the last section of the anode channel (diameter of 16 mm, length of 27 mm) is always grounded and cooled separately serving as the main anode (referred to as variant b). the first part of the reactor r1 consists of three segments cooled altogether (seg. r11: diameter 30 mm/length 49 mm, electrically isolated, seg. r12: 32 mm/50 mm, seg. r13: 55 mm/30 mm). in one experimental series (referred to as variant a), the first extended part of r11 (34 mm long) is grounded and connected to the last section of the anode channel to improve stability. a decomposed substance is typically added into reactor r1, before its extension from 32 to 55 mm. thus, in the middle part of the reaction chamber (reactor r2, total length of 92 mm), the working gas and the decomposed substance are perfectly mixed up and reaction conditions (mean temperature and velocity) can be evaluated here. reactor r3 (89 mm in length) provides additional volume necessary for the technological process and its last segment (25 mm in diameter and 15 mm in length) drains gas away into a scrubber. the total gas flow rate for the designed device can be set between 2 and 30 g s−1. the presented experiments are carried out with the total flow rate of argon/nitrogen mixture of 11.3 g−1. the input power can be up to 60 kw. the basic set of measured data for each experiment includes arc current, voltage, flow rate of argon and nitrogen, input gas temperature and pressure, and flow rates and temperatures of cooling water in individual segments of the device. the measured voltage u covers not only the arc voltage unet but also the cathode ucat and anode voltage drop uan, thus u = unet + ucat + uan. ucat is determined from the measured cathode power loss. for uan, which changes very little with the current, values given in [4] are used. the cooling water circuits are established according to expected power losses of individual parts of the device and with respect to achievable accuracy. seven separate water circuits were used (the cathode, cathode shell, anode channel without its last segment, the last anode channel segment, and the first, second and third reactors). the total power loss of the part of the device from the cathode tip up to the axial distance zn, i.e. to the n-th output of cooling water, is pl (zn) =∑n 0 pli. the rest of the input power is transferred to the working medium and results in the increase of its enthalpy; power exchange between the device’s outer surface and the surrounding can be neglected. for the arc current i and the cross-section at the distance zn from the cathode tip, the power ph transferred to the gas is ph(i,zn) = ui − pl(i,zn). (1) the efficiency of the part of the device up to the cross-section at the distance zn is η(i,zn) = 1 − pl(i,zn) ui . (2) the practical utilization of the high-temperature device is given especially by attainable parameters of the working medium in the reaction chamber, namely by its temperature and velocity. these quantities can be computed using mass and energy conservation laws. the mean velocity v for the arc current i and distance zn can be determined using the following relation derived from the continuity equation v(i,zn) = 4g ρ[t(i,zn)]πd2(zn) (3) where ρ is the mass density [kg m−3] of the medium, t is the temperature [k], d is the inner diameter [m], all at the distance zn from the cathode tip, and g is the total gas flow-rate [kg s−1]. 180 vol. 53 no. 
2/2013 the influence of nitrogen in ar+n2 mixture similarly, according to the energy equation, the input power covers the power loss of the device, the increase of enthalpy h [j kg−1] and kinetic energy of the heated flowing gas between the beginning z = 0 up to the cross-section at the distance zn from the cathode tip ui = pl(i,zn) +g∆h[t(i,zn)] +g v2(i,zn) − v2(0) 2 . (4) the mean temperature can be determined from the increase of enthalpy h which is as follows h[t(i,zn)] − h[t0] = ph(i,zn) g − v2(i,zn) − v2(0) 2 . (5) the second member at the right-hand side of eqs. 4, 5 can be neglected in approximate estimations. the computation of mean temperatures and velocities is sensible especially in the reaction chamber where the working gas is mixed up and the radial profiles of temperature and velocity are rather flat. the computed mean values of temperature and velocity are sufficient for the description of reaction conditions there. they can be calculated at the output of the anode channel as well, but the centerline temperature and velocity may be much higher here because of presence of a narrow arc column. as it can be seen from eqs. 3 to 5, the properties of the working gas must be known for the proper range of temperature and pressure. for the calculation of the equilibrium composition, the thermodynamic properties (mass density, specific enthalpy, specific heat capacity at constant pressure) and the electrical conductivity, the following 17 species were considered: ar, n, n2, n3, ar +, n+, n2+, n3+, ar2+, n2+, ar3+, n3+, ar4+, n4+, n2–, n3– and e–. the ar plasma, n2 plasma and ar + n2 plasma mixture for pressure 1 atm and the temperature range from 300 to 20 000 k were considered. the program tmdgas [1] and the database thecoufal [2] were used for the calculation of the composition and the thermodynamic properties. the method of the composition calculation used in tmdgas is based on the method looking for the minimum of gibbs free energy. the method of the thermodynamic properties calculation is described in detail in [1]. it requires the knowledge of the values of the chemical potentials of all the system species, the system composition and the composition derivatives. the chemical potentials are determined by the values of the standard enthalpies of formation and the standard thermodynamic functions of the system species. these important values are presented in database thecoufal [2] containing the data of thermodynamic properties of the individual species created by the elements: c, f, h, n, o, s, w, ar, ca, cl, cu and e–. the method of the transport properties calculation proceeds from the kinetic theory of gases and using the chapman–enskog method of the solution mass density 0.0e+00 1.0e-01 2.0e-01 3.0e-01 4.0e-01 5.0e-01 6.0e-01 7.0e-01 8.0e-01 9.0e-01 1.0e+00 0 2000 4000 6000 8000 10000t [k]  [k gm -3 ] pure ar ar+10.9%n2 pure n2 figure 2. computed mass density of argon, nitrogen, and ar + 10.9 % of n2. enthalpy 0.0e+00 2.0e+07 4.0e+07 6.0e+07 8.0e+07 1.0e+08 1.2e+08 1.4e+08 1.6e+08 1.8e+08 2.0e+08 0 5000 10000 15000 20000t [k] h [j kg -1 ] pure ar ar+10.9%n2 pure n2 figure 3. computed enthalpy of argon, nitrogen, and ar + 10.9 % of n2. of the boltzmann integral–differential equation [3]. the knowledge of the collision integrals of each pair of colliding species in plasma is the integral part of the calculation method. 
in this paper the fundamental methods of the collision integrals calculation were used; for some pairs of species (e−ar, e−n, e−n2, n−n+, n−n2) the different methods were applied. details can be found in [5]. figures 2 and 3 show the computed mass density and enthalpy for pure argon and nitrogen and for their mixture with mass concentration of 10.9 % of nitrogen. 3. results and discussion in this section, results of selected experiments are given in graphs and discussed. as explained above, the set-up of the device was the same for all the experiments presented here, with anode channel 16 mm in diameter and 109 mm in length. the last segment of the channel was always grounded. for experiments designed “variant a”, it was connected to the first extended segment (extended anode 30 mm in diameter and 34 mm in length). the gas flow rate was maintained 11.3 g s−1 and the mass concentration was set to 0.6, 1, 1.6, 3, 5, 8, and 10.9 % of nitrogen in the gas mixture. figure 4 shows the arc voltage unet for both configurations of the device and all the tested mass concentrations of nitrogen in the gas mixture. 181 ivana jakubova, josef senk, ilona laznickova acta polytechnica unet (i), l = 109 mm, da = 30 mm (var. a), g = 11.3 gs -1, ar + x% n2 60 80 100 120 140 160 180 200 50 100 150 200 250 300i[a] u ne t [ v ] 10.9% 8% 5% 3% 1% 0.6% unet (i), l = 109 mm, da = 16 mm (var. b), g = 11.3 gs -1, ar + x % n2 60 80 100 120 140 160 180 200 50 100 150 200 250 300i[a] u ne t [ v ] 10.9% 8% 5% 3% 1.6% 1% 0.6% 0% figure 4. net arc voltage vs. arc current for various mass concentration of nitrogen in ar + n2 mixture (0 to 10.9 %), total gas flow rate g = 11.3 g s−1, for variant a (anode diameter 30 mm, up) and variant b (16 mm, down). for configuration b, also the voltage–current characteristic measured with pure argon is given. obviously, even a very low mass concentration of nitrogen in the ar+n2 mixture significantly raises the arc voltage. according to eq. 4 the increase of the arc voltage with the increasing share of nitrogen in the working gas can be explained by higher enthalpy of the mixture (see fig. 3). while mass density of the mixture with up to 11 % admixture of nitrogen is almost unchanged in comparison to pure argon (see fig. 2) enthalpy of the mixture is substantially higher. as far as power loss of the anode channel is concerned, measurements for variant a prove it to increase almost linearly with the input power preserving almost the same slope for all the tested concentrations. for the highest tested mass concentration 10.9 %, the arc voltage is more than twice higher than with pure argon. comparing the voltage–current characteristics of variant a and b, the arc voltage of the device with the extended anode (variant a) slightly increases in the whole range of the measured currents while the arc voltage of variant b remains almost constant. thus, operation of the device in variant a exhibits better stability. because the cathode and anode voltage drops are much lower than the arc voltage as a whole and little sensitive to the current the dependence of the total measured voltage on the current has almost the same shape as unet(i) (fig. 4). consequently, the input power pin = ui increases almost linearly measured voltage u for constant arc current 100 a and 200 a, l = 109 mm, g = 11.3 gs-1, ar + x % n2 80 100 120 140 160 180 200 0 2 4 6 8 10 12x [%] u [v ] da=16 mm, 200 a da=30 mm, 200 a da=16 mm, 100 a da=30 mm, 100 a figure 5. the measured voltage vs. 
nitrogen mass concentration for both configurations and two currents. l = 109 mm, g = 11.3 gs-1, ar + 10.9% n2, da = 16 mm (b) or 30 mm (a) 0 0.05 0.1 0.15 0.2 0.25 15000 25000 35000 45000 55000pin [w] p [] ch_b a_b r1_b ch_a a_a r1_a figure 6. relative power loss of the selected parts of the device with both configurations (30 mm var. a, 16 mm var. b, ch denotes the anode channel without its last segment, a denotes the anode channel last segment 27 mm in length, r1 is reactor 1), working gas argon with 10.9 % admixture of nitrogen. with the current for variant b, but faster and nonlinearly with the current for variant a (extended anode). from fig. 5 it can be seen that the voltage increase with the nitrogen mass concentration x slows down for higher concentrations. although the total power loss of both configurations is almost the same, surprising distinctions can be revealed if power losses of individual segments are compared. to make comparison easier and more general, the power losses of individual segments are taken relatively to the input power ploss = ploss/pin. an example of relative power losses of individual segments of the device operated on argon with 10.9 % admixture of nitrogen and with the both anode configurations is given in fig. 6. figure 6 shows the relative power loss of the anode channel without its last segment (ch), of the last anode channel segment (a) and of reactor 1 (r1) for both anode configurations. relative power losses of other parts of the device are much lower and are not displayed not to make the diagram too complicated. noticeable differences between both anode configurations can be observed. in variant a, the relative power loss of the anode channel and its last segment are significantly higher than in variant b and linearly increase with the input 182 vol. 53 no. 2/2013 the influence of nitrogen in ar+n2 mixture efficiency: da = 30 mm (a), g = 11.3 gs -1, ar + (0.6 or 10.9)%n2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 5000 15000 25000 35000 45000 55000pin [w]  [] ch+a(109 mm)_10.9% r1_10.9% r2_10.9% r3_10.9% ch+a (109 mm)_0.6% r1_0.6% r2_0.6% r3_0.6% efficiency: da = 16 mm (b), g = 11.3 gs -1, ar + (1.6 to 8)% n2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 5000 15000 25000 35000 45000 55000 pin [w]  [] ch+a (109 mm) r1 r2 r3 figure 7. efficiency in important cross-sections of the device, namely at the outputs of the anode channel as a whole (109 mm from the cathode tip) and of the reactors r1, r2, r3 for configuration a (up) and b (down). power. on the contrary, the relative power loss of reactor 1 is lower for variant a. the curve of the relative power loss of the first reactor is non-monotonous and exhibits a flat minimum for variant a, but a flat maximum for variant b. for other tested concentrations the power loss distribution is similar. figure 7 shows the efficiency in important crosssections of the device in configuration a and b, namely at the output of the anode channel as a whole (109 mm from the cathode tip) and at the output of each reactor. higher mass concentration of nitrogen raises the voltage and consequently higher input power is reached. in fig. 7 two sets of data are given for configuration a, for the lowest (0.6 %, empty symbols) and the highest (10.9 %, full symbols) tested mass concentrations. both data sets trace the following parts of the same curve. similarly, figure 7 includes the data measured for 1.6 and 8 % for variant b. 
comparing the efficiency of the device in both configurations, clearly better efficiency at the output of the anode channel for configuration b compared to configuration a is deteriorated by higher power loss of reactor 1 in variant b. the total efficiency of the device at the output of reactor 3 is between 46 and 37 % for the configuration a (da = 30 mm) and is slightly lower than in configuration b (52 % to 40 %). mean temperatures and velocities were calculated mean temperature at the output of reactor 2, da = 16 (b) or 30 (a) mm, g = 11.3 gs-1, ar + (0.6 to 10.9) % n2 0 500 1000 1500 2000 2500 3000 3500 4000 4500 5000 15000 25000 35000 45000 55000pin [w] t [k ] 16 mm, 10.9% 16 mm, 0.6% 30 mm, 0.6% 30 mm, 10.9% mean velocity at the output of reactor 2, da = 16 mm (b) or 30 mm (a), g = 11.3 gs-1, ar + (0.6 or 10.9) % n2 0 10 20 30 40 50 60 5000 15000 25000 35000 45000 55000pin [w] v [m s1 ] 16 mm, 10.9% 30 mm, 10.9% 16 mm, 0.6% 30 mm, 0.6% figure 8. mean temperatures and velocities at the output of reactor 2 for both anode configurations and several nitrogen mass concentrations. using eqs. 5 and 3 and the computed enthalpy and mass density of the used ar + n2 mixtures. as an example, mean temperatures at the output of reactor 2 are depicted in fig. 8 for both anode configurations and several nitrogen mass concentrations. obviously, for configuration b the curve t(pin) is almost linear with the same slope for the lowest and highest tested configuration. in configuration a (da = 30 mm) higher nitrogen mass concentration slightly decreases the mean temperature and the slope of the curve as well. in general, due to higher power loss of the anode channel including its last segment, configuration a exhibits lower mean temperatures than configuration b. similar conclusions for mean velocities at the output of reactor 2 can be derived from fig. 8. 4. conclusions results of experimental operation of the hightemperature device consisting of the arc heater and the three-stage reaction chamber are presented and discussed in the paper. the working gas used was argon with low admixtures of nitrogen (up to approximately 11 %). two configurations of electrical connection of the anode were tested, with the same geometry of the device. even a low admixture of nitrogen in argon was proved to substantially raise the arc voltage under otherwise identical conditions. thus, the device can 183 ivana jakubova, josef senk, ilona laznickova acta polytechnica be operated on higher input power. consequently, the power delivered to the working medium increases as well. it results in higher mean temperatures which can be desirable, and velocities, which can be undesirable. unfortunately, power loss of the device also increases with the input power. that’s why the influence of the increase of nitrogen concentration on the total efficiency is not so simple. as far as the anode configuration is concerned, variant a with the extended anode da = 30 mm is characterized by more stable operation. its voltage–current characteristic is slowly increasing in the tested range and also lower dispersion of measured data proves better stability. interesting differences between both anode configurations were observed in power loss distribution along the device. variant a exhibits substantially bigger power loss of the whole anode channel, including its last grounded segment but surprisingly the power loss of reactor 1 is lower than in variant b. 
probably, the arc in configuration a is stabilized in a narrower column in the axis of the anode channel and reaches higher centerline temperature than for variant b. previous work of the author concluded that it is mainly radiation which transfers energy from the plasma column to the anode channel wall. thus, both these factors raise radiation and consequently the power loss of the anode channel in variant a compared to variant b. in the stepwise extended reactor 1, the plasma column is mixed with colder surrounding gas. smaller radius of the plasma column naturally results in lower mean temperature of the medium in reactor 1 and also lowers power loss of reactor 1 for anode configuration a. on the contrary, a worse stabilized arc in the anode channel in configuration b (see flat voltage-current characteristics in fig. 4) probably does not reach too high centerline temperature. moreover, its lower radiation may be absorbed in outer regions of the wider plasma column. thus, the power loss of the whole anode channel is lower in variant b, but the mean temperature at its output might be higher than in variant a, which raises the reactor 1 power loss. obviously, these hypotheses must be tested and confirmed or disproved by further experiments and modeling. acknowledgements the research was performed in center for research and utilization of renewable energy sources. authors gratefully acknowledge financial support from european regional development fund under project no. cz.1.05/2.1.00/01.0014. references [1] o. coufal. composition and thermodynamic properties of thermal plasma up to 50 kk. journal of physics d: applied physics 40(11):3371–3385, 2007. [2] o. coufal, p. sezemsky, o. zivny. database system of thermodynamic properties of individual substances at high temperatures. journal of physics d: applied physics 38(8):1265–1274, 2005. [3] j. o. hirschfelder, ch. f. curtis. molecular theory of gases and liquids. john wiley, new york, 1954. [4] e.k. isakaev, et al. investigation of low temperature plasma generator with divergent channel of the output electrode and some applications of this generator. high temperature 48(1):97–125, 2010. [5] i. laznickova. transport coefficients of ar−n2 plasma. in xixth symposium on physics of switching arc, pp. 271–274. fekt vut, brno, 2011. [6] j. senk, i. jakubova. properties of a high-temperature device with electric arc burning in various working gas. in proceedings of the 13th international scientific conference electric power engineering 2012. but feec, brno, 2012. 184 acta polytechnica 53(2):179–184, 2013 1 introduction 2 material and methods 3 results and discussion 4 conclusions acknowledgements references ap1_03.vp 1 introduction for many years, concrete with compressive strength higher than 400 kg/cm2 (40 mpa) was only available in few locations. in recent years, the applications of high strength concrete (hsc) have increased as a result of recent development in material technology and in the demand for hsc. the construction of the world tallest concrete buildings, prestressed concrete bridges, and other special structures would not have been possible without the development of hsc [1]. high strength concrete offers greater compressive strength per unit cost, weight, and volume associated with improved mechanical properties such as modulus of elasticity and tensile strength. 
besides the satisfactory appearance of finished surfaces and possible reduction in the size of structural members, hsc has remarkably improved durability characteristics and thus, serviceability and maintenance problems are expected to decrease. because of its superior characteristics, high strength concrete can be applied in a wide range of applications. it can be recommended in nuclear vessels due to its low permeability, can be applied in aggressive environments due to its high durability, and can also be a convenient design alternative in high rise buildings, prestressed bridge girders, and composite structures. curing is a critical process for the production of high strength concrete, taking into account the low water-cement ratios usually applied in concrete mix design. mechanical characteristics and their development rate are highly influenced by curing conditions. in the current research, crushed dolomite was used in two high strength concrete mixes with 0.0 and 10 percent silica fume addition to evaluate the influence of five different drying conditions on high strength concrete compressive strength evaluated at 3, 7, 28 days. 2 methods of hsc production high strength concrete can be produced by conventional and special techniques for production. in the former, the following considerations are usually adopted [1], [2]: • unless high early strength is needed, as in prestressed concrete, the most common types of portland cement (astm types i, ii, iii) can all be used to produce high strength concrete. • water reducers are commonly used to reduce the water-cement ratio as the most significant single factor affecting concrete strength. • crushed coarse aggregates with lower nominal maximum size are known to produce higher strength. • efficient compaction and curing are essential. compared to normal concrete mixtures, high strength concrete mix proportioning is a more critical process and thus, many trail batches may be required to provide data that will enable the designer to identify the optimum mix proportions. on the other hand, several exotic techniques can be applied to produce hsc. these methods include [3]: • modification with polymers. • use of active or artificial aggregate such as portland cement clinkers to enhance the paste-aggregate bond. • introducing a tri-axial stress state by using finely divided reinforcements such as wires, nails, steel wool high density fibers, etc. • increasing compaction by pressure. • autoclave curing. 3 admixtures in hsc nearly all high strength concrete mixes have contained chemical admixtures. conventional water reducers, retarders, and high range water reducers (hrwr) or superplasticizers are typical examples. changes in the dosage and combination of these admixtures affect both the plastic and the hardened properties of concrete [1]. also, finely divided minerals or pozzolanic admixtures such as fly ash and silica fume have been used to achieve higher strengths and better durability. silica fume is a by-product material resulting from the reduction of high purity quartz with coal in electrical arc furnaces during the production process of ferro-silicon and silicon metal [4]. it can be obtained in a loose, condensed, and slurry form. because of its extreme fineness (specific area of 20 m2/g), silica fume has a high water demand and, like other pozzolanic materials, behaves more efficiently in concrete mixes having higher water-cement ratios. 
the action of silica fume is twofold; it reacts with free lime in cement to form a new cementitious compound, calcium silicate hydrate, and enhances the paste-aggregate bond [5], [6]. silica fume can be applied either as an addition or as a partial replacement of cement. in the first approach, typical silica fume ratios between 5 to 15 percent by weight of cement are usually added. the increased water demand associated with silica fume addition requires the use of a hrwr to maintain 24 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 41 no. 3/2001 influence of different drying conditions on high strength concrete compressive strength m. safan, a. kohoutková the influence of different drying conditions on the compressive strength and strength development rates of high strength concrete up to an age of 28 days was evaluated. two hsc mixes with and without silica fume addition were used to cast cubes of 10 cm size. the cubes were stored in different drying conditions until the age of testing at 3, 7, 28 days. keywords: high strength concrete, curing conditions, silica fume, high range water reducers. the watercementitious ratio required to achieve the desired strength [7]. however, if more water is added to compensate the loss in slump as an alternative to the use of a hrwr, there will still be a marked increase in the compressive strength [8]. 4 curing conditions curing is the process of maintaining satisfactory moisture content and a favourable temperature in concrete during the hydration of the cementitious material so that desired properties of concrete can develop. the potential strength and durability of concrete will not fully develop, unless it is properly cured for an adequate period prior to being placed in service. at watercement ratios below 0.4, the ultimate degree of hydration is significantly reduced if free water is not provided. water curing will allow more efficient, although not complete hydration of the cement [9]. kliger [10] reported that for lower watercement ratios it is more advantageous to supply additional water during curing. for concrete mixes with watercement ratios of 0.29, the 28 day compressive strength of specimens cast with saturated aggregates and cured by ponding water on top was 59 to 69 kg/cm2 greater than that of comparable specimens cast with dry aggregates and cured under damp burlap. also, it was noted that while early strength is increased by elevated temperature during mixing and curing, later strengths are reduced by such temperature. however, pfieffer [11] has demonstrated that later strengths may have only minor reductions if heat is applied after setting. 5 experimental work the current research was conducted in the strength of materials laboratory, menofia university, egypt. crushed dolomite was used in two mixes with a hrwr and 0.0 % and 10.0 % silica fume addition to evaluate the effect of different drying conditions on hsc compressive strength. the drying conditions are described in table 1 indicating the average of temperature and relative humidity in each case. a total number of 90 cubes (100 mm, side length) were been tested at different ages (3, 7, and 28 days) to explore the strength development rates up to an age of 28 days. concrete mix proportions of mix m1 (0.0 % s. f.) and mix m2 (10.0 % s. f.) are reported in table 2. 6 materials • cement: ordinary portland cement was used. 
compressive strength of mortar cubes (70 × 70 × 70 mm) measured at 3 and 7 days was 180 and 255 kg/cm2, respectively. • aggregates: natural siliceous sand was used. the delivered sand had a fineness modulus of 2.51. half of the sand weight was sieved using sieve no. 30. sand retained on this sieve was blended with the rest of the sand to obtain a coarser sand with a fineness modulus of 2.83 satisfying the material recommendations in [13], [14] to achieve better workability and compressive strength as the fineness modulus approaches a value of 3.0. crushed dolomite (specific gravity = 2.59 and volume weight = 1.53 t/m3) with a maximum nominal size of 0.5 in. (12.5 mm) was used as coarse aggregate. the particles were irregular, angular in shape with a relatively high percentage of flat and elongated particles, and had a granular-porous texture. grading curves for fine and coarse graded aggregates are shown in figures 1, 2 along with the limits according to astm c33-82 [12]. • water: tap water was used for mixing and curing. • chemical admixtures: one type of admixture with the trade name addecrete-bdf was used in the two mixes. the admixture produced by modern construction chemicals, © czech technical university publishing house http://ctn.cvut.cz/ap/ 25 acta polytechnica vol. 41 no. 3/2001 case drying condition relative humidity [%] temperature [°c] 1 water immersion 100 26 2 laboratory atmosphere 79.0 � 2 29.5 � 1 3 lab. atmosphere + spraying * similar to case (2) 4 direct sun + spraying * similar to case (5) 5 direct sun: • morning • noon • night 49.0 � 3 20.0 � 5 82.0 � 5 29.5 � 2 39.0 � 1 24.5 � 1 * twice a day table 1: drying conditions and associated average measured temperature & relative humidity in surrounding environment mix concrete mix proportions [kg/m3] density [kg/m3] c. a./f. a. s/c [%] w/(c+s) [%] a/(c+s) [%] slump [mm] c w f. a. c. a. s a m1 450 146.9 728 1092 0.0 1338 2430 1.5 0.0 32.6 2.97 70 m2 450 133.7 720 1097 45 13.5 2441 1.5 10 27 2.73 55 c = cement w = water f. a. = fine aggregate c. a. = coarse aggregate s = silica fume a = admixture (hrwr) table 2: concrete mix proportions [ n. m. s of c. a. (crushed dolomite) =1/2 in. and sand f. m.= 2.83] cairo, egypt is classified as a high range water reducer meeting the requirements of astm c494 (type a, f). the admixture is a brown liquid with 1.165 kg/l density at 20 °c. • silica fume: silica fume provided by the free silicon company, cairo, egypt was used in a powder form with a light gray color giving a black slurry when mixed with water. the product had a specific gravity of 2.2, bulk density = 280 kg/m3, and specific surface area = 17 m2/g. • mixing procedure: a 50 liter capacity mixer was used to mix batches of 20 liters needed to cast fifteen cubes. the mixer was charged with the drum at rest using the following sequence: half of the coarse graded aggregate, sand, and cement and silica fume mixed together. after charging, water was added gradually during mixing that continued for 2 minutes, the hrwr was added and mixing continued for 3 minutes to ensure adequate mixing. the concrete was charged from the mixer bowl and the slump test was then performed according to astm c143-78 [13]. the concrete specimens were removed from the steel molds 24 hours after adding the mixing water and stored according to the different drying schemes until the age of testing. 7 results the test results at 3, 7, and 28 days are presented in figures 3, 5 for mix m1 and in figures 4, 6 for mix m2. 
each result represents the average of three tested cubes. test results for mix m1 (0.0 % s. f.): figure 3 shows that the 28-day compressive strength for case 1 was higher than the other cases. the strength reduction increased with increased dryness of the curing environment, reaching a maximum of 15 % in case (5). however, this was not the case at early ages. at an age of 3 days, case (1) did not provide the most favourable curing condition, where strength increments between 16 and 21 percent were recorded for the other cases. with the passage of time, the loss of water due to evaporation abated the hydration process resulting in the strength reduction observed at 7 and 28 days. 26 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 41 no. 3/2001 0 20 40 60 80 100 120 no. 100 no. 50 no. 30 no. 18 no. 8 no. 4 3/8 in. sieve size astm c33 astm c33 used f. a. p e r c e n ta g e p a s s in g fig. 1: grading of the used sand (f. m. � 2.83) 0 100 200 300 400 500 600 700 3 7 28 age [days] case (1) case (2) case (3) case (4) case (5) fig. 3: compressive strength for different drying conditions at different ages (silica fume content = 0.0) 0 20 40 60 80 100 120 no. 18 no. 8 no. 4 3/8 in. 1/2 in sieve size astm c33 astm c33 used c. a. p e r c e n ta g e p a s s in g fig. 2: grading of crushed dolomite (m. n. s � 1/2 in) test results for mix m2 (10.0 % s. f.): figure 4 shows that different drying conditions have a remarkable effect on strength especially at 3 days age. at this age, a gradual increase in strength reaching a maximum of 35 % for case (5) was recorded. in case (5) a maximum temperature of 46 °c was measured at the specimen surface. this relatively high temperature simulated the cement hydration process and the pozzolanic reaction of silica fume, resulting in a remarkable strength increase. this phenomenon was also observed at later ages of 7 and 28 days. because of its low permeability, silica fume concrete specimens kept a moisture level sufficient for the hydration process to proceed efficiently even after 28 days in a dry environment. figures 5, 6 show the compressive strength development at different ages, indicating that case 1 provided better rates of strength gain than any other curing conditions. 8 conclusions concrete specimens (cubes 100 × 100 × 100 mm) were cured in different drying conditions (water immersion, laboratory atmosphere, laboratory atmosphere and spraying, direct sun and spraying, and direct sun). the specimens were tested at 3, 7, and 28 days. the trend according to which the compressive strength was affected by drying condition varied at different ages. it is recommended that testing should © czech technical university publishing house http://ctn.cvut.cz/ap/ 27 acta polytechnica vol. 41 no. 3/2001 0 100 200 300 400 500 600 700 800 900 1000 3 7 28 case (1) case (2) case (3) case (4) case (5) age [days] fig. 4: compressive strength for different drying conditions at different ages (silica fume content � 10 %) 300 350 400 450 500 550 600 650 0 5 10 15 20 25 30 case (1) case(2) case (3) case (4) case (5) silica fume content = 0.0 age [days] fig. 5: compressive strength for 100 × 100 × 100 mm concrete specimens for different curing conditions extend in time and for different specimen sizes to provide broader understanding of the behaviour trend. 
however, the following conclusions could be drawn: • high strength concrete specimens with 10 percent silica fume addition were more sensitive to drying conditions than those with zero silica fume content. • the compressive strength of silica fume concrete specimens increased as the curing environment temperature increased. this effect was more obvious at early ages and decreased with the passage of time. • because of its low permeability, silica fume concrete kept a moisture content level that was sufficient for the hydration process to proceed efficiently even after 28 days in a dry environment. • continuously watercured concrete specimens provided a better rate of strength development compared to other curing conditions. references [1] state-of-the art report on high strength concrete. aci committee 363 r–1984 [2] macinnis, c., thomson, d. v.: special techniques for producing high strength concrete. aci journal, vol. 67, no. 12/1970, pp. 996–1002 [3] duriez, m.: methods of achieving high strength concrete. aci journal, vol. 64, no. 1/1967, pp. 45–48 [4] high strength concrete. manual of concrete materials – aggregates, national crushed stone association, washington, d. c., january 1975, p. 16 [5] howard, n. l., leatham, d. m.: the production and delivery of high strength concrete. concrete international, april 1989, pp. 26–30 [6] rosenberg, a. m., gaidis, j. m.: a new mineral admixture for high strength concrete. concrete international, april 1989, pp. 31–36 [7] luciano, j. j., nami, c. k.: a novel approach to developing high strength concrete. concrete international, may 1991, pp. 25–29 [8] malhotra, v. m., carette, g. g.: silica fume concrete properties: applications and limitations. concrete international, may 1983, pp. 40–46 [9] neville, a. m.: properties of concrete. 3rd edition, pitman publishing limited, london, 1981, pp. 779 [10] kliger, p.: early high strength concrete for prestressing. world conference on prestressed concrete, san francisco, 1957, pp. a5 (1–14) [11] pfieffer, d. w., landgreen, j. r.: energy efficient accelerated curing of concrete: a laboratory study for plant-produced prestressed concrete. technical report no. 1, prestressed concrete institute, chicago, december 1981 [12] astm c33 – 82: specification for concrete aggregates. [13] astm c143 – 78: specification for ordinary portland cement. eng. mohamed safan, msc phone: +420 2 2435 4620 e-mail: msafan@beton.fsv.cvut.cz ing. alena kohoutková, csc. phone: +420 2 2435 3740 e-mail: akohout@fsv.cvut.cz dept. of concrete structures & bridges czech technical university in prague faculty of civil engineering thákurova 7, 166 29 praha 6, czech republic 28 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 41 no. 3/2001 450 500 550 600 650 700 750 800 850 900 950 0 5 10 15 20 25 30 case (1) case (2) case (3) case (4) case (5) silica fume content = 10.0 age [days] fig. 6: compressive strength for 100 × 100 × 100 mm concrete specimens for different curing conditions << /ascii85encodepages false /allowtransparency false /autopositionepsfiles true /autorotatepages /none /binding /left /calgrayprofile (dot gain 20%) /calrgbprofile (srgb iec61966-2.1) /calcmykprofile (u.s. 
web coated \050swop\051 v2) /srgbprofile (srgb iec61966-2.1) /cannotembedfontpolicy /error /compatibilitylevel 1.4 /compressobjects /tags /compresspages true /convertimagestoindexed true /passthroughjpegimages true /createjobticket false /defaultrenderingintent /default /detectblends false /detectcurves 0.0000 /colorconversionstrategy /cmyk /dothumbnails false /embedallfonts true /embedopentype false /parseiccprofilesincomments true /embedjoboptions true /dscreportinglevel 0 /emitdscwarnings false /endpage -1 /imagememory 1048576 /lockdistillerparams false /maxsubsetpct 100 /optimize true /opm 1 /parsedsccomments true /parsedsccommentsfordocinfo true /preservecopypage true /preservedicmykvalues true /preserveepsinfo true /preserveflatness true /preservehalftoneinfo false /preserveopicomments true /preserveoverprintsettings true /startpage 1 /subsetfonts true /transferfunctioninfo /apply /ucrandbginfo /preserve /useprologue false /colorsettingsfile () /alwaysembed [ true ] /neverembed [ true ] /antialiascolorimages false /cropcolorimages true /colorimageminresolution 300 /colorimageminresolutionpolicy /ok /downsamplecolorimages true /colorimagedownsampletype /bicubic /colorimageresolution 300 /colorimagedepth -1 /colorimagemindownsampledepth 1 /colorimagedownsamplethreshold 1.50000 /encodecolorimages true /colorimagefilter /dctencode /autofiltercolorimages true /colorimageautofilterstrategy /jpeg /coloracsimagedict << /qfactor 0.15 /hsamples [1 1 1 1] /vsamples [1 1 1 1] >> /colorimagedict << /qfactor 0.15 /hsamples [1 1 1 1] /vsamples [1 1 1 1] >> /jpeg2000coloracsimagedict << /tilewidth 256 /tileheight 256 /quality 30 >> /jpeg2000colorimagedict << /tilewidth 256 /tileheight 256 /quality 30 >> /antialiasgrayimages false /cropgrayimages true /grayimageminresolution 300 /grayimageminresolutionpolicy /ok /downsamplegrayimages true /grayimagedownsampletype /bicubic /grayimageresolution 300 /grayimagedepth -1 /grayimagemindownsampledepth 2 /grayimagedownsamplethreshold 1.50000 /encodegrayimages true /grayimagefilter /dctencode /autofiltergrayimages true /grayimageautofilterstrategy /jpeg /grayacsimagedict << /qfactor 0.15 /hsamples [1 1 1 1] /vsamples [1 1 1 1] >> /grayimagedict << /qfactor 0.15 /hsamples [1 1 1 1] /vsamples [1 1 1 1] >> /jpeg2000grayacsimagedict << /tilewidth 256 /tileheight 256 /quality 30 >> /jpeg2000grayimagedict << /tilewidth 256 /tileheight 256 /quality 30 >> /antialiasmonoimages false /cropmonoimages true /monoimageminresolution 1200 /monoimageminresolutionpolicy /ok /downsamplemonoimages true /monoimagedownsampletype /bicubic /monoimageresolution 1200 /monoimagedepth -1 /monoimagedownsamplethreshold 1.50000 /encodemonoimages true /monoimagefilter /ccittfaxencode /monoimagedict << /k -1 >> /allowpsxobjects false /checkcompliance [ /none ] /pdfx1acheck false /pdfx3check false /pdfxcompliantpdfonly false /pdfxnotrimboxerror true /pdfxtrimboxtomediaboxoffset [ 0.00000 0.00000 0.00000 0.00000 ] /pdfxsetbleedboxtomediabox true /pdfxbleedboxtotrimboxoffset [ 0.00000 0.00000 0.00000 0.00000 ] /pdfxoutputintentprofile (none) /pdfxoutputconditionidentifier () /pdfxoutputcondition () /pdfxregistryname () /pdfxtrapped /false /createjdffile false /description << /ara /bgr /chs /cht /dan /deu /esp /eti /fra /gre /heb /hrv (za stvaranje adobe pdf dokumenata najpogodnijih za visokokvalitetni ispis prije tiskanja koristite ove postavke. stvoreni pdf dokumenti mogu se otvoriti acrobat i adobe reader 5.0 i kasnijim verzijama.) 
/hun /ita /jpn /kor /lth /lvi /nld (gebruik deze instellingen om adobe pdf-documenten te maken die zijn geoptimaliseerd voor prepress-afdrukken van hoge kwaliteit. de gemaakte pdf-documenten kunnen worden geopend met acrobat en adobe reader 5.0 en hoger.) /nor /pol /ptb /rum /rus /sky /slv /suo /sve /tur /ukr /enu (use these settings to create adobe pdf documents best suited for high-quality prepress printing. created pdf documents can be opened with acrobat and adobe reader 5.0 and later.) /cze >> /namespace [ (adobe) (common) (1.0) ] /othernamespaces [ << /asreaderspreads false /cropimagestoframes true /errorcontrol /warnandcontinue /flattenerignorespreadoverrides false /includeguidesgrids false /includenonprinting false /includeslug false /namespace [ (adobe) (indesign) (4.0) ] /omitplacedbitmaps false /omitplacedeps false /omitplacedpdf false /simulateoverprint /legacy >> << /addbleedmarks false /addcolorbars false /addcropmarks false /addpageinfo false /addregmarks false /convertcolors /converttocmyk /destinationprofilename () /destinationprofileselector /documentcmyk /downsample16bitimages true /flattenerpreset << /presetselector /mediumresolution >> /formelements false /generatestructure false /includebookmarks false /includehyperlinks false /includeinteractive false /includelayers false /includeprofiles false /multimediahandling /useobjectsettings /namespace [ (adobe) (creativesuite) (2.0) ] /pdfxoutputintentprofileselector /documentcmyk /preserveediting true /untaggedcmykhandling /leaveuntagged /untaggedrgbhandling /usedocumentprofile /usedocumentbleed false >> ] >> setdistillerparams << /hwresolution [2400 2400] /pagesize [612.000 792.000] >> setpagedevice ap01_45.vp 1 introduction ul aircraft have recently became very popular and relatively simple to build and operate. however this process does not mean that the design and analysis of such aeroplanes is unsophisticated. the main aspect, that has an impact on the design simplicity, is the configuration of lifting surfaces and its geometry. the l2k is a ul, one-seat, high-wing aeroplane with an almost parabolic wing layout and mixed construction. the aeroplane has a tailless configuration with rudders on the tip of the wing (winglets) and ailerons likewise with a coupled function as a horizontal tail. an aerodynamical analysis involves investigating the airfoil characteristics for two different values of relative thickness and computations of wing characteristics. the l2k is designed according to german bfu requirements for ul aircraft. 2 notation © czech technical university publishing house http://ctn.cvut.cz/ap/ 79 acta polytechnica vol. 41 no. 4 – 5/2001 aerodynamic design of a tailless aeroplan j. friedl the paper presents an aerodynamic analysis of a one-seat ultralight (ul) tailless aeroplane named l2k, with a very complicated layout. in the first part, an autostable airfoil with a low moment coefficient was chosen as a base for this problem. this airfoil was refined and modified to satisfy the design requirements. the computed aerodynamic characteristics of the airfoils for different reynolds numbers (re) were compared with available experimental data. xfoil code was used to perform the computations. in the second part, a computation of wing characteristics was carried out. all calculated cases were chosen as points on the manoeuvring and gust envelope. the vortex lattice method was used with consideration of fuselage and winglets for very complicated wing geometry. 
the pmw computer program developed at iae was used to perform the computations. the computed results were subsequently used for structural and strength analysis and design. keywords: aviation, aerodynamics, tailless aeroplane, airfoil, wing. fig. 1: l2k layout parameter value unit max. take off weight 300 kg length 5.3 m height 2.1 m total span 11.71 m wing span 10.8 m dihedral 0 deg sweep variable wing area 16.8499 m2 aspect ratio 6.9223 1 table 1: l2k design parameters � [deg] angle of attack �0 [deg] zero lift angle ck * [1] coefficients for wing cd [1] drag coefficient cdi [1] induced drag coefficient cl [1] lift coefficient cl � [rad�1] lift curve slope clmax [1] max. lift coefficient cm [1] moment coefficient (0.25 chord) cm0as [1] cm of wing for zero lift angle cm � [rad�1] moment curve slope �cl � [rad�1] clk difference �t [deg] geometric twist angle of wing tip airfoil �w [deg] geometric twist angle of winglets l – left aileron n [1] g-loading r – right aileron re [1] reynolds number 3 airfoil analysis the first step during the design of lifting surfaces is the choice of an airfoil. an airfoil with low moment coefficient is needed for tailless aeroplanes to ensure stability during flight. the n60r airfoil was chosen as the base for further investigation. this airfoil belongs to the family of autostable profiles with a slightly “s” shape of the mean curve, which has a direct impact on the value of the moment coefficient. because the original airfoil coordinates were too rough they had to be refined and an airfoil named n60r124 with a relative thickness of 12.4 % was developed. by modification the n60r124 to a relative thickness of 15.0 % the n60r150 airfoil arose and was used at the root wing section. the two airfoils have identical mean curves with maximum relative camber 2.7 %, so a similar moment coefficient value was expected. 3.1 used program xfoil version 5.7 code was used to perform the computations. according to [2] the inviscid formulation built in this program is a second order panel method with a finite trailing edge modelled as a source panel. this formulation is closed with the explicit kutta-joukowski condition. the viscous formulation is based on the two-equation lagged dissipation integral boundary layer model and an envelope en transition criterion. the solution of the boundary layers and wake is interacted with the incompressible potential flow via the surface transpiration model. the drag is computed from the wake momentum thickness far downstream. 3.2 experimental data and computation comparison at the beginning, the computation for re = 168 000 was done and compared with available experimental data from [1]. there was no information about the measurement conditions, so it was decided to leave the settings of the solution parameters in xfoil at default values. the turbulence intensity was set to 0.07 %. comparisons for lift and moment curves are shown in fig. 3 and 4. for unknown parameters, e.g. the roughness, turbulence intensity and correction at measurement space, the comparison was not conclusive. 3.3 aerodynamic characteristics of airfoils for different re numbers computations of the aerodynamic characteristics of airfoils n60r124 and n60r150 were done for the considered range of re number for the root and tip section for the designed speed range and 0 m of isa. the values of zero angle attack and lift curve slope were determined by the least squares method from the lift curves. 
80 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 41 no. 4 – 5/2001 fig. 2: airfoils n60r124 and n60r150 -5 0 5 10 15 20 -0.2 0 0.2 0.4 0.6 0.8 1 1.2 1.4 re = 168e3 comp. re = 168e3 exp. � [deg] c l [1 ] n60r124 fig. 3: lift curve comparison � [deg] c m [1 ] n60r124 0 0 �5 5 10 15 20 �0.08 �0.07 �0.06 �0.05 �0.04 �0.03 �0.02 �0.01 0.01 0.02 fig. 4: moment curve comparison va [kph] speed of turn vd [kph] maximum speed x, y, z coordinate axis airfoil n60r124 airfoil n60r150 © czech technical university publishing house http://ctn.cvut.cz/ap/ 81 acta polytechnica vol. 41 no. 4 – 5/2001 10 5 10 6 10 7 1.3 1.35 1.4 1.45 1.5 1.55 1.6 1.65 1.7 1.75 c l m a x [1 ] n60r124 re [1] fig. 5: computed variation of clmax with increasing re number re �0 cl� clmax [1] [deg] [rad�1] [1] 168000 �0.36 5.226 1.324 1000000 �0.58 5.998 1.418 2000000 �0.68 6.258 1.511 4000000 �0.83 6.492 1.612 8000000 �1.02 6.642 1.710 table 2: aerodynamic characteristics of n60r124, re range �5 0 5 10 15 20 �0.5 0 0.5 1 1.5 2 re = 168e3 re = 1e6 re = 2e6 re = 4e6 re = 8e6 � [deg] c l [1 ] n60r124 fig. 6: computed lift curves � [deg] c m [1 ] n60r124 �5 0 5 10 15 20 �0.035 �0.03 �0.025 �0.02 �0.015 �0.01 �0.005 0 0.005 0.01 0.015 re = 168e3 re = 1e6 re = 2e6 re = 4e6 re = 8e6 fig. 7: computed moment coefficient curves 0.5 0 0.5 1 1.5 2 0 0.02 0.04 0.06 0.08 0.1 0.12 re = 168e3 re = 1e6 re = 2e6 re = 4e6 re = 8e6 c d [1 ] cl [1] n60r124 fig. 8: computed polars 10 6 10 7 1.45 1.5 1.55 1.6 1.65 1.7 c l m a x [1 ] n60r150 re [1] fig. 9: computed variation of clmax with increasing re number 3.4 discussion of results airfoils n60r124 and n60r150 can be considered as suitable for tailless aeroplane due to the low moment coefficients values. in order to maintain the stability of the aeroplane a twisted wing tip section and the application of consistent deflection of the ailerons will be necessary. experimental data and computation comparison was not applicable for the set up solution parameters due to unknown measurement conditions. 4 wing analysis the structural and strength design of the wing must take into account the distribution of aerodynamic parameters along the wing span. this task is quite simple for unswept wings and can be solved by a wide range of methods based on prandtl lift line theory. but for more complex geometry this task becomes complicated. the l2k aeroplane has a variable sweep angle due to the parabolic leading and trailing edge (except ailerons). 4.1 used program pmw102 software developed at iae was used to perform the computations. this package is based on the prandtl theory of lift vortex (panel method) [3, 4]. the lift surface is replaced by the system of lift vortices distributed along the span and chord. two conditions must be satisfied in order to determine the circulation – tangential flow at the panel surface and zero value of circulation at the trailing edge (kutta–joukowski condition). the model is divided into a finite number of trapezoidal panels with a horseshoe system of vortices. there is a constrained vortex at 1/4 of the panel chord and free vortices flow from the constrained to the infinity parallel to the velocity vector. the flow vector must be tangential to the panel surface at a control point located at 3/4 of the panel chord. this method is appropriate for swept wings, for lift surfaces with a small aspect ratio and likewise for complex geometry surfaces. program allows the model geometry to be inputted only by the man curve of all bodies. 
only a linear lift curve is considered. the pmw102 package 82 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 41 no. 4 – 5/2001 re �0 cl� clmax [1] [deg] [rad�1] [1] 1000000 �0.62 6.329 1.498 2000000 �0.74 6.541 1.548 4000000 �0.91 6.660 1.608 8000000 �1.02 6.707 1.677 table 3: aerodynamic characteristics of n60r150, re range �5 0 5 10 15 20 �0.5 0 0.5 1 1.5 2 re = 1e6 re = 2e6 re = 4e6 re = 8e6 � [deg] c l [1 ] n60r150 fig. 10: computed lift curves �5 0 5 10 15 20 �0.04 �0.03 �0.02 �0.01 0 0.01 0.02 re = 1e6 re = 2e6 re = 4e6 re = 8e6 � [deg] c m [1 ] n60r150 fig. 11: computed moment coefficient curves c d [1 ] cl [1] n60r150 0.5 0.50 0 1 1.5 2 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 fig. 12: computed polars involves a basic pre-processor, a solver and a post-processor, and is still under development. 4.2 model geometry the aeroplane was modelled by the mean curves of all bodies. this means that the aeroplane geometry was replaced by thin surfaces. the fuselage and winglets were also modelled to find out the impact on the flow field. the wing geometry had to be simplified because the pre-processor allows basic trapezoidal wing segments (blocks) to be modelled so the parabolic wing was replaced by 40 blocks with refinement at the tip area. a total of 1180 panels were used in 108 sections. 4.3 investigation process the first investigation was to determine the geometric twist angle of the wing tip airfoil (�t). it was derived from the condition that the difference between the airfoil max. lift coefficient and the local lift coefficient must be greater than 0.1 in the middle of the aileron span when maximum lift of the wing is reached. three variants with twist angles of –2, –3, –4 degrees were tested and the case zk 3 (–4 deg.) was chosen as the best. this variant ensures that flow separation will develop at the wing root and will extend to the wing tip with increasing angle of attack. according to the results obtained a stall speed less than 65 kph (by bfu) will be satisfied. the next investigation was to determine the geometric twist angle of the winglets (�w). three additional variants of winglet twist angle were studied for the twist variant zk 3 (tab. 5). the variant wzk 13 that has the highest value of lift curve slope is the best from the induced drag point of view. nevertheless, based on local cl distribution along the span (fig. 14) the variant wzk 33 was chosen, because it gives more lift in the aileron area. © czech technical university publishing house http://ctn.cvut.cz/ap/ 83 acta polytechnica vol. 41 no. 4 – 5/2001 fig. 13: axonometric view of the computation model var. �t clkmax cmk � stall speed [deg] [1] [1] [deg] [kph] zk 1 �2 1.2846 �0.012 17.6 56.246 zk 2 �3 1.2766 �0.011 18.0 56.421 zk 3 � 4 1.2139 �0.008 18.0 57.861 note: cmk is computed towards the y axis. the angle of attack is measured from the root chord. table 4: tip twist variations 0 1 2 3 4 5 6 0.2 0.15 0.1 0.05 0 0.05 0.1 0.15 0.2 wzk 1 3 wzk 2 3 wzk 3 3 y [m] c l [1 ] cl fig. 14: winglets variants, local cl along span, angle of attack 2 deg var. �w �0 cl� �cl� [deg] [deg] [rad�1] [rad�1] wzk 13 0 0.9692 4.2124 0 wzk 23 4 1.0830 4.1920 �0.0204 wzk 33 6 1.1475 4.1755 �0.0165 table 5: winglet twist angle variants 4.4 computed cases for strength analysis the geometric variant wzk 33 was chosen as final to compute cases for the strength analysis. five cases were selected on the manoeuvring and gust envelope (see tab. 6). the lift curve was considered linear. 
this simplification can be used for the strength analysis. the wing coefficients clk and cmk (see tab. 7) were computed from the distribution of local values along the span using the trapezoidal method of numerical integration. the basic text output from the pmw102 programme with all operational load data is available. 5 conclusions the main objectives of the project were successfully completed. the characteristics of the modified n60r airfoil were obtained and data for the strength analysis of the wing was determined. it was shown that relatively complex wing geometry should be analysed via the panel method approach. in the investigative process some problems occurred, such as data evaluation of the results from pmw102, because the development of this software is still in progress. references [1] horejší, m.: aerodynamika létajících modelů. praha, naše vojsko, 1957 [2] drela, m.: xfoil user guide. mit aero & astro harold youngren, aerocraft, inc., 2001 [3] kuthe, a. m., chow, ch.: foundations of aerodynamics. new york, john wiley & sons, inc., 1998 [4] brož, v.: aerodynamika nízkých rychlostí. ostrava, české vysoké učení technické v praze, 1990 ing. jan friedl phone: +420 5 41143470 fax: +420 5 41142879 e-mail: friedl@iae.fme.vutbr.cz institute of aerospace engineering brno university of technology technická 2, 616 69 brno, czech republic 84 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 41 no. 4 – 5/2001 case speed speed value n aileron deflection [deg] note [kph] [1] r l p1 va 112.17 4 0 0 p2 va 112.17 1 �30 20 full defl. p3 vd 210 4 0 0 p4 vd 210 1 �10 6.67 1/3 defl. p5 vd 210 �1.5 0 0 table 6: envelope cases case � clk cmk [deg] [1] [1] p1 18.500 1.2920 �0.0156 p2 5.617 0.2860 0.0092 p3 6.205 0.3823 �0.0098 p4 2.410 0.0895 0.0011 p5 �1.000 �0.1362 �0.0041 table 7: summary of results 0 1 2 3 4 5 6 0.5 0 0.5 1 1.5 2 clmax airf. p1 p3 p5 y [m] c l [1 ] symmetrical cases fig. 15: local cl along span, symmetrical cases 6 4 2 0 2 4 6 1 0.8 0.6 0.4 0.2 0 0.2 0.4 0.6 0.8 1 p2 p4 y [m] c l [1 ] unsymmetrical cases fig. 16: local cl along span, unsymmetrical cases �0 cm0as cl� cm� [deg] [1] [rad�1] [rad�1] wzk 33 1.1475 �0.0053 4.1755 �0.0286 table 8: results in linear area of lift curve acta polytechnica doi:10.14311/ap.2014.54.0305 acta polytechnica 54(4):305–319, 2014 © czech technical university in prague, 2014 available online at http://ojs.cvut.cz/ojs/index.php/ap fractional calculus and lambert function i. liouville–weyl fractional integral vladimír vojta chotovická 12, 182 00 praha 8 correspondence: vojta@karneval.cz abstract. the interconnection between the liouville–weyl fractional integral and the lambert function is studied. the class of modified abel equations of the first kind is solved. a new integral formula for the gamma function and possibly new transform pairs for the laplace and mellin transform have been found. keywords: variable order fractional integral, liouville–weyl fractional integral, lambert function, gamma function, bickley function, mcdonald function, exponential integral, entire functions, completely monotone functions, laplace transform, mellin transform, euler integral transform, volterra integral equations. ams mathematics subject classification: 26a33, (33b15, 33e20, 44a10, 44a15, 45d05). 1. 
introduction for a study of the interconnection between fractional integrals [1] and the lambert function [2], we start with the following variant of the fractional integral, known as the liouville, liouville–weyl or weyl fractional integral [1]: (iν−f)(y) = 1 γ(ν) ∫ ∞ y f(x)(x−y)ν−1 dx = 1 γ(ν) ∫ ∞ 0 f(u + y)uν−1 du, (1.1) where <ν > 0 and y > −∞. parameter ν is known as the order of the fractional integral. the integral on the right of (1.1) is the mellin transform of the shifted function f(x). this means that the mellin transform of function f(·) is its liouville–weyl fractional integral of the order ν at y = 0 times the gamma function of argument ν. for our further investigation, we suppose that the integrals in (1.1) converge absolutely and that function f(x) is an unilateral laplace transform (picture, generating function) of the relevant original or determining function f(t): f(x) = ∫ ∞ 0 e−xtf(t) dt, (1.2) where f(t) is a locally integrable function on the interval [0,∞), and the integral in (1.2) converges absolutely at some x = c. then the integral in (1.2) converges absolutely and uniformly in the half-plane <(x) > <(c) [3]. if f(t) is a function of the exponential order α0, i.e., f(t) = o(exp α0t) for t = +∞ (see [21, chap. 5]), then integral (1.2) converges absolutely at least in the region α0 and uniformly for α1 > α0. function f(x) is holomorphic in the region α0. it can easily be shown [4] that (iν−f)(y) = ∫ ∞ 0 e−yxx−νf(x) dx, <ν ≥ 0. (1.3) it should be noted that (1.3) is defined straightforwardly for <ν = 0, whereas (1.1) is not. the integral in (1.3) can be regarded as a laplace transform for variable y, or as a mellin transform for variable −ν + 1. it is often called the laplace–mellin transform [5]. function f(x) can be a generalized function, namely a dirac delta function, see [3]. these generalized functions will be used as a tool only, and the results obtained here can be verified without making use of them. definition 1.1. a completely monotone function is defined as follows [6, def. 1.3, p. 2]: a function f : (0,∞) → r is a completely monotone function if f(·) is a member of the class c∞ and (−1)nf(n)(λ) ≥ 0, n ∈ n∪{0}, λ > 0. (1.4) equation (1.3) has an interesting consequence based on the following theorems: theorem 1.2. the liouville–weyl fractional integral of a completely monotone function is a completely monotone function. 305 http://dx.doi.org/10.14311/ap.2014.54.0305 http://ojs.cvut.cz/ojs/index.php/ap vladimír vojta acta polytechnica proof. according to the bernstein theorem [6, p. 3], the laplace transform of the nonnegative function is completely monotone and, conversely, if function f(x), defined by (1.2), is completely monotone, then function f(t) is nonnegative for t > 0. because function f(x) is nonnegative by hypothesis, the liouville–weyl fractional integral defined by (1.3) is the laplace transform of a nonnegative function. theorem 1.3. suppose the following two conditions on f(·) are satisfied: (1.) the liouville–weyl fractional integral of function f(·) is a completely monotone function; (2.) function f(·) is a laplace transform of some function f(·). then function f(·) is completely monotone. proof. in consequence of hypothesis 2., the liouville–weyl fractional integral is represented by (1.3). according to hypothesis 1. and the bernstein theorem, function x−nf(x) is nonnegative for 0, reν > 0. this implies that function f(t) in (1.2) is also nonnegative. 
this means that function f (·) is completely monotone according to the bernstein theorem. completely monotone functions play a substantial role in probability theory, measure theory, etc. they can also be found in technological practice we mention time-dependent shear, bulk and the young moduli of linear viscoelastic materials, because they are laplace transforms of the relevant nonnegative relaxation spectra [7]. 2. diagonal fractional integrals in the case that order ν of the fractional integral in (1.1) is a function of variable y, i.e., ν = ν(y), the integral in (1.1) is called a variable order fractional integral [8]. variable order fractional integrals and derivatives have been studied recently due to their applications in many branches of science where the memory effect plays an essential role [9]. our interest will be concentrated on the simplest case ν(y) = y. definition 2.1. the fractional integral of variable order (iy−f)(y) = 1 γ(y) ∫ ∞ y f(x)(x−y)y−1 dx = ∫ ∞ 0 e−ytt−yf(t) dt, y > 0, (2.1) will be called the right liouville–weyl diagonal fractional integral or simply the diagonal integral of the function f(x), and the integral on the right hand side of (2.1) will be called the anti-diagonal laplace–mellin transform of function f(t). the following two theorems are seminal for this paper. theorem 2.2. the right-most integral in (2.1) can be written as∫ ∞ 0 e−ytt−yf(t) dt = ∫ ∞ −∞ e−yuf ( w0(eu) ) w0(eu) 1 + w0(eu) du, (2.2) where function w0(·) is the 0th branch of the lambert function [2]. proof. we compute∫ ∞ 0 e−ytt−yf(t) dt = ∫ ∞ 0 e−y(t+ln t)f(t) dt = ∫ ∞ −∞ e−yuf ( w0(eu) ) w0(eu) 1 + w0(eu) du, where the substitution t + ln t = u, i.e., t = w0(eu) and dt = w0(eu)/(1 + w0(eu)), was performed. the lambert function is defined as a solution of the functional equation z = w(z)ew(z) for any complex z [2]. theorem 2.2 means that the liouville–weyl diagonal fractional integral of the unilateral laplace transform can be interpreted as a two-sided laplace transform with the region of convergence β0 < y < β1. alternatively, we can express the anti-diagonal laplace–mellin transform as the mellin transform. theorem 2.3. the rightmost integral in (2.1) can be written as∫ ∞ 0 e−ytt−yf(t) dt = ∫ ∞ 0 u−y−1f ( w0(u) ) w0(u) 1 + w0(u) du. (2.3) proof. we compute∫ ∞ 0 e−ytt−yf(t) dt = ∫ ∞ 0 (ett)−yf(t) dt = ∫ ∞ 0 u−y−1f ( w0(u) ) w0(u) 1 + w0(u) du where the substitution ett = u, i.e., t = w0(u) and dt = w0(u) u(1+w0(u)) du, was performed. 306 vol. 54 no. 4/2014 fractional calculus and lambert function i the integral on the right of (2.3) is a mellin transform in variable −y. in this paper, we will omit the words “in variable . . . ” for the sake of brevity. in the case of the mellin transform in variable y we emphasize this circumstance as the “mellin transform in standard notation” if necessary to avoid confusion. this means that the liouville–weyl diagonal fractional integral can be represented as a mellin transform with the region of convergence −β1 < −y < −β0 or β0 < y < β1. the following corollaries easily follow from theorems 2.2 and 2.3: corollary 2.4. let the laplace transform of function h(t) exist for 0 ≤ a ≤ t < b ≤ ∞, where a = w0(ea) and b = w0(eb), i.e., a = a + ln a and b = b + ln b. then for the laplace transform of the compound function h ( w0(eu) ) it holds ∫ b a e−yuh ( w0(eu) ) du = ∫ b a e−ytt−y(1 + t)h(t)/t dt (2.4) whenever both integrals exist. if −∞ < a < b < +∞, the laplace transform on the left side represents the entire function. 
in that case, we can substitute the value y = 0 ( and others of course), whenever we need it. corollary 2.5. let the mellin transform of function h(t) exist for 0 ≤ a < t < b ≤ ∞, a = w0(a) and b = w0(b), i.e., a = aea and b = beb . then for the mellin transform in the standard form of the compound function h ( w0(u) ) it holds ∫ b a uy−1h ( w0(u) ) du = ∫ b a ty−1eyt(1 + t)h(t) dt (2.5) whenever both integrals exist. proof. instead of f(t), write (1 + t)h(t)/t in (2.2)–(2.3). in (2.3) change −y to y. in both equations recount the limits of integration. the integrals on the right sides of (2.4)–(2.5) can often be calculated more easily than the integrals on the left side. example 2.6. we have∫ e 0 sin w0(u) du = ∫ 1 0 et(1 + t) sin t dt = e 2 (2 sin 1 − cos 1) ≈ 1.55301. 3. inversion of the diagonal fractional integral a natural question is to find function f(x) knowing its diagonal fractional integral g(y): 1 γ(y) ∫ ∞ y f(x)(x−y)y−1 dx = g(y). (3.1) this equation is a volterra integral equation of the first kind and of the abel type, but in contrast to (1.1), equation (3.1) is not a convolution. if f(x) is a unilateral laplace transform of function f(t), see (1.2), equation (3.1) can easily be solved for f(x) via its original function f(t). because g(y) = ∫ ∞ −∞ e−yug(u)du = (iy−f)(y) = ∫ ∞ −∞ e−yuf ( w0(eu) ) w0(eu) 1 + w0(eu) du, (3.2) the following theorem holds: theorem 3.1. the solution of (3.2) is given by the relation f(t) = (1 + t)g(t + ln t)/t, t ∈ (0,∞). (3.3) proof. both integrands must be identical with an identical strip of convergence [10]. after substituting w0(eu) = t, i.e., u = t + ln t into (3.2) we obtain (3.3). a similar argument holds for the “mellin variant” (2.5). in this case g(y) = ∫ ∞ 0 u−y−1g(u) du = (iy−f)(y) = ∫ ∞ 0 u−y−1f ( w0(u) ) w0(u) 1 + w0(u) du. (3.4) theorem 3.2. the solution of (3.4) is given by the relation f(t) = g(tet) 1 + t t , t ∈ (0,∞). (3.5) proof. as in the case of theorem 3.1, the two integrands must be identical with an identical strip of convergence. after substituting w0(u) = t, i.e., u = tet in (3.4) we obtain (3.5). 307 vladimír vojta acta polytechnica 3.1. examples 3.1.1. example of the laplace variant (cf. (3.2)–(3.3)) let g(y) = 1/(y − 1), y > 1, i.e., g(u) = eu for u ≥ 0 and g(u) = 0 for u < 0. then f(t) = { et(t + 1) for t ≥ w0(1), 0 for t < w0(1), f(x) = ∫ ∞ w0(1) e−xtf(t) dt = w0(1)x−1 xw0(1) + x−w0(1) (x− 1)2 , x > 1, and (iy−f)(y) = 1 γ(y) ∫ ∞ y f(x)(x−y)y−1 dx = ∫ ∞ w0(1) e−ytt−yf(t) dt = 1 y − 1 , y > 1. 3.1.2. example of the mellin variant (cf. (3.4)–(3.5)) let g(y) = −γ(−y)yy−1, y ∈ (0, 1), i.e., g(u) = w0(u) for u ≥ 0 and g(u) = 0 for u < 0. then f(t) = { (1 + t)g(tet)/t = 1 + t for t ≥ 0, 0 for t < 0, beceause w0(tet) = t. the laplace transform of f(t) is f(x) = ∫ ∞ 0 e−xtf(t) dt = 1 + x x2 . finally we obtain (iy−f)(y) = 1 γ(y) ∫ ∞ y f(x)(x−y)y−1 dx = ∫ ∞ 0 e−ytt−yf(t) dt = πyy−2 csc πy γ(y) , y ∈ (0, 1), see [12, entry 8, p. 201], where csc πy = 1/ sin πy. this means that g(y) = ∫ ∞ 0 u−y−1w0(u) du = πyy−2 csc πy γ(y) , y ∈ (0, 1). in standard notation of the mellin transform we have the known transform pair [2]∫ ∞ 0 uy−1w0(u) du = π(−y)−y−2 csc(−πy) γ(−y) = (−y)−y γ(y) y , y ∈ (−1, 0). 4. fixed point the following theorems assert that the unilateral laplace transform can be represented by the liouville–weyl diagonal fractional integral of another laplace transform. theorem 4.1. 
every unilateral laplace transform f0(y) = ∫ ∞ a e−ytf(t) dt, a ∈ [0,∞) of some function f(t) of exponential order α0, i.e., f(t) = o(exp α0t) for t → +∞, is equivalent to the liouville–weyl diagonal fractional integral of the function f1(x) = ∫ ∞ a1 e−xtf(t + ln t) 1 + t t dt, x ≥ α0, (4.1) where a1 = a(a) = w0(ea). proof. we rewrite the laplace transform from the hypothesis in the form∫ ∞ a e−ytf(t) dt = ∫ ∞ a e−yth1 ( w0(et) ) w0(et) 1 + w0(et) dt . 308 vol. 54 no. 4/2014 fractional calculus and lambert function i then h1(w0(et)) = f(t)(1+w0(et))/w0(et), t > 0, according to the uniqueness (lerch) theorem for the unilateral laplace transform [3, p. 120]. after substituting w0(et) = u, i.e., t = u + ln u,u ≥ w0(ea) = a(a) ≡ a1 > 0, we obtain ∫ ∞ a e−ytf(t)dt = ∫ ∞ a1 e−y(u+ln u)h1(u) du = ∫ ∞ a1 e−yuu−yh1(u) du. (4.2) the function h1(u) = f(u + ln u) + f(u + ln u)/u, u ≥ w0(ea) is also of exponential order α0, because f(u + ln u) is of exponential order α0 and the function f(u + ln u)/u is of the order α < α0 . this means that the laplace transform f1(x) = ∫ ∞ a1 e−xth1(t) dt = ∫ ∞ a1 e−xtf(t + ln t) 1 + t t dt = ∫ ∞ x φ(u) du +φ(x), x ≥ α0, where φ(x) = ∫ ∞ a1 e−xtf(t + ln t) dt, x ≥ α0 exists, and that the liouville–weyl diagonal fractional integral (iy−f1)(y) = 1 γ(y) ∫ ∞ y f1(x)(x−y)y−1 dx = ∫ ∞ a1 e−yuu−yh1(u) du = ∫ ∞ a e−ytf(t) dt = f0(y), y > α0 also exists. the same procedure as above can be applied to function f1(y), and we obtain f2(x) = ∫ ∞ a2 e−yth2(t) dt = ∫ ∞ a2 e−yth1(t + ln t) 1 + t t dt, (iy−f2)(y) = 1 γ(y) ∫ ∞ y f2(x)(x−y)y−1 dx = ∫ ∞ a2 e−yuu−yh2(u) du = ∫ ∞ a1 e−yth1(t) dt = f1(y), y > α0, where a2 = a(a(a)) = w0(exp(w0(ea)) is a second iterate of the function a(a), and h2(u) = h1(u + ln u) 1 + u u = f ( u + ln u + ln(u + ln u) )1 + u + ln u u + ln u 1 + u u , u ≥ a2. this procedure can be repeated without restraint. to understand its behaviour, we first observe that point a = 1 is a fixed point of the function a(a) = w0(ea), because a(a) is a continuous function, a(a) > a for a < 1, a(a) < a for a > 1 and a(1) = 1. this fixed point is attractive because of a′(1) = 1/2 < 1. for this reason, limn→∞an(a) = 1 for every a ∈ [0,∞), where an(a) means the nth iterate of function a(a). the interval [0,∞) is the domain of attraction of this fixed point. for fn(x), we obtain fn(x) = ∫ ∞ an e−ythn(t) dt, where an = an(a), hn(t) = f ( (t + ln t)n )n−1∏ k=0 1 + (t + ln t)k (t + ln t)k , t ≥ an, n > 0, and an(a) and (t+ ln t)n are the nth iterate (not power) of the function a(a) and t+ ln t, respectively, a0(a) = a, (t + ln t)0 = t. the procedure just described can be reversed: theorem 4.2. let h(y) = ∫∞ b e−ytg(t) dt, 0 ≤ b < ∞ , be a unilateral laplace transform, where g(t) is a locally integrable function on the interval [b,∞) and of exponential order α0. then the function h1(y) = ∫ ∞ b e−yuf(u) du = ∫ ∞ b e−yug ( w0(eu) ) w0(eu) 1 + w0(eu) du, (4.3) where b = b(b) = b + ln b, is a liouville–weyl diagonal fractional integral of h(y). proof. the proof is a direct consequence of theorem 2.2. this procedure can also be repeated. the point b = 1 is a fixed point of the function b(b) = b + ln b, because b(b) is a continuous function, b(b) < b for b < 1, b(b) > b for b > 1 and b(1) = 1. this fixed point is repulsive because of b′(1) = 2 > 1, so limn→∞bn(b) = +∞ for b > 1, limn→∞bn(b) = 1 for b = 1 and for b < 1 the process ended at n for which bn+1(b) < 0. 309 vladimír vojta acta polytechnica 5. 
complex domain as yet there has been no need to reason about complex values of variable y in the definition of the diagonal integral and in the anti-diagonal laplace–mellin transform in (2.1), because this paper (with minimum exceptions — the gamma function and examples 6.12 and 6.13) deals with fractional integrals of generally complex order but on the real axis. now we intend to infer an analog of the standard bromwich inversion formula of the laplace transform for the anti-diagonal laplace–mellin transform, and make a mention of the entire functions. theorem 5.1. if the integral in (3.2) converges absolutely for function g(u) in the region of convergence −∞≤ β0 < 0, f(0) = lim t→0+ f(t), (5.1) where β0 < c < β1 and the integral in (5.1) is taken as the cauchy principal value (see [3, § 5.8]). moreover, (5.1) is also true if “equation (3.2)” in the hypothesis is changed to “equation (3.4)”. proof. we start with the standard bromwich inversion for the bilateral laplace transform in (3.2) and perform the substitution u = t + ln t. we obtain g(t + ln t) = 1 2iπ ∫ c+i∞ c−i∞ etytyg(y) dy, t > 0 (5.2) and according to (3.3) we obtain (5.1). if we apply the bromwich inversion to the mellin transform in (3.4) and perform the substitution u = tet we obtain g(tet) = 1 2iπ ∫ c+i∞ c−i∞ etytyg(y) dy, t > 0 (5.3) and according to (3.5) we also obtain (5.1). if g(·) is the mellin transform of function g(·) in standard notation, formula (5.3) obtains the form g(tet) = 1 2iπ ∫ c+i∞ c−i∞ e−tyt−yg(y) dy, t > 0. (5.4) corollary 5.2. let function g(t) have the (generally bilateral) laplace transform g(x), then∫ ∞ 0 e−xtg(t + ln t) dt = 1 2iπ ∫ c+i∞ c−i∞ (x−y)−y−1γ(1 + y)g(y) dy, c > −1, α0, (5.5) where x lies in the interior of the region of holomorphy of the laplace integral on the left side of (5.5). proof. application of the laplace transform to both sides of (5.2) with respect to the fubini theorem gives (5.5). the formula for solving the integral equation (3.1) in the case that function g(y) is a laplace or mellin transform of the pertinent determining functions g(t) is given by the laplace transform of (5.1): theorem 5.3. let function g(y) on the right hand side of (3.1) be a laplace or mellin transform. then solution f(x) of integral equation (3.1) is f(x) = ∫ ∞ 0 e−xtf(t) dt = 1 2iπ ∫ c+i∞ c−i∞ g(y) ∫ ∞ 0 (1 + teytty) e−xt t dt dy = x 2iπ ∫ c+i∞ c−i∞ (x−y)−y−1γ(y)g(y) dy, c, α0, (5.6) where the real constant c determines the bromwich contour inside the region of holomorphy of function g(y), providing f(t) is of exponential order α0, i.e., f(t) = o(exp α0t) as t → +∞. proof. we perform the laplace transform of (5.1) and change the order of the integration according to the fubini theorem. 310 vol. 54 no. 4/2014 fractional calculus and lambert function i the importance of theorem 5.3 in comparison with theorem 3.1 is embodied in the fact that there is no need to know the laplace inverse of function g(y) and that solution f(x) is obtained directly. moreover numerical experiments indicate that (5.6) also holds for functions g(y) that are either laplace or mellin transforms of generalized functions (const., yn, e−ay, ζ(1 −y), yn e−ay , tanh y), or it is not known to the author if they are transforms at all (sin y, ln y, 1/γ(y), tan y ). on the other hand, it should be emphasized that bare convergence of the integral in (5.6) does not guarantee the correct solution. an example is g(y) = exp y2, which is neither a laplace nor a mellin transform of any function. 
the resulting function f(x) is not a solution of (3.1). testing the solution is recommended. lemma 5.4. if f(x) is a finite laplace transform of the original f(t), i.e., f(x) = ∫ b a e−xtf(t) dt, 0 ≤ a < b < ∞, (5.7) then (iy−f)(y) = ∫ b a e−xtt−yf(t) dt = ∫ b+ln b a+ln a e−yuf ( w0(eu) ) w0(eu) 1 + w0(eu) du (5.8) and (iy−f)(y) = ∫ b a e−xtt−yf(t) dt = ∫ b exp b a exp a u−y−1f ( w0(u) ) w0(u) 1 + w0(u) du. (5.9) proof. substituting w0(eu) = t, i.e., u = t + ln t in (5.6) and substituting w0(u) = t, i.e., u = tet in left integral of (5.7), as in theorems 3.1 and 3.2, respectively. theorem 5.5. if f(x) is a finite laplace transform of the piecewise continuous original f(t), i.e., f(x) = ∫ b a e−xtf(t) dt, 0 < a < b < ∞, (5.10) then the liouville–weyl diagonal fractional integral (iy−f)(y) = ∫ b a e−xtt−yf(t) dt, y ∈ c (5.11) is an entire function. proof. it is known that the finite laplace transform of an almost piecewise continuous function is an entire function [10]. if 0 < a < b < ∞, then the integral on the right of (5.8) is a finite laplace transform of the almost piecewise continuous function, i.e., (iy−f)(y), y ∈ c, is an entire function. remark 5.6. the sharp inequality 0 < a in the hypothesis of theorem 5.5 is essential. if a = 0, function f (x) is entire, but its diagonal fractional integral need not be entire, because (5.8) is not a finite laplace transform in that case. remark 5.7. from (5.11) it results that the question of the path of integration in the complex domain, not only for the diagonal integral but also for the general liouville–weyl fractional integral (iν−f)(y), ν,y ∈ c, is irrelevant in the case that function f(·) is a laplace transform. however, the specific path of integration often helps in the case when γ(ν)(iν−f)(y), ν,y ∈ c is used as the euler transform for solving differential equations [17]. 6. applications many special functions have a connection with the liouville–weyl fractional integral according to (1.1)–(1.3). we will concentrate on the diagonal restriction ν = y. 6.1. gamma function it is known [15] that there does not exist a pair (f(x),ν) such that the fractional integral (1.1) is equal to a constant. but there exists a function f(x) the diagonal integral of which is equal to 1. lemma 6.1. let g(y) = 1, g(u) = δ(u), u ∈ (−∞,∞), where δ(u) is the dirac delta function. then f(t) = (t + 1)δ(t + ln t)/t = δ ( t−w0(1) ) , t > 0, and for the laplace transform we have f(x) = ∫ ∞ 0 e−xtf(t) dt = w0(1)x, x ∈ r. 311 vladimír vojta acta polytechnica proof. we start with the relation δ(t + ln t) = δ(t−w0(1))1+1/w0(1) because the root of the equation t + ln t = 0 is equal to w0(1). then the laplace transform of the function f(t) is equal to exp(−xw0(1)) = w0(1)x because w0(1) exp w0(1) = 1. lemma 6.2. we have γ(y) = ∫ ∞ y w0(1)x(x−y)y−1 dx, y > 0. (6.1) proof. we have (iy−f)(y) = 1 γ(y) ∫ ∞ y w0(1)x(x−y)y−1 dx = ∫ ∞ 0 e−ytt−yδ ( t−w0(1) ) dt = 1, because w0(1) exp w0(1) = 1. finally we get (6.1). after substituting u = x−y in (6.1) we obtain its mellin transform form γ(y) = w0(1)y ∫ ∞ 0 w0(1)uuy−1 du, (6.2) which holds not only for y > 0 but also for 0, see [16, entry 3.1, p. 25]. the usual mellin transform formula for the gamma function can be obtained from (6.1): γ(y) = ∫ ∞ y w0(1)(x−y)y−1 dx = ∫ ∞ 0 w0(1)u/w0(1)uy−1 du = ∫ ∞ 0 e−uuy−1 du, using the linear substitution x = y + u/w0(1) and the relation ln(w0(1))/w0(1) = −1. this proof of (6.1) is independent from generalized functions [18]. 
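identity (6.1)–(6.2) is easy to verify numerically, since the omega constant is available in standard libraries. a minimal sketch (python with mpmath; the sampled values of y are arbitrary):

```python
from mpmath import lambertw, gamma, quad, inf

omega = lambertw(1).real   # the omega constant w0(1) = 0.56714...

def rhs(y):
    # right-hand side of (6.2): omega^y * int_0^inf omega^u u^(y-1) du
    return omega**y * quad(lambda u: omega**u * u**(y - 1), [0, inf])

for y in (0.5, 1.0, 2.5):
    print(y, rhs(y), gamma(y))   # the two columns agree
```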
equation (6.1) can be generalized for a complex argument: theorem 6.3. the function γ(z), 0 is an euler integral transform [17, p. 258] with parameter z of the function w0(1)u along path l, which is a half straight line starting at the point z = a + ic and is parallel to the x axis: γ(z) = ∫ l w0(1)u(u−z)z−1 du, 0. (6.3) proof. let u = x + ic, c ∈ (−∞,∞). then∫ l w0(1)u(u−z)z−1 du = ∫ ∞ a w0(1)x+ic(x−a)a+ic−1 dx = ∫ ∞ 0 w0(1)y+a+icya+ic−1 dy = w0(1)a+ic ∫ ∞ 0 w0(1)yya+ic−1 dy = γ(a + ic) = γ(z), (6.4) compare (6.2) or [19, entry 3.3, p. 21]. remark 6.4. constant w0(1) = 0.56714 · · · is known as the omega constant, see http://oeis.org/a030178. 6.2. diagonal restriction of some special functions many standard special functions may be represented as the laplace–mellin transform in (1.3). as examples, we mention the incomplete gamma function, the exponential integral and the macdonald function. the same representation occurs for a function specific for certain fields of physics. we mention here only the bickley function, known in neutron physics [11]. other applications, e.g., as thermonuclear reaction rate integrals [20, p. 371] are left for the reader to investigate. 6.2.1. incomplete gamma function we start with the formula γ(ν,y) = e−y γ(1 −ν) ∫ ∞ 0 eyt t−ν t + 1 dt = e−y γ(1 −ν)γ(ν) ∫ ∞ y exγ(0,x)(x−y)ν−1 dt, y > 0, ν < 1, (6.5) see [13, p. 87]. its diagonal restriction ν = y gives γ(y,y) = e−y γ(1 −y) ∫ ∞ 0 e−yt t−y t + 1 dt = e−y γ(y − 1) ∫ ∞ −∞ eyu w0(eu) (1 + w0(eu))2 du = e−y sin πy π ∫ ∞ y exγ(0,x)(x−y)y−1 dx, y ∈ (0, 1). (6.6) 312 http://oeis.org/a030178 vol. 54 no. 4/2014 fractional calculus and lambert function i on the other side, integral (6.1) can be split into two parts γ(y) = ∫ y(1+1/w0(1)) y w0(1)x(x−y)y−1 dx + ∫ ∞ y(1+1/w0(1)) w0(1)x(x−y)y−1 dx = γ(y,y) + γ(y,y), y > 0, (6.7) see [16, entries 3.2 and 3.3, p. 25] and (1.3), i.e., another representation of the diagonal restriction of both incomplete gamma functions. the second integral means that γ(y,y) is equal to the γ(y) times liouville–weyl fractional integral of the order y at the point y(1 + 1/w0(1)) of the function w0(1)x. 6.2.2. exponential integral the general exponential integral function is defined as [13, p. 132]: eν(y) = ∫ ∞ 1 e−ytt−ν dt = yν−1γ(1 −ν,y), 0, ν ∈ c. (6.8) this formula can be written in the form of the liouville–weyl fractional integral eν(y) = ∫ ∞ 0 e−ytt−νh(t− 1) dt = 1 γ(ν) ∫ ∞ y e−x x (x−y)ν−1 dx, y > 0, ν > 0. the integral on the right is the liouville–weyl fractional integral of the laplace transform of the shifted heaviside function h(t− 1). theorem 6.5. the general exponential integral eν(y) is a completely monotone function in variable y ∈ r+ for fixed ν and a completely monotone function in parameter ν ∈ r+ for fixed y ∈ r+. proof. the integral on the left of (6.8) is the laplace transform of the non-negative function in the interval t ∈ [0,∞) and ν > 0. this means, according to theorem 1.3, that eν(y) is a completely monotone function in variable y. after substituting t = ex into the integral in (6.8), we obtain eν(y) = ∫ ∞ 0 ex−ye x e−νx dx. this integral is the laplace transform of a positive function, and according to theorem 1.3 eν(y) is a completely monotone function in parameter ν. the diagonal restriction of eν(y) is ey(y) = ∫ ∞ 1 e−ytt−y dt = ∫ ∞ 1 e−yu w0(eu) 1 + w0(eu) du = yy−1γ(1 −y,y). (6.9) it is evident that ey(y) is a completely monotone function for y ∈ r+, because it is the laplace transform of the nonnegative function w0(eu)/(1 + w0(eu)). 6.2.3. 
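the diagonal restriction (6.9) can also be checked directly, because the upper incomplete gamma function is a standard library function. a minimal sketch (python with mpmath; illustration only):

```python
from mpmath import quad, gammainc, exp, inf

def lhs(y):
    # e_y(y) = int_1^inf e^(-y t) t^(-y) dt
    return quad(lambda t: exp(-y * t) * t**(-y), [1, inf])

def rhs(y):
    # y^(y-1) * gamma(1 - y, y), with the upper incomplete gamma function
    return y**(y - 1) * gammainc(1 - y, y)

for y in (0.5, 1.0, 3.0):
    print(y, lhs(y), rhs(y))   # the two columns agree
```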
modified bessel function of the second kind (macdonald function) kν(x) we start with the relation ∫ ∞ 0 e−ytt−ν−1e−1/t dt = 2yν/2kν(2 √ y), y > 0, ν ≥ 0. (6.10) according to (2.2) we have for ν = y 2yy/2ky(2 √ y) = ∫ ∞ −∞ e−yu exp(−1/w0(eu)) 1 + w0(eu) du, y > 0, (6.11) or equivalently ky(2 √ y) = y−y/2 γ(y) ∫ ∞ y k0(2 √ x)(x−y)y−1 dx,y > 0, (6.12) because ∫ ∞ 0 e−yt e−1/t t dt = 2k0(2 √ y), y > 0. in classical fractional calculus, we have from (6.10) the relation kν(2 √ y) = y−ν/2 γ(ν) ∫ ∞ y k0(2 √ x)(x−y)ν−1 dx, y > 0, ν > 0, (6.13) 313 vladimír vojta acta polytechnica i.e., the modified bessel function of the second kind with index ν is given by the liouville–weyl fractional integral of order ν of the modified bessel function of the second kind with zero index. particularly if ν = 1 we obtain k1(2 √ y) = y−1/2 ∫ ∞ y k0(2 √ x) dx, y > 0. due to the semigroup property of the fractional integrals (additive index law) [1] (iν+µ− f)(y) = ( iν−(i µ −f) ) (y), ν > 0, µ > 0, providing that f (x) ∈ l1loc(0,∞) (space of locally integrable functions), we can rewrite (6.13) easily in the form kν+µ(2 √ y) = y−(ν+µ)/2 γ(ν) ∫ ∞ y xµ/2kµ(2 √ x)(x−y)ν−1 dx = y−(ν+µ)/2 γ(µ) ∫ ∞ y xν/2kν(2 √ x)(x−y)µ−1 dx, y > 0, ν > 0. (6.14) 6.2.4. bickley function the bickley function of order ν is defined by the fractional integral [11, entry 10.43.11, p. 259] kiν(y) = 1 γ(ν) ∫ ∞ y ki 0(x)(x−y)ν−1 dx, y > 0, ν > 0, (6.15) where ki 0(x) = k0(x). because k0(y) = ∫ ∞ 1 e−yu √ u2 − 1 du, 0, it holds that the bickley function is given by the laplace–mellin transform kiν(y) = ∫ ∞ 1 e−yuu−ν √ u2 − 1 du, 0, ν > 0. (6.16) from (6.16), it is evident that the bickley function is completely monotone in y for fixed ν and completely monotone in ν for fixed y. diagonal restriction of (6.16) in the form of the pure laplace or mellin transform can be obtained very simply, and is not introduced here. instead the mellin transform pair in standard notation for the bickley function is elicited. theorem 6.6. the mellin transform of the bickley function is given by the formula∫ ∞ 0 yp−1kiν(y) dy = √ π 4 γ(p)γ(p2 + ν 2 ) γ(p2 + ν 2 + 1 2 ) ,
Re p > 0, ν > 0. (6.17)

proof. we have
$$\int_0^\infty y^{p-1}\,\operatorname{Ki}_\nu(y)\,\mathrm{d}y
= \int_1^\infty \frac{u^{-\nu}}{\sqrt{u^2-1}} \int_0^\infty y^{p-1} e^{-yu}\,\mathrm{d}y\,\mathrm{d}u
= \Gamma(p) \int_1^\infty \frac{u^{-\nu-p}}{\sqrt{u^2-1}}\,\mathrm{d}u
= \sqrt{\frac{\pi}{4}}\, \frac{\Gamma(p)\,\Gamma\!\left(\frac{p}{2}+\frac{\nu}{2}\right)}{\Gamma\!\left(\frac{p}{2}+\frac{\nu}{2}+\frac{1}{2}\right)},$$
Re p >
0, ν > 0, where (6.16) and the fubini theorem have been applied. the fact that the region of holomorphy of the mellin transform (6.17) is unbounded from above unveils that mn = ∫ ∞ 0 ynkiν(y) dy = √ π 4 γ(n + 1)γ(n2 + ν 2 + 1 2 ) γ(n2 + ν 2 + 1) , ν > 0, n = 0, 1, 2, . . . is the n-th stieltjes moment of the bickley function kiν(y). 314 vol. 54 no. 4/2014 fractional calculus and lambert function i 6.3. integral transform pairs containing the lambert function we use (2.4) from corollary 2.4 and (2.5) from corollary 2.5. example 6.7. let h(t) = t. then∫ ∞ 1 e−yuw0(eu) du = ∫ ∞ e u−y−1w0(u) du = ∫ ∞ 1 e−ytt−y(1 + t) dt = ey(y) + e−y y , 0, (6.18)∫ ∞ 0 e−yuw0(eu) du = ∫ ∞ 1 u−y−1w0(u) du = ∫ ∞ w0(1) e−ytt−y(1 + t) dt = w0(1) w0(1)−yey(yw0(1)) + 1 y , 0,∫ ∞ −∞ e−yuw0(eu) du = ∫ ∞ 0 u−y−1w0(u) du = ∫ ∞ 0 e−ytt−y(1 + t) dt = yy−2γ(1 −y), 0 < 0, u > e (6.20) and inversion in (6.19) gives w0(u) = 1 2iπ ∫ c+i∞ c−i∞ uyyy−2γ(1 −y) dy, c ∈ (0, 1), u ≥ 0. (6.21) this is a representation of w0(u) in the form of complex integrals. it should be noted, however, that the integral in (6.20) defines a function different from the function defined by the integral in (6.21). the two functions are equivalent for u > e. but (6.21) is equal to w0(u) also for 0 < u < e, whereas (6.20) is equal to zero for 0 < u < e and to w0(e)/2 for u = e. this is typical for integration along the bromwich contour. equation (6.18) is compelling also for another reason given by the following conjecture that links the lambert function and the residue of another function: conjecture 6.8. function uy(ey(y) + e−y)/y of complex variable y ∈ c has for fixed u ∈ c only one singular point y = 0 in which this function has non-removable point singularity. if this is true, then w0(u) = res ( uy ey(y) + e−y y ,y = 0 ) , u ≥ e. (6.22) this formula is a consequence of the cauchy residue theorem and the jordan lemma because the bromwich contour in (6.20) can be closed to the left for u > e and c > 0. indeed, because eν(y) ∝ e−y/y, y → ∞ independently on ν, it holds that uy(ey(y) + e−y)/y ∝ e−y/y, y →∞ and |uy(ey(y) + e−y)/y|→ 0 for |y|→∞, u > e, 0, c ∈ (0, 1). (6.23) proof. perform the laplace transform of (6.21) and change the order of integration according to the fubini theorem. then the laplace transform of the function φ(u) = uy is p−1−yγ(1 + y). the integrand in the second integral on the right side is given by the reflection formula γ(1 + y)γ(1 −y) = πy/ sin πy. both complex integrals on the right side of (6.23) are de facto bromwich inversion integrals of the mellin transform and converge for all complex p except p = 0, while the laplace integral on the left side converges for
Re p >
0 only. we thus have the analytic continuation of the function defined by the laplace integral on the left to the region c\{0}. 315 vladimír vojta acta polytechnica example 6.10. let h(t) = t/(1 + t). then∫ ∞ 1 e−yu w0(eu) 1 + w0(eu) du = ∫ ∞ e u−y−1 w0(u) 1 + w0(u) du = ∫ ∞ 1 e−ytt−y dt = ey(y) = yy−1γ(1 −y,y), 0,∫ ∞ 0 e−yu w0(eu) 1 + w0(eu) du = ∫ ∞ 1 u−y−1 w0(u) 1 + w0(u) du = ∫ ∞ w0(1) e−ytt−y dt = yy−1γ ( 1 −y,yw0(1) ) , 0,∫ ∞ −∞ e−yu w0(eu) 1 + w0(eu) du = ∫ ∞ 0 u−y−1 w0(u) 1 + w0(u) du = ∫ ∞ 0 e−ytt−y dt = −yyγ(−y), 0 < 0,∫ ∞ 0 e−yu 1 + w0(eu) du = ∫ ∞ 1 u−y−1 1 + w0(u) du = ∫ ∞ w0(1) e−yt t−y t dt = yyγ ( −y,yw0(1) ) , 0. moreover function 1/(1 + w0(eu)) is also a laplace transform [14]:∫ ∞ 0 e−uxx−x sin πx γ(x) dx = π 1 + w0(eu) , u > −1. this means that the following stieltjes transform pair holds:∫ ∞ 0 x−xe−x sin πx γ(x) y + x dx = πeyyyγ(−y,y), y ∈ c\ (−∞, 0], (6.24)∫ ∞ 0 x−x sin πx γ(x) y + x dx = πyyγ(−y,yw0(1)), y ∈ c\ (−∞, 0]. (6.25) we now take the finite laplace transform qx(y) = ∫ x+ln x 1 e−yu 1 + w0(eu) = ∫ x exp x e u−y−1 1 + w0(u) du = ∫ x 1 e−yt t−y t dt = yy ( γ(−y,y) − γ(−y,xy) ) , x > 0. because the finite laplace transform is an entire function in the transform variable y it holds that qx(0) =∫x 1 1 t dt = ln x, which means that lim y→0 yy ( γ(−y,y) − γ(−y,xy) ) = ln x, x > 0, (6.26)∫ a 1 1 1 + w0(eu) du = ln w0(ea) = a−w0(ea). (6.27) example 6.12. let h(t) = t/(1 + t)3. then∫ ∞ −∞ e−yu w0(eu) (1 + w0(eu))3 = ∫ ∞ 0 u−y−1 w0(u) (1 + w0(u))3 du = ∫ ∞ 0 e−yt t−y (1 + t)2 dt = yyγ(1 −y), 0 < 0, where l(u) = 1 π ∫ ∞ 0 e−uxx−x sin πx dx, u > −∞ is the landau p.d.f. for example, the left hand side yyγ(1 −y), 0 < ln 2. this example originates from the theory of the distribution of prime numbers [21] and consists in calculating the integral∫ ∞ ln 2 t−(y−1)e−(y−1)t dt = ∫ ∞ 2 ln 2 u−y−1 exp w0(u) ( w0(u) )2 1 + w0(u) du = (y − 1)y−2γ ( 2 −y, (y − 1) ln 2 ) , 1. (6.29) example 6.14. let h(t) = sin t. the task is to calculate∫ ∞ −∞ e−yu sin w0(eu) du = ∫ ∞ 0 u−y−1 sin w0(eu) du. according to corollary 2.4, these integrals are equal to∫ ∞ 0 e−ytt−y (1 + t) sin t t dt = − 1 2 ( (y − i)y−1 + (y + i)y−1 ) γ(−y) = −(y2 + 1)(y−1)/2γ(−y) cos ( (y − 1) arctan 1 y ) , 0 < 0. (8.1) then integral (8.1) is equivalent to g(y) = ∫ ∞ 0 e−ytt−ν(t)f(t) dt, (8.2) providing that f(x) is a unilateral laplace transform of function f(t). example 8.1. let ν(y) = 12 sin y = s(y) and f(t) = 1. then we obtain∫ ∞ 0 e−ytt−s(y) dt = ys(y)−1γ ( 1 −s(y) ) , y > 0, (8.3) which represents values of the fractional integral (8.1) of the function f(x) = 1/x (i.e., f(t) = 1, t > 0) of the order ν(y) = 12 sin y in regions where the sinusoid has non-negative values and the fractional derivative of the same order of the function f(x), otherwise [23]. the connection to the lambert function for a general variable order fractional integral is lost, but for the linear order ν(y) = a + by ≥ 0, equation (8.2) has the laplace transform form∫ ∞ 0 e−ytt−1−byf(t) dt = ∫ ∞ 0 e−y(t+b ln t)t−af(t) dt = b−a ∫ ∞ −∞ e−yuf ( bw0(eu/b/b) )(w0(eu/b/b))1−a 1 + w0(eu/b/b) du, y > 0. (8.4) in the case that a < 0 and b > 0 there exists a critical point c = −a/b such that for y > c, equation (8.4) represents the liouville–weyl fractional integral and for y < c the liouville–weyl fractional derivative [23]. the same critical point c = −a/b exists for a > 0 and b < 0. in that case (8.4) represents fractional integral for y < c and fractional derivative for y > c. example 8.2. let f(t) = sin t, a = −3, b = 1. 
then f(x) = 1/(1 + x2) and c = 3. then integral (8.4) gives:∫ ∞ −∞ e−ytt3−y sin t dt = −(y2 + 1)(y−4)/2γ(4 −y) sin ( (y − 4) arctan 1 y ) . this gives for y < 3 the liouville–weyl fractional derivative and for 3 < y < 5 the fractional integral of the function 1/(1 + x2) both of the variable order |−3 + y| at point y. it must be emphasized, however, that the case b < 0 essentially changes the situation. this is the topic of [23]. remark 8.3. we call the transform ∫∞ 0 e −ytt−yf(t) dt, y > 0, the anti-diagonal laplace–mellin transform because the term diagonal laplace–mellin transform is reserved for the transformation ∫∞ 0 e −yttyf(t) dt, y > 0, which is closely related to the fractional derivative [23]. 9. conclusion this paper has focused on the simplest form of the variable order liouville–weyl fractional integral, where the order ν(y) = y. the liouville–weyl fractional integral is not so often used in physical and technical applications as the riemann–liouville integral, but the liouville–weyl fractional integral fulfills the relation of (1.3). this makes it possible to take advantage of the laplace transform or the mellin transform, and to find a connection to the lambert function. numerical calculations have been performed with the aid of mathematica® ver. 8.0.1.0. analytical calculations were checked by the same version of mathematica. in several cases, the formulas generated by mathematica have been used instead of the equivalent formulas presented in [12, 16, 19]. acknowledgements the author is grateful to both referees for their valuable comments and suggestions. 318 vol. 54 no. 4/2014 fractional calculus and lambert function i references [1] kilbas a.a. et al. theory and applications of the fractional differential equations, amsterdam: elsevier, 2006. [2] corless r.m. et al. on the lambert w function, adv. comput. math. 5, 1996, p. 329–359. [3] zayed a.i. handbook of function and generalized function transformations, boca raton: crc press, 1996. [4] yuerekli o. identities on fractional integrals and various integral transforms, appl. math. comput. 187, 2007, p. 559 566. [5] jorgenson j.,a., lang s. basic analysis of regularized series and products, berlin: springer-verlag, 1993. [6] schilling r.l. et al. bernstein functions, berlin: de gruyter, 2010. [7] christensen r.m. theory of viscoelasticity, new york: academic press, 1982. [8] samko s. g., ross b. integration and differentiation to a variable fractional order, integral transforms spec. funct. 1, 1993, p. 277–300. [9] sun h.g. et al. a comparative study of constant-order and variable-order fractional models in characterizing memory property of systems, eur. phys. j. special topics 193, 2011, p. 185-192. [10] lepage w.r. complex variables and the laplace transform for engineers, new york: dover, 1980; van der pol b., bremmer h. : operational calculus based on two-sided laplace integral, london: cambridge university press, 1964. [11] olver f.w.j. et al. nist handbook of mathematical functions, nist and cambridge university press, 2010. [12] erdélyi a. et al. tables of integral transforms, vol. ii, new york: mcgraw-hill, 1954. [13] chaudhry m.a., zubair s.m. on a class of incomplete gamma functions with applications, boca raton: chapman & hall/crc, 2002. [14] vojta v. in memory of alois apfelbeck: an interconnection between cayley–eisenstein–pólya and landau probability distributions, acta polytechnica 53, (2), 2013, p. 63–69. [15] roberts k.l. on fractional integrals equivalent to a constant, can. math. bull. 25, 1982, p. 
335–338. [16] oberhettinger f. tables of mellin transforms, new york: springer-verlag, 1974. [17] hille e. ordinary differential equations in the complex domain, mineola,new york: dover publications, 1997. [18] semrád i. private communication. [19] oberhettinger f., badii l. tables of laplace transforms, new york: springer-verlag, 1973. [20] mathai a.m., haubold h.j. special functions for applied scientists, new york: springer science, 2008. [21] chernoff p.r. a pseudo zeta function and the distribution of primes, proc. natl. acad. sci. usa 97, (14), 2000, p. 7697–7699. [22] kuczma m. et al. iterative functional equations, cambridge: cambridge university press, 1990. [23] vojta v. fractional derivatives and lambert function, in preparation. 319 acta polytechnica 54(4):305–319, 2014 1 introduction 2 diagonal fractional integrals 3 inversion of the diagonal fractional integral 3.1 examples 3.1.1 example of the laplace variant 3.1.2 example of the mellin variant 4 fixed point 5 complex domain 6 applications 6.1 gamma function 6.2 diagonal restriction of some special functions 6.2.1 incomplete gamma function 6.2.2 exponential integral 6.2.3 modified bessel function of the second kind (macdonald function) k nu(x) 6.2.4 bickley function 6.3 integral transform pairs containing the lambert function 7 eigenproblem 8 generalization 9 conclusion acknowledgements references acta polytechnica doi:10.14311/ap.2014.54.0028 acta polytechnica 54(1):28–34, 2014 © czech technical university in prague, 2014 available online at http://ojs.cvut.cz/ojs/index.php/ap design theory for the pressing chamber in the solid biofuel production process monika kováčová∗, miloš matúš, peter križan, juraj beniak slovak university of technology in bratislava, faculty of mechanical engineering, námestie slobody 17, 81231 bratislava, slovakia ∗ corresponding author: monika.kovacova@stuba.sk abstract. the quality of a high-grade solid biofuel depends on many factors, which can be divided into three main groups — material, technological and structural. the main focus of this paper is on observing the influence of structural parameters in the biomass densification process. the main goal is to model various options for the geometry of the pressing chamber and the influence of these structural parameters on the quality of the briquettes. we will provide a mathematical description of the whole physical process of densifying a particular material and extruding it through a cylindrical chamber and through a conical chamber. we have used basic mathematical models to represent the pressure process based on the geometry of the chamber. in this paper we try to find the optimized parameters for the geometry of the chamber in order to achieve high briquette quality with minimal energy input. all these mathematical models allow us to optimize the energy input of the process, to control the final quality of the briquettes and to reduce wear to the chamber. the practical results show that reducing the diameter and the length of the chamber, and the angle of the cone, has a strong influence on the compaction process and, consequently, on the quality of the briquettes. the geometric shape of the chamber also has significant influence on its wear. we will try to offer a more precise explanation of the connections between structural parameters, geometrical shapes and the pressing process. the theory described here can help us to understand the whole process and influence every structural parameter in it. 
keywords: densification process, numerical optimalization for structural parameters, mathematical model for cone chamber, mathematical model for cylindrical chamber. 1. introduction current european legislation set targets for using renewable energy sources, which will result in the gradual replacement of fossil fuels. biomass is the most promising renewable energy source, and offers the most effective options for energy storing. this leads to a need to carry out research in the area of processing biomass and transforming it into a high-grade solid biofuel. the compaction process can affect the mechanical quality indicators of biofuels, especially their density and mechanical resistance. the geometry of the pressing chamber has an enormous impact on the quality of the briquettes and on the required press pressure. it is therefore appropriate to work on optimizing the geometry of the pressing chamber in order to achieve high briquette quality together with minimum energy input. 2. structural parameters in the densification process the quality of solid high-grade biofuels depends on many factors, which can be divided into three groups: • material parameters, • technological parameters, • structural parameters. the material parameters affecting the quality of briquettes are mostly linked to the characteristics of the starting material (material strength, composition etc.) and some physical constants. the technological parameters (humidity, size of the compression pressure, pressing temperature, pressing speed, etc.) can dramatically affect the process of compaction and the quality of the briquettes. however, the structural parameters have a special place in the pressing process, since the successful production of high-quality briquettes involves synergies between all the groups. the main structural parameters affecting the quality of briquettes are: • the diameter of the pressing chamber, • the length of the pressing chamber, • the convexity of the pressing chamber. only a limited amount of work is currently being done on mathematical descriptions of the biomass briquetting and pelleting process, the influence of the parameters of the process on the final quality of the briquettes, and descriptions of the effects of pressure in the pressing chamber. there are no complete mathematical models that deal mainly with the impact of structural parameters on the pressing process. it is clear that a detailed study of the impact of all of structural parameters on the pressing process and the resulting quality of briquettes is a very extensive 28 http://dx.doi.org/10.14311/ap.2014.54.0028 http://ojs.cvut.cz/ojs/index.php/ap vol. 54 no. 1/2014 design theory for pressing chamber figure 1. specification of the geometry of the pressing chambers in the process of pelleting biomass: a) normal, b) deep, c) flat d) well, e) cylindrical, f) conical, g) stepped. figure 2. pressing chamber geometry of the screw briquetting press. undertaking, and requires a detailed analysis of this issue. the most significant influence on the pressing process is from the geometric parameters of the pressing chamber, i.e. the shape and dimensions (diameter, length and convexity of the chamber). the geometry of the pressing chambers currently used for producing solid biofuels is very diverse. it consists of a cylindrical part, in most cases also with a conical part. there is often a combination of several cylindrical and conical parts (figure 1, figure 2). 
the length of the cylindrical part provides the necessary back pressure by the friction part of the force. it also provides biofuels with a high-quality smooth surface. the conical part of the chamber provides spatial movement of the particles and a higher degree of compaction, resulting in higher production quality. when the material is extruded through the conical part of the chamber, the briquettes are given greater density and strength. however, the friction and pressing conditions in the conical chamber greatly increase the required press pressure. the shape and the size of the pressing chamber have a direct impact on the production quality and on the size of the required compression pressure. it is therefore necessary to provide a mathematical description of the whole physical process of densifying a particular material and extruding it through a cylindrical chamber, a conical chamber and also a combined chamber. the mathematical models describing the pressure conditions that are presented here form the basis of our study of the geometry of the chamber. our study focuses on optimizing the chamber geometry in order to achieve high briquette quality together with minimum energy input. figure 3. forces in cylindrical pressing chamber. 3. mathematical background — a cylindrical chamber we will use the following notation in this paper, see figure 3: • dx — height of an infinitesimal cylinder, dx > 0; • d — cylinder diameter; • sv — area of the bottom of the cylinder; • s — surface area of the cylinder; • pa — axial pressure • pr — radial pressure • dpa — the pressure change between the top and bottom base, dpa < 0. we suppose that f1 > f2. based on force equilibrium, we can state the following equation: f1 − f2 − f3 = 0. by simple computation we can state that the force acting on the top base is f1 = pasv = paπ (d 2 )2 = pa πd2 4 , and the force acting on the bottom base is f2 = (pa + dpa)sv = (pa + dpa)π (d 2 )2 = (pa + dpa) πd2 4 . 29 m. kováčová, m. matúš, p. križan, j. beniak acta polytechnica the friction force f3 is a special case. we will use the coefficient of support friction and the axial force to evaluate f3: f3 = µfn = µpes = µprπddx. we know that the radial pressure and the axial pressure should be connected with their horizontal compacting ratio λ: λ = σr σa = pr pa , where σr is the radial stress and σa is the axial stress. so we have f3 = µfn = λpaπddx. based on equilibrium of forces, we can derive the differential equation for pressure changes between the two bases of the cylinder: f1 − f2 − f3 = 0, pa πd2 4 − (pa + dpa) πd2 4 −µλpaπddx = 0. let us suppose that dx → 0 and dpa → 0, then d 4 dpa dx + µλpa = 0. (1) the axial pressure depends on the place, so we need to locate our cylinder on the axes. based on this, we are able to rewrite the axial pressure to the function relation d 4 p′a(x) + µλpa(x) = 0. hence we have a linear differential equation with constant coefficients, and we can find its solution in the form pa(x) = c1e−kx, wheree k is a constant given by k = 4µλ d . the result is in accordance with the physical principle that pressure decreases according to distance from the origin of the coordinate. based on our coordinate system, we have: • x = 0 — the start position of pressing chamber between compactor and material, • x = l — the start position of pressing chamber. 
thus we are also able to compute the cauchy problem with the initial conditions pa(x) = pap, where pap is the constant pressure of the compactor on the material throughout the pressing phase: pa(x) = c1e−kx =⇒ pa(0) = c1e0 = pap, pap = c1, pa(x) = pape− 4µλ d x. (2) figure 4. forces in a conical pressing chamber. the outgoing pressure on position l can be computed: pa(l) = pape− 4µλ d l. we are also able to express the incoming pressure pap in terms of the outgoing pressure: pap = pa(l)e+ 4µλ d l. 4. mathematical background — a truncated cone chamber a truncated cone is a more complicated case than the classical cylinder. simply speaking, the cylinder is only a special case of the truncated cone, with the elevation angle α = 0. we will use the same ideas and the same notation as for the cylinder — see figure 4. in the case of a truncated cone, the force equilibrium will change: f1 − f2 − cos α f3 = 0. the direction of friction force f3 contains elevation angle α with the direction of forces f1 and f2. so we can write pas1 − (pa + dpa)s2 − cos α f3 = 0, where s1 is the area of the top case and s2 is the area of the bottom case. the same coordinate system is used as for the cone. by simply computation we can state that the force acting on the top base is f1 = pas1 = paπ (d2 + 2v 2 )2 , where d2 is the diameter of the bottom case and v is the width of ring of the top case. the force acting on the bottom base is f2 = (pa + dpa)s2 = (pa + dpa)π (d2 2 )2 = pa πd22 4 + dpa πd22 4 . 30 vol. 54 no. 1/2014 design theory for pressing chamber figure 5. essential dimensions of elementary truncated cone. then we have paπ (d2 + 2v 2 )2 −pa πd22 4 −dpa πd22 4 = f3 cos α. (3) by simplification we get f3 = π 4 cos α ( pa ( (d2 + 2v)2 −d22 ) −dpa d22 ) = µfn, where fn is the normal force. now we will try to find a proper evaluation for the friction force. we need first of all to compute the surface area of the cone. we have sv = π (d1 2 + d2 2 ) s, where d1 and d2 are the diameters of the top and bottom cases of the cone, and s is the length of the lateral surface. based on figure 5, we can compute sin α = d1 −d2 s or cos α = dx s , where s = d sin α . hence for the surface area of the cone we have sv = π (d1 2 + d2 2 ) s = π (d1 2 + d2 2 ) dx cos α . then f3 = µfn = µprπ (d1 2 + d2 2 ) dx cos α . we know that the radial pressure and the axial pressure should be connected with their horizontal compacting ratio λ. in the case of a truncated cone, the situation is slightly different. radial pressure pr is perpendicular to the lateral surface, so the horizontal ratio must make provision for this: λ = σr σa cos α = pr pa cos α, where σr is radial stress and σa is axial stress. so we have λ = pr pa cos α and pr = λpa cosα . finally, we have f3 = µfn = µλpa 1 cos α π (d2 + 2v 2 + d2 2 ) dx cos α . (4) let us go back to the equilibrium state equation. from (3) and (4) we have π 4 cos α ( pa(d2 + 2v)2 −pad22 −dpa d 2 2 ) = µλpa 1 cos α π (d2 + 2v 2 + d2 2 ) dx cos α . by simple computation we have pa ( (d2 + 2v)2 −d22 ) −dpa s22 = 4µλ cos α pa (d2 + 2v 2 + d2 2 ) dx. we can express the ratio tan α = v/dx. it implies v = tan αdx. the left-hand side should be simplified: pa ( (d2 + 2v)2 −d22 ) −dpa d22 = pa(d22 + 2vd2 + 4v 2 −d22) −dpa d 2 2 = pa(2 tan α dxd2 + 4 tan2 αdx2) −dpa d22. the infinitesimal element should be considered as sufficiently small, so we can omit the term 4 tan2 αdx2. 
then we have
$$p_a (2\tan\alpha\,\mathrm{d}x\, d_2) - \mathrm{d}p_a\, d_2^2 = \frac{4\mu\lambda}{\cos\alpha}\, p_a \left(\frac{d_2+2v}{2} + \frac{d_2}{2}\right) \mathrm{d}x,$$
$$-\mathrm{d}p_a\, d_2^2 = -p_a (2\tan\alpha\, d_2)\,\mathrm{d}x + \frac{4\mu\lambda}{\cos\alpha}\, p_a \left(\frac{d_2+2v}{2} + \frac{d_2}{2}\right) \mathrm{d}x.$$
let us suppose that dx → 0 and dpa → 0, then
$$\frac{\mathrm{d}p_a}{\mathrm{d}x}\, d_2^2 = p_a (2\tan\alpha\, d_2) - \frac{4\mu\lambda}{\cos\alpha}\, p_a \left(\frac{d_2+2v}{2} + \frac{d_2}{2}\right).$$
the axial pressure depends on the place, so we need to locate our cylinder on the axes. based on this, we are able to rewrite the axial pressure pa to the function relation
$$\frac{\mathrm{d}p_a(x)}{\mathrm{d}x}\, d_2^2 = p_a(x)\,(2\tan\alpha\, d_2) - \frac{4\mu\lambda}{\cos\alpha}\, p_a(x) \left(\frac{d_2+2v}{2} + \frac{d_2}{2}\right).$$
hence we have a linear differential equation with a constant coefficient:
$$d_2^2\, p_a'(x) + \left(\frac{4\mu\lambda}{\cos\alpha}\left(\frac{d_2+2v}{2} + \frac{d_2}{2}\right) - 2\tan\alpha\, d_2\right) p_a(x) = 0$$
and we can find its solution in the form $p_a(x) = c_1 e^{-kx}$, where k is a constant given by
$$k = \frac{2\sec\alpha\,(2 d_2 \lambda\mu + 2 v \lambda\mu - d_2 \sin\alpha)}{d_2^2}.$$
the result is in accordance with the physical principle that the pressure decreases according to the distance from the origin of the coordinate.

figure 6. cylindrical chamber: outgoing pressure along the chamber for λ = 0.15 and λ = 0.25, pap = 140 mpa.

based on our coordinate system, we have:
• x = 0 — start position of pressing chamber,
• x = l — end position of pressing chamber.
thus we are also able to compute the cauchy problem with the initial condition pa(0) = pap, where pap is the constant pressure of the compactor on the material during the whole pressing phase:
$$p_a(x) = c_1 e^{-kx} \;\Longrightarrow\; p_a(0) = c_1 e^0 = p_{ap}, \qquad p_{ap} = c_1,$$
$$p_a(x) = p_{ap}\, \exp\!\left(-\frac{2\sec\alpha\,(2 d_2\lambda\mu + 2v\lambda\mu - d_2\sin\alpha)}{d_2^2}\, x\right). \quad (5)$$
the outgoing pressure at position l can be computed:
$$p_a(l) = p_{ap}\, \exp\!\left(-\frac{2\sec\alpha\,(2 d_2\lambda\mu + 2v\lambda\mu - d_2\sin\alpha)}{d_2^2}\, l\right).$$
we are also able to express the incoming pressure pap in terms of the outgoing pressure:
$$p_{ap} = p_a(l)\, \exp\!\left(+\frac{2\sec\alpha\,(2 d_2\lambda\mu + 2v\lambda\mu - d_2\sin\alpha)}{d_2^2}\, l\right).$$
as was mentioned above, the cylinder is only a special case of the cone, so in the case of angle α = 0 the result of (5) should be the same as in (2):
$$p_a(x) = p_{ap}\, e^{-\frac{2(2 d_2\lambda\mu + 2v\lambda\mu)}{d_2^2} x} = p_{ap}\, e^{-\frac{4\mu\lambda (d_2 + 2v)}{d_2^2} x} = p_{ap}\, e^{-\frac{4\mu\lambda\, d_1}{d_2^2} x}.$$
if diameters d1 and d2 are the same (d1 = d2 = d), we have
$$p_a(x) = p_{ap}\, e^{-\frac{4\mu\lambda\, d}{d^2} x} = p_{ap}\, e^{-\frac{4\mu\lambda}{d} x}.$$

5. numerical experiments

the exact expressions for a conical chamber and for a cylindrical chamber have been described above. now we will deal with some simple numerical experiments. as has been shown, a linear differential equation was used in both cases to describe the mathematical model. in the case of a cylindrical chamber, the outgoing pressure at position l should be computed by the expression
$$p_a(l) = p_{ap}\, e^{-\frac{4\mu\lambda}{d} l}.$$

figure 7. cone case for α = 2°: outgoing pressure for λ = 0.15 and λ = 0.25, pap = 140 mpa.

figure 8. cylindrical shape – solid line, conical shape for α = 2° – dashed line.

in the case of the cone, the outgoing pressure should be computed by a more complicated expression:
$$p_a(l) = p_{ap}\, \exp\!\left(-\frac{2\sec\alpha\,(2 d_2\lambda\mu + 2v\lambda\mu - d_2\sin\alpha)}{d_2^2}\, l\right).$$
let us take the concrete example of a conical pressing chamber and a cylindrical pressing chamber. in the case of a cylindrical chamber, we will take d = 20 mm, l = 50 mm, µ = 0.35, and λ from the range λ ∈ [0.15, 0.25].
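the two pressure-decay laws can be compared directly in a few lines. the following sketch (python with numpy) is an illustration, not the authors' code; it evaluates (2) and (5) for the parameters of this example, taking the inlet pressure pap = 140 mpa and the cone angle α = 2° used in the next paragraphs.

```python
import numpy as np

p_ap = 140.0             # inlet pressure [MPa], from the worked example
d = 20.0                 # cylinder diameter [mm]
L = 50.0                 # chamber length [mm]
mu = 0.35                # friction coefficient
alpha = np.radians(2.0)  # cone angle
v = np.tan(alpha) * L    # ring width, v = tan(2 deg) * 50 mm = 1.746 mm
d2 = d - 2.0 * v         # cone outlet diameter, 16.5079 mm

x = np.linspace(0.0, L, 101)
for lam in (0.15, 0.25):
    p_cyl = p_ap * np.exp(-4.0 * mu * lam * x / d)          # eq. (2)
    k = 2.0 / np.cos(alpha) * (2.0 * d2 * lam * mu
        + 2.0 * v * lam * mu - d2 * np.sin(alpha)) / d2**2  # eq. (5)
    p_cone = p_ap * np.exp(-k * x)
    print(f"lambda={lam}: cylinder {p_cyl[-1]:.1f} MPa, cone {p_cone[-1]:.1f} MPa")
```

the printed outgoing pressures show the crossover described below: for λ = 0.15 the cone leaves a higher outgoing pressure than the cylinder, while for λ = 0.25 the ordering is reversed.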
we will suppose that pap = 140 mpa. then the outgoing press can be modeled by the graph in figure 6. for a conical chamber, the situation is more complicated. let us suppose α = 2° and the length of the pressing chamber is the same l = 50 mm. then by simple computation we can set v = tan 2° · 50 mm = 1.74604. let us assume similar conditions as in the previous case d1 = d = 20 mm. then d2 + 2v = d1 and d2 = 20 − 2 · 1.74604 = 16.5079. the result is in figure 7. simply stated, the shape of the pressure curve remains the same in both cases, but in the case of the cone the outgoing pressure is smaller for some parameters λ than in the case of a cylindrical pressing chamber. we can compare the two cases in one picture. we can see that for λ = 0.15 the outgoing pressure is greater in a conical shape, but the situation is completely different for λ = 0.25. in that case, it seems that the conical shape will be more effective. the conical shape is drawn with a dashed line in figure 8. 33 m. kováčová, m. matúš, p. križan, j. beniak acta polytechnica 6. conclusions this paper has presented mathematical models for describing the cylindrical part and the conical part of a pressing chamber. these models form the basis for our whole study in the field of densifying biomass into a solid biofuel. all these mathematical models allow us to optimize the geometry of the pressing chamber and the energy input of the process, to control the final quality of the briquette, and to wear to the chamber. practical results show that reducing the diameter and the length of the chamber and the angle of the cone have a direct influence on the compacting mechanism and, as a consequence, on the quality of the briquettes. the geometry of the chamber also has a significant influence on its wear. until now, the geometry of the chamber has been designed mostly empirically, without any research. however, the theory described here can help to understand whole process and influence every structural parameter in the process. the next step in our research leads toward a mathematically optimized chamber geometry together with minimum energy input (minimal pressure). acknowledgements this paper is an outcome of the project “development of progressive biomass compacting technology and production of prototype and high-productive tools” (itms project code: 26240220017), which received funding from the european regional development fund’s research and development operational programme references [1] mathews, j h. – fink, k d. numerical methods using matlab. upper saddle river: pearson prentice hall, 2004. pp. 680. isbn 0-13-191178-3. [2] hoffman, j d. numerical methods for engineers and scientists. new york: mcgraw-hill, 1993. pp. 825. isbn 0-07-029213-2. [3] horrighs, w. determining the dimensions of extrusion presses with parallel wall die channel for the compaction and conveying of bulk solids, no. 12/1985 – aufbereitungs technik. [4] matúš, m. – križan, p.: influence of structural parameters in compacting process on quality of biomass pressing. in: aplimat journal of applied mathematics. issn 1337-6365. vol. 3, no. 3 (2010), pp. 87-96. [5] križan, p. – šooš, ľ. – matúš, m. – svátek, m. – vukelič, d.: evaluation of measured data from research of parameters impact on final briquettes density. in: aplimat journal of applied mathematics. issn 1337-6365. vol. 3, no. 3 (2010), pp. 68-76 [6] križan, p. – šooš, ľ. – matúš, m.: optimisation of briquetting machine pressing chamber geometry. in: machine design. issn 1821-1259, 2010, s. 
19-24.

acta polytechnica doi:10.14311/ap.2013.53.0760 acta polytechnica 53(supplement):760–763, 2013

analysis of the macro experiment data to compare particle arrival times under gran sasso

francesco ronga∗
infn laboratori nazionali di frascati, frascati, italy
∗ corresponding author: francesco.ronga@lnf.infn.it

abstract. the claim of a neutrino velocity different from the speed of light, made in september 2011 by the opera experiment, suggested the study of the time delays between tev underground muons in the gran sasso laboratory using the old data of the macro experiment, which ended in 2000. this study can also give hints on new physics in the particle cascade produced by the interaction of a cosmic ray with the atmosphere.

keywords: neutrino velocity, new massive particles, tachyons, supersymmetry.

1. introduction

in september 2011 the opera experiment reported a measurement of neutrinos travelling faster than the speed of light by $(v-c)/c = 2.48 \pm 0.28(\mathrm{stat}) \pm 0.30(\mathrm{sys}) \times 10^{-5}$ [2]. after many checks, we know now that this result was due to hardware problems, and the opera 2012 result is that the speed of the neutrinos traveling from cern to the gran sasso is $(v-c)/c = -0.7 \pm 0.5(\mathrm{stat})\,^{+2.5}_{-1.5}(\mathrm{sys}) \times 10^{-6}$ [8]. this result is in agreement with the results of the other gran sasso experiments [6]. however, the interest in this claim suggested the possibility of comparing neutrino and muon velocities in a cosmic ray cascade [12]. the interaction of a primary cosmic ray with the atmosphere produces a cascade with many kinds of particles, in particular neutrinos and muons. muon neutrinos and muons are produced mainly via the decay of charged pions and kaons produced in the primary cosmic ray interactions. above about 10 tev they can also come from prompt decays of charmed hadrons. this component has not yet been observed. in a deep underground detector only muons and neutrinos are detected. if the neutrino velocity is different from c, the neutrinos in this cascade should arrive with times different from the times of the muons from the same parent decay, or from another decay, with a time delay that changes according to the neutrino path length, which depends on its zenith angle θ. in underground detectors muon neutrinos are detected by looking for induced muons produced by neutrino charged current interactions in the rock, or in the ice around or inside the instrumented region. hence, a time spread should be observed between the muons produced directly by the pion or kaon decay and the muons produced by neutrino interactions. the path length from the meson decay point is a few tens of kilometers for vertical neutrinos and up to ∼ 300 km for near horizontal neutrinos. assuming the original time difference observed in opera, nearly horizontal neutrinos should arrive up to 28 ns before the other secondaries. in [10] a table of average production heights of neutrinos in the atmosphere has been reported. the typical production height for neutrinos of energy above 20 gev can be 17.6 km at the vertical, 94.9 km at cos θ = 0.25 and 335.7 km at cos θ = 0.05, which would correspond to 1.4, 7.8 and 27.6 ns.
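these delays follow from dt = l·δ/c with the original opera value δ = (v − c)/c = 2.48 × 10⁻⁵. a back-of-envelope sketch (python; illustration only) approximately reproduces the quoted numbers:

```python
c = 299_792_458.0   # speed of light [m/s]
delta = 2.48e-5     # original opera (v - c)/c
for L_km in (17.6, 94.9, 335.7):   # heights for cos(theta) = 1, 0.25, 0.05
    dt_ns = L_km * 1e3 * delta / c * 1e9
    print(f"{L_km:6.1f} km -> {dt_ns:4.1f} ns")   # compare with 1.4, 7.8, 27.6 ns
```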
there are already limits on tachyons or anomalous delayed particles in cosmic rays. the limits were obtained by searching, for example, for signals before or after the main front of the electromagnetic shower. but searches of this kind stopped some time ago, and the last particle data book review of those data is the review of 1994 [11]. the limits obtained are of small interest in the framework of the opera result. however, if neutrinos were tachyons, it is likely that some other kind of tachyons could exist, and this search in very high energy cosmic rays could have a new interest. it is important to note that the gran sasso mountain minimum depth ∼ 2700 g/cm² corresponds to a minimum muon energy of 1.4 tev. it is easy to compute that, requiring a minimum threshold of 50 mev in the detector, the time difference between two muons underground should be ≤ 0.2 ns. therefore anomalous time differences should be a signal of “new physics”, for example a signal of supersymmetric massive particles produced in a cosmic ray cascade. for example, let us assume a hypothetical hadron of mass 100 gev, produced by an interaction of a proton with center of mass energy 7 tev (the lhc energy). if this hypothetical hadron interacts or decays after 10 km, producing muons at the end, the delay between the underground muon from the massive particle and the muon produced in the primary vertex is of the order of 13 ns. lhc experiments have put limits on new hadron-like massive particles [1, 7], but it is important to remember that the cosmic ray energy could be larger than the lhc energy. under gran sasso the fraction of multiple muons produced by cosmic rays with center of mass energy ≥ 7 tev is estimated to be of the order of 10⁻³ in macro, corresponding to several thousand multiple muon events in the macro data set.

figure 1. event with six parallel muons in 3 macro “supermodules”. at the top there is the full macro display, on the bottom the zoom of the 3 supermodules interested by the event. the 12 streamer tube horizontal planes are shown as horizontal lines, the black points are the streamer tubes fired; the scintillator boxes fired are shown as rectangles.

one should also consider the possibility that new massive relic particles are directly present in the primary cosmic radiation. the macro experiment made several searches for possible anomalies of the time differences between muons [3]. the search was made mainly to study time differences of the order of a few ms or more, but this paper contains also a study of time differences at the ns level. the statistics was limited to 35 832 tracks in events with two or more tracks. in 1992 no one was thinking about tachyonic neutrinos, and therefore there was no estimate of the number of tracks due to down-going neutrinos together with the primary muons. in [13], this study was extended to about 140 000 tracks of multi-muon events, corresponding to about 4 % of the total macro statistics. the time distribution was in agreement with the predictions. in this paper, i present an analysis of the full macro statistics. this was not an easy job. the main reason is that macro ended in 2000, and most of the analysis software was designed for vax/alphavax computers and data formats around 1990 (a geological era for computers!).
a lot of time was needed to convert programs and to find data files, sometimes stored on data tape cassette of old formats, obsolete and not supported by modern computers. 2. the macro experiment and the timing system the macro experiment [5] was located in hall b of the gran sasso underground laboratory. the modularity allowed data-taking also with partial configurations of the apparatus, starting from march 1989. the full detector was operative in the period april 1994– december 2000. macro was a large rectangular box (76.6 × 12 × 9.3 m3) divided longitudinally in 6 supermodules and vertically in a lower and an upper part (called attico). the active elements were liquid scintillator counters for time measurement and streamer tubes for tracking, with 27° stereo strip readouts. the lower half of the detector was filled with trays of crushed rock absorber alternating with streamer tube planes, while the attico was hollow and contained the electronics racks and work areas. the rock absorber sets a minimum energy threshold for vertical muons of 1 gev. the tracking system was designed to reconstruct the particle trajectory in different views (x–z for horizontal streamer tubes, d–z for horizontal strips, y–z for vertical streamer tubes combined with central hits). to perform this analysis the standard macro tracking software was improved to have greater efficiency for near horizontal tracks. the intrinsic angular resolution for muons typically ranges from 0.2° to 1° depending on track length. this resolution is lower than the angular spread due to multiple scattering of downward-going muons in the rock. the scintillator system consisted of horizontal and vertical layers of counters filled with a mixture of mineral oil (96.4 %), pseudocumene (3.6 %) and wavelength shifters (2.88 g/l). the counters had an active volume of 11.2×0.73×0.19 m3 in the horizontal planes and 11.1 × 0.22 × 0.46 m3 in the vertical planes. the total charge and the time of occurrence of the signals were measured at the two ends of each counter with two independent systems, the energy response processor (erp) and the pulse height recorder and synchronous encoder (phrase). the analysis described in this paper is based on erp data. the time and longitudinal position resolution for a single muon in a counter were about 0.6 ns and 12 cm, respectively. the photomultiplier signal is split into a direct output and one attenuated by a factor of 10, in order to be on-scale also for very large pulses. two different thresholds are used for the timing of these two outputs. the redundancy of the time measurement helps to eliminate spurious effects. each macro supermodule is connected to a dedicated independent erp system. the timing between the erp systems is insured by standard camac tdc. due to the random noise the possibility to have wrong times in the inter erp tdcs is quite high, and this is the main source of non-gaussian tails in the time distributions for events interesting different supermodules. 3. time differences in the macro muon bundles thanks to its large area and fine tracking granularity the macro detector was a proper tool for the study of multiple parallel muons. many papers were published by macro on this topic to study the muon multiplicity, the distance between muons and the impact on cosmic ray composition from the multiple muon measurement. the last macro paper on this argument is in [4]. 
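before turning to the timing analysis, it is worth noting that the 13 ns delay estimated in the introduction for a hypothetical 100 gev hadron follows from elementary kinematics. in the sketch below (python; illustration only), the energy assigned to the hadron, roughly half of the 7 tev centre-of-mass energy, is my assumption, chosen only to show the order of magnitude.

```python
import math

def delay_ns(m_gev, e_gev, path_km):
    # dt = (L/c) * (1/beta - 1), with beta = sqrt(1 - (m/E)^2)
    beta = math.sqrt(1.0 - (m_gev / e_gev) ** 2)
    return path_km * 1e3 / 299_792_458.0 * (1.0 / beta - 1.0) * 1e9

# a 100 gev hadron carrying ~3.5 tev (assumed) over 10 km -> ~13.6 ns
print(delay_ns(100.0, 3500.0, 10.0))
```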
one important number to consider is that the average distance between muon pairs is 〈r〉 ∼ 9.4 m for vertical tracks and average depth 3800 g/cm². the value of 〈r〉 changes slowly with depth and zenith angle.

figure 2. difference between track time and average time for multiple parallel muons with 2 or 3 tracks as a function of cos(θ). the dot size is proportional to the logarithm of the bin content. if the original opera claim had been correct, a few tracks would be expected inside the dashed region (see text).

figure 1 shows a typical multiple muon event. from this picture it is easy to understand one of the problems of this analysis: the large dimension of the scintillator boxes compared to the average value of 〈r〉. the probability of having more than one track intercepting the same counters is high. in this case, the time could be wrong, because the analysis software could fail to compute the light propagation time from the intercept of the track to the photomultipliers. this is the second source of non-gaussian tails in the time distributions (the first is the timing between different supermodules). for each track the analysis program computes β = v/c and the “track time” (the average of the scintillator times along the track). to remove noise, the analysis program uses only the scintillators in which the position along the scintillator computed from the time differences between the pms at the ends is in agreement with the position given by the streamer chambers. the analysis program then computes the differences between the “track times” and the average time of all the tracks in the bundle, including a correction due to the incidence angle. a 5° angular cut is applied to require parallel tracks. to have a valid track time a single scintillator is sufficient, but in the case of events spanning different supermodules there is the requirement that at least one track between two different supermodules has two scintillators and that the beta value is consistent with one. this is to reduce the noise due to the inter-erp tdcs.

figure 3. difference between track time and average time for multiple muons with 2 or 3 tracks (continuous line) compared with a “simulation” using the data (dashed line).

the calculation of the expected number of events, if the original opera claim was correct, is done considering the probability to have a neutrino and a muon from the same decay, computed in [12], and the probability to have a neutrino and a muon from different decays, computed using the approximated elbert formulas [9]. the detector and analysis efficiencies were evaluated using the standard multiple muon macro simulation software, with a modification to allow a delay in one of the muons. this calculation gives 2 delayed tracks with time delay |δt| ≥ 10 ns expected in the macro data set. in the case of events with a muon from a neutrino interaction, it is unlikely to have more than one muon directly from the hadronic cascade, so the analysis is limited to events with no more than 3 tracks (one track could be a spurious track). the results are in fig. 2. figure 2 also shows the times expected if the original opera result had been correct. considering the region with cos(θ) ≤ 0.2 and |δt| ≥ 10 ns, there is one event with two tracks with a track time minus average time of ∼ 22 ns (the dot of fig. 2 near the dashed arrow). however, this time is outside the opera region. in the opera region there are no tracks.
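the statistical weight of these counts is plain poisson bookkeeping. the short sketch below (python; illustration only) gives the chance of seeing no tracks when 2 are expected, and the 0.06 probability quoted in the next paragraph for the high-angle tail.

```python
from math import exp, factorial

def poisson_p(k, mu):
    # probability of observing exactly k events when mu are expected
    return exp(-mu) * mu**k / factorial(k)

print(poisson_p(0, 2.0))                             # ~0.14: 0 tracks seen, 2 expected
print(1.0 - poisson_p(0, 0.4) - poisson_p(1, 0.4))   # ~0.06: >= 2 tracks seen, 0.4 expected
```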
this result should be compared with the 2 tracks expected. to understand if the distribution tails in the full angular region are real or due to detector effects i made a comparison, computing for each track with two scintillator counters the time difference between times (instead of the average). this is shown in fig. 3 as a dashed line. this plot shows that there is agreement between the two distributions and therefore we can conclude that most of the tails are indeed due to effect of the detector. finally fig. 4 shows the time difference, including all the multiple muon multiplicity. since a possible signal due to massive particles or exotic relics is expected at high path length, i have divided the angular region 762 vol. 53 supplement/2013 analysis of the macro experiment data figure 4. difference between track time and average time for multiple muons with all multiplicities: continuous line cos(θ) ≥ 0.5, dashed line cos(θ) ≤ 0.5 (histograms normalized to 1). in two parts: cos(θ) ≥ 0.5, and cos(θ) ≤ 0.5. the two distributions are comparable. but the cos(θ) ≤ 0.5 distribution has two tracks with time differences ≥ 15 ns, compared with 0.4 tracks expected from the cos(θ) ≥ 0.5 distribution (poisson probability 0.06). 4. conclusions this work ended some time after the solution of the superluminal neutrino puzzle, but i think that has been very useful to remember that cosmic rays are still important tools in particle physics. for the superluminal neutrino, 2 tracks were expected but 0 were found. considering the different mean-lives of the pion and the kaon, an “exotic” limit can be derived from the horizontal tracks of fig. 2 on the equality of the pion and kaon speed in a cascade produced by a primary with e ≥ 3 tev: |βπ − βk| � 1.5 × 10−4. this result at the moment is of very low interest but the superluminal neutrino saga has shown that nothing can be given as guaranteed. more investigations are necessary on the delayed tracks in events with multiplicity bigger than 3 and on massive particles in cosmic rays. this work has shown once again the importance of saving past experiment data for further analysis. i must thank the macro collaboration which built and run the detector and many macro people who helped me to recover data and programs and particularly nazareno taborgna of the gran sasso laboratory, who was able to save a working alphavax with several macro original disks. particular thanks go to teresa montaruli for useful and deep discussions. references [1] aad g. et al. [atlas collaboration], eur. phys. j. c 72, 1965 (2012) [2] adam t. et al. [opera collaboration], arxiv:1109.4897 [hep-ex] [3] ahlen s. p.et al. [macro collaboration], nucl. phys. b 370, 432 (1992) [4] ambrosio m. et al. [macro collaboration], phys. rev. d 60, 032001 (1999) [5] ambrosio m. et al. [macro collaboration], nucl. instrum. meth. a 486, 663 (2002) [6] bertolucci s., kyoto neutrino 2012 conference [7] chatrchyan s.et al. [cms collaboration], phys. lett. b 713, 408 (2012) [8] dracos m. (opera collaboration), kyoto neutrino 2012 conference [9] gaisser t.k., cosmic rays and particle physics, cambridge, uk: univ. pr. (1990) [10] gaisser t. k. and stanev t., phys. rev. d 57 (1998) 1977 [11] montanet l. et al. pag 1811, phys. rev. d50, 1173–1823 (1994) [12] montaruli t. and ronga f., arxiv:1109.6238 [hep-ex] [13] scapparone e. ph. thesis bologna univ. 
acta polytechnica doi:10.14311/ap.2014.54.0325 acta polytechnica 54(5):325–332, 2014 © czech technical university in prague, 2014 available online at http://ojs.cvut.cz/ojs/index.php/ap
improving specific power consumption for mechanical mixing of the feedstock in a biogas fermenter by mechanical disintegration of lignocellulose biomass
lukáš krátký∗, tomáš jirout
czech technical university in prague, faculty of mechanical engineering, department of process engineering, technicka 4, 166 07 prague 6, czech republic
∗ corresponding author: lukas.kratky@fs.cvut.cz
abstract. lignocellulose biomass particles in a biogas fermenter batch either sediment towards the bottom of the vessel or rise towards the batch surface, where they float and form a compact thick scum. these processes have a basically negative influence on batch homogeneity, on the evenness of the batch temperature field, on the removal of biogas bubbles from the liquid batch, and also on mass transfer among the microorganisms. these issues result in inefficient usage of the energy potential of the biomass, and lead to low biogas yields. good mixing of the bioreactor batch is very important for stabilizing the anaerobic digestion process. the aims of our study were to evaluate the impact of the disintegration and hydration of wheat straw on the hydrodynamic behaviour and on the specific power consumption for mechanical mixing of a wheat straw-water suspension. on the basis of the experimental results, it was concluded that both hydration and mechanical disintegration of lignocellulose biomass significantly improve the homogeneity and the pumpability of biomass-water batches. wheat straw hydration by itself reduces the specific power consumption for batch mixing by 60 % in comparison with untreated straw. in addition, mechanical disintegration reduces the specific power consumption by at least 50 % in comparison with untreated hydrated straw.
keywords: fermenter, lignocellulose biomass, mixing, specific power consumption, wheat straw.
1. introduction
anaerobic digestion of organic wastes, residues and energy crops offers a very interesting option for generating biogas and also for reducing the amount of wastes that need to be disposed of. methane-rich biogas has a high potential to replace natural gas, or it can be used as a feedstock for producing chemicals and other materials [1]. biogas production has also been evaluated as one of the most energy-efficient and environmentally beneficial technologies for bioenergy production [2]. many types and concepts of agricultural biogas plants are currently applied, but an anaerobic fermenter remains the crucial element in any kind of anaerobic technology. the vertical stirred tank fermenter is the most widely-used reactor configuration employed for wet anaerobic fermentation [3]. batch mixing by the filling in of new substrate, by thermal convection flow, and by rising gas bubbles is usually not sufficient for an agricultural biogas fermenter. in addition, untreated or unprocessed lignocellulose biomass feedstock is bulky, difficult to feed into the fermenter, and floats on the batch surface, where it almost always forms a compact thick scum [1].
active batch mixing therefore needs to be implemented in order to bring the microorganisms into contact with the new feedstock, to facilitate the upflow of gas bubbles, to achieve constant temperature conditions throughout the fermenter, to prevent particle sedimentation on the bottom of the vessel, and to prevent particle floating and scum formation on the surface of the batch [1, 4, 5]. up to 90 % of biogas plants use mechanical stirring equipment [1]. mechanical mixing systems for biogas fermenters are based on the use of impellers, which can be categorized according to their revolutions as slow-running or fast-running. fast-running impellers run mostly 1–10 times per day with agitation times of 5–60 min, whereas slowly rotating paddles mainly run continuously [2, 3]. submerged motor propeller stirrers are most often applied. they can be adjusted in height, in tilt, and to the side [2]. depending on the size of the fermenter and on the substrate type, multistage mixing systems with up to four impellers in a vertical stirred tank fermenter are needed in order to prevent floating scum and sediments [1, 2]. if a fermenter is operated at a high solid concentration, slowly rotating paddle stirrers are preferred, with a horizontal, vertical, or diagonal axis and large-scale paddles [1, 5]. axial stirrers are mounted on shafts that are centrally installed on the ceiling of the digester. they form a steady stream in the digester that flows from the bottom up to the walls, resulting in very efficient homogeneity of solid substrates with manure or recycled process water. the typical size of a completely mixed fermenter is in the range from 1000 to 4000 m³ reactor volume [1].
figure 1. the wheat straw particle sizes used in the experiments: (a) untreated straw; (b) milled straw.
the specific power consumption of a central mixing system equipped with fast-running impellers is usually in the range of 40–70 w m−3, and the specific power consumption of slowly-rotating paddle stirrers is about 10 w m−3 [5]. good mixing of a biogas reactor batch is very important in order to stabilize the anaerobic digestion process. on the basis of the information above, it was supposed that biomass hydration and biomass size reduction would prevent lignocellulose particles sedimenting or floating on the batch surface, would improve batch pumpability, would reduce the impeller rotational speed needed for sufficient batch homogeneity, and would therefore also reduce the specific power consumption for mixing. the aims of our study were therefore to evaluate the impact of lignocellulose biomass pretreatment by mechanical disintegration and the influence of lignocellulose biomass hydration on the hydrodynamic behaviour and also the specific power consumption for mechanical mixing of a lignocellulose biomass-water suspension.
2. materials and methods
2.1. raw material
wheat straw was used in the experiments. untreated straw 40–200 mm in length was cut in the field by a combine harvester, and collected and stored indoors in containers at ambient temperature for 4 years before the tests. the total solid content was determined to be 93 % wt., and the volatile solid content was 88 % wt. the total solid content was investigated by drying 5 reference samples in a kbc-25w oven overnight at a temperature of 105 °c.
the volatile solid content was investigated by burning the dried reference samples in an le 09/11 furnace at a temperature of 550 °c until the samples reached constant mass. the mass of the material was measured using an sdc31 analytical balance. to be able to evaluate the impact of wheat straw pre-treatment by mechanical disintegration, and the impact of wheat straw hydration, on the hydrodynamic behaviour of a mechanically mixed straw-water batch, the length of the untreated straw had to be reduced with respect to the experimental layout of the mixing system. the reduced untreated straw length was intended to provide a pulpy substance in an aqueous suspension, but at the same time to prevent the suction area of the impeller from becoming blocked. two reference straw samples of different lengths were used in the experiments, see fig. 1. the first reference sample comprised untreated straw particles approximately 30 mm in length, see fig. 1a. this length was achieved by cutting the untreated straw by hand, using scissors. the second reference sample was pre-treated straw, which was reduced in size by mechanical disintegration. the untreated straw was first hydrated in hot water at a temperature of 40 °c for 20 minutes; a moisture content of 40 % wt. was achieved for the wheat straw. then the wet wheat straw was mechanically disintegrated using a retting mill [6] to straw particle sizes of less than 10 mm, see fig. 1b. the retting mill is a new type of shredder, which is able to reduce wet fibrous biomass in size efficiently and continuously [7]. the two reference samples were dried, and were used to prepare the tested suspensions.
2.2. mixing system
all experiments were carried out in a glass cylindrical vessel with a flat bottom. the diameter of the vessel was equal to D = 300 mm, and the liquid level was H = D, see fig. 2a. four radial baffles b = 0.1D in width were used in the experimental layout. a pitched six-blade turbine d = 100 mm in diameter and with blade width h = 0.2d was used in the experiments, see fig. 2b. the height of the impeller above the bottom of the vessel was varied during the experiments between h2 = 1d and h2 = 2d. the impeller was operated to pump the suspension both down towards the bottom of the vessel and up towards the surface.
figure 2. experimental layout: (a) position of the impeller in the vessel; (b) a pitched six-blade turbine.
2.3. suspensions
three types of suspensions were used in the experiments. the suspensions were prepared from water and untreated straw (sus1), from untreated hydrated straw (sus2) and from milled hydrated straw (sus3), all with 1 % wt. of total solids of straw. the mass of the material was measured using a kern fkb laboratory balance.
figure 3. behaviour of stationary wheat straw in a non-mixed aqueous suspension: (a) sus1 – untreated, dry; (b) sus2 – untreated, hydrated; (c) sus3 – milled, hydrated.
the behaviour of suspension sus1 in the vessel in the non-mixed steady state was that all straw particles floated on the surface of the batch due to the density of the straw, which was lower than the density of the liquid. these straw particles therefore created a compact floating scum on the surface of the batch, see fig. 3a. suspension sus2 comprised untreated straw particles, also 30 mm in length. however, the untreated straw was first hydrated in water at a process temperature of 60 °c for a residence time of 1 hour before it was used.
suspension sus3 comprised milled straw particles less than 10 mm in length. these particles were also first hydrated in water at a processing temperature of 60 °c for a residence time of 1 hour before they were used. as is shown in figs. 3b and 3c, straw hydration caused partial sedimentation of the material in the vessel in the non-mixed steady state. however, a certain proportion of the straw, which was lighter than the liquid, again rose to the surface of the batch, where it formed a floating scum.
3. fluid flow properties of a milled and hydrated wheat straw-water suspension
3.1. rheological properties
the rheological properties for sus3 (a milled and hydrated wheat straw suspension) were determined on the basis of a measurement of the power consumption for a stirred impeller with a defined power characteristic. the experiments were carried out in a glass cylindrical vessel with a flat bottom 150 mm in inner diameter D, and the liquid level was H = D. a standardised anchor agitator with D/d = 1.11, h/d = 0.12, hv/d = 0.8 and h2/d = 0.055 was used in the experiments, see fig. 4. sufficient homogeneity of the mixed batch was achieved during the experiments in the creeping flow regime.
figure 4. experimental setup for the investigation of rheological properties: (a) the geometry of the system; (b) the equipment.
an rc20 rheometer was used for the torque measurement. the rheometer makes direct measurements of torque \(M_k\) for an adjusted impeller speed \(n\). on the basis of this data, the power consumption \(P\) was calculated as follows:
\( P = 2\pi n M_k. \) (1)
it was also necessary to evaluate the dimensionless power number \(Po\) for an investigation of the rheological properties of a mixed batch:
\( Po = \frac{P}{\rho n^3 d^5}. \) (2)
based on knowledge of the power characteristic for the anchor impeller, which is expressed by the correlation equation published e.g. in [8], the corresponding reynolds number \(Re\) values were determined by comparing the calculated \(Po\) values with the characteristic. then, using the reynolds number values, the effective viscosity \(\mu_{ef}\) was calculated as follows:
\( Re = \frac{\rho n d^2}{\mu_{ef}}, \quad \text{therefore} \quad \mu_{ef} = \frac{\rho n d^2}{Re}. \) (3)
it is generally known that the effective viscosity value is in accordance with the apparent viscosity value \(\eta\) at the effective shear rate \(\gamma\), which can be calculated on the basis of metzner and otto [8]:
\( \gamma = k n. \) (4)
the \(k\) value generally depends both on the type of impeller and on the geometrical configuration of the mixing system. a \(k\) value equal to 15.8 is given for this configuration [8]. based on these calculations, the dependence of shear stress on shear rate was plotted according to this equation:
\( \tau = K\gamma^m, \) (5)
see fig. 5.
figure 5. dependence of shear stress on shear rate for suspension sus3.
a power-law model was used to describe the rheological properties of the wheat straw-water mixing system. using the least squares method, the consistency coefficient and the power-law index were calculated for the power-law rheological model, as follows:
\( \tau = 4.0735\,\gamma^{0.76}. \) (6)
the rate of reliability \(r\) was equal to 0.97. the results clearly show that suspension sus3 evinces non-newtonian behaviour that is characterized by a consistency coefficient \(K\) of 4.07 pa s^m and a dimensionless power-law index \(m\) of 0.76.
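a minimal numerical sketch of this evaluation chain (eqs. (1)–(6)) is given below. it is an illustration only: the anchor power characteristic is represented by an assumed generic creeping-flow form po = a/re with an assumed constant a, standing in for the full correlation of [8], and the suspension density and impeller diameter are assumptions derived from the stated geometry:

import numpy as np

RHO = 1000.0   # suspension density [kg/m^3] (assumed close to water for 1 % wt. solids)
D_IMP = 0.135  # anchor impeller diameter [m] (assumed from D/d = 1.11, D = 0.15 m)
K_MO = 15.8    # metzner-otto constant for this configuration [8]
A = 250.0      # assumed constant of the creeping-flow characteristic po = a/re

def apparent_viscosity(n, torque):
    """effective (= apparent) viscosity from one torque measurement."""
    p = 2.0 * np.pi * n * torque               # eq. (1)
    po = p / (RHO * n**3 * D_IMP**5)           # eq. (2)
    re = A / po                                # invert po = a/re (creeping flow)
    return RHO * n * D_IMP**2 / re             # eq. (3)

def powerlaw_fit(n_values, torques):
    """fit tau = K * gamma^m through points built from eqs. (4)-(5)."""
    gamma = K_MO * np.asarray(n_values)        # eq. (4)
    mu = np.array([apparent_viscosity(n, mk) for n, mk in zip(n_values, torques)])
    tau = mu * gamma                           # tau = eta * gamma
    m, ln_k = np.polyfit(np.log(gamma), np.log(tau), 1)
    return np.exp(ln_k), m                     # consistency coefficient K, index m

with the measured (n, Mk) pairs, this procedure is what yielded K ≈ 4.07 pa s^m and m ≈ 0.76 in the paper; the sketch mirrors only the sequence of steps, not the exact anchor correlation.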
3.2. an investigation of the fluid flow regime
the fluid flow regime is generally dependent both on the rheological properties of the batch and on the set-up of the process parameters and the geometry of the mixing system. as has been shown, the milled and hydrated straw in a suspension evinced non-newtonian behaviour. this property strongly affects the effective viscosity value, and thereby the power consumption during mixing. it is generally known that the fluid flow in a batch depends on the reynolds number, which is defined according to (3). three types of fluid flow during mixing are known, i.e., creeping flow, transitional flow and turbulent flow. to be able to evaluate the existence of turbulent or creeping flow during batch mixing, the following considerations were taken into account. the power consumption generally depends on the power number, the density of the suspension, the impeller speed and the impeller diameter:
\( P = Po\,\rho n^3 d^5. \) (7)
according to the theory of mixing [8], the power number is inversely proportional to the reynolds number during the creeping flow regime:
\( Po = \frac{A}{Re}. \) (8)
using the definition of the reynolds number in (3), where the effective viscosity is replaced by the power-law model
\( \mu_{ef} = K\gamma^{m-1} \) (9)
and the shear rate is replaced by (4), the dependence of the power consumption on the impeller revolutions and on the impeller diameter is derived:
\( P = B n^{1+m} d^3, \) (10)
where the constant \(B\) is defined as
\( B = A K k^{m-1}. \) (11)
based on the derived formula (10) and assuming the same geometry of the mixing system, it was concluded that the power consumption generally depends on the (1 + m)th power of the impeller revolutions during creeping fluid flow, i.e., on the revolutions to the power of 1.76 for the tested batch. however, the power number is constant in the turbulent fluid flow regime [8], so the power consumption is dependent on the impeller revolutions to the power of three. the fluid flow regime in the batch was experimentally verified on the basis of these formulations. taking into account the complete geometry of the tested mixing system described in section 2.2, the fluid flow regime was investigated for a baffled glass cylindrical vessel 150 mm in diameter equipped with a pitched six-blade turbine. a high-precision rc20 rheometer was used for the torque measurements. several power values were calculated from (1) by measuring the torque \(M_k\) for the adjusted impeller speed \(n\). the impeller speeds were chosen for the state when sustainable batch homogeneity was reached.
figure 6. dependence of power consumption on impeller speed.
the dependence of power consumption on impeller revolutions is depicted in fig. 6. using the least squares method, the power-law dependence of power consumption on impeller speed was fitted to the data. the regression equation of the power-law curve showed that the power consumption depends on the impeller revolutions to the power of 2.98, with a rate of reliability equal to 1. the exponent value of 2.98 fully corresponds with turbulent flow regime theory, where a theoretical value of 3 is defined. we will now use the theory for scaling up according to the specific power, defined as follows:
\( n d^{2/3} = \text{const}. \) (12)
on the basis of this formula, it is clear that a higher reynolds number was reached in the vessel 300 mm in diameter than in the vessel 150 mm in diameter. all the experiments in the 300 mm vessel were therefore shown to have been carried out in a turbulent fluid flow regime.
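the regime check described above amounts to fitting the exponent of a power law p ∝ n^x and comparing x against 1 + m ≈ 1.76 (creeping flow) and 3 (turbulent flow). a small illustrative sketch, using synthetic data rather than the measured values:

import numpy as np

def flow_regime_exponent(n_values, p_values):
    """least-squares slope of log p vs log n, i.e. the exponent x in p ~ n^x."""
    x, _ = np.polyfit(np.log(n_values), np.log(p_values), 1)
    return x

# synthetic turbulent-like data: p = c * n^3 with a little noise
rng = np.random.default_rng(1)
n = np.linspace(2.0, 8.0, 10)                  # impeller speeds [1/s]
p = 0.5 * n**3 * (1 + 0.01 * rng.standard_normal(10))
x = flow_regime_exponent(n, p)
print(f"fitted exponent x = {x:.2f}")          # ~3 -> turbulent; ~1.76 -> creeping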
4. results and discussion
the specific power consumption was determined on the basis of the findings for the minimum impeller rotational speed for sufficient batch homogeneity and for the corresponding torque. the turbulent suspension flow was studied visually during batch mixing at the bottom of the vessel, at the level of the suspension along the baffles, and also on the batch surface. sufficient batch homogeneity was defined as the state in which the straw particles do not remain for longer than a short time period (approximately 2 s) at any of the places mentioned above. the minimum impeller rotational speed for sufficient batch homogeneity was recorded as soon as sufficient batch homogeneity was reached. the impeller rotational speed was measured using a siemens 1xp8001 electronic counter, and the minimum impeller revolutions for sufficient batch homogeneity were visually determined with an accuracy of ±5 %. the torque was measured by a dr1 torsion shaft angle sensor, manufactured by lorentz messtechnik. the data was recorded using an a/d converter to the pc, and was recalculated using the calibration function to the torque values. based on knowledge of the minimum impeller rotational speed for batch homogeneity \(n_m\) and on knowledge of the corresponding torque \(M_{km}\), the power and also the specific power consumption were calculated:
\( P_m = 2\pi n_m M_{km}, \) (13)
\( \varepsilon = \frac{P_m}{V}. \) (14)
figure 7. a comparison of the specific power consumption levels for the tested configurations and suspensions. ptb – the impeller pumps towards the bottom of the vessel; pts – the impeller pumps towards the surface of the batch.
a comparison of the specific power consumption for the different suspension types and geometrical configurations of the mixing system is shown in fig. 7.
figure 8. the homogenized batch at h2/d = 2 with the impeller pumping to the batch surface: (a) sus1 – untreated; (b) sus2 – untreated, hydrated; (c) sus3 – milled, hydrated.
figure 8 shows the state of the suspension at the minimum impeller rotational speed for sufficient batch homogeneity for the configuration of the mixing system with h2/d = 2 and with the impeller pumping to the batch surface. the maximum specific power consumption value of 1531 w m−3 was reached during mixing of suspension sus1 (untreated straw) with the bottom-pumping impeller located at a height of h2/d = 1. the reason is that the wheat straw particles floated only on the batch surface (fig. 3a). a very high impeller speed of 730 min−1 was therefore needed to drag them from the surface into the liquid batch and to achieve sufficient batch homogeneity. the maximum specific power consumption value of 321 w m−3 was reached while mixing suspension type sus2 (untreated, hydrated straw) with the impeller pumping to the batch surface and located at a height of h2/d = 1. sufficient batch homogeneity was observed at an impeller rotational speed of 402 min−1. on the basis of a visual study of the mixed batch, it was observed that partial plugging of the impeller suction area, caused by a high local concentration of the solid phase, was the main influence on the specific power consumption value. however, if the impeller pumped to the bottom of the vessel, the main limitation on achieving sufficient batch homogeneity was getting the floating straw on the batch surface down into the liquid.
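for reference, the evaluation of eqs. (13)–(14) is straightforward; the sketch below uses the vessel geometry of section 2.2 to get the batch volume (an assumption, since the paper does not state v explicitly) and back-checks one reported operating point with a torque value inferred from the published numbers:

import math

D_VESSEL = 0.3                             # vessel diameter [m], liquid level H = D
V = math.pi / 4 * D_VESSEL**2 * D_VESSEL   # batch volume [m^3] (cylinder, H = D)

def specific_power(n_rpm, torque_nm):
    n = n_rpm / 60.0                       # impeller speed [1/s]
    p = 2.0 * math.pi * n * torque_nm      # eq. (13)
    return p / V                           # eq. (14), [w/m^3]

# e.g. sus1, bottom pumping, h2/d = 1: 730 min^-1 gave 1531 w/m^3,
# which corresponds to a torque of roughly 0.42 n m in this geometry
print(f"{specific_power(730, 0.425):.0f} w/m^3")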
no straw sedimentation was observed at the bottom of the vessel at an impeller speed of 150 min−1, but the suspension flow in the vessel did not affect the flow around the batch surface. the size of the primary circulation loop gradually increased with increasing impeller revolutions, and more intensive turbulent macro-vortices were gradually generated. the floating straw was therefore dragged from the surface into the liquid batch at an impeller speed of 307 min−1, with a corresponding specific power consumption of 199 w m−3. based on the comparison of the specific power consumptions for the suspension types and the geometrical configurations of the mixing system (fig. 7), it can be concluded that the low-energy-demanding configuration of the presented mixing system has the impeller located at a height of h2/d = 2. the main factor, which has a strong influence on the specific power consumption value, is the hydrodynamic action of the impeller in the presence of the solid phase. the turbulent vortices that are formed close to the batch surface around the impeller make it easy for the floating straw particles to be drawn down into the liquid. however, if the impeller is located at a height of h2/d = 1, a very high impeller speed is needed to generate these vortices around the batch surface. based on a comparison of the data for the position of the impeller at a height of h2/d = 2, it can be concluded that pumping to the bottom of the vessel and pumping to the batch surface have practically no influence on the specific power consumption. the measured specific power consumptions were 166 w m−3 for sus1 (untreated straw), 66 w m−3 for sus2 (untreated, hydrated straw), and 34 w m−3 for sus3 (milled, hydrated straw). however, all the specific power consumptions mentioned above are slightly higher than the real values. it was visually observed that both the non-milled and the milled straw particles had a tendency to cover the impeller blades. this effect primarily caused an increase in fluid flow resistance, leading to an increase in power. however, centrifugal forces effectively removed the straw particles from the blades. based on this information, it can generally be concluded that straw hydration reduces the specific power consumption by 60 % in comparison with untreated straw, while a combination of straw pre-treatment by mechanical disintegration and by hydration decreases the specific power consumption by 80 % in comparison with untreated straw. mechanical disintegration itself decreases the specific power consumption by at least 50 % in comparison with untreated hydrated straw.
5. conclusions
based on the experimental results, it has been concluded that both hydration and mechanical disintegration of lignocellulose biomass significantly decrease the specific power consumption for achieving a homogeneous biomass-water suspension.
• hydration and mechanical disintegration of lignocellulose biomass significantly improve the homogeneity and pumpability of biomass-water batches.
• a milled and hydrated wheat straw-water suspension evinces non-newtonian behaviour with a power-law index equal to 0.76.
• straw hydration by itself decreases the specific power consumption by 60 % in comparison with untreated straw.
• combined straw pretreatment by mechanical disintegration and by hydration reduces the specific power consumption by 80 % in comparison with untreated straw.
• mechanical disintegration by itself reduces the specific power consumption by at least 50 % in comparison with untreated hydrated straw.
list of symbols
A constant [–]
b baffle width [m]
B constant [pa s^m]
d impeller diameter [m]
D vessel diameter [m]
H height of the liquid level [m]
h width of a blade [m]
h2 height of the impeller above the bottom [m]
hv height of the impeller [m]
k metzner and otto constant [–]
K consistency coefficient [pa s^m]
m power-law index [–]
Mk torque [n m]
Mkm torque at the minimum impeller rotational speed for batch homogeneity [n m]
n rotational speed of the impeller [s−1]
nm minimum impeller rotational speed for batch homogeneity [s−1]
P power consumption [w]
Pm power consumption at the minimum impeller rotational speed for batch homogeneity [w]
Po power number [–]
r rate of reliability [–]
Re reynolds number [–]
V volume of suspension [m³]
µef effective viscosity [pa s]
γ shear rate [s−1]
ε specific power [w m−3]
η apparent viscosity [pa s]
ρ density of suspension [kg m−3]
τ shear stress [pa]
references
[1] weiland, p.: biogas production: current state and perspectives. applied microbiology and biotechnology, 2010, 85, pp. 849–860.
[2] schulz, h., eder, b.: bioplyn v praxi. ostrava: hel s.r.o., 2004. 167 pp. isbn 80-86167-21-6.
[3] pandey, a.: handbook of plant-based biofuels. crc press, new york, 2009. isbn 978-1-56022-175-3, 297 pp.
[4] chanakya, h. n., srikumar, k. g., anand, v., modak, j., jagadish, k. s.: fermentation properties of agro-residues, leaf biomass and urban market garbage in a solid phase biogas fermenter. biomass and bioenergy, 1999, 16, pp. 417–429.
[5] jirout, t., rieger, f., moravec, j.: studie míchání anaerobních fermentačních reaktorů na bps. [research report]. prague: czech technical university in prague, faculty of mechanical engineering, department of process engineering, 2008, 22 pp.
[6] krátký, l., jirout, t., nalezenec, j.: lab-scale technology for biogas production from lignocellulose wastes. acta polytechnica, 2012, 52 (3), pp. 54–59.
[7] slabý, f., nalezenec, j., krátký, l., maroušek, j.: retting mill. [patent]. industrial property office of the czech republic, 26080, 2013-11-11.
[8] rieger, f., novák, v., jirout, t.: hydromechanické procesy ii. ctu in prague: ctu publishing house, 2005. 167 pp. isbn 80-01-03302-3 (in czech).
acta polytechnica doi:10.14311/ap.2014.54.0225 acta polytechnica 54(3):225–230, 2014 © czech technical university in prague, 2014 available online at http://ojs.cvut.cz/ojs/index.php/ap
photometric analysis of pi of the sky data
rafał opiela a,∗, katarzyna małek a,b, lech mankiewicz a, małgorzata siudek a, marcin sokołowski c,d,e, aleksander filip żarnecki f
a centre for theoretical physics of the polish academy of sciences, al. lotników 32/46, 02-668 warsaw, poland
b division of particles and astrophysical science, nagoya university, 464-8601 nagoya, japan
c national centre for nuclear research, hoża 69, 00-681 warsaw, poland
d international centre for radio astronomy research, curtin university, wa 6845 perth, australia
e arc centre of excellence for all-sky astrophysics (caastro), curtin university, wa 6845 perth, australia
f faculty of physics, hoża 69, 00-681 warsaw, poland
∗ corresponding author: opiela@cft.edu.pl
abstract.
pi of the sky is a system of two wide field of view robotic telescopes, which search for short timescale astrophysical phenomena, especially for prompt optical grb emissions. the system was designed for autonomous operation, monitoring a large fraction of the sky with a 12m–13m range and a time resolution of the order of 1–10 seconds. two fully automatic pi of the sky detectors, located in spain (inta – the el arenosillo test centre in mazagón, near huelva) and chile (spda – the san pedro de atacama observatory), have been observing the sky almost every night in search of rare optical phenomena. they also collect a lot of useful observations, which include e.g. many kinds of variable stars. to be able to draw proper conclusions from the data received, adequate quality of the data is very important. pi of the sky data is subject to systematic errors caused by various factors, such as cloud cover, seen as significant fluctuations in the number of stars observed by the detector, problems with the telescope mount, a strong background from the moon, or the passing of a bright object, e.g. a planet, near the observed star. some of these adverse effects have already been detected during the cataloging of individual measurements, but the quality of our data was still not satisfactory for us. in order to improve the quality of our data, we have developed two new procedures based on two different approaches. in this article we report on these procedures, give some examples, and show how these procedures improve the quality of our data [1].
keywords: gamma ray burst (grb), variable stars, robotic telescopes, photometry, astrometry, data quality, photometric corrections.
1. introduction
pi of the sky is a system of two wide field of view robotic telescopes designed for efficient searches for astrophysical phenomena varying on scales from seconds to months. the design of the apparatus allows a large fraction of the sky to be monitored with a range of 12m–13m and a time resolution of the order of 1–10 seconds. the main goal of the pi of the sky project is to search for and observe prompt optical counterparts of gamma ray bursts (grbs) during or even before the gamma-ray emission. other scientific goals include searching for nova stars and flare star explosions, and looking for new, as yet undetected variable stars. to achieve this purpose, pi of the sky selected an approach which assumes continuous observation of a large part of the sky to increase the possibility of catching a grb. it was therefore necessary to develop advanced and fully automatic hardware and software for wide-field monitoring, real-time data analysis and identification of flashes [1].
2. methods to improve data quality
2.1. color correction
as was mentioned above, we have developed a series of quality filter cuts to remove measurements (or whole frames) affected by detector imperfections or observing conditions. measurements that are placed near the border of the frame, or that are affected by hot pixels, by a bright background caused by an open shutter or by the moon halo, or by a planet or planetoid passage, can easily be recognized and removed by dedicated algorithms. by selecting only high quality measurements, an average photometry uncertainty of about 0.018m–0.024m has been achieved for stars from 7m to 10m (see figure 7) [1]. we were still not satisfied with the quality of our data, so we have been looking for new methods and algorithms which can help us to improve it.
we have managed to improve the photometry accuracy further by developing a dedicated color correction algorithm.
figure 1. the two currently working pi of the sky detectors: the prototype detector (on the left), and the new detector (on the right).
figure 2. the detector response is correlated with the spectral type (b − v or j − k) of catalog stars.
when performing observations without any filter (which is the case for most of our data), we normalize our measurements to reference stars measured in the v filter. due to the wide spectral acceptance, the ccd detector response is correlated with the stellar spectral type. as determined by the manufacturer, the ccd detectors used in the pi of the sky project have the greatest sensitivity in the near infrared, while the average wavelength is, in their case, 〈λ〉 ≈ 585 nm, which corresponds approximately to the wavelength characteristic of the v filter. the sensitivity of the ccd detectors varies with the wavelength λ, and this affects the quality of the results of measurements taken in white light. two objects with the same luminosity, the first of which shines in the near infra-red and the second e.g. in blue, will have different brightnesses on our detectors: an object shining in the near infra-red will be brighter than a blue object. since neither of the inta cameras present in the new detector has a filter installed that can deal with this effect, we must take this effect into account when cataloging our data [1]. the average magnitude measured by pi of the sky is shifted with respect to the catalog magnitude in the v band by an offset depending on the spectral type given by b − v or j − k. we have already determined that, in the case of data collected by the prototype detector in chile, the correction of the standard photometry used in the pi of the sky project, which is based on taking into account the dependence of the sensitivity of the ccd chip on the observed star type, may be approximated by the following formula:
\( m_{corr} = m - 0.2725 + 0.5258\,(j - k) \) (1)
in this formula, the j and k values correspond to the brightnesses of the object tested in the j and k filters, respectively. m is the brightness of the analyzed object, measured by the detector and normalized to the brightness of the catalog stars in v. mcorr represents the corrected magnitude, which takes the color correction into account. the prototype’s cameras, however, have chips made by a different manufacturer (fairchild) than the cameras in spain (sta), so the color calibration also had to be repeated for the data collected by the new detector (see figure 2) [1, 2]. approximating this dependence with a linear function enables the measurement of each star to be corrected, so that the measured magnitude is equal to the catalog v magnitude, irrespective of the spectral type. equation (1) gives us information about the corrected magnitude only for catalog stars which have j and k brightness values. if we want to calculate the photometry corrections for any star visible in the resulting frame, we use a special procedure which requires the use of only the best catalog stars. we are interested in catalog stars in the range of magnitude from 6m to 10m, and we reject stars with a magnitude shift (mcorr − m) bigger than 0.2.
figure 3. for calculating photometry corrections, only the best catalog stars were used (blue), after rejecting stars with a magnitude shift (mcorr − m) bigger than 0.2, an rms of mcorr bigger than 0.07, or fewer than 100 measurements.
figure 4. the uncorrected light curve for the variable bg ind (left), and the same light curve after spectral corrections with the correction quality cut (right).
the catalog stars should also have more than 100 measurements, and we accept only catalog stars which have rms mcorr < 0.07 (see figure 3) [1, 2]. an additional improvement of the measurement precision is achieved when the photometric correction is not calculated as a simple average over all selected reference stars, but when a quadratic dependence of the correction on the reference star position in the sky is fitted for each frame. in this case, for the selected catalog stars we calculate the quadratic surface correction, and we interpolate the value of the correction at the point where our analyzed star is located. the average square distance of the reference stars from the fitted correction surface (χ2) gives us additional, independent information about the quality of the analyzed measurement. the spectral correction and the additional χ2 distribution allow the selection of only the measurements with the highest precision [2]. the effect of the photometry correction with a χ2 distribution cut on the reconstructed bg ind light curve is shown in figure 4. in this case, application of the new algorithm improved the photometry quality, and an uncertainty sigma of the order of 0.013m was obtained. we also applied the photometry correction to other stars, with results as good as in the case of the bg ind variable [3].
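a minimal sketch of this correction and reference-star selection follows; the coefficients are those of eq. (1) for the prototype detector, while the catalog layout and field names are assumptions made for the illustration:

import numpy as np

def color_correct(m, j, k):
    # eq. (1), prototype (chile) calibration; the sta cameras need their own fit
    return m - 0.2725 + 0.5258 * (j - k)

def select_reference_stars(catalog):
    """keep only the best catalog stars for fitting per-frame corrections.

    catalog: iterable of dicts with assumed keys 'v' (catalog v magnitude),
             'm' (mean measured magnitude), 'mcorr' (color-corrected magnitude),
             'rms_mcorr', 'n_measurements'.
    """
    good = []
    for star in catalog:
        if not (6.0 <= star['v'] <= 10.0):          # magnitude range 6m-10m
            continue
        if abs(star['mcorr'] - star['m']) > 0.2:    # large magnitude shift
            continue
        if star['rms_mcorr'] > 0.07:                # noisy star
            continue
        if star['n_measurements'] < 100:            # poorly sampled star
            continue
        good.append(star)
    return good

def quadratic_surface_fit(x, y, dm):
    """fit dm(x, y) = a + b*x + c*y + d*x^2 + e*x*y + f*y^2 over reference stars."""
    A = np.column_stack([np.ones_like(x), x, y, x**2, x * y, y**2])
    coeffs, *_ = np.linalg.lstsq(A, dm, rcond=None)
    chi2 = np.mean((A @ coeffs - dm) ** 2)   # avg. square distance from the surface
    return coeffs, chi2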
we assumed that the best frames have less than 10 % bad catalog stars and the worst frames have more than 50 % bad catalog stars. coming from the same field as in the case of the analyzed measurement. to this histogram we can fit the gauss function which gives us the value of σ, and this is later used to obtain the quality of the analyzed frames (see figure 5). we assumed that the best frames have less than 10 % bad catalog stars and the worst frames have more than 50 % bad catalog stars (see figure 6). each frame which contains an analyzed star measurement is analyzed in the same way [2]. if we know which frames are good and which are bad, we can calculate 〈m〉 and σ〈m〉 values based on each group of frames. we can calculate these values based on measurements taken only from the best frames, from the worst frames, or from all frames. we also take into account the quality of the data calculated using the previous methods. our results are given below (see figure 7). the photometry accuracy improves significantly when this method is used. after removing bad data, σ from 0.01m to 0.03m is achieved (see figure 7) [2]. 3. comparison of methods after calculating each correction separately, we can ask which correction is the best. we would like to find the combination of methods that gives us the best analyzed light curve quality, without removing a lot of data from this light curve. as we see in fig. 9, in some cases after using some combination of methods the quality of the analyzed light curve improves significantly, and it is almost the same as in cases when we apply all corrections and remove more data from the analyzed light curve. it is more cost-effective is use a small number of corrections and not to remove a lot of data, so that we can have more information about the analyzed light curve (e.g. we can calculate the period more precisely). the choice of the appropriate method is unfortunately dependent 228 vol. 54 no. 3/2014 photometric analysis of pi of the sky data figure 7. σ〈m〉 vs 〈m〉 plot. on the left we can see points with positions corresponding to 〈m〉 and σ〈m〉 calculated on the basis of all measurements. on the right we can see points with positions corresponding to 〈m〉 and σ〈m〉 calculated on the basis of only the best measurements. as we see, after using this method the quality improves significantly and σ ∼ 0.03m is achieved for stars from 7m–9m. figure 8. the light curve of an analyzed star after using the statistical data quality estimate method (on the left). colors represents the quality of the analyzed measurements. a histogram of this light curve after removing all bad data (on the right). as we see, after using this method the light curve quality improves significantly and σ ∼ 0.01m is achieved. on the star that is to be analyzed, and varies from star to star. 4. conclusions a lot of informations about the pi of the sky project can be found on the project web page, which is available on the http://grb.fuw.edu.pl. we have created a system of dedicated filters to mark bad measurements or frames. this system is applied with the cataloging procedure for new data. to improve the quality of the data, we have created an approximate color calibration algorithm based on the spectral type of catalog stars. we have also developed another statistical method which analyzes all stars on the frame, allowing bad quality exposures to be rejected. after the new frame selection is applied, photometry accuracy of 0.01m–0.03m can be obtained. 
further improvement is possible in dedicated analysis of selected objects [3]. in the pi of the sky project we have developed a pipeline which allows us to analyze selected interesting objects in a fully automatic way. as a result we get a corrected light curve of the analyzed star, where each measurement has the quality calculated [1]. in the pi of the sky project we have developed our own pipeline which allows us to analyze selected interesting objects in a fully automatic way. as a result, we get a corrected light curve for a star, where each measurement has the quality calculated. unfortunately, these procedures are not fast enough to be implemented in the automatic off-line reduction stage, so we have to run this only for selected objects. we are trying to improve this procedure and make it faster. however, this is a very complex problem, and we do not yet have a pipeline of this kind. acknowledgements we are very grateful to g. pojmański for providing access to the asas dome and for sharing his experience with us. we would like to thank the staff of the las campanas observatory, the san pedro de atacama observatory and the inta el arenosillo test centre in mazagón near 229 http://grb.fuw.edu.pl r. opiela, k. małek, l. mankiewicz et al. acta polytechnica figure 9. comparison of the photometry correction methods used here. the x-axis represents the percentage of data that has not been removed from the analyzed light curve. the y-axis represents the analyzed light curve quality. as we see, in some cases after using some combination of methods the quality of the analyzed light curve improves significantly, and it is almost the same as in the case when we apply all corrections and remove more data from the analyzed light curve. figure 10. the analyzed star light curve after using all that corrections described above (on the left). colors represents the observed fields. the light curve histogram after removing all bad data is shown on the right. as we see, after using this methods the light curve quality improves significantly and σ ∼ 0.009m is achieved (at the beginning, the value for σ was ∼ 0.02m). huelva for their help during installation and maintenance of our detector. this work was co-founded by the polish ministry of science and higher education in 2009-2013 as a research project, and by polish-swiss astro project cofounding under the swiss program of cooperation with new member states of the european union. km and lwp acknowledge support from jsps postdoctoral fellowships for foreign researchers. references [1] l. w. piotrowski. hunting for gamma ray bursts with pi of the sky telescopes. 33rd international cosmic ray conference, rio de janeiro 2013, the astroparticle physics conference, 2013. http://grb.fuw.edu.pl/pi/papers.htm. [2] r. opiela. improving photometry of the pi of the sky. proceedings of spie volume: 7745, 2010. http://grb.fuw.edu.pl/pi/papers/wilga10/ ropiela_spie2010.pdf. [3] a. f. żarnecki. improving the photometry of the pi of the sky system. acta polytechnica 2011/2. http://ctn.cvut.cz/ap/download.php?id=600. 
acta polytechnica doi:10.14311/ap.2014.54.0420 acta polytechnica 54(6):420–425, 2014 © czech technical university in prague, 2014 available online at http://ojs.cvut.cz/ojs/index.php/ap
computational aerodynamic optimization of a low-speed wing
martin lahuta a, zdeněk pátek a,b,∗, andrás szöllös a
a výzkumný a zkušební letecký ústav, a.s., beranových 130, 199 05 praha – letňany, czech republic
b faculty of mechanical engineering, czech technical university, technická 4, 166 07 praha 6, czech republic
∗ corresponding author: patek@vzlu.cz
abstract. an optimization method consisting of two evolutionary optimization algorithms and a solver using nonlinear aerodynamics is applied to the design of a low-speed wing. the geometric parameterization of the wing uses standard geometric quantities commonly used for describing the wing geometry. the method seems to provide good reliable results with low computer capacity requirements.
keywords: aerodynamic optimization; evolutionary algorithm; low-speed wing.
1. introduction
with the use of new manufacturing technologies, especially technologies with composite materials, conventional and well-proven wing shapes (usually trapezoids or a combination of trapezoids) can be abandoned, and complex three-dimensional geometrical shapes can be applied, even for light and general aviation aircraft. a wing can now be designed more closely according to aerodynamic requirements. the shapes of the wings of advanced sailplanes can serve as demonstrative examples [1, 2]. new optimization techniques connected to more precise, more accurate and faster aerodynamic solvers are leading to wings optimized according to several aerodynamic criteria in relatively broad ranges of geometric parameters. evolutionary algorithm methods suitable for multicriterion optimization have been developed over a considerable period of time. microevolutionary algorithms provide welcome computational-time savings together with an acceptably substantial scan of a multidimensional design space. methods are now also available for computing the aerodynamic characteristics of wings that provide sufficiently accurate results for geometrically complex low-speed wings, and that are at the same time very quick. the necessary parameterization of the geometric shape of the wing uses conventional geometric parameters widely used in the technical description of a wing, rather than mathematical parameters.
2. optimization criteria and constraints
figure 1. parameterization of the wing planform.
in the aerodynamic optimization of a wing, some form of minimization of the wing drag or maximization of the lift-to-drag ratio is usually required as the main optimization criterion. this is not sufficient for a real aircraft wing, where many other constraints are simultaneously applied. typical aerodynamic constraints that can be used are: the minimum achieved value of the maximum lift coefficient cl max, the maximum absolute value of the wing pitching moment coefficient cm, and the limit on the position of the point where flow detachment begins. typical geometric constraints that can be used are: wing area, wing span, wing aspect ratio, wing taper ratio, wing twist, wing dihedral, wing sweep angle. the constraints can be given either as limiting values (for example a maximum acceptable twist) or as directly required values. a prescribed kind of geometric shape can also be required, for example a wing composed of two trapezoids on the wing half-span. as the reynolds numbers along the wing span are involved in the aerodynamic computations, a dimensional setting of the wing and the flight speed is preferred.
3. geometric description of the wing
the geometry of the wing was restricted by the following two prescribed requirements: the area of the wing \(s_w\), and the semispan \(b/2\). the leading edge (le) and the trailing edge (te) of the wing were represented by a quadratic bezier curve [3] (see figure 1) given by the equation
\( \vec{r}(t) = (1-t)^2 P_0 + 2t(1-t) P_1 + t^2 P_2, \quad \vec{r}(t) = [x, z], \) (1)
where \(P_0\) is a point at the root, \(P_2\) is a point at the tip and \(P_1\) is the so-called control point of the curve, lying inside the triangle \(P_0\), \(P_2\), \([P_2^x, P_0^y]\), which guarantees that the curve is always convex or concave and lies inside this triangle. equation (1) of the curve can be expressed with \(P_0^{le,te}\) shifted to the origin of the coordinate system, without loss of generality. the parametric coordinates of these shifted points are then
\( P_0^{le,te} = [0, 0], \quad P_1^{le,te} = [z^{le,te},\ dx^{le,te} x^{le,te} z^{le,te}], \quad P_2^{le,te} = [b/2,\ dx^{le,te}], \)
where \( dx^{le,te} \in \langle x_{min}^{le,te}, x_{max}^{le,te} \rangle \), \( x^{le,te} \in \langle 0, 1 \rangle \), \( z^{le,te} \in \langle 0, b/2 \rangle \). these curves, with \(P_0^{le,te}\) at the origin of the coordinate system, are then shifted against each other along the x-axis so that the area of the wing is \(s_w\); alternatively, the configuration is rejected if the prescribed area cannot be achieved for the given parameters. the wing configuration is also rejected when the local length of the chord is non-decreasing with an increasing z-coordinate. the local chords of the wing are also twisted, with a linear dependency along the wing span, where the local twist angle is given as \( \epsilon(z) = \frac{z}{b/2}\,\epsilon_{tip} \), and \(\epsilon_{tip}\), constrained to a prescribed interval, is a parameter describing the twist of the tip of the wing. overall, there are seven parameters: \(x^{le,te}\), \(z^{le,te}\), \(dx^{le,te}\), \(\epsilon_{tip}\). this parameterization includes a wide range of realistic wing geometries, including trapezoidal geometries.
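a small sketch of this planform construction follows. the parameter names mirror the text, but the mid-triangle placement of the control points, the trailing-edge shift and the monotonic-chord rejection are simplified assumptions of how the authors' checks could look:

import numpy as np

def bezier(p0, p1, p2, t):
    """quadratic bezier curve, eq. (1); t in [0, 1], points as [x, z] arrays."""
    t = np.asarray(t)[:, None]
    return (1 - t)**2 * p0 + 2 * t * (1 - t) * p1 + t**2 * p2

def planform(b, dx_le, dx_te, shift_te, n=50):
    """leading/trailing edge curves and local chords of a half-wing.

    simplified: control points are placed mid-triangle; shift_te moves the
    trailing edge along x so that a target area could be matched by an outer loop.
    """
    t = np.linspace(0.0, 1.0, n)
    le = bezier(np.array([0, 0]), np.array([dx_le / 2, b / 4]),
                np.array([dx_le, b / 2]), t)
    te = bezier(np.array([shift_te, 0]), np.array([shift_te + dx_te / 2, b / 4]),
                np.array([shift_te + dx_te, b / 2]), t)
    chords = te[:, 0] - le[:, 0]
    area = 2.0 * np.trapz(chords, le[:, 1])       # both half-wings
    ok = np.all(np.diff(chords) <= 0)             # chord must decrease towards the tip
    return area, ok

area, ok = planform(b=18.0, dx_le=0.6, dx_te=-0.4, shift_te=1.2)
print(f"wing area = {area:.2f} m^2, monotonic chord: {ok}")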
typical geometric constraints that can be used are: wing area, wing span, wing aspect ratio, wing taper ratio, wing twist, wing dihedral, wing swept angle. the constraints can be given either as limiting values (for example maximum acceptable twist) or as directly required values. a prescribed kind of geometric shape can also be required, for example a wing composed from two trapezoids on the wing half-span. as the reynolds numbers along the wing span are involved in the aerodynamic computations, dimensional setting of the wing and the flight speed is preferred. 3. geometric description of the wing the geometry of the wing was restricted by the following two prescribed requirements: area of the wing sw, and semispan b/2. the leading edge (le) and the trailing edge (te) of the wing were represented by a quadratic bezier curve [3] (see figure 1) given by the equation ~r(t) = (1− t)2p0 + 2t(1− t)p1 + t2p2, ~r(t) = [x,z], (1) 420 http://dx.doi.org/10.14311/ap.2014.54.0420 http://ojs.cvut.cz/ojs/index.php/ap vol. 54 no. 6/2014 computational aerodynamic optimization of a low-speed wing where p0 is a point at the root, p2 is a point at the tip and p1 is the so-called control point of the curve lying inside the triangle p0, p2, [p x2 ,p y 0 ], which guarantees that the curve is always convex or concave and lying inside this triangle. the equation (1) of the curve can be expressed with p le,t e0 shifted to the origin of the coordinate system, without loss of generality. the parametric coordinates of these shifted points are then: p le, te 0 = [0, 0], p le, te 1 = [z le, te,dxle, texle, tezle, te], p le, te 2 = [b/2,dx le, te], where dxle, te ∈ 〈xle, temin ,x le, te max 〉, xle, te ∈ 〈0, 1〉, zle, te ∈ 〈0,b/2〉. these curves with p le, te0 at the origin of the coordinate system are then shifted against each other along the x-axis so that the surface of the wing is sw. alternatively, the configuration is rejected if the prescribed surface area cannot be achieved for the given parameters. the wing configuration is also rejected when there is non-decreasing local length of the chord with an increasing z-coordinate. the local chords of the wing are also twisted with consequent linear dependency along the wing span, where the local twist angle is given as �(z) = z b/2e tip and �tip ∈ 〈xle, temin 〉 is a parameter describing the twist of the tip of the wing. overall, there are seven parameters: xle, te, zle, te, dxle, te, �tip. this parameterization includes a wide range of realistic wing geometries, including trapezoidal geometries. 4. aerodynamic description of the wing 4.1. airfoil sections it is necessary to create a database of the airfoils used on the wing in advance, because the aerodynamic characteristics of the airfoils used on the wing and their positions along the wingspan are used as inputs for wing optimization. the airfoils are described by their usual conventional aerodynamic characteristics, i.e., by the lift curves, the drag curves and the moment curves at different reynolds numbers. it is very useful to have these airfoil characteristics for the angles of attack overrunning the angle of attack of stall by at least two degrees. 4.2. wing the wing aerodynamic performance is described using the commonly used coefficients of wing lift, wing drag and wing pitching moment, the lift distribution along the span, the lift coefficient distribution along the span and the point along the span where flow detachment begins. 
5. method for computing the aerodynamic characteristics of the wing
nlwing2 software was used for calculating the aerodynamic characteristics of the wing. nlwing2 is an implementation of the nonlinear lifting line method developed at vzlu [4]. this method allows the use of 2d viscous or non-viscous airfoil analyses (calculated e.g. by xfoil software or provided by wind tunnel testing) for efficient computation of the nonlinear aerodynamic properties of 3d wing configurations. it employs the 2d section data to build a 3d potential vortex model of the flow. it uses a robust euler-newton method to track the change in the flow vorticity quantities as the angle of attack progresses. nlwing2 runs under the gnu octave system [5]. the implementation is very effective, and is a few orders of magnitude faster than alternative cfd methods.
6. optimization method
6.1. evolutionary optimization
the powerful computing facilities that have emerged in recent years have led to many practical optimizations in various areas of science and engineering. as a consequence, various heterogeneous evolutionary algorithms have appeared [6–12]. however, according to their functionality, they can be described by the following basic scheme:
1. initialize the population by randomly generated individuals
2. evaluate
3. repeat until the criterion for stopping of the optimization is met
a. choose individuals for reproduction
b. apply variation operators to the selected parents
c. evaluate new design candidates
d. choose the best design candidates for the next generation
end of the loop
genetic algorithms have received much attention in the evolutionary community. among them, the most popular evolutionary algorithms are perhaps nsga2, by k. deb et al. [13], and spea2, by e. zitzler et al. [14]. nsga2 is widely considered to be a reference algorithm. the particular achievement of nsga2 is that it is very easy to use – it is almost parameter-less, and it is fast due to its elitist non-dominated sorting with the crowding distance as a diversity metric. spea2 has an excellent diversity mechanism accompanied by the so-called pareto archive, preserving promising non-dominated individuals during the evolution. a detailed explanation of these two approaches can be found in [13, 14].
6.2. differential evolution
a very promising new concept was proposed relatively recently by r. storn and k. price [15], and is known as the differential evolution (de) algorithm. differential evolution is a simple yet powerful means for creating promising new design candidates by combining a parent solution with other population members. various implementations of the basic idea were soon developed and, interestingly, some of them outperformed both nsga2 and spea2 on benchmark problems. now let us quote the original formulation of the authors. differential evolution utilizes \(NP\) \(D\)-dimensional design vectors \(x_i^G\), \(i = 1, \ldots, NP\) as a population for each generation \(G\). de creates new parameter vectors by adding the weighted difference between two population vectors to a third vector. let this operation be called mutation. the components of the mutated vector are then mixed with the parameters of another predetermined vector, the target vector, to yield the so-called trial vector.
this parameter mixing can be termed a crossover. if the trial vector has a better fitness than the target vector, it replaces the target vector. this last operation is called selection. each individual in the population has to serve once as the target vector, so that \(NP\) competitions take place in each generation. the basic de operators are as follows.
mutation. for each target vector \(x_i^G\), a mutant vector is created according to
\( v_i^{G+1} = x_{r_1}^G + F\,(x_{r_2}^G - x_{r_3}^G) \)
with mutually different random indices \(r_{1,2,3} \in \{1, 2, \ldots, NP\}\) and \(F > 0\). the randomly chosen integers \(r_{1,2,3}\) are also chosen to be different from the running index \(i\), so that \(NP\) must be greater than or equal to four to allow for this condition, and \(F\) controls the amplification of the differential variation \(x_{r_2}^G - x_{r_3}^G\). it is real, greater than zero and less than 2. in the original proposal it is constant for the whole evolution, while our implementation also allows it to be either constant for the current running index of the design vector or varying with each component of the actual individual.
crossover. the diversity of the population can be further increased by a crossover. the so-called trial vector is therefore formed: \( u_i^{G+1} = (u_{1i}^{G+1}, u_{2i}^{G+1}, \ldots, u_{Di}^{G+1}) \), where
\( u_{ji}^{G+1} = \begin{cases} v_{ji}^{G+1} & \text{if } \mathrm{randb}(j) \le CR \text{ or } j = \mathrm{rnbr}(i), \\ x_{ji}^{G} & \text{otherwise}. \end{cases} \)
here, \(\mathrm{randb}(j)\) is the j-th evaluation of a uniform pseudo-random number generator with the outcome from the interval [0, 1]. \(CR\) is the crossover constant from [0, 1], which has to be determined by the user, and \(\mathrm{rnbr}(i)\) is a randomly chosen index belonging to \(\{1, 2, \ldots, D\}\), which ensures that \(u_i^{G+1}\) gets at least one parameter from \(v_i^{G+1}\). in addition, ea1 also allows vectors to be exchanged as a whole, and not just their components.
selection. in the original proposal, the trial vector is compared to the target vector using the greedy criterion: if the vector \(u_i^{G+1}\) yields a smaller cost function value than \(x_i^G\), then \(x_i^{G+1}\) is set to \(u_i^{G+1}\); otherwise the old value \(x_i^G\) is retained. this scheme was tailored for solving merely single-objective optimization problems. the modification for the multi-objective case was straightforward. ea1 is a pareto-archive oriented algorithm, therefore every trial vector is compared to the archive: if any archive member dominates it, it is discarded; otherwise it is included into the archive. the pseudocode of differential evolution is as follows:
(1.) initialize the population of p design candidates by randomly generated individuals.
(2.) evaluate.
(3.) repeat until the criterion for stopping the optimization is met: for each design vector pi (i = 1, . . . , np) from p:
(a) create candidate c from parent pi;
(b) evaluate the candidate;
(c) if the candidate is better than any of the archive parents, it becomes a new member and replaces it; otherwise the candidate is discarded.
(4.) randomly enumerate the individuals in p.
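a compact, single-objective rendering of these three operators (rand/1/bin in the usual de naming) is sketched below; it illustrates the quoted scheme of storn and price, not the ea1 implementation itself:

import numpy as np

rng = np.random.default_rng(0)

def de_step(pop, cost, f=0.8, cr=0.1):
    """one generation of differential evolution: mutation, crossover, selection."""
    np_, d = pop.shape
    new_pop = pop.copy()
    for i in range(np_):
        # mutation: v = x_r1 + f * (x_r2 - x_r3), with r1, r2, r3 distinct and != i
        r1, r2, r3 = rng.choice([j for j in range(np_) if j != i], 3, replace=False)
        v = pop[r1] + f * (pop[r2] - pop[r3])
        # crossover: mix v with the target vector, keeping at least one component of v
        mask = rng.random(d) <= cr
        mask[rng.integers(d)] = True          # the rnbr(i) index
        u = np.where(mask, v, pop[i])
        # selection: greedy criterion
        if cost(u) < cost(pop[i]):
            new_pop[i] = u
    return new_pop

# usage: minimize a simple quadratic cost with np = 10 individuals, d = 7 parameters
pop = rng.uniform(-5, 5, size=(10, 7))
for _ in range(200):
    pop = de_step(pop, cost=lambda x: float(np.sum(x**2)))
print(pop[np.argmin([np.sum(x**2) for x in pop])])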
the essence of range adaptation is as follows: check the mean values and the standard deviations of the decision variables every k generations. let us denote the old (after n generations) mean-value vector and its standard deviation as vold and σold, and the new (after n + k generations) mean-value vector and its standard deviation as vnew and σnew, respectively. if the i-th component of vnew differs from the i-th component of vold by more than a certain value, defined as ap∗ (σold)i (ap is a user-supplied constant 422 vol. 54 no. 6/2014 computational aerodynamic optimization of a low-speed wing – let us call it an adaptation parameter, and (σold)i is the i-th component of the old standard deviationvector), then generate the new reinitialized population in the interval ((vnew)i−(∆vnew)i, (vnew)i−(∆vnew)i), where ∆vnew = vnew − vold. in other words, let one third of (∆vnew)i be the new i-th component of the standard deviation-vector. one third is chosen, because the population is assumed to conform to the gaussian probability distribution, and 99.7 percent of it is contained within the interval (−3σ, +3σ). moreover, if the new standard deviation (σnew)i is less than a certain prescribed scalar value σmin, then (σnew)i is equal to σmin, to ensure that the evolution will not be attracted toward any local optimum. as can be seen from the above, this strategy strives to keep the evolution in a permanently “excited” state by continually perturbing it through forced modification of the population statistics. the values for ap usually go from 1.0 to 1.5, and σmin is mostly equal to 0.2. therefore, these are included as two additional parameters of the evolution. the number of individuals included in the population statistics ranges from the upper two to the whole population [7]. elitist-random reinitialization consists of putting some (usually two) pareto-archive members into the reinitialized population, which is subsequently selected randomly for mating. it usually utilizes micropopulations going down to four. however, ten-member populations are most commonly used in the evolution. at the beginning, elitist-random reinitialization was a multi-objective micro-genetic algorithm with rangeadaptation and elitist-random based reinitialization. details about our concept are given in [16]. however, it was necessary to broaden its scope by exploiting the new excellent features of differential evolution. after a redesign and a great deal of experimentation with redefining the basic de-operators to conform with multi-objective microevolution, the result has evolved into a multi-objective optimizer equipped with both genetic and differential evolution operators. the user can now choose between these according to a simple switch that indicates the type of evolution to be used during the optimization. a detailed description will be given in a paper which is currently under preparation. the evolution starts with a population generated by latin hypercube sampling. after the evaluation, the non-dominated individuals are put into the paretoarchive to update it. ea1 actually uses two populations: the first population contains only four to ten individuals to produce new information via applying evolutionary operators, and the second population contains the pareto archive. each new individual is evaluated by comparing it to the archive. if it is not dominated by any archive member, it is accepted as a new member, otherwise it is rejected. the pseudocode is as follows: (1.) 
initialize the population of p design candidates by randomly generated individuals through latin hypercube sampling. (2.) evaluate and update the archive. (3.) repeat until the criterion for stopping the optimization is met: for each design vector: (a) apply evolutionary operators (either genetic or differential); (b) evaluate and update the archive: every n generations: i. update the population statistics, ii. adapt the search range, iii. reinitialize the population by elitist-random reinitialization. 7. example of optimizing a wing by ea1 the method was used for optimizing a low-speed wing for the following problem. (1.) geometrical constraints: • naca 0012 symmetric airfoil along the whole span; • area of the wing sw = 16 m2; • span of the wing bw = 18 m; • the local chord c(z) must decrease monotonically towards the tip of the wing; • the leading and trailing edge created by the curve are defined by (1); • the twist is only linear from the wing root to the wing tip, maximum twist at the wing tip 3° in relation to the root; • longitudinal position of the 14 point of the mac x0.25 mac = x0.25 c root ± 0.05cmac. (2.) aerodynamic constraints: • cl max ≥ 1.25 for re = 1.5 · 106; • point of the beginning of the flow separation not farther than 0.65bw/2 from the wing root to the wing tip; • reynolds number is re = 1.5 · 106 (related to cmac). (3.) optimization criteria: in the wing polar curve diagram (relation cl = cl(cd )), to minimize the area s (do not confuse with sw wing area) constrained by: • value cd = 0 from the left side; • wing polar curve from the right side; • value cl = 0.1 from bottom; • value cl = 1.0 from the top. (4.) the differential evolution in ea1 has been set as follows: • population size: 4, 10, 30; • length of the design vectors: 7; 423 martin lahuta, zdeněk pátek, andrás szöllös acta polytechnica figure 2. pareto front, cl max as a function of the optimization criterion s. wing s cl max twist separation [°] [bw/2] 1 0.01287 1.363 0.49 0.139 2 0.01290 1.364 0.27 0.249 3 0.01292 1.365 0.27 0.249 table 1. three examples of optimized wings. • the fraction of the population included in the calculation of the population statistics: npop 1 for population sizes 4 and 10, and npop 3 for population size 30; • pareto-archive size: 300; (5.) the population was reinitialized in each generation: • adaptation parameter ap: 1.5, σmin: 0.4; • mutation parameter f: 10−10; • crossover parameter cr: 0.1. 7.1. results it is seen that the pareto front (figure 2) offers many wings which offer a low value of the drag optimization criteria and meet the prescribed constraints. three examples of the optimized wing planform geometry and twist are included in table 1 and in figure 3. the aerodynamic characteristics of these wings are given in table 1 and in figure 4. the optimized shapes are generally similar to the planforms of the wings of advanced sailplanes and motorgliders, so these preliminary tests indicate that the method can be used in practical applications. it can also be seen that the pareto front (figure 2) is very flat, as concerns the drag optimization criterion . this means that relatively high deviation from the optimum geometric extreme does not substantially affect the wing aerodynamic performance and the other constraints. this is a favourable result in figure 3. wings 1, 2 and 3. the sense that non-aerodynamic requirements (e.g., structural and manufacturing requirements) can be broadly applied in the wing design in this case. 8. 
conclusions a method for optimizing the shape of a low-speed wing has been developed and tested. the method combines the new ea1 evolutionary optimization algorithm with the proven nlwing2 aerodynamic solver, using nonlinear aerodynamic data as inputs. the initial tests seem to prove that the method is applicable for the preliminary design of the wing. list of symbols b wing span cd drag coefficient cl lift coefficient cl max maximum lift coefficient cm pitching moment coefficient c local wing chord cmac wing mean aerodynamic chord s value of the optimization criterion sw wing area 424 vol. 54 no. 6/2014 computational aerodynamic optimization of a low-speed wing figure 4. polar diagrams of the three optimized wings and of the rectangular and trapezoidal wing. α angle of attack � twist angle references [1] thomas, f.: fundamentals of sailplane design. college park press, college park, maryland 1999. [2] pajno, v.: sailplane design. macchione editore, varese 2006. [3] farin, g., hoschek, j., myung-soo, k.: handbook of computer aided geometric design, north holland 2002. isbn: 978-0-444-51104-1. doi:10.1016/b978-044451104-1/50000-9 [4] hájek, j.: nlwing2 software for viscous analysis of slender wings, vzlu report r-4722, výzkumný a zkušební letecký ústav, a.s., praha 2010 [5] octave community: gnu octave 3.6.4, 2014, http://www.gnu.org/software/octave/ [2014-12-01]. [6] coello coello, c. a.: an empirical study of evolutionary techniques for multiobjective optimization in engineering design. ph.d. thesis, department of computer science, tulane university, new orleans 1996 [7] coello coello, c. a., lamont, g. b. (eds.): applications of multi-objective evolutionary algorithms. world scientific, singapore 2004. doi:10.1142/5712 [8] coello coello, c. a. lamont, g. b., van veldhuizen, d. a. (2007). evolutionary algorithms for solving multi-objective problems. second edn. kluwer academic publishers, new york 2007. doi:10.1007/978-0-387-36797-2 [9] corne, d. w., knowles, j. d., oates, m. j.: the pareto envelope-based selection algorithm for multiobjective optimization. in: schoenauer. m., deb, k., rudolph, g., yao, x., lutton, e., merelo, j. j., schwefel, h .p. (eds.): proceedings of the parallel problem solving from nature vi conference, pp. 839–848. lecture notes in computer science no. 1917, springer, paris 2000. doi:10.1007/3-540-45356-3_82 [10] deb, k.: multi-objective optimization using evolutionary algorithms. john wiley & sons, chichester 2001. [11] fonseca, c. m., fleming, p. j.: genetic algorithms for multiobjective optimization: formulation, discussion and generalization. in: forrest, s. (ed.): proceedings of the fifth international conference on genetic algorithms, pp. 416-423. university of illinois at urbana-champaign, morgan kaufmann publishers, san mateo 1993 [12] goldberg, d. e.: genetic algorithms in search, optimization and machine learning. addison-wesley publishing company, reading 1989. doi:10.5860/choice.27-0936 [13] deb, k., pratap, a., agarwal, s., meyarivan, t.: a fast and elitist multiobjective genetic algorithm: nsga–ii. ieee transactions on evolutionary computation, 6 (2002) (2), pp. 182–197. doi:10.1109/4235.996017 [14] zitzler, e., laumanns, m., thiele, l.: spea2: improving the strength pareto evolutionary algorithm. in: giannakoglou k, tsahalis d, periaux j, papailou p, fogarty t (eds.) eurogen 2001. evolutionary methods for design, optimization and control with applications to industrial problems, athens, greece, pp. 
95–100 [15] storn, r., price, k.: differential evolution – a simple and efficient heuristic for global optimization over continuous spaces. journal of global optimization 11 (1997), pp. 341–359. doi:10.1023/a:1008202821328 [16] szőllős, a., šmíd, m., hájek, j., 2009: aerodynamic optimization via multi-objective micro-genetic algorithm with range adaptation, knowledge-based reinitialization, crowding and �-dominance. advances in engineering software, 40 (2009) (6) pp. 419–431. doi:10.1016/j.advengsoft.2008.07.002 425 http://dx.doi.org/10.1016/b978-044451104-1/50000-9 http://www.gnu.org/software/octave/ http://dx.doi.org/10.1142/5712 http://dx.doi.org/10.1007/978-0-387-36797-2 http://dx.doi.org/10.1007/3-540-45356-3_82 http://dx.doi.org/10.5860/choice.27-0936 http://dx.doi.org/10.1109/4235.996017 http://dx.doi.org/10.1023/a:1008202821328 http://dx.doi.org/10.1016/j.advengsoft.2008.07.002 acta polytechnica 54(6):420–425, 2014 1 introduction 2 optimization criteria and constraints 3 geometric description of the wing 4 aerodynamic description of the wing 4.1 airfoil sections 4.2 wing 5 method for computing the aerodynamic characteristics of the wing 6 optimization method 6.1 evolutionary optimization 6.2 differential evolution 6.3 the ea1 multi-objective evolutionary optimizer 7 example of optimizing a wing by ea1 7.1 results 8 conclusions list of symbols references acta polytechnica doi:10.14311/ap.2013.53.0416 acta polytechnica 53(5):416–426, 2013 © czech technical university in prague, 2013 available online at http://ojs.cvut.cz/ojs/index.php/ap resonances on hedgehog manifolds pavel exnera,b,∗, jiří lipovskýc a doppler institute for mathematical physics and applied mathematics, czech technical university, břehová 7,11519 prague, czechia b department of theoretical physics, nuclear physics institute as cr, 25068 řež near prague, czechia c department of physics, faculty of science, university of hradec králové, rokitanského 62, 50003 hradec králové, czechia ∗ corresponding author: exner@ujf.cas.cz abstract. we discuss resonances for a nonrelativistic and spinless quantum particle confined to a twoor three-dimensional riemannian manifold to which a finite number of semiinfinite leads is attached. resolvent and scattering resonances are shown to coincide in this situation. next we consider the resonances together with embedded eigenvalues and ask about the high-energy asymptotics of such a family. for the case when all the halflines are attached at a single point we prove that all resonances are in the momentum plane confined to a strip parallel to the real axis, in contrast to the analogous asymptotics in some metric quantum graphs; we illustrate this on several simple examples. on the other hand, the resonance behaviour can be influenced by a magnetic field. we provide an example of such a ‘hedgehog’ manifold at which a suitable aharonov-bohm flux leads to absence of any true resonance, i.e. that corresponding to a pole outside the real axis. keywords: hedgehog manifolds, weyl asymptotics, quantum graphs, resonances. submitted: 21 february 2013. accepted: 17 april 2013. 1. introduction a study of quantum systems the configuration space of which is geometrically and topologically nontrivial has proved to be a fruitful subject both theoretically and practically. a lot of attention has been paid to quantum graphs — a survey and a guide to further reading can be found in [6, 14]. 
together with this other systems have been studied which one can regard as a generalization of quantum graphs where the ‘edges’ may have different dimensions; using the theory of self-adjoint extensions one can construct operator classes which serve as hamiltonians of such models [19]. one sometimes uses a pictorial term ‘hedgehog manifold’ for a geometrical construct consisting of riemannian manifolds of dimension two or three together with line segments attached to them. in this paper we consider the simplest situation when we have a single connected manifold to which a finite number of semiinfinite leads are attached — one is especially interested in transport in such a system. particular models of this type have been studied, e.g., in [7, 8, 18, 20, 25]. again for the sake of simplicity we limit ourselves mostly to the situation when there are no external fields; the hamiltonian will act as the negative second derivative on the halflines representing the leads and as laplace-beltrami operator on the manifold. we have said that quantum motion on hedgehog manifolds can be regarded as a generalization of quantum graphs. it is therefore useful to compare similarities and differences of the two cases, and we will recall at appropriate places in the text how the claims look when the riemannian manifold is replaced by a compact metric graph. the first question one has to pose when resonances are discussed is what is meant by this term1. the two prominent instances are resolvent resonances identified with poles of the analytically continued resolvent of the hamiltonian and scattering resonances where we look instead into the analytical structure of the on-shell scattering operator. while the two often coincide, in general it may not be so; recall that the former are the property of a single operator while the latter refer to the pair of full and unperturbed hamiltonians, and often also a third one, an identification operator, which one uses if the two hamiltonians act on different hilbert spaces [27]. the first question we will thus address deals with the two resonance definitions for quantum motion on hedgehog manifolds. using an exterior complex scaling we will show that in this case both notions coincide and one is thus allowed to speak about resonances without a further specification. the result is the same as for quantum graphs [15, 16] and, needless to say, in many other situations. the next question to be addressed in this paper concerns the high energy behaviour of the resonances 1investigations of resonances in quantum systems have a long history. for a survey of classical results see, e.g., chap. 3 of [13]. there are also various newer results, in particular, attention has been paid recently to perturbation of eigenvalues near the threshold [12]. 416 http://dx.doi.org/10.14311/ap.2013.53.0416 http://ojs.cvut.cz/ojs/index.php/ap vol. 53 no. 5/2013 resonances on hedgehog manifolds which is, for this purpose, useful to count together with the eigenvalues. note that if a hedgehog manifold with a finite number of junctions is compact having finite line segments, its spectrum is purely discrete and an easy estimate yields the spectral behaviour at high energies. it follows the usual weyl’s law [30], and moreover, it is determined by the manifold component with the highest dimension, that is, in our case, the riemannian manifold [24]. if, on the other hand, the leads are semiinfinite, the essential spectrum covers the positive real axis. 
in contrast to the usual schrödinger operator theory, it often contains embedded eigenvalues; this happens typically when the laplace-beltrami operator which is the manifold part of the hamiltonian has an eigenfunction with zeros at the hedgehog junctions. since such eigenvalues are unstable — a geometrical perturbation turns them generically into resonances — it is natural to count them together with the ‘true’ resonances; one then asks about the asymptotics of the number of such singularities enclosed in the circle of radius r in the momentum plane. this question is made intriguing by the recent observation [10, 11] that in some quantum graphs the asymptotics may not be of weyl type. the reason behind this effect is that symmetries, maybe not apparent ones, may effectively diminish the graph size making a part of it effectively belongs to a lead instead. the mechanism uses the fact that all the edges of a quantum graph are one-dimensional, and one may expect that such a thing would not happen on hedgehog manifolds where the particles are forced to ‘change dimension’ at the junctions. we are going to give a partial confirmation of this conjecture by showing that, in contrast to the quantum-graph case, the resonances cannot be found at arbitrary distance from the real axis in the momentum plane as long as the leads are attached at a single point of the manifold. the third and the last question addressed here is again inspired by an observation about quantum graphs. it has been noted that a magnetic field can change the effective size of some quantum graphs with a non-weyl asymptotics [17]: if we follow the resonance poles as functions of the field we observe that at some field values they move to (imaginary) infinity and the resonances disappear. on hedgehog manifolds the situation is different, though a similar effect may again occur; we will present a simple example of such a system in which a suitable aharonov-bohm field removes all the ‘true’ resonances, i.e. those with pole position having nonzero imaginary part. 2. description of the model let us first give a proper meaning to what we described above as quantum motion on a hedgehog manifold; doing so we generalize previously used definitions — see, e.g. [7, 8, 18] — by allowing more than a single semiinfinite lead be attached at a point of the manifold. consider a compact and connected riemannian figure 1. example of a hedgehog manifold manifold ω ∈ rn, n = 2, 3, endowed with metric grs. the manifold may or may not have a boundary, in the latter case we suppose that ∂ω is smooth. we denote by γ the geometric object consisting of ω and a finite number nj of halflines attached at points xj, j = 1, . . . ,n belonging to a finite subset {xj} of the interior of ω — see figure 1 — we will employ the term hedgehog manifold, or simply manifold if there is no danger of misunderstanding. by m = ∑ j nj we denote the total number of the halflines. the hilbert space we are going to consider consists of direct sum of the ‘component’ hilbert spaces, in other words, its elements are square integrable functions on every component of γ, h = l2 ( ω, √ |g|dx ) ⊕ m⊕ i=1 l2 ( r(i)+ ) , where g stands for det(grs) and dx for lebesque measure on rn. let h0 be the closure of the laplace-beltrami operator2 −g−1/2∂r(g1/2grs∂s) with the domain consisting of functions in c∞0 (ω); if the boundary of ω is nonempty we require that they satisfy at it appropriate boundary conditions, either neumann/robin, (∂n + γ)f|∂ω = 0, or dirichlet, f|∂ω = 0. 
the domain of h0 coincides with w 2,2(ω) which, in particular, means that f(x) makes sense for f ∈ d(h0) and x ∈ ω. the restriction h′0 of h0 to the domain{ f ∈ d(h0) : f(xj) = 0, j = 1, . . . ,n } is a symmetric operator with deficiency indices (n,n), cf. [7, 8]. furthermore, we denote by hi the negative laplacian on l2(r(i)+ ) referring to the i-th halfline and by h′i its restriction to functions which vanish together with their first derivative at the halfline endpoint. since each h′i has deficiency indices (1, 1), the direct sum h′ = h′0 ⊕ h′1 ⊕···⊕ h′m is a symmetric operator with deficiency indices (n + m,n + m). the family of admissible hamiltonians for quantum motion on the hedgehog manifold γ can be identified with the self-adjoint extensions of the operator h′. the procedure for constructing them using the 2as mentioned above, we make this assumption for the sake of simplicity and most considerations below extend easily to schrödinger type operators −g−1/2∂r (g1/2grs∂s) + v (x) provided the potential v is sufficiently regular. 417 p. exner, j. lipovský acta polytechnica boundary-value theory was described in detail in [8]. it is a modification of the analogous result from the quantum graph theory [23], and in a broader context of a known general result [22]. all the extensions are described by the coupling conditions (u − i)ψ + i(u + i)ψ′ = 0, (1) where u is an (n + m) × (n + m) unitary matrix, i the corresponding unit matrix and ψ = ( d1(f), . . . ,dn(f),f1(0), . . . ,fn(0) )t , ψ′ = ( c1(f), . . . ,cn(f),f′1(0), . . . ,f ′ n(0) )t are the columns of (generalized) boundary values. the first n entries correspond to the manifold part being equal to the leading and next-to-leading terms of the asymptotics of f(x) on ω in the vicinity of xj, while fi(0),f′i(0) describe the limits of the wave function and its first derivative on i-th halfline, respectively. more precisely, according to lemma 4 in [8], for f ∈ d(h∗0 ) the asymptotic expansion near xj has the form f(x) = cj(f)f0(x,xj) + dj(f) + o(r(x,xj)), where f0(x,xj) = { −q2(x,xj)2π ln r(x,xj), n = 2 q3(x,xj) 4π ( r(x,xj) )−1 , n = 3 (2) here q2,q3 are continuous functions of x with qi(xj,xj) = 1 and r(x,xj) denotes the geodetic distance between x and xj. function f0 is the leading term, independent of energy, of the green function asymptotics near xj, i.e. g(x,xj; k) = f0(x,xj) + f1(x,xj; k) + r(x,xj; k) with the remainder term r(x,xj; k) = o ( r(x,xj) ) . the self-adjoint extension of h′ determined by the condition (1) will be denoted as hu; we will drop the subscript if the choice of u is either clear from the context or not important. not all self-adjoint extensions, however, make sense in general from the physics point of view. the reason is that for n > 1 one finds among them such extensions which would allow the particle living on γ to hop from one junction to another junction. we restrict our attention in what follows to local couplings for which such a situation cannot occur. they are described by matrices which are block diagonal, so that such a u does not connect disjoint junction points xj. the coupling condition (1) is then a family of n conditions, each referring to a particular xj and coupling the corresponding (sub)columns ψj and ψ′j by means of the respective block uj of u. 
before proceeding further we will mention a useful trick, known from quantum-graph theory [16], which allows to study a compact scatterer with leads looking at its ‘core’ alone replacing the leads by effective coupling at the points xj which is a non-selfadjoint, energy-dependent point interaction, namely( ũj(k) − i ) dj(f) + i ( ũj(k) + i ) cj(f) = 0, (3) ũj(k) = u1j − (1 −k)u2j [ (1 −k)u4j − (k + 1)i ]−1 u3j, where u1j denotes the top-left entry of uj, u2j the rest of the first row, u3j the rest of the first column and u4j is nj ×nj part corresponding to the coupling between the halflines attached to the manifold. this can be easily checked using the standard argument ascribed, in different fields, to different people such as schur, feshbach, grushin, etc. 3. scattering and resolvent resonances the model described in the previous section provides a natural framework for studying scattering on the hedgehog manifold. the existence of scattering is easy to establish because any hamiltonian of the considered class differs from one with decoupled leads by a finite-rank perturbation in the resolvent [27]. finding the on-shell scattering matrix is computationally slightly more complicated but simple in principle. the solution of the sschrödinger equation on the jth external lead with energy k2 can be expressed as a linear combination of the incoming and outgoing waves, aj(k)e−ikx + bj(k)eikx. the scattering matrix then maps the vector of incoming wave amplitudes aj into the vector of outgoing wave amplitudes bj. we emphasize that our convention, which is natural in this context and analogous to the one used in quantum-graph theory [26] differs from the one employed when scattering on the real line is treated [29], in that each lead is identified with the positive real halfline. for the case of two leads, in particular, it means that columns of the 2 × 2 scattering matrix are interchanged and we have lemma 3.1. the on-shell scattering matrix satisfies s(k)−1 = s(−k) = s∗(k̄), where star and bar denote the hermitian and complex conjugation, respectively. proof. the claim follows directly from the definition of scattering matrix and from the properties of the schrödinger equation and its external solutions. since the potential is absent we infer that if f(k0,x) is a solution of the schrödinger equation for a given k, so is f(−k0,x). this means that s(k) can be regarded both as an operator mapping {aj(k)} to {bj(k)} and as a map from {bj(−k)} to {aj(−k)}, i.e. as the inverse of s(−k). in an similar way one can establish the second identity. remark 3.2. if ω is replaced by a compact metric graph and the potential is again absent, the s-matrix can be written as s(k) = −f (k)−1 ·f (−k) where the m ×m matrix f(k) is an analogue of jost function. 418 vol. 53 no. 5/2013 resonances on hedgehog manifolds in particular, s(k) is unitary for k ∈ r, however, we will need it also for complex values of k. by a scattering resonance we conventionally understand a pole of the on-shell scattering matrix in the complex plane, more precisely, the point at which some of its entries have a pole singularity. a resolvent resonance, on the other hand, is identified with a pole of the resolvent analytically continued from the upper complex halfplane to a region of the lower one. a convenient and efficient way of treating resolvent resonances is the method of exterior complex scaling based on the ideas of aguilar, baslev, combes, and simon — cf. 
[2, 5, 28], for a recent application to the case of quantum graphs see [11, 15, 16]). resonances in this approach become eigenvalues of the non-selfadjoint operator hθ = uθhu−1θ obtained by scaling the hamiltonian outside a compact region with the scaling parameter taking a complex value eθ; if im θ is large enough, the rotated essential spectrum reveals a part of the ‘unphysical’ sheet with the poles being now true eigenvalues corresponding to square integrable eigenfunctions. in the present case we identify, by analogy with the quantum-graph situation mentioned above, the exterior part of γ with the leads, and scale the wave function at each them using the transformation (uθf)(x) = eθ/2f(eθx), which is, of course, unitary for real θ, while for a complex θ it leads to the desired rotation of the essential spectrum. to use it, we first state a useful auxiliary result. lemma 3.3. let h|ω be the restriction of an admissible hamiltonian to ω and suppose that f(·,k) satisfies h|ωf(x,k) = k2f(x,k) for k2 6∈ σ(h0), then it can be written as a particular linear combination of green functions of h0, namely f(x,k) = n∑ j=1 cjg(x,xj; k). proof. the claim is a straightforward generalization of lemma 2.2 in [25] to the situation where n > 2 and more general couplings are imposed at the junctions. suppose that k is not an eigenvalue of h0. the green functions with one argument fixed at different points xj are clearly linearly independent, hence dom h′∗ = w 2,2(ω) ⊕ (span{g(x,xj; k)}nj=1), and without loss of generality one can write f(x,k) =∑n j=1 cjg(x,xj; k)+g(x) with g ∈ w 2,2(ω). however, then we would have h|ωg(x) = h0g(x) = k2g(x) for x 6= xj, and since k2 6∈ σ(h0) and g ∈ w 2,2(ω) by assumption, it follows that g = 0. theorem 3.4. in the described setting, the hedgehog system has a scattering resonance at k0 with im k0 < 0 and k20 6∈ r iff there is a resolvent resonance at k0. algebraic multiplicities of the resonances defined in both ways coincide. proof. consider first the scattering resonances. the starting point is the generalized eigenfunction describing the scattering at energy k2 and its analytical continuation to the lower complex halfplane. from the previous lemma we know that for k2 6∈ σ(h0) the restriction of the appropriate schrödinger equation solution to the manifold is a linear combination of at most n green functions; we denote the corresponding vector of coefficients by c. the relation between these coefficients and amplitudes of the outgoing and incoming wave is given by (1). using a as a shortcut for the vector of the amplitudes of the incoming waves,( a1(k), . . . ,am (k) )t, and similarly b for the vector of the amplitudes of the outgoing waves one obtains in general system of equations a(k)a + b(k)b + c(k)c = 0, (4) in which a and b are (n + m) ×m matrices and c is (n + m) ×n matrix the elements of which are exponentials and green functions, regularized if needed — recall that n is the number of internal parameters associated with the junctions and m is the number of the leads. what is important that all the entries of the mentioned matrices allow for an analytical continuation which makes it possible to ask for solution of equations (4) for k = k0 from the open lower complex halfplane. it is obvious that for k20 6∈ r the columns of c(k0) have to be linearly independent; otherwise k20 would be an eigenvalue of h with an eigenfunction supported on the manifold ω only. 
hence there are n linearly independent rows of c(k0) and after a rearrangement in equations (4) one is able to express c from the first n of them. substituting then to the remaining equations one can rewrite them in the form ã(k0)a + b̃(k0)b = 0 (5) with ã(k0) and b̃(k0) being m × m matrices the entries of which are rational functions of the entries of the previous ones. suppose that det ã(k0) = 0, then there exists a solution of the previous equation with b = 0, and consequently, k0 is an eigenvalue of h since im k0 < 0 and the corresponding eigenfunction belongs to l2, however, this contradicts to the self-adjointness of h. now it is sufficient to note that the s-matrix analytically continued to point k0 equals −b̃(k0)−1ã(k0) hence its pole singularities are solutions of the equation det b̃(k) = 0. let us turn to resolvent resonances and consider the exterior complex scaling transformation uθ with im θ > 0 large enough to reveal the sought pole on the second sheet of the energy surface. choosing arg θ > arg k0 we find that the solution aj(k)e−ikx on the j-th lead, analytically continued to the point k = k0, is after the transformation by uθ exponentially increasing, while bj(k)eikx becomes square integrable. this means that solving in l2 the eigenvalue problem for the non-selfadjoint operator hθ obtained from 419 p. exner, j. lipovský acta polytechnica h = hu one has to find solutions of (5) with a = 0. this leads again to the condition det b̃(k) = 0 thus concluding the proof. remarks 3.5. (1.) it may happen, of course, that at some junctions the leads are disconnected from the manifold since the conditions (1) are locally separating and define a point interaction at those points, or that a junction coincides with a zero of an eigenfunction of h0. in such situations it may happen that h has eigenvalues, either positive, embedded into the continuous spectrum, or negative. in terms of the momentum variable k, these eigenvalues appear in pairs symmetric w.r.t. the origin. (2.) in the case of separating conditions (1) it may happen that the complex-scaled operator hθ has an eigenvalue in k20 with eigenfunction supported outside ω, then k0 is also a pole of s(k); the poles multiplicities of the resolvent and the scattering matrix may differ in this situation. (3.) in the quantum-graph analogue when ω is replaced by a compact metric graph the decomposition of lemma 3.3 cannot be used since the deficiency indices of h′0 may, in general, exceed n and one can have extensions with the wave functions discontinuous at the junctions. the role of the internal parameters is instead played by the coefficients of two linearly independent solutions on each (internal) edge. 4. resonance asymptotics the aim of this section is to say something about the asymptotic behaviour of the resolvent poles with respect to an increasing family of regions which cover in the limit the whole complex plane. using lemma 3.3 let us write the manifold part f(·,k) of a function from the deficiency subspace of h′ as a linear combination of green functions of hω, acting as negative laplace-beltrami operator on ω. for k2 6∈ σ(hω) and x in the vicinity of the point xi we then have f(x,k) = n∑ j=1 cjg(x,xj; k) = cif0(x,xi) + cif1(x,xi; k) + n∑ i 6=j=1 cjg(xi,xj; k) + o ( r(x,xi) ) which makes it easy to find the generalized boundary values ci(f) and di(f) to be inserted into the coupling conditions (1), or effective conditions (3). we will employ the latter with matrix ũ(k) = diag ( ũ1(k), . . . 
, ũn(k) ) whose blocks correspond to junctions of γ. we introduce q0(k) = { g(xi,xj; k), i 6= j f1(xi,xi; k), i = j which allows us to write d(f) = q0(k)c, and substituting into (3) we can write the solvability of the system which determines the resonances as det [( ũ(k) − i ) q0(k) + i ( ũ(k) + i )] = 0. (6) we note that the matrices ũj(k) entering this condition may be singular, however, this may happen for at most m values of k, taking all the conditions together. we also note that if the hamiltonian h has an eigenvalue k2 embedded in its continuous spectrum covering the interval r+, the corresponding k > 0 also solves the equation (6). hence, as we have indicated in the introduction, from now on — for purpose of this section — we will include such embedded eigenvalues among resonances. having formulated the resonance condition we can ask how its solutions are distributed. to count zeros of a meromorphic function we employ the following auxiliary result. lemma 4.1. let g be meromorphic function in c and suppose that it has no pole or zero on the circle cr = {z : |z| = r}. then the difference between the number of the zeros and poles of g in the disc of radius r of which cr is the perimeter is given by∫ cr g(z)′ g(z) dz, with prime denoting the derivative with respect to z, or equivalently, it is the difference between the number of jumps of the phase of g(z) from 2π to 0 along the circle cr and the jumps from 0 to 2π. proof. the argument is well known, and we present it just for the sake of completeness. suppose that g(·) has at z0 a zero of multiplicity s, i.e. g(z) = h(z)(z − z0)s with neither h(z) nor h(z)′ having a zero or a pole at z0. using g(z)′ g(z) = h(z)′ h(z) + s z −z0 we find resz0 g(z)′ g(z) = 1 (s− 1)! lim z→z0 ds−1 dzs−1 ( (z −z0)s g(z)′ g(z) ) = s; a similar result differing only by the sign is obtained in the case of a pole of multiplicity s. consequently, the function g(·)′/g(·) has pole with the residue s if g(·) has zero of multiplicity s, and it has pole with the residue −s at points where g(·) has pole of multiplicity s. furthermore, g(·)′ does not have a pole at z0 as long as g(·) does not which can be easily seen from the appropriate laurent series. using now the residue theorem we arrive at the desired integral expression. the claim about the number of phase jumps follows from the fact that g′(z)/g(z) = ( ln g(z) )′. 420 vol. 53 no. 5/2013 resonances on hedgehog manifolds 4.1. manifolds with the leads attached at a single point we shall consider the situation when γ has a single junction, i.e. there is a point x0 ∈ ω at which all the m halfline leads are attached. then the matrix function q0(k) is reduced to dimension one and it coincides with the regularized green function f1(x0,x0; k); for simplicity in the rest of this subsection we drop x0 from the argument. the resonance condition (6) then becomes ( ũ(k) − 1 ) f1(k) + i ( ũ(k) + 1 ) = 0; we use the lower-case symbol to stress that ũ(k) is just a number in this case. the aim is now to establish the high-energy asymptotics. if we exclude the case of ũ(k) = 1 when the leads are obviously decoupled from the manifold and the motion on ω is described by the hamiltonian h0, we can without loss of generality rewrite the resonance condition as f(k) := f1(k) + i ũ(k) + 1 ũ(k) − 1 = 0. from (3) we see that ũ(·) − i is a rational function. consequently, it may add zeros or poles to those of f1(·), however, their number is finite and bounded by m and n = 1, respectively. 
the main thing is thus to find the behaviour of f1(k), in particular, its asymptotics for k far enough from the real axis. lemma 4.2. the asymptotics of the regularized green function is the following: (1.) for d = 2 we have f1(k) = 12π ( ln(±ik) − ln 2 − γe ) + o(|im k|−1) if ∓im k > 0, (2.) for d = 3 we have f1(k) = ± ik4π + o(|im k| −1 ) if ∓im k > 0, where γe stands for the euler constant. proof. the claim can be easily verified by reformulating results of avramidi [3, 4] on high-mass asymptotics of the operator −∆ + m2 as m →∞. more precisely, it follows from the stated asymptotics of equations (33) and (36) in [4] in combination with the expression for bq given in [3]. the constant a0 in [4] can be determined from the form of the singular parts of green function to fit with our convention. remark 4.3. in the case of a graph with a compact core corresponding to d = 1, which we use for comparison, one has instead f1(k) = ± 12ik + o(|im k| −2) for ∓im k > 0. now we can use the previous lemmata to prove the main result of this section. theorem 4.4. consider a manifold ω, dim ω = 2, and let hu be the hamiltonian on ω with several halflines attached at a single point by coupling condition (1). then all the resonances of this system are located in the k-plane within a finite-width strip parallel to the real axis. proof. from equation (3) it follows that ũ(k) and subsequently also the expression i ( ũ(k) − 1 )−1( ũ(k) + 1 ) is a rational function of the momentum variable k. hence there exists such a constant c that for |im k| > c the leading term of f (k) behaves either like a multiple of ln |im k| or like the leading term of previously mentioned rational function. the constant c can be chosen such that the contribution of the rest of f does change substantially the phase of f. more precisely, we then have |kn| ≥ cn and ∣∣ln (∓ik)∣∣ = ∣∣∣ln |k|∓ π 2 + arg k ∣∣∣ ≥ 1 2 ln c for large enough c, and consequently, the dominant phase behaviour of ank n ( 1 + ln(∓ik) + ∑n−1 j=0 ajk j + o(|k|) ankn ) and ln(∓ik) ( 1 + c + o(|k|) ln(∓ik) ) is determined by the terms in front of the brackets, in particular, there are finitely many jumps of the phase of f between zero and 2π along the part of the circle |k| = r in the region |im k| > c with c sufficiently large. in other words, all but finitely many resonances can be found within the strip |im k| < c, hence all resonances are located within some strip parallel to the real axis in the momentum plane. theorem 4.5. let d = 3, and let hu be hamiltonian on ω with several halflines attached at one point by coupling condition (1). then all resonances of hu are located within a strip parallel to the real axis. proof. in the case when the coupling term does not coincide with the first term of asymptotics of f1, i.e. (ũ(k) − 1)−1(ũ(k) + 1) 6= ± k4π , one can employ the same arguments as in previous theorem. let us check that no unitary matrix can lead to such an effective coupling matrix. if it were the case we would have ũ(k) = − 4π ±k 4π ∓k . (7) assume that (7) holds true for some unitary matrix u. for the upper sign the expression diverges at k = 4π; this contradicts the unitarity of u, by which its modulus must not exceed one. let us now turn to the lower-sign case. using u4 = v −1dv , u2v = u2v −1 and u3v = v u3 with a diagonal d and a unitary v , the equation (7) becomes − 4π −k 4π + k = u1 −u2v ( d − 1 + k 1 −k i )−1 u3v . let us find how the right-hand side behaves in the vicinity of −4π choosing k = −4π + ε. 
if none of the eigenvalues of d equals 1−4π1+4π the relation cannot be valid as ε → 0. furthermore, combining (7) with 421 p. exner, j. lipovský acta polytechnica the behaviour of the last expression as k → 1, we find u1 = 1−4π1+4π . were there two or more eigenvalues of d equal to 1−4π1+4π , we could conclude that the vector (u1,u3v )t has norm bigger than one, which contradicts the unitarity of u. consequently, there is exactly one eigenvalue 1−4π1+4π of matrix d. since the rows and columns of u4 can be rearranged, we may assume that it is the first eigenvalue in which case we have − 8π ε = u1 − 1 + u (1) 2v u (1) 3v (1 + 4π)(1 + 4π + ε) 2ε −u′2v ( d′ − 1 − 4π + ε 1 + 4π −ε i )−1 u′3v , where u(1)2v and u (1) 3v are the first entries of u2v and u3v , and u′2v and u ′ 3v is rest of the row/column, respectively. to match the ε−1 terms on both sides of the last equation, the identity −8π = (1 + 4π)2 2 u (1) 2v u (1) 3v has to be valid. from this and the unitarity of u it follows that |u(1)2v | = |u (1) 3v | = √ 1 − (1−4π 1+4π )2. using the unitarity again, we note that the column (u1,u3v )t must have norm equal to one, and since we know already that u1 = 1−4π1+4π , it follows that u ′ 2v = u ′ 3v = 0. equation (7) now becomes − 4π −k 4π + k = 1 − 4π 1 + 4π − ( 1 − (1 − 4π 1 + 4π )2) eiϕ (1 − 4π 1 + 4π − 1 + k 1 −k )−1 with ϕ being the phase of u(1)2v u (1) 3v . this clearly cannot be true for all k, which can be seen, for instance, from observing the limit k →∞; in this way we come to a contradiction. the two previous claims show that, unlike the case of quantum graphs, one cannot find a sequence of resonances which would escape to imaginary infinity in the momentum plane; in the next section we provide a couple of examples illustrating the comparison between the one-dimensional case and the twoand three-dimensional cases. let us stress, however, that any such sequence tends to infinity along the real axis, and thus the above results do not answer the question stated in the opening, namely whether the resonance asymptotics always has a weyl character. also, we postpone to another paper discussion of the case when halflines are attached at two and more points; we note that the green function on the manifold between two distinct points does not depend only on local curvature properties, but also nontrivially on the structure of the whole manifold. also the resonance trajectories from the fully decoupled case to the coupled case will be left for another paper. l� l 0 figure 2. a ‘thin-hedgehog’ manifold, d = 1 4.2. examples to illustrate how the resonance asymptotical behaviour depends on the dimension of ω we now consider two examples of a planar manifold with dirichlet boundary conditions in dimensions one and two. first we illustrate the difference between dimensions one and two in a pair of examples, that at first glance may appear similar. in the former case one is able to adjust the parameters to obtain a non-weyl graph for which one half of the resonances escape to imaginary infinity, hence the number of phase jumps along the circle of increasing radius increases. on the other hand, we do not observe such behaviour in dimension two. example 4.6. in the case dim ω = 1 we consider an abscissa of length 2l with m halflines attached in the middle — cf. figure 2. we impose dirichlet boundary conditions at its endpoints, f(−l) = f(l) = 0, and condition (1) at the middle. 
the green function of the operator h0 is given by g(x,y; k) = f1(x,y; k) = ∑ n ψ̄n(x)ψn(y) λn −k2 = 1 l ∞∑ n=1 cos (2n−1)πx2l cos (2n−1)πy 2l((2n−1)π 2l )2 −k2 . substituting, in particular, x = y = 0 one obtains f1(0, 0; k) = 1 l ∞∑ n=1 1((2n−1)π 2l )2 −k2 = 12k tan kl. substituting this into the resonance condition one can check easily that resonance count asymptotics has a non-weyl character if the coupling is chosen as follows, i ũ(k) + 1 ũ(k) − 1 = ± i 2k ⇒ ũ(k) = −2k ∓ 1 2k ∓ 1 . the upper-sign choice can be realized, e.g. by taking m = 2 and connecting the halflines with the abscissa by kirchhoff conditions — see [10, 11]. this corresponds to a ‘balanced’ vertex connecting two internal and two external edges. the phase of the regularized green function f1 and the left-hand side of the resonance condition for the above choice of the coupling, f1 + i2k, can be seen in figures 3 and 4, respectively; figure 4 illustrates how the number of phase jumps increases for an increasing radius of the circle. 422 vol. 53 no. 5/2013 resonances on hedgehog manifolds figure 3. phase of the green function for d = 1 figure 4. phase of the green function plus i2k example 4.7. consider next an analogous situation in two dimensions — a flat circular drum of radius l with dirichlet boundary condition at r = r and m halflines attached in its center. because of the rotational symmetry, the green function with one argument fixed at y = 0 can be expressed as a combination of bessel functions, g(x, 0; k) = − 1 4 y0(kr) + c(k)j0(kr), where r := |x| and j0 and y0 are bessel functions of the first and second kind, respectively. the constant by y0 is chosen so that g satisfies (2). we employ the well-known asymptotic behaviour of bessel functions, y0(x) ∼− 2 π (ln x/2 + γ), j0(x) ∼ 1 as x → 0, which yields the expression f1(k) = − 1 2π (ln k − ln 2 + γ) + y0(kr) 4j0(kr) . using the asymptotics of j0(x) and y0(x) as x →∞, one finds that the second term on the right-hand side behaves as 14 tan ( kr − π4 ) and its absolute value is therefore bounded for k outside the real axis. the phase of f1 for r = π is plotted in figure 6. r figure 5. the hedgehog manifold of example 4.7 figure 6. phase of the regularized green function for the hedgehog manifold of example 4.7 r figure 7. a disc with a lead in a magnetic field 5. resonances for a hedgehog manifold in magnetic field now we are going to present an example showing that an appropriately chosen magnetic field can remove all ‘true resonances’ on a hedgehog manifold, i.e. those corresponding to poles in the open lower complex halfplane. we note that this does not influence the semiclassical asymptotics in this case, because the embedded eigenvalues of the system corresponding to higher partial waves with eigenfunctions vanishing at the junction will persist being just shifted. the manifold of our example will consist of a disc of radius r with a halfline lead attached at its centre. for definiteness we assume that it is perpendicular to the disc plane, cf. figure 7. the disc is parametrized by polar coordinates r, ϕ, and dirichlet boundary conditions are imposed at r = r. we suppose that the system is under the influence of a magnetic field in the form of an aharonov-bohm string which coincides 423 p. exner, j. lipovský acta polytechnica in the ‘upper’ halfspace with the lead. the effect of an aharonov-bohm field piercing a surface has been studied in numerous papers — see, e.g., [1, 9, 21] — so we can just modify those results for our purpose. 
the idea is that the ‘true’ resonances will disappear if we manage to choose such a coupling in which the radial part of the disc wave function will match the halfline wave function in a trivial way. we write the hilbert space of the model as h = l2 ( (0,r),rdr ) ⊗ l2(s1) ⊕ l2(r+); the admissible hamiltonians are then constructed as selfadjoint extensions of the operator ḣα acting as ḣα ( u f ) = ( −∂ 2u ∂r2 − 1 r ∂u ∂r + 1 r2 ( i ∂ ∂ϕ −α )2 u −f′′ ) on the domain consisting of functions ( u f ) with u ∈ h2loc ( br(0) ) satisfying u(0,ϕ) = u(r,ϕ) = 0 and f ∈ h2loc(r +) satisfying f(0) = f′(0) = 0. the parameter α in the above expression is the magnetic flux of the aharonov-bohm string in the units of the flux quantum; since an integer value of the flux plays no role in view of the natural gauge invariance we may restrict our attention to the values α ∈ (0, 1). using the partial-wave decomposition together with the standard unitary transformation (v u)(r) = r1/2u(r) to the reduced radial functions we get ḣα = ∞⊕ m=−∞ v −1ḣα,mv ⊗ i where the component ḣα,m acts on the upper component of ψ = ( φ f ) as ḣα,mφ = − d2φ dr2 + (m + α)2 − 1/4 r2 φ. (8) to construct the self-adjoint extensions of ḣα which describe the coupling between the disc and the lead the following functionals can be used, φ−11 (ψ) = √ π lim r→0 r1−α 2π ∫ 2π 0 u(r,ϕ)eiϕdϕ, φ−12 (ψ) = √ π limr→0 r −1+α 2π [∫ 2π 0 u(r,ϕ)e iϕdϕ −2 √ πr−1+αφ1−1(ψ) ] , φ01(ψ) = √ π lim r→0 rα 2π ∫ 2π 0 u(r,ϕ)dϕ, φ02(ψ) = √ π limr→0 r −α 2π [∫ 2π 0 u(r,ϕ)dϕ −2 √ πr−αφ01(ψ) ] , φh1 (ψ) = f(0), φ h 2 (ψ) = f ′(0). the first two of them are, by analogy with [9], multiples of the coefficients of the two leading terms of asymptotics as r → 0 of the wave functions from ḣ∗α belonging to the subspace with m = −1, the second two correspond to the analogous quantities in the subspace with m = 0, and the last two are the standard boundary values for the laplacian on a halfline. it is obvious that if the s-wave resonances should be absent, one has to get rid of the second term in the expression (8) for the m = 0 function, hence we will restrict our attention to the case α = 1/2. by analogy with the case of an aharonov-bohm flux piercing a plane treated in [9], one obtains (ψ1,hψ2) = − ∫ 2π 0 ∫ r 0 u1 r −1/2 d 2 dr2 r1/2u2 r dr dϕ − ∫ ∞ 0 f1f2 ′′ dx = − ∫ 2π 0 ∫ r 0 ũ1ũ2 ′′ dr dϕ − ∫ ∞ 0 f1f2 ′′ dx = − ∫ 2π 0 ũ1ũ2 ′ dϕ + ∫ 2π 0 ∫ r 0 ũ1 ′ ũ2 ′ dr dϕ−f1(0+)f′2(0+) + ∫ ∞ 0 f1 ′ f2 ′ dx, where ũa = r1/2ua, a = 1, 2, is a multiple of the disc component of ua (with the prime denoting the derivative with respect to r) and fa is the corresponding halfline component. hence we have (ψ1,hψ2) − (hψ1,ψ2) = lim r→0 ∫ 2π 0 [ ũ1ũ2 ′ − ũ2ũ1′ ] dϕ + f2(0+)f′1(0+) −f1(0+)f ′ 2(0+), and using asymptotic expansion of u near r = 0, √ πu(r,θ) = ( φ−11 (ψ)r −1/2 + φ−12 (ψ)r 1/2)e−iθ + φ01(ψ)r −1/2 + φ02(ψ)r 1/2, −2r √ πu′(r,θ) = ( φ−11 (ψ)r −1/2 − φ−12 (ψ)r 1/2)e−iθ + φ01(ψ)r −1/2 − φ02(ψ)r 1/2, one finds (ψ1,hψ2) − (hψ1,ψ2) = φ1(ψ1)∗φ2(ψ2) − φ1(ψ2)∗φ2(ψ1), where φa(ψ) = (φha, φ0a, φ−1a )t for a = 1, 2. consequently, to get a self-adjoint hamiltonian one has to impose coupling conditions similar to (1), namely (u − i)φ1(ψ) + i(u + i)φ2(ψ) = 0 (9) with a unitary u. we choose the latter in the form u =  0 1 01 0 0 0 0 eiρ   , (10) i.e. the nonradial part (m = −1) of the disc wave function is coupled to neither of the other two, while the radial part (m = 0) is coupled to the halfline via 424 vol. 53 no. 5/2013 resonances on hedgehog manifolds kirchhoff’s (free) coupling. 
to see that this choice kills all the ‘true’ resonances, we choose the ansatz f(x) = a sin kx + b cos kx, u(r) = r−1/2 ( c sin k(r−r) ) which yields the boundary values φ1(ψ) = (b,c √ π sin kr, 0)t, φ2(ψ) = k(a,−c √ π cos kr, 0)t. it follows now from the coupling conditions that b = c √ π sin kr, a = c √ π cos kr, hence f(x) = c √ π sin k(r + x), thus for any k 6∈ r and c 6= 0 the function f necessarily contains a nontrivial part of the wave e−ikx. however, as we have argued above, a resolvent resonance can must have the asymptotics eikx only. in this way we come to the indicated conclusion: proposition 5.1. the described system has no true resonances for the coupling corresponding to matrix (10) and magnetic flux α = 12 . since the effect occurs at a particular value of the magnetic flux, it is also interesting to ask what happens if the field changes, so that α runs from zero to 1 2 . the symmetry of the problem allows us to use the ansatz u(r,ϕ) = r(r) eimϕ; this shows that one has to solve the equation − ∂2r(r) ∂r2 − 1 r ∂r(r) ∂r + 1 r2 (m + α)2r(r) = k2r(r) which can be easily transformed into bessel equation in the variable kr with the constant (m + α). hence the radial part of the wavefunction on the disc is given as a combination of bessel functions and u(r,ϕ) = ∑ m ( a1mjm+α(kr) + a2mym+α(kr) ) eimϕ. we employ the behaviour of bessel functions in the vicinity of zero, jα(x) ≈ 1 γ(α + 1) (x 2 )α , yα(x) ≈− γ(α) π (2 x )α , which yields the values of the above functionals, φ01 = √ π lim r→0 rα 2π 2π −γ(α) π a20 ( 2 kr )α = − γ(α) √ π (2 k )α a20, φ02 = √ π lim r→0 r−α 2π 2πa10 1 γ(α + 1) (kr 2 )α = √ π γ(α + 1) (k 2 )α a10, -8 -7 -6 -5 -4 -3 -2 -1 0 0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5 figure 8. trajectory of a resonance for α running from zero to 12 . φ−11 = √ π lim r→0 r1−α 2π 2π 1 γ(α) (kr 2 )−1+α a1−1 = √ π γ(α) (2 k )1−α a1−1, φ−12 = − √ π lim r→0 r−1+α 2π 2π γ(α− 1) π ( 2 kr )−1+α a2−1 = − γ(α− 1) √ π (k 2 )1−α a2−1. the resonance equation is then given by eq. (9) and dirichlet condition at the disc boundary, a10jα(kr) + a20yα(kr) = 0, a1−1jα−1(kr) + a2−1yα−1(kr) = 0. in a particular case of u = ( 0 1 0 1 0 0 0 0 eiρ ) resonances are obtained as solutions to the condition det [( −1 1 1 −1 )( 1 0 0 −γ(α)√ π (2 k )α jα(kr) ) + i ( 1 1 1 1 )( ik 0 0 √ π γ(α+1) ( k 2 )α yα(kr) )] = 0, which can be rewritten as i √ π γ(α + 1) (k 2 )α yα(kr) + k γ(α) √ π (2 k )α jα(kr) = 0. in particular, for α = 1/2 it gives sin kr− i cos kr √ πr = 0 showing again that there are no resonances for α = 1/2. for other values of α the condition can be solved numerically. in figure 8 we plot the trajectory of one of the resonances as the value of α increases from zero to 12 . the step is taken to be 0.01 for values until α = 0.49, which corresponds to the sharp bend of the curve, and from this point on the linear step is replaced by a sequence of exponentially increasing density accumulating at α = 12 . 425 p. exner, j. lipovský acta polytechnica acknowledgements we thank the referees for useful remarks. this research was supported by the czech science foundation within project p203/11/0701 and the development of postdoc activities project at the university of the hradec králové, cz.1.07/2.3.00/30.0015. references [1] r. adami, a. teta: on the aharonov-bohm hamiltonian, lett. math. phys. 43, 43–54, 1998. [2] j. aguilar, j.-m. combes: a class of analytic perturbations for one-body schrödinger operators, commun. math. phys. 22, 269–279, 1971. [3] i.g. 
avramidi: a covariant technique for the calculation of the one-loop effective action. nuclear physics b 355, 712–754, 1991. [4] i.g. avramidi: green functions of higher-order differential operators, j. math. phys. 39, 2889–2909, 1998. [5] e. baslev, j.-m. combes: spectral properties of many body schrödinger operators with dilation analytic interactions, commun. math. phys. 22, 280–294, 1971. [6] g. berkolaiko, p. kuchment: introduction to quantum graphs, ams “mathematical surveys and monographs” series, vol. 186, providence, r.i., 2013 [7] j. brüning, p. exner, v.a. geyler: large gaps in point-coupled periodic systems of manifolds, j. phys. a: math. gen 36, 4890, 2003. [8] j. brüning, v. geyler: scattering on compact manifolds with infinitely thin horns, j. math. phys. 44, 371–405, 2003. [9] l. dabrowski, p. šťovíček: aharonov-bohm effect with δ-type interaction, j. math. phys. 36, 47–62, 1998. [10] e.b. davies, p. exner, j. lipovský: non-weyl asymptotics for quantum graphs with general coupling conditions, j. phys. a: math. theor. 43, 474013, 2010. [11] e.b. davies, a. pushnitski: non-weyl resonance asymptotics for quantum graphs. analysis and pde 4, no. 5, 729–756, 2011. [12] v. dinu, a. jensen, g. nenciu: non-exponential decay laws in perturbation theory of near threshold eigenvalues j. math. phys. 50, 013516, 2009. [13] p. exner: open quantum system and feynman integrals, reidel, dordrecht 1985. [14] p. exner, j.p. keating, p. kuchment, t. sunada, a. teplyaev, eds.: analysis on graphs and applications, proceedings of the isaac newton institute programme, january 8–june 29, 2007; 670 p.; ams “proceedings of symposia in pure mathematics” series, vol. 77, providence, r.i., 2008 [15] p. exner, j. lipovský: equivalence of resolvent and scattering resonances on quantum graphs, in adventures in mathematical physics (proceedings, cergy-pontoise 2006), vol. 447, pp. 73–81; ams, providence, r.i., 2007. [16] p. exner, j. lipovský: resonances from perturbations of quantum graphs with rationally related edges, j. phys. a: math. theor. 43, 105301, 2010. [17] p. exner, j. lipovský: non-weyl resonance asymptotics for quantum graphs in a magnetic field, phys. lett. a 375, 805–807, 2011. [18] p. exner, m. tater, d. vaněk: a single-mode quantum transport in serial-structure geometric scatterers, j. math. phys. 42, 4050–4078, 2001. [19] p. exner, p. šeba: quantum motion on a halfline connected to a plane, j. math. phys. 28, 386–391; erratum p. 2254, 1987. [20] p. exner, p. šeba: resonance statistics in a microwave cavity with a thin antenna, phys. lett. a228, 146–150, 1997. [21] p. exner, p. šťovíček, p. vytřas: generalized boundary conditions for the aharonov-bohm effect combined with a homogeneous magnetic field, j. math. phys. 43, 2151–2168, 2002. [22] v.i. gorbachuk, m.l. gorbachuk: boundary value problems for operator differential equations, kluwer, dordrecht 1991. [23] m. harmer: hermitian symplectic geometry and extension theory, j. phys. a: math. gen. 33, 9193–9203, 2000. [24] v.ya. ivrii: the second term of the spectral asymptotics for a laplace-beltrami operator on manifolds with boundary, funktsional. anal. i prilozhen 14 (3), 25–34, 1980. [25] a. kiselev: some examples in one-dimensional ‘geometric’ scattering on manifolds, j. math. anal. appl. 212, 263–280, 1997. [26] v. kostrykin, r. schrader: kirchhoff’s rule for quantum wires, j. phys. a: math. gen. 32, 595–630, 1999. [27] m. reed, b. simon: methods of modern mathematical physics, iii. scattering theory, academic press, new york 1979. 
a novel bcg sensor-array for unobtrusive cardiac monitoring

anna böhm∗, christoph brüser, steffen leonhardt

chair for medical information technology, helmholtz-institute for biomedical engineering, pauwelsstr. 20, 52074 aachen, rwth aachen university, germany
∗ corresponding author: boehm@hia.rwth-aachen.de

acta polytechnica 53(6):862–867, 2013. doi:10.14311/ap.2013.53.0862. © czech technical university in prague, 2013. available online at http://ojs.cvut.cz/ojs/index.php/ap

abstract. unobtrusive heart rate monitoring is a popular research topic in biomedical engineering. the reason is that conventional methods, e.g. the clinical gold standard electrocardiography, require conductive contact to the human body. other methods such as ballistocardiography try to record these vital signs without electrodes attached to the body. so far, these systems cannot replace routine procedures. most systems have some drawbacks that cannot be compensated, such as aging of the sensor materials or movement artifacts. in addition, the signal form differs greatly from an ecg, which is an electrical signal. the ballistocardiogram has a mechanical source, which makes it harder to evaluate. we have developed a new sensor array made of near-ir leds to record bcgs. ir sensors do not age on relevant time scales. analog filtering was necessary, because the signal amplitude was very small. the digitized data was then processed by various algorithms to extract beat-to-beat or breath-to-breath intervals. the redundancy of multiple bcg channels was used to provide a robust estimation of beat-to-beat intervals and heart rate. we installed the system beneath the mattress topper of a hospital bed, but any other bed would have been sufficient. the validation of this measurement system shows that it is well suited for bcg recordings. the use of multiple channels has proven to be superior to relying on a single bcg channel.

keywords: ballistocardiography (bcg), unobtrusive measurement, cardiac monitoring.

1. introduction

vital signs such as heart rate and respiratory rate are recorded to determine the overall state of a patient. typically, electrocardiograms (ecg) via ag/agcl electrodes are used to monitor heart rate and to identify unusual events, such as arrhythmias. however, before recording an ecg, the skin has to be properly abraded, sometimes even shaved, and a conductive gel is applied. the gel dries after a short period of time and the signal deteriorates. this does not allow for longer measuring periods, especially not at home. the goal of this work was to develop sensors that can be placed in a patient's bed or in other everyday objects for heart rate monitoring. an unobtrusive method for measuring heart rate is ballistocardiography (bcg).
it involves measuring rhythmical mechanical movements of the entire body resulting from blood flow. the bcg's shape differs from an ecg because of its mechanical source. a bcg has a delay of roughly 100 ms relative to the ecg signal [1]. the signal shape is dependent on the measurement system, specifically on where the sensor is placed. if it is installed in a bed, the patient's position in the bed is a determining factor [2].

many forms of ballistocardiographs have been proposed. traditionally, measuring systems have been integrated into rigid tables and the longitudinal displacement of the bed has been measured [3]. additionally, bcg systems have been placed directly onto the body [4]. others have tried locations such as beneath pillows [5], mattresses [1] or embedded in other everyday objects [6]. in addition to heart rate estimation, they have been used, for example, to detect atrial fibrillation [7]. existing bcg measuring systems can be made of various kinds of force sensors, optical sensors, acceleration sensors, position sensors, or pressure sensors. so far, only maki [8] has developed an infrared-based sensor consisting of one emitter and a detector placed between the springs of a spring core mattress.

2. methods

we designed a novel optical bcg sensor system with sensors using infrared light. the sensor is made of a small printed circuit board (pcb) with three ir leds and one photo diode. the principle is based on light scattering inside the mattress. the radiation emitted by the leds enters the mattress, and the returning amount of ir light is detected by the photo diode. the sensing units of the ballistocardiograph can be placed between the patient's mattress and the mattress topper at up to eight places. it is also possible to place the sensor directly under the mattress, but our setup yielded better results. an advantage compared to [8] is that the mattress does not have to be taken apart in order to install the system, see fig. 1.

figure 1. sensor-system setup in a bed.

the sensor units detect localized cardiac-related vibrations from the body. the single signals differ depending on the location of the sensors under the body. breathing can be seen in the raw signal as a superimposed oscillation with a lower frequency if the sensor is placed below the thorax.

2.1. measurement setup

a block diagram of the measurement setup is depicted in fig. 2. the measurement system is modular, so that multiple sensors can be connected at the same time. the signals are filtered with analog hardware filters for each channel individually. the resulting signals are digitized with an ni usb-6009 daq device with 14-bit resolution (national instruments, austin, texas, usa). subsequently, the data of each channel is recorded with labview (national instruments, austin, texas, usa). the raw data is then processed with matlab (the mathworks, natick, massachusetts, usa).

figure 2. analog filter chain.

2.2. working principle

the sensors are made of three infrared light emitting diodes and one infrared detector centered between the leds. the wavelength of the leds is 850 nm (type sfh 4250, osram, munich, germany). the silicon pin photo diode has its maximum sensitivity at 880 nm (type bpw 34 fas, osram, munich, germany). its radiant sensitive area dimensions are 2.65 × 2.65 mm².
our tests have shown that a 1.5 cm distance from emitter to detector is the best trade-off in terms of signal quality and penetration depth of the light into the material, see fig. 3. if the distance between emitter and detector is larger, the resulting penetration depth becomes higher. however, the greater the distance, the lower the number of photons that reach the detector.

figure 3. sensor pcb.

the working principle of the measurement setup is shown in fig. 4. the light is emitted into the material, where it is scattered, absorbed or transmitted [9]. a synthetic foam mattress topper was used. when it is compressed by forces exerted by mechanical movements of the body, the air enclosures of the foam are deformed. consequently, the path of the light is also altered. the number of photons that are sensed by the detector changes according to the compression of the material.

figure 4. sensor under a mattress.

2.3. analog filtering techniques

as the a/d conversion is conducted with 14 bits over the device's ±10 v input range, the quantization step equals 20 v/2¹⁴ ≈ 1.22 mv. the signal-to-noise ratio is very low, and analog filtering of the sensor signals is necessary. for our application, higher frequencies can be filtered out, as the heart rate is assumed to be between 0.67 and 3.33 hz. the light detected by the photo diode is converted to an electrical potential by a trans-impedance amplifier. this circuit converts current into the corresponding voltage. the signal is simultaneously low-pass filtered with a corner frequency of 160 hz. additionally, it is high-pass filtered with a passive 1 hz rc filter in order to remove the baseline, so that the operational amplifier does not saturate. another operational amplifier sets the gain of the filter chain. high-frequency components are removed by a sallen-key low-pass filter with a corner frequency of 20 hz. in addition, a 50 hz twin-t notch filter cancels out the line noise. the analog filter chain is depicted in fig. 2.

2.4. multi-channel algorithm

evaluating bcg signals is a challenging task. due to the position- and sensor-dependent signal morphology, a robust algorithm has to be applied. it is not sufficient to find the peaks, as in an ecg signal. analysing a bcg mostly relies on finding patterns that are similar and repetitive. a high-pass filtered signal with repetitive patterns of this kind is shown in fig. 5. while this example exhibits rather clear deflections for each heart beat, this is not generally the case.

figure 5. raw bcg signal.

beat-to-beat and breath-to-breath intervals are estimated using an algorithm provided in [10]. the algorithm is capable of extracting intervals from signals that have different morphologies, even from signals that are so noisy that it is not possible to find peaks manually. these intervals are computed for a given signal. they can be used to monitor heart rate and to determine heart rate variability, for example. the algorithm preprocesses the signal by filtering with a butterworth band-pass filter between 0.5 and 20 hz. instead of first detecting the location of heart beats and then computing the corresponding beat-to-beat intervals, the algorithm estimates the time-varying instantaneous heart rates from the raw signal using three short-time estimators (e.g. autocorrelation). these estimators are applied to an adaptive moving analysis window and then combined using a bayesian approach. in addition, artifacts are detected based on adaptive amplitude thresholding and automatically discarded. the results of the algorithm are the beat-to-beat intervals reconstructed from the signal.
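as a concrete illustration of the preprocessing step just described, here is a minimal sketch (ours; the published processing was done in matlab), assuming python with numpy and scipy:

```python
# minimal sketch: zero-phase butterworth band-pass (0.5-20 hz) of one raw bcg
# channel at the 400 hz sample rate used in the measurements; the filter
# order of 4 is our choice, not a published parameter.
import numpy as np
from scipy.signal import butter, filtfilt

fs = 400.0                                                    # sample rate in hz
b, a = butter(N=4, Wn=[0.5, 20.0], btype="bandpass", fs=fs)

def preprocess(raw_channel: np.ndarray) -> np.ndarray:
    """band-pass a single bcg channel; filtfilt avoids phase distortion."""
    return filtfilt(b, a, raw_channel)

# usage on synthetic data standing in for a recorded channel
t = np.arange(0, 10, 1 / fs)
raw = np.sin(2 * np.pi * 1.2 * t) + 0.3 * np.sin(2 * np.pi * 50 * t)  # ~72 bpm + line noise
bcg = preprocess(raw)
```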
based on the algorithm in [10], a multi-channel version was further developed to select the channels that perform best for beat-to-beat estimation. the channels are selected on the basis of certain criteria, e.g. calculation of the standard deviation. if the standard deviation
\[
\sigma_x = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (x_i - \bar x)^2}
\]
of a given signal is much lower than the standard deviation of the other channels, low information density is expected. the second method is kurtosis. kurtosis expresses the extent to which the distribution of a signal is peaked or flattened. the kurtosis of an ecg is usually around k = 5, so we also adopted this value for bcg [11]. kurtosis can be calculated as follows:
\[
k = \frac{1}{N}\sum_{i=1}^{N} \Big( \frac{x_i - \mu_x}{\sigma_x} \Big)^4.
\]
finally, the estimated beat-to-beat interval of each individual channel has to be within an accepted threshold: \(f_{\mathrm{hr}} = [0.67;\, 3.33]\) hz.
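a compact sketch of how these three criteria can be combined (our reading of the selection rules; the numerical thresholds are illustrative assumptions, not the published values):

```python
# minimal sketch: keep a channel if (1) its standard deviation is not far below
# the other channels', (2) its kurtosis is near the value k = 5 adopted from
# ecg practice, and (3) its estimated rate lies in f_hr = [0.67, 3.33] hz.
import numpy as np

def kurtosis(x: np.ndarray) -> float:
    mu, sigma = x.mean(), x.std()
    return float(np.mean(((x - mu) / sigma) ** 4))

def select_channels(channels, rate_estimates_hz,
                    sigma_ratio=0.25, k_window=(2.0, 10.0), f_hr=(0.67, 3.33)):
    """return indices of channels passing all three criteria;
    sigma_ratio and k_window are illustrative assumptions."""
    sigmas = np.array([c.std() for c in channels])
    keep = []
    for i, c in enumerate(channels):
        if sigmas[i] < sigma_ratio * np.median(sigmas):
            continue                    # low information density expected
        if not (k_window[0] <= kurtosis(c) <= k_window[1]):
            continue                    # implausible amplitude distribution
        if not (f_hr[0] <= rate_estimates_hz[i] <= f_hr[1]):
            continue                    # outside the accepted heart-rate band
        keep.append(i)
    return keep
```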
3. results and discussion

signals were recorded with a prototype optical bcg system. simultaneously, an ecg was obtained as a reference. the signals were recorded with a sample rate of 400 hz. in the following sample measurement with 2 channels, repetitive patterns can be seen (fig. 6). the time between the dominant peaks is equal to the heart-beat interval. the shapes of the signals are different, as they were taken by sensors located in two different positions.

figure 6. two-channel band-pass (0.5–20 hz) filtered bcg data.

fig. 7 shows the influence of respiration. the subject had a respiratory rate of approximately 9 breaths per minute, which can also be seen in the bcg signal. the recording was made while the test subject was lying on his back.

figure 7. band-pass filtered bcg data (0.5–20 hz) with respiratory influence.

3.1. signal morphology

according to a study mentioned in [12], eight different sleeping positions exist. in a trial, four different positions were analyzed: supine, on the left side, on the right side, and prone. the test subject lay in each position for one minute and then turned around. the transitions from one position to another were clearly visible in the measured signal. that is because the movement artifacts have much higher amplitudes than the normal signal, as can be seen in fig. 8.

figure 8. bcg with different lying positions.

the patient's contact area with the mattress topper determines the signal shape. different positions resulted in very dissimilar signals. some positions were more advantageous for estimating the heart rate than others, see fig. 9. the peak-to-peak voltage was approximately 1.5 v in every case except for the right side, where it resulted in a peak-to-peak voltage of 0.4 v. furthermore, periodic signal patterns could be identified in the supine, left and prone positions. the left and prone positions also showed repetitive dominant peaks. the reason for the reduced signal quality while lying on the right side is unclear. it may be due to the larger distance from the contact area of the body to the heart.

figure 9. close-ups of bcgs, from top to bottom: supine; left; right; prone.

3.2. sources of artifacts

the sources of different artifacts were examined. a common problem of bcg systems is that they are susceptible to external vibrations. we therefore investigated whether these vibrations cause artifacts in the optical bcg signal. our tests showed that optical bcgs are also subject to these vibrations. the heart rate cannot be estimated if a movement artifact occurs, because the operational amplifier saturates.

another test was conducted to comprehend the influence of external light sources, i.e. incandescent light. due to the fact that the working principle of the sensors is based on radiation, external light sources may cause artifacts or baseline changes. the experiment was performed by switching the lights on and off in a darkened lab without any test subject in the bed. the test showed that incandescent light had no influence on the bcg signal.

3.3. sensor-array

up to eight sensors can be connected to the current system. three different sensor configurations beneath the mattress topper were tested with a 4-channel setup: a diamond shape, a square, and a line, see fig. 10. in each case, it seems beneficial to place the sensors below the patient's torso, because of the proximity to the heart. the closer the sensors are to the diaphragm, the more dominant the breathing component becomes. the diamond configuration was chosen because the largest part of the torso is covered with only 4 sensors, and the probability that the subject lies on one or more of the sensors is higher in any chosen position.

figure 10. different sensor configurations.

3.4. heart rate estimation

the optical bcg was evaluated with a single-channel measurement during a 15-minute trial and a simultaneous ecg. compared to the ecg, the coverage of the bcg equaled 89.35 %, and the relative error of the estimated intervals was 1.58 %. fig. 11 shows a comparison of ecg beat-to-beat intervals (estimated with the pan-tompkins algorithm [13]) and bcg estimated intervals within a certain time frame. in this specific case, the estimations overlap almost perfectly.

figure 11. comparison of bcg and ecg beat-to-beat interval estimation.

the multi-channel system was evaluated with a 4-channel measurement in a 20-minute trial recorded at 200 hz. the results in tab. 1 show that one channel had a poor signal with only 18.61 % coverage. the channel was therefore automatically excluded from the estimation by the algorithm most of the time. the best signal in terms of coverage had a relative error of 1.46 % compared to the ecg. the best result was achieved by combining multiple channels for the estimation.

table 1. comparison of interval estimation for the best/worst channel with an estimation using multiple channels.

                      coverage    rel. error
  best channel        87.01 %      1.46 %
  worst channel       18.61 %     25.7 %
  multiple channels   89.16 %      0.67 %
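the two figures of merit used above, coverage and relative error, can be computed from paired interval series; the following is a minimal sketch under our own definitions, since the paper does not spell them out:

```python
# minimal sketch: coverage = fraction of reference beats for which the bcg
# algorithm produced an estimate; relative error = mean |bcg - ecg| / ecg
# over the matched beats. these definitions are our assumption.
import numpy as np

def coverage_and_relative_error(ecg_intervals, bcg_intervals, valid_mask):
    """ecg_intervals, bcg_intervals: beat-to-beat intervals in seconds,
    aligned beat by beat; valid_mask marks beats where the bcg algorithm
    produced an estimate (artifact segments are discarded)."""
    ecg = np.asarray(ecg_intervals)
    bcg = np.asarray(bcg_intervals)
    mask = np.asarray(valid_mask, dtype=bool)
    coverage = mask.mean()
    rel_error = np.mean(np.abs(bcg[mask] - ecg[mask]) / ecg[mask])
    return coverage, rel_error

# example with made-up numbers, not measurement data
ecg = np.full(100, 0.85)
bcg = ecg + np.random.normal(0.0, 0.01, ecg.size)
mask = np.r_[np.ones(89, bool), np.zeros(11, bool)]
print(coverage_and_relative_error(ecg, bcg, mask))   # ~ (0.89, ~0.01)
```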
4. conclusion

the proposed bcg system is a new method for bcg measurements in bed. the advantage of bcg is that it works without direct contact to the patient's body. by using near-ir optical sensors, ageing of the material can be excluded as a reason for deterioration of the signal quality. heart rate, beat-to-beat intervals and, in principle, respiration rate and breath-to-breath intervals can be extracted from the signal, which is essentially based on small vibrations of the body. however, optical sensors are prone to movement artifacts, as are other bcg sensors. when the subject turned, it was not possible to estimate the heart rate or the heart-beat intervals. nevertheless, the results are promising, because high coverage and low relative error were achieved by optical bcg. so far, only a single prototype is available, and only a limited number of recordings have been obtained. a larger study should be conducted to further investigate this concept.

references

[1] j. m. kortelainen, j. virkkala: fft averaging of multichannel bcg signals from bed mattress sensor to improve estimation of heart beat interval. in: 29th annual international conference of the ieee engineering in medicine and biology society, lyon, france, pp. 6685–6688, 2007.
[2] c. brüser, k. stadlthanner, s. de waele, s. leonhardt: adaptive beat-to-beat heart rate estimation in ballistocardiograms. ieee transactions on information technology in biomedicine 15(5):778–786, 2011.
[3] i. starr, a. j. rawson, h. a. schroeder, n. r. joseph: studies on the estimation of cardiac output in man, and of abnormalities in cardiac function, from the heart's recoil and the blood's impacts; the ballistocardiogram. the american journal of physiology, vol. 127, 1939.
[4] k. tavakolian, a. vaseghi, b. kaminska: improvement of ballistocardiogram processing by inclusion of respiration information. physiological measurement 29:771–781, 2008.
[5] x. zhu, w. chen, t. nemoto, et al.: accurate determination of respiratory rhythm and pulse rate using an under-pillow sensor based on wavelet transformation. in: 27th annual international conference of the ieee engineering in medicine and biology society, shanghai, china, pp. 5869–5872, 2005.
[6] o. t. inan, m. etemadi, r. m. wiard, et al.: robust ballistocardiogram acquisition for home monitoring. physiological measurement 30:169–185, 2009.
[7] c. brüser, j. diesel, m. d. zink, et al.: automatic detection of atrial fibrillation in cardiac vibration signals. ieee journal of biomedical and health informatics, 2013.
[8] h. maki, h. ogawa, s. tsukamoto, et al.: a system for monitoring cardiac vibration, respiration, and body movement in bed using an infrared. in: 32nd annual international conference of the ieee embs, buenos aires, argentina, vol. 2, pp. 5197–5200, 2010.
[9] m. bleicher: halbleiter-optoelektronik. hüthig, 1986.
[10] c. brüser, s. winter, s. leonhardt: robust inter-beat interval estimation in cardiac vibration signals. physiological measurement 34(2):123, 2013.
[11] q. li, r. g. mark, g. d. clifford: robust heart rate estimation from multiple asynchronous noisy sources using signal quality indices and a kalman filter. physiological measurement 29:15–32, 2008.
[12] v. f. s. fook, k. p. leong, e. h. jianzhong, et al.: non-intrusive respiratory monitoring system using fiber bragg grating sensor. in: 10th international conference on e-health networking, applications and services, pp. 160–164, ieee, 2008.
[13] j. pan, w. tompkins: a real-time qrs detection algorithm. ieee transactions on biomedical engineering, vol. 32, pp. 230–235, 1985.
acta polytechnica vol. 41 no. 2/2001

accuracy of determining stress intensity factors in some numerical programs

m. vorel, e. leidich

at present, there are many programs for numerical analysis of cracks, in particular for determining stress intensity factors. analyses of a single-edge cracked beam and a flat plate with a semielliptical surface crack are presented in this study to examine the accuracy and applicability of the franc2d and franc3d programs. further numerical computations with the marc program and analytical solutions of stress intensity factors were included to compare the results with each other. for this purpose marc was equipped with special user procedures. the influence of mesh fineness on the results was also investigated in all programs. the distributions of the stress intensity factors show good agreement in quality. the maximum deviations from the analytical solutions are 9.7 %. with greater numbers of elements, the programs franc2d and franc3d showed some instability, which currently reduces the usefulness and reliability of these promising tools for engineering applications.

keywords: fracture mechanics, stress intensity factors, numerical programs.

1 introduction

stress intensity factors can be determined by experimental, numerical or analytical methods. however, with complicated component and crack geometry or under complex loading, only numerical procedures are applicable. many programs have been designed recently to deal with fracture phenomena (e.g., franc2d, franc/fam, franc3d and afgrow), which, in spite of great programming efforts, still show some deficiencies in range of functionality, operation comfort and reliability. little evidence has yet been provided about the accuracy and suitability of such programs for solving engineering problems [1, 2]. present applications cover illustrative examples and simple problems [3, 4, 5]. proven multi-purpose numerical programs such as marc and abaqus usually possess no routines for finding stress intensity factors. to take advantage of these programs' extensive functions and high reliability, a number of user subroutines for solving stress intensity factors need to be programmed. as regards the accuracy of stress intensity factors, only single programs have been compared with analytical solutions so far, and no comparison of the programs with each other has been presented. the objective of this study was therefore to confront the results of franc2d, franc3d and marc with each other, and for a better view also to verify their deviations from the analytical solutions. the analyses were conducted on simple models while also observing the influence of mesh fineness, the usability of available solvers and the overall performance of the programs.

2 programs and models

the tested programs were franc2d version 2.7 [6], franc3d version 1.15 [7], and marc version 2000 [8]. the first two come from cornell university, new york, and as freeware they can be freely distributed. franc2d (two-dimensional fracture analysis code) is based on the finite element method and enables analyses of two-dimensional problems with arbitrary component and crack geometries. several methods are implemented for calculating stress intensity factors, from which the j-integral method [9] was chosen for the purposes of this study. franc3d uses the boundary element method and was designed for solving three-dimensional fracture problems. also here, arbitrary component and crack geometries can be analysed. stress intensity factors are determined by the displacement correlation method [10]. both franc2d and franc3d possess further important functions for modeling various fracture phenomena, such as fatigue crack growth. currently, a new version of franc3d is being developed, which is based on the finite element method and offers a greater functional range. the finite element system marc is suitable for analyses of general problems of engineering mechanics. to determine stress intensity factors, the displacement correlation method with a linear extrapolation from two nodes at each crack face [11, 12] has been implemented in the user procedures. as analytical solutions are known for only certain probe types, a single-edge cracked beam subjected to three-point bending and a flat plate with a semielliptical surface crack (fig. 1) were chosen for the tests in this study.

fig. 1: a single-edge cracked beam (a) and a flat plate with a semielliptical surface crack (b); length measures in mm (beam: s/2 = 16, w = 8, b = 4, a = 4, h/2 = 18; plate: h = 40, w = 20, b = 12.566, a = 4, c = 2; in both cases e = 210000 mpa, ν = 0.3).
the analytical solution for the single-edge cracked beam is given by the equation
\[
K_{\mathrm I} = \frac{F S}{B W^{3/2}} \cdot \frac{3\,(a/W)^{1/2}\,\big[\,1.99 - (a/W)(1 - a/W)\big(2.15 - 3.93\,(a/W) + 2.7\,(a/W)^2\big)\big]}{2\,(1 + 2\,a/W)(1 - a/W)^{3/2}} \tag{1}
\]
with the conditions \(S/W = 4\) and \(0 \le a/W \le 0.6\). the distribution of stress intensity factors along the crack front in a flat plate is expressed by the relation
\[
K_{\mathrm I} = \frac{\sigma_{\mathrm n}\sqrt{\pi a}}{\Phi}\, Y\Big(\frac{a}{c}, \frac{a}{w}, \frac{a}{b}, \varphi\Big) \tag{2}
\]
with the conditions \(0 < a/c \le 0.5\), \(0 \le a/w < 1.0\) and \(0 \le \varphi \le \pi\), where \(\Phi\) is a complete elliptic integral of the second kind and \(Y\) is a geometric function. however, the function \(Y\) is not based on theoretical examinations but on experimental studies [13].
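equation (1) is easy to evaluate for the beam of fig. 1a; the following minimal sketch (ours) uses f = 500 n, s = 32 mm, b = 4 mm, w = 8 mm, a = 4 mm, and the coefficients of the standard srawley three-point-bend expression:

```python
# minimal sketch: stress intensity factor of a single-edge cracked beam in
# three-point bending, eq. (1); valid for s/w = 4 and 0 <= a/w <= 0.6.
def k_i_senb(f, s, b, w, a):
    """k_i in n*mm**(-1.5) for force f [n] and lengths [mm]."""
    x = a / w
    num = 3 * x**0.5 * (1.99 - x * (1 - x) * (2.15 - 3.93 * x + 2.7 * x**2))
    den = 2 * (1 + 2 * x) * (1 - x)**1.5
    return f * s / (b * w**1.5) * num / den

# geometry of fig. 1a; yields k_i ~ 470.7 n*mm**(-1.5), consistent with the
# level of the analytical curve in fig. 4
print(k_i_senb(f=500.0, s=32.0, b=4.0, w=8.0, a=4.0))
```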
the meshes of the analysed models were generated with the relevant preprocessors (casca, osm, mentat) in three mesh densities at a time, in order to observe the influence of mesh fineness on stress intensity factors. thus, from each examined probe type there were models with a coarse, medium fine and fine mesh in each program (table 1). although in general it is appropriate to take advantage of a model symmetry, here, with respect to future studies, complete models were created (fig. 2). the basic elements were taken linear in franc3d, and quadratic in franc2d and marc. the crack front (or crack tip in two-dimensional cases) was formed in all programs by collapsed quarter-point quadratic elements [14], the number of which varied in three-dimensional cases from 16 to 48 along the crack front. the rosette consisted of 6 to 8 collapsed elements. the two-dimensional models were considered as plane strain problems. the meshes of mentat were generated by a newly introduced parametric modeling function.

fig. 2: an example of the used franc3d meshes; (a) single-edge cracked beam, (b) flat plate.

table 1: numbers of elements and nodes of the analysed models.

                                      elements                          nodes
                          coarse   medium fine    fine     coarse   medium fine    fine
  single-edge cracked beam
  franc2d/casca              556        1636       2168      1645        4525       6395
  franc3d/osm                936        2252       4020       902        2148       3876
  marc/mentat               2760        8400      14080     13233       37939      62115
  centre cracked plate
  franc3d/osm               1206        2014       3084      1001        1918       2780
  marc/mentat               7396       12880      17608     33143       55979      76341

the load of the single-edge cracked beam models consisted of a single force (f = 500 n) for the two-dimensional models, or of a uniformly distributed force (f = 125 n·mm⁻¹) for the three-dimensional models. the flat plate was loaded with a normal stress (σₙ = 200 mpa) on the upper surface of the model.

3 strategy and results

a single-edge cracked beam was analysed in franc2d, franc3d and marc. the flat plate was analysed as a purely three-dimensional problem only in franc3d and marc. the analytical solutions were carried out for both probe types. after the analysis there were three results (coarse, medium fine and fine mesh) for one probe type from each used program. in particular, these were three mesh qualities × three programs for the single-edge cracked beam, and three mesh qualities × two programs for the flat plate. from each three results the optimum solutions were then chosen. the criteria for this were minimum deviation from the analytical solution on the one hand, and the lowest possible computational time on the other. in the end, the programs' optimum solutions were compared together with the analytical solutions in one diagram for each probe. the stress intensity factors were evaluated as a single value at the crack tip or as a course of values along the crack front.

the franc2d models were solved with an implicit direct solver, which required very low computational times (fig. 3). there was no point in using more sophisticated solvers, as the problem was entirely linear. the franc3d jobs were processed by the boundary element solver bes, which includes four different schemes. in this study the iterative scheme with out-of-core element storage proved best: the direct scheme could not be applied to larger models, as the program crashed after the stiffness matrix assembly, and the other schemes turned out to run slightly more slowly. the marc analyses were carried out with the iterative solver (in-core element storage, incomplete cholesky preconditioner). all computations were performed on an sgi origin 2000 computer.

fig. 3: computational time of the single-edge cracked beam (left) and flat plate (right).

the dependences of the stress intensity factors on the mesh fineness are displayed in figs. 4 and 5. the optimum solutions of the stress intensity factors are compared in the following diagrams, fig. 6:

• stress intensity factors in a single-edge cracked beam under single-force or distributed-force loading; three numerical solutions and one analytical solution; f = 500 n or f = 125 n·mm⁻¹, respectively;
• stress intensity factors in a flat plate under normal stress loading; two numerical solutions and one analytical solution; σₙ = 200 mpa.

fig. 6: comparison of stress intensity factors k_i for a single-edge cracked beam (a) and flat plate (b).

4 discussion

with simple crack configurations, such as that of the single-edge cracked beam, mesh quality seems to have only a little influence on the values of the stress intensity factors (fig. 4). the smallest differences can be observed for franc2d. however, mesh quality appears to be significant in the case of more complicated crack forms, such as that of a semielliptical crack (fig. 5). the stress intensity factors differ with marc especially at the crack edges; franc3d shows consistent values from a certain mesh fineness.
the overall comparison of stress intensity factors for the single-edge cracked beam was performed in the middle of the crack front, as these values are more informative than the values at the crack ends. however, this was not the case for the semielliptical crack in the flat plate, where the stress intensity factors vary considerably along the whole crack front. the values of franc3d are always somewhat higher than the analytical solution (deviation up to 9.7 %, fig. 6), whereas marc delivers somewhat lower values (deviation up to 8.4 %). on the one hand, the deviations can be traced to the methods used (boundary element vs. finite element method) and the elements (linear vs. quadratic), but on the other hand, they may also result from the different methods of calculating the stress intensity factors. the deviations of the franc2d values from the analytical solution are the lowest, which agrees with the supposition that domain integral methods (such as the j-integral method) are more accurate than local methods (such as the displacement correlation method) [15, 16].

fig. 4: influence of mesh fineness on stress intensity factors k_i for a single-edge cracked beam using franc2d (a), franc3d (b) and marc (c).

fig. 5: influence of mesh fineness on stress intensity factors k_i at the flat plate using franc3d (a) and marc (b).

the stress intensity factors determined by franc3d at the crack edges in the flat plate do not correspond well to the other displayed solutions (fig. 6). stress intensity factors at the crack edges are in general dependent especially on the geometric configuration [13], but they should not acquire such a falling form. false values at crack edges are generally caused by some unsuitable treatment of the singularities which always exist at the ends of a crack front [17]. here, this phenomenon, as well as the sudden decline of the stress intensity factors at φ = 90°, may also be connected to the boundary element method that is used. franc2d and franc3d show some errors during both manual operation and computational processing. these emerge mostly with larger models (from about 4000 elements) and result in the disfunction of some commands (e.g., manual redefining of elements).
this is probably caused by deficient memory management: during computational processing, too high memory demand and falsely defined elements can lead to a program crash.

5 conclusion

in this study, some analyses with franc2d, franc3d and marc were conducted to compare the accuracy of determining stress intensity factors and to examine the behavior of the programs. a single-edge cracked beam and a flat plate with a semielliptical surface crack were used as test models. franc2d shows good accuracy, but it is applicable only to two-dimensional problems. franc3d delivers acceptable values and appears to be a promising tool for engineering applications. to this end, its reliability and function range should be improved. marc with special user procedures shows lower but certainly usable values. although this multi-purpose program shows high reliability, the programming effort to adapt it for solving fracture problems remains high.

references

[1] may, b.: ein beitrag zur praxisnahen simulation der ausbreitung von ermüdungsrissen bei komplexer beanspruchung. düsseldorf, vdi verlag, 1998.
[2] franc3d. documentation: volume v, validation/verification. new york, cornell university, 1998.
[3] schöllmann, m., richard, h. a.: franc/fam – a software system for the prediction of crack propagation. journal of structural engineering, 26, 1999, pp. 39–48.
[4] lewicki, d. g.: crack propagation studies to determine benign or catastrophic failure modes for aerospace thin-rim gears. nasa technical memorandum 107170, cleveland, nasa, 1996.
[5] lewicki, d. g., sane, a. d., drago, r. j., wawrzynek, p. a.: three-dimensional gear crack propagation studies. nasa technical memorandum 208827, cleveland, nasa, 1998.
[6] franc2d, a two dimensional crack propagation simulator, user's guide, version 2.7. new york, cornell university, 1998.
[7] franc3d, a three dimensional fracture analysis code, concepts and user's guide, version 1.14. new york, cornell university, 1998.
[8] marc volume a manual: theory and user information, version k7. los angeles, marc analysis research corporation, 1997.
[9] dodds, r. h. jr., vargas, p. m.: numerical evaluation of domain and contour integrals for nonlinear fracture mechanics: formulation and implementation aspects. illinois, university of illinois at urbana-champaign, 1988.
[10] shih, c. f., de lorenzi, h. g., german, m. d.: crack extension modelling with singular quadratic isoparametric elements. int. j. fracture, 12, 1977, pp. 647–651.
[11] mi, y.: three-dimensional analysis of crack growth. southampton, computational mechanics publications, 1996.
[12] barsoum, r. s.: on the use of isoparametric finite elements in linear fracture mechanics. international journal for numerical methods in engineering, 10, 1976, pp. 25–37.
[13] sähn, s., göldner, h.: bruch- und beurteilungskriterien in der festigkeitslehre. leipzig, fachbuchverlag, 1989.
[14] bittnar, z., šejnoha, j.: numerické metody mechaniky 2. praha, vydavatelství čvut, 1992.
[15] dhondt, g.: mixed-mode calculations with abaqus. in: "dvm bericht 232: festigkeits- und bruchverhalten von fügeverbindungen", berlin, dvm, 2000, pp. 333–343.
[16] bažant, z. p., planas, j.: fracture and size effect in concrete and other quasibrittle materials. boca raton, crc press, 1998.
[17] shivakumar, k. n., raju, i. s.: treatment of singularities in cracked bodies. international journal of fracture, 45, 1990, pp. 159–178.

dipl.-ing. michal vorel
phone: +490 371 531 4566
fax: +490 371 531 4560
e-mail: michal.vorel@mbv.tu-chemnitz.de

prof. dr. ing. erhard leidich
technical university chemnitz
faculty of mechanical engineering and process technology
reichenhainerstraße 70
chemnitz, d-09126, germany
acta polytechnica 53(3):271–279, 2013. © czech technical university in prague, 2013. available online at http://ctn.cvut.cz/ap/

spectral analysis of schrödinger operators with unusual semiclassical behavior

pavel exner^(a,b), diana barseghyan^(a,b,∗)

a doppler institute for mathematical physics and applied mathematics, břehová 7, 11519 prague
b nuclear physics institute ascr, 25068 řež near prague, czechia
∗ corresponding author: dianabar@ujf.cas.cz

abstract. in this paper we discuss several examples of schrödinger operators describing a particle confined to a region with thin cusp-shaped 'channels', given either by a potential or by a dirichlet boundary; we focus on cases when the allowed phase space is infinite but the operator still has a discrete spectrum.
first we analyze two-dimensional operators with the potential \(|xy|^p - \lambda(x^2+y^2)^{p/(p+2)}\), where p ≥ 1 and λ ≥ 0. we show that there is a critical value of λ such that the spectrum for λ < λ_crit is below bounded and purely discrete, while for λ > λ_crit it is unbounded from below. in the subcritical case we prove upper and lower bounds for the eigenvalue sums. the second part of the work is devoted to estimates of eigenvalue moments for dirichlet laplacians and schrödinger operators in regions having infinite cusps which are geometrically nontrivial, being either curved or twisted; we are going to show how these geometric properties enter the eigenvalue bounds.

keywords: schrödinger operator, discrete spectrum, lieb-thirring inequality, cusp-shaped regions, geometrically induced spectrum.

1. introduction

the semiclassical method for analyzing operators has proved itself a tremendously useful tool over the century since it was proposed by hermann weyl. nevertheless, there are cases when estimates based on phase-space volume fail; the classical example is due to b. simon [1] and describes a two-dimensional schrödinger operator with the potential \(|xy|^p\) having deep 'channels' whose width shrinks with the distance from the origin. the present paper is devoted to a discussion of several models of this type. it summarizes the presentation of the second named author at the conference analytic and algebraic methods in physics x (prague, 2012), based on the original papers [3, 4], to which we refer for details of the proofs which are sketched here. our first aim is to show that the effects known from the paper [1] can occur even if the potential is unbounded from below; at the same time the model will exhibit a parameter transition between different spectral regimes. one has to add that the first person to draw attention to the possibility of finding a discrete and below bounded spectrum in a below unbounded potential was, to our knowledge, m. znojil, who analyzed a related model in [2]. in the second part we will discuss schrödinger operators and dirichlet laplacians on cusp-shaped regions which are geometrically nontrivial, being either bent or twisted, and show how their geometry is reflected in spectral properties.

2. a model with potential unbounded from below and infinite phase space

we are going to consider here the following class of operators,
\[
L_p(\lambda)\colon\; L_p(\lambda)\psi = -\Delta\psi + \big( |xy|^p - \lambda(x^2+y^2)^{p/(p+2)} \big)\psi, \qquad p \ge 1, \tag{2.1}
\]
on \(L^2(\mathbb{R}^2)\), where x, y are cartesian coordinates in \(\mathbb{R}^2\). the parameter λ in the second term of the potential is assumed to be non-negative; unless its value is important in a particular context, we write simply \(L_p\). since \(\frac{2p}{p+2} < 2\), the operator (2.1) is e.s.a. on \(C_0^\infty(\mathbb{R}^2)\) by the faris-lavine theorem, see [5, theorems x.28, x.38]; in the following we mean by the symbol \(L_p\) or \(L_p(\lambda)\) always its closure. we are going to demonstrate the existence of a critical value of the coupling constant λ, expressed explicitly in terms of the ground-state eigenvalue of the corresponding (an)harmonic oscillator hamiltonian, such that the spectrum of \(L_p(\lambda)\) is below bounded and purely discrete for λ < λ_crit, while for λ > λ_crit it becomes unbounded from below. in the subcritical case we shall present upper and lower bounds to the sums of the first N eigenvalues of \(L_p(\lambda)\).

2.1. discreteness of the spectrum

let us first look at small values of λ. to speak quantitatively we need an auxiliary operator which will be an (an)harmonic oscillator hamiltonian
\[
\tilde H_p\colon\; \tilde H_p u = -u'' + |t|^p u
\]
on the natural domain in \(L^2(\mathbb{R})\). let \(\gamma_p\) be the minimal eigenvalue of this operator; in view of the mirror symmetry we have \(\gamma_p = \inf\sigma(H_p)\), where
\[
H_p\colon\; H_p u = -u'' + t^p u \tag{2.2}
\]
on the natural domain in \(L^2(\mathbb{R}_+)\) with the neumann condition at the origin. it is well known that the quantity \(\gamma_p\) depends smoothly on p, being equal to one for p = 2 and tending to \(\gamma_\infty = \frac14\pi^2\) as p → ∞; a numerical analysis performed in [3] shows that the function \(p \mapsto \gamma_p\) is convex and \(\gamma_p > 0.99\) for any p ≥ 1.
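since \(\gamma_p\) enters the critical coupling explicitly, its values are easy to check numerically; the following is a minimal sketch (our own illustration, not the computation of [3]), assuming python with numpy and scipy, which discretizes \(-u'' + |t|^p u\) on a large interval by central differences:

```python
# minimal sketch: gamma_p as the ground-state eigenvalue of -u'' + |t|^p u
# on [-L, L] with dirichlet ends standing in for decay at infinity.
import numpy as np
from scipy.linalg import eigh_tridiagonal

def gamma_p(p, L=15.0, n=4000):
    t, h = np.linspace(-L, L, n, retstep=True)
    diag = 2.0 / h**2 + np.abs(t)**p        # -u'' (central differences) + potential
    off = -np.ones(n - 1) / h**2
    w, _ = eigh_tridiagonal(diag, off, select="i", select_range=(0, 0))
    return w[0]

for p in (1, 2, 4, 8):
    print(p, gamma_p(p))    # p = 2 gives ~1.0; all values stay above 0.99
```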
theorem 2.1. for any \(\lambda \in [0, \lambda_{\mathrm{crit}})\), where \(\lambda_{\mathrm{crit}} := \gamma_p\), the operator \(L_p(\lambda)\) with p ≥ 1 is bounded from below and its spectrum is purely discrete.

sketch of the proof. fix first λ < \(\gamma_p\). by the minimax principle we need to estimate \(L_p\) from below by a self-adjoint operator with a purely discrete spectrum. to this aim we employ a suitable bracketing, imposing additional neumann conditions at concentric circles of radii n = 1, 2, .... using polar coordinates, we get a direct sum of operators acting as
\[
L^{(1)}_{n,p}\psi = -\frac{1}{r}\frac{\partial}{\partial r}\Big( r\frac{\partial\psi}{\partial r} \Big) - \frac{1}{n^2}\frac{\partial^2\psi}{\partial\varphi^2} + \Big( \frac{r^{2p}}{2^p}\,|\sin 2\varphi|^p - \lambda r^{2p/(p+2)} \Big)\psi, \qquad \frac{\partial\psi}{\partial n}\Big|_{r=n-1} = \frac{\partial\psi}{\partial n}\Big|_{r=n} = 0, \tag{2.3}
\]
on the regions \(G_n := \{ (r,\varphi)\colon n-1 \le r < n,\; 0 \le \varphi < 2\pi \}\), n = 1, 2, .... each of these annuli is compact and the potential is regular on it, hence \(\sigma(L^{(1)}_{n,p})\) is purely discrete. it thus suffices to check that \(\inf\sigma(L^{(1)}_{n,p}) \to \infty\) as n → ∞, because the spectrum of \(\bigoplus_{n=1}^\infty L^{(1)}_{n,p}\) below any fixed value will then be purely discrete. we estimate \(L^{(1)}_{n,p}\) from below by an operator with separated variables,
\[
L^{(2)}_{n,p}\psi = -\frac{1}{r}\frac{\partial}{\partial r}\Big( r\frac{\partial\psi}{\partial r} \Big) - \frac{1}{n^2}\frac{\partial^2\psi}{\partial\varphi^2} + \Big( \frac{(n-1)^{2p}}{2^p}\,|\sin 2\varphi|^p - \lambda n^{2p/(p+2)} \Big)\psi, \qquad \frac{\partial\psi}{\partial n}\Big|_{r=n-1} = \frac{\partial\psi}{\partial n}\Big|_{r=n} = 0,
\]
and establish that \(\inf\sigma(L^{(3)}_{n,p}) \to \infty\) as n → ∞, where \(L^{(3)}_{n,p}\) is the angular part of \(L^{(2)}_{n,p}\). the spectrum of \(L^{(2)}_{n,p}\) is the 'sum' of the radial and angular components, and the lowest radial eigenvalue is zero, corresponding to a constant eigenfunction. on the other hand, the angular part behaves as the anharmonic oscillator around the potential minima, and the corresponding eigenvalue prevails over the negative λ-dependent term [3]. in this way one gets \(\inf\sigma(L^{(2)}_{n,p}) \to \infty\) as n → ∞, which proves by the minimax principle the same for the operator \(L^{(1)}_{n,p}\).

2.2. the supercritical case

for large λ the spectral behaviour is different.

theorem 2.2. \(\sigma(L_p(\lambda))\), p ≥ 1, is unbounded from below if λ > λ_crit.

sketch of the proof. we use a similar technique, this time looking for an upper bound to \(L_p(\lambda)\) obtained by dirichlet bracketing: we consider the operators \(\tilde L^{(1)}_{n,p}\) acting as (2.3) on the annular domains \(G_n\) with dirichlet boundary conditions. the latter give a contribution of order \(\mathcal{O}(1)\) as n → ∞, and since the λ-dependent term now prevails, we conclude [3] that
\[
\inf\sigma\big(\tilde L^{(1)}_{n,p}\big) \to -\infty \quad \text{as } n \to \infty,
\]
which means that \(\sigma(L_p(\lambda))\) is unbounded from below.
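the transition can also be watched in a crude numerical experiment; the sketch below (ours, purely illustrative) discretizes \(L_p(\lambda)\) with p = 2 on dirichlet boxes \([-L,L]^2\) of growing size. for λ < γ₂ = 1 the lowest eigenvalue is expected to stabilize as L grows, while for λ > 1 it should keep decreasing, consistent with theorems 2.1 and 2.2; a finite box always has purely discrete spectrum, so this is indicative only, not a proof:

```python
# minimal sketch: lowest eigenvalue of a finite-difference discretization of
# -laplace + |xy|^p - lam*(x^2+y^2)^(p/(p+2)) on [-L, L]^2 with dirichlet walls.
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

def lowest_eigenvalue(p, lam, L, n=100):
    x, h = np.linspace(-L, L, n, retstep=True)
    X, Y = np.meshgrid(x, x, indexing="ij")
    V = np.abs(X * Y)**p - lam * (X**2 + Y**2)**(p / (p + 2))
    D = sp.diags([np.full(n, 2.0), -np.ones(n - 1), -np.ones(n - 1)],
                 [0, -1, 1]) / h**2                  # 1d dirichlet laplacian
    I = sp.identity(n)
    H = sp.kron(D, I) + sp.kron(I, D) + sp.diags(V.ravel())
    return eigsh(H.tocsc(), k=1, which="SA", return_eigenvectors=False)[0]

for lam in (0.5, 1.5):          # subcritical and supercritical for p = 2
    print(lam, [round(lowest_eigenvalue(2, lam, L), 3) for L in (4, 6, 8)])
```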
2.3. lower bounds to eigenvalue sums

next we will show how the eigenvalue sums of the operator \(L_p(\lambda)\) can be estimated for small values of λ. we introduce the number \(\alpha := \frac{1}{40}(5+\sqrt{105})^2 \approx 5.81\); it is clear that \(\alpha^{-1} < \gamma_p\). we denote by \(\{\lambda_{j,p}\}_{j=1}^\infty\) the eigenvalues of \(L_p(\lambda)\) arranged in ascending order; then we have the following result.

theorem 2.3. to any nonnegative \(\lambda < \alpha^{-1} \approx 0.172\) there is a positive \(c_p\) depending on p only such that
\[
\sum_{j=1}^N \lambda_{j,p} \ge c_p (1-\alpha\lambda)\,\frac{N^{(2p+1)/(p+1)}}{(\ln^p N + 1)^{1/(p+1)}} - C\lambda N, \qquad N = 1, 2, \dots, \tag{2.4}
\]
where \(C = 2\big(\frac{\alpha^2}{5}+1\big) \approx 15.51\).

sketch of the proof. we denote by \(\{\psi_{j,p}\}_{j=1}^\infty\) the system of normalized eigenfunctions corresponding to \(\{\lambda_{j,p}\}_{j=1}^\infty\),
\[
-\Delta\psi_{j,p} + \big( |xy|^p - \lambda(x^2+y^2)^{p/(p+2)} \big)\psi_{j,p} = \lambda_{j,p}\psi_{j,p}, \qquad j = 1, 2, \dots;
\]
without loss of generality we may assume these functions to be real-valued. our potential forms hyperbolic-shaped 'channels', and we have first to estimate the eigenfunction integrals in the corresponding parts of the plane. we establish [3] that for any natural j and a δ > 0 one has
\[
\int_1^\infty \!\int_0^{(1+\delta)y^{-p/(p+2)}} \!\! y^{2p/(p+2)}\psi_{j,p}^2(x,y)\,dx\,dy \le \frac{5}{2}(1+\delta)^2 \int_1^\infty\!\int_0^\infty \Big(\frac{\partial\psi_{j,p}}{\partial x}\Big)^2(x,y)\,dx\,dy + \frac{2(1+\delta)}{\delta} \int_1^\infty\!\int_0^{(1+\delta)y^{-p/(p+2)}} \!\! x^p y^p \psi_{j,p}^2(x,y)\,dx\,dy,
\]
and that for an arbitrary ε > 0 there is a number \(\theta(\varepsilon) \in [1, 1+\delta]\) such that
\[
\int_1^\infty y^{p/(p+2)}\,\psi_{j,p}^2\Big( \frac{\theta(\varepsilon)}{y^{p/(p+2)}},\, y \Big)\,dy < \frac{1}{\delta} \int_1^\infty\!\int_{y^{-p/(p+2)}}^{(1+\delta)y^{-p/(p+2)}} x^p y^p \psi_{j,p}^2(x,y)\,dx\,dy + \varepsilon.
\]
using the potential symmetry, we get an analogous bound for the other 'channels'. using next the normalization \(\|\psi_{j,p}\| = 1\) and the mentioned estimates, we find
\[
\begin{aligned}
\int_{\mathbb{R}^2} (x^2+y^2)^{p/(p+2)}\psi_{j,p}^2\,dx\,dy \le{}& \bigg( \int_{|y|\ge 1}\int_{|x|\le(1+\delta)|y|^{-p/(p+2)}} |y|^{2p/(p+2)}\psi_{j,p}^2\,dx\,dy + \int_{|y|\ge 1}\int_{|x|>(1+\delta)|y|^{-p/(p+2)}} |y|^{2p/(p+2)}\psi_{j,p}^2\,dx\,dy \\
&\; + \int_{|x|\ge 1}\int_{|y|\le(1+\delta)|x|^{-p/(p+2)}} |x|^{2p/(p+2)}\psi_{j,p}^2\,dy\,dx + \int_{|x|\ge 1}\int_{(1+\delta)|x|^{-p/(p+2)}<|y|<1} |x|^{2p/(p+2)}\psi_{j,p}^2\,dy\,dx \bigg) + 2 \\
\le{}& (1+\delta)\max\Big\{ \frac{5}{2}(1+\delta),\, \frac{2}{\delta} \Big\} \bigg( \int_{\mathbb{R}^2} |\nabla\psi_{j,p}|^2\,dx\,dy + \int_{\mathbb{R}^2} |xy|^p\psi_{j,p}^2\,dx\,dy + (1+\delta)^2 \bigg) + 2,
\end{aligned}
\]
where we have used the inequality \(|xy|^p > |y|^{2p/(p+2)}\), valid on the domain of the second of the four integrals, and an analogous bound for the fourth integral; the factor \((1+\delta)^2\) prevents double counting of the 'corner regions' with |x|, |y| ≥ 1 and \(|y| \le (1+\delta)|x|^{-p/(p+2)}\). choosing \(\delta = \frac{-5+\sqrt{105}}{10}\), we arrive at
\[
\int_{\mathbb{R}^2} (x^2+y^2)^{p/(p+2)}\psi_{j,p}^2\,dx\,dy \le \alpha \bigg( \int_{\mathbb{R}^2} |\nabla\psi_{j,p}|^2\,dx\,dy + \int_{\mathbb{R}^2} |xy|^p\psi_{j,p}^2\,dx\,dy \bigg) + C,
\]
where \(C := \alpha(1+\delta)^2 + 2 = 2\big(\frac{\alpha^2}{5}+1\big)\). since \(\lambda_{j,p}\) is the eigenvalue corresponding to the eigenfunction \(\psi_{j,p}\), the inequality derived above implies
\[
\int_{\mathbb{R}^2} |\nabla\psi_{j,p}|^2\,dx\,dy + \int_{\mathbb{R}^2} |xy|^p\psi_{j,p}^2\,dx\,dy \le \frac{1}{1-\alpha\lambda}\big( \lambda_{j,p} + C\lambda \big), \qquad j = 1, 2, \dots.
\]
next we use plancherel's theorem to express the gradients of \(\psi_{j,p}\) in the first integral and then apply the following lieb-thirring-type inequality to the resulting orthonormal series:

lemma 2.4. there is a constant \(C_p'\) such that for any orthonormal system of real-valued functions, \(\Phi = \{\varphi_j\}_{j=1}^N \subset L^2(\mathbb{R}^2)\), N = 1, 2, ..., the inequality
\[
\int_{\mathbb{R}^2} \rho_\Phi^{p+1}\,dx\,dy \le C_p' (\ln^p N + 1) \sum_{j=1}^N \int_{\mathbb{R}^2} |\xi\eta|^p\, |\hat\varphi_j|^2\,d\xi\,d\eta
\]
holds true, where \(\rho_\Phi := \sum_{j=1}^N \varphi_j^2\).

this claim was proved as theorem 2 in [6] for p = 1, and it is straightforward to extend the argument to any p ≥ 1. after a few simple manipulations, for any non-negative parameter ϱ we get
\[
C_p'' (1+\ln^p N)^{1/p}\,\varrho^{(2p+1)/p} \ge N\varrho - \frac{1}{1-\alpha\lambda} \sum_{j=1}^N (\lambda_{j,p} + C\lambda)
\]
with a new positive constant \(C_p''\). in the final step we consider
\[
\tilde g(N) = \max_{\varrho \ge 0}\Big( N\varrho - C_p''\,\varrho^{(2p+1)/p} (1+\ln^p N)^{1/p} \Big).
\]
denoting the expression in the bracket as h(ϱ), we check easily that it reaches its maximum at the point \(\varrho_{\max} = \big( \frac{p}{(2p+1)C_p''} \big)^{p/(p+1)} N^{p/(p+1)} (1+\ln^p N)^{-1/(p+1)}\), and its value there equals
\[
\tilde g(N) = h(\varrho_{\max}) = c_p\,\frac{N^{(2p+1)/(p+1)}}{(1+\ln^p N)^{1/(p+1)}}
\]
with some constant \(c_p > 0\). in this way we arrive at the bound
\[
\frac{1}{1-\alpha\lambda} \sum_{j=1}^N (\lambda_{j,p} + C\lambda) \ge h(\varrho_{\max}) = \tilde g(N),
\]
which is equivalent to the claim of the theorem.
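for completeness, the elementary maximization used in the last step can be written out explicitly (our verification, in the notation above):
\[
h'(\varrho) = N - \frac{2p+1}{p}\,C_p''\,\varrho^{(p+1)/p}(1+\ln^p N)^{1/p} = 0
\;\Longrightarrow\;
\varrho_{\max} = \Big( \frac{p}{(2p+1)C_p''} \Big)^{p/(p+1)} N^{p/(p+1)} (1+\ln^p N)^{-1/(p+1)},
\]
and since at the maximum \(C_p''\,\varrho_{\max}^{(p+1)/p}(1+\ln^p N)^{1/p} = \frac{p}{2p+1}N\), one gets
\[
h(\varrho_{\max}) = \frac{p+1}{2p+1}\,N\varrho_{\max} = c_p\,N^{(2p+1)/(p+1)}(1+\ln^p N)^{-1/(p+1)}, \qquad c_p = \frac{p+1}{2p+1}\Big( \frac{p}{(2p+1)C_p''} \Big)^{p/(p+1)}.
\]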
2.4. upper bounds

next we estimate spectral sums of \(L_p(\lambda)\), p ≥ 1, from above for any subcritical λ. it will show, in particular, that in the case λ = 0 the asymptotics given by theorem 2.3 is exact up to the value of the constant.

theorem 2.5. to any p ≥ 1 there is a constant \(\tilde c_p > 0\) such that
\[
\sum_{j=1}^N \lambda_{j,p} \le \tilde c_p\,\frac{N^{(2p+1)/(p+1)}}{(1+\ln^p N)^{1/(p+1)}}, \qquad N = 1, 2, \dots,
\]
holds for any 0 ≤ λ < \(\gamma_p\).

sketch of the proof. it clearly suffices to prove the claim for λ = 0. consider the operator \(\hat H_p = -\Delta + Q\), where \(Q(x,y) = |xy|^p + |x|^p + |y|^p + 1\), in \(L^2(\mathbb{R}^2)\). its spectrum is discrete by theorem 2.1 in combination with the minimax principle, since \(\hat H_p \ge L_p\); thus we have to establish the bound of the indicated type for the eigenvalues \(0 \le \beta_{1,p} \le \beta_{2,p} \le \cdots\) of the estimating operator \(\hat H_p\). we employ weyl asymptotics for the number of eigenvalues of below-bounded differential operators in a version due to g. rozenblum. let \(T = -\Delta + V\) in \(\mathbb{R}^m\), where the potential V(x) ≥ 1 and tends to infinity as |x| → ∞. we denote by E(λ, V) the set \(\{ x \in \mathbb{R}^m\colon V(x) < \lambda \}\) and put \(\sigma(\lambda, V) = \operatorname{mes} E(\lambda, V)\). for any unit cube \(D \subset \mathbb{R}^m\) we denote the mean value of the function V in D by \(V_D\). furthermore, given a function \(f \in L^1(D)\) and \(t \le \sqrt{m}\), we define its \(L^1\)-modulus of continuity by the formula
\[
\omega_1(f, t, D) := \sup_{|z| \le t} \int_D |f(x+z) - f(x)|\,dx.
\]

lemma (rozenblum). suppose that:
(1.) there is a c > 0 such that \(\sigma(2\lambda, V) \le c\,\sigma(\lambda, V)\) holds for all λ large enough;
(2.) \(V(y) \le c\,V(x)\) holds if |x − y| ≤ 1;
(3.) there is a continuous and monotonous function \(\eta\colon t \in [0, \sqrt{m}] \to \mathbb{R}_+\) with η(0) = 0 and a number \(\beta \in [0, \frac12)\) such that for any unit square D we have
\[
\omega_1(V, t, D) \le \eta(t)\, t^{2\beta}\, V_D^{1+\beta}.
\]
3. Schrödinger operators in cusps with non-trivial geometry

In the second part of the paper we discuss Schrödinger-type operators

H_Ω = −Δ_D^Ω − V  (3.1)

with a bounded measurable potential V ≥ 0 on L²(Ω), where −Δ_D^Ω is the Dirichlet Laplacian on a region Ω ⊂ ℝ^d. In the spirit of the previous considerations, we are particularly interested in situations where Ω is unbounded but H_Ω still has a purely discrete spectrum. It is well known [1, 9] that for some unbounded regions the spectrum may be purely discrete; typically this happens if Ω has cusps. The negative spectrum of H_Ω consists of a finite number of eigenvalues, counted with their multiplicities; it is natural to ask for bounds on the negative spectral moments in terms of the geometrical properties of the region, in the spirit of the seminal work of Lieb and Thirring [10], or, in the present context, referring to Berezin, Lieb, Li and Yau [11–14]. Note that estimates of this type have been derived recently in [5] for various cusped regions; a typical example is Ω = {(x,y) ∈ ℝ² : |xy| < 1} with hyperbolic ends. Our goal is to discuss situations when such infinite cusps of Ω are geometrically nontrivial, being curved or twisted, to see in which way the geometry influences the spectral estimates.

First we discuss a curved planar cusp and derive estimates on the negative spectral moments which include a curvature-induced potential. This result can be generalized to bent cusps with a circular cross section; we refer to [4] for a description. Then, in Section 3.2, we consider cusps of non-circular cross section in ℝ³ which are straight but twisted. The geometry of the region will again enter the obtained eigenvalue estimates, now in a different way than for curved cusps, because the effective interaction associated with twisting is repulsive rather than attractive.

3.1. Curved planar cusps

We consider an infinite cusp-shaped Ω ⊂ ℝ², assuming that its boundary is smooth, so that one can describe the region by specifying its axis and the cusp width at each point of it. This allows us to employ natural curvilinear coordinates, by analogy with the theory of quantum waveguides [15], and to 'straighten' the cusp, translating its geometric properties into the coefficients of the resulting operator. To be specific, we characterize our region by three functions, sufficiently smooth a, b : ℝ → ℝ and a positive and continuous f : ℝ → ℝ₊, in such a way that

Ω := {(a(s) − u ḃ(s), b(s) + u ȧ(s)) : s ∈ ℝ, |u| < f(s)},  (3.2)

where the dot marks the derivative with respect to s; to make the region Ω cusp-shaped, we shall always suppose that

lim_{|s|→∞} f(s) = 0.  (3.3)

Since the reference curve Γ = {(a(s), b(s)) : s ∈ ℝ} can always be parametrized by its arc length, we may suppose without loss of generality that ȧ(s)² + ḃ(s)² = 1 and s is the arc length. The signed curvature γ(s) of Γ is then given by

γ(s) = ḃ(s) ä(s) − ȧ(s) b̈(s);

knowing it, one can reconstruct the functions a, b describing the Cartesian coordinates of the cusp axis, modulo Euclidean transformations. Under the condition (3.3) the region is quasi-bounded, so it may have a purely discrete spectrum. This is indeed the case; recall that the necessary and sufficient condition for a purely discrete spectrum [16, Thm 2.8] is that we can cover Ω by a family of unit balls whose centres tend to infinity in such a way that the volumes of their intersections with Ω tend to zero; it is not difficult to construct such a ball sequence if (3.3) is valid.

Our main aim is to provide bounds on the eigenvalue moments; as usual when dealing with a Lieb–Thirring-type problem, we restrict our attention to the negative part of the spectrum, noting that it can always be made non-empty by including a suitable constant in the potential.

Theorem 3.1. Consider the Schrödinger operator (3.1) on the region (3.2). Suppose that the curvature γ ∈ C⁴, the inequality ‖f(·)γ(·)‖_{L^∞(ℝ)} < 1 holds true, and Ω does not intersect itself. Then for any σ ≥ 3/2 we have the estimate

tr (H_Ω)_−^σ ≤ ‖1 + f|γ|‖_∞^{−2σ} L_{σ,1}^{cl} ∫_ℝ Σ_{j=1}^∞ ( −(πj/(2f(s)))² + ‖1 + f|γ|‖_∞² W_−(s) + ‖1 + f|γ|‖_∞² ‖Ṽ(s,·)‖_∞ )_+^{σ+1/2} ds,  (3.4)

where ‖·‖_∞ ≡ ‖·‖_{L^∞(ℝ)} and L_{σ,1}^{cl} is the semiclassical constant,

L_{σ,1}^{cl} := Γ(σ+1) / (√(4π) Γ(σ + 3/2)),  (3.5)

and furthermore we have introduced

W_−(s) := γ(s)²/(4(1 − f(s)|γ(s)|)²) + f(s)|γ̈(s)|/(2(1 − f(s)|γ(s)|)³) + 5f²(s)γ̇(s)²/(4(1 − f(s)|γ(s)|)⁴)

and Ṽ(s,u) := V(a(s) − u ḃ(s), b(s) + u ȧ(s)).
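Since all the ingredients of the bound (3.4) are explicit, it can be evaluated numerically once f, γ and the potential are specified. The sketch below — our illustration, with a hypothetical Gaussian width profile and constant curvature, not an example taken from the paper — implements W₋(s) and the quantity whose positive part, raised to the power σ + 1/2, is summed and integrated in (3.4); the symbol K stands for ‖1 + f|γ|‖_∞ and Vsup(s) for ‖Ṽ(s,·)‖_∞.

import numpy as np

def W_minus(s, f, gamma, dgamma, ddgamma):
    # curvature-induced potential W₋(s) of Theorem 3.1;
    # f, gamma, dgamma = dγ/ds and ddgamma = d²γ/ds² are callables
    fs, g = f(s), gamma(s)
    d = 1.0 - fs*abs(g)                     # positive, since ‖fγ‖∞ < 1
    return (g**2/(4*d**2)
            + fs*abs(ddgamma(s))/(2*d**3)
            + 5*fs**2*dgamma(s)**2/(4*d**4))

def summand(s, j, f, gamma, dgamma, ddgamma, Vsup, K):
    # positive part of −(πj/2f(s))² + K²W₋(s) + K²‖Ṽ(s,·)‖∞, cf. (3.4)
    val = (-(np.pi*j/(2*f(s)))**2
           + K**2*W_minus(s, f, gamma, dgamma, ddgamma)
           + K**2*Vsup(s))
    return max(val, 0.0)

# hypothetical data: Gaussian width profile, constant curvature γ = 0.5
f = lambda s: 0.3*np.exp(-s**2)
gamma, dgamma, ddgamma = (lambda s: 0.5), (lambda s: 0.0), (lambda s: 0.0)
Vsup = lambda s: 1.0
K = 1.0 + 0.3*0.5                           # ‖1 + f|γ|‖∞ for these choices
print(summand(0.0, 1, f, gamma, dgamma, ddgamma, Vsup, K))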
Sketch of the proof. Using the mentioned 'straightening' transformation [15], we infer that H_Ω is unitarily equivalent to the operator H₀ on L²(Ω₀) acting as

(H₀ψ)(s,u) = −∂/∂s ( (1 + uγ(s))^{−2} ∂ψ/∂s (s,u) ) − ∂²ψ/∂u² (s,u) + ((W − Ṽ)ψ)(s,u),

where Ω₀ = {(s,u) : s ∈ ℝ, |u| < f(s)}, the curvature-induced potential is

W(s,u) := −γ²(s)/(4(1 + uγ(s))²) + uγ̈(s)/(2(1 + uγ(s))³) − (5/4) u²γ̇²(s)/(1 + uγ(s))⁴,

and Dirichlet boundary conditions are imposed at u = ±f(s). In view of the unitary equivalence, it is enough to establish inequality (3.4) for the operator H₀. We employ the minimax principle: consider the operator H₀⁻ defined on the domain H₀²(Ω₀) in L²(Ω₀) by

H₀⁻ = −Δ_D^{Ω₀} − ‖1 + f|γ|‖_∞² (W_− + Ṽ),

where −Δ_D^{Ω₀} is, as usual, the corresponding Dirichlet Laplacian; it is obvious that

H₀ ≥ ‖1 + f|γ|‖_∞^{−2} H₀⁻  (3.6)

holds true, and therefore it is sufficient to get an upper bound on the negative eigenvalue moments of H₀⁻. We use a variational argument — see [17] — to estimate the negative eigenvalue moments of H₀⁻ by those of another operator, with an operator-valued potential, defined on the domain H¹(ℝ, L²(ℝ)) and given as follows:

−∂²/∂s² ⊗ I_{L²(ℝ)} + h(s, Ṽ, W_−),

where h(s, Ṽ, W_−) is the negative part of the Sturm–Liouville operator

−d²/du² − ‖1 + f|γ|‖_∞² (W_− + Ṽ).

Consequently,

tr (H₀⁻)_−^σ ≤ tr ( −∂²/∂s² ⊗ I_{L²(ℝ)} + h(s, Ṽ, W_−) )_−^σ

holds for any nonnegative number σ. This makes it possible to employ the version of the Lieb–Thirring inequality for operator-valued potentials [18], which yields

tr (H₀⁻)_−^σ ≤ L_{σ,1}^{cl} ∫_ℝ tr ( h(s, Ṽ, W_−) )_−^{σ+1/2} ds,  σ ≥ 3/2,  (3.7)

with the semiclassical constant L_{σ,1}^{cl}. It remains to estimate the negative spectrum of the Sturm–Liouville operator

−d²/du² − ‖1 + f|γ|‖_∞² ( W_−(s) + ‖Ṽ(s,·)‖_∞ ),

which is easily done; its eigenvalue moments are controlled by the quantities

( (πj/(2f(s)))² − ‖1 + f|γ|‖_∞² W_−(s) − ‖1 + f|γ|‖_∞² ‖Ṽ(s,·)‖_∞ )_−,  j = 1, 2, … ;

hence, in view of (3.6) and (3.7), we find that

tr (H₀)_−^σ ≤ ‖1 + f|γ|‖_∞^{−2σ} L_{σ,1}^{cl} ∫_ℝ Σ_{j=1}^∞ ( −(πj/(2f(s)))² + ‖1 + f|γ|‖_∞² W_−(s) + ‖1 + f|γ|‖_∞² ‖Ṽ(s,·)‖_∞ )_+^{σ+1/2} ds,

which proves the theorem.

We finish this section with two remarks. First we note that, while the standard phase-space-volume estimates give the correct high-energy behaviour, one can find finite regions Ω for which there exists an intermediate energy region where the bound (3.4) is much stronger than the Berezin–Li–Yau inequality; an example is given in paper [4]. Second, we refer to the same paper for the mentioned generalization of Theorem 3.1 to higher dimensions.

3.2. Twisted cusps of non-circular cross section in ℝ³

Let us now look at another type of nontrivial cusp geometry. As before, we suppose that the cross section changes along the curve playing the role of the axis; however, now we allow it to be non-circular. Consider an open connected set ω₀ ⊂ ℝ² and a positive function f : ℝ → ℝ satisfying condition (3.3), and set

ω_s := f(s) ω₀,  (3.8)

where we use the conventional shorthand αA := {(αx, αy) : (x,y) ∈ A} for α > 0 and A ⊂ ℝ². Using (3.8), we define a straight cusped region determined by ω₀ and the function f as

Ω₀ := {(s,x,y) : s ∈ ℝ, (x,y) ∈ ω_s}.

Next we twist the region Ω₀. We fix a C¹-smooth function θ : ℝ → ℝ with a bounded derivative, ‖θ̇‖_∞ < ∞, and introduce the set Ω_θ as the image

Ω_θ := L_θ(Ω₀),  (3.9)

where the map L_θ : ℝ³ → ℝ³ is given by

L_θ(s,x,y) := (s, x cos θ(s) + y sin θ(s), −x sin θ(s) + y cos θ(s)).  (3.10)
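The map (3.10) is easy to transcribe directly into code; the following sketch — an illustration with a hypothetical rectangular cross section, Gaussian width profile and linear twist, none of which are specific choices made in the paper — samples points of the twisted region Ω_θ from a point sample of ω₀.

import numpy as np

def twisted_region(s_vals, omega0_pts, f, theta):
    # points of Ω_θ = L_θ(Ω₀): the cross section at s is f(s)·ω₀, eq. (3.8),
    # rotated by the angle θ(s) as in (3.10)
    pts = []
    for s in s_vals:
        c, sn = np.cos(theta(s)), np.sin(theta(s))
        for (x0, y0) in omega0_pts:
            x, y = f(s)*x0, f(s)*y0
            pts.append((s, x*c + y*sn, -x*sn + y*c))
    return np.array(pts)

omega0 = [(x, y) for x in np.linspace(-1, 1, 9)
                 for y in np.linspace(-0.4, 0.4, 5)]   # non-circular ω₀
region = twisted_region(np.linspace(-3, 3, 61), omega0,
                        f=lambda s: np.exp(-s**2),     # satisfies (3.3)
                        theta=lambda s: 0.5*s)         # bounded θ̇
print(region.shape)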
We are interested primarily in nontrivial situations, assuming that:

(1.) the function θ is not constant;
(2.) ω₀ is not rotationally symmetric with respect to the origin in ℝ².

To formulate the result of this section, we need a few more preliminaries. First of all, we introduce ρ := sup_{(x,y)∈ω₀} √(x²+y²) and assume that

ρ ‖f(·)θ̇(·)‖_∞ < 1.  (3.11)

Next we set Ṽ(s,x,y) := V(L_θ(s,x,y)), by analogy with the corresponding definitions in the previous sections, and finally we introduce the operator

L_trans := −i ( x ∂/∂y − y ∂/∂x ),  dom(L_trans) = H₀¹(ω₀),

describing the angular momentum component canonically associated with rotations in the transverse plane. We have the following claim:

Theorem 3.2. Let H_{Ω_θ} be the operator (3.1) referring to the region Ω_θ defined by (3.9) and (3.10), with a potential V ≥ 0 which is bounded and measurable. Under assumption (3.11), for the negative spectrum of H_{Ω_θ} the inequality

tr (H_{Ω_θ})_−^σ ≤ L_{σ,1}^{cl} (1 − ρ‖fθ̇‖_∞)^σ ∫_ℝ Σ_{j=1}^∞ ( −λ_{0,j}(s)/f²(s) + ‖Ṽ(s,·)‖_∞/(1 − ρ‖fθ̇‖_∞) )_+^{σ+1/2} ds

holds true for σ ≥ 3/2, where L_{σ,1}^{cl} is the constant (3.5) and λ_{0,j}(s), j = 1, 2, …, are the eigenvalues of the operator

h_{f,θ}(s) := −Δ_D^{ω₀} + f²(s) θ̇²(s) L_trans²

defined on the domain H₀²(ω₀) in L²(ω₀).

Sketch of the proof. As before, we employ suitable curvilinear coordinates, this time to 'untwist' the region. We define a unitary operator from L²(Ω_θ) to L²(Ω₀) by U_θψ := ψ ∘ L_θ, which allows us to pass from H_{Ω_θ} to the operator H₀ := U_θ H_{Ω_θ} U_θ^{−1} in L²(Ω₀). From paper [19] we know that H₀ is the self-adjoint operator associated with the quadratic form Q₀,

Q₀[ψ] := ‖∂_s ψ + iθ̇ L_trans ψ‖² + ‖∇′ψ‖² − ∫_{Ω₀} (Ṽ|ψ|²)(s,x,y) ds dx dy,

defined on H₀¹(Ω₀), where ∇′ := (∂_x, ∂_y) is the transverse gradient and the norms refer to L²(Ω₀). For any function ψ ∈ H₀¹(ω_s) we have |L_trans ψ| ≤ ρ f(s) |∇′ψ|, and applying the Cauchy–Schwarz inequality we infer

2 | ∫_{Ω₀} θ̇ ∂_sψ L_trans ψ ds dx dy | ≤ ρ‖fθ̇‖_∞ ( ‖∂_sψ‖²_{L²(Ω₀)} + ‖∇′ψ‖²_{L²(Ω₀)} ),

thus Q₀[ψ] can be estimated from below by

(1 − ρ‖fθ̇‖_∞) ( ‖∇ψ‖²_{L²(Ω₀)} + ‖θ̇ L_trans ψ‖²_{L²(Ω₀)} ) − ∫_{Ω₀} ‖Ṽ(s,·)‖_∞ |ψ|² ds dx dy.

We introduce the operator

H₀⁻ = −Δ_D^{Ω₀} + θ̇² L_trans² − (1 − ρ‖fθ̇‖_∞)^{−1} ‖Ṽ(s,·)‖_∞

defined on H₀²(Ω₀); then the above estimate implies

H₀ ≥ (1 − ρ‖fθ̇‖_∞) H₀⁻,  (3.12)

hence, by the minimax principle and condition (3.11), it is enough to establish the upper estimate for the negative spectrum of the operator H₀⁻. Using again the variational technique, one can estimate the negative eigenvalue moments of H₀⁻ by the moments of an operator with separated variables,

−∂²/∂s² ⊗ I_{L²(ℝ²)} + h(s, Ṽ),

defined on the domain H¹(ℝ, L²(ℝ²)), i.e.

tr (H₀⁻)_−^σ ≤ tr ( −∂²/∂s² ⊗ I_{L²(ℝ²)} + h(s, Ṽ) )_−^σ,  σ ≥ 0.

In view of the Lieb–Thirring inequality for operator-valued potentials, this implies

tr (H₀⁻)_−^σ ≤ L_{σ,1}^{cl} ∫_ℝ tr h(s, Ṽ)_−^{σ+1/2} ds for any σ ≥ 3/2,  (3.13)

with the semiclassical constant L_{σ,1}^{cl}. Then, by virtue of the unitary equivalence of the operators H_{Ω_θ} and H₀, inequalities (3.12), (3.13) and condition (3.11), we get

tr (H_{Ω_θ})_−^σ ≤ L_{σ,1}^{cl} (1 − ρ‖fθ̇‖_∞)^σ ∫_ℝ tr h(s, Ṽ)_−^{σ+1/2} ds for σ ≥ 3/2.  (3.14)

It is easy to see that the eigenvalues of the operator h(s, Ṽ) are

( λ_{0,j}(s)/f²(s) − (1 − ρ‖fθ̇‖_∞)^{−1} ‖Ṽ(s,·)‖_∞ )_−,  j = 1, 2, …,  (3.15)

where λ_{0,j}(s), j = 1, 2, …, are the eigenvalues of the operator h_{f,θ}(s) := −Δ_D^{ω₀} + f²(s) θ̇²(s) L_trans². From inequalities (3.14) and (3.15) it follows that tr (H_{Ω_θ})_−^σ is estimated by

L_{σ,1}^{cl} (1 − ρ‖fθ̇‖_∞)^σ ∫_ℝ Σ_{j=1}^∞ ( −λ_{0,j}(s)/f²(s) + (1 − ρ‖fθ̇‖_∞)^{−1} ‖Ṽ(s,·)‖_∞ )_+^{σ+1/2} ds

for any σ ≥ 3/2, which proves the theorem.
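The transverse eigenvalues λ_{0,j}(s) entering the theorem can be approximated numerically. The following rough finite-difference sketch — our own illustration with an arbitrary square cross section and grid size, not code from the paper — discretizes h_{f,θ}(s) = −Δ_D + c·L²_trans on ω₀ = (−1,1)², where the constant c stands for f²(s)θ̇²(s) at a fixed s; it also exhibits the repulsive character of the twisting term, since the eigenvalues move up as c grows.

import numpy as np
import scipy.sparse as sps
from scipy.sparse.linalg import eigsh

def lambda0(c, n=40, a=1.0):
    # eigenvalues of −Δ_D + c·L²_trans on (−a,a)², L_trans = −i(x∂_y − y∂_x)
    h = 2*a/(n + 1)
    x = -a + h*np.arange(1, n + 1)                     # interior grid points
    I = sps.identity(n, format='csr')
    D2 = sps.diags([1, -2, 1], [-1, 0, 1], shape=(n, n), format='csr')/h**2
    D1 = sps.diags([-1, 1], [-1, 1], shape=(n, n), format='csr')/(2*h)
    X = sps.diags(x)                                   # coordinate operator
    lap = sps.kron(D2, I, format='csr') + sps.kron(I, D2, format='csr')
    L = -1j*(sps.kron(X, D1, format='csr') - sps.kron(D1, X, format='csr'))
    H = -lap + c*(L @ L)
    H = (H + H.getH())/2                               # guard against round-off
    vals = eigsh(H, k=6, which='SA', return_eigenvectors=False)
    return np.sort(vals.real)

print(lambda0(0.0))   # pure square: λ ≈ (π/2)²(j² + k²), lowest ≈ 4.93
print(lambda0(2.0))   # twisting term pushes the eigenvalues upwards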
Acknowledgements

The research presented here was supported by the Czech Science Foundation within project P203/11/0701. The authors are grateful to the referees for reading the manuscript carefully and pointing out some minor slips of the pen.

References

[1] Simon, B.: Some quantum operators with discrete spectrum but classically continuous spectrum, Ann. Phys. 146 (1983), 209–220.
[2] Znojil, M.: Quantum exotic: a repulsive and bottomless confining potential, J. Phys. A: Math. Gen. 31 (1998), 3349–3355.
[3] Exner, P., Barseghyan, D.: Spectral estimates for a class of Schrödinger operators with infinite phase space and potential unbounded from below, J. Phys. A: Math. Theor. 45 (2012), 075204.
[4] Exner, P., Barseghyan, D.: Spectral estimates for Dirichlet Laplacians and Schrödinger operators on geometrically nontrivial cusps, to appear in J. Spect. Theory, arXiv:1203.2098.
[5] Reed, M., Simon, B.: Methods of Modern Mathematical Physics, II. Fourier Analysis. Self-Adjointness, Academic Press, New York 1975.
[6] Barseghyan, D.: On the possibility of strengthening the Lieb–Thirring inequality, Math. Notes 86 (2009), 803–818.
[7] Rozenblum, G.: Asymptotics of the eigenvalues of Schrödinger operators, Mat. Sbornik (N.S.) 93(135) (1974), 347–367.
[8] Adams, R.A., Fournier, J.F.: Sobolev Spaces, 2nd ed., Academic Press, New York 2003.
[9] Geisinger, L., Weidl, T.: Sharp spectral estimates in domains of infinite volume, Rev. Math. Phys. 23 (2011), 615–641.
[10] Lieb, E.H., Thirring, W.: Inequalities for the moments of the eigenvalues of the Schrödinger Hamiltonian and their relation to Sobolev inequalities, in Studies in Math. Phys., Essays in Honor of Valentine Bargmann (E. Lieb, B. Simon and A.S. Wightman, eds.), Princeton Univ. Press, Princeton 1976, pp. 269–330.
[11] Berezin, F.A.: Covariant and contravariant symbols of operators, Izv. Akad. Nauk SSSR Ser. Mat. 36 (1972), 1134–1167.
[12] Berezin, F.A.: Convex functions of operators, Mat. Sb. (N.S.) 36(130) (1972), 268–276.
[13] Lieb, E.H.: The classical limit of quantum spin systems, Commun. Math. Phys. 31 (1973), 327–340.
[14] Li, P., Yau, S.T.: On the Schrödinger equation and the eigenvalue problem, Commun. Math. Phys. 88 (1983), 309–318.
[15] Exner, P., Šeba, P.: Bound states in curved quantum waveguides, J. Math. Phys. 30 (1989), 2574–2580.
[16] Berger, M.S., Schechter, M.: Embedding theorems and quasi-linear elliptic boundary value problems for unbounded domains, Trans. Am. Math. Soc. 172 (1972), 261–278.
[17] Weidl, T.: Improved Berezin–Li–Yau inequalities with a remainder term, in Spectral Theory of Differential Operators, Amer. Math. Soc. Transl. 225 (2008), 253–263.
[18] Laptev, A., Weidl, T.: Sharp Lieb–Thirring inequalities in high dimensions, Acta Math. 184 (2000), 87–100.
[19] Krejčiřík, D., Zuazua, E.: The Hardy inequality and the heat equation in twisted tubes, J. Diff. Eqs. 250 (2011), 2334–2346.
Acta Polytechnica 53(Supplement):534–537, 2013 · doi:10.14311/ap.2013.53.0534

The 2H(α,γ)6Li reaction at LUNA and Big Bang nucleosynthesis

Carlo Gustavino (INFN Sezione di Roma, I-00185 Roma, Italy; corresponding author: carlo.gustavino@roma1.infn.it)

Abstract. The 2H(α,γ)6Li reaction is the leading process for the production of 6Li in standard Big Bang nucleosynthesis. Recent observations of the lithium abundance in metal-poor halo stars suggest that there might be a 6Li plateau, similar to the well-known Spite plateau of 7Li. This calls for a re-investigation of the standard production channel for 6Li. As the 2H(α,γ)6Li cross section drops steeply at low energy, it had never before been studied directly at Big Bang energies. For the first time, the reaction has been studied directly at Big Bang energies at the LUNA accelerator. The preliminary data and their implications for Big Bang nucleosynthesis and the purported 6Li problem are discussed.

Keywords: LUNA experiment, nuclear astrophysics, Big Bang nucleosynthesis, 6Li abundance, 2H(α,γ)6Li reaction.

1. Introduction

In its standard picture, Big Bang nucleosynthesis (BBN) occurs during the first minutes of the Universe, with the formation of light isotopes such as D, 3He, 4He, 6Li and 7Li through the reaction chain shown in Figure 1. Their abundances depend on Standard Model physics, on the baryon-to-photon ratio η, and on the nuclear cross sections of the processes involved. Cosmic microwave background (CMB) experiments provide the value of η with high precision (at the percent level) [1]. The BBN theory therefore makes definite predictions for the abundances of the light elements, as far as the nuclear cross sections of the leading processes are known. The observed abundances of D, 3He and 4He are in good agreement with the calculations, confirming the overall validity of BBN theory. On the other hand, the observed abundance of 7Li is a factor of 2–3 lower than the predicted abundance (see Figure 2), while the amount of 6Li observed in metal-poor stars is unexpectedly large compared to the BBN prediction — about 3 orders of magnitude higher than the calculated value (see Figure 2). Even though many of the claimed 6Li detections may be in error, for a very few metal-poor stars there still seems to be a significant amount of 6Li [2]. The difference between the observed and calculated values may reflect unknown post-primordial processes or physics beyond the Standard Model.

The leading process to synthesize 6Li is the 2H(α,γ)6Li reaction. Its cross section is very small at BBN energies (30 ≲ E(keV) ≲ 400) because of the Coulomb repulsion between the interacting nuclei; therefore it had never been measured experimentally, and theoretical predictions remain uncertain [3]. The process had previously been studied experimentally only at energies above 1 MeV and around the 711 keV resonance [4, 5]. There have been two attempts to determine the 2H(α,γ)6Li cross section at BBN energies using the Coulomb dissociation technique [6, 7].

Figure 1. Leading processes of Big Bang nucleosynthesis. The red arrows show the reactions measured by the LUNA collaboration. Yellow boxes mark stable isotopes.
In this approach, an energetic 6Li beam passes close to a target of high nuclear charge, and the time-reversed reaction 6Li(γ,α)2H is studied by means of the virtual photons which are exchanged. The measurements mentioned above are shown in Figure 3. In the figure, the cross section σ is parameterized by the astrophysical factor S(E), defined by the formula

σ(E) = S(E) e^{−2πη} / E.  (1)

S(E) contains all the nuclear effects and, for non-resonant reactions, it is a smoothly varying function of energy. The exponential term takes into account the Coulomb barrier. The Sommerfeld parameter η is given by 2πη = 31.29 Z₁Z₂ (μ/E)^{1/2}, where Z₁ and Z₂ are the nuclear charges of the interacting nuclei, μ is their reduced mass (in units of a.m.u.) and E is the center-of-mass energy (in units of keV).

Figure 2. Abundances of 7Li and 6Li as functions of the η parameter. Observations are represented as green horizontal dashed bands. The blue band shows the calculated abundance of 7Li. The calculated abundance of 6Li is obtained using the NACRE compilation recommended values (dashed lines). The vertical yellow band indicates the η parameter as measured by the WMAP experiment.

Figure 3. The astrophysical factor of the 2H(α,γ)6Li reaction as a function of the center-of-mass energy. Direct [4, 5] and indirect [6, 7] measurements are reported. The BBN energy region and the energy range studied by LUNA are also indicated.

Since Coulomb dissociation measurements depend strongly on theoretical assumptions, nuclear effects being dominant, only a direct measurement of the 2H(α,γ)6Li reaction in the BBN energy range can give a reliable basis for computing the 6Li primordial abundance. The present paper reports on the first direct measurement of the 2H(α,γ)6Li reaction, performed by the LUNA collaboration (LUNA — Laboratory for Underground Nuclear Astrophysics). The measurement has been performed with the only underground accelerator in the world, situated at the LNGS laboratory (LNGS — Laboratorio Nazionale del Gran Sasso). The Gran Sasso mountain provides a natural shielding which reduces the muon and neutron fluxes by factors of 10⁶ and 10³, respectively. The suppression of the cosmic-ray-induced background also allows an effective suppression of the γ-ray activity by a factor of 10²–10⁵, depending on the photon energy [8].
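Equation (1) together with the quoted Sommerfeld parameter makes the steepness of the Coulomb suppression easy to quantify. The snippet below is a direct transcription of these formulas for the d+α system (Z₁ = 1, Z₂ = 2, μ ≈ 1.34 a.m.u.); the constant S-factor used in the example is a placeholder for illustration only, not a LUNA result.

import numpy as np

def cross_section(E_keV, S_keV_b, Z1=1, Z2=2,
                  mu=2.014*4.0026/(2.014 + 4.0026)):
    # σ(E) = S(E) exp(−2πη)/E, with 2πη = 31.29 Z1 Z2 (μ/E)^{1/2};
    # E in keV (center of mass), μ in a.m.u., σ in barn if S is in keV·b
    two_pi_eta = 31.29*Z1*Z2*np.sqrt(mu/E_keV)
    return S_keV_b*np.exp(-two_pi_eta)/E_keV

for E in (30.0, 100.0, 134.0, 400.0):                  # BBN energy range
    print(E, cross_section(E, 4e-6))                   # placeholder S-factor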
the measurement of the 2h(α, γ)6li reaction is affected by an inevitable beam induced background. in fact, the 2h(α, α)2h rutherford scattering induces a small number of 2h(2h, n)3he and 2h(2h, p)3h reactions. while the 2h(2h, p)3h reaction is not a problem in this context, the neutrons produced by the 2h(2h, n)3he reaction (en(cm) = 2450 kev) induce (n,n′γ) reactions in the ge detector and in the surrounding materials (lead, steel, copper), generating a beam-induced background in the γ-ray spectrum, in particular around 1.6 mev, where the capture transition to the ground state of 6li is expected. to reduce the neutron production, a tube 16 cm long, with a square cross section of 2 × 2 cm2 is inserted inside the chamber. the tube strongly reduces the effective path for the scattered deuterons and therefore the neutron yield due to the 2h(2h, n)3he reaction is reduced at the level of few neutrons/second. finally, one silicon detector is faced to the gas target volume to monitor the running conditions through the detection of protons generated in the 2h(2h, p)3h reaction (ep(cm) = 3022 kev). the measurement of the number of protons detected is strictly related to the number of neutrons produced, since the cross sections of the two 535 carlo gustavino acta polytechnica figure 4. spectra taken with the ge detector. blue full line: beam induced background spectrum at eα = 400 kev and ptarget = 0.3 mbar. grey thin line: laboratory background. figure 5. experimental setup. conjugate 2h(2h, n)3he and 2h(2h, p)3h reactions are well known. figure 4 shows the ge spectrum at eα = 400 kev and pdeuterium = 0.3 mbar. various transitions due to the interaction of neutrons with the germanium and the surrounding materials can be identified. 3. method the energy of γ’s coming from d+α reaction strongly depends on the beam energy, through the relationship eγ = 1473.8 + ecm ± δedoppler. (2) as shown in fig. 6, the γ-rays energy strongly depends on the beam energy. the region of interest (roi) is about 30 kev wide, because of the doppler broadening. as the γ’s produced the 2h(α, γ)6li reaction strongly depends on the beam energy, it is possible to extract the signal with a measurement performed in two steps: figure 6. simulated full peak detection of γ’s from 2h(α, γ)6li in the luna ge detector, at different beam energy. note the doppler broadening of about 30 kev and the dependence with the beam energy. (1.) measurement with ebeam = 400 kev on d2 target. the ge spectrum is mainly due to the background induced by neutrons. the 2h(α, γ)6li γ signal is expected in a well defined energy region (1592 ÷ 1620 kev, see fig. 6). (2.) same as (1.), but with ebeam = 280 kev. the background is essentially the same as before, while the gammas from the 2h(α, γ)6li γ reaction are shifted to 1555 ÷ 1578 kev (see fig. 6). figure 7a shows the spectra with eα = 400, 280 kev, respectively. a counting excess is clearly visible in the eα = 400 kev roi. unfortunately, at eα = 280 kev the very low reaction yield prevents from any conclusion statistically significant. to verify that the counting excess at eα = 400 kev is a genuine γ signal com536 vol. 53 supplement/2013 the 2h(α, γ)6li reaction at luna and big bang nucleosynthetis figure 7. a) experimental ge spectra for ebeam = 400 kev (black line) and for ebeam = 280 kev (red line); b) experimental ge spectra for ebeam = 360 kev (black line) and for ebeam = 240 kev (red line). 
Figure 7a shows the spectra taken at E_α = 400 and 280 keV, respectively. A counting excess is clearly visible in the E_α = 400 keV ROI. Unfortunately, at E_α = 280 keV the very low reaction yield prevents any statistically significant conclusion. To verify that the counting excess at E_α = 400 keV is a genuine γ signal coming from 2H(α,γ)6Li reactions, the measurement was repeated with the beam energies shifted to E_α = 360 and 240 keV. As shown in Figure 7b, the counting excess at the higher energy is shifted as expected, despite the worse signal-to-noise ratio and the shorter measurement time.

Figure 7. a) Experimental Ge spectra for E_beam = 400 keV (black line) and E_beam = 280 keV (red line); b) experimental Ge spectra for E_beam = 360 keV (black line) and E_beam = 240 keV (red line). T, 〈P〉 and Q are, respectively, the measurement time, the averaged target pressure and the integrated beam current. The bands indicate the ROIs at E_α = 400 keV and E_α = 280 keV in a), and at E_α = 360 keV and E_α = 240 keV in b). Note the counting excess visible in correspondence with the 400 keV ROI; as foreseen, the excess shifts to the E_α = 360 keV ROI in b).

4. Conclusion

The cross section of the 2H(α,γ)6Li reaction has been measured for the first time at BBN energies. Although the data analysis is still in progress, the LUNA measurement excludes a nuclear solution to the purported 6Li problem. The observation of a 'huge' amount of 6Li in metal-poor stars must be explained in a different way, e.g. by systematics in the 6Li observations or by physics beyond the Standard Model. In any case, a solid experimental footing is now available for calculating the 6Li primordial abundance.

References

[1] Spergel, D.N., et al.: 2007, ApJS 170, 377.
[2] See the proceedings of "Lithium in the Cosmos", 27–29 February, Paris.
[3] Marcucci, L., Nollett, K., Schiavilla, R., Wiringa, R.: Nucl. Phys. A 777, 111 (2006).
[4] Mohr, P., et al.: Phys. Rev. C 50, 1543 (1994).
[5] Robertson, R.G.H., et al.: Phys. Rev. Lett. 47, 1867 (1981).
[6] Kiener, J., et al.: Phys. Rev. C 44, 2195 (1991).
[7] Hammache, F., et al.: Phys. Rev. C 82, 065803 (2010), arXiv:1011.6179.
[8] Caciolli, A., et al.: Eur. Phys. J. A 39, 179–186.

Discussion

Maurice H.P.M. van Putten — Could the excess of 6Li in Pop III stars be due to inhomogeneities in primordial Li production caused by density fluctuations? That is, 6Li is a binary reaction product with a formation rate proportional to the density squared, and Pop III stars form selectively out of initially overdense regions.

Carlo Gustavino — In my opinion it is difficult to explain the purported 6Li excess in this way, because the baryonic density determines not only the 6Li abundance but also the amounts of all the other primordial isotopes. In particular, the deuterium abundance is very sensitive to the baryonic density, yet the observations are in good agreement with the calculations.

Acta Polytechnica 53(Supplement):732–735, 2013 · doi:10.14311/ap.2013.53.0732

Results from the LHCf experiment

Alessia Tricomi, on behalf of the LHCf collaboration (University of Catania and INFN Catania, Italy; corresponding author: alessia.tricomi@ct.infn.it)

Abstract. The LHCf experiment took data in the 2009 and 2010 p–p collisions at the LHC at √s = 0.9 TeV and √s = 7 TeV. This paper reports the most up-to-date results on the inclusive photon spectra and the π⁰ spectra measured by LHCf. These spectra are compared with the model expectations, and the impact on high energy cosmic ray (HECR) physics is discussed. In addition, we discuss the perspectives for future analyses, as well as the program for the next data-taking period, in particular the foreseen data taking in p–Pb collisions.

Keywords: LHC, hadron interaction, cosmic rays, Monte Carlo models.
1. Introduction

Dedicated extensive air shower experiments have been taking data for many years, and have contributed considerably to our understanding of high and ultra-high energy cosmic ray (UHECR) physics. Recently, in particular, the Pierre Auger collaboration [1] and the Telescope Array collaboration [2], thanks to the excellent performance of their hybrid detector arrays, have been providing new and exciting observations of UHECRs. Although these recent results have brought deeper insight into the properties of primary cosmic rays, systematic uncertainties remain, due to our poor knowledge of the nuclear interactions in the Earth's atmosphere [3]. A calibration of the energy scale in the range accessible at the LHC, 10¹⁵–10¹⁷ eV, provides crucial input for a better interpretation of primary cosmic ray properties in the region between the 'knee' and the GZK cut-off. The LHCf experiment is designed to measure the energy spectra and the transverse momentum of neutral particles in the very high pseudo-rapidity region (η > 8.4), thus providing precise data for testing and calibrating the hadronic interaction models used in Monte Carlo (MC) simulations of extensive air showers.

2. The LHCf detector

The LHCf experiment comprises two independent position-sensitive electromagnetic calorimeters, located on either side of the ATLAS experiment, 140 m away from the LHC IP1 interaction point, inside the zero-degree neutral absorber (Target Neutral Absorber, TAN). Charged particles from the IP are swept away by the inner beam separation dipole before reaching the TAN, so that only photons — mainly from π⁰ decays — and neutral hadrons reach the LHCf calorimeters. Each calorimeter (Arm1 and Arm2) has a double-tower structure, with the smaller tower located at zero-degree collision angle, approximately covering the region with pseudo-rapidity η > 10, and the larger tower approximately covering the region with 8.4 < η < 10. Four x–y layers of position-sensitive detectors (scintillating fibers in Arm1, silicon microstrip detectors in Arm2) provide measurements of the transverse profiles of the showers. The two-tower structure allows us to reconstruct π⁰s decaying into two γ's which hit the two towers separately, hence providing a very precise absolute energy calibration of the detectors. In the range E > 100 GeV, the LHCf detectors have energy and position resolutions for electromagnetic showers better than 5 % and 200 μm, respectively. A detailed description of the LHCf experiment can be found in Ref. [4].
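To get a feeling for the geometry, the quoted pseudo-rapidity coverage can be translated into transverse distances at the TAN with the standard relation η = −ln tan(θ/2). The following back-of-the-envelope check is our own addition; it ignores the beam crossing angle and the detailed tower positions, so the numbers are indicative only.

import numpy as np

def r_at_tan(eta, z=140.0):
    # transverse distance (m) from the zero-degree axis at which a particle
    # of pseudo-rapidity η crosses the plane z = 140 m from IP1
    theta = 2.0*np.arctan(np.exp(-eta))
    return z*np.tan(theta)

for eta in (8.4, 10.0, 10.94):
    print(eta, round(1e3*r_at_tan(eta), 1), 'mm')   # a few mm to ~60 mm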
3. The single photon energy spectra

The LHCf collaboration has measured the single photon energy spectrum at 7 TeV [5] and, more recently, in 900 GeV p–p collisions [6]. To minimize the backgrounds, and hence reduce the systematic uncertainties of the measurements, only a subset of the collected data, corresponding to clean and low-luminosity fills, has been analysed in both cases. For the 7 TeV analysis, the analysed data correspond to integrated luminosities of 0.68 nb⁻¹ and 0.52 nb⁻¹ for the Arm1 and Arm2 detectors, respectively, while for the 900 GeV analysis they correspond to an integrated luminosity of 0.30 nb⁻¹. The main steps in the analysis work-flow, which are almost identical for the two analyses, are summarised in the following.

The energy of the photons is reconstructed from the signal released by the shower particles in the scintillators, after applying corrections for the non-uniformity of the light collection and for particles leaking in and out of the edges of the calorimeter towers. In order to correct for these last two effects, which are rather important due to the limited transverse size of the two calorimeter towers, we use the transverse impact positions of the showers provided by the position-sensitive detectors. Events produced by neutral hadrons are rejected using information on the longitudinal development of the showers, which is different for electromagnetic and hadronic particles. In addition, for the 7 TeV analysis, thanks to the information provided by the position-sensitive detectors, events with more than one shower inside the same tower (multi-hit) are rejected, while for the 900 GeV analysis the number of multi-particle events is negligible, and multi-hit rejection is therefore not applied. In order to combine the spectra measured by Arm1 and Arm2, which have different geometrical configurations, only events detected in a common pseudo-rapidity and azimuthal range are selected in these analyses: η > 10.94 and 8.81 < η < 8.99 for the small and large towers, respectively, in the 7 TeV analysis, and η > 10.15 and 8.77 < η < 9.46 for the small and large towers, respectively, in the 900 GeV analysis.

Figure 1 shows the single γ spectra measured by LHCf in the two pseudo-rapidity regions for 7 TeV and 900 GeV p–p collisions, respectively, compared with the predictions of MC simulations using different models: DPMJET III-3.04 [7], QGSJET II-03 [8], SIBYLL 2.1 [9], EPOS 1.99 [10] and PYTHIA 8.145 [11]. Statistical errors and systematic uncertainties are also plotted. A careful study of the systematic uncertainties has been made, and conservative estimates have been taken into account; further details can be found in Refs. [5, 6]. As can be seen from Figure 1, there is a clear discrepancy between the experimental results and the predictions of the models, in particular in the high energy region.

Figure 1. Single photon energy spectra measured by LHCf (black dots) in the η > 10.94 (left) and 8.81 < η < 8.99 (right) bins for 7 TeV p–p collisions (top panels), and in the η > 10.15 (left) and 8.77 < η < 9.46 (right) bins for 900 GeV p–p collisions (bottom panels). The ratios of the MC predictions for DPMJET III 3.04 (red), QGSJET II-03 (blue), SIBYLL 2.1 (green), EPOS 1.99 (magenta) and PYTHIA 8.145 (yellow) to the experimental data are shown. The error bars and gray shaded areas in each plot indicate the statistical and the systematic errors, respectively. Figures from Refs. [5, 6].

4. Neutral pion transverse momentum spectra

In addition to the measurement of the single photon spectra, the LHCf experiment has recently finalised the measurement of the transverse momentum spectra, in several rapidity bins, of π⁰s produced in 7 TeV p–p collisions at the LHC. The integrated luminosities corresponding to the data used in this analysis are 2.53 nb⁻¹ (Arm1) and 1.90 nb⁻¹ (Arm2), after the data-taking live times are taken into account. The π⁰s are reconstructed in LHCf by identifying their decays into two photons, selecting events in which the two photons enter different calorimeter towers; due to the geometrical acceptance of the detector, only photons from π⁰ decays with an opening angle θ < 0.4 mrad can be detected. The energy, pT and rapidity of the π⁰ are reconstructed from the photon energies and the photon incident positions measured in each calorimeter. In order to ensure good event reconstruction efficiency and geometrical acceptance, the ranges of the π⁰ rapidity and transverse momentum are limited to 8.9 < y < 11.0 and pT < 0.6 GeV/c, respectively.
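The reconstruction of the π⁰ kinematics from the two photons is standard two-body kinematics; the sketch below is a generic illustration of it (massless photons pointing back to the IP, positions measured in a plane 140 m away), not the actual LHCf reconstruction code, and all names and example numbers are ours.

import numpy as np

def pi0_kinematics(E1, pos1, E2, pos2, z=140.0):
    # E1, E2 in GeV; pos1, pos2 = (x, y) impact points in metres
    def four_momentum(E, pos):
        v = np.array([pos[0], pos[1], z])
        n = v/np.linalg.norm(v)             # photon direction unit vector
        return E*np.array([1.0, *n])        # (E, px, py, pz), massless photon
    p = four_momentum(E1, pos1) + four_momentum(E2, pos2)
    m = np.sqrt(max(p[0]**2 - p[1]**2 - p[2]**2 - p[3]**2, 0.0))
    pT = np.hypot(p[1], p[2])
    y = 0.5*np.log((p[0] + p[3])/(p[0] - p[3]))
    return m, pT, y

# two ~TeV photons about two centimetres apart give m ≈ 0.135 GeV, an
# opening angle ≈ 0.15 mrad < 0.4 mrad and a rapidity within 8.9 < y < 11
print(pi0_kinematics(1000.0, (0.0, 0.005), 800.0, (0.0, -0.016)))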
in order to ensure good event reconstruction efficiency and geometrical acceptance, the range of the π0 rapidity and transverse momentum 733 alessia tricomi, on behalf of the lhcf collaboration acta polytechnica dpmjet 3.04 qgsjet ii-03 sibyll 2.1 epos 1.99 pythia 8.145 0π=7tev slhcf 8.9 < y < 9.0 -1 ldt=2.53+1.90nb∫ [gev/c] t p 0 0.1 0.2 0.3 0.4 0.5 0.6 m c /d a ta 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 0π=7tev slhcf 9.0 < y < 9.2 -1 ldt=2.53+1.90nb∫ [gev/c] t p 0 0.1 0.2 0.3 0.4 0.5 0.6 m c /d a ta 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 0π=7tev slhcf 9.2 < y < 9.4 -1 ldt=2.53+1.90nb∫ [gev/c] t p 0 0.1 0.2 0.3 0.4 0.5 0.6 m c /d a ta 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 0π=7tev slhcf 9.4 < y < 9.6 -1 ldt=2.53+1.90nb∫ [gev/c] t p 0 0.1 0.2 0.3 0.4 0.5 0.6 m c /d a ta 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 0π=7tev slhcf 9.6 < y < 10.0 -1 ldt=2.53+1.90nb∫ [gev/c] t p 0 0.1 0.2 0.3 0.4 0.5 0.6 m c /d a ta 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 0π=7tev slhcf 10.0 < y < 11.0 -1 ldt=2.53+1.90nb∫ [gev/c] t p 0 0.1 0.2 0.3 0.4 0.5 0.6 m c /d a ta 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 figure 2. ratio of the combined arm1 and arm2 pt spectra to the pt spectra predicted by various hadronic interaction models. shaded areas indicate the range of total uncertainties of the combined spectra. figure from ref. [12]. are limited to 8.9 < y < 11.0 and pt < 0.6 gev/c, respectively. figure 2 shows the ratios of the pt spectra predicted by dpmjet 3.04 (solid, red), qgsjet ii03 (dashed, blue), sibyll 2.1 (dotted, green), epos 1.99 (dashed dotted, magenta), and pythia 8.145 (default parameter set, dashed double-dotted, brown) to the combined arm1 and arm2 pt spectra (black dots). error bars have been taken from the statistical and systematic uncertainties. among hadronic interaction models tested in this analysis, epos 1.99 shows the best overall agreement with the lhcf data, although it behaves softer than the data in the low pt region, pt . 0.4 gev/c in 9.0 < y < 9.4 and pt . 0.3 gev/c in 9.4 < y < 9.6, and behaves harder in the large pt region. dpmjet 3.04 and pythia 8.145 show overall agreement with the lhcf data for 9.2 < y < 9.6 and pt < 0.25 gev/c, while the expected π0 productions rates by both models exceed the lhcf data for larger pt. also sibyll 2.1 predicts harder pion spectra than the lhcf data, although the expected π0 yield is generally small. finally, qgsjet ii-03 predicts π0 spectra that are softer than the lhcf data and the other models. 5. impact of lhcf results on hecr physics the first lhcf results have attracted attention in the hecr community. as reported in the previous paragraphs, none of the models agree with the data in the whole energy range. tuning of the models is therefore needed to describe the physics of hadronic interactions at the tev scale. in order to better understand the implications of this measurement for hecr physics, collaboration has begun with several mc developers and theoreticians. as an example, we have artificially modified the dpmjet iii 3.04 model to produce a π0 spectrum that differs from the original one by an amount approximately equal to the difference expected between the different models. figure 3 shows the π0 spectra at elab = 1017 ev predicted by the original dpmjet iii 3.04 model and the artificially modified spectra. a difference in the position of the shower maximum of the order of 30 g/cm2 is observed. 
Figure 4 shows, as an example, the most recent results from the Auger collaboration [13] for the distribution of the 〈Xmax〉 variable as a function of energy — the most commonly used method for inferring the composition of cosmic rays — compared with the model predictions for a proton-like (red lines) and an iron-like (blue lines) cosmic ray component, respectively. The difference in the 〈Xmax〉 distributions between the two cases is of the order of 100 g/cm², hence a 30 g/cm² shift is a sizable difference, which may be reflected significantly in the interpretation of HECR data. The importance of direct measurements of the γ and π⁰ spectra by LHCf is thus clear.

Figure 3. π⁰ spectra at E_lab = 10¹⁷ eV for the original DPMJET III-3.04 model (red) and the artificially modified models (green, blue, magenta); see text.

Figure 4. 〈Xmax〉 distribution as measured by Auger [13] (black points), compared with the model expectations for a light (red) or heavy (blue) cosmic ray composition. The yellow arrows correspond to the 30 g/cm² shift discussed above.

6. Future activities and summary

LHCf is planning to measure very forward particle emission in the LHC p–Pb collisions foreseen at the beginning of 2013. These measurements are expected to constrain the nuclear effects in forward particle emission relevant to the CR–air interaction, thus providing a further tool for model calibration [14]. New analyses of the data collected in the 2009–2010 runs are also in progress, in particular the measurement of the neutral hadron spectra. In the meantime, the LHCf collaboration is working on upgrading the detector to improve its radiation resistance in view of the 14 TeV p–p run, currently foreseen in 2014. The scintillating part of the detector will be replaced by GSO slabs, enabling LHCf to sustain the radiation level foreseen in the 14 TeV run [15]. Additional improvements to the front-end electronics of the silicon position-sensitive layers of the Arm2 detector, as well as an optimization of the layout to improve the stand-alone silicon energy resolution, are also ongoing.

References

[1] Abreu, P., et al. [The Pierre Auger collaboration]: arXiv:1107.4809 [astro-ph.HE].
[2] Tsunesada, Y. [for the Telescope Array collaboration]: arXiv:1111.2507 [astro-ph.HE].
[3] Chiavassa, A.: this workshop.
[4] Adriani, O., et al.: JINST 3 (2008) S08006.
[5] Adriani, O., et al.: Phys. Lett. B703 (2011) 128.
[6] Adriani, O., et al.: accepted in Phys. Lett. B, arXiv:1207.7183 [hep-ex].
[7] Bopp, F.W., et al.: Phys. Rev. C77 (2008) 014904.
[8] Ostapchenko, S.: Phys. Rev. D83 (2011) 014108.
[9] Ahn, E.-J., et al.: Phys. Rev. D80 (2009) 094003.
[10] Werner, K., et al.: Nucl. Phys. Proc. Suppl. 175–176 (2008) 81.
[11] Sjöstrand, T., et al.: Comput. Phys. Comm. 178 (2008) 852.
[12] Adriani, O., et al.: arXiv:1205.4578 [hep-ex].
[13] Bellido, J., for the Auger collaboration: in Proceedings of the 32nd ICRC, Beijing (2011).
[14] Adriani, O., et al.: CERN-LHCC-2011-015, LHCC-I-021 (2011).
[15] Kawade, K., et al.: JINST 6 (2011) T09004.

Discussion

Laurence Jones — Does LHCf also have inclusive neutron spectra?

Alessia Tricomi — Currently the measurement of the neutron spectra is our highest priority for future analysis. We have already started to work on it, and we plan to be able to publish our results at the beginning of next year.
It is a rather important measurement, both for its implications for HECR physics and for probing QCD in the high-energy forward region. Although our detector is mainly designed to detect e.m. showers — the limited interaction length (≈ 1.6 λ_I) results in an energy resolution of ∼ 30 % for hadronic showers, compared with ≲ 3 % for e.m. showers — the simulation studies performed so far show that LHCf is quite well able to disentangle different hadronic models also in the neutron spectra, owing to the larger discrepancies present among the Monte Carlo models.

Aurelio Grillo — In which direction does 〈Xmax〉 move?

Alessia Tricomi — In the toy model that we used, we artificially modified DPMJET III 3.04 by softening the spectrum by an amount approximately equal to the difference expected between the different models. This modification of the spectrum is reflected in a shift of the position of the shower maximum of the order of 30 g/cm² below the original model, moving from 〈Xmax〉 = 718 g/cm² in the original DPMJET III 3.04 to 〈Xmax〉 = 689 g/cm² in the modified one. Let me point out, however, that this was just an artificial modification, not a full recalibration of the model.

Acta Polytechnica 53(Supplement):497–499, 2013 · doi:10.14311/ap.2013.53.0497

The first century of cosmic rays, an historical overview

Lawrence W. Jones (University of Michigan; corresponding author: lwjones@umich.edu)

Abstract. The 1912 balloon flights of Victor Hess and related activities in those years are reviewed. Subsequent research during the early 20th century is noted, including the discovery of the positron, mesons, and air showers. The cosmic ray–accelerator interrelations are noted, including cosmic ray studies at Echo Lake and Mt. Evans, Colorado (USA). The more recent evolution of cosmic ray research programs towards astrophysical and cosmological studies, and major programs such as Auger and AMS, conclude this discussion of the century of cosmic ray research.

1. The Hess discovery

In the early 20th century, radiation had been discovered, radium (and other radioactive elements) had been identified, and the ionization of air (and other gases) by radiation, detected by the electroscope, had been studied. The fact that some level of ionization of the air was observed everywhere was interpreted as being due to residual traces of radioactive elements in the Earth's crust (soil, rocks, etc.). In 1912, Victor Hess, an Austrian physicist, took an electroscope in a hydrogen-filled balloon up to an altitude of almost 6000 meters (over Germany), and found that the atmospheric ionization increased with altitude, to about a factor of three above that at ground level, leading him to conclude that there was a source of ionization incident on the Earth's atmosphere from above. From balloon flights during a solar eclipse, he also deduced that the source of this radiation was not the Sun. Hess's observations were confirmed by Werner Kohlhörster in other balloon flights in 1913 and 1914, up to about 8000 m, showing an increase (from ground level) of about a factor of 8.
Robert Millikan, a very well-known and respected physicist at that time, at first did not believe the Hess and Kohlhörster results, but later made measurements himself which convinced him of their validity, and he coined the term 'cosmic rays'.

2. Early discoveries

In 1927, Jacob Clay, using an ionization chamber, sailed between Java and the Netherlands and observed a significant latitude effect (due to variations in the Earth's magnetic field). In that year Dmitri Skobeltsyn first photographed cosmic ray tracks in a cloud chamber. Kohlhörster and Walther Bothe (1929) and Bruno Rossi (1930), using Geiger–Müller counters and coincidence circuits, found that a fraction of cosmic rays traversed as much as 25 cm of lead. In 1933, Rossi, Arthur Compton and Luis Alvarez observed an east–west asymmetry of primary cosmic rays, demonstrating that the primaries were positive particles (e.g. protons); it may be noted that this was Alvarez's Ph.D. thesis topic. Marcel Schein, from balloon flights in 1940, showed that the primaries were mostly protons. In 1933, Rossi and (later) Pierre Auger observed the coincidence of cosmic ray particle signals between horizontally separated counters — hence the discovery of air showers. Balloon experiments, up to altitudes of 30 km, verified that primary cosmic rays included He plus a small fraction of heavier nuclei, in addition to protons (the dominant component).

Before the evolution of particle accelerators, fundamental particle physics discoveries were made in cosmic ray studies. In 1932 Carl Anderson discovered the positron in a cloud chamber photograph of cosmic rays. Later (1936–1937), a group consisting of Anderson, Neddermeyer, Street and Stevenson discovered the μ-meson (now known as the 'muon', but then called the 'mesotron'), a particle with a mass between that of a proton (or neutron) and an electron (or positron). They first identified it as the Yukawa particle; the Japanese physicist H. Yukawa had postulated that the strong interaction was mediated by a quantum particle with a rest mass lighter than that of the proton (analogous to the role of the photon in electromagnetism). Ten years later, the pion (π-meson) was discovered by Lattes, Occhialini and Powell, and found to be a strongly interacting meson which decays into the muon with a very short half-life. In those years (1947–1953) the K-meson (or kaon) was discovered by Powell, Butler and Rochester. And in the early '50s (1951–1953) the lightest hyperons — the Λ, Σ and Ξ particles — were discovered in cloud chamber studies at the French Pic du Midi research station (at an elevation of about 2850 m in the Pyrenees). An excellent summary of these early milestones is contained in a Physics Today article [1].

3. The accelerator–cosmic ray interaction

The first accelerator to achieve an energy of over 1 GeV was named the 'Cosmotron' — the 3 GeV proton synchrotron at the Brookhaven National Laboratory (near New York). The name recognized the high-energy discoveries in elementary particle physics made in cosmic ray studies, and the probability that this higher energy accelerator would continue on that path — which indeed it did. The 'Bevatron' at the Lawrence Berkeley National Laboratory (California) accelerated protons to an energy of 6 GeV (6 billion electron volts) in 1955; soon after, the anti-proton was first discovered there.
With the invention of strong (alternating-gradient) focusing, the Brookhaven and CERN (Geneva, Switzerland) laboratories both built proton synchrotrons of about 30 GeV, completed around 1960. Also, in the mid-sixties, the 10 GeV synchro-phasotron at the Dubna laboratory (north of Moscow) and the 12 GeV ZGS (Zero Gradient Synchrotron) at the Argonne laboratory (near Chicago) were completed. Of course, electron accelerators were also completed and operating then. Hence, most studies of elementary particle physics moved away from cosmic rays and over to accelerators during these years (the 1950s and 1960s). In the early 1960s, with the success of the Brookhaven AGS and the CERN PS, there were extensive discussions among the active physicists about the construction of the next generation of accelerator facilities and laboratories — accelerators with an energy above 100 GeV. There were arguments and confusion in both Europe and America concerning where and by whom such facilities would be built, how they would be financed, and how they would be managed. Such a machine would be too large to fit on the existing sites of CERN or Brookhaven, for example, and there were intense discussions over the organizational structure required to build and manage such a facility.

4. The Echo Lake and Mount Evans research program

During this period of uncertainty and frustration in the early 1960s, at an international high-energy physics meeting in Dubna, a group of us were discussing this situation, and Giuseppe Cocconi noted that the flux of cosmic ray protons above 100 GeV, at mountain elevations, would be sufficient for serious studies of nuclear interactions at these energies. This stimulated a group of us — myself and other midwestern particle physics colleagues — to consider an experimental facility in the Colorado mountains. We proceeded to get a National Science Foundation grant and, in 1965, equipped a semi-trailer with a large spark chamber, proportional chambers and a hadron calorimeter, and took it to the summit of Mt. Evans, Colorado (elevation 4300 m). This site, and another lower-elevation site near Echo Lake (elevation 3260 m), were managed by Denver University, and had earlier hosted many outstanding cosmic ray physicists, including Bruno Rossi, Giuseppe Cocconi, John Wheeler, Marcel Schein, Ken Greisen, Wayne Hazen, Arthur Compton and others.

The following summer, we built a larger detector in a wooden building, leaving the adjacent semi-trailer available for the electronics and the operators. The detector was designed to search for free quarks — possible cosmic ray particles with a charge (hence ionization) 1/3 or 2/3 that of known particles, e.g. of relativistic cosmic ray muons. The detector included two 3-layer multi-wire gas proportional counters, each of about 2 square meters in area; below these were a 2-section wide-gap spark chamber, and beneath that a 7-layer iron and scintillation counter hadron calorimeter. Of course, the quark search was primarily carried out with the 6 independent ionization measurements, while the other detectors were early feasibility models of components we would possibly use in a much larger detector. This detector was initially located on the Mt. Evans summit; however, the road to the summit was only open during the three summer months, so, to maintain operations year-round, we moved the building and semi-trailer to the Echo Lake site, which included lodgings and was accessible year-round.
Indeed, about that time, C.B.A. McCusker in Australia reported a positive result in a quark search: a cloud chamber track with 1/3 the ionization of a relativistic muon. However, we were able to continue our quark search for over a year, found no quark candidates, and hence published a convincingly negative result [2]. Subsequent searches — with cosmic rays, at particle accelerators, and in stable matter — have all confirmed the absence of free quarks.

At the 1967 International Cosmic Ray Conference in Alberta, Canada, Grigorov and his Russian colleagues reported results from the 'Proton' Russian satellites; in particular, the p–p inelastic cross section (deduced from interactions in a graphite and a polyethylene target) was reported to be about 22 % greater at about 500 GeV than at 20 GeV [3]. This stimulated our group to pursue the direct study of p–p interactions at Echo Lake, where we had begun to build a larger detector. Bruce Cork (from Berkeley) arranged for a 2000 liter liquid hydrogen target to be built for incorporation into our detector, which was housed in a new, larger wooden building. The detector complex was about 4.5 m tall and consisted of a top scintillation counter, a 2 × 2 meter spark chamber consisting of two 20 cm gaps, the liquid hydrogen target, and another two-gap 2 × 2 meter spark chamber. Below this was a set of three multi-gap thin-plate spark chambers (10 gaps in total) separated by iron plates and scintillators (for observing e.m. showers and nuclear interactions), and below this a large total-absorption calorimeter, with 10 layers of counters between iron plates, which together totaled 1130 g/cm². An array of scintillation counters around the mid-plane of the hydrogen target served as a veto to restrict the triggered events to unaccompanied incident cosmic ray particles (virtually all protons). The resulting data — 90° stereo photographs and digital records of the events — were analyzed at our home institutions, just as spark chamber and bubble chamber data from accelerators were being analyzed.

From our data, we confirmed that the p–p inelastic cross section was essentially constant between 20 GeV (as measured at the Brookhaven and CERN accelerators) and ∼500 GeV, proving the Russians wrong [4]. We continued with other data collection: secondary particle multiplicities and angular distributions, p–nucleus cross sections with targets of carbon, iron and lead, and other studies. Gaurang Yodh later found that the Russian result was caused by back-scattering of reaction products in the calorimeter below their target. Of course, we now know from the TOTEM LHC data that the p–p cross section rises to about 100 mb at 7 TeV c.m. (about 25 PeV equivalent cosmic ray energy), although our Echo Lake measurements (below a TeV) were confirmed by accelerator data.

Of course, in the late 1960s the Fermi National Accelerator Laboratory, managed by the newly formed Universities Research Association (representing major universities from all across the U.S.), was established near Chicago, where it began construction of the 300 GeV synchrotron. And in Europe, CERN expanded its site and undertook to build the SPS (Super Proton Synchrotron). In 1972 the Fermilab synchrotron commenced operation, and we closed the cosmic ray program and moved our activities to accelerators.
5. Recent cosmic ray research

For the past 40 years, the major emphasis of cosmic ray research has been directed towards questions in astrophysics and cosmology; for example, studies of the primary spectrum and composition with balloon- and satellite-borne detectors at energies up to the TeV scale, surface air shower arrays — especially Auger and the Telescope Array — and studies of X-rays, gamma rays, gamma-ray bursts, etc. Neutrino astronomy is a current lively topic, with detector arrays on the kilometer scale and the strange taxonomy of neutrinos: three 'flavors' (corresponding to the electron, muon and tau) and three different mass eigenstates. There are extensive presentations of these topics at this workshop, so they will not be discussed here. Many other recent and current cosmic ray research activities are also addressed here, including the most recent results from PAMELA and other satellite detectors. We look forward to future reports of results from the AMS-02 magnetic spectrometer aboard the International Space Station.

An area of particle physics/cosmic ray research which is still active is the use of emulsion chambers. These are stacks of nuclear emulsions and/or X-ray film, separated by sheets of metal (or sometimes graphite), in which the tracks of particles from cosmic ray interactions may be observed and studied. Most of this activity is currently carried out by Russian, Japanese and Brazilian physicists, with emulsion chamber arrays on mountains in Kazakhstan, Bolivia and Tibet. A recent interesting workshop, where the latest results were reported, was held in 2010 at Płock, Poland. Unusual particle physics phenomena discussed there included the 'Centauro' phenomenon (high energy interactions with a dearth of neutral pions, hence gammas, among the reaction products), the azimuthal anisotropy of the final states of very high energy interactions, and the 'long-flying component' — an apparently strongly interacting reaction product which travels well beyond a conventional interaction length before interacting [5]. Although I personally do not believe that any of these three phenomena represent new physics, they merit discussion and study, and I certainly support the research activities and goals of these groups.

This has been a very abbreviated history, with an emphasis on the earlier events and discoveries, in view of the more recent research discussed at this workshop. However, it was indeed interesting reviewing those earlier years. We look forward to future discoveries and to the solutions of our many remaining cosmic ray problems.

References

[1] Carlson, P.: 2012, Physics Today 65, no. 2, 30.
[2] Jones, L.W.: 1973, Bulletin of the American Physical Society II, 18, 33.
[3] Grigorov, N.L., et al.: 1967, Proceedings of the Tenth International Cosmic Ray Conference, Part A, 512.
[4] Jones, L.W., et al.: 1972, Nuclear Physics B43, 477.
[5] Kempa, J., et al.: Emulsion chamber observations of Centauros, aligned events and the long-flying component, Central European Journal of Physics (to be published).

Discussion

Francesco Ronga — It is interesting to remember the underwater measurements made by D. Pacini in 1907–1911; see arXiv:1103.4392 (A. De Angelis).

Lawrence Jones — This was indeed a relevant set of measurements, which also hinted at an extra-terrestrial source of cosmic rays.
acta polytechnica 53(supplement):497–499, 2013 acta polytechnica doi:10.14311/ap.2015.55.0029 acta polytechnica 55(1):29–33, 2015 © czech technical university in prague, 2015 available online at http://ojs.cvut.cz/ojs/index.php/ap ising model simulations as a testbed for nucleation theory jan kulveit a,b,∗, petra tichá a,c, pavel demo a,c; a institute of physics, academy of sciences of the czech republic, cukrovarnická 10, 162 53 praha 6, czech republic; b charles university in prague, faculty of mathematics and physics, ke karlovu 3, 121 16 praha 2, czech republic; c czech technical university in prague, faculty of civil engineering, thákurova 7, 166 29 praha 6, czech republic; ∗ corresponding author: jk@ks.cz abstract. in this short review article, we discuss the use of the ising lattice model as a testbed for improving the theory of both homogeneous and heterogeneous nucleation. first, we briefly review classical nucleation theory (cnt) and two typical simple systems on which simulations are performed – hard spheres and the ising lattice model. then we review some results obtained by this approach, and point to possible new directions for research and improvement. keywords: nucleation, ising model, classical nucleation theory. 1. nucleation as a process naturally occurring on the scale of individual atoms to thousands of atoms, nucleation has always been a window to the nanoscale world. it was studied well before the current advances in nanoscience and nanotechnology, in various phase transitions including condensation, cavitation, solidification, crystallization and precipitation. it has also been studied in various fields of physics and technology, ranging from atmospheric physics concerned with the condensation of water vapor, to studies of radiation damage in materials important for reactor technology applications [1], and even in fields seemingly detached from the nanoscale, such as the technology of building materials [2]. in most of these situations, some general properties are the same. discontinuous phase transitions usually proceed in three steps. first, some small clusters of the new phase – "embryos" – appear due to stochastic fluctuations. if they reach a certain critical size, the embryos become stable "nuclei" capable of further growth. this stage of the transition is called nucleation. in the second stage, the particles grow. finally, in closed systems, the growth is limited by the supply of the untransformed, remaining phase. the formation of nuclei is associated with an energy barrier, which limits the process and allows metastable phases to persist over long periods of time. the barrier may be lowered if the cluster forms on the site of an existing impurity or on the boundary of some other material, leading to heterogeneous nucleation. the barrier may also be lowered if the nucleus is of some intermediate phase, different in structure or composition from the stable phase [3, p. 93]. nucleation is naturally also important in nanotechnology and in the science of nanostructures. one entire approach to the formation of nanostructures – self-assembly (the "bottom-up" approach) – can often be considered as a case of controlled nucleation, and crucial nanotechnologies such as thin layer construction are in a sense a case of heterogeneous nucleation.
in technology, the aim is usually to control the distribution of the sizes, the placement or the shape of the nanoparticles, for example by changing the external parameters of the system, by using surfactants, or by patterning the substrate. nucleation was first described in classical nucleation theory, dating back to volmer, weber and farkas [4, 5] and becker and döring [6]. despite its age, the theory is still widely used to describe the nucleation stage of phase transitions in many contexts. 2. classical nucleation theory 2.1. nucleation barrier in the simplest case, we start with a single-component system, such as a liquid phase condensing from a gas, or precipitation from a liquid solution. in classical nucleation theory, the capillarity approximation is frequently used: the values of the parameters used in the model are taken to be the same as in macroscopic objects. initially, the system is in some α-phase, which is metastable with regard to the β-phase. in order to change to the β-phase, first some small cluster of β-phase must be formed. the energy balance for the formation of a small cluster consisting of n particles (atoms, ions, etc.) is thermodynamically given as
$$\Delta G_n = n(\mu_\beta - \mu_\alpha) + \Delta G_{\mathrm{interface}}, \quad (1)$$
where $\mu_\beta$ (resp. $\mu_\alpha$) are the chemical potentials in the β-phase (resp. α-phase) and $\Delta G_{\mathrm{interface}}$ is the energy of the newly formed interface. (figure 1: free energy $\Delta G(n)$ as a function of cluster size n in the nucleation regime.) the first term is always negative and represents the driving force of the process. the surface term is positive and competes with the first (volume) term. the nature of the dependence becomes clear if we rewrite (1) as
$$\Delta G_n = n(\mu_\beta - \mu_\alpha) + \eta\, n^{2/3}\, \sigma, \quad (2)$$
where η is a shape factor (the surface area divided by $n^{2/3}$, constant for a given shape), and σ denotes the interfacial energy per unit area (see fig. 1). for small radii, the surface-to-volume ratio is large, and the surface term dominates, forming a barrier for nucleation of height $\Delta G_c$. a cluster of the corresponding size is known as the critical nucleus $n_c$, with critical radius $r_c$, etc. while clusters smaller than $n_c$ tend to go down the energy slope and shrink, clusters larger than $n_c$ grow further and form stable particles of the new phase. in our simple case, the classical theory with isotropic interfacial energy leads to a cluster which takes a spherical form, and the critical parameters may easily be found:
$$n_c = \left(\frac{2\sigma\eta}{3\Delta\mu}\right)^{3} \quad (3)$$
and
$$\Delta G_c = \frac{4(\sigma\eta)^3}{27\,\Delta\mu^2}. \quad (4)$$
the picture described above can be adapted to more complicated systems and scenarios by accounting for other contributions to the nucleus energy. the contributions of strain (studied early by nabarro [7]), incoherency of the interfaces, anisotropy of the surface energy and the effects of vacancies are often important. 2.2. nucleation rate classical nucleation theory then proceeds to determine the nucleation rate, defined as the rate at which stable nuclei are formed within a unit volume in unit time. this is also often the quantity connecting theory with experiment. here, we use the cluster dynamics approach, which allows us to derive both the classical theory and its flavors and some recent models, and it naturally shows the links between them.
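before moving on, a minimal numerical illustration of the barrier expressions (2)–(4) above. the driving force ∆µ, the interfacial energy σ and the molecular volume used for the shape factor are all assumed values for illustration only, not taken from the paper:

```python
import numpy as np

k_B = 1.380649e-23  # J/K

def barrier(n, dmu, sigma, eta):
    """free energy of an n-particle cluster, eq. (2), with dmu = mu_alpha - mu_beta > 0."""
    return -n * dmu + eta * n ** (2.0 / 3.0) * sigma

def critical(dmu, sigma, eta):
    """critical size and barrier height from eqs. (3) and (4)."""
    n_c = (2.0 * sigma * eta / (3.0 * dmu)) ** 3
    dg_c = 4.0 * (sigma * eta) ** 3 / (27.0 * dmu ** 2)
    return n_c, dg_c

T = 300.0
dmu = 2.0 * k_B * T        # assumed driving force per particle [J]
sigma = 0.05               # assumed interfacial energy [J/m^2]
eta = (36.0 * np.pi) ** (1 / 3) * (3.0e-29) ** (2 / 3)  # sphere, assumed molecular volume

n_c, dg_c = critical(dmu, sigma, eta)
print(f"n_c ~ {n_c:.1f} particles, barrier ~ {dg_c / (k_B * T):.1f} kT")
```

with these toy numbers the critical nucleus contains only a handful of particles and the barrier is a few kT; clusters that fluctuate past n_c slide down the energy slope and grow.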
in the non-nucleation regime, the new phase β is not stable, $\Delta G_n$ is always positive, no stable nuclei form, and the nucleation rate is zero. the equilibrium distribution of clusters, minimizing the free energy of the system, is
$$x_n = \exp\left(-\frac{\Delta G_n}{kT}\right), \quad (5)$$
where $x_n$ is the fraction of clusters of size n among all clusters and $\Delta G_n$ is the free energy of clusters of size n. in the nucleation regime, the system is out of equilibrium and clusters larger than $n_c$ grow to stable sizes. the growth can be described as a flux of clusters in size-space. if the coalescence rate is small (which is most often true at least in the early stage of nucleation), it can be assumed that the growth is governed by single-particle processes, the addition or loss of one particle (a so-called step-by-step process). using more simplifying assumptions, such as a steady supply of monomers and removal of large clusters, the classical theory then derives a cluster flux in a "steady state", where
$$J(n)(t) = J = \left(\frac{\Delta G_c}{3\pi n_c^2\, kT}\right)^{1/2} \beta_c\, f\, e^{(-\Delta G_c + \Delta G_1)/kT}. \quad (6)$$
the first, dimensionless term is called the zeldovich factor, and its magnitude is typically $10^{-1}$ [8, p. 466]; the frequency term $\beta_c$ expresses the number of monomers within jump distance from the embryo multiplied by the jump frequency, and $\Delta G_c$ is the free energy of critical clusters. 3. improvements to nucleation theory in the development of nucleation theory, there is obviously a huge space for extensions and improvements of the classical theory. the first big class results from lifting some of the simplifying assumptions: e.g., not assuming the steady state, we can examine the time lag to nucleation [9], or we can study the theory of nucleation in closed systems [10]. another big class consists of attempts to improve the core of the theory, for example by including some seemingly neglected entropy contributions, or by imposing some formal requirements on consistency (e.g., note that in the above derivation the "formation energy" of a size-1 "cluster" is nonzero, which seems unnatural). until recently, such modifications were typically very hard to test. individual nuclei are usually too small to be directly observed, particularly "in vivo", while the nucleation process is happening. the quantity accessible to experiment is often only the total nucleation rate, partially obscured by subsequent growth processes. due to the exponential dependence of the nucleation rate on parameters including temperature and the energy barrier, it is often very hard to distinguish experimentally whether some proposal is really an improvement to the theory, or if it just happens to push the predicted nucleation rate in the "correct" direction, compensating for the often large errors of experimental data or parameter control. the difficulty with experimental tests is also related to the fact that predictions for a relatively small space of experimental data (e.g., the dependence of the nucleation rate on a single parameter such as temperature) are based on a much bigger space of model parameters and assumptions (e.g., chemical potentials, surface energies taken from macroscopic systems, assumptions such as the insignificance of the time scale on which the system is tempered in relation to the nucleation time scale, etc.). this is a situation where computer simulations can be of enormous use, allowing precise control over the big parameter space, and allowing individual aspects of the theory to be tested.
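to see how sharply the predicted rate of eq. (6) depends on the barrier, which is the testability problem just described, one can evaluate it directly. all numbers below are assumed, and f is interpreted here as a monomer number density, which the text does not spell out:

```python
import numpy as np

k_B = 1.380649e-23
T = 300.0
kT = k_B * T

dg_c, dg_1 = 6.6 * kT, 0.5 * kT  # barrier and size-1 "formation energy" (assumed)
n_c = 6.6                        # critical size (assumed)
beta_c = 1.0e9                   # attachment frequency [1/s] (assumed)
f = 1.0e21                       # monomer number density [1/m^3] (assumed)

zeldovich = np.sqrt(dg_c / (3.0 * np.pi * n_c ** 2 * kT))  # the ~0.1 prefactor of eq. (6)
J = zeldovich * beta_c * f * np.exp(-(dg_c - dg_1) / kT)
print(f"zeldovich factor ~ {zeldovich:.2f}, J ~ {J:.2e} nuclei m^-3 s^-1")
```

the exponential dominates: shifting ∆G_c by only a few kT moves J by orders of magnitude, which is why a proposed correction can look like an improvement while merely pushing the rate in the "correct" direction.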
4. modern statistical sampling methods the difficulty with computer simulations of nucleation lies in the rarity of nucleation events. for example, in the case, described below, of nucleation in the lattice ising model, the typical time until one nucleation event occurs is $10^5$ simulation steps. a straightforward simulation may wander endlessly in the initial phase; then the nucleation event proceeds very fast, in a few steps, and then the system remains in the final phase. to obtain a meaningful statistical sample, or any sample at all, it is therefore necessary to employ algorithms which enhance the probability of rare events and lead to a detailed exploration of the phase space close to the transition point. a detailed description or a comparison of these methods is beyond the scope of this paper — for a comprehensive review including practical comparisons, see van erp [11]. from the several typically used methods to study nucleation, an example is forward flux sampling (ffs) [12, 13]. 5. testbed systems and results even in a simulation, when modeling real systems using molecular dynamics, nucleation theory gets tested along with various other simulation properties (e.g., a description of interatomic forces). for a systematic improvement of nucleation theory itself, the ideal case is a system with as few arbitrary parameters as possible, in both the system and the simulation. two such model systems are particularly important. the system of hard spheres, often used as a reference model of a liquid, is also used for studies of nucleation. the second system is the lattice ising model, one of the simplest statistical systems exhibiting phase transitions. 5.1. hard spheres in 2001, auer and frenkel [14] used a model of hard spheres to predict absolute crystal nucleation rates without any adjustable parameters and without most of the assumptions of cnt. in their comparison of the results with cnt, their conclusion was that the cnt predictions for the height of the nucleation barrier ∆g are not accurate (30–50 % too low), but the data from the simulation can be fitted to the functional form given by cnt (6), except for very small clusters. auer and frenkel also studied the nucleation pathway — the sequence of structures of small clusters. this topic was later also studied by o'malley and snook [15] and others. prestipino et al. [16] used the hard sphere model to systematically test the assumptions of cnt, giving particular attention to the definition of clusters and related problems with cluster shape and interfacial energy, leading to corrections to the first part of the theory (determining the nucleation barrier, the capillarity approximation). heterogeneous nucleation of hard spheres on walls was examined by auer and frenkel [17]. a drastic lowering of the nucleation barrier was observed, as would be expected from classical heterogeneous nucleation theory. an interesting observation was that the nucleation barrier was dominated by line tension. xu et al. [18] also studied heterogeneous nucleation of hard spheres on patterned substrates (consisting of patterns of the same spheres in fixed positions). they noted that the time required for crystallization can be greatly reduced on a suitable substrate, and that the crystallizing phase can to a large extent be influenced by the substrate.
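to make the rarity problem of section 4 concrete, the sketch below implements plain metropolis dynamics for the 2d lattice ising model in a field, with the largest geometric cluster of flipped spins as the order parameter — the quantity on which ffs-style methods place their interfaces. all parameters are illustrative; this is the brute-force baseline that the rare-event methods are designed to replace:

```python
import numpy as np
from collections import deque

rng = np.random.default_rng(0)
L, beta, h = 32, 1.0 / 1.5, 0.05      # lattice size, inverse temperature, field (assumed)
spins = -np.ones((L, L), dtype=int)   # start in the metastable all-down phase

def sweep():
    """one metropolis sweep: L*L single-spin-flip attempts."""
    for _ in range(L * L):
        i, j = rng.integers(L, size=2)
        nb = spins[(i + 1) % L, j] + spins[(i - 1) % L, j] \
           + spins[i, (j + 1) % L] + spins[i, (j - 1) % L]
        dE = 2.0 * spins[i, j] * (nb + h)
        if dE <= 0.0 or rng.random() < np.exp(-beta * dE):
            spins[i, j] *= -1

def largest_up_cluster():
    """size of the largest nearest-neighbour cluster of up spins (geometric definition)."""
    seen = np.zeros((L, L), dtype=bool)
    best = 0
    for i in range(L):
        for j in range(L):
            if spins[i, j] == 1 and not seen[i, j]:
                seen[i, j] = True
                queue, size = deque([(i, j)]), 0
                while queue:
                    a, b = queue.popleft()
                    size += 1
                    for c, d in ((a + 1, b), (a - 1, b), (a, b + 1), (a, b - 1)):
                        c, d = c % L, d % L
                        if spins[c, d] == 1 and not seen[c, d]:
                            seen[c, d] = True
                            queue.append((c, d))
                best = max(best, size)
    return best

for t in range(201):
    sweep()
    if t % 50 == 0:
        print(f"sweep {t:4d}: largest cluster = {largest_up_cluster()}")
```

at parameters like these the system typically lingers with only small clusters for a long time before a rare fluctuation crosses the critical size — exactly the waiting-time problem quantified above.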
even if in some of the studies no explicit comparison with classical theory was made, or the results are mostly qualitative, there seemed to be at least qualitative agreement — and, not surprisingly, a problem of cnt with correct interface energies. sandomirski et al. [19] used hard and soft sphere models to study heterogeneous crystallization on flat and curved interfaces. 5.2. ising model detailed comparisons of classical nucleation theory with simulations of nucleation in the 2d and 3d lattice ising model were made by ryu and cai [20, 21]. a particularly interesting aspect of this study was the independent testing of the "nucleation barrier" part and the "nucleation rate" part of cnt. the two parts in fact rest on different sets of assumptions, and their validity is relatively independent. in the case of the 2d lattice ising model, ryu and cai demonstrated good agreement of cnt with simulation with no adjustable parameters. the cnt model in this case included two important improvements to the classical theory: the langer field theory correction [22] to the nucleus energy, and a corrected temperature-dependent interfacial energy, taking into account the anisotropy of the surface energy in the ising model and changes in the shape of the equilibrium nucleus with temperature. the results of these studies establish the ising model as an extremely useful reference point for testing various fundamental improvements to nucleation theory, and also for testing changes and additions to cnt that are necessary in different scenarios. brendel et al. [23] studied the nucleation times in the two-dimensional ising model, using cluster energies and transition rates directly obtained from simulation. with the input of these parameters, the nucleation times predicted by cnt were in reasonable agreement with the simulation. page and sear [24] studied the influence of pores and surface patterning on the heterogeneous nucleation rate and energy barriers, finding a significant change in the nucleation rate caused by the presence of the pores, and satisfactory agreement with cnt if different nucleation rates are assigned to nucleation in and out of pores. building on this work, hedges and whitelam [25] asked how to pattern the surface in order to maximally speed up nucleation, and as the answer they studied nucleation in the presence of pores with various dimensions. an interesting and potentially practically useful result is that the maximum nucleation rate is achieved if one dimension of the pore has the critical length. kuipers and barkema [26] focused on memory effects (non-markovian dynamics) in the ising model with local spin-exchange dynamics, which introduces diffusion-like properties to the model. in such circumstances, events of particle attachment and detachment from the cluster are often strongly correlated, and the moves in cluster-size space are no longer markovian. accounting for this by introducing new events, such as "particle leaving to infinity" and "particle leaving to return", kuipers and barkema demonstrated the influence of memory effects on the dynamics. effectively, the outcome was increased fluctuations around the critical size, leading to a smaller time spent on the "energy plateau" of the nucleation barrier, and hence an increased nucleation rate. an analytical description of the situation remains an open topic. allen et al. [27] focused on another important scenario, i.e., nucleation in the presence of shear, in the 2d ising system. they observed a peak in the nucleation rate at intermediate shear rates, and suppression of nucleation at high shear rates. it seems to remain an open question whether concepts from cnt, especially cluster size as a reaction coordinate, are a suitable simplification.
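most of the comparisons above reduce the simulation data to the largest-cluster coordinate; a common recipe is to histogram that coordinate in the metastable phase, read off the barrier as β∆G(n) = −ln P(n) + const, and fit the 2d cnt form a√n − b n (perimeter plus bulk term). the sketch below runs this recipe on synthetic data; the coefficients and counts are invented for illustration and are not taken from any of the cited studies:

```python
import numpy as np

rng = np.random.default_rng(1)

# synthetic stand-in for a sampled largest-cluster-size histogram
a_true, b_true = 2.0, 0.4
n = np.arange(1, 21)
p = np.exp(-(a_true * np.sqrt(n) - b_true * n))
counts = rng.poisson(1.0e5 * p / p.sum())

mask = counts > 0
beta_dg = -np.log(counts[mask] / counts[mask].sum())

# least-squares fit of beta*dG(n) = a*sqrt(n) - b*n + c
A = np.column_stack([np.sqrt(n[mask]), -n[mask], np.ones(mask.sum())])
(a_fit, b_fit, c_fit), *_ = np.linalg.lstsq(A, beta_dg, rcond=None)
n_c = (a_fit / (2.0 * b_fit)) ** 2     # stationary point of the fitted barrier
print(f"fitted a = {a_fit:.2f}, b = {b_fit:.2f}, critical size n_c ~ {n_c:.1f}")
```

in the real studies the sampling is biased (umbrella- or ffs-style) so that sizes near the top of the barrier are also well represented; the quality of the whole construction, of course, rests on the cluster definition, which is exactly the point raised next.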
recently, schmitz et al. [28] carefully examined the definitions of the clusters used in most of the studies, showing that the most commonly used "geometrical" definition is unsuitable for defining clusters at higher temperatures. on the other hand, at low temperatures the cluster energy and the shape are grossly anisotropic. a big part of the previous results needs to be reconsidered in light of more physical definitions of clusters. 6. conclusions and prospects for future research the results described above clearly show that the agreement of cnt with simulation in simple cases is a great starting point for understanding nucleation in more complex scenarios. from a comparison with earlier studies of cnt and hard spheres, we can propose several directions in which a comparison of the theory with numerical simulations in the ising model can be made, and the required additions to the theory can be tested. one big, relatively sparsely explored topic is the field of non-stationary systems and conditions. not only the external driving force but also the temperature may be varied, as is often the case in experimental scenarios. in the case of surface nucleation, the surface energy may also be non-stationary. other interesting cases may be generated by lifting the condition of spatial homogeneity. for example, we can introduce a temperature gradient, or we can form a more complicated and more realistic surface, which may exhibit heterogeneous surface energy, roughness, or curvature. another promising direction is to reconsider discrepancies between cnt, simulations and other models in light of more correct definitions of clusters [28], possibly leading to a model that is consistent across a broad range of temperatures and cluster sizes. acknowledgements this work has been supported by ctu in prague project sgs14/111/ohk1/2t/11 and by gacr project p108/12/0891, and was carried out within the framework of the joint laboratory of nanofiber technology of the institute of physics ascr and the faculty of civil engineering, ctu in prague. the work of petra tichá was supported by gacr project 14-04431p. references [1] f. vodak, k. trtik, v. sopko. effect of gamma-irradiation on strength of concrete for nuclear-safety structures. cement and concrete research 35(7):1447, 2005. doi:10.1016/j.cemconres.2004.10.016. [2] p. demo, a. sveshnikov, š. hošková, et al. physical and chemical aspects of the nucleation of cement-based materials. acta polytechnica 52(6), 2012. [3] j. schmelzer. nucleation theory and applications. vch verlagsgesellschaft mbh, 2005. doi:10.1002/3527604790. [4] m. volmer, a. weber. keimbildung in übersättigten gebilden. z phys chem 119:277–301, 1926. [5] l. farkas. keimbildungsgeschwindigkeit in übersättigten dämpfen. z phys chem 125:236–242, 1927. [6] r. becker, w. döring. kinetische behandlung der keimbildung in übersättigten dämpfen. annalen der physik 416(8):719–752, 1935. doi:10.1002/andp.19354160806. [7] f. nabarro. the strains produced by precipitation in alloys. proceedings of the royal society of london series a, mathematical and physical sciences 175(963):519–538, 1940. doi:10.1098/rspa.1940.0072. [8] r. balluffi, s.
allen, w. carter, r. kemper. kinetics of materials. john wiley and sons, 2005. [9] p. demo, z. kožíšek, r. šášik. analytical approach to time lag in binary nucleation. physical review e 59(5):5124, 1999. doi:10.1103/physreve.59.5124. [10] z. kožíšek, p. demo, a. sveshnikov. size distribution of nuclei in a closed system. the journal of chemical physics 125(11):114504, 2006. [11] t. s. van erp. dynamical rare event simulation techniques for equilibrium and nonequilibrium systems. advances in chemical physics 151:27, 2012. doi:10.1002/9781118309513.ch2. [12] r. j. allen, p. b. warren, p. r. ten wolde. sampling rare switching events in biochemical networks. physical review letters 94(1):018104, 2005. doi:10.1103/physrevlett.94.018104. [13] r. j. allen, d. frenkel, p. r. ten wolde. forward flux sampling-type schemes for simulating rare events: efficiency analysis. the journal of chemical physics 124(19):194111, 2006. doi:10.1063/1.2198827. [14] s. auer, d. frenkel. prediction of absolute crystal-nucleation rate in hard-sphere colloids. nature 409(6823):1020–1023, 2001. [15] b. o’malley, i. snook. crystal nucleation in the hard sphere system. physical review letters 90(8):085702, 2003. doi:10.1103/physrevlett.90.085702. [16] s. prestipino, a. laio, e. tosatti. systematic improvement of classical nucleation theory. physical review letters 108(22):225701, 2012. doi:10.1103/physrevlett.108.225701. [17] s. auer, d. frenkel. line tension controls wall-induced crystal nucleation in hard-sphere colloids. physical review letters 91(1):015703, 2003. doi:10.1103/physrevlett.91.015703. [18] w.-s. xu, z.-y. sun, l.-j. an. heterogeneous crystallization of hard spheres on patterned substrates. the journal of chemical physics 132(14):144506, 2010. doi:10.1063/1.3383239. [19] k. sandomirski, s. walta, j. dubbert, et al. heterogeneous crystallization of hard and soft spheres near flat and curved walls. the european physical journal special topics 223(3):439–454, 2014. doi:10.1140/epjst/e2014-02101-7. [20] s. ryu, w. cai. validity of classical nucleation theory for ising models. physical review e 81(3):030601, 2010. doi:10.1103/physreve.81.030601. [21] s. ryu, w. cai. numerical tests of nucleation theories for the ising models. physical review e 82(1):011603, 2010. doi:10.1103/physreve.82.011603. [22] j. langer. theory of nucleation rates. physical review letters 21(14):973, 1968. doi:10.1103/physrevlett.21.973. [23] k. brendel, g. barkema, h. van beijeren. nucleation times in the two-dimensional ising model. physical review e 71(3):031601, 2005. doi:10.1103/physreve.71.031601. [24] a. j. page, r. p. sear. heterogeneous nucleation in and out of pores. physical review letters 97(6):065701, 2006. doi:10.1103/physrevlett.97.065701. [25] l. o. hedges, s. whitelam. patterning a surface so as to speed nucleation from solution. soft matter 8(33):8624–8635, 2012. doi:10.1039/c2sm26038g. [26] j. kuipers, g. barkema. non-markovian dynamics of clusters during nucleation. physical review e 79(6):062101, 2009. doi:10.1103/physreve.79.062101. [27] r. j. allen, c. valeriani, s. tănase-nicola, et al. homogeneous nucleation under shear in a two-dimensional ising model: cluster growth, coalescence, and breakup. the journal of chemical physics 129(13):134704, 2008. doi:10.1063/1.2981052. [28] f. schmitz, p. virnau, k. binder. monte carlo tests of nucleation concepts in the lattice gas model. physical review e 87(5):053302, 2013. doi:10.1103/physreve.87.053302. 
acta polytechnica 55(1):29–33, 2015 acta polytechnica doi:10.14311/ap.2013.53.0579 acta polytechnica 53(supplement):579–582, 2013 © czech technical university in prague, 2013 available online at http://ojs.cvut.cz/ojs/index.php/ap metals in the icm: witnesses of cluster formation and evolution lorenzo lovisari a,∗, tatiana f. laganá b, katharina borm a, gerrit schellenberger a, thomas h. reiprich a; a argelander-institut für astronomie, auf dem hügel 71, d-53121 bonn, germany; b universidade de são paulo, instituto de astronomia, geofísica e ciências atmosféricas, departamento de astronomia, cidade universitária, cep:05508-090, são paulo, sp, brazil; ∗ corresponding author: lorenzo@astro.uni-bonn.de abstract. the baryonic composition of galaxy clusters and groups is dominated by a hot, x-ray emitting intra-cluster medium (icm). the mean metallicity of the icm has been found to be roughly 0.3–0.5 times the solar value, so a large fraction of this gas cannot be of purely primordial origin. indeed, the distribution and the amount of metals in the icm is a direct consequence of the past history of star formation in the cluster galaxies and of the processes responsible for the injection of enriched material into the icm. we here briefly summarize current views on the chemical enrichment, focusing on the observational evidence in terms of metallicity measurements in clusters, spatial metallicity distribution and evolution, and expectations from future missions. keywords: galaxies: cluster: general, galaxies: abundances, x-ray: galaxies: cluster: intergalactic medium. 1. introduction in recent decades a strong effort has been made to shed light on the metal enrichment processes in galaxy clusters, but although interesting results have been obtained we are still far from a full understanding of the enrichment history of these clusters. thanks to their deep potential wells, galaxy clusters and groups retain most of the gas released by the member galaxies, making them excellent laboratories for the study of metal enrichment. the accumulation of the enriched material in clusters can be used as a probe of: a) the source of metals; b) the mechanisms that injected the elements into the icm; c) the differing importance of the various classes of sne; d) the star formation history of the universe.
in this context, the study of galaxy groups is of extreme importance, because low mass systems are the most effective environment for the conversion of baryons into stars (e.g. [10]). all elements apart from hydrogen and helium have been synthesized by the stars, which reside mainly in galaxies. in particular, c and n are produced by intermediate mass stars; o, ne and mg are mainly produced by core-collapse sne; and fe and ni by sne ia. the intermediate α-elements (e.g. si, s, ca, ar) are produced by both types of sne in similar proportions, so the relative ratio between these elements can provide information about the shape of the initial mass function (imf). most of the metals that we observe have been produced within the galaxies, and later they have been transported by various interaction processes from the galaxies to the icm. there is not a single mechanism that always dominates the enrichment, and the efficiencies of the processes vary strongly with galaxy and environmental properties. a list of the proposed processes is the following: ram-pressure stripping [12], galactic winds [6], agn outflows [7], galaxy–galaxy interactions (e.g. [11]), intra-cluster supernovae (e.g. [9]) and gas sloshing (e.g. [18]). this list is certainly not complete, and further processes may also contribute in a small fraction to the metal enrichment of the icm. simulations show that all these processes yield abundance distributions with many inhomogeneities that are not dispersed immediately but are gradually spread out (see [20] for a review). different processes yield different metal distributions and different time dependences of the enrichment. for example, ram-pressure stripping is more efficient at low redshifts and in the centers of clusters, where the icm densities and the galaxy velocities are higher; galactic winds, however, are suppressed in regions of high density. x-ray spectra are the only measure of the metallicity of the icm, and can hence give information on the origin of the metals in the clusters. with the first generation of satellites (e.g. asca, bepposax), it was just possible to determine azimuthally averaged metallicity profiles. however, thanks to the high quality of the x-ray observations from the current generation of satellites, it is now also possible to measure in detail the 2d distribution of the metals in the icm. figure 1. radial metallicity profiles for 20 galaxy groups with kt < 3 kev (lovisari et al. in prep.); abundances are expressed in [1] solar values and radii in units of r500. 2. radial profiles there is a general consensus that the metal distribution peaks in the center of cool-core clusters, while beyond ∼ 0.2 rvir the metallicity is consistent with being flat, similar to the distribution in non-cool-core systems (e.g. [4]). this flattening seems to hold even out to rvir [8]. the mean profiles agree well among different instruments (i.e. xmm, chandra, bepposax; see [13]). since it is expected that the metals in the icm originate from stars in galaxies, one would expect the abundance profiles to follow the cd light profile. however, the light profiles are much steeper than the metal profiles, suggesting that there must be a mixing of the injected metals, a process which may help to diffuse them to larger radii (e.g. [17]).
the abundance profiles of massive clusters have received considerable attention, while the situation in the low mass regime is much less clear, despite their greater importance for the cosmic baryon and metal budgets. in fig. 1 we show the radial metallicity profiles for a sample of galaxy groups (kt < 3 kev, lovisari et al. in prep.). the profiles have a large scatter, especially at large radii, but they show a universal decrease with radius similar to the decrease observed for massive clusters (see [13]). although the overall mean metallicity is ∼ 0.3, as for hot clusters, and the profiles peak in the center, as for cool core systems, the abundance at large radii is lower than in massive objects, and the intrinsic scatter is most prominent in the core, where non-gravitational processes are expected to be important. 3. 2d distribution metal profiles are useful for studying, on the one hand, centrally enhanced metallicities due to cool cores or cd galaxies and, on the other, the overall decrease with radius. the profiles, however, fail to detect the local inhomogeneities of the metals, so a detailed metal map would be even more instructive than the radial profiles. figure 2. metallicity map of the awm4 galaxy group, based on spectra extracted from regions centered on, but larger than, the pixels themselves (laganá et al., in prep.); although the regions are not independent, the map can be used for a qualitative study; the colors range from blue (∼ 0.2 z⊙) to red (∼ 0.9 z⊙). although metal maps are not easy to derive, because a lot of photons are required, it has been possible to derive detailed metallicity maps for several objects (e.g. [14, 15, 19, 21, 22]). the metallicity distribution appears very inhomogeneous for all the analyzed clusters, even the relaxed ones, with high metallicity clumps both in the center and in the outskirts (e.g. [15]). several maxima that are not associated with the cluster center are usually visible in the metallicity distribution. simulations suggest that these maxima are typically in places where galaxies have just lost a lot of gas. the range of metallicities measured in a cluster, from minimum to maximum, comprises a factor of at least two or three. the spatial distribution of metals in groups shows a similarly non-spherically-symmetric distribution as in the clusters (see fig. 2, laganá et al. in prep.), suggesting that the same mechanisms are acting on all mass scales, although their efficiency can differ for different masses. 4. evolution in addition to the metal distribution, the evolution of the metallicity is also interesting. the comparison of the metallicity values between local and high redshift clusters can help to distinguish between the enrichment processes, as different processes have individual time dependences. furthermore, the observed evolution of metal abundances can trace the star formation rate, and the element ratios, e.g. the ratio of fe and α-elements, can be used to obtain information on the different types of supernovae that have contributed to the metal enrichment. figure 3. measured abundance as a function of redshift for a sample of 39 clusters, including the emission from the whole cluster at r < r500 (top) and excising the core (bottom); the error bars in z are at 1σ, while the shaded areas show the weighted mean of the abundance, with its error (blue) and rms dispersion (cyan), in three redshift bins; although the data are consistent with the no-evolution scenario, there is some hint of evolution (figure adopted from baldi et al. [2]).
it is challenging to measure metal abundances in distant clusters, and it requires very high quality data. only a handful of works (e.g. [2, 3]) have tackled the evolution of metal abundance in the icm, and the situation remains far from clear. balestra et al. [3] found that the metal abundance changes by a factor of 2 between redshift zero and ∼ 1.3. a similar evolution of metal abundance with redshift was confirmed by maughan et al. [16], but recently baldi et al. [2] failed to find any significant abundance evolution (see fig. 3). since the photon flux is reduced at high redshift, only the iron-k line emission can be revealed with the current instruments, so that global metallicity and iron abundance have the same meaning. as we will discuss in the next section, it will be possible in the near future to obtain abundance measurements for different elements up to redshift 0.5 with astro-h. 5. future missions although erosita is not designed for metallicity studies, the large amount of data will enable us to study the metal enrichment as a function of galaxy cluster mass and redshift. while the low exposures of the individual clusters (a few ks), compared to studies with the current satellites, do not allow accurate measurements of the metallicity, the advantage of erosita is the very large number of objects that will be studied. these high statistics, together with the dedicated pointed cluster follow-up data, will allow us to put constraints on the enrichment processes, on their efficiencies and on the metal evolution with time. in fig. 4 we show the relative uncertainty on the abundance measurements that we expect to obtain with a 20 ks erosita observation (such an exposure will be obtained, e.g., around the poles). figure 4. the relative uncertainty on the abundance measurements as a function of redshift and mass for a 20 ks exposure time with erosita; the median values for 1000 realizations are shown, and the 68 % errors are taken from the distributions (borm et al. in prep.). in contrast to erosita, one of the astro-h mission goals is to determine the abundances of key diagnostic elements that are currently inaccessible because of their low equivalent widths (i.e., n, al, ca, ar) or blended with the fe-l shell transitions (ne and mg). in particular, n is a very good diagnostic of the stellar initial mass function (imf), while the ca/ar ratio is sensitive to the type ia explosion mechanism [5]. for the first time, the abundance measurements of o, ne, and mg will be accurate to ±10 % up to z ∼ 0.2 and to ±20 % up to z ∼ 0.4. at the same time, fe and si will be determined in bright clusters with good accuracy even at z > 1. all these measurements together will enable us to distinguish between different enrichment mechanisms, to constrain models of the imf, and to determine key characteristics of type ia supernovae, such as their progenitor formation, the explosion mechanisms and their efficiency in enriching the icm. 6. summary measurements of the content, distribution and evolution of the metals in the icm provide invaluable insights into the interactions between the different components in clusters and into the processes that inject enriched gas into the icm.
while the production of metals is related to the star formation rate, their distribution is associated with many metal enrichment mechanisms that are determined by different physical effects. although the current generation of satellites can measure the distribution of the metals in the icm with relatively good accuracy, it is not yet known which of the processes is most efficient, what the time variations are, and how much these factors depend on other parameters, e.g. the mass of the cluster/group. the advent of astro-h on one side and erosita on the other will allow us to measure the metal abundance distributions and their time evolution with very good accuracy, so that we will soon get answers on the importance of the different enrichment processes. acknowledgements ll acknowledges support from the dfg through heisenberg grant re 1462/6, and from the german aerospace agency (dlr) with funds from the ministry of economy and technology (bmwi) through grant 50 or 1102. thr acknowledges support from the dfg through heisenberg grant re 1462/5 and grant re 1462/6. tfl expresses thanks for financial support from fapesp (grant 2008/04318-7) and from capes (grant bex 3405-10-9). references [1] asplund, m., grevesse, n., sauval, a.: 2009, ara&a, 47, 481 [2] baldi, a., ettori, s., molendi, s., et al.: 2012, a&a, 537, 142 [3] balestra, i., tozzi, p., ettori, s., et al.: 2007, a&a, 462, 429 [4] de grandi, s., ettori, s., longhetti, m., et al.: 2004, a&a, 419, 7 [5] de plaa, j., werner, n., bleeker, j.a.m., et al.: 2007, a&a, 465, 345 [6] de young, d.s.: 1978, apj, 223, 47 [7] de young, d.s.: 1986, apj, 307, 62 [8] fujita, y., tawa, n., hayashida, k., et al.: 2008, pasj, 605, 343 [9] gerhard, o., arnaboldi, m., freeman, k.c., et al.: 2002, apj, 580, 121 [10] giodini, s., pierini, d., finoguenov, a.: 2009, apj, 703, 982 [11] gnedin, n.y.: 1998, mnras, 294, 407 [12] gunn, j.e., gott, j.r.: 1972, apj, 176, 1 [13] leccardi, a., molendi, s.: 2008, a&a, 487, 461 [14] lovisari, l., kapferer, w., schindler, s., et al.: 2009, a&a, 508, 191 [15] lovisari, l., schindler, s., kapferer, w.: 2011, a&a, 528, 60 [16] maughan, b.j., jones, c., forman, w., et al.: 2008, apjs, 174, 117 [17] rebusco, p., churazov, e., böhringer, h., et al.: 2005, mnras, 359, 1041 [18] roediger, e., lovisari, l., dupke, r., et al.: 2012, mnras, 420, 3632 [19] sanders, j.s., fabian, a.c.: 2006, mnras, 371, 1483 [20] schindler, s., diaferio, a.: 2008, ssrv, 134, 363 [21] schmidt, r.w., fabian, a.c., sanders, j.s.: 2002, mnras, 337, 71 [22] simionescu, a., werner, n., böhringer, h., et al.: 2009, a&a, 493, 409 discussion maurice h.p.m. van putten — in your last-but-one slide, you mentioned your goal of identifying the type ia sne mechanism with astro-h. does this include identifying their progenitors? lorenzo lovisari — one of the main goals is to provide stringent constraints on the contribution of sn explosions to the metal enrichment of the icm and on the evolution of the sn ia rate, which is one of the most promising methods for unveiling sn ia progenitors. james beall — is there an association between the clumpiness you show in the clusters and the central agn jets? lorenzo lovisari — the metallicity maps show a relation between the direction of highly enriched gas and the cavity/jet angle. furthermore, there is a strong correlation between the power of the agn and the maximum radius at which a significant enhancement in metallicity has been detected.
acta polytechnica 53(supplement):579–582, 2013 acta polytechnica 53(3):306–307, 2013 © czech technical university in prague, 2013 available online at http://ctn.cvut.cz/ap/ spectral singularities do not correspond to bound states in the continuum ali mostafazadeh∗ department of mathematics, koç university, 34450 sarıyer, istanbul, turkey ∗ corresponding author: amostafazadeh@ku.edu.tr abstract. we show that, contrary to a claim made in arxiv:1011.0645, the von neumann–wigner bound states that lie in the continuum of the scattering states are fundamentally different from naimark's spectral singularities. keywords: spectral singularity, bound state in continuum, resonance. in 1929, von neumann and wigner constructed a spherically symmetric scattering potential that supported a bound state with a positive energy [1]. this corresponded to a genuine, square integrable solution of the time-independent schrödinger equation. because the continuous spectrum of the schrödinger operator coincided with the set of nonnegative real numbers, this bound state was called "a bound state in the continuum." this class of bound states was subsequently studied by various authors, notably stillinger and herrick [2], gazdy [3], and friedrich and wintgen [4], who interpreted them as resonances with a zero width. in 2009, the present author revealed the physical meaning and possible applications of the mathematical concept of a spectral singularity [5]. this is a generic feature of complex scattering potentials that was discovered by naimark in the 1950s [6] and studied thoroughly by mathematicians for more than half a century [7]. because spectral singularities correspond to poles of the reflection and transmission coefficients (and of the s-matrix) and belong to the continuous spectrum of the schrödinger operator, which is real and nonnegative, they can also be interpreted as zero-width resonances [5]. this has led to the claim [8] that the states associated with spectral singularities are nothing but the bound states in the continuum. a quick look at the properties of spectral singularities shows that they correspond to scattering solutions of the time-independent schrödinger equation. in particular, they are not square-integrable. this is in contrast with bound states in the continuum, which are associated with square-integrable solutions of the time-independent schrödinger equation. the following are other major differences between spectral singularities and bound states in the continuum. (1.) a real scattering potential can never support a spectral singularity [7]. this is certainly not the case for bound states in the continuum. in fact, the scattering potentials introduced by von neumann and wigner [1] and others [2, 3] that admit a bound state in the continuum are real. (2.) spectral singularities appear for generic complex scattering potentials, while scattering potentials that involve bound states in the continuum are extremely rare. for example, scattering potentials with a compact support cannot have a bound state in the continuum, whereas a complex potential with a compact support can easily admit a spectral singularity [5, 9–15]. the above discussion shows that although both the bound states in the continuum and spectral singularities can be viewed as zero-width resonances, they are quite different in nature.
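points (1) and (2) are easy to see numerically. for a rectangular potential v = v_r + i v_i on 0 < x < a (units ħ = 2m = 1), the standard transmission amplitude is t = e^{−ika}/d(k) with d(k) = cos(qa) − i (k² + q²)/(2kq) sin(qa) and q = √(k² − v); a spectral singularity is a real zero of d. the sketch below scans the real k-axis for an assumed complex (gain) potential and for a real one of the same depth; the particular numbers are illustrative only:

```python
import numpy as np

a = 1.0  # potential width (assumed)

def D(k, v):
    """denominator of the transmission amplitude for the rectangular potential."""
    q = np.sqrt(k ** 2 - v + 0j)
    return np.cos(q * a) - 1j * (k ** 2 + q ** 2) / (2.0 * k * q) * np.sin(q * a)

k = np.linspace(0.1, 15.0, 200001)

v_gain = 4.0 - 20.0j                   # assumed complex potential with gain
absD = np.abs(D(k, v_gain))
print(f"complex v: min |D| = {absD.min():.3e} near k = {k[np.argmin(absD)]:.3f}")

absD_real = np.abs(D(k, 4.0 + 0.0j))   # the same depth, but real
print(f"real v:    min |D| = {absD_real.min():.3f}")
```

for a real potential one can check that |d(k)|² = cos²(qa) + ((k² + q²)/(2kq))² sin²(qa) ≥ 1 on the real axis, so the minimum never approaches zero; for the complex potential the imaginary part can be tuned so that the minimum becomes an exact zero, the lasing-threshold condition of [16].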
most notably, spectral singularities have a much wider domain of application, because they occur for generic complex scattering potentials. concrete evidence is provided by the observation that every lasing system involves a spectral singularity; this corresponds to lasing at the threshold gain [16]. acknowledgements this work has been supported by the turkish academy of sciences (tüba). references [1] j. von neumann and e. wigner, phys. z. 30, 465 (1929). [2] f. h. stillinger and d. r. herrick, phys. rev. a 11, 446 (1975). [3] b. gazdy, phys. lett. 61a, 89 (1977). [4] h. friedrich and d. wintgen, phys. rev. a 31, 3964 (1985) and 32, 3231 (1985). [5] a. mostafazadeh, phys. rev. lett. 102, 220402 (2009). [6] m. a. naimark, trudy moscov. mat. obsc. 3, 181 (1954), in russian; english translation: amer. math. soc. transl. (2), 16, 103 (1960). [7] g. sh. guseinov, pramana j. phys. 73, 587 (2009). [8] i. rotter, "the role of exceptional points in quantum systems," preprint arxiv:1011.0645. [9] a. mostafazadeh and h. mehri-dehnavi, j. phys. a 42, 125303 (2009). [10] a. mostafazadeh, phys. rev. a 80, 032711 (2009). [11] a. mostafazadeh, phys. rev. a 84, 023809 (2011). [12] a. mostafazadeh, j. phys. a 44, 375302 (2011). [13] a. mostafazadeh and m. sarisaman, phys. lett. a 375, 3387 (2011). [14] a. mostafazadeh and m. sarisaman, proc. r. soc. a 468, 3224 (2012). [15] a. mostafazadeh and s. rostamzadeh, phys. rev. a 86, 022103 (2012). [16] a. mostafazadeh, phys. rev. a 83, 045801 (2011). acta polytechnica 53(3):306–307, 2013 1 introduction turbulent submerged axisymmetric jets are of interest in many engineering applications. they are formed by the outflow of fluid from an axisymmetric nozzle into an unbounded space containing a similar or the same fluid at rest. the distinguishing feature of submerged jets (as opposed to free jets) is that the external fluid is entrained into the jet. as a consequence of the entrainment, the jet diameter gradually increases in the axial direction – while its velocity decreases (for momentum preservation). the subject of the present investigation is steady jet flow, with parameters not dependent upon time. the usual simplifying assumptions are adopted: the external fluid is the same as the jet fluid and the nozzle exit is circular. the interest is limited to the fully developed jet – neglecting the development that takes place immediately downstream from the nozzle exit. also neglected in the present approach (as is common in other analogous solutions, e.g., [2], [3], [4], [5], etc.) is the intermittence effect on the jet boundaries, where a purely turbulent flow regime alternates with laminar flow of the entrained external fluid. this, of course, influences the conditions on the extreme ends of the evaluated profiles away from the jet axis, where, however, the solution is of less practical interest, and is in any case difficult to verify experimentally. the fully developed jet possesses an important property of similarity: profiles of various quantities – such as the velocity profile shown in the example of fig. 1 – evaluated at various downstream distances x1 are of mutually similar shape. this similarity property permits us to convert the governing partial differential equations into ordinary differential equations (which, of course, are much easier to solve).
the similarity solutions based upon this idea are standard in the case of the laminar jet [1]. in most engineering applications of jet flows, however, velocities and dimensions are sufficiently large for the flow to be turbulent. turbulence causes the governing equations to be much more complex – so much that there is a general belief that the only way to solve turbulent jet flows is numerical computation based upon the grid method. such numerical results, of course, are only valid for a particular set of boundary conditions, and do not provide a universal solution of general validity, which is the object of interest here. there is a universal solution of the axisymmetric turbulent jet known since 1926, derived by tollmien [3], [1], [17]. its deficiency is the use of a simple, algebraic model of turbulence. the basic assumption of this model is that turbulence is everywhere in a state of equilibrium: the local turbulence production is assumed to be equal to the local turbulence dissipation rate. (acta polytechnica vol. 41 no. 2/2001, © czech technical university publishing house, http://ctn.cvut.cz/ap/ – two-equation turbulence model similarity solution of the axisymmetric fluid jet, v. tesař. abstract: this paper presents a general, universally valid solution of axisymmetric turbulent submerged jet flow, for which no fully satisfactory solution has been known. what has been available so far are either computational solutions for individual particular cases, lacking universality, or similarity solutions with inadequate turbulence models, some of them based upon assumptions of a speculative character (e.g. constant mixing length across the jet profile). the present approach uses a similarity transformation of the governing equations, which incorporate an advanced turbulence model. the results are shown to be in excellent agreement with available experimental data. the new solution provides a suitable basis for analysis of enigmatic aspects of axisymmetric jets, such as their "spreading anomaly". keywords: jet, axisymmetric jet, turbulence, turbulence modelling, two-equation model, similarity solution, fluctuation energy, turbulence dissipation rate, turbulence length scale. fig. 1: schematic representation of the computed jet flow and its typical profile of time-mean axial velocity; the illustration also serves to define the orientation of the used co-ordinate axes x1 and x2.) this is not true in real jets – mainly in the vicinity of the jet axis, where turbulent fluctuations are transported by advective and diffusion effects. as a result, tollmien's solution is known to disagree with experimental data, in particular in the vicinity of the jet axis. in earlier publications [4] and [5], the present author succeeded in obtaining a satisfactory general solution with advanced models of turbulence in the related problem of the plane turbulent jet, using one-equation [4] and two-equation [5] turbulence models. these are capable of taking into account the spatial transport of turbulence [1]. the axisymmetric jet, which is of interest here, was recently also solved with the one-equation model [2]. while this model can account for both advective and diffusive transport, it fails to predict the distribution of the characteristic length scale of turbulence. this was circumvented in [2] by adopting tollmien's [3] assumption of the length scale being constant across each section of the jet.
the results may be described as rather successful, when compared with experimental data – indicating that tollmien's assumption, in spite of its simplicity, is a good approximation to real conditions. nevertheless, a better approach, not requiring any preliminary assumptions, has been desirable also for the present axisymmetric jet problem. such a solution is described in this paper. it uses the two-equation model of turbulence, which makes it possible to evaluate, just from basic principles, the spatial distributions of all parameters of turbulence, including its length scale. 2 the equation of the flowfield the governing equation of the axisymmetric jet flowfield problem is the streamwise specific momentum transport balance in the form valid for cylindrical geometry. jets are slender enough objects for prandtl's version of the diffusive transport effect (neglect of longitudinal diffusion) to be applicable. it is assumed that turbulent eddies move in a stochastic manner, without preference for any direction – so that the turbulence, being isotropic, may be characterised by a scalar quantity $\nu_t$, the turbulent viscosity. the longitudinal co-ordinate x1 (fig. 1) is assumed to coincide with the nozzle axis, while the transverse co-ordinate x2 determines the radial direction measured away from the axis. in notation according to [1] the equation is written as:
$$w_1\,\mathbf{P}(\nu_t) = \frac{\nu_t\,\partial_2 w_1}{x_2}. \quad (1)$$
the right-hand side of eq. (1) represents the effect of radial divergence. the left-hand side represents the spatial transport effect. these are expressed by means of the turbulent left-acting prandtl transport operator $\mathbf{P}(\nu_t)$, which expresses the effects of two transport phenomena: advection and gradient diffusion (with neglected longitudinal diffusion), respectively. it may be decomposed into
$$\mathbf{P}(\nu_t) = \mathbf{w}\cdot\nabla - \partial_2\,\nu_t\,\partial_2, \quad (2)$$
where the first term, in the present two-dimensional context equal to $\mathbf{w}\cdot\nabla = w_1\partial_1 + w_2\partial_2$, is a scalar product of the two vectors ∇ and w. it represents the operator of the two mutually orthogonal components of advective transport. the second term, on the other hand, represents the operator of turbulent diffusion (with $\nu_t$ variable in space). the solution of equation (1) leads to the evaluation of the spatial distribution $w_1 = f(\mathbf{x})$ of the axial component w1 of the time-mean velocity vector
$$\mathbf{w} = \begin{pmatrix} w_1 \\ w_2 \end{pmatrix}, \quad (3)$$
while the mass conservation condition (again written for the cylindrical geometry case)
$$\nabla\cdot\mathbf{w} + \frac{w_2}{x_2} = 0 \quad (4)$$
will take care of the other, transverse velocity component, w2. to evade the necessity of testing the results as to the validity of eq. (4), it is expedient to transform eq. (1) into an equivalent equation for the spatial distribution of the stream function Ψ:
$$w_1 = \frac{\partial_2\Psi}{x_2}, \quad (5) \qquad w_2 = -\frac{\partial_1\Psi}{x_2}, \quad (6)$$
so that all solutions obey eq. (4) identically. after decomposition into terms corresponding to individual transport effects, eq. (1) may be written as consisting of the following five terms:
$$w_1\,\partial_1 w_1 + w_2\,\partial_2 w_1 - \nu_t\,\partial_2^2 w_1 - \partial_2\nu_t\,\partial_2 w_1 = \frac{\nu_t\,\partial_2 w_1}{x_2}. \quad (7)$$
3 model of turbulence the turbulent viscosity (or "eddy viscosity") $\nu_t$, appearing in the governing equation (1), is in general equal to
$$\nu_t = w_t\,\Lambda, \quad (8)$$
where $w_t$ is the velocity scale of turbulent fluctuations,
$$w_t = c_\nu\sqrt{e_f}, \quad (9)$$
and Λ is the turbulence length scale
$$\Lambda = \frac{c_z\,e_f^{3/2}}{\varepsilon}, \quad (10)$$
evaluated from the specific kinetic energy of turbulent fluctuations $e_f$ [m²/s²] and from the turbulence dissipation rate ε [m²/s³], so that
$$\nu_t = \frac{c_\nu c_z\,e_f^2}{\varepsilon}. \quad (11)$$
in equilibrium turbulence, the two model constants $c_\nu$ and $c_z$ are mutually dependent, and according to [1] there is
$$c_\nu = \sqrt[3]{c_z} = 0.548. \quad (12)$$
this relation need not hold in general turbulence. nevertheless, it was used as a convenient simplifying assumption in the present computations. the two quantities $e_f$ and ε must be evaluated from the simultaneously solved two equations of the turbulence model. the first of them is the transport equation for the energy of fluctuations:
$$e_f\,\mathbf{P}(\nu_t) = \mathcal{P} - \varepsilon + \frac{\nu_t\,\partial_2 e_f}{x_2}. \quad (13)$$
again, as in eq. (1), the assumption of a slender shear region permits us here to apply the prandtl approximation for the spatial transport effects, as expressed by the prandtl transport operator on the left-hand side of the equation. on the right-hand side, $\mathcal{P}$ [m²/s³] is the turbulence energy production rate
$$\mathcal{P} = (\partial_2 w_1)^2\,\nu_t, \quad (14)$$
while the other terms in eq. (13) are the dissipation rate ε and the radial divergence effect term. applying the same decomposition of the prandtl operator as in eq. (1) and eq. (7), it is possible to rewrite eq. (13) as
$$w_1\,\partial_1 e_f + w_2\,\partial_2 e_f - \nu_t\,\partial_2^2 e_f - \partial_2\nu_t\,\partial_2 e_f = (\partial_2 w_1)^2\,\nu_t - \varepsilon + \frac{\nu_t\,\partial_2 e_f}{x_2}. \quad (15)$$
the other equation of the turbulence model describes the spatial transport of the specific dissipation rate of the turbulent fluctuation energy:
$$\varepsilon\,\mathbf{P}(\nu_\varepsilon) = \mathcal{U} - \mathcal{Y} + \frac{\nu_t\,\partial_2\varepsilon}{x_2}, \quad (16)$$
where $\mathcal{U}$ [m²/s⁴] is the dissipation production rate and $\mathcal{Y}$ [m²/s⁴] is the destruction rate, while the last term on the right-hand side, similarly as in eq. (1) and (13), represents the effect of radial divergence. decomposition of the prandtl operator leads – cf. eq. (13) and eq. (15) and [1] – to:
$$w_1\,\partial_1\varepsilon + w_2\,\partial_2\varepsilon - \nu_\varepsilon\,\partial_2^2\varepsilon - \partial_2\nu_\varepsilon\,\partial_2\varepsilon = c_u c_\nu c_z\,(\partial_2 w_1)^2\,e_f - c_y\,\frac{\varepsilon^2}{e_f} + \frac{\nu_t\,\partial_2\varepsilon}{x_2}. \quad (17)$$
in this case, the coefficient of gradient diffusion $\nu_\varepsilon$ [m²/s] is evaluated from the eddy viscosity and from the assumed knowledge of the corresponding prandtl number
$$\mathrm{pr}_\varepsilon = \frac{\nu_t}{\nu_\varepsilon}. \quad (18)$$
the standard recommended value [1] is $\mathrm{pr}_\varepsilon = 1.3$. note that an analogous prandtl number could also be introduced in eq. (13); there, however, its assumed value is 1.0 (this is equivalent to assuming identical transport of momentum and fluctuation energy by turbulent motions), in common with standard turbulence modelling practice. the model constants $c_u$ and $c_y$ introduced in eq. (17) are mutually related in equilibrium near-wall turbulence, as shown in [1],
$$c_y = c_u + \frac{\kappa^2}{\mathrm{pr}_\varepsilon\,\sqrt{c_\nu c_z}}, \quad (19)$$
where κ is the von kármán constant. the standard value of $c_u$, evaluated from the turbulence decay rate downstream from grids, is $c_u = 1.44$. solutions of the transport equations (13) and (16) provide the spatial distributions $e_f = f(\mathbf{x})$ and $\varepsilon = f(\mathbf{x})$. inserting them into eq. (11) makes possible the evaluation of the spatial distribution of the turbulent viscosity $\nu_t = f(\mathbf{x})$ and of its transverse derivative, needed in all three transport equations:
$$\partial_2\nu_t = \frac{c_\nu c_z}{\varepsilon}\left(2\,e_f\,\partial_2 e_f - \frac{e_f^2\,\partial_2\varepsilon}{\varepsilon}\right). \quad (20)$$
in the present solution, in line with previous treatments of jet flows in [2], [4], and [5], a high turbulence reynolds number $\mathrm{re}_t = \nu_t/\nu$ (with ν the molecular viscosity) is assumed.
4 the similarity transformation

transformation of the equations into the similarity form is based upon measuring transverse distances by a scale which varies in the axial direction in proportion to the local jet diameter. similarly, velocities are measured by a velocity scale which, at every jet cross section, is proportional to the local maximum velocity $w_m$. this decreases in the axial direction according to the power law

$$w_m \sim x_1^{\alpha} \qquad (21)$$

with a negative exponent $\alpha$. the initial task is to determine the magnitude of this exponent. the next task is determination of the exponent in the power law of the axial growth of the jet diameter,

$$\delta \sim x_1^{\beta}. \qquad (22)$$

the discussions presented e.g. in [1] and [2], supported by experimental verifications, indicate that there is

$$\beta = 1 \qquad (23)$$

and from the constancy of the axial momentum flow rate [1] there is

$$\alpha = -1. \qquad (24)$$

the similarity transformation is achieved by introducing the following similarity variables:

a) the relative transverse co-ordinate is to be of the form $\eta \sim x_2/x_1$. to remove constants that would otherwise appear in the resultant equation, it is proper – according to [3] – to introduce the definition

$$\eta = \frac{x_2}{\sqrt{c_\nu c_z}\;x_1}. \qquad (25)$$

in agreement with the usage in [1] as well as [2], [4], and [5], differentiation with respect to this similarity co-ordinate is denoted by the non-indexed left-acting operator $\nabla = \mathrm{d}/\mathrm{d}\eta$.

b) the relative longitudinal velocity should have the form

$$u = \frac{w_1}{x_1^{\alpha}} \qquad (26)$$

related to the local maximum velocity $w_m$ (eq. (21)) on the jet axis. introducing the proportionality constant $m$ (needed for dimensional reasons), we can write

$$u = \frac{w_1\,x_1}{m}. \qquad (27)$$

c) relative kinetic energy: using the maximum velocity on the jet axis as the reference, a suitable definition for the transformed similarity variable is

$$\mathcal{E} = \frac{e_f}{w_m^2} = \frac{e_f\,x_1^2}{m^2}. \qquad (28)$$

d) transformation for the relative turbulence dissipation rate:

$$j = \frac{\varepsilon\,x_1^4}{m^3}. \qquad (29)$$

the relative stream function $f$ is then required so that it will be possible to transform eq. (5) into a corresponding expression with similarity-transformed variables,

$$u = \frac{\nabla f}{\eta}. \qquad (30)$$

inserting the above definitions into eq. (5) leads to the following relation between $\psi$ and $f$:

$$\psi = c_\nu c_z\,m\,x_1\,f. \qquad (31)$$

in order to perform the transformation of eq. (1), the following conversions need first to be evaluated:

$$w_1 = \frac{m}{x_1}\,u, \quad \partial_1 w_1 = -\frac{m}{x_1^2}\,(u + \eta\,\nabla u), \quad w_2 = \frac{\sqrt{c_\nu c_z}\,m}{x_1}\Bigl(\eta u - \frac{f}{\eta}\Bigr), \quad \partial_2 w_1 = \frac{m}{\sqrt{c_\nu c_z}\,x_1^2}\,\nabla u, \quad \partial_2^2 w_1 = \frac{m}{c_\nu c_z\,x_1^3}\,\nabla^2 u.$$

the analogous transformation of eq. (13) requires evaluation of the terms

$$\partial_1 e_f = -\frac{m^2}{x_1^3}\,(2\mathcal{E} + \eta\,\nabla\mathcal{E}), \quad \partial_2 e_f = \frac{m^2}{\sqrt{c_\nu c_z}\,x_1^3}\,\nabla\mathcal{E}, \quad \partial_2^2 e_f = \frac{m^2}{c_\nu c_z\,x_1^4}\,\nabla^2\mathcal{E}.$$

to transform eq. (16) then requires evaluation of the terms

$$\partial_1\varepsilon = -\frac{m^3}{x_1^5}\,(4j + \eta\,\nabla j), \quad \partial_2\varepsilon = \frac{m^3}{\sqrt{c_\nu c_z}\,x_1^5}\,\nabla j, \quad \partial_2^2\varepsilon = \frac{m^3}{c_\nu c_z\,x_1^6}\,\nabla^2 j.$$

finally, it is also necessary to transform the expressions with the eddy viscosity, eq. (11) and eq. (20):

$$\nu_t = c_\nu c_z\,m\,\frac{\mathcal{E}^2}{j}, \qquad \partial_2\nu_t = \frac{\sqrt{c_\nu c_z}\,m}{x_1}\Bigl(\frac{2\mathcal{E}\,\nabla\mathcal{E}}{j} - \frac{\mathcal{E}^2\,\nabla j}{j^2}\Bigr).$$

the expressions in the above boxes are inserted into the three transport equations (1), (13), and (16) – preferably in their expanded forms, equations (7), (15), and (17). for reasons of convenience (elimination of higher derivatives in the equations) and also for the interesting meaning of the introduced additional variables it is useful to introduce the following auxiliary variables:

– relative transverse gradient of velocity $g = \nabla u$,
– relative gradient of turbulence energy $n = \nabla\mathcal{E}$,
– relative gradient of dissipation rate $z = \nabla j$.

derivatives of the transformed stream function may be eliminated from the transformed equations by substitutions using eq. (30) together with expressions derived from eq. (30) by differentiation:

$$\nabla f = u\,\eta, \qquad \nabla^2 f = \eta\,g + u, \qquad \nabla^3 f = \eta\,\nabla g + 2g.$$
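a minimal python sketch of the definitions (25)–(29) – with freely chosen names, and with the constant $m$ treated as known – may make the bookkeeping easier to follow:

import math

c_nu = math.sqrt(0.3); c_z = c_nu**3      # constants as in the appendix program
sq_c = math.sqrt(c_nu * c_z)              # the factor sqrt(c_nu*c_z) of eq. (25)

def to_similarity(x1, x2, w1, e_f, eps, m):
    """convert physical quantities to the similarity variables of eqs. (25)-(29)."""
    eta = x2 / (sq_c * x1)                # eq. (25)
    u   = w1 * x1 / m                     # eq. (27)
    E   = e_f * x1**2 / m**2              # eq. (28), relative fluctuation energy
    j   = eps * x1**4 / m**3              # eq. (29)
    return eta, u, E, j

# illustrative numbers only:
print(to_similarity(x1=1.0, x2=0.1, w1=2.0, e_f=0.3, eps=1.5, m=2.0))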
as a result, the transformation relations and the introduction of the auxiliary variables convert the equations into the following set of seven ordinary first-order equations:

$$\begin{aligned}
\nabla f &= u\,\eta,\\
\nabla u &= g,\\
\nabla \mathcal{E} &= n,\\
\nabla g &= g\Bigl(\frac{z}{j}-\frac{2n}{\mathcal{E}}-\frac{1}{\eta}\Bigr)-\frac{j}{\mathcal{E}^2}\Bigl(\frac{fg}{\eta}+u^2\Bigr),\\
\nabla n &= n\Bigl(\frac{z}{j}-\frac{2n}{\mathcal{E}}-\frac{1}{\eta}\Bigr)-g^2-\frac{j}{\mathcal{E}^2}\Bigl(2u\mathcal{E}+\frac{nf}{\eta}-j\Bigr),\\
\nabla j &= z,\\
\nabla z &= z\Bigl(\frac{z}{j}-\frac{2n}{\mathcal{E}}-\frac{1}{\eta}\Bigr)-\mathrm{Pr}_\varepsilon\,\frac{j}{\mathcal{E}^2}\Bigl(\frac{zf}{\eta}+4ju\Bigr)-\mathrm{Pr}_\varepsilon\,\frac{j}{\mathcal{E}}\Bigl(c_u g^2-c_y\frac{j^2}{\mathcal{E}^2}\Bigr)
\end{aligned} \qquad (32)$$

5 comparison with the plane jet case

it is instructive to compare the above resultant transformed form of the equations, eq. (32), with the analogous similarity-transformed equations obtained in [5] for the plane turbulent jet case, also solved using the two-equation model. the equations arrived at in [5] may here be re-written as:

$$\begin{aligned}
\nabla f &= u\,\eta,\\
\nabla u &= g,\\
\nabla \mathcal{E} &= n,\\
\nabla g &= g\Bigl(\frac{z}{j}-\frac{2n}{\mathcal{E}}\Bigr)-\frac{j}{\mathcal{E}^2}\Bigl(\frac{fg}{\eta}+u^2\Bigr),\\
\nabla n &= n\Bigl(\frac{z}{j}-\frac{2n}{\mathcal{E}}\Bigr)-g^2-\frac{j}{\mathcal{E}^2}\Bigl(2u\mathcal{E}+\frac{nf}{\eta}-j\Bigr),\\
\nabla j &= z,\\
\nabla z &= z\Bigl(\frac{z}{j}-\frac{2n}{\mathcal{E}}\Bigr)-\mathrm{Pr}_\varepsilon\,\frac{j}{\mathcal{E}^2}\Bigl(\frac{zf}{\eta}+4ju\Bigr)-\mathrm{Pr}_\varepsilon\,\frac{j}{\mathcal{E}}\Bigl(c_u g^2-c_y\frac{j^2}{\mathcal{E}^2}\Bigr)
\end{aligned} \qquad (33)$$

the two systems, eq. (32) and eq. (33), appear to be very nearly identical, the only exception being the additional $1/\eta$ terms within the first pair of brackets on the right-hand sides of the equations for the axisymmetric case, eq. (32). since the original partial differential equations of the problem before the transformation differ only in the additional radial divergence term present in the axisymmetric case, it would seem to be almost without doubt that the additional $1/\eta$ term in the transformed equations is the result of the similarity transformation of the original radial divergence terms – the last terms in eqs. (1), (13), and (16). it is surprising that the situation is actually far from being so simple. in spite of the similar structure before as well as after the transformation, the seemingly corresponding transformed terms may be of completely different origin and meaning – the above mentioned $1/\eta$ term in fact does not stem from the radial divergence term in the equation for $n$, but results from the diffusion term in the equation for $g$! this completely different origin of what seem to be corresponding and perfectly analogous terms is a very strange feature of the solved system of equations.

6 boundary conditions

the system of simultaneous equations (32) is subject to the following obvious boundary conditions on the jet axis, at $\eta = 0$:

$u_{ax} = 1$ … from $w_1 = w_m$
$f_{ax} = 0$ … from $w_2 = 0$
$g_{ax} = 0$ … for zero velocity profile slope
$n_{ax} = 0$ … for zero $e_f$ profile slope
$z_{ax} = 0$ … for zero $\varepsilon$ profile slope

the zero values of the nondimensional gradients are due to the profile symmetry. these are, unfortunately, only five conditions instead of the seven required for the solution of the seven equations (32). this is a typical example of a problem with split boundary conditions: the sixth and seventh values, $\mathcal{E}_{ax}$ and $j_{ax}$ at $\eta = 0$, need to be adjusted so as to fulfil the required values $\mathcal{E} = 0$ and $j = 0$ at the other end of the integration region, at $\eta \to \infty$.
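restated in python, the right-hand sides of system (32) take the following compact form. this sketch mirrors the def fn f1…f7 definitions of the appendix program; the names are chosen freely, the state vector is y = (f, u, E, g, n, j, z), and the sign of the very last term is an assumption (the operator there was lost at a line break in the printed listing; the sign used here follows from re-deriving the transformation of eq. (17)):

import numpy as np

# constants as set in the appendix program
PRE   = 1.38012                                  # pr_eps (program variable pre)
KAPPA = 0.41                                     # von kármán constant
C_NU  = np.sqrt(0.3)
C_Z   = C_NU ** 3
CU    = 1.44
CY    = CU + KAPPA**2 / (np.sqrt(C_NU * C_Z) * PRE)   # eq. (19)

def rhs(eta, y):
    """right-hand sides of the seven ordinary differential equations (32)."""
    f, u, E, g, n, j, z = y
    b = z/j - 2*n/E - 1/eta                      # the recurring first bracket
    return np.array([
        u * eta,                                           # grad f
        g,                                                 # grad u
        n,                                                 # grad E
        g*b - j/E**2 * (f*g/eta + u*u),                    # grad g
        n*b - g*g - j/E**2 * (2*u*E + n*f/eta - j),        # grad n
        z,                                                 # grad j
        z*b - PRE * j/E**2 * (z*f/eta + 4*j*u)
            - PRE * (j/E) * (CU*g*g - CY*j*j/E**2),        # grad z (last sign assumed)
    ])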
the problem of finding the proper values $\mathcal{E}_{ax}$ and $j_{ax}$ at $\eta = 0$ is not easy. since there are two unknown values and the integrated equations are extremely sensitive even to minute changes, the use of the usual "shooting method" is hardly practicable. fortunately, it is possible to base the starting values upon the results of the earlier one-equation model solution [2]. it may be useful to recall that the solution found in [2] is not unique: the equations are fulfilled for any value of the dimensionless parameter $c_z/(k\,s)$, dependent on the jet diameter growth factor $s$ and the mixing length factor $k$. the proper value of the parameter was evaluated by comparison with experiments. in [2], the value

$$\frac{c_z}{k\,s_{0.5}} = 9.108401 \qquad (34)$$

was found by adjusting the jet growth factor $s_{0.5}$ to the magnitude evaluated for the classical trüpel (1915) experiment. the value eq. (34) led to the relative specific energy of turbulent fluctuations on the jet axis

$$\mathcal{E}_{ax} = 0.07614 \qquad (35)$$

and to the relative specific energy dissipation rate on the axis

$$j_{ax} = \frac{c_z}{k\,s}\,\mathcal{E}_{ax}^{3/2} = 0.1912. \qquad (36)$$

these boundary values, eq. (35) and eq. (36), were now used as starting values in the present solution with the two-equation model.

7 solution

there is still another problem associated with the integration of the equation system (32). it is caused by the fact that what seems to be the natural starting point of the integration procedure, $\eta = 0$, cannot actually be used. the transformed radial co-ordinate $\eta$ appears in equations (32) in denominator position, leading there to infinite values of some terms. this is no substantial problem and was easily solved by shifting the starting position a small distance away from the axis and evaluating the shifted off-axis boundary conditions using a taylor series expansion (in fact quite simple, because the most important first derivatives are zero on the axis due to the symmetry). the standard values of the turbulence model constants $c_z$ and $c_\nu$ from eq. (12) were used, as well as $c_y$ corresponding to

$$c_y = c_u + \frac{\kappa^2}{\sqrt{c_\nu c_z}\;\mathrm{Pr}_\varepsilon}, \qquad c_u = 1.44 \qquad (37)$$

with the von kármán constant value $\kappa = 0.41$. the only deviation from the standard practice was to insert a slightly higher $\mathrm{Pr}_\varepsilon = 1.3837$ in place of the usual $\mathrm{Pr}_\varepsilon = 1.3$ (for the latter there being, after all, no physical justification anyway). the higher value leads to a better fulfilment of the boundary conditions on the outer boundary (an effect which can also be achieved by adjusting the conditions $\mathcal{E}_{ax}$ and $j_{ax}$). the integration of eq. (32) was performed by a simple computer program tes-axi2eq, using the standard runge-kutta integration method. a complete listing of the program is attached in the appendix. fig. 2 shows the integration result: the half-profiles of the transformed axial velocity $u$, of its transformed transverse gradient $g$, and of the transformed stream function $f$ (an integral of velocity multiplied by radius). these quantities are plotted in fig. 2 against the co-ordinate $\eta$, the similarity-transformed radius.

fig. 2: solution of the governing eq. (1) in terms of the three similarity variables that determine velocities. this solution was obtained using the boundary conditions evaluated in [2] for the one-equation solution.
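the integration procedure described above is easy to reproduce in outline; the following python sketch is an illustration only, not the author's program: it re-uses the rhs function from the previous sketch, starts a small distance off the axis as described, and integrates with a standard runge-kutta method from scipy:

import numpy as np
from scipy.integrate import solve_ivp

# boundary values on the axis, eqs. (35) and (36), taken from [2]
E_AX, J_AX = 0.07614, 0.1912

deta = 1e-4
eta0 = deta / 4                      # shifted off-axis start, as in the appendix program
# the first derivatives vanish on the axis by symmetry, so the axis values
# themselves serve as the taylor-expanded starting state (f, u, E, g, n, j, z):
y0 = np.array([0.0, 1.0, E_AX, 0.0, 0.0, J_AX, 0.0])

sol = solve_ivp(rhs, (eta0, 0.6), y0, max_step=deta, rtol=1e-8)

u_profile = sol.y[1]                 # relative velocity u(eta)
print(u_profile[-1])                 # ideally close to zero at the outer edge; the
                                     # extreme sensitivity of section 11 shows up here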
fig. 3 then presents the analogous half-profiles of the results obtained from the simultaneous integration for the transformed relative energy of turbulent fluctuations: there is the relative energy $\mathcal{E}$ as well as its (transformed) transverse gradient $n$, again plotted against the similarity-transformed radius $\eta$. in a similar manner, fig. 4 presents the evaluated half-profiles of the turbulence dissipation rate $j$, the spatial distribution of which is described by eq. (16). in the present solution this quantity is computed mainly as a means for evaluating the turbulence length scale using eq. (10).

fig. 3: solution of equation (13) for the turbulent fluctuation energy in similarity-transformed coordinates: the relative magnitude of the fluctuation energy $\mathcal{E}$ and its transverse gradient $n$. note the extended vertical scale compared with fig. 2.

fig. 4: solution of equation (16) for the turbulence dissipation rate in similarity-transformed coordinates: again, the transverse gradient of the evaluated quantity is plotted (on a 10-times decreased scale) in the bottom part of the diagram.

8 comparisons with experiments

the following several figures present comparisons of the profiles resulting from the present solution with experimental data. for plotting the present results in common co-ordinates with experimental and other data, it is useful to use another relative transverse co-ordinate, e.g., the co-ordinate $\eta_2$ related to the jet diameter $\delta_{0.5}$ at which the axial time-mean velocity reaches one half of the local velocity maximum $w_m$. for this purpose, it was necessary to find the value of the transverse similarity co-ordinate $\eta$ at which the relative velocity $u$ attains the value $u = 0.5$. there is

$$\eta_{0.5} = 0.3139412. \qquad (38)$$

fig. 5: a comparison of the profile of axial time-mean velocity from the present solution (cf. fig. 2) with classical experimental data.
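the value (38) is obtained by interpolating the computed velocity profile; a minimal sketch, assuming the eta and u arrays from the integration sketched above:

import numpy as np

def eta_half(eta, u):
    """transverse co-ordinate where u drops through 0.5, by linear interpolation."""
    i = np.argmax(u < 0.5)                      # first index past the half-velocity point
    t = (0.5 - u[i-1]) / (u[i] - u[i-1])
    return eta[i-1] + t * (eta[i] - eta[i-1])

# with the arrays from the integration sketch: eta_half(sol.t, sol.y[1]) ≈ 0.3139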
fig. 5 shows a comparison with classical experimental data, including the data due to trüpel (1915). it should be stressed that although trüpel's data set was used in [2] for evaluating the boundary conditions eq. (35) and eq. (36) – the values used also in the present computations – the almost perfect agreement seen in fig. 5 is certainly not due to any mutual relationship. the evaluation of the boundary conditions in [2] used just a single number from trüpel's experiment, its radius growth factor $s_{0.5}$. the agreement in the shape of the velocity profiles is, therefore, a result of good modeling of the flow and its turbulence. this may perhaps be proved by inspecting a comparison with another experiment in fig. 6, where there is no such relation. the apparent disagreement visible in this case near the jet outer boundaries is of no importance, for several reasons. first, the measurements whose results are shown in fig. 6 were made using a hot wire anemometer.

fig. 6: another comparison of the velocity profile obtained by the present solution – as shown in fig. 2 – with experimental velocity profile data measured by a hot wire anemometer in the author's laboratory.

fig. 7: comparison of the profile of turbulent fluctuation energy obtained by the present solution – as shown in fig. 3 – with the results of anemometric investigations (single-component hot wire anemometer data evaluated using the assumption of isotropy of turbulence).

it is a typical property of a hot wire sensor that it lacks the capability to discriminate between positive and negative velocity directions. since near the jet boundary the time-mean velocity is small and the fluctuations are relatively very large, negative values occur, but they are interpreted by the anemometer as positive values. as a result, the apparent mean velocity data in this region tend to be higher than the actual mean velocity – a trend which is in line with what is seen in fig. 6. secondly, the present solution is not intended to be exactly valid on the jet boundary, because – as was already mentioned in the introduction – it does not take into account the effects of intermittence.

the anemometric measurements performed by the present author and his co-workers made possible not only evaluations of the time-mean velocity, but also of the velocity fluctuation. from the data on the latter, it is possible to calculate the local energy of turbulent fluctuations $e_f$ – under the somewhat simplifying assumption of isotropy, which is necessitated by the single-channel anemometer being the only one available. the experimental profiles of this fluctuation energy, converted – according to eq. (28) – to the nondimensional similarity variable for which the transport equation is here solved, are compared in fig. 7 with the present theoretical solution (corresponding to the profiles from fig. 3). it is evident that the isotropy assumption is not restrictive, and the experimental and theoretical profiles are in quite good agreement. the surprising fact, however, is that this agreement is arrived at, after gradual development, only at a very large distance downstream from the nozzle. note that in the simultaneously made velocity measurements shown in fig. 6, the fully developed shape of the profile is attained already at a distance several times shorter. the extreme length of the turbulence parameter development, equal to around 60 nozzle exit diameters, does not seem to be known or discussed in the available literature, where it is commonly stated (no doubt on the grounds of comparing only velocity data) that a fully developed state is already attained at 10 or perhaps 15 nozzle diameters. in order to investigate this interesting finding in detail, experiments were performed during which the velocity fluctuation was measured by a hot wire anemometer along the jet axis. the results of these experiments are shown in fig. 8. the increasing time scale of the fluctuation with the downstream distance caused a rather large scatter of the data at larger values of $x_1$. nevertheless, the dependence demonstrates quite clearly the asymptotic character of the approach to the value eq. (35), found by the similarity solutions both from [2] and from the present analysis.

fig. 8: experimentally determined relative values of the fluctuation energy on the jet axis as a function of the downstream distance from the nozzle exit.
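the hot-wire rectification effect described in the discussion of fig. 6 above is easy to reproduce numerically. in the following sketch (synthetic data and freely chosen numbers, for illustration only), a small mean velocity with large fluctuations is "measured" as the mean of absolute values, which overstates the true mean:

import numpy as np

rng = np.random.default_rng(1)
true_mean, fluctuation = 0.05, 0.5          # small mean, large fluctuations near the boundary
v = true_mean + fluctuation * rng.standard_normal(100_000)

apparent = np.abs(v).mean()                  # the sensor cannot see the sign of v
print(true_mean, apparent)                   # apparent mean is much higher than the true one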
9 turbulence length scale

apart from the above described comparisons with experimental data, it may be of interest to compare the present results with what was obtained earlier in [2] with a less sophisticated, one-equation model of turbulence. fig. 9 compares the half-profiles of the axial time-mean velocity computed from the present solution (cf. fig. 2) and from [2]. both velocity profiles are nearly coincident – to the degree that there seems to be no possibility of deciding between them on the basis of any conceivable experiment.

the main difference between the present two-equation model solution and the earlier solution [2], which used the one-equation model, is the closed character of the present approach, not dependent upon outside information or assumptions. the one-equation model required an assumption about the spatial distribution of the turbulence length scale (taken as constant in [2]) and, to achieve unicity, some information about the jet spreading rate. it may be of interest to see what was accomplished by the present two-equation approach in these two respects. as for the length scale, its distribution across the jet cross section may now be calculated from the present solution, using the results on the distributions of the fluctuation energy (fig. 3) and of the dissipation rate (fig. 4), with the use of eq. (10) and of the values of the constants eq. (12). in fig. 10, the values obtained this way are shown in the form of their ratio to the constant turbulence length of the earlier solution [2]. there was, in [2],

$$\Lambda_{1eq} = k\,s\,x_1 \qquad (39)$$

while the present $\Lambda_{2eq}$ is given by eq. (10). using the definitions eq. (28) and eq. (29), the ratio of the two lengths is

$$\lambda = \frac{\Lambda_{2eq}}{\Lambda_{1eq}} = \frac{c_z}{k\,s}\,\frac{\mathcal{E}^{3/2}}{j}. \qquad (40)$$

in fig. 10, it is immediately apparent that the length scale distribution resulting from the present solution changes only insignificantly across the jet profile. indeed, an indication of the small changes of the length scale across the profiles was visible from the small difference between the two solutions in fig. 9 and, in fact, already from the similar shapes of the curves in figs. 3 and 4 (with constant $c_z$, cf. eq. (10)). this insignificant variation provides an explanation of the apparent success of the earlier jet solutions, beginning with [3] and including [2], which assumed the length scale to be simply constant across the jet.

10 the illusory "plane jet/round jet anomaly"

while in the simpler previous solution [2] it was necessary to evaluate the jet diameter growth using information obtained from experimental data, the closed character of the present two-equation solution makes possible a direct evaluation of the jet spreading rate. in fact, the information about jet spreading resulting from the present solution made possible one of the important results of this paper. it is a clarification of what for a considerable time has been one of the basic problems of turbulence modelling: the unresolved problem of the "round-jet/plane-jet spreading anomaly". according to assertions in the most authoritative literature on the subject – e.g. [11] – computations with the present standard two-equation turbulence models, while predicting plane jets properly, are believed to fail when applied to axisymmetric jets. in particular, the computed spreading rates are said to be between 25 % and 40 % larger than the experimental spreading rates – as shown in fig. 11. such a discrepancy is believed to be a manifestation of the fact that there are physical mechanisms participating in the dissipation process that have so far not been taken into account in the usual turbulence models. attempts to remove the anomaly have led to various suggested alterations to the model equation for the axisymmetric jet without, however, any convincing success.
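eq. (40) is mirrored in the appendix program (variable lamb, line 820); as a standalone python sketch, assuming the E and j profile arrays from the integration sketched earlier:

import numpy as np

CKS = 9.108401                  # the parameter c_z/(k*s) of eq. (34)

def length_scale_ratio(E, j):
    """lambda of eq. (40): two-equation to one-equation length scale ratio."""
    return CKS * E**1.5 / j

# e.g. with the integration sketch above: length_scale_ratio(sol.y[2], sol.y[5])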
usually, as in the typical case of pope's [12] adaptation of the two-equation model, the suggested modifications remove the "round-jet/plane-jet anomaly" only to exchange it for a spreading rate anomaly in other situations. the reported absence of the anomaly in the wilcox k–ω model [11] was at least partly the reason behind its popularity. the comparisons in fig. 11, however, are not really convincing. furthermore, recent investigations (speziale and abid, 1995) show that the k–ω model is hardly perfect, as it has certain difficulties associated with the assumed basic physics of turbulence. the present results, on the other hand, show that what is known as the "round-jet/plane-jet spreading anomaly" is only a fictitious effect, caused by improper numerical values of the boundary conditions (and perhaps of the model constants) used by earlier authors in their earlier computations.

fig. 11: this illustration is taken from [11]. it shows the common belief in the literature on axisymmetric jets that computation using the standard two-equation turbulence model leads to a wrong spreading rate. the model due to wilcox [11] is proposed as a remedy.

in fig. 12, the velocity profiles from fig. 11 are compared with the present solution, which is re-plotted using the same transverse co-ordinate (one which directly corresponds to the value of the jet diameter growth constant) as in fig. 11. obviously, no spreading anomaly is seen, although the present solution shown in fig. 12 uses the standard model constants (with the exception of the small adjustment in the value of $\mathrm{Pr}_\varepsilon$, which is irrelevant in the present context) as they are used also for the plane jet solution. fig. 13 shows the error that was probably responsible for the wrong behaviour of the two-equation solution in fig. 11: without changes in the model constants, but with a different set of boundary conditions $\mathcal{E}_{ax}$ and $j_{ax}$, the curve that should represent the "standard two-equation solution" in [11] is quite well reproduced by the present solution, using the computer program tes-axi2eq as listed in the appendix. in fact, by slightly decreasing $\mathcal{E}_{ax}$ below the value $\mathcal{E}_{ax} = 0.1213$ indicated in fig. 13, it is possible to achieve an almost perfect coincidence of the two curves – the slight mismatch was preferred for the presentation in fig. 13 so that both curves may be individually recognisable. obviously, the bad reputation of the two-equation model was due to a similar erroneous set of boundary conditions used in some earlier computations. it should be stressed that such an error is extremely difficult to identify in usual computations based upon a grid solution of partial differential equations – in particular in view of another of the present findings, the extremely long way towards equilibrium shown in fig. 8 (computations are seldom performed on regions as long as is now known to be necessary). it is an advantage of the similarity transformation approach that all these details may be easily clarified.

fig. 12: the present results re-plotted in the same co-ordinates as used in fig. 11 show that the spreading anomaly is a fictitious effect [10]: no anomalous spreading is found for the present solution with proper constants and proper boundary conditions.
11 sensitivity analysis

a serious problem with the standard two-equation model of turbulence, which actually forms the background of enigmatic features such as the discussed fictitious anomaly, is its extreme sensitivity to the exact values of the model constants and boundary conditions [14]. the sensitivity problem becomes really serious in common applications of the model to computations using grid methods, where the problem of boundary conditions often appears in a disguised form and may lead to conclusions that the system of solved equations is unstable or unsolvable. the present similarity approach, on the other hand, is particularly suitable for clarifying this question. after the transformation, each boundary condition appears in the form of just a single numerical value to be determined. using the present solution, it is tractable and in fact quite instructive to study the dependence of the solution errors on changes of the boundary condition values. in the sensitivity analysis, the magnitude of the solution error was judged by the magnitudes of the two end boundary conditions which are to be fulfilled at the end of the integration: the values of the relative velocity $u$ and the relative fluctuation energy $\mathcal{E}$ outside the jet boundaries – at $\eta = 0.8$ (fig. 15) – where they should be zero. the results of repeated integrations with varied values of the boundary conditions at the starting point (on the jet axis) are shown in fig. 16.

fig. 13: demonstration of the fact that the improper spreading rate, believed to be an inherent drawback of the two-equation model, is just a result of using improper boundary conditions on the jet axis: with a change in the value of the conditions, the present solution may be changed to match the curve from [11].

fig. 14: classical sensitivity analysis is concerned with finding the slope of the linear (or locally linearised) dependence between the deviations of parameters from the correct value and the resultant error in the solution.

the results indicate that, in the present problem, there occurs the worst conceivable situation known to classical sensitivity analysis: the linearised dependence is here a vertical line with an infinite slope. this means that the solution of the jet with the two-equation model exhibits infinite sensitivity even to the smallest variations of the starting point values. it is only due to the nonlinearity of the solved system – the fact that the dependence is not linear and there is some curvature – that any solution at all is arrived at. note that some small variation of the starting value (leading, however, to an error of the order of tens of percent) is admissible only in one sense: only an increase in $\mathcal{E}_{ax}$ or a decrease in $j_{ax}$ is tolerable. even a small change in the wrong direction causes the whole solution to collapse. of course, simultaneous changes of both values in their acceptable directions are possible without a collapse of the integration – these are the compensatory variations of [15]. nevertheless, they lead to wrong predictions: the wrong "classical" results from fig. 11 are a consequence of such compensated deviations from the proper values without a collapse of the solution, as demonstrated in fig. 13.
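the repeated integrations behind fig. 16 can be sketched as follows – again an illustration only, re-using the rhs function and the integration set-up from the earlier sketches; the end errors are read off at the outer edge of the integration region:

import numpy as np
from scipy.integrate import solve_ivp

def end_errors(E_ax, j_ax, eta_end=0.8):
    """integrate system (32) from the axis; return the end errors u_end, E_end."""
    deta = 1e-4
    y0 = np.array([0.0, 1.0, E_ax, 0.0, 0.0, j_ax, 0.0])
    sol = solve_ivp(rhs, (deta/4, eta_end), y0, max_step=deta)
    return sol.y[1, -1], sol.y[2, -1]          # u and E outside the jet; ideally zero

# scan small perturbations of the axis boundary conditions (cf. fig. 16);
# integrations perturbed in the "wrong" direction may collapse, as discussed above:
for dE in (-1e-4, 0.0, 1e-4):
    print(dE, end_errors(0.07614 + dE, 0.1912))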
12 conclusions

1) a general solution was derived for the axisymmetric submerged jet flow with a two-equation model of turbulence, valid for any conditions (but excluding, as usual, the effects of flow development and intermittence). with the exception of $\mathrm{Pr}_\varepsilon$ – which it is advisable to insert slightly larger than the standard value – the solution uses the standard values of the model constants.

2) the solution is based upon the similarity transformation approach, whereby the governing partial differential equations are transformed into a set of seven ordinary differential equations (32). their integration is straightforward and presents no stability or stiffness problems.

3) the problem has split boundary conditions: of the seven required conditions at the starting point, only five are known. the success then depends critically upon the proper insertion of the two unknown boundary conditions at the starting point of the integration – the knowledge obtained in the earlier solution [2] with a simpler, one-equation model was of substantial help here.

4) the profiles of velocity and other variables computed from the present solution are in excellent agreement with the available experimental data – a new result, however, is that especially some parameters of turbulence require a substantially longer downstream distance from the nozzle to attain a fully developed condition than was previously considered necessary.

fig. 15: meaning of the solution error in the sensitivity analysis of the present solution: deviations of the solution for some of the basic nondimensional quantities (velocity and kinetic energy of fluctuations) from their correct zero value outside the jet.

fig. 16: results of the sensitivity analysis: dependence of the profile end values (outside the jet, fig. 15) on changes of the boundary conditions on the jet axis.

5) the main advantage of the two-equation model is the internal closedness of the solution, which – in contrast to earlier solutions such as [3] or [2] – is not dependent upon a priori assumptions or information from experimental data. otherwise the resultant profiles of many variables (such as velocity) are almost coincident with the earlier one-equation solution results obtained with a constant length scale $\Lambda$, indicating that this assumption is a rather good approximation.

6) a stain upon the reputation of the two-equation model has been its alleged inability to predict the proper spreading rate of axisymmetric jets. the present analysis shows convincingly that this "spreading rate anomaly" is a fiction: there is nothing wrong with the model; the erroneous results have been merely a consequence of wrongly inserted numerical values of the boundary conditions.
7) the system of equations describing the turbulent axisymmetric jet with turbulence modelled by the two-equation model is extremely sensitive (in fact, the performed sensitivity analysis shows a theoretically infinite sensitivity) to the values of the inserted constants and boundary conditions, which must be used with at least four- (and preferably five-) digit accuracy to obtain proper results.

appendix: computer program

a listing of the program tes-axi2eq, written in the basic language, which was used for the numerical solution of the seven ordinary differential equations obtained by the similarity transformation, is included here to make possible a verification of the author's results and/or an evaluation of other variables from the present solution.

10 cls:screen 9:window screen (0,0)-(640,320)
20 print “ program ,axij-2eq'”
30 print “ — — — — — — — — — — —
40 print “ this program solves the turbulent, axially”
50 print “ symmetric submerged jet flow in similarity”
60 print “ transformed form, using two-equation model”
70 print “ of turbulence in high reynolds number version”
80 print “
90 print “ written by: v. tesar, cvut praha, 1994”
100 print “ language : basic”
110 input “ ”, dummy:cls
120 def fn f1(f,u,eps,g,n,j,s)=u*eta
130 def fn f2(f,u,eps,g,n,j,s)=g
140 def fn f3(f,u,eps,g,n,j,s)=n
150 def fn f4(f,u,eps,g,n,j,s)=g*(s/j-2*n/eps-1/eta)-j*(f*g/eta+u*u)/(eps*eps)
160 def fn f5(f,u,eps,g,n,j,s)=n*(s/j-2*n/eps-1/eta)-g*g-j*(2*u*eps+n*f/eta-j)/(eps*eps)
170 def fn f6(f,u,eps,g,n,j,s)=s
180 def fn f7(f,u,eps,g,n,j,s)=s*(s/j-2*n/eps-1/eta)-pre*j*(s*f/eta+4*j*u)/(eps*eps)-j*pre*(cu*g*g-cy*j*j/(eps*eps))/eps
190 for ver=0 to 1 step .1
200 line (2,300-430*ver)-(600,300-430*ver)
210 next ver
220 for hor=-1 to 1 step .2
230 line (200+320*hor,300)-(200+320*hor,0)
240 next hor
250 deta=.0001
260 eta=deta/4
270 pre=1.38012
280 kappa=.41
290 sqc=.3
300 cnu=sqr(sqc):cz=cnu*cnu*cnu
310 cqs=1/sqc
320 cu=1.44
330 cy=cu+kappa*kappa*cqs/pre
340 cks=9.108401
350 ks=cz/cks
360 f=0: print “ f = 0”
370 u=1: print “ u = 1”
380 g=0: print “ g = 0”
390 n=0: print “ n = 0”
400 s=0: print “ s = 0”
410 eps=7.609533e-2: print “ eps = ”;eps
420 j=cks*eps^1.5 : print “ j = ”;j
430 for i=1 to 10000
440 a1=deta*fn f1(f,u,eps,g,n,j,s)
450 b1=deta*fn f2(f,u,eps,g,n,j,s)
460 c1=deta*fn f3(f,u,eps,g,n,j,s)
470 d1=deta*fn f4(f,u,eps,g,n,j,s)
480 e1=deta*fn f5(f,u,eps,g,n,j,s)
490 f1=deta*fn f6(f,u,eps,g,n,j,s)
500 g1=deta*fn f7(f,u,eps,g,n,j,s)
510 a2=deta*fn f1(f+a1/2,u+b1/2,eps+c1/2,g+d1/2,n+e1/2,j+f1/2,s+g1/2)
520 b2=deta*fn f2(f+a1/2,u+b1/2,eps+c1/2,g+d1/2,n+e1/2,j+f1/2,s+g1/2)
530 c2=deta*fn f3(f+a1/2,u+b1/2,eps+c1/2,g+d1/2,n+e1/2,j+f1/2,s+g1/2)
540 d2=deta*fn f4(f+a1/2,u+b1/2,eps+c1/2,g+d1/2,n+e1/2,j+f1/2,s+g1/2)
550 e2=deta*fn f5(f+a1/2,u+b1/2,eps+c1/2,g+d1/2,n+e1/2,j+f1/2,s+g1/2)
560 f2=deta*fn f6(f+a1/2,u+b1/2,eps+c1/2,g+d1/2,n+e1/2,j+f1/2,s+g1/2)
570 g2=deta*fn f7(f+a1/2,u+b1/2,eps+c1/2,g+d1/2,n+e1/2,j+f1/2,s+g1/2)
580 a3=deta*fn f1(f+a2/2,u+b2/2,eps+c2/2,g+d2/2,n+e2/2,j+f2/2,s+g2/2)
590 b3=deta*fn f2(f+a2/2,u+b2/2,eps+c2/2,g+d2/2,n+e2/2,j+f2/2,s+g2/2)
600 c3=deta*fn f3(f+a2/2,u+b2/2,eps+c2/2,g+d2/2,n+e2/2,j+f2/2,s+g2/2)
610 d3=deta*fn f4(f+a2/2,u+b2/2,eps+c2/2,g+d2/2,n+e2/2,j+f2/2,s+g2/2)
620 e3=deta*fn f5(f+a2/2,u+b2/2,eps+c2/2,g+d2/2,n+e2/2,j+f2/2,s+g2/2)
630 f3=deta*fn f6(f+a2/2,u+b2/2,eps+c2/2,g+d2/2,n+e2/2,j+f2/2,s+g2/2)
640 g3=deta*fn f7(f+a2/2,u+b2/2,eps+c2/2,g+d2/2,n+e2/2,j+f2/2,s+g2/2)
650 a4=deta*fn f1(f+a3,u+b3,eps+c3,g+d3,n+e3,j+f3,s+g3)
660 b4=deta*fn f2(f+a3,u+b3,eps+c3,g+d3,n+e3,j+f3,s+g3)
670 c4=deta*fn f3(f+a3,u+b3,eps+c3,g+d3,n+e3,j+f3,s+g3)
680 d4=deta*fn f4(f+a3,u+b3,eps+c3,g+d3,n+e3,j+f3,s+g3)
690 e4=deta*fn f5(f+a3,u+b3,eps+c3,g+d3,n+e3,j+f3,s+g3)
700 f4=deta*fn f6(f+a3,u+b3,eps+c3,g+d3,n+e3,j+f3,s+g3)
710 g4=deta*fn f7(f+a3,u+b3,eps+c3,g+d3,n+e3,j+f3,s+g3)
720 f=f+(a1+2*(a2+a3)+a4)/6
730 u=u+(b1+2*(b2+b3)+b4)/6
740 eps=eps+(c1+2*(c2+c3)+c4)/6
750 g=g+(d1+2*(d2+d3)+d4)/6
760 n=n+(e1+2*(e2+e3)+e4)/6
770 j=j+(f1+2*(f2+f3)+f4)/6
780 s=s+(g1+2*(g2+g3)+g4)/6
790 eta=eta+deta
800 if eta>.6 then 1000
810 xx=300-430*eta
820 lamb=(cz*eps^1.5)/(ks*j)
830 uu=200+320*u
840 ee=200+3200*eps
850 gg=200+32*g
860 nn=200+320*n
870 ss=200+320*s
880 jj=200+320*j
890 ll=200+320*lamb
900 circle (uu,xx),8
910 circle (gg,xx),4
920 circle (ee,xx),6
930 circle (nn,xx),4
940 circle (jj,xx),2
950 circle (ll,xx),3
960 prod=g*g*eps*eps/j
970 pp=200+1600*prod
980 circle (pp,xx),1
990 next i
1000 input “ ”,dummy:end

the program has only 100 lines – of which a substantial proportion have only an auxiliary role: for example, lines 10–110 serve only to print the identification heading. lines 120 to 180 contain the definitions of the seven solved equations, eq. (32). lines 190 to 240 draw the grid used for plotting the resultant profiles. in line 250, the small stepsize deta = Δη is inserted, and in line 260 the starting radial position. the following lines 270 to 420 serve to insert the constants and boundary conditions of the solution. in the present form, the program runs with the values of the solution parameter cks = $c_z/(k\,s)$ and of the axis boundary conditions as evaluated for trüpel's jet; adjustment to different experimental data is possible by changes in line 340 (the parameter cks) and in lines 410 and 420 (the axis boundary conditions). the actual solution begins in line 430. it is performed in a loop (loop parameter i), which closes at line 990. the integration method used is the standard runge-kutta fourth-order algorithm, written in the repeated groups of seven lines corresponding to the seven similarity variables: f = $f$, u = $u$, eps = $\mathcal{E}$, g = $g$, n = $n$, j = $j$ and s = $z$.
line 790 then advances the solution along the transverse co-ordinate eta = $\eta$. the solution is terminated, in line 800, if the value of this similarity transverse co-ordinate becomes larger than $\eta$ = 0.6. the remaining lines 810 to 980 plot the results in a diagrammatic form. among the plotted variables there are also the transformed turbulence production rate prod = $g^2\mathcal{E}^2/j$ and the ratio of the turbulence length scales $\lambda$.

list of symbols:

$c_z$ [1] turbulence dissipation rate coefficient
$c_\nu$ [1] eddy viscosity coefficient
$c_u$ [1] turbulence dissipation rate generation coefficient
$c_y$ [1] turbulence dissipation rate destruction coefficient
$d$ [m] nozzle exit diameter
$e_f$ [m²/s²] specific energy of turbulent fluctuations
$f$ [1] relative stream function
$g$ [1] relative transverse gradient of velocity
$j$ [1] relative dissipation rate of turbulence
$j_{ax}$ [1] relative dissipation rate on the jet axis
$k$ [1] coefficient of the integral length growth
$\Lambda$ [m] turbulence length scale
$\Lambda_{1eq}$ [m] one-equation model turbulence length scale
$\Lambda_{2eq}$ [m] two-equation model turbulence length scale
$m$ [m²/s] coefficient of the maximum velocity
$n$ [1] relative transverse gradient of fluctuation energy
$\mathscr{P}$ [m²/s³] turbulence production rate
$\mathrm{Pr}_\varepsilon$ [1] prandtl number of dissipation rate transport
$r$ [1] auxiliary variable
$\mathrm{Re}_t$ [1] turbulence reynolds number
$s$ [1] jet diameter growth coefficient
$s_{0.5}$ [1] diameter growth coefficient related to $\delta_{0.5}$
$u$ [1] relative velocity
$u_{0.8}$ [1] velocity error at the boundary
$u$ [m²/s⁴] generation of turbulence dissipation rate
$w_m$ [m/s] maximum velocity in the profile
$w_t$ [m/s] characteristic velocity of turbulent motions
$\mathbf{w}$ [m/s] velocity vector
$\mathbf{w}$ [m/s] time-mean velocity vector
$w_1$ [m/s] time-mean axial velocity component
$w_2$ [m/s] time-mean transverse (radial) velocity component
$x_1$ [m] axial (streamwise) distance
$x_2$ [m] radial (transverse) distance
$y$ [m²/s⁴] destruction rate of turbulence dissipation
$z$ [1] relative transverse gradient of turbulence dissipation rate
$\alpha$ [1] exponent of velocity decrease
$\beta$ [1] exponent of the jet diameter growth
$\delta$ [m] transverse dimension of shear region
$\delta_{0.5}$ [m] conventional jet diameter
$\mathcal{E}$ [1] relative energy of fluctuation
$\mathcal{E}_{ax}$ [1] relative fluctuation energy on the jet axis
$\mathcal{E}_{0.8}$ [1] energy boundary condition error
$\varepsilon$ [m²/s³] turbulence dissipation rate
$\kappa$ [1] von kármán constant
$\lambda$ [1] ratio of turbulence length scales
$\eta$ [1] similarity transverse co-ordinate
$\eta_{0.5}$ [1] value of $\eta$ at the position where $u$ = 0.5
$\eta_2$ [1] relative transverse distance, $2x_2/\delta_{0.5}$
$\nu_\varepsilon$ [m²/s] gradient diffusion coefficient of turbulence dissipation rate
$\nu$ [m²/s] viscosity
$\nu_t$ [m²/s] eddy viscosity
$\psi$ [m²/s] stream function
$\nabla$ [1/m] vector of spatial differentiation operators
$\partial_1$ [1/m] longitudinal gradient operator
$\partial_2$ [1/m] transverse gradient operator
$\nabla$ [1] operator of differentiation with respect to $\eta$
$[\mathrm{P}\,\nu_t]$ [1/s] prandtl transport operator

the nomenclature as well as the form of the equations closely follows the usage in the textbook [1].

references

[1] tesař, v.: mezní vrstvy a turbulence. (boundary layers and turbulence, in czech.) textbook, čvut praha, publ. by ediční středisko čvut, various editions 1984–1996
[2] tesař, v., šarboch, j.: similarity solution of the axisymmetric jet using the one-equation model of turbulence. acta polytechnica, vol. 37, no. 3/1997, p. 5
[3] tollmien, w.: berechnung turbulenter ausbreitungsvorgänge.
zeitschrift für angewandte mathematik und mechanik, berlin, 1926, vol. 6, p. 468
[4] tesař, v.: solution of the plane turbulent jet. acta polytechnica, čvut praha, vol. 36, no. 3/1996, p. 14
[5] tesař, v.: two-equation turbulence model solution of the plane turbulent jet. acta polytechnica, čvut praha, vol. 35, no. 2/1995, p. 19
[6] tesař, v.: general solutions of turbulent shear flows – their meaning and importance. proc. of colloquium "dynamika tekutin '95", institute of thermomechanics, czech academy of sciences, praha, october 1995, p. 63
[7] tesař, v.: advanced model similarity solutions of basic turbulent shear flows. proc. of workshop '96, seminar ctu–vut, brno, january 1996
[8] tesař, v.: similarity solutions of basic turbulent shear flows with one- and two-equation models of turbulence. zeitschrift für angewandte mathematik und mechanik, 1997, sup. 1, bd. 77, p. 333
[9] tesař, v.: similarity solution of axisymmetric turbulent jet using the two-equation turbulence model. proceedings of workshop '97, sixth university-wide seminar, čvut praha, january 1997, vol. ii, p. 461
[10] tesař, v.: the problem of round jet spreading rate computed by two-equation turbulence model. proc. of conf. "engineering mechanics '97", svratka, may 1997, p. 181
[11] wilcox, d. c.: turbulence modelling for cfd. dcw industries, la cañada, usa, 1993, 1994
[12] pope, s. b.: an explanation of the turbulent round-jet/plane-jet anomaly. aiaa journal, vol. 16/1978, p. 279
[13] rajaratnam, n.: turbulent jets. developments in water science 5, elsevier scientific publ. comp., amsterdam, 1976
[14] tesař, v.: analýza citlivosti dvourovnicového modelu turbulence – variace konstant u rovinného zatopeného proudu. (sensitivity analysis of the two-equation model of turbulence – variations of constants in the case of a plane turbulent jet, in czech.) proc. colloq. dynamika tekutin '97, isbn 80-85918-29-3, út av čr, praha, october 1997, p. 47
[15] tesař, v.: compensatory variations of constants in jet solutions with two-equation turbulence model. proc. of workshop '98, seventh university-wide seminar, čvut praha, february 1998
[16] tesař, v.: zajímavá zákonitost pro turbulenci v zatopeném tekutinovém proudu. (an interesting relation for turbulence in submerged jets, in czech.) proc. of seminar aktuální problémy mechaniky tekutin '96, publ. by inst. of thermomechanics cas, praha, february 1996, p. 39
[17] townsend, a. a.: the structure of turbulent shear flow. cambridge university press, 1976

prof. ing. václav tesař, csc.
department of chemical and process engineering
the university of sheffield
mappin street, sheffield s1 3jd, united kingdom
phone: +44 (0) 114 222 7551
fax: +44 (0) 114 222 7501
e-mail: v.tesar@sheffield.ac.uk
on leave from: department of fluid mechanics and thermodynamics, faculty of mechanical engineering, czech technical university in prague, technická 4, 166 07 praha 6, czech republic
class representation of shapes using qualitative-codes

ashraf fouad hafez ismail

abstract

this paper introduces our qualitative shape representation formalism, devised to overcome, as we have argued, the class abstraction problems created by numeric schemes. the numeric shape representation method used in conventional geometric modeling systems reveals difficulties in several aspects of architectural designing. firstly, numeric schemes strongly require complete and detailed information for any simple task of object modeling. this requirement of information completeness makes it hard to apply numeric schemes to shapes in sketch-level drawings, which are characteristically ambiguous and have non-specific limitations on shape descriptions. secondly, cartesian coordinate-based quantitative shape representation schemes show restrictions in the tasks of shape comparison and classification, which inevitably involve abstract concepts related to shape characteristics. one of the reasons why quantitative schemes are difficult to apply to the abstraction of individual shape information into its classes and categories is the uniqueness property, meaning that an individual description in a quantitative scheme should refer to only one object in the domain of representation. a class representation, however, should be able to indicate not only one object but also a group of objects sharing common characteristics. thirdly, it is difficult or inefficient to apply numeric shape representation schemes based on the cartesian coordinate system to preliminary shape analysis and modeling tasks because of their emphasis on issues such as detail, completeness, uniqueness and individuality, which can only be assessed in the final stages of designing. therefore, we face the need for alternative shape representation schemes that can handle class representation of objects in order to manage shapes in the early stages of designing. we consider a shape as a boundary description consisting of a set of connected and closed lines. moreover, we need to consider non-numeric approaches to overcome the problems caused by quantitative representation approaches. this paper introduces a qualitative approach to shape representation that is contrasted with conventional numeric techniques. this research is motivated by ideas and methodologies from related studies such as qualitative formalism ([4], [6], [19], [13], [31]), qualitative abstraction [16], qualitative vector algebra ([7], [32]), qualitative shapes ([18], [23], [21]), and coding theory ([20], [25], [26], [1], [2], [3], [22]). we develop a qualitative shape representation scheme by adopting propitious aspects of the above techniques to suit the needs of our shape comparison and analysis tasks. the qualitative shape-encoding scheme converts shapes into systematically constructed qualitative symbols called q-codes. this paper explains how the q-code scheme is developed and applied.

keywords: class representation, shape, qualitative-codes, scheme, properties of qualitative values, linguistic analogy.

1 overview of q-code schemes: principles

1.1 general principles: key-ideas of qualitative shape encoding

when we look at a complex image, we do not see every detail of the shape in order to recognize the image; rather, we see and identify the characteristic features and particular configurations and register them in our memory [30]. the q-code scheme looks at a shape and encodes only shape characteristics. the following general principles of shape encoding illustrate the key ideas for understanding the construction of qualitative shape encoding schemes.

the first principle of a q-code scheme is the encoding of shape characteristics on particular nodes where qualitative changes occur. the system looks at the nodes on the shape contour and captures particular distinctive geometric characteristics around the nodes. on each singular node, a range or landmark value of a particular design quality or shape attribute is abstracted into a single symbol, a q-code, in such a way that any value in the range is considered to be an equivalent qualitative value of the particular attribute. consequently, we get a finite number of q-codes for a shape as a symbolic representation of that shape. each q-code is constructed in such a way that particular shape characteristics are abstracted into a combination of symbols in order to encapsulate a geometric phenomenon in terms of attribute class and value range. the shape characteristics captured on singular nodes greatly simplify the shape information in the representation whilst containing the essential ingredients for discriminating the idiosyncrasies of a particular shape from the geometric range.

the second principle of a q-code scheme is that a shape is encoded to give an ordered sequence that provides a descriptive structure for the shape contour.
the qualitative description of a shape forms a q-code list as an ordered sequence of symbols in which the head and tail symbols are connected to form a circuit. this symbol order corresponds to the sequence of singular nodes scanned for the geometric characteristics of the shape contour, and this symbol circuit can be used to tell us various shape information: how complicated each shape is, what geometric patterns repeat themselves and how they repeat, and how similar two shapes are to each other.

the third principle of a q-code scheme is discretisation. this means that the encoding of a continuous shape contour yields a set of discrete symbols. it is discrete in the sense that the symbolic value for particular shape qualities captures only one status of the qualitative degree from a value set of finite size, and the symbols are discrete from each other. it also means that only particular points, or in other words only singular nodes, on a contour line are liable to be encoded into q-codes. the discrete property of the qualitative symbolic shape encoding method is one of the reasons why the q-code scheme is effective in symbolic computation such as pattern matching. the discretisation principle also makes it possible to meet the various task requirements of shape representation efficiently.
discrete symbols can take on different levels of abstraction by way of changing the granularity of a symbol in order to respond to various encoding details.

the fourth principle of a q-code scheme is that all the properties of the qualitative value set of this representation scheme comply with qualitative formalism. qualitative formalism indicates a strict conversion process from a range of continuous numeric values into a range consisting of discrete qualitative sign values set by landmarks and intervals. this qualitative formalism will be explained later in this paper. one of the important principles of a q-code scheme is the consistent maintenance of the relativity principle applied in measuring the geometric attributes. qualitative shape representation works well both for absolute and relative measurements, yet the relative measurement of geometric attributes is only possible with a qualitative scheme, whilst absolute measurement is an encoding method that can be accomplished equally by both quantitative and qualitative methods. the q-code scheme looks at the geometric characteristics occurring between, or on, the singular nodes of a shape contour and converts them into discrete symbols. the encoding system detects the singular nodes and transforms the numeric values of particular shape attributes into discrete q-codes that are formed as a combination of a character and sign values.

1.2 qualitative values and their properties

the study of qualitative physics and qualitative reasoning suggests a rigorous formalism for setting qualitative sign values for a specific quality in mechanical modeling ([4], [29], [31]). the essence of qualitative formalism lies in setting landmark values on the continuous range of particular numeric values and abstracting the intervals with range values in terms of signs (−, 0, +). landmarks stand for critical points where qualitative changes occur. this qualitative formalism thus takes a specific technique for the method of setting landmarks on the numeric range.

properties of qualitative values

the qualitative values that are constructed from qualitative formalism correspond to numeric values whilst showing different aspects from the quantitative values. some of the properties of qualitative values are as follows [31]:

• coverage of a quantitative value range: the qualitative values should cover all of the quantitative value range for the phenomena of interest.
• finite number of values: the number of qualitative values should be finite.
• quantitative-qualitative interpretation: it should be possible to interpret quantitative results with qualitative values (where such quantitative results exist).
• exclusiveness of a qualitative value: qualitative values should not overlap with other qualitative values in terms of the corresponding quantity space.
• granularity of a qualitative value: when quantitative values represent similar attributes of objects, then the qualitative values should be similar to a certain degree.
• order of a qualitative value: qualitative values should be ordered.

in addition to the properties mentioned above, there are the following additional properties:

• polarity: instead of having an absolute scale of measurement as in quantity space, qualitative values are distinguished by the relative scale of two extreme values of polarity, and
• discrete sign values: polarizing values into two contrasting qualities is symbolically denoted using sign values. the degree of discretisation can further be distinguished by a more detailed granularity, which is indicated by multiple sign values.

qualitative value setting and q-code primitives

the q-code scheme developed here represents shape contours using four sets of q-codes that capture the most basic shape attributes. the four basic shape attributes are vertex angles, relative lengths, curvature, and convexity of edges, and their numeric values are converted to q-codes based upon relativity to landmarks. q-codes are assigned to a set of particular nodes on a shape contour. the qualitative value for a vertex angle is assigned on a vertex node, whilst length and curvature changes measured from a comparison of adjacent edges are assigned to singular nodes where the qualitative change occurs. figure 1 illustrates the nodes where qualitative values are assigned.

fig. 1: nodes for qualitative value assignments

defining a set of discrete q-codes for shape attributes traces the following process:

step 1: from the given continuous numeric value range of a particular shape attribute, landmarks are set at the significant value points. the landmark value set $l$ is defined by a number of landmarks as $l = \{l_{-1}, l_0, l_1\}$. for example, for a given numeric value $n$ whose range is $0 \le n \le 1$, or [0, 1], three landmark points can be assigned: $l_1 = \{0, 0.5, 1\}$.

step 2: finite numbers of ranges are assigned to the intervals and points that are discrete from each other. the interval and point value set $i$ is defined by a number of ranges as $i = \{[l_{-1}, l_{-1}], (l_{-1}, l_0), [l_0, l_0], (l_0, l_1), [l_1, l_1]\}$, where "[ ]" means a range inclusive of its limits and "( )" means a range exclusive of its limits. for example, the interval set $i_1$ for the above landmark set $l_1$ is $i_1 = \{[0, 0], (0, 0.5), [0.5, 0.5], (0.5, 1), [1, 1]\}$.

step 3: qualitative values are set for the intervals and landmarks. the qualitative value set $q$ consists of the symbols assigned to the interval value sets, as $q = \{q_{-2}, q_{-1}, q_0, q_1, q_2\}$. the qualitative value set $q_1$ for the interval set $i_1$ is $q_1 = \{1, 2, 3, 4, 5\}$.
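steps 1–3 amount to a small lookup procedure; a python sketch (names chosen freely, following the l1/i1/q1 example above) could look like this:

def qualitative_value(n, landmarks, q_values):
    """map a numeric value n to a qualitative value (steps 1-3).

    q_values covers, in order, the alternating landmark points and open
    intervals: [l0,l0], (l0,l1), [l1,l1], (l1,l2), [l2,l2], ...
    """
    if not landmarks[0] <= n <= landmarks[-1]:
        raise ValueError("n lies outside the covered quantitative range")
    for k, lm in enumerate(landmarks):
        if n == lm:
            return q_values[2 * k]          # a landmark point such as [0.5, 0.5]
        if n < lm:
            return q_values[2 * k - 1]      # an open interval such as (0, 0.5)

l1 = [0, 0.5, 1]                 # landmark set of step 1
q1 = [1, 2, 3, 4, 5]             # qualitative value set of step 3
print(qualitative_value(0.3, l1, q1))   # -> 2, i.e. the interval (0, 0.5)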
when a significant value point is selected as a landmark for a given range of numeric values, it results in two intervals and a point, such that the landmark point separates two groups of ranges: value ranges smaller and bigger than the landmark. these two contrasting ranges necessarily construct a qualitative distinction across the landmark value. in other words, a landmark results in two symbolic polar values around the point, and this relation is best represented with sign values, {−, 0, +}, each of which can be literally interpreted as having a weaker, medium, or stronger property for a particular design quality. the simple and basic polarization can be further specified into detailed intervals. more landmarks on the value range mean higher granularity, and their corresponding sign values can be, for example, {−−, −0, −+, 0, +−, +0, ++}, where {−0, 0, +0} refer to the landmark points whilst {−−, −+, +−, ++} refer to the intervals. these landmarks and intervals can be literally interpreted as very weak, medium weak, slightly weak, medium, slightly strong, medium strong and very strong with regard to a particular design quality. the qualitative value setting formalism can be applied to most design values provided with a corresponding quantitative value range.

many design simulation tasks handle a group of design properties and attributes in terms of variables and values. design properties and attributes can basically be grouped into three categories: those regarding function, behavior, or structure [10]. design properties in the structure domain are represented with variables whose values are straightforwardly measurable in a quantitative way. those design properties in the structure domain contain variables such as, considering an object in terms of its shape and material:
• shape: size, geometry, spatial position and relation.
• material: color, texture, and weight.

amongst the various structural domains of design, the q-code scheme takes shape attributes for 2-d shape contours. a continuous array of nodes and edges forms a 2-d shape. there are two kinds of nodes: singular and regular nodes [15]. two singular nodes define an edge, and in between there are continuous traces of regular nodes. defining a shape contour with a set of finite nodes and edges has the following benefits:
• discrete description: a finite set of discrete elements separates finite segments from a continuous shape contour line.
• segmentation of shape characteristics: it is possible to conceive shape contours as segments and to compare the value of particular shape characteristics or properties to that of the neighboring segments.
• value assignment on nodes: either rectilinear or curvilinear edge segments are defined by nodes. it is thus sufficient that values describing shape characteristics can be collected from the nodes. shapes are described with all the values assigned to nodes.
• quantitative-qualitative correspondence: node-based shape representation is applicable either to numeric or qualitative schemes, and a set of ranges for numeric values can be assigned to nodes, which can then be converted to qualitative symbolic codes.

amongst the various shape attributes, four are considered the most basic because they are concerned with the definition of lines, and a shape is defined as a continuous and connected set of lines. these four shape attributes are:
• vertex angle between two edges: shape attribute measuring the inner angle at a vertex.
• length of edge: shape attribute measuring the trace between two nodes.
• curvature of edge: shape attribute describing how an edge is curved.
• convexity of edge: shape attribute describing the convex or concave property of an edge.

four q-code sets have been developed for the qualitative representation of these attributes: these are a-, l-, k- and c-codes for the vertex angle, the relative lengths of edges, the curvature of an edge, and the convexity of an edge, respectively.

the relative measure of magnitude

one of the advantages of the qualitative shape representation scheme is that it provides a comparative and relative way of measuring the magnitude of various kinds of geometric values. the magnitudes of geometric attributes considered in this q-code scheme are: angle, length, curvature and convexity. q-codes capture those magnitudes in a relative manner so that qualitative distinctions can be made amongst the shapes. q-codes are assigned to every vertex by comparing how much smaller or bigger the magnitude change occurring on the vertex is. one q-code contains two parts: a character symbolizing the shape attribute that the q-code is encoding, and a combination of sign values, as below:

q = (char)(sign value)

the a-code set

the basic q-code is the a-code, which covers the angle attribute, so that any critical change of angular magnitude is captured with this q-code. the major angular change occurs on a node (vertex) where two curvilinear or rectilinear edges meet. the angular landmark is initially set to the angle π, which distinguishes a convex angle from a concave angle. the scanning order for the nodes on a shape contour is set to be counter-clockwise, and the magnitude of the vertex inner angle is measured in a clockwise direction between two edges. the vertex angle for curvilinear edges is measured to, or from, the tangential line on that vertex. figure 2 shows how vertex inner angles are measured for either rectilinear or curvilinear edges.

fig. 2: illustration of angle measurements at nodes

the numeric value range n for the magnitude of a node angle is 0 ≤ n ≤ 2π. within this finite numeric value range, landmarks are set at the points of particular geometric importance. the following are the sets of landmarks for vertex angles, from the simplest granularity:
• mid-point of the value range: the value point π, along with the two limits 0 and 2π, forms the important landmark that divides the numeric value range into two: acute angles (smaller than π) and obtuse angles (bigger than π). the landmark set is then l = {0, π}.
• quarter-points of the value range: right angles are particularly important to designers. thus multiples of the right angle in the value range make important landmarks. this set of landmarks divides the range into four sub-ranges along with four point values. the landmark set is now l = {0, π/2, π, 3π/2}.

the landmark set determines the interval set, dividing the numeric value range into a finite number of intervals and points. the interval sets for each landmark set are as follows (see figure 3):
• the first interval set: ia1 = {[0, 0], (0, π), [π, π], (π, 2π)}.
• the second interval set ia2, with a higher granularity: ia2 = {[0, 0], (0, π/2), [π/2, π/2], (π/2, π), [π, π], (π, 3π/2), [3π/2, 3π/2], (3π/2, 2π)}.

the a-code set is assigned to the interval elements as follows:
• the first a-code (ax-code) set: qa1 = {anil, a−, a0, a+}.
• the second a-code (axx-code) set: qa2 = {anil, a−−, a−0, a−+, a0, a+−, a+0, a++}.

fig. 3: interval sets for 1st and 2nd level a-codes ((a) 1st interval set; (b) 2nd interval set)

the a-code is constructed with the character "a" denoting angle and the sign values {−, 0, +}. the sign value "nil" is assigned to the landmark point [0, 0], even though there is usually no practical angle measurement for zero magnitude on a vertex. a q-code value denoting a range is set by the relativity principle, such that a geometric value is compared to that of the landmarks and signs are assigned to the nodes accordingly. this indicates the relative magnitude of the angle with respect to the landmarks. the a-codes in the set {a−, a0, a+} are literally interpreted as angles having a smaller, equal or bigger magnitude than the landmark point π. consequently, the a-code tells us about the relative angle at the landmarks. this relative angular measurement is different from the absolute numeric measurement of angles. it is only concerned with the relative difference of magnitude compared to the critical value point. the ax-code tells us whether it is a convex or a concave angle, so that one single q-code can represent all the possible numeric values, for example, for the convex angle, as the axx-code does for the acute or obtuse angles. this abstract representation is the advantage of this qualitative scheme, because a single q-code can indicate numerous possible individuals with common shape characteristics.

the l-code set

a q-code that captures the geometric characteristics of edge length is labeled an l-code. the l-code takes the value for a relative length rather than an absolute length. relative length means that the lengths of two adjacent edges are compared with each other, resulting in a measurement of the change in length magnitude on a vertex. a comparison of the lengths of two adjacent edges considers two different nodes and one shared node from two neighboring edges, as shown in figure 4. the comparison of length magnitudes results in a relative measurement, which is indicated by the signs {−, 0, +}. landmarks are set to the points so that a qualitative distinction between two adjacent edges is possible. there can be no fixed numeric points for l-code landmarks. what matters is whether an edge is shorter or longer compared to the previous one. as geometric characteristics are scanned node by node in a counter-clockwise direction, the magnitude of the previous edge becomes the landmark point for the next one. thus landmarks and intervals are set each time a new edge is compared.

fig. 4: illustration of length comparisons of two adjacent edges

the following are two sets of landmarks for the relative length attribute:
• identical point landmarks: a landmark is set to the numeric point indicating a magnitude identical to the previous edge length. this landmark provides the ratio in order to distinguish the relative difference under the labels of smaller and bigger. thus the l-code measures the difference in lengths between the previous edge and the current edge. if there is a variable d measured as d = current edge length − previous edge length, with d ∈ {positive, zero, negative}, then the identical point landmark l1 is set to the numeric point where d = 0.
• half and double point landmarks: when the qualitative distinction of smaller or bigger does not provide sufficient detail, the l-code can be extended to a higher granularity so as to distinguish, for example, much smaller and slightly smaller within the label smaller. figure 5 shows the landmarks and sign values for the l-code.

fig. 5: illustration of intervals and signs for l-codes
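the first-level a-code and the basic l-code comparison described above reduce to threshold tests against the landmarks; the following is a minimal sketch (our illustration, not the authors' implementation), with angles in radians and function names of our own choosing.

```python
import math

# sketch: first-level a-code from the vertex inner angle, landmark set l = {0, pi}
def a_code(angle):
    if angle == 0:
        return "a_nil"                 # landmark point [0, 0]
    if angle < math.pi:
        return "a-"                    # smaller than the landmark pi (convex angle)
    if angle == math.pi:
        return "a0"                    # landmark point [pi, pi]
    return "a+"                        # bigger than pi (concave angle)

# sketch: first-level l-code from d = current - previous edge length (landmark d = 0)
def l_code(prev_len, curr_len):
    d = curr_len - prev_len
    return "l-" if d < 0 else ("l0" if d == 0 else "l+")

print(a_code(math.pi / 2), l_code(4.0, 2.5))   # -> a- l-
```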
the interval sets are determined by the first and second landmark sets as follows:
• first interval set: i1 = {(−∞, 0), [0, 0], (0, +∞)};
• second interval set: i2 = {(−∞, −1), [−1, −1], (−1, 0), [0, 0], (0, 1), [1, 1], (1, +∞)}.

l-codes are assigned to those interval elements:
• l-code set ql1 = {l−, l0, l+}, each of which may be interpreted with the labels {smaller, equivalent, bigger} in terms of length.
• l-code set ql2 = {l−−, l−0, l−+, l0, l+−, l+0, l++}.

figure 6 shows an illustration of a- and l-code encoding of an architectural shape.

fig. 6: illustration of q-code encoding

the k-code set

curvilinear edges differ from rectilinear edges in two ways: changes of curvature and convexity. curvature and convexity determine the geometric characteristics of a curvilinear edge in terms of the direction and configuration of the curves. specific q-codes are thus designed to describe the qualitative difference in the change of curvature and convexity for a finite number of curve segments. a curve segment is an edge bounded by two singular nodes, and the curvature k at a node of the curve is measured as the inverse of the radius r of the curve, i.e., 1/r [15]. the curvature of any node on a straight-line edge is 0. a critical change of curve segments occurs on singular nodes. figure 7 (a) is an example of a regular node, whilst (b), (c) and (d) are examples of singular nodes: (a) regular point; (b) inflection point; (c) 1st cusp; (d) 2nd cusp [15].

fig. 7: illustration of regular and singular nodes on curves

the first type of singular node is an inflection point, on which the point on the curve continues in the old direction whilst the tangential image reverses its direction. the second type of singular node is the 1st cusp, on which the point on the curve reverses its direction whilst the image continues in the old direction. the last type of singular node is the 2nd cusp, on which the point on the curve and the tangential image both reverse their direction [15]. since major changes of curve properties occur on these singular nodes, q-codes are assigned on these nodes to represent the curve characteristics in terms of curvature and convexity. curvature is concerned with the magnitude of a particular curve property, whilst convexity concerns the direction of the curve property. the relative change of curvature between two curves is represented with k-codes, which capture neither the change of convexity nor the change of direction of a contour line, but only the change of curvature. every regular node, with no change in the direction and the image of the tangential line, has a k0 code, which means the curve segment has no curvature change. however, regular nodes are not the subjects of q-code encoding. singular nodes with directional change, but no change in the tangential image regardless of reflection, are also encoded with k0.
only those singular nodes with a discrete change in their curvature will be coded with the other k-codes carrying "−" and "+" signs. figure 8 shows examples of k0 nodes that show no curvature change regardless of either the change in direction or the reflection of the image.

fig. 8: illustration of nodes with k0 codes

as l-codes compare the edge property to a landmark with no fixed numeric value, k-codes are also defined in a similar manner. the qualitative difference of the curvature property is determined by taking the magnitude of this property from the previous edge and comparing it to that of the current edge. thus the landmarks are not set to particular absolute numeric values. they are, however, set as a relative magnitude to the previous edge segment. landmarks are set to points with significant changes in curvature.
• primary landmark: the primary k-code landmark is set to a point with any change of curvature between two adjacent edges. the degree of curvature change at a node, dk, for the curvature k is determined as dk = k(at current node) − k(at previous node). any node with a significant curvature change will have a positive or negative dk value. a point with dk equal to 0 thus marks a landmark point. the landmark set for the first level is l = {0} in terms of the dk value.

the interval sets for each landmark set are as follows:
• interval set: i = {(−∞, 0), [0, 0], (0, +∞)}. a range with a negative or positive value denotes a decrease or an increase of curvature at a node.

k-codes are assigned to each element in the interval set:
• k-code set: qk = {k−, k0, k+}.

figure 9 illustrates the scope of the k+ and k− codes. the dotted lines indicate two equal tangential images to the previous edge, forming the boundary limits of curvature for the k-codes.

fig. 9: illustrations of k-codes
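the k-code assignment reduces to the sign of the curvature change dk; a minimal sketch (our illustration) follows, assuming curvature values k = 1/r are already available at the nodes.

```python
# sketch: k-code from the change of curvature dk at a node (landmark dk = 0)
def k_code(k_prev, k_curr):
    """compare curvatures (k = 1/r, 0 for straight edges) of adjacent segments."""
    dk = k_curr - k_prev
    if dk < 0:
        return "k-"   # curvature decreases at the node
    if dk == 0:
        return "k0"   # no curvature change (regular or reflected node)
    return "k+"       # curvature increases at the node

# a straight edge (k = 0) meeting an arc of radius 2 (k = 0.5)
print(k_code(0.0, 1 / 2))  # -> k+
```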
the c-code set

basic shape descriptions are possible with the three types of q-codes: a-, l- and k-codes. when we deal with curves, however, there is another aspect to consider which cannot be handled with curvature alone. apart from curvature, the convexity of a curve tells us about the direction in which the curve bends. this is captured by a set of c-codes. convexity handles the direction that describes how the curves bend along a contour line. a convex curve contains no straight line that traces the outside of the shape when drawn from the inside of the shape, as shown in figure 10 (b). the opposite is the concave curve, which contains a straight line that traces the outside of the shape when drawn from the inside of the shape, as shown in figure 10 (c).

fig. 10: geometric determination of a curve convexity ((a) a curve with convexity "0"; (b) a curve with positive convexity; (c) a curve with negative convexity)

the convexity of a curve can also be geometrically determined by the change of the tangential line. if the slope of the tangential line does not change as we trace the shape contour in a counter-clockwise direction, then the convexity of the line is 0, meaning a straight line, as in figure 10 (a). when the slope increases during the trace, in other words if the tangential line moves in a counter-clockwise direction, the convexity of the line is positive (convex), as in figure 10 (b). on the contrary, if the slope decreases along the trace, or if the tangential line moves in a clockwise direction along the trace, the convexity of the line is negative (concave), as in figure 10 (c).

observing this, the primary convexity landmark is set to the point with zero convexity. for the slope change of the tangential line, st, measured as the shape contour is traced in a counter-clockwise direction, a landmark is set to the point with st = 0, which distinguishes a convex curve from a concave curve. all curves are classified into three types according to st: a convex curve with positive st, a straight line with st = 0, and a concave curve with negative st. consequently, we have a landmark set l = {0} in terms of st, and an interval set i = {(−∞, 0), [0, 0], (0, +∞)}.

there is a transition from convexity measures to c-codes. since convexity is a measure for an edge and a q-code is assigned to a node, we need a set of rules for converting the convexity measures of two adjacent curves to c-codes. figure 11 shows some of the cases for c-code determination based upon the convexity measure. the conversion of convexity change to a c-code is also shown in the cayley diagram in table 1.

table 1: from convexity measures of two adjacent curves to c-codes
curve 1 \ curve 2 |  −  |  0  |  +
−                 |  c0 |  c+ |  c+
0                 |  c− |  c0 |  c+
+                 |  c− |  c− |  c0

consequently, we have the c-codes, qc = {c−, c0, c+}, assigned to the nodes between two curves. these primary c-codes are only concerned with a critical change of convexity, such as a change between two different signs, so that any trivial change in the convexity measure under the equivalent sign produces a c0 code, as shown in figure 11 and table 1.

fig. 11: illustration of the c-code assignment

a comparison of four q-code sets

we examined four types of q-codes that capture four different shape attributes. when a shape is represented with q-codes, each code captures shape characteristics such as vertex angle, relative length, curvature change and convexity change, and assigns the information to nodes. the primary q-code sets are {anil, a−, a0, a+}, {l−, l0, l+}, {k−, k0, k+} and {c−, c0, c+}. consequently, it is possible to combine these q-codes on a node, with few exceptions. exceptional combinations of c- and k-codes occur at the node with an anil code, as shown in figure 12. there are five possible types of k- and c-code combinations at an anil code: the c− and c+ codes can only correspond to the k+ and k− codes, whilst the c0 code can take all three k-codes. as for the other a-codes, any l-, k- and c-codes can correspond to them, as shown in figure 13. consequently, a node can take one of 96 different combinations of primary a-, l-, k- and c-codes. these 96 different types become the qualitative granules by which the primary q-code scheme distinguishes shape characteristics from the infinite quantitative variations.

fig. 12: exceptional combinations of k- and c-codes at the node with an anil code
fig. 13: k- and c-code combinations for the a-code nodes
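the cayley diagram in table 1 is directly implementable as a lookup; the sketch below (our illustration) takes the convexity signs of two adjacent curves and returns the c-code of the shared node.

```python
# sketch: c-code at a node from the convexity signs of the two adjacent curves
# rows/columns follow table 1: "-", "0", "+" for concave, straight, convex
C_CODE_TABLE = {
    "-": {"-": "c0", "0": "c+", "+": "c+"},
    "0": {"-": "c-", "0": "c0", "+": "c+"},
    "+": {"-": "c-", "0": "c-", "+": "c0"},
}

def c_code(convexity1, convexity2):
    """convexity1/convexity2 are the signs of curve 1 and curve 2."""
    return C_CODE_TABLE[convexity1][convexity2]

print(c_code("0", "+"))  # a straight line followed by a convex curve -> c+
```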
1.3 changing granularity of q-codes for multiple-level descriptions

primary q-codes describe the most basic qualitative distinctions for shape attributes. therefore, the primary sets of q-codes should abstract shape characteristics most efficiently. they should also be able to expand their granularity when more detailed qualitative distinctions are required. as shown in the q-code definitions in this section, the granularity of each set of q-codes varies according to the intervals assigned to their corresponding numeric value ranges bounded by pairs of landmark points. finer granularity produces more detailed q-codes representing smaller ranges of instances. thus granularity and class abstraction are inversely proportional to each other. we need finer granularity on the following occasions:
• a clearer distinction between similar groups of geometric patterns.
• identification of individual differences rather than abstract categorizations.
• acquiring more syntactic patterns under a particular geometric attribute.

changing the granularity of q-codes becomes a matter of changing landmarks and intervals. we use bisection as the generic method. with the bisection method, we can create additional landmarks in the middle of each interval. q-code bisection is illustrated in figure 14. q-code bisection contains the following four steps: landmark determination, interval bisection, sign value determination and q-code creation:
• landmark determination: a single point of the interval is taken as a new landmark point, which is mostly set to the mid-point of the interval. it is sometimes the half point or the double point of a particular measure. a pair of new intervals is set to the left and right ranges along the landmark point.
• interval bisection: the original interval is divided into two sub-ranges by the new landmark point.
• sign value determination: new sign values are assigned to the new intervals and landmark points, whilst the sign value for the existing landmark remains unchanged. new landmark points add "0" to the signs of the existing intervals, whilst the intervals of smaller and bigger values relative to the landmark, on the left- and right-hand sides of the landmark, add an additional "−" or "+" to the sign value of the previous interval.
• q-code creation: consequently, a new set of q-codes is created as the combination of a character symbol and the new sign values.

fig. 14: granularity change by q-code bisection (1st, 2nd and 3rd level sets)

multiple-level description of shape characteristics gives the q-code scheme advantages over other shape representation methods, such as chain coding [8] and qualitative vector algebra ([32], [18]). a shape contour produces several different qualitative geometric descriptions according to the various q-code granularities, providing the most efficient qualitative characteristics of the shape yet maintaining the class representation of common individuals. multiple-level description is necessary for handling special cases of shape representation, as follows:
• synchronous description of several discrete geometric attributes with varying abstractions.
• multi-level resolutions for shape patterns.
• multi-class interpretations of a group of similar shapes.

multiple description of shape characteristics using varying granularity of q-codes makes it possible to classify, for instance, a group of four rectilinear shapes into a quadrilateral class or into subsequent class labels, such as square, rectangle, parallelogram, trapezoid and diamond, according to how we set the a-code granularity.
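the bisection step can be sketched as an operation on an ordered list of interval sign labels; this is our illustration of the sign-extension rule described above, assuming mid-point landmarks (the existing landmark "0" is kept unchanged).

```python
# sketch: one q-code bisection step on an ordered list of interval sign labels
# e.g. ["-", "+"] -> ["--", "-0", "-+", "+-", "+0", "++"] around the kept landmark "0"
def bisect_signs(interval_signs):
    refined = []
    for s in interval_signs:
        refined += [s + "-", s + "0", s + "+"]  # new left range, new landmark, new right range
    return refined

first = ["-", "+"]               # primary intervals around the landmark "0"
second = bisect_signs(first)     # -> ['--', '-0', '-+', '+-', '+0', '++']
third = bisect_signs(second)     # 3rd-level signs, as in figure 14
print(second)
```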
repeating edge patterns, such as saw-teeth, can also be abstracted into a class or a set of sub-classes by varying the granularity of the a-, l-, k- and c-codes. varying the granularity of q-codes provides the dynamic shape-encoding capability that produces the optimum level of shape representation best suited to the expected qualitative abstraction.

2 encoding formalism

2.1 shape, shape elements and corresponding q-code chunking

shape as an aggregation of closed and connected lines

there is a hierarchical conceptual structure on which the q-code scheme is based. it is assumed that a shape contains geometric characteristics that can be encoded with a symbolic representation scheme, and the shape is defined by its contour lines. on the contour line, there are a finite number of singular nodes on which q-codes are assigned. thus a shape is considered as a single complete unit that contains a finite number of shape characteristics. the q-code shape-encoding scheme, therefore, has a conceptual structure in which lower conceptual units aggregate together to form a higher unit, and this aggregation continues to ever higher conceptual units. obviously the lowest unit of the q-code scheme is the q-code, which takes shape information on a node. these q-codes are "chunked" together to form a higher conceptual unit that implies geometric patterns on a shape contour. these patterns are contained in their higher unit, a shape. a group of connected shapes, similarly, can be denoted by a unit that is conceptually higher than a shape. as a complete conceptual unit, a shape becomes the target and basis for q-code encoding and analysis.

a shape normally refers to any finite arrangement of lines (straight, curved, open or closed) with pictorial characteristics ([27], [28]). however, we specifically define a shape as an aggregation of closed and connected lines ([11], [12], [24]), so that a shape refers to the 2-d plane contour lines which form the boundary of a shape. the terms shape and form are normally used to distinguish 2-d contour images and 3-d solids. shape is the most abstract geometric description of an object image. the shape of an object, therefore, indicates the essential pictorial information, excluding all the other material attributes, such as texture, line weight and color, from the object description. consequently, we use the term shape to refer to the 2-d contour of an object that is closed and connected with lines that are geometrically defined, excluding all the material attributes.

complete shapes and geometric patterns are represented by q-codes. the q-code description of a geometric pattern should be a subset of the q-code encoding of the shape, and the difference between these two is the connection. the first and the last q-codes of a shape encoding are conceptually connected to each other to form a circuit, whilst a geometric pattern has a q-code encoding that does not form a circuit. since a shape is described as a circuit of q-codes, we consider different orders of the q-code sequences of an identical shape to be equivalent. q-code encodings of a design description are analyzed according to chunks of encoding at different levels of shape patterns. the concept can best be understood with an analogy to natural language.

linguistic analogy of shape elements

there are conceptual units according to how q-codes are chunked. these units are hierarchical to each other and they show several discrete levels in terms of shape description.
these units handle a shape configuration through q-code chunking, starting from a specific shape attribute and qualitative symbolic values at a node, to a set of geometric patterns on a shape contour, to a shape that is complete and discrete, and finally to the shape aggregation. the q-code scheme suggests a set of terminology for these conceptual units. this terminology corresponds to familiar terms used in linguistics. the matching of each term in linguistics and in the q-code scheme is shown in table 2.

table 2: matching terms in linguistics and in the q-code scheme
chunking        | term in linguistics | term in q-code scheme
primary symbol  | alphabet            | q-code
first chunking  | word                | q-word
second chunking | phrase              | q-phrase
third chunking  | sentence            | q-sentence
fourth chunking | paragraph           | q-paragraph

this linguistic analogy of q-code chunking suggests that there are specific structures and definitions for the different groupings of q-codes. the variety of q-code chunking also suggests that there could be different levels of shape analysis tasks according to different levels of cognitive processes for shapes.

2.2 definitions of basic concepts in the encoding formalism

the following are the definitions of the four important q-code chunking units and their meaning in shape description. we start with the definition of the q-code, followed by the q-sentence, and then the q-word.

the q-code (α): α = χσ, where χ ∈ {a, l, k, c} and σ is a combination of the sign values {−, 0, +}.

the q-code α forms the base symbol in the encoding formalism. the four categories of q-codes – a-, l-, k- and c-codes – are constructed for the geometric attributes of angle, length, curvature and convexity by combining a single character {a, l, k, c} with sign values {+, 0, −}. q-codes are determined as relative size values by comparing numeric values to the landmarks. table 3 shows how qualitative values are assigned to each shape attribute.

the q-sentence (Σ): Σ = (α1, α2, …, αm), where m = length(Σ).

the q-sentence refers to a discrete shape as a set of closed and connected lines. it is a finite sequence of q-codes in the form of a circuit, and it includes q-words and q-phrases as its syntactic subsets. the q-sentence is represented as a loop of q-codes such that only the order of the sequence matters. in this way, it distinguishes itself from the q-word and the q-phrase.

the q-word (ω): ω = (αi, …, αj), 1 ≤ i ≤ j ≤ m, or a run that wraps around the circuit, (αi, …, αm, α1, …, αk) with 1 ≤ k < i.

the q-word is a sequence of q-codes and it is a subset of the q-sentence. q-words do not have the structure of a circuit; they refer to subset patterns of a shape contour with a particular design meaning. a q-word can be as long as the q-sentence, and the shortest length of a q-word is 1: 1 ≤ length(q-word) ≤ m. table 4 shows the four important q-code units and their natural language analogy.

3 representations of shapes

3.1 representation of a singular shape

the q-code scheme aims to encode shapes based only upon shape information, excluding other pictorial information such as materials and spatial relations. the basic unit that completes a circuit is the shape, which is indicated by a region bounded by a finite set of nodes and edges. a shape is understood as a q-sentence in the q-code scheme, and a singular shape is a complete shape that is independent from other shapes around it. the q-code encoding of a singular shape follows a series of straightforward steps.
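since only the cyclic order of a q-sentence matters, equality of two shape encodings can be tested by comparing canonical rotations; the following sketch (our illustration, not part of the paper) uses the lexicographically minimal rotation as the canonical form.

```python
# sketch: rotation-invariant equality of q-sentences (circuits of q-codes)
def canonical(q_sentence):
    """lexicographically minimal rotation as a canonical form of the circuit."""
    n = len(q_sentence)
    rotations = [tuple(q_sentence[i:] + q_sentence[:i]) for i in range(n)]
    return min(rotations)

def same_shape(qs1, qs2):
    return len(qs1) == len(qs2) and canonical(qs1) == canonical(qs2)

square = ["a-", "a-", "a-", "a-"]
lshape1 = ["a-", "a-", "a+", "a-", "a-", "a-"]
lshape2 = ["a+", "a-", "a-", "a-", "a-", "a-"]  # same circuit, started elsewhere
print(same_shape(lshape1, lshape2), same_shape(square, lshape1))  # True False
```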
a shape, that is, a q-sentence, is a superset that contains dependent sets of q-words and q-phrases, which are also defined as series of q-codes. consequently, shape encoding is the primary process, which is followed by detection processes for q-words and q-phrases. a q-sentence is a circuit of q-codes and each q-code corresponds to a node of the shape. each node is distinguished by a node label, and a q-code is assigned to the node according to the specific shape characteristics. when every singular node is labeled and encoded, the circuit is complete and we have the representation of a q-sentence for a shape. thus, we follow a sequence from node labeling, to q-code assignment, to a complete circuit for a singular shape representation. figure 15 illustrates this sequence of shape representation.

table 3: qualitative value assignments to shape attributes
                    | a-code                            | l-code/k-code/c-code
numeric value range | 0 ≤ θ ≤ 2π                        | −∞ < l, k, c < +∞
landmark set        | {0, π}                            | {−∞, 0, +∞}
interval set        | {[0, 0], (0, π), [π, π], (π, 2π)} | {(−∞, 0), [0, 0], (0, +∞)}
q-code set          | {anil, a−, a0, a+}                | {l−, l0, l+}, {k−, k0, k+}, {c−, c0, c+}

fig. 15: singular shape representation with a sequence of a-code assignments

table 4: various levels of geometric patterns analogous to natural language
q-code: the simplest symbol, which refers to an atomic component of a shape attribute.
q-word: a q-code sequence which refers to a shape pattern with a distinctive significance – a shape feature or shape knowledge.
q-phrase: a sequence of q-codes in which one or more q-words show a distinctive pattern of structural arrangement.
q-sentence: an aggregation of q-codes, q-words and q-phrases such that it refers to a closed and complete contour of a shape.
q-paragraph: a group of q-sentences where the necessary spatial relationships are described with specific connectives.

as shown in figure 15, every node on the shape contour is scanned in a counter-clockwise direction. as for the a-code, the inner vertex angle of a node is measured in a clockwise direction from the preceding edge and is compared to the landmarks. the other shape attributes follow similar steps of comparison on a node. as a result, we have a circuit of q-code symbols as the representation of a q-sentence that contains q-words and q-phrases, such as "(a+ a− a+)", with the alternated allocation of q-words expressed with shaded patterns and brackets:

q-sentence: (a− a+ a− a+ a− a− a+ a− a− a+ a− a+ a− a− a+ a−)
q-phrase: (a−)<(a+ a− a+)> / <(a− a− a+ a− a−)(a− a− a+ a−)>
q-word: (a+ a− a+)
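the detection of q-words, the process following shape encoding, must respect the circuit property: a pattern may wrap around the end of the listed sequence. a minimal sketch of such a detector (our illustration) follows.

```python
# sketch: find occurrences of a q-word in a q-sentence, allowing wrap-around
def find_q_word(q_sentence, q_word):
    n, m = len(q_sentence), len(q_word)
    doubled = q_sentence + q_sentence          # unroll the circuit once
    return [i for i in range(n) if doubled[i:i + m] == q_word]

qs = ["a-", "a+", "a-", "a+", "a-", "a-", "a+", "a-"]
print(find_q_word(qs, ["a+", "a-", "a+"]))     # -> [1], the q-word starting at node 2
```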
3.2 representation of an aggregation of shapes

shapes are grouped together to form a shape aggregation. the most basic shape aggregation occurs when two simple singular shapes overlap each other. when two shapes overlap, their geometric configuration takes on certain spatial relationships. a spatial relationship, however, carries a different type of information, in that it is mainly concerned with how spatial entities are positioned rather than with their individual visual characteristics. in a similar way to how lyrics and melody are combined in a song, descriptions of shape characteristics and spatial relationships take different roles in the composition of a visual design.

studies on spatial relationships ([5], [9], [14]) take simple symbols for the shapes (the entities positioned by the spatial relationships), where the symbols refer only to the existence of the entity, abstracting away the variety of information about individual shape characteristics. the q-code scheme, however, goes in the opposite direction in describing visual information, focusing on encapsulating shape characteristics rather than spatial dispositions; thus it describes an aggregation of shapes based only upon q-codes, excluding other dimensions of visual information such as spatial relationships. the q-code scheme represents an aggregation of shapes in terms of multiple regions sharing common nodes. each region is a complete circuit of q-codes and is equivalent to a singular shape with respect to the property that the encoding corresponds to the q-sentence notion as a circuit. a region, however, also has a sequence of nodes that can have multiple scanning directions. in this way, regions with common nodes with the same labels can be combined to produce a new region. an aggregation of multiple shapes is represented with these regions.

simple aggregation of shapes

binary spatial relations occur when two shapes overlap. the overlap of two shapes produces two or more regions. figure 16 shows the possible binary spatial relationships for the overlap of two squares. the arrangements in figure 16 can be grouped into two categories: the connected shape category includes the "vertex touch", "adjacent" and "overlap" types, whilst the disconnected shape category includes the "exclusive" and "enclosed" types. the difference between the two categories of shape aggregation is that the connected category contains common nodes for the subsequent regions, whilst the disconnected category has none of these. these common nodes take multiple scanning directions and take different q-codes for the encoding of the particular regions. figure 17 shows the regions resulting from the overlap-type aggregation of two squares.

fig. 16: possible spatial arrangements of a pair of squares
fig. 17: possible regions as a result of the overlap of two squares

the aggregation of multiple shapes forms a q-paragraph with the regions as q-sentences. the representation of a shape aggregation should be able to contain all these subsequent regions in the form of q-sentences. shape aggregations in the disconnected category can only be represented with individual q-sentences. there can be no common nodes between the regions, so there cannot be nodes of multiple directions in the "exclusive" and "enclosed" types.

3.3 nodes with multiple directions in a shape aggregation

multi-region nodes

one of the critical differences in the encoding of a shape aggregation (q-paragraph) is the scanning of nodes. singular shapes contain only one type of node, with one scanning direction. a single shape corresponds to a single circuit of q-codes. regions in a shape aggregation in the connected category contain nodes that have more than one direction of scanning; these are multi-region nodes. multi-region nodes are thus involved in more than one region, as shown in figure 18. so far, each node has been referred to by a q-code, such that a single q-code has implicitly represented a single node in a circuit and a node has appeared only once in the representation.
in the case of regions in a shape aggregation, a node with the same label occurs several times, as shown in figure 18 (a), (b) and (c), which illustrates nodes that are involved in one, two and three regions. such a node with the same label takes different q-codes for the different regions.

fig. 18: nodes with scanning directions

the labeling of nodes

labels are assigned to nodes in order to identify explicitly all the subsequent regions of a shape aggregation. each node is labeled with a number, one by one, circuit by circuit. a complete circuit of a path constructs the encoding of a region. thus each region is identified with a series of labels, or a path of labels, that follows the counter-clockwise scanning direction, with a single q-code assigned to a single label. a bigger region that includes two small regions has a path of labels that combines the two paths of labels of the smaller regions. the coding of a region is considered invariant with respect to the initial node of labeling. figure 19 shows the labeling of nodes for regions in a shape aggregation. each region is represented with a circuit of labels and each node is assigned a q-code, as in table 5.

fig. 19: labeling of nodes for multiple shapes (x: (1 2 3 4 5 6), y: (3 7 5 4), z: (3 8 9 10 5 7))

table 5: regions, labels and q-sentences
region    | path of labels     | q-sentence
x         | (1 2 3 4 5 6)      | (a− a− a− a+ a− a−)
y         | (3 7 5 4)          | (a− a− a− a−)
z         | (3 8 9 10 5 7)     | (a− a− a− a− a− a+)
x ⊕ y     | (1 2 3 7 5 6)      | (a− a− a0 a− a0 a−)
y ⊕ z     | (3 8 9 10 5 4)     | (a0 a− a− a− a0 a−)
x ⊕ z     | (2 3 8 9 10 5 6 1) | (a− a+ a− a− a− a+ a− a−)
x ⊕ y ⊕ z | (2 3 8 9 10 5 6 1) | (a− a+ a− a− a− a+ a− a−)

3.4 aggregation of multiple regions

when two or more regions are connected through a pair of connection nodes, new regions are formed as new q-sentences. for an aggregation of two regions x and y, x ⊕ y forms a new circuit composed of the non-shared nodes of x and y connected through a pair of connection nodes. the order of the node labels of x and y is maintained in the process. the connection of two regions occurs at a pair of connection nodes. two regions are connected into a bigger region by canceling the group of shared node labels and by connecting the chains of non-shared nodes at the connection nodes. region aggregation sometimes produces invalid regions as a result. one of the conditions for a valid circuit is that there should be no repeated node labels in the chain. the nodes of multiple scanning directions are located at the connection points. a valid aggregation of regions contains no repetition of the same nodes of multiple scanning directions.

the algorithm

when two regions, x and y, are provided, the aggregation x ⊕ y takes labels from the two regions in a particular sequence. the elements of the algorithm for the aggregation of regions are as follows:
• x and y: two valid regions with two sets of node labels.
• s: the set of labels of the group of shared node labels of x and y: s ⊂ x, s ⊂ y.
• c: the set of the pair of connection node labels: c ⊂ s.
• so: the complementary set of s with respect to x ∪ y: so = (x ∪ y) − s, with s ∩ so = ∅.
• x' and y': the non-shared node labels of x and y: x' ⊂ x, y' ⊂ y, where x' = (x ∩ so) and y' = (y ∩ so).
• the aggregation: x ⊕ y = x' ∪ y' ∪ c.

following this process, the regions x and y in figure 19 give the results below. the order of the node labels is not maintained by the set operations:

x = {1 2 3 4 5 6}
y = {3 7 5 4}
x ∪ y = {1 2 3 4 5 6 7}
s = {3 4 5}
c = {3 5}
so = {1 2 6 7}
x' = (x ∩ so) = {1 2 6}
y' = (y ∩ so) = {7}
x ⊕ y = x' ∪ y' ∪ c = {1 2 6 7 3 5} ≡ {1 2 3 7 5 6}
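the set-level aggregation x ⊕ y = x' ∪ y' ∪ c can be sketched directly on the label circuits; the illustration below is ours, and the rule for identifying the connection nodes (shared nodes whose neighbourhood in x is not entirely shared) is our assumption, since the paper only states c ⊂ s. as in the paper, the set result must still be reordered into a valid circuit.

```python
# sketch: the set-level aggregation x (+) y = x' | y' | c from the algorithm above
def aggregate(x, y):
    s = set(x) & set(y)                                  # shared node labels
    nbrs = lambda i: {x[(i - 1) % len(x)], x[(i + 1) % len(x)]}
    c = {x[i] for i in range(len(x))
         if x[i] in s and not nbrs(i) <= s}              # connection nodes (assumed rule)
    so = (set(x) | set(y)) - s                           # non-shared labels
    return (set(x) & so) | (set(y) & so) | c             # x' | y' | c

x = [1, 2, 3, 4, 5, 6]
y = [3, 7, 5, 4]
print(sorted(aggregate(x, y)))   # -> [1, 2, 3, 5, 6, 7], i.e. the circuit {1 2 3 7 5 6}
```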
constraints for a valid circuit combination

the aggregation of two regions results in a new region. out of all the combinations, only those regions that satisfy the validity constraints can be considered valid circuits. there are three validity constraints, as follows:
• there should be at least a pair of connection node labels.
• there should be no repetition of node labels.
• there should be at least three node labels (≥ 3).

4 assessment of qualitative shape representation

each qualitative and quantitative method is associated with unique paradigmatic perspectives that are complementary to each other. they are used for different purposes. until recently, the numeric (quantitative) method of representing shapes was the most successful, because of its precision in the handling of values and its completeness in representation. there are, however, situations in design modeling where exactness and completeness are not of prime importance and are instead an obstacle to the understanding and classification of geometric information. the qualitative method, as it is complementary to the quantitative approach, should be useful in those tasks that are difficult using the quantitative approach. the qualitative scheme for shape representation is complementary to those adopted in conventional computer-aided architectural design modeling packages in the following aspects:
• shapes are described not in terms of numbers and coordinates but in qualitative symbols.
• shape representation aims not for a precise and complete description but for the encapsulation of shape characteristics.
• shape representation can handle unclear and ambiguous measures with partial knowledge.
• shapes are analyzed on the basis of the description and recognition of shape features.
• shape modeling aims at such qualitative evaluations as comparison, categorization and qualitative search, rather than precise visualization.
• a single qualitative shape description represents a range of shape individuals sharing common shape characteristics.

hence, the qualitative approach to shape shows unique methodological distinctions from the quantitative approach. qualitative shape representation shows an advantage over quantitative shape representation in the following aspects:
• a single description of a qualitative shape refers to many descriptions of quantitative shapes in a particular range.
• qualitative shape descriptions can take several levels of abstraction by changing the granularity, which means flexibility and economy as well as adaptability for modeling purposes.
• qualitative shape descriptions can refer to quantitative shapes either in the conceptual stage or in the final stage of designing. the ability to handle shapes in the conceptual or early stages of designing is a significant advantage over quantitative methods.
• qualitative shape descriptions contain data for conceptual abstractions that facilitate the search for shape semantics and for building a shape knowledge base.

references:
[1] buffart, h., leeuwenberg, e., restle, f.: coding theory of visual pattern completion. j. exp. psychol. hum. percept. perform. 7 (2)/1981, pp. 241–274
[2] buffart, h., leeuwenberg, e.: observations: analysis of ambiguity in visual pattern completion. j. exp. psychol. hum. percept. perform. 9 (6)/1983a, pp. 980–1000
[3] buffart, h., leeuwenberg, e.: structural information theory. in h.-g. geissler (ed.), modern issues in perception, north-holland, amsterdam, 1983b, pp. 48–72
[4] de kleer, j., brown, j.: a qualitative physics based on confluences. artificial intelligence 24/1984, pp. 7–83
[5] egenhofer, m., al-taha, k.: reasoning about gradual changes of topological relationships. in a. u. frank, i. campari and u. formentini (eds), theories and methods of spatio-temporal reasoning in geographic space, springer-verlag, berlin, 1992, pp. 196–219
[6] forbus, k. d.: qualitative process theory. artificial intelligence 24/1984, pp. 85–168
[7] forbus, k. d., nielsen, p., faltings, b.: qualitative spatial reasoning: the clock project. artificial intelligence 51/1991, pp. 417–471
[8] freeman, h.: boundary encoding and processing. in b. s. lipkin and a. rosenfeld (eds), pictorial processing and psychopictorics, academic press, new york, 1970, pp. 241–266
[9] freksa, c.: qualitative spatial reasoning. in d. m. mark and a. u. frank (eds), cognitive and linguistic aspects of geographic space, kluwer, dordrecht, 1991, pp. 361–372
[10] gero, j. s.: design prototypes: a knowledge representation schema for design. ai magazine 11 (4)/1990, pp. 26–36
[11] gero, j., park, s.-h.: qualitative representation of shape and space for computer-aided architectural design. in y.-t. liu, j.-h. tsou and j.-h. hou (eds), caadria ’97, hu’s publisher, taipei, taiwan, 1997a, pp. 323–334
[12] gero, j., park, s.-h.: computable feature-based qualitative modeling of shape. in r. junge (ed.), caad futures ’97, kluwer, dordrecht, 1997b, pp. 821–830
[13] hayes, p.: the second naive physics manifesto. in j. r. hobbs and r. c. moore (eds), formal theories of the commonsense world, ablex, norwood, 1985, pp. 1–36
[14] hernandez, d.: relative representation of spatial knowledge: the 2-d case. in d. m. mark and a. u. frank (eds), cognitive and linguistic aspects of geographic space, kluwer, dordrecht, 1991, pp. 373–385
[15] hilbert, d., cohn-vossen, s.: geometry and the imagination. chelsea pub., new york, 1952
[16] iwasaki, y.: qualitative physics. in a. barr, p. r. cohen and e. a. feigenbaum (eds), the handbook of artificial intelligence, 1989, pp. 323–413
[17] jungert, e.: the observer’s point of view: an extension of symbolic projection. in a. frank, i. campari and u. formentini (eds), theories and methods of spatio-temporal reasoning in geographic space, springer-verlag, berlin, 1992, pp. 179–195
[18] jungert, e.: symbolic spatial reasoning on object shapes for qualitative matching. in a. u. frank and i. campari (eds), spatial information theory, springer-verlag, berlin, 1993, pp. 444–462
[19] kuipers, b.: qualitative simulation. artificial intelligence 29/1986, pp. 289–338
[20] leeuwenberg, e.: a perceptual coding language for visual and auditory patterns. am. j. psychol. 84 (3)/1971, pp. 307–349
[21] liu, j.: a method of spatial reasoning based on qualitative trigonometry. artificial intelligence 98/1998, pp. 137–168
[22] martinoli, o., masulli, f., riani, m.: algorithmic information of images. in v. cantoni, v. d. gesu and s. levialdi (eds), image analysis and processing ii, plenum press, new york, 1988, pp. 287–293
[23] mukerjee, a., agrawal, r. b., tiwari, n.: qualitative sketch optimization. aiedam 11/1997, pp. 311–323
[24] park, s. h., gero, j. s.: qualitative representation and reasoning about shapes. in j. s. gero and b. tversky (eds), visual and spatial reasoning in design, key centre of computing and cognition, university of sydney, sydney, australia, 1999, pp. 55–68
[25] restle, f.: coding theory of the perception of motion configurations. psychological review 86 (1)/1979, pp. 1–24
[26] restle, f.: coding theory as an integration of gestalt psychology and information processing theory. in j. beck (ed.), organization and representation in perception, lawrence erlbaum assoc. pub., london, 1982, pp. 31–56
[27] stiny, g.: pictorial and formal aspects of shape and shape grammars and aesthetic systems. ph.d. thesis, university of california, los angeles, 1975
[28] stiny, g.: generating and measuring aesthetic forms. in handbook of perception, academic press, 1978, pp. 133–152
[29] struss, p.: qualitative modeling of physical systems in ai research. in j. calmet and j. a. campbell (eds), artificial intelligence and symbolic mathematical computing, springer-verlag, new york, 1992, pp. 20–49
[30] treisman, a., schmidt, h.: illusory conjunction in the perception of objects. cognitive psychology 14/1982, pp. 107–141
[31] werthner, h.: qualitative reasoning: modeling and the generation of behavior. springer-verlag, vienna, 1994
[32] weinberg, j. b., uckun, s., biswas, g., manganaris, s.: qualitative vector algebra. in b. faltings and p. struss (eds), recent advances in qualitative physics, mit press, london, 1992, pp. 193–207

ashraf fouad hafez ismail
phone: +420 602882128, fax: +420 2 6517811
e-mail: hafez@fa.cvut.cz
department of design theory, czech technical university in prague, faculty of architecture, thákurova 7, 166 29 praha 6, czech republic
acta polytechnica 55(6):388–392, 2015, doi:10.14311/ap.2015.55.0388
© czech technical university in prague, 2015, available online at http://ojs.cvut.cz/ojs/index.php/ap

multilayer coatings ti/tin, cr/crn and w/wn deposited by magnetron sputtering for improvement of adhesion to base materials

j. horník∗, s. krum, d. tondl, m. puchnin, p. sachr, l. cvrček

czech technical university in prague, faculty of mechanical engineering, department of materials engineering, karlovo náměstí 13, 121 35 prague, czech republic
∗ corresponding author: jakub.hornik@fs.cvut.cz

abstract. the paper deals with the evaluation of single-layer and multilayer pvd coatings based on cr and ti, widely used in tool applications. additionally, w and wn based coatings, which are not so widespread, were designed and deposited as a functionally graded material. the properties of the coatings were evaluated from the point of view of hardness and adhesion. the hardness measurement was carried out using the nanoindentation method. the scratch test was performed to test adhesion. moreover, the presence of a metallic interlayer in functionally graded materials further increases the coating adhesion by gradually approaching its composition to the substrate. coatings consisting of w and wn showed very good adhesion. with regard to the results of the scratch test, the multilayer coatings of crn, tin and wn have increased adhesion, and their protective function can be assumed to be improved. the results will be applied in the development of functionally graded layers for functionally graded materials.

keywords: pvd coatings; magnetron sputtering; s235 steel; adhesion; nanoindentation.

1. introduction

thin coatings are an important area of surface engineering, nowadays experiencing rapid progress.
coatings have a protective or functional purpose in a wide range of technical applications. they increase the resistance of the substrate against oxidation, heat transfer, wear and corrosion [1]. pvd (physical vapour deposition) is one of the ways of depositing them onto tool steel substrates. the low deposition temperature is a significant advantage of this method; however, these coatings show relatively low adhesion to the substrate. the adhesion may be improved by modifying the substrate, or by adjusting the physical properties of the coating to those of the substrate as far as possible, in order to achieve similar mechanical properties such as hardness [2]. amongst the most widely used coatings are the pvd coatings based on cr-n, ti-n, al-ti-n and w-n. tungsten and its nitrides and carbides are not widely used alternatives [3, 4], in spite of an abrasive resistance comparable to mo2n, with significantly lower wear loss than commercially used ti based coatings [5]. wn based coatings suffer from rapid formation of oxides at temperatures above 500 °c. these oxides also occur in surface microcracks, and their growth contributes to crack propagation and to the coating degradation process [4]. this paper deals with the characterization of pvd coating types based on cr, ti and w deposited on a carbon steel substrate. the scratch test [6, 7], nanoindentation [9, 10] and microscopy were used for evaluating the coating properties. the results will be used for the development of functionally graded layers for functionally graded materials.

figure 1. composition of multilayer coatings.

element   concentration
c         max 0.22 wt%
mn        max 1.60 wt%
p         max 0.05 wt%
s         max 0.05 wt%
si        max 0.05 wt%
table 1. typical chemical composition of s 235 steel.

2. experiment
the cr, ti and w based coatings were selected for the experiment. as a substrate material, a normalized plain carbon steel s 235 with a hardness of 185 hv (55 hra) was unconventionally selected; the reason was to evaluate better the properties of the coatings themselves. the typical composition of the steel used is presented in table 1.

figure 2. surface morphology of evaluated coatings (sem).

all samples were prepared using deposition parameters chosen to reach a final coating thickness of 3–6 µm. all parameters (bias voltage, coil current, gas flow etc.) were optimized beforehand by test cycles. in the case of the fgm (functionally graded) w/wn coating (w reinforced by wn clusters), the gas flows were optimized in order to reach increased hardness and sufficient adhesion. the deposition times were set to obtain approximately the same final coating thickness for all samples. prior to deposition, all substrates were polished to a mirror-like surface, then cleaned in an ultrasonic cleaning machine in an acetone bath. the multilayers were prepared by the pvd method on hauzer flexicoat 850 equipment. the coatings were formed as an alternating combination of pure metals and their nitrides; the adhesion layer of pure metal has a thickness of 50 nm. the composition of the prepared and tested layer systems is evident from fig. 1. adhesion measurements were performed using the csm revetest xpress device with a rockwell indenter (tip angle 120°, tip radius 0.2 mm). a linearly increasing load from 1 to 100 n was applied at a loading rate of 49.5 n/min. the length of each scratch was 10 mm, and the velocity of the indenter was 5 mm/min.
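the scratch-test settings quoted above are internally consistent, and they fix the load gradient along the groove; the short python sketch below (mine, not part of the paper) makes the check explicit:

    # consistency check of the scratch-test parameters quoted above
    # (values taken from the text; not part of the original paper)
    load_start, load_end = 1.0, 100.0   # n
    loading_rate = 49.5                 # n/min
    indenter_speed = 5.0                # mm/min

    duration = (load_end - load_start) / loading_rate   # min
    scratch_length = indenter_speed * duration          # mm
    load_per_mm = (load_end - load_start) / scratch_length

    print(f"scratch duration: {duration:.1f} min")       # 2.0 min
    print(f"scratch length:   {scratch_length:.1f} mm")  # 10.0 mm, as stated
    print(f"load gradient:    {load_per_mm:.1f} n/mm")   # ~9.9 n/mm

the load gradient is what converts a failure position along the groove into a critical load.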
a micro materials nanotest device was used to determine the hardness of the coatings. ten indentations were made on each sample, with a load of 200 mn and loading times of 10 s/5 s/10 s (loading/dwell/unloading). the hardness values h_it were calculated from the results of nanoindentation with a berkovich indenter. a jeol jsm 7600f scanning electron microscope was used for evaluating the microstructure and the coating thickness.

figure 3. fracture appearance and layer structure of coatings (sem).

3. results and discussion
3.1. microstructure
the coatings were documented on samples after the scratch test. the surface quality and the tendency to delamination were evaluated. all deposited coatings showed a columnar structure corresponding to their growth pattern. the microphotographs in fig. 2 show the surface morphology of the pvd coatings. the surfaces of the ti and cr based systems have a cupola character; on the surface of the w/wn based pvd coating, a "cauliflower" appearance dominates. qualitatively, the surface morphology is better in the case of the w based coatings. fracture in a columnar single-layer coating structure is relatively smooth, and crack propagation is easier than in the multi-layered systems (fig. 3). the alternation of layers with different properties, and their discrete interphase boundaries, act as decelerators against crack propagation.

3.2. mechanical properties
selected mechanical properties of the coatings are summarized in tab. 2: the nanohardness (h_it), the calculated hardness (hv), and the reduced young's modulus of elasticity (e_r).

coating              h_it [gpa]   hardness [hv]   e_r [gpa]   thickness [µm]
wn                   28.0         2855            258         3.6
w/wn (fgm)           32.2         3284            317         4.0
crn                  16.2         1652            248         5.1
cr/crn (6 layers)    18.0         1835            283         5.8
tin                  24.4         2488            252         2.6
ti/tin (6 layers)    15.5         1581            246         3.9
table 2. mechanical properties of the coatings.

the nanohardness values of the cr based coatings are quite low. higher nanohardness values were determined for the tin and w based coatings; the highest hardness was reached by the multilayer w/wn coating. the modulus of elasticity of the wn coating is comparable to crn and tin, and the highest value was observed for the w/wn multilayer coating. the w coatings are expected to be very tough, with hardness comparable to the common crn coatings. the measured values of hardness and modulus of elasticity correspond well to the available data [1, 4, 5, 10].

the most important technological parameter of a coating is its adhesion to the substrate. four main characteristics are monitored on every residual groove (scratch): lc1 (the emergence of the first cracks), lc2 (large-scale cracks), lc3 (first delamination of the coating) and ls (total coating delamination – substrate exposure). the results of the scratch test of the coating systems, and a comparison of the critical loads, are summarized in tab. 3.

coating              lc1   lc2   lc3   ls
wn                   12    17    27    41
w/wn (fgm)           19    27    32    44
crn                  7     11    27    37
cr/crn (6 layers)    10    14    32    40
tin                  6     9     15    16
ti/tin (6 layers)    11    21    23    24
table 3. results of the scratch test (critical loads [n]).

figure 4. scratch appearance on different layers (tin – lc3, ls; w/wn (fgm) – lc3) and comparison of the measured critical loads.

the behaviour of the individual systems is quite different, and this fact rather complicates a comparison of the results. from fig. 4 it is evident that the multilayered coating structure improves the adhesion of the system. all the analyzed multilayered systems achieved better results than the plain monolayers, both in the individual criteria and overall. the highest adhesion was achieved by w/wn and cr/crn.
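the hv column of tab. 2 appears to follow from h_it by a plain unit conversion from gpa to kgf/mm^2 (the vickers hardness number); this is an observation about the tabulated numbers, not a statement made in the paper. a minimal python sketch reproducing the column:

    # reproduce the hv column of tab. 2 from the nanohardness h_it;
    # the conversion gpa -> kgf/mm^2 is assumed here (1 kgf = 9.80665 n)
    G = 9.80665
    h_it = {"wn": 28.0, "w/wn (fgm)": 32.2, "crn": 16.2,
            "cr/crn (6 layers)": 18.0, "tin": 24.4, "ti/tin (6 layers)": 15.5}

    for coating, h in h_it.items():
        hv = h * 1000.0 / G   # gpa -> mpa, then mpa -> kgf/mm^2
        print(f"{coating:20s} h_it = {h:5.1f} gpa -> {hv:6.0f} hv")
    # output matches tab. 2 within rounding (e.g. 28.0 gpa -> 2855 hv)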
the combination of the highest hardness, the highest modulus of elasticity and the best adhesion makes the w based coatings, especially the fgm type, very promising for tribological applications.

4. conclusions
microscopy observation reveals that all the deposited coatings have a homogeneous columnar structure without any cracks. the surface of the cr and ti based coatings has a cupola character; in the case of the w based coating, the cauliflower appearance prevails. the results of the scratch test showed improved adhesion for all multilayer coatings. the adhesion of coatings deposited on a soft substrate is lower than that of coatings deposited on a typical tool steel, but the higher penetration depth must be considered. the w and cr based systems achieve the best adhesion results; the w based coatings have very good adhesion despite their high hardness and modulus of elasticity. the best results in all evaluated characteristics (hardness, modulus of elasticity and adhesion) were achieved by the fgm based w/wn coating. the application of pure metals as the material of the interlayers improves the adhesion of the coatings by gradually approaching their composition to that of the substrate. the fgm w/wn coatings are thus also candidates for use as highly adhesive, resistant coatings. the obtained results will be used for further fgm research.

acknowledgements
the research was supported by czech science foundation project gap108/12/1872 complex functionally graded materials.

references
[1] grainger, s., blunt, j.: engineering coatings (second ed.). woodhead publishing, abingdon, 1998. http://store.elsevier.com/engineering-coatings/s-grainger/isbn-9781845698577/
[2] jurči, p. et al.: využití pvd povlaků na ledeburitických ocelích [the use of pvd coatings on ledeburitic steels]. metal 2009 [cd-rom]. ostrava: tanger, 2009. isbn 978-80-87294-03-1.
[3] hornik, j., tondl, d., sachr, p., anisimov, e., puchnin, m., chraska, t.: the effect of pvd tungsten-based coatings on improvement of hardness and wear resistance. key engineering materials, vol. 606, 2014, p. 163–166.
[4] polcar, t., parreira, n.m.g., cavaleiro, a.: structural and tribological characterization of tungsten nitride at elevated temperature. wear, vol. 265, 2008, p. 319–326.
[5] kutschej, k., mayrhofer, p.h., kathrein, m., polcik, p., tessadri, r., mitterer, c.: structure, mechanical and tribological properties of sputtered ti1−xalxn coatings with 0.5

since the word fw(p, k) has a period m, it is determined by its prefix w of length m. denote u = fw(q, k − m). corollary 1 and lemma 2 imply that
• w = pref_m(u) if m ≤ k − m, and
• w = u · |u| · (|u| + 1) ⋯ (m − 1) otherwise.
this can be succinctly stated as:

    fw(p, k)[i] = fw(q, k − m)[i mod m]   if (i mod m) < k − m,
    fw(p, k)[i] = i mod m                 otherwise.

example 2. let p = {5, 7} and k = 8 as in example 1. the recursive definition of fw(p, 8) leads to
    p = q_0 = {5, 7},   k = k_0 = 8;
    q_1 = {2, 5},       k_1 = 3;
    q_2 = {2, 3},       k_2 = 1.
in order to obtain the word u_0 = fw(q_0, k_0) = fw(p, 8), we need the words u_1 = fw(q_1, k_1) and u_2 = fw(q_2, k_2). since k_2 = 1, we have u_2 = 0. from point (2) above we have u_1 = pref_3(w_1^ω) where w_1 = 01, therefore u_1 = 010. similarly, we get u_0 = pref_8(w_0^ω) where w_0 = 01034, whence fw(p, 8) = 01034010. schematically:

    q_0 = {5, 7}   k_0 = 8   w_0 = 01034   u_0 = 01034010
    q_1 = {2, 5}   k_1 = 3   w_1 = 01      u_1 = 010
    q_2 = {2, 3}   k_2 = 1                 u_2 = 0
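the recursion above translates directly into code. the following python sketch is my transcription of the reduction/extension scheme for two periods (the function name and the two base cases, taken from the termination remark in section 5 below, are my reading, not code from the paper):

    def fw(p, q, k):
        """extremal fine-and-wilf word of length k for two periods p, q;
        letters are nonnegative integers, largest possible alphabet."""
        a, b = sorted((p, q))
        if k <= a:                       # trivial case: no period constraint binds,
            return list(range(k))        # so all k positions may carry distinct letters
        if b % a == 0:                   # min period equals gcd of the periods
            return [i % a for i in range(k)]
        u = fw(a, b - a, k - a)          # "descending" step: reduced periods {a, b-a}
        m = a                            # the result has period m = min(p, q)
        if m <= k - m:
            w = u[:m]                    # w = pref_m(u)
        else:
            w = u + list(range(len(u), m))   # w = u . |u| . (|u|+1) ... (m-1)
        return [w[i % m] for i in range(k)]  # pref_k(w^omega)

    print(fw(5, 7, 8))   # [0, 1, 0, 3, 4, 0, 1, 0]  ->  01034010, as in example 2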
from the above example we see that the procedure has two parts, "descending" and "ascending", which are called "reduction" and "extension" in [3]. the end of the reduction can be defined in several ways; we have seen that we can turn to extension as soon as we know fw(q_i, k_i). this typically happens if k_i ≤ min q_i, or if min q_i = gcd(q_i).

5. concluding remarks
as already remarked, the above algorithm is identical with algorithm b from [3]. even all the arguments we use can in some way be traced back to similar arguments in the literature. nevertheless, i believe that the description presented here provides further evidence that the equivalence class approach is not only simple but also yields an intuition sufficient to formulate and understand the construction. (another elegant example, in my opinion, is the proof of the fact that the extremal fw-word is a palindrome, given in [4].) that said, one should stress that the inefficiency claim concerning the equivalence class approach is valid if we consider the naïve procedure suggested by example 1. the precise computational complexity of algorithm b goes beyond the scope of this paper (see the discussion in [3]). one possible drawback can be the somewhat discouraging notation like ∼_{p,k}, and the fact that notions like "equivalence closure" may sound "too algebraic" to some ears. computer theorists may therefore prefer to translate the exposition into the language of graphs, and speak about edges instead of generating relations, and about connected components instead of equivalence classes. the rest will be the same.

acknowledgements
this work was supported by czech science foundation grant 13-01832s.

references
[1] s. constantinescu, et al.: generalised fine and wilf's theorem for arbitrary number of periods. theoret comput sci 339(1):49–60, 2005.
[2] r. tijdeman, et al.: fine and wilf words for any periods. indag math (ns) 14(1):135–147, 2003.
[3] r. tijdeman, et al.: fine and wilf words for any periods ii. theor comput sci 410(30-32):3027–3034, 2009.
[4] š. holub: on multiperiodic words. theor inform appl 40(4):583–591, 2006.
[5] š. holub: corrigendum: on multiperiodic words. theor inform appl 45(4):467–469, 2011.

acta polytechnica 53(4):344–346, 2013

acta polytechnica vol. 41 no. 1/2001

faraday cup for electron flux measurements on the microtron mt 25
m. vognar, č. šimáně, d. chvátil

the basic criteria are established for constructing an evacuated faraday cup for precise measurement of 5 to 25 mev electron beam currents in air from a microtron. the faraday cup, built in the microtron laboratory of the faculty of nuclear sciences and physical engineering of ctu prague, is described together with the electronic chain and its incorporation in the measuring line on the beam. measures to reduce the backward escape of electrons are explained. the range of currents is from 10^−5 to 10^−10 a. the diameter of the al entry window of the faraday cup is 1.8 cm, and its area is 2.54 cm^2. the thickness of the entry window is 0.1 mm.

keywords: microtron, faraday cup, electron beams, electron flux measurement

1 introduction
the study of electron induced radiation effects requires a good knowledge of the value of the electron flux density during irradiation. interest in irradiation by electrons ranges from medical applications in oncology to investigations of radiation induced changes or the radiation hardness of electronic solid state devices and other materials. the total beam intensity, or the number of electrons exiting from an accelerator, is normally monitored by methods that preferably do not perturb the beam, such as induction pick-up current meters situated on the vacuum side close to the beam exit window. all such devices require calibration by an absolute metering system that registers the total electron flux. if a simple metallic block is used to stop all electrons, the measured value of the current flowing from the block to the ground is normally lower than the real value, because some of the electrons are scattered backwards.
a well known device for eliminating this effect is the faraday cup. a typical faraday cup is a cylinder made of conducting material with an inner cavity provided with a small input opening, giving electrons scattered back on the inner cavity walls little chance to escape through it. some precautions need to be taken if the electron energy is in the mev range. the walls of the inner cavity must be thicker than the range of the most energetic electrons, so that none of the electrons entering the cavity can pass through. the walls should be made of a low z material (e.g. graphite). this favours small angle scattering, and ensures that radiation processes accompanied by wide angle scattering of electrons and by the emission of bremsstrahlung, leading to photonuclear reactions, are reduced as far as possible. low z materials normally have low density, and it would require rather thick cavity walls to dissipate the electron energy completely. therefore the low z material is used to reduce the electron energy to a value at which the energy of the emitted bremsstrahlung is below the threshold for photonuclear reactions; the remaining electron energy is then absorbed by a heavier material. in this way the overall dimensions of the faraday cup can be limited. in order to reduce the escape of electrons from the faraday cup even further, electrons scattered on the cavity walls towards its opening enter a magnetic field, which deviates them and forces them to end on the cavity wall. a second measure consists in a decelerating electric field placed in front of the input opening, between the cup and an auxiliary electrode on a potential that is negative with respect to the cup. if the faraday cup were left in air at atmospheric pressure, the ionized air would constitute a leak resistor parallel to the input range resistor of the measuring device. to eliminate this effect, both the faraday cup and the decelerating electrode have to be placed in a vacuum housing. the vacuum is maintained below 1 pa continually during measurement by a rotary oil vacuum pump.

2 design and construction
a schematic view of the faraday cup with its vacuum shell is shown in fig. 1. the cavity 1 is shaped inside the graphite cylinder 2, inserted in a steel stopper 3. to stop 25 mev electrons completely would theoretically require about 14 g/cm^2 of graphite [1]. in our case the graphite thickness of 60 mm in the forward direction is followed by a 10 mm steel stopper, giving a total of 18 g/cm^2 and guaranteeing zero escape of electrons through the walls of the cup.
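the quoted 18 g/cm^2 can be checked from the layer thicknesses; the bulk densities in the python sketch below are assumed handbook values, not figures given in the text:

    # areal density of the stopper, to check the quoted 18 g/cm^2;
    # the bulk densities are assumed handbook values, not from the paper
    rho_graphite = 1.7    # g/cm^3 (typical polycrystalline graphite)
    rho_steel    = 7.85   # g/cm^3 (plain carbon steel)

    areal = 6.0 * rho_graphite + 1.0 * rho_steel   # 60 mm graphite + 10 mm steel
    print(f"total areal density: {areal:.1f} g/cm^2")   # ~18.0 g/cm^2
    # comfortably above the ~14 g/cm^2 range of 25 mev electrons in graphite [1]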
fig. 1: assembly drawing of the faraday cup in the vacuum housing: 1 – cavity, 2 – graphite cup, 3 – steel stopper, 4 – permanent magnets, 5 – ceramic insulators, 6 – decelerating electrode, 7 – insulators, 8 – cylindrical part of the vacuum housing, 9 – housing bottom, 10 – collar, 11 – rear flange, 12 – rubber gasket, 13 – al foil entry window, 14 – front flange, 15 – pumping sleeve, 16 – vacuum 4-lead bushing, 17 – al front shielding

two permanent magnets 4 are inserted in part 2. the vacuum housing consists of a cylindrical part 8 with a collar 10 and a 10 mm thick steel bottom 9. a rubber gasket 12 between the rear flange 11 and the collar 10 secures the vacuum tightness of the housing. the faraday cup is mounted on ceramic insulators 5 attached to the bottom 9 and the rear flange 11. the 0.1 mm thick, ⌀ 18 mm al entry window 13 is clamped between the 10 mm thick front flange 14 and the bottom 9 on a pb ring seal. the decelerating electrode 6 is attached to the bottom on insulators 7. the additional 25 mm thick al front shielding 17 guarantees that no electrons enter the faraday cup by any way other than through the al entry window. the electrical scheme of the faraday cup and the measuring chain, as well as the negative potential supply for the decelerating electrode, are shown in fig. 2.

fig. 2: block scheme of the faraday cup including the electrical measuring chain: fc – faraday cup, de – decelerating electrode, vh – vacuum housing, r1 – 0.2 mΩ, r2 – 0.5 mΩ, r – electrometer input range resistor 10^5–10^11 Ω, c1 – 4.7 µf, c2 – 0.1 µf, c3 – 1 nf

the negative −100 v potential for the decelerating electrode de is supplied by shielded coaxial cables from a dry battery installed in a shielded box. the resistor r2 prevents battery discharge in the case of an accidental short circuit. a 25 m long coaxial cable interconnects the faraday cup fc and the input resistor r of a keithley 616 digital electrometer, operated as an amperemeter. the input of the electrometer is protected by resistor r1. because the electron beam can be considered a current source with practically infinite inner resistance, the value of r1 does not influence the measured current value. the input range resistor r can be set in decade steps from 10^5 to 10^11 Ω, giving 1 v input voltage for currents from 10^−5 to 10^−11 a, large enough to be well above the noise induced by high frequency electromagnetic and steady current sources in the microtron room. the shielding of the coaxial cable is connected to the housing of the faraday cup. the resistors r1 and r between the faraday cup and the housing, as well as the grounding of the housing to the laboratory earth (which is the only grounded point of the entire measuring chain), must never be disconnected. if this were to happen, the cup and the entire measuring chain would be brought to a very high potential against the laboratory earth by the electric charge supplied by the high energy electrons, which would damage the electronic equipment. the electron beam pulses are approximately rectangular, 2.5 µs long, with a repetition rate of 400 hz; the peak current value is 1000 times higher than the mean current value. to absorb the peaks and obtain the mean current value, the input time constant of the electrometer must be at least 0.1 s. the capacity between the faraday cup and the vacuum housing and the capacity of the coaxial cable are not sufficient to smooth the current, so extra capacitors must be connected parallel to the input resistor r in order to obtain time constants between 0.1 and 10 s. capacitors c1 = 4.7 µf, c2 = 100 nf and c3 = 1 nf, in combination with input resistors from 10^5 to 10^10 Ω, are used to cover the current range from 10^−5 to 10^−10 a.
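the available time constants τ = r·c follow from the quoted component values; the pairing of range resistors with capacitors in this python sketch is my illustration of the stated requirement 0.1 s ≤ τ ≤ 10 s, not an explicit table from the paper:

    # time constants available from the quoted r and c values (tau = r*c)
    caps = {"c1": 4.7e-6, "c2": 100e-9, "c3": 1e-9}        # f
    for exp in range(5, 11):                               # r = 1e5 ... 1e10 ohm
        r = 10.0 ** exp
        usable = {name: r * c for name, c in caps.items() if 0.1 <= r * c <= 10.0}
        print(f"r = 1e{exp:02d} ohm (full scale 1e-{exp} a): {usable}")

    # the smoothing requirement itself: 2.5 us pulses at 400 hz give a duty
    # cycle of 1e-3, i.e. peak current = 1000 x mean current, so tau must be
    # much longer than the 2.5 ms pulse period -- hence tau >= 0.1 s
    duty = 2.5e-6 * 400
    print(f"duty cycle: {duty:.0e}")   # 1e-03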
in any case, if the capacitors need to be switched during a run of the microtron, they must not be disconnected, in order to prevent the appearance of an excessive voltage at the input of the electrometer. the accuracy of the current measurements is limited by the value of the leak resistance between the faraday cup and the earth, particularly by the resistance between the inner wire and the shielding of the 25 m long coaxial connecting cable. its resistance of 1.5 × 10^11 Ω, parallel to the input range resistance r, must be taken into account at currents lower than 10^−10 a. the use of the faraday cup in the study of problems connected with profiling electron fields for irradiation by scattering on foils is the subject of a separate publication [3].

3 measuring arrangement
an existing experimental installation (fig. 3), which had been built to generate electron and photon fields for dosimetric purposes [2], was adapted to accommodate the faraday cup. an optical bench 7, parallel to the beam and bearing the faraday cup, was installed on a table top 8. the table, together with a system of diaphragms, rests on a common platform carriage 10, which can be moved parallel to the beam on guide rails 11. all the systems of diaphragms and the faraday cup are exactly centered by an optical laser and set in the beam axis by leveling with a survey theodolite. the distance between the beam exit window 1 and the entry window 2 of the faraday cup can be changed from 860 to 1625 mm by a combination of shifting the carriage on the guide rails and shifting the faraday cup on the optical bench. by inserting diaphragms in the mounts of turrets 3 and 4 and in the conical steel collimator 5, one can set the beam aperture and modify, by scattering, the angular distribution of the electron flux density. in the broad electron beam in air, some of the scattered electrons can reach the faraday cup through the side walls of the vacuum housing; the walls are not thick enough to prevent their penetration. to get the true value of the electron flux entering the cup through the entry window, this effect should be taken into account by multiplying the measured electron flux by a coefficient, the value of which may be determined from the difference of the fluxes measured at the faraday cup entry window when it is open and when it is blocked by a stopper. for example, at a distance of 130 cm from the beam outlet window and a beam aperture of 1.57°, the value of this coefficient is 0.93; at an aperture of 4.6°, the coefficient is equal to 0.88.
4 conclusion
the standard diameter of the electron beam exiting from the exit window of the microtron is of the order of 0.3 to 0.5 cm. its angular width is due mainly to scattering on the al exit window; measured at the half height of the peak value, it ranges from 2.2° for 25 mev electrons to 5° for electrons of 10 mev [3]. on the other hand, the angular aperture of the faraday cup entry window at a distance of 5 cm from the exit window is about 20°, which is sufficient to receive the whole beam even in the worst case of 10 mev electrons. in this position the faraday cup is used for absolute calibration of the beam exiting from the microtron in air. at a sufficiently large distance from the beam exit window (this distance depends on the angular width of the beam), the electron flux density passing through the entry window of the faraday cup is practically uniform. at this distance one can determine the mean electron flux density (the number of electrons passing through 1 cm^2 per second) at the maximum of the angular distribution by dividing the mean electron current value by the area of the window, which is 2.54 cm^2, and by the electric charge of an electron, 1.6 × 10^−19 c. the peak values of the electron flux densities in the pulse are a thousand times higher. at intermediate distances, the angular distribution of the electron flux density in the beam must be taken into consideration. by dividing the measured mean electron current by the area of the entry window and by the electron charge, the mean electron flux density is obtained.

references
[1] kovalev, v. p.: vtoritchnye izlutchenija uskoritelej elektronov [secondary radiation of electron accelerators]. atomizdat, moskva, 1979.
[2] vognar, m., šimáně, č., burian, a., chvátil, d.: electron and photon fields for dosimetric metrology generated by electron beams from a microtron. nuclear instruments and methods in physics research a 380 (1996), pp. 613–617.
[3] šimáně, č., vognar, m., chvátil, d.: some aspects of profiling electron fields for irradiation by scattering on foils. submitted for publication in acta polytechnica.

ing. miroslav vognar, prof. ing. čestmír šimáně, drsc., ing. david chvátil
dept. of dosimetry & appl. of ionizing radiation
ctu, faculty of nuclear science & physical engineering, břehová 7, 115 19 praha 1, czech republic
phone: +420 2 2323657, +420 2 2315212, fax: +420 2 2320861, e-mail: vognar@br.fjfi.cvut.cz

fig. 3: measuring line: 1 – electron beam exit window, 2 – faraday cup entry window, 3 – first indexed turret with four diaphragm mounts, 4 – second indexed turret with eight diaphragm mounts, 5 – conical stainless steel collimator, 6 – square w-steel collimator, 7 – optical bench, 8 – table desk, 9 – diaphragm frame, 10 – common platform carriage, 11 – guide rail
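the current-to-flux conversion described in the conclusion, combined with the side-wall correction coefficient of section 3, is a one-line computation; the example current in this python sketch is hypothetical:

    # mean electron flux density from the measured mean current:
    # phi = i / (a * e); window area and electron charge as quoted in the text
    E_CHARGE = 1.6e-19      # c
    AREA = 2.54             # cm^2, entry window

    def flux_density(i_mean, wall_coeff=1.0):
        """electrons per cm^2 per second; wall_coeff is the correction for
        electrons penetrating the housing side walls (0.93 or 0.88 in the
        examples of section 3)."""
        return wall_coeff * i_mean / (AREA * E_CHARGE)

    # hypothetical example: 1e-8 a measured at 130 cm, aperture 1.57 deg
    print(f"{flux_density(1e-8, wall_coeff=0.93):.2e} electrons/cm^2/s")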
acta polytechnica vol. 42 no. 2/2002

optimization of the odor microclimate
m. v. jokl

the odor microclimate is formed by gaseous airborne components perceived either as an unpleasant smell or as a pleasant smell. smells enter the building interior partly from outdoors (exhaust fumes, flower fragrance) and partly from indoors (building materials, cigarette smoking, cosmetics, dishes). they affect the human organism through the olfactory center, which is connected to the part of the brain that is responsible for controlling people's emotions and sexual feelings: smells therefore participate to a high degree in mood formation. sweet smells have a positive impact on human feelings and on human performance. criteria for appraising the odor microclimate are presented, together with ways of improving the odor microclimate (by stopping odors from spreading within a building, ventilation, air filtration, odor removal by plants, deodorization, etc.), including so-called air design.

keywords: odors, microenvironment, hygiene, indoor air quality, microclimate.

an optimum odor microclimate can be ensured by suitable changes in (a) the source of odors, and (b) the transfer field between the source and the exposed subject.

1 changes in the source of odors
reduction or, if possible, removal of the source of odors is the most effective way: construction materials that do not release an odor, and production technologies without odor sources, should be preferred. two examples of effective measures are quick setting coatings and waste baling presses. quick setting coatings were developed in france. they consist of a great number of low-molecular compounds and of so-called photo-initiators which, when irradiated by uv rays, rapidly (within a second) change the low-molecular compounds into high-molecular compounds. natural materials are preferred for wood preservation, especially beeswax applied directly to cleaned wood. waste baling presses are available for home kitchens; they are produced by the lescha company (leonard schmid, augsburg, germany). a baling press can form part of a kitchen furniture suite (fig. 1).

fig. 1: the lescha-mollpack waste baling press integrated into kitchen furniture

waste (including champagne bottles) is pressed into a polyethylene package of small volume (1/4 of the original volume) (fig. 2). thus, of course, any odor or microbe release from the waste is avoided.

fig. 2: the product of the waste baling press: a small polyethylene package

2 changes in the transfer field between source and subject
such changes can be made in the following ways:
a) stopping the odors from spreading within the building,
b) supplying an adequate quantity of outdoor air to the building interior, i.e., suitable ventilation,
c) air filtration,
d) introduction of plants,
e) chemical deodorization,
f) intensive air ionization,
g) neutralization with ionized ozone,
h) the bake-out procedure.

2.1 how to stop odors from spreading within a building
the most effective method is to be careful about the air streams produced by infiltration and by indoor heat sources. staircases should be divided into several hermetic parts, and the sources of odors should be confined to the upper part of the building. the most serious problems occur in tall buildings, as a consequence of the stack effect (thermal upward pressure). according to some measurements, there is a negative pressure in the lower part of a 16 to 30-storey building of 120 to 160 n·m^−2. there is an underpressure of about 30 n·m^−2 even at the front door of the nine-storey buildings often constructed in europe (fig. 3). this means, in practice, that in order to open a house door sized about 1 × 2 m it is necessary to use a force equivalent to 6, 24 and 32 kg, respectively. the effect of the thermal uplift grows vertically through the whole building (shafts, staircases), and there is intensive spreading of odors within the building if the staircase is not divided into several hermetic parts, or if the odor sources (kitchens, laboratories, etc.) are not located in the upper part of the building. if this is impossible, at least hermetically sealed doors should be used from the staircase to the apartments or offices.

fig. 3: the air pressure distribution within a nine-story building; on the left, a schematic representation of the air pressure in a building; on the right, calculated and measured air pressure
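the door-opening forces quoted in section 2.1 follow from f = Δp · a; a quick check (python, values from the text):

    # force needed to open a door against the stack-effect pressure
    # difference, f = dp * a; pressures and door size as quoted above
    AREA = 1.0 * 2.0                     # m^2, door of about 1 x 2 m
    for dp in (30.0, 120.0, 160.0):      # n/m^2
        force = dp * AREA                # n
        print(f"dp = {dp:5.1f} n/m^2 -> f = {force:5.0f} n (~{force/9.81:.0f} kg)")
    # ~6, ~24 and ~33 kg, in agreement with the quoted 6, 24 and 32 kg
    # (the text evidently rounds with g ~ 10 m/s^2)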
2.2 an adequate quantity of outdoor air – ventilation
pettenkofer's classic value provides a basic measure for rooms where people are the main source of air pollution: for an optimum co2 concentration of 1000 ppm = 1800 mg/m^3 = 0.1 vol.%, he prescribes 25 m^3/h per person. according to ashrae standard 62-1989r, this value can be accepted (after rounding, 7.5 l/s per person = 27 m^3/h per person) for an unadapted person, while for adapted persons it is decreased to 2.5 l/s per person (9 m^3/h per person). by applying these values, the quantity of air delivered into a room can be adjusted to the number of people: as the number of people increases, the co2 concentration also increases, and thus the air rate must be increased, e.g., by increasing the fan speed under the control of a co2 sensor in the room (see fig. 4). energy savings, an important factor, are also made as a result of the decreased energy consumption for warming the outdoor air.

fig. 4: the microclimate in a lecture hall controlled by a thermostat and a co2 sensor (1 air rate sensor, 2 controlling flap, 3 electrical heater, 4 control unit, 5 room thermostat, 6 co2 sensor)

the following equation should be used in rooms where odor agents released from building materials are decisive for the outdoor air rate:

    r_b = g_b / (3.6 (φ_itvoc − φ_etvoc))   [l/s·m^2]   (1)

where r_b is the minimum outdoor air rate related to 1 m^2 of floor [l/s·m^2], g_b is the tvoc rate produced within the interior [µg/h·m^2 floor] (see table 1), φ_etvoc is the tvoc concentration in the outdoor air [µg/m^3] (see table 2), and φ_itvoc is the prescribed tvoc limit [µg/m^3] (see table 2 and fig. 6). the required outdoor air rate is the sum of the two air rates (if both apply), i.e., the rate calculated from tvoc and the rate based on co2.

an air change rate (the number of times the air in a room is changed during one hour) is often prescribed; the outdoor air rate can then be obtained by multiplying the air change rate by the room volume. the value calculated in this way can differ from the outdoor air rate estimated from the air rate necessary for one person, as is evident from the following examples.

example 1: a lecture hall crowded with students. the outdoor air rate related to one person can be lower than the prescribed value, e.g., 27 m^3/h per person, i.e., too low, even if a high air change rate of six has been taken into account.

example 2: a hangar in which one person is repairing an airplane. the outdoor air rate related to one person can be much higher than the prescribed value, e.g., 27 m^3/h per person, i.e., excessively high, even if an air change rate of only one has been taken into account.

the outdoor air rate related to one person is decisive in each case, i.e., if calculations are based on the air change rate, the results must be verified by calculations of the air rates related to one person.
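equation (1) is easily evaluated; in the python sketch below, the emission rate is taken from table 1, while the indoor tvoc limit and the outdoor concentration are my illustrative choices (cf. table 2):

    # outdoor air rate per m^2 of floor from the tvoc balance, eq. (1):
    # r_b = g_b / (3.6 * (phi_i - phi_e))
    def outdoor_air_rate(g_b, phi_i, phi_e):
        """g_b in ug/(h*m^2), concentrations in ug/m^3 -> r_b in l/(s*m^2)."""
        return g_b / (3.6 * (phi_i - phi_e))

    # illustrative case: an existing office emitting 1550 ug/(h*m^2) (table 1),
    # an assumed indoor limit of 300 ug/m^3, bad city air with 70 ug/m^3 (table 2)
    r_b = outdoor_air_rate(1550.0, 300.0, 70.0)
    print(f"r_b = {r_b:.2f} l/(s*m^2)")   # ~1.87 l/(s*m^2)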
location                                   mean    range        source                   note
existing buildings, offices                1,550   100–4,890    eur 14449 en (1992)      converted olf value
  working hours (9–11)                     360     132–691      ekberg (1993)
  night-time (5–7)                         –       90–467       ekberg (1993)
schools (classrooms)                       1,550   620–2,780    eur 14449 en (1992)      converted olf value
kindergartens                              2,060   1,030–3,810  eur 14449 en (1992)      converted olf value
assembly halls                             2,570   670–6,790    eur 14449 en (1992)      converted olf value
dwellings                                  720     360–1,080    eur 14449 en (1992)
new pvc floor tiles                        795     450–1,400    brown and crump (1993)
low-polluting buildings (target values)    –       260–510      eur 14449 en (1992)      converted olf value
solid flooring materials (vinyl, carpet, chipboard)   typically below 55   crump et al (1997)   emission rates constant
wall and ceiling materials (crump et al (1997), emission rates constant):
  plasterboard                             max 6
  6-mm plywood                             max 10
  15-mm plywood                            max 12
bituminised fibre board, asphalt           max 30                crump et al (1997)      emission rates constant
pvc skirting board                         below the detection limit   crump et al (1997)
polythene spacer                           4 (when heated to 40 °c)    crump et al (1997)
rockwool (cavity wall)                     below 15              crump et al (1997)      emission rates declined slowly
table 1: tvoc emission rates in a building interior [µg·h^−1·m^−2 floor]

location                      tvoc [µg·m^−3]   co2 [ppm]   source                               note
at sea                        0                300–340     icao 1964; eur 14449 en              converted decipol value
in a city, good air quality   14; 15–18        350         eur 14449 en; ekberg 1993            converted decipol value
in a city, bad air quality    71; 23–98        350–400     eur 14449 en; brown and crump 1993   converted decipol value
table 2: tvoc and co2 concentrations in outdoor air

furthermore, if recirculation is used in an air handling system, the outdoor air rate must not be lower than 10 % of all the air delivered into the room.

2.3 air filtration
a special material is needed for odor absorption: activated carbon, charcoal or a synthetic resin (e.g., amberlite). odors can also be removed by an odor scrubber (odor washer), a biowasher, catalytic burning, biofilters, and even by plants [6]. activated carbon has better odor-removing properties than charcoal (carbonized wood). it is produced by the action of hot steam (800–1000 °c) and zinc chloride on charcoal. this process enlarges and purifies the cells; as a result, the internal surface is enlarged to 500–1500 m^2/g, i.e., on average to the remarkable value of 1000 m^2/g. other kinds of coal and peat are also used, and even coconut shells (see fig. 5). activated carbon absorbs very little air humidity, and does not change the chemical or psychrometric condition of the air. the odor removal efficiency depends on the contact period of the gas with the carbon: for at least 80 % efficiency at an air velocity of 2.5 to 3.0 m/s, the thickness of the carbon layer should be 2.5 cm. the efficiency depends on the carbon retention: the absorbed odor quantity [g] related to 100 g of carbon (see table 3). activated carbon is mostly applied in air cleaners. it is evident from the retentions presented in table 3 that this device is not equally efficient against all odors: some odors are trapped very efficiently (e.g. human body odors), while others are only slightly reduced, e.g., fish odors and odors of preparations for plant protection.
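two small numbers implied by the figures above: the contact time of the air with the recommended 2.5 cm carbon bed, and the odor mass a filter can hold at a given retention from table 3 (the filter mass here is my example value):

    # contact time of air with the carbon bed, and the absorbed odor mass
    # implied by the retention values of table 3 (illustrative calculation)
    bed_thickness = 0.025            # m, quoted for >= 80 % efficiency
    for v in (2.5, 3.0):             # m/s, quoted air velocities
        print(f"v = {v} m/s -> contact time = {bed_thickness / v * 1000:.0f} ms")

    carbon_mass = 500.0              # g of activated carbon in a filter (my example)
    retention_body_odor = 35.0       # g per 100 g of carbon, from table 3
    capacity = carbon_mass * retention_body_odor / 100.0
    print(f"a {carbon_mass:.0f} g bed can hold ~{capacity:.0f} g of body odor agents")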
fig. 5: microscope photos of carbonised and activated coal (carbonised coconut shell, carbonised wood, activated coal, activated coconut shell)

odor agent                                                           retention [%] [g/100 g coal]
solutions                                                            25 to 35
exhaust gases                                                        20
body odor                                                            35
ether                                                                15
onion, garlic                                                        15
antiseptics                                                          30
tobacco smoke                                                        25
aliphatic mercaptans (oil refineries, chemical industry)             1
aromatic mercaptans (oil refineries, chemical industry)              15
aliphatic amines (fishing industry)                                  1
higher hydrocarbons (>c14) (chemical industry)                       15
aliphatic chlorohydrocarbons (chemical industry)                     15
plant protection chemicals (agriculture stocks, chemical industry)   0.2
table 3: retention values for activated coal (charcoal)

2.4 odor removal by plants
indoor plants can be used as room detectors and co2 consumers, and some are also able to clean the air of acetone, benzene, co, ethanol, formaldehyde, methanol, so2, toluene and some vocs (see table 4). a lawn can be effective in an atrium: an area of 15 × 15 m is a sufficient source of oxygen for a family of four, and it also cleans so2, co2 and hydrogen fluoride from the air. it has not been satisfactorily explained what happens to these absorbed chemicals: whether they are only stored, or perhaps used for energy consumption. it has already been proven by nasa that some of them nourish microorganisms that grow on and near the roots. therefore flowers in a vase, or plants growing in hydroponics, are useless for this purpose; potted plants growing in a substrate enriched by active carbon are beneficial. one well-developed plant, with air streaming uniformly around it at a velocity of 0.10–0.15 m/s, should be used per 9 m^2 of floor area.

     odor agent          source                                        affecting plants
1    acetone             body odor                                     lily
2    benzene             office solvents                               chrysanthemum, gerbera, lily
3    ethanol             alcoholic beverages, cleaning agents          lily
4    hydrogen fluoride   glass processing                              grass
5    formaldehyde        wood products, especially plywoods and        aloe, azalea, philodendron,
                         chipboards, parquet sealants, cork,           gum-tree, lily, poinsettias,
                         laminates, glues, cleaning agents and         tulip
                         disinfectants, cosmetics, open fireplaces,
                         gas cookers, tobacco smoke, textiles
6    methanol            cleaning agents                               lily
7    sulphur dioxide     cars, boiler rooms                            grass
8    toluene             cleaning agents                               areca palm, lily
9    trichloroethylene   cleaning agents                               lily
10   voc                 cleaning agents, carpets, glues,              philodendron, golden pothos
                         paintings, rubbish, solvents
11   exhaust fumes       cars                                          chestnut tree
table 4: odor removal by plants

2.5 deodorization
deodorization is the masking of odors: covering an unpleasant odor with another, stronger and more pleasant smell, a so-called deodorant: formaldehyde, acetaldehyde, ozone, etc. however, deodorants cannot be used in high concentrations, owing to their toxicity: e.g., an ozone concentration should not exceed 0.1 mg/m^3 (0.5 ppm). deodorization has been known for a long time. incense has been used for ages. it is made by cutting the shrub boswellia carteri: from notches cut into the tree, a milky juice flows, which forms yellow balls in the air, called incense (olibanum). it contains 4 to 7 % of ethereal oils. if it is burnt on glowing coals, pleasant smelling smoke is produced. nowadays incense is used as an ingredient in scented candles, and is mixed with scented woods to produce a special smelling material used during christmas.

2.6 intensive air ionization
odors can also be removed by intensive ionization of the air, i.e., by forming negative aeroions of high concentration. even the typical odor of a bar can be removed during the night in this way, so that the room can be used for serving breakfast the following day. air cleaners equipped with ionizers thus have a new field of application.

2.7 neutralization with ionized ozone
ionized ozone is a very effective oxidant: the molecules of odor agents are cracked and changed into water vapor, carbon dioxide and other substances (without bad smells).
the ozone concentration must be watched carefully due to its toxicity. this method should be applied at night, when no people are present. the ionized ozone is supplied to the recirculated air (see fig. 6).

fig. 6: odor removal by ionized ozone (outside air intake, intake air duct, air handling system, recirculating air duct, supply air distribution duct, return air duct, ionized ozone generator)

2.8 bake-out procedure
a new way of removing voc from a building interior is the so-called bake-out procedure: the indoor temperature is raised to 30–38 °c for two or more days, and simultaneously the ventilation is increased [4]. this is even required by the authorities in the state of california. practical experience has not yet been reported in the literature.

references
[1] bsr/ashrae standard 62-1989r: ventilation for acceptable indoor air quality.
[2] eur 14449 en: guidelines for ventilation requirements in buildings. report no. 11. commission of ec, luxembourg, 1992.
[3] fanger, p. o.: introduction of the olf and the decipol units to quantify air pollution perceived by humans indoors and outdoors. energy and buildings, vol. 12, no. 1/1988, p. 1–6.
[4] hicks, j. et al.: building bake-out during commissioning: effects on voc concentration. in: proc. of the fifth int. conf. on indoor air quality and climate, vol. 3. toronto (can.), 1990.
[5] iaqu: odor evaluation as an investigative tool. indoor air quality update, 1991, p. 10–13.
[6] jokl, m.: microenvironment: the theory and practice of indoor climate. springfield (illinois, u.s.a.): thomas publisher, 1989, p. 416.
[7] jokl, m.: the theory of the indoor environment of buildings. in czech. praha: vydavatelství čvut, 1993, p. 261.
[8] jokl, m. v., leslie, g. b., levy, l. s.: new approaches for the determination of ventilation rates: the role of sensory perception. indoor environment, vol. 2, no. 2/1993, p. 143–148.
[9] jokl, m. v.: evaluation of indoor air quality using the decibel concept. int. j. of environmental health research, vol. 7, no. 4/1997, p. 289–306.
[10] kaiser, e. r.: odor and its measurement. in: air pollution. academic press, 1962, p. 50–527.
[11] mc burney, d. h., levine, j. m., cavanaugh, p. h.: psychological and social ratings of human body odor. personality and social psychology bulletin, no. 3/1977, p. 135–138.
[12] oseland, n. a.: a review of odor research and the new units of perceived air pollution. watford: bre, 1993, p. 24.
[13] parine, n.: the use of odor in setting ventilation rates. indoor environment, vol. 3, no. 3/1994, p. 87–95.
[14] pettenkofer, m.: über den luftwechsel in wohngebäuden. münchen, 1858.
[15] the human body. bratislava: gemini, 1992.

miloslav v. jokl, ph.d., sc.d., university professor
czech technical university in prague, faculty of civil engineering, thákurova 7, 166 29 prague 6, czech republic
phone: +420 2 2435 4432, fax: +420 2 3333 9961, e-mail: miloslav.jokl@fsv.cvut.cz
acta polytechnica doi:10.14311/ap.2013.53.0617 acta polytechnica 53(supplement):617–620, 2013 © czech technical university in prague, 2013 available online at http://ojs.cvut.cz/ojs/index.php/ap

rx-j0852−4622: the nearest historical supernova remnant – again

bernd aschenbach∗
pr vaterstetten, mozartstraße 8, 85591 vaterstetten, germany
∗ corresponding author: bernd.aschenbach@t-online.de

abstract. rx-j0852−4622, a supernova remnant, is demonstrated to be closer than 500 pc, based on measurements of the angular radius, the angular expansion rate and the tev γ-ray flux. this is a new method for limiting the distance to any supernova remnant with hadronically induced tev γ-ray flux. the progenitor star of rx-j0852−4622 probably exploded in its blue supergiant wind, like sn 1987a, preceded by a red supergiant phase. a cool dense shell, expected around the outskirts of the red wind, may have been identified. the distance (200 pc) and age (680 yr) of the supernova remnant, as originally proposed, are supported.

keywords: supernova remnants, stellar evolution, multi-wavelength, x-rays, tev γ-rays, cosmic rays, individual: rx-j0852−4622, vela jr., sn 1987a, sn 1006.

1. introduction
rx-j0852−4622, nicknamed vela junior (vela jr.), is a supernova remnant (snr) discovered in the rosat all-sky survey [3]. it is located along the direction towards the south-eastern corner of the vela snr, and it is completely covered, even beyond its boundaries, by the emission from the vela snr, such that vela jr. does not show up against the very bright soft x-ray emission from the vela snr. only at x-ray energies > 1.3 kev does vela jr. become visible in x-rays. it is a patchy, narrow, shell-type source with a diameter of 2° of almost perfectly circular shape. since the discovery, multi-wavelength observations have confirmed the snr status, including non-thermal radio emission; hard x-ray emission with soft x-ray emission apparently missing; mev γ-ray line emission from radioactive 44ti, 26al and 44ca, though of low significance; gev γ-ray continuum emission; and tev γ-rays. a summary of the observations and interpretations can be found e.g. in [2, 10, 12]. aharonian et al. [2] provide evidence that the tev γ-rays are likely not of leptonic origin (inverse compton scattering of electrons and photons) but of hadronic origin (nuclear collisions between snr-produced cosmic ray protons and nuclei of the ambient matter). this is supported by the mev γ-ray measurements of tanaka et al. [12]. an open question is still the distance and the age of vela jr.; the numbers range between 200 pc and 1 kpc, and between 680 yr and 5000 yr. i show that vela jr. is very close, and likely to be as young as originally suggested [4].

2. implications of observations
three unambiguous key measurements of vela jr. have been made: the angular radius r_a = 1° [3], the angular velocity of the shock wave v_s,a = 0.84″/yr ± 0.23″/yr [8], and the total energy of cosmic ray protons w_p needed to explain the tev γ-ray flux, w_p = 10^49 erg d_2^2/n_0 [2]. the units are cm^−3 for the particle density n_0 of the ambient matter, and 200 pc for the distance to the source, d_2. from x-ray measurements, additional information is obtained for an upper limit of the ambient density of n_0,x < 0.029 cm^−3 (d_1 f)^−1/2 [11], where d_1 is the distance in kpc, and f is a filling factor < 1. in contrast to the other information, this is not definite, because it depends on the interpretation of the x-ray spectral measurements.

2.1. distance
the kinetic energy e_0 available from an sn explosion is shared mainly between the acceleration of particles to cosmic rays (w_p) and the expansion of the explosion wave into the ambient medium, driven by some energy e_k, such that e_0 ≥ w_p + e_k with e_k = 4.1 × 10^49 erg n_0 d_2^5. e_k represents the sum of the kinetic energy and the associated thermal energy, the latter of which is due to the shock-wave heating of the ambient matter. as w_p scales with (d_2^2/n_0) and e_k scales with (d_2^5 n_0), a single absolute maximum of d_2 exists for any given e_0, which also fixes n_0. any filling factor, additionally introduced, does not change the maximum distance. the normalization factors for w_p and e_k are known from the measurements; for vela jr. they are w_p,1 = 10^49 erg and e_k,1 = 4.1 × 10^49 erg, respectively (see above), for d_2 = 1 and n_0 = 1. for e_0 = 10^51 erg the distance d is < 500 pc. for e_0 = 3 × 10^50 erg, d < 355 pc, and for e_0 = 10^50 erg, d < 260 pc. the maximum of d is realized if e_0 is shared evenly between w_p and e_k. for instance, for w_p = 0.1 e_0 and e_0 = 10^51 erg, d < 435 pc, and d < 225 pc for e_0 = 10^50 erg, respectively. the reverse case, i.e. w_p = 0.9 e_0, does not change the upper limit of d, but it changes the associated n_0 slightly. finally, the uncertainties of the measured values of v_s,a, ±0.23″/yr, change the distances quoted by at most 10 %. this means that there exists a firm upper limit of d < 550 pc; only if the event were a hypernova could the distance be larger. the method outlined above can, of course, be applied to any other snr which emits tev γ-rays of hadronic origin. the upper limit of the distance, i.e. d_2,max, is generally given by the relation

    d_2,max = ( 0.25 (e_0^2/(w_p,1 e_k,1)) v_s,a^−2 r_a^−3 )^{1/7};

e_0 = 10^51 erg is the maximum energy generally attributed to an sn explosion; w_p,1 is the cosmic ray proton energy derived from the measured tev γ-ray spectrum, normalized to d_2 = 1 and n_0 = 1 cm^−3 (for the procedure see [2]); e_k,1 = 5.8 × 10^49 erg is the energy computed for d_2 = 1 and n_0 = 1 cm^−3, an angular expansion (shock velocity) of 1″/yr and an angular radius of 1°. this method appears to be a powerful tool for setting a distance upper limit on tev γ-ray emitting snrs, because it depends on angular measurements alone. for instance, the tev γ-ray flux, if of purely hadronic origin, and the measurements of v_s,a and r_a quoted in [1], predict an upper limit of d = 1.85 kpc for sn 1006, using this new method.
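the distance bound is a one-line function of the measured angular quantities; the python sketch below (variable names mine, normalizations as defined above) reproduces the < 500 pc limit for vela jr.:

    # upper limit on the distance of a hadronic tev snr (in units of 200 pc):
    # d_max = (0.25 * e0**2/(wp1*ek1) * v**-2 * r**-3)**(1/7),
    # with v in arcsec/yr and r in degrees, normalized as in the text
    def d2_max(e0, wp1, ek1=5.8e49, v_arcsec_yr=1.0, r_deg=1.0):
        return (0.25 * e0**2 / (wp1 * ek1) / v_arcsec_yr**2 / r_deg**3) ** (1.0 / 7.0)

    # vela jr.: wp1 = 1e49 erg, v = 0.84 arcsec/yr, r = 1 deg
    d = d2_max(1e51, 1e49, v_arcsec_yr=0.84)
    print(f"vela jr.: d < {200 * d:.0f} pc for e0 = 1e51 erg")   # ~500 pc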
2.2. density
thermal x-ray emission is expected from the shock-wave heating of the ambient medium. the upper limit n_0,x of slane et al. [11] sets further constraints on the distance when used in the expressions for w_p and e_k. for e_0 = 10^51 erg and f = 1, the upper limit of d is reduced to < 420 pc, with n_0,max = 0.05 cm^−3. for e_0 = 10^50 erg, d < 250 pc and n_0,max = 0.06 cm^−3. values of f < 1 lower the acceptable maximum distance further, but do not change the acceptable n_0 significantly. slane et al. [11] note that the upper limit of n_0,x is valid only for temperatures kt > 1 kev, which corresponds to a shock velocity of v_s = 860 km/s; with v_s,a = 0.84″/yr, the upper limit of n_0,x is therefore not applicable for d < 235 pc. this is about the distance to the vela snr, and the temperatures of the vela snr across the area of vela jr. range between 0.5 and 0.9 kev [9]. this means that the thermal x-ray emission from vela jr. and the vela snr can easily be mixed up, and they cannot even be discriminated spectroscopically. the majority of the soft x-ray emission received from the area covered by vela jr. could be attributed to vela jr. rather than to the vela snr. taking the rosat soft x-ray measurements, n_0 could be a factor of 10 or more higher than the slane et al. limit, i.e. > 0.6 cm^−3 [9].

2.3. age
the evolutionary state of an snr is often described by the expansion parameter β, defined by r ∼ t^β, with r the radius and t the age. this relation leads to β = v_s/v, with v_s the current shock velocity and v = r/t the mean expansion velocity. given the angular value of v_s, i.e. v_s,a, and the angular radius r_a, the age would be calculated as t = β r_a/v_s,a. formally, 0 ≤ β ≤ 1. a value of β close to 1 means that the snr is almost freely expanding; β = 0.4 describes the adiabatic state, with the snr expansion slowed down by the ambient matter. for this case to apply, the mass overrun by the shock wave (the swept-up mass) should be much greater than the mass of the sn ejecta. the radiative phase starts around v_s ∼ 250 km/s, equivalent to kt ∼ 0.1 kev; given v_s,a, the snr would then be located at a distance of 65 pc. this would have made vela jr. a very bright euv source, which, however, is not listed in any of the euv catalogues. for β = 0.4, vela jr. being in the adiabatic phase, t ∼ 1700 yr (1300 ÷ 2400 yr). the maximum explosion energy of e_0 = 10^51 erg and the maximum density of n_0,x = 0.05 cm^−3 allow a maximum distance of 420 pc. v_s,a then corresponds to 1700 km/s, equivalent to kt ∼ 4 kev. the expected thermal hard x-ray flux is fairly low at the level of n_0,x and is probably not detectable given the sensitivity of present instruments, but see [7]. it is noted, however, that the energy in cosmic ray protons would then be 8.8 × 10^50 erg: a cosmic ray acceleration efficiency of 88 % would be a surprise, and an overrun mass of just 2.4 solar masses casts doubt on the assumption of the adiabatic state. finally, for vela jr. close to free expansion, the ambient density, which needs to be low for free expansion to apply, is nevertheless so high that it is inconsistent with d > 500 pc and w_p ≤ 10^51 erg. summarizing, the analysis shows that the concept of an explosion of the vela jr. progenitor star into a medium of constant, isotropic matter density is not applicable, and estimates of age and density on this basis are irrelevant.
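the quoted ages follow directly from t = β r_a/v_s,a; a quick check (python, values from the text):

    # age estimate t = beta * r_a / v_(s,a) from the measured angular quantities
    R_A = 3600.0          # arcsec, angular radius of 1 degree
    V_SA = 0.84           # arcsec/yr, measured angular shock velocity

    for beta in (1.0, 0.4):                      # free expansion / adiabatic
        t = beta * R_A / V_SA
        print(f"beta = {beta:3.1f} -> t ~ {t:4.0f} yr")
    # beta = 0.4 gives ~1714 yr, the 'adiabatic' age of ~1700 yr quoted above;
    # beta = 1 would give ~4300 yr (an upper bound for free expansion)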
2.4. circumstellar environment of the sn progenitor star
recalling sn 1987a, it is proposed that the vela jr. progenitor was a star with a very low density wind, like a blue supergiant, shortly before its explosion. after passing the low density region, the sn shock wave may have hit the base of the preceding red wind, which is of much higher density, preserving pressure equilibrium, i.e. (n v^2) = const. the relevant density before the velocity slow-down is the mean density delivered by the blue wind up to the density jump, and the density of the red wind at the radius of the density jump afterwards. the relevant velocities are the initial quasi-free expansion velocity v_fr of the sn, taken over the full distance up to the location of the density jump, on the one hand, and the current shock velocity v_s on the other hand. the jump condition translates into

    q = (dm_b/dt / v_b) / (dm_r/dt / v_r) = (16/27) (v_s,a/v_fr,a)^2,

with v_s,a the current angular shock velocity and v_fr,a the initial angular velocity during the quasi-free expansion phase. chevalier & fransson [6] published typical mean values for the wind parameters, i.e. the mass loss rate over the wind velocity, of red supergiants and of blue supergiants, which are around dm_b/dt / v_b = 10^−6 solar masses/yr / 550 km/s for a blue wind, and dm_r/dt / v_r = 10^−5 solar masses/yr / 20 km/s for a red wind. the blue wind parameters are consistent with the observations of sn 1987a, the progenitor star of which had been identified as a blue supergiant. with this set of wind parameters and d_2 = 1, one gets v_s = 800 km/s and v_fr = 10 200 km/s for the quasi-free expansion phase velocity. for t = 680 yr, the encounter of the quasi-free expanding shock wave with the red wind base would have occurred at 0.91 r after 333 yr. a dense shell, made up mainly of the red wind matter, with a width of 0.09 r, is traversed by the shock wave at a speed of 800 km/s. the outer zone of a stellar wind is expected to be surrounded by a thin, dense and cool shell, made up of the swept-up ambient interstellar matter. in the north-westerly direction of vela jr., just outside the outer boundary of the vela snr, the snr puppis-a is located, which is a very bright x-ray source with a diameter of 45′. a spatially resolved spectral study of the rosat measurements across the surface of puppis-a shows an excess of low energy absorption above the mean interstellar absorption. the excess absorption is spatially confined to a region looking like a slightly curved lane, or filament, which stretches right through the middle of puppis-a from its northeastern boundary to its southwestern boundary. this indicates the presence of some cool matter leading to extra, spatially confined absorption on top of the interstellar absorption [5, 9]. this absorption lane could be considered as a fraction of a complete shell produced by the red wind; the majority of the shell cannot be observed, because sufficiently bright background x-ray sources are missing. from this interpretation it follows that dm_r/dt / v_r = m_l / (r_abs,a d), with m_l the total mass loss of the progenitor star by its red wind, or numerically dm_r/dt / v_r = 9 × 10^−6 solar masses/yr / 20 km/s · m_l,10/d_2, with m_l,10 in units of 10 solar masses. the density n_r of the red wind at its base, hit by the sn shock wave, is then n_r = 0.1 cm^−3 m_l,10/d_2^3. using the measured flux of the tev γ-rays [2], it follows that w_p = 10^50 erg d_2^5/m_l,10. this demonstrates that even under extreme conditions, i.e. large values of w_p and m_l,10, d can be only somewhat larger than 200 pc; in fact, the result suggests that d < 200 pc. if so, the requirements for the associated total sn energy and cosmic ray energy would be somewhat relaxed. the age t of the snr is limited by the acceptable value of q = (16/27)(v_s,a/r_a)^2 (t/a)^2 with a = v_fr/v > 1, v = r/t. dm_r/dt / v_r is basically fixed by the observation of the absorbing outer shell built up by the red wind. for v_fr = 10 000 km/s, the required density jump is consistent with the blue wind parameters suggested for sn 1987a, shown above. observations have shown that v_fr ∼ 40 000 km/s over the first 6 yr for sn 1987a, with an abrupt deceleration following.
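the consistency of the adopted wind parameters with the jump condition can be verified numerically (python, numbers from the text):

    # consistency of the wind-parameter ratio with the jump condition
    # q = (16/27) * (v_s / v_fr)^2, using the numbers quoted above
    blue = 1e-6 / 550.0    # (dM/dt)/v for the blue wind  [Msun/yr per km/s]
    red  = 1e-5 / 20.0     # (dM/dt)/v for the red wind   [Msun/yr per km/s]
    q_winds = blue / red

    v_s, v_fr = 800.0, 10_200.0          # km/s, as derived in the text
    q_jump = (16.0 / 27.0) * (v_s / v_fr) ** 2

    print(f"q from wind parameters: {q_winds:.2e}")   # ~3.6e-03
    print(f"q from jump condition:  {q_jump:.2e}")    # ~3.6e-03 -- consistent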
applying such an extreme quasi-free expansion velocity to vela jr. for a duration of up to about 100 yr after the explosion, one gets dm_b/dt / v_b = 10^-7 (solar masses/yr)/(1000 km/s). but an upper limit on v_fr can be derived from the width of the snr shell. figure 3 of iyudin et al. [7] shows that the hard x-ray emission is confined to a shell of < 6′ to 7′ thickness measured by xmm-newton. this would result in v_fr less than 15 000 km/s. it appears that the chandra images show a shell as thin as 5′ [10], implying v_fr ∼ 10 000 km/s. in summary, the blue wind parameters of the vela jr. progenitor star were probably in the range between 10^-6 (solar masses/yr)/(550 km/s) and 10^-7 (solar masses/yr)/(1000 km/s) during its final evolutionary phase. the wind parameters of the red wind and the blue wind, respectively, are surprisingly close to the typical mean values. sn 1987a appears to be the only example so far of a blue supergiant sn, and vela jr. may be a further example. such sne might be very hard to detect in the radio and low energy x-ray regimes, because the luminosity depends heavily on the ambient matter density. for instance, sn 1987a escaped detection for 6 years and might have escaped for a much longer period of time. but after these 6 years the shock wave started to encounter the 'inner ring' with its much higher density, which slowed down the propagation of the shock wave dramatically. when, and whether, it will have reached the base of the red wind, as in the case of vela jr., remains to be seen. in this sense the observation of an absorbing shell produced by the red wind in the case of vela jr. appears to be unique. the absorbing shell attributed to the red wind is located at an angular distance of r_abs,a = 6.5° measured from the center of vela jr., which corresponds to r_abs = 22 pc for d_2 = 1. it shows a core width of ∼ 5′ and a full width of ∼ 30′, the latter of which is limited by the sensitivity of the measurements. the absorbing column density in the core region is n_h = 1.7 × 10^21 cm^-2 (0.8 ÷ 1.9 × 10^21 cm^-2) [5]. this allows computing the ambient density n_am of the matter through which the red wind propagated, i.e., n_am = 0.55 cm^-3 · n_h,21/d_2. the analysis of the vela snr at a distance of 250 pc reveals a mean ambient density of ∼ 0.35 cm^-3, derived also independently from the interstellar absorption column density, with spatial variations of a factor of > 2. it appears that the identification of the absorption lane as a fraction of an absorption shell produced by the red wind of the vela jr. progenitor star and the distance of ∼ 200 pc are fairly plausible.
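the distance scalings collected in this section can be tabulated over d_2 = d/200 pc; a small sketch that simply re-evaluates the formulas quoted above:

```python
# tabulating the distance scalings quoted in this section over
# d_2 = d / 200 pc, with m_l,10 the red-wind mass loss in units of
# 10 solar masses; a plain re-evaluation of the text's formulas.

def n_red(d2, ml10=1.0):       # cm^-3, red-wind density at its base
    return 0.1 * ml10 / d2 ** 3

def w_p(d2, ml10=1.0):         # erg, energy in cosmic-ray protons
    return 1e50 * d2 ** 5 / ml10

def n_am(d2, nh21=1.7):        # cm^-3, ambient density around the red wind
    return 0.55 * nh21 / d2

for d2 in (0.5, 1.0, 1.5, 2.0):
    print(f"d = {200 * d2:3.0f} pc: n_r = {n_red(d2):.2f} cm^-3, "
          f"w_p = {w_p(d2):.1e} erg, n_am = {n_am(d2):.2f} cm^-3")
# the steep d_2^5 growth of w_p is what pushes the distance down:
# even w_p = 1e51 erg allows only d_2 ~ 10^(1/5) ~ 1.6 for m_l,10 = 1
```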
2.5. interstellar absorption
the fits of models available in standard x-ray spectral software packages to the measured broad band x-ray spectra (0.5 ÷ 10 kev) of vela jr. appear to be consistent with a pure non-thermal spectrum with an absorption column density of n_h > 2 × 10^21 cm^-2 [10], which is at least 20 times higher than the absorption in front of the vela snr at a distance of 250 pc. this value of n_h, interpreted as interstellar absorption, and therefore some measure of the distance, has raised the claim that vela jr. lies behind the vela snr, and that it is even much more distant. but even at the largest distances suggested, the mean interstellar density would need to be close to or to exceed 1 cm^-3, for which there is no observational evidence otherwise. narrow band (0.2 ÷ 2 kev) analysis of the rosat spectra is consistent with a much flatter non-thermal spectrum, and n_h is consistent with, if not lower than, that measured for the vela snr [4]. in particular at the low energy end (< 2 kev), the statistical quality of the fits to the broad band measurements is not convincing, and one may wonder whether the source spectrum is not much more curved than the modified power-law models of the standard software packages offer. the flattening of the observed spectrum for e < 2 kev, which is the energy range extremely sensitive to interstellar absorption, is more likely not caused by interstellar absorption but of intrinsic origin, i.e., the snr-produced cosmic ray electron spectrum, for which a turn-over might occur at energies ∼ 0.1 kev (see, e.g., the models shown in fig. 17 of aharonian et al. [2]). one could turn the argument around and perhaps learn something through the measurements about the electron acceleration to cosmic ray energies and the shape of the spectrum close to the turn-over at the highest energies.

3. conclusions
angular radius, angular shock velocity and tev γ-ray flux limit the distance to vela jr. to < 500 pc for an explosion with an available energy of < 10^51 erg. the upper limits on the density associated with a high temperature plasma (> 1 kev) reduce the distance further. the sn explosion into a stellar wind environment can explain the observational results. the wind parameters, i.e., mass loss rate over wind velocity, are in agreement with those known for blue supergiant winds and red supergiant winds. it appears that the progenitor star of vela jr. exploded into its blue wind, which developed a few thousand years before the end of the progenitor's life. the blue wind phase was preceded by an expected red wind phase. the base of the red wind was probably hit by the sn explosion shock wave after 200 to 300 years. because of the high density at the red wind base the shock wave velocity dropped significantly, and the bulk of the expected thermal x-ray emission is expected at temperatures < 1 kev, and may be hidden among the vela snr emission, as the temperatures are very similar. the wind model, in particular the explosion of a blue supergiant, is reminiscent of the findings for sn 1987a. the application of a wind model also suggests that protons are preferentially accelerated to cosmic ray energies in an environment of continuous high shock velocity and low density, which keeps the losses by collisions with ambient ions low; in addition, little energy is spent on expanding and heating. this could make blue supergiant explosions favourable tev γ-ray emitters, explaining their rareness among the snr populations.

acknowledgements

references
[1] acero, f. et al.: 2010, a&a 516, 26
[2] aharonian, f.a. et al.: 2007, apj 661, 236
[3] aschenbach, b.: 1998, nature 396, 141
[4] aschenbach, b., iyudin, a.f., schönfelder, v.: 1999, a&a 350, 997
[5] aschenbach, b.: 1995, asp conf. ser. 80, 432
[6] chevalier, r.a., fransson, c.: 1987, nature 328, 44
[7] iyudin, a.f. et al.: 2005, a&a 429, 225
[8] katsuda, s., tsunemi, h.: 2012, mmsai 83, 277
[9] lu, f.j., aschenbach, b.: 2000, a&a 362, 1083
[10] pannuti, t.g. et al.: 2010, apj 721, 1492
[11] slane, p. et al.: 2001, apj 548, 814
[12] tanaka, t. et al.: 2011, apj 740
models of financing and available financial resources for transport infrastructure projects
o. pokorná, d. mocková

abstract
a typical feature of transport infrastructure projects is that they are expensive and take a long time to construct. transport infrastructure financing has traditionally lain in the public domain. a tightening of many countries' budgets in recent times has led to an exploration of alternative resources for financing transport infrastructures. a variety of models and methods can be used in transport infrastructure project financing. the selection of the appropriate model should be done taking into account not only financial resources but also the distribution of construction and operating risks and the contractual relations between the stakeholders.
keywords: transport infrastructure, project financing, financial resources, models of financing, investments, ppp (public – private partnership)

1 financing of transport investments from the commercial point of view
most transport investments are part of a broader category known as project financing. a project's viability is therefore closely connected with the valorization of the future cash flow generated by the project. there are some features specific to infrastructure project financing that make the availability of resources more difficult. these are an infrastructure project's:
• long life-cycle
• relatively low operating costs
• need for large capacity resources
• long construction period (2–7 years)
as regards the cash flow this means:
• negative cash flow throughout the construction period, which is usually longer than in the case of ordinary industrial projects. this is an important factor in the risk to the investor.
• slow increase in the cash flow at the beginning of operation, caused by large interest payments on loans.
• high cash flow after amortization.
a typical feature of transport infrastructure projects is that they are expensive and take a long time to construct. this is caused partly by the nature of the projects themselves and partly by the fact that they aim to meet a future demand. this represents a long-term tying up of large resources, while the regularity and the level of the revenue remain uncertain. transport infrastructure financing has traditionally lain in the public domain. by its nature, society should itself maintain the quality and quantity of its infrastructures according to present and future developments. given that the infrastructures of the individual transport modes are interconnected in the integrated transport infrastructure of a country, it is obvious that the transport infrastructure should be managed comprehensively and should be subject to a centralized investment and development regime. a tightening of many countries' budgets in recent times has led to the exploration of alternative resources for financing the transport infrastructure. tax revenues are insufficient to maintain the capital necessary for the development of the transport infrastructure. therefore it is desirable and natural to allow private investors to enter the market or to let users pay for their "consumption". in view of the long duration of many projects, there arises the problem of generating cash flow that will be sufficient to ensure debt servicing and an appropriate return on the capital invested.

2 available financial resources
finance resources for transport infrastructure projects can be divided into public and alternative. these can then be divided as follows:
• public resources
  • state budget
  • state transport infrastructure fund
  • other government subsidies and grants
  • non-budget public resources
  • external subsidy funds and programmes (e.g.
phare, ispa)
• other – alternative resources
  • loans
    • from domestic banks
    • from foreign commercial banks
    • from export credit institutions
    • from international financial institutions
  • entry to capital markets
    • bond issue programmes
    • stock issue programmes
  • leasing
  • guarantees, bills
  • fee and toll collecting – user pays
  • subsidiary company purchasing
  • project financing via public-private partnership – mixed scenarios for private capital involvement

3 models of financing
transport infrastructure project financing can use a variety of models and methods. the selection of the appropriate model should take into account not only financial resources but also the distribution of construction and operating risks and the contractual relations between the stakeholders. following from the above, five main financial models can be stated:
type 1: fully financed by private capital. financial resources, structure owner, its management and operation are purely private. this solution is feasible only in cases where the project is able to generate revenue to cover debt servicing and provide a reasonable economic return on the capital. therefore only a quickly profit-making project can be taken into consideration.
type 2: investment is financed by private capital; a private company builds the structure. the public sector may be partially involved in operation (for example in a regulatory role), defending the public interest and property. otherwise, this type may be viewed as a modification of type 1.
type 3: the role of the promoter is performed by the public sector, but the financial resources are purely private. the structure is managed by a private sector entity, as is the operation of the facility. state guarantees may be provided.
type 4: the promoter of the project is a public sector entity, which also manages the structure and the operation, bearing the risk and other liabilities. the financing may use private resources.
type 5: the project is operated and constructed by the public sector. the financial resources are provided mainly by public funds and other subsidies and by fee and toll collection. debt service may be partially borne by the infrastructure user in the form of fees and tolls.

table 1: model types of infrastructure construction financing
type | financing private | financing public | risk bearer (construction) | risk bearer (operation) | promoter
1    | yes       | no  | private sector | private sector | private sector
2    | yes       | no  | private sector | public sector  | private sector
3    | yes       | no  | private sector | private sector | public sector
4    | yes       | no  | public sector  | public sector  | public sector
5    | sometimes | yes | public sector  | public sector  | public sector

there are many other possible solutions that can combine the above types. many different financing techniques may be involved.
4 project financing with private capital participation
in the past few years private sector participation and involvement in the development and financing of infrastructure projects has increased noticeably, ranging from management contracts to entirely new projects. some key factors contributing to this development have been:
1) governments recognized that they did not have sufficient means available to innovate, maintain and build infrastructures to match the expected and required economic growth;
2) a broad range of opinion holds that the private sector would in some cases be more effective than the public sector.
nevertheless, transport infrastructure investments are not a very attractive form for the private sector, for a number of reasons:
• excessively long amortization period
• a long period between project commencement and the first revenues
• the money is inextricably tied up in the structure
• no possibility for innovation, the product is a given constant
• threat of political changes
due to these considerations, reduced interest from investors may be encountered. private capital may be obtained by providing long-term state guarantees. these can take the following forms:
• revenue guarantees
• loss diminution
• purchase volume guarantees by the state
• set disbursement coverage
• settlement of expenses and purchases of certain equipment
• awarding other concessions for profit-making services as compensation
investing private resources through the use of governmental guarantees brings advantages to both sides. the investor is confident of a return on the invested capital, and at the same time the public sector gains new opportunities to invest while not increasing strains on the budget. this shows a way to avoid budgetary restrictions. there are many forms of private sector participation in the transport sector. the following could be suitable for infrastructure investments:
• joint venture – a company with joint involvement is set up, with public and private sector liability shared in the fields of financing, construction and operation
• full privatisation
• operations, concessions and management contracts – the private sector is in charge of operations and is remunerated through profit sharing or direct payments by the government
• boot contracts (build, own, operate, transfer) – the private sector bears full responsibility for financing, construction and operation for a set period, which matches the repayment of debts and an appropriate return on the investment. after this period the infrastructure is transferred into public ownership in accordance with the contractual arrangements. this is probably the most commonly used form of private participation in infrastructure. there are many variations on this contract, which are listed below.
• boo contracts (build, own, operate) – the private sector retains ownership and further operation
• bfsr contracts (build, finance, share revenue)
• rot contracts (rehabilitate, operate, transfer)
• blt contracts (build, lease, transfer)
• bto contracts (build, transfer, operate)
• other ppp variations
as mentioned, there are many other variations, which are distinguished mainly by the allocation of risk between the contractors. an essential fact of private financing is that all costs are borne by the final user. cash flow is generated by collecting tolls (or a similar fee that is obvious to the user). the tolls and fees collected must cover the operation cost, debt service and return on equity. project financing is usually used for large and expensive constructions of long duration, where a longer period is required to amortize the investments and to generate the required revenue. the repayment of creditors depends on the cash flow generated, without being linked to the property of the project promoters. accordingly, the bank as a creditor has a restricted possibility of recourse. the basic project financing features are:
• a "special purpose vehicle" (spv) is established, which becomes the promoter of the project. the purpose is to separate the project from the other activities of the participants. this should ensure transparency of financing. this isolation enables the use of "off-balance financing", so the project does not affect the balance sheet of its sponsors. the debtor is the spv. the sponsor's balance sheet contains only its involvement, but not the total debt.
• a close link between debt repayment and the project's future cash flow, which is estimated according to technical and project analysis (proving that future cash flow will be sufficient to cover debt repayment including interest).
• risk allocation between more than one participant is essential due to the large extent of the project. one subject can hardly be strong enough to cover and take on the whole risk. therefore risks are divided according to the contract over the many subjects involved, mainly sponsors and banks, but also other participants, e.g. suppliers, operators, public bodies and insurance companies. the way of allocating risk differs case by case, and the contract should accurately stipulate its type and extent.

5 conclusion
the field of transport infrastructure construction and its financing is the focus of repeated political and pragmatic discussions. the level of investment in transport infrastructure is within the sphere of an ordinary citizen's concern, as follows from the nature of infrastructure. the development of infrastructure and the resulting investments are issues of interest for each state, and constitute a basis for its economic progress. other infrastructure features, which cause a tying up of financial potential, similarly should not be neglected. one of these features is the long-term nature of infrastructure investments. the construction time horizon in this field usually exceeds 20 years, which puts it beyond the reach of ordinary commercial bank credit. these days european countries are desperately looking for ways to involve private capital in the process of infrastructure financing. the countries are focusing on the process of creating an environment suitable for public-private partnerships.
interested parties aim to create an optimum financial model, which will determine the roles of the individual stakeholders and the extent of their participation in the project's management, monitoring and risk sharing, and, last but not least, the total of their financial contribution and the resultant profit-sharing. the form and the size of an investor's participation are related to the period and extent of the return on investment. a precondition for the successful start of a ppp is therefore the positive efficiency of the project and a reasonable return on investments.

ing. denisa mocková
e-mail: mockova@fd.cvut.cz
phone: +420 2 2435 9160
fax: +420 2 2491 9017

ing. olga pokorná
e-mail: pokorna@mdcr.cz
phone: +420 2 514 310 59

czech technical university in prague
faculty of transportation sciences
horská 3, 128 03 praha 2, czech republic

design of a three surfaces r/c aircraft model
d. p. coiro, f. nicolosi

abstract
design of a three lifting surfaces radio-controlled model has been carried out at dipartimento di progettazione aeronautica (dpa) by the authors in the last year. the model is intended to be a uav prototype and is now under construction. the main goal of this small aircraft's design is to check the influence of the canard surface on the aircraft's aerodynamic characteristics and flight behavior, especially at high angles of attack. the aircraft model is also intended to be a flying platform to test sensors, measurement and acquisition systems for research purposes, and a valid and low-cost teaching instrument for flight dynamics and flight maneuvering. the aircraft has been designed to fly with and without canard, and all problems relative to aircraft balance and stability have been carefully analyzed and solved. the innovative configuration and the mixed wooden-composite material structure have been obtained with very simple shapes, and the whole design is focused on realizing a low-cost model. a complete aerodynamic analysis of the configuration up to high angles of attack and a preliminary aircraft stability and performance prediction will be presented.
keywords: three surfaces, tri-surfaces, canard, r/c model, design, aerodynamic analysis, performance, stability, upwash, downwash, propeller, wind tunnel test.

1 introduction
in the last twenty years canard configurations have become more and more usual, especially for light and very light aircraft. after the wrights' first flying machines, the revival of the canard configuration on classical "backward-built" airplanes has been pushed by experimental aircraft and by the evolution of the numerical and experimental tools necessary to accomplish the design of this type of configuration. some aircraft, such as burt rutan's famous designs like the varieze and long-ez, boomerang and defiant, have contributed with their commercial success to the success of the canard configuration. a canard configuration is characterized by positive features (i.e., reduced wing area and aircraft drag), but cannot always comply with the control and stability requirements for the whole c.g. range and for every flight condition, unless an artificial stability augmentation system is installed, which would be a difficult task for small aircraft. some authors [1, 2] have shown through detailed analysis the advantages and disadvantages of the canard configuration over classical aft-tailed configurations. the best compromise is to add a small horizontal surface behind the wing to compensate for the reduction of aircraft stability due to the presence of the canard surface, and so to adopt a three lifting surfaces (tls) configuration. one of the major advantages of the 3ls configuration derives from the added flexibility in selecting the aircraft geometry as concerns the relative payload / wing / fuselage position, due to the possibility of complying with control & stability requirements for a larger range of c.g. positions. some authors [3] have written papers on the theoretical minimum induced drag of tls airplanes in trim. the other significant advantage of the 3ls configuration is the reduction of the total lifting area required to fly, with a consequent reduction of total wetted area and aerodynamic drag.
the 3ls configuration has been adopted in recent years in the design of some light and commuter aircraft (eagle 150, molniya-1, etruria e-180) and for the well known piaggio p180 (the only modern transport aircraft with the 3ls configuration), and there is growing interest in this innovative configuration. design of a three lifting surfaces radio-controlled model has been carried out at dipartimento di progettazione aeronautica (dpa) by the authors in the last year. the model is intended to be a uav prototype, and is now under construction. a 3-view of the model is shown in fig. 1, and table 1 reports all the main dimensions, characteristics and weights. the maximum take-off weight is about 15 kg with a payload of about 4.0 kg and about 1 l of fuel. ostowari et al. [4] and patek & smrcek [5] have investigated, through numerical and wind-tunnel tests, the effect of the canard and its position on the global aerodynamic coefficients. our r/c model has been designed mainly to test the influence of the canard surface on aircraft aerodynamic characteristics, static and dynamic stability and flying qualities at high angles of attack. to this purpose the model has been designed with the goal of flying both with and without canard, and so the areas of the lifting surfaces are not optimized. through the shift of the payload (batteries, acquisition systems and sensors) and fuel it will be possible to modify the c.g. position between 5 and 30 % of the wing chord, to fly with the same static stability margin (ssm) with and without canard. another important motivation for the project is that the model should be a low-cost flying platform to test all sensors and acquisition and measurement systems (for both flight parameter analysis and external monitoring, i.e., climatic and ground control). as a final but not negligible advantage, the small aircraft can be an easy, low-cost system for teaching purposes (in particular useful for flight dynamics and flight maneuver reproduction and analysis). the model has been built in glass-fiber composite material with a wooden fuselage frame and wing ribs, to reduce the empty weight and to have a clean and well finished wetted surface. the fuselage shape and the lifting surfaces planform (see again fig. 1) have been chosen in order to have very simple and economical constructive solutions. the fuselage has a circular shape and has been molded on a cheap 0.2 meter diameter pvc tube.
the 3.5 hp engine and the pushing propeller have been put in the rear part of the fuselage, to have an undisturbed flow for the canard and wing surfaces. the propeller effectiveness (behind the fuselage) will be tested in the dpa wind tunnel before flight, to verify that the necessary thrust is guaranteed. the wing and canard airfoils have been chosen to give high lift at the low flight reynolds numbers, together with a contained viscous drag and a reasonable pitching moment. the wing has been designed with effective ailerons (to ensure lateral control at low speed) and with a flap, although the equilibrated maximum lift with full flap (1.96) is not much higher than the lift without flap (1.68), due to the strong increment in pitching moment that the tail is not able to compensate with reasonable elevator deflections. the predicted stall speed with flap is about 40 km/h and without flap about 44 km/h, so the model should not have any take-off or landing problem. due to the short distance between the c.g. and the vertical tail, a second vertical fin has been added below the fuselage to ensure good directional stability. this is also necessary to protect the rear propeller from contact with the ground. the design has been accomplished using a code named aereo, which has been developed in recent years at dpa to predict all aerodynamic characteristics in linear and non-linear conditions (high angles of attack) and all flight performances and flying qualities of propeller driven aircraft, and has recently been extended to deal with canard and 3ls configurations.

2 configuration and structural design
as already pointed out in the introduction, the main goal of this aircraft is to allow flight parameter measurement and estimation of canard influence (especially at high angles of attack) on aircraft aerodynamics and flight characteristics. the first step considered in the design phase was the choice of general dimensions and a weight estimation. the following design specifications were considered:
1) payload (instrumentation, battery, recording facilities, sensors, etc.) of about 4 kg.
2) fuel for about 1 h of flight (about 1 kg).
3) cruise speed around 80 km/h.
4) 3ls configuration with a significant canard surface, to allow for the evaluation of canard influence.
5) rear pushing propeller, to make the canard work in "clean" conditions.
6) very low stall speed in clean and full-flap configuration, to ensure good low-speed flight characteristics, short take-off and landing, and a consequently safer ground approach; i.e., v_s clean about 45 km/h, v_s landing about 35 km/h.
7) enough control power to allow flight at high angles of attack (to check canard effectiveness and to reach canard stall).
8) engine compatible with the aircraft dimensions and weight (engine weight around 2 kg max).
9) significant wing span, to have good climb characteristics, with wing span dimensions much larger than the canard span.
to accomplish task n. 1, and to ensure easy accommodation of the on-board instrumentation needed to measure, record and eventually transmit all flight parameters, a fuselage diameter of about 0.20 m was chosen. specifications n. 1 and 2 lead to a useful load of around 5 kg. the structural weight can then be estimated to be around 8 kg. in particular, choosing a fuselage that allows a good working distance for canard and horizontal tail, and a fuselage fineness ratio higher than 8 to ensure good propeller efficiency, a total fuselage length of 2 m comes out. the fuselage weight can be estimated at between 2 and 3 kg. assuming a weight of about 5 kg for wing, tailplanes and canard, an empty structural weight between 8 and 9 kg is expected. with the engine total weight around 2 kg, the maximum take-off weight will be around 15 kg. from the stall speed requirements (n. 6), assuming a maximum lift coefficient around 1.7 in clean condition and around 2.0 in full-flap condition, a maximum wing loading of about 15 kg/m^2 can easily be evaluated. this gives a wing surface of about 1 m^2. design requirement n. 8 indicates, after a preliminary engine market analysis, an engine with about 3 hp maximum power. the "thunder tiger pro-120" engine was chosen. the engine is characterized by a displacement of about 21 cc and a maximum power of 3.5 hp at 15000 rpm. with a propeller 0.50 m in diameter, a practical working condition around 10000 rpm (then 2.5 hp maximum power) is expected. the maximum rate of climb at s/l can be estimated through a simple formula:

rc_max = 76 (η_p Π_a / w) − 2.97 (w^(1/2) f^(1/4) / b_e^(3/2))  [m/s]

(with w in [kg], Π_a in [hp], f in [m^2] and b_e in [m]), with the following assumptions:
η_p = 0.4, the propulsive efficiency at the fastest climb speed;
Π_a = 2.5 hp, the maximum engine power at working rpm (10000 rpm);
w = 15 kg, the maximum take-off weight;
f = c_d0 · s = 0.032 m^2 (with c_d0 = 0.032 and s = 1 m^2);
b_e = b√e (with e = 0.80, the oswald efficiency factor).
imposing a maximum climb rate at s/l around 3.5 m/s (req. n. 9) then leads to a necessary wing span b of about 2.5 m. it also seems reasonable to have a wing span higher than the canard span (around 0.90 m). the horizontal tailplane dimensions were chosen to give good stability and control power for flight with and without canard. a movable part (equilibrator) extending over 44 % of the total horizontal plane chord was chosen to ensure good longitudinal control. the final configuration that was considered is shown in fig. 1, with all main dimensions reported in table 1. the configuration has the following relevant features: the simple fuselage shape allows low-cost molding, with a consequent economic advantage. in fact the fuselage skin structure in glass-fiber, with some added carbon-fiber, was simply molded with a 0.20 m pvc tube. some carbon stringers were added to increase the longitudinal stiffness. the fuselage is also characterized by two high quality wooden main spar frames allowing the wing and undercarriage connections. the wing, canard and tailplane structures are made of numerically-machine-milled wooden ribs (which assure perfect airfoil reproduction) with a mixed glass-carbon fiber composite skin.
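the sizing relations above can be checked numerically; a minimal python sketch using only the values quoted in the text (the sea-level air density is an assumption of the sketch):

```python
from math import sqrt

# numerical check of the sizing relations above, using only values quoted
# in the text; sea-level air density is an assumption.

RHO, G = 1.225, 9.81

def wing_loading(v_stall_kmh, cl_max):
    """wing loading [kg/m^2] that stalls at v_stall for a given cl_max."""
    v = v_stall_kmh / 3.6
    return 0.5 * RHO * v * v * cl_max / G

print(f"w/s for 45 km/h, cl = 1.7: {wing_loading(45, 1.7):.1f} kg/m^2")  # ~17

def rc_max(w, pi_a, eta_p, f, b, e):
    """rc_max = 76 eta_p pi_a / w - 2.97 sqrt(w) f^(1/4) / b_e^(3/2) [m/s]."""
    b_e = b * sqrt(e)
    return 76 * eta_p * pi_a / w - 2.97 * sqrt(w) * f ** 0.25 / b_e ** 1.5

# w = 15 kg, pi_a = 2.5 hp, eta_p = 0.4, f = 0.032 m^2, b = 2.5 m, e = 0.80
print(f"rc_max = {rc_max(15, 2.5, 0.4, 0.032, 2.5, 0.80):.2f} m/s")  # ~3.6
```

the wing loading comes out close to the ~15 kg/m^2 quoted, and rc_max close to the 3.5 m/s target for b = 2.5 m, which supports the reconstruction of the garbled formula.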
a thickness of about 2 mm was chosen for almost all model surfaces. a very accurate structural analysis (i.e., finite element) has not been performed to optimize thickness, dimensions and weight, but the chosen structure ensures a very good safety margin without leading to excessive weight. considering the experimental task and a possible future fully automatic flight, this design philosophy seems reasonable and efficient. the weight of the complete structure, with exact dimensions and thicknesses, was then estimated and is shown in table 2, together with the weight obtained after construction.

table 2: weight
component          | estimated | real
wing               | 4.00 kg   | 4.16 kg
fuselage           | 2.40 kg   | 2.31 kg
hor. tailplane     | 0.85 kg   | 0.80 kg
canard             | 0.42 kg   | 0.38 kg
2 vert. tailplanes | 0.55 kg   | 0.65 kg
tot structure      | 8.22 kg   | 8.30 kg

the maximum take-off weight w_to, adding the engine weight and the useful load (see table 1), is then 14.6 kg. the w_to of the model without canard is 14.2 kg. the aircraft c.g. position depends on the location of the instrumentation on board. the main goal will be to guarantee a longitudinal static stability margin (ssm) in cruise condition of about 10 % for the configuration both with and without canard. taking into account that the neutral point can be estimated to be around 15 % of the chord with canard and 40 % without, the useful load has been located in the forward part of the fuselage, at 23 cm from the nose with canard and 44 cm without. the full weight c.g. position is thus imposed to be around 5 % of the wing chord with canard and 29 % without. for the wing and the canard the high-lift low-reynolds number airfoil sd7062 [6] was chosen; the shape is reported in fig. 2. a picture of the fuselage in construction is shown in fig. 3. fig. 4 shows the wing internal structure before molding. figs. 5 and 6 show the experimental lift and drag characteristics of the sd7062 airfoil [6].

3 aerodynamic analysis
3.1 aereo code and extension to canard and 3ls configuration
a code named aereo has been developed by the authors in recent years to predict all aerodynamic characteristics of propeller driven aircraft in linear and non-linear conditions. the code is based on longitudinal [7] and lateral-directional [8] semi-empirical methods, like those proposed by j. roskam [9], mixed with more sophisticated calculations; for example, wing lift and drag predictions up to stall are performed using an in-house built code based on non-linear prandtl lifting-line theory. the code also predicts all performances and static and dynamic stability characteristics, and is similar to the well known aaa software [10]. the code was originally written to deal with the classical aft-tailed configuration, especially for light aircraft and sailplanes [11]. the code has recently been expanded and improved to deal with canard or tls configurations. with the simple horse-shoe vortex theory, the calculation of the mutual influence of wing and canard (downwash of the canard on the wing, and upwash of the wing on the canard) has been implemented. maturing experience and tools integration [12] has been one of the main goals of the authors' activity at dpa in recent years, and the aerodynamic and flight behavior analysis of this configuration certainly goes in that direction. the authors also have good experience of wind-tunnel tests for the analysis and optimization of light aircraft [13] and of the integration and comparison of numerical calculations with experimental results [14].
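the horse-shoe vortex estimate of mutual induction mentioned above can be sketched with a biot-savart integration. the geometry, canard lift and sampling stations below are illustrative assumptions, not the actual aereo implementation:

```python
import numpy as np

# biot-savart sketch of the horse-shoe vortex estimate of mutual induction
# (canard downwash on the wing). geometry, canard lift and sampling stations
# are illustrative assumptions, not the actual aereo implementation.

def segment_velocity(p, a, b, gamma):
    """induced velocity at p from a straight vortex segment a -> b
    (katz & plotkin formulation)."""
    r1, r2 = p - a, p - b
    cr = np.cross(r1, r2)
    c2 = np.dot(cr, cr)
    if c2 < 1e-12:            # point (nearly) on the segment axis
        return np.zeros(3)
    r0 = b - a
    k = np.dot(r0, r1 / np.linalg.norm(r1) - r2 / np.linalg.norm(r2))
    return gamma / (4 * np.pi) * cr / c2 * k

def horseshoe_w(p, span, gamma, wake=100.0):
    """vertical induced velocity from a horseshoe vortex: bound leg on the
    y axis, trailing legs approximated by long finite segments toward +x."""
    l = np.array([0.0, -span / 2, 0.0])
    r = np.array([0.0, span / 2, 0.0])
    far = np.array([wake, 0.0, 0.0])
    w = (segment_velocity(p, l + far, l, gamma)      # left trailing leg
         + segment_velocity(p, l, r, gamma)          # bound vortex
         + segment_velocity(p, r, r + far, gamma))   # right trailing leg
    return w[2]

v, span, s_c = 22.0, 0.90, 0.13      # speed, canard span/area (table 1)
cl_c = 0.5                           # assumed canard lift coefficient
gamma = cl_c * v * s_c / (2 * span)  # single-horseshoe circulation

for y in (0.0, 0.3, 0.6, 0.9):       # stations on a wing 0.8 m downstream
    eps = -horseshoe_w(np.array([0.8, y, 0.0]), span, gamma) / v
    print(f"y = {y:.1f} m: induced angle = {np.degrees(eps):+.2f} deg")
# positive = downwash; stations outboard of the canard tips see upwash,
# which is why the global value is obtained by averaging along the span
```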
fig. 2: sd7062 wing and canard airfoil
fig. 3: fuselage in construction
fig. 4: wing internal structure and airfoil
fig. 5: sd7062 airfoil – lift curve (cl vs. alpha, re = 100 000, 200 000 and 400 000) [6]
fig. 6: sd7062 airfoil – drag polar (cl vs. cd, re = 100 000, 200 000 and 400 000) [6]

the next section presents the results of the aerodynamic characteristics evaluation for the aircraft model with and without canard. the c.g. position (for moment coefficient and stability considerations) follows the values indicated in part 2.

3.2 results – lift
the mutual induction of the canard on the wing (downwash angle ε_w) and of the wing on the canard (upwash angle ε_c) has been estimated in the aereo code through simple horse-shoe vortex theory, and the global value is obtained by averaging along the span. the evaluated values of the mutual induced angle derivatives and of the downwash at the horizontal tail (ε_h) are reported in table 3, together with the wing-body and global lift curve slopes in the presence and absence of the canard surface. comparison with the values of dε/dα obtained with panel method calculations shows good agreement. fig. 7 shows the lift contributions of wing, wing-body, canard and horizontal tail versus α (with respect to the fuselage center line). it can be observed that the canard stalls around α = 16°, and this reflects on the global lift and especially on the global moment coefficient curve. an equilibrated lift curve has been obtained for the configurations with and without canard, and for the configuration with canard with 30° of flap deflection. the results are shown in fig. 8. it can be seen that a maximum equilibrated lift coefficient of 1.66 is obtained for the configuration with canard (and c.g. at 5 % of c), a value of 1.61 for the configuration without canard, and a value of 1.97 for the 30° flapped configuration with canard. the resulting stall speeds for the configuration with canard, as stated in the introduction, are about 44 km/h and 40 km/h for the clean and flapped conditions, respectively.

3.3 results – drag
the equilibrated (trimmed) drag polars for the configurations with and without canard are shown in fig. 9. it can be observed that the configuration with canard leads to a higher drag than the configuration without canard in high speed conditions, but to a lower drag (lower global induced drag) in low speed conditions. the global oswald efficiency factor "e" for the configuration with canard (tls) in trimmed conditions is about 0.85, a good value for an aircraft that should operate at low speed. moreover, as stated before, the tls is not optimized, because the horizontal tail surface has been designed to fly with and without canard.

3.4 results – moment coefficient and equilibrator deflections
fig. 10 shows all the contributions to the moment coefficient (evaluated with respect to the c.g., at 5 % of the chord) of the configuration with canard, with the equilibrator in neutral position (δ_e = 0°). it can be observed that the global (tot) moment curve strongly "feels" the canard stall at α = 16°, and a strong nose-down pitching tendency results. the aircraft stall is thus connected to the canard stall, and this seems to work like an efficient and safe stall warning device.
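the stall speeds quoted in section 3.2 follow directly from the equilibrated maximum lift coefficients; a quick check, with w and s from table 1 and sea-level density assumed:

```python
from math import sqrt

# stall speeds from the equilibrated maximum lift coefficients of section
# 3.2; w = 14.6 kg and s = 0.95 m^2 from table 1, sea-level density assumed.

RHO, G = 1.225, 9.81
W, S = 14.6 * G, 0.95      # weight [n], wing area [m^2]

def v_stall_kmh(cl_max):
    return sqrt(2 * W / (RHO * S * cl_max)) * 3.6

for label, cl in (("with canard, clean", 1.66),
                  ("without canard", 1.61),
                  ("with canard, 30 deg flap", 1.97)):
    print(f"{label:25s} cl = {cl:.2f} -> v_s ~ {v_stall_kmh(cl):.0f} km/h")
# ~44 km/h clean and ~40 km/h flapped, matching the quoted values
```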
the estimated elevator deflections necessary to equilibrate the aircraft are shown in fig. 11 for the configuration without canard and with canard (in clean and flapped conditions). the required deflections at stall are always acceptable (for the configuration with canard, −15° in clean and −22° in flapped conditions).

table 3: mutual induced angles and lift curve slopes
quantity                        | with canard | without canard | panel method
dε_c/dα (upwash on canard)      | 0.08        | –              | 0.09
dε_w/dα (downwash on wing)      | 0.06        | –              | 0.07
dε_h/dα (downwash on hor. tail) | 0.31        | 0.36           |
cl_α wing-body [1/deg]          | 0.078       | 0.081          |
cl_α global [1/deg]             | 0.10        | 0.091          |

fig. 7: lift contributions
fig. 8: equilibrated lift coefficient

3.5 results – neutral point – static and dynamic stability
the neutral point versus trimmed lift coefficient for the configurations with and without canard is shown in fig. 12. the ssm is about 13 % at cruise conditions (cl = 0.50) for both configurations. the high stability at low speed can be highlighted. the longitudinal and lateral-directional stability derivatives have been evaluated for the configurations with and without canard. table 4 shows the most significant stability derivatives at cruise conditions for both configurations (α' is dα/dt). note that the configuration with canard leads to a higher lift curve slope and a higher aerodynamic damping (unsteady cm_α' and cm_q derivatives).

table 4: static stability derivatives
longitudinal        | with canard    | without canard
cl_α                | 5.60 (1/rad)   | 5.20 (1/rad)
cm_α                | −0.48 (1/rad)  | −0.59 (1/rad)
cm_α'               | −1.94          | −1.18
cm_q                | −8.02          | −4.41
lateral-directional |                |
cl_β                | −0.059 (1/rad) | −0.064 (1/rad)
cl_p                | −0.52          | −0.51
cn_β                | 0.090 (1/rad)  | 0.079 (1/rad)

fig. 9: trimmed drag polar
fig. 10: moment coefficient contributions
fig. 11: necessary equilibrator deflections
fig. 12: neutral stability point

table 5 shows the dynamic stability characteristics. the long and short period characteristics show that the configuration with canard leads to a slightly lower frequency for the long period motion and a higher frequency for the short period motion. the damping is always higher with the canard.

4 performance
the necessary power curves at s/l for the configurations with and without canard were evaluated and are presented in fig. 13, together with the available power curve, which was evaluated for two possible propellers, 18–6 (d = 0.46 m, blade angle β_75 = 8°) and 18–10 (d = 0.46 m, β_75 = 13.3°). a maximum power of 2.5 hp and a maximum propeller efficiency η_p,max = 0.50 (the propeller works behind the fuselage) were assumed.
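a rough sketch of the necessary-power curve behind fig. 13, built from the drag polar data quoted above (c_d0 = 0.032 from the sizing section, e = 0.85 trimmed) and the stated 0.50 × 2.5 hp of available power; this is an approximation for orientation, not the aereo computation:

```python
import numpy as np

# sketch of the necessary-power curve behind fig. 13: parasite plus induced
# drag with c_d0 = 0.032 (sizing section) and e = 0.85 (trimmed, section
# 3.3); available power assumes eta_p,max = 0.50 on 2.5 hp, as stated above.

RHO, G, HP = 1.225, 9.81, 745.7
W, S, B = 14.6 * G, 0.95, 2.5
AR, E, CD0 = B ** 2 / S, 0.85, 0.032

def p_nec_w(v):                                  # required power [w]
    q = 0.5 * RHO * v ** 2
    cl = W / (q * S)
    cd = CD0 + cl ** 2 / (np.pi * AR * E)
    return q * S * cd * v

p_av = 0.50 * 2.5 * HP                           # ~0.93 kw available

for v_kmh in range(60, 141, 10):
    p = p_nec_w(v_kmh / 3.6)
    mark = "  <-- exceeds p_av" if p > p_av else ""
    print(f"{v_kmh:3d} km/h: p_nec = {p / HP:.2f} hp{mark}")
# p_nec meets the ~1.25 hp available close to 130 km/h, consistent with
# the maximum level speeds in table 6
```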
it can be seen that a low blade angle (18–6) is needed in order to have an acceptable propeller efficiency and hence a good rate of climb r/c, which occurs at about 80 km/h. there is almost no difference in maximum level speed between the two propellers. table 6 shows all the main performances with propeller 18–6.

5 wind tunnel tests on engine and propeller
to verify the behavior of the engine coupled to different pushing propellers, the model fuselage with the rear mounted engine was set up in the wind tunnel, as shown in fig. 14. fig. 15 shows a detail of the engine. drag, lift and moment were measured with an internal 3-component strain gage balance, and were recorded with the use of an a/d acquisition system. the engine rpm was also measured. unfortunately, due to the lack of model aircraft employing pushing propellers, these are available only in certain diameter/pitch combinations. we tried 18/10, 18/6, 15/8, 15/6 and 14/6, of which only the last was available in pvc, while the others were all made of wood. the tests were performed, for each wind speed, setting three throttle levels and recording rpm, forces and moments. some other angles of attack were also investigated. it can easily be recognized that there were many possible combinations of the free parameters, and while writing this paper we are still analyzing the recorded data. since the engine needed high rpm to develop its maximum power, when a propeller 18 inches in diameter was tested it was impossible to get the maximum power from the engine. on the other hand, when using a propeller 14 inches in diameter, the swept area was too small to get high thrust. in summary, the best results were obtained with the 15/6 pushing propeller. the measured propeller delivered power is compared to the numerical values in fig. 16, along with the required power. note that there are some discrepancies between the predicted and measured power: these are mainly due to the unknown geometry of the blades, especially regarding the airfoil shape. we made an attempt to measure all the geometrical characteristics of the blades, but due to their small size the results are not reliable. the maximum measured propeller efficiency was about 0.4: this is valid supposing that the maximum engine power is equal to that declared by the manufacturer. no significant reduction in propeller efficiency was measured at high angles of attack, indicating that the position of the propeller and the shape of the rear part of the fuselage were fine. we also performed some tests turning the model by 180 degrees to test the differences between pushing and tractor propellers, but we are still analysing the data. during the tests, as already stated, the main problem was the engine, which was not reliable at all: the tuning was very difficult and unstable. at the end of this test campaign, we decided to use a 4 stroke – 2 cylinder – 4 hp engine delivering maximum power at much lower rpm (7000 rpm versus 13500 for the first engine). this engine was tested with 18 in (0.46 m diameter) propellers with two different blade pitch angles, 9° (propeller 18/6) and 13° (propeller 18/10). fig. 17 shows the experimentally measured power curves of the 4 stroke – 4 hp engine with propellers 18/6 and 18/10. note that the 18/6 propeller does not give good efficiency (around 0.25), which leads to very low available power. the 18/10 propeller gives good propulsive power, with an efficiency around 0.5.
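for reference, the propulsive efficiency discussed throughout this section is η_p = t·v / p_shaft; a tiny helper with made-up sample numbers (the measured thrust values are not tabulated in the paper):

```python
# the propulsive efficiency referred to throughout this section is
# eta_p = t * v / p_shaft. sample numbers below are made up --
# the measured thrust values are not tabulated in the paper.

HP = 745.7

def eta_p(thrust_n, v_m_s, shaft_power_hp):
    return thrust_n * v_m_s / (shaft_power_hp * HP)

print(f"eta_p ~ {eta_p(30.0, 22.0, 1.8):.2f}")  # 30 n at 22 m/s on 1.8 hp -> ~0.49
```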
in fig. 17 the necessary power curves are slightly different from those reported in figs. 13 and 16, because the drag of the fuselage + undercarriage measured in the wind tunnel was also taken into account. the measured drag was found to be higher than the predicted drag, and it can be seen that the necessary power of fig. 17 is higher than that reported in figs. 13 and 16. it can be seen that the predicted maximum level speed with the 4 stroke – 2 cylinder – 4 hp engine and the 18/10 propeller is around 125 km/h. the performances with the 4 stroke – 4 hp engine and the 18/10 propeller are reported in table 7. the values show very good flight and take-off characteristics with the new engine.

table 5: dynamic longitudinal stability
                        | with canard | without canard
short period freq. [hz] | 0.710       | 0.66
short period damping    | 0.817       | 0.70
long period freq. [hz]  | 0.070       | 0.087
long period damping     | 0.026       | −0.016

table 6: performance at s/l (propeller 18–6)
                              | with canard | without canard
max level speed               | 127 km/h    | 130 km/h
max cruise speed (75 % power) | 118 km/h    | 122 km/h
max r/c                       | 3.28 m/s    | 3.5 m/s
max cruise range              | 20 km       | 21 km
max aer. efficiency           | 13.1        | 13
take-off run                  | 60 m        | 56 m
landing run                   | 54 m        | 50 m

table 7: performance at s/l (4 stroke – 2 cylinder – 4 hp engine, propeller 18–10)
                              | with canard | without canard
max level speed               | 125 km/h    | 127 km/h
max cruise speed (75 % power) | 118 km/h    | 122 km/h
max r/c                       | 5.2 m/s     | 5.3 m/s
max cruise range              | 17 km       | 19 km
max aer. efficiency           | 12.8        | 12.9
take-off run                  | 45 m        | 40 m
landing run                   | 54 m        | 50 m

fig. 13: power curves (necessary and available power vs. speed, propellers 18–6 and 18–10)
fig. 14: fuselage installed in the tunnel
fig. 15: detail of the rear mounted engine
fig. 16: required and available power, 2 stroke – 3.5 hp engine, 15/6 propeller (num/exp)
fig. 17: required and available power, 4 stroke – 4 hp engine, 18/10 and 18/6 propellers (exp)

6 conclusion
the design, aerodynamic analysis and preliminary performance estimation of a three surfaces r/c aircraft have been performed. the aircraft has been designed to test the canard influence on aircraft aerodynamics, dynamics and flying qualities. the model will be instrumented to measure all flight parameters during flight. the canard influence on the lift, drag and moment coefficients has been carefully evaluated and shown. the influence of the canard on the aircraft static and dynamic stability has been shown. available power curves versus speed have been measured in the wind tunnel through a balance. the body drag has also been measured. the estimated performances are in good agreement with the aircraft design and the desired flight characteristics, especially with the more reliable 4 stroke – 4 hp engine, which was tested with different propellers in order to select the right one and to obtain the best performance. flight tests will take place in january 2002.
references
[1] mcgeer, t., kroo, i.: a fundamental comparison of canard and conventional configurations. journal of aircraft, vol. 20, november 1983
[2] selberg, b. p., rokhsaz, k.: aerodynamic tradeoff study of conventional, canard, and trisurface aircraft systems. journal of aircraft, vol. 23, october 1986
[3] kendall, e. r.: the theoretical minimum induced drag of three-surface airplanes in trim. journal of aircraft, vol. 22, october 1985
[4] ostowari, c., naik, d.: experimental study of three-lifting-surface configuration. journal of aircraft, vol. 25, february 1988
[5] patek, z., smrcek, l.: aerodynamic characteristics of multi-surface aircraft configurations. aircraft design, vol. 2, 1999, pp. 191–206
[6] lyon, c. a., selig, m. s. et al.: summary of low-speed airfoil data, vol. 3, soartech publications, virginia, isbn 0-9646747-3-4, 1997, pp. 256–263
[7] wolowicz, h. c., yancey, b. r.: longitudinal aerodynamic characteristics of light, twin-engine, propeller-driven airplanes. nasa tn d-6800, 1972
[8] wolowicz, h. c., yancey, b. r.: lateral-directional aerodynamic characteristics of light, twin-engine, propeller-driven airplanes. nasa tn d-6946, 1972
[9] roskam, j.: preliminary calculation of aerodynamic, thrust and power characteristics. airplane design, part vi, ed. roskam aviation, 1987
[10] roskam, j.: aaa, advanced aircraft analysis software. darcorporation, lawrence, kansas
[11] coiro, d. p., nicolosi, f.: aerodynamics, dynamics and performance prediction of sailplanes and light aircraft. technical soaring, vol. 24, no. 2, april 2000
[12] coiro, d. p., nicolosi, f.: aircraft design through numerical and experimental techniques developed at dpa. aircraft design, vol. 4, no. 1, issn 1369-8869, march 2001
[13] giordano, v., nicolosi, f., coiro, d. p., di leo, l.: design and aerodynamic optimization of a new reconnaissance very light aircraft through wind tunnel tests. rta/avt symposium on "aerodynamic design and optimization of flight vehicles in a concurrent multi-disciplinary environment", ottawa (canada), october 1999
[14] giordano, v., coiro, d. p., nicolosi, f.: reconnaissance very light aircraft design. wind-tunnel and numerical investigation. engineering mechanics, vol. 7, no. 2, 2000, pp. 93–107

domenico p. coiro
e-mail: coiro@unina.it
fabrizio nicolosi
e-mail: fabrnico@unina.it
phone: +39 081 7683322
fax: +39 081 624609
dipartimento di progettazione aeronautica (dpa)
university of naples "federico ii"
faculty of engineering
via claudio 21, 80125 naples, italy

pollutant removal from highway runoff using retention/detention units
ashraf el-shahat elsayed, a. grünwald, d. dvořák

abstract
highway runoff contains total suspended solids, hydrocarbons, oil and greases, chloride, and other contaminants that are transported in solution and particulate forms to adjacent floodplains, roadside swales, and retention/detention ponds. oil and grit chambers represent a type of retention/detention unit used for removing heavy particulates and adsorbed hydrocarbon particulates. storage/sediment units also represent a type of retention/detention unit, used for controlling peak flow and removing suspended solids. the aim of this study is to evaluate the effect of traffic volume and site characteristics on highway runoff quality. the study also aims to evaluate the performance of retention/detention units that collect runoff from the prague-brno and prague-plzeň highways, czech republic. the results of this study indicate no definitive relationship between average daily traffic and the concentration of runoff constituents, though the site characteristics have a strong relation to some constituents. the results also show that retention/detention units are effective in treating organic compounds.
keywords: highway runoff, oil/grit chambers, sedimentation, suspended solids, organic.

1 introduction
highway operation and maintenance contribute a variety of pollutant constituents to surface and subsurface water. numerous factors may affect the presence of these constituents and the quality of highway runoff, including: traffic volume, precipitation characteristics, roadway surface type, the nature of the pollutants themselves, surrounding land use, and seasonal considerations. highway runoff contains total suspended solids, hydrocarbons, oil and greases, chloride, and other contaminants that are transported in solution and particulate forms to adjacent floodplains, roadside swales, and retention/detention ponds. the use of retention/detention units for storage and attenuation of peak flows is well established, but their effectiveness in removing highway runoff contamination has not been investigated in the czech republic. the retention/detention units are of several types, depending on the detention time as well as the purpose of detention. the suspended solids and other related contaminants can be attenuated in these units.
oil/grit chambers
oil and grit chambers represent one type of retention/detention unit. they are often used in conjunction with highway runoff controls to remove heavy particulates and adsorbed hydrocarbon particulates [5]. berg (1991) mentions that these structures are intended for small contributing watersheds, usually 1.0 acre or smaller. maintenance and cleanout are recommended on a quarterly basis, and the associated operating costs are high. if properly maintained, these systems can function as pre-filtering devices for other runoff controls. silverman and stenstrom (1989) acknowledge the relative ineffectiveness of oil and grit chambers. they cite studies which showed that 40 % to 60 % of the oil and grease associated with urban runoff are in a dissolved or colloidal state. thus, classic oil and grit separators, which are designed to separate free-floating oil and grease products, exhibit low removal efficiencies for urban and highway runoff.

storage/sediment units
the primary purpose of storage/treatment units is to control the peak flow associated with the runoff from a catchment. reduction in the rate of flow can limit the frequency of occurrence of erosion, thereby reducing the sediment load to the receiving waters. the secondary purpose of these units is to store the runoff temporarily and allow the removal of particulate material by settling. the treatment efficiencies depend on the length of the detention time. the length of the detention time for a particular runoff event depends on the size and intensity of the storm. the ideal detention time for pollutant removal is 24 hours, with a minimum of 6 to 12 hours [6]. storage of runoff for at least 24 hours may reduce the concentration of particulate materials by 90 % or more. detention ponds are most effective for removing particulate constituents and associated materials that are sorbed to suspended solids [4]. detention ponds are less effective in removing soluble components of runoff, such as nitrate and some phosphorus species. the observed removals of tss, bod, total phosphorus, and trace metals were 80–90 %, 20–30 %, 20–30 %, and 40–80 %, respectively, after 12 hours of storage [2].

2 methodology
various detention/retention units along the prague-brno (d1) and prague-plzeň (d5) highways in the czech republic were selected for monitoring runoff from the highways. the prague-brno (d1) highway is a divided 6-lane highway with average daily traffic (adt) in the range 40000–65000 vehicles/day/both directions, whereas the prague-plzeň (d5) highway has adt in the range 16000–22000 vehicles/day/both directions. various oil/grit chambers and storage/sediment units were identified along the studied highways. these units collect runoff from a 100 % impervious area (highway surface). the oil/grit chambers contain fibroil filters that are used to adsorb hydrocarbons and oil and greases from the runoff. the storage/sediment units allow runoff constituents to settle under the effect of gravity, to reduce the suspended solids and their associated constituents. they sometimes contain a fibroil filter. the tested sample was collected from the influent highway runoff into the monitored units. other samples were collected from the effluent from these units. these samples were collected manually with bottles: metal bottles for samples used in hydrocarbon and solids analysis, and plastic bottles for samples used in chloride and zinc analysis.
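the gravity settling that storage/sediment units rely on can be bounded with a stokes-law estimate; the particle density, water viscosity and effective depth below are textbook assumptions, not measurements from this study:

```python
# stokes-law sketch of gravity settling in storage/sediment units. the
# particle density, water viscosity and effective depth are textbook
# assumptions, not measurements from this study.

G = 9.81
RHO_P, RHO_W = 2650.0, 1000.0    # kg/m^3, mineral particle vs. water
MU = 1.0e-3                      # pa*s, water at ~20 degrees c
DEPTH = 1.5                      # m, assumed effective settling depth

def settling_velocity(d_m):
    """stokes law: v = g (rho_p - rho_w) d^2 / (18 mu)."""
    return G * (RHO_P - RHO_W) * d_m ** 2 / (18 * MU)

for d_um in (100, 50, 10, 5):
    v = settling_velocity(d_um * 1e-6)
    print(f"{d_um:3d} um: v = {1000 * v:.3f} mm/s, "
          f"time for {DEPTH} m ~ {DEPTH / v / 3600:.2f} h")
# coarse grit settles in minutes, while ~5 um fines need on the order of
# a day -- in line with the 6-24 h detention times quoted above
```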
runoff samples for suspended solids and organic analysis were filtered through pre-weighed, pre-combusted glass fiber filters. after drying at 105 °c, the filters were re-weighed to determine the amount of suspended solids in each sample. this study aims to determine the contribution of adt and site characteristics to highway runoff quality. it also aims to evaluate the effect of oil/grit chambers and storage/sediment units in the treatment of highway runoff.

3 results and discussion

the contributions of average daily traffic and land use to the concentration of runoff constituents are discussed below.

influence of traffic volume

traffic volume would seem to be an important factor predicting runoff quality. in order to study the effect of traffic volume on highway runoff characteristics, we compare the runoff characteristics collected from the d1 and d5, which have different traffic volumes. the tested runoff samples were taken from two detention units along the d1 at stations 72100 and 81450 km, and from two detention units along the d5 in the rudná and drahelčice areas. the data are shown in table 1. the results indicate no strong or definitive relationship between bod, oil and grease, ph, hydrocarbons, or cod on the one hand, and traffic density on the other. the results indicate that there is a degree of correlation between vss, tss and traffic density, but it remains weak.

influence of site characteristics

site characteristics may also affect the amount of pollutants in highway runoff. the effect of parking/washing areas on runoff quality is investigated in this study. samples from three different sites on the d5 were selected to study the effect of site characteristics on the concentration of runoff constituents. the first site is a detention unit that collects runoff from a washing station, while the second and third sites are detention units that collect runoff from normal highway sections in the rudná area and ssúd no. 8. the results are shown in fig. 1. the results shown in the figure indicate that the site characteristics have a significant effect on the values of runoff constituents such as total suspended solids, volatile suspended solids, oil & greases, and hydrocarbons.
the figure indicates that the runoff constituents from car washing stations are higher than the values from other highway surfaces. the above results indicate that there is a problem in the quality of the influent runoff to the collection units. consequently, these units should be prepared to treat the runoff, or at least to attenuate the concentration of its constituents to the specified limit. the following subsections evaluate oil/grit chambers, which are used mainly in the treatment of organic compounds, and storage/sediment units, which are used in the treatment of suspended solids.

oil/grit chambers

to evaluate the performance of oil/grit chambers, we studied one unit collecting runoff from the d1 at 72100 km and another unit in the rudná area collecting runoff from the d5. the collected data are shown in table 2. the table shows the influent and effluent concentrations of some constituents, and the removal efficiencies of the constituents using oil/grit chambers. the results show that an oil/grit chamber is effective in the treatment of organic pollutants such as vss, bod, cod, o & g, and hydrocarbons. the results also show that these retention units are not effective in the treatment of tss and cl−. the measured efficiency of these oil/grit chambers in the treatment of organic pollutants may be due to the fibroil filter that is used. the short retention time as well as the high flow rate in these units may be the reason for the increased values of tss, because the flow into these units stirs up some accumulated solids, which may lead to an increase in suspended solids.

table 1: concentration of runoff constituents from two detention units for each highway

constituent          | d1 (adt = 40000–65000)  | d5 (adt = 16000–22000)
                     | 72100 km | 81450 km     | rudná area | drahelčice area
tss [mg/l]           | 2095     | 2779         | 376        | 1262
vss [mg/l]           | 329      | 299          | 122        | 156
bod5 [mg/l]          | 7.0      | 6.65         | 7.5        | 15.7
cod [mg/l]           | 39.3     | 52           | 46         | 77
ph                   | 7.5      | 7.3          | 7.2        | 7.3
cl− [mg/l]           | 961      | 1503         | 93         | 533
oil & grease [mg/l]  | 0.394    | 0.47         | 1.14       | 0.99
hydrocarbons [mg/l]  | 0.216    | 0.236        | 0.830      | 0.365

table 2: removal efficiencies using oil/grit chambers for runoff constituents collected from the d1 & d5 at station 72100 km and in the rudná area, respectively

constituent           | d1: influent | effluent | removal [%] | d5: influent | effluent | removal [%]
tss [mg/l]            | 1074         | 1075     | 0.0         | 376          | 375      | 0.0
vss [mg/l]            | 339          | 315      | 7.0         | 121          | 105      | 13.0
bod5 [mg/l]           | 5.24         | 2.98     | 43.0        | 7.5          | 4.7      | 37.0
cod [mg/l]            | 36.18        | 24.00    | 34.0        | 46.0         | 32.0     | 30.0
ph                    | 7.50         | 7.50     | 0.0         | 7.2          | 7.3      | 0.0
cl− [mg/l]            | 357          | 357      | 0.0         | 93           | 92       | 0.0
oil & greases [mg/l]  | 0.926        | 0.386    | 58.0        | 1.14         | 0.219    | 81.0
hydrocarbons [mg/l]   | 0.517        | 0.216    | 58.0        | 0.83         | 0.107    | 87.0
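the removal efficiencies in table 2 follow directly from the paired influent and effluent concentrations. as a minimal illustration (our addition, not part of the original study), the short python sketch below recomputes the d1 removal column of table 2; the near-zero tss change actually comes out slightly negative before rounding, which the table reports as 0.0.

```python
# minimal sketch (not from the original study): removal efficiency of an
# oil/grit chamber from paired influent/effluent concentrations, using the
# d1 values of table 2 as example data.

def removal_efficiency(c_in: float, c_out: float) -> float:
    """percentage of a constituent removed between influent and effluent."""
    return 100.0 * (c_in - c_out) / c_in

# influent/effluent concentrations [mg/l] for the d1 unit (table 2)
d1_samples = {
    "tss": (1074.0, 1075.0),       # slightly negative removal, tabulated as 0.0
    "vss": (339.0, 315.0),
    "bod5": (5.24, 2.98),
    "cod": (36.18, 24.00),
    "oil & greases": (0.926, 0.386),
    "hydrocarbons": (0.517, 0.216),
}

for name, (c_in, c_out) in d1_samples.items():
    print(f"{name:>14s}: {removal_efficiency(c_in, c_out):5.1f} %")
```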
2/2001 oil & grease 0 10 20 30 40 monitoring date washing station ssud 8 rudná .0 9 .9 8 .1 1 .9 8 .0 3 .9 9 .0 7 .9 9 .0 9 .9 9 .1 1 .9 9 .0 3 .0 0 .0 5 .0 0 .0 7 .0 0 .0 9 .0 0 .1 2 .0 0 c o n c e n tr a ti o n [m g /l ] tss 0 2000 4000 6000 8000 10000 monitoring date washing station ssud 8 rudná .0 9 .9 8 .1 1 .9 8 .0 3 .9 9 .0 7 .9 9 .0 9 .9 9 .1 1 .9 9 .0 3 .0 0 .0 5 .0 0 .0 7 .0 0 .0 9 .0 0 .1 2 .0 0 .0 1 .0 1 c o n c e n tr a ti o n [m g /l ] vss 0 100 200 300 400 500 600 700 monitoring date washing station ssud 8 rudná .0 9 .9 8 .1 1 .9 8 .0 3 .9 9 .0 7 .9 9 .0 9 .9 9 .1 1 .9 9 .0 3 .0 0 .0 5 .0 0 .0 7 .0 0 .0 9 .0 0 .1 2 .0 0 .0 1 .0 1 c o n c e n tr a ti o n [m g /l ] hydrocarbon 0 5 10 15 20 25 30 35 monitoring date washing station ssud 8 rudná .0 9 .9 8 .1 1 .9 8 .0 3 .9 9 .0 7 .9 9 .0 9 .9 9 .1 1 .9 9 .0 3 .0 0 .0 5 .0 0 .0 7 .0 0 .0 9 .0 0 .1 2 .0 0 c o n c e n tr a ti o n [m g /l ] fig. 1: runoff constituents collected from various retention/detention units near the d5 © czech technical university publishing house http://ctn.cvut.cz/ap/ 71 acta polytechnica vol. 41 no. 2/2001 tss 0 100 200 300 400 500 600 . . . . . monitoring date influent filter effluent .1 1 .9 8 .0 3 .9 9 .0 7 .9 9 .0 9 .9 9 .1 1 .9 9 .0 3 .0 0 .0 5 .0 0 .0 7 .0 0 .0 9 .0 0 .1 2 .0 0 c o n c e n tr a ti o n [m g /l ] influent filter effluentbod5 0 1 2 3 4 5 6 7 8 monitoring date .1 1 .9 8 .0 3 .9 9 .0 7 .9 9 .0 9 .9 9 .1 1 .9 9 .0 3 .0 0 .0 5 .0 0 .0 7 .0 0 .0 9 .0 0 c o n c e n tr a ti o n [m g /l ] influent filter effluentcod 0 10 20 30 40 50 60 70 80 monitoring date .1 1 .9 8 .0 3 .9 9 .0 7 .9 9 .0 9 .9 9 .1 1 .9 9 .0 3 .0 0 .0 5 .0 0 .0 7 .0 0 .0 9 .0 0 c o n c e n tr a ti o n [m g /l ] influent filter effluento & g 0 0.2 0.4 0.6 0.8 1 1.2 monitoring date .1 2 .0 0 c o n c e n tr a ti o n [m g /l ] .1 1 .9 8 .0 3 .9 9 .0 7 .9 9 .0 9 .9 9 .1 1 .9 9 .0 3 .0 0 .0 5 .0 0 .0 7 .0 0 .0 9 .0 0 fig. 2: runoff constituents for ssúd no. 8 storage/sediment unit along the d5 highway of tss, because the flow to these units stimulates some accumulated solids, which may lead to an increase in suspended solids. storage/sediment units the runoff concentration of the storage/sediment units which collected stormwater runoff near the management and maintenance center (ssúd no. 8) on highway d5 are shown in fig. 2. these units are larger than oil/grit chambers to allow the settling of suspended solids, and they also contain a filter using a fibroil mesh to adsorb organic pollutants. the figure shows the improvement of runoff quality by measuring the influent and effluent concentration for tss, bod, cod, and oil & greases. the figure shows that there is a reduction in the concentration of all constituents. the settlement of suspended solids under the effect of gravity during the retention time may be the reason for ss removal. the presence of fibroil in the settling unit adsorbs organic pollutants and filters some of the suspended solids, and this may explain why there is enhancement of organic pollutants. 4 summary and recommendations highway runoff contains total suspended solids, hydrocarbons, oil and greases, chloride, and other contaminants that are transported in solution and particulate forms to adjacent floodplains, roadside swales, and retention/detention ponds. oil and grit chambers represent a type of the retention/detention unit used for removing heavy particulate and adsorbed hydrocarbon particulates. 
storage/sediment units also represent a type of retention/detention unit, used to control peak flow and remove suspended solids. this study evaluated the effect of traffic volume and site characteristics on highway runoff quality. it also evaluated the performance of the retention/detention units that collect runoff from the prague-brno and prague-plzeň highways, czech republic. analysis of the results indicates that there is no definitive relationship between average daily traffic and the concentration of runoff constituents, as was also concluded in preceding studies [3], [8]. the results also show that the retention/detention units used in this study are effective in treating organic compounds, but are less effective in treating suspended solids, perhaps due either to the small size of these retention units or to the short retention time inside them. consequently, this study makes the following recommendations:
• decrease the flow rate into the studied units by any suitable method, in order to allow the settling of suspended solids;
• use a sand filter or densely vegetated swales before the runoff enters the studied units;
• study the effectiveness of a lower flow rate in the treatment of suspended solids.

acknowledgement

we would like to thank and express our appreciation to ing. k. slavíčková and all members of the sanitary engineering department for their help. this research has been supported by grant no. vz j04198: 2111100005.

list of abbreviations

tss total suspended solids
vss volatile suspended solids
cod chemical oxygen demand
bod biochemical oxygen demand
cl− chloride
o & g oil and grease

references

[1] berg, v. h.: water quality inlets (oil/grit separators). maryland department of the environment, sediment and stormwater administration, baltimore, md, 1991
[2] dorman, m. e., hartigan, j., steg, r.: retention, detention, and overland flow for pollutant removal from highway stormwater runoff. vol. i: research report, fhwa-rd-96-095, versar inc., springfield, va, 1996
[3] driscoll, e. d., shelley, p. e., strecker, e. w.: pollutant loadings and impacts from highway stormwater runoff. vol. ii: analytical investigation and research report, federal highway administration, office of research and development, report no. fhwa-rd-88-008, 1990
[4] schueler, t. r., kumble, p. a., heraty, m. a.: a current assessment of urban best management practices, techniques for reducing non-point source pollutants in the coastal zone. metropolitan washington council of governments, washington, dc, 1992
[5] schueler, t. r., kumble, p. a., heraty, m. a.: a current assessment of urban best management practices, techniques for reducing non-point source pollution in the coastal zone. department of environmental programs, metropolitan washington council of governments, washington, dc, 1991
[6] schueler, t. r.: controlling urban runoff: a practical manual for planning and designing urban bmps. department of environmental programs, metropolitan washington council of governments, washington, dc, 1987
[7] silverman, g. s., stenstrom, m. k.: source control of oil and grease in an urban area. proceedings of an engineering foundation conference on current practices and design criteria for urban quality control, asce, new york, ny, 1989, pp. 403–420
[8] stotz, g.: investigations of the properties of the surface water runoff from a federal highway in frg. the science of the total environment, vol. 59, 1987, pp. 329–337

eng. ashraf el-shahat elsayed, msc.
e-mail: ashraf@fsv.cvut.cz
prof. ing. alexander grünwald, csc.
department of sanitary engineering
czech technical university in prague
faculty of civil engineering
thákurova 7, 166 29 praha 6, czech republic

dalibor dvořák
road and motorway directorate of the czech republic
na pankráci 546, 140 00 praha 4, czech republic
a computer cluster system for pseudo-parallel execution of geant4 serial application

memmo federici a,∗, bruno l. martino b

a istituto di astrofisica e planetologia spaziali, iaps inaf, via fosso del cavaliere 100, 00133 roma, italy
b istituto di analisi dei sistemi ed informatica “antonio ruberti”, iasi-cnr, viale manzoni 30, 00185 roma, italy
∗ corresponding author: memmo.federici@iaps.inaf.it

abstract. simulation of the interactions between particles and matter in studies for developing x-ray detectors generally requires very long calculation times (up to several days or weeks). these times are often a serious limitation for the success of the simulations and for the accuracy of the simulated models. one of the tools used by the scientific community to perform these simulations is geant4 (geometry and tracking) [2, 3]. building on the experience gained in the design of the aves cluster computing system (federici et al. [1]), the iaps (istituto di astrofisica e planetologia spaziali, inaf) laboratories developed a cluster computer system dedicated to geant4. the cluster is easy to use and easily expandable, and thanks to the design criteria adopted it achieves an excellent compromise between performance and cost. the management software developed for the cluster splits a single simulation over the available cores, allowing software written for serial computation to reach a computing speed similar to that obtainable from natively parallel software.
the simulations carried out on the cluster showed a reduction in execution time by a factor of 20 to 60 compared to the times obtained with a single pc of medium quality.

keywords: geant4, cluster, monte carlo method.

1. introduction

the system discussed here is essentially an implementation of the geant4 software on a cluster computing platform. the toolkit uses the object-oriented programming paradigm; its application area includes experiments in high energy physics, nuclear studies, medical applications, accelerators and astrophysics. writing a software simulation using geant4 generally requires a significant design effort, so where possible developers of simulation models try to adapt programs that have already been implemented and tested to new projects, making only the needed changes. large computing capability and long execution times are required in order to improve the accuracy and the quality of a simulation model. these capabilities are often obtainable only through the use of a cluster computer system able to execute software developed for parallel platforms. unfortunately, the migration of software written for serial execution to parallel systems often requires a complete rewrite. the originality of our project is the capacity to reuse geant4 software that was written to operate on serial platforms, obtaining calculation performance similar to that of a native parallel system, without the need to make substantial changes to the software.

1.1. scenario

the main goal of this cluster design is to obtain the execution of simulations using the monte carlo method within reasonable times and at affordable costs. the design approach has therefore focused on the following considerations:
• minimizing the cost of the computer systems;
• achieving acceptable computation times;
• intensive reuse of serial simulation software.
this work shows the real possibility of reusing serial applications developed for geant4 in a pseudo-parallel mode, without major rehabilitation efforts.

1.2. parallel applications

writing parallel applications generally implies a challenging design. converting serial into parallel applications can require a complete rewrite of the source code, which is very expensive, as is execution on high-end servers (machines with a large number of processor sockets, dozens or even hundreds, and a large amount of shared memory). this cluster overcomes the unfavorable aspects of the use of parallel systems and turns them to advantage.

1.3. computing environment

figure 1 shows the structure of the hardware of the cluster, which currently consists of 8 pcs (nodes) with the following characteristics:
• hardware: intel i7 processor (8 cores), 4 gb of ddr3 ram at 1333 mhz, 500 gb sata3 hard disk
• software: operating system: linux debian 6 squeeze (x86-64); resource manager: slurm [4]; customized bash script interfaces for distribution of the computing load
• data storage: nas raid storage; filesystem: ocfs2 (a gpl oracle clustered file system, version 2); transport protocol: iscsi
the hardware consists of commercial entry-level units; the software is free.

figure 1. cluster hardware block diagram.
the heart of the project consists of a set of scripts written in bash (bourne again shell) that handle all the operations related to the management of the graphical interfaces and the distribution of the workload across the nodes of the cluster in a definable pseudo-parallel mode. to obtain an optimal result, the scripts split the overall workload over all the cores allocated by the user and generate, for each instance of the calculation, a randomly different “seed” and a macro file containing all the parameters necessary for the simulation. the cluster in its current configuration has 64 calculating cores and has multiuser capability. the data storage on which the “home” area is housed is a 6 tb nas configured in raid5. the file system is ocfs2, which in the free version can handle up to 16 tb of disk space. this file system, developed for cluster computing systems, is able to manage input/output data access simultaneously on all nodes. this feature offers great advantages, speeding up input/output operations of the distributed computing systems. data transport is performed using the iscsi protocol, which manages data storage very efficiently.

2. login and setting of parameters for simulations

the user connects to the cluster (login) through an ssh connection, providing his credentials. once authenticated, a particular user profile script does the following:
• authenticates the user,
• sets the execution environment,
• sets the number of nodes devoted to the simulation,
• manages reconnection to active simulations.
in this phase, the system automatically sets the environment variables needed by geant4 and lets the user choose the parameters needed to perform the simulation. figure 2 shows an example of the graphical user interfaces shown at login. this phase highlights one of the fundamental features of the cluster: the ability to resume sessions that are still active. this feature, developed specially for the cluster, was necessary because of the long duration of the simulations. thanks to it, the user may start a simulation and detach his local terminal from the cluster without interrupting the execution of his running jobs; at the next login he/she can check the progress of any simulation not yet finished. at this stage, the user can manage the number of cluster nodes at his disposal in order to carry out other simulations. each node provides the user with 8 calculating cores. the system automatically frees the resources that have become available.

figure 2. example of the login graphical interfaces.

2.1. running simulations

at run time a script makes it possible for the user to select the application to be run and to generate the corresponding configuration files (one for each instance of the process). using the appropriate graphical interface (see fig. 3), the user can select the parameters of the simulation, such as the executable file and the macro file containing all the parameters for the simulation, create the work directory, and then start the simulation run.

figure 3. example of the run-time graphical interfaces.
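the bash management scripts themselves are not reproduced in the paper, so the following python sketch is only an illustration, under our own assumptions, of the kind of work they perform: dividing the total number of events over the allocated cores, generating a distinct random seed and a dedicated macro file for each instance, and launching one serial process per core through the resource manager. the executable name, the file names and the exact `srun` invocation are hypothetical; `/random/setSeeds` and `/run/beamOn` are standard geant4 ui commands.

```python
# illustrative sketch only -- not the authors' actual bash interface.
# the total workload is split over the allocated cores; each instance gets
# its own random seed and macro file and runs as an independent serial job.
import random
import subprocess
from pathlib import Path

def split_events(total_events: int, n_cores: int) -> list[int]:
    """divide the requested number of events as evenly as possible."""
    base, rest = divmod(total_events, n_cores)
    return [base + (1 if i < rest else 0) for i in range(n_cores)]

def write_macro(path: Path, seed: int, n_events: int) -> None:
    # /random/setSeeds and /run/beamOn are standard geant4 ui commands;
    # all other simulation parameters would be appended here as well.
    path.write_text(f"/random/setSeeds {seed} {seed + 1}\n"
                    f"/run/beamOn {n_events}\n")

def launch(executable: str, total_events: int, n_cores: int) -> None:
    rng = random.SystemRandom()
    seeds = rng.sample(range(1, 2**31), n_cores)   # distinct seed per instance
    for i, n_events in enumerate(split_events(total_events, n_cores)):
        macro = Path(f"run_{i:03d}.mac")           # hypothetical file name
        write_macro(macro, seeds[i], n_events)
        # one serial geant4 process per core, scheduled through slurm
        subprocess.Popen(["srun", "-n", "1", executable, str(macro)])

# example: 6.4e7 events spread over the 64 cores of the cluster
# launch("./my_geant4_app", 64_000_000, 64)
```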
3. simulation campaign

as an example of the cluster activity, we present two simulations. the first one involves the effect of cosmic particles on the athena xms microcalorimeter [5], to study the efficiency of the “anti-coincidence” system and the effect of the non-vetoed background on the performance of the detector (see fig. 4). this simulation is characterized by a large number of events; the time spent by the cluster to complete the simulation using 4 nodes, for a total of 32 cores, was approximately 10 days. performed on a pc identical to those used for the nodes of the cluster, this simulation would take about 200 days of uninterrupted computing. the practical advantages of using the cluster for this simulation relate mainly to the speed, which is increased by a factor of 20, while the relative cost increases only by a factor of 4: 4 nodes calculate with a relative speed 20 times higher than that of a single node. another great advantage is that it makes possible more detailed simulations and a more realistic environment. it is also very unlikely that a run of 200 days on a single pc could get to the end without interruption. the second simulation, in the area of medical physics, concerns small detectors for gamma-ray tomographic surveys (spect; single photon emission computed tomography), which can be used for research in medical oncology and in particular in the diagnosis of breast cancer. for this simulation, 7 nodes were used, for a total of 56 cores, and the time employed was approximately 5 hours. the time that the simulation would take on a single pc has been estimated at about 15 days. in this case, the time for the simulation is decreased by approximately a factor of 60. this enables the development of more efficient detectors, allowing changes and enhancements to the simulated model; verification requires only a very short time.

4. conclusions

the cluster has been optimized for the purposes of geant4. it improves the speed of simulations that require large computational resources by a factor of 20 to 60 (compared with a single pc of the same category). thanks to its great speed, it drastically decreases the probability that a run fails before completion. it is inexpensive, being currently composed of 8 commercial pcs for a total of 64 cores. it is modular, easily expandable without substantial changes, and can easily be reused for other projects.

acknowledgements

the authors wish to thank lorenzo natalucci, maria nerina cinti and sergio lomeo.

figure 4. the model of the xms microcalorimeter.

references

[1] federici, m. et al.: 2009, pos, published online at http://pos.sissa.it/cgi-bin/reader/conf.cgi?confid=96, p. 92
[2] agostinelli, s. et al.: geant4 – a simulation toolkit, nuclear instruments and methods in physics research a 506 (2003) 250–303
[3] allison, j. et al.: geant4 developments and applications, ieee transactions on nuclear science 53, no. 1 (2006) 270–278
[4] yoo, a., jette, m., grondona, m.: job scheduling strategies for parallel processing, lecture notes in computer science, 2003, 2862, 44–60
[5] lotti, s. et al.: 2012, estimate of the impact of background particles on the x-ray microcalorimeter spectrometer on ixo, arxiv:1205.3002v1 [astro-ph.im]

discussion

james h. beal: what is the highest data rate possible between processors in your system?

bruno martino: the communication between processors of different machines is via a 1 gb/s ethernet lan. process synchronization is managed by a master machine, which also takes care of monitoring.
exceptional points in open and pt-symmetric systems

hichem eleuch a, ingrid rotter b,∗

a department of physics, mcgill university, montreal, canada h3a 2t8
b max planck institute for the physics of complex systems, d-01187 dresden, germany
∗ corresponding author: rotter@pks.mpg.de

abstract. exceptional points (eps) determine the dynamics of open quantum systems and also cause pt symmetry breaking in pt symmetric systems. from a mathematical point of view, this is caused by the fact that the phases of the wavefunctions (eigenfunctions of a non-hermitian hamiltonian) relative to one another are not rigid when an ep is approached. the system is therefore able to align with the environment to which it is coupled and, consequently, rigorous changes of the system properties may occur. we compare analytically as well as numerically the eigenvalues and eigenfunctions of a 2 × 2 matrix that is characteristic either of open quantum systems at high level density or of pt symmetric optical lattices. in both cases, the results show clearly the influence of the environment on the system in the neighborhood of eps. although the systems are very different from one another, the eigenvalues and eigenfunctions indicate the same characteristic features.

keywords: exceptional points, open quantum systems, pt symmetry breaking, dynamical phase transition, non-hermitian quantum physics, phase rigidity.

1. introduction

starting with paper [1], it has been shown that a wide class of pt symmetric non-hermitian hamilton operators provides entirely real spectra. in the following years this phenomenon has been studied in many theoretical papers, see the review [2] and the special issue [3]. in order to realize complex pt symmetric structures, the formal equivalence of the quantum mechanical schrödinger equation to the optical wave equation in pt symmetric optical lattices [4] can be exploited by involving symmetric index guiding and an antisymmetric gain/loss profile. experimental results [5] have confirmed the expectations and have, furthermore, demonstrated the onset of passive pt symmetry breaking within the context of optics. this phase transition was found to lead to a loss-induced optical transparency in specially designed pseudo-hermitian potentials. in another experiment [6], the wave propagation in an active pt symmetric coupled waveguide system is studied. both spontaneous pt symmetry breaking and power oscillations violating left-right symmetry are observed. moreover, the relation of the relative phases of the eigenstates of the system to their distance from the level crossing point is obtained. the phase transition occurs when this point is approached. the meaning of these results for a new generation of integrated photonic devices is discussed in [7]. today we have many experimental and theoretical studies related to this topic. on the other hand, non-hermitian operators are known to describe open quantum systems in a natural manner, see, e.g., [8].
in contrast to the original papers of more than 50 years ago, statistical assumptions on the system's states are no longer necessary today [9], due to the improved accuracy of the experimental as well as the theoretical studies. in the present-day papers, the system is assumed to be open due to the fact that it is embedded into the continuum of scattering wavefunctions into which the states of the system can decay. this environment always exists. it can be changed by means of external forces, but cannot be deleted [10]. the states of the system can decay due to their coupling to the environment of scattering wavefunctions, but cannot be formed out of the continuum. hence, the loss is usually nonvanishing, while the gain is zero. the complex eigenvalues of the non-hermitian hamiltonian provide both the energy $E_i$ and the lifetime $\tau_i$ (inversely proportional to the decay width $\Gamma_i$) of the eigenstate $i$. recent studies have shown the important role that singular points in the continuum play for the dynamics of open quantum systems, see, e.g., the review [10]. these singular points are usually called exceptional points (eps), after kato, who studied their mathematical properties [11] many years ago. the relation of eps to pt symmetry breaking in optical systems was considered already in the first papers [6, 7]. nevertheless, the relation between the dynamical properties of open quantum systems and those of pt symmetric systems has not been considered thoroughly up to now. it is the aim of the present paper to compare directly the influence of eps on the dynamics of open quantum systems with their influence on pt symmetry breaking in pt symmetric systems. the comparison is performed on the basis of simple models with only two levels coupled to one common channel. in both cases, the hamiltonian is given by a 2 × 2 matrix in the form usually used in the literature. we follow here the representation given for open quantum systems in [10] and for pt symmetric systems in [12]. in sect. 2, the non-hermitian hamiltonian of an open quantum system is considered. the properties of its eigenvalues and eigenfunctions are sketched, above all in the neighborhood of one or more eps. in the following section 3, two different non-hermitian operators that are used in the description of pt symmetric systems are considered. the similarities and differences to the hamiltonian of an open quantum system are discussed on the basis of analytical studies (where possible) as well as by means of numerical results. the results are summarized in the last section.

2. exceptional points in an open quantum system

in an open quantum system, the discrete states, described by a hermitian hamiltonian $H_B$, are embedded into the continuum of scattering wavefunctions, which always exists and cannot be deleted. due to this fact the discrete states turn into resonance states, the lifetime of which is usually finite. the hamiltonian $\mathcal{H}$ of the whole system consisting of the two subsystems is non-hermitian. its eigenvalues are complex and provide not only the energies of the states but also their lifetimes (inversely proportional to the widths). the hamiltonian of an open quantum system reads [10]

$$\mathcal{H} = H_B + V_{BC}\, G_C^{(+)}\, V_{CB}, \tag{1}$$

where $V_{BC}$ and $V_{CB}$ stand for the interaction between system and environment and $G_C^{(+)}$ is the green function in the environment.
the so-called internal (first-order) interaction between two states $i$ and $j$ is involved in $H_B$, while their external (second-order) interaction via the common environment is described by the last term of (1). generally, the coupling matrix elements of the external interaction consist of the principal value integral

$$\mathrm{Re}\,\langle \Phi_i^B|\mathcal{H}|\Phi_j^B\rangle - E_i^B\,\delta_{ij} = \frac{1}{2\pi}\,\mathrm{P}\!\int_{\epsilon_c}^{\epsilon_c'} \mathrm{d}E'\, \frac{\gamma_{ic}^0\,\gamma_{jc}^0}{E - E'}, \tag{2}$$

which is real, and the residuum

$$\mathrm{Im}\,\langle \Phi_i^B|\mathcal{H}|\Phi_j^B\rangle = -\frac{1}{2}\,\gamma_{ic}^0\,\gamma_{jc}^0, \tag{3}$$

which is imaginary [10]. here, the $\Phi_i^B$ and $E_i^B$ are the eigenfunctions and (discrete) eigenvalues, respectively, of the hermitian hamiltonian $H_B$, which describes the states in the subspace of discrete states without any interaction of the states via the environment. the $\gamma_{ic}^0 \equiv \sqrt{2\pi}\,\langle \Phi_i^B|V|\xi_c^E\rangle$ are the (energy-dependent) coupling matrix elements between the discrete states $i$ of the system and the environment of scattering wavefunctions $\xi_c^E$; they have to be calculated for every state $i$ and for each channel $c$ (for details see [10]). when $i = j$, (2) and (3) give the selfenergy of the state $i$. the coupling matrix elements (2) and (3) (after adding $E_i^B\,\delta_{ij}$ in the first case) are often simulated by complex values $\omega_{ij}$. in order to study the interaction of two states via one common environment it is convenient to start from two resonance states (instead of two discrete states). let us consider, as an example, the symmetric 2 × 2 matrix

$$\mathcal{H}^{(2)} = \begin{pmatrix} \varepsilon_1 \equiv e_1 + \frac{i}{2}\gamma_1 & \omega_{12} \\ \omega_{21} & \varepsilon_2 \equiv e_2 + \frac{i}{2}\gamma_2 \end{pmatrix}, \tag{4}$$

the diagonal elements of which are the two complex eigenvalues $\varepsilon_i$ ($i = 1, 2$) of a non-hermitian operator $\mathcal{H}_0$. this means that the $e_i$ and $\gamma_i \le 0$ denote the energies and widths, respectively, of the two states when $\omega_{ij} = 0$ (the index $c$ is ignored here for simplicity, $c = 1$). the $\omega_{12} = \omega_{21} \equiv \omega$ stand for the coupling of the two states via the common environment. the selfenergy of the states is assumed to be included in the $\varepsilon_i$. the two eigenvalues of $\mathcal{H}^{(2)}$ are

$$\mathcal{E}_{i,j} \equiv E_{i,j} + \frac{i}{2}\Gamma_{i,j} = \frac{\varepsilon_1 + \varepsilon_2}{2} \pm Z, \qquad Z \equiv \frac{1}{2}\sqrt{(\varepsilon_1 - \varepsilon_2)^2 + 4\omega^2}, \tag{5}$$

where $E_i$ and $\Gamma_i$ stand for the energy and width, respectively, of the eigenstate $i$. resonance states with nonvanishing widths $\Gamma_i$ repel each other in energy according to the value of $\mathrm{Re}(Z)$, while the widths bifurcate according to the value of $\mathrm{Im}(Z)$. the two states cross when $Z = 0$. this crossing point is an ep according to the definition of kato [11]. here, the two eigenvalues coalesce, $\mathcal{E}_1 = \mathcal{E}_2$. according to (5), two interacting discrete states (with $\gamma_1 = \gamma_2 = 0$) always avoid crossing, since $\omega \equiv \omega_0$ and $\varepsilon_1 - \varepsilon_2$ are real in this case and the condition $Z = 0$ cannot be fulfilled,

$$(e_1 - e_2)^2 + 4\omega_0^2 > 0. \tag{6}$$

in this case, the ep can be found only by analytical continuation into the continuum. this situation is known as avoided crossing of discrete states. it holds also for narrow resonance states if $Z = 0$ cannot be fulfilled due to the small widths of the two states. the physical meaning of this result has been very well known for many years. the avoided crossing of two discrete states at a certain critical parameter value [13] means that the two states are exchanged at this point, including their populations (population transfer). when $\omega = i\omega_0$ is imaginary,

$$Z = \frac{1}{2}\Big((e_1 - e_2)^2 + \frac{1}{4}(\gamma_1 - \gamma_2)^2 + i(e_1 - e_2)(\gamma_1 - \gamma_2) - 4\omega_0^2\Big)^{1/2} \tag{7}$$

is complex. the condition $Z = 0$ can be fulfilled only when $(e_1 - e_2)^2 + \frac{1}{4}(\gamma_1 - \gamma_2)^2 = 4\omega_0^2$ and $(e_1 - e_2)(\gamma_1 - \gamma_2) = 0$, i.e., when $\gamma_1 = \gamma_2$ (or when $e_1 = e_2$). in this case, it follows that

$$(e_1 - e_2)^2 - 4\omega_0^2 = 0 \;\rightarrow\; e_1 - e_2 = \pm 2\omega_0 \tag{8}$$

and two eps appear.
it holds further that

$$(e_1 - e_2)^2 > 4\omega_0^2 \;\rightarrow\; Z \in \mathbb{R}, \tag{9}$$
$$(e_1 - e_2)^2 < 4\omega_0^2 \;\rightarrow\; Z \in i\,\mathbb{R}, \tag{10}$$

independent of the parameter dependence of the $e_i$. in the first case, the eigenvalues $\mathcal{E}_i = E_i + \frac{i}{2}\Gamma_i$ differ from the original values $\varepsilon_i = e_i + \frac{i}{2}\gamma_i$ by a contribution to the energies, and in the second case by a contribution to the widths. the width bifurcation starts in the very neighborhood of one of the eps and becomes maximum in the middle between the two eps. this happens at the crossing point $e_1 = e_2$, where $\Delta\Gamma/2 \equiv |\Gamma_1/2 - \Gamma_2/2| = 4\omega_0$. a similar situation appears when $\gamma_1 \approx \gamma_2$, as results of numerical calculations show. the physical meaning of this result is completely different from that discussed above for discrete and narrow resonance states. it means that different time scales appear in the system without any enhancement of the coupling strength to the continuum (for details see [14]). the cross section can be calculated by means of the s matrix, $\sigma(E) \propto |1 - S(E)|^2$. a unitary representation of the s matrix in the case of two nearby resonance states coupled to one common continuum of scattering wavefunctions reads [10]

$$S = \frac{(E - E_1 - \frac{i}{2}\Gamma_1)(E - E_2 - \frac{i}{2}\Gamma_2)}{(E - E_1 + \frac{i}{2}\Gamma_1)(E - E_2 + \frac{i}{2}\Gamma_2)}. \tag{11}$$

in this expression, the influence of an ep on the cross section is contained in the eigenvalues $\mathcal{E}_i = E_i - \frac{i}{2}\Gamma_i$ of $\mathcal{H}^{(2)}$. reliable results can therefore be obtained also when an ep is approached and the s matrix has a double pole. here, the line shape of the two overlapping resonances is described by

$$S = 1 + \frac{2i\,\Gamma_d}{E - E_d - \frac{i}{2}\Gamma_d} - \frac{\Gamma_d^2}{(E - E_d - \frac{i}{2}\Gamma_d)^2}, \tag{12}$$

where $E_1 = E_2 \equiv E_d$ and $\Gamma_1 = \Gamma_2 \equiv \Gamma_d$. it deviates from the breit-wigner line shape of an isolated resonance due to interferences between the two resonances. the first term of (12) is linear (with the factor 2 in front) while the second one is quadratic. as a result, two peaks with asymmetric line shape appear in the cross section (for a numerical example see fig. 9 in [15]).
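a small numerical sketch (our illustration; the parameter values are arbitrary and not taken from the paper) makes the two-peak structure of eqs. (11)–(12) visible: for this unitary form the cross section $|1 - S(E)|^2$ vanishes exactly at $E = E_d$ at the double pole and rises to two surrounding maxima.

```python
# numerical illustration of eqs. (11)-(12): cross section ~ |1 - s(e)|^2
# for two overlapping resonances and for the double pole of the s matrix.
# widths are taken positive here; all numbers are arbitrary examples.
import numpy as np

def s_matrix(e, e1, g1, e2, g2):
    """unitary two-resonance s matrix of eq. (11); g_i are the widths."""
    return ((e - e1 - 0.5j * g1) * (e - e2 - 0.5j * g2) /
            ((e - e1 + 0.5j * g1) * (e - e2 + 0.5j * g2)))

e = np.linspace(-2.0, 2.0, 2001)

sigma_near = np.abs(1.0 - s_matrix(e, -0.2, 0.5, 0.2, 0.5))**2  # two resonances
sigma_dp   = np.abs(1.0 - s_matrix(e,  0.0, 0.5, 0.0, 0.5))**2  # double pole

print("peak of the two-resonance cross section:", sigma_near.max())
print("peak near the double pole:", sigma_dp.max())
print("value exactly at e = e_d:", sigma_dp[len(e) // 2])  # vanishes at center
```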
the eigenfunctions of the non-hermitian $\mathcal{H}^{(2)}$ are biorthogonal and can be normalized according to

$$\langle \Phi_i^* | \Phi_j \rangle = \delta_{ij}, \tag{13}$$

although $\langle \Phi_i^* | \Phi_j \rangle$ is a complex number (for details see sections 2.2 and 2.3 of [10]). the normalization (13) allows us to describe the smooth transition from the regime with orthogonal eigenfunctions to that with biorthogonal eigenfunctions (see below). it follows that

$$\langle \Phi_i | \Phi_i \rangle = \mathrm{Re}\,\langle \Phi_i | \Phi_i \rangle, \qquad A_i \equiv \langle \Phi_i | \Phi_i \rangle \ge 1 \tag{14}$$

and

$$\langle \Phi_i | \Phi_{j \ne i} \rangle = i\,\mathrm{Im}\,\langle \Phi_i | \Phi_{j \ne i} \rangle = -\langle \Phi_{j \ne i} | \Phi_i \rangle, \qquad |B_i^j| \equiv |\langle \Phi_i | \Phi_{j \ne i} \rangle| \ge 0. \tag{15}$$

at an ep, $A_i \to \infty$ and $|B_i^j| \to \infty$. the $\mathcal{E}_i$ and $\Phi_i$ contain global features that are caused by many-body forces induced by the coupling $\omega_{ik}$ of the states $i$ and $k \ne i$ via the environment. they contain moreover the selfenergy of the states $i$ due to their coupling to the environment. at the ep, the eigenfunctions $\Phi_i^{cr}$ of $\mathcal{H}^{(2)}$ of the two crossing states are linearly dependent on one another,

$$\Phi_1^{cr} \to \pm i\,\Phi_2^{cr}, \qquad \Phi_2^{cr} \to \mp i\,\Phi_1^{cr}, \tag{16}$$

according to analytical as well as numerical and experimental studies, see the appendix of [14] and section 2.5 of [10]. this means that the wavefunction $\Phi_1$ of state 1 jumps, at the ep, via the wavefunction $\Phi_1 \pm i\Phi_2$ of a chiral state to $\pm i\Phi_2$ [16]. the schrödinger equation with the non-hermitian operator $\mathcal{H}^{(2)}$ is equivalent to a schrödinger equation with $\mathcal{H}_0$ and a source term [17],

$$(\mathcal{H}_0 - \varepsilon_i)|\Phi_i\rangle = -\begin{pmatrix} 0 & \omega_{ij} \\ \omega_{ji} & 0 \end{pmatrix} |\Phi_j\rangle \equiv W|\Phi_j\rangle. \tag{17}$$

due to the source term, two states are coupled via the common environment of scattering wavefunctions into which the system is embedded, $\omega_{ij} = \omega_{ji} \equiv \omega$. the schrödinger equation (17) with source term can be rewritten in the following manner [17],

$$(\mathcal{H}_0 - \varepsilon_i)|\Phi_i\rangle = \sum_{k=1,2} \langle \Phi_k | W | \Phi_i \rangle \sum_{m=1,2} \langle \Phi_k | \Phi_m \rangle\, |\Phi_m\rangle. \tag{18}$$

according to the biorthogonality relations (14) and (15) of the eigenfunctions of $\mathcal{H}^{(2)}$, (18) is a nonlinear equation. the most important part of the nonlinear contributions is contained in

$$(\mathcal{H}_0 - \varepsilon_n)|\Phi_n\rangle = \langle \Phi_n | W | \Phi_n \rangle\, |\Phi_n|^2\, |\Phi_n\rangle. \tag{19}$$

the nonlinear source term vanishes far from an ep due to $\langle \Phi_k | \Phi_k \rangle \to 1$ and $\langle \Phi_k | \Phi_{l \ne k} \rangle = -\langle \Phi_{l \ne k} | \Phi_k \rangle \to 0$, according to (13) to (15). thus, the schrödinger equation with source term is linear far from an ep, as usually assumed. it is however nonlinear in the neighborhood of an ep. it is meaningful to represent the eigenfunctions $\Phi_i$ of $\mathcal{H}^{(2)}$ in the set of basis wavefunctions $\Phi_i^0$ of $\mathcal{H}_0$,

$$\Phi_i = \sum_{j=1}^{N} b_{ij}\,\Phi_j^0, \qquad b_{ij} = |b_{ij}|\,e^{i\theta_{ij}}. \tag{20}$$

the $b_{ij}$ are also normalized according to the biorthogonality relations of the wavefunctions $\{\Phi_i\}$. the angle $\theta_{ij}$ can be determined from $\tan\theta_{ij} = \mathrm{Im}(b_{ij})/\mathrm{Re}(b_{ij})$. from (13) and (16) it follows:
• when two levels are distant from one another, their eigenfunctions are (almost) orthogonal, $\langle \Phi_k^* | \Phi_k \rangle \approx \langle \Phi_k | \Phi_k \rangle = A_k \approx 1$.
• when two levels cross at the ep, their eigenfunctions are linearly dependent according to (16) and $\langle \Phi_k | \Phi_k \rangle \equiv A_k \to \infty$.
these two relations show that the phases of the two eigenfunctions relative to one another change when the crossing point is approached. this can be expressed quantitatively by defining the phase rigidity $r_k$ of the eigenfunction $\Phi_k$,

$$r_k \equiv \frac{\langle \Phi_k^* | \Phi_k \rangle}{\langle \Phi_k | \Phi_k \rangle} = A_k^{-1}. \tag{21}$$

it holds that $1 \ge r_k \ge 0$. the non-rigidity $r_k$ of the phases of the eigenfunctions of $\mathcal{H}^{(2)}$ follows also from the fact that $\langle \Phi_k^* | \Phi_k \rangle$ is a complex number (unlike the norm $\langle \Phi_k | \Phi_k \rangle$, which is a real number), such that the normalization condition (13) can be fulfilled only with the additional postulate $\mathrm{Im}\,\langle \Phi_k^* | \Phi_k \rangle = 0$ (which generally corresponds to a rotation). when $r_k < 1$, an analytical expression for the eigenfunctions as a function of a certain control parameter can generally not be obtained. the non-rigidity $r_k < 1$ of the phases of the eigenfunctions of $\mathcal{H}^{(2)}$ in the neighborhood of eps is the most important difference between non-hermitian quantum physics and hermitian quantum physics. mathematically, it causes nonlinear effects in quantum systems in a natural manner, as shown above. physically, it allows the alignment of one of the states of the system to the common environment [10]. results of numerical calculations are given, e.g., in [18]. the mixing coefficients $b_{ij}$ (defined in (20)) of the wavefunctions of the two states due to their avoided crossing are simulated by assuming a gaussian distribution for the coupling coefficients, $\omega_{i \ne j} = \omega\, e^{-(E_i - E_j)^2}$ (for real $\omega$, the results of the simulation agree with the results [17] of exact calculations). in [18], results of different calculations are shown for illustration. here, the coupling coefficients $\omega$ are assumed to be either real or complex or imaginary, according to the different possibilities provided by (2) and (3). the main difference between the eigenvalue trajectories with real coupling coefficients $\omega$ and those with imaginary coupling coefficients $\omega$ is related to the relations (6) to (10) obtained analytically. for $\gamma_1 \ne \gamma_2$ and real, complex or even imaginary $\omega$, the results show one ep when the condition $Z = 0$ is fulfilled. this ep is generally isolated from other eps when the level density is low. in the case of $\gamma_1 \approx \gamma_2$ and imaginary $\omega$, however, two related eps appear, see fig. 1 right panel. between these two eps, the widths $\Gamma_i$ bifurcate (fig. 1d) while the energies $E_i$ do not change (fig. 1b). it is interesting to see that width bifurcation occurs between the two eps, according to (8) and (10), without any enhancement of the coupling strength to the environment. beyond the two eps, the eigenvalues approach the original values. in a finite neighborhood of the point at which the two eigenvalue trajectories cross, the eigenfunctions are mixed, and $|b_{ij}| \to \infty$ when the ep is approached (fig. 1f). the phases of all components of the eigenfunctions jump at the ep either by $-\pi/4$ or by $+\pi/4$ [19]. this means that the phases of both eigenfunctions jump in the same direction by the same amount. thus, there is a phase jump of $-\pi/2$ (or $+\pi/2$) when one of the eigenfunctions passes into the other one at the ep. this result is in agreement with (16). it holds true for real as well as for imaginary $\omega$.
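the eigenvalue trajectories and the loss of phase rigidity described above can be checked directly. the sketch below (our illustration) diagonalizes $\mathcal{H}^{(2)}$ of eq. (4) along the parameter $a$, using the right-panel parameters quoted in the caption of fig. 1; the two eps of eq. (8) then lie at $1 - 1.5a = \pm 0.1$, i.e. at $a = 0.6$ and $a \approx 0.733$. the widths are taken with positive sign here, which does not affect the bifurcation pattern.

```python
# numerical sketch of the open-system case: eigenvalues and phase rigidity
# r_k of eq. (21) for h(2) of eq. (4), with the right-panel parameters of
# fig. 1: e1 = 1 - a/2, e2 = a, gamma_1/2 = gamma_2/2 = 0.5, omega = 0.05i.
import numpy as np

def h2(a, omega=0.05j):
    e1, e2 = 1.0 - 0.5 * a, a
    g = 0.5                                  # gamma_i / 2 for both states
    return np.array([[e1 + 1j * g, omega],
                     [omega,       e2 + 1j * g]])

for a in np.linspace(0.55, 0.78, 6):
    vals, vecs = np.linalg.eig(h2(a))
    rigidity = []
    for k in range(2):
        v = vecs[:, k]
        # biorthogonal norm <phi*|phi> = v^t v; r_k = |v^t v| / (v^+ v)
        rigidity.append(abs(v @ v) / np.vdot(v, v).real)
    print(f"a = {a:.3f}  eigenvalues = {vals.round(4)}  r = {np.round(rigidity, 3)}")
```

between the two eps the eigenvalues share the same real part while their imaginary parts split (width bifurcation), and the phase rigidity drops far below 1 in the neighborhood of both eps, as described in the text.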
3. exceptional points in pt symmetric systems

as has been shown in [4], the optical wave equation in pt symmetric optical lattices is formally equivalent to a quantum mechanical schrödinger equation. complex pt symmetric structures can be realized by involving symmetric index guiding and an antisymmetric gain/loss profile. the main difference between these optical systems and open quantum systems consists in the asymmetry of gain and loss in the first case, while the states of an open quantum system can only decay ($\mathrm{Im}(\varepsilon_{1,2}) < 0$ and $\mathrm{Im}(\mathcal{E}_{1,2}) < 0$ for all states). thus, the modes involved in the non-hermitian hamiltonian in optics appear in complex conjugate pairs, while this is not the case in an open quantum system. as a consequence, the hamiltonian for pt symmetric structures in optical lattices may have real eigenvalues in a large parameter range. the 2 × 2 non-hermitian hamiltonian may be written as [4, 12]

$$\mathcal{H}_{PT} = \begin{pmatrix} e - \frac{i\gamma}{2} & w \\ w^* & e + \frac{i\gamma}{2} \end{pmatrix}, \tag{22}$$

where $e$ stands for the energy of the two modes, $\pm\gamma$ describes gain and loss, respectively, and the coupling coefficients $w$ stand for the coupling of the two modes via the lattice. when the pt symmetric optical lattices are studied with vanishing gain, the hamiltonian reads

$$\mathcal{H}'_{PT} = \begin{pmatrix} e - \frac{i\gamma}{2} & w \\ w^* & e \end{pmatrix}. \tag{23}$$

in realistic systems, $w$ in (22) and (23) is mostly real (or almost real) [20]. the eigenvalues of the hamiltonian (22) differ from (5),

$$\mathcal{E}_{\pm}^{PT} = e \pm \frac{1}{2}\sqrt{4|w|^2 - \gamma^2} \equiv e \pm Z_{PT}. \tag{24}$$

a similar expression is derived in [5]. since $e$ and $\gamma$ are real, the $\mathcal{E}_{\pm}^{PT}$ are real when $4|w|^2 > \gamma^2$. under this condition, the two levels repel each other in energy, which is characteristic of discrete interacting states.
with further increasing γ and 4|w|2 � γ 2 4 e ′pt ± → e− i γ 4 ± i γ 4 = { e e− iγ2 . (29) the two modes (29) behave differently. while the loss in one of them is large, it is almost zero in the other one. thus, only one of the modes effectively survives. equation (29) corresponds to high transparency at large γ. further, two eps exist according to 4|w|2 = (±γ/2)2 (30) and γ2/4 < 4|w|2 → z ′ pt ∈<, (31) γ2/4 > 4|w|2 → z ′ pt ∈=. (32) by analogy with to (25) up to (27), these relations are independent of the parameter dependence of γ and of the ratio re(w)/ im(w). thus, the difference between the eigenvalues ei of h(2) of an open quantum system and the eigenvalues of the hamiltonian of a pt symmetric system consists, above all, in the fact that the ei depend on the ratio re(ω)/ im(ω) while the ept± and e ′pt ± are independent of re(w)/ im(w). there exist however similarities between the two cases. it is interesting to compare the eigenvalues ei of h(2) obtained for imaginary non-diagonal matrix elements ω, with the eigenvalues of (22) (or (23)) obtained for real w. in both cases, there are two eps, see fig. 1. in the first case (right panel), the energies ei of both states are equal and the widths γi bifurcate between the two eps. this situation is characteristic of an open quantum system at high level density with complex (almost imaginary) ω, see eqs. (8) to (10). in the second case (left panel) however the difference |e1 − e2| of the energies first increases (level repulsion) and then decreases again while the widths γi of both states vanish in the parameter range between the two eps in accordance with the analytical results (25) to (27). between the two eps, level repulsion causes the two levels to be distant from one another and w is expected to be (almost) real. this result agrees qualitatively with (2) and (3). similar results are obtained for the eigenvalues of (23). the only difference from those of (22) is that the γi do not vanish but decrease between the two eps with increasing a in this case. according to figs. 1a–d, the role of energy and width is formally exchanged when the eigenvalues of the hamiltonian (4) are compared with those of (22) (or (23)). in any case, the eigenvalues are influenced strongly by the eps. also the eigenfunctions of the hamiltonian (4) of an open quantum system (with imaginary ω) and those of the hamiltonians (22) and (23) of a pt symmetric system (with real w) show similar features. the eigenfunctions φpti of hpt (and φ ′pt i of h ′ pt ) are biorthogonal with all the consequences discussed in sect. 2. in contrast to the eigenvalues, they are dependent on the ratio re(ω)/ im(ω). the eigenfunctions can be represented in a set of basic wavefunctions in full analogy to the representation of the eigenfunctions φi of h(2) in (20). they contain valuable information on the mixing of the wavefunctions under the influence of the non-diagonal coupling matrix elements w and w∗ in (22) and (23), respectively, and its relation to eps. due to the level repulsion occurring between the two eps, the coupling coefficients w can be considered to be (almost) real in realistic cases. the phases of the eigenmodes of the non-hermitian hamiltonians (22) and (23) are not rigid, generally, in approaching an ep, and spectroscopic redistribution processes occur in the system under the influence of the environment (lattice). as in the case of open quantum systems, the phase rigidity rk can be defined according to (21). 
it varies between 1 and 0 and is a quantitative measure for the skewness of the modes when the crossing point is approached. in figs. 1ef, the eigenfunctions of the hamiltonian (22) (calculated with real w) are compared to those of the hamiltonian (4) (calculated with imaginary ω). they show the same characteristic features. as can be seen from fig. 1e, pt symmetry breaking is accompanied by a mixing of the eigenfunctions in a finite neighborhood of the eps in pt symmetric systems. this result is in complete analogy to the results shown in fig. 1f for open quantum systems 110 vol. 54 no. 2/2014 exceptional points in open and pt-symmetric systems figure 1. energies ei, widths γi/2 and wavefunctions |bij| of n = 2 states coupled to k = 1 channel as function of a of a pt symmetric system with hamiltonian (22) (left panel) and of an open quantum system with hamiltonian (4) (right panel). parameters left panel: e = 0.5, γ1 = −γ2 = 0.05a, w = 0.05; right panel: e1 = 1 − 0.5a, e2 = a, γ1/2 = γ2/2 = 0.5, ω = 0.05i. the dashed lines in (a,b) show ei(a). where a hint of width bifurcation can be seen in the mixing of the eigenfunctions around these points. also the phases of the eigenfunctions jump in both cases by π/4 at the eps (not shown here). in the parameter region between the two eps, the eigenfunctions are completely mixed (1:1) in both cases while they are unmixed far beyond the eps, see figs. 1ef. 4. discussion of the results on the basis of 2 × 2 models, we have compared the influence of an ep on the dynamics of an open quantum system with its influence on pt symmetry breaking in a pt symmetric system. in the first case the coupling of the two states via the environment is symmetric (ω12 = ω21 ≡ ω). in the second case however, the formal equivalence of the optical wave equation in pt symmetric optical lattices with a quantum mechanical schrödinger equation causes the two nondiagonal matrix elements to be complex conjugate (w21 = w∗12). the eigenvalues depend in the first case on the ratio re(ω)/ im(ω) while they are independent of re(w)/ im(w) in the second case. the eigenfunctions are sensitive to re(ω)/ im(ω) and re(w)/ im(w), respectively, in both cases. the eps cause nonlinear effects in their neighborhood which determine the evolution of open as well as of pt symmetric systems. most important for the dynamics of an open quantum system is the regime at high level density where the coupling coefficients are (almost) imaginary. here, two eps appear when the decay widths γi of both states are (almost) the same. approaching the eps, width bifurcation starts and ends, respectively, while beyond the eps the widths of both states are equal (or similar) to one another. the energies of the two states show an opposite behavior: it is ei = e2 (or ei ≈ e2) in the parameter range between the two eps while the states repel each other in energy beyond the eps. the width bifurcation related to the two eps becomes relevant for the dynamics of an open quantum system at high level density. here, short-lived and long-lived states are formed which are related to different time scales of the system (for details see [14]). two eps appear also in a pt symmetric system, and pt symmetry breaking is directly related to them. from a mathematical point of view however, energy and time are exchanged in comparison with the corresponding values in an open quantum system. 
this means that the widths of both states are equal and vanish, in the case of the hamiltonian (22) with gain and loss, in the whole parameter range between the two eps. in this parameter range the eigenvalues are real and, furthermore, level repulsion prohibits a small energy distance between the two levels. therefore the non-diagonal coupling matrix elements w are (almost) real, re(w) ≫ im(w). the eigenfunctions of the different 2 × 2 models considered in the present paper show very clearly that the spectroscopic redistribution inside the system is indeed caused by the eps. however, it shows up in all cases in a finite neighborhood around them. here the rigidity of the phases of the two eigenfunctions relative to one another is reduced (ri < 1), and an alignment of one of the states to the environment is possible. in the parameter range between the two eps, the wavefunctions are completely mixed (1:1), as can be seen from the numerical results shown in fig. 1. summing up the discussion, we state the following. the results obtained by studying pt-symmetric optical lattices, as well as those obtained from an investigation of open quantum systems, show the characteristic features of non-hermitian quantum physics. they prove environmentally induced effects that cannot be described convincingly in conventional hermitian quantum physics. due to the reduced phase rigidity around an ep, the system is able to align (at least partly) with the environment. this can be seen from the pt-symmetry breaking occurring in one of the considered systems, as well as from the dynamical phase transition taking place at high level density in the other system.

references
[1] bender, c. m., boettcher, s.: real spectra in non-hermitian hamiltonians having pt symmetry, phys. rev. lett. 80 (1998) 5243–5246
[2] bender, c. m.: making sense of non-hermitian hamiltonians, rep. progr. phys. 70 (2007) 947–1018
[3] special issue quantum physics with non-hermitian operators, j. phys. a 45, number 44 (november 2012)
[4] ruschhaupt, a., delgado, f., muga, j. g.: physical realization of pt-symmetric potential scattering in a planar slab waveguide, j. phys. a 38 (2005) l171–l176; el-ganainy, r., makris, k. g., christodoulides, d. n., musslimani, z. h.: theory of coupled optical pt-symmetric structures, optics lett. 32 (2007) 2632–2634; makris, k. g., el-ganainy, r., christodoulides, d. n., musslimani, z. h.: beam dynamics in pt symmetric optical lattices, phys. rev. lett. 100 (2008) 103904 (4pp); musslimani, z. h., makris, k. g., el-ganainy, r., christodoulides, d. n.: optical solitons in pt periodic potentials, phys. rev. lett. 100 (2008) 030402 (4pp)
[5] guo, a., salamo, g. j., duchesne, d., morandotti, r., volatier-ravat, m., aimez, v., siviloglou, g. a., christodoulides, d. n.: observation of pt-symmetry breaking in complex optical potentials, phys. rev. lett. 103 (2009) 093902 (4pp)
[6] rüter, c. e., makris, k. g., el-ganainy, r., christodoulides, d. n., segev, m., kip, d.: observation of parity-time symmetry in optics, nature physics 6 (2010) 192–195
[7] kottos, t.: broken symmetry makes light work, nature physics 6 (2010) 166–167
[8] feshbach, h.: unified theory of nuclear reactions, ann. phys. (n.y.) 5 (1958) 357–390; feshbach, h.: a unified theory of nuclear reactions. ii, ann. phys. (n.y.) 19 (1962) 287–313
[9] rotter, i.: a continuum shell model for the open quantum mechanical nuclear system, rep. prog. phys.
54 (1991) 635–682
[10] rotter, i.: a non-hermitian hamilton operator and the physics of open quantum systems, j. phys. a 42 (2009) 153001 (51pp), doi: 10.1088/1751-8113/42/15/153001
[11] kato, t.: perturbation theory for linear operators, springer, berlin, 1966
[12] rotter, i.: environmentally induced effects and dynamical phase transitions in quantum systems, j. opt. 12 (2010) 065701 (9pp), doi: 10.1088/2040-8978/12/6/065701
[13] landau, l.: physics soviet union 2, 46 (1932); zener, c.: non-adiabatic crossing of energy levels, proc. royal soc. london, series a 137 (1932) 696–702
[14] rotter, i.: dynamical stabilization and time in open quantum systems, contribution to the special issue quantum physics with non-hermitian operators: theory and experiment, fortschritte der physik – progress of physics 61, no. 2–3 (2013) 178–193, doi: 10.1002/prop.201200054
[15] müller, m., dittes, f. m., iskra, w., rotter, i.: level repulsion in the complex plane, phys. rev. e 52 (1995) 5961–5973
[16] in studies by other researchers, the factor i in (16) does not appear. this difference is discussed in detail and compared with experimental data in the appendix of [14] and in section 2.5 of [10]. in the present paper, the definitions εi = ei + (i/2)γi and Ei = Ei + (i/2)Γi (with γi ≤ 0 and Γi ≤ 0 for decaying states) are used
[17] rotter, i.: dynamics of quantum systems, phys. rev. e 64 (2001) 036213 (12pp)
[18] eleuch, h., rotter, i.: width bifurcation and dynamical phase transitions in open quantum systems, phys. rev. e 87 (2013) 052136 (15pp), doi: 10.1103/physreve.87.052136; eleuch, h., rotter, i.: avoided level crossings in open quantum systems, contribution to the special issue quantum physics with non-hermitian operators: theory and experiment, fortschritte der physik – progress of physics 61, no. 2–3 (2013) 194–204, doi: 10.1002/prop.201200062
[19] eleuch, h., rotter, i.: eur. phys. j. d 68 (2014) 74 (16pp), doi: 10.1140/epjd/e2014-40780-8
[20] kottos, t.: private communication

prospects for advanced engineering design based on risk assessment

m. holický

current approaches to the design of structures are based on the concept of target probability of failure. this value is, however, often specified on the basis of comparative studies and past experience only. moreover, the traditional probabilistic approach cannot properly consider gross errors and accidental situations, both of which are becoming more frequent causes of failure. this paper shows that it is useful to supplement a probabilistic design procedure by a risk analysis and assessment, which can take into account the consequences of all unfavourable events. it is anticipated that in the near future advanced engineering design will include criteria of acceptable risks in addition to the traditional probabilistic conditions.

keywords: reliability, hazard situations, adverse events, costs, risk assessment, bayesian network, advanced engineering design.

1 notation
g(x) – performance (limit state) function
cij – consequences of the events eij
ctot – total expected cost
eij – events
hi – hazard situation i
h1 – hazard situation under normal conditions
h2 – hazard situation due to fire
p(f|hi) – probability of failure f given the situation hi
pf – probability of failure f
pd – target probability of failure
pf – probability p(f|h2) of structural failure during fire
pfi,s – probability of fire start, p(h2)
pfi,d – conditional probability of fire flashover given h2
pfi – probability of fire flashover
pt,fi – target probability of structural failure under the fire design situation
x – generic point of the vector of basic variables
x – vector of basic variables
β – reliability index
φx(x) – probability density function of the vector of basic variables x
Φx(x) – distribution function of the vector of basic variables x

2 introduction

design and assessment of civil structures suffer from a number of uncertainties, which can hardly be described by available theoretical tools.
according to thoft-christensen and baker [1], melchers [2] and holický [3], these uncertainties include:
• natural randomness of basic variables,
• statistical uncertainties caused by the limited size of available data,
• model uncertainties caused by deficiencies of computational models,
• uncertainties caused by inaccuracy in the definitions of limit states,
• gross errors caused by human faults,
• lack of understanding of the actual behaviour of materials and structures.
these uncertainties are listed in the order corresponding to their increasing effect on the frequency of failures and the decreasing possibility of describing them theoretically. traditional probability methods usually deal with the first three types of uncertainty only. it was shown by holický [3] that the fourth type of uncertainty can be partly described using the theory of fuzzy sets. theoretical tools for the description of gross errors are insufficient (as indicated by melchers [2]), while no tools are available to describe the lack of understanding of the actual behaviour of new materials and structures. the available theoretical tools obviously have a limited capability of describing all types of uncertainties. this adverse reality corresponds to the observed proportions of failure causes, for which informative values are indicated in table 1 (obtained from the data provided by melchers [2], stewart and melchers [4] and other publications quoted in these references). the first line in table 1 indicates the proportions of various origins of structural failures chosen from basic activities during the construction and service life of structures. the second line indicates relations between these activities and two main causes: gross errors (about 80 %) due to human activity, and environmental effects (about 20 %), which are not directly dependent on human activity. environmental influences include both random and hazard (accidental) situations, e.g. due to impact, explosion, fire and extreme climatic actions. thus, natural randomness causes only a small proportion of the failures (about 10 %). obviously, further development of more precise procedures based on the traditional probabilistic approach (the basis of which is mentioned in the following section) has only limited significance. advanced engineering design methods
should therefore attempt to consider the actual causes of failures.

table 1: the proportions of causes of structural failures
origin: design 20 %, execution 50 %, use 15 %, other 15 %
causes: gross errors due to human activity 80 %, environmental effects 20 %

3 the probabilistic method

the probabilistic method of designing structures assumes that a failure f of the structure is unequivocally described by the inequality g(x) < 0, where g(x) denotes the limit state function (g(x) = 0 describes the limit state, g(x) > 0 the safe state), and x is a realisation of the vector of basic variables x. if φx(x) indicates the joint probability density of the vector of basic variables x, the probability of failure pf can be determined from the relation

pf = ∫_{g(x)<0} φx(x) dx. (1)

the reliability index β is formally defined on the basis of the probability pf using the relation pf = Φ(−β), where Φ is the distribution function of the standardised normal distribution. calculation of the probability of failure pf using equation (1) suffers from two essential deficiencies, as demonstrated by ellingwood [5]:
• uncertainty in the definition of the limit state function g(x),
• uncertainties in the theoretical models describing the basic variables x.
these deficiencies are most likely the main sources of the observed discrepancy between the determined probability pf and the actual frequency of failures. that is why the quantities pf and β are often referred to as "formal" (notional) reliability indicators (associated with the intention to standardise theoretical models of basic variables). however, such an approach jeopardises the nature of probabilistic concepts, including the methods of probabilistic optimisation, which should provide the target probability of failure pd used in the design condition pf < pd. in order to increase the significance of probabilistic concepts, a considerable effort focussed on improving the theoretical models describing basic variables and on extending the traditional probabilistic concepts by risk assessment methods has recently been observed by stewart and melchers [4] and ellingwood [5].

4 the concept of acceptable risk

the risk assessment of a system attempts to cover all possible events that might lead to unfavourable effects related to the considered system. as mentioned above, these events are caused mainly by gross errors in human activity and by accidental actions such as impact, explosion, fire and extreme climatic loads. the relevant situations (hazard scenarios and common design situations), designated generally as hi, occur with the probability p(hi). if the failure of the structure f due to a particular situation hi occurs with the conditional probability p(f|hi), then the total probability of failure pf is given as

pf = Σi p(f|hi) p(hi). (2)

the conditional probabilities p(f|hi) must be determined by a separate analysis of the respective situations hi. equation (2) can be used for harmonisation of the partial probabilities of failure p(f|hi) p(hi) corresponding to the situations hi, and for the following risk considerations. in general, the situations hi may cause a number of unfavourable events eij (e.g. excessive deformations, full development of the fire). it is assumed that the adverse consequences of these events can be expressed by a one-component quantity cij (for example, by the cost expressed in a certain currency). it is further assumed that the consequences cij are uniquely related to the events eij.
then the total risk c related to the considered situations hi is the sum

c = Σij cij p(eij|hi) p(hi). (3)

it is sometimes necessary to describe the consequences of an unfavourable event eij by a quantity having several components, denoted as cij,k (describing, for example, cost, injuries or casualties). the components ck of the resultant risk are then given as

ck = Σij cij,k p(eij|hi) p(hi). (4)

if it is possible to specify an acceptable limit ck,d for the components ck, it is possible to design the structure on the basis of the condition of acceptable risks ck < ck,d, which supplements the probability condition pf < pd.

5 example of a structure under a fire situation

an example illustrating the concept of acceptable risks concerns a structure for which only two different situations are considered:
• h1 persistent design situation, for which p(h1) = 0.99 is assumed,
• h2 accidental situation during fire, for which p(h2) = 0.01 is assumed.
the persistent situation h1 is analysed using the traditional probabilistic reliability analysis. an example of an analysis of situation h2 is indicated in figure 1, which shows a bayesian network describing the structure during a fire. the chance, decision and utility nodes indicated in figure 1 are briefly described below. a more detailed description is given by holický and schleich [6]. an alternative type of network and analysis was recently provided by holický and schleich [7].
1 – fire starts. the parentless chance node describing the initiation of a fire. the probability pfi,s = p(h2) = 0.01 is assumed for the positive state (fire starts), considering an office compartment of 25 m² during its design life of 50 years.
2 – detection by occupants. the chance node describing the detection of smoke by occupants or neighbours within a suitable time period. the conditional probability 0.9, given that the fire has started (parent node 1), is considered.
3 – occupancy. the chance node describing the activity of the occupants of the building to diminish the fire. the conditional probabilities related to the states of parent nodes 2 and 6 are given by holický and schleich [6].
4 – tampering. this parentless chance node describes the interference of random factors with the automatic fire detection system (node 5). the probability 0.02 is considered for the disturbing effects on the detection system.
5 – smoke detection. the chance node describing the operation of an automatic smoke detection system. the conditional probabilities related to parent nodes 1 and 4 are given by holický and schleich [6].
6 – alarm. the chance node describing the operation of an acoustic fire alarm system. the conditional probabilities related to the states of parent nodes 2, 5 and 8 are considered in accordance with holický and schleich [6].
7 – tampering. the parentless chance node describing the interference of random factors with the automatic sprinkler system (node 8). the probability 0.02 is considered for the disturbing effects on the sprinkler system.
8 – sprinklers. the chance node describing the operation of the automatic sprinkler system (if installed). the conditional probabilities related to the states of parent nodes 1 and 7 are indicated by holický and schleich [6].
9 – transmission. the chance node describing the operation of manual or automatic alarm transmission to the fire brigade.
the conditional probabilities related to the states of parent nodes 2, 5 and 8 are given by holický and schleich [6].
10 – fire brigade. the chance node describing the operation of a professional fire brigade. the conditional probability 0.9 that the fire brigade is active when the alarm (parent node 9) goes off is considered.
11 – flashover. the chance node describing the development of the fire. the conditional probabilities related to the states of parent nodes 1, 3, 8 and 10 are given by holický and schleich [6].
12 – collapse. the chance node describing structural collapse under the fire design situation in the case of fire flashover. the conditional probability 0.2 of structural collapse given fire flashover is considered in the example.
13 – protection. the parentless decision node describing the decision concerning protection of the structure against fire. the node has two states: 'yes' and 'no'. as indicated by holický and schleich [6], for the state 'no' the child node 12 (collapse) has a greater probability of a positive state than for the positive decision 'yes' concerning structural protection.
14 – cost. the utility node describing the cost c14(13) of structural protection (affecting node 12), which depends on the state of node 13. the relative value 10, expressed in monetary units, is considered if the decision (node 13) is positive.

fig. 1: an example of the bayesian belief network representing the fire situation, consisting of the chance nodes 1-fire starts, 2-detection by occupants, 3-occupancy, 4-tampering, 5-smoke detection, 6-alarm, 7-tampering, 8-sprinklers, 9-transmission, 10-fire brigade, 11-flashover and 12-collapse, the decision node 13-protection, and the utility nodes 14 to 17 (costs).

15 – cost. the utility node describing the damage cost c15(8, 10, 11) caused by the sprinklers (node 8) and the fire brigade (node 10) if the fire (node 11) does not flash over. the relative costs, expressed in the same monetary units as the cost c14(13), are indicated by holický and schleich [6]. if the fire develops fully (node 11), these costs are covered by utility node 16.
16 – cost. the utility node describing the damage cost c16(11, 12), assuming that fire flashover has occurred. the relative value of 100 units, expressed in the same monetary units as the costs c14(13) and c15(8, 10, 11), is assumed.
17 – cost. the utility node describing the damage cost c17(12) due to the collapse of the structure (node 12). relative values of the cost c17(12) from 10⁵ to 10⁸ are considered in the example.

assuming the independence of the two situations h1 and h2, it holds that p(h1) + p(h2) = 1. if p(f|h1) = 10⁻⁵ (which is an expected value) and p(f|h2) = 10⁻³, then the probability of failure according to relation (2) is

pf = p(f|h1) p(h1) + p(f|h2) p(h2) ≈ 2 × 10⁻⁵. (5)

further, it is assumed that the following events, the first of which are related to situation h1 and the others to h2, may occur:
• e11 structural failure due to exceeding the ultimate limit state,
• e12 unacceptable deformations, i.e. exceeding the serviceability limit state,
• e21 activation of sprinklers (figure 1, chance node 8),
• e22 intervention of the fire brigade (figure 1, chance node 10),
• e23 full development of the fire (figure 1, chance node 11),
• e24 structural failure due to fire (figure 1, chance node 12).
the conditional probabilities p(eij|hi) can generally be determined on the basis of a detailed probabilistic analysis of the two situations h1 and h2.
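as a quick numerical check of equations (2) and (5), the following sketch (python; not part of the original paper) evaluates the total failure probability with the values quoted above:

```python
# numerical check of equations (2) and (5) with the values quoted above
p_h = {"h1": 0.99, "h2": 0.01}            # p(h_i)
p_f_given_h = {"h1": 1e-5, "h2": 1e-3}    # p(f|h_i)

p_f = sum(p_f_given_h[h] * p_h[h] for h in p_h)   # eq. (2)
print(f"p_f = {p_f:.2e}")                 # -> 1.99e-05, i.e. about 2e-5
```

the result, pf ≈ 1.99 × 10⁻⁵, reproduces the rounded value 2 × 10⁻⁵ in (5).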
assuming that the unfavourable consequences are given by the quantities cij or cij,k corresponding to an unfavourable event eij, equations (3) and (4) may be applied to determine the total risk c or its components ck. note that the unfavourable consequences c21, c22, c23 and c24 are described in figure 1 by the utility nodes 15, 16 and 17. the total expected cost ctot can then be given by a simplified equation (3) as the sum

ctot = c14(13) + c15(8, 10, 11) + c16(11, 12) + pf × c17(12), (6)

where, as described above, c14(13) is the cost depending on the state of node 13, c15(8, 10, 11) is the damage cost depending on the states of nodes 8, 10 and 11, and c16(11, 12) is the cost due to flashover depending on the states of nodes 11 and 12. the last term in the sum, pf × c17(12), is the expected cost due to structural failure (collapse), where pf is the probability of failure and c17(12) is the damage cost given the failure. the damage cost c17(12) is a complex quantity, which depends on many factors including the cost of the structure and other costs due to structural malfunctioning.
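anticipating the influence-diagram analysis below, the following sketch (python; illustrative only, not from the original paper) evaluates the expected total cost (6) with and without protection. the failure probabilities anticipate table 2 in the next section; the combined expected contribution of the c15 and c16 terms is a hypothetical placeholder (c_other), since the paper quotes those costs only via [6].

```python
# expected total cost (6) as a function of the collapse cost c17(12);
# p(f|h2) values are taken from table 2, c_other is an assumed placeholder
# for the expected contribution of the c15 and c16 terms
p_f = {"yes": 1.0e-5, "no": 3.6e-5}   # with / without structural protection
c14 = {"yes": 10.0, "no": 0.0}        # protection cost (node 14)
c_other = 0.05                        # assumed expected c15 + c16 contribution

for c17 in (1e5, 5e5, 1e6, 1e7, 1e8):
    with_p = c14["yes"] + c_other + p_f["yes"] * c17
    without = c14["no"] + c_other + p_f["no"] * c17
    print(f"c17={c17:8.0e}  with protection={with_p:10.2f}  "
          f"without={without:10.2f}")
```

with these numbers the break-even point falls near c17 ≈ 4 × 10⁵, consistent with the value of about 5 × 10⁵ read off figure 2.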
6 probabilistic analysis

the bayesian network was analysed using the program hugin 1999. the resulting probabilities pfi of fire flashover, the conditional probabilities pfi,d, and the probabilities of structural failure pf are shown in table 2. the probability pfi of fire flashover (0.00013) obtained by the probabilistic analysis of the network seems to be relatively low. note, however, that this value is valid for the fire start probability pfi,s = p(h2) = 0.01 (corresponding to a small compartment area a = 25 m² and a 50-year time period), which is linearly dependent on the compartment area a. thus, the input probability pfi,s may be much greater than 0.01. if, for example, the compartment area is ten times greater (250 m²), then pfi,s = 0.1, and the probabilities pfi will also be ten times greater than the values indicated in table 2. the conditional probability pfi,d that the fire, once started, will develop fully (shown in the second line of table 2) is relatively low, primarily due to the relatively high efficiency of the sprinklers considered by holický and schleich [6]. table 2 also shows that the probability of structural failure may be decreased using appropriate structural protection. however, the data given in table 2 depend on the input conditional probabilities, which should be determined on the basis of a detailed probabilistic analysis taking into account the actual protection measures. having the probability of fire flashover pfi, it is now possible to specify the target probability pt,fi of structural failure under the fire design situation using equation (2). obviously, with an increasing probability of fire flashover pfi, the probability pt,fi decreases. as pfi is dependent on the compartment area a, the probabilities pt,fi are also dependent on a. a detailed discussion is provided by holický and schleich [6]. for large compartment areas a, the target probability pt,fi of structural failure under the fire design situation will be very small and, consequently, it may be difficult (if not impossible) to design the structure under this condition. in such a case, it may be necessary to use additional elements of the fire protection system in order to decrease the probability of fire flashover pfi. it appears that the bayesian network may effectively be used to model a fire protection system and, possibly, to find its optimum arrangement. for this purpose, decision and utility nodes often supplement a bayesian network like that in figure 1.

table 2: probabilities of fire flashover pfi and conditional probabilities pfi,d
decision concerning protection: yes / no
probability of fire flashover pfi, assuming p(h2) = 0.01: 0.00013 (both decisions)
conditional probability pfi,d of fire flashover given h2: 0.013 (both decisions)
probability of structural failure during fire, pf = p(f|h2): 1.0 × 10⁻⁵ (yes) / 3.6 × 10⁻⁵ (no)

7 analysis of an influence diagram

in order to perform the risk assessment under the fire design situation, the bayesian causal network in figure 1 is supplemented by the decision node 13 and the four utility nodes 14, 15, 16 and 17. the purpose of the influence diagram in figure 1 is to analyse the expected total cost ctot given by equation (6). the total expected cost ctot is dependent on the assumed probability of fire start pfi,s = p(h2). figure 2 shows the total cost ctot as a function of the cost c17(12). it follows from figure 2 that for a cost c17(12) up to about 5 × 10⁵ (expressed in relative monetary units), structural protection seems to be uneconomical. however, for a cost c17(12) greater than 5 × 10⁵, the expected total cost can be considerably lower when structural protection is provided. it should be noted that the critical value of the cost c17(12), for which the costs with and without structural protection are equal, depends on the probability of fire start pfi,s = 0.01; with increasing pfi,s the critical value decreases approximately by the same order.

8 concluding remarks

the traditional probabilistic approach to engineering design covers only a small part of the actual causes of structural failures. a significant proportion of all failures, besides gross errors, is related to hazard scenarios (e.g. fire, impact and explosion), which are not usually included in the traditional probabilistic analysis. for this reason, the specification of the design probability of failure remains an open question (how safe is safe enough?). the methods of risk analysis and assessment are capable of encompassing more types of uncertainties than the traditional probabilistic approaches, and can significantly contribute to further improvement of advanced engineering design. the remarkable fact that the public is better prepared to accept certain risks than to stand for specified probabilities of failure will make the application of risk assessment easier. it is therefore anticipated that in the near future the probabilistic methods of structural design will be supplemented by criteria of acceptable risks. the above results should be considered as examples valid for the assumed input data only. these data were assessed here without due regard to specific technological and economic conditions, which should be considered in the fire safety assessment of a particular structure. further research is needed to specify a more detailed bayesian network and the appropriate input conditional probabilities. in particular, the cost distribution depending on the states of the parent nodes should be investigated. nevertheless, the available experience indicates that the bayesian belief network provides a very logical and effective tool for analysing the probability of fire flashover under particular fire protection conditions.
acknowledgement

this research has been conducted at the klokner institute of the czech technical university in prague, czech republic, as a part of the research project cez: j04/98/210000029 "risk engineering and reliability of technical systems".

references
[1] thoft-christensen, p., baker, m. j.: structural reliability and its applications. springer-verlag, berlin, 1982
[2] melchers, r. e.: structural reliability analysis and prediction. john wiley & sons, chichester, 1999
[3] holický, m.: fuzzy probabilistic optimisation of building performance. automation in construction, elsevier, amsterdam, 8(4), 1999, pp. 437–443
[4] stewart, m. g., melchers, r. e.: probabilistic risk assessment of engineering systems. chapman & hall, london, 1997
[5] ellingwood, b. r.: probability-based structural design: prospect for acceptable risk bases. in: application of statistics and probability icasp 8. balkema, rotterdam, 1999, pp. 11–18
[6] holický, m., schleich, j.-b.: fire safety assessment using bayesian causal network. in: foresight and precaution conference, edinburgh, may 2000, pp. 1301–1306
[7] holický, m., schleich, j.-b.: estimation of risk under fire design situation. in: proc. of risk analysis 2000 conference, bologna, witpress, southampton, boston, 2000, pp. 63–72

doc. ing. milan holický, phd., drsc., czech technical university in prague, klokner institute, šolínova 7, 166 08 praha 6, czech republic

fig. 2: expected total cost ctot versus the cost c17(12) due to structural collapse, for pfi,s = p(h2) = 0.01 (curves with and without structural protection).

modeling nonlinear systems by a fuzzy logic neural network using genetic algorithms

abdel-fattah attia, p. horáček

the main aim of this work is to optimize the parameters of the constrained membership functions of the fuzzy logic neural network (flnn). the constraints may be an indirect definition of the search ranges for every membership shape forming parameter, based on 2nd order fuzzy set specifications. a particular method widely applicable in solving global optimization problems is introduced. this approach uses a linear adapted genetic algorithm (laga) to optimize the flnn parameters. in this paper the derivation of a 2nd order fuzzy set is performed for a membership function of gaussian shape, which is assumed for the neuro-fuzzy approach. the explanation of the optimization method is presented in detail on the basis of two examples.

keywords: genetic algorithms, fuzzy logic neural network, 2nd order fuzzy sets.

1 introduction

a fuzzy logic neural network (flnn) [10] is a general nonlinear interpolator used for modeling static systems. the structure and rough setting of the flnn parameters is usually done manually by an expert; fine-tuning is done by numerical optimization techniques using reference input-output data. the use of a random search technique is one option for solving this problem. a genetic algorithm (ga) is an evolutionary method that simulates the process of natural selection and survival of the fittest. gas randomly generate a set of potential problem solutions and manipulate them using genetic operators. each solution is assigned a scalar fitness value, which is a numerical assessment of how well it solves the problem. through crossover and mutation operations, new feasible solutions are hopefully generated. the process continues until the termination condition is met. further discussion on gas can be found in [6], [17] and [5]. when a ga is implemented as a learning procedure, the flnn parameters are coded to form a string referred to as a chromosome, and the genetic operations are performed on the individuals in the population of chromosomes. the fitness is inversely proportional to the whole system error, which represents the difference between the required and the actual network response. the remainder of this paper is organized as follows: section 2 illustrates the structure of a fuzzy logic neural network model. in section 3, the derivation of the membership function (mf) constraints is performed for mfs of gaussian shape. section 4 shows the laga approach. the proposed genetic algorithm with constrained search space is explained in detail in section 5.
the explanation of the optimization method is presented in detail on the basis of two application examples, and the conclusion is presented in section 6.

2 proposed fuzzy logic neural network (flnn)

the flnn model, as a general nonlinear interpolator, is built using the multilayer fuzzy logic neural network shown in fig. 1, proposed by lin [10], with some modification by [9]. this is a particular implementation of a fuzzy system equipped with fuzzification and defuzzification interfaces. the network represents a linguistic fuzzy system with a general rule-based structure. the following example demonstrates this structure:

fig. 1: fuzzy logic neural network topology, with layers for the inputs x1, …, xn, the rule premises (reference fuzzy sets a), the and-neurons, the rule weights w, the rule consequents b, the aggregation, and the output y.

an flnn consists of several layers [7], [9]:
layer 1: the actual values of the input variables are stored in this layer. generally, fuzzy sets are considered as the input values (crisp numbers are special cases of fuzzy sets). the fuzzy sets are in parametric form or in look-up table form.
layer 2: the rule premises (input reference fuzzy sets) are stored here. the actual input value is compared with the rule premise using the degree of overlapping

dt(x, a) = hgt_t(x ∩t a) = sup_x t(x(x), a(x)), (1)

where t is the selected t-norm and hgt is the height of the intersection of x and a with respect to the t-norm t. in the special case of a crisp input x = x*, dt(x, a) is simply

dt(x, a) = a(x*). (2)

for some parametric fuzzy sets and some t-norms, an analytical expression for dt(x, a) can be derived; in the other cases it must be computed numerically.
layer 3: every neuron in this layer performs a fuzzy conjunction using the selected t-norm,

y = t(u1, u2, …, un). (3)

the common parameters of the layer are the type of t-norm and its parameters.
layer 4: every neuron represents a rule weight w. the output of the neuron is the overall degree of rule activation act, computed as

y = act = w · u, (4)

where the parameter w has to lie in the interval [0, 1].
layer 5: only the rule consequents (output reference fuzzy sets) are stored in this layer. the fuzzy sets are usually in parametric form. the input of the neuron is the overall degree of rule activation act. this value is attached to the reference fuzzy set, and together they are fed to the next aggregation layer.
layer 6: the output of the network is computed here, using the selected aggregation (inference) algorithm. at the i-th input of each neuron there is a corresponding fuzzy set bi with its activation degree acti. when the mamdani inference algorithm is used, the output fuzzy set is computed as

y(y) = max_{i=1,…,n} t(acti, bi(y)), (5)

where n is the number of inputs to the neuron. when a fuzzy-arithmetic based inference algorithm is used, the output fuzzy set is computed as

y = Σ_{i=1}^{n} acti bi / Σ_{i=1}^{n} acti. (6)

usually only a crisp output value y is needed. then a defuzzification method is used to get the crisp value. the most widely used method is centroid average defuzzification,

y = Σ_{i=1}^{n} acti yi / Σ_{i=1}^{n} acti, (7)

where yi is the centroid of the fuzzy set bi. the flnn works in the following manner [7], [9]. in the forward run, the input values (crisp values, fuzzy sets) are first compared with all premises of the rules (input reference fuzzy sets). the outputs of the and-neurons are then combined with the rule weights (preference between rules) to obtain the degrees of rule activation. in the last layer these degrees are aggregated with the corresponding consequents of the rules (output reference fuzzy sets) according to the inference algorithm. the output of the flnn can be a fuzzy set or a crisp value (after defuzzification).

3 determining constraints of mf parameters

in the case of the flnn, the membership functions mfj(xi) of the input xi and the output y are frequently approximated by gaussians. a gaussian shape is formed by two parameters, the mathematical expectation c and the standard deviation σ, as in formula (8):

mfj(xi) = g(xi, cj, σj) = exp(−(xi − cj)² / (2σj²)). (8)

the idea of a 2nd order fuzzy set was introduced in [11] to obtain the boundary of a gaussian-shaped membership function. the 2nd order fuzzy set of a given mf(x) is the area between d⁺ and d⁻, where d⁺ and d⁻ are the upper and lower crisp boundaries of the 2nd order fuzzy set, respectively, as shown in figure 2 [2]. the expressions determining these crisp boundaries are (9) and (10):

dj⁺(xi) = min{1, mfj(xi) + ε}, (9)

dj⁻(xi) = max{0, mfj(xi) − ε}. (10)

formulas (9) and (10) are based on the assumptions that the height of the slice of the 2nd order fuzzy region bounded by d⁺ and d⁻ at a point x is equal to 2ε, where ε ∈ [0, 0.3679] (≈ e⁻¹), and that these boundaries are equidistant from mf(x). to obtain the ranges for the shape forming parameters of the mfs, it should be assumed that these 2nd order fuzzy sets are the mf search spaces. therefore, all mfs with acceptable parameters should lie inside this area. in the general case, the intervals of acceptable values for every mf shape forming parameter (e.g., Δc ∈ [c11, c22] and Δσ ∈ [σ11, σ22] for gaussians) may be determined by solving formulas (8), (9) and (10). in practice, this may be done approximately, considering d⁺ and d⁻ as soft constraints.
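before turning to the parameter ranges themselves, a small illustration of (8) to (10): the following sketch (python with numpy; not from the original paper, the values of c, σ and ε are arbitrary) evaluates a gaussian mf together with the crisp boundaries d⁺ and d⁻ of its 2nd order fuzzy set.

```python
import numpy as np

# gaussian membership function (8) and the crisp boundaries d+ (9) and
# d- (10) of its 2nd order fuzzy set; c, sigma and eps are arbitrary here
def mf(x, c, sigma):
    return np.exp(-(x - c) ** 2 / (2.0 * sigma ** 2))

def order2_bounds(x, c, sigma, eps):
    m = mf(x, c, sigma)
    return np.minimum(1.0, m + eps), np.maximum(0.0, m - eps)

x = np.linspace(-1.0, 3.0, 9)
d_plus, d_minus = order2_bounds(x, c=1.0, sigma=0.5, eps=0.3)
print(np.round(d_plus, 3))
print(np.round(d_minus, 3))
```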
for instance, c11 and c22 for gaussians may be found as the maximum root and the minimum root of the equation d⁺ = 1, which can easily be calculated. this equation is based on the assumption that a fuzzy set represented by the gaussian must have a point where it is absolutely true. σ11 and σ22 can easily be found from the following four equations:

g(c − σ, c, σ) = d⁺(c − σ) and g(c + σ, c, σ) = d⁺(c + σ), (11)

g(c − σ, c, σ) = d⁻(c − σ) and g(c + σ, c, σ) = d⁻(c + σ), (12)

where we choose σ11 as the minimum and σ22 as the maximum of the roots. these equations are based on the assumption that the acceptable gaussians with σ ∈ [σ11, σ22] should cross the slices of the 2nd order fuzzy region at the points x = c ± σ. there are two options for treating the constraints of the gaussian parameters. the first is to consider them as hard constraints: the lower and upper bounds cmin and cmax of the center of the gaussian membership function are then chosen strictly inside the interval given by c11 and c22, so as to satisfy the search space constraint conditions of the 2nd order fuzzy sets, as shown in fig. 3, and the lower and upper bounds σmin and σmax for the spread of the gaussian membership function are set equal to σ11 and σ22, respectively, as shown in fig. 2. the second option is to consider these constraints as soft constraints, i.e., [cmin, cmax] equal to [c11, c22] and [σmin, σmax] equal to [σ11, σ22].

fig. 2: upper and lower boundaries of the spread σ using a 2nd order fuzzy set.
fig. 3: upper and lower boundaries of the center c using a 2nd order fuzzy set.

4 linear adapted genetic algorithm (laga) for tuning mf parameters

a genetic algorithm is chosen as the particular evolutionary algorithm. varying the crossover probability rate pc and the mutation probability rate pm, the ga's control parameters, provides faster convergence than constant probability rates. pc is set high at the beginning of the run and decreases linearly with the generations, as in (13) [1]. as is known from the standard genetic algorithm (sga), at the beginning of a run the randomized initial ga population is diverse, which means that promising solutions are scattered through the search space. so pc is high in the initial generations; over the generations these solutions generate even better solutions, the population converges to a smaller subset of the search space, and the pc value decreases according to formula (13). this corresponds to the usual ranges of large values for pc (0.5–1.0) and small values for pm (0.0–0.005) [6], [13], [3]:

pc(x) = 0.5 (1 + (m − x)/(m − 1)), x = 1, …, m, (13)

where x is the number of the generation and m is the maximum number of generations allowed. as is known, mutation is not needed at the beginning of a run, where the members of the population are very distinct. the value of pm increases linearly as a function of the number of generations, to exploit the improved solutions in the established region of the current best solution, as is clear from equation (14):

pm(x) = 0.005 (x − 1)/(m − 1), x = 1, …, m. (14)
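a direct transcription of (13) and (14) follows (python; the linear ramps are reconstructed so that pc falls from 1.0 to 0.5 and pm grows from 0.0 to 0.005 over the run, matching the ranges quoted above; the generation count 420 is taken from the first application example below):

```python
# equations (13) and (14): linearly adapted crossover and mutation rates;
# reconstructed so that p_c falls from 1.0 to 0.5 and p_m grows from 0.0
# to 0.005 over the m generations
def p_c(x, m):
    return 0.5 * (1.0 + (m - x) / (m - 1.0))

def p_m(x, m):
    return 0.005 * (x - 1.0) / (m - 1.0)

m = 420   # number of generations used in the first application example
for x in (1, 105, 210, 315, 420):
    print(x, round(p_c(x, m), 4), round(p_m(x, m), 6))
```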
5 proposed genetic algorithm with constrained search space

the main aspects of the proposed laga for optimizing the flnn are discussed below, and the block diagram of the laga optimization process is shown in fig. 5.

5.1 fuzzy model representation

this section discusses how the proposed flnn is formulated using the laga approach, where all the parameters of the flnn are represented in a chromosome. the chromosome representation determines the ga structure. with a population size (popsize), we encode the parameters of each fuzzy model in a chromosome, as a sequence of elements describing the input fuzzy sets in the rule antecedents, followed by the parameters of the weights and the rule consequents. the intervals of acceptable values for every mf shape forming parameter (Δc ∈ [cmin, cmax] and Δσ ∈ [σmin, σmax] for gaussians) are determined on the basis of 2nd order fuzzy sets for all membership functions, as explained in section 4. the acceptable constraints for the rule weights are between [0, 1], and for the centroids they are the minimum and maximum values of the output.

5.2 coding of flnn parameters

fig. 1 shows n inputs (x1, x2, …, xn) and one output y. each of the input fuzzy variables is classified into m reference fuzzy sets. every reference fuzzy set is described by a gaussian membership function specified by two parameters, the center c and the spread σ, resulting in (2 × m × n) parameters at the corresponding layer. using the wang technique for generating rules from given data [16], the fuzzy model has k rules out of the (m)ⁿ rules theoretically possible. this means there are k rule weights w and k centroids represented by singletons b. thus a total of 2(m × n + k) parameters (2 × m membership functions × n variables + k weights + k centroids) need to be optimized using laga. the coded parameters of the flnn are arranged as shown in table 1 to form the chromosome of the population.

table 1: coded parameters of the flnn
chromosome: sub-chromosome of inputs x1, …, xn | sub-chromosome of rule weights w1, …, wk | sub-chromosome of rule consequents b1, …, bk
parameters: c1, σ1, …, cn, σn | w1, …, wk | b1, …, bk
number of genes (2(m × n + k) in total): 2m × n | k | k
gene (binary string): 1000000000…0110100011 | 0100111111… | 0111001010…

5.3 selection function

the selection strategy decides how individuals are selected to become the parents of new children. usually the selection applies some selection pressure by favoring individuals with better fitness. after procreation, the population consists of, for example, l chromosomes, which are all initially randomized. each chromosome has been evaluated and associated with a fitness; the current population undergoes the reproduction process to create the next population, and the 'roulette wheel' selection scheme is used to determine the members of the new population. the chance on the roulette wheel is adaptive and is given as pl / Σl pl, where

pl = jl⁻¹, l = 1, …, l,

and jl is the performance of the model encoded in chromosome l, measured in terms of the normalized root mean square error (rmse):

j(y, ŷ) = sqrt( Σ_{i=1}^{n} (yi − ŷi)² / Σ_{i=1}^{n} (yi − ȳ)² ),

where n is the number of sample points, θ denotes the parameters to be optimized, y is the true output, ȳ = (1/n) Σ_{i=1}^{n} yi, and ŷ is the model output, as shown in fig. 4.

fig. 4: block diagram of parameter identification (the actual model and the fuzzy model are compared, with laga adjusting the fuzzy model parameters from the error y − ŷ).
fig. 5: block diagram of the laga optimization process (initialization, performance evaluation of the flnn model, and the reproduction, crossover and mutation loop, terminated when the rmse tolerance is reached).

the inverse of the selection function is used to select chromosomes for deletion.
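the selection scheme of section 5.3 can be sketched as follows (python with numpy; not the original implementation, the rmse values are purely illustrative, and the 740-bit chromosomes follow example 6.1 below):

```python
import numpy as np

# roulette-wheel selection with fitness p_l = 1 / j_l, the inverse of the
# normalized rmse of chromosome l
rng = np.random.default_rng(0)

def roulette_select(population, rmse):
    p = 1.0 / np.asarray(rmse, dtype=float)   # p_l = j_l^(-1)
    p /= p.sum()                              # chance p_l / sum_l p_l
    idx = rng.choice(len(population), size=len(population), p=p)
    return [population[i] for i in idx]

population = [rng.integers(0, 2, 740) for _ in range(6)]  # 740-bit chromosomes
rmse = [0.80, 0.55, 0.35, 0.22, 0.16, 0.14]               # illustrative j_l
mating_pool = roulette_select(population, rmse)
```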
5.4 crossover and mutation operators

the mating pool is formed, and crossover is applied, followed by the mutation operation according to the laga approach. after these three operations, the overall fitness of the population is improved. the procedure is repeated until the termination condition is reached. the termination condition is the maximum allowable number of generations, or a certain value of the rmse that is required to be reached.

6 applications

6.1 modeling the mackey-glass process

the process used as the object of modeling is defined by the chaotic mackey-glass differential delay equation [8]:

ẋ(t) = 0.2 x(t − τ) / (1 + x¹⁰(t − τ)) − 0.1 x(t). (15)

the prediction of future values of this time series is a benchmark problem which has been considered by a number of connectionist researchers. a time window of the process behavior is shown in fig. 6. the sampling period used in the numerical study is set to 0.1, the initial condition is x(0) = 1.2, and the time delay is τ = 17. in accordance with [8], we use the samples x(t − 18), x(t − 12), x(t − 6) and x(t) to predict x(t + 6). for the chaotic system, the model has four input variables, x(t − 18), x(t − 12), x(t − 6) and x(t), and a single output, x(t + 6). the values of every input variable are classified into three reference fuzzy sets. every reference fuzzy set is described by a gaussian membership function specified by two parameters, the center c and the spread σ, resulting in 24 parameters for the inputs. using the wang technique for generating rules from the given data [16], we have 25 rules out of the 81 rules theoretically possible. this means we have 25 rule weights w and 25 centroids represented by singletons b. thus a total of 74 parameters (2 × 3 membership functions × 4 variables + 25 weights + 25 centroids) need to be optimized using laga. the coded parameters of the flnn are arranged as shown in table 1 to form the chromosome of the population. to describe the laga optimization process, consider the block diagram shown in fig. 5. at the beginning of the process, the initial population comprises a set of chromosomes. every chromosome has 74 genes, and every gene has 10 bits, so the chromosome length is 740 bits.

simulation results: from the mackey-glass time series x(t) (15) we extracted 3000 input-output pairs. the first 1000 data samples were used to build the fuzzy model, while the remaining 2000 data samples were used for model testing. fig. 7 depicts the corresponding membership functions before and after training using laga. there were 420 generations and 60 minutes of computation time, using matlab on a 400 mhz pc with 64 mb of ram. fig. 8 shows that the flnn model follows the actual process; the mse is 0.0009 and the rmse is 0.14, as shown in fig. 9. figure 10 shows the shape of the constraints of the membership functions based on 2nd order fuzzy sets, as explained in section 3, with ε = 0.3.

fig. 6: mackey-glass time series, with the training and testing segments marked.
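the reference data for this example can be regenerated in a few lines. the sketch below (python with numpy; not from the original study) integrates (15) with a simple euler scheme using the settings quoted above (sampling step 0.1, τ = 17, x(0) = 1.2); the original study may have used a different integrator, so the series should be expected to match only qualitatively.

```python
import numpy as np

# euler integration of the mackey-glass equation (15):
# dx/dt = 0.2*x(t-tau)/(1 + x(t-tau)**10) - 0.1*x(t)
def mackey_glass(n, dt=0.1, tau=17.0, x0=1.2):
    d = int(tau / dt)                    # delay expressed in samples
    x = np.empty(n + d)
    x[:d + 1] = x0                       # constant history as initial condition
    for t in range(d, n + d - 1):
        x[t + 1] = x[t] + dt * (0.2 * x[t - d] / (1.0 + x[t - d] ** 10)
                                - 0.1 * x[t])
    return x[d:]                         # drop the artificial history

series = mackey_glass(30000)             # 3000 time units at dt = 0.1
```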
fig. 7: membership functions of the reference fuzzy sets for the inputs a) x(t − 18), b) x(t − 12), c) x(t − 6), and d) x(t) (dashed line: normalized mfs before learning; solid line: optimized mfs after using laga).
fig. 8: comparison of the actual process with the fuzzy model after training.
fig. 9: laga convergence rate at population size 200.
fig. 10: membership functions of the reference fuzzy sets for the input x(t − 18) (dashed line: normalized mfs before learning; solid line: optimized mfs after using laga).

6.2 nonlinear discrete time process modeling and identification

this example is taken from [12], [15], in which the plant to be identified is governed by the difference equation (19):

y(k + 1) = 0.3 y(k) + 0.6 y(k − 1) + g[u(k)], (19)

where the unknown function has the form

g(u) = 0.6 sin(πu) + 0.3 sin(3πu) + 0.1 sin(5πu).

the plant is modeled using the flnn, as described in section 2. the model has three input variables, u(k), y(k) and y(k − 1), and a single output, y(k + 1), treated as linguistic variables. the universe of discourse of every variable is partitioned into five fuzzy sets with symmetrical gaussian membership functions. there are 30 parameters at the input of the flnn model. using the wang technique for generating rules from the given data, we have 20 rules. this means we have 20 weights and 20 centroids represented by singletons. thus a total of 70 parameters (2 × 5 membership functions × 3 variables + 20 weights + 20 centroids) need to be optimized using laga. the learning procedure of laga is applied as in the first numerical example. fig. 5 shows the block diagram of the laga optimization process for optimizing the flnn model parameters of the second numerical example. the process starts with zero initial conditions. the first 250 data points are used to build the fuzzy model, while the remaining 450 data points are used to identify the flnn model. as explained in the first application, we determined the constraints for this application based on 2nd order fuzzy sets with ε = 0.28. fig. 11 shows that the flnn model matches the actual model well, with an mse of 0.0473 and an rmse of 0.0607, as shown in fig. 12. fig. 13 depicts the membership functions for each input variable before and after training using laga. fig. 14 shows the output of the model and the plant for the input u(k) = 0.5 sin(2πk/250) for 1 ≤ k ≤ 250 and 501 ≤ k ≤ 700, and u(k) = 0.5 sin(2πk/250) + 0.5 sin(2πk/25) for 251 ≤ k ≤ 500.

fig. 11: outputs of the plant (solid line) and the flnn model (dashed line) for u(k) = sin(2πk/250).
fig. 12: laga convergence rate at population size 200.
fig. 13: input membership functions for (a) y(k), (b) y(k − 1) and (c) u(k) (dashed line: normalized mfs before learning; solid line: optimized mfs after using laga).

7 conclusion

the paper deals with the modeling of nonlinear systems and processes using fuzzy logic neural networks. reference data driven identification of the parameters of fuzzy logic neural networks utilizing genetic algorithms has been proposed and tested. the specification of the parameter constraints related to the input reference fuzzy sets is based on 2nd order fuzzy sets. the problem of constrained nonlinear optimization is solved using a genetic algorithm with variable crossover and mutation probability rates, laga [attia, 2001]. the paper reports results in dynamic process identification, in particular the prediction of time series. the performance of the nonlinear models for time series prediction is examined. the simulation results of the application examples indicate the effectiveness of the proposed laga approach as a promising learning algorithm.

acknowledgement

this work received support from the ministry of education of the czech republic under project ln00b096.

references
[1] attia, a.: global optimization method for neuro-fuzzy modeling and identification. research report 335/01/203, ctu, faculty of electrical engineering, department of control engineering, prague, 2001, p. 41
[2] attia, a., horáček, p.: an optimal design of a fuzzy logic neural network using a linear adapted genetic algorithm. in: 7th international mendel conference on soft computing, brno, 2001, pp. 42–49
[3] attia, a., horáček, p.: adaptation of genetic algorithms for optimization problem solving. in: 7th international mendel conference on soft computing, brno, 2001, pp. 36–41
[4] farag, w., victor, h.: a genetic-based neuro-fuzzy approach for modeling and control of dynamical systems. ieee trans. on neural networks, vol. 9, no. 5, september 1998
[5] gen, m., cheng, r.: genetic algorithms and engineering optimizations. john wiley & sons, 2000
[6] goldberg, d.: genetic algorithms in search, optimization, and machine learning. addison-wesley, 1989
[7] horáček, p.: fuzzy modeling and control. in: h. adelsberger, j. lažanský, v. mařík (eds.): information management in computer integrated manufacturing. lecture notes in computer science no. 973, springer-verlag, berlin, 1995, pp. 257–288
[8] jang, j.-s. r., sun, c. t., mizutani, e.: neuro-fuzzy and soft computing: a computational approach to learning and machine intelligence. prentice hall, inc., 1997
[9] kolínský, j.: identifikace parametrů fuzzy-logických neuronových sítí (identification of parameters of fuzzy logic neural networks). diploma thesis, department of control engineering, faculty of electrical engineering, ctu, 2000
[10] lin, c. t., lee, c. s. g.: neural-network-based fuzzy logic control and design decision system. ieee trans. on computers, vol. 40, no. 12/1991, pp. 1320–1336
[11] melikhov, a., miagkikh, v., topchy, p.: optimization of fuzzy and neuro-fuzzy systems by means of adaptive genetic search. in: proc. of ga+se'96 ic, gursuf, ukraine, 1996
[12] narendra, k. s., parthasarathy, k.: identification and control of dynamical systems using neural networks. ieee trans. neural networks, vol. 1, 1990
[13] srinivas, m., patnaik, l. m.: adaptive probabilities of crossover and mutation in genetic algorithms. ieee trans.
systems, man, and cybernetics, vol. 24, no. 4/1994, pp. 656–667
[14] srinivas, m., patnaik, l. m.: genetic search: analysis using fitness moments. ieee trans. on knowledge and data engineering, vol. 8, no. 1/1996
[15] wang, li-xin: adaptive fuzzy systems and control, design and stability analysis. ptr prentice hall, 1994
[16] wang, li-xin, mendel, j.: generating fuzzy rules by learning from examples. ieee trans. systems, man, and cybernetics, vol. 22, no. 6/1992, pp. 1414–1427
[17] winter, g., périaux, j., galán, m., cuesta, p.: genetic algorithms in engineering and computer science. john wiley & sons, 1996, isbn 0-471-95859-x

ing. abdel fattah attia, m.sc., department of control engineering, and doc. ing. petr horáček, csc., center for applied cybernetics, czech technical university in prague, faculty of electrical engineering, technická 2, 166 27 praha 6, czech republic

fig. 14: outputs of the plant (solid gray line) and the flnn model (dashed line) for the input u(k) = 0.5 sin(2πk/250) for 1 ≤ k ≤ 250 and 501 ≤ k ≤ 700, and u(k) = 0.5 sin(2πk/250) + 0.5 sin(2πk/25) for 251 ≤ k ≤ 500.

photometric analysis of pi of the sky data

rafał opiela (a,*), katarzyna małek (a), lech mankiewicz (a), małgorzata siudek (a), marcin sokołowski (b), aleksander filip żarnecki (c)

(a) center for theoretical physics, polish academy of sciences, al. lotników 32/46, 02-668 warsaw, poland
(b) sołtan institute for nuclear studies, hoża 69, 00-681 warsaw, poland
(c) faculty of physics, university of warsaw, hoża 69, 00-681 warsaw, poland
(*) corresponding author: opiela@cft.edu.pl

abstract. two fully automatic pi of the sky detectors with a large field of view, located in spain (inta) and in chile (spda), observe the sky in search of rare optical phenomena, and also collect observations which include many kinds of variable stars.
to be able to draw proper conclusions from the data that is received, adequate data quality is very important. pi of the sky data are subject to systematic errors caused by various factors, e.g. cloud cover, seen as significant fluctuations in the number of stars observed by the detector, problems with mount tracking, a strong background from the moon, or the passage of a bright object, e.g. a planet, near the observed star. some of these adverse effects are already detected during cataloging of the individual measurements, but this is not sufficient to make the quality of the data satisfactory for us. in order to improve the quality of our data, we developed two new procedures based on two different approaches. in this paper we describe these procedures, give some examples, and show how they improve the quality of our data.

keywords: gamma ray burst (grb), variable stars, robotic telescopes, photometry, astrometry, data quality, photometric corrections.

1. introduction
the pi of the sky experiment has been designed for continuous observations of a large part of the sky, in search of astrophysical phenomena varying in scale from seconds to months, especially for prompt optical counterparts of gamma ray bursts (grbs). other scientific goals include searching for nova stars and flare stars. the large amount of data obtained in the project also allows the identification and cataloging of many different types of variable stars. the project involves leading polish academic and research units such as the andrzej sołtan institute for nuclear studies, the center for theoretical physics (polish academy of sciences), the institute of experimental physics (university of warsaw), and many others.

a new detector belonging to the pi of the sky project was installed at the end of 2010 in the inta el arenosillo observatory in south-western spain. unlike the prototype detector, which has been working since 2004 in chile, it is equipped with four ccd cameras, and it has a much larger field of view of 40° × 40°, which significantly increases the flow of data collected during the night. by the end of the year, two more detectors will be installed in spain, increasing the field of view of the whole system up to 4800 square degrees. this will dramatically increase the amount of data that needs to be analysed.

figure 1. two currently working pi of the sky detectors: the prototype detector (on the left) and the new detector (on the right).

the collected data are subject to systematic errors arising from the conditions under which the image was taken and from the design of the detector and the ccd chip. as has already been mentioned, some of these adverse effects have been detected and properly marked, but this did not make the quality of our data satisfactory for us. we needed to take into account some other effects, the most important of which seems to be the dependence of the sensitivity of the detector on the spectral type of the observed object. we also looked for new, completely independent methods that can help us improve the quality of our data [1, 2].

2. data processing
2.1. on-line data reduction
data analysis in the pi of the sky experiment consists of two parts. the first part is called on-line analysis, and the second part is called off-line analysis. on-line data analysis is required to control the performance of the detector, and it is also responsible for finding optical flashes on time scales from 10 to 22 seconds in real time.
fast algorithms optimised for transient search are included in the on-line data analysis. frame-by-frame real time analysis provides an opportunity to distribute alerts in the community for follow-up observations. on-line data reduction consists of several parts:
• dark image subtraction. dark frame subtraction is a way to minimize image noise for pictures taken with long exposure times. it takes advantage of the fact that a component of the image noise, known as fixed-pattern noise, is the same from shot to shot: noise from the sensor, dead or hot pixels. a dark frame is an image captured with the sensor in the dark, essentially just an image of the noise in an image sensor. it works by taking a picture with the shutter closed.
• image transformation, using a special method called the laplace transformation. the value of each pixel is calculated as a simple function of several surrounding pixels: the values of the pixels immediately around the transformed pixel are added, and the values of pixels further away from it are subtracted. the idea of this transformation is to calculate a simple aperture brightness for each pixel. in this step, a simple and fast aperture photometry algorithm is used to calculate star brightness. this photometry is performed every 300 seconds and it covers all night images.
• comparison with a reference image (a series of previous images). two images are compared in order to find differences in brightness between existing objects, or to find entirely new objects.

2.2. off-line data reduction
off-line data analysis acts on the reduced data and catalogs it to the database. it consists of algorithms optimised for data reduction. the reduction pipeline is divided into three main stages: photometry, astrometry and cataloging to the database. the final reduced data give us information about star brightness measurements, which is stored in our database, which provides easy and effective public access. the off-line analysis consists of several algorithms developed for different purposes [3, 4]. every image collected during one night is processed in the same way. the algorithms optimised for off-line data reduction have the following steps:
• adding 20 subsequent frames (which gives a ∼ 20 · 10 = 200 second exposure). the image coordinates are controlled and, if they change, the averaging chain is stopped in order not to allow averaging of images from different positions. in the case of single image reduction, the image averaging step is skipped.
• dark frame subtraction and flat corrections. in order to reduce fluctuations, the dark image is calculated as the median of several dark images. this step allows the signal offset produced by dark current and electronics to be subtracted. it also reduces the effect of hot pixels. flat correction allows corrections to be made for non-uniformity of the optics and differences between pixel amplifications. the standard way of finding this correction is by taking images of a uniformly illuminated field. this is usually the sky just after dusk or just before dawn, when the sky is bright and stars are not visible. an alternative way is to use a uniformly illuminated screen. in the case of the pi of the sky prototype, due to the large field of view of the detector, the flat image is obtained by taking images of the evening sky with the mount tracking switched off. after taking many images and calculating the median image, stars are eliminated. finally, the flat image is normalized to one.
• multiple aperture photometry. photometry is a procedure which finds stars in the image and determines their chip coordinates (x, y) and their brightness. in the pi of the sky experiment, aperture photometry is adopted from the asas experiment. it is rather slow, so it cannot be used for reducing each individual image from a night. this consideration led to the development of our own fast photometry algorithm, as discussed above. this kind of photometry is used for the 20 averaged images, and for reducing scan images (3 images averaged). both photometry procedures write the resulting list of stars with (x, y) coordinates and magnitudes to output mag files. the mag files are then available as input files for astrometry procedures.
• astrometry and catalog star selection. this involves transforming chip coordinates (x, y) to celestial coordinates (λ, δ). it is an iterative minimization procedure comparing stars identified in the image by photometry with catalog stars from external star catalogs. the astrometry procedure was adopted from the asas experiment. the star catalog currently used in the procedure is based on the tycho catalog, but it could be replaced by any star catalog. all night images are processed in the same way.
• normalization to v magnitudes from the tycho catalog. the star magnitudes are normalized by comparison with catalog stars. for each star, the corresponding star in the reference catalog is found. the correction for each star is calculated as the difference between the magnitude of the catalog star and the magnitude measured in the experiment. this calculated value must be added to the instrumental magnitude to obtain the normalized value.
• cataloging of lightcurves to the postgresql database¹ (see section 3).

¹ more information about the postgresql database is available at http://www.postgresql.org/.
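a minimal numpy sketch of the reduction steps just listed (frame averaging, median dark subtraction, flat division and normalization to catalog v magnitudes). the function names are ours, and the single median correction is a simplification of the per-star normalization described above.

```python
import numpy as np

def reduce_frames(frames, darks, flat):
    """off-line reduction sketch: co-add 20 frames, subtract the median dark,
    divide by the flat image (already normalized to one)."""
    avg = np.mean(frames, axis=0)            # averaged subsequent exposures
    dark = np.median(darks, axis=0)          # median of several dark images
    return (avg - dark) / flat               # dark-subtracted, flat-corrected

def normalize_to_catalog(instr_mag, catalog_v):
    """normalization to catalog v magnitudes: apply one global correction,
    here taken as the median difference over the matched catalog stars."""
    corr = np.median(catalog_v - instr_mag)  # correction from matched stars
    return instr_mag + corr                  # normalized magnitudes
```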
figure 2. precision dispersion of star brightness measurements from standard photometry for 200 s exposures (20 co-added frames) from the pi of the sky prototype at the las campanas observatory in chile. the large dispersion (left) is mainly caused by false measurements. after the application of quality cuts (right), the photometry accuracy improves significantly.

2.3. multilevel selection system
despite using the methods described above, we still did not obtain satisfactory data quality. data quality varies with brightness, as shown in fig. 2. the mean measured brightness dispersion in the magnitude range 6m–9m is about 0.03m. in other magnitude ranges the situation is even worse: the dispersion increases to 0.1–0.2 magnitude. this is because the detector is at its most sensitive precisely in the magnitude range from 6 to 9. such a dispersion is quite large if we want to detect variable stars, as one of the most popular methods for finding variable stars is to select stars with the largest variance. however, such an approach is impossible in the presence of large errors, because most of the stars with large measured variance are simply stars with incorrect measurements. better elimination of bad measurements will in the future lead to better identification of variable stars. the value of brightness measurements may be distorted by the influence of various factors associated with the measurements. precise determination of the causes of erroneous brightness measurements is the key to eliminating these errors and improving the quality of the data. dedicated filters have been developed within our project to remove bad measurements or frames. the main causes of the errors that occur include:
• readout with an open shutter. with an open shutter, the image is irradiated all the time, and the light from bright stars forms strong streaks stretching down from the star images. if we measure the light of a star that has fallen into such a streak, its brightness will, of course, be many times greater than normal. streaks may therefore result in time-varying brightening of stars. this phenomenon is easy to detect, because it occurs on only one camera.
• planet or planetoid passage. if a planet or a planetoid passes in front of stars, it will add its light to the star's light. this will cause momentary brightening of the stars. such cases also need to be detected and removed. for this purpose, we use a specially prepared database in which the trajectories of every bright object in our solar system are written. in a similar manner, incorrect measurements caused by brightening of stars due to the glow from jupiter are detected and removed.
• measurements near the ccd edge. stars appearing at the edges of the detector have significantly distorted shapes. this is due to the fact that the sensitivity of the detector is much lower at the edges. this blurring is strong enough to change the brightness of the stars affected by it.
• other causes include: hot pixels (discussed above), which can affect the brightness of nearby stars, frames with a very high background level caused by the strong background that occurs when observations are conducted during the full moon, or frames with too few matched stars, mainly due to bad weather.
the accuracy of the photometry improves significantly after bad quality data is removed. for stars from 7m to 10m, the photometry error is less than ∼ 0.015 magnitude. the dispersion is still larger for stars of greater or lower brightness, but the accuracy of the photometry is greatly improved [5, 6].

3. database
the data acquired during the pi of the sky observations are reduced, and only the light curves of stars are stored in the project databases. these databases contain all measurements taken by the pi of the sky detector. the first database covers vii.2004–vi.2005, and contains about 790 mln measurements for about 4.5 mln objects. the second database covers v.2006–xi.2007, and includes about 1002 mln measurements for about 10.8 mln objects. the third database covers v.2006–iv.2009, and includes about 2.16 billion measurements for about 16.7 mln objects. a dedicated web interface has been developed to facilitate public access to the databases of the pi of the sky project. the interface allows the user to search for stars by magnitude, coordinates and other parameters and to view the light curve of a selected star. public databases are available on the pi of the sky web page http://grb.fuw.edu.pl/pi/databases [7].
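as an illustration of how such light curves might be retrieved programmatically, a hypothetical query against a postgresql database is sketched below; the table and column names (measurements, star_id, mjd, mag) are invented for the example and do not describe the real project schema.

```python
import psycopg2  # standard python client for postgresql

def fetch_lightcurve(dsn, star_id):
    """return (epoch, magnitude) pairs for one star; hypothetical schema."""
    with psycopg2.connect(dsn) as conn:
        with conn.cursor() as cur:
            cur.execute(
                "select mjd, mag from measurements "
                "where star_id = %s order by mjd",
                (star_id,),
            )
            return cur.fetchall()
```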
4. methods for improving data quality
4.1. color correction
a ccd detector is not equally sensitive at all wavelengths. according to the manufacturer, the ccd detectors used in the pi of the sky project are at their most sensitive in the near infrared, while the average wavelength is, in their case, 〈λ〉 ≈ 585 nm, which corresponds approximately to a wavelength characteristic of the v filter. the sensitivity of ccd detectors varies depending on the wavelength λ, and affects the quality of the results of measurements taken in white light. two objects with the same luminosities, where the first object shines in the near infra-red and the second object shines e.g. in blue, will have different brightnesses in our detectors: an object shining in the near infra-red will be brighter than a blue object. since neither of the inta cameras present in the new detector has a filter installed that would take care of this effect, we must take the effect into account while cataloging our data. we have already determined that, in the case of data collected by the prototype detector in chile, the following formula may be used for approximating the correction of the standard photometry used in the pi of the sky project, which rests on taking into account the dependence of the sensitivity of the ccd chip on the observed star type:

mcorr = m − 0.2725 + 0.5258 (j − k). (1)

in this formula, the j value and the k value correspond to the brightnesses of the tested object in the j filter and in the k filter. m is the brightness of the analysed object, measured by the detector and normalized to the brightness of catalog stars in v. mcorr represents the corrected magnitude, taking the color correction into account. the cameras of the prototype, however, have chips made by a different manufacturer (fairchild) than the cameras in spain (sta), so the colour calibration also had to be repeated for the data collected by the new detector (see fig. 3).

figure 3. the detector response is correlated with the spectral type (b−v or j−k) of catalog stars.

equation (1) provides information about the corrected magnitude only for catalog stars which have j and k brightness values. to calculate the photometry corrections for any stars visible in the resulting picture, we use a special procedure that requires the use of only the best catalog stars. we are interested in catalog stars in the magnitude range from 6m to 10m, and we reject stars with a magnitude shift (mcorr − m) bigger than 0.2. the number of measurements of a catalog star should also be more than 100, and we accept only catalog stars which have an rms of mcorr smaller than 0.07, see fig. 4.

figure 4. only the best catalog stars (blue) were used for calculating the photometry corrections, after rejecting stars with a magnitude shift (mcorr − m) bigger than 0.2, an rms of mcorr bigger than 0.07 and with fewer than 100 measurements.

for such selected catalog stars we calculate the quadratic surface correction, and we try to interpolate the value of the correction for the point where our analysed star exists. the average square distance of the reference stars from the fitted correction surface (χ2) provides additional, independent information about the quality of the analysed measurement. this information can be used to select the measurements with the most precise photometry.
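the selection just described can be written down directly. a sketch applying eq. (1) and the quoted cuts follows; we interpret the magnitude-shift cut as a cut on its absolute value and apply the 6m–10m range to the measured magnitude m (both interpretations are our assumptions).

```python
import numpy as np

def color_corrected_mag(m, j, k):
    """eq. (1): spectral-type correction found for the prototype (fairchild) chips."""
    return m - 0.2725 + 0.5258 * (j - k)

def select_reference_stars(m, j, k, n_meas, rms_corr):
    """apply the quality cuts quoted in the text to catalog stars."""
    mcorr = color_corrected_mag(m, j, k)
    good = (
        (m >= 6.0) & (m <= 10.0)          # magnitude range 6m..10m
        & (np.abs(mcorr - m) <= 0.2)      # reject large magnitude shifts
        & (n_meas > 100)                  # require enough measurements
        & (rms_corr < 0.07)               # low scatter of corrected magnitudes
    )
    return mcorr, good
```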
figure 5. uncorrected lightcurve of the bg ind variable (left) and after spectral corrections with the correction quality cut (right).

the effect of the photometry correction with the χ2 cut on the reconstructed bg ind light curve is shown in fig. 5. in this case, applying the new algorithm improved the photometry quality, and an uncertainty sigma of the order of 0.013m was obtained. we also applied the photometry correction to other stars, with results as good as in the case of the bg ind variable. the spectral correction and the additional χ2 cut enable selection of only the measurements with the highest precision [8, 9].

4.2. statistical methods
another, independent way to improve the data quality, which we also considered, is to study the statistical properties of a group of frames. in this method, based on the statistical properties of a group of frames, we calculate the quality of a single frame and, on the basis of this quality, we calculate the quality of each single measurement on this frame. at the beginning of this correction procedure we divide all analysed frames into “good” and “bad”. in this case, we create |mcorr − med| histograms based on all analysed frames. mcorr represents the corrected catalog star magnitude taken from the analysed frame; the correction takes into account the dependence of the observed magnitude on the brightness of the catalog star. med is the median, which is calculated on the basis of good measurements (with quality = 0) of this catalog star. in the near future, we will calculate the median on the basis only of good measurements taken from the same field.

figure 6. gauss function fitted to the |mcorr − med| histogram. thanks to this fit, we can obtain the value of σ, which is then used to calculate the quality of the analysed frames.

for each frame, we find the catalog stars in a given magnitude range, and for these stars we calculate the values of |mcorr − med|. the results are used to plot |mcorr − med| histograms for all ranges of magnitude. to these histograms we fit the gauss function, which gives us the value of σ, which is later used to obtain the quality of the analysed frames (see fig. 6).

figure 7. values of σ obtained from the fitted gauss function for |mcorr − med| histograms.

as shown in fig. 7, the smallest value of σ obtained from the gauss function fitted to the |mcorr − med| histograms was found for the brightness range from 8m to 8.5m (for lco data). this σ value was later used to obtain the quality of the whole frame. in order to determine the quality of a given frame, we check how many catalog stars (which are visible in this frame) have |mcorr − med| > 2σ. on the basis of many tests, we assumed that “good” frames have ≤ 10 % bad catalog stars, and “bad” frames have > 10 % bad catalog stars (see fig. 8).

figure 8. histograms of the percentage of bad catalog stars in a single frame. these histograms were created for a given range of frames. for each frame, we calculated what percentage of the catalog stars from the analysed frame have |mcorr − med| > 2σ. we assumed that bad frames have more than 10 % bad catalog stars, and good frames have fewer than 10 % bad catalog stars.

if we know which frames are good and which are bad, we can calculate 〈m〉 and σ〈m〉 values based on each group of frames. we can calculate these values on the basis of measurements taken only from good frames, from bad frames or from all frames. we also take into account the quality of the data calculated using the previous methods. the results are given in fig. 9.

figure 9. σ〈m〉 vs 〈m〉 plot. in red, points with positions corresponding to 〈m〉 and σ〈m〉 calculated on the basis of all measurements; in green, points calculated on the basis of good measurements only.
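a sketch of the frame-quality test described above: fit a zero-centred gaussian to the |mcorr − med| histogram to obtain σ, then flag a frame as bad when more than 10 % of its catalog stars deviate by more than 2σ. the binning and the starting values of the fit are illustrative choices, not the project's settings.

```python
import numpy as np
from scipy.optimize import curve_fit

def gauss(x, a, sigma):
    # zero-centred gaussian used for the |mcorr - med| histogram fit
    return a * np.exp(-x**2 / (2.0 * sigma**2))

def fit_sigma(dm, bins=40):
    """fit a gaussian to the histogram of |mcorr - med| and return sigma."""
    counts, edges = np.histogram(np.abs(dm), bins=bins)
    centres = 0.5 * (edges[:-1] + edges[1:])
    (a, sigma), _ = curve_fit(gauss, centres, counts,
                              p0=(counts.max(), np.std(dm)))
    return abs(sigma)

def frame_is_good(dm_frame, sigma, max_bad_fraction=0.10):
    """good frame: at most 10 % of its catalog stars beyond 2 sigma."""
    bad = np.abs(dm_frame) > 2.0 * sigma
    return bad.mean() <= max_bad_fraction
```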
5. conclusions
much information about the pi of the sky project can be found on the project webpage, which is available at http://grb.fuw.edu.pl. we have created a system of dedicated filters to mark bad measurements and frames. this system is applied together with the cataloging procedure for new data. to improve the quality of the data, we have created an approximate color calibration algorithm based on the spectral type of catalog stars. we have also developed another, statistical method, which analyses all stars in the frame and rejects bad-quality exposures. after the new frame selection is applied, photometry accuracy of 0.01m–0.03m can be obtained. further improvement can be made in dedicated analysis of selected objects [10].

acknowledgements
we are very grateful to professor g. pojmański for providing access to the asas dome in lco and for sharing his experience with us. we would like to thank the staff of the san pedro de atacama observatory and the bootes-1 station at esat/inta-cedea in el arenosillo (mazagón, huelva) for their help during the installation and maintenance of our detector. this paper has presented research supported by a polish ministry of science and higher education research project from 2009–2012.

references
[1] sokołowski, m.: investigation of astrophysical phenomena in short time scales with pi of the sky, july, 2008, warsaw.
[2] majczyna, a.: search for optical flashes of astronomical origin with pi of the sky prototype, the 2009 europhysics conference on high energy physics, july, 2009.
[3] małek, k.: general overview of the pi of the sky system, proceedings of spie, volume 7502, may, 2009.
[4] majczyna, a.: pi of the sky catalogue of the variable stars from 2006–2007 data, proceedings of spie, volume 7745, may, 2010.
[5] sokołowski, m.: detection of short optical transient of astrophysical origin in real time, proceedings of spie, volume 7502, may, 2009.
[6] małek, k.: pi of the sky detector, advances in astronomy, vol. 2010, article id 194946, june, 2009.
[7] sokołowski, m.: automated detection of short optical transients of astrophysical origin in real time, advances in astronomy, vol. 2010, article id 463496, june, 2009.
[8] siudek, m.: photometric analysis of the pi of the sky data, acta polytechnica 2011/6 – proceedings ibws 2011, june, 2011.
[9] opiela, r.: improving photometry of the pi of the sky, proceedings of spie, volume 7745, june, 2010.
[10] żarnecki, f.: improving the photometry of the pi of the sky system, acta polytechnica 2011/2 – proceedings ibws 2010, june, 2010.

acta polytechnica 53(2):88–93, 2013
© czech technical university in prague, 2013, available online at http://ctn.cvut.cz/ap/

characterization of human gait using fuzzy logic
patrik kutilek a,∗, slavka viteckova a, zdenek svoboda b
a faculty of biomedical engineering, czech technical university in prague, nam. sitna 3105, 272 01 kladno, czech republic
b faculty of physical culture, palacky university of olomouc, krizkovskeho 8, 771 47 olomouc, czech republic
∗ corresponding author: kutilek@fbmi.cvut.cz

abstract. in medical practice, there is no appropriate widely-used application of a system based on fuzzy logic for identifying the lower limb movement type or type of walking.
the object of our study was to determine characteristics of the cyclogram that identify the gait behavior by using a fuzzy logic system. the set of data for setting and testing the fuzzy logic system was measured on 10 volunteers recruited from healthy students of the czech technical university in prague. the human walking speed was defined by the treadmill speed, and the inclination angle of the surface was defined by the treadmill and terrain slope. the input to the fuzzy expert system is based on the following variables: the area and the inclination angle of the cyclogram. the output variables from the fuzzy expert system are the inclination angle of the surface and the walking speed. we also tested the method with input based on the angle of inclination of the surface and the walking speed, and with output based on the area and the inclination angle of the cyclogram. we found that identifying the type of terrain and walking speed on the basis of an evaluation of the cyclogram could be sufficiently accurate and suitable if we need to know the approximate type of walking and the approximate inclination angle of the surface. according to the method described here, the cyclograms could provide information about human walking, and we can infer the walking speed and the angle of inclination of the terrain.

keywords: type of terrain, walking speed, inclination angle, fuzzy logic, cyclogram.

1. introduction
in medical practice, there is no widely-used application of a system based on fuzzy logic for identifying lower limb movement type or type of walking. several methods can be used in medical practice and in physiotherapeutic research for identifying gait behavior. the most widely-used method for studying gait behavior in clinical practice is gait phase analysis by gait time phase cycles [1–3]. time and phase diagrams of gait behavior have been used to analyze gait with the application of artificial intelligence methods [4–7], but the findings have not subsequently been applied in medical practice. intensive research is now being done on predicting leg movements by artificial intelligence (ai) and emg signal measurements [8–10]. fuzzy logic, as a part of ai, has only been tested in gait recognition systems or bipedal robotics control systems [11, 12], but has never been tested to identify the type of gait by using information about the joint angles. for a study of gait, we have used methods based on an analysis of gait angles using cyclograms (also called angle-angle diagrams or cyclokinograms) and artificial intelligence [13] to identify gait behavior. the first mention of the cyclogram [14] argued that a cyclic process such as walking is better understood if studied with a cyclic plot, e.g. an angle-angle diagram. owing to the cyclicity of the gait, cyclograms are closed trajectories generated by simultaneously plotting two (or more) joint quantities. in gait studies, most attention has traditionally been given to easily identifiable planar knee-hip cyclograms [14, 15]. applications of cyclograms in conjunction with fuzzy logic can offer a wide range of medical applications, but this approach has not yet been studied or applied in practice. our paper investigates the application of the fuzzy rule based expert system (frbes) to the identification of human gait behavior.

2. data acquisition
the main object of our study was to identify the characteristics of the cyclogram that identify the gait behavior (i.e. type of terrain and walking speed) by using a fuzzy logic system. the set of data for setting and testing the fuzzy logic system was measured on 10 volunteers recruited from healthy students of the czech technical university in prague. the subjects were asked to walk properly on a treadmill at a variable walking speed and inclination angle of the surface. the human walking speed was defined by the treadmill speed, and the angle of inclination of the surface was defined by the treadmill and terrain slope. we used the very accurate lukotronic as200 motion capture system to record data about the positions of the markers, i.e. body segment movement in a three-dimensional space [15]. one camera system mounted in front of or behind the subject moving on the treadmill recorded the 3d motion of the lower limbs [13].

figure 1. example of the arrangement of ir markers and angles measured during the trial [13, 15].

markers were placed in accordance with the manufacturer's recommendations for gait analysis by the gaitlab software. the recommended marker set model is similar to the set defined by the helen hayes hospital model [16] for vicon clinical manager and the sagittal plane (figure 1) [13, 15]. using this method, we can record the movement in a three-dimensional space, though we primarily study the movement in a two-dimensional sagittal plane [13, 15]. the markers, i.e. the points, move in space together with the body segments, and the individual segments of the body move by translational and angular movement per unit time. the angles between each two segments are calculated by assuming the segments to be idealized rigid bodies, figure 1.
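as a sketch of the angle computation just mentioned, the knee angle can be obtained from three sagittal-plane marker positions, treating the thigh and the shank as idealized rigid segments. the marker names and coordinates below are hypothetical, and the sign convention is ours rather than that of the gaitlab software.

```python
import numpy as np

def segment_angle(p_prox, p_dist):
    """planar orientation of a body segment in the sagittal plane,
    from the 2d (x, y) positions of its proximal and distal markers."""
    d = np.asarray(p_dist) - np.asarray(p_prox)
    return np.degrees(np.arctan2(d[1], d[0]))

def knee_angle(hip, knee, ankle):
    """knee joint angle as the difference between the thigh and shank
    segment orientations (both treated as rigid bodies)."""
    thigh = segment_angle(hip, knee)
    shank = segment_angle(knee, ankle)
    return thigh - shank

# hypothetical marker positions (metres) projected onto the sagittal plane
print(knee_angle(hip=(0.00, 0.90), knee=(0.05, 0.45), ankle=(0.00, 0.05)))
```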
3. gait characteristics
to create and study angle-angle diagrams, we use a basic kinematic 7-segment model of the human body created in matlab (mathworks, inc.) simulink, i.e. the simmechanics software. our model consists of a torso, thighs, shins and feet; it does not include the arms and head. we obtained graphs of the changes of the angles (during the time of the trial) in all joints of the lower part of the human body in the sagittal plane. this was important for subsequent computation of the angle-angle diagram curves [13, 15].

figure 2. example of the knee-hip cyclogram.

the cyclogram (figure 2) shows that the swing phase typically starts at a thigh extension angle of 0° and knee flexion of about 80 % of the maximum. the measured subject weighed 75 kilograms and was 23 years old. for each measured subject and a specific velocity, a closed cyclogram trajectory was created from the average values of the measured angles. in the next step of the study, we used the cyclograms to identify the gait behavior for different speeds and surface inclination angles. in our previous work [13, 15], we used a method based on principal component analysis (pca) to describe the shape of the area enclosed by the trajectory of a cyclogram, but this method is less suitable for non-normal data distributions, if they occur. for this reason, we have to describe the 2d shape of the area of a cyclogram by other variables used to describe two-dimensional plane shapes [15]. we decided to use the second moment of the area to describe the two-dimensional plane shape of the cyclogram. the reason for using the second moment is that we can very easily describe the distribution of the circumscribed area of an angle-angle diagram, and we can determine the inclination angle θ of the diagram.
the value of the angle θ, for which the product moment is zero, is given by [15]

tan 2θ = 2 ixy / (ixx − iyy),

where ixx is the second moment of the area about the x-axis, iyy is the second moment of the area about the y-axis, and ixy is the product moment of the area [17, 18]; θ is the inclination angle between the axes of the original coordinate system of the diagram and the principal axes of the area enclosed by the closed trajectory of the diagram curve. if the area a enclosed by the trajectory of the cyclogram of the gait cycle is defined, then all the stereotypes of walking can be characterized by the cyclogram area and the inclination of the cyclogram. theoretically, we could also use the characteristics of the ellipse of inertia, the polar moment, the center of the area, the maximum and minimum second moments of the area, etc., to describe the gait behavior.
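treating one gait cycle of the cyclogram as a closed polygon, the enclosed area, the second moments and the inclination angle from the relation above can be computed with the standard polygon-moment formulas. this is our sketch, not the authors' matlab code; note that which of the two perpendicular principal axes the arctangent returns depends on the sign convention of the moments.

```python
import numpy as np

def cyclogram_shape(x, y):
    """area and inclination angle of a closed cyclogram given as polygon
    vertices (x[i], y[i]); standard polygon-moment (shoelace) formulas."""
    x2, y2 = np.roll(x, -1), np.roll(y, -1)
    c = x * y2 - x2 * y                        # cross terms of the shoelace sum
    a = 0.5 * np.sum(c)                        # signed enclosed area
    cx = np.sum((x + x2) * c) / (6.0 * a)      # centroid
    cy = np.sum((y + y2) * c) / (6.0 * a)
    ixx = np.sum((y**2 + y * y2 + y2**2) * c) / 12.0
    iyy = np.sum((x**2 + x * x2 + x2**2) * c) / 12.0
    ixy = np.sum((x * y2 + 2 * x * y + 2 * x2 * y2 + x2 * y) * c) / 24.0
    # shift the moments to the centroid (parallel-axis theorem)
    ixx -= a * cy**2
    iyy -= a * cx**2
    ixy -= a * cx * cy
    # theta solves tan 2theta = 2 ixy / (ixx - iyy), as in the text
    theta = 0.5 * np.degrees(np.arctan2(2.0 * ixy, ixx - iyy))
    return abs(a), theta

# toy check: an ellipse-like cyclogram with semi-axes 40 and 10, tilted by 30 deg
t = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
u, v = 40.0 * np.cos(t), 10.0 * np.sin(t)
ang = np.radians(30.0)
x = u * np.cos(ang) - v * np.sin(ang)
y = u * np.sin(ang) + v * np.cos(ang)
print(cyclogram_shape(x, y))
# prints roughly (1256.6, 60.0): the area matches pi*40*10, and the angle is
# one of the two perpendicular principal axes (the other, 60 - 90 = -30 deg,
# is the elongation direction of this 30-deg-tilted ellipse)
```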
in our case, the first variable that is used is the area (a) of the cyclogram, and the second used variable is the inclination angle (θ) of the cyclogram. we can also use the following variables: the second moment of the area about the x-axis and the y-axis, the product moment of the area, the polar moment of inertia, etc. the first fuzzy expert system used and designed to identify human gait behavior is based on two input variables: area and inclination angle, table 1. the output variables from the fuzzy expert system are: the inclination angle of the surface, and the walking speed, see table 1. the proposed model is based on identifying the type of terrain and the walking speed. the fuzzy expert system interprets an input and, on the basis of sets of if-then rules, infers an output, see figure 3. we use the fuzzy expert system to identify the type of terrain and the walking speed, but we can also use the opposite way. if we know the desired type of terrain and walking speed, we can identify characteristics of the cyclogram, and thus the joint angles, see figure 3b. in the next part of this paper, we will deal with the first way, see figure 3a. the mamdani fuzzy model was used and the rules are in the form: if area (a) is . . . and inclination angle of diagram (θ) is . . . then walking speed (v) is . . . and inclination angle of the surface (ε) is . . . we also defined the membership functions to which the inputs belong in each of the appropriate fuzzy antecedent sets (see table 2 for a description of the membership functions). the output variables are represented by consequent fuzzy sets. we determined the membership functions to which the outputs belong in each of the appropriate fuzzy consequent sets (see table 2). 90 vol. 53 no. 2/2013 characterization of human gait using fuzzy logic fuzzy fuzzy a b c d variable set a (deg2) s 600 600 1550 m 600 1550 2000 l 1550 2000 3150 3150 θ (deg) sn −40 −40 −5 10 sp −5 10 25 mp 10 25 40 lp 25 40 40 v (km/h) s 0 0 5 c 0 5 12 f 5 12 12 ε (deg) s 0 0 15 m 0 15 30 l 15 30 30 table 2. parameters for defining the triangular (a, b and c) and trapezoidal (a, b, c and d) membership function for fuzzy sets. we described all rules by matrices of rules, see table 3. the matrices of variables are derived from experimental findings and knowledge indicating relations between type of terrain and walking speed and characteristics of the cyclogram. fuzzy rules define the gait characteristics by combinations of input variables. for example, if area (a) is small and inclination angle (θ) is small positive then walking speed (v) is small and the inclination angle of the surface (ε) is small. our method for identifying gait characteristics is based on the premise of the proposal of the fuzzy expert system. we used the fuzzy toolbox and the mamdani fuzzy inference system (fis) in matlab to create the fuzzy expert system. mamdani fis is the most known and the most used system in developing fuzzy models. we assume that the fuzzy system created by information on the state of the terrain, the walking speed and the characteristics of the closed trajectories of the cyclogram is sufficient to identify the gait characteristics. 5. results we assume that the type of terrain and the walking speed strongly affect the characteristics of the cyclogram. the human walking speed was defined by the treadmill speed (3 km/h and 6 km/h), and the angle of inclination of the surface (0° and 16°) was defined by the treadmill and the terrain slope. 
5. results
we assume that the type of terrain and the walking speed strongly affect the characteristics of the cyclogram. the human walking speed was defined by the treadmill speed (3 km/h and 6 km/h), and the angle of inclination of the surface (0° and 16°) was defined by the treadmill and the terrain slope. this data was used to set the frbes, but it can also be used to verify the functionality of the proposed expert system by simulation. the frbes had to be tested to see if it was working properly. in order to do this, known data was input to see whether the system was able to recognize the right type of terrain and walking speed. based on the input variables (a and θ), the fuzzy expert system selects two outputs. these responses are represented by two variables (v and ε), see table 4. the simulation was done in matlab (mathworks).

frbes input: characteristics of the cyclogram
  a (deg2)   | 770 | 938 | 986 | 1340
  θ (deg)    | 11  | 42  | 13  | 40
frbes output: gait behavior
  v (km/h)   | 3.6 | 4.1 | 4.9 | 5.6
  ε (deg)    | 5   | 15  | 5   | 15
real gait behavior
  v (km/h)   | 3.0 | 3.0 | 6.0 | 6.0
  ε (deg)    | 0   | 16  | 0   | 16
differences of the predicted values from the expected values
  ∆v (km/h)  | 0.6 | 1.1 | 1.1 | 0.4
  ∆ε (deg)   | 5   | 1   | 5   | 1

table 4. example of tested values of the characteristics of cyclograms and the type of terrain and walking speed predicted by the fuzzy expert system.

the inferred data (table 4) shows that the use of the described variables and fuzzy logic successfully enables us to determine the approximate expected values of the type of terrain and walking speed from the characteristics of the cyclogram. the slight variability in the estimation of the type of terrain and walking speed can be considered negligible, because there are small variations even in natural human walking. the small differences (table 4) in the inferred characteristics are comparable with those seen during measured natural human walking. we found that inference based on an evaluation of the cyclogram could be sufficiently accurate and suitable if we need to know the approximate type of walking and the approximate inclination angle of the surface. according to the method described here, cyclograms could provide information about human walking, and we can infer the type of terrain and walking speed.

6. discussion
we chose measured and calculated values of the joint angles as an approach for creating the cyclograms. walking is described by the cyclogram, and the cyclogram is used to identify the type of terrain and the walking speed. for setting the fuzzy sets, we used the shape of the closed trajectory of the cyclogram that represents a set of states of a man walking. cyclograms in conjunction with artificial intelligence are broadly applicable in medicine. we have presented a method for describing the motion of the lower extremities, and these characteristics can be used for evaluating human gait in physiotherapeutic practice, based on a study of angle-angle diagrams. in our study, we have used an artificial fuzzy logic system based on expert knowledge. the new methods based on soft computing (i.e. fuzzy logic, neural networks and genetic algorithms) can be applied in clinical practice for studies of disorders or characteristics in the motion function of the human body, and the method can be used in advanced control systems for controlled prostheses of the lower extremities [22]. in the past, it was almost impossible to use complex algorithms based on ai in the slow control systems of a controlled prosthesis, but today we can consider applying the methods described here in the algorithms for new prosthetic control systems [23, 24].
the fuzzy expert system can also be modified to take into account other appropriate parameters, for example angular velocity or acceleration. the fuzzy logic expert system, however, puts great demands on expert knowledge, or on automatic computer determination of fuzzy rules and membership functions based on a large number of experimentally measured data items. the proposed methods based on a fuzzy expert system can also be used in algorithms for a driven robotic gait orthosis for the purposes of locomotion therapy [25, 26]. this work has not attempted to describe all potential ways of applying cyclograms in conjunction with fuzzy logic. we have shown new methods that have subsequently been proved by a simulation in matlab software. these methods based on cyclograms and fuzzy logic could be suitable for a broad range of applications.

acknowledgements
this research has been supported by ctu prague project sgs 13/091/ohk4/1t/17.

references
[1] gage, r. j., hicks, r.: gait analysis in prosthetics. clinical prosthetics & orthotics, 9(3), 1989, p. 17–21.
[2] kerrigan, d. c., schaufele, m., wen, m. n.: gait analysis. rehabilitation medicine: principles and practice, philadelphia: lippincott williams & wilkins, 1998, p. 167–187.
[3] janura, m., cabell, l., svoboda, z., kozakova, j., gregorkova, a.: kinematic analysis of gait in patients with juvenile hallux valgus deformity. journal of biomechanical science and engineering, 3(3), 2008, p. 390–398.
[4] gioftsos, g., grieve, d. w.: the use of neural networks to recognize patterns of human movement: gait patterns. clinical biomechanics, 10(4), 1995, p. 179–183.
[5] lai, d. t. h., begg, r. k., palaniswami, m.: computational intelligence in gait research: a perspective on current applications and future challenges. ieee transactions on information technology in biomedicine, 13(5), 2009, p. 687–702.
[6] wang, l., tan, t., ning, h., hu, w.: automatic gait recognition based on statistical shape analysis. ieee trans. image processing, 12(9), 2003, p. 1120–1131.
[7] mijailovic, n., gavrilovic, m., rafajlovic, s.: gait phases recognition from accelerations and ground reaction forces: application of neural networks. telfor journal, 1(1), 2009, p. 34–36.
[8] sepulveda, f., wells, d., vaughan, c.: a neural network representation of electromyography and joint dynamics in human gait. journal of biomechanics, 26(2), 1993, p. 101–109.
[9] prentice, s. d., patla, a. e., stacey, d. a.: artificial neural network model for the generation of muscle activation patterns for human locomotion. journal of electromyography and kinesiology, 11(1), 2001, p. 19–30.
[10] heller, b. w., veltink, p. h., rijkhoff, n. j. m., rutten, w. l. c., andrews, b.: reconstructing muscle activation during normal walking: a comparison of symbolic and connectionist machine learning techniques. biological cybernetics, 69(4), 1993, p. 327–335.
[11] lu, j. w., zhang, e.: gait recognition for human identification based on ica and fuzzy svm through multiple views fusion. pattern recognition letters, 28(16), 2007, p. 2401–2411.
[12] jacobs, r.: control model of human stance using fuzzy logic. biological cybernetics, 77, 1997, p. 63–70.
[13] kutilek, p., farkasova, b.: prediction of lower extremities movement by angle-angle diagrams and neural networks. acta of bioengineering and biomechanics, 13(2), 2011, p. 57–65.
[14] grieve, d. w.: gait patterns and the speed of walking. biomedical engineering, 3(3), 1968, p. 119–122.
[15] kutilek, p., viteckova, s.: prediction of lower extremity movement by cyclograms. acta polytechnica, 52(1), 2012, p. 51–60.
[16] vaughan, c. l., davis, b. l., o’connor, j. c.: dynamics of human gait. 2nd edition, cape town: kiboho publishers, 1999.
[17] goldstein, h.: classical mechanics. 2nd ed., boston: addison-wesley, 1980.
[18] pilkey, w. d.: analysis and design of elastic beams. new york: john wiley & sons, 2002.
[19] sivanandam, s. n., sumathi, s., deepa, s. n.: introduction to fuzzy logic using matlab. berlin: springer, 2010.
[20] hajny, o., farkasova, b.: a study of gait and posture with the use of cyclograms. acta polytechnica, 50(4), 2010, p. 48–51.
[21] zadeh, l.: fuzzy sets. information control, 8, 1965, p. 338–353.
[22] eng, j. j., winter, d. a.: kinetic analysis of the lower limbs during walking: what information can be gained from a three-dimensional model. journal of biomechanics, 28(6), 1995, p. 753–758.
[23] bellmann, m., schmalz, t., blumentritt, s.: comparative biomechanical analysis of current microprocessor-controlled prosthetic knee joints. archives of physical medicine and rehabilitation, 91(4), 2010, p. 644–652.
[24] brian, j. h., laura, l. w., noelle, c. b., katheryn, j. a., douglas, g. s.: evaluation of function, performance, and preference as transfemoral amputees transition from mechanical to microprocessor control of the prosthetic knee. archives of physical medicine and rehabilitation, 88(2), 2010, p. 207–217.
[25] boian, f. r., burdea, c. g., deutsch, e. j.: robotics and virtual reality applications in mobility rehabilitation. rehabilitation robotics, vienna: i-tech education and publishing, 2007, p. 27–42.
[26] cikajlo, i., matjacic, z.: advantages of virtual reality technology in rehabilitation of people with neuromuscular disorders. recent advances in biomedical engineering, vienna: intech, 2009, p. 301–320.

acta polytechnica vol. 41 no. 2/2001
© czech technical university publishing house http://ctn.cvut.cz/ap/

filling and recycling apparatus of a cyclotron target with enriched krypton for production of radiopharmaceuticals
m. vognar, m. fišer, č. šimáně, d. chvátil

an apparatus for multiple filling of a cyclotron target with enriched kr gas is described. the system is based on recycling pressurized gas by cryogenic pumping between the target tube and storage containers. the design and construction make use of previous experience in the construction and operation of two analogous apparatuses for 124xe high pressure gas targets, but major modifications have been incorporated, evoked by the different physical properties of kr, by the character of the nuclear reaction, and by the demand for automation from the side of the end user.

keywords: generator, kr gas target, 36kr82(p, 2n)37rb81 reaction.

1 introduction
cyclotron-produced 81rb (t1/2 = 4.58 h) and its daughter 81mkr (t1/2 = 13.3 s) constitute a radionuclide generator for nuclear medicine used as a common tool for lung ventilation studies. the main advantages of this generator are the good imaging properties of 81mkr and the low radiation dose in tissue. because demand for such a system exists, it was decided to develop an appropriate target and radiopharmaceutical assemblies. these considerations led to a project to make an apparatus for producing the radionuclide 37rb81. the authors have experience in constructing and operating filling and recycling apparatus for the production of 123i on electron accelerators [1], [2] for gaseous xe targets (natural xenon and xenon enriched in 124xe). the distinctive character of these apparatuses is the irradiation of gaseous targets in the proximity of the critical point of xe to guarantee high effectiveness and a reproducible yield of 123i from the (γ, n) reaction during irradiation of 124xe. by contrast, cyclotron targets work at lower pressure, resulting from the reaction conditions of protons with the target material, i.e., from the “square density” parameter and the necessary loss of energy for a given reaction channel, e.g.,

36kr82(p, 2n)37rb81 (t1/2 = 4.58 h) → (ec 67 %, β+ 27 %) 36kr81m (t1/2 = 13.3 s), eγ = 190 kev.

for a thick target and proton energy ep = (29 → 23) mev, the yield is about 48 mci/(μa·h).
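a small worked example of what this yield implies, under the assumption (ours) that 48 mci/(μa·h) is the 81rb activity produced during the first hour of irradiation at 1 μa; the saturation activity then follows from the 4.58 h half-life.

```python
import math

T_HALF_H = 4.58                  # half-life of 81rb in hours
LAM = math.log(2) / T_HALF_H     # decay constant per hour

def rb81_activity_mci(current_ua, t_irr_h, yield_1h=48.0):
    """end-of-bombardment 81rb activity for a thick kr-82 target.

    assumes the quoted 48 mci/(ua*h) is the activity produced during the
    first hour at 1 ua, so the saturation activity per ua follows from
    a_sat * (1 - exp(-lam * 1 h)) = 48 mci, giving a_sat of roughly 342 mci/ua.
    """
    a_sat = yield_1h / (1.0 - math.exp(-LAM))
    return current_ua * a_sat * (1.0 - math.exp(-LAM * t_irr_h))

# e.g. a 2 h irradiation at 10 ua would give roughly 900 mci at end of bombardment
print(round(rb81_activity_mci(current_ua=10.0, t_irr_h=2.0)))
```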
when working with expensive krypton, highly enriched in one of its isotopes, it is necessary to ensure multiple recycling without significant losses and pollution of the gas. the recycling of the gas in the apparatus is achieved by cryogenic pumping at the temperature of liquid nitrogen. before the gas transfer, the transport volume of the apparatus must be evacuated to about 10�2 pa. the pressure of gas vapors at liquid nitrogen temperature for kr is 133 pa, which is three orders of magnitude higher than for xe (0.1 pa). to reduce the losses of kr gas per one pumping cycle, the vapor pressure must be reduced by using a suitable solid adsorbent within the pumped volume. a diagram of the apparatus is schematically shown in fig. 1. for completeness, the radiochemical, drying and target tube subsystems, which were made in the npi of cas, are indicated in gray. fig. 2 illustrates the apparatus in development. fig. 1 shows the apparatus together with the radiochemical, drying and target tube subsystems, made in the npi of cas. the scheme is much more complicated than the apparatus in [1] and [3]. this is due partly to the need to use molecular sieves, partly to satisfy the requirement that it should be possible to automate the process with the use of a pc and rs 232 interface. © czech technical university publishing house http://ctn.cvut.cz/ap/ 3 acta polytechnica vol. 41 no. 2/2001 filling and recycling apparatus of a cyclotron target with enriched krypton for production of radiopharmaceuticals m. vognar, m. fišer, č. šimáně, d. chvátil an apparatus for multiple filling of a cyclotron target with enriched kr gas is described. the system is based on recycling pressurized gas by cryogenic pumping between the target tube and storage containers. the design and construction makes use of previous experience in the construction and operation of two analogue apparatuses for xe124 high pressure gas targets, but major modifications have been incorporated, evoked by the different physical properties of kr, by the character of the nuclear reaction, and by the demand for automation from the side of the end user. keywords: generator, kr gas target, 36k 82(p, 2n) 37rb 81 reaction. � � � �ec 67 27% , %� � fig. 1: scheme of the filling and recycling apparatus: dp – diaphragm vacuum pump, tmp – turbomolecular vacuum pump, c1 – working container, c2 – storage container, ms – molecular sieve, c3 – intermediate high pressure container (kr trap), d1 to d3 – dewar flasks, ln – liquid nitrogen, f – microfilter, v1 to v13 valves, rv1, rv2 – pressure reduction valves, vv – venting valve, fi – first filling inlet fitting, h1 and h2 – container thermocoax heating elements, pim – pirani vacuum gauge, pem – penning vacuum gauge, pt1 to pt3 – pressure transducers, m1 – manometer, t1, t2 –thermocoax thermocouples for measuring of the gas temperature, nls1 to nls3 – liquid nitrogen level sensors, nbs1 and nbs3 – sensors for control nitrogen boiling, nbh1 and nbh3 – heaters for boiling out liquid nitrogen, fv1, fv2 – filling volumes 2 description of the apparatus the pumping system consists of a diaphragm vacuum pump dp and a turbomolecular pump tmp, which can be separated from the rest of the apparatus by the valve v0. to prevent the transfer of mechanical vibrations from the diaphragm pump to the turbomolecular pump, the former is suspended on spiral springs on a separate stand and connected with the turbomolecular pump, mounted on the main stand, by a bellow hose provided with additional damping. 
the complete filling and recycling apparatus consists of working pressure container c1, storage container c2, high pressure intermediate container c3 (kr trap), high-pressure and vacuum-tight, pneumatically actuated high pressure bellows valves v0 to v6, an electrically controlled pressure air distribution system to the valves, pressure transducers pt1 to pt3, and penning type pem and pirani type pim vacuum gauges. the system is mounted together with the corresponding armatures and tubing on the supporting plate on the main stand. the apparatus is accessible for the first filling by the fitting f1, which is permanently blinded during operation. for the connection with the target, the other end of the apparatus is provided with a tube coupling of the ultraseal type for evacuation and with two swagelock couplings for filling the target from the intermediate container and for chemical treatment of the product.

the stainless steel dewar flasks for freezing the gas in containers c1 and c2 are mounted on an auxiliary plate on the main stand. the dewar flask d3 for freezing the gas in the intermediate container c3 is suspended on the main supporting plate. the thermocouples nls1 to nls3, for sensing the liquid nitrogen level in the dewar flasks during automatic filling and refilling, are fixed on dewar flasks d1 to d3, as are the thermocouples nbs1 and nbs3 for control of boiling out the nitrogen in dewar flasks d1 and d3. containers c1 and c2 are provided with thermocoax heating elements h1 and h2 wound in spiral grooves on the outer surfaces. the resistance of the heating elements is approx. 6 ohms, and the heating power is 150 w. the surface temperature of the containers is measured by thermocoax thermocouples t1 and t2, wound along the heating elements in several grooves and connected to two-level comparators. the working temperature is stabilized at 320 °c. in the event of accidental overheating, the comparators automatically switch off the heaters at 350 °c; restarting must be done manually.

the pressure of the gas in the containers during heating and transfer is controlled by pressure transducers pt1 to pt3, which are high vacuum tight and work up to 20 mpa. pressure transducer pt3 indicates the pressure of the gas in the target when valves v5 and v3 are closed. heating elements nbh1 and nbh3 (4 ohm resistor wire), for quick removal of liquid nitrogen by boiling it out, are installed on the bottom of dewar flasks d1 and d3. the heating power in the presence of liquid nitrogen in the flask is 145 w. when the nitrogen has been boiled out and the temperature, as measured by thermocouples nbs1 and nbs3, reaches 0 °c, the power is automatically reduced to 35 w. at this heating power the temperature is stabilized at about 80 °c. the switching and stabilization are accomplished by two-level thermostats. at the beginning, when heating starts, low power heating is set automatically. high power heating is switched on by a starting impulse, activated only if there is liquid nitrogen in the dewar flask.

the penning type pem vacuum gauge indicates the high vacuum level in the tubing between the turbomolecular pump and valve v0. the pirani vacuum gauge, for indicating pressure from 10⁻¹ to 10³ pa, can be separated from the rest of the apparatus by valve v6, which must be closed before the pressure in the connecting tubing rises above 0.1 mpa, otherwise its vacuum tightness will be lost.

the complete production system consisting of the target and the filling and recycling apparatus, together with auxiliary parts, is mounted on a mobile carriage. for irradiation, the carriage is moved to the proton beamline. all the electronic blocks are kept at a sufficient distance from the beam to prevent radiation damage.

3 target construction
the scheme of the target system is sketched in fig. 3; the assembly of the target is shown in fig. 4.

fig. 3: schematic drawing of the kr target
the complete production system consisting of the target, the filling and recycling apparatus, together with auxiliary parts, is mounted on a mobile carriage. for irradiation, the carriage is moved to the proton beamline. all the electronic blocks are kept at a sufficient distance from the beam to prevent radiation damage. 3 target construction the scheme of the target system is sketched in fig. 3; the assembly of the target is shown in fig. 4. all parts of the sys4 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 41 no. 2/2001 fig. 2: overall view of the filling and recycling apparatus at the development stage fig. 3: schematic drawing of the kr target acta polytechnica vol. 41 no. 2/2001 tem are made of material that is only to a small extent activated by scattered particles of the beam. a thin compact ni layer coats the interior of the tube, the entrance window of the target is made of ti foil, and the material of the scatter foil is pure al. a beam stopper is integrated at the tube outlet, which, like the surfaces of the other parts, is cooled by water. the target system is electrically insulated from the output accelerator flange. the target is operated at a pressure of 1–2 mpa, with proton beam energy ep = 28 mev and energy loss of 6–7 mev in the target gas. 4 operating procedure the molecular sieves in the storage (working) containers c2 (c1) are activated by heating them to 320 °c under continuous pumping by the turbomolecular pump tmp and the diaphragm vacuum pump dp. if the nominal vacuum is reached, valve v2 (v1) is closed. then the apparatus is evacuated to 10�2 pa. the gas from the transport flask (625 cm3 at 1 bar, 20 °c) is attached to the first filling fitting f1. the apparatus, including the fitting armature, is evacuated and when the pressure descends to 10�2 pa, and all valves are closed. the dewar flask of the working container c1 is filled with liquid nitrogen and the valves of the transport flask and valve v1 of the working container c1 are opened. when the pressure, indicated on pt1, descends to negative values, valve v6 is opened, and the pressure is measured by the pirani gauge. at a pressure below 0.5 pa the valve v1 and the valve of the transport flask are closed, as is v6. the transport flask is removed and filling fitting f1 is blinded. by opening valves v0 and v4, the aerated volume is evacuated again to 10�2 pa, and after valves v0 and v4 have been closed the apparatus is ready for the first cycle, i.e., for the transfer of the gas in the target. for this purpose, the dewar flask of the intermediate container (kr trap) c3 is filled with liquid nitrogen and container c1 is heated by thermocoax heating element h1 to 320 °c. if there is liquid nitrogen present in container c1, heater nbh1 is switched on and liquid nitrogen is boiled out. after that, valves v1 and v2 will be opened and the gas from c1 will freeze in container c3. at pressure lower than 0.1 mpa, valve v6 is opened and further decrease of the pressure is measured by the pirani gauge. when the pressure decreases below 150 pa, valve v5 is closed. in following steps the heating of c1 is switched off, d1 is filled with liquid nitrogen and the rest of the gas between valves v2, v4, v5 and the blinded filling fitting is transferred by cryogenic pumping to c1. valve v1 is closed when the pressure descends below 0.5 pa. after that, heater nbh3 in c3 is switched on and the liquid nitrogen boiled out. 
the pressure in the target, measured by pressure transducer pt3, rises to approximately 2 mpa, and from this moment the target is ready for irradiation by protons from the cyclotron. after the end of the irradiation, the gas is transferred by cryogenic pumping back either to working container c1 or to storage container c2; the volume of c2 is larger than the volume of c1, so the danger of escape of the gas during long-term storage is minimized. then the radioisotope deposited on the walls of the irradiation container is eluted for subsequent radiochemical preparation of the end product; the details are explained in [4]. after elution, the apparatus must be dried out, evacuated to the working vacuum and prepared for the next recycling.

5 approximate duration of individual process cycles
1. nominal working vacuum is attained in about 50 to 90 minutes from the start of the tmp, depending on how long the apparatus has been held under atmospheric pressure.
2. heating container c1 (c2) to 320 °c and stabilizing the gas pressure at 1.52 mpa (0.073 mpa) takes about 32 minutes.
3. after opening valves v1 and v5, when gas transfer starts from container c1, pressure stabilization (measured by pt3) at 0.64 mpa is attained in about 2 minutes.
4. freezing out the gas in trap c3 takes about 10 minutes.
5. heating trap c3 and transferring the gas to the target, until the pressure attains 2.17 mpa and the target is ready for irradiation, takes about 32 minutes.
6. recuperation of the gas in the tubing and in the dead volumes of the valves, by freezing it in c1 after the target has been filled and separated from the rest of the apparatus, lasts about 20 minutes.
7. back transfer of the gas from the target by freezing it in working container c1 lasts about 15 minutes.

fig. 4: photograph of the pressure gas target
fig. 5: pressure of the gas in container c1 as a function of time elapsed from the beginning of heating from 20 °c to 320 °c with 135 w heating power. the temperature is stabilized at 320 °c; valve v1 remains closed during heating

the pressure increase in container c1, containing 628 cm³ (20 °c; 0.1 mpa), during heating from room temperature to 320 °c and from –193 °c to 320 °c (including evaporation of liquid nitrogen) is plotted in fig. 5 and fig. 6.

6 main technical parameters
• diaphragm vacuum pump 2md4, alcatel, pumping speed 3.3 m³/h, ultimate pressure 200 pa.
• turbomolecular vacuum pump ath 20/40, alcatel, pumping speed 25/45 l/s, ultimate pressure < 10⁻⁸ pa (when in tandem with the diaphragm vacuum pump).
• container c1 – volume 60 cm³ including tubing and the dead volume of valve v1 and filter f, maximum pressure 2 mpa, filled in the cooled and heated region with 38.2 g of merck zeolite, type 5a, provided with filters (glass wool, stainless sieve, sintered stainless filter 50 µm) against escape of zeolite powder, temperature range –195 to 350 °c.
• container c2 – volume 463 cm³ including tubing and the dead volume of valve v2 and filter f, 60 cm³ of which belongs to the cooled and heated section, maximum pressure 2 mpa, filled in the cooled and heated region with 39 g of merck zeolite, type 5a, provided with filters (glass wool, stainless sieve, sintered stainless filter 50 µm) against escape of zeolite powder, temperature range of the heated section –195 to 350 °c.
• container c3 (kr trap) – volume about 1.5 cm³, maximum pressure 20 mpa.
• valves v0 to v6 – pneumatically actuated high-pressure bellows valves, maximum pressure 24 mpa, vacuum tight.
• tubing – stainless steel, od from 1/16” to 3/8”, maximum pressure 20 mpa, silver brazed, removable vacuum-tight joints.
• target – volume approx. 30 cm³ including dead volumes of tubing and adjacent valves.

7 conclusion
since the end of 2000, the apparatus has been in use for the production of ⁸¹rb/⁸¹ᵐkr by irradiation of krypton gas enriched to 99 % in ⁸²kr. our experience, confirmed in long-term experiments with natural krypton, has shown that the filling and recycling apparatus can guarantee up to 500 cycles without significant loss or pollution of the gas, if the operating instructions are strictly observed. this is very important when the filling gas is nearly monoisotopic ⁸²kr, which is expensive ($20/cm³). the apparatus can also be used for the production of radioisotopes by other nuclear reactions, for example ⁸²ᵐrb in the (p, n) reaction on krypton with a 90 % content of ⁸²kr.

acknowledgement
the authors would like to record their gratitude to mr. j. kříž and mr. v. němec from the department of dosimetry and application of ionizing radiation for their valuable collaboration and assistance in constructing the apparatus and in solving some technical problems during production of the mechanical and electronic sections of the apparatus.

references
[1] vognar, m., šimáně, č.: laboratory apparatus for preparation of 123i from 124xe by photonuclear reaction. acta polytechnica, vol. 36, no. 5/1996, pp. 49–56
[2] vognar, m., šimáně, č., chvátil, d.: recyklační aparatura pro přípravu 123i fotojadernou reakcí na 124xe. závěrečná zpráva k so č. 403396, kdaiz fjfi, praha, 1997
[3] groszowski, j.: technika vysokého vakua. sntl praha, 1981
[4] fišer, m., hanč, p., lebeda, o., hradílek, p., kopička, k.: development and production of 81rb/81mkr radionuclide generator in npi. czech j. phys. 49 (1999), suppl. s1, pp. 811–816

ing. miroslav fišer, dept. of radiopharmaceuticals, e-mail: fiser@ujf.cas.cz, nuclear physics institute of cas, 250 68 řež
ing. miroslav vognar, prof. ing. čestmír šimáně, drsc., ing. david chvátil, dept. of dosimetry & appl. of ionizing radiation, phone: +420 2 2323657, +420 2 2315212, fax: +420 2 2320861, e-mail: vognar@br.fjfi.cvut.cz, czech technical university in prague, faculty of nuclear sciences and physical engineering, břehová 7, 115 19 praha 1, czech republic

fig. 6: pressure of the gas in container c1 as a function of time from the beginning of heating from –193 °c to 320 °c. the beginning of the curve corresponds to the boiling out of liquid nitrogen from the dewar flask with heating power 270 w. when the temperature rises above zero, the power is reduced to 165 w. the temperature is stabilized at 320 °c. during heating, valve v1 remains closed.
acta polytechnica 55(6):393–400, 2015, doi:10.14311/ap.2015.55.0393, © czech technical university in prague, 2015, available online at http://ojs.cvut.cz/ojs/index.php/ap

finite element analysis of aortal bifurcation
jakub kronek (a, ∗), rudolf žitný (b)
a czech technical university in prague, faculty of mechanical engineering, department of mechanics, mechatronics and biomechanics, technická 4, 166 07 prague, czech republic
b czech technical university in prague, faculty of mechanical engineering, department of process engineering, technická 4, 166 07 prague, czech republic
∗ corresponding author: jakub.kronek@fs.cvut.cz

abstract. arterial bifurcations loaded by internal pressure are significant stress concentrators. increased mechanical stress inside the arterial wall probably accelerates pathogenic processes at these places. the stress concentration factor (scf) depends mainly on geometry, loading and material. this paper presents a map of scfs, calculated by fem, in the aortic bifurcation (ab) region loaded by static internal pressure. the influence of geometry (aortic diameter, wall thickness, bifurcation angle, "non-planarity" angle and radius of apex), material properties and internal pressure was evaluated statistically by regression of the fem results. two material variants were used (linear hooke and hyperelastic ogden). viscoelastic behaviour, anisotropy and prestrain were neglected. the results indicate that the highest mises stress appears on the inner side of the ab apex, and that the scf is negatively correlated with the bifurcation angle and with the internal pressure. the scf varies from 4.5 to 7.5 (hooke) and from 7 to 21 (ogden).

keywords: stress concentration factor; aorta; artery; bifurcation; branching.

1. introduction
atherosclerosis is a major cause of death in the western world [1, 2]. for a successful fight against this disease of civilisation, it is crucial to understand the processes that lead to or accelerate atherosclerosis. many biomechanical works [3–13] have dealt with the interaction between blood flow and the intimal surface of the arteries in the region of arterial bifurcation or branching. it has been shown that very low wall shear stress (less than 1 pa) accelerates the formation of atherosclerotic lesions in these regions [8, 14]. other works [15, 16] have indicated that a high level of transmural pressure, which causes tensile stress inside the arterial wall, may also cause degenerative atherosclerotic changes. arterial bifurcations are significant geometrical stress concentrators, which increase the mechanical stress many times in comparison with the level in non-branched regions. the stress concentration factor (scf) is the ratio of the maximum stress to the nominal stress in a non-branched artery. according to [17], the scf within the carotid bifurcation may reach values of more than 30. other analyses of scf have been published in [18–20]. the scf depends on many factors: firstly the geometry of the bifurcation, but the loading and the material properties also have an impact. the aim of our work is to find simple correlations for peak stresses and scf using only a small number of parameters (geometry, material, internal pressure). the aortal bifurcation was chosen as a suitable representative of arterial bifurcations.

2. methods
it is hard to measure stress directly within the arterial wall, so we used finite element (fe) modelling of aortic bifurcations.
a description of the ab geometry using a minimum number of parameters was the most important aspect of the design of the fe models. for this purpose, we carried out a literature review and supplemented it by our own measurements on cadavers. we attempted to select independent geometrical parameters and/or to find a statistically relevant relation between two or more geometrical parameters, e.g. the relation between the diameter of the abdominal aorta (aa) and the diameters of the common iliac arteries (cia).

2.1. geometry of ab
ab is the terminal part of the aa, which divides the blood flow between the left cia and the right cia. the geometry of ab generally corresponds to a slightly non-planar y-shaped bifurcation (non-planarity is characterised by the angle β). the bifurcation angles of the left cia (αl) and the right cia (αr) may be identical, but this is not necessarily the case. the transition between aa and cia is gradual, and may be characterised by the radius of an osculating circle (rl, rr), see fig. 1. both aa and cia generally have an elliptical (oval) cross-section. another non-uniformity may be caused by the fact that the wall thickness is not constant. an offset of the left and right cias may also be observed in some patients.

figure 1. schematic illustration of ab geometry with geometrical parameters marked.

some papers have presented measured values of the angles α and β [21–25], diameters of aa and cia [21, 22, 25, 26], wall thicknesses [27–29], eccentricity [25, 27] and radii of curvature [21, 22] (figure 1, table 1).

2.2. experimental measurements of geometries
measurements of 12 human abs resected from cadavers (age from 17 to 71 years) were made during autopsies in the department of forensic medicine of the kralovske vinohrady university hospital. the relevant ethical committee approved the use of human tissue in this study. each sample was photographed together with a length scale for evaluating the scale factor and the real dimensions (the software imagej was used for processing the pictures). the axes of the aa and the branches were identified more or less manually. the evaluated geometrical parameters, together with results published by other authors, are presented in tab. 1 (the angles β could not be evaluated from the photographs). the non-dimensional eccentricity eaa is defined as the maximum diameter (usually in the lateral direction) divided by the minimum diameter (usually in the antero-posterior direction). on the basis of our own measurements on 12 cadavers, estimates were made of the mean values of the diameters, daa = 13.8 mm and dcia = 8.7 mm, with standard deviations sdaa = 4.5 mm and sdcia = 3.8 mm. the inner diameters could not, of course, be evaluated from in situ photographs; it was necessary to use a different technique, based on extracted abs in the form of excised circular rings. the mean diameters were evaluated from the lengths of the rings. the rings were also used for evaluating the wall thickness profiles (the circular rings were cut and then stretched into strips, the thickness of which was measured using a laser scanner (micro-epsilon)). a significant correlation tcia = 0.89 taa was observed (α value < 0.01). a statistically evaluated reduction of diameters, dcia = 0.64 daa, seems to be a reasonable approximation of the murray law [30] (the principle of minimised dissipated energy and metabolic consumption) and the egm principle (entropy generation minimisation).
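for orientation, murray's law relates parent and daughter diameters through a cube law; under the simplifying assumption of a perfectly symmetric bifurcation (which the measured geometries only approximate), it fixes the diameter ratio:

$$ d_{aa}^{3} = d_{cia,l}^{3} + d_{cia,r}^{3}, \qquad d_{cia,l} = d_{cia,r} \;\Rightarrow\; \frac{d_{cia}}{d_{aa}} = 2^{-1/3} \approx 0.79 . $$

the measured mean ratio 8.7/13.8 ≈ 0.63 is of the same order, which is the sense in which the fitted reduction dcia = 0.64 daa can be read as an approximation of the law.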
table 1. geometrical parameters of ab evaluated from published works, together with our own measurements on cadavers:

source | n | age (years) | α (°) | β (°) | daa (mm) | dcia (mm) | taa (mm) | tcia (mm) | rl, rr (mm) | eaa (1)
[21, 22] | 37 | 71 ± 16 | 50 ± 16 | 18 ± 8 | 18.3 ± 3.8 | 10.9 ± 2.1 | — | — | 45 | —
[23] | 70 | 57 ± 16 | 35 ± 11 | — | — | — | — | — | — | —
[24] | 20 | — | — | 9.1 ± 5.6 | — | — | — | — | — | —
[25] | 12 | 27 to 84 | 34 ± 13 | 9.4 | 15.8 | 10.5 | — | — | — | 0.89
[26] | 11 | 63 ± 9 | — | — | 18.5 ± 1.2 | 8.0 ± 2.4 | — | — | — | —
[27] | 11 | 0 to 76 | — | — | — | — | 1 | — | — | 0.9
[28] | 196 | 45 to 84 | — | — | — | — | 2.2 ± 0.5 | — | — | —
[29] | 2466 | 19 to 67 | — | — | — | — | 1.8 ± 0.2 | — | — | —
our measurements on cadavers | 12 | 51 ± 17 | 55 ± 15 | — | 13.8 ± 4.5 | 8.7 ± 3.8 | 1.8 ± 0.4 | 1.6 ± 0.4 | — | —

figure 2. a fixed sample during the uniaxial tensile test.

2.3. experimental measurements of material properties
the mechanical properties of the arterial wall strips excised from the ab samples (the same samples as those used for evaluating the geometrical parameters) were identified in experiments carried out on a tensile testing machine for soft tissues (messphysik materials testing gmbh, fürstenfeld, austria) equipped with a 100 n load cell. the ends of the strips were fixed by two clamps with pins (figure 2). the samples were preconditioned by 4 loading cycles up to a deformation of approximately 15 % before ultimate failure loading. the deformations of the stretched strips were obtained via an image analysis of video records performed by a matlab script that was developed in-house. the mechanical properties (described by an isotropic linear hookean model and alternatively by a hyperelastic ogden constitutive equation) were identified using a regression analysis of the stress-strain data. the two-parametric ogden model [31] was defined by the following strain energy function

$$ w = \frac{2\mu}{\alpha^{2}} \left( \bar\lambda_{1}^{\alpha} + \bar\lambda_{2}^{\alpha} + \bar\lambda_{3}^{\alpha} - 3 \right) + \frac{1}{d} \left( j_{el} - 1 \right)^{2}, $$

where $\bar\lambda_{i}$ are the deviatoric principal stretches and $j_{el}$ is the elastic volume deformation. the model parameters for individual samples were identified by a mathcad script and were averaged, giving mean values µ = 0.119 mpa, α = 21.99, d = 0.338 for the ogden model, and the young modulus of elasticity e = 1.6 mpa and the poisson constant ν = 0.49 for the linear hooke model. more details are presented in [32].
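as a sanity check on the identified parameters, the one-term ogden energy above admits a closed-form uniaxial response for an incompressible strip (λ₂ = λ₃ = λ⁻¹ᐟ²). the short python sketch below evaluates it with the mean parameters quoted above; the function is ours, for illustration only, and is not the matlab/mathcad code used in the study:

def ogden_uniaxial_cauchy(stretch, mu=0.119, alpha=21.99):
    # cauchy stress [mpa] in incompressible uniaxial tension for
    # w = (2*mu/alpha**2) * (l1**a + l2**a + l3**a - 3), which gives
    # sigma = (2*mu/alpha) * (stretch**alpha - stretch**(-alpha/2))
    return (2.0 * mu / alpha) * (stretch ** alpha - stretch ** (-alpha / 2.0))

for lam in (1.0, 1.1, 1.2):
    print(f"stretch {lam:.1f}: sigma = {ogden_uniaxial_cauchy(lam):.3f} mpa")

with these parameters the response is very soft below a stretch of about 1.1 and stiffens steeply towards the stretch of about 1.2 quoted later for the reference pressure, which is the qualitative behaviour expected of the aortic wall.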
2.4. fe models
thirty-two different geometries of fe models were created in autodesk inventor 2012 3d modelling software, using standard modelling tools (2d sketching, extrude, sweep along a line, chamfer of edges), and were exported to the abaqus fe program. the 32 geometries have different combinations of the five most important geometrical variables, daa ≈ (10–18) mm, taa ≈ (0.8–2.4) mm, rab ≈ (0–2.8) mm, α ≈ (18–82)°, β ≈ (0–32)°; these parameters were distributed according to the principles of rsm (response surface methodology). other geometrical parameters were either correlated with the varying parameters, tcia = 0.89 taa, dcia = 0.64 daa, or were fixed, eaa = ecia = 0.9, rr = rl = 45 mm. the values and correlations were selected on the basis of the previous geometrical study. these 32 geometries did not correspond directly to any measured ab; their geometrical parameters were chosen only to be within the ranges evaluated in the previous morphometric study.

for each geometry, abaqus calculated the stress distributions (and therefore the scfs) at a constant systolic pressure load of 120 mmhg, using alternately the hooke model and the ogden model, and using the material parameters presented in the previous paragraph (thus the same material parameters and the same load were applied for all 32 geometries). the reference load of 120 mmhg is so high that the geometrical nonlinearities and also the material nonlinearities are significant (typical tangential stretches corresponding to this pressure are up to 1.2). in order to assess the effects of large deformations and the limit of the linear range, five typical geometries were selected and calculated with internal pressures rising from 100 mmhg up to 190 mmhg. the effect of the variability of the material parameters was tested only for the linear hooke model: in addition to the reference values (e = 1.6 mpa, ν = 0.49), the young elastic modulus was varied from e = 0.6 mpa to 26 mpa for four typical geometries at the reference load of 120 mmhg, with constant ν = 0.49, because the aortic wall is practically incompressible and lower values of the poisson constant are of no practical significance. numerical experiments indicate that scfs calculated at the highest stiffness (e = 26 mpa) are close to, but not exactly within, the linear region.

3d quadratic brick elements c3d20r were used to create the mesh. the mesh was mapped in each case to ensure the same mesh density (5 elements through the wall thickness and a 0.1 mm width of the first element from the plane of symmetry in the region of the apex). in addition, a mesh convergence test was carried out with one selected geometry. five meshes were created (from very coarse to very dense), and two mesh density parameters were defined: the first parameter is the total number of elements, and the second parameter is the reciprocal of the width of the first element (which is normally 0.1 mm). the solution (mises stress) converged with the two mesh density parameters (figure 3).

figure 3. test of mesh convergence. the solution (the highest mises stress value in the apex of bifurcation) as a function of the first and second parameter of mesh density.
figure 4. the stress was tracked on two paths. the figure shows the starting points and the direction of the paths.

the calculated mises stress was tracked on two 1d paths. path 1 leads from the highest stress peak in the apex caudally on the inner surface of the cia (figure 4). path 2 leads from the second stress peak on the outer rear side of ab cranially on the aa (figure 4). three variables were evaluated from each path: the maximum mises stress in the stress concentrations at the beginnings of the path (σmax1, σmax2), the nominal (stable) mises stress (σnomin1, σnomin2), and the distances from the beginning of the paths (lk1, lk2) where the stress drops almost to its nominal value (σ < 1.1 σnomin). the stress concentration factors k1 and k2 were evaluated simply as k1 = σmax1/σnomin1 and k2 = σmax2/σnomin2.

2.5. analytical approximation of scf, maximum stresses and range
it is assumed that the following factorisation of pressure, material parameters and geometry can be used for a quick estimate of the scf:

$$ k_1 = f_1(p, e)\, g_1(\alpha, \beta, d_{aa}, t_{aa}, r_{ab}), \qquad k_2 = f_2(p, e)\, g_2(\alpha, \beta, d_{aa}, t_{aa}), $$

where the pressure correction factor $f_i$ is a linear function

$$ f_i = 1 + f_{i0}\, \frac{p}{e}. $$
for the ogden model the pressure correction factor is defined as

$$ f_i = 1 + f_{i0}\, \frac{p}{\mu}, $$

even though the fe calculations were performed with only one set of ogden model parameters, therefore only for µ = 0.119 mpa. the geometric factor was suggested in linear form (linear with respect to the selected base functions). the use of dimensionless base functions reduces the number of parameters to five (k1) or four (k2):

$$ g_1 = a_{10} + a_{11}\,\frac{d_{aa}}{t_{aa}} + a_{12}\cos^2\frac{\alpha}{2} + a_{13}\sqrt{1 - \cos\beta} + a_{14}\,\frac{r_{ab}}{d_{aa}}, $$
$$ g_2 = a_{20} + a_{21}\,\frac{d_{aa}}{t_{aa}} + a_{22}\cos^2\frac{\alpha}{2} + a_{23}\sqrt{1 - \cos\beta}. $$

the basis function $\sqrt{1 - \cos\beta}$ was suggested by analogy with a pressurised bent pipe, and the basis function $\cos^2(\alpha/2)$ was motivated by the method which estimates the stresses in the apex using membrane behaviour [33]. the resulting analytical model has 6 dimensionless parameters (fi0, ai0, ai1, ai2, ai3, ai4), which were identified by regression analysis (modified newton method) of approximately 120 scf values calculated by abaqus for the hooke model (32 different geometries at a reference pressure of 120 mmhg and for reference material parameters, plus different pressures and different moduli of elasticity for three geometries) and about 90 scf values calculated for the ogden model (only 16 geometries, 6 of them with varying pressure, and only one set of material parameters). the two identified sets of parameters make it easy to estimate the four scf values (separately for the hooke model and for the ogden model, and separately for the inner surface of the cia and for the outer surface of the aa). these values also enable quick estimates of the maximum mises stresses based on the nominal membrane stresses:

$$ \sigma_{1,max} = k_1\, \frac{p\, d_{cia}}{2 t_{cia}}\, \varphi_1, \qquad \sigma_{2,max} = k_2\, \frac{p\, d_{aa}}{2 t_{aa}}\, \varphi_2. $$

the value ϕi = 1 corresponds to thin circular tubes. a more accurate estimate can be based on an analytical solution for a pressurized elastic (and incompressible) elliptical pipe [34], with the correction factor ϕ corresponding to the known eccentricity e, relative thickness and relative pressure:

$$ \varphi_1 = 1 + \frac{3}{2}\,\frac{d_{cia}}{t_{cia}}\,\frac{1 - e_{cia}}{1 + e_{cia}}\, \frac{1}{1 + \frac{3}{8}\,\frac{p}{e}\left(\frac{d_{cia}}{t_{cia}}\right)^3}, $$
$$ \varphi_2 = 1 + \frac{3}{2}\,\frac{d_{aa}}{t_{aa}}\,\frac{1 - e_{aa}}{1 + e_{aa}}\, \frac{1}{1 + \frac{3}{8}\,\frac{p}{e}\left(\frac{d_{aa}}{t_{aa}}\right)^3}. $$

the dependences of lk1 and lk2 on the geometry were estimated in the form lk1 = c1 √(dcia tcia) and lk2 = c2 √(daa taa), where c1 and c2 are constants, which were found by comparison with the fe results.
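a minimal python helper for the two membrane-stress estimates above; the function names are ours, and consistent units are assumed (the pressure and the modulus in the same units, diameters and thicknesses in mm):

def phi(d, t, e, p_over_modulus):
    # ellipticity correction for a pressurized elliptical tube [34]
    return 1.0 + (1.5 * (d / t) * (1.0 - e) / (1.0 + e)
                  / (1.0 + 0.375 * p_over_modulus * (d / t) ** 3))

def sigma_max(k, p, d, t, e, modulus):
    # peak mises stress estimate: k * p*d/(2t) * phi
    return k * p * d / (2.0 * t) * phi(d, t, e, p / modulus)

# e.g. a cia with the measured mean geometry at 120 mmhg (~0.016 mpa), e = 1.6 mpa:
print(sigma_max(k=5.5, p=0.016, d=8.7, t=1.6, e=0.9, modulus=1.6))  # ~0.30 mpa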
3. results and discussion
examples of the mises stress along path 1 are displayed in figure 5. the graph contains results from linearly elastic variants of five models (y01 to y05), in which only daa varies; the rest of the parameters remain constant. tables 2 and 3 summarise the outcomes of the regression analysis. the prediction ability of the regression models of k1 (for both linearly elastic and hyperelastic materials) is demonstrated by the graphs in figure 6.

figure 5. mises stress tracked on path 1. only the beginnings of the path are displayed. the curves correspond to linearly elastic fe models (y01, y02, ..., y05), where only one parameter (daa) has been varied; the rest of the geometrical parameters remain constant. the graph shows exact values of the mises stress under the same loading (pressure 120 mmhg).
figure 6. a comparison of the k1 values evaluated from fem and the k1 values calculated by the regression model. the graph on the left represents the set of linearly elastic models, while the graph on the right represents the set of hyperelastic models.

some results can be interpreted on the basis of an examination of the regression coefficients: when the bifurcation angle decreases by 10°, k1 increases on average by 10 %; increasing the non-planarity angle by 10° causes an increase of k1 on average by 5 %. changes of the relative wall thickness have only a slight impact on k1. it is clear that some factors that have been neglected would affect the stress distribution. for example, a published fe model of ab [35] showed a 15 % increase in scf when an orthotropic model was used. a 7 % increase in scf was reported when an orthotropic model was used in [16]. according to an fe model of carotid bifurcation [19], the increase in stress within the apex, using an anisotropic model, was about 18 %. however, the prestrain of the arterial wall should reduce the scf in the apex [20]. both anisotropy and prestrain have been neglected in our study. a 50 % increase in the young modulus of elasticity should increase the scf in the apex by 7 %, according to [35]. a positive correlation between the young modulus and k1 has also been evidenced in our study.

3.1. sizes of the affected regions
only relatively small regions of the arterial wall are affected by the stress concentrators (k1 and k2). the lengths lk1 and lk2 were evaluated with the proportionality constants 1.66 and 2.81, respectively:

$$ l_{k1} = 1.66\,\sqrt{d_{cia}\, t_{cia}}, \qquad l_{k2} = 2.81\,\sqrt{d_{aa}\, t_{aa}}. $$

3.2. qualification of the stress in stress concentrations
six components of the stress, both in the apex and in the rear side of ab, are shown in figure 7. we can say that the dominant stress component which loads the arterial wall within the apex is the normal stress in the antero-posterior direction; this stress component is almost equal to the evaluated mises stress. the dominant stress components in the second stress concentrator are the tangential stress (= 1.12 σmax2) and the axial stress (= 0.68 σmax2). this stress state does not differ dramatically from the state in aa or in cia far away from ab.

figure 7. stress components in the apex (the six pictures on the left) and in the outer rear side (the six pictures on the right). the z axis corresponds to the axis of the aorta; the x axis corresponds to the antero-posterior direction; the y axis corresponds to the lateral direction; r is the radial coordinate of aa, and θ is the circumferential coordinate. the stress components are expressed as ratios to the mises stresses (σ1,max or σ2,max).

table 2. regression coefficients evaluated from the results of the (hooke) linearly elastic fe models:
 | a0 | a1 | a2 | a3 | a4 | f0
k1 | −0.95 | −0.051 | 8.18 | 2.34 | 1.33 | −0.00079
k2 | 2.27 | −0.022 | 1.45 | 2. | — | −0.000057

table 3. regression coefficients evaluated from the results of the (ogden) hyperelastic fe models:
 | a0 | a1 | a2 | a3 | a4 | f0
k1 | 19.7 | −0.031 | 5.17 | 3.28 | −138 | −0.00025
k2 | 4.7 | −0.02 | 0.082 | 6.13 | — | −0.000023
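to illustrate how the regression model is meant to be used, the sketch below evaluates k1 from the hooke coefficients in table 2. the function and the example geometry are ours; in particular, the paper does not spell out the units behind the ratio p/e, so it is taken here as dimensionless (p and e in the same units), which makes the pressure correction negligible for the physiological example:

import math

A1_HOOKE = (-0.95, -0.051, 8.18, 2.34, 1.33)   # a10 ... a14 from table 2
F10_HOOKE = -0.00079

def k1_hooke(p_over_e, alpha_deg, beta_deg, d_aa, t_aa, r_ab):
    a0, a1, a2, a3, a4 = A1_HOOKE
    g1 = (a0
          + a1 * d_aa / t_aa
          + a2 * math.cos(math.radians(alpha_deg) / 2.0) ** 2
          + a3 * math.sqrt(1.0 - math.cos(math.radians(beta_deg)))
          + a4 * r_ab / d_aa)
    return (1.0 + F10_HOOKE * p_over_e) * g1   # f1 * g1

# a geometry near the measured means (120 mmhg ~ 0.016 mpa, e = 1.6 mpa):
print(k1_hooke(p_over_e=0.016 / 1.6, alpha_deg=55, beta_deg=18,
               d_aa=14.0, t_aa=1.6, r_ab=1.0))   # ~5.7, inside the 4.5-7.5 range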
4. conclusion
simple regression models predicting the scf and the size of the affected region in two regions of interest have been proposed. they take into account the geometry, the internal pressure and the material parameters. the simple regression models may be used by physicians for a quick estimate of whether or not the aortic bifurcation of a specific patient poses a high risk due to a high level of mechanical stress. according to the data, the risk is mainly due to a small bifurcation angle and/or a high non-planarity angle. a sharp apex radius should also raise k1, but this was observed only in the hyperelastic set of models. both k1 and k2 decrease with increasing relative pressure, both in the case of the linearly elastic models and in the case of the hyperelastic models (but the maximum stress increases with pressure, of course). a positive correlation between the young modulus and k1 has also been evidenced in our study.

references
[1] j. c. wang and m. bennett, "aging and atherosclerosis: mechanisms, functional consequences, and potential therapeutics for cellular senescence," circulation research, vol. 111, no. 2, pp. 245-259, 2012. doi:10.1161/circresaha.111.261388
[2] d. p. faxon, m. a. creager, s. c. smith, r. c. pasternak, j. w. olin, m. a. bettmann, m. h. criqui, r. v. milani, j. loscalzo, j. a. kaufman, d. w. jones and w. h. pearce, "atherosclerotic vascular disease conference: executive summary: atherosclerotic vascular disease conference proceeding for healthcare professionals from a special writing group of the american heart association," circulation, vol. 109, no. 21, pp. 2595-2604, 2004. doi:10.1161/01.cir.0000128517.52533.db
[3] r. beare, g. das, m. ren, w. chong and t. phan, "does the principle of minimum work apply at the carotid bifurcation: a retrospective cohort study," vol. 11, no. 1, pp. 1-6, 2011. doi:10.1186/1471-2342-11-17
[4] c. taylor, t. hughes and c. zarins, "finite element modeling of three-dimensional pulsatile flow in the abdominal aorta: relevance to atherosclerosis," annals of biomedical engineering, vol. 26, pp. 975-987, 1998. doi:10.1114/1.140
[5] e. cecchi, c. giglioli, s. valente, c. lazzeri, g. f. gensini, r. abbate and l. mannini, "role of hemodynamic shear stress in cardiovascular disease," atherosclerosis, vol. 214, no. 2, pp. 249-256, 2011. doi:10.1016/j.atherosclerosis.2010.09.008
[6] a. smedby, "geometrical risk factors for atherosclerosis in the femoral artery: a longitudinal angiographic study," annals of biomedical engineering, vol. 26, no. 3, pp. 391-397, 1998. doi:10.1114/1.121
[7] y. s. chatzizisis, a. u. coskun, m. jonas, e. r. edelman, c. l. feldman and p. h. stone, "role of endothelial shear stress in the natural history of coronary atherosclerosis and vascular remodeling: molecular, cellular, and vascular behavior," journal of the american college of cardiology, vol. 49, no. 25, pp. 2379-2393, 2007. doi:10.1016/j.jacc.2007.02.059
[8] f. gijsen, a. van der giessen, a. van der steen and j. wentzel, "shear stress and advanced atherosclerosis in human coronary arteries," journal of biomechanics, vol. 46, no. 2, si, pp. 240-247, jan 18 2013. doi:10.1016/j.jbiomech.2012.11.006
[9] s. zhao, b. ariff, q. long, a. hughes, s. thom, a. stanton and x. xu, "inter-individual variations in wall shear stress and mechanical stress distributions at the carotid artery bifurcation of healthy humans," journal of biomechanics, vol. 35, no. 10, pp. 1367-1377, 2002. doi:10.1016/s0021-9290(02)00185-9
[10] s. van wyk, l. p. wittberg and l. fuchs, "wall shear stress variations and unsteadiness of pulsatile blood-like flows in 90-degree bifurcations," computers in biology and medicine, vol. 43, no. 8, pp. 1025-1036, 2013. doi:10.1016/j.compbiomed.2013.05.008
[11] m. i. papafaklis, c. v. bourantas, p. e. theodorakis, c. s. katsouras, d. i. fotiadis and l. k.
michalis, "association of endothelial shear stress with plaque thickness in a real three-dimensional left main coronary artery bifurcation model," international journal of cardiology, vol. 115, no. 2, pp. 276-278, 2007. doi:10.1016/j.ijcard.2006.04.030
[12] x. kang, "assessment of the pulsatile wall shear stress in the stenosed and recanalized carotid bifurcations by the lattice boltzmann method," computers & fluids, vol. 97, no. 0, pp. 156-163, 2014. doi:10.1016/j.compfluid.2014.04.011
[13] p. evegren, l. fuchs and j. revstedt, "wall shear stress variations in a 90-degree bifurcation in 3d pulsating flows," medical engineering & physics, vol. 32, no. 2, pp. 189-202, 2010. doi:10.1016/j.medengphy.2009.11.008
[14] a. malek, s. alper and s. izumo, "hemodynamic shear stress and its role in atherosclerosis," jama - journal of the american medical association, vol. 282, no. 21, pp. 2035-2042, dec 1 1999. doi:10.1001/jama.282.21.2035
[15] s. wolf and n. werthessen, "hemodynamic contribution to atherosclerosis," in dynamics of arterial flow, vol. 115, s. wolf and n. werthessen, eds., springer us, 1979, pp. 353-466. doi:10.1007/978-1-4684-7508-1_7
[16] m. thubrikar, vascular mechanics and pathology, springer, 2007. doi:10.1007/978-0-387-68234-1
[17] r. s. salzar, m. j. thubrikar and r. t. eppink, "pressure-induced mechanical stress in the carotid artery bifurcation: a possible correlation to atherosclerosis," journal of biomechanics, vol. 28, no. 11, pp. 1333-1340, 1995. doi:10.1016/0021-9290(95)00005-3
[18] a. creane, e. maher, s. sultan, n. hynes, d. j. kelly and c. lally, "finite element modelling of diseased carotid bifurcations generated from in vivo computerised tomographic angiography," computers in biology and medicine, vol. 40, no. 4, pp. 419-429, 2010. doi:10.1016/j.compbiomed.2010.02.006
[19] i. hariton, g. debotton, t. gasser and g. holzapfel, "how to incorporate collagen fibers orientations in an arterial bifurcation," in proceedings of the third iasted international conference on biomechanics, 2005.
[20] a. delfino, n. stergiopulos, j. m. jr and j.-j. meister, "residual strain effects on the stress field in a thick wall finite element model of the human carotid bifurcation," journal of biomechanics, vol. 30, no. 8, pp. 777-786, 1997. doi:10.1016/s0021-9290(97)00025-0
[21] b. nanayakkara, c. gunarathne, a. sanjeewa, k. gajaweera, a. dahanayake, u. sandaruwan and u. de silva, "geometric anatomy of the aortic-common iliac bifurcation," galle medical journal, vol. 12, no. 1, 2009. doi:10.4038/gmj.v12i1.1078
[22] p. m. shah, "geometric anatomy of the aortic-common iliac bifurcation," journal of anatomy, vol. 126, no. 3, pp. 451-458, 1978.
[23] c. bargeron, g. hutchins, g. moore, o. deters, f. mark and m. friedman, "distribution of the geometric parameters of human aortic bifurcations," arteriosclerosis, thrombosis, and vascular biology, vol. 6, no. 1, pp. 109-113, 1986. doi:10.1161/01.atv.6.1.109
[24] m. h. friedman and z. ding, "variability of the planarity of the human aortic bifurcation," medical engineering & physics, vol. 20, no. 6, pp. 469-472, 1998. doi:10.1016/s1350-4533(98)00039-3
[25] p. oflynn, g. o'sullivan and a. pandit, "geometric variability of the abdominal aorta and its major peripheral branches," annals of biomedical engineering, vol. 38, no. 3, pp. 824-840, 2010. doi:10.1007/s10439-010-9925-5
[26] j. j. yeung, h. j. kim, t. a. abbruzzese, i. e. vignon-clementel, m. t. draney-blomme, k. k. yeung, i. perkash, r. j. herfkens, c. a. taylor and r. l.
dalman, "aortoiliac hemodynamic and morphologic adaptation to chronic spinal cord injury," journal of vascular surgery, vol. 44, no. 6, pp. 1254–1265.e1, 2006. doi:10.1016/j.jvs.2006.08.026
[27] n. maclean and m. roach, "thickness, taper, and ellipticity in the aortoiliac bifurcation of patients aged 1 day to 76 years," heart and vessels, vol. 13, no. 2, pp. 95-101, 1998. doi:10.1007/bf01744592
[28] a. li, i. kamel, f. rando, m. anderson, b. kumbasar, j. lima and d. bluemke, "using mri to assess aortic wall thickness in the multiethnic study of atherosclerosis: distribution by race, sex, and age," american journal of roentgenology, vol. 182, no. 3, pp. 593-597, 2004. doi:10.2214/ajr.182.3.1820593
[29] e. b. rosero, r. m. peshock, a. khera, p. clagett, h. lo and c. h. timaran, "sex, race, and age distributions of mean aortic wall thickness in a multiethnic population-based sample," journal of vascular surgery, vol. 53, no. 4, pp. 950-957, 2011. doi:10.1016/j.jvs.2010.10.073
[30] c. d. murray, "the physiological principle of minimum work: i. the vascular system and the cost of blood volume," proceedings of the national academy of sciences of the united states of america, vol. 12, no. 3, pp. 207-214, 1926. doi:10.1073/pnas.12.3.207
[31] r. w. ogden, "large deformation isotropic elasticity - on the correlation of theory and experiment for incompressible rubberlike solids," proceedings of the royal society of london a: mathematical, physical and engineering sciences, vol. 326, no. 1567, pp. 565-584, 1972.
[32] j. kronek, l. horny, t. adamek, h. chlup and r. zitny, "site-specific mechanical properties of aortic bifurcation," 2014. doi:10.1007/978-3-319-00846-2_232
[33] v. krupka and p. schneider, stavba chemických zařízení i - skořepiny tlakových nádob a nádrží, vysoké učení technické v brně, 1986.
[34] a. austin and j. swannell, "stresses in a pipe bend of oval cross-section and varying wall thickness loaded by internal pressure," international journal of pressure vessels and piping, vol. 7, no. 3, pp. 167-182, 1979. doi:10.1016/0308-0161(79)90016-4
[35] l. manuel, "a study of the stress concentration at the branch points of arteries in vivo by the finite element method," 1986.
acta polytechnica 53(supplement):803–806, 2013, doi:10.14311/ap.2013.53.0803, © czech technical university in prague, 2013, available online at http://ojs.cvut.cz/ojs/index.php/ap

the astro-h mission
yoshitomo maeda (a, ∗), tadayuki takahashi (a), kazuhisa mitsuda (a), richard kelley (b), on behalf of the astro-h team
a institute of space and astronautical science (isas), jaxa, kanagawa, 252-5210, japan
b nasa/goddard space flight center, greenbelt, md 20771, usa
∗ corresponding author: ymaeda@astro.isas.jaxa.jp

abstract. a review of the astro-h mission is presented here on behalf of the astro-h collaboration. the joint jaxa/nasa astro-h mission is the sixth in a series of highly successful x-ray missions initiated by the institute of space and astronautical science (isas). one of the main unique features of the astro-h satellite is the high sensitivity and imaging capability over the wide energy band from 0.3 kev to 600 kev. this coverage is achieved by combining the four instruments sxs, sxi, hxi and sgd. the other main unique feature is a spectroscopic capability, not only for point-like sources but also for extended sources, with the high spectral resolution of δe ∼ 4 ÷ 7 ev of the sxs. using the unique powers of these instruments, astro-h will address unresolved issues in high-energy astrophysics.

keywords: x-ray, hard x-ray, gamma-ray, x-ray astronomy, gamma-ray astronomy, micro-calorimeter.

1. introduction and current status
astro-h (formerly known as "next") is a facility-class mission to be launched on a jaxa h-iia rocket into low earth orbit in 2014 [1–3]. nasa has selected us participation in astro-h as a mission of opportunity. up to now, more than 160 scientists from japan, the us, europe and canada have contributed to the astro-h mission. astro-h will be launched into a circular orbit with altitude 500÷600 km and inclination 31 degrees or less. the total mass at launch will be as much as 2700 kg, while the length is about 14 m in orbit. astro-h is a pointing-observation-type satellite; each target is pointed at until the integrated observing time is accumulated, and the satellite then slews to the next target. all the instruments introduced in the later sections are co-aligned and will operate simultaneously. astro-h has just completed the first half of the critical design review (cdr-1), which is required before making the system tests using the thermal test model (ttm) or the mechanical test model (mtm) and producing a part of the flight modules. in this paper, we summarize the mission concept and the current baseline configuration of the instruments of astro-h. a comprehensive introduction and the most recent reviews of the astro-h mission will be presented at the spie meeting in july, 2012 [3].

2. spacecraft and instruments
astro-h will carry three focusing-optics-type telescopes (sxs, sxi, hxi) and one compton telescope (sgd).
the conceptual design of each instrument is shown in fig. 1, and the requirements and specifications of the instruments, based on the baseline design, are summarized in tab. 1.

figure 1. configuration of the astro-h satellite [3].

table 1. summary of the four instruments of astro-h [3]:
parameter | hard x-ray imager (hxi) | soft x-ray spectrometer (sxs) | soft x-ray imager (sxi) | soft γ-ray detector (sgd)
detector technology | si/cdte cross-strips | micro calorimeter | x-ray ccd | si/cdte compton camera
focal length | 12 m | 5.6 m | 5.6 m | –
effective area | 300 cm² @ 30 kev | 210 cm² @ 6 kev, 160 cm² @ 1 kev | 360 cm² @ 6 kev | > 20 cm² @ 100 kev (compton mode)
energy range | 5 ÷ 80 kev | 0.3 ÷ 12 kev | 0.5 ÷ 12 kev | 40 ÷ 600 kev
energy resolution (fwhm) | 2 kev (@ 60 kev) | < 7 ev (@ 6 kev) | < 200 ev (@ 6 kev) | < 4 kev (@ 60 kev)
angular resolution | < 1.7 arcmin | < 1.3 arcmin | < 1.3 arcmin | –
effective field of view | ∼ 9 × 9 arcmin² | ∼ 3 × 3 arcmin² | ∼ 38 × 38 arcmin² | 0.6 × 0.6 deg² (< 150 kev)
time resolution | 25.6 µs | 5 µs | 4 s / 0.1 s | 25.6 µs
operating temperature | −20 °c | 50 mk | −120 °c | −20 °c

the soft x-ray spectrometer (sxs) achieves high-resolution spectroscopy with imaging [4, 5]. the sxs has the x-ray calorimeter spectrometer (xcs), focused by the soft x-ray telescope (sxt-s) [6, 7]. the energy resolution of the xcs is as good as ≤ 7 ev in the 0.3÷12 kev band pass, with an effective area at 6 kev of ∼ 210 cm². the detector array corresponds to a field of view of 3.04 arcmin on a side. a filter wheel (fw) assembly, which includes a wheel with selectable filters and a set of modulated x-ray sources, will be installed between the sxt-s and the xcs. the fw is able to rotate a suitable filter into the beam to optimize the count rate [9], since a high count rate degrades the energy resolving power. in addition to the filters, a set of on-off-switchable x-ray calibration sources, using a light-sensitive photocathode, will be implemented. these calibration sources will allow proper gain and linearity calibration of the detector in flight.

figure 2. (top) the effective area of sxs. (below) the energy resolution obtained from mn kα1 using a detector from the xrs program, but with a new sample of absorber material (hgte) that has lower specific heat, leading to an energy resolution of 3.7 ev (fwhm). sxs could have an energy resolution approaching this value [8].

the x-ray-sensitive silicon charge-coupled devices (ccds) of the soft x-ray imaging system (sxi) provide a low-background and high-energy-resolution tool. the sxi will consist of a telescope (sxt-i) and the focal plane detector "soft x-ray imager" [11, 12]. the design of the sxt-i is the same as that of the sxt-s [6, 7]. the sxi will use next-generation hamamatsu ccd chips with a thick depletion layer of 200 µm, low noise, and almost no cosmetic defects. the sxi features a large fov and covers a 38 × 38 arcmin² region on the sky, complementing the smaller fov of the sxs calorimeter. the effective area is larger than one unit of the suzaku xis. the hard x-ray imaging system (hxi) consists of the hard x-ray telescope (hxt) and its hard x-ray imager focal plane detector, and enables imaging spectroscopy in the 5÷80 kev band. the hxt has conical-foil mirrors with depth-graded multilayer reflecting surfaces that provide a 5÷80 kev energy range [13, 14].
the focal plane imager consists of four layers of 0.5 mm thick double-sided silicon strip detectors (dssd) and one layer of a 0.75 mm thick cdte double-sided cross-strip detector [15, 16]. the soft gamma-ray detector (sgd) is a compton telescope which is sensitive in a broad band of 40÷600 kev [17, 18]. the sensitivity at 300 kev is 10 times better than that of the suzaku hxd. it outperforms previous soft γ-ray instruments in background rejection capability by adopting a new concept of a narrow-fov compton telescope. the sgd can be operated in two modes. one is the "photo absorption" mode, in which the larger effective area is achieved but the imaging is not functional. the other is the compton mode, in which the sgd is capable of measuring polarization as well as imaging.

figure 3. detection limits of the sxt-i/sxi, hxt/hxi and sgd for point sources as functions of x-ray energy, where spectral binning with δe/e = 0.5 and a 1000 ks exposure are assumed [3].

3. expected scientific performance
in contrast to the high-resolution gratings of chandra/heg and xmm-newton/rgs, at e > 2 kev the sxs is the more sensitive and has higher resolution (fig. 2). it is notable that in the fe k band the sxs has 10 times the collecting area and much better energy resolution than chandra/heg. the sxs band covers the critical inner-shell emission and absorption lines of fe i–xxvi between 6.4 and 9.1 kev, often seen in the spectra of agn or galactic bh candidates. the sxs uniquely performs high-resolution spectroscopy of extended sources, such as clusters of galaxies and supernova remnants, because it is non-dispersive. for sources with an angular extent larger than 10 arcsec, the chandra meg energy resolution is degraded down to the level of a ccd. the energy resolution of the xmm-newton rgs is similarly degraded for sources with an angular extent ≥ 2 arcmin. the sxs makes possible high-resolution spectroscopy of sources inaccessible to current grating instruments. figure 3 shows the broad-band sensitivity achieved by combining the two instruments, hxi and sgd, for point sources. the sensitivity at 30 and 300 kev is two orders and one order of magnitude, respectively, better than the suzaku hxd. a cloud of cool gas has been spotted approaching the supermassive black hole sgr a*, which lies at the center of our milky way galaxy [19]. in 2013, one year before the astro-h launch, the cloud may reach the event horizon, starting bright x-ray flaring activity [20]. with its broad-band capability, astro-h will catch the flare, with a flux brighter than 10³⁵–10³⁶ erg s⁻¹, and shed new light on the black hole's feeding behavior.

acknowledgements
the authors are deeply grateful for the ongoing contributions provided by other members of the astro-h team in japan, the us, europe and canada.

references
[1] next satellite proposal, the next working group, submitted to isas/jaxa (2003)
[2] t. takahashi, k. mitsuda, r.l. kelley et al., "the astro-h mission", proc. spie 7732, pp. 77320z-77320z-18 (2010)
[3] t. takahashi et al., "the astro-h mission", proc. spie 8443, submitted (2012)
[4] k. mitsuda et al., "the high-resolution x-ray microcalorimeter spectrometer system for the sxs on astro-h", proc. spie 7732, pp. 773211-773211-10 (2010)
[5] r.l. kelley et al., "the high resolution microcalorimeter soft x-ray spectrometer for the astro-h mission", proc. spie 8443, submitted (2012)
[6] p.
serlemitsos et al., "foil x-ray mirrors for astronomical observations: still an evolving technology", proc. spie 7732, pp. 77320a-77320a-6 (2010)
[7] t. okajima et al., "the first measurement of the astro-h soft x-ray telescope performance", proc. spie 8443, submitted (2012)
[8] f. s. porter et al., "the astro-h soft x-ray spectrometer (sxs)", aip conference proceedings 1185, pp. 91-94 (2009)
[9] c.p. de vries et al., "filters and calibration sources for the soft x-ray spectrometer (sxs) instrument on astro-h", proc. spie 7732, pp. 773213-773213-9 (2010)
[10] c. p. de vries et al., "calibration sources for the soft x-ray spectrometer instrument on astro-h", proc. spie 8443, submitted (2012)
[11] h. tsunemi et al., "the sxi: ccd camera onboard astro-h", proc. spie 7732, pp. 773210-773210-11 (2010)
[12] h. tsunemi et al., "soft x-ray imager (sxi) onboard astro-h", proc. spie 8443, submitted (2012)
[13] h. kunieda et al., "hard x-ray telescope to be onboard astro-h", proc. spie 7732, pp. 773214-773214-12 (2010)
[14] h. awaki et al., "current status of astro-h hard x-ray telescopes (hxts)", proc. spie 8443, submitted (2012)
[15] m. kokubun et al., "hard x-ray imager for the astro-h mission", proc. spie 7732, pp. 773215-773215-13 (2010)
[16] m. kokubun et al., "hard x-ray imager (hxi) for the astro-h mission", proc. spie 8443, submitted (2012)
[17] h. tajima et al., "soft gamma-ray detector for the astro-h mission", proc. spie 7732, pp. 773216-773216-17 (2010)
[18] s. watanabe et al., "soft gamma-ray detector for the astro-h mission", proc. spie 8443, submitted (2012)
[19] s. gillessen et al., "a gas cloud on its way towards the supermassive black hole at the galactic centre", nature 481, 51–54 (2012)
[20] f. k. baganoff et al., "rapid x-ray flaring from the direction of the supermassive black hole at the galactic centre", nature 413, 45–48 (2001)

acta polytechnica 53(2):185–188, 2013, © czech technical university in prague, 2013, available online at http://ctn.cvut.cz/ap/

calculation of the inductance of plasma column at pf-1000 device with assumed current distribution
jiri kortanek (∗), pavel kubes
ctu in prague, fee, department of physics, technicka 2, 166 27 prague 6, czech republic
∗ corresponding author: kortanek@email.cz

abstract. the processed data are taken from the d−d fusion reaction experiments on pf-1000 at ipplm warsaw, operating with a 2 ma current and a 10¹¹ neutron yield, using interferometry, time-resolved neutron diagnostics, magnetic probe diagnostics and x-ray diagnostics. the inductance is calculated under two different circumstances: with the known distribution function determined by the magnetic probe signal, and with the border of the plasma column determined from interferograms using matlab. the inductance of the whole column is calculated using the formula for the inductance of a coaxial cylinder.

keywords: nuclear fusion, plasma focus, pinch, inductance, diagnostics, matlab.

1. introduction
interferometry is a method for visualizing plasma and for obtaining some information about its structure and density. it has been used to measure plasma with density above 10²³ m⁻³ in various pf devices [9, 1]. the pf-1000 plasma focus device, located at ipplm in warsaw, has recently been operating with 2 ma and a neutron yield of 10¹¹ [7].
interferometry, neutron diagnostics, hard x-ray diagnostics, and magnetic probe diagnostics [5] are used, along with measurements of the current, its time derivative, and the voltage with nanosecond resolution. bc408 scintillators and hamamatsu photomultipliers are used for time-resolved hard x-ray and neutron diagnostics. they are positioned side-on (7 m) and in the upstream direction (7÷84 m). counters with activated silver are used for the total neutron yield estimation [4]. the magnetic probes are induction-type probes customized specifically for the pf-1000 device. they are located at the face of the anode, 1.3 cm and 4 cm from the axis, as shown in fig. 1. they register the temporal derivative of the azimuthal magnetic field at the anode front, and this enables the current density to be determined. an example of a signal from the probe at 1.3 cm is shown in fig. 2. the y axis was adjusted to have a maximum value of 1 for better evaluation; its scale is therefore arbitrary. the value x = 0 represents the time of the maximum soft x-ray signal. this applies to all figures in this paper, unless otherwise stated.

figure 1. schematic of the experimental setup.
figure 2. signal from the magnetic probe in radial position 1.3 cm from the axis of the anode. shot no. 9367.

the interferometer at pf-1000 was designed and installed at ipplm in warsaw. the beam from the nd:ylf ekspla sl300 series laser, operating on the second harmonic wavelength of 527 nm with a pulse length below 1 ns, is divided into 16 mutually delayed beams. the time step between the beams alternates between 10 ns and 20 ns [10]. interferograms are created which show us the time evolution of the pinch over the course of 220 ns in one shot [10]. with these, we can correlate the time evolution of the pinch dynamics with the neutron and x-ray emission [8, 6]. interferometry is a path-integrated technique, so a method based on the abel transformation is used for axial symmetry [2]. more accurate unsymmetrical densitograms are then created using a method presented in [3]. an example of a densitogram of this type is shown in fig. 3.

figure 3. asymmetrical densitogram of shot no. 8584.

the aim of this work is to use two different calculations of the inductance of the plasma column for two temporal intervals. the first calculation is based on detecting the borders of the plasma after the implosion, and the second calculation is based on estimating the current distribution from the magnetic probe data during the implosion.

2. calculations
2.1. current flows along the border of the dense plasma column
for the calculation of the inductance, we presume that after the implosion the mean value of the current flows along a thin layer on the surface of the dense plasma column. the application detects the borders of the plasma by finding the threshold value of the density. the search algorithm searches for the first and last pixel with this threshold value in each column of pixels. these two values are designated the upper and lower border of the plasma for the current pixel column. with the known position of the borders, we can determine the radius r of the plasma column. the formula for determining the threshold density was

$$ d = d_{lm}\left[k_1 + k_2\left(1 - d_{lm}/d_{max}\right)\right] \ \mathrm{[cm^{-3}]}, \qquad (1) $$

where dlm is the maximum density in the current column of pixels, dmax is the maximum density in the whole shot, and k1 and k2 are constants. these constants are set so that the borders follow the places where we see a sharp rise or drop in plasma density. we need to set them experimentally for each shot, because each shot has its own particular density ratios. for shot 9013, the values were determined as k1 = 0.23 and k2 = 0.65. the detected borders of the plasma column are shown in fig. 4.

figure 4. detected borders in the densitogram of shot no. 9013, t = −13 ns (relative to the time of the maximum pinch).

the formula for the inductance of a coaxial cylinder is used,

$$ L = \sum_{1}^{n} \Delta L = \sum_{1}^{n} \Delta l\, \frac{\mu}{2\pi} \ln\frac{R}{r} \ \mathrm{[H]}, \qquad (2) $$

with Δl the length of a pixel in meters, µ the magnetic constant (4π × 10⁻⁷ h/m), r the inner radius determined by the detected borders of the plasma, R the radius of the outer electrodes (15 cm), and n the number of pixel columns in the current picture. the application calculates the inductance of each column of pixels and adds them up to get the inductance of the whole column. when this process is applied to every densitogram in the shot, the time-resolved inductance is obtained.
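a compact python re-implementation of this border-detection and summation procedure (the original application is in matlab; the array layout, the pixel size px and the function names here are our assumptions) could look as follows:

import numpy as np

MU = 4e-7 * np.pi      # magnetic constant [h/m]
R_OUTER = 0.15         # radius of the outer electrodes [m]

def border_radii(density, k1=0.23, k2=0.65, px=1e-4):
    # per-column plasma radius [m] from a densitogram
    # (rows = radial pixels, columns = axial pixels)
    d_max = density.max()
    radii = []
    for col in density.T:
        d_lm = col.max()
        thr = d_lm * (k1 + k2 * (1.0 - d_lm / d_max))     # eq. (1)
        idx = np.flatnonzero(col >= thr)                  # first/last pixel
        radii.append(0.5 * (idx[-1] - idx[0]) * px)       # half-width as radius
    return np.array(radii)

def column_inductance(radii, px=1e-4):
    # eq. (2): sum the coaxial-cylinder increments over all pixel columns
    return np.sum(px * MU / (2.0 * np.pi) * np.log(R_OUTER / radii))

as a plausibility check, a uniform radius of 3 cm over 4 cm of column length gives 0.04 · 2×10⁻⁷ · ln 5 ≈ 13 nh, the order of magnitude reported in section 3.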
these constants are set for the borders to follow the places where we see a sharp rise or drop in plasma density. we need to set them experimentally for each shot, because each shot has its own special density ratios. for shot 9013, the values were determined k1 = 0.23 and k2 = 0.65. the detected borders of the plasma column are shown in fig. 4. the formula for the inductance l of a coaxial cylinder is used, n∑ 1 ∆l = n∑ 1 ( ∆l µ 2π ln r r ) [h], (2) with length of the pixel in meters ∆l, µ is the magnetic constant (4π × 10−7 h/m), r is the inner radius determined by the detected borders of the plasma, figure 4. detected borders in the densitogram of shot no. 9013, t = −13 ns (relative to the time of the maximum pinch). figure 5. distribution function of the current flow at the time of pinch for shot no. 9367; x = 0 represents the position of the magnetic probe, positive values are towards the center of the anode. r is the radius of the outer electrodes (15 cm), and n is the number of the pixel columns in the current picture. the application calculates the inductance of each column of pixels and adds them up to get the inductance of the whole column. when this process is used for every densitogram in the shot, the time resolved inductance is obtained. 2.2. current flows distributed the magnetic probe signal was registered near the anode face in temporal scale dependence. to obtain the radial distribution of the current during the implosion, we use the formula s = vt [m] , (3) where s is length, v is implosion velocity and t represents time. we used implosion velocity 2.5 × 105 m/s with relative uncertainty of 20 % calculated from the positional differences of the plasma border at the same z-coordinates in the succeeding interferograms. the signal is then truncated to contain only the region of the first peak, where the magnetic probe is not yet damaged by the high-temperature plasma. the values of this signal are adjusted so that the sum 186 vol. 53 no. 2/2013 calculation of the inductance of plasma column figure 6. inductance of shots no. 9011 and no. 9013 using plasma border detection. figure 7. inductance of shot no. 9367 using the current distribution. of the signal equals 1. thus it does not have physical units. the acquired function, shown in fig. 5, is used as a distribution function of the current flow. we assume the same radial distribution of the current for all axial positions toward the boundary. with the known distribution function of the current, eq. 2 can be used again. in this case, r is the current layer diameter and r is again the radius of the electrodes (15 cm). the inductance is calculated for each current layer, i.e. row of pixels. it is multiplied by the corresponding value of the distribution function, and added us up to get the inductance of the whole plasma column. the time-resolved inductance is acquired by applying this process for every nanosecond in the shot. 3. results the results of the border detection method are shown in fig. 6 for shots no. 9011 and no. 9013. a similar slow increase in inductance from 15 nh at −10 ns to 18 nh at 40 ÷ 50 ns was also observed in other shots. later the inductance slowly decreases. the result of the calculation based on the current distribution measured by the magnetic probes for shot no. 9367 is shown in fig. 7. the differences between the results for other shots were below an uncertainty value of 20 %. figure 8. comparison of the resulting inductances for two calculations for shots no. 9011 and no. 9013. 4. 
4. conclusions
the inductance of the plasma column has been estimated from shots no. 9011, 9013 and 9367. each of the calculations can be used only at certain times, specified by the limitations of the diagnostic methods. the plasma border detection method can be used only when we have interferometric data for the times after the implosion phase. for shot no. 9011, the time was from −30 ns to 60 ns, and for shot no. 9013 the time was from −12 ns to 48 ns. the current distribution can be used only during the time of the implosion. the distribution is similar for different shots, so the conclusions about the temporal evolution of the inductance can be generalized with an uncertainty of 20 %. according to the magnetic probe method, most of the current flows about 1 cm above the border of the dense plasma column during the implosion phase, falls towards the plasma column during the stagnation phase, and could even sink into the plasma later during the expansion. the comparison of the two calculations, see fig. 8, shows that there is a connection between them in the intersection region. this shows that the assumption made in the plasma border method – that the mean value of the current flows along a thin layer of the plasma column – can be taken as valid, at least after the implosion phase.

acknowledgements
this research has been supported by research program no. la08024 “research in the frame of the international center for dense magnetized plasmas”, by research program no. me09087 “research of plasma of fast z-pinches” of the ministry of education, youth and sport of the czech republic, by the gacr grant no. p205/12/0454 “effective production of fusion neutrons in z-pinch and laser produced plasma”, grant cr iaea 14817 “research of d−d fusion reactions at the ctu in prague”, grant ctu sgs 10-2660-ohk3-3t-13, and by grant no. 0661/b/h03/2011/40, supported in part by the national science centre.

references
[1] a. bernard, a. coudeville, et al. experimental studies of the plasma focus and evidence for nonthermal processes. phys fluids 18(2):180–194, 1975.
[2] a. kasperczuk, t. pisarczyk. application of automated interferometric system for investigation of the behaviour of a laser-produced plasma in strong external magnetic fields. opt appl 31(3):571–597, 2001.
[3] j. kortanek, et al. matlab applications for processing of interferograms and oscillograms from pf-1000 experiments. in tenth kudowa summer school. ipplm, 2011.
[4] j. krasa, m. kralik, et al. anisotropy of the emission of dd-fusion neutrons impressed by the plasma-focus vessel. plasma physics and controlled fusion 50(12):125006, 2008.
[5] v. krauz, k. mitrofanov, et al. experimental study of the structure of the plasma-current sheath on the pf-1000 facility. plasma physics and controlled fusion 54:025010, 2012.
[6] p. kubes, et al. evolution of the gas-puff z-pinch column. ieee transactions on plasma science 26(4):1113–1118, 1998.
[7] p. kubes, et al. spontaneous transformation in the pinched column of the plasma focus. ieee transactions on plasma science 39(1):562–568, 2011.
[8] p. kubes, m. paduch, et al. interferometric study of pinch phase in plasma-focus discharge at the time of neutron production. ieee transactions on plasma science 37(11):2191–2196, 2009.
[9] p. schmidt, p. kubes, et al. neutron emission characteristics of pinched dense magnetized plasma. ieee transactions on plasma science 34(5):2363–2367, 2006.
[10] e. zielinska, m. paduch, et al.
sixteen-frame interferometer for a study of a pinch dynamics in pf-1000 device. contrib plasma phys 51(2–3):279–283, 2011.

a logic object and its state
j. bokr, v. jáneš
this paper shows not only that the state of an object parameterizes the dyad (stimulus, response), but that it is also the performer of the state transition initiated by the stimulus. thus, the logic object is a feedback composition of the logic pseudoobject and the static logic object.
it is also stated that, without considering the new role of the state, the content of the concept “state” cannot be completely explained. the current intuitive conception of canonical decomposition of an entity is confronted with the exactly introduced decomposition and with the mentioned division of the logic object into a pseudoobject and a static object.
keywords: logic object, pseudoobject, division, canonic decomposition, delay, state.

1 introduction
a real logic object depends on its model, since a model defines an entity. the definition of an object can be understood as a statement of all of its relevant, adequate, and well-distinguishable characteristics that factor the entity out from its environment, or document its entirety. let us show, on an example of driving a train, the untenability of the current conception of the dynamic logic object.

example 1: let the transition table, tab. 1a), be a model of a train going from point 1 through position 2 to point 3; the train then goes back from position 3 through point 4 to point 5. the state of the train is its (bold-printed) position on the track, and the actor – the motor – is either at rest or operates going forward or backward. the ordered pairs of values 00, 10 and 01 of the train control u1u2 make the train be at rest, go forward, or go backward, respectively. the flow chart of the conceptional mealy nondeterministic control automaton of the given train is presented in tab. 1b), and its minimal form, with obvious combinational behavior [2], is shown in tab. 1c). the control automaton is nondeterministic since, if it assumes state 3 of the train, it produces either control 10, and the train remains in state 3, or 01, due to which the train goes back towards state 5. the common statement that control u1u2 causes the train to move from one place to another cannot be accepted: the train, not being given data about its position on the track, i.e., without regard to the potential operation of the motor, will not move. thus, the control u1u2 changing the train's position only initiates the change, whereas the actor of the train is the starting position of the change of place. since the transition table tab. 1a) of the train assumes automatic movement of the train along the track 1 → 2 → 3 and 3 → 4 → 5 only due to the respective control 10 or 01, the model of the train according to tab. 1a) is inadequate. moreover, can a nondeterministic control automaton be constructed when an arbitrary product is either a nondeterministic or, if there is no other choice, a pseudodeterministic automaton? in other words, a train modeled by the transition table tab. 1a) in a finite-semiautomaton way is not (!) a logic object, and is referred to only as a logic pseudoobject. since the performer of a state transition on a dynamic entity is the starting state of the transition (the transition control is its actor only), we will introduce a logic potentially dynamical pseudoobject [3–5] as a division component (not a decomposition component [4]) of the entity, and compare the division of the logic object with its canonic decomposition (not with a structural interpretation of the transition relation, spec. function, of the object). we will attempt, even if it may be without success, to define the content of the concept of an “object state”.

table 1: a) transition table of the train, b) table of the conceptional control automaton, c) table of the minimal control automaton of the train from example 1
a) s^{τ+1} for controls u1u2 = 00, 10, 01 (rows s^τ = 1, …, 5; “–” marks an empty entry):
 1: –, 2, –
 2: –, 3, –
 3: –, –, 4
 4: –, –, 5
 5: –, –, –
b) q^{τ+1}/u1u2 over the train states s^τ = 1, …, 5 (entries as far as they are legible in the source): q^τ = 1: 2/10; 2: –/10, 3/10; 3: –/10, 4/10; 4: –, 5/10, 01; 5: –/01, 6/10; 6: –/01, 7/01; 7: –/01
c) output u1u2 for s^τ = 1, 2, 3, 4, 5: 10; 10; 10, 01; 01; 01

2 logic object
let the symbol m^τ denote an element of the set of all representations τ ↦ m^τ ∈ m, and let proj_i be the projection onto the i-th axis. let the finite-automaton model of the logic pseudoobject p be the ordered quintet p = ⟨u, z, s, y, δ⟩, where u, z, s and y are the corresponding input (control) alphabet, explicit failure alphabet (if failures occur, then ζ = 1, otherwise ζ = 0), state alphabet and output alphabet, and δ is the respective transition relation, or function:
i) for ζ = 0: δ ⊆ (s × u) × s: (s^τ, u^τ) ↦ s^{τ+1}, spec. the function δ: s × u → s: (s^τ, u^τ) ↦ s^{τ+1},
ii) for ζ = 1: δ: s × u × z → s: (s^τ, u^τ, z^τ) ↦ s^{τ+1},
together with the injective moore output function λ: s → y: s^τ ↦ y^τ, where τ ∈ n₀ (n₀ = n ∪ {0}, n being the set of natural numbers denoting moments). if ≤ (≤ ⊆ n₀²) is the binary relation “before or simultaneously with”, then by the measuring of moments the homomorphism m: n₀ → r₀⁺ is understood (r₀⁺ being the set of real nonnegative numbers), such that if τ ≤ υ, then m(τ) ≤ m(υ); consequently, if τ − 1 ≤ τ ≤ υ, then also m(τ − 1) ≤ m(τ) ≤ m(υ), see fig. 1.

according to demonstration example 1, we consider the complex of automatic logic control from fig. 2 to be incorrect, and are no longer concerned with it. we require the dynamic logic object automatically to perform processes selected by the subject (state trajectories). if a state trajectory contains cycles, we require a finite number of its iterations. the selection of processes in the entity is to be carried out by the subject by assigning stimuli to the object. a potential carrier of processes on the entity is, without doubt, the given pseudoobject (being unable to perform transitions between the states of the selected state trajectory by itself), since the control of the potentially dynamical pseudoobject only initiates the transitions on the pseudoobject. exaggerating a little, we can refer to the pseudoobject potentially both as “dead” and as “alive”. since the performer of the transition between the pseudoobject states is always the starting state of the state transition, the requirement of automatic operation of the selected process on the object prescribes the division of the dynamic object into dividends, which are:
• the given potentially dynamic pseudoobject,
• the selected static logic entity, according to fig. 3.
while at least one part (a dividend) of the object division is a pseudoobject, all parts (components) of the object decomposition are again objects.
fig. 1: time diagram of action
fig. 2: complex of automatic logic control of pseudoobject p by a nondeterministic control automaton ca
fig. 3: division of a dynamic logic object o into potentially dynamic pseudoobject p and static logic object Ω

let us prove that the static mealy pseudoobject Ω = ⟨x × y, z, q, δ_r, λ_r⟩ is a logic object, where x and q are the respective selection alphabet (by selecting a letter from x, the given process on p can be implemented) and state alphabet, and δ_r and λ_r are the respective transition function:
i) for ζ = 0: δ_r: q × (x × y) → q: (q^τ, x^τ, y^τ) ↦ q^{τ+1},
ii) for ζ = 1: δ_r: q × (x × y) × z → q: (q^τ, x^τ, y^τ, z^τ) ↦ q^{τ+1},
and the mealy output function:
i) for ζ = 0: λ_r: q × (x × y) → u: (q^τ, x^τ, y^τ) ↦ u^τ,
ii) for ζ = 1: λ_r: q × (x × y) × z → u: (q^τ, x^τ, y^τ, z^τ) ↦ u^τ.
and really, if q = {q} (a single-state automaton), we obtain either the transition function
i) for ζ = 0: δ_r: {q} × (x × y) → {q},
ii) for ζ = 1: δ_r: {q} × (x × y) × z → {q},
which can be formally (not actually) ignored, or the output function
i) for ζ = 0: λ_r: {q} × (x × y) → u,
ii) for ζ = 1: λ_r: {q} × (x × y) × z → u,
which can be formally (not actually) transformed to the form
i) for ζ = 0: λ_r: x × y → u: (x^τ, y^τ) ↦ u^τ,
ii) for ζ = 1: λ_r: x × y × z → u: (x^τ, y^τ, z^τ) ↦ u^τ.
hence, the finite-automaton model of a static logical object (which automatically performs virtual state transitions from q to q) is the ordered triad Ω = ⟨x × y × z, u, λ_r⟩, where λ_r is the mealy output function above.

let us construct the aggregation o(Ω, p) according to fig. 3; since u^τ = λ_r(x^τ, y^τ) = λ_r(x^τ, λ(s^τ)) for ζ = 0 and u^τ = λ_r(x^τ, λ(s^τ), z^τ) for ζ = 1, we obtain:
i) for ζ = 0: s^{τ+1} = δ(s^τ, u^τ) = δ(s^τ, λ_r(x^τ, λ(s^τ))) = δ*(s^τ, x^τ),
ii) for ζ = 1: s^{τ+1} = δ(s^τ, λ_r(x^τ, λ(s^τ), z^τ), z^τ) = δ*(s^τ, x^τ, z^τ).
then we can state that the finite-semiautomaton model of the logical entity o is the ordered triad o = ⟨x × z, s, δ*⟩, where δ* is the transition relation, spec. function, of the aggregation o(Ω, p) of the dividends Ω and p of object o.

example 2: construct the dividend Ω of the stepwise moving train from example 1; in other words, construct a logic object – a moving train. the transition table of such a moving train is presented in tab. 2a). since tab. 2a) is scarce, we use tab. 2b), with arrows denoting the state transitions of the object.

table 2: transition table of the moving train from example 2: a) scarce, b) satisfying
a) s^{τ+1} over the selections x1x2 combined with the observed state (columns 001, 002, …, 015); the rows s^τ = 1, …, 5 repeat the entry pattern of tab. 1a)
b) s^{τ+1} for x^τ = 00, 10, 01 (rows s^τ = 1, …, 5), with arrows marking the automatic runs 1 → 2 → 3 (under 10) and 3 → 4 → 5 (under 01)
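as an illustration of example 2's division into a pseudoobject and a static control object, the following sketch simulates the feedback composition: the static object maps the train's observed state to a control letter (tab. 1c), and the pseudoobject performs the transition that the starting state enables. it is a toy reconstruction under our own naming, not the authors' formalism; the dictionaries below merely encode the train of example 1.

```python
# pseudoobject P: transition function delta(s, u) -> next state.
# the starting state performs the transition; the control only initiates it.
DELTA = {
    (1, "10"): 2, (2, "10"): 3,            # forward run 1 -> 2 -> 3
    (3, "01"): 4, (4, "01"): 5,            # backward run 3 -> 4 -> 5
}

# static object (tab. 1c): state feedback s -> control u1u2.
# in state 3 the minimal control automaton is nondeterministic (10 or 01);
# here we resolve it to "01" so the selected trajectory terminates in 5.
CONTROL = {1: "10", 2: "10", 3: "01", 4: "01", 5: "00"}

def run(state=1, max_steps=10):
    """Simulate the feedback composition O(static object, pseudoobject)."""
    trajectory = [state]
    for _ in range(max_steps):
        u = CONTROL[state]                     # static object emits control
        state = DELTA.get((state, u), state)   # pseudoobject performs transition
        trajectory.append(state)
        if u == "00":                          # motor at rest: train stays put
            break
    return trajectory

print(run())   # -> [1, 2, 3, 4, 5, 5] : the selected state trajectory
```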
the structural model of the dividend Ω of the train is trivial if the state predicates of the pseudotrain p_i, i = 1, …, 5, with values in {0, 1}, are introduced; otherwise, the situation is as shown in fig. 4.

fig. 4: control automaton of the pseudotrain from example 2 (gate-level diagram with predicates p1–p5, selections x1, x2 and outputs forward, back, stop)

in a justified way, a dynamic elementary logical object can be a matter of discussion and examination. even though the division of a logical entity is an intuitive concept (different from its decomposition), this also evidently holds: an elementary logical entity is a non-decomposable dynamic logical object. let us, therefore, consider a delay ⟨d, s, δ_d⟩, be it deterministic or nondeterministic [5], where d and s are the respective input and state alphabets of the delay, and δ_d is the transition function δ_d: s × d → s: (s^τ, d^τ) ↦ s^{τ+1}. the equality d^τ = s^{τ+1} states the automatic production of the state transition, and thus the delay is not only a dynamic elementary object, but even the only elementary logical object. since the transition table of the logical object is scarce, whereas the transition table of the pseudoobject with added arrows depicting the automatism of performing state transitions is satisfying, the traditional finite-automaton pseudoobject models serve well also as finite-automaton models of dynamic entities, provided that the stimuli from the objects only evoke state transitions and the states produce them through the dividend (which is virtual in the case of the delay). if s^τ is given in dividend p in the division of the object, the transition structure degenerates and the given entity is the static object Ω, because an object without a potentially dynamic dividend is sure to be static. note, in addition, that the connection of the level sensors or pulse sensors to the Ω has to be ensured [5].

3 canonical decomposition
let us consider, without loss of generality, a dynamic prototype o traditionally modeled by a finite semiautomaton o = ⟨x × z, s, δ⟩, where δ is a transition relation, spec. function, assigned in the traditional way:
i) for ζ = 0: δ ⊆ (s × x) × s: (s^τ, x^τ) ↦ s^{τ+1}, spec. δ: s × x → s,
ii) for ζ = 1: δ: s × x × z → s: (s^τ, x^τ, z^τ) ↦ s^{τ+1}.
if we regard the state s^τ, stimulus x^τ, and failure z^τ as current, then s^{τ+1} = proj₃ δ, spec. s^{τ+1} = δ(s^τ, x^τ) for ζ = 0, or s^{τ+1} = δ(s^τ, x^τ, z^τ) for ζ = 1, is only a current prediction [6] of the state of the follower, and not the follower state itself, since the follower state cannot be currently documented. then the composition o(δ, Δ) from fig. 5, where the static block computing δ is a static, nondeterministic, spec. deterministic, logical object and Δ is a delay, can be regarded as a structural interpretation of the transition relation, spec. function δ, of object o, but not as its canonical decomposition, since the transition relation, spec. function, does not prescribe a decomposition, and thus neither the canonical decomposition of a logical object. an interesting, and surprisingly common [8, 10, 12], statement is that the static block, according to the current s^τ, x^τ, or z^τ, currently produces s^{τ+1} – which accepts nothing other than an anticipative delay?
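the structural interpretation just discussed – a static block computing s^{τ+1}, fed back through a delay – can still be written down mechanically, with the delay register making the automatism explicit instead of assuming an "anticipative" static block. the sketch below does so for the train's transition function; it is a schematic illustration under our own naming, not the authors' construction.

```python
class Delay:
    """Elementary dynamic object <D, S, delta_d>: s(t+1) = d(t).

    The register stores the excitation for one step; reading `state`
    before writing `d` realizes the non-anticipative behaviour."""
    def __init__(self, initial_state):
        self.state = initial_state

    def step(self, d):
        previous = self.state
        self.state = d        # s(t+1) = d(t): automatic state transition
        return previous

def delta(s, x):
    """Static block: current prediction of the follower state of the train."""
    table = {(1, "10"): 2, (2, "10"): 3, (3, "01"): 4, (4, "01"): 5}
    return table.get((s, x), s)

# composition O(delta, Delay): the static block only predicts s(t+1);
# the delay performs the transition at the next moment.
delay = Delay(initial_state=1)
for x in ["10", "10", "01", "01"]:
    s_now = delay.state
    delay.step(delta(s_now, x))
print(delay.state)   # -> 5
```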
let us now leave the realm of prejudices: since the relationship of covering of finite automata is not constructive [1], as it does not allow us to construct an automaton covering a given automaton, let us, intuitively for the time being, introduce an automaton called a substitute, instead of the covering automaton. let us consider the composition o_n(i, o_r) from fig. 6, where i is the searched static deterministic exciting block of the given object o_r – a substitute for prototype o. let the finite-semiautomaton model of o_r be the ordered triad o_r = ⟨e × z, s_r, δ_r⟩, where e and s_r are the respective excitation alphabet and state alphabet, and δ_r is the transition relation, spec. function:
i) for ζ = 0: δ_r ⊆ (s_r × e) × s_r: (s_r^τ, e^τ) ↦ s_r^{τ+1}, spec. δ_r: s_r × e → s_r,
ii) for ζ = 1: δ_r: s_r × e × z → s_r: (s_r^τ, e^τ, z^τ) ↦ s_r^{τ+1}.
let the finite-automaton model of the searched block i be the ordered triad i = ⟨x × s_r × z, e, φ⟩, where φ is the excitation function:
i) for ζ = 0: φ: x × s_r → e: (x^τ, s_r^τ) ↦ e^τ,
ii) for ζ = 1: φ: x × s_r × z → e: (x^τ, s_r^τ, z^τ) ↦ e^τ.

fig. 5: block interpretation of the transition relation, spec. function δ, of object o
fig. 6: canonical composition o_n(i, o_r) of object o
note that if there did not exist at least one dynamic elementary logic object, a structural model of a dynamic logic entity could not be constructed. let us have in mind that a universal and, in a limited way, simple binary dynamic substitute of the binary dynamic object considered is a parallel register (the simplest possible composition) of binary dynamic memory modules. in this case, any structural model of an arbitrary binary memory module, designed either intuitively or exactly (by canonical decomposition), is strongly dependent on the binary dynamic elementary substitute – on a binary delay. therefore the structural models of a computational, pseudoobject designed by canonical decomposition with substitutes, which are always logic objects, are also logic objects, which have not yet been taken into account, and the process of controlling a logic object, as can be seen from paragraph 2, does not make sense. there is also an obvious extremely large difference between the division of an object and its canonical decomposition; even though and i are static object, controls the given pseudoobject b, whereas i “forces” the substitute or to mimic by its transition sequences the transition sequences on the given prototype o. 4 state of the object let us consider an extended finite-automaton model o = x z , s, y, ,� � � of a logic entity o, where y is the output alphabet and can be either a mealy output relation, spec. function: i) � � � � � � � � � � � � � �� � �s x y s x y: , , , � � � � � �spec. : s x y : s x y � � � � � �� , � , for � 0, ii) � � � � � � � � � � � � � � � �: : , ,s x z y s x z y� � � , for � 1, or a moore output function � � � � � � � �: :s y s y � . the finite-automaton model of a logic entity is evidently introduced: • independent of the content of the “state of the object” concept, • in such a way that the state of the entity both parameterizes [9], [13] an arbitrary input/output dyad x y� �, , and transfers each state transition; though there are an unlimited number of ways of parametrizing/transferring, all of them are substantially equivalent. in [7, 11] the state of the object is understood as the entire input history of the entity being recorded in a cumulative way (without deletion) and stored in the “memory”, i.e., in the © czech technical university publishing house http://ctn.cvut.cz/ap/ 51 acta polytechnica vol. 41 no. 3/2001 a) s r +1� e e1 2 � � 00 01 11 10 s r � i iii i ii ii ii iii iii iv iii v iv iv v v ii v vi vi vi b) s r +1� e e1 2 � � s� � � � � 1 2 00 01 10 01 10 01 s r � 1 i i iii 11 01 ii 2 iii iii iv 11 01 3 iv iv iv v 00 00 01 4 v v vi 01 10 5 vi vi vi 10 10 table 3: a) transition table of the substitute object of the train, b) excitation table of the substitute the searched excitation e� can be found, as i) � � � � � �� � � �� �� � � � � � � � � s x s k s x k s k s, , , , , �1 1 , spec. � �� � � � � �� �� �k s x k s x k sr� � � � � � � , , ,� , for � 0, ii) � �� � � � � �� �� �k s x z k s x k s z zr� � � � � � � � � � , , , , , ,� for � 1. delay of block interpretation of the transition relation, spec. function, � of object o. but if it is i) � �� � � �� �s proj s x s proj s x s+1 3 +1 3 +1� � �� �� �i j � � � �, , , , , spec. 
� � � �s s x s x+1� � �� �� �i j � �, , , for � 0, ii) � � � �s s x z s x z+1� � � � �� �� �i j � �, , , , , for � 1, how can the input history s si j � � � , i.e., for different input histories cumulatively included in s i � and in s j �, the input history, which originated by connecting x�, or x�z� to s i � or to s j �, respectively, be exactly identical with s�+1? the memory of the entity so does not cumulatively record and does not store the input history of an entity, and therefore the conception of the object according to fig. 5 is not tenable, though it may appear sufficient to put s�+1= d�, where d� is the excitation of the delay. 5 instead of a conclusion the authors, though having found that a state not only parameterizes the input/output pair but also transfers the state transition, which is relevant, regret to say that by rejecting the current conception of the state they have made the content of the concept “state” a little unclear. they also believe that objections such as: • i am not convinced that logic control is combinational, • the requirement for : |q| = 1 is too high, • logic control is not so simple as some would wrongly think, • in canonical decomposition both components are to be determined, • in canonical decomposition the state can be measured, will be found insubstantial by the reader. references [1] baranov, s. i.: sintez mikroprogrammnych avtomatov. energija, leningrad, 1979 [2] bokr, j. at al: logické řízení technologických procesů. sntl, praha, 1986 [3] bokr, j., jáneš, v.: logic control and canonical decomposition. in: proceedings of uwb, vol. 1, pp. 33–40, plzeň, 1997 [4] bokr, j.: upravlenie logičeskim objektom i kanoničeskaja dekompozicija. avt, no. 6/1999, pp. 12–23 [5] bokr, j., jáneš, v.: logické systémy. vydavatelství čvut, praha, 1999 [6] bokr, j., svatek, j.: základy logiky a argumentace. a. čeněk, plzeň, 2000 [7] brunovský, p., černý, j.: základy matematickej teórie systémov. veda, bratislava, 1980 [8] frištacký, n. at al: logické systémy. alfa-sntl, bratislava-praha, 1986 [9] kalman, r. e., falb, p. l., arbib, m.: topics in mathematical system theory. mc graw hill book co., 1969 [10] katz, h. r.: contemporary logic design. the benjamin cummings publishing co., inc., 1994 [11] minsky, m. l.: finite and infinite machines. russian translation, mir moskva, 1971 [12] perrin, j. p., denouette, m., daclin, e.: systemes logiques. dunod, paris, 1967 [13] zadeh, l. a.: the concept of state in system theory. in: views on general system theory, edited by mesarovič, m. d., j. wiley & sons, inc., new york-london-sydney, 1964 doc. ing. josef bokr, csc. department of information and computer science university of west bohemia faculty of applied sciences universitní 22 306 14 pilsen, czech republic doc. ing. vlastimil jáneš, csc. department of computer science and engineering czech technical university in prague faculty of electrical engineering karlovo nám. 13, 121 25 praha 2, czech republic 52 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 41 no. 3/2001 << /ascii85encodepages false /allowtransparency false /autopositionepsfiles true /autorotatepages /none /binding /left /calgrayprofile (dot gain 20%) /calrgbprofile (srgb iec61966-2.1) /calcmykprofile (u.s. 
application of morphological analysis methodology in architectural design
a. prokopska
the theory of system and design methodology, as a sphere of objective concepts, can be applied to a more precise description, analysis and improvement of the methods of the real architectural design process. lull's art, as the primary idea of morphological analysis, has been acknowledged as an element corresponding with the specificity of architectural design. from the architect's point of view it is worth studying the rules and peculiarities of morphological analysis methodology in engineering design; thus it may be possible to apply the methodology in architectural design.
keywords: design, architecture, engineering, structure, morphology, analysis.

1 introduction
albert einstein stated: “science must start from facts and end with facts, irrespective of the theoretical structures that it uses to combine them” [1–4]. this statement is directly relevant to morphological analysis based on lull's art, as this method starts from facts and combines them in a mechanical way, thereby creating theoretical structures. according to this method, from among the theoretical structures formed in this way, only those are to be chosen which will be in accordance with a pre-defined need. thus, the method assumes a decision-making role, the creative choice of a human designer, which is an indispensable element of architectural design in a systemic approach.¹

at present, the traditional understanding of architectural design² does not correspond fully with the assumed high level of complexity of technological and architectural design processes. the traditional understanding of design is also not consistent with the contemporary level of complexity of designed technical objects. in the creative method of morphological analysis [5–11], the study or design of fragments of reality, while treating these fragments as wholes composed of different parts, corresponds to the contemporary level of complexity of designed technical objects. the use of this method requires that a concept be introduced which will make it possible to isolate the object of study or design. this concept is the system, and, in the case of architectural design, it is the architectural system [11].
the effect of the development of a design methodology which searches for general instructions of procedure and regularities referring to the broadest classes of design processes of technical objects is, among others, the formation of the specific interdisciplinary conceptual means that accepts and facilitates a systemic approach to design. such a development of the methodology of design (fig. 1) [9] means that the process of architectural design can be directly described and studied, without referring to the description of a designed object [12–14]. at present, not only design processes but also creation methods may be characterized without the necessity to define them directly with the use of object descriptions [5]. according to our contemporary methodological knowledge, features formed in a design process are as fundamental as those which result from the function of the object, the material used for its production, and the production process. these facts, on the background of knowledge resulting from the contemporary extensive development of the methodology of scientific design, involving the methods, procedures and techniques of design [6], form a reason for undertaking an attempt to study the morphological analysis methodology based on lull's art [5, 7], as one which can be applied not only in technology, but also in architectural design. these remarks are the leading concepts of the presented considerations referring to the possibilities of applying morphological analysis, or some parts of it, in the real process of architectural design. the indispensable condition for these considerations is a systemic approach² to the architectural design process and to design as an architectural system.

fig. 1: sequential basic actions in the process of satisfying needs [9]

2 lull's art as a philosophy and a “pre-concept” of the morphological analysis methodology
morphological analysis methodology has its “pre-concept” [7–8], which testifies not only to the richness but also to the continuity of human thought through the ages. ramon lull (1235–1315), a monk from majorca, later known as raimundus lullus, constructed a “logic machine” that performs automatic combinations of solutions. it consists in systematic combinations of a small number of notions. these notions symbolize intervals on the orbits of concentrically rotating circles (app. 1) [8]. ramon lull called his method “great art”, and later he reduced to nine the number of notions subject to automatic combination (app. 2) [5]. an example contained in the table, reprinted from a seventeenth-century book, is a typical morphological interval (fig. 2, 3, app. 3).
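lull's “logic machine” can be imitated in a few lines: each circle carries a ring of notions, and rotating the circles against each other aligns one notion from each ring into a candidate combination. the sketch below is a playful reconstruction of the mechanism behind app. 1, not a historical reproduction; the notion lists are chosen only for illustration.

```python
from itertools import product

# three concentric circles with illustrative notions; rotating each
# circle by an offset aligns one notion per circle into a combination.
circles = [
    ["goodness", "greatness", "eternity"],
    ["difference", "concordance", "contrariety"],
    ["memory", "intellect", "will"],
]

def aligned(offsets):
    """One reading position: the notion each rotated circle shows."""
    return tuple(circle[offset % len(circle)]
                 for circle, offset in zip(circles, offsets))

# enumerating all rotation offsets reproduces lull's mechanical
# combination of a small number of notions.
combinations = {aligned(offs)
                for offs in product(*[range(len(c)) for c in circles])}
print(len(combinations))   # 27 = 3 * 3 * 3 distinct aligned triples
```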
lull’s art, however, led to the danger that thought might be mechanized. kant [5] expressed this danger as follows: “being young, i studied a little between philosophy and logics, … but after studying them i am of the opinion that logics and syllogisms serve rather to explain things that are known, or even, as in lull’s art, to speak without judgement about things of which we know nothing, and which we must learn”. lull’s methodology, as a basis of their philosophy, has been cultivated throughout the history by many thinkers, e.g., giordano bruno and g. w. leibniz [5]. leibniz wrote about it with appreciation in his dissertatio de arte combinatoria. a. kircher [5], a famous jesuit (the inventor of the magic lantern, and thus the ancestor of the cinema) presented a clear example of morphological analysis (app. 3). in gulliver’s travels, swift [5] criticizes lull’s art by describing a device that sets letters in a random manner. this method enabled the most ignorant people to write books without the least help of genius or education. this criticism makes a clear distinction between what is morphological analysis and what is not. in later centuries, lull’s art gradually fell into oblivion, despite its potential values. 3 selected procedures of morphological analysis morphological analysis was introduced by fritz zwicky [8–10], an astronomer3 who succesfully used this method, a pioneer of the construction of reaction engines. zwicky used the term “morphological analysis” to define a standard construction method that served for identifying all possible means enabling the attainmet of a specific functional capability. he was interested only in technical dimensions. he formulated a method for identifying, classifying and organizing the parameters influencing the construction of a physical device. the processes of thinking known as morphological analysis or morphological exercise, go beyond zwicky’s morphological analysis, which is limited to the arrangement of technical factors. the aim of contemporary thinking processes, whether they are called decomposition into semantic factors, morphological exercises, methods of morphological analysis, or simply “common-sense activity", is to search for more proper and meaningful factors through a continuation of the analysis of partial terms connected with complex terms, and complex terms covering partial terms [5, 8–9]. in morphological analysis, the value of solutions is connected with the value of the analysis, and the solutions must be consciously studied, and must also be used in future applications of this method in architectural design. gerardin [5] states that, in fact, “morphological analysis is a method of creation, and more precisely – a systematic help in creation”. this does not eliminate creative work by a human beying, but stimulates and develops it, allowing the imagination to work on a larger number of ideas than would be possible with the classic approach. arthur d. hall [14] defines morphological analysis as a generalization of the arrangement of properties, closely connecting them with morphological analysis itself. he regards the name as apt, since the word “morphology" refers to the science of structure and form (from greek morphe – shape and logos – science). he claims that pondering on structure and form stimulates intuition, and helps in formulating the problems themselves. 
the procedure of this method is presented by hall as follows:
• start from the broadest possible formulation of the problem;
• make a list of the independent variables of the desired system;
• assign to each variable one of the dimensions of the morphological map;
• count the values that may be taken by each variable. the total number of the problems will be equal to the product of the value numbers of each of the variables.
in this method, the combination process grows in geometric progression, and tens of thousands of solutions are quickly obtained; thus it is necessary to distinguish clearly what is morphological analysis, and what is not. the importance of this method does not consist exclusively in obtaining an ordered means of recording the combinations of the values of the features of an object in the morphological interval. its essence lies in imposing a discipline and a systematic way of procedure on the designer, and, as a result, enabling successive choices to be made among many variant solutions [14–15].

fig. 2: a) known solutions of the problem; b) structuralized solutions having various new solutions of the problem inside their abstract structure
fig. 3: a schematic drawing of a morphological interval [24]

to perceive an intrinsic order in a physically non-existing thing, and to settle the main features of a thought-out solution,
the most important advantage of this method, as analysed from the viewpoint of its future application in architectural design, is that it leads, when the sets of main and particular features are properly determined, to the identification and investigation of all combinations of the real features of an object. this morphological exercise depends on passing from one level of abstraction or population to others, in order to determine the essential change connected with a given problem. at present, morphological analysis serves for the determination of a full set of combinations of the variants of the features of a defined class of technical objects, since it creates the possibility of making a choice. it is important for architects that, thanks to these properties, this method makes it possible to obtain unconventional solutions, and, at the same time, to formulate complex problems in a general, comprehensive and clear manner. joining together the particular elements of an object under study in a combinatory manner, we can create (or generate) a huge number of types of systems. this task has a large, but always finite number of solutions. in morphological analysis method, the examination of fragments of reality or their design, and regarding these fragments as a whole composed of various parts (e.g., architectural forms), corresponds with the modern level of complexity of design processes as intellectual processes, and the complexity of the design of technical objects, including architectural objects. 4 selected examples of the applications of morphological analysis in practical applications of this method in technology, success is attainable if a randomly selected set of solutions allows new ideas to be found. this is an attractive feature from the point of view of the specificity of an architect’s design work. swager [15] describes an example of the use of morphological analysis in searching for real factors connected with the future of coatings in the package production industry. in this case, the analysis covered a number of conditions related to: transport, distribution, concept of marketing, changes in product manufacture technology, standards and government acts, changes in product forms, changes in demand for the products, and income per head. during this morphological exercise which started with a description of the problem from the point of view of an evaluation of the importance and significance or lack of significance, an analysis was performed in relation to the usefulness and functions. in this analysis, each division or combination of the fragments of the description of the problem was investigated in terms of potential changes that could increase or decrease the future importance or requirements in relation to the described functions and usefulness. sielicki [6] presents in the following manner the course of a morphological procedure which is at the same time an analysis of an example of its application: • for a given class of objects, a set of main features is defined: a,b,c, ... • then, for each feature, its varieties or particular features are determined a1, a2, …ak b1, b2, …bk c1, c2, …ck this process can be continued to determine the subsequent varieties of the particular features. • then all possible combinations of particular features or varieties of these features are determined. a graphical description can be used here, in the form of a table (fig. 4), a tree of solutions (fig. 5), a morphological matrix (fig. 5) or a structural card (fig. 6). 
in a situation when there are a considerable number of features and varieties of features, it is convenient to use computer procedures. • the obtained set of variant solutions is subjected to reduction. the remaining variants form the set of solutions searched for, and undergo further evaluation and reduction. the set of solutions can be reduced, while searching for limitations that may exist as regards the values of various parameters. these limitations may reflect true impossibilities, physical, admissible values, also combinations, that seem to be unrealistic. in order to study all solutions obtained through morphological analysis, the morphological interval must be defined (fig. 3) in such a way that a number of different solutions be obtained, which correspond to a reasonable amount of time for studying them. in a morphological interval containing 48 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 41 no.1/2001 seven parameters, each having three values, 2187 solutions are obtained. in the case of seven parameters, each having two values, only 128 solutions are obtained. the example analysed above of the use of morphological analysis and a multitude of possible graphical descriptions of this method (fig. 3, 4a, 4b, 5, 6, 7 and 8) describes and device for collecting apples, presented by sielicki [6]. the structure of description shown in fig. 4a, b corresponds to the tree of solution variants. this structure can also be presented in the form of a so called structural card (fig. 6) [4]. another form of graphical representation is a morphological matrix (sometimes called a morphological box). the number of dimensions of this matrix is equal to the number of main features of the solution, and each of its elements corresponds to a defined combination of features (fig. 5). figure 7, on the other hand, gives an example of a two-dimensional matrix, which describes the main features of the same device for collecting apples. the most convenient method of representation is a so called decision graph (fig. 1) [6]. the number of possible paths in the graph corresponds to the number of possible variants of the solution. in the decision graph shown, which is related to the process of architectural design, one design decision would be defined by one operational action, and would be depicted as passing from one graphical point to the second point. thus, in the process of architectural design based on morphological analysis, an architectural object can be formed [6, 11, 13–14], in successive design steps taken as a result of evaluations made in accordance with the ideas, knowledge and skill of an architect. © czech technical university publishing house http://ctn.cvut.cz/ap/ 49 acta polytechnica vol. 41 no.1/2001 method separation fruit from branch method location system method collest fruit vibration pulling down cutting burning on support suspension transport loosing and catching shaking ulrasoundes hand operated mechanical suction electric resistance flame soldering iron cruth ladder scaffold translocation lift baloon lift container conveyor network tapes linen bowl with water mattress with rubber air cushion common rollers runner hydraulic screw elevator sack box band helical pipe suction force gravit a) fig. 4: a), b) table of the tree of solutions [9] p a b c a1 b1 c2 a2 a3 b2 c1 a11 a12 a21 a22 a23 a4 a41 a42 a43 b11 b12 b13 b14 b21 b22 c11 c12 c21 c22 c23 c24 c25 c111 c112 c121 c122 c123 c1231 c1232 b131 b132 b141 b142 b) fig. 
fig. 5: two-dimensional morphological matrix [9]
fig. 6: a structural card [9]
fig. 7: three-dimensional morphological matrix [9]
fig. 8: a decision graph [9]

5 discussion and conclusions
the modern development of the theories of system, praxeology, and design methodology, together with the development of concepts connected with these domains, has led to new possibilities of design and improvement in the real process of architectural design with a systemic approach. one of the many consequences of the development of the theories of system design methodology is an increase in the importance of new applications of various methods; this includes the method of morphological analysis, known and applied in technology, which is based on lull's art. in architectural design applications, this method, based on the arrangement of fragments into more or less complex wholes, requires a systemic approach. a fundamental way of viewing morphological analysis methodology is to treat it consciously as a tool that increases the possibility of aggregating diversities in the process of architectural design. from the point of view of an architect, morphological analysis has a particularly attractive feature: it enables a designer to create freely, and yet systematically, many variants of the solution, according to a pre-defined need, and also according to his creative imagination, knowledge and skill. the present development of design methodology is closely connected with design practice. the usefulness of future processes of architectural design, in which morphological analysis methodology will be applied, will result from the degree to which these processes enable understanding, teaching and easier performance of architectural design. morphological analysis, as a mathematical method, facilitates computer aid to the process of design, including architectural design. the analysis undertaken here does not exhaust all possibilities of the use of a systemic approach to architectural design, and further study is recommended.

references
[1] einstein, a.: the common language of science. “advancement of science”, ii, 5/109, 1937
[2] kuzniecov, b. g.: einstein albert. evic, warszawa-moskwa 1959
[3] einstein, a.: historical and cultural perspectives. the centennial symposium in jerusalem, ed. by gerald holton, yehuda elkana, princeton univ. press, 1980
[4] einstein, a.: philosopher – scientist. ed. by paul arthur schilpp, harper torchbooks, science library, new york, 1959, vol. 1, vol. 2
[5] gerardin, l.: screenplays of future. morphological analysis – the method of creation. in: a guide to practical technological forecasting, eds. bright j. r. and schoeman m. e. f., prentice-hall, inc., englewood cliffs, new jersey, usa, 1973, pp. 507–522
[6] sielicki, a., jeleniewski, t.: the elements of the methodology of technical design. wnt, warszawa 1980 (in polish)
[7] swager, w. l.: perspective trees – method of creative application of prognoses. in: a guide to practical technological forecasting, eds. bright j. r. and shoeman m. e. f., prentice-hall, inc., englewood cliffs, pp. 204–234, new jersey, usa, 1973
[8] zwicky, f.: the morphological method of analysis and construction. courant, anniversary volume, intersciences publ., new york 1948
[9] zwicky, f.: morphology and nomenclature of jet engines. aeron. eng. review, june 1947
[10] zwicky, f.: morphology of propulsive power. monographs on morphological research, no. 1, society for morphological research, pasadena, california 1962
[11] prokopska, a.: morphological analysis in architectural design. “teka” komisji architektury i urbanistyki t. xxviii, polish academy of science (pan), kraków, 1997, pp. 185–195 (in polish)
[12] simon, h. a.: the science of the artificial. mit press, cambridge, ma, usa 1969
[13] simon, h. a.: the style of design. in: designing and systems. methodological problems, vol. 3, polish academy of science (pan), komitet naukoznawstwa, 1981, pp. 97–115, warszawa (in polish)
[14] simon, h. a.: formulating, finding out and untie of problems in design. in: design and systems, vol. xii, zagadnienia metodologiczne dyscyplin praktycznych, wrocław-warszawa-kraków-gdańsk-łódź, zakład narodowy imienia ossolinskich, polish academy of science (pan), 1990 (in polish)
[15] swager, w. l.: perspective trees – method of creative application of prognoses. in: a guide to practical technological forecasting, eds. bright j. r. and schoeman m. e. f., prentice-hall, inc., englewood cliffs, new jersey, usa, 1978, pp. 204–234

app. 1: lull's great art [8]
app. 2: ramon lull's table [8]
app. 3: a clear example of morphological analysis from the 17th century, in [8]

endnotes
1) the application of a systemic approach to architectural design may turn out to be difficult, as this approach requires precise knowledge which is not always available to the architect, since he acts in the sphere of intuition and professional know-how.
2) the so-called architect's design studio consists of know-that knowledge and know-how knowledge. the notion of an “architectural studio” is indispensable in architecture, and originates from the traditions of this profession, because architectural design originated from manufacture and has know-how in itself, i.e., knowledge which can be referred to as “i know, but i don't know how to say it”. this knowledge can be described, at least partly, by means of the language of concepts of the theory of system and design methodology.
3) see also: zwicky, f.: morphological astronomy. springer verlag, berlin 1957

prof. aleksandra prokopska, d.sc. arch., dept. of town planning and architecture, faculty of civil and environmental engineering, rzeszów university of technology, ul. poznańska 2, 35-084 rzeszów, poland

mixing suspensions in slender tanks
f. rieger, e. rzyski
industrial suspension mixing processes are carried out both in standard tanks (h/D = 1) and in tanks with height h/D > 1. when only one impeller is used in such slender tanks, it may be difficult to produce a suspension of the desired homogeneity. hence it may be necessary to install a larger number of impellers on the shaft. the aim of this study was to explain the mechanism of suspension formation in slender tanks (h/D = 2) with an increased number of impellers. on the basis of the height of the solid bed on the tank bottom, the position of the suspension–water interface, and the concentration profile of solid particles in the suspension (the standard deviation of the solid concentration), the operation of the impellers was estimated and conclusions were drawn on how, and at what distance from each other, to install them. the location of the upper, highest impeller appeared to be especially significant. on the basis of this study it is recommended to locate the upper impeller so that its distance from the free liquid surface is less than 0.8 D. it was found that such a position of the highest impeller is also advantageous from the energy point of view.
keywords: agitation, agitated tank, mixing, suspension.

1 introduction
in the chemical and food industry, the aim of mixing is often to obtain a solid suspension. in some cases, high homogeneity of the produced suspensions is also required [1]. for mixing suspensions, both standard tanks of h/D = 1 and tanks with higher h/D ratios are used. application of tanks of the latter type enables rational use of the working surface. it may be difficult to obtain the desired homogeneity of the suspension (particularly in slender tanks with only one impeller, or in the case of coarse-grained suspensions), so it may be necessary to install more impellers on the shaft. the aim of this study is to explain the process of producing suspensions in slender tanks (h = 2D) equipped with one or more impellers.
2 experimental
experiments were carried out in a cylindrical flat-bottomed (glass) tank of diameter D = 0.2 m, filled with a liquid to a height of h = D or h = 2D. in the tank there were four standard baffles, and pitched four-blade impellers (α = 45°) were used. the diameter of these impellers was d = D/3. measurements were made using one to five impellers, with the bottom impeller always placed at a distance of 0.5 d from the tank bottom. the distances ℓ between the impellers, when several were mounted, were 0.5 d, d or 1.5 d [2, 3]. water suspensions of glass ballotini of diameter d_p = 1.34 and 0.42 mm, in an amount corresponding to a volumetric solid concentration of 2.5 %, were used as model suspensions. the height of the layer of unsuspended particles (h_v) close to the wall and the level of the suspension-water interface (h_s) were determined visually. these quantities are shown in fig. 1. local concentrations of solid particles at different levels of the suspension were measured by conductometry [4]. to estimate the suspension homogeneity quantitatively on the basis of these measurements, the relative standard deviation σ was calculated. it was defined by the equation

$$\sigma = \frac{1}{c_m}\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(c_i - c_m\right)^2},\quad(1)$$

where n is the number of measuring points in the suspension.

fig. 1: schematic diagram of the tank

the ability of an impeller to form a suspension was estimated on the basis of a dimensionless criterion which determines the power required to introduce the solid particles into the suspension. this criterion is an extension of the dimensionless number proposed previously [5] and has the following form:

$$\pi_s = \mathrm{Po}\,\mathrm{Fr}'^{3/2}\left(\frac{d}{D}\right)^{7/3}\frac{D}{h}.\quad(2)$$

values of the power number po for multiple pitched four-blade turbines were calculated from the following relation, recommended in [6]:

$$\mathrm{Po} = \mathrm{Po}_1 + (j-1)\,\mathrm{Po}_s,\quad(3)$$

where po_1 is the power number of the bottom impeller and po_s are the power number values of the higher impellers. the value of the bottom impeller power number for a pitched four-blade turbine is po_1 = 1.36; the po_s values depend on ℓ/d and are listed in table 1.

tab. 1: values of power number po_s for the tested four-blade turbines
ℓ/d:   0.5    1.0    1.5
po_s:  0.71   0.82   0.88
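as a small illustration of how eqs. (1) and (3) are evaluated in practice, the following python sketch (added for this text, not part of the original study; the sample concentrations and the operating point are invented) computes the relative standard deviation of a measured concentration profile and the mixing power of a multi-impeller configuration:

```python
import math

def rel_std_deviation(c):
    """relative standard deviation of local solid concentrations, eq. (1)."""
    n = len(c)
    c_m = sum(c) / n
    return math.sqrt(sum((ci - c_m) ** 2 for ci in c) / n) / c_m

def power_number(j, po_1=1.36, po_s=0.71):
    """power number of j pitched four-blade turbines on one shaft, eq. (3);
    po_s = 0.71 is the table 1 value for l/d = 0.5."""
    return po_1 + (j - 1) * po_s

# local concentrations [vol. %] sampled at several heights (invented numbers)
print(f"sigma = {rel_std_deviation([2.9, 2.7, 2.4, 1.8, 0.6]):.2f}")

# mixing power p = po * rho_z * n^3 * d^5 for three impellers, l = 0.5 d
rho_z = 1030.0          # suspension density [kg/m^3], assumed
n_rot = 1000.0 / 60.0   # rotation frequency [1/s]
d = 0.2 / 3.0           # impeller diameter [m], d = D/3
print(f"p = {power_number(3) * rho_z * n_rot**3 * d**5:.1f} w")
```

the mixing power follows from the definition of the power number in the symbols list, p = po ρ_z n³ d⁵.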
3 results
the results obtained are presented graphically. fig. 2 shows the results of measurements for a standard tank with one impeller. fig. 2a illustrates the measured heights (h_v and h_s) as a function of the frequency of impeller rotation. at a rotation frequency n = 200 min⁻¹ solid particles begin to be introduced into the suspension. the rotation frequency at which all solid particles are separated from the bottom (the critical frequency of impeller rotation [3]) is 1000 min⁻¹. fig. 2b shows the volumetric concentration of the solid particles at different levels of liquid in the tank just at this frequency. though all solid particles are in the suspension, and the level of separation reaches the height of the liquid in the tank, there are differences in the concentration, and the suspension is not homogeneous. a further increase in the rotation frequency causes an increase in homogeneity, as follows from fig. 2c, which shows the relative standard deviation for different frequencies of rotation. a quick decrease in the relative standard deviation, which takes place between 900 and 1000 min⁻¹, corresponds to the disappearance of the solid particles from the tank bottom.

fig. 2: mixing a suspension of particles d_p = 1.34 mm (standard tank h = D with one impeller): a – dependence of heights h_v and h_s on the rotation frequency; b – solid concentration at different levels of liquid; c – dependence of standard deviation on the impeller rotational frequency
fig. 3: effect of increasing the number of impellers (particles d_p = 1.34 mm, tank h = D): a – values of rotational frequency of impellers; b – values of mixing power (a – one impeller; b – two impellers, ℓ = 0.5 d; c – two impellers, ℓ = d)

the application of a larger number of impellers on one shaft leads to a decrease in the critical frequency of rotation corresponding to the separation of all particles from the bottom. this is reflected in the graph presented in fig. 3a. although the number of impellers increases, the power needed to put them in motion does not change significantly (fig. 3b). however, in such a case the homogeneity increases. this follows from a comparison of the concentration distribution in the tank for rotation frequency n = 1000 min⁻¹ for the tested impellers, and from the behaviour of the relative standard deviation (figs. 4a and 4b). it follows from these figures that a smaller distance between the impellers (equal to 0.5 d) is more advantageous for homogeneity than a distance equal to d. in a standard tank with one impeller only, suspensions can be produced which, although non-homogeneous, have solid particles dispersed along the whole liquid height. in more slender tanks (h/D = 2) the separation level is much below the liquid height in the tank (solid particles reach only slightly above half the liquid height), and in most of the upper half almost pure liquid is found (cf. fig. 5). when two impellers are located on the shaft, the critical frequency of the impeller rotation is not much reduced and the homogeneity does not increase.
fig. 6 illustrates the dependence of the relative standard deviation on the rotation frequency for one and for two impellers placed at different distances from each other. however, we do not recommend such a solution, because the use of two impellers is connected with an increase in energy consumption. a clear increase in the homogeneity of the suspension is achieved when three impellers are used. fig. 7 illustrates the relative standard deviation for one, two and three impellers (at a distance of 1.5 d from each other). this homogeneity increase also follows from the distributions of solid concentration shown in figs. 8 and 9. in the upper part of the liquid in the tank with three impellers there is a suspension with a very low concentration of solids (c ≈ 0). this refers, however, to larger particles only (d_p = 1.34 mm), because when a suspension of fine particles (0.42 mm) is formed, the suspension-water interface reaches the surface of the liquid in the tank and the concentration distribution reveals a growing homogeneity of the suspension. the application of three impellers also brings the interface h_s almost up to the height of tank filling, as shown in fig. 10.

fig. 4: mixing the suspension using one or two impellers (particles 1.34 mm, tank h = D): a – concentration distribution at n = 1000 min⁻¹; b – relative standard deviation for different rotation frequencies
fig. 5: interface and height of the layer of solid particles (particles 1.34 mm, tank h = 2D, one impeller)
fig. 6: relative standard deviation of suspension concentration for one or two impellers (d_p = 1.34 mm, tank h = 2D)
fig. 7: relative standard deviation of suspension concentration for different numbers of impellers in the tank h = 2D (d_p = 1.34 mm, ℓ = 1.5 d)

to draw further conclusions, experiments were carried out with a larger number of impellers. a comparison of the concentration profiles of the suspension obtained using several impellers placed at a distance of ℓ = d from each other (for rotation frequency n = 1000 min⁻¹) is shown in fig. 11. it follows that a partial improvement of homogeneity is already achieved when three impellers are used; a significant increase in homogeneity occurs only when four impellers are applied. fig. 12 gives a comparison of the distribution of suspension concentration for the same rotation frequency (1000 min⁻¹) for impellers located at a distance of ℓ = 0.5 d from each other. in this case a distinct improvement in the concentration distribution of the suspension is obtained for the maximum number, i.e., five impellers. the relative standard deviations at different frequencies of impeller rotation, for different numbers of impellers and different spacings, are given in fig. 13. it follows from these data that the lowest values of the standard deviation (the highest homogeneity) can be achieved when using four impellers placed at a distance of ℓ = d, or three impellers at a distance of ℓ = 1.5 d.
hence, it may be concluded that the homogeneity of the obtained mixture depends more on the location of the highest impeller than on the number of impellers. the highest homogeneity was achieved when the highest impeller was at a distance of less than 0.8 D from the liquid surface. application of the system with the highest impeller in just this position is also advantageous from the point of view of energy consumption, as shown by the dependence of the standard deviation σ on the energy criterion π_s in fig. 14.

fig. 8: distribution of solid concentration for the impellers in fig. 7
fig. 9: comparison of concentration distribution for different solid particle sizes (three impellers, tank h = 2D, ℓ = 1.5 d; d_p = 0.42 mm and 1.34 mm)
fig. 10: interface and height of the layer of solid particles (d_p = 1.34 mm, tank h = 2D, three impellers)
fig. 11: distribution of suspension concentration for different numbers of impellers (d_p = 1.34 mm, tank h = 2D, ℓ = d)
fig. 12: distribution of suspension concentration for different numbers of impellers (d_p = 1.34 mm, tank h = 2D, ℓ = 0.5 d)

4 conclusions
in order to produce a suspension it is not necessary to use a large number of impellers on the shaft in a standard tank (h = D); however, a larger number of impellers results in increased homogeneity (especially for suspensions containing big particles). in more slender tanks (h = 2D) the state of solid suspension can be reached, but in the upper part of the tank there is only a continuous phase and there are practically no solid particles at all. the homogeneity of the produced suspensions can be increased significantly by placing a larger number of impellers on the shaft in such a way that the highest impeller operates in the upper part of the tank. on the basis of the results obtained in this study it is recommended to locate the upper impeller at such a level that its distance to the free surface of the liquid in the tank is smaller than 0.8 D.

5 symbols
c_i – solid concentration at a given point (volumetric)
c_m – mean concentration of the solid (volumetric)
D – tank diameter, m
d – impeller diameter, m
d_p – solid particle diameter, m
Fr' – modified froude number, Fr' = (n²d/g)(ρ/Δρ)
Fr'_cr – modified critical froude number, Fr'_cr = (n_cr²d/g)(ρ/Δρ)
g – acceleration of gravity, m/s²
h – height of liquid in the tank, m
h_s – height of the suspension up to the suspension-water interface, m
h_v – height of the particles resting on the bottom, m
j – number of impellers on one shaft
ℓ – distance between impellers, m
n – frequency of the impeller rotations, s⁻¹ or min⁻¹
n_cr – critical frequency of impeller rotations
P – mixing power, w
Po – power number, Po = P/(n³d⁵ρ_z)
Δρ – density difference, Δρ = ρ_s − ρ
ρ – liquid density, kg/m³
ρ_s – solid density, kg/m³
ρ_z – suspension density, kg/m³
σ – relative standard deviation of the concentration
π_s – criterion defined by eq. (2)
fig. 13: relative standard deviation of suspension concentration (particles d_p = 1.34 mm) in the tank h = 2D
fig. 14: comparison of standard deviation and energy criterion for different numbers of impellers placed in the tank h = 2D (particles d_p = 1.34 mm)

the investigations presented in this paper were in part financed by a research project of the ministry of education of the czech republic (project no. j04/98: 212200008).

references
[1] rieger, f., rzyski, e.: dobór optymalnego mieszadła do procesu mieszania zawiesin ciała stałego w cieczy. (selection of the optimum impeller for mixing of solid-liquid suspensions). inż. aparat. chem., 1998, vol. 37, no. 5, pp. 19–23.
[2] rieger, f., rzyski, e.: mieszanie zawiesin w zbiornikach o różnej smukłości i liczbie mieszadeł. (mixing of suspensions in tanks of different fineness ratios and numbers of impellers). zesz. nauk. polit. łódzkiej (nr 838), inż. chem. i proces., 2000, vol. 27, pp. 221–226.
[3] rieger, f., rzyski, e.: mieszanie zawiesin w zbiornikach z większą liczbą mieszadeł. (mixing of suspensions in tanks with an increased number of impellers). inż. chem. proc., 2001, vol. 22, no. 3e, pp. 1213–1218.
[4] bilek, p., rieger, f.: distribution of solid particles in a mixed vessel. collect. czech. chem. commun., 1990, vol. 55, pp. 2169–2181.
[5] rieger, f.: efficiency of agitators while mixing of suspensions. proceed. of vi polish seminar on mixing, kraków – zakopane, 1993, pp. 79–85.
[6] novák, v., rieger, f., marisko, v., mašín, l.: power consumption and homogenization effect of multiple impellers in mixing in tall vessels. congress chisa, prague, 1990.

prof. ing. františek rieger, drsc.
e-mail: rieger@fsid.cvut.cz
department of process engineering, czech technical university in prague, faculty of mechanical engineering, technická 4, 166 07 praha 6, czech republic

dr. ing. edward rzyski
e-mail: erzyski@wipos.p.lodz.pl
department of process equipment, łódź technical university, faculty of process and environmental engineering, ul. wólczańska 213/215, 90-924 łódź, poland

acta polytechnica 53(supplement):659–664, 2013, doi:10.14311/ap.2013.53.0659

disc atmospheres and winds in x-ray binaries
maria díaz trigo (a,∗), laurence boirin (b)
a – eso, karl-schwarzschild-strasse 2, d-85748 garching bei münchen, germany
b – observatoire astronomique de strasbourg, 11 rue de l'université, f-67000 strasbourg, france
∗ corresponding author: mdiaztri@eso.org

abstract. we review the current status of studies of disc atmospheres and winds in low mass x-ray binaries. we discuss the possible wind launching mechanisms and compare the predictions of the models with the existent observations.
we conclude that a combination of thermal and radiative pressure (the latter being relevant at high luminosities) can explain the current observations of atmospheres and winds in both neutron star and black hole binaries. moreover, these winds and atmospheres could contribute significantly to the broad iron emission line observed in these systems.

keywords: x-rays: binaries, accretion, accretion disks, stars: neutron, black hole physics, spectroscopy.

1. introduction
in the last decade we have witnessed a wealth of discoveries of narrow absorption features in low-mass x-ray binaries (lmxbs). they have been identified with resonant absorption from fe xxv and fe xxvi and other abundant ions and, in a number of systems, are blueshifted, indicating outflowing plasmas. these features were first detected with asca from the microquasars gro j1655−40 [34, 38] and grs 1915+105 [16, 18]. the launch of the x-ray observatories chandra, xmm-newton and suzaku, with the ability to obtain medium to high resolution spectra, opened a new era in studies of plasmas, showing that absorption features are common to black hole (bh) and neutron star (ns) binaries and are probably associated with the accreting nature of those systems (e.g. [2–7, 11, 15, 19, 20, 22, 29, 30, 35, 36]). the overwhelming presence of absorption features in systems that are known to be at high inclination (due to the existence of absorption dips in their light curves), and the modelling of such features with self-consistent photoionised plasma codes, led boirin et al. [4] and díaz trigo et al. [5] to conclude that absorbing plasmas are probably ubiquitous in all x-ray binaries, but are only detected in high-inclination systems because the plasma has a flat, equatorial geometry above the disc. studies of photoionised plasmas are important for a number of reasons. firstly, since such plasmas seem to be ubiquitous in lmxbs, we can learn about accretion processes in these systems by, e.g., mapping changes in the disc via changes in the disc atmosphere (or photoionised plasma above the disc). secondly, since in a fraction of these systems the plasmas are outflowing, the whole energy budget of the systems could be significantly altered by the mass expelled in the wind, with an effect on the dynamics of the accretion flow. thirdly, feedback to the environment could be relevant if the outflows carry a significant amount of kinetic energy and momentum. finally, studies of winds in lmxbs may provide important information for understanding the outflows present in other accretion-powered objects. made up of a normal star and a collapsed star, lmxbs bridge the gap between young stellar objects and supermassive black holes and may hold the answers to the driving mechanism of winds. lmxbs are unique in that they show both general relativity effects, when the collapsed star is a bh (typically of 5 to 10 solar masses), and effects due to the presence of magnetised stars, when the collapsed star is a ns. thus, studies of winds in lmxbs allow us to isolate the role of the compact object in driving the winds. in this work, we review the current observational state and theoretical understanding of disc atmospheres and winds in lmxbs. as a diagnostic tool for this study we focus on the narrow absorption lines, with or without blueshifts. we exclude from our study the winds observed in high mass x-ray binaries, because in these systems the wind from the massive companion star contributes significantly to, or even dominates, the wind component.
we discuss the scattered component of the wind, its observational evidence and the implications for spin measurements of stellar-mass bhs. finally, we briefly address the potential connection between the presence of compact jets and winds in lmxbs.

2. observational properties and spectral modelling of photoionised plasmas
to date, narrow absorption features have been discovered in ten ns and seven bh lmxbs. the lines are consistent with a zero shift for seven systems and show significant outflow velocities consistent with a wind in ten systems. however, we note that for two of the systems not showing outflows, the constraints on velocity shifts are weak, ≲ 1000 km/s, since observations with high resolution spectral gratings are not available [4, 10]. detailed modelling of the absorption features is often performed for each line individually, followed by a comparison of the line parameters with simulations of a photoionised plasma with codes such as xstar [13] or cloudy [9]. a second method consists in modelling all the lines and the continuum simultaneously with self-consistent photoionisation codes such as warmabs in xspec (arnaud 1996) or xabs in spex (kaastra et al. 1996). these codes model the absorption due to a photoionised plasma in the line of sight while taking all relevant ions into account, including those having small cross-sections, which can contribute significantly to the absorption when combined. the relative column densities of the ions are coupled through a photoionisation model. during the fitting process, warmabs/xabs calculate spectra using stored level populations pre-calculated with xstar or cloudy for a given ionising continuum. the main parameters of warmabs/xabs are n_h, ξ, σ_v, and v, representing the column density of the absorber, the ionisation parameter, the turbulent velocity broadening, and the average systematic velocity shift of the absorber (negative values indicate blueshifts). the components warmabs and xabs differ in one important aspect: xabs includes compton scattering in the plasma self-consistently, as opposed to warmabs. electron scattering can affect the whole spectrum significantly when column densities are high. for example, the photoionised plasma present during dips of 4u 1323−62 (n_h(xabs) > 10²³ cm⁻²) decreases the flux by more than 50 % below 10 kev and by ∼ 25 % in the 20 ÷ 50 kev band [4]. therefore, electron scattering should be taken into account in parallel with the use of warmabs. the systems showing narrow absorption features are predominantly at high inclination, as first noted for ns [4, 5] and recently for bh [26] lmxbs. for nss, modelling of the absorption lines with photoionised plasmas yields column densities between 3.5 and 17.2 × 10²² cm⁻² and log(ξ) ∼ 2.5 ÷ 4.5. the lines are blueshifted in 30 % of the systems, with velocities between ∼ 400 and 3000 km/s. for bh binaries, the photoionised plasma has column densities between 0.3 × 10²⁰ and ∼ 6 × 10²³ cm⁻² and log(ξ) ∼ 1.8 ÷ 6. outflow velocities range between ∼ 100 and 1300 km/s and are present in 85 % of the systems. in one case an outflow velocity of 9000 ÷ 13000 km/s has been claimed, but the significance of the features is low [15]. we note that as observations become sensitive to more lines, the need for more than one ionised plasma increases, due to the existence of lines with significantly different outflow velocities and the co-existence of lines that require different ionisation equilibria (e.g. [14, 39]). for systems showing winds, and considering a purely photoionised plasma, the mass outflow rate has been estimated to be of the order of the mass accretion rate (e.g. [26, 35, 36]), indicating that the winds are important for the dynamics of the system.
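the order of magnitude of such outflow-rate estimates can be reproduced with a short sketch. the formula below eliminates the (poorly known) density via the ionisation parameter, giving ṁ_wind ≈ 4π f_Ω μ m_p v l/ξ; this approximation and all the numerical inputs are assumptions of this illustration, not values taken from the text:

```python
import math

M_P = 1.6726e-24   # proton mass [g]
MU = 1.23          # mean atomic weight per proton, solar composition (assumed)

def wind_mass_outflow_rate(L_ion, log_xi, v_out_km_s, f_omega=0.5):
    """order-of-magnitude wind mass outflow rate [g/s].

    L_ion      : ionising luminosity [erg/s]
    log_xi     : log10 of the ionisation parameter xi = L/(n r^2) [erg cm/s]
    v_out_km_s : outflow velocity [km/s]
    f_omega    : covering fraction omega/4pi (assumed)
    """
    xi = 10.0 ** log_xi
    v = v_out_km_s * 1.0e5                  # km/s -> cm/s
    # density eliminated through n = L/(xi r^2), so the radius drops out:
    return 4.0 * math.pi * f_omega * MU * M_P * v * L_ion / xi

# illustrative numbers, chosen within the ranges quoted above
mdot_w = wind_mass_outflow_rate(L_ion=1e38, log_xi=4.0, v_out_km_s=1000.0)
mdot_acc = 1e18    # g/s, a rough accretion rate for a luminous lmxb (assumed)
print(f"wind outflow ~ {mdot_w:.1e} g/s, ~{mdot_w / mdot_acc:.0f} x accretion rate")
```

for plausible parameters the wind outflow rate indeed comes out comparable to, or larger than, the accretion rate, which is the point made above.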
3. the wind launching mechanism
disc winds from an accretion disc can be launched via thermal, radiative and/or magnetic mechanisms. the launching mechanism may differ significantly for different types of accreting sources, and we expect characteristics of the winds such as geometry, density or outflow velocity to vary depending on the mechanism or mechanisms involved. this will ultimately determine how much matter can be expelled out of the system, and it is therefore of utmost importance for assessing the dynamical relevance of the winds, both for the systems themselves and for their environment. uv radiation can be effective in driving a wind because the number of lines is large in this energy range, increasing the probability of transferring energy from uv continuum photons to matter. in x-ray binaries, however, x-rays tend to photoionise the material and decrease the concentration of ions capable of uv line driving. therefore, line-driven winds are not expected to be relevant unless shielding from the x-ray irradiation takes place in some uv-emitting part of the disc [27]. in contrast, the same x-rays that prevent uv line opacity can heat low density gas to a temperature ∼ 10⁷ k and potentially drive thermal winds. assuming the accretion disc is concave and exposed to x-rays from the central source, its upper layers are expected to puff up and expand into an atmosphere, corona or wind [11]. the upper boundary of the atmosphere is the compton temperature corona, which is less dense and hotter than the underlying atmosphere. the evaporated photoionised plasma will remain bound to the disc as an atmosphere/corona or be emitted as a thermal wind, depending on whether the thermal velocity exceeds the local escape velocity [1, 27, 37]. importantly, the radial extent of the corona is independent of luminosity and determined only by the mass of the compact object and the compton temperature [1, 37]. woods et al. [37] determined that a wind would be launched by thermal pressure at radii larger than 0.25 r_ic (where r_ic denotes the compton radius, the distance at which the escape velocity equals the isothermal sound speed at the compton temperature t_ic). for t_ic ∼ 1.3 × 10⁷ k, as expected in lmxbs, 0.25 r_ic corresponds to a radius of ∼ 1.9 × 10¹⁰ m cm, where m is the mass of the compact object in units of solar masses. however, they also found that even above such a radius the wind could be gravity-inhibited if the luminosity were below twice a critical luminosity defined as l_cr ∼ 2.88 × 10⁻² t_ic8^(−1/2) l_edd (see their eq. 4.4), where l_edd is the eddington luminosity and t_ic8 is the compton temperature in units of 10⁸ k. therefore, for t_ic ∼ 1.3 × 10⁷ k the wind could be gravitationally inhibited for luminosities below 0.16 l_edd (see their fig. 17).
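before applying these criteria to individual sources, they can be condensed into a short python sketch; the physical constants, the μ ≈ 0.6 mean molecular weight and the default ns mass are assumptions of this illustration, while the two thresholds are the ones quoted from woods et al. [37]:

```python
def compton_radius_cm(m_msun, t_ic_k, mu=0.6):
    """compton radius r_ic = g m mu m_p / (k_b t_ic) [cm]."""
    G, MSUN, M_P, K_B = 6.674e-8, 1.989e33, 1.673e-24, 1.381e-16
    return G * m_msun * MSUN * mu * M_P / (K_B * t_ic_k)

def critical_luminosity(m_msun, t_ic_k):
    """l_cr = 2.88e-2 * t_ic8**(-1/2) * l_edd (eq. 4.4 of woods et al. [37])."""
    l_edd = 1.26e38 * m_msun        # erg/s, hydrogen eddington luminosity
    return 2.88e-2 * (t_ic_k / 1e8) ** -0.5 * l_edd

def thermal_wind_expected(r_cm, lum, m_msun=1.4, t_ic_k=1.3e7):
    """crude classification following the criteria quoted in the text."""
    if r_cm < 0.25 * compton_radius_cm(m_msun, t_ic_k):
        return "bound atmosphere/corona (inside 0.25 r_ic)"
    if lum < 2.0 * critical_luminosity(m_msun, t_ic_k):
        return "wind gravity-inhibited (l < 2 l_cr)"
    return "thermal wind expected"

# for a 1.4 msun ns with t_ic = 1.3e7 k, 0.25 r_ic ~ 2.6e10 cm and
# 2 l_cr ~ 3e37 erg/s ~ 0.16 l_edd, matching the numbers quoted in the text
print(thermal_wind_expected(r_cm=3e10, lum=5e37))
```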
consequently, for ns systems, at a disc radius of ∼ 2.6 × 10¹⁰ cm an isothermal wind will develop for luminosities above ∼ 3 × 10³⁷ erg s⁻¹, while such a wind could be inhibited by gravity at lower luminosities. for larger radii, a steadily heated free wind could develop already at luminosities below ∼ 3 × 10³⁷ erg s⁻¹. interestingly, proga & kallman [27] found that when they included the radiation force, although the line force is dynamically unable to drive a wind, the radiation force due to electrons can be important for very luminous systems: it lowers the effective gravity and subsequently the escape velocity, and allows a hot, robust disc wind to be produced already at ∼ 0.01 r_ic, well inside the compton radius and the previous estimates by woods et al. [37]. besides the thermal/radiative pressure mechanism described above, in the presence of a strong toroidal magnetic field, magnetic pressure can give rise to a self-starting wind. calculations of magnetic winds are still scarce and have to make an assumption about the unknown magnetic field configuration. therefore, it is common practice to determine first whether a thermal launching mechanism is plausible for a given source, and only if this mechanism is ruled out is the magnetic case considered. we note that when magnetic pressure is considered, a wind could be launched well inside the compton radius, and therefore we can use the location of the wind to discriminate between different launching mechanisms.

3.1. comparison of wind models with the current observations of atmospheres and winds in lmxbs
figure 1 shows the estimated luminosity and location of the photoionised plasma for the ns lmxbs studied with high-resolution gratings or ccd observations. we performed this comparison for nss rather than for bhs, since for the latter the critical luminosity depends on the often uncertain estimates of the bh mass. for nss which do not show outflows, we have used the estimations of díaz trigo et al. [5] for the luminosity of the system and the location of the plasma (see their tabs. 1 and 13), except for cir x−1, for which we have used the estimates of schulz et al. [29]. uncertainties in the luminosity are derived from the uncertainty in the distance given in tab. 13 of díaz trigo et al. [5], except for cir x−1, for which we use a lower (upper) limit of 6 (10.5) kpc [12, 29]. tight constraints on the existence of an outflow were inferred from grating observations for all the sources shown in fig. 1 except for 4u 1323−62 [4]. for nss showing outflows we plot the plasma location and luminosity inferred from grating observations [22, 28, 35]. the errors in the luminosity are estimated from the error in the distance; for the radius, a default error of 50 % is adopted whenever the error is not provided in the discovery papers.
figure 1. ns lmxbs for which a photoionised absorber has been detected. filled and open symbols represent systems with or without an outflow, respectively. the vertical lines correspond to 0.25 r_ic (dotted) and 0.01 r_ic (dashed). the horizontal dotted line corresponds to twice the critical luminosity for a ns and marks the boundary between a non-isothermal (bottom) and an isothermal (top) corona. the lower right triangle delimited by the dotted lines shows the region where the wind will be gravity-inhibited. (sources plotted: 4u 1254−690, xb 1916−053, exo 0748−676, xb 1323−619, mxb 1659−298, x 1624−490, cir x−1, igr j17480−2446 and gx 13+1.)

the dotted lines have been derived from the lines in fig. 17 of [37], and set the boundaries for the existence of a wind for ns lmxbs with t_ic ∼ 1.3 × 10⁷ k. the dashed line represents the transition from the corona into a wind when the radiation force from electron scattering is included (but note that such a limit is applicable only to high luminosities [27]). in short, all the sources that do not show winds (represented with open triangles) are consistent with not showing them if the thermal mechanism is in place, i.e. the location of the photoionised plasma is consistent with the location of the radially bound atmosphere/corona. moreover, even if the plasma were located at a distance larger than 0.25 r_ic, their luminosity is below the critical luminosity needed to overcome gravity [37]. the only exception could be x 1624−490, which has a sufficiently large radius and luminosity to launch a wind and for which the average line velocity is consistent with a static atmosphere. note, however, that xiang et al. [39] find a better fit when two plasmas are used, one of which is outflowing. we next look at the sources showing winds, which are represented in fig. 1 with filled squares. it is evident that such sources show luminosities which are significantly larger than for the sources not showing winds, and which are just above twice the critical luminosity derived to launch a wind [37]. in the case of cir x−1, the launching radius obtained for the wind observed in the high luminosity observations corresponds to ∼ 0.1 r_ic. this radius is still plausible for launching winds once the radiation pressure due to electron scattering is considered (see [27]), especially at this high luminosity. in summary, fig. 1 indicates that a thermal mechanism explains satisfactorily the presence or absence of winds in ns lmxbs.
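the plasma locations plotted in fig. 1 follow from the definition of the ionisation parameter, ξ = l/(n r²), inverted for the radius. a minimal sketch of that inversion (all numerical inputs are assumed for illustration) also shows why the density uncertainty dominates the error budget:

```python
import math

def plasma_radius_cm(L_ion, log_xi, n_e):
    """location of a photoionised plasma from xi = L/(n r^2):
    r = sqrt(L/(n xi)) [cm]."""
    return math.sqrt(L_ion / (n_e * 10.0 ** log_xi))

# a factor-100 uncertainty in density maps to a factor-10 uncertainty in
# radius, which is why generous errors on the location are adopted
for n_e in (1e12, 1e13, 1e14):     # plasma densities [cm^-3], assumed
    r = plasma_radius_cm(L_ion=1e37, log_xi=3.5, n_e=n_e)
    print(f"n_e = {n_e:.0e} cm^-3  ->  r = {r:.1e} cm")
```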
we conclude that the presence or absence of winds in current observations of ns lmxbs can be explained by thermal/radiative pressure in the disc, when all the uncertainties are taken into account. for bh lmxbs, the uncertainty in the mass of the bh increases the uncertainties in the calculation of the compton radius and the critical luminosity and make it more difficult to discriminate between different launching mechanisms. to date there is only one claim of a magnetic driven wind in a bh lmxb [20], which has been controversial (netzer et al. 2006). gro j1655−40 was extensively monitored by x-ray observatories during its 2005 outburst. xmm-newton observations showed a highly ionised wind consistent with a thermal mechanism [6]. however, the wind had strongly evolved at the time of the chandra observation performed five days after the last xmm-newton observation. more than 70 lines were present and revealed a very dense wind. the most exhaustive study of this spectrum was performed by kallman et al. [14], who derived a radius of 7 × 109 cm for the location of the plasma and concluded that the wind could not be thermally launched. however, given the uncertainty in the density of the plasma (1013 ÷ 1015 cm−3) and the bolometric luminosity (since obscuration of the continuum due to compton scattering in the wind was not considered), we conclude that it is too early to discard a thermal launching mechanism for this wind. different compton temperatures, obtained by fitting plausible continuum models to broadband spectra, should also be considered when calculating the allowed radii. in summary, even if several mechanisms could be in place at the same time, we find that magnetic pressure is not required to explain the existent observations of winds. 4. scattering in the wind the simultaneous presence of narrow absorption features, indicating the presence of a warm absorber, and of a broad iron line has been often observed in high inclination, dipping, lmxbs (e.g. [4, 7, 8, 17, 25, 30]). due to the symmetry and the relatively modest size of the broad iron line, its origin has been commonly attributed to a combination of line blending, doppler broadening and compton scattering in an accretion disc corona or hot atmosphere. recently, xmm-newton observations of the ns lmxb gx 13+1 have shown a correlation of the variations of the broad iron line and the state (column density and degree of ionisation) of the warm absorber, indicating that absorption and emission take place most likely in the same plasma [8]. these observations are in agreement with models of photoionised winds by sim et al. [31, 32]. in these models absorption features are imprinted in the spectra when we look through the wind, and the radiation scattered in the outflow produces the broad iron line emission and other distinct features such as a compton hump, which will be more or less visible depending on their contribution with respect to the incident radiation. at low, . 45 °, inclinations the wind does not obscure the x-ray source and consequently, absorption features are not observed. however, a significant component of scattered/reprocessed radiation in the outflow is present in addition to the direct emission and responsible for the broad iron line emission. the emission is predominantly formed by fe xxv and fe xxvi, since the part of the wind seen at low inclinations is the highly ionised surface of the outflow. 
given that warm absorbers and/or winds seem to be ubiquitous in lmxbs [4, 5, 26], and that the scattered component of the wind should be visible at all inclinations (e.g. [31]), it follows that broad line emission should be observable in a majority of lmxbs. this is consistent with systematic analyses of broad iron emission lines in ns lmxbs (see [24, and references therein]), where lines are found in 50 ÷ 85 % of the sources and are highly ionised. a systematic study of broad iron lines in bh lmxbs is challenging due to the transient nature of the sources. however, given the similarity of the winds in ns and bh lmxbs, it is not unexpected that at least a fraction of the broad iron lines observed in bh lmxbs are produced in the wind and not by reflection at the inner disc, as is often assumed. therefore, measurements of the bh spin based on the breadth of the iron emission line will have an uncertainty associated with the contribution of the wind to the line, which may be comparable to, or even larger than, the component arising from reflection at the inner disc. in conclusion, our ability to probe the physical conditions at the inner disc depends on our understanding of accretion disc physics, including atmospheres and winds.

5. a wind-jet connection?
disc winds have been observed in the high/soft, thermal-dominated state of bh transients, when the jet emission is absent. in contrast, in observations of the low/hard state of the same transients, with typical jet emission, winds were excluded, indicating that there may be an anticorrelation between winds and jets. in particular, blueshifted absorption lines were observed in the soft state spectra of gro j1655−40 [6, 20], grs 1915+105 [36], gx 339−4 [19], 4u 1630−472 [17] and h 1743−322 [21]. in contrast, observations of the hard state of gro j1655−40 [33] and h 1743−322 [21] excluded the presence of such lines down to equivalent widths of 20 and 3 ev, respectively. the absence of strong winds in the hard state of bh lmxbs has recently been confirmed by a systematic search for absorption lines in existent x-ray observations [26]. based on grs 1915+105 observations, neilsen & lee [23] proposed that a possible explanation for the observed anticorrelation between jets and winds is that the wind observed during the soft state carries enough mass away from the disc to halt the flow of matter into the radio jet. however, there is one detection of a weak wind in the "hard" (or "c") state of grs 1915+105 [18]. therefore, it could also be possible that a high amount of matter is carried away in both soft and hard states, but that due to the higher ionisation in the hard state the ions become fully stripped and the wind becomes transparent and thus "invisible". although significant changes in the ionisation and column density of the wind with x-ray luminosity have already been observed during the soft states of bh outbursts (e.g. [6, 17]), a systematic study of the evolution of the compton temperature and the wind throughout a whole outburst has yet to be made, and it is now of utmost importance to determine the reason for the absence of wind detections in the hard state.

6. discussion and conclusions
the overwhelming presence of hot atmospheres and winds in high-inclination lmxbs shows that they are most likely situated in a flat geometry around the disc and are therefore ubiquitous in lmxbs.
modelling of the absorption features with photoionised plasmas is consistent with a scenario in which the disc is heated by irradiation from the central object and an atmosphere and corona are formed, which can be observed as an outflowing wind under certain conditions of distance to the central object and luminosity. the reprocessed/scattered component of the wind has scarcely been studied. the high plasma column densities observed in some objects indicate that opacity in the wind could be relevant and scattering in the plasma a major source of broad line emission [31, and references therein]. in this respect, it is now of utmost importance to include the reprocessed component in wind models, to compare them with the observations, and to be able to quantify their significance for bh spin measurements based on the broad iron emission line. finally, the appearance or disappearance of winds could be triggered by changes in the structure of the accretion flow. sensitive simultaneous observations in x-ray and radio wavelengths of bh binaries throughout their outbursts, together with an estimation of the expected winds at each state, could soon shed some light on this topic. future observations of winds with high resolution calorimeters, like the calorimeter onboard astro-h, will allow us to determine with unprecedented accuracy the sites of wind production, thus further constraining the dynamics of accretion flows.

references
[1] begelman, m., mckee, c. & shields, g.: 1983, apj, 271, 70
[2] boirin, l. & parmar, a.: 2003, a&a, 407, 1079
[3] boirin, l., parmar, a. et al.: 2004, a&a, 418, 1061
[4] boirin, l., méndez, m., díaz trigo, m. et al.: 2005, a&a, 436, 195
[5] díaz trigo, m., parmar, a., boirin, l. et al.: 2006, a&a, 445, 179
[6] díaz trigo, m., parmar, a., miller, j. et al.: 2007, a&a, 462, 657
[7] díaz trigo, m., parmar, a., boirin, l. et al.: 2009, a&a, 493, 145
[8] díaz trigo, m., sidoli, l. et al.: 2012, a&a, 543, a50
[9] ferland, g., korista, k. et al.: 1998, pasp, 110, 761
[10] hyodo, y., ueda, y. et al.: 2009, pasj, 61, s99
[11] jimenez-garate, m., raymond, j. & liedahl, d.: 2002, apj, 581, 1297
[12] jonker, p. & nelemans, g.: 2004, mnras, 354, 355
[13] kallman, t. & bautista, m.: 2001, apjs, 133, 221
[14] kallman, t., bautista, m., goriely, s. et al.: 2009, apj, 701, 865
[15] king, a., miller, j. et al.: 2012, apjl, 746, l20
[16] kotani, t., ebisawa, k. et al.: 2000, apj, 539, 413
[17] kubota, a., dotani, t. et al.: 2007, pasj, 59s, 185
[18] lee, j., reynolds, c. et al.: 2002, apj, 567, 1102
[19] miller, j., raymond, j. et al.: 2004, apj, 601, 450
[20] miller, j., raymond, j. et al.: 2006a, nature, 441, 953
[21] miller, j., raymond, j. et al.: 2006b, apj, 646, 394
[22] miller, j., maitra, d. et al.: 2011, apjl, 731, l7
[23] neilsen, j. & lee, j.: 2009, nature, 458, 481
[24] ng, c., díaz trigo, m. et al.: 2010, a&a, 522, a96
[25] parmar, a., oosterbroek, t. et al.: 2002, a&a, 386, 910
[26] ponti, g., fender, r. p. et al.: 2012, mnras, 422, 11
[27] proga, d. & kallman, t.: 2002, apj, 565, 455
[28] schulz, n. & brandt, w.: 2002, apj, 572, 971
[29] schulz, n., kallman, t. et al.: 2008, apj, 672, 1091
[30] sidoli, l., oosterbroek, t. et al.: 2001, a&a, 379, 540
[31] sim, s., miller, l. et al.: 2010a, mnras, 404, 1369
[32] sim, s., proga, d. et al.: 2010b, mnras, 408, 1396
[33] takahashi, h., fukazawa, y. et al.: 2008, pasj, 60, s69
[34] ueda, y., inoue, h. et al.: 1998, apj, 492, 782
[35] ueda, y., murakami, h. et al.: 2004, apj, 609, 325
[36] ueda, y., yamaoka, k. & remillard, r.: 2009, apj, 695, 888
[37] woods, d., klein, r. et al.: 1996, apj, 461, 767
[38] yamaoka, k., ueda, y. et al.: 2001, pasj, 53, 179
[39] xiang, j., lee, j. c. et al.: 2009, apj, 701, 984

acta polytechnica 55(5):352–358, 2015, doi:10.14311/ap.2015.55.0352

unsteady flow of thixotropic collagen substance in pipes
rudolf zitny (a), ales landfeld (b), jan skocilas (a), jaromir stancl (a,∗), vlastimil flegl (c), milan houska (b)
a – czech technical university in prague, faculty of mechanical engineering, department of process engineering, technicka 4, prague, czech republic
b – food research institute in prague, radiova 7, prague, czech republic
c – devro ltd., vichovska 830, jilemnice, czech republic
∗ corresponding author: jaromir.stancl@fs.cvut.cz

abstract. unsteady flow of a thixotropic liquid in pipes is solved by 1d and 2d numerical methods using the same constitutive equation; the only difference is in the radial diffusion of the structural parameter. the comparison shows that neglecting the diffusion of the structural parameter implies a much stronger effect of thixotropy. the models are applied to an analysis of the observed hysteresis of the hydraulic characteristic of collagen.

keywords: thixotropy, collagen, unsteady flow in pipe, hydraulic characteristics.

1. introduction
thixotropy is a property of fluids having viscous characteristics that depend on their deformation history; see the review articles by mewis [1] and barnes [2]. approximate solutions of steady flow of thixotropic liquids in a pipe, based upon constant power-law radial velocity profiles, were presented by sestak and zitny [3] and by kemblowski and petera [4]. a steady flow with a variable radial velocity profile was considered in the fem solution of sestak et al. [5], using houska's constitutive model [6], which takes yield stress into account. solutions describing unsteady flows of thixotropic fluids are not so frequent, and are concerned mostly with the specific problem of restarting waxy crude oil in pipelines, starting from the crude approximation by zitny [7] or the simulations presented by de oliveira et al. [8], up to the most complex cfd methods published in a series of papers by wachs et al. [9] and negrao et al. [10]. these cfd solutions for compressible fluids described by houska's model of thixotropy are based upon techniques developed by vinay et al. [11, 12] for compressible bingham fluids. the methods can be characterized as follows: 2d orthogonal staggered grid, augmented lagrangian multiplier method for pressures, and uzawa algorithm applied to the resulting saddle point problem. all the previous models assume purely convective transport of the structural parameter λ. the exception is a generalization of moore's model of thixotropy with a molecular diffusion term, suggested by billingham and fergusson [13]. the 1d and 2d models presented in this paper introduce a generalization of houska's model by adding a diffusion term into the transport equation for λ.
the 1d model assumes infinite radial diffusion and a uniform radial profile of λ, while the 2d model assumes zero diffusion and a strong radial variation of λ. the aim of the following analysis is to compare the pressure drop predictions corresponding to the 1d and 2d extremes.

2. motivation
the problem of unsteady flow of a thixotropic liquid in a pipe was initiated by industry and by the demand for on-line recording of the rheological properties of the collagenous material used for the extrusion of collagen casings. the time-independent herschel-bulkley rheological model was capable of describing rheograms of the collagenous material for a wide range of deformation rates; however, it was not able to describe a slight observed hysteresis of the hydraulic characteristic (pressure drop versus flow rate) at transient flows. the thixotropic properties of collagen were identified as a possible reason. the suggested thixotropic models can be used not only for modeling the hysteresis, but also for the simulation of production lines (collagenous materials are used for the extrusion of vascular grafts in biomedicine (kumar [14]), or as sausage casings in the food industry, deiber et al. [15]). nevertheless, information on the rheological properties of collagen is scarce due to problems with the application of rotational rheometers. usually only rheograms measured by capillary rheometers are available, giving no information about thixotropy. the only exception is a paper by deiber et al. [15], reporting thixotropy of collagen suspensions at low concentration (0.5–3.7 %), using a rotational rheometer with a cone and plate configuration.

3. methods
this paper is devoted to numerical modelling of unsteady flow of thixotropic liquids in circular pipes. as a possible application of the suggested methods, an analysis of the hysteresis of the hydraulic characteristics observed at a production line of collagen casings will be presented.

figure 1. setup of the experimental pipeline.

3.1. experimental setup and procedure
the experiments were realized at the production extrusion line of devro jilemnice, shown schematically in fig. 1. the processed material (collagenous matter: approximately 7 % mass fraction of bovine collagen and water) is delivered from a storage tank by an aqm 57-20/s003 positive displacement pump, through a relatively long pipe (l = 4.287 m, r = 0.0106 m), to an extruder. the flow rate is controlled by varying the pumping speed, and the pressure drops (changes) are recorded by dmp 331p (bd sensors) pressure transducers. special experiments, characterized by a gradual increase and decrease of the mass flow rate at constant temperature and constant composition of the tested matter, were carried out during a break of production in the extrusion processing line, with the goal of identifying the parameters of the herschel-bulkley rheological model. most details concerning the technical realization and the composition of the processed collagenous material are confidential, and only limited data files (time; flow rate; pressure drop) were released. nevertheless, these experiments serve as an illustration and as a test of the possibilities of the suggested methods in real applications. the processed material, at a relatively high concentration of bovine collagen (7 %), is a viscoelastic paste that looks like "silly putty".
our preliminary experiments in the laboratory confirmed linear viscoelasticity at small deformations (measured by the cone and plate oscillating rheometer haake, giving moduli g′ of 5 kpa and g′′ of 2 kpa). experiments with rotational rheometers at constant deformation rates failed due to the very high consistency. therefore, no independent evaluation of the thixotropic time constants was realized. laboratory tests of several collagen samples, carried out using size exclusion chromatography and uv detection, identified the existence of very long collagen fibres (19 % of the longest fraction, 780 kda).

3.2. constitutive equation
houska's thixotropic model [6] is a generalization of the routinely used herschel-bulkley constitutive equation, simplified for the special case of simple shear flow to the scalar equation for the shear stress τ:

$$\tau = \tau_y + \Delta\tau_y\,\lambda + (k+\Delta k\,\lambda)\,\dot\gamma^{\,n},\qquad \dot\gamma = \frac{\partial u}{\partial r}.\quad(1)$$

the structural parameter λ = 1 describes a fully recovered internal structure, while λ = 0 corresponds to a completely destroyed structure. time changes of the structural parameter λ are described by the transport equation

$$\frac{d\lambda}{dt} = d_\lambda\,\nabla^2\lambda + a(1-\lambda) - b\,\lambda\,\dot\gamma^{\,m}.\quad(2)$$

the diffusion term on the right side is usually neglected, and only the terms for regeneration (a) and structure destruction (b) are considered.

3.3. radially independent structural parameter (1d model)
the assumption that the structural parameter λ depends only on the axial coordinate and on time, λ(t, x), represents a great simplification. this case is probably more relevant for polymeric chains with clusters of a size comparable with the pipe radius (the tested collagen is characterized by extremely long chains with a molecular mass of 780 kda).

3.3.1. flow of a herschel-bulkley fluid in a pipe
if λ is independent of the radial coordinate, then the effective consistency k* and the yield stress τ*_y depend only upon time and the axial coordinate:

$$k^\star = k + \lambda\,\Delta k;\qquad \tau_y^\star = \tau_y + \lambda\,\Delta\tau_y.\quad(3)$$

therefore, it is possible to use the rmw (rabinowitsch, mooney, weissenberg) equation for the volumetric flow rate expressed as a function of the wall shear stress (or of the pressure gradient):

$$\frac{\dot V}{\pi R^3} = \left(\frac{\tau_w}{k^\star}\right)^{1/n}\left(\frac{n}{3n+1}\left(1-\frac{\tau_y^\star}{\tau_w}\right)^{\frac{3n+1}{n}} + \frac{2n}{2n+1}\,\frac{\tau_y^\star}{\tau_w}\left(1-\frac{\tau_y^\star}{\tau_w}\right)^{\frac{2n+1}{n}} + \frac{n}{n+1}\left(\frac{\tau_y^\star}{\tau_w}\right)^2\left(1-\frac{\tau_y^\star}{\tau_w}\right)^{\frac{n+1}{n}}\right).\quad(4)$$

because we need to calculate the wall shear stress for a given flow rate, it is necessary to iterate the inverse relationship

$$\tau_w = \tau_y^\star + k^\star\left(\frac{\dot V}{\kappa(\tau_w)\,\pi R^3}\right)^n,\quad(5)$$

$$\kappa(\tau_w) = \frac{n}{3n+1}\left(1 - \frac{1}{2n+1}\,\frac{\tau_y^\star}{\tau_w} - \frac{2n}{(2n+1)(n+1)}\left(\left(\frac{\tau_y^\star}{\tau_w}\right)^2 + n\left(\frac{\tau_y^\star}{\tau_w}\right)^3\right)\right).\quad(6)$$

after just a few iterations, the process converges for arbitrary model parameters. equations (3)–(6) were presented recently by zitny et al. [16] in a simplified 1d model.

figure 2. time course of flow rate for the tested cases.
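the iteration of eqs. (5)–(6) is easy to implement; here is a minimal python sketch (an illustration for this text, not the authors' matlab code; the flow rate and the structural parameter value are assumed):

```python
import math

def kappa(tau_w, tau_y_star, n):
    """geometric factor of eq. (6); x = tau_y*/tau_w."""
    x = tau_y_star / tau_w
    return n / (3 * n + 1) * (1 - x / (2 * n + 1)
                              - 2 * n / ((2 * n + 1) * (n + 1)) * (x**2 + n * x**3))

def wall_shear_stress(V_dot, R, k_star, tau_y_star, n, tol=1e-8, max_iter=100):
    """fixed-point iteration of eq. (5):
    tau_w = tau_y* + k* (V / (kappa(tau_w) pi R^3))^n."""
    tau_w = tau_y_star + k_star * (V_dot / (math.pi * R**3)) ** n  # initial guess
    for _ in range(max_iter):
        tau_new = tau_y_star + k_star * (
            V_dot / (kappa(tau_w, tau_y_star, n) * math.pi * R**3)) ** n
        if abs(tau_new - tau_w) < tol * tau_w:
            return tau_new
        tau_w = tau_new
    return tau_w

# houska parameters of eq. (3) for an assumed structural parameter lam;
# the numerical values are those of the fig. 3 caption
lam = 0.5
k_star = 200.0 + lam * 250.0       # pa s^n
tau_y_star = 100.0 + lam * 100.0   # pa
tau_w = wall_shear_stress(V_dot=1e-5, R=0.01, k_star=k_star,
                          tau_y_star=tau_y_star, n=0.38)
print(f"tau_w = {tau_w:.1f} pa, pressure gradient = {2 * tau_w / 0.01:.0f} pa/m")
```

the iteration converges in a few steps, in line with the statement above; the pressure gradient follows from the force balance dp/dx = 2τ_w/r.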
3.3.2. time evolution of the structural parameter
the transport equation (2) can be integrated across the cross section of the pipe, and the last term, describing the averaged decomposition of the structure, can be expressed in an analytical form, because the radial velocity profile is known (the model in zitny et al. [16] assumed a constant shear rate corresponding to a newtonian velocity profile):

$$\overline{\dot\gamma^{\,m}} = \frac{2}{R^2}\int_{r_y}^{R} r\left(\frac{\tau_w\,r/R - \tau_y^\star}{k^\star}\right)^{m/n} dr = \frac{2n\,(\tau_w-\tau_y^\star)^{\frac{m}{n}+1}}{\tau_w^2\,k^{\star\,m/n}}\cdot\frac{(m+n)\,\tau_w + n\,\tau_y^\star}{(m+2n)(m+n)}.\quad(7)$$

for d_λ = 0, equation (2) reduces to a hyperbolic equation that can be integrated analytically along its characteristic dx = u dt, giving

$$\lambda(t) = \frac{a - \left(a - (a+b\,\dot\gamma^{\,m})\,\lambda_0\right)e^{-(a+b\,\dot\gamma^{\,m})(t-t_0)}}{a+b\,\dot\gamma^{\,m}},\quad(8)$$

where λ(t) is the value of the structural parameter of a fluid particle having a value λ₀ at time t₀, assuming that the particle is under the influence of a constant shear rate (constant flow rate) within the time interval (t₀, t). in cases with variable flow rates, it is necessary to apply (8) using shorter time steps during the integration. the 1d numerical solution by the method of characteristics calculates nodal values of the structural parameter at a new time t^(k+1) from the values λ₁^(k), λ₂^(k), ... at equidistant nodes (x₁ = 0, x₂ = Δx, x₃ = 2Δx, ...) and the previous time t^(k). the time step Δt^(k) = t^(k+1) − t^(k) is not constant and is determined by the volumetric flow rate, so that a fluid particle moves exactly the distance Δx:

$$\Delta t^{(k)} = t^{(k+1)} - t^{(k)} = \frac{\Delta x}{u^{(k)}} = \frac{\Delta x\,\pi R^2}{\dot V^{(k)}}\quad(9)$$

(courant-friedrichs-lewy criterion cfl = uΔt/Δx = 1). the new value λ_i^(k+1) is calculated by integration along the characteristic from the old value λ_{i−1}^(k) using (8). after the new values of the structural parameter are updated, the wall shear stress, the pressure gradient and the overall pressure profile can be calculated from (6).
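the 1d marching scheme just described can be condensed as follows (again an illustrative python sketch, reusing wall_shear_stress from the previous sketch; the parameter packing is an assumption of this illustration):

```python
import math

def gamma_m_avg(tau_w, tau_y_star, k_star, n, m):
    """cross-section average of the destruction term, eq. (7)."""
    if tau_w <= tau_y_star:
        return 0.0
    return (2.0 * n * (tau_w - tau_y_star) ** (m / n + 1.0)
            / (tau_w**2 * k_star ** (m / n))
            * ((m + n) * tau_w + n * tau_y_star) / ((m + 2.0 * n) * (m + n)))

def lam_relax(lam0, gm, a, b, dt):
    """analytic solution (8) of dlam/dt = a(1-lam) - b*lam*gdot^m over dt."""
    c = a + b * gm
    return (a - (a - c * lam0) * math.exp(-c * dt)) / c

def moc_step(lam, V_dot, R, dx, params):
    """one 1d step: the time step (9) makes every particle travel one cell,
    so node i inherits the relaxed value of node i-1 (cfl = 1).
    `wall_shear_stress` is the iteration sketched after eq. (6)."""
    a, b, m, n, k, dk, ty, dty = params
    dt = dx * math.pi * R**2 / V_dot                 # eq. (9)
    new = lam[:]                                     # node 0: inlet boundary value
    for i in range(1, len(lam)):
        k_star = k + lam[i - 1] * dk                 # eq. (3)
        ty_star = ty + lam[i - 1] * dty              # eq. (3)
        tau_w = wall_shear_stress(V_dot, R, k_star, ty_star, n)
        gm = gamma_m_avg(tau_w, ty_star, k_star, n, m)
        new[i] = lam_relax(lam[i - 1], gm, a, b, dt)
    return new, dt
```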
4. results and discussion

the numerical methods described in section 3 were implemented in several matlab programs and tested with the goal of evaluating the differences between the predictions of the 1d and the 2d model. the operational parameters were selected according to the geometry of the test setup (pipe l = 4.2 m, r = 0.01 m). the flow rate was progressively increased from 5 × 10⁻⁷ to 2 × 10⁻⁵ m³ s⁻¹ and then progressively decreased, as shown in fig. 2. a comparison of the hydraulic characteristics predicted by the 1d and 2d models for selected rheological parameters is shown in fig. 3. the 1d model, with a constant radial profile of the structural parameter, predicts pressure drops approximately 10 to 20 % higher than the 2d model with a varying radial profile of structure. as expected, the 2d model predicts a more intensive breakup of the thixotropic structure, particularly in the vicinity of the wall (the uniform distribution of the structure decay in the 1d model obviously underestimates the significant effect of the wall layer). the simulations were carried out for different meshes (spatial discretization using 21, 51, 101 and 451 nodes in the axial direction, and 11, 21 and 31 nodes in the radial direction when testing the 2d method). the time steps were fully determined by the spatial steps in the method of characteristics (1d), while in the 2d method the time steps were not restricted; nevertheless, in the simulations the time steps were selected to be identical with the 1d method. the tests confirmed that both the 1d and the 2d method were stable and consistent in all tested cases.

fig. 4 presents the numerical prediction of the 2d model together with the data obtained experimentally with the collagen material in the experimental setup described in section 3.1. the volumetric flow rate was gradually increased from 7 × 10⁻⁷ m³ s⁻¹ to 5.1 × 10⁻⁵ m³ s⁻¹. the rate of the volumetric change was much slower than in the previous tests (more than 20 minutes up and 20 minutes down). such a slow variation of the flow rate increases the number of time steps (4100 time steps for 151 grid points in the axial direction) and increases the run-time necessary for the preliminary identification of the thixotropic model parameters. the parameters (a, b, n, m, k, ∆k, τy, ∆τy) were identified (approximately) by linear search using the least squares criterion (the sum of the squares of the deviations between the measured and predicted pressures during the up and down phases).

figure 4. hydraulic characteristic of collagen (2d model). upward-pointing triangles represent increasing flow rate, diamonds represent decreasing flow rate. continuous line: integral model for k = 150 pa s^n, ∆k = 350 pa s^n, τy = 1350 pa, ∆τy = 250 pa, n = 0.35, a = 0.002 s⁻¹, b = 0.004 s^(m−1), m = 1, r = 0.01 m, l = 4.2 m, time up: 1200 s.

it is obvious that the models of thixotropy (regardless of whether the 1d or the 2d model was used) were not able to describe the hysteresis at high flow rates, a trend that was also observed in the previous simulations (fig. 3). the fact that the hydraulic hysteresis of a thixotropic liquid is significant only at low flow rates is closely related to the residence time t_RTD = l/ū of the fluid particles. the relative change of the structural parameter (and the hysteresis) can be quantified as
$$\frac{\Delta\lambda}{\lambda} = \frac{1}{\lambda}\,\frac{\partial\lambda}{\partial\dot V}\;\frac{\dot V_{max} - \dot V_{min}}{\Delta t_{up}}\;t_{RTD}. \tag{10}$$
for the analyzed case (and for the model parameters shown in fig. 4), the ratio ∆λ/λ ≈ 100 at the minimum flow rate, and only ≈ 0.01 at the maximum flow rate.
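the residence-time argument can be checked with a two-line computation; the numbers below follow directly from the pipe geometry and flow rates quoted above.

```python
import math

L, R = 4.2, 0.01                   # pipe length and radius [m]
for V_dot in (7e-7, 5.1e-5):       # minimum and maximum flow rate [m^3/s]
    u_mean = V_dot / (math.pi * R**2)
    print(f"V = {V_dot:.1e} m^3/s  ->  t_RTD = {L / u_mean:.0f} s")
# prints roughly 1885 s at the minimum and 26 s at the maximum flow rate,
# i.e. at low flow rates the residence time is comparable with the structure
# regeneration time 1/a (500 s for a = 0.002 1/s), so the structure has time
# to change noticeably inside the pipe
```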
5. conclusions

the 1d method of characteristics and the 2d finite differences method were designed to simulate the transient flow of thixotropic incompressible liquids in pipes using houska's model with a variable yield stress. the 1d model assumes that radial diffusion mitigates non-uniformities and large gradients of the structural parameter at the wall. the 2d model assumes only convective transport of the structure and predicts large changes of the structure at the wall, where the local residence times approach infinity. the two suggested methods represent, in fact, the lower and the upper bound of the pressure drop (the 1d method underpredicts and the 2d method overpredicts the effect of the thixotropic structure decay). the developed models were used to evaluate the rheological properties of bovine collagenous matter. we observed a phenomenon resembling hydraulic hysteresis in a production line facility, and this paper is an attempt to explain this behaviour by thixotropy. however, thixotropy can explain the hysteresis only at low flow rates; the hysteresis observed at high flow rates would need a different explanation.

6. appendix

the axial velocities uij, the radial velocities vij and the structural parameter λij are calculated in each time step at the nodes of a 2d rectangular mesh (i is the index in the axial direction, j in the radial direction). the radial profiles of the axial velocities are evaluated from the axial momentum balance (assuming a linear radial shear stress profile and a constant value of the structural parameter in the annular sections (rj, rj+1) for j = 1, 2, ..., nr − 1),
$$\tau_w(t,x_i)\,\frac{r}{R} = \underbrace{\tau_y + \Delta\tau_y\,\frac{\lambda_{ij}+\lambda_{i,j+1}}{2}}_{\tau_y^*} + \underbrace{\left(k + \Delta k\,\frac{\lambda_{ij}+\lambda_{i,j+1}}{2}\right)}_{k^*}\left(-\frac{\partial u}{\partial r}\right)^n, \tag{11}$$
giving
$$u(t,x_i,r_j) = u(t,x_i,r_{j+1}) + \frac{R\,n}{\tau_w\,k^{*\,1/n}\,(n+1)}\left(\left(\tau_w\,\frac{r_{j+1}}{R} - \tau_y^*\right)^{\frac{n+1}{n}} - \left(\tau_w\,\frac{r_j}{R} - \tau_y^*\right)^{\frac{n+1}{n}}\right). \tag{12}$$
the nodal velocities are evaluated recursively from (12), starting from zero velocity at the wall. the corresponding flow rates in the annular sections between the wall and the radius rj are
$$\dot V(t,x_i,r_j) = \dot V(t,x_i,r_{j+1}) + \pi\left(\big(r^2 u\big)\Big|_j^{j+1} + \frac{R^3}{\tau_w^3\,k^{*\,1/n}}\int_{\tau_w r_j/R}^{\tau_w r_{j+1}/R}\tau^2\left(\tau - \tau_y^*\right)^{1/n} d\tau\right). \tag{13}$$
the velocities and flow rates in (12), (13) are expressed as analytical functions of the wall shear stress τw, and the wall shear stress is determined by the prescribed flow rate. this constraint is in fact the algebraic equation $\dot V(t) = \dot V(t,x_i,r=0)$, which has to be solved iteratively at each time step and at all cross sections xi. the radial velocity can be calculated from the mass balance expressed in terms of the radial profiles of the flow rates:
$$v(t,x_i,r_j) = \frac{\dot V(t,x_{i+1},r_j) - \dot V(t,x_i,r_j)}{2\pi r_j\,(x_{i+1} - x_i)}. \tag{14}$$
let us assume that the nodal values of the structural parameter λij^(k) are known at time t^(k), and that we want to calculate the values λij^(k+1) at time t^(k+1) and node xi, rj. it is first necessary to calculate the trajectory of the particles or, more precisely, the initial position (the coordinates ξ, η shown in fig. 5) of the fluid particle that ends at node xi, rj. assuming that the change during the next time step ∆t^(k) = t^(k+1) − t^(k) is small, the trajectory remains inside the cell formed by the cross sections xi−1 and xi, and the increments ξ, η are given by the velocities evaluated at node i, j from (12) and (14):
$$\xi = \frac{u_{ij}^{(k)}\,\Delta t^{(k)}}{\Delta x}, \qquad \eta = \frac{v_{ij}^{(k)}\,\Delta t^{(k)}}{r_j - r_{j-1}}. \tag{15}$$

figure 5. 2d mesh and interpolation of the structural parameter at a "previous" time step.

the value of the structural parameter of a fluid particle at the "previous" time t^(k) at the position ξ, η can be interpolated from the nodal values (using bilinear interpolation):
$$\lambda^{(k)}(\xi,\eta) = \lambda_{ij}^{(k)}(1-\xi)(1-\eta) + \lambda_{i-1,j}^{(k)}\,\xi(1-\eta) + \lambda_{i,j-1}^{(k)}(1-\xi)\,\eta + \lambda_{i-1,j-1}^{(k)}\,\xi\eta \quad \text{for } \xi > 0,\ \eta > 0; \tag{16}$$
$$\lambda^{(k)}(\xi,\eta) = \lambda_{ij}^{(k)}(1-\xi)(1+\eta) + \lambda_{i-1,j}^{(k)}\,\xi(1+\eta) - \lambda_{i,j+1}^{(k)}(1-\xi)\,\eta - \lambda_{i-1,j+1}^{(k)}\,\xi\eta \quad \text{for } \xi > 0,\ \eta < 0. \tag{17}$$
the values of the structural parameter at the "new" time are obtained by integration of the transport equation (2) along the particle trajectory (i.e., along the characteristic):
$$\lambda_{ij}^{(k+1)} = \frac{1}{a + b\,\big(\dot\gamma_{ij}^{(k)}\big)^m}\left(a - \Big(a - \big(a + b\,(\dot\gamma_{ij}^{(k)})^m\big)\,\lambda^{(k)}(\xi,\eta)\Big)\,e^{-\big(a + b\,(\dot\gamma_{ij}^{(k)})^m\big)\Delta t^{(k)}}\right). \tag{18}$$
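the back-tracking step (15)–(18) for one interior node can be sketched compactly in python; this is a minimal illustration of the ξ > 0, η > 0 branch (16) only, with all arguments assumed to be already available from the velocity solver described above.

```python
import numpy as np

def update_node(lam, i, j, u_ij, v_ij, dt, dx, dr_j, a, b, gamma_m_ij):
    """semi-lagrangian update of lambda at interior node (i, j), eqs. (15)-(18).
    lam is the 2d array of structural parameters at the previous time level."""
    xi = u_ij * dt / dx           # eq. (15): axial back-displacement in cell units
    eta = v_ij * dt / dr_j        # eq. (15): radial back-displacement (dr_j = r_j - r_{j-1})
    # eq. (16): bilinear interpolation at the departure point (xi > 0, eta > 0 branch)
    lam0 = (lam[i, j] * (1 - xi) * (1 - eta) + lam[i-1, j] * xi * (1 - eta)
            + lam[i, j-1] * (1 - xi) * eta + lam[i-1, j-1] * xi * eta)
    # eq. (18): analytical integration along the characteristic over dt
    c = a + b * gamma_m_ij
    return (a - (a - c * lam0) * np.exp(-c * dt)) / c
```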
relationship (18) can be applied to all internal points; the value of the structural parameter at the wall can be calculated directly from the equilibrium
$$\lambda_{wi}^{(k+1)} \cong \frac{a}{a + b\,\dot\gamma_i^m}. \tag{19}$$
the method described is explicit, because the velocities in (15), the associated trajectories and the rate of deformation are evaluated at the "previous" time.

acknowledgements

this research was supported by the research project gacr 14-23482s "thermal, electrical and rheological properties of collagen matter".

list of symbols

a coefficient of structure regeneration [s⁻¹]
b coefficient of structure breakdown [s^(m−1)]
Dλ diffusion coefficient [m² s⁻¹]
k consistency coefficient [pa s^n]
∆k increment of consistency coefficient [pa s^n]
l length of pipe [m]
m breakdown index [–]
n flow behaviour index [–]
p pressure [pa]
r radial coordinate [m]
R radius of the pipe [m]
t time [s]
t_RTD mean residence time [s]
u axial velocity [m s⁻¹]
ū mean axial velocity [m s⁻¹]
v radial velocity [m s⁻¹]
V̇ volumetric flow rate [m³ s⁻¹]
x axial coordinate [m]
γ̇ shear rate [s⁻¹]
λ structural parameter [–]
τw wall shear stress [pa]
τy yield stress [pa]
∆τy increment of yield stress [pa]
i index of cross-section (axial coordinate)
j index of node in radial direction
w subscript referring to the wall

references

[1] j. mewis. thixotropy: a general review. journal of non-newtonian fluid mechanics 6(1):1–20, 1979. doi:10.1016/0377-0257(79)87001-9.
[2] h. a. barnes. thixotropy: a review. journal of non-newtonian fluid mechanics 70:1–33, 1997. doi:10.1016/s0377-0257(97)00004-9.
[3] j. sestak, r. zitny. tok tixotropni kapaliny v trubce. acta polytechnica 4:45, 1976.
[4] z. kemblowski, j. petera. memory effects during the flow of thixotropic fluids in pipes. rheologica acta 20(4):311–323, 1981. doi:10.1007/bf01547661.
[5] j. sestak, r. zitny, m. houska. dynamika tixotropnich kapalin. rozpravy csav, praha, 1990.
[6] j. sestak, r. zitny, m. houska. simple rheological models of food liquids for process design and quality assessment. journal of food engineering 2(1):35–49, 1983. doi:10.1016/0260-8774(83)90005-5.
[7] r. zitny. nestacionarni tok tixotropni kapaliny v trubce. acta polytechnica 3(iv):95, 1977.
[8] g. m. de oliveira, l. l. v. da rocha, a. t. franco, c. o. r. negrao. numerical simulation of the start-up of bingham fluid flows in pipelines. journal of non-newtonian fluid mechanics 165(19):1114–1128, 2010. doi:10.1016/j.jnnfm.2010.05.009.
[9] a. wachs, g. vinay, i. frigaard. a 1.5d numerical model for the start up of weakly compressible flow of a viscoplastic and thixotropic fluid in pipelines. journal of non-newtonian fluid mechanics 159(1-3):81–94, 2009. doi:10.1016/j.jnnfm.2009.02.002.
[10] c. o. r. negrao, a. t. franco, l. l. v. rocha. a weakly compressible flow model for the restart of thixotropic drilling fluids. journal of non-newtonian fluid mechanics 166(23-24):1369–1381, 2011. doi:10.1016/j.jnnfm.2011.09.001.
[11] g. vinay, a. wachs, j. f. agassant. numerical simulation of weakly compressible bingham flows: the restart of pipeline flows of waxy crude oils. journal of non-newtonian fluid mechanics 136(2-3):93–105, 2006. doi:10.1016/j.jnnfm.2006.03.003.
[12] g. vinay, a. wachs, i. frigaard. start-up transients and efficient computation of isothermal waxy crude oil flows. journal of non-newtonian fluid mechanics 143(2-3):141–156, 2007. doi:10.1016/j.jnnfm.2007.02.008.
[13] j. billingham, j. w. j. ferguson. laminar unidirectional flow of a thixotropic fluid in a circular pipe.
journal of non-newtonian fluid mechanics 47:21–55, 1993. doi:10.1016/0377-0257(93)80043-b.
[14] v. a. kumar, j. m. caves, c. a. haller, et al. acellular vascular grafts generated from collagen and elastin analogs. acta biomaterialia 9(9):8067–8074, 2013. doi:10.1016/j.actbio.2013.05.024.
[15] j. a. deiber, m. b. peirotti, m. l. ottone. rheological characterization of edible films made from collagen colloidal particle suspensions. food hydrocolloids 25(5):1382–1392, 2011. doi:10.1016/j.foodhyd.2011.01.002.
[16] r. zitny, m. houska, a. landfeld, et al. possible explanations of hysteresis at hydraulic characteristics in pipes. in proceedings of the 21st international congress of chemical and process engineering chisa, 23–27 august 2014, prague. 2014.

acta polytechnica 55(2):136–139, 2015. doi:10.14311/ap.2015.55.0136

modernization features of a vacuum installation based on low-pressure arc discharge for forming composite tin-cu layers

dmitrii tsyrenov∗, aleksandr semenov, natalia smirnyagina
institute of physical materials science sb ras, laboratory of physical materials science, sakhyanovoy st. 6, 670047 ulan-ude, russia.
∗ corresponding author: dmitriyzak@mail.ru

abstract. a hybrid technology for forming composite tin-cu layers under the combined influence of a magnetron discharge and a low-pressure arc discharge has been designed. the parameters of the plasma generator have been studied, as have the technological parameters for layer deposition under the coordinated action of the vacuum arc evaporator and the planar magnetron. this report describes the phase composition and the structure of tin-cu layers on fused silica substrates.

keywords: arc evaporator; planar magnetron; layers; structure; phase composition; nanocomposite.

1. introduction

in this paper, we propose a hybrid technology for forming composite layers. the main feature of this technology lies in the coordinated action of a vacuum arc discharge and a magnetron discharge. coatings such as tin, tialn, ticrn and others that are widely used in mechanical engineering and in metal working are characterized by high hardness and low friction coefficients. however, their main disadvantage lies in their substantial fragility, which greatly narrows the areas in which they can be applied [1].
intensive research on nanostructured and nanocomposite coatings of transition metals is currently being carried out around the world, but much work still remains to be done. significant success in applying nanostructured and nanocomposite coatings, in particular tin-cu, has been achieved by vacuum plasma processes [2–5]. for this reason, studies of new technologies for making composite layers with good plasticity and high hardness are of interest. to work on this task, we combined a vacuum arc evaporator and a planar magnetron in a single chamber. this combination provides the following advantages: coatings can be applied on parts of various geometric forms, high deposition speeds and a low substrate temperature can be achieved, and an impurity component can be introduced.

2. experimental part

the experiments were performed on the vu-1m vacuum installation (figure 1). the tin-cu layers were deposited by simultaneous reactive magnetron sputtering of the copper target and arc evaporation of the titanium target. the industrial vu-1b installation was modernized to carry out the deposition process; the modernization involved placing a planar magnetron in the vacuum chamber.

figure 1. general view of the installation.

the planar magnetron works on the basis of an abnormal glow discharge of direct current. the magnetron is installed vertically on the side wall of the vacuum chamber. the power supply for the magnetron is up to 3 kw. the magnetron (figure 2) consists of a cathode in the form of a copper target and an annular anode. the magnetic system is formed by permanent (cobalt-samarium) magnets, a magnetic conductor and a polar tip. these create a toroidal magnetic field with an induction of 0.2–0.8 t above the surface of the target. the discharge voltage was applied between the grounded anode (positive potential) and the isolated target (negative potential).

figure 2. design of the magnetron.

the cathode unit (figure 3) consists of the welded case, the cathode, the magnetic coil, the additional anode, and the electrostatic screen isolated both from the cathode and from the additional anode. the magnetic coil ensures that the cathode burns evenly. inside the chamber there is a rotating substrate holder, which distributes the gradually thickening coating evenly. the reference voltage applied to the substrate holder is 1.5 kv; it preheats the parts and provides ionic removal of gas inclusions from the growth surface. the gas mixture is prepared in a separate vacuum mixer. the installation has a unified vacuum system based on an n400 diffusion pump. the tin-cu layers are deposited while both devices are working simultaneously. the parameters used in this experiment are presented in table 1.

table 1. composite tin-cu layer formation parameters.
case | arc discharge current [a] | magnetron discharge current [a] | voltage on the magnetron target (cu) [v] | chamber pressure [10⁻² torr]
a | 80 | – | – | 0.2
b | 90 | 0.8 | 35 | 0.2
c | 60 | 0.6 | 40 | 9
d | 60 | 0.6 | 40 | 9

plates of fused quartz (amorphous sio2) 1 mm in thickness were used as substrates in order to avoid any influence of the substrate material on the structure of the composite layer. x-ray phase analysis was carried out on a bruker d2 phaser diffractometer (cu kα1 radiation). the microstructure of the layers was investigated using a metam pb-22 microscope with the nexsys image expert program.
knoop (hk) hardness measurements were made by pressing a diamond indenter of a specified shape into the surface with a known force.

figure 3. design of the cathode unit.

3. results and discussion

figure 4 shows the x-ray diffraction (xrd) patterns of the tin-cu films coated on fused quartz. polycrystalline tin-cu layers are obtained with columnar crystallites and a preferred [111] orientation perpendicular to the substrate surface (figure 4a). although the x-ray analysis showed no reflexes of α-ti phase impurities in the layer, droplets 700–800 nm in size were observed. deposition of composite tin-cu led to the formation of a layer containing tin partially oriented on the [111] plane, though reflexes of other planes, (200) and (220), can also be identified (figure 4b). as impurities, the x-ray analysis detected the presence of the nitride ti2n and an α-ti droplet phase. the decrease of the arc discharge current from 90 to 60 a during the titanium evaporation led to a considerable reduction in the quantity and in the sizes of the titanium droplets in the layer. according to the x-ray analysis (figure 4c), there are reflexes of tin in the composite layer. in addition, the x-ray pattern contains a pure cu peak; note, however, that the intensity of the cu reflexes in the pattern is only slight. a change in the distance from the arc cathode to the substrates from 220 to 280 mm (i.e., a change in the deposition speed and hence in the thickness of the formed composite layer, etc.) led to an increased cu content in the composite layer; the intensity of the copper reflexes in the x-ray patterns reaches about 10 % (figure 4d). the parameters of the tin crystal cell (fm3m) were determined for all composite tin-cu layers. a considerable reduction of the parameter a, from a = 0.4318 nm to a = 0.4254 nm, is observed (table 2).

figure 4. xrd patterns for tin-cu films: a) tin, 80 a; b) tin-cu, 90 a; c) tin-cu, 60 a (220 mm); d) tin-cu, 60 a (280 mm).

figure 5 shows the microstructure of the surface layer of the sample with copper reflections (pattern d). a study of the microstructure of the layer surface revealed no ti droplets. the layers have a homogeneous grain structure; there may be some copper on the grain borders. it can be expected that the tin grains have a columnar structure.

figure 5. microstructure of a tin-cu layer.

the changes in the hardness of the composite tin-cu films are illustrated in table 2. the maximum hardness measured for the composite tin-cu films was 3910 mpa, while the hardness of the tin layers formed by reactive arc evaporation was 2910 mpa. this effective increase in the hardness of the composite layer may be connected with the absence of the ti droplet phase and with the presence of cu atoms on the borders of the tin grains.

table 2. physical properties of the tin-cu composite films.
case | crystal cell parameter a of tin [nm] | knoop microhardness hk [mpa]
a | 0.4310 | 2560
b | 0.4318 | 2910
c | 0.4259 | 1680
d | 0.4254 | 3910

thus, the structure and the phase composition of the composite tin-cu layers indicate the formation of a composite material combining hard titanium nitride and plastic copper.
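the cubic lattice parameters quoted in table 2 follow from the bragg angles of the cubic reflections in the standard way; the short sketch below illustrates the relation. the chosen 2θ value is purely illustrative, not a measured one; only the cu kα1 wavelength and the formulae are standard.

```python
import math

LAMBDA_CU_KA1 = 0.15406  # cu k-alpha1 wavelength [nm]

def lattice_parameter(two_theta_deg, h, k, l, wavelength=LAMBDA_CU_KA1):
    """cubic lattice parameter from a single bragg reflection:
    d = lambda / (2 sin(theta)),  a = d * sqrt(h^2 + k^2 + l^2)."""
    d = wavelength / (2.0 * math.sin(math.radians(two_theta_deg) / 2.0))
    return d * math.sqrt(h * h + k * k + l * l)

# an illustrative (111) reflection near 2-theta = 36.3 deg gives a close to
# 0.428 nm, i.e. the order of the tin values listed in table 2
print(lattice_parameter(36.3, 1, 1, 1))
```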
4. conclusion

preliminary results have been obtained for a hybrid technology for forming composite tin-cu layers by the simultaneous combination of a vacuum arc evaporator and a planar magnetron. composite tin-cu layers were formed with different technological parameters. the layers of the composite have a granular structure, with inclusions of copper atoms on the borders of the tin grains. the results of the studies of the tin-cu layers will be used for optimizing the technological process for forming a coating with the required properties.

references

[1] m. a. volosova. the technological principles of deposition of wear-resistant nanocoatings on tool ceramics. science and technologies. results of dissertation researches. moscow, vol. 1, 2009, p. 75–90.
[2] h. s. myung, h. m. lee, l. r. shaginyan, j. g. han. microstructure and mechanical properties of cu doped tin superhard nanocomposite coatings. surface and coatings technology 163–164:591–596, 2003.
[3] j. l. he, y. sethuhara, i. shimuzu, s. miyake. structure refinement and hardness enhancement of titanium nitride films by addition of copper. surface and coatings technology 137:38–42, 2001.
[4] g. a. pribytkov, e. n. korosteleva, s. g. psakhie, i. m. goncharenko, yu. f. ivanov, n. n. koval, p. m. shanin, a. v. gurskih, v. v. korjova, yu. p. mironov. nanostructured titanium nitride coatings produced by arc sputtering of composite cathodes. i. cathodes structure, phase composition and sputtering peculiarities. proceedings of the 7th int. conf. on modification of materials with particle beams and plasma flows, tomsk, 2004, p. 163–166.
[5] a. d. korotaev, v. yu. moshkov, s. v. ovchinnikov, yu. p. pinzhin, v. m. savostikov, a. n. tyumentsev. nanostructured and nanocomposite superhard coatings. physical mesomechanics 8(5):103–116, 2005.

acta polytechnica 54(2):133–138, 2014. doi:10.14311/ap.2014.54.0133

stability of bose-einstein condensates in a pt-symmetric double-δ potential close to branch points

andreas löhle, holger cartarius∗, daniel haag, dennis dast, jörg main, günter wunner
institut für theoretische physik 1, universität stuttgart, 70550 stuttgart, germany
∗ corresponding author: holger.cartarius@itp1.uni-stuttgart.de

abstract. a bose-einstein condensate trapped in a double-well potential, where atoms are incoupled to one side and extracted from the other, can in the mean-field limit be described by the nonlinear gross-pitaevskii equation (gpe) with a pt-symmetric external potential. if the strength of the in- and outcoupling is increased, two pt-broken states bifurcate from the pt-symmetric ground state. at this bifurcation point a stability change of the ground state is expected. however, it is observed that this stability change does not occur exactly at the bifurcation but at a slightly different strength of the in-/outcoupling effect. we investigate a bose-einstein condensate in a pt-symmetric double-δ potential and calculate the stationary states. the ground state's stability is analysed by means of the bogoliubov-de gennes equations, and it is shown that the difference in the strength of the in-/outcoupling between the bifurcation and the stability change can be completely explained by the norm-dependency of the nonlinear term in the gross-pitaevskii equation.

keywords: bose-einstein condensates, pt symmetry, stability, bogoliubov-de gennes equations.
1. introduction

in quantum mechanics, non-hermitian hamiltonians with imaginary potentials have become an important tool to describe systems with loss or gain effects [1]. non-hermitian pt-symmetric hamiltonians, i.e. hamiltonians commuting with the combined action of the parity (p: x → −x, p → −p) and time reversal (t: x → x, p → −p, i → −i) operators, possess the interesting property that, in spite of the gain and loss, they can exhibit stationary states with real eigenvalues [2]. when the strength of the gain and loss contributions is increased, pairs of these real eigenvalues typically pass through an exceptional point, i.e. a branch point at which both the eigenvalues and the wave functions are identical, and become complex and complex conjugate.

a promising candidate for the realisation of a pt-symmetric quantum system are bose-einstein condensates. at sufficiently low temperatures and densities they can be described in the mean-field limit by the nonlinear gross-pitaevskii equation [3, 4]. if a condensate is trapped in a double-well potential, it is possible to add atoms to one side of the double well and remove atoms from the other. this leads to a gain or loss of the condensate's probability amplitude. if the strength of both contributions is equal, the process can effectively be described by a complex external potential v(−x) = v*(x), rendering the hamiltonian pt-symmetric [5]. the experimental realisation of an open quantum system with pt symmetry would be an important step, since experimental verification has so far only been achieved in optics [6, 7].

the nonlinearity ∝ |ψ(x,t)|² of the gross-pitaevskii equation introduces new interesting properties [8–17], such as pt-broken states which appear for gain/loss contributions lower than those at which the corresponding pt-symmetric states vanish. in optics, the same effect can be observed for waveguides with a kerr nonlinearity; they are described by an equation that is mathematically equivalent to the gross-pitaevskii equation. it has been shown that the additional features appearing in the presence of the nonlinearity might be exploited to create uni-directional structures [18] or solitons [19–24]. since pt symmetry in optics has been studied extensively [5, 18–28] and realised experimentally [6, 7], these approaches seem very promising.

from the mathematical point of view, a new type of bifurcation appearing in the nonlinear pt-symmetric gross-pitaevskii equation is of special interest [8–17]. as in linear quantum systems, two pt-symmetric stationary states merge in an exceptional point if the strength of a parameter describing the gain and loss processes is increased. however, in contrast to its linear counterpart, the gross-pitaevskii equation possesses no pt-broken states emerging at this exceptional point; they already appear for lower strengths of the gain/loss parameter and bifurcate from one of the pt-symmetric states in a pitchfork bifurcation. the new bifurcation point has been identified to be a third-order exceptional point [10]. for attractive nonlinearities one finds that the pt-broken solutions bifurcate from the ground state. in this scenario the pt-symmetric ground state is the only state which exists on both sides of the bifurcation and always possesses a real energy eigenvalue. the pitchfork bifurcation is expected to entail a change of its stability.
however, it is observed that this stability change does not occur exactly at the bifurcation but at a slightly different value of the gain/loss parameter. the discrepancy between the points of bifurcation and stability change seems surprising, and it does not appear in all similar systems. the mean-field limit of a two-mode approximation with a bose-hubbard hamiltonian [14–17] does not show this effect. that model has, however, two crucial differences from the treatment of bose-einstein condensates with the full gross-pitaevskii equation in ref. [11]. the latter system contains a harmonic trap in which infinitely many stationary states can be found, whereas the nonlinear two-mode system exhibits only four states, viz. the two pt-symmetric and the two pt-broken states mentioned above. furthermore, the nonlinearity derived in [14, 15] is slightly different from that of the gross-pitaevskii equation: it has the form ∝ |ψ|²/||ψ||², and hence does not depend on the norm of the wave function. thus, there might be two reasons for the appearance of the discrepancy: it could have its origin in the existence of higher modes influencing the ground state's stability, or in the norm-dependency of the gross-pitaevskii nonlinearity.

it is the purpose of this article to clarify this question. to do so, we study a bose-einstein condensate in an idealised double-δ trap [8–10], a system whose linear counterpart has already helped to understand basic properties of pt-symmetric structures [29–33]. this system is described by the gross-pitaevskii equation, i.e. the contact interaction has the norm-dependent form ∝ |ψ(x,t)|². however, it exhibits only four stationary states, of which two are pt-symmetric and two are pt-broken, as in the two-mode model [14–17]. additionally, in a numerical study the structure of the nonlinearity can easily be changed, such that the system's mathematical properties can be brought into agreement with the mean-field limit of the bose-hubbard dimer.

the article is organised as follows. we introduce and solve the gross-pitaevskii equation of a bose-einstein condensate in a double-δ trap for an attractive atom-atom interaction in section 2, where some properties of the stationary solutions which are important for the following discussions are recapitulated. then we investigate the ground state's stability in the vicinity of the bifurcation in section 3. the bogoliubov-de gennes equations are solved for both types of the nonlinearity, and the origin of the discrepancy between the bifurcation and the stability change is discussed. conclusions are drawn in section 4.

2. bose-einstein condensates in the pt-symmetric double-δ trap

we assume the condensate to be trapped in an idealised trap of two delta functions, i.e. the potential has the shape [8]
$$v(x) = -(1 - i\gamma)\,\delta(x - b) - (1 + i\gamma)\,\delta(x + b), \tag{1}$$
where the units are chosen such that the real part due to the action of the δ functions has the value −1. the potential describes two symmetric, infinitely thin wells at the positions ±b. in the left well we describe an outflux of atoms by a negative imaginary contribution −iγ, and on the right side an influx of particles is described by a positive imaginary part +iγ of the same strength. this leads to the time-independent gross-pitaevskii equation in dimensionless form [8],
$$\left[-\frac{d^2}{dx^2} - (1 - i\gamma)\,\delta(x - b) - (1 + i\gamma)\,\delta(x + b) + g\,|\psi(x)|^2\right]\psi(x) = -\kappa^2\,\psi(x) \tag{2}$$
with the energy eigenvalue or chemical potential µ = −κ².
the parameter g is determined by the s-wave scattering length, which effectively describes the van der waals interaction at low temperatures and densities; physically it can be tuned via feshbach resonances. throughout this article we assume g to be negative, i.e. the atom-atom interaction is attractive.

solutions to the gross-pitaevskii equation are found by exact numerical integration. the wave function is integrated outward from x = 0 in the positive and negative directions. to do so, the initial values ψ(0), ψ′(0) and κ have to be chosen. the arbitrary global phase is exploited such that ψ(0) is chosen to be real. together with ψ′(0) ∈ ℂ and κ ∈ ℂ, we have five real parameters which have to be set such that physically relevant wave functions are obtained. these have to be square-integrable and, in the nonlinear system, normalised. the conditions ψ(x) → 0 for x → +∞ and for x → −∞ (each complex) together with ||ψ|| = 1 provide five real conditions and, together with the five initial parameters, define a five-dimensional root search problem, which is solved numerically.

figure 1 shows a typical example of all stationary states found in the case g = −1 (red solid lines). two states with purely real eigenvalues vanish for increasing γ in an exceptional point. since κ is plotted and µ = −κ² is the energy eigenvalue, the upper line corresponds to the ground state. two complex, mutually conjugate eigenvalues bifurcate from this ground state in a pitchfork bifurcation at a critical value γκ ≈ 0.3071. the same plots for g = −0.5 (blue dotted lines) and g = 0 (green dashed lines) demonstrate how the pitchfork bifurcation is introduced by the nonlinearity.

figure 1. real (a) and imaginary (b) parts of the stationary solutions found for the gross-pitaevskii equation (2) for the case g = −1 (red solid lines), compared with the linear counterpart g = 0 (green dashed lines) and the weaker interaction g = −0.5 (blue dotted lines). in the nonlinear case we observe two complex conjugate states bifurcating from the ground state in a pitchfork bifurcation; for g = −1 this occurs at γκ ≈ 0.3071.

this eigenvalue structure is very generic for pt-symmetric systems with a quadratic term in the hamiltonian. it is found for bose-einstein condensates in a true, spatially extended double well in one and three dimensions [11, 12], which contain the nonlinearity ∝ |ψ|², as well as for the mean-field limit of the bose-hubbard dimer [14–17] with a term ∝ |ψ|²/||ψ||². the difference in the norm-dependency of the nonlinearity between the two systems does not lead to different eigenvalues, as has already been mentioned [17]. the difference appears, however, in the dynamics, where the norm plays a crucial role in the presence of gain and loss [13].
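the outward-integration and root-search procedure described above can be sketched in a few lines of python. this is a minimal illustration, not the authors' code: the well position b, the box size x_max, the initial guesses and the use of scipy's fsolve are our own assumptions, and the δ wells are handled through their exact derivative jump conditions.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import fsolve

g, gamma, b, x_max = -1.0, 0.2, 1.1, 12.0   # b and x_max are illustrative assumptions

def shoot(p, sign):
    """integrate the gpe (2) from x = 0 to sign*x_max.
    state y = [re psi, im psi, re psi', im psi', half-axis norm integral]."""
    psi0, dre, dim, kre, kim = p
    kappa2 = (kre + 1j * kim) ** 2

    def f(x, y):
        psi = y[0] + 1j * y[1]
        dd = (kappa2 + g * abs(psi) ** 2) * psi   # psi'' away from the delta wells
        return [y[2], y[3], dd.real, dd.imag, abs(psi) ** 2]

    sol = solve_ivp(f, (0.0, sign * b), [psi0, 0.0, dre, dim, 0.0],
                    rtol=1e-10, atol=1e-12)
    y = sol.y[:, -1]
    # delta-well jump condition: psi'(x0+) - psi'(x0-) = v0 * psi(x0)
    v0 = -(1 - 1j * gamma) if sign > 0 else -(1 + 1j * gamma)
    psi = y[0] + 1j * y[1]
    dpsi = (y[2] + 1j * y[3]) + sign * v0 * psi
    sol = solve_ivp(f, (sign * b, sign * x_max),
                    [psi.real, psi.imag, dpsi.real, dpsi.imag, y[4]],
                    rtol=1e-10, atol=1e-12)
    y = sol.y[:, -1]
    # the norm integral accumulates with a negative sign on the negative half axis
    return y[0] + 1j * y[1], abs(y[4])

def residuals(p):
    psi_r, n_r = shoot(p, +1)
    psi_l, n_l = shoot(p, -1)
    # five real conditions: decay on both sides and normalisation
    return [psi_r.real, psi_r.imag, psi_l.real, psi_l.imag, n_r + n_l - 1.0]

# initial guess (assumed, may need tuning): almost real, node-free ground state
p0 = fsolve(residuals, [0.9, 0.0, 0.0, 0.5, 0.0], xtol=1e-12)
print("kappa =", p0[3] + 1j * p0[4])
```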
3. stability analysis of the ground state

the linear stability is analysed with the bogoliubov-de gennes equations. they are derived under the assumption that a stationary state ψ0(x) is perturbed by a small fluctuation θ(x,t), i.e.
$$\psi(x,t) = e^{i\kappa^2 t}\left[\psi_0(x) + \theta(x,t)\right], \tag{3a}$$
where
$$\theta(x,t) = u(x)\,e^{-i\omega t} + v^*(x)\,e^{i\omega^* t}. \tag{3b}$$
with this ansatz and a linearisation in the small quantities u and v, one obtains from the gross-pitaevskii equation the coupled system of the bogoliubov-de gennes differential equations,
$$\frac{d^2}{dx^2}u(x) = \left[-(1 + i\gamma)\,\delta(x + b) - (1 - i\gamma)\,\delta(x - b) + \kappa^2 - \omega + 2g\,|\psi_0(x)|^2\right]u(x) + g\,\psi_0(x)^2\,v(x), \tag{4a}$$
$$\frac{d^2}{dx^2}v(x) = \left[-(1 - i\gamma)\,\delta(x + b) - (1 + i\gamma)\,\delta(x - b) + (\kappa^2)^* + \omega + 2g\,|\psi_0(x)|^2\right]v(x) + g\,\psi_0^*(x)^2\,u(x). \tag{4b}$$
in equation (3b) it can be seen that ω determines the temporal evolution of the fluctuation. real values of ω describe stable oscillations, whereas imaginary parts lead to a growth or decay of the fluctuation's amplitude. thus, ω measures the stability of the stationary solution ψ0 against small fluctuations.

numerically, the bogoliubov-de gennes equations are solved with the same method as the stationary states, i.e. the wave functions u and v are integrated outward from x = 0. it can easily be seen that the bogoliubov-de gennes equations (4) are invariant under the transformation u(x) → u(x)e^{iχ}, v(x) → v(x)e^{iχ} with a real phase χ. similarly to the procedure for the integration of the stationary states, this symmetry can be exploited. the remaining initial values with which the integration has to be started are u(0) ∈ ℂ, re(v(0)), u′(0) ∈ ℂ, v′(0) ∈ ℂ, and ω ∈ ℂ. in a nine-dimensional root search they have to be chosen such that the conditions u(±∞) → 0, v(±∞) → 0 and
$$\int_{-\infty}^{\infty}\left|u(x) + v^*(x)\right|^2 dx = 1$$
are fulfilled [11]. two further symmetries can be exploited to reduce the number of independent stability eigenvalues that have to be calculated. the replacement (u, v, ω) → (v*, u*, −ω*) leaves the ansatz (3b) invariant; thus, if ω is a stability eigenvalue, −ω* is a valid solution as well. furthermore, for every eigenvalue ω there is also one solution with the eigenvalue ω* if the stationary state ψ0 is pt-symmetric, which can be verified by applying the pt operator to the bogoliubov-de gennes equations (4). due to these two symmetries it is sufficient to search for stability eigenvalues with re(ω) > 0 and im(ω) > 0.

3.1. stability in the vicinity of the bifurcation

the relevant question which has to be answered by our calculation is whether or not the discrepancy between the γ values of the pitchfork bifurcation and the stability change appears for the double-δ potential. thus, we calculated the stability eigenvalue with re(ω) > 0, im(ω) > 0 for a range of γ around the bifurcation, which is shown in figure 2 for g = −1.

figure 2. real (red solid line) and imaginary (green dashed line) parts of the stability eigenvalue ω of the stationary ground state in the case g = −1. the stability change occurs at γω ≈ 0.3138. to illustrate the pitchfork bifurcation, the real parts of κ of all stationary states are also shown (blue dotted lines); the value γκ ≈ 0.3071 is marked by the black dash-dotted line.

for increasing γ, the eigenvalue ω switches from real to imaginary at γω ≈ 0.3138, marking the stability change. the pitchfork bifurcation is visible in the real parts of κ of all stationary states of the system; it is marked by the black dash-dotted line. obviously there is a discrepancy between γκ and γω. the difference
$$\Delta\gamma = \gamma_\kappa - \gamma_\omega \tag{5}$$
is ∆γ ≈ −0.0067. the system does not possess any further stationary states besides those shown in figure 1.
three of these four states participate in the pitchfork bifurcation. only the excited pt-symmetric solution would be able to influence the dynamics of the ground state at a value γ ≠ γω. however, it stays real for all parameters γ shown in figure 2 and cannot cause any qualitatively different behaviour of the ground state's dynamics. thus, an influence of further states can be ruled out as the reason for the discrepancy in the double-well system of refs. [11, 12].

the remaining difference between the gross-pitaevskii equation (2) and the two-mode system of refs. [14–17] is the norm-dependency of the nonlinearity. indeed, an influence of the norm is already present if the study done in figure 2 is repeated for different values of the nonlinearity parameter g. figure 3 shows ∆γ as a function of g. a strong dependency is visible; even the sign changes: for g ≳ −0.4 the ground state becomes unstable at γω < γκ. for g → 0 the discrepancy vanishes, as expected.

figure 3. difference between γκ and γω as defined in equation (5) as a function of g. a strong dependency is clearly visible.

3.2. norm-independent variant of the gross-pitaevskii equation

an even clearer identification of ∆γ with the norm-dependency of the nonlinearity in the gross-pitaevskii equation (2) can be given with a small modification. the replacement
$$g\,|\psi|^2 \;\to\; \frac{g\,|\psi|^2}{\int |\psi|^2\,dx} \tag{6}$$
makes the gross-pitaevskii equation (2) norm-independent. note that this is exactly the form of the mean-field limit of refs. [14–17]. since the stationary states are normalised to 1, they are not influenced by the replacement. however, it influences the dynamics and also the linear stability, and the bogoliubov-de gennes equations have to be adapted. assuming again a small perturbation of the form (3a) and (3b), a linearisation in u and v leads to
$$\frac{d^2}{dx^2}u(x) = \left[-(1 + i\gamma)\,\delta(x + b) - (1 - i\gamma)\,\delta(x - b) + \kappa^2 - \omega + 2g\,|\psi_0(x)|^2\right]u(x) + g\,\psi_0(x)^2\,v(x) - g\,|\psi_0(x)|^2\,\psi_0(x)\,s, \tag{7a}$$
$$\frac{d^2}{dx^2}v(x) = \left[-(1 - i\gamma)\,\delta(x + b) - (1 + i\gamma)\,\delta(x - b) + (\kappa^2)^* + \omega + 2g\,|\psi_0(x)|^2\right]v(x) + g\,\psi_0^*(x)^2\,u(x) - g\,|\psi_0(x)|^2\,\psi_0^*(x)\,s \tag{7b}$$
with the integral
$$s = \int\left[v(x)\,\psi_0(x) + u(x)\,\psi_0^*(x)\right]dx. \tag{7c}$$
for a numerical solution of the modified bogoliubov-de gennes equations, the value of s is included in the root search, i.e. for the integration of equations (7a) and (7b) a value for s is guessed and subsequently compared with the result of equation (7c). since s is in general a complex value, this increases the dimension of the root search to 11. an example of a typical result is shown in figure 4. the discrepancy between the γ values at which the pitchfork bifurcation and the stability change occur vanishes; the two values agree, as is expected for a pitchfork bifurcation. this is true for all values of g.

figure 4. the same as in figure 2, but for the modified bogoliubov-de gennes equations (7a)–(7c). now the γ values at which the pitchfork bifurcation and the stability change occur are in perfect agreement.

thus, we conclude that the discrepancy appearing in figure 2 for the gross-pitaevskii equation is solely a result of the norm-dependent nonlinearity ∝ |ψ|² in the hamiltonian and is not a consequence of the interaction with higher excited states.
3.3. discussion

the reason why the stability change does not occur exactly at the bifurcation can also be understood intuitively. figure 1 indicates that the values γκ of the pitchfork bifurcation are not equal for all values of g. for g = 0 there is no pitchfork bifurcation; the two real eigenvalues κ vanish in a tangent bifurcation, and two complex eigenvalues emerge. only for non-vanishing g does the pitchfork bifurcation exist, and it moves to smaller values of γ for increasing |g|. if the norm of a wave function changes, this can also be understood as a variation of g: a wave function with a norm n = ||ψ||² has the same effect as a wave function with the norm 1 and a modified value g̃ = ng. the values γκ of the pitchfork bifurcation are obviously different for g and g̃. with this relation in mind, it is not surprising that a fluctuation changing the norm of the wave function may cause a qualitative change of the condensate's stability properties in the vicinity of γκ.

4. conclusions

we investigated the origin of the discrepancy between the value of the gain/loss parameter at which the ground state of a bose-einstein condensate in a double-well trap passes through a pitchfork bifurcation and the value at which its stability changes. naively, these γ values should be identical; however, it is found that this is not exactly fulfilled. since this discrepancy does not occur for a similar system, viz. the mean-field limit of a bose-hubbard dimer [14–17], we investigated the differences between the equations describing the two systems. it was found that the norm-dependency of the nonlinearity ∝ |ψ|² in the hamiltonian is unambiguously the origin of the discrepancy; it can be removed completely with the replacement |ψ|² → |ψ|²/||ψ||². an intuitive explanation can be given: fluctuations which change the norm of a stationary state are able to shift the position γκ of the bifurcation.

since the dynamical properties of the wave functions are crucial for the experimental observability of a pt-symmetric bose-einstein condensate, it is important to know about all processes introducing possible instabilities. as has been shown in this article, the stability relations are nontrivial close to branch points in condensate setups with gain and loss. the gross-pitaevskii equation has a norm-dependent nonlinearity, and therefore it should be clarified in future work which types of fluctuations influence the wave function's norm in such a way that additional instabilities appear. in particular, it would be interesting to see how the amplitude of a fluctuation is related to the size of the difference ∆γ. a deeper understanding of how this effect influences realistic setups generating the pt-symmetric external potential [34] would also be of high value.

acknowledgements

we thank eva-maria graefe for stimulating discussions.

references

[1] n. moiseyev. non-hermitian quantum mechanics. cambridge university press, cambridge, 2011.
[2] c. m. bender, et al. real spectra in non-hermitian hamiltonians having pt symmetry. phys rev lett 80:5243, 1998. doi:10.1103/physrevlett.80.5243
[3] e. p. gross. structure of a quantized vortex in boson systems. nuovo cimento 20:454, 1961. doi:10.1007/bf02731494
[4] l. p. pitaevskii. vortex lines in an imperfect bose gas. sov phys jetp 13:451, 1961.
[5] s. klaiman, et al. visualization of branch points in pt-symmetric waveguides. phys rev lett 101:080402, 2008. doi:10.1103/physrevlett.101.080402
[6] a. guo, et al. observation of pt-symmetry breaking in complex optical potentials. phys rev lett 103:093902, 2009. doi:10.1103/physrevlett.103.093902
[7] c. e. rüter, et al. observation of parity-time symmetry in optics. nat phys 6:192, 2010. doi:10.1038/nphys1515
[8] h. cartarius, et al. model of a pt-symmetric bose-einstein condensate in a δ-function double-well potential. phys rev a 86:013612, 2012. doi:10.1103/physreva.86.013612
[9] h. cartarius, et al. nonlinear schrödinger equation for a pt-symmetric delta-function double well. j phys a 45:444008, 2012. doi:10.1088/1751-8113/45/44/444008
[10] w. d. heiss, et al. spectral singularities in pt-symmetric bose-einstein condensates. j phys a 46:275307, 2013. doi:10.1088/1751-8113/46/27/275307
phys rev lett 103:093902, 2009. doi: 10.1103/physrevlett.103.093902 [7] c. e. rüter, et al. observation of parity-time symmetry in optics. nat phys 6:192, 2010. doi: 10.1038/nphys1515 [8] h. cartarius, et al. model of a pt -symmetric bose-einstein condensate in a δ-function double-well potential. phys rev a 86:013612, 2012. doi: 10.1103/physreva.86.013612 [9] h. cartarius, et al. nonlinear schrödinger equation for a pt -symmetric delta-function double well. j phys a 45:444008, 2012. doi: 10.1088/1751-8113/45/44/444008 [10] w. d. heiss, et al. spectral singularities in pt -symmetric bose-einstein condensates. j phys a 46:275307, 2013. doi: 10.1088/1751-8113/46/27/275307 137 http://dx.doi.org/10.1103/physrevlett.80.5243 http://dx.doi.org/10.1007/bf02731494 http://dx.doi.org/10.1103/physrevlett.101.080402 http://dx.doi.org/10.1103/physrevlett.103.093902 http://dx.doi.org/10.1038/nphys1515 http://dx.doi.org/10.1103/physreva.86.013612 http://dx.doi.org/10.1088/1751-8113/45/44/444008 http://dx.doi.org/10.1088/1751-8113/46/27/275307 a. löhle, h. cartarius, d. haag et al. acta polytechnica [11] d. dast, et al. a bose-einstein condensate in a pt symmetric double well. fortschr physik 61:124–139, 2013. doi: 10.1002/prop.201200080 [12] d. dast, et al. eigenvalue structure of a bose-einstein condensate in a pt -symmetric double well. j phys a 46:375301, 2013. doi: 10.1088/1751-8113/46/37/375301 [13] h. cartarius, et al. stationary and dynamical solutions of the gross-pitaevskii equation for a bose-einstein condensate in a pt symmetric double well. acta polytechnica 53(3):259–267, 2013. [14] e. m. graefe, et al. a non-hermitian pt symmetric bose-hubbard model: eigenvalue rings from unfolding higher-order exceptional points. j phys a 41:255206, 2008. doi: 10.1088/1751-8113/41/25/255206 [15] e. m. graefe, et al. mean-field dynamics of a non-hermitian bose-hubbard dimer. phys rev lett 101:150408, 2008. doi: 10.1103/physrevlett.101.150408 [16] e. m. graefe, et al. quantum-classical correspondence for a non-hermitian bose-hubbard dimer. phys rev a 82:013629, 2010. doi: 10.1103/physreva.82.013629 [17] e. m. graefe. stationary states of a pt symmetric two-mode bose-einstein condensate. j phys a 45:444015, 2012. doi: 10.1088/1751-8113/45/44/444015 [18] h. ramezani, et al. unidirectional nonlinear pt -symmetric optical structures. phys rev a 82:043803, 2010. doi: 10.1103/physreva.82.043803 [19] z. h. musslimani, et al. optical solitons in pt periodic potentials. phys rev lett 100:30402, 2008. doi: 10.1103/physrevlett.100.030402 [20] f. k. abdullaev, et al. dissipative periodic waves, solitons, and breathers of the nonlinear schrödinger equation with complex potentials. phys rev e 82:056606, 2010. doi: 10.1103/physreve.82.056606 [21] y. v. bludov, et al. nonlinear patterns in bose-einstein condensates in dissipative optical lattices. phys rev a 81:013625, 2010. doi: 10.1103/physreva.81.013625 [22] r. driben, et al. stability of solitons in parity-time-symmetric couplers. opt lett 36:4323–4325, 2011. doi: 10.1364/ol.36.004323 [23] f. k. abdullaev, et al. solitons in pt -symmetric nonlinear lattices. phys rev a 83:41805, 2011. doi: 10.1103/physreva.83.041805 [24] y. v. bludov, et al. stable dark solitons in pt -symmetric dual-core waveguides. phys rev a 87:013816, 2013. doi: 10.1103/physreva.87.013816 [25] k. g. makris, et al. beam dynamics in pt symmetric optical lattices. phys rev lett 100:103904, 2008. doi: 10.1103/physrevlett.100.103904 [26] k. g. makris, et al. pt -symmetric optical lattices. 
phys rev a 81:063807, 2010. doi:10.1103/physreva.81.063807
[27] a. ruschhaupt, et al. physical realization of pt-symmetric potential scattering in a planar slab waveguide. j phys a 38:l171, 2005. doi:10.1088/0305-4470/38/9/l03
[28] r. el-ganainy, et al. theory of coupled optical pt-symmetric structures. opt lett 32:2632, 2007. doi:10.1364/ol.32.002632
[29] v. jakubský, et al. an explicitly solvable model of the spontaneous pt-symmetry breaking. czech j phys 55:1113–1116, 2005. doi:10.1007/s10582-005-0115-x
[30] a. mostafazadeh. delta-function potential with a complex coupling. j phys a 39:13495, 2006. doi:10.1088/0305-4470/39/43/008
[31] a. mostafazadeh, et al. spectral singularities, biorthonormal systems and a two-parameter family of complex point interactions. j phys a 42:125303, 2009. doi:10.1088/1751-8113/42/12/125303
[32] h. mehri-dehnavi, et al. application of pseudo-hermitian quantum mechanics to a complex scattering potential with point interactions. j phys a 43:145301, 2010. doi:10.1088/1751-8113/43/14/145301
[33] h. f. jones. interface between hermitian and non-hermitian hamiltonians in a model calculation. phys rev d 78:065032, 2008. doi:10.1103/physrevd.78.065032
[34] m. kreibich, et al. hermitian four-well potential as a realization of a pt-symmetric system. phys rev a 87:051601(r), 2013. doi:10.1103/physreva.87.051601

acta polytechnica 55(3):193–198, 2015. doi:10.14311/ap.2015.55.0193

analysis of transonic flow past cusped airfoil trailing edge

jiří stodůlka∗, pavel šafařík
ctu in prague, faculty of mechanical engineering, department of fluid mechanics and thermodynamics, technická 4, 166 07 prague 6, czech republic
∗ corresponding author: jiri.stodulka@fs.cvut.cz

abstract. in order to verify the limits of theoretical design methods, transonic flow past two designed cusped airfoils is numerically solved and studied.
the achieved results are compared with the theoretical predictions and then analyzed in terms of flow behavior and oblique shock formation using known classical gas dynamics relations. the regions around the sharp trailing edges are studied in detail; the parameters of the shock waves are solved and compared using the classical shock polar approach and verified by reduced parameters for symmetric configurations.

keywords: oblique shock; shock polar; trailing edge; cusped airfoil; transonic flow; cfd.

1. introduction

to this day, transonic flow represents a very challenging topic. although modern cfd is very powerful, it is good to keep classical methods in mind in order to verify the data obtained from numerical simulations, as nicely reminded in [1]. for a demonstration of such analyses, some cases from the pre-computational era are still relevant and can serve as perfect examples, thanks to their known analytical solutions and limits. one such example, a cusped airfoil pointed into a sonic free stream that has been studied in depth using modern cfd tools and that is exactly described by near-sonic theory [2], can be found in reference [3]. even though good correspondence was found between the design theory and the numerics, there are still bounds or limits beyond which the results are invalid and therefore questionable. to confirm the observed behavior, classical gas dynamics methods can be used to evaluate the character of the flow field; more specifically, to evaluate the parameters of the oblique shock waves formed at the trailing edge and to see whether any deviations from standard behavior appear, especially near the limits of the transonic design methods.

2. theory and design of transonic cusped airfoils

the mathematical methods solving the transonic flow field are based on potential equations. to avoid the nonlinearity of the basic system, the solution is transformed into a modified hodograph plane, replacing the physical coordinates x, y with the azimuthal angle ϑ and the prandtl-meyer angle υ. concentrating only on flows with small perturbations to sonic flow simplifies the system enough to get an exact solution for both the subsonic field, using conformal mapping methods, and the supersonic field, using the method of characteristics. analytical solutions of these hodograph relations, described in [2], allow us to derive the formulae defining the shape, the flow conditions and the pressure coefficient for cusped airfoils in a uniform sonic flow m = 1.

the solution for such airfoils is described in fig. 1, and a schematic view of a cusp with all the defined parameters is shown in fig. 2 [2]. the solution in the real plane is on the left of fig. 1, the modified hodograph plane on the right, where the axis t corresponds to the flow angle and s is a function of the mach number, or of the prandtl-meyer angle. the sharp leading edge (point a) cuts through the sonic flow at a certain angle. then, due to the smooth acceleration of the transonic flow from sub- to supersonic past the airfoil, crossed sonic lines appear (b, e). up to this point, quasi-conformal mapping is used; from here, the solution switches to characteristic mapping. the neutral characteristics (c, f) appear, and the flow still accelerates smoothly up to the trailing edge, where two oblique shocks are formed (d, g).

figure 1. cusp solution in the real and the rheograph plane [2].
figure 2. the cusp solution and parameters [3].

the previous solution results in a cusp shape described by two parameters: the thickness parameter τ, defined as the thickness-to-chord ratio, and the camber parameter ω, defined as the camber-to-chord ratio. the solution is exact for τ → 0 and practically valid for slender airfoils with τ ≤ 0.5. the case with camber-to-chord ratio ω = 0 is symmetrical and also known as "guderley's cusp". the range of validity for cambered airfoils is given by ω/τ ≤ 0.5. the camber/thickness parameter, which is afterwards used to generate the geometry and the flow description, is given by
$$P\!\left(\frac{\omega}{\tau}\right) = 2^{13/2}\cdot 3^{3/2}\cdot 5^{-7/2}\,\frac{\omega}{\tau}\left(1 + 2^{12}\cdot 3\cdot 5^{-6}\left(\frac{\omega}{\tau}\right)^2\right)^{-1/2}. \tag{1}$$
the solution is exact for τ → 0 and practically valid for slender airfoils with τ ≤ 0.5. the case with camber to chord ratio ω = 0 is symmetrical, and also known as “guderley’s cusp”. the range of validity for cambered airfoils is given by ω/τ ≤ 0.5. the camber/thickness parameter which is afterwards used to generate the geometry and the flow description is given by p (ω τ ) = 213/2 · 33/2 · 5−7/2 ω τ ( 1 + 212 · 3 · 5−6 (ω τ )2)−1/2 . (1) when the cusp is cambered, it is pointing into the flow and it is smoothly passed by the stream, so the 193 http://dx.doi.org/10.14311/ap.2015.55.0193 http://ojs.cvut.cz/ojs/index.php/ap jiří stodůlka, pavel šafařík acta polytechnica figure 1. cusp solution in the real and the rheograph plane [2]. figure 2. the cusp solution and parameters [3]. flow is not forced to change its direction around the sharp leading edge. the angle of attack is then α = τ · 2−9/2 · 3−1/2 · 55/2p 1 − 2−1 · 3−4 · 5 · 13p 2 (1 − 2−1 · 3−2 · 5p 2)3/2 . (2) the geometry vertex data for the family of cambered airfoils are given by yp(x) = τ ·x(1 −x)( 22 ω τ ± 2−2 · 3−3/2 · 55/2 ·x1/2 ) . (3) and finally the pressure coefficient cp = (52 · τ)2/3 (22 · 3(κ + 1))1/3 ( 1 − 2−2 · 3−2 · 5p 2 1 − 2−1 · 3−2 · 5p 2 − 2−1 · 5x ∓ 2−1/2 · 3−1 · 5p ·x1/2 (1 − 2−1 · 3−2 · 5p 2)−1/2 ) , (4) where κ is specific heat ratio. knowing this, we have a complete analytical solution of the problem of sharp cusped airfoils in a sonic free stream. 3. numerical simulations for numerical simulations, the two following variants were chosen: case i with parameters somewhere in the middle of the exact solution bounds, τ = 0.05 and ω/τ = 0.02 (fig. 3a), and case ii at the limit of theoretical solution validity, with thickness to chord ratio τ = 0.1 and parameter ω/τ = 0.5 (fig. 3b). case ii is very interesting, because it is on the limit of validity, where the theory predicts a questionable solution figure 3. predicted flow behavior. around the lower side of the profile and around the trailing edge. for the numerical simulation of the flow the inviscid euler model dealing with ideal gas implemented in the ansys fluent commercial cfd software was used. this simple model was chosen to be as comparable with the exact solution based on potential flow theory as possible. some real gas applications to the transonic flow can be found, for instance in [4]. the ausm numerical flux scheme was used on a structured quad mesh with the pressure far-field boundary condition that guarantees the sonic free stream. a detailed description of the simulation itself, the model, solver settings and boundary conditions is presented in [3]. the results describing the flow field behavior, shaded by contours of the mach number for both variants are shown in figs. 3.2 and 3.3. 194 vol. 55 no. 3/2015 analysis of transonic flow past cusped airfoil trailing edge figure 4. contours of the mach number, case i, τ = 0.05 and ω/τ = 0.02. the results confirm the behavior of the flow expected from the theory. no shocks appear anywhere near the leading edge and the flow velocities are locally subsonic here. the flow is then smoothly accelerated along the profile to supersonic values. the supersonic region is subsequently closed by oblique shocks starting from the trailing edge. however, the differences between the variants are obvious. the first one (fig. 4), a thin and slightly cambered case, forms two shocks with different strengths, while the second (fig. 
3. numerical simulations

for the numerical simulations, the two following variants were chosen: case i, with parameters somewhere in the middle of the exact solution bounds, τ = 0.05 and ω/τ = 0.02 (fig. 3a), and case ii, at the limit of theoretical solution validity, with thickness to chord ratio τ = 0.1 and parameter ω/τ = 0.5 (fig. 3b). case ii is very interesting because it is on the limit of validity, where the theory predicts a questionable solution around the lower side of the profile and around the trailing edge.

figure 3. predicted flow behavior.

for the numerical simulation of the flow, the inviscid euler model for an ideal gas, implemented in the ansys fluent commercial cfd software, was used. this simple model was chosen so as to be as comparable as possible with the exact solution based on potential flow theory. some real gas applications to transonic flow can be found, for instance, in [4]. the ausm numerical flux scheme was used on a structured quad mesh with the pressure far-field boundary condition that guarantees the sonic free stream. a detailed description of the simulation itself, the model, the solver settings and the boundary conditions is presented in [3]. the results describing the flow field behavior, shaded by contours of the mach number, are shown for both variants in figs. 4 and 5.

figure 4. contours of the mach number, case i, τ = 0.05 and ω/τ = 0.02.
figure 5. contours of the mach number, case ii, τ = 0.1 and ω/τ = 0.5.

the results confirm the behavior of the flow expected from the theory. no shocks appear anywhere near the leading edge, and the flow velocities are locally subsonic there. the flow is then smoothly accelerated along the profile to supersonic values. the supersonic region is subsequently closed by the oblique shocks starting from the trailing edge. however, the differences between the variants are obvious. the first one (fig. 4), a thin and slightly cambered case, forms two shocks with different strengths, while the second (fig. 5), a thicker and more cambered limit variant, produces higher mach numbers and only one oblique shock, on the upper side. on the lower side of the profile, the flow is not accelerated enough and no shock is visible; instead, a slip line is recognizable behind the profile. the situation around the trailing edge of the second variant particularly invites further investigation because, as mentioned, it is the limit case, so the theory cannot give us an absolutely exact solution here, and all the cfd results ought to be validated before being pronounced relevant.

4. classical gas dynamics analysis

the following physical model of the interaction of supersonic flows at a sharp trailing edge, depicted in fig. 6, is formulated for the evaluation of the oblique shock parameters. shock wave a is a shock wave of the first family (a left-running shock wave) and shock wave b is a shock wave of the second family (a right-running shock wave). the total pressures are equal:
$$p_{01} = p_{02}. \quad (5)$$
the basic conditions on the discontinuity d downstream of the sharp trailing edge are the equality of static pressures,
$$p_3 = p_4, \quad (6)$$
and the equality of azimuthal flow angles,
$$\vartheta_3 = \vartheta_4. \quad (7)$$

figure 6. trailing edge oblique shocks configuration.

for the solution of the flow conditions, shock polar diagrams are used. these diagrams give a family of possible solutions in terms of pressure ratio versus turning angle for given values of the mach number m, with the angle β of the shock wave to the incoming flow as a parameter.

        case i   case ii
m1      1.51     2.29
m2      1.27     1.15
δte     15.1°    28.8°

table 1. gas dynamics analysis input data.

the pressure ratio on shock wave a is given by the following formula:
$$\frac{p_3}{p_1} = \frac{\kappa-1}{\kappa+1}\Big(\frac{2\kappa}{\kappa-1}\,M_1^2\sin^2\beta_a - 1\Big), \quad (8)$$
where κ is the ratio of specific heat capacities, m1 is the mach number in region 1 in fig. 6, and βa is the angle of shock wave a to the incoming flow. analogously, the pressure ratio on shock wave b is given by
$$\frac{p_4}{p_2} = \frac{\kappa-1}{\kappa+1}\Big(\frac{2\kappa}{\kappa-1}\,M_2^2\sin^2\beta_b - 1\Big). \quad (9)$$
the ratios of total to static pressures in regions 1 and 2 are given by the isentropic formula
$$\frac{p_{01}}{p_1} = \Big(1 + \frac{\kappa-1}{2}M_1^2\Big)^{\kappa/(\kappa-1)} \quad (10)$$
and
$$\frac{p_{02}}{p_2} = \Big(1 + \frac{\kappa-1}{2}M_2^2\Big)^{\kappa/(\kappa-1)}. \quad (11)$$
the turning angles of the flow on shock waves a and b are given by [5]
$$\tan\delta_a = \frac{2}{\tan\beta_a}\,\frac{M_1^2\sin^2\beta_a - 1}{M_1^2(\kappa + \cos 2\beta_a) + 2} \quad (12)$$
and
$$\tan\delta_b = \frac{2}{\tan\beta_b}\,\frac{M_2^2\sin^2\beta_b - 1}{M_2^2(\kappa + \cos 2\beta_b) + 2}. \quad (13)$$
thanks to that, the only input data required for the analysis are the incoming mach numbers and the trailing edge angle, shown in tab. 1.

the azimuthal flow angles ϑ3 and ϑ4 can be expressed as follows:
$$\vartheta_3 = \vartheta_1 + \delta_a \quad (14)$$
and
$$\vartheta_4 = \vartheta_2 - \delta_b. \quad (15)$$
the trailing edge angle is, of course,
$$\delta_{te} = \vartheta_2 - \vartheta_1 = \delta_a + \delta_b. \quad (16)$$

figure 7. shock polars for both variants: a) case i, b) case ii.

equations (8), (10) and (12) give the final dependence of the static to total pressure ratio p3/p01 on the turning angle δa for m1 = const, with βa varying over ⟨arcsin(1/m1), 90°⟩. this dependence is depicted in the diagrams in fig. 7 as a blue curve. equations (9), (11) and (13) give the final dependence of the ratio p4/p01 on the turning angle δb for m2 = const, with βb varying over ⟨arcsin(1/m2), 90°⟩. this dependence is depicted in the diagrams in fig. 7 as a red curve.
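the shock polar construction just described lends itself to a direct numerical check. below is a minimal python/numpy sketch, not from the paper: it sweeps the wave angle β through the reconstructed eqs. (8)–(13), keeps only the stable weak-shock branches, and searches for the split of the trailing edge angle satisfying conditions (5)–(7) for the case i inputs of tab. 1. the interpolation-based intersection search is our own simplification.

```python
import numpy as np

KAPPA = 1.4

def turning_angle(m, beta):
    """eqs. (12)-(13): flow turning angle across an oblique shock [rad]."""
    return np.arctan(2.0 / np.tan(beta)
                     * (m**2 * np.sin(beta)**2 - 1.0)
                     / (m**2 * (KAPPA + np.cos(2.0 * beta)) + 2.0))

def p_over_p0(m, beta):
    """static-to-total pressure ratio behind the shock, combining the
    shock relation (8)/(9) with the isentropic relation (10)/(11)."""
    shock = (KAPPA - 1.0) / (KAPPA + 1.0) * (
        2.0 * KAPPA / (KAPPA - 1.0) * m**2 * np.sin(beta)**2 - 1.0)
    isentropic = (1.0 + 0.5 * (KAPPA - 1.0) * m**2) ** (KAPPA / (KAPPA - 1.0))
    return shock / isentropic

def weak_branch(m):
    """trace one polar and keep only the weak-shock (stable) branch."""
    beta = np.linspace(np.arcsin(1.0 / m) + 1e-6, 0.5 * np.pi, 4000)
    d, p = turning_angle(m, beta), p_over_p0(m, beta)
    i = np.argmax(d)        # detachment point (maximum turning angle)
    return d[:i], p[:i]

# case i inputs from tab. 1
m1, m2, delta_te = 1.51, 1.27, np.radians(15.1)
da, pa = weak_branch(m1)
db, pb = weak_branch(m2)

# scan the split of the trailing edge angle, eq. (16), and pick the one
# equalising the static pressures, conditions (5)-(6)
split = np.linspace(0.0, delta_te, 1000)
mismatch = np.abs(np.interp(split, da, pa) - np.interp(delta_te - split, db, pb))
best = split[np.argmin(mismatch)]
print(f"delta_a ~ {np.degrees(best):.1f} deg, "
      f"delta_b ~ {np.degrees(delta_te - best):.1f} deg")
```

run on the case i data, this reproduces turning angles close to the 10.8°/4.3° quoted below.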
note that, owing to the ambiguous character of supersonic flow, these relations give two values of the pressure ratio for each possible turning angle fulfilling conditions (5) and (6) [6], up to the maximum value at which the wave detaches and changes into a normal shock. the solution with the higher value of p/p01 represents the unstable strong shock solution, and the solution with the lower value of p/p01 represents the weak, stable solution. the points of intersection define the overall solution of the supersonic flow past the sharp trailing edge.

the results of this analysis are the shock polars shown in fig. 7. every "half-heart-shaped" line represents one side of the profile, and the vertical line marks the value of the trailing edge angle. shock polars for the case i airfoil configuration, with a solid vertical line representing the trailing edge angle of 15.1°, are depicted in fig. 7a. both polars intersect twice and, considering that we are looking for a stable solution, the result is the lower point of intersection. the absolute value of the flow turning angle is approximately 10.8° on the upper side and 4.3° on the lower side. to compare these numbers with the cfd results, the values taken from the cell nearest to the shock are approximately 10.9° for the upper side and 4.0° for the lower side. that is a very satisfying result, considering the finite character of the computational mesh on the one hand and the ideal gas dynamics theory on the other.

in fig. 7b, the shock polars for the case ii airfoil (the thicker, more cambered, limit variant) with the trailing edge angle of 28.8° are shown. the first noticeable difference is the fact that the polars do not intersect, resulting in a questionable, irregular solution for this configuration. however, the contour in fig. 5 above, with no obvious oblique shock on the lower side of the profile, already indicated this nonstandard behavior.

5. reduced parameter analysis of supersonic flow past the trailing edge

for a deeper investigation of this problem, the model of nonsymmetrical supersonic flow past a trailing edge [7] is proposed (fig. 8). the nonsymmetrical supersonic flow past a trailing edge can be reduced to a symmetric case by means of the following relations, in order to obtain results for the analysis of the irregular configuration as well.

figure 8. trailing edge configuration with reduced parameters.

the reduced value of the azimuthal angle of the flow upstream and downstream of a symmetric trailing edge (namely, the azimuthal angle of the discontinuity d) is given by
$$\vartheta_{red} = \frac{\vartheta_1 + \upsilon_1 + \vartheta_2 - \upsilon_2}{2} \quad (17)$$
and the reduced value of the prandtl-meyer function by
$$\upsilon_{red} = \vartheta_1 + \upsilon_1 - \vartheta_{red} = \vartheta_{red} - \vartheta_2 + \upsilon_2, \quad (18)$$
where the prandtl-meyer function is [5]
$$\upsilon(M) = \sqrt{\frac{\kappa+1}{\kappa-1}}\,\mathrm{arctg}\sqrt{\frac{\kappa-1}{\kappa+1}\big(M^2-1\big)} - \mathrm{arctg}\sqrt{M^2-1}. \quad (19)$$
the reduced parameter analysis proved that the model in fig. 8 can also be applied to sharp trailing edges. the application of equations (17) and (18) to case i of the cusped airfoil confirmed a regular interaction of the supersonic flows at the sharp trailing edge. the reduced value of the azimuthal angle is ϑred = 10.97° and the reduced value of the prandtl-meyer function is υred = 1.23°. these figures confirm the numbers obtained from the basic analysis.
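eqs. (17)–(19) are straightforward to evaluate. the sketch below (python/numpy, ours, not the authors' code) assumes the upstream flow angles ϑ1, ϑ2 and mach numbers m1, m2 are known, e.g. extracted from the cfd field.

```python
import numpy as np

KAPPA = 1.4

def prandtl_meyer(m):
    """eq. (19): prandtl-meyer function, returned in degrees."""
    g = (KAPPA + 1.0) / (KAPPA - 1.0)
    return np.degrees(np.sqrt(g) * np.arctan(np.sqrt((m**2 - 1.0) / g))
                      - np.arctan(np.sqrt(m**2 - 1.0)))

def reduced_parameters(theta1, theta2, m1, m2):
    """eqs. (17)-(18): reduce a nonsymmetrical trailing-edge flow to the
    equivalent symmetric case. all angles in degrees."""
    nu1, nu2 = prandtl_meyer(m1), prandtl_meyer(m2)
    theta_red = 0.5 * (theta1 + nu1 + theta2 - nu2)
    nu_red = theta1 + nu1 - theta_red   # identical to theta_red - theta2 + nu2
    return theta_red, nu_red
```

with the case-specific flow angles taken from the simulation, this reduction should reproduce the quoted values ϑred ≈ 10.97° and υred ≈ 1.23° for case i.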
for case ii of the cusped airfoil, the reduced value of the azimuthal angle is ϑred = 30.23° and the prandtl-meyer function is υred = 3.80°, while the flow angle obtained from the numerical simulation is approx. 30.8°. this also corresponds well with the reduced angle value. however, further analysis revealed the important fact that the condition for the upper branch of the exit shock waves, δa ≤ δa,max, is not fulfilled. the angle of the shock wave to the incoming flow βa,max, corresponding to the maximum turning angle δa,max, is given by the following expression [8]:
$$\beta_{a,max} = \arcsin\Bigg\{\bigg[\frac{1}{\kappa M_1^2}\bigg(\frac{\kappa+1}{4}M_1^2 - 1 + \sqrt{(\kappa+1)\Big(1 + \frac{\kappa-1}{2}M_1^2 + \frac{\kappa+1}{16}M_1^4\Big)}\bigg)\bigg]^{1/2}\Bigg\}. \quad (20)$$
the interaction of the supersonic flows at the trailing edge for case ii is not regular, and the supersonic flow on the upper side of the profile can either separate upstream of the trailing edge [9], or can have an unstable, unsteady character.
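as a quick check of this regularity condition, eq. (20) can be combined with the turning-angle relation (12) to obtain δa,max; a minimal python/numpy sketch of ours, evaluated for the case ii upper-side mach number:

```python
import numpy as np

KAPPA = 1.4

def beta_max(m1):
    """eq. (20): wave angle giving the maximum turning angle [rad]."""
    inner = (1.0 / (KAPPA * m1**2)) * (
        (KAPPA + 1.0) / 4.0 * m1**2 - 1.0
        + np.sqrt((KAPPA + 1.0) * (1.0 + 0.5 * (KAPPA - 1.0) * m1**2
                                   + (KAPPA + 1.0) / 16.0 * m1**4)))
    return np.arcsin(np.sqrt(inner))

def delta_max(m1):
    """maximum turning angle: eq. (12) evaluated at beta_max."""
    b = beta_max(m1)
    return np.arctan(2.0 / np.tan(b) * (m1**2 * np.sin(b)**2 - 1.0)
                     / (m1**2 * (KAPPA + np.cos(2.0 * b)) + 2.0))

# case ii, upper side: attainable maximum turning angle for m1 = 2.29
print(f"delta_a,max ~ {np.degrees(delta_max(2.29)):.1f} deg")
# this comes out near 27 deg, below the ~30 deg turning implied by the
# reduced analysis above, which is consistent with the irregular interaction.
```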
6. conclusion

the transonic flow past cusped airfoil profiles was studied, focusing especially on the sharp trailing edge region. two profiles – case i and case ii – with known solutions were designed according to the modified hodograph methods for potential flow. the flow fields were solved using the ansys fluent numerical code for both profile cases. a detailed analysis based on classical gas dynamics proved, for case i, a good accordance of the shock wave parameters at the trailing edge for a regular interaction of supersonic flows. both the classical gas dynamics analysis and the reduced parameter analysis proved that the transonic flow past case ii resulted in an irregular interaction. for case ii, a possible separation of the flow upstream of the trailing edge, or some unsteady behavior, is predicted.

acknowledgements

this work has been supported by the grant agency of the czech technical university in prague, grant sgs13/180/ohk2/3t/12. the support from the technology agency of the czech republic under project te01020036 is gratefully acknowledged.

list of symbols

x x-coordinate [m]
y y-coordinate [m]
ϑ azimuthal angle [°]
υ prandtl-meyer function [°]
m mach number [–]
t t-coordinate [°]
s s-coordinate [°]
p camber/thickness parameter [–]
α angle of attack [°]
τ thickness to chord ratio [–]
ω camber to chord ratio [–]
c chord length [m]
cp pressure coefficient [–]
p pressure [pa]
β wave angle [°]
δ turning angle [°]
κ specific heat capacity ratio [–]

references

[1] jameson, a., ou, k.: 50 years of transonic aircraft design. progress in aerospace sciences, vol. 44, is. 5, 2011, pp. 308–318. doi:10.1016/j.paerosci.2011.01.001
[2] sobieczky, h.: tragende schnabelprofile in stossfreier schallanströmung. zamp 26, 1975 (in german)
[3] stodulka, j., sobieczky, h.: on transonic flow models for optimized design and experiment. european physical journal web of conferences 67, 2014. doi:10.1051/epjconf/20146702111
[4] halama, j.: transonic flow of wet steam – numerical simulation. acta polytechnica, vol. 52, no. 6, 2012
[5] shapiro, a. h.: the dynamics and thermodynamics of compressible fluid flow. the ronald press company, new york, 1953
[6] safarik, p.: the flow in the supersonic exit of blade cascades. archives of mechanics, vol. 26, no. 3, 1974, pp. 529–533
[7] safarik, p.: application of the method of characteristics to the calculation of flow past blade cascades. report of it cas no. t-178/75, 1975 (in czech)
[8] zucrow, m. j., hoffman, j. d.: gas dynamics. john wiley and sons, inc., new york, 1976
[9] bibin, j., kulkarni, v. n., natarajan, g.: shock wave boundary layer interactions in hypersonic flows. international journal of heat and mass transfer, vol. 70, 2014, pp. 81–90. doi:10.1016/j.ijheatmasstransfer.2013.10.072

acta polytechnica doi:10.14311/ap.2015.55.0154 acta polytechnica 55(3):154–161, 2015 © czech technical university in prague, 2015 available online at http://ojs.cvut.cz/ojs/index.php/ap

comparison of classical and interactive multi-robot exploration strategies in populated environments

nassim kalde a,b,c,∗, olivier simonin d, françois charpillet a,b,c

a inria, villers-lès-nancy, 54600, france
b université de lorraine, loria, umr 7503, vandoeuvre-lès-nancy, 54506, france
c cnrs, loria, umr 7503, vandoeuvre-lès-nancy, 54506, france
d citi-inria laboratory, insa lyon, 69621 villeurbanne, france
∗ corresponding author: nassim.kalde@loria.fr

abstract. multi-robot exploration consists in coordinating robots for mapping an unknown environment. it raises several issues concerning task allocation, robot control, path planning and communication. we study exploration in populated environments, in which pedestrian flows can severely impact performance. however, humans have adaptive skills for taking advantage of these flows while moving. therefore, in order to exploit these human abilities, we propose a novel exploration strategy that explicitly allows for human-robot interactions. our model for exploration in populated environments combines the classical frontier-based strategy with our interactive approach. we implement interactions where robots can locally choose a human guide to follow, and we define a parametric heuristic to balance interaction and frontier assignments. finally, we evaluate to which extent human presence impacts our exploration model in terms of coverage ratio, travelled distance and elapsed time to completion.

keywords: artificial intelligence, multi-agent system, robotic exploration, human-robot interaction.

1. introduction

mobile robots intervene in our daily life and provide services (e.g., guidance, assistance [1, 2]) or even leisure activities (e.g., providing company, dancing [3, 4]). this intrusion of mobile robots into citizens' day-to-day lives must take people into account, while seeking social compliance. human activities and motion patterns have already been studied [5], so that a robot can learn a model of human behaviour, generate a socially compliant control and apply it. for example, by observing pedestrians walking nearby, a robot could model the pedestrian dynamics and generate its own navigation control for efficiently navigating in populated environments. multi-robot exploration (mre) consists in reconstructing all the reachable space of an unknown environment by controlling mobile robots. introducing human presence awareness into a robotic exploration system for populated environments constitutes an interesting route for study purposes. indeed, it paves the way for human-robot interaction (hri)-based exploration approaches.
in populated environments, the exploration task raises new concerns regarding clean reconstruction and efficient robot coordination. concerning reconstruction quality, it is particularly difficult to separate the static aspects (background) from the dynamic aspects (people, robots) of the scene [6]. obviously, mobile robot perceptions are biased due to the dynamics of the environment, thus hindering localisation and mapping. regarding the selection of targets to explore, pedestrian movements create a spatiotemporal reachability of known/unknown areas, making exploration tricky. in fact, the reachable space evolves dynamically according to the density of the human presence. nevertheless, humans can understand the dynamics of their environment, and can sense, decide and act adequately. in this sense, we can assume that every person has an adaptive heuristic, depending on the local environment, that allows him or her to walk readily through dense areas (e.g., crowds). we are interested in exploiting human skills as possible heuristics for the exploration task. we propose a weighted heuristic that incorporates human presence for selecting areas to explore or for initiating human-robot interactions.

this is followed by a brief state of the art in mre, and we also situate our approach among hri applications in mobile robotics. in the third section, we formalise the multi-agent system for exploration in populated environments, and present the framework of our study. the fourth section defines the mixed exploration approach (robot-frontier/interaction) and proposes a human-aware exploration heuristic for establishing human-following interactions. we then perform several experiments with our mixed approach in simulation, to underline the variability of the performance depending on the environment. finally, we discuss our results and perspectives regarding machine learning for adaptive heuristic parameterisation.

figure 1. multi-agent system simulated in v-rep [19].

2. related work

first, this section presents previous work in the field of mre. then, we situate our study among mobile robotic applications of hri.

2.1. multi-robot exploration

the mre problem consists in acquiring an accurate representation of an environment by efficiently coordinating the actions of the robots within it. representation accuracy refers to the degree of closeness to the ground truth. coordination of the robots arises from the teamwork involved in solving the task. coordination efficiency can be evaluated at several levels, e.g., energy consumption, trajectory overlapping, etc. thus, mre solutions design efficient control of robots for accurately completing a chosen representation of the environment (e.g., a graph). the proposed solutions can be roughly classified into reactive, goal-based and utility-based agent designs. within reactive and bio-inspired approaches, the actions of an agent are hardwired to its perceptions, and simple navigation rules can be created, e.g., following walls, or circular or boustrophedon patterns [7–10]. these approaches are usually only concerned with coverage. they do not always consider mapping, as typical reactive agents are memory-less. for goal-based agents, frontier-based methods offer a good computation/performance tradeoff, making them particularly suitable for deployment in real embedded systems.
the idea is to incrementally assign robots to frontiers (separating the known and free space from the unknown space), thus serializing the exploration task into subgoals. various frontier evaluation and target selection methods are discussed in the literature [11–13]. in utility-based approaches, an agent makes decisions according to the value of world states. the information gain value is proposed in [14], and [15] considers curiosity, surprise and hunger as motivational agents. in our present study, we consider goal-based agents, with frontiers to reach and humans to interact with as subgoals during exploration. we are looking for a parametric heuristic that evaluates/balances frontiers and interactions for exploiting human adaptive skills.

2.2. human-robot interaction

hri is defined as the study of humans, robots and their mutual influences [16]. to the best of our knowledge, the office-conversant robot jijo-2 is the only hri application of mobile robotic exploration. this robot exhibits socially embedded learning mechanisms [17], gathering information while conversing with other people. thus, it realises semi-supervised learning by incorporating local oracle heuristics while exploring. we present an application in mobile robotics considering close interactions established by proximity or direct perception between humans and robots. this type of interaction belongs to the intimate interaction class defined by takeda in his hri classification [18]. our study bridges together intimate hri applications and mre goal-based algorithms.

3. multi-agent system formalisation

3.1. modelling the environment and the agents

we propose a model for representing the multi-agent system for populated environments (fig. 1). in (1), the environment to explore is described by a navigation map e, which evolves over time. this evolution results from the actions of the agents (humans h and robots r). at each time t, a robot ri from r has a configuration from which it observes the environment; $o_i^t$ is the observed subset of e, corresponding to the robot's observation at time t. formally, let e be a navigation map, and
$$R = \{r_1, \ldots, r_n\} \text{ (robots)}, \quad H = \{h_1, \ldots, h_m\} \text{ (humans)}, \quad o_i \subset E \text{ (observation of } r_i\text{)}. \quad (1)$$

3.2. exploration and completion

for the exploration task, we must represent the environment explored by the robots over time (2). let $\theta_i^{0:t}$ be the set of observations, namely the local t-time history, that agent ri has experienced up to time t. similarly, $\theta^{0:t}$ is the global t-time history, which aggregates the local t-time histories from r.

figure 2. cell states transition diagram.
figure 3. full grid and robot observation at t = 1.

thus, we have:
$$\theta_i^{0:t} = \theta_i^{0:t-1} \cup o_i^t, \qquad \theta^{0:t} = \bigcup_{i=1}^{n} \theta_i^{0:t}. \quad (2)$$

it is of fundamental importance for the robots to know when the exploration is finished. the completion criterion determines this moment and can be defined locally on each robot. robots determine exploration completion based upon the already explored space θ. the mission is over as soon as there is no configuration in the already explored space that allows for new observations.
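as an illustration of the history aggregation in eq. (2), here is a minimal python sketch; the dictionary-based grid representation is ours, not the paper's.

```python
from enum import Enum

class Cell(Enum):
    UNKNOWN = 0
    FREE = 1
    OCCUPIED = 2
    ANIMATED = 3

def aggregate_histories(local_histories):
    """eq. (2): the global t-time history is the union of the robots'
    local t-time histories (here: dicts mapping grid cell -> last state)."""
    global_history = {}
    for theta_i in local_histories:
        global_history.update(theta_i)
    return global_history

# two robots' local histories over a grid (cells as (x, y) tuples)
theta_1 = {(1, 1): Cell.ANIMATED, (1, 2): Cell.ANIMATED, (2, 1): Cell.FREE}
theta_2 = {(3, 3): Cell.OCCUPIED, (2, 3): Cell.FREE}
print(len(aggregate_histories([theta_1, theta_2])), "cells explored")
```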
3.3. instantiating the multi-agent system

we represent e as a discrete occupancy grid of l×w square cells. each cell has 4 possible states: unknown (not observed), occupied (walls, objects), animated (humans, robots) and free (empty). the state transitions are illustrated in fig. 2. in this grid representation, r becomes the set of cells animated by the robots, and ri describes the position of one robot on the grid. the observation area of each robot is within a limited circle. an environment, a robot and a human are represented in fig. 3a. the robot is located on cell r1 at (1, 1) and the human on cell h1 at (1, 2). the maximum field of view of r1 is within the dashed arc in fig. 3b. $o_1^1$ consists of 7 cells: 3 are free, 2 are occupied and 2 are animated. the explored environment $\theta_1^{0:1}$ is limited to this first observation. we have provided an instance of a multi-agent system for exploration in populated environments: the environment is represented with a discrete occupancy grid, the agents are characterised by their identifier and their coordinates on the grid, and the observations are made by casting rays within the viewing range of a robot. our study is based on this representation of a multi-agent system. in the next section, we present the frontier/interaction exploration approach.

figure 4. multi-robot exploration as a task allocation problem.

4. mixed exploration approach by frontiers and interactions

first, let us consider the mre problem defined as a target allocation problem of robots in an unknown environment [12–14]. a solution to the mre problem defines a way to explore an unknown space, i.e., how to assign the robots from r to tasks/targets from t. to achieve this, we can look for an assignment matrix $A_{RT}$ that optimises the cost matrix $C_{RT}$ (cf. fig. 4).

4.1. various approaches for multi-robot exploration

we show how different sets of targets define the classical frontier-based exploration, our new interactive approach, and the mixed (frontier/interaction) approach.

4.1.1. frontier-based exploration

a frontier is the observed boundary between the explored space and the unexplored space [11]. classical frontier-based exploration is defined by choosing the targets from the set of frontiers f (3). let $c_{r_i f_j}$ be the cost for $r_i$ to reach $f_j$, and
$$a_{r_i f_j} = \begin{cases} 1 & \text{if } r_i \text{ must go to } f_j, \\ 0 & \text{otherwise.} \end{cases} \quad (3)$$
in populated environments this approach can fail when the path to a chosen frontier is congested by humans.

4.1.2. interactive exploration

human-robot interaction is defined as the reciprocal influence between a human and a robot, followed by one or more effects. we introduce an interactive approach that takes into account the presence of humans for establishing human-robot interactions (opening a door, guiding through a crowd, etc.). the targets are now chosen from the set of humans h (4). let $c_{r_i h_j}$ be the cost for $r_i$ to interact with $h_j$, and
$$a_{r_i h_j} = \begin{cases} 1 & \text{if } r_i \text{ must interact with } h_j, \\ 0 & \text{otherwise.} \end{cases} \quad (4)$$
a purely interactive approach can be inefficient in sparsely populated environments. indeed, without any perception of human presence, the robots adopt a wait-and-see policy and pause the exploration.
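a minimal python/numpy sketch of the assignment rule in (3)–(4), under our own reading: each robot independently picks the minimum-cost target from a cost matrix (this is the "local greedy" method evaluated in section 5).

```python
import numpy as np

def local_greedy_assignment(cost):
    """eqs. (3)-(4): each robot is assigned to its minimum-cost target;
    the result is a binary assignment matrix a."""
    assignment = np.zeros_like(cost, dtype=int)
    assignment[np.arange(cost.shape[0]), np.argmin(cost, axis=1)] = 1
    return assignment

# 2 robots x 3 frontiers, costs = normalised robot-frontier distances
c_rf = np.array([[0.2, 0.7, 0.9],
                 [0.6, 0.3, 0.8]])
print(local_greedy_assignment(c_rf))
```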
let: crigj be the mixed cost for ri to gj, arigj = { 1 if ri is assigned to gj, 0 otherwise. (5) this approach requires to smartly adjust interaction and frontier assignments to overcome the two above-mentioned issues (wait-and-see policy and the congested frontier). 4.2. mixed cost model in this study, robots can interact only by following pedestrians. the optimisation criterion is to explore a possibly populated environment with minimum distance and time. thus, we define mixed costs using distances and weighted penalties, see fig. 5. the weight σ balances interaction and frontier penalties. first we detail distances, then we introduce penalties and explain the different weights used in the cost formula. 4.2.1. distance first, we incorporate distances between robots and targets as immediate costs (fig. 5a). thus, we initialise crg with normalised robot-frontier and robot-human distances (drf, drh) in (6). drg = (drf | drh), drx = dr1x1 dr1x|x| drnx1 drnx|x|     . (6) distance costs have multiple drawbacks. two examples follow: if a robot travels towards a frontier but a crowd hinders its navigation: the robot cannot adapt the exploration depending on navigation feasibility. remote but reachable frontiers are not reevaluated as good options. the distance cost is prohibitive and the next target is always chosen between the last frontier and close humans who are nearby. a solution is to crg α · drg + (1 −α) · (prf | 0) α · drg + (1 −α) · (0 | prh) p r g d r g α σ figure 6. crg according to α and σ. use a planned distance, which is set to infinity when a target is momentarily unreachable. if a robot follows a pedestrian walking nearby but the person stops to discuss with other people: the robot cannot decide either to maintain or stop the current interaction depending on the human activity. due to distances, the robot will resume exploration only if one person moves again. this also causes a growing unease for the people. a solution is to make an a priori evaluation of an interaction and to update the evaluation of the interaction a posteriori, while it is taking place. 4.2.2. penalty we tackle these two drawbacks with a heuristic that associates penalties to each robot-frontier/human pair (fig. 5b). a penalty prixj is defined as the sum of a time penalty and an orientation penalty. the time penalty trixj is the time elapsed since a frontier discovery or a human remains idle. the orientation penalty orixj is the smallest unsigned angle between the orientation of a robot and the orientation of a frontier/human (a frontier is oriented towards the unknown). thus, we define prg with normalised robot-frontier and robot-human penalties (prf, prh) in (7). prg = ( σ · prf ∣∣ (1 −σ) · prh), σ ∈ [0, 1], prx = pr1x1 pr1x|x| prnx1 prnx|x|     , prixj = trixj + orixj . (7) parameter σ sets more or less weight on the frontier penalties or on the interaction penalties. when this parameter is high, it increases the frontier costs and decreases the interaction costs. this results in favouring interactions over frontiers. 4.2.3. distance and penalty the mixed cost matrix crg which incorporates distances drg and penalties prg is represented in (8). crg = α · drg + (1 −α) · prg, α ∈ [0, 1]. (8) 157 n. kalde, o. simonin, f. charpillet acta polytechnica (a) empty (100 m2) (b) cave (144 m2) (c) structured (242 m2) figure 7. environments. 
figure 8. local greedy, not dense: empty (top); unstructured (middle); structured (bottom).

the parameter α modulates the immediate distance cost and the information coming from the penalty heuristic. when α is high (resp. low), the importance of the penalties is reduced (resp. increased). the distances and penalties are counterbalanced with α, while σ sets more or less focus on frontiers or on interactions. we present the influence of α and σ on the cost formula in fig. 6. the values range from 0 to 1 for each parameter, and the formula written on each side is obtained when one parameter is set to its extreme value. we have adopted a mixed approach, and have defined a parametric cost matrix based on a penalty heuristic. now, we evaluate the exploration performance of this heuristic for two greedy optimisation methods, assuming different values of α and σ.

5. experimental framework

we use the v-rep robotic simulator for our experiments [19]. the environment is discretised with 0.5 m square cells. the robots share their exploration map, so the frontiers that are discovered are known by every robot. contiguous frontier cells are grouped together into a frontier area. inside a frontier area, the targeted cell minimises the sum of distances to the other cells. the assignments are computed locally by each robot. to optimise its assignment, a robot takes into account the entire set of frontiers known so far, but only the robots and pedestrians perceived locally (within a 2 m radius). planning is done using a potential field propagated on the grid.

5.1. protocol

the parameters are as follows:
• map: three environments are considered. the first contains no obstacle (empty), the second has unstructured obstacles (unstructured) and the third is composed of a corridor and three rooms (structured). the maps are shown in fig. 7.
• population density: the environments are populated with 0 or 30 % occupation. each human agent moves in a straight line and avoids obstacles by stopping and rotating.
• number of robots: two explorers are used for each experiment; they are represented as cylinders.
• optimisation method: we use two different cost optimisation methods. the first one is a local greedy method, where each robot chooses the minimum-cost target among only its own possible targets, as in [11] for distances. the second one is a group greedy method, where, for all locally visible robots at each time step, the minimum-cost robot-target assignment is recursively fixed and removed from consideration until the local robot is assigned (a sketch of this rule follows the list).
• modulators: α and σ are discretised from 0 to 1 with a step of 0.25.

figure 7. environments: (a) empty (100 m²); (b) cave (144 m²); (c) structured (242 m²).
figure 9. group greedy, not dense: empty (top); unstructured (middle); structured (bottom).
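a minimal python/numpy sketch of the group greedy rule referenced in the optimisation-method item above; this is our reading of the text, not the authors' code.

```python
import numpy as np

def group_greedy_assignment(cost):
    """repeatedly fix the globally cheapest robot-target pair among the
    remaining robots/targets until every robot is assigned."""
    cost = cost.astype(float).copy()
    assignment = {}
    while len(assignment) < min(cost.shape):
        r, t = np.unravel_index(np.argmin(cost), cost.shape)
        assignment[r] = t
        cost[r, :] = np.inf   # robot r is now assigned
        cost[:, t] = np.inf   # target t is now taken
    return assignment

c = np.array([[0.2, 0.7, 0.9],
              [0.3, 0.4, 0.8]])
print(group_greedy_assignment(c))   # {0: 0, 1: 1}
```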
5.2. metrics

each scenario is evaluated with the classical mre metrics: coverage, distance and time. in addition, we use a common metric in hri, called the robotic attention demand (rad), which measures the autonomy of a robot during its task [20, 21]. here we consider the number of interactions initiated during the exploration.

5.3. results

first, let us consider the environments without humans. we study the influence of α by fixing σ to 1. this allows us to adjust only the distance and the frontier penalty, which is legitimate, since no humans implies no interaction penalty. the performances averaged over 10 runs are plotted in fig. 8 for local greedy, and in fig. 9 for group greedy. for local greedy (fig. 8), regarding the empty and structured maps, we distinguish two steps: in the first step, distance and time increase, and in the second step they both decrease until α = 1. the unstructured map for local greedy, and all maps for group greedy (fig. 9), present only one step, where distance and time decrease when increasing α. in these cases, when α is high, the penalties fade and the robots make fewer round trips between remote frontiers in the scene. thus, in non-populated environments, our heuristic does not give better performances.

now we consider the presence of pedestrians. the maps are populated at 30 %, up to 1 human/m², thus enabling human-robot interactions. figs. 10 and 11 give the mean performances of local greedy and group greedy, respectively, for 10 runs of each (α, σ) combination. when σ increases, the penalty of the interactions is reduced, favouring interactions to the detriment of frontiers. for the empty case (fig. 10), full coverage with the shortest distance and time is at (α, σ) = (0, 0). only penalties are used, and frontiers are preferred over interactions. no interaction was initiated (rad), but an average of 28 frontiers were assigned. in the unstructured case, the best average performance is at (0.75, 0). distances are overweighted compared to penalties, and interactions are heavily penalised. nevertheless, an average of 18 interactions (rad) were initiated, against 26 frontier assignments.

figure 10. local greedy, dense: empty (top); unstructured (middle); structured (bottom).

in the structured environment, the best performances and lowest standard deviations are at (0.5, 0). distances and penalties are equally weighted, and again interactions are penalised. an average of 31 frontier assignments and 18 interaction assignments took place. here, interactions are interesting, because a robot can discover the corridor by following someone. for group greedy (fig. 11), the best performance is located at (0.25, 0) for the empty scene. penalties have more weight than distances; frontiers are preferred over interactions. an average of 16 frontier assignments and 7 interaction assignments is noticed. the unstructured environment has maximum coverage, with minimum travelled distance and time, at (0.5, 0). the average number of frontiers assigned is 8 times the number of interactions (only 4). for the last map, the best average performance is at (0.25, 0), for a frontier/interaction ratio of 29/4. with these new results, the distance alone does not suffice for choosing the best targets in the presence of humans.
instead, a smart equilibrium with our penalty heuristic always gives the best performance (α ≠ 1). here, with σ = 0, the frontiers were chosen considering only distances, but the interactions were chosen carefully by adding heavy penalties. thus our heuristic is already sufficient for selecting interactions only when necessary, but it is not yet able to promote them.

6. conclusion

in this paper, we have defined interaction-based exploration by targeting the humans perceived by the robots. interactive exploration paves the way for exploiting natural human heuristics, for a better understanding of the dynamics of a populated environment. the mixed approach, based upon frontier and interactive exploration, aims at bringing out the best of both approaches. for this purpose, we designed a parametric heuristic to balance frontier and interaction (pedestrian following) assignments. this heuristic considers penalties for the idle state of the targets (frontier, human) and their orientation. we have shown in simulation that, in some cases, incorporating an interactive aspect into exploration can be beneficial, even with this simplistic heuristic. to enable efficient dynamic exploration, it is therefore paramount to discover these particular cases. in this sense, machine learning and online tuning of the weights might be of interest for achieving a robotic heuristic adaptation. this work opens up prospects for exploiting human adaptiveness in robotic exploration of populated environments.

figure 11. group greedy, dense: empty (top); unstructured (middle); structured (bottom).

references

[1] w. burgard, a. b. cremers, d. fox, et al. experiences with an interactive museum tour-guide robot. artificial intelligence 114(1):3–55, 1999.
[2] k. kosuge, y. hirata. human-robot interaction. in proceedings of the ieee international conference on robotics and biomimetics, pp. 8–11. 2004.
[3] r. c. arkin, m. fujita, t. takagi, r. hasegawa. an ethological and emotional basis for human-robot interaction. robotics and autonomous systems 42(3):191–201, 2003.
[4] m. p. michalowski, s. sabanovic, h. kozima. a dancing robot for rhythmic social interaction. in proceedings of the acm/ieee international conference on human-robot interaction. 2007.
[5] m. bennewitz, w. burgard, g. cielniak, s. thrun. learning motion patterns of people for compliant robot motion. the international journal of robotics research 24(1):31–48, 2005.
[6] a. dubois, a. dib, f. charpillet. using hmms for discriminating mobile from static objects in a 3d occupancy grid. in proceedings of the 23rd ieee international conference on tools with artificial intelligence (ictai). 2011.
[7] d. baronov, j. baillieul. reactive exploration through following isolines in a potential field. in proceedings of the american control conference. 2007.
[8] r. morlok, m. gini. dispersing robots in an unknown environment. in distributed autonomous robotic systems, pp. 253–262. 2007.
[9] m. andries, f. charpillet. multi-robot exploration of unknown environments with identification of exploration completion and post-exploration rendez-vous using ant algorithms. in proceedings of the ieee/rsj international conference on intelligent robots and systems. 2013.
[10] e. ferranti, n. trigoni, m. levene. brick&mortar: an on-line multi-agent exploration algorithm. in proceedings of the ieee international conference on robotics and automation. 2007.
[11] b. yamauchi. a frontier-based approach for autonomous exploration. in proceedings of the ieee international symposium on computational intelligence in robotics and automation. 1997.
[12] j. faigl, m. kulich, l. preucil. goal assignment using distance cost in multi-robot exploration. in proceedings of the ieee/rsj international conference on intelligent robots and systems. 2012.
[13] a. bautin, o. simonin, f. charpillet. minpos: a novel frontier allocation algorithm for multi-robot exploration. in proceedings of the 5th international conference on intelligent robotics and applications. 2012.
[14] w. burgard, m. moors, c. stachniss, f. e. schneider. coordinated multi-robot exploration. robotics, ieee transactions on 21(3):376–386, 2005.
[15] l. macedo, a. cardoso. exploration of unknown environments with motivational agents. in proceedings of the third international joint conference on autonomous agents and multiagent systems. 2004.
[16] m. a. goodrich, a. c. schultz. human-robot interaction: a survey. foundations and trends in human-computer interaction, 2007.
[17] h. asoh, y. motomura, f. asano, et al. jijo-2: an office robot that communicates and learns. ieee intelligent systems 16(5):46–55, 2001.
[18] h. takeda, n. kobayashi, y. matsubara, t. nishida. towards ubiquitous human-robot interaction. in proceedings of the working notes for the ijcai workshop on intelligent multimodal systems. 1997.
[19] e. rohmer, s. p. singh, m. freese. v-rep: a versatile and scalable robot simulation framework. in proceedings of the ieee/rsj international conference on intelligent robots and systems. 2013.
[20] d. r. olsen, m. a. goodrich. metrics for evaluating human-robot interactions. in proceedings of permis. 2003.
[21] a. steinfeld, t. fong, d. kaber, et al. common metrics for human-robot interaction. in proceedings of the 1st acm sigchi/sigart conference on human-robot interaction. 2006.

acta polytechnica vol. 41 no. 2/2001

effect of thermal cycling on the strength and texture of concrete for nuclear safety structures

š. hošková, o. kapičková, k. trtík, f. vodák

the effect of thermal cycling (freezing and thawing) on the texture and strength of two types of concrete is studied: 1. concrete used for a containment structure at npp temelín (czech republic) – so-called temelín concrete. 2. highly resistant penly concrete, which was used as a standard because of its high quality, proved by the research carried out in a european commission project. the results for the two samples of concrete are compared.

keywords: concrete, thermal cycling, texture, strength, porosity.

1 introduction

the mechanical and physical properties of concrete deteriorate during the ageing process. climatic conditions play a major role in this process [1]. the texture of concrete is not constant in time. when it is in a water-saturated condition and subjected to freezing and thawing cycles, concrete is susceptible to damage caused by the hydraulic pressure generated in the capillary cavities of the cement paste as the water freezes. the compressive strength of concrete is therefore ultimately determined by the modifications of its structure. the changes in the strength of concrete under the impact of the cycles are studied in this article. changes in the porous cement paste texture are also investigated.
2 material composition

all studies were carried out on two concrete types: concrete from the czech republic, used for a containment structure at npp temelín (t), and high-resistance penly concrete (p) (france). the composition of the concrete mixtures for the production of 1 m³ of fresh concrete is shown in table 1. the chemical composition of the cements (% by weight) is shown in table 2. both types of concrete are formed on the basis of portland cement and have approximately the same water-cement ratio (penly – 0.43 and temelín – 0.45). these are the only points of similarity: temelín concrete belongs to class b 40, while penly concrete is prepared as so-called high-resistance concrete (its uniaxial compression strength is higher than 60 mpa) [3].

table 1: composition of the concrete mixtures for production of 1 m³ of fresh concrete

component                       penly               temelín
cement                          290 kg (cpa)        499 kg (42.5r)
water                           131 kg              215 kg
silica fume (anglefort)         30 kg               –
calcareous filler (piketty)     105 kg              –
superplasticizer (resino ct)    10.62 kg            –
retarder (chrytard)             1.7 kg              –
plasticizer: ligoplast sf       –                   4.9 kg
aggregates                      river gravel:       crushed agg.:
                                831 kg 0–5 mm       710 kg 0–4 mm
                                crushed agg.:       460 kg 8–16 mm
                                287 kg 5–12.5 mm    530 kg 16–22 mm
                                752 kg 12.5–25 mm

table 2: quantitative composition of cements (% by weight)

sample                          cp 52.5 france      cem i 42.5 mokra
phase – clinker composition:
c3s                             53.7                68.5
c2s                             26.7                11.6
c3a                             4.4                 7.4
c4af                            14.8                11.5
cfree                           0.4                 1.0
total                           100.0               100.0
c3seq                           55.4                72.7
c2seq                           25.4                8.4
components – fraction in cement:
clinker                         97.0                95.0
gypsum                          2.9                 3.5
fly ash                         0.1                 1.3
slag                            trace               0.2
total                           100.0               100.1
components – fraction without gypsum:
clinker                         99.9                98.4
fly ash                         0.1                 1.4
slag                            trace               0.2
total                           100.0               100.0

3 experimental details

an artificial ageing process was simulated by thermal cycling. the samples were completely saturated with water and were cooled down to a temperature of –20 °c. they were then warmed up to +25 °c in a heraeus hc 4020 conditioning chamber. the total time of one temperature cycle was 6 hours. a relative humidity of 95 % was maintained. the test of the compressive strength of the concrete was performed on the second fragment of the original beam. steel square plates of [100 × 100] mm, whose edges fitted the face of the original beam, were placed on the body of the specimen. the test was performed in accordance with czech national standard čsn 73 13 18. the pore- and crack-distribution curve of the cement paste and the total specific volume of the pores were determined by mercury porosimetry. the porosimetric measurements were carried out at the institute of chemical process fundamentals of the academy of sciences of the czech republic. table 3 shows the results of the measurements of the compressive strength and of the volume of pores in dependence on the number of cycles for penly and temelín concrete.
a graphical depiction is presented in figs. 1 and 2. the compressive strength and the specific volume are presented there in percentages, where the value of 100 % corresponds to the non-cycled sample (age 28 days). table 4 relates the age of the samples to the number of climatic cycles.

table 3: strength and volume of pores in dependence on the number of cycles

           number of cycles   strength [mpa]   specific volume of pores [mm³/g]
penly      0                  64.4             35.1
           100                74.4             33.9
           200                65.2             37.7
           300                53.8             39.2
temelín    0                  52.3             38.5
           100                57.7             31.0
           200                56.5             49.9
           400                47.9             57.2

table 4: age of samples and the number of climatic cycles

number of cycles    age of concrete (days)
                    penly      temelín
0                   28         28
100                 84         86
200                 184        186
300                 281        –
400                 –          304

fig. 1: dependence of the relative strength on the number of cycles
fig. 2: dependence of the relative specific volume of pores on the number of cycles

4 discussion

our experimental results in figs. 1 and 2 show that the behaviour of both types of concrete (temelín, penly) is analogous in the investigated parameters. fig. 1 also shows that the changes in compressive strength during thermal cycling are lower in the case of temelín. the compressive strength decreases by about 8.4 % of the initial value in the case of temelín (after 400 cycles), and by about 16.5 % in the case of penly, already after 300 cycles. additional loading above 300 cycles results in the destruction of the penly samples. both of the dependencies (figs. 1, 2) present extremes after about 100 climatic cycles: maximum values of the compressive strength (fig. 1) and minimum values of the specific volume of pores (fig. 2). it is a known fact that in solids there is an inverse relationship between porosity and strength [2], [3]. it should be noted that the measurement of the compressive strength was carried out on samples of concrete, but the specific volume of pores was studied on samples of cement paste. however, in concrete the porosity of the cement paste usually determines the strength characteristic. obviously, our results are in accordance with the assumed theoretical dependence between porosity and strength. after about 100 climatic cycles the strength of both materials increased and the volume of pores decreased. this result probably relates to the fact that during the first cycles the porous structure of the samples "matured": under the ideal moisture conditions in the climatic chamber, hydration continues and causes a decrease in the volume of the pores and a refining of the porous structure. this process evidently prevails over the cracking of the porous structure as a consequence of the temperature cycling. the change in the development of the porous structure causes the corresponding behaviour of the strength of the two samples (temelín, penly).
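as a small worked example of the normalisation used in figs. 1 and 2, here is a python sketch of ours applied to the penly rows of table 3:

```python
# table 3, penly rows: express strength and pore volume as a percentage
# of the non-cycled (28-day) sample, as plotted in figs. 1 and 2.
cycles = [0, 100, 200, 300]
strength = [64.4, 74.4, 65.2, 53.8]      # [mpa]
pore_volume = [35.1, 33.9, 37.7, 39.2]   # [mm^3/g]

rel_strength = [100.0 * s / strength[0] for s in strength]
rel_pores = [100.0 * v / pore_volume[0] for v in pore_volume]
print([f"{r:.1f} %" for r in rel_strength])  # 100.0, 115.5, 101.2, 83.5
```

the last value, 83.5 %, corresponds to the ~16.5 % strength decrease after 300 cycles quoted in the discussion.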
acknowledgements

this work was supported by the mšmt čr (contract no. j04-098:210000004) and the grant agency of the czech republic (grant no. 103/99/0248).

references

[1] pachner, j. et al.: concrete containment buildings. iaea – tecdoc – 1025, vienna, iaea, 1998
[2] mehta, p. k. – monteiro: concrete. new jersey, prentice hall, 1993
[3] schneider, u.: behaviour of concrete at high temperatures. deutscher ausschuss für stahlbeton, berlin, verlag ernst & sohn, 1982

rndr. šárka hošková
department of physics
phone: +420 2 2435 4695
e-mail: hoskova@fsv.cvut.cz

rndr. olga kapičková, csc.
department of physics
phone: +420 2 2435 4696
e-mail: kapickov@fsv.cvut.cz

doc. ing. karel trtík, csc.
department of concrete structures and bridges
phone: +420 2 2435 4626
e-mail: trtik@fsv.cvut.cz

prof. františek vodák, drsc.
department of physics
phone: +420 2 2435 3886
e-mail: vodak@fsv.cvut.cz

czech technical university in prague
faculty of civil engineering
thákurova 7, 166 29 praha 6, czech republic
acta polytechnica doi:10.14311/ap.2015.55.0267 acta polytechnica 55(4):267–274, 2015 © czech technical university in prague, 2015 available online at http://ojs.cvut.cz/ojs/index.php/ap

cfd simulation of the heat transfer process in a chevron plate heat exchanger using the sst turbulence model

jan skočilas a,∗, ievgen palaziuk b

a czech technical university in prague, faculty of mechanical engineering, department of process engineering, technicka 4, 166 07 prague 6, czech republic
b national aerospace university, "kharkiv aviation institute" named after n. ye. zhukovsky, faculty of airplane building, department of aircraft manufacturing technology, chkalova 17, 61070, kharkiv, ukraine
∗ corresponding author: jan.skocilas@fs.cvut.cz

abstract. this paper deals with a computational fluid dynamics (cfd) simulation of the heat transfer process during turbulent hot water flow between two chevron plates in a plate heat exchanger. a three-dimensional model with the simplified geometry of two cross-corrugated channels provided by the chevron plates, taking into account the inlet and outlet ports, has been designed for the numerical study.
the numerical model was based on the shear-stress transport (sst) k-ω model. the basic characteristics of the heat exchanger, such as the values of the heat transfer coefficient and the pressure drop, have been investigated. a comparative analysis of the analytical calculation results, based on experimental data obtained from the literature, and of the results obtained by the numerical simulation, has been carried out. the coefficients and the exponents in the design equations for the considered plates have been arranged by using the simulation results. the influence of the corrugation inclination angle relative to the flow direction on the main flow parameters has been taken into account. an analysis of the temperature distribution across the plates has been carried out, and it has shown the presence of zones with higher heat losses and low fluid flow intensity.

keywords: cfd; simulation; ansys cfx; ansys cfd-post; heat transfer; plate heat exchanger; chevron corrugated plates; pressure drop; heat transfer coefficient.

1. introduction

heat exchangers play an important role in heat power engineering, as well as in the food, beverage, chemical, pharmaceutical, oil refining and other industries. several main types of indirect heat exchangers are available: plate, shell and tube, spiral, etc. in most cases the plate type is the most efficient heat exchanger, because it offers the best solution to thermal problems, having the widest pressure and temperature limits within the constraints of current equipment. plate heat exchangers have many benefits: they are more thermally efficient, occupy less space, are lighter and do not need to be cleaned as often as shell and tube heat exchangers. corrugated plates merged together create a cavity which enhances the turbulence of the liquid flow in order to maximize the heat transfer in the exchanger. a high degree of turbulence can be obtained at low flow rates, and a high heat transfer coefficient can then be reached.

in order to increase the heat transfer in plate-type heat exchangers using a passive method, rectangular fins can be located on the plates, so that the flow path of the fluid and the surface area of the plates are increased [1]. optimal performance is achieved for less obtuse (i.e., sharper) corrugations as the channel plates come closer to each other, while the plate heat exchanger performance can be improved for lower values of the channel aspect ratio (i.e., wider channels) and for higher values of the corrugation angle as the reynolds number increases [2]. with the constraints of a fixed plate surface geometry and constant pumping power, the heat transfer can be enhanced up to 2.8 times compared to that in a flat-plate channel [3]. comparing the tested 60-degree chevron plate with 30-degree and 45-degree plates shows a higher nusselt number in the case of the 60-degree plate over a wide reynolds number range [4]. the heat transfer coefficient and the pressure drop increase proportionally to the mass flow rate and inversely to the chevron angle [5].

a calculation of the various heat exchanger operating modes is required for a wide range of tasks, which include: determination of the thermal power and coolant flow rate in the absence of flow meters, prediction of the required physical parameters and flow velocities of the coolant, formulation of a rationale for the choice of pipeline system equipment, diagnostics of the heating surfaces, etc. to achieve the objective of improving the efficiency of heat exchangers, it is necessary to apply modern approaches to their design. the main requirements placed on an engineer analysing, designing or optimizing existing heat exchange equipment are to reduce the time required to solve the assigned problem and to minimize the problem-solving costs. when the required equipment is lacking, or when carrying out experiments is too expensive, a quick and cheap alternative is to make a simulation of the considered processes.

a key advantage of cfd is that it is a very compelling, non-intrusive, virtual modelling technique with powerful visualization options, which allows engineers to evaluate the performance of a wide range of technical system configurations on a computer, without the time, expenses and disruptions required to make actual changes. cfd simulation largely saves cost and turnaround time and makes it possible to obtain reliable results, and this method is therefore widely used in the process of designing and upgrading plate heat exchangers. the quality of the solutions obtained from cfd simulations is largely within the acceptable range, which proves that this method is an effective tool for predicting the behavior and performance of a wide variety of heat exchangers [6]. the computations enable us to point out the regions on the plates where insufficient flow can result in problems with their cleaning [7]. a simulation clearly demonstrates that non-uniform distribution of the fluid is a major factor affecting the performance of plate heat exchangers, and that it can be weakened by a rolling guide area in the corrugated plate [8]. investigations by cfd can be helpful for determining the hydrodynamic characteristics and the flow distribution in two cross-corrugated channels [9]. cfd results help to identify the features of the flow field in detail, and they can explain the distribution of the convection coefficients [10]. the temperature distribution obtained by a simulation can be helpful for selecting the plate material [11]. cfd results provide detailed information about the fluid velocities near the wall in the thermal boundary layer, and this information enhances the accuracy of the calculation of the heat transfer coefficient and the pressure drop values [12]. 2d calculations show the influence of the corrugation shape on the performance of plate heat exchangers, but 3d calculations are necessary in order to assess the importance of the corrugation orientation [13].

this study deals with a numerical simulation of the heat transfer process during turbulent hot water flow between two chevron corrugated plates in a plate heat exchanger. a simplified heat exchanger geometry is used, and an effective methodology for the design and calculation of heat exchangers is one of the aims. the main objective of the paper is to determine the performance characteristics of the heat exchanger, e.g., the values of the heat transfer coefficient and the pressure drop, by carrying out investigations of the main flow parameters for different geometrical configurations.

2. theory

methods for intensifying heat transfer are usually associated with an increase in hydraulic resistance: improving the heat transfer efficiency leads to an increase in the hydraulic resistance.
The main requirements placed on an engineer specializing in analysing, designing or optimizing existing heat exchange equipment are to reduce the time required to solve the assigned problem and to minimize the problem-solving costs. When the required equipment is not available, or when carrying out experiments is too expensive, a quick and cheap alternative is to simulate the processes under consideration. A key advantage of CFD is that it is a very compelling, non-intrusive, virtual modelling technique with powerful visualization options, which allows engineers to evaluate the performance of a wide range of technical system configurations on a computer without the time, expense, and disruption required to make actual changes. CFD simulation largely saves cost and turnaround time and makes it possible to obtain reliable results; this method is therefore widely used in the design and upgrading of plate heat exchangers. The quality of the solutions obtained from CFD simulations is largely within the acceptable range, which proves that this method is an effective tool for predicting the behavior and performance of a wide variety of heat exchangers [6]. The computations make it possible to point out the regions on the plates where insufficient flow can result in problems with their cleaning [7]. Simulation clearly demonstrates that the non-uniform distribution of the fluid is a major factor affecting the performance of plate heat exchangers; this effect can be weakened by rolling a guide area in the corrugated board [8]. CFD investigations can be helpful for determining the hydrodynamic characteristics and the flow distribution in two cross-corrugated channels [9]. CFD results help to identify features of the flow field in detail, and they can explain the distribution of convection coefficients [10]. The temperature distribution obtained by simulation can be helpful for selecting the plate material [11]. CFD results provide detailed information about the fluid velocities near the wall in the thermal boundary layer, and this information enhances the accuracy of the calculated heat transfer coefficient and pressure drop values [12]. 2D calculations show the influence of the corrugation shape on the performance of plate heat exchangers, but 3D calculations are necessary in order to assess the importance of the corrugation orientation [13].

This study deals with a numerical simulation of the heat transfer process during turbulent hot water flow between two chevron corrugated plates in a plate heat exchanger. A simplified heat exchanger geometry is used, and one of the aims is an effective methodology for the design and calculation of heat exchangers. The main objective of the paper is to determine the performance characteristics of the heat exchanger, i.e., the values of the heat transfer coefficient and the pressure drop, by investigating the main flow parameters for different geometrical configurations.

2. Theory

Methods focused on the intensification of heat transfer are usually associated with an increase in hydraulic resistance: improving the heat transfer efficiency leads to higher hydraulic losses.

Table 1. Coefficients and exponents used in the calculations [15, 16].

  C1     y      C2     n      m     x
  1.22   0.252  0.14   0.65   0.4   0.1
The important task is therefore to find heat transfer surface geometries which give the greatest value of the heat transfer coefficient at the lowest possible value of hydraulic resistance (i.e., of the power required to pump the coolant). The pressure drop is one of the main heat exchanger characteristics, and it is directly related to the size of the plate heat exchanger. The pressure drop value determines the pump power required for pumping the working fluid through the channels of the heat exchange apparatus. If it is possible to increase the allowable pressure drop, and incidentally accept higher pumping costs, then the heat exchanger will be smaller and less expensive. The performance of the many different corrugation forms can vary considerably, but the pressure loss in a plate heat exchanger can always be calculated from a Fanning friction factor type of equation [14]:

  \Delta p = \frac{2 f L G^2}{\rho d_e}   (1)

The general equation for predicting the friction factor has the following form:

  f = \frac{C_1}{\mathrm{Re}^{y}}   (2)

where the coefficient C1 and the exponent y are performance characteristics of the heat exchanger, which depend on the individual geometry features of the plates. The exponent y generally varies from 0.1 to 0.4 [15, 16]. The heat transfer performance of plate-type units can be calculated from typical dimensionless heat transfer equations, with the appropriate constants and exponents for each specific type of heat exchanger:

  \mathrm{Nu} = C_2\,\mathrm{Re}^{n}\,\mathrm{Pr}^{m}\left(\frac{\eta}{\eta_w}\right)^{x}   (3)

where Nu = α d_e/λ, d_e = 4S/P = 2b, Re = v d_e ρ/µ and Pr = c_p µ/λ. The typical values of the constant C2 and the exponents n, m, x depend on the individual characteristics of the plate surface geometry and of the coolant flow regime. In different cases they have the following values: C2 = 0.065–0.60, n = 0.3–0.75, m = 0.3–0.45, x = 0.05–0.2 [15, 16]. The coefficients and exponents used in the analytical calculations for the considered geometry of the heat exchanger plates, according to their dimensions, are shown in Table 1.

Figure 1. Basic geometrical parameters and dimensions of the heat exchanger plates.

Table 2. Dimension values of the considered heat exchanger plates.

  L        W       a      b     c     d     β
  1000 mm  400 mm  15 mm  2 mm  6 mm  1 mm  30°; 45°; 60°

Figure 2. Finite element mesh.

3. Numerical model geometry, grid and pre-processor set-up

The basic geometry of the investigated model was created in the solid modelling computer-aided design (CAD) software SolidWorks 2013. For study purposes, a simplified geometry of plates with orthogonal chevron-shaped corrugations (without rounding), available among the internet resources of the GrabCAD engineering portal [17], was considered. We used identical plates with mutually reversed orientation of the corrugations on adjacent plates. The basic geometrical parameters and dimensions of the plates are shown in Figure 1. The assembly of single plates into the package is performed by means of separating rubber gaskets. The influence of the corrugation inclination angle relative to the flow direction on the main flow parameters was taken into account. In order to investigate the influence of the corrugations on the flow parameters and the intensification of the heat transfer processes, flat plates were also considered. To simplify the numerical model and reduce computational costs, only the fluid flow domain was considered. The dimension values of the considered heat exchanger plates are shown in Table 2.
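As a quick plausibility check of Equations (1)–(3), the short sketch below evaluates the friction factor, the pressure drop and the Nusselt number for the Table 1 coefficients and the Table 2 geometry. The water properties and the mass velocity used here are illustrative assumptions, not values taken from the paper.

```python
# Evaluate Eqs. (1)-(3) with the Table 1 coefficients (minimal sketch).
C1, y = 1.22, 0.252
C2, n, m, x = 0.14, 0.65, 0.4, 0.1
rho, mu, lam, cp = 983.0, 4.67e-4, 0.654, 4185.0  # water at ~60 C (assumed)

L_plate = 1.0          # plate length between ports [m] (Table 2)
b = 2e-3               # corrugation depth [m]; d_e = 2b per Eq. (3) definitions
d_e = 2 * b

G = 250.0              # mass velocity [kg m^-2 s^-1], illustrative value
Re = G * d_e / mu      # since G = rho*v, Re = v*d_e*rho/mu = G*d_e/mu
Pr = cp * mu / lam

f = C1 / Re**y                                # Eq. (2)
dp = 2 * f * L_plate * G**2 / (rho * d_e)     # Eq. (1), Fanning form
Nu = C2 * Re**n * Pr**m * 1.0**x              # Eq. (3), eta/eta_w ~ 1 assumed
alpha = Nu * lam / d_e                        # heat transfer coefficient

print(f"Re = {Re:.0f}, f = {f:.3f}, dp = {dp:.0f} Pa, alpha = {alpha:.0f} W/m2K")
```

With these assumed inputs the script gives Re on the order of 2000, f of about 0.18 and α of a few thousand W/m²K, which is the expected order of magnitude for a plate heat exchanger channel.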
The finite element mesh was created by means of the ANSYS Meshing software, as this technology provides a means to balance all mesh requirements and obtain the right mesh for each simulation in the most automated way possible [18]. When creating the finite element mesh, tetrahedral elements were used (Figure 2). For a more accurate modeling of the near-wall flow in the boundary layer near the contact surfaces of the coolant and the plates, special layers of prismatic elements were constructed. It is worth noting that the sizes of these elements were set according to the requirements on the value of the non-dimensional distance from the wall to the first mesh node, y+, for the turbulence model used. The average total number of elements for the different cases amounted to 6·10^6.

Because the regime of coolant flow in heat exchangers is generally turbulent, the choice of an appropriate turbulence model is of great importance. In some cases the turbulence model can have a huge effect on the results obtained from CFD. The k–ε turbulence model has been most widely employed for heat exchanger design optimization using CFD software [6]. The simulations generally yield results in good agreement with experimental studies, ranging from 2 % to 10 %, but in some exceptional cases they vary up to 36 %. A comparison of different turbulence models for the numerical model design was carried out in [19]. The simulation results show that the k–ω and k–ε turbulence models give an overestimation of the maximum Nusselt number value (up to 20 %), whereas the calculation results obtained using the SST model differ from experimental data by no more than 5 %. The results of a numerical simulation of the heat transfer process during shock wave damping in an enclosed chamber after the detonation of a gas mixture were also taken into account [20]. The use of the SST turbulence model for the simulation of the heat transfer between hot combustion products and a solid body located in the chamber has shown good accuracy in the comparison of simulation results and experimental data. On this basis, within the framework of this task, the SST turbulence model was used.

The numerical solution for all considered cases was carried out in ANSYS CFX v15.0, which contains the wide array of advanced models and technology of a leading CFD software package [21]. With ANSYS CFX we are able to predict the impact of fluid behaviour on the basic characteristics of the plate heat exchanger. A varying intensity of the coolant flow was set by changing the mass flow rate value ṁ at the inlet, which directly affects the flow velocity and, as a consequence, the value of the Reynolds number. During the simulation the mass flow rate varied from 0.2 to 1.4 kg/s. In all cases the inlet temperature was equal to 60 °C, the pressure to 1 bar, and the value of the turbulence intensity to 5 %. The boundary condition at the area of the liquid stream in contact with the separating gasket was set as an adiabatic wall. A negative heat flux equal to −8000 W/m² was applied to the flow area that forms the contact surface of the coolant and the plates, through which heat transfer occurs.

Figure 3. Change in heat transfer coefficient and pressure drop with increasing Reynolds numbers for plates with β = 60°.

Figure 4. Change in heat transfer coefficient and pressure drop with increasing Reynolds numbers for plates with β = 45°.
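To relate the inlet settings above to the Reynolds numbers discussed in the results, the sketch below converts a prescribed mass flow rate into a mean channel velocity and Reynolds number. The free cross-section assumed for one corrugated channel is a rough estimate built from the Table 2 dimensions, not a value given in the paper.

```python
# Convert inlet mass flow rate to a channel Reynolds number (rough sketch).
rho, mu = 983.0, 4.67e-4      # water at ~60 C (assumed)
W, b = 0.4, 2e-3              # plate width and corrugation depth from Table 2
S = W * b                     # assumed free cross-section of one channel [m2]
d_e = 2 * b                   # hydraulic diameter, Eq. (3) definitions

for m_dot in (0.2, 0.6, 1.0, 1.4):    # range used in the simulations [kg/s]
    v = m_dot / (rho * S)             # mean velocity [m/s]
    Re = v * d_e * rho / mu
    print(f"m_dot = {m_dot:.1f} kg/s -> v = {v:.2f} m/s, Re = {Re:.0f}")
```

Under this cross-section assumption the prescribed 0.2–1.4 kg/s range maps to Reynolds numbers of roughly 2000–15000, i.e., firmly in the turbulent regime assumed by the turbulence model choice.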
The given value of the heat flux makes it possible to reach a temperature difference between the inlet and outlet ports of about 10–15 °C at low Reynolds numbers. The use of this boundary condition allowed simulating the process of heat transfer and significantly simplified our numerical model.

4. Simulation results and their discussion

To benefit from the prediction, post-processing is required to provide complete insight into the results of the fluid dynamics simulation. The analysis of our simulation results was carried out using the ANSYS CFD-Post v15.0 software. It is a powerful post-processor for ANSYS fluid dynamics products, which delivers everything needed to visualize and analyze the obtained results [22]. In order to evaluate the adequacy of the numerical model used, a comparative analysis of the simulation results and of the analytical calculation results was carried out. We compared the values of the heat transfer coefficient and the pressure drop. Figures 3–5 show charts of the change in heat transfer coefficient and pressure drop with increasing Reynolds number for plates with different values of the corrugation inclination angle β.

Figure 5. Change in heat transfer coefficient and pressure drop with increasing Reynolds numbers for plates with β = 30°.

In the case of plates with an angle β = 60° at high Reynolds numbers, the maximum deviation in the heat transfer coefficient determined by the simulation compared to the results of analytical calculations is 5.1 %, and the deviation in the pressure drop is 14 %. As the angle β decreases, the difference between the considered parameter values increases; in the case of β = 30° the deviation of the determined heat transfer coefficient is 11.5 %, and that of the determined pressure drop is 59.1 %. In the analytical calculations for the different β angles, we used the constant values of the coefficients and exponents shown in Table 1. As can be seen in the graphs, these values correspond almost fully to the case when plates with β = 60° are used (Figure 3). In fact, a decrease of the corrugation inclination angle β leads to a reduction of the hydraulic resistance and consequently reduces the efficiency of the heat transfer [1, 3], which explains the increasing deviation of the considered parameters between the analytical calculation results and the data obtained by numerical simulation (Figures 4 and 5). Using the simulation results, we can obtain more accurate values of the coefficients and exponents for the different corrugation inclination angles β of the considered plates by recalculating the values of the Fanning friction factor and of the Nusselt number (Equations 2 and 3). The results of the recalculation are shown in Table 3.

Table 3. Recalculated values of coefficients and exponents for different angles β.

  β [°]  C1    y      C2    n      m      x    δ(α) [%]  δ(Δp) [%]
  60     1.19  0.254  0.14  0.65   0.4    0.1  4.9       12.7
  45     1.06  0.28   0.14  0.645  0.395  0.1  2.4       6.8
  30     0.88  0.3    0.14  0.64   0.39   0.1  2.5       13.8

Using the obtained coefficients and exponents leads to a significant reduction of the maximum deviation δ of the simulation results from the analytical calculation results when determining the heat transfer coefficient change δ(α) and the pressure drop change δ(Δp). Figure 6, below, shows the temperature distribution of the hot water flow along the plate for different values of the corrugation inclination angle relative to the flow direction.
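Before turning to the temperature fields in Figure 6, the following sketch illustrates how coefficients such as C1 and y in Table 3 can be recovered from simulated operating points: Equation (2) is linear in log–log coordinates, so a least-squares line through (ln Re, ln f) pairs yields the constants directly. The sample (Re, f) data below are synthetic placeholders, not the paper's simulation output.

```python
import numpy as np

# Synthetic (Re, f) pairs standing in for CFD results at beta = 60 deg.
Re = np.array([2000., 4000., 8000., 12000., 16000.])
f = 1.19 / Re**0.254 * (1 + 0.01 * np.random.default_rng(0).standard_normal(5))

# Eq. (2): f = C1 / Re**y  =>  ln f = ln C1 - y * ln Re  (a straight line)
slope, intercept = np.polyfit(np.log(Re), np.log(f), 1)
C1, y = np.exp(intercept), -slope
print(f"C1 = {C1:.3f}, y = {y:.3f}")   # recovers ~1.19 and ~0.254
```

The same log–log fit applies to Equation (3) once the Prandtl-number and viscosity-ratio terms are divided out, which is presumably how the remaining Table 3 exponents can be refitted.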
In all cases, the initial and boundary conditions were identical. The inlet mass flow rate value ṁ was equal to 1.2 kg/s. An increase in the angle β leads to an increase of the Reynolds number and hence to greater flow turbulence, which in turn affects the heat transfer process. When the coolant flows downstream, the flow temperature gradually decreases due to the convective heat transfer process. The temperature distributions of the fluid are similar in all the considered cases: the maximum temperature appears around the upper inlet port, and the lowest temperature appears in some areas with the lowest flow intensity. From Figure 6 it can clearly be seen that with a decreasing corrugation inclination angle there is an increase in the flow zones with a lower temperature. The presence of these zones is caused by an increase in heat losses in some areas due to a reduction of the local heat transfer coefficient. This disadvantage can be eliminated by changing the geometry of the plate so as to exclude the existence of shadow zones. Near the sides of the plates, where there are no corrugations, there are smooth flow passages that reduce the heat transfer and the pressure drop. These areas of flow represent about 20 % of the total mass flow through the considered channels between the used plates.

Figure 6. Flow temperature distribution along the plates with different β.

Figure 7. Change in heat transfer coefficient and pressure drop with increasing Reynolds number for plates with different β and for the flat plates.

As mentioned above, an increase of the hydraulic resistance promotes an intensification of the heat transfer processes. However, the value of the pressure drop increases as well, which is clearly demonstrated in Figure 7. In order to demonstrate the advantages of using chevron corrugations, the case when smooth plates are used was also considered. The obtained results show that chevron plate surface corrugations promote higher heat transfer coefficients, but there is a higher pressure drop penalty as well. In the considered range of the Reynolds number, the used geometry of corrugations at an angle β = 60° enables a gain in the heat transfer coefficient of 1.23 times compared to a flat plate, while the pressure drop is 2.89 times higher at the maximum considered value of the mass flow rate ṁ = 1.4 kg/s. The low values of the heat transfer coefficient and the pressure drop can be explained by geometric features of the used plates: the orthogonal shape of the corrugations, the presence of smooth channels near the plate sides, the presence of a gap between the plates, and the presence of shadow zones with a low intensity of flow.

5. Conclusions

The proposed model of turbulent water coolant flow between two chevron corrugated plates in a plate heat exchanger can provide us with complex information about the basic flow parameters and the features of the heat transfer process. The adequacy of the numerical model was evaluated by a comparative analysis of simulation results and analytical calculation results based on experimental data. The maximum deviation of the obtained results is 4.9 % when determining the heat transfer coefficient and 13.8 % when determining the pressure drop. The general values of the parameters in Equation 3 for the evaluation of the Nusselt number are in good agreement with the simulated data.
However, the calculation of pressure drops strongly depends on the inclination angle, and the parameters of Equation 2 vary significantly with the Reynolds number and cannot be applied as constants for all inclination angles. This numerical model can be directly used to calculate the flow parameters for various characteristics of the heat exchanger. The model can also be used to predict the basic performance characteristics of the plate heat exchanger for a variety of plate geometrical configurations: the overall dimensions, the forms of the corrugations, the corrugation inclination angles, the location of the inlet and outlet pipes, etc. A great advantage of the used numerical model is that when there is a change in the geometry, the finite volume grid is reconstructed automatically, which gives us significant time savings. The simulation results show the advantages of using chevron corrugations compared to using smooth plates. Chevron plate surface corrugations promote higher heat transfer coefficients. However, there are higher pressure drop values due to an increase in the hydraulic resistance. Increasing the corrugation inclination angle leads to an increase in the heat transfer coefficient as well as an increase of the pressure drop. These results can help solve the problem of finding optimal heat transfer surface geometries, which have the greatest value of the heat transfer coefficient at the lowest possible value of hydraulic resistance. However, in spite of the advantages of this model, there remain several directions for further improvement. During the described investigations, we used constant physical properties of the water coolant, so our numerical model can be improved by setting the properties as functions of temperature. We should also consider the process of heat transfer from the hot water flow to the cold liquid through the physical separating wall, which would correspond to the real conditions, but would require more computing power. Moreover, the model results are not in good agreement with previously published data, which is evidently due to the following geometric features of the used plates: the orthogonal shape of the corrugations, the presence of smooth channels near the plate sides, the presence of a gap between the plates, and the presence of shadow zones with a low intensity of flow.

List of symbols
a — length of surface corrugation in a cross section [mm]
b — corrugation depth [mm]
c — maximum width of channel flow [mm]
d — plate thickness [mm]
c_p — specific heat [J kg⁻¹ K⁻¹]
d_e — hydraulic diameter [m]
f — Fanning friction factor [–]
G — mass velocity [kg m⁻² s⁻¹]
L — plate length between ports [m]
ṁ — mass flow rate [kg s⁻¹]
Nu — Nusselt number [–]
P — wetted perimeter of channel flow [m]
Pr — Prandtl number [–]
Re — Reynolds number [–]
S — cross sectional area of channel flow [m²]
t — temperature [K]
v — velocity [m s⁻¹]
W — plate width between ports [m]
y+ — dimensionless wall distance [–]
Δp — pressure drop [Pa]
α — heat transfer coefficient [W m⁻² K⁻¹]
β — corrugation inclination angle relative to flow direction [°]
η — bulk dynamic viscosity [Pa s]
η_w — wall dynamic viscosity [Pa s]
λ — thermal conductivity [W m⁻¹ K⁻¹]
µ — dynamic viscosity [Pa s]
ρ — density [kg m⁻³]

References
[1] Durmus A., Benli H., Kurtbas I., Gul H.: Investigation of heat transfer and pressure drop in plate heat exchangers having different surface profiles. International Journal of Heat and Mass Transfer, 52, 2009, pp. 1451–1457. ISSN 0017-9310.
[2] Kanaris A.G., Mouza A.A., Paras S.V.: Optimal design of a plate heat exchanger with undulated surfaces.
International Journal of Thermal Sciences, 48, 2009, pp. 1184–1195. ISSN 1290-0729.
[3] Muley A., Manglik R.M.: Experimental study of turbulent flow heat transfer and pressure drop in a plate heat exchanger with chevron plates. Journal of Heat Transfer, 121, 1999, pp. 110–117. ISSN 0022-1481.
[4] Naik V.R., Matawala V.K.: Experimental investigation of single phase chevron type gasket plate heat exchanger. International Journal of Engineering and Advanced Technology, 2, 2013, pp. 362–369. ISSN 2249-8958.
[5] Muthuraman S.: The characteristics of brazed plate heat exchangers with different chevron angles. Global Journal of Researches in Engineering, 11, 2011, pp. 10–26. ISSN 2249-4596.
[6] Aslam Bhutta M.M., Hayat N., Bashir M.H., Khan A.R., Ahmad K.N., Khan S.: CFD applications in various heat exchangers design: a review. Applied Thermal Engineering, 32, 2012, pp. 1–12. ISSN 1359-4311.
[7] Prepiorka-Stepuk J., Jakubowski M.: Numerical studies of fluid flow in flat, narrow-gap channels simulating plate heat exchanger. Chemical and Process Engineering, 34, 2013, pp. 507–514. ISSN 2300-1925.
[8] Han X.-H., Cui L.-Q., Chen G.-M., Wang Q.: A numerical and experimental study of chevron, corrugated-plate heat exchangers. International Communications in Heat and Mass Transfer, 37, 2010, pp. 1008–1014. ISSN 0735-1933.
[9] Tsai Y.-C., Liu F.-B., Shen P.-T.: Investigations of the pressure drop and flow distribution in a chevron-type plate heat exchanger. International Communications in Heat and Mass Transfer, 36, 2009, pp. 574–578. ISSN 0735-1933.
[10] Freund S., Kabelac S.: Investigation of local heat transfer coefficients in plate heat exchangers with temperature oscillation IR thermography and CFD. International Journal of Heat and Mass Transfer, 53, 2010, pp. 3764–3781. ISSN 0017-9310.
[11] Pandey S.D., Nema V.K.: Analysis of heat transfer, friction factor and exergy loss in plate heat exchanger using Fluent. Energy and Power, 1, 2011, pp. 6–13. ISSN 2163-1603.
[12] Wang Y.-Q., Dong Q.W., Liu M.S., Wang D.: Numerical study on plate-fin heat exchangers with plain fins and serrated fins at low Reynolds number. Chemical Engineering & Technology, 32, 2009, pp. 1219–1226. ISSN 1521-4125.
[13] Grijspeerdt K., Hazarika B., Vucinic D.: Application of computational fluid dynamics to model the hydrodynamics of plate heat exchangers for milk processing. Journal of Food Engineering, 57, 2003, pp. 237–242. ISSN 0260-8774.
[14] Schlunder E.U.: Heat Exchanger Design Handbook. International Centre for Heat and Mass Transfer, 1983, 2305 p. ISBN 3-1841-9081-1.
[15] Vedernikova M.I., Talankin V.S.: Calculation of Plate Heat Exchangers. Methodological guidelines for course and diploma projects. Ural State Forest Engineering University, Yekaterinburg, 2008, 29 p.
[16] Mamchenko V.O., Malyshev A.A.: Plate Heat Exchangers in Cryogenic Techniques and Biotechnological Processes. Study guide. Saint Petersburg National Research University of Information Technologies, Mechanics and Optics, Institute of Refrigeration and Biotechnology, Saint Petersburg, 2014, 117 p.
[17] https://grabcad.com [2015-08-01].
[18] ANSYS, Inc.: ANSYS Meshing User's Guide. Release 15.0, 2013, 492 p.
[19] Vieser W., Esch T., Menter F.: Heat transfer predictions using advanced two-equation turbulence models. CFX Validation Report No. CFX-VAL10/060211, 2002, 73 p.
[20] Plankovskyy S.I., Shypul O.V., Tryfonov O.V., Palaziuk Ie.S., Malashenko V.L.: The simulation of the heat transfer during shock waves damping in an enclosed chamber. Aerospace Equipment and Technology, 1 (108), 2014, pp. 104–109. ISSN 1727-7337.
[21] ANSYS, Inc.: ANSYS CFX-Pre User's Guide. Release 15.0, 2013, 386 p.
[22] ANSYS, Inc.: ANSYS CFD-Post User's Guide. Release 15.0, 2013, 376 p.

Acta Polytechnica 53(3):280–282, 2013
© Czech Technical University in Prague, 2013, available online at http://ctn.cvut.cz/ap/

GENERALIZED CONTINUITY EQUATION FOR QUASI-HERMITIAN HAMILTONIANS

Amine B. Hammou (*)
USTO–Mohamed Boudiaf, Département de Physique, BP 1505 El M'naouer, Oran, Algeria
(*) Corresponding author: hbamine@gmail.com

Abstract. The continuity relation is generalized to quasi-Hermitian one-dimensional Hamiltonians. As an application we show that the reflection and transmission coefficients computed with the generalized current obey the conventional unitarity relation for the continuous double delta function potential.

Keywords: complex scattering potential, metric operator, quasi-Hermitian, PT-symmetry.

1. Introduction

In [1], scattering from a discrete quasi-Hermitian delta function potential was studied and a generalized continuity relation was obtained in the physical Hilbert space H_phys. A generalized probability current density was defined. Using a quasi-Hermitian toy model of a discrete delta function potential, it was shown that the reflection and transmission coefficients computed using this current obey the unitarity relation. In this paper we will review the construction of the generalized continuity relation of [1] and then apply it to the continuous double delta function potential, for which a metric was obtained to first order in the imaginary part of the potential strength [2]. We will then compute the reflection and transmission coefficients in the physical Hilbert space H_phys and show that they indeed obey the unitarity relation. This result comes as a surprise since, although the potential is local, the metric is highly non-local, and so the existence of asymptotic states is questionable [3–5]. This is, as far as I know, the first example of a quasi-Hermitian Hamiltonian with a non-diagonal metric that possesses a conserved probability current density. Recently it has been found [6] that, for the non-Hermitian complex PT-symmetric version of the Scarf II potential, unitarity is obeyed for a particular choice of the parameters.

2. The continuity relation in H_phys

A Hamiltonian H that obeys

  H^\dagger = \eta H \eta^{-1}   (2.1)

is said to be quasi-Hermitian [7], where η is a positive definite metric operator. By an appropriate modification of the inner product on the Hilbert space, quasi-Hermitian operators can be made Hermitian [8]. The inner product of the new Hilbert space is given by ⟨·|·⟩_η := ⟨·|η·⟩, where ⟨·|·⟩ is the inner product which defines the original Hilbert space H. The new Hilbert space endowed with the inner product ⟨·|·⟩_η is recognized as the physical Hilbert space H_phys, in which H acts as a Hermitian operator [8]. A generalized continuity equation in the physical Hilbert space H_phys was defined in [1] as

  \frac{\partial}{\partial t}\rho_\eta(x) + \frac{\partial}{\partial x}j_\eta(x) = 0   (2.2)

in the x-representation, where

  \rho_\eta(x) = \int \mathrm{d}y\, \psi^*(y)\,\eta(y,x)\,\psi(x)   (2.3)

is the probability density.
The probability current density is given by

  j_\eta(x) = i \int \mathrm{d}y \left( \psi^*(y)\,\eta(y,x)\,\frac{\partial \psi(x)}{\partial x} - \frac{\partial\big(\psi^*(y)\,\eta(y,x)\big)}{\partial x}\,\psi(x) \right)   (2.4)

3. The continuous double delta function potential

In this section, we will test the continuity relation obtained from the generalized current (2.4) in the continuous case. There are very few cases in the literature where the metric has been computed [2, 9–11]. Let us consider the double delta function potential given by

  V(x) = i\lambda\big(\delta(x+a) - \delta(x-a)\big)   (3.1)

with the scattering wave function, in the situation where the plane wave comes in from the left and is either reflected or transmitted at the delta function potential, of the form

  \psi_<(x) = e^{ikx} + r e^{-ikx} \ \text{ if } x \leq -a, \qquad \psi_>(x) = t e^{ikx} \ \text{ if } x \geq a,   (3.2)

where ψ_<(x) and ψ_>(x) are respectively the incoming and the outgoing wave functions. The metric was given in [2] as a perturbative series in λ; to first order it is given by

  \eta(x,y) = \delta(x-y) + \eta^{(1)}(x,y) + O(\lambda^2)   (3.3)

where

  \eta^{(1)}(x,y) = i\frac{\lambda}{2}\,\mathrm{sign}(x-y)\big(\theta(x+y+2a) - \theta(x+y-2a)\big).   (3.4)

Let us apply the formula for the generalized continuity relation introduced in (2.4) to this case. We should therefore compute the functions

  \chi(x) = \psi(x) + \int \eta^{(1)}(x,y)\,\psi(y)\,\mathrm{d}y.   (3.5)

A close inspection of the metric shows that for both x ≤ −a and x ≥ a, with −(x+2a) < y < −(x−2a), we have η^{(1)}(x,y) ≠ 0; otherwise it is zero. We therefore obtain for x ≤ −a

  \chi_<(x) = \psi_<(x) - i\frac{\lambda}{2}\int_{-(x+2a)}^{-(x-2a)} \psi_<(y)\,\mathrm{d}y = e^{ikx} + \Big(r - i\frac{\lambda}{k}\,t\,\sin 2ak\Big)e^{-ikx}   (3.6)

and for x ≥ a we find

  \chi_>(x) = \psi_>(x) + i\frac{\lambda}{2}\int_{-(x+2a)}^{-(x-2a)} \psi_<(y)\,\mathrm{d}y = \Big(t + i\frac{\lambda}{k}\,r\,\sin 2ak\Big)e^{ikx} + i\frac{\lambda}{k}\,\sin 2ak\, e^{-ikx}.   (3.7)

The second term in this function can be interpreted as a reflected wave function traveling from right to left, and therefore it has no physical meaning [4, 5]. This causes no problem in our approach, since χ(x) has no physical interpretation. The generalized reflection coefficient is easily computed using its definition in terms of the reflected and incident generalized currents, which can be worked out using (2.4), and is given by

  \mathcal{R} = |r|^2 + \frac{i\lambda}{4}\,t^* r\,\sin 2ak.   (3.8)

Similarly, the generalized transmission coefficient can be computed using the transmitted and incident currents, and is given by

  \mathcal{T} = |t|^2 - \frac{i\lambda}{4}\,r^* t\,\sin 2ak,   (3.9)

leading to

  \mathcal{R} + \mathcal{T} = |r|^2 + |t|^2 + \frac{i\lambda}{4}\,(t^* r - r^* t)\,\sin 2ak.   (3.10)

Note that if we use the complex conjugate of relation (2.4) to compute the generalized reflection and transmission coefficients we obtain the same result. This is easily seen from the fact that the reflection and transmission coefficients are real, and hence the currents are real too. The first terms in both R and T are obviously real, and the second terms are also real because of the relation t*r = −r*t, which is a general feature of PT-symmetric models [12]. For the model at hand (3.1), the reflection and transmission amplitudes can be easily worked out, leading to

  r = i\,t\,\frac{\lambda}{k}\Big(1 - \frac{\lambda}{2k}\Big)\sin 2ak   (3.11)

and

  t = \frac{1}{1 + i\frac{\lambda^2}{2k^2}\,e^{2iak}\sin 2ak}.   (3.12)

It is worth noting that if we do the usual quantum mechanical scattering (η(x,y) = δ(x−y)) with the PT-symmetric double delta function potential, we do not have unitarity:

  |r|^2 + |t|^2 = \frac{1 + \frac{\lambda^2}{k^2}\big(1 - \frac{\lambda}{2k}\big)^2 \sin^2 2ak}{1 - \frac{\lambda^2}{k^2}\big(1 - \frac{\lambda^2}{4k^2}\big)\sin^2 2ak}.   (3.13)

By using the expressions (3.11) and (3.12) we obtain

  t^* r = -r^* t = i\,|t|^2\,\frac{\lambda}{k}\Big(1 - \frac{\lambda}{2k}\Big)\sin 2ak   (3.14)

and after a straightforward computation we can show that

  \mathcal{R} + \mathcal{T} = 1.   (3.15)
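The identities (3.13) and (3.14), on which the unitarity argument rests, are easy to cross-check numerically. The sketch below evaluates r and t from (3.11)–(3.12) for arbitrary test values of λ, k and a (the numbers chosen here are illustrative, not from the paper) and verifies both relations.

```python
import numpy as np

lam, k, a = 0.3, 1.7, 0.9          # illustrative parameter values (assumed)
s = np.sin(2 * a * k)

# Amplitudes from Eqs. (3.11)-(3.12)
t = 1.0 / (1.0 + 1j * lam**2 / (2 * k**2) * np.exp(2j * a * k) * s)
r = 1j * t * (lam / k) * (1 - lam / (2 * k)) * s

# Eq. (3.13): |r|^2 + |t|^2 for the naive (eta = delta) scattering
lhs = abs(r)**2 + abs(t)**2
rhs = (1 + lam**2 / k**2 * (1 - lam / (2 * k))**2 * s**2) \
    / (1 - lam**2 / k**2 * (1 - lam**2 / (4 * k**2)) * s**2)
print(lhs - rhs)                         # ~0 up to rounding: (3.13) holds

# Eq. (3.14): t*r + r*t should vanish, i.e. t*r is purely imaginary
print(np.conj(t) * r + np.conj(r) * t)   # ~0: t*r = -r*t
```

Printing both differences against zero is a cheap regression test for the algebra before it is fed into the unitarity computation.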
We have thus obtained a unitarity relation in the Hilbert space H_phys. Note that this result is exact in λ, as we have used the full expressions of the transmission (3.12) and reflection (3.11) amplitudes in terms of λ (to all orders). This is a puzzling result that we have yet to understand. It is clear that if one uses the expressions for t and r to first order in λ, the relation (3.15) holds trivially. A possible explanation is that the higher orders in λ of the metric somehow do not change the result of the continuity relation. For this to be true, the higher order corrections should not contribute to the functions χ_<(x) and χ_>(x).

4. Conclusion

We have been able to show that the generalized current defined with respect to the metric η is conserved,

  (j_\eta)_{\mathrm{in}} - (j_\eta)_{\mathrm{out}} = 0,   (4.1)

and that the corresponding generalized reflection and transmission coefficients obey the unitarity relation for the quasi-Hermitian continuous double delta function potential. Although the metric used in the continuous case is perturbative in the strength λ of the imaginary part of the potential, the results of the unitarity relation were obtained using the expressions for the amplitudes to all orders in λ. This raises a question that deserves further investigation; to this end, one has to work out the metric to a higher order in λ. Finally, an interesting and urgent question is to understand the physical implementation of the metric η in these scattering processes.

Acknowledgements

I would like to thank the Abdus Salam International Centre for Theoretical Physics (ICTP), Trieste, Italy, for hospitality and support. I would also like to thank M. Znojil for helpful discussions and for organizing the conference "Analytic and Algebraic Methods in Physics X", Prague.

References
[1] Amine B. Hammou, J. Phys. A: Math. Theor. 45 (2012) 215310.
[2] H. Mehri-Dehnavi, A. Mostafazadeh, A. Batal, J. Phys. A 43, 145301 (2010).
[3] M. Znojil, Phys. Rev. D 78, 025026 (2008).
[4] H. F. Jones, Phys. Rev. D 76, 125003 (2007).
[5] H. F. Jones, Phys. Rev. D 78, 065032 (2008).
[6] Z. Ahmed, arXiv:1207.6896.
[7] F. G. Scholtz, H. B. Geyer and F. J. W. Hahne, Ann. Phys. 213, 74 (1992).
[8] A. Mostafazadeh, Int. J. Geom. Meth. Mod. Phys. 7, 1191 (2010).
[9] A. Mostafazadeh, J. Math. Phys. 46, 102108 (2005).
[10] A. Mostafazadeh, J. Phys. A: Math. Gen. 39, 13495 (2006).
[11] A. Mostafazadeh, J. Phys. A 39, 10171 (2006).
[12] F. Cannata, J.-P. Dedonder and A. Ventura, Ann. Phys. 322, 397 (2007).

Acta Polytechnica Vol. 41 No. 4–5/2001

STRUCTURAL DESIGN USING SIMULATION BASED RELIABILITY ASSESSMENT

S. Vukazich, P. Marek

The concept of simulation based reliability assessment (SBRA) for civil engineering structures is presented. SBRA uses bounded histograms to represent variable material and geometric properties as well as variable load effects, and makes use of the Monte Carlo technique to perform a probability-based reliability assessment. SBRA represents a departure from the traditional deterministic and semi-probabilistic design procedures applied in codes, and as such requires a "re-engineering" of current assessment procedures in accordance with the growing potential of computer and information technology. Three simple examples of how SBRA can be used in structural engineering design are presented: 1) the reliability assessment of a steel beam, 2) a truss bar subjected to variable tension and compression, and 3) the pressure on the wall of an elevated water tank due to earthquake load.

Keywords: structural engineering, reliability assessment, Monte Carlo method.

1 Introduction

The idea of structural reliability can be very simply stated: the structure must have greater resistance than the load effects on the structure demand. The resistance of a structure is a variable quantity depending on material strengths, accuracy of analytical models, geometric tolerances, and many other factors. Similarly, load effects due to wind, seismic, vehicle and other loads are variable. The first reliability assessment concepts used for design were deterministic and based on experience, using single safety factors to account for uncertainties in both resistance and load effects.
Since the early 1960s many national and international design codes have been moving toward a limit states design philosophy based on the partial factors concept, in which the variability of resistance and load effect combinations are analyzed separately. This approach allows the strength of the structure beyond the linear elastic limit, second order effects, and the accumulation of damage to be taken into account more precisely. In partial factors design, nominal values of loads (or load effects) are chosen, and combinations of these loads are specified for design using special coefficients and load factors that multiply the nominal loads in order to define design values that take load combinations into account. The resistance of the structure or element of the structure is represented by its ultimate strength. The ultimate strength is multiplied by a resistance factor to take its variability into account. Structural safety is achieved when the factored resistance equals or exceeds the factored load effect combination. Unfortunately, when considering the multi-random-variable input which governs both resistance and loading, the analytical mathematical solutions required to determine the design point can become extremely difficult, if not impossible. There is also some question as to the appropriateness of using ultimate (i.e. collapse) behavior as the reference level for reliability assessment. Thus, load and resistance factors have been chosen based on years of design experience coupled with experimental results and calibration. Because of the large amount of design experience associated with current design codes, reliable designs for many classes of structures can be realized, but as a consequence, codes based on the partial reliability factors concept remain, from the designer's point of view, deterministic in nature.

2 Simulation based reliability assessment (SBRA)

An alternative approach, which is now possible due to modern computer technology, is to use Monte Carlo simulation to model the variability of both load effects and resistance. Simulation based reliability assessment (SBRA), as presented by Marek et al [1], allows for a detailed evaluation of individual load effects and their combinations. All loads are random in nature and can be described as random variables. In this approach, individual loads are represented by load duration curves and corresponding bounded (non-parametric) histograms, and not by nominal values and load factors as is the case in many current codes. Similarly, variable quantities (e.g. yield strength) that affect the resistance of a structure or an element can also be expressed in terms of bounded histograms. Fig. 1 is an outline of how a bounded histogram that represents a loading action may be constructed. In Fig. 1(a) the time history of an action F(t) is plotted; in this particular case, the action represents a long lasting load on a structure. Fig. 1(b) shows the time history sorted by intensity, from the minimum intensity on the left to the maximum intensity on the right. This sorted time history is called the load duration curve. The load duration curve can be transformed into the bounded histogram shown in Fig. 1(c). In SBRA all loads are represented as bounded histograms, which can be constructed from data on the load history; a small sketch of this construction is given below. Another key element in SBRA is the definition of a reference level to define the limit of the "usefulness" of a structure or element.
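A minimal illustration of the Fig. 1 construction: sorting a sampled time history yields the load duration curve, and binning the same samples yields the bounded (non-parametric) histogram. The synthetic time history below is an illustrative stand-in for measured load data, not data from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
F = 20.0 + 4.0 * rng.standard_normal(10_000)   # synthetic load record F(t) [kN]

duration_curve = np.sort(F)   # Fig. 1(b): intensities in ascending order

# Fig. 1(c): bounded histogram -- relative frequencies over the observed range
counts, edges = np.histogram(F, bins=20)
freq = counts / counts.sum()
for lo, hi, p in zip(edges[:-1], edges[1:], freq):
    print(f"{lo:6.1f} - {hi:6.1f} kN : {p:.3f}")
```

Because the histogram is built only over the observed range, it is bounded by construction, which matches the non-parametric representation SBRA uses in place of fitted distributions.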
Any exceedance of this "usefulness" limit would impair the ability of the structure or elements of the structure to perform "safely" and would require replacement or repair. Obviously, these limits may have a different character depending on the type of structure, material, loading, etc. For example, in the case of simple strength, the limit resistance may be set as the onset of yielding or at some tolerable magnitude of permanent plastic deformation. In the case of fatigue, the limiting value may be set as a tolerable magnitude of accumulated damage or a tolerable length of fatigue crack.

Fig. 2 is an illustration of how structural reliability can be checked using simulation, from Marek et al [1]. The method uses direct Monte Carlo simulation to determine the probability of failure corresponding to any combination of variable load effects, S, and resistance, R. The required number of computer simulations are performed according to the distributions of R and S. To obtain the probability of failure, the number of simulations (each "dot" in Fig. 2 corresponding to the interaction of R and S) in the failure region (the region to the right of the R − S = 0 line in Fig. 2) is divided by the total number of simulations. To check the design, the resulting probability of failure is compared to an acceptable target probability; a minimal sketch of this procedure is given below. The following three examples illustrate how SBRA can be used in structural design. The first example examines the safety and serviceability assessment of a steel beam. The second example presents the assessment of a diagonal member of a steel truss subjected to loading that can induce both tension and compression in the member. The third example is an illustration of how SBRA can be used in earthquake engineering.

3 Example 1: Reliability of a steel beam

The reliability of the simply supported steel beam shown in Fig. 3, exposed to the uniform dead load q, the long-lasting point load F1 and the short-lasting point loads F2, will be examined. Lateral-torsional buckling of the beam is prevented, only the elastic response to the loading is considered, and the effects of residual stresses are neglected.
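Before working the beam numbers, here is a minimal sketch of the Fig. 2 procedure: sample R and S from their distributions, count the simulations falling in the failure region R − S ≤ 0, and compare the estimated failure probability with the target value. The distributions below are illustrative placeholders, not the histograms used in the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 200_000                               # number of simulation steps

R = rng.normal(300.0, 20.0, N)            # resistance sample [kNm] (placeholder)
S = rng.uniform(150.0, 260.0, N)          # load effect sample [kNm] (placeholder)

p_f = np.mean(R - S <= 0.0)               # fraction of dots in the failure region
p_d = 0.00007                             # target probability for safety
print(f"p_f = {p_f:.5f}  ->  {'adequate' if p_f < p_d else 'not adequate'}")
```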
Fig. 1: Generation of the load duration curve and the corresponding bounded histogram for a long lasting load: a) load time history F(t); b) sorted load time history — the load duration curve; c) bounded histogram.

Fig. 2: Graphical representation of simulation based reliability assessment: histograms of the load effect S and the resistance R; the failure region lies to the right of the line R − S = 0.

Fig. 3: Steel beam example (span dimensions 3 m, 3 m, 1.5 m, 1.5 m; uniform load q and point loads F2, F1, F2).

Fig. 4: Bounded histograms used to represent variable loads and properties: a) yield stress (s235_15.his); b) dead load (dead2.his); c) long lasting load (long1.his); d) short lasting load (short1.his); e) moment of inertia (n1-10.his); f) section modulus (n1-05.his).

The target probabilities pd = 0.00007 for safety and pd = 0.07 for serviceability are considered. The steel is grade S235, and the extreme load values are: dead load q = 5 kN/m; long lasting load F1 = 75 kN; short-lasting load F2 = 30 kN. The nominal beam span is L = 9 m, and the rolled section properties are: nominal section modulus Wnom = 1147500 mm³ and nominal moment of inertia Inom = 329296875 mm⁴. The yield stress fy is expressed by the bounded histogram s235_16.his shown in Fig. 4(a). The load variations are represented by the histograms dead2.his, long1.his and short1.his shown in Figs. 4(b), 4(c) and 4(d). The variations of the geometric properties of the section due to over- and under-rolling are expressed by normal distributions with a range of ±10 % for Inom, represented by the histogram n1-10.his shown in Fig. 4(e), and similarly a ±5 % range for Wnom, expressed by the histogram n1-05.his shown in Fig. 4(f). The beam length L is taken to be constant. Variable quantities are represented by multiplying the extreme or characteristic value by the corresponding histogram. For example, the variation of the section modulus is represented by

  W_{var} = W_{nom} \times (\text{histogram n1-05})   (1)

Note that in all subsequent equations the subscript var is used to represent a variable quantity.

3.1 Reliability assessment for safety

In this example, the reference level applied in the calculation of the probability of failure pf is defined by the onset of yielding, with the prescribed target probability pd = 0.00007. The load effect combination at the critical section of the beam can be expressed as

  S = \frac{q_{var}\,L^2}{8} + \frac{F_{1,var}\,L}{4} + \frac{F_{2,var}\,L}{3}   (2)

The reliability function at the critical section can be expressed as

  R = 0.9\,f_{y,var}\,W_{var}   (3)

where the 0.9 reduction factor is used to represent the difference between the reference level yield stress and the experimentally derived yield stress fy. A more detailed discussion of reference levels may be found in Marek et al [1]. The safety function is defined as

  SF = R - S   (4)

Using the M-Star program, developed by Marek et al [1], the safety function can be evaluated; the output is shown in Fig. 5(a). A total of 100000 simulation steps were used. From the M-Star output, one can see that the probability of failure of this section (i.e. the probability that the safety function is zero or less) is 0.00006. Since the probability of failure is less than the target probability (0.00006 < 0.00007), the safety of the section is adequate.
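The safety check just described is straightforward to reproduce in a few lines. In the sketch below the bounded histograms of Fig. 4 are replaced by crude placeholder distributions (uniform sampling for the loads up to their stated extremes, normal sampling for the section modulus), since the .his data files are not reproduced here; only the structure of the calculation follows Equations (1)–(4), and the printed result is therefore illustrative only.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 100_000
L = 9.0                                         # beam span [m]

# Placeholder sampling standing in for the bounded histograms of Fig. 4
fy = rng.normal(280e6, 15e6, N)                 # yield stress [Pa] (assumed shape)
q = 5e3 * rng.uniform(0.8, 1.0, N)              # dead load [N/m], extreme 5 kN/m
F1 = 75e3 * rng.uniform(0.0, 1.0, N)            # long lasting load [N]
F2 = 30e3 * rng.uniform(0.0, 1.0, N)            # short lasting load [N]
W = 1_147_500e-9 * rng.normal(1.0, 0.05 / 3, N) # section modulus [m^3], +-5 %

S = q * L**2 / 8 + F1 * L / 4 + F2 * L / 3      # Eq. (2), midspan moment [Nm]
R = 0.9 * fy * W                                # Eq. (3), resistance [Nm]

p_f = np.mean(R - S <= 0.0)                     # Eq. (4): SF = R - S
print(f"p_f = {p_f:.5f} (target 0.00007)")
```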
3.2 Reliability assessment for serviceability

Next, the reference serviceability limitation will be checked. The maximum tolerable deflection is defined by the limit L/350. The prescribed corresponding target probability is pd = 0.07. The deflection referring to the variable load effect combination is

  \delta = \frac{1}{E\,I_{var}}\left(\frac{5\,q_{var}\,L^4}{384} + \frac{F_{1,var}\,L^3}{48} + 0.0355\,F_{2,var}\,L^3\right)   (5)

The serviceability function is defined by

  SF = \frac{L}{350} - \delta   (6)

The M-Star program output for the serviceability analysis is shown in Fig. 5(b). A total of 100000 simulation steps were used. From the M-Star output, one can see that the probability of failure of this section (i.e. the probability that the serviceability function is zero or less) is 0.00581. Since the probability of failure is less than the target probability (0.00581 < 0.07), the serviceability of the section is adequate.

Fig. 5: M-Star™ program output: a) reliability analysis for safety; b) reliability analysis for serviceability.

4 Example 2: Truss bar subjected to tension and compression

The pin-connected truss shown in Fig. 6 is exposed to horizontal and vertical variable mutually uncorrelated loads WL, SL and DL acting in the plane of the truss. Translation of the joints out of the plane of the truss is prevented. This example illustrates how simulation based reliability assessment can be used to assess the diagonal bar "a" indicated in Fig. 6 to resist variable tension and compression due to the loads. Initial eccentricity is considered due to fabrication and also to express the effect of residual stresses. All load effects and variables are considered to be statistically independent. The target probability used for the reliability assessment for safety considering tension and buckling in this example is assumed to be pd = 0.00007. Table 1 contains the characteristic values of the loads and the corresponding load factors according to Eurocode 1993 (further details can be found in [4]).

Table 1: Characteristic values of applied loads and corresponding load factors.

  Loading            Characteristic value [kN]   Eurocode 1993 load factors   Extreme values [kN]
  Dead DL            400.0                       1.35 / 1.00                  540.0, 350.0
  Short lasting SL   1200.0                      1.50 / 0.00                  1800.0, 0.0
  Wind WL            200.0                       1.50 / −1.50                 300.0, −300.0

The nominal geometrical and mechanical properties of the rolled steel shapes are shown in Table 2.

Table 2: Geometric and material properties of the selected rolled steel shapes.

  Rolled shape                                    IPE 330      IPE 360
  Cross-sectional area A [m²]                     0.00626      0.00727
  Radius of gyration r [m]                        0.03547      0.03782
  Steel grade (ČSN 73 1401, ČSN P ENV 1993-1-1)   Fe360/S235   Fe360/S235

Fig. 6: Geometry of the example truss (panel dimensions 3 m, 3 m, 2 m, 2 m; loads WL, SL, DL; diagonal bar "a").

Fig. 7: Bounded histograms used to represent variable loads and variable properties: a) yield stress (t235fy01.his); b) dead load (dead1.his); c) wind load (wind1.his); d) residual stress (expon1.his); e) initial eccentricity (imp1000.his); f) cross sectional area (area-m.his).

The bounded histograms used to represent the variable quantities in this example are shown in Fig. 7. All variable quantities are expressed in a similar manner as presented in the previous example. The resistance in tension Rt of the truss bar is defined as

  R_t = 0.9\,f_{y,var}\,A_{var}   (7)

The resistance Rc in compression can be defined by the following equation from Marek et al [1], [4]:

  R_c = \frac{1}{2}\left(R_1 + \frac{\pi^2 E A_{var}}{(L/r)^2}\right) - \sqrt{\frac{1}{4}\left(R_1 + \frac{\pi^2 E A_{var}}{(L/r)^2}\right)^2 - f_{y,var}\,A_{var}\,\frac{\pi^2 E A_{var}}{(L/r)^2}}   (8)

where R_1 is obtained from

  f_{y,var} = \frac{R_1}{A_{var}}\left[1 + \left(1 + 0.1\,s_{var}\right)\frac{e_{var}\,c}{r^2}\,\sec\!\left(\frac{L}{2r}\sqrt{\frac{R_1}{E\,A_{var}}}\right)\right]

and where the constant quantities in Eqn. 8 are: E, the elastic modulus; c, the distance to the extreme fibers from the centroidal axis; L, the length of the bar; and r, the radius of gyration. The variable quantities are: s_var, a variable parameter that takes into account the residual stress in the cross section (Fig. 7(d)); e_var, the initial eccentricity (Fig. 7(e)); and A_var, the cross sectional area (Fig. 7(f)). The reliability assessment according to SBRA is based on the analysis of the safety function

  SF = R - S   (9)

where R is the resistance (referring to the onset of yielding or buckling of the truss bar) and S is the variable load effect
4 – 5/2001 loading characteristic value [kn] eurocode 1993 load factors extreme values [kn] dead dl 400.0 1.35 1.00 540.0, 350.0 short lasting sl 1200.0 1.50 0.00 1800.0, 0.0 wind wl 200.0 1.50 �1.50 300.0, �300.0 table 1: characteristic values of applied loads corresponding to load factors rolled shape ipe 330 ipe 360 cross-sectional area a [m2] 0.00626 0.00727 radius of gyration r [m] 0.03547 0.03782 steel grade (čsn 73 1401, čsn p env 1993-1-1) fe360/s235 fe360/s235 table 2: geometric and material properties of selected rolled steel shapes 3 m 3 m 2 m 2 m wl sl dl bar “a” fig. 6: geometry of the example truss a) yield stress (t235fy01.his) b) dead load (dead1.his) c) wind load (wind1.his) d) residual stress (expon1.his) e) initial eccentricity (imp1000.his) f) cross sectional area (area-m.his) fig. 7: bounded histograms used to represent variable loads and variable properties expressed by the axial force in the truss bar (variable n2 in the m-star and anthill analyses). two possible sections are considered for the truss bar. the first is an ipe 330 section with the probability of failure pf determined using sbra and the m-star program to analyze the safety function with the output shown in fig. 8(a). as illustrated previously in fig. 2, the probability of failure can also be determined by plotting resistance (r) versus axial load in the truss bar (n2). this analysis was performed using the anthill program (developed by marek et al [1]) with the output shown in fig. 9(a). note that 200000 simulation steps were used and both the m-star and anthill analyses yield a similar probability of failure, pf = 0.00110 and pf = 0.001075, respectively. from both the m-star and the anthill output, we can see that the ipe 330 section is not adequate (pf = 0.0011 > pd = 0.00007). the second analysis considers a larger ipe 360 section and results in a lower probability of failure (pf = 0.00001 from both the m-star analysis and the anthill analysis). the m-star and anthill output for the ipe 360 section is shown in figs. 8(b) and 9(b), respectively. from the analyses, one can see that the ipe 360 section is adequate to resist both the tension and the compression induced by the applied loads ( pf = 0.00001 < pd = 0.00007). © czech technical university publishing house http://ctn.cvut.cz/ap/ 89 acta polytechnica vol. 41 no. 4 – 5/2001 a) b) fig. 8: m-star™ program output: a) ipe330, b) ipe360 a) b) fig. 9: anthill™ program output: a) ipe330, b) ipe360 fig. 10: elevated water tank structure and physical model of water in the tank 5 example 3: pressure on the wall of an elevated water tank due to earthquake in this example, the maximum pressure on the wall of the cylindrical elevated water tank shown in fig. 10 is calculated for earthquake loading. the effect of the variation of water level, h, and other variable properties on the maximum pressure exerted on the wall at point “a” is studied. the characteristic values of the mass and stiffness of the tank support are shown in fig. 10. the vibratory response of the entire structure is approximated by the shape function �(x). the dynamic properties of the water in the tank are represented by the effective spring-mass system shown in fig. 10. the mass, m1, and the stiffness, k1, represent the oscillation of the water in the tank due to acceleration of the tank base (harris [6]). the effect of tank rotation and the rotational inertia of the tank are negligible. the maximum peak ground acceleration at the site is taken to be 0.44 g, with a variation of �10 %. 
The response spectra given in Fig. 11 are representative of the response of the entire structure (spectrum 1) and of the equivalent mass-spring system for the water in the tank (spectrum 2). The variable quantities considered in this problem are the peak ground acceleration, the tank radius, the water height in the tank, the distributed mass of the tank support structure, and the bending stiffness of the support structure. Each of these variables is represented by multiplying the characteristic value by a bounded, normal distribution histogram. The characteristic and extreme values of the variable quantities are summarized in Table 3. Note that for this illustrative example, the response of the entire tank structure and the equivalent spring-mass system that represents the effect of the oscillation of the water in the tank are considered separately. This approach can be used for the ranges of stiffness and mass in this problem; in more general cases, the analysis should consider the dynamics of the entire coupled system.

Fig. 11: Response spectra for the elevated tank problem.

Table 3: Variable quantities for the tank pressure example.

  Load/property                  Characteristic value   Extreme values
  Peak ground acceleration       0.440 g                0.396 g, 0.484 g
  Tank radius                    3.5 m                  3.15 m, 3.85 m
  Water height                   3.0 m                  2.0 m, 4.0 m
  Mass of support structure      1500.0 kg/m            1350.0 kg/m, 1650.0 kg/m
  Bending stiffness of support   1.0·10¹⁰ Nm²           0.9·10¹⁰ Nm², 1.1·10¹⁰ Nm²

The entire elevated tank structure is modeled as an equivalent single degree of freedom (SDOF) system based on the shape function given in Fig. 10. In this manner, the deformation of the tank can be expressed as

  u(x,t) = \psi(x)\,z(t)   (10)

Using Eqn. 10, the equivalent SDOF equation of motion for a structure subjected to the earthquake ground acceleration üg can be written as

  \ddot{z} + 2\zeta\omega\,\dot{z} + \omega^2 z = -\frac{\tilde{L}}{m^*}\,\ddot{u}_g   (11)

where, for this problem and the defined shape function:
ζ — damping ratio (5 %);
ω = √(k*/m*) — fundamental circular frequency of the supported tank structure;
m* = 0.2357 mL + M;
k* = 3EI/L³;
L̃ = 0.375 mL + M;
k1 — equivalent stiffness of the fluid in the tank;
m — distributed mass of the support structure;
M — mass of the supported tank and water.

Calculation details for an equivalent SDOF system can be found in any structural dynamics text (Clough and Penzien [5]). The maximum acceleration response at the level of the tank (x = L) can be found using spectrum 1 in Fig. 11 and the fundamental period T of the elevated tank structure:

  S_{ab} = |\ddot{u}(L)|_{max} = \psi(L)\,\frac{\tilde{L}}{m^*}\,S_a = \frac{\tilde{L}}{m^*}\,S_a   (12)

The maximum displacement response of the oscillating water in the tank can now be found by assuming that the acceleration at the base of the tank is approximately sinusoidal, as shown in Fig. 10:

  \ddot{u}(L) \approx S_{ab}\sin\bar{\omega}t   (13)

where ω̄ is the circular frequency of the entire tank structure. Spectrum 2 shown in Fig. 11 is the result of assuming that the acceleration at the base of the tank is sinusoidal, and the maximum acceleration can be written as

  S_{a1} = S_{ab}\,\beta^2 D   (14)

where β = T1/T̄ = ω̄/ω1 is the frequency ratio of the tank structure and the water, and D = 1/(1 − β²) for an undamped system. Details of the derivation of Eqn. 14 can be found in Clough and Penzien [5].
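The chain (10)–(14) is easy to follow numerically. The sketch below evaluates the structural period, reads a spectral acceleration, and propagates it to the water's equivalent oscillator for the characteristic values of Table 3. The support height, the supported mass, the spectral ordinate and the sloshing period are illustrative assumptions, since Fig. 10, Fig. 11 and the tank mass are not reproduced here.

```python
import math

# Characteristic values (Table 3) and assumed tank data
m = 1500.0        # distributed mass of support [kg/m]
EI = 1.0e10       # bending stiffness [N m^2]
L = 20.0          # support height [m] -- assumption, not given in the text
M = 120_000.0     # supported tank + water mass [kg] -- assumption

m_star = 0.2357 * m * L + M          # generalized mass
k_star = 3 * EI / L**3               # generalized stiffness
L_tilde = 0.375 * m * L + M          # earthquake excitation factor

omega = math.sqrt(k_star / m_star)   # Eq. (11) frequency
T = 2 * math.pi / omega
S_a = 0.9 * 9.81                     # spectrum 1 ordinate at T -- assumed value

S_ab = (L_tilde / m_star) * S_a      # Eq. (12), psi(L) = 1

T1 = 3.0                             # sloshing period, Eq. (15) -- assumed here
beta = T1 / T                        # frequency ratio, Eq. (14)
D = 1.0 / abs(1.0 - beta**2)
S_a1 = S_ab * beta**2 * D            # Eq. (14)
print(f"T = {T:.2f} s, S_ab = {S_ab:.2f} m/s2, S_a1 = {S_a1:.2f} m/s2")
```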
The maximum acceleration at the base of the tank can be found from Eqns. 12 and 14 and the fundamental period of the water in the tank:

  T_1 = 2\pi\sqrt{\frac{m_1}{k_1}}   (15)

where k1 and m1 are defined in Harris [6] as

  m_1 = \frac{3}{5}\,m\,\frac{r}{h}\,\frac{\tanh\!\left(\sqrt{27/8}\,\,h/r\right)}{\sqrt{27/8}}, \qquad k_1 = \frac{m_1\,g}{h}\,\sqrt{27/8}\,\tanh\!\left(\sqrt{27/8}\,\,h/r\right)   (16)

The maximum displacement of the equivalent mass m1 can be found by integrating the acceleration response:

  S_{d1} = \frac{S_{a1}}{\omega_1^2}   (17)

From the equivalent heights h0 and h1 and from an analysis of the equivalent spring-mass system for the tank shown in Fig. 10, the pressure on the wall at depth h can be expressed as

  p_w = \rho\left[\frac{\sqrt{3}}{2}\,S_{ab}\,h\,\tanh\!\left(\sqrt{3}\,\frac{r}{h}\right) + \frac{5}{8}\,S_{d1}\,\omega_1^2\,\frac{r}{\cosh\!\left(\sqrt{27/8}\,\,h/r\right)}\right]   (18)

where S_ab represents the maximum acceleration at the base of the tank and S_d1 represents the maximum displacement of the equivalent mass m1 relative to the wall of the tank. Details of the derivation of Eqn. 18 can be found in Harris [6]. Note that Eqn. 18 conservatively assumes that both S_ab and S_d1 reach their maximum values at nearly the same instant in time. The distribution of the pressure at the base of the tank wall p_w was calculated with the M-Star program, using 100000 simulation steps. The M-Star output for this problem is shown in Fig. 12, with a summary of the results given in Table 4. A deterministic analysis for the water level at the maximum level of h = 4 m results in a pressure of 24727 N/m². From Fig. 12, the probability of the pressure exceeding 24727 N/m² is 1 − 0.9057 = 0.0943 (9.43 %). The results show that taking into account the variable properties and the variation of the water level has a significant effect on the results, and consequently on the design of the tank. In lieu of choosing normal distributions to describe the variable quantities, historical water level records and collected data that describe other variable properties can be used for a more accurate analysis. Note also that this example was constructed for the purpose of illustrating SBRA, and so some of the simplifications and assumptions made may not be acceptable for all conditions. This analysis can easily be extended to include the probability of the occurrence of an earthquake.

6 Conclusion

The three examples presented only hint at the potential of SBRA as a tool for the designer. Many detailed examples of SBRA related to civil engineering design problems are presented in Marek et al [1], [3] and Brozzetti et al [2]. Unfortunately, at present, to include simulation based reliability assessment methods into codes and standards is not an easy task. In many ways current methods must be re-examined, in particular the characterization of how variable loads are represented and distributed on structures. Since nearly all of the current design codes in civil engineering are, from the designer's point of view, deterministic in nature, assessment of an acceptable probability of failure is difficult. Considerable discussion by researchers, specification writing committees, and designers is needed to pave the way for innovation in structural engineering, starting with the transition from a deterministic to a probabilistic "way of thinking" (Marek et al [3]). Attention must be given to the "rules of the game", including the definition of the reference level in the probabilistic analysis of safety, durability, and serviceability. To date, the only code that allows for applying simulation and gives guidelines for target probability is the structural steel design code ČSN 73 1401-1998 (Appendix A) of the Czech national standards.
table 4: variation of the maximum water pressure on the elevated tank wall

pressure at point "a"    probability of non-exceedance
16 547 N/m^2             0.0 (minimum)
26 096 N/m^2             0.99
27 021 N/m^2             0.999
27 723 N/m^2             0.9999
28 516 N/m^2             1.0 (maximum)

fig. 12: calculation of the maximum wall pressure using m-star

6 conclusion

the three examples presented only hint at the potential of sbra as a tool for the designer. many detailed examples of sbra related to civil engineering design problems are presented in marek et al [1], [3] and brozzetti et al [2]. unfortunately, at present, including simulation based reliability assessment methods in codes and standards is not an easy task. in many ways current methods must be reexamined, in particular the characterization of how variable loads are represented and distributed on structures. since nearly all of the current design codes in civil engineering are, from the designer's point of view, deterministic in nature, assessment of an acceptable probability of failure is difficult. considerable discussion by researchers, specification writing committees, and designers is needed to pave the way for innovation in structural engineering, starting with the transition from a deterministic to a probabilistic "way of thinking" (marek et al [3]). attention must be given to the "rules of the game", including the definition of the reference level in the probabilistic analysis of safety, durability, and serviceability. to date, the only code that allows for applying simulation and gives guidelines for target probability is the structural steel design code 73 1401-1998 (appendix a) of the czech national standards. computational methods like sbra allow designers to use, build, and update available data and knowledge bases to better represent the variable character of loading, material properties, and other factors in design and assessment. it is the opinion of the authors that sbra takes advantage of modern information transfer and computer power to build a more general, accurate, consistent, and transparent representation of load and resistance, and represents the future of structural reliability assessment methods.

acknowledgements

the results presented in this work were supported by the leonardo da vinci agency (ec brussels) and gacr 103/01/1410 and 103/96/k034 – czech republic. professor ing. ondřej fischer, drsc. of itam cas, academy of sciences of czech republic, contributed to the development of example 3 (see [3]).

references

[1] marek, p., guštar, m., anagnos, t.: simulation-based reliability assessment for structural engineers. crc press, boca raton, florida, 1995
[2] brozzetti, j., guštar, m., ivanyi, m., kowalczyk, r., marek, p.: tereco – teaching reliability concepts using simulation technique. leonardo da vinci agency, eu, brussels, 2001
[3] marek, p., brozzetti, j., guštar, m. (eds.): probabilistic assessment of structures using monte carlo simulation: background, exercises, software. itam cas cz, prague, 2001
[4] marek, p., krejsa, m.: transition from deterministic to probabilistic structural steel reliability assessment with special attention to stability problems. sdss '99, sept. 1999, timisoara, romania
[5] clough, r. w., penzien, j.: dynamics of structures. second edition. mcgraw-hill, new york, 1993
[6] harris, c. m., crede, c. e.: shock and vibration handbook. volume 3, mcgraw-hill, new york, 1961

steven vukazich, associate professor
phone: +408 924 3858, fax: +408 924 4004
e-mail: vukazich@email.sjsu.edu
department of civil engineering, san jose state university, san jose, ca, usa 95192-0083

professor pavel marek
e-mail: marekp@itam.cas.cz
itam, czech academy of sciences, prosecká 76, 190 00 praha 9, czech republic

acta polytechnica 53(6):868–871, 2013

factor and palindromic complexity of thue-morse's avatars

christiane frougny (a), karel klouda (b, ∗)
(a) liafa, cnrs umr 7089, case 7014, 75205 paris cedex 13, france
(b) faculty of information technology, czech technical university in prague, thákurova 9, praha 6, 160 00, czech republic
(∗) corresponding author: karel.klouda@fit.cvut.cz

abstract. two infinite words that are connected with some significant univoque numbers are studied. it is shown that their factor and palindromic complexities almost coincide with the factor and palindromic complexities of the famous thue-morse word.

keywords: factor complexity, palindromic complexity, univoque numbers, thue-morse word.
1. introduction

the main result of this paper is the computation of the factor and palindromic complexity of two infinite words which appear in [1] as a representation of some significant univoque numbers. a real number λ > 1 is said to be univoque if 1 admits a unique expansion in base λ of the form 1 = ∑_{i>0} a_i λ^{−i} with a_i ∈ {0, 1, . . . , ⌈λ⌉ − 1}. komornik and loreti showed in [2] that there is a smallest univoque number γ in the interval (1, 2). this number is transcendental [3] and is connected with the thue-morse word in this sense: if 1 = ∑_{i>0} a_i γ^{−i}, then a_1 a_2 a_3 · · · = 11010011 · · · = 0^{−1} u_tm, i.e., the thue-morse word without the leading zero. there are two generalizations of this result. the first one is the work of the same authors [4], where they studied the univoque numbers λ ∈ (1, b + 1), b ∈ ℕ. the second one is the work of allouche and frougny [1]. they proved that there exists a smallest univoque number in (b, b + 1) (this is proved also in [5]), and they also found the corresponding unique expansion of 1. these expansions and some other significant words from [1] are studied in the sequel. as explained in the concluding remark, at least the factor complexity could be computed using the common method employing special factors (see e.g. [6] for details). however, here we derive both complexities directly from the definition of the words, which really enlightens the connection between the studied words and the thue-morse word.

2. preliminaries

an alphabet A is a finite set of letters. a concatenation of n letters v = v_0 v_1 · · · v_{n−1} from A is a (finite) word over A of length n. an infinite sequence u = u_0 u_1 u_2 · · · is an infinite word over A. any finite word v such that v = u_k u_{k+1} · · · u_{k+n−1} for some k ∈ ℕ is called a factor of u, and u_k u_{k+1} · · · u_{k+n−1} is its occurrence in it. the set of all factors of u is denoted by L(u); the set L_n(u) is the set of all factors of length n. the factor complexity of an infinite word u is the function C_u(n) that returns for all n ∈ ℕ the number of factors of u of length n. given a word v = v_0 v_1 · · · v_{n−1}, the word ṽ is defined as v_{n−1} v_{n−2} · · · v_1 v_0. if v = ṽ, v is called a palindrome. the palindromic complexity of an infinite word u is the function P_u(n) that returns for all n ∈ ℕ the number of factors of u of length n that are palindromes.

all infinite words in question are derived from the famous thue-morse word u_tm. the thue-morse word is the fixed point of the thue-morse morphism

ϕ_tm(0) = 01,   ϕ_tm(1) = 10,

starting in the letter 0, i.e.,

u_tm = lim_{n→∞} ϕ_tm^n(0) = 0110100110 · · · = ε_0 ε_1 ε_2 ε_3 · · · .

we are interested in the factor and palindromic complexity of infinite words w = m_0 m_1 m_2 · · · given by

m_n = ε_{n+1} − (2t − b − 1) ε_n + t − 1,   (1)

where 2t > b ≥ 1. in particular, we want to determine both complexities for all the three cases stated in [1, theorem 2], i.e., for 2t ≥ b + 3, 2t = b + 2 and 2t = b + 1. if 2t = b + 1, then m_n = ε_{n+1} + t − 1, and so w equals the word 0^{−1} u_tm (after renaming the letters 0 → t − 1 and 1 → t), having the same factor and palindromic complexity. analogously, in the other two cases 2t ≥ b + 3 and 2t = b + 2, it is sufficient to consider only one choice of parameters t and b satisfying the inequality and the equality, respectively. that is because the former formula implies that the word w consists of four distinct letters b − t < b − t + 1 < t − 1 < t, and the latter one that the word w consists of three distinct letters b − t < t − 1 < t.
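a short sketch makes eqn. (1) concrete: it builds a prefix of u_tm from the morphism and evaluates m_n directly. the function names are ours, and the parameters in the example are just one admissible choice.

    def thue_morse(n):
        """prefix of the thue-morse word u_tm of length n (eps_0 eps_1 ...)."""
        u = [0]
        while len(u) < n:
            u += [1 - e for e in u]   # phi_tm doubles the prefix by complementation
        return u[:n]

    def avatar(t, b, n):
        """the word w of eqn (1): m_n = eps_{n+1} - (2t - b - 1) eps_n + t - 1."""
        eps = thue_morse(n + 1)
        return [eps[i + 1] - (2 * t - b - 1) * eps[i] + t - 1 for i in range(n)]

    # the case 2t = b + 1 (here t = 1, b = 1): w is u_tm without its leading 0,
    # up to renaming 0 -> t - 1 and 1 -> t
    print(avatar(1, 1, 12))   # [1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0] = 11010011...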
if we choose t = b = 3 and t = b = 2, respectively, all the other words given by (1) are (after renaming the letters) equal to the words corresponding to these two choices of parameters b and t. thus, we can simplify the definition of the infinite words we study as follows.

definition 1. for a = 1, 2, the infinite word w_a = m_0 m_1 m_2 · · · is defined by

m_n = ε_{n+1} − a ε_n + a = ε_{n+1} + a(1 − ε_n).   (2)

hence, we get

w_1 = 210201210120 · · ·,
w_2 = 310302310230 · · · .

as we will see, the factor and palindromic complexity of the word w_a will be expressed using the factor and palindromic complexity of the thue-morse word u_tm. therefore we recall the following two theorems.

theorem 2 ([7], [8]). for the thue-morse sequence, C_utm(1) = 2, C_utm(2) = 4 and, for n ≥ 3, if n = 2^r + q + 1, r ≥ 0, 0 ≤ q < 2^r, then

C_utm(n) = 6 · 2^{r−1} + 4q   if 0 ≤ q ≤ 2^{r−1},
C_utm(n) = 2^{r+2} + 2q      if 2^{r−1} < q < 2^r.

theorem 3 ([9]). let n ≥ 3 and n = 2 · 4^k + q, k ∈ ℕ, 0 ≤ q < 6 · 4^k; then

P_utm(2n) = 4   if 0 < q ≤ 3 · 4^k,
P_utm(2n) = 2   if 3 · 4^k < q < 6 · 4^k or q = 0.

furthermore, P_utm(1) = P_utm(2) = P_utm(3) = P_utm(4) = 2, and there are no palindromes of odd length greater than 3.

3. factor complexity

the following lemma points out the similarity between the languages of the words u_tm and w_a.

lemma 4. there exists a bijective mapping from L_utm(n + 1) to L_wa(n) for all n ≥ 2.

proof. the mapping is defined by (2); we just have to prove that it is injective. let m_q · · · m_{q+k} and m_p · · · m_{p+k} be two occurrences of the same factor of w_a, k ≥ 1, q ≠ p. we prove that the factors ε_q ε_{q+1} · · · ε_{q+k+1} and ε_p ε_{p+1} · · · ε_{p+k+1} are the same as well. obviously, it suffices to prove it for the case of k = 1. let

m_q = ε_{q+1} + a(1 − ε_q) = m_p = ε_{p+1} + a(1 − ε_p)

and

m_{q+1} = ε_{q+2} + a(1 − ε_{q+1}) = m_{p+1} = ε_{p+2} + a(1 − ε_{p+1}).

since there are only 8 possible three-letter binary words ε_p ε_{p+1} ε_{p+2} and ε_q ε_{q+1} ε_{q+2}, it is easy to find all solutions of these two equations. if a = 2, then ε_p ε_{p+1} ε_{p+2} = ε_q ε_{q+1} ε_{q+2} is the unique solution of this system of two equations. if a = 1, ε_q = ε_{q+1} = ε_{q+2} ≠ ε_p = ε_{p+1} = ε_{p+2} is the only other solution, but it is not admissible since neither 000 nor 111 are factors of u_tm.

this lemma allows us to determine the factor complexity C_wa(n) for n ≥ 2. the case n = 1 is trivial: C_wa(1) is equal to the number of letters occurring in w_a.

corollary 5. for both a = 1 and a = 2 and for all n ≥ 2, it holds that C_wa(n) = C_utm(n + 1). furthermore, C_w1(1) = 3 and C_w2(1) = 4.

corollary 6. for both a = 1 and a = 2, w_a is square-free.

proof. let ww be a factor of w_a, w of length n, and let ww = m_i · · · m_{i+2n−1}. then, according to the previous lemma, there exists a unique factor v of length n, with first letter b, such that vvb = ε_i · · · ε_{i+2n} is a factor of u_tm. but this is not possible, since u_tm is overlap-free (see e.g. [10]), which means exactly that it does not contain factors of this form.
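corollary 5 is easy to check empirically on a long prefix; a minimal sketch follows (helper names are ours, and the prefix is assumed to be long enough for the tested factor lengths, which holds here because u_tm is uniformly recurrent).

    def thue_morse(n):
        u = [0]
        while len(u) < n:
            u += [1 - e for e in u]
        return u[:n]

    def w(a, n):
        # eqn (2): m_n = eps_{n+1} + a (1 - eps_n), a in {1, 2}
        eps = thue_morse(n + 1)
        return [eps[i + 1] + a * (1 - eps[i]) for i in range(n)]

    def complexity(word, n):
        # number of distinct factors of length n in the given prefix
        return len({tuple(word[i:i + n]) for i in range(len(word) - n + 1)})

    u = thue_morse((1 << 15) + 1)
    for a in (1, 2):
        wa = w(a, 1 << 15)
        assert all(complexity(wa, n) == complexity(u, n + 1) for n in range(2, 20))
    assert complexity(w(1, 100), 1) == 3 and complexity(w(2, 100), 1) == 4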
4. palindromic complexity

as for the palindromic complexity, the difference between the cases a = 1 and a = 2 is more significant than it is for the factor complexity. however, the result still remains strongly related to the palindromic complexity of u_tm. the first simple observation is that, since w_a is square-free for both values of a, it cannot contain palindromes of even length, since such a palindrome contains the square of a letter in its middle.

definition 7. let A = {0, 1, . . . , N}, a ∈ A and v = v_1 · · · v_m ∈ A*, N, m ≥ 1. set ā = N − a and v̄ = v̄_1 · · · v̄_m.

lemma 8. let p ≥ 2 be even.

• a word m_n m_{n+1} . . . m_{n+p} is a palindrome of w_2 if and only if

ε_n = ε_{n+2} = · · · = ε_{n+p−2} = ε_{n+p},
ε_{n+1} = ε_{n+3} = · · · = ε_{n+p−1} = ε_{n+p+1},   (3)

where ε_{n+1} ≠ ε_n.

• a word m_n m_{n+1} . . . m_{n+p} is a palindrome of w_1 if and only if

ε_n = ε̄_{n+p+1},
⋮
ε_{n+p/2} = ε̄_{n+p/2+1}.   (4)

proof. we have for all i = 0, 1, . . . , p/2 − 1

m_{n+i} = ε_{n+i+1} + a(1 − ε_{n+i}) = m_{n+p−i} = ε_{n+p−i+1} + a(1 − ε_{n+p−i}),
m_{n+i+1} = ε_{n+i+2} + a(1 − ε_{n+i+1}) = m_{n+p−i−1} = ε_{n+p−i} + a(1 − ε_{n+p−i−1}),   (5)

where m_{n+i} ≠ m_{n+i+1} due to the square-freeness of w_a. these two equations have a trivial solution

ε_{n+i} = ε_{n+i+2} = ε_{n+p−i} ≠ ε_{n+i+1} = ε_{n+p−i+1} = ε_{n+p−i−1}   for i = 0, 1, . . . , p/2 − 2.

for the case a = 2, it is the only solution. if a = 1, we can rewrite (5) as

ε_{n+1} + ε_{n+p} = ε_n + ε_{n+p+1},
ε_{n+2} + ε_{n+p−1} = ε_{n+1} + ε_{n+p},
⋮
ε_{n+p/2} + ε_{n+p/2+1} = ε_{n+p/2−1} + ε_{n+p/2+2}.

now, considering that ε_n = ε_{n+p+1} leads to the inadmissible solution ε_n = ε_{n+1} = · · · = ε_{n+p+1}; therefore, the factor ε_n ε_{n+1} · · · ε_{n+p+1} is a solution if and only if (4) is satisfied.

thus, in the case of a = 2, the existence of a palindrome of odd length p + 1, p ≥ 2, is equivalent to the existence of the factors 1010 · · · 10 or 0101 · · · 01 in u_tm of length p + 2. but such words are factors of u_tm only for p = 2.

theorem 9. it holds that

P_w2(n) = 4   if n = 1,
P_w2(n) = 2   if n = 3,
P_w2(n) = 0   otherwise.

in order to describe the relation between the palindromic complexities of w_1 and u_tm, we need to introduce the following definition.

definition 10. a factor v of an infinite word u is said to be a c-palindrome if ṽ = v̄. denote by CP_u(n) the number of c-palindromes of length n in u.

lemma 8 says that there exists a bijective mapping between the set of palindromes in w_1 of odd length p + 1, p ≥ 2, and the set of c-palindromes in u_tm of length p + 2.

corollary 11. for n ≥ 1 it holds that P_w1(2n + 1) = CP_utm(2n + 2).

lemma 12. for all positive integers n it holds that CP_utm(2n) = P_utm(4n).

proof. it is readily seen that if v is a c-palindrome in u_tm of length 2n, then ϕ_tm(v) is a palindrome of length 4n. similarly, if v′ is a palindrome of length 4n, then there exists a unique c-palindrome v of length 2n such that ϕ_tm(v) = v′.

theorem 13. for n ≥ 1,

P_w1(2n + 1) = P_utm(4n + 4),   P_w1(1) = 3.

there are no palindromes of even length in w_1.
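both statements are easy to confirm on a prefix; a short check of theorems 9 and 13 follows (again with our own helper names, and assuming the tested prefix is long enough, which the uniform recurrence of u_tm guarantees in practice).

    def thue_morse(n):
        u = [0]
        while len(u) < n:
            u += [1 - e for e in u]
        return u[:n]

    def w(a, n):
        eps = thue_morse(n + 1)
        return [eps[i + 1] + a * (1 - eps[i]) for i in range(n)]

    def n_palindromes(word, n):
        # number of distinct palindromic factors of length n in the prefix
        f = {tuple(word[i:i + n]) for i in range(len(word) - n + 1)}
        return sum(1 for v in f if v == v[::-1])

    u, w1, w2 = thue_morse(1 << 15), w(1, 1 << 15), w(2, 1 << 15)
    assert [n_palindromes(w2, n) for n in (1, 2, 3, 4, 5)] == [4, 0, 2, 0, 0]
    for n in range(1, 6):   # theorem 13: P_w1(2n + 1) = P_utm(4n + 4)
        assert n_palindromes(w1, 2 * n + 1) == n_palindromes(u, 4 * n + 4)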
5. remarks

as remarked in [1, remark 5], w_1 = 210201210120 · · · is exactly the square-free braunholtz sequence on three letters given in [11]. moreover, this sequence is in fact the sequence u_i which can be defined as the fixed point of istrail's substitution 1 → 102, 0 → 12, 2 → 0 [12]; thus we obtain w_1 from u_i by exchanging the letters 1 ↔ 2, 2 ↔ 0, 0 ↔ 1. then, of course, the factor complexity of u_i and w_1 is the same. the word u_i was studied in [13], where its factor complexity is computed using the notion of (right) special factors. in [13] the sequence u_i is referred to as the thue-morse word on three symbols, and as is recalled there, it was originally defined by thue [14, 15] and later on rediscovered in various contexts by several authors, such as morse [16]. another relation between u_i and u_tm is also pointed out there: if we define a (non-primitive) substitution δ(1) = 011, δ(0) = 01, δ(2) = 0, we have δ(u_i) = u_tm. consequently, δ′(w_1) = u_tm for δ′(2) = 011, δ′(1) = 01, δ′(0) = 0.

acknowledgements

this work was supported by the czech science foundation grant gačr 13-35273p.

references

[1] j.-p. allouche, et al. univoque numbers and an avatar of thue-morse. acta arith 136:319–329, 2009.
[2] v. komornik, et al. unique developments in non-integer bases. amer math monthly 105:636–639, 1998.
[3] j.-p. allouche, et al. the komornik-loreti constant is transcendental. amer math monthly 107:448–449, 2000.
[4] v. komornik, et al. subexpansions, superexpansions and uniqueness properties in non-integer bases. period math hungar 44:197–218, 2002.
[5] m. de vries, et al. unique expansions of real numbers. adv math 221(2):390–427, 2009.
[6] j. cassaigne. special factors of sequences with linear subword complexity. in developments in language theory, pp. 25–34. world scientific, 1996.
[7] a. de luca. on some combinatorial problems in free monoids. discrete math 38:207–225, 1982.
[8] s. brlek. enumeration of factors in the thue-morse word. discrete appl math 24:83–96, 1989.
[9] a. blondin-massé, et al. palindromic lacunas of the thue-morse word. proc gascom 2008, pp. 53–67, 2008.
[10] j.-p. allouche, et al. palindrome complexity. theor comput sci 292:9–31, 2003.
[11] c. h. braunholtz. an infinite sequence on 3 symbols with no adjacent repeats (solution to problem 439 posed by h. noland). amer math monthly 70:675–676, 1963.
[12] s. istrail. on irreducible languages and nonrational numbers. bull math soc sci math r s roumanie 21:301–308, 1977.
[13] a. de luca, et al. on the factors of the thue-morse word on three symbols. inform process lett 27(6):281–285, 1988.
[14] a. thue. über unendliche zeichenreihen. norske vid skrifter i mat-nat kl chris 7:1–22, 1906.
[15] a. thue. über die gegenseitige lage gleicher teile gewisser zeichenreihen. norske vid skrifter i mat-nat kl chris 8:1–67, 1912.
[16] m. morse, et al. unending chess, symbolic dynamics and a problem in semigroups. duke math j 11:1–7, 1944.

acta polytechnica 55(4):229–236, 2015

assessment of postural instability in patients with a neurological disorder using a tri-axial accelerometer

lenka hanaková (a, ∗), vladimir socha (a), jakub schlenker (a), ondrej cakrt (a, b), patrik kutilek (a)
(a) faculty of biomedical engineering, czech technical university in prague, nam. sitna 3105, kladno, czech republic
(b) 2nd faculty of medicine, university hospital motol, charles university in prague, v uvalu 84, prague, czech republic
(∗) corresponding author: lenka.hanakova@fbmi.cvut.cz

abstract. current techniques for quantifying human postural stability during quiet standing have several limitations. the main problem is that only two movement variables are evaluated, though a better description of complex three-dimensional (3-d) movements can be provided with the use of three variables. a single tri-axial accelerometer placed on the trunk was used to measure 3-d data. we are able to evaluate 3-d movements using a method based on the volume of the confidence ellipsoid (ve) of the set of points obtained by plotting three accelerations against each other. our method was used to identify and evaluate pathological balance control. in this study, measurements were made of patients with progressive cerebellar ataxia, and also control measurements of healthy subjects, and a statistical analysis was performed. the results show that the ves of the neurological disorder patients are significantly larger than the ves of the healthy subjects.
it can be seen that the quantitative method based on the ve is very sensitive for identifying changes in stability, and that it is able to distinguish between neurological disorder patients and healthy subjects.

keywords: trunk acceleration, gyro-accelerometer, postural stability, confidence ellipsoid volume, cerebellar disease.

1. introduction

neurological disorders usually have a negative effect on trunk posture [1–3]. patients with neurological disorders often show instability during quiet stance tasks. in recent years, tri-axial inertial measurement units (imu) for acceleration and orientation measurements have been replacing force platforms for high-accuracy 3-d measurements of body movements for studying the centre of pressure (cop) excursions [4] of patients with neurological disorders. the sensing units can be used to measure the three euler angles and the three accelerations of a segment. in the past, sensing units were placed on top of the head or on the spinous processes of t1 and/or s1 in order to measure the motion of the head, the trunk and the pelvis, respectively. although imu systems can measure the three angles and the three accelerations, techniques for quantifying segment movements using only one or two measured quantities were introduced into clinical practice [4–8]. measurements of trunk accelerations during stance can identify impaired balance control in individuals with neurological disorders [5, 9]. for posture monitoring and training, a non-commercial imu system has also been developed as a diagnostic tool for trunk movements [10, 11]. in addition, the cheap accelerometers and gyroscopes used in modern cell phones have been tested for posture monitoring [6]. an assessment of trunk movements using an imu may yield clearer insights into balance deficits, and may provide a considerably better diagnostic tool than the more traditional measures mentioned above. although measurements of trunk movements can provide aid in diagnostics and rehabilitation, measurements of trunk movements using imu have not previously been implemented in the study of numerous postural balance problems of patients with specific types of diseases. the force (posturography) platform has until now been the main tool for studying, e.g., the body movement of patients suffering from cerebellar diseases [12]. traditional, more complex methods for processing the measured data and for assessing postural instability, using at least two measured variables, are based on the 2-d convex hull [13–15], the 2-d confidence ellipse [16–18], or the length of the trajectory obtained by plotting two variables against each other [19, 20]. these methods are usually used to evaluate the data from the force platform [12, 21]. however, these traditional ways of quantifying postural stability have some limitations. the major limitation is that the methods are based on evaluating only two variables, one in each of the two planes/axes of the human body. this can lead to a loss of important information about physical activity, specifically the third physical quantity of 3-d movements. although tri-axial imus have been used for quantifying postural instability in clinical practice, the methods for evaluating the measured data usually use only two variables out of the three measured variables.
this paper aims to validate and apply a method for identifying the pathological balance control of patients with a neurological disorder on the basis of accelerations measured by an imu, using the volume of the confidence ellipsoid of the set of points obtained by plotting the three accelerations, measured during quiet stance, against each other. the technique relies on a well-known principle that is used for evaluating 2-d data sets [16–18]. the reason for applying this method is that a study of three-dimensional movement with one variable characterizing the change in all three accelerations may find new uses for the study of postural stability in clinical practice using a low-cost tri-axial imu. moreover, these accelerations are directly related to changes in position, which means that we can evaluate the accelerations directly. this method is used for diagnosing cerebellar disease manifested by tremors or by swaying, since it is primarily designed for use in neurology to discover relationships between neurological disorders and postural trunk movements in 3-d space.

2. methods and materials

2.1. participants

ten volunteer patients (aged 52.2 (sd 11.7) years) with degenerative cerebellar ataxia [22] and eleven healthy subjects (aged 26.0 (sd 6.4) years) participated in the study. a comparison of groups of the same age is not necessary in this case, since studies have shown that the body sway parameters of healthy subjects within the age range between 20 and 60 years vary only slightly [23, 24]. aoki et al. [23] found insignificant differences in 10–60-year-old subjects in the cop sway parameters circumference, rectangular area, left-right width, and front-back width in both cases (eyes open/eyes closed), i.e., no significant age-related differences were found in any of the cop sway parameters (i.e., romberg quotients). it has been shown that significantly degraded stability begins after the age of 60 years. in addition, a detailed analysis of an age-related increase in the cop parameters by polynomial-type regression showed that a gradual increase in body sway, characterized by increased cop oscillations, begins around the age of 60 years [24]. the patients were recruited from the motol faculty hospital, prague, czech republic. a board-certified neurologist had previously screened and diagnosed progressive cerebellar disease. the diagnostic evaluation included a detailed disease history, a neurological examination, routine laboratory blood and urine tests, and a brain mri. the patients were measured in the initial phase of the clinic's two-week rehabilitation program. healthy subjects were recruited from students/volunteers at charles university in prague. for the healthy subjects, the diagnostic evaluation included a detailed disease history, a neurological examination and routine laboratory testing. the study was performed in accordance with the helsinki declaration. the study protocol was approved by the local ethics committee and the motol university hospital, and informed consent was obtained from all subjects. the subjects were chosen for the measurements randomly, and on different days.

figure 1. the xsens system with one tri-axial imu (a) and a central unit (b) used to measure angles and accelerations of the trunk.
2.2. test procedure and measurement equipment

the xsens system, composed of the xbus master (central unit), a lightweight (330 g) portable device using mtx units (tri-axial imus) for orientation and acceleration measurement of body segments, was used for measuring trunk movements (see fig. 1). the mtx unit is an accurate imu providing drift-free 3-d orientation and 3-d acceleration. the mtx unit was calibrated before each measurement, and was set up in such a way that one axis of the coordinate system was parallel to the anterior-posterior axis, i.e., the axis of symmetry of the fixed stationary platform on which the participants stood, and the other two axes were perpendicular to the anterior-posterior axis (i.e., the axis of symmetry of the platform) with respect to the direction of the earth's gravitational field, i.e., the vertical axis was co-linear with the direction of gravity. after calibration, one mtx unit was placed on the patient's trunk, according to [8, 13, 25], at the level of the lower back (lumbar 2-3), see fig. 1. the data, i.e., the three angles (euler angles: roll, yaw and pitch) [26–28] and the three accelerations (in the accelerometer coordinate system) [29] of the trunk, were measured by the imu while the patients (pts) and the healthy subjects (control group, cg) were standing quietly on a fixed stationary platform. in brief, trunk sway was measured during quiet stance on a firm surface (fis) and on a soft foam surface (fos), with eyes open (eo) and with eyes closed (ec) [30]. each task involved standing for at least 60 seconds [31]. the measurements usually lasted for a few seconds longer, and the initial data were cut off, so that all datasets have a record length of 60 seconds. the subject's bare feet were positioned next to each other, splayed at an angle of 30°, with the arms always in a hanging position. the three angles, the three accelerations and the task duration were recorded with a sample frequency of 100 hz. there was no need to normalize the data measured during quiet stance, because the standard ranges of angles and accelerations are the same for all three planes of the body and for all adult subjects, since the imu was placed on the same anatomical point.

2.3. data analysis

the roll (φ), yaw (ψ) and pitch (θ) angles and the three orthogonal accelerations (a_Sx, a_Sy, a_Sz) in the accelerometer coordinate system [29] were measured by one mtx unit. the conventions of the euler angles (roll, yaw and pitch) are described in [27, 28, 32]. the three accelerations measured by the accelerometer of the imu (i.e., the mtx unit) are described in detail in [33, 34]. these angles and accelerations are used to calculate the accelerations in the global reference system and then in the anatomical coordinate frame. the calculation is based on rotational matrices. the first rotation matrix R_GS can be interpreted in terms of the euler angles [35]:

R_GS = R_z(ψ) · R_y(θ) · R_x(φ),   (1)

where

R_z(ψ) = [  cos ψ   −sin ψ   0
            sin ψ    cos ψ   0
              0        0     1 ],   (2)

R_y(θ) = [  cos θ   0   sin θ
              0     1     0
           −sin θ   0   cos θ ],   (3)

R_x(φ) = [  1     0        0
            0   cos φ   −sin φ
            0   sin φ    cos φ ].   (4)

matrix R_GS rotates an acceleration vector a_S = (a_Sx, a_Sy, a_Sz)^T in the sensor coordinate system (s) to the global reference system (g):

a_G = R_GS · a_S.   (5)
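a minimal numpy sketch of eqns. (1)–(5) follows; the function names are ours, and the study's own processing was done in matlab, so this is only an equivalent formulation.

    import numpy as np

    def r_z(psi):
        c, s = np.cos(psi), np.sin(psi)
        return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])   # eqn (2)

    def r_y(theta):
        c, s = np.cos(theta), np.sin(theta)
        return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])   # eqn (3)

    def r_x(phi):
        c, s = np.cos(phi), np.sin(phi)
        return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])   # eqn (4)

    def sensor_to_global(a_s, phi, theta, psi):
        """a_g = R_GS a_s with R_GS = R_z(psi) R_y(theta) R_x(phi) (eqns 1, 5)."""
        return r_z(psi) @ r_y(theta) @ r_x(phi) @ a_s

    # toy usage: one accelerometer sample [m/s^2] and one set of euler angles [rad]
    a_g = sensor_to_global(np.array([0.1, -0.05, 9.81]), 0.02, -0.01, 0.3)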
the acceleration vector a_G = (a_Gx, a_Gy, a_Gz)^T in the global reference system is then rotated by the second rotation matrix R_AG to the anatomical coordinate frame (a) [36]:

a_A = R_AG · a_G,   (6)

where

R_AG = R_z(ψ0) = [  cos ψ0   −sin ψ0   0
                    sin ψ0    cos ψ0   0
                      0         0      1 ].   (7)

the angle ψ0 is obtained during the calibration process of the mtx unit. the calculated acceleration vector a_A = (a_AP, a_ML, a_SI)^T in the anatomical coordinate frame represents the superior-inferior acceleration (a_SI), the medio-lateral acceleration (a_ML) and the anterior-posterior acceleration (a_AP).

figure 2. example of the confidence ellipsoid obtained by plotting superior-inferior (si), medio-lateral (ml) and anterior-posterior (ap) accelerations against each other.

using the accelerations derived above, we are able to evaluate the 3-d movements of the patients. the calculated acceleration vectors, or in other words the time-dependent data (a_SI, a_ML, a_AP) obtained by the tri-axial accelerometer, are plotted as a 3-d plot, i.e., a set of points is obtained by plotting the three accelerations against each other, see fig. 2. the number of points is determined by the time of the measurements, i.e., the recorded length of the dataset (60 s) and the sample frequency (100 hz). the method for identifying pathological balance control is based on the mathematical tools of static posturography, which are used to analyze the body sway of unsteady patients [16–18]. we can model the distribution of the measured data by a 2-d ellipse or a 3-d ellipsoid. the confidence ellipse area has already been used in clinical practice to study postural balance problems, but the concept of the confidence ellipsoid volume has not been used before in clinical practice for studying postural balance problems by three accelerations. we have used a method based on a description of the distribution of the measured data (i.e., a_SI, a_ML and a_AP) by a confidence ellipsoid. the confidence ellipsoid is determined using principal component analysis (pca) [37]. the volume of an ellipsoid is given by an ellipsoid matrix, and the ellipsoid matrix is composed of entries from the covariance matrix [38, 39]. the symmetric 3 × 3 covariance matrix is given by the equation [40, 41]:

C = [ var(a_AP)         cov(a_AP, a_ML)   cov(a_AP, a_SI)
      cov(a_ML, a_AP)   var(a_ML)         cov(a_ML, a_SI)
      cov(a_SI, a_AP)   cov(a_SI, a_ML)   var(a_SI)       ].   (8)

the eigensystem of the covariance matrix constitutes a quadratic surface which is used for visualization. the confidence ellipsoid that makes the fewest assumptions about the shape of the underlying distribution from which the measured data can be drawn is a three-dimensional ellipsoid, fig. 2. we use a 95% confidence ellipsoid (ce) to verify the applicability of a 3-d confidence ellipsoid [42–44]. the 95% confidence ellipsoid (with 0.95 probability [45, 46]) volume is the volume of an ellipsoid that is expected to bound 95 % of the measured data, i.e., of the set of points obtained by plotting three accelerations against each other. the confidence ellipsoid volume is given by the equation

V = (4/3) π a b c,   (9)

where a = sqrt(χ² λ1), b = sqrt(χ² λ2), c = sqrt(χ² λ3) are the semi-axes, λ1,2,3 are the eigenvalues of matrix C, and χ² = 5.991, as reported by greenwalt and schultz [47]. in our case, the physical unit of the volume is m^3 s^{-6}.
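a compact sketch of eqns. (8)–(9) follows (numpy instead of the matlab programs used in the study; the synthetic input is only a placeholder).

    import numpy as np

    CHI2 = 5.991   # the chi-squared value used in the paper [47]

    def confidence_ellipsoid_volume(a_ap, a_ml, a_si, chi2=CHI2):
        """volume of eqn (9): V = 4/3 pi abc, with semi-axes sqrt(chi2*lambda_i)
        and lambda_i the eigenvalues of the covariance matrix C of eqn (8)."""
        c = np.cov(np.vstack([a_ap, a_ml, a_si]))   # symmetric 3x3 matrix, eqn (8)
        lam = np.linalg.eigvalsh(c)                 # lambda_1, lambda_2, lambda_3
        semi = np.sqrt(chi2 * lam)
        return 4.0 / 3.0 * np.pi * semi.prod()      # [m^3 s^-6]

    # toy usage: 60 s at 100 hz of synthetic trunk accelerations
    rng = np.random.default_rng(2)
    a_ap, a_ml, a_si = rng.normal(0.0, [[0.02], [0.015], [0.01]], (3, 6000))
    print(confidence_ellipsoid_volume(a_ap, a_ml, a_si))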
although the mtx unit also senses the gravitational acceleration, it is not necessary to subtract the gravitational acceleration, because the method for calculating the ve uses only changes in the accelerations, and the gravitational acceleration is constant and perpendicular to the horizontal plane of the earth's surface. all calculations were performed using custom-written matlab programs, with principal component analysis implemented in matlab software (matlab r2010b, mathworks, inc., natick, ma, usa), to determine the symmetric 3 × 3 covariance matrix of the measured data and the confidence ellipsoid volume.

2.4. statistical analysis

after calculating the ve of each patient and of each healthy subject standing on the fis and on the fos with eo and ec, the statistical analysis was performed using matlab software. the jarque-bera test was used to test the normal distribution of the calculated ves [48]. the test returns a value of h = 1 if it rejects the null hypothesis at the 0.05 significance level, and h = 0 otherwise. the median (mdn), minimum (min), maximum (max), first quartile (q1) and third quartile (q3) of the calculated ves are then used to compare the results for the pts and the cg. in addition, the wilcoxon signed rank test and the wilcoxon rank sum test [49] were used to assess the significance of the differences between the results of the measurements. the significance level was set at p < 0.05.
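a hedged scipy sketch of this test sequence (our function names, not the matlab code used in the study):

    import numpy as np
    from scipy import stats

    def normality_flag(ve, alpha=0.05):
        """h = 1 if the jarque-bera test rejects normality at the given level."""
        _, p = stats.jarque_bera(np.asarray(ve))
        return int(p < alpha)

    def compare(ve_a, ve_b, paired):
        """paired (eo vs ec within one group) -> wilcoxon signed rank test;
        unpaired (cg vs pts) -> wilcoxon rank sum test."""
        res = stats.wilcoxon(ve_a, ve_b) if paired else stats.ranksums(ve_a, ve_b)
        return res.pvalue

    # usage: compare(ve_cg_fis_eo, ve_cg_fis_ec, paired=True), rejecting at p < 0.05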
3. results

the statistical data are used to illustrate the differences between the ves of the pts with eo, the pts with ec, the cg with eo and the cg with ec (tab. 1).

table 1: comparison of the volume of the ellipsoids of the control group and the patients (all values in m^3 s^{-6})

             cg eo         pts eo        cg ec         pts ec
fis   min    4.30·10^-5    1.00·10^-4    5.60·10^-5    2.28·10^-4
      max    1.79·10^-4    1.04·10^-1    2.53·10^-4    5.10·10^-1
      mdn    1.14·10^-4    3.90·10^-4    1.31·10^-4    9.76·10^-4
      q1     9.80·10^-5    2.86·10^-4    1.09·10^-4    3.70·10^-4
      q3     1.43·10^-4    4.49·10^-4    1.85·10^-4    3.36·10^-3
fos   min    2.74·10^-4    1.08·10^-3    4.94·10^-4    4.80·10^-2
      max    1.31·10^-3    2.03·10^0     9.43·10^-3    2.65·10^0
      mdn    4.76·10^-4    1.68·10^-2    1.76·10^-3    4.23·10^-1
      q1     4.15·10^-4    5.34·10^-3    1.51·10^-3    6.71·10^-2
      q3     7.48·10^-4    2.89·10^-2    2.76·10^-3    7.93·10^-1

figure 3. comparison of the volume of the ellipsoid of the control group (cg) and the patients (pts) standing on a firm surface (fis), with eyes open (eo) and with eyes closed (ec).

figure 4. comparison of the volume of the ellipsoid of the control group (cg) and the patients (pts) standing on a foam surface (fos), with eyes open (eo) and with eyes closed (ec).

the following plots (fig. 3 and fig. 4) display the min, max, mdn, q1 and q3 of the ves. since some calculated values were not distributed normally, the wilcoxon test was used to compare and analyze the sets of calculated values. the comparison of the cg on the fis with eo and the cg on the fis with ec (p = 0.18) revealed no differences. differences were found when comparing the pts on the fis with eo and the pts on the fis with ec (p = 0.02), the cg on the fos with eo and the cg on the fos with ec (p < 0.01), and the pts on the fos with eo and the pts on the fos with ec (p < 0.01). for the subjects standing on the fis, the measured data of the cg or pts with eo and ec show the same median or a slight increase in the median of the ves after the eyes were closed (fig. 3). for the subjects standing on the fos, the measured data for the cg or pts with eo and ec show a significant increase in the median of the ves when the eyes were closed (fig. 4). significant differences were also found when comparing the cg on the fis with eo and the pts on the fis with eo (p < 0.01), the cg on the fis with ec and the pts on the fis with ec (p < 0.01), the cg on the fos with eo and the pts on the fos with eo (p < 0.01), and the cg on the fos with ec and the pts on the fos with ec (p < 0.01).

it can therefore be concluded that there is a significant difference between the data for the cg and for the pts: the method based on the ve provides significantly different results for the cg and the pts measured standing on both types of surface. the median of the ves of the pts standing on the fis with eo is 3.4 times larger than the median of the ves of the cg standing on the fis with eo. the median of the ves of the pts standing on the fis with ec is 7.5 times higher than the median of the ves of the cg standing on the fis with ec. the median of the ves of the pts standing on the fos with eo is 35.3 times higher than the median of the ves of the cg standing on the fos with eo. the median of the ves of the pts standing on the fos with ec is 240.3 times higher than the median of the ves of the cg standing on the fos with ec.

4. discussion

a technique based on the volume of the confidence ellipsoid of the set of points obtained by plotting three accelerations against each other has been tested in clinical practice. the results show that in all cases where the subjects stand on the fis or the fos, the ves of the pts with progressive cerebellar ataxia are significantly higher than the ves of the healthy subjects. however, the differences are much more pronounced when standing with ec on the fos. for patients or healthy subjects standing on the fos with eo and ec, the findings are consistent with findings obtained by traditional methods [13, 50], and show a significant difference between the postural stability of the pts and the cg. these findings also justify the decision to ask the subjects to stand on a foam surface with ec during the medical examination [51, 52]. for the pts standing on the fos with ec, the median of the ve is 4.23·10^-1 m^3 s^{-6}; for the healthy subjects, the median of the ve is only 1.76·10^-3 m^3 s^{-6}. the results show that our method clearly identifies a deterioration in postural control, and large differences between the postural stability of the pts and the cg. this difference is caused by a strong tremor or swaying of the patients' trunks, which is one of the manifestations of cerebellar diseases [22, 50]. according to the method presented here, the ve can yield insights into postural stability and postural balance problems in patients with neurological disorders. this information is important in medical examinations and in rehabilitation medicine. the method based on the ve can therefore be used as an additional method for determining and evaluating postural instability. the primary advantage of the method is that the use of a cheap tri-axial imu enables us to study the patient's trunk movements in all three planes/axes of the human body. there are potential limitations to our study. the most important limitation is that the sample of subjects was small, and was probably not representative of the population as a whole. although statistically significant results were observed, it would be helpful to see these findings proven in a larger study. however, a sample of ten pts and eleven cg was sufficient to test the basic features of the techniques proposed for the study of pts with a degenerative cerebellar disorder in this preliminary study. the proposed techniques should be seen as a contribution to the study of the postural stability of pts with a neurological disorder, for which a widely usable method based on a cheap tri-axial imu has not yet been adopted.
5. conclusions

we have carried out a study of postural instability in patients with a neurological disorder based on the volume of a confidence ellipsoid of the set of points obtained by plotting three accelerations against each other, and have shown that our technique is suitable for identifying postural balance problems. the technique enables us to study the distribution of the three measured variables (superior-inferior acceleration, medio-lateral acceleration and anterior-posterior acceleration) of body movements, and to overcome the greatest limitation of traditional methods, which rely only on two variables, one in each of the axes of the human body (medio-lateral and anterior-posterior), and which are used for studying the cop. our new technique, based on the confidence ellipsoid volume, has never been used before for studying patients diagnosed with neurological disorders. the findings have shown that a single inexpensive tri-axial accelerometer placed on the patient's body segments can be a reliable tool for clinical measurements, and that our method is more suitable than the force platforms that are currently widely used [4]. finally, it should be mentioned that our new technique can also be used for analyzing data obtained using the accelerometers installed in cell phones or in electronic watches. this can make it possible to use the methods proposed here in small health clinics or even in the patient's home.

acknowledgements

this work was done in the joint department of the faculty of biomedical engineering of the czech technical university in prague and charles university in prague in the framework of research program vg20102015002 (2010-2015, mv0/vg), sponsored by the ministry of the interior of the czech republic, and project sgs15/107/ohk4/1t/17 of the ctu in prague.

references

[1] o. cakrt et al. exercise with visual feedback improves postural stability after vestibular schwannoma surgery. european archives of oto-rhino-laryngology 267(9):1355–1360, 2010. doi:10.1007/s00405-010-1227-x.
[2] k. bridgewater, m. sharpe. trunk muscle performance in early parkinson's disease. physical therapy 78(6):566–576, 1998.
[3] m. latt et al. acceleration patterns of the head and pelvis during gait in older people with parkinson's disease: a comparison of fallers and nonfallers. the journals of gerontology series a: biological sciences and medical sciences 64a(6):700–706, 2009. doi:10.1093/gerona/glp009.
[4] m. mancini et al. isway: a sensitive, valid and reliable measure of postural control. j neuroeng rehabil 9(59):59, 2012. doi:10.1186/1743-0003-9-59.
[5] j. liu, x. zhang, t. lockhart. fall risk assessments based on postural and dynamic stability using inertial measurement unit. safety and health at work 3(3):192–198, 2012. doi:10.5491/shaw.2012.3.3.192.
[6] b.-c. lee et al. cell phone based balance trainer. journal of neuroengineering and rehabilitation 9(1):10, 2012. doi:10.1186/1743-0003-9-10.
[7] j. allum et al. trunk sway measures of postural stability during clinical balance tests: effects of a unilateral vestibular deficit. gait & posture 14(3):227–237, 2001. doi:10.1016/s0966-6362(01)00132-1.
[8] a. l. adkin, b. r. bloem, j. h. j. allum. trunk sway measurements during stance and gait tasks in parkinson's disease. gait & posture 22(3):240–249, 2005. doi:10.1016/j.gaitpost.2004.09.009.
[9] w. maetzler et al. impaired trunk stability in individuals at high risk for parkinson's disease. plos one 7(3):e32240, 2012. doi:10.1371/journal.pone.0032240.
[10] e. lou et al. a low power accelerometer used to improve posture. in canadian conference on electrical and computer engineering 2001, conference proceedings (cat. no.01th8555). 2001. doi:10.1109/ccece.2001.933657.
[11] w. wong, m. wong. trunk posture monitoring with inertial sensors. european spine journal 17(5):743–753, 2008. doi:10.1007/s00586-008-0586-0.
[12] s. kammermeier et al. disturbed vestibular-neck interaction in cerebellar disease. j neurol 260(3):794–804, 2012. doi:10.1007/s00415-012-6707-z.
[13] b. van de warrenburg et al. trunk sway in patients with spinocerebellar ataxia. movement disorders 20(8):1006–1013, 2005. doi:10.1002/mds.20486.
[14] e. de hoon et al. quantitative assessment of the stops walking while talking test in the elderly. archives of physical medicine and rehabilitation 84(6):838–842, 2003. doi:10.1016/s0003-9993(02)04951-1.
[15] c. horlings et al. identifying deficits in balance control following vestibular or proprioceptive loss using posturographic analysis of stance tasks. clinical neurophysiology 119(10):2338–2346, 2008. doi:10.1016/j.clinph.2008.07.221.
[16] p. odenrick, p. sandstedt. development of postural sway in the normal child. human neurobiology 3(4):241–244, 1983.
[17] t. prieto, j. myklebust, b. myklebust. postural steadiness and ankle joint compliance in the elderly. ieee eng med biol mag 11(4):25–27, 1992. doi:10.1109/51.256953.
[18] l. oliveira, d. simpson, j. nadal. calculation of area of stabilometric signals using principal component analysis. physiological measurement 17(4):305–312, 1996. doi:10.1088/0967-3334/17/4/008.
[19] j. raymakers, m. samson, h. verhaar. the assessment of body sway and the choice of the stability parameter(s). gait & posture 21(1):48–58, 2005. doi:10.1016/j.gaitpost.2003.11.006.
[20] c. horlings et al. influence of virtual reality on postural stability during movements of quiet stance. neuroscience letters 451(3):227–231, 2009. doi:10.1016/j.neulet.2008.12.057.
[21] m. manto. cerebellar disorders: a practical approach to diagnosis and management. cambridge university press, 2010.
[22] o. cakrt et al. balance rehabilitation therapy by tongue electrotactile biofeedback in patients with degenerative cerebellar disease. neurorehabilitation 31(4):429–434, 2012.
[23] h. aoki. evaluating the effects of open/closed eyes and age-related differences on center of foot pressure sway during stepping at a set tempo. advances in aging research 1(3):72–77, 2012. doi:10.4236/aar.2012.13009.
[24] d. abrahamova, f. hlavacka. age-related changes of human balance during quiet stance. physiological research 57(6):957, 2008.
[25] f. ochi, k. abe, s. ishigami, et al. trunk motion analysis in walking using gyro sensors. in proceedings of the 19th annual international conference of the ieee engineering in medicine and biology society. 1997. doi:10.1109/iembs.1997.757084.
[26] s. aw et al. head impulses reveal loss of individual semicircular canal function. journal of vestibular research 9(3):173–180, 1999.
[27] j. allum, l. nijhuis, m. carpenter. differences in coding provided by proprioceptive and vestibular sensory signals may contribute to lateral instability in vestibular loss subjects. exp brain res 184(3):391–410, 2007. doi:10.1007/s00221-007-1112-z.
[28] o. findling et al. trunk sway in mildly disabled multiple sclerosis patients with and without balance impairment. exp brain res 213(4):363–370, 2011. doi:10.1007/s00221-011-2795-8.
[29] t. kennie, g. petrie. engineering surveying technology. 1990.
[30] f. honegger, g. van spijker, j. allum. coordination of the head with respect to the trunk and pelvis in the roll and pitch planes during quiet stance. neuroscience 213:62–71, 2012. doi:10.1016/j.neuroscience.2012.04.017.
[31] m. zadnikar, d. rugelj. postural stability after hippotherapy in an adolescent with cerebral palsy. journal of novel physiotherapies 1(1), 2011. doi:10.4172/2165-7025.1000106.
[32] c. osler, r. reynolds. postural reorientation does not cause the locomotor after-effect following rotary locomotion. exp brain res 220(3-4):231–237, 2012. doi:10.1007/s00221-012-3132-6.
[33] k. altun, b. barshan. pedestrian dead reckoning employing simultaneous activity recognition cues. meas sci technol 23(2):025103, 2012. doi:10.1088/0957-0233/23/2/025103.
[34] a. gil-agudo et al. a novel motion tracking system for evaluation of functional rehabilitation of the upper limbs. neural regeneration research 8(19):1773, 2013.
[35] n. ying, w. kim. use of dual euler angles to quantify the three-dimensional joint motion and its application to the ankle joint complex. journal of biomechanics 35(12):1647–1657, 2002. doi:10.1016/s0021-9290(02)00241-5.
[36] p. brinckmann, w. frobin, g. leivseth. musculoskeletal biomechanics. thieme, 2002.
[37] i. jolliffe. principal component analysis. wiley online library, 2002.
[38] s. nash, a. sofer. linear and nonlinear programming. mcgraw-hill inc., 1996.
[39] p. wolf, c. ghilani. adjustment computations: statistics and least squares in surveying and gis. john wiley & sons, 1997.
[40] w. krzanowski. principles of multivariate analysis: a user's perspective. number 3 in oxford statistical science series, 1988.
[41] m. hazewinkel. encyclopaedia of mathematics, supplement iii, vol. 13. springer science & business media, 2001.
[42] m. rocchi et al. the misuse of the confidence ellipse in evaluating statokinesigram. ital j sport sci 12(2):169–172, 2005.
[43] j. swanenburg et al. the reliability of postural balance measures in single and dual tasking in elderly fallers and non-fallers. bmc musculoskeletal disorders 9(1):162, 2008. doi:10.1186/1471-2474-9-162.
[44] m. moghadam et al. reliability of center of pressure measures of postural stability in healthy older adults: effects of postural task difficulty and cognitive load. gait & posture 33(4):651–655, 2011. doi:10.1016/j.gaitpost.2011.02.016.
[45] h. scheffe. the analysis of variance, vol. 72. john wiley & sons, 1999.
[46] s. baig et al. cluster analysis of center-of-pressure measures. international journal of electrical and computer engineering, 2012. doi:10.11159/ijecs.2012.002.
[47] c. greenwalt, m. schultz. principles of error theory and cartographic applications. acic technical report no. 96, 1968.
[48] c. jarque, a. bera. a test for normality of observations and regression residuals. international statistical review / revue internationale de statistique 55(2):163, 1987. doi:10.2307/1403192.
[49] p. kvam, b. vidakovic. nonparametric statistics with applications to science and engineering. john wiley & sons, 2007.
[50] h. diener et al. quantification of postural sway in normals and patients with cerebellar diseases. electroencephalography and clinical neurophysiology 57(2):134–142, 1984. doi:10.1016/0013-4694(84)90172-x.
[51] m. patel et al. the effects of foam surface properties on standing body movement. acta oto-laryngologica 128(9):952–960, 2008. doi:10.1080/00016480701827517.
[52] j. blackburn et al. kinematic analysis of the hip and trunk during bilateral stance on firm, foam, and multiaxial support surfaces. clinical biomechanics 18(7):655–661, 2003. doi:10.1016/s0268-0033(03)00091-3.

acta polytechnica 53(supplement):770–775, 2013

icecube observatory: neutrinos and the origin of cosmic rays

paolo desiati (a, b, ∗), for the icecube collaboration (c)
(a) wisconsin icecube particle astrophysics center (wipac), university of wisconsin, madison, wi 53706, u.s.a.
(b) department of astronomy, university of wisconsin, madison, wi 53706, u.s.a.
(c) http://icecube.wisc.edu
(∗) corresponding author: desiati@icecube.wisc.edu

abstract. the completed icecube observatory, the first km^3 neutrino telescope, is already providing the most stringent limits on the flux of high energy cosmic neutrinos from point-like and diffuse galactic and extra-galactic sources. the non-detection of extra-terrestrial neutrinos has important consequences for the origin of the cosmic rays. here the current status of astrophysical neutrino searches, and of the observation of a persistent cosmic ray anisotropy above 100 tev, are reviewed.

keywords: neutrinos, cosmic rays, anisotropy.

1. introduction

one hundred years after their discovery, the origin of the cosmic rays is still a mystery. the current leading model is that cosmic rays are accelerated in diffusive shocks. in this case supernova remnants (snrs) in our galaxy could be the major source of cosmic rays up to about 10^15 ÷ 10^17 ev. the snr energy output in the galaxy can provide the energy budget necessary to maintain the presently observed population of galactic cosmic rays.
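the arithmetic behind this energy-budget statement is a standard order-of-magnitude estimate; a sketch with commonly quoted round numbers follows (the local cosmic-ray energy density, galactic disc volume and confinement time below are textbook values, not taken from this paper):

\[
L_{\rm CR} \simeq \frac{w_{\rm CR}\,V_{\rm gal}}{\tau_{\rm esc}}
\simeq \frac{(1.6\times10^{-12}\,{\rm erg\,cm^{-3}})\,(4\times10^{66}\,{\rm cm^{3}})}{2\times10^{14}\,{\rm s}}
\approx 3\times10^{40}\,{\rm erg\,s^{-1}},
\qquad
L_{\rm SN} \simeq \frac{10^{51}\,{\rm erg}}{30\,{\rm yr}} \approx 10^{42}\,{\rm erg\,s^{-1}},
\]

so an acceleration efficiency at the level of a few per cent in snr shocks would suffice to sustain the observed cosmic-ray population.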
in particular, in order to achieve such high energies, it is expected that acceleration occurs during the relatively short period in the snr evolution between the end of free expansion and the beginning of the so-called sedov phase. this period is about 10^3 years from the explosion, when the shock velocity is high enough to allow for efficient acceleration. at energies in excess of about 10^17 ev, active galactic nuclei (agn) and gamma ray bursts (grb) could play an important role in the origin of the extragalactic cosmic rays. since cosmic rays are deflected by magnetic fields, it is not possible to associate them to their sources. however, if hadronic particles are accelerated, a fraction of them would interact within their sources or in surrounding molecular clouds to produce mesons. the mesons eventually decay into high energy γ-rays and neutrinos with an energy spectrum ∼ E^{-2} of the accelerated cosmic rays. the remaining hadronic particles propagate until their detection on earth. detection of γ-rays and neutrinos from individual galactic or extragalactic source candidates of cosmic rays, or from extended molecular clouds, is therefore a method to indirectly probe the origin of cosmic rays. during the last decade, detection of γ-rays from galactic sources has been successfully achieved by satellite experiments such as agile and fermi up to 10 and 100 gev, respectively. imaging cherenkov telescope arrays such as magic, veritas and h.e.s.s., and water cherenkov detectors such as milagro, have made measurements up to o(10 tev). high energy direct emission from old snrs appears to be inconsistent with hadronic acceleration (1). it is interesting, however, that delayed secondary γ-ray emissions can be produced by the most energetic particles that escaped the acceleration region when they propagate through molecular clouds that surround the star forming regions [1]. with this mechanism, indirect evidence of hadronic acceleration is present even when snrs are several 10^4 years old. in fact, the detection of an extended emission of tev γ-rays from the galactic center by h.e.s.s., which is attributed to cosmic rays accelerated by snr g0.9+0.1 interacting with the surrounding clouds, might provide the first evidence of hadronic acceleration [2]. the most compelling evidence currently comes from low energy γ-ray emission from the regions surrounding the intermediate-age snr w44. agile observations in the energy range of 50 mev ÷ 10 gev [3] and fermi observations up to 100 gev [4] show that, while leptonic models fail to describe simultaneously the γ and radio emissions without requiring too large circumstellar densities, the hadronic models are consistent with experimental constraints from radio, optical, x and γ-ray observations. although the γ-ray energy spectrum is consistent with a proton spectral index of 3 and a low energy cut-off of approximately 10 gev (2), the hadronic origin of the observed emission is considered likely. the observed steep spectrum and low energy cut-off may be caused by suppression of efficient particle acceleration in the dense environment of this source [5].

(1) most probably, snrs older than several thousand years no longer efficiently accelerate cosmic rays.
(2) this is the reason why such a source was not observed at tev energy.
in the weakly ionized dense gas surrounding the remnant lead to a softer spectrum as well as to damping of the plasma alfvén waves that form the shock. the resulting poor particle confinement leads to a low energy cutoff [6].
other than the specific properties of single objects, evidence of an instance of hadronic acceleration is a very important step towards the discovery of the origin of cosmic rays. however, this would not mean that all galactic cosmic rays are necessarily accelerated in snrs. if cosmic ray acceleration occurs predominantly on a larger scale, such as in superbubbles [7] or in the galaxy cluster medium, where particles could be accelerated to ultra-high energies [8], the search for the origin of cosmic rays should concentrate on extended sources or diffuse fluxes. while the tev γ-ray horizon is limited to within our galaxy, because of absorption in the infrared and microwave cosmic background, the gev γ-emissions can be observed within about 100 mpc, making it possible to search for extragalactic sources of cosmic rays.
on the other hand, detection of neutrinos from individual sources is an efficient and unambiguous probe of the high energy hadronic acceleration mechanism, and therefore of the sources of cosmic rays. however, the very same property that makes neutrinos an excellent cosmic messenger also makes them difficult to detect. thus a large instrumented volume of target matter is required to capture sufficient event statistics.
the icecube neutrino observatory (see fig. 1), completed in december 2010, is currently the only km3 scale neutrino telescope collecting data. the observatory consists of an array of 5,160 optical sensors arranged along 86 cables (or strings) between 1,450 and 2,450 meters below the geographic south pole, where the antarctic ice is particularly transparent. icecube includes a surface shower array, icetop, and a densely instrumented core with a lower energy threshold, deepcore. the surface array, icetop, consists of 81 stations, each comprising two tanks of frozen clean water, with each tank containing two optical sensors. icetop, using events in coincidence with the deep icecube array, provides the measurement of the spectrum and mass composition of cosmic rays at the knee and up to about 10^18 ev. the deepcore sub-array, consisting of 6 densely instrumented strings located at the bottom-center of icecube, lowers the observatory neutrino energy threshold to about 10 gev. deepcore uses the surrounding icecube instrumented volume as a veto for the background of cosmic ray induced through-going muon bundles, thus enhancing the detection of down-going neutrinos within the deep core volume. a veto rejection power in excess of 10^8 has been achieved [9].
the basic detection component of icecube is the digital optical module (dom), which consists of a 10-inch hamamatsu photomultiplier tube (pmt) and its own data acquisition (daq) circuitry enclosed in a pressure-resistant glass sphere. the doms detect, digitize and timestamp the signals from cherenkov radiation photons. their main daq board is connected to the central daq in the icecube laboratory at the surface, where the global trigger is determined [11]. the construction of icecube started in 2004 and physics-quality data taking commenced in 2006.

figure 1. a schematic view of the icecube observatory with the surface array icetop and the densely instrumented deepcore.
with this early data the observatory is providing the most stringent limits on the flux of high energy neutrinos of extra-terrestrial origin, and therefore strong constraints on the models of individual sources of cosmic rays and unidentified diffuse sources. at the same time, icecube has accumulated a large number of cosmic ray induced neutrinos produced in the atmosphere, making it possible to probe the combined effect of hadronic interaction models, cosmic ray spectrum and composition on the neutrino spectrum up to a few hundred tev [10].
in the search for high energy neutrinos, the large exposure of icecube makes it possible to collect an unprecedented number of events in the form of bundles of high energy muons generated in the cosmic ray induced extensive air showers. although these events represent an overwhelming background in the neutrino searches, they make it possible, for the first time, to determine the degree of anisotropy of cosmic rays from a few tev to several pev of particle energy. the persistence of a cosmic ray anisotropy at high energy raises the question of the responsible mechanism. the notion that the cosmic ray anisotropy might be connected to the distribution of nearby and recent supernovae is intriguing, and might thus provide a new probe into the origin of the cosmic rays. on the other hand, the complex energy-dependent topology suggests that non-diffusive processes in the local interstellar medium most probably play an important role.

2. physics results
if the signals from detected cherenkov photons satisfy specific trigger conditions, an event is defined and recorded by the surface data acquisition system. online data filtering at the south pole reduces the event volume to about 10 % of the trigger rate, based on a series of reconstruction and filter algorithms aimed to select events based on directionality, topology and energy [15]. the filter makes it possible to transfer data via satellite from the experimental site for prompt physics analyses.

2.1. atmospheric neutrinos
of the events that trigger icecube, the vast majority are muon bundles produced by the impact of primary cosmic rays in the atmosphere. only a small fraction of the detected events (∼ 10^−5) are muons produced by the charged current interaction of atmospheric muon neutrinos. the easiest way to reject the down-going muon bundle background is to exclusively select well reconstructed up-going events, since these can only be produced by neutrinos crossing the earth and interacting in the matter surrounding the detector. depending on the detector configuration and on the specific reconstruction algorithms and event selection utilized, the atmospheric neutrino sample is characterized by a directional resolution of better than 1° above 1 tev. the corresponding resolution in the estimation of the muon energy is about 0.2 ÷ 0.3 (in log10 of the energy) for crossing track-like events, and about 0.1 or better for contained cascade-like events. typically, 30 % ÷ 40 % of the up-going events survive the selection with a background contamination of less than about 1 % (see tab. 1).

table 1. mean rate of muon bundles and atmospheric neutrinos after final event selection for different string configurations of the icecube observatory (numbers in italic are predictions):

strings   year   mean µ rate   final νµ rate
22        2007   500 hz        18/day
40        2008   1100 hz       40/day
59        2009   1700 hz       130/day
79        2010   2000 hz       170/day
86        2011   2100 hz       200/day
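the final selection numbers in tab. 1 can be cross-checked with a short consistency estimate. a minimal sketch, using only the 86-string row quoted above:

# consistency check of tab. 1: fraction of triggered events that
# survive the final atmospheric-neutrino selection (86 strings)
trigger_rate_hz = 2100.0       # mean muon-bundle rate from tab. 1
nu_per_day = 200.0             # final atmospheric-neutrino rate

triggers_per_day = trigger_rate_hz * 86400.0
print(f"triggers per day: {triggers_per_day:.1e}")                   # ~1.8e8
print(f"surviving fraction: {nu_per_day / triggers_per_day:.1e}")    # ~1e-6

the resulting ∼ 10^−6 survival fraction is compatible with the ∼ 10^−5 neutrino content of the triggered sample quoted above, once the up-going restriction and the 30 % ÷ 40 % selection efficiency are folded in.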
the atmospheric neutrino sample collected by icecube over the years is the largest ever recorded and currently reaches energies near 400 tev (see fig. 2). for the first time the precision of this measurement is providing a powerful tool to constrain the effects of high energy hadronic interaction models that represent our present knowledge of the cosmic ray induced extensive air showers and the spectrum and composition of primary cosmic rays [10].

figure 2. collection of theoretical calculations and experimental measurements of the atmospheric neutrino spectrum. shown are the predicted conventional νµ + ν̄µ (blue line) and νe + ν̄e (red line) fluxes from [16], and the predicted prompt flux of neutrinos (magenta band) from [17]. the unfolded energy spectrum [18] (black filled circles) and forward-folded spectrum [19] (gray band) from the 40-string icecube configuration, and the unfolded spectrum [20] (black open circles) and forward-folded spectrum [21] (ecru band) from amanda are presented. the results from super-k [22] (aqua band) and from fréjus [23] (black filled squares for νµ + ν̄µ and black open squares for νe + ν̄e) are also shown.

2.2. search for astrophysical ν's
atmospheric neutrinos represent an irreducible background for the search of high energy astrophysical neutrinos. if hadronic acceleration is the underlying process of high energy cosmic ray production and of the γ-ray observations in galactic and extra-galactic sources, the charged mesons could produce enough neutrinos to be observed in a detector the size of icecube.
figure 3 shows the sensitivity (90 % cl) of icecube for the full-sky search of steady point sources of e^−2 muon neutrinos as a function of declination, along with that of other experiments. the extension of the point source search to the southern hemisphere is made possible by a high energy event selection that rejects the background down-going events by five orders of magnitude, and restricts neutrino energies to above 100 tev. since the selected sample is still dominated by large high energy muon bundles, the southern hemisphere is poor in atmospheric neutrinos, yielding a low neutrino detection sensitivity. nevertheless, this provides icecube with a full-sky view that complements the coverage of the neutrino telescopes in the mediterranean. the figure shows the sensitivities from icecube and other observatories (interpreted as the median upper limit we expect to observe from individual sources across the sky) along with upper limits from selected sources. the sensitivity is reaching the level of current predictions for the flux from astrophysical sources (i.e. below 10^−12 × e^−2 tev cm^−2 s^−1), although the discovery potential, defined to be 5σ for 50 % of the trials, is typically a factor of three higher than the sensitivity. therefore constraints on the parameters of hadronic acceleration models are starting to develop.

figure 3. sensitivity (90 % cl) for a full-sky search of steady point sources of muon neutrinos with an e^−2 energy spectrum as a function of declination angle for icecube and other experiments. note that for icecube, events with δ < 0° are down-going, coming from the southern hemisphere, and events with δ > 0° are up-going and come from the northern hemisphere.

figure 4. upper limits (90 % cl) for neutrino searches in coincidence with gamma ray bursts with 40 strings of icecube and the combined 40- and 59-string detector configurations [32]. also shown is the waxman & bahcall predicted average flux [33].

searches for neutrinos from transient [30] and periodic [31] sources have also been performed. in particular, a time window scan for transient sources (with no external triggers) shows that the discovery potential drops by a factor of 2 if searching for 1 day duration flares. a particular search for transient sources is that for neutrinos from grb. for the first time, the icecube observatory has provided a definitive test of the grb models with the most stringent constraints. figure 4 shows the upper limits obtained with the data collected by the 40-string configuration of icecube and by the combined data of the 40- and 59-string configurations [32]. for each detector configuration, a list of grbs detected during the corresponding physics runs was compiled and the predicted neutrino flux was calculated based on the γ-ray spectrum shown in [34]. the corresponding stacked neutrino flux was used to search for events collected within the time window in which 5 % to 95 % of the fluence is recorded (i.e. t90). the upper limit is about 3 times below the predicted flux of the waxman & bahcall model, challenging the hypothesis that grbs are the sources of ultra high energy cosmic rays (uhecr). this result has profound consequences for the predicted flux of neutrinos produced by the interaction of uhecr with the cosmic microwave background, the so-called cosmogenic neutrinos, as well as for the gev–tev γ-ray background flux (see for instance [35, 36]). it is important to note that it was recently shown that the fireball model with refined assumptions yields a 10 times smaller predicted flux (see [37, 38]).
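the 5 % ÷ 95 % fluence window used in this stacked search is straightforward to extract from a binned light curve. a minimal sketch (the pulse shape and the binning below are arbitrary placeholders, not the collaboration's actual pipeline):

import numpy as np

def t90_window(t, counts):
    """bracket the interval containing 5 % to 95 % of the total fluence.
    t: bin centers [s]; counts: background-subtracted counts per bin."""
    cum = np.cumsum(counts) / np.sum(counts)   # normalized cumulative fluence
    return t[np.searchsorted(cum, 0.05)], t[np.searchsorted(cum, 0.95)]

# toy background-free light curve with one hypothetical pulse near t = 2 s
t = np.linspace(-10.0, 40.0, 500)
counts = np.exp(-0.5 * ((t - 2.0) / 3.0) ** 2)
t0, t1 = t90_window(t, counts)
print(f"search window: [{t0:.2f}, {t1:.2f}] s, t90 = {t1 - t0:.2f} s")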
there is the possibility that the bulk of cosmic rays does not originate from individual sources, but from large-scale acceleration processes in superbubbles or even galaxy clusters. in addition, unresolved sources of cosmic rays over cosmological times are expected to have produced detectable fluxes of diffuse neutrinos. since shock acceleration is expected to provide an ∼ e^−2 energy spectrum, harder than the ∼ e^−3.7 spectrum of the atmospheric neutrinos, the diffuse flux is expected to dominate at high energy, where the sensitivity is strongly dependent on the experimental quality of the selected events. figure 5 shows a collection of sensitivities and upper limits (90 % cl) for an e^−2 flux of νµ + ν̄µ, from amanda, antares and various icecube configurations, compared to the experimental and theoretical flux of the atmospheric neutrinos and various models of astrophysical neutrinos. the most recent results lie below the waxman & bahcall neutrino bound [42], again indicating icecube's potential for discovering the origin of cosmic rays.
in the ultra high energy (uhe) range, above ∼ 10^6 gev, icecube is reaching a competitive sensitivity as well. at this level one begins to reach current models of cosmogenic neutrino production (see fig. 6) that are simultaneously constrained by the current observations of uhecrs and of the gev γ-rays by fermi-lat [49]. taking into account that the uhecr mass composition is a key ingredient for the absolute flux and spectral shape of cosmogenic neutrinos [35], its large uncertainty still weighs profoundly on current models. this means that although the icecube sensitivity to uhe neutrinos is currently the best ever achieved below 10^10 gev, it might still be far from the actual flux. from this point of view, the current developments toward a radio array in antarctica, such as the askaryan radio array (ara) [51], are a natural extension toward the highest energies.
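the energy scale at which cosmogenic neutrinos originate is set by the photopion production threshold on the cmb. a rough head-on estimate with textbook constants (an illustration, not a number quoted in this paper):

# threshold for p + gamma_cmb -> delta -> n + pi (head-on kinematics):
# s = m_p^2 + 4 e_p eps >= (m_p + m_pi)^2, solved for the proton energy
m_p = 0.938e9      # eV, proton rest energy
m_pi = 0.135e9     # eV, pion rest energy
eps = 6.3e-4       # eV, mean cmb photon energy (~2.7 k_b t at 2.725 k)

e_th = ((m_p + m_pi) ** 2 - m_p ** 2) / (4.0 * eps)
print(f"threshold proton energy: {e_th:.1e} eV")    # ~1e20 eV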
it is worth noting that the preliminary sensitivity for an arbitrary spectrum, shown in fig. 6, has a minimum just above 1 pev, where no significant cosmogenic neutrino flux is expected. in the experimental analysis performed on the data collected during 2010–12, where events with a large number of detected photons were selected, two events were found on a background of 0.3 conventional atmospheric neutrinos. the events deposited an energy in the detector of about 1 pev, and further study is underway to determine their nature. one possible hypothesis is that these events represent an upper fluctuation of the prompt neutrino production in the atmosphere from the decay of heavy charm mesons.

figure 5. experimental upper limits (90 % cl) for the diffuse muon neutrino flux (including the preliminary result from the 59-string configuration of icecube) along with atmospheric neutrino observations and theoretical models of atmospheric and extraterrestrial neutrino fluxes, from top to bottom in the legend [16, 18, 39–47].

figure 6. preliminary sensitivity (90 % cl) for the detection of uhe neutrinos, compared to other experimental results and to predictions [48–50]. the sensitivity curves are evaluated at each decade of energy.

2.3. cosmic ray anisotropy
the large number of muon bundle events collected by icecube (about 10^10–10^11 each year, depending on the detector configuration) makes it possible to study the arrival direction distribution of the cosmic rays at a level of about 10^−5. the bundles of highly collimated atmospheric muons share the same direction as the parent cosmic ray particle. since this study does not require highly well reconstructed muon directions, all collected and reconstructed events, with a median angular resolution of about 3°, are used. using full simulation of cosmic ray induced extensive air showers we find that the median particle energy of the icecube data sample is about 20 tev.
with these data icecube provides the first high statistics determination of the anisotropy of galactic cosmic rays in the southern hemisphere in the multi-tev energy range. the large scale anisotropy observed by icecube [52] appears to complement the observations in the northern hemisphere, providing for the first time an all-sky view of tev cosmic ray arrival directions. the sky map obtained by subtracting an averaged map (over a scale of 30°–60°) from the data [53] shows significant small angular scale structures in the cosmic ray anisotropy, similar to the observations in the northern hemisphere [54, 55].
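the map-subtraction technique can be illustrated on a one-dimensional toy sky map. a minimal sketch (the real analyses work on the sphere with proper pixelization and statistical weighting; the injected amplitudes below are arbitrary):

import numpy as np

# toy relative-intensity "sky map" in right ascension:
# a large-scale dipole plus a narrow localized excess
ra = np.arange(360.0)                                     # degrees
rel = 1e-3 * np.cos(np.radians(ra - 70.0))                # large-scale part
rel += 4e-4 * np.exp(-0.5 * ((ra - 122.0) / 5.0) ** 2)    # small-scale bump

# subtract the map averaged over a 45 deg window (cf. the 30-60 deg scale)
w = 45
smooth = np.convolve(np.tile(rel, 3), np.ones(w) / w, mode="same")[360:720]
residual = rel - smooth      # only the small-scale structure survives

print(f"residual peak at ra ~ {ra[np.argmax(residual)]:.0f} deg")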
another interesting result obtained by icecube is the persistence of the anisotropy at energies in excess of 100 tev. at such energies a different structure is observed, which can be interpreted in terms of a different phase [56], as first reported by the eas-top shower array in the northern hemisphere [58]. the observation at high energy was recently confirmed by the preliminary result from the icetop shower array [57]. the change of the anisotropy pattern at about 100 tev may suggest that the heliosphere could have an effect in flipping the apparent direction of the anisotropy. in fact, at about 100 tev the cosmic rays' gyro-radius in the 3 µg local interstellar magnetic field is of the order of magnitude of the elongated heliosphere. below this energy scale the scattering processes on the heliospheric perturbations at the boundary with the interstellar magnetic field might be the dominant processes affecting the global cosmic ray arrival distribution, and the small angular structure as well (see [59], where a review of other proposed models is also given).
the milagro observation of a likely harder-than-average cosmic ray spectrum from the localized excess region toward the direction of the heliotail (the so-called region b in [54], also observed by the argo-ybj shower array [55]) has triggered astrophysical interpretations (see [60–62]). however, it may also suggest that some type of re-acceleration mechanism associated with cosmic ray propagation in the turbulent heliospheric tail might occur [63, 64]. on the other hand, the tev cosmic ray anisotropy is a tracer of the local interstellar magnetic field, and it might indicate cosmic ray streaming along the magnetic field lines due to the loop i shell expanding from the scorpion-centaurus association [65]. if the local propagation effects on the cosmic ray anisotropy below 100 tev are dominant, at higher energy it is reasonable to believe that the persistent anisotropy might be a natural consequence of the stochastic nature of cosmic ray galactic sources, in particular nearby and recent snrs [66–68].
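the gyro-radius comparison above is a one-line estimate. a minimal sketch (the heliotail extent used in the final comment is an assumption, not a number from this paper):

# larmor radius of a relativistic proton: r [cm] = e [eV] / (300 z b [gauss])
AU = 1.496e13      # cm per astronomical unit

def gyroradius_cm(e_ev, b_gauss, z=1):
    return e_ev / (300.0 * z * b_gauss)

r = gyroradius_cm(1e14, 3e-6)      # 100 tev proton in the 3 microgauss field
print(f"gyroradius: {r:.1e} cm = {r / AU:.0f} au")      # ~7e3 au
# comparable to an assumed heliotail elongation of o(1e3-1e4) au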
references
[1] gabici s., aharonian f.a.: 2007, apj 665, l131
[2] aharonian f.a., et al.: 2006, nature 439 (7077), 695
[3] giuliani a., et al.: 2011, arxiv:1111.4868
[4] uchiyama y., et al.: 2012, arxiv:1203.3234
[5] uchiyama y., et al.: 2010, apj 723, l122
[6] malkov m.a., et al.: 2011, arxiv:1004.4714
[7] butt y.: 2009, nature 460, 701
[8] lazarian a. and brunetti g.: 2011, arxiv:1108.2268
[9] hyon ha c., et al.: 2012, j. phys.: conf. ser. 375, 052034
[10] fedynitch a., et al.: 2012, submitted to prd, arxiv:1206.6710
[11] abbasi r., et al.: 2009, nim a 601, 294
[12] abbasi r., et al.: 2010, nim a 618, 139
[13] andrés e., et al.: 2006, j. geophys. res. 111, d13203
[14] abbasi r., et al.: in preparation
[15] ahrens j., et al.: 2004, nim a 524, 169
[16] honda m., et al.: 2007, phys. rev. d75, 043006
[17] enberg r., et al.: 2008, phys. rev. d78, 043005
[18] abbasi r., et al.: 2011, phys. rev. d83, 012001
[19] abbasi r., et al.: 2011, phys. rev. d84, 082001
[20] abbasi r., et al.: 2010, astropart. phys. 34, 48
[21] abbasi r., et al.: 2009, phys. rev. d79, 102005
[22] gonzalez-garcia m.c., maltoni m. and rojo: 2006, j. high energy phys. 0610, 075
[23] daum k., et al.: 1995, z. phys. c66, 417
[24] thrane e., et al.: 2009, apj 704, 503
[25] adrián-martínez s., et al.: 2011, arxiv:1108.0292
[26] abbasi r., et al.: 2011, astropart. phys. 34, 420
[27] abbasi r., et al.: 2011, apj 732, 18
[28] abbasi r., et al.: 2011, in proc. of icrc 2011, beijing, china, arxiv:1111.2741
[29] kooijman p., et al.: 2011, in proc. of 32nd icrc, vol. 4, p. 351, beijing, china
[30] abbasi r., et al.: 2012, apj 744, 1
[31] abbasi r., et al.: 2012, apj 748, 118
[32] abbasi r., et al.: 2012, nature 484, 351
[33] waxman e. & bahcall j.n.: 1997, phys. rev. lett. 78, 2292
[34] guetta d., et al.: 2004, astropart. phys. 20, 429
[35] anchordoqui l.a., et al.: 2007, phys. rev. d76, 123008
[36] ahlers m., et al.: 2011, astropart. phys. 35-2, 87
[37] baerwald, et al.: 2011, phys. rev. d83, 067303
[38] hümmer, et al.: 2011, arxiv:1112.1076
[39] achterberg a., et al.: 2007, phys. rev. d76, 042008
[40] biagi s., et al.: 2010, neutrino 2010, athens, greece
[41] abbasi r., et al.: 2011, phys. rev. d84, 082001
[42] waxman e. & bahcall j.n.: 1998, phys. rev. d59, 023002
[43] waxman e.: 2011, to be published in astronomy at the frontiers of science
[44] mannheim k.: 1995, astropart. phys. 3, 295
[45] muecke a., et al.: 2003, astropart. phys. 18, 593
[46] becker j.k., et al.: 2005, astropart. phys. 23, 355
[47] stecker f.w.: 2005, phys. rev. d72, 107301
[48] kotera k., et al.: 2010, jcap 10, 013
[49] ahlers m., et al.: 2010, astropart. phys. 34, 106
[50] yoshida s., et al.: 1997, apj 479, 547
[51] allison p., et al.: 2011, arxiv:1105.2854
[52] abbasi r., et al.: 2010, apj 718, l194
[53] abbasi r., et al.: 2011, apj 740, 16
[54] abdo a.a., et al.: 2008, phys. rev. lett. 101, 221101
[55] di sciascio g., et al.: 2012, arxiv:1202.3379
[56] abbasi r., et al.: 2012, apj 746, 33
[57] santander m., et al.: 2012, arxiv:1205.3969
[58] aglietta m., et al.: 2009, apj 692, l130
[59] desiati p. and lazarian a.: 2012, submitted to apj
[60] salvati m. & sacco b.: 2008, astron. & astrophys. 485, 527
[61] drury l.o'c. & aharonian f.a.: 2008, astropart. phys. 29, 420
[62] malkov m.a., et al.: 2010, astrophys. j. 721, 750
[63] lazarian a. and desiati p.: 2010, apj 722, 188
[64] desiati p. and lazarian a.: 2012, nonlinear proc. in geophys. 19, 351
[65] frisch p.c., et al.: 2012, submitted to apj, arxiv:1206.1273
[66] erlykin a.d. & wolfendale a.w.: 2006, astropart. phys. 25, 183
[67] blasi p. & amato e.: 2012, jcap 1, 11
[68] biermann p., et al.: 2012

discussion
peter grieder — do the anisotropies which you have established with icecube muons correlate with known magnetic structures in our part of the galaxy?
paolo desiati — there are a few articles from frisch et al. on the properties of the local interstellar magnetic field in relation to the evolved loop i sub-shell expanding from the scorpion-centaurus association. there seems to be some link between the direction of the magnetic field and the loop i shell that is correlated with the tev cosmic ray anisotropy. this is a compelling argument that i mentioned in the text of this paper as well.

acta polytechnica
doi:10.14311/ap.2013.53.0473
acta polytechnica 53(5):473–482, 2013
© czech technical university in prague, 2013
available online at http://ojs.cvut.cz/ojs/index.php/ap

new concept of solvability in quantum mechanics

miloslav znojil∗
nuclear physics institute ascr, hlavní 130, 250 68 husinec – řež, czech republic
∗ corresponding author: znojil@ujf.cas.cz

abstract. in quite a few recent quantum models one is allowed to make a given hamiltonian h self-adjoint only after an ad hoc generalization of hermitian conjugation, h† → h‡ := θ^−1 h† θ, where the suitable operator θ is called the hilbert-space metric.
in the generalized, hidden-hermiticity scenario with nontrivial metric θ ≠ i, the current concept of solvability (meaning, most often, the feasibility of a non-numerical diagonalization of h) requires a generalization (allowing for a non-numerical tractability of θ). a few very elementary samples of "solvable" quantum models of this new type are presented.

keywords: quantum hamiltonians, hilbert-space metrics, solvable models.
submitted: 15 february 2013. accepted: 12 april 2013.

1. introduction
in a pre-selected hilbert space h of states |ψ〉 ∈ h, the unitarity of the time evolution of a quantum system s is usually guaranteed via a pre-selection of the generator (i.e., of the hamiltonian operator h) in self-adjoint form, h = h†. it is obvious that the simultaneous use of both of these pre-selections is over-restrictive. in spite of the rather trivial nature of such an observation, the practical removal of this restriction has always been obstructed by mathematical difficulties. in our present review-like paper we intend to argue that the introduction of a new concept of solvability might help in this respect.
from a purely pragmatic point of view one has just two ways to separate the choice of the hilbert space from the choice of the generator of evolution. in the first case (= "option a") one defines the "true" generator h as acting in a "false", unphysical hilbert space h = h(f) (following our compact review paper [1], the superscript (f) abbreviates the word "friendly"). naturally, we must still, sooner or later, reconstruct the correct physical hilbert space h(s) in which h will generate unitary evolution (the superscript (s) stands here for "standard"). otherwise, we would not be able to make any probabilistic predictions.
in the second scenario (= "option b") the initial choice of a "correct" hilbert space h(p) (where (p) abbreviates "physical") is paralleled by our input knowledge of an apparently incorrect hamiltonian h which is, typically, defined in another, friendlier hilbert space h(f) ≠ h(p). such a construction of the model appears in fact slightly more natural. indeed, in the majority of textbooks one really starts from the very initial specification of the suitable physical hilbert space. typically, one chooses h(p) ≡ l2(r). unfortunately, the unusual hamiltonian h must, sooner or later, also be mapped from its native space h(f) into its interpretation-carrying alternative h(p).
practical merits of the constructive and invertible "option-b" transitions ω : h(f) → h(p) between unitarily non-equivalent hilbert spaces were first revealed, in the context of nuclear physics, by f. dyson (cf. the review paper [2] for more details). he was the first to introduce computation-facilitating, ω-induced transitions between the "intractable", microscopic, fermionic hamiltonians 𝔥 = ω h ω^−1 = 𝔥† in p-space and their "tractable", bosonic isospectral images h defined in f-space. albeit non-hermitian in h(f), h ≠ h†, the latter operator appeared perfectly hermitian in h(s), i.e., it exhibited the hidden hermiticity property h = h‡ = θ^−1 h† θ, where θ = ω†ω. in what follows, in the language and notation of ref. [1], let us call the latter hamiltonians h "crypto-hermitian".
one of the first physical models belonging to the former category "a" is due to daniel bessis [3] and carl bender with coauthors (cf. his extensive review paper [4]).
they started their considerations from a truly interesting candidate for the correct physical hamiltonian operator, h = p² + ix³, which proves manifestly non-hermitian in the most common "false" hilbert space h(f) ≡ l2(r). unfortunately, for bessis' imaginary cubic oscillator the necessity of providing a correct s-superscripted hilbert-space metric θ ≠ i appeared to be a formidably difficult mathematical task [5], which still seems to be far from completion [6].
further models need to be developed. indeed, the active use of the whole three-hilbert-space (ths, [1]) pattern is conceptually transparent, based just on an ad hoc variability of the maps ω and/or of the inner products (i.e., in other words, on the freedom of varying the metric θ ≠ i in the correct physical hilbert space of quantum states h(s)). after all, one may find applications of the same or similar pattern in the older literature on molecular physics [7], in relativistic quantum mechanics [8], in the variational descriptions of many-body systems and spin lattices [9], etc. thus, all of the methodological questions of the applicability of the ths representation pattern deserve to be reanalyzed via as simple toy-model examples as possible. in our present paper we intend to review and complement the recent development of such a project, and we will describe some of its consequences and ramifications in some detail.

2. the current state of the art
from the point of view of the recent history of quantum mechanics it was, certainly, fortunate that in some of the above-mentioned specific hidden-hermiticity contexts people discovered the advantages of working with such an operator representation h of a given observable quantity (say, of the energy) which only proved hermitian after a change of the inner product in the initially ill-chosen (i.e., by assumption, unphysical) hilbert space h(f) (the superscript (f) might also be read here as abbreviating "former" or "first"). a shared motivation of many of the above-cited papers speaking about non-hermitian quantum mechanics resulted just from the observation that several phenomenologically interesting operators (say, hamiltonians) h appear manifestly non-hermitian in the "usual" textbook setting and that they only become hermitian in some much less common representation of the hilbert space of states. the amendments of the space were, naturally, mediated by the mere introduction of a non-trivial metric θ = θ(s) ≠ i entering the upgraded, s-superscripted inner products,

〈ψ1|ψ2〉^(f) → 〈ψ1|ψ2〉^(s) := 〈ψ1|θ|ψ2〉^(f). (1)

such an inner-product modification changed, strictly speaking, the hilbert space, h(f) → h(s). there were several independent reasons for this. besides the formal necessity of re-installing the unitarity of the evolution law, the costs of the transition to the more complicated metric were found more than compensated by the gains due to the persuasive simplicity of the hamiltonians (cf. [2] or [4] in this respect). moreover, for some quantum systems the transition f → s may prove motivated by physics. the most elementary illustration can be found in our recent study [10], where a consistent quantization of the big bang has been performed in a schematic toy model in which the hamiltonian remained self-adjoint while only a "geometry" observable proved non-hermitian, q ≠ q†.
for a generic quantum system characterized by two observables h and q, hermitian or not, a characteristic scenario may be found displayed in fig. 1. in the picture (where the whole plane symbolizes a multidimensional space of all parameters of the model) we see three circles. schematically, they represent three boundaries ∂d of three domains d. thus, the spectrum of h is assumed potentially observable (i.e., real and, for simplicity, non-degenerate) in the left lower domain dh. similarly, let the spectrum of q be real and non-degenerate inside the right lower domain dq. in parallel, the spectrum of the available hermitizing metrics θ must be, by definition, strictly positive (upper circle, domain dθ).

figure 1. generic domains of parameters for which the metric θ exists (upper disc) or for which the spectra of the observables h or q remain potentially observable (the two respective lower discs).

in this arrangement, operator q ceases to represent an observable in domain "i", while operator h ceases to represent an observable in domain "ii". in domain "iii" neither of these two operators can be made hermitian using the available class of metrics θ, in spite of the reality of both spectra.
a number of open questions emerge. some of them will be discussed in our present paper. via a few illustrative examples we will show, among others, that and why the variability of the metric θ in the physical hilbert space h(s) represents an important merit of quantum theory, and that and why the closed-form availability of the operator θ (i.e., a new form of solvability) is of truly crucial importance in applications.

3. methodical guidance: dimension two
3.1. toy-model hamiltonian
in the simplest possible two-dimensional and real hilbert space h(f) ≡ r² an instructive sample of the time evolution may be chosen as generated by the hamiltonian (i.e., quantum energy operator or matrix) of ref. [11], example ii.1.1,

h = h^(2)(λ) = \begin{pmatrix} −1 & λ \\ −λ & 1 \end{pmatrix}. (2)

its eigenvalues e^(2)_± = ±√(1 − λ²) are non-degenerate and real (i.e., in principle, observable) for λ inside the interval (−1, 1). on the two-point domain boundary {−1, 1} these energies degenerate, in such a way that the canonical form of the matrix itself becomes a jordan block. subsequently, the energies complexify whenever |λ| > 1. in the current literature one calls the boundary points λ = ±1 "exceptional points" (ep, [11]). at these points the eigenvalues degenerate and our toy-model hamiltonian ceases to be diagonalizable, becoming unitarily equivalent to a triangular jordan-block matrix,

h^(2)(1) = [h^(2)(−1)]† = \begin{pmatrix} −1 & 1 \\ −1 & 1 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 1 & −1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ −1 & 1 \end{pmatrix}. (3)

at |λ| > 1 the diagonalizability gets restored, but the eigenvalues cease to be real, e^(2)_± = ±i√(λ² − 1). in the spirit of current textbooks, this leaves these purely imaginary, complex conjugate energies unobservable.

3.2. hidden hermiticity: the set of all eligible metrics
our matrix h^(2)(λ) remains diagonalizable and crypto-hermitian whenever −1 < λ = sin α < 1, i.e., for the auxiliary hamiltonian-determining parameter α lying inside a well-defined physical domain dh such that α ∈ (−π/2, π/2). in such a setting, matrix h^(2)(λ) becomes tractable as a hamiltonian of a hypothetical quantum system whenever it satisfies the above-mentioned hidden hermiticity condition

h = h‡ := θ^−1 h† θ. (4)
suitable candidates for the hilbert-space metric are all easily found from the latter linear equation,

θ = θ^(2)_λ(a, d) = \begin{pmatrix} a & b \\ b & d \end{pmatrix}, b = −(λ/2)(a + d). (5)

all of their eigenvalues must be real and positive,

θ_± = (1/2)[a + d ± √((a − d)² + λ²(a + d)²)] > 0. (6)

this is satisfied for any positive σ = a + d > 0 and with any real δ = a − d such that

√(1 − λ²) = cos α > δ/σ > −√(1 − λ²) = −cos α. (7)

without loss of generality we may set σ = 2, put δ = cos α cos β and treat the second free parameter β ∈ (−π/2, π/2) as numbering the admissible metrics

θ^(2)(physical) = \begin{pmatrix} 1 + cos α cos β & −sin α \\ −sin α & 1 − cos α cos β \end{pmatrix} (8)

with eigenvalues

θ_± = 1 ± √(1 − cos²α sin²β) > 0. (9)

thus, all of the eligible physical hilbert spaces are numbered by two parameters, h(s) = h(s)(α, β).

3.3. the second observable q = q‡
what we now need is the specification of the domain dq. for the general four-parametric real-matrix ansatz

q̃ = \begin{pmatrix} w & x \\ y & z \end{pmatrix} (10)

the assumption of observability implies that the eigenvalues must be both real and non-degenerate,

4xy > −(w − z)². (11)

once we shift the origin and rescale the units, we may set, without loss of generality, w = −z = −1. this simplifies the latter condition, yielding our final untilded two-parametric ansatz

q = \begin{pmatrix} −1 & x \\ y & 1 \end{pmatrix}, xy > −1. (12)

at any fixed metric θ^(2)(physical), the crypto-hermiticity constraint (4) imposed upon matrix (12) degenerates to the single relation

x − y = 2 sin α − (x + y) cos α cos β. (13)

the sum s = x + y may now be treated as the single free real variable which numbers the eligible second observables. the range of this variable should comply with the inequality in eq. (12). after some straightforward additional calculations one proves that the physical values of our last free parameter remain unrestricted, s ∈ r, due to the validity of eq. (13). we may conclude that our example is fully non-numerical. it also offers the simplest nontrivial explicit illustration of the generic pattern as displayed in fig. 1.
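formulae (2), (4), (8) and (9) are also easy to verify numerically. a minimal sketch (the parameter values are arbitrary picks inside the physical domain):

import numpy as np

def h2(lam):
    # hamiltonian (2)
    return np.array([[-1.0, lam], [-lam, 1.0]])

def theta2(alpha, beta):
    # two-parametric family of metrics (8)
    ca, cb, sa = np.cos(alpha), np.cos(beta), np.sin(alpha)
    return np.array([[1.0 + ca * cb, -sa], [-sa, 1.0 - ca * cb]])

alpha, beta = 0.4, 0.9
h, th = h2(np.sin(alpha)), theta2(alpha, beta)

# hidden hermiticity (4), theta h = h^dagger theta, plus positivity (9)
assert np.allclose(th @ h, h.T @ th)
assert (np.linalg.eigvalsh(th) > 0).all()
print("crypto-hermiticity and positivity of the metric confirmed")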
4. hilbert spaces h(f) of dimension n
4.1. anharmonic hamiltonians
during the development of mathematics for quantum theory, one of the most natural paths of research started from the exactly solvable harmonic-oscillator potential v^(ho)(x) = ω²x² and from its power-law perturbations v^(aho)(x) = ω²x² + g x^m. perturbation expansions of the energies proved available even at the "unusual", complex values of the coupling constants g ∉ r+. particularly interesting mathematical results have been obtained at m = 3 and at m = 4. in physics and, in particular, in quantum field theory, the climax of the story came with the letter [12] where, under suitable ad hoc boundary conditions and constraints upon g = g(m) (called, conveniently, pt-symmetry), the robust reality (i.e., in principle, observability) of the spectrum was achieved at any real exponent m > 2, even for certain unusual, complex values of the coupling.
it has long been believed that the pt-symmetric hamiltonians h = h(m) with real spectra are all consistent with the postulates of quantum theory, i.e., that these operators are crypto-hermitian, i.e., hermitian in the respective hamiltonian-adapted hilbert spaces h(s)(m) [4]. due to the ill-behaved nature of the wave functions at high excitations, unfortunately, such a simple-minded physical interpretation of these models has been shown contradictory [6]. on these grounds one has to develop some more robust approaches to the theory for similar models in the nearest future.
in our present paper we will avoid such a danger by recalling the original philosophy of scholtz et al [2]. they simplified the mathematics by admitting, from the very beginning, that just the bounded-operator and/or discrete forms of the eligible anharmonic-type toy-model hamiltonians h ≠ h† should be considered.

4.2. discrete hamiltonians
for our present illustrative purposes we intend to recall, first of all, one of the most elementary versions of certain general, n-dimensional matrix analogues of the differential toy-model hamiltonians, which were proposed in refs. [13]. referring to the details as described in that paper, let us merely recollect that these hamiltonians are defined as certain tridiagonal and real matrices h^(n) = h^(n)_0 + v^(n), where the "unperturbed", harmonic-oscillator-simulating main diagonal remains equidistant,

h^(n)_{11} = (h^(n)_0)_{11} = −n + 1, h^(n)_{22} = (h^(n)_0)_{22} = −n + 3, . . . , h^(n)_{nn} = (h^(n)_0)_{nn} = n − 1,

while the off-diagonal "perturbation" becomes variable and, say, antisymmetric,

v^(n)_{12} = −v^(n)_{21}, v^(n)_{23} = −v^(n)_{32}, . . . , v^(n)_{n−1,n} = −v^(n)_{n,n−1}.

the word "perturbation" is written here in quotation marks because, in the light of the results of ref. [14], the spectral properties of the model become most interesting in the strongly non-perturbative regime, where one up-down symmetrizes and re-parametrizes the perturbation,

v^(n)_{k,k+1} = −v^(n)_{k+1,k} = √(k(n − k)(1 − t − t² − . . . − t^{j−1} − g_k t^j)), n = 2j or n = 2j + 1. (14)

this parametrization proved fortunate in the sense that it enabled us to replace the usual numerical analysis by rigorous computer-assisted algebra. in this sense, the model in question appeared to represent a sort of exactly solvable model, precisely in the spirit of our present message. the new parameter t ≥ 0 is auxiliary and redundant. it may be interpreted, say, as a measure of the distance of the system from the boundary ∂dh of the domain of spectral reality. at very small t the local part of the boundary ∂dh has been shown to have the most elementary form of two parallel hyperplanes in the j-dimensional space of parameters g_n [14].
in the simplest nontrivial special case of n = 2, the present hamiltonian h^(n) degenerates precisely to the above-selected toy model of section 3. vice versa, the basic components of the n = 2 discussion (i.e., first of all, the feasibility of the construction of the metric and of the second observable) might be immediately transferred to all n > 2. several steps in this direction may be found performed in our recent paper on the solvable benchmark simulations of the phase transitions interpreted as a spontaneous pt-symmetry breakdown [15].

5. the problem of non-uniqueness of the ad hoc metric θ = θ(h)
the roots of the growth of popularity of the description of stable quantum systems using representations of observables which are non-hermitian in an auxiliary hilbert space h(f) may be traced back not only to the entirely abstract mathematical analyses of the spectra of quasi-hermitian operators [16] and of the operators which are self-adjoint in the so-called krein spaces with indefinite metric [17], but also to the emergence of manageable non-hermitian models in quantum field theory [18] or even in classical optics [19], etc. after a restriction of attention to quantum theory, the key problem emerges in connection with the ambiguity of the assignment h → θ(h) of the physical hilbert space h(s) to a given generator h of time evolution. for many phenomenologically relevant hamiltonians h it appeared almost prohibitively difficult to define and construct at least some of the eligible metrics θ = θ(h) in an at least approximate form (cf., e.g., ref. [5] in this respect). clearly, in methodological analyses the opportunity becomes wide open for finite-dimensional and solvable toy models.

5.1. solvable quantum models with more than one observable
let us restrict the scope of this paper to the quantum systems which are described by a hamiltonian h = h(λ) accompanied by a single other operator q = q(ϱ) representing a complementary measurable quantity like, e.g., angular momentum or coordinate. in general we will assume that the symbols λ and ϱ represent multiplets of coupling strengths or of any other parameters with an immediate phenomenological or purely mathematical significance. we shall also work here solely with the finite-dimensional matrix versions of our operators of observables.
in such a framework it becomes much less difficult to analyze one of the most characteristic generic features of crypto-hermitian models, which lies in their "fragility", i.e., in their stability up to the point of a sudden collapse. mathematically, we have seen that the change of the stability/instability status of the model is attributed to the presence of the exceptional-point horizons in the parametric space. in the context of phenomenology, people often speak about the phenomenon of quantum phase transition [19].
let us now return to fig. 1, where the set of the phase-transition points pertaining to the hamiltonian h is depicted as a schematic circular boundary ∂dh of the left lower domain, inside which the spectrum of h is assumed, for the sake of simplicity, non-degenerate and completely real. similarly, the right lower disc or domain dq is assigned to the second observable q. finally, the upper, third circular domain dθ characterizes the parametric subdomain of the existence of a suitable general or, if asked for, special class of the eligible candidates θ for a physical metric operator. the key message delivered by fig. 1 is that, at any n,
for many phenomenologically relevant hamiltonians h it appeared almost prohibitively difficult to define and construct at least some of the eligible metrics θ = θ(h) in an at least approximate form (cf., e.g., ref. [5] in this respect). clearly, in methodological analyses the opportunity becomes wide open to finite-dimensional and solvable toy models. 5.1. solvable quantum models with more than one observable let us restrict the scope of this paper to the quantum systems which are described by a hamiltonian h = h(λ) accompanied by a single other operator q = q(%) representing a complementary measurable quantity like, e.g., angular momentum or coordinate. in general we will assume that symbols λ and % represent multiplets of coupling strengths or of any other parameters with an immediate phenomenological or purely mathematical significance. we shall also solely work here with the finite-dimensional matrix versions of our operators of observables. in such a framework it becomes much less difficult to analyze one of the most characteristic generic features of crypto-hermitian models which lies in their “fragility”, i.e., in their stability up to the point of a sudden collapse. mathematically, we have seen that the change of the stability/instability status of the model is attributed to the presence of the exceptional-point horizons in the parametric space. in the context of phenomenology, people often speak about the phenomenon of quantum phase transition [19]. let us now return to fig. 1, where the set of the phase-transition points pertaining to the hamiltonian h is depicted as a schematic circular boundary ∂dh of the left lower domain inside which the spectrum of h is assumed, for the sake of simplicity, non-degenerate and completely real. similarly, the right lower disc or domain dq is assigned to the second observable q. finally, the upper, third circular domain dθ characterizes the parametric subdomain of the existence of a suitable general or, if asked for, special class of the eligible candidates θ for a physical metric operator. the key message delivered by fig. 1 is that at any n, 476 vol. 53 no. 5/2013 new concept of solvability in quantum mechanics –1 –0.5 0.5 1 –0.4 –0.2 0.2 0.4g z figure 2. the boundary of the domain of reality of the spectrum of hamiltonian (15) in z − g plane (i. e., the zero line of polynomial g(z,g)). the correct physics may still only be formulated inside the subdomain d = dh ∩dq ∩dθ. a generalization of this scheme to systems with more observables, q → q1,q2, . . . would be straightforward. 5.2. quantum observability paradoxes one of the most exciting features of all of the abovementioned models may be seen in their ability to connect the stable and unstable dynamical regimes, within the same formal framework, as a move out of the domain d though one of its boundaries. in this sense, the exact solvability of the n < ∞ toy models proves crucial since the knowledge of the boundary ∂dθ remains practically inaccessible in the majority of their n = ∞ differential-operator alternatives [5]. in the current literature on the non-hermitian representations of observables, people most often discuss just the systems with a single relevant observable h(λ) treated, most often, as the hamiltonian. in such a next-to-trivial scenario it is sufficient to require that operator h remains diagonalizable and that it possesses a non-degenerate real spectrum. 
once we add another observable q into considerations, the latter conditions merely specify the interior of the leftmost domain dh of our diagram fig. 1. one may immediately conclude that the physical predictions provided by the hamiltonian alone (and specifying the physical domain of stability as an overlap between dh and the remaining upper disc or domain dθ) remain heavily non-unique in general. according to scholtz et al [2] it is therefore virtually obligatory to take into account at least one other physical observable q = q(%). in opposite direction, even the use of a single additional observable q without any free parameters may prove sufficient for an exhaustive elimination of all of the ambiguities in certain models [4]. one can conclude that the analysis of the consequences of the presence of the single additional operator q = q(%) deserves a careful attention. at the same time, without the exact solvability of the models, some of their most important merits (e.g., the reliable control and insight –1.5 –1 1 1.5 –0.8 –0.4 0.4g z figure 3. the comparison of zero lines of functions g0(z,g) and g(z,g). to the processes of the phase transitions) might happen to be inadvertently lost. 6. adding the degrees of freedom 6.1. embedding: n = 2 space inside n = 3 space in the spirit of ref. [20] a return to observability may be mediated by an enlargement of the hilbert space. for example, a weak-coupling immersion of our matrices h(2)(λ) in their three by three extension h(3) =   −1 1 + z 0 −1 −z 1 g 0 −g 3   (15) may be interpreted as a consequence of the immersion of the smaller hilbert space (where one defined hamiltonian (2)) into a bigger hilbert space. via the new hamiltonian (15), the old hamiltonian becomes weakly coupled to a new physical degree of freedom by the interaction proportional to a small constant g. in a way discussed in more detail in our older paper [21], the boundary of the new physical domain dh(3) coincides with the zero line of the following polynomial g(z,g) in two variables, 60g2z2 − 6zg4 − 12g2z3 −z6 − 162z + 27g2 − 18g4 −g6 − 153z2 − 3g4z2 − 3g2z4 − 6z5 − 30z4 − 80z3 + 144zg2. (16) the shape of this line is shown in fig. 2. in the vicinity of z = 0 and g = 0, the truncated polynomial g0(z,g) = 27g2 − 162z − 18g4 + 144zg2 − g6 − 153z2 − 6zg4 appears useful as a source of the auxiliary boundary of a fairly large subdomain d0 of the physical domain dh(3) (cf. fig. (3)). all of these observations imply that the original z = 0 boundary bends up, i.e., the net effect of the introduction of the new, not too large coupling g 6= 0 lies in the enlargement of the domain of the reality of the energy spectrum beyond λ = 1 (and, symmetrically, below λ = −1). in other words, an enhancement of the 477 m. znojil acta polytechnica –2 –1 0 –1 0 1 0.3 0.5 0.7 0.9 g z θ figure 4. the subdominant eigenvalue of metric θ(3)(1, 1, 1). it stays safely positive in the whole preselected rectangle of parameters z and s. stability of the system with respect to some random perturbations is achieved simply by coupling it to an environment. 6.2. global metrics at n = 3 the enlarged system controlled by hamiltonian h(3) of eq. (15) has been chosen as crypto-hermitian. the construction of the eligible metrics θ(3) =   a b c b f h c h m   (17) of the enlarged and re-coupled system may be perceived as another exercise in the construction of the metrics exactly, by non-numerical means. 
using the similar techniques we obtain, step-by-step, c = (−h−hz − bg)/4, b = −(4az + 4f + 4fz + 4a + hg + ghz)/(8 + g2) and eliminate, finally, − 2h(9 + 2z + z2 + g2)/g = −2az −az2 + 7f − 2fz −fz2 −a + 8m + mg2 + fg2. thus, starting from the three arbitrary real parameters θ11 = a, θ22 = f and θ33 = m we recursively eliminate θ1,3 = c = (−h − hz − bg)/4, θ1,2 = b = −(4az + 4f + 4fz + 4a + hg + ghz)/(8 + g2) and θ2,1 = h = −12g(−2az −az 2 + 7f − 2fz −fz2 − a + 8m + mg2 + fg2)/(9 + 2z + z2 + g2). as a final result we obtain the formula for 2(9 + 2z + z2 + g2)θ1,2 = zg2m + fzg2 + mg2 + fg2 − 3fz2 − 3az2 −fz3 −az3 − 9a− 11fz − 11az − 9f. thus, we may denote θ = θ(3)(a,f,m) and conclude that the metric is obtainable in closed form so that our extended, n = 3 quantum system remains also solvable. –2 –1 0 –1 –0.5 0 0.5 1 g z figure 5. the complete domain of positivity of the smallest eigenvalue of metric θ(3)(1, 1, 1). if we also wish to determine the critical boundaries ∂dθ of the related metric-positivity domain dθ, the available cardano’s closed formulae for the corresponding three eigenvalues θj yield just the correct answer in a practically useless form. thus, we either have to recall the available though still rather complicated algebraic boundary-localization formulae of ref. [21] or, alternatively, we may simplify the discussion by the brute-force numerical localization of a sufficiently large metric-supporting subdomain in the parametric space. for the special choice of a = f = m = 1 we found, for example, that for the sufficiently large range of parameters z and g as chosen in figs. 4 and 5 we reveal that while the two upper eigenvalues θ2 and θ1 remain safely positive, the minimal eigenvalue θ0 only remains positive inside the minimal domain of positivity as displayed in fig. 5. thus, the boundary of the latter domain represents an explicit concrete realization of its abstract upper-circle representative in fig. 1. 7. up-down symmetrized couplings to the environment 7.1. toy model with n = 9 the pt−symmetric and tridiagonal nine-by-ninematrix hamiltonian h(9) of ref. [13] reads  −8 b 0 0 0 0 0 0 0 −b −6 c 0 0 0 0 0 0 0 −c −4 d 0 0 0 0 0 0 0 −d −2 α 0 0 0 0 0 0 0 −α 0 α 0 0 0 0 0 0 0 −α 2 d 0 0 0 0 0 0 0 −d 4 c 0 0 0 0 0 0 0 −c 6 b 0 0 0 0 0 0 0 −b 8   . in the limit α → 0 it splits into a central onedimensional submatrix with eigenvalue 0 and a pair of non-trivial four-by-four sub-hamiltonians h(4). the spectrum remains real, say, for the family of parameters b = √ 3 + 3t, c = 2 √ 1 + t and d = √ 3 + 3t. they span an interval in the physical domain dh whenever t stays negative, t ∈ (−∞, 0) [14]. 478 vol. 53 no. 5/2013 new concept of solvability in quantum mechanics 20 25 30 0–0.02–0.04–0.06–0.08 t z figure 6. the t−dependence of the real roots zj of secular equation (18). the collapse at t = 0 is not destroyed due to the weakness of the coupling to the environment (β = 1). at α = 0 the special and easily seen feature of the latter operator (i.e., matrix) is that at t = 0 (i.e., at the boundary of its physical domain dh) it ceases to represent an observable because its eigenvalues degenerate. indeed, the vanishing level e4 = 0 separates from the two degenerate quadruplets of e4+j = −e4−j = 5 with j = 1, 2, 3, 4. subsequently, at t > 0, these eigenvalues get, up to the constantly real level e4 = 0, complex. this makes the model suitable for quantitative studies of the properties of the boundary ∂dh [15]. 7.2. 
7.2. boundary ∂dh
the t-independent level e₄ = 0 is a schematic substitute for a generic environment. each of the two remaining subsystems remains coupled to this environment by the coupling matrix element α. we shall choose its value as proportional to t via a not too large real coupling constant β, α = βt. at the particular choice of β = 1, the description of the boundary ∂dh remains feasible by non-numerical means, yielding the transparent and algebraically tractable secular equation

0 = z⁴ + (−100 − 20t + 2t²)z³ + (3750 + 500t − 80t² − 34t³)z² + (−62500 + 12500t + 4810t² + 360t³ + 158t⁴)z + 390625 − 312500t − 23500t² + 22450t³ − 3221t⁴ − 126t⁵, (18)

which may very easily be treated numerically. obviously, the level e₄ = 0 separates, while the other two quadruplets acquire the square-root form e_{4+j} = −e_{4−j} = √zⱼ for j = 1, 2, 3, 4. hence, one may proceed and study the spectrum of z = e² in full parallel with our above n = 3 model. the β = 1 results are sampled in fig. 6.
inside the physical domain of t < 0, qualitatively the same pattern is still obtained even at the perceivably larger β = 2.73 (cf. fig. 7). once we get very close to the critical value of β ≈ 2.738, the situation becomes unstable. in the unphysical domain of t > 0, for example, we can spot an anomalous partial de-complexification of the energies at certain positive values of the parameter t.

figure 7. the t-dependence of the real roots zj of secular equation (18) near t = 0 at β = 2.73. the collapse survives; a partial recovery emerges at negative t.

figure 8. the change of the t-dependence of the real roots zj of secular equation (18) near t = 0 at β = 2.75.

at about β ≈ 2.738 the two separate ep instants of the degeneracy and complexification/de-complexification of the energies fuse. subsequently, a qualitatively new pattern emerges; a graphical sample of it is given in fig. 8. first of all, the original multiple ep collapse gets decoupled. this implies that at β = 2.75, as used in the latter picture, the inner two levels degenerate and complexify at a certain small but safely negative t = t_crit ≈ −0.004. due to the solvability of the model we may conclude that the boundary curve ∂dh starts moving with the parameter β.
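equation (18) is indeed trivial to treat numerically. a minimal sketch (note that at t = 0 the quartic degenerates to (z − 25)⁴, i.e., to the e = ±5 quadruplets mentioned above; the quadruple root then splits at the ~1e-3 level for purely numerical reasons):

import numpy as np

def z_roots(t):
    # coefficients of the secular polynomial (18) at beta = 1
    return np.roots([
        1.0,
        -100.0 - 20.0 * t + 2.0 * t ** 2,
        3750.0 + 500.0 * t - 80.0 * t ** 2 - 34.0 * t ** 3,
        -62500.0 + 12500.0 * t + 4810.0 * t ** 2 + 360.0 * t ** 3
        + 158.0 * t ** 4,
        390625.0 - 312500.0 * t - 23500.0 * t ** 2 + 22450.0 * t ** 3
        - 3221.0 * t ** 4 - 126.0 * t ** 5,
    ])

for t in (-0.08, -0.02, 0.0):
    r = z_roots(t)
    print(f"t = {t:+.2f}: re z = {np.sort(r.real.round(2))},"
          f" max |im z| = {np.abs(r.imag).max():.1e}")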
however, before turning our attention to the resulting family of the finite-dimensional crypto-hermitian problems, let us add a brief remark on the alternative possibility of a transfer of the present analysis of the idea of generalized solvability to the quickly developing field of so-called quantum graphs, i.e., of systems where the usual underlying concept of a point particle moving along a real line or interval is generalized in the sense that the single interval (say, e+ := (0,l)) is replaced by a suitable graph g(q) composed of q edges ej, j = 0, 1, . . . ,q − 1. the idea still waits for its full understanding and consistent implementation. in particular, in ref. [25] we showed that even for the least complicated equilateral q−pointed star graphs with q > 2 the spectrum of energies need not remain real anymore, even if one parallels, most closely, the q = 2 boundary conditions (19) and even if one does not attach any interaction to the central vertex. in our present notation this means that the domain dh of fig. 1 becomes empty. in other words, the applicability of this and similar models remains restricted to classical physics and optics while a correct, widely acceptable quantum-system interpretation of the manifestly non-hermitian q > 2 quantum graphs must still be found in the future. 8.2. discrete lattices as we already indicated above, one of the most promising methods for efficiently suppressing some of the above-mentioned shortcomings of the pt -symmetric models which are built in an infinite-dimensional hilbert space h(f) may be seen in the transition, say, to the discrete analogues and descendants of various confining pt -symmetric as well as non-pt symmetric potentials [26]. in particular, the most elementary discrete analogues of the most elementary end-point-interaction-simulating boundary conditions (19) may be seen in the suitable end-point non-hermitian perturbations w (n) of the standard hermitian kinetic-energy matrices −4(n), i.e., of the n by n negative discrete laplacean hamiltonians where mere two diagonals of matrix elements are nonvanishing, 4(n)k,k+1 = 4 (n) k+1,k = 1, k = 1, 2, . . . ,n − 1. with this idea in mind, we have already studied, in [27], the most elementary model with w (n)(λ) =   0 −λ 0 0 . . . 0 λ 0 0 0 ... ... 0 0 0 ... ... 0 0 0 ... ... 0 0 ... ... ... 0 0 λ 0 . . . 0 0 −λ 0   . (21) we succeeded in constructing the complete n-parametric family of the physics-determining solutions θ of the compatibility constraint (4). in ref. [28] we then extended these results to the more general, multiparametric boundary-condition-simulated perturbations w (n)(λ,µ) =   0 −λ 0 0 . . . . . . 0 λ 0 µ 0 . . . . . . 0 0 −µ 0 0 ... ... 0 0 0 0 ... ... ... ... ... ... ... 0 0 0 ... ... 0 0 −µ 0 0 . . . . . . 0 µ 0 λ 0 . . . . . . 0 0 −λ 0   (22) etc. thus, all of these models may be declared solvable in the presently proposed sense. at the same time, the question of the survival of feasibility of these exhaustive constructions of metrics θ after transition to nontrivial discrete quantum graphs remains open [29]. 9. discussion during transitions from classical to quantum theory one must often suppress various ambiguities — cf., e.g., the well known operator-ordering ambiguity of hamiltonians which are, classically, defined as functions of momentum and position. 
moreover, even after we specify a unique quantum hamiltonian operator h, we may still encounter another, less known ambiguity which is well know, e.g., in nuclear physics [2]. the mathematical essence of this ambiguity lies in the freedom of our choice of a sophisticated conjugation t (s) which maps the standard physical vector space v (i.e., the space of ket vectors |ψ〉 representing the admissible quantum states) onto the dual vector space v′ of the linear functionals over v. in our present paper we discussed some of the less well known aspects of this ambiguity in more detail. let us now add a few further comments on the current quantum-model building practice. first of all, let us recollect that one often postulates a point-particle (or point-quasi-particle) nature and background of the generic quantum models. thus, 480 vol. 53 no. 5/2013 new concept of solvability in quantum mechanics in spite of the existence of at least nine alternative formulations of the abstract quantum mechanics as listed, by styer et al, in their 2002 concise review paper [30], a hidden reference to the wave function ψ(x) which defines the probability density and which lives in some “friendly” hilbert space (say, in h(f) = l2(rd)) survives, more or less explicitly, in the large majority of our conceptual as well as methological considerations. a true paradox is that the simultaneous choice of the friendly hilbert space h(f) and of some equally friendly differential-operator generator h = 4 + v (x) of the time evolution encountered just a very rare critical opposition in the literature [31]. the overall paradigm only started changing when the nuclear physicists imagined that the costs of keeping the hilbert space h(f) (or, more explicitly, its inner product) unchanged might prove too high, say, during variational calculations [2]. anyhow, the ultimate collapse of the old paradigm came shortly after the publication of the bender’s and boettcher’s letter [12] in which, for certain friendly ode hamiltonians h = 4+ v (x) the traditional choice of space h(f) = l2(r) was found unnecessarily over-restrictive (the whole story may be found described in [4]). the net result of the new developments may be summarized as an acceptability of a less restricted input dynamical information about the system. in other words, the use of the friendly space h(f) in combination with a friendly hamiltonian h = h† has been found to be a theoretician’s luxury. the need arose for a less restrictive class of standard hilbert spaces h(s) which would differ from their “false” predecessor h(f) by a nontrivial inner-product metric θ 6= i. one need not even abandon the most common a priori selection of the friendly hilbert space h(f) of the ket vectors |ψ〉 with their special dirac duals (i.e., roughly speaking, with the transposed and complex conjugate bra vectors 〈ψ|) yielding the dirac’s inner product 〈ψ1|ψ2〉 = 〈ψ1|ψ2〉(f). what is new is only that such a pre-selected, f-superscripted hilbert space need not necessarily retain the usual probabilistic interpretation. one acquires an enhanced freedom of working with a sufficiently friendly form of the input hamiltonian h, checking solely the reality of its spectrum. thus, one is allowed to admit that h 6= h† in h(f). one must only introduce, on some independent initial heuristic grounds, the amended hilbert space h(s). 
for such a purpose it is sufficient to keep the same ket-vector space and just to endow it with some sufficiently general and hamiltonian-adapted (i.e., hamiltonian-hermitizing) inner product (1) [2]. this is the very core of innovation. in the physical hilbert space h(s) the unitarity of the evolution of the system must remain guaranteed, as usual, by the hermiticity of our hamiltonian in this space, i.e., by a hidden hermiticity condition h = θ−1h†θ := h‡ (23) alias crypto-hermititicity condition [1]. in the special case of finite matrices one speaks about the quasihermiticity condition. unfortunately, this name becomes ambiguous and potentially misleading whenever one starts contemplating certain sufficiently wild operators in general hilbert spaces [16]. it is rarely emphasized (as we did in [32]) that the choice of the metric remains an inseparable part of our model-building duty even if our hamiltonian happens to be hermitian, incidentally, also in the unphysical initial hilbert space h(f). irrespective of the hermiticity or non-hermiticity of h in auxiliary h(f), one must address the problem of the independence of the dynamical input information carried by the metric θ. only the simultaneous specification of the operator pair of h and θ connected by constraint (23) defines physical predictions in consistent manner. in this sense, the concept of solvability must necessarily involve also the simplicity of θ. acknowledgements this work has been supported by gačr grant no. p203/11/1433. references [1] m. znojil. three-hilbert-space formulation of quantum mechanics. sigma 5: 001, 2009 (arxiv overlay: 0901.0700). [2] f. g. scholtz, h. b. geyer and f. j. w. hahne. quasihermitian operators in quantum mechanics and the variational principle. ann phys (ny) 213: 74–101, 1992. [3] d. bessis, private communication (2002). [4] c. m. bender. making sense of non-hermitian hamiltonians. rep prog phys 70: 947–1018, 2007. [5] a. mostafazadeh. metric operator in pseudo-hermitian quantum mechanics and the imaginary cubic potential. j phys a: math gen 39: 10171–10188, 2006. [6] p. siegl and d. krejcirik, on the metric operator for the imaginary cubic oscillator. phys. rev. d 86: 121702(r), 2012. [7] n. moiseyev. non-hermitian quantum mechanics. cup, cambridge, 2011. [8] a. mostafazadeh. hilbert space structures on the solution space of klein-gordon type evolution equations. class quantum grav 20: 155, 2003; m. znojil. relativistic supersymmetric quantum mechanics based on klein-gordon equation. j phys a: math gen 37: 9557–9571, 2004; v. jakubsky and j. smejkal. a positive-definite scalar product for free proca particle. czech j phys 56: 985, 2006; f. zamani and a. mostafazadeh. quantum mechanics of proca fields. j math phys 50: 052302, 2009. [9] c. korff, r. a. weston. pt symmetry on the lattice: the quantum group invariant xxz spin-chain. j phys a: math theor 40: 8845–8872, 2007; r. f. bishop and p. h. y. li. coupled-cluster method: a lattice-path-based subsystem approximation scheme for quantum lattice models. phys rev a 83: 042111, 2011. 481 m. znojil acta polytechnica [10] m. znojil. quantum big bang without fine-tuning in a toy-model. j phys: conf ser 343: 012136, 2012. [11] t. kato. perturbation theory for linear operators. springer, berlin, 1966. [12] c. m. bender and s. boettcher. real spectra in non-hermitian hamiltonians having pt symmetry. phys rev lett 80: 5243–5246, 1998. [13] m. znojil. maximal couplings in pt-symmetric chain-models with the real spectrum of energies. 
j phys a: math theor 40: 4863–4875, 2007. [14] m. znojil. tridiagonal pt-symmetric n by n hamiltonians and a fine-tuning of their observability domains in the strongly non-hermitian regime. j phys a: math theor 40: 13131–13148, 2007. [15] m. znojil. quantum catastrophes: a case study. j phys a: math theor 45: 444036, 2012. [16] j. dieudonne. quasi-hermitian operators. proc int symp lin spaces, pergamon, oxford, 1961, pp. 115–122. [17] h. langer and ch. tretter. a krein space approach to pt symmetry. czechosl j phys 54: 1113–1120, 2004; p. siegl. non-hermitian quantum models, indecomposable representations and coherent states quantization. univ. paris diderot & fnspe ctu, prague, 2011 (phd thesis). [18] c. m. bender and k. a. milton. nonperturbative calculation of symmetry breaking in quantum field theory. phys rev d 55: 3255–3259, 1997. [19] c. e. rüter, r. makris, k. g. el-ganainy, d. n. christodoulides, m. segev and d. kip, observation of parity-time symmetry in optics. nature phys 6: 192, 2010. [20] m. znojil. a return to observability near exceptional points in a schematic pt-symmetric model. phys lett b 647: 225–230, 2007. [21] m. znojil. horizons of stability. j phys a: math theor 41: 244027, 2008. [22] d. krejcirik, h. bila and m. znojil. closed formula for the metric in the hilbert space of a pt-symmetric model. j phys a: math gen 39: 10143–10153, 2006. [23] d. krejcirik. calculation of the metric in the hilbert space of a pt-symmetric model via the spectral theorem. j phys a: math theor 41: 244012 (2008). [24] j. železný. the krein-space theory for non-hermitian pt-symmetric operators. fnspe ctu, prague, 2011 (msc thesis); d. krejcirik, p. siegl and j. železný. on the similarity of sturm-liouville operators with non-hermitian boundary conditions to self-adjoint and normal operators. complex anal. oper. theory, to appear. preprint available on arxiv:1108.4946. [25] m. znojil. quantum star-graph analogues of ptsymmetric square wells. can j phys 90: 1287–1293, 2012. [26] m. znojil. n-site-lattice analogues of v (x) = ix3. ann phys (ny) 327: 893–913, 2012. [27] m. znojil. complete set of inner products for a discrete pt-symmetric square-well hamiltonian. j math phys 50: 122105, 2009. [28] m. znojil and j. wu. a generalized family of discrete pt-symmetric square wells. int j theor phys 52: 2152–2162, 2013. [29] m. znojil. fundamental length in quantum theories with pt-symmetric hamiltonians .ii. the case of quantum graphs. phys rev d 80: 105004, 2009. [30] d. f. styer et al. nine formulations of quantum mechanics. am j phys 70 (3): 288–297, 2002. [31] j. hilgevoord. time in quantum mechanics. am j phys 70 (3): 301–306, 2002. [32] m. znojil and h. b. geyer. smeared quantum lattices exhibiting pt-symmetry with positive p. fortschr physik 61(2-3): 111–123, 2013. 
acta polytechnica doi:10.14311/ap.2013.53.0427, 53(5):427–432, 2013. © czech technical university in prague, 2013. available online at http://ojs.cvut.cz/ojs/index.php/ap

coulomb scattering in non-commutative quantum mechanics

veronika gáliková, peter prešnajder∗

faculty of mathematics, physics and informatics, comenius university of bratislava, mlynská dolina f2, bratislava, slovakia
∗ corresponding author: presnajder@fmph.uniba.sk

abstract. recently we formulated the coulomb problem in a rotationally invariant nc configuration space specified by nc coordinates $x_i$, $i = 1, 2, 3$, satisfying the commutation relations $[x_i, x_j] = 2i\lambda\,\varepsilon_{ijk}x_k$ ($\lambda$ being our nc parameter). we found that the problem is exactly solvable: first we gave an exact simple formula for the energies of the negative bound states $E^\lambda_n < 0$ ($n$ being the principal quantum number), and later we found the full solution of the nc coulomb problem. in this paper we present an exact calculation of the nc coulomb scattering matrix $S^\lambda_j(E)$ in the $j$-th partial wave. as the calculations are exact, we can recognize remarkable non-perturbative aspects of the model: 1) an energy cut-off: the scattering is restricted to the energy interval $0 < E < E_{\rm crit} = 2/\lambda^2$; 2) the presence of two sets of poles of the s-matrix in the complex energy plane: as expected, the poles at negative energies $E^{I}_{\lambda n} = E^\lambda_n$ for the attractive coulomb potential, and the poles at ultra-high energies $E^{II}_{\lambda n} = E_{\rm crit} - E^\lambda_n$ for the repulsive coulomb potential. the poles at ultra-high energies disappear in the commutative limit $\lambda \to 0$.

keywords: coulomb scattering, non-commutativity, quantum mechanics.

submitted: 12 march 2013. accepted: 16 april 2013.

1. introduction

the basic ideas of non-commutative geometry were developed in [1], and in the form of matrix geometry in [2]. the analysis performed in [3] led to the conclusion that quantum vacuum fluctuations and einstein gravity could create (micro) black holes which prevent the localization of space-time points. mathematically this requires non-commutative (nc) coordinates $x^\mu$ in space-time satisfying specific commutation relations, e.g. the heisenberg-moyal commutation relations
$$[x^\mu, x^\nu] = i\theta^{\mu\nu}, \qquad \mu,\nu = 0, 1, 2, 3, \qquad (1)$$
where $\theta^{\mu\nu}$ are given numerical constants that specify the non-commutativity of the space-time in question. later, in [4], it was shown that such field theories in nc spaces can emerge as effective low-energy limits of string theories. these results supported a vivid development of non-commutative qft. such models are, however, very complicated and contain various unpleasant and unwanted features. it may therefore be interesting to reverse the approach:
not to use the nc geometry to improve the foundations of qft, but to test the effect of noncommutativity of the space on well-defined problems in quantum mechanics (qm), such as the harmonic oscillator, the aharonov-bohm effect, the coulomb problem and the planar spherical well, see e.g. [5–7]. recently, in [9, 10] we formulated the coulomb problem in a rotationally invariant nc configuration space r3λ specified by nc coordinates xk, k = 1, 2, 3, satisfying the commutation relations [xi,xj] = 2iλεijkxk, (2) where λ is a parameter of the non-commutativity with the dimension of length. we found the model exactly solvable. in [9] we gave an exact simple formula for the nc negative bound state energies, and in [10] we presented the full solution of the nc coulomb problem. in this paper we present exact formulas for the nc coulomb s-matrix in the j-th partial wave. a similar construction of a 3d noncommutative space, as a sequence of fuzzy spheres, was proposed in [11]. however, various fuzzy spheres are related to each other differently there (not leading to the flat 3d geometry at large distances). this paper is organized as follows. in section 2 we provide the formulation of the coulomb problem in nc space and we briefly describe the method of solution suggested in [9] and [10]. in section 3 we sketch the determination of the coulomb s-matrix in spherical coordinates within standard qm (see [12]), and then we generalize the calculations to the non-commutative context. as the calculations are exact, we shall be able to recognize remarkable non-perturbative aspects of the nc coulomb problem: • the cut-off for the scattering energy e ∈ (0,ecrit), where ecrit = 2/λ2; • two sets of s-matrix poles: poles at energies e < 0 for attractive coulomb potential and poles at 427 http://dx.doi.org/10.14311/ap.2013.53.0427 http://ojs.cvut.cz/ojs/index.php/ap v. gáliková, p. prešnajder acta polytechnica ultra-high energies e > ecrit for repulsive coulomb potential that disappear in the commutative limit λ → 0. in section 4 we provide a brief discussion and conclusions. 2. non-commutative quantum mechanics 2.1. non-commutative configuration space we realize the nc coordinates in r3λ, similarly as the jordan-wigner realization of the fuzzy sphere in [8], in terms of 2 pairs of boson annihilation and creation operators aα, a†α, α = 1, 2, satisfying the following commutation relations: [aα,a † β] = δαβ, [aα,aβ] = [a † α,a † β] = 0. (3) they act in an auxiliary fock space f spanned by normalized vectors |n1,n2〉 = (a†1) n1 (a†2) n2 √ n1!n2! |0〉. (4) here |0〉≡ |0, 0〉 denotes the normalized vacuum state: a1|0〉 = a2|0〉 = 0. the noncommutative coordinates xj, j = 1, 2, 3, in the space r3λ satisfying (2) are given as xj = λa+σja ≡ λσ j αβa † αaβ,j = 1, 2, 3, (5) where λ is the universal length parameter and σj are pauli matrices. the operator that approximates the nc analog of the euclidean distance from the origin is r = ρ + λ,ρ = λn,n = a†αaα. (6) it can easily be shown that [xi,r] = 0, and r2 −x2j = λ2. a strong argument supporting the exceptional role of r will be given later. 2.2. hilbert space hλ of nc wave functions let us consider a linear space spanned by normal ordered polynomials containing the same number of creation and annihilation operators: ψ = (a†1) m1 (a†2) m2 (a1)n1 (a2)n2,m1 + m2 = n1 + n2. (7) hλ is our denotation for the hilbert space of linear combinations of functions (7) closed with respect to the norm ‖ψ‖2 = 4πλ3 tr [ (n +1)ψ†ψ ] = 4πλ2 tr[rψ†ψ]. 
(8) the rotationally invariant weight w(r) = 4πλ2r is determined by the requirement that a ball in r3λ with radius r = λ(n + 1) should possess a standard volume in the limit r →∞. it can be shown that the chosen weight w(r) guarantees that the ball in question has the volume vr = 4π3 r 3 + o(λ). remark. the weighted trace tr[w(r) . . .] with w(r) = 4πλ2r goes to the usual volume integral∫ d3~x.. . at large distances. the 3d non-commutative space proposed in [11] corresponds to the choice w(r̂) = const and at large distances does not correspond to the flat space r3. 2.3. orbital momentum in hλ in hλ we define orbital momentum operators, the generators of rotations lj, as follows: ljψ = 1 2 [a+σja, ψ],j = 1, 2, 3. (9) they are hermitian (self-adjoint) operators in hλ and obey the standard commutation relations [li,lj]ψ ≡ (lilj −ljli)ψ = iεijklkψ. (10) the standard eigenfunctions ψjm, j = 0, 1, 2, . . . ,, m = −j, . . . , +j, satisfying l2i ψjm = j(j + 1)ψjm,l3ψjm = mψjm, (11) are given by the formula ψjm = ∑ (jm) (a†1) m1 (a†2) m2 m1! m2! rj(%) an11 (−a2) n2 n1! n2! , (12) where % = λa†αaα = λn. the summation goes over all nonnegative integers satisfying m1 + m2 = n1 + n2 = j, m1 − m2 − n1 + n2 = 2m. for any fixed rj(%) equation (12) defines a representation space for a unitary irreducible representation with spin j. 2.4. the nc analog of laplace operator in hλ we postulate the nc analog of the usual laplace operator in the form: ∆λψ = − 1 λr [ a†α, [aα, ψ] ] = 1 λ2(n + 1) [ a†α, [aα, ψ] ] . (13) this choice is motivated by the following facts: (1.) a double commutator is an analog of a second order differential operator; (2.) factor r−1 guarantees that the operator ∆λ is hermitian (self-adjoint) in hλ, and finally, (3.) factors λ−1 or λ−2 respectively, guarantee the correct physical dimension of ∆λ and its non-trivial commutative limit. calculating the action of (13) on ψjm given in (12) we can check whether the postulate (13) is a reasonable choice. first, we represent the operator rj(%) in (12) as a normal ordered form of an analytic function rj(%): rj(%) = :rj(%): = ∑ k c j k:% k: = ∑ k c j kλ k n! (n −k)! (14) 428 vol. 53 no. 5/2013 coulomb scattering in non-commutative quantum mechanics the last equality follows from the formula :nk:|n1,n2〉 = n! (n−k)! |n1,n2〉,n = n1 + n2 (15) (which can be proved by induction in k). now we will use the following commutation relations [a†α, :n k:] = −ka†α:n k−1: ⇒ [a†α, :r:] = −λa † α:r ′:, [aα, :nk:] = k:nk−1:aα ⇒ [aα, :r:] = λ:r′:aα, (16) where r′ denotes the derivative of r: r′ =∑∞ k=1 kck% k−1. using (16), the following formula can be derived: [ a†α, [aα, ψ] ] = ∑ (jm) (a†1) m1 (a†2) m2 m1!m2! × : [ −%r′′(%) − 2(j + 1)r′(%) ] : an11 (−a2) n2 n1!n2! . (17) where r′′(%) is defined as the derivative of r′(%). in the commutative limit λ → 0 the operator % formally reduces to the usual radial r variable in r3, and we see that ∆λ just reduces to the standard laplace operator in r3. 2.5. the potential term in hλ the operator v corresponding to a central potential in qm is defined simply as the multiplication of the nc wave function by v (r): (v ψ)(r) = v (r)ψ = ψv (r). (18) in the commutative case the coulomb potential φ(r) = −q r is the radial solution of the equation ∆φ(r) = 0 (19) vanishing at infinity. due to our choice of the nc laplace operator ∆λ the nc analog of this equation is ∆λφ(r) = 0 ⇐⇒ [ a†α, [aα, φ(n)] ] = 0. 
the last equation can be rewritten as a simple recurrent relation (n + 2)φ(n + 1) − (n + 1)φ(n) = (n + 1)φ(n) −nφ(n − 1) (20) that can be easily solved. its solution vanishing at infinity is given as φ(n) = − q′ n + 1 ⇐⇒ φ(r) = − q r . (21) we identify φ(r) with the nc analog of the coulomb potential. we see that the 1/r dependence of the nc coulomb potential is inevitable. 3. the coulomb problem in nc qm 3.1. nc radial schrödinger equation based on (13) and (21) we postulate the nc analog of the schrödinger equation with the coulomb potential in r3λ as ~2 2mλr [ a†α, [aα, ψ] ] − q r ψ = eψ ⇐⇒ 1 λ [ a†α, [ aα, ψ] ] − 2αψ = k2rψ, (22) where q is a square of electric charge q = ±e2 (q > 0 or q < 0 corresponding to the coulomb attraction or repulsion respectively), α = mq/~2 and k2 = 2me/~2. putting ψ = ψjm given in (12) into nc schrödinger equation (22) we come to the radial schrödinger equation for rj = :r:. using (17) and the relation rψjm = ∑ (jm) (a†1) m1 (a†2) m2 m1!m2! × :[(% + λj + λ)rj + λ%r′j]: an11 (−a2) n2 n1!n2! , (23) we obtain :%r′′j + [k 2λ% + 2j + 2]r′j + [ k2% + k2λ(j + 1) + 2α ] rj: = 0. (24) we claim (24) to be an nc analog of the usual radial schrödinger equation known from the standard qm. there definitely is a resemblance, as in the limit λ → 0 the terms in (24) proportional to λ representing the nc corrections disappear. considering the same limit we see that we also do not need to worry about the colon marks denoting the normal ordering, since for zero λ it makes no difference whatsoever whether we care for the ordering or not. now we can solve the nc radial schrödinger equation in two separate steps: (1.) we associate the following ordinary differential equation to the mentioned operator radial schrödinger equation (24): %r′′j + [k 2λ% + 2j + 2]r′j + [ k2% + k2λ(j + 1) + 2α ] rj = 0, (25) with % being real variable, and we will solve this one. but why do we expect this step to be of any use to us, when we actually do have to care about the ordering? the key information follows from (16): the derivatives of r appearing in (17) are just like carbon copies of the usual derivatives. (2.) now bearing this in mind, we put r = :r: , the solution of (24), to be of the same form as r, the solution of (25), except that % = λn and the normal 429 v. gáliková, p. prešnajder acta polytechnica powers :%n: have to be calculated. fortunately there is a simple formula relating the two, namely :%n: = λn:nn: = λn n! (n −n)! :%−n: = λ−n:nn: = λ−n n! (n −n)! (26) all we need is to rewrite :r: using those relations. then the comparison of qm and ncqm will be at hand. 3.2. coulomb scattering in qm to begin with, we briefly sum up the qm results before handling our ncqm case. the solution of the radial schrödinger equation for a particle in the potential v (r) = −α/r with the angular momentum j and energy e > 0 regular in r → 0 is given as r qm j = e ikrφ ( j + 1 − i α k , 2j + 2,−2ikr ) , k = √ 2e > 0, (27) in terms of the confluent hypergeometric function (see [12]): φ(a,c,z) = ∞∑ m=0 (a)m (c)m zm m! . (28) here (a)m is the so-called pochhammer symbol: (a)m = a(a + 1) · · ·(a + m− 1), m = 0, 1, 2, . . . , and (a)0 = 1. in (27) we have refrained from writing down m/~2 explicitly. this will simplify the formulas and will not do any harm, since the full form can be restored anytime. the solution (27) is real and for r →∞ it can be written as the sum of two complex conjugated parts corresponding to an inand out-going spherical wave. 
in the following formula a real factor common for both parts is left out, having no influence on the s-matrix. r qm j ∼ ij+1 γ(j + 1 + iα k ) eikr+i α k ln(2kr) + i−j−1 γ(j + 1 − iα k ) e−ikr−i α k ln(2kr). (29) the s-matrix for the j-th partial wave is defined as the ratio of the r-independent factors multiplying the exponentials with the kinematical factor (−1)j+1 left out: s qm j (e) = γ(j + 1 − iα k ) γ(j + 1 + iα k ) , e = 1 2 k2 > 0. (30) 3.3. coulomb scattering in ncqm now let us have a look on the coulomb scattering in ncqm. the solution of equation (25) regular at the origin is again given in terms of confluent hypergeometric function (see [10]): rj± = exp [ (±π(e) −λe)% ] ×φ ( j + 1 ± α π(e) , 2j + 2,∓2π(e)% ) , (31) where π(e) = √ 2e( 1 2 λ2e − 1). (32) we point out that both rj+ = rj− due to the kummer identity valid for confluent hypergeometric function (see [13]). scattering solutions, containing inand out-going spherical waves, can be obtained only for energy properly restricted to the values 2e (1 2 λ2e − 1 ) < 0 ⇐⇒ e ∈ (0, 2/λ2). (33) thus we recovered energy cut-off ecrit = 2/λ2. for e ∈ (0,ecrit) we put π(e) = ip, p = √ 2e ( 1 − 12λ 2e ) > 0, (34) and chose the solution (31) as rej = rej+; we labeled the solution by the admissible value of energy e ∈ (0,ecrit). the solution of the nc radial schrödinger equation (24) is rej = :rej:. using (26) the calculation is straightforward and we find for rej the expression rej = (p + iλe p− iλe )n ×f ( j + 1 − i α p ,−n, 2j + 2; 2iλp p− iλe p + iλe ) (35) in terms of the usual hypergeometric function f(a,b; c; z) = ∞∑ m=0 (a)m(b)m (c)m zm m! . (36) the radial dependence of rj is present in the hermitian operator n: r = % + λ, % = λn. by analogy with (29) we will rewrite also the nc solution as a sum of two hermitian conjugated terms corresponding to the inand out-going spherical wave. first, we express rej as rej = (1 −z)b/2f(a,b,c; z). (37) according to kummer identities (see [13]), f (a,b,c; z) can be written as a linear combination of two other solutions of the hypergeometric equation, namely (−z)−af(a,a + 1 − c,a + 1 − b; z−1) and (z)a−c(1 −z)c−a−bf ( c−a, 1 −a,c + 1 −a−b; z−1 z ) . 430 vol. 53 no. 5/2013 coulomb scattering in non-commutative quantum mechanics again, leaving out the common hermitian factor which is irrelevant regarding the s-matrix, we can write: rej ∼ (−1)j+1eαπ/p γ(j + 1 + iα p ) (p + iλe p− iλe )n+j+1−iα p × γ(2j + 2)γ(n + 1) γ(n + 2 + j − iα/p) (2λp)−1−j+i α p ×f ( a,b,c;− i 2λp (p + iλe p− iλe )) + (−1)jeαπ/p γ(j + 1 − iα p ) (p− iλe p + iλe )n+j+1+iα p × γ(2j + 2)γ(n + 1) γ(n + 2 + j + iα/p) (2λp)−1−j−i α p ×f ( a∗,b∗,c∗; i 2λp (p− iλe p + iλe )) , (38) where a = j + 1 − i α p , b = −j − i α p , c = n + 2 + j − i α p . (39) in the limit r = λ(n + 1) →∞ this can be simplified as (for details see ([10])): rej ∼ (−1)j+1i−j−1e−απ/2p × γ(2j + 2) γ(j + 1 − iα/p) e−i α p ln(2pr) (2pr)j+1 × exp [ −(r/λ + j + iα/p) ln p + iλe p− iλe ] + (−1)j+1ij+1e−απ/2p × γ(2j + 2) γ(j + 1 + iα/p) ei α p ln(2pr) (2pr)j+1 × exp [ −(r/λ + j − iα/p) ln p− iλe p + iλe ] . (40) the s-matrix is the ratio of the r-independent factors in (40): sλj (e) = γ(j + 1 − iα p ) γ(j + 1 + iα p ) , e = 1 λ2 ( 1 + i √ λ2p2 − 1 ) . (41) the function e = e(p), given above with a positive square root in for p ∈ (1/λ, +∞), is a conformal map inverse to (34), which maps the cut p right-half-plane into the e upper-half-plane. the physical-relevant values of the s-matrix are obtained as sλj (e + iε) in the limit ε → 0+. 
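a quick numerical check of this result may be useful here. the sketch below is an added illustration, not code from the original paper: it evaluates $S^\lambda_j(E)$ of eq. (41) on the physical interval using $p$ from eq. (34), verifies that $|S^\lambda_j(E)| = 1$ there, and confirms that the commutative limit $\lambda \to 0$ reproduces the qm expression (30) with $k = \sqrt{2E}$.

```python
import mpmath as mp

def s_nc(E, j, alpha, lam):
    """nc coulomb partial-wave s-matrix of eq. (41), evaluated for
    0 < E < 2/lam**2 with p taken from eq. (34)."""
    p = mp.sqrt(2*E*(1 - lam**2*E/2))
    return mp.gamma(j + 1 - 1j*alpha/p) / mp.gamma(j + 1 + 1j*alpha/p)

E, j, alpha = mp.mpf('0.3'), 0, mp.mpf(1)
for lam in (mp.mpf('0.5'), mp.mpf('0.1'), mp.mpf('0.001')):
    s = s_nc(E, j, alpha, lam)
    print(f"lambda = {lam}: |s| = {abs(s)}, s = {s}")   # |s| = 1 on the cut

# commutative limit: the qm formula (30) with k = sqrt(2 E)
k = mp.sqrt(2*E)
print(mp.gamma(j + 1 - 1j*alpha/k) / mp.gamma(j + 1 + 1j*alpha/k))
```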
the interval corresponding to the scattering, $E \in (0, 2/\lambda^2)$, is mapped onto the branch cut in the $p$-plane as follows: the energy interval $E \in (0, 1/\lambda^2)$ maps on the upper edge of the branch cut $p \in (0, 1/\lambda)$, whereas $E \in (1/\lambda^2, 2/\lambda^2)$ maps on the lower edge of the branch cut $p \in (0, 1/\lambda)$.

3.4. coulomb bound states as poles of the s-matrix

we will begin with a brief reminder of the qm case: supposing the potential is attractive ($\alpha > 0$), the s-matrix (30) has poles in the upper complex $k$-plane for
$$k_n = i\,\frac{\alpha}{n}, \qquad n = j + 1,\, j + 2,\, \ldots\,. \qquad (42)$$
it is obvious that the bound state energy levels correspond to the poles of the s-matrix:
$$E_n = -\frac{\alpha^2}{2n^2}, \qquad n = j + 1,\, j + 2,\, \ldots\,. \qquad (43)$$
in ncqm there is an analogy: the poles of the s-matrix occur in the case of an attractive potential ($\alpha > 0$) for some special values of energy below 0. however, poles can also be found in the case of a repulsive potential ($\alpha < 0$) for particular values of energy above $2/\lambda^2$.

(1.) poles of the s-matrix for an attractive potential:
$$p_{\lambda n} = i\,\frac{\alpha}{n},\ \alpha > 0 \iff E^{I}_{\lambda n} = \frac{1}{\lambda^2}\Big(1 - \sqrt{1 + (\lambda\alpha/n)^2}\Big) < 0, \qquad n = j + 1,\, j + 2,\, \ldots\,. \qquad (44)$$
in the limit $\lambda \to 0$ this coincides with the standard self-energies (43). let us denote
$$\kappa_n = \frac{\lambda\alpha}{n}, \qquad \omega^{I}_n = \frac{\kappa_n - \sqrt{1 + \kappa_n^2} + 1}{\kappa_n + \sqrt{1 + \kappa_n^2} - 1}. \qquad (45)$$
then the solution (35) is
$$R^{I}_{nj} = (\omega^{I}_n)^N\, F\big({-n}, -N, 2j + 2;\, -2\kappa_n(\omega^{I}_n)^{-1}\big). \qquad (46)$$
it is integrable, since $\omega^{I}_n \in (0, 1)$ for positive $\kappa_n$ and under the given conditions the hypergeometric function is a polynomial.

(2.) poles of the s-matrix for a repulsive potential:
$$p_{\lambda n} = i\,\frac{\alpha}{n},\ \alpha < 0 \iff E^{II}_{\lambda n} = \frac{1}{\lambda^2}\Big(1 + \sqrt{1 + (\lambda\alpha/n)^2}\Big) > 2/\lambda^2, \qquad n = j + 1,\, j + 2,\, \ldots\,. \qquad (47)$$
now (35) has the form
$$R^{II}_{nj} = (-\omega^{II}_n)^N\, F\big({-n}, -N, 2j + 2;\, 2\kappa_n(\omega^{II}_n)^{-1}\big), \qquad (48)$$
where
$$\omega^{II}_n = -\frac{\kappa_n + \sqrt{1 + \kappa_n^2} + 1}{\kappa_n - \sqrt{1 + \kappa_n^2} - 1}. \qquad (49)$$
the definition of $\kappa_n$ is the same as in (45) (note that it is negative this time). since $\omega^{II}_n = \omega^{I}_n \in (0, 1)$, the solution (35) is integrable because the hypergeometric function terminates, as in the previous case. these states disappear from the hilbert space in the limit $\lambda \to 0$.

4. conclusions

in this paper we have investigated coulomb scattering in ncqm in the framework of the model formulated in [9] and [10]. as the model is exactly solvable, we were able to find an exact formula for the nc coulomb scattering matrix $S^\lambda_j(E)$ in the $j$-th partial wave. we found that it turns out to have remarkable non-perturbative aspects:

(1.) an energy cut-off: the scattering is restricted to the energy interval $0 < E < E_{\rm crit} = 2/\lambda^2$;

(2.) $S^\lambda_j(E)$ has two sets of poles in the complex energy plane: as expected, the poles at negative energies $E^{I}_{\lambda n} < 0$ for the attractive coulomb potential, which reduce to the standard h-atom bound state energies in the commutative limit $\lambda \to 0$, and poles at ultra-high energies $E^{II}_{\lambda n} = E_{\rm crit} - E^\lambda_n > E_{\rm crit}$ for the repulsive coulomb potential, which disappear for $\lambda \to 0$.

in [10] these results have been confirmed by a direct solution of the corresponding nc radial schrödinger equation. the analytic properties of $S^\lambda_j(E)$ in the complex energy plane indicate that the aspects of causality in ncqm, within the investigated model, are fully consistent with the standard rôle and interpretation of s-matrix poles. in [14] we constructed the laplace-runge-lenz vector $A^\lambda_k(E)$, $k = 1, 2, 3$, within our nc coulomb model.
we found that at fixed energy level e < 0 and e > ecrit the model shows dynamical so(4) symmetry with two sets of bound states for e = eiλn and e = eiiλn, whereas for the scattering regime 0 < e < ecrit = 2/λ2 the dynamical symmetry is so(3, 1), exactly as in the standard coulomb problem. in conclusion we claim that the investigated coulomb nc qm model, although containing unexpected non-perturbative features, is fully consistent with the usual qm postulates and interpretation. references [1] a. connes, publ. ihes 62 (1986) 257; a. connes, noncommutative geometry (academic press, london, 1994). [2] m. dubois-violete, c. r. acad. sci. paris 307 (1988) 403; m. dubois-violete, r. kerner and j. madore, j. math. phys. 31 (1990) 316. [3] s. doplicher, k. fredenhagen and j. f. roberts, comm. math. phys. 172 (1995) 187. [4] m. m. sheikh-jabbari, phys.lett. b425 (1998) 48-54, v. schomerus, jhep 9906 (1999) 030; n. seiberg and e. witten, jhep 9909 (1999) 97. [5] m. chaichian, a. demichev, p. prešnajder, m.m. sheikh-jabbari and a. tureanu, phys. lett. b527 (2002) 149; h. falomir, j. gamboa, m. loewe and j. c. rojas, phys. rev. d66 (2002) 045018; m. chaichian, miklos langvik, shin sasaki and anca tureanu, phys. lett. b666 (2008) 199; [6] m. chaichian, m.m. sheikh-jabbari and a. tureanu, phys. rev. lett. 86 (2001) 2716; m. chaichian, m. m. sheikh-jabbari and a. tureanu, eur. phys. j. c36 (2004) 251; t. c. adorno, m. c. baldiotti, m. chaichian, d. m. gitman and a. tureanu, phys. lett. b682 (2009) 235. [7] f. g. scholtz, b. chakraborty, j. goaverts, s vaidya, j. phys. a: math. theor. a40 (2007) 14581; j. d. thom, f. g. scholtz, j. phys. a: math. theor. a42 (2009) 445301. [8] h. grosse, c. klimčík and p. prešnajder, int. j. theor. phys. 35 (1996) 231. [9] v. gáliková, p. prešnajder, j. phys.: conf. ser. 343 (2012) 012096. [10] v. gáliková, p. prešnajder, coulomb problem in nc quantum mechanics: exact solution and non-perturbative aspects, arxiv:1302.4623; j. math. phys. 54 (2013) 052102. [11] a. b. hammou, m. lagraa and m. m. sheikh-jabbari, phys.rev. d66 (2002) 025025; e. batista and s. majid, j. math. phys. 44 (2003) 107. [12] l. schiff, quantum mechanics, new york 1955. [13] h. bateman, higher transcendental functions, vol. 1, (mc graw-hill book company, 1953). [14] v. gáliková, s. kováčik, p. prešnajder, laplace-runge-lenz vector for coulomb problem in nc quantum mechanics, (2013) arxiv:1309.4614v1. 
acta polytechnica doi:10.14311/ap.2014.54.0240, 54(3):240–247, 2014. © czech technical university in prague, 2014. available online at http://ojs.cvut.cz/ojs/index.php/ap

in-cylinder mass flow estimation and manifold pressure dynamics for state prediction in si engines

wojnar sławomir, boris rohaľ-ilkiv∗, šimončič peter, honek marek, csambál jozef

slovak university of technology in bratislava, faculty of mechanical engineering, institute of automation, measurement and applied informatics, námestie slobody 17, 812 31 bratislava, slovakia
∗ corresponding author: boris.rohal-ilkiv@stuba.sk

abstract. the aim of this paper is to present a simple model of the intake manifold dynamics of a spark ignition (si) engine and its possible application for estimation and control purposes. we focus on pressure dynamics, which may be regarded as the foundation for estimating future states and for designing model predictive control strategies suitable for maintaining the desired air fuel ratio (afr). the flow rate measured at the inlet of the intake manifold and the in-cylinder flow estimation are considered as parts of the proposed model. in-cylinder flow estimation is crucial for engine control, where an accurate amount of aspired air forms the basis for computing the manipulated variables. the solutions presented here are based on the mean value engine model (mvem) approach, using the speed-density method. the proposed in-cylinder flow estimation method is compared to measured values in an experimental setting, while one-step-ahead prediction is illustrated using simulation results.

keywords: combustion engine, afr control, pressure dynamics, in-cylinder flow estimation, state prediction.

figure 1. twc conversion window.

1. introduction

the use of three-way catalytic converters (twc) in exhaust after-treatment systems of combustion engines is essential for reducing emissions to government-established standards. however, to achieve maximum efficiency of the twc, the engine has to be cycled between lean and rich operating regimes (see fig. 1) [1]. not only the emission standards, but also the power and fuel consumption of a spark ignition (si) engine need to be taken into account to satisfy consumer demands. to reach maximum power, the engine has to work with a rich mixture (air fuel ratio (afr) ≈ 12.6), producing pollutants such as hydrocarbons and carbon monoxide. however, if the engine works in regimes often described as minimal consumption (afr ≈ 15.4), it will produce nitrogen oxides (see fig. 2).

figure 2. power and consumption depending on afr.

the mutual dependence between power, consumption and afr has made afr controllers more and more important, as they balance acceptable fuel economy with satisfactory power output, while minimizing the environmental impact [2, 3]. because of the highly nonlinear dynamics of internal combustion engines, standard electronic control units (ecus) cannot always produce the desired accurate control performance of the afr. ongoing research on afr controllers includes solutions utilizing various advanced techniques, e.g. robust control [4], sliding mode control [5–7], or adaptive control [8–10]. in addition to these approaches, some proposed afr controllers use advanced model-based approaches [1, 11–17]. since the quality of model-based control depends strongly on accuracy, it is essential to create models that are as dependable as possible.

figure 3. simplified schematics of an si engine.

this paper deals with modeling the air path, with a focus on in-cylinder air mass flow estimation and intake flow measurements. several works have analyzed estimations of the in-cylinder flow, enabling the air mass aspired into the combustion chamber to be estimated [18–20]. a state-of-the-art approach is to estimate the individual in-cylinder mass flow using periodic observers and takagi-sugeno design [21] or precise estimation of the injector characteristics [22]. however, certain modern algorithmic solutions are inherently complex, causing possible issues with real-time implementation, especially with predictive control. an alternative straightforward approach enables us to estimate the in-cylinder flow by applying the well-known and widely used mean value engine model (mvem) and the speed-density method. as this approach is computationally less expensive, it will be used throughout this paper.

the aim of this paper is to build a simple model that can be used as a basis for calculating predictions and thus for the design of an afr predictive controller, asynchronous with the engine events. many works deal with the air path dynamics of naturally aspirated engines or turbo-charged engines, see e.g. [18, 23] and [24, 25]. unlike works describing sampling synchronous with the engine events [26], we define a model describing pressure dynamics working asynchronously to engine events. as a result, the sampling of the proposed model is asynchronous. due to its simplicity, the in-cylinder mass flow estimation technique presented here utilizes direct differentiation; a possible extension to usual observer design techniques can be found e.g. in [27]. as the test engine utilized in the experiment does not feature variable valve timing (vvt) or exhaust gas recirculation (egr), the proposed solutions are shown without considering these effects. moreover, because a practical analysis would be troublesome, this work does not consider the effect of thermal transients.

this paper is organized in the following fashion. in section 2, the dynamics of the air path is considered and its connection with the intake manifold and the in-cylinder flows are presented.
section 3 describes the structure of the model predictions and the experimental computation of the in-cylinder flow estimation compared to measurement. certain preliminary experimental and simulation results are presented in section 4.

2. model of the air system

the model for the air fuel ratio consists of two independent parts: the air and the fuel system of the si engine. however, only the air system is taken into consideration in this paper. the model is conceptually illustrated in fig. 3. the variables have the following meaning: the $W$ variables denote the mass flows, $W_i$ into the intake manifold (through the throttle), $W_{ie}$ from the intake manifold to the engine, $W_{ex}$ from the engine to the exhaust manifold, $W_x$ out of the exhaust manifold, $W_{ix}$ from the egr to the intake manifold, and $W_{xi}$ from the exhaust manifold to the egr. $T_i, p_i$ describe the temperature and pressure in the intake manifold, and $T_x, p_x$ denote the temperature and the pressure in the exhaust manifold.

2.1. intake manifold pressure dynamics

the dynamics of the pressure in the intake manifold can be described by an equation based on the ideal gas law, internal energy and enthalpy equations [18, 23]. applying the ideal gas law gives us:
$$p(t)V = n(t)\tilde{R}T(t), \qquad (1)$$
where $p$ is pressure, $V$ is volume, $n$ is the amount of substance in moles, $\tilde{R}$ is the ideal gas constant (denoted with a tilde to keep the notation uniform), and $T$ is the gas temperature (in kelvins). since it is physically easier to measure mass, the number of moles $n$ is not a suitable variable for further computation, and we will express it as mass $m$ divided by the molar mass $M$:
$$n(t) = \frac{m(t)}{M}. \qquad (2)$$

figure 4. input, states and outputs in an intake manifold model.

in this way we can transform (1) into:
$$p(t)V = m(t)\frac{\tilde{R}}{M}T(t) = m(t)RT(t). \qquad (3)$$
the internal energy of a system is equal to
$$U(t) = c_v T(t)m(t), \qquad (4)$$
where $c_v$ is the specific heat at a constant volume. if one analyzes the intake manifold system presented in fig. 4, substitutes $T(t)m(t)$ in (4) with (3) and assumes that $V$ is equal to the manifold volume $V_m$, then we obtain:
$$U(t) = c_v\,\frac{p(t)V_m}{R}. \qquad (5)$$
the fundamental equation describing the physics of gases assumes that $R = c_p - c_v$ and $\kappa = c_p/c_v$. then $c_v/R$ in (5) is equal to $\frac{c_v}{c_p - c_v} = \frac{1}{c_p/c_v - 1} = \frac{1}{\kappa - 1}$, and (5) turns into:
$$U(t) = \frac{1}{\kappa - 1}\,p(t)V_m. \qquad (6)$$
in the next step, the enthalpy of the system is described as
$$\dot H_i(t) = c_p \dot m_i(t)T_i(t) = c_p W_i(t)T_i(t), \qquad (7)$$
$$\dot H_{ie}(t) = c_p \dot m_{ie}(t)T_{ie}(t) = c_p W_{ie}(t)T_{ie}(t), \qquad (8)$$
where $c_p$ is the specific heat at a constant pressure. applying the change of the internal energy gives:
$$\frac{d}{dt}U(t) = \dot H_i(t) - \dot H_{ie}(t) + \dot Q(t). \qquad (9)$$
if one substitutes equations (7) and (8) into (9), and assumes that there is no heat transfer through the walls of the intake manifold ($\dot Q(t) = 0$ in fig. 4):
$$\frac{1}{\kappa - 1}\,\dot p(t)V_m = c_p W_i(t)T_i(t) - c_p W_{ie}(t)T_{ie}, \qquad (10)$$
and after transformation:
$$\dot p(t) = \frac{c_p(\kappa - 1)}{V_m}\big(W_i(t)T_i(t) - W_{ie}(t)T_{ie}\big), \qquad (11)$$
where $c_p(\kappa - 1) = c_p\kappa - c_p = c_p\kappa - c_v\kappa = (c_p - c_v)\kappa = R\kappa$. after substituting $c_p(\kappa - 1) = R\kappa$ into (11), we obtain the dynamic equation describing the pressure in the intake manifold:
$$\dot p(t) = \frac{R_{\rm air}\kappa}{V_m}\big(W_i(t)T_i(t) - W_{ie}(t)T_{ie}\big). \qquad (12)$$
in the further sections, both the flow measured at the inlet of the intake manifold, $W_i$, and the in-cylinder flow, $W_{ie}$, will be considered.
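before moving on, equation (12) can be illustrated with a few lines of python. the sketch below is an added example with placeholder constants (the manifold volume, temperatures and flows are assumed, not taken from the test engine); it advances (12) with the same explicit euler scheme that the paper later uses in eq. (28).

```python
# illustrative constants; placeholder values, not those of the test engine
R_AIR = 287.0        # J/(kg K)
KAPPA = 1.4          # poisson's constant of air
V_M   = 3.0e-3       # manifold volume (m^3), assumed
T_I = T_IE = 300.0   # temperatures (K), assumed constant here

def pressure_step(p, w_i, w_ie, dt):
    """one explicit euler step of eq. (12):
    dp/dt = (R_air * kappa / V_m) * (W_i*T_i - W_ie*T_ie)."""
    return p + dt * R_AIR * KAPPA / V_M * (w_i * T_I - w_ie * T_IE)

p = 50e3  # initial manifold pressure (Pa)
for k in range(5):
    p = pressure_step(p, w_i=0.012, w_ie=0.010, dt=0.001)  # flows in kg/s, assumed
    print(f"k = {k + 1}: p = {p / 1e3:.2f} kPa")
```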
figure 5. bode diagram for the transfer function $G(s)$.

2.2. air mass flow measurement

in this section we take into account the input to the differential equation given by eq. (12), described as the flow measured at the inlet of the intake manifold, $W_i$, or the flow through the throttle. the dynamics of the air flow mass meter was modeled by the following equation [18, 20]:
$$\mathit{MAF}_i(t) = \mathit{MAF}_{md}(t) + \tau\,\frac{d}{dt}\mathit{MAF}_{md}(t), \qquad (13)$$
where $\mathit{MAF}_i$ is the actual air mass flow, $\mathit{MAF}_{md}$ is the measured air mass flow, and $\tau$ is a time lag equal to about 15 ms. a disadvantage of this approach is that the presented estimation may amplify the high-frequency noise present in the sensor measurements [7]. since eq. (13) describes first-order system dynamics with a time constant equal to $\tau$, the system can be described by the transfer function
$$G(s) = \frac{1}{0.015\,s + 1}. \qquad (14)$$
if one analyzes the bode diagram (see fig. 5) of the system in (14), one may notice that at frequencies higher than 8 hz (50 rad/s) the signal amplitude starts to be attenuated. a frequency equal to 8 hz appears in the intake manifold when the rotation speed of the engine is twice as fast as the frequency registered in the manifold; this is caused by the fact that during one turn of the crankshaft (360 deg) two intake strokes take place. this means that a frequency of 8 hz appears at an engine speed equal to 1000 revolutions per minute (rpm). the corresponding phase attenuation is equal to 45 deg (see fig. 5). since a typical gasoline engine usually works at speeds higher than 1000 rpm, the phase is decreased further and (as can be seen in fig. 5) it can reach about −55 deg. note that the air mass flow into the manifold, $W_i$ (see fig. 3), is in this case measured with the afm. however, it can also be expressed as a function of the throttle position and of the difference between the pressure in front of the throttle and behind it:
$$W_i = A_{th}\,\frac{p_a}{R_{\rm air}T_a}\,\sqrt{\frac{2\kappa}{\kappa - 1}\left[\left(\frac{p_{im}}{p_a}\right)^{\frac{2}{\kappa}} - \left(\frac{p_{im}}{p_a}\right)^{\frac{\kappa - 1}{\kappa}}\right]}, \qquad (15)$$
where the ‘im’ indexes refer to the intake manifold conditions and the ‘a’ indexes refer to the ambient conditions. $A_{th}$ is the flow section through the throttle and is a function of the throttle angle, $A_{th} = f(\alpha)$, while $\kappa$ is poisson’s constant.

2.3. cylinder flow computation

in-cylinder flow estimation is based on the idea of using the reading from the manifold absolute pressure (map) sensor in the flow computation [18, 19, 24, 28]. a map sensor is mounted closer to the cylinders (valves) than the afm sensor, which suppresses the influence of air accumulation in the manifold. as a result, the in-cylinder flow can be estimated more accurately, especially in transient regimes. the basis for flow estimation [3] is eq. (16):
$$\dot m_{mix}(t) = \rho_{in}(t)\,\dot V_d(t) = \rho_{in}(t)\,\eta_v(p, \mathit{rev})\,\frac{V_d}{60\,n}\,N_e(t), \qquad (16)$$
where $\dot m$ will from now on be denoted by $W$, $\rho_{in}$ is the density of the gas at the intake of the engine, $\eta_v$ is the volumetric efficiency, $V_d$ is the displaced volume, $N_e$ is the engine speed and $n$ is the number of revolutions per cycle ($n = 2$ for four-stroke engines). the volumetric efficiency depends on engine speed and load (expressed as pressure).
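as a short aside before the derivation continues: the compensation (13) is easy to realize in discrete time with a backward difference. the sketch below is an added illustration (the sampling period and the signal samples are made up; only $\tau = 15$ ms comes from the text).

```python
def compensate_afm(maf_md, tau=0.015, dt=0.001):
    """invert eq. (13), MAF_i = MAF_md + tau * d(MAF_md)/dt, using a
    backward difference; tau = 15 ms as quoted in the text, dt assumed.
    as noted above, the derivative amplifies sensor noise, so in practice
    the measured signal would be low-pass filtered first."""
    maf_i = [maf_md[0]]
    for k in range(1, len(maf_md)):
        maf_i.append(maf_md[k] + tau * (maf_md[k] - maf_md[k - 1]) / dt)
    return maf_i

# tiny usage example with made-up samples (kg/s)
print(compensate_afm([0.010, 0.011, 0.013, 0.016]))
```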
if we use density ρin(t) in the form ρin(t) = pin(t) rintin(t) (17) and substitute it into (16), we obtain: wmix(t) = ηv(p, rev) pin(t) rintin(t) vd 60n ne(t), (18) where p is absolute air pressure, r is the ideal gas constant, t is gas temperature (in kelvins) and the ‘in’ indexes refer to the conditions at the intake of the engine. note that the gas density ρin in (17) refers to the conditions at the inlet of the intake manifold. in a real experiment we can measure the pressure and the temperature in the manifold. these refer to the manifold density ρm, which can be expressed as ρm(t) = pm(t) rmtm(t) , (19) where the ‘m’ indexes refer to conditions in the manifold. if one rewrites eq. (18) and substitutes ρin with ρm and aims to have wmix unchanged, then one has to change the meaning of volumetric efficiency ηv. thus, now the new volumetric efficiency is related to the manifold conditions, and is marked as η̃v. as is shown in fig. 9, the aspired mixture consists of air and fuel, so it is necessary to compute the gas constant r and the temperature t related to the mixture: rmix = (ṁair · rair) + (ṁfuel · rfuel) ˙mair + ṁfuel (20) tmix = (ṁair · tair · cpair) + (ṁfuel · tfuel · cp,fuel) (ṁair · cpair) + (ṁfuel · cp,fuel) (21) where: cair and cp,fuel are the specific heat of air and fuel at constant pressure. if we substitute equations (20), (21) into (18), subtract the fuel flow (wf) from the air flow (18) and take the changed volumetric efficiency into consideration, we obtain the equation describing the in-cylinder flow: wie(t) = η̃v(p, rev) pm(t) rmixtmix(t) vd 60n ne(t) − wf (t). (22) the fuel flow that appears in eq.(22) is defined as: wf = iir · if, (23) where if is the characteristic injector flow and iir is the ratio of the injection length to the intake stroke length: iir = (pulse width) (intake length) . (24) 2.4. volumetric efficiency estimation volumetric efficiency η̃v is a crucial variable, which influences the flow into the cylinders. it describes how much air can be aspired into the chamber in comparison with the volume of the combustion chamber. it is a nonlinear parameter and is a function of the engine speed and load (η̃v(rev, load)) [3]. the goal of this section is to describe the estimation routine of this factor. since the air in the intake manifold has a dynamic character, the air mass flow to intake manifold wi does not have to be equal to the cylinder air flow wie in the transient regime (see fig. 3). however, in steady states, these two flows are equal because there is no other source charging the manifold with air, and there is no other outflow. thus, it is possible to substitute in eq. (22) the measurement from the afm (wi) by the estimated in-cylinder flow (wie): wi(t) = η̃v(p, rev) pm(t) rmixtmix(t) vd 60n ne(t) − wf (t) (25) we have obtained an equation with one unknown variable — η̃v. this can be inserted on the left side of the equation and thus the volumetric efficiency can be computed as a dependance of afm-measurement, map sensor reading, manifold temperature and engine speed: η̃v(p, rev) = 60 ( wi(t) − wf (t) ) rmixtmix(t)n pm(t)vdne(t) (26) 243 s. wojnar, b. rohaľ-ilkiv, p. šimončič et al. acta polytechnica preprint 2014-06-19 vol. 00 no. 0/0000 example of an article with a long title 0 1000 2000 3000 4000 5000 0 50 100 0 0.2 0.4 0.6 0.8 v o lu m et ri c effi ci en cy absolute manifold pressure (kpa) engine speed (rpm) 11 figure 6. measured volumetric efficiency map. 
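as an added illustration of how an estimated volumetric-efficiency map like fig. 6 can be used, the sketch below stores $\tilde\eta_v$ on a regular speed × pressure grid and reads it back through bilinear interpolation inside the flow formula (22). all numeric values are placeholders, not the measured map; the grid spacing of 500 rpm × 10 kpa follows the description given next.

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

# placeholder grid in the 500 rpm x 10 kPa spacing mentioned below;
# the values are random stand-ins, not the measured map of fig. 6
speed_grid = np.arange(1000, 5001, 500)        # rpm
press_grid = np.arange(20e3, 101e3, 10e3)      # Pa
eta_map = 0.5 + 0.3 * np.random.default_rng(0).random(
    (speed_grid.size, press_grid.size))

eta_v = RegularGridInterpolator((speed_grid, press_grid), eta_map)

def cylinder_flow(p_m, n_e, t_mix, r_mix, v_d=1.4e-3, n=2, w_f=0.0):
    """eq. (22): W_ie = eta_v * p_m/(R_mix T_mix) * V_d/(60 n) * N_e - W_f;
    v_d = 1.4 l matches the displacement quoted below, the rest is assumed."""
    eta = eta_v((n_e, p_m)).item()   # bilinear lookup in the 2d map
    return eta * p_m / (r_mix * t_mix) * v_d / (60 * n) * n_e - w_f

print(cylinder_flow(p_m=60e3, n_e=2500, t_mix=300.0, r_mix=287.0))
```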
a two dimensional map of the volumetric efficiency estimated this way can be used in transient regimes to compute the in-cylinder flow. in dependence on changing revolutions and loads, the volumetric efficiency also changes and influences the estimated flow (see fig. 6). the presented map shows the volumetric efficiency η̃v estimated in every 500 revolutions and every 10 kpa. of course, the volumetric efficiency maps of different engines differ. if one needs a more accurate map, one may estimate it with higher resolution (for example every 5 kpa, or every 250 revolutions), or use higher resolution in areas with a strongly nonlinear character, and lower resolution in less nonlinear areas. the volumetric efficiency map presented in fig. 6 is based on a measurement which was carried out on a test bench with an si combustion engine made by volkswagen (1.4 liter volume, 16 valves, 55 kw power). 2.5. summary of models in sections 2.2 and 2.3, both components of the dynamic model were considered. if we substitute eqs. (22) and (13) into (12), the pressure dynamics can be expressed as ṗ(t) = rairti(t)κ vm ( wi(t) + τ d dt wi(t) ) − η̃v(p, rev)p(t) rairtie(t) rmixtmix(t) vd vm ne(t) 2 . (27) since we want to apply the equation in a discrete time control system, we are interested in its discrete time implementation. to solve this problem, we have used the typical euler numerical procedure for solving ordinary differential equations. the solution looks as follows: p(k) = p(k − 1) + δt ( rairti(k)κ vm ( wi(k − 1) + τ wi(k−1)−wi(k−2) δt ) − η̃v(p, rev)p(k − 1) rairtieκ rmixtmix vd vm ne(k) 2 − wf (k) ) , (28) where δt is the sampling period, k is the sample number, p is the system state and wi is the input. the analysis of (28) shows that the pressure in the intake manifold can be simulated depending on the past pressure values, input — flow through the afm, and further variables, such as temperature and engine speed. 3. prediction with the model the dynamics of the air system presented in the previous sections offers the possibility to predict future states of the system. it is obvious that equation (28) can be rewritten in the form of a one-step-ahead prediction, that is: p̂(k + 1) = p(k) + δt ( rairti(k) vm ( wi(k) + τ wi(k)−wi(k−1) δt ) − η̃v(p, rev)p(k) rairtie(k) rmixtmix(k) vd vm ne(k) 2 − wf (k) ) , (29) where pressure p(k) and other variables measured at a discrete sample time k enable future states p̂(k + 1) to be predicted. if one wants to increase the prediction accuracy, it is possible to define a model of the temperature and the engine speed and insert them into eq. (29). since we have focused on the air path dynamics, we have used a simplification assuming that the engine speed and the temperatures are constant. the many-step-ahead prediction is then defined as p̂(k + i + 1) = p̂(k + i) + δt ( rairti(k) vm ( wi(k) + τ wi(k)−wi(k−1) δt ) −η̃v(p, rev)p̂(k+i) rairtie(k) rmixtmix(k) vd vm ne(k) 2 −wf (k) ) . (30) the predicted pressure can then be used in air flow computations. if we substitute p̂(k + 1) in eq. (22) we obtain: ŵie(k + i) = η̃v(p, rev) p̂m(k + i) rmixtmix(k) vd 60n ne(k) − wf (k). (31) supposing that the engine speed is constant, the intake stroke length can be easily computed. if we integrate flows ŵie(k + i) on a horizon (k to k + i), (k + i to k + 2i), (k + ni to k + (n + 1)i) and so on (see fig. 
7), it will be possible to compute the mass of air aspired to the combustion chamber in future intake strokes, $\hat m(n)$ (where $i$ is the determined amount of prediction, and $n$ is the number of predicted intake strokes).

figure 7. air mass flow prediction on a desired amount of assumed intake strokes.

the following equation describes the vector of predicted masses:
$$\hat m(n) = \sum_{i=0}^{\mathit{SET}} \hat W_{ie}(n \cdot \mathit{SET} + i)\,T_{pred}, \qquad (32)$$
where $T_{pred} = \mathit{IT}/\mathit{SET}$ is the time equal to an intake stroke time divided by the number of prediction steps, $\mathit{IT}$ is the intake stroke time, and $\mathit{SET}$ is a parameter which changes the accuracy of the computations (defining the number of samples in one predicted intake stroke). according to this definition, the model can be computed asynchronously to the engine events, with sampling periods independent of the engine speed. however, in every computation period a constant number of future intakes (or aspired masses) is predicted. note that $n$ varies from 0 to the maximum number of intake strokes that are predicted, and $i$ varies from 0 to the maximum number of integration steps. note also that the volumetric efficiency changes with the following iteration steps. these changes are not accurate, because the volumetric efficiency depends on the engine speed and pressure, and we predict only the pressure changes. importantly, the volumetric efficiency is the main source of nonlinearity that influences both the pressure dynamics and the air mass flow into the chamber.

4. experimental and numerical results

the in-cylinder flow estimation was evaluated experimentally on the test bench with a volkswagen si engine. fig. 8 compares the air flow measured in the inlet pipe with the afm ($W_i$) with the air flow obtained by the speed-density calculation ($W_{ie}$).

figure 8. measurement with the afm is denoted by a dashed line, the computed air mass flow by a full line.

figure 9. schematic representation of the intake manifold.

note that since the actual flow into the engine cannot be measured in a straightforward way, the quality of the estimation cannot be independently judged. in a steady state, when no other source is charging the manifold with air and no other outflow appears, the accuracy of the estimation is obvious. the intake manifold has a relatively big volume, which acts as an air accumulator (see fig. 9). after the throttle opens, it has to be filled with air, and only then can air be aspired into the cylinder. a difference therefore appears between $W_i$ and $W_{ie}$ in the transient regime. in this regime, the air accumulation also has to be taken into consideration. since the aim of this paper is to define a model which can be utilized as a basis for predicting system states, a one-step-ahead prediction of the pressure is presented here. figure 10 shows the simulated pressure with a full line, and the predicted pressure with a dotted line.

figure 10. one-step-ahead prediction of the pressure.
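the prediction scheme of eqs. (29)–(32) is compact enough to state in code. the following sketch is an added illustration, not the authors' implementation: it composes the in-cylinder flow (22) with the pressure dynamics (12), freezes the engine speed, the temperatures, $W_i$ and $\tilde\eta_v$ over the horizon (the simplification adopted in section 3), and uses placeholder constants; the intake stroke time is assumed to correspond to half a crankshaft revolution.

```python
R_AIR, KAPPA = 287.0, 1.4            # air constants
V_M, V_D, N = 3.0e-3, 1.4e-3, 2      # assumed volumes; 4-stroke engine

def predict_masses(p0, w_i, n_e, t_i, t_ie, t_mix, r_mix, eta_v,
                   strokes=3, set_=10, w_f=0.0):
    """aspired-mass prediction, eqs. (29)-(32): engine speed, temperatures,
    w_i and eta_v are frozen over the horizon, as assumed in section 3.
    the intake stroke time is taken as half a crankshaft revolution (assumed)."""
    it = (60.0 / n_e) / 2.0          # intake stroke time IT (s), assumed
    dt = it / set_                   # T_pred = IT / SET, cf. eq. (32)
    p, masses = p0, []
    for _ in range(strokes):
        m = 0.0
        for _ in range(set_):
            w_ie = eta_v * p / (r_mix * t_mix) * V_D / (60 * N) * n_e - w_f  # eq. (22)
            p += dt * R_AIR * KAPPA / V_M * (w_i * t_i - w_ie * t_ie)        # eq. (12)/(30)
            m += w_ie * dt                                                   # eq. (32)
        masses.append(m)
    return masses

print(predict_masses(p0=55e3, w_i=0.012, n_e=2500, t_i=300.0,
                     t_ie=300.0, t_mix=300.0, r_mix=287.0, eta_v=0.7))
```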
Prediction of the aspired air masses in future strokes is not presented here. Since the prediction algorithm runs asynchronously with the engine events, it is difficult to compare its results with the mass of air actually aspired into the chamber. Moreover, it would be interesting to analyze the future aspired masses in a transient regime. The underlying problem is that the revolutions are not exactly constant in a real test-run, and a prediction gap appears, caused by the assumption that the revolutions are constant throughout the horizon (see Sect. 3). In the simulations, however, the predictions are completely accurate, similar to the prediction presented in Fig. 10.

Figure 10. One-step-ahead prediction of the pressure.

5. Conclusion

A model of the intake manifold dynamics has been presented in this work, and its usage in control and simulation has been described. A comparison of the intake and in-cylinder flows measured on a test bench with a real SI engine has also been presented. An important topic for future research will be to apply the model presented here in a model-based AFR predictive control scheme. Further efforts will focus on using the preliminary results achieved in the works [12] and [13] in designing an AFR predictive controller. This paper deals only with an engine model without an exhaust gas recirculation (EGR) valve or a turbocharger. Turbocharged engines have become especially interesting and might be an attractive research topic. The presented air system (see Fig. 3) would have to be extended with equations describing the exhaust manifold conditions (Wexm(t) and ṗexm(t)). The flow through the EGR valve and the turbocharger would then influence the intake manifold conditions (Winm(t) and ṗinm(t)), and the model presented here would have to be reformulated. Some of these issues are currently being investigated by the authors.

Acknowledgements

The investigation presented in this paper was supported by the Slovak grant agency APVV, under projects APVV-0090-10, LPP-0096-07 and LPP-0075-09. The support is gratefully appreciated.

References

[1] K. R. Muske, J. C. P. Jones. Feedforward/feedback air fuel ratio control for an automotive catalyst. In Proceedings of the 2003 American Control Conference, pp. 1386-1391. Denver, Colorado, USA, 2003. doi:10.1109/acc.2003.1239784.
[2] P. Andersson. Intake air dynamics on a turbocharged SI-engine with wastegate. Master's thesis, Linköping University, Sweden, 2002.
[3] L. Guzzella, C. Onder. Introduction to Modeling and Control of IC Engine Systems. Springer, 2010. doi:10.1007/978-3-642-10775-7.
[4] C. W. Vigild, K. P. H. Andersen, E. Hendricks. Towards robust H-infinity control of an SI-engine's air/fuel ratio. SAE Technical Paper 1999-01-0854, 1999. doi:10.4271/1999-01-0854.
[5] J. K. Pieper, R. Mehrotra. Air/fuel ratio control using sliding mode methods. In American Control Conference, pp. 1027-1031. San Diego, CA, USA, 1999. doi:10.1109/acc.1999.783196.
[6] J. K. H. J. S. Souder. Adaptive sliding mode control of air-fuel ratio in internal combustion engines. International Journal of Robust and Nonlinear Control 14(6):525-541, 2004. doi:10.1002/rnc.901.
[7] S. W. Wang, D. L. Yu. A new development of internal combustion engine air-fuel ratio control with second-order sliding mode. Journal of Dynamic Systems, Measurement and Control 129(6):757-766, 2007. doi:10.1115/1.2789466.
[8] S. W. Wang, D. L. Yu. Adaptive air-fuel ratio control with MLP network. International Journal of Automation and Computing 2(2):125-133, 2005. doi:10.1007/s11633-005-0125-y.
[9] K. R. Muske, J. C. P. Jones, E. M. Franceschi. Adaptive analytical model-based control for SI engine air-fuel ratio. IEEE Transactions on Control Systems Technology 16(4):763-768, 2008. doi:10.1109/tcst.2007.912243.
[10] Y. Yildiz, A. Annaswamy, D. Yanakiev, I. Kolmanovsky. Spark ignition engine fuel-to-air ratio control: an adaptive control approach. Control Engineering Practice 18:1369-1378, 2010. doi:10.1016/j.conengprac.2010.06.011.
[11] K. R. Muske, J. C. P. Jones. A model-based SI engine air-fuel ratio controller. In Proceedings of the 2006 American Control Conference, pp. 3284-3289. Minneapolis, Minnesota, USA, 2006. doi:10.1109/acc.2006.1657224.
[12] T. Polóni et al. Multiple ARX model-based air-fuel ratio predictive control for SI engines. In 3rd IFAC Workshop on Advanced Fuzzy and Neural Control. Valenciennes, France, 2007. doi:10.3182/20071029-2-fr-4913.00013.
[13] T. Polóni et al. Modeling of air-fuel ratio dynamics of gasoline combustion engine with ARX network. Journal of Dynamic Systems, Measurement and Control, Transactions of the ASME 130(6):061009/1-061009/10, 2008. doi:10.1115/1.2963049.
[14] E. Alfieri, A. Amstutz, L. Guzzella. Gain-scheduled model-based feedback control of the air/fuel ratio in diesel engines. Control Engineering Practice 17:1417-1425, 2009. doi:10.1016/j.conengprac.2008.12.008.
[15] L. del Re, F. Allgöwer, L. Glielmo, et al. (eds.). Automotive Model Predictive Control: Models, Methods and Applications, vol. 402 of Lecture Notes in Control and Information Sciences. Springer-Verlag Berlin Heidelberg, 2010. doi:10.1007/978-1-84996-071-7.
[16] S. Behrendt, P. Dunow, P. Lampe. An application of model predictive control to a gasoline engine. In M. Fikar, M. Kvasnica (eds.), Proceedings of the 18th International Conference on Process Control, pp. 57-63. Hotel Titris, Tatranská Lomnica, Slovakia, 2011.
[17] H. C. Wong, P. K. Wong, C. M. Vong. Model predictive engine air-ratio control using online sequential relevance vector machine. Journal of Control Science and Engineering, p. 15, 2012. Hindawi Publishing Corporation. doi:10.1155/2012/731825.
[18] A. A. Stotsky, I. Kolmanovsky. Application of input estimation techniques to charge estimation and control in automotive engines. Control Engineering Practice 10:1371-1383, 2002. doi:10.1016/s0967-0661(02)00101-6.
[19] J. Desantes, J. Galindo, C. Guardiola, V. Dolz. Air mass flow estimation in turbocharged diesel engines from in-cylinder pressure measurement. Experimental Thermal and Fluid Science 34(1):37-47, 2012. doi:10.1016/j.expthermflusci.2009.08.009.
[20] J. Grizzle, J. Cook, W. Milam. Improved cylinder air charge estimation for transient air fuel ratio control. In Proceedings of the American Control Conference, vol. 2, pp. 1568-1573. Baltimore, USA, 1994. doi:10.1109/acc.1994.752333.
[21] H. Kerkeni, J. Lauber, T. Guerra. Estimation of individual in-cylinder air mass flow via periodic observer in Takagi-Sugeno form. In Vehicle Power and Propulsion Conference (VPPC), IEEE, pp. 1-6. Lille, France, 2010. doi:10.1109/vppc.2010.5729154.
[22] L. Benvenuti, M. Di Benedetto, S. Di Gennaro, A. Sangiovanni-Vincentelli. Individual cylinder characteristic estimation for a spark injection engine. Automatica 39(7):1157-1169, 2003. doi:10.1016/s0005-1098(03)00077-3.
[23] A. A. Stotsky. Automotive Engines: Control, Estimation, Statistical Detection. Springer-Verlag, Berlin Heidelberg, 2009. doi:10.1007/978-3-642-00164-2.
[24] M. Jung. Mean-value modelling and robust control of the airpath of a turbocharged diesel engine. Ph.D. thesis, Sidney Sussex College, Department of Engineering, University of Cambridge, Cambridge, UK, 2003.
[25] P. Andersson, L. Eriksson. Air-to-cylinder observer on a turbocharged SI-engine with wastegate. In Proceedings of SAE 2001 World Congress, 2001-01-0262, pp. 33-40. Detroit, MI, USA, 2001. doi:10.4271/2001-01-0262.
[26] T. Jimbo, Y. Hayakawa. A physical model for engine control design via role state variables. Control Engineering Practice 19(3):276-286, 2011. doi:10.1016/j.conengprac.2010.03.008.
[27] O. Barbarisi, A. Di Gaeta, L. Glielmo, S. Santini. An extended Kalman observer for the in-cylinder air mass flow estimation. In Proceedings of the MECA02 International Workshop on Diagnostics in Automotive Engines and Vehicles, pp. 1-14. University of Salerno, Fisciano (SA), Italy, 2002.
[28] S. Di Cairano, D. Yanakiev, A. Bemporad, et al. Model predictive powertrain control: an application to idle speed regulation, pp. 183-194. Springer, 2010. doi:10.1007/978-1-84996-071-7_12.

The Search for Dark Matter with Gamma-Rays: A Review
Aldo Morselli (INFN Roma Tor Vergata; corresponding author: aldo.morselli@roma2.infn.it)
Acta Polytechnica 53(Supplement):545-549, 2013. doi:10.14311/ap.2013.53.0545

Abstract. Successfully launched in June 2008, the Fermi Gamma-ray Space Telescope, formerly named GLAST, has been observing the high-energy gamma-ray sky with unprecedented sensitivity in the 20 MeV to 300 GeV energy range, and electrons plus positrons in the 7 GeV to 1 TeV range, opening a new observational window on a wide variety of astrophysical objects.
Keywords: gamma rays, gamma-ray detectors, dark matter.

1. Introduction

The Fermi observatory carries two instruments on board: the Gamma-ray Burst Monitor (GBM) [1] and the Large Area Telescope (LAT) [2]. The GBM, sensitive in the energy range between 8 keV and 40 MeV, is designed to observe the full unocculted sky with rough directional capabilities (at the level of one to a few degrees) for the study of transient sources, particularly gamma-ray bursts (GRBs). The LAT is a pair-conversion telescope for photons from 20 MeV up to a few hundred GeV. The field of view is ~2.4 sr, and the LAT observes the entire sky every ~3 hours (2 orbits). These features make the LAT a great instrument for dark matter (DM) searches. The operation of the instrument through the first three years of the mission was smooth at a level that is probably beyond the most optimistic pre-launch expectations. The LAT has been collecting science data for more than 99% of the time spent outside the South Atlantic Anomaly (SAA); the remaining tiny fractional downtime accounts for both hardware issues and detector calibrations [3, 4]. More than 650 million gamma-ray candidates (i.e. events passing the background rejection selection) have been made public and distributed to the community through the Fermi Science Support Center (FSSC, available at http://fermi.gsfc.nasa.gov/ssc). Over the first three years of the mission, the LAT collaboration has put considerable effort into achieving a better understanding of the instrument and of the environment in which it operates. In addition, a continuous effort has been made to publish the advances as soon as possible. In August 2011, the first new event classification since launch (Pass 7) was released, along with the corresponding instrument response functions. Compared with the pre-launch (Pass 6) classification, it features a greater and more uniform exposure, with a significant enhancement in acceptance below 100 MeV.

2. The second Fermi-LAT catalog

The high-energy gamma-ray sky is dominated by diffuse emission: more than 70% of the photons detected by the LAT are produced in the interstellar space of our Galaxy by interactions of high-energy cosmic rays with matter and low-energy radiation fields. An additional diffuse component with an almost isotropic distribution (and therefore thought to be extragalactic in origin) accounts for another significant fraction of the LAT photon sample. The rest consists of various types of point-like or extended sources: active galactic nuclei (AGN) and normal galaxies, pulsars and their relativistic wind nebulae, globular clusters, binary systems, shock waves remaining from supernova explosions, and nearby solar-system bodies like the Sun and the Moon. The second Fermi-LAT catalog (2FGL) [5] is the deepest catalog ever produced in the energy band between 100 MeV and 100 GeV. Compared to the first Fermi-LAT catalog (1FGL) [6], it features several significant improvements: it is based on data from 24 (vs. 11) months of observations and makes use of the new Pass 7 event selection. The energy flux map is shown in Fig. 1, and the sky distribution of the 1873 sources is shown in Fig. 2. It is interesting to note that 127 sources are firmly identified, based either on periodic variability (e.g. pulsars), on spatial morphology, or on correlated variability. In addition, 1170 sources are reliably associated with sources known at other wavelengths, while 576 (i.e. 31% of the total number of entries in the catalog) are still unassociated.
3. Indirect dark matter searches

One of the major open issues in our understanding of the Universe is the existence of an extremely weakly interacting form of matter, dark matter (DM), supported by a wide range of observations including large-scale structures, the cosmic microwave background and the isotopic abundances resulting from primordial nucleosynthesis. Complementary to the direct searches being carried out in underground facilities and at accelerators, the indirect search for DM is one of the main items in the broad Fermi science menu. The word indirect denotes here the search for signatures of weakly interacting massive particle (WIMP) annihilation or decay processes through the final products (gamma rays, electrons and positrons, antiprotons) of such processes. Among many other ground-based and space-borne instruments, the LAT plays a prominent role in this search through a variety of distinct search targets: gamma-ray lines, the galactic and isotropic diffuse gamma-ray emission, dwarf satellites, and CR electrons and positrons.

Figure 1. Sky map of the energy flux derived from 24 months of observations. The image shows the γ-ray energy flux for energies between 100 MeV and 10 GeV, in units of 10⁻⁷ erg cm⁻² s⁻¹ sr⁻¹.

Figure 2. Full sky map (top) and blow-up of the inner galactic region (bottom) showing sources by source class (AGN, pulsars, pulsar wind nebulae, globular clusters, starburst galaxies, high-mass binaries, galaxies, SNRs, novae, and sources with no or only a possible association). Identified sources are shown with a red symbol, associated sources in blue.

3.1. Galactic center

The galactic center (GC) is expected to be the strongest source of γ-rays from DM annihilation, due to its coincidence with the cusped part of the DM halo density profile [7-9].

Figure 3. Spectra from the likelihood analysis of the Fermi-LAT data (number of counts vs. reconstructed energy) in a 7°×7° region around the galactic center.

Figure 4. Residuals ((exp. data − model)/model) of the above likelihood analysis. The blue area shows the systematic errors on the effective area.

A preliminary analysis of the data taken during the first 11 months of Fermi satellite operations is presented in [10, 11] and is shown in Figs. 3 and 4. The diffuse gamma-ray backgrounds and discrete sources, as we know them today, can account for a large majority of the detected gamma-ray emission from the galactic center. Nevertheless, a residual emission is left that is not accounted for by the above models [10, 11]. Improved modeling of the galactic diffuse emission, as well as the potential contribution from other astrophysical sources (e.g. unresolved point sources), could provide a better description of the data. Analyses are underway to investigate these possibilities.

3.2. Dwarf galaxies

Dwarf spheroidal galaxies (dSphs) of the Milky Way are among the cleanest targets for indirect dark matter searches in gamma rays. They are systems with a very large mass-to-luminosity ratio (i.e. systems which are largely DM dominated). The LAT detected no significant emission from any of these systems, and the upper limits on the γ-ray flux allowed us to put very stringent constraints on the parameter space of well-motivated WIMP models [12].
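For orientation, flux upper limits of this kind translate into cross-section limits through the standard annihilation-flux relation Φ = ⟨σv⟩ J N_γ / (8π m_χ²), with J the line-of-sight integral of the squared DM density. The Python sketch below inverts this relation with purely illustrative numbers; the J-factor, mass and photon yield are assumptions, not values from the analyses cited here.

```python
import math

def sigmav_upper_limit(phi_ul, j_factor, m_chi, n_gamma):
    """Invert the standard annihilation-flux relation
        Phi = <sigma v> * J * N_gamma / (8 * pi * m_chi^2).
    phi_ul   : flux upper limit (photons cm^-2 s^-1)
    j_factor : integral of rho^2 along the line of sight (GeV^2 cm^-5)
    m_chi    : WIMP mass (GeV)
    n_gamma  : photons per annihilation above threshold
    Returns the <sigma v> upper limit in cm^3 s^-1."""
    return 8.0 * math.pi * m_chi**2 * phi_ul / (j_factor * n_gamma)

# Hypothetical numbers of a plausible order for a dwarf spheroidal:
print(sigmav_upper_limit(phi_ul=1e-10, j_factor=1e19,
                         m_chi=100.0, n_gamma=20.0))
# ~1e-25 cm^3 s^-1, to be compared with the thermal 3e-26 cm^3 s^-1
```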
A combined likelihood analysis of the 10 most promising dwarf galaxies, based on 24 months of data and pushing the limits below the thermal WIMP cross section for low DM masses (below a few tens of GeV), has recently been performed [14]. The derived 95% C.L. upper limits on the WIMP annihilation cross section for different channels are shown in Fig. 5. The most generic cross section (~3×10⁻²⁶ cm³ s⁻¹ for a purely s-wave cross section) is plotted as a reference. These results are obtained for NFW profiles [13], but for a cored dark matter profile the J-factors of most of the dSphs would either increase or not change much, so these results include the J-factor uncertainties [14].

Figure 5. Derived 95% C.L. upper limits on the WIMP annihilation cross section for different channels.

Figure 6. Predicted 95% C.L. upper limits on the WIMP annihilation cross section in 10 years for the bb̄ channel.

Figure 7. MSSM models in the (m_WIMP, ⟨σv⟩) plane. The models are consistent with all accelerator constraints; red points have a neutralino thermal relic abundance corresponding to the inferred cosmological dark matter density (blue points have a lower thermal relic density, and we assume that neutralinos still comprise all of the dark matter by virtue of additional non-thermal production processes).

With the present data, we are able to rule out large parts of the parameter space where the thermal relic density is below the observed cosmological dark matter density and WIMPs are dominantly produced non-thermally, e.g. in models where supersymmetry breaking occurs via anomaly mediation (see Fig. 7 for the MSSM, updated from [12]). These γ-ray limits also constrain some WIMP models proposed to explain the Fermi-LAT and PAMELA e⁺e⁻ data, including low-mass wino-like neutralinos and models with TeV masses pair-annihilating into muon-antimuon pairs. Future improvements (apart from increased amounts of data) will include an improved event selection with a larger effective area and photon energy range, and the inclusion of more satellite galaxies. Figures 6 and 7 show the predicted upper limits under the hypothesis of ten years of data instead of two; thirty dSphs instead of ten (supposing that the new optical surveys will find new dSphs); and a spatial extension analysis (source extension increases the signal region at high energy, E ≥ 10 GeV, m ≥ 200 GeV). Other complementary limits were obtained with the search for possible anisotropies generated by the DM halo substructures [15], the search for dark matter satellites [16] or for dark matter in the galactic halo [17], and a search for high-energy cosmic-ray electrons from the Sun [18].

3.3. Gamma-ray lines

A line at the WIMP mass, due to the 2γ production channel, could be observed as a feature in the astrophysical source spectrum [9]. Such an observation would be a "smoking gun" for WIMP DM, as it is difficult to explain by any process other than WIMP annihilation or decay, and the additional presence of a feature due to annihilation into γZ would be even more convincing. Up to now, however, no significant evidence of gamma-ray lines has been found, neither in the first 11 months of data, between 30 and 200 GeV [19], nor in the first two years of data, from 7 to 200 GeV [20] (see Fig. 8).

Figure 8. Dark matter annihilation 95% C.L. cross-section upper limits into γγ for the NFW, Einasto and isothermal profiles, for the region |b| > 10° plus a 20°×20° square at the GC.
Work is ongoing to extend the energy range of the analysis and to include more data. Recently, the claim of an indication of line emission in Fermi-LAT data [21, 22] has drawn considerable attention. Using an analysis technique similar to [19], but doubling the amount of data and also optimizing the region of interest for signal over square root of background, [21] found a (trial-corrected) 3.2σ significant excess at a mass of ~130 GeV which, if interpreted as a signal, would correspond to a cross section of about ⟨σv⟩ ~ 10⁻²⁷ cm³ s⁻¹. The signal is concentrated on the galactic centre, with a spatial distribution consistent with an Einasto profile [23]. This is marginally compatible with the upper limit presented in [20]. The main problems are the limited statistics of the GC sample and the need to check for any systematic effect that could mimic the line. A new version of the instrument response functions (IRFs), called Pass 8, is foreseen soon from the Fermi-LAT collaboration. With this new analysis software, the efficiency of the instrument at high energy should increase, and the systematic effects should be under better control.

3.4. The cosmic-ray electron spectrum

Recently, the experimental information available on the cosmic-ray electron (CRE) spectrum has been dramatically expanded by a high-precision measurement of the electron spectrum from 7 GeV to 1 TeV [24, 25]. The spectrum shows no prominent spectral features, and it is significantly harder than that inferred from several previous experiments (see Fig. 9). More recently, we provided further, and stronger, evidence of the positron anomaly with a direct measurement of the absolute e⁺ and e⁻ spectra, and of their fraction, between 20 and ~200 GeV, using the Earth's magnetic field (see Fig. 9). A steady rise of the positron fraction was observed up to that energy, in agreement with the PAMELA result.

Figure 9. Energy spectra for e⁺, e⁻, and e⁺ + e⁻ (control region). In the control region, where both species are allowed, this analysis reproduces the Fermi-LAT results reported previously for the total electron plus positron spectrum [24, 25]. Previous results from HEAT [27] and PAMELA [26] are shown for reference. The bottom panel shows that the ratio between the sum and the control flux is consistent with 1, as expected.

Figure 10. Positron fraction measured by the Fermi-LAT and by other experiments [26-28]. The Fermi statistical uncertainty is shown with error bars, and the total (statistical plus systematic) uncertainty is shown as a shaded band.

In the same energy range, the e⁻ spectrum was fitted with a power law of index γ(e⁻) = −3.19 ± 0.07. This is in agreement with what was recently measured by PAMELA between 1 and 625 GeV [26]. Most importantly, Fermi-LAT for the first time measured the e⁺ spectrum in the 20-200 GeV energy interval (see Fig. 10). The e⁺ spectrum is fitted by a power law of index γ(e⁺) = −2.77 ± 0.14.
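As a toy illustration of extracting such a spectral index, a simple log-log least-squares fit to flux points can be sketched as follows. This is only a sketch: the published indices come from full likelihood fits, and the normalization and points below are invented.

```python
import numpy as np

def power_law_index(energy, flux):
    """Least-squares slope in log-log space for J(E) ~ E^gamma;
    a crude stand-in for the likelihood fits used in the papers."""
    slope, _ = np.polyfit(np.log(energy), np.log(flux), 1)
    return slope

# Synthetic spectrum with gamma = -2.77 (illustrative, not Fermi data):
e = np.logspace(np.log10(20.0), np.log10(200.0), 15)   # GeV
j = 1e-4 * e ** -2.77
print(power_law_index(e, j))   # recovers -2.77
```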
These measurements seem to rule out the standard scenario, in which the bulk of the electrons reaching the Earth in the GeV-TeV energy range originate in supernova remnants (SNRs), and only a small fraction of secondary positrons and electrons come from the interaction of CR nuclei with the interstellar medium (ISM). An additional electron plus positron component peaked at ~1 TeV seems necessary for a consistent description of all the available data sets. The temptation to claim the discovery of dark matter from the detection of electrons from the annihilation of dark matter particles is strong, but there are competing astrophysical sources, such as pulsars, that can give a strong flux of primary positrons and electrons (see [29] and references therein). At energies between 100 GeV and 1 TeV, the electron flux reaching the Earth may be the sum of an almost homogeneous and isotropic component produced by galactic supernova remnants and the local contribution of a few pulsars, with the latter expected to contribute more and more significantly as the energy increases. If a single pulsar makes the dominant contribution to the extra component, a large anisotropy and small bumpiness should be expected; if several pulsars contribute, the opposite scenario is expected. So far, no positive detection of CRE anisotropy has been reported by the Fermi-LAT collaboration, but some stringent upper limits have been published [30], and the pulsar scenario is still compatible with these upper limits. Forthcoming experiments like AMS-02 and CALET are expected to reduce drastically the uncertainties on the propagation parameters by providing more accurate measurements of the spectra of the nuclear components of CRs. Fermi-LAT and these experiments are also expected to provide more accurate measurements of the CRE spectrum and anisotropy, looking for features which may give a clue to the nature of the extra component.

4. Conclusions

Fermi reached four years in orbit in June 2012, and it is definitely living up to expectations in terms of the scientific results delivered to the community. The mission is planned to continue for at least four more years (likely more), with many remaining opportunities for discoveries.

Acknowledgements

The Fermi-LAT collaboration acknowledges support from a number of agencies and institutes both for the development and for the operation of the LAT, as well as for scientific data analysis. These include NASA and DOE in the United States, CEA/IRFU and IN2P3/CNRS in France, ASI and INFN in Italy, MEXT, KEK and JAXA in Japan, and the K. A. Wallenberg Foundation, the Swedish Research Council and the National Space Board in Sweden. Additional support from INAF in Italy and CNES in France for science analysis during the operations phase is also gratefully acknowledged.

References

[1] C. Meegan et al., AIP Conference Proceedings 662 (2003) 469.
[2] W. B. Atwood et al. [Fermi Coll.], ApJ 697(2) (2009) 1071-1102 [arXiv:0902.1089].
[3] M. Ackermann et al. [Fermi Coll.], Astroparticle Physics 35 (2012) 346-353 [arXiv:1108.0201].
[4] M. Ackermann et al. [Fermi Coll.], ApJS 203 (2012) 4 [arXiv:1206.1896].
[5] A. Abdo et al. [Fermi Coll.], ApJS 199 (2012) 31 [arXiv:1108.1435].
[6] A. Abdo et al. [Fermi Coll.], ApJS 188 (2010) 405 [arXiv:1002.2280].
[7] A. Morselli et al., Nucl. Phys. B 113 (2002) 213.
[8] A. Cesarini, F. Fucito, A. Lionetto, A. Morselli, P. Ullio, Astroparticle Physics 21 (2004) 267 [astro-ph/0305075].
[9] E. Baltz et al., JCAP 07 (2008) 013 [arXiv:0806.2911].
[10] V. Vitale and A. Morselli for the Fermi-LAT collaboration, 2009 Fermi Symposium [arXiv:0912.3828].
[11] A. Morselli, B. Cañadas, V. Vitale, Il Nuovo Cimento C 34(3) (2011) [arXiv:1012.2292].
[12] A. Abdo et al. [Fermi Coll.], ApJ 712 (2010) 147-158 [arXiv:1001.4531].
[13] J. Navarro, J. Frenk, S. White, ApJ 462 (1996) 563 [arXiv:astro-ph/9508025].
[14] M. Ackermann et al. [Fermi Coll.], Phys. Rev. Lett. 107 (2011) 241302 [arXiv:1108.3546].
[15] M. Ackermann et al. [Fermi Coll.], Phys. Rev. D 85 (2012) 083007 [arXiv:1202.2856].
[16] M. Ackermann et al. [Fermi Coll.], ApJ 747 (2012) 121 [arXiv:1201.2691].
[17] M. Ackermann et al. [Fermi Coll.], ApJ, submitted [arXiv:1205.6474].
[18] M. Ajello et al. [Fermi Coll.], Phys. Rev. D 84 (2011) 032007 [arXiv:1107.4272].
[19] A. Abdo et al. [Fermi Coll.], Phys. Rev. Lett. 104 (2010) 091302 [arXiv:1001.4836].
[20] M. Ackermann et al. [Fermi Coll.], Phys. Rev. D 86 (2012) 022002 [arXiv:1205.2739].
[21] C. Weniger, JCAP 1208 (2012) 007 [arXiv:1204.2797].
[22] M. Su and D. P. Finkbeiner, arXiv:1206.1616.
[23] T. Bringmann and C. Weniger, arXiv:1208.5481.
[24] A. A. Abdo et al. [Fermi Coll.], Phys. Rev. Lett. 102 (2009) 181101 [arXiv:0905.0025].
[25] M. Ackermann et al. [Fermi Coll.], Phys. Rev. D 82 (2010) 092004 [arXiv:1008.3999].
[26] O. Adriani et al. [PAMELA Coll.], Phys. Rev. Lett. 106 (2011) 201101.
[27] M. A. DuVernois et al. [HEAT Coll.], ApJ 559 (2001) 296.
[28] M. Aguilar et al. [AMS Coll.], Physics Reports 366 (2002) 331.
[29] D. Grasso, S. Profumo, A. W. Strong, L. Baldini, et al., Astroparticle Physics 32 (2009) 140 [arXiv:0905.0636].
[30] M. Ackermann et al. [Fermi Coll.], Phys. Rev. D 82 (2010) 092003 [arXiv:1008.5119].

Energy Efficiency of Planar Discharge for Industrial Applications
Jakub Kelar, Jan Čech, Pavel Slavíček (Dept. of Physical Electronics, Masaryk University, Kotlarska 2, 611 37 Brno, Czech Republic; corresponding author: jakub.kelar@gmail.com)
Acta Polytechnica 55(2):109-112, 2015. doi:10.14311/ap.2015.55.0109

Abstract. Diffuse coplanar surface barrier discharge has proven its capabilities as an industry-ready plasma source for fast, in-line and efficient plasma treatment at atmospheric pressure. One parameter required by industry is the energy efficiency of the device. In this paper, we present the energy efficiency of the whole plasma system, and we investigate possible sources of errors.

Keywords: dielectric barrier discharge; DCSBD; electrical efficiency; power estimation; error analysis.

1. Introduction

Diffuse coplanar surface barrier discharge (DCSBD) is a special type of dielectric barrier discharge [1] that uses a planar parallel arrangement of electrodes [2, 3] embedded in a ceramic-plate dielectric barrier, with dielectric cooling oil.
DCSBD is a non-isothermal discharge [4, 5] which is able to generate macroscopically homogeneous plasma at atmospheric pressure in air and also in other working gases (O2, N2, CO2, H2, H2O, etc.) [6-8]. The planar geometry and the macroscopically homogeneous plasma generated as a thin plasma layer (approx. 0.3 mm) make DCSBD an appropriate plasma source for treating planar surfaces with a roughness below approx. 1 mm. DCSBD has been used with positive results for treating non-woven textiles [9], wood [10], silicon wafers [11], various foils and fibers [12-14], glass [7], metals [8] and bio-materials [15]. In the work presented here, the energy efficiency of DCSBD was estimated in order to improve the electrical circuit parameters of the DCSBD device, if necessary. The energy efficiency was determined as the ratio of the power-to-plasma estimate and the measured electric power consumption of the power supply unit. The power-to-plasma estimates were based on direct power integration from the current and voltage measurements. The energy efficiency was measured in the power range from 100 to 400 W. The oscilloscope parameters and the measurement set-up were also investigated as possible sources of errors in the power estimates.

2. Experimental procedure

2.1. Discharge source

The DCSBD device can be used in various geometric configurations. The dielectric barrier can be a flat or bent plate of 96% alumina ceramic, and the electrode system can consist of electrodes of various thicknesses or with various electrode gap widths. For the measurements presented here, the planar DCSBD device had flat dielectrics with electrodes 1.5 mm in thickness and 1 mm spacing between the electrodes. Fig. 1 shows the simplified cross-section of the DCSBD device: the periodic structure of high-voltage (HV) and grounded electrodes, the dielectric cooling oil and the flat ceramic barrier are depicted.

Figure 1. Simple scheme of the DCSBD device: plasma layer, dielectric plate, cooling oil and electrodes on high voltage (HV).

Figure 2. Image of the DCSBD discharge generated in air at a power input of 400 W. The discharge area is about 8×20 cm².

The DCSBD discharge generated in air at atmospheric pressure and at an input power of 400 W at 15 kHz is shown in Fig. 2. A visually homogeneous periodic structure of the discharge across the electrodes can be seen.

2.2. Electric parameter measurements and experimental set-up

For the measurements of the electrical efficiency of DCSBD, a two-channel digital oscilloscope with a bandwidth of 100 MHz, a maximal memory depth of 14 Mpts and a sampling rate of up to 2 GSa/s (Rigol DS2102) was used. For the high-voltage measurements, a Tektronix P6015A probe with a 1:1000 ratio was used; a Pearson Current Monitor 4100 current probe with a 1:1 ampere-to-volt ratio was used for the current measurements. A simple scheme of the experimental set-up is given in Fig. 3.

Figure 3. Experimental set-up of the DCSBD power efficiency measurement.

3. Results and discussion

In this work, the energy efficiency of the DCSBD device was calculated as the quotient of the power-to-plasma, estimated from the high-voltage and current measurements using direct power integration, and the measured input power of the high-voltage power supply. A typical example of the current and voltage waveforms used for the power-to-plasma estimation is given in Fig. 4, in which the capacitive and discharge currents (blue) can be distinguished.
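Although the authors' Python/MATLAB scripts are not reproduced in the paper, the direct power integration they describe can be sketched in a few lines of Python. The waveform below is synthetic and purely illustrative; a real measurement would use the recorded probe data.

```python
import numpy as np

def discharge_power(t, u, i):
    """Power-to-plasma by direct integration of the sampled voltage
    and current waveforms: P = (1/T) * integral of u(t)*i(t) dt.
    t, u, i are equally sampled arrays covering whole periods."""
    energy = np.trapz(u * i, t)      # energy over the record (J)
    return energy / (t[-1] - t[0])   # average power (W)

# Synthetic check: a purely capacitive load (90 degree shift) gives ~0 W.
t = np.linspace(0.0, 10 / 15e3, 20000)        # ten 15 kHz periods
u = 8e3 * np.sin(2 * np.pi * 15e3 * t)        # voltage (V)
i = 0.05 * np.cos(2 * np.pi * 15e3 * t)       # current (A)
print(discharge_power(t, u, i))               # close to zero
```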
The results of the energy efficiency calculations are presented in Fig. 5. The calculation of the power-to-plasma was performed using a custom-made Python or MATLAB script. The corresponding error of estimation, obtained from the law of error propagation, was determined to be approx. 6%.

Figure 4. Typical current and voltage waveforms of DCSBD operated at an input power of approx. 250 W.

Figure 5. Energy efficiency of the DCSBD device for selected input powers of the power supply unit.

In the process of DCSBD power-to-plasma estimation from current-voltage measurements, several sources of errors can be identified when the direct integration method is used. In this paper, three sources of errors are discussed using simulations in the MATLAB computational environment on real measured data of a DCSBD discharge operated at an input power of 250 W (the measured input power of the power supply unit).

The first source of errors was identified in the process of finding the zero phase, i.e., the beginning of the discharge periods. The zero phase of the input voltage defines the interval of integration, so a false estimate could introduce a systematic power estimation error. In Fig. 6, the effect of zero-phase misalignment is shown on the y-axis. It can be seen that even 500 ns of zero-phase misalignment does not induce significant errors in the power-to-plasma estimation. This can be explained by taking into account that the active "discharge" phase starts at about the π/4 phase of the input high voltage.

The second source of errors was identified in the current-voltage phase shift of the recorded waveforms. This phase shift could easily result from unequal lengths of the probe-to-scope cables, or from inaccurate oscilloscope parameters. Fig. 6 shows the effect of a current-voltage phase shift. It can be seen that this phase shift has a measurable effect on the estimated power-to-plasma value: a phase shift on the scale of several tens of ns shifts the estimated power input value by up to several tens of watts. This effect can be explained by the capacitive coupling of the DCSBD. An ideal capacitor introduces a current-voltage phase shift of π/2, so the real power over one period is zero. When an additional phase shift of the current-voltage waveforms is introduced, a phase other than π/2 occurs and the net integrated power consumption becomes non-zero.

A third source of errors was identified in the internal A/D converter of the oscilloscope: the precision, or the noise level, of the sampled signal. This level of conversion noise is inherent in each oscilloscope and cannot easily be lowered.

Figure 6. Absolute error of the power-to-plasma/discharge power estimation using direct numerical integration of the discharge current-voltage waveforms. The errors of estimation resulting from a) a bad estimate of the beginning of the discharge period (labeled as 'zero shift' misalignment) and b) voltage-to-current waveform phase misalignment (i.e., voltage-current phase shift, labeled as 'desync') are given. The zero reference point (the origin) is the power estimated from the raw measured data.

In Fig. 7, the real current-voltage signal from the oscilloscope was salted with uniformly distributed pseudo-random noise of a magnitude of n bits of the A/D converter resolution, and the input power was then re-estimated. The average of 10 salted-signal power estimates is used as one point in the graph.
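A minimal sketch of this salting test follows, assuming an 8-bit converter depth and a uniform noise model (both assumptions; the paper does not state the converter depth used in its simulation).

```python
import numpy as np

def salted_power(t, u, i, bits, fs_u, fs_i, runs=10, seed=0):
    """Re-estimate the power after adding uniform pseudo-random noise
    with an amplitude of `bits` LSBs of an assumed 8-bit full scale
    (fs_u, fs_i) to both waveforms, averaged over `runs` trials,
    mirroring the salting test of Fig. 7."""
    rng = np.random.default_rng(seed)
    amp_u = fs_u / 2**8 * 2**bits      # noise span in volts
    amp_i = fs_i / 2**8 * 2**bits      # noise span in amperes
    estimates = []
    for _ in range(runs):
        u_n = u + rng.uniform(-amp_u / 2, amp_u / 2, u.size)
        i_n = i + rng.uniform(-amp_i / 2, amp_i / 2, i.size)
        estimates.append(np.trapz(u_n * i_n, t) / (t[-1] - t[0]))
    return np.mean(estimates)
```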
It can be seen that when high added noise is present, the absolute error of the power input estimation can reach 0 to 12 W at 250 W of input power (i.e., up to 5% of the value). From the results we can conclude that the power-to-plasma numeric estimation procedure is relatively robust. The artificially introduced current-voltage waveform distortions lead to relatively small errors in the power estimation, of the order of a few percent, even when relatively bad experimental conditions are assumed (i.e., a phase shift < 50 ns and a noise level < 2 bits). This error magnitude is comparable to or less than the assumed measurement errors of the current and voltage waveforms.

Figure 7. Absolute error of the power-to-plasma/discharge power estimation using direct numerical integration of the discharge current-voltage waveforms. The errors of estimation due to the artificial noise (salt) added to the current and/or the voltage waveforms are visualized. The zero reference point (origin) is the power estimated from the raw measured data.

4. Conclusions

In the first part of the experiments presented here, the energy efficiency of the DCSBD device was calculated as the ratio of the power-to-plasma and the input power of the power supply unit. Results have been given for DCSBD operated at a frequency of about 15 kHz and input powers ranging from 100 to 450 W. We found that the DCSBD device has an energy efficiency higher than 85% in the given power input range. In the second part of our study, a power input of 250 W was chosen and the errors of the power-to-plasma estimation were investigated using numerical simulations. The MATLAB computational environment was used to estimate the importance of a precise determination of the beginning of the discharge periods (zero-phase estimation), the influence of the phase shift between the voltage and current measurements, and the sensitivity of the process to pseudo-random noise added to the real measured current and voltage waveforms. We have found that the power estimation process is relatively robust and stable, as the artificially induced power estimation errors were less than or equal to the assumed measurement errors, i.e., of the order of a few percent under a phase shift of < 50 ns and an added noise level of < 2 bits. The influence of the zero-phase estimation was found to be negligible, provided that the phase misplacement is well below π/4.

Acknowledgements

This research has been supported by project CZ.1.05/2.1.00/03.0086 funded by the European Regional Development Fund, by project LO1411 (NPU I) of the Ministry of Education, Youth and Sports of the Czech Republic, by the Technology Agency of the Czech Republic under project no. TA02010412, and by the Czech Science Foundation under project no. GA13-24635S.

References

[1] U. Kogelschatz. Dielectric-barrier discharges: their history, discharge physics, and industrial applications. Plasma Chemistry and Plasma Processing 23(1):1-46, 2003. doi:10.1023/a:1022470901385.
[2] M. Šimor, J. Ráhel', P. Vojtek, et al. Atmospheric-pressure diffuse coplanar surface discharge for surface treatments. Applied Physics Letters 81(15):2716, 2002. doi:10.1063/1.1513185.
non-thermal atmospheric pressure discharges. journal of physics d: applied physics 38(2):r1–r24, 2005. doi:10.1088/0022-3727/38/2/r01. [5] k. h. becker, u. kogelschatz, k. h. schoenbach, r. j. barker (eds.). non-equilibrium air plasmas at atmospheric pressure. institute of physics publishing, bristol, 2004. [6] m. černák, l. černáková, i. hudec, et al. diffuse coplanar surface barrier discharge and its applications for in-line processing of low-added-value materials. the european physical journal applied physics 47(2):22806, 2009. doi:10.1051/epjap/2009131. [7] t. homola, j. matoušek, v. medvecká, et al. atmospheric pressure diffuse plasma in ambient air for ito surface cleaning. applied surface science 258(18):7135–7139, 2012. doi:10.1016/j.apsusc.2012.03.188. [8] v. prysiazhnyi, a. brablec, j. čech, et al. generation of large-area highly-nonequlibrium plasma in pure hydrogen at atmospheric pressure. contributions to plasma physics 54(2):138–144, 2014. doi:10.1002/ctpp.201310060. [9] m. černák, d. kováčik, j. ráhel’, et al. generation of a high-density highly non-equilibrium air plasma for high-speed large-area flat surface processing. plasma physics and controlled fusion 53(12):124031, 2011. doi:10.1088/0741-3335/53/12/124031. [10] j. ráhel’, p. sťahel, m. odrášková. wood surface modification by dielectric barrier discharges at atmospheric pressure. chemické listy 105(s2):s125–s128, 2011. [11] d. skácelová, v. danilov, j. schäfer, et al. room temperature plasma oxidation in dcsbd: a new method for preparation of silicon dioxide films at atmospheric pressure. materials science and engineering: b 178(9):651–655, 2013. doi:10.1016/j.mseb.2012.10.017. [12] a. asadinezhad, i. novák, m. lehocký, et al. an in vitro bacterial adhesion assessment of surface-modified medical-grade pvc. colloids and surfaces b, biointerfaces 77(2):246–56, 2010. doi:10.1016/j.colsurfb.2010.02.006. [13] i. hudec, m. jasso, h. krump, et al. the influence of plasma polymerization on adhesion of polyester cords to rubber matrix. kgk-kautschuk gummi kunststoffe 61(3):95–97, 2008. [14] m. stepankova, j. saskova, j. gregr, j. wiener. using of dscbd plasma for treatment of kevlar and nomex fibers. chemicke listy 102(4, si):s1515–s1518, 2008. [15] m. henselová, u. slováková, m. martinka, a. zahoranová. growth, anatomy and enzyme activity changes in maize roots induced by treatment of seeds with low-temperature plasma. biologia 67(3):490–497, 2012. doi:10.2478/s11756-012-0046-5. 
Annealing of Polycrystalline Thin Film Silicon Solar Cells in Water Vapour at Sub-Atmospheric Pressures
Peter Pikna (a, b, *), Vlastimil Píč (b), Vítězslav Benda (a), Antonín Fejfar (b)
a) Department of Electrotechnology, Czech Technical University, Technicka 2, 166 27 Prague, Czech Republic
b) Institute of Physics, Academy of Sciences of CR, Cukrovarnicka 10, 162 53 Prague, Czech Republic
*) Corresponding author: pikna@fzu.cz
Acta Polytechnica 54(5):341-347, 2014. doi:10.14311/ap.2014.54.0341

Abstract. Thin film polycrystalline silicon (poly-Si) solar cells were annealed in water vapour at pressures below atmospheric pressure. The pn junction of the sample was contacted by measuring probes directly in the pressure chamber filled with steam during passivation. The Suns-Voc method and a lock-in detector were used to monitor the effect of the water vapour on the Voc of the solar cell during the whole passivation process (in situ). The tested temperature of the sample (55-110 °C) was kept constant during the procedure. The open-circuit voltage of a solar cell at these temperatures is lower than at room temperature; nevertheless, the voltage response of the solar cell to the light flash used during the Suns-Voc measurements was well observable. The temperature dependences for multicrystalline wafer-based and polycrystalline thin film solar cells were measured and compared. While no significant improvement of the parameters of the thin film poly-Si solar cells from annealing in water vapour at below-atmospheric pressures has been observed in the past, the in-situ observation proved that the required sensitivity to a changing Voc at elevated temperatures was available during the process.

Keywords: passivation, water vapour, thin film solar cell, polycrystalline silicon (poly-Si), multicrystalline silicon (m-Si), Suns-Voc.

1. Introduction

The efficiency of a solar cell is directly influenced by the semiconductor material that is used. High-quality materials, however, tend to be rather expensive, while cheaper materials are of lower quality. One possible way to deal with this issue is to reduce the number of imperfections and defects in the structure of the solar cell. A sufficient photovoltaic efficiency of thin film solar cells, in connection with a sufficiently low production price, is the key condition for competitiveness on a market dominated by wafer-based solar cells. Plasma hydrogenation is usually used for passivation, i.e., to reduce the electrical activity of recombination centres such as grain boundaries, impurities and crystallographic defects in silicon [1-5]. This relatively expensive passivation procedure could, however, be replaced by a cheaper alternative: passivation in water vapour [6-8].
2. Experimental

The use of passivation by water vapour was investigated on thin film silicon solar cells fabricated by CSG Solar AG, Germany. The structure of the solar cell is shown in Fig. 1 [9]. These solar cells were placed in a stainless steel pressure chamber filled with deionized water, and the chamber was heated to produce water vapour. Fig. 2 shows the passivation apparatus with the solar cell contacted directly in the pressure chamber.

Figure 1. Structure of a thin film polycrystalline silicon solar cell fabricated by CSG Solar AG, without metallization to allow passivation in water vapour [9].

Figure 2. Apparatus comprising the stainless steel pressure chamber, the heater, the xenon flash lamp, the thermocouple, the pyrolytic boron nitride sample heater, and the stainless steel probes contacting the investigated thin film poly-Si solar cell.

The steam pressure in the chamber was regulated by the temperature of the chamber heater according to the water phase diagram [10-12]. The solar cell was heated separately by a pyrolytic boron nitride sample heater [13], supplied by wiring passing into the chamber through feedthroughs. The temperature of the sample was measured using a coaxial thermocouple [13]. The temperature of the cell was at all times kept about 10 °C above the temperature of the steam, in order to prevent condensation on the sample and to ensure passivation strictly in the vapour phase. The steam pressures ranged from 0.01 to 0.1 MPa. The performance of the water vapour passivation was monitored by the Suns-Voc method and by a lock-in detector every 15 minutes for 2 hours during the process. The open-circuit voltage Voc is commonly used as a silicon quality factor: the higher the Voc, the lower the concentration of defects [3]. This key parameter was measured at a light intensity of 1000 W/m² = 1 sun for the Suns-Voc method (a xenon flash used as the light source), and at a light intensity of a white LED determined only approximately to be around 1000 W/m² for the lock-in detector.
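The steam-pressure/temperature coupling used here can be illustrated with a standard saturation-pressure correlation. The Python sketch below uses the Antoine equation with textbook constants for water (an assumption; the authors cite a water phase diagram [10-12] rather than a specific correlation).

```python
def water_saturation_pressure(temp_c):
    """Antoine correlation for water, valid roughly 1-100 degC
    (constants from standard tables; returns pressure in kPa)."""
    a, b, c = 8.07131, 1730.63, 233.426        # for P in mmHg, T in degC
    p_mmhg = 10 ** (a - b / (temp_c + c))
    return p_mmhg * 0.133322                   # mmHg -> kPa

for t in (55, 80, 100):
    print(t, round(water_saturation_pressure(t), 1), "kPa")
# ~15.7 kPa at 55 degC, ~47.3 kPa at 80 degC, ~101.3 kPa at 100 degC,
# spanning the 0.01-0.1 MPa range used in the chamber
```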
3. Suns-Voc method

The principle of the Suns-Voc method is based on a light flash about 10 times longer than the lifetime of the charge carriers in the investigated semiconductor [14]. Free charge carriers are generated in the semiconductor by the absorption of light, and it takes some time (the lifetime of the free charge carrier) before they recombine with oppositely charged carriers. From the point of view of the generated charge carriers, the decrease of the light intensity during the flash is a slow process, so a quasi-steady state of the carrier concentration can be assumed at every moment of the measurement during the light decay. Two parameters, the light intensity and the corresponding Voc, are measured in time, as shown in Fig. 3; they are also presented in Fig. 4, which shows, for better understanding, where the data is needed for a Suns-Voc measurement in comparison with a common current-voltage measurement. In principle, a Suns-Voc measurement can be understood as a sequence of ordinary current-voltage measurements at different light intensities (from 100 W/m² to 1000 W/m² in Fig. 4), where only Voc is noted at the corresponding light intensities. It is very important that the light intensity and the generated solar-cell short-circuit current Isc are linearly proportional. If the current generated by the solar cell at some light intensity is known, e.g., Isc at 1000 W/m² measured in a common current-voltage (I-V) measurement, any light intensity illuminating the solar cell can be recalculated to the current generated by the solar cell, as is done to compare a Suns-Voc measurement with a common I-V measurement, see Fig. 5.

Figure 3. Suns-Voc measurement: the decrease of the light intensity and the corresponding open-circuit voltage Voc of the solar cell in time [15].

Figure 4. Explanation of the Suns-Voc method: two parameters, the light intensity and the corresponding open-circuit voltage (marked by the red ellipses), are measured during the Suns-Voc measurement, and one parameter, the short-circuit current Isc at a light intensity of 1000 W/m² (blue circle), is needed from some other measurement, e.g., a common current-voltage measurement, to recalculate the detected light intensity of the flash into the Isc of the solar cell [16].

Figure 5. Creation of a Suns-Voc current-voltage characteristic from the measured light intensity and the corresponding Voc; comparison of a Suns-Voc measurement with the current-voltage characteristic of the same solar cell measured in the common way [15].

Figure 5 consists of two graphs. The graph on the left-hand side presents a measurement of the light intensity illuminating a solar cell and the corresponding Voc of the solar cell during a light flash around 14 ms in duration. The graph on the right-hand side shows the current-voltage curve calculated from the data measured by the Suns-Voc method, and also explains the differences between this Suns-Voc current-voltage characteristic and the I-V curve of the same solar cell measured in the common way. According to the I-V curves presented there, the Suns-Voc measurement gives better results (efficiency and fill factor). The reason is that the Suns-Voc method omits the influence of the series resistance of the solar cell, because no current is measured, only voltage. An advantage of this method is that no solar cell metallization is needed, just a pn junction. A comparison of a Suns-Voc measurement performed before the metallization step (deposition of the metal electrodes) and a common I-V curve measured after metallization provides information about the quality of this manufacturing procedure.
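The construction of the pseudo I-V curve from a Suns-Voc record can be sketched as follows. The scaling of illumination to current via the 1-sun Isc follows the linearity argument above; the specific pseudo-current formula I = Isc(1 − suns) is the usual Suns-Voc convention and is our assumption here, since the paper does not spell it out.

```python
import numpy as np

def suns_voc_iv(suns, voc, isc_1sun):
    """Build the pseudo I-V curve from a Suns-Voc record.
    suns     : illumination during the flash decay, in units of 1 sun
    voc      : simultaneously measured open-circuit voltage (V)
    isc_1sun : short-circuit current at 1 sun from a normal I-V
               measurement (light intensity and Isc assumed linear).
    The curve is series-resistance-free: no current actually flows."""
    suns = np.asarray(suns, dtype=float)
    current = isc_1sun * (1.0 - suns)        # pseudo current (A)
    return np.asarray(voc, dtype=float), current

# Invented decay record: 1 sun down to 0.1 sun with falling Voc
v, i = suns_voc_iv([1.0, 0.5, 0.1], [0.201, 0.185, 0.150], isc_1sun=0.03)
```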
4. Lock-in detector

A model SR830 DSP lock-in detector was used to measure the open-circuit voltage generated by the tested solar cell during passivation. The open-circuit voltage Voc of a solar cell decreases as its temperature increases; this is presented in Fig. 6 for a tested non-passivated thin film polycrystalline silicon solar cell. This data was measured by the Suns-Voc method, with Voc taken at a light intensity of 1000 W/m² from a xenon flash lamp.

Figure 6. Temperature dependence of the open-circuit voltage Voc for a tested non-passivated thin film polycrystalline silicon solar cell.

According to the measured data, Voc becomes comparable with, or lower than, the parasitic thermo-electric voltage Vth at higher temperatures (Voc ≈ 5 mV, Vth ≈ 23 mV at 150 °C). To measure Voc at elevated temperatures, the generated photovoltage signal therefore has to be amplified or measured selectively by a lock-in detector. The basic idea of this measurement is that the lock-in detector drives a light source (a white LED in our case) at a specific frequency (70 Hz in our case). Simultaneously, the same lock-in detector measures the Voc of the investigated solar cell. The measured Voc has an alternating (AC) character with the same frequency as the light source (70 Hz). The photogenerated AC voltage is thus separated from the parasitic thermo-electric voltage of direct (DC) character, and also from any other parasitic signals, because the lock-in detector measures only signals with the same frequency as it generates (the 70 Hz of the white LED). A selective measurement of the Voc of a tested solar cell can be performed this way, and even very small signals that would normally vanish in the noise can be detected with high accuracy.
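The selective detection described above can be mimicked in software by synchronous demodulation at the LED frequency. The toy sketch below uses invented signal levels matching the ones quoted in the text (a few mV of photovoltage under a ~23 mV thermo-electric offset); the SR830 itself of course does this in hardware.

```python
import numpy as np

def lock_in_amplitude(signal, t, f_ref=70.0):
    """Dual-phase software lock-in: multiply by quadrature references
    at the modulation frequency and average, which rejects the DC
    thermo-electric offset and uncorrelated noise."""
    x = 2.0 * np.mean(signal * np.sin(2 * np.pi * f_ref * t))
    y = 2.0 * np.mean(signal * np.cos(2 * np.pi * f_ref * t))
    return np.hypot(x, y)          # amplitude of the 70 Hz component

# 5 mV photovoltage buried under a 23 mV offset plus random noise:
t = np.linspace(0.0, 1.0, 50000)
v = (0.023 + 0.005 * np.sin(2 * np.pi * 70.0 * t)
     + 0.002 * np.random.randn(t.size))
print(lock_in_amplitude(v, t))     # ~0.005 V recovered
```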
5. results and discussions

the development of the open-circuit voltage of a solar cell passivated at different temperatures (55–110 °c) and steam pressures (0.01–0.1 mpa) is shown in fig. 7. these investigations were carried out using the suns-voc method. according to photovoltaic theory, the average voc decreases with increasing temperature. no clear trend of voc in time at particular temperatures was observed, and the previous expectation of increasing voc during annealing, due to passivation of solar cell defects, was not fulfilled. the changes of voc in time are probably caused by instability of the contacting probes. some details about the process should be noted in order to give a picture of the complexity of the experiments that were performed, and to discuss possible reasons for the negative results that were obtained. voc was measured at sample passivation temperatures higher than room temperature (55–110 °c). the intensity of the measured signal was therefore limited, see fig. 6.

figure 7. open-circuit voltage voc of the thin film poly-si solar cell csg measured in time at a light intensity of 1000 w/m2 for different temperatures by the suns-voc method; legend for the measurements: voc at room temperature before passivation (e.g., 171 mv), water vapour pressure (e.g., 0.01 mpa), sample temperature (e.g., 55 °c) during passivation, and voc at room temperature after passivation (e.g., 172 mv). the voltage voc at room temperature is not depicted in the graph; it was measured at a light intensity of 1 sun by the suns-voc method, as during annealing.

the thermal conductivity of the stainless steel chamber is relatively poor. due to the non-homogeneous temperature distribution in particular parts of the chamber, the temperature gradient changed in time. the influence of the thermo-electrical voltage therefore also has to be taken into account. the thermo-electrical voltage arises in the following way: when two different metals are connected at two points, and the temperatures of these connections differ, a voltage can be measured as the difference in the electrical potential between the two places [17]. a model sr830 dsp lock-in detector with a 70 hz flashing frequency of the white led was used to separate the "ac voc" alternating part of the signal from the "dc voc" direct part (thermo-electrical voltage and illumination background). the "ac voc" part of the signal (fine curves), measured by the lock-in detector, and the "dc voc" part of the signal (bold curves), measured by suns-voc and purged of the parasitic thermo-electrical voltage, are shown in fig. 8. the open-circuit voltage of a solar cell can be measured in direct (dc) mode under constant illumination, e.g., by the suns-voc method, or under alternating (ac) light, e.g., by a lock-in detector.

the measurement of voc under a slowly decaying light flash, used for the suns-voc method, can be understood as a sequence of many separate voc measurements under smaller and smaller constant light intensities (dc mode). if the decay of the light intensity during the light flash is appropriately slow, the semiconductor material has enough time to approach the equilibrium state — a balance between generated and recombined free charge carriers, electrons and holes — and the measurement is called a quasi-steady-state measurement. suns-voc measurements are quasi-steady-state measurements, and the appropriate duration of the light flash is around 10 ms for polycrystalline silicon. another way to measure the voc of a solar cell is under a sequence of relatively fast flashes (duration in the range of µs). the solar cell then generates voc with the same frequency as the frequency of the fast light flashes; i.e., the solar cell generates an ac voltage. to compare the voc of a solar cell measured as a quasi-steady-state measurement (suns-voc method) and as an ac measurement under a sequence of fast light flashes, it is necessary to transfer the ac voltage to dc, or vice versa. in principle, this is similar to comparing the voltage measured in an electricity network (230 v ac) with the dc voltage of an accumulator. in order to compare ac and dc voltages, it is necessary to calculate the effective ac voltage, see [18]. a comparison between the effective alternating part of voc measured by the lock-in detector and the direct signal measured by the suns-voc method purged of parasitic dc signals (thermo-electrical voltage) was performed on the basis of the following recalculation, using voc measured at room temperature. the open-circuit voltage of the sample measured under a slow light flash at room temperature (rt) was 201 mv, and it corresponded to an alternating voltage of 60.7 mv measured by the lock-in detector, also at rt. all data measured by the lock-in detector during passivation was therefore multiplied by a factor f to compare it with the reference "dc voc" of the non-passivated solar cell at rt, equal to 201 mv:

f = dc voc / ac voc = 201 mv / 60.7 mv = 3.31. (1)

figure 8. annealing the thin film solar cell (sample csg 30a) in water vapour at vapour pressures 0.06 and 0.08 mpa. the temperature of the sample was 95 °c (for 0.06 mpa) and 105 °c (for 0.08 mpa). the bold curves represent data measured by the suns-voc method purged of the parasitic thermo-electrical voltage vth. the fine curves are data measured by the lock-in detector, subsequently recalculated into dc voltage according to formula (1). legend: voc at room temperature before passivation (e.g., 201 mv), water vapour pressure (e.g., 0.06 mpa) and sample temperature (e.g., 95 °c) during passivation, and voc at room temperature after passivation (e.g., 193 mv). the voltage voc at room temperature is not depicted in the graph; it was measured at a light intensity of 1 sun by the suns-voc method, as during annealing.

this recalculation allowed a comparison between the "dc voc" data measured by suns-voc and the "ac voc" data measured by the lock-in detector.
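in code, formula (1) reduces to a single scale factor; a minimal sketch with the reference values quoted above (the helper name is ours):

```python
# reference readings of the non-passivated cell at room temperature (from the text)
dc_voc_rt_mv = 201.0   # Suns-Voc, slow flash
ac_voc_rt_mv = 60.7    # lock-in detector, same conditions

f = dc_voc_rt_mv / ac_voc_rt_mv        # formula (1): f = 3.31

def ac_to_dc_equivalent(ac_voc_mv):
    """Rescale a lock-in 'AC Voc' reading [mV] to its 'DC Voc' equivalent."""
    return f * ac_voc_mv

print(f"f = {f:.2f}; 60.7 mV AC -> {ac_to_dc_equivalent(60.7):.0f} mV DC")
```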
factor f = 3.31 also absorbed some inconsistencies between the measurements performed by the lock-in detector and by the suns-voc method: the suns-voc measurement used a xenon flash lamp, but the lock-in detector used a white led, so the intensity and the spectra of the light sources were different; and the lock-in detector expects a sine-shaped signal, whereas the white led flashed with a rectangular waveform. all these discrepancies are included in factor f.

data measured by the suns-voc method and data from the lock-in detector recalculated on the basis of the formula presented above are shown in fig. 8. the fine curves represent data measured by the lock-in detector, recalculated according to formula (1). the bold curves are data measured by the suns-voc method, from which the thermo-electrical voltage vth at the appropriate temperature was subtracted. the vth data was measured in the dark, directly on the contacts used for the voc measurements. the red curves are correlated, with the exception of some points; there is an even better correlation between the blue curves. the discrepancy between the "ac voc" and "dc voc" data is probably due to changes in the thermo-electrical voltage in time, or to mechanical instability of the contacts. another explanation could be an impact of the illumination background. these imperfections should be removed in future experiments; a more exact determination of voc, and a better correlation of "ac voc" and "dc voc", could then be expected.

the temperature dependences of the open-circuit voltage for a polycrystalline thin film solar cell and a multicrystalline wafer-based solar cell are shown in fig. 9. the thermo-electrical voltage vth at different temperatures was measured for both solar cell types (green curves). curves with triangles correspond to the multi-si wafer-based solar cell, and curves with points represent the poly-si thin film data. the blue curves represent data measured by the lock-in detector recalculated on the basis of formula (1), and the red data are from the suns-voc method, including vth.

figure 9. temperature dependences of voc for multicrystalline wafer-based (triangles) and polycrystalline thin film (points) solar cells. red curves represent data measured by the suns-voc method, including the thermo-electrical voltage; blue curves represent data measured by the lock-in detector recalculated on the basis of formula (1). the green curves are measurements of the thermo-electrical voltage. the "dc voc" suns-voc measurements were realized at a light intensity of 1000 w/m2 (xenon flash lamp). the light intensity during the measurements performed by the lock-in detector was around 1 sun (white led).

the results from the measurements of voc and vth for multi-si and poly-si solar cells are summarized in table 1. the voc data at room temperature, the voc decrease with increasing temperature in mv/°c and in relative terms in %/°c, and the thermo-electrical voltage at 150 °c for both solar cell types provide a comparison of the different behaviour of thin film and wafer-based solar cells.

table 1. overview of voc at room temperature, dvoc/dt, dvoc/(voc dt), and the thermo-electrical voltage at 150 °c for multicrystalline and polycrystalline solar cells.

solar cell        voc (mv)   dvoc/dt   dvoc/(voc dt)   vth (mv)
                  (23 °c)    (mv/°c)   (%/°c, 23 °c)   (150 °c)
multi-si ac voc   531        2.8       0.5             0.7
multi-si dc voc   531        2.3       0.4             0.7
poly-si ac voc    218        3.2       1.5             23.1
poly-si dc voc    218        2.6       1.2             23.1

the voc decrease with increasing temperature is stronger for poly-si than for multi-si for both types of measurements (suns-voc, and by the lock-in detector). however, a stronger voc temperature dependence was expected
from the wafer-based solar cell type. the measured thermo-electrical voltage was significantly different for poly-si and multi-si. while vth of the wafer-based solar cell was less than 1 mv at 150 °c, the thin film cell reached more than 23 mv. this big difference can be explained by the influence of the borofloat glass substrate (a thermal insulator) used for poly-si; the temperature gradients were probably bigger in this case.

6. conclusions

polycrystalline silicon thin film solar cells were annealed in water vapour to improve their quality by passivation, in other words by de-activating the recombination centres. the slight improvement in voc detected in some cases is more likely to be a measurement inaccuracy than a real improvement. the tested pressure range was 0.01–0.1 mpa, and the sample temperature ranged from 55 to 110 °c. in spite of the elevated temperatures, the open-circuit voltage voc was well detectable both by the suns-voc method and with the lock-in detector. since no significant improvement was observed in the range of sub-atmospheric pressures, our next experiments will focus on the more promising supra-atmospheric pressures of water vapour. a comparison of the temperature dependences for thin film poly-si solar cells and wafer-based multi-si solar cells has been given. all investigations of water vapour passivation were realized using an in-situ setup: the changing voc of the solar cell was measured directly during annealing in steam. an advantage of this approach is the elimination of time as a passivation parameter; only the steam pressure and the temperature of the process then have to be optimised.

acknowledgements

this research work was supported by sgs grant number sgs13/072/ohk3/1t/13, and by projects av cr m100101216, mpo moremit fr-ti2/736 and msmt lnsm.

references

[1] b. gorka: hydrogen passivation of polycrystalline si thin film solar cells. phd thesis, technische universität berlin, 2010.
[2] l. carnel, i. gordon, k. van nieuwenhuysen, d. van gestel, g. beaucarne, j. poortmans: defect passivation in chemical vapour deposited fine-grained polycrystalline silicon by plasma hydrogenation. thin solid films 487, pp. 147–151, 2005, doi:10.1016/j.tsf.2005.01.081
[3] l. carnel, i. gordon, d. van gestel, k. van nieuwenhuysen, g. agostinelli, g. beaucarne, j. poortmans: thin-film polycrystalline silicon solar cells on ceramic substrates with a voc above 500 mv. thin solid films 511–512, pp. 21–25, 2006, doi:10.1016/j.tsf.2005.12.069
[4] b. gorka, b. rau, p. dogan, c. becker, f. ruske, s. gall, b. rech: influence of hydrogen plasma on the defect passivation of polycrystalline si thin film solar cells. plasma processes and polymers 6, pp. s36–s40, 2009, doi:10.1002/ppap.200930202
[5] c. h. seager, d. s. ginley: passivation of grain boundaries in polycrystalline silicon. applied physics letters 34(5), 1979, doi:10.1063/1.90779
[6] t. sameshima, h. hayasaka, m. maki, a. masuda, t. matsui, m. kondo: defect reduction in polycrystalline silicon thin films by heat treatment with high-pressure h2o vapor. jpn. j. appl. phys. 46, pp. 1286–1289, 2007, doi:10.1143/jjap.46.1286
[7] s. honda, t. mates, b. rezek, a. fejfar, j. kocka: microscopic study of the h2o vapor treatment of the silicon grain boundaries. journal of non-crystalline solids 354, pp. 2310–2313, 2008, doi:10.1016/j.jnoncrysol.2007.09.107
[8] s. honda, a. fejfar, j. kocka, t. yamazaki, a. ogane, y. uraoka, t. fuyuki: annealing in water vapor as a new method for improvement of silicon thin film properties. journal of non-crystalline solids 352, pp. 955–958, 2006, doi:10.1016/j.jnoncrysol.2006.01.061
[9] m. j. keevers, t. l. young, u. schubert, m. a. green: 10% efficient csg minimodules. 22nd european photovoltaic solar energy conference, milan, italy, pp. 1783–1790, 2007.
[10] http://www1.lsbu.ac.uk/water/phase.html [2014-07-20].
[11] http://ergodic.ugr.es/termo/lecciones/water1.html [2014-07-20].
[12] http://en.wikipedia.org/wiki/phase_diagram#mediaviewer/file:phase_diagram_of_water.svg [2014-07-20].
[13] http://www.tectra.de/heater.htm [2014-07-20].
[14] r. sinton: user manual to photoconductance lifetime tester and optional suns-voc stage. http://www.sintonconsulting.com [2014-07-20].
[15] r. sinton, a. cuevas: a quasi-steady-state open-circuit voltage method for solar cell characterization. 16th european photovoltaic solar energy conference, glasgow, united kingdom, 2000.
[16] h. schmidt, b. burger, u. bussemas, s. elies: how fast does an mpp tracker really need to be? 24th european photovoltaic solar energy conference, hamburg, germany, 2009.
[17] r. m. park, r. m. carroll, p. bliss, g. w. burns, r. r. desmaris, f. b. hall, m. b. herzkovitz, d. mackenzie, e. f. mcguire, r. p. reed, l. l. sparks, t. p. wang: manual on the use of thermocouples in temperature measurement. fourth edition, isbn 0-8031-1466-4, 1993.
[18] n. n. bhargava, d. c. kulshreshtha, s. c. gupta: basic electronics and linear circuits. tata mcgraw-hill education, isbn 0-07-451965-4, 1984.

actual optical and thermal performance of photovoltaic modules

hamdy k. elminir, v. benda, i. kudláček

field testing is costly, time-consuming and depends heavily on prevailing weather conditions. adequate security and weather protection must also be provided at the test site. delays can be caused by bad weather and system failures. to overcome these problems, a photovoltaic array simulation may be used. in any simulation scheme involving photovoltaic systems, one important choice is the selection of a mathematical model. in the literature several approaches to the problem have been made. most procedures designed for this purpose are based on analytical descriptions of the physical mechanisms inside the solar cell that can be represented by a circuit diagram with discrete components, like a two-exponential model. such simulators have some merits. however, their limited flexibility in readily simulating the influence of solar radiation, temperature and various array parameters is a serious drawback that has been noted. to get more accurate results in predicting the actual performance of photovoltaic modules, the parameters influencing the incoming power flow (optical parameters) and the outgoing power flow (electrical and thermal parameters) were investigated by simulation and by some verifying experiments, to get a closer insight into the response behavior of this element, and to estimate the overall performance as well as to optimize the parameters.

keywords: solar radiation, reflection losses, radiation shape factor, radiative surface area, temperature distribution, emissivity, isotropic model, hay and klucher's anisotropic models.

1 introduction

design and performance evaluation of photovoltaic systems usually involve an estimation of the irradiation incident on the photovoltaic module plane. for design purposes, only global horizontal irradiation, published by meteorological institutes, is usually available. using irradiation models, we can calculate the gain in irradiation on a tilted plane with respect to horizontal irradiation. concerning a part of this task, most textbooks [4] recommend that the diffuse component be treated as if it were isotropically emanating from the sky vault. however, theoretical as well as experimental results have shown that this simplifying assumption is generally far from reality.
dave, j., 1977 examined the validity of this isotropic distribution approximation for a sun-facing flat surface located at the bottom of plane-parallel models of non-absorbing homogeneous atmospheres. even for such idealized and somewhat unrealistic models, he showed that the use of an isotropic distribution approximation results in a systematic underestimation of the diffuse energy contribution to sun-facing surfaces. that study demonstrated the need to test this approximation for more realistic atmospheric conditions. thus it appears that sky radiance should be treated as anisotropic, particularly because of the strong forward scattering effect of aerosols [2, 9, 14, 16].

in the first part of this study, some results concerning the accuracy of models that estimate irradiance on inclined planes are tested. the models chosen for discussion here are for arbitrary sky conditions and are supposed to be applicable anywhere in the world. these methods begin with measured hourly global and diffuse radiation received on a horizontal surface. these quantities are then transposed onto an inclined plane by a mathematical procedure. the accuracies of the models are then compared on the basis of statistical error tests, and the most accurate model is recommended. the analyses are also extended to include the optical processes occurring inside the encapsulated solar module.

solar concentrators are frequently used to increase the specific power and decrease the space-radiation degradation of solar cell parameters. on the other hand, at high light concentration coefficients, a large amount of thermal energy evolves in solar cells. if cooling is not provided, the cell working temperature increases. this affects the carrier concentration and the light absorption process, causing a reduction in open circuit voltage, a slight increase in short circuit current, a reduction of the fill factor, and degradation of the cell power output. the degradation rate of the cell output approximately doubles for each 10 °c increase in temperature [7, 22]. therefore, the modules and the solar array must take full advantage of radiative, conductive and convective cooling, and absorb a minimum of unused radiation. hence, there is a need for an exact technique to calculate accurately and efficiently the temperature distribution of photovoltaic solar modules, from which safe and proper operation at maximum ratings can be ensured.

the computer code was written in fortran, a programming language quite flexible and suitable for organizing the screen with multiple windows and for drawing curves and histograms. the program structure consists of a main executable file, which has several external units to perform the different calculations; program maintenance is therefore quite easy, as is the implementation of new models or the extension to a new configuration. the derived equations are solved using an explicit technique and iteration processes [18, 20]. the models and simulation programs developed here allow us to predict the thermal performance, and they are promising tools for evaluating new photovoltaic power plants with the aim of increasing efficiency.

three main criteria are considered in evaluating the models. the first is versatility: what range of photovoltaic modules and systems can the model handle? the second is accuracy: what is the physical basis, and how closely can the model replicate manufacturers' data? the final issue is computational speed; this criterion is of secondary importance for this work.
2 apparatus and measurements

total radiation was monitored with a high-precision pyranometer, which is sensitive in the wavelength range from 300 to 3000 nm. sky diffuse radiation was measured by a pyranometer equipped with a special shading device to exclude direct radiation from the sun. all data sets were subjected to various quality control tests. three types of data checks were performed, to identify missing data, data that clearly violates physical limits, and extreme data. first, hours when the data was known to be "bad" or "missing" were omitted. second, any hour with an observation that violated a physical limit or conservation principle was eliminated from the data set, including reported hours with a diffuse fraction greater than 1, or beam radiation exceeding the extraterrestrial beam radiation. finally, to eliminate the uncertainty associated with radiation measurements at large incidence angles, hours with a zenith angle larger than 80° were eliminated. the final data set was constructed from the measured data that passed all of the quality control checks.
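a sketch of how these screening rules might be applied to an hourly data set (the function, argument names and array layout are our illustrative assumptions, not the original processing code):

```python
import numpy as np

def quality_filter(g_h, i_d, i_bn, i_on, zenith_deg):
    """Return a boolean mask of hours passing the quality checks described above.

    g_h        : hourly global horizontal irradiation
    i_d        : hourly diffuse horizontal irradiation
    i_bn       : hourly direct-normal (beam) irradiation
    i_on       : extraterrestrial normal irradiation for the same hours
    zenith_deg : solar zenith angle at mid-hour, in degrees
    """
    ok = np.isfinite(g_h) & np.isfinite(i_d) & np.isfinite(i_bn)  # missing data
    ok &= i_d <= g_h            # diffuse fraction must not exceed 1
    ok &= i_bn <= i_on          # beam must not exceed the extraterrestrial beam
    ok &= zenith_deg <= 80.0    # drop large-incidence-angle hours
    return ok
```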
3 optical performance of photovoltaic modules

in order to make a precise representation of the actual optical conditions in the module, a model for an encapsulated cell was developed which determines the insolation reaching the cell from sun and sky irradiance. this was done by modeling the optical processes occurring outside and inside the encapsulation. most collecting devices associated with solar energy systems are tilted at some angle with respect to the horizontal. due to the lack of measured tilted-surface solar radiation data, models are employed to estimate the radiation incident on a collector surface from measured horizontal radiation. in the present work, methods for calculating radiation on a tilted surface from horizontal data are evaluated.

3.1 evaluation of irradiation on a tilted south-oriented surface

consider a plane inclined at an angle β from the horizontal position. for the time being, we assume that the surface is inclined in such a way that it faces the sun; that is, direct radiation is not striking the back of this plane. total radiation arriving on an inclined surface is composed of three components: beam radiation, sky diffuse radiation and radiation reflected from the ground. the models discussed here all share the same formulations for the beam and ground-reflected components. only hourly irradiations with a solar elevation angle above 8° are tested, in order to avoid errors resulting from dividing small numbers. it is useful to write down the hourly formulation of these components.

3.1.1 beam radiation incident on an inclined surface

for a surface oriented in any direction with respect to the meridian, the trigonometric relation for the incidence angle θ_i can be written in the following form:

$\cos\theta_i = \sin\delta \sin\phi \cos\beta - \sin\delta \cos\phi \sin\beta \cos\gamma + \cos\delta \cos\phi \cos\beta \cos\omega + \cos\delta \sin\phi \sin\beta \cos\gamma \cos\omega + \cos\delta \sin\beta \sin\gamma \sin\omega$, (1)

where the meanings of the different symbols are as given in the nomenclature list. there are several commonly occurring cases for which equation (1) is simplified. for a fixed surface sloped toward the south or north, that is, with a surface azimuth angle γ of 0° or 180° (a very common situation for fixed flat plate collectors), the last term drops out:

$\cos\theta_i = \sin\delta \sin\phi \cos\beta - \sin\delta \cos\phi \sin\beta + \cos\delta \cos\phi \cos\beta \cos\omega + \cos\delta \sin\phi \sin\beta \cos\omega$. (2)

for horizontal surfaces, the angle of incidence θ_i is the zenith angle of the sun, θ_z. its value must be between 0° and 90° when the sun is above the horizon. for this situation β = 0, and equation (1) becomes:

$\cos\theta_z = \cos\phi \cos\delta \cos\omega + \sin\phi \sin\delta$. (3)

the geometric factor r_b, the ratio of beam radiation on the tilted surface, i_b,t, to that on a horizontal surface at any time, can be calculated exactly by appropriate use of equation (1). for a south-facing surface, the ratio i_b,t / i_b,h is given by:

$r_b = \frac{\cos\theta_i}{\cos\theta_z} = \frac{\cos(\phi-\beta)\cos\delta\cos\omega + \sin(\phi-\beta)\sin\delta}{\cos\phi\cos\delta\cos\omega + \sin\phi\sin\delta}$. (4)

the hourly beam radiation received on an inclined surface can then be expressed as:

$i_{b,t} = i_{b,h}\, r_b = (g_h - i_{d,h})\, r_b$. (5)

it should be noted that at grazing angles (just at sunrise or at sunset) r_b can change rapidly and may approach infinity or zero, because both the numerator and the denominator are small numbers. this depends on slope, latitude and date.
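a compact numerical sketch of the geometry above, assuming angles in radians and the south-facing simplification of equations (3) and (4); the clipping guard against the grazing-angle blow-up is our addition:

```python
import numpy as np

def cos_incidence(lat, beta, gamma, delta, omega):
    """cos(theta_i) for an arbitrarily oriented tilted surface, eq. (1)."""
    return (np.sin(delta) * np.sin(lat) * np.cos(beta)
            - np.sin(delta) * np.cos(lat) * np.sin(beta) * np.cos(gamma)
            + np.cos(delta) * np.cos(lat) * np.cos(beta) * np.cos(omega)
            + np.cos(delta) * np.sin(lat) * np.sin(beta) * np.cos(gamma) * np.cos(omega)
            + np.cos(delta) * np.sin(beta) * np.sin(gamma) * np.sin(omega))

def beam_ratio(lat, beta, delta, omega):
    """R_b for a south-facing surface, eq. (4), guarded near grazing angles."""
    cos_ti = (np.cos(lat - beta) * np.cos(delta) * np.cos(omega)
              + np.sin(lat - beta) * np.sin(delta))
    cos_tz = np.cos(lat) * np.cos(delta) * np.cos(omega) + np.sin(lat) * np.sin(delta)
    return np.clip(cos_ti, 0.0, None) / np.maximum(cos_tz, 1e-3)

# example: latitude 50 deg, 30 deg tilt, winter solstice, solar noon (omega = 0)
rb = beam_ratio(np.radians(50), np.radians(30), np.radians(-23.45), 0.0)
```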
3.1.2 sky diffuse radiation incident on an inclined surface

diffuse irradiance is difficult to determine accurately with the simple parameterization methods that were used to calculate direct normal irradiance in the previous section, since its spatial distribution is generally unknown and time dependent. three diffuse subcomponents are used to approximate the anisotropic behavior of diffuse radiation. the first is an isotropic part, received uniformly from the entire sky dome. the second is circumsolar diffuse, resulting from forward scattering of solar radiation and concentrated in the part of the sky around the sun. the third, referred to as horizon brightening, is concentrated near the horizon, and is most pronounced in clear skies.

several models have been proposed to estimate the diffuse radiation on a tilted surface (not all of which account for these three diffuse subcomponents). the isotropic model [12] is the simplest of the tilted surface models. in this model the intensity of sky diffuse radiation is assumed uniform over the sky dome, i.e., it is independent of the azimuth and zenith angles. it approximates the completely overcast sky condition. the formula for the hourly sky diffuse radiation incident on an inclined plane is given by the product of the hourly diffuse radiation incident on a horizontal surface and the configuration factor from the surface to the sky, $(1+\cos\beta)/2$:

$i_{d,t} = i_{d,h}\, \frac{1+\cos\beta}{2}$. (6)

under completely cloudy skies, the isotropic model becomes a good approximation. as skies become clearer, the validity of the isotropic sky model deteriorates due to the presence of circumsolar and horizon-brightening anisotropic effects.

klucher, t., 1979 model: this model is based on a study of clear sky conditions by temps, r. and coulson, k., 1977. their model was modified by klucher, who incorporated conditions of cloudy skies. it takes into account the increase in the diffuse radiation in the vicinity of the sun (circumsolar radiation) and that near the horizon (horizon brightening). klucher's formulation of the hourly sky diffuse radiation incident on an inclined surface is:

$i_{d,t} = i_{d,h}\, \frac{1+\cos\beta}{2}\left[1 + f_1 \sin^3\frac{\beta}{2}\right]\left[1 + f_1 \cos^2\theta_i \sin^3\theta_z\right]$, (7)

where f_1 is the modulating function given by:

$f_1 = 1 - \left(\frac{i_{d,h}}{g_h}\right)^2$. (8)

when the skies are completely overcast, f_1 = 0 and klucher's model reverts to the isotropic model.

hay, j. and davies, j., 1980 developed a model to predict tilted-surface diffuse radiation, which accounts for both circumsolar and isotropic diffuse radiation. realizing that the anisotropic behavior of circumsolar diffuse radiation becomes more pronounced under clear sky conditions, hay, j. and davies, j., 1980 defined an "anisotropic index" a_i to weight the circumsolar, i_t,cir, and isotropic, i_t,iso, radiation components. the anisotropy index defines a portion of the diffuse radiation to be treated as circumsolar, with the remaining portion to be considered isotropic. the circumsolar diffuse radiation is projected onto the tilted surface in the same fashion as beam radiation:

$a_i = \frac{i_{b,h}}{i_o}$, (9)

$i_{t,cir} = i_{d,h}\, a_i\, r_b$. (10)

the remaining diffuse radiation is treated as isotropic diffuse:

$i_{t,iso} = i_{d,h}\, (1 - a_i)\, \frac{1+\cos\beta}{2}$. (11)

the total diffuse radiation on a tilted surface is the sum of (10) and (11):

$i_{d,t} = i_{d,h}\left[(1 - a_i)\, \frac{1+\cos\beta}{2} + a_i\, r_b\right]$. (12)

under clear skies, the anisotropy index will be high and the circumsolar diffuse is weighted more heavily than the isotropic diffuse. under cloudy skies, the anisotropy index goes to zero and all diffuse is treated as isotropic. the hay model does not account for horizon-brightening radiation incident on south-facing surfaces.
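the three sky-diffuse formulations translate directly into code; a sketch, with all angles in radians and symbols as in equations (6)–(12):

```python
import numpy as np

def diffuse_isotropic(i_dh, beta):
    """Isotropic sky model, eq. (6)."""
    return i_dh * (1 + np.cos(beta)) / 2

def diffuse_klucher(i_dh, g_h, beta, theta_i, theta_z):
    """Klucher model, eqs. (7)-(8)."""
    f1 = 1 - (i_dh / g_h) ** 2
    return (i_dh * (1 + np.cos(beta)) / 2
            * (1 + f1 * np.sin(beta / 2) ** 3)
            * (1 + f1 * np.cos(theta_i) ** 2 * np.sin(theta_z) ** 3))

def diffuse_hay(i_dh, i_bh, i_o, r_b, beta):
    """Hay model, eqs. (9)-(12); a_i weights the circumsolar part."""
    a_i = i_bh / i_o
    return i_dh * ((1 - a_i) * (1 + np.cos(beta)) / 2 + a_i * r_b)
```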
3.1.3 ground reflected radiation incident on an inclined surface

in order to correctly evaluate the diffuse radiation incident on an inclined surface of any orientation and of any tilt angle, we need to know separately the sky diffuse radiation as well as the ground reflected radiation. in this section we focus on the ground reflected radiation, which is also significant and which can sometimes reach values of the order of 100 w/m2 for a vertical plane. it has already been found, albeit based on restricted data, that using an albedo value measured on site and considered as constant leads to satisfactory results (ineichen et al., 1987):

$i_{g,h} = \rho_{site}\, g_h$. (13)

a common method for calculating the ground reflected radiation incident on a tilted surface is to assume that the foreground in the collector field of view is a diffuse reflector and that the horizon is unobstructed. other authors have proposed anisotropic ground reflectance models [6, 8], but lack of experimental data has hampered their validation. therefore, the ground reflected radiation is assumed to be diffuse. under the isotropic condition, the ground reflected radiation incident on the inclined surface is given by the above quantity multiplied by the configuration factor from the ground to the inclined surface. thus:

$i_{g,t} = \frac{1}{2}\, \rho\, g_h\, (1 - \cos\beta)$. (14)

3.1.4 model evaluation

three irradiance models are considered in this work: the isotropic model, and hay's and klucher's anisotropic models. for each model, measured diffuse and horizontal global values were used to calculate the irradiance on surfaces tilted at 30°, 60° and 90° above the horizon. the results were compared with the monitored irradiances, and are presented in terms of the usual statistics: mean bias error (mbe) and root mean square error (rmse). the results for south-facing surfaces are given in table 1 and table 2.

table 1: root mean square errors [%] for global radiation received on inclined surfaces

month      isotropic model       hay's model           klucher's model
           30°    60°    90°     30°    60°    90°     30°    60°    90°
january    18.0   22.0   22      8.0    11.0   10.0    8.0    11.0   13
february   16.0   21.0   22      7.5    11.0   12.0    7.0    11.0   13
march      9.0    12.0   12      4.0    6.0    8.5     3.0    5.0    8
april      6.0    14.0   8       4.0    12.0   8.0     3.0    12.0   8
may        3.5    5.5    12      3.0    5.5    12.0    3.0    5.0    8
june       4.0    9.0    25      3.0    9.0    21.0    4.5    8.0    24
july       3.5    6.0    19      2.0    5.0    13.0    4.0    6.5    18
august     3.0    7.0    12      2.0    5.0    10.0    3.0    5.0    12
september  11.0   15.0   17      6.5    9.0    12.0    4.5    7.0    10
october    13.0   16.0   17      6.5    7.0    9.0     5.0    6.5    9
november   24.0   31.0   36      13.0   8.0    20.0    10.0   14.0   19
december   18.0   23.0   26      9.0    13.0   14.0    9.0    11.0   13
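for reference, the two statistics reported in tables 1 and 2 can be computed as in the following sketch (expressing both in percent of the mean measured irradiance is our assumption about the normalization, which the text does not spell out):

```python
import numpy as np

def mbe_rmse_percent(estimated, measured):
    """Mean bias error and root mean square error, in % of the mean measurement."""
    d = np.asarray(estimated, dtype=float) - np.asarray(measured, dtype=float)
    mean = np.mean(measured)
    mbe = 100.0 * np.mean(d) / mean            # signed: over-/under-prediction
    rmse = 100.0 * np.sqrt(np.mean(d ** 2)) / mean
    return mbe, rmse
```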
due to lack of measured tilted surface solar radiation data, klucher’s model is employed to estimate the radiation incident on a collector surface from measured horizontal radiation. the results of these calculations are plotted against the angle of tilt for summer, winter and all year round intended use. a summary of the effects of tilt angle on solar irradiation for south facing surfaces is shown in fig. 1. 3.2 optical parameters of the encapsulation module reflection losses produced in photovoltaic modules operating in real conditions in comparison with standard ones are commonly neglected or simplified to a value that does not take into account the module tilt angle. in some applications of interest such as building integration of photovoltaic modules, architectural criteria or other limitations frequently force modules to have tilt angles far from their optimal values. as a consequence, optical losses increase to non-negligible quantities [17]. the different encapsulation layers cause multiple reflections inside and among the slabs (see scheme in fig. 2). the acta polytechnica vol. 41 no. 2/2001 10 month isotropic model mbe hay’s model mbe klucher’s model mbe 30 ° 60 ° 90 ° 30 ° 60 ° 90 ° 30 ° 60 ° 90 ° january 8.3 10.3 9.2 2.2 1.6 1.2 3.3 3.9 4.2 february 8.2 11.0 11.7 2.8 3.3 3.3 2.8 4.7 5.5 march 5.5 6.1 4.4 1.7 0.6 1.0 0.2 0.1 1.7 april 3.3 5.3 1.8 1.1 2.5 0.6 0.2 0.2 4.5 may 1.7 2.2 2.4 1.2 2.2 5.5 1.6 2.2 4.8 june 0.6 1.1 1.2 0.5 1.9 4.4 2.2 3.0 10.0 july 0.8 0.6 1.1 0.4 1.1 3.9 2.2 4.7 8.9 august 2.5 2.8 2.2 1.1 0.8 2.2 1.9 2.2 5.5 september 4.4 4.4 3.6 2.2 1.1 0.6 0.1 1.1 3.2 october 5.0 8.9 9.4 3.3 2.8 2.8 2.2 2.8 1.9 november 8.3 9.4 10.0 4.4 2.6 2.2 3.3 2.9 0.8 december 6.7 7.8 7.2 1.1 0.2 0.0 2.1 0.3 0.5 table 2: mean bias errors [%] for global radiation received on inclined surfaces fig. 1: annual variation in estimating average daily radiation for a flat plate collector facing south for a latitude of 50°, and ground reflectance of 0.2 fig. 2: ray tracing through the module layers abbreviation eva stands for ethylene-vinyl-acetate, which serves as a mechanical, thermal and optical interface between the solar cells and the glass cover. arc stands for the antireflective coating of the solar cell. it consists of tio2 or another optical material to bridge over the refractive indices of silicon and the front cover. refractive indices and transmissivity coefficients are given in table 3. table 3: refractive indices and transmissivity coefficient for the wavelength of 800 nm [5] materials refractive index thickness transmissivity coefficient normal incidence incidence at 60° glass eva arc silicon 1.526 1.450 1.700 3.690 3.2 mm 0.6 mm 50–60 nm 0.5 mm 0.951 0.992 1.000 – 0.941 0.990 1.000 – light rays are modeled as objects having the properties of electromagnetic waves [1, 15]. each ray can exist either as incident or as one of its reflected, refracted, absorbed and transmitted components, produced by the respective optical process (i.e., reflection, refraction, etc.). at the interfaces, a matrix method derived from electromagnetic theory is used to determine the probability of a photon being reflected, transmitted or absorbed in the interface layers. the calculation technique is based on the model of [19, 23], which calculates the radiation transfer properties of multi-layer structures. their model was modified to generate an equivalent application in the domain of photovoltaic solar cell simulation. 
3.2.1 radiation transmission through a photovoltaic module

when light is incident upon the interface between two optically different materials, in general part of the light is reflected and part transmitted. if the medium on what is nominally chosen to be the outside (o) of the interface has a complex refractive index n_o, and the medium on the inside (i) has a refractive index n_i, then, for a given wavelength, snell's law states that:

$\theta_i = \arcsin\left(\frac{n_o}{n_i}\sin\theta_o\right)$. (15)

for an incident wave of amplitude $e_o^+$ traveling across an interface from outside to inside (the superscript denotes the direction of propagation), the amplitude reflection coefficient (r) and the amplitude transmission coefficient (t) are defined as:

$r_{oi} = e_o^-/e_o^+; \quad t_{oi} = e_i^+/e_o^+$. (16)

similar expressions hold for waves traveling in the opposite direction, for which it can be shown that:

$r_{io} = -r_{oi}; \quad t_{oi}\, t_{io} - r_{oi}\, r_{io} = 1$. (17)

an alternative approach is to calculate the transmission through the multi-layer using transfer matrices. each matrix, in an ordered sequence of matrix multiplications, represents one of two possible transformations of the radiation fields. the transformations are due to:
• transmission across an interface (interface matrix),
• transmission through a layer (layer matrix).

the interface matrix: the definition of the interface matrix (m) follows from consideration of the boundary conditions at an interface. let $e_o^+$ and $e_i^+$ represent the amplitudes of the net electric field propagating towards, and away from, an interface in the direction from the outside to the inside medium (fig. 3). $e_o^-$ and $e_i^-$ are the corresponding net electric fields traveling in the reverse direction. $e_i^+$ is then composed of the transmitted component of $e_o^+$ and the reflected component of $e_i^-$; $e_o^-$ consists of the transmitted component of $e_i^-$ and the reflected component of $e_o^+$. equations (16) and (17) enable this to be expressed mathematically as:

$e_i^+ = t_{oi}\, e_o^+ + r_{io}\, e_i^-, \quad e_o^- = t_{io}\, e_i^- + r_{oi}\, e_o^+$ (18)

or

$\begin{pmatrix} e_o^+ \\ e_o^- \end{pmatrix} = \frac{1}{t_{oi}} \begin{pmatrix} 1 & r_{oi} \\ r_{oi} & 1 \end{pmatrix} \begin{pmatrix} e_i^+ \\ e_i^- \end{pmatrix} = m_{oi}^e \begin{pmatrix} e_i^+ \\ e_i^- \end{pmatrix}$. (19)

fig. 3: the electric fields at an interface

when transmission across an interface is considered in terms of intensities, the following matrix equation is obtained:

$\begin{pmatrix} i_o^+ \\ i_o^- \end{pmatrix} = \frac{1}{\tau_{oi}} \begin{pmatrix} 1 & \rho_{oi} \\ \rho_{oi} & 1 \end{pmatrix} \begin{pmatrix} i_i^+ \\ i_i^- \end{pmatrix} = m_{oi} \begin{pmatrix} i_i^+ \\ i_i^- \end{pmatrix}$. (20)

the layer matrix: the layer matrix (n) describes the attenuation of a wave as it traverses the medium within a layer. when a wave propagates in the positive direction between the interfaces of a thin layer, its initial amplitude $e_f^+(1)$ within the layer is attenuated to, say, $e_f^+(2)$ on reaching the second interface (fig. 4). a wave of initial amplitude $e_f^-(2)$ traveling in the opposite direction through the layer is reduced to $e_f^-(1)$. if the layer has thickness d, then from the definition of an attenuation factor τ,

$e_f^+(2) = \tau(d)\, e_f^+(1), \quad e_f^-(1) = \tau(d)\, e_f^-(2)$ (21)

or

$\begin{pmatrix} e_f^+(1) \\ e_f^-(1) \end{pmatrix} = \begin{pmatrix} 1/\tau & 0 \\ 0 & \tau \end{pmatrix} \begin{pmatrix} e_f^+(2) \\ e_f^-(2) \end{pmatrix} = n_f^e \begin{pmatrix} e_f^+(2) \\ e_f^-(2) \end{pmatrix}$. (22)
matrix multiplication: the following equation represents the transformation of the electric fields by a thin layer ( f ) sandwiched between a pair of outer (o) and inner (i) media:

$\begin{pmatrix} e_o^+ \\ e_o^- \end{pmatrix} = m_{of}^e\, n_f^e\, m_{fi}^e \begin{pmatrix} e_i^+ \\ e_i^- \end{pmatrix}$. (23)

this involves the multiplication of three matrices, the first and third being interface matrices, the second the layer matrix. when a system is composed of a greater number of thin layers, say layers 1, 2, 3, …, q, sandwiched between the outer and inner media, the transformation becomes:

$\begin{pmatrix} e_o^+ \\ e_o^- \end{pmatrix} = m_{o1}^e\, n_1^e\, m_{12}^e\, n_2^e \cdots n_{q-1}^e\, m_{q-1,q}^e\, n_q^e\, m_{qi}^e \begin{pmatrix} e_i^+ \\ e_i^- \end{pmatrix} = m \begin{pmatrix} e_i^+ \\ e_i^- \end{pmatrix}$. (24)

equation (24) can be used to obtain the total amplitude reflection and transmission coefficients, $\rho_{oi}$ and $\tau_{oi}$, of the multiple thin layer structure. this follows by considering the surrounding media to be of infinite thickness, so that there is no net reflected wave in medium (i) incident upon the inner interface (i.e., $e_i^- = 0$). by setting $e_o^+ = 1$, equation (24) becomes:

$\begin{pmatrix} 1 \\ \rho_{oi} \end{pmatrix} = m \begin{pmatrix} \tau_{oi} \\ 0 \end{pmatrix}$, (25)

which can be solved for $\rho_{oi}$ and $\tau_{oi}$.
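a numerical sketch of the matrix chain of equations (19)–(25) for the glass/eva/arc stack of table 3, at normal incidence. the normal-incidence fresnel coefficients, the omission of phase factors inside the layers, the use of square roots of the tabulated one-pass intensity transmissivities as amplitude attenuation factors, and the intensity conversion at the end are all our simplifying assumptions, not the authors' fortran implementation:

```python
import numpy as np

def interface_matrix(r, t):
    """Amplitude interface matrix, eq. (19): (1/t) * [[1, r], [r, 1]]."""
    return np.array([[1.0, r], [r, 1.0]], dtype=complex) / t

def layer_matrix(tau_amp):
    """Layer matrix, eq. (22), for a one-pass amplitude attenuation tau_amp."""
    return np.array([[1.0 / tau_amp, 0.0], [0.0, tau_amp]], dtype=complex)

# air / glass / EVA / ARC / silicon; refractive indices from table 3
n = [1.000, 1.526, 1.450, 1.700, 3.690]
tau_int = [0.951, 0.992, 1.000]          # one-pass intensity transmissivities

m = np.eye(2, dtype=complex)
for k in range(len(n) - 1):
    r = (n[k] - n[k + 1]) / (n[k] + n[k + 1])   # normal-incidence Fresnel r_oi
    m = m @ interface_matrix(r, 1.0 + r)        # t_oi = 1 + r_oi at normal incidence
    if k < len(tau_int):
        m = m @ layer_matrix(np.sqrt(tau_int[k]))  # amplitude ~ sqrt(intensity)

# eq. (25): with no wave returning from the silicon, E_i^- = 0 and E_o^+ = 1
tau_oi = 1.0 / m[0, 0]                 # total amplitude transmission
rho_oi = m[1, 0] / m[0, 0]             # total amplitude reflection
reflectance = abs(rho_oi) ** 2
transmittance = abs(tau_oi) ** 2 * n[-1] / n[0]
print(f"R = {reflectance:.3f}, T into silicon = {transmittance:.3f}")
```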
3.2.2 absorbed solar radiation

the prediction of collector performance requires information on the solar energy absorbed by the collector absorber plate. the solar energy incident on a tilted collector can be found by the methods of section 3.1. this incident radiation has three different spatial distributions: beam radiation, diffuse radiation, and ground-reflected radiation, and each must be treated separately. using klucher's formulation, equation (7) can be modified to give the absorbed radiation s by multiplying each term by the appropriate transmittance-absorptance product:

$s = (g_h - i_{d,h})\, r_b\, (\tau\alpha)_b + i_{d,h}\, \frac{1+\cos\beta}{2}\left[1 + f_1 \sin^3\frac{\beta}{2}\right]\left[1 + f_1 \cos^2\theta_i \sin^3\theta_z\right](\tau\alpha)_d + \rho\, g_h\, \frac{1-\cos\beta}{2}\, (\tau\alpha)_g$. (26)

the subscripts b, d, and g represent beam, diffuse, and ground, respectively. equation (27) can be used to find the proper absorptance:

$\tau_a = \left(\frac{i_{transmitted}}{i_{incident}}\right) = \exp\left(-\frac{k\, l}{\cos\theta_i}\right)$, (27)

where k is the proportionality constant, the extinction coefficient, which is assumed to be constant over the solar spectrum. for glass, the value of k varies from approximately 4 m−1 for "water white" glass (which appears white when viewed on the edge) to approximately 32 m−1 for poor (greenish cast of edge) glass. l is the path length in the medium.

3.2.3 efficiency limits of photovoltaic modules

the analytical results above were applied to analyze the optical behavior of an encapsulated photovoltaic solar cell. photovoltaic modules have an angle-of-incidence (aoi) dependent optical behavior that can be measured and used to improve the analysis of array performance. like absolute air mass, the solar angle of incidence is time-of-day dependent. its effect on the short circuit current of a photovoltaic module results from two causes. the first is familiar to solar enthusiasts as the "cosine effect". the "cosine effect" is independent of the module design, and is only geometry related. due to the cosine effect, the short circuit current from a module varies directly with the cosine of the aoi. for example, at aoi = 60° the cosine effect reduces the short circuit current by one half compared to the normal incidence condition. the second way in which the short circuit current is affected by the aoi depends on the module design: the optical characteristics of the module materials located between the sun and the solar cells cause the effect. above a 45° incidence angle the reflectivity starts to increase, and reaches 20 % losses at 75° (see fig. 5). the influence of a module's optical characteristics on its performance is shown in fig. 6.

fig. 4: transmission through a thin absorbing layer
fig. 5: calculated relative transmission, reflection, and absorption of radiation for glass cover

4 thermal performance of a photovoltaic module

solar radiation irradiating the surface of the solar module is partially absorbed by the various materials of the module. almost all radiation is absorbed in the silicon; only a very small part is absorbed in the other materials of the solar module. this produces heat inside the solar cell. there are heat loss mechanisms from the front and rear surface of the module: free convection, wind convection and radiative heat loss. inside the cell, heat is transmitted by conduction. the equations describing the heat flows within the cell are given below.

radiation heat transfer equation: the total radiative heat flow from the module, q_rad [w], to the environment comprises the components of the radiation exchange of the front and rear sides of the module with the sky and the ground. these components are functions of the module surface temperature t_(i,j,k), the ambient air temperature t_∞ and the relevant emissivity ε; they are also determined by the so-called radiation shape factor f_s, the radiative surface area of the module a_m and the stefan–boltzmann constant σ. in general this can be expressed as [11]:

$q_{rad} = \varepsilon\, \sigma\, f_s\, a_m \left(t_{(i,j,k)}^4 - t_\infty^4\right)$. (28)

convection heat transfer equation: convective heat flows q_conv cannot be treated in a closed mathematical model and therefore have to be computed by iteration or approximation. the convective heat transfer coefficient h_c [w/m2k] is a bulky function of the actual air temperature and module surface temperature, air viscosity, thermal conductivity and heat capacity, and finally wind speed. the models used are based on the following equation:

$q_{conv} = h_c\, a_c \left(t_{(i,j,k)} - t_\infty\right)$. (29)

conduction heat transfer equation: this can be given as:

$q_{cond} = \frac{k_{th}\, a_m}{h} \left(t_{(i,j,k)} - t_{(i+1,j,k)}\right)$. (30)

4.1 problem formulation

the solution starts by assuming that the photovoltaic solar module is composed of three layers. for the purposes of the 3d thermal model, the solar module is divided into a number of meshes, each mesh having six faces (xy = top, yx = bottom, yz = front, zy = back, xz = right and zx = left); i, j, k are the coordinates of the meshes. the computer code was written in fortran. simulation outputs allow a complete evaluation of the temperature distribution inside the photovoltaic module and can be of great interest both for system designers and for researchers. the technique used in this code involves calculating the temperature by an explicit method using iteration processes. the heat-balance equations for the individual mesh types are:

for the meshes at the corners of the cell (i = 1, j = 1, k = 1):

$q_{(i,j,k)} = \sum_{f \in \{xy,\,xz,\,yz\}} \left[\varepsilon\sigma f_s a_f \left(t_{(i,j,k)}^4 - t_\infty^4\right) + h_f a_f \left(t_{(i,j,k)} - t_\infty\right)\right] + \frac{k_{th}^m a_{zx}}{h_y}\left(t_{(i,j,k)} - t_{(i,j+1,k)}\right) + \frac{k_{th}^m a_{yx}}{h_z}\left(t_{(i,j,k)} - t_{(i,j,k+1)}\right) + \frac{k_{th}^m a_{zy}}{h_x}\left(t_{(i,j,k)} - t_{(i+1,j,k)}\right)$ (31)

for the meshes at the edges of the cell (i = 1, k = 1, j = 2, …, (ny − 1)):

$q_{(i,j,k)} = \sum_{f \in \{xy,\,yz\}} \left[\varepsilon\sigma f_s a_f \left(t_{(i,j,k)}^4 - t_\infty^4\right) + h_f a_f \left(t_{(i,j,k)} - t_\infty\right)\right] + \frac{k_{th}^m a_{yx}}{h_z}\left(t_{(i,j,k)} - t_{(i,j,k+1)}\right) + \frac{k_{th}^m a_{zy}}{h_x}\left(t_{(i,j,k)} - t_{(i+1,j,k)}\right) + \frac{k_{th}^m a_{xz}}{h_y}\left(t_{(i,j,k)} - t_{(i,j+1,k)}\right) + \frac{k_{th}^m a_{zx}}{h_y}\left(t_{(i,j,k)} - t_{(i,j-1,k)}\right)$ (32)

for the meshes inside the cell body (i = 2, …, (nx − 1), j = 2, …, (ny − 1), k = 2, …, (nz − 1)):

$q_{(i,j,k)} = \frac{k_{th}^m a_{xy}}{h_z}\left(t_{(i,j,k)} - t_{(i,j,k-1)}\right) + \frac{k_{th}^m a_{yx}}{h_z}\left(t_{(i,j,k)} - t_{(i,j,k+1)}\right) + \frac{k_{th}^m a_{zy}}{h_x}\left(t_{(i,j,k)} - t_{(i-1,j,k)}\right) + \frac{k_{th}^m a_{yz}}{h_x}\left(t_{(i,j,k)} - t_{(i+1,j,k)}\right) + \frac{k_{th}^m a_{xz}}{h_y}\left(t_{(i,j,k)} - t_{(i,j-1,k)}\right) + \frac{k_{th}^m a_{zx}}{h_y}\left(t_{(i,j,k)} - t_{(i,j+1,k)}\right)$ (33)

for the meshes at the surface (i = 2, …, (nx − 1), j = 2, …, (ny − 1), k = 1):

$q_{(i,j,k)} = \varepsilon\sigma f_s a_{xy}\left(t_{(i,j,k)}^4 - t_\infty^4\right) + h_{xy} a_{xy}\left(t_{(i,j,k)} - t_\infty\right) + \frac{k_{th}^m a_{yx}}{h_z}\left(t_{(i,j,k)} - t_{(i,j,k+1)}\right) + \frac{k_{th}^m a_{yz}}{h_x}\left(t_{(i,j,k)} - t_{(i-1,j,k)}\right) + \frac{k_{th}^m a_{zy}}{h_x}\left(t_{(i,j,k)} - t_{(i+1,j,k)}\right) + \frac{k_{th}^m a_{zx}}{h_y}\left(t_{(i,j,k)} - t_{(i,j-1,k)}\right) + \frac{k_{th}^m a_{xz}}{h_y}\left(t_{(i,j,k)} - t_{(i,j+1,k)}\right)$ (34)

for the meshes at the gap:

$q_{(i,j,k)} = \frac{k_{th}^m a_{yz}}{h_x}\left(t_{(i,j,k)} - t_{(i-1,j,k)}\right) + \frac{k_{th}^m a_{zy}}{h_x}\left(t_{(i,j,k)} - t_{(i+1,j,k)}\right) + \frac{k_{th}^m a_{xz}}{h_y}\left(t_{(i,j,k)} - t_{(i,j+1,k)}\right) + \frac{k_{th}^m a_{zx}}{h_y}\left(t_{(i,j,k)} - t_{(i,j-1,k)}\right) + \frac{k_{th}^m a_{xy}}{h_z}\left(t_{(i,j,k)} - t_{(i,j,k-1)}\right) + h_{gap} a_{yx}\left(t_{(i,j,k)} - t_{(i,j,k+1)}\right)$ (35)

where: h_x, h_y, h_z are the mesh widths in x, y and z; a_xz = h_x h_z, a_xy = h_x h_y, a_yz = h_y h_z; m is the material number; and n_x, n_y, n_z are the total numbers of meshes in the x-, y- and z-directions.

the study is confined to a single silicon photovoltaic module of the following dimensions: thickness of silicon 0.5 mm, sandwiched between a glass layer which is taken as 3 mm and mylar 0.5 mm, giving a total photovoltaic module thickness of 4 mm. both length and width are taken as 40 mm. also, it is assumed that all the heat is absorbed homogeneously inside the silicon wafer. the results are shown in figs. 7 and 8, which show the temperature distribution along the xy-plane at the mid z-axis of the silicon wafer, and the temperature distribution along the xz-plane at the mid y-axis.
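the explicit iteration can be pictured with a deliberately simplified stand-in: one material, a uniform mesh, heat generation in the bulk and convective losses on the top and bottom faces. property values, mesh and time step below are illustrative assumptions, and radiative exchange, eq. (28), would enter the face update in the same way:

```python
import numpy as np

nx, ny, nz = 20, 20, 4                    # coarse mesh (illustrative)
dx = dy = dz = 1.0e-3                     # mesh widths h_x, h_y, h_z [m]
k_th, rho, cp = 148.0, 2330.0, 700.0      # silicon-like properties
h_c, t_amb = 10.0, 25.0                   # convection coefficient, ambient [°C]
q_vol = 2.0e5                             # heat generated in the bulk [W/m3]
dt = 1.0e-4                               # explicit time step [s]

alpha = k_th / (rho * cp)
assert dt < dx**2 / (6 * alpha), "explicit scheme would be unstable"

t = np.full((nx, ny, nz), t_amb)
for _ in range(5000):
    tp = np.pad(t, 1, mode="edge")        # zero-flux side walls
    lap = ((tp[2:, 1:-1, 1:-1] - 2 * t + tp[:-2, 1:-1, 1:-1]) / dx**2
         + (tp[1:-1, 2:, 1:-1] - 2 * t + tp[1:-1, :-2, 1:-1]) / dy**2
         + (tp[1:-1, 1:-1, 2:] - 2 * t + tp[1:-1, 1:-1, :-2]) / dz**2)
    t_new = t + dt * (alpha * lap + q_vol / (rho * cp))
    # convective losses on the top (k = 0) and bottom (k = nz-1) faces
    t_new[:, :, 0] -= dt * h_c * (t[:, :, 0] - t_amb) / (rho * cp * dz)
    t_new[:, :, -1] -= dt * h_c * (t[:, :, -1] - t_amb) / (rho * cp * dz)
    t = t_new

print(f"hottest mesh after {5000 * dt:.1f} s: {t.max():.2f} °C")
```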
fig. 6: influence of angle of incidence on cell parameters
fig. 7: temperature distribution along the xy plane at mid z-axis
fig. 8: temperature distribution along the xz plane at mid y-axis

4.2 temperature dependence of solar module parameters

fig. 9 shows the dependence of isc, voc, ff, and pm on temperature. as the cell temperature increases, the short circuit current isc increases somewhat, while the maximum power pm, the open circuit voltage voc and the fill factor ff decrease. the increase of the short circuit current with temperature is attributed to a variation in light absorption. under the conditions considered, the solar absorption edge shifts due to a decrease in the width of the crystal forbidden gap, and the number of pairs generated in the bulk increases. in other words, as the temperature increases, the intrinsic light absorption factor changes and the short circuit current increases. to explain the variation of voc with temperature, the effects of all voltage-controlling parameters (mobility, lifetime, energy band gap and absorption coefficient) have to be included in the analysis. in the general case, the open circuit voltage is determined by the relation $v_{oc} \propto \ln(i_{sc}/i_d)$, by which voc must grow as isc increases.

fig. 9: variation of key parameters of a silicon solar cell at high temperatures
depending on the current-flow mechanism, the value of i_d is proportional to n_i^2. as the temperature increases, n_i grows exponentially and causes an exponential growth of i_d. moreover, the narrowing of the crystal forbidden gap with increasing temperature leads to an increase in the dark current i_d and minimizes the positive effect of the absorption factor. as a result, voc decreases. the maximum power across the optimal load and the efficiency depend strongly on temperature; the shape of this dependence is close to linear.

conclusions

a simple and accurate closed-form solution for the optical and thermal behavior of a solar cell array was investigated by simulation and by some verifying experiments, in order to obtain more accurate results in predicting the actual performance of photovoltaic modules, without resorting to a lengthy, time-consuming iterative solution which has to be repeated for any change in the parameters of the array.

nomenclature

g_h hourly global radiation incident on a horizontal surface, kwh/m2/day
g_t hourly global radiation incident on an inclined surface, kwh/m2/day
i_b,h hourly beam radiation incident on a horizontal surface, kwh/m2/day
i_b,t hourly beam radiation incident on an inclined surface, kwh/m2/day
i_d,h hourly sky diffuse radiation incident on a horizontal surface, kwh/m2/day
i_d,t hourly sky diffuse radiation incident on an inclined surface, kwh/m2/day
i_o extraterrestrial hourly radiation incident on a horizontal surface, kwh/m2/day
i_g,h hourly ground reflected radiation incident on a horizontal surface
i_g,t hourly ground reflected radiation incident on an inclined surface, kwh/m2/day
i_t,cir circumsolar radiation component
i_t,iso isotropic radiation component
n refractive index
a_i anisotropy index
r_b ratio of beam radiation on the tilted surface to that on a horizontal surface
r amplitude reflection coefficient of the incident wave
β surface slope, degrees
δ declination, degrees
φ latitude, degrees
ρ ground albedo
θ_i incidence angle, the angle between the beam radiation and the vertical, degrees
θ_o angle of refraction
ω hour angle, degrees
γ surface azimuth angle, degrees
e_o^-, e_i^+ amplitudes of the reflected and transmitted waves
ρ_oi reflectance
q_rad total radiative heat flow from the module, w
q_conv convective heat flow, w
q_cond quantity of heat transferred by conduction from one mesh to another, w
k_th thermal conductivity of the different materials of the solar cell, w/mk
t_(i,j,k) module surface temperature, °c
t_∞ ambient temperature, °c
ε emissivity
f_s radiation shape factor
a_m radiative surface area of the module
a_c surface area of the cell, m2

references

[1] born, m., wolf, e.: principles of optics. 5th edition, pergamon, new york, 1975
[2] bugler, j.: the determination of hourly insolation on a tilted plane using a diffuse irradiance model based on hourly measured global horizontal insolation. solar energy, 1977, vol. 19, no. 5, p. 477
[3] dave, j.: validity of the isotropic distribution approximation in solar energy estimations. solar energy, 1977, vol. 19, p. 331
[4] duffie, j., beckman, w.: solar engineering of thermal processes. wiley, new york, 1980
[5] fraidenraich, n., vilela, o.: exact solutions for multilayer optical structures: application to pv modules. solar energy, 2000, vol. 69, no. 5, p. 357
[6] gardner, c., nadeau, c.: estimating south slope irradiance in the arctic – a comparison of experimental and modeled values. solar energy, 1988, vol. 41, p. 227
[7] green, m.: high efficiency silicon solar cells. trans tech publications, switzerland, 1987
[8] gueymard, c.: an anisotropic solar irradiance model for tilted surfaces and its comparison with selected engineering algorithms. solar energy, 1987, vol. 38, p. 367
[9] hay, j.: calculation of monthly mean solar radiation for horizontal and tilted surfaces. solar energy, 1979, vol. 23, p. 301
[10] hay, j., davies, j.: calculation of the solar radiation incident on an inclined surface. proceedings first canadian solar radiation data workshop, 1980, p. 59
[11] holman, j.: heat transfer. mcgraw-hill, 1982
[12] hottel, h. c., woertz, w.: performance of flat plate solar heat collectors. transactions of the american society of mechanical engineers, 1942, vol. 64, p. 91
[13] ineichen, p., perez, r., seals, r.: the importance of correct albedo determination for adequately modeling energy received by tilted surfaces. solar energy, 1987, vol. 39, p. 221
[14] iqbal, m.: an introduction to solar radiation. academic press, toronto, 1983
[15] klein, m., furtak, t.: optics. wiley, new york, 1993
[16] klucher, t.: evaluation of models to predict insolation on tilted surfaces. solar energy, 1979, vol. 23, p. 114
[17] martin, n.: comparative study of the angular influence on different pv module technologies. proceedings 14th e.c. pvsec, 1997
[18] nakamura, s.: applied numerical methods with software. prentice hall, new york, 1991
[19] pfrommer, p., lomas, k., seale, c., kupke, c.: the radiation transfer through coated and tinted glazing. solar energy, 1995, vol. 54, no. 5, p. 287
[20] smith, g.: numerical solution of partial differential equations: finite difference method. oxford, 1979
[21] temps, r. c., coulson, k. l.: solar radiation incident upon slopes of different orientations. solar energy, 1977, vol. 19, p. 179
[22] wenham, s., green, m., watt, m.: applied photovoltaics. university of new south wales, centre for photovoltaic systems and devices, australia, 1994
[23] harbeke, b.: coherent and incoherent reflection and transmission of multi-layer structures. applied physics, 1989, vol. 39, pp. 165–170

eng. hamdy kamal elminir, e-mail: ehamdy@hotmail.com, phone: +420 2 2435 2212, fax: +420 2 2435 3949
prof. ing. vítězslav benda, csc.
ing. ivan kudláček, csc.
czech technical university in prague, faculty of electrical engineering, technická 2, 166 27 praha 6, czech republic
acta polytechnica 54(3):173–176, 2014 · doi:10.14311/ap.2014.54.0173 · © czech technical university in prague, 2014 · available online at http://ojs.cvut.cz/ojs/index.php/ap

fact — status and first results

daniela dorner (a,*), a. biland (b), t. bretz (b), j. buß (c), s. einecke (c), d. eisenacher (a), d. hildebrand (b), m. l. knoetig (b), t. krähenbühl (b), w. lustermann (b), k. mannheim (a), k. meier (a), d. neise (c), a.-k. overkemping (c), a. paravac (a), f. pauss (b), w. rhode (c), m. ribordy (d), t. steinbring (a), f. temme (c), j. thaele (c), p. vogler (b), r. walter (e), q. weitzel (b), m. zänglein (a) (fact collaboration)

a universität würzburg, germany — institute for theoretical physics and astrophysics, 97074 würzburg
b eth zurich, switzerland — institute for particle physics, schafmattstr. 20, 8093 zurich
c technische universität dortmund, germany — experimental physics 5, otto-hahn-str. 4, 44221 dortmund
d epf lausanne, switzerland — laboratory for high energy physics, 1015 lausanne
e university of geneva, switzerland — isdc data center for astrophysics, chemin d’ecogia 16, 1290 versoix
* corresponding author: dorner@astro.uni-wuerzburg.de

abstract. fact is the first imaging cherenkov telescope based on a camera using solid state photosensors (geiger-mode avalanche photodiodes, g-apd, also known as sipm). since october 2011, it has been taking data regularly. apart from commissioning and calibration measurements, it has already started regular operation, where the main goal is long-term monitoring of bright tev blazars. in june 2012, a flare of mrk 501 was observed. thanks to the robustness of the g-apds, observations can be carried out during strong moon light without aging of the sensors. this improves the duty cycle of the instrument and provides better statistics for long-term light curves.
the telescope, situated on the canary island of la palma, is already operated remotely from central europe; for the future, robotic operation is planned. we report on our experiences during the commissioning, and we present first results from the first 1.5 years of observations.

keywords: cherenkov astronomy, gamma astronomy, monitoring, agn, blazar.

1. fact — first g-apd cherenkov telescope

with the aim of monitoring bright active galactic nuclei (agn) in the very high energy (vhe) range, the first g-apd cherenkov telescope (fact) was designed for stable and robust operation. g-apds [1] were therefore chosen as photosensors, which also provided the opportunity to show that si-based photodetectors are suitable for cherenkov telescopes. in cherenkov astronomy, the cameras have until now been built using photo-multiplier tubes (pmts). apart from the use of this new technology, fact features plexiglass cones, a sum trigger, electronics based on the drs-4 [2] analogue ring buffer and fully integrated into the camera housing, as well as a readout via standard ethernet. the g-apd camera was mounted on the refurbished hegra ct3 mount, which was also equipped with a new drive system (basically a down-scaled version of the magic drive system [3]) and the re-coated mirrors of the former hegra ct1 telescope, providing a mirror area of 9.5 m². with its 1440 pixels, each consisting of one g-apd and a plexiglass cone glued to it, fact has a field of view of 4.5°. the camera has a low power consumption of less than 500 watts. a full description of the design and the construction of the system can be found in [4]. a photo of the telescope is shown in figure 1.

figure 1: the first g-apd cherenkov telescope, situated on the canary island of la palma at 2200 m a.s.l. photo: courtesy of daniela dorner.

the camera was mounted on the telescope in october 2011. since then, it has been taking data regularly. thanks to its reliability, fact is the first telescope of its kind that is remotely operated.

1.1. stability of the detector

g-apds already provide performance comparable to that of the best pmts available. for the future, an increase in the photon detection efficiency as well as a significant reduction in cost is expected. g-apds are very easy to handle, as they do not require a high voltage but can be operated at about 70 volts. in addition, they provide very good timing resolution. being insensitive to magnetic fields and mechanically more robust, they are an ideal alternative for long-term use in a monitoring telescope which is operated remotely or robotically. another very important advantage is that g-apds can be operated during strong moon light. so far, no indication of ageing due to strong light has been found [5]. afterpulses, darkcounts and crosstalk are well under control for fact: the darkcount rates are much lower than the rates from the night sky background light (nsb), and can therefore be neglected. the crosstalk (i.e., two or more cells in one pixel discharged by a single photon) only increases the signal in the pixels slightly and therefore does not pose a problem except for fake triggers; in fact, the operation voltage of the g-apds is set to result in a crosstalk probability of about 10 %, which is well within acceptable limits. the afterpulses can be treated by choosing a signal extractor insensitive to this feature.
in addition, afterpulses in g-apds do not tend to arrive with constant delays, as they do in pmts, and are therefore no problem for the trigger. one important feature of g-apds that cannot be ignored is the dependence of their gain on temperature and on the applied voltage. these dependencies have to be corrected in order to keep the gain of the g-apds stable, which is important for ensuring consistent and stable data. the temperature dependence of g-apds is known and can therefore be corrected easily. in addition, nsb light introduces a continuous current in the g-apd, causing a voltage drop. a special method has been developed to correct for this: the voltage change needed to keep the gain constant is calculated from the measured current. since may 2012, a feedback system including temperature and current correction has been in use [6], and a detailed paper is in preparation. measurements of the gain with the help of a temperature-stabilized light pulser show that with this method the gain can be kept constant within 6 % in time, i.e., correcting for temperature changes of more than 15 degrees, and within 4 % between the individual pixels.

scans of the trigger rate in dependence on the applied trigger threshold have been carried out under various light conditions, from dark night to almost full moon. the result of these rate scans shows that the change in the rate at a low trigger threshold depends only on the effect of nsb light, while at high thresholds the rate from hadronic showers remains constant. this proves not only that fact keeps the nsb dependence well under control, but also allows non-standard conditions of the atmosphere to be detected [7]. this means that, apart from a higher energy threshold, the performance of the system is stable for all light conditions. in this way, a lot of observation time can be gained, improving the duty cycle of the telescope. this is a big advantage for monitoring, as the gaps between the observations can be kept small.

based on the analysis of muon rings, an upper limit for the timing resolution of the whole system could be determined. the distribution of the rms of the arrival times of the pulses in a muon ring provides a measurement of the timing resolution, as the arrival time spread of the light from muons is known to be very small; the arrival times of muon pulses can therefore give an upper limit for the timing resolution of the whole system. for fact, the timing resolution has been measured to be around 600 picoseconds, probably dominated by the non-isochronicity of the mirror.
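the feedback correction described above reduces, in essence, to adjusting the bias voltage by a temperature term and an ohmic term. the following is a minimal sketch of such a scheme, not the fact implementation; the coefficient value, the reference temperature and the effective resistance are invented placeholders.

```python
# minimal sketch of a g-apd bias-voltage feedback (hypothetical coefficients,
# not the fact control code): the gain depends on the over-voltage above
# breakdown; the breakdown voltage drifts with temperature, and nsb-induced
# current causes an ohmic voltage drop in the supply chain. both effects are
# compensated by adjusting the applied voltage.

DVDT = 0.055   # breakdown-voltage temperature coefficient [v/k] (placeholder)
R_EFF = 1.0e3  # effective series resistance seen by the bias supply [ohm] (placeholder)
T_REF = 25.0   # reference temperature [deg c] (placeholder)

def corrected_bias(v_nominal, temperature, current):
    """return the bias voltage that keeps the g-apd gain constant.

    v_nominal   -- voltage calibrated at T_REF with no nsb current [v]
    temperature -- sensor temperature [deg c]
    current     -- measured nsb-induced current [a]
    """
    v_temp = DVDT * (temperature - T_REF)  # follow the breakdown voltage
    v_drop = R_EFF * current               # compensate the ohmic voltage drop
    return v_nominal + v_temp + v_drop

# example: a warm night with strong moon light
print(corrected_bias(v_nominal=70.0, temperature=32.0, current=2.0e-3))
```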
2. data selection and analysis

the results shown here include the data taken between 1.5.2012 and 30.6.2013. all data were taken in wobble mode with an offset of 0.6° from the camera center. details on the data selection and data quality checks can be found in [8]. the performance of the telescope was studied with observations of the crab nebula in all possible observation conditions, including zenith distances up to 75° and light conditions ranging from dark night to almost full moon. with the current analysis, the excess rate remains stable for zenith distances up to 40° and also with moderate moon light. based on this, and in addition to the data check mentioned above, data with zenith distance smaller than 40° and a trigger threshold of less than 500 dac counts were selected.

figure 2: signals of mrk 421 (left: significance 34.0 σ, 1597.8 excess events, 1275.2 background events, 141 h) and mrk 501 (right: significance 41.1 σ, 2371.4 excess events, 1943.6 background events, 258 h). ϑ²-distributions for both sources, where the black crosses are the signal and the gray shaded area is the background measurement (off scale 0.20). the vertical dashed line indicates the cut in ϑ of 0.11°; the events to the left of this line are used to determine the significance of the detection. these plots include data from 1.5.2012 until 30.6.2013 with zenith distance smaller than 40° and a trigger threshold smaller than 500 dac counts. a data quality check as described in [8] has been applied.

the analysis is done using the software package mars cheobs ed. [10]. details on the analysis chain and the quick-look analysis are given in [8] and [9]. so far, the analysis has been providing excess rate curves, as shown and discussed in detail in [11]. work on the monte carlo simulations and on reconstructing the energy spectrum is ongoing. once the dependency of the performance of the system on the zenith distance and the light conditions is fully understood, data taken at larger zenith distances or under different moon conditions can be corrected and will be included. nevertheless, significant flaring activities will show flux changes larger than these corrections, so it is already possible to send alerts.

3. results

in the time between 1.5.2012 and 30.6.2013, mrk 421 was detected with 34 σ in 141 hours of observation (see figure 2, left plot), and mrk 501 was detected with 41 σ in 258 hours (see figure 2, right plot). from both sources, major flares could be detected (two from mrk 501 and one from mrk 421). more details on the results of the long-term monitoring of bright tev blazars with fact can be found in [11]. in the night with the highest flux measured so far by fact, mrk 501 was detected with 22.6 σ in 1.9 hours (see figure 3, left plot), and within a single run of 5 minutes the source was detected with up to 6.7 σ during this flare night (see figure 3, right plot), which nicely demonstrates the capability of fact to send fast flare alerts to other telescopes.

figure 3: signals of mrk 501 for the night of 8.6.2012 (left: significance 22.6 σ, 222.6 excess events, 17.4 background events, 1.9 h) and for one 5-minute run from that night (right: significance 6.7 σ, 15.6 excess events, 0.4 background events). ϑ²-distributions for subsets of the mrk 501 observations, with the same conventions as in figure 2. data selection criteria, as described in the text, have been applied.

during the first 1.5 years of operation, the crab nebula has also been observed. by comparing these results to those of hegra, a sensitivity of 8 % crab and an energy threshold of about 700 gev for cuts optimized on sensitivity and about 400 gev for open cuts are estimated. preliminary results on this can be found in [9].
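figures 2 and 3 quote significances together with excess and background counts and an off scale of 0.20. the paper does not spell out the estimator, but the quoted values are reproduced by the standard li & ma (1983) likelihood-ratio formula, the usual choice in cherenkov astronomy; the following is an illustrative sketch of that formula (not fact's analysis code), checked against the mrk 421 numbers.

```python
import math

def li_ma_significance(n_on, n_off, alpha):
    """significance of a gamma-ray excess, eq. (17) of li & ma (1983).

    n_on  -- counts in the signal (on) region
    n_off -- counts in the background (off) region
    alpha -- on/off exposure ratio (here 0.2, i.e. five off regions)
    """
    term_on = n_on * math.log((1 + alpha) / alpha * n_on / (n_on + n_off))
    term_off = n_off * math.log((1 + alpha) * n_off / (n_on + n_off))
    return math.sqrt(2.0 * (term_on + term_off))

# mrk 421 numbers from figure 2: 1597.8 excess and 1275.2 background events
# with alpha = 0.2, so n_on = excess + background and n_off = background/alpha
excess, background, alpha = 1597.8, 1275.2, 0.2
print(li_ma_significance(excess + background, background / alpha, alpha))
# -> about 34.0, matching the quoted significance
```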
4. conclusions and outlook

since october 2011, fact has been in stable operation, and since january 2012 continuous monitoring of bright agn in the tev range has been ongoing. three major outbursts at tev energies of the sources mrk 501 and mrk 421 have been detected so far. in the meantime, the telescope can be operated remotely and automatically via the internet, and the operation is being further automated, aiming at a fully robotic operation [12]. thanks to the robust photosensors, observations during strong moon light are possible. this is ideal for monitoring, as it enlarges the duty cycle and provides a more complete data sample. g-apds have been shown to be ideal photosensors for the camera of a cherenkov telescope. inexpensive, small telescopes like fact are ideal systems for extended monitoring of bright tev blazars and for multi-wavelength campaigns.

acknowledgements

the important contributions from eth zurich grants eth-10.08-2 and eth-27.12-1, as well as the funding from the swiss snf and the german bmbf (verbundforschung astro- und astroteilchenphysik), are gratefully acknowledged. we are thankful for the very valuable contributions from e. lorenz, d. renker and g. viertel during the early phase of the project. we thank the instituto de astrofisica de canarias for allowing us to operate the telescope at the observatorio roque de los muchachos in la palma, the max-planck-institut für physik for providing us with the mount of the former hegra ct3 telescope, and the magic collaboration for their support. we further thank the group of marinella tose from the college of engineering and technology at western mindanao state university, philippines, for providing us with the scheduling web-interface.

references

[1] hamamatsu, http://www.hamamatsu.com/jp/en/product/category/3100/4004/4113/s10362-33-050c/index.html [2014-06-01]
[2] http://www.psi.ch/drs/
[3] t. bretz et al.: the drive system of the major atmospheric gamma-ray imaging cherenkov telescope. astroparticle physics 31:92–101, 2009, doi:10.1016/j.astropartphys.2008.12.001
[4] h. anderhub et al. (fact collaboration): design and operation of fact, the first g-apd cherenkov telescope. jinst 8 p06008, 2013, doi:10.1088/1748-0221/8/06/p06008
[5] m. l. knoetig et al. (fact collaboration): fact: long-term stability and observations during strong moon light. proceedings of 32nd icrc, rio de janeiro 2013
[6] t. bretz et al. (fact collaboration): fact: how stable are the silicon photo detectors? proceedings of 32nd icrc, rio de janeiro 2013
[7] d. hildebrand et al. (fact collaboration): measuring atmospheric conditions with imaging air cherenkov telescopes. proceedings of 32nd icrc, rio de janeiro 2013
[8] d. dorner et al. (fact collaboration): fact: long-term monitoring of bright tev blazars. proceedings of 32nd icrc, rio de janeiro 2013
[9] t. bretz et al. (fact collaboration): fact: the first g-apd cherenkov telescope. proceedings of 5th gamma-ray conference, heidelberg 2012, doi:10.1063/1.4772374
[10] t. bretz, d. dorner: mars cheobs ed. – a flexible software framework for future cherenkov telescopes. astrop., particle and space physics, 681, 2010, doi:10.1142/9789814307529_0111
[11] k. meier et al. (fact collaboration): fact: monitoring of bright tev blazars. these proceedings, 2014
[12] a. biland et al. (fact collaboration): fact: towards robotic operation of an imaging air cherenkov telescope. proceedings of 32nd icrc, rio de janeiro 2013
acta polytechnica 53(supplement):500–505, 2013 · doi:10.14311/ap.2013.53.0500 · © czech technical university in prague, 2013 · available online at http://ojs.cvut.cz/ojs/index.php/ap

the discovery of the chandrasekhar mass and the chandrasekhar–eddington controversy

giora shaviv*
department of physics, israel institute of technology, haifa 32000, israel
* corresponding author: gioras@physics.technion.ac.il

abstract. the so-called chandrasekhar limiting mass is a quantum mechanical relativistic effect. the discovery and establishment of the concept involved a major controversy between the young chandrasekhar and the hilarious eddington. we review the origin and evolution of the controversy.

keywords: chandrasekhar, eddington, limiting mass, compact objects.

1. introduction

a hot subject of research in the early 1920s was the distribution of the electrons in the various atomic shells. the correct electron arrangement in atoms was found by edmund stoner (1899–1968) in 1924. based on optical spectra, stoner attempted to find the arrangement of the electrons in the various levels. it is remarkable, stated stoner, that the number of electrons in each complete level is equal to double the sum of the inner quantum numbers as assigned. the electrons appeared to come in pairs which occupy the same quantum states. stoner's distribution of electrons was the distribution we know today, and as stoner had already shown, it explained the chemical and the physical properties as they vary throughout the periodic table. in this distribution the electrons come in pairs, and not more than two occupy the same quantum state. however, stoner went one step further and characterized the states of the electrons by two numbers: the first number was identical to the principal quantum number n of bohr, and the second could take values from 0 to n−1. indeed stoner noted that each electron has another l value.

pauli's interest in the problem arose in 1922 when he met niels bohr, who lectured in göttingen on his new theory to explain the periodic system of elements. right after bohr came up with his model of the multi-electron atom, the following question arose: why do all the electrons in the atom not fall to the lowest energy level? as a matter of fact, bohr had already discussed this problem but could not find a satisfactory solution. a hint as to what goes on came when a strong magnetic field was applied to the atom. so far it was known that all electrons in a given shell possess the same energy. however, when a magnetic field was applied to the atom, the various sub-states within each shell obtained different energies.
very soon, pauli realized that electrons immersed in a strong magnetic field have different quantum numbers and still do not descend to a lower state; however, he did not have a clue as to why this is so. in 1923, pauli returned to the university of hamburg. the lecture he gave in order to obtain the title of privatdozent was on the periodic system of elements, and in it he expressed disappointment that the problem of closed electronic shells had no explanation. the only thing that was clear was the connection to the multiplet structure of the energy levels. according to the popular notion at that time, non-vanishing angular momentum had to do with doublet splitting; however, this was just a guess. in 1924 pauli published some arguments against this point of view, attributing the doublet structure instead to a peculiar, classically non-describable two-valuedness of the electron. at that time, the following essential remark by stoner was published: for a given value of the principal quantum number, the number of energy levels of a single electron in the alkali metal spectra in an external magnetic field is the same as the number of electrons in the closed shell of the rare gases which corresponds to this principal quantum number. it is this sentence by stoner, as pauli wrote, which led him to the idea that: the complicated numbers of electrons in closed subgroups are reduced to the simple number one, if the division of the group by giving the values of the four quantum numbers of an electron is carried so far that every degeneracy is removed. an entirely non-degenerate energy level is already closed, if it is occupied by a single electron. states in contradiction with this postulate have to be excluded. the general principle was finally formulated in 1925. in simpler words: in a given system of many electrons, no two electrons can have the same quantum numbers. about twenty years later, the exclusion principle brought pauli the nobel prize. stoner was not summoned to stockholm.

2. enrico fermi & paul dirac

enrico fermi (1901–1954) was bothered by the fact that the equations of an ideal gas, in particular the expression for the heat capacity at constant volume, did not satisfy the nernst (1864–1941) law, which demands that absolute zero temperature cannot be reached in a finite number of steps. when fermi saw the papers by stoner and pauli, he set out to find to what extent the application of the new principle to the molecules of an ideal gas yields an expression which satisfies nernst's general principles. interestingly, there was no mention in fermi's paper of electrons, the particles to which the pauli principle was applied. eventually, fermi discovered what is known today as the fermi–dirac statistics. fermi and dirac, independently, immediately grasped the far-reaching implications of the pauli exclusion principle (pep) for gases of particles which obey it, such as electrons. with his fantastic physical intuition, fermi derived his results directly, while dirac, with his superb mathematical skills, developed the general theory of the behavior of quantum particles and derived both the fermi result and the bose–einstein result as special cases of his general theory. it was sommerfeld who applied the new statistics to the theory of metals and introduced the idea that the free electrons in a metal are a fermi gas.
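for reference, in modern notation (not that of the original papers), the statistics dirac's general theory delivers are the mean occupation numbers of a single-particle state of energy ε:

\[
\bar{n}_{\mathrm{FD}}(\varepsilon) = \frac{1}{e^{(\varepsilon-\mu)/k_{\mathrm{B}}T} + 1}\,,
\qquad
\bar{n}_{\mathrm{BE}}(\varepsilon) = \frac{1}{e^{(\varepsilon-\mu)/k_{\mathrm{B}}T} - 1}\,.
\]

the pauli principle is encoded in \(\bar{n}_{\mathrm{FD}} \le 1\); in the limit \(T \to 0\) the fermi distribution becomes a step function, with every level filled up to the fermi energy \(\varepsilon_{\mathrm{F}} = \mu(T{=}0)\) and empty above it. this completely degenerate configuration is exactly the state fowler invokes for white dwarfs below.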
3. eddington's white dwarf paradox

in his famous book, eddington (1926) pointed to a paradoxical situation. as a star contracts, the gravitational pull increases and, as a consequence, the temperature and the density of the gas must increase so as to counterbalance the increase in the gravitational pressure. at the same time, the star continues to lose energy from the surface. how can this be? part of the gravitational energy goes into heating the gas and the rest is radiated away. so stars are unique objects: they lose energy all their life and, as a consequence, heat up! and inversely, stars cannot cool! as eddington pointed out: to die by cooling, the star must lower its temperature and hence reduce its gas pressure, and in order to stay balanced it must decrease the gravitational pull, which it can do only by expansion or by having some extra source of energy which nobody had thought of. in eddington's words: we can scarcely credit the star with sufficient foresight to retain more than 90 % in reserve for the difficulty awaiting it. . . . imagine a body continually losing heat but with insufficient energy to grow cold! the paradox shook the scientific community!

4. cracking the paradox: fowler

ralph h. fowler (1889–1944) was a leading physicist with contributions to statistical mechanics and astrophysics. dirac's paper was communicated by fowler to the royal society on august 26, 1926. on november 3, fowler communicated his own paper applying the laws of the 'new quantum theory' to the statistical mechanics of assemblies consisting of similar particles, and by december 10 his paper entitled 'dense matter' was read before the ras. fowler solved the paradox by applying sommerfeld's theory of metals to stars: a star devoid of energy sources can reach zero temperature, and the pressure generated by the compressed electrons would be large enough to balance the weight of the stellar layers attempting to collapse inward. amazing. the temperature of a gas reflects the number of states the system can be in: the higher the temperature, the more states the system can be in. here we find, à la fowler, that white dwarfs are in the single lowest possible state, namely, all particles fill all the energy levels, exactly like the electrons in an atom. the gravitational force, which pushes the white dwarf into this state, appears to act in a direction opposite to thermodynamics: the star cools to the state of a white dwarf, and reaches the most ordered state with the lowest entropy. in his obituary of fowler, chandrasekhar described this discovery as among the more important astronomical discoveries of our time. fowler, in eddington's language, allowed stars to die by cooling.

5. pokrowski – the idea of a limiting mass

a surprising paper appeared in 1928 by the russian scientist pokrowski. pokrowski assumed that the maximum density of the matter in the star is obtained when all fully ionized nuclei touch each other; provided nuclei cannot be compressed, this should be the maximum density that matter can reach. this state is known today as 'nuclear matter'. pokrowski estimated this density to be ρ_max = 4 × 10^(13±1) g/cm³. assume now a star with mass m and uniform density ρ = ρ_max. it is simple to calculate the energy needed by a particle of mass m on the surface of the star to escape from the star to infinity. since ρ_max is fixed, there exists a stellar mass m_lim for which the energy needed to escape exceeds the rest-mass energy, and hence no energy/particle can leave such a star and it cannot be observed. pokrowski claimed that for m > m_lim energy cannot leave the star. according to pokrowski's calculations, m_lim = 30.29 m☉. this was a pure classical calculation.
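the flavor of the argument can be reconstructed in one line. this is a sketch under the stated assumptions (uniform density, escape energy compared with the rest-mass energy); pokrowski's own criterion differs in factors of order unity:

\[
\frac{G M_{\lim}}{R} \sim c^{2}\,, \qquad
R = \left(\frac{3 M_{\lim}}{4\pi \rho_{\max}}\right)^{1/3}
\quad\Longrightarrow\quad
M_{\lim} \sim \frac{c^{3}}{G^{3/2}} \left(\frac{3}{4\pi \rho_{\max}}\right)^{1/2},
\]

which for \(\rho_{\max} = 4\times10^{13}\,\mathrm{g/cm^{3}}\) gives a few tens of solar masses, the same scale as pokrowski's 30.29 m☉.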
6. anderson expands pokrowski's idea but changes the reasons

hardly a year after pokrowski's publication, wilhelm anderson from tartu university in estonia took pokrowski's idea a bit further, repeating the calculation without the new general theory of relativity. anderson argued as follows: the energy that the star radiates is equivalent to mass, so when the star radiates into space it contracts and decreases its mass. he therefore calculated how much mass a star loses as a function of the original mass before it reaches ρ_max. for example, if the initial mass is 334 m☉, about 0.55 m☉ of the stellar mass is radiated before the star reaches ρ_max; and if the initial mass is 4.82 × 10^7 m☉, the final mass is 370 m☉, so that the amount radiated away is 1 − 10^(−6) = 0.999999 of the initial mass. hence, concluded anderson, the final mass of a star must be smaller than 370 m☉. however, anderson's most important contribution was the following: after sending the paper for publication, he became aware of stoner's paper (see next) and remarked correctly, in a note added in proof, that stoner ignored the effects of special relativity, and hence his results are good only for small stellar masses.

7. stoner: relativistic degeneracy leads to a limiting mass

at this point, stoner entered the picture once more and published a sequence of papers in which the idea of a limiting mass gradually evolved. by now he was aware of the pauli principle and, of course, of fowler's work, which he applied. in the first paper stoner developed the idea that there may exist a ρ_max not due to full ionization but due to the 'jamming' of the electrons, which obey the fermi statistics. thus, the idea was basically that there exists a ρ_max which is smaller than the ρ_max derived by pokrowski and anderson. stoner mentioned jeans' stellar stability theory (which was not yet proven to be wrong), according to which a star cannot be stable if it satisfies the ideal gas laws; hence, the matter in a stable star must be in a liquid state. stoner also cited jeans' claim that atoms are fully ionized in white dwarfs, and argued that it is electron jamming, rather than nucleus jamming, which results in the departure from the gas laws which ensures the stability of the star. so stoner calculated the revised ρ_max caused by the pep. he adopted fowler's theory and assumed a mean molecular weight of µ = 2.5. to simplify the calculation, he used a constant density, like an incompressible liquid. stoner found a ρ_max beyond which the gravitational pull does not have the power to provide energy to the electrons so as to allow further contraction. the resulting density was found to be ρ_max = 3.85 × 10^6 (m/m☉)² g/cm³. stars that reach this density cannot contract any more, claimed stoner, so they cannot extract energy from the gravitational field; they consequently become dark and their temperature is zero. all stars are doomed to die when their density reaches ρ_max. the comparison with observations was excellent, and all known white dwarfs had mean densities below ρ_max: the mean density of sirius b, for example, is 5 × 10^4 g/cm³, while stoner's ρ_max for its mass is 2.77 × 10^6 g/cm³. the radii also agreed. stoner was happy with the results, because an electron gas in which all the energy levels are occupied is practically incompressible; in other words, it behaved like a liquid and hence satisfied jeans' condition for the stability of stars.
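the m² scaling in stoner's formula is just what non-relativistic degeneracy pressure dictates. as a quick consistency check (not stoner's own derivation), balance the degeneracy pressure \(P = K\rho^{5/3}\) against gravity:

\[
P \sim \frac{G M^{2}}{R^{4}}\,, \qquad
P = K \rho^{5/3} \sim K\,\frac{M^{5/3}}{R^{5}}
\quad\Longrightarrow\quad
R \propto M^{-1/3}, \qquad
\bar{\rho} \propto \frac{M}{R^{3}} \propto M^{2}.
\]

note the inverted mass–radius relation: the heavier the white dwarf, the smaller and denser it is.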
on the other hand, stoner mentioned that his results had no effect on the difficulties that jeans' condition implied for the stability of ordinary main sequence stars. there was no reference to pokrowski, whose paper was published well before, or to anderson, who published his paper at roughly the same time in the prestigious german zeitschrift für physik.

8. anderson again

soon after the semi-critical paper on pokrowski's limiting density, anderson published an analysis of the state of the electron gas in white dwarfs in which he criticized stoner's treatment of the problem. anderson's most important contribution was that he noted that as the density increases the electrons are driven to higher energies and quickly become relativistic: indeed, at a density of 10^6 g/cm³ the kinetic energy of the electron is already 0.28 of its rest-mass energy. the inclusion of special relativity turned out to be crucial.
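anderson's figure is easy to verify. a quick check, assuming full ionization with mean molecular weight per electron µ_e = 2: the relativity parameter of the electron gas is

\[
x \equiv \frac{p_{\mathrm{F}}}{m_{e} c}\,, \qquad
p_{\mathrm{F}} = \hbar\,(3\pi^{2} n_{e})^{1/3}, \qquad
n_{e} = \frac{\rho}{\mu_{e} m_{u}}\,,
\qquad
\frac{E_{\mathrm{kin}}}{m_{e}c^{2}} = \sqrt{1+x^{2}} - 1\,.
\]

at \(\rho = 10^{6}\,\mathrm{g/cm^{3}}\) this gives \(x \approx 0.80\) and a fermi kinetic energy of about 0.28 rest-mass energies, precisely the value quoted above.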
9. stoner responds

shortly after anderson's paper was published, stoner criticized his mathematical treatment, but accepted the basic idea that the role of the special theory of relativity is crucial. stoner found the way to carry out the calculation accurately. in particular, stoner demonstrated that as the density tends to infinity the mass tends to a finite value m_lim. stoner did not discuss what happens to stars with masses m > m_lim. do they contract forever? at a later time, stoner attempted to improve the estimates of the limiting mass by assuming a polytropic equation of state. the pressure of the condensed electron gas varies as ρ^(5/3) at low densities and as ρ^(4/3) at high densities; the effect of special relativity is thus to reduce the power of the dependence of pressure on density by just 1/3. it is this change in the exponent which became the subject of a fierce and emotionally charged controversy between chandrasekhar and eddington. as a matter of fact, stoner and tyler managed to solve the case of low density but just missed the idea of assuming an ideal star in which the polytropic index is everywhere 4/3, as dictated by the special theory of relativity. both papers were communicated by eddington to the journal; in other words, eddington communicated papers which included a result he objected to. moreover, stoner ended the paper with an acknowledgment to eddington for proposing the problem of the 'upper limits'. when it came from stoner, eddington did not raise any objection or controversy. i suspect that stoner's cardinal contribution to white dwarf theory was not much recognized by astrophysicists (a) because it was published in the philosophical magazine, a journal that most of them did not read, and (b) because stoner unfortunately suffered from diabetes and poor health, which restricted his travelling and limited the presentation of his results at meetings.

10. chandrasekhar

chandrasekhar (1910–1995) met sommerfeld in 1928 during sommerfeld's trip to india, and heard his seminar on the new theory of metals and the fermi–dirac statistics. at this time, chandrasekhar decided to go to england and not to germany, though the intention of sommerfeld's visit to india was to strengthen the relations between german and indian sciences. this preference for england over germany had major consequences and a major impact on chandrasekhar's life in the subsequent years. the story has it that while on the boat sailing to england, at the age of 19, chandrasekhar applied sommerfeld's theory of metals to white dwarfs. in doing so he generalized the fermi–dirac statistics to satisfy the demands of special relativity; chandra effectively repeated fowler's work with a generalization to relativistic degeneracy. the basic difference between stoner's limiting-mass expression (which chandrasekhar apparently was not aware of while on the boat) and chandrasekhar's was that chandrasekhar's included a better model for the density distribution in the star. the first result for the limiting mass obtained by chandrasekhar was 0.91 m☉. later, chandrasekhar compared his result with stoner's and concluded that: the agreement between the accurate value based on the theory of the polytropes and the cruder form of the theory is rather surprising. no word as to what may happen to stars more massive than 0.91 m☉ appeared in chandrasekhar's two-page-long note. chandrasekhar's short paper about the limiting mass was published in the american apj, although the most important astrophysical works on the subject of stars were published at that time in the mnras. one can only wonder why chandrasekhar chose this publication for his seminal contribution; presumably he wanted to avoid a certain veto by eddington.

in 1934, chandrasekhar summarized the physical state of the matter in the interior of stars by distinguishing between matter which obeys the ideal equation of state, dense matter which obeys the equation p ∝ ρ^(5/3), and ultra-dense matter, which obeys the equation p ∝ ρ^(4/3). a limiting mass is obtained only for the ultra-dense case. so chandrasekhar classified the stars according to their mass. the very massive stars satisfy eddington's equation, and the matter in them remains in the state of an ideal gas; the matter in these stars depends only marginally on the pep. the smaller masses were divided again into two classes. for stars with m < (1.74/µ²) m☉, the relativistic effects never become dominant and the density never exceeds 6.3 × 10^5 µ^5 (m/m☉)² g/cm³. then came the white dwarfs: for white dwarfs with m < 3.822/µ² m☉, relativistic effects never play a role, while white dwarfs in the mass range 1.743/µ² m☉ to 6.623/µ² m☉ reach a density in which relativistic effects play a dominant role. finally, matter in stars with m > 6.623/µ² m☉ always obeys the ideal gas law. as for their fate, chandrasekhar entered the territory of speculation and conjectured that as the density approaches the critical density the behavior of matter changes in an unknown way.
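before turning to eddington's attack, it is worth stating the result at the heart of the controversy in the compact textbook form (not the notation of the original papers). for a fully relativistic degenerate electron gas the pressure is

\[
P = \frac{\hbar c}{4}\,(3\pi^{2})^{1/3}\, n_{e}^{4/3} = K_{4}\,\rho^{4/3}\,,
\qquad n_{e} = \frac{\rho}{\mu_{e} m_{u}}\,,
\]

i.e. a polytrope of index n = 3, for which the lane–emden mass is independent of the central density:

\[
M = 4\pi\,\xi_{1}^{2}\,\bigl|\theta'(\xi_{1})\bigr| \left(\frac{K_{4}}{\pi G}\right)^{3/2}
\approx \frac{5.83}{\mu_{e}^{2}}\, M_{\odot}
\approx 1.46\, M_{\odot} \quad (\mu_{e} = 2),
\]

with \(\xi_{1}^{2}|\theta'(\xi_{1})| \approx 2.018\) for n = 3, and µ_e the mean molecular weight per electron (written simply µ in the text above). because the central density drops out, this is the unique mass a fully relativistic degenerate configuration can have; more massive stars have no such equilibrium to settle into, which is precisely the point eddington refused to accept.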
until 1935, eddington's attacks on chandrasekhar were made in public and not in published papers. in 1935, eddington published his first straightforward attack on the idea that special relativistic effects are important for the theory of white dwarfs. one may wonder what triggered eddington and why he was so upset, to put it mildly, by chandrasekhar's result. maybe the answer can be found in the introduction to his paper: using the relativistic formula, he (chandrasekhar) finds that a star of large mass will never become degenerate, but will remain practically a perfect gas up to the highest densities contemplated. when its supply of subatomic energy is exhausted, the star must continue radiating energy and therefore contracting – presumably until, at a diameter of a few kilometers, its gravitation becomes strong enough to prevent the escape of radiation. this result seems to me, argued eddington, almost a reductio ad absurdum of the relativistic formula. it must at least rouse suspicion as to the soundness of its foundation.

in other words, eddington did not believe in the physical reality of the schwarzschild solution, just like einstein, who refused to accept it as a physical solution. so, because he did not believe in what we call today black holes, he turned the argument round: if chandrasekhar's theory leads to the formation of black holes, it must be wrong. chandrasekhar definitely knew about eddington's basic reasons for the objection to his results; he refrained from predicting the fate of a massive star in his communication to the ras (february, 1934) and speculated instead that the nature of the interaction between the nuclei changes at high density. in the paper, eddington set out to look for flaws in the derivation of the result p ∝ ρ^(4/3) for relativistic electrons. eddington raised a series of technical questions and one fundamental one. the basic assumption of fowler was that the electrons released from the atoms in the star move freely in the entire volume of the star; this was one of eddington's objections. eddington did not argue with fowler but with chandrasekhar, who brought in special relativity (and got the limiting mass with its implications). in particular, eddington claimed that chandrasekhar combined special relativity with non-relativistic quantum mechanics. the derivation made the (paradoxically correct) assumption that as the density rises the electrons move more like free particles in a spherical box; the nuclei do not affect the motion of the electrons, and consequently the electrons have a very long mean free path. this is exactly what happens in metals. we remark that fowler did discuss this point and came to the conclusion that this assumption, however incredible it sounds, is correct. möller and chandrasekhar responded right away to eddington's published attack. actually, it is no wonder they could respond so quickly, as they acknowledged that they were indebted to sir arthur eddington for allowing them to see a manuscript copy of his paper. as a consequence, the two papers appeared in the same issue of the mnras. a mere one volume later, the mnras carried eddington's reply. again, the arguments were mostly technical, but this time the reply included a statement that: the exclusion principle has been abundantly verified for electrons in the atom. undoubtedly there exists a generalization of it applicable to large assemblies of particles (he meant stars, g.s.) but the generalization cannot be of the form assumed by möller and chandrasekhar, which conflicts with the uncertainty principle. eddington accepted pauli's principle for atoms but rejected the extension to cosmic systems; nobody else doubted the validity of the pauli principle in stars. moreover, this very statement contradicts eddington's statements from 1916 about the validity in stars of the laws of physics discovered on earth. chandrasekhar's final paper on the limiting mass, with the new and rigorous derivation of m_limit for white dwarfs, was published in 1935. first, chandrasekhar removed any references to radiation (symbolically, since introducing radiation was eddington's main achievement). next came the question: what happens to masses above the limiting mass? what chandrasekhar had hesitated to state in the previous paper he dared to write now: configurations of greater mass must be composite (which means milne's models); these composite configurations have a natural limit ... zero radius.
in a footnote, chandrasekhar added that: in the previous paper this tendency of the radius to zero was formally avoided by introducing a state of 'maximum density' for matter, but now we shall not introduce any such states, namely for the reason that it appears from general considerations that when the central density is high enough for marked deviations from the known gas laws to occur, the configurations would then have such small radii that they would cease to have any practical importance in astrophysics. in other words, chandrasekhar did not believe at that time in the reality of what we today call black holes; however, chandrasekhar changed his mind years later. in his concluding remarks, he stated that white dwarfs are the limiting sequence of configurations to which all stars must tend eventually. how more massive stars would behave was not elaborated.

in 1939, chandrasekhar closed his chapter on white dwarfs when he summarized his results in a book. in the chapter on quantum statistics, chandra added a note: eddington claimed in 1935 that the partition function used in this book is incorrect; however, the investigation by möller and me failed to support it. the general theory presented in this chapter is accepted by theoretical physicists. where chandra derives m_limit, there is no further mention of eddington.

after eddington published his attack in 1935, his claims became explicit. in 1936, chandrasekhar recruited rudolf peierls (1907–1995), a leading nuclear physicist, to write a note on the derivation of the equation for a relativistic gas. peierls discussed eddington's contention that the behavior of the gas in the star may depend on the shape of the volume inside which it moves. as a matter of fact, peierls admitted that the issue is so simple that there is no need to elaborate on it; yet some (eddington was not mentioned by name) still argued to the contrary. eddington's last paper on white dwarf matter was published in 1940. the paper contains eddington's contentions, including statements such as: quantum theory, unlike relativity theory, is not primarily a rational theory, and therefore its formulae are generally enunciated without any indication of the conditions in which they are valid. a formula established empirically in certain conditions is extended to conditions in which it has not been verified by a procedure known as "the principle of induction" or, less euphemistically, as "blind extrapolation". such extrapolation, though often leading to progress, is fairly sure to break down sooner or later . . . but the limits of application are not derived along with the formulae in a rational theory. by then, eddington was immersed in his metaphysical theories, and nobody paid attention to his paper.

11. the personal side

so far we have discussed what appeared in the professional literature, but the controversy between the two scientists also had unpleasant personal sides. we see a conflict between two extreme personalities: on the one hand eddington, a dominant figure in astrophysics who had won every possible medal and prize, and on the other hand a young, unknown scientist who had recently completed his phd thesis. it eventually turned out that the controversy propelled chandra to a position of scientific eminence. despite his eminence, eddington was easily accessible in cambridge, and chandrasekhar had many scientific conversations with him. but private friendship and public relations are quite different matters.
when chandrasekhar went early in 1935 to a meeting of the royal society to report his results, he noticed to his surprise that eddington was listed to talk after him. and indeed, after chandrasekhar finished his talk, eddington took the podium and tried to prove that there is no such thing as 'relativistic degeneracy'. eddington in effect ambushed chandra, as he had given him no warning that he was going to attack and humiliate him in public. moreover, to argue against someone's scientific result is one thing, but to joke at the expense of a rival is another, and eddington joked about chandra's colossal error. a similar scene happened later that year during the iau meeting in paris. it was clear that eddington had publicly vanquished chandrasekhar, to the point that he could not get any position in europe. the community believed that eddington was right, namely that nature could not behave the way chandrasekhar predicted; eddington argued that it was heresy. there are claims that henry norris russell, who was the chairman of the session, told chandra in private that he did not believe eddington, but he did not let chandra respond to eddington's assault. eddington had a very special language and reasoning in his scientific papers and managed to fight with many colleagues during his scientific career; his sarcasm only added fuel to the fire.

chandrasekhar had excellent relations with milne, appreciated his core-envelope models and discussed his first results in terms of milne's model, to which eddington objected bitterly. eddington detested milne, disapproved of his stellar model, and admitted that: i have not read professor milne's paper, but i hardly think it is necessary, for it would be absurd for me to pretend that professor milne has the remotest chance of being right. in 1931, chandrasekhar extended his research in two directions: in a paper communicated by milne, he expanded milne's theory of collapsed objects (a collapsed core surrounded by a stellar envelope, i.e. a non-homogeneous star) and attempted to explain the structure of white dwarfs. at the end of this paper, chandrasekhar gave a table in which he distinguished between the fate of low-mass stars and high-mass stars; this is one of the first times that the fate of a star was considered as a function of its mass. in parallel, he worked on his theory of bare white dwarfs. it so happened that the paper on milne's composite models came out just before chandrasekhar submitted his paper about the critical mass of white dwarfs. needless to say, these papers did not please eddington.

chandra had excellent relations with dirac, who advised him to go to copenhagen, where there were many good physicists, and indeed chandra did go to copenhagen. he summarized his grievances in a letter to leon rosenfeld, asking for the verdict of bohr and his gang. rosenfeld's response was disappointing: i may say that your letter was some surprise for me: for nobody had ever dreamt of questioning the equations, and eddington's remark as reported in your letter is utterly obscure. so i think you had better cheer up and not let you scare so much by high priests: for i suppose you know enough marxist history to be aware of the fundamental identity of high priests and mountebanks . . . so, if "eddington's principle" had any sense at all, it would be different from pauli's. could you perhaps induce eddington to state his views in terms intelligible to humble mortals?
what are the mysterious reasons of relativistic invariance which compel him to formulate a natural law in what seems to ordinary human beings a non-relativistic manner. that would be curious to know. amazingly, a respected list of physicists knew that eddington was wrong, but chose to stay away from the controversy.

a point of concern: stoner's work is essentially identical to that of chandra, yet eddington chose to direct his ferocious attack at the young astrophysicist and did not mention the already mature stoner at all. landau derived chandra's result independently (though for neutrons), and eddington did not attack him either. the moral: eminent scientists are not immune to colossal mistakes and personal biases. chandrasekhar was awarded the nobel prize in 1983; by then, stoner had met his creator. it is a pity that there was no prize for stoner.

references

[1] for detailed references see: giora shaviv, the life of stars, springer & magnes pub., 2009

acta polytechnica vol. 42 no. 1/2002

the risks of investments in transport infrastructure projects

o. pokorná, d. mocková

investment decisions should not be taken without an in-depth analysis of the risks. this is an important stage in project preparation and should be performed simultaneously with the planning of the financial operations. infrastructure development requires that project risks and responsibilities be assigned to the public or private entity that is best able to manage them. the risks and their financial impacts are usually not quantified equally by all parties. each party views the given risks according to the guarantees provided. these guarantees are related to the form of participation in the project.

keywords: project financing, identification of risks, construction phase risks, start-up and operating phase risks, risk analysis, investment decisions, financial evaluation of the risks.

1 introduction

risk allocation is a complex and difficult process, and for all practical purposes it is a negotiated process. the study of risks should consist of two stages:
• identification of the sources of risks.
• detection of the financial consequences of these risks for the stakeholders and investors.

2 identification of risks

a variety of risks can occur during the construction phase and the operation phase of a project. the risks that can have a direct impact on the profitability and credibility of the project should be identified. identification of risks and risk management is a crucial part of project financing. different risks occur in different projects; therefore the risks must be identified and allocated among the stakeholders of the project. the golden rule is that the actor who is best able to control, influence and manage a risk should bear that risk. this is often not the case in reality. risk allocation is a comprehensive and complicated process. the problem can be illustrated by legal risks, which are often borne by the private sector, although only the public sector controls them. we can also mention inflation and interest rates, which the national bank oversees; the risks of changes in them are often borne by the creditors, investors and shareholders. there are many other risks that are not borne by the subjects who are in a position to manage them appropriately. the extent of an individual risk can change over time. a feature of successful projects is that the risks are widely shared by the public and private sectors. generally it can be said that the private sector is better able to manage commercial risks and the responsibility associated with construction, operation and financing. on the other hand, in the field of transport the public sector must be involved in many issues, like rights-of-way, political risks and sometimes also traffic and revenue risks. in the context of project financing, a private company should have access to adequate resources and experience to carry out the construction effectively, with future backflow from the collection of tolls.
it may then be possible for the company to bear the construction risk and some part of the traffic risk, while the public entity would bear responsibility for guarantees and subsidies in the case of insufficient traffic intensity when operation begins. the main risks facing infrastructure projects are pre-construction activity, construction, traffic and revenue, currency, force majeure, tort liability, political risk and financial risk. these risks must be addressed in a satisfactory manner before debt and equity investors will commit to project funding. the standard risks identified in contracts are: pre-construction, construction, traffic and revenue, financial, regulatory and political. in addition, force majeure and legal liability are commonly addressed in contracts, since they have proven to be serious sources of cost overruns in the sector.

2.1 construction phase risks

during this phase, the major risks are delays in completion and in the commencement of project cash flows; cost overruns, with an increase in the capital needed to complete construction; and insolvency or lack of experience of contractors or key suppliers. construction costs may exceed estimates for many reasons, including inaccurate engineering and design, escalation in material and labour costs, and delays in project start-up. cost overruns are typically handled through a fixed-price and fixed-term contract, with incentives for completion and for meeting pre-specified investment goals. other alternatives include provision for additional equity injections by the sponsor or standby agreements for additional debt financing. it is always sensible for developers to establish an escrow or contingency fund to cover such overruns. delays in project completion can result in an increase in total costs through higher capitalized interest charges; they may also affect the scheduled flow of project revenues necessary for debt servicing and for operating and maintenance expenses.

availability of materials and equipment

in many developing countries, the risk of unavailable equipment or materials for construction or operation must be considered. transit bottlenecks, tariffs, foreign currency fluctuations and other factors can cause a significant increase in costs.

contractor capability

the main contractors and key subcontractors should have the experience, reputation, and financial, technical and human resources to be capable of completing the project in a timely fashion and on budget.
this risk is best addressed through tough pre-qualification of bidders, through certification and monitoring, and by ongoing financial supervision of the contracting companies, to make sure that poor results from other projects or weak balance sheets do not spill over into the specific project in question.

environmental and land risks

transport projects can have a substantial environmental impact. such projects frequently attract strong opposition from community and environmental groups over issues of pollution, congestion, neglect of public transport and visual impact. similarly, land acquisition can be a protracted process with the potential for extensive legal delays. in general, the public sector often takes on the responsibility for most of these risks, since it is often easier for the public sector to take responsibility for acquiring rights-of-way, to pay for them and to contribute this asset to the project. project sponsors often try to ensure that the government bears the risk of providing all necessary land within a given time frame or is liable for damages. furthermore, the cost of land acquisition can become a major factor where land values rise rapidly or are subject to speculative activity over which the project developer has no control; in these cases, agreement on some form of cost ceiling may be necessary in the concession contract. generally, the host government should ensure that required licenses and permits are obtainable without unreasonable delay or expense.

2.2 start-up and operating phase risks

the major risks for transport projects in these stages relate to traffic/revenue risk; regulatory and legal changes; interest rate and foreign exchange risks; force majeure risk; and political risk.

technology risks

project finance participants cannot ignore new technologies, since these can either significantly improve the profitability of a project or adversely affect any project that uses obsolete technology. for example, the use of automatic toll collection technology reduces collection costs and incentives for graft. another example is technological improvement in customs processing, so that border crossings on major arterial toll roads can be traversed more quickly, saving time for users and making the road more valuable.

traffic and revenue risks

demand risk is a major issue in virtually all projects. even where there is a reasonable level of confidence in forecasts, demand can be dramatically affected by competition from other modes or facilities, changing patterns of use, and macroeconomic conditions. these issues, over which the project sponsor often has little or no control, are very difficult to predict and represent a major risk to financing. in particular, forecasting during the early years can be quite subjective. to the extent that these risks are driven by economic conditions, there is a potential role for the government to play in risk sharing, either through traffic or revenue guarantees or through other forms of support. since infrastructure construction often brings major structural changes for the region (the channel tunnel, high-speed railways), the prediction of consumer behavior is not well captured by aggregate demand forecasting models. the issue is more complicated in the case of cross-border flows, where the border effect can appear. another typical feature is over-optimistic forecasting, produced either to convince a potential partner of the value of the project or, alternatively, to win the deal at any price and renegotiate it afterwards.
toll roads provide an illustration. traffic volumes are very sensitive to income and economic growth, and failure to recognize this may be one of the main reasons why so many toll road projects have failed.

financial risks: interest rates
financial risk is the risk that project cash flows may be insufficient to cover debt servicing and then to pay an adequate return on sponsor equity. financing constraints, especially the lack of long-term debt capital, are a significant hindrance to toll road development. only a few projects are able to generate returns on investment sufficient to attract private capital. this suggests that only a limited number of projects will be executed without massive state support. since infrastructure involves long-term investments with high start-up costs, countries with local capital markets capable of providing long-term capital have an advantage. of particular importance is the availability of mature domestic finance. in many countries, infrastructure projects have been unable to obtain finance for more than 5 to 6 years, bringing further risks of renegotiation and refinancing. such projects are not viable without government guarantees. in theory, financial risk is best borne by the private sector. however, in transport projects this risk is likely to be shared by the public sector, either in the form of debt or revenue guarantees, or by participation of the state or of an international financial institution. this can also take the form of a direct subsidy, grant or financial contribution, which will serve to improve the rate of return for the private sector.

currency risk
currency risk relates to the impact of the local currency exchange rate on the value of the investment. moreover, there is the question of the convertibility of the local currency. this is a major risk in countries where, for example, the tolls are collected in a weak local currency.

force majeure risk
neither the public sector nor the private sector can influence or control some risks, such as earthquakes and floods, which impair the ability of the project to earn revenues. for the private sector there is some insurance available, but the public sector generally has to bear this risk and redesign the project as need be. the rule is that the remedies in these cases should be an essential part of the contract.

regulatory and legal risks
the issues of regulation and subordination to the regulatory authority should be dealt with in the contract. this should apply to the extent of the authority's power and its responsibility concerning the fees charged, public commitments and an environment of equal competition. this matter can affect the value of the business, which can be sensitive to the revenue earned. therefore clear rules should be set, and the way in which the regulatory authority exercises them should be verifiably independent. but even if regulatory rules are clear enough, they are only as effective as the regulators can be. the best-designed regulatory environment is useless if the regulator is not independent or fair. the pressures on regulators can be a major source of concern. project finance structures typically cover periods of ten years or more. the relevant legal and regulatory environment is likely to change substantially over that period.
the rules dealing with the financial consequences of these changes between government, users and operators are critical, but are often ignored. the rules must cover the possibility of adjusting the contract terms during the project-financing period.

political risks
political risks concern government activities that could affect the ability to generate revenues. these could include termination of concessions, and additional taxation and regulation, which impair the value of the project for the investors, or even the nationalization of private property. ideally, the government guarantees concerning these risks, and the method of compensation for lost profit, should form part of the contract. in some cases insurance provided by an international financial institution should be a requisite.

3 risk analysis
risk analysis involves quantifying the risk of each variant examined, followed by the proposal and design of corrective measures that will strengthen the probability of the project's success. risk analysis can be divided into the following phases:
• identification of the factors that influence the decision criteria
• determination of the functional dependence of the decision criterion on the factors
• determination of the probability density function of the risk factors
• construction of the probability density function of the decision criteria considered
• evaluation of the risks of the project.
in order to perform risk analysis we have to know the decision criteria. if there is more than a single decision criterion, the analysis solves the comprehensive problem of how to express the total risk taking all evaluation criteria into consideration. the most common investment decision is risk analysis considering only one decision criterion: the economic evaluation of the efficiency of the investment.

financial evaluation of risks
the risks and their financial impacts are usually not quantified equally by all parties. each party views the given risks according to the guarantees provided. these guarantees are related to the form of participation in the project. mostly this concerns the basic capital provided by the stakeholders, debts guaranteed by the shareholders, non-guaranteed debts (the risk is borne by a financial institution), and the resources provided or guaranteed by the public sector. stakeholders and banks mostly cover financial risks. the recovery priority is always debt servicing, after which comes satisfying the shareholders. the greater the risk they undertake, the higher the return on capital they naturally expect. evaluation of the risks is performed by independent bodies and experts, who assess all aspects of the project (technical, legal, commercial and fiscal). sensitivity analyses are performed for these purposes. these attempt to measure the impact on the anticipated profitability of the project of changes in different parameters: exchange rate deviations, interest rate changes, inflation, construction delays and underestimated traffic intensities. the results of the sensitivity analysis form the basis for the variants, which lead to the risk evaluation. through the use of these probabilities, future cash flows are thus constructed (a minimal numerical sketch of this procedure is given after the conclusion below).

4 conclusion
the main risks facing infrastructure projects include pre-construction, construction, traffic and revenue, currency, force majeure, tort liability, political, and financial factors. these risks must all be addressed in a manner satisfactory to debt and equity investors before they will commit to project funding.
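the risk-analysis phases and the cash-flow construction described above can be illustrated with a minimal monte carlo sketch in python. it is not from the paper: the decision criterion (net present value), the two risk factors (construction cost overrun, realized traffic), their distributions and all numbers are illustrative assumptions.

```python
import numpy as np

# illustrative toll-road project: all figures are assumptions, not the paper's data
rng = np.random.default_rng(seed=1)
n_trials = 10_000
base_capex = 500.0        # initial investment, millions
base_revenue = 60.0       # yearly toll revenue at forecast traffic, millions
opex = 15.0               # yearly operating cost, millions
years, discount_rate = 20, 0.08

# phases 1-3: risk factors that influence the decision criterion,
# with assumed probability density functions
cost_overrun = rng.lognormal(mean=0.0, sigma=0.15, size=n_trials)    # capex multiplier
traffic = np.clip(rng.normal(0.85, 0.20, size=n_trials), 0.0, None)  # realized/forecast

# phase 4: construct the distribution of the decision criterion (npv)
# from the sampled future cash flows
annuity = ((1.0 + discount_rate) ** -np.arange(1, years + 1)).sum()
npv = -base_capex * cost_overrun + (base_revenue * traffic - opex) * annuity

# phase 5: evaluate the risk of the project
print(f"mean npv: {npv.mean():.1f} m, 5th percentile: {np.percentile(npv, 5):.1f} m")
print(f"probability of npv < 0: {np.mean(npv < 0):.1%}")
```

a sensitivity analysis in the sense used above would rerun the same construction with one parameter shifted at a time (e.g. the discount rate or the traffic mean) and compare the resulting npv distributions.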
project finance transactions are typically governed by a nexus of long-term formal contracts, written between the project promoter, the host country government, creditors, input suppliers, contractors, operators, and service providers. three classes of contracts are important: concession agreements that stipulate a property rights transfer from the government to the project company, performance contracts between the project company and contractors and operators, and loan contracts between creditors and the project company. such contracts are designed to share risk and to protect the contracting parties against opportunistic "hold-up" behaviour by others. in practice, they address two important characteristics of infrastructure investments: a high degree of asset specificity, and large project-specific risks that cannot be diversified in financial markets. in such "relationship-specific" investments, investors are hesitant to make investments without adequate contractual protection. once the investment is sunk, the incentive system and the bargaining power of the contracting parties change vis-à-vis each other. anticipating such an outcome, project promoters often insist on governments providing various kinds of guarantees to cover a range of risks.

ing. denisa mocková, phone: +420 2 2435 9160, fax: +420 2 2491 9017, e-mail: mockova@fd.cvut.cz
ing. olga pokorná, phone: +420 2 514 310 59, e-mail: pokorna@mdcr.cz
czech technical university in prague, faculty of transportation sciences, horská 3, 128 03 praha 2, czech republic

1 introduction
in all areas of particulate technology where powders are handled, structures in contact with the powder will exhibit wear. in some applications this wear may be so severe as to limit the life of a component or plant, while in others it may be negligible. all particles will cause some wear, but in general the harder they are, and the more angular in shape, the more severe the wear will be [1]. studies of mixing processes including the suspension of solids in liquids have shown that the rate at which the impeller wears is a strong function of the impeller blade inclination angle (pitch angle) as well as of the impeller tip speed [2]. the impact rate of the solid particles affecting the blade of pitched blade impellers was studied experimentally with the aim of explaining the significant dependence of the impact rate on impeller tip speed and impeller diameter [3, 4]. for a pitched blade impeller pumping downwards, the impacts mostly occur in the outer and upper corners of the blades.
this fact corresponds to the results of experimental studies [5, 6] investigating the erosive wear of pitched blade impellers in a highly erosive suspension, where the leading edge of the impeller blade is worn away, together with a reduction of the blade surface. as a consequence, e.g., the impeller pumping capacity decreases. this study deals with an attempt to express mathematically the shape of the worn blade of a pitched blade impeller in two sorts of industrial suspension, i.e., in a suspension of flue gas desulphurization product (gypsum) and a suspension of silicious sand, and to interpret the influence of the impeller erosion wear on the surface or on the thickness of the worn blade for the various pitch angles during the erosion process.

2 experimental
experiments were carried out on pilot plant mixing equipment (see fig. 1), with water as the working liquid and two types of solid phase: 1. silicious sand (volumetric concentration cv = 10 % vol., mean particle diameter d = 0.4 mm), 2. waste gypsum (caso4 · 2h2o) from a thermal power station (cv = 18.3 % vol., d = 0.1 mm) after a desulphurization process. pitched blade impellers with four adjustable inclined plane blades (see fig. 2), made of rolled brass or construction steel, were used for the erosion tests. as the independent variable we took the pitch angle α of the impeller with inclined plane blades, and in all cases the frequency of the impeller revolution, n, was kept constant under conditions of complete homogeneity of the suspension. for the experiments we used a cylindrical metal baffled vessel made of stainless steel (see fig. 1).

erosion wear of axial flow impellers in a solid-liquid suspension
i. fořt, j. medek, f. ambros
a study was made of the erosion wear of the blades of pitched blade impellers in a suspension of waste gypsum from a thermal power station (vol. concentration cv = 18.3 %, particle mean diameter d = 0.1 mm, degree of hardness "2.5") and silicious sand (cv = 10 %, d = 0.4 mm, degree of hardness "7.5") in water under a turbulent flow regime of the agitated charge, when complete homogeneity of the suspension was achieved. experiments were carried out on pilot plant mixing equipment made of stainless steel (diameter of cylindrical vessel t = 390 mm, diameter of impeller d = 100 mm, impeller off-bottom clearance h = 100 mm) equipped with four wall radial baffles (width b = 39 mm) and an impeller with four inclined plane blades (pitch angle α = 20°, 30°, 45°, relative blade width w/d = 0.2) made either of rolled brass (brinell hardness 40–50 hb) or of structural steel (brinell hardness 100–120 hb), always pumping the liquid downwards towards the flat vessel bottom. two erosion process mechanisms appear, depending on the hardness of the solid particles in the suspension: while the particles of gypsum (lower hardness) generate a uniform sheet erosion over the whole surface of the impeller blade, the particles of silicious sand (higher hardness) generate wear of the leading edge of the impeller blades, together with a reduction of the surface of the worn blade. the hardness of the impeller blade also affects the rate of the erosion process: the higher the hardness of the impeller blade, the lower the wear rate of the blade. this study consists of a description of the kinetics of the erosion process of both mechanisms in dependence on the pitch angle of the tested impellers.
while the wear of the leading edge of the blade exhibits a monotonic dependence on the pitch angle, the sheet erosion exhibits its maximum rate within the interval of the pitch angles tested, α ∈ ⟨20°; 45°⟩. however, generally the pitch angle α = 45° seems to be the most convenient angle of blade inclination when both investigated mechanisms of the blade erosion process are considered at their minimum rate.
keywords: pitched blade impeller, erosion wear, solid-liquid suspension.

fig. 1: geometry of pilot plant mixing equipment (h = t = 290 mm, h = d = 100 mm, four baffles, b/t = 0.1)

during the experiments with silicious sand the shape of the blade profile was determined visually from magnified copies of the worn impeller blades on photosensitive paper (magnification ratio: approx. 2:1). the average course of the blade profile for the given length of the erosion process was determined as an average curve from four individual worn impeller blades. the surface of the worn blades was determined from their magnified copies using a planimeter. during the experiments with waste gypsum the local thickness of the blade s_{i,j} (j = 1, 2, 3, 4) along the blade length (see fig. 3) was determined by means of a contact micrometer with an accuracy of ±0.01 mm. the erosion process when a suspension of waste gypsum was used exhibits no change in the blade profile, e.g., at its leading edge; only its thickness changes.

3 results and discussion
two series of experiments were evaluated: studies of the erosion wear of the impeller blades under the influence of particles of silicious sand (higher hardness) and under the influence of waste gypsum (lower hardness).

axial impeller in a suspension of silicious sand
during the erosion process of the pitched blade impeller caused by solid particles of silicious sand, its metal plane blade changes shape mainly around its leading edge [6]. the radial profile of the worn plane blade can be considered in dimensionless exponential form (see fig. 4):

h(x) = c [1 − exp(−kx)],  (1)

where the dimensionless axial (vertical) coordinate along the width of the blade is

h = z/w,  (2)

and the dimensionless longitudinal (radial) coordinate along the radius of the blade is

x = 2x/d.  (3)

the values of the parameters of eq. (1), the wear rate constant k and the geometric parameter of the worn blade c, were calculated by the least squares method from the experimentally found profile of the worn blade; each curve was calculated from 14–16 points (x, h) with a regression coefficient better than r = 0.970. figs. 5 and 6 show the dependence (all the dependences mentioned here were calculated from experimental data by the least squares method) of the wear rate constant k on the duration of the erosion process t for the given impeller pitch angle α, both for rolled brass and for construction steel. the independence of the wear rate constant of the duration of the erosion process proves the reliability of the chosen analytical function [1] describing the radial profile of the worn blade. nevertheless, parameter k depends significantly on the pitch angle: the lower the pitch angle, the higher the rate of the erosion process of the pitched blade impeller. this result is in excellent accordance with the fact [1] that metals tend to suffer the most severe erosion wear at impact angles of 20° to 30° (measured from the plane of the blade surface), i.e., the angles at which the particles strike the blade surface.
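the least-squares determination of the parameters c and k of eq. (1) described above is straightforward to reproduce; the sketch below fits synthetic profile points standing in for the 14–16 digitized points per blade (the sample values are invented, not the paper's data):

```python
import numpy as np
from scipy.optimize import curve_fit

# worn-blade profile of eq. (1): h(x) = c * (1 - exp(-k * x)),
# with x = 2x/d and h = z/w the dimensionless coordinates
def profile(x, c, k):
    return c * (1.0 - np.exp(-k * x))

# synthetic stand-in for the digitized points (x, h) of one worn blade
x = np.linspace(0.05, 1.0, 15)
rng = np.random.default_rng(0)
h_meas = profile(x, c=0.8, k=6.1) + rng.normal(scale=0.02, size=x.size)

# least-squares estimates of the geometric parameter c and the wear
# rate constant k, plus the regression coefficient quoted in the text
(c_fit, k_fit), _ = curve_fit(profile, x, h_meas, p0=(1.0, 5.0))
r = np.corrcoef(h_meas, profile(x, c_fit, k_fit))[0, 1]
print(f"c = {c_fit:.3f}, k = {k_fit:.3f}, r = {r:.3f}")
```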
figs. 7 and 8 show the dependence of the worn blade geometric parameter c on the quantity t. for both tested metals this parameter is linearly dependent on the duration of the process. this course confirms fairly well the strong dependence of the studied erosion wear on its duration. moreover, this is in accordance with the experimentally determined distribution of the rate of particle impacts along the blade of the pitched blade impeller, indicated on an artificially soft surface of the impeller blades [4].

fig. 2: design of a pitched blade impeller with four inclined plane adjustable blades (d = 100 mm, d0 = 25 mm, w = 20 mm, s = 1.2 ± 0.04 mm for the experiments with silicious sand, s = 1.0 ± 0.04 mm for the experiments with waste gypsum, α = 15°, 20°, 30°, 45°, n1 = 479 rpm, n2 = 640 rpm)
fig. 3: distribution of measurement points for determining the local blade thickness along the blade length
fig. 4: radial profile of the leading edge of the worn blade of a pitched blade impeller

figures 9 and 10 illustrate the dependences of the experimental parameters of function (1) on the impeller pitch angle. the linear forms for the two tested metals,

k = −8.79 sin α cos α + 10.48, rolled brass, α ∈ ⟨15°, 45°⟩,  (4a)
k = −4.73 sin α cos α + 6.31, construction steel, α ∈ ⟨20°, 45°⟩,  (4b)

characterize the relation between the erosion wear of the impeller blade and the corresponding (axial) component of the mean velocity in the discharge stream leaving a pitched blade impeller [5], expressed by means of the product sin α cos α. at the same time, our results confirm the general view of the relation between the hardness of the worn metal and its wear rate: the higher the hardness of the worn metal, the lower its wear rate. the linear forms

c/t = −2.14 · 10⁻⁴ α + 1.21 · 10⁻², rolled brass, α ∈ ⟨20°, 45°⟩,  (5a)
c/t = −0.41 · 10⁻⁴ α + 0.27 · 10⁻², construction steel, α ∈ ⟨20°, 45°⟩,  (5b)

(with α in degrees and t in hours) characterize the fact that the geometric parameter of the profile of the worn blade c, depending on the duration of the erosion period, corresponds to the development of the exponential profile of the worn blade with increasing time t. similarly as for parameter k, the geometric parameter of the profile of the worn blade exhibits lower values for construction steel (brinell hardness 100–120 hb) than for rolled brass (brinell hardness 40–50 hb). figs. 11 and 12 illustrate the dependence of the surface of the worn impeller blade aworn (related to the unworn blade surface a0) on the duration of the erosion process, both for rolled brass and for construction steel impeller blades, as well as a comparison with predicted values [6]. the rate of the blade surface reduction can be expressed in the linear form of the found dependence, i.e., for rolled brass:

aworn/a0 = 1 − 0.0014 t, α = 20°,  (6a)
aworn/a0 = 1 − 0.0008 t, α = 45°,  (6b)

and for construction steel:

aworn/a0 = 1 − 0.0004 t, α = 20°,  (7a)
aworn/a0 = 1 − 0.0003 t, α = 35°.  (7b)

fig. 5: dependence of the wear rate constant on the duration of the erosion process, impeller blades made of rolled brass (fitted values k = 7.612 for α = 20°, k = 6.77 for α = 30°, k = 6.075 for α = 45°)
fig. 6: dependence of the wear rate constant on the duration of the erosion process, impeller blades made of construction steel (fitted values k = 4.78 for α = 20°, k = 4.30 for α = 35°)
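taken together, correlations (4)–(7) allow a quick numerical estimate of the blade wear; the helper below simply transcribes the coefficients above (the angle-specific surface slopes of eqs. (6)–(7) are kept as a lookup, since the paper gives them only for the listed pitch angles):

```python
import math

# coefficients transcribed from eqs. (4)-(5); alpha in degrees, t in hours
K_COEF = {"rolled brass": (-8.79, 10.48), "construction steel": (-4.73, 6.31)}
C_COEF = {"rolled brass": (-2.14e-4, 1.21e-2), "construction steel": (-0.41e-4, 0.27e-2)}
# slopes of a_worn/a0 vs t from eqs. (6)-(7), per (metal, pitch angle)
SURFACE_SLOPE = {
    ("rolled brass", 20): 0.0014, ("rolled brass", 45): 0.0008,
    ("construction steel", 20): 0.0004, ("construction steel", 35): 0.0003,
}

def wear_estimate(metal, alpha_deg, t_hours):
    a = math.radians(alpha_deg)
    mk, bk = K_COEF[metal]
    mc, bc = C_COEF[metal]
    k = mk * math.sin(a) * math.cos(a) + bk        # eq. (4)
    c = (mc * alpha_deg + bc) * t_hours            # eq. (5): c grows linearly with t
    slope = SURFACE_SLOPE.get((metal, alpha_deg))
    surface = None if slope is None else 1.0 - slope * t_hours  # eqs. (6)-(7)
    return k, c, surface

k, c, surf = wear_estimate("rolled brass", 45, 200)
print(f"k = {k:.2f}, c = {c:.2f}, a_worn/a0 = {surf:.2f}")  # ~6.09, ~0.49, 0.84
```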
all the above mentioned relations express the combined influence of the duration of the erosion process and of the hardness of the impeller blade on the rate of its surface reduction, i.e., the positive effect of both an increase of the impeller pitch angle and an increase of the metal hardness on the rate of the blade erosion wear. as follows from fig. 11, the proposed model of the erosion wear of pitched blade impellers (see eq. 1) enables us to make a sufficiently accurate theoretical calculation of the rate of the blade surface reduction [6].

fig. 7: dependence of the worn blade geometric parameter on the duration of the erosion process, impeller blades made of rolled brass (fitted slopes c = 0.0079 t for α = 20°, c = 0.0053 t for α = 30°, c = 0.0025 t for α = 45°)
fig. 8: dependence of the worn blade geometric parameter on the duration of the erosion process, impeller blades made of construction steel (fitted slopes c = 0.0018 t for α = 20°, c = 0.0011 t for α = 35°)
fig. 9: dependence of the wear rate constant on the impeller pitch angle
fig. 10: dependence of the worn blade geometric parameter on the impeller pitch angle
fig. 11: dependence of the surface of the worn blade on the duration of the erosion process, impeller blades made of rolled brass (theoretically predicted vs. experimental values for α = 20° and 45°)
fig. 12: dependence of the surface of the worn blade on the duration of the erosion process, impeller blades made of construction steel (α = 20° and 35°)

axial impeller in a suspension of waste gypsum
during the erosion process when a suspension of waste gypsum was used, no change of the blade profile was observed; only its thickness changed. then, for a given duration of the erosion period t, the average thickness of the blade was calculated (see fig. 3) as

s̄_i = (1/4) Σ_{j=1..4} s_{i,j},  (8)

as well as its standard deviation

σ_{s̄,i} = [ (1/3) Σ_{j=1..4} (s_{i,j} − s̄_i)² ]^{1/2},  (9)

and, finally, the relative change of the average blade thickness at the given moment of the erosion process was determined as

Δs̄_i / s̄_0 = (s̄_0 − s̄_i) / s̄_0,  (10)

where s̄_0 is the initial average thickness of the unworn blade. table 1 and fig. 13 illustrate the dependence of the quantity Δs̄_i/s̄_0 on the duration of the erosion process. the values of the quantity σ_{s̄,i} confirm that a uniform sheet erosion of the impeller blade can be considered for both tested metals when waste gypsum from the thermal power station is used.

table 1: dependence of blade thickness on the duration of the erosion process
i. impeller made of rolled brass
α = 20°: t [h]: 0, 93, 141, 183, 241; s̄_i [mm]: 1.020, 0.978, 0.940, 0.920, 0.890; σ [mm]: 0, 0.02062, 0.015, 0.02062, 0.03162; Δs̄_i/s̄_0 · 100 [%]: 0, 4.2, 7.8, 9.8, 12.5
α = 30°: t [h]: 0, 91, 118, 256, 329, 444, 521; s̄_i [mm]: 1.040, 0.870, 0.850, 0.780, 0.740, 0.670, 0.640; σ [mm]: 0, 0.01414, 0.01732, 0.03109, 0.0216, 0.02828, 0.01915; Δs̄_i/s̄_0 · 100 [%]: 0, 16.3, 18.2, 25.4, 28.8, 35.6, 37.0
α = 45°: t [h]: 0, 66, 111, 155, 254; s̄_i [mm]: 1.030, 0.960, 0.920, 0.900, 0.860; σ [mm]: 0, 0.0238, 0.0245, 0.0294, 0.02986; Δs̄_i/s̄_0 · 100 [%]: 0, 6.8, 10.7, 12.6, 16.5
ii. impeller made of construction steel
α = 45°: t [h]: 0, 50, 93, 163, 235; s̄_i [mm]: 1.0400, 1.0275, 1.0275, 1.0250, 1.0250; σ [mm]: 0, 0.00957, 0.00957, 0.00577, 0.00577; Δs̄_i/s̄_0 · 100 [%]: 0, 1.2, 1.2, 1.44, 1.44

fig. 13: dependence of relative blade thickness on the duration of the erosion process, impeller blades made of rolled brass (α = 20°, 30°, 45°)
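equations (8)–(10) are plain sample statistics over the four blades; the sketch below recomputes eq. (10) directly from the tabulated average thicknesses of the rolled-brass, α = 20° row of table 1:

```python
# rolled brass, alpha = 20 deg, from table 1
t_hours = [0, 93, 141, 183, 241]
s_avg_mm = [1.020, 0.978, 0.940, 0.920, 0.890]   # average thickness s_i, eq. (8)

s0 = s_avg_mm[0]                                  # unworn average thickness
rel_change = [(s0 - s) / s0 for s in s_avg_mm]    # eq. (10)

for t, dc in zip(t_hours, rel_change):
    print(f"t = {t:3d} h  ->  relative thickness change = {100 * dc:4.1f} %")
# prints 0.0, 4.1, 7.8, 9.8, 12.7 %; table 1 lists 0, 4.2, 7.8, 9.8, 12.5 %,
# the small differences presumably coming from rounding of the tabulated averages
```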
its degree of hardness exhibits quite a different effect on the metal surface than the previously used silicious sand, whose degree of hardness is approx. three times greater. it can be proposed that the different mean particle diameters of the two powders under study need not affect the erosion mechanism [1], i.e., the differences are caused only by their different hardnesses. the relative change of the average blade thickness Δs̄_i/s̄_0 depends both on the type of metal and on the blade pitch angle α. in accordance with the above mentioned results, when silicious sand was used the blade made of construction steel exhibits a lower erosion rate than the blade made of rolled brass. the dependence of the quantity Δs̄_i/s̄_0 on the duration of the erosion process can be expressed in the linear forms

1 − Δs̄_i/s̄_0 = 1 − 0.0005 t, α = 20°,  (11a)
1 − Δs̄_i/s̄_0 = 1 − 0.0008 t, α = 30°,  (11b)
1 − Δs̄_i/s̄_0 = 1 − 0.0007 t, α = 45°,  (11c)

with t in hours. the shape of the relation Δs̄_i/s̄_0 = f(t) does not exhibit a monotonic dependence on the impeller pitch angle, as was the case for the harder powder used previously, and we can consider the maximum rate of the erosion process to occur at the impeller blade pitch angle α = 30°. it follows from the above results that more tests with various powders of different particle hardnesses should be made, in order to gain better and fuller evidence of the influence of solid particles on the rate of erosion of pitched blade impellers.

4 conclusions
two mechanisms of the erosion process of pitched blade impellers in a solid-liquid suspension were found in this study: a) particles of waste gypsum (lower hardness) generate a uniform sheet erosion over the whole surface of the impeller blade, b) particles of silicious sand (higher hardness) generate predominantly erosion wear of the leading edges of the impeller blade. the higher the hardness of the impeller blade material, the lower the wear rate of the blade. while the wear rate of the blade leading edge exhibits a monotonic dependence on the impeller pitch angle, the sheet erosion exhibits its maximum rate within the interval of the pitch angles tested, α ∈ ⟨20°, 45°⟩. the impeller pitch angle α = 45° seems to be the most convenient angle of plane blade inclination, at which the two investigated mechanisms of blade erosion exhibit their minimum rate. all the results mentioned are valid for a turbulent flow regime of the agitated charge.

5 list of symbols
a  dimensionless surface of impeller blade (a = a/(w d/2))
a  surface of impeller blade, m²
b  baffle width, m
c  geometric parameter of the profile of a worn blade
cv  volumetric concentration, m³/m³
d  impeller diameter, m
d0  hub diameter, m
d  particle diameter, m
h  dimensionless axial coordinate of the profile of the worn blade
hl  height of liquid from the bottom of the vessel, m
h  off-bottom impeller clearance, m
k  wear rate constant
n  impeller frequency of revolution, s⁻¹
r  regression coefficient
s  thickness of impeller blade, m
t  time, h
w  width of impeller blade, m
x  longitudinal coordinate of impeller blade, m
z  axial (vertical) coordinate along the width of the blade, m
α  pitch angle of blade, °
σ  standard deviation
subscripts and superscripts: i, j summation indices; 0 initial value; worn  worn; overbar  average value
this research has been supported by the research project of the ministry of education of the czech republic j04/98: 212200008.

references
[1] hutchings, i. m.: wear by particulates. chem. eng. sci., 1987, 42, no. 4, pp. 869–878
[2] kipke, k.: erosiver verschleiss von rührorganen. chem.-ing.-tech., 1980, 52, pp. 658–659
[3] he, y., takahashi, k., nomura, t.: particle impeller impact for a six bladed 45° pitched blade impeller in an agitated vessel. j. chem. eng. jap., 1995, 28, no. 6, pp. 786–789
[4] kee, k. c., reilly, c. d.: an experimental method for obtaining particle impact frequencies and velocities on impeller blades. proceedings of the 10th european conference on mixing, delft (the netherlands), editors: h. e. a. van den akker, j. j. derksen, elsevier, amsterdam 2000, pp. 231–238
[5] fořt, i., ambros, f., medek, j.: study of wear and tear of axial flow impellers. proceedings of the fluid mixing vi conference (editor: h. benkreira), bradford (england) 1999, pp. 59–68
[6] fořt, i., ambros, f., medek, j.: study of wear of pitched blade impellers. acta polytechnica 2000, 40, no. 5/6, pp. 5–10

doc. ing. ivan fořt, drsc.; ing. františek ambros, csc., dept. of process engineering, czech technical university in prague, faculty of mechanical engineering, technická 4, 166 07 praha 6
prof. ing. jaroslav medek, csc., dept. of process engineering, technical university, faculty of mechanical engineering, technická 2, 616 69 brno

towards evolutionary design of complex systems inspired by nature
jaroslav vítků, pavel nahodil
czech technical university in prague, faculty of electrical engineering, department of cybernetics, technická 2, 166 27 prague 6, czech republic
corresponding author: vitkujar@fel.cvut.cz
doi:10.14311/ap.2014.54.0367, acta polytechnica 54(5):367–377, 2014, available online at http://ojs.cvut.cz/ojs/index.php/ap

abstract. this paper presents the first steps towards the evolutionary design of complex autonomous systems. the approach is inspired by the modularity of the human brain and the principles of evolution. rather than evolving neural networks or neural-based systems, the approach focuses on evolving hybrid networks composed of heterogeneous sub-systems implementing various algorithms/behaviors. currently, evolutionary techniques are used to optimize the weights between predefined blocks (so-called neural modules) in order to find an agent architecture appropriate for a given task. the framework, together with the simulator of such systems, is presented here. then, examples of agent architectures represented as hybrid networks are presented. one architecture is hand-designed and one is automatically optimized by means of an evolutionary algorithm. even such a simple experiment shows how evolution is able to pick up unexpected attributes of the task and exploit them when designing a new architecture.
keywords: agent, architecture, artificial life, creature, behavior, hybrid, neural network, evolution.

1. introduction — problem decomposition
many researchers have had a long-term goal: to build an autonomous robotic system that is able to operate in real-world conditions in a robust way.
however, after many years of research and development, there are still no fully satisfactory results. the task of building reliable robots is composed of too many smaller sub-problems, such as vision, reasoning in unknown domains, etc. on this long way, there are simply too many problems; moreover, most of these smaller problems do not yet have a feasible solution.

1.1. modular design in practice
nevertheless, many of these small problems have already been solved relatively well, for example pattern recognition, planning, reinforcement learning, etc. in order to build a well-performing system it is often possible simply to use these known systems, or, even better, to use implementations of known algorithms which are tested and have performed well on a given task. building a more complex system then becomes faster and simpler. this approach of re-using modules was popularized by the robot operating system (ros). the main goal of ros is to provide nodes — implementations of algorithms — that can be re-used in robotic applications (such as path-planning, vision etc.) [1]. the user picks selected nodes and composes the resulting system from them. generally, such an approach is called top-down. the opposite direction is called a bottom-up design approach. connectionist artificial neural networks (anns) provide an example: no single piece of the system (a neuron) implements useful processing alone, yet when the pieces are composed together, complex behavior can be obtained, often with the use of emergence. the most traditional design of anns is as follows: (1) predefine some structure of the ann; (2) optimize the weights between neurons in order to obtain the desired behavior. optimization can be done by some local algorithm [2], or by global approaches such as neuro-evolution [3]. more recently, anns have often been designed by a principle called "neural engineering". this is a top-down approach which works with modular neural networks (mnns) [4]. the purpose of each module (sub-network) is defined, and the required behavior of the resulting system is then obtained by composing multiple sub-networks together [5]. in this way, complex systems can be composed [6].

1.2. how can systems be designed in a general way?
both of the approaches mentioned here have their own benefits and their own weaknesses. the use of already-implemented pieces of robotic software provides high reliability of the system, and the user has deep insight into the inner functionality; however, the resulting design can be very constrained, and the design possibilities are limited. anns provide unconstrained design options and are very good at dealing with uncertainty in data, but their structure is often too complex to be understood directly and altered well. here, the authors focus on designing agent architectures (learning and decision-making systems for (virtual) robots) in a hybrid way. the process of designing agents is not fixed to either of the design approaches mentioned above, but takes advantage of both. the novel framework presented here is called hybrid artificial neural network systems (hanns). it is an attempt to unify modular neural networks with purely top-down and practical sub-systems, such as those implemented in ros.

figure 1. a neural module with two data inputs, four data outputs and four configuration inputs. the module also contains a "prosperity" output, which provides a subjective heuristic defining how well the module performs in the current architecture during the current simulation. here, a modem serves as a communication interface between the nengoros simulator and an external ros node, which implements a given algorithm/functionality.
one of the main goals of this framework is to be able to automate the design of new architectures for a given task. this approach will be shown on a simple example here.

1.3. structure of the text
the following chapter describes the main concepts of our hanns framework, together with the nengoros simulator. the third chapter shows these principles applied to a simple agent architecture implementing motivation-driven reinforcement learning (rl). then the evolutionary approach is used to design a similar architecture automatically. finally, the two results are compared and discussed.

2. hybrid artificial neural network systems
hybrid artificial neural networks can be described as modular anns [4] composed of heterogeneous subsystems. each subsystem can implement various methods of decision-making and employ various representations of information processing, from neural-like to symbolic representations [7]. this means that (for connecting a symbolic-based module to a neural network) the symbolic representation has to be derived from the activity of particular neurons; this has to be done, for example, by means of a predefined lexicon, rule extraction or similar methods. this chapter describes the main principles employed by the presented hybrid artificial neural network systems (hanns) framework. the framework uses seamless communication between all modules in the network and encapsulated information transformation where needed. the particular strategy of information transformation belongs to the module itself and is hidden from the rest of the hybrid network.

2.1. communication and sub-system representation
in order to deal with connecting different modules which use different types of communication, a common type of communication had to be defined. since there is a possibility that every system could be implemented by neural computation one day, the aim is to add higher-level subsystems into anns; therefore the entire network uses neural-like communication in this framework. each "neural module" subsystem can implement arbitrary behavior and has a defined number of input and output connections. each connection can represent a real-valued number, typically from the interval ⟨0, 1⟩. the scheme of an example of a neural module is shown in fig. 1. the figure shows that the framework uses simple communication in the network, while a transformation from/to a symbolic (or other) domain is implemented inside a particular neural module (depicted as a modem in the schematics), if needed. according to [7], this framework uses "hybrid system coupling interleaved by function calls", where all activity on the inputs of a neural module is translated into the inner representation of the module and is processed accordingly. after processing the data, the module encodes the result back into the "neural communication" and sets the data on its outputs.

2.2. configuration of a neural module
this type of neural module can implement various types of information processing. however, even domain-independent algorithms often need fine-tuning of their parameters in order to work efficiently. therefore the neural module has configuration inputs in addition to data inputs and outputs.
these inputs are represented in the same way as data inputs, but their purpose is to define the parameters of the algorithm(s) encapsulated in the neural module. these parameters can be used as normal data inputs, which means that their value can be changed during the simulation. the only difference is that these inputs can be ignored: if the configuration inputs are left unconnected, they hold predefined (default) values of the algorithm parameters. this can be used to simplify the overall complexity of the network topology.

2.3. prosperity of the neural module
one of the main aims of this framework is to study new, alternative use-cases of known algorithms/subsystems. during the automatic design of these hybrid artificial neural network systems, the following use cases of a neural module can occur:
• the neural module is connected in a completely wrong way: the corresponding algorithm is used inefficiently or does not work at all.
• the neural module is connected in an unexpected, new way: the corresponding algorithm is employed, but in a way not anticipated by the designer, a potentially new use of the algorithm.
• the neural module is connected in an expected way: the algorithm is employed as expected during the neural module design.
the second of these three cases does not necessarily mean that the algorithm is not advantageous in the architecture. however, the behavior of an algorithm (and potentially its purpose in the architecture) can often be hard to analyze by hand. in order to identify incorrectly used parts of the resulting architecture, it would be convenient to distinguish between these three use-cases automatically. furthermore, it would be useful to be able to evaluate the performance of the algorithms in a given situation; however, this is possible only by means of heuristics. the value of the prosperity output defines a subjective heuristic expressing "how well the algorithm performs" in a given architecture during the simulation. this enables the user (and potentially the ea) to distinguish between good and bad parts of a particular architecture. it is up to the designer of a particular neural module how to define its prosperity function; the function should produce values in the interval ⟨0, 1⟩.

2.4. the nengoros simulator
the next goal of our hanns framework is to provide a platform for simulating agent architectures. in order to rapidly prototype and simulate these architectures, a simulator of hybrid artificial neural network systems was created. the main objectives were the following: maximum reuse of current implementations of algorithms, decentralized and event-driven simulation, and integration with an advanced simulator of large-scale anns. the resulting open-source system is called nengoros [8]. it connects a simulator of large-scale anns of the 3rd generation called nengo [9] with the robot operating system (ros) [1]. the nengo simulator was created with the goal of implementing the neural engineering framework (nef), and it therefore supports modular anns. the addition of ros enables the simulator to use any ros-enabled subsystem in the simulation. the integration of ros into the simulator is depicted in fig. 1, where the component called "q-lambda module" is an external ros node and the "modem" serves as the interface between nengo and ros. the ros-based components can be run remotely and can represent a particular piece of software, a simulated world or even robotic hardware.
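the module contract of sections 2.1–2.4 (real-valued i/o in ⟨0, 1⟩, ignorable configuration inputs with defaults, and a prosperity output) can be summarized in a few lines of code; the sketch below is an illustrative python analogue, not the actual nengoros api:

```python
class NeuralModule:
    """illustrative analogue of a hanns neural module: all i/o is
    real-valued in <0, 1>; unconnected configuration inputs fall back
    to defaults; prosperity reports a subjective quality heuristic."""

    def __init__(self, n_inputs, n_outputs, config_defaults):
        self.n_inputs = n_inputs
        self.n_outputs = n_outputs
        self.config_defaults = dict(config_defaults)

    def step(self, data_in, config_in=None):
        # unconnected configuration inputs keep their default values
        cfg = {**self.config_defaults, **(config_in or {})}
        # the 'modem': translate neural-like activity to the module's
        # inner representation, run the encapsulated algorithm, and
        # encode the result back to values in <0, 1>
        inner = self.decode(data_in)
        result = self.compute(inner, cfg)
        return self.encode(result)

    def prosperity(self):
        # heuristic in <0, 1>; each module designer defines their own
        raise NotImplementedError

    # the three hooks below are what a concrete module implements
    def decode(self, data_in):
        return data_in

    def compute(self, inner, cfg):
        raise NotImplementedError

    def encode(self, result):
        return result
```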
3. description of selected neural modules
an example of the use of the presented framework will be shown on two selected neural modules. the first module implements a reinforcement learning algorithm, and the other module represents the physiological state of the agent. together, these modules implement a principle called motivation-driven reinforcement learning. for each of these algorithms, the theory will be briefly described; then, its integration into a neural module will be introduced.

3.1. reinforcement learning module
agent architectures from the domain of artificial life (alife) often require online and model-free reinforcement learning (rl). an algorithm called q-learning meets these requirements. this discrete algorithm learns a desired strategy only by means of interaction with the environment, based on the actions produced and the rewards/punishments received. the algorithm learns behavior which leads towards the nearest reward while avoiding punishments. the algorithm is encapsulated here as a standalone sub-system, a neural module. this particular example (the use of rl in hanns) can be likened to ensemble algorithms in reinforcement learning [10], or to the aggregated multiple reinforcement learning system (amrls) [11]. compared to these, a single ensemble is represented as a multiple-input multiple-output sub-system, which communicates compatibly with the 2nd generation of artificial neurons. the action selection mechanism (asm) is also currently integrated in the neural module. figure 2 presents a graphical representation of our modification of the q-learning algorithm. this algorithm runs inside a standalone ros node and communicates externally only by means of ros messages. figure 1 shows the integration of this ros node into a neural module. the module is compatible with the hanns framework, and can be seamlessly connected into a network of heterogeneous nodes.

figure 2. scheme of the stochastic return predictor (srp) implementation. it is composed of the q-lambda algorithm and the asm. the (sub-)system is implemented as a stand-alone ros node, which can be used as a neural module. the outputs encode a selected action by means of the 1-of-n code (where n is the number of available actions). m state variables are encoded by m data inputs, and each data input is sampled in a predefined number of discrete values. the node selects one action at each time step and expects information about the new state and the reward. the node has the following configuration inputs: α, γ, λ, importance, which affect the learning and action selection methods (in the case that these inputs are not connected, default values are used). the prosperity heuristic represents the average between the mcr and the overall coverage of the state space (the number of visited states).

3.1.1. learning
for learning, the neural module uses the standard algorithm called q-learning (more exactly: q-lambda, see below). it is named after its q matrix, which maps state-action pairs to utility values. at each environment state, the q(s, a) matrix stores utility values for all possible actions:

q : a × s → r,  (1)

where a is the set of all available actions and s is the set of all possible states of the environment. the utility value represents the discounted future reinforcement that will be received by the agent if it follows a given action a in a given state s. online learning is governed by obtaining new q(s, a) values into the matrix.
a change of the value in the matrix is represented by the following equation:

δ = r_{t+1} + γ max_a q(s_{t+1}, a) − q(s_t, a_t).  (2)

the algorithm stores the current state and the action that was just executed, (s_t, a_t). action a_t may cause receiving the reward r_{t+1} and a transition into the new state s_{t+1} (note that this theoretically requires an environment with the markov property). based on this information and on the optimal action in the new state, a*_{t+1} = argmax_a q(s_{t+1}, a), the value q(s_t, a_t) is updated as follows:

q(s_t, a_t) ← q(s_t, a_t) + αδ,  (3)

with the following parameters: γ ∈ ⟨0; 1) is a forgetting (discount) factor and α ∈ (0; 1⟩ is a learning rate. according to equation (3), the q-learning algorithm updates only one value at a time. the learning speed can be enhanced by a modification called the eligibility trace, which enables the algorithm to update the values of multiple past state-action pairs in one step. such a modification of q-learning is called the q-lambda, or q(λ), algorithm. by introducing the eligibility function, which is fundamental for eligibility-trace-based approaches, the update can be rewritten as follows:

q(s_t, a_t) ← q(s_t, a_t) + αδ e_t(s, a),  (4)

where the eligibility is defined for each state-action pair as follows:

e_t(s, a) = γλ e_{t−1}(s, a)       if (s, a) ≠ (s_t, a_t),
e_t(s, a) = γλ e_{t−1}(s, a) + 1   if (s, a) = (s_t, a_t).  (5)

the equation states that for every state-action pair there is an eligibility value that decays in time; if the state-action pair is used, its eligibility is increased by 1. with the modified equation (4), all state-action pairs are updated in each step. the decay parameter λ ∈ ⟨0, 1⟩ defines the magnitude of the update of the previous states. in the case that λ = 0, pure one-step temporal difference (td) learning is used; in the case of λ = 1, monte-carlo learning is obtained. a correct estimation of λ can improve the speed of learning, but can also cause oscillations in learning. in the implementation of this neural module, a modification of the q(λ) algorithm is used in which the eligibility trace is constrained to a finite length of n previous steps; this saves computational resources and prevents greater destabilization of the learning convergence by an incorrect value of the λ parameter.

figure 3. the neural module implementing the physiological state space. this node produces/represents the motivation for the behavior which leads to the correct reinforcement. the value of its physiological variable decreases in each time step with given dynamics, and thus increases the motivation. once the correct reinforcement is received, the value of the variable goes back towards the limbo area, where no motivation is produced. the value of the prosperity output is defined as p_t = 1 − sf_t (the mean state distance to the limbo area).

3.1.2. action selection method in non-episodic experiments
a typical use of the q(λ) algorithm is in episodic experiments, where the initial state of the environment is selected randomly; this helps towards uniform exploration of (and learning in) the entire state space. in order to add domain independence, the designed module has to be able to operate in non-episodic experiments. in non-episodic experiments (particularly simpler ones with one attractor), it is necessary to achieve a balance between knowledge exploitation and exploration/learning.
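a minimal tabular sketch of the q(λ) update of eqs. (2)–(5), including the finite-length eligibility trace described above, might look as follows (the class and its parameter values are illustrative, not the module's actual implementation):

```python
import numpy as np
from collections import deque

class QLambda:
    """tabular q(lambda) with the eligibility trace truncated to the
    last n visited state-action pairs, as described in sec. 3.1.1."""

    def __init__(self, n_states, n_actions, alpha=0.5, gamma=0.9,
                 lam=0.7, trace_len=20):
        self.q = np.zeros((n_states, n_actions))
        self.alpha, self.gamma, self.lam = alpha, gamma, lam
        self.trace = deque(maxlen=trace_len)   # finite-length trace

    def update(self, s, a, reward, s_next):
        # eq. (2): one-step temporal-difference error
        delta = reward + self.gamma * self.q[s_next].max() - self.q[s, a]
        # eq. (5): decay all stored eligibilities, then bump the current pair
        # (a revisited pair appears twice in the trace; acceptable for a sketch)
        self.trace.append([s, a, 0.0])
        for entry in self.trace:
            entry[2] *= self.gamma * self.lam
        self.trace[-1][2] += 1.0
        # eq. (4): apply the weighted update to all traced pairs
        for (si, ai, e) in self.trace:
            self.q[si, ai] += self.alpha * delta * e
```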
a typical action selection method (asm) for efficient knowledge exploitation is the greedy strategy, where the action with the highest utility is selected:

a_{t+1} = a*_{t+1} = argmax_a q(s_{t+1}, a).  (6)

this strategy may get stuck at a local optimum, so the ε-greedy asm is often used. in this asm, the parameter ε affects the amount of randomization: a random action is selected with probability ε, while the greedy strategy is followed with probability

p(a*_{t+1}) = 1 − ε.  (7)

the parameter ε therefore directly balances between exploitation of knowledge and exploration of the state space around the nearest attractor. in order to provide efficient learning ability in non-episodic experiments, the authors introduce an input to the neural module called "importance", which generally defines the current need for the "services" provided by the module. for the q(λ) module, the importance input represents the motivation for the behavior represented by this node, see fig. 2. the amount of randomization in the asm should be inversely proportional to the importance input. here, the ε-greedy asm is used, but the randomization is defined as:

ε = 1 − importance.  (8)

by increasing the importance of the q(λ) module (increasing the motivation for executing the behavior represented by this node), the probability of taking the greedy action a* increases. this means that the importance enables the agent to learn by exploration in its free time and to exploit the information when needed.

3.1.3. prosperity of the reinforcement learning module
this subjective online heuristic estimates how efficiently the module is used in the current topology during the current simulation (task). since the q(λ) module has to follow two antagonistic objectives (exploitation vs. exploration), it can be difficult to represent the efficiency of its use by a single value. the heuristic currently being used is represented by the following equation:

p_t = (cover_t + mcr_t) / 2,  (9)

where the mean cumulative reward (mcr) is defined as the mean reward r received during the simulation until time step t; it represents the efficiency of knowledge exploitation:

mcr_t = (1/t) Σ_{i=0..t} r_i.  (10)

the value of cover_t represents how many states of the entire state space have been visited so far:

cover_t = |s_visited| / |s|,  (11)

where s_visited is the set of states visited so far; this value represents the exploration efficiency.

figure 4. scheme of mapping the genotype (vector of binary/real values) to the phenotype (working agent architecture). the physiological module is wired to the reward source in the map, and this determines the main goal of the agent (by the module's prosperity value). all configuration inputs are unconnected, so the default parameters are used. the outputs of the q-lambda module are directly wired to the agent's actuators. the genotype of the hand-designed architecture is depicted at the bottom, and its connections are highlighted in the scheme. variables representing the state of the environment (x, y coordinates) are connected to the data inputs of the q-lambda node. the reinforcement is connected to the physiological module, which produces motivation for the asm and a reward for learning in the q-lambda module.
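the importance-modulated ε-greedy selection of eqs. (6)–(8) and the prosperity heuristic of eqs. (9)–(11) can be sketched directly (illustrative python, continuing the earlier sketch):

```python
import numpy as np

def select_action(q_row, importance, rng):
    # eq. (8): epsilon = 1 - importance; high motivation -> greedy
    if rng.random() < 1.0 - importance:
        return int(rng.integers(len(q_row)))   # explore with a random action
    return int(np.argmax(q_row))               # eq. (6): greedy action

def prosperity(rewards, visited, n_states):
    mcr = sum(rewards) / max(len(rewards), 1)  # eq. (10): mean reward per step
    cover = len(visited) / n_states            # eq. (11): state-space coverage
    return 0.5 * (mcr + cover)                 # eq. (9)

rng = np.random.default_rng(0)
a = select_action(np.array([0.1, 0.9, 0.3, 0.2]), importance=0.8, rng=rng)
```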
3.2. motivation source module
in order to represent an agent's needs, its physiology can be modeled [12]. it has been shown that decomposing a task into subtasks can help rl to learn more efficiently, which is beneficial especially in more complex tasks or in tasks that are difficult for rl to learn [13]. the agent's needs can represent the motivation to execute/learn these subtasks of such a more complicated policy. the second neural module, which serves as a motivation source and holds one physiological variable, is used here. the value of this variable decays with predefined dynamics in time. the value of 0 represents the purgatory area and the value of 1 represents the limbo area. the amount of motivation depends inversely on the physiological variable: in the limbo area, no motivation is produced, while the purgatory area represents the maximum motivation/need. the simple dynamics of the physiological variable is defined as follows:

v_{t+1} = v_t − decay.  (12)

the amount of motivation that is produced is determined by applying a sigmoid to the inverse value of the physiological variable v. the resulting amount of motivation m at time t is:

m_t = 1 / (1 + e^{min + (max − min)(1 − v_t)}),  (13)

where the min and max parameters are chosen so that the value of the variable v_t = 0 roughly corresponds to a motivation of m_t = 1. if the reward is received, the value of v_{t+1} is set to one, and the motivation therefore decreases towards 0, which switches the agent back towards exploration.

3.2.1. prosperity of the motivation source module
if the agent behaves efficiently enough, the mean motivation produced by this module is low. the mean motivation value can be expressed by the mean state distance to the optimal conditions (sf), which is defined as follows:

sf_t = (1/t) Σ_{i=0..t} d_i,  (14)

where d_i is the distance of the state variable v_i from the optimal conditions of v = 1. sf_t is computed online for each simulation step. since the prosperity is inversely proportional to it, its value is computed as:

p_t = 1 − sf_t.  (15)

figure 5. observing the motivation-driven rl behavior of the architecture: exploration vs. knowledge exploitation is dynamically balanced according to the agent's current needs. the x-axis represents the simulation steps; the y-axis is the value of the motivation/reward. peaks in the lower graph represent the binary event of receiving the reward. these events correlate with the amount of motivation that is currently produced (upper graph). the speed of receiving the motivation depends on the quality of the agent's knowledge (fig. 6b and fig. 6c) and on the agent's current distance to the reward source.

4. experiments
first, experiments which validate the expected functionality of the particular neural modules were performed; then their correct interaction was evaluated on an architecture which was hand-wired for the given experiment. two types of evolutionary algorithm (ea) were used: the standard generational model of the genetic algorithm (ga) and its modification for a vector of real-valued numbers, called the real-valued ga (rga). both the ga and the rga were tested and compared on designing new architectures for a given task. in the experiments, the agent is allowed to move in a 2d discrete world of 15 × 15 positions. the map contains two obstacles and one source of motivation (see fig. 6b). the agent has 4 actions — moving in four directions — and if the agent steps on a tile with the reward, a positive reinforcement is received. the q-lambda module is therefore configured to have four outputs (four actions) and two inputs (the x, y coordinates on the map), each sampled into 15 discrete values (see fig. 4). the value of the physiological variable is configured to decrease by decay = 0.01 each step.
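the motivation dynamics of eqs. (12)–(15) reduce to a few lines; the following sketch uses the decay of 0.01 quoted above, with illustrative sigmoid parameters (the paper does not give their numeric values; note that with the form of eq. (13), min must be a large positive number and max a large negative one so that v = 1 yields m ≈ 0 and v = 0 yields m ≈ 1):

```python
import math

class MotivationSource:
    """physiological variable v in <0, 1>: 1 = limbo (satisfied),
    0 = purgatory (maximum need); see eqs. (12)-(15)."""

    def __init__(self, decay=0.01, s_min=10.0, s_max=-10.0):
        self.v = 1.0
        self.decay = decay
        self.s_min, self.s_max = s_min, s_max    # illustrative values
        self.dist_sum, self.steps = 0.0, 0

    def step(self, reward_received):
        if reward_received:
            self.v = 1.0                          # reward resets the variable
        else:
            self.v = max(0.0, self.v - self.decay)  # eq. (12), clamped at 0
        self.dist_sum += 1.0 - self.v             # distance d_i to v = 1
        self.steps += 1
        # eq. (13): sigmoid of the inverted variable -> motivation
        exponent = self.s_min + (self.s_max - self.s_min) * (1.0 - self.v)
        return 1.0 / (1.0 + math.exp(exponent))

    def prosperity(self):
        sf = self.dist_sum / max(self.steps, 1)   # eq. (14)
        return 1.0 - sf                           # eq. (15)
```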
the simulated environment is also implemented as a ros node and is likewise used in the nengoros simulator as a neural module. the simulation is non-episodic: the agent is placed in the environment and is allowed to interact with the world for a predefined number of simulation steps.

4.1. description of the hardwired agent architecture
the modules are connected so that the physiological module receives a reward from the environment and produces motivation for the q-lambda module. the states of the environment are connected to the data inputs of the q-lambda module. the actions produced by the q-lambda module are then directly applied as actions of the agent in the environment. the scheme of the hand-wired architecture, represented as a hybrid artificial neural network, is depicted in fig. 4. together, this network implements the motivation-driven reinforcement learning used in a non-episodic simulation.

4.1.1. simulation of the hardwired architecture
during the simulation, the prosperity values of both the q-lambda and the physiological module are observed. these should correlate with the successfulness of the agent's behavior. note that the q-lambda prosperity is composed of the mcr_t and cover_t values (how successful the learning is and how efficient the exploitation is), and the prosperity of the physiological module is defined as 1 − sf_t (defining how satisfied the agent is). figure 6a shows the course of these values in time. it can be seen that at about step 20000 the prosperity of the physiological module and the agent's average reward/step converge. since the behavior is learned reliably at about step 20000, the agent is able to exploit the knowledge efficiently (it fulfills the requirements for a reward faster) and therefore has more time to explore; this causes further discovery of new states between steps 20000 and 40000. figure 5 shows the relatively stable behavior of the motivation-driven rl system during this part of the simulation. here, the agent balances between exploration and exploitation of knowledge based on the current motivation value.

figure 6. simulation of the hand-designed agent architecture. (a) convergence of the prosperity values of both neural modules in time: since the behavior is learned reliably around step 20000, the agent is satisfied and systematically explores further new states. (b) visualization of the knowledge learned by the agent during the first 10000 simulation steps: each position in the table is filled with a representation of the action with the highest utility value. (c) utility value of the best action (see b) based on the agent's current position in the map: the nearer to the reward source, the higher the expected outcome of the best action. each position in b) and c) represents one position in the map. there are two obstacles and one reward source in the environment. b) and c) show the agent's learned knowledge: the greedy asm selects the action with the highest utility in each state; these actions are shown in b) and their utility is shown in c). note that the utility values on the z-axis are rescaled for better visibility.
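for completeness, the illustrative pieces sketched in the previous sections can be wired together in the spirit of the hand-designed architecture of fig. 4: environment state into q-lambda, reward into the motivation source, motivation into the asm. the toy grid world below is a stand-in, not the paper's ros environment (obstacles are omitted); it reuses the QLambda, MotivationSource and select_action sketches defined earlier.

```python
import numpy as np

rng = np.random.default_rng(0)
SIZE, REWARD_POS = 15, (12, 3)        # toy 15x15 world with one reward tile
ql = QLambda(n_states=SIZE * SIZE, n_actions=4)
motiv = MotivationSource(decay=0.01)
MOVES = [(0, 1), (0, -1), (1, 0), (-1, 0)]

x, y = 7, 7
m = 0.0                                # current motivation
for step in range(20_000):
    s = x * SIZE + y
    a = select_action(ql.q[s], importance=m, rng=rng)
    dx, dy = MOVES[a]
    x = min(max(x + dx, 0), SIZE - 1)
    y = min(max(y + dy, 0), SIZE - 1)
    reward = 1.0 if (x, y) == REWARD_POS else 0.0
    ql.update(s, a, reward, x * SIZE + y)
    m = motiv.step(reward_received=reward > 0)   # closes the motivation loop

print("physiological prosperity:", motiv.prosperity())
```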
the longer the delay between satisfying the motivation, the further the agent was from the source of a reward (potentially an unexplored area). the knowledge that has been learned by the agent can be visualized in two ways. figure 6b shows a graphical representation of the best action (a*) based on the agent’s current position in the map (state s of the environment). convergence to the optimal strategy can be observed especially near the source of the reward. figure 6c depicts the actual utility values in the q(s, a) matrix for the best action a* in the state. the higher the surface is, the higher is the expected reward. zero values represent unexplored states, such as obstacles. note that the behavior of this architecture can be seen in the attached movie.

4.2. principle of the evolutionary design of new architectures
two types of a simple genetic algorithm (ga) were used for the neuro-evolutionary design (optimizing the connection weights between modules) of new architectures here. as was mentioned above, the agent architecture is represented as an oriented graph, where the nodes are neural modules and the edges are the connection weights between their inputs/outputs. similarly to the neuro-evolutionary design of anns [14], the topology of the agent architecture is optimized by modifying the connection weights between nodes. each individual consists of a genome and a fitness value. a genome is a vector of binary/real values representing the connection weights between modules. the fitness value represents the quality of a given architecture (its performance on a given task) and is determined by means of prosperity values (see below). the generational model of the ga/rga (real-valued ga) is used here. in this particular case, the mapping of the genotype (genome) to the phenotype (architecture) is depicted in fig. 4. the representation of the architecture is inspired by feed-forward ann topologies, where the inputs/outputs between particular layers are fully connected. the weights of these connections are optimized by the ga/rga. the previous experiment suggested that during a simulation of 20000 steps the architecture should be able to successfully learn the desired behavior. the evaluation of an individual is therefore determined by means of the prosperity values of the physiological neural module, and is obtained after simulating the architecture for 20000 steps. the parameters of the ga and the rga were empirically set to the following values: the population size is popsize = 50, and the number of generations is maxgens = 80. each gene is mutated with a probability of pmut = 0.05. the one-point crossover is applied to two individuals with a probability of pcross = 0.8. the ga mutation is defined as flipping the value of a given gene. for the rga, the mutation was implemented as sampling the gaussian function with a standard deviation of σ = 1 and a mean value of µ = gene_i, where gene_i ∈ R is the value of the mutated gene (the interval constraints are applied after the mutation).

4.3. evolutionary design of new architectures
since the prosperity of the physiological module is determined by the agent’s ability to learn new knowledge and use it, the suitability of the wiring of both modules is reflected in the prosperity of the physiological module. we therefore define the simple fitness (sf), which is equal to the prosperity of the physiological module:

p_{\mathrm{arch}} = \mathit{sf}_{\mathrm{arch}} = p_{\mathrm{phys}} = 1 - \mathit{sf}_t,    (16)

where p_phys is the prosperity of the physiological module.
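the genetic operators quoted above can be written down directly; a minimal sketch follows. the function names are mine, and the gene interval [0, 1] used for the rga clipping is an assumption; the parameter values (pmut = 0.05, pcross = 0.8) and the definitions of the mutations are taken from the text.

```python
import random

PMUT, PCROSS = 0.05, 0.8   # mutation and crossover probabilities from the text

def mutate_ga(genome):
    """ga mutation: flip a binary gene with probability pmut."""
    return [1 - g if random.random() < PMUT else g for g in genome]

def mutate_rga(genome, lo=0.0, hi=1.0):
    """rga mutation: gaussian sample (sigma = 1, mean = old gene), then clip."""
    return [min(hi, max(lo, random.gauss(g, 1.0))) if random.random() < PMUT else g
            for g in genome]

def one_point_crossover(a, b):
    """one-point crossover applied with probability pcross."""
    if random.random() >= PCROSS or len(a) < 2:
        return a[:], b[:]
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]
```

the fitness of an individual would then be obtained by decoding the genome into module connection weights, simulating the architecture for 20000 steps, and reading the prosperity of the physiological module, as in eq. (16).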
this means that the single-objective ga (soga) and the single-objective rga (sorga) will maximize only the prosperity of the physiological module. here, the ga and the rga were used to design new architectures for the described environment. the graph in fig. 7 compares the convergence of the two types of evolutionary design of agent architectures. again, it can be said that soga is able to find a similarly fit solution faster than sorga. the following sections will describe some typical automatically found architectures and their properties.

figure 7. comparing the ga and the rga with the single-objective fitness (sf). the fitness is defined as the prosperity of the physiology module (the courses of evolution for the composed fitness cf look similar). the ga finds a solution of similar quality faster than the rga in both cases. note that each result is averaged over 10 runs of the algorithm and fitted by a polynomial for better readability (the best fitness during sorga is fitted by a polynomial of order 5, during soga by a polynomial of order 8; the mean values of the best fitness are shown with 95 % prediction intervals).

4.3.1. analyzing new architectures found by soga
table 1 shows several architectures automatically designed by means of sorga and soga, and compares them to the manually designed architecture. soga found functional architectures relatively fast. both selected typical individuals (marked as ind1 and ind2 in the table) have the following properties:
• the reward input of the q-lambda module is wired correctly, so that the rl part works as expected.
• both environment state variables, and also the motivation output of the physiology, are connected to the motivation input of the q-lambda module. this means that the motivation-driven rl is also used. the motivation is directly proportional to the motivation produced by the physiology and to the agent’s position/distance from the reward source.
• the binary reward output of the physiology is also connected to the motivation source, but this has no significant influence on the agent’s behavior.

table 1. comparison of the genomes of the agent architectures: hand-designed, two soga-designed and two sorga-designed architectures (parameters a–n, see fig. 4).

architecture    a  b     c  d     sf          e  f  g  h  i  j  k     l  m    n
hand-designed   1  0     0  1     sf = 0.625  0  0  0  1  0  0  1     0  0    0
soga – ind1     1  0     0  1     sf = 0.699  0  0  0  1  0  0  1     1  1    1
soga – ind2     1  1     0  1     sf = 0.697  0  0  0  1  0  1  1     1  1    1
sorga – ind1    0  1     1  0.71  sf = 0.745  0  0  0  1  1  0  0.89  0  1    1
sorga – ind2    0  0.51  1  0     sf = 0.723  0  0  0  1  0  0  0.76  0  0.3  0.44

also note that ind1 has the state variables correctly wired to the agent’s position, while ind2 has one dimension “diagonalized” (both the x coordinate and the y coordinate are connected to the s1 variable). this means that only one half of the q(s, a) matrix was used here, but the architecture still performed relatively well. figure 8 shows the behavior of a typical best architecture found by soga. the architecture performs similarly to the hand-designed architecture (see fig. 6).
compared to the hand-designed architecture, this architecture has a stronger motivation to stay near the reward source, so the overall prosperity of the physiological module is higher, while the number of explored states is lower.

4.3.2. analyzing new architectures found by sorga
finding new architectures by means of sorga took more generations than for soga. however, sorga has wider options for connecting modules. for example, it can be seen from table 1 that ind2 used only half of the q(s, a) memory. ind1 uses swapped coordinates with one dimension slightly diagonalized. in both architectures, motivation-driven rl is used. again, the amount of motivation originates from the physiology and from the agent’s position in the map (the agent is therefore afraid of going further away from the food). figure 9 shows the course of learning of ind1 and the content of its memory. the knowledge is diagonalized (see fig. 9b), but the learned data corresponds to the reward source. the position of the obstacle is also visible. the separated peak is caused by the fact that the reward output of the physiological module is connected to the s1 input.

figure 8. analyzing the typical architecture found by soga (marked as ind1). it can be seen that the higher motivation (combined from two sources) causes the agent to stay nearer the reward source (once found) than in the hand-designed architecture; compared to this, the sorga approach is able to weight the amount of importance produced by particular sources. (a) course of life of the typical best agent found by soga – ind1: the behavior is similar to the hand-designed architecture (see fig. 6a). (b) knowledge of the selected agent found by soga – ind1: the representation is identical to the hand-designed architecture; not all environment states are explored here.

figure 9. behavior and knowledge learned in the sorga-designed architecture. although the convergence of learning is slower, the overall results of the behavior are similar to the soga-designed architecture (fig. 8). (a) course of the life of the typical best agent found by sorga – ind1: compared to the soga-designed architecture (fig. 8a), this agent learns more slowly (slower convergence of the prosperity of the physiological module), which is caused by a sub-optimal representation of knowledge in the q-lambda module. (b) knowledge of the selected agent found by sorga – ind1: the q-lambda module has swapped axes and the value of the s2 variable is computed as s2 = x + 0.71y, which causes slower convergence of learning. the “peak” in the graph is caused by connecting the reward output to the s1 input.
this means that while receiving the reward, the perceived x position “jumps” to the maximum value.

4.3.3. discussion
it has been shown that the agent’s ability to weigh between exploration and knowledge exploitation can be represented only by means of the prosperity of the physiological module. table 1 shows that all architectures found by soga and sorga have notably higher fitness values than the hand-designed architecture. furthermore, the results of sorga (particularly ind1) suggest that it is more efficient to use two components as a source of motivation: time (the agent’s physiology) and space (the agent’s position). this is a simple example of how evolution is able to correctly identify possibly hidden attributes of the task and employ them while designing architectures suitable for the task. solutions provided by the evolution typically use a less efficient representation of the knowledge in the srp’s memory, but are able to use neural modules in an unexpected manner. one of the goals of our research is to test how known algorithms can be employed in new, unexpected ways.

5. conclusion
a novel approach for designing hybrid agent architectures, inspired by neuro-evolution, has been presented. this approach uses the newly presented framework of hybrid artificial neural network systems (hanns). it searches for new topologies of these hybrid networks by modifying the connection weights between particular neural modules. this method enables the semi-automatic design of agent architectures that are specialized for a given task. it has been shown that this approach is able to do the following:
• determine whether and how the module will be used, by connecting its outputs.
• define the form of knowledge representation in the module, by connecting its inputs.
it has been shown that, during the design of new architectures, the approach is able to pick out the important aspects of the task and make efficient use of this knowledge for designing the architecture. this method of automatic design is able to find design solutions that are as good as, or even better than, those designed by hand. this can be particularly useful in designing autonomous systems, where the task is too complicated to be fully understood by the human designer. in this particular case, the evolutionary design was able to choose the inner representation of a problem inside a neural module (representing the x, y coordinates), to discover inherent properties of the task (that the reward source is near the coordinates (0, 0)) and to define the solution by wiring the connections between neural modules (e.g., that the importance of knowledge exploitation is directly dependent on the agent’s position). the rga discovered that it is a more robust approach to use two independent sources of motivation: one based on the agent’s position (environment state) and the other based on the agent’s physiology. last but not least, particular nodes in the hanns framework are implemented in a way that can be used in a variety of different (e.g., more complex) architectures and/or in a variety of modular systems in the future. for example, multiple (differently configured) rl modules can either compete (for learning/action selection of antagonistic goals) or cooperate to learn/execute one (hierarchically decomposable) behavior.

acknowledgements
this research has been funded by the dept.
of cybernetics, faculty of electrical engineering, czech technical university in prague, under sgs project sgs14/144/ohk3/2t/13.

references
[1] m. quigley, k. conley, b. gerkey, et al. ros: an open-source robot operating system. in: icra workshop on open source software, 2009.
[2] f. jiang, h. berry, m. schoenauer. the impact of network topology on self-organizing maps. in: proceedings of the first acm/sigevo summit on genetic and evolutionary computation, gec ’09, pp. 247–254. acm, new york, ny, usa, 2009. doi:10.1145/1543834.1543869.
[3] j. bullinaria. generational versus steady-state evolution for optimizing neural network learning. in: proceedings of the international joint conference on neural networks ijcnn, 2004.
[4] g. auda, m. kamel. modular neural networks: a survey. int j neural syst 9(2):129–151, 1999.
[5] c. eliasmith, c. h. anderson. neural engineering: computation, representation, and dynamics in neurobiological systems. the mit press, cambridge, isbn 0-262-05071-4, 2003.
[6] e. crawford, m. gingerich, c. eliasmith. biologically plausible, human-scale knowledge representation. in: 35th annual conference of the cognitive science society, pp. 412–417, 2013.
[7] k. mcgarry, s. wermter, j. macintyre. hybrid neural systems: from simple coupling to fully integrated neural networks. neural computing surveys 2:62–93, 1999.
[8] dept. of cybernetics fee, ctu in prague. nengoros. http://nengoros.wordpress.com/ [2014-10-01].
[9] centre for theoretical neuroscience, u waterloo. nengo. http://nengo.ca/ [2014-10-01].
[10] m. wiering, h. van hasselt. ensemble algorithms in reinforcement learning. ieee transactions on systems, man, and cybernetics, part b: cybernetics 38(4):930–936, 2008. doi:10.1109/tsmcb.2008.920231.
[11] j. jiang. a framework for aggregation of multiple reinforcement learning algorithms. ph.d. thesis, waterloo, ont., canada, 2007. aainr34511.
[12] d. kadlecek, p. nahodil. adopting animal concepts in hierarchical reinforcement learning and control of intelligent agents. in: proc. 2nd ieee ras & embs int. conf. biomedical robotics and biomechatronics biorob 2008, pp. 924–929, 2008. doi:10.1109/biorob.2008.4762882.
[13] t. watanabe, t. sawa. instruction for reinforcement learning agent based on sub-rewards and forgetting. in: fuzzy systems (fuzz), 2010 ieee international conference on, pp. 1–7, 2010. doi:10.1109/fuzzy.2010.5584788.
[14] j. fekiac, i. zelinka, j. c. burguillo. a review of methods for encoding neural network topologies in evolutionary computation. in: proceedings 25th european conference on modelling and simulation ecms, pp. 410–416, 2011. isbn 978-0-9564944-2-9.
acta polytechnica doi:10.14311/ap.2013.53.0583
acta polytechnica 53(supplement):583–588, 2013
© czech technical university in prague, 2013, available online at http://ojs.cvut.cz/ojs/index.php/ap

thermal and chemical evolutions of galaxy clusters observed with suzaku

kosuke sato^a,*, kyoko matsushita^a, yoshitaka ishisaki^b, noriko y. yamasaki^c, takaya ohashi^b
a department of physics, tokyo university of science, japan
b department of physics, tokyo metropolitan university, japan
c institute of space and astronautical science (isas), japan aerospace exploration agency, japan
* corresponding author: ksato@rs.tus.ac.jp

abstract. we studied the properties of the intracluster medium (icm) of galaxy clusters out to the outer regions observed with suzaku. we observed that the temperature dropped by about ∼ 30 % from the central region to the virial radius of the clusters. the derived entropy profile agreed with the expectation from simulations within r500, while the entropy profile at r > r500 indicated a flatter slope than the simulations. this would suggest that the cluster outskirts were out of hydrostatic equilibrium. as for the metallicity, we studied the metal abundances from o to fe up to ∼ 0.5 times the virial radius of the galaxy groups and clusters. comparing the results with supernova nucleosynthesis models, the number ratio of type ii to type ia supernovae is estimated to be ∼ 3.5. we also calculated not only fe but also o and mg mass-to-light ratios (mlrs) with the k-band luminosity. the mlrs in the clusters had a similar feature.

keywords: galaxy clusters, x-rays, thermal evolution, chemical evolution.

1. introduction
clusters of galaxies, the largest virialized systems in the universe, are filled with the intracluster medium (icm), which consists of x-ray emitting hot plasma with a typical temperature of a few times 10^7 k. x-ray spectroscopy of the icm can immediately determine its temperature, mass and metal abundances. the mass profile of a cluster, which is a useful parameter for constraining cosmology, is determined through x-ray measurements of the temperature and density structure of the icm under the assumption of hydrostatic equilibrium of the icm. in the framework scenario of a hierarchical formation of structures based on the cold dark matter (cdm) paradigm, clusters are also thought to grow into larger systems through mass accretion flows along large-scale filamentary structures. consequently, the icm properties out to the virial radius play key roles in investigating the structure formation and the evolution of clusters.
because of the difficulties in observation, however, properties such as temperature, density, pressure, entropy, and metal abundance around the virial radius are not yet known. also, the metal abundances of the icm provide a large amount of information for understanding the chemical history and evolution of clusters. a large amount of metals in the icm are mainly produced by supernovae (sne) in galaxies, and these are classified roughly as type ia (sne ia) and type ii (sne ii). elements such as si, s and fe are synthesized in both sne ia and sne ii, while lighter α elements such as o, ne, and mg are mainly produced in sne ii, which are explosions of massive stars with an initial mass above ∼ 10 m⊙. the metals produced in the galaxies are transferred into the icm by galactic winds and/or ram-pressure stripping. recent observational studies of clusters with chandra and xmm-newton, with their powerful imaging capability and large effective area, have unveiled radial profiles of temperature, entropy, gas mass, and total mass of the icm up to r500 (e.g., [25, 37, 39]). the derived temperature and entropy profiles in the outer region to r500 were almost consistent with theoretical expectations from the self-similar assumption. also, zhang et al. [39] showed that the gas mass fraction mgas/mtotal increases with radius to r500. as for metal abundances, asca first measured the distributions of si and fe in the icm (e.g., [8]). the derived iron-mass-to-light ratios (imlr) are nearly constant in rich clusters and decrease toward poorer systems [19]. recent observations with chandra and xmm-newton allowed detailed studies of the metals in the icm. these observations, however, showed abundance profiles of o, mg, si and fe only for the central regions of very bright clusters or groups of galaxies dominated by cd galaxies in a reliable manner (e.g., [7, 9, 20, 35]). the abundance profiles of o and mg, in particular for the outer regions of clusters, are still poorly determined, because data from chandra and xmm-newton both show relatively high intrinsic background levels. tamura et al. [35] derived the imlr for five clusters within 250 h100^−1 kpc to be ∼ 0.01 m⊙/l⊙, and the oxygen mass within 50 h100^−1 kpc for several clusters. however, oxygen-mass-to-light ratios (omlr) for rich clusters are not reliable due to the low emissivity of the ovii and oviii lines at high temperatures. de grandi et al. [5, 6] and hayakawa et al. [13] found that clusters associated with cd galaxies and central cool components showed an abundance concentration in the cluster center, while clusters without cd galaxies indicated flatter profiles. the central metallicity enhancement in the cool-core clusters was further studied, and the excess metals were shown to be supplied from the cd galaxies [6]. the spatial distribution and elemental abundance pattern of the icm metals were determined with the large effective area of xmm-newton (e.g., [21, 22]). on the other hand, abundance measurements of o and mg with xmm-newton could only be made for the central regions of the brightest cooling-core clusters, due to the relatively high intrinsic background. rasmussen et al. [26, 27] report the si and fe profiles of 15 groups of galaxies observed with chandra. they suggest that the si to fe ratios in the groups tend to increase with radius, and the imlrs within r500 show a positive correlation with the total group mass (temperature).
we use h_0 = 70 km s^−1 mpc^−1 and ω_λ = 1 − ω_m = 0.73 in this paper. unless otherwise noted, the solar abundance table is given by anders & grevesse [2], and errors are within the 90 % confidence region for a single parameter of interest.

2. thermal properties of the icm
2.1. temperature profiles
because the suzaku xis is characterized by a lower background level and a higher sensitivity below 1 kev [18], we have been able to observe the icm emission beyond the r500 region of clusters [1, 4, 10, 12, 15, 28, 34]. almost all clusters observed with suzaku showed a similar trend, in which the temperature drops to ∼ 1/3 of the peak value. this was consistent with the theoretical expectations, as shown in fig. 1. the temperature was normalized by the mean temperatures of the clusters. the radial axis in fig. 1 was normalized by r200, which was derived from the mean temperature of the clusters as in henry et al. [11]. the dotted line shows the simulation result (burns et al. 2010), and the two gray dashed lines show the standard deviation.

figure 1. radial temperature profiles scaled by the mean temperature for the clusters observed with suzaku (a1413, 7.4 kev; pks−0745, 7.0 kev; a1689, 9.3 kev; a1795, 5.3 kev; a2142, 8.6 kev; a1246, 6.0 kev; hydra a, 3.0 kev; a2199, 4.0 kev; a1835, 8.0 kev; see also [1]). the radii are normalized by r200 with the mean temperature as shown in henry et al. [11].

2.2. entropy profiles
an entropy profile provides the thermal process and history of the icm, particularly for the gas heated by the accretion shock from outside the cluster. in x-ray astronomy, we define the entropy as s = kT\,n_e^{-2/3}. the derived entropy profile from suzaku increased with the radius to ∼ 0.5 r200 ∼ r500, and the profile had a flatter slope at r > 0.5 r200. this tendency was consistent among the suzaku results [1, 4, 10, 12]. compared to the previous xmm results for 31 clusters within r500 in pratt et al. [25], suzaku’s results were consistent with the entropy profile within r500.

figure 2. derived radial entropy profile for each cluster from suzaku observations (a1795, a1246, pks0745−191, a1413, a2142, a1689, hydra a, perseus, a2199, a1835). the radii are normalized by r200 with the mean temperature as shown in henry et al. [11].

voit [38] reported s ∝ r^{1.1} on the basis of numerical simulations of adiabatic cool gas accretion and the xmm results. pratt et al. [25] agreed with this relation within r500. we compared the entropy derived from suzaku with the values expected from the simulations, as shown in fig. 2. as a result, all the observed clusters had a similar tendency at r > 0.2 r200. this would suggest that all the observed clusters have undergone a similar thermal evolution process.
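the entropy definition used above is simple enough to evaluate directly from deprojected profiles; a minimal sketch (my own, with illustrative values) follows. it assumes a temperature in kev and an electron density in cm^−3, which gives the entropy in the kev cm^2 units conventional in x-ray astronomy.

```python
def icm_entropy(kT_keV, n_e_cm3):
    """x-ray 'entropy' s = kT * n_e**(-2/3), in kev cm^2."""
    return kT_keV * n_e_cm3 ** (-2.0 / 3.0)

# illustrative values only: kT = 5 kev, n_e = 1e-4 cm^-3
# gives s = 5 * (1e-4)**(-2/3) ≈ 2.3e3 kev cm^2
print(icm_entropy(5.0, 1e-4))
```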
2.3. mass profile
while the gas mass of the clusters is calculated from the electron density from the x-ray observation, the gravitational mass of the clusters is derived from the temperature and electron density profiles assuming spherical symmetry and hydrostatic equilibrium. for example, the stellar, gas and total mass of the abell 2199 cluster are shown in fig. 3. note that the resultant total mass from the hydrostatic equilibrium assumption started decreasing beyond the r500 region. this would indicate a flaw in the assumption. as mentioned in the previous section, the flatness or decrease of the entropy at r > r500 would also indicate a departure from the hydrostatic equilibrium assumption in the outskirts region of the cluster. the fraction of the derived baryon mass to the total mass of the cluster at r200 was ∼ 20 %, which was consistent with the cosmic baryon fraction, ∼ 15 % [16].

figure 3. upper panel: total mass under the assumption of hydrostatic equilibrium and gas mass (upper and lower solid lines, respectively) of the abell 2199 cluster; each dashed line shows the ±90 % errors for each mass, and the gray line corresponds to the stellar mass from the k-band luminosity with the 2mass catalogue. lower panel: the radial profile of the stellar and gas mass fraction relative to the total cluster mass; the dashed lines show the ±90 % errors, and the blue line indicates the cosmic baryon fraction (wmap7 result) [16].

3. metal distributions in the icm
3.1. metallicity in the icm
the suzaku xis can measure all the main elements from o to fe, because it achieves a lower background level and a higher spectral sensitivity, especially below 1 kev [18]. suzaku observations have shown the abundance profiles of o, mg, si, s, and fe to the outer regions (r ∼ 0.5 r180) with good precision for several groups and clusters [17, 21, 22, 29–34, 36]. because the fe abundance was well determined among the metal abundances, with smaller uncertainties, we compared the distributions of the metal abundances with those of the fe abundance. in order to compare the relative variation in the abundance profiles, we show the abundance ratios of o, mg, si, and s divided by fe as a function of the projected radius in figs. 4–7. as a result, the si and s to fe abundance ratios were close to ∼ 1.5 from the central to the outer region, while the gradients of the o and mg to fe ratios increase more gently than those of the si and s to fe ratios, as shown in figs. 4–7. in addition, the o/fe solar abundance ratios in the central region were about ∼ 0.6, while the ratios of the other elements to fe in the central region were almost at the solar abundance. the feature of the o abundance agreed with the xmm-newton observations [21, 22, 35].

figure 4. radial abundance ratios of o to fe (awm7, a1060, a262, ngc507, hcg62, fornax, ngc5044, ngc1550, centaurus).
figure 5. radial abundance ratios of mg to fe (same systems).
figure 6. radial abundance ratios of si to fe (same systems).
figure 7. radial abundance ratios of s to fe (same systems).

3.2. contributions from type ia and type ii supernovae
in order to examine the relative contributions from type ia and type ii supernovae (sne ia and sne ii) to the icm metals, the elemental mass pattern of o, mg, si, s and fe was examined for each cluster. the mass patterns were fitted by a combination of average sne ia and sne ii yields per supernova, as demonstrated in fig. 8. the fit parameters were chosen to be the integrated number of sne ia (n_ia) and the number ratio of sne ii to sne ia (n_ii/n_ia), because n_ia could be well constrained due to the relatively small errors in the fe abundance. the sne ia and ii yields were taken from iwamoto et al. [14] and nomoto et al. [24], respectively. we assumed a salpeter imf for stellar masses from 10 to 50 m⊙ with a progenitor metallicity of z = 0.02 for sne ii, and the w7, wdd1 or wdd2 models for sne ia.
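the two-component fit described above amounts to a least-squares problem in n_ia and n_ii/n_ia; the following sketch illustrates it with scipy. the yield numbers and observed masses below are placeholders, not the iwamoto et al. [14] / nomoto et al. [24] values, and the element set and error handling are simplified.

```python
import numpy as np
from scipy.optimize import curve_fit

# placeholder per-supernova yields (solar masses) for [o, mg, si, s, fe];
# a real fit would use the iwamoto et al. sne ia tables (e.g. w7) and the
# imf-averaged nomoto et al. sne ii yields.
yield_ia = np.array([0.14, 0.0085, 0.15, 0.087, 0.75])
yield_ii = np.array([1.80, 0.12, 0.10, 0.04, 0.09])

def pattern(x, n_ia, ratio_ii_ia):
    # n_ia * {(sne ia yield) + (n_ii/n_ia) * (sne ii yield)}, cf. fig. 8
    return n_ia * (yield_ia + ratio_ii_ia * yield_ii)

elements = np.arange(5)                                   # dummy abscissa
observed = np.array([2.1e9, 3.0e8, 7.0e8, 3.5e8, 1.1e9])  # illustrative masses
errors = 0.2 * observed

(n_ia, ratio), _ = curve_fit(pattern, elements, observed,
                             sigma=errors, p0=(1e9, 3.0))
print(f"n_ia = {n_ia:.3g}, n_ii/n_ia = {ratio:.2f}")
```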
all the clusters exhibited similar features. the abundance patterns were better represented by the w7 sne ia yield model than by wdd1. the number ratio of sne ii to sne ia with w7 is ∼ 3.5, while the ratio with wdd1 is ∼ 2.5. the wdd2 model gave quite similar results to w7. almost 3/4 of the fe and ∼ 1/4 of the si is synthesized by sne ia in the w7 model (for details, see sato et al. [29, 30]). note that, here, we estimated the numbers as the integrated ones at present. we would therefore need to consider the instantaneous recycling approximations, taking into account the stellar lifetime etc., as shown in matteucci & chiappini [23], to derive the correct chemical evolution of the galaxies in the clusters.

figure 8. fit results for each elemental mass for the awm 7 cluster. the top panels show the mass within the whole observed region (black) and within 0.1 r180 (red), fitted by n_ia{(sne ia yield) + (n_ii/n_ia)(sne ii yield)}. the blue dashed and solid lines correspond to the contributions of sne ia (w7) and sne ii within 0.1 r180, respectively. ne (atomic number = 10) is excluded in the fit. the mid and lower panels indicate the ratios of the data points to the best fit, and the fractions of the sne ia contribution to the total mass in the best-fit model for each element, respectively.

3.3. metal mass-to-light ratios
we examined the mass-to-light ratios for o, fe, and mg (omlr, imlr, and mmlr, respectively), which enabled us to compare the icm metal distribution with the stellar mass profile. historically, the b-band luminosity has been used for estimating the stellar mass [19]. however, we calculated it using the k-band luminosity in the clusters, based on the two micron all sky survey (2mass) catalogue. this method is useful in performing a uniform comparison with the properties of other groups and clusters based on the same k-band galaxy catalogue, to trace the distribution of member elliptical galaxies. the mlrs with the k-band continue to increase within the region observed with suzaku, as shown in figs. 9–11. this means that the metals extend to a more outer region than the galaxies. the mlrs with the k-band also exhibited a similar tendency for each cluster, and showed a smaller dispersion than those with the b-band luminosity. these results suggest that the metal enrichment process in the icm is similar for each cluster. we stress that high-sensitivity abundance observations out to the outer region of clusters will give important clues about their evolution. if the distribution of oxygen, as well as iron, could be measured to the very outer region (r ∼ r180), we might obtain a clear view of when the oxygen and iron were supplied to the intergalactic space, because most of the oxygen should have been synthesized by sne ii and supplied in the starburst era. another possibility is a very early metal enrichment of oxygen by galaxies or by massive population iii stars before groups and clusters assembled. in this case, a large part of the intergalactic space would be enriched quite uniformly with oxygen and other elements. metallicity information in cluster outskirts would thus give us unique information about the enrichment history. for this purpose, instruments with a much higher energy resolution, such as microcalorimeters, and optics with a larger effective area will play a key role in carrying out these studies.

figure 9. radial oxygen mass-to-light profiles with the k-band luminosity (a1060, awm7, hcg62, ngc507, a262, ngc5044, ngc1550, centaurus).
figure 10. radial magnesium mass-to-light profiles with the k-band luminosity (same systems).
figure 11. radial iron mass-to-light profiles with the k-band luminosity (same systems).

4. conclusion
suzaku observations of groups and clusters of galaxies for the first time showed the spatial distributions of temperature and entropy to the virial radii, and the metal abundances for o, mg, si, s, and fe up to ∼ 0.5 r180. the icm temperatures at the virial radii dropped by ∼ 30 % from the peak temperature. the entropy profiles derived from suzaku were consistent with those from the previous xmm results and with the values expected from numerical simulations within r < r500. at r > r500, the derived entropy profiles had a flatter slope than that expected from the simulations. this would indicate that the outskirts of the cluster were outside the hydrostatic equilibrium assumption. the abundances of mg, si, s, and fe dropped from solar levels at the center to ∼ 1/4 solar in the outermost region, while the o abundance showed a flatter distribution around ∼ 0.5 solar without a strong concentration in the center. the abundance ratios o/fe, mg/fe, si/fe, and s/fe in the groups and clusters were generally similar to each other. the abundance pattern from o to fe enabled us to constrain the number ratio of sne ii to ia as ∼ 3.5, which was consistent with the values obtained for the groups and clusters. the derived omlr, mmlr, and imlr in the clusters using the k-band luminosity had a similar feature. this would suggest that the clusters had undergone a similar evolution process in the past.

references
[1] akamatsu, h., hoshino, a., ishisaki, y., ohashi, t., sato, k., takei, y., & ota, n. 2011, arxiv:1106.5653
[2] anders, e., & grevesse, n. 1989, geochim. cosmochim. acta, 53, 197
[3] barnes, j., & efstathiou, g. 1987, apj, 319, 575
[4] bautz, m. w., et al. 2009, pasj, 61, 1117
[5] de grandi, s., & molendi, s. 2001, apj, 551, 153
[6] de grandi, s., ettori, s., longhetti, m., & molendi, s. 2004, a&a, 419, 7
[7] finoguenov, a., matsushita, k., böhringer, h., ikebe, y., & arnaud, m. 2002, a&a, 381, 21
[8] fukazawa, y., makishima, k., tamura, t., ezawa, h., xu, h., ikebe, y., kikuchi, k., & ohashi, t. 1998, pasj, 50, 187
[9] fukazawa, y., makishima, k., & ohashi, t. 2004, pasj, 56, 965
[10] george, m. r., fabian, a. c., sanders, j. s., young, a. j., & russell, h. r. 2009, mnras, 395, 657
[11] henry, j. p., evrard, a. e., hoekstra, h., babul, a., & mahdavi, a. 2009, apj, 691, 1307
[12] hoshino, a., et al. 2010, pasj, 62, 371
[13] hayakawa, a., hoshino, a., ishida, m., furusho, t., yamasaki, n. y., & ohashi, t. 2006, pasj, 58, 743
[14] iwamoto, k., brachwitz, f., nomoto, k., kishimoto, n., umeda, h., hix, w. r., & thielemann, f.-k. 1999, apjs, 125, 439
[15] kawaharada, m., et al.
2010, apj, 714, 423
[16] komatsu, e., et al. 2011, apjs, 192, 18
[17] komiyama, m., sato, k., nagino, r., ohashi, t., & matsushita, k. 2009, pasj, 61, 337
[18] koyama, k., et al. 2007, pasj, 59, 23
[19] makishima, k., et al. 2001, pasj, 53, 401
[20] matsushita, k., finoguenov, a., & böhringer, h. 2003, a&a, 401, 443
[21] matsushita, k., böhringer, h., takahashi, i., & ikebe, y. 2007b, a&a, 462, 953
[22] matsushita, k., et al. 2007a, pasj, 59, 327
[23] matteucci, f., & chiappini, c. 2005, pasa, 22, 49
[24] nomoto, k., tominaga, n., umeda, h., kobayashi, c., & maeda, k. 2006, nuclear physics a, 777, 424
[25] pratt, g. w., et al. 2010, a&a, 511, a85
[26] rasmussen, j., & ponman, t. j. 2007, mnras, 380, 1554
[27] rasmussen, j., & ponman, t. j. 2009, mnras, 399, 239
[28] reiprich, t. h., et al. 2009, a&a, 501, 899
[29] sato, k., et al. 2007, pasj, 59, 299
[30] sato, k., tokoi, k., matsushita, k., ishisaki, y., yamasaki, n. y., ishida, m., & ohashi, t. 2007, apj, 667, l41
[31] sato, k., matsushita, k., ishisaki, y., yamasaki, n. y., ishida, m., sasaki, s., & ohashi, t. 2008, pasj, 60, s333
[32] sato, k., matsushita, k., ishisaki, y., yamasaki, n. y., ishida, m., & ohashi, t. 2009a, pasj, 61, 353
[33] sato, k., matsushita, k., & gastaldello, f. 2009b, pasj, 61, 365
[34] sato, k., kelley, r. l., takei, y., tamura, t., yamasaki, n. y., ohashi, t., gupta, a., & galeazzi, m. 2010, pasj, 62, 1423
[35] tamura, t., kaastra, j. s., makishima, k., & takahashi, i. 2003, a&a, 399, 497
[36] tokoi, k., et al. 2008, pasj, 60, s317
[37] vikhlinin, a., markevitch, m., murray, s. s., et al. 2005, apj, 628, 655
[38] voit, g. m. 2005, reviews of modern physics, 77, 207
[39] zhang, y.-y., okabe, n., finoguenov, a., et al. 2010, apj, 711, 588

macro-instabilities of a suspension in an axially agitated mixing tank

m. jahoda, v. machoň, l. vlach, i. fořt

this paper deals with an experimental assessment of the occurrence of flow macro-instabilities in a mechanically stirred suspension and a visual observation of the origination and extinction of macro-vortices. the mean frequency of the occurrence of macro-instabilities under operational conditions was also observed. the experiments were carried out in a cylindrical vessel with an inner diameter of 0.19 m, and an axial stirrer with six pitched 45° blades (pbt) was used. the diameter of the stirrer was equal to half of the vessel diameter. the mean frequencies of the occurrence of macro-instabilities were determined at the stirrer frequency for just-suspended conditions in dependence on the solids concentration and the impeller clearance (height of the stirrer above the bottom of the vessel). two regions of origin of macro-instabilities and one region of extinction were determined visually. it was found that the mean frequency of occurrence of macro-instabilities increases with increasing stirrer frequency at a constant concentration of solids. at a constant stirrer speed, the frequency of occurrence of macro-instabilities decreases with increasing concentration of solids. at a higher concentration of solids, a sharp change toward lower mean frequencies of macro-instabilities was observed.

keywords: suspension, mixing, macro-instabilities.

1 introduction
a planar description of fluid flow caused by a rotational axial impeller in a stirred vessel equipped with radial baffles is usually interpreted by means of a one-loop circulation model. in the case of a pumping-down axial impeller, the flow is directed towards the bottom of the tank. on the bottom, the fluid diverges, changing its direction, and then travels upward along the vessel wall. at the top of the tank, the fluid is directed radially inward and then redirected down to the impeller, closing the circulation loop. however, the real liquid flow inside a tank constitutes a pseudo-stationary high-dimensional dynamical system. laser doppler velocimetry (ldv) and particle image velocimetry (piv) measurements of the velocity field in stirred vessels revealed the occurrence of a secondary circulation loop near the bottom of the vessel, kresta and wood [1] and myers et al [2]. it was observed further that in the upper third of the vessel the fluid circulation was low and the main circulation loop was unstable. a pseudo-periodic flow in the shape of a macro-vortex was detected in this region (see fig. 1). this phenomenon was also noticeable visually, and it was dependent on the stirrer diameter and the impeller off-bottom clearance. the scale of these macro-vortices was comparable with the scale of the vessel, and this phenomenon was called macro-instability (mi) of the velocity field.

fig. 1: flow patterns in a vessel stirred by an axial impeller. main circulation loop i, secondary circulation loop ii, macro-vortex iii.
knowledge of the mechanism of the origin and of the scale of mi is of great practical importance. macro-vortices improve the mixing in the upper part of the vessel, but they also act negatively on the vessel walls, baffles, impeller shaft, and other solid parts. in addition, the uneven vibration influences the fixation of the apparatus. in extreme cases, it could damage the equipment, kratěna et al [3]. in the turbulent region of mixing, the presence of macro-vortices in the stirred vessel appears as a swollen surface of the liquid, obviously different from the wavy surface resulting from turbulence eddies. brůha et al [4] looked at the frequency of mi occurrence according to the movement of the liquid surface using 3-, 4-, and 6-bladed pbts. their investigations covered two impeller diameters (d = t/3 and t/4) and three different off-bottom clearances (t = 0.3 m; c/t = 0.33, 0.4 and 0.5). based on visual observation, they defined the occurrence of a macro-vortex as a situation during which the liquid surface swells up by more than 5 mm (under their operational conditions) close to the baffle. they found out experimentally that the occurrence of the macro-vortex is dependent on the number of wall baffles, on the number of stirrer blades and on the stirrer frequency. an electronic probe measured the frequency of macro-vortex occurrence, and a linear dependence on the stirrer frequency was found. it is not sufficient to look at the liquid surface from the outside to understand the origination of macro-vortices, and vortices that do not reach the surface are not registered. the authors of the above-mentioned study used a mechanical measuring device, developed later, called a “tornadometer”, which enabled the frequency of mi inside the vessel to be measured (brůha et al [5, 6]). they confirmed their
previous finding that the mean mi frequency is linearly related to the rotational speed of the impeller. the impeller off-bottom clearance influenced the slope of this dependency. they also observed that the mi frequency is accompanied by changes in the angle of the impeller discharge flow and the appearance of an unstable secondary circulation loop. further, the dimensionless mi frequency was defined as the mean frequency of mi occurrence divided by the frequency of the stirrer speed, and the dependency of this quantity on the modified reynolds number, re, was found in three hydrodynamic regimes. in the laminar region of fluid flow in the vessel (re < 200), the primary circulation loop was steady and no mi was registered. in the transient region of the flow (200 < re < 5000), the dimensionless frequency of mi was found to increase logarithmically with increasing values of re. under fully turbulent conditions (re > 9000), the values of the mi frequency of occurrence were nearly constant, in the range from 0.043 to 0.048. the intensity of the turbulence increased with increasing re number, and it was not possible to distinguish between mi and turbulent eddies during highly turbulent flow (re > 67000). generally, the measuring method with a tornadometer is regarded by many authors as unreliable, because of the presence of the arm inside the vessel, which could influence the fluid flow. montes et al [7] used ldv measurements and spectral analysis to observe the mi frequency occurrence in a stirred tank equipped with a six-bladed 45° pbt impeller (t = 0.3 m, d = t/3, c/t = 0.35). the axial and radial components of the velocity were measured at eight locations situated evenly in a rectangular net in the stirred region. the measurements lasted 25 minutes and the time record of the velocity was analysed by the fast fourier transform (fft). the dominant frequency was determined from the frequency spectrum. the dominant frequencies were found at all measurement locations, but the obtained values of the power spectral density were not identical. it was concluded that mi are macroscopic phenomena with a dimension comparable with the dimension of the apparatus. the results confirmed the previous finding that the mean frequency of mi increases linearly with the stirrer frequency. hasal and fořt [8] applied a non-linear dynamic system analysis method to the data measured by montes et al [7]. they demonstrated the possibility of detecting and separating the low-dimensional component of the velocity field from the stochastic turbulent component. in the velocity field, the macro-instabilities presented a marked component of the kinetic energy. the presence of mi in the velocity record was more distinct in the laminar and transient flow regimes. the macro-instabilities were poorly manifested in the discharge flow of the impeller, where their relative magnitude takes on a minimal value. the authors deduced a possible production of mi from the interaction of the impeller discharge stream and the ascending circulation streams. roussinova and kresta [9] analysed the ldv time series of axial velocities upstream of the baffle using the lomb algorithm as an alternative method to the fft for the case of unevenly spaced data. they found a dominant mi frequency of 0.62 hz in a system stirred by a pitched blade 45° turbine (4 blades) with diameter d = t/2 (t = 0.24 m) and re = 48 × 10^4. the impeller off-bottom clearance (c/t = 0.33, 0.5, and 0.67) did not change the value of the dominant frequency. a minimal sketch of such a spectral search for the dominant frequency is given below.
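the sketch below is my own illustration of the dominant-frequency search used in [7] (evenly sampled data and a plain fft; the lomb method of [9] would replace the fft for unevenly spaced samples). the sampling rate and the synthetic signal are placeholders.

```python
import numpy as np

def dominant_frequency(velocity, fs):
    """return the frequency (hz) with the largest power, ignoring the dc term."""
    v = velocity - np.mean(velocity)            # remove the mean velocity
    power = np.abs(np.fft.rfft(v)) ** 2         # one-sided power spectrum
    freqs = np.fft.rfftfreq(len(v), d=1.0 / fs)
    return freqs[1 + np.argmax(power[1:])]      # skip the zero-frequency bin

# synthetic check: a 0.62 hz oscillation buried in noise, sampled at 50 hz
fs = 50.0
t = np.arange(0, 1500, 1 / fs)                  # 25 min of data, as in [7]
signal = np.sin(2 * np.pi * 0.62 * t) + 0.5 * np.random.randn(t.size)
print(dominant_frequency(signal, fs))           # ≈ 0.62
```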
the dimensionless mi frequency was constant at 0.18 for re > 10^4. no dominant frequency appeared for the small pbt (d = t/4) at any impeller off-bottom clearance. the previous investigations show that the incidence of macro-instability is very sensitive to the vessel geometry and to the procedure of mi frequency determination. our goal was to examine experimentally the occurrence of macro-instabilities in a mechanically stirred system in which solid particles were present and fully suspended.

2 experimental
the experiments were carried out in a stirred cylindrical baffled vessel with a flat bottom and an inner diameter t = 0.19 m. a sketch of the experimental tank is shown in fig. 2. the mixed charge was a suspension of solid particles in water at room temperature, and the vessel was filled to the height h = t. an outer square glass tank was filled with water in order to minimize the effects of vessel curvature on the visual observation. the axial impeller was a pitched blade turbine (pbt) of diameter d = t/2 with six blades inclined at 45° (pumping down), and its off-bottom clearances were c/t = 0.2, 0.3, and 0.4. the width of the blade, h, was one fifth of the impeller diameter d. the impeller speed, n, was varied in the range from 75 rpm to 200 rpm. the pvc solid particles were spherical in shape, 0.94 mm in diameter, with a density of 1260 kg m^−3. the solids concentration was changed from 2.5 % to 22.5 % by weight (from 2 % to 18.7 % by volume). the experiments were done in conditions where the solid particles were fully suspended or in a condition near to full suspension. the visual method, zwietering [10], was used for the assessment of the minimal impeller speed at which the solid particles were just suspended, njs.

fig. 2: experimental vessel.

to investigate the mi frequency occurrence, the upper part of the mixing vessel close to the baffle leading edge was observed (see fig. 2). the presence of red colored solid particles made it possible to observe the appearance of macro-vortices. a typical spiral shape characterized a fully developed macro-vortex. a video camera was used to reduce the spatial view for visual observation. the time between two successive appearances of macro-vortices was stored in the computer memory in accordance with a signal from an
this corresponds to the description made by brůha et al [4]. two regions of origin of macro-instabilities and one region of extinction were determined visually. the first area of origin was along the leading edge of the baffle. a stream moving up beside a baffle usually headed to the impeller area, but sometimes the flow was accelerated and reached the free surface or came near to the surface. macro-vortices developed from this jet stream (see fig. 3). similar macro-vortex creation was observed by montes et al [7]. the second area of origin was located between adjacent baffles at a horizontal distance 0.25 t from vessel wall to shaft (see fig. 3). the macro-vortices were formed on an interface between the ascending stream of suspension along the vessel wall and the descending stream that arrived from the baffle trailing edge. the stream interface was nearly constant in the vertical direction from the vessel bottom (from 0.45 t to 0.5 t) for impeller speed above njs and for all investigated off-bottom impeller clearances. the horizontal position of the interface was slightly pitched due to stream that arrived from the baffle trailing edge. the area of macro-vortex extinction was always in the upper quadrant of the stirred vessel near the baffle leading edge and the free surface, and it was not dependent on the macro-vortex origin. the frequency of macro-vortex occurrence was in the range from 0.5 to 2 s�1. the dependence of the mean value of the dimensionless frequency of macro-instability occurrence on impeller speed for constant solid concentration 10 % by weight is shown in fig. 4. the figure exposes some influence of the amount of solid particles in suspension on macro-vortex detection. the minimal impeller speed for full suspension of solid particles, determined by zwietering’s method, was found to be 135 rpm. this method is very dependent on the opinion of an observer. it seems that the real value of njs ranged from 135 to 150 rpm. for higher impeller speed, the solid particles were fully suspended and the dimensionless frequency values of mi are practically steady. this behaviour conforms with a one-phase system in which the dimensionless mi frequency scarcely changes in the region of developed turbulence. the dependence of the mean value of the dimensionless frequency of macro-instability occurrence on solid concentration for constant impeller speed is shown in fig. 5. the solid particles were in the fully suspended state. there is 5 acta polytechnica vol. 42 no. 3/2002 fig. 3: sketch of origin of macro-instabilities. 1a) the area of origin was along the leading edge of the baffle. 1b) area of origin between two adjacent baffles. 1c) fully developed macro-vortex. 2) ascending stream along the vessel wall. 3) descending stream arriving from the baffle trailing edge. 4) interface between ascending and descending streams. fig. 4: dependence of mean value of dimensionless mi frequency on impeller speed for c � 0.2 t, njs � 135 rpm, w � 10 %w/w, 1.4 � 104 < re < 2.8 � 104 a turning point between solids concentrations 7.5 %w/w and 10 %w/w. solid particles above a concentration of 7.5 %w/w greatly influenced the fluid flow inside the vessel. from visual observation, it was apparent that the flow patterns were more stable in the case of higher solids concentrations than at lower concentrations. 
figure 6 depicts the dependence of the mean value of the dimensionless frequency of macro-instability occurrence on solid concentration for minimal impeller speed when the solid particles were just suspended. the values of dimensionless mi frequency were very similar for the two off-bottom clearances, c � 0.2 t and 0.3 t, above all when the solids concentration was higher than 7.5 %w/w. in the case of impeller off-bottom clearance 0.4 t and w > 15 %w/w, the investigation of macro-vortex incidence could not be carried out due to a high degree of turbulent eddies. 4 conclusions the existence of macro-instability occurrence in the examined mixing system was proved by visual observation for a wide range of solid particles concentrations and three off-bottom impeller clearances. two regions of origin of macro-instabilities and one region of extinction were determined. the macro-vortices evolved either in an area along the leading edge of the baffle or at a location between the baffles at a horizontal distance 0.25 t from vessel wall to impeller shaft. the macro-vortices always disappeared in the upper quadrant of the stirred vessel near the baffle leading edge and free surface. the frequency of macro-vortex occurrence was found to be in the range from 0.5 to 2 s�1. the visual method is very simple but demanding on the attentiveness of the operator. the mean value of the macro-instability frequency is linearly related to the impeller rotational speed for equal concentrations of solids particles. the mean value of the dimensionless frequency of macro-instability occurrence is steady above the minimal impeller speed for just suspension conditions. at constant stirrer speed, the frequency of occurrence of macro-instabilities decreases with increasing concentration of solids. a turning point was observed between solids concentrations 7.5 %w/w and 10 %w/w. at higher concentration of solids, the mean frequencies of the macro-instabilities were identical. the results for impeller off-bottom clearances 0.2 t and 0.3 t are very similar. 5 list of symbols c impeller off-bottom clearance, m d impeller diameter, m fm mean value of frequency of occurrence of macro-instabilities, s�1 fmi mean value of dimensionless frequency of occurrence of macro-instabilities h height of suspension in the vessel, m h width of impeller blade, m n impeller speed, s�1 njs minimal impeller speed for just suspended condition, s�1 re reynolds number t vessel diameter, m w solid particles concentration by weight acknowledgement this research was financially supported by research projects of the ministry of education of the czech republic j19/98:223400007 and j04/98:2122 0008. references [1] kresta, s. m., wood, p. e.: the mean flow field produced by a 45° pitched blade turbine: changes in the circulation pattern due to off bottom clearance. canad. j. chem. eng., 1993, 71, p. 42–53. [2] myers, k. j., ward r. w., bakker a.: a digital particle image velocimetry invesigation of flow field instabilities of axial-flow impellers. j. fluids eng., 1997, 119, p. 623 – 632. [3] kratěna, j., fořt, i., brůha, o., růžička, m.: dynamic stress affecting the radial baffle in an industrial mixing vessel with a pitched blade impeller. acta polytechnica, 2000, vol. 40, no. 5/6, p. 22–31. [4] brůha, o., fořt, i., smolka, p.: a large scale unsteady phenomenon in a mixing vessel. acta polytechnica, 1993, vol. 33, no. 4, p. 27–34. 6 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 42 no. 3/2002 fig. 
fig. 5: dependence of the mean value of dimensionless mi frequency on solid concentration for c = 0.2 t and n = 150 rpm

fig. 6: dependence of the mean value of dimensionless mi frequency on solid concentration for njs

[5] brůha, o., fořt, i., smolka, p.: phenomenon of turbulent macroinstabilities in agitated systems. collect. czech. chem. commun., 1995, vol. 60, p. 85–96.
[6] brůha, o., fořt, i., smolka, p., jahoda, m.: experimental study of turbulent macroinstabilities in an agitated system with axial high-speed impeller and with radial baffles. collect. czech. chem. commun., 1996, vol. 61, p. 856–867.
[7] montes, j. l., boisson, h. c., fořt, i., jahoda, m.: velocity field macro-instabilities in an axially agitated mixing vessel. chem. eng. j., 1997, vol. 67, p. 139–145.
[8] hasal, p., fořt, i.: macro-instabilities of velocity field in stirred vessel: detection and analysis. chem. eng. sci., 2000, vol. 55, p. 391–401.
[9] roussinova, v., kresta, s. m.: analysis of macro-instabilities of the flow field in a stirred tank agitated with axial impellers. 10th european conference on mixing, delft, the netherlands, 2000, p. 361–368.
[10] zwietering, t. n.: suspending solid particles in liquids by mechanical mixers. aiche j., 1958, vol. 6, no. 3, p. 419–426.
[11] bittorf, k.-j., kresta, s. m.: active volume of mean circulation for stirred tanks agitated with axial impellers. chem. eng. sci., 2000, vol. 55, p. 1325–1335.

dr. ing. milan jahoda
phone: +420 2 2435 3223
e-mail: milan.jahoda@vscht.cz
doc. ing. václav machoň, csc.
ing. lubomír vlach
dept. of chemical engineering
prague inst. of chemical technology
faculty of chemical engineering
technická 3, 166 28 praha 6, czech republic
doc. ing. ivan fořt, drsc.
phone: +420 2 2435 2713
dept. of process engineering
czech technical university in prague
faculty of mechanical engineering
technická 4, 166 07 praha 6, czech republic

acta polytechnica doi:10.14311/ap.2015.55.0039 acta polytechnica 55(1):39–49, 2015 © czech technical university in prague, 2015 available online at http://ojs.cvut.cz/ojs/index.php/ap

a description of the microstructure and the micromechanical properties of spruce wood

zdeněk prošek a,∗, vlastimil králík a, jaroslav topič a, václav nežerka a, kateřina indrová a,b, pavel tesárek a,b
a faculty of civil engineering, czech technical university in prague, thákurova 7, 166 27 prague, czech republic
b institute of physics, academy of sciences of the czech republic, cukrovarnická 10, 162 00 prague, czech republic
∗ corresponding author: zdenek.prosek@fsv.cvut.cz

abstract. knowledge about the microstructure and the morphology of individual phases within wood tissues is essential for numerous applications in materials engineering and in the construction industry. the purpose of the work presented here is to monitor the distribution of elastic stiffness within the tissues of individual cells using state-of-the-art equipment and exploiting emerging methods, such as modulus mapping, to investigate the morphology of the individual phases. quasi-static nanoindentation was carried out on cell walls of spruce earlywood and latewood tracheids to obtain values for the indentation modulus, which is closely related to the young modulus of the material. dynamic modulus mapping, also known as nanodma, was utilized to obtain a map of the elastic moduli over the entire tracheid cross-section.
in particular, it was found that the indentation stiffness of the cell walls ranges between 10.5 gpa for earlywood tracheids and 12.5 gpa for latewood tracheids. the difference between the elastic stiffness of earlywood and latewood is attributed to the different chemical composition and the orientation of the fibrils. the data that has been acquired is indispensable for micromechanical modeling and for the design of engineered products with superior mechanical properties.

keywords: norway spruce, atomic force microscopy, optical microscopy, nanoindentation, nanodma.

1. introduction

about 76 % of the afforested area in the czech republic comprises coniferous trees, and more than one half of these are spruces. this is reflected in the construction industry, where spruce is the most widely-used timber. for this reason, we focused our study on spruce wood tissues from norway spruce, which is the only species of spruce native to most of europe and the most commonly encountered tree there. wood in general is a non-homogeneous and highly anisotropic material at the sub-micro-, micro- and macro-level. the microstructure and the micromechanical properties have a significant influence on the macroscopic properties of timber.

1.1. microstructure of wood cells

coniferous trees were present on earth before the first deciduous trees emerged, and their structure is therefore simple and very regular. only two kinds of wood cells can be found in them: tracheids and parenchymal cells. the parenchymal cells create medullary rays and are involved in the construction of resin channels. on the other hand, the longitudinally extending tracheids, comprising up to 95 % of the total wood volume, are the basic structural tissue. their cell structure is closed and oblong, with a rectangular to hexagonal cross-section. their length varies from 2 to 6 mm, and their width is usually in the range between 0.02 and 0.1 mm. two basic types of tracheids can be found in wood: earlywood tracheids and latewood tracheids [1]. earlywood tracheids are formed at the beginning of the growing season, and their purpose is to deliver water and dissolved nutrients up the tree. for this reason, earlywood tracheids are thin-walled, with a cell wall thickness of about 2 to 3 µm, and they have a relatively big internal cavity for better conduction [2]. latewood tracheids are formed in the second half of the growing season, and their main function is to reinforce the tree trunk against mechanical loading. latewood tracheids have a thick wall, about 7 µm in thickness, and the internal cavity is significantly smaller than in the case of earlywood cells. about 7 % of the wood volume consists of parenchymal cells in the shape of short prisms. they remain alive for a limited period, and are located in the sapwood region. their function is to deliver and store starch and nutrients for some period in the sapwood at the periphery of the tree trunk. in addition, these cells are responsible for the formation of medullary rays, consisting of resin channels and of parenchymal cells oriented perpendicular to the longitudinal axis of the trunk and the annual rings. after the parenchymal cells die, they can retain water or stay empty. the resin channels are formed by disintegration or spreading of the cell walls between neighboring parenchymal cells, so the long resin channels are surrounded by parenchymal cells.
figure 1. scheme of the wood cell tissues (reproduced from [2]).

when the tree suffers a surface wound, resin is transported to the damaged part to provide a seal. a comprehensive study of the structure of cell walls in the wood tissues by means of electron microscopy revealed that the walls are composed of multiple layers, each composed of millions of fibrils. the different properties of the individual layers are caused by the arrangement and the chemical properties of the fibrils [3]. in the direction from the perimeter to its center, a wood cell is composed of the middle lamella, then a primary wall, then a secondary wall composed of three layers, and a cavity called the lumen (see figure 1). the middle lamella is thin, ranging in thickness between 0.1 and 0.3 µm, and its chemical and physical properties are similar to the properties of the primary wall. the middle lamella is usually well connected with the primary wall, so it is sometimes referred to as a part of the composite lamella consisting of the middle lamella and the neighboring cell walls [2]. however, the structure of the composite lamella can easily be separated by chemical agents into individual anatomic elements [1]. the thickness of the primary wall does not differ from that of the middle lamella; it is about 0.05 to 0.2 µm in thickness. the arrangement of its fibrils is irregular, oriented at angles ranging from 0° to 90° with respect to the longitudinal cell axis. it is assumed that the fibrils connect the primary layer and the middle lamella, thus forming a firm connection [3]. before electron microscopy was available, bailey and kerr [4] used iodine staining and polarization microscopy to find that the secondary wall is composed of three layers, commonly referred to as outer (s1), middle (s2) and inner (s3) (figure 1). while investigating the cell cross-section, they found that the outer and inner layers are brighter than the middle layer, due to the different orientation of the fibrils in the individual layers. the outer layer of the secondary cell wall is composed of four lamellas, and it is usually 0.1 to 0.4 µm in thickness. in this layer, the fibrils are arranged in two perpendicular directions at an angle of about 60° with respect to the longitudinal axis of the cell [3]. the middle layer of the secondary wall forms the major part of the wood cells, and it is therefore the most important component from the mechanical point of view. it is composed of 30 to 150 lamellas, and the fibrils are arranged at an angle of 10° with respect to the longitudinal axis of the wood cell. the middle layer has a dominant influence on the properties of the wood cells and consequently of the entire tree trunk. the properties of the middle layer have an impact on the anisotropy, shrinkage, strength and ductility of the wood. the inner layer of the secondary wall is similar to the outer layer, except that the orthogonal fibrils are arranged at an angle of between 60° and 90° with respect to the longitudinal axis of the cell [3].

1.2. micromechanical properties of wood

knowledge of micromechanical properties is crucial for understanding the micromechanics of wood that are responsible for the macromechanical properties of timber. information about the microstructure and the micromechanical properties of individual tissues can be efficiently used for developing wood-based engineering products such as fiberboard and wood-plastic composites [5].
two techniques are commonly used in experimental investigations of mechanical properties at the micro-level: tensile tests of individual fibers, and nanoindentation. the famous english engineer a. a. griffith did pioneering work on determining the tensile strength of a single fiber [6]; his first tests on glass fiber were carried out in 1921. a significant simplification of these tests was achieved in 1938 with the invention of strain gauges. further research on single wood fibers 0.5 to 30 mm in length and ranging between 15 and 40 µm in diameter was conducted in 1959 by jayne [7]. it turned out that the most problematic issue is how to attach the fibers firmly to the clamps of the testing machine. after several unsuccessful trials using adhesives, the fibers were finally clamped using abrasive paper. since then, numerous methods have been developed for the purposes of testing single fibers in tension. in 1971, page [8] designed a frame to which the fibers are attached with the use of two glass plates that are connected to the frame. this invention achieved popularity, and the method has also been used for testing other materials, e.g. metals and paper [5]. nanoindentation was developed in 1992 by oliver and pharr [9], and within a short time it became the most popular method for investigating the micromechanical properties of various materials.

table 1. indentation modulus and micro-fibril angle (mfa) of the secondary cell wall layer s2 determined by nanoindentation on norway spruce [10–16]:

                 reference        indentation modulus [gpa]   mfa [°]
latewood         wimmer [10]      21 ± 3.34                   —
                 wimmer [11]      15.81 ± 1.61                3
                 gindl [14]       17                          0
                 gindl [14]       18                          0
                 gindl [12]       15.34                       5
                 gindl [15]       17.08                       5
                 konnerth [16]    20                          0
transition wood  wimmer [10]      21.27 ± 3                   —
                 gindl [13]       16.1                        —
                 gindl [14]       16                          20
                 gindl [12]       13.46                       7.5
                 gindl [15]       17.54                       7
                 konnerth [16]    17                          9
                 konnerth [16]    16.8                        11.5
earlywood        wimmer [10]      13.49 ± 5.75                —
                 gindl [14]       11.5                        35
                 gindl [14]       8                           50
                 konnerth [16]    12.5                        17.5

wimmer et al. [10] utilized nanoindentation in 1997 to investigate the mechanical properties of wood cells, and published information for the first time about the stiffness of spruce earlywood, latewood and transition wood. his results are summarized in table 1, and are compared with the results provided by other authors. wagner et al. [17] found a relationship between indentation depth and the results that they obtained, and established a depth of 200 to 250 nm as optimal for obtaining the most accurate results. indentation depth is not the only factor influencing the results, which are also affected by lignin content and by the angle of the fibrils in the cell wall. jäger et al. [18] investigated the influence of the fibril angle on nanoindentation results, and they came to the conclusion that the indentation modulus and hardness are proportional to the number of fibrils in the cell wall. attempts were made to measure the micromechanical properties of the outer secondary wall s1, but these failed because the results were influenced by the surrounding layers. for this reason, all successful nanoindentation measurements have been conducted on the middle secondary wall s2.

1.3. aim of the paper

the main purpose of the work presented here was to provide detailed information about the distribution of the elastic stiffness within the cross-section of earlywood and latewood cells.
the biggest obstacle appeared to be the preparation of the samples, especially due to the enormous moisture sensitivity of the wood tissue. the paper provides comprehensive information about the individual phases, which is needed for micromechanical modeling and homogenization. the necessary comprehensive information was not available from the literature, so the data that we obtained has had to be discussed very carefully and compared with the findings of other authors. modulus mapping is emerging as an advanced method for monitoring the field of elastic stiffness over a selected area. this method, based on dynamic measurements in a raster located at predetermined points on the material, has been successfully used for micromechanical analyses e.g. of bio-materials [19], and the results are in agreement with the data obtained by quasi-static indentation. however, no applications of this method for monitoring the distribution of elastic stiffness over a cross-section of wood tissues have been reported in the available literature.

2. experimental methods

2.1. optical and electron microscopy

microscopy images were acquired using a neophot 21 optical metallurgy microscope with up to 1000× magnification, and a philips xl-30 scanning electron microscope that uses a thin beam of electrons to scan the selected surface area. the scanning electron microscope requires a detector to track the reflected electrons and digitally reproduce and record the surface texture. various detectors based on signals provided by the reflected electrons have been used for our purposes. in particular, we used a detector of secondary electrons, a detector of reflected electrons, and a gaseous detector of secondary electrons [20].

2.2. atomic force microscopy

atomic force microscopy (afm) utilizes a scanning probe that moves close to the investigated surface in order to capture the variations of atomic forces on a predetermined raster, measured by means of piezoelectric transducers. the method has become extensively used, e.g. for investigating thin layers [21]. the probe tip size is usually at the scale of µm, with a radius of about 10 nm [22]. afm microscopy was performed on the dme 2329 ds 95-200 dual scope™ probe scanner. within this study, we used the non-contact mode of the afm, which is based on the observation that van der waals forces (of the magnitude of 10−12 n) decrease the resonance frequency of the cantilever. these forces are strongest from 1 to 10 nm above the surface of the sample. in oscillation mode, we detect the changes in the amplitude of the cantilever oscillation caused by changes in the distance between the probe and the sample. the topographic image is thus constructed without any contamination or damage to the sample. the size of the scanned raster was equal to 65 × 65 microns, and the scanning time was limited to 2 minutes to avoid excessive heating of the device. the vertical resolution (measurement range) was set to 0.5 nm.

figure 2. scheme of the indentation with an indication of the depth at full loading and after unloading (reproduced from [9]).

2.3. static nanoindentation

the static indentation method is widely used nowadays, and is a very popular experimental technique for measuring the elastic stiffness and hardness of various materials, usually metals, glass, ceramics and thin coatings. this type of measurement can be performed on a very small volume, so it is suitable for investigating composite materials at the microscale.
the principle is based on imprinting a micrometer-sized diamond tip into the investigated material and recording the loading force and the indentation depth. in most of the measurements, the indentation depth is of the order of hundreds of nanometers. various nanoindentation tips are available, e.g. spherical or berkovich tips, and several techniques are used to ensure that the analysis also provides information about the plastic hardening, the yield stress or the viscosity of the material [23]. standard processing of the measured data is based on the assumption of a perfectly homogeneous isotropic material in the volume affected by the indentation, and the elastic and non-elastic material parameters are usually derived from the nanoindentation data using an analytic solution. when there are distinct phases, however, the indentation results can provide information about individual inclusions and the interfaces between them. the pioneering work of hertz (1882) [24] dealt with the imprint of an elastic tip in a homogeneous medium. sneddon (1965) [25] derived an analytical relationship between the loading force, the depth of an imprint and the contact area for individual indentation tips. a typical output from nanoindentation measurements consists of two material parameters: the hardness and the elastic stiffness. the hardness parameter (h) is defined as the mean contact pressure at the maximum loading force $P_{\max}$:
$$H = \frac{P_{\max}}{A}, \qquad (1)$$
where $A$ is the contact area of the indentation tip. during the loading process, the indented material deforms both in the elastic range and in the plastic range. the plastic response on the load-displacement curve is eliminated during unloading, allowing the user to determine the local material stiffness, known as the reduced modulus $E_r$. the value for the reduced modulus can therefore be obtained from the unloading part of the recorded force-displacement diagram from the tangent $(dP/dh)|_{P_{\max}}$, normalized by the contact area:
$$E_r = \frac{\sqrt{\pi}}{2\sqrt{A}}\,\frac{dP}{dh}. \qquad (2)$$
knowing the stiffness of the indenter tip $E_i$, the true indentation modulus of the measured material can be expressed, using the hertz solution for a contact of two compliant bodies, as
$$\frac{1}{E_r} = \frac{1-\nu^2}{E} + \frac{1-\nu_i^2}{E_i}, \qquad (3)$$
where $\nu$ and $\nu_i$ are the values of the poisson ratio representing the tested material and the indenter, respectively. the solution of oliver and pharr [9], which is most commonly used for an analysis of experimentally obtained nanoindentation data, suggests a formula for the contact depth in the following form:
$$h_c = h_{\max} - \varepsilon\,\frac{P_{\max}}{dP/dh}, \qquad (4)$$
with the constant $\varepsilon$ dependent on the indentation tip geometry, e.g. $\varepsilon = 0.726$ in the case of a berkovich tip and $\varepsilon = 0.750$ if a spherical tip is used. it should be noted that if a non-symmetric tip is used, equation (2) must be amended by a geometric correction parameter $\beta$, so that
$$E_r = \frac{\sqrt{\pi}}{2\beta\sqrt{A}}\,\frac{dP}{dh}. \qquad (5)$$
for a berkovich tip, the $\beta$ parameter is established as 1.034.
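a minimal numerical sketch of the quasi-static evaluation in equations (1)–(5) follows. the ideal berkovich area function a(hc) ≈ 24.5 hc², the diamond tip constants and all input numbers are illustrative assumptions, not values stated in the paper.

```python
import math

# sketch of the oliver-pharr evaluation, eqs. (1)-(5). the ideal berkovich
# area function A = 24.5*hc^2, the diamond constants (E_i = 1141 GPa,
# nu_i = 0.07), nu = 0.3 for wood and the input numbers are all assumptions.

EPS = 0.726     # epsilon for a berkovich tip, as quoted in the text
BETA = 1.034    # geometric correction for a berkovich tip, eq. (5)

def oliver_pharr(p_max, h_max, dp_dh, e_i=1141e9, nu_i=0.07, nu=0.3):
    """return (hardness H, reduced modulus Er, true modulus E) in Pa."""
    h_c = h_max - EPS * p_max / dp_dh    # eq. (4): contact depth
    area = 24.5 * h_c**2                 # assumed berkovich area function
    hardness = p_max / area              # eq. (1)
    e_r = math.sqrt(math.pi) * dp_dh / (2.0 * BETA * math.sqrt(area))  # eq. (5)
    e = (1.0 - nu**2) / (1.0 / e_r - (1.0 - nu_i**2) / e_i)            # eq. (3)
    return hardness, e_r, e

# placeholder indent: 400 uN peak load, 250 nm depth, 17 kN/m unloading slope
H, Er, E = oliver_pharr(p_max=400e-6, h_max=250e-9, dp_dh=1.7e4)
print(f"H = {H/1e6:.0f} MPa, Er = {Er/1e9:.1f} GPa, E = {E/1e9:.1f} GPa")
```

with these placeholder inputs the moduli land near the ~12 gpa scale reported in section 4, which is a consistency check on the formulas rather than a reproduction of the measurements.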
figure 4. microstructural hierarchy of wood tissues.

figure 3. a dynamic model of the indenter and the tested specimen (reproduced from [26]).

2.4. dynamic nanoindentation

the dynamic method (nanodma) assumes a dynamic model of the indenter and the test sample with a single degree of freedom. the model parameters include the indenter mass $m$, the sample stiffness $K_s$ and damping coefficient $C_s$, as well as the indenter stiffness and damping coefficient $K_i$ and $C_i$, respectively (see figure 3). in addition, the damping coefficient representing the air gap in the displacement sensor, $C_i$, the contact stiffness $K_s$, and the stiffness of the leaf spring holding the indenter shaft, $K_i$, should also be taken into account in the model. the frame stiffness is usually large enough to be considered infinite. when using the dynamic nanodma method, a dynamically applied harmonic force $P(t) = P_0 \sin\omega t$ with amplitude $P_0$ and frequency $f = \omega/2\pi$ is superimposed on a quasi-static load $P_{\max}$. the equation of motion for the indenter tip can be expressed as
$$m\ddot{h} + C\dot{h} + Kh = P_0 \sin\omega t, \qquad (6)$$
with the combined stiffness $K = K_s + K_i$ and damping $C = C_i + C_s$. the solution to the above equation is a steady-state displacement oscillation at the same frequency as the excitation:
$$h = h_0 \sin(\omega t - \phi), \qquad (7)$$
where $h_0$ is the deformation amplitude and $\phi$ represents the deformation phase shift with respect to the excitation force. the amplitude and the phase shift can be used to calculate the contact stiffness, using the dynamic model described in figure 3, assuming the following formulas for the deformation amplitude and the phase shift [26]:
$$h_0 = \frac{P_0}{\sqrt{(K_s + K_i - m\omega^2)^2 + ((C_i + C_s)\omega)^2}}, \qquad (8)$$
$$\phi = \tan^{-1}\frac{(C_i + C_s)\omega}{K_s + K_i - m\omega^2}. \qquad (9)$$
knowing the stiffness and the damping of the sample, the viscoelastic properties can be determined using the values of the reduced storage modulus $E'_r$ and the loss modulus $E''_r$, and the ratio between the loss and storage moduli, $\tan\delta = E''_r/E'_r$, according to the following formulas:
$$E'_r = \frac{K_s\sqrt{\pi}}{2\sqrt{A}}, \qquad E''_r = \frac{\omega C_s\sqrt{\pi}}{2\sqrt{A}}, \qquad (10)$$
$$\tan\delta = \frac{E''_r}{E'_r} = \frac{\omega C_s}{K_s}, \qquad (11)$$
where $A$ is the contact area obtained from the quasi-static calibration for the particular contact depth. the storage and loss moduli of the sample can be determined using the same relationship as in equation (5), which is used when interpreting the data from quasi-static indentation. the storage modulus depends on the elastic recovery of the sample, which is expressed in terms of the recovered energy after one loading cycle. the loss modulus corresponds to the damping of the material, and depends on the time delay between reaching the maximum force and the maximum deformation. the ratio of the loss and storage moduli ($\tan\delta$) is then related to the material viscosity, independent of the contact area. the instrumentation used in our study could combine nanodma with in-situ imaging capabilities, meaning that nanodma can be extended to a larger area. during the monitoring process, the system continuously records the storage and loss moduli of the sample as a function of the position on the measured surface. this raster measurement provides information about the stiffness variation and the morphology of the sample surface. this type of approach, which extends the nanodma method, is referred to as modulus mapping [27].

figure 5. afm images of an earlywood cross-section (left) and a latewood cross-section (right), 65 × 65 µm.
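to illustrate equations (8)–(11), the sketch below inverts a measured amplitude and phase shift into the sample stiffness $K_s$ and damping $C_s$, and then evaluates the storage and loss moduli. the machine constants ($m$, $K_i$, $C_i$) and the "measured" values are made-up placeholders, not the calibration of the instrument used here.

```python
import math

# sketch inverting eqs. (8)-(9) for the sample stiffness K_s and damping C_s,
# then evaluating eqs. (10)-(11). the machine constants (m, k_i, c_i) and the
# measured amplitude/phase below are made-up placeholders.

def nanodma_moduli(p0, omega, h0, phi, area, m=1.0e-4, k_i=100.0, c_i=0.02):
    f = p0 / h0                                    # |force| / |displacement|
    k_s = f * math.cos(phi) + m * omega**2 - k_i   # (p0/h0)cos(phi) = k_s+k_i-m w^2
    c_s = f * math.sin(phi) / omega - c_i          # (p0/h0)sin(phi) = (c_i+c_s) w
    e_store = k_s * math.sqrt(math.pi) / (2.0 * math.sqrt(area))      # eq. (10)
    e_loss = omega * c_s * math.sqrt(math.pi) / (2.0 * math.sqrt(area))
    return e_store, e_loss, e_loss / e_store                          # eq. (11)

omega = 2.0 * math.pi * 150.0   # 150 hz excitation, as used in section 4.2.2
es, el, tan_d = nanodma_moduli(p0=5e-6, omega=omega, h0=2e-9, phi=0.05,
                               area=1.0e-13)
print(f"E' = {es/1e9:.1f} GPa, E'' = {el/1e9:.2f} GPa, tan(delta) = {tan_d:.3f}")
```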
3. preparing the samples

3.1. tested samples

the spruce wood samples used for the investigation and for testing were extracted from a glue laminated timber beam (glt) composed of 30×200 mm lamellae of a length equal to 1.5 m. this wood must be dried to a moisture content ranging between 8 % and 15 % in order to prevent swelling and consequent cracking or failure of the glue due to excessive water content. for this purpose, the timber was artificially dried using hot air drying kilns with controlled air humidity. after reaching the required moisture, the timber lamellae were gradually cooled down and the timber beams were stored in heated warehouses for several days.

3.2. sample preparation

the wood samples, with a cross-section of 10×10 mm, were cut in the required directions with respect to the grain, and were sealed with struers epofix kit epoxy resin. after the resin had hardened, the samples were sliced, ground, and polished with silica papers to achieve the best possible quality of the sample surface, going from a grit of 800 grains/cm² to a grit of 4000 grains/cm². the whole grinding process was carried out under water, using an electronic device. polishing was performed using an emulsion containing 0.25 µm nanodiamonds for 15 minutes. after each step, the sample was cleaned in an ultrasonic bath of distilled water.

4. results and discussion

4.1. microstructure

the microstructure of the samples was investigated by means of optical microscopy, scanning electron microscopy and afm. the earlywood portion of an annual ring was 0.79 mm in thickness, and the latewood cells were arranged in a region 1.14 mm in thickness. the visible microcracks within the samples had formed due to the artificial drying applied in the initial stage of glt processing. the first method was accomplished using a neophot 21 microscope (figure 4) to acquire basic information about the microstructure. afm was used for the analysis in order to obtain a detailed description of individual phases within the cells of the spruce samples (figure 5). optical microscopy revealed a honeycomb-like structure of hexagonally-shaped tracheids with an average size of 30 µm (earlywood) or 41 µm (latewood) in the tangential direction (t) with respect to the annual rings, and an average size of 27 µm (earlywood) or 32 µm (latewood) in the radial direction (r). the wall of the earlywood tracheids was significantly thinner, as demonstrated in figure 5, with a cell wall thickness of about 2 µm (t) and 3 µm (r), and a lumen of 18 µm (r) and 17 µm (t) to provide nutrient transport. the latewood cells, of significantly higher density and with thicker walls, 12 µm (r) and 7 µm (t), contained a smaller lumen with an average size of 19 µm (r) and 13 µm (t), in order to perform as a reinforcement against the mechanical loading.

figure 6. matrix of indents in an earlywood cell wall, 18.55 × 18.55 µm (left) and in a latewood cell wall, 50 × 50 µm (right).

figure 7. load-displacement diagram obtained by nanoindentation on the cell wall of earlywood.

figure 8. load-displacement diagram obtained by nanoindentation on the cell wall of latewood.

4.2. micromechanical properties

4.2.1. static indentation

both the earlywood and the latewood cells were indented in various cell regions, using a berkovich tip, in order to obtain a map of the micromechanical properties. the standard indentation loading function was pursued, consisting of a constant loading stage (5 seconds), a holding period (8 seconds) and an unloading stage (5 seconds). the maximum loading capacity of the indenter was 400 µn. figure 6 shows a 4 × 5 indentation matrix captured during in-situ monitoring, using the hysitron tribolab® scanning device. in order to prevent any interaction between individual indents, their minimum spacing was set to 3 µm.
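the loading protocol just described can be written down directly; a minimal sketch follows (the time resolution and the grid origin are arbitrary choices of ours):

```python
import numpy as np

# sketch of the loading protocol of section 4.2.1: 5 s linear loading to the
# 400 uN peak, an 8 s hold, 5 s unloading, and a 4 x 5 indent grid with 3 um
# spacing. the time resolution and the grid origin are arbitrary choices.

P_MAX = 400e-6                              # peak load [N]
T_LOAD, T_HOLD, T_UNLOAD = 5.0, 8.0, 5.0    # stage durations [s]

def load_function(t):
    """trapezoidal load history P(t) [N] for a single indent."""
    if t < T_LOAD:
        return P_MAX * t / T_LOAD            # constant-rate loading
    if t < T_LOAD + T_HOLD:
        return P_MAX                         # holding period
    if t < T_LOAD + T_HOLD + T_UNLOAD:       # constant-rate unloading
        return P_MAX * (T_LOAD + T_HOLD + T_UNLOAD - t) / T_UNLOAD
    return 0.0

# 4 x 5 indent matrix with the 3 um minimum spacing (positions in meters)
grid = [(3e-6 * i, 3e-6 * j) for i in range(4) for j in range(5)]

t = np.linspace(0.0, 18.0, 181)
p = np.array([load_function(ti) for ti in t])
print(f"{len(grid)} indents, peak load {p.max() * 1e6:.0f} uN")
```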
measured indentation hardness, with an indication of the standard deviations. figure 11. topography, gradient, amplitude and phase shift, determined by dma on a 16 × 16 µm earlywood tracheid area. figure 12. topography, gradient, amplitude and phase shift determined by dma on a 16 × 16 µm latewood tracheid area. the chosen nanoindentation load-displacement diagrams, reaching minimum and maximum indentation depth, are plotted in figures 7 and 8. the average contact depth was equal to 270 nm for earlywood cells and 222 nm for latewood cells. this depth is satisfactory for the surface roughness (figure 5) and small enough to prevent the influence of the limited wall thickness (figure 6). the mean deviation of the surface roughness of the tested samples was lower than 20 nm. the oliver and pharr [9] method was used to evaluate the elastic parameters from the experimentally obtained data. the mean values for indentation modulus and hardness with an indication of the standard deviation are presented in figures 9 and 10. these results are in agreement with the data published by other authors [10–16]. 4.2.2. micromechanacal properties – nanodma indentation for the modulus mapping method, the dynamic force was harmonically superimposed (with amplitude 5 µn and frequency 150 hz) on the nominal quasi-static contact force 12 µn. the measurements were performed over an area of 16 × 16 µm. the amplitude and the phase shift recorded in the raster obtained from the dma measurements are presented in figures 11 and 12, which also document the morphology of the sample. since there was negligible viscosity, as indicated by the relatively small magnitude of the measured loss moduli, the values for the storage moduli can be considered as the reduced elastic stiffness modulus, obtained by means of a quasi-brittle nanoindentation test. figures 13 and 14 present the storage modulus 46 vol. 55 no. 1/2015 a description of the properties of spruce wood figure 13. earlywood storage modulus map, 16 × 16 µm (left) and a line-plot projection to a horizontal section: sl – secondary wall, cml – composite middle lamella (right). figure 14. latewood storage modulus map, 16 × 16 µm (left) and a line-plot projection to a horizontal section: sl – secondary wall, cml – composite middle lamella (right). static indentation dma indentation indentation modulus [gpa] st. d. indentation modulus [gpa] st. d. earlywood 10.2 0.9 10.4 1.4 latewood 12.9 1.3 12.09 2.06 table 2. indentation modulus of tracheid walls; st. d. represents standard deviation of the measured data. mapping, representing the earlywood and latewood samples. it can be observed that the cell walls exhibit greater elastic stiffness than the lumen area and the composite middle lamella (the middle lamella and the primary wall). however, the cell walls contain significant local deviations. the mean indentation modulus of the secondary cell wall tissue (denoted sl in figures 13 and 14) was established as 10.4±1.4 gpa for the earlywood tracheids, and 12.09 ± 2.06 gpa for the latewood tracheids. irrespective of the tissue, the composite middle lamella (denoted by cml in figures 13 and 14) exhibited stiffness equal to 9.45 ± 0.19 gpa. 4.3. discussion all the results obtained by means of static indentation and the modulus mapping method are summarized in table 2. the values seem reasonable for the effective properties of spruce wood, as published in the literature [10–16]. static indentation could not be performed on the, middle lamella because of its limited size. 
the difference between the elastic stiffness of the earlywood and latewood cell walls is attributed to the different chemical composition and the angle of the fibrils. the earlywood cells contain more lignin and less cellulose, resulting in reduced elastic stiffness. the same effect can also be observed for the middle lamella, which contains twice as much lignin as the latewood cell walls, and less than one third of the amount of cellulose. moreover, the fibrils are not systematically arranged, so the stiffness of the middle lamella is 2.64 gpa lower than the stiffness of the latewood cell walls. the work of gindl et al. [14], which focused on the micromechanical properties of individual phases in the wood tissues, suggests that lignin, which bonds to hemicellulose, does not have any significant impact on the hardness parameter, and influences only the elastic stiffness value. the minor discrepancy between our quasi-static nanoindentation results for the latewood cells and the results of other authors may be due to the angle of the fibrils deposited in the cell walls. the influence of the orientation angle of the fibrils on the elastic stiffness was investigated by jäger et al. [18], who found a clear correlation between these quantities. according to their equation describing the relationship between the orientation of the fibrils and the indentation modulus, the fibrils of the earlywood cell walls in our samples were oriented at an angle of 39.5° and, in the case of latewood, at an angle of 23.5°. these values are larger than we had expected on the basis of a literature study, which had indicated angles of about 10°. the large values may be due to defects in the vicinity of the extracted samples, or due to the way the samples were prepared. according to gindl et al. [15], the compressed wood defect increases the angle of the fibrils up to 50°, while wagner et al. [17] found that polishing the samples can also increase the angle.

5. conclusion

the results of our study provide detailed information about the morphology and the elastic stiffness of individual phases in wood from norway spruce. the quasi-static nanoindentation and the dynamic modulus mapping method have provided valuable data about the distribution of stiffness within individual earlywood and latewood cells. the data contributes to our basic understanding of the wood microstructure, and can also be efficiently used for analytical or numerical homogenization and micromechanical modeling. on the basis of our findings, it follows that:
• microcracking cannot be avoided if the wood is dried using elevated temperatures exceeding 100 °c,
• the use of quasi-static nanoindentation and nanodma can provide detailed information about the distribution of the elastic stiffness within a single spruce tracheid,
• the indentation modulus of tracheid cell walls is equal to 10.3 gpa for earlywood and 12.5 gpa for latewood,
• the difference between the mechanical properties of earlywood and latewood is caused by the different chemical composition, by the lignin content at the expense of cellulose, and by the different orientation of the fibrils,
• the influence of the fibrils and the chemical composition is clearly demonstrated in the case of the middle lamella, which has an indentation modulus about 2.64 gpa lower than that for latewood cell walls,
• earlywood fibrils are inclined by about 39.5°, while latewood fibrils are oriented at an angle of 23.5°.
future research will be focused on an investigation of the angle of the fibrils, using optical microscopy and x-ray diffraction. the data will be correlated with the elastic stiffness of the investigated phases, and the knowledge that is acquired will be used for upscaling the elastic stiffness using micromechanical modeling [28]. the model will be validated with the use of macroscopic testing by means of the impulse excitation method, which has been successfully exploited by kuklík et al. [29] for investigating timber elements.

acknowledgements

financial support from the faculty of civil engineering, czech technical university in prague (sgs projects sgs14/122/ohk1/2t/11 and sgs14/121/ohk1/2t/11), is gratefully acknowledged. the authors also thank the center for nanotechnology in civil engineering at the faculty of civil engineering, ctu in prague, and the joint laboratory of polymer nanofiber technologies of the institute of physics, academy of science of the czech republic, and the faculty of civil engineering, ctu in prague.

references
[1] gandelová, l., horáček, p. and šlezingerová, j.: nauka o dřevě. brno, mendelova zemědělská a lesnická univerzita v brně, 2002, p. 1–176.
[2] balabán, k.: anatomie dřeva. praha, szn, 1955, p. 1–216.
[3] kettunen, p. o.: wood structure and properties. tampere, trans tech publications, 2006, p. 1–401.
[4] bailey, i. w. and kerr, t.: the visible structure of the secondary wall and its significance in physical and chemical investigations of tracheary cells and fibers. protoplasma, vol. 26, 1935, p. 273–300. doi:10.1007/bf01628654
[5] eder, m. et al.: experimental micromechanical characterisation of wood cell walls. wood science and technology, vol. 47, 2013, p. 163–182. doi:10.1007/s00226-012-0515-6
[6] griffith, a. a.: the phenomena of rupture and flow in solids. philos trans r soc lond, vol. 221, 1921, p. 163–198.
[7] jayne, b. a.: mechanical properties of wood fibers. tappi, vol. 42, 1959, p. 461–467.
[8] page, d. h., el-hosseiny, f. and winkler, k.: behaviour of single wood fibres under axial tensile strain. nature, vol. 229, 1971, p. 252–253. doi:10.1038/229252a0
[9] oliver, w. c. and pharr, g. m.: an improved technique for determining hardness and elastic-modulus using load and displacement sensing indentation experiments. j mater res, vol. 7, 1992, p. 1564–1583. doi:10.1557/jmr.1992.1564
[10] wimmer, r. et al.: longitudinal hardness and elastic stiffness of spruce tracheid secondary walls using nanoindentation technique. wood science and technology, vol. 31, 1997, p. 131–141. doi:10.1007/bf00705928
[11] wimmer, r. and lucas, b. n.: comparing mechanical properties of secondary wall and cell corner middle lamella in spruce wood. iawa, vol. 18, 1997, p. 77–88.
[12] gindl, w., gupta, h. s. and grunwald, c.: lignification of spruce tracheid secondary cell walls related to longitudinal hardness and modulus of elasticity using nano-indentation. canadian journal of botany, vol. 80, 2002, p. 1029–1033. doi:10.1139/b02-091
[13] gindl, w. and gupta, h. s.: cell-wall hardness and elastic stiffness of melamine-modified spruce wood by nano-indentation. composites, vol. 33, 2002, p. 1141–1145. doi:10.1016/s1359-835x(02)00080-5
[14] gindl, w. et al.: mechanical properties of spruce wood cell walls by nanoindentation. appl phys, vol. 79, 2004, p. 2069–2073. doi:10.1007/s00339-004-2864-y
[15] gindl, w. and schöberl, t.: the significance of the elastic modulus of wood cell walls obtained from nanoindentation measurements. composites, vol. 35, 2004, p. 1345–1349. doi:10.1016/j.compositesa.2004.04.002
[16] konnerth, j. et al.: actual versus apparent within cell wall variability of nanoindentation results from wood cell walls related to cellulose microfibril angle. j mater sci, vol. 44, 2009, p. 4399–4406. doi:10.1007/s10853-009-3665-7
[17] wagner, l., bader, t. k. and borst, k.: nanoindentation of wood cell walls: effects of sample preparation and indentation protocol. j mater sci, vol. 49, 2009, p. 94–102. doi:10.1007/s10853-013-7680-3
[18] jäger, a. et al.: the relation between indentation modulus, microfibril angle, and elastic properties of wood cell walls. composites, vol. 42, 2011, p. 677–685. doi:10.1016/j.compositesa.2011.02.007
[19] jiroušek, o. et al.: use of modulus mapping technique to investigate cross-sectional material properties of extracted single human trabeculae. chemické listy, vol. 106, 2011, p. 442–445.
[20] ekertová, l. and frank, l.: metody analýzy povrchů – elektronová mikroskopie a difrakce. academia, 2003.
[21] siegel, j. and kotál, v.: preparation of thin metal layers on polymers. acta polytechnica, vol. 47, 2007, p. 9–11.
[22] binnig, g., quate, c. f. and gerber, ch.: atomic force microscope. phys. rev. lett., vol. 56, 1986, p. 930–935. doi:10.1103/physrevlett.56.930
[23] fischer-cripps, a. c.: nanoindentation. new york, springer verlag, 2002, p. 1–163.
[24] hertz, h.: verhandlungen des vereins zur beförderung des gewerbefleisses. macmillan & co, london, vol. 64, 1882.
[25] sneddon, i. n.: the relation between load and penetration in the axisymmetric boussinesq problem for a punch of arbitrary profile. international journal of engineering science, vol. 3, 1965, p. 47–57. doi:10.1016/0020-7225(65)90019-4
[26] syed asif, s. a., wahl, k. j. and colton, r. j.: nanoindentation and contact stiffness measurement using force modulation with a capacitive load-displacement transducer. review of scientific instruments, vol. 70, 1999, p. 2408–2414. doi:10.1063/1.1149769
[27] syed asif, s. a. et al.: quantitative imaging of nanoscale mechanical properties using hybrid nanoindentation and force modulation. journal of applied physics, vol. 90, 2001, p. 1192–1200. doi:10.1063/1.1380218
[28] mishnaevsky, l. and qing, h.: micromechanical modelling of mechanical behaviour and strength of wood: state-of-the-art review. computational materials science, vol. 44, 2008, p. 363–370. doi:10.1016/j.commatsci.2008.03.043
[29] kuklík, p. and kuklíková, a.: nondestructive strength grading of structural timber. acta polytechnica, vol. 40, 2000.
acta polytechnica doi:10.14311/ap.2013.53.0736 acta polytechnica 53(supplement):736–741, 2013 © czech technical university in prague, 2013 available online at http://ojs.cvut.cz/ojs/index.php/ap

search for gravitational waves from supernovae and long grbs

maurice h.p.m. van putten∗
korea institute for advanced study, hoegiro, dongdaemun-gu, seoul 130-722, korea, and department of astronomy, sejong university, 98 gunja-dong gwangin-gu, seoul 143-747, korea
∗ corresponding author: mvputten@kias.re.kr

abstract. we report on evidence for black hole spindown in the light curves of the batse catalogue of 1491 long grbs by application of matched filtering. this observation points to a strong interaction of the black hole with surrounding high density matter at the isco, inducing non-axisymmetric instabilities sustained by cooling in gravitational wave emission. opportunities for ligo-virgo and the recently funded kagra experiments are highlighted, for long grbs with and without supernovae and for hyper-energetic core-collapse supernovae within a distance of about 35 mpc in the local universe.

keywords: core-collapse supernovae, long grbs, frame dragging, kerr black holes, gravitational radiation, batse, bepposax, matched filtering, ligo, virgo, kagra.

1. introduction

gamma-ray bursts (grbs) and core-collapse supernovae (cc-sne) are by far the most energetic and enigmatic transient events associated with neutron stars and stellar-mass black holes. a key objective is to identify the nature of their explosion mechanism. extremely powerful events are unlikely to be powered by their presumably prodigious output in mev neutrinos. they may, instead, be powered by rotation (e.g. [4]). grb 030329/sn 2003dh and grb 031203/sn 2003lw are hyper-energetic events, requiring an energy reservoir that exceeds the maximal rotational energy of a proto-neutron star (pns) by an order of magnitude [27]. these anomalous supernovae – of “type i x” – practically rule out a pns as a universal inner engine to all core-collapse supernovae. in light of these hyper-energetic events and the diversity of grbs in duration and association (with and without supernovae, e.g., listed in [27]), we here turn to inner engines hosting a rotating black hole surrounded by high-density matter. these systems appear naturally in mergers of neutron stars with another neutron star or companion black hole, as well as in core-collapse of relatively massive stars.
1.1. frame dragging induced outflows

according to the kerr metric, the angular momentum of rotating black holes induces frame dragging. frame dragging has recently been measured by the lageos ii [6] and gravity probe b [10] satellite experiments around the earth, manifest in an angular velocity $\omega \sim J/r^3$ at an orbital radius $r$, given the earth's angular momentum $J$. by this scaling property, these experiments equivalently measured $\omega$ around a maximally spinning kerr black hole at a distance of about 5.3 million schwarzschild radii. as part of the gravitational field, the frame dragging angular velocity $\omega$ is a clean and universal agent inducing non-thermal radiation processes in spacetime around rotating black holes. relativistic frame dragging is encountered as $\omega$ approaches the angular velocity $\Omega_H$ of the black hole itself. frame dragging around rapidly rotating black holes enables the transfer of a major fraction of the black hole's spin energy to surrounding matter via an inner torus magnetosphere, simultaneously with the transfer of a minor fraction to infinity in the form of ultra-relativistic outflows along the spin axis. the first can be seen by an equivalence in poloidal cross-section of the inner and outer faces of a torus in suspended accretion to the magnetospheres of neutron stars [19, 23]. the second is described by a potential energy $E = \omega J_p$ for test particles with angular momentum $J_p$. here, $E$ assumes energies on the order of ultra high energy cosmic rays (uhecrs) for particles in superstrong magnetic fields, typical for the catastrophic events under consideration [25]. frame dragging is hereby a causal mechanism for the onset of a two-component outflow, comprising a baryon-rich wind from an inner disk or torus and a baryon-poor jet along the spin axis, with the former collimating the latter. a two-component outflow with different baryon loading is advantageous for grb-supernovae, whose grb emissions are generally attributed to dissipation of ultra-relativistic baryon-poor outflows, and whose supernovae can be attributed to irradiation from within [23], such as by impulsive momentum transfer of internal magnetic winds onto the remnant stellar envelope. the efficiency of the latter favors relatively baryon-rich winds [27], which will generally produce aspherical explosions. these two-component outflows circumvent the limitations of neutron stars, whose approximately uniform baryon-loading throughout their wind renders them less amenable to making hyper-energetic supernovae with a successful grb.
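the scaling $\omega \sim J/r^3$ can be made concrete with the weak-field lense–thirring rate $\omega = 2GJ/(c^2 r^3)$. in the sketch below, earth's angular momentum and the lageos ii orbit radius are rough textbook values, assumptions rather than numbers taken from [6] or [10]:

```python
import math

# weak-field frame-dragging (lense-thirring) rate omega = 2 G J / (c^2 r^3),
# the omega ~ J/r^3 scaling quoted above. earth's angular momentum and the
# lageos ii orbit radius are rough textbook values (assumptions).

G = 6.674e-11        # [m^3 kg^-1 s^-2]
C = 2.998e8          # [m/s]

def omega_lt(j, r):
    """frame-dragging angular velocity [rad/s] at radius r."""
    return 2.0 * G * j / (C**2 * r**3)

j_earth = 5.86e33    # earth's spin angular momentum [kg m^2/s]
r_orbit = 1.22e7     # lageos ii semi-major axis [m], ~12 200 km

w = omega_lt(j_earth, r_orbit)
mas_per_yr = w * math.degrees(1) * 3.6e6 * 3.156e7   # rad/s -> mas/yr
print(f"earth: omega = {w:.2e} rad/s (~{mas_per_yr:.0f} mas/yr nodal drift)")

# the same scaling around an extremal kerr black hole, with J = G M^2 / c
m_bh = 10 * 1.989e30                       # assumed 10 solar masses [kg]
j_bh = G * m_bh**2 / C
r = 100 * (2.0 * G * m_bh / C**2)          # 100 schwarzschild radii
print(f"kerr, 100 r_s: omega = {omega_lt(j_bh, r):.2e} rad/s")
```

the earth case comes out near the ~30 mas/yr nodal drift familiar from the lageos analyses, which is a check on the formula rather than on the assumed inputs.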
1.2. if not a pulsar, how to identify rotation?

in our model, the t90 durations of long grbs are attributed to the lifetime of rapid spin of the black hole. this sets the duration of energy and angular momentum transfer from the black hole, mostly onto the surrounding matter. when the black hole has slowed down sufficiently, this interaction ceases, and the inner accretion disk or torus will fall in, heralding the end of the grb. that is, a long grb ends with hyper-accretion onto a slowly spinning black hole, whose angular velocity is approximately equal to that of the isco as prescribed by the kerr metric. furthermore, our model considers the aforementioned two-component outflow. any time-variability in the collimated torus wind – in strength or orientation – inevitably modulates the inner jet producing the grb, with possibly quasi-periodic variations in the light curve. the frequency scale of these variations is that associated with the isco around the black hole, i.e., on the order of 1 khz, possibly higher if multipole mass moments are involved and possibly lower by precession. a search for such high frequency qpos requires the highest sampling frequencies available for grb light curves, well beyond the 64 ms sampling interval in the batse catalogue. following the above, we consider the problem of tracking the evolution of black hole spin, as determined by its interaction with a surrounding torus. attributed to frame dragging as indicated above, the count rate of gamma-ray photons observed should show a near-exponential decay as the angular velocity approaches the fixed point $\Omega_H = \Omega_{\rm ISCO}$. the evolution of an initially rapidly rotating black hole that ensues is mostly in a decrease in its angular momentum, and less so in a decrease of its total mass [24, 26]. since $\omega$ and hence $E$ decrease along with $\Omega_H$, the luminosity of the baryon-poor outflows will decrease in the process of black hole spindown in approaching a fixed point $\Omega_H = \Omega_d$. here, $\Omega_d$ denotes the angular velocity of the inner disk, which is expected to closely track the angular velocity $\Omega_{\rm ISCO}$ of the inner-most stable circular orbit (isco), as described by the kerr metric. thus, an asymptotically exponential decay in the light curve of any high energy emission produced by dissipation of the aforementioned baryon-poor outflows is a characteristic imprint of black hole spindown. the above suggests extracting a normalized light curve (nlc) of long grbs, to identify the anticipated late-time exponential decay. here, we report on this study using a recently implemented matched filtering method [25], to identify a model light curve in the complete batse catalogue of 1491 long grbs. matched filtering allows for an accurate extraction of a normalized light curve, in which fluctuations on all intermediate time scales are filtered out, that can be used to validate model templates. we apply this to various model templates representing spindown of black holes and of (proto-)neutron stars.

figure 1. histograms (counts n versus ejection velocity v_α [km/s]) of the sample of [14] of cc-sne with narrow (top; mean = 4.72 × 10³ km/s, σ = 659 km/s) and broad (bottom; mean = 5.68 × 10³ km/s, σ = 860 km/s) emission lines. the mean of the latter is larger than that of the former by 3.9σ.

1.3. em priors to gw searches

properties of the inner engine of hyper-energetic supernovae and long grbs in the electromagnetic spectrum of radiation can provide valuable priors to searches for their gravitational wave emissions. by the very nature of these catastrophic events, the high density matter at nuclear densities orbiting at about the schwarzschild radius of the central engine is expected to develop a non-axisymmetric mass distribution in response to exceptionally large energy fluxes. if so, significant emission in gravitational waves is inevitable for a sustained period of time, provided by a balance between heating, apparent in mev neutrino emission, and cooling, by the aforementioned magnetic disk winds and gravitational radiation. this outlook offers some novel opportunities for the advanced detectors ligo-virgo and kagra currently under construction, which are expected to be operational within this decade.
of particular interest are priors to identifying black holes or pns, in view of their dramatically different outlook in gravitational wave emission. the latter offers a broad range of possible radiation channels which, however, remain hitherto somewhat uncertain. examples are acoustic modes, convection, differential rotation, and magnetic fields [1, 2, 7–9, 11, 15, 17, 18]. if an electromagnetic prior is found to rule out a pns, we can direct our attention to black holes as a probable alternative. an attractive possibility, for example, is a prolonged quasi-periodic emission by quadrupole mass inhomogeneities or non-axisymmetric instabilities in the inner disk or torus at twice the orbital frequency, particularly so if powered by the spin energy of the black hole.

table 1. selected supernovae with kinetic energies $E_{\rm SN}$ in units of $10^{51}$ erg. the required energy reservoir $E_{\rm rot}$ is expressed in $\hat{E} = E_{\rm rot}/E_c$, $E_c = 3 \times 10^{52}$ erg (adapted from [27]).

grb       sn       esn    η      ê      prior
—         2005ap   > 10   1      > 0.3  indet
—         2007bi   > 10   1      > 0.3  indet
980425    1998bw   50     1      1.7    bh
031203    2003lw   60     0.25   10     bh
060218    2006aj   2      0.25   0.25   indet
100316d   2010bh   10     0.25   1.3    bh
030329    2003dh   40     0.25   5.3    bh

2. diversity out of universality

in contrast to type ia supernovae, cc-sne are not of one kind, showing both narrow and broad emission lines, as seen in the histograms of ejection velocities (fig. 1) of a sample of cc-sne compiled by [14]. those of broad emission line events are higher on average than those of narrow emission line events, with a statistical significance in excess of 4σ. the explosions of the former appear to be relatively more energetic. rapidly rotating black holes ($\Omega_H > \Omega_{\rm ISCO}$, $a/M > 0.36$) can form in cc-sne and in mergers of two neutron stars. if the progenitor of the former is a member of a short period binary, a rapidly rotating black hole is formed with a rotational energy between about one- and two-thirds of the extremal value for kerr black holes. in particular, dimensionless specific angular momenta $a/M = 0.7679$ and $a/M = 0.9541$ are found with $E_{\rm rot}/E_{\rm rot}^{\max} = 0.3554$ and $0.6624$, respectively. neutron star-neutron star mergers form rapidly rotating black holes of relatively small mass $M$, essentially the sum of the masses $M_{\rm NS}$ of the individual neutron stars except for a minor loss of mass that forms an accretion disk, for which we estimate
$$\frac{a}{M} = \frac{2}{5}\sqrt{\frac{R}{R_g}} = 0.5963,\ 0.7303,\ 0.8433 \qquad (1)$$
for $M_{\rm NS} = 3, 2, 1.5\,M_\odot$. these values are in remarkable agreement with numerical simulations, showing $a/M = 0.74$–$0.84$ for $M_{\rm NS} = 1.5\,M_\odot$ [3]. in the merger of a neutron star with a black hole companion, an accretion disk will form from tidal break-up around the companion if the black hole is not too massive (the limit for which is relaxed if it spins rapidly). in this event, the spin of the black hole is unchanged from its spin prior to the merger. consequently, the rotation of the black hole may be diverse, given by the diversity of spin in neutron star-black hole binaries. in considering grbs from rotating black holes, we associate long/short grbs with rapidly/slowly spinning black holes. the above leads to the perhaps counter-intuitive conclusion that mergers of neutron star-neutron star binaries produce long grbs, more likely so than short grbs [24]. they may further be produced by mergers of neutron stars with black hole companions (e.g. [12, 16]), especially those with rapid spin [19].
consequently, long grbs are expected to occur both with and without supernovae, involving rapidly rotating black holes. notable examples are given in tab. 1 of [27], e.g., grb 060614 with t90 = 102 s.

3. a light curve of long grbs

to leading order, the evolution of a rotating black hole interacting with surrounding matter via a torus magnetosphere is described by conservation of energy and angular momentum associated with the black hole luminosity $L_H$, its mass $M$ and angular momentum $J_H$, satisfying [19]
$$L_H = -\dot{M}, \qquad \dot{J}_H = -T, \qquad (2)$$
where $L_H = \Omega_H T$ is associated with the torque $T = \kappa(\Omega_H - \Omega_d)$. here, $\kappa$ incorporates the physical and geometrical properties of the inner torus magnetosphere, and $\Omega_d$ is taken to be tightly correlated to the $\Omega_{\rm ISCO}$ mentioned earlier. upon numerical integration of eq. 2, a model light curve can be calculated from the resulting baryon-poor outflow as a function of $\Omega_H = \Omega_H(t)$ and $\theta_H = \theta_H(t)$ in its dependence on the radius $r_d = r_{\rm ISCO}$ of the inner disk. for our intended leading order analysis, we assume that the photon count rate is proportional to the luminosity in the baryon-poor outflow. figure 2 shows the resulting template, scaled to a duration of about one minute. a finite polar cap supports an open magnetic flux tube to infinity, while the remainder of the black hole event horizon is connected to the inner disk or torus via an inner torus magnetosphere. the time-averaged luminosity of the jet along the open magnetic flux tube that results from the action of frame dragging satisfies the angular scaling [23]
$$\langle L_j(t)\rangle \sim 10^{51}\left(\frac{M_1}{T_{90}/30\,{\rm s}}\right)\left\langle\left(\frac{\theta_H(t)}{0.5}\right)^4\right\rangle\ {\rm erg\,s^{-1}} \qquad (3)$$
with a further dependence on the angular velocity of the black hole according to $L_j \propto \Omega_H^2 z^2 E_{k,t}$, $E_{k,d} \propto (\Omega_d r_d)^2 e(z)$, $e(z) = \sqrt{1 - \frac{2}{3z}}$, $r_t \equiv zM$. here, we consider the geometrical ansatz $\theta_H^4 \propto z^n$ with $n = 2$. (here, $n = 2$ corresponds to the expansion of the surface area $A_H = a_1 z + a_2 z^2 + \cdots$ of the polar cap as a function of the radius of the inner disk.) figure 2 shows the model light curve produced by eq. 3 following eq. 2. thus, the light curve (fig. 2) is based on two positive correlations: in the grb luminosity with $\omega$, and in $\theta_H$ with the radius of the isco. for the latter we use the ansatz $n = 2$, a choice among a few integers made by insisting on analyticity. we now seek validation of this model (template a) in the batse light curves of long grbs by application of matched filtering.

figure 2. the template light curve in gamma-rays for an initially extremal black hole, shown as a function of time (top left), dimensionless angular momentum (bottom left), and the associated horizon half-opening angle $\theta_H$ (right). during initial spindown, the rise is due to a marked increase of $\theta_H$ with the expansion of the isco, leading to a maximal luminosity at about 16 % of the duration of the burst. there is an accompanying decrease in the total mass of the black hole [24, 27].
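equation (2), with $T = \kappa(\Omega_H - \Omega_{\rm ISCO})$, can be integrated directly. the sketch below uses the standard kerr expressions for $\Omega_H$ and the prograde isco; the units ($G = c = 1$, initial $M = 1$), the value of $\kappa$ and the step size are arbitrary choices of ours, so it reproduces the qualitative approach to the fixed point $\Omega_H = \Omega_{\rm ISCO}$ rather than the calibrated template of fig. 2.

```python
import math

# sketch of eq. (2): dM/dt = -Omega_H * T, dJ/dt = -T with the torque
# T = kappa * (Omega_H - Omega_ISCO), in geometric units G = c = 1.
# kappa and dt are arbitrary; the run approaches the fixed point
# Omega_H = Omega_ISCO near a/M ~ 0.36, the threshold quoted in section 2.

def omega_horizon(M, J):
    """angular velocity of the kerr horizon."""
    a = J / M
    r_h = M + math.sqrt(M**2 - a**2)
    return a / (r_h**2 + a**2)

def r_isco(M, J):
    """prograde isco radius (bardeen-press-teukolsky)."""
    chi = J / M**2
    z1 = 1 + (1 - chi**2)**(1 / 3) * ((1 + chi)**(1 / 3) + (1 - chi)**(1 / 3))
    z2 = math.sqrt(3 * chi**2 + z1**2)
    return M * (3 + z2 - math.sqrt((3 - z1) * (3 + z1 + 2 * z2)))

def omega_isco(M, J):
    """orbital angular velocity at the prograde isco."""
    r = r_isco(M, J)
    return math.sqrt(M) / (r**1.5 + (J / M) * math.sqrt(M))

M, J = 1.0, 0.99             # near-extremal initial state, a/M = 0.99
kappa, dt = 1.0e-2, 0.05
for _ in range(200000):
    T = kappa * (omega_horizon(M, J) - omega_isco(M, J))
    if T <= 0.0:
        break                # fixed point reached: spindown ends
    M -= omega_horizon(M, J) * T * dt
    J -= T * dt
print(f"final a/M = {J / M**2:.2f}, final M = {M:.2f}")
```

the run loses angular momentum much faster than mass, consistent with the evolution described in section 1.2, and settles near $a/M \approx 0.36$.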
4. normalized grb light curve

to extract an intrinsic light curve of long grbs, we set out to filter out all fluctuations, random or quasi-periodic, in the batse data (fig. 3) that are not representative of the evolution of the black hole. these include, without being exhaustive, precession of the disk or black hole; instabilities in the accretion disk, in the inner torus magnetosphere and in the interface between the baryon-poor jet and the surrounding baryon-rich disk winds; as well as fluctuations in the dissipative fronts due to turbulence. to this end, a normalized light curve (nlc) is extracted by an ensemble average of individually normalized light curves, each defined by a best fit to a template upon scaling and translation of time and count rate.
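this normalization step can be sketched as follows: each burst is matched to the template by a least-squares fit over a translation and scaling of time and count rate, and the rescaled curves are then averaged on a common grid. the four-parameter fit and the use of scipy are our assumptions about one reasonable implementation, not the authors' pipeline.

```python
import numpy as np
from scipy.optimize import minimize

def normalize_to_template(t, counts, tpl_t, tpl_f):
    """best-fit translation/scaling of time and count rate (4 parameters)."""
    def chi2(p):
        t0, s, a, b = p                    # time shift/scale, amplitude, offset
        model = np.interp((t - t0) / s, tpl_t, tpl_f)
        return np.sum((a * model + b - counts) ** 2)
    p0 = [t[counts.argmax()], 0.5 * (t[-1] - t[0]), counts.max(), 0.0]
    t0, s, a, b = minimize(chi2, p0, method="Nelder-Mead").x
    return (t - t0) / s, (counts - b) / a  # normalized time and count rate

def ensemble_nlc(bursts, tpl_t, tpl_f, grid=np.linspace(-1, 3, 200)):
    """average individually normalized light curves on a common time grid;
    bursts is a list of (time, counts) array pairs."""
    stack = []
    for t, counts in bursts:
        tn, fn = normalize_to_template(t, counts, tpl_t, tpl_f)
        stack.append(np.interp(grid, tn, fn))
    stack = np.array(stack)
    return grid, stack.mean(axis=0), stack.std(axis=0)  # nlc and its scatter
```

the standard deviation returned here corresponds to the σ bands shown with the nlc in fig. 4.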
figure 4 shows the results along with the residuals, showing convergence for very long bursts (t90 > 20 s) to template a (fig. 2). the results are sub-optimal for the other model templates: template b, for spindown against surrounding matter beyond the isco satisfying ωd = (1/2)ωh, and template c, for spindown of a pns. the preferred match to template a establishes confidence in the model of spindown against matter at the isco, which represents a suspended accretion state with a luminous output in gravitational radiation [20]. a detailed analysis points to a sensitivity distance of about 35 mpc for the advanced gravitational wave detectors ligo-virgo and kagra [26], by extrapolating the sensitivity distance determined from model injections into the strain amplitude noise of tama 300, shown in fig. 5.

figure 3. compilation of the complete batse catalogue of 1491 light curves of long grbs, sorted by 2 s < t90 < 1307 s. each light curve shown represents the sum of the photon count rate in all four batse energy channels, is smoothed with a time scale of 2.56 s and is plotted as a function of time normalized to t90. (reprinted from [28].)

figure 4. shown are the nlc (thick lines) generated by model templates a–c (thin lines) for the ensemble of 531 long duration bursts with 2 s < t90 < 20 s (left) and the ensemble of 960 long bursts with t90 > 20 s (right), and the associated deviations for templates a–c. here, the standard deviation σ is calculated from the square root of the variance of the photon count rates in the ensemble of individually normalized light curves as a function of normalized time. (reprinted from [28].)

figure 5. the long duration negative chirp can be searched for using a dedicated time-sliced matched filtering procedure, here illustrated following signal injection in tama 300 noise data (model with m = 10 m⊙ at d = 0.07 mpc; tsmf ± 1σ). it points to a sensitivity distance of about 35 mpc for the upcoming advanced detectors. (reprinted from [26].)

5. discussion and conclusions

some hyper-energetic supernovae – of type ic – appear to require an energy reservoir that exceeds the maximal rotational energy of a pns. grb-supernovae ideally have an inner engine that produces simultaneously a baryon-rich wind to power the supernova and a baryon-poor jet to power the grb. these and other considerations point to a universal inner engine in the form of a rotating black hole, whose diversity in spin [19] can account for short/long grbs from hyper- and suspended accretion [23]. the fanaroff-riley class i and ii radio-loud agn may represent a closely related dichotomy for supermassive black holes, where the latter may be related to a similar suspended accretion state catalyzing black hole spin energy into powerful winds. elisa may test for suspended accretion by searches for an accompanying low-frequency emission in gravitational waves, notably from sgr a*. the type b x-ray flaring in the galactic microquasar grs 1915+105 may further illustrate the suspended accretion state [22]. figure 4 validates our model for black hole spindown, most likely against matter at the isco rather than further out, or the spindown of (proto-)neutron stars. as a corollary, some of the neutron star-neutron star mergers are expected to produce long grbs, more likely so than short grbs. our results point to specific high-frequency chirps, both in gws of interest to ligo-virgo and kagra and in intensity modulations in prompt grb emissions of interest to some of the gamma-ray satellite missions. for the first, we estimate a sensitivity distance of about 35 mpc for the advanced generation of gravitational wave detectors, indicated in fig. 5.

acknowledgements

the author thanks m. della valle and the organizers for the invitation to participate at this workshop and the referee for constructive comments.

references
[1] andersson, n., ferrari, v., jones, d. i., et al., 2011, gen. rel. gravit., 43, 409
[2] arras, p., flanagan, e. e., morsink, s. m., et al., 2003, apj, 591, 1129
[3] baiotti, l., giacomazzo, b., & rezzolla, l., 2008, phys. rev. d, 78, 084033
[4] bisnovatyi-kogan, g.s., 1970, astron. zh., 47, 813
[5] bromberg, o., levinson, a., & van putten, m.h.p.m., 2006, newa, 11, 619
[6] ciufolini, i., & pavlis, e.c., 2004, nature, 431, 958
[7] cutler, c., 2002, phys. rev. d, 66, 084025
[8] cutler, c., & thorne, k. s., 2002, proc. gr16, durban, south africa, 2001 [arxiv:astro-ph/0204090]
[9] dall'osso, s., shore, s. n., & stella, l., 2009, mnras, 398, 1869
[10] everitt, c.w.f., et al., 2011, phys. rev. lett., 106, 221101
[11] howell, e., coward, d., burman, r., et al., 2004, mnras, 351, 1237
[12] kluźniak, w., & lee, w.h., 1998, apj, 494, l53
[13] kumar, p., narayan, r., & johnson, j.l., 2008, science, 321, 376; ibid., mnras, 388, 1729
[14] maurer, j. i., mazzali, p. a., deng, j., et al., 2010, mnras, 402, 161
[15] owen, b. j., lindblom, l., cutler, c., et al., 1998, phys. rev. d, 58, 084020
[16] paczynski, b. p., 1991, acta astron., 41, 257
[17] rees, m. j., ruffini, r., & wheeler, j. a., 1974, black holes, gravitational waves and cosmology: an introduction to current research (new york: gordon & breach), sect. 7
[18] regimbau, t., 2011, res. astron. astrophys., 11, 369
[19] van putten, m.h.p.m., 1999, science, 284, 115
[20] van putten, m.h.p.m., 2001, phys. rev. lett., 87, 091101
[21] van putten, m.h.p.m., 2002, apj, 575, l71
[22] van putten, m.h.p.m., & eikenberry, s., arxiv:astro-ph/0304386
[23] van putten, m.h.p.m., & levinson, a., 2003, apj, 584, 937
[24] van putten, m.h.p.m., 2008, apj, 684, l91
[25] van putten, m.h.p.m., & gupta, a., 2009, mnras, 396, l81
[26] van putten, m.h.p.m., kanda, n., tagoshi, h., tatsumi, d., fujimoto, m.-k., & della valle, m., 2011, phys. rev. d, 83, 044046
[27] van putten, m.h.p.m., della valle, m., & levinson, a., 2011, a&a, 535, l6
[28] van putten, m.h.p.m., 2012, prog. theor. phys., 127, 331

discussion

j. beal – is there anisotropy in the emitted gravitational radiation?

m. van putten – the anisotropy in quadrupole emissions is known to be small, about a factor of 1.58 at most.

b. czerny – (a) the gw efficiency will depend strongly on the azimuthal mode number m. are low m modes more important in the papaloizou-pringle instability? (b) why don't you use the term "hypernova"? (c) how sensitive are your grb templates to the detailed assumptions that go into the bz efficiency?

m. van putten – (a) in [5], we considered the problem of the spectrum in gravitational wave emission using a toy model for a magnetized disk. the results indicate a dominant emission in m = 2. in [21], i extended the pp instability analysis to tori with finite width, and found that the low m modes (starting with m = 1) become unstable first in response to thermal or magnetic pressures. both results are fortunate, pointing to major emission in quadrupole radiation, which falls within the limited sensitivity bandwidth of the ligo-virgo and kagra detectors. (b) in seeking em priors on the inner engine of cc-sne, i consider the possibility of ruling out a pns by determining the required energy reservoir relative to ec = 3 × 10^52 erg, the canonical value for the maximal energy in rotation of a pns. the term "hypernova" has been introduced for supernovae with isotropically equivalent kinetic energies of a few times 10^52 erg, well in excess of typical sn energies of a few times 10^51 erg. this definition is not sufficiently specific to serve as a prior on the inner engine. (c) bz proposed an open model for a black hole luminosity entirely in open outflows, sustained by accretion of magnetized flows with no feedback onto the inner accretion disk. the black hole evolves according to the net result of losses (in energy and angular momentum) in magnetic outflow and gains by accretion. hyper-accretion will likely cause the black hole to spin up continuously, to close to maximal rotation [13]. in associating the duration of hyper-accretion with the observed t90 of the bursts, a model light curve from a baryon-poor jet along the black hole spin axis should be increasing, followed by a relatively sharp drop-off when accretion ceases. this is at odds with the gradual and extended decay seen in the nlc obtained with templates a–c (fig. 4).

acta polytechnica 53(supplement):736–741, 2013
acta polytechnica 53(3):308–313, 2013, © czech technical university in prague, 2013, available online at http://ctn.cvut.cz/ap/

inherited properties of effect algebras preserved by isomorphisms

jan paseka (a, ∗), zdenka riečanová (b)

a: department of mathematics and statistics, faculty of science, masaryk university, kotlářská 2, cz-611 37 brno, czech republic
b: department of mathematics, faculty of electrical engineering and information technology, slovak university of technology, ilkovičova 3, sk-812 19 bratislava, slovak republic
∗ corresponding author: paseka@math.muni.cz

abstract. we show that isomorphism of effect algebras preserves the properties of effect algebras derived from the effect algebraic sum ⊕ of elements. these are partial order, order convergence, order topology, existence of states and other important properties. however, there are properties of effect algebras for which the preservation of the ⊕-operation is not substantial, and they need not be preserved.

keywords: effect algebras; operator effect algebras; isomorphisms; operator representations of effect algebras.

ams mathematics subject classification: 06c15 (03g12, 81p10).

1. introduction

in the quantum-mechanical framework, the elements of an effect algebra represent quantum effects, meaning elementary yes-no measurements that may be unsharp. the standard hilbert space effect algebra e(h) on a complex hilbert space h is the set e(h) of all positive operators dominated by the identity operator i on h. so-called interval effect algebras form a further important class of effect algebras. these are effect algebras possessing an ordering set of states, which is equivalent to the condition that these effect algebras can be represented by positive linear operators densely defined in an infinite-dimensional complex hilbert space h (see [18]). here, by the operator representation of effect algebras (initiated by questions of m. znojil at the 9th phhqp workshop in hangzhou, china) we mean their isomorphism with sub-effect algebras of the standard hilbert space effect algebra e(h) on the complex hilbert space h.

in this paper we show that isomorphisms of effect algebras inherit the partial order on them, and consequently also the order convergence on them and other important properties. however, we also show examples of properties that need not be inherited by isomorphisms of effect algebras (e.g., the sequential product of elements).

2. basic definitions and some known facts

2.1. effect algebras and generalized effect algebras

definition 2.1. [3] a partial algebra (e; ⊕, 0, 1) is called an effect algebra if 0, 1 are two distinguished elements and ⊕ is a partially defined binary operation on e which satisfies the following conditions for any x, y, z ∈ e:
(e1) x ⊕ y = y ⊕ x if x ⊕ y is defined,
(e2) (x ⊕ y) ⊕ z = x ⊕ (y ⊕ z) if one side is defined,
(e3) for every x ∈ e there exists a unique y ∈ e such that x ⊕ y = 1 (we put x′ = y and say that x′ is a supplement of x),
(e4) if 1 ⊕ x is defined then x = 0.

we often denote the effect algebra (e; ⊕, 0, 1) briefly by e. on every effect algebra e the partial order ≤, the binary relation ⊥ and the partial binary operation ⊖ can be introduced as follows: x ≤ y, x ⊥ z and y ⊖ x = z iff x ⊕ z is defined and x ⊕ z = y.

generalizations of effect algebras (i.e. without a top element 1) have been introduced and studied in [3], [5], [6] and [9].
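definition 2.1 is easy to experiment with on small finite examples. the following sketch (our own illustration, not from the paper) implements the chain effect algebra on {0, 1, ..., n}, with x ⊕ y defined iff x + y ≤ n, checks axioms (e1)–(e4), and recovers the derived partial order ≤:

```python
from itertools import product

class ChainEffectAlgebra:
    """the chain {0, 1, ..., n} with x (+) y = x + y, defined iff x + y <= n."""
    def __init__(self, n):
        self.elements = list(range(n + 1))
        self.one = n

    def oplus(self, x, y):
        return x + y if x + y <= self.one else None     # None = undefined

    def supplement(self, x):                            # the x' of axiom (e3)
        return self.one - x

    def le(self, x, y):                                 # x <= y iff x (+) z = y for some z
        return any(self.oplus(x, z) == y for z in self.elements)

def check_axioms(E):
    for x, y, z in product(E.elements, repeat=3):
        if E.oplus(x, y) is not None:                   # (e1) commutativity
            assert E.oplus(x, y) == E.oplus(y, x)
        lhs_def = E.oplus(x, y) is not None and E.oplus(E.oplus(x, y), z) is not None
        rhs_def = E.oplus(y, z) is not None and E.oplus(x, E.oplus(y, z)) is not None
        assert lhs_def == rhs_def                       # (e2) associativity
        if lhs_def:
            assert E.oplus(E.oplus(x, y), z) == E.oplus(x, E.oplus(y, z))
    for x in E.elements:
        assert E.oplus(x, E.supplement(x)) == E.one     # (e3) supplement exists
        if E.oplus(E.one, x) is not None:               # (e4) 1 (+) x defined => x = 0
            assert x == 0

E = ChainEffectAlgebra(4)
check_axioms(E)
print(E.le(1, 3), E.le(3, 1))   # True False: the derived order is the usual one
```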
definition 2.2. (1.) a generalized effect algebra (e; ⊕, 0) is a set e with an element 0 ∈ e and a partial binary operation ⊕ satisfying, for any x, y, z ∈ e, the conditions
(ge1) x ⊕ y = y ⊕ x if one side is defined,
(ge2) (x ⊕ y) ⊕ z = x ⊕ (y ⊕ z) if one side is defined,
(ge3) if x ⊕ y = x ⊕ z then y = z,
(ge4) if x ⊕ y = 0 then x = y = 0,
(ge5) x ⊕ 0 = x for all x ∈ e.
(2.) define a binary relation ≤ on e by x ≤ y iff x ⊕ z = y for some z ∈ e.

the significant property of a generalized effect algebra (e; ⊕, 0) is that every interval [0, q], for q ∈ e, q ≠ 0, is an effect algebra with ⊕ restricted to [0, q]. every effect algebra e is also a generalized effect algebra, and a generalized effect algebra is an effect algebra iff it includes the top element.

definition 2.3. a nonempty subset q of an effect algebra (generalized effect algebra) e is called a sub-effect algebra (sub-generalized effect algebra) of e iff:
(1.) if at least two of the elements x, y, z ∈ e with x ⊕ y = z are in q, then all of x, y, z are in q;
(2.) 1 ∈ q when e is an effect algebra.

we say that a finite system f = (xk)_{k=1}^n of not necessarily different elements of an effect algebra (e; ⊕, 0, 1) is orthogonal if x1 ⊕ x2 ⊕ ··· ⊕ xn (written ⊕_{k=1}^n xk or ⊕f) exists in e. here we define x1 ⊕ x2 ⊕ ··· ⊕ xn = (x1 ⊕ x2 ⊕ ··· ⊕ xn−1) ⊕ xn, supposing that ⊕_{k=1}^{n−1} xk is defined and ⊕_{k=1}^{n−1} xk ≤ x′n. we also define ⊕∅ = 0. an arbitrary system g = (xκ)κ∈h of not necessarily different elements of e is called orthogonal if ⊕k exists for every finite k ⊆ g. we say that for an orthogonal system g = (xκ)κ∈h the element ⊕g (more precisely ⊕_e g) exists iff ∨{⊕k | k ⊆ g is finite} exists in e, and then we put ⊕g = ∨{⊕k | k ⊆ g is finite}. (here we write g1 ⊆ g iff there is h1 ⊆ h such that g1 = (xκ)κ∈h1.)

2.2. topologies on ordered sets

definition 2.4. (1.) a preordered set (λ; ≤) is called a directed (upwards) set of indices if the following conditions are satisfied: (a) α ≤ α, (b) α ≤ β, β ≤ γ implies α ≤ γ, (c) for all α, β ∈ λ there exists γ ∈ λ such that α, β ≤ γ. a net (aα)α∈λ is a family of not necessarily different elements indexed by a directed set of indices λ.
(2.) a net (aα)α∈λ of elements of a poset (p; ≤) is increasingly directed if aα ≤ aβ for all α, β ∈ λ such that α ≤ β, and then we write aα ↑. if moreover a = ∨{aα | α ∈ λ} we write aα ↑ a and we call such a net increasing to a. the meaning of aα ↓ and aα ↓ a is dual (decreasingly directed or filtered).
(3.) a net (aα)α∈λ of elements of a poset (p; ≤) order converges ((o)-converges, for short) to a point a ∈ p if there are nets (uα)α∈λ and (vα)α∈λ of elements of p such that uα ≤ aα ≤ vα, uα ↑ a and vα ↓ a. we write aα →(o) a in p (or briefly aα →(o) a).

definition 2.5. the order topology (denoted by τ0^p or shortly τ0) on a poset (p; ≤) is the finest (strongest) topology on p such that, for every net (aα)α∈λ of elements of p, aα →(o) a in p =⇒ aα →τ0 a, where aα →τ0 a denotes that (aα)α∈λ converges to a ∈ p in the topological space (p, τ0^p).

clearly, aα ↑ a ⇒ aα →(o) a (take uα = aα and vα = a), and aα ↓ a ⇒ aα →(o) a (take uα = a and vα = aα) (see [7], [8], [12], [13]).

theorem 2.6 ([11, theorem 2.1.21]). let (p, ≤) be a poset and f ⊆ p. then f is τ0-closed iff for every net (aα)α∈λ of elements of p,
(cs) (aα ∈ f, α ∈ λ, aα →(o) a) ⇒ a ∈ f.

2.3. morphisms, embeddings and isomorphisms of effect algebras

recall the following definitions, needed in what follows.

definition 2.7 ([2, 18]).
let (e1; ⊕1, 01, 11), (e2; ⊕2, 02, 12) be effect algebras. a mapping ϕ : e1 → e2 is called
(1.) a morphism, if (a) ϕ(01) = 02, ϕ(11) = 12, and (b) for all a, b ∈ e1: if a ⊕1 b exists then ϕ(a) ⊕2 ϕ(b) exists, in which case ϕ(a ⊕1 b) = ϕ(a) ⊕2 ϕ(b),
(2.) an ordering morphism, if it is a morphism and, for all a, b ∈ e1, a ≤1 b iff ϕ(a) ≤2 ϕ(b),
(3.) an embedding (also called a monomorphism), if ϕ is injective and (a) ϕ(01) = 02, ϕ(11) = 12, and (c) for all a, b ∈ e1: a ⊕1 b exists iff ϕ(a) ⊕2 ϕ(b) exists, in which case ϕ(a ⊕1 b) = ϕ(a) ⊕2 ϕ(b),
(4.) an isomorphism, if ϕ is a bijective embedding,
(5.) a positive operator valued state (povs for short) on e1, iff ϕ is a morphism into e2 = e(h) for some complex hilbert space h,
(6.) a hilbert space effect-representation of e1, iff ϕ is an embedding into e2 = e(h) for some complex hilbert space h.

clearly, every embedding ϕ is an isomorphism of the effect algebras e1 and ϕ(e1); ϕ(e1) is a sub-effect algebra of e2; and a composition of morphisms (embeddings, isomorphisms) is again a morphism (embedding, isomorphism). every morphism of effect algebras preserves supplements. recall that ϕ is an isomorphism of effect algebras iff ϕ is bijective and both ϕ and ϕ−1 are morphisms of effect algebras.

lemma 2.8. let (e1; ⊕1, 01, 11) and (e2; ⊕2, 02, 12) be effect algebras and let ϕ : e1 → e2 be a morphism of effect algebras. then ϕ is order-preserving and, for any orthogonal system g = (xκ)κ∈h of not necessarily different elements of e1, the system ϕ(g) = (ϕ(xκ))κ∈h is again orthogonal.

proof. assume that a, b ∈ e1, a ≤1 b. then there is an element c ∈ e1 such that a ⊕1 c = b. it follows that ϕ(b) = ϕ(a ⊕1 c) = ϕ(a) ⊕2 ϕ(c) ≥2 ϕ(a). now, let l ⊆ ϕ(g) be finite. then there is a finite subset f ⊆ h such that l = (ϕ(xκ))κ∈f. put k = (xκ)κ∈f. then ⊕_{e1} k exists, and hence ⊕_{e2} l exists and ⊕_{e2} l = ϕ(⊕_{e1} k). it follows that ϕ(g) = (ϕ(xκ))κ∈h is orthogonal.

proposition 2.9. let (e1; ⊕1, 01, 11) and (e2; ⊕2, 02, 12) be effect algebras and let ϕ : e1 → e2 be a morphism of effect algebras. then the following conditions are equivalent:
(1.) ϕ is an ordering morphism.
(2.) ϕ is an embedding.

proof. assume that a, b ∈ e1. then a ≤1 b′ iff a ⊕1 b exists, and ϕ(a) ≤2 ϕ(b′) iff ϕ(a) ≤2 ϕ(b)′ iff ϕ(a) ⊕2 ϕ(b) exists. hence ϕ is an ordering morphism iff ϕ is an embedding.

theorem 2.10. let (e; ⊕, 0, 1) be an effect algebra and let h be some complex hilbert space. for a map ϕ : e → e(h) the following conditions are equivalent:
(1.) ϕ is an ordering positive operator valued state.
(2.) ϕ is an embedding.
(3.) ϕ is a hilbert space effect-representation of e in h.

proof. the equivalence between (1.) and (2.) follows from proposition 2.9; the equivalence between (2.) and (3.) follows from definition 2.7, (2.), (5.) and (6.).
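proposition 2.9 can be probed on small finite examples. the sketch below (our own illustration) exhibits a morphism from the boolean square {0, 1}² to the chain {0, 1, 2} that is neither an embedding nor an ordering morphism, exactly as the proposition predicts:

```python
from itertools import product

# the boolean square {0,1}^2 and the chain {0,1,2}; phi(a, b) = a + b is a
# morphism of effect algebras (definition 2.7) that is not an embedding.
square = [(a, b) for a in (0, 1) for b in (0, 1)]

def oplus_square(x, y):
    s = (x[0] + y[0], x[1] + y[1])
    return s if s[0] <= 1 and s[1] <= 1 else None   # None = undefined

def oplus_chain(x, y):
    return x + y if x + y <= 2 else None

phi = lambda v: v[0] + v[1]

# morphism: phi(0,0) = 0, phi(1,1) = 2 (the units), and phi preserves (+)
for x, y in product(square, repeat=2):
    if oplus_square(x, y) is not None:
        assert phi(oplus_square(x, y)) == oplus_chain(phi(x), phi(y))

# not an embedding: (1,0) (+) (1,0) is undefined upstairs, while
# phi((1,0)) (+) phi((1,0)) = 1 (+) 1 = 2 is defined downstairs
assert oplus_square((1, 0), (1, 0)) is None
assert oplus_chain(phi((1, 0)), phi((1, 0))) == 2

# hence phi is not an ordering morphism either (proposition 2.9):
# (1,0) and (0,1) are incomparable, yet their images coincide
print(phi((1, 0)) == phi((0, 1)))   # True
```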
definition 2.11 ([2, 14, 18]).
(1.) a map ω : e → [0, 1] ⊆ r is a state on an effect algebra e if ω(0) = 0, ω(1) = 1 and ω(x ⊕ y) = ω(x) + ω(y) whenever x ≤ y′, x, y ∈ e.
(2.) a set m of states on an effect algebra e is called an ordering set of states if, for any a, b ∈ e, the condition a ≤ b iff ω(a) ≤ ω(b) for all ω ∈ m is satisfied.
(3.) a state ω on an effect algebra e is called σ-additive if, for every countable net (xn)n∈n of elements of e, xn ↑ x =⇒ ω(xn) → ω(x).
(4.) a state ω on an effect algebra e is called (o)-continuous (order-continuous) if, for every net (xα)α∈λ of elements of e, xα →(o) x implies ω(xα) → ω(x) (equivalently, xα ↑ x implies ω(xα) ↑ ω(x)).
(5.) a state ω on an effect algebra e is called completely additive if, for any orthogonal system (xκ)κ∈h of not necessarily different elements of e such that ⊕{xκ | κ ∈ h} exists, ω(⊕{xκ | κ ∈ h}) = ∑{ω(xκ) | κ ∈ h} = sup{∑{ω(xκ) | κ ∈ f} | f ⊆ h, f a finite set}.

it follows that states on effect algebras are exactly morphisms from them into [0, 1]. note that any (o)-continuous state is completely additive and any completely additive state is σ-additive. moreover, it was proved in [18] that, for an effect algebra e, there exists a complex hilbert space h such that e has a hilbert space effect-representation into e(h) = [0, i]_{b+(h)}, where b+(h) are the positive bounded operators on h, iff there exists an ordering set m of states on e, and then h = l2(m).

3. basic properties of isomorphisms of effect algebras and operator representations

roughly speaking, the operator representations of abstract effect algebras (if they exist) are their isomorphisms with operator effect algebras in some complex hilbert space h. more precisely, they are their isomorphisms with sub-effect algebras of the standard hilbert space effect algebra e(h). in such a case it may be interesting to know which properties of the initial effect algebras are inherited by those isomorphic operator effect algebras. let us start our considerations with the properties of two isomorphic abstract effect algebras.

theorem 3.1. let (e1; ⊕1, 01, 11) and (e2; ⊕2, 02, 12) be effect algebras and let ϕ : e1 → e2 be an isomorphism of effect algebras. then
(1.) for all a, b ∈ e1, a ≤1 b if and only if ϕ(a) ≤2 ϕ(b).
(2.) for all s ⊆ e1, ∨_{e1} s exists if and only if ∨_{e2} ϕ(s) exists, in which case ∨_{e2} ϕ(s) = ϕ(∨_{e1} s).
(3.) for any increasingly directed net (aα)α∈λ of elements of e1 and a ∈ e1, aα ↑ a if and only if ϕ(aα) ↑ ϕ(a).
(4.) for any decreasingly directed net (aα)α∈λ of elements of e1 and a ∈ e1, aα ↓ a if and only if ϕ(aα) ↓ ϕ(a).
(5.) for any net (aα)α∈λ of elements of e1 and a ∈ e1, aα →(o)1 a if and only if ϕ(aα) →(o)2 ϕ(a).
(6.) for subsets and nets of elements of e1 and e2 the following statements are satisfied:
• for all f ⊆ e1, f is τ0^{e1}-closed if and only if ϕ(f) is τ0^{e2}-closed.
• for all u ⊆ e1, u is τ0^{e1}-open if and only if ϕ(u) is τ0^{e2}-open.
• for any net (aα)α∈λ of elements of e1 and a ∈ e1, aα →τ0^{e1} a if and only if ϕ(aα) →τ0^{e2} ϕ(a).

proof. (1.) assume that a, b ∈ e1. if a ≤1 b, then from lemma 2.8 we have that ϕ(a) ≤2 ϕ(b). conversely, let ϕ(a) ≤2 ϕ(b). then again by lemma 2.8, applied to ϕ−1, we get a = ϕ−1(ϕ(a)) ≤1 ϕ−1(ϕ(b)) = b.
(2.) assume that s ⊆ e1 is such that ∨_{e1} s exists. let us put a = ∨_{e1} s. then, for all s ∈ s, s ≤ a, and we get from lemma 2.8 that ϕ(s) ≤2 ϕ(a). hence ϕ(a) is an upper bound of ϕ(s) for all s ∈ s. let d = ϕ(c) ∈ e2, c ∈ e1, be an upper bound of ϕ(s) for all s ∈ s. then c ∈ e1 is an upper bound of s for all s ∈ s by part 1. this yields that a ≤ c. therefore by lemma 2.8 we get ϕ(a) ≤ ϕ(c) = d, i.e. ∨_{e2} ϕ(s) = ϕ(∨_{e1} s). the converse implication follows by the same considerations applied to ϕ−1 and the assumption that ∨_{e2} ϕ(s) exists.
(3.) assume α ≤ β, α, β ∈ λ. then aα ≤1 aβ, and by lemma 2.8 we obtain that ϕ(aα) ≤2 ϕ(aβ). it follows that (ϕ(aα))α∈λ is an increasingly directed net. assume now that aα ↑ a. then by (2.) we obtain ϕ(aα) ↑ ϕ(a). the converse implication follows by the same considerations applied to ϕ−1 and ϕ(aα) ↑ ϕ(a).
(4.) it follows by the considerations dual to those in (3.).
(5.) assume first that aα →(o)1 a. hence there are nets (uα)α∈λ and (vα)α∈λ of elements of e1 such that uα ≤ aα ≤ vα, uα ↑ a and vα ↓ a. from lemma 2.8 and parts (3.) and (4.) we obtain nets (ϕ(uα))α∈λ and (ϕ(vα))α∈λ of elements of e2 such that ϕ(uα) ≤ ϕ(aα) ≤ ϕ(vα), ϕ(uα) ↑ ϕ(a) and ϕ(vα) ↓ ϕ(a). it follows that ϕ(aα) →(o)2 ϕ(a). the converse implication follows by the same considerations applied to ϕ−1, the net (ϕ(aα))α∈λ of elements of e2 and ϕ(a) ∈ e2 such that ϕ(aα) →(o)2 ϕ(a).
(6.) assume that f ⊆ e1. then f is τ0^{e1}-closed if and only if (by theorem 2.6) for every net (aα)α∈λ of elements of e1 it holds that (aα ∈ f, α ∈ λ, aα →(o)1 a) ⇒ a ∈ f; if and only if (by (5.)) for every net (ϕ(aα))α∈λ of elements of e2 it holds that (ϕ(aα) ∈ ϕ(f), α ∈ λ, ϕ(aα) →(o)2 ϕ(a)) ⇒ ϕ(a) ∈ ϕ(f); if and only if for every net (bα)α∈λ of elements of e2 it holds that (bα ∈ ϕ(f), α ∈ λ, bα →(o)2 b) ⇒ b ∈ ϕ(f); if and only if (by theorem 2.6) ϕ(f) is τ0^{e2}-closed.
now, let us assume that u ⊆ e1. then u is τ0^{e1}-open if and only if f = e1 \ u is τ0^{e1}-closed, if and only if ϕ(f) = ϕ(e1 \ u) = ϕ(e1) \ ϕ(u) = e2 \ ϕ(u) is τ0^{e2}-closed, if and only if ϕ(u) is τ0^{e2}-open.
in what remains, we will assume that we have a net (aα)α∈λ of elements of e1 and a ∈ e1 with aα →τ0^{e1} a. let us check that ϕ(aα) →τ0^{e2} ϕ(a). assume that we have a τ0^{e2}-open set v ⊆ e2 such that ϕ(a) ∈ v. since v = ϕ(u) and u = ϕ−1(v) for some τ0^{e1}-open subset u ⊆ e1, we get that a ∈ u. hence there is an index α0 ∈ λ such that aα ∈ u for all α ≥ α0. it follows that ϕ(aα) ∈ ϕ(u) = v for all α ≥ α0. therefore ϕ(aα) →τ0^{e2} ϕ(a). the converse implication goes the same way.

recall that theorem 3.1 can be stated and proved entirely for posets. theorem 3.2 has to be stated and proved for effect algebras. effect algebras are suitable algebraic structures to be carriers of states or probability measures (σ-additive states), also in cases when events may be unsharp or some pairs of events are noncompatible.

theorem 3.2. let (e1; ⊕1, 01, 11) and (e2; ⊕2, 02, 12) be effect algebras and let ϕ : e1 → e2 be an isomorphism of effect algebras. then, for any mapping ω : e2 → [0, 1],
(1.) ω is a state on e2 if and only if ω ◦ ϕ is a state on e1.
(2.) ω is an (o)-continuous state on e2 if and only if ω ◦ ϕ is an (o)-continuous state on e1.
(3.) ω is a σ-additive state on e2 if and only if ω ◦ ϕ is a σ-additive state on e1.
(4.) ω is a completely additive state on e2 if and only if ω ◦ ϕ is a completely additive state on e1.

proof. (1.) let ω be a state on e2. then the composition ω ◦ ϕ is a morphism from e1 to [0, 1], hence a state. conversely, let ω ◦ ϕ be a state on e1. then ω = (ω ◦ ϕ) ◦ ϕ−1 is a morphism from e2 to [0, 1].
(2.) let ω be an (o)-continuous state on e2. assume that (aα)α∈λ is an increasingly directed net of elements of e1 and that a ∈ e1 is such that aα ↑ a. from theorem 3.1 we obtain that ϕ(aα) ↑ ϕ(a) in e2. since ω is (o)-continuous, we have that ω(ϕ(aα)) ↑ ω(ϕ(a)). hence, by (1.), ω ◦ ϕ is an (o)-continuous state on e1. the converse implication follows by the same considerations applied to ϕ−1 and the (o)-continuous state ω ◦ ϕ on e1.
(3.) it follows by literally the same considerations as in (2.) applied to any countable increasingly directed net.
(4.) let ω be a completely additive state on e2. assume that (xκ)κ∈h is an orthogonal system of not necessarily different elements of e1 such that ⊕_{e1}{xκ | κ ∈ h} exists. then by lemma 2.8 we get that (ϕ(xκ))κ∈h is an orthogonal system in e2, and by theorem 3.1 we obtain that ϕ(⊕_{e1}{xκ | κ ∈ h}) = ⊕_{e2}{ϕ(xκ) | κ ∈ h}.
since ω is a completely additive state on e2 we have that (ω ◦ ϕ)(⊕_{e1}{xκ | κ ∈ h}) = ω(⊕_{e2}{ϕ(xκ) | κ ∈ h}) = ∑{ω(ϕ(xκ)) | κ ∈ h} = sup{∑{(ω ◦ ϕ)(xκ) | κ ∈ f} | f ⊆ h, f a finite set}. the converse implication follows by the same considerations as above applied to ϕ−1.

4. some properties of operator effect algebras that need not be preserved by effect algebraic isomorphisms

we saw, in section 3, that isomorphism of effect algebras preserves those properties of effect algebras which depend only on the ⊕-operation or on the partial order that is derived from ⊕. on the other hand, there are properties of effect algebras for which the preservation of the ⊕-operation by isomorphisms is not substantial. for operator effect algebras this is, e.g., boundedness or self-adjointness of operators (the elements of operator effect algebras).

definition 4.1. [4] a sequential effect algebra is a partial algebra (e; ◦, ⊕, 0, 1) such that (e; ⊕, 0, 1) is an effect algebra and ◦ is another binary operation (called a sequential product) defined on e satisfying:
(sea1) the map b ↦ a ◦ b is additive for each a ∈ e, that is, if b ⊥ c, then a ◦ b ⊥ a ◦ c and a ◦ (b ⊕ c) = a ◦ b ⊕ a ◦ c.
(sea2) 1 ◦ a = a for each a ∈ e.
(sea3) if a ◦ b = 0, then a ◦ b = b ◦ a.
(sea4) if a ◦ b = b ◦ a, then a ◦ b′ = b′ ◦ a and a ◦ (b ◦ c) = (a ◦ b) ◦ c for each c ∈ e.
(sea5) if c ◦ a = a ◦ c and c ◦ b = b ◦ c, then c ◦ (a ◦ b) = (a ◦ b) ◦ c and c ◦ (a ⊕ b) = (a ⊕ b) ◦ c whenever a ⊥ b.

assume that (e1; ◦1, ⊕1, 01, 11) and (e2; ◦2, ⊕2, 02, 12) are sequential effect algebras. a mapping ϕ : e1 → e2 is called a sequential effect algebraic morphism if ϕ is a morphism of the effect algebra e1 into the effect algebra e2 and, for all a, b ∈ e1, ϕ(a ◦1 b) = ϕ(a) ◦2 ϕ(b).

in what follows we will assume that h is an infinite-dimensional complex hilbert space, i.e., a linear space with inner product (·, ·) which is complete in the induced metric. the dimension of h is defined as the cardinality of any orthonormal basis of h (see [1]). moreover, we will assume that all considered linear operators a (i.e. linear maps a : d(a) → h) have a domain d(a) that is a linear subspace dense in h with respect to the metric topology induced by the inner product on h (i.e., the closure of d(a) equals h). recall that a linear operator a is called positive (denoted by a ≥ 0) iff (x, ax) ≥ 0 for all x ∈ d(a); hence a is also symmetric, meaning that (y, ax) = (ay, x) for all x, y ∈ d(a) (see [1] for more details). recall that a : d(a) → h is called a bounded operator if there exists a real constant c ≥ 0 such that ‖ax‖ ≤ c‖x‖ for all x ∈ d(a).

gudder [4] showed that, for any standard hilbert space effect algebra e(h) on a complex hilbert space h, there is a binary operation ◦ defined by b ◦ c = b^{1/2} c b^{1/2} for all b, c ∈ e(h) which satisfies conditions (sea1)–(sea5), and so it is a sequential product of e(h). liu weihua and wu junde in [10, theorem 4.3] proved that there is a binary operation ◦i on e(h) that satisfies conditions (sea1)–(sea5) with ◦i ≠ ◦. this yields the following.

theorem 4.2. let h be a complex hilbert space. then there are sequential operator effect algebras (e(h); ◦, ⊕, 0, 1) and (e(h); ◦i, ⊕, 0, 1) that are isomorphic as effect algebras, but the respective effect algebraic isomorphism does not preserve the sequential product.

proof. evidently, id_{e(h)} is an effect algebraic isomorphism. from [10, theorem 4.3] we know that there are a, b ∈ e(h) such that a ◦ b ≠ a ◦i b. hence id_{e(h)}(a ◦ b) = a ◦ b ≠ a ◦i b = id_{e(h)}(a) ◦i id_{e(h)}(b).
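gudder's sequential product b ◦ c = b^{1/2} c b^{1/2} is easy to compute for matrix effects. the following numpy/scipy sketch is our own illustration, with a finite-dimensional space standing in for e(h):

```python
import numpy as np
from scipy.linalg import sqrtm

def random_effect(n, rng):
    """a random effect 0 <= a <= i on c^n."""
    x = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    a = x @ x.conj().T                                 # positive semidefinite
    return a / (np.linalg.eigvalsh(a).max() + 1e-12)   # rescale so that a <= i

def seq_product(b, c):
    """gudder's sequential product b o c = b^(1/2) c b^(1/2)."""
    rb = sqrtm(b)
    return rb @ c @ rb

rng = np.random.default_rng(0)
b, c = random_effect(4, rng), random_effect(4, rng)
p = seq_product(b, c)
ev = np.linalg.eigvalsh((p + p.conj().T) / 2)          # symmetrize rounding noise
print(ev.min() >= -1e-10 and ev.max() <= 1 + 1e-10)    # True: b o c is again an effect
print(np.allclose(p, seq_product(c, b)))               # generally False: o is noncommutative
```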
let v(h) be the set of all positive linear operators densely defined in an infinite-dimensional complex hilbert space h, with domain d(b) = h for every bounded operator b. to every such linear operator a with dense domain d(a) there exists the adjoint operator a∗ of a such that d(a∗) = {y ∈ h | there exists y∗ ∈ h such that (y∗, x) = (y, ax) for every x ∈ d(a)} and a∗y = y∗ for every y ∈ d(a∗). if a∗ = a then a is called self-adjoint. an operator a : d(a) → h is called closed if for every sequence (xn)n∈n, xn ∈ d(a), such that xn → x ∈ h and axn → y ∈ h as n → ∞, one has x ∈ d(a) and ax = y. since every a ∈ v(h) is symmetric, there exists a closed operator ā such that a ⊂ ā and ā ⊂ b for every closed operator b extending a. moreover, ā is again symmetric and it is called the closure of a. a symmetric operator a is called essentially self-adjoint if (ā)∗ = ā, and then ā is the unique self-adjoint extension of a (see [1, p. 96]). finally, recall that every a ∈ v(h) has a positive self-adjoint extension â, called friedrichs' extension of a (see, e.g., [16]). moreover, â extends all symmetric extensions a′ of a.

it was shown in [15, theorem 1] that, for any infinite-dimensional complex hilbert space h, there are positive unbounded operators a and b such that a is not essentially self-adjoint and b is not closed. furthermore, let v(h) be equipped with the partial sum ⊕ such that, for any a, b ∈ v(h), the sum a ⊕ b is defined iff either one of a, b is bounded or d(a) = d(b); then we set a ⊕ b = a + b (the usual operator sum). in [17] it was proved that (v(h); ⊕, 0) is a generalized effect algebra.

theorem 4.3 ([17, theorem 2], [18, theorem 7]). for every infinite-dimensional complex hilbert space h and every q ∈ v(h), q ≠ 0, it holds:
(1.) the interval ([0, q]_{v(h)}; ⊕q, 0, q), where a ⊕q b = a + b iff a + b ≤ q for any a, b ∈ [0, q]_{v(h)}, is an effect algebra, and mq = {ωx | x ∈ d(q), (x, qx) > 0} is an ordering set of states on [0, q]_{v(h)}; here the mapping ωx : [0, q]_{v(h)} → [0, 1] ⊆ r is defined for every a ∈ [0, q]_{v(h)} by ωx(a) = (x, ax)/(x, qx).
(2.) the effect algebra ([0, q]_{v(h)}; ⊕q, 0, q) can be embedded into the standard hilbert space effect algebra e(l2(mq)). we denote the respective embedding by ϕq.

therefore we obtain the following theorem: boundedness (self-adjointness, closedness, essential self-adjointness, friedrichs' extension) of operators need not be preserved by effect algebraic isomorphisms.

theorem 4.4. for every infinite-dimensional complex hilbert space h and every q ∈ v(h), q ≠ 0, unbounded (unbounded and non-self-adjoint, unbounded and non-closed, unbounded and not essentially self-adjoint, unbounded and with q̂ ≠ q, respectively), we have an effect algebraic isomorphism ϕq^{−1} : ϕq([0, q]_{v(h)}) → [0, q]_{v(h)} such that ϕq^{−1} does not preserve bounded operators (self-adjoint operators, closed operators, essentially self-adjoint operators, friedrichs' extension, respectively).

proof. clearly, ϕq(q) = i_{l2(mq)} ∈ ϕq([0, q]_{v(h)}). note that i_{l2(mq)} is bounded and positive. it follows that it is also self-adjoint, closed, essentially self-adjoint, and that it coincides with its friedrichs' extension. hence ϕq^{−1}(i_{l2(mq)}) = q; here i_{l2(mq)} is bounded (self-adjoint, closed, essentially self-adjoint, coincides with its friedrichs' extension, respectively) in l2(mq), while q is unbounded (unbounded and non-self-adjoint, unbounded and non-closed, unbounded and not essentially self-adjoint, unbounded and with q̂ ≠ q, respectively).
acknowledgements

the first author acknowledges support from esf project cz.1.07/2.3.00/20.0051 algebraic methods in quantum logic of the masaryk university. the second author was supported by grant vega 1/0297/11 of the ministry of education of the slovak republic and by project apvv-0178/11.

references
[1] blank, j., exner, p., havlíček, m., hilbert space operators in quantum physics (second edition), springer, 2008.
[2] dvurečenskij, a., pulmannová, s., new trends in quantum structures, kluwer, dordrecht, 2000.
[3] foulis, d.j., bennett, m.k., effect algebras and unsharp quantum logics, foundations of physics 24 (1994), 1331–1352.
[4] gudder, s., greechie, r., sequential products on effect algebras, reports of mathematical physics 49 (2002), 87–111.
[5] hedlíková, j., pulmannová, s., generalized difference posets and orthoalgebras, acta math. univ. comenianae lxv (1996), 247–279.
[6] kalmbach, g., riečanová, z., an axiomatization for abelian relative inverses, demonstratio math. 27 (1994), 769–780.
[7] kirchheimová, h., some remarks on (o)-convergence, proc. of the first winter school of measure theory, liptovský ján (1990), 110–113.
[8] kirchheimová, h., riečanová, z., note on order convergence and order topology, appendix b in riečan, b., neubrunn, t., measure, integral and order, ister science (bratislava) and kluwer academic publishers (dordrecht–boston–london), 1997.
[9] kôpka, f., chovanec, f., d-posets, math. slovaca 44 (1994), 21–34.
[10] liu, w. h., wu, j. d., the uniqueness problem of sequence product on operator effect algebra e(h), j. phys. a: math. theor. 42 (2009), 185206.
[11] mosná, k., paseka, j., riečanová, z., order convergence, order and interval topologies on posets and lattice effect algebras, in uncertainty 2008, bratislava: slovak university of technology, publishing house of stu (2008), 45–62.
[12] olejček, v., order convergence and order topology on a poset, int. j. theor. physics 38 (1999), 557–561.
[13] olejček, v., the order topology on a lattice and its macneille completion, int. j. theor. physics 39 (2000), 801–803.
[14] paseka, j., riečanová, z., the inheritance of bde-property in sharply dominating lattice effect algebras and (o)-continuous states, soft computing 15 (2011), 543–555.
[15] paseka, j., riečanová, z., considerable sets of linear operators in hilbert spaces as operator generalized effect algebras, foundations of physics 41 (2011), 1634–1647.
[16] reed, m., simon, b., methods of modern mathematical physics ii: fourier analysis, self-adjointness, academic press, new york–san francisco–london, 1975.
[17] riečanová, z., zajac, m., pulmannová, s., effect algebras of positive linear operators densely defined on hilbert spaces, reports of mathematical physics 68 (2011), 261–270.
[18] riečanová, z., zajac, m., hilbert space effect-representations of effect algebras, reports of mathematical physics 70 (2012), 283–290.
measurement of temperature fields in long span concrete bridges

j. římal

this paper deals with assessing the influence of climate temperatures on deformations and stresses in a cross section of the nusle bridge. the main purpose is to describe the measurement of the thermal fields, to compare measured and computed temperature fields, and to provide a real estimation of the stresses that occur.

keywords: nusle bridge in prague, measurement of thermal fields, calculation of temperature fields, evaluation of stress fields.

1 introduction

the endeavour to improve building technologies has the same importance as the tendency to reduce the mass of structures. this can be done in two main ways:
• by choosing statically efficient structural designs and building technologies,
• by using materials with high strength properties.

modern computation methods enable the exploitation of hidden reserves of carrying capacity, i.e., designs that make full use of the strength of materials. at the same time, these methods lead to certain problems, which are seen in the decreasing durability and lifetime of structures. the main reasons for this lie in the action of the external environment – chemical influences, temperature and moisture. temperature and moisture create corresponding volume changes, followed by the inception of deformations and stresses. the influence of temperature and moisture could be neglected in massive structures, but this is no longer the case for modern buildings. the supplementary state of stress due to volume changes is usually a primary cause of microcracks, and forms the basis for the subsequent process of corrosion and degradation.

2 bridge structure

nusle bridge, the largest prestressed concrete bridge in the czech republic, was built in 1973 (see photo 1). the bridge structure with a full span of 485 m is divided into five spans 68.25 + 3 × 115.50 + 68.25 m.

photo 1

the cross-section sizes are shown in fig. 1. the bridge decking structure, at the time of measurement, is described in tab. 1 and fig. 2.

3 measurement of thermal fields in the bridge decking and bridge structure

a) thermal probes

the special thermal probes were developed on the basis of mt 100 platinum resistance thermometers. the probes conformed to the requirements of accuracy, reliability, durability, safety, waterproofness, and mechanical stress. each sensor was calibrated in the laboratory. the calibration constants were determined in relation to the control of the entire experiment by means of a personal computer.
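the resistance-to-temperature conversion behind such platinum probes is standard, and the per-sensor calibration the text describes can be sketched as follows; the callendar–van dusen coefficients and the linear correction are textbook assumptions, not the paper's laboratory values.

```python
def pt100_temperature(r_ohm, r0=100.0, a=3.9083e-3, b=-5.775e-7):
    """invert the callendar-van dusen relation r(t) = r0*(1 + a*t + b*t^2),
    valid for t >= 0 degc (standard iec 751 coefficients)."""
    disc = a * a - 4 * b * (1 - r_ohm / r0)   # solve b*t^2 + a*t + (1 - r/r0) = 0
    return (-a + disc ** 0.5) / (2 * b)

def calibrated(r_ohm, gain=1.0, offset=0.0):
    """apply a per-sensor laboratory calibration constant (assumed linear)."""
    return gain * pt100_temperature(r_ohm) + offset

print(round(pt100_temperature(103.9025), 2))  # ~10.0 degc for a pt100
```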
b) location of thermal probes; cable line to the measurement centre

the probes were placed approximately in the middle of each traffic line, see fig. 1. the vertical location of the sensors is shown in fig. 2. the grooves for the cable line and the holes for lateral placing of the sensors were made in the bridge decking. other holes for placing the thermal sensors were made in the bridge structure. the cable lines led to the measurement centre, which was placed in the underground station located below one end of the bridge.

c) measurement centre

the measuring sets are capable of telemetering the temperature at the measuring points with simultaneous reading of all measured values and objective recording of the values.

fig. 1: cross section of the bridge and location of thermal probes
fig. 2: structure of the bridge decking and placing of the thermal probes in the cross section

marking   material               thickness [m]
ma        mastic asphalt         0.03
ac        asphalt-concrete       0.04
co        cement overcoat        0.04
hi        hydroinsulation        0.01
pc        prestressed concrete   –

table 1: construction of bridge decking

temperature chart no. 1: measurement of temperature fields on nusle bridge, 17. 3.–22. 3. 2000 (temperature [°c] vs. time; sensors l5a concrete down/side/up/out, p5a concrete down/side, l5a air out/in)
these most important factors are combined into the equivalent sun-temperature tas expressed in equation (1) � �t t i pas a g� � 1 � (1) where ta � ta(t) air temperature ig � ig(t) intensity of global solar radiation � � �(v) heat-transfer coefficient depending on air speed p coefficient of surface absorption t time the value of tas is applied in fourie’s law by expression (2) � �� � � �� � � �n p t n t t� (2) where t � t(x, y, t) temperature field � surface of area where the heat-transfer boundary condition is used n vector of outside normal �n coefficient of thermal conductivity tp surface temperature of structure the time dependent distribution of the external environment in summer time used for computation of an unsteady temperature field is shown in fig. 3 and in temperature chart no. 1, 2, 3 and 4. due to the high heat capacity of the concrete bridge structure it is necessary to use the unsteady temperature field to achieve agreement between the measured and computed values. the temperature loading state is defined as the time dependent progression of the separate temperature states that can be observed in the structure during all seasons of the year. 5 calculation of temperature fields in order to evaluate thermal stresses evaluation it is necessary to know the temperature distribution in the whole area of a cross section. for this reason the experiment is the most important criterion for the accuracy of unsteady temperature © czech technical university publishing house http://ctn.cvut.cz/ap/ 57 acta polytechnica vol. 41 no. 6/2001 air temperatures t e m p e ra tu re [° c ] time [hour] bridge decking air tube fig. 3: time dependent distribution of environment temperature in summer 58 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 41 no. 6/2001 n u s l e b r id g e – m e a s u re m e n t o f t e m p e ra tu re f ie ld s o n n u s le b ri d g e -3 0 0 . -2 0 0 . -1 0 0 . 0 0 0 . 1 0 0 . 2 0 0 . 3 0 0 . 4 0 0 . 5 0 0 . 6 0 0 . 7 0 0 . 8 0 0 . 9 0 0 . 1 0 0 0 . 1 1 0 0 . 7 :0 0 8 :0 0 9 :0 0 1 0 :0 0 1 1 :0 0 1 2 :0 0 1 3 :0 0 1 4 :0 0 1 5 :0 0 1 6 :0 0 1 7 :0 0 1 8 :0 0 1 9 :0 0 t im e temperature[°c] a ir re in fo rc e d c o n c re te c e m e n t o v e rc o a t a s p h a lt c o n c re te m a s ti c a s p h a lt t em p er at u re ch ar t n o. 2 © czech technical university publishing house http://ctn.cvut.cz/ap/ 59 acta polytechnica vol. 41 no. 6/2001 n u s l e b r id g e m e a s u re m e n ts o f t e m p e ra tu re f ie ld s o n n u s le b ri d g e 1 . 5 . 4 . 5 . 2 0 0 0 1 0 0. 1 2 0. 1 4 0. 1 6 0. 1 8 0. 2 0 0. 2 2 0. 2 4 0. 2 6 0. 00:00:00 04:00:00 08:00:00 12:00:00 16:00:00 20:00:00 00:00:00 04:00:00 08:00:00 12:00:00 16:00:00 20:00:00 00:00:00 04:00:00 08:00:00 12:00:00 16:00:00 20:00:00 00:00:00 04:00:00 08:00:00 12:00:00 16:00:00 20:00:00 t im e temperature[°c] l 5 a c o n c r e t e d o w n 1 5 [° c ] l 5 a c o n c r e t e s id e 1 6 [° c ] l 5 a c o n c r e t e u p 1 3 [° c ] l 5 a c o n c r e t e o u t c [° c ] l 5 a c o n c r e t e o u t a [° c ] p 5 a c o n c r e t e d o w n 1 0 [° c ] p 5 a c o n c r e t e s id e 1 4 [° c ] l 5 a a ir o u t 2 0 [° c ] l 5 a a ir in 1 8 [° c ] t em p er at u re ch ar t n o. 3 field computation. 
the distribution of the temperature field in the structures is described by fourier’s law for two dimensional solution by the following equation � � � � � � � � � � � � �x t x y t y c t t x y � � � � � � � � (3) where �x, �y coefficient of thermal conductivity � density c heat capacity the heat transfer between the environment and the surface of the structure is described in boundary condition (2). it is neccessary to know the initial condition in time t � t0 to start the solution of equation (3). in accordance to with cyclic character of the boundary condition we determine the initial condition from the steady temperature state. after finishing 1 or 2 one-day cycles of computations, the influence of any inaccurate choice of initial condition totally disappears. equation (3) is solved by the finite element method. this process is advantageous, because the discretisation of the cross section into finite elements (fig. 4) also remains for the computation of thermal deformations and stresses. the time integration passes by the incremental method by means of the cranck-nicholson scheme. the thermotechnical properties of each material used in the bridge structure are described in tab. 2. 6 evaluation of computations and comparison with experiment the agreement between measured and evaluated temperature distributions at each measured point is the criterion for the truthfulness at the model of the environment and for the suseement computation. fig. 5a, b show the distribution of isothermal lines in the cross section at two moments in the 24-hour cycle. 60 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 41 no. 6/2001 fig. 4: discretisation of the cross section into the finite elements number material � [w/mk] c [j/kgk] � [kg/m3] 1 mastic asphalt 1.55 650 2400 2 asphalt-concrete 1.40 650 2400 3 cement overcoat 1.00 900 2200 4 hydroinsulation 0.70 900 2200 5 prestressed concrete 1.71 960 2500 table 2: thermotechnical properties of materials the first aim of this paper is to compare the experimentally measured and computed time dependent temperature distributions, see fig. 6a, b, c (temperatures in the bridge decking) and fig. 7 (temperatures in the bridge structure). we can see from the fig. 6, 7 that the differences between the computation and the experiment are negligible for our purposes. the sources of the deviations are as follows: � the heat-transfer coefficient is time mutable � the intensity of the solar radiation is locally suppressed due to cloud movements � the real shape of the cross section differs slightly from our theoretical model. this means that the evaluated temperature field can be used as an objective basis for computing the thermoelastic stresses. © czech technical university publishing house http://ctn.cvut.cz/ap/ 61 acta polytechnica vol. 41 no. 6/2001 a) b) fig. 5: isothermal lines: a) 6 a. m., b) 12 a. m. 62 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 41 no. 6/2001 n u s l e b r id g e m e a s u re m e n ts o f t e m p e ra tu re f ie ld s o n n u s le b ri d g e 2 5 . 5 . 2 9 . 5 . 2 0 0 0 1 2 0. 1 4 0. 1 6 0. 1 8 0. 2 0 0. 2 2 0. 2 4 0. 2 6 0. 
temperature chart no. 4: measurement of temperature fields on nusle bridge, 25. 5.–29. 5. 2000 (sensors as in chart no. 1)

fig. 7: comparison of measured and computed temperatures in the bridge structure
fig. 6: comparison of measured and computed temperatures: a) outside traffic line, b) middle traffic line, c) inside traffic line

7 evaluation of stress fields

the second aim of this paper is to evaluate the stress fields. the two-dimensional fields of deformations and stresses are computed on the same perpendicular network of finite elements as the temperature fields. the bridge structure is loaded by the volume changes caused by the previous temperature states. the physical properties of the materials used for the computation are shown in tab. 3. it should be mentioned that a value of 10 °c was chosen for the assembling temperature.

number   material               e [mpa]   ν [–]   αt [1/k]
1        mastic asphalt         200       0.30    0.000050
2        asphalt-concrete       500       0.15    0.000020
3        cement overcoat        20000     0.15    0.000012
4        hydroinsulation        70        0.30    0.000050
5        prestressed concrete   36000     0.15    0.000010

table 3: physical properties of materials

the following figures (8a, b) demonstrate an important factor influencing the durability and reliability of structures. the cyclic temperature changes produce a corresponding response in the thermal stresses. it is evident that the compressive stresses change into tensile stresses and vice versa in one 24-hour cycle. in fig. 8a, b this fact is noticeable, especially on the outside surface of the vertical wall, where the values of the tensile stresses are in the range from 0.46 mpa to 0.80 mpa at 6 a.m., while at 12 a.m. we observe compressive stresses in the range from 0.3 mpa to 0.60 mpa in the same place.

fig. 8: longitudinal axial stresses in the cross section: a) 6 a.m., b) 12 a.m.

the absolute peaks of the stresses that are reached may not be at the level of the material strength, but due to the cyclic character of the loading (daily and annual) the fatigue strength can be exceeded in this place, and so become a potential cause of structural failure.
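the order of magnitude of these stresses already follows from the fully restrained thermoelastic estimate σ = e·αt·Δt with the concrete values of tab. 3; full restraint is our simplifying assumption.

```python
# fully restrained thermal stress: sigma = E * alpha_t * dT
# prestressed concrete values from tab. 3: E = 36000 MPa, alpha_t = 1.0e-5 / K
E_MPA, ALPHA_T = 36000.0, 1.0e-5

def thermal_stress(delta_t_kelvin):
    """stress in MPa for a fully restrained temperature change."""
    return E_MPA * ALPHA_T * delta_t_kelvin

for dT in (1.3, 2.2):
    print(f"dT = {dT} K -> sigma = {thermal_stress(dT):.2f} MPa")
# a swing of only ~1-2 K already spans the 0.46-0.80 MPa tensile range
# reported above for the outside surface of the vertical wall
```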
acknowledgement

this project is being carried out with the participation of undergraduate students jan pytel, jakub římal and jana zaoralová. this research has been supported by the grant agency of the czech republic under grant no. 103/00/0776.

doc. rndr. jaroslav římal, drsc.
phone: +420 2 2435 4702
fax: +420 2 3333 3226
e-mail: rimal@fsv.cvut.cz
czech technical university
faculty of civil engineering
thákurova 7
166 29, prague 6, czech republic

acta polytechnica 54(2):113–115, 2014, doi:10.14311/ap.2014.54.0113, © czech technical university in prague, 2014, available online at http://ojs.cvut.cz/ojs/index.php/ap

on the real matrix representation of pt-symmetric operators

francisco m. fernández

inifta (unlp, cct la plata-conicet), blvd. 113 y 64 s/n, sucursal 4, casilla de correo 16, 1900 la plata, argentina
correspondence: fernande@quimica.unlp.edu.ar

abstract. we discuss the construction of real matrix representations of pt-symmetric operators. we show the limitation of a general recipe presented some time ago for non-hermitian hamiltonians with antiunitary symmetry and propose a way to overcome it. our results agree with earlier ones for a particular case.

keywords: non-hermitian hamiltonians, antiunitary symmetry, pt-symmetry, real matrix.

1. introduction

at first sight it is surprising that a subset of eigenvalues of a complex-valued non-hermitian operator ĥ can be real (see [1] and references therein). in order to provide a simple and general explanation of this fact, bender et al. [2] showed that it is possible to construct a basis set of vectors so that the matrix representation of such an operator is real. as a result the secular determinant is real (the coefficients of the characteristic polynomial are real) and its roots are either real or appear in pairs of complex conjugate numbers. the argument is based on the existence of an antiunitary symmetry âĥâ^{−1} = ĥ, where the antiunitary operator â satisfies â^{2k} = 1̂ for k odd. bender et al. [2] showed some illustrative examples of their general result. the procedure followed by bender et al. [2] for the construction of the suitable basis set is reminiscent of the one used by porter [3] in the study of matrix representations of hermitian operators. however, the ansatz proposed by porter appears to be somewhat more general.

the purpose of this paper is to analyse the argument given by bender et al. [2] in more detail. in section 2 we outline the main features of an antiunitary or antilinear operator and in section 3 we briefly discuss the concept of antiunitary symmetry. in section 4 we review the argument given by bender et al.
Acta Polytechnica 54(2):113–115, 2014, doi:10.14311/ap.2014.54.0113
© Czech Technical University in Prague, 2014

On the real matrix representation of PT-symmetric operators

Francisco M. Fernández
INIFTA (UNLP, CCT La Plata-CONICET), Blvd. 113 y 64 S/N, Sucursal 4, Casilla de Correo 16, 1900 La Plata, Argentina
Correspondence: fernande@quimica.unlp.edu.ar

Abstract. We discuss the construction of real matrix representations of PT-symmetric operators. We show the limitation of a general recipe presented some time ago for non-Hermitian Hamiltonians with antiunitary symmetry and propose a way to overcome it. Our results agree with earlier ones for a particular case.

Keywords: non-Hermitian Hamiltonians, antiunitary symmetry, PT-symmetry, real matrix.

1. Introduction

At first sight it is surprising that a subset of eigenvalues of a complex-valued non-Hermitian operator Ĥ can be real (see [1] and references therein). In order to provide a simple and general explanation of this fact, Bender et al. [2] showed that it is possible to construct a basis set of vectors in which the matrix representation of such an operator is real. As a result the secular determinant is real (the coefficients of the characteristic polynomial are real) and its roots are either real or appear in pairs of complex conjugate numbers. The argument is based on the existence of an antiunitary symmetry ÂĤÂ⁻¹ = Ĥ, where the antiunitary operator Â satisfies Â²ᵏ = 1̂ for k odd. Bender et al. [2] showed some illustrative examples of their general result.

The procedure followed by Bender et al. [2] for the construction of the suitable basis set is reminiscent of the one used by Porter [3] in the study of matrix representations of Hermitian operators. However, the ansatz proposed by Porter appears to be somewhat more general.

The purpose of this paper is to analyse the argument given by Bender et al. [2] in more detail. In Section 2 we outline the main features of an antiunitary or antilinear operator, and in Section 3 we briefly discuss the concept of antiunitary symmetry. In Section 4 we review the argument given by Bender et al. [2] and show that under certain conditions it does not apply. We illustrate this point by means of the well-known harmonic-oscillator basis set and show how to overcome that shortcoming. In Section 5 we discuss the harmonic-oscillator basis set in more detail, and in Section 6 we draw conclusions.

2. Antiunitary operator

As already mentioned above, a wide class of non-Hermitian Hamiltonians with unbroken PT symmetry exhibits real spectra [1]. In general, they are invariant under an antilinear or antiunitary transformation of the form Â⁻¹ĤÂ = Ĥ. The antiunitary operator Â satisfies [4]

Â(|f⟩ + |g⟩) = Â|f⟩ + Â|g⟩,   Â c|f⟩ = c* Â|f⟩,   (1)

for any pair of vectors |f⟩ and |g⟩ and an arbitrary complex number c, where the asterisk denotes complex conjugation. This definition is equivalent to

⟨Âf|Âg⟩ = ⟨f|g⟩*.   (2)

One can easily derive the pair of equations (1) from (2), so that the latter can be considered the actual definition of an antiunitary operator [4].

If K̂ is an antilinear operator such that K̂² = 1̂ (for example, the complex-conjugation operator), then it follows from (2) that ÂK̂ = Û is unitary (Û† = Û⁻¹); that is to say, the inner product ⟨f|g⟩ remains invariant under Û:

⟨ÂK̂f|ÂK̂g⟩ = ⟨K̂f|K̂g⟩* = ⟨f|g⟩.   (3)

In other words, any antilinear operator Â can be written as a product of a unitary operator and the complex-conjugation operation [4]. In exactly the same way we can easily prove that Â²ʲ is unitary and Â²ʲ⁺¹ antiunitary.

In their discussion of real matrix representations of non-Hermitian Hamiltonians, Bender et al. [2] considered Hamiltonians Ĥ with the antiunitary symmetry

ÂĤÂ⁻¹ = Ĥ,   (4)

where Â satisfies the additional condition

Â²ᵏ = 1̂,  k odd.   (5)

Since B̂ = Âᵏ is antiunitary and satisfies B̂² = 1̂, we can restrict the discussion to the case k = 1 without loss of generality. Therefore, from now on we substitute the condition

Â² = 1̂   (6)

for the apparently more general equation (5). From now on we refer to equation (4) as A-symmetry and to the operator Ĥ as A-symmetric for short.

3. Antiunitary symmetry

It follows from the antiunitary invariance (4) that [Ĥ, Â] = 0. Therefore, if |ψ⟩ is an eigenvector of Ĥ with eigenvalue E,

Ĥ|ψ⟩ = E|ψ⟩,   (7)

we have

[Ĥ, Â]|ψ⟩ = ĤÂ|ψ⟩ − ÂĤ|ψ⟩ = ĤÂ|ψ⟩ − E*Â|ψ⟩ = 0.   (8)

This equation tells us that if |ψ⟩ is an eigenvector of Ĥ with eigenvalue E, then Â|ψ⟩ is also an eigenvector with eigenvalue E*. That is to say, the eigenvalues are either real or appear as pairs of complex conjugate numbers. In the former case

ĤÂ|ψ⟩ = EÂ|ψ⟩,   (9)

which contains the condition of unbroken symmetry [1],

Â|ψ⟩ = λ|ψ⟩,   (10)

as a particular case. Note that equation (9) also applies to the case in which Â|ψ⟩ is a linear combination of degenerate eigenvectors of Ĥ with eigenvalue E. An illustrative example of this more general condition for real eigenvalues is given elsewhere [5].

4. Real matrix representation

Bender et al. [2] put forward a straightforward procedure for obtaining a basis set in which an A-symmetric Hamiltonian has a real matrix representation. They proved that for an A-adapted basis set {|nA⟩} with

Â|nA⟩ = |nA⟩   (11)

the matrix elements of the invariant Hamiltonian operator are real:

⟨mA|Ĥ|nA⟩ = ⟨mA|Ĥ|nA⟩*.   (12)

These authors proposed to construct |nA⟩ as (remember that we have restricted the present discussion to k = 1 without loss of generality)

|nA⟩ = |n⟩ + Â|n⟩,   (13)

where {|n⟩} is any orthonormal basis set.
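The reality condition (12) is stated here without proof; it follows in one line from the definition (2), the symmetry (4) and the A-adapted property (11):

\[
\langle m_A|\hat H|n_A\rangle^{*} \overset{(2)}{=} \langle \hat A m_A|\hat A\hat H n_A\rangle \overset{(4)}{=} \langle \hat A m_A|\hat H \hat A n_A\rangle \overset{(11)}{=} \langle m_A|\hat H|n_A\rangle .
\]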
It is not difficult to prove that this recipe does not apply to every basis set. According to equation (6) we can find a basis set {|n,σ⟩} that satisfies

Â|n,σ⟩ = σ|n,σ⟩,  σ = ±1.   (14)

Consequently, all the vectors

|n,σ⟩A = |n,σ⟩ + Â|n,σ⟩ = (1 + σ)|n,σ⟩   (15)

with σ = −1 vanish, and the resulting A-adapted vector set is not complete. We conclude that the basis set {|n⟩} should be chosen carefully in order to apply the recipe of Bender et al. [2]. In fact, the authors showed a particular example where it certainly applies.

We can construct the basis set {|n,σ⟩} from any orthonormal basis set {|n⟩} in the following way:

|n,σ⟩ = N(n,σ) Q̂σ|n⟩,  Q̂σ = ½(1 + σÂ),   (16)

where N(n,σ) is a suitable normalization factor. It already satisfies equation (14) because ÂQ̂σ = σQ̂σ. In order to overcome the shortcoming in the recipe (13), we define the A-adapted basis set BA = {|n+A⟩ = |n,1⟩, |n−A⟩ = i|n,−1⟩}. Note that the vectors |n±A⟩ satisfy the requirement (11) and that BA is complete. In principle there is no guarantee of orthogonality, but such a difficulty does not arise in the examples discussed below. The vectors |v±n⟩ = (1/√2)(|n+A⟩ ± |n−A⟩) also satisfy the requirement (11), and in the particular case of a two-dimensional space they lead to the A-adapted basis set chosen by Bender et al. [2] to introduce the issue by means of a simple example.

As a particular case consider the parity-time antiunitary operator Â = P̂T̂, where P̂ and T̂ are the parity and time-reversal operators, respectively [3]. Let {|n⟩, n = 0, 1, …} be the basis set of eigenvectors of the harmonic oscillator Ĥ0 = p̂² + x̂², which are real and satisfy P̂|n⟩ = (−1)ⁿ|n⟩, so that Â|n⟩ = (−1)ⁿ|n⟩. It is clear that the recipe (13) does not apply to this simple case. On the other hand, the present recipe yields the A-adapted basis set BA(HO) = {|2n⟩, i|2n+1⟩, n = 0, 1, …}, which is obviously complete.

Every vector of the orthonormal basis set BA(HO) in the coordinate representation can be expressed as a linear combination of the elements of the nonorthogonal basis set {fn(x) = e^(−x²/2)(ix)ⁿ, n = 0, 1, …}. By means of a slight generalization of the latter, Znojil [6] derived a recurrence relation with real coefficients for a family of complex anharmonic potentials. He also constructed a real matrix representation of a PT-symmetric oscillator in terms of the eigenvectors of Â = P̂T̂ [7]. Note that his vectors |sn⟩ and |ln⟩ are our |n,1⟩ and |n,−1⟩, respectively.

Following Porter [3] we can try the ansatz

|nA⟩ = an|n⟩ + Âan|n⟩ = an|n⟩ + an*Â|n⟩,   (17)

which already satisfies Â|nA⟩ = |nA⟩. This definition of an A-adapted basis set is slightly more general than equation (13). When Â|n⟩ = (−1)ⁿ|n⟩ we simply choose an = ½(1 + i) and obtain the result above for the particular case of the harmonic-oscillator basis set. Note that the resulting expressions (we can also choose an = ½(1 − i)) are similar to those in equation (16) in the paper of Bender et al. [2].

5. The harmonic-oscillator basis set

Many examples of PT-symmetric Hamiltonians are one-dimensional models of the form [1]

Ĥ = p̂² + V(x),   (18)

where

V(−x)* = V(x).   (19)

We can write V(x) as the sum of its even, Ve(−x) = Ve(x), and odd, Vo(−x) = −Vo(x), parts:

V(x) = Ve(x) + Vo(x),   (20)

where

Ve(x) = ½[V(x) + V(−x)],  Vo(x) = ½[V(x) − V(−x)].   (21)
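A minimal numerical sketch of the construction above: in the transformed harmonic-oscillator basis {|2n⟩, i|2n+1⟩} the matrix of a Hamiltonian with Â = P̂T̂ symmetry becomes real. The anharmonic term iλx³ used here is an illustrative PT-symmetric choice satisfying (19), not a model taken from the paper.

```python
import numpy as np

N = 30                                    # basis truncation
n = np.arange(N - 1)
a = np.diag(np.sqrt(n + 1.0), k=1)        # annihilation operator in HO basis
x = (a + a.T) / np.sqrt(2.0)              # position operator
p = 1j * (a.T - a) / np.sqrt(2.0)         # momentum operator

lam = 0.1
H = p @ p + x @ x + 1j * lam * x @ x @ x  # PT-symmetric: V(-x)* = V(x)

# A-adapted basis: keep |2n>, multiply odd states by i -> D = diag(1, i, 1, i, ...)
D = np.diag(1j ** (np.arange(N) % 2))
H_real = D.conj().T @ H @ D

print(np.max(np.abs(H_real.imag)))        # ~1e-16: the representation is real
ev = np.linalg.eigvals(H_real.real)       # eigenvalues real or conjugate pairs
```

In line with Section 3, the eigenvalues of the (real) matrix come out either real or in complex-conjugate pairs.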
Acta Polytechnica 53(Supplement):621–625, 2013
The impact of Suzaku measurements on astroparticle physics
Naomi Ota

Even at energies above 10 keV, the X-ray emission from hot clusters is still dominated by thermal bremsstrahlung emission [26, 35]. Thus detailed characterization of the thermal emission spectrum, including the multi-temperature components, is indispensable for separating the non-thermal emission from the thermal emission. As discovered in RX J1347.5−1145 [26], the hard X-ray emission in the Suzaku/Hard X-ray Detector (HXD) band originates predominantly from the super-hot (kT ∼ 25 keV) gas in the cluster. Such violent mergers often produce shock-heated gas with high (≳ 10 keV) temperature. This needs to be properly taken into account in the thermal modeling in order to study high-energy populations in the intracluster space.

2. Objectives

We search for non-thermal hard X-ray emission from merging clusters using Suzaku broad-band X-ray spectroscopy to reveal the origin of the hard X-ray emission and to obtain new insight into the gas physics and the cluster dynamical evolution. We also estimate the magnetic fields in clusters by comparing the hard X-ray and radio fluxes. Finally, we compare the Suzaku results with those from other satellites, summarize the present status of hard X-ray studies of clusters, and briefly discuss the future prospects.

We adopt a cosmological model with Ωm = 0.27, ΩΛ = 0.73, and the Hubble constant H0 = 70 km s⁻¹ Mpc⁻¹ throughout the paper. Quoted errors indicate the 90 % confidence intervals, unless otherwise specified.
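For reference, the adopted cosmology fixes the luminosity distances used to convert fluxes into luminosities; a minimal sketch with astropy, evaluated at the redshifts of the two targets introduced in the next section:

```python
from astropy.cosmology import FlatLambdaCDM

# Cosmology adopted in the paper: Om = 0.27, OL = 0.73, H0 = 70 km/s/Mpc
cosmo = FlatLambdaCDM(H0=70.0, Om0=0.27)

for name, z in [("A2163", 0.203), ("Bullet cluster", 0.296)]:
    d_l = cosmo.luminosity_distance(z)
    print(f"{name}: z = {z}, D_L = {d_l:.0f}")
# A flux F then converts to a luminosity via L = 4 * pi * D_L**2 * F.
```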
3. Sample and the Suzaku data

The 5th Japanese X-ray satellite, Suzaku [23], is equipped with X-ray CCD imaging spectrometers (XIS; [20]) and the HXD [33], and enables sensitive broad-band X-ray observations thanks to its low and stable background levels. In particular, the PIN detectors of the HXD instrument are useful for the present study because they have achieved a lower background level than other missions in the 10 ÷ 60 keV range [9].

In order to constrain the hard X-ray properties of merging clusters, we conducted Suzaku observations of several targets. We focus on X-ray bright sources with radio synchrotron halos, particularly A2163 at z = 0.203 and the Bullet cluster at z = 0.296, since they are considered to have undergone a recent, violent merger. The Suzaku exposure times are 113 ksec and 41 ksec for the A2163 south and north-east regions, and 80 ksec for the Bullet cluster.

Figure 1 shows the XMM-Newton image of A2163 with the Suzaku/HXD fields of view for the two pointing observations overlaid. A2163 is the brightest Abell cluster, hosting one of the brightest radio halos [7, 13]. Previous X-ray observations showed the presence of a high-temperature region in the north-east, which can be attributed to the merger shock [3, 21]. At hard X-ray energies, [30] reported the detection of non-thermal IC emission with a flux level of F(NT) = 1.1 (+1.7/−0.9) × 10⁻¹¹ erg s⁻¹ cm⁻² in the 20 ÷ 80 keV band from the long RXTE observations, while [6] derived an upper limit of F(NT) < 5.6 × 10⁻¹² erg s⁻¹ cm⁻² from the BeppoSAX/PDS observations. The Suzaku observations thus provide further independent measurements of the hard X-ray properties of the sample.

Figure 1: XMM-Newton image of A2163 (grayscale) with the Suzaku HXD fields of view (34′ × 34′ in FWHM) for the two pointing observations (the solid boxes) overlaid. The XMM-Newton integration region for the global spectrum is indicated with the dashed circle.

The Bullet cluster exhibits extended radio halo emission [16] and a prominent strong shock feature [22]. The hottest gas, with kT > 20 keV, exists in the region of the radio halo enhancement. Possible detections of non-thermal hard X-ray emission have been reported; the non-thermal flux in the 20 ÷ 100 keV band is (0.5 ± 0.2) × 10⁻¹¹ erg s⁻¹ cm⁻² by RXTE [28] and 3.4 (+1.1/−1.0) × 10⁻¹² erg s⁻¹ cm⁻² by Swift [2]. Thus it is worth examining the existence of IC emission with a detailed Suzaku analysis.

4. Analysis strategy

To accurately measure the non-thermal X-ray emission from clusters at hard X-ray energies (> 10 keV), we need: (1.) a careful assessment of the background components, and (2.) detailed modeling of the thermal emission.

For 1., the Suzaku HXD background consists of the cosmic X-ray background (CXB) and the instrumental non-X-ray background (NXB), and is dominated by the latter. We estimate the systematic error of the NXB to be 2 % by analyzing the HXD data during Earth occultation, and quote the 2 % systematic error (1σ) in the spectral analysis of A2163 and the Bullet cluster. This is consistent with Suzaku-memo 2008-03 by the instrument team (http://www.astro.isas.jaxa.jp/suzaku/doc/suzakumemo/suzakumemo-2008-03.pdf).

For 2., since merging clusters can have a complex, multi-temperature structure, including a very hot thermal gas that emits hard X-rays, we apply single-, two-, and multi-temperature thermal models to the Suzaku spectra. As detailed later, the multi-temperature model is constructed based on the analysis of XMM-Newton or Chandra data. The Suzaku and XMM-Newton/Chandra joint analysis allows us to take advantage of Suzaku's spectral sensitivity in the wide X-ray band and XMM-Newton/Chandra's high spatial resolution, as demonstrated by [26].

5. Results

5.1. A2163

With Suzaku/HXD, we detected X-ray emission from the cluster up to 50 keV. With the CXB and NXB components subtracted, the 12 ÷ 60 keV source flux is measured to be 1.52 ± 0.06 (±0.28) × 10⁻¹¹ erg s⁻¹ cm⁻². Here the first and second errors indicate the 1σ statistical and systematic uncertainties. The 1σ systematic error of the flux is estimated by changing the normalization of the NXB model by ±2 %. Thus the detection of hard X-ray emission is significant at the > 5σ level even when the systematic error of the NXB is considered.

Next we performed joint XMM + Suzaku/HXD spectral analyses under i) the single-temperature model, ii) the two-temperature model, and iii) the multi-temperature (multi-T) model. For i) and ii), the XMM-Newton pn and MOS spectra were accumulated from the r = 10′ circular region shown in Fig. 1. For i), the XMM + Suzaku broad-band spectra in the 0.3 ÷ 60 keV band can be fitted by the apec thermal emission model: the resultant gas temperature and metal abundance are kT = 13.5 ± 0.5 keV and Z = 0.29 ± 0.10 solar, respectively, and χ²/d.o.f. is 1249/1180. Thus the hard X-ray emission is likely to be dominated by the thermal emission. For ii), we found no significant improvement of the fit in comparison with case i). However, the previous X-ray observations clearly show a complex temperature structure, and we therefore attempted to construct the multi-T model for the thermal emission. For iii), the 10′ spectral region of the XMM data is divided into 2′ × 2′ grids and a single-component apec model is assumed for each grid.
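To see why the hottest component matters, a toy model helps: summing thermal bremsstrahlung continua with emissivity roughly ∝ T^(−1/2) e^(−E/kT) shows that the relative contribution of a small super-hot admixture grows steadily with energy, echoing the ∼15 % hard X-ray contribution quoted below. This is a schematic illustration only (constant Gaunt factor, arbitrary normalizations), not the apec modeling used in the paper.

```python
import numpy as np

def brems(E_keV, kT_keV, norm):
    """Toy thermal bremsstrahlung continuum (constant Gaunt factor)."""
    return norm * kT_keV**-0.5 * np.exp(-E_keV / kT_keV)

E = np.geomspace(1.0, 60.0, 300)       # keV
bulk = brems(E, 13.5, norm=1.0)        # bulk ICM component
hot  = brems(E, 18.0, norm=0.15)       # small super-hot admixture

frac = hot / (bulk + hot)              # fractional contribution of the hot gas
print(f"hot-gas fraction at 10 keV: {frac[np.argmin(abs(E - 10))]:.2f}")  # ~0.14
print(f"hot-gas fraction at 50 keV: {frac[np.argmin(abs(E - 50))]:.2f}")  # ~0.25
```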
Figure 2 shows the total model of the multi-T components (namely the sum of the single apec models for the grids) fit to the Suzaku/HXD data. We find that this multi-T model gives an acceptable fit to the HXD spectrum. Note that the model includes a kT ∼ 18 keV apec component for the north-east "shock" region, which is in agreement with the previous XMM result [3]. The X-ray luminosity of this hot component is 5 × 10⁴⁴ erg s⁻¹, and it contributes about 15 % of the hard X-ray emission.

In case iii), an additional power-law component for non-thermal IC emission does not significantly improve the fit to the observed HXD spectrum. Assuming the multi-T + power-law model with Γ = 2.18 (i.e., the same index as in the radio [7]), the 90 % upper limit on the non-thermal emission is derived as 9.4 × 10⁻¹² erg s⁻¹ cm⁻². This constraint is 3 times stronger than the upper limit obtained from RXTE [30], though the present result is consistent with RXTE and BeppoSAX [6] within their errors. More details of the data reduction and discussion will be described in a forthcoming paper (N. Ota et al., in preparation).

Figure 2: Upper panel: the Suzaku/HXD spectrum of A2163 (crosses) fitted with the multi-T + power-law model (solid step function); the multiple temperature components and the non-thermal power-law model are indicated by dotted lines. Bottom panel: the residuals of the fit (χ against energy in keV).

5.2. The Bullet cluster

Following methods similar to those in § 5.1, we analyzed the Suzaku and Chandra data of the Bullet cluster. Our preliminary analysis showed that the global cluster spectrum taken with Suzaku is well represented by a two-component thermal model. The non-thermal emission, however, is not significantly detected, yielding a 90 % upper limit on a Γ = 1.5 power-law component of ∼ 10⁻¹¹ erg s⁻¹ cm⁻² (20 ÷ 100 keV). Note that the derived upper limit depends on the assumed photon index. The observed broad-band Suzaku spectrum in the 1 ÷ 50 keV band can also be fitted by the multi-T model. Here the multi-T model is constructed by analyzing the deep Chandra observations. We did not confirm the previous IC detections with RXTE and Swift, and the present results support the thermal origin of the hard X-ray emission. Under the multi-T model, the flux of the hot (kT > 13 keV) thermal component is estimated to be F(hot) ∼ 2 × 10⁻¹² erg s⁻¹ cm⁻², comparable to the non-thermal flux reported by [2], while the IC flux is suggested to be even lower than F(hot). More details of the analysis, including an assessment of the systematic uncertainty due to the Suzaku–Chandra cross-calibration, will be presented in Nagayoshi et al., in preparation.

6. Discussion

6.1. Non-thermal hard X-ray emission

For the two hot clusters, A2163 and the Bullet cluster, the hard X-ray spectra taken with Suzaku are likely to be dominated by thermal emission, giving stronger limits on the non-thermal IC emission. We also note that a super-hot (∼ 20 keV) gas exists in both clusters, whose contribution is not negligible in the hard X-ray band. Thus this component should be properly modeled in order to accurately measure the non-thermal component. Furthermore, this hot gas is over-pressured and thus thought to be short-lived, ∼ 0.5 Gyr [26]. The existence of super-hot gas therefore supports the scenario of a recent merger.
Given the link between the radio flux and the X-ray luminosity (see § 1), the generation of high-energy particles in intracluster space is likely to be connected with merging events. Is there, then, any relationship between L(X) and the IC flux (F(NT)), or between the gas temperature (kT) and F(NT)? With Suzaku, non-thermal hard X-ray emission has been constrained in about ten clusters. In Fig. 3, F(NT) is plotted as a function of kT. The Suzaku results for nearby clusters as well as RX J1347.5−1145 were taken from the literature [8, 17–19, 25, 26, 32, 35]. The RXTE/BeppoSAX/Swift results [1, 2, 15, 36] are also quoted for comparison. Note that the non-thermal flux depends on the modeling of the thermal component as well as on the assumed photon index of the non-thermal component. With Suzaku, non-thermal emission has not been detected, and upper limits were derived for the nearby objects, including the Coma cluster, A2319, etc. Hence there is no clear relation in the F(NT)–T plane.

Figure 3: Non-thermal IC hard X-ray flux as a function of kT in ten clusters measured by Suzaku (labelled Cen, A2199, A3376, A3667, A2256, Coma, A2319, Ophiuchus, 1E0657, RX J1347 and A2163). The results from RXTE, BeppoSAX, and Swift are also shown for comparison. See § 6 for references.

6.2. Cluster magnetic field

Using the observed radio flux, S(syn), and the relation S(IC)/S(syn) = u(CMB)/u(B) [15, 27], we can infer the strength of the cluster magnetic field. In the case of A2163, adopting the radio flux S(syn) = 155 mJy at 1.4 GHz [7] and the IC upper limit of S(IC) < 0.26 µJy at 12 keV from this work, we obtain B > 0.09 µG for the multi-T + power-law (Γ = 2.18) model. For the Bullet cluster, the magnetic field is estimated to be B > 0.06 µG under the two-temperature apec + power-law (Γ = 1.5) model. Thus a lower limit of the order of ∼ 0.1 ÷ 1 µG has been obtained for most of the clusters observed with Suzaku.
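Schematically, the field limit follows from the energy-density form of the IC/synchrotron ratio quoted above; with u(B) = B²/8π and the CMB energy density scaling as (1+z)⁴,

\[
\frac{S_{\mathrm{IC}}}{S_{\mathrm{syn}}} = \frac{u_{\mathrm{CMB}}}{u_B}, \qquad u_B = \frac{B^2}{8\pi}, \qquad u_{\mathrm{CMB}}(z) = u_{\mathrm{CMB},0}\,(1+z)^4 ,
\]

so an upper limit on S(IC) translates into a lower limit on B once the monochromatic radio and hard X-ray fluxes are referred to the same electron population through the standard synchrotron/IC frequency mapping; see [15, 27] for the full expressions actually used.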
7. Summary

Suzaku broad-band X-ray observations of hot clusters hosting a bright radio synchrotron halo, A2163 and the Bullet cluster, were performed in order to search for the non-thermal hard X-ray component and to reveal the origin of the high-energy emission and the cluster dynamical evolution. The Suzaku/HXD spectra of these two clusters are well represented by single- or two-temperature thermal models. Since merging clusters can often host shock-heated, super-hot (kT ∼ 20 keV) gas whose contribution is not negligible in the interpretation of the hard X-ray origin, it is essential to determine the thermal component with high accuracy. This motivated us to carry out the joint Suzaku + XMM-Newton/Chandra spectral analysis under the multi-T model. As a result, the observed Suzaku spectra are consistent with the multi-T models, and we did not find any significant non-thermal IC emission from A2163 or the Bullet cluster.

The properties of hard X-rays and the cluster magnetic fields have been studied for about ten clusters so far. There is no significant detection of IC emission by Suzaku, giving upper limits at the level of ∼ 10⁻¹¹ erg s⁻¹ cm⁻². Thus no clear relationship between the non-thermal X-ray and radio properties, or in the IC luminosity (L(IC))–T plane, is seen.

To further constrain the physics of gas heating and particle acceleration in clusters, it is important to increase the number of cluster samples by applying the present analysis method to other clusters. Given that the relativistic particles may be localized in small shock regions, present instruments with limited imaging capability may fail to identify their existence. If this is the case, the advent of hard X-ray imagers will benefit cluster studies. NuSTAR [14] is now successfully in orbit and taking focused images of the high-energy X-ray sky. The ASTRO-H satellite is scheduled to be launched in 2014 [34] and will carry state-of-the-art instruments to realize sensitive X-ray observations in the 0.3 ÷ 600 keV band. The hard X-ray imagers on ASTRO-H will have an effective area comparable to NuSTAR. These new hard X-ray imagers will enable more accurate modeling of the high-temperature thermal component and identification of the particle-acceleration sites with higher signal-to-noise ratios. This will lead us to detect IC emission down to a level ∼ 2 orders of magnitude lower than the present limit. In addition, the high spectral resolution of X-ray micro-calorimeters will enable direct measurements of bulk/turbulent gas motions [24]. ASTRO-H can therefore measure non-thermal energies in the form of kinetic gas motions (turbulence, bulk gas motion) and relativistic particles, leading us to a more comprehensive picture of cluster structure and evolution.

Acknowledgements

N.O. thanks the organizers for the opportunity to present this paper and R. Fusco-Femiano for useful suggestions and discussions.

References
[1] Ajello, M., Rebusco, P., Cappelluti, N., et al. 2009, ApJ, 690, 367
[2] Ajello, M., Rebusco, P., Cappelluti, N., et al. 2010, ApJ, 725, 1688
[3] Bourdin, H., Arnaud, M., Mazzotta, P., et al. 2011, A&A, 527, A21
[4] Brunetti, G., Cassano, R., Dolag, K., & Setti, G. 2009, A&A, 507, 661
[5] Cassano, R., Ettori, S., Giacintucci, S., et al. 2010, ApJ, 721, L82
[6] Feretti, L., Fusco-Femiano, R., Giovannini, G., & Govoni, F. 2001, A&A, 373, 106
[7] Feretti, L., Orrù, E., Brunetti, G., et al. 2004, A&A, 423, 111
[8] Fujita, Y., Hayashida, K., Nagai, M., et al. 2008, PASJ, 60, 1133
[9] Fukazawa, Y., Mizuno, T., Watanabe, S., et al. 2009, PASJ, 61, 17
[10] Fusco-Femiano, R., Orlandini, M., Brunetti, G., et al. 2004, ApJ, 602, L73
[11] Fusco-Femiano, R., Orlandini, M., Bonamente, M., & Lapi, A. 2011, ApJ, 732, 85
[12] Giovannini, G., Tordi, M., & Feretti, L. 1999, New Astron., 4, 141
[13] Govoni, F., Markevitch, M., Vikhlinin, A., et al. 2004, ApJ, 605, 695
[14] Harrison, F. A., Boggs, S., Christensen, F., et al. 2010, SPIE, 7732
[15] Kaastra, J. S., Bykov, A. M., Schindler, S., et al. 2008, Space Science Reviews, 134, 1
[16] Liang, H., Hunstead, R. W., Birkinshaw, M., & Andreani, P. 2000, ApJ, 544, 686
[17] Kawaharada, M., Makishima, K., Kitaguchi, T., et al. 2010, PASJ, 62, 115
[18] Kawano, N., Fukazawa, Y., Nishino, S., et al. 2009, PASJ, 61, 377
[19] Kitaguchi, T., Nakazawa, N., Makishima, K., et al. 2007, XMM-Newton: The Next Decade, 28P
[20] Koyama, K., Tsunemi, H., Dotani, T., et al. 2007, PASJ, 59, 23
[21] Markevitch, M., & Vikhlinin, A. 2001, ApJ, 563, 95
[22] Markevitch, M., Gonzalez, A. H., David, L., et al. 2002, ApJ, 567, L27
[23] Mitsuda, K., Bautz, M., Inoue, H., et al. 2007, PASJ, 59, 1
[24] Mitsuda, K., Kelley, R. L., Boyce, K. R., et al. 2010, SPIE, 7732
[25] Nakazawa, K., Sarazin, C. L., Kawaharada, M., et al. 2009, PASJ, 61, 339
[26] Ota, N., Murase, K., Kitayama, T., et al. 2008, A&A, 491, 363
[27] Ota, N. 2012, Research in Astronomy and Astrophysics, 12, 973
[28] Petrosian, V., Madejski, G., & Luli, K. 2006, ApJ, 652, 948
[29] Rephaeli, Y., & Gruber, D. 2002, ApJ, 579, 587
[30] Rephaeli, Y., Gruber, D., & Arieli, Y. 2006, ApJ, 649, 673
[31] Sarazin, C. L., & Kempner, J. C. 2000, ApJ, 533, 73
[32] Sugawara, C., Takizawa, M., & Nakazawa, K. 2009, PASJ, 61, 1293
[33] Takahashi, T., Abe, K., Endo, M., et al. 2007, PASJ, 59, 35
[34] Takahashi, T., Mitsuda, K., Kelley, R., et al. 2012, SPIE, 8443
[35] Wik, D. R., Sarazin, C. L., Finoguenov, A., et al. 2009, ApJ, 696, 1700
[36] Wik, D. R., Sarazin, C. L., Finoguenov, A., et al. 2011, ApJ, 727, 119

Acta Polytechnica Vol. 42 No. 1/2002

Numerical modelling of overburden deformations
J. Barták, M. Hilar, J. Pruška

Abstract: This paper focuses on the application and verification of mathematical models of the effect of supporting measures on the reduction of overburden deformations. The study of the behaviour of the models is divided into three parts: reduction of the tunnelling effects on the Minorit monastery by means of a jet-grouting curtain; the behaviour of the Hvížďalka backfilled tunnel; and a numerical analysis of the supporting measures affecting the tunnel deformations of the Mrázovka tunnel in Prague.

Keywords: numerical modelling, overburden deformations.

1 The Minorit monastery

The settlement caused above underground structures excavated in poor-quality ground at small depths below the surface must be taken into account, especially as regards existing buildings standing within the reach of the underground works. The main factors influencing the maximum amount of surface settlement and the shape and extent of the settlement trough are: the physical-mechanical properties of the rock environment, the tunnelling method, the dimensions of the underground works, the depth of the underground structure below the surface, and the stress conditions within the rock mass. Ground conditions which make excavation dangerous as regards the amount of settlement are cohesive soils of soft and stiff consistency, poorly consolidated soils, incohesive soils surcharged due to a lowered water table, semi-rocks and more deformable rocks.

It can be stated that the New Austrian Tunnelling Method (NATM), applied with short advances per cycle and with increased rigidity of the lining, restricts the possible occurrence of excessive deformations of the massif or of the surface area. Suitable modifications of NATM, or other special tunnelling techniques such as horizontal jet-grouting, a peripheral slot or consolidation grouting, enable further reductions of surface settlement. Settlement generally increases roughly linearly with the growing excavated cross section of the underground structure (especially with increasing width). Major deformations occur if the overburden is shallow (D ≲ W). The size of the deformations decreases with increasing depth, and becomes negligible below a certain level. The dimensions of the settlement trough are, apart from other data, a function of the dimensionless ratio D/W. In most cases, the state of stress that originates in a rock mass after excavation is dominated by the vertical stresses. In a number of cases, however, residual components of the horizontal stress have a significant influence.
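The shape of such a surface settlement trough is commonly idealized by the classical Gaussian profile of Peck, s(x) = s_max exp(−x²/2i²), with the trough-width parameter i growing with the depth of the tunnel. This is general background to the factors just listed, not a formula used explicitly in this paper; a minimal sketch with illustrative values:

```python
import numpy as np

def settlement_trough(x, s_max, i):
    """Classical Gaussian surface settlement trough (Peck):
    s(x) = s_max * exp(-x^2 / (2 i^2)), with x the horizontal distance
    from the tunnel axis and i the trough-width parameter."""
    return s_max * np.exp(-x**2 / (2.0 * i**2))

x = np.linspace(-20.0, 20.0, 201)               # m
s = settlement_trough(x, s_max=0.015, i=6.0)    # illustrative values only
print(f"settlement at 10 m offset: {1000 * settlement_trough(10.0, 0.015, 6.0):.1f} mm")
```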
1.1 The method for securing the foundations of the monastery during excavation of a secondary utility tunnel

The fifth stage of the secondary utility tunnel structure in Brno passed under Jánská Street, beneath the building of the Minorit monastery. The tunnel, with a horseshoe-shaped profile and a cross-sectional area of 10 m², was situated below the foundation level of the monastery, and the deformation zone of the tunnel collided with the monastery foundation over a length of about 50 m. It was necessary to secure the foundations of the Minorit monastery by underpinning, nowadays performed mostly by means of columns created by jet-grouting. Although the interference with the foundation structures of a building is not so drastic when jet-grouting is used, the owners and supervisors of the monastery refused to allow drilling or grouting in direct contact with their buildings. The issue was therefore solved by building a quite sizeable wall formed by columns created by jet-grouting (diameter 0.75 m per 1.5 m, length 5.5 m). The purpose of the wall was to separate the subsoil of the Minorit monastery from the settlement caused by the utility tunnel (Fig. 1). The grout curtain was constructed beyond the building of the monastery (the transverse distance from the foundation structures was about 1.0 m). Neither the building nor its foundation was directly touched by the protective structure.

Fig. 1: Cross section A–A′ (Jánská Street and the Minorit monastery above the utility tunnel; Neogene clays, gravel, loess and clay strata; shear area; curtain of jet grouting, length 5.5 m, diameter 0.75 m per 1.5 m)

The location of the borehole for jet-grouting inside the zone of the utility services under the sidewalk necessitated the provision of a casing pipe embedded in the bottom of a pre-dug trench uncovering the utility services. Minimisation of soil loosening in the deformed settlement zone was a prerequisite for the assumed function of the grout curtain.

1.2 Theoretical solution of the grout curtain function by FEM

The whole course of the construction of the jet-grouting curtain and of the subsequent utility tunnel excavation was computer-simulated using Plaxis software. The current version of Plaxis enables detailed phasing of the task, which was a prerequisite for modelling the process of securing the Minorit monastery. An area 21 m wide and 17 m deep was modelled for the calculation. The vertical edges of the whole area were fixed against horizontal displacement, and the bottom edge was fixed against vertical displacement. Horizontal displacement was also prevented along the two vertical edges of the monastery wall. An automatic generator was used to generate the finite element mesh. Fifteen-node triangular elements, which enable a sufficiently accurate analysis to be performed, were used for the modelling.
We concentrated on the following computation phases:
• primary state of stress;
• excavation of the pit for the monastery foundation;
• loading at the foundation level by the dead weight of the monastery;
• execution of the jet-grouting curtain;
• setting of the jet-grouting curtain;
• operational load above the future excavated space;
• the tunnel excavation proper.

Plaxis operates in such a way that when the primary state of stress is being computed, the finite element mesh remains undeformed; the stress values nevertheless correspond to the real situation. A partial defect of the model appeared in this phase, which does not, however, affect the other phases of the computation. The defect appeared as a result of the relief at foundation level caused by the deactivation of elements in the construction pit. This process manifested itself in an upward deformation of the mesh by 12 mm. This behaviour of the mesh, which does not correspond to the real case, is caused by the elastic behaviour of the mesh elements. As a result of the deactivation, the stress under the foundation level was also slightly reduced. As a result of the activation of a uniform load, vertical deformations of the foundations occurred. The vertical deformations increased by 42 mm, which represents a total settlement of 30 mm. As anticipated, a stress concentration occurred under the foundation level. Horizontal deformations up to 1.6 mm and vertical deformations up to 1.3 mm occurred in the area of the diaphragm wall when the jet-grout curtain was set. The increase in these deformations (and in the stress) due to the superimposed load on the overburden was 1.3 mm. All deformations of the excavated space occurred in the inward direction: roof 35 mm, bottom 50 mm, walls 6 mm. The deformations of the other parts of the massif were: terrain above the excavation, settlement of 15–20 mm; monastery foundations, settlement of 6.5 mm; maximum horizontal displacement of the diaphragm wall, 3.5 mm.

1.3 In-situ measurements

The progress of the monastery foundation settlement was monitored by means of levelling. The final measured values of the foundation settlement in the longitudinal direction fluctuated between 2.1 mm and 5.9 mm. The settlement value determined by Plaxis was 6.5 mm. The horizontal displacements of the diaphragm wall were measured by means of inclinometers placed directly in the wall (in two profiles, 01 and 02). Through this measurement, the horizontal deformation of the measuring sleeve can be measured with an accuracy better than ±0.2 mm per 1 m of the measuring sleeve length [6]. For the sake of comparison, the measured and computed values are shown in a single diagram (Fig. 2). The individual curves correspond quite well, especially in the values of the maximum achieved displacement. The maximum horizontal displacement determined by measurement was 2.65 mm, and by modelling 3.5 mm, which represents a total difference of 0.85 mm. The deviation of the computed curve from the measured values is shown at the foot of the diagram. This deviation is probably related to the manner of taking the measurements, which assumed a non-sliding bearing of the inclinometer footing.

1.4 Conclusion

The in-situ measured horizontal deformations of the protective wall (3 mm) are of the same order as the deformations determined by Plaxis (3.5 mm). Their effect on the horizontal settlement of the monastery foundation and on the origin of defects in the building can be classified as negligible.
This fact was confirmed by levelling, which determined a maximum value of 5.9 mm for the foundation settlement of the monastery. This value is consistent with the settlement value determined by Plaxis modelling.

Fig. 2: Horizontal deformation of the grout curtain (measured inclinometer profiles 01 and 02 compared with the Plaxis prediction; depth below the surface plotted against deformation in mm, with the bottom of the inclinometer, the bottom of the grout curtain and the gravel, loess and clay strata indicated)

2 The Hvížďalka backfilled tunnel

The Hvížďalka open limestone quarry is situated in the valley of the Radotín brook and forms a rather unsightly pit in a beautiful landscape. This is why it was decided to restore and afforest the site using filling material, the bottom of the open-cast mine being made accessible by a backfilled tunnel. The settlement of this tunnel structure, the deformation of the lining and the contact stresses have been measured continuously during the backfilling [1]. The behaviour of the Hvížďalka backfilled tunnel was modelled using commercial Plaxis software. The observation of the deformations of the Hvížďalka tunnel enabled the computational procedure to be validated.

2.1 Description of the model

The numerical analysis was carried out by means of the Plaxis program system. An automatic generator was used to create the finite element mesh. For the modelling we used quadratic six-node triangular elements, which enable a sufficiently accurate analysis to be performed. The modelled area is about 60 m in width and 50 m in height (Fig. 3). The tunnel tube has a cross-sectional area of 30 m² (Fig. 4). The length depends on the quarry output and will be from 224 m to 450 m. The tunnel is backfilled with filling material (stripped soils, aggregates from cement-material extraction). The lateral backfill causes lateral movements, and the computational model therefore accounts for the interaction between the lining and the soil. The rock mass behaviour was approximated by means of the Mohr-Coulomb model. The input parameters of the subsoil were determined on the basis of the results of the engineering-geological investigations. The parameters of the lining and the filling material were determined in compliance with the detailed design. For the computational models, the following input data were considered (a sketch of the Mohr-Coulomb criterion with these values follows at the end of this section):

• backfill: E = 20 MPa; ν = 0.3; γ = 23 kN/m³; φ = 20°; c = 29 kPa
• subsoil: E = 2000 MPa; ν = 0.15; γ = 25 kN/m³
• lining: E = 30000 MPa; ν = 0.14; d = 0.85 m; w = 21.25 kN/m/m

2.2 Computation phases

The static analysis for calculating the sectional forces and deformations is based on the following computation phases:
• primary state of stress in the subsoil;
• execution of the lining;
• execution of the compacted backfill;
• placement of the last layer of backfill;
• loading of the backfill above the tunnel roof.

2.3 Conclusion

The numerical model, which was implemented with Plaxis software, corresponds adequately with the in-situ measurements. It should be pointed out that during construction the filling material close to the lining was strengthened with a cement binder.

Fig. 3: Numerical model of the Hvížďalka tunnel
Fig. 4: Cross-section of the Hvížďalka tunnel
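A minimal sketch of the Mohr-Coulomb strength model used for the soils above, evaluated with the backfill parameters from § 2.1; the depth (and hence the normal stress level) is an illustrative value, not one from the paper:

```python
import numpy as np

def mc_shear_strength(sigma_n_kpa, c_kpa, phi_deg):
    """Mohr-Coulomb failure criterion: tau_f = c + sigma_n * tan(phi)."""
    return c_kpa + sigma_n_kpa * np.tan(np.radians(phi_deg))

# Backfill parameters from Section 2.1: c = 29 kPa, phi = 20 deg, gamma = 23 kN/m^3
depth = 5.0                     # assumed depth below the backfill surface [m]
sigma_v = 23.0 * depth          # vertical stress [kPa]
tau_f = mc_shear_strength(sigma_v, c_kpa=29.0, phi_deg=20.0)
print(f"shear strength at {depth} m: {tau_f:.1f} kPa")  # ~70.9 kPa
```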
3 The Mrázovka tunnel in Prague

The north-western sector of the city circular road in Prague contains three major road tunnels of large cross-sectional area. While the Strahov tunnel is already open to traffic, the Mrázovka tunnel is still under construction (driving of the western tunnel tube (WTT) started at the beginning of 1999) and the Blanka tunnel is still in the planning phase. The very difficult Prague Ordovician geological conditions make it necessary to apply NATM with horizontal sequencing of the excavation. Stabilization of the deformations was always achieved only after closure of the whole primary lining, but the following supporting technical measures were performed: anchoring, widening of the top heading legs, micropile support under the top heading legs, reinforcing grouting performed in advance from the exploratory gallery, and closing the top heading with a temporary invert. Due to the increased deformations occurring at excavation with a horizontally divided face, a vertical sequencing pattern was used in the further course of excavating the Mrázovka tunnel.

3.1 Description of the starting model

The numerical analysis was carried out by means of the Plaxis program system. The 3D behaviour of the excavation face area, and a correct description of its influence on the deformations and the state of the massif, were simulated by means of the frequently used procedure of loading the excavation and lining using the so-called β-method (sketched below). The geometry of the starting model of the profile at km 5.160 of the WTT covers an area about 200 m wide and 110 m high. The modelled area is divided into eight basic sub-areas according to the types of rock encountered (Fig. 5). The rock mass behaviour was approximated by means of the Mohr-Coulomb model. The input parameters of the rock mass were determined on the basis of the engineering-geological investigation results [3]. The modulus of deformation of the tectonic zone was determined according to the results of plate bearing tests. A comparison of the theoretically determined deformations with the values obtained by monitoring [5] was used to verify the applicability of the mathematical model. A parametric study was developed, which proved that a concurrence of unfavourable values of the rock mass parameters (within the limits determined by the geotechnical investigation) could be the cause of the increase in the real deformations above the values computed by means of the mathematical model.

Fig. 5: Division of the model (1: made ground, 2: diluvial sediments, 4: rotten shale, 5: weathered shale, 6: slightly weathered shale, 7: sound shale, 7a: tectonic fault)
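In the load-reduction (β) method, the 3D stress relief ahead of the face is mimicked in a plane model by splitting the release of the initial ground pressure p₀ into a part released onto the unsupported excavation and a part carried by the lining installed later; in a common convention (sign and naming conventions vary between implementations),

\[
p_{\text{unsupported}} = (1-\beta)\,p_0 , \qquad p_{\text{lining}} = \beta\,p_0 , \qquad 0 < \beta < 1 .
\]

The conclusion of this paper quotes β = 0.67 for the Mrázovka model.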
3.2 Results of modelling the supporting measures

3.2.1 Reinforcing grouting

Grouting boreholes were drilled from one centre on the centre line of the exploratory gallery, at an angle of 30° to 36° from the longitudinal axis, in a radial array creating a fan shape (Fig. 6). A cementation mixture stabilized with bentonite was used as the grouting mix. The effect of the grouting on improving the mechanical properties of the rock mass was evaluated by means of pressiometry [4]. It followed from the results of the series of pressiometric tests that:
• the modulus of deformation of the grouted rock increased by an average of 64 %;
• the cohesion of the grouted rock increased by an average of 49 %.

Fig. 6: Scheme of the execution of the reinforcing grouting (exploratory gallery, future profile of the tunnel tube, grouting boreholes, circumferential vault of the grouting)

The outputs determined by the modelling show that the grouting that was performed causes a reduction of 12 mm (about 10 %) in the deformations. One reason for this is probably the insufficiently large extent of the grouted rock (an area up to a distance of 4 m from the tunnel tube was injected). Another reason for the lower effectiveness is the influence of the significant tectonic fault existing in the monitored profile. Regarding the models of the reinforcing grouting, it should be noted that planar models do not take into consideration the favourable spatial effect of this measure, which may not be negligible.

3.2.2 Rockbolt support

SN-type rock bolts made from deformed reinforcing bars, steel grade 10 425, 32 mm in diameter, were used at the profile. The bars were inserted into boreholes 48 mm in diameter filled with cementitious mortar. The bolts were 4 m long. The rock bolts were installed along the tunnel perimeter at 2 m spacing, and their spacing in the longitudinal direction was also 2 m. Of the 4 types of anchoring models that were used (fixation along the overall length of the bolt, fixation at the root and head of the bolt, approximation by force effects, improvement of the shear strength over the grouted area of the massif), the most suitable is the model of rockbolts fixed in the massif at the root and in the lining at the head. The results of the complex modelling of the anchoring show the fundamental (and known) influence of the length of the time lag between the installation of the rockbolts and the installation of the primary lining. Rock bolts installed simultaneously with the lining improved the final magnitude of the deformations by up to 90 mm (45 %), but with a time lag between the installations, the maximum resulting improvement is 18 mm (9 %).

3.2.3 Micropile support

Pipes 70 mm in diameter with a wall thickness of 7 mm were used for the top heading support. The length of the right-hand micropile was 12 m (root length 8 m), and the left-hand micropile was 9 m long (root length 6 m). The micropiles were installed under the widened legs after the lining concrete of the top heading had set. The influence of the time lag between the installation of the micropiles and the application of shotcrete on the top heading lining is shown in the results of the complex modelling.
the vertical pattern of the sequencing used in the further course of the mrázovka tunnel excavation was used for this modelling. the reinforcement of the lining of the sidewall drifts was taken over from the performance documentation (sprayed b25concrete 250 mm thick, bretex 1*25 and 2*16 mm ribs). it follows from the model results that the final deformation will amount to 114 mm if the full profile excavation of the sidewall drifts is applied. if the drifts are divided into two branches, the final deformation will amount to 134 mm. 3.3 conclusion the analysis was carried out by means of modelling using the finite element method. the design and monitored parameters at the profile km 5.160 wtt mrázovka tunnel were used as source data for the analysis. the comparative absolute values are related to the model with the exploratory gallery (tab. 1). the most clearly effective measure, both theoretically and, at other profiles, also practically, proved to be the transition from the horizontal to the vertical sectioning of the tunnel face [7]. in combination with reasonably promptly installed rockbolt support, we can expect a reduction in the deformations within the range of 50 to 70 %. it should be stated that the type used and the position of the exploratory gallery are not suitable for combining with vertical sequencing of the tunnel face. 4 conclusion the main goal of the research was to verify the application of suitable numerical models on the basic underground structures type. the whole course of comparison was computer simulated, using commercial plaxis software. an automatic generator was used for generating the finite element mesh. for the modelling we used quadratic triangular elements, which enable a sufficiently accurate analysis to be performed. three tasks were prepared in order to study the behaviour of the numerical model: reduction of the tunneling effects on the minorit monastery by means of a jet grouting curtain was modeled in the first task. the in-situ measured horizontal deformations of the protecting wall (3 mm) are of the same order as the deformations determined by fem (3.5 mm ). the levelling determined that the maximum value of the monastery foundation settlement is 5.9 mm. this value is consistent with the settlement value determined by fem modelling. the behaviour of the hvížd’alka backfilled tunnel was modeled in the second task. the results determined by fem © czech technical university publishing house http://ctn.cvut.cz/ap/ 57 acta polytechnica vol. 42 no. 1/2002 iia iib iiia iiib iv v vi ii iii iv fig. 7: scheme of the sectioning of the faces model title surface [mm] tunnel roof [mm] reduction at the surface [mm] reduction at the tunel roof [mm] model with exploratory gallery 171 215 0 0 reinforcing grouting 159 193 12 22 anchoring type 1 97 125 74 90 anchoring type 2* 154 197 17 18 micropiles type 1 123 150 48 65 micropiles type 2* 152 190 19 25 invert type 1 133 170 38 45 invert type 2* 171 215 0 0 vertical sectioning type 1 79 114 92 101 vertical sectioning type 2* 104 134 97 81 * time lag table 1: influence of supporting measures on resulting deformations modelling were compared with the in-situ measurements. this value is consistent with the settlement value determined by fem modelling. the numerical analysis of the supporting measures affecting on the tunnel deformations was performed in the last task. 
The aims of this project were:
• to create a reasonable static model of cross-section km 5.160, where the deformations were significantly high (limit values about 80 % higher than in the realization project);
• to evaluate the influence of supporting measures on decreasing the final deformations, and to analyse their suitability (i.e. the addition of rock bolts, consolidation grouting and micropiling under the top heading footings);
• to analyse the possible causes of increases in the deformations above the originally assumed values (i.e. the specific properties of the studied cross-section, possible imperfections during construction);
• to evaluate other ways of reducing the tunnel deformations (alternative methods of sequencing the tunnel face: vertical sequencing, closing the top heading with a temporary invert);
• to evaluate the influence of the excavation of an exploratory gallery on the static behaviour of the massif.

After comparing the in-situ measured deformation values with the deformations calculated on the basic model, the effect of parametric changes on the final deformations was investigated. The middle values of the rock parameters were used for modelling purposes. The parameters of the tectonic fault vary slightly from the middle values (due to the more detailed investigation in profile km 5.160). The results of the parametric study were examined to determine the roof arch settlement. The influence of the cohesion and friction angle values on the results was also examined.

The quality of the results was influenced by the following factors:
• Reliable input data: the advantage of the Mrázovka project was the detailed site investigation. This enabled a relatively exact derivation of the rock massif characteristics in the examined cross-section. The tectonic fault, which affected the behaviour of the structure, was properly incorporated into the model thanks to co-operation with a geologist.
• NATM modelling: the option of single-stage simulation during tunnel face excavation was used for this purpose. A description of the deformations occurring before the tunnel lining was added also had to be provided. It was created using the β-method (the β coefficient was chosen to be 0.67).
• Monitoring during construction: the behaviour of profile km 5.160 was monitored in detail during construction. This enabled the model to be verified against in-situ measured data.

Acknowledgement

This work was supported by the Ministry of Education of the Czech Republic research project MSM210000003 "Developments of the algorithms of computational mechanics and their application in engineering". This support is gratefully acknowledged.

References
[1] Barták, J., Macháček, J., Pacovský, J.: Observation Measuring of the Earth-Covered Traffic Tunnel Structure in the Quarry Hvížďalka. Tunel, No. 1/1998, pp. 6–9
[2] Barták, J., Hilar, M., Pruška, J.: Plaxis Program. Geotechnika, No. 2/2000, pp. 8–11
[3] Hudek, J.: The Mrázovka Tunnels – Measurement and Monitoring at Construction of the WTT – Presiometric Checking on the Success of Saving Grouting – Profiles 003 to 009. PÚDIS, Praha 1999
[4] Hudek, J., Chmelař, R., Verfel, J.: Additional Engineering-Geological Investigation for the Vehicular Tunnel Mrázovka – Partial Report – Evaluation 2 of Trial Reinforcing Grouting by Zakládání staveb a.s. Company at km 4.873–4.887. PÚDIS, Praha 1998
[5] Kolečkář, M., Zemánek, I.: Monitoring of the Mrázovka Tunnel. In: Volume of Papers of the Int. Conf. Underground Construction 2000, ITA/AITES, Praha 2000, pp. 427–433
[6] Mišove, P.: Check Inclinometric Measurements for Securing Stability of St. John's Church in Brno. VUIS – Zakladanie stavieb, Bratislava 1996
[7] Salač, M.: Construction of the Tunnel under Mrázovka Hill. Tunel, No. 4/1999, pp. 33–38

Prof. Ing. Jiří Barták, DrSc. (e-mail: bartakj@fsv.cvut.cz), Dr. Ing. Jan Pruška (e-mail: pruska@fsv.cvut.cz), phone: +420 2 2435 4548, fax: +420 2 2435 4556, Department of Geotechnics, CTU, Faculty of Civil Engineering, Thákurova 7, 166 29 Praha 6, Czech Republic
Ing. Matouš Hilar, M.Sc., Ph.D., phone: +420 2 41443411, D2 Consult Prague, s.r.o., Na Usedlosti 513/16, 140 00 Praha 4, Czech Republic

Acta Polytechnica 54(4):266–270, 2014, doi:10.14311/ap.2014.54.0266
© Czech Technical University in Prague, 2014

X-ray variability study of polar scattered Seyfert 1 galaxies

Tobias Beuchert (a, b, *), Jörn Wilms (a), Matthias Kadler (a, b), Anna Lia Longinotti (c), Matteo Guainazzi (c), Giovanni Miniutti (d), Ignacio de la Calle (c)

a) Dr. Karl Remeis-Observatory, Universität Erlangen-Nürnberg & ECAP, Sternwartstrasse 7, 96049 Bamberg, Germany
b) Lehrstuhl für Astronomie, Universität Würzburg, Campus Hubland Nord, Emil-Fischer-Straße 31, 97074 Würzburg, Germany
c) European Space Astronomy Centre of ESA, PO Box 78, Villanueva de la Cañada, 28691 Madrid, Spain
d) Centro de Astrobiología (CSIC-INTA), Dep. de Astrofísica; ESAC, PO Box 78, E-28691 Villanueva de la Cañada, Madrid, Spain
*) Corresponding author: tobias.beuchert@sternwarte.uni-erlangen.de

Abstract. We study 12 Seyfert 1 galaxies with a high level of optical polarization. Optical light emerging from the innermost regions is predominantly scattered in a polar region above the central engine, directly in our line of sight. These sources show characteristics of Seyfert 2 galaxies, e.g. polarized broad lines. The polarization signatures suggest a viewing angle of ∼ 45°, classifying them as intermediate Seyfert 1/2 types. The unified model predicts this line of sight to pass through the outer layer of the torus, resulting in significant soft X-ray variability due to a strongly varying column density. The aim is to find evidence for this geometrical assumption in the spectral variability of all available historical observations of these sources by XMM-Newton and Swift.

Keywords: variability, column density variation, unified model.

1. Introduction

According to the unified model of active galactic nuclei (AGN) [1], Seyfert 1 and Seyfert 2 galaxies are the same type of galaxy, but seen under different inclination angles (Fig. 1). At inclinations ≲ 45°, Seyfert 1 galaxies are typically either optically unpolarized or polarized due to predominantly equatorial scattering. Seyfert 2 galaxies have inclinations ≳ 45° and show optical polarization features mainly due to polar scattering, as the line of sight towards Seyfert 2 galaxies passes through the optically thick torus. This picture usually works well, but [22] identify 12 Seyfert 1 galaxies that exhibit optical polarization similar to Seyfert 2 galaxies. They conclude that these polar-scattered Seyfert 1 galaxies are seen under an inclination of i ∼ 45° and thus represent the transition between unobscured Seyfert 1 and obscured Seyfert 2 galaxies (Fig. 1).
The line of sight towards these galaxies therefore passes through the outer layers of the torus, where significant absorption is still expected; this suppresses polarized light from the equatorial scattering region, but not the polar-scattered light. We assume these outer layers to be a non-homogeneous gas and dust medium which may be stripped off by the nuclear radiation, resulting in a highly variable column density towards the observer. X-ray observations of these polar-scattered Seyfert 1 galaxies should therefore exhibit a strongly variable N_H.

Figure 1: Polar and equatorial scattering regions according to the unified scheme of AGN. Adopted from [22].

We compare the X-ray properties of a sample of 12 polar-scattered Seyfert 1 galaxies and 11 equatorial-scattered Seyfert 1 galaxies (see Table 1), for which no absorption variability is expected given the line of sight. As of now, only the first set of polar-scattered Seyfert 1 galaxies has been analyzed. In Table 1, the dashed line separates sources with spectra of sufficient signal-to-noise to constrain the N_H (top) from sources with insufficient signal-to-noise (bottom).

Table 1: The sample from Smith et al. Top half: polar-scattered Sy 1 galaxies; bottom half: equatorial-scattered Sy 1 galaxies (the control sample). The numbers denote the number of pointings with the corresponding satellite.

Source name | Swift XRT | XMM-Newton pn
NGC 3227 | 7 | 2
NGC 4593 | 5 | 2
Mrk 704 | 5 | 2
Fairall 51 | 2 | 2
ESO 323-G077 | 3 | 1
Mrk 1218 | 7 | 2
UGC 7064 | 2 | 1
Mrk 766 | 33 | 15
- - - - - - - - - - - -
Mrk 321 | 1 | 1
Mrk 376 | 2 |
IRAS 15091−2107 | 1 |
Mrk 1239 | 1 |
Akn 120 | 3 | 1
1Zw 1 | 7 | 2
KUV 18217+6419 | 1 | 22
Mrk 006 | 4 | 4
Mrk 304 | 2 |
Mrk 509 | 23 | 17
Mrk 841 | 4 | 5
Mrk 876 | 16 | 2
Mrk 985 | |
NGC 3783 | 6 | 4
NGC 4151 | 4 | 17

When studying the properties of the absorption we have to take into account source-intrinsic absorption by both neutral and ionized matter. The neutral, cold absorption can arise in spatially distinct regions of AGN. Theory [4, 9] predicts a relatively cold, dense phase in equilibrium with partially ionized gas, forming the broad line region (BLR). These embedded BLR clouds are commonly believed to be gravitationally bound to the center of mass. They are good candidates for causing short occultation events with attenuated soft X-ray spectra. At larger distances from the central black hole, variable neutral structures can exist in the outer torus layer, possibly evaporated by the in-falling radiation [6]. Recent studies by [13] even demonstrate the relevance of a model where the torus consists of distinct clumps [14, 15]. Whereas cold absorbing gas and dust is still assumed to be confined to the obscuring torus, warm, partially ionized gas is often found to be outflowing [5]. MHD disk winds [7] are one theoretical explanation. Such so-called warm absorbers [19] can be separated from neutral absorbers by their typical spectral features. In contrast to the dusty outer layer of the torus, we assume ionized absorbers to fully cover the line of sight [3].

2. Methods

Here we limit ourselves to the first sample, the polar-scattered Seyfert galaxies. The search for variability in these sources requires consistent model fitting in order to ensure that the detected variability of the absorption indeed originates in the proposed region.
an important task is to disentangle consistently the contributions of warm and cold absorption via spectral model fitting in order to locate the absorber. distinguishing two absorption components, neutral and partially ionized, is a challenging task when dealing with low signal-to-noise spectra, such as swift data. xmm-newton observations, however, can help to constrain the properties of one or more warm absorber phases. the picture becomes even more complicated when considering the warm absorber not to be a homogeneous gas but rather dynamic and indeed variable, both in covering fraction and column density, as shown for mrk 704 by [11]. as a consequence we cannot easily draw a conclusion on warm absorber parameters from one observation to another. as the partially ionized phase can also attenuate the soft continuum to some extent, we have to assume that both ionized and neutral phases contribute to the overall continuum absorption within the suggested line of sight at ∼ 45° inclination. within this work, however, we are explicitly interested in the variability of neutral absorption. at least for now we do not consider further the warm absorber contributions, unless explicitly necessary.

figure 2. combined contours of all observations of fairall 51 (left; xmm 0300240901, 0300240401 and swift 00037809002, 00037809001) and mrk 1218 (right; xmm 0302260201 and swift 00035777002) in the $N_{\rm H}$–$\Gamma_{\rm pow}$ plane for 68.3 %, 90 % (blue) and 99 % confidence levels. the numbers within the plot panels are the appropriate observation ids.

when searching for variability, a consistency check of all modeled observations is necessary. this is done by examining correlations of column-density-related parameters of each fit and observation, and calculating appropriate confidence levels. in order to find appropriate model fits, a bottom-up procedure is chosen, starting with a power-law photon continuum,
$$N_{\rm ph,full}(E) = A_{\rm full}\, e^{-\sigma N_{\rm H}^{\rm int}} \left(E^{-\Gamma} + G_{\rm Fe}\right) e^{-\sigma N_{\rm H}^{\rm gal}}, \quad (1)$$
which is fully covered by galactic and source-intrinsic neutral matter. we further test a model where the continuum source is partially covered by the source-intrinsic absorber,
$$N_{\rm ph,partial}(E) = A_{\rm partial} \left[(1 - C) + C\, e^{-\sigma N_{\rm H}^{\rm int}}\right] \left(E^{-\Gamma} + G_{\rm Fe}\right) e^{-\sigma N_{\rm H}^{\rm gal}}, \quad (2)$$
and finally a partially covered source with an ionized medium in front of it (a "warm absorber"),
$$N_{\rm ph,wa}(E) = A_{\rm wa}\, \mathrm{WA}(1)\, \mathrm{WA}(2)\, e^{-\sigma N_{\rm H}^{\rm int}} \left(E^{-\Gamma} + G_{\rm Fe}\right) e^{-\sigma N_{\rm H}^{\rm gal}}. \quad (3)$$
which model is chosen depends on the signal-to-noise ratio of the data. here the partial covering scenario is included in the warm absorber models. in addition to the continua above, the fe kα line at 6.4 kev is phenomenologically fitted with a gaussian model component if existent. compton reflection, as a more physical model to explain features like the fe kα line, is not included in the fits because of the lack of sensitivity above 10 kev, where compton reflection dominates. having found the best-fitting model, the parameter space is further investigated to find the 90 % confidence levels for each parameter as well as confidence contours between correlated parameters. in particular, correlations between the neutral column density and the covering fraction as well as the power-law slope are of interest. uncertainties of the column density are then derived from the 90 % confidence contours.
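to make the model hierarchy of (1)–(3) concrete, the following minimal sketch evaluates the fully and partially covered continua. it is an illustration only: the parameter values and the simplified $E^{-3}$ photoabsorption cross-section are assumptions made here, not values from the fits, and the warm absorber factors wa(1), wa(2) of (3) are omitted.

```python
# minimal sketch of the absorbed power-law continua of eqs. (1)-(2);
# parameter values and the e^-3 cross-section are illustrative assumptions.
import numpy as np

def sigma_abs(energy_kev):
    """crude effective photoabsorption cross-section per hydrogen atom,
    scaling roughly as E^-3 below ~10 keV (assumption for illustration)."""
    return 2.0e-22 * energy_kev ** -3.0  # cm^2

def n_ph(energy, norm, gamma, nh_int, nh_gal, covering=1.0, g_fe=0.0):
    """photon flux of a power law plus a constant fe k-alpha floor g_fe,
    absorbed by a partially covering intrinsic column and a fully covering
    galactic column; covering=1 reproduces the fully covered model (1)."""
    tau_int = sigma_abs(energy) * nh_int
    tau_gal = sigma_abs(energy) * nh_gal
    intrinsic = (1.0 - covering) + covering * np.exp(-tau_int)
    return norm * intrinsic * (energy ** -gamma + g_fe) * np.exp(-tau_gal)

energy = np.geomspace(0.3, 10.0, 200)                  # keV, soft x-ray band
full = n_ph(energy, 1e-2, 1.8, 5e22, 2e20)             # eq. (1)
partial = n_ph(energy, 1e-2, 1.8, 5e22, 2e20, 0.8)     # eq. (2), C = 0.8
```

the leaked, unabsorbed fraction (1 − c) is what lets the partial-covering model retain soft flux even for large intrinsic columns, which is the spectral signature the fits discriminate on.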
the immediate aim is to search for significant variability of the measured column densities of a cold absorber that is assumed to be located in the outer torus layers and hence to underlie structural variability. the significance of variability is best evaluated by considering contours of all analyzed observations of one source (fig. 2). the left panel of fig. 2 shows that the column densities measured in different observations are inconsistent between different epochs, therefore implying a clear detection of variability in the case of fairall 51. in contrast, in the case of mrk 1218 (fig. 2, right panel) the smaller contours of the xmm-newton observation are fully enclosed by those of the swift observation with less signal-to-noise. both observations are consistent with each other and no variability can be claimed based on the given data. for models such as ours, with several degrees of freedom as well as a limited number of bins, the typical parameter space is more or less asymmetric [2]. the result is asymmetric uncertainties. we derive combined column densities with unequal upper and lower uncertainties. gaussian error propagation is not applicable in this case. a solution is given by [2] and is also applied in this work.

3. results
we find 6 out of 12 sources of the sample to reveal variable absorption of the soft x-rays on timescales from days to years. for mrk 766, a well-studied source [12, 23], minimum variability timescales of the order of hours were found by [21]. figure 3 shows an example of strong spectral variability in ngc 3227, with particularly changing warm absorption properties.

figure 3. spectral variability of ngc 3227, comparing the swift observations 00031280004, 00031280007 and 00031280008 (november 2008) with the xmm observations 0101040301 (2000-11-28) and 0400270101 (2006-12-03).

warm, (partially) ionized gas still retains a rich number of high-z elements able to absorb soft x-ray photons. in addition to column density variations of such ionized gas, the ionization states also change between the xmm-newton observations. these changes are interpreted to be due to the varying irradiation, expressed by a varying power-law norm. as the ionization parameter $\xi = L/(n_{\rm H} R^2)$ is proportional to the luminosity, it should scale directly with it. this seems indeed to be the case for ngc 3227.
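spelled out, the proportionality used here is a one-line consequence of the definition of $\xi$, with the density $n_{\rm H}$ and distance $R$ of the gas held fixed:
$$\xi = \frac{L}{n_{\rm H} R^2} \quad\Longrightarrow\quad \frac{\xi_2}{\xi_1} = \frac{L_2}{L_1} \approx \frac{A_2}{A_1},$$
where $A$ denotes the power-law norm of (1)–(3). for unchanged absorbing gas, the fitted ionization parameter should therefore track the ratio of the power-law norms between observations.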
collecting the results from the analysis of the whole sample of polar-scattered seyfert 1 galaxies, fig. 4 shows the time differences between all possible pairs of observations of all sources against the corresponding column density differences. for the cases where no $N_{\rm H}$ variability can be stated, upper limits are shown. the distribution plotted as a histogram is found in fig. 5. both plots, fig. 4 and fig. 5, reveal a strong concentration towards shorter timescales, from a few up to $10^3$ days. one has to keep in mind that these plots may rather represent the sampling of the observations than a real source-intrinsic distribution of variability timescales. the results of interest are the shortest timescales of each source, as listed in table 2. here we list only sources with sufficient data to constrain variability.

figure 4. plot of variability timescales measured between sub-sets of two observations against the corresponding difference in absorption for all sources with sufficient signal-to-noise ratios. all column densities are due to neutral matter except for fairall 51, where only warm absorber phases can be constrained. the uncertainties are plotted at a 90 % confidence level.

figure 5. distribution of all measured variability timescales in days.

table 2. list of minimum timescales for all sources with sufficient data to constrain such timescales:
ngc 3227: 710 d
ngc 4593: 2245 d
mrk 704: 235 d
fairall 51: 5 d
eso 323-g077: 455 d
mrk 766: 10–20 h [21]

4. interpretation
the shortest variability timescales found are of particular interest, since they trace the smallest spatial scales of the absorber. to get an estimate of the distance of the variable absorber, we assume distinct clouds to move across the line of sight on keplerian orbits. a certain minimum cloud size is required to be able to cover the x-ray source, and from it upper limits on the distance of the absorber can be derived [20]. if the clouds moving on keplerian orbits with velocity $v$ around the central black hole have a radius $r = x\,r_{\rm s}$, where $r_{\rm s} = 2GM/c^2$ is the schwarzschild radius, and the absorber is located at the distance $R = X\,r_{\rm s}$ [10], the time for a cloud to pass the line of sight is
$$\Delta t \sim \frac{2r}{v} = \frac{2\,x\,X^{1/2}\,r_{\rm s}}{c}. \quad (4)$$
as x-ray sources of agn are assumed to have a diameter of $10\,r_{\rm s}$, the cloud diameter must be $> 10\,r_{\rm s}$ in order to cover the central source [10]. the shortest measured timescale of 5 days is due to warm absorber variability. together with a typical black hole mass of $10^8\,M_\odot$ [16], we find an absorber distance of $R \lesssim 1.42\cdot10^{16}$ cm. this distance is consistent with the distance of the blr, which is around 0.001–0.1 pc, i.e., $3\cdot10^{15}$–$3\cdot10^{17}$ cm from the black hole [3, 17, 19]. the result also fits well to the model of the blr consisting of variable, ionized gas [3, 9]. the second shortest timescale, 235 d, is found for mrk 704 (see table 2). the corresponding upper limit for the absorber distance, estimated to be $R \lesssim 1.1\cdot10^{19}$ cm, is consistent with the expected distance of the torus of a few parsec [8, 18].
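as a worked order-of-magnitude check of (4), the sketch below inverts the transit-time relation for the fairall 51 case. the cloud size $x = 5$ (a radius of $5\,r_{\rm s}$, i.e. the minimum diameter of $10\,r_{\rm s}$ assumed above) and the prefactors of order unity dropped in (4) mean the result should only be trusted to the order of magnitude of the quoted $R \lesssim 1.42\cdot10^{16}$ cm.

```python
# order-of-magnitude inversion of eq. (4): given a transit time dt, solve
# X^(1/2) = c*dt / (2*x*r_s) and return R = X*r_s. exact prefactors depend
# on the adopted cloud size and radius convention, so expect agreement with
# the quoted distances only to within factors of a few.
G = 6.674e-8       # cm^3 g^-1 s^-2
C = 2.998e10       # cm s^-1
M_SUN = 1.989e33   # g

def absorber_distance(dt_s, m_bh_g, x_cloud=5.0):
    r_s = 2.0 * G * m_bh_g / C**2              # schwarzschild radius [cm]
    sqrt_big_x = C * dt_s / (2.0 * x_cloud * r_s)
    return sqrt_big_x**2 * r_s                 # R = X * r_s [cm]

# fairall 51: dt = 5 d, typical black-hole mass of 1e8 solar masses
print(absorber_distance(5 * 86400, 1e8 * M_SUN))   # ~6e16 cm, i.e. ~1e16-1e17
```

the same relation, applied to the 235 d timescale of mrk 704, lands in the parsec-scale regime of the torus, consistent with the upper limit quoted above.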
acknowledgements
this work has been funded by the bundesministerium für wirtschaft und technologie under a grant from the deutsches zentrum für luft- und raumfahrt.

references
[1] r. antonucci, 2012, astronomical and astrophysical transactions, 27, 557
[2] r. barlow, 2004, arxiv:physics/0406120
[3] a. j. blustin, et al., 2005, a&a, 431, 111
[4] s. chakravorty, et al., 2008, mnras, 384, l24, doi:10.1111/j.1745-3933.2007.00414.x
[5] e. costantini, 2010, space sci rev, 157, 265, doi:10.1007/s11214-010-9706-3
[6] a. dorodnitsyn & t. kallman, 2012, apj, 761, 70, doi:10.1088/0004-637x/761/1/70
[7] k. fukumura, d. kazanas, i. contopoulos & e. behar, 2010, apj, 715, 636, doi:10.1088/0004-637x/715/1/636
[8] j. h. krolik, et al., 1988, apj, 329, 702, doi:10.1086/166414
[9] j. h. krolik, et al., 2001, apj, 561, 684, doi:10.1086/323442
[10] a. m. lohfink, et al., 2012, apj lett, 749, l31, doi:10.1088/2041-8205/749/2/l31
[11] g. matt, et al., 2011, a&a, 533, a1, doi:10.1051/0004-6361/201116443
[12] l. miller, et al., 2007, a&a, 463, 131, doi:10.1051/0004-6361:20066548
[13] a. g. markowitz, m. krumpe & r. nikutta, 2014, mnras, 439, 1403, doi:10.1093/mnras/stt2492
[14] m. nenkova, m. m. sirocky, ž. ivezić & m. elitzur, 2008, apj, 685, 147, doi:10.1086/590482
[15] m. nenkova, m. m. sirocky, r. nikutta, ž. ivezić & m. elitzur, 2008, apj, 685, 160, doi:10.1086/590483
[16] p. padovani, et al., 1988, a&a, 205, 53
[17] b. m. peterson, 1993, pasp, 105, 247, doi:10.1086/133140
[18] e. a. pier, et al., 1992, apj, 401, 99, doi:10.1086/172042
[19] c. s. reynolds & a. c. fabian, 1995, mnras, 273, 1167
[20] g. risaliti, et al., 2002, apj, 571, 234, doi:10.1086/324146
[21] g. risaliti, et al., 2011, mnras, 410, 1027, doi:10.1111/j.1365-2966.2010.17503.x
[22] j. e. smith, et al., 2004, mnras, 350, 140, doi:10.1111/j.1365-2966.2004.07610.x
[23] t. j. turner, et al., 2007, a&a, 475, 121, doi:10.1051/0004-6361:20077947

acta polytechnica doi:10.14311/ap.2016.56.0353 acta polytechnica 56(5):353–359, 2016 © czech technical university in prague, 2016, available online at http://ojs.cvut.cz/ojs/index.php/ap

impact of plastic deformation on the properties of a selected aluminium alloy

anna lišková (*), petra lacková, mária mihaliková, robert kočiško
institute of materials, faculty of metallurgy, technical university of košice, slovakia
(*) corresponding author: anna.liskova@tuke.sk

abstract. this paper reports on an experiment to assess the influence of plastic deformation on the microstructure and properties of en aw 6012 (almgsipb) aluminium alloy in two states. the first was the initial state with heat treatment t3, and the second was the state after intensive plastic deformation by ecap (equal channel angular pressing) technology. the ecap process was carried out repeatedly at room temperature. compared with the initial state of the alloy, the process redistributed the eutectic si-particles and increased the strength of the alloy. the mechanical properties and the hardness increased due to intensive plastic deformation (the yield strength increased by 15 %, the tensile strength by 6 %, and the hardness by 23 %). the fracture cracks initiated and propagated mainly along eutectic particles. the fracture area of the ecaped specimen displayed a typical ductile cavity characteristic.

keywords: intensive plastic deformation; aluminium alloy; tensile test.

1. introduction
the almgsipb alloy investigated here belongs to the aa6xxx (almgsi) series of aluminium alloys, where magnesium and silicon are the principal alloying elements. commercial alloys of this type contain mass fractions of 0.5 % to 1.5 % of si and 0.5 % to 1.5 % of mg, and are used in great quantities. they are universal aluminium alloys which can be extruded into sections, rods and tubes. their characteristics are a high level of workability, strength properties, corrosion resistance and machinability. their mechanical and technological properties depend on the chemical composition and the heat treatment of the castings, i.e., cast blanks and extruded pieces [1, 2]. free-machining aluminium alloys are well known in the literature [3, 4]. these alloys typically include free-machining constituents that are insoluble but soft and nonabrasive. they are beneficial, assisting chip breakage and extending tool life [3]. more specifically, at the point of contact between the tool and the material, softening and melting occur. as a result of these changes, breakage occurs, chips are formed and material removal is enhanced.
it is well known that chip breaking is promoted by the addition of pb to conventional aluminium alloys, since pb has poor solubility in solid aluminium and forms a soft, low-melting-point phase [5, 6]. apart from the major alloying elements, standard aluminium alloys for free cutting also include additions (lead, bismuth), which form softer phases in the matrix. these "free machining" phases improve the machinability of the alloys, because the chips break more easily and have a smooth surface, the cutting forces are lower and there is less tool wear. since lead is poisonous, there is a tendency to replace it with other elements: tin and, to some extent, indium are the most frequently-used substituents. alloys with tin must have similar or better properties than standard alloys as regards microstructure, workability, mechanical properties, corrosion resistance, and machinability [7]. recently, tin has been added mainly to al-mg-si (aa 6000 series) alloys and to al-cu (aa 2000 series) alloys, which normally contain lead and bismuth, or only lead. semi-finished products made from these alloys in the form of bars are used for free cutting or, more precisely, for turning [8–10].

equal-channel angular pressing (ecap) is a very useful method for producing ultra-fine microstructures of al-based alloys with significantly improved mechanical properties [11–16]. during the last two decades, severe plastic deformation (spd) techniques have been widely applied to obtain an ultrafine-grained (ufg) structure, which can significantly improve the mechanical properties of al–mg–si alloys [17, 18]. among the various spd techniques, equal channel angular pressing (ecap) is the most promising method for fabricating large bulk ufg materials [19, 20]. a significant increase in strength is obtained during ecap processing [21]. in addition, some ultrafine-grained al-based alloys produced by the ecap procedure have shown a superplastic forming capability [16]. intensive plastic deformation by the ecap process also significantly increases the density of lattice defects in a solid solution of an al-based alloy, and can therefore accelerate the precipitation of strengthening particles during the post-ecap ageing treatment applied to an age-hardenable alloy [22–24].

al–mg–si alloys can be strengthened by precipitation hardening, so it is essential to study the precipitation sequence and the precipitation behavior. the precipitation sequence in al–mg–si alloys is: atomic clusters → gp zones → β′′ → β′ → β, where the atomic clusters form in the supersaturated solid solution; gp zones are generally spherical clusters with unknown structure; β′′ precipitates are fine needle-shaped zones with a monoclinic structure, generally present in al alloys aged to maximum hardness; β′ are rod-shaped precipitates with a hexagonal structure, found in overaged specimens; and β (mg2si) is the equilibrium phase in the precipitation sequence. among these, the β′′ phases are considered to make the main contribution to strength. the significant improvement in the strength of al alloys upon spd is due to the dynamic ageing effect, as reported in earlier work.

table 1. the chemical composition of aw 6012 [wt.%]:
si 1.13, fe 0.37, mn 0.53, mg 0.83, cr 0.2, bi 0.51, pb 0.83, al bal.

figure 1. the shape and the parameters of the short specimen for the tensile test.
the strength and the ductility of an al alloy were further improved by static ageing [25]. our study has been aimed at understanding the influence of the ecap process on the microstructure and the mechanical properties of almgsipb alloys.

2. material and methods
en aw 6012 aluminium alloy based on almgsipb was used as the experimental material. in the initial state, the experimental material was treated with t3: solution annealing and natural ageing. prior to deformation in an ecap die, the specimens of the initial state were solution annealed for 1.5 h at 550 °c. the alloy was then subjected to intensive plastic deformation with the following heat treatment: solution annealing for 1.5 h at 550 °c, followed by 4 passes of ecap and artificial ageing for 30 h at 100 °c. the ecap process was performed in a die with the following parameters: channel intersection angle φ = 90° and arc of curvature ψ = 37°. repetitive pressing of specimens 10 mm in diameter and 80 mm in length was performed in the ecap die at room temperature, using route bc (the sample was rotated by 90° in the same direction between passes). the chemical composition of the aluminium alloy is shown in table 1. figure 1 shows the parameters of the specimens that were used for the tensile test.

microstructures were prepared by standard metallographic methods (special enhanced etching; etchant: modified kroll, 92 ml distilled water, 6 ml hno3, and 2 ml hf) and were observed using an olympus optical microscope. the fracture surfaces were studied by means of a scanning electron microscope (a jeol jsm 7000f microscope operated at an accelerating voltage of 30 kv). particle identification was carried out using edx quantitative analysis with the inca-sight analyser. the influence of severe plastic deformation by the ecap process and post-ecap artificial ageing on the mechanical properties of the analyzed alloy was evaluated by a vickers hardness measurement (hv 10) and a tensile test. the hardness was estimated in a cross-section by the vickers test with a dwell time of 10 s, carried out according to the en iso 6507-1 standard [26]. the pre-ecap and post-ecap states of the samples were each evaluated by 10 measurements in 2 lines. in the vickers test, a very small diamond indenter with pyramidal geometry is forced into the surface of a specimen of the almgsipb aluminium alloy [27].

the mechanical properties (yield strength, ys; ultimate tensile strength, uts; elongation, a; and reduction of area, z) were measured by a uniaxial tensile test carried out using a zwick 1387 machine, according to stn en iso 6892-1 [28], on samples 8 mm in diameter (figure 1). the tension test can be used to ascertain several mechanical properties of materials that are important in mechanical engineering. the tensile test (initial strain rate of 2.5·10⁻⁴ s⁻¹) was carried out on specimens made from quenched, ecap-processed and post-ecap aged material. the tensile testing machine is designed to elongate the specimen at a constant rate, using an extensometer [27, 29]. the young modulus was measured using a wn2 52497 extensometer. a leica wild m 32 microscope was used to observe the macrostructure after the tensile test.

figure 2. microstructure of the aluminium alloy in the initial state.
figure 3. microstructure of the aluminium alloy after ecap and heat treatment.
figure 4. the edx analysis of the aluminium alloy in the initial state.
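the imposed strain is not quoted in the paper; as a rough orientation, the commonly used iwahashi et al. (1996) relation for the equivalent strain in an ecap die (an estimate added here, not the authors' calculation) gives roughly one unit of strain per pass for the die angles given above:

```python
# equivalent strain per ecap pass from the iwahashi et al. (1996) relation;
# this estimate is an illustrative assumption, not a number from the paper.
import math

def ecap_strain(passes, phi_deg=90.0, psi_deg=37.0):
    """equivalent von mises strain after n passes through a die with
    channel intersection angle phi and arc of curvature psi."""
    phi = math.radians(phi_deg)
    psi = math.radians(psi_deg)
    half = phi / 2.0 + psi / 2.0
    per_pass = (2.0 / math.tan(half) + psi / math.sin(half)) / math.sqrt(3.0)
    return passes * per_pass

print(ecap_strain(1))  # ~1.0 for phi = 90°, psi = 37°
print(ecap_strain(4))  # ~4.0 after the four passes used here
```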
3. results and discussions
the microstructure of the investigated alloy in the initial state (figure 2), and after quenching, deformation in the ecap die and the subsequent ageing treatment (figure 3), was analyzed in the central zone of the cross-section of the specimens. figure 3 shows the deformed, ultrafine-grained (ufg) structure with nearly equi-axial morphology, in which locations of elongated grains can be observed, together with characteristic shear bands along the cross-section of the sample. the heterogeneous microstructure of the ecap-ed alloy indicates a non-uniform deformation along the cross-section of the ecap-ed specimen. the shear bands are well developed in alloys after plastic deformation, and can be clearly distinguished from the other regions of the microstructure under an optical microscope (figure 3).

figure 4 shows the spectrum of the aluminium alloy in the pre-ecap state, and figure 5 shows the spectrum of the alloy in the post-ecap state. the material consists of a primary α-al solid solution, which forms the matrix, and secondary phases distributed at the grain boundaries and in the interdendritic regions. the secondary constituents, such as mg2si and α-al(femn)si compounds, as revealed by eds, are clustered in bands oriented parallel to the extrusion direction. the α-al(femn)si phase is also present as coarse particles. when the billets were heat treated, the grain segregations and the coarse mg2si particles in the α-al matrix dissolved. the maximum number of mg and si atoms in the alloy are thus in solid solution in the extruded section, and are therefore available for precipitating the hardening particles during ageing. coarse mg2si particles present in the microstructure do not help to strengthen the alloy. coarse mn and cr phases, detected by edx analyses, are distributed mainly along the grain boundaries. large pb-bearing particles are normally found adjacent to fe particles, and sometimes enveloping them [30].

figure 5. the edx analysis of the aluminium alloy after ecap and heat treatment.

the mechanical properties were evaluated by a static tensile test at room temperature. the dependence of the force on strain and elongation under static conditions is shown in figure 6. the specimen is deformed, usually to fracture, with a gradually increasing tensile load that is applied uniaxially along the long axis of the specimen. the results for the mechanical properties, evaluated as average values from six measurements, are summarized in table 2.

table 2. mechanical properties of the investigated aluminium alloy:
state: ys [mpa]; uts [mpa]; a [%]; z [%]; e [gpa]; hv10
initial: 309; 350; 17; 23; 75; 103
ecaped: 355; 372; 10; 8; 70; 125

the intensive plastic deformation realized by ecap technology increased the yield strength from 309 mpa to 355 mpa, and the ultimate tensile strength from 350 mpa to 372 mpa. however, the values of the plastic characteristics decreased: the elongation decreased from 17 % to 10 %, and the contraction from 23 % to 8 %. the modulus of elasticity of the samples decreased after intensive plastic deformation in comparison with the initial state. the exhaustion of plasticity and the hardening of aluminium alloy en aw 6012 were caused by the intensive plastic deformation. the en aw 6012 alloy showed a significant increase in hardness hv10 in the post-ecap state, up to 125 on average; in its initial state, en aw 6012 displayed a hardness of about 103 (tab. 2).
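for reference, the relative changes implied directly by the values in table 2 can be tabulated in a few lines (computed here from the table; the variable names are ours):

```python
# relative changes of the properties in table 2; the strength changes come
# out at about +15 % (ys) and +6 % (uts), the values quoted in the abstract.
props = {"ys [mpa]": (309, 355), "uts [mpa]": (350, 372),
         "a [%]": (17, 10), "z [%]": (23, 8),
         "e [gpa]": (75, 70), "hv10": (103, 125)}

for name, (initial, ecaped) in props.items():
    change = 100.0 * (ecaped - initial) / initial
    print(f"{name}: {initial} -> {ecaped} ({change:+.1f} %)")
```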
a macroscopic analysis of the samples in the initial state and after plastic deformation is shown in figures 7 and 8. it can be concluded that the large imposed plastic deformation left only minimal macroscopic deformation. the fracture surface of the sample seems to be slightly rugged. the surface of the fracture on each sample of en aw 6012 aluminium alloy was in the direction of the axial load.

figure 6. dependence of the force on strain and elongation under static conditions for en aw 6012.

the fracture surfaces, documented by sem after the tensile test, are presented in figures 9–12: figure 9 shows the fracture surface of en aw 6012 in the initial state, figure 10 shows the fracture surface in the post-ecap state, and details of the two fracture surfaces are shown in figures 11 and 12.

figure 7. fracture of en aw 6012 alloy pre-ecap.
figure 8. fracture of en aw 6012 alloy post-ecap.
figure 9. the fracture surface of en aw 6012 in the initial state.
figure 10. the fracture surface of en aw 6012 in the post-ecap state.

the analysis of the fracture surfaces of the investigated pre-ecap and post-ecap materials showed a dominance of transcrystalline ductile fracture. the effect of plastic deformation was revealed in particle cracking in the relevant materials. during plastic deformation, particles were cracked and/or separated from the interphase boundary by cavity failure systems, which developed from the former dimples. the morphology of the fracture surface is characterized by dimples with a local presence of striations. the fracture surfaces of the samples show a mixed morphology, formed by the surfaces of the particles and coalesced voids within a transgranular ductile fracture. both surfaces show visible lines that follow the directions of the failure. the fracture initiated at the surface of the specimen, and the crack growth continued perpendicular to the axis of the specimen.

figure 11. detail of the fracture surface of en aw 6012 in the initial state.
figure 12. detail of the fracture surface of en aw 6012 in the post-ecap state.

4. conclusions
on the basis of our experimental work, we have drawn the following conclusions. the effects of the intensive plastic deformation carried out by ecap technology can be summarized as follows:
• increased strength properties: the yield strength increased from 309 mpa to 355 mpa (by about 15 %), and the ultimate tensile strength from 350 mpa to 372 mpa (by about 6 %).
• there was also a significant increase in hardness hv10, from 103 to 125. however, the plastic characteristics decreased: the elongation from 17 % to 10 %, and the reduction of area from 23 % to 8 %.
• due to the exhaustion of plasticity and the hardening of the en aw 6012 aluminium alloy, the samples showed a lower elasticity modulus value after the application of intensive plastic deformation.
• a macroscopic examination proved that the surface of the fracture is perpendicular to the load in the en aw 6012 alloy in its initial state, and also in the ecaped state.
• the surface fracture formed in the direction of the axial load.
• after intensive plastic deformation, shear bands can be observed on the microstructural level. they point to non-uniform deformation along the cross-section of the sample: the deformation concentrated along narrow paths, in which the shear bands developed.
• a flat fracture with dimples and a local presence of striations was observed on en aw 6012 in both states (figures 11 and 12). the surface was highly fractured, with a fine-grained morphology.

acknowledgements
this study was supported by the grant agency of the slovak republic, grant project vega 1/0549/14.

references
[1] katgerman, l., eskin, d.: hardening, annealing, and aging. in: handbook of aluminum. edited by g. e. totten, d. c. mackenzie, new york, 1, 2003, p. 266-280, isbn: 0-8247-0494-0
[2] smolej, a. et al.: influence of heat treatment on the properties of the free-cutting almgsipb alloy. journal of materials processing technology, 53, 1-2, 1995, p. 373-384, doi:10.1016/0924-0136(95)01994-p
[3] davis, j. r.: machining of aluminum and aluminum alloys. in: asm handbook, vol. 16. edited by j. r. davis, materials park, ohio, asm international, 1989, p. 761-804, isbn: 978-0-87170-022-3
[4] nejezchlebova, j. et al.: ultrasonic detection of ductile-to-brittle transitions in free-cutting aluminum alloys. ndt & e international, 69, 2015, p. 40-47, doi:10.1016/j.ndteint.2014.09.007
[5] de hosson, j. t. m.: lead induced intergranular fracture in aluminum alloy aa6262. materials science and engineering a, 361, 1-2, 2003, p. 331-337, doi:10.1016/s0921-5093(03)00521-5
[6] timelli, g., bonollo, f.: influence of tin and bismuth on machinability of lead free 6000 series aluminium alloys. materials science and technology, 27, 1, 2011, p. 291-299, doi:10.1179/026708309x12595712305799
[7] koch, a., antrekowitsch, h.: free-cutting aluminium alloys with tin as substitution for lead. bhm berg- und hüttenmännische monatshefte, 153, 7, 2008, p. 278-281, doi:10.1007/s00501-008-0390-5
[8] kopač, j. et al.: strategy of machinability of aluminium alloys for free cutting. proceedings of the thirtieth international matador conference, 1993, p. 151-156, doi:10.1007/978-1-349-13255-3_20
[9] davis, j. r.: physical metallurgy. in: asm specialty handbook: aluminum and aluminum alloys. edited by j. r. davis, asm international, ohio, 1993, p. 31-47, isbn: 978-0-87170-496-2
[10] sokolovič, m., kopač, j., smolej, a.: model of quality management in development of new free-cutting al-alloy. journal of achievements in materials and manufacturing engineering, 19, 2, 2006, p. 92-98
[11] segal, v. m.: materials processing by simple shear. materials science and engineering, 197, 2, 1995, p. 157-164, doi:10.1016/0921-5093(95)09705-8
[12] furukawa, m. et al.: processing of metals by equal-channel angular pressing. journal of materials science, 36, 2001, p. 2835-2843, doi:10.1023/a:1017932417043
[13] furukawa, m., horita, z., langdon, t. g.: developing ultrafine grain sizes using severe plastic deformation. advanced engineering materials, 3, 2001, p. 121-125, doi:10.1002/1527-2648(200103)3:3<121::aid-adem121>3.0.co;2-v
[14] horita, z. et al.: improvement of mechanical properties for al alloys using equal-channel angular pressing. journal of materials processing technology, 117, 2001, p. 288-292, doi:10.1016/s0924-0136(01)00783-x
[15] islamgaliev, r. k. et al.: characteristics of superplasticity in an ultrafine-grained aluminum alloy processed by eca pressing. scripta materialia, 49, 2003, p. 467-472, doi:10.1016/s1359-6462(03)00291-4
[16] kvačkaj, t. et al.: ultra fine structure and properties formation of en aw 6082 alloy. high temperature materials and processes, 27, 3, 2008, p. 193-202, doi:10.1515/htmp.2008.27.3.193
[17] nurislamova, g. et al.: nanostructure and related mechanical properties of an al-mg-si alloy processed by severe plastic deformation. philosophical magazine letters, 88, 6, 2008, p. 459-466, doi:10.1080/09500830802186938
[18] khan, a. s., meredith, c. s.: thermomechanical response of al 6061 with and without equal channel angular pressing (ecap). international journal of plasticity, 26, 2010, p. 189-203, doi:10.1016/j.ijplas.2009.07.002
[19] huang, y., prangnell, p. b.: continuous frictional angular extrusion and its application in the production of ultrafine-grained sheet metals. scripta materialia, 56, 2007, p. 333-336, doi:10.1016/j.scriptamat.2006.11.011
[20] cherukuri, b., nedkova, t. s., srinivasan, r.: a comparison of the properties of spd-processed aa6061 by equal-channel angular pressing, multi-axial compressions/forgings and accumulative roll bonding. materials science and engineering a, 410, 2005, p. 394-397, doi:10.1016/j.msea.2005.08.024
[21] hockauf, k. et al.: improvement of strength and ductility for a 6056 aluminum alloy achieved by a combination of equal-channel angular pressing and aging treatment. journal of materials science, 45, 2010, p. 4754-4760, doi:10.1007/s10853-010-4544-y
[22] kim, w. j. et al.: optimization of strength and ductility of 2024 al by equal channel angular pressing (ecap) and post-ecap aging. scripta materialia, 49, 4, 2003, p. 333-338, doi:10.1016/s1359-6462(03)00260-4
[23] murashkin, m. y.: strength of commercial aluminum alloys after equal channel angular pressing and post-ecap processing. solid state phenomena, 114, 2006, p. 91-96, doi:10.4028/www.scientific.net/ssp.114.91
[24] kim, j. k., kim, w. j.: effect of post-ecap aging on mechanical properties of age-hardenable aluminum alloys. solid state phenomena, 124-126, 2007, p. 1437-1440, doi:10.4028/www.scientific.net/ssp.124-126.1437
[25] liu, manping et al.: dsc analyses of static and dynamic precipitation of an al–mg–si–cu aluminum alloy. progress in natural science: materials international, 25, 2, 2015, p. 151-158, doi:10.1016/j.pnsc.2015.02.004
[26] iso 6507-1: metallic materials – vickers hardness test – part 1: test method, 2005
[27] callister, w. d. jr., rethwisch, d. g.: chapter 6 / mechanical properties of metals. in: materials science and engineering: an introduction, seventh edition, john wiley & sons, new york, 2007, p. 133-165, isbn: 978-0-471-73696-7
[28] iso 6892-1: metallic materials – tensile testing – part 1: method of test at room temperature, 2009
[29] miháliková, m., német, m., vojtko, m.: impact of strain rate on microalloyed steel sheet fracturing. acta polytechnica, 54, 4, 2014, p. 281-28, doi:10.14311/ap.2014.54.0281
[30] timelli, g., bonollo, f.: influence of tin and bismuth on machinability of lead free 6000 series aluminium alloys. materials science and technology, 27, 1, 2011, p. 291-299, doi:10.1179/026708309x12595712305799

acta polytechnica doi:10.14311/ap.2016.56.0367 acta polytechnica 56(5):367–372, 2016 © czech technical university in prague, 2016, available online at http://ojs.cvut.cz/ojs/index.php/ap

an experimental assessment of the plate heat exchanger characteristics by wilson plot method

jan opatřil, jan havlík, ondřej bartoš (*), tomáš dlouhý
ctu in prague, department of energy engineering, technická 4, 166 07 praha, czechia
(*) corresponding author: ondrej.bartos@fs.cvut.cz

abstract. the aim of this paper is to suggest an evaluation method for plate heat exchangers (phe) based on experimental data and the wilson plot method. for the purpose of the project, a new experimental loop was built for testing phes, to obtain the overall heat transfer coefficient and the pressure drop between the inlet and the outlet of the fluid. the measurements were done for three different phes, within the performance range of 30–100 kw. the working fluid was water on both sides of the phe. the exchangers differ in the number of plates as well as in the extrusion profiles. the wilson plot evaluation method was used in the processing of the experimental data. to obtain more accurate correlations between the experimental data and the theoretical results yielded by the wilson plot, the method was enhanced by involving the measured pressure drop. this approach could be useful for phe design software and for manufacturers.

keywords: plate heat exchanger; wilson plot; experimental measurement; heat transfer; pressure drop; friction factor.

1. introduction
plate heat exchangers (phe) are a group of heat exchangers generally characterized by their compactness and flexibility. phes can be commonly used for a wide range of fluids, including non-newtonian liquids. on the other hand, their application is limited by the relatively low allowable pressure difference between the media, which arises from the construction. phes are typically assembled from corrugated thin metal plates, which provide a large heat transfer surface.
the configuration of the plates influences both important parameters, the pressure drop and the heat transfer. several different plate patterns have been developed worldwide to reach the desired operating parameters. for the purpose of this work, three pattern configurations, called chevron, tubular and dimpel, were tested [1, 2].

the main goal of the paper is to suggest a method to validate the criterial equations applied in phe design from easily measured parameters. the temperatures of the fluid at the inlet and outlet, the flow rates on both sides and the pressure drop were considered. the first step was the experimental determination of the overall heat transfer coefficient (ohtc) in the phe. all the achieved data were further used for the evaluation of the heat transfer coefficient on both sides of the plates. equations for phe characterisation are available in the literature for the most used patterns [1, 2]. a number of researchers have experimentally investigated different plate patterns to describe their basic characteristics by newly developed criterial equations [3–6], and some researchers have focused their work on heat exchanger optimization, e.g. [7, 8]. the presented work suggests a similar way, but with only basic knowledge of the geometrical characteristics of the plate patterns. the advantage of the approach is the possibility to verify or recalculate existing phes. a modification of the wilson plot technique was involved to obtain the new equations. a necessary step in the project was to develop and manufacture a testing stand for the measurement of the ohtc. in the second part, the experimental investigation was carried out on the three phes. the phes differ in the pattern of the plates used as heat exchanging surfaces; the chevron, tubular and dimpel types of metal sheet pattern were used. the measurement was performed with water on both sides of each phe.

2. experimental set-up
the testing loop was built in the laboratory of ctu in prague in the year 2014 as part of a project with an industrial partner. the aim was a stand for the variable measurement of phes. an automatically controlled biomass boiler (fiedler s.r.o.), with a performance range of 30–100 kw, was chosen as the heat source. for the phe cooling, a water loop with the maximum cooling capacity available in the laboratory (300 kw) was used. the schema of the measurement loop is in figure 1. the design of the testing loop combined the facilities available in the laboratories (the boiler and the cooling tower) with the demands for the measurement of the required physical quantities. in comparison with generally recommended schemes, a hydraulic separator is included. the hydraulic separator was involved to separate the phe loop and the boiler loop on the heating water side. this solution brings the advantage of the independence of the heat transfer in the boiler itself, because the hydraulic conditions in the boiler are nearly constant and independent of the flow conditions in the phe for varying mass flow. the phe was insulated with polystyrene foam.

figure 1. the schema of the measurement set-up (hot water from the boiler on the a-side, cooling water on the b-side).
3. the method
the method of the experimental research was chosen with respect to the limited knowledge of the phe parameters. the measurements were performed for four different conditions. the first three conditions differed only in the performance of the boiler; this means that hot water from the boiler flows through the a-side of the phe (see figure 1). in each of the conditions, 4 flow rates in the phe were tested. the performance of the boiler was adjusted by different timing of the fuel supply. the fourth condition was measured with the phe turned around so as to swap the hot and cold water; in this case, hot water was flowing through the b-side of the phe. this method helps to determine the ohtc and, moreover, helps with the evaluation of the heat transfer coefficient for each side using the wilson plot method.

the data processing to yield the ohtc is straightforward from the measured temperatures and pressures; the density and the heat capacity of the water are computed from the iapws if97 formulation at the mean temperature between the inlet and the outlet. the performance of the hot and the cold side of the phe can be written as
$$Q_{\rm hot} = c_{p,\rm hot}\,\rho_{\rm hot}\,\dot V_{\rm hot}\,(t_{\rm hot,in} - t_{\rm hot,out}),$$
$$Q_{\rm cold} = c_{p,\rm cold}\,\rho_{\rm cold}\,\dot V_{\rm cold}\,(t_{\rm cold,out} - t_{\rm cold,in})$$
(both in [kw]). the area of the heating surface $A$ [m²] is unknown for all tested exchangers. the determination of the ohtc can therefore be done only in combination with $A$, by the following equation:
$$\mathrm{OHTC}^{*} = \mathrm{OHTC}\cdot A = \frac{Q_{\rm hot} + Q_{\rm cold}}{2\,\mathrm{LMTD}} \quad [\mathrm{kw/k}].$$
the logarithmic mean temperature difference is defined for the testing loop as
$$\mathrm{LMTD} = \frac{(t_{\rm hot,in} - t_{\rm cold,out}) - (t_{\rm hot,out} - t_{\rm cold,in})}{\ln\dfrac{t_{\rm hot,in} - t_{\rm cold,out}}{t_{\rm hot,out} - t_{\rm cold,in}}} \quad [\mathrm{k}].$$
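a minimal sketch of this evaluation, assuming constant water properties instead of the iapws if97 formulation (an intentional simplification) and an illustrative, not measured, operating point:

```python
# sketch of the ohtc* evaluation described above; constant water properties
# replace the iapws if97 formulation (assumption made for brevity).
import math

RHO = 985.0   # kg/m^3, water near 55 °c (illustrative)
CP = 4.183e3  # J/(kg K)

def lmtd(t_hot_in, t_hot_out, t_cold_in, t_cold_out):
    """log-mean temperature difference for counter-current flow."""
    d1 = t_hot_in - t_cold_out
    d2 = t_hot_out - t_cold_in
    return (d1 - d2) / math.log(d1 / d2)

def ohtc_star(vflow_hot, vflow_cold,
              t_hot_in, t_hot_out, t_cold_in, t_cold_out):
    """ohtc* = U*A in W/K, averaging the hot- and cold-side duties."""
    q_hot = CP * RHO * vflow_hot * (t_hot_in - t_hot_out)     # W
    q_cold = CP * RHO * vflow_cold * (t_cold_out - t_cold_in)
    return 0.5 * (q_hot + q_cold) / lmtd(t_hot_in, t_hot_out,
                                         t_cold_in, t_cold_out)

v = 7.2 / 1000 / 60  # m^3/s, an assumed 7.2 l/min on both sides
print(ohtc_star(v, v, 60.0, 45.0, 20.0, 33.0))  # ~270 W/K for this example
```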
3.1. wilson plot method
most of the convective heat transfer processes inherent to heat exchangers usually involve complex geometries and complicated flows, so analytical solutions are not possible. in many cases, there are several empirical methods to calculate the htc for a similar type of heat exchanger and geometry, which provide different results [9]. the wilson plot method is a suitable technique to estimate the convection coefficients in a variety of convective heat transfer processes. the method avoids the direct measurement of the surface temperature and the disturbance of the fluid flow and heat transfer introduced while attempting to measure those temperatures. it relies on the fact that the overall thermal resistance can be extracted from experimental measurements in a reliable manner. the aim of the method is the calculation of the convection coefficient (or thermal resistance) of the fluid in the criterial formula for the defined type of convective heat transfer. the application of the wilson plot method is based on the measurement of the experimental mass flow rate and temperatures [1]. the original method was derived for processes of transferring heat by convection in which the thermal resistance on one side of the heat transfer surface remains constant (as in the case of condensation [10]), while varying the mass flow on the other side changes the total thermal resistance in the heat exchanger. in particular, it was derived for determining the coolant htc in steam condensers, where the overall htc is more sensitive to the coolant-side htc than to the condensation htc [10]. modifications of the wilson plot method based on this principle have been derived for a wide range of heat exchanger geometries.

3.2. evaluating parameters of plate heat exchangers
in the case of plate heat exchangers, the hydraulic diameter is very small, of the order of millimetres; therefore, turbulent conditions are achieved very early. the reynolds number is a function of the fluid flow rate; it represents the ratio of momentum to viscous forces:
$$Re = \frac{u D}{\nu}. \quad (1)$$
the prandtl number is a function of two important physical properties (thermal and momentum). it may be seen as the ratio of the rate at which viscous effects penetrate the material to the rate at which thermal energy penetrates the material:
$$Pr = \frac{\mu c_p}{k}.$$
the nusselt number is equal to the dimensionless temperature gradient at the surface, and it essentially provides a measure of convective heat transfer. it may be viewed as the ratio of the conduction resistance of a material to the convection resistance of the same material:
$$Nu = \frac{h D}{k}. \quad (2)$$
in single-phase fluid flow heat transfer, nu is generally represented by a pragmatic expression such as the one given in (3); the term $(\mu/\mu_w)^{0.14}$ accounts for variable viscosity effects:
$$Nu = C\, Re^{n}\, Pr^{m} \left(\frac{\mu}{\mu_w}\right)^{0.14}. \quad (3)$$
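as a quick numerical illustration of (1)–(3) (the property values below are rough numbers for water at about 40 °c and an assumed hydraulic diameter, not measured data):

```python
# dimensionless groups from eqs. (1)-(3); property values are illustrative
# numbers for water at ~40 °c, and d_h is an assumed hydraulic diameter.
def reynolds(u, d_h, nu):
    """re = u*d/nu: momentum vs. viscous forces."""
    return u * d_h / nu

def prandtl(mu, cp, k):
    """pr = mu*cp/k: momentum vs. thermal diffusion."""
    return mu * cp / k

def nusselt(c, re, pr, mu_ratio, n=0.9, m=0.4):
    """nu = c*re^n*pr^m*(mu/mu_w)^0.14, the correlation form of eq. (3);
    n = 0.9 and m = 0.4 follow the values adopted later in the paper."""
    return c * re**n * pr**m * mu_ratio**0.14

re = reynolds(u=0.5, d_h=4e-3, nu=0.66e-6)   # ~3000: turbulent in a phe
pr = prandtl(mu=6.5e-4, cp=4179.0, k=0.63)   # ~4.3 for warm water
print(nusselt(c=0.2, re=re, pr=pr, mu_ratio=1.0))
```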
3.3. the wilson plot modification
the derivation of the wilson plot method is based on the heat balance of the heat transfer in a heat exchanger [11]. the thermal resistance $R$ in this process is determined from the enthalpy change of the fluid:
$$R = \frac{1}{U A} = \frac{\mathrm{LMTD}}{\dot m_l\, c_{p,l}\,(t_{l,\rm out} - t_{l,\rm in})}.$$
the overall heat transfer coefficient $U$ can be expressed as
$$\frac{1}{U} = \frac{1}{h_A} + \frac{s}{k_w} + \frac{1}{h_B}, \quad (4)$$
where
$$h_A = \frac{k_A}{D_A}\, C_A\, Re_A^{n}\, Pr_A^{m} \left(\frac{\mu}{\mu_w}\right)_A^{0.14}, \quad (5)$$
$$h_B = \frac{k_B}{D_B}\, C_B\, Re_B^{n}\, Pr_B^{m} \left(\frac{\mu}{\mu_w}\right)_B^{0.14}. \quad (6)$$
by the substitution of the terms in (5), (6) into (4), we obtain
$$\left(\frac{1}{U} - \frac{s}{k_w}\right) \frac{k_A}{D_A}\, Re_A^{n}\, Pr_A^{m} \left(\frac{\mu}{\mu_w}\right)_A^{0.14} = \frac{1}{C_A} + \frac{1}{C_B}\, \frac{\frac{k_A}{D_A}\, Re_A^{n}\, Pr_A^{m} \left(\frac{\mu}{\mu_w}\right)_A^{0.14}}{\frac{k_B}{D_B}\, Re_B^{n}\, Pr_B^{m} \left(\frac{\mu}{\mu_w}\right)_B^{0.14}}. \quad (7)$$
it is possible to write (7) in the linear form $y_1 = a x_1 + b$, where $y_1$ and $x_1$ are
$$y_1 = \left(\frac{1}{U} - \frac{s}{k_w}\right) \frac{k_A}{D_A}\, Re_A^{n}\, Pr_A^{m} \left(\frac{\mu}{\mu_w}\right)_A^{0.14}, \qquad x_1 = \frac{\frac{k_A}{D_A}\, Re_A^{n}\, Pr_A^{m} \left(\frac{\mu}{\mu_w}\right)_A^{0.14}}{\frac{k_B}{D_B}\, Re_B^{n}\, Pr_B^{m} \left(\frac{\mu}{\mu_w}\right)_B^{0.14}}.$$
the set of values of $y$ and $x$, given for the set of experimental data, can be fitted by linear regression for an estimated value of the coefficient $n$. the value pragmatically used for the coefficient $m$ is 0.4 in this transfer process [3]. the coefficients $a$, $b$ are determined as the result of the linear regression: comparing with (7), the slope yields $a = 1/C_B$ and the intercept yields $b = 1/C_A$. the $x$ and $y$ regression starts with an initial $n$ value as well as a guess for the $C_B$ value. these values have an impact on the wall temperature calculations; therefore, the viscosity ratio must be adjusted in both linear regression processes. from the $x_1$ and $y_1$ regression, the $C_A$ and $C_B$ coefficients are found. this $C_B$ coefficient is then used in a mathematical relaxation method to converge the viscosity ratio in the $x_2$ and $y_2$ linear regression, producing values of $n$ and $C_A$. the new $n$ is then used in the next iteration of regressions (which has the new viscosity ratios to be relaxed). the calculations continue following this procedure until the differences between the successive $n$ and $C_B$ values and the $C_A$ values from the $x_1$–$y_1$ and $x_2$–$y_2$ linear regressions reach a predetermined allowable error [11].

figure 2. evaluation of the criterial equation [1].
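a schematic single pass of this regression is sketched below. in the full procedure the fit is repeated, re-evaluating the viscosity ratios (via the wall temperature implied by the current $C_A$, $C_B$) and the exponent $n$ until successive values agree. the arrays stand for measured data and the names are ours, not the authors' code.

```python
# one linear regression pass of eq. (7); phi_a/phi_b are the
# k/D * Re^n * Pr^m * (mu/mu_w)^0.14 groups of eqs. (5)-(6).
import numpy as np

def wilson_pass(u_meas, re_a, pr_a, re_b, pr_b, kd_a, kd_b,
                mu_ratio_a, mu_ratio_b, s_over_kw=0.0, n=0.9, m=0.4):
    """fit y1 = a*x1 + b over all operating points; returns (c_a, c_b)."""
    phi_a = kd_a * re_a**n * pr_a**m * mu_ratio_a**0.14
    phi_b = kd_b * re_b**n * pr_b**m * mu_ratio_b**0.14
    y1 = (1.0 / u_meas - s_over_kw) * phi_a   # lhs of eq. (7)
    x1 = phi_a / phi_b                        # ratio on the rhs
    slope, intercept = np.polyfit(x1, y1, 1)  # y1 = slope*x1 + intercept
    return 1.0 / intercept, 1.0 / slope       # c_a = 1/b, c_b = 1/a
```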
in our case, when the evaluation method should be independent of the geometry of the phe, the following approach can be used. the velocity for the determination of the reynolds number can be given as $u = \dot V / S$. thus (2) and (3) are transformed into the term
$$h = \frac{k}{D}\, C \left(\frac{\dot V D}{S \nu}\right)^{n} Pr^{m} \left(\frac{\mu}{\mu_w}\right)^{0.14}.$$
the values of the parameters $D$, $S$ are unknown, but these parameters are constant for the specific geometry of a heat exchanger. therefore, we introduce the modified reynolds number, defined as $Re_{\rm mod} = \dot V / \nu$, and the unknown parameters $D$, $S$ are included in the constant $C_{\rm mod} = C D^{n} / (D S^{n})$. equation (6) is transformed into the term
$$h = C_{\rm mod}\, k\, Re_{\rm mod}^{n}\, Pr^{m} \left(\frac{\mu}{\mu_w}\right)^{0.14},$$
which is used in the calculation below instead of the terms in (2) and (3). according to the calculation procedure above, the value of the coefficient $m$ was defined as 0.4 and the value of the coefficient $n$ was calculated as approximately 0.9, which corresponds best for each type of the tested plate heat exchangers. the resulting values of the coefficient $C_{\rm mod}$ are shown in table 1.

table 1. resulting values of coefficient $C_{\rm mod}$:
heat exchanger: chevron / dimpel / tubular
cold side: 2.79 / 1.80 / 1.77
hot side: 8.61 / 2.88 / 2.97

the results in table 1 show that the heat transfer coefficient is higher on the hot side of the phe. for the explanation of the convection coefficient difference, an analysis of the pressure losses was done.

3.4. pressure losses
the pressure loss is defined as
$$\Delta p = f\, \frac{4 L}{D_e}\, \frac{\rho u^2}{2},$$
where $f$ is the friction factor [3].
its value for a plate exchanger depends on the flow type, see table 2:

table 2. friction factor dependence on the fluid flow [3]:
laminar: $f = 1.328\, Re^{-1/2}$
turbulent: $f = 0.074\, Re^{-1/5}$
transition: $f = 0.074\, Re^{-1/5} - 1742\, Re^{-1}$

with
$$f = a\, Re^{b} = a \left(\frac{u D_e}{\nu}\right)^{b}, \qquad u = \frac{\dot V}{S},$$
the pressure loss becomes
$$\Delta p = \frac{2\, a\, L\, \rho}{D_e^{1-b}\, \nu^{b}\, S^{2+b}}\, \dot V^{2+b} = A\, \dot V^{2+b}.$$
the measured dependence between the pressure loss and the volume flow of water for both sides of the phe is presented in figures 3 and 4. the measured data were fitted with a power function, and the coefficients of the functions are given in tables 3 and 4.

figure 3. evaluation of the pressure-loss dependence on the a-side (hot side).
figure 4. evaluation of the pressure-loss dependence on the b-side (cold side).

table 3. coefficients on the a-side of the phe (exponent 2+b; b; approximate b):
chevron: 1.7302; −0.2608; ≈ −1/3.8
tubular: 1.8134; −0.1866; ≈ −1/5.4
dimpel: 1.7918; −0.2086; ≈ −1/4.8

table 4. coefficients on the b-side of the phe (exponent 2+b; b; approximate b):
chevron: 1.5074; −0.4926; ≈ −1/2.0
tubular: 1.6261; −0.3739; ≈ −1/2.7
dimpel: 1.515; −0.485; ≈ −1/2.1

the results in table 3 correspond to turbulent flow and the coefficient −1/5, while the results in table 4 correspond to laminar flow and the coefficient −1/2. the analysis of the measured data thus indicates that on the a-side the turbulent flow is established, and on the b-side the flow is mostly laminar. this interesting result could be useful for the manufacturer; for example, an increase of the turbulence intensity on the b-side could possibly enhance the performance.
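the exponent fit behind tables 3 and 4 can be reproduced with a straightforward log-log regression; the flow/pressure pairs below are synthetic placeholders, not the measured data:

```python
# fitting the measured pressure drop to dp = A * V^(2+b), as derived above;
# the data points are made-up placeholders, not the measurements behind
# tables 3 and 4.
import numpy as np

def fit_pressure_exponent(vol_flow, dp):
    """least-squares fit of log(dp) = log(A) + (2+b)*log(V); returns (A, b),
    so b ~ -1/5 suggests turbulent and b ~ -1/2 laminar flow (cf. table 2)."""
    exp_2b, log_a = np.polyfit(np.log(vol_flow), np.log(dp), 1)
    return np.exp(log_a), exp_2b - 2.0

v = np.array([1.0, 2.0, 3.0, 4.0])   # m^3/h
dp = 0.12 * v ** 1.73                # kPa, synthetic a-side-like data
a_coef, b = fit_pressure_exponent(v, dp)
print(a_coef, b)                     # -> 0.12, -0.27 (turbulent-like)
```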
list of symbols
a      heat transfer area [m2]
c      constant
cp     specific heat capacity [kJ/(kg K)]
d      characteristic diameter [m]
h      convection coefficient [W/(m2 K)]
k      thermal conductivity [W/(m K)]
lmtd   logarithmic mean temperature difference [K]
m      mass flow rate [kg/s]
nu     nusselt number
pr     prandtl number
r      thermal resistance [K/W]
re     reynolds number
s      wall thickness [m]
s      flow cross-section [m2]
t      temperature [K]
u      overall heat transfer coefficient [W/(m2 K)]
v      velocity [m/s]

subscripts
a      fluid a
b      fluid b
in     inlet
l      liquid
out    outlet
w      wall

superscripts
m      exponent of prandtl number
n      exponent of reynolds number
*      normalized value

acknowledgements
the authors gratefully acknowledge the support of all colleagues from g.a.m. heat spol. s.r.o. who helped to finish this project. this work has been supported by the grant agency of the czech technical university in prague, grant no. sgs 14/183.

references
[1] j. fernández-seara, f. j. uhía, j. sieres and a. campo. "a general review of the wilson plot method and its modifications to determine convection coefficients in heat exchange devices". applied thermal engineering, pp. 2745–2757, 2007.
[2] r. k. shah, d. p. sekulic. fundamentals of heat exchanger design. new york, wiley, 2003.
[3] r. l. pradhan, d. ravikumar, d. l. pradhan. "review of nusselt number correlation for single phase fluid flow through a plate heat exchanger to develop c# code application software". journal of mechanical and civil engineering, pp. 01–08, 2013. issn (e): 2278-1684, issn (p): 2320-334x.
[4] a. hashmi, f. tahir, u. hameed. "empirical nusselt number correlation for single phase flow through a plate heat exchanger". recent advances in fluid mechanics, heat & mass transfer and biology, 2011.
[5] s. d. pandey, v. k. nema. "investigation of the performance parameters of an experimental plate heat exchanger in single phase flow". international journal of energy engineering, 2011, 1(1): 19–24. doi:10.5923/j.ijee.20110101.04
[6] f. akturk, g. gulben, s. aradag, n. sezer uzol, s. kakac. "experimental investigation of the characteristics of a chevron type gasketed-plate heat exchanger". 6th international advanced technologies symposium (iats'11), 16–18 may 2011, elazığ, turkey.
[7] j. m. pinto, j. a. w. gut. "a screening method for the optimal selection of plate heat exchanger configurations". brazilian journal of chemical engineering, vol. 19, no. 04, pp. 433–439, october–december 2002. issn 0104-6632.
[8] v. r. naik, v. k. matawala. "experimental investigation of single phase chevron type gasket plate heat exchanger". international journal of engineering and advanced technology (ijeat), volume 2, issue 4, april 2013. issn: 2249-8958.
[9] j. havlík, t. dlouhý. "experimental verification of theoretical models to calculate the heat transfer coefficient in the shell-and-tube heat exchanger". in 20th international conference engineering mechanics 2014. brno: brno university of technology, 2014, pp. 216–219. isbn 978-80-214-4871-1. web of science: 000364573900047.
[10] j. havlík, t. dlouhý. "condensation of water vapour in a vertical tube condenser". acta polytechnica, 2015, vol. 55, no. 5, pp. 306–312. issn 1210-2709. scopus: 2-s2.0-84946073960.
[11] g. f. hewitt, g. l. shires and t. r. bott. process heat transfer. new york: begell house, 2000.
acta polytechnica
doi:10.14311/ap.2015.55.0329
acta polytechnica 55(5):329–334, 2015
© czech technical university in prague, 2015
available online at http://ojs.cvut.cz/ojs/index.php/ap

effect of tube pitch on heat transfer in sprinkled tube bundle

petr kracík∗, jiří pospíšil
department of power engineering, energy institute, technická 2896/2, brno 616 69, czech republic
∗ corresponding author: kracik@fme.vutbr.cz

abstract. water flowing over a sprinkled tube bundle forms three basic modes: the droplet mode (the liquid drips from one tube to another), the jet mode (with an increasing flow rate, the droplets merge into a column) and the membrane (sheet) mode (with a further increase in the flow rate of the falling film liquid, the columns merge and create sheets between the tubes; with a sufficient flow rate, the sheets merge and the tube bundle is completely covered by a thin liquid film). there are several factors influencing both the individual modes and the heat transfer. beside the above-mentioned falling film liquid flow rate, these are for instance the tube diameters, the tube pitches in the tube bundle, and the physical conditions of the falling film liquid. this paper presents a summary of data measured at atmospheric pressure, with a tube bundle consisting of copper tubes of 12 millimetres in diameter, and with a studied tube length of one metre. the tubes are situated horizontally one above another at a pitch of 15 to 30 mm, and there is a distribution tube placed above them with water flowing through apertures of 1.0 mm in diameter at a 9.2 mm span. two thermal conditions have been tested with all pitches: 15 °c to 40 °c and 15 °c to 45 °c. the temperature of the falling film liquid, which was heated during the flow through the exchanger, was 15 °c at the distribution tube input. the temperature of the heating liquid at the exchanger input, which had a constant flow rate of approx. 7.2 litres per minute, was 40 °c, or alternatively 45 °c.

keywords: sprinkled; tube bundle; heat transfer; tube pitch.

1. introduction
a liquid flowing over a horizontal tube bundle may form the three basic sprinkle modes visible in figure 1: the droplet mode (d), the jet mode (j) and the membrane (sheet) mode (s) [1–4]. with an increasing flow rate of the falling film liquid, the transition from the droplet to the jet mode is defined by the formation of one stable column of liquid among the droplets (this transition mode will be hereinafter referred to as the "d→j" mode). the transition from the jet to the membrane (sheet) mode is defined by the connection of two columns and their formation of a small triangular sheet (hereinafter referred to as the "j→s" mode). in this mode, columns and sheets exist side by side. with a decrease in the flow rate of the falling film liquid, the reverse process takes place. the formation of a stable liquid column among the sheets is referred to as the sheet-column state (hereinafter referred to as the "s→j" mode), and the disintegration of the first column in the column mode and its replacement by droplets changes the state to the jet-droplet mode (hereinafter referred to as the "j→d" mode).
there are several factors influencing the individual mode types as well as the heat transfer. beside the above-mentioned falling film liquid flow rate (tested in [5], for example), these are for instance the tube diameters (tested in [6]), the tube pitches in a bundle (tested in [7]) or the physical condition of the falling film liquid (tested in [8–10]).

figure 1. sprinkle modes [1].

comparisons of the following results with those authors must be made judiciously, since their results were achieved under strict laboratory conditions with one to three tubes, whereas our research deals with the behaviour of a large tube bundle.

2. effects of tube pitch on a falling film liquid
the influence of the gap between the sprinkled profiles on the flow mode was tested by wang et al. [7]. the tested bundle consisted of a distribution tube out of which a liquid was flowing (the tested substances were ethylene glycol and water), a tube of circular cross-section, whose purpose was to regulate the distribution of the falling film liquid, and two flat tubes which created a gap that demonstrated the flow mode. the tested flat tubes were made of polished aluminium and had the following dimensions: 400 × 25.4 × 3.18 mm (length × height × width). the measuring itself was preceded by a 2- to 3-hour tube sprinkling which ensured an ideal adherence of the liquid to the surface (making the surface ideally wettable). the measurement procedure was to decrease the flow rate from the maximum value; after a zero flow rate had been reached, it was again increased to the maximum possible rate. the measurement was repeated three times for each profile gap, and only after these results were obtained were the reynolds numbers for the given mode transitions and the relative errors of these values determined. table 1 provides an overview of the measurement results, where the falling film liquid was water with approximately the same physical properties, expressed by the authors by means of the modified galilei number ga^0.25 ≈ 450. this value at atmospheric pressure (101 325 pa) corresponds to a water temperature of approx. 27.4 °c [11].

table 1. reynolds numbers for various tube pitches according to wang et al. [7]:

  s–d [mm]   increasing flow                  decreasing flow
             d/d–j  j–d/j  j/s–j  s–j/s       s–j/s  j/s–j  j–d/j  d/d–j
  4.8        203    259    363    395         395    365    268    209
  6.4        231    293    390    427         429    389    289    230
  9.5        230    298    430    459         461    427    297    227
  14.5       244    320    435    473         470    426    328    236
  19.4       262    336    434    481         486    431    335    262
  24.5       268    348    448    512         513    449    341    264
  rms [%]    13.15  13.15  0.84   13.15       0.87   0.77   13.15  13.15

to achieve the given measurement results, the authors [7] set the gap between the stabilization tube and the first profile to 2.0 mm. their results suggest that the larger the gap between the sprinkled profiles, the higher the reynolds number for the given sprinkle mode. this increase, however, did not prove to be steady, as in one case of a gap increase the next reynolds number was even lower, by approx. 1.0 %, which does not seem to be a correct value. the increasing gap causes the horizontal sprinkled diameter to increase as well, and therefore more liquid, i.e. a higher flow rate, is necessary at a larger gap in order to achieve the same sprinkle mode.
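the hysteresis contained in table 1 can be illustrated by a small lookup sketch; the band labels and the reduction of the table to a single spacing are our simplification, not part of [7].

```python
# transition reynolds numbers copied from the s-d = 9.5 mm row of table 1;
# each entry is the upper bound of a mode band (simplified labels)
INC = [(230, "droplet"), (298, "droplet/jet transition"),
       (430, "jet"), (459, "jet/sheet transition")]
DEC = [(227, "droplet"), (297, "droplet/jet transition"),
       (427, "jet"), (461, "jet/sheet transition")]

def sprinkle_mode(re, increasing=True):
    """classify the sprinkle mode for a film reynolds number re."""
    for bound, mode in (INC if increasing else DEC):
        if re < bound:
            return mode
    return "sheet"

print(sprinkle_mode(350))                      # -> jet
print(sprinkle_mode(460, increasing=True))     # -> sheet
print(sprinkle_mode(460, increasing=False))    # -> jet/sheet transition
```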
when comparing the reynolds numbers belonging to the smallest and the largest gaps between the profiles at individual modes, the increase ranges between approx. 23 % and 32 %. the difference in the reynolds numbers between the increasing and decreasing flow rates at individual states equals 1.1 % on average, with a standard deviation of 1.1 %.

3. measuring apparatus
for the purpose of examining the heat transfer and the sprinkle modes on sprinkled tube bundles, a test apparatus has been constructed; see the diagram in figure 2 on the right and the photograph of the apparatus on the left. falling film liquid of temperature (t1) and volumetric flow rate (v1), measured by the fm1 flomag 3000 induction flow meter, flows from a distribution tube positioned above the bundle, into which a liquid of temperature (t3) and flow rate (v2), measured by the fm2 flomag 3000 induction flow meter, flows in, and a liquid of temperature (t4) flows out into the collection flume positioned below the examined exchanger. the studied area (i.e., the sprinkled area) is one metre wide. there are also four thermocouples (t6–t9) in the loop measuring the course of the temperature change within the loop. below the exchanger, the falling film liquid is collected into a small flume situated right below the last tube, from which the liquid is conducted towards the thermocouple (t4), which measures its temperature. the liquid then freely flows into the collection flume, from which it is drawn by a pump into a drain (c). in case of excess hot water, it can be let off to a drain through a gate valve (gv6). the sprinkling loop is further fitted with a water meter and a rotameter for the purpose of visual inspection. all the thermocouples are insulated and unearthed t-type thermocouples. all examined liquid temperatures (t1–t9), the environment temperature (an uninsulated t-type thermocouple) and the flow rates v1 and v2 are continuously recorded by daq 56 converters and saved in a computer in the labview interface.

figure 2. test apparatus diagram.

apart from the effect the flow rate of the falling film liquid has on the examined heat transfer, the influence of the tube surface has also been studied. in figure 3, on the left, a clear difference between the smooth surface and the grooved surface with a rhombus pattern is visible. this surface has been created using the cold volumetric profiling and track wheeling technique (grooving). the outer tube diameter ranges from 12.3 to 12.4 mm; the calculations take into account the mean tube diameter, which is 12.0 mm. in figure 3, on the right, we can see an example of the difference between the smooth and the sandblasted surface.

figure 3. types of examined surfaces.

the calculation of the heat transfer coefficient in this paper is based on the thermal balance according to the law of conservation of energy (based on a simplified diagram in figure 2), newton's law of heat transfer and fourier's law of heat conduction. the evaluation of the measured data is based on the thermal balance between the operating liquid circulating inside the tubes and the sprinkling loop according to the law of conservation of energy. heat transfer is realized by convection, conduction and radiation; at lower temperatures the heat transferred by radiation is negligible, therefore it is excluded from further calculations.
the calculation of the studied heat transfer coefficient is based on newton's heat transfer law and fourier's heat conduction law, which have been used to form the relation

$$\frac{1}{\alpha_o} = 2 \pi r_o \left( \frac{1}{k_s} - \frac{1}{2 \pi \alpha_i r_i} - \frac{1}{2 \pi \lambda_s} \ln \frac{r_o}{r_i} \right), \qquad (1)$$

where α_o [w m−2 k−1] is the heat transfer coefficient at the sprinkled tubes' surface; α_i [w m−2 k−1] is the heat transfer coefficient at the inner side of a tube, set for a fully developed turbulent flow [12, 13]; r_o and r_i [m] are the outer and inner tube radii; λ_s [w m−1 k−1] is the thermal conductivity; k_s [w m−1 k−1] is the heat admittance based on the above-mentioned laws governing heat transfer, which is calculated from the heat balance of the heating side of the loop, so the following must hold:

$$Q'_s = k_s L \Delta t_{ln} = \dot{m}_{34}\, c_p\!\left( p; \frac{t_3 + t_4}{2} \right) (t_3 - t_4), \qquad (2)$$

where ṁ_34 [kg s−1] is the mass flow of the heating water; c_p [j kg−1 k−1] is the specific heat capacity of water at constant pressure related to the mean temperature inside the loop; l [m] is the total length of the bundle; Δt_ln [k] is a logarithmic temperature gradient, where a counter-current exchanger was considered.

4. experiment results
the experiments described in this paper involved the testing of two temperature differences: the range of 15–40 °c and the range of 15–45 °c, where the falling film liquid's temperature t1 at the distribution tube outlet was approximately 15 °c and the temperature t3 of the heating liquid was 40 °c or 45 °c at the inlet of an exchanger which consisted of ten tubes positioned horizontally one above another. four different tube bundle pitches were studied: 15 mm (hereinafter marked as a1), 20 mm (b1), 25 mm (c1) and 30 mm (a2).

table 2. summary table of measured points; n is the number of measurement points for each line of the table:

  pitch  n     t1 [°c]       t3 [°c]       v1 [l/min]  v2 [l/min]   error [%]
  smooth tubes
  a1     772   15.0 ± 0.52   44.9 ± 0.55   2.1–11.5    7.20 ± 0.05  3.2 ± 3.4
  a1     2625  15.0 ± 0.44   40.1 ± 0.55   1.7–13.7    7.22 ± 0.05  3.0 ± 2.6
  a2     902   14.9 ± 0.42   45.1 ± 0.53   2.2–9.0     7.22 ± 0.05  3.5 ± 2.9
  a2     690   15.4 ± 0.42   40.1 ± 0.49   3.0–12.2    7.21 ± 0.06  5.2 ± 4.1
  b1     927   14.8 ± 0.60   45.3 ± 0.48   2.1–12.7    7.21 ± 0.06  3.0 ± 3.1
  b1     1072  14.9 ± 0.48   40.1 ± 0.58   1.8–12.9    7.21 ± 0.05  2.6 ± 2.8
  c1     574   15.1 ± 0.44   45.0 ± 0.59   3.8–11.0    7.21 ± 0.06  6.1 ± 4.4
  c1     777   15.2 ± 0.44   40.1 ± 0.65   3.8–13.1    7.21 ± 0.04  5.5 ± 3.4
  groove-surface tubes
  a1     848   15.0 ± 0.50   40.1 ± 0.62   3.0–11.8    7.21 ± 0.06  1.9 ± 1.6
  a1     568   14.9 ± 0.50   45.2 ± 0.42   3.8–14.9    7.23 ± 0.04  1.2 ± 1.8
  a2     1044  15.1 ± 0.44   40.0 ± 0.49   2.3–13.6    7.21 ± 0.04  4.9 ± 4.0
  a2     961   14.7 ± 0.41   45.2 ± 0.47   1.6–11.8    7.21 ± 0.05  7.2 ± 5.9
  b1     932   15.2 ± 0.56   40.2 ± 0.52   2.1–12.8    7.24 ± 0.13  3.1 ± 3.6
  b1     1043  14.8 ± 0.46   45.1 ± 0.56   2.1–12.4    7.23 ± 0.04  2.3 ± 2.3
  c1     1004  15.1 ± 0.43   40.6 ± 0.36   2.2–13.4    7.23 ± 0.05  5.7 ± 4.2
  c1     553   15.0 ± 0.54   45.2 ± 0.59   2.0–12.7    7.23 ± 0.05  4.0 ± 2.8
  sandblasted tubes
  a1     573   14.7 ± 0.36   39.7 ± 0.47   3.1–11.7    7.22 ± 0.04  2.8 ± 2.7
  a1     1192  15.0 ± 0.47   44.8 ± 0.59   2.8–14.1    7.21 ± 0.06  1.9 ± 1.5
  a2     788   15.2 ± 0.31   39.9 ± 0.51   1.9–12.2    7.21 ± 0.06  4.2 ± 3.5
  a2     252   15.1 ± 0.29   45.0 ± 0.32   1.2–11.7    7.20 ± 0.05  5.1 ± 4.8
  b1     1059  15.0 ± 0.35   40.4 ± 0.34   1.4–12.5    7.20 ± 0.05  3.3 ± 2.9
  c1     834   15.3 ± 0.45   40.3 ± 0.36   2.6–12.7    7.23 ± 0.05  4.9 ± 4.2
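before turning to the measured trends in table 2, a minimal sketch of the evaluation chain of eqs. (1)–(2) is given below; all numerical values are hypothetical, and in the real evaluation α_i and the water properties come from the turbulent-flow correlation [12, 13] and from steam tables.

```python
import math

def alpha_outer(q_dot, dt_ln, length, r_o, r_i, alpha_i, lam_s):
    """heat transfer coefficient on the sprinkled surface, eq. (1)."""
    k_s = q_dot / (length * dt_ln)            # heat admittance from eq. (2)
    inv = 2 * math.pi * r_o * (1 / k_s
                               - 1 / (2 * math.pi * alpha_i * r_i)
                               - math.log(r_o / r_i) / (2 * math.pi * lam_s))
    return 1 / inv

m34 = 0.12                                    # heating water flow [kg/s]
cp = 4180.0                                   # [J/(kg K)]
q_dot = m34 * cp * (45.0 - 42.0)              # eq. (2), t3 - t4 = 3 K [W]
a_o = alpha_outer(q_dot, dt_ln=10.0, length=10.0, r_o=0.006, r_i=0.005,
                  alpha_i=9000.0, lam_s=380.0)
print(f"alpha_o = {a_o:.0f} W/(m2 K)")        # hypothetical inputs, so the
                                              # magnitude is only indicative
```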
the summary in table 2 also shows, besides the above-mentioned temperature values, the numbers of the measured points, the range of the studied falling film liquid flow rates, the average sprinkled liquid flow rate and the average error of the measured points.

in the first case, the studied heat transfer coefficient at the surface of the sprinkled tube bundle was tested for the exchanger consisting of smooth tubes. the measured results for the thermal gradients are shown in figure 4, on the left for the 15–40 range and on the right for the 15–45 range. both thermal gradients feature a linear increase in the heat transfer coefficient up to a falling film liquid flow rate of about 5.0 litres per minute, and the convenience of particular pitch types cannot be assessed due to the measurement uncertainty. beyond this flow rate value, the heat transfer coefficient starts to stabilize at the a2 pitch for both thermal gradients. the heat transfer coefficient at the other pitches keeps increasing, although the rise is not so sharp. at a flow rate of approx. 6.0 litres per minute, the heat transfer coefficient stabilizes at the b1 and c1 pitches, while the c1 pitch is more convenient at both thermal gradients. the coefficient keeps increasing at the a1 pitch, and it stabilizes at the average value of approx. 7.0 kw m−2 k−1, with the maximum tested flow rate reaching 11.2 litres per minute.

figure 4. dependence of heat transfer coefficient on exchanger consisting of smooth tubes.

in the second case, the exchanger consisted of tubes with a sandblasted surface. the resulting dependences of the heat transfer coefficient on the tube bundle surface are evident in figure 5, on the left for the 15–40 thermal gradient and on the right for the 15–45 thermal gradient. for the latter, only the a1 and a2 pitches have been measured. up to a flow rate of approx. 3.5 litres per minute, the heat transfer coefficient at all measured pitches increases within the same trend. the convenience of individual pitches cannot be clearly determined for this type of surface, with the exception of two areas. the first is the b1 pitch and the thermal gradient of 15–40, where the heat transfer coefficient in the flow rate range of 3.5–6.5 litres per minute is higher by approx. 800 w m−2 k−1 compared to the rest; on reaching 6.5 litres per minute, the heat transfer coefficient stabilizes at the value of approx. 5.0 kw m−2 k−1. the second significant area is located at the a1 pitch and the thermal gradient of 15–45, with a flow rate higher than 11.5 litres per minute. within this area, the coefficient stabilizes at the average value of approx. 6.0 kw m−2 k−1, which is higher by approx. 1.0 kw m−2 k−1 than the coefficient at the other pitches.

figure 5. dependence of heat transfer coefficient on exchanger consisting of sandblasted tubes.

in the third case, the exchanger consisting of groove-surface tubes has been tested. the resulting dependences of the heat transfer coefficient on the tube bundle surface are shown in figure 6, on the left for the thermal gradient 15–40 and on the right for the thermal gradient 15–45. the results of the tube bundle with a grooved surface again do not clearly imply the most convenient pitch, with the exception of two areas. the first is the area at the a2 pitch and the thermal gradient 15–45, which reaches a maximum of approx.
5.0 kw m−2 k−1 in the flow rate range of 5.0 to 10.0 litres per minute, which is at this particular point almost 2.0 kw m−2 k−1 less in comparison with the rest. the second significant area is at the a1 pitch and both thermal gradients, where the average maximum values of the heat transfer coefficient reach almost 8.0 kw m−2 k−1, which is almost 2.0 kw m−2 k−1 more than at the highest measured parameters of the other pitches.

figure 6. dependence of heat transfer coefficient on exchanger consisting of groove-surface tubes.

5. conclusions
this paper presents the primary measured values of the heat transfer coefficient at the surface of a sprinkled tube bundle consisting of ten tubes positioned horizontally one above another, where the tube pitches have been altered and three various tube surfaces have been tested at two thermal gradients. the primary processing clearly implies the convenience of the groove-surface tubes. when compared to the smooth surface, the increasing trend up to the value of approx. 3.0 litres per minute is identical; however, a further flow rate increase makes the coefficient at the groove-surface tubes rise more sharply, and the coefficient between some pitches reaches a difference of almost 4.0 kw m−2 k−1. the comparison of a tube bundle with smooth and sandblasted tubes at the tested thermal gradients surprisingly shows that the heat transfer coefficient at the sandblasted tubes is worse, with the maximum value reaching only about 5.0 kw m−2 k−1.

the introduction of this paper mentions the published boundaries of the reynolds number for the sprinkle modes. for the smallest spacing (a1), fairly good agreement was achieved, but with a growing gap between the tubes there are significant differences, which may be caused by the structurally different method of water distribution onto the tube bundle. currently, we evaluate the modes very subjectively, and therefore the results are not compared numerically. our further research should expand these measurements by the effect of low pressure in the tube bundle environment and also by the influence of the exchanger length (the number of tubes comprising the exchanger) on the heat transfer coefficient at the surface of the sprinkled tubes; based on these measurements, criterial equations applicable to tube bundle design should be created.

acknowledgements
this work is an output of research and scientific activities of netme centre plus (lo1202) with financial means from the ministry of education, youth and sports under the "national sustainability programme i".

references
[1] armbruster, r. and j. mitrovic. evaporative cooling of a falling water film on horizontal tubes. experimental thermal and fluid science, 1998, vol. 18, issue 3, pp. 183–194. issn 0894-1777.
[2] gonzalez, g. j. m., j. m. s. jabardo and w. f. stoecker. falling film ammonia evaporators. air conditioning and refrigeration center, college of engineering, university of illinois at urbana-champaign, 1992, pp. 60.
[3] tang, j., z. lu, b. yu-chi and s. lin. droplet spacing of falling film flow on horizontal tube bundles. in: proceedings of the 18th international congress of refrigeration. montreal, quebec, canada: international institute of refrigeration, paris, france, 1991, pp. 474–478. isbn 2-9802798-0-3.
[4] yung, d., j. j. lorenz and e. n. ganić. vapor/liquid interaction and entrainment in falling film evaporators. journal of heat transfer, 1980, vol. 102, issue 1, pp. 20–25.
[5] hu, x. and a. m. jacobi.
the intertube falling film: part 1, flow characteristics, mode transitions, and hysteresis. journal of heat transfer, 1996, vol. 118, issue 3, pp. 616–624. issn 0022-1481. doi:10.1115/1.2822676.
[6] jafar, thorpe and turan. computational fluid dynamics: seventh international conference on cfd in the minerals and process industries. melbourne, vic.: csiro, 2009. isbn 978-064-3098-251.
[7] wang, xiaofei, p. s. hrnjak, s. elbel, a. m. jacobi and maogang he. flow patterns and mode transitions for falling films on flat tubes. journal of heat transfer, 2012, vol. 134, issue 2, pp. 1–8. doi:10.1115/1.4005095.
[8] parken, w. h., l. s. fletcher, v. sernas and j. c. han. heat transfer through falling film evaporation and boiling on horizontal tubes. journal of heat transfer, 1990, vol. 112, issue 3, pp. 744–750. doi:10.1115/1.2910449.
[9] owens, w. l. correlation of thin film evaporation heat transfer coefficients for horizontal tubes. proceedings, fifth ocean thermal energy conversion conference, miami beach, florida, 1978, pp. 71–89.
[10] sernas, v. heat transfer correlation for subcooled water films on horizontal tubes. journal of heat transfer, 1979, vol. 101, issue 1, pp. 176–178.
[11] x-eng. x steam tables for ms excel [computer file .xls]. ver. 2.6. freeware. http://www.x-eng.com/xsteam_excel.htm [cited 2010-11-30].
[12] incropera, frank p., dewitt, david p., bergman, t., lavine, a. fundamentals of heat and mass transfer, 6th edition, 2007, 1024 pages. isbn 978-0-471-45728-2.
[13] jícha, miroslav (2001). přenos tepla a látky. 1st edition. brno: cerm.

acta polytechnica vol. 42 no. 1/2002

large-scale file system design and architecture

v. dynda, p. rydlo

this paper deals with design issues of a global file system, aiming to provide transparent data availability, security against loss and disclosure, and support for mobile and disconnected clients. first, the paper surveys general challenges and requirements for large-scale file systems, and then the design of particular elementary parts of the proposed file system is presented. this includes the design of the raw system architecture, the design of dynamic file replication with appropriate data consistency, file location and data security. our proposed system is called gaston, and will be referred to further in the text under this name or its abbreviation gfs (gaston file system).

keywords: file system, replication schema, data protection, consistency, data locating.

1 introduction
the amount of data stored in digital storage has been increasing at a very rapid rate. since much of this data is of critical importance for its owners, it is necessary to ensure transparent data availability and to secure it against loss or disclosure. till now, data backup has in many cases been performed manually, using various backup data storage types; in extreme cases the data has even been transformed into printed form. unfortunately, these methods of data backup provide neither satisfactory protection against loss nor greater availability, and such a solution is clearly unsatisfactory. the probability of simultaneous hard drive failure and backup tape loss or damage remains high for many applications. moreover, such backup is far from transparent for users. the cost of this approach is also much too high.

the gaston file system is an experimental storage system that provides users with the standard services of ordinary file systems. its major advantages are high data availability, support for mobile and disconnected users, and the ability to protect data from loss, damage or disclosure. one of the primary purposes of the system is to perform under many circumstances similarly to existing lan-based networked storage systems. these properties are to be achieved by involving thousands of computers spread across a large geographical area (continents or even the whole world) and connected by a computer network (internet).
high data availability and protection are managed by massive replication, and data protection in the generally insecure environment of the global network is achieved by suitable cryptographic mechanisms. the gfs is still under development and is divided into several mutually connected areas: the system architecture, data locating, shared data consistency, replication, caching, data security, authentication & authorization, request distribution, and naming schema & structure. it is necessary to bind these research fields together into a single complex and consistent system specification useful for subsequent implementation and testing.

the essential architecture of the system is based on clients who deposit their data into the entire system and manipulate it, replica-managers managing stored (and replicated) data, and data servers that store data. each replica-manager controls one or possibly more data servers for better performance and fault-tolerance. clients access the data transparently using a specified interface. since the shared storage space is based on mutual reciprocity, a client, a replica-manager and a data server can be installed on one machine, thus providing the system with shared resources in return for using its services.

an important means for protecting data in gfs is massive replication. data is allowed to be replicated anytime/anywhere (even a copy in a client cache is considered to be a replica). to provide maximum performance, data location is of great importance. this is supported by separating data from its physical location (such a class of data is called nomadic data) and by replicas that can flow freely. the number and location of the replicas can vary in time, based both on the client specification and on the actual usage patterns and network characteristics. these propositions are important for mobile client support, to deal with temporary network partitioning, and especially to minimize communication costs.

one of the most important tasks in the field of replication is to ensure the consistency of shared files. as the main aim of gfs is to achieve high availability, it is necessary to implement an optimistic replica control protocol providing data availability even in a partitioned network. the protocol used in gfs is based on version vectors, transaction semantics and automatic version conflict resolution. conflict resolution is performed taking into account the data type, or using a supplied application-specific merge procedure. in order to ensure consistency it is also necessary to order update requests, which, unfortunately, cannot be performed globally on such a high number of replicas due to unacceptable time consumption. hence, ordering is specified by means of smaller variable groups of selected primary replica-managers, which eventually make the information available to the other replica-managers. because of the primary replica-manager group variability, neither the system scalability nor the fault-tolerance is negatively affected.

to improve gfs performance and decrease network usage costs, the system uses caching of data. in gfs, whole files are cached to support mobile clients, who can be temporarily disconnected and therefore can use data even in this mode of operation. since caching of data in distributed environments causes data consistency problems, cached data are seen as ordinary replicas, and for this reason the replica management described above is also used for cached data.
one of the requirements on gfs is data security. as the system is open to many different clients, and as it uses the internet as its communication subsystem, it is exposed to many security threats. the system must protect the data not only when it is being exchanged, but also when it is stored at data servers. for these purposes, well-known and widely accepted cryptographic algorithms are used. for change detection of sensitive or non-encrypted data, digital signatures are used. like almost every distributed system, gfs performs authentication of clients to check the user's identity, which is also based on cryptography. however, data is not only protected by cryptographic means: every client who wants to perform any operation on the data must be authorized to perform such an operation. in this case, access control lists are used, as in many other modern operating systems.

since gfs is intended to provide users with a transparent file system with sharing capability, a global name space is used. this means that each user uses the same file name to access another's file. moreover, the names of files are independent of the number of replicas and their locality, hence files in the system are transparently presented to the user. another important benefit is that there is no need to remember file locations, because all replicas of one particular file are equivalent.

the design of the above-mentioned challenging properties of gfs is further described below in several chapters dealing with the most important design issues.

2 requirements to dfs
the proper design of a distributed file system should reflect several basic characteristics of the system, which arise from the essential features of a distributed environment and the typical data access pattern in dfss. these include the following:

• read/write access. a distributed file system usually stores data accessible in read and write mode (unlike www and other similar distributed services storing just read-only data objects). furthermore, these data can be shared by more than one user. the system must efficiently support strong consistency models for shared objects that are often updated, which causes higher communication costs. the replica management should minimize these costs.

• dynamic access pattern. the client usage patterns may change quickly and frequently. the replica management must maintain system stability while responding reasonably to these changes.

• scalability. the environment of a large-scale distributed file system, as wide as the internet, contains many individual machines. the protocols and algorithms used must scale well with the number of replicas for each object and the total number of machines in the system.
any centralized authority or information source would quickly become a performance bottleneck.

• federal structure. the replica managers of a large-scale distributed file system may reside in disjoint administrative domains, so the system must allow replica managers to make autonomous decisions. no replica manager can force others to perform any activity that is against their interest, and the protocols that are used must take into account the possibility of cooperation refusal by any party.

• unreliable environment. it is not possible to eliminate all errors of the individual components (communication lines, replica managers) of large-scale systems, so at least some replicas may not be available at any moment, and the network may even become temporarily partitioned. also in these cases the data must be as available as possible.

• insecure environment. the replica managers, clients and network infrastructure cannot be trusted. to prevent the replica management system from becoming a security risk, it is necessary for all decisions to consider only authenticated information. the system must also prevent a small number of compromised replica managers from gaining control of the system.

3 system architecture
the gaston file system consists of three main architecture components of the highest importance: replica managers, clients and data servers.

• replica managers manage stored (and replicated) data and run data consistency and update protocols in order to be able to provide the correct data.

• clients are the basic elements in the system. they deposit and manipulate the data in the system. if a client caches data, it is at the same time a replica manager, since cached data are considered ordinary replicas.

• data servers physically store the data.

the elementary architecture is presented in fig. 1.

fig. 1: elementary system architecture

the system supports three types of users: ordinary users, dedicated servers and diskless terminals.

• the ordinary users of the system are caching clients (thus also replica managers), who use the system and in turn provide the system with their resources. the data of other users can be replicated at these clients.
each partition maintains a log file, from which read-sets, write-sets and serialization order can be deduced. the sequence of transactions in partition k i is denoted as t t t1 i 2 i n i, , ,� . the nodes of the precedence graph represent transactions; the edges represent interactions between transactions. the first step in construction of the graph is to model interactions in each individual partition. two types of edges are introduced: 1. data dependency edges t tj i k i � , if � � � �writeset readsett tji ki� � 0, j < k. 2. precedence edges t tj i k i � , if � � � �readset writesett tji ki� � 0, j < k. both types of edges demonstrate that the transaction processing order has influenced the computation result. the graph constructed in this way must be acyclic, since the orientation of an edge is always consistent with the serialization order, and within each partition serializability is always provided. to complete the precedence graph (at reconnection), a conflict between partitions must be represented. the following type of edge is defined for this purpose: interference edges t tj i k� l , i � l, if � � � �readset writesett tji k� �l 0. an interference edge represents the dependency between transaction t j i that read the data in one partition and transaction tk l that wrote the same data in another partition. this edge between t j i and tk l indicates that t j i logically precedes tk l since the value read by t j i could not be influenced by any update transaction in another partition. thus, the interference edge signals read/write conflict; a pair of these edges indicates a write/write conflict. if the resulting precedence graph is acyclic, then transactions from partitions can be represented by a single serial history that can be obtained by topologically sorting the graph. if the graph contains the cycles, then the computation is not serializable and detected inconsistencies are resolved by rolling back (undoing) transactions and their dependent transactions (connected by dependency edges) in the reverse order of execution until the resulting subgraph is acyclic. to merge partitions, the final value of each data object is then forwarded to other partitions. if the transaction cannot be rolled back, some compensating actions need to be performed to nullify the effect of that transaction. in order to address the very high number of replicas, a kind of epidemic algorithm is used for update propagation among replicas that make the tentative commits of the transaction until the global stable commit is announced. 5 replication data replication is an important means for achieving high data availability, fault tolerance and thus better data protection. to provide maximum performance, the data locality is of great importance. this is supported by separating data from its physical location (this kind of data is called nomadic data) and by replicas that can flow freely. the number and location of replicas can vary in time based both on client specification and actual usage patterns and network characteristics. the enhanced adr algorithm [2] is used for dynamic replication scheme management. the adr algorithm constitutes the replication scheme that forms the tree subnetwork. the size and the location of this tree changes dynamically depending on the request locality and frequency. all requests for an object from clients are routed to the closest replica manager in the replication scheme (rs) along the shortest path. 
the following types of replica managers are defined: � rs – neighbor – a replica manager that belongs to the replication scheme and has a neighbor that does not belong to rs, � rs – fringe – a leaf replica manager of the subgraph induced by rs. the replication scheme is modified through the following tests that are executed at the end of each predefined time period t : � the expansion test. the test is executed by all rs – neighbor replica managers. each rs – neighbor replica manager rmi compares for each of its neighbors rm rsj � values of � �rcnt xr j i and � �h rcnt x a k j 1 � � wk i rmk i , � which is the total number of write requests received in the last time period t from rmi itself or from a neighbor other than rmj (a i denotes a neighbor set of replica manager rmi). if � �rcnt x hr j i � 1, then replica manager rmi sends to rmj a copy of the file with an indication to save it. thus rmj joins rs and the replication scheme expands. the 8 acta polytechnica vol. 42 no. 1/2002 expansion test is successful when the (if) condition is satisfied for at least one neighbor of rmi. � the contraction test. the test is executed by all rs – fringe replica managers. each rs – fringe replica manager rmi compares for its only neighbor rm rsj the values of � �rcnt xw j i and � �h rcnt x a k j 2 � � rk i rmk i – the total number of read requests received in the last time period t from rmi itself or from a neighbor other than rmj. if � �rcnt x hw j i � 2, then rmi requests permission from rmj to leave a copy of the file. thus replication scheme rs shrinks. � the switch test. the test is executed by replica manager rmi if rs = {rmi} and the expansion test has failed. for each neighbor rmj it compares the values of � �rcnt xj i and � �h rcnt x a k j 3 � � � k i rm rmk i i – the total number of all requests received in the last time period t from all replica managers apart from rmj. if � �rcnt x hj i � 3, then rmi sends a copy of the file to rmj with an indication that rmj becomes the new singleton replica manager in rs, and rmi discards its copy. thus the replica migrates towards the center of the request activity. at the end of each time period t all rs – neighbor replica managers execute the expansion test and all rs fringe replica managers execute the contraction test. each replica manager that is both rs – neighbor and rs – fringe first executes the expansion test, and if it fails it executes the contraction test. a replica manager that does not have a neighbor in rs first executes the expansion test and if it fails it executes the switch test. an example of the dynamic replication scheme is presented in fig. 2. the initial replication scheme rs ( t0) = {rm4}. issued requests (in each time period t): rm1 – rm6, rm8: 4 read requests, 2 write requests rm7: 20 read requests, 12 write requests. t1: rm4 (as an rs – neighbor of rm3 and rm5) executes the expansion test. since the number of reads requested by rm3 is 12 and the number of writes requested by rm4 and rm5 is 20, replica manager rm3 does not enter the replication scheme. the number of reads requested by rm5 is 32 and the number of writes requested by rm4 and rm3 is 8. thus, rs ( t1) = {rm4, rm5}. t2: first, rm4 performs the expansion test that fails, and then it executes the contraction test. it is successful since rm4 receives 18 write requests from rm5 and 16 read requests from rm4 and rm3. 
fig. 2: example of a dynamic replication scheme

the initial replication scheme is rs(t0) = {rm4}. issued requests (in each time period t): rm1–rm6, rm8: 4 read requests, 2 write requests; rm7: 20 read requests, 12 write requests.

t1: rm4 (as an rs-neighbor of rm3 and rm5) executes the expansion test. since the number of reads requested through rm3 is 12 and the number of writes requested by rm4 and rm5 is 20, replica manager rm3 does not enter the replication scheme. the number of reads requested through rm5 is 32 and the number of writes requested by rm4 and rm3 is 8; thus rs(t1) = {rm4, rm5}.

t2: first, rm4 performs the expansion test, which fails, and then it executes the contraction test. it is successful, since rm4 receives 18 write requests from rm5 and 16 read requests from rm4 and rm3. at the same time, rm5 performs the expansion test and successfully includes rm7 in the replication scheme (the number of reads from rm7 is 20 and the number of write requests from the other replica managers is 14). the resulting scheme is rs(t2) = {rm5, rm7}.

t3: rs stabilizes at rs(t3) = {rm5, rm7} and it will not change further.

the extension of the adr algorithm to an arbitrary network topology is relatively straightforward, using the network spanning tree. the modification that will be used in the gaston system implementation concerns the connectivity of the replication scheme, and is able to create pseudoreplicas to address the problem of extremely distant clients.

6 file addressing
to locate data in the large system we use two distinct algorithms. the first one is a probabilistic algorithm that uses modified bloom filters. it can be characterized as a fast algorithm that adheres to the principle of locality. every node in the network keeps a filter (union) of the objects that are accessible at this node, and every edge also keeps filters of the objects that are accessible using this edge. these filters (edge filters) define the objects that are accessible at different distances (a filter for distance one, two, etc.), thus forming a kind of routing table. an example of the algorithm is shown in fig. 3. here, a document characterized as (1,0,1,1,0) stored at node n4 is being accessed from node n1. the filters (the filter of the document and the filters at nodes and edges) are compared for a match: this shows that the document is not stored at node n1, but it can be reached at a node at distance 2. at node n2, a comparison of the filters shows that the document is not stored at the node, but can be reached at distance 1 in the direction of node n4. at node n4 we have a perfect match, so the document can be reached. however, we must confirm this result with a complete scan of the documents stored at node n4, due to the simple fact that the filters are unions.

fig. 3: example of probabilistic file addressing
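the filter comparison can be sketched as follows; the filter width, the two-hash scheme and the object names are hypothetical simplifications of the modified bloom filters described above.

```python
import hashlib

M = 64                                   # filter width in bits (illustrative)

def bloom(names):
    """union filter over object names, two hash positions per name."""
    bits = 0
    for name in names:
        h = hashlib.sha256(name.encode()).digest()
        bits |= 1 << (h[0] % M)
        bits |= 1 << (h[1] % M)
    return bits

def may_contain(filter_bits, name):
    """true if every bit of the object's filter is set in filter_bits."""
    f = bloom([name])
    return f & filter_bits == f

node_filter = bloom(["doc-a", "doc-b"])  # objects stored at this node
edge_filter_dist1 = bloom(["doc-c"])     # objects reachable one hop away
print(may_contain(node_filter, "doc-a"))         # True
print(may_contain(node_filter, "doc-c"))         # False here, but note that
                                                 # false positives are possible
print(may_contain(edge_filter_dist1, "doc-c"))   # True -> forward the request
```

since the filters are unions, a positive answer must always be confirmed by a scan at the target node, exactly as the text above notes for node n4.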
data can be stored at potentially dangerous servers, which may not be guarded as necessary. using them just as storage is sufficient from the view of the resistance against “physical” attacks. we propose to use cryptography to protect data content from restricted users (those who are not allowed to know the content, not only those who are not allowed to access the system). data is therefore encrypted at the client nodes, which ensures that only the client knows the data and controls who can read the content. this approach not only provides good security, but also gives the system the property of good scalability. the system is designed to adapt to any appropriate cipher algorithm for this purpose. of course, the system is open to several other improvements on the server and/or communication side of the system, where the underlying layers can use their own protection mechanisms (ipsec in the communication layer, hw encryption of data at servers etc.) protection of new data is straightforward – data is encrypted by a user and then sent to the replica manager. protection of changed data is a more complex problem, which is solved by a modified mechanism presented in [4]. a changed block of data is encrypted and attached to the end of the file, but the old block is also preserved to decrypt all data that consequently follow this changed block (we proposed to use the cbc mode of a cipher algorithm). this enables all the data of the file not to be reciphered, so that performance is not severely affected. as this solution preserves old blocks of data, the old versions of the file are also saved. the update situation is shown in fig. 5. 10 acta polytechnica vol. 42 no. 1/2002 fig. 4: example of a deterministic file addressing mechanism fig. 5: update of a file a distributed file system is accessed by many users, so users’ data must somehow be protected against access by other users. on the other hand, some people may want to publish their data to be read by others so some authorization to the data has to be introduced. in gaston we use the following general principle [5]. the file identifier is extended with an encrypted random number and effective rights to the file for a given user. unencrypted rights are attached for a user to know what rights he has been assigned at the moment. as he does not know cryptographic key to decrypt the encrypted part, he is unable to change the rights. the encryption key is known only to the issuer of the identifier and may be stored in a special part of the file system. a random number is used to prevent guessing of file identifiers. changing the rights that are attached to the identifier has almost no effect on the resulting access. all user rights are stored in access control lists (acl) that are associated with every file. 8 conclusion this paper has introduced the general architecture and key design issues of a large-scale file system. our proposal of the gaston file system is designed to provide high data availability, security and user mobility. these goals are achieved particularly by deploying file replication with reasonable data consistency and fast file location and also by proper cipher and authorization techniques that allow unencrypted data to be completely hidden from the replica managers. our design is based on the following basic characteristics, reflecting the essential requirements for dfss: � file replication providing high data availability and support for mobile and disconnected users. 
8 conclusion
this paper has introduced the general architecture and key design issues of a large-scale file system. our proposal of the gaston file system is designed to provide high data availability, security and user mobility. these goals are achieved particularly by deploying file replication with reasonable data consistency and fast file location, and also by proper cipher and authorization techniques that allow unencrypted data to be completely hidden from the replica managers. our design is based on the following basic characteristics, reflecting the essential requirements for dfss:

• file replication providing high data availability and support for mobile and disconnected users.

• transactional processing solving multiple read/write accesses and allowing achievement of data consistency corresponding to the requirements of dfss.

• a dynamic replication scheme reacting to a frequently changing access pattern.

• cryptography for securing user data, and support for any appropriate cipher algorithm to protect data.

• advanced authorization which helps in dealing with access rights.

future work on the gaston file system project will include advanced remote data modification techniques at distrusted replica managers, and also improving the replica management in the field of load-dependent request distribution.

symbols
a_i              set of neighbor nodes of node n_i
acl              access control list
adr              adaptive data replication
cbc              cipher block chaining
ds               data server
dfs              distributed file system
gfs              gaston file system
k_i              graph partition i
n_i              node i
rm_i             replica manager i
rs(t_i)          replication scheme at time t_i
$\overline{RS}(t_i)$   graph complement of the replication scheme rs at time t_i
$\mathrm{rcnt}_r^j(x_i)$   read request count to x_i initiated by node n_j
$\mathrm{rcnt}_w^j(x_i)$   write request count to x_i initiated by node n_j
readset(t_j^i)   set of data read by transaction t_j^i
t_i              time i
t_j^i            transaction with sequential number j performed in partition k_i
writeset(t_j^i)  set of data written by transaction t_j^i
x_i              data item stored at node n_i

references
[1] davidson, s. b.: an optimistic protocol for partitioned distributed database systems. ph.d. thesis, department of electrical engineering and computer science, princeton university, princeton, nj, 1982.
[2] wolfson, o., jajodia, s., huang, y.: an adaptive replication algorithm. acm trans. on database systems, 1997, vol. 22, no. 2, pp. 255–314.
[3] plaxton, c., rajaraman, r., richa, a.: accessing nearby copies of replicated objects in a distributed environment. in proc. of the 9th acm symp. on parallel algorithms and architectures, 1997, pp. 311–320.
[4] kubiatowicz, j., bindel, d., chen, y., czerwinski, s., eaton, p., geels, d., gummadi, r., rhea, s., weatherspoon, h., weimer, w., wells, c., zhao, b.: oceanstore: an architecture for global-scale persistent storage. in proc. of the 9th int'l conf. on architectural support for programming languages and operating systems, 2000.
[5] coulouris, g., dollimore, j., kindberg, t.: distributed systems: concepts and design. addison-wesley pub. co., 1994, isbn 0-20162-433-8.

ing. vladimír dynda, e-mail: xdynda@fel.cvut.cz
ing. pavel rydlo, e-mail: xrydlo@fel.cvut.cz
phone/fax: +420 2 2492 3325
department of computer science and engineering
czech technical university in prague, faculty of electrical engineering
karlovo náměstí 13, 121 35 praha 2, czech republic

acta polytechnica
doi:10.14311/ap.2016.56.0395
acta polytechnica 56(5):395–401, 2016
© czech technical university in prague, 2016
available online at http://ojs.cvut.cz/ojs/index.php/ap

analysis of intensively blasted electric arc burning in the arc heater's anode channel

j. senk∗, i. jakubova, i. laznickova
faculty of electrical engineering and communication, but, technicka 12, 616 00 brno, czech republic
∗ corresponding author: senk@feec.vutbr.cz

abstract. the paper deals with the description of the intensively blasted electric arc burning in ar in the anode channel of the arc heater operated under various conditions. directly measured experimental data (current, voltage, gas flow rate, power loss) characterize the operation of the device as a whole, but important parameters describing the electric arc inside (its geometry, temperature and voltage distribution) must be revealed using a mathematical model of the arc. an updated version of the model is introduced and used for the analysis of two exemplary sets of measured data. the results are given in figures and commented.

keywords: electric arc; arc heater; argon; modelling.

1. introduction
a modular-type arc heater with the electric arc burning in a cylindrical anode channel, cooled and stabilized by gas flow, was designed by the authors and experimentally operated under various operational conditions [1]. various technological applications have been tested (e.g. decomposition of stable harmful substances, diamond deposition), and numerous data have been collected, which are useful e.g. in the design, operation and usage of similar devices. the behaviour of the electric arc in the anode channel is of key importance for the operation of the device, but unfortunately the substantial parameters of the arc cannot be observed and measured directly: the inner space of the device is inaccessible, and the basic properties of the arc are hidden in integral measured data. that is why a mathematical model of the arc has been designed which makes it possible to determine the arc radius ra, temperature ta and voltage ua distribution from the experimentally obtained integral data, such as the total voltage u and current i, the gas flow rate qm and the power losses pl of the individual segments of the arc heater determined by calorimetry. during a long period of experiments and modelling, rather extensive experience has been obtained and some previous assumptions have been modified. the basic laws and presumptions of the model [2] remain unchanged and are only briefly summarized in § 2, which focuses mainly on the new approaches in the new version of the model. examples of computed dependences are given and discussed in § 3. finally, § 4 concludes the paper.

in the paper, subscripts are used to define the part of the device or the arc itself. the arc heater is divided into several separately cooled segments, which are chosen with respect to their expected power loss load and the mechanisms of energy exchange between the arc and the segment in question. the following notation is used: in the downstream direction, subscript "cat" stands for the cathode, subscript "in" for the input part of the anode channel, subscript "ch" for the (anode) channel, and subscript "A" stands for the anode itself. subscript "AS" indicates the anode spot, the interface between the arc's root and the grounded anode. finally, subscript "a" means the arc.

2. mathematical model
the core part of the model describes the behaviour of the intensively blasted electric arc burning in the cylindrical anode channel of the arc heater. the model is based on the mass and energy conservation laws and ohm's law, and together with the measured data it also uses the transport and thermodynamic properties of the working gas [3]. to make the text easy to read, the main assumptions are summarized here: the arc plasma is supposed to be in local thermodynamic equilibrium, with its kinetic energy small compared to its enthalpy. only the radial component of the radiative energy flow and only the axial component of the enthalpy flow are taken into account as the predominant terms; the conductive heat loss of the arc is neglected based on previous experience. the cylindrical anode channel forces also the stabilized arc to be axially symmetrical. the mach number ma is taken constant over the anode channel cross-section, i.e. the same in the hot arc zone and the cold surrounding zone. the development of the arc radius along the anode channel axis is assumed as follows:

$$r_a(z) = r_0 \left( 1 + \left( \frac{z}{r_0} \right)^{1/n_r} \right), \qquad (1)$$

where r is the radius [m], z is the axial coordinate [m] and $n_r$ is the parameter to be found. as mentioned above, subscript a means the arc; subscript 0 stands for the beginning z = 0 at the cathode tip. the radius of the cathode spot $r_0$ is determined from the
directly measured experimental data (current, voltage, gas flow rate, power loss) characterize the operation of the device as a whole, but important parameters describing the electric arc inside (its geometry, temperature and voltage distribution) must be revealed using a mathematical model of the arc. an updated version of the model is introduced and used for analysis of two exemplary sets of measured data. the results are given in figures and commented. keywords: electric arc; arc heater; argon; modelling. 1. introduction a modular-type arc heater with the electric arc burning in a cylindrical anode channel, being cooled and stabilized by gas flow, was designed by the authors and experimentally operated under various operational conditions [1]. various technological applications have been tested (e.g. decomposition of stable harmful substances, diamond deposition), and numerous data have been collected, which are useful e.g. in design, operation and usage of similar devices. the behaviour of the electric arc in the anode channel is of key importance for the operation of the device, but unfortunately substantial parameters of the arc cannot be observed and measured directly. the inner space of the device is inaccessible and basic properties of the arc are hidden in integral measured data. that is why a mathematical model of the arc has been designed which enables to determine the arc radius ra, temperature ta and voltage ua distribution from the experimentally obtained integral data such as the total voltage u and current i, the gas flow rate qm and power loss pl of individual segments of the arc heater determined by calorimetry. during a long period of experiments and modelling, rather extent experience has been obtained and some previous assumptions have been modified. the basic laws and presumptions of the model [2] remain unchanged and are only briefly summarized in § 2, which focuses mainly on new approaches in the new version of the model. examples of computed dependences are given and discussed in § 3. finally, § 4 concludes the paper. in the paper, subscripts are used to define the part of the device or the arc itself. the arc heater is divided into several separately cooled segments which are chosen with respect to their expected power loss load and mechanisms of energy exchange between the arc and the segment in question. the following notation is used: in the downstream direction, subscript “cat” stands for the cathode, subscript “in” for the input part of the anode channel, subscript “ch” for the (anode) channel, and subscript “a” stands for the anode itself. subscript “as” indicates the anode spot, the interface between the arc’s root and the grounded anode. finally, subscript “a” means the arc. 2. mathematical model the core part of the model describes the behaviour of the intensively blasted electric arc burning in the cylindrical anode channel of the arc heater. the model is based on the mass and energy conservation laws and ohm’s law and altogether with the measured data it uses also transport and thermodynamic properties of working gas [3]. to make the text easy to read, the main assumptions are summarized here: the arc plasma is supposed to be in the local thermodynamic equilibrium, with its kinetic energy small compared to its enthalpy. only the radial component of radiative energy flow and only the axial component of enthalpy flow are taken into account as predominant terms; the conductive heat loss of the arc is neglected based on previous experience. 
the cylindrical anode channel forces also the stabilized arc to be axially symmetrical. the mach number $ma$ is taken constant over the anode channel cross-section, i.e. the same in the hot arc zone and the cold surrounding zone. the development of the arc radius along the anode channel axis is assumed as follows:

$$r_a(z) = r_0 \left( 1 + \left( \frac{z}{r_0} \right)^{1/n_r} \right), \quad (1)$$

where $r$ is the radius (m), $z$ is the axial coordinate (m), and $n_r$ is the parameter to be found. as mentioned above, subscript a means the arc. subscript 0 stands for the beginning $z = 0$ at the cathode tip. the radius of the cathode spot $r_0$ is determined from the current density, which is supposed to be $10^8\ \mathrm{a\,m^{-2}}$ for currents up to 2.16 ka [4]. previous experiments and computations have shown that using a rectangular temperature distribution across the anode channel is sufficient, which makes the computation much simpler: $t_a(r,z) = t_a(z)$. furthermore, in the cold zone between the arc and the anode channel wall, the gas temperature has been found to be equal to its input temperature $t_0$.

the three basic integral equations are rewritten in difference form and are solved on an equidistant mesh, with the exception of the beginning near the cathode tip. the total length of the arc $z_l$ is divided into $n$ steps $\Delta z = z_l/n$. between adjacent segments, energy balance is assumed. in each slice ($k = 1, 2, \dots, n$), the following equation system is solved:

• continuity equation
$$ma(z_k) = q_m \left[ \pi \left( r_c^2 - r_a^2(z_k) \right) \rho(t_0)\, a(t_0) + \pi r_a^2(z_k)\, \rho(t_a(z_k))\, a(t_a(z_k)) \right]^{-1}, \quad (2)$$

• energy equation
$$\pi r_a^2(z_k)\, ma(z_k)\, \rho(t_a(z_k))\, a(t_a(z_k))\, h(t_a(z_k)) = \left( 1 - p_l(z_l) \right) u_a(z_k)\, i, \quad (3)$$

• ohm's law
$$u_a(z_k) = u_a(z_1) + \frac{i}{\pi}\, \Delta z \sum_{j=2}^{k} \frac{1}{r_a^2(z_j)\, \sigma(t_a(z_j))}, \quad (4)$$

where, besides the quantities defined above, $r_c$ is the anode channel radius, $\sigma$ the conductivity, $\rho$ the density, $a$ the sound velocity, $h$ the enthalpy of the working gas, $e(z)$ the electric field intensity, $q_m$ the gas flow rate, and $ma(z)$ the mach number. the loss coefficient $p_l$ is the ratio of the total measured power loss, reduced by the power loss of the cathode and the anode spot, to the electric input power of the arc:

$$p_l = \frac{p_{l,tot,meas} - p_{l,cat} - p_{l,as}}{u_a(z_l)\, i}.$$
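to make the marching scheme concrete, the following sketch (an illustration only, not the authors' implementation) steps through (2)–(4) slice by slice. the property functions rho, a, h and sigma are toy stand-ins for the tabulated argon data of [3], and all numerical values are assumptions:

import numpy as np

# toy stand-ins for the tabulated transport/thermodynamic properties of argon [3]
def rho(t):   return 1.6 * (300.0 / t)            # density (kg/m^3), assumed
def a(t):     return 18.0 * np.sqrt(t)            # sound velocity (m/s), assumed
def h(t):     return 520.0 * t                    # enthalpy (j/kg), assumed
def sigma(t): return max(t - 6000.0, 1.0) * 1e-2  # conductivity (s/m), assumed

def r_arc(z, r0, nr):
    # arc radius development along the axis, eq. (1)
    return r0 * (1.0 + (z / r0) ** (1.0 / nr))

def march(i_arc, qm, rc, r0, nr, zl, pl, ua1, ta1, n=109):
    """march eqs. (2)-(4) over an equidistant mesh of n slices."""
    dz = zl / n
    z = np.linspace(dz, zl, n)
    ta = np.full(n, ta1)      # arc temperature per slice
    ua = np.full(n, ua1)      # accumulated arc voltage, eq. (4)
    for k in range(1, n):
        ra = r_arc(z[k], r0, nr)
        ua[k] = ua[k - 1] + i_arc * dz / (np.pi * ra**2 * sigma(ta[k - 1]))
        target = (1.0 - pl) * ua[k] * i_arc   # right-hand side of eq. (3)
        # scan a temperature grid for the ta(z_k) balancing eq. (3),
        # with the mach number taken from the continuity equation (2)
        grid = np.linspace(7000.0, 14000.0, 701)
        def lhs(t):
            ma = qm / (np.pi * (rc**2 - ra**2) * rho(300.0) * a(300.0)
                       + np.pi * ra**2 * rho(t) * a(t))
            return np.pi * ra**2 * ma * rho(t) * a(t) * h(t)
        ta[k] = grid[np.argmin([abs(lhs(t) - target) for t in grid])]
    return z, ta, ua

a grid scan is used here only for transparency; any one-dimensional root finder would serve equally well.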
in the very first version of the model, only the behaviour of the electric arc was modelled and the influence of the near-electrode regions was neglected. later, the influence of the electrodes was included using data given by other authors. in this new model, the near-electrode regions are included and solved using the real measured data. the model describing the arc column is completed with calculations respecting the special nature of the near-cathode and near-anode regions. the treatment of these regions is explained in detail in [5], and acceptable agreement of the obtained values with the data of other authors is found. here, the solution of the near-electrode regions is explained only in brief, with the emphasis on the implementation of these parts into the complex model. it should be stressed that although the near-electrode regions need a special approach, they cannot be calculated independently, but must be solved as a part of the whole calculation system together with the arc. for the near-electrode regions, first the power consumed in these regions must be estimated, and then the total input power $ui$ is split into three parts pertaining to the near-cathode region, the arc alone, and the near-anode region.

in other words, only a part of the total voltage $u$ pertains to the arc: $u = u_{cat} + u_a + u_{as}$. thanks to the construction of the cathode and its shell and their intensive water cooling, the power consumed for building the necessary conditions for the arc in the near-cathode region can be estimated in a simple way. it is measured separately as the power loss of the cathode $p_{l,cat}$. the corresponding cathode voltage drop is $u_{cat} = p_{l,cat}/i$. in the close vicinity of the cathode, in the near-cathode layer, non-equilibrium processes take place which are not studied here [6]. what is needed for the further computation of the model of the arc is the distance from the cathode tip $s$ and the corresponding temperature $t_a(s)$ from which the axial arc development starts. for this purpose, the effective conductivity of the working gas $\sigma(t_a(s))$ is supposed to correspond to the cathode voltage drop $u_{cat}$ and to the current density $j(s)$, which is reached at the distance $s$ from the cathode tip if the arc radius $r_a(s)$ is taken from (1). based on this consideration, the near-cathode layer width $s$ can be determined from the following equation:

$$\sigma(t_a(s)) = \frac{j(0)}{\left( 1 + (s/r_0)^{1/n_r} \right)^2} \cdot \frac{s}{u_{cat}(i)}. \quad (5)$$

in this equation, first a suitable $t_a(s)$ is chosen and the corresponding near-cathode layer width $s$ is computed for the given $n_r$. it has been found that the selected temperature $t_a(s)$ significantly affects the shape of the temperature distribution $t_a(z)$ near the beginning, but soon its influence on $t_a(z)$ dies down. such a value of $t_a(s)$ is used in further modelling which results in a fast and smooth increase of $t_a(z)$ without extremes near the beginning. an accuracy of hundreds of kelvins is sufficient. obviously, the width of the near-cathode layer depends not only on the sort of the working gas, on the arc current $i$ and the cathode voltage drop, but also on the exponent $n_r$, which describes the development of the arc radius along the $z$ coordinate, $r_a(z)$, see (1). this exponent is determined during the computation of the model as a whole, as explained later.

in the near-anode region, the situation is more complicated, because the power consumed for the arc–anode attachment cannot be measured separately. in the anode, the arc is fully developed and the power irradiated from the arc column to the anode wall cannot be neglected or easily split off. for a first approximate estimation of the anode spot voltage drop $u_{as}$, the following consideration is used. the anode in the tested device has the same radius $r_c$ as the anode channel, but its length $l_a$ differs from the length of the channel $l_{ch}$; here it was about three times shorter. very steep changes in the arc radius and temperature take place especially near the beginning, while towards the end both the radius and the temperature change more slowly. so it can be expected that near the end, the power transferred from the arc to a unit surface of the anode channel wall is roughly the same as that transferred to a unit surface of the anode wall. then the separately measured power loss of the anode can be divided into the power obtained from the arc column and the power loss due to the arc/anode attachment $p_{l,as}$. naturally, this first estimation of the anode spot power loss $^0p_{l,as}$ and the corresponding anode spot voltage drop $^0u_{as}$ is refined during the iterative computation of the model as a whole.
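returning to the near-cathode layer, eq. (5) can be solved with any bracketing root finder. the sketch below is a hedged illustration: the conductivity function sigma(t), the chosen edge temperature t_s and the search bracket are assumptions standing in for the tabulated gas data and the geometry of a concrete experiment:

import numpy as np
from scipy.optimize import brentq

def near_cathode_width(i_arc, u_cat, r0, nr, t_s, sigma):
    """solve eq. (5) for the near-cathode layer width s."""
    j0 = i_arc / (np.pi * r0**2)   # current density at the cathode tip
    def residual(s):
        # sigma(t_a(s)) - j(s) * s / u_cat, with the current density
        # scaled by the arc cross-section growth of eq. (1)
        j_s = j0 / (1.0 + (s / r0) ** (1.0 / nr)) ** 2
        return sigma(t_s) - j_s * s / u_cat
    # the bracket (assumed) must enclose the sign change of the residual
    return brentq(residual, 1e-7, 5e-3)

choosing t_s to an accuracy of hundreds of kelvins is sufficient, as noted above, because its influence on t_a(z) dies down quickly.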
the input data of the model are the geometrical parameters of the device (radius and length of the input part of the anode channel $r_{in}$, $l_{in}$, of the main part of the anode channel $r_{ch}$, $l_{ch}$, and of the anode $r_a$, $l_a$) and the transport and thermodynamic properties of the working gas ($\sigma$ the conductivity, $\rho$ the density, $a$ the sound velocity, $h$ the enthalpy), which are known before the experiment. a further set of input data is obtained individually for each experiment, which is characterized by the current $i$, the total voltage $u$, the gas flow rate $q_m$, and the measured power losses of the individual segments of the device (power loss of the cathode $p_{l,cat}$, of the input part of the channel $p_{l,in}$, of the main part of the anode channel $p_{l,ch}$, and of the anode $p_{l,a}$).

the computation starts with the above-mentioned determination of the cathode voltage drop $u_{cat}$ and the rough estimation of the anode spot voltage drop $^0u_{as}$. the arc voltage is $^0u_a = u - u_{cat} - {}^0u_{as}$. the first value of the exponent $n_r$ is estimated from the energy balance at the output cross-section of the device at $z = z_l$. as mentioned above, the equation system (2)–(4) is solved on a mesh over the $z$ axis. a step $\Delta z$ of 1 mm is found to be suitable when the total length $z_l$ is tens of millimetres (109 mm in the following examples). in the first node $z_1$, the first interval is of a different length ($\Delta z - s$) than the others. near the beginning, the electric field intensity falls very fast with increasing distance from the cathode tip. to prevent overestimation of the voltage drop in the first interval, an average value of the computed electric field intensity is taken as an acceptable compromise:

$$u_a(z_1) = \frac{\Delta z - s}{2\pi} \left( \frac{i}{r_a^2(s)\, \sigma(t_a(s))} + \frac{i}{r_a^2(z_1)\, \sigma(t_a(z_1))} \right). \quad (6)$$

solving the equation system on the mesh step by step gives the axial dependences of the arc temperature $t_a(z)$, electric field intensity $e(z)$, mach number $ma(z)$, power loss $p_l(z)$ and voltage $u_a(z)$, and finally, at the end, the total arc voltage $u_a(z_l)$ and $p_l(z_l)$ as sums of the individual increments. the calculation is repeated with a slightly changed exponent $n_r$ until the sum of the computed voltages for the $i$-th iteration, $^iu_a(z_l) + u_{cat} + {}^iu_{as}$, is (with an acceptable difference) equal to the total measured voltage $u$. next, the estimation of the anode spot power loss must be refined. now not only the measured power losses of the anode channel and the anode are available, but also the computed values obtained by summation within the $k$-th iteration. they are mutually compared, and the estimation of the anode spot power loss is updated in the next step to better match the measured power loss of the channel and the anode. the procedure is repeated until the difference between the two approaches starts to increase; then the best approximation of the anode spot power loss is found. after the calculation, the computed values of the power loss and voltage distribution correspond to the measured values with acceptable errors. the next section illustrates some computations and gives examples of the axial dependence of the arc temperature and radius for the arc stabilized by argon at two different flow rates.
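the outer loop that ties the pieces together can be sketched as follows. march_once is a hypothetical callback wrapping the slice-by-slice solution of (2)–(4); the step size, the tolerance and the assumed monotonicity of the computed voltage in n_r are illustrative assumptions, not statements about the authors' code:

def fit_exponent(u_meas, u_cat, march_once, nr0=2.0, step=0.05,
                 tol=0.5, max_iter=200):
    """adjust the radius exponent n_r until u_a(z_l) + u_cat + u_as
    matches the measured total voltage u; eq. (1) fixes the radius
    shape once n_r is known."""
    nr = nr0
    for _ in range(max_iter):
        ua_l, uas = march_once(nr)      # computed arc and anode-spot voltages
        diff = (ua_l + u_cat + uas) - u_meas
        if abs(diff) < tol:             # acceptable difference reached
            return nr, ua_l, uas
        # assumption: a larger n_r gives a slimmer, more resistive arc
        # and hence a larger computed voltage, so step against the sign
        nr -= step if diff > 0 else -step
    raise RuntimeError("no suitable n_r found within max_iter iterations")

the refinement of the anode spot power loss described above would wrap this loop in one more iteration level, stopping when the mismatch between the computed and the measured channel/anode losses starts to increase.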
3. results and discussion
the designed model of the arc, including also its near-electrode regions, was used for the evaluation of two sets of experimental data. the arc heater with the channel radius $r_c$ of 8 mm was operated on pure argon with two different flow rates (11.3 g/s and 22.5 g/s). the total distance from the cathode tip to the output of the anode $z_l$ was 109 mm, consisting of the input part of the channel $l_{in} = 22$ mm, the main part of the anode channel $l_{ch} = 60$ mm, and the anode $l_a = 27$ mm. the total input power $p_{in} = ui$ was set approximately between 4 and 30 kw. in the following figures, if the figure compares the results obtained with both argon flow rates, solid symbols and lines are used for the lower flow rate of 11.3 g/s and empty symbols with dashed lines for the higher flow rate of 22.5 g/s.

figure 1 compares the near-cathode layer width $s$, the corresponding arc temperature $t_a(s)$ and the arc cross-section area $s_a(s) = \pi r_a^2(s)$ at the distance $s$ for the two different argon flow rates and under different arc currents $i$. obviously, increasing arc currents result in a lower $s$ and a higher $t_a(s)$. as could be expected, for higher arc currents the conditions for the arc burning are built in a closer vicinity of the cathode and the arc temperature is higher. for the same arc current, a higher argon flow rate distinctly decreases the near-cathode layer width and the corresponding temperature is also lower. the arc cross-section area is higher for the lower gas flow rate and higher arc currents, but it is worth mentioning that both the arc cross-section and the arc temperature cannot be directly compared between the two gas flow rates, as they are reached at different distances $s$ from the beginning.

figure 1. the near-cathode layer width $s$, the corresponding arc temperature $t_a(s)$ and cross-section $s_a(s)$ for the two argon flow rates of 11.3 and 22.5 g/s.

figure 2. the voltage vs. current dependences for the two argon flow rates 11.3 and 22.5 g/s: the measured voltage $u$, net arc voltage $u_a$, cathode voltage drop $u_{cat}$, and anode-spot voltage drop $u_{as}$.

figure 2 gives the voltage vs. current dependences for both investigated argon flow rates. the highest curves (with circles) are the measured total voltages $u$. as can be seen, the twice as high argon flow rate results in a nearly 10 % higher total voltage. after the computations, the net arc voltage $u_a$ is obtained by subtracting the cathode voltage drop $u_{cat}$ and the computed anode spot voltage drop $u_{as}$ from the measured total voltage $u$. the dependences of the cathode and anode spot voltage drops on the arc current are given in the same figure and deserve a short notice. as observed in our older experiments, the cathode voltage drop does not depend on the gas flow rate at all and decreases with increasing arc current. on the contrary, the anode spot voltage drops increase with increasing arc current and seem to exhibit some relationship to the gas flow rate. a higher argon flow rate results in a somewhat lower anode spot voltage drop.
as a result of the opposite course of the cathode and anode spot voltage drop dependences on the current, their sum changes only a little with the current, and thus the shape of the curves $u(i)$ and $u_a(i)$ is almost the same, with the net arc voltage here being approximately 10 % lower than the corresponding measured total voltage $u(i)$. in both cases, the characteristics exhibit a typical s-shape with very slightly increasing voltage in the middle and slightly decreasing voltage at low (and probably also high) currents.

in figures 3 and 4, typical computed axial dependences of the arc temperature $t_a(z)$ and the arc radius $r_a(z)$ are given for the argon flow rate of 22.5 g/s and several arc currents. it is clearly seen that close to the beginning the arc temperature and the arc radius quickly increase. while the arc radius increases along the whole channel (see (1)), the arc temperature begins to decrease soon. the steep changes of the arc temperature and radius in the input part of the anode channel (of the length of 22 mm in the discussed experiments) are the main reason why the input part is cooled separately and is not taken into account for the estimation of the anode spot power loss.

figure 3. the computed distribution of the arc temperature on the channel axis $t_a(z)$ for the argon flow rate of 22.5 g/s and different arc currents (the order of the legend items corresponds to the order of the curves).

figure 4. the computed distribution of the arc radius $r_a(z)$ for the argon flow rate of 22.5 g/s and different arc currents (the order of the legend items corresponds to the order of the curves at the channel).

in the anode channel, the arc temperature decreases almost linearly with increasing distance from the beginning. the legend in both figures gives the arc currents in the same order as the curves appear at the end of the anode. obviously, the output arc temperature $t_a(z_l)$ first increases almost linearly with the arc current, but at higher currents this increase becomes slower and the output temperature does not increase any more. in figure 5, the computed arc temperature at the end of the anode channel $t_a(0.082\ \mathrm{m})$ for the argon flow rates of 11.3 g/s and 22.5 g/s is given in dependence on the total input power. similarly, figure 6 shows the computed arc cross-section at the end of the anode channel $s_a(0.082\ \mathrm{m})$ for both argon flow rates. it is clearly seen that the higher argon flow rate results in a smaller arc cross-section but a higher arc temperature. the arc column is narrow and hot, which manifests itself in a higher power irradiated to the channel walls. also, the saturation of the arc temperature at higher currents at the higher argon flow rate is apparent in figure 5. surprisingly, no similar saturation is seen in the arc temperature dependence for the halved argon flow rate of 11.3 g/s. with this lower argon flow rate, both the arc temperature and the arc cross-section increase almost linearly with the input power within the tested range. undoubtedly, a small slowdown can be observed at higher input power.
figure 5. the computed arc temperature at the end of the anode channel $t_a(0.082\ \mathrm{m})$ for the argon flow rates of 11.3 g/s (solid line, solid symbols) and 22.5 g/s (dashed line, empty symbols) vs. the total input power.

figure 6. the computed arc cross-section at the end of the anode channel $s_a(0.082\ \mathrm{m})$ for the argon flow rates of 11.3 g/s (solid line, solid symbols) and 22.5 g/s (dashed line, empty symbols) vs. the total input power.

4. conclusions
the paper introduces an updated model of the electric arc burning in the anode channel of the arc heater. the mathematical model not only describes the behaviour of the intensively blasted arc column, but also makes it possible to solve the problem of the near-electrode regions in direct connection with the measured data obtained from the analysed experiment. the model is applied to the analysis of two sets of experiments carried out under different argon flow rates. selected results are illustrated in figures and reveal interesting observations. in the described cases, the difference between the computed (summed) arc voltage and the value determined from the measured data was below one percent. unfortunately, a much higher error was observed between the computed and the measured power loss of the anode, especially in some experiments. probably, a more precise method for the determination of the anode spot power loss should be sought. further experimental and computational experience may inspire a better approach to this problem.

acknowledgements
the authors gratefully acknowledge financial support from the ministry of education, youth and sports of the czech republic under the npu i programme (project no. lo1210) and from the czech science foundation (project no. ga 15-14829s).

references
[1] i. jakubova, j. senk, i. laznickova: the influence of nitrogen in argon/nitrogen mixture on parameters of high-temperature device with electric arc. acta polytechnica 53(2):179–184, 2013.
[2] j. heinz, j. senk: modelling of energy processes in intensively blasted electric arc. czechoslovak journal of physics 54(c):c702–c708, 2004. doi:10.1007/bf03166474.
[3] a. farmer, g. haddad: material functions and equilibrium composition of argon and nitrogen plasma. csiro division of applied physics, 1989.
[4] s. ramakrishnan, a. stokes, j. j. lowke: an approximate model for high-current free-burning arcs. journal of physics d: applied physics 11:2267–2280, 1978. doi:10.1088/0022-3727/11/16/014.
[5] j. senk, i. jakubova, i. laznickova: treatment of near-electrode regions in a simple model of blasted electric arc. in: proceedings of epe 2016, to be published, 2016.
[6] j. lowke: a unified theory of arcs and their electrodes. journal de physique iv 07(c4):c4-283–c4-294, 1997.

acta polytechnica 55(6):407–414, 2015, doi:10.14311/ap.2015.55.0407, © czech technical university in prague, 2015, available online at http://ojs.cvut.cz/ojs/index.php/ap

an interpolation method for determining the frequencies of parameterized large-scale structures
s. nasisi*, m. valášek, t. vampola
faculty of mechanical engineering, ctu in prague, technická 4, 166 07 prague, czech republic
* corresponding author: salvatore.nasisi@fs.cvut.cz

abstract.
parametric model order reduction (pmor) is an emerging category of models developed with the aim of describing reduced first- and second-order dynamical systems. the use of a prom turns out to be useful in a variety of applications, ranging from the analysis of micro-electro-mechanical systems (mems) to the optimization of complex mechanical systems, because it allows the dynamical behavior to be predicted at any values of the quantities of interest within the design space, e.g. material properties, geometric features or loading conditions. the process underlying the construction of a prom using an svd-based method accounts for three basic phases: a) construction of several local roms (reduced order models); b) projection of the state-space vector onto a common subspace spanned by several transformation matrices derived in the first step; c) use of an interpolation method capable of capturing the values of the quantity of interest for one or more parameters. one of the major difficulties encountered in this process has been identified at the level of the interpolation method and can be encapsulated in the following contradiction: if the number of detailed finite element analyses is high, then an interpolation method can better describe the system for a given choice of a parameter, but the computation time is higher. in this paper, a method is proposed for removing this contradiction by introducing a new interpolation method (rsdm). this method allows us to restore, and make available to the interpolation tool, certain natural components belonging to the matrices of the full fe model that are related, on the one hand, to the process of reduction and, on the other, to the characteristics of a solid in fe theory. this approach shows higher accuracy than existing methods when assessing the eigenbehavior of the system. a hexapod will be analyzed to confirm the usefulness of the rsdm.

keywords: prom; singular value decomposition; interpolation method; large-scale hexapod; structural optimization.

1. introduction
1.1. parametric model order reduction and its interpolation
within the engineering community, increasing efforts have been focusing on parametric model order reduction (pmor) as a way of diminishing the computation time for simulating large systems. by interpolating among the reduced matrices obtained at specific values of one or more parameters (where a parameter refers, for example, to the length of a plate, the magnitude of an externally applied force, a material property, or specific boundary or initial conditions), the analyst is able to obtain very fast and accurate models that can then be used for a number of purposes. one indicator of how this attention has been spreading throughout the scientific community is the application of parametric order reduction to the solution of electrochemical models for simulating cyclic voltammograms in the field of polarography [8]; to the simulation of 3d micro-electro-mechanical systems (mems) [19]; to design optimization [9]; and also to control [18]. in these studies, ordinary first- and second-order differential equations are solved to estimate the frequency response of the system within a range of frequencies of interest. to reduce these equations, most works employ mor techniques spanning from the truncated balanced realization method [2–4] to the krylov subspace method [5–7, 10].
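as a rough orientation, the three phases a)–c) described above can be compressed into a few lines of pseudo-implementation. everything here is an assumption for illustration: fe_solve stands for a detailed finite element analysis returning the full (k, m) pair at a parameter value, and the svd of k is only a placeholder for a proper mor basis (serep, krylov subspaces, balanced truncation):

import numpy as np

def build_local_roms(fe_solve, params, r):
    """phase a): one reduced pair (k_r, m_r) per sampled parameter value."""
    local = []
    for p in params:
        k, m = fe_solve(p)            # detailed fe analysis (assumed api)
        u, _, _ = np.linalg.svd(k)    # placeholder projection basis
        t = u[:, :r]                  # r-dimensional subspace
        local.append((p, t.T @ k @ t, t.T @ m @ t))
    return local
    # phase b) would align these local roms in a common subspace, and
    # phase c) interpolates among them (see section 1.2 below)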
1.2. problems related to the interpolation method
the construction of a prom is accompanied by the use of an interpolation method for predicting the values of the quantity of interest for one or more parameters at any point in the design space. the computation of parametric stiffness and mass matrices requires the application of a suitable interpolation scheme to deliver accurate results. an interpolation scheme for the computation of parametric matrices is suitable when it:
(1.) preserves the property of positive definiteness of the parametric roms;
(2.) preserves the quantity of interest within the design space with the least error;
(3.) requires the lowest computational cost with the simplest or minimal mathematical apparatus.
the first property is essential, because if it is violated then the computation of, for example, the eigenfrequencies of the system under investigation yields complex numbers. an intuitive approach to interpolating between two or more precomputed roms would consist of letting a first-, second- or higher-order interpolating polynomial pass through the values of the $k$ or $m$ entries for different values of the parameter of interest. therefore, if the parameter that is used is, for example, the length of a cantilever, and three points have been chosen to explore the design space spanning from the first length, say $l_1 = 200$ mm, to the third length $l_3 = 300$ mm through $l_2 = 250$ mm, a quadratic polynomial could be used to interpolate between the points $(l_1, k_{ij}^{(1)})$, $(l_2, k_{ij}^{(2)})$, $(l_3, k_{ij}^{(3)})$ and $(l_1, m_{ij}^{(1)})$, $(l_2, m_{ij}^{(2)})$, $(l_3, m_{ij}^{(3)})$, where the superscript indicates at which length value the matrix has been computed. although this scheme seems attractive due to the use of reduced matrices and available functions for computing interpolating polynomials, it produces ineffective results. the reason is that homologous entries, e.g. of a stiffness matrix calculated at different length values, exhibit a highly nonlinear behavior. therefore, one consequence of applying this method is that property (1) above is violated. there is a possible way out; it would entail calculating the matrices for length values very close to each other. in this case, the parametric matrix would be readily available for any length value. the problem is that this procedure would require the interval between two adjacent matrices to be so narrow that it would discourage the use of a prom, and would therefore find no application at all for engineering purposes. a simple way to obtain a parametric rom for a given set of parameters entails performing a linear approximation of a rom that lies between two precomputed roms obtained for different values of the parameters. these precomputed roms are weighted by appropriately chosen functions, and the final, parametric matrix is generated by superimposing the system matrices, as shown in [16]. in order to improve the effectiveness of the interpolation tool, and due to the difficulty arising from describing the parametric effects for nonlinear systems, a spline-based interpolation method is introduced in [17] and also in [15], where element-wise interpolation in the tangent space of the matrix manifold is employed. [15] also addresses the mode-veering problem and shows how the method can detect the narrowing of the frequency gaps between two adjacent mode shapes.
in [18], the interpolation method that is used is based on the direct matrix interpolation method (in this paper indicated as dmm), which consists of the weighted sum of the reduced matrices calculated in correspondence with specific values of the chosen parameter(s). the typical form of, for example, a parametric reduced stiffness matrix is given by

$$k(p_1, p_2, \dots) = \sum_{j=1}^{k} \alpha_j(p_1, p_2, \dots)\, k_j, \quad (1)$$

where $\sum_{j=1}^{k} \alpha_j(p_1, p_2, \dots) = 1$, $\alpha_j(p_i) = \delta_{ij}$ for $i, j = 1, \dots, k$, and $p \in \pi$. here, the $\alpha_j(p_1, p_2, \dots)$ are coefficients depending on the selected parameters, while the $k_j$ are the reduced local stiffness matrices. in [18], the dmm is employed for the computation of the frequency response of first-order linear time-invariant dynamical systems, and two local models, i.e. two pairs of reduced $k$ and $m$, are used for 1 or 2 parameters. unfortunately, this application of the superposition of weighted matrices relies on the use of only 1 parameter for a linear interpolation, which implies the use of only two precomputed roms. this can strongly affect the precision with which the final parametric matrices can be employed with success in assessing the eigenbehavior of the system. in [15], a wing-with-store configuration is presented and the first 8 frequencies are calculated, but the frequency error reaches up to 10 %. furthermore, the aforementioned works have not yet pointed out the importance of recovering the components of the fe matrices associated with the process of reduction and the deformation of the solid. in order to take these components into account, it is necessary to use a different viewpoint, i.e. to proceed backwards from the roms to the mathematical structure that characterizes the computation of the matrices of an fe assemblage.
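for completeness, a minimal sketch of the dmm of eq. (1) is given below; the piecewise-linear hat weights are one simple choice satisfying the stated constraints on the alpha_j, not the only one, and the sample points are assumed sorted in increasing order:

import numpy as np

def dmm(p, sample_points, k_list, m_list):
    """direct matrix interpolation, eq. (1): weighted sums of the
    precomputed reduced matrices, with sum(alpha_j) = 1 and
    alpha_j(p_i) = delta_ij enforced by hat functions."""
    pts = np.asarray(sample_points, dtype=float)
    eye = np.eye(len(pts))
    # alpha_j is 1 at the j-th sample point and 0 at all others
    alphas = [np.interp(p, pts, eye[j]) for j in range(len(pts))]
    k = sum(a * kj for a, kj in zip(alphas, k_list))
    m = sum(a * mj for a, mj in zip(alphas, m_list))
    return k, m

as discussed above, this element-wise superposition cannot distinguish the components blended into the reduced matrices, which motivates the factorized form introduced in the following sections.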
2. aims
the first objective of this paper is to introduce a new interpolation method named rsdm, adapting the framework of the dmm to a different structure of the stiffness and mass matrices. instead of performing element-wise interpolation of precomputed roms, the proposed method transforms the interpolating matrices into a form that allows interpolation among the deformation components and the changes in the coordinates of the system, which are made visible by an appropriate factorization. this process seems essential, because certain components describing the matrices of an fe assemblage have remained encapsulated after the process of reduction and have blended with the components of the transformation matrix, while other components, e.g. those pertaining to the rigid rotation, which do not participate in the deformation of the solid, are not removed. as a result, the application of a matrix-interpolation-based method without this step remains, so to say, "blind" and incapable of discerning the useful components. in the first step of the method, two pairs of matrices (the stiffness and mass matrices) are factorized in such a way as to rule out the useless components and restore those components that have been blended after the process of reduction. the overall computational complexity and the numerical properties take advantage of this procedure.

figure 1. normalized relative frequency difference (nrfd) vs. mode number for a comparison between the proposed rsdm and the direct matrix method (dmm). the proposed method preserves the first three frequencies better than the dmm.

in fact, in order to obtain a very accurate prom, especially within a broad design space of the chosen parameters, it might be necessary to run several detailed finite element analyses. consequently, it might be time-consuming to run several simulations in order to keep the accuracy within a given tolerance. this problem can be condensed in the following contradiction: if the number of detailed finite element analyses is high (3 or more), then an interpolation method can better describe the system, because it can more effectively predict the frequencies pertaining to intermediate values of the chosen parameter(s), but the computation time becomes higher. as will be shown, instead of searching for a trade-off between these two conflicting requirements (computation time and computation accuracy), it is possible to improve the effectiveness of the method and to remove the above contradiction by making minimal changes to the direct matrix method. since this solution to the above problem involves minimal changes, it is more ideal, because it provides an easier way to introduce fewer resources (only a carefully manipulated svd) while reducing the expenses (the computational time and the mathematical apparatus required by the rsdm) as much as possible. the intuitive promise of the proposed method is illustrated in figure 1. the second objective of the paper is to show the performance of the rsdm in terms of accuracy. a numerical application to a hexapod discretized with almost 200 thousand degrees of freedom will be presented.

3. description of the interpolation method
3.1. foundations and derivation of the proposed interpolation method. an analogy with continuum mechanics
solid mechanics is a branch of continuum mechanics that has been developed to predict the behavior of solids (e.g. changes in shape, internal forces) subjected to the action of mechanical and thermal loads. a general deformation of a solid – unlike the idealization often considered in engineering dynamics, where a body is rigid – can be defined as a combination of a rigid rotation and a stretch (or contraction) [13]. once the deformation is known, the stresses induced by external mechanical and/or thermal loads can be calculated. however, these stresses are generated by the deformational component of the stretch and not by the rigid rotation. therefore, in order to define the deformation gradient for a solid, the stretch component is separated from the rigid rotation component by means of a mathematical factorization known as the polar decomposition [10]. in order to predict the deformation and the stress fields, most engineering design calculations make use of finite element techniques, which are employed to solve the governing equations of a solid, both for static and for dynamic analysis. with reference to figure 2 and using the principle of virtual work, the equations of motion for linear elastic solids are described as follows ([11], p. 492):
in order to calculate the frequencies and the mode shapes of this structure, it is necessary then to solve the eigenproblem of the type kaibkφnxp = maibkφnxpλpxp (4) that stems from solving the characteristic equation obtained after introducing the trial solution, ui = <(bφnx1eiωt), (5) within the system of equations (3). the calculation of the eigensolution pertaining to this set of equations is therefore dependent on kaibk and maibk, which are the stiffness and mass matrices of the finite element assemblage illustrated in figure 2. referring to (2), these matrices and the vector of the externally applied loads have the following mathematical structure: kaibk = ∫ r cijkl ∂na ∂xj ∂nb ∂xl dv, mab = ∫ r %nanbdv, (6) f ai = ∫ r bin adv + ∫ ∂2r tin ada, where r and ∂2r are the shape of the solid in its unloaded condition and its boundary, respectively. analyzing these relations, it can be observed that both the finite element stiffness and mass matrices depend on the material properties (the density % and the elastic modulus tensor cijkl) and on the element interpolation functions or shape functions, n(x1,x2) with x1,x2 being the coordinates of a generic node with respect to a reference system o~e1 ~e2 (figure 2); while, indices a,b,i,k, l indicate the directions pertaining to the cartesian basis, {e1,e2}. in turns, the shape functions depend on the type of finite element and on the reciprocal displacement of the nodes of the discretized structure. that is, the entries of the stiffness and mass matrices of a finite element assemblage contain information related to the material and the deformation of the solid. the general deformation of a solid can be defined as combination of a rigid rotation and a stretch (or contraction) [13]. the process of general deformation undergone by a finite element is illustrated in part (b) of figure 2 and it is described as a stretch followed by a rigid rotation or, which is the same thing, by a rigid rotation followed by a stretch. once the deformation is known, the stresses can be derived. figure 2. an example of a finite element mesh (a); a general deformation can be decomposed into a rotation and a stretch (b). the gradient of deformation is defined in continuum mechanics as, f = rs. the analogy with continuum mechanics is used to show the importance of recovering those components of stretch that, more than a rigid rotation, contribute to the change in the shape of the solid. however, these stresses are generated only by the deformation component of the stretch and not of the rigid rotation; that is, a rigid rotation does not participates actively in the stretch or contraction of the solid. as a result, the matrices kaibk and maibk are dependent only on the two components of rigid rotation and stretch. now, without loss of generality, let us suppose that the finite element assemblage of figure 2(a) has a large number of elements, and that we desire to reduce the size of kaibk and maibk by means of a transformation matrix t which, by definition, is a rotation matrix. therefore, the full stiffness and mass matrices, in addition to containing information related to the rigid rotation and the stretch of the solid, also retain information related to the change in the coordinates system undergone after the process of reduction. it then appears evident then that, whenever a large-scale system is reduced and then analyzed for modal analysis by means of kr = t t kt , mr = t t m t , three types of components are blended to one another. 
consequently, these components are not actively exploited when applying the standard matrix interpolation method, because the weights that are used can never distinguish between them. a direct consequence of this situation is that the resulting roms cannot guarantee satisfactory accuracy. the question that emerges is therefore: how can these three components be taken out of $k_r$ and $m_r$ in order to make them visible to the interpolation tool? the next section introduces the important concept of the polar decomposition, which allows the useless component of rigid rotation to be ruled out, and brings out the useful component associated with the deformation.

3.2. polar decomposition
the polar decomposition can be defined as the analogue, for matrices, of the polar form of a complex number $z = r e^{i\vartheta}$, $r \geq 0$.
consequently (figure 3 bottom), the rotation and the stretch, which, after the process of reduction are originally blended, can each be made visible by the factors of a singular value decomposition (applied in this example to the stiffness matrix): an orthogonal matrix v and the matrix of singular values that define the part responsible for the deformation. polar decomposition can therefore be considered as the conceptual tool that, by analogy with continuum mechanics, allows us to identify which component of the rigid rotation to rule out; svd practically allows us to compute the components of the reduced matrix pertaining to the change of coordinates and stretch. to compute the s-component, it will therefore be necessary to apply an svd method to the matrices of interest. the following section shows this step. 3.3. application of the proposed interpolation method in order to obtain the s matrix, it is therefore necessary to carry out a singular value decomposition, and to exploit only two of its factors. it is important to highlight this, because it might lead to the idea that svd is applied to the matrices in its entirety, whereas only a part of its generated factors is in fact used. since the svd method is numerically stable and is implemented in different computational software programs, its use for calculating s is simple and fast. suppose that two finite element analyses have been run, and that the matrices of mass m and stiffness k have been derived. two couples of matrices are now available: (k1, k2) and (m1, m2). the first step consists of performing an svd and extracting the matrices v and d for each of these 4 matrices. let us use (vk1, vk2, dk1, dk2) and (vm1, vm2, dm1, dm2) to refer to the 8 matrices pertaining to the svd applied to each of the 4 matrices (k1, k2) and (m1, m2). the parametric stiffness and the mass matrices can now be ascertained: k(p) = α1(p)vk1α2(p)dk1α3(p)v tk1 + (1 −α1)(p)vk2(1 −α2)(p)dk2(1 −α3)(p)v tk2, m (p) = β1(p)vm1β2(p)dm1β3(p)v tm1 (9) + (1 −β1)(p)vm2(1 −β2)(p)dm2(1 −β3)(p)v tm2, where we shorten (p) = (p1,p2, . . . ). the coefficients are an essential part of this model. more specifically, for those values of the parameter(s) 411 s. nasisi, m. valášek, t. vampola acta polytechnica figure 5. error committed with the use of serep-cms for calculating the first 20 frequencies of interest of a hexapod. figure 4. 3d model of a hexapod. at which (k1, m1) are obtained, αi = βi = 1 while in the correspondence of those values at which (k2, m2) are computed, αi = βi = 0, i = 1, 2, 3. the advantages of this method rely on three factors: (1.) unlike the direct matrix method, this method accounts for 6 coefficients, thus allowing the model to be tuned up efficiently. (2.) the decomposition discussed above is guaranteed to act on more components that, without decomposition, would remain encapsulated into the reduced stiffness and mass matrices; namely the component pertaining to the transformation matrix and the component associated to the shape functions of the matrices of the fe assemblage. (3.) the useless component related to the rigid rotation has been ruled out, so it does not participate actively and harmfully in the process of interpolation. 4. results we now present as a numerical example a study of a hexapod structure with piezoactuators that was created at the faculty of mechanical engineering of the czech technical university in prague. the aim is to investigate the suppression of vibrations of compliant mechanical structures. 
this structure will show how the implementation of rsdm copes with the use of a large-scale system. the matrices of the original finite element model have dimensions 198015 × 198015 and they are reduced to 70×70 (0.035 %). it is interesting to assess the degree of fidelity with which rsdm is able to generate the rom at a desired test point. only one parameter was considered for this structure, namely the radius of the legs. the initial value of the parameter was increased by 100 % and only two finite element analyses were run. to assess the accuracy with which the rom interpolated by rsdm was computed, a full model was compared with the computed rom at a chosen test point lying within the design space identified by the radius of the legs. the reduction process of reduction for obtaining the two initial roms employed compund serep-cms mor, i.e. first the serep method was used to preserve the lowest 20 frequencies and subsequently a component-mode synthesis method (craig-bampton) was applied to the reduced matrices obtained by serep. the accuracy of the rom was assessed by applying two criteria: computation of the normalized relative frequency difference or nrfd, and the mac number. 412 vol. 55 no. 6/2015 an interpolation method for determining the frequencies figure 6. computation of the first 6 frequencies by the parametric rom at the test point (a); modal assurance criterion (mac) [1]. the computed rom at the test point exhibits a good correlation, despite the relevant reduction of the full system (b) the error estimate therefore relies on the following: nrfd = ∣∣∣1 − exact frequency(i) approximatingfrequency(i) ∣∣∣ · 100 %, (10) i = 1, 2, . . . , nfoi3, while computing the mac number we used mac(i,j) = (dr tpφti pr tpψi )(dr tpφti dr tpφi ) · (pr tpψ t i pr tpψi ) (11) where φ and ψ are the modal matrix pertaining to the direct rom (dr) and the interpolated rom (pr) obtained at the test point (tp), respectively. figure 5 shows the nrfd using a serep-cms mor, and how successful the reduction process was successful in capturing the dynamics of the original system despite the considerable reduction. 4.1. performance of rsdm for frequency prediction of a hexapod despite the high accuracy of the mor throughout the range of the 20 frequencies, figure 6 shows how the dramatic reduction along with a 100 % change of the parameter has influenced the preservation of the whole range of frequencies at which the initial serep method was calibrated. the picture highlights the following features. first, the error committed by the rom for calculating the fundamental frequency is about 0.0006 %, and it is less than 0.7 % for the first three frequencies. second, the maximum error committed for the first 5 frequencies is below 1.5 %. third, the maximum error is observed for the mode number 6, at which the error is slightly smaller than 3.9 %. the initial 5 frequencies are the 3number of frequencies of interest frequencies of main interest for an engineer. consequently, rsdm is able to preserve the most important range. for the present study-case, nrfd is also supported also by the calculation of the modal assurance criterion (mac), the results for which are plotted in figure 6(b). for the first 6 frequencies, the values of mac along the main diagonal range between a minimum of 98.7 % and a maximum of 99.9 %. 
these results provide evidence that the rom generated at the test point by rsdm is accurate for the first 6 frequencies, even though a consistent reduction was performed by serep-cms (0.035 % of the full model). 5. conclusions the main challenge that had to be faced in improving the accuracy of parameterized large-scale systems was to optimize the direct matrix interpolation method, i.e. to minimize the number of detailed finite element analyses and to guarantee high accuracy, while introducing as few as possible mathematical concepts. in an attempt to achieve this outcome, rsdm has been proposed as promising interpolation method for obtaining accurate parametric roms. the application of this method relies on a carefully manipulated svd of the reduced stiffness and mass matrices, and on the use of only two finite element analyses. however, there are several reasons why this optimization may be compromised. first, if the number of the parameters and the design space are large, only two finite element analyses might not be sufficient. second, if the size of the rom is reduced to less than 0.1 % of the size of the original discretized structure, then the model might not be able to capture consistently capture the eigenbehavior of the system for the whole range of frequencies of interest. nevertheless, this 413 s. nasisi, m. valášek, t. vampola acta polytechnica increase in ideality is manageable, and our paper has shown how to address the related difficulties by introducing of a new method of interpolation, named rsdm, with the aim of restoring and making available to the interpolation tool certain natural components belonging to the matrices of the full model that are related, on the one hand, to the process of reduction and, on the other hand, to the characteristics of a solid in the fe theory. this is a point that has been neglected in previously published works, yet it seems to be an indispensable step to adopt. cautious use of the svd method for achieving this purpose is effective for 2 reasons: first, it is numerically stable [12] and its algorithm is embedded in different computational software programs; second, the use of reduced matrices limits the computation time that it needs. when the proposed method is applied according to algorithm 2 the following performances have been observed: (1.) for the large-scale hexapod structure the error committed lies between 0.0006 % and less than 1.5 % for the first 5 frequencies despite the consistent, initial reduction. (2.) there is very accurate correlation, as shown by the mac number. engineering systems can be modeled by effective roms displaying various degrees of accuracy. on the basis of the results obtained here, it is evident that despite the increase in ideality attained by the rsdm, this degree of fidelity of rom is in some way undermined by a combination of the dramatic reduction of the original system and the large number of parameters. one challenge that must be pointed out here is the need to further optimize the interpolation tool, in order to deal ideally with opposing requirements of computation time and accuracy. this of course entails first investigating how the error depends on the number of finite element analyses, on the interpolation method, and on the size of the final rom. acknowledgements this work was supported by research grant p101/1339100s, mechatronic flexible joint. references [1] j.a. randall, the modal assurance criterion twenty years of use and abuse, journal of sound and vibration, (august 2003) 14-21. 
5. conclusions
the main challenge that had to be faced in improving the accuracy of parameterized large-scale systems was to optimize the direct matrix interpolation method, i.e. to minimize the number of detailed finite element analyses and to guarantee high accuracy, while introducing as few mathematical concepts as possible. in an attempt to achieve this outcome, the rsdm has been proposed as a promising interpolation method for obtaining accurate parametric roms. the application of this method relies on a carefully manipulated svd of the reduced stiffness and mass matrices, and on the use of only two finite element analyses. however, there are several reasons why this optimization may be compromised. first, if the number of the parameters and the design space are large, only two finite element analyses might not be sufficient. second, if the size of the rom is reduced to less than 0.1 % of the size of the original discretized structure, then the model might not be able to consistently capture the eigenbehavior of the system for the whole range of frequencies of interest. nevertheless, this increase in ideality is manageable, and our paper has shown how to address the related difficulties by introducing a new method of interpolation, named rsdm, with the aim of restoring and making available to the interpolation tool certain natural components belonging to the matrices of the full model that are related, on the one hand, to the process of reduction and, on the other hand, to the characteristics of a solid in fe theory. this is a point that has been neglected in previously published works, yet it seems to be an indispensable step to adopt. cautious use of the svd method for achieving this purpose is effective for 2 reasons: first, it is numerically stable [12] and its algorithm is embedded in various computational software programs; second, the use of reduced matrices limits the computation time that it needs. when the proposed method is applied according to algorithm 2, the following performance has been observed:
(1.) for the large-scale hexapod structure, the error committed lies between 0.0006 % and less than 1.5 % for the first 5 frequencies, despite the consistent initial reduction.
(2.) there is a very accurate correlation, as shown by the mac number.
engineering systems can be modeled by effective roms displaying various degrees of accuracy. on the basis of the results obtained here, it is evident that, despite the increase in ideality attained by the rsdm, this degree of fidelity of the rom is in some way undermined by a combination of the dramatic reduction of the original system and the large number of parameters. one challenge that must be pointed out here is the need to further optimize the interpolation tool, in order to deal ideally with the opposing requirements of computation time and accuracy. this of course entails first investigating how the error depends on the number of finite element analyses, on the interpolation method, and on the size of the final rom.

acknowledgements
this work was supported by research grant p101/1339100s, mechatronic flexible joint.

references
[1] j. a. randall: the modal assurance criterion – twenty years of use and abuse. journal of sound and vibration (august 2003) 14–21.
[2] r. w. freund: model reduction methods based on krylov subspaces. acta numerica 12 (may 2003) 267–319. doi:10.1017/cbo9780511550157.004.
[3] b. moore: principal component analysis in linear systems: controllability, observability, and model reduction. ieee transactions on automatic control 26 (february 1981) 17–32. doi:10.1109/tac.1981.1102568.
[4] j. r. li: reduction of large circuit models via low rank system gramians. computational fluid and solid mechanics (2001) 1601–1605. doi:10.1016/b978-008043944-0/50977-x.
[5] d. g. meyer, s. srinivasan: balancing and model reduction for second-order form linear systems. ieee transactions on automatic control 41 (november 1996) 1632–1644. doi:10.1109/9.544000.
[6] a. c. antoulas, d. c. sorensen, s. gugercin: a survey of model reduction methods for large-scale systems. structured matrices in mathematics, computer science, and engineering i (2001) 193–219. doi:10.1090/conm/280/04630c.
[7] a. beattie: krylov-based model reduction of second-order systems with proportional damping. proceedings of the 44th ieee conference on decision and control, and the european control conference (12–15 december 2005). doi:10.1109/cdc.2005.1582501.
[8] l. feng, e. b. rudnyi, j. g. korvink: parametric model reduction for fast simulation of cyclic voltammograms. american scientific publishers 4 (2006) 1–10. doi:10.1166/sl.2006.021.
[9] k. r. tze-mun leung: parametric model order reduction technique for design optimization. ieee international symposium on circuits and systems 2 (23–26 may 2005) 1290–1293. doi:10.1109/iscas.2005.1464831.
[10] b. lohmann, b. salimbahrami: structure preserving reduction of large second order models by moment matching. proc. appl. math. mech. 4(1) (2004) 572–573. doi:10.1002/pamm.200410317.
[11] a. f. bower: applied mechanics of solids. crc press, 2008.
[12] n. j. higham: computing the polar decomposition – with applications. siam journal on scientific and statistical computing 7 (october 1986) 1160–1174. doi:10.1137/0907079.
[13] g. a. holzapfel: nonlinear solid mechanics. a continuum approach for engineering. john wiley & sons ltd, 2000.
[14] g. strang: linear algebra and its applications. cengage learning, 4th edition, 2005.
[15] d. amsallem, c. farhat: an online method for interpolating linear parametric reduced-order models. siam j. sci. comput. 33 (2011) 2169–2198. doi:10.1137/100813051.
[16] b. lohmann, r. eid: efficient order reduction of parametric and nonlinear models by superposition of locally reduced models. methoden und anwendungen der regelungstechnik (2009), erlangen-münchener workshops.
[17] d. amsallem, j. cortial, k. carlberg, c. farhat: an online method for interpolating linear reduced-order structural dynamics models. proceedings of the 50th aiaa/asme/asce/ahs/asc structures, structural dynamics, and materials conference (2009). doi:10.2514/6.2009-2432.
[18] h. panzer, r. eid, b. lohmann: parametric model order reduction by matrix interpolation. automatisierungstechnik 58 (2010) 475–484. doi:10.1524/auto.2010.0863.
[19] z. bai, j. clark, j. demmel, k. pister, n. zhou: new numerical techniques and tools in sugar for 3d mems simulation. technical proceedings of the 2001 international conference on modeling and simulation of microsystems 1 (2001) 31–34.
acta polytechnica vol. 42 no. 2/2002

formation of haloforms during chlorination of natural waters
a. grünwald, b. šťastný, k. slavíčková, m. slavíček

abstract: recent drinking water regulations have lowered the standards for disinfection by-products and have added new disinfection by-products for regulation. natural organic matter (nom), mainly humic compounds, plays a major role in the formation of undesirable organic by-products following disinfection of drinking water. many disinfection by-products have adverse carcinogenic or mutagenic effects on human health. this paper deals with the formation potential of disinfection by-products in water samples taken from different places in the flaje catchment.

keywords: water, chlorination, disinfection by-product formation potential.

1 introduction
disinfection by-products (dbps) comprise several organic and inorganic compounds that are formed by reactions between chlorine, naturally occurring organic matter (nom) and bromide in drinking water [1]. the major halogenated dbps that are commonly identified from chlorine treatment are trihalomethanes (thms), haloacetic acids (haas), haloacetonitriles (hans), cyanogen halides, and halopicrins. some of the major species of these dbps are listed in tab. 1 [2].

chemical class             chemical compound
trihalomethanes (thm)      chloroform, bromodichloromethane, dibromochloromethane, bromoform
haloacetic acids (haas)    monochloroacetic acid (mcaa), dichloroacetic acid (dcaa), trichloroacetic acid (tcaa), monobromoacetic acid (mbaa), dibromoacetic acid (dbaa), tribromoacetic acid (tbaa), bromochloroacetic acid (bcaa), bromodichloroacetic acid (bdcaa), chlorodibromoacetic acid (cdbaa)
haloacetonitriles (hans)   dichloroacetonitrile, trichloroacetonitrile, dibromoacetonitrile, bromochloroacetonitrile
cyanogen halides           cyanogen chloride
tab. 1: chlorinated dbps

trihalomethanes (thms) and haloacetic acids (haas) are the two major classes of dbps commonly found in waters disinfected with chlorine. early studies have mainly focused on the formation of thms and haas. the levels of these compounds formed after chlorination of natural waters depend on several operational conditions, such as chlorine dosage and free chlorine contact time, as well as water quality conditions such as the natural organic matter (nom) content, bromide concentration, temperature and ph. the us environmental protection agency (usepa) has set a maximum contaminant level (mcl) of 100 µg·l⁻¹ for total thms and has set a new mcl of 80 µg·l⁻¹ in stage 1 of the disinfectants/disinfection by-products (d/dbp) rule (usepa 1998). in addition to these standards, an mcl for haas of 60 µg·l⁻¹ was proposed in the stage 1 rule. stage 2 of the d/dbp rule may lower the mcls for thms and haas to 40 µg·l⁻¹ and 30 µg·l⁻¹, respectively. hence, techniques to rapidly determine the problematic organic fractions within nom most responsible for dbp formation are important for the minimization of dbp formation in water treatment systems. the aggregate concentration of all halogenated dbps is sometimes characterised as the total organic halide (tox) concentration. to date, most dbp research has focused on thms and haas [3]. nom is considered to be the primary organic precursor to dbp formation, and it is present in nearly all natural waters. previous studies have shown the importance of many parameters for the formation of thms and haas, such as the dose of chlorine, the concentration of bromide and ammonia, ph, temperature, and the content and type of natural organic matter (nom) [4]. the nom of most source waters comprises humic substances (humic and fulvic acids), hydrophilic acids, carboxylic acids, amino acids, carbohydrates and hydrocarbons in the approximate proportions of 50, 30, 6, 3, 10 and 1 %, respectively [5]. however, in highly coloured waters, the humic substance content may be as high as 50 to 90 %.
the portion of nom that can be biodegraded is sometimes defined as biodegradable dissolved organic carbon (bdoc) or assimilable organic carbon (aoc), which are measured by two distinct techniques [6]. extensive research has been conducted to understand nom composition. much of this research has relied on the fractionation of natural waters into operationally defined discrete fractions based on adsorption chromatography employing synthetic resins. however, questions have been raised about such isolation and fractionation techniques because nom is cycled through large changes in ph (2 to 10), which may chemically alter its structure.

tab. 1: chlorinated dbps
  trihalomethanes (thms): chloroform, bromodichloromethane, dibromochloromethane, bromoform
  haloacetic acids (haas): monochloroacetic acid (mcaa), dichloroacetic acid (dcaa), trichloroacetic acid (tcaa), monobromoacetic acid (mbaa), dibromoacetic acid (dbaa), tribromoacetic acid (tbaa), bromochloroacetic acid (bcaa), bromodichloroacetic acid (bdcaa), chlorodibromoacetic acid (cdbaa)
  haloacetonitriles (hans): dichloroacetonitrile, trichloroacetonitrile, dibromoacetonitrile, bromochloroacetonitrile
  cyanogen halides: cyanogen chloride

nom can be characterised by non-specific parameters; important examples include organic carbon content (i.e. dissolved organic carbon, doc) and uv absorbance in the range of 254 to 280 nm (uv254–280). among all the different parameters for characterizing the nom of a given water, uv254 and the specific ultraviolet absorbance (suvaλ = uvλ/doc) at a particular wavelength λ have often correlated well with dbp formation [7]. a number of studies have used linear regression techniques to correlate thm formation potential (thmfp) with toc and uv254 [8, 9]. thmfp is the difference between the final thm concentration and the initial thm concentration in a sample at standard reaction conditions. the standard reaction conditions are as follows: a free chlorine residual of at least 3 mg·l−1 and not more than 5 mg·l−1 at the end of a 7-day reaction (incubation) period, with a sample incubation temperature of 25 °c and ph controlled at 7.0 ± 0.2 with a phosphate buffer [10]. this paper discusses dbp formation potential in surface waters of the flaje catchment and its relationship to the properties of the humic substances in these waters. the focus of this discussion is on reactions with chlorine and on the formation of halogenated dbps – thms and haas.
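the two quantities just defined are simple arithmetic on measured values; the following minimal python sketch (ours, not from the paper; the function names are hypothetical) shows the computation, including the cm−1 to m−1 unit conversion conventionally used for suva.

```python
def suva254(a254_per_cm: float, doc_mg_per_l: float) -> float:
    """specific uv absorbance in l/(mg*m): absorbance at 254 nm (1/cm)
    divided by doc (mg/l); the factor 100 converts cm^-1 to m^-1."""
    return 100.0 * a254_per_cm / doc_mg_per_l

def thmfp_per_doc(thm_final: float, thm_initial: float, doc: float) -> float:
    """thmfp normalized per doc (mg thm per mg doc): difference between
    final and initial thm concentration after the standard 7-day
    incubation, divided by the sample's doc (all in mg/l)."""
    return (thm_final - thm_initial) / doc

# example with the first rašeliník sample from tab. 4 (doc = 15.9 mg/l, a254 = 0.685 1/cm):
print(suva254(0.685, 15.9))   # ~4.31 l/(mg*m), typical of humic-rich water
```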
2 samples
five sampling points in the flaje catchment were chosen: rašeliník stream (no. 1), flaje water reservoir (no. 2), radní stream (no. 3), flájský stream (no. 4) and mackovský stream (no. 5). the locations of the sampling points are given in fig. 1.

fig. 1: flaje catchment and the locations of the sampling points

3 materials and methods
thm and haa formation potential (thmfp and haafp) tests were conducted in accordance with czech method tnv 75 7549 [10], which corresponds to method 5710 given in the standard methods for the examination of water and wastewater [11]. thms were determined by head-space solid phase microextraction, using a carboxen-coated fiber (supelco). haas were analysed by liquid–liquid extraction from an acidified sample into methyl t-butyl ether (mtbe) after esterification by boron trifluoride. gas chromatography – mass spectrometry (gc 8000/md 800 fisons) was used as the final analytical technique. other parameters, such as toc and a254, were tested in all samples. the results for thmfp and haafp measured in the given samples are summarized in tab. 2. table 3 shows the average, minimum and maximum values of the measured parameters. the results are compared in fig. 2 and fig. 3. the values of other measured parameters are given in tab. 4.

tab. 2: results of thmfp and haafp tests – flaje catchment, sampling period september – december 2001 (one row per monthly sampling)
sampling place            1      2      3      4      5
thmfp (mg·mg−1 doc)   0.061  0.107  0.024  0.120  0.107
                       0.139  0.159  0.070  0.146  0.055
                       0.104  0.106  0.069  0.044  0.106
                       0.258  0.164  0.191  0.170  0.109
haafp (mg·mg−1 doc)   0.009  0.030  0.024  0.017  0.009
                       0.211  0.099  0.105  0.208  0.201
                       0.108  0.114  0.147  0.076  0.128
                       0.134  0.113  0.057  0.103  0.112

tab. 3: evaluation of thmfp and haafp tests in surface water from the flaje catchment
no  sampling place       thmfp (mg·mg−1 doc)        haafp (mg·mg−1 doc)
                         min.   max.   mean          min.   max.   mean
1   rašeliník            0.061  0.258  0.140         0.009  0.211  0.033
2   reservoir surface    0.106  0.164  0.134         0.030  0.114  0.089
3   radní stream         0.024  0.191  0.088         0.024  0.147  0.083
4   flájský stream       0.044  0.170  0.120         0.017  0.208  0.101

as can be seen from fig. 2, the maximum thm formation potential was found in the rašeliník stream (≈ 0.140 mg·mg−1 doc), while the minimum was in the water from radní stream (≈ 0.088 mg·mg−1 doc). the maximum formation potential of haa was in water from flájský stream (≈ 0.101 mg·mg−1 doc), whereas the minimum was in water from radní stream (≈ 0.083 mg·mg−1 doc). the seasonal evolution of thmfp in the water under study was also observed: the results of the tests measured in december were the highest, and the overall trend advanced from september to december 2001. this was not the case for haafp. it should be noted that these conclusions need to be verified by analysing further samples. it should also be noted that the average values of thmfp in the given waters were higher than those in the data presented for surface water by fox et al. [7].
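the summary statistics of tab. 3 follow directly from the monthly values in tab. 2; a small python sketch (ours, not from the paper) reproducing them:

```python
# thmfp per sampling place, four monthly values each, copied from tab. 2
thmfp = {
    "rašeliník":         [0.061, 0.139, 0.104, 0.258],
    "reservoir surface": [0.107, 0.159, 0.106, 0.164],
    "radní stream":      [0.024, 0.070, 0.069, 0.191],
    "flájský stream":    [0.120, 0.146, 0.044, 0.170],
    "mackovský stream":  [0.107, 0.055, 0.106, 0.109],
}

for place, values in thmfp.items():
    mean = sum(values) / len(values)
    print(f"{place:18s} min={min(values):.3f} max={max(values):.3f} mean={mean:.3f}")
# e.g. rašeliník: min=0.061 max=0.258 mean=0.141 (tab. 3 rounds this to 0.140),
# radní stream: mean=0.089 (tab. 3 gives 0.088)
```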
4 summary and conclusions
this paper discusses the use of thmfp and haafp as a predictive tool for disinfection by-product formation due to the presence of natural organic matter (nom), mainly humic and fulvic acids, in water. the limited number of samples means that the conclusions, and the use of thmfp and haafp as parameters of interest, should be regarded with caution.

acknowledgements
this work was partly supported by gačr no. 103/02/0243, tu dresden project no. 790111440, nazv no. 790111440, cez: j04/98:211100002, and was carried out with the assistance of prof. v. janda & associates from the chemical technical university of prague, whose collaboration is gratefully acknowledged.

fig. 2: average values of thmfp in the water under study (bar chart, mg/mg doc; sampling places 1–5: rašeliník, nádrž – hladina, radní potok, flájský potok, mackovský potok)
fig. 3: average values of haafp in the water under study (bar chart, mg/mg doc; same sampling places)

tab. 4: content of organic compounds in water samples from the flaje catchment
sample no.  parameter        place 1   place 2   place 3   place 4   place 5
1           doc (mg·l−1)     15.9      6.55      8.15      9.33      9.53
            codmn (mg·l−1)   22.7      8.00     10.70     11.00     10.7
            a254 (cm−1)       0.685     0.25      0.327     0.385     0.376
2           doc (mg·l−1)      6.55     3.51      6.00      7.20     10.91
            codmn (mg·l−1)   14.40     8.00      4.60      9.30      7.00
            a254 (cm−1)       0.483     0.278     0.131     0.327     0.245
3           doc (mg·l−1)     10.69     6.29     10.23     16.55      5.73
            codmn (mg·l−1)   14.00    10.00     15.50     22.00      8.90
            a254 (cm−1)       0.445     0.263     0.446     0.748     0.231
4           doc (mg·l−1)      4.62     6.79      2.07      3.92      4.55
            codmn (mg·l−1)    5.60     8.60      1.90      4.20      4.70
            a254 (cm−1)       0.185     0.269     0.061     0.140     0.146

references
[1] minear, r. a., amy, g. l.: disinfection by-products in water treatment. the chemistry of their formation and control. lewis publishers, 1996.
[2] marhaba, t. f., kochar, i. h.: rapid prediction of disinfection by-product formation potential by fluorescence. environ. engg. and policy, 2000, no. 2, p. 29–36.
[3] singer, p. c.: humic substances as precursors for potentially harmful disinfection by-products. water sci. technol., 1999, vol. 40, no. 9, p. 25–30.
[4] trussell, r. r., umphres, m. d.: the formation of trihalomethanes. journal awwa, 1978, vol. 70, no. 11, p. 604–612.
[5] kitis, m., karanfil, t., kilduff, j. e., wigton, a.: the reactivity of natural organic matter to disinfection by-products formation and its relation to specific ultraviolet absorbance. water sci. technol., 2001, vol. 43, no. 2, p. 9–16.
[6] huck, p. m.: measurement of biodegradable organic matter and bacterial growth potential in drinking water. journal awwa, 1990, vol. 82, no. 7, p. 78.
[7] korshin, g. v., benjamin, m. m., sletten, r. s.: adsorption of natural organic matter (nom) on iron oxide: effects on nom composition and formation of organo-halide compounds during chlorination. water res., 1997, vol. 31, no. 7, p. 1643–1650.
[8] batchelor, b., fusilier, g., murray, e. h.: developing haloform formation potential test. j. amer. water works assoc., 1987, vol. 79, no. 1, p. 50–56.
[9] singer, p. c., chang, s. d.: correlation between trihalomethanes and total organic halides formed during water treatment. j. amer. water works assoc., 1989, vol. 81, no. 8, p. 61–67.
[10] tnv 75 7549. jakost vod – stanovení potenciálu trihalomethanů (pthm) za normalizovaných podmínek jejich vzniku. [water quality – determination of trihalomethane formation potential (pthm) under standardized conditions of formation.] mžp, srpen 2001.
[11] standard methods for the examination of water and wastewater. 20th edition. washington (usa): american public health assoc., 1998, p. 5-55–5-61.

prof. ing. alexander grünwald, csc.
phone: +420 2 2435 4638, fax: +420 2 2435 4607, e-mail: grunwald@fsv.cvut.cz
ing. bohumil šťastný
phone: +420 2 2435 4403, fax: +420 2 2435 4607, e-mail: stastny@fsv.cvut.cz
ing. kateřina slavíčková
phone: +420 2 2435 4608, fax: +420 2 2435 4607, e-mail: slavickova@fsv.cvut.cz
ing. marek slavíček
phone: +420 2 2435 4608, fax: +420 2 2435 4607, e-mail: slavicek@fsv.cvut.cz
department of sanitary engineering, czech technical university in prague, faculty of civil engineering, thákurova 7, praha 6, czech republic

acta polytechnica 56(1):1–9, 2016, doi:10.14311/app.2016.56.0001
formation control of multiple unicycle-type robots using lie group
youwei dong∗, ahmed rahmani
ecole centrale de lille, cité scientifique – cs20048, 59651 villeneuve d'ascq cedex, france
∗ corresponding author: youwei.dong@ec-lille.fr

abstract. in this paper the formation control of a multi-robot system is investigated. the proposed control law, based on lie group theory, is applied to control the formation of a group of unicycle-type robots. the communication topology is supposed to be a rooted directed acyclic graph and fixed. some numerical simulations using matlab are made to validate our results.
keywords: formation control; multi-robot system; unicycle-type robot; lie group; lie algebra.

1. introduction
the various ways to control and coordinate a group of mobile robots have been widely studied in recent years. this has brought a breadth of innovation, and the field has attracted considerable attention for its potential applications, such as flocking control, surveillance, search and rescue, cooperative construction, and distributed sensor fusion. when comparing the mission outcome of a multi-robot system (mrs) to that of a single robot, it is clear that cooperation among multiple robots makes it possible to perform complex tasks that would otherwise be impossible for a single powerful robot to accomplish. the fundamental idea behind multi-robotics is to allow the individuals to interact with each other to find solutions to complex problems: each robot senses the relative positions of its neighbors and achieves the desired formation by controlling these relative positions [1–3]. in formation control, various control topologies can be adopted, depending on the specific environment and tasks. theoretical views of mrs behavior are divided into centralized and decentralized systems. in a centralized system, a powerful core unit makes decisions and communicates with the others; in the decentralized approach, the robots communicate and share information with each other [4]. we will focus on distributed system control due to its advantages, such as feasibility, accuracy, robustness and cost. many studies have been devoted to the control and coordination of multi-agent systems and multi-robot systems (e.g. [1, 4–9]), and some of these results have been used to control vehicles (holonomic and nonholonomic mobile robots, etc.). in this paper, our goal is to control a group of unicycle robots to achieve a desired formation. motivated by references [10–14], we focus on a rigid body with kinematics evolving on lie groups. this is based on regarding the set of rigid body postures as the lie group se(2), which leads to a set of kinematic equations expressed in terms of standard coordinate-invariant linear operators on the lie algebra se(2). this approach allows a global description of rigid body motion which does not suffer from singularities, and it provides a geometric description of rigid motion which greatly simplifies the analysis of the mechanisms [10].
paper [1] proposed an elegant control law based on lie algebra theory for the consensus of a multi-agent system; it assumes holonomic constraints, while nonholonomic constraints are not considered. in [12], lie algebra is used to study the path-following control of one mobile robot. in [15], distributed formation control of multiple nonholonomic robots is studied; however, the control law is a leader-follower approach, and the multi-leader case is not considered. in this paper, a lie group method is used to control multiple unicycle-type robots. the communication topology is defined as a rooted directed acyclic graph (dag). due to the nonholonomic property of this type of robot, a new local control law is proposed to make the nonlinear system converge to the desired formation. the outline of this paper is as follows. in section 2, some preliminary results are summarized and the formation control problem for a group of unicycle-type robots is stated. in section 3, a formation control strategy is proposed and the stability is analyzed. the simulation and results are given in section 4. concluding remarks are finally provided in section 5.

2. preliminary and problem statement
2.1. lie groups
definition 1 [10]. a manifold of dimension $n$ is a set $M$ which is locally homeomorphic to $\mathbb{R}^n$. a lie group is a group $G$ which is also a smooth manifold and for which the group operations $(g,h) \mapsto gh$ and $g \mapsto g^{-1}$ are smooth. the left action of $G$ on itself, $L_g : G \to G$, is defined by $L_g(h) = gh$, and the right action is defined the same way. the adjoint action $\mathrm{Ad}_g : G \to G$ is $\mathrm{Ad}_g(h) = ghg^{-1}$. two examples are the special orthogonal group $SO(n) = \{R \in GL(n,\mathbb{R}) : RR^T = I, \det R = +1\}$ and the special euclidean group $SE(n) = \{(p,R) : p \in \mathbb{R}^n, R \in SO(n)\} = \mathbb{R}^n \times SO(n)$.

2.2. lie algebra associated with a lie group
a lie algebra $\mathfrak{g}$ over $\mathbb{R}$ is a real vector space $\mathfrak{g}$ together with a bilinear operator $[\cdot,\cdot] : \mathfrak{g} \times \mathfrak{g} \to \mathfrak{g}$ (called the bracket) such that for all $x, y, z \in \mathfrak{g}$ we have:
• anti-commutativity: $[x,y] = -[y,x]$;
• jacobi identity: $[[x,y],z] + [[y,z],x] + [[z,x],y] = 0$.
a lie algebra $\mathfrak{g}$ is said to be commutative (or abelian) if $[x,y] = 0$ for all $x, y \in \mathfrak{g}$. if we define $\mathrm{ad}_a b = [a,b] = ab - ba$ for $a, b \in \mathfrak{gl}(n,\mathbb{R})$, the vector space of all $n \times n$ real matrices, then $\mathfrak{gl}(n,\mathbb{R})$ forms a lie algebra; clearly $[x,x] = 0$. the lie algebra of $SO(2)$, denoted by $\mathfrak{so}(2)$, may be identified with the $2\times 2$ skew-symmetric matrices of the form $\hat\omega = \begin{bmatrix} 0 & -\omega \\ \omega & 0 \end{bmatrix}$ with the bracket structure $[\hat\omega_1, \hat\omega_2] = \hat\omega_1\hat\omega_2 - \hat\omega_2\hat\omega_1$, where $\hat\omega_1, \hat\omega_2 \in \mathfrak{so}(2)$. the lie algebra of $SE(2)$, denoted by $\mathfrak{se}(2)$, can be identified with the $3 \times 3$ matrices of the form $\hat\xi = \begin{bmatrix} \hat\omega & v \\ 0 & 0 \end{bmatrix}$, where $\omega \in \mathbb{R}$, $v \in \mathbb{R}^2$, with the bracket $[\hat\xi_1, \hat\xi_2] = \hat\xi_1\hat\xi_2 - \hat\xi_2\hat\xi_1$. the exponential map $\exp : T_eG \to G$ is a local diffeomorphism from a neighborhood of zero in $\mathfrak{g}$ onto a neighborhood of $e$ in $G$. the mapping $t \mapsto \exp(t\hat\xi)$ is the unique one-parameter subgroup $\mathbb{R} \to G$ with tangent vector $\hat\xi$ at time 0. for $\hat\omega \in \mathfrak{so}(2)$ and $\hat\xi = (\hat\omega, v) \in \mathfrak{se}(2)$ we have
$$\exp \hat\omega t = \begin{bmatrix} \cos\omega t & -\sin\omega t \\ \sin\omega t & \cos\omega t \end{bmatrix}, \qquad (1)$$
$$\exp \hat\xi t = \begin{bmatrix} \exp\hat\omega t & A(\omega)v \\ 0 & 1 \end{bmatrix}, \qquad (2)$$
where
$$A(\omega) = \frac{1}{\omega}\begin{bmatrix} \sin\omega t & -(1-\cos\omega t) \\ 1-\cos\omega t & \sin\omega t \end{bmatrix}.$$
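the closed form (1)–(2) is straightforward to implement; below is a minimal numpy sketch (ours, not from the paper), with the $\omega \to 0$ limit handled separately since $A(\omega)$ has a removable singularity there.

```python
import numpy as np

def exp_se2(omega: float, v: np.ndarray, t: float = 1.0) -> np.ndarray:
    """closed-form exponential of xi_hat = [[0,-w,vx],[w,0,vy],[0,0,0]],
    following equations (1)-(2); handles w -> 0 (pure translation)."""
    wt = omega * t
    R = np.array([[np.cos(wt), -np.sin(wt)],
                  [np.sin(wt),  np.cos(wt)]])
    if abs(omega) < 1e-12:
        A = t * np.eye(2)          # lim_{w->0} A(w) = t * identity
    else:
        A = (1.0 / omega) * np.array([[np.sin(wt), -(1 - np.cos(wt))],
                                      [1 - np.cos(wt), np.sin(wt)]])
    g = np.eye(3)
    g[:2, :2] = R
    g[:2, 2] = A @ v
    return g
```

one can cross-check this against a generic matrix exponential such as scipy.linalg.expm applied to $\hat\xi t$.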
2.3. graph theory
the communication topology among $n$ robots will be represented by a graph. let $G = (V, E, A)$ be a graph of order $n$ with the finite nonempty set of nodes $V(G) = \{v_1, \ldots, v_n\}$, the set of edges $E(G) \subset V \times V$, and an adjacency matrix $A = (a_{ij})_{n \times n}$. if for all $(v_i, v_j) \in E$ we also have $(v_j, v_i) \in E$, the graph is said to be undirected; otherwise it is called directed. here, each node $v_i$ in $V$ corresponds to robot $i$, and each edge $(v_i, v_j) \in E$ in a directed graph corresponds to an information link from robot $i$ to robot $j$, which means that robot $j$ can receive information from robot $i$. in contrast, the pairs of nodes in an undirected graph are unordered, and an edge $(v_i, v_j) \in E$ denotes that each robot can communicate with the other one. the adjacency matrix $A$ of a digraph $G$ is the $n \times n$ matrix whose entry $a_{ij}$ is the weight of the link $(v_i, v_j)$, with $a_{ii} = 0$ for any $v_i \in V$, $a_{ij} > 0$ if $(v_i, v_j) \in E$ and $a_{ij} = 0$ otherwise. the adjacency matrix of a weighted undirected graph is defined by analogy, except that $a_{ij} = a_{ji}$ for all $i \neq j$ [16]. a directed path from node $v_i$ to $v_j$ is a sequence of edges $(v_i, v_{j_1}), (v_{j_1}, v_{j_2}), \ldots, (v_{j_l}, v_j)$ in a directed graph $G$ with distinct nodes $v_{j_k}$, $k = 1, \ldots, l$. a directed graph is called acyclic if it contains no directed cycle. a rooted graph is a graph in which one vertex is distinguished as the root.

2.4. problem statement
a unicycle-type mobile robot is composed of two independently actuated wheels on a common axle which is rigidly linked to the robot chassis. in addition, there are one or several passive wheels (for example a caster, swedish or spherical wheel) which are not controlled and just serve for sustentation purposes [17]. we study the formation control problem for a group of such robots, each of which is equipped with a local controller for deciding the velocities. we consider each robot as a node of a directed graph $G$; the communication topology of a group of $n$ robots can then be expressed by an adjacency matrix $A = (a_{ij})_{n \times n}$, where $a_{ii} = 0$ and $a_{ij} = 1$ if $(v_i, v_j) \in E$, and $a_{ij} = 0$ otherwise. the purpose is to design the control strategy applied to each robot so that this group of mobile robots can execute a predefined formation control task.

2.5. kinematic model on a lie group
in order to describe the kinematic properties of the unicycle-type robot, we consider a reference point $o_r$ at the mid-distance of the two actuated wheels. we then define two frames, $f_i = \{o, x, y\}$ and $f_r = \{o_r, x_r, y_r\}$, as shown in figure 1. $f_i$ is an arbitrary inertial basis on the plane used as the global reference frame, and $f_r$ is a frame attached to the mobile robot with its origin located at $o_r$; the basis $\{x_r, y_r\}$ defines two axes relative to $o_r$ on the robot chassis and is thus the robot's local reference frame. the position of the robot in the global reference frame is specified by coordinates $x_i$ and $y_i$, and the angular difference between the global and local reference frames is given by $\theta_i$. the pose of the robot can then be described as an element of the lie group $SE(2)$:
$$g = \begin{bmatrix} R & p \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} \cos\theta_i & -\sin\theta_i & x_i \\ \sin\theta_i & \cos\theta_i & y_i \\ 0 & 0 & 1 \end{bmatrix},$$
where $p = [x_i, y_i]^T$ denotes the position of the robot in the global reference frame, and $R = \begin{bmatrix} \cos\theta_i & -\sin\theta_i \\ \sin\theta_i & \cos\theta_i \end{bmatrix}$ is the rotation matrix of the frame $f_r$ relative to the frame $f_i$. the motion of a robot can then be described by $g(t)$, a curve parameterized by time $t$ in $SE(2)$.

figure 1: representation of the frames (global axes $x, y$; robot axes $x_r, y_r$; heading $\theta$; forward velocity $u$; wheel angular velocities $\omega_1, \omega_2$).

each pure rotational motion of a robot on a plane can be given by a $2 \times 2$ orthogonal matrix $R \in SO(2)$.
let $\omega \in \mathbb{R}$ be the rotation velocity of the robot's chassis; then the exponential map $\exp : \mathfrak{so}(2) \to SO(2)$, $\hat\omega \mapsto \exp(\hat\omega t)$, defined by equation (1), where $\hat\omega = \begin{bmatrix} 0 & -\omega \\ \omega & 0 \end{bmatrix} \in \mathfrak{so}(2)$, corresponds to the robot chassis rotation. this map represents the rotation from the initial ($t = 0$) configuration of the robot to its final configuration with rotation velocity $\omega$. rigid motions consist of rotation and translation. a general motion can also be described by an exponential map $\exp : \mathfrak{se}(2) \to SE(2)$, $\hat\xi \mapsto \exp(\hat\xi t)$, defined in equation (2), where $\hat\xi = \begin{bmatrix} \hat\omega & v \\ 0 & 0 \end{bmatrix} \in \mathfrak{se}(2)$ represents the velocities of the movement, and
$$v = -\hat\omega p + \dot p = \begin{bmatrix} \omega y_i + \dot x_i \\ -\omega x_i + \dot y_i \end{bmatrix},$$
where $(x_i, y_i)$ is the position of the robot and $v$ represents the velocity of a (possibly imaginary) point on the rigid body which is moving through the origin of the world frame. $\exp(\hat\xi t)$ is a mapping from the initial configuration of the robot to its final configuration: if the initial configuration of the robot is $g(0)$, then the final configuration is given by
$$g(t) = e^{\hat\xi t} g(0). \qquad (3)$$
the kinematic model of the unicycle-type robot is given by
$$\dot x_i = u\cos\theta_i, \qquad \dot y_i = u\sin\theta_i, \qquad \dot\theta_i = \omega,$$
where $u$ characterizes the robot's longitudinal velocity. the variables $u$ and $\omega$ are related to the angular velocities of the actuated wheels via the one-to-one transformation
$$\begin{bmatrix} u \\ \omega \end{bmatrix} = \begin{bmatrix} r_w/2 & r_w/2 \\ r_w/l & -r_w/l \end{bmatrix}\begin{bmatrix} \omega_1 \\ \omega_2 \end{bmatrix}, \qquad (4)$$
where $r_w$ is the wheels' radius, $l$ is the distance between the two actuated wheels, and $\omega_1$ and $\omega_2$ are respectively the angular velocities of the right and the left wheel. differentiating the matrix given in equation (3), we obtain the kinematic model of a unicycle-type robot on the lie group:
$$\dot g(t) = \hat\xi g(t), \qquad (5)$$
where $\hat\xi$ is the control input matrix given by
$$\hat\xi = \begin{bmatrix} 0 & -\omega & \omega y_i + u\cos\theta_i \\ \omega & 0 & -\omega x_i + u\sin\theta_i \\ 0 & 0 & 0 \end{bmatrix}. \qquad (6)$$
this is the kinematic model on the lie group for a unicycle-type robot. for one robot with a certain pose $(x_i, y_i, \theta_i)$, a control vector $(u, \omega)$ results in a unique control input matrix $\hat\xi$ to update the robot's motion.
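a short sketch (ours, not the paper's code) of how a control pair $(u, \omega)$ advances the pose over one sampling interval; it reuses the hypothetical exp_se2 helper from the previous snippet.

```python
import numpy as np

def pose_update(g: np.ndarray, u: float, omega: float, dt: float) -> np.ndarray:
    """advance the pose by equation (3) for constant (u, omega) over dt.
    written in the body frame, where the twist is (omega, v) with v = (u, 0);
    for constant inputs this equals the spatial form expm(xi_hat*dt) @ g with
    xi_hat from equation (6), since the two twists are related by Ad_g."""
    return g @ exp_se2(omega, np.array([u, 0.0]), dt)

# demo: drive straight at u = 1 m/s for 1 s from the origin pose
g = np.eye(3)
for _ in range(10):
    g = pose_update(g, u=1.0, omega=0.0, dt=0.1)
print(g[:2, 2])   # -> [1. 0.]
```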
3. formation control law on se(2)
3.1. controller design
we consider $n$ unicycle-type mobile robots, and use $g_i \in SE(2)$ and $\bar g_i \in SE(2)$ ($i = 1, \ldots, n$) to denote respectively the current and the desired configuration of each robot. in fact, $g_i$ is the representation of the robot frame $f_r$ shown in figure 1 relative to the spatial frame $f_i$. as introduced in the previous section, the evolution of the system $g_i$ can be expressed by
$$\dot g_i = \hat\xi_i g_i, \qquad (7)$$
where $\hat\xi_i \in \mathfrak{se}(2)$ is the control input matrix. let $g_{ij}$ be the configuration of the robot-$j$ frame relative to the robot-$i$ frame; then
$$g_j = g_i g_{ij}, \qquad (8)$$
thus $g_{ij} = g_i^{-1} g_j$. we use $\bar g_{ij}$ to represent the desired configuration of the robot-$j$ frame in the robot-$i$ frame. the robots achieve a desired formation if their configurations satisfy, for any $k = 1, \ldots, n$, $k \neq i$,
$$\lim_{t\to\infty} g_k^{-1} g_i = \bar g_{ki}, \quad i = 1, \ldots, n. \qquad (9)$$
$\bar g_{ki} \in SE(2)$ is defined according to the task requirements and is often used to identify the geometric configuration of the formation. we study the movement of $g_i$ relative to $g_j$, so here we can provisionally consider $g_j = \bar g_j$; then $\bar g_{ij}$ can be written as $\bar g_i^{-1} g_j$. thus we have
$$\bar g_i^{-1} g_j = \bar g_i^{-1}\bar g_j = \bar g_i^{-1}(\bar g_k\bar g_k^{-1})\bar g_j = (\bar g_k^{-1}\bar g_i)^{-1}(\bar g_k^{-1}\bar g_j) = (\bar g_{ki})^{-1}\bar g_{kj},$$
which gives $\bar g_i = g_j(\bar g_{kj})^{-1}\bar g_{ki}$. then, for robot $i$ (in the local frame $g_i$), the transformation needed by robot $i$ from the current configuration to the desired configuration, while considering the current configuration of robot $j$, is
$$\tilde g_{i\_j} = g_i^{-1} g_j (\bar g_{kj})^{-1}\bar g_{ki}. \qquad (10)$$
to simplify the notation we write $\tilde g_{ij}$ instead of $\tilde g_{i\_j}$. in [1], with $\tilde x_{ij} = \log\tilde g_{ij}$, a control law is proposed for agents with holonomic constraints:
$$\hat\xi_i = \frac{c}{a_i}\sum_{j=1}^{n} a_{ij}\tilde x_{ij}, \quad i = 1, \ldots, n,$$
where $a_{ij}$ is the element of the adjacency matrix $A$ and $a_i = \sum_{j=1}^n a_{ij}$. however, in our mrs, nonholonomic constraints are associated with the unicycle-type robots, so we develop a new nonlinear control approach. from the matrix $\tilde g_{ij}$ we can read off the position and orientation errors $\tilde x_{ij}, \tilde y_{ij}, \tilde\theta_{ij}$. we denote by $\tilde g_{ii}$ the relative configuration of $\bar g_i$ with respect to the robot frame $g_i$; it is obtained by the mean function
$$m : \underbrace{SE(2)\times\cdots\times SE(2)}_{n-1} \to SE(2), \quad (\tilde g_{i1}, \ldots, \tilde g_{i,i-1}, \tilde g_{i,i+1}, \ldots, \tilde g_{in}) \mapsto \tilde g_{ii},$$
which takes the weighted arithmetic mean of all the arguments: writing
$$\tilde g_{ij} = \begin{bmatrix} \cos\tilde\theta_{ij} & -\sin\tilde\theta_{ij} & \tilde x_{ij} \\ \sin\tilde\theta_{ij} & \cos\tilde\theta_{ij} & \tilde y_{ij} \\ 0 & 0 & 1 \end{bmatrix}, \quad j = 1, \ldots, n,\; j \neq i,$$
the entries $\tilde x_{ii}, \tilde y_{ii}, \tilde\theta_{ii}$ are given by
$$\Delta_{ii} = \frac{1}{a_i}\sum_{j=1}^{n} a_{ij}\Delta_{ij}, \quad j \neq i, \quad \Delta = \tilde x, \tilde y, \tilde\theta, \qquad (11)$$
where $a_{ij}$ is the element of the adjacency matrix $A$ and $a_i = \sum_{j=1}^n a_{ij}$. once $\tilde x_{ii}, \tilde y_{ii}, \tilde\theta_{ii}$ are obtained, $\tilde g_{ii}$ is rewritten as
$$\tilde g_{ii} = \begin{bmatrix} \cos\tilde\theta_{ii} & -\sin\tilde\theta_{ii} & \tilde x_{ii} \\ \sin\tilde\theta_{ii} & \cos\tilde\theta_{ii} & \tilde y_{ii} \\ 0 & 0 & 1 \end{bmatrix}.$$
we take the inverse matrix $\tilde g_{ii}^{-1}$, which represents the relative configuration of $g_i$ with respect to the desired configuration $\bar g_i$ when the predefined communication topology is considered. let us consider figure 2, where the unknowns are annotated in the list of symbols after the article. $o'x'y'$ is the frame of the desired configuration of robot $i$, and $(a, \theta)$, related to $\tilde g_{ii}^{-1}$, is the current pose of robot $i$ in the frame $o'x'y'$. in this frame, we assume a circle of radius $|r|$, denoted by $c_b$, and we propose a control law that drives the robot to the origin with the help of this circle. the absolute value $|r|$ is always positive, and it is chosen appropriately according to the initial conditions. $r$ is signed: when the robot is located in the lower half-plane, $r = -|r|$, and thus the angle $\alpha$ is also negative.

figure 2: geometrical relations between the robot's actual configuration and the desired configuration (circle $c_b$ with centre $b$ on the $y'$-axis, distances $l$ and $d$, angles $\theta, \varphi, \bar\beta, \alpha, \psi$).

the coordinate $r$ is determined according to the following rules:
$$\begin{cases} r \text{ is chosen arbitrarily without changing sign} & \text{if } |r| \leq l/2, \\ r = \dfrac{-y + \mathrm{sgn}\,y\,\sqrt{4y^2 + 3x^2}}{3} & \text{if } |r| > l/2, \end{cases} \qquad (12)$$
where the function sgn is defined as $\mathrm{sgn}\,x = 1$ for $x \geq 0$ and $-1$ for $x < 0$. we denote $\beta = \arcsin\sin\bar\beta$; then the local control law is proposed as follows:
$$u = -\mathrm{sgn}(\cos\bar\beta)\,\lambda l, \qquad \omega = \frac{u}{|r|}(\beta - \alpha), \qquad (13)$$
where $\lambda$ is a positive constant. from the proposed law we obtain $u_i$ and $\omega_i$, and the control input matrix of robot $i$ is then given by equation (6).
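a compact per-robot sketch (ours; the geometric conventions, e.g. the sign of $\varphi$, are our reading of figure 2 and the list of symbols, so treat them as assumptions) of the law (12)–(13):

```python
import numpy as np

def local_control(x, y, theta, r_abs, lam=1.0):
    """control law (12)-(13) for one robot whose pose (x, y, theta) is
    expressed in the frame o'x'y' of its desired configuration (i.e., read
    off from g_ii^{-1}).  r_abs > 0 is the nominal circle radius, lam > 0 the gain."""
    r = r_abs if y >= 0 else -r_abs            # signed radius, circle centre b = (0, r)
    l = np.hypot(x, y - r)                      # distance from robot to centre b
    if abs(r) > l / 2:                          # rule (12): resize the circle
        r = (-y + np.sign(y) * np.sqrt(4*y**2 + 3*x**2)) / 3.0
        l = np.hypot(x, y - r)
    phi = np.arctan2(y - r, x)                  # angle of the vector b -> a w.r.t. x'
    beta_bar = np.arctan2(np.sin(theta - phi), np.cos(theta - phi))
    beta = np.arcsin(np.sin(beta_bar))
    alpha = np.arcsin(r / l) if l > 1e-9 else 0.0
    u = -np.sign(np.cos(beta_bar)) * lam * l    # longitudinal velocity, eq. (13)
    omega = u / abs(r) * (beta - alpha)         # angular velocity, eq. (13)
    return u, omega
```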
3.2. stability analysis
from the previous section we know that $\tilde g_{ii}$ is the representation of $\bar g_i$ in the frame $g_i$, while its inverse $\tilde g_{ii}^{-1}$ is the representation of $g_i$ in the frame $\bar g_i$. to establish the convergence of $g_i$ to $\bar g_i$, we just need to prove that $\tilde g_{ii}^{-1}$ converges to the origin, i.e., to the identity matrix $I$. to prove this, with the help of the notation depicted in figure 2, we divide the movement of each robot into three phases.

phase 1. $l \geq 2|r|$, $\beta - \alpha \neq 0$.

lemma 1. if we choose a convenient $r$ which satisfies $l \geq 2|r|$, then the angle between the direction of movement and one tangent of the circle $c_b$ converges to 0; that is, $\delta = |\beta - \alpha| \to 0$.

proof. if $r > 0$, we have $\delta = |\beta-\alpha| = \sqrt{(\beta-\alpha)^2}$, so
$$\dot\delta = \frac{\beta-\alpha}{\sqrt{(\beta-\alpha)^2}}(\dot\beta - \dot\alpha) = \mathrm{sgn}(\beta-\alpha)(\dot\beta - \dot\alpha).$$
because $\beta = \arcsin\sin\bar\beta = \arcsin\sin(\theta-\varphi)$, we get
$$\dot\beta = \frac{\cos(\theta-\varphi)}{\sqrt{1-\sin^2(\theta-\varphi)}}(\dot\theta - \dot\varphi) = \mathrm{sgn}(\cos\bar\beta)(\dot\theta - \dot\varphi).$$
considering the transformation into polar coordinates, we have $\dot\varphi = u\sin\beta/l$ and $\dot l = u\cos\beta$. we distinguish three cases.

(1.) case $\bar\beta \in [-\pi/2, \pi/2]$. in this case the control law is $u = -\lambda l$, $\omega = -\lambda l(\beta-\alpha)/r$, and
$$\sin\alpha = \frac{r}{l} \;\Rightarrow\; \dot\alpha = -\frac{r\dot l}{l^2\cos\alpha} = \frac{\lambda r\cos\beta}{l\cos\alpha} = \lambda\cos\beta\tan\alpha.$$
then we have
$$\dot\delta = \mathrm{sgn}(\beta-\alpha)\Big(\omega - \frac{u\sin\beta}{l} - \lambda\cos\beta\tan\alpha\Big) = \mathrm{sgn}(\beta-\alpha)\,\lambda\Big[-\frac{l}{r}(\beta-\alpha) + \sin\beta - \cos\beta\tan\alpha\Big].$$
suppose $e_\beta = -\frac{l}{r}(\beta-\alpha) + \sin\beta - \cos\beta\tan\alpha$. because $\sin\alpha = r/l$ and $l \geq 2r$, we get $de_\beta/d\beta < 0$, so $e_\beta$ is a monotonically decreasing function of $\beta$ and $\beta = \alpha$ is its unique zero. hence:
• if $-\pi/2 \leq \beta < \alpha$, then $e_\beta > 0$, $\beta-\alpha < 0$ and $\dot\delta \leq 0$;
• if $\alpha \leq \beta \leq \pi/2$, then $e_\beta \leq 0$, $\beta-\alpha > 0$ and $\dot\delta \leq 0$.
so $\delta$ converges monotonically to 0.

(2.) case $\bar\beta \in (\pi/2, \pi]$. we have $\beta = \pi - \bar\beta \in (0, \pi/2]$, and the control law is $u = \lambda l$, $\omega = \frac{\lambda l}{r}(\pi - \bar\beta - \alpha)$. in this case we get
$$\dot\delta = \mathrm{sgn}(\beta-\alpha)\,\lambda\Big[\mathrm{sgn}(\cos\bar\beta)\Big(\frac{l}{r}(\pi-\bar\beta-\alpha) - \sin\bar\beta\Big) + \cos\bar\beta\tan\alpha\Big].$$
suppose $e_\beta = \mathrm{sgn}(\cos\bar\beta)\big[\frac{l}{r}(\pi-\bar\beta-\alpha) - \sin\bar\beta\big] + \cos\bar\beta\tan\alpha$; then
$$\frac{de_\beta}{d\beta} = -\frac{l}{r} + \cos\beta + \sin\beta\tan\alpha < 0.$$
$\alpha = \beta = \pi - \bar\beta$ is the equilibrium point of $e_\beta$, so $\delta$ converges monotonically to 0.

(3.) case $\bar\beta \in [-\pi, -\pi/2)$. we have $\beta = -\pi - \bar\beta$, and a result similar to case 2 is obtained.

if $r < 0$, the same calculation leads to the same results. hence we conclude that $\delta = |\beta-\alpha| \to 0$. □

phase 2. $l \geq 2|r|$, $\delta = \beta - \alpha = 0$. because of the regulation in phase 1, in this phase the robot moves towards the origin along the tangent of the circle $c_b$; thus $\delta = 0$ and $\omega = u(\beta-\alpha)/|r| = 0$.

lemma 2. suppose $d = |o'a|$ and choose the lyapunov function $v = \frac{1}{2}d^2 + \frac{1}{2}\theta^2$. if the robot moves towards the origin along the tangent of the circle $c_b$, then $\dot v < 0$.

proof. in polar coordinates we have $\dot d = u\cos(\bar\beta - \psi s_x s_y)$, where $s_x$ is short for $\mathrm{sgn}\,x$. we find that if $u < 0$, then $0 < |\bar\beta| \leq \pi/6$; if $u > 0$, then $5\pi/6 \leq |\bar\beta| < \pi$. the angle $\psi$ is always positive; if $l = 2|r|$, $\psi$ is maximal, $\psi_{\max} = \arccos(7/8) \approx 0.5054$. then, whatever the signs of $x$ and $y$, we have
$$\dot d = -\mathrm{sgn}(\cos\bar\beta)\,\lambda l\cos(\bar\beta - \psi s_x s_y) < 0.$$
in this phase $\delta = 0$, so $\dot\theta = \omega = 0$. hence $\dot v = d\dot d + \theta\dot\theta < 0$, and the lemma is proved. □

phase 3. $l = 2|r|$. in this phase we always have $l = 2|r|$ and $\beta = \alpha = \pi/6$ (shown in figure 3). we use $y(x)$ to represent the movement of the robot and suppose $r \geq 0$; the case $r < 0$ can be studied in the same way with the same conclusion. $(x, y)$ is the position of the point $a$.

figure 3: movement in phase 3 (the robot at $a$ approaches the origin $o'$ along the tangent of the shrinking circle $c_b$).

theorem 1. suppose that a robot, with the velocity defined by the proposed control law (equation 13), moves towards the origin along the tangent of the circle $c_b$ (figure 3), whose radius $|r|$ satisfies $l = 2|r|$ with $r$ determined by rule (12); then both $d$ and $\theta$ asymptotically converge to 0.
using bp⊥pa, we could have y′ = dy dx = − 3x + √ 3(4y − √ 3x2 + 4y2) −3 √ 3x + 4y − √ 3x2 + 4y2 . this is an homogeneous differential equation. we suppose y = zx and differentiate it about x: dy dx = z + x dz dx = 3x + √ 3(4y − √ 3x2 + 4y2) 3 √ 3x− 4y + √ 3x2 + 4y2 = 3 + √ 3(4z − √ 3 + 4z2) 3 √ 3 − 4z + √ 3 + 4z2 . simplify this result and get dx x = 3 √ 3 − 4z + √ 3 + 4z2 3 + √ z + 4z2 − ( √ 3 + z) √ 3 + 4z2 dz. integrate it and get ln |x| = ∫ 3 √ 3 − 4z + √ 3 + 4z2 3 + √ z + 4z2 − ( √ 3 + z) √ 3 + 4z2 dz = √ 3 z ( − √ 1 + 43z 2 − 1 ) − 1 2 ln(z2 + 1) + arctanh z √ 3 + 4z2 + const, where “const” is a constant and its value is determined by the initial conditions. when x > 0, x = l cos ϕ, so ẋ = l̇ cos ϕ− lϕ̇ sin ϕ = −λl cos β − lλ sin β sin ϕ = − λl 2 ( √ 3 + sin ϕ) < 0. l = 0 is the equilibrium point, so x → 0. this is also the conclusion for x < 0. so we just need to consider the right neighborhood of the origin, hence x = exp (√ 3 z ( − √ 1 + 43z 2 − 1 ) − 1 2 ln(z2 + 1) + arctanh z √ 3 + 4z2 + const ) . thus dx dz = x (−z3 + √ 3z2 + √ 3) √ 3 + 4z2 + 4z2 + 3 z2(z2 + 1) √ 3 + 4z2 . solve the equation dx dz = 0, get z0 = 1 + √ 3. and when z < z0, dxdz > 0; when z > z0, dx dz < 0. so in the right neighborhood of the origin, dx dz < 0. hence if z → 0, then x → 0, and we have the approximate relation between x and z: ln x = − 2 √ 3 z − z √ 3 − z2 2 + o(z3). now z is very close to 0, so the terms of higher orders could be omitted and we get ln x = − 2 √ 3 z = − 2 √ 3x y . thus y = −2 √ 3x ln x → 0 when x → 0. the inclination also converges to 0, because tan θ = dy dx ∼ d dx ( −2 √ 3 x ln x ) = −2 √ 3 ( 1 ln x − 1 ln2 x ) , when x → 0, tan θ → 0, then θ → 0. from the proposed control law, we know that l = 0 is the equilibrium point, and here it is demonstrated that when l → 0, the limit of d and θ are both 0. in the polar coordinate frame, we have l̇ = u cos β̄ = −sgn cos β̄ λl cos β̄ = −λl|cos β̄|, hence l → 0. with the trigonometric relations, we could prove that v̇ = dḋ + θθ̇ < 0, and the gain λ does not affect the stability. when r < 0, the same reasoning can be made. this completes the proof. suppose d to be a length that has the same order of the workspace of the system and satisfies l ≤ d for any t ∈ r and any robot. the two wheel velocities are ω1 = 2u + ωl 2rw ,ω2 = 2u−ωl 2rw , and satisfy |ω1| ≤ ωmax, |ω2| ≤ ωmax, then we get the range of λ: 0 < λ ≤ λmax = rwωmax d + 2π3 µrchassis where rchassis = l/2 is the radius of the robot chassis, and µ is a convenient number that satisfies l/|r| ≤ µ for all t. 3.3. stability of formation control because of the nonholonomic constraints, if there is a bidirectional path between any two unicycle-type robots which are equipped with this local control law, the system will not converge, so we propose a rooted directed acyclic graph as the communication topology of the multi-robot system and the theorem below. theorem 2. if the communication topology between n unicycle-type robots is a rooted directed acyclic graph, then the system (equation 7) will achieve the desired formation (equation 9) under the local control law (equation 11, 13). especially, each robot, in phase 3, converges to the desired formation asymptotically. proof. there is no directed circle, so the root node (robot) will not receive any information and will be static. let km denote the set of the nodes (robots) 6 vol. 56 no. 1/2016 formation control of multiple unicycle-type robots using lie group 1 2 3 4 5 6 figure 4. communication topology of simulation. 
to which there is a directed path from the root consisting of at most $m$ edges. then $k_0$ has only one element – the root robot, denoted by $v_0$. the configuration of this robot in the fixed frame is denoted by $g_0 = \bar g_0$. we then use mathematical induction.

(1.) for $k_1$, suppose that there are $n_1$ elements in $k_1$. one element is denoted by $v_{1i}$, $1 \leq i \leq n_1$, and its configuration by $g_{1i}$. because $v_{1i}$ receives information only from $v_0$, according to the lemmas above and theorem 1 we know that $\lim_{t\to\infty} g_{1i}^{-1} g_0 = \bar g_{1i,0}$.

(2.) for $k_m$, the elements of this set are denoted by $v_{mi}$, $1 \leq i \leq c_m$, where $c_m$ is the cardinality of the set. $v_{mi}$ receives information from nodes which are elements of $\bigcup_{n \leq m-1} k_n$ and have achieved their desired configurations. we use $j$ to denote the index numbers of these robots, that is, $v_{n_j j} \in k_{n_j} \subset \bigcup_{n \leq m-1} k_n$. then, with the control law, $v_{mi}$ converges to the desired configuration relative to $v_{n_j j}$, so
$$\lim_{t\to\infty} g_{mi}^{-1} g_0 = \lim_{t\to\infty} g_{mi}^{-1} g_{n_j j}\, g_{n_j j}^{-1} g_0 = \Big(\lim_{t\to\infty} g_{mi}^{-1}\bar g_{n_j j}\Big)\big(\bar g_{n_j j}^{-1}\bar g_0\big) = \bar g_{(n_j j),mi}^{-1}\,\bar g_{(n_j j),0} = \bar g_{mi,0}.$$
the topology graph is a finite graph, so all the robots converge to the desired configuration relative to $v_0$. then for any $v_i, v_j$ we have
$$\lim_{t\to\infty} g_i^{-1} g_j = \lim_{t\to\infty} g_i^{-1} g_0\, g_0^{-1} g_j = \bar g_{0i}^{-1}\bar g_{0j} = \bar g_{ij},$$
and the formation in equation (9) is achieved. □

4. simulation
let us consider a group of 6 unicycle-type robots located in a global frame, and suppose that each robot knows its own position and orientation in this frame, via gps or via a camera installed above the work area. the initial pose of each robot is $p = (x, y, \theta)$, where $(x, y)$ represents the robot position in the global frame and the angle $\theta$ indicates the orientation of the robot. the six initial poses are given by
$$p_1 = (0, 0, 0), \quad p_2 = (5, 3, 0), \quad p_3 = (-1, 6, \pi/6), \quad p_4 = (6, -5, -\pi/2), \quad p_5 = (0, -5, \pi/3), \quad p_6 = (-5, -4, -\pi/2),$$
and the desired formation is a regular hexagon with side length 2. let $c = 1$, the sample time be 0.1 s, and the maximum angular velocity of the wheels $\omega_{\max} = 5\pi$ rad/s. the communication topology is given in figure 4.

figure 4: communication topology of the simulation (a rooted dag on the 6 robots).
figure 5: trajectories of the 6 robots.

using matlab, the results shown in figures 5 and 6 are obtained. we observe that the six robots achieve the desired hexagonal formation: robot 1 has no information source, so it remains static; the other robots perform trajectories according to the postures of their information-source robots. robots 4, 5 and 6 achieve their desired configurations after robots 2 and 3, because of the shape of the communication topology. figure 6 shows the evolution of the angles between the forward direction of each robot and the x-axis of the global frame. we see that the six angles converge to the same value after some regulation of the configurations, which indicates the coordination of the robots' orientations; the rotation velocities become 0 at the end.
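the paper's simulation was done in matlab; the following python sketch (ours) shows the shape of such a loop, reusing the hypothetical se2/exp_se2/local_control helpers from the earlier snippets. the dag encoding, the hexagon placement and the plain arithmetic averaging of angles are all our assumptions, not the paper's exact scheme.

```python
import numpy as np

def se2(x, y, th):
    """pose matrix of section 2.5 from coordinates (x, y, theta)."""
    c, s = np.cos(th), np.sin(th)
    return np.array([[c, -s, x], [s, c, y], [0.0, 0.0, 1.0]])

def angle_of(g):
    return np.arctan2(g[1, 0], g[0, 0])

initial = [(0, 0, 0), (5, 3, 0), (-1, 6, np.pi/6),
           (6, -5, -np.pi/2), (0, -5, np.pi/3), (-5, -4, -np.pi/2)]
# regular hexagon of side 2 (circumradius 2), our placement of the targets:
desired = [se2(2*np.cos(k*np.pi/3), 2*np.sin(k*np.pi/3), 0.0) for k in range(6)]
g = [se2(*p) for p in initial]
sources = {0: [], 1: [0], 2: [0], 3: [1], 4: [1, 2], 5: [2]}   # hypothetical dag

for _ in range(600):                                   # 60 s at dt = 0.1 s
    for i, nbrs in sources.items():
        if not nbrs:
            continue                                   # the root robot stays static
        errs = [np.linalg.inv(g[i]) @ g[j] @ np.linalg.inv(desired[j]) @ desired[i]
                for j in nbrs]                         # g_tilde_ij, eq. (10)
        x = np.mean([e[0, 2] for e in errs])           # eq. (11) with a_ij = 1
        y = np.mean([e[1, 2] for e in errs])
        th = np.mean([angle_of(e) for e in errs])      # naive angle averaging
        inv = np.linalg.inv(se2(x, y, th))             # g_tilde_ii^{-1}
        u, w = local_control(inv[0, 2], inv[1, 2], angle_of(inv),
                             r_abs=0.5, lam=0.5)
        g[i] = g[i] @ exp_se2(w, np.array([u, 0.0]), 0.1)   # step by eq. (3)
```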
the case of switching topology, avoiding obstacles and an experiment on real robots will be studied in our future work. list of symbols u robot’s longitudinal velocity [m/s] ω robot’s chassis instantaneous velocity of rotation [rad/s] ω1,ω2 angular velocity of right and left wheel [rad/s] rω radius of wheel [m] l distance between the two actuated wheels [m] p = [xi,yi ]t position of robot in frame fi gi ∈ se(2) configuration of local frame attached to roboti relative to frame fi gij configuration of robot-j relative to the local frame attached to robot-i ḡij desired configuration of robot-j relative to robot-i b = (0,r) the centre of a circle of which the radius is |r| a = [x,y]t position of robot gi in frame o′x′y ′ θ orientation of robot gi relative to axis o′x′ ϕ angle ∠abx′ ∈ [−π,π] l distance between a and b d distance between a and o′ β̄ angle formed by −→ ba and robot’s orientation, ∈ [−π,π] α angle arcsin(r/l) ∈ [−π/2,π/2] ψ angle ∠bao′ ∈ [0,π/2] acknowledgements youwei dong was sponsored by the china scholarship council. many thanks to members of the math stack exchange community, anonymous and otherwise, for helpful comments and suggestions. references [1] r. dong, z. geng. consensus based formation control laws for systems on lie groups. in systems & control letters 62:104–111, 2013. doi:10.1016/j.sysconle.2012.11.005. [2] a. gautam, s. mohan. a review of research in multi-robot systems. in industrial and information systems(iciis), pp. 1–5. chennai, aug 2012. doi:10.1109/iciinfs.2012.6304778. [3] k.-k. oh, h.-s. ahn. formation control of mobile agents based on distributed position estimation. in ieee transactions on automatic control 58(3):737– 742, march 2013. doi:10.1109/tac.2012.2209269. [4] h. mehrjerdi, j. ghommam, m. saad. nonlinear coordination control for a group of mobile robots using a virtual structure. in mechatronics 21:1147–1155, 2011. doi:10.1016/j.mechatronics.2011.06.006. [5] w. ren. consensus based formation control strategies for multi-vehicle systems. in proceedings of the 2006 american control conference, pp. 4237–4242. minnesota, june 2006. doi:10.1109/acc.2006.1657384. [6] s. mastellone, j. s. mejía, d. m. stipanović, m. w. song. formation control and coordinated tracking via asymptotic decoupling for lagrangian multi-agent systems. in automatica 47:2355–2363, 2011. doi:10.1016/j.automatica.2011.08.030. [7] r. olfati-saber, r. m. murray. distributed cooperative control of multiple vehicle formations using 8 http://dx.doi.org/10.1016/j.sysconle.2012.11.005 http://dx.doi.org/10.1109/iciinfs.2012.6304778 http://dx.doi.org/10.1109/tac.2012.2209269 http://dx.doi.org/10.1016/j.mechatronics.2011.06.006 http://dx.doi.org/10.1109/acc.2006.1657384 http://dx.doi.org/10.1016/j.automatica.2011.08.030 vol. 56 no. 1/2016 formation control of multiple unicycle-type robots using lie group structural potential functions. in ifac world congress. barcelona, 2002. [8] k. d. listmann, m. v. masalawala, j. adamy. consensus for formation control of nonholonomic mobile robots, in robotics and automation. in mathematical physics, pp. 3886–3891. kobe, may 2009. doi:10.1109/robot.2009.5152865. [9] p. morin, c. samson. control of nonholonomic mobile robots based on the transverse function approach. in transactions on robotics 25(5):1058–1073, oct 2009. doi:10.1109/tro.2009.2014123. [10] r. m. murray, z. li, s. s. sastry. a mathematical introduction to robotic manipulation. crc press, 1994. [11] f. w. warner. foundations of differentiable manifolds and lie groups. 
london: scott, foresman and company, 1983.
[12] p. coelho, u. nunes. lie algebra application to mobile robot control: a tutorial. robotica 21:483–493, 2003. doi:10.1017/s0263574703005149.
[13] j. f. cariñena, j. clemente-gallardo, a. ramos. motion on lie groups and its applications in control theory. reports on mathematical physics 51:159–170, jul 2003. doi:10.1016/s0034-4877(03)80010-1.
[14] g. s. chirikjian. information theory on lie groups and mobile robotics applications. ieee international conference on robotics and automation, pp. 2751–2757, alaska, 2010. doi:10.1109/robot.2010.5509791.
[15] z. peng. formation control of multi nonholonomic wheeled mobile robots. ph.d. thesis, ecole centrale de lille, june 2013.
[16] g. wen. distributed cooperative control for multi-agent systems. ph.d. thesis, ecole centrale de lille, october 2012.
[17] b. siciliano, o. khatib. handbook of robotics. springer, 2008.

acta polytechnica 56(6):440–447, 2016, doi:10.14311/ap.2016.56.0440
on generalization of special functions related to weyl groups
lenka háková^a,∗, agnieszka tereszkiewicz^b
a department of mathematics, faculty of chemical engineering, university of chemistry and technology, prague, technická 5, cz-166 28 prague, czech republic
b institute of mathematics, university of bialystok, ciolkowskiego 1m, 15-245 bialystok, poland
∗ corresponding author: lenka.hakova@vscht.cz

abstract. weyl group orbit functions are defined in the context of weyl groups of simple lie algebras. they are multivariable complex functions possessing remarkable properties such as (anti)invariance with respect to the corresponding weyl group, and continuous and discrete orthogonality. a crucial tool in their definition are so-called sign homomorphisms, which coincide with one-dimensional irreducible representations. in this work we generalize the definition of orbit functions using characters of irreducible representations of higher dimensions. we describe their properties and give examples for weyl groups of rank 2 and 3.
keywords: weyl groups, characters, special functions.

1. introduction
we consider simple lie algebras, i.e., the infinite families $A_n, B_n, C_n, D_n$ and the exceptional algebras $G_2, F_4, E_6, E_7, E_8$. several families of multivariable special functions – orbit functions – are defined with respect to the related weyl groups; see (5). these functions are called $c$–, $s$–, $s^s$– and $s^l$–functions, and they have been described in many papers; see [1]–[6]. in [7] we studied a generalization of the $c$– and $s$–functions of the weyl groups of $A_n$ using immanants of a certain matrix $a$ with exponential entries. immanants are functions defined on the set of square matrices of order $n$. they are related to the irreducible characters of the symmetric group $S_n$, among them the standard determinant and permanent.
in that paper we reviewed the fact that the $c$– ($s$–)functions correspond to the permanent (determinant) of the matrix $a$, and we studied non-trivial immanants. these new special functions possess some of the properties of orbit functions, although they are not invariant with respect to the weyl group of $A_n$. another way to define these functions is to use the irreducible characters of the weyl group of $A_n$ directly; this definition is then extended to all weyl groups of simple lie algebras, see (11). in this paper we show that, via the definition (11), we can describe uniformly the four families of orbit functions, the immanant functions, and new families of functions related to the corresponding weyl group using its irreducible characters. we give several examples and prove some important properties. the paper is organized as follows: in section 2 we review some definitions and notions from the representation theory of finite groups and the theory of weyl groups of semisimple lie algebras. section 3 presents infinite families of special functions related to the weyl groups; namely, we summarize some properties of the families of $c$–, $s$–, $s^s$– and $s^l$–orbit functions and we define their generalization. in section 4 we study properties of these special functions, including their continuous and discrete orthogonality in section 4.2 and linear independence in section 4.3. section 5 gives examples of functions related to weyl groups of rank 2 and 3.

2. preliminaries
2.1. irreducible characters of symmetric groups
for the general definition of character functions we need to review some notation and facts from the standard representation theory of finite groups; see for example [8, 9]. let $G$ be a finite group. it can be written as a union of its conjugacy classes. irreducible characters $\chi$, the traces of irreducible representations, are mappings of conjugacy classes to complex numbers. the number of conjugacy classes equals the number of irreducible characters. the values of the characters are listed in so-called character tables; see table 1. the degree $d_k$ of the character $\chi_k$ is the dimension of the corresponding representation. linear characters are the characters of degree one; they are homomorphisms from the group $G$ to the multiplicative group of non-zero complex numbers. characters are real valued if and only if every $g \in G$ is conjugate to its inverse; this is the case, for example, for all the weyl groups of simple lie algebras [10, corollary 3.2.14]. the inner product of characters is defined as
$$\langle\chi_k, \chi_l\rangle = \frac{1}{|G|}\sum_{g\in G}\chi_k(g)\chi_l(g),$$
where $|G|$ denotes the order of the group $G$. the row orthogonality relation states that for every pair of irreducible characters $\chi_k, \chi_l$,
$$\frac{1}{|G|}\sum_{g\in G}\chi_k(g)\chi_l(g) = \delta_{kl}. \qquad (1)$$
the column orthogonality relation is the following: for every $g, h \in G$,
$$\sum_{k=1}^{r}\chi_k(g)\chi_k(h) = \delta_{gh}\,\frac{|G|}{|[g]|}, \qquad (2)$$
where $|[g]|$ is the size of the conjugacy class of $g$. moreover, another useful identity holds [9, theorem iii.2.7]:
$$\sum_{g\in G}\chi_k(hg^{-1})\chi_l(g) = \delta_{kl}\,\frac{|G|}{d_k}\,\chi_k(h). \qquad (3)$$
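as a quick numeric illustration (ours, not from the paper), the relations (1)–(2) can be checked directly on the character table of the weyl group of $A_2$ given later in table 1, using the conjugacy class sizes 1, 2, 3 listed in section 5.1:

```python
# character table of w(a2) ~ s3: rows = characters, columns = classes c1, c2, c3
chars = [[1, 1, 1],    # chi_1 (trivial)
         [1, 1, -1],   # chi_2 (alternating)
         [2, -1, 0]]   # chi_3 (degree 2)
sizes = [1, 2, 3]       # class sizes |[g]|; the group order is 6
order = sum(sizes)

def inner(k, l):
    """row orthogonality (1): <chi_k, chi_l> = delta_kl."""
    return sum(s * chars[k][c] * chars[l][c] for c, s in enumerate(sizes)) / order

assert all(inner(k, l) == (1 if k == l else 0) for k in range(3) for l in range(3))

# column orthogonality (2): sum_k chi_k(g) chi_k(h) = delta_gh |G| / |[g]|
for gc in range(3):
    for hc in range(3):
        s = sum(row[gc] * row[hc] for row in chars)
        assert s == (order // sizes[gc] if gc == hc else 0)
print("orthogonality relations (1) and (2) verified for w(a2)")
```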
2.2. weyl groups of simple lie algebras
details about the notions introduced in this subsection can be found for example in [5]. we consider simple lie algebras of rank $n$ with the set of simple roots $\Delta = \{\alpha_1, \ldots, \alpha_n\}$. the roots are either all of the same length or of two different lengths, called short and long roots; in the latter case we write $\Delta = \Delta_s \cup \Delta_l$. coroots are a normalization of the simple roots, $\alpha^\vee = 2\alpha/\langle\alpha,\alpha\rangle$. weights $\omega$ and coweights $\omega^\vee$ are dual to coroots and simple roots, $\langle\alpha_i, \omega_j^\vee\rangle = \langle\alpha_i^\vee, \omega_j\rangle = \delta_{ij}$. integer combinations of these vectors are an important tool when dealing with orbit functions. we define
$$P = \mathbb{Z}\omega_1 + \cdots + \mathbb{Z}\omega_n \ (\text{the weight lattice}), \qquad P^\vee = \mathbb{Z}\omega_1^\vee + \cdots + \mathbb{Z}\omega_n^\vee \ (\text{the coweight lattice}),$$
$$Q = \mathbb{Z}\alpha_1 + \cdots + \mathbb{Z}\alpha_n \ (\text{the root lattice}), \qquad Q^\vee = \mathbb{Z}\alpha_1^\vee + \cdots + \mathbb{Z}\alpha_n^\vee \ (\text{the coroot lattice}).$$
moreover, we denote by $P^+$ and $P^{++}$ the non-negative and positive parts of $P$, respectively. weyl groups $W$ are generated by reflections $r_i$ with respect to the hyperplanes orthogonal to the simple roots $\alpha_i$, i.e., $r_i x = x - \langle\alpha_i, x\rangle\alpha_i^\vee$. their action on the set of simple roots gives the root system $W(\Delta)$. since such a root system is irreducible, there exists a unique highest root $\xi$; its coordinates $m_i$ in the basis of simple roots are called marks. the affine weyl group is an infinite extension which can be described as a semidirect product of the shifts by integer combinations of coroots $Q^\vee$ and the weyl group $W$. its fundamental domain $F$ is a simplex with vertices $\{0, \frac{\omega_1^\vee}{m_1}, \ldots, \frac{\omega_n^\vee}{m_n}\}$. the set of coroots $\Delta^\vee$ generates the same weyl group; its action on $\Delta^\vee$ gives the dual root system with highest root $\eta$, whose coordinates in the basis of coroots are called dual marks and denoted $m_i^\vee$. finally, the dual affine weyl group can be written as $Q \rtimes W$, and its fundamental domain $F^\vee$ is a simplex with vertices $\{0, \frac{\omega_1}{m_1^\vee}, \ldots, \frac{\omega_n}{m_n^\vee}\}$. weyl groups can also be described by their presentation as a finite coxeter group, i.e., a group generated by elements $r_i$ which satisfy
$$r_i^2 = 1, \qquad (r_i r_j)^{s_{ij}} = 1, \qquad i, j \in \{1, \ldots, n\}, \qquad (4)$$
where the $s_{ij} = s_{ji}$ are integers greater than or equal to 2.

3. special functions related to weyl groups
3.1. weyl group orbit functions
the standard way of defining the four families of orbit functions uses the concept of sign homomorphisms, i.e., mappings $\sigma : W \to \{\pm 1\}$. a sign homomorphism can be defined by prescribing its values on the generators of $W$; these have to satisfy condition (4), which gives
$$\sigma(r_i)^2 = 1, \qquad \big(\sigma(r_i)\sigma(r_j)\big)^{s_{ij}} = 1, \qquad i, j \in \{1, \ldots, n\}.$$
there are four admissible mappings: the identity, the determinant, and the homomorphisms $\sigma^s$ and $\sigma^l$ defined as
$$\sigma^s(r_\alpha) = \begin{cases} 1, & \alpha \in \Delta_l, \\ -1, & \alpha \in \Delta_s, \end{cases} \qquad \sigma^l(r_\alpha) = \begin{cases} 1, & \alpha \in \Delta_s, \\ -1, & \alpha \in \Delta_l. \end{cases}$$
the general formula defining the four families of orbit functions is, for every $x \in \mathbb{R}^n$ and every $\lambda \in P$,
$$\varphi^\sigma_\lambda(x) = \sum_{w \in W}\sigma(w)\,e^{2\pi i\langle w\lambda, x\rangle}, \qquad (5)$$
where $\sigma$ is one of the four sign homomorphisms. for the choice $\sigma = \mathrm{id}$ we get the family of $c$–orbit functions, denoted $\Phi_\lambda$. the family of $s$–functions comes from the choice $\sigma = \det$; these are denoted $\varphi_\lambda$. the homomorphisms $\sigma^s$ and $\sigma^l$ induce the families of $s^s$– and $s^l$–functions, denoted $\varphi^s$ and $\varphi^l$. these functions have been extensively studied; see [1, 2, 4] and others. they are multivariable complex functions, invariant or anti-invariant with respect to the affine weyl group, and they form an orthogonal basis of spaces of continuous functions defined on the fundamental domain $F$, or of discrete functions defined on the points of a finite grid in $F$. details are found in [3, 5, 11]. in this paper we prove the orthogonality of the generalized orbit functions using their relation to the family of $c$–functions; see lemma 2.
we will use the following properties of $c$–functions.

continuous orthogonality — for every $\lambda, \mu \in P^+$ it holds that
$$\int_F \Phi_\lambda(x)\overline{\Phi_\mu(x)}\,dx = |F|\,|W|\,|\mathrm{stab}_W(\lambda)|\,\delta_{\lambda\mu}, \qquad (6)$$
where $|F|$ denotes the volume of the fundamental domain $F$, $|W|$ is the order of the weyl group $W$, and $\mathrm{stab}_W(\lambda)$ is the stabilizer of the action of $W$ on $\lambda$. in particular, for the choice $\mu = 0$ the $c$–function becomes a constant equal to $|W|$, and equation (6) gives
$$\int_F \Phi_\lambda(x)\overline{\Phi_0(x)}\,dx = |W|\int_F \Phi_\lambda(x)\,dx = |F|\,|W|^2\,\delta_{\lambda 0}. \qquad (7)$$

discrete orthogonality — let $M$ be a positive integer of our choice. we define a $W$–invariant lattice grid $F_M$ as $\frac{1}{M}P^\vee/Q^\vee \cap F$. we consider a space of functions sampled on the points of $F_M$, with a scalar product defined for each pair of functions $f, g$ as
$$\langle f, g\rangle_{F_M} = \sum_{x \in F_M}\varepsilon(x)\,f(x)\overline{g(x)}. \qquad (8)$$
the weight function $\varepsilon(x)$ is given by the order of the weyl orbit of $x$, $\varepsilon(x) = \frac{|W|}{|\mathrm{stab}_W(x)|}$. the set of parameters $\Lambda_M$ is defined as $\Lambda_M = P/MQ \cap MF^\vee$. it gives a finite family of orbit functions which are pairwise orthogonal with respect to the scalar product (8): for every $\lambda, \mu \in \Lambda_M$ it holds that
$$\langle\Phi_\lambda, \Phi_\mu\rangle_{F_M} = c\,|W|\,M^n\,h_\lambda^\vee\,\delta_{\lambda\mu}, \qquad (9)$$
where the coefficient $h_\lambda^\vee$ is the order of the stabilizer of $\lambda$ and $c$ is the determinant of the cartan matrix of the corresponding weyl group. the values of $|W|$, $c$, $\varepsilon(x)$ and $h_\lambda^\vee$ can be found in several papers, e.g., [5]. with the choice $\mu = 0$ we can rewrite the orthogonality relation (9) as
$$\sum_{x \in F_M}\varepsilon(x)\Phi_\lambda(x) = c\,|W|\,M^n\,\delta_{\lambda 0}. \qquad (10)$$

3.2. character functions
let $W$ be a weyl group of rank $n$ with irreducible characters $\chi_1, \ldots, \chi_r$. for every $x, \lambda \in \mathbb{R}^n$ and $k = 1, \ldots, r$ we define
$$\phi^k_\lambda(x) = \sum_{w \in W}\chi_k(w)\,e^{2\pi i\langle w\lambda, x\rangle}. \qquad (11)$$
we obtain several infinite families of functions related to the irreducible characters of $W$; for shortness we will refer to them within this paper as character functions. the first two irreducible characters are the trivial and the alternating character; their values correspond to the values of the first two sign homomorphisms, the identity and the determinant. in the case of the weyl groups of $B_n, C_n, F_4$ and $G_2$, the sign homomorphisms $\sigma^s$ and $\sigma^l$ are the only two other possible linear characters. therefore, definition (11) includes the families of $c$– and $s$–functions for all weyl groups, and the families of $s^s$– and $s^l$–functions for $B_n, C_n, F_4$ and $G_2$. the rest of the paper studies the properties of the functions related to irreducible characters of degree $\geq 2$.
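a small numeric sketch (ours, not from the paper) of definition (11) for $W(A_2)$: since the 2-dimensional irreducible representation of $W(A_2)$ is its reflection representation, the three characters can be read directly from the group matrices as $\chi_1(w) = 1$, $\chi_2(w) = \det w$, $\chi_3(w) = \mathrm{tr}\,w$.

```python
import numpy as np

# simple roots of a2 realized in r^2 (both of squared length 2, at 120 degrees)
alpha = [np.array([np.sqrt(2.0), 0.0]),
         np.array([-np.sqrt(2.0) / 2, np.sqrt(6.0) / 2])]

def reflection(a):
    """householder reflection r_a x = x - 2<a,x>/<a,a> a, as a 2x2 matrix."""
    return np.eye(2) - 2.0 * np.outer(a, a) / (a @ a)

# generate the 6-element weyl group by closing the generators under products
group = [np.eye(2)]
frontier = [reflection(a) for a in alpha]
while frontier:
    w = frontier.pop()
    if not any(np.allclose(w, g) for g in group):
        group.append(w)
        frontier += [w @ reflection(a) for a in alpha]

def phi(k, lam, x):
    """character function (11); k = 1 gives the c-function, k = 2 the
    s-function, k = 3 the new degree-2 function of section 5.1."""
    chi = {1: lambda w: 1.0, 2: np.linalg.det, 3: np.trace}[k]
    return sum(chi(w) * np.exp(2j * np.pi * ((w @ lam) @ x)) for w in group)

lam, x = np.array([0.3, 0.7]), np.array([0.1, 0.2])
print(phi(3, lam, x))   # one value of the degree-2 character function
```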
4. properties of character functions
4.1. general properties
character functions are trivial in the case $\lambda = 0$. indeed, from the orthogonality relation (1) applied to the trivial character we get
$$\langle\chi_k, \chi_1\rangle = \frac{1}{|W|}\sum_{w \in W}\chi_k(w) = \delta_{k1},$$
which implies that the $c$–function $\Phi_0 = \phi^1_0$ equals the order of the weyl group while all the others are identically zero. therefore we now consider only $\lambda \neq 0$. character functions are not, in general, invariant with respect to the weyl group; nevertheless, we can describe several symmetries and identities. let $\chi_k, \chi_l$ be any irreducible characters, $c \in \mathbb{R}$ and $x \in \mathbb{R}^n$. then
$$\phi^k_\lambda(x) = \phi^k_x(\lambda), \qquad \phi^k_{c\lambda}(x) = \phi^k_\lambda(cx).$$
moreover, for each $w \in W$ and $x \in \mathbb{R}^n$ we have
$$\phi^k_{w\lambda}(x) = \phi^k_\lambda(w^{-1}x), \qquad \phi^k_{w\lambda}(wx) = \phi^k_\lambda(x), \qquad (12)$$
and for linear characters $\phi^k_{w\lambda}(x) = \chi_k(w)\,\phi^k_\lambda(x)$. let $\lambda, \mu \in \mathbb{R}^n$. we get the following formula for a product of two character functions:
$$\phi^k_\lambda(x)\,\phi^l_\mu(x) = \sum_{w,\tilde w \in W}\chi_k(w)\chi_l(\tilde w)\,e^{2\pi i\langle w\lambda + \tilde w\mu, x\rangle}. \qquad (13)$$
two particular cases are of interest. when $l = 1$, i.e., for the product of a general character function with a $c$–function, it holds that
$$\phi^k_\lambda(x)\,\phi^1_\mu(x) = \sum_{w \in W}\phi^k_{\lambda + w\mu}(x),$$
and in the case of $\chi_k$ being a linear character we get
$$\phi^k_\lambda(x)\,\phi^1_\mu(x) = \sum_{w \in W}\chi_k(w)\,\phi^k_{w\lambda + \mu}(x), \qquad \phi^k_\lambda(x)\,\phi^k_\mu(x) = \sum_{w \in W}\chi_k(w)\,\phi^k_{\lambda + w\mu}(x).$$
the formula for the product of two character functions can be rewritten using the following lemma.

lemma 1. let $w$ be any element of a weyl group $W$ and $x, \nu \in \mathbb{R}^n$. then
$$e^{2\pi i\langle w\nu, x\rangle} = \frac{|[w]|}{|W|}\sum_{j=1}^{r}\chi_j(w)\,\phi^j_\nu(x). \qquad (14)$$

proof. we consider the sum on the right-hand side of (14) and use formulas (2) and (11):
$$\sum_{j=1}^{r}\chi_j(w)\phi^j_\nu(x) = \sum_{j=1}^{r}\chi_j(w)\sum_{\tilde w \in W}\chi_j(\tilde w)e^{2\pi i\langle\tilde w\nu,x\rangle} = \sum_{\tilde w \in W}e^{2\pi i\langle\tilde w\nu,x\rangle}\sum_{j=1}^{r}\chi_j(w)\chi_j(\tilde w) = \sum_{\tilde w \in W}e^{2\pi i\langle\tilde w\nu,x\rangle}\,\delta_{w\tilde w}\frac{|W|}{|[w]|} = \frac{|W|}{|[w]|}\,e^{2\pi i\langle w\nu,x\rangle}. \;\square$$
now, putting $w = \mathrm{id}$ and $\nu = w\lambda + \tilde w\mu$ in lemma 1, we can rewrite (13) in the form
$$\phi^k_\lambda(x)\,\phi^l_\mu(x) = \frac{1}{|W|}\sum_{j=1}^{r}d_j\sum_{w,\tilde w \in W}\chi_k(w)\chi_l(\tilde w)\,\phi^j_{w\lambda+\tilde w\mu}(x).$$
another remarkable property of orbit functions is the fact that they are eigenfunctions of the laplace operator in $\mathbb{R}^n$. by analogy with [1], we calculate the laplace operator in $\mathbb{R}^n$ and find that $\phi^k_\lambda(x)$ fulfills the helmholtz equation
$$\Delta\phi^k_\lambda(x) = -4\pi^2\langle\lambda,\lambda\rangle\,\phi^k_\lambda(x).$$
finally, the following lemma is crucial for the continuous and discrete orthogonality. its proof is straightforward and analogous to the proof of the similar proposition for the immanant functions in [7].

lemma 2. let $0 \neq \lambda, \mu \in P^+$ and let $\chi_k, \chi_l$ be irreducible characters. then
$$\sum_{w \in W}\phi^k_{w\lambda}(x)\,\phi^l_{w\mu}(x) = \sum_{\tilde w,\hat w \in W}\chi_k(\tilde w)\chi_l(\hat w)\,\phi^1_{\lambda+\tilde w\hat w\mu}(x),$$
$$\sum_{w \in W}\phi^k_{w\lambda}(x)\,\overline{\phi^l_{w\mu}(x)} = \sum_{\tilde w,\hat w \in W}\chi_k(\tilde w)\chi_l(\hat w)\,\phi^1_{\lambda-\tilde w\hat w\mu}(x).$$

4.2. continuous and discrete orthogonality
the following theorem is analogous to theorem 3 in [7]. it describes the continuous orthogonality of character functions over the domain $\tilde F = \bigcup_{w \in W} wF$, where $W$ is any weyl group and $F$ the fundamental domain of the corresponding affine weyl group.

theorem 3. for every $0 \neq \lambda, \mu \in P^+$ and every pair of characters $\chi_k, \chi_l$ the following relation holds:
$$\int_{\tilde F}\phi^k_\lambda(x)\,\overline{\phi^l_\mu(x)}\,dx = \frac{|W|^2|F|}{d_k}\,\delta_{\lambda\mu}\,\delta_{kl}\sum_{w \in \mathrm{stab}_W(\lambda)}\chi_k(w),$$
where $d_k$ is the degree of the character $\chi_k$. in particular, for $\lambda, \mu \in P^{++}$ it holds that
$$\int_{\tilde F}\phi^k_\lambda(x)\,\overline{\phi^l_\mu(x)}\,dx = |F|\,|W|^2\,\delta_{kl}\,\delta_{\lambda\mu}.$$

proof. using the symmetries of character functions given by (12), we write
$$\int_{\tilde F}\phi^k_\lambda(x)\overline{\phi^l_\mu(x)}\,dx = \sum_{w \in W}\int_F \phi^k_{w\lambda}(x)\overline{\phi^l_{w\mu}(x)}\,dx = \int_F \sum_{w \in W}\phi^k_{w\lambda}(x)\overline{\phi^l_{w\mu}(x)}\,dx.$$
from lemma 2 we get
$$\int_F \sum_{w \in W}\phi^k_{w\lambda}(x)\overline{\phi^l_{w\mu}(x)}\,dx = \int_F \sum_{\tilde w,\hat w \in W}\chi_k(\tilde w)\chi_l(\hat w)\,\phi^1_{\lambda-\tilde w\hat w\mu}(x)\,dx = \sum_{w,\hat w \in W}\chi_k(w\hat w^{-1})\chi_l(\hat w)\int_F \phi^1_{\lambda-w\mu}(x)\,dx,$$
where in the last equality we substituted $w = \tilde w\hat w$. since $\lambda, \mu \in P^+$, the expression $\lambda - w\mu$ equals zero only for $\lambda = \mu$ and $w \in \mathrm{stab}_W(\lambda)$. then, using (3) and (7), we get
$$\sum_{w,\hat w \in W}\chi_k(w\hat w^{-1})\chi_l(\hat w)\int_F \phi^1_{\lambda-w\mu}(x)\,dx = |W||F|\,\delta_{\lambda\mu}\sum_{w \in \mathrm{stab}_W(\lambda)}\sum_{\hat w \in W}\chi_k(w\hat w^{-1})\chi_l(\hat w) = |W|^2|F|\,\delta_{\lambda\mu}\,\delta_{kl}\,\frac{1}{d_k}\sum_{w \in \mathrm{stab}_W(\lambda)}\chi_k(w).$$
for $\lambda, \mu \in P^{++}$ the stabilizer is trivial and the sum in the above expression equals $d_k$. □

an analogous method is used in the proof of the discrete orthogonality. let $\tilde F_M = \bigcup_{w\in W} wF_M = \frac{1}{M}P^\vee/Q^\vee \cap \tilde F$. using the orthogonality of $c$–functions (10), we get the following theorem.

theorem 4. for every $0 \neq \lambda, \mu \in \Lambda_M$ and every pair of characters $\chi_k, \chi_l$ the following relation holds:
$$\sum_{x \in \tilde F_M}\phi^k_\lambda(x)\,\overline{\phi^l_\mu(x)} = c\,|W|^2 M^n\,\delta_{\lambda\mu}\,\delta_{kl}\,\frac{1}{d_k}\sum_{w \in \mathrm{stab}_W(\lambda)}\chi_k(w).$$
in particular, for $\lambda, \mu \in \Lambda_M \cap P^{++}$ it holds that
$$\sum_{x \in \tilde F_M}\phi^k_\lambda(x)\,\overline{\phi^l_\mu(x)} = c\,|W|^2 M^n\,\delta_{kl}\,\delta_{\lambda\mu}.$$
4.3. linear independence of character functions
the orthogonality relations described in the previous section show that functions corresponding to different irreducible characters are linearly independent, as are functions related to the same character but labelled by points of two different orbits. this subsection describes in detail the relations between functions corresponding to the same character and labelled by points from the same W–orbit.

theorem 5. let χ_k and χ_l be any irreducible characters of a weyl group W and λ ∈ ℝⁿ. then

\sum_{w \in W} \chi_k(w)\,\phi^l_{w\lambda}(x) = \frac{|W|}{d_k}\,\phi^k_\lambda(x)\,\delta_{kl}.

proof. we write

\sum_{w \in W} \chi_k(w)\,\phi^l_{w\lambda}(x) = \sum_{w \in W} \sum_{\tilde{w} \in W} \chi_k(w)\,\chi_l(\tilde{w})\, e^{2\pi i \langle \tilde{w} w \lambda, x \rangle} = \sum_{w \in W} \sum_{\bar{w} \in W} \chi_k(w)\,\chi_l(\bar{w} w^{-1})\, e^{2\pi i \langle \bar{w}\lambda, x \rangle} = \sum_{\bar{w} \in W} e^{2\pi i \langle \bar{w}\lambda, x \rangle} \sum_{w \in W} \chi_l(\bar{w} w^{-1})\,\chi_k(w) = \frac{|W|}{d_k}\,\phi^k_\lambda(x)\,\delta_{kl}.

we used the substitution w̄ = w̃w, which does not change the summation limits. finally, we applied formula (3).

the theorem can be reformulated as follows.

corollary 6. let W be a weyl group with irreducible characters χ₁, ..., χ_r. we fix a character χ_l and consider the corresponding character function φ^l_λ as a function of x ∈ ℝⁿ. the functions φ^l_{wλ}, where w ∈ W, fulfil r − 1 linearly independent relations, namely, for each k ≠ l,

\sum_{w \in W} \chi_k(w)\,\phi^l_{w\lambda}(x) = 0.

5. character functions related to weyl groups of rank 2 and 3
character tables of all the weyl groups of rank ≤ 4 can be found, for example, in [12]. some of them are quite extensive; therefore, we give here examples of groups of rank 2 and 3.

5.1. weyl groups of rank 2
let us consider the irreducible weyl groups of rank two, namely A₂, C₂ and G₂. their conjugacy classes are the following:

A₂: c₁ = {id}, c₂ = {r₁r₂, r₂r₁}, c₃ = {r₁, r₂, r₁r₂r₁};
C₂: c₁ = {id}, c₂ = {(r₁r₂)²}, c₃ = {r₁r₂, r₂r₁}, c₄ = {r₁, r₁r₂r₁}, c₅ = {r₂, r₂r₁r₂};
G₂: c₁ = {id}, c₂ = {(r₁r₂)³}, c₃ = {(r₁r₂)², (r₂r₁)²}, c₄ = {r₁r₂, r₂r₁}, c₅ = {r₁, r₂r₁r₂, (r₁r₂)⁴r₁}, c₆ = {r₂, r₁r₂r₁, (r₂r₁)⁴r₂}.

the irreducible characters of the weyl groups of rank two are listed in table 1.

table 1. character tables of the weyl groups of A₂, C₂ and G₂.

  A₂:       c1   c2   c3
      χ1     1    1    1
      χ2     1    1   -1
      χ3     2   -1    0

  C₂:       c1   c2   c3   c4   c5
      χ1     1    1    1    1    1
      χ2     1    1    1   -1   -1
      χ3     1    1   -1    1   -1
      χ4     1    1   -1   -1    1
      χ5     2   -2    0    0    0

  G₂:       c1   c2   c3   c4   c5   c6
      χ1     1    1    1    1    1    1
      χ2     1    1    1    1   -1   -1
      χ3     1   -1    1   -1    1   -1
      χ4     1   -1    1   -1   -1    1
      χ5     2    2   -1   -1    0    0
      χ6     2   -2   -1    1    0    0

the trivial and the alternating characters correspond to the c– and s– orbit functions. in the case of C₂ and G₂, character χ₃ gives the S^s–function and χ₄ the S^l–function. therefore, the weyl groups of rank two give four new families of functions. their explicit formulas are the following:

A₂: \phi^3_\lambda(x) = 2 e^{2\pi i \langle \lambda, x \rangle} - e^{2\pi i \langle r_2 r_1 \lambda, x \rangle} - e^{2\pi i \langle r_1 r_2 \lambda, x \rangle},
C₂: \phi^5_\lambda(x) = 2 e^{2\pi i \langle \lambda, x \rangle} - 2 e^{2\pi i \langle (r_1 r_2)^2 \lambda, x \rangle},
G₂: \phi^5_\lambda(x) = 2 e^{2\pi i \langle \lambda, x \rangle} + 2 e^{2\pi i \langle (r_1 r_2)^3 \lambda, x \rangle} - e^{2\pi i \langle (r_1 r_2)^2 \lambda, x \rangle} - e^{2\pi i \langle (r_2 r_1)^2 \lambda, x \rangle} - e^{2\pi i \langle r_1 r_2 \lambda, x \rangle} - e^{2\pi i \langle r_2 r_1 \lambda, x \rangle},
    \phi^6_\lambda(x) = 2 e^{2\pi i \langle \lambda, x \rangle} - 2 e^{2\pi i \langle (r_1 r_2)^3 \lambda, x \rangle} - e^{2\pi i \langle (r_1 r_2)^2 \lambda, x \rangle} - e^{2\pi i \langle (r_2 r_1)^2 \lambda, x \rangle} + e^{2\pi i \langle r_1 r_2 \lambda, x \rangle} + e^{2\pi i \langle r_2 r_1 \lambda, x \rangle}.

figure 1. the contour plot of the real part (left) and the imaginary part (right) of the function φ³((1, 2), x) of the weyl group W(A₂). the triangle denotes the fundamental domain F of the affine weyl group.
figure 2. the contour plot (real) of the function φ⁵((2, 1), x) (left) and the contour plot (pure imaginary) of the function φ⁶((2, 1), x) (right) of the weyl group W(G₂). the triangle denotes the fundamental domain F of the affine weyl group.
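the constructions above are easy to check numerically. the following short python sketch (ours, not part of the paper) realizes W(A₂) by its two generating reflections in ℝ², builds the character functions of definition (11) from the A₂ character table, and verifies both the explicit formula for φ³ and the relations of theorem 5 at randomly chosen λ and x; the root coordinates and the tolerances are our own choices.

# a minimal numerical sketch: W(A2) from two reflections, character functions
# per definition (11), checks of the explicit phi^3 formula and of theorem 5.
import numpy as np

def reflection(alpha):
    # householder reflection through the hyperplane orthogonal to the root alpha
    alpha = np.asarray(alpha, dtype=float)
    return np.eye(2) - 2.0 * np.outer(alpha, alpha) / alpha.dot(alpha)

# simple roots of a2 in a standard euclidean realization (only directions matter)
r1 = reflection([1.0, 0.0])
r2 = reflection([-0.5, np.sqrt(3) / 2.0])

# the six elements of W(A2), grouped by the conjugacy classes c1, c2, c3 above
classes = [[np.eye(2)],
           [r1 @ r2, r2 @ r1],
           [r1, r2, r1 @ r2 @ r1]]
chars = {1: [1, 1, 1], 2: [1, 1, -1], 3: [2, -1, 0]}   # table 1, a2
elements = [(w, j) for j, cl in enumerate(classes) for w in cl]

def phi(k, lam, x):
    # character function phi^k_lambda(x), definition (11)
    return sum(chars[k][j] * np.exp(2j * np.pi * (w @ lam) @ x)
               for w, j in elements)

rng = np.random.default_rng(0)
lam, x = rng.normal(size=2), rng.normal(size=2)

# explicit formula for phi^3 of a2 (section 5.1)
explicit = (2 * np.exp(2j * np.pi * lam @ x)
            - np.exp(2j * np.pi * (r2 @ r1 @ lam) @ x)
            - np.exp(2j * np.pi * (r1 @ r2 @ lam) @ x))
assert abs(phi(3, lam, x) - explicit) < 1e-10

# theorem 5: sum_w chi_k(w) phi^l_{w lam} = (|W|/d_k) phi^k_lam if k == l, else 0
for k in (1, 2, 3):
    for l in (1, 2, 3):
        s = sum(chars[k][j] * phi(l, w @ lam, x) for w, j in elements)
        expected = (6 / chars[k][0]) * phi(k, lam, x) if k == l else 0.0
        assert abs(s - expected) < 1e-9
print("explicit formula and theorem 5 verified numerically for W(A2)")

the same few lines, with the character table swapped, reproduce the C₂ and G₂ families as well.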
from the properties of the orbits we also have:

C₂: φ^k_λ(x) is real for k = 1, 2, 3, 4 and pure imaginary for k = 5;
G₂: φ^k_λ(x) is real for k = 1, 2, 5 and pure imaginary for k = 3, 4, 6.

linear dependence:

A₂: \phi^3_\lambda(x) + \phi^3_{r_1 r_2 \lambda}(x) + \phi^3_{r_2 r_1 \lambda}(x) = 0,
    \phi^3_{r_1 \lambda}(x) + \phi^3_{r_2 \lambda}(x) + \phi^3_{r_1 r_2 r_1 \lambda}(x) = 0;
C₂: \phi^3_\lambda(x) + \phi^3_{(r_1 r_2)^2 \lambda}(x) = 0,
    \phi^3_{r_1 \lambda}(x) + \phi^3_{r_2 r_1 r_2 \lambda}(x) = 0,
    \phi^3_{r_2 \lambda}(x) + \phi^3_{r_1 r_2 r_1 \lambda}(x) = 0,
    \phi^3_{r_1 \lambda}(x) + \phi^3_{r_2 r_1 \lambda}(x) = 0.

contour plots of some character functions related to weyl groups of rank two are depicted in figures 1 and 2.

5.2. weyl groups of rank 3
now we consider the irreducible weyl groups of rank three, namely A₃ and C₃. their character tables are given in table 2. the conjugacy classes of the weyl group of A₃ are the following:

A₃: c₁ = {id},
    c₂ = {r₁, r₂, r₃, r₁r₂r₁, r₂r₃r₂, r₁r₂r₃r₂r₁},
    c₃ = {r₁r₃, r₂r₁r₃r₂, r₃r₂r₃r₁r₂r₃},
    c₄ = {r₁r₂, r₂r₁, r₂r₃, r₃r₂, r₁r₃r₂r₁, r₁r₂r₁r₃, r₂r₃r₂r₁, r₁r₂r₃r₂},
    c₅ = {r₁r₂r₃, r₂r₃r₁, r₃r₁r₂, r₃r₂r₁, r₃r₂r₃r₁r₂, r₁r₂r₁r₃r₂}.

table 2. character tables of the weyl groups of A₃ and C₃.

  A₃:       c1   c2   c3   c4   c5
      χ1     1    1    1    1    1
      χ2     1   -1    1    1   -1
      χ3     2    0    2   -1    0
      χ4     3    1   -1    0   -1
      χ5     3   -1   -1    0    1

  C₃:       c1   c2   c3   c4   c5   c6   c7   c8   c9   c10
      χ1     1    1    1    1    1    1    1    1    1    1
      χ2     1   -1   -1    1    1    1   -1    1   -1   -1
      χ3     1    1   -1    1   -1   -1   -1    1    1   -1
      χ4     1   -1    1    1   -1   -1    1    1   -1    1
      χ5     2    0    2   -1    0    0   -1    2    0    2
      χ6     2    0   -2   -1    0    0    1    2    0   -2
      χ7     3    1   -1    0    1   -1    0   -1   -1    3
      χ8     3   -1   -1    0   -1    1    0   -1    1    3
      χ9     3    1    1    0   -1    1    0   -1   -1   -3
      χ10    3   -1    1    0    1   -1    0   -1    1   -3

written class-wise, using the classes c₁, ..., c₅ listed above, the explicit formulas of the functions φ³_λ(x), φ⁴_λ(x) and φ⁵_λ(x) of A₃ are

\phi^3_\lambda(x) = 2\,e^{2\pi i\langle\lambda,x\rangle} + 2\sum_{w\in c_3} e^{2\pi i\langle w\lambda,x\rangle} - \sum_{w\in c_4} e^{2\pi i\langle w\lambda,x\rangle},
\phi^4_\lambda(x) = 3\,e^{2\pi i\langle\lambda,x\rangle} + \sum_{w\in c_2} e^{2\pi i\langle w\lambda,x\rangle} - \sum_{w\in c_3} e^{2\pi i\langle w\lambda,x\rangle} - \sum_{w\in c_5} e^{2\pi i\langle w\lambda,x\rangle},
\phi^5_\lambda(x) = 3\,e^{2\pi i\langle\lambda,x\rangle} - \sum_{w\in c_2} e^{2\pi i\langle w\lambda,x\rangle} - \sum_{w\in c_3} e^{2\pi i\langle w\lambda,x\rangle} + \sum_{w\in c_5} e^{2\pi i\langle w\lambda,x\rangle}.

the linear dependence relations of the functions φ^{3,4,5}_λ(x) for labels from the same W–orbit are, for each l ∈ {3, 4, 5} and every k ≠ l,

\sum_{w \in W(A_3)} \chi_k(w)\,\phi^l_{w\lambda}(x) = 0.

the weyl group of C₃ decomposes into 10 conjugacy classes:

C₃: c₁ = {id},
    c₂ = {r₁, r₂, r₁r₂r₁, r₃r₂r₃, r₃r₁r₂r₁r₃, r₂r₁r₃r₂r₃r₂r₁r₂},
    c₃ = {r₃, r₂r₃r₂, r₁r₂r₃r₂r₁},
    c₄ = {r₁r₂, r₂r₁, r₃r₂r₁r₃, r₁r₃r₂r₃, r₃r₂r₃r₁r₂r₁, r₂r₁r₃r₂r₃r₁, r₃r₁r₂r₁r₃r₂, r₂r₁r₃r₂r₃r₂},
    c₅ = {r₁r₃, r₂r₃r₁r₂, r₁r₂r₃r₁r₂r₁, r₃r₂r₁r₃r₂r₃, r₁r₃r₂r₁r₃r₂r₃r₁, r₂r₁r₃r₂r₁r₃r₂r₃},
    c₆ = {r₃r₂, r₁r₃r₂r₁, r₂r₃r₂r₁, r₂r₃, r₁r₂r₃r₁, r₁r₂r₃r₂},
    c₇ = {r₃r₁r₂, r₁r₂r₃, r₂r₃r₁, r₃r₂r₁, r₂r₃r₁r₂r₁, r₁r₂r₁r₃r₂, r₃r₂r₁r₃r₂r₁r₃, r₃r₂r₁r₂r₃r₂r₃},
    c₈ = {r₂r₃r₂r₃, r₁r₃r₂r₃r₂r₁, r₂r₁r₃r₂r₃r₁r₂r₁},
    c₉ = {r₃r₂r₃r₂r₁, r₃r₂r₃r₁r₂, r₁r₂r₃r₂r₃, r₂r₃r₁r₂r₃, r₁r₃r₂r₃r₁r₂r₁, r₂r₃r₁r₂r₃r₂r₁},
    c₁₀ = {r₁r₂r₁r₃r₂r₁r₃r₂r₃}.

the explicit formulas and linear dependence relations for C₃ can be written down using definition (11) and corollary 6.
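as an illustration (again ours, not the paper's), the linear-dependence relations displayed above can be confirmed numerically by realizing W(A₃) ≅ S₄ through 4 × 4 permutation matrices and reading the characters off by cycle type; everything except the character table itself is an implementation choice.

# an illustrative check of the A3 linear-dependence relations via W(A3) ~ S4
import numpy as np
from itertools import permutations

def cycle_type(p):
    # sorted cycle lengths of a permutation given as a tuple i -> p[i]
    seen, lengths = set(), []
    for s in range(len(p)):
        if s in seen:
            continue
        n, j = 0, s
        while j not in seen:
            seen.add(j); j = p[j]; n += 1
        lengths.append(n)
    return tuple(sorted(lengths))

# classes c1..c5 of table 2 correspond to the cycle types below
table = {(1, 1, 1, 1): 0, (1, 1, 2): 1, (2, 2): 2, (1, 3): 3, (4,): 4}
chars = {1: [1, 1, 1, 1, 1], 2: [1, -1, 1, 1, -1], 3: [2, 0, 2, -1, 0],
         4: [3, 1, -1, 0, -1], 5: [3, -1, -1, 0, 1]}

perms = list(permutations(range(4)))
mats = [np.eye(4)[list(p)] for p in perms]       # permutation matrices
cols = [table[cycle_type(p)] for p in perms]     # class index of each element

def phi(k, lam, x):
    # phi^k_lambda(x) of definition (11) in the permutation realization
    return sum(chars[k][c] * np.exp(2j * np.pi * (m @ lam) @ x)
               for m, c in zip(mats, cols))

rng = np.random.default_rng(1)
lam = rng.normal(size=4); lam -= lam.mean()      # points of the a3 root space
x = rng.normal(size=4);  x -= x.mean()

# corollary 6: for k != l, sum_w chi_k(w) phi^l_{w lam}(x) = 0
for l in (3, 4, 5):
    for k in (1, 2, 3, 4, 5):
        if k != l:
            assert abs(sum(chars[k][c] * phi(l, m @ lam, x)
                           for m, c in zip(mats, cols))) < 1e-8
print("linear-dependence relations of section 5.2 hold numerically")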
6. concluding remarks
(1) the paper [7] was inspired by an extended possibility of applications of immanants in physics. we believe that this generalization will find its applications as well.
(2) in order to define the fourier transform using families of character functions, as in [5], we need to decide about the completeness of the orthogonal set of character functions.
(3) there are other directions of future research inspired directly by orbit functions. for example, in [13] orbit functions with the lowest labels are used as variables of orthogonal polynomials.

references
[1] a. klimyk, j. patera: orbit functions. sigma 2 (2006), 006, 60 pages, doi:10.3842/sigma.2006.006.
[2] a. klimyk, j. patera: antisymmetric orbit functions. sigma (symmetry, integrability and geometry: methods and applications) 3 (2007), 023, 83 pages, doi:10.3842/sigma.2007.023.
[3] r. v. moody, j. patera: orthogonality within the families of c-, s-, and e-functions of any compact semisimple lie group. sigma 2 (2006), 076, doi:10.3842/sigma.2006.076.
[4] r. v. moody, l. motlochová, j. patera: gaussian cubature arising from hybrid characters of simple lie groups. j. fourier anal. appl. 20 (2014), issue 6, doi:10.1007/s00041-014-9355-0.
[5] j. hrivnák, j. patera: on discretization of tori of compact simple lie groups. j. phys. a: math. theor. 42 (2009), doi:10.1088/1751-8113/42/38/385208.
[6] l. háková, j. hrivnák, j. patera: four families of weyl group orbit functions of b3 and c3. j. math. phys. 54 (2013), doi:10.1063/1.4817340.
[7] l. háková, a. tereszkiewicz: on immanant functions related to weyl groups of an. j. math. phys. 55 (2014), issue 11, doi:10.1063/1.4901556.
[8] g. james, m. liebeck: representations and characters of groups. second edition, cambridge university press, new york, 2001, viii+458 pp. isbn 0-521-00392-x.
[9] b. simon: representations of finite and compact groups. graduate studies in mathematics 10, american mathematical society, providence, ri, 1996, xii+266 pp. isbn 0-8218-0453-7.
[10] m. geck, g. pfeiffer: characters of finite coxeter groups and iwahori-hecke algebras. london mathematical society monographs, new series 21, the clarendon press, oxford university press, new york, 2000.
[11] j. hrivnák, l. motlochová, j. patera: on discretization of tori of compact simple lie groups ii. j. phys. a: math. theor. 45 (2012), doi:10.1088/1751-8113/45/25/255201.
[12] a. andreassian: macdonald characters of weyl groups of rank ≤ 4. ph.d. thesis, university of british columbia, 1973.
[13] m. nesterenko, j. patera, m. szajewska, a. tereszkiewicz: orthogonal polynomials of compact simple lie groups: branching rules for polynomials. j. phys. a 43 (2010), no. 49, 495207, 27 pp, doi:10.1088/1751-8113/43/49/495207.
a differential integrability condition for two-dimensional hamiltonian systems
ali mostafazadeh
department of mathematics, koç university, 34450 sarıyer, istanbul, turkey; correspondence: amostafazadeh@ku.edu.tr

abstract. we review, restate, and prove a result due to kaushal and korsch [phys. lett. a 276, 47 (2000)] on the complete integrability of two-dimensional hamiltonian systems whose hamiltonian satisfies a set of four linear second-order partial differential equations. in particular, we show that a two-dimensional hamiltonian system is completely integrable if the hamiltonian has the form h = t + v, where v and t are respectively harmonic functions of the generalized coordinates and of the associated momenta.

keywords: integrable system, complexified hamiltonian, constant of motion.

the study of integrable hamiltonian systems has been a subject of active research since the nineteenth century. despite the existence of an extensive literature on the subject, there is no well-known practical test of complete integrability of a given hamiltonian system, even in two dimensions. the present work elaborates on a simple-to-check sufficient condition for the complete integrability of such hamiltonian systems. this condition, which was originally outlined in ref. [1], identifies a special class of completely integrable hamiltonian systems in two dimensions. the following is a precise statement of this result.

theorem 1. let s be a hamiltonian system with phase space ℝ⁴ and a twice-continuously-differentiable time-independent hamiltonian h : ℝ⁴ → ℝ that satisfies

\frac{\partial^2 h}{\partial x_1^2} + \frac{\partial^2 h}{\partial x_2^2} = 0, \qquad \frac{\partial^2 h}{\partial p_1^2} + \frac{\partial^2 h}{\partial p_2^2} = 0,
\frac{\partial^2 h}{\partial x_1 \partial p_2} + \frac{\partial^2 h}{\partial x_2 \partial p_1} = 0, \qquad \frac{\partial^2 h}{\partial x_1 \partial p_1} - \frac{\partial^2 h}{\partial x_2 \partial p_2} = 0,   (1)

where h = h(x₁, p₁, x₂, p₂), and (x₁, p₁) and (x₂, p₂) are conjugate coordinate-momentum pairs. then s is completely integrable.

before giving a proof of this theorem, we wish to motivate its statement and the idea of its proof. let z := x + iy and p := p + iq be a pair of complex variables, where x, y, p, q ∈ ℝ, and h : ℂ² → ℂ is an analytic complex-valued function of (z, p) with real and imaginary parts u and v, so that

u(x, y, p, q) := \mathrm{re}[h(x + iy, p + iq)],   (2)
v(x, y, p, q) := \mathrm{im}[h(x + iy, p + iq)].   (3)

we can express the condition of the analyticity of h as the cauchy-riemann relations:

\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}, \qquad \frac{\partial u}{\partial p} = \frac{\partial v}{\partial q}, \qquad \frac{\partial u}{\partial q} = -\frac{\partial v}{\partial p}.   (4)
these in turn imply

\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0, \qquad \frac{\partial^2 u}{\partial p^2} + \frac{\partial^2 u}{\partial q^2} = 0,
\frac{\partial^2 u}{\partial x \partial p} + \frac{\partial^2 u}{\partial y \partial q} = 0, \qquad \frac{\partial^2 u}{\partial x \partial q} - \frac{\partial^2 u}{\partial y \partial p} = 0.   (5)

refs. [1, 2] show that the complexified hamilton equations,

\frac{dz}{dt} = \frac{\partial h}{\partial p}, \qquad \frac{dp}{dt} = -\frac{\partial h}{\partial z},   (6)

define a completely integrable hamiltonian system in the phase space ℝ⁴ whose dynamics may be determined using the standard hamilton equations, with u serving as the hamiltonian and v acting as a constant of motion. it is important to note that in order to establish this result one needs to use an appropriate symplectic structure (which is different from, though isomorphic to, the standard symplectic structure) on ℝ⁴ and the associated darboux coordinates for the problem [2, 3]. the latter can be taken to be the conjugate coordinate-momentum pairs (x, p) and (q, y), i.e., q and y must be viewed as coordinate and momentum variables, respectively [4]. the statement and proof of theorem 1 rely on the fact that conditions (1), which were initially derived in [1], can be used to identify h with the real part of an analytic function h : ℂ² → ℂ such that the hamiltonian system determined by h coincides with the one defined by (6).
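before turning to the proof, the following small sympy sketch (ours, not part of the original paper) illustrates the claim on a sample analytic h: its real part automatically satisfies conditions (5), and u and v poisson-commute with respect to the darboux pairs (x, p) and (q, y); the particular polynomial h is an arbitrary choice.

# symbolic check of conditions (5) and of {u, v} = 0 for a sample analytic h(z, p)
import sympy as sp

x, y, p, q = sp.symbols('x y p q', real=True)
z, P = x + sp.I * y, p + sp.I * q
h = P**2 / 2 + z**3 - z * P        # any polynomial in (z, P) is analytic
u = sp.re(sp.expand(h))
v = sp.im(sp.expand(h))

conditions_5 = [
    sp.diff(u, x, 2) + sp.diff(u, y, 2),
    sp.diff(u, p, 2) + sp.diff(u, q, 2),
    sp.diff(u, x, p) + sp.diff(u, y, q),
    sp.diff(u, x, q) - sp.diff(u, y, p),
]
assert all(sp.simplify(c) == 0 for c in conditions_5)

# poisson bracket for the darboux pairs (x, p) and (q, y): y is the momentum of q
pb = (sp.diff(u, x) * sp.diff(v, p) - sp.diff(u, p) * sp.diff(v, x)
      + sp.diff(u, q) * sp.diff(v, y) - sp.diff(u, y) * sp.diff(v, q))
assert sp.simplify(pb) == 0
print('conditions (5) and {u, v} = 0 verified for the sample h')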
given a hamiltonian h that fulfils (1), we can use (10) to construct an independent constant of motion i. in order to demonstrate this construction, we examine three concrete examples. in the first two of these, h satisfies the conditions of theorem 2. in all of them, c is an arbitrary real constant. example 1. consider the hamiltonian h := 12 (p 2 1 − p22 − x21 + x22), which is clearly of the form (12) for a pair of harmonic functions t and v . we can write (10) as ∂i ∂x2 = −x1, ∂i ∂x1 = −x2, ∂i ∂p2 = −p1, ∂i ∂p1 = −p2, (13) which we can easily integrate to find i = −(x1x2 + p1p2) + c. example 2. the hamiltonian h := ep1 cos p2 + e−x1 sin x2 also fulfils the hypothesis of theorem 2. for this hamiltonian, (10) takes the form ∂i ∂x2 = −e−x1 sin x2, ∂i ∂x1 = −e−x1 cos x2, ∂i ∂p2 = −ep1 cos p2, ∂i ∂p1 = −ep1 sin p2. (14) again we can integrate these equations and obtain i = e−x1 cos x2 − ep1 sin p2 + c. example 3. consider the hamiltonian h := 1 4 (x41 − 6x 2 1x 2 2 + x 4 2)(p 4 1 − 6p 2 1p 2 2 + p 4 2) + 4x1x2(x21 − x 2 2)p1p2(p 2 1 − p 2 2). it is not difficult to check that it satisfies (1), but that it cannot be expressed in the form (12). integrating (10) for this choice of h gives i = (x1p1 + x2p2)(x2p1 + x1p2) [ (x1 + x2)p1 − (x1 − x2)p2 ][ (x1 − x2)p1 + (x1 + x2)p2 ] + c. remark. we became aware of ref. [1] when this project was complete and the first draft of the present paper was already written. subsequently we revised the paper to give due credit to the authors of [1], who discussed an equivalent version of theorem 1 although not in the form of an explicitly stated mathematical theorem. conditions (1) coincide with eqs. (10) of ref. [1] with j = 1 provided that we perform the canonical transformation: x1 → x1, p1 → p1, x2 → −p2, and p2 → x2. 140 vol. 54 no. 2/2014 a differential integrability condition acknowledgements this work has been supported by the turkish academy of sciences (tüba). references [1] r. s. kaushal and h. j. korsch, “some remarks on complex hamiltonian systems,” phys. lett. a 276, 47-51 (2000). doi: 10.1016/s0375-9601(00)00647-2 [2] a. mostafazadeh, “real description of classical hamiltonian dynamics generated by a complex potential,” phys. lett. a 357, 177-180 (2006). doi: 10.1016/j.physleta.2006.04.045 [3] t. curtright and l. mezincescu, “biorthogonal quantum systems,” j. math. phys. 41, 092106 (2007). doi: 10.1063/1.2196243 [4] a. l. xavier jr. and m. a. m de aguiar, “complex trajectories in the quartic oscillator and its semiclassical coherent-state propagator,” ann. phys. (n.y.) 252, 458-478 (1996). doi: 10.1006/aphy.1996.0141 141 http://dx.doi.org/10.1016/s0375-9601(00)00647-2 http://dx.doi.org/10.1016/j.physleta.2006.04.045 http://dx.doi.org/10.1063/1.2196243 http://dx.doi.org/10.1006/aphy.1996.0141 acta polytechnica 54(2):139–141, 2014 acknowledgements references ap1_03.vp 1 introduction the accumulation of dust and dirt on the impervious surface of catchments such as highway surface may be attributed to numerous sources. vehicles are a source of oil, petrol and eroded material from car bodies and tyres. industry and transport are also sources of dust and gaseous material, which is emitted into the atmosphere and subsequently settles out or is washed out by rainfall. assessment of the quantity and quality of stormwaters runs from the surface of these catchments can benefit from applying the stormwater management model. 
models for simulation of stormwater quantity and quality vary widely in terms of their complexity and their data, personnel, and computational requirements. the simplest models calculate pollutant loads from the storm runoff volume and an event mean concentration (emc). at the other extreme there are simulation models that attempt to simulate the buildup and washoff of pollutants from the watershed and their transport through the drainage system to the point of interest [3], [5]. the goal of this study is to predict highway runoff characteristics using data from different highway catchments. the stormwater management model (swmm) was used in the simulation process, and the predicted data were verified using monitoring data collected from the prague-plzeň highway. the analysis focused on total suspended solids (tss), because they are widely viewed as an indicator of stormwater quality and are correlated with other stormwater quality constituents [6].

stormwater management model (swmm)
swmm is a well-known stormwater runoff simulation model developed in the united states between 1969 and 1971. since then it has been used in scientific and practical applications in many countries. the swmm is a deterministic, spatially distributed model for calculation of runoff quantity and quality [4]. the input consists of a time series of rainfall data (and/or snowmelt) and a set of parameters describing the physical properties of the catchment. structurally, the model can be divided into four main modules, which can run separately or combined together:

1. the runoff module is critical to swmm simulation. this module receives meteorological data from user-defined hyetographs (rainfall intensity vs. time), antecedent conditions, land use and topography, and then simulates the rainfall-runoff process using a non-linear reservoir approach. surface runoff is calculated from rainfall excess, surface detention and evaporation. quality processes in this module include generation of surface runoff constituent loads through a variety of options: a) buildup of constituents during dry weather and washoff during wet weather, b) the rating curve approach, in which the load is proportional to the flow rate, and c) constant concentration. the runoff module produces hydrographs and pollutographs at inlet locations, which can be analyzed or used as an interface file to subsequent modules.

2. the transport module is the subsequent module, and performs the detailed flow and pollutant routing through the sewer system. the flow routing is accomplished using the kinematic wave method, while quality processes include first-order decay and the simulation of scour and deposition within the sewer system. the transport module uses inlet hydrographs and pollutographs, generated either from the runoff module via an interface file or from the user-defined option, as its input. this module deals with flow quantity and quality, whereas the extran module deals with flow quantity only.

3. the extended transport (extran) module provides the swmm with dynamic wave simulation capability [5]. this module is the most comprehensive simulation program available in the public domain for drainage system hydraulics, and simulates branched or looped networks. the flow routing in this module is accomplished by solving the complete dynamic wave equations.

4. the storage/treatment (s/t) module was developed to simulate the routing of flows and pollutants through dry- or wet-weather storage/treatment plants or units containing up to 5 units or processes.
the primary objectives of the storage/treatment module are to: a) simulate the quality improvement provided by each process; b) simulate the handling of sludge; and c) provide estimates of capital, operation, and maintenance costs. the s/t module simulates the removal of pollutants in the control devices by first-order decay, removal functions, and sedimentation dynamics.

in this study only the runoff and transport modules will be used, in the simulation of total suspended solids (tss) and biochemical oxygen demand (bod5). the overland flow, buildup and washoff relationships, on which swmm depends, are discussed below.

buildup and washoff relationships
the runoff module was developed to simulate both the quantity and the quality of runoff in a drainage basin and the routing of flows and contaminants to the major sewer lines. it forms the source of runoff hydrographs and quality pollutographs for most swmm applications. buildup and washoff relationships are physically based, and their use for simulating constituent concentrations is conceptually appealing. according to the buildup-washoff model, a supply of constituents is assumed to build up on the land surface during periods of dry weather. with a subsequent storm, some of this material is then washed off into the drainage system. buildup may depend on season, land use, traffic, and so forth. washoff may be a function of rainfall intensity, bottom shear stress, and other factors. rates of buildup of various stormwater constituents on impervious surfaces have been measured in several studies. however, for some environments, stormwater data suggest that constituent buildup may not be correlated with the length of the antecedent dry period [7]. it is also likely that there is significant variation in the rate and the maximum accumulation of constituents, depending on climatic and other site-specific factors. during a subsequent storm event, depending on the storm duration and intensity, some or all of the constituents will be washed from the watershed with the storm runoff. the constituents will again accumulate during the dry days following the event, in addition to the amount of constituent load remaining on the watershed at the end of the previous storm event. many stormwater models assume that the rate of washoff is a function of the amount of constituent present in the watershed.
this formulation results in higher predicted concentrations at the beginning of the storm event.

2 methodology
to achieve the aim of this study, two types of data were used in the analysis process: the data simulated using the swmm model, and the data measured at the in situ retention/detention units. the equations used in the prediction process with swmm and the characteristics of the in situ retention/detention units are discussed in the following subsections.

simulated data
swmm can simulate both runoff quantities, using the overland flow model, and runoff quality, using buildup-washoff models. the equations used are discussed in the following.

overland flow model
the runoff module forms the origin of flow generation within swmm. in order to better understand the conversion of rainfall excess into overland flow (runoff), a brief description of the equation used is given. if the subcatchment width w is assumed to represent a true prototype width of overland flow, then the reservoir will behave as a rectangular catchment, as sketched in fig. 1. the outflow is generated using manning's equation, as shown in equation (1):

q = \frac{w}{n}\,(d - d_p)^{1.67}\, s^{0.5},   (1)

where q = outflow rate [m³/s], w = subcatchment width [m], n = manning's roughness coefficient, d = water depth [m], d_p = depth of depression storage [m], and s = subcatchment slope [m/m].

fig. 1: non-linear reservoir model for a subcatchment (rainfall and evaporation act on the water depth d above the depression storage d_p; the losses are infiltration and the outflow q).

buildup model
in the swmm model, buildup and washoff of solids are both approximated using an exponential distribution. the buildup of solids is a function of the antecedent dry days, according to equation (2):

p_t = p_i + (p\,a - p_i)\,(1 - e^{-kt}),   (2)

where p_t = accumulation of solids up to day t [kg], p_i = initial solids load on the surface (not washed off by the previous storm) [kg], p = maximum buildup of solids (2.4 kg/ha), a = drainage area [ha], k = exponential buildup factor (0.4) [days⁻¹], and t = antecedent dry days. the maximum solids buildup load can be adjusted to provide similar long-term solids loading rates. in the swmm model, once the pollutant buildup reaches the maximum limit (2.4 kg/ha), additional buildup is not allowed (it is assumed to be re-suspended and driven off the surface by wind).

washoff model
washoff is the process of erosion or solution of constituents from a subcatchment surface during a period of runoff. if the water depth is more than a few millimeters, the process of erosion may be described by sediment transport theory, in which the mass flow rate of sediment is proportional to the flow and the bottom shear stress. washoff is estimated using equation (3):

p_{off} = p_t\,(1 - e^{-kv}),   (3)

where p_off = cumulative amount washed off at time t [kg], p_t = initial amount of constituent on the surface, calculated from equation (2) [kg], k = exponential decay factor (0.2) [mm⁻¹], and v = volume of accumulated runoff from the surface [mm]. the exponential decay factor k was based on a review of previous literature, which indicates that k-values range from 0.03 to 0.55 mm⁻¹ [1]. an illustrative implementation of equations (2) and (3) is sketched below.
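the following short python sketch (ours, for illustration only) implements the buildup and washoff relations (2) and (3) with the parameter values quoted above; the dry period and the storm depth in the example are assumed inputs, not data from this study.

# illustrative implementation of equations (2) and (3)
import math

P_MAX_PER_HA = 2.4   # maximum buildup [kg/ha], as quoted in the text
K_BUILDUP = 0.4      # exponential buildup factor [1/day]
K_WASHOFF = 0.2      # exponential washoff decay factor [1/mm]

def buildup(p_initial, area_ha, dry_days):
    # equation (2): solids accumulated after `dry_days` antecedent dry days [kg]
    p_max = P_MAX_PER_HA * area_ha
    return p_initial + (p_max - p_initial) * (1.0 - math.exp(-K_BUILDUP * dry_days))

def washoff(p_t, runoff_mm):
    # equation (3): cumulative mass washed off by `runoff_mm` of runoff [kg]
    return p_t * (1.0 - math.exp(-K_WASHOFF * runoff_mm))

# example: 1 ha of pavement, 7 dry days, then a storm producing 15 mm of runoff
p_t = buildup(p_initial=0.0, area_ha=1.0, dry_days=7.0)
print(f"buildup after 7 dry days: {p_t:.2f} kg")
print(f"washed off by 15 mm of runoff: {washoff(p_t, 15.0):.2f} kg")
# the residue p_t - washoff(...) becomes p_initial for the next dry period

for these inputs the sketch gives roughly 2.25 kg of buildup, of which about 2.14 kg is removed by the storm; note how close the washed-off mass is to the available mass, which is the exponential model's expression of the first flush discussed later.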
the width of the subcatchment was assumed to be equal to the square root of the area. measured data different storage/sediment units along the prague-plzeň (d5) highway in the czech republic were selected for monitoring runoff from highways. the prague-plzeň (d5) highway is a divided 6-lane highway with average daily traffic (adt) in the range 16000–22000 vehicles / day / both directions. these units, as shown in fig. 2, collect runoff from a 100% impervious area (highway surface). the tested sample was collected from the influent highway runoff into the monitored units. these samples were collected manually using metal bottles. runoff samples for suspended solids and organic analysis were filtered through pre-weighed, pre-combusted glass fiber filters. after drying at 105 °c temperature, the filters were re-weighed to determine the amount of suspended solids in each sample. 3 results and discussion this study used one-year and two-year rainfall data to predict the runoff quantity and quality process. this study evaluates the effect of these data on runoff volume and the concentration of total suspended solids (tss), and biochemical oxygen demands (bod5). the results are analyzed in the following sections. data verification due to the shortage of measured data and the necessity to study some important concepts such as runoff peak flow, pollutant accumulation and first flush phenomena, simulated data were used for studying these concepts. then, the simulated data were verified with available measured data to measure the confidence in these simulated data. a comparison between the measured and simulated concentrations of some constituents is shown in tab. 2. the results indicate that the concentration of tss and bod5 is very close to those measured from the d5 highway in czech republic, and the results lie within the total range of measured data. the simulated data therefore has a degree of confidence. with this confidence degree the studied concepts can be accepted and analyzed. quantity simulation the overflow volume and the time of the discharged peak overflow will be analyzed in this section. the used rainfall data hyetograph and the accumulated runoff volume are shown in fig. 3. the šifalda rainfall data hyetograph was used, and the measured intensity values were recorded at a rainfall station near the studied area. the effect of these data on peak flow at the sewer system outlet is also shown in fig. 3. the figure indicates that the runoff volume increases as either the subcatchment area increases or the rainfall intensity increases under the effect of the same conditions. consequently, the diameter of the conveyance system, which was designed for maximum flow less than full flow, should be increased depending on the runoff volume. the figure also shows that the runoff peak flow value increases as rainfall intensity increases and this occurs approximately at the same time after the beginning of a storm event. the value and the time of peak flow affects the design of retention/detention units. © czech technical university publishing house http://ctn.cvut.cz/ap/ 55 acta polytechnica vol. 41 no. 3/2001 parameter value area [ha] variable imperviousness 100 % width [m] variable slope variable impervious depression storage [mm] 1.00 impervious manning’s value n 0.011 maximum infiltration rate [mm/hr] 25 minimum infiltration rate [mm/hr] 6 decay rate of infiltration [s�1] 0.00115 table 1: swmm parameters 4 .4 m effluent influent filter 3.5 1.58–17 m2.0 i ii iii fig. 2: storage/sediment units. 
quality simulation
the buildup process is a function of antecedent dry days, traffic volume, and seasonal considerations, whereas washoff is the process of removing contaminants from the subcatchment surface during a period of runoff. the tss and bod5 contaminations washed off the pavement surface and transported into the sewer system are simulated at the outlet, as shown in figs. 4 and 5. the figures indicate that the loads of tss and bod5 increase as the rainfall intensity increases. the higher loads of tss and bod5 are explained by the intensity of the rainfall: higher flow rates, associated with higher rainfall intensity, moved more of the heavier dirt particles than smaller storm events did. the contaminations deposited during the dry period preceding the rain and the contaminations washed off during the storm event together contribute to the total measured contamination in the runoff. the figures also show that the peak washoff rate and its time are higher for the one-year rainfall data than for the two-year rainfall data. this may be due to the high runoff volume with the two-year rainfall data, which washes off most of the pollutant in the early stages of runoff.

fig. 4: loads of tss washed from the highway (washoff rate [g/s] vs. duration [min] and accumulated tss [kg] along the system length [m], n = 1.0 and n = 0.5).
fig. 5: loads of bod5 washed from the highway (washoff rate [g/s] vs. duration [min] and accumulated load [kg] along the system length [m], n = 1.0 and n = 0.5).

first flush
the first flush is an important phenomenon for receiving water impacts. a period of high concentration is evident for each constituent at the beginning of the storm. the period of high concentration, however, occurs simultaneously with the rising limb of the hydrograph and ends at the time of concentration of the watershed. it is difficult to ascertain from the graph whether the high concentration at the beginning of the storm results from a large amount of material being washed from the highway early in the storm event (i.e., a true first flush) or from the smaller volume of water in the catchment (watershed) at the start of the storm. the nature of the contaminants and their solubilities in water may affect the magnitude and extent of the first flush effect. fig. 6 shows the relationship between the runoff volume and the scoured/washed-off concentrations. as shown in the figure, the contaminations deposited on the pavement surface are scoured in the first stages of the rain event, or at least their concentrations decrease with time. the figure shows that the contamination is almost totally washed off within less than half of the overflow duration, and this explains what happens during the first flush. the simulated data are in agreement with various other studies, which show that the concentration of runoff contaminations is high in the first runoff flow during a storm event. this means that most of the surface contamination is washed off in the early stages of a storm event, depending on the intensity of the storm. consequently, the concentration of the contaminations decreases with increasing storm duration and runoff volume. tab. 3 shows the percentage of the total tss and bod5 loads washed off at the different measured flow volumes investigated in this study.

fig. 6: first flush phenomena (tss and bod5 concentrations [mg/l] and runoff vs. duration [min], n = 1.0 and n = 0.5).

table 3: percent of washed-off contaminations versus total loads of these contaminations, using 1-year rainfall data
  measured constituent    percent of runoff volume
                          10 %      43 %      76 %
  tss [%]                 75.68     98.70     99.0
  bod5 [%]                74.89     98.0      99.0
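the first-flush behaviour is already implicit in the exponential washoff of equation (3). the short sketch below (ours, illustrative only) evaluates the fraction of the event load removed as a function of the fraction of the runoff volume that has passed; the total event depth is an assumed value, so the numbers are not expected to reproduce table 3, which reflects intensity-varying storms.

# first-flush curve implied by the exponential washoff of equation (3)
import math

K = 0.2            # washoff decay factor [1/mm], as in equation (3)
V_TOTAL = 30.0     # assumed total event runoff depth [mm]

total = 1.0 - math.exp(-K * V_TOTAL)   # fraction of buildup removed by the whole event
for frac in (0.10, 0.43, 0.76, 1.00):
    removed = (1.0 - math.exp(-K * frac * V_TOTAL)) / total
    print(f"{frac:4.0%} of runoff volume -> {removed:6.1%} of the event load")
# a steeper intensity at the start of the storm (equivalently, a larger effective k)
# pushes the curve towards the stronger first flush reported in table 3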
consequently the concentration of the contaminations decreases with increasing storm duration and runoff volume. tab. 3 shows the concentration of tss and bod5 for different measured flow volumes investigated in this study. 4 conclusion assessment of stormwater quantity and quality runs from the surface of a catchment can benefit from applying the stormwater management model. the stormwater manage© czech technical university publishing house http://ctn.cvut.cz/ap/ 57 acta polytechnica vol. 41 no. 3/2001 0 50 100 150 200 250 300 38 17 3 30 8 44 3 56 8 70 7 81 6 94 0 system length [m] n = 1.0 n = 0.5 a c c u m u la te d t s s [k g ] tss 0 100 200 300 400 500 600 700 800 5 15 25 35 45 55 65 75 duration [min.] n = 1.0 n = 0.5 w a s h o ff ra te [g /s ] fig. 4: loads of tss washed from highway bod5 0 0.5 1 1.5 2 2.5 3 3.5 5 15 25 35 45 55 65 75 duration [min.] n = 1.0 n = 0.5 w a s h o ff ra te [g /s ] 0 2 4 6 8 10 12 38 17 3 30 8 44 3 56 8 70 7 81 6 94 0 system length [m] n = 1.0 n = 0.5 fig. 5: loads of bod5 washed from highway measured constituents percent of runoff volume of runoff 10 % 43 % 76 % tss [mg/l] 75.68 98.70 99.0 bod5 [mg/l] 74.89 98.0 99.0 table 3: percent of washed off contaminations versus total loads of these contaminations using 1-year rainfall data ment model (swmm) is a well-known stormwater runoff-simulating model. the swmm was used in a simulation process and the predicted data were verified using measured data collected from the prague-plzeň highway for both total suspended solids and biochemical oxygen demands. the analysis of the results indicates that the swmm makes a useful prediction of both runoff quality and runoff quantity. the simulated results were compared and verified using measured data from the prague-plzeň highway, czech republic. the verification of the data indicates that the simulated data are very close to the measured data, and this gives a high degree of confidence in the simulation process. using swmm results, the effect of first flush concentration on a receiving water and/or on a retention unit can be evaluated to choose bmps suitable for the studied conditions. peak flow volume and duration are required for the design of retention units. acknowledgement we would like to thank and express our appreciation to ing. kateřina slavíčková for her endless help. this research has been supported by grant no.: vz j04198: 2111100005. 58 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 41 no. 3/2001 tss 0 200 400 600 800 1000 1200 1400 1600 1800 1 6 11 16 21 26 31 36 41 46 51 56 duration [min.] n = 1.0 n = 0.5 c o n c e n tr a ti o n [m g /l ] bod5 6.4 6.9 7.4 7.9 8.4 1 6 11 16 21 26 31 36 41 46 51 56 duration [min.] n = 1.0 n = 0.5 c o n c e n tr a ti o n [m g /l ] 0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 5 15 25 35 45 55 65 75 duration [min.] n = 1.0 n = 0.5 fig. 6: first flush phenomena references [1] charbeneau, r., barrett, m.: evaluation of methods for estimating stormwater pollutant loads. water environment research, vol. 70, no. 7/1998 [2] davies, t. t.: swmm windows interface user’s manual 4.3. office of water, epa-823-b-95-008, united states environmental protection agency, 1995 [3] huber, w. c., dickinson, r. e.: stormwater management model version 4, user’s manual. epa–600/3–88–001a, usepa, athens, ga, 1988 [4] niemczynowicz, j.: mathematical modeling. department of water resources engineering, university of lund, sweden, 1984 [5] roesner, l. a., adrich, j. a., dickinson, r. 
e.: stormwater management model version 4, user’s manual. epa-600/3-88-001a, usepa, athens, ga, 1988 [6] sansalone, j. j., buchberger, s. g.: partitioning and first flush of metals in urban roadway stormwater. environmental engineering, 1997, pp. 123–134 [7] sutherland, r. c., jelen s. l.: sophisticated stormwater quality modeling is worth the effort. in advances in modeling the management of stormwater impacts, w. james, chi, guelph, ca, 1995 b.sc. ashraf el-shahat elsayed, msc e-mail: ashraf@fsv.cvut.cz prof. ing. alexander grünwald, csc. phone: +420 2 2435 4638 e-mail: grunwald@fsv.cvut.cz ing. marcela synáčková, csc. phone: +420 2 2435 4604 e-mail: synackov@fsv.cvut.cz ing. marek slavíček phone: +420 2 2435 4608 e-mail: marek.slavicek@fsv.cvut.cz department of sanitary engineering czech technical university in prague faculty of civil engineering thákurova 7, 16629 praha 6, czech republic © czech technical university publishing house http://ctn.cvut.cz/ap/ 59 acta polytechnica vol. 41 no. 3/2001 << /ascii85encodepages false /allowtransparency false /autopositionepsfiles true /autorotatepages /none /binding /left /calgrayprofile (dot gain 20%) /calrgbprofile (srgb iec61966-2.1) /calcmykprofile (u.s. web coated \050swop\051 v2) /srgbprofile (srgb iec61966-2.1) /cannotembedfontpolicy /error /compatibilitylevel 1.4 /compressobjects /tags /compresspages true /convertimagestoindexed true /passthroughjpegimages true /createjobticket false /defaultrenderingintent /default /detectblends false /detectcurves 0.0000 /colorconversionstrategy /cmyk /dothumbnails false /embedallfonts true /embedopentype false /parseiccprofilesincomments true /embedjoboptions true /dscreportinglevel 0 /emitdscwarnings false /endpage -1 /imagememory 1048576 /lockdistillerparams false /maxsubsetpct 100 /optimize true /opm 1 /parsedsccomments true /parsedsccommentsfordocinfo true /preservecopypage true /preservedicmykvalues true /preserveepsinfo true /preserveflatness true /preservehalftoneinfo false /preserveopicomments true /preserveoverprintsettings true /startpage 1 /subsetfonts true /transferfunctioninfo /apply /ucrandbginfo /preserve /useprologue false /colorsettingsfile () /alwaysembed [ true ] /neverembed [ true ] /antialiascolorimages false /cropcolorimages true /colorimageminresolution 300 /colorimageminresolutionpolicy /ok /downsamplecolorimages true /colorimagedownsampletype /bicubic /colorimageresolution 300 /colorimagedepth -1 /colorimagemindownsampledepth 1 /colorimagedownsamplethreshold 1.50000 /encodecolorimages true /colorimagefilter /dctencode /autofiltercolorimages true /colorimageautofilterstrategy /jpeg /coloracsimagedict << /qfactor 0.15 /hsamples [1 1 1 1] /vsamples [1 1 1 1] >> /colorimagedict << /qfactor 0.15 /hsamples [1 1 1 1] /vsamples [1 1 1 1] >> /jpeg2000coloracsimagedict << /tilewidth 256 /tileheight 256 /quality 30 >> /jpeg2000colorimagedict << /tilewidth 256 /tileheight 256 /quality 30 >> /antialiasgrayimages false /cropgrayimages true /grayimageminresolution 300 /grayimageminresolutionpolicy /ok /downsamplegrayimages true /grayimagedownsampletype /bicubic /grayimageresolution 300 /grayimagedepth -1 /grayimagemindownsampledepth 2 /grayimagedownsamplethreshold 1.50000 /encodegrayimages true /grayimagefilter /dctencode /autofiltergrayimages true /grayimageautofilterstrategy /jpeg /grayacsimagedict << /qfactor 0.15 /hsamples [1 1 1 1] /vsamples [1 1 1 1] >> /grayimagedict << /qfactor 0.15 /hsamples [1 1 1 1] /vsamples [1 1 1 1] >> /jpeg2000grayacsimagedict << /tilewidth 256 /tileheight 256 
/quality 30 >> /jpeg2000grayimagedict << /tilewidth 256 /tileheight 256 /quality 30 >> /antialiasmonoimages false /cropmonoimages true /monoimageminresolution 1200 /monoimageminresolutionpolicy /ok /downsamplemonoimages true /monoimagedownsampletype /bicubic /monoimageresolution 1200 /monoimagedepth -1 /monoimagedownsamplethreshold 1.50000 /encodemonoimages true /monoimagefilter /ccittfaxencode /monoimagedict << /k -1 >> /allowpsxobjects false /checkcompliance [ /none ] /pdfx1acheck false /pdfx3check false /pdfxcompliantpdfonly false /pdfxnotrimboxerror true /pdfxtrimboxtomediaboxoffset [ 0.00000 0.00000 0.00000 0.00000 ] /pdfxsetbleedboxtomediabox true /pdfxbleedboxtotrimboxoffset [ 0.00000 0.00000 0.00000 0.00000 ] /pdfxoutputintentprofile (none) /pdfxoutputconditionidentifier () /pdfxoutputcondition () /pdfxregistryname () /pdfxtrapped /false /createjdffile false /description << /ara /bgr /chs /cht /dan /deu /esp /eti /fra /gre /heb /hrv (za stvaranje adobe pdf dokumenata najpogodnijih za visokokvalitetni ispis prije tiskanja koristite ove postavke. stvoreni pdf dokumenti mogu se otvoriti acrobat i adobe reader 5.0 i kasnijim verzijama.) /hun /ita /jpn /kor /lth /lvi /nld (gebruik deze instellingen om adobe pdf-documenten te maken die zijn geoptimaliseerd voor prepress-afdrukken van hoge kwaliteit. de gemaakte pdf-documenten kunnen worden geopend met acrobat en adobe reader 5.0 en hoger.) /nor /pol /ptb /rum /rus /sky /slv /suo /sve /tur /ukr /enu (use these settings to create adobe pdf documents best suited for high-quality prepress printing. created pdf documents can be opened with acrobat and adobe reader 5.0 and later.) /cze >> /namespace [ (adobe) (common) (1.0) ] /othernamespaces [ << /asreaderspreads false /cropimagestoframes true /errorcontrol /warnandcontinue /flattenerignorespreadoverrides false /includeguidesgrids false /includenonprinting false /includeslug false /namespace [ (adobe) (indesign) (4.0) ] /omitplacedbitmaps false /omitplacedeps false /omitplacedpdf false /simulateoverprint /legacy >> << /addbleedmarks false /addcolorbars false /addcropmarks false /addpageinfo false /addregmarks false /convertcolors /converttocmyk /destinationprofilename () /destinationprofileselector /documentcmyk /downsample16bitimages true /flattenerpreset << /presetselector /mediumresolution >> /formelements false /generatestructure false /includebookmarks false /includehyperlinks false /includeinteractive false /includelayers false /includeprofiles false /multimediahandling /useobjectsettings /namespace [ (adobe) (creativesuite) (2.0) ] /pdfxoutputintentprofileselector /documentcmyk /preserveediting true /untaggedcmykhandling /leaveuntagged /untaggedrgbhandling /usedocumentprofile /usedocumentbleed false >> ] >> setdistillerparams << /hwresolution [2400 2400] /pagesize [612.000 792.000] >> setpagedevice acta polytechnica doi:10.14311/ap.2013.53.0913 acta polytechnica 53(6):913–922, 2013 © czech technical university in prague, 2013 available online at http://ojs.cvut.cz/ojs/index.php/ap hybrid shearwall system — shear strength at the interface connection ulrich wirtha, nuri shiralib, vladimír křístekc,∗, helmut kurthc a pww-ingenieurbüro für bauwesen, darmstadt b roßdorf c faculty of civil engineering, czech technical university in prague ∗ corresponding author: vladimirkristek@seznam.cz abstract. 
based on a series of alternating, displacement-controlled load tests on ten one-third scale models, carried out to study the behaviour of the interface of a hybrid shear wall system, it was proved that the concept of hybrid construction in earthquake-prone regions is feasible. the hybrid shear-wall system consists of typical reinforced concrete shear walls with composite edge members or flanges. ten different anchorage bar arrangements were developed and tested to evaluate the column-shearwall interface behaviour under cyclic shear forces acting along the interface between the column and the wall panel. finite element models of the test specimens were developed that were capable of capturing the integrated concrete and reinforcing steel behaviour in the wall panels. special models were developed to capture the interface behaviour between the edge columns and the shear wall. a comparison between the experimental results and the numerical results shows excellent agreement, and clearly supports the validity of the model developed for predicting the non-linear response of the hybrid wall system under various load conditions.

keywords: aseismic design, earthquake, hybrid structural system, reinforced concrete, shearwall.

1. introduction
although current codes cover well-defined rules for the aseismic design of reinforced concrete buildings, earthquakes continue to cause extensive structural damage, particularly in developing countries. in most cases, the significant damage is triggered by column failure in the lower stories of rc frames. in addition, the failure of reinforced concrete shear walls has been found to be caused by local failure of the edge-member columns (flanges). in most instances, column failures of this kind are triggered by an inadequate layout of the stirrups in the column end-regions. the consequent lack of confinement of the concrete core, and the associated buckling of the longitudinal reinforcing steel, lead to the observed failures. considering these observations, a new hybrid structural system for both moment-resistant frame and shearwall buildings has been proposed by bouwkamp (1992) and studied at the darmstadt university of technology. basically, this system is characterized by replacing the typical reinforced concrete columns or shearwall edge members with concrete-filled rectangular steel tubes acting as composite columns. conceptually, the remainder of the structural system is planned to be identical to the layout of a typical reinforced concrete building. basically, the tubular section provides direct confinement of the concrete core and serves as longitudinal reinforcement. in fact, depending on the thickness of the column wall, additional typical column reinforcement may not be necessary. of course, it is also possible to minimize the tube-wall thickness by only satisfying the confinement requirements, and to design the longitudinal reinforcement as for a normal, non-composite column.

figure 1. view of hsw (elevation and cross-section: concrete-filled steel tube edge members, rc wall panel, reinforcement and foundation floor).

however, the major objective in developing the proposed hybrid structural system is to design a connecting interface between the composite columns and the reinforced concrete elements. optimization of the composite column design therefore does not form the subject of our study.
instead, extensive experimental and numerical studies of the seismic behavior of the proposed hybrid system, as affected by the reinforcement in the interface regions, have been performed in darmstadt. a research program on the design and the seismic response of hybrid moment-resistant frames was carried out successfully by ashadi (1994, 1997). the results showed that this system can be used effectively in ductile moment-resistant frames under earthquake exposure. parallel studies have focused on the potential use of this system for shearwall-type buildings. in this case, too, the main effort has focused on the design of the reinforcement at the interface between the composite column, or flange, and the concrete shear wall. various design solutions have been studied experimentally under cyclic shear loads acting along the column-wall interface. experimental findings and recommendations for the effective and economical design of the interface connection (ifc) of a hybrid shearwall (hsw) for use in regions of high seismicity are presented here.

considering the construction process of the specific hybrid system, the composite columns are to be prefabricated either at a special construction yard or on site. before the concrete is poured into the hollow column sections, the interface reinforcement has to be installed through predrilled holes in the tube column walls. this reinforcement requires hooks on one end, for anchorage into the column concrete core, and a sufficient length on the other end, for connection (lapped bars) to either the common beam or slab reinforcement, in the case of moment-resistant frames, or to the typical shearwall reinforcement. the composite concrete-filled rectangular steel tubes (cfrst) are erected in a typical steel construction manner. the design of the footing of the first-floor columns, which extend over 1.5 or 2.5 stories, can follow typical steel design procedures. because the possible additional longitudinal column reinforcement can extend from the bottom of the column, alternative connection designs between column and foundation are possible. subsequent columns are prefabricated in 2-story long sections, and are erected following typical steel construction procedures.

2. experimental program
several alternative interface designs have been developed and tested in working towards an optimum design solution for the interface connection between the rc wall panel and the composite edge member. for this purpose, a 6-bay by 4-bay, 8-story high hybrid shearwall building with plan dimensions of 36 m by 20 m and a height of 28 m was designed according to the provisions of ec-8, zone 4; the design of the shear wall reinforcement was based on ec-2. the resulting design in fact also satisfies the requirements of ubc 1991, zone 3, and of the us code aci-318-89. the structural layout, as shown in figure 2, calls for four shear walls oriented parallel to each of the two main axes. the elevation of a 6 m wide, 8-story high shear wall is shown in figure 3.

figure 2. plan view of building.
figure 3. elevation view of hsw.

an equivalent static analysis, based on the typical code-specified force distribution, was used in the design of the prototype structure. because of cost considerations and test capacity limitations, a reduction in the scale of the shear wall system was necessary. as the response of the model in studying alternative rc wall panel-to-edge member interface design solutions should be as representative as possible of the cyclic response of the actual shearwall, a reduction in scale to not more than one-third of the actual wall was considered acceptable. this meant that it was possible to use common, readily available reinforcing bars, which would properly model the tensile force resistance of the reinforcement. however, because of space and test-capacity limitations, it was not possible to test an entire one-third scale shearwall model, or even a 3-story high portion of it (see fig. 4). hence, with the aim of studying alternative interface connection designs between the column and the rc wall panel, only the edge portion of the first-story shear wall was selected for our study.

figure 4. 1/3 scale of hybrid shearwall (hsw).

a perspective view of a typical test specimen is shown in figure 5; because of the laboratory test conditions, it was necessary to mount the actuator
as the response of the model in studying alternative rc wall panel-to-edge member interface design solutions should be as representative as possible of the cyclic response of the actual shearwall, a reduction in scale to not more than one-third of the actual wall was considered acceptable. this meant that it was possible to use common readily available reinforcing bars, which would properly model the tensile force resistance of the reinforcement. however, because of space and test-capacity limitations, it was not possible to test an entire one-third scale shearwall model, or even a 3-story high portion of it (see fig. 4). hence, with the aim of studying alternative interface connection designs between column and rc wall panel, only the edge portion of the first-story shear wall was selected for our study. a perspective view of a typical test specimen is shown in figure 5; because of the laboratory test conditions, it was necessary to mount the actuator 914 vol. 53 no. 6/2013 hybrid shearwall system figure 2. plan view of building. figure 4. 1/3 scale of hybrid shearwall (hsw). horizontally. hence, the composite edge member — with outside dimensions of 160 mm by 160 mm and a tube-wall thickness of 5.6 mm — had to be placed in horizontal position. the wall of the test specimen, 110 cm in length and 68 cm in height (being a portion of the shearwall width, measured to the edge-member centre line), represents about one-third of the firststory model shearwall. the wall panel thickness had figure 5. general view of test specimen. been set to 8.5 cm (or basically 1/3 of the prototype wall thickness of 25 cm). in order to anchor the test specimen to the test frame, the wall was cast integrally with a concrete footing block with overall dimensions of 40 × 40 × 110 cm. because of cost restrictions, it was decided not to model the floors as edge elements on either side of the test specimen, but rather to introduce additional reinforcement along the free edges of the wall in order to counter the overturning moment introduced by the horizontal cyclic interface load. this was considered acceptable because of the basic shear loading of the wall specimen. an overall view of the test setup showing the test specimen, the double-acting actuator and the test frame is presented in figure 6. the test specimen was typically anchored with hs bolts to the upper 915 ulrich wirth et al. acta polytechnica figure 6. test setup. flange of the lower beam of the test frame. under horizontal displacement-controlled cyclic loading, a certain rocking of the test specimen could occur due to deformations of the upper flange of the lower steel beam of the test frame. vertical web stiffeners were therefore added to stiffen the upper flange directly below the test specimen. the position of the actuator was dictated by the need for the forcing level to coincide with the level of the interface between edge-member and shearwall. 2.1. test specimen design the design of the model shear wall called for a doublelayered 10 × 10 cm mesh layout with � 6 mm bars vertically and horizontally. because the test specimen reflects a squat shear wall and premature test failure of the wall (prior to the interface connection), the horizontal bars in the test specimen were increased from � 6 mm to � 8 mm. the vertical bars were kept as in the model shear wall, namely, � 6 mm at 10 cm spacing. 
conceptually, an interface layout of ⌀ 6 mm anchor bars spaced at 10 cm, matching the ⌀ 6 mm bar arrangement in the wall panel, would in fact be a dowel shear transfer, and would not be adequate to develop an appropriate interface force transfer, as the concrete would be inactive at this junction. it was therefore decided to develop an interface design with increased bar diameters, namely two layers of ⌀ 8 mm, 30 cm long anchor bars spaced at 10 cm and lapped with the corresponding ⌀ 6 mm bars of the rc wall panel reinforcement. an alternative design that we decided to test had a primary two-layered arrangement of ⌀ 6 mm, 30 cm long anchor bars spaced at 10 cm and lapped with the ⌀ 6 mm bar reinforcement of the rc wall panel, plus an additional array of 30 cm long ⌀ 6 mm anchor bars placed midway between the 10 cm spaced primary bars. this effectively 5 cm spacing arrangement of ⌀ 6 mm anchor bars is shown in figure 7a. in the case of the first test specimen, the ⌀ 8 mm anchor bars have the same layout as the primary ⌀ 6 mm bars shown in figure 7a. the advantage of these two designs is the relative ease with which the hooked anchor bars can be placed into the hollow steel column section and held in position prior to placing the concrete in the prefabricated columns. in addition, it is relatively easy to tie the basic shear wall reinforcement to the 30 cm long anchor bars, which reflects standard practice. a third design, conceptually similar to the first two, was developed with a primary arrangement of two layers of 30 cm long ⌀ 6 mm anchor bars spaced at 10 cm. however, in order to increase the dowel shear transfer capacity, single welded shear studs — 12 cm long and 3/8 in. (9.52 mm) in diameter — were additionally placed midway between the ⌀ 6 mm anchor bars, thus also resulting in a spacing distance of 10 cm. from a construction point of view, this design has the same advantages as the first two designs. however, a potential disadvantage is the need for the steel fabricator to be licensed to weld typical headed shear studs. an additional consideration in selecting such a design is whether the column wall can provide the necessary local bending resistance at the base of the shear studs. considering that the state of shear in the wall leads to the development of an inclined set of normal forces, it was decided also to test two layouts with 45-degree inclined, double-layered anchor bars. a fourth design was therefore formed with two layers of ⌀ 8 mm anchor bars spaced at 10 cm. these bars had 45-degree hooks both on the inside of the column and at the extending end. the hooks at the extending end were introduced to allow the horizontal ⌀ 6 mm bars of the rc wall panel to be lapped to these anchor bars (the kink of these hooked anchor bars coincides with a horizontal ⌀ 8 mm bar). the reinforcement details at the interface connection are shown in figure 7. similar to the fourth design, a fifth design (see figure 7b) was conceived as having a primary array of ⌀ 6 mm, 45-degree anchor bars spaced at 10 cm. this design further called for a second two-layered set of ⌀ 6 mm, 45-degree anchor bars to be placed at the midpoint between the primary bars, thus resulting in an overall spacing distance of 5 cm. a final — sixth — design to be tested had an anchorage arrangement of open ⌀ 8 mm stirrups welded to the wall of the steel column tube (no wall penetration) at a spacing of 10 cm.
these stirrups were lapped with the two-layered ⌀ 6 mm bars of the shear wall reinforcement. a summary of the six test specimens covering the various design solutions for the interface connection (ifc) is given in table 1. in the test specimens, the composite steel tube was of grade fe 360 steel. the concrete had been specified as c30 and the deformed reinforcing bars as bst 500. unfortunately, the concrete quality varied considerably; for the different specimens, the material test values reflected concrete qualities of c46, 46, 22, 33, 29 and 33, respectively.

test        panel reinforcement        interface connection
specimen    horizontal    vertical
1           ⌀ 8 mm        ⌀ 6 mm       straight anchor bars (⌀ 8 mm) at 10 cm
2           ⌀ 8 mm        ⌀ 6 mm       straight anchor bars (⌀ 6 mm) at 5 cm
3           ⌀ 8 mm        ⌀ 6 mm       straight anchor bars (⌀ 6 mm) at 10 cm, plus 12 cm long shear studs (⌀ 9.52 mm) at 10 cm
4           ⌀ 8 mm        ⌀ 6 mm       diagonal anchor bars (⌀ 8 mm) at 10 cm
5           ⌀ 8 mm        ⌀ 6 mm       diagonal anchor bars (⌀ 6 mm) at 5 cm
6           ⌀ 8 mm        ⌀ 6 mm       open stirrups with straight anchor bars (⌀ 8 mm) at 10 cm, welded to the steel tube wall
table 1. hybrid shear wall system — test specimens.

2.2. instrumentation and test sequence with the aim of studying the column-wall interface connection under a displacement-controlled cyclic load acting along the interface and applied to an end-plate arrangement on one end of the composite column, the instrumentation was designed to evaluate both the shear wall and the interface behavior. basically, the different interface connections were designed with the intention that the resistance of the connection would be larger than the actual wall panel capacity. in order to evaluate the behavior of the different parts of the test specimen, the layout of the instrumentation included both displacement transducers and strain gages (see figure 8: instrumentation of a test specimen). other than the typical test control transducers for the displacement control and load-cell output, lvdt displacement transducers w10 through w22 were used to record the interface slip (w14–16) and the shear panel deformations (w10–13 diagonally, and w17–22 vertically). displacement potentiometers w23–26 were used to measure possible base rocking motions. in addition, strain gages (d30–41) were placed in three sections along the length of the steel column tube; these measurements were intended to provide information about the load transfer along the length of the interface connection. relative displacements (slip) along the column-shearwall interface were recorded by displacement transducers w14, 15 and 16 (with a gage length of 7 cm). in addition, displacement potentiometers (sz 2 and 3) were used to measure the overall shear distortions of the shear wall. during the preliminary test phase, a number of test specimens were tested with a larger number of displacement transducers on the shear wall. in these tests, the edges of the shear wall were totally free (as shown in figure 5). the results showed that the bending effect in the shear wall initiated a failure of the wall immediately above the anchorage beam before any distress in the interface region could be observed and a rating of the different interface connections could be made. we therefore decided to reduce the bending moment effect in the shear wall by introducing additional side support to the wall over the lower half of the test specimen (see figure 8).
this decision also resulted in a reduction in the number of transducers used in the final tests (however, for general data reduction and comparison, the transducer numbering system was kept the same). the tests were performed under displacement-controlled conditions (measured against the motion of the free end of the tubular column). the alternating displacements were increased in 0.5 mm intervals from ± 0.5 to ± 3.0 mm, and in 1.0 mm intervals from ± 4.0 mm to ± 7.0, 8.0, 9.0 or 10.0 mm, depending on the performance of the specific test specimen. at each displacement step up to ± 4 mm, the specimen was subjected to three cycles of loading. subsequently, in order to assess the deteriorating behavior under repeated displacement at ± 5 mm, four cycles of displacement were introduced. finally, from ± 6 mm on, each displacement step was introduced twice. the maximum displacements to which the test specimens were subjected were governed by the observed performance. a sketch of this loading history is given below.
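purely as an illustration of the protocol described above — not part of the original test documentation, and with function and variable names of our own choosing — the loading history can be written out programmatically:

```python
def loading_history(max_amplitude_mm=10.0):
    """generate the cyclic displacement protocol of section 2.2:
    0.5 mm steps up to +/- 3.0 mm, then 1.0 mm steps up to the chosen
    maximum; 3 cycles per step up to +/- 4 mm, 4 cycles at +/- 5 mm,
    and 2 cycles for every step from +/- 6 mm on."""
    amplitudes = [0.5 * k for k in range(1, 7)]        # 0.5 ... 3.0 mm
    a = 4.0
    while a <= max_amplitude_mm:
        amplitudes.append(a)                           # 4.0, 5.0, ... mm
        a += 1.0
    history = []
    for amp in amplitudes:
        if amp <= 4.0:
            cycles = 3
        elif amp == 5.0:
            cycles = 4
        else:
            cycles = 2
        history.extend([(+amp, -amp)] * cycles)        # one (push, pull) pair per cycle
    return history

# example: protocol for a specimen taken to +/- 8 mm
for step in loading_history(8.0):
    print(step)
```

the maximum amplitude is a parameter, reflecting the fact that the final displacement level was chosen per specimen, according to the observed performance.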
3. test results in general, the hysteretic response of all specimens exhibited the pinched force-displacement response common to cracked, shear-loaded concrete specimens. the first three specimens, with straight anchor bars (and, in the case of specimen 3, welded shear studs), failed at the interface connection. in the other cases, both specimens 4 and 5, with 45-degree inclined diagonal anchor bars, and specimen 6, with welded straight open stirrups, failed in the shear wall due to an excessive concrete contact pressure at the edges. 3.1. force — displacement the hysteretic force-displacement data for all six specimens are presented in figure 9 (force — displacement diagrams). other than specimens 1 and 2, which registered maximum resistances of about 320 kn and 390 kn, respectively, the remaining specimens developed maximum resistance values between about 420 and 450 kn. considering the deterioration under repeated alternating cyclic displacements, the first three specimens, with straight anchorage bars, show a distinct loss of resistance as compared to the other three specimens. specimen 1, which had shown little loss of resistance up to a 3-cycle alternating displacement of ± 4 mm, exhibited a drop in resistance of 30 % after 3 cycles at ± 5 mm (after 4.5 cycles, the loss in resistance had increased to 50 %). the same basic phenomenon was observed for specimen 2. in this case, after virtually no loss under a 3-cycle alternating displacement of ± 3 mm, a loss of 30 % was observed after 3 cycles at ± 4 mm (increasing to 40 % in the next half cycle). specimen 3 showed a behavior similar to specimen 1: the loss of resistance under a 3-cycle displacement of ± 4 mm was still minimal, but at ± 5 mm a loss of resistance of about 25 % after 3 cycles increased to 40 % after 4.5 cycles. in comparison, the three other specimens exhibited both a relatively gradual drop in resistance under increased displacements and relatively little deterioration under repeated alternating cyclic deformations. comparing the results for specimens 4 and 5, specimen 5, with ⌀ 6 mm diagonal anchorage bars at a 5 cm interval, shows a slightly better response than specimen 4, with ⌀ 8 mm anchorage bars spaced at 10 cm. specimen 6, with ⌀ 8 mm stirrups spaced at 10 cm, shows an almost identical response to that of specimen 4 (which has a similar ⌀ 8 mm anchorage arrangement at an interval of 10 cm) up to an alternating displacement of ± 4 mm. however, under increasing cyclic displacements, specimen 6 shows a superior response as compared to both specimens 4 and 5. in fact, at a cyclic displacement of ± 10 mm, the drop in cyclic resistance of specimen 6 was only 25 %. on the other hand, at a cyclic displacement of ± 9 mm, specimens 4 and 5 were no longer able to resist a significant interface shear force. 3.2. force — slip for all six specimens tested here, the force applied to the test specimen versus the slip of the edge-member column relative to the shear wall, measured along the interface at the middle of the column (lvdt w15 — see figure 8), is presented in figure 10 (force — slip diagrams). the slip, measured against a 7 cm gage length, reaches displacements of close to ± 3.5 mm for the first three specimens, with straight anchor bars and welded shear studs (in specimen 3 only). for the specimens with diagonal anchorage bars, the slip reaches maximum values of between 0.5 and 1.5 mm. for the welded straight stirrup anchorage arrangement, the slip is virtually negligible. 4. conclusions a hybrid system consisting of typical reinforced concrete wall panels with composite edge members or flanges has been studied experimentally. the edge members, which are formed by composite hollow steel square column sections, are prefabricated with reinforcing bars anchored inside the column and extending through the wall of the column for connection to the wall panel reinforcement. the remainder of the building is constructed like a typical reinforced concrete structure. based on a series of alternating, displacement-controlled load tests on six one-third scale models, carried out to study the behavior of the interface connections between column and wall panel, it can be concluded that hybrid shear wall construction in earthquake-prone regions is feasible. the six interface connections, which were subjected to cyclic alternating displacement-controlled shear forces, had basically two different types of interface designs. three test specimens had straight (horizontal) column anchorage bars passed through holes in the tube wall, and two had instead diagonally oriented bars extending through the tube wall. a sixth specimen was basically similar to the first three, but had horizontal anchorage bars welded to the wall of the tubular column (the specimens after testing are shown in figure 11, hsw2 after the test, and figure 12, hsw7 after the test). the results showed that the diagonally arranged bars at the column-wall interface performed better under cyclically induced alternating interface shear loads than the interface connections with horizontal anchorage bars. however, the alternative design with horizontal anchorage bars welded to the column wall showed the least slip between edge member and wall. in turn, such a design would exhibit the lowest level of energy dissipation at the edge-member and wall interface. to develop the integrated behavior of the hybrid shear wall, at this stage of the study we recommend either diagonally arranged anchor bars extending from the column, or welded horizontal stirrups, to be connected (lapped) to the shearwall reinforcement.
in order to study the overall seismic behavior and the load carrying capacity of the hybrid shear wall, it is recommended to test three 1/3-scale hybrid shear walls subjected to cyclic alternating, displacement-controlled shear forces. as the critical shearwall region is assumed to be the first three stories of the building, the tests will be carried out on test structures three stories in height. however, as cost considerations and test limitations do not permit an experimental study on full-scale models, it is proposed to study the seismic behavior of three 1/3-scale models of the three-story high hybrid shearwall (figure 14: view of the test wall) with three different interface connections. a detail of specimen hsw7 after the test is shown in figure 13. acknowledgements support from the grant agency of the czech republic through grant no. 104/11/1301 is gratefully acknowledged. references [1] aci 318-89, 1989, building code requirements for reinforced concrete, detroit, michigan: american concrete institute, usa. [2] ansys, 2001, user's — theory manual for revision 5.7, swanson analysis systems inc., usa. [3] ashadi h. w., 1997, a hybrid composite-concrete structure earthquake resistant system, dissertation, technische universität darmstadt. [4] architectural institute of japan, 1987, aij standard for structural calculation of steel reinforced concrete structures, revised 1991. [5] bažant z. p., 1976, instability, ductility and size effect in strain softening concrete, journal of engineering mechanics, asce, vol. 102(2), pp. 331–344. [6] bažant z. p., belytschko t. b., chang t.-p., 1984, continuum theory for strain softening, journal of engineering mechanics, asce, vol. 110, pp. 1666–1692. [7] bažant z. p., prat p., 1988, microplane model for brittle plastic materials. i: theory, ii: verification, journal of engineering mechanics, asce, vol. 114, pp. 1672–1702. [8] belytschko t., liu w. k., moran b., 2000, nonlinear finite elements for continua and structures, john wiley & sons, new york, usa. [9] bergmann r., 1981, traglastberechnung von verbundstützen, mitteilung nr. 81-2, technisch-wissenschaftliche mitteilungen, institut für konstruktiven ingenieurbau, ruhr-universität bochum. [10] betten j., 1993, kontinuumsmechanik, springer-verlag, berlin. [11] boresi a. p., chong k. p., 1987, elasticity in engineering mechanics, elsevier science publishers, england. [12] bouwkamp j. g., ashadi h. w., 1992, a seismic hybrid composite structural frame system for buildings, international symposium on earthquake disaster prevention, mexico city, mexico. [13] chen w. f., 1982, plasticity in reinforced concrete, mcgraw-hill book company inc., usa. [14] chen w. f., han d. j., 1988, plasticity for structural engineers, springer-verlag, new york inc., usa. [15] comité international pour le développement et l'étude de la construction tubulaire, 1970, concrete filled hollow section steel columns, british edition, london. [16] darwin d., pecknold d. a., 1976, analysis of rc shear panels under cyclic loading, journal of the structural division, asce, vol. 102, no. st2, pp. 355–369, usa. [17] european committee for standardization, 1992, eurocode-4, common unified rules for composite steel and concrete structures, ecsc-eec-eaec, brussels: luxembourg.
[18] wirth u., 2011, seismic resistance of a hybrid shearwall system, phd thesis, ctu prague. acta polytechnica 53(6):913–922, 2013. propagation of quasi-plane nonlinear waves in tubes p. koníček, m. bednařík, m. červenka abstract this paper deals with the possibilities of using the generalized burgers equation and the kzk equation to describe nonlinear waves in circular ducts. a new method for calculating diffraction effects, taking boundary layer effects into account, is described. the results of numerical solutions of the model equations are compared, and the limits of applicability of the kzk equation and the gbe for describing nonlinear waves in tubes are discussed with respect to the boundary conditions and the radius of the circular duct. keywords: nonlinear acoustics, khokhlov-zabolotskaya-kuznetsov equation, generalized burgers equation. 1 introduction propagation of nonlinear sound waves in waveguides is a very interesting physical problem. in the case that nonlinear waves travel in a gas-filled waveguide, we can observe phenomena such as nonlinear distortion, nonlinear absorption, diffraction, lateral dispersion, boundary layer effects, etc. all these phenomena can be described by means of the complete system of the equations of hydrodynamics, see e.g. [1]–[4]: the navier-stokes momentum equation, the continuity (mass conservation) equation, the heat transfer (entropy) equation and the state equations. unfortunately, there is no known general solution of this system of equations, and numerical solutions bring many problems regarding the stability of the solutions and their time consumption. consequently, it is sensible to simplify the fundamental system of equations if we ignore some phenomena or consider some of them weak. this simplification leads to the derivation of the model equations of nonlinear acoustics. there have been a number of papers devoted to various aspects of the propagation of nonlinear waves in waveguides. the viscous and thermal dissipative effects on the nonlinear propagation of plane waves in hard-walled ducts are treated, for instance, in papers [5], [6], [7]. the authors deal with the frequency dependence of the dissipative and dispersive effects induced by the acoustic boundary layer. experimental results focused on the propagation of finite-amplitude plane waves in circular ducts are presented in [8] and [9]. here, a very good agreement is demonstrated between experimental data and results obtained by means of the rudnick decay model for the fundamental harmonic. burns in [10] obtained a fourth-order perturbation solution for finite-amplitude waves. however, his expansion breaks down for large times because it contains secular terms. he took dissipation into account, but he neglected mainstream dissipation with respect to boundary dissipation. keller and millmann in [11] found the solution of the model equation for inviscid isentropic fluids, where they used a perturbation expansion adapted to eliminate secular terms and determined the nonlinear wavenumber shift for dispersive modes. in [12], keller utilized the results from [11]. he rewrote the results in a form that is useful near the cutoff frequency, in order to show that the cutoff frequencies and resonant frequencies of modes in acoustic waveguides of finite length depend upon the mode amplitude. nayfeh, along with tsai, in [13] presented the nonlinear effects of gas motion, as well as of the nonlinear acoustic lining material properties, on wave propagation and attenuation in circular ducts. they obtained a second-order uniformly valid expansion by using the method of multiple scales. in [14], the same authors used the method of multiple scales to investigate nonlinear propagation in a rectangular duct whose side walls were also acoustically treated. ginsberg in [15] dealt with nonlinear propagation in rectangular ducts.
he determined, by an asymptotic method, the nonlinear two-dimensional acoustic waves that occur within a rectangular duct of semi-infinite length as the result of periodic excitation. in [16], he utilized the perturbation method of renormalization to study the effect of nonlinearity in a hard-walled rectangular waveguide. nonlinear wave interaction in a rectangular duct was also investigated by hamilton and tencate in [17] and [18]. multiharmonic excitation of a hard-walled circular duct was treated by nayfeh in [19]. he used the method of multiple scales to derive a nonlinear schrödinger equation for the temporal and spatial modulation of the amplitudes and phases of waves propagating in a hard-walled duct. foda presented his work in [20], which is concerned with the nonlinear interactions and propagation of two primary waves in higher-order modes of a circular duct, each at an arbitrary different frequency and finite amplitude. he used the renormalization method to annihilate the secular terms in the obtained expression. if we take no diffraction effects into account, we can use the generalized burgers equation to describe nonlinear plane waves in circular ducts. it is the burgers equation supplemented by a term that represents boundary layer effects, see [21]–[25]. the generalized burgers equation enables the description of the dissipative and dispersive effects that are caused by the boundary layer. asymptotic and numerical solutions of this equation were presented by sugimoto [26]. in [27], there is a description of the solution of the burgers equation in the time domain. approximate solutions of the generalized burgers equation for harmonic excitation, in both the preshock and the postshock region, are presented in [28]. in the case that only weak diffraction effects are considered, the kzk equation can be used, see [22], [25], [29]; the boundary layer is then incorporated into the boundary condition, as in [22] and [25]. 2 basic equations from the equations of continuum mechanics we can obtain, in a quadratic approximation, an equation describing weakly nonlinear waves in thermo-viscous fluids:
$$\frac{\partial^2\phi}{\partial t^2} - c_0^2\,\Delta\phi = \frac{\partial}{\partial t}\left[\frac{\gamma-1}{2c_0^2}\left(\frac{\partial\phi}{\partial t}\right)^2 + (\nabla\phi)^2\right] + \frac{1}{\rho_0}\,\frac{\partial L}{\partial t} + \frac{b}{\rho_0 c_0^2}\,\frac{\partial^3\phi}{\partial t^3} + O(\epsilon^3). \tag{1}$$
here φ is the velocity potential, t is time, c0 is the small-signal sound speed, b is the dissipation coefficient of the medium, ρ0 is the ambient density, γ is the adiabatic index, equal to the ratio of the specific heat at constant pressure, cp, to that at constant volume, cv, and L is the second-order lagrangian density, which is given as
$$L = \frac{\rho_0}{2}\left[(\nabla\phi)^2 - \frac{1}{c_0^2}\left(\frac{\partial\phi}{\partial t}\right)^2\right]. \tag{2}$$
the symbol Δ represents the laplacian operator and the symbol ∇ is the gradient operator. the operators are defined for axisymmetric waves in cylindrical coordinates by
$$\Delta = \frac{\partial^2}{\partial z^2} + \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial}{\partial r}\right), \tag{3}$$
$$\nabla = \mathbf{e}_z\frac{\partial}{\partial z} + \mathbf{e}_r\frac{\partial}{\partial r}, \tag{4}$$
where z is the coordinate along the axis of the tube, r is the coordinate perpendicular to the axis of the tube, e_z is the unit vector along the z axis and e_r is the unit vector along the r axis. we shall assume that the profile of the wave distorts slowly as it propagates in the positive z direction. the term "slowly" means that the wave must travel a distance of many wavelengths λ for its profile to be distorted significantly. it is then reasonable to seek a solution having the functional form
$$\phi(z, r, t) = \epsilon\,\phi_1\!\left(\tau = t - \frac{z}{c_0},\; z_1 = \epsilon z,\; r_1 = \sqrt{\epsilon}\,r\right), \tag{5}$$
where τ is the retarded time and ε is the small dimensionless parameter given by the acoustic mach number – the ratio of the particle velocity amplitude vm to the small-signal sound speed c0. if we use the new coordinate system (τ, z1, r1), then the transformation of the partial derivatives yields
$$\frac{\partial}{\partial z} = -\frac{1}{c_0}\frac{\partial}{\partial \tau} + \epsilon\frac{\partial}{\partial z_1}, \tag{6}$$
$$\frac{\partial^2}{\partial z^2} = \frac{1}{c_0^2}\frac{\partial^2}{\partial \tau^2} - \frac{2\epsilon}{c_0}\frac{\partial^2}{\partial \tau\,\partial z_1} + \epsilon^2\frac{\partial^2}{\partial z_1^2}, \tag{7}$$
$$\frac{\partial}{\partial t} = \frac{\partial}{\partial \tau}, \qquad \frac{\partial^2}{\partial t^2} = \frac{\partial^2}{\partial \tau^2}, \tag{8}$$
$$\frac{\partial}{\partial r} = \sqrt{\epsilon}\,\frac{\partial}{\partial r_1}, \qquad \frac{\partial^2}{\partial r^2} = \epsilon\,\frac{\partial^2}{\partial r_1^2}. \tag{9}$$
substitution of the partial derivatives (6) into the operator (4) gives
$$\nabla = \mathbf{e}_z\left(\epsilon\frac{\partial}{\partial z_1} - \frac{1}{c_0}\frac{\partial}{\partial \tau}\right) + \sqrt{\epsilon}\,\nabla_\perp. \tag{10}$$
here ∇⊥ denotes the directions transverse to propagation. we can write the linear wave equation directly from eq. (1),
$$\frac{\partial^2\phi}{\partial t^2} - c_0^2\,\Delta\phi = 0 + O(\epsilon^2). \tag{11}$$
if we use the operator (10) in eq. (11) for the divergence operator, then after neglecting a quadratic-order term we obtain
$$\frac{\partial}{\partial t}\left(\frac{\partial\phi}{\partial t} + c_0\frac{\partial\phi}{\partial z}\right) = \frac{c_0^2}{2}\,\nabla_\perp^2\phi + O(\epsilon^2). \tag{12}$$
if we suppose that the transverse particle velocity satisfies
$$\mathbf{v}_\perp = \nabla_\perp\phi \sim \epsilon^2, \tag{13}$$
then we can omit the last term in eq. (12):
$$\frac{\partial\phi}{\partial t} + c_0\frac{\partial\phi}{\partial z} = 0 + O(\epsilon^2). \tag{14}$$
equation (14) represents the plane progressive wave impedance relation. when conditions (13) and (14) are satisfied, we can suppose L ≈ 0 + O(ε³). this enables us to modify eq. (1):
$$\frac{\partial^2\phi}{\partial t^2} - c_0^2\,\Delta\phi = \frac{\partial}{\partial t}\left[\frac{\gamma-1}{2c_0^2}\left(\frac{\partial\phi}{\partial t}\right)^2 + (\nabla\phi)^2\right] + \frac{b}{\rho_0 c_0^2}\,\frac{\partial^3\phi}{\partial t^3} + O(\epsilon^3). \tag{15}$$
substituting relation (14) into the second-order term of eq. (15), we get
$$\frac{\partial^2\phi}{\partial t^2} - c_0^2\left[\frac{\partial^2\phi}{\partial z^2} + \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial\phi}{\partial r}\right)\right] = -\frac{\beta}{c_0}\,\frac{\partial}{\partial t}\left(\frac{\partial\phi}{\partial t}\,\frac{\partial\phi}{\partial z}\right) + \frac{b}{\rho_0 c_0^2}\,\frac{\partial^3\phi}{\partial t^3} + O(\epsilon^3), \tag{16}$$
where β = (γ + 1)/2 is the coefficient of nonlinearity. after using the transformation of derivatives (6) we can neglect the cubic and higher terms. this yields the well-known kzk (khokhlov-zabolotskaya-kuznetsov) equation
$$\frac{\partial v_z}{\partial z} = \frac{\beta}{c_0^2}\,v_z\,\frac{\partial v_z}{\partial \tau} + \frac{b}{2\rho_0 c_0^3}\,\frac{\partial^2 v_z}{\partial \tau^2} + \frac{c_0}{2}\int_{-\infty}^{\tau}\!\left(\frac{\partial^2 v_z}{\partial r^2} + \frac{1}{r}\frac{\partial v_z}{\partial r}\right)\mathrm{d}\tau', \tag{17}$$
where v_z = ∂φ/∂z is the z component of the particle velocity v = e_z v_z + v⊥. the kzk equation was derived on the condition that v_z ~ ε and v⊥ ~ ε² (i.e., v_z ≫ v⊥). v_z and v⊥ are connected by the irrotationality condition of the velocity field,
$$\frac{\partial v_r}{\partial \tau} = -c_0\,\frac{\partial v_z}{\partial r}. \tag{18}$$
if we do not take the wall friction into account, we have to use the following boundary conditions for the solution of the kzk equation:
$$v_z(z, r, \tau) = f(r, \tau) \quad \text{at } z = 0, \tag{19}$$
this condition follows from the derivation of the kzk equation in the tube given above, see section basic equations. the transverse velocity component relates to transverse diffraction. the value of v� depends on wave frequency, the non-quasi planar boundary condition, the tube radius and the boundary layer. supposing that the tube is narrow, the influence of the boundary layer is significant. let us assume that the source frequency is lower than the cut-off frequency, thus only plane-wave mode propagation is possible. then the diffraction is given by the influence of the boundary layer only. when we now gradually increase the frequency, the transverse component of velocity v� also gradually rises. as soon as the frequency exceeds the cut-off frequency of the tube, the transverse mode arises, thus the effect of transverse diffraction increases considerably and the transverse component of the 20 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 42 no. 4/2002 velocity significantly rises. oscillations appear on the propagation curves for the harmonic amplitudes which were obtained by means of a numerical solution of the kzk equation and consequently the kzk equation loses its accuracy. this means that in the case of a narrow tube we can study wave propagation only below the cut-off frequency. in addition, it is not possible to study wave propagation if we use a boundary condition of any distinct velocity distribution along the tube radius (non-quasi planar boundary condition). the influence of the boundary layer is less significant in wide tubes. the amplitudes of transverse components of velocity are much lower than in the case of narrow tubes. therefore the propagation can be studied far above the cut-off frequency even with distinct velocity distribution along the tube radius for the boundary condition. however, the kzk equation cannot be used for the boundary condition with distinct velocity distribution along the radius when the source frequency is less than the cut-off frequency for a given tube. this restriction holds for any tube radius. if the used frequency is less than the cut-off frequency, then all higher modes are evanescent and only the plane wave propagates along the waveguide. however, the kzk equation cannot describe this process because the transverse component of the velocity is large and we cannot consider the kzk equation to be valid. the cut-off frequency of the first symmetric mode in a circular waveguide is given by the formula f c k c r m1 � � �0 1 0 02 2� � � � , (36) where k1 is the radial wave number of the first mode and � ��� � .3 8317 is the first zero point of the bessel function derivative: � �� � �j0 0�� . now, let us suppose that the time dependency of the dimensionless longitudinal component near the source is given as � � � �v g� 0 � �sin , (37) where � � r r0 and � ��� is the dimensionless retarded time, which means that we suppose a mono-frequency harmonic source. here � �g 0 � represents the function describing the distribution of the longitudinal component along the radius. then for the transverse component of the velocity we have � � w c r g � 0 0 0 � � � �� , (38) where w is the dimensionless transverse component of the velocity normalized in the same way as the longitudinal component v. assuming wave propagation with frequency �f equal to one half of the cut-off frequency � � �f f c r 1 2 3 8317 4 0 0 m1 . � (39) we can see that w is equal to � � � � w g g � � 2 3 8317 0 5220 0 . . � � � � � � � � . 
(40) taking into account the fact that the maximum of the derivative of the commonly used velocity distributions (gauss, fermi) at the boundary of a tube is of the order o(1), we can see that the values of v� and vz are comparable. therefore the validity condition of the kzk equation is not fulfilled. we can say that the kzk equation is valid only for a moderate distribution along the tube radius below the cut-off frequency (for instance, a plane wave which is affected by the boundary layer). we often read (e.g., in [4], [29]) the condition kr0 � 1 from which follows the relation between wave vector components k � k� . this means that a sound wave is quasi-plane and diffraction effects are weak in tubes. with regard to the text above we can say that the condition kr0 � 1 represents only a sufficient, but not a necessary, condition for the use of the kzk equation as a model equation for nonlinear waves propagating in tubes. 4 numerical solution equation (17) was solved in the frequency domain. when the solution is periodic in time (i.e., the sound source is periodic in time), it can be expressed in the form of fourier series. for a numerical solution it was necessary to truncate the infinite number of terms in the fourier series to * terms. then the solution was sought in the form � � � � � �v g n h nn n n m � � � � � � � � � �, , , sin , cos� � � 1 , (41) where � � �� z v czm 0 2 and � � r r0 are dimensionless coordinates. the finite number of terms in series (41) causes instability in the numerical solution. when the solution is approaching the region of shock wave formation, all harmonics are excited and the energy flow stops with the last harmonic, i.e., the m-th harmonic. this effect causes the gibbs oscillations in the numerical solution. these oscillations were damped by the method described, e.g., by fenlon [31]. each harmonic was multiplied by the coefficient �n given by � � �n � sin nh nh , (42) where h is the frequency damping coefficient. this causes the additional artificial attenuation of the solution. the value of h was chosen so that gibbs oscillations practically did not arise. during the assembling of the numerical schema we proceeded from the well known bergen code [32], which was extended by terms including the boundary layer influence. simple iteration by the lu decomposition method was used to calculate the solution in the next layer. if we use lu decomposition, then the calculations take approximately three times longer than in the case of calculation by means of simple iteration. however, lu decomposition is not restricted by the condition on the step size in the direction of propagation. application of this method is very convenient in the case when it is necessary to carry out a very large number of iterations by means of the simple iteration method. for example, we need 500000 steps for the simple iteration method and only 400 steps for lu decomposition for a tube 4 mm in radius. the burgers equation was solved in the frequency domain by means of the standard runge-kutta method of the fourth order. © czech technical university publishing house http://ctn.cvut.cz/ap/ 21 acta polytechnica vol. 42 no. 4/2002 5 results of a numerical solution of the kzk equation 5.1 a narrow tube the presented distributions represent the deformation of a plane wave under the influence of the boundary layer for a narrow tube, where the boundary layer effect is very important. 
the calculations are done for particle velocity amplitude vm � 2.76 m�s �1, tube radius r0 � 0.004 m, and for the following six frequencies 1 khz, 10 khz, 40 khz, 60 khz, 100 khz and 500 khz. the cut-off frequency of the first symmetric mode for the given tube is approximately equal to 52 khz. the first 100 harmonics were used in the numerical calculation, the numerical step in the direction of the wave propagation was �� � 0.01 for the burgers equation. the value of the numerical step for the kzk equation was determined from this step, so the total number of steps was identical for the gbe and kzk equation. the kzk equation was solved with the plane boundary condition. the numerical step in the direction of the tube radius was chosen �� � 0.05. the calculations are realized for the case of propagation in air. the source oscillated harmonically with one frequency. for all calculations we used the numerical attenuation coefficient h � 40. figures 1–3 contain three curves: curve 1 is the numerical solution of the gbe, curve 2 is the longitudinal component of the velocity obtained by the numerical solution of the kzk equation, and curve 3 represents the transverse velocity component which is calculated from the longitudinal velocity across the tube radius using eq. (18). the values of curve 3 are multiplied 100 times and calibrated in the same way as the longitudinal component of the velocity to enable the two velocity components to be compared easily. curves 2 and 3 are depicted for the radius � � 0 5. . owing to the fact that the resultant space evolutions correspond to practically plane waves, the choice of this radius is not important. figures 1–3 contain the first harmonic. the graphs illustrate the speculation about the validity of the kzk equation for a narrow tube, where the influence of the boundary layer is dominant. we see that with increasing frequency the transversal velocity component progressively rises. we can observe the significant growth of the transversal velocity component when the cut-off frequency is exceeded. the first oscillations of the transversal velocity component are noticeable in the case of 60 khz. these oscillations still grow and they are conspicuous when the frequency is equal to 100 khz. the longitudinal component of the velocity was scarely affected, as can be seen from the comparison with the velocity from gbe. the further increase in frequency causes the lon22 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 42 no. 4/2002 fig. 1: solution from the gbe (line 1), longitudinal (line 2) and transversal (line 3) velocity component from the kzk equation for the source frequency 1khz and 10khz (first harmonic) fig. 2: solution from the gbe (line 1), longitudinal (line 2) and transversal (line 3) velocity component from the kzk equation for the source frequency 40khz and 60khz (first harmonic) gitudinal component of velocity also to become disrupted, as can be seen for frequency equal to 500 khz. these variations are readily noticeable, especially for the first harmonic, and they are weaker for higher harmonics. the presented phenomena are in harmony with the deductions mentioned above. 5.2 a wide tube results for a tube with radius r0 � 0.5 m are also presented. the calculations were made for frequency 100 khz. the cut-off frequency of the first symmetric mode for this tube is approximately equal to 911 hz, thus the frequency of the described waves is considerably higher than the cut-off frequency. 
the fermi distribution was chosen as the initial condition for the kzk equation. all other parameters of the computation were the same as in the previous case of the narrow tube. figures 4 and 5 contain the propagation curves of the first four harmonics. the distributions in these figuresare plotted for � � 0 8. , which corresponds to the value, at which the fermi © czech technical university publishing house http://ctn.cvut.cz/ap/ 23 acta polytechnica vol. 42 no. 4/2002 fig. 3: solution from the gbe (line 1), longitudinal (line 2) and transversal (line 3) velocity component from the kzk equation for the source frequency 100khz and 500khz (first harmonic) fig. 4: kzk equation solution of the longitudinal (line 1) and transversal (line 2) velocity component of the first and second harmonic fig. 5: kzk equation solution of the longitudinal (line 1) and transversal (line 2) velocity component of the third and fourth harmonic distribution derivative has its maximum. figures 6-–9 contain the space evolutions of the first four harmonics for both the longitudinal and the transverse velocity components. 6 summary this paper shows the derivation of the kzk equation in the parabolic approximation. the following validity condition of this equation for the waveguide has to be satisfied v ~ , v� ~ (that is v � v� ). the validity conditions of the kzk equation in the case of the description of quasi-linear waves propagation in the circular waveguide are further discussed by means of these conditions. it is shown that the condition kr0�1 represents only the sufficient condition for validity of the kzk equation. the equation remains valid in the case of the small radius waveguide (and consequently with strong influence of the boundary layer) for frequencies below the cut-off frequency though kr0<1 under the condition that v � v� . 24 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 42 no. 4/2002 fig. 6: kzk equation solution of the longitudinal (v1) and transversal (w1) velocity component of the first harmonic fig. 7: kzk equation solution of the longitudinal (v2) and transversal (w2) velocity component of the second harmonic fig. 8: kzk equation solution of the longitudinal (v3) and transversal (w3) velocity component of the third harmonic the results of numerical solution of the kzk equation are presented for waveguides of both small and large radius which demonstrate these two cases. the numerical solution of the kzk equation was performed by means of a method based on the bergen code [32], which was modified by the authors because it was necessary to incorporate the boundary condition, taking into account the boundary layer at the waveguide lateral walls. the new method for calculating the diffraction effects generated by the boundary layer effects is described in the text. when the plane wave with a frequency below the cut-off frequency propagates in a small radius tube, then its space evolution along the radius varies only a little and the results of the kzk and the burgers equation are comparable. solving the kzk equation also enables us to get the distribution of the perpendicular component of velocity. by means of the perpendicular component of the velocity we can see whether the validity limits of the kzk are fulfilled. 
in the waveguide of the large radius we can describe by means of the kzk equation the high frequency wave propagation for non-planar distribution of the velocity along the tube radius (it is no longer possible to use the generalized burgers equation here). again we can decide from the magnitude of the perpendicular component of the velocity whether we will use the kzk equation within the limits of its validity for the given parameters f, r0, vm and for the given velocity distribution along the radius near the source. acknowledgments this work was supported by gacr grant no. 202/01/1372 and by ctu research program 16:cez:j04/98:212300016. references [1] landau, l. d., lifshitz, e. m.: fluid mechanics, (2nd ed.). new york: pergamon press, 1987. [2] rudenko, o. v., soluyan, s. i.: theoretical foundations of nonlinear acoustics. new york: plenum, 1977. [3] naugolnykh, k., ostrovsky, l.: nonlinear wave processes in acoustics. usa: cambridge university press, 1998. [4] hamilton, m. f., blackstock, d. t.: nonlinear acoustics. usa: academic press, 1998. [5] chester, w.: resonant oscillations in closed tubes. j. fluid mech., 1964, vol. 18, p. 44–64. [6] blackstock, d. t.: nonlinear acoustics (theoretical). (3rd ed.) american institute of physics handbook, new york: mc graw hill, 1972. [7] coppens, a. b.: theoretical study of finite-amplitude traveling waves in rigid-walled ducts: behaviour for strengths precluding shock formation. j. acoust. soc. am., 1971, vol. 49 (1-part 2), p. 306–318. [8] webster, d. a., blackstock, d. t.: finite-amplitude saturation of plane sound waves in air. j. acoust. soc. am., 1977, vol. 62, no. 3, p. 518–523. [9] gaete-garreton, l., gallego-juarez, j. a.: propagation of finite-amplitude ultrasonic waves in air-ii. plane waves in a tube. j. acoust. soc. am., 1983, vol. 73, no. 3, p. 768–773. [10] burns, s. h.: finite-amplitude distortion in air at high acoustic pressures. j. acoust. soc. am., 1967, vol. 41 (4-part 2), p. 1157–1169. [11] keller, j. b., millmann, m. h.: finite-amplitude sound wave propagation in a waveguide. j. acoust. soc. am., 1971, vol. 49 (1-part2), p. 329–333. [12] keller, j. b.: nonlinear forced and free vibrations in acoustic waveguides. j. acoust. soc. am., 1974, vol. 55, no. 3, p. 524–527. [13] nayfeh, a. h., tsai, m. s.: non-linear wave propagation in acoustically lined circular ducts. j. of sound and vib., 1971, vol. 36, no. 1, p. 77–89. [14] nayfeh, a. h., tsai, m. s.: nonlinear acoustic propagation in two-dimensional ducts. j. acoust. soc. am., 1974, vol. 55, no. 6, p. 1127–1133. [15] ginsberg, j. h.: finite amplitude two-dimensional waves in a rectangular duct induced by arbitrary periodic excitation. j. acoust. soc. am., 1978, vol. 65, no. 5, p. 1127–1133. [16] ginsberg, j. h., miao, h. c.: finite amplitude distortion and dispersion of a nonplanar mode in a waveguide. j. acoust.soc. am., 1986, vol. 80, no. 3, p. 911–920. [17] hamilton, m. f., tencate j. a.: sum and difference frequency generation due to noncollinear wave interaction in a rectangular duct. j. acoust. soc. am., 1987, vol. 81, no. 6, p. 1703–1712. [18] hamilton, m. f., tencate j. a.: finite amplitude sound near cutoff in higher-order modes of a rectangular duct. j. acoust. soc. am., 1988, vol. 84, no. 1, p. 327–334. © czech technical university publishing house http://ctn.cvut.cz/ap/ 25 acta polytechnica vol. 42 no. 4/2002 fig. 9: kzk equation solution of the longitudinal (v4) and transversal (w4) velocity component of the fourth harmonic [19] nayfeh, a. 
h.: nonlinear propagation of a wave packet in a hard-walled circular duct. j. acoust. soc. am., 1975, vol. 57, no. 4, p. 803–809. [20] foda, m. a.: analysis of nonlinear propagation and interactions of higher order modes in a circular waveguide. acustica, 1998, vol. 84, no. 1, p. 66–77. [21] blackstock, d. t.: generalized burgers equation for plane waves. j. acoust. soc. am., 1985, vol. 77, no. 6, p. 2050–2053. [22] makarov, s. n., vatrushina, e.: effect of the acoustic boundary layer on a nonlinear quasiplane wave in a rigid-walled tube. j. acoust. soc. am., 1993, vol. 94, no. 2, p. 1076–1083. [23] konicek, p., bednarik, m., cervenka, m.: finite-amplitude acoustic waves in a liquid-filled tube. gdynia: naval academy, hydroacoustics, 2000, vol. 3, p. 85–88. [24] ochmann, m.: representation of the absorption of nonlinear waves by fractional derivatives. j. acoust. soc. am., 1993, p. 3392–3399. [25] makarov, s., ochmann, m.: nonlinear and thermoviscous phenomena in acoustics, part ii. acustica, 1997, vol. 83, no. 2, p. 197–222. [26] sugimoto, n.: burgers equation with a fractional derivative: hereditary effects on nonlinear acoustic waves. j. fluid mech., 1991, vol. 225, p. 631–653. [27] bednařík, m., koníček, p., červenka, m.: solution of the burgers equation in the time domain. acta polytechnica, 2002, vol. 42, no. 2, p. 71–75. [28] bednarik, m., konicek, p.: propagation of quasiplane nonlinear waves in tubes and the approximate solutions of the generalized burgers equation. j. acoust. soc. am., 2002, vol. 112, p. 91–98. [29] zhileikin, ya. m., zhuravleva, t. m., rudenko, o. v.: nonlinear effects in the propagation of high-frequency sound waves in tubes. phys. acoust., 1980, vol. 26, p. 32–34. [30] gonghuan, d.: fourier series solution of burger's equation for nonlinear acoustics in relaxing media. j. acoust. soc. am., 1985, vol. 77, no. 3, p. 924–927. [31] fenlon, f. h.: a recursive procedure for computing the nonlinear spectral interactions of progressive finite-amplitude waves in nondispersive fluids. j. acoust. soc. am., 1971, vol. 50, no. 5, p. 1299–1312. [32] aanonsen, s. i.: numerical computation of the nearfield of a finite amplitude sound beam. report no. 73, university of bergen, department of mathematics, 1983. dr. mgr. petr koníček, phone: +420 224 352 329, e-mail: konicek@feld.cvut.cz; dr. ing. michal bednařík, phone: +420 224 352 308, e-mail: bednarik@feld.cvut.cz; ing. milan červenka, phone: +420 224 353 975, e-mail: cervenm3@feld.cvut.cz; department of physics, czech technical university in prague, faculty of electrical engineering, technická 2, 166 27 prague, czech republic. acta polytechnica 55(4):242–246, 2015, doi: 10.14311/ap.2015.55.0242 a system and a device for isolating circulating tumor cells from the peripheral blood in vivo michal mego (a), miroslav kocifaj (b, c, ∗), františek kundracik (c); (a) department of oncology, comenius university, medical faculty and national cancer institute, klenová 11, 833 10 bratislava, slovak republic; (b) ica, slovak academy of sciences, dúbravská cesta 9, 845 03 bratislava, slovak republic; (c) faculty of mathematics, physics, and informatics, comenius university, mlynská dolina, 842 48 bratislava, slovak republic; ∗ corresponding author: kocifaj@savba.sk. abstract.
circulating tumor cells (ctcs) play a crucial role in disseminating tumors and in the metastatic cascade. ctcs are found only in small numbers, and the limited amount of isolated ctcs makes it impossible to characterize them closely. this paper presents a proposal for a new system for isolating ctcs from the peripheral blood in vivo. the system enables ctcs to be isolated from the whole blood volume for further research and applications. the proposed system consists of magnetic nanoparticles covered by monoclonal antibodies against a common epithelial antigen, large supermagnets, which are used to control the position of the nanoparticles within the human body, and a special wire made of a magnetic core wrapped in a non-magnetic shell. the system could be used not only for isolating ctcs, but also for the in vivo isolation of other rare cells from the peripheral blood, including hematopoietic and/or mesenchymal stem cells, with applications in regenerative medicine and/or in stem cell transplantation. keywords: circulating tumor cells, in vivo isolation, magnetic nanoparticles. 1. introduction circulating tumor cells (ctcs) play a crucial role in tumor dissemination and in the metastatic cascade. ctcs are very rarely occurring cells, surrounded by billions of hematopoietic cells in the bloodstream. recent advances in detection methods have enabled them to be identified reproducibly and characterized further [1]. ctcs have consistently shown a prognostic value for several types of cancer, including breast, prostate and colon cancer [2–4]. however, ctcs represent a heterogeneous population of cells with various phenotypes and biological values [1]. different methods detect different ctc subpopulations with different clinical and biological values. all data on ctcs should therefore be interpreted within the context of the detection method that is used [1]. despite recent technological advances, we are still only beginning to understand ctc-related processes in tumor dissemination and progression. various strategies are used for detecting and characterizing ctcs, including morphological and physical characteristics, such as size and weight, and detecting the expression of specific markers. in carcinomas, ctcs are usually identified on the basis of the expression of epithelial-lineage markers, such as epcams (epithelial cell adhesion molecules) or cytokeratins (cytoskeletal proteins present in epithelial cells), the absence of a common leukocyte marker (cd45), and/or the presence of putative tumor-specific antigens (for example muc1 and her2) [1]. due to the small numbers of ctcs, almost all detection methods include enrichment steps (negative selection, including depletion of hematopoietic cells, or positive selection, typically using an anti-epcam antibody) to increase the detection success rate. despite this enrichment, there is only a very limited amount of isolated ctcs, and ctcs cannot be characterized more closely. almost all current detection methods detect ctcs from a limited amount of peripheral blood that is drawn from the venous system, while all ctc isolation steps are performed in vitro. in this paper, we propose a new system and device for isolating ctcs from the peripheral blood in vivo. the system enables more cells to be isolated for further research and applications. a major advantage of the proposed system is that it could isolate ctcs (or other cells of interest) from the whole volume of the blood, whereas existing detection methods isolate ctcs from a limited volume of blood.
2. material and methods 2.1. magnetic nanoparticles a system of magnetic nanoparticles covered by monoclonal antibodies against a common epithelial antigen (epcam) is used to isolate circulating tumor cells from the peripheral blood. magnetic nanoparticles are used because they can be manipulated easily (held and transported) by means of a magnetic field. the magnetic particles can thus be fixed in a vein for a long time, using permanent supermagnets attached to the arm. this approach is likely to be more comfortable for patients. the magnetic force, F_mag, acting on the nanoparticles should be strong enough in relation to the viscous force, F_vis, caused by the blood flow. nacev et al. [5] have analyzed the diffusion effects and the velocity profiles of venous blood, and have shown that a ratio F_mag/F_vis > 5·10⁻⁵ is sufficient for typical viscous forces in the central parts of the vein. because the blood velocity decreases continuously from the central parts of a vein towards its walls, a fixed nanoparticle layer will be formed at the edges, i.e., at the venous walls. the magnetic force acting on a nanoparticle is proportional to its volume, so large-sized nanoparticles are usually preferred for this application. in cases that are important for practical applications, commercially available particles about 30 nm in diameter would be a suitable choice. 2.2. magnets an external magnetic field is an advantageous concept, since it prevents the magnetically guidable nanoparticles from spreading over the blood vessels. large supermagnets are used to control the position of the nanoparticles, and to prevent them from moving beyond the area of the arm. the nanoparticles are exposed to circulating tumor cells, and ctcs are picked up by binding to an epcam antibody on the surface of each nanoparticle. by manipulating them with the large magnets, the nanoparticles can easily be transported towards the venous cannula after the in vivo incubation period. 2.3. a wire with a magnetic core a special wire made of a magnetic core wrapped in a non-magnetic shell (figure 1: a special wire containing supermagnetic pellets in a non-magnetic mantle) is inserted into the cannula, and the large magnets are removed. the nanoparticles with ctcs are attracted by the wire, and are then transferred from the cannula to a tube with an appropriate medium, where the particles scatter within a short time. before further analysis, the magnetic core should be removed. to make this procedure possible, the wire has to produce a magnetic force large enough to act on the nanoparticles, but its interaction with the large magnets must remain weak. since the magnetic force is proportional to the magnetic field gradient, the magnetic core can advantageously be made as a chain of supermagnetic pellets with opposite orientations. in this type of configuration, the magnetic field of the pellets is pulled out of the wire, and large gradients are produced near the touching pellets. since the force vector between the pellets and the large magnets changes its direction along the chain, the resulting net force between the whole wire and the large magnets is relatively small.
if the individual pellets are well separated (e.g., equispaced), rather than glued together, the wire becomes flexible enough for in vivo manipulation. we used a software implementation of the finite-element method [6] to analyze the wire shown in figure 1. pellets 1 mm in diameter and 2 mm in length, made from standard neodymium magnets, were considered. figure 2 shows the magnetic induction, B, computed along a line perpendicular to the wire and crossing the wire between the pellets (the length is measured from the axis of symmetry).

figure 2. magnetic induction, B, determined along the line perpendicular to the wire and crossing the wire between the pellets. inset: magnetic field lines in the vicinity of the wire.

a magnetic field gradient of 500 t/m was found at a distance of 1 mm from the center of the wire (0.5 mm above the surface of the pellets). a magnetic induction exceeding 0.2 t is large enough to saturate the magnetization of the nanoparticles in this region [7]. the magnetic properties of fe₃o₄ nanoparticles 30 nm in size [7] were taken into consideration when determining the magnetic force: a density of 5 g/cm³ and a magnetic saturation of 20 emu/g = 100 emu/cm³ = 10⁵ a/m. the magnetic force acting on a nanoparticle with radius r is

\[ f_{\mathrm{mag}} = M V \,\mathrm{grad}\, B = \frac{4 \pi M r^{3} \,\mathrm{grad}\, B}{3} , \]

where M is the magnetization of the particle and V is its volume. using the parameters mentioned above, we obtain f_mag = 5.7×10⁻¹⁵ n. the viscous force in a non-turbulent flow can be expressed as

\[ f_{\mathrm{vis}} = 6 \pi r \eta v , \]

where η is the dynamic viscosity of the blood plasma and v is the velocity of the blood at the center of the vein. using typical values [8] η = 1.3×10⁻³ pa·s and v = 10 cm/s, we obtain f_vis = 3.7×10⁻¹¹ n. a ratio of f_mag/f_vis = 1.5×10⁻⁴ is sufficient to hold the nanoparticles near the surface of the magnetic wire.
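the two force estimates above are easy to reproduce. the following sketch is a minimal numerical cross-check (python, standard library only; all input values are the ones quoted above). the particle radius entering each force is an assumption flagged in the comments, because the text quotes a 30 nm particle size without making explicit whether the radius or the diameter enters each formula.

```python
import math

# values quoted in the text above
m_sat = 1.0e5     # saturation magnetization of fe3o4 [A/m] (20 emu/g)
grad_b = 500.0    # field gradient near the touching pellets [T/m]
eta = 1.3e-3      # dynamic viscosity of blood plasma [Pa.s]
v_blood = 0.10    # blood velocity at the centre of the vein [m/s]

def f_mag(r):
    """magnetic force on a magnetically saturated sphere of radius r [m]."""
    return m_sat * (4.0 / 3.0) * math.pi * r ** 3 * grad_b

def f_vis(r):
    """stokes drag on a sphere of radius r [m] in non-turbulent flow."""
    return 6.0 * math.pi * r * eta * v_blood

# assumption: try the quoted 30 nm figure both as a radius and as a diameter
for r in (30e-9, 15e-9):
    print(f"r = {r * 1e9:.0f} nm: f_mag = {f_mag(r):.1e} N, "
          f"f_vis = {f_vis(r):.1e} N, ratio = {f_mag(r) / f_vis(r):.1e}")
```

run as written, this returns f_mag ≈ 5.7×10⁻¹⁵ n for r = 30 nm and f_vis ≈ 3.7×10⁻¹¹ n for r = 15 nm, i.e. the quoted pair of numbers (and the quoted ratio of 1.5×10⁻⁴) appears to combine the two radii; with a single consistent radius the ratio comes out between ≈ 2×10⁻⁵ (r = 15 nm) and ≈ 8×10⁻⁵ (r = 30 nm), still of the order of the 5×10⁻⁵ threshold from [5].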
3. results

this is a new system and device for isolating circulating tumor cells from the peripheral blood in vivo. the system is based on isolating ctcs in vivo, using magnetic nanoparticles covered by a monoclonal antibody (figure 3a).

figure 3. proof of concept.

magnetic nanoparticles (a) covered by monoclonal antibodies against a common epithelial antigen are slowly injected into the cubital vein (or into another appropriate vein) through the venous cannula. the position of these nanoparticles is controlled by large magnets, and their movement is limited to the area of the arm (or to another area of interest) (figure 3b). the nanoparticles are exposed to the circulating tumor cells, and ctcs are picked up by binding to the epcam antibody on the surface of a nanoparticle. the special wire (b) consists of a magnetic core covered by a non-magnetic mantle. the wire is used for removing the nanoparticles, with the ctcs attached to them, from the bloodstream (figure 3c). after in vivo incubation, the nanoparticles are moved closer to the venous cannula by moving the large magnets. then the special wire (b) is inserted through the cannula (figure 3d). subsequently, the large magnets are removed and the nanoparticles are attracted by the magnetic wire. the magnetic wire, covered with nanoparticles with ctcs attached to them, is then removed from the cannula (figure 3e). the wire is placed into a tube with an appropriate medium. the magnetic core is removed, and the nanoparticles with ctcs attached to them are released into the medium for further analysis (figure 3f).

4. discussion

numerous methods are used for isolating ctcs from the peripheral blood, with various levels of success in detecting ctcs, ranging from 1 cell per 7.5 ml of blood to several thousand; with most of the methods, however, the number of ctcs that are detected is very small [1, 9, 10]. almost all of these methods isolate ctcs from a limited volume (5–30 ml) of peripheral blood. ctcs occur in small numbers in the blood, and only a limited number of ctcs can be used for further biological characterization. our system and device overcome this limitation by means of in vivo isolation of ctcs. in comparison with a similar system [11], the main advantage of our system is that the nanoparticles float in the bloodstream, and are therefore exposed to ctcs from the whole volume of the blood. the nanoparticles are exposed to the blood not just at the site where the venous cannula is inserted (as is the case in [11]); they can be moved to large veins with a bigger blood flow, so they can be exposed to more ctcs from the whole blood. moreover, the nanoparticles can be exposed to the blood flow for several hours, so they can be washed by the whole volume of the blood several times before they are extracted from the bloodstream. their positions are controlled by external magnets, and the nanoparticles can be placed in any vein of the human body. another advantage is that the blood does not leave the vasculature, so our method is not associated with any blood loss, even when the isolation of ctcs is repeated. unlike apheresis methods for ctc isolation, the system that we propose here does not need blood anti-coagulation, and the blood is not discharged outside the vasculature into any artificial equipment. the method is therefore safer and more comfortable for patients. in addition, the patient is not bed-bound: the system can be moved while the patient is exercising, and normal activities can be performed during the procedure. one limitation of the proposed method is that it utilizes positive selection using an anti-epcam antibody. ctcs without epcam expression, including those undergoing the epithelial-to-mesenchymal transition, are therefore not detected by this method. however, depending on the antibody utilized on the surface of the nanoparticles, this system could be used for detecting other rare cells present in the peripheral blood, including hematopoietic stem cells. this system has several potential applications. for example, it can be used in ctc research to characterize ctcs better, to identify therapeutic targets on ctcs, to perform molecular profiling of tumors that are seeded by ctcs, to study the mechanisms of drug resistance, and/or to develop cancer vaccines from the predominant ctc clones. other potential applications include in vivo isolation of dendritic cells for developing tumor vaccines, or of mesenchymal stem cells with applications in regenerative medicine and/or in stem cell transplantation. in addition, the system can be utilized for detecting and isolating any cell of interest present in the blood and/or in another body fluid, with potential uses in translational research, personalized medicine and treatment. in conclusion, our new system for isolating ctcs from the peripheral blood in vivo enables ctcs to be isolated from the whole volume of the blood for further research and applications.
in addition to its use for isolating ctcs, the system could be used for isolating other rare cells from the peripheral blood in vivo, including hematopoietic and/or mesenchymal stem cells, with applications in regenerative medicine and/or in stem cell transplantation.

acknowledgements

the methods described in this paper are the subject of a patent application submitted by miroslav kocifaj and michal mego. this publication is an outcome of project 1/0724/11, funded by the slovak grant agency vega, and project apvv-0016-11, supported by the slovak research and development agency. this work also received support from the slovak national grant agency vega (grant 2/0002/12). the work presented here has also been supported by research & development operational programme project 26240120026, funded by the erdf. there are no competing interests. the study involves no human subjects.

references

[1] mego m, mani sa, cristofanilli m, molecular mechanisms of metastasis in breast cancer: clinical applications. nat. rev. clin. oncol. 7, 693–701 (2010). doi:10.1038/nrclinonc.2010.171
[2] cristofanilli m, budd gt, ellis mj, stopeck a, matera j, miller mc, reuben jm, doyle gv, allard wj, terstappen lw, hayes df, circulating tumor cells, disease progression, and survival in metastatic breast cancer. n. engl. j. med. 351, 781–791 (2004). doi:10.1056/nejmoa040766
[3] cohen sj, punt cj, iannotti n, saidman bh, sabbath kd, gabrail ny, picus j, morse m, mitchell e, miller mc, doyle gv, tissing h, terstappen lw, meropol nj, relationship of circulating tumor cells to tumor response, progression-free survival, and overall survival in patients with metastatic colorectal cancer. j. clin. oncol. 26, 3213–3221 (2008). doi:10.1200/jco.2007.15.8923
[4] de bono js, scher hi, montgomery rb, parker c, miller mc, tissing h, doyle gv, terstappen lw, pienta kj, raghavan d, circulating tumor cells predict survival benefit from treatment in metastatic castration-resistant prostate cancer. clin. cancer res. 14, 6302–6309 (2008). doi:10.1158/1078-0432.ccr-08-0872
[5] nacev a, beni c, bruno o, shapiro b, magnetic nanoparticle transport within flowing blood and into surrounding tissue. nanomedicine 5, 1459–1466 (2010). doi:10.2217/nnm.10.104
[6] meeker d, finite element method magnetics version 4.2 user’s manual. online: http://www.femm.info/archives/doc/manual42.pdf (2013-12-20)
[7] parvin k, ma j, ly j, sun xc, nikles de, sun k, wang lm, synthesis and magnetic properties of monodisperse fe3o4 nanoparticles. j. appl. phys. 95, 7121–7123 (2004). doi:10.1063/1.1682783
[8] klabunde re, cardiovascular physiology concepts. lippincott williams & wilkins, isbn 9781451113846 (2011).
[9] allard wj, matera j, miller mc, repollet m, connelly mc, rao c, tibbe ag, uhr jw, terstappen lw, tumor cells circulate in the peripheral blood of all major carcinomas but not in healthy subjects or patients with nonmalignant diseases. clin. cancer res. 10, 6897–6904 (2004). doi:10.1158/1078-0432.ccr-04-0378
[10] pachmann k, camara o, kavallaris a, krauspe s, malarski n, gajda m, kroll t, jörke c, hammer u, altendorf-hofmann a, rabenstein c, pachmann u, runnebaum i, höffken k, monitoring the response of circulating epithelial tumor cells to adjuvant chemotherapy in breast cancer allows detection of patients at risk of early relapse. j. clin. oncol. 26, 1208–1215 (2008). doi:10.1200/jco.2007.13.6523
[11] saucedo-zeni n, mewes s, niestroj r, gasiorowski l, murawa d, nowaczyk p, tomasi t, weber e, dworacki g, morgenthaler ng, jansen h, propping c, sterzynska k, dyszkiewicz w, zabel m, kiechle m, reuning u, schmitt m, lücke k, a novel method for the in vivo isolation of circulating tumor cells from peripheral blood of cancer patients using a functionalized and structured medical wire. int. j. oncol. 41, 1241–1250 (2012).

acta polytechnica 53(supplement):665–670, 2013. doi:10.14311/ap.2013.53.0665

population of black holes in the milky way and in the magellanic clouds

janusz ziółkowski, copernicus astronomical center, ul. bartycka 18, 00-716 warsaw, poland; corresponding author: jz@camk.edu.pl

abstract. in this review, i will briefly discuss the different types of black hole (bh) populations (supermassive, intermediate mass and stellar mass bhs) both in the galaxy and in the magellanic clouds and compare them with each other.

keywords: stars: binaries, stars: x-ray binaries, stars: black holes, black holes: masses, black holes: spins, individual: sgra*.

1. introduction

our galaxy (mw) contains one supermassive bh (sgra*), a small number (between zero and a few tens) of intermediate mass bhs (imbhs) and about 10⁷ to 10⁸ stellar mass bhs (sm bhs). the magellanic clouds (mcs) contain no supermassive bhs and, most likely, no imbhs. the number of sm bhs in the mcs is not well estimated, but it is possibly smaller than the number inferred from the relative mass of the mcs with respect to the mw (∼ 0.1): 10⁶ to 10⁷. in this review, i will briefly discuss all three classes of bhs and compare their populations in the mw and in the mcs.

2. sgra*

the evidence for the existence of a supermassive bh in the center of the mw is extremely robust. the most recent estimate of the mass of sgra*, based on a new precise astrometric and radial velocity determination of the orbit of the star s-02, is (4.5 ± 0.4) × 10⁶ m☉ [20]. the present level of activity of sgra* is extremely low: its luminosity is only ∼ 10³³ erg/s (the time-averaged energy of occasional weak x-ray and ir flares). recent radio observations at 1.3 mm [11] permitted us, for the first time in history, to see structures on the scale of the event horizon. for the first time, the size of the radio image of sgra* was not set by interstellar scattering, but reflected the true size of the source. the radio image obtained by doeleman et al. had an ellipsoidal shape with a major axis of 37 (+16, −10) µas. the angular diameter of the event horizon of sgra* at the distance of the galactic center (8.4 kpc) should be ∼ 20 µas. however, due to light bending, the apparent size of the event horizon for a distant observer should be ∼ 52 µas for a non-rotating bh or ∼ 45 µas for a maximally rotating bh. doeleman et al. concluded that the emission from sgra* is not exactly centered on the bh; it might be, e.g., the base of a jet or a part of the disc.
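these angular scales follow directly from the quoted mass and distance. the sketch below is a quick consistency check (python; the only inputs are the mass and distance quoted above, the rest are physical constants, and the factor 2√27 for the lensed shadow of a non-rotating bh is the standard schwarzschild result):

```python
import math

G = 6.674e-11          # gravitational constant [m^3 kg^-1 s^-2]
c = 2.998e8            # speed of light [m/s]
m_sun = 1.989e30       # solar mass [kg]
kpc = 3.086e19         # kiloparsec [m]
rad_to_uas = 180.0 / math.pi * 3600.0e6   # radians -> microarcseconds

m_bh = 4.5e6 * m_sun   # sgra* mass, as quoted above [20]
d = 8.4 * kpc          # distance to the galactic center
r_g = G * m_bh / c**2  # gravitational radius

horizon = 4.0 * r_g / d * rad_to_uas                   # diameter 2 r_s = 4 r_g
shadow = 2.0 * math.sqrt(27.0) * r_g / d * rad_to_uas  # lensed shadow, a* = 0

print(f"event horizon angular diameter ~ {horizon:.0f} microarcsec")
print(f"apparent (lensed) shadow       ~ {shadow:.0f} microarcsec")
```

with these rounded inputs the script returns ≈ 21 µas and ≈ 55 µas, in reasonable agreement with the ∼ 20 µas and ∼ 52 µas quoted above; the residual differences simply reflect the adopted values of the mass and the distance.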
yuan et al. [43] calculated the images for disc emission close to the event horizon, assuming a radiatively inefficient advection flow. their conclusion is that either the disc is highly inclined or sgra* is rotating fast.

recently, a gas cloud on a collision course with sgra* was discovered [6, 21]. the dusty gas cloud has a mass of ∼ 1.7 × 10²⁸ g (about 3 earth masses) and an estimated temperature of ∼ 550 k (from the fact that it is seen in the l but not in the k infrared band). the cloud moves along an elliptic orbit with an eccentricity of 0.94 ± 0.01 and an orbital period of 137 ± 11 yr. the pericentre distance from the bh is rather large: 3140 ± 240 schwarzschild radii. however, the cloud will not survive the pericentre passage, which will occur in the middle of 2013. the cloud evolution simulations [6] indicate that it will be disrupted by the tidal forces and its content will fall onto the central bh. it is estimated that the x-ray luminosity of sgra* will then reach a level of ∼ 10³⁴ erg/s, one order of magnitude higher than it is today. since we will be able to resolve the emission region, we may expect to obtain interesting information about the processes taking place in the immediate vicinity of the event horizon. this burst, which we will witness in about one year, will however be much weaker than the burst which happened about 300 years ago: the activity of sgra* was then so high that its luminosity was six orders of magnitude higher than it is now [29]. the evidence of this activity comes from the nearby x-ray reflection nebula sgrb2, which is still glowing due to past irradiation from the then much brighter sgra* [28].

3. intermediate mass bhs

there are no ultraluminous x-ray sources (ulxs) in the mw (nor in the mcs). therefore, the only place one may search for imbhs is in globular clusters (gcs). until recently, there was a general consensus that some galactic gcs contain black holes at their centers. this opinion was based mainly on modeling of the gravitational fields of the central regions of these clusters. the modeling, in turn, was based on the analysis of the brightness profiles of these regions. it was found that a useful parameter in the preliminary analysis of the brightness profiles is the ratio of the core radius to the half-mass radius, r_c/r_h. trenti [42] analyzed the dynamical evolution of a gc under a variety of initial conditions, and found that for a cluster consisting initially of single stars only, the final (after relaxation) value of r_c/r_h was ∼ 0.01; for a cluster containing 10 % of binaries this value was ∼ 0.1; but for a cluster containing an imbh the value of r_c/r_h was ∼ 0.3. these results confirmed earlier conclusions that imbh clusters have expanded cores. trenti subsequently considered 57 dynamically old (relaxed) gcs and found that for at least half of them the value of r_c/r_h is ≳ 0.2, which implies the presence of an imbh. it was concluded, therefore, that a substantial fraction of old gcs contain imbhs. this conclusion was supported by the finding [18] that gcs obey the relation (or, rather, an extension of it) between the velocity dispersion in the core and the mass of the central bh, found earlier for galaxies [13]. a detailed analysis of the brightness profiles was used to obtain quantitative estimates of the masses of the probable central bhs in some gcs.
the leading candidates were m15 (∼ 2000 m☉, [19]) and ω cen (∼ 50 000 m☉, [30]). recently, the case for imbhs in some gcs became weaker, after fregeau et al. [16] carried out an analysis of the white dwarf (wd) populations in gcs. their analysis suggests that wds receive a kick of a few km/s shortly before they are born. the effect of this kick is an increase in both r_c/r_h and the velocity dispersion. as a result, at the moment, no globular cluster requires an imbh at its center. this, of course, does not mean that there are no central bhs in some gcs, but the case for their presence is now far from robust. no attempt has so far been made to search for possible imbhs in the gcs of the mcs.

3.1. a ulx in a globular cluster?

at the end of this section, i should mention an object that might be a ulx in a gc. this object is a bright x-ray source in an unnamed gc belonging to the virgo cluster giant elliptical galaxy ngc 4472 [25]. the source luminosity is ∼ 4 × 10³⁹ erg/s (which corresponds to a mass of ∼ 25–30 m☉ if the source emits at the eddington level). the source exhibits x-ray luminosity variability by a factor of 7 in a few hours, which excludes the possibility that the object is several neutron stars superposed. as the x-ray luminosity of massive x-ray binaries is typically substantially smaller than the eddington luminosity, the mass of the compact object might be significantly higher than the lower limit given above. it seems likely that the gc in question contains a ulx that harbors a fairly massive bh (although, perhaps, not an imbh).

4. stellar mass bhs

4.1. bh candidates from microlensing events

microlensing events are, at present, the only method for detecting solitary stellar mass bh candidates (bhcs). the method is based on mass estimates for the lensing objects. such estimates are possible only for the so-called “parallax events”. these are events that are long enough to show magnification fluctuations reflecting the orbital motion of the earth around the sun. this effect permits us to calculate the “microlensing parallax”, which is a measure of the relative transverse motion of the lens with respect to the observer. assuming the standard model of the galactic velocity distribution, we are then able to perform a likelihood analysis, which permits us to estimate the distance and the mass of the lens. with the help of the above analysis, some long events may be selected as possibly caused by black hole lenses. the list of such candidates has not changed in the last few years. it still contains only four events: macho-96-blg-5 (probable mass of the lens ∼ 3–16 m☉, [1]), macho-98-blg-6 (probable mass of the lens ∼ 3–13 m☉, [1]), macho-99-blg-22 = ogle-1999-bul-32 (probable mass of the lens ∼ 100 m☉, [2]) and ogle-sc5-2859 (probable mass of the lens ∼ 7–43 m☉, [39]). paczyński [32] promised a substantial increase in the number of possible bh lenses within some 2–3 years of the start of the ogle iii project. ogle iii (which started in 2001) was predicted to detect more than 500 events per year and, among them, some 20–30 parallax events. paczyński expected that a few of them (per year) should be bhcs. however, no new firm detections were reported by the end of the ogle iii project (2009). fortunately, the ogle iv project (which started in 2010) has detected several long events. these events are now being analyzed and supported with supplementary hst observations. there is hope that the list of possible bh lenses will increase in the near future.
4.2. bhs in non-x-ray binaries

there are rather few binaries that are not x-ray emitters but might still be suspected of harboring a black hole. in such cases, the evidence comes from mass functions indicating the presence of a massive but unseen member of the system. there are some w–r stars with massive unseen companions that are mentioned in this context [8]. there are some low mass binaries in which the observed component displays ellipsoidal-type variability due to the tidal action of an unseen massive companion (in some cases, possibly, a bh [35]). quite recently, an analysis of the well-known binary v pup [33] indicated that the system is probably a triple, and that the third, unseen companion is most likely a black hole.

4.2.1. wr + unseen companion binaries

about 20 such binaries are known [8]. most of them have high z-altitude values over the galactic plane, which might indicate that they survived a supernova explosion. if so, then the unseen companions must be relativistic objects. their mass estimates, derived from the mass functions, indicate that at least some of them must be black holes. the strongest case is the binary cd−45°4482 (the lower mass limit of the unseen companion, estimated from the radial velocities of the wr star, is 5.5 m☉).

4.2.2. binaries with ellipsoidal type variability caused by the presence of an unseen massive companion

sądowski et al. [36] indicated that bhs might be detected in some pre-xrbs. these are systems in which the mass transfer has not yet started, but the optical component is large enough (with respect to its roche lobe) to be tidally distorted by its massive unseen companion. the photometric light curve of such a system exhibits the characteristic ellipsoidal-type variability. różyczka et al. [35] investigated 11 objects in the globular clusters ω centauri and ngc 6397. these objects were selected photometrically, on the basis of their ellipsoidal-type variability, and were subsequently observed spectroscopically in search of radial velocities. as a result of these observations, ten of them were found to be binaries. further analysis indicated that the system named v36 (in ngc 6397) contains an unseen degenerate companion with a mass in the range 2–4 m☉. it might be either a heavy ns or a light bh.

4.2.3. v pup binary system

v pup is known as a high mass eclipsing binary with an orbital period of 1.45 days and component masses of 15.8 and 7.8 m☉. however, the orbital solution is not sufficiently accurate, since it produces residuals in the form of a cyclic orbital period oscillation with a periodicity of 5.47 years. if these residuals are interpreted as caused by an unseen third body, then the mass of this body (orbiting the close pair in 5.47 years) must be ≳ 10.4 m☉ [33]. it is likely that this body is a black hole.

4.3. bhs in x-ray binaries

x-ray binaries (xrbs) are still the main source of information about stellar mass bhs. at present, the list of xrbs harboring bh candidates (bhcs) contains 64 objects (62 in the mw and 2 in the mcs). we include here the controversial system ci cam, although it may turn out that it contains a white dwarf and not a bh (both classes of sources have very soft x-ray spectra). among these 64 binaries, there are 24 containing confirmed bhs with dynamical mass estimates.
here, we include the well-known tev binary ls 5039, although both the interpretation of the mass function and the true nature of the compact object have recently become controversial. we also include the celebrated system cyg x−3, although its mass estimate only marginally indicates the presence of a black hole (there are, however, other arguments in favor of its bh nature). if we consider the distribution of bhcs between the class of high mass xrbs (hmxbs) and low mass xrbs (lmxbs), then we find 56 bhcs in lmxbs (all in the mw) and only 8 in hmxbs (6 in the mw and 2 in the mcs). for confirmed bhs the numbers are: 17 bhs in lmxbs and 7 bhs in hmxbs (5 in the mw and 2 in the mcs). it is also worth noting that 14 bhcs are microquasars (all of them in the mw: 9 in lmxbs and 5 in hmxbs).

4.3.1. masses of stellar mass bhs

the smallest mass of a probable bh (∼ 2.5 m☉) was recently found for cyg x−3 [44]. unfortunately, this estimate is not very precise (the permitted value of the mass is in the range 1.3 to 4.5 m☉). moreover, one should remember that many wildly different estimates of the mass of the compact object in cyg x−3 were given earlier in the literature (one recent estimate was by shrader et al. [38]). the estimate by zdziarski et al. is not the last word in this area. if we consider the more precise determinations, then we find that the range of the masses has not changed recently. still, the lightest bhs have masses ∼ 4 m☉ (gro j0422+32, m ≈ 4 ± 1 m☉ [14, 31]; grs 1009−45, m ≈ 4.4–4.7 m☉ [15]) and the heaviest have masses ∼ 16 m☉ (ss 433, m ≈ 16 ± 3 m☉ [4]; cyg x−1, m ≈ 16 ± 3 m☉ [47]; grs 1915+105, m ≈ 14 ± 4.4 m☉ [24]). during any discussion of the low mass bhs, the question of the oppenheimer–volkoff mass (the largest possible mass for a ns) inevitably shows up. theoretical estimates (1.4–2.7 m☉) remain highly uncertain (we still do not know the proper equation of state). there has, however, been progress in observational measurements. until quite recently, the measured values were all consistent with masses not greater than ∼ 1.4 m☉. this is no longer true. the first ns mass substantially higher than 1.4 m☉ was measured with great precision by champion et al. [7] (psr j1903+0327, m = 1.67 ± 0.01 m☉). then demorest et al. [10] found (from the general relativistic shapiro delay) that the mass of the radio pulsar psr j1614−2230 is 1.97 ± 0.04 m☉. we should emphasize that this is a high precision determination. therefore, at present, the upper mass limit for nss is ≳ 1.97 m☉. that is not the end of the story. a few years ago, freire et al. [17] analyzed the radio pulsar ngc 6440b (in the globular cluster ngc 6440). the mass estimate is, unusually for a radio pulsar, still very imprecise (it is based on only one year of observations). however, it indicates a mass larger than 2 m☉ with a probability greater than 99 %; the most likely value is 2.74 m☉. the precision of this determination will improve substantially after a few more years of observations. the outcome might be very exciting. if a value in excess of 2.5 m☉ is confirmed, it would mean a disaster for most of the equations of state for dense matter, but it would also mean that very heavy nss do exist.
this would make the discrimination between nss and bhs, based on the mass of the compact object, more difficult, as light bhs of similar mass (∼ 2.5 m☉) can also exist (and cyg x−3 might be an example).

4.3.2. spins of stellar mass bhs

there are three basic methods of deducing the spin of an accreting black hole: modeling the spectral energy distribution of the x-ray continuum, modeling the shape of the x-ray fe kα line, and interpreting the high frequency quasi-periodic oscillations (khz qpos). the resulting spin estimates are usually expressed with the help of a dimensionless angular momentum parameter a∗, where a∗ = 0 corresponds to a non-rotating (schwarzschild) black hole and a∗ = 1 corresponds to a maximally rotating black hole with prograde spin (i.e., rotating in the same direction as the accretion disc).
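what a∗ controls physically is, above all, the radius of the innermost stable circular orbit (isco), where a standard accretion disc is assumed to terminate: it shrinks from 6 gm/c² at a∗ = 0 to 1 gm/c² at a∗ = 1 for prograde rotation. as a rough illustration, the standard bardeen–press–teukolsky expression can be evaluated as follows (python; this sketch is purely illustrative and is not the fitting machinery of the papers cited below):

```python
import math

def r_isco(a, prograde=True):
    """isco radius in units of GM/c^2 for dimensionless spin a in [0, 1]
    (bardeen, press & teukolsky 1972)."""
    z1 = 1 + (1 - a**2) ** (1 / 3) * ((1 + a) ** (1 / 3) + (1 - a) ** (1 / 3))
    z2 = math.sqrt(3 * a**2 + z1**2)
    sign = -1 if prograde else 1
    return 3 + z2 + sign * math.sqrt((3 - z1) * (3 + z1 + 2 * z2))

for a in (0.0, 0.26, 0.65, 0.9, 0.98, 0.998):
    print(f"a* = {a:5.3f}  ->  r_isco = {r_isco(a):5.2f} GM/c^2")
```

a disc reaching in to r ≈ 1–2 gm/c² instead of 6 gm/c² radiates a hotter continuum and produces a broader, more redshifted kα red wing; these are the signatures that the two fitting methods exploit.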
spectral energy distribution (x-ray continuum). zhang et al. [45] were the first to discuss the x-ray emission from the discs around rotating black holes (kerr bhs). using very rough estimates, they found evidence of rapid rotation for two galactic microquasars: gro j1655−40 (spin parameter a∗ ≈ 0.93) and grs 1915+105 (a∗ ≈ 1.0). in recent years, a very careful and detailed analysis was performed by mcclintock and his collaborators. in a series of papers [9, 22, 23, 26, 37] they made spectral fits for six x-ray binaries. their results are shown in tab. 1. the table also contains one determination made by steiner et al. [41] for the system xte j1550−564; as we may see, steiner et al. are less optimistic about the precision of the continuum fit determination than mcclintock’s group. generally, these determinations might be considered rather reliable, with the possible exception of cyg x−1.

table 1. spin estimates based on modeling of the x-ray continuum.
name            a∗
lmc x−3         < 0.26
xte j1550−564   0.34 (+0.37, −0.45)
gro j1655−40    0.65–0.75
4u 1543−47      0.75–0.85
lmc x−1         0.92 (+0.05, −0.07)
cyg x−1         > 0.95
grs 1915+105    > 0.98

modeling of the x-ray fe kα line. broad fe kα lines are observed in the spectra of a growing number of x-ray binaries (the most recent summary is given by miller et al. [27]). these lines are believed to originate in the innermost regions of the discs, due to their irradiation by a source of hard x-rays (most likely a comptonizing corona). if, due to the rapid rotation of the bh, the disc extends to a smaller radius than would be possible for a non-rotating bh, then the line is expected to be more redshifted and more distorted. modeling of the shape of the fe kα line has produced results that are generally similar to, but not fully consistent with, the results obtained from the x-ray continuum fits. table 2 contains the results summarized by miller et al. [27], together with the more recent determinations for grs 1915+105 [3], swift j1753.5−0127 [34], xte j1550−564 [41], cyg x−1 [12] and lmc x−1 [40].

table 2. spin estimates based on modeling the fe kα line. note: a number in parentheses shows the uncertainty of the last digit.
name               a∗
4u 1543−47         0.3 (1)
sax j1711.6−3808   0.6 (+0.2, −0.4)
xte j1550−564      0.55 (+0.15, −0.22)
grs 1915+105       0.56 (+0.02, −0.02)
swift j1753.5−0127 0.76 (+0.11, −0.15)
xte j1908+094      0.75 (9)
xte j1650−500      0.79 (1)
lmc x−1            0.97 (+0.01, −0.13)
cyg x−1            0.97 (+0.014, −0.02)
gro j1655−40       0.98 (1)

high frequency quasi-periodic oscillations (khz qpos). qpos in bh binaries are still not well enough understood, and no progress has been made in this area in recent years. the situation remains as it was when reviewed by me four years ago [46].

summary of bh spins. before summarizing the situation, i have to note that the precision of both principal methods (i.e., the spectra of the discs and the shape of the fe kα lines) is being questioned. it is indicated that the fitting of the continuum spectra is model-dependent and sensitive to the uncertainties of the absorption corrections [9]. as for the shape of the kα line, the fitting is sensitive to the uncertainty of the continuum level determination (e.g. [5]). this sort of widespread scepticism was supported by the large (sometimes very large) disagreements between the results of the two methods. for example, for cyg x−1 the continuum fit method gave the result a∗ > 0.95 [23], but the iron line fitting method gave the result a∗ = 0.05 ± 0.01. fortunately, the results of both methods now seem to be converging. teams using one or the other method are joining efforts and publishing joint papers. sometimes, both methods are used in one paper and the results are compared (steiner et al. [41] for xte j1550−564, or blum et al. [3] for grs 1915+105). having said that, and comparing the content of tabs. 1 and 2 in the context of the history of the topic, one can make the following observations:

(1.) some systems (cyg x−1, lmc x−1) probably have rotation close to maximal (a∗ > 0.9).
(2.) several other systems (gro j1655−40, xte j1650−500, xte j1908+094 and swift j1753.5−0127) have large spins (a∗ ≳ 0.65).
(3.) the case of grs 1915+105 is not decided yet. mcclintock et al. [26] got, from the continuum fit method, a∗ > 0.98. blum et al. [3], from the iron line fitting method, got a∗ = 0.56 ± 0.02. however, the same authors (blum et al.), in the very same paper, using the continuum fit method, got a∗ = 0.98 ± 0.01.
(4.) not all accreting black holes have large spins (a robust (?) result of a∗ < 0.26 for lmc x−3).
(5.) there are still substantial discrepancies between the results of the two methods, but they are significantly smaller than a few years ago.

5. comparison of different classes of xrbs in the mw and in the mcs

table 3. comparison of the numbers of different classes of xrbs in the mw and in the mcs.
name of the class                          mw    lmc   smc
total mass of the galaxy (in m_smc units)  100   10    1
hmxrbs                                     118   26    83
of these, bexrbs                           72    19    79
lmxrbs                                     197   2     –
bhcs                                       62    2     –

looking at the above table, we may observe in the mcs (in comparison with the mw):
• a lack of lmxrbs,
• a relative surplus of hmxrbs,
• a deficit of bhs.
these differences are real (it would be difficult to attribute them to selection effects). they are probably mostly due to a different star formation history in the mcs than in the mw.

acknowledgements

i would like to thank the unknown referee for several helpful comments and suggestions. this work was partially supported by the polish ministry of science and higher education (mshe) project 362/1/n-integral (2009–2012).

references

[1] bennett d.p., becker a.c., quinn j.l. et al.: 2002a, apj 579, 639
[2] bennett d.p., becker a.c., calitz j.j., johnson b.r., laws c., quinn j.l., rhie s.h., sutherland w.: 2002b, arxiv:astro-ph/0207006
[3] blum, j.l., miller, j.m., fabian, a.c., miller, m.c., homan, j., van der klis, m., cackett, e.m., reis, r.c.: 2009, apj 706, 60
[4] blundell, k.m., bowler, m.g., schmidtobreick, l.: 2008, apj 678, l47
[5] boller, th.: 2012, mmsai 83, 132
[6] burkert, a., schartmann, m., alig, c., gillessen, s., genzel, r., fritz, t.k., eisenhauer, f.: 2012, apj 750, 58
[7] champion, d.j., ransom, s.m., lazarus, p. et al.: 2008, sci. 320, 1309
[8] cherepashchuk, a.m.: 1998, in modern problems of stellar evolution, d.s. wiebe (ed.), geos, moscow, russia, p. 198
[9] davis, s.w., done, c., blaes, o.m.: 2006, apj 647, 525
[10] demorest, p.b., pennucci, t., ransom, s.m., roberts, m.s.e., hessels, j.w.t.: 2010, nature 467, 1081
[11] doeleman, s.s., weintroub, j., rogers, a.e.e., et al.: 2008, nature 455, 78
[12] fabian, a.c., wilkins, d.r., miller, j.m., reis, r.c., reynolds, c.s., cackett, e.m., nowak, m.a., pooley, g.g., pottschmidt, k., sanders, j.s., ross, r.r., wilms, j.: 2012, mnras 424, 217
[13] ferrarese, l., merritt, d.: 2000, apj 539, l9
[14] filippenko, a.v., matheson, t., ho, l.c.: 1995, apj 455, 614
[15] filippenko, a.v., matheson, t., leonard, d.c., barth, a.j., van dyk, s.d.: 1999, pasp 109, 461
[16] fregeau, j.m., richer, h.b., rasio, f.a., hurley, j.r.: 2009, apj 695, l20
[17] freire, p.c.c., ransom, s.m., begin, s., stairs, i.h., hessels, j.w.t., frey, l.h., camilo, f.: 2008, apj 675, 670
[18] gebhardt, k., rich, r.m., ho, l.c.: 2002, apj 578, l41
[19] gerssen, j., van der marel, r.p., gebhardt, k., guhathakurta, p., peterson, r.c., pryor, c.: 2003, aj 125, 376
[20] ghez, a.m., salim, s., weinberg, n.n., lu, j.r., do, t., dunn, j.k., matthews, k., morris, m.r., yelda, s., becklin, e.e., kremenek, t., milosavljevic, m., naiman, j.: 2008, apj 689, 1044
[21] gillessen, s., genzel, r., fritz, t.k., quataert, e., alig, c., burkert, a., cuadra, j., eisenhauer, f., pfuhl, o., dodds-eden, k., gammie, c.f., ott, t.: 2012, nature 481, 51
[22] gou, l., mcclintock, j.e., liu, j., narayan, r., steiner, j.f., remillard, r.a., orosz, j.a., davis, s.w., ebisawa, k., schlegel, e.m.: 2009, apj 701, 1076
[23] gou, l., mcclintock, j.e., reid, m., orosz, j.a., steiner, j., narayan, r., xiang, j., remillard, r.a., arnaud, k., davis, s.w.: 2011, apj 742, 85
[24] greiner, j., cuby, j.g., mccaughrean, m.j.: 2001, nature 414, 522
[25] maccarone, t.j., kundu, a., zepf, s.e., rhode, k.l.: 2007, nature 445, 183
[26] mcclintock, j.e., shafee, r., narayan, r., remillard, r.a., davis, s.w., li, l.-x.: 2006, apj 652, 518
[27] miller, j.m., reynolds, c.s., fabian, a.c., miniutti, g., gallo, l.c.: 2009, apj 697, 900
[28] murakami, h., koyama, k., sakano, m., tsujimoto, m., maeda, y.: 2000, apj 534, 283
[29] murakami, h., senda, a., maeda, y., koyama, k.: 2003, ans 324, 125
[30] noyola, e., gebhardt, k., bergmann, m.: 2006, asp conf. series, 352, 269
[31] orosz, j.a.: 2003, in a massive star odyssey: from main sequence to supernova, proceedings of iau symposium 212, k. van der hucht, a. herrero, and e. cesar (eds.), astronomical society of the pacific, san francisco, p. 365
[32] paczyński b.: 2003, in the future of small telescopes in the new millennium. volume iii: science in the shadows of giants, t.d. oswalt (ed.), astrophysics and space science library, volume 289, kluwer academic publishers, dordrecht, p. 303 (see also astro-ph/0306564)
[33] qian, s.-b., liao, w.-p., lajus, f.: 2008, apj 687, 466
[34] reis, r.c., fabian, a.c., ross, r.r., miller, j.m.: 2009, mnras 395, 1257
[35] różyczka, m., kalużny, j., pietrukowicz, p., pych, w., catelan, m., contreras, c., thompson, i.b.: 2010, a&a 524, 78
[36] sądowski, a., ziółkowski, j., belczyński, k.: 2006, procs. of the 6th microquasar workshop, t. belloni (ed.), proceedings of science (http://pos.sissa.it/), p. 64
[37] shafee, r., mcclintock, j.e., narayan, r., davis, s.w., li, l.-x., remillard, r.a.: 2006, apj 636, l113
[38] shrader, c., titarchuk, l., shaposhnikov, n.: 2010, apj 718, 488
[39] smith, m.c.: 2003, mnras 343, 1172
[40] steiner, j.f., reis, r.c., fabian, a.c., remillard, r.a., mcclintock, j.e., gou, l., cooke, r., brenneman, l.w., sanders, j.s.: 2012, arxiv:1209.3269
[41] steiner, j.f., reis, r.c., mcclintock, j.e., narayan, r., remillard, r.a., orosz, j., gou, l., fabian, a.c., torres, m.: 2011, mnras 416, 941
[42] trenti, m.: 2006, arxiv:astro-ph/0612040
[43] yuan, y.-f., cao, x., huang, l., shen, z.-q.: 2009, apj 699, 722
[44] zdziarski, a.a., mikołajewska, j., belczyński, k.: 2012, arxiv:1208.5455
[45] zhang, s.n., cui, w., chen, w.: 1997, apj 482, l155
[46] ziółkowski, j.: 2009, in frontier objects in astrophysics and particle physics (procs. of the vulcano workshop 2008), f. giovannelli & g. mannocchi (eds.), conference proceedings, italian physical society, editrice compositori, bologna, italy, 98, 205
[47] ziółkowski, j.: 2012, in preparation

discussion

maurice van putten: as a comment, candidates for low a/m are just as important as those for high a/m, in that they reflect possible lower bounds reflecting the interaction between the spin and the inner disk (van putten, m.h.p.m., 1999, science, 284, 115).

janusz ziółkowski: yes, thank you for the comment.

laura brenneman: the continuum fitting and fe kα line groups are now working together to make sure that the bh spins measured in galactic bhbs are consistent between the two methods. there is active work to revise both models: to parametrise spectral hardening in continuum fitting, and to take into account the ionization of the disk and the thermal x-ray emission of the disk in fe kα. fabian and nowak have recently revised the cyg x−1 spin to high values (a ≥ 0.9) using revised fe kα models.

janusz ziółkowski: thank you for this comment. it is certainly encouraging news. [following the referee’s suggestion, i have updated the relevant part of the written version of my review talk.]

acta polytechnica 53(supplement):698–702, 2013. doi:10.14311/ap.2013.53.0698

the mass composition of ultra high energy cosmic rays

aurelio f. grillo, infn – laboratori nazionali del gran sasso, ss 17 bis, 67100 assergi (aq), italy; corresponding author: grillo@lngs.infn.it

abstract. the status of the mass composition measurements of ultra high energy cosmic rays is presented, with emphasis on the results from the fluorescence detector of the pierre auger observatory. possible consequences of the present measurements are discussed, both on the particle physics and the astrophysics aspects.

1. introduction

at the highest energies (log₁₀ e > 18.5) ultra high energy cosmic rays (uhecrs) are very likely of extra-galactic origin.
measurements of the moments of their mass distribution when they hit the earth’s atmosphere are likely to give important clues on their sources, propagation and interactions at (center of mass) energies around 100 tev. the zeroth moment (the all-particle spectrum) by definition does not explicitly distinguish between different nuclear components, although its interpretation can easily be connected to them (see the report by r. aloisio at this conference). higher moments are starting to discriminate between different hypotheses, although they are of course more and more affected by statistical and systematic errors. the most used shower observables for studying the composition of ultra high energy cosmic rays (uhecrs) are the mean value of the depth of shower maximum, ⟨X_max⟩, and its dispersion, σ(X_max). inferring the mass composition from these measurements is subject to some level of uncertainty. this is because their conversion to mass relies on the use of shower codes which include the assumption of a hadronic interaction model. these interaction models [1] have in common the ability to fit lower energy accelerator data. however, they are based on different physical assumptions when extrapolating these low energy interaction properties to higher energies. consequently they provide different expectations for ⟨X_max⟩ and σ(X_max). in the following we will mainly discuss the different roles of the two observables, ⟨X_max⟩ and σ(X_max), with respect to the mass composition. on the basis of the superposition model [2], ⟨X_max⟩ is proportional to ⟨ln A⟩ and therefore actually measures the (average) mass composition, for both pure and mixed compositions. the behaviour of σ(X_max) is however more complex, and gives indications on the mass distributions corresponding to the same ⟨ln A⟩.

2. general ideas

as observed above, ⟨X_max⟩ can be directly connected to the average composition of the nuclear cosmic rays in the beam when they hit the atmosphere (the average being performed within reconstructed energy bins):

\[ \langle X_{max} \rangle = X_0 + X_1 \langle \ln A \rangle \quad (1) \]

where the coefficients depend on the details of the interaction of the beam with the atmosphere and generally depend logarithmically on its energy. for a given combination of nuclear species with normalized (generally energy dependent) weights {w_i} (given the low statistics at the highest energies, it is customary to group the nuclei into mass groups),

\[ \langle \ln A \rangle = \sum_i w_i \ln A_i . \]

here we remark that the nuclear weights at the earth’s atmosphere will in general be different from the corresponding fractions at the sources, since, at the energies we are considering, nuclei suffer photodisintegration, γ + A → (A − n) + n N, interacting with the universal radiation backgrounds in the extra-galactic space through which they propagate. this implies that even in the extreme case of sources producing pure compositions of nuclei (apart from protons), the detected composition will in general be mixed. the variance of X_max is

\[ \sigma^2(X_{max}) = X_1^2 \, \sigma^2_{\ln A} + \langle \sigma^2_{shower} \rangle \quad (2) \]

where

\[ \sigma^2_{\ln A} = \sum_i w_i \left( \ln A_i - \langle \ln A \rangle \right)^2 , \]

while ⟨σ²_shower⟩ = Σ_i w_i σ²_shower(A_i) describes the intrinsic fluctuations of X_max for the different nuclei; as such it depends on the details of the interactions, but generically decreases with increasing A, being maximum for protons and minimum for iron nuclei [3]. notice that the first term in eq. 2 is trivially zero for a pure composition, while the second term is obviously larger than zero. equation 1 describes how the measurement fixes the average (logarithmic) mass, but cannot discriminate the real composition.
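equations (1) and (2) are easy to explore numerically. the sketch below (python; the values of X₁ and of the per-species shower fluctuations are illustrative placeholders, not the predictions of any specific interaction model) compares a pure composition with a proton–iron mixture of the same ⟨ln A⟩, showing why the mixed beam maximizes the variance, the feature that shapes the linsley-type plot discussed next:

```python
import math

# illustrative per-species fluctuations of x_max [g/cm^2]; the real values
# are model-dependent and decrease from proton to iron (see [3])
sigma_shower = {"p": 60.0, "n": 35.0, "fe": 20.0}
ln_a = {"p": 0.0, "n": math.log(14.0), "fe": math.log(56.0)}
x1 = 25.0  # slope of <x_max> versus <ln A> [g/cm^2], illustrative value

def moments(weights):
    """<ln A> and sigma^2(x_max) from eqs. (1)-(2) for normalized weights."""
    mean = sum(w * ln_a[s] for s, w in weights.items())
    var_lna = sum(w * (ln_a[s] - mean) ** 2 for s, w in weights.items())
    var_shower = sum(w * sigma_shower[s] ** 2 for s, w in weights.items())
    return mean, x1 ** 2 * var_lna + var_shower

# a pure nitrogen beam versus a p/fe mixture with the same <ln A>
w_fe = math.log(14.0) / math.log(56.0)
for label, w in (("pure n", {"n": 1.0}),
                 ("p + fe", {"p": 1.0 - w_fe, "fe": w_fe})):
    mean, var = moments(w)
    print(f"{label}: <ln A> = {mean:.2f}, sigma(x_max) = {math.sqrt(var):.1f} g/cm^2")
```

both beams have ⟨ln A⟩ ≈ 2.6, but the mixture roughly doubles σ(X_max) (here ≈ 62 versus 35 g/cm²), which is exactly the spread between the lower envelope and the proton–iron line in fig. 1.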
figure 1. the relation between ⟨ln A⟩ and σ²(X_max), here for four mass groups (figure provided by m. unger).

equation 2 starts to help in this task, because for each average mass there is a spread of corresponding allowed values of the variance, so a given measurement can hopefully exclude possible contributions to the average mass. this is beautifully expressed in a plot originally proposed by linsley (fig. 1) [4]. in the original plot σ_ln A was plotted; here we use σ²(X_max), introducing a dependence on the propagation model. it is instructive to elaborate on this plot. the figure describes the possible range of σ²(X_max) for a given value of ⟨ln A⟩. clearly the cusps correspond to pure mass compositions: here σ² reaches the minimum consistent with a given average mass, since the first term in eq. 2 vanishes. for the same reason, the transition from a pure composition to the next one (e.g. from proton to helium) bounds its minimum variation. on the other hand, a superposition of the two extreme masses, here proton and iron, gives the largest variance. in fact, for a given ⟨ln A⟩ this combination requires a proton fraction larger than any other combination, and protons give the largest contribution to both terms of eq. 2.

3. the data from the pierre auger observatory

figure 2 shows the all-particle spectrum obtained by the pierre auger observatory. the cut-off of the spectrum at high energies has a significance of ≈ 20σ.

figure 2. the pierre auger observatory all-particle spectrum. see [7] for details.

the composition data from the pierre auger observatory discussed below are obtained from hybrid events, which are events detected by the fluorescence detector (fd) of the observatory with at least a coincident signal in one of the water cherenkov stations of the surface detector (sd); the data are those described in ref. [5], as updated in [6]. considering the data from december 2004 up to september 2010, 15 979 events remain after the fd quality cuts [5]. for the composition analysis, additional cuts are used to ensure that no bias with respect to the cosmic ray composition is introduced in the data sample. specifically, it is required that the trigger probability of the sd station be saturated both for proton and iron primaries, and only fd reconstructed geometries are kept for which the full range of X_max is observable. after these cuts, 6 744 events remain. the systematic uncertainty in the energy reconstruction of the fd events is 22 %; the average X_max resolution is ≈ 20 g/cm² over the energy range considered. furthermore, the rms(X_max) has been corrected for the detector resolution. the auger data are therefore detector independent and can be directly compared with simulations. let us first consider the plot of ⟨X_max⟩ in fig. 3. its energy dependence has been fitted with a broken line, showing an increase up to log₁₀ e ≈ 18.4, reaching a value compatible with a pure proton composition, followed by a much milder increase, consistent with the logarithmic increase of X_max with energy being compensated by a (logarithmic) increase of the average mass.
as also indicated by the expectations from the interaction models, the composition appears to become increasingly heavy with energy. we can therefore conclude that the ⟨X_max⟩ data are consistent with ⟨ln A⟩ increasing with energy, at least above log₁₀ e > 18.4. let us now discuss the plot of the fluctuations, the lower panel of fig. 3. let us first note that this plot can lead to misleading interpretations in the way it is customarily presented. the lines of the predictions of the interaction models are only for pure compositions. while in the ⟨X_max⟩ plot they have the generic meaning of the possible range of values of X_max, this is not true here: as indicated for instance in fig. 1, the fluctuations can be larger than those corresponding to the upper lines, i.e. pure proton compositions, while they cannot be smaller than those of a pure iron composition. to take an example, if the composition is proton dominated at some energy and then evolves toward larger mass through a proton–iron combination, then in general in some energy range the fluctuations can be larger than those of pure protons.

figure 3. X_max (upper panel) and rms(X_max) compared with predicted values for pure compositions in different interaction models.

figure 4. σ(X_max) versus ⟨X_max⟩ of the data, both normalized to iron, compared with predictions of various interaction models: qgsjet01/ii (top left and right), sibyll2.1 (bottom left) and eposv1.99 (bottom right). energy varies along the dotted line, the highest value being the lower points.

the variance reaches a maximum approximately at the same energy as the break of the slope of ⟨X_max⟩, then it starts to decrease approximately monotonically with energy. since ⟨ln A⟩ appears to rise monotonically in log₁₀ e, and considering eq. 2 and the behaviour of fig. 1, it appears that the transition of the average mass towards larger values happens consistently through almost pure compositions (σ_ln A ≈ 0), in particular with few protons at the higher energies. in [8] a thorough discussion is presented of the mass composition measurements in cosmic ray experiments. in particular, the combined ⟨X_max⟩ versus σ(X_max) behaviour of the auger data is analyzed with the help of a series of plots similar to fig. 1, for each interaction model considered and for five mass groups
from these plots it appears that the central values of the experimental data (within statistical errors only) have a varying, but generically large, tension towards all the interaction models apart from epos, especially at the highest energies. also, these plots confirm the general idea that the transition to larger masses can be better described by the dominance of different mass groups in different energy intervals, with small mixing between the groups, and in particular little admixture of protons. however, taking into account systematics considerably weakens these conclusions. the fluorescence detector of the pierre auger observatory has only a limited duty cycle, while the surface detector is continuously active, therefore composition-related observables connected to sd, although of less direct interpretation, are a valuable complement for direct longitudinal shower developement measurements. fig. 5 shows two such measurements, relating to muons (and electrons) in auger showers. this data confirm the fd measurements, especially when systematics is taken into account [9]. 4. discussion although a full analysis of the auger composition data is not yet available, some tentative conclusions can be drawn. but before discussing them, it is important to relate the behaviour of the composition to the data of other large experiments, hires [10], telescope array [11] and yakutsk [12]. the former is no longer in operation. these experiments claim a composition 700 vol. 53 supplement/2013 the mass composition of ultra high energy cosmic rays 1e+32 1e+33 1e+34 1e+35 18 18.5 19 19.5 20 20.5 21 proton nitrogen iron all particle figure 6. partial spectra (multiplied by e3, in arbitrary units) of a mixture of mass groups that would qualitatively reproduce both spectrum and composition data. compatible with pure proton. however their dataset is substantially smaller than that of auger, and their data are also compatible with auger data. moreover, with the cuts described above the auger data (and, although with a different strategy, those from yakustsk) are free from detector biases and therefore can be directly compared with simulations from interaction models, while for hires and ta the detector biases have been applied to the simulations. this makes a direct comparison of the data difficult. it should also be stressed that moving the data within the relatively large band of systematics, especially for the second moment (the variance of xmax), can greatly influence the conclusions, as can be seen in fig. 43. finally, we are implicitely assuming that at least some of the interaction models used are correct at these energies, which in the center of mass are approximately two orders of magnitude larger than those in lhc. a change in proton (and nuclei) interactions in these range would have profound consequences on the interpretation of the experimental data. coming back to the data we have seen that the behaviour of rms(xmax) seems to suggest an evolution of the mass composition toward larger values with little mixing between mass groups. in other words the transition towards large masses in energy is possibly happening through the dominance of a single mass group, as is sketched in fig. 6 for three mass groups (proton, nitrogen and iron). 
although this figure has to be seen as a guide to the eye, it is clear that a similar behaviour could apply for log₁₀ e > 18.4, given that the measurements suggest an increasing ⟨ln A⟩ plus a decreasing variance, remembering that one expects a decrease of ⟨σ_shower⟩ with A. note however that these are the spectra at earth. nuclei generally interact with the universal radiation backgrounds (both the cmb and the ebl) during their travel through extra-galactic space, and suffer photodisintegration in which the original A decreases. if one assumes the simplest model of (extra-galactic) sources of uhecrs at these energies (uniformly distributed, with a universal power law spectrum and a charge-dependent maximum energy), then it appears that very peculiar conditions must be fulfilled in order to reproduce the experimental moments of the mass distribution. in fact, to avoid overpopulating the partial spectra with species produced during the propagation in extra-galactic space (particularly protons), it appears that a low cut-off energy is needed at the source. this in turn implies very flat spectra (i.e. a differential slope < 2) to reproduce the observed data. in this case the observed high energy cut-off of the all-particle spectrum would be a feature of the sources, not of the propagation. these features appear to be at odds with generally accepted ideas on the acceleration of uhecrs, but of course this source model may be oversimplified. for instance, the spectrum might be dominated by only a few (maybe peculiar) nearby sources (see e.g. [13]). and, of course, the interaction models used to describe the data might be inadequate at these energies. this has been advocated in [14]. in conclusion, a combined, full analysis of the uhecr data (spectrum, composition and possibly anisotropy) is needed to get hints of their provenance. such a full analysis is likely to require a modification of the simplest models, of particle physics and/or of astrophysics, used up to now to describe the data.

acknowledgements

all of the considerations expressed here benefitted from very fruitful work and discussions within the pierre auger collaboration; they in any case represent my personal point of view. i thank k.-h. kampert and m. unger for allowing me to use figures from their paper [8].

references

[1] for a recent review see e.g. r. engel, d. heck and t. pierog, annu. rev. nucl. part. sci. 61, 467 (2011)
[2] see e.g. t.k. gaisser, cosmic rays and particle physics, cambridge university press, cambridge, 1990
[3] see e.g. j. matthews, astropart. phys. 22, 387 (2005) and references therein
[4] j. linsley, proc. 19th icrc 6 (1985) 1
[5] j. abraham et al. (pierre auger collaboration), phys. rev. lett. 104, 091101 (2010)
[6] p. facal san luis for the pierre auger collaboration, proc. 32nd international cosmic ray conference (icrc 2011), beijing, china, and arxiv:1107.4804v1
[7] f. salamida (pierre auger collaboration), proc. 32nd international cosmic ray conference (icrc 2011), beijing, china, and arxiv:1107.4804v1
[8] k.-h. kampert and m. unger, astropart. phys. 35, 660 (2012)
[9] d. garcia-pinto et al. (pierre auger collab.), in proc. 32nd icrc (beijing, china, 2011), arxiv:1107.4804v1
[10] r. abbasi et al. [hires coll.], phys. rev. lett. 104, 161101 (2010)
[11] c. jui et al. [ta coll.], proc. aps dpf meeting, arxiv:1110.0133
aps dpf meeting, arxiv:1110.0133
[12] e. korosteleva et al., nucl. phys. proc. suppl. 165, 74 (2007)
[13] a. m. taylor, m. ahlers and f. a. aharonian, arxiv:1107.2055v1
[14] n. shaham and t. piran, arxiv:1204.1488v1

parasitic events in envelope analysis

j. doubek, m. kreidl

envelope analysis allows fast fault location in individual gearboxes and in parts of bearings by determining the repetition frequency of the mechanical catch of an amplitude-modulated signal. systematic faults arise when envelope analysis is used on a signal with strong changes. the source of these events is the domain of definition of the function 1/(πt) used in the definition of the convolution integral; this integral is used for calculating the hilbert image of the analyzed signal. overshoots (similar to the gibbs phenomenon on a synthetic signal composed with the fourier series) result from these faults. the overshoots are caused by parasitic spectral lines in the frequency domain, which can produce a faulty diagnostic analysis. this paper describes the systematic faults arising during numerical calculation of a signal envelope by means of the hilbert transform, and offers a mathematical analysis of these systematic faults.

keywords: gearbox, bearing, envelope analysis, hilbert transform, parasitic spectral lines.

1 envelope analysis

envelope analysis is deeply connected to the hilbert transform [1], [5]. the hilbert transform of a signal x(t) is defined by the following equation
\tilde{x}(t) = H[x(t)] = \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{x(\tau)}{t-\tau}\,d\tau  (1)
where \tilde{x}(t) is the hilbert image of the signal x(t), also referred to as the quadrature part of the signal x(t). the inverse hilbert transform is defined by
x(t) = H^{-1}[\tilde{x}(t)] = -\frac{1}{\pi}\int_{-\infty}^{\infty}\frac{\tilde{x}(\tau)}{t-\tau}\,d\tau.  (2)
using the definition of a convolution we can express the hilbert transform as
\tilde{x}(t) = \frac{1}{\pi t} * x(t)  (3)
x(t) = -\frac{1}{\pi t} * \tilde{x}(t).  (4)
the complex signal ψ(t) whose imaginary part \tilde{x}(t) is the hilbert transform of the real part x(t) is called the analytic signal,
\psi(t) = x(t) + j\,\tilde{x}(t)  (5)
where \tilde{x}(t) = H[x(t)]. the analytic signal, as a complex function in the time domain, can be expressed in euler form by
\psi(t) = e(t)\,e^{\,j\varphi(t)},  (6)
where e(t) is the amplitude of the complex function in the time domain and φ(t) is its phase. the following equations are valid for e(t) and φ(t):
e(t) = \sqrt{x^2(t) + \tilde{x}^2(t)}  (7)
\varphi(t) = \operatorname{atan}\frac{\tilde{x}(t)}{x(t)}.  (8)
the function e(t) is called the envelope of the signal x(t). the following equations are valid for the hilbert transform of a harmonic signal:
H[\cos(\omega t + \varphi)] = \sin(\omega t + \varphi)  (9)
H[\sin(\omega t + \varphi)] = -\cos(\omega t + \varphi).  (10)
these equations show that the envelope e(t) of the harmonic signal x(t) = sin(ωt) is equal to
e(t) = \sqrt{x^2(t) + \tilde{x}^2(t)} = \sqrt{\sin^2(\omega t) + \cos^2(\omega t)} = 1.  (11)
the analytic signal and the envelope are shown in fig. 1.

fig. 1: analytic signal and envelope of x(t) = sin(2π·10·t + φ), \tilde{x}(t) = −cos(2π·10·t), φ = 0: a) the analytic signal ψ(t) = re{ψ(t)} + j·im{ψ(t)} in complex space; b) the same signal in two dimensions, with envelope e(t) = 1
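as a quick numerical illustration of eqs. (5)–(11), the analytic signal and the envelope of a pure sine can be computed with scipy's fft-based hilbert transformer. the sampling values follow fig. 1; the sketch is not part of the original paper:

```python
import numpy as np
from scipy.signal import hilbert   # returns the analytic signal x + j*H[x]

fs, f0, phi = 5000.0, 10.0, 0.0    # sampling frequency, signal frequency, phase
t = np.arange(0.0, 1.0, 1.0 / fs)
x = np.sin(2 * np.pi * f0 * t + phi)

psi = hilbert(x)                   # psi(t) = x(t) + j*x~(t), eq. (5)
envelope = np.abs(psi)             # e(t) = sqrt(x^2 + x~^2), eq. (7)
phase = np.unwrap(np.angle(psi))   # phi(t) = atan(x~/x), eq. (8)

# away from the interval edges the envelope of a pure sine is 1, eq. (11)
print(envelope[len(t) // 4 : 3 * len(t) // 4].round(3).min())
```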
2 envelope analysis faults

fig. 1 shows a part of a harmonic signal x(t), its analytically calculated quadrature part \tilde{x}(t), and the envelope e(t). figure 1a shows the analytic signal ψ(t) in complex space; figure 1b shows the same signal in two dimensions.

2.1 the effect of the edge of the measurement interval on a harmonic signal envelope

the envelope of a harmonic signal is shown in figure 2a. here the quadrature part was obtained by a numerical calculation of the analytic signal, in contrast to the analytical calculation (see figure 1b). the envelope distortion at the edge of the measurement interval is visible in figure 2a. figure 2b shows the same signal, but with a phase shift of φ = π/2. figure 2a shows a 50 % deflection of the envelope in comparison with the ideal one; figure 2b shows a 70 % deflection.

fig. 2: envelope distortion at the edge of the measurement interval, x(t) = sin(2π·10·t + φ), sampling frequency 5 khz: a) for φ = 0 (deflection 0.5 v, i.e. 50 %), b) for φ = π/2 (deflection up to 1.7 v)

2.2 the effect of a step change on a harmonic signal envelope

the envelope of a harmonic signal modulated by a square pulse is shown in figure 3a. the step occurs at a time when the harmonic signal crosses zero (phase φsk = 0), and the quadrature part is calculated numerically. figure 3b shows the same signal, but with the step at a time when the harmonic signal reaches its maximum (phase φsk = π/2). figure 3 shows almost the same form of the envelope as figure 2; the overshoots produced by the step changes are smaller than the overshoots produced by the edge of the measurement interval.

fig. 3: envelope distortion at a step change of a harmonic signal, sampling frequency 5 khz, φ = 0:
a) x(t) = sin(2π·10·t)·{1 − 0.5·[sign(t − 0.3) − sign(t − 0.7)]}
b) x(t) = sin(2π·10·t)·{1 − 0.5·[sign(t − 0.35) − sign(t − 0.75)]}

2.3 recapitulation of envelope analysis errors

figures 1 to 3 show the different forms of the envelope when it is calculated analytically and numerically at points of step change and at the edges of the measurement interval. the overshoot is theoretically infinite at the point of the step change (see section 3.3); a calculation in discrete time only approximates it. the signal in figure 2b has an overshoot of 70 % of the original envelope.

figure 4 recapitulates the systematic errors and the theoretical values of the overshoots during the step change. the recapitulation is done for the harmonic signal x(t) = e(t)·sin(2π·10·t + φ) for the phase shifts φ = 0 and φ = π/2. the step change from e(t) = 0 to e(t) = 1 is at time t = 0.

fig. 4: envelope error evaluation of a harmonic signal for a step change at time t = 0 (envelopes for φ = 0 and for φ = π/2, error in %)

the following lines recapitulate the errors from fig. 4:
a) for φ = 0 and a step change at time t = 0, the envelope of the signal x(t) has:
– the first maximum overshoot (point p1) of 50 %, at time t = 0 ms,
– the second maximum overshoot (point m1) of 9.06 %, at time t = 0.045 ms, i.e., 0.45 of the signal period,
– stabilization (ripple < 1 %) (point u1) at time t = 0.405 ms, i.e., 4.05 of the harmonic signal period.
b) for φ = π/2 and a step change at time t = 0, the envelope of the signal x(t) has:
– the first maximum overshoot (point p2) of ∞, at time t = 0 ms,
– the second maximum overshoot (point m2) of 9.44 %, at time t = 0.012 ms, i.e., 0.12 of the signal period,
– stabilization (ripple < 1 %) (point u2) at time t = 0.080 ms, i.e., 0.80 of the harmonic signal period.

an analytical calculation of the theoretical overshoot values of points p1 and p2 is given in section 3.3.
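the step-change overshoot of section 2.2 is easy to reproduce numerically. the sketch below embeds a sine burst in a longer record, so that the truncation edges act as interior step changes; the exact percentages depend on the discretization and will not match the analytical 50 %/∞ values exactly:

```python
import numpy as np
from scipy.signal import hilbert

fs, f0 = 5000.0, 10.0
t = np.arange(-1.0, 1.0, 1.0 / fs)

for phi in (0.0, np.pi / 2):                 # step at zero crossing vs. at maximum
    burst = np.where(np.abs(t) < 0.35,       # cuts at t = -0.35 s and t = +0.35 s
                     np.sin(2 * np.pi * f0 * t + phi), 0.0)
    env = np.abs(hilbert(burst))
    print(phi, round(100.0 * (env.max() - 1.0), 1), "% overshoot near the steps")
```

the φ = π/2 case, where the step cuts the sine at its extreme, gives a much larger overshoot, in line with points p1 and p2 above.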
3 analytical calculation of envelope overshoots

this section gives a mathematical analysis of the envelope behavior (parasitic overshoots), based on the "non-standard" behavior of the envelope calculated by the hilbert transform in areas of step changes, described in the previous section.

3.1 reason for envelope overshoots

as was said in the introduction, the reason for these overshoots is the convolution kernel of the hilbert transform, 1/(πt). this function reaches very high values for t → 0, and for t = 0 it is not defined. thus it is necessary for the integral of the hilbert transform
\tilde{x}(t) = \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{x(\tau)}{t-\tau}\,d\tau  (12)
to be calculated as a principal value, according to the equation
\tilde{x}(t) = \frac{1}{\pi}\lim_{\varepsilon\to 0^{+}}\left[\int_{-\infty}^{t-\varepsilon}\frac{x(\tau)}{t-\tau}\,d\tau + \int_{t+\varepsilon}^{\infty}\frac{x(\tau)}{t-\tau}\,d\tau\right].  (13)
the value of the integral is given by the sum of the two one-sided limits. these contributions, calculated from the left and from the right side, differ greatly when the calculation is made at a point of a step change. this is the source of the overshoots in areas with step changes; the beginning and the end of the measured signal are also such areas.

3.2 overshoot amplitude calculation of a trapezoidal pulse envelope

the overshoot behavior is mathematically analyzed on the example of a pulse combined from a trapezoidal and a square pulse (further called trapezoidal). this signal can simulate several types of signals by changes of its parameters. the trapezoidal signal is shown in figure 5.

fig. 5: the pulse combined from a trapezoidal and a square pulse (levels e and g)

the function shown in figure 5 can be expressed, up to the labelling of the corner times t1, t3, t4, t5, t6, t8, as
x(t) = 0 for t < t_1 and t > t_8;  x(t) = e for t_1 \le t \le t_3 and t_6 \le t \le t_8;
x(t) = e + \alpha_1(t - t_3) for t_3 < t < t_4;  x(t) = g for t_4 \le t \le t_5;  x(t) = g + \alpha_2(t - t_5) for t_5 < t < t_6,  (14)
with α1 = (g − e)/(t4 − t3) and α2 = −(g − e)/(t6 − t5). using equation (13) for the calculation of the quadrature part of the trapezoidal pulse,
\tilde{x}(t) = \frac{1}{\pi}\lim_{\varepsilon\to 0^{+}}\left[e\int_{t_1}^{t_3}\frac{d\tau}{t-\tau} + \int_{t_3}^{t_4}\frac{x(\tau)}{t-\tau}\,d\tau + g\int_{t_4}^{t_5}\frac{d\tau}{t-\tau} + \int_{t_5}^{t_6}\frac{x(\tau)}{t-\tau}\,d\tau + e\int_{t_6}^{t_8}\frac{d\tau}{t-\tau}\right],  (15)
where the segment containing the point τ = t is understood as a principal value. separating the linear-edge terms gives the intermediate form (16), and substituting for the elementary integrals yields the closed form
\tilde{x}(t) = \frac{1}{\pi}\Big\{ e\ln\Big|\frac{t-t_1}{t-t_3}\Big| + (\alpha_1 t+\beta_1)\ln\Big|\frac{t-t_3}{t-t_4}\Big| - \alpha_1(t_4-t_3) + g\ln\Big|\frac{t-t_4}{t-t_5}\Big| + (\alpha_2 t+\beta_2)\ln\Big|\frac{t-t_5}{t-t_6}\Big| - \alpha_2(t_6-t_5) + e\ln\Big|\frac{t-t_6}{t-t_8}\Big| \Big\},  (17)
where β1 = e − α1·t3 and β2 = g − α2·t5 are the intercepts of the rising and falling edges.

fig. 6: trapezoidal pulse given by (14), quadrature part \tilde{x}(t) and envelope e(t), with parameters e = 0; g = 1; t3 = −6.5 s; t4 = −5.5 s; t5 = 5.5 s; t6 = 6.5 s; fvz = 1 khz

figure 6 illustrates the quadrature part \tilde{x}(t) and the envelope e(t) = \sqrt{x^2(t)+\tilde{x}^2(t)} of the signal x(t) calculated by equation (17) and defined by equation (14). the parameter e = 0 was chosen to provide an easier view of the quadrature part and the envelope. the dependency of the overshoot size (figure 6) on the gradient of the rising edge is shown in figure 7. this gradient is defined by the parameter s. the dependency was determined numerically, based on the analytical expression (17) of the quadrature part; it cannot be determined analytically due to the complexity of equation (17): finding the local extreme by means of the first derivative, and from it the overshoot dependency, is very difficult.

fig. 7: normalized graph of the overshoot dependency on the gradient s of a trapezoidal pulse rising edge; s = (180/π)·atan[(g − e)/(t6 − t5)], where e = 0, g = 1, t6 = 6 s, and t6 − t5 is varied from 0 to 5 s

it is evident from the graph that there is an overshoot amplitude of 50 % for the rising-edge gradient s = 46.2°. it is derived in [2] that the quadrature part of a square pulse (a trapezoidal pulse with a 90° gradient) reaches an infinite value at this point. the difference between the theoretical amplitude (∞) and the amplitude obtained numerically (250 %, see figure 7) is due to the numerical calculation.
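the fig. 7 dependency can also be approximated numerically without the closed form (17). the following sketch sweeps the edge duration of the trapezoid of fig. 6 and records the envelope overshoot; a discrete hilbert transform is used, so the values are approximations:

```python
import numpy as np
from scipy.signal import hilbert

fs = 1000.0                                    # f_vz = 1 khz, as in fig. 6
t = np.arange(-15.0, 15.0, 1.0 / fs)
e, g, t3, t6 = 0.0, 1.0, -6.5, 6.5

def trapezoid(rise):
    # level e outside, linear edges of duration `rise`, level g on the plateau
    up = np.clip((t - t3) / rise, 0.0, 1.0)
    down = np.clip((t6 - t) / rise, 0.0, 1.0)
    return e + (g - e) * np.minimum(up, down)

for rise in (2.0, 1.0, 0.2, 0.02):             # steeper edge -> larger overshoot
    env = np.abs(hilbert(trapezoid(rise)))
    s = np.degrees(np.arctan((g - e) / rise))  # edge gradient in degrees
    print(round(s, 1), round(100.0 * (env.max() - g), 1), "% overshoot")
```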
numerical calculation is not able to simulate a trapezoidal pulse with a 90° gradient, because of the finite distance between two consecutive time samples. the maximum gradient that can be reached by numerical calculation is given by
s_{max} = \frac{180}{\pi}\,\operatorname{atan}\left[(g-e)\,f_{vz}\right]  (18)
where fvz is the sampling frequency. it is possible to reach the gradient smax = 89.942° for fvz = 1 khz and g − e = 1, and smax = 89.999° for fvz = 1 mhz. a further increase of the sampling frequency approaches the theoretically infinite amplitude only very slowly.

3.3 overshoot amplitude calculation of a harmonic signal modulated by a trapezoidal pulse

the following section describes the same analytical calculation as in the previous section, but for a harmonic signal modulated by the trapezoidal pulse, shown in figure 8.

fig. 8: harmonic signal modulated by a trapezoidal pulse (amplitude levels e and g)

the function in figure 8 can be expressed as the trapezoidal amplitude (14) multiplied by the carrier cos(ω0t + φ):
x(t) = 0 for t < t_1 and t > t_8;
x(t) = e\,\cos(\omega_0 t+\varphi) for t_1 \le t \le t_3 and t_6 \le t \le t_8;
x(t) = g\,\cos(\omega_0 t+\varphi) for t_4 \le t \le t_5;
x(t) = [e + \alpha_1(t-t_3)]\cos(\omega_0 t+\varphi) for t_3 < t < t_4;
x(t) = [g + \alpha_2(t-t_5)]\cos(\omega_0 t+\varphi) for t_5 < t < t_6,  (19)
with α1, α2 the edge slopes of (14).

3.3.1 invariant part of a harmonic signal

first we calculate the part of the harmonic signal with constant (invariant) amplitude, because it is easier to describe the calculation, and because the hilbert transform is linear with respect to addition. the invariant part of the harmonic signal modulated by the trapezoidal pulse of the signal in fig. 8 is between points t1 and t2. this part is shown in fig. 9.

fig. 9: invariant part of the harmonic signal

the function shown in fig. 9 can be expressed as
x(t) = \cos(\omega_0 t + \varphi) for t \in (t_1, t_2), \quad x(t) = 0 otherwise.  (20)
the use of equation (13) for calculating the quadrature part of function (20) yields
\tilde{x}(t) = \frac{1}{\pi}\int_{t_1}^{t_2}\frac{\cos(\omega_0\tau+\varphi)}{t-\tau}\,d\tau.  (21)
substituting
z = t-\tau,\quad dz = -d\tau,\quad z_1 = t-t_1,\quad z_2 = t-t_2,  (22)
equation (21) can be written as
\tilde{x}(t) = \frac{1}{\pi}\int_{z_2}^{z_1}\frac{\cos(\omega_0(t-z)+\varphi)}{z}\,dz.  (23)
equation (23) can be decomposed by equation (24) into equation (25):
\cos(\alpha-\beta) = \cos\alpha\,\cos\beta + \sin\alpha\,\sin\beta  (24)
and thus
\tilde{x}(t) = \frac{1}{\pi}\left[\cos(\omega_0 t+\varphi)\int_{z_2}^{z_1}\frac{\cos(\omega_0 z)}{z}\,dz + \sin(\omega_0 t+\varphi)\int_{z_2}^{z_1}\frac{\sin(\omega_0 z)}{z}\,dz\right].  (25)
the integral calculation inside equation (25) has to be done separately according to the signs of the limits z1 and z2 [3]. the needed properties of odd and even functions, of their integrals, and of the integral sine and cosine (27) are summarized in [2]. the integral sine and cosine are defined as
\operatorname{Si}(x) = \int_{0}^{x}\frac{\sin t}{t}\,dt, \qquad \operatorname{Ci}(x) = -\int_{x}^{\infty}\frac{\cos t}{t}\,dt.  (27)

– for t − t1 = z1 < 0 and t − t2 = z2 < 0 (i.e. t < t1), the integrals in (25), equation (26), are expressed directly through (27), and the inverse substitution (22) leads through (28) and (29) to
\tilde{x}(t) = \frac{1}{\pi}\Big\{\cos(\omega_0 t+\varphi)\big[\operatorname{Ci}(\omega_0(t_1-t)) - \operatorname{Ci}(\omega_0(t_2-t))\big] + \sin(\omega_0 t+\varphi)\big[\operatorname{Si}(\omega_0(t_2-t)) - \operatorname{Si}(\omega_0(t_1-t))\big]\Big\}.  (30)

– for t − t1 = z1 > 0 and t − t2 = z2 < 0 (i.e. t1 < t < t2), the principal value of the integral has to be calculated, because the integrand is not defined at zero. with the shorthand
c(z) = \frac{\cos(\omega_0 z)}{z}, \qquad s(z) = \frac{\sin(\omega_0 z)}{z},  (31)
the symmetric part of the singular integral of the odd function c(z) cancels, equations (32)–(34), and the conversion with the inverse substitution (22) gives, via (35),
\tilde{x}(t) = \frac{1}{\pi}\Big\{\cos(\omega_0 t+\varphi)\big[\operatorname{Ci}(\omega_0(t-t_1)) - \operatorname{Ci}(\omega_0(t_2-t))\big] + \sin(\omega_0 t+\varphi)\big[\operatorname{Si}(\omega_0(t-t_1)) + \operatorname{Si}(\omega_0(t_2-t))\big]\Big\}.  (36)

– for t − t1 = z1 > 0 and t − t2 = z2 > 0 (i.e. t > t2), the properties of the even function s(z) and the odd function c(z) give, through equations (37)–(39),
\tilde{x}(t) = \frac{1}{\pi}\Big\{\cos(\omega_0 t+\varphi)\big[\operatorname{Ci}(\omega_0(t-t_1)) - \operatorname{Ci}(\omega_0(t-t_2))\big] + \sin(\omega_0 t+\varphi)\big[\operatorname{Si}(\omega_0(t-t_1)) - \operatorname{Si}(\omega_0(t-t_2))\big]\Big\}.  (40)

all three cases can be written in the single form
\tilde{x}(t) = \frac{1}{\pi}\Big\{\cos(\omega_0 t+\varphi)\big[\operatorname{Ci}(\omega_0|t-t_1|) - \operatorname{Ci}(\omega_0|t-t_2|)\big] + \sin(\omega_0 t+\varphi)\big[\operatorname{Si}(\omega_0(t-t_1)) - \operatorname{Si}(\omega_0(t-t_2))\big]\Big\},
where the oddness of si was used.

let us rewrite equations (30), (36) and (40) as functions with parameters. from equation (30), for t < t1 < t2 it holds
y_1(t, t_1, t_2) = \text{the right-hand side of (30)};  (41)
from equation (36), for t1 < t < t2,
y_2(t, t_1, t_2) = \text{the right-hand side of (36)};  (42)
and from equation (40), for t1 < t2 < t,
y_3(t, t_1, t_2) = \text{the right-hand side of (40)}.  (43)
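the reconstructed case formulas (30), (36) and (40) can be checked against a discrete hilbert transform. scipy.special.sici returns the pair (si(x), ci(x)); the single-form expression above is implemented with an absolute value in the ci argument and the oddness of si:

```python
import numpy as np
from scipy.signal import hilbert
from scipy.special import sici        # sici(x) = (Si(x), Ci(x))

w0, phi, t1, t2 = 2 * np.pi, 0.0, -3.0, 3.0

def quad_closed(t):
    s1, c1 = sici(w0 * np.abs(t - t1))
    s2, c2 = sici(w0 * np.abs(t - t2))
    s1, s2 = np.sign(t - t1) * s1, np.sign(t - t2) * s2   # Si is odd
    return (np.cos(w0 * t + phi) * (c1 - c2)
            + np.sin(w0 * t + phi) * (s1 - s2)) / np.pi

fs = 200.0
t = np.arange(-20.0, 20.0, 1.0 / fs) + 0.5 / fs   # offset avoids Ci(0) = -inf
x = np.where((t > t1) & (t < t2), np.cos(w0 * t + phi), 0.0)

mask = (np.abs(t) < 10.0) & (np.abs(t - t1) > 0.1) & (np.abs(t - t2) > 0.1)
err = np.max(np.abs(quad_closed(t)[mask] - np.imag(hilbert(x))[mask]))
print(err)   # small away from the discontinuities t1, t2 and the record edges
```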
" # # # $ % & & & � � z d z d (38) equation (40) is given by conversion and inversion substitution. � � � � � � � �� � � � � �~ cos sinx t t ci z ci z t si z si� � � � � � � � � � � 1 0 0 1 0 2 0 0 1 � � � � � � � � � �� �� ��0 2z (39) � � � � � �� � � �� �� � � � � �� �~ cos sinx t t ci t t ci t t t si t t s� � � � � � � � � 1 0 0 1 0 2 0 0 1 � � � � � � � � � �� �� �� i t t�0 2� (40) let us rewrite equations (30), (36) and (40) as a function with parameters. from equation (30) it holds for t t t� �1 2 � � � � � �� � � �� �� � � �y t t t t ci t t ci t t t si1 1 2 0 0 1 0 2 0 0 1 , , cos sin� � � � � � � � � � � � � � � � �� � � �� �� �� t t si t t2 0 1� � �� (41) from equation (36) it holds for t t t1 2� � � � � � � �� � � �� �� � � �y t t t t ci t t ci t t t si2 1 2 0 0 1 0 2 0 0 1 , , cos sin� � � � � � � � � � � � � � � � �� � � �� �� �� t t si t t2 0 1� � �� (42) from equation (40) holds for t t t1 2� � � � � � � �� � � �� �� � � �y t t t t ci t t ci t t t si3 1 2 0 0 1 0 2 0 0 1 , , cos sin� � � � � � � � � � � � � � � � �� � � �� �� �� t t si t t� � �1 0 2� (43) 1 x (t) fig. 10: semifinite part of a harmonic signal the same result we obtained if we use the function defined by equation (20), when t1 0� , t2 � �. thus for equation (30) and (36) it holds: � for t < 0 � � � � � �� � � � � ~ cos sin x t t ci t t si t � � � � � � � �! "# $ %& 1 1 2 0 0 0 0 � � � � � � � � � (45) � for t > 0 � � � � � �� � � � � ~ cos sin x t t ci t t si t � � � � � � �! "# $ %& 1 1 2 0 0 0 0 � � � � � � � � � (46) � for t = 0 and � �� � � � 2 2 1k , where k = 1, 2, … is a positive integer, the value of � �~x t according to equation (47) is equal to �0.5. � � � �~x k0 1 2 1 1� � � (47) for t = 0 and � �� � ' � 2 2 1k , and therefore for other phases except � 2, 3 2� , 5 2� , … the quadrature signal is discontinuous, and it has an extrinsic limit of a logarithmic character [3] – see equation (50). � � � � � � � � ci x t t t x n x n x n n n � � � � � � � � � ( cos ln ! d � 1 1 2 2 2 1 (48) where � �� � � � � � �� � � � � � � �� lim ln . n n n1 1 2 1 3 1 0 58� (49) � � � � ci t t t0 0 � � � � � cos d (50) 3.3.3 linearly rising part of a harmonic signal let us make a calculation of the linearly rising part of harmonic signal modulated by a trapezoidal pulse between points t3 and t4, see figure 8. this part is shown in figure 11, but points t3 and t4 are replaced by points t1 and t2. the function shown in figure 11 can be expressed as: � � � � � � x t at b t t t t t t t t � � � � � � � � � cos , , � � �0 1 2 0 1 2 0 0 pro pro (51) using equation (13) for the quadrature part, the calculation of function (52) yields � � � � � �~ cosx t a b t t t � � � � � 1 0 � � � � � � �d 1 2 (52) � � � � � �~ cos cosx t a t b t t t t t � � � � � � � � � 1 0 0 � � � � � � � � � � � � �d d 1 2 1 2 � � � � � � � (53) substituting z t z z t t z t t � � � � � � � � �d d 1 1 2 2 (54) 30 acta polytechnica vol. 41 no. 6/2001 b x(t) fig. 11: linearly rising part of a harmonic signal we can write equation (53) as equation (57) � � � � � �� � � �� �~ cos cosx t a z t z t z z b z t z z z z z z � � � � � � � ! 1 0 0 � � � � � � d d 1 2 1 2 " # # # $ % & & & (55) � � � �� � � �� �~ cos cos co x t a z t z at z t z z b z z z z � � � � � � � 1 0 0 � � � � � �d d 1 2 1 2 � �� �s � �0 z t z z z z � � ! " # # # $ % & & & d 1 2 (56) � � � �� � � � � �� �~ cos cos x t a z t z at b z t z z z z z z � � � � � � � � ! 
" 1 0 0 � � � � � d d 1 2 1 2 # # # $ % & & & (57) let us make a separate calculation of the integral using (58) in equation (57) � � � � � �� ~x t i t i t� � � 1 1 2 � (58) where there is � � � �� �i t a z t z z z 1 0 1 2 � � � cos � � d (59) let us rewrite the equations (62), (63), (64) and (65) as a function with parameters. for equation (62), (63) and for t t t� �1 2 it holds: � � � � � �� y t t t a b i t t t a i t t t a ba4 1 2 1 1 2 2 1 2 1 , , , , , , , , , , ,� � � � (66) for equation (62), (64) and for t t t1 2� � it holds � � � � � �� y t t t a b i t t t a i t t t a bb5 1 2 1 1 2 2 1 2 1 , , , , , , , , , , ,� � � � (67) for equation (62), (65) and for t t t1 2� � � � � � � �� y t t t a b i t t t a i t t t a bc6 1 2 1 1 2 2 1 2 1 , , , , , , , , , , ,� � � � (68) 3.3.4 total analytical calculation of the quadrature part of a harmonic signal modulated by trapezoidal pulse as was mentioned above, it was necessary to make the calculation of the quadrature part of a harmonic signal modulated by a trapezoidal pulse in separate parts, due to the complexity of the analytical calculation. now we will make a summary of all the parts, see figure 8. we obtain the entire quadrature signal � �~x t by the sum of all parts with respect to important parameters in the individual part. © czech technical university publishing house http://ctn.cvut.cz/ap/ 31 acta polytechnica vol. 41 no. 6/2001 � � � � � �� � i t at b z t z z z z 2 0� � � � cos � � d 1 2 (60) for calculation of � �i t1 it holds: � � � �� � � �� �� i t a z t z t1 0 0 2 0 1� � � � � � � � � � �sin sin (61) by using inverse substitution (54) we have � � � � � �� i t a t t1 0 0 2 0 1� � � � � � � � �sin sin (62) the result of equations (30), (36) and (40) will be used for calculating of � �i t2 . 
– for t < t1,
i_2^{a}(t) = (at+b)\Big\{\cos(\omega_0 t+\varphi)\big[\operatorname{Ci}(\omega_0(t_1-t)) - \operatorname{Ci}(\omega_0(t_2-t))\big] + \sin(\omega_0 t+\varphi)\big[\operatorname{Si}(\omega_0(t_2-t)) - \operatorname{Si}(\omega_0(t_1-t))\big]\Big\}  (63)
– for t1 < t < t2,
i_2^{b}(t) = (at+b)\Big\{\cos(\omega_0 t+\varphi)\big[\operatorname{Ci}(\omega_0(t-t_1)) - \operatorname{Ci}(\omega_0(t_2-t))\big] + \sin(\omega_0 t+\varphi)\big[\operatorname{Si}(\omega_0(t-t_1)) + \operatorname{Si}(\omega_0(t_2-t))\big]\Big\}  (64)
– for t > t2,
i_2^{c}(t) = (at+b)\Big\{\cos(\omega_0 t+\varphi)\big[\operatorname{Ci}(\omega_0(t-t_1)) - \operatorname{Ci}(\omega_0(t-t_2))\big] + \sin(\omega_0 t+\varphi)\big[\operatorname{Si}(\omega_0(t-t_1)) - \operatorname{Si}(\omega_0(t-t_2))\big]\Big\}.  (65)

let us again rewrite these results as functions with parameters:
y_4(t, t_1, t_2, a, b) = \frac{1}{\pi}\left[i_2^{a}(t) - i_1(t)\right] for t < t_1,  (66)
y_5(t, t_1, t_2, a, b) = \frac{1}{\pi}\left[i_2^{b}(t) - i_1(t)\right] for t_1 < t < t_2,  (67)
y_6(t, t_1, t_2, a, b) = \frac{1}{\pi}\left[i_2^{c}(t) - i_1(t)\right] for t > t_2.  (68)

3.3.4 total analytical calculation of the quadrature part of a harmonic signal modulated by a trapezoidal pulse

as was mentioned above, it was necessary to calculate the quadrature part of a harmonic signal modulated by a trapezoidal pulse in separate parts, due to the complexity of the analytical calculation. now we make a summary of all the parts, see figure 8. the entire quadrature signal \tilde{x}(t) is obtained as the sum of the contributions of the five segments of the pulse, with the case of each contribution (y1/y2/y3 for the constant segments, y4/y5/y6 for the linear edges) chosen according to the position of t with respect to that segment. with the edge slopes and intercepts
a_{34} = \frac{g-e}{t_4-t_3},\; b_{34} = \frac{e\,t_4-g\,t_3}{t_4-t_3}; \qquad a_{56} = \frac{e-g}{t_6-t_5},\; b_{56} = \frac{g\,t_6-e\,t_5}{t_6-t_5},
the seven intervals t < t1, t1 < t < t3, t3 < t < t4, t4 < t < t5, t5 < t < t6, t6 < t < t8 and t > t8 define the functions z1(t), …, z7(t), equations (69)–(75). for example, for t < t1,
z_1(t) = e\,y_1(t,t_1,t_3) + y_4(t,t_3,t_4,a_{34},b_{34}) + g\,y_1(t,t_4,t_5) + y_4(t,t_5,t_6,a_{56},b_{56}) + e\,y_1(t,t_6,t_8),  (69)
and analogously for the remaining intervals (70)–(75), with y1 replaced by y2 or y3 (and y4 by y5 or y6) for the segments that t has entered or passed. the entire quadrature part is then given by the sum of equations (69)–(75),
\tilde{x}(t) = z_1(t) + z_2(t) + z_3(t) + z_4(t) + z_5(t) + z_6(t) + z_7(t),  (76)
each z_k being applied on its own interval.
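the piecewise assembly (69)–(76) can be condensed into one routine: each segment contributes a term of the y4–y6 family (the constant segments being the special case a = 0), and the case selection of the original z1–z7 is handled automatically by writing the ci argument with an absolute value and using the oddness of si. a sketch, equivalent to the paper's y-functions under that unification, with the segment corner times of fig. 12:

```python
import numpy as np
from scipy.special import sici

def window_int(t, ta, tb, w0, phi):
    # principal-value integral of cos(w0*(t-z)+phi)/z over z in (t-tb, t-ta)
    sa, ca = sici(w0 * np.abs(t - ta))
    sb, cb = sici(w0 * np.abs(t - tb))
    sa, sb = np.sign(t - ta) * sa, np.sign(t - tb) * sb
    return np.cos(w0 * t + phi) * (ca - cb) + np.sin(w0 * t + phi) * (sa - sb)

def segment_quadrature(t, ta, tb, a, b, w0, phi):
    # quadrature part of (a*tau + b)*cos(w0*tau + phi) supported on (ta, tb):
    # (1/pi) * [i2(t) - i1(t)], eqs. (58)-(62)
    i1 = (a / w0) * (np.sin(w0 * tb + phi) - np.sin(w0 * ta + phi))
    return ((a * t + b) * window_int(t, ta, tb, w0, phi) - i1) / np.pi

w0, phi = 2 * np.pi, 0.0
t1, t3, t4, t5, t6, t8 = -10.0, -6.25, -5.75, 5.75, 6.25, 10.0
e, g = 0.0, 1.0
a_r = (g - e) / (t4 - t3)                 # rising-edge slope
a_f = (e - g) / (t6 - t5)                 # falling-edge slope
segments = [(t1, t3, 0.0, e),
            (t3, t4, a_r, e - a_r * t3),
            (t4, t5, 0.0, g),
            (t5, t6, a_f, g - a_f * t5),
            (t6, t8, 0.0, e)]

t = np.arange(-12.0, 12.0, 1e-3) + 5e-4   # offset keeps Ci arguments nonzero
xq = sum(segment_quadrature(t, *seg, w0, phi) for seg in segments)
amp = sum(np.where((t > ta) & (t < tb), a * t + b, 0.0) for ta, tb, a, b in segments)
envelope = np.hypot(amp * np.cos(w0 * t + phi), xq)   # e(t) of fig. 12
print(round(float(envelope.max()), 3))                # overshoot above g = 1
```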
it was derived in equations (44) – (50) that the quadrature part of the harmonic signal � �cos � �0t � reaches an infinite amplitude at any point with a step change and when the relation � �cos � �0 0t � ' is valid. the difference between the theoretical amplitude (�) and the amplitude given numerically (90 %, see figure 13) is due to numerical calculation. numerical calculation is not able to simulate a trapezoidal pulse with a 90° gradient according to the finite distance of two following time samples. the maximum gradient that can be reached by numerical calculation is given by � �� �s g e fmax � � 180 � atan vz where fvz is the sampling frequency. it is possible to reach the gradient smax = 89.936° for fvz = 1 khz and g – e = 1, smax = 89.999° for fvz = 1 mhz. a further increase in sampling frequency only very slowly approaches to the theoretical amplitude. 4 conclusion the dependency of the size overshoot on the gradient edge of a pulse was found. this was done by an analytical calculation of the quadrature part of a trapezoidal pulse and of a harmonic signal modulated by a trapezoidal pulse. the same source of the envelope size overshoot is the gradient edge of a pulse and also that beginning and end of the measurement interval. the size of the overshoot of a harmonic signal modulated by a trapezoidal pulse is 80 % with a gradient of edge almost 90°. the size of the overshoot of the trapezoidal pulse is 250 %. the analytical dependency of the size of the overshoot as not found, due to mathematical complexity. only numerical calculation was done, based on previous analytical calculation of the quadrature part. acknowledgement this research work has received support from research program no j04/98:210000015 “research of new methods for physical quantities measurement and their application in instrumentation” of the czech technical university in prague (sponsored by the ministry of education, youth and sports of the czech republic). references [1] hahn, s. l.: hilbert transforms in signal processing. artech house, boston, 1966 [2] doubek, j.: integrální transformace ve zpracování vibrodiagnostického signálu. dizertační práce, čvut fakulta elektrotechnická, 2001, (in czech: integral transform in a vibrodiagnostic signal), ph.d. thesis, fee, ctu, prague, 2001 [3] čížek, v.: discrete hilbert transform. ieee transaction on audio and electronic, volume au-18, no. 4, december 1970, pp. 340–342 [4] doubek, j., kreidl, m.: new algorithms of gearbox faults detection using hilbert transform and cross corelation. workshop, ctu publishing house, prague, 2001, pp. 320–321 [5] kreidl, m., et al: diagnostické systémy. monografie, ediční středisko čvut, praha, (in czech: diagnostic systems). monograph, fee cut, prague, 2001 doc. ing. marcel kreidl, csc. e-mail: kreidl@fel.cvut.cz ing. jan doubek, ph.d. e-mail: doubek@fel.cvut.cz department of measurement czech technical university in prague faculty of electrical engineering technická 2, 166 27 praha 6, czech republic 32 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 41 no. 6/2001 -10 -8 -6 -4 -2 0 2 4 6 8 10 -1 -0.5 0 0.5 1 t [s] )(tx )(teovershoot )(~tx fig. 12: harmonic signal modulated by trapezoidal pulse x(t) given (19), quadrature part � �~x t and envelope e(t) with parameters e = 0; g = 1; t3 = �6.25 s; t4 = �5.75 s; t5 = 5.75 s; t6 = 6.25 s; �0 = 2�; � = 0, fvz = 1 khz notice: parameters t1 and t8 lose their relevance if the parameter e = 0. for this reason these parameters were not mentioned. 
deposition of carbon nanostructures by surfatron generated discharge

marina davydova (a,*), jiri smid (b), zdenek hubicka (b), alexander kromka (a)
a institute of physics ascr, cukrovarnicka 10, prague, czech republic
b institute of physics ascr, na slovance 2, prague, czech republic
* corresponding author: davydova@fzu.cz
(acta polytechnica 54(6):389–393, 2014, doi:10.14311/ap.2014.54.0389, © czech technical university in prague, 2014, available online at http://ojs.cvut.cz/ojs/index.php/ap)

abstract. various carbon nanostructures were deposited by surface wave discharge from an ar/ch4/co2 gas mixture. the type and the form of the carbon nanostructures were controlled by the gas mixture and the gas inlet. the nanostructures that formed were investigated by scanning electron microscopy and by raman measurements. the geometrical combination of the gas inlets into the chamber (via surfatrons or via the gas shower system) is found to be a crucial deposition parameter for the controllable growth of the desired carbon nanostructures.

keywords: surfatron, carbon nanostructures, microwave plasma, pecvd.

1. introduction

plasmas are extremely successful in many industrial thin-film applications, e.g. in the production of microelectronic devices and solar cells [1] and for biomedical applications [2]. many of these thin-film materials have been successfully developed using mainly empirical methods. microwave (mw) plasma is frequently used in so-called plasma-enhanced chemical vapor deposition (pecvd). in mw pecvd, the feed gas mixture is ionized and excited by microwave radiation in a reaction chamber. the ionized and excited neutral atomic and molecular components form a plasma, generally of low fractional ionization, in which radicals and neutrals react and/or recombine and finally condense onto the substrate as a thin film [3].

the commercial and technical value of low temperature plasmas is well established. among the broad range of plasmas, surface wave discharge (swd) is nowadays considered promising as a low-temperature process for depositing various coatings. swd is widely used in the engineering industries as a source of reactive radicals and for depositing zno thin films, sioxcyh thin films, semiconductive sige films, and others [4–7]. however, deposition of carbon nanostructures by swd is not broadly used. swd is characterized by the transport of energy toward the active plasma area by an electromagnetic wave that spreads along the plasma column [7]. the so-called surfatron waveguide can operate both in a continuous regime and in a pulsed regime, employing a microwave generator. low-temperature mw plasma is a strongly non-equilibrium system generating an exotic physical and chemical environment through free electrons at low gas temperatures. this unique environment allows the treatment of temperature-sensitive materials with molecular precision.

in the present work we investigate the growth of various carbon nanostructures by surface wave discharge. the influence of the gas flow rate, the gas mixture and the geometrical combination of the gas inlets (via surfatrons or via a gas shower system) on the deposition process is discussed here.
2. experimental details

2.1. deposition system

figure 1 shows the experimental set-up of the modified plasma-enhanced cvd reactor [7]. the plasma system consists of 4 independent nozzles (surfatrons) which form a single swd. the nozzles are connected to the microwave source (2.45 ghz). the mw generator used here can be operated both in continuous mode and in pulsed mode, with an averaged absorbed power of about 300 w per surfatron and with a repetition frequency f = 60 hz. the process gases were introduced into the vacuum chamber via the surfatrons (1 or 4 nozzles) or via a gas shower. the distance between the sample and the surfatron outlet(s), i.e., the quartz tube, was varied in the range from 1 to 3 cm.

figure 1. schematic view of the modified plasma-enhanced cvd surface-wave discharge set-up used for depositing the carbon nanostructures (a) and a digital photograph of the inside of the vacuum chamber (b).

figure 2. surface morphology of carbon nanostructures deposited under various conditions: (a, c) ar/co2 gases were injected through the single surfatron, and ch4 was transported via a parallel surfatron; (b, d) ar/co2 gases were injected via 4 surfatrons and co2/ch4 gases were introduced through the shower head.

2.2. deposition of carbon nanostructures

the deposition of the carbon nanostructures comprised three steps (summarized in the sketch after this list):

(1.) a catalyst fabrication step, which employed thermal evaporation of a ni layer (6 nm in thickness) on 10 × 10 mm2 si/sio2 substrates. the thickness of the ni was monitored by in situ measurements, using a quartz-crystal-based thickness monitor (inficon xtc/2).

(2.) a thermal treatment step, during which the si/sio2 substrates covered with the catalyst layer were thermally treated in hydrogen plasma using the large-area pulsed linear-antenna microwave plasma cvd process [8]. the process conditions were as follows: p = 1700 w, gas pressure 20 mbar, h2 flow 300 sccm, substrate temperature ts = 650 °c and treatment time 10 min. after the annealing process, the catalyst layer decomposed into small nickel clusters about 40 nm in diameter.

(3.) deposition of carbon nanostructures, which was carried out using the modified plasma swd pecvd system. the system was pumped to a base pressure of 0.008 mbar and was then maintained at a pressure of 0.2 mbar by introducing an ar carrier gas at a flow rate of 100 sccm and various ratios of co2 and ch4. the flow of methane was changed from 60 to 90 sccm, while the flow of carbon dioxide was kept at a constant value of 30 sccm. a microwave power input of 300 w operated in the external modulation mode was used in all the experiments. during the deposition process the substrate temperature was maintained at 650 °c and the deposition time was 15 min. the distance between the substrate holder and the precursor outlet was varied from 1 to 3 cm.
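for reference, the three-step recipe reads back as the following parameter summary; the dictionary keys are illustrative only, not an instrument api:

```python
process = {
    "1_catalyst": {"material": "Ni", "thickness_nm": 6,
                   "substrate": "Si/SiO2", "substrate_mm": (10, 10)},
    "2_h2_treatment": {"power_W": 1700, "pressure_mbar": 20, "H2_sccm": 300,
                       "T_substrate_C": 650, "time_min": 10},
    "3_swd_deposition": {"base_pressure_mbar": 0.008, "pressure_mbar": 0.2,
                         "Ar_sccm": 100, "CO2_sccm": 30, "CH4_sccm": (60, 90),
                         "power_W": 300, "T_substrate_C": 650, "time_min": 15,
                         "standoff_cm": (1, 3)},
}
for step, params in process.items():
    print(step, params)
```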
2.3. material characterization

the surface morphology of the coatings (carbon nanostructures) was characterized by scanning electron microscopy (sem e_line writer, raith gmbh), and the findings were confirmed by uv-raman spectroscopy (renishaw invia reflex raman spectrometer, 442 nm excitation wavelength).

3. results and discussion

figure 2 shows the surface morphology of carbon nanostructures deposited using various ch4/co2 flow rates. the distance between the substrate holder and the quartz nozzle outlet was 1 cm. the sem images clearly confirm that only isolated nanoislands were formed using an ar/co2/ch4 gas mixture (100/30/60 sccm) (figure 2a,c). the surface morphology changed considerably after the ch4 gas flow was increased from 60 to 90 sccm (figure 2b,d). in this case, the sem measurements indicate the development of nanotube-like structures. the raman spectroscopy measurements did not reveal a reasonable signal of carbon phases.

next we investigated the influence of the sample position: the distance between the substrate holder and the precursor outlet was increased to 3 cm. as in the previous case, a gas mixture of ar/co2/ch4 (100/30/60 sccm) was introduced into the chamber. the sem image reveals the formation of a porous-like structure consisting of nanosized features (figure 3a).

figure 3. top-view sem image (a) and raman spectrum (b) of carbon nanostructures deposited under the following conditions: ar/co2 gases were injected through a single surfatron, and ch4 gas flowed through parallel surfatrons.

the raman spectrum (figure 3b) is represented by three strong contributions: the silicon characteristic peak centered at 520 cm−1 (si peak) and two broad bands, centered at approximately 1350 cm−1, which is attributed to the d band, and at approximately 1590 cm−1, which is attributed to the g band. the d band is usually assigned to the disorder and imperfection of the carbon crystallites, whereas the g band is one of the two e2g modes of the stretching vibrations in the sp2 domains of perfect graphite [9]. in addition, a weak broad band resolvable at 970 cm−1 reflects the second-order peak of the si substrate [10].

increasing the methane flow from 60 to 90 sccm, in combination with varying the gas inlet(s), resulted in the formation of vertically ordered carbon nanowalls (figure 4a,b). the distance between the substrate holder and the precursor outlet was kept at 3 cm. figure 4c shows the raman spectrum of the nanostructures that formed. four basic features are recognized in the raman spectrum: a sharp si peak centered at 520 cm−1, the d band (1369 cm−1), the g band (1570 cm−1), and the second-order d band (2d band). the 2d band, centered at 2720 cm−1, is a typical signature of graphitic carbon [11–13]. moreover, it should be noted that the radial breathing mode (rbm) was not detected in any of the samples, i.e., single-wall or multiwall carbon nanotubes are absent.

these results clearly confirm that various carbon nanostructures can be deposited by a surface wave discharge. the gas mixture and the working gas inlet to the process chamber were found to be the crucial parameters for the growth of the various carbon forms/types. increasing the ch4 concentration and using the shower head for introducing the working gases into the chamber led to the formation of carbon nanowalls (figure 4a,b). we found that the reactive gases (co2 and ch4) are optimally decomposed at the substrate by the ar plasma, due to a set of plasma-chemical reactions which finally support the growth of carbon nanostructures. moreover, we found that the optimal distance for the deposition of carbon nanostructures is 3 cm for our experimental swd process. these observations represent unique and, in certain cases, advanced features of the surfatron system.
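a common way to quantify the disorder discussed above is the i_d/i_g intensity ratio. a minimal sketch of how such a ratio could be pulled from a spectrum like the ones in figures 3b and 4c; the window limits are a conventional choice and the spectrum below is synthetic, standing in for measured data:

```python
import numpy as np

def d_to_g_ratio(shift_cm1, intensity):
    # peak maxima inside conventional windows around the d and g bands
    d = intensity[(shift_cm1 > 1300) & (shift_cm1 < 1420)].max()
    g = intensity[(shift_cm1 > 1520) & (shift_cm1 < 1640)].max()
    return d / g

shift = np.linspace(1000, 3000, 2000)
spectrum = (np.exp(-0.5 * ((shift - 1369) / 40) ** 2)          # d band
            + 1.2 * np.exp(-0.5 * ((shift - 1570) / 30) ** 2)  # g band
            + 0.05)                                            # background
print(round(d_to_g_ratio(shift, spectrum), 2))
```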
figure 4. top-view sem images (a, b) and the raman spectrum (c) of carbon nanostructures deposited under the following conditions: ar gas was injected via 4 surfatrons, and co2/ch4 gases were introduced through the shower head.

4. conclusions

we have introduced the modified plasma-enhanced cvd system working on the swd principle as a versatile deposition system for the growth of various carbon nanostructures. we have shown that the carbon structures that are formed are significantly influenced by the gas mixture (i.e., the ratio of co2/ch4) and by the chosen gas inlet(s) for achieving the proper gas decomposition and/or plasma-chemically driven reactions at the substrate surface. the formation of carbon nanowalls was observed after the methane content was increased and its flow via the shower head was forced. we assume that the implementation of swd will open new prospects in the deposition of carbon allotrope forms, or even for gentle surface modification.

acknowledgements

this work was supported by grants 14-06054p (czech science foundation) and ta01011740 (technological agency of the czech republic).

references

[1] soppe, w. et al.: bulk and surface passivation of silicon solar cells accomplished by silicon nitride deposited on industrial scale by microwave pecvd. prog. photovolt. res. appl., 13 (7), 2005, p. 551–569. doi:10.1002/pip.611
[2] ferreira, c. m. et al.: air–water microwave plasma torch as a no source for biomedical applications. chem. phys., 398, 2012, p. 248–254. doi:10.1016/j.chemphys.2011.05.024
[3] brewer, m. a. et al.: simple, safe, and economical microwave plasma-assisted chemical vapor deposition facility. rev. sci. instrum., 63 (6), 1992, p. 3389–3393. doi:10.1063/1.1142557
[4] takanishi, y. et al.: deposition of polycrystalline sige by surface wave excited plasma. thin solid films, 516 (11), 2008, p. 3554–3557. doi:10.1016/j.tsf.2007.08.025
[5] kment, s. et al.: photo-induced electrochemical functionality of the tio2 nanoscale films. electrochimica acta, 54 (12), 2009, p. 3352–3359. doi:10.1016/j.electacta.2008.12.036
[6] šerá, b. et al.: new physicochemical treatment method of poppy seeds for agriculture and food industries. plasma sci. technol., 15 (9), 2013, p. 935. doi:10.1088/1009-0630/15/9/19
[7] olejníček, j. et al.: zno thin films prepared by surfatron produced discharge. catal. today, 230, 2014, p. 119–124. doi:10.1016/j.cattod.2013.11.024
[8] kromka, a. et al.: linear antenna microwave plasma cvd deposition of diamond films over large areas. vacuum, 86 (6), 2012, p. 776–779. doi:10.1016/j.vacuum.2011.07.008
[9] kruk, m. et al.: partially graphitic, high-surface-area mesoporous carbons from polyacrylonitrile templated by ordered and disordered mesoporous silicas. microporous mesoporous mater., 102, 2007, p. 178–187. doi:10.1016/j.micromeso.2006.12.027
[10] woo, h. k. et al.: growth of epitaxial beta-sic films on silicon using solid graphite and silicon sources. diam. relat. mater., 8, 1999, p. 1737–1740. doi:10.1016/s0925-9635(99)00016-3
[11] krivchenko, v. a. et al.: carbon nanowalls: the next step for physical manifestation of the black body rating. sci. rep., 3, 2013, p. 1–6. doi:10.1038/srep03328
[12] wang, l. et al.: b and n isolate-doped graphitic carbon nanosheets from nitrogen-containing ion-exchanged resins for enhanced oxygen reduction. sci. rep., 4, 2014, p. 1–8.
[13] ghosh, s. et al.: evolution and defect analysis of vertical graphene nanosheets. j. raman spectrosc., 45, 2014, p. 642–649.

analysis of a thermal plasma diamond cvd system

d. kolman

this paper deals with the analysis of a typical engineering system utilizing thermal plasma – a system for diamond chemical vapor deposition. it defines the system – a slightly overexpanded plasma jet impinging on a downstream-located substrate – outlines the theoretical description of the system – the navier-stokes and species conservation equations – and presents key theoretical results on the major and most troublesome factors influencing diamond deposition: the velocity and the temperature of the jet. the paper then demonstrates the necessity of shifting from a laminar to a turbulent flow description and compares both results to experiments. an explanation of the remaining discrepancy – an insufficient velocity drop in the jet – is attempted.

keywords: modeling, thermal plasma, cvd.

1 introduction

thermal plasmas are plasmas in which the thermodynamic state approaches local thermodynamic equilibrium (lte). although such plasmas are characterized by a single temperature common to all species and to all their thermal movements (translation, rotation, vibration, electronic temperature), the analysis of a thermal plasma system is highly complex. it involves an intricate interplay between fluid dynamics, turbulent transport, thermal radiation, chemical reactions and interaction with other phases. this paper deals with the theoretical and experimental analysis of a typical engineering system utilizing thermal plasma – a system for chemical vapor deposition (cvd), in particular deposition of diamond. the system under consideration is shown schematically in figure 1; the major processes are: (1) heating of the process gas with a dc or an rf plasma torch, (2) expansion of the gas into the reactor and introduction of the diamond growth precursor species, and (3) interaction of the jet with the substrate, and diamond formation.

fig. 1: the thermal plasma diamond cvd system

the focus of this paper is on the second stage – the interaction of the plasma flow (plasma jet) with the surrounding atmosphere and with the substrate located downstream. the dc torch power of the system under consideration is 6.3 kw; the jet power is about 50 % of this, i.e. 3.1 kw. the argon flow rate through the torch is 16 slm, and the hydrogen flow rate through the torch is 5 slm. methane is used as the diamond growth precursor and is injected through three probes within 3 mm of the torch nozzle exit, with a total flow rate of 0.25 slm. the reactor pressure is 12.5 kpa and the substrate stand-off distance is 8 cm. the plasma torch exit temperature is of the order of 5000 k and the torch exit velocity is of the order of 2500 m/s. the system is located and operates at fg plasma-oberflächentechnik, technische universität ilmenau, thüringen, germany.

2 laminar flow modeling

in this study, it was assumed that the gas exits the plasma torch with known radial profiles of pressure, velocity, enthalpy and chemical composition. these boundary conditions were obtained in two ways: (1) by a simple axial integration of the conservation equations with a 1d model of the torch region, and (2) from the spectroscopic measurements made by jahn [1] on the plasma system under consideration. jahn measured the hydrogen balmer lines in the region close to the torch nozzle exit and evaluated the plasma jet temperature from the results. the temperature was evaluated assuming lte at the torch exit.
in reality, the specification of the inflow boundary conditions turns out to be not so much an input but rather one of the outputs of the design analysis, and the selected final profiles are based on the best agreement of the overall simulation results with experiments.

assuming the upstream boundary conditions are known, one forms a set of conservation equations that are to be solved for the unknown flow characteristics, temperature/energy, pressure and species concentrations. the set includes [2] the continuity equation, two momentum equations, the energy equation, species continuity equations with temperature-dependent chemical kinetics and, as a closure, the equation of state. the species set may, but here does not, include ions, owing to the rather low jet temperature in the region of interest. none of the viscous and/or diffusion effects or the compressibility effects can be excluded. as there is no electric arc passing through the reactor domain, there is no need to solve the electromagnetic (maxwell) equations. thermal diffusion terms and radiation are also neglected, due to their small values in the situation under consideration (a rather low temperature).

the obtained equation set is solved with a finite-difference semi-implicit pressure-based algorithm for compressible viscous flows with an arbitrary number of conservation equations [3]. the final form of the program adopts a hybrid convection-diffusion flux approximation at the computational cell interfaces.
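all of the listed equations share the generic transport form used by pressure-based solvers of this type [3]; written compactly (a standard textbook form, not quoted from the paper), with φ standing for a velocity component, the enthalpy or a species mass fraction:

```latex
\frac{\partial(\rho\phi)}{\partial t} + \nabla\cdot(\rho\,\vec{v}\,\phi)
  = \nabla\cdot\left(\Gamma_{\phi}\,\nabla\phi\right) + S_{\phi},
\qquad p = \rho\,\frac{R}{M}\,T,
```

with Γφ the effective diffusion coefficient, Sφ the source term (e.g. the net chemical production rate in a species equation), and the ideal-gas law closing the set.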
the 2d computational domain includes the reactor and the substrate and is non-uniformly subdivided into ~ 100 computational cells in both directions, with the finest spacing in the vicinity of the precursor injection location and at the substrate. at the substrate, no-slip boundary conditions are adopted for the momentum equation, the known radial temperature distribution is imposed at the substrate surface, and a source/sink is included in the species conservation equations, where the strength of the surface net production rates is given by a system of surface chemical reactions. for this purpose the momentary chemical state of the surface is described via concentrations of appropriately defined surface species (a radical site, a site terminated with a hydrogen atom, etc.). energy release by surface recombination is included in the energy balance. in the present model, the ar-h2-ch4 kinetics mechanism consists of 34 gas phase species (ar, h, h2, ch0–4, c2h0–6) and 52 gas phase reactions. the production rates of the individual species are given by the sum of contributions from the individual gas phase reactions. all temperature- and composition-dependent thermodynamic and transport properties are evaluated using the wilke mixing rule, which has been found to be sufficient for the situation under consideration. equilibrium chemical composition is assumed at the torch exit. for a discussion of the surface chemistry see [2].

in all the figures presented, the jet is assumed slightly overexpanded, with a nozzle pressure of ~ 9000 pa, an axial velocity of ~ 2700 m/s and a temperature of ~ 4150 k. (the supersonic character of the jet, m ~ 2, is supported by the diamond shocks sometimes visible in the actual jet.)

after the theoretical model was developed and the results obtained, a set of enthalpy probe measurements [4] was performed in the reactor – see figure 2. these data were used to validate the theoretical results, through a comparison, to gain more advanced insight into the deposition process, and to suggest the further course of the investigation. in the evaluation of the measurements it was assumed that the static pressure of the jet is close to the chamber pressure – a reasonable assumption far away from the torch exit (an assumption of this kind is necessary to evaluate the results across the bow shock formed in front of the enthalpy probe in the case of compressible high-velocity flows).

fig. 2: (a) axial jet velocity; (b) jet enthalpy – comparison of simulations and experiments

to the disappointment of the author, it was found that the experimental data agree rather poorly with the predictions. the main problem is that both the axial velocity and the jet enthalpy are too high and do not fall off as fast as the experiment suggests. after a detailed analysis with the experimental researchers, the following explanations have been proposed:

(1) the spectroscopic data are too high: when computing the temperature from the balmer-line data, a state of lte was assumed. however, this is not likely to be the case at the torch exit, where the electron temperature tends to be elevated above the heavy-species temperature. thus the temperature derived from the spectroscopic measurements corresponds more to the electron temperature than to the heavy-species temperature, and should not be taken as a boundary condition for the heavy-particle jet flow.

(2) however, even when starting with a lower nozzle exit temperature, 4150 k, the theoretical curves do not decline as fast as the experiments require. this leads to the suspicion that the jet is turbulent. the transition reynolds number for an axisymmetric jet is about 2300. in our case, the torch orifice diameter is 6 mm, the plasma velocity ~ 2700 m/s, the density ~ 0.1 kg/m3, and the viscosity ~ 0.00016 kg/(m·s), and thus the reynolds number is ~ 10000. this value decreases to some extent with increasing torch exit temperature, but always stays above the transition limit.
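the quoted reynolds number follows directly from the stated values; a one-line check:

```python
rho, v, d, mu = 0.1, 2700.0, 0.006, 0.00016   # kg/m^3, m/s, m, kg/(m*s)
print(rho * v * d / mu)   # ~1.0e4, well above the ~2300 transition value
```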
fig. 3: contours of the axial velocity and enthalpy for a laminar, a prandtl and a k-ε turbulent plasma jet

3 turbulent flow modeling

modeling of the turbulence was approached in two ways:

(1) using prandtl's free shear layer model for the eddy viscosity [5],
\mu_t = c\,\rho\,\delta\,u_{max}  (1)
where c is a constant for the particular type of mixing layer (c ~ 0.012 for a round jet), ρ is the local density, δ is the shear layer thickness, taken as the nozzle radius, and umax is the centerline jet velocity at the particular distance from the torch nozzle (a small numerical sketch of this closure is given at the end of this section);

(2) using the standard low-reynolds-number k-ε model of turbulence [6, 7].

figure 2 shows that the character of the solution has changed drastically: one notes that the enthalpy corresponds quite well with the experimental measurements, both qualitatively and quantitatively. however, the agreement in the axial velocity is still rather poor. in the experimental data, which are unfortunately available only from a distance of 55 mm away from the nozzle exit, the axial velocity falls from ~ 2500 m/s at 55 mm to ~ 1200 m/s at 67 mm. in the simulation, the velocity falls sufficiently, but much more gradually and over most of the distance from the nozzle.

the following remedies and explanations are available at the present time:

(1) one can modify the nozzle boundary conditions via an increase in the nozzle exit velocity and a decrease in the density, and pull the whole velocity profile up that way (either by decreasing the torch temperature further below 4000 k, which is unrealistic, or by reducing the jet pressure further and thus causing greater overexpansion). however, this fix will not lead to a sudden drop in velocity, as the experiment suggests.

(2) the interpretation of the experimental data may be questioned on two counts: the assumption that the jet static pressure equals the chamber pressure; and the finding that the pressure ratio in the enthalpy probe measurements closer to the torch than 55 mm is so high that it does not allow velocity and enthalpy evaluation for those values. these points will need further clarification with the experimental researchers.

finally, an illustrative comparison between the individual computational cases is given in figure 3. the two turbulent simulations produce a much shorter and much wider jet than the laminar case, as expected (k-ε leading to even more extensive entrainment of the ambient atmosphere than the prandtl model). however, again, no turbulence modeling can reproduce the sudden velocity drop around 6 cm from the nozzle that the experiments suggest.
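as promised above, a minimal sketch of the prandtl closure (1); the nozzle radius of 3 mm follows from the 6 mm orifice diameter quoted earlier, and the molecular viscosity is repeated for comparison:

```python
def eddy_viscosity(rho, u_max, c=0.012, delta=0.003):
    # eq. (1): mu_t = c * rho * delta * u_max  (round jet, delta = nozzle radius)
    return c * rho * delta * u_max

mu_t = eddy_viscosity(rho=0.1, u_max=2700.0)
print(mu_t, mu_t / 0.00016)   # ~9.7e-3 kg/(m*s), i.e. roughly 60x the molecular value
```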
4 summary and conclusions

this paper has reported the process of designing and analysing a typical thermal plasma chemical vapor deposition system. it has shown the development of a theoretical model from a simple laminar model to a more demanding turbulent one, and has pointed out the reasons leading to such a change in understanding. also, close cooperation with the experimental researchers has been emphasized. at the current stage, the benefits of the analysis undertaken are fourfold: (1) the simulation has confirmed the supersonic character of the flow; (2) recirculation of the gas (incl. the hydrocarbon precursor) in the region close to the torch exit has been brought to attention; (3) numerous cross-checks and critical evaluations of the experiments were enforced, specific suggestions for further experiments have been made, and some potentially erroneous assumptions in the experimental work have been exposed; and (4) fruitful discussions have been initiated on various aspects of the deposition process, incl. the precursor flow pattern and its chemical composition, and boundary layer effects on the chemistry.

future work should include (1) a critical re-evaluation of the enthalpy probe measurements, (2) better-suited turbulence modeling, and (3) greater attention to the torch exit conditions. only then can trustworthy predictions and further optimization of the deposition process take place.

acknowledgement

this project was in part carried out using computational resources provided under research grants av čr a1057001/120/00 and msm/216200031.

references

[1] jahn, e.: unpublished results, fg plasma-oberflächentechnik, technische universität ilmenau, 1997
[2] kolman, d., heberlein, j., pfender, e.: influence of deposition parameters on diamond thermal plasma chemical vapor deposition with liquid feedstock injection. diam. rel. mat. 7/1998, pp. 794–801
[3] patankar, s. v.: numerical heat transfer and fluid flow. hemisphere publishing, new york, 1980
[4] schwenk, a., sember, v., gravelle, d. v., boulos, m. i., nutsch, g.: enthalpy probe measurements in supersonic plasma jets. viii. workshop plasmatechnik, technische universität ilmenau, germany, june 2000
[5] patankar, s. v.: computation of heat transfer and fluid flow. complementary notes, university of minnesota
[6] jones, w. p., launder, b. e.: the prediction of laminarization with a two-equation model of turbulence. int. j. heat mass transfer 15, pp. 301–314
[7] lam, c. k. g., bremhorst, k.: a modified form of the k-ε model for predicting wall turbulence. transactions asme 103, sept. 1981, pp. 456–459

ing. david kolman, phd., phone: +420 2 2435 7541, e-mail: kolman@marian.fsik.cvut.cz
department of technical mathematics, czech technical university in prague, faculty of mechanical engineering, karlovo nám. 13, 121 35 praha 2, czech republic

analysis of the accuracy of fibre-optic strain gauges

dita jiroutová*, miroslav vokáč
department of experimental methods, klokner institute, czech technical university in prague, šolínova 7, 166 08 prague 6, czech republic
* corresponding author: dita.jiroutova@klok.cvut.cz
(acta polytechnica 53(6):872–877, 2013, doi:10.14311/ap.2013.53.0872, © czech technical university in prague, 2013, available online at http://ojs.cvut.cz/ojs/index.php/ap)

abstract. in recent years, the field of structure monitoring has been making increasing use of systems based on fibre-optic technologies. fibre-optic technology offers many advantages, including higher quality measurements, greater reliability, easier installation and maintenance, insensitivity to the environment (mainly to the electromagnetic field), corrosion resistance, safety in explosive and flammable environments, the possibility of long-term monitoring and lower cost per lifetime. we have used sofo fibre-optic strain gauges to perform measurements to check the overall relative deformation of a real reinforced concrete structure. long-term monitoring of the structure revealed that the measurement readings obtained from these fibre-optic strain gauges differed from each other. greater attention was therefore paid to the calibration of the fibre-optic strain gauges, and to determining their measurement accuracy. the experimental results show that it is necessary to calibrate sofo strain gauges before they are used, and to determine their calibration constant.

keywords: fibre-optic strain gauge; sofo system; calibration; accuracy.

1. introduction

during the lifespan of building structures, changes take place in the volume of the structure due to the effects of external loads and of the surrounding environment. a whole range of sensors, working on various physical principles, are used for monitoring these changes. the most widely used deformation sensors include resistance strain gauges, capacitive and inductive displacement sensors, and video strain gauges. a disadvantage of these standard displacement sensors is that they cannot be integrated into the internal part of the structure or into fresh concrete structures. these sensors therefore cannot be used to monitor the behaviour of building structures during the concrete casting process or immediately after the casting operation.

in recent years, fibre-optic measuring methods have been introduced for monitoring the behaviour of concrete structures. strain gauges of this type use the capability of optical fibres to transmit optical radiation in the direction of their centreline. the radiation is transmitted by means of the reflection of light at an interface between two media with different refractive indices [1]. the principle of fibre-optic sensors is derived from various physical phenomena; examples are the fibre bragg grating (fbg) reflector, or sensors based on the fabry-perot or michelson interferometer. the main advantage of fibre-optic strain gauges is that they can be installed directly on the reinforcement steel of the structure. this enables deformations in the structure to be monitored right from the concrete casting operation. other advantages of these sensors are their higher quality, reliability and measurement accuracy, easier installation and maintenance, electromagnetic resistance, resistance against corrosion, and the opportunity to carry out long-term monitoring of the structure. strain gauges of this type can be used for monitoring the behaviour of a whole range of building structures, e.g. bridges, tunnels, dams, power stations, buildings, piping systems, interactions between old and new concrete, etc. [2].

in 2008, the klokner institute of the czech technical university in prague acquired a sofo measuring system for measuring static quantities, with sbd ver. 6.3.53 measuring software and sofo fibre-optic strain gauges of various active lengths, produced by smartec s.a. four of the fibre-optic strain gauges, together with the measuring system, were used to perform measurements to check the overall relative deformation of a real reinforced concrete structure.
long-term monitoring of the structure revealed that the measurement readings obtained from these fibre-optic strain gauges differed from each other. greater attention was therefore paid to the calibration of the fibre-optic strain gauges, and to determining their measurement accuracy. the experimental results show that it is necessary to calibrate sofo strain gauges before they are used, and to determine their calibration constant.

keywords: fibre-optic strain gauge; sofo system; calibration; accuracy.

1. introduction

during the lifespan of building structures, changes take place in the volume of the structure due to the effects of external loads from the surrounding environment. a whole range of sensors, working on various physical principles, are used for monitoring these changes. the most widely used deformation sensors include resistance strain gauges, capacity and induction displacement sensors, and video strain gauges. a disadvantage of these standard displacement sensors is that they cannot be integrated into the internal part of the structure or into fresh concrete structures. these sensors therefore cannot be used to monitor the behaviour of building structures during the concrete casting process or immediately after the casting operation. in recent years, fibre-optic measuring methods have been introduced for monitoring the behaviour of concrete structures. strain gauges of this type use the capability of optical fibres to transmit optical radiation in the direction of their centreline. the radiation is transmitted by means of the reflection of light at an interface between two environments with a different refractive index [1]. the principle of fibre-optic sensors is derived from various physical phenomena. examples are the fibre bragg grating (fbg) reflector, or sensors based on the fabry-perot or michelson interferometer, etc. the main advantage of fibre-optic strain gauges is that they can be installed directly on the reinforcement steel of the structure. this enables deformations in the structure to be monitored right from the concrete casting operation. other advantages of these sensors are their higher quality, reliability and measurement accuracy, easier installation and maintenance, electromagnetic resistance, resistance against corrosion, and the opportunity to carry out long-term monitoring of the structure. strain gauges of this type can be used for monitoring the behaviour of a whole range of building structures, e.g. bridges, tunnels, dams, power stations, buildings, piping systems, interactions between old and new concrete, etc. [2]. in 2008, the klokner institute of the czech technical university in prague acquired a sofo measuring system for measuring static quantities, with sbd ver. 6.3.53 measuring software and sofo fibre-optic strain gauges of various active lengths, produced by smartec s.a. four of the fibre-optic strain gauges, together with the measuring system, were used to perform measurements to check the overall relative deformation of a real reinforced concrete structure. long-term monitoring of the structure revealed that the measurement readings obtained from these fibre-optic strain gauges differed from each other. greater attention was therefore paid to the calibration of these fibre-optic strain gauges, and to determining their measurement accuracy.

2. description of the experimental equipment

in the experimental part of our work, we analysed the measurement accuracy of sofo long-fibre strain gauges, produced by smartec s.a.
fibre-optic strain gauges of this type work on the principle of the michelson interferometer. the sofo measuring system for static quantities with sbd ver. 6.3.53 software from smartec s.a. was used to record the measurement readings. the sofo measuring system comprises a sofo long-fibre sensor and a portable reading unit, and this whole system can be connected to a control computer with software. a schematic drawing of the configuration of the sofo measurement system for measuring static quantities is illustrated in figure 1.

figure 1. configuration of the sofo system for measuring static quantities [3].

the sofo fibre-optic strain gauge comprises two parts — the active part and the passive part. the active part is formed by a polyamide tube, inside which a measuring and a reference optical fibre are installed. the active part is available in the length range from 0.25 to 10 m. this part of the system is a so-called michelson interferometer. the passive part is composed of a fibre-optic connection cable, which is available in lengths up to 50 m. the cable connection ends are fitted with e-2000 optical connectors (figure 2).

figure 2. design of the sofo long-fibre strain gauge [5].

the measurement range of this strain gauge is 0.5 % of the active length when shortened, and 1 % when extended. the working temperature range of the passive part is −40 °c to +80 °c, and the range of the active part is −50 °c to +110 °c. the sofo fibre-optic strain gauge is connected by the optical connection cable to the measuring unit containing a michelson interferometer, which is referred to as the reference interferometer. this measurement system provides deformations in absolute values by comparing data from the measuring unit and the reference michelson interferometer. the measuring unit also contains an internal memory. this measuring unit uses sofo sbd (version 6.3.53) software, which can establish a database of the configurations of the connected sensors and a database of measuring project configurations, in order to control the sofo measuring system for static measurements and to administer the measurement readings data [4]. the control computer with the software has two roles: firstly, it saves the data, and, secondly, it controls the system.

3. calibrating the experimental equipment

calibration was carried out on a total of six sofo long-fibre strain gauges of varying active length — 0.5 m, 1.0 m and 2.0 m. the specification of the standard sofo strain gauges used in our work is presented in table 1. these standard fibre-optic strain gauges were calibrated under identical ambient conditions in the laboratory of the klokner institute. the air temperature was 22±1 °c and the humidity was 45–50 %.

serial number sn | active length la [m] | passive length pl [m] | initial stress length dl [mm]
7063 | 0.5 | 5  | 37.127
7064 | 0.5 | 5  | 37.488
9490 | 1.0 | 5  | 38.623
9491 | 1.0 | 5  | 36.498
7076 | 2.0 | 10 | 36.358
7077 | 2.0 | 10 | 35.645
table 1. specification of the calibrated standard sofo fibre-optic strain gauges.

for the actual calibration of the sofo fibre-optic strain gauges, a jig device was designed to hold the strain gauges and to set their initial stress length — see figure 3.

figure 3. calibration of sofo fibre-optic strain gauges of active length la 1 m.

the jig device was made from stainless steel and had two main parts (a and b).
the function of these parts was to fasten the fibre-optic strain gauge that was being calibrated. part a could be moved along the calibration route, depending on the active length of the sensor. part b allowed deformation changes to be continuously set. a somet inclinometer with a range of 1 mm and an accuracy of 0.001 mm was used for setting the changes in deformation.

4. experimental works

the manufacturer of sofo fibre-optic strain gauges provides the value for the initial stress length dl in the technical sheet of each sensor. the sensor should be stressed to this value during installation. in the course of the experimental works, calibration constants were derived and the measurement accuracy was determined as a function of the initial stress length. calibration of the sofo fibre-optic strain gauges commenced by setting the strain gauge to the initial stress length declared by the manufacturer (value dl). the initial stress length set in this way was recorded by the sofo portable reading unit, and a zero value was set on the somet inclinometer and was also entered into the measuring application. from this initial state, the sofo fibre-optic strain gauge was then stretched and compressed by ±0.5 mm in steps of 0.01 mm. the displacement values were checked on the inclinometer. after each step, a reading was taken of the stress length of the fibre-optic strain gauge, and the set real value of the displacement was read off the inclinometer. this process was repeated five times for each initial stress length. an example of a recording is presented in figure 4. after five repetitions, the initial stress length was changed, and the stressing process was repeated five times. for each fibre-optic strain gauge, the calibration was carried out for the declared initial stress length (dl) and for at least five other stress length values. the displacements as a function of time obtained from the inclinometer, $\Delta l_h$, were then plotted against the displacements obtained from the sofo fibre-optic strain gauge, $\Delta l_{sofo}$, for the repeated measurements of ±0.5 mm displacement in 0.1 mm steps for each initial stress length. a linear regression curve was then fitted through the measurement readings, and the calibration constant $k_i$ (see figure 5) was determined from the regression equation

$\Delta l_h = k_i \cdot \Delta l_{sofo}$. (1)

for the five repeated measurements of a single initial stress length, the calibration constants $k_i$ in equation (1) were obtained by means of the least squares method. an example of the five calibration constants $k_i$ obtained in this way for a fibre-optic strain gauge of active length 1 m is presented in figure 5. for each value of the initial stress length, the calibration constants $k_i$ determined in this way were used to derive the value of the resultant calibration constant k, described by the average value $\bar{k}$ and the limits of a 95 % reliability interval [6]

$k = \bar{k} \pm \dfrac{s \cdot t_{(m-1);0.05}}{\sqrt{m}}$,

where $t_{(m-1);0.05}$ is a coefficient of student's t-distribution for a 5 % level of significance, m is the number of repeated measurements for each initial stress length, and s is the standard deviation.

5. measurement results

the calibration procedure and the calculations to determine the limits of a 95 % reliability interval have been described above.
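a minimal sketch of the evaluation just described — a least-squares fit of $\Delta l_h = k_i \cdot \Delta l_{sofo}$ through the origin for each repetition, followed by a student's t reliability interval over the m repetitions — might look as follows in python; the input arrays are invented placeholder data, not measured values from this paper.

```python
import numpy as np
from scipy.stats import t

def calibration_constant(dl_sofo: np.ndarray, dl_h: np.ndarray) -> float:
    """least-squares slope of dl_h = k_i * dl_sofo (regression through the origin)."""
    return float(np.sum(dl_sofo * dl_h) / np.sum(dl_sofo**2))

def reliability_interval(k_values, alpha: float = 0.05):
    """mean calibration constant with a (1 - alpha) student's t interval."""
    m = len(k_values)
    k_bar = float(np.mean(k_values))
    s = float(np.std(k_values, ddof=1))  # sample standard deviation
    half_width = s * t.ppf(1 - alpha / 2, m - 1) / np.sqrt(m)
    return k_bar, half_width

# placeholder data: five repetitions of a +-0.5 mm sweep (invented numbers)
rng = np.random.default_rng(0)
k_i = []
for _ in range(5):
    x = np.linspace(-0.5, 0.5, 11)                   # sofo readings [mm]
    y = 1.002 * x + rng.normal(0.0, 0.002, x.size)   # inclinometer readings [mm]
    k_i.append(calibration_constant(x, y))

k_bar, hw = reliability_interval(k_i)
print(f"k = {k_bar:.4f} +- {hw:.4f} (95 % interval)")
```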
the values obtained in the calibration of six fibre-optic strain gauges of varying active lengths (0.5 m, 1 m and 2 m) have been plotted as graphs depicting the spread of calibration constants together with a 95 % reliability interval as a function of the initial stress length — see figures 6 to 8. each graph depicts the functions for two strain gauges of identical active length. in addition, the initial stress length recommended by the manufacturer (dl) is shown for each function.

figure 4. an example of displacements obtained from the somet inclinometer and the initial stress length of the sofo fibre-optic strain gauge with a 0.1 mm step obtained during calibration, as a function of time (sn 9490; la 1.0 m; dl 36.623 mm; initial stress 38.657 mm).

figure 5. an example of displacements determined by the inclinometer and by the sofo fibre-optic strain gauge for five repeated measurements, with a fitted regression curve and calibration constants for the given initial stress value.

6. conclusion

calibrations of six sofo sensors showed that the spread of calibration constants as a function of the initial stress length is similar for sensors of identical active length. however, this does not apply to strain gauges of other active lengths. for example, figure 7 shows that the calibration constant k of a sofo strain gauge of active length 1.0 m changes considerably with the value of the initial stress length, whereas for a sofo strain gauge of active length 2.0 m the value of the calibration constant is almost independent of the initial stress length (figure 8). the graphs also show that the greater the value of the initial stress length of the sofo strain gauge, the closer the calibration constant is to the limiting value, and the dispersion is reduced. the spreads of the calibration constants show that it is necessary to calibrate sofo strain gauges, and to determine their calibration constant, before they are used. when the sensors are being installed in the structure, it is necessary to achieve the highest possible initial stress length value, as this will guarantee that the constant will be close to the limiting value, and that the highest possible accuracy will be achieved.

figure 6. calibration constants in a 95 % reliability interval of sofo fibre-optic strain gauges of active length la 0.5 m — sn 7063 (required stress length dl 37.127) and sn 7064 (required stress length dl 37.488).

figure 7. calibration constants in a 95 % reliability interval of sofo fibre-optic strain gauges of active length la 1.0 m — sn 9490 (required stress length dl 38.623) and sn 9491 (required stress length dl 38.498).

figure 8. calibration constants in a 95 % reliability interval of sofo fibre-optic strain gauges of active length la 2.0 m — sn 7076 (required stress length dl 36.358) and sn 7077 (required stress length dl 35.645).

when installing these sensors, it is recommended to attach the fibre-optic strain gauge as carefully and as rigidly as possible to the reinforcement of the structure, so that when the concrete is cast the fixing points are not moved along the steel and the initial stress length of the strain gauge is not changed.

references

[1] martinek, r.: senzory v průmyslové praxi. prague: ben — technická literatura, 2004, 200 p.
[2] glišić, b., inaudi, d.: fibre optic methods for structural health monitoring. london: john wiley & sons ltd, 2007, 262 p.
[3] inaudi, d.: sofo sensors for static and dynamic measurements. in: proceedings of the 1st fig international symposium on engineering surveys for construction works and structural engineering. nottingham, 2004, p. 1–10.
[4] král, j., vokáč, m., bouška, p.: optovláknové extenzometry — příručka. prague: ctu in prague, 2009, 22 p.
[5] http://www.smartec.ch/, [2009-01-05].
[6] vorlíček, m., holický, m., špačková, m.: pravděpodobnost a matematická statistika pro inženýry. prague: ctu in prague, 1982, 345 p.

acta polytechnica
doi:10.14311/ap.2015.55.0022
acta polytechnica 55(1):22–28, 2015
© czech technical university in prague, 2015
available online at http://ojs.cvut.cz/ojs/index.php/ap

simulation of radiative heat flux distribution under an infrared heat emitter

jan loufek∗, jiřina královcová

technical university of liberec, faculty of mechatronics, informatics and interdisciplinary studies, studentská 1402/2, 561 17, liberec, czech republic
∗ corresponding author: jan.loufek@tul.cz

abstract. this paper deals with a heat radiation model used for calculating the heat flux of an infrared heater. in general, we consider a system consisting of a set of objects, whereby a single object can stand for a heater, a reflector or a heated body. each of the objects is defined by its bounding surface. the model applies a 2d restriction of the real system. the aim of a particular simulation is to obtain the heat flux distribution all over the heated body under given conditions, e.g. the temperature and material properties of the object. the implemented model is used to design a reflector profile to obtain a desired heat flux distribution. the paper presents the implemented model, a validation of the simulation using measured data, and an example of the design of a reflector.

keywords: radiative heat transfer; simulations; optimization; heat flux distribution.

1. introduction

in high temperature applications, e.g. with infrared heating, thermal radiation is the dominant mode of heat transfer. infrared heating is used, among other methods, to heat up shell-forms in the production of artificial leathers in the automotive industry. in this case, it is necessary to determine the optimal location for heaters, aiming at an almost uniform distribution of radiation intensity all over the shell form surface. the procedure for determining the optimal location of particular heaters is based on temperature simulation and optimization. some relevant issues were published in [1–3]. a similar topic is addressed by rukolaine [4], whose work presents the possibility of optimal shape design related to radiative heat transfer. in [4], the inverse problem is reduced, via a least-squares objective functional, to an optimisation problem. in our work we consider a reflector with gray body properties, i.e. the case when the energy incident on a surface is reflected diffusely. a model of a complex system consisting of a shell form and dozens (or hundreds) of heaters [5] works with a heat flux distribution function that determines the heat flux at various points under the emitter.
unfortunately, there are not enough data specifications available for an infrared heater that would allow an evaluation of the necessary characteristics. the data needed can be obtained by measurements of the heat flux coming out of the heater at different distances and different incidence angles. furthermore, it is necessary to make an interpolation to find the contribution of a particular heater to the heat flux at a respective point of the shell form. another method for obtaining the heat flux distribution under particular heaters is to simulate them. for that purpose we implemented a model of heat transfer by radiation, where the system consists of a heat emitter, a reflector and a heated body. a similar topic, i.e., radiative heat transfer simulation, can be found in the work of takami, danielsson and mahmoudi [6], where the authors present simulations for the purpose of optimizing a high power reflector with respect to heat power and temperature distribution. our model is also applicable to other purposes: (1) to verify a multiple infrared heat source interaction, (2) to design an alternative reflector shape suitable for specific criteria, such as focusing the heat flux in the desired direction, or (3) to design a reflector that achieves a desired heat flux distribution all over the heated body.

2. heat transfer model

the following section summarizes the general relations used in the model design. the model presented here works with two main parts of the system. there is a heat source or a set of heat sources that is/are equipped with reflectors (if needed) on one side, and the object or set of objects to be heated up on the other side. in general, each part of the system radiates a continuous electromagnetic energy. the magnitude of the energy depends on the temperature of the object and its surface properties. the amount of radiated energy depends on the fourth power of the absolute temperature. in the case of a "black body", i.e. a body which absorbs/emits all the incident energy without any reflection, the amount of energy absorbed or emitted per square meter $e_b$ (w m−2) is described by the stefan-boltzmann law

$e_b = \sigma t^4$, (1)

where t (k) stands for the absolute temperature and $\sigma = 5.670373 \cdot 10^{-8}$ w m−2 k−4 is the stefan-boltzmann constant. in the case of a real body ("gray body"), we deal with the emissivity of the body surface ε (dimensionless), which represents the rate of emitted energy against the black body emission:

$\varepsilon = \dfrac{e(t)}{e_b(t)} = \dfrac{\int_0^{\infty} e_\lambda(\lambda, t)\,\mathrm{d}\lambda}{\sigma t^4}$, (2)

where λ (m) stands for the wavelength of the radiating energy. we assume that each part of the system (every heater, every reflector, every heated object) is determined (set) by its bounding surface. the union of all the bounding surfaces forms the model domain. for the purposes of radiation heat transfer simulation, we introduced a partition of the domain into a set of facets as simplex elements of the respective domain (i.e. line segments for the bounding surface in 2d, and triangles for the bounding surface in 3d). the union of the facets forms an approximation of the original domain. each facet is described by its position, its normal vector, its surface temperature and emissivity.
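as a numerical illustration of equations (1) and (2), the sketch below evaluates the gray-body emissive power of a single facet; the temperature and emissivity values reuse the calibration values quoted later in this paper, and the conversion from °c to k is added here as an assumption about the intended units.

```python
SIGMA = 5.670373e-8  # stefan-boltzmann constant [w m^-2 k^-4]

def emitted_power(temperature_k: float, emissivity: float = 1.0) -> float:
    """gray-body emissive power per unit area, eqs. (1)-(2): e = eps * sigma * t^4."""
    return emissivity * SIGMA * temperature_k**4

# heater-wire facet: 3230 degrees c and emissivity 0.95 (values from section 5)
t_wire_k = 3230.0 + 273.15
print(f"emitted power: {emitted_power(t_wire_k, 0.95):.3e} w/m^2")
```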
the key concept in solving the radiative heat transfer in systems of surfaces is demonstrated by the view factor. the view factor $f_{i-j}$ is defined as the fraction of the total outgoing radiation out of the i-th surface intercepted by the j-th surface. the view factor $f_{i-j}$ is given by the double-integral form

$f_{i-j} = \dfrac{1}{a_i} \int_{a_i} \int_{a_j} \dfrac{\cos\beta_1 \cos\beta_2}{\pi s^2}\,\mathrm{d}a_i\,\mathrm{d}a_j$, (3)

where $a_i$ (m2) stands for the area of the surface, s (m) stands for the length of the join of particular points on surfaces $a_i$ and $a_j$, and $\beta_n$ (rad) stands for the angle between the join and the outer normal. a sample configuration of two rectangular facets in 3d is shown in figure 1.

figure 1. sample configuration of two rectangular facets in 3d.

for some simple geometries, calculations of the view factors are straightforward, and various methods have been outlined in books on radiative heat transfer, e.g. in [7–9]. due to the asymptotic complexity o(n⁴) of the algorithm related to the numerical evaluation of the view factor, which is unavoidable for complex geometries, especially geometries with objects forming obstacles, the calculation is a time-consuming problem. this applies particularly to simulations with a more accurate discretization, where the considered bounding surfaces are split into many discrete facets. the surface density of the radiated energy (radiosity) b (w m−2) is determined as the sum of the reflected and the emitted energy, and is given by the formula

$b = \rho h + \varepsilon e_b$, (4)

where h (w m−2) stands for the irradiance of the surface and ρ (dimensionless) stands for the reflectivity of the considered surface. eventually, the rate of heat transfer $q_i$ (w) of a particular surface $a_i$ in the direction of the outer normal due to radiation can be expressed as

$q_i = \sum_{j=1}^{n} \dfrac{b_i - b_j}{1/(a_i f_{i-j})}, \quad i \neq j$. (5)

the model presented here is designed to determine the heat flux distribution (5) all over any surface [8, 10]. our model is composed of a set of particle surfaces, whereby each particle surface has given particular properties (location, size, emittance and temperature).

3. model implementation

the proposed model was implemented in the ire designer computer program, which was written in the java language. ire designer implements our own numerical solver. the implementation works with the 2d restriction of the system described in section 2. the application (figure 2) deals with a cross-section of a heat emitter or emitters (with a reflector if needed) and the heated-up surface. the model geometry consists of a set of straight lines (line segments). in the 2d implementation, the view factors are evaluated by means of hottel's crossed strings method [9]. the advantage of the two-dimensional model is its significant simplification and its straightforward view-factor evaluation, and it also responds promptly to input parameter changes. the aim of a particular simulation was to obtain the heat flux distribution all over the heated surface under given conditions that were in general given by the temperature and the material properties of particular line segments. an additional functionality has been implemented in order to validate the evaluated results against measured data (through the import of measured values).

figure 2. illustrative screenshot of ire designer's main window with a simple simulation.

figure 3. an intermediate result of incomplete reflector shape optimization. the figure shows the required heat flux distribution by the green rectangular line and the simulated distribution.
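the 2d view-factor evaluation via hottel's crossed strings method mentioned above can be sketched as follows for two line segments without obstructions; the segment endpoints below are arbitrary illustrative coordinates, and the implementation is a sketch rather than ire designer's actual (java) solver.

```python
import math

Point = tuple[float, float]

def dist(p: Point, q: Point) -> float:
    return math.hypot(p[0] - q[0], p[1] - q[1])

def view_factor_2d(a1: Point, a2: Point, b1: Point, b2: Point) -> float:
    """hottel's crossed-strings rule for segments a = (a1, a2), b = (b1, b2):
    f_a->b = (crossed strings - uncrossed strings) / (2 * length of a),
    valid for mutually visible, unobstructed segments."""
    crossed = dist(a1, b2) + dist(a2, b1)
    uncrossed = dist(a1, b1) + dist(a2, b2)
    return (crossed - uncrossed) / (2.0 * dist(a1, a2))

# illustrative geometry: emitter segment above, heated-surface segment below
f = view_factor_2d((0.0, 0.1), (0.05, 0.1), (-0.1, 0.0), (0.15, 0.0))
print(f"view factor f = {f:.4f}")
```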
in addition, an algorithm to optimize the shape of a reflector to reach the desired heat flux distribution function has been implemented in the two-dimensional version of the model. reflector shape optimization uses the hill-climbing algorithm, which is based on accomplishing small steps in the direction of the gradient [11]. the intermediate result of an incomplete reflector shape optimization is illustrated in figure 3. the optimization is carried out in subsequently performed steps. in each step of the algorithm, the aim is to modify the position of each point that represents the reflector partitioning (boundary points of the line segments forming the reflector). the considered position modifications of one point are shown in figure 4. when each point has been moved, the difference between the simulated and the desired heat intensity is evaluated. those intensities are shown in figure 3. the desired intensity distribution is distinguished by a green line; the second curve is the calculated distribution of the particular configuration. the optimisation starts with a user-defined initial configuration and material and temperature properties. in the case of the reflector shape optimisation presented in figure 3, we consider a flat heated surface. in that case, the following parameters were used: temperature of the heated surface 110 °c, temperature of the heater wire 3230 °c, and temperature of the reflector 300 °c; emissivity of the heated surface 0.73, emissivity of the heater wire 0.95 and emissivity of the reflector 0.05. the initial values used for the optimization algorithm are determined by the model calibration described in section 5. in each step of the algorithm, the best configuration is selected, and this intermediate result is the starting point for the next optimization step. the optimization algorithm minimizes the difference between the absolute values of the integrals of the simulated and the desired distribution function.

figure 4. the considered shifts of a single point of the two-dimensional reflector model used for each step in shape optimization.
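a minimal sketch of the hill-climbing step described above might look as follows; the objective function, step size, point set and the toy demonstration are placeholders standing in for ire designer's internal simulation, which is not public.

```python
def hill_climb_step(points, objective, step=1e-3):
    """try small shifts of each reflector point; keep a shift if it lowers
    the objective (difference between simulated and desired distribution)."""
    shifts = [(step, 0.0), (-step, 0.0), (0.0, step), (0.0, -step)]
    best = list(points)
    best_val = objective(best)
    for i in range(len(best)):
        x, y = best[i]
        for dx, dy in shifts:
            cand = list(best)
            cand[i] = (x + dx, y + dy)
            val = objective(cand)
            if val < best_val:
                best, best_val = cand, val
    return best

# toy demonstration: an objective that pulls points toward the line y = 0;
# in the real model, objective(points) would be the absolute difference of
# the integrals of the simulated and the desired flux distributions.
toy = lambda pts: sum(abs(y) for _, y in pts)
pts = [(0.0, 0.05), (0.01, -0.02)]
for _ in range(50):
    pts = hill_climb_step(pts, toy, step=0.01)
print(pts)
```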
figure 5. the trajectory of the heat flux characteristics of the measured emitter.

figure 6. 3d visualization of the measured heat flux characteristics of the phillips irz500 emitter placed at a distance of 100 mm above the heated surface.

4. heat flux measurement

for validation of the simulated results against real results, we performed a heat flux measurement of a real infrared heater. the real measurement was made with a hukseflux sbg01 heat flux sensor. the detected emissivity of the sensor was 0.95, showing almost "black body" behaviour. due to the symmetry, the measurement of the heat flux was performed in one quadrant of the radiated surface. the trajectory of the measurement is shown in figure 5. a 3d visualization of the measured heat flux characteristics of the phillips irz500 emitter placed at a distance of 100 mm above the heated surface is shown in figure 6. the reflector that was used is made of polished aluminum with data-sheet emissivity values between 0.04 and 0.1. the values shown in the graph were interpolated and mirrored on the other quadrants to achieve a full 3d visualization of the heat flux distribution. more information on the measurements, their conditions and results was presented in [12].

figure 7. a side-view of the reflector used for measurements of the calibration values.

5. model calibration

the input data for a particular simulation performed on the given geometry (given the bounding surfaces and their discretization) are represented by the temperature and the emittance of each particle surface (facet). for the temperature, setting the temperature of the heat emitter is the most crucial factor. the data provided in the documentation of particular products are not sufficient; however, the results of our model were significantly influenced by that input. the heater temperature was calibrated for the phillips irz500 emitter [13] based on the results of real measurements. a side-view of the reflector used for this purpose is shown in figure 7. considering the three parts of the simulated system (the heater, the reflector, and the heated surface) and taking into account the uniform input characteristics of each of these parts, we obtained the model inputs (as a result of our calibration): the temperature of the heated surface was 110 °c, the temperature of the heater wire was 3230 °c, and the temperature of the reflector was 300 °c; the emissivity of the heated surface was 0.73, the emissivity of the heater wire was 0.95 (an almost ideal black body surface), and the emissivity of the reflector was 0.05. the calibration of all parameters was based on the table values for emissivity and the expected temperature of the elements. the calibration process is performed by minimizing the absolute value of the difference of the integrals of the simulated and measured distribution function. a comparison of the measured and the simulated values of the calibrated numerical model is shown in figure 8, where the green line on the right side represents the measured values, and the black curve represents the values obtained from the numerical simulation. despite the calibration, it can be seen that full agreement was not reached. the distinct course of the measured and simulated heat flux distribution is caused mainly by the simplification of the real system that is taken into account in the model (e.g. the usage of a glass bulb filled with halogen). the main aim of the calibration is to determine the slope and the maximum of the heat flux distribution function.

figure 8. a validation of the measured and simulated values. the green line displays measured values and the black curve displays values obtained from the numerical simulation.

6. model results

the main benefit of the implemented model is the possibility to obtain the heat flux characteristics under free-choice conditions given by the values of the temperature and the emissivity for each particular part of the model. a result of such a simulation is provided in section 5. there, the result was obtained through calibration based on validation by the measured data. an important feature of the model (with a view to its further use in real conditions) is its sensitivity to the input parameters. the impact of an imperfect/imprecise assessment of the inputs is determined by the sensitivity of the model. the graphs in figure 9 and figure 10 show the sensitivities to the temperature and the emissivity of the various parts. the horizontal axis indicates the deviation of the input parameters from the reference point given by the input parameters of the calibrated model described in section 5. the vertical axis indicates the maximum heat flux under the reflector. an almost linear dependence can be seen in each of the outlined cases.
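a sensitivity study of this kind can be sketched as a simple parameter sweep around the calibrated reference point; `simulate_max_flux` below is a hypothetical stand-in for the full 2d radiation simulation, which ire designer performs internally, and its formula is a placeholder only.

```python
# sweep around the calibrated reference inputs (values from section 5);
# simulate_max_flux is a hypothetical stand-in for the 2d radiation model.
reference = {"t_heater": 3230.0, "t_reflector": 300.0, "t_surface": 110.0}

def simulate_max_flux(params: dict) -> float:
    # placeholder model: in reality this would run the full view-factor solver
    return 1e-3 * (params["t_heater"] + 273.15) ** 4 / 1e9

def sweep(param: str, deviations=(-0.10, -0.05, 0.0, 0.05, 0.10)):
    """maximum heat flux under the reflector vs. relative input deviation."""
    for d in deviations:
        p = dict(reference)
        p[param] = reference[param] * (1.0 + d)
        print(f"{param} {d:+.0%}: max flux = {simulate_max_flux(p):.3f}")

sweep("t_heater")
```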
in addition, for the input temperature settings, the results of the model were most sensitive to the heater temperature setting. however, the results were nearly independent of the reflector temperature, and for that reason we assume that its temperature can be set freely — (more or less) inaccurately — as its setting does not affect the results significantly. with regard to the input emissivity settings, we observed that the results of the model were most sensitive to the radiated surface emissivity, whereas the emissivity of the heater wire and of the reflector had a less significant influence.

figure 9. model sensitivity to changes in temperature.

figure 10. model sensitivity to changes in emissivity.

with regard to the real system – heated-up shell-forms in the production of artificial leathers in the automotive industry – changes in the radiated surface quality are observed with the course of time, and for this reason the surface emissivity of the corresponding model must be adjusted appropriately. in the model presented here, we have focused on a model involving a single heater. however, shell-forms in the production of artificial leathers are heated by tens (often more than one hundred) of infrared heaters. in the model of such a complex system, it is necessary to deal with the interaction of several heaters to determine the heat flux above a particular (defined) element of a heated form. we assume that a simple superposition (sum) of the particular contributions of the heaters involved can be used in the calculations. our presumption has been confirmed by preliminary measurements [1]. then, we had to prove the applicability of the superposition in the procedure, using single heater model outputs for an evaluation of the complex model. for that reason, we dealt with a model involving more than one heater. figure 11 shows a particular configuration of two heaters and their placement. for this case, the graph in figure 12 shows a comparison of (1) the heat flux obtained by summing the heat flux of two single heaters, and (2) the result obtained by the model with two infrared heaters. it can be seen that there is a difference between the two curves. the difference represents the heat flux of a system involving only the heated surface (without any heater or reflector) – i.e. the heat flux due to the surface emissivity, or the background heat flux. we had to avoid counting the background heat flux more than once when particular results from the single heater model were superposed.

figure 11. two emitters side-by-side.

figure 12. a comparison of the heat flux intensity for each emitter, the sum of the fluxes of those emitters, and the simulated intensity of those emitters.
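the superposition described above — summing single-heater results while counting the background contribution only once — amounts to a one-line correction; the flux arrays below are invented placeholders, not values from the paper.

```python
import numpy as np

# fluxes along the heated surface from two single-heater runs plus a run
# with no heater (background only); the numbers are invented placeholders.
q_heater_1 = np.array([120.0, 340.0, 180.0])
q_heater_2 = np.array([90.0, 310.0, 260.0])
q_background = np.array([40.0, 40.0, 40.0])

# each single-heater result already contains the background once, so the
# two-heater estimate must subtract one background contribution:
q_two_heaters = q_heater_1 + q_heater_2 - q_background
print(q_two_heaters)
```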
the presented model is also helpful in solving tasks related to reflector design (shape), where the shape is determined by a desired heat flux distribution. a possible use and one example were mentioned in section 3. the model is also useful for determining the energy coverage in various configurations. tables 1 and 2 show several configurations with different depths between the emitter and the heated surface. from these simulation results it is possible to obtain the sharpness of the radiation heat transfer characteristics for various situations. table 1 presents the maximal value of the absorbed heat flux b (w m−2). table 2 shows the amount of radiated energy q (w) irradiating heated surfaces of different sizes, which indicates the fraction of the energy absorbed by surfaces of various widths.

distance between surface and emitter d | 50 mm | 100 mm | 150 mm | 200 mm | 250 mm | 300 mm
maximal heat flux b                    | 72397 | 37255  | 25351  | 19463  | 15962  | 13644
table 1. comparison of simulation results for emitters located at various depths above the heated surface. the table shows the maximal heat flux b (w m−2) for various depths.

distance between surface and emitter d   | 50 mm | 100 mm | 150 mm | 200 mm | 250 mm | 300 mm
heat transfer q for surface width 200 mm | 39709 | 38488  | 37728  | 36098  | 34425  | 32867
heat transfer q for surface width 160 mm | 38226 | 36914  | 35314  | 33213  | 31282  | 29405
heat transfer q for surface width 80 mm  | 34756 | 30762  | 26854  | 23530  | 20842  | 18666
heat transfer q for surface width 40 mm  | 30168 | 22689  | 17716  | 14454  | 12200  | 10578
table 2. a comparison of the simulation results for an emitter located at various depths above a heated surface. the table shows the rate of heat transfer q (w) for heated surfaces of various widths.

7. conclusions

this paper has presented a numerical model based on radiation heat transfer theory and its utilization in simulations of a system consisting of an infrared heater, a heater reflector and the heated surface. the goal of a single simulation is to determine the heat flux distribution above the heated surface. the model presented here is two-dimensional. additionally, a super-structure module (using the model) applies an optimization algorithm and serves to design a reflector (its shape). both the numerical model and the super-structure optimizing module were implemented in the ire designer software tool. this tool is helpful in providing heat flux distribution functions under various configurations (heater reflector shape, heated surface shape and inclination, distance) and under various input parameters (temperature, emissivity). the resulting heat flux distribution functions are used in the complex model of the heated shell-forms in the production of artificial leathers. then, the optimization functionality is applicable for the design of a reflector utilizable above a particular part of the shell-form to avoid deep non-uniformity of the heat flux. in addition, we are interested in an implementation corresponding to a full three-dimensional model, which has not been presented in this paper. our future work is focused on implementing a more precise method that will determine the three-dimensional view factor with obstructions, and will be helpful for implementing a suitable generic optimization algorithm.

acknowledgements

this work was supported by the student grant competition of the technical university of liberec.

references

[1] martinec, t. měření teplot pomocí kontaktních metod měření. ph.d. thesis, technical university of liberec, 2009.
[2] školník, t. řízení teplotních polí pomocí ohřevu infračervenými zářiči. ph.d. thesis, technical university of liberec, 2010.
[3] mlýnek, j., srb, r. optimization of a heat radiation intensity on a mould surface in the car industry. in proceedings of 9th international conference on mechatronics. springer-verlag, berlin, 2011.
[4] s. a. rukolaine. the shape gradient of the least-squares objective functional in optimal shape design problems of radiative heat transfer. journal of quantitative spectroscopy and radiative transfer 111(16):2390–2404, 2010. doi:10.1016/j.jqsrt.2010.06.016.
[5] hušek, m., potěšil, a.
software prediction of non-stationary heating of shell moulds for manufacture of artificial leathers. in proceedings of 18th international conference engineering mechanics 2012. svratka, žďár nad sázavou, 2012.
[6] k. m. takami, ö. danielsson, j. mahmoudi. high power reflector simulation to optimise electrical energy consumption and temperature profile. applied thermal engineering 31(4):477–486, 2011. doi:10.1016/j.applthermaleng.2010.09.031.
[7] g. walton. calculation of obstructed view factors by adaptive integration. national institute of standards and technology, 2002.
[8] m. f. modest. radiative heat transfer. 3rd ed. academic press, 2013.
[9] h. c. hottel, a. f. sarofim. radiative transfer. mcgraw-hill, new york, 1967.
[10] j. h. lienhard iv, j. h. lienhard v. a heat transfer textbook. 3rd ed. phlogiston press, 2003.
[11] a. w. johnson, s. h. jacobson. a class of convergent generalized hill climbing algorithms. applied mathematics and computation 125:359–373, 2002. doi:10.1016/s0096-3003(00)00137-5.
[12] j. mlýnek, a. potěšil. ohřevy radiací – teorie a průmyslová praxe. technical university of liberec, 2012.
[13] dr. fischer europe s.a.s. infrared halogen lamps, 2013. [2014-12-01], http://www.dr-fischer-group.com/img_pool/df%20ir%20catalogue.pdf.

acta polytechnica vol. 41 no. 4 – 5/2001

cargo aircraft conceptual design optimisation using a flexible computer-based scaling approach

f. schieck, d. schmitt

in the early design stages of a new aircraft, there is a strong need to broaden the knowledge base of the evolving aircraft project, allowing a profound analysis of the solution concepts and of the design driving requirements. the methodology presented in this paper provides a tool for increasing and improving in an exemplary manner the necessary information on cargo aircraft. by exchanging or adapting a few particular modules of the entire program system, the tool is applicable to a range of different aircraft types. in an extended requirement model, performance requirements are represented along with other operational requirements. an aircraft model is introduced in sufficient detail for conceptual design considerations. the computer-aided scaling methodology is explained, which, controlled by an optimisation module, automatically resizes the aircraft model until it optimally satisfies the requirements in terms of a selectable figure of merit. typical results obtained at the end of the scaling are discussed together with knowledge gained during the process, and an example is given.

keywords: aircraft conceptual design, aircraft scaling, mass growth factor, aircraft multidisciplinary design optimisation.

1 introduction

the engineering design process of a new aircraft starts with a set of requirements which are to be met by one or several aircraft concepts that we may call solutions, being developed along the timeline. during the conceptual design phase, on which this paper is focused, various possible solutions are investigated, modified, abandoned or further developed. simultaneously, several requirements may face a discussion, in which the required parameter values, and even the requirements themselves, are questioned and changed if necessary. later, the preliminary design phase will follow, looking further into only a few particularly promising aircraft studies, the number of which will be reduced to a single concept to enter the detailed design phase. by then, the requirements will be far more rigid than during the conceptual design stage. the design phases are characterised by steadily changing degrees of knowledge, freedom of design, and cost of change (figure 1). knowledge is information about the evolving aircraft project. naturally, in the early design phase, it is incomplete or imprecise, so assumptions are often made. these assumptions, if incorrect, can lead to poor decisions that precipitate project failure, budget and/or schedule overruns, etc. freedom of design is a measure of flexibility, or the degree to which changes in the aircraft characteristics are realistic. cost of change refers to the resource allocation which is determined by the decision-making processes. key decisions made early in the design process determine a comparatively large number of aircraft parameters as well as a high percentage of the total cost committed, as [1] points out.

fig. 1: changes of chances during the conceptual design phase
unfortunately, these decisions are often based on minimal knowledge and incomplete or inaccurate information. necessary revisions in later design phases are significantly more expensive and complicated than are changes early in the process. so there is a desire to shift knowledge forward in the design timeline, enabling better substantiated decisions. this faster increase of design knowledge implies improvements in the conceptual design stage, as these better substantiated decisions then meet a more flexible and less costly-to-change aircraft design state (as indicated by the arrows in figure 1). in order to accomplish this, a tool is required to help increase and improve the information about the evolving aircraft project in the early design stages, allowing a sounder review of presented solutions and of the design driving requirements. one possible approach is an iterative scaling process, which will be described in detail below. here, starting from a model reference aircraft not yet meeting all requirements, several parameters are resized – scaled – deliberately in each step, thus describing a scaled aircraft with new characteristics, which, in turn, are subject to investigation. this iteration is guided by the objective of an aircraft design which optimally satisfies the initial requirements with respect to a selectable figure of merit, e.g. total mass. at the same time, various design sensitivities become apparent along the iteration, adding to the desired information base. at the chair of aeronautical engineering of the technische universität münchen, the fastr (flexible aircraft scaling to requirements) program is currently being developed as a modern computer-aided approach to run this scaling process automatically, as will be described in detail in the following sections. in this context, aircraft requirements, which can be integrated into a requirement model, are introduced as design objectives in section 2.
complementing the requirement model, an aircraft model is presented in a sufficient grade of detail for conceptual design considerations in section 3. subsequently, the fastr core, an automated scaling algorithm, is described in section 4. the results of this scaling process are discussed in section 5, and an example is given in section 6.

2 aircraft requirements as design objectives

the engineering design process of a new aircraft begins with the specification, which has to be met in the end by a certain technical solution. hence, during all scaling efforts, the specification defines the design-guiding boundary conditions. a full specification consists of several requirements, most of them relating to performance. with few exceptions, e.g. a dedicated stealth aircraft, where stealthiness and low signatures may dominate the whole design [2], performance requirements can be considered strong design drivers in an aircraft project. the approach described herein will therefore set one focus on performance requirements, as will become apparent in section 4. these demanded performances can be divided basically into point performance requirements (table 1) and mission performance requirements (table 2). the former describe singular performance items which have to be satisfied at a single point in time with a fixed aircraft setup. the latter relate to performance requirements which have to be met in a mission context, along a flight profile, with e.g. a steadily changing fuel mass. for both requirement classes, various formula systems have been developed and published [3], [4]. however, an aircraft specification is not restricted to performance requirements alone. several operational requirements must be met as well (table 3). in the given example, a dedicated cargo aircraft is characterised; aircraft with other main purposes, like e.g. an unmanned reconnaissance vehicle, can possibly be categorised by other operational requirements. most of these operational requirements are not immediately reflected in the above formula systems and performance models. the proposed scaling approach therefore includes an extended requirement model, enabling automated expansion of the above operational requirements into technical solutions with quantifiable effects on mass and drag, as well as further technical boundary conditions. these requirements are thus made compatible with the fastr core formula system, which mostly relies on the above mentioned formulas and equations. a specified in-flight refuelling capability, for example, will be translated into the integration of a specific subsystem, a refuelling probe, with defined individual mass and drag properties. other requirements will lead to the introduction of several restrictions in the aircraft's overall configuration. with this, an extensive requirement model as a guide for the scaling has been defined.
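the expansion of an operational requirement into quantifiable mass and drag effects, as described above, could be represented along the following lines; the class names and the probe numbers are illustrative assumptions, not values from the fastr program.

```python
from dataclasses import dataclass

@dataclass
class SubsystemEffect:
    """technical expansion of an operational requirement: a subsystem
    with individual mass and zero-drag contributions (illustrative)."""
    name: str
    delta_mass_kg: float
    delta_zero_drag_counts: float

# hypothetical expansion table; the numbers are placeholders only
EXPANSIONS = {
    "in_flight_refuelling": SubsystemEffect("refuelling probe", 45.0, 2.0),
    "unprepared_strips": SubsystemEffect("reinforced landing gear", 120.0, 0.0),
}

def expand(requirements):
    """sum the mass and drag penalties of all specified requirements."""
    effects = [EXPANSIONS[r] for r in requirements if r in EXPANSIONS]
    return (sum(e.delta_mass_kg for e in effects),
            sum(e.delta_zero_drag_counts for e in effects))

print(expand(["in_flight_refuelling"]))
```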
3 aircraft model

for conceptual design considerations, an aircraft can be described by a set of variables, following a model in a sufficient grade of detail. in the fastr approach, some 250 variables are currently used, describing a certain aircraft setup in the first part in terms of geometric key figures, as well as propulsion, aerodynamic, and mass properties. necessarily, this variable model is complemented in a second part by several methods (see below) for the parameter value determination of the variables of the first part. in this way, e.g. a certain wing area can be methodically associated with a certain mass and certain aerodynamic characteristics, and a resize of the former automatically yields changes in both latter variables. in this methods-part, the "rubberised" propulsion device is calculated according to a generic engine model [5]. the prediction of a longitudinal aerodynamic dataset, including trim losses, relies on handbook methods [6], [7], [8]. mass determination also relies on handbook methods [9], [10]. additionally, the methods-part includes an automated rule-making functionality around a freely definable design point, e.g. a referenced baseline aircraft as described in section 4, enabling even better model accuracy for parameter variations closely around that well-defined reference. the methods-part is designed to be exchangeable for different types of aircraft. for the development and validation phase of the fastr approach, a conventional cargo aircraft configuration with a high wing is modelled.

table 1: point performance requirements (at any given fuel and payload percentage)
– stall speed
– sustainable load factor
– take-off run length
– specific excess power
– turn rate
– landing run length

table 2: mission performance requirements
– payload mass
– climb rate in a mission context
– range
– acceleration
– cruise altitude
– manoeuvres
– cruise speed
– store drop

table 3: operational requirements
– quick re-role capability (cargo transport – passenger transport – medevac aircraft)
– pressurised cargo compartment
– rapid centre of gravity shift capability
– in flight refuelling capability
– quick cargo handling roll-on roll-off capability
– ground operations on unprepared strips and without ground handling means
– survivability (military applications)
during the iterative scaling process, the current aircraft dataset is analysed in several modules in order to ascertain whether the given requirements can be satisfied, whether the current aircraft is over-designed, or whether it still lacks potential (figure 2). this investigation focuses primarily on performance requirements. first, several required point performances listed in the aircraft specification are investigated (table 1): here the performance figures of the current aircraft dataset are computed and checked against the requirements. if the current aircraft dataset over-qualifies or fails by a definable margin in this comparison, the responsible aircraft parameters are correspondingly marked for change. in the next module, required mission performances are investigated as well (table 2). the fastr approach allows the flexible definition of a master mission profile with a modular construction system. single mission segments, each defined separately, can be combined without any restriction, generating a mission which the current aircraft dataset “flies along” in any desired time resolution (figure 3). moreover, a stability and control module is run to ensure certain aircraft handling qualities according to the requirements. finally, the compliance of the overall design with geometric restrictions, e.g. to prevent the blades of an upscaled propeller from touching the ground, is tested in a separate module. at the end of this downloop a rescale decision is met: if the current aircraft design satisfies all required criteria investigated earlier, the algorithm terminates with a “tailor-made” solution to fit the specification. if, on the other hand, there is still a need for scaling, and the parameters in question are still within their value envelopes, a parameter resize is initiated. a loop-back then allows scaling of the current aircraft design in terms of engine size, wing, empennage, and fuselage dimensions as well as fuel mass corresponding to tank volume. currently, about 25 variables used in the aircraft model are subject to direct manipulation – e.g. wing area. the ground rules contain an initially definable list of parameters subject to change during the scaling, so that scaling investigations under specific restrictions, e.g. a resize of wing properties only, are possible. along the design loop, associated changes in the other variables describing the propulsion device, aerodynamics and masses, which are indirectly triggered by that direct rescaling, are determined according to the methods-part of the aircraft model (section 3). this iterative process automatically resizes the baseline design towards the “tailor-made” design, aiming at the favoured minimum-mass solution. to decrease the run-time, the rescale algorithm relies on a newton-solver, enabling quick detection of the wanted solution in only a few iterations. this scaled design is represented by a list of aircraft properties according to the used aircraft model, regarding geometry, propulsion, aerodynamics with drag as the dominating parameter, and mass. calculated point performance data, mission performance results, and stability/control properties complete the representation. © czech technical university publishing house http://ctn.cvut.cz/ap/ 95 acta polytechnica vol. 41 no. 4 – 5/2001 fig. 2: proposed scaling algorithm (simplified) fig. 3: flexible mission model (supply mission example) a simple optimisation module finally evaluates the resulting scaled design regarding a selectable figure of merit, e.g. 
aircraft total mass, and decides on a new scaling run with slightly changed scale criteria. thereby, several technical variants for solving same problem are investigated – e.g. a required high climb rate, which could be achieved with the help of either a huge wing, or a powerful propulsion device, or a sophisticated high-lift system, or a combination of all of these – and the best solution concerning this figure of merit is isolated. at present, both aircraft model and requirement model refer to a dedicated cargo aircraft. the presented scaling methodology, however, is not restricted to this aircraft type. the same methodology is currently being used in a joint academia-industry research project for a scaling tool applicable for uav conceptual design and scaling [11]. running on a modern standard personal computer, a fastr run typically takes less than a minute. it should be noted, however, that the described scaling method does not necessarily converge. in this case, the parameter value of one or several requirements prevents the algorithm from reaching a realisable aircraft; relying on an iteration counter, the algorithm will nevertheless terminate. since the scaling log file will in any case display the then futile efforts to achieve the objectives, the fastr approach also functions as a quick test for the feasibility of the requirements against the background of the baseline design. so in any case the fastr approach yields results, which will be discussed in the following. 5 results of the scaling process with a number of optimisation module-controlled scaling processes, various scaled designs are available – each matching the requirements, but differing along a parameter list according to the individual ground rule-setting. every single solution can be plotted in various diagrams, e.g. in a common design diagram showing power loading versus wing loading (figure 4). since each plotted point represents a complete design, several trends are determined, e.g. as defined by the aircraft total mass as a rough figure of merit, and the single solutions are evaluated individually. as additional guidelines, various boundaries are mapped which result from individual requirements: in the shaded areas those combinations of the variables are located which fail to satisfy certain requirements. the data obtained implicates even more than a variety of solutions that can be visualized in the diagram below. for example, mass growth factors – i.e. values for the partial differentiations of the aircraft total mass relative to a certain required aircraft quality [12] – are available through dedicated fastr runs. thus the sensitivity of the baseline design concerning a certain requirement becomes apparent in terms of a total mass change. specifically, this requirement may be any point performance requirement (see table 1), or mission performance requirement (see table 2), or additional operational requirement (see table 3). aircraft total mass as a figure of merit is currently used because of its implications, as there are methods available to easily derive rough cost and time schedule estimations from mass data [13]. another parameter, e.g. aircraft total drag, could be investigated as well. with these results, the penalties – in terms of additional mass or drag – which have to be accepted in order to satisfy a certain requirement become clearly evident. 
so it is possible to make a critical review of the basic set of requirements, perhaps slightly relaxing one or another desired parameter value while aiming at a better overall result with respect to a definable figure of merit. moreover, as the fastr algorithm with its optimisation module can be used to investigate several technical options for meeting the same requirement, a discussion of favoured basic technical approaches can be provided as well, including the minimum-mass and minimum-drag solutions. thus the resulting data adds to the available knowledge concerning the aircraft project in an early design phase. moreover, it can be considered a valuable aid in trade-off decision-making processes concerning design-driving requirement parameter values, or one detail solution versus another, as an example may demonstrate.

6 example
in the example below, the search history of a scaling run is shown for selected key parameters (figure 5). the automated scaling process resizes the mentioned cargo aircraft towards a certain required climb rate by changing engine static thrust, wing area, and lift coefficient (i.e. the flap system), thus implying a change in aircraft total mass. interactions between the lift coefficient and the wing geometric properties are reflected in the calculations. any of the vertically arranged parameter combinations in figure 5 can be regarded as a possible solution, a result of a scaling core run (with the exception of iteration 0, which represents the baseline design not yet fulfilling the improved requirement). it becomes apparent that the minimum-mass technical approach to improving climb rate values is to revise the high-lift system, thus allowing wing size and engine static thrust to shrink. relying on this result, the minimum-mass solution for, e.g., a 10 % climb rate improvement and the corresponding mass penalties can be calculated (table 4). the effects of a 10 % climb rate reduction can be determined as well (table 5). with this data the mass penalties or benefits of realising a certain climb rate requirement are made obvious. moreover, in this example the results indicate that the baseline aircraft features a certain design improvement potential concerning climb rate performance, since even a climb rate-reduced aircraft shows a possible mass reduction. note that the example refers only to the mentioned climb rate performance improvements; additional constraints, e.g. a shortened landing distance or a changed mission range performance, would yield different results. in addition to point performances, different technical approaches to satisfying a certain mission range requirement can be investigated as well. in figure 6, the example baseline design is scaled in order to accommodate enough fuel for a given increased outbound range within a complex air supply mission profile (figure 3): wing area and/or external fuel tank size are varied, and the overall changes in mass and zero drag are determined. it becomes obvious that a wing area increase is preferable to the installation of external fuel tanks, which would require pylons and additional piping and would come along with increased zero drag, too.
relying on this recommended configuration, the mass and drag penalties for an outbound range increase of, e.g., 10 % or 20 % can be investigated, and the required key basic parameters are provided by scaling runs (table 6). to finally include the above-mentioned reflection of operational requirements (table 3) in the example, the last column of table 6 shows the penalties of a required in-flight refuelling capability, which was realised by integrating an aerial refuelling probe into the upper nose area of the cargo aircraft. for the current baseline aircraft, the probe integration is obviously acceptable, at least in terms of additional mass and drag. these examples illustrate how trade-off decisions can be substantially prepared with the presented scaling methodology.

fig. 5: search history for climb rate improvement
fig. 6: search history for improved mission range

table 4: climb rate improvement of 10 % (changes relative to baseline)
  static thrust change          -8.0 %
  wing area change             -17.0 %
  max lift coefficient change  +24.0 %
  overall mass change           -2.4 %

table 5: climb rate reduction of 10 % (changes relative to baseline)
  static thrust change          -7.6 %
  wing area change             -15.4 %
  max lift coefficient change    0.0 %
  overall mass change           -3.3 %

table 6: mission range calculation (changes relative to baseline)
                    range +10 %   range +20 %   refuelling probe
  fuel mass          +4.6 %        +10.1 %        0.0 %
  structure mass     +4.0 %         +6.4 %       +0.003 %
  overall mass       +4.25 %        +7.9 %       +0.001 %
  wing area          +4.3 %         +8.6 %        0.0 %
  fuselage length     0.0 %          0.0 %        0.0 %
  wetted area        +1.7 %         +3.4 %       +0.002 %
  zero drag          +1.6 %         +3.3 %       +0.002 %
  wing loading       -0.05 %        -0.64 %       0.0 %
  thrust loading     -4.08 %        -7.32 %       0.0 %

7 conclusion
a computer-based automatic scaling process, which is methodically not restricted to certain aircraft types, has been described. an extended requirement model reflecting point, mission and operational performances has been introduced. an aircraft is represented in sufficient detail for conceptual design considerations by a set of variables and methods enabling the determination of their parameter values. an accordingly defined cargo aircraft model is then automatically resized with computer aid in the described fastr approach until it satisfies the extensive, yet freely definable, set of requirements in an optimum solution with respect to a selectable figure of merit, e.g. overall mass or drag. the results available at the end of the scaling process, as well as the information gained along the way, include growth factors and design sensitivities. relying on this data, important trade-off decision-making processes during aircraft conceptual design are enabled and backed up with extended knowledge about the evolving aircraft.

references
[1] mavris, d. n., delaurentis, d. a.: a probabilistic approach for examining aircraft feasibility and viability. aircraft design 3/2000, pergamon press, oxford, 2000
[2] whitford, r.: the benefits and costs of stealth from the aircraft designer's viewpoint. acta polytechnica vol. 40, january 2000, prague
[3] roskam, j.: airplane design, part i–vii. darcorporation, kansas, 1988
[4] torenbeek, e.: synthesis of subsonic airplane design. delft university press, delft, 1982
[5] wittmann, r.: generisches modell für propeller- und strahlbasierte flugzeugantriebe. internal report lt-sa 01/6, technische universität münchen, munich, 2001
[6] schemensky, r. t.: development of an empirically based computer program to predict the aerodynamic characteristics of aircraft. technical report affdl-tr-73-144, volume 1, wright-patterson afb, ohio, 1973
[7] polhamus, e. c.: prediction of vortex-lift characteristics based on a leading-edge suction analogy. aiaa paper 69-1133, washington d.c., 1969
[8] n. n.: usaf stability and control datcom. air force flight dynamics laboratory, wright-patterson afb, ohio, (revised) 1978
[9] n. n.: luftfahrttechnische handbücher, lth-band masseanalyse. iabg, munich, february 1992
[10] n. n.: esdu international structures series. issn 0141-4097, london, 2000
[11] schieck, f., deligiannidis, n., gottmann, t.: a flexible, open-structured computer based approach for aircraft conceptual design optimisation. aiaa paper 2002-0593, washington d.c., 2002
[12] ballhaus, w. f.: clear design thinking using the aircraft growth factor. presentation, sae los angeles aeronautic meeting, october 5–9, 1954
[13] burns, j. w.: aircraft cost estimation methodology for preliminary design development applications. presentation, sawe conference, 23–25 may 1994 (sawe paper no. 2228, attachment e)

dipl.-ing. florian schieck, phone: +49 89 289 15986, e-mail: schieck@llt.mw.tum.de
prof. dr.-ing. dieter schmitt, phone: +49 89 289 15981, e-mail: schmitt@llt.mw.tum.de, fax: +49 89 289 15982
chair of aeronautical engineering, technische universität münchen, boltzmannstraße 15, 85748 garching, germany

control system design based on a universal first order model with time delays
t. vyhlídal, p. zítek

an original modelling approach for siso systems is presented, based on a first order model with more than one delay in its structure. by means of this model it is possible to truly capture the properties of systems which are conventionally described by higher order models. an identification method making use of a relay feedback test combined with transient responses of the system has proved suitable for assessing the model parameters. with respect to its plain structure, the model is well suited to be applied in the framework of an internal model control scheme (imc). the resulting control algorithm with only one optional parameter is very simple and can easily be implemented, for example by means of a programmable logic controller (plc).

keywords: time delay system, internal model control, system eigenvalues, control parameter setting.

1 introduction
control problems of time delay systems have been solved since the very beginning of modern control theory.
the well-known idea of compensating input delay by means of a special control loop arrangement was introduced by smith [1]. in fact, the idea of a single input delay in the plant model is also considered in the ziegler and nichols method of controller setting. a series linkage of the delay and the other parts of the model is often used in control engineering for describing systems with a significant dead time. a conventionally applied model for such systems is assumed with the following transfer function

G(s) = K e^{-τs} / (Ts + 1)^n    (1)

where n = 1, 2, … is the order of the model, K is the steady state gain, T is the time constant and τ is the input time delay. the first order model n = 1 is often used in practice. the three parameters K, T and τ can be estimated, for example, from the step response of the system (see figure 3). it should be noted that the model obtained in this way often truly describes only the transient behaviour of the system. this results from the fact that most real systems have more complicated dynamic behaviour, which can be only very roughly described by the first order model with only an input time delay. discrepancies between the transient and frequency responses of the model and the real system often occur if the first order model is used. the use of a higher order model (n > 1) can reduce these discrepancies, but it increases the complexity of the model structure. on the other hand, the first order model is advantageous if a model-based control strategy is to be applied: the low order of the model brings about the low order of the final controller model, desirable for its implementation.

2 first order model with time delays
instead of the standard approximation (1), let us assume an alternative model of the system

G(s) = K e^{-τs} / (Ts + e^{-ϑs})    (2)

corresponding to the delayed differential equation

T y'(t) + y(t - ϑ) = K u(t - τ)    (3)

where u and y are the system input and system output respectively, t is time, K is the steady state gain, T is the time constant, τ is the input time delay and ϑ is the state delay. in effect, state delay ϑ plays an analogous role to the n-th power in the denominator of (1); in combination with the input delay it allows the dynamics of (1) to be described by means of model (2). for the sake of model (2) generalisation it is useful to introduce a reference model with the following relative parameters: K = 1, T = 1, τ* = τ/T, ϑ* = ϑ/T and t* = t/T. the dynamic properties of the linear model are determined by the solutions of its characteristic equation, i.e. the system poles. the structure of the characteristic equation of model (2) is very simple,

M(s) = s + e^{-ϑ*s} = 0    (4)

but it is not easy to find a sufficient set of solutions. because of the presence of the exponential term, equation (4) is not algebraic but transcendental, and therefore it has infinitely many poles. on the other hand, only a few of this set of poles have a significant influence on the system behaviour [2]. the distance between a pole and the origin of the complex plane determines the pole's significance: the more distant the pole, the less significant it is. poles may be either real or complex conjugate. real poles represent damped modes, while complex poles s = β ± jω represent oscillatory modes of the system dynamics. the ratio between the real and imaginary parts of a complex pole is known as the relative damping δ = β/ω. the location of the poles depends on the model parameters.
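the roots of the quasipolynomial (4) are easy to chase numerically. the sketch below applies a plain newton iteration to M(s) = s + exp(-ϑ*s) from a grid of complex starting points; the grid, tolerances and the ϑ* = π/2 test value are our choices, not part of the paper.

import numpy as np

def poles(theta, guesses, tol=1e-12):
    """newton iteration on m(s) = s + exp(-theta*s) from several guesses."""
    roots = []
    for s in guesses:
        for _ in range(100):
            f = s + np.exp(-theta * s)
            df = 1 - theta * np.exp(-theta * s)      # m'(s)
            step = f / df
            s = s - step
            if abs(step) < tol:
                break
        # keep converged, previously unseen roots only
        if abs(s + np.exp(-theta * s)) < 1e-9 and not any(abs(s - r) < 1e-6 for r in roots):
            roots.append(s)
    return sorted(roots, key=abs)

grid = [complex(a, b) for a in np.linspace(-3, 0, 7) for b in np.linspace(0, 20, 21)]
print(poles(theta=np.pi / 2, guesses=grid)[:3])      # dominant root should sit at +j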
the trajectories of the poles with respect to the value of the state delay ϑ* are depicted in figure 1. as can be seen, for ϑ* = 0 system (2) has only one real pole in the selected region. this real pole, denoted s1, moves to the left with increasing value of ϑ*. it should be recalled that for ϑ* ≠ 0 equation (4) is transcendental, with infinitely many solutions. another real pole emerges in the left part of figure 1 and moves to the right as ϑ* increases. the case of a double pole s1,2 occurs for ϑ* = exp(-1); this is the highest value of ϑ* for which equation (4) has real solutions. for higher ϑ* the poles are complex conjugate only. it is apparent from figure 1 that the couple s1,2 is always the nearest one to the origin of the complex plane, which determines its dominant influence on the system dynamics. the common intersection points of all the pole trajectories are im(s) = ±1, re(s) = 0 (marshall et al. [3]). the dominant couple reaches this point for ϑ* = π/2, which is the ultimate value for which model (2) is stable. for values of ϑ* > π/2, the dominant couple of poles enters the right half of the complex plane, which indicates loss of stability. the significance of ϑ* is also demonstrated in the step responses of model (2) (see figure 2). the relationship between the position of the dominant poles s1,2 and the appropriate responses is apparent: the smaller the value of the relative damping δ of the dominant poles, the more oscillatory the step response. there is a class of systems for which not only the distribution of poles but also the distribution of zeros is important. these systems cannot be truly described by models (1) or (2); a general differential equation with derivatives on both sides is usually used to describe them. zeros can be added to the first order model (2) by enlarging the numerator with the term Ps + e^{-ηs}:

G(s) = K (Ps + e^{-ηs}) e^{-τs} / (Ts + e^{-ϑs})    (5)

the time constant P and the time delay η play an analogous role in the numerator to parameters T and ϑ in the denominator. the ratio η* = η/P determines the distribution of the system zeros, in accordance with figure 1.

3 identification of the system parameters
the step response is often used for a first estimation of system properties. it is customary and relatively easy to assess the input time delay, the time constant and the steady state gain. these parameters are sufficient for representing the system by model (1) with n = 1. the steady state gain is simply given by K = Δy/Δu.
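the step response of model (2), which underlies this identification, can be reproduced by integrating the delayed differential equation (3) directly. a minimal euler sketch with a history buffer follows; the parameter values are illustrative, chosen so that ϑ* lies between exp(-1) and π/2 and the response is oscillatory but stable.

import numpy as np

def step_response(K, T, tau, theta, t_end=20.0, dt=1e-3):
    """euler integration of t*y'(t) + y(t - theta) = k*u(t - tau), unit step input."""
    n = int(t_end / dt)
    y = np.zeros(n + 1)                      # zero initial history
    d_th, d_ta = int(theta / dt), int(tau / dt)
    for i in range(n):
        y_del = y[i - d_th] if i >= d_th else 0.0
        u_del = 1.0 if i >= d_ta else 0.0    # unit step delayed by tau
        y[i + 1] = y[i] + dt * (K * u_del - y_del) / T
    return np.linspace(0.0, t_end, n + 1), y

t, y = step_response(K=1.0, T=1.0, tau=0.5, theta=1.2)
print(y[-1])   # settles near k = 1; exp(-1) < theta* < pi/2 gives a damped oscillation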
the tangent at the inflexion point of the step response determines parameters T and τ. as shown in figure 3, these parameters can be read from the time axis of the step response. if the model with two delays (2) is used, its parameters can also be estimated from the step response: the tangent at the inflection point again determines T and τ, while the remaining parameter ϑ influences the position of the inflection point of the step response. it is usually relatively easy to find the tangent at the inflexion point, but rather complicated to find the exact position of this point, especially if the response is influenced by measurement or system noise.

fig. 1: pole trajectories with respect to ϑ*, ϑ* ranging from 0 to 2
fig. 2: step responses of model (2) with respect to ϑ*, ϑ* ranging from 0 to 2
fig. 3: step response
fig. 4: oscillating motion of relay feedback

an identification method based on the relay feedback test (astrom, hagglund [4]) is therefore suggested in order to find the value of parameter ϑ. the relay is usually used to provide on-off control of the system. for a large class of systems, relay feedback produces an oscillating motion (see figure 4) whose frequency is close to the system ultimate frequency ω_u. the system ultimate gain is given approximately by astrom and hagglund as

K_u = 4 u_a / (π y_a)    (6)

where u_a and y_a are the amplitudes of the relay and of the system output, respectively. the point of the system frequency response with argument φ(ω) = -π is described by the above ultimate parameters, and on the basis of this point two parameters of the model can be calculated:

re{G(jω_u, ϑ, T)} = -1/K_u,   im{G(jω_u, ϑ, T)} = 0.    (7)

as was shown in section two, parameters ϑ and T determine the dynamic features of the system. it is therefore advantageous to assess these two parameters on the basis of the same identification method; for this reason it is suggested that T be assessed together with ϑ from the relay feedback test. simplifying (7), the following expressions are obtained:

ϑ = arccos(-K K_u cos(ω_u τ)) / ω_u    (8)

T = (sin(ω_u ϑ) - tan(ω_u τ) cos(ω_u ϑ)) / ω_u.    (9)

identification of the parameters of model (5) is more difficult, because it contains two additional parameters. a possible solution is to find more points of the frequency response and to solve the resulting set of equations numerically, by analogy with (7).
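equations (8) and (9) are straightforward to evaluate once K and τ are known from the step response and ω_u, K_u from the relay test. the sketch below plugs in the ball-levitation values quoted in example 1 below and reproduces the ϑ ≈ 0.08 s and T ≈ 0.31 s reported there; the function name is ours.

import numpy as np

def identify(K, tau, omega_u, K_u):
    """recover the state delay and time constant of model (2) from the ultimate point."""
    theta = np.arccos(-K * K_u * np.cos(omega_u * tau)) / omega_u        # (8)
    T = (np.sin(omega_u * theta)
         - np.tan(omega_u * tau) * np.cos(omega_u * theta)) / omega_u    # (9)
    return theta, T

theta, T = identify(K=62.5, tau=0.7, omega_u=3.5, K_u=0.02)
print(theta, T)   # ~0.078 s and ~0.31 s, matching example 1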
4 application to internal model control
as has been shown in [5] and [6], the conventional internal model control (imc) design of the controller (morari, zafiriou [7]) can be extended to a broad class of time delay systems. with respect to the preceding parts of this paper, the first order models with time delays (2) and (5) are able to describe a large class of systems, which is the main reason why these models are suitable for application to imc. the controller R(s) for model (2) is, according to the imc design,

R(s) = (Ts + e^{-ϑs}) / (K (Fs + 1))    (10)

where T, ϑ and K are parameters of the model. the filter 1/(Fs + 1), where F is a time constant, ensures the feasibility of the controller. the inner loop with controller R and model of the system G can be described by the transfer function

C(s) = R(s) / (1 - R(s) G(s)) = (Ts + e^{-ϑs}) / (K (Fs + 1 - e^{-τs})).    (11)

the control feedback arrangement with controller C acquires a conventional structure. if model (5) is used, the analogous controller results:

C(s) = (Ts + e^{-ϑs}) / (K (Ps + e^{-ηs}) (Fs + 1 - e^{-τs})).    (12)

applying controller (11) to model (2), or controller (12) to model (5), a feedback loop of quite simple and favourable dynamics is obtained:

G_wy(s) = G(s) C(s) / (1 + G(s) C(s)) = e^{-τs} / (Fs + 1).    (13)

if the real system properties agree with models (2) or (5), the control loop is not only always stable, but moreover its transients are without overshoot, given by the only pole s = -1/F. the imc controller compensates most of the undesirable effects of delays in the system response. despite the favourable result (13), it should be noted that exact agreement between the real system and its model (2) or (5) cannot be achieved. in this case, parameter F affects not only the speed but also the robustness of the closed loop.

5 example 1 – ball levitation
an application of the designed control method is demonstrated on a simple laboratory system – ball levitation (see figure 5). the actuator of the system is a pump, which supplies a jet with an appropriate amount of water, so that the ball, raised by the water flow, is maintained in the desired horizontal position. the position is sensed by an ultrasound sensor. the controller (11) generates the actuating signal u assigning the pump performance, i.e. the amount of water drawn through the jet. the steady state gain K = 62.5 and the input time delay τ = 0.7 s were estimated from the step response of the system shown in figure 6. the relay feedback experiment was performed and the following ultimate parameters were found: ω_u = 3.5 s^-1, K_u = 0.02. parameters T = 0.31 s and ϑ = 0.08 s were calculated from equations (8) and (9). the simulated step response with the assessed parameters is shown in figure 6; it is apparent that model (2) with the identified parameters describes the system dynamics very well. because of the higher value of the ratio τ/T and because of the presence of distinct noise, a rather higher value of the filter time constant, F = 1 s, was chosen. the change of the desired position of the ball from w = 70 mm to w = 120 mm, shown in figure 7, demonstrates a good control response.

fig. 5: laboratory system – ball levitation

6 example 2 – pitch attitude controller
the substitution of a higher order model by model (5), and its implementation in imc, is illustrated through application to the pitch attitude controller of an aircraft. the transfer function (14) (etkin and reid [8]) describes the dynamic relationship between the rudder angle δ_e and the pitch angle θ (see figure 8). details of the investigated plane, a boeing 747-100, as well as the methodology used to obtain model (14), can be found in the mentioned literature. the pitch angle is defined as the angle between the plane axis x and the earth-fixed axis x_e. this angle is readily available from either the real horizon (pilot) or the vertical gyro (auto-pilot). it should be noted that the linear model expressed by the transfer function

G_δe,θ(s) = N(s)/M(s) = (b2 s^2 + b1 s + b0) / (s^4 + a3 s^3 + a2 s^2 + a1 s + a0)    (14)

with parameters b0 = 0.003873, b1 = 0.3545, b2 = 1.158, a0 = 4.19588·10^-3, a1 = 9.463025·10^-3, a2 = 0.935494, a3 = 0.750468 is supposed truly to describe the dynamic relation between δ_e and θ only for small departures from the reference position. let us turn our attention to an investigation of the dynamic properties of model (14). the system has two zeros and four poles.
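the zeros and poles quoted next can be cross-checked in a few lines; numpy's polynomial root finder applied to the numerator and denominator of (14) is enough. only the coefficient lists below are taken from the paper.

import numpy as np

num = [1.158, 0.3545, 0.003873]                            # b2, b1, b0
den = [1.0, 0.750468, 0.935494, 9.463025e-3, 4.19588e-3]   # s^4 + a3 s^3 + a2 s^2 + a1 s + a0
print("zeros:", np.roots(num))    # ~ -0.0113 and -0.2948 s^-1
print("poles:", np.roots(den))    # two complex pairs; the slow pair is dominant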
the zeros p1 = -0.0113 s^-1 and p2 = -0.2948 s^-1 were found directly as the two solutions of the equation N(s) = 0, while the system poles s1,2 = -0.0034 ± j0.0650 s^-1 and s3,4 = -0.37 ± j0.8875 s^-1 result from a numerical solution of the equation M(s) = 0. the numerator of the alternative model (5), namely parameters P and η, can be assigned on the basis of the found zeros p1 and p2 from the equation Ps + e^{-ηs} = 0. the same distribution of dominant zeros appears if P = 101 s and η = 11.51 s. the steady state gain K = -0.92305 is obtained from (14) for s = 0. if the dominant couple of model (14), s1,2 = -0.0034 ± j0.0650 s^-1, is prescribed for model (5), the parameters T = 16.64 s and ϑ = 23.4 s result. another couple, s_f3,4 = -0.0734 ± j0.3260 s^-1, then appears near the imaginary axis. this couple is closer to the origin of the complex plane than couple s3,4 of model (14), which causes a certain phase shift between the responses of the models. better agreement of model (14) and model (5) is achieved if parameters T = 16.1 s and ϑ = 22.61 s are used; model (5) then has dominant poles s_f1,2 = -0.0035 ± j0.0671 s^-1 and s_f3,4 = -0.0758 ± j0.3371 s^-1. by means of the last parameter, the input delay τ = 5 s, the phase shift is completely removed. step responses and bode diagrams of model (14) and its approximation (5) are shown in figure 9 and figure 10, respectively. the apparent discrepancies at the beginning of the step responses are caused by the same order of the numerator and the denominator of model (5); this discontinuity is a feature of model (5) and cannot be removed by a different choice of parameters. on the other hand, the step responses are nearly overlapping once they reach their first minimum. the bode diagrams in figure 10 show excellent agreement for frequencies ω < 10^-1 s^-1.

fig. 6: step responses of the system and its model
fig. 7: set-point response of the closed loop
fig. 8: definition of pitch angle θ and rudder angle δ_e in the motion of the plane
fig. 9: step responses of model (14) and its substitution (5)
fig. 10: bode diagrams of model (14) and its substitution (5)
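the pole-placement fit used above (prescribing a dominant pole of (14) as a root of the denominator Ts + exp(-ϑs) of model (2)/(5)) amounts to solving two real equations for T and ϑ. a sketch using scipy's fsolve follows; the initial guess is ours, and with several solution branches present it matters.

import numpy as np
from scipy.optimize import fsolve

def fit_T_theta(s_f, guess=(15.0, 20.0)):
    """solve t*s_f + exp(-theta*s_f) = 0 (real and imaginary parts) for t, theta."""
    def residual(x):
        T, theta = x
        m = T * s_f + np.exp(-theta * s_f)
        return [m.real, m.imag]
    return fsolve(residual, guess)

T, theta = fit_T_theta(complex(-0.0034, 0.0650))
print(T, theta)   # close to the quoted t = 16.64 s and theta = 23.4 s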
with respect to its plain structure, the model is well suited to be applied in the framework of the internal model control scheme (imc). it should be noted that the good features of the designed control method depend greatly on the agreement of the model and the system dynamics. this agreement is easy to obtain for industrial or laboratory systems with simple and time-invariant dynamics. the presented control of ball levitation is a typical example of such a system. on the other hand, real systems like the presented pitch attitude control of a plane are mostly non-linear with time-variant dynamics. this fact should be taken into account in controller implementation. the problem of system non-linearity can then be solved by means of linearization and by varying set of imc controller parameters for each operational state of the system. as regards implementation, the control algorithm can be easily implemented on plc. the authors implemented the presented imc controller on the plc produced by the control technology production company teco kolín. acknowledgement this research was supported by the ministry of education of the czech republic under project ln00b096, and by the university of glasgow, department of aerospace engineering. references [1] smith, o. j. m.: closer control of loops with dead time. chem. eng. prog., 1957, 53 (5), pp. 217–219 [2] gorecki, h., fuksa, p., grabowski, p., korytowski, a.: analysis and synthesis of time delay systems. pwn-polish scientific publishers, warszawa, 1989 [3] marshal, j. e., gorecki, h., walton, k., koritowski, a.: time-delay systems, stability and performance criteria with applications. ellis horwood limited, chichester, 1992 [4] astrom, k. j., hagglund, t.: automatic tuning of simple regulators with specifications on phase and amplitude margins. automatica 20/1984, pp. 645– 651 [5] zítek, p.: time delay control system design using functional state models. ctu reports no. 1/1998, ctu prague, p. 93 [6] zítek, p., hlava, j.: anisochronic internal model control of time delay systems. control engineering practice 9, no. 5/2001, pp. 501–516 [7] morari, m., zafiriou, e.: robust process control. prentice-hall, englewood clifts, n. j., 1989 [8] etkin, b., reid, l. d.: dynamics of flight, stability and control. john wiley & sons, inc., 1996 ing. tomáš vyhlídal phone: +420 2 2435 3953 e-mail: vyhlidal@student.fsid.cvut.cz prof. ing. pavel zítek, drsc. phone: +420 2 2435 2564 e-mail: zitek@fsid.cvut.cz institute of instrumentation and control engineering cak – centre for applied cybernetics czech technical university in prague faculty of mechanical engineering technická 4, 166 07 praha 6, czech republic © czech technical university publishing house http://ctn.cvut.cz/ap/ 53 acta polytechnica vol. 41 no. 4 – 5/2001 fig. 11: set point response and disturbance rejection ap1_01.vp 1 introduction several nitrogenous compounds, including ammonia, nitrite and nitrate, are frequently present in drinking water and in various types of agricultural, domestic and industrial wastewaters (metcalf & eddy, 1991). this necessitates the upgrading of technological schemes, and a search for cost effective and environment friendly methods for removal of ammonia, nitrite, and nitrate. according to the instruction of the european community council (cce) of 15 july 1980, the maximum permissible level of ammonium in drinking water is 0.5 mg/l. 
methods for the removal of nitrogenous compounds proposed by previous investigators have included air stripping, nitrification/denitrification using fixed or fluidized bed biological reactors, and ion exchange. one of the common processes for drinking water treatment applied in recent times is the ion exchange process. removal of ammonia from water has been investigated by many researchers (gaspard and martin, 1983; hlavay et al., 1983; vokáčová et al., 1986; hódi et al., 1995; booker et al., 1996; beler baykal and akca guven, 1997; cooney et al., 1999). a simple comparison of ammonia removal by both a natural and a synthetic material was attempted by haralambous et al. (1992). jörgensen (1976) reported that ammonia removal from wastewater was examined during treatment in a laboratory glass column with a diameter of 20 mm, containing 200 ml of ion exchange material of the following types: 1) a strongly acidic cation exchanger in sodium form (lewatit 500 a); 2) a strongly acidic cation exchanger in hydrogen form (lewatit 500 a); 3) clinoptilolite in sodium form; 4) an artificial zeolite; 5) sulfonated lignocelluloses in sodium form; 6) a weakly acidic cation exchanger in sodium form (lewatit 69 mp); 7) a weakly acidic cation exchanger in hydrogen form (lewatit 69 mp). the results show that out of ion exchangers 1–7 only 2, 3 and 7, and perhaps also 6, give satisfactory results. the objective of our work is to treat drinking water spiked with nh4+ (10±0.5, 5±0.5 and 2±0.5 mg/l) using lewatit s100.

2 materials and methods
physical description of the system: fig. 1 shows the experimental apparatus, consisting of a column with the following characteristics: internal diameter 20 mm, containing approximately 50 ml of exchanger, operated at a volumetric flow rate of 8.7 ml/min, which is equivalent to 10.5 bed volumes (bv) per hour; particle size of the material 0.3–1.2 mm. a glass screen supported the lewatit in the column.

analysis: all analyses were made according to standard methods (apha, see greenberg et al., 1992). ammonia was determined by the nessler method using a spectrophotometer (model hach dr/2000). calcium and magnesium were determined by the edta titrimetric method.

specification of the material: lewatit s100 was the material used for the investigation. it is a synthetic resin ion exchanger of the na-type, a strongly acid cationic ion exchange resin.

fig. 1: laboratory column system (reservoir, peristaltic pump, regeneration/service valve, ion exchange resin column)
operation parameters: the first experiment was carried out to observe the exhaustion performance of lewatit s100 using distilled water spiked at a concentration of 10 mg nh4+/l. we then applied tap water containing ca2+ = 60 mg/l and mg2+ = 12 mg/l. the input ammonium concentrations were 10, 5 and 2 mg nh4+/l (as nh4cl), treated down to levels below 0.5 mg/l. in the regeneration phase, we used 5.0 % nacl at a volumetric flow rate of 8.7 ml/min, which is equivalent to 10.5 bed volumes (bv) per hour, and then 10 % nacl. after regeneration, the excess cl- was removed from the lewatit s100 with distilled water; this washing was repeated until visual tests with agno3 revealed zero chloride.

3 results and discussion
breakthrough capacity: within the scope of this work, the results from the first experiment using distilled water indicated that the volume of water treated till breakthrough, defined as 0.5 mg/l of nh4+, was 2570 bv (128.5 l), and the breakthrough capacity was 1.356 mol/l for nh4+ = 10 mg/l. fig. 2 shows that for tap water containing calcium and magnesium ions, the breakthrough capacities and the volumes of water treated till nh4+ breakthrough were 0.156, 0.085 and 0.0317 mol/l and 295 bv (14.75 l), 340 bv (17 l) and 380 bv (19 l), respectively. these values indicate that tap water produces a lower breakthrough capacity than distilled water, due to the presence of other ions in the water, especially polyvalent ions such as ca2+ and mg2+. fig. 3 shows that, for tap water, the practical exchange capacity is a function of the entering concentration of nh4+. these results reveal that the breakthrough capacity is lower by factors of approximately 0.55 and 0.2 at nh4+ = 5 and 2 mg/l, respectively, than at nh4+ = 10 mg/l. thus, lewatit s100 can remove ammonium ions very quickly, with a higher breakthrough capacity at initial ammonia concentrations of more than 5 mg/l. comparing the results summarized in table 1, it can be seen that the calcium elimination from the solution of 10 mg/l nh4+ was higher than from the solutions of 5 mg/l and 2 mg/l. magnesium was eliminated from all ammonium solutions at the same rate.

regeneration effects: the elution curves (fig. 4) indicate no difference between regeneration by 10 % and 5 % nacl solution. table 2 shows that 29 bv (1.45 l) of nacl solution is sufficient for ammonium elution using lewatit s100. these results reveal that lewatit s100 is slightly more economical when using 5 % nacl for regeneration.

fig. 2: results from the column study (breakthrough curves for nh4+ = 10, 5 and 2 mg/l)
fig. 3: exchange capacity
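the reported breakthrough capacity can be roughly cross-checked by a mass balance. the sketch below assumes complete ammonium uptake up to the breakthrough point, so it yields an upper bound (about 1.42 mol/l for the distilled-water run) that neglects the leakage below 0.5 mg/l and lands close to the reported 1.356 mol/l.

M_NH4 = 18.04                      # g/mol, molar mass of nh4+

def capacity_mol_per_l(c_in_mg_l, v_treated_l, v_resin_l):
    # moles of nh4+ retained, referred to the resin bed volume
    mol_retained = c_in_mg_l / 1000.0 / M_NH4 * v_treated_l
    return mol_retained / v_resin_l

# distilled-water run: 10 mg/l feed, 2570 bv = 128.5 l over a 50 ml bed
print(capacity_mol_per_l(10.0, 128.5, 0.050))   # ~1.42, vs 1.356 reported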
table 1: results of experiments (residual ca2+ and mg2+ concentrations, all in mg/l)
           nh4+ = 10 mg/l    nh4+ = 5 mg/l    nh4+ = 2 mg/l
  resin    ca2+    mg2+      ca2+    mg2+     ca2+     mg2+
  lewatit  2.85    1.29      4.0     1.01     10.02    1.22

table 2: elution of ammonia
  nacl = 5 %                         nacl = 10 %
  volume (ml)  nh4+ (mg/l)  bv       volume (ml)  nh4+ (mg/l)  bv
  500          0.65         10       500          1.9          10
  1000         0.211        20       1000         0.125        20
  1200         0.09         24       1250         0.098        25
  1450         0.029        29       1450         0.04         29

4 conclusions
the experimental results indicate that lewatit s100 as a cation exchanger can remove ammonium ions very quickly. a higher breakthrough capacity was found at an initial ammonium ion concentration of more than 5 mg/l compared to 2 mg/l. the calcium elimination was lower at an ammonium ion concentration of 10 mg/l. no difference between regeneration by 10 and 5 % nacl was observed. we conclude that the use of lewatit s100 is an attractive and promising method for ammonium concentrations greater than 5 mg nh4+/l and up to 10 mg nh4+/l.

fig. 4: elution curves of ammonium ions

references
[1] apha: standard methods for examination of water and waste water. 18th edition, 1992, american public health association, washington, usa
[2] beler baykal, b., akca guven, d.: performance of clinoptilolite alone and in combination with sand filters for the removal of ammonia peaks from domestic wastewater. wat. sci. technol., 1997, vol. 35, no. 7, pp. 47–54
[3] booker, n., cooney, e., priestly, a.: ammonia removal from sewage using natural australian zeolite. wat. sci. technol., 1996, vol. 34, no. 9, pp. 17–24
[4] cooney, e., booker, n., shallcross, d., stevens, g.: ammonia removal from wastewater using natural australian zeolite. i. characterization of the zeolite. sep. sci. technol., 1999, vol. 34, no. 12, pp. 2307–2327
[5] gaspard, m., martin, a.: clinoptilolite in drinking water treatment for nh4+ removal. wat. res., 1983, vol. 17, no. 3, pp. 279–288
[6] hlavay, j., vigh, gy., olaszi, v., inczédy, j.: ammonia and iron removal from drinking water with clinoptilolite tuff. zeolite 3, 1983, pp. 188–190
[7] hódi, m., polyák, k., hlavay, j.: removal of pollutants from drinking water by combined ion exchange and adsorption methods. envir. international, 1995, vol. 21, no. 3, pp. 325–331
[8] haralambous, a., maliou, e., malamis, m.: the use of zeolite for ammonia uptake. wat. sci. technol., 1992, vol. 25, no. 7, pp. 139–145
[9] jörgensen, s. e.: ammonia removal by use of clinoptilolite. wat. res., 1976, vol. 10, pp. 213–224
[10] metcalf and eddy inc.: wastewater engineering: treatment, disposal, and reuse. 3rd edn., 1992, mcgraw-hill, new york
[11] vokáčová, m., matějka, z., eliášek, j.: sorption of ammonium-ion by clinoptilolite and by strongly acidic cation exchangers. acta hydrochim., 1986, vol. 14, no. 6, pp. 605–611

hossam monier abd el-hady, phone: +420 2 2435 4605, e-mail: hoss@fsv.cvut.cz
prof. ing. alexander grünwald, csc., phone: +420 2 2435 4638, e-mail: grunwald@fsv.cvut.cz
ing. karla vlčková
ing. jitka zeithammerová
department of sanitary engineering, czech technical university in prague, faculty of civil engineering, thákurova 7, 166 29 praha 6, czech republic

dlc films deposited by the dc pacvd method
d. palamarchuk, m. zoriy, j. gurovič, f. černý, s.
konvičková, i. hüttel

dlc (diamond-like carbon) coatings have been suggested as protective surface layers against wear. however, hard dlc coatings, especially those of greater thickness, have poor adhesion to substrates. we have used several ways to increase the adhesion of dlc coatings prepared by the pacvd (plasma assisted chemical vapour deposition) method on steel substrates. one of these is the dc pacvd method for preparing dlc films.

keywords: dlc, pacvd, microhardness, adhesion.

1 introduction
diamond-like carbon (dlc) films have attracted considerable attention recently due to their properties of high mechanical hardness, high electrical resistivity, low friction coefficient, optical transparency and chemical inertness [1]. these properties make dlc films suitable for numerous potential applications in hard and wear-resistant coatings, lithography, protective optical and biomechanical coatings, electroluminescence materials and field-emission devices [2, 3]. various techniques have been used for preparing dlc films, including ion-beam deposition, reactive magnetron sputtering, pulsed laser deposition, a filtered cathodic vacuum arc, direct current plasma cvd, radio frequency plasma cvd and electron cyclotron resonance microwave plasma cvd [4–7]. a problem that affects some of these methods is their poor adhesion to steel substrates with no intermediate layer. one potential solution to this problem is to use direct current plasma for depositing dlc films with no intermediate layer. this paper studies the adhesion of dlc films to steel substrates and the microhardness of the layer.

2 experimental and results
the dlc films were deposited by a direct current plasma assisted chemical vapor deposition process (dc pacvd) on silicon (111) and steel substrates. the steel substrate consists of 0.9 % c, 4.14 % cr, 6.1 % w, 5 % mo, 2.02 % v. the samples were polished up to a mirror finish, using a series of standard metallurgical polishing steps. before preparation, the samples were cleaned with acetone and isopropyl alcohol in an ultrasonic bath. the apparatus for plasma assisted chemical vapor deposition consists of a vacuum chamber, a diffusion pump, two parallel electrodes and a generator of dc pacvd plasma. a part of the pacvd apparatus is illustrated in fig. 1. the dlc films were prepared on the substrates with the same parameters, apart from the bias voltage, which was varied between -400 v and -900 v. the other processing parameters of the dlc layers, prepared with the different values of the bias voltage, are shown in table 1. the thickness of the deposited dlc films was approximately 1 μm for the different values of bias voltage.

fig. 1: schematic view of the dc pacvd apparatus (electrode with heated substrate inside the vacuum chamber)

we then investigated the dependence of microhardness on bias voltage, and the adhesion of the dlc layers to the steel substrate. the microhardness of the substrates and of the substrates with dlc films was measured using a leica dm inverted research microscope for materials testing. the measured microhardness of the dlc layers prepared with the different values of bias voltage is shown in fig. 2. these graphs show that the microhardness of the dlc layers increases with increasing bias voltage. the photographs of the vickers indentations clearly demonstrate the difference between layers with good and poor adhesion.
the indentation in layers with poor adhesion to the substrate exhibits cracks in the diagonal direction. in the case of good adhesion no cracks are visible, and the layer remains compact. this difference is shown in fig. 3.

table 1: deposition parameters of dlc films (same for silicon and steel substrates)
  initial vacuum          3.9·10^-2 torr
  pressure of ch4         7.1·10^-1 torr
  bias voltage            -400 v to -900 v
  thickness of dlc layer  ~1 μm

fig. 2: dependence of microhardness on bias voltage of dlc layers on silicon and steel substrates
fig. 3: shape of vickers indentation in layers with good and poor adhesion

the microhardness of the coated materials is about 13 gpa higher than that of the base material at a load of 50 mn and a bias voltage of -900 v. these coatings can be used, for example, for protecting metal bioimplants [8].

acknowledgment
this work was supported by grants from the grant agency of the czech republic 106/02/1194, 104/98:210000012 and ctu0204712.

references
[1] robertson, j.: adv. physics, vol. 35, 1986, p. 317.
[2] lettington, a. h., smith, c.: diamond relat. mater., vol. 1, 1992.
[3] seth, j., babu, s. v., ralchenko, v. g.: thin solid films, vol. 254, 1995, p. 92.
[4] nagai, i., ishitani, a., kuroda, h.: journal appl. phys., vol. 67, 1990, p. 2890.
[5] harry, m., zukotynski, s.: journal vac. sci. technol., a9, 1991, p. 496.
[6] ganapathi, l., giles, s., rao, r.: appl. phys. lett., vol. 63, 1993, p. 993.
[7] mckenzie, d. r., müller, d., pailthorpe, b. a.: phys. rev. lett., vol. 67, 1991, p. 773.
[8] konvickova, s., valerian, d.: acta mechanica slovaca, vol. 2, 1989, p. 39.

ing. dmytro palamarchuk, e-mail: palamarc@student.fsid.cvut.cz
ing. miroslav zoriy
ing. ján gurovič, phone: +420 224 355 600, e-mail: gurovic@fsid.cvut.cz
doc. ing. františek černý, csc., phone: +420 224 352 437, e-mail: cernyf@fsid.cvut.cz
department of physics
doc. ing. s. konvičková, csc., phone: +420 224 352 511, e-mail: konvicko@fsid.cvut.cz
department of mechanics
czech technical university in prague, faculty of mechanical engineering, technická 4, 166 07 prague 6, czech republic
doc. ing. ivan hüttel, csc., phone: +420 224 353 205, e-mail: huttel@vscht.cz
institute of chemical technology, technická 3, 166 28 prague 6, czech republic

comparison of exploration strategies for multi-robot search
miroslav kulich∗, tomáš juchelka, libor přeučil
department of cybernetics, faculty of electrical engineering, czech technical university in prague, technicka 2, 166 27 prague, czech republic
∗ corresponding author: kulich@labe.felk.cvut.cz

abstract. searching for a stationary object in an unknown environment can be formulated as an iterative procedure consisting of map updating, selection of a next goal and navigation to this goal. it finishes when the object of interest is found. this formulation and the general search structure are similar to the related exploration problem. the only difference is in goal-selection, as search and exploration objectives are not the same. although search is a key task in many search and rescue scenarios, the robotics community has paid little attention to the problem.
there is no goal-selection strategy that has been designed specifically for search. in this paper, we study four state-of-the-art strategies for multi-robot exploration, and we evaluate their performance in various environments with respect to the expected time needed to find an object, i.e. to achieve the objective of the search.

keywords: mobile robots, multi-robot systems, planning, search, exploration.

1. introduction
searching for a stationary object of interest in a known or unknown environment is a practical task in everyday life. almost everyone, for example, has lost keys or has forgotten where he/she put his/her glasses, cellular phone or wallet, and has tried to find them. an important task in the search and rescue scenario is to find a black box flight recorder or debris after a plane crash, or victims/survivors after an accident or a catastrophe. although these practical applications and many others exist, the search problem has been addressed only marginally by the robotics community. some effort has been devoted to the single-robot and multi-robot search problem for environments known a priori. sarmiento et al. [1] formulate the problem in such a way that the time required to find an object is a random variable induced by the choice of a search path and a uniform probability density function for the location of the object. they propose a two-stage process to solve the problem: first, a set of locations to be visited (known as guards, from the art gallery problem [2]) is determined; an order for visiting these locations while minimizing the expected time to find an object is then sought. the optimal order is determined by a greedy algorithm in a reduced search space, which computes a utility function for several steps ahead. this approach is then used in [3], where robot control is assumed in order to generate smooth and locally optimal trajectories. hollinger et al. [4] utilize a bayesian network to estimate the posterior distribution of the position of the target and present a graph search to minimize the expected time needed to capture a non-adversarial object. a single-robot search in known environments can also be formulated as the traveling deliveryman problem (tdp, also known as the traveling repairman problem or the minimal latency problem). this problem is studied by the operational research community, and is known to be np-hard, even for a single robot [5]. recently, several approximation algorithms have been presented. salehipour et al. [6] have presented a metaheuristic combining grasp (general randomized adaptive search) with vnd (variable neighborhood descent). another metaheuristic, called vns (variable neighborhood search), is introduced in [7], while linear programming is used in [8]. to the best of the authors' knowledge, the problem of a multi-robot search in an unknown environment has not been studied yet. however, methods developed for multi-robot exploration can be adopted for the search problem, as these two problems are similar in their general structure and problem formulation. a popular method for both single-robot and multi-robot exploration is frontier-based exploration, introduced by yamauchi [9], which has been further extended by many researchers; see for example [10, 11] for an experimental evaluation of several single-robot strategies. for the multi-robot case, wurm et al. [12] present goal assignment based on the hungarian method [13].
burgard et al. [14] use decision theory to coordinate the exploration: they estimate the expected information gain of a goal and combine it with a path cost. the method presented in stachniss et al. [15] takes the structure of the environment into account by detecting rooms and corridors and trying to assign robots to separate rooms. in addition, approaches based on k-means clustering and assigning the clusters to particular robots are presented in [16, 17].

intuitively, multi-robot search is a process of autonomous navigation of a team of mobile robots in an a-priori unknown environment in order to find an object of interest. a natural condition is to perform this process with minimal usage of resources, e.g., time of search, trajectory length, or energy/fuel consumption. following yamauchi [9], the search algorithm can be defined (similarly to exploration) as an iterative procedure consisting of model updating with current sensory data, selection of a new goal for each robot based on current knowledge of the environment, and subsequent navigation to this goal. as we discussed in our previous paper [18], focused on the single-robot case, the key difference between search and exploration lies in the way in which the next goals to be visited are chosen at each iteration. we showed that the objectives of the two problems differ, so that trajectories optimal for exploration are not optimal for search in general. nevertheless, exploration goal-selection strategies may be used for search. the aim of this paper is to study the behavior of several state-of-the-art exploration strategies and to evaluate their performance in the search task. the rest of the paper is organized as follows. a definition of the problem is presented in section 2, while the frontier-based framework for search is introduced in section 3, and the strategies are described in section 4. an evaluation of the results and a discussion are presented in section 5. finally, section 6 is dedicated to concluding remarks.

2. problem formulation
the formulation of a multi-robot search is a direct extension of the single-robot case, introduced in [18]. we assume a team of n mobile robots equipped with a ranging sensor with a fixed, limited range (e.g., a laser range-finder) operating in an unknown environment. the search problem is defined as navigation of the particular robots through this environment in order to find a stationary object placed randomly in the environment. the search is completed when the object is first detected by the robots' sensors (we do not address here the problem of how to recognize the object to be detected; instead, we consider that this functionality is available), and the natural goal is to minimize the time of this detection. the objective is to find a tuple of trajectories R^opt = ⟨R_i^opt | i = 1 . . . n⟩ among all possible tuples of trajectories R = ⟨R_i | i = 1 . . . n⟩ minimizing the expected (mean) time of detecting the object:

R^opt = arg min_R E(T | R),    (1)

where R_i and R_i^opt are trajectories of the i-th robot, T is the time needed to traverse R, and

t_f = E(T | R) = Σ_{t=0}^{∞} t p(t).    (2)

p(t) can generally be an arbitrary probability density function if prior information about the position of the object is available.
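criterion (2) is simple to evaluate once the newly sensed area per time step is logged. the sketch below anticipates the uniform-prior choice made in the next paragraph, p(t) = a_t / a_total; the area values are illustrative.

def expected_detection_time(newly_sensed):
    """newly_sensed[t] = area first seen at step t; assumes full coverage."""
    a_total = sum(newly_sensed)
    return sum(t * a / a_total for t, a in enumerate(newly_sensed))

# a trajectory that uncovers area early is preferred by the criterion:
print(expected_detection_time([40, 30, 20, 10]))   # 1.0
print(expected_detection_time([10, 20, 30, 40]))   # 2.0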
nevertheless, we consider that this information is not provided, so we define the probability p(t) as the ratio of the area A_t^R newly sensed at time t when the robots follow trajectories R to A_total, the area of the whole environment in which the robots operate:

p(t) = A_t^R / A_total.

we can therefore rewrite (1) as

R^opt = arg min_R E(T | R) = arg min_R Σ_{t=0}^{∞} t A_t^R.    (3)

3. framework
the framework for multi-robot search is based on yamauchi's frontier-based approach [9], successfully used for exploration, which uses an occupancy grid as the environment representation. this approach is centralized: the occupancy grid is global and is built by a central unit by integrating raw sensor measurements from all robots. all decisions are also made centrally and then distributed to the particular robots. the key idea of the approach is to detect frontier cells, i.e., reachable grid cells representing free regions adjacent to at least one as yet unexplored cell. a frontier is a continuous set of frontier cells such that each frontier cell is a member of exactly one frontier. the search algorithm consists of several steps that are repeated as long as some unexplored area remains. the process starts with the individual robots reading actual sensor information. after some data processing, the existing map is updated with this information. new goal candidates are then determined, and goals for the particular robots are assigned using a defined cost function. having assigned the goals to the robots, the shortest paths from the robots to the goals are found. finally, the robots are navigated along the paths. the whole process is summarized in algorithm 1.

algorithm 1: frontier-based search algorithm
while unexplored areas exist do
  read current sensor information;
  update the map with the obtained data;
  determine new goal candidates;
  assign the goals to the robots;
  plan paths for the robots;
  move the robots towards the goals;

4. exploration strategies
many exploration strategies exist; see for example [19]. within the exploration framework presented here, we chose and implemented four methods which are centralized, do not use a distance-based cost for goal evaluation, and are easy to implement. the following paragraphs give an overview of these methods.

figure 1. greedy assignment: (a) two robots exploring the same goal, (b) an inefficient assignment of goals.

4.1. greedy approach
a simple and easily implementable strategy is described in [9]: each robot greedily heads towards the best goal (according to a cost function) without any coordination between the robots. the strategy lacks optimality, since one goal can be selected and explored by many robots, as depicted in fig. 1(a). (note that the decision is made on the basis of only the current knowledge of the map; from this perspective assigning the robots to distinct goals seems better, but it might be globally more efficient to leave two robots exploring the same goal in some steps of the exploration process.) to avoid this inefficiency, already selected goals can be discarded from further selection. this is used in the broadcast of local eligibility (ble) assignment algorithm, developed by werger & mataric [20]; see algorithm 2.

algorithm 2: ble assignment algorithm
while any robot remains unassigned do
  find the robot-goal pair (i, j) with the highest utility;
  assign the goal j to the robot i and remove them from consideration;

however, this remains a greedy algorithm and does not necessarily produce the optimal solution; the solution depends on the order of the robot-goal assignments. fig. 1(b) depicts an example of an inefficient assignment.
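algorithm 2 translates almost directly into code. the sketch below uses negative path cost as the utility and stops as soon as either the robots or the goals run out (a guard the pseudocode leaves implicit); the cost values are illustrative.

def ble_assign(cost):
    """cost[i][j] = path cost of robot i to goal j; returns {robot: goal}."""
    robots = set(range(len(cost)))
    goals = set(range(len(cost[0]))) if cost else set()
    assignment = {}
    while robots and goals:
        # pick the globally best remaining robot-goal pair (highest utility)
        i, j = max(((i, j) for i in robots for j in goals),
                   key=lambda p: -cost[p[0]][p[1]])
        assignment[i] = j
        robots.discard(i)
        goals.discard(j)
    return assignment

print(ble_assign([[2.0, 5.0], [1.0, 4.0], [3.0, 6.0]]))  # robot 1 -> goal 0, robot 0 -> goal 1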
it is an optimization algorithm which solves the worker-task assignment. the assignment can be written in the form of the n×n matrix c, where element ci,j represents the cost that the j-th task has been assigned to the i-th worker. in our case, we define the cost as the length of the path from the current position of the robot to the goal. the hungarian method finds the optimal assignment for the given cost matrix c in o(n3). 2note that we make the decision on the basis of only current knowledge of the map. from this perspective, the approach from fig. 1(b) seems to be better than the approach in fig. 1(b). nevertheless, it might be globally more efficient to leave two robots to explore the same goal in some steps of the exploration process. the algorithm requires the number of robots to be the same as the number of goals, which cannot be guaranteed throughout the exploration. if the number of robots or goals is lower, imaginary robots or goals can be added to satisfy the assumption. they have a fixed cost assigned to them, so they do not affect the real robots/goals. in the selection, the imaginary robots, and targets are skipped. this strategy does not assign the same goal to different robots and it does not depend on the order of selection. 4.3. k-means clustering in the majority of multi-robot tasks, the robots start from the same area, e.g., from the entrance to a building. this leads to exhaustive exploration of the starting area during the first phase of exploration. [16] present a strategy that attempts to spread the robots quickly in the environment, so that each robot focuses on an individual part of the environment, for which it will remain responsible. this is done by using k-means clustering, which divides the remaining unknown space into the same number of regions as the number of robots. each particular region is then assigned to the closest robot. after the assignment, each robot chooses a frontier according to a predefined cost function. the cost of frontier fj for robot ri assigned to region ζi is defined as: ci,j = { ∆ + e(fj,ci) + oi,j fj /∈ ζi, d(fj,ri) + oi,j fj ∈ ζi, where ∆ is a constant penalization representing the diagonal length of the map, e is the euclidean distance, ci is the centroid of the region, d is the real path cost defined by any path planning algorithm, and oi,j is the accumulated penalization increasing the cost when the frontier has already been selected. the frontier that does not belong to the assigned region receives a high penalization ∆, so it can happen that there is no frontier in the assigned region, in which case, the robot selects the closest frontier to its region. as a result, robots tend to work separately in their assigned regions. if the assigned region is not directly accessible, other regions are explored on the way to the assigned region. the robots explore all these separate regions simultaneously, because each robot heads to its own region. this disperses the robots in the environment, and different parts of the environment are explored at similar speeds. in general, the k-means algorithm consists of the following steps. (1.) randomly choose k centroids ci where 1 ≤ i ≤ k. (2.) classify each not yet explored cell in the environment to the class ζi of its closest centroid ci. (3.) determine a new centroid for each class. (4.) if all the centroids did not change, finish. otherwise, continue with the step 2. 164 vol. 55 no. 3/2015 comparison of exploration strategies for multi-robot search (a) (b) (c) (d) figure 2. 
maps used in the experiments. the starting positions are marked with the green circles: (a) empty map with dimensions 50 × 50 m; (b) arena map with dimensions 50 × 50 m; (c) jh map with dimensions 52.5 × 60 m; (d) hospital map with dimensions 138 × 110.75 m. 0 10 20 30 40 50 60 70 80 90 100 0 100 200 300 400 500 600 700 800 900 1000 m ap e xp lo re d [ % ] step greedy ble hungarian k-means 0 10 20 30 40 50 60 70 80 90 100 0 100 200 300 400 500 600 700 800 m ap e xp lo re d [ % ] step greedy ble hungarian k-means 0 10 20 30 40 50 60 70 80 90 100 0 100 200 300 400 500 600 700 m ap e xp lo re d [ % ] step greedy ble hungarian k-means figure 3. empty map size progress: for 4 robots (left), for 6 robots (center), for 8 robots (right). 5. results the strategies mentioned above (greedy, ble, the hungarian method, and k-means clustering) were implemented in a framework for search/exploration in a polygonal domain [21]. the framework uses ros [22] as a communication middleware. sensor measurements are represented by polygons and combined together by polygon operations. this way, a polygonal representation of the environment is built. the framework therefore enables search/exploration to be performed with a larger number of robots, in larger experiments, and re-planning can be carried out faster than is possible in a grid-based approach. experiments to compare the strategies were performed in simulation, using maps with various sizes and structures, see figure 2. the empty map 2(a) was created to simulate the trivial case of a big room without obstacles. the arena map 2(b) represents a slightly structured environment with large corridors and rooms. the jh map 2(c) represents a real administrative building with many separated rooms. the hospital map 2(d) is a part of the hospital-section map from the stage simulator representing another building. all the simulations were examined on the same hardware with a quad-core processor on 3.30 ghz running x86_64 gnu/linux kubuntu 3.0.0-20, ros electric with the stage simulator and gcc 4.6.1. the considered numbers of robots are m = {4, 6, 8}, while the sensor range is set to ρ = 5 meters with a 270° field of view. the robots are controlled using our implementation of the snd algorithm [23] as a ros node, and the planning period was set to 1 second. although all the strategies are deterministic, other parts of the exploration process (especially robot control) and thus the whole process are not deterministic. each experimental setup determined by a tuple 〈map, number of robots, strategy〉 was therefore repeated several times to obtain statistical characteristics of the exploration. the number of runs differs for different maps, as does the time demand for performing a certain number of experiments. the number of repetitions for empty, which is the easiest map, was 30. for arena there were 22 runs, while 17 runs were performed for each setup for jh and 10 for hospital. as the computations are not time-consuming, the experiments were speeded up 3 times in the stage configuration file. this has the same effect as when the planning period is set to 3 seconds and the simulation speed is normal. the benefit is that we can perform the experiments faster. this is crucial, as we performed about 700 experiments, each taking 5 to 15 minutes. a statistical evaluation of the strategies is shown in figs. 3–8. the first two figures depict the progress of a newly explored area averaged for each map and strategy over all runs, while figs. 
7 and 8 display fivenumber summaries of the expected time for finding object — tf — as defined in (2). art is computed as the difference of the volumes of the already explored areas at times t and t− 1, while atotal is the volume of the explored area in the final map. for the empty environment and 4 to 6 robots, all 165 m. kulich, t. juchelka, l. přeučil acta polytechnica 0 10 20 30 40 50 60 70 80 90 100 0 200 400 600 800 1000 1200 m ap e xp lo re d [ % ] step greedy ble hungarian k-means 0 10 20 30 40 50 60 70 80 90 100 0 100 200 300 400 500 600 700 800 900 1000 m ap e xp lo re d [ % ] step greedy ble hungarian k-means 0 10 20 30 40 50 60 70 80 90 100 0 100 200 300 400 500 600 700 800 900 1000 m ap e xp lo re d [ % ] step greedy ble hungarian k-means figure 4. arena map size progress: for 4 robots (left), for 6 robots (center), for 8 robots (right). 0 10 20 30 40 50 60 70 80 90 100 0 100 200 300 400 500 600 m ap e xp lo re d [ % ] step greedy ble hungarian k-means 0 10 20 30 40 50 60 70 80 90 100 0 50 100 150 200 250 300 350 400 450 m ap e xp lo re d [ % ] step greedy ble hungarian k-means 0 10 20 30 40 50 60 70 80 90 100 0 50 100 150 200 250 300 350 400 m ap e xp lo re d [ % ] step greedy ble hungarian k-means figure 5. jh map size progress: for 4 robots (left), for 6 robots (center), for 8 robots (right). 0 10 20 30 40 50 60 70 80 90 100 0 100 200 300 400 500 600 700 800 m ap e xp lo re d [ % ] step greedy ble hungarian k-means 0 10 20 30 40 50 60 70 80 90 100 0 100 200 300 400 500 600 700 m ap e xp lo re d [ % ] step greedy ble hungarian k-means 0 10 20 30 40 50 60 70 80 90 100 0 100 200 300 400 500 600 m ap e xp lo re d [ % ] step greedy ble hungarian k-means figure 6. hospital map size progress: for 4 robots (left), for 6 robots (center), for 8 robots (right). the methods, except k-means, report similar results with a difference of 0.3 % for 4 robots and 1 % for 6 robots. the worse behavior of the k-means strategy (6 % worse than the hungarian method for 4 robots and 8.5 % worse for 6 robots) is caused by a non appropriate distribution of robots in the first stages of the search, and the need to redistribute them later. the situation is more balanced for 8 robots, but the greedy approach and the hungarian approach behave slightly better. the performance of the methods in the other maps shows similar characteristics. greedy performs worst in almost all cases, followed by ble. in general, the best results were achieved by the hungarian method, followed by k-means in arena and hospital. k-means even outperforms the hungarian method in some cases. note the poor results of k-means for jh. this environment contains many small rooms, and its partitioning forces robots to explore small regions partially spread over two or three rooms. this slows down the search as more than one robot visits the same room, in many cases. 6. conclusion although search in unknown environments has many practical applications, it has until now been addressed only marginally by the robotics community. in this paper, we define the problem for a team of robots, and present the results of several standard exploration approaches for a search. although the hungarian method outperforms the other approaches in most cases, the differences in results appear to be marginal, and differ for different maps and for different numbers of robots. in our further research we therefore plan to study the behaviour of the methods in greater detail in order to clarify the reasons for their performance. 
although the number of experiments that have been performed is not small, more experiments are needed in order to draw statistically reasonable conclusions, and especially to provide a statistical analysis of variance. we believe that this study will help in the design of novel methods for search in future. 166 vol. 55 no. 3/2015 comparison of exploration strategies for multi-robot search 2 5 0 3 0 0 3 5 0 number of robots t f [ s] 4 6 8 greedy ble hungarian k−means 3 0 0 3 5 0 4 0 0 4 5 0 5 0 0 5 5 0 6 0 0 number of robots t f [ s] 4 6 8 greedy ble hungarian k−means figure 7. comparison of the strategies, i.e. the five-number summaries of tf for empty (left) and for arena (right) map. 1 2 0 1 4 0 1 6 0 1 8 0 2 0 0 2 2 0 2 4 0 number of robots t f [ s] 4 6 8 greedy ble hungarian k−means 2 5 0 3 0 0 3 5 0 number of robots t f [ s] 4 6 8 greedy ble hungarian k−means figure 8. comparison of the strategies, i.e., the five-number summaries of tf for jh (left) and for hospital (right) map. acknowledgements this work has been supported by the technology agency of the czech republic under project no. te01020197. references [1] a. sarmiento, r. murrieta-cid, s. hutchinson. a multi-robot strategy for rapidly searching a polygonal environment. in c. lemaître, c. a. reyes, j. a. gonzález (eds.), proc. 9th ibero-american conf. on ai on advances in artificial intelligence, vol. 3315 of lecture notes in computer science, pp. 484–493. springer, 2004. doi:10.1007/978-3-540-30498-2_48. [2] t. shermer. recent results in art galleries [geometry]. proc of the ieee 80(9):1384 –1399, 1992. doi:10.1109/5.163407. [3] a. sarmiento, r. murrieta, s. hutchinson. an efficient motion strategy to compute expected-time locally optimal continuous search paths in known environments. advanced robotics 23(12-13):1533–1560, 2009. doi:10.1163/016918609x12496339799170. [4] g. hollinger, j. djugash, s. singh. coordinated search in cluttered environments using range from multiple robots. in c. laugier, r. siegwart (eds.), field and service robotics, vol. 42 of springer tracts in advanced robotics, pp. 433–442. springer berlin heidelberg, 2008. doi:10.1007/978-3-540-75404-6_41. [5] k. trummel, j. weisinger. the complexity of the optimal searcher path problem. operations research 34(2):324 – 327, 1986. doi:10.1287/opre.34.2.324. [6] a. salehipour, k. sörensen, p. goos, o. bräysy. efficient grasp+vnd and grasp+vns 167 http://dx.doi.org/10.1007/978-3-540-30498-2_48 http://dx.doi.org/10.1109/5.163407 http://dx.doi.org/10.1163/016918609x12496339799170 http://dx.doi.org/10.1007/978-3-540-75404-6_41 http://dx.doi.org/10.1287/opre.34.2.324 m. kulich, t. juchelka, l. přeučil acta polytechnica metaheuristics for the traveling repairman problem. a quarterly journal of operations research (4or) 9:189–209, 2011. doi:10.1007/s10288-011-0153-0. [7] n. mladenović, d. urošević, s. hanafi. variable neighborhood search for the travelling deliveryman problem. a quarterly journal of operations research (4or) pp. 1–17, 2012. doi:10.1007/s10288-012-0212-1. [8] i. méndez-díaz, p. zabala, a. lucena. a new formulation for the traveling deliveryman problem. discrete appl math 156(17):3223–3237, 2008. doi:10.1016/j.dam.2008.05.009. [9] b. yamauchi. frontier-based exploration using multiple robots. in proceedings of the second international conference on autonomous agents, agents ’98, pp. 47–53. acm, new york, ny, usa, 1998. doi:10.1145/280765.280773. [10] f. amigoni. experimental evaluation of some exploration strategies for mobile robots. 
in robotics and automation, 2008. icra 2008. ieee international conference on, pp. 2818–2823. 2008. doi:10.1109/robot.2008.4543637. [11] d. holz, n. basilico, f. amigoni, s. behnke. evaluating the efficiency of frontier-based exploration strategies. in proc. int. symp. on robotics/german conf. on robotics, pp. 1–8. vde verlag, 2010. [12] k. wurm, c. stachniss, w. burgard. coordinated multi-robot exploration using a segmentation of the environment. in proc. ieee/rsj int. conf. on intelligent robots and systems, pp. 1160 –1165. 2008. doi:10.1109/iros.2008.4650734. [13] h. w. kuhn. the hungarian method for the assignment problem. naval research logistics quarterly 2:83–97, 1955. doi:10.1002/nav.3800020109. [14] w. burgard, m. moors, c. stachniss, f. schneider. coordinated multi-robot exploration. robotics, ieee transactions on 21(3):376–386, 2005. doi:10.1109/tro.2004.839232. [15] c. stachniss, o. m. mozos, w. burgard. efficient exploration of unknown indoor environments using a team of mobile robots. annals of mathematics and artificial intelligence 52(2-4):205–227, 2008. doi:10.1007/s10472-009-9123-z. [16] a. solanas, m. a. garcia. coordinated multi-robot exploration through unsupervised clustering of unknown space. in proc. int. conf. on intelligent robots and systems, vol. 1, pp. 717–721 vol.1. 2004. doi:10.1109/iros.2004.1389437. [17] d. puig, m. garcia, l. wu. a new global optimization strategy for coordinated multi-robot exploration: development and comparative evaluation. robotics and autonomous systems 59(9):635–653, 2011. doi:10.1016/j.robot.2011.05.004. [18] m. kulich, l. přeucil, c. m. b. bront. single robot search for a stationary object in an unknown environment. in robotics and automation (icra), 2014 ieee international conference on, pp. 5830–5835. 2014. doi:10.1109/icra.2014.6907716. [19] m. juliá, a. gil, o. reinoso. a comparison of path planning strategies for autonomous exploration and mapping of unknown environments. autonomous robots 33(4):427–444, 2012. doi:10.1007/s10514-012-9298-8. [20] b. b. werger, m. j. mataric. broadcast of local eligibility for multi-target observation. in l. parker, g. bekey, j. barhen (eds.), distributed autonomous robotic systems 4, pp. 347–356. springer-verlag, 2001. doi:10.1007/978-4-431-67919-6_33. [21] t. juchelka, m. kulich, l. preucil. multi-robot exploration in the polygonal domain. in int. conf. on automated planning: workshop on planning and robotics. 2013. [22] m. quigley, k. conley, b. p. gerkey, et al. ros: an open-source robot operating system. in proc. ieee int. conf. on robotics and automation: workshop on open source software. 2009. [23] j. w. durham, f. bullo. smooth nearness-diagram navigation. in proc. ieee/rsj int. conf. on intelligent robots and systems, pp. 690–695. 2008. doi:10.1109/iros.2008.4651071. 
168 http://dx.doi.org/10.1007/s10288-011-0153-0 http://dx.doi.org/10.1007/s10288-012-0212-1 http://dx.doi.org/10.1016/j.dam.2008.05.009 http://dx.doi.org/10.1145/280765.280773 http://dx.doi.org/10.1109/robot.2008.4543637 http://dx.doi.org/10.1109/iros.2008.4650734 http://dx.doi.org/10.1002/nav.3800020109 http://dx.doi.org/10.1109/tro.2004.839232 http://dx.doi.org/10.1007/s10472-009-9123-z http://dx.doi.org/10.1109/iros.2004.1389437 http://dx.doi.org/10.1016/j.robot.2011.05.004 http://dx.doi.org/10.1109/icra.2014.6907716 http://dx.doi.org/10.1007/s10514-012-9298-8 http://dx.doi.org/10.1007/978-4-431-67919-6_33 http://dx.doi.org/10.1109/iros.2008.4651071 acta polytechnica 55(3):162–168, 2015 1 introduction 2 problem formulation 3 framework 4 exploration strategies 4.1 greedy approach 4.2 the hungarian method 4.3 k-means clustering 5 results 6 conclusion acknowledgements references ap02_04.vp 1 introduction the arguments for using discrete element methods instead of methods defined on the continuum of the entire domain were discussed in a previous paper by the authors, [9]. in the early 1970s cundall, [2], introduced discrete, elements starting with dynamic equilibrium. first, brick-like elements were used (professional computer program udec), and later circular elements in 2d and spherical elements in 3d (pfcparticle flow code – both computer systems issued by itasca) simulated the continuum behavior of structures. such methods have been applied mainly in geotechnics, where soil is a typical grain material with the above-mentioned shape, [14]. if the material parameters are well chosen, the mechanical behavior of the discrete elements is very close to reality. the problem consists of finding such material parameters. there have been many attempts to find out these parameters, but there is still no satisfactory output from these studies. it seems promising to cover of the domain defining the physical body by hexagonal elements, which are very similar to disks, and can cover the domain with very small geometrical error. the internal mechanical behavior is described by virtue of the boundary element method, e.g. [1]. the shape is probably derived from honeycombs [11]. the examples are taken from the field of geotechnical problems; particularly opening face stability is discussed. experimental measurements from a scale model are compared with numerical treatments for the occurrence of bumps during tunneling or mining. the reliability of the proposed methods is tested for rock bumps in coalmines. the experimental data and the results from these models are in good agreement, cf. [14]. an application to tunnel face stability can be found in [10]. the extrusion of gas in the coalmine can be described by eigenparameters, [3], or, in this case, eshelby’s trick is used [4]. 2 rock bumps bumps are a typical problem in mining engineering, but they have not yet been adequately studied. there is a lack of numerical models, which this paper should partly bridge. the examples focus on two basic problems: bumps due to extreme depth of the mine, and the influence of gas emissions. in the first case, static pfc is also employed and the two numerical methods are compared with the experimental results. for bumps to occur, the rock has to possess certain particular material properties, leading to an accumulation of energy and the ability to release this energy. such a material may be brittle, or the bumps may arise at interfacial zones of two parts of the rock that have principally different material properties. 
the experiments concentrate on the loading of a longwall seam. the coal is supposed to be drilled at great depths. the mathematical models have been prepared in compliance with the experiments. both numerical methods, the free hexagonal element method and static pfc are selected to assess the local energy concentration and forthcoming bumps. first, let us look back and mention certain interesting publications dealing with local sudden failure of stability. haramy et al [5] studied the behavior of cubes of coal that were loaded by a concentrated load. the tests were made on irregular lumps. ujihara et al [13] discuss a scale model and theoretical studies on the mechanism of coal and gas bumps. the authors aimed to establish a comprehensive theory on the mechanism of coal and gas outbursts. they reasoned that an effective method for elucidating the mechanism involves scale model laboratory experiments, because the phenomena can be observed precisely not only with the naked eye but also with appropriate measuring methods. experimental results and numerical analyses of stresses indicate that the mechanism of coal and gas outbursts is mainly due to gas pressure, rock pressure and sudden face exposure. gas pressure contributes to fracturing as well as transporting the coal. from both the numerical and the experimental results it follows that there is a typical paraboloidal disconnecting surface, with the principal axis meeting the axis of the tunnel or seam. in addition, in [8] a model material is used to simulate coalmine bumps. the authors concentrate on the stability of coal seams that are intersected by a straight, infinite cavity, representing a tunnel, a roadway, a gallery, or a longwall working. they observe that a large experimental (and obviously also in situ) scatter of critical pressure, necessary for bump initiation, can be understood as a consequence of the variance of the material. © czech technical university publishing house http://ctn.cvut.cz/ap/ 57 acta polytechnica vol. 42 no. 4/2002 application of discrete element methods to the problem of rock bumps p. p. procházka, m. g. kugblenu this paper is a continuation of a previous paper by the authors. applications of two discrete element methods (dem) to several fields of geotechnics are discussed. the free hexagon element method is considered a powerful discrete element method, and is widely used in mechanics of granular media. it substitutes the methods for solving continuum problems. in order to complete the study, other discrete element methods are discussed. the second method starts with the classical particle flow code (pfc, which uses dynamic equilibrium), but we apply static equilibrium in our case. the second method is called the static particle flow code (spfc). the numerical experiences and comparison with experimental results from scaled models are discussed. keywords: free hexagonal element method, statical pfc, localized damage, occurrence of cracking and bumps during mining or tunneling. since in situ rock strength needs to be determined for successful design of underground structures, the bureau of mines in denver (there is now only one bureau of mines in the usa, located in pittsburgh) issued a series of publications, where estimates for engineers can be found. 
a summary is published in [7], where it is pointed out that at failure rocks with a wide range of strengths approximately satisfy the following equation: � � � � � � � � � � � � � a 1 2 , where � and � are the shear and normal stresses along the failure plane, �� is the rock tensile strength (note that tension is positive), 2� is a universal constant equal to 0.684 ± correctional term, and a is a material constant. values for a are determined from the triaxial compression test and borehole shear test reported in [7], which are approximately 3.018, and 2.466, respectively; the values are comparable to a determined using the hoek and brown material constants for failure of undisturbed and disturbed rock masses. large scale experiments were carried out and the constants for the above-mentioned failure criterion were: 2 0 684 0 017� � �. . , a � �2 466 0 58. . for undisturbed material, and a � �3 018 0 88. . for disturbed mass. a strong roof helps to minimize roof fall problems in coalmine entries, see [6]. however, the inability of a strong roof to cave readily may contribute to major ground control problems in longwall and retreat mining operations. concern expressed by mine operators prompted the authors to analyze the effects of strong roof members on ground stability around longwall openings. 2.1 laboratory testing devices fig. 1 is a sketch of the loading cell. it consists of a lower steel tank that is designed for the horizontal forces that are caused by the vertical load in araldite specimens. the loading cell is equipped with perspex on its sides, which enables observation of the deformation processes during the tests. the described loading cell models the rock mass near the face of the opening. in the loading cell, two araldite specimens are placed (both 160/400/40 mm) that model the real tunnel. the gap between them corresponds to the width of a working gallery in a mine. the mechanism and history of the coal bumps was observed on this araldite specimen, which was covered by a soft duralumin sheet with force meters placed on it in the following manner: 5 wider force meters were placed near its outer edge, and another 15 thinner force meters were placed next to them, see fig. 1. in order to embed the force meters properly and to prevent them from tilting, another double steel sheet, 1 mm thick, was placed over the force meters. a 300 mm high block of duralumin was placed over this sheet. the purpose of this block was to model the hanging wall and to facilitate the stress distribution similarly as in reality. the sizes of the force meters were: length 160 mm, height 68 mm, width 16 or 32 mm. on each force meter 4 strain gauges were placed, and their distribution enabled us to measure the deformation along its full length. the strain gauges are connected in series in order to be able to gauge each force meter separately, and simultaneously to increase the gauging sensitivity. the force meters were calibrated within the expected range of forces, that is 0 to 250 kpa. in the scale model the concrete was substituted by dural soft sheet with physically equivalent material properties, according to buckingham’s theorem, see [12]. the scale was approximately 1 : 250, meaning that the opening was 4.75 m high. this height was also applied in the mathematical models, for comparative studies. for more details, see [14]. fig. 
2 shows the distribution of the vertical stresses along the contact between the upper rock and the lower coal seam 58 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 42 no. 4/2002 fig. 1: view of the symmetric part of the specimen in dependence on the distance from the face of the opening for various applied stresses. sij denotes the load, which range from 0 to 3 mpa. at the value of 2.5 mpa, the bumps occurred and the stress in the sides of the opening obviously increased. this is why the value 2.5 mpa of the terrain loading is considered as decisive in the mathematical models. the experiments were carried out at the klokner institute of ctu prague, [14]. 2.2 examples fig. 3 shows the mesh of particles in the static pfc with the generalized mohr-coulomb friction and cohesion law, and exclusion of tensile stress exceeding the tensile strength, as described in [9]. the internal radius of the hexagonal particles is 0.25 m, and the number of particles is 1532. the lower part with the free face represents the coal seam, and the upper part is the rock. the coal seam is 4.75 m high, as in the experimental model. the numerical procedure for solving the problems is the over-relaxation method, with improvement of the unknowns in the sense of the contact conditions between the adjacent disks. first, the fixed contacts are considered, i.e., high penalty coefficients are introduced. after 1000 iteration steps the contact conditions are taken into account and after approximately another 1500 iteration steps the procedure terminates. the termination obeys the rule of euclidean norm er© czech technical university publishing house http://ctn.cvut.cz/ap/ 59 acta polytechnica vol. 42 no. 4/2002 fig. 2: distribution of y-stresses [mpa] behind the opening. loading along the terrain: i1kpa, ii1 mpa, iii1.5 mpa, iv2 mpa, v2.5 mpa, vi3 mpa. fig. 3: setting of the hexagonal elements and boundary conditions ror imposed on the change of displacements of the centers of the particles. all these values are considered for the examples discussed below. note that when using static pfc, the final load is divided into 10 equal increments to suppress the influence of possibly large rotations. since the elastic-plastic law is applied to the free hexagon solution, the structure is from the beginning loaded by full loading. two examples are discussed in this text. first, the opening stability during mining is studied, and the results are compared with the experiments. the results from the hexagonal method and from static pfc are compared, and the experimental measurements are taken into consideration. the second example studies the influence of gas emission on the face stability and is illustrated for two values of pressures, which are located in the coal seam due to gas emission. both examples focus on the problem of extrusion of rock (or coal) mass from the rock body. the geometry of the hexagonal elements is shown in fig. 3, where the boundary conditions are also depicted. this mesh is also used for the disks in such a way that the centers coincide with the hexagons and the radius is also 0.25 m. the material coefficients in the hexagons are: e � 500 gpa, g � 150 gpa, shear strength c � 1 mpa and tensile strength pn � � 100 kpa. these values are valid for the rock. the lower part with the free face describes the coal; its material properties e and g are ten times lower than those of the rock, but the shear strength and the tensile strength vary. 
the load is given by the overburden. the volume weight � � 25 kn/m3, the depth of the opening is 1000 m. the disks are connected by springs with the following material properties: kn ij � 500 gn/m and kt ij � 150 gn/m, shear strength c � 1 mn/m, and the tensile strength pn � � 100 kn/m for the rock. for the coal it holds: kn ij � 50 gn/m and kn ij �15 gn/m, the shear strength and the tensile strength change in compliance with the hexagons. in fig. 4, the movements of the particles describe the elastic solution. here and in the other figures, the particles 60 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 42 no. 4/2002 fig. 4: movements of particles in an elastic solution fig. 5: movements in an elastic solution with their vectors are drawn as rigid, although they change their shape after deformation. in fig. 5, the same movements are depicted with the displacement vectors. the vectors start at the centers of the particles in the undeformed state and end at the centers of the moved elements. fig. 6 shows a typical situation. the extrusion of the particles is observed in the front part, while next to the face (approximately 3 m) the particles move backwards and stabilize the coal seam. this can change after some other influence, e.g., due to gas emission, as is shown in the following pictures. the movements of the rock and coal with these material properties make the opening stability relatively safe. some particles at the face of the seam are disconnected, but the direction of the vectors shows the still stabilized influence of the overburden. it should be noted that the nucleation of the crack zones is seen in the right upper part of the rock, which causes a drop in the stresses in this region. fig. 7 shows the same arrangement of the system discussed in fig. 6, but in this case static pfc is used. the displacements at the face are principally smaller, but the triangular wedge is created in a similar manner as for the hexagons. as the process of cracking is influenced by the contact conditions rather than by the springs, statical pfc is obviously less reliable than the hexagon method. the most dangerous case is depicted in fig. 8 for hexagons and in fig. 9 for disks. the triangular wedge is disconnected into two parts, and extrusion of particles is now obvious. similar pictures are shown in figs. 10 and 11. it can be proved, that for tensile strength of 3 kpa, shear strength higher than 300 kpa does not principally change the movement configuration. a similar conclusion is also valid for other relations pn � and c. a wide range of computations were carried out to get the relation pn � and c at the occurrence of the bumps, which is determined as a singular solution of the problem. the relation is shown in fig. 12. the numerical results are in reasonable agreement with the experiments carried out at the klokner institute of ctu prague. this conclusion follows from fig. 13. the deviations on the left hand side of the picture are probably caused by the relatively rough mesh © czech technical university publishing house http://ctn.cvut.cz/ap/ 61 acta polytechnica vol. 42 no. 4/2002 fig. 6: movements for pn � � 10 kpa, c � 150 kpa fig. 7: movements for pn � � 10 kpa, c � 150 kpa 62 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 42 no. 4/2002 fig. 8: movements for pn � � 5 kpa, c � 150 kpa fig. 9: movements for pn � � 5 kpa, c � 150 kpa fig. 
10: movements for pn � � 10 kpa, c � 150 kpa of particles; the right hand side may be influenced by the boundary conditions (in the numerical model obvious cracking occurs, i.e., a drop in the stresses has to be expected). the second example shows the influence of gas emission in the coal seam. the boundary of the gas blast is about 18 m from the face of opening. the effect of two magnitudes of the blast on the face stability is studied by the free hexagon method and then compared. figs. 15 and 17 show that 1 mpa of gas causes local disturbances in the vicinity of the gas emission region, but that there is almost no influence on the face. more interesting is the case of magnitude 10 mpa. fig. 16 shows that the most damaging blast causes even vertical disconnection in a large band in the upper part of the coal seam. this case can probably be estimated to be near the occurrence of the bumps. © czech technical university publishing house http://ctn.cvut.cz/ap/ 63 acta polytechnica vol. 42 no. 4/2002 fig. 11: movements for pn � � 3 kpa, c � 300 kpa fig. 13: comparison experiment – computation for loading 2.5 mpa fig. 12: relation pn � and c for occurrence of bumps, fixed e and g 64 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 42 no. 4/2002 fig. 14: setting of the hexagonal elements and the domain of gas emission fig. 15: movements for pn � � 10 kpa, c � 150 kpa, gas pressure � 1 mpa fig. 16: movements for pn � � 10 kpa, c � 150 kpa, gas pressure � 10 mpa © czech technical university publishing house http://ctn.cvut.cz/ap/ 65 acta polytechnica vol. 42 no. 4/2002 fig. 17: movements for pn � � 10 kpa, c � 250 kpa, gas pressure � 1 mpa fig. 18: movements for pn � � 10 kpa, c � 250 kpa, gas pressure � 10 mpa fig. 19: movements for pn � � 25 kpa, c � 250 kpa, gas pressure � 10 mpa the case described in fig. 18 is also very dangerous. it is interesting that the “tensile band” has moved, and that the band is narrower. the stability of the face is still under threat. figs. 19 and 20 show the influence of local “dilation” caused by the form of the particles. the displacements may be larger if the arrangement of the particles rotates by 90°. 3 conclusions this paper studies the behavior of open rock of the type that occurs during longwall mining in coalmines. the main problem is longwall instability and extrusion of the rock mass into an open space. this effect is mostly referred to as bumps or rock bursts. in order for bumps to occur, the rock has to possess certain special properties. in this study, we consider a brittle or almost brittle coal seam, which is treated theoretically, and experiments that have been carried out in the past are taken for comparison. the results from the two models seem to be reasonable. a combination of experimental and mathematical models appears promising for the study of similar problems. both methods allow the problems to be studied as time dependent. they make it possible to develop cracks during bump initiation, and therefore describe the problem in a way that is very close to reality. the free hexagonal element method (the principal method) is applied as a numerical tool, and the static particle flow code is a numerical method serving for comparison of the results from the two methods. the stiffness of the hexagons is created by the bem. the generalized hooke’s law is used, involving eigenparameters, which can represent various material phenomena. 
each element is considered to behave elastically (or rigidly in static pfc), and the contact conditions make the problem nonlinear. using an iterative procedure, a very fast solution is obtained. if the solution converges, the process of deterioration of the coal seam and the rock mass can be observed and evaluated. two examples are presented. one starts with accumulation of energy due to external loading at a great depth in a mine, while the second example considers gas emission in the coal seam. the system of extrusion of coal particles describes the way they behave and the reasons, which locally affect the movements of the coal, or sometimes also the rock particles. the gas emission can be simulated by the eigenparameters, or by forces known as eshelby’s forces. in this paper eshelby’s forces are mainly used, although eigenparameters may also serve as an advantageous tool for simulating blasts. as the process of cracking is influenced by contact conditions rather than by springs, static pfc is less reliable then the hexagon method. this assertion is verified by comparing the figures. because of its simplicity in comparison to the free hexagonal element method, static pfc can be more applicable than hexagons in dynamic problems. experience with classical pfc and its static version indicates that dynamic equilibrium applied to the static pfc concept is more reliable than classical pfc, where only a narrow range of material parameters delivers reasonable results. if we deviate from this range, the results are very unpredictable, and a concrete structure can burst out in space, for example. acknowledgment this research was supported by grant agency of the czech republic grant number 103/00/0530, and by ministry of education of the czech republic, project msm:210000001,3. references [1] brebbia, c. a., telles, j. f. c., wrobel, l. c.: boundary element techniques. berlin: springer-verlag, 1984. [2] cundall, p. a.: a computer model for simulation of progressive large scale movements of blocky rock systems. symposium of the international society of rock mechanics, 1971, p. 132–150. [3] dvorak, g. j., procházka, p. p.: thick-walled composite cylinders with optimal fiber prestress. composites, part b, vol. 27b, 1996, p. 643-649. [4] eshelby, j. d.: the determination of the elastic field of an ellipsoidal inclusion, and related problems. jmps vol. 11, 1963, p. 376–396. 66 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 42 no. 4/2002 fig. 20: movements for pn � � 40 kpa, c � 250 kpa, gas pressure � 10 mpa [5] haramy, k. y., morgan, t. a., de waele, r. e.: a method for estimating western coal strengths from point load tests on irregular bumps. 2nd conf. on ground control in mining, west virginia university, july 19–21, 1982, p. 123–136. [6] haramy, k. y., magers, j. a., mc donnell, j. p.: mining under strong roof. 7th int. conf. on ground control in mining, denver (usa): bureau of mines, 1992, p. 179–194. [7] harami, k. y., brady, b. t.: a methodology to determine in situ rock mass failure. internal report of bureau of mines, denver (usa): co, 1995. [8] kuch, r., lippmann, h., zhang, j.: simulating coal mine bumps with model material. gibowitz & lasocki (eds.), balkema, rotterdam: rockbursts and seismicity in mines, 1997, p. 23–25. [9] procházka, p. p., kugblenu, m.: certain discrete element methods in problems of fracture mechanics problems. acta polytechnica, vol 42, no. 4, p. 42–50. [10] procházka, p. 
p., trčková, j.: coupled modeling of concrete tunnel lining. our world in concrete and structures, singapore, (j. tan ed.), 2000, p. 215–224. [11] silva, m. j., gibson, l. j.: the effect of non-periodic microstructure and defects on the compressive strength of two-dimensional cellular solids. int. j. of mech. sci. vol. 39, no. 5, 1997, p. 549-563. [12] stilborg, b., stephensson, o., swan, g.: three-dimensional physical model technology applicable to the scaling of underground structures. 40th int. conf. on rock mech. vol. 2, montreux, 1979, p. 655–662. [13] ujihara, m., higuchi, k., nabeya, h.: scale model studies and theoretical considerations on the mechanism of coal and gas outbursts. int. conf. outbursts, sydney, 1986, p. 121–128. [14] vacek, j., procházka, p. p.: rock bumps occurrence during mining. cmem, (c. a. brebbia ed.), alicante (spain): 2001, p. 125–134. prof. ing. rndr. petr p. procházka, drsc. phone: +420 224 354 480 fax: +420 224 310 775 e-mail: petrp@fsv.cvut.cz ing. michael g. kugblenu dept. of structural mechanics ctu in prague faculty of civil engineering thákurova 7 166 29 prague 6, czech republic © czech technical university publishing house http://ctn.cvut.cz/ap/ 67 acta polytechnica vol. 42 no. 4/2002 acta polytechnica doi:10.14311/ap.2017.57.0089 acta polytechnica 57(2):89–96, 2017 © czech technical university in prague, 2017 available online at http://ojs.cvut.cz/ojs/index.php/ap use of disc springs in a pellet fuel machine ebubekı̇r can güneş, i̇smet çelı̇k∗ mechanical engineering department, dumlupınar university, kütahya 43270, turkey ∗ corresponding author: ismet.celik@dpu.edu.tr abstract. the use of biomass fuel pellets is becoming widespread as a renewable and environmentfriendly energy. pellet fuels are produced in various pellet machine types. pellet machines encounter problems such as pressure irregularities and choke at the initial start for different kinds of biomass feedstock. in this study, disc springs are integrated into a vertical axis pellet machine for a pressure regulation and design optimization. force-deformation and stress-deformation relations of disc springs are investigated using analytical and finite element methods. pelletizing pressures were calculated based on disc spring force values using the hertzian stress formula. utilized disc springs ensured the pressure regulation, production efficiency increase and damage prevention on the die-roller mechanism. keywords: biomass pellet; pellet machine; disc spring; almen–laszlo; fea; pelletizing pressure. 1. introduction biomass is one of the alternative and renewable energy sources in today’s world where fossil fuel sources rapidly decrease and environmental effects increase. biomass is defined as a renewable energy source that consists of organic materials as wood, plant, manure and sewage sludge residues and has a capacity to be utilized as fuel [1]. carbon dioxide emitted during biomass combustion is absorbed by green plants and is used in photosynthesis and its natural cycle is maintained [2]. moreover, in contrast to fossil fuels such as coal, its nitrous and sulphur based gas emission is very low. thus, biomass is considered as a valuable energy source today. biomass needs to be processed to be utilized as solid fuel. this process is listed in three main stages: dehumidification/humidification, grinding, and pressing. pressing stage has a major importance, because it is the last step of the process, during which the figure 1. (a) briquetting press [16] and briquettes. 
(b) pelletizing press and pellets. final product is obtained. there are two types of common solid biomass pressing techniques briquetting and pelletizing. for briquetting technique, fuel pieces produced in various dimensions and geometries, such as prismatic, cylindrical along a continuous line, spherical and pillow-shaped, are also bigger than pelletized end-products. in pelletizing technique, only compressed cylindrical pieces are produced in particular lengths, which is more common and efficient than briquetting. examples of briquetting and pelletizing presses and end-products are shown in figure 1. pelletizing machines work on the basis of pressing grinded biomass through a perforated die by the compression of rotating cylindrical rollers (commonly 2 or 3). pellet presses are classified into two groups (ring die press, disc die pellet press) based on the type of the dies, as shown in figure 2. both types have cylindrical channels on dies and have rotating rollers on them. also, it is possible to classify pellet presses figure 2. (i) disc die (vertical axis) pellet press; (ii) ring die (horizontal axis) pellet press; (a) rollers, (b) die, (c) pellet outlet. 89 http://dx.doi.org/10.14311/ap.2017.57.0089 http://ojs.cvut.cz/ojs/index.php/ap ebubekı̇r can güneş, i̇smet çelı̇k acta polytechnica according to the die/roller motion. for the first type, the die rotates and rollers remain in a constant position and merely rotate on their axis. for the latter, the die remains in constant position and both of the rollers rotate on their axis and move around the die. in pellet presses, various kinds of feedstock can be processed and pelletized. each biomass kind has its own pelletizing parameters. these parameters are feedstock moisture, grinded particle size, pelletizing pressure and binder addition. working with precise parameters is vital for obtaining durable and combustion efficient pellets. for instance, when the pressure and moisture of the feedstock is too low, biomass will not be able to be pelletized, or the produced pellets will be too loose and non-durable. there are many studies on pelletizing various biomass feedstocks. some of the current studies are summarized below. serrano et al. [4], have examined the effects of the feedstock moisture ratio, particle size and pine sawdust addition on barley straw pellets. for dense barley straw pellets, optimum feedstock moisture rate is indicated in the range of 19–23 %, and within this interval, the obtained pellet durability was 95.5 %. adding a pine sawdust at a rate of 2, 7 and 12 % improved the pellet durability up to 97–98 %. their study has unveiled that addition of a woody biomass to a herbaceous biomass feedstock can improve the pellet quality. gilbert et al. [5] have studied the effects of pelletizing pressure and temperature on the pellet density, mechanical strength and durability using a switchgrass biomass. the effects of temperature on the pellet density were observed only between 14 °c and 50 °c. increasing pelletizing pressure from 55 to 552 bar enhanced the pellet density and durability. mani et al. [6] have observed the effects of the particle size reduction, compressive force and moisture ratio on pelletization of wheat straw, barley straw, corn stover and switchgrass at 100 °c using a laboratory scale pellet press. except for wheat straw, with a reducing particle size, denser pellets were obtained. however, the moisture ratio and compressive force are more effective on pellet density. 
increasing the moisture ratio from 12 to 15 % and increase of the compressive force have resulted in denser pellets. relova et al. [7] have revealed that the pelletizing pressure is the most important factor affecting the pellets’ drop resistance by using statistical data from 32 kinds of pine sawdust pelletization. moisture ratio is less effective than the pelletizing pressure and the particle size is less effective than the moisture ratio on determining the pellets’ drop resistance. larsson et al. [8] have examined the effects of the moisture ratio, bulk density after pre-compaction, steam addition and die temperature on pelletizing of reed canarygrass. the factor that has the biggest impact on the durability and density of pellets was found out to be the moisture ratio. using binders in pelletizing process provides a durability and resistance to outer factors. various organic and inorganic binders are preferred in pelletizing processes. binders are used necessarily, or sometimes optionally, to maintain a superior pellet quality. kong et al. [9] have pelletized a spruce sawdust under 6 mpa pressure and 25 °c using binders of lignin, ca(oh)2, naoh, cacl2, and cao. among these binders, combination of 5 % ca(oh)2 + 10 % lignin by mass increased the pellets’ water resistance. furthermore, catalysis of hydroxide, contained within ca(oh)2, provided polymer chain growth into three-dimensional cross-linking that strengthened the bonds in pellet microstructure. puig–arnavat et al. [10] have pelletized six different biomasses using a single pellet press. for all types of biomass, the optimum moisture ratio they determined for pelletizing is 10 %. higher moisture ratios have deteriorated pellet quality. they observed that the friction in the die increases when the die temperature increases from the room temperature to 60–90 °c. in higher temperatures, friction decreases. biomass is subjected to a pre-treatment before pelletizing, which, in some studies, consists of heating up to 200–300 °c in inert gas atmosphere. thus, low moisture, volatile-free and high energy density biomass is obtained. also pre-treated biomass has a better grinding performance [12, 13]. larsson et al. [14] have pre-heated spruce sawdust to high temperatures (270– 300 °c) and pelletized it in a small scale pellet press. produced pellets had comparable bulk densities (630– 710 kg/m3), but lower pellet durability (80–90 %) in comparison to conventional pellets. also, the energy consumption for pelletizing was 100 % higher. it was observed that pellet production speed is positively correlated with the die temperature. the feedstock moisture rate was not significantly effective on pellet quality and the production speed; however, adding water to the feedstock worsened the flow properties and did not provide a benefit. the same effect was also observed in coal powder by abou–chakra et al. [15]. water addition to coal powder decreases flowability while free water particles increases this impact. different biomass kinds have their own pelletizing parameters due to their characteristic bulk/particle density and microstructure. for instance, the woody biomass has a higher content of the lignin component than the herbaceous biomass. lignin is a vital constituent providing inner pellet bonding. the biomass feedstock, which is pressed in die channel layer by layer, has a frictional movement relation with the wall of the die channel. 
depending on temperature, every kind of biomass forms its own pelletizing characteristic in a frictional material-machine relation. this is shown in figure 3, the effective forces f1, f2, and f3 on the biomass under pelletizing pressure in a cylindrical channel. kováčová et al. [16] have derived pressure relation equations of a pressed biomass through a cylindrical 90 vol. 57 no. 2/2017 use of disc springs in a pellet fuel machine figure 3. effective forces on biomass in a die channel [16]. die channel based on the force equation in figure 3. pa(l) = pape− 4µλ d l, (1) pap(l) = pae+ 4µλ d l. (2) in equations (1) and (2), symbols are defined as µ — friction coefficient between the biomass and die, λ — radial/axial pressure ratio on biomass, d — hole diameter, l — die length, pa — outlet pressure, pap — internal pressure. in the equations, d/l parameter, which is dependent on die geometry, stands out. choosing a convenient die according to the biomass type is important. friction force in die channel is a temperature dependent parameter, which is determined by stelte et al. [17]. they have observed that when the temperature increases the pelletizing pressure drops. in high temperatures, they identified hydrophobic extractives on the pellet surface by monitoring the infrared spectrum. these extractives act as lubricants between the biomass and die and reduce the needed pelletizing pressure. one of the common faced problems in pellet production is the choking of the machine. this condition is instructed in many pellet machine manufacturers’ handbooks. for instance, an animal feed pellet machine manufacturer company gives a rule of thumb about the moisture ratio being 18 % and this is accepted as a limit value for the choking of the machine. moisture ratios greater than this value cause the machine to choke. particularly, moisture addition to the herbaceous biomass increases pelletizing temperature to 20 °c per 1 % increase [18]. naturally high moisture containing biomass types may not reach required pelletizing temperature in pellet press due to the effect of heat absorption of high moisture. in pellet fuel literature, there are numerous studies conducted about the pelletizing pressure, temperature, moisture ratio, feedstock species, particle size, heat values and environmental effects. one of the most important problems for commercial presses used in the pellet fuel production is the pressure irregularity and consequent choke. in this study, a pressure regulation of vertical axis pellet machines was performed by utilizing disc springs inside them. an improved design of pellet fuel machines for an effective and quality pellet production was established. 2. material and method in this study, a pellet press driven by a 35 kw power and with a capacity of 400 kg/h production rate was employed. pressure regulation of the pellet press was maintained by integrating disc springs into the fixed dierotating rollers system. disc springs are able to bare high loads with small deformations. their simple and modular geometry promotes their use in many areas where a damping of high forces is necessary. a novel usage of this element was unveiled in this study. optimum conditions for pellet production are obtained after while after the first start, especially when the die and related components reach a particular temperature. this is because; the friction between biomass and die lowers and enters a regular regime at high temperatures. 
furthermore, components in internal structure of biomass activate with high temperatures for bonding mechanism. the moisture extracted from biomass at high temperatures reduces the friction between the biomass and die [17]. also, lignin, a main constituent of woody/herbaceous biomass, gains flowability above the glass-transition temperature, thus maintains internal bonds for mechanically durable pellets [5]. it is known that lignin polymer’s glasstransition temperature is 65–75 °c at 8 % moisture ratio. after this temperature, lignin in biomass softens and its binding effect increases [19]. pellet presses are exposed to an overloading at initial start of the system. this may cause choking of the system at the beginning of the process or a high figure 4. (a) broken roller teeth. (b) deformed washers that got into the press. 91 ebubekı̇r can güneş, i̇smet çelı̇k acta polytechnica figure 5. integrating disc springs to pellet press. figure 6. disc springs: (a) single, (b) parallel, (c) in series. consumption of energy. the biomass type is an important factor in determining the system’s operating pressure. for high-hardness valued biomass, friction between the feedstock and die wall is high, thus the system overloads and may choke at first. the biomass may contain foreign matters that may escape when collected from its natural environment. impurities like small stones and metal pieces having high-hardness may damage the system by getting stuck between the roller and die (see figure 4). in this case, disc springs are deformed and system protects itself from irreversible damage. in our study, disc springs are utilized in pellet presses as shown in figure 5. this provides rollers to press on the die. disc springs are positioned under the tightening nuts and pre-deformed to maintain a constant contact and pre-load between the roller and die. in conventional pellet presses, the gap between the roller and die is set to 0.5–1 mm as a rule of thumb. this gap partially protects the system from unexpected pressure rises. disc springs constantly push down rollers on the die; however, in unexpected pressure rises (stone/metal gets into the machine, feedstock change, system choke) they are deformed and prevent the system from damage. in the case of impurities getting into the machine, system is to be stopped and cleaned so that the spring flexed system is not damaged. the pelletizing pressure required for different biomass species is automatically regulated by flexion of disc springs. in this particular study, pine tree residues (branch, needle, cone, bark etc.) that require high pelletizing pressure were pelletized. 2.1. disc spring selection and calculations disc springs’ outer diameters may vary between 8– 600 mm according to the din 2093 standard. custom disc springs can be manufactured in desired dimensions when needed. in this study, in accordance with press dimensions, a standard sized disc spring with 125 mm outer diameter and 71 mm inner diameter is used. in table 1, all parameters of the disc spring are demonstrated. depending on the operating conditions, disc springs may be used as a single piece, combined in parallel or in series (see figure 6). when stacked in series, the total deformation of springs increase and the load carrying capacity remains constant. accordingly, the deformation of springs remains constant, load carrying capacity is positively proportional with the spring number. the first and still-used calculation method for disc springs was developed in 1936 by almen and laszlo [20]. 
the first, and still widely used, calculation method for disc springs was developed in 1936 by almen and laszlo [20]. the almen–laszlo (a-l) formulation is adopted by many manufacturers and users, and is included in the din 2093 standard. according to the a-l method, the force–deformation relation for a disc spring is

F = \frac{4E}{1-\nu^2}\,\frac{\delta}{M D^2}\left[(h-\delta)\left(h-\frac{\delta}{2}\right)s + s^3\right].  (3)

the force needed to press the disc spring flat is the critical force

F_{cr} = \frac{4E}{1-\nu^2}\,\frac{h s^3}{M D^2}.  (4)

outer diameter   inner diameter   cone height   thickness   total height
D [mm]           d [mm]           h [mm]        s [mm]      h + s [mm]     D/d    h/s
125              71               2.9           8           10.9           1.76   0.3625

table 1. dimensions of the used disc spring and related parameters.

the compression stress at the inner edge of the spring is

\sigma_c = \frac{4E}{1-\nu^2}\,\frac{\delta}{M D^2}\left[M_1\left(h-\frac{\delta}{2}\right) + M_2 s\right],  (5)

and the tension stress at the outer edge of the spring is

\sigma_t = \frac{4E}{1-\nu^2}\,\frac{\delta}{M D^2}\left[M_1\left(h-\frac{\delta}{2}\right) - M_2 s\right],  (6)

where

M = \frac{6}{\pi \ln(D/d)}\left(\frac{D/d - 1}{D/d}\right)^2,  (7)

M_1 = \frac{6}{\pi \ln(D/d)}\left(\frac{D/d - 1}{\ln(D/d)} - 1\right),  (8)

M_2 = \frac{6}{\pi \ln(D/d)}\,\frac{D/d - 1}{2}.  (9)

calculations based on these formulae are compared with finite element analysis (fea); the results for the disc springs are given in the tables below. the calculations were made for deformations of 0.25, 0.5, 0.75 and 1 fold of the cone height and are presented in table 2. in practice, the disc spring deformation should not exceed 0.75 of the cone height; in our study it is assumed that this value is not exceeded. two parallel-stacked disc springs were used to maintain a sufficient pelletizing pressure on the die, and the calculations and analyses were made for this parallel stack.

deformation   δ [mm]   F (a-l) [kN]   F (fea) [kN]   dev.     σ_comp (a-l) [MPa]   σ_comp (fea) [MPa]   dev.
0.25h         0.725    74.16          85.74          13.5 %   881.78               977.04               9.7 %
0.50h         1.45     143.28         166.93         14.1 %   1707.31              1942.5               12.1 %
0.75h         2.175    209.03         264.04         20.8 %   2476.60              2976                 16.8 %
h             2.9      273.1          354.06         22.8 %   3189.62              3990                 20 %

table 2. a-l method and fea results for two parallel-stacked disc springs.

the fea was performed on scanned cad data of the disc springs. the material is modelled as a spring steel with a young's modulus of 206 gpa and a poisson ratio of 0.3. the mesh consists of 938 twenty-node hexahedral elements with 5695 nodes. the two disc springs are stacked in parallel, with a contact region defined between the convex surface of one spring and the concave surface of the other; the friction coefficient (µ) between the spring surfaces is set to 0.4 [21]. as boundary conditions, the displacement of the bottom circle of the lower spring is restricted in the vertical axis and left free in the horizontal axis. deformations of 0.25, 0.5, 0.75 and 1 fold of the cone height are imposed at the top circle of the upper disc. the force reactions at the top circle and the maximum equivalent von mises stress values are analysed and shown in table 2.

3. results and discussion
the analytical calculation (a-l method), the fea results and their comparison are given in table 2. the difference between the a-l and fea force values increases with the deformation ratio. the maximum stress forms on the inner edge of the disc spring's top surface, as seen in figure 7; for the inner-edge stress values, the difference likewise increases with deformation. based on these values, it can be concluded that the a-l method is more accurate at small deformations. this arises from neglecting the radial stresses and considering only the tangential stresses when forming the a-l equations; with increasing deformation, the radial stress in the disc spring becomes more significant, so at high deformations the a-l method gives more erroneous results.
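as a cross-check of equations (3)–(9), the following python sketch (our own, with illustrative function names) evaluates the a-l force and the inner-edge compression stress for the spring of table 1, using e = 206 gpa and ν = 0.3 as in the fea, and doubling the force for the two parallel-stacked springs; it reproduces the a-l columns of table 2 to within rounding:

import math

# almen-laszlo (a-l) disc-spring relations, eqs. (3)-(9) of the text.
# geometry of table 1; E and nu as in the fea model. names are illustrative.
E, NU = 206e3, 0.3            # young's modulus [MPa], poisson ratio
D, d_in = 125.0, 71.0         # outer/inner diameter [mm]
H, S = 2.9, 8.0               # cone height, thickness [mm]

C = D / d_in                  # diameter ratio D/d
LN = math.log(C)
M  = 6 / (math.pi * LN) * ((C - 1) / C) ** 2   # eq. (7)
M1 = 6 / (math.pi * LN) * ((C - 1) / LN - 1)   # eq. (8)
M2 = 6 / (math.pi * LN) * (C - 1) / 2          # eq. (9)
K  = 4 * E / (1 - NU ** 2)                     # common prefactor [MPa]

def force_kN(delta, n_parallel=2):
    """a-l force, eq. (3), for n_parallel springs stacked in parallel."""
    f = K * delta / (M * D**2) * ((H - delta) * (H - delta / 2) * S + S**3)
    return n_parallel * f / 1e3                # N -> kN

def sigma_c_MPa(delta):
    """inner-edge compression stress, eq. (5); the same for each spring."""
    return K * delta / (M * D**2) * (M1 * (H - delta / 2) + M2 * S)

for frac in (0.25, 0.50, 0.75, 1.0):
    delta = frac * H
    print(f"{frac:4.2f}h: F = {force_kN(delta):6.2f} kN, "
          f"sigma_c = {sigma_c_MPa(delta):7.2f} MPa")
# -> 74.16 kN / 881.78 MPa at 0.25h up to 273.1 kN / 3189.6 MPa at h,
#    matching the a-l columns of table 2.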
friction forces occur in the contact areas of the disc springs when they are used in parallel. thus, under frictional conditions the springs bear higher loads than under frictionless conditions at the same deformation value. in table 3, the two conditions are compared in terms of force and stress values; the friction coefficient (µ) of 0.4 is taken from ozaki et al. [21], who used it in a similar study. the force values deviate by 3–9 %, and the stress values by 5–8 %.

deformation   δ [mm]   F frictionless [kN]   F frictional [kN]   dev.    σ_comp frictionless [MPa]   σ_comp frictional [MPa]   dev.
0.25h         0.725    81                    85.74               5.8 %   926.89                      977.04                    5.4 %
0.50h         1.45     162                   166.93              3 %     1853.8                      1942.5                    4.8 %
0.75h         2.175    243                   264.04              8.6 %   2780.7                      2976                      7 %
h             2.9      324                   354.06              9.3 %   3707.6                      3990                      7.6 %

table 3. frictional/frictionless fea results for two parallel-stacked disc springs.

figure 7. fea stress distribution on two parallel-stacked disc springs (for 0.75h deformation).

3.1. pelletizing pressure calculation via hertzian stress
the pressure occurring when a cylindrical surface contacts a planar surface can be calculated with hertzian contact formulas. as seen in figure 8, a cylinder-on-plane contact model is considered for the roller–die pair. the hertzian stress is calculated from the material properties and dimensions of the contacting surfaces and the applied force [22]:

b = \sqrt{\frac{2F}{\pi l}\,\frac{(1-\nu_1^2)/E_1 + (1-\nu_2^2)/E_2}{1/d_1 + 1/d_2}},  (10)

p_{max} = \frac{2F}{\pi b l}.  (11)

figure 8. hertzian stress between the die and the roller, and the defining parameters: b — contact surface half-width, d1 — cylinder diameter, l — cylinder length, F — applied force.

the material properties are E1 = E2 = 210 GPa and ν1 = ν2 = 0.3, and the roller dimensions are d1 = 145 mm and l = 110 mm (with 1/d2 = 0 for the planar die). equation (11) then reduces to

p_{max} = 2.146 \sqrt{F}  (12)

(p_max in MPa for F in N). the disc springs work in the deformation interval 0.25h–0.75h. there are two rollers in the pelletizing system, so the force applied by the disc springs is divided in two. the hertzian stresses at the limits of this deformation interval are

p_{max} = 444.33 MPa for 0.25h,  (13)

p_{max} = 779.74 MPa for 0.75h.  (14)

the calculated pressure is the maximum value of the stress formed between the die and the roller. as a practical, easily applicable approach, the pelletizing pressure of the biomass may be taken as half of this value; the pelletizing pressure is thus between 220–390 mpa in the operating range of the disc springs.
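the hertzian estimate of equations (10)–(14) can be reproduced with a few lines (a sketch under the stated assumptions: steel on steel, cylinder on plane with 1/d2 = 0, and the disc-spring force shared equally by the two rollers; the helper name is ours):

import math

# hertzian line contact of a roller (cylinder) on the flat die (plane),
# eqs. (10)-(12); steel on steel, E1 = E2 = 210 GPa, nu = 0.3.
E, NU = 210e3, 0.3        # MPa, -
D1, L = 145.0, 110.0      # roller diameter and length [mm]

def p_max_MPa(f_spring_N, n_rollers=2):
    """maximum hertzian pressure under one roller; the disc-spring
    force is shared between the rollers."""
    f = f_spring_N / n_rollers
    # eq. (10); 1/d2 = 0 for the planar die surface
    b = math.sqrt(2 * f / (math.pi * L) * (2 * (1 - NU**2) / E) / (1 / D1))
    return 2 * f / (math.pi * b * L)           # eq. (11)

# fea spring forces at the working limits (table 2): 85.74 kN and 264.04 kN
print(p_max_MPa(85.74e3))     # ~444.3 MPa at 0.25h, cf. eq. (13)
print(p_max_MPa(264.04e3))    # ~779.7 MPa at 0.75h, cf. eq. (14)
print(p_max_MPa(85.74e3) / 2, p_max_MPa(264.04e3) / 2)  # ~222-390 MPa mean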
4. conclusion
although there are various studies on pellet production parameters (biomass kind, temperature, moisture) and on the final product quality, there is no significant study on the improvement of pellet mills, the prevention of damage, and the production process. regarding pellet presses, the most commonly encountered problems are choking, pressure irregularity, and determining the optimum gap between the die and the roller during the production process. in this study, these problems are addressed by integrating disc springs into the presses. the disc springs maintain the required compression on the rollers and also deflect adequately to protect the system from undesired damage by impurities (stone and metal particles etc.). in addition, the pelletizing pressure can be estimated from the hertzian pressure occurring between the roller and the die, which is calculated from the disc spring force. a 35 kw pellet press with a production capacity of 400 kg/h was used in this study. two disc springs with a 125 mm outer diameter, a 71 mm inner diameter, an 8 mm thickness and a 2.9 mm cone height were stacked in parallel and integrated into the pelletizing system. the commonly used analytical a-l method and the fea were used to analyse the force/stress-deformation relations of the disc springs. with increasing deformation, the differences between the a-l method and the fea results increase: the radial stresses, which are neglected in the a-l method, become more significant in the disc springs with increasing deformation, so the a-l method gives more accurate results for small deformations. the disc springs are preloaded by the tightening nuts, initially deforming to 0.25 of the cone height; under operating conditions, the maximum occurring deformation was assumed to be 0.75 of the cone height. for the 0.25h deformation, force values of 75 kn (a-l) and 85 kn (fea) were obtained; for the 0.75h deformation, 209 kn (a-l) and 264 kn (fea). the hertzian stress under the roller cylinders was calculated from the fea force values: during the pelletizing operation, the maximum pressure under each roller reaches approximately 444–780 mpa. for the mean pelletizing pressure, half of this value (222–390 mpa) may be taken as a practical, easily calculated estimate. the maximum stress forms on the inner edge of the top surface of the disc spring during compression; based on the fea, stresses of 977.04 mpa and 2976 mpa occurred for the 0.25h and 0.75h deformations, respectively. when the springs are stacked in parallel, friction forces occur between them; these forces help to damp vibrations during the machine operation. under frictional conditions, the springs must bear a greater load than under frictionless conditions to reach the same deformation value. a novel utilization of disc springs was realized in this study; in this way, common problems of pellet presses, such as choking, roller damage and pressure regulation, were eliminated.

list of symbols
F  force
p  pressure
σcomp.  compression stress
E  young's modulus
ν  poisson ratio
δ  deformation
D  outer diameter of disc spring
d  inner diameter of disc spring
s  thickness
M, M1, M2  auxiliary coefficients

acknowledgements
this work has been made in collaboration with the republic of turkey ministry of science, industry and technology (project 0823.tgsd.2015) and the scientific research projects organization (bap) of dumlupınar university, turkey (project 2012/28).

references
[1] chiew, y.l., iwata, t., shimada, s.: system analysis for effective use of palm oil waste as energy resources, biomass and bioenergy, 35, 2011, p. 2925-2935. doi:10.1016/j.biombioe.2011.03.027
[2] jones, j.c.: biomass-peat blends and carbon neutrality, fuel, 158, 2015, p. 1016. doi:10.1016/j.fuel.2015.06.005
[3] herbert, g.m., krishnan, a.u.: quantifying environmental performance of biomass energy, renewable and sustainable energy reviews, 59, 2016, p. 292-308. doi:10.1016/j.rser.2015.12.254
[4] serrano, c., monedero, e., lapuerta, m., portero, h.: effect of moisture content, particle size and pine addition on quality parameters of barley straw pellets, fuel processing technology, 92, 2013, p. 699-706. doi:10.1016/j.fuproc.2010.11.031
[5] gilbert, p., ryu, c., sharifi, v., swithenbank, j.: effect of process parameters on pelletisation of herbaceous crops, fuel, 88, 2009, p. 1491-1497. doi:10.1016/j.fuel.2009.03.015
[6] mani, s., tabil, l.g., sokhansanj, s.: effects of compressive force, particle size and moisture content on mechanical properties of biomass pellets from grasses, biomass and bioenergy, 30, 2006, p. 648-654. doi:10.1016/j.biombioe.2005.01.004
[7] relova, i., vignote, s., leon, m.a., ambrosio, y.: optimisation of the manufacturing variables of sawdust pellets from the bark of pinus caribaea morelet: particle size, moisture and pressure, biomass and bioenergy, 33, 2009, p. 1351-1357. doi:10.1016/j.biombioe.2009.05.005
[8] larsson, s.h., thyrel, m., geladi, p., lestander, p.a.: high quality biofuel pellet production from pre-compacted low density raw materials, bioresource technology, 99, 2008, p. 7176-7182. doi:10.1016/j.biortech.2007.12.065
[9] kong, l., tian, s., li, z., luo, r., chen, d., tu, y., xiong, y.: conversion of recycled sawdust into high hhv and low nox emission bio-char pellets using lignin and calcium hydroxide blended binders, renewable energy, 60, 2013, p. 559-565. doi:10.1016/j.renene.2013.06.004
[10] puig-arnavat, m., shang, l., sarossy, z., ahrenfeldt, j., henriksen, u.b.: from a single pellet press to a bench scale pellet mill - pelletizing six different biomass feedstocks, fuel processing technology, 142, 2016, p. 27-33. doi:10.1016/j.fuproc.2015.09.022
[11] tumuluru, j.s.: effect of process variables on the density and durability of the pellets made from high moisture corn stover, biosystems engineering, 119, 2014, p. 44-57. doi:10.1016/j.biosystemseng.2013.11.012
[12] bridgeman, t.g., jones, j.m., shield, i., williams, p.t.: torrefaction of reed canary grass, wheat straw and willow to enhance solid fuel qualities and combustion properties, fuel, 87, 2008, p. 844-856. doi:10.1016/j.fuel.2007.05.041
[13] prins, m.j., ptasinski, k.j., janssen, f.j.j.g.: more efficient biomass gasification via torrefaction, energy, 31, 2006, p. 3458-3470. doi:10.1016/j.fuproc.2014.11.018
[14] larsson, s.h., rudolfsson, m., nordwaeger, m., olofsson, i., samuelsson, r.: effects of moisture content, torrefaction temperature, and die temperature in pilot scale pelletizing of torrefied norway spruce, applied energy, 102, 2013, p. 827-832. doi:10.1016/j.apenergy.2012.08.046
[15] abou-chakra, h., tüzün, u.: microstructural blending of coal to enhance flowability, powder technology, 111, 2000, p. 200-209. doi:10.1016/s0032-5910(99)00285-5
[16] kováčová, m., matus, m., krizan, p., beniak, j.: design theory for the pressing chamber in the solid biofuel production process, acta polytechnica, 54(1), 2014, p. 28-34. doi:10.14311/ap.2014.54.0028
[17] stelte, w., holm, j.k., sanadi, a.r., barsberg, s., ahrenfeldt, j., henriksen, u.b.: fuel pellets from biomass: the importance of the pelletizing pressure and its dependency on the processing conditions, fuel, 90, 2011, p. 3285-3290. doi:10.1016/j.fuel.2011.05.011
[18] california pellet mill, http://www.cpm.net [accessed 2016-10-18].
[19] stelte, w., clemons, c., holm, j.k., ahrenfeldt, j., henriksen, u.b., sanadi, a.r.: thermal transitions of the amorphous polymers in wheat straw, industrial crops and products, 34, 2011, p. 1053-1056. doi:10.1016/j.indcrop.2011.03.014
[20] almen, j.o., laszlo, a.: the uniform-section disk spring, transactions of the asme, research papers, 58-10, 1936, p. 305-314.
[21] ozaki, s., tsuda, k., tominaga, j.: analyses of static and dynamic behavior of coned disk springs: effects of friction boundaries, thin-walled structures, 59, 2012, p. 132-143. doi:10.1016/j.tws.2012.06.001
[22] budynas, r.g., nisbett, j.k.: shigley's mechanical engineering design, mcgraw-hill, 2008, p. 118-119. isbn: 0-390-76487-6

acta polytechnica 54(3):248–253, 2014. doi:10.14311/ap.2014.54.0248. © czech technical university in prague, 2014. available online at http://ojs.cvut.cz/ojs/index.php/ap

low-mass binary x-ray sources: monitoring with various satellites

vojtěch šimon (a) astronomical institute, academy of sciences of the czech republic, 25165 ondřejov, czech republic; (b) czech technical university in prague, fel, prague, czech republic; correspondence: simon@asu.cas.cz

abstract. we show the importance of observing low-mass x-ray binary (lmxb) sources with the x-ray monitors onboard satellites. this enables us to study the physical processes governing the long-term activity of lmxbs. they are excellent targets for monitoring because of their strong activity. we recall that various physical processes operating in lmxbs produce specific large-amplitude variations of x-ray luminosity, which can be investigated even in a single-band x-ray light curve as often provided by the monitors. we also emphasize the role of the spectral region of the x-ray monitor. the choice of the spectral region often strongly influences the profile of the observed intensity curve, because different emission components may dominate in different x-ray energy ranges. we show several examples of lmxbs which undergo various types of unpredictable activity (e.g. outbursts, superorbital cycles).

keywords: radiation mechanisms, accretion, accretion disks, x-rays: binaries, x-ray monitors, x-ray detectors.

1. introduction
low-mass x-ray binaries (lmxbs) are systems in which matter flows from a companion star, the so-called donor (often a low-mass, main-sequence star), onto a compact object which acts as an accretor. this object is either a neutron star (ns) or a black hole (bh). the release of gravitational energy during the accretion of matter from the donor onto the compact object is often the dominant energy source of a lmxb. in such a system, the accretion occurs in a disk embedding the compact object. a review can be found e.g. in [1]. most x-rays come from the close vicinity of the compact object in a lmxb. the inner disk region emits thermal radiation, which contributes to the soft x-ray emission output (with energies e up to several kev). a comptonizing cloud around the compact object produces emission via inverse compton scattering.
this component dominates especially in hard x-rays (with e even larger than 10 kev). these systems often display strong activity on various timescales, from a very small fraction of a second to many years (e.g. [1]).

2. the importance of monitoring
the activity of lmxbs in the x-ray band on timescales of years and decades is little explored, partly because of the necessity to conduct the monitoring in the x-ray band. we discuss the important role of monitors of x-ray emission in studying the physical processes operating in lmxbs. lmxbs are excellent targets for monitoring: systems of this type display various kinds of long-term activity (e.g. [2]). we stress the importance of long-term coverage in the x-ray band. occasional pointings in any spectral band are not enough, because many pieces of information on the time evolution of the studied object are lost by this strategy. in addition, the time allocation has to be justified, because a search for unexpected behavior of the object is usually not approved. moreover, determining a comprehensive picture of the processes operating in a given lmxb (or a group of such systems) requires an analysis of an ensemble of events, even in the same system. we will discuss the activity of lmxbs in various x-ray bands and how monitoring helps.

so-called lmxb transients are those for which the intensity of their emission increases even by several orders of magnitude from time to time. wide-field monitoring of the sky is necessary, because most transients are discovered only by the first detection of their outburst. such outbursts of lmxbs are unpredictable; only the mean recurrence time (cycle-length) tc of these events can be determined, and a long (years or, preferably, decades) series of observations is necessary even for this basic determination. this activity is interpreted by the thermal-viscous instability of the accretion disk [3]. in this case, the time-averaged mass transfer rate from the donor to the compact accretor lies between certain limits. the mass accumulates in the outer regions of the disk during quiescence. when a critical mass density of the disk is reached, strong accretion of matter from the disk onto the central compact object triggers the outburst, hence a fast increase of the x-ray intensity [3].

some other lmxbs are (quasi)persistent x-ray sources, which means that they are detectable repeatedly, possibly always. some of them can display transitions between the high and low states of x-ray intensity, which is probably caused by a transient decrease of the mass transfer rate. these transitions are usually fast (several days) and unpredictable. even when these lmxbs reside in the high state (i.e. in a state of high intensity), their x-ray intensity is not quite constant; it often displays significant fluctuations. they may also display cyclic (superorbital) changes of luminosity on a timescale of weeks and months, interpreted in terms of precession of the accretion disk (e.g. [4]).

2.1. remarkable x-ray monitors
the all sky monitor (asm) operated onboard the nasa rossi x-ray timing explorer (rxte) [5] (http://xte.mit.edu/) between 1996 and 2012. this monitor consisted of three wide-angle shadow cameras equipped with proportional counters, each with a 6 × 90 degree field of view. the collecting area was 90 cm^2. the detector was a position-sensitive xenon proportional counter.
the spatial resolution was 3 × 15 arcmin, and the monitor covered about 80 percent of the sky every 90 min. the sensitivity was about 13 mcrab for the one-day means of the observations; such averaging of the data is often used in analyses of the observations from this instrument to increase the signal-to-noise ratio. asm/rxte was an excellent monitor for soft x-rays. the data contain the sum-band intensities isum in the 1.5–12 kev band and the intensities in three sub-bands: ia (1.5–3 kev), ib (3–5 kev), ic (5–12 kev). this enables us to calculate the hardness ratios hr1 = ib/ia and hr2 = ic/ib.

the burst alert telescope (bat) onboard the nasa satellite swift has been operating since 2004 [6, 7]. it is equipped with a coded mask; its detecting area is 5200 cm^2 and its field of view is 1.4 sr (partially coded). this is a monitor for very hard x-rays: its energy range is 15–150 kev, although the 15–50 kev band is the one used for monitoring x-ray sources. it is not divided into sub-bands.

2.2. data from x-ray monitors
what can we expect from the data from x-ray monitors? when a lmxb is in its high state of activity (e.g. in outburst), the x-ray spectrum usually displays the biggest intensity in the soft x-ray band (e about 2 kev) (e.g. [8]), and the intensity decreases steeply with growing e. it is therefore desirable to construct monitors observing the soft x-ray band. the monitors often work with a single band (typically in soft x-rays, with e of a few kev); this is suitable for utilizing the spectral band in which the emission of lmxbs is most intense during the states when they can be detected by these instruments. the famous asm/rxte monitor even provided observations in three bands (sect. 2.1). dividing the observed spectral region into several bands (e.g. those used in asm/rxte), or the simultaneous usage of various monitors (e.g. asm/rxte along with bat/swift), helps to distinguish between the various processes and spectral components influencing the x-ray luminosity. in addition, the absorption of x-rays can be measured in the asm/rxte data by the simultaneous use of the intensities in band a, band b, and band c; absorption is predicted to influence mostly the softest asm band [9].

various physical processes operating in lmxbs produce specific large-amplitude variations of x-ray luminosity on timescales of days, weeks, even years and decades (e.g. [2]). the characteristic features of such activity (e.g. outbursts, high/low state transitions, superorbital cycles) can be investigated even in a single-band x-ray light curve of a lmxb. some model predictions of the features of the long-term activity are already available; for example, the properties of the basic outburst light curves in soft x-rays, tc, and the dependence of the outburst profile on the irradiation of the disk by x-rays from the vicinity of the accretor were modeled by [3].
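as a simple illustration of how such multi-band data are typically used (synthetic numbers; the real asm data products have their own formats and error treatment), the sketch below forms one-day means to raise the signal-to-noise ratio and then computes the hardness ratios hr1 = ib/ia and hr2 = ic/ib:

import numpy as np

# sketch: one-day averaging of dwell-by-dwell band intensities, then
# hardness ratios hr1 = ib/ia and hr2 = ic/ib. synthetic data only.
rng = np.random.default_rng(1)
t = np.sort(rng.uniform(0, 30, 500))       # observation times [days]
ia = rng.normal(1.0, 0.3, t.size)          # 1.5-3 keV band [counts/s]
ib = rng.normal(1.5, 0.3, t.size)          # 3-5 keV band
ic = rng.normal(2.0, 0.3, t.size)          # 5-12 keV band

days = np.floor(t).astype(int)
ia_d = np.array([ia[days == d].mean() for d in np.unique(days)])
ib_d = np.array([ib[days == d].mean() for d in np.unique(days)])
ic_d = np.array([ic[days == d].mean() for d in np.unique(days)])

hr1, hr2 = ib_d / ia_d, ic_d / ib_d        # hardness ratios per day
print(hr1.mean(), hr2.mean())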
2.2.1. aql x-1
the soft x-ray transient aql x-1 spends most of its time in quiescence, in which its intensity is too low to be detected by the monitors. from time to time (after roughly 200 days), this system switches to an outburst, during which it can rapidly become a bright x-ray source for several weeks. part of its asm/rxte light curve is displayed in fig. 1a. the simultaneous evolution of its intensity in the bat/swift data (fig. 1b) shows that most outbursts are detectable also in the very hard x-ray band. figs. 1a and 1b are thus an example of the fact that both the profile and the intensity of a given outburst in the same lmxb sometimes differ very considerably with the spectral region used by the monitor. it is therefore risky to compare the profiles of outbursts obtained by various monitors. this strengthens the importance of constructing a monitor, and its satellite, which can operate in orbit as long as possible, to provide a reasonable ensemble of events (e.g. outbursts) suitable for an analysis.

the outbursts in aql x-1 (and in similar soft x-ray transients) are discrete and separated from each other, and their peaks can often be resolved easily. it is thus possible to apply the method of o–c residuals ('o' simply stands for observed and 'c' for calculated) to tc of the outbursts. the o–c method was successfully applied in analyzing outbursts, e.g. by [10, 11]. it enables us not only to determine tc, but also to analyze its variations. it works with the residuals from some reference period (i.e. with the deviations from a constant period). the o–c curve also enables us to assess the position of each outburst with respect to the o–c profile of the remaining outbursts. this method can work even if some outbursts are missing due to gaps in the data, provided that the profile of the o–c curve is not too complicated. it is known that tc of soft x-ray transients like aql x-1 can vary by a large amount. nevertheless, in many cases the changes of tc are not chaotic, and well-defined trends can be resolved in the o–c diagrams. the evolution of tc of the outbursts in aql x-1 is shown in fig. 1c.

figure 1. a) part of the asm/rxte light curve of the soft x-ray transient aql x-1; the outbursts are marked by arrows. b) simultaneous evolution of its intensity in the bat/swift data. c) o–c diagram for the recurrence time tc of the outbursts in aql x-1; diamonds represent the values determined from observations by bat, circles represent observations by other instruments (often using asm/rxte). see sect. 2.2.1 for details. (this figure is available in color in electronic form.)

since these outbursts are only cyclic, monitoring is necessary for detecting them, especially their steep rising branch. although the profiles, and hence the peaks, of a given outburst may differ between the asm and bat bands, this difference is much smaller than tc, so it does not influence the profile of the o–c curve significantly. however, some outbursts may be too faint in the bat band even when they could still be detected in the asm band; now that only the bat observations are available, some outbursts may escape detection (fig. 1c).
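the o–c computation itself is simple; a minimal sketch (synthetic outburst epochs, illustrative names) using the usual convention o–c = t_obs − (t0 + n·p_ref), with n the nearest integer cycle number, which also copes with missing outbursts:

import numpy as np

# sketch of the o-c ("observed minus calculated") method for outburst
# recurrence times: residuals of observed outburst epochs against a
# constant reference period p_ref. synthetic epochs for illustration.
def o_minus_c(t_obs, t0, p_ref):
    """o-c residuals; n is the integer cycle count nearest to each epoch,
    so gaps (missed outbursts) are tolerated."""
    n = np.rint((t_obs - t0) / p_ref)
    return t_obs - (t0 + n * p_ref), n

# outbursts recurring every ~200 d with a slow drift of the cycle length:
t = 50000 + np.cumsum(np.full(12, 200.0) + np.linspace(0, 10, 12))
oc, n = o_minus_c(t, t[0], 200.0)
for ni, res in zip(n, oc):
    print(f"cycle {int(ni):3d}: o-c = {res:+7.1f} d")   # a smooth trend emerges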
an example of the profile of a bright outburst in aql x-1, observed in the asm/rxte 1.5–12 kev band, is displayed in fig. 2a. the evolution of the hardness ratios hr1 and hr2 (defined in sect. 2.1) with isum during this outburst shows a strong hysteresis (fig. 2b): both ratios are considerably larger in the bottom half of the rising branch than during the decay from the outburst peak.

figure 2. a) the profile of a bright outburst in aql x-1 (asm/rxte sum band (1.5–12 kev) data). b) the dependence of hr1 and hr2 on isum during the outburst; the symbols are connected by lines to trace the evolution of the outburst. see sect. 2.2.1 for details. adapted from [11]. (this figure is available in color in electronic form.)

aql x-1 can also serve as an example of the fact that the sensitivity of the monitors plays a big role in a study of the features of the light curves. for example, the x-ray data obtained in the 3–12 kev band by vela 5b between the years 1969 and 1976 were averaged into 10-day means [12]. later, asm/rxte, observing in the 1.5–12 kev band (similar to that of vela 5b) between the years 1996 and 2012 [5], was able to detect significantly fainter outbursts included in the activity of aql x-1; averaging the observations into one-day means was already sufficient. including these minor outbursts resolved by asm/rxte was important for the correct assessment of the activity of this object (e.g. [11]) (see also fig. 1a). this system also serves as clear evidence that the profile of the o–c curve is reliable only if the faint outbursts are also detected.

2.2.2. xte j1701–462
a comparison of the profiles of the asm/rxte and bat/swift light curves of the very long outburst of xte j1701–462 shows striking differences (figs. 3a and 3b). both light curves display a plateau in the middle phase of the outburst, but the transition to it differs. although bat did not map the time segment of the rising branch of the outburst, asm observed a steep rise of intensity to a prominent initial peak of the outburst. noticeable fluctuations on a timescale of several days (i.e. on a timescale much shorter than the duration of the whole outburst) were present in the asm data in some phases (especially during the initial peak of the outburst).

figure 3. a) asm/rxte light curve of the very long outburst in xte j1701–462, fitted with hec13 running through the fluctuations. b) bat/swift light curve of the whole outburst, fitted with hec13 running through the fluctuations. vertical lines denote the conjunction of the system with the sun. see sect. 2.2.2 for details. (this figure is available in color in electronic form.)

the whole part of the bat light curve prior to the onset of the fast final decay of intensity displayed large-amplitude chaotic variations on a timescale shorter than in the asm band; the peak-to-peak amplitude of these fluctuations was significantly larger than the measurement error ubat of the individual daily means of the intensity in the bat band. notice especially the very large decrease in the amplitude of the fluctuations near mjd 54 260, shortly before the onset of the steep final decline of the outburst. smoothing was required to distinguish the profile of the outburst, so the data were fitted with the hec13 code, written by prof. p. harmanec. the code is based on the method of [13, 14], who improved the original method of whittaker [15]. the resulting fit consists of mean points, calculated at the individual observed points of the curve. the fitting procedure for the bat light curve was made with the same input values as for the asm data (fig. 3b).
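hec13 itself is a fortran code (the source url is given in the acknowledgements); as an illustration of the underlying idea only, a penalized, whittaker-type smoothing that lays mean points through noisy data, here is a minimal numpy sketch; it is not a reimplementation of hec13 and does not use its exact parametrization:

import numpy as np

# whittaker-type smoothing: minimize sum (y - z)^2 + lam * sum (d3 z)^2,
# a trade-off between fidelity to the daily means y and smoothness of
# the fit z (third differences penalized, as in vondrak-style smoothing).
def whittaker_smooth(y, lam=1e4, d=3):
    n = y.size
    D = np.diff(np.eye(n), n=d, axis=0)     # d-th difference operator
    return np.linalg.solve(np.eye(n) + lam * D.T @ D, y)

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 200)
y = np.exp(-0.5 * (x - 5) ** 2) + rng.normal(0, 0.2, x.size)  # noisy "outburst"
z = whittaker_smooth(y)
print(float(np.std(y - z)))   # residual scatter ~ the injected noise level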
to investigate the time evolution of the hardness ratio hr = ibat/isum (ibat being the intensity measured by bat) during the decline of the outburst, it was necessary to overcome the problem of the large-amplitude fluctuations of the intensities in both bands. such strong fluctuations occurred especially in the profile of the bat light curve, and the fluctuations in the one-day means of ibat cannot be taken as precisely coinciding with the measurements included in the one-day means of isum. the problem was solved in the following way. the part of the outburst beginning at mjd 53 783 was divided into 11 bins, each 40 days wide. the means of both isum and ibat, along with their standard errors, were determined for each bin, and only these means were used for the calculation of hr in each bin. the result is displayed in fig. 4.

figure 4. the time evolution of isum (a) and of the hardness ratio hr = ibat/isum (b) during the decline of the very long outburst in xte j1701–462. hr and isum were determined for the data included in segments of 40 days. the consecutive points are connected by a line. the standard errors of isum and hr are displayed. see sect. 2.2.2 for details.

generally, the spectrum hardens with time during the outburst's decline, with the exception of a shallow dip centered on mjd 53 940. moreover, the time evolution of hr displays a shorter plateau than the asm light curve; this is because hr changes more slowly than isum during the decline from the initial peak and also during the final decay. in addition, a relation between isum and hr exists, with a steeper rise of hr in the upper half of the isum range; the initial peak in the isum light curve caused this steepening.
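the 40-day binning used above is straightforward to reproduce; a sketch with synthetic light curves (the real isum and ibat series are archival products, and the error propagation is omitted here):

import numpy as np

# sketch of the binned hardness ratio hr = ibat / isum: both intensities
# are averaged in fixed 40-day bins first, and only the bin means enter
# the ratio, so day-to-day fluctuations in the two bands need not line up.
def binned_hr(mjd_a, i_a, mjd_b, i_b, t_start, width=40.0, nbins=11):
    edges = t_start + width * np.arange(nbins + 1)
    hr = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        ma = i_a[(mjd_a >= lo) & (mjd_a < hi)].mean()   # <isum> in the bin
        mb = i_b[(mjd_b >= lo) & (mjd_b < hi)].mean()   # <ibat> in the bin
        hr.append(mb / ma)
    return np.array(hr)

rng = np.random.default_rng(2)
mjd = np.arange(53783, 53783 + 440, 1.0)
isum = 10 * np.exp(-(mjd - 53783) / 400) + rng.normal(0, 1.0, mjd.size)
ibat = 2 * np.exp(-(mjd - 53783) / 800) + rng.normal(0, 0.5, mjd.size)
print(binned_hr(mjd, isum, mjd, ibat, 53783.0))   # hardening as isum decays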
2.2.3. several other examples
an outburst of the bh transient gx 339–4 was simultaneously monitored by asm/rxte and bat/swift [16]. this combination revealed very large differences between the profiles of the outburst in the soft and the very hard x-rays. although the start of this outburst was almost simultaneous in both spectral bands, a state transition occurred close to the time of the peak luminosity. this resulted in a very fast decay of the bat luminosity, while the decaying branch of the outburst in the soft band was much less steep. nevertheless, a re-brightening in the bat light curve gave rise to a much longer total duration of the outburst in the bat band than in the asm band [16].

cyg x-2 is a persistent x-ray source with a mass-accreting ns (e.g. [17]). it is known to display superorbital cycles, observed by several satellites (rxte, vela 5b, ariel 5) (e.g. [18, 19]). a complex evolution of the cycle-length (sometimes even multiperiodic cycles) was observed (e.g. [19]), with lengths ranging from about 40 d to 78 d. this behavior was interpreted by [19] as a complex warping of the accretion disk, given by the position in the binary radius–precession frequency diagram of [20]. the amplitude of the cycle in the asm/rxte band is large; the intensity can decrease by about 50 percent. this cycle is thus easily detectable by various monitors, and its complex evolution can be studied in long time segments. the variations of intensity were accompanied by changes of the hardness ratio, interpreted by a variable obscuration of x-rays [19].

the long-term variations of the x-ray intensity of the persistent x-ray source 4u 1820–30 display a large-amplitude superorbital cycle of about 176 days (e.g. [12, 21]). the variations of its x-ray spectrum suggest that this cycle is a manifestation of mass-transfer variations, and not of a precessing disk [22]. at variance with cyg x-2, the superorbital modulation of 4u 1820–30 is probably due to the mazeh & shaham mechanism [21, 23]. the highly asymmetric x-ray light curve of this cycle, with a complicated profile of the decay branch, also suggests a combination of the mazeh & shaham mechanism and an irradiation-driven instability of the donor [24]. simultaneous monitoring by asm/rxte and bat/swift revealed transitions from the soft to the hard state during some episodes of minimum soft x-ray luminosity of the superorbital cycle [16].

3. conclusions
dense series of x-ray observations from monitors covering time segments of at least several years are necessary to investigate the properties of the long-term activity of lmxbs, and hence to study the relevant physical processes involved in these systems. such coverage is necessary for resolving the profiles of the outbursts and the transitions between the high and low states. only dense mapping will enable us to place these events in the context of the long-term activity of a given system, and to form a representative ensemble of events (a) in a given x-ray binary system, (b) in a type of x-ray binary systems. it emerges that important profiles of the features of the long-term activity are measurable by the monitors. a very large variety of these features exists, so a search for their common properties is needed. even a search for the accompanying spectral variations is possible with some existing monitors: changes of the hardness ratios are measurable e.g. by asm/rxte, or by a combination of simultaneous asm/rxte and bat/swift observations. we emphasize the very important role of the spectral region of the x-ray monitor. a very hard x-ray band like the one in bat/swift sometimes maps quite a different activity (probably coming from a different spectral component) than that detected by monitors observing in the soft x-ray band. the time evolution of tc of outbursts is very little studied, mainly because of a lack of data; it has been possible to investigate only a very few soft x-ray transients with a quite short tc of less than a year so far.

acknowledgements
this research has made use of the observations provided by the asm/rxte team. i also acknowledge the use of public data from the swift data archive. this study was supported by grants 13-33324s and 13-394643 of the grant agency of the czech republic. i thank prof. p. harmanec for providing me with the hec13 code. the fortran source version, the compiled version and brief instructions on how to use the program can be obtained via http://astro.troja.mff.cuni.cz/ftp/hec/hec13/.

references
[1] w. h. g. lewin, j. van paradijs, e. p. j. van den heuvel. x-ray binaries. camb. astrophys. ser., vol. 26. cambridge university press, 1995.
[2] w. h. g. lewin, m. van der klis. compact stellar x-ray sources. cambridge university press, 2006.
[3] g. dubus, j.-m. hameury, j.-p. lasota. the disc instability model for x-ray transients: evidence for truncation and irradiation. a&a 373:251–271, 2001. doi:10.1051/0004-6361:20010632.
[4] s. b. foulkes, c. a. haswell, j. r. murray. sph simulations of irradiation-driven warped accretion discs and the long periods in x-ray binaries. mnras 401(2):1275–1289, 2010. doi:10.1111/j.1365-2966.2009.15721.x.
[5] a. m. levine, h. bradt, w. cui, et al. first results from the all-sky monitor on the rossi x-ray timing explorer. apj 469:l33–l36, 1996. doi:10.1086/310260.
[6] s. d. barthelmy, l. m. barbier, j. r. cummings, et al. the burst alert telescope (bat) on the swift midex mission. space sci rev 120:143–164, 2005. doi:10.1007/s11214-005-5096-3.
[7] h. a. krimm, s. t. holland, r. h. d. corbet, et al. the swift/bat hard x-ray transient monitor. apjs 209:14–46, 2013. doi:10.1088/0067-0049/209/1/14.
[8] t. narita, j. e. grindlay, d. barret. asca observations of gx 354-0 and ks 1731-260. apj 547(1):420–427, 2001. doi:10.1086/318326.
[9] r. morrison, d. mccammon. interstellar photoelectric absorption cross sections, 0.03–10 kev. apj 270:119–122, 1983. doi:10.1086/161102.
[10] n. vogt. the su uma stars – an important sub-group of dwarf novae. a&a 88(1-2):66–76, 1980.
[11] v. šimon. on the recurrence time and outburst properties of the soft x-ray transient aquila x-1. a&a 381:151–167, 2002. doi:10.1051/0004-6361:20011470.
[12] w. c. priedhorsky, j. terrell. long-term observations of x-ray sources – the aquila–serpens–scutum region. apj 280:661–670, 1984. doi:10.1086/162039.
[13] j. vondrák. a contribution to the problem of smoothing observational data. baicz 20:349–355, 1969.
[14] j. vondrák. problem of smoothing observational data ii. baicz 28:84–89, 1977.
[15] e. whittaker, g. robinson. the calculus of observations. blackie & son ltd, london, 1946.
[16] j. tang, w.-f. yu, z. yan. rxte/asm and swift/bat observations of spectral transitions in bright x-ray binaries in 2005–2010. raa 11(4):434–444, 2011. doi:10.1088/1674-4527/11/4/006.
[17] j. casares, p. a. charles, e. kuulkers. the mass of the neutron star in cygnus x-2 (v1341 cygni). apj 493:l39–l42, 1998. doi:10.1086/311124.
[18] r. a. d. wijnands, e. kuulkers, a. p. smale. detection of a ∼78 day period in the rxte, vela 5b, and ariel 5 all-sky monitor data of cygnus x-2. apj 473:l45–l48, 1996. doi:10.1086/310390.
[19] w. i. clarkson, p. a. charles, m. j. coe, et al. long-term properties of accretion discs in x-ray binaries – ii. stability of radiation-driven warping. mnras 343(4):1213–1223, 2003. doi:10.1046/j.1365-8711.2003.06761.x.
[20] g. i. ogilvie, g. dubus. precessing warped accretion discs in x-ray binaries. mnras 320(4):485–503, 2001. doi:10.1046/j.1365-8711.2001.04011.x.
[21] y. chou, j. grindlay. binary and long-term (triple?) modulations of 4u 1820-30 in ngc 6624. apj 563(2):934–940, 2001. doi:10.1086/324038.
[22] p. f. bloser, j. e. grindlay, p. kaaret, et al. rxte studies of long-term x-ray spectral variations in 4u 1820-30. apj 542(2):1000–1015, 2000. doi:10.1086/317019.
[23] t. mazeh, j. shaham. the orbital evolution of close triple systems – the binary eccentricity. a&a 77:145–151, 1979.
[24] v. šimon. long-term x-ray activity of the ultra-compact binary 4u 1820-30. a&a 405:199–206, 2003. doi:10.1051/0004-6361:20030514.
acta polytechnica 55(3):150–153, 2015. doi:10.14311/ap.2015.55.0150. © czech technical university in prague, 2015. available online at http://ojs.cvut.cz/ojs/index.php/ap

polymers containing cu nanoparticles irradiated by laser to enhance the ion acceleration

mariapompea cutroneo (a, corresponding author: cutroneo@ujf.cas.cz), lorenzo torrisi (b), anna mackova (a, c), andriy velyhan (d)
(a) nuclear physics institute, ascr, 25068 rez, czech republic; (b) department of physics and earth sciences, messina university, v.le f.s. d'alcontres 31, 98166 s. agata, messina, italy; (c) department of physics, faculty of science, j.e. purkinje university, ceske mladeze 8, 400 96 usti nad labem, czech republic; (d) institute of physics, ascr, v.v.i., 182 21 prague 8, czech republic

abstract. the target normal sheath acceleration method was employed at pals to accelerate ions from laser-generated plasma at intensities above 10^15 w/cm^2. the laser parameters, the irradiation conditions, and the target geometry and composition control the plasma properties and the electric field driving the ion acceleration. cu nanoparticles deposited on the polymer promote resonant absorption effects, increasing the plasma electron density and enhancing the proton acceleration. protons can be accelerated in the forward direction to kinetic energies up to about 3.5 mev. the optimal target thickness, the maximum acceleration energy and the angular distribution of the emitted particles were measured using ion collectors, an x-ray ccd streak camera, sic detectors and a thomson parabola spectrometer.

keywords: tnsa, hydrogenated target, resonant absorption.

1. introduction
in a laser-matter interaction, the electromagnetic energy of the laser radiation is converted initially into electronic excitation and later into thermal, chemical and kinetic energy [1]. the characteristics of the laser-generated plasma, the amount of emitted ions, and their distributions in energy, charge state and emission angle depend on many important factors. the propagation of the ions in the forward and backward directions, for example, is tied to the thickness of the target [2]. the focal position and the laser pulse duration play a crucial role in the characteristics of the generated plasma [3], and the laser intensity is the main factor influencing the ion energy and charge-state distributions [4]. the mechanisms of ion generation, acceleration and expansion in a vacuum are complex. above the ablation threshold, at laser intensities higher than 10^15 w/cm^2, the plasma may be fully ionized, and non-linear effects, ponderomotive forces, relativistic electrons and magnetic self-focusing effects promote high charge states and accelerate the emitted ions [5].
using the target normal sheath acceleration (tnsa) regime, in which a double layer of charges generated on the rear side of the foil drives the ion acceleration in the forward direction, along the normal to the target surface, the charge states may reach 60+ and the ion acceleration may exceed 1 mev per charge state [6]. a plasma rich in protons, carbon and copper ions can be obtained by irradiating thin polyethylene foils covered by cu films or containing cu nanostructures. a suitable composition and geometry of the target allow a high laser energy absorption and generate hot plasmas and a high charge separation, inducing a high ion acceleration, as will be reported. in this context, investigations into the optimal laser parameters and irradiation conditions are aimed at maximizing the ion kinetic energy.

2. material and methods
the high-power photodissociation iodine laser system of the pals research center in prague, operating at a 1.315 µm wavelength, a laser energy ranging between 450 and 600 j, a pulse duration of 300 ps, and an intensity of 10^15 w/cm^2, was employed for the experiments presented here [7]. the focusing of the laser beam is provided by an aspherical lens (f = 627 mm for 1ω), 29 cm in diameter, focusing the beam onto the target with a nominal spot of 70 µm. sheets of pure mylar, pure copper, and mylar covered by thin cu films were irradiated at 30° with respect to the normal to the target surface. the original stoichiometry of mylar, or polyethylene terephthalate, is (c10h8o4)n. the thickness of the mylar ranges between 0.6 µm and 100 µm, and the thickness of the cu film ranges between 0.01 µm and 1 µm. the targets were produced by the physical vapor deposition (pvd) method, using a leybold-heraeus evaporator, at a vacuum of 10^-6 mbar. a mylar substrate, 0.6 µm in thickness, was covered by the evaporated cu thin film, the thickness of which was measured online, with a calibrated quartz crystal, and offline, using transmission energy-loss spectroscopy with a 241am alpha source. at very low thicknesses, of the order of 5–10 nm, the cu film is not uniform; it consists of nucleated cu clusters with dimensions comparable to the thickness, as observed by sem. the laser beam was focused 100 microns in front of the target surface (fp = −100 µm) [8]. the scattering chamber is equipped with a target holder movable in the x, y, z directions with 1 µm minimum steps [7]. a kentech x-ray streak camera, mounted in side view with a 2 ns exposure time, is employed to monitor the initial position of the plasma formation. ion collectors (ic) and sic semiconductors are employed in the time-of-flight (tof) configuration for the ion detection, at an angle of 30° and at distances of 1.03 m and 0.6 m from the target [9]. sic is a promising detector because it is not sensitive to visible, soft-uv and infrared light, whose optical photons are not able to produce electron-hole pairs, their energies being below the 3.2 ev gap energy of 4h-sic. the photons, electrons and ions absorbed in the sensitive volume of the detector generate e-h pairs, losing 7.8 ev per pair produced, which results in a voltage signal at the device electrodes proportional to the deposited radiation energy [10]. a thomson parabola spectrometer (tps) was placed in the forward direction, at a 0° angle and a 1.67 m distance from the target [11]. the tps consists of a magnetic-electrostatic deflection system; it is equipped with two input pin-holes, 1 mm and 0.1 mm in diameter. the emitted ions undergo a magnetic deflection of about 0.1 t and an electrostatic deflection voltage of the order of 3 kv, and the trajectories of the emerging ions are parabolic. they are detected on a plane orthogonal to the incident ions by a multi-channel plate (mcp) coupled to a phosphorus screen and a ccd camera. the recorded spectra were stored with a fast oscilloscope operating at 20 gs/s. a schematic view of the experimental arrangement is shown in fig. 1.

figure 1. experimental setup.
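the ion energies quoted below follow from the tof signals; a minimal sketch of the conversion (non-relativistic, with the detector distance and the flight time as the only inputs; names are ours), together with the sic charge yield implied by the 7.8 ev pair-creation energy:

# sketch: converting a time-of-flight signal into proton kinetic energy,
# non-relativistic approximation E = (1/2) m (d/t)^2, and the charge a
# proton of that energy would generate in sic at 7.8 eV per e-h pair.
M_P = 1.6726e-27      # proton mass [kg]
E_CHARGE = 1.602e-19  # elementary charge [C]

def proton_energy_MeV(distance_m, tof_s):
    v = distance_m / tof_s
    return 0.5 * M_P * v**2 / E_CHARGE / 1e6

def sic_charge_C(energy_MeV):
    pairs = energy_MeV * 1e6 / 7.8          # e-h pairs at 7.8 eV each
    return pairs * E_CHARGE

# a 3.2 MeV proton covers the 0.6 m sic flight path in about 24 ns:
t = 0.6 / (2 * 3.2e6 * E_CHARGE / M_P) ** 0.5
print(t, proton_energy_MeV(0.6, t), sic_charge_C(3.2))   # ~66 fC collected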
3. results and discussion
the sic spectra of ions accelerated by the laser irradiating mylar and cu are presented in fig. 2. irradiating a pure mylar thin film, 8 µm in thickness, with a 471 j pulse energy and a −100 µm focal position, the kinetic energy of the fastest protons is 0.9 mev (a). due to the focal position conditions, this relatively high proton energy is controlled accurately by the streak camera; at the value used here, self-focusing may be induced, decreasing the laser spot on the target and increasing the laser intensity [4]. however, the laser energy released in the thin mylar remains low, because pure mylar has a very low absorption coefficient for the ir laser wavelength that is used, with an evaluated transmission of the order of 90 % [12]. using the same experimental conditions with a pure cu target, 1 µm in thickness, the maximum proton energy was evaluated to be 1.2 mev. the proton energy increases with the cu deposition thickness, from 10 nm (b) to 50 nm (c) and to 100 nm (d), reaching energies of 1.1 mev, 3.2 mev and 1.5 mev, respectively, as demonstrated by the tof measurements of the four spectra. this result is due to two causes: the enhancement of the plasma electron density responsible for the electric field driving the proton acceleration, and the enhancement of the laser absorption in the thin target obtained using the cu metal. the optimal cu thickness seems to be 50 nm, according to the replicability of the measurements under the same experimental conditions.

figure 2. sic spectra for the irradiation of pure mylar (a) and of mylar covered with a cu layer 10 nm in thickness (b), 50 nm in thickness (c) and 100 nm in thickness (d).

c ions are accelerated to maximum kinetic energies approximately proportional to six times the maximum proton energy, in agreement with the coulomb acceleration in the space-charge separation generated on the rear side of the thin target. thus, in the case of a 50 nm cu/mylar target, the carbon ions are accelerated up to about 20 mev, owing to the six charge states of carbon. the corresponding tps spectra of the targets irradiated under the conditions reported for fig. 2 are presented in fig. 3. this figure compares the tps proton and carbon parabolas obtained by irradiating pure mylar (a) with the parabolas obtained when the thickness of the cu film deposited on the mylar is increased from 10 nm (b) to 50 nm (c) and to 100 nm (d).

figure 3. tps spectra for the irradiation of pure mylar (a) and of mylar covered with a cu thickness of 10 nm (b), 50 nm (c) and 100 nm (d).

the parabola recognition was obtained by comparison with simulation programs based on the real tps geometry and on the values of the deflecting magnetic and electric fields used here, as presented in the literature [11]. the recognition of the cu parabolas shows full cu ionization, up to cu29+.
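such recognition relies on the standard small-deflection relations of a thomson parabola spectrometer; the sketch below uses idealized uniform fields and illustrative geometry values (not the actual pals spectrometer dimensions) to show that each mass-to-charge ratio traces its own parabola, x ∝ (m/q) y²:

# sketch: idealized thomson-parabola deflections. for an ion of charge q,
# mass m and speed v, small-angle deflections after field regions of
# length l with drift d are y = q*B*lb*db/(m*v) (magnetic) and
# x = q*E*le*de/(m*v**2) (electric); eliminating v gives x ∝ (m/q)*y**2,
# so each mass-to-charge ratio traces its own parabola.
Q = 1.602e-19                  # elementary charge [C]
AMU = 1.6605e-27               # atomic mass unit [kg]
B, E = 0.1, 3e3 / 10e-3        # 0.1 T; 3 kV over an assumed 10 mm gap [V/m]
LB = DB = LE = DE = 0.1        # illustrative field lengths and drifts [m]

def deflections(m_amu, q_e, energy_MeV):
    m, q = m_amu * AMU, q_e * Q
    v = (2 * energy_MeV * 1e6 * Q / m) ** 0.5      # non-relativistic speed
    y = q * B * LB * DB / (m * v)                  # magnetic deflection [m]
    x = q * E * LE * DE / (m * v**2)               # electric deflection [m]
    return x, y

for label, m, q, e in [("H+", 1, 1, 1.0), ("C4+", 12, 4, 4.0), ("C6+", 12, 6, 20.0)]:
    x, y = deflections(m, q, e)
    print(f"{label}: x = {x*1e3:.2f} mm, y = {y*1e3:.2f} mm, "
          f"x/y^2 = {x/y**2:.1f} 1/m")             # constant for each m/q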
for the pure mylar foil the parabola luminosity decreases, as a consequence of the lower plasma electron density and the lower laser energy absorbed in the polymeric film, which absorbs only about 10 % of the laser pulse energy [12]. the parabolas obtained in the other cases show more energetic ions and may be more luminous. the most energetic ions, the least deflected by the magnetic and electric fields, are produced for a cu thickness of 50 nm deposited on the mylar substrate, in agreement with the measurements obtained by sic. such a thin film produces a high laser absorption, due to the presence of the metal, which absorbs the ir laser radiation, and also due to the cu nanoparticles, in which surface plasmon resonant absorption effects are induced by the laser electromagnetic wave [13]. the laser absorption enhancement induced in the target is thus transferred to the plasma: the cu thin film increases the plasma electron density, and consequently the charge separation developed on the rear side of the target, and a higher ion acceleration is obtained. the cu nanoparticle absorption plays an important role in the laser absorption in the thin mylar target, depending on the size of the cu nanoparticles; it is very high for particles with dimensions of the order of hundreds of nm, absorbing in the range of visible and ir wavelengths.

target      thickness       el (j)   ep(h+)
mylar       8 µm            471      0.9 mev
cu/mylar    10 nm/8 µm      500      1.0 mev
cu/mylar    50 nm/8 µm      495      3.2 mev
cu/mylar    100 nm/8 µm     486      1.5 mev
mylar/cu    8 µm/100 nm     472      1.1 mev
cu          1 µm            608      1.2 mev

table 1. summary of the proton acceleration results obtained by irradiating different thin films.

4. conclusions
bilayer targets, consisting of a mylar foil 8 µm in thickness covered with thin cu pvd depositions, with thicknesses ranging between 10 nm and 100 nm, are suitable for high-energy proton acceleration in the forward direction at a laser intensity of about 10^15 w/cm^2 in the tnsa regime. a summary of our results is given in table 1, demonstrating that the cu nanostructures increase the proton acceleration as a result of the enhanced laser absorption in the thin film. the mylar covered by cu was laser irradiated from the cu face. irradiating the converse face, from mylar to cu, the maximum proton energy decreases with respect to the value obtained when irradiating from the cu face, as presented in the table. a possible explanation involves the plasma electron density, which is higher in the case of cu/mylar irradiation due to the high cu electron injection into the mylar plasma. in conclusion, three main parameters play an important role in the ion acceleration in the forward direction in the tnsa regime: (1) the plasma electron density, which can be increased using metals to cover the polymer films, injecting electrons into the polymer plasma; (2) the use of metallic nanostructures deposited on the polymer, which significantly increase the laser absorption due to surface plasmon resonance effects at wavelengths in the visible and ir regions; (3) the use of an optimal laser focal position to induce a self-focusing effect in thin films; for polymers a distance of −100 µm in front of the target surface is used, which enables the laser light to be focused thanks to the formation of the first ionized vapor, so that the self-focusing effect can be exploited.

acknowledgements
the authors acknowledge the pals laser team for its expert support for this experiment, and mr c. cutroneo and mrs m. d'amico for their precious contribution.
access to the pals laser facility was supported by the european commission under laserlab-europe through project pals001823 and by project p108/12/g108.

references
[1] m. von allmen. laser-beam interactions with materials. second printing. springer series in material science 2, berlin, 1987.
[2] j. badziak. laser-driven generation of fast particles. opto-electron 15:1–12, 2007.
[3] l. laska, k. jungwirth, k. kralikova, et al. charge-energy distribution of ta ions from plasmas produced by 1ω and 3ω frequencies of a high-power iodine laser. rev sci instrum 75:1588–1591, 2004.
[4] l. torrisi, d. margarone, a. borrielli. characterization of laser-generated silicon plasma. laser part beams 26:379–387, 2008.
[5] m. borghesi, a. j. mackinnon, l. barringer, et al. relativistic channeling of a picosecond laser pulse in a near-critical preformed plasma. phys rev lett 78:879–882, 1997.
[6] l. torrisi, m. cutroneo, s. cavallaro, et al. thomson parabola spectrometry as diagnostics of fast ion emission from laser-generated plasma. proc of spie 8779:1–6, 2013.
[7] k. jungwirth, a. cejnarova, l. juha, et al. the prague asterix laser system pals. phys plasmas 8:2495–2501, 2001.
[8] l. laska, j. krasa, k. a. masek. iodine laser production of highly charged ta ions. czechoslovak journal of physics 46(11):1099–1100, 1996.
[9] l. torrisi, g. foti, l. giuffrida, et al. single crystal silicon carbide detector of emitted ions and soft x rays from power laser-generated plasmas. j appl phys 105:123304, 2009.
[10] a. lo giudice, f. fizzoti, c. manfredotti, et al. average energy dissipated by mev hydrogen and helium ions per electron-hole pair generation in 4h-sic. appl phys lett 87:222105, 2005.
[11] m. cutroneo, l. torrisi, l. andó, et al. thomson parabola spectrometer for energetic ions emitted from sub-ns laser generated plasmas. acta polytechnica 53(2):138–141, 2013.
[12] dupont. dupont teijin film. 2014.
[13] m. a. garcia. surface plasmons in metallic nanoparticles: fundamentals and applications. journal of physics d: applied physics 44:283001–20, 2011.

acta polytechnica 54(3):177–182, 2014. doi:10.14311/ap.2014.54.0177. © czech technical university in prague, 2014. available online at http://ojs.cvut.cz/ojs/index.php/ap

x-ray transmission and reflection through a compton-thick medium via monte-carlo simulations

wiebke eikmann (a, b, corresponding author: wiebke.eikmann@sternwarte.uni-erlangen.de), jörn wilms (a, b), randall k. smith (c), julia c. lee (c)
(a) dr. karl remeis-sternwarte bamberg, sternwartstrasse 7, 96049 bamberg, germany; (b) erlangen centre for astroparticle physics (ecap), erwin-rommel-str. 1, 91058 erlangen, germany; (c) harvard-smithsonian center for astrophysics, 60 garden st., cambridge, ma 02138, usa

abstract. the spectral shape of an x-ray source strongly depends on the amount and distribution of the surrounding material. the spectrum of a primary source which is located in a medium that is optically thin with respect to compton scattering is mainly modified by photo absorption in the lower energy range, and is almost unaltered above ∼ 10 kev. this picture changes when the source is obscured by gas exceeding hydrogen column densities of ∼ 10^24 cm^-2: at this degree of absorption, it is likely that photons are scattered at least twice before leaving the medium.
the multiple scatterings lead to a lack of photons in the high energy range of the resulting spectrum, as well as to an accumulation of photons at moderate energies forming the so-called compton-bump. the shape of the fluorescent lines also changes, since scattered line photons form several compton-shoulders, which are very prominent especially for compton-thick sources. using a monte carlo method, we demonstrate the importance of compton scattering for high column densities. for that purpose, we compare our results with existing absorption models that do not consider compton scattering. these calculations will be implemented in a prospective version of the tbabs absorption model, including an analytic evaluation of the strength of the fluorescent lines.

keywords: interstellar absorption; monte carlo simulation.

1. introduction
some astronomical x-ray sources are deeply embedded in gas exceeding hydrogen column densities of ∼ 10^24 cm^−2 (e.g. the x-ray binary igr j16318-4848 [1]). the original spectrum emitted by such highly obscured systems is considerably modified by interactions between the radiation and the surrounding material. the flux below 10 kev is strongly reduced by photo absorption, and the absorbed photons may be re-emitted as fluorescent lines. compton scattering affects the spectral shape mainly through the down-scattering of high-energy photons, and it also has an important impact on the fluorescent line profiles. correctly interpreted, these modifications encode information about the composition, structure and geometrical formation of the source's environment. current absorption models mostly neglect the effect of compton scattering, though it is the dominant process for dense gas and high photon energies. for this reason, we present a revised version of the tbabs [2] absorption model, which includes both compton scattering and fluorescent line emission and is therefore appropriate to model the spectra of highly absorbed systems.

2. computational method
the monte carlo method denotes a class of computational algorithms that achieve numerical results by repeated random sampling. applied to the problem of radiative transfer, this means we simulate the random walk that a photon performs during its passage through a medium of a defined thickness. the absorbing gas is assumed to be neutral and has a constant density. the included interaction mechanisms between radiation and matter are photo absorption, fluorescent line emission, and compton scattering. the photo absorption cross sections are taken from verner & yakovlev (1995) [3] and include all elements with z ≤ 30. we also consider the emission of k-fluorescent lines according to the fluorescent yields from kaastra & mewe (1993) [4]. compton scattering is simulated by using the klein-nishina formula, and we follow all photons of the primary spectrum as well as the fluorescent photons until they leave the medium. in this manner, we are able to model scattering features like the compton shoulders of the emission lines and the compton-bump at moderate energies. a similar simulation for modeling the x-ray processing through a compton-thick toroidal medium has already been provided by murphy & yaqoob (2009) [5]. our aim is to provide a simple, geometrically flexible fitting model able to represent the scattering processes in highly obscured sources realistically.
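the random walk described above can be condensed into a few lines of code. the following is a minimal sketch only, assuming a uniform sphere, a toy power-law photo-absorption cross section in place of the verner & yakovlev tables, isotropic scattering with a constant thomson cross section in place of the full klein-nishina treatment, and no fluorescence; all names and constants are illustrative, not the authors' implementation.

```python
import numpy as np

SIGMA_T = 6.652e-25  # thomson cross section [cm^2], stand-in for klein-nishina

def sigma_abs(e_kev):
    # toy photo-absorption cross section per h atom with an ~e^-3 falloff;
    # placeholder for the tabulated verner & yakovlev cross sections
    return 2e-22 * e_kev ** -3

def transmit_sphere(e0_kev, n_h, r_cm, n_photons=10000, seed=1):
    """follow photons from the centre of a uniform sphere of column n_h;
    return the energies of the photons that escape."""
    rng = np.random.default_rng(seed)
    n = n_h / r_cm                       # number density giving column n_h
    escaped = []
    for _ in range(n_photons):
        e, pos, direction = e0_kev, np.zeros(3), np.array([0.0, 0.0, 1.0])
        while True:
            s_tot = sigma_abs(e) + SIGMA_T
            step = -np.log(rng.random()) / (n * s_tot)   # sampled free path
            pos = pos + step * direction
            if np.linalg.norm(pos) >= r_cm:              # photon leaves the cloud
                escaped.append(e)
                break
            if rng.random() < sigma_abs(e) / s_tot:      # photo-absorbed
                break                                    # (fluorescence ignored here)
            # isotropic new direction; energy shift from the compton formula
            mu, phi = 2 * rng.random() - 1, 2 * np.pi * rng.random()
            new_dir = np.array([np.sqrt(1 - mu * mu) * np.cos(phi),
                                np.sqrt(1 - mu * mu) * np.sin(phi), mu])
            e /= 1 + (e / 511.0) * (1 - float(direction @ new_dir))
            direction = new_dir
    return np.array(escaped)
```

even this crude version reproduces the qualitative behaviour discussed below: for columns above ∼ 10^24 cm^−2 the escaping photons pile up at moderate energies, while the highest energies are depleted by repeated down-scattering.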
3. scattering features
to demonstrate the influence of compton scattering on the spectral shape, we compare in fig. 1 two absorbed spectra: the red spectrum simulated taking compton scattering fully into account, and the blue one considering only photo absorption. the dashed black line represents the input spectrum.

figure 1. absorbed spectra transmitted through a sphere for nh = 2 × 10^24 cm^−2. the red spectrum is calculated considering compton scattering, the blue one including photo absorption only. the dashed line indicates the input spectrum.

both spectra are modeled using a geometry where the photon source is located in the center of a spherical cloud with a radius equivalent to nh = 2 × 10^24 cm^−2. in contrast to the unscattered spectrum, which is indistinguishable from the input continuum above ∼ 20 kev, the scattered spectrum shows a significant lack of photons above 200 kev. as the relative energy loss per scattering is proportional to the photon energy, the high energy photons are effectively down-scattered and accumulate at moderate energies, forming the so-called compton-bump between ∼ 30–100 kev. in addition, the strength of the fluorescent lines and also the overall flux in the low energy range are reduced, again by compton down-scattering. figure 2 shows the iron band in detail. the profile of the iron kα-line at 6.4 kev exhibits on its red side several compton-shoulders consisting of line photons that are scattered once or multiple times. this feature has also been observed, e.g., in the spectrum of the massive x-ray binary gx 301-2 [6].

4. geometry dependence
depending on the distribution of the surrounding matter, a different fraction of the radiation is either absorbed, scattered, or escapes without interaction; therefore the shape of the reprocessed spectrum strongly depends on the geometry of the absorbing material. figure 3 shows the transmitted spectra for the same column density but two different geometries: a spherical geometry with the primary source located at the center, and a slab located between the observer and the source. the spectra are normalized at the iron k-edge energy. the ratio between the spectra (lower panel) illustrates the geometry-induced differences in the compton-shoulder and in the compton down-scattering at higher energies. both the compton-bump and the fluorescent lines are more pronounced for the spherical geometry. this is due to the different path length that a photon travels on average before it leaves the medium. back-scattered photons still have a good chance to leave the spherical geometry on the other side, but they are mostly absorbed in the slab case. therefore, the averaged distance that a photon traverses is larger for the spherical geometry. in fig. 4 we compare the spectrum for pure compton reflection with a transmitted spectrum. again the spectra are normalized at the iron k-edge energy. the blue spectrum shows the reflected radiation from a semi-infinite slab averaged over all inclination angles. the red spectrum was calculated using again the spherical cloud geometry, with a radius equivalent to nh = 1 × 10^24 cm^−2. this value was chosen for a better comparison, because it results in a similar equivalent width of the iron kα-line for both the reflected and transmitted case.
in contrast to the transmitted spectrum, which is formed mainly by photo absorption, the shape of the reflected spectrum is almost exclusively determined by compton back-scattering. the compton-bump can be seen well at moderate energies. as forward scattering is the preferred process for high photon energies, the intensity decreases above ∼ 30 kev.

figure 2. same as fig. 1, with a detailed view on the iron region.
figure 3. absorbed spectra for transmission through a sphere (red) and through a slab (blue), both calculated for nh = 5 × 10^24 cm^−2.
figure 4. spectra for transmission through a sphere (red) calculated for nh = 1 × 10^24 cm^−2 and reflection from a slab (blue).
figure 5. general line shape for a spherical geometry and for a monochromatic photon beam at e = 10 kev as input.
figure 6. modulation of the fe kα i-line.

5. fluorescent lines
the strength of the fluorescent lines depends upon properties of the absorbing material, such as its elemental composition and geometry. assuming a spherical geometry, the line strength as a function of the column density is very similar for all lines. to demonstrate this, we show in fig. 5 the strength of the kα-line of o, si, ca and fe plotted against the optical depth given by τ(e_line) = σ(e_line)·nh, where σ(e_line) is the sum of the photo absorption and compton scattering cross sections at the respective line energy. the dashed line marks the depth τ = 1, from which a line photon interacts on average more than once before it escapes. up to this limit, the line strength increases continuously for all elements, and thereupon it decreases again as a result of compton down-scattering and renewed photo absorption. the next step was to reconstruct the line shape analytically as a function of the optical depth. we assume for now a monochromatic input spectrum with intensity i_in and energy e_in. starting from the well-known formula

i_out(e_in) = i_in(e_in)·e^(−τ(e_in)),

we can describe the line strength in a simplified way by

i_line(e_line) ∝ i_in(e_in)·(1 − e^(−τ(e_in)))·e^(−τ(e_line)).

the first term describes the number of line photons that were created within τ, and the second term describes the subsequent absorption. in fig. 6 we compare the analytic solution (blue dashed line) with the values for the iron kα-line from the simulation (red triangles). the general evolution of the line strength is described quite well, considering the simplicity of the function. the deviations for large optical depths come especially from the interplay of isotropic line emission and anisotropic compton scattering.
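the turnover of the line strength around τ ≈ 1 follows directly from the simplified formula above. a tiny numeric sketch (the assumption τ(e_line) = τ(e_in) and all constants are illustrative only):

```python
import numpy as np

def line_strength(tau_in, tau_line, i_in=1.0):
    # creation of line photons within tau, then absorption on the way out
    return i_in * (1.0 - np.exp(-tau_in)) * np.exp(-tau_line)

tau = np.logspace(-2, 1.5, 400)
i_line = line_strength(tau, tau)     # assume equal depths at both energies
print(tau[np.argmax(i_line)])        # ~0.69: the strength rises up to a depth of
                                     # order unity and then declines, as in fig. 5
```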
6. outlook
in addition to the efforts described here, two further updates will be implemented in a revised version of the tbabs absorption model: (1.) higher resolution cross sections close to the absorption edges of neon, oxygen, and iron; (2.) the addition of mechanisms to include laboratory-measured cross sections for solids. test versions of these modifications will be available at http://pulsar.sternwarte.uni-erlangen.de/wilms/research/tbabs

references
[1] l. barragán, j. wilms, k. pottschmidt, et al. suzaku observation of igr j16318-4848. astronomy and astrophysics 508:1275–1278, 2009. arxiv:0912.0254 doi:10.1051/0004-6361/200810811.
[2] j. wilms, a. allen, r. mccray. on the absorption of x-rays in the interstellar medium. the astrophysical journal 542:914–924, 2000. arxiv:astro-ph/0008425 doi:10.1086/317016.
[3] d. a. verner, d. g. yakovlev. analytic fits for partial photoionization cross sections. astronomy and astrophysics suppl. 109:125–133, 1995.
[4] j. s. kaastra, r. mewe. multiple auger ionisation and fluorescence processes for be to zn. in e. h. silver & s. m. kahn (eds.), uv and x-ray spectroscopy of laboratory and astrophysical plasmas, p. 134, 1993.
[5] k. d. murphy, t. yaqoob. an x-ray spectral model for compton-thick toroidal reprocessors. monthly notices of the royal astronomical society 397:1549–1562, 2009. arxiv:0905.3188 doi:10.1111/j.1365-2966.2009.15025.x.
[6] s. watanabe, m. sako, m. ishida, et al. detection of a fully resolved compton shoulder of the iron kα line in the chandra x-ray spectrum of gx 301-2. the astrophysical journal, letters 597:l37–l40, 2003. arxiv:astro-ph/0309344 doi:10.1086/379735.

acta polytechnica doi:10.14311/ap.2017.57.0201 acta polytechnica 57(3):201–208, 2017 © czech technical university in prague, 2017 available online at http://ojs.cvut.cz/ojs/index.php/ap

optimization method and software for fuel cost reduction in case of road transport activity

györgy kovács
institute of logistics, university of miskolc, miskolc, hungary
correspondence: altkovac@uni-miskolc.hu

abstract. the transport activity is one of the most expensive processes in the supply chain, and the fuel cost is the highest cost among the cost components of transportation. the goal of the research is to optimize the transport costs in case of a given transport task, both by selecting the optimal petrol station and by determining the optimal amount of the refilled fuel. recently, in practice, these two decisions have not been made centrally at the forwarding company, but have depended on the individual decision of the driver. the aim of this study is to elaborate a precise and reliable mathematical method for selecting the optimal refuelling stations and determining the optimal amount of the refilled fuel to fulfil the transport demands. based on the elaborated model, new decision-supporting software is developed for the economical fulfilment of transport trips.

keywords: road transport trip; fuel cost saving; petrol station; optimization; software development.
1. introduction
the volume of the transport activities connected to production and services is increasing constantly, due to the formation of long intercontinental supply chains [1]. transport companies have to focus on cost reduction and profitability [2]. this research study is very relevant, because cost reduction and efficiency improvement are very important goals for all service providers. the ratio of the road transport in europe is 78 % of the total freight volume [3], and because the fuel cost is the highest cost among the cost components of the transportation, every transport company puts a great emphasis on optimizing road transport activities and reducing transport costs. recently, in practice, the selection of the petrol stations for refuelling and the amount of the refilled fuel have not been determined centrally at transport companies, but depend on the individual decision of the driver. this can result in wastes for the transport companies. the goal of the research is to optimize the transport activity both by selecting the optimal petrol station (depending on the fuel price and the distance to the original destination) and by determining the optimal amount of the refilled fuel (only the required amount of the fuel, not more). the result of wrong decisions is that the cost of the consumed fuel is not optimal. this is true in the case of inland routes, but it has even more significance in the case of international routes, when a high amount of fuel is consumed. this research is absolutely original and unique, because a precise and reliable mathematical model and method are elaborated for determining the optimal refuelling stations and the optimal amount of the loaded fuel. this research is especially innovative because, based on the elaborated method, a new software was developed. so, the research topic is not only theoretical: the elaborated model, method and software can also be used very efficiently and widely in practice for everyday use. further advantages of the developed software are that it absolutely suits the customer demands and it is very cost effective. in this study, the fuel cost reduction is shown in a case study by a comparative analysis (section 8). it can be concluded that the fuel cost of transport trips can be reduced significantly by applying the developed software.

2. literature review
the author evaluated a lot of literature relating to the cost components and general characteristics of the road freight transport, which provided the theoretical background of this research. there is a great deal of literature discussing logistical costs [4,5,6], but the logistics literature rarely deals with the introduction of transport cost components [7]. there are three different organization methods in the case of the road transportation: simple trips, shuttle trips and round trips [8,9]. the most common form of organizing international road transportation is the round trip. the main goal of organizing transport routes is to minimize specific transport costs and reduce the transport lead time [10,11,12,13]. every transport trip departs from the depot of the transport company and goes to the first station, where the goods to be transported are loaded; then it proceeds to the next station, where the goods are unloaded; then come the next loading and the next unloading, etc. after the last station, the vehicle finishes its route at the depot.
although the existing literature often discusses route optimization and optimizing transport trips and networks [7,9,11,13], there is a gap in the literature in the field of fuel optimization of the road transportation. it can be said that there are hardly any materials available on this topic, the optimization task to be solved has not been covered, and this topic is absolutely new. in the past years, the author has completed several r&d projects for forwarding companies; therefore, he has practical experience in the field of transportation. this empirical experience initiated the idea of the optimization software, which offers a prompt solution for everyday problems.

3. tendencies and characteristics of road freight transport
a rapidly changing market environment, global competition and fluctuating customer demands have resulted in more complex networks of supply chains. the value chain is globalized and the cooperation between enterprises has become more dynamic. due to these factors, the following changes and tendencies can be seen in the transport sector:
• the volume of the transport activities connected to production and services is constantly growing.
• cooperation and coordination between the transport modes (rail, road, water and air) are increasing to form faster and more economical transport chains. advantages and synergies of the different modes can be combined and utilized.
• transport distances become longer due to the formation of worldwide global supply chains.
• transport chains and transport trips are optimized.
• the quality of transport activities is increasing.
• the shipping time considered acceptable by the customer is reduced.
• optimization of the transport coordination and utilization of resources takes place.
• application of it tools in logistics improves the efficiency of transport activities and ensures better monitoring and tracking.
• specific transport costs of transport chains are reduced.
• the technology in the transport sector is being developed [2].

recently, the ratio of the road transport in the total transport volume is 78 % in europe [3]. the remaining volume of the other modes can be seen in figure 1. the goal of the european union is to decrease the ratio of the road transport; consequently, the environmental pollution, noise pollution, traffic jams and accidents can be decreased. a further goal is the better utilization of the railway and water transportation. it can be predicted that the intensity of the road transportation will be increasing to a small extent, so the road transport will be the most significant transport mode also in the future.

figure 1. ratio of transport modes in europe [3].

compared to other transport modes, the road transport has many advantages: cheaper mode of transport, shorter transport time, door-to-door service, high density of road network, flexibility in routing and in time scheduling, and a high level of adaptation to the customer's demands.
the optimal operation of transport routes can be attained in the following ways:
• modernization of the vehicle fleet,
• integration of multiple transport tasks into one transport trip,
• application of multimodal transport modes, where road, railway and water transportation modes are integrated,
• maximal utilization of vehicles,
• application of it tools in logistics, which improves the efficiency of transport tasks and ensures better monitoring and tracking,
• reduction of the fuel costs of the transport tasks: refilling the optimal amount of fuel at the optimal petrol station.

4. total prime cost of a transport trip
at first, we have to define the total prime cost of a transport trip to find the possibilities of the cost reduction. the total prime cost (cpα) of the αth transport trip (figure 2) can be calculated as

cpα = clα + culα + cwtα + caα + cwα + cmα, (1)

where clα is the cost of a transport way with a useful load; culα is the cost of a transport way without a useful load; cwtα is the cost of the waiting time; caα are the total additional costs (the motorway fee, parking fee, etc.); cwα is the driver's salary; cmα is the maintenance cost of vehicles; α is an identifier of the transport trip.

figure 2. structure of a transport trip.

because the aim of the optimization is to minimize the cost of the fuel consumption, only the cost of a transport way with a useful load (clα) and the cost of a transport way without a useful load (culα) are examined; the other components can be neglected. in the calculation, it is advantageous to divide the transport trips (α) into sections (β), which means the road between loading-in and loading-out stations (figure 2). the cost components of the sections are different due to the different volume of transported goods, topography, etc. the total fuel cost of a transport trip is the sum of the costs of the sections:

cfα = Σβ sαβ · cαβ [€], (2)

where sαβ is the distance of the βth section of the αth transport trip [km]; cαβ is the specific cost of the βth section of the αth transport trip [€/km]; β is the section identifier; α is the transport trip identifier. the fuel consumption of a vehicle depends on the fuel consumption of the vehicle without a useful load (main characteristics of the engine) and on the weight of the transported useful load. the specific cost can be calculated by

cαβ = pαβ (ff + εl · qαβ) [€/km], (3)

where pαβ is the fuel price [€/l]; ff is the specific fuel consumption in case of an empty vehicle [l/km]; εl is the correction factor for different loading conditions (every additional ton of useful load results in extra fuel consumption) [l/(t km)]; qαβ is the transported useful load [t]. the determination of the correction factor is based on the previous research of the author [14].
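equations (2) and (3) translate directly into a few lines of code. the sketch below uses python rather than the c# of the paper's software, and the numbers in the usage line are illustrative only:

```python
from dataclasses import dataclass

@dataclass
class Section:
    distance_km: float    # s_ab: length of the section [km]
    price_eur_l: float    # p_ab: fuel price paid on this section [eur/l]
    load_t: float         # q_ab: transported useful load [t]

def specific_cost(sec, ff_l_km, eps_l):
    """equation (3): specific cost of one section [eur/km]."""
    return sec.price_eur_l * (ff_l_km + eps_l * sec.load_t)

def trip_fuel_cost(sections, ff_l_km, eps_l):
    """equation (2): total fuel cost as the sum over all sections [eur]."""
    return sum(s.distance_km * specific_cost(s, ff_l_km, eps_l) for s in sections)

# illustrative: a 14 l/100 km vehicle, 0.3 l/100 km extra per ton of load
print(trip_fuel_cost([Section(180, 1.30, 3.0), Section(95, 1.30, 3.5)],
                     ff_l_km=0.14, eps_l=0.003))   # ~53.5 eur
```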
5. optimization of selecting the ideal petrol station and the determination of the amount of refilled fuel
during transport trips (national and international), the selection of the ideal petrol station for refuelling and the amount of the refilled fuel are not determined centrally at most transport companies, but depend on the individual decision of the driver. therefore, refuelling is often not carried out at the ideal petrol station, not the optimal amount of fuel is refilled and, in some cases, the driver refuels absolutely unnecessarily.

figure 3. determination of possible petrol stations.
figure 4. changes in fuel level during the transport trip.

the result of these wrong decisions is that the total cost of the transport task and the cost of the consumed fuel are not optimal. there are many available petrol stations all over europe, but the difference between the unit prices of the fuel can be 0.16–0.23 €/l at the different petrol stations, which results in a 115 euro difference in case of a refill of only 500 litres. a common mistake is that the driver refuels at an expensive petrol station, or refuels with an unnecessarily high amount of fuel at the end of a transport route, before arriving at the depot of the transport company. the above-mentioned mistakes inspired the author to elaborate a new optimization method for refuelling, which is essential for transportation companies to reduce the transportation costs.

5.1. selecting the optimal petrol station
the goal of this study is to determine the most cost-effective petrol station (figure 3) amongst the ones preferred by the company (psj), in case of a transportation task from the station si to the station si+1 defined by gps coordinates; furthermore, to determine the amount of refilled fuel (qrf) with a known initial level of the fuel (figure 4). most transport companies, due to the huge amount of fuel consumption, can buy fuel from a contracted fuel selling company. the gps coordinates of the available petrol stations and the actual fuel prices at the individual stations are provided by the contracted fuel selling company. the fuel level of the vehicle is known at the starting point (qfs), and the fuel consumption can be calculated continuously between the loading-in and loading-out stations (figure 4).

figure 5. determination of the optimal refilling station.

there is a constraint relating to the safety amount of fuel (qs), which has to be available. the safety amount qs covers the extra consumption of the vehicle resulting from traffic jams, missed ways, etc. between station si−1 and psj, and after the station si+1 to find the next petrol station (figure 4). the maximal distance (smax), which can be completed by a vehicle, is calculated as follows (see figure 3):

sαβ,max = (qfs − qs) / (ff + εl · qαβ) [km]. (4)

the radius (smax − ∆s) defines the coordinates of the possible psj petrol stations, from which the optimal one should be selected (figure 3). the value of ∆s can be freely defined; it relates to the distance within which the optimal petrol station has to be selected before running out of fuel. the next step is to find the optimal petrol station, which provides the most cost-effective solution (figure 5). there is a dilemma of whether a cheaper but further, or a more expensive but nearer refuelling is more cost effective. the increment value of the costs of the different refuelling possibilities should be analysed. the total costs of the different alternatives can be calculated by (2) and the following equation (cf. figure 5):

cmin = min(csi−1 · ssi−1,psj,real + cpsj · spsj,si+1,real) [€], (5)

where csi−1 is the specific transport cost taking into consideration the fuel price at petrol station psj−1 [€/km]; cpsj is the specific transport cost taking into consideration the fuel price at petrol station psj [€/km]; ssi−1,psj,real is the transport way between station si−1 and petrol station psj [km]; spsj,si+1,real is the transport way between petrol station psj and station si+1 [km]. the optimal solution is the transport way which has a minimal total cost; it defines the optimal refilling petrol station (psj).
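the selection rule (4)–(5) can be sketched as follows; the routing distances are taken here as plain inputs, whereas the paper's software obtains them from a digital map, and all names and numbers are illustrative:

```python
import math
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    dist_from_prev_km: float   # s_{si-1,psj,real}
    dist_onward_km: float      # s_{psj,si+1,real}
    price_eur_l: float

def best_station(cands, price_prev_eur_l, ff_l_km, eps_l, load_t, q_fs_l, q_s_l):
    """pick the reachable station minimising the total cost of equation (5)."""
    cons = ff_l_km + eps_l * load_t                  # consumption [l/km]
    reach = (q_fs_l - q_s_l) / cons                  # equation (4) [km]
    best, best_cost = None, math.inf
    for c in cands:
        if c.dist_from_prev_km > reach:
            continue                                 # cannot be reached safely
        cost = (price_prev_eur_l * cons * c.dist_from_prev_km
                + c.price_eur_l * cons * c.dist_onward_km)
        if cost < best_cost:
            best, best_cost = c, cost
    return best, best_cost

stations = [Candidate("a", 120, 300, 1.38), Candidate("b", 180, 240, 1.24)]
print(best_station(stations, 1.30, 0.14, 0.003, 2.0, 45, 10))
```

in this toy run the cheaper-but-further station wins, because the price advantage outweighs the longer approach, which is exactly the dilemma described above.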
5.2. determining the optimal quantity of refilled fuel
a common mistake is that the driver refuels with an unnecessarily high amount of fuel near the end of a transport route, before arriving at the depot of the transport company. this can result in extra costs for the transport company. the next aim of this study is to calculate the optimal amount of the refilled fuel required to complete the transport task. the required amount of refilled fuel (qrf) can be defined based on figure 4:

qrf = (ff + εl · qαβ) · spsj,si+1,real + qs − qpsj [l], (6)

where qpsj is the actual fuel level in the vehicle arriving at the selected petrol station, and qs is the safety amount of fuel.

6. software application for determining the optimal petrol station and the amount of fuel to be refilled
based on the elaborated theoretical model and method, software was developed for optimizing the fuel supply of transport activities, with the contribution of norbert cziczer, an engineering student [15]. the software was written in the c# programming language [16], and microsoft visual studio was used to develop it. the developed software has two menu points: "definition of new data" and "optimization".

menu "definition of new data". in the menu "definition of new data" we can define:
• new fuel stations (figure 6);
• new loading-in stations and loading-out stations (figure 7);
• new transport vehicles (figure 8).

menu "optimization". figure 9a provides the possibility of selecting different vehicles for a given transport task. in this menu, the actual fuel level of the vehicle at the beginning of the transport way and the fuel price of this existing fuel can also be input. figure 9b provides the possibility of defining a transport trip, the loading-in and loading-out stations and the loading conditions (transported weights on the different road sections). figure 9c shows the results of the optimization: the total cost of the transport trip, the total fuel consumption of the trip and the remaining fuel volume at the end of the trip are listed. figure 9d shows the transport way on a map and the location of the optimal petrol station where the driver has to refill the vehicle. figure 9e shows the name and location of the optimal petrol station and the volume of the fuel to be refilled; the actual fuel price at the ideal petrol station and the total cost of the fuel to be refilled are also listed. the developed software is capable of selecting the optimal petrol station and determining the optimal amount of refilled fuel during a long transport trip. based on this information, the drivers can make the best decision and the total cost of the transport trip can be minimized.
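combining section 5.1 with equation (6), the refill amount follows in one line; again a hedged sketch with illustrative numbers, not the software's implementation:

```python
def refill_amount_l(ff_l_km, eps_l, load_t, dist_onward_km, q_s_l, q_at_station_l):
    """equation (6): refill just enough to reach the next station plus the
    safety amount q_s, and never more."""
    need = (ff_l_km + eps_l * load_t) * dist_onward_km + q_s_l - q_at_station_l
    return max(need, 0.0)      # tank already holds enough -> no refill

# illustrative: 14 l/100 km vehicle, 2 t load, 420 km onward,
# 10 l safety reserve, 15 l still in the tank
print(refill_amount_l(0.14, 0.003, 2.0, 420, 10, 15))   # ~56.3 litres
```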
figure 6. definition of new petrol stations.
figure 7. definition of new loading-in and loading-out stations.
figure 8. definition of new transport vehicles.
figure 9. screen of the "optimization" menu — optimal solution of the case study.

7. case study — optimization using the developed software
the transport trip and the loading conditions are (see figure 10): station 1 budapest +3 t; station 2 miskolc +1.5 t −1 t; station 3 debrecen +1.5 t −2 t; station 4 szeged +1 t − t; station 5 zalaegerszeg +1 t −2 t; station 6 győr +2.5 t −1.5 t; station 7 székesfehérvár +0 t −1 t; station 8 budapest −2 t. in our case, the selected vehicle is a light truck, the fuel consumption is 14 l/100 km, the maximal fuel tank capacity is 150 litres, and the maximal loading capacity is 3.5 tons. the correction factor for different loading conditions is 0.3 (every additional ton of useful load results in 0.3 l/100 km extra fuel consumption).

figure 10. case study — round trip with loading conditions.
figure 11. individual decision of the driver — non-optimal solution.

the route between the loading-in and loading-out stations can be seen in figure 9. the fuel volume at the beginning of the tour is 100 litres (see figure 9a); the stations of the transport way and the loading conditions are given in figure 9b. the volume of the fuel will be under the defined limit value before arriving at station 5 (figure 9d). the software calculates the optimal petrol station in the searching zone (red rectangle in figure 9d). the identifier of the optimal station (mol_415, where the fuel price is minimal: huf 336/l) and the amount of fuel (73 litres) to be refilled are listed in figure 9e. the conclusion of the optimization can be seen in figure 9c: the total cost of the whole transport route is huf 50800, the total volume of the fuel usage on the whole transport trip is 161.9 litres, and the volume of the remaining safety fuel at the last station is only 11 litres, which is enough for another 79.4 km. currently, the software is under testing, but we hope that in the near future more and more companies will use it, and the total transport costs can be reduced in this way.

8. comparative analysis — optimal result vs. driver decision
the aim of this comparative analysis is to show the cost saving resulting from applying the developed software. there are a lot of possible petrol stations along the transport trip, and the selection of the petrol station where the driver refills the vehicle depends on the individual decision of the driver. therefore, the total cost of the consumed fuel is not optimal, because the fuel prices are different at different petrol stations. in hungary, for example, the difference between the fuel prices at the cheapest and at the most expensive petrol stations can be 13–19 % in the case of one litre of fuel. figure 11 shows a non-optimized result (based on a driver's bad decision) of the same case study detailed in figure 10. in this case, refuelling is carried out at the most expensive petrol station in the search zone (depicted in figures 11d and 11e), and the amount of the refilled fuel is 139.8 litres (the full fuel tank capacity of the vehicle). the volume of the remaining fuel at the last station is 76.9 litres (figure 11c). the comparison of the optimal solution (the result of the optimizing software — figure 9) and the non-optimal solution (driver's decision — figure 11) can be seen in table 1. the results of the driver's wrong decisions are that the total costs of the transport task and the cost of the consumed fuel are not optimal. it can be seen that if the driver refuels at an expensive petrol station, or refuels with an unnecessarily high amount of fuel at the end of a transport route before arriving at the depot of the transport company, a huge extra fuel cost will be incurred. by applying the developed software, significant cost savings (−15.85 %) can be attained in case of a short transport trip (figure 10), but in case of a long international round trip the cost savings can be much higher.
                                              non-optimal solution   optimal solution   cost saving
                                              (figure 11)            (figure 9)
total cost of the whole transport way [huf]   54423                  50800              3623 (−6.66 %)
amount of refilled fuel [l]                   139.88                 73
fuel price [huf/l]                            376                    336
volume of remaining fuel at last station [l]  76.97                  11.12
cost of remaining fuel at last station [huf]  28940                  3736
cost saving of remaining fuel [huf]                                                     5004
total cost saving [huf]                                                                 8627 (−15.85 %)

table 1. comparison of optimal and non-optimal solutions. note: the cost saving resulting from the remaining fuel can be calculated from the difference of the volumes of remaining fuel in the optimal and non-optimal cases (65.85 litres) and from the difference between the fuel price at the non-optimal station (376 huf/litre) and the fuel price at the depot of the transport company (300 huf/litre).

it can be concluded that if the decision making of the driver could be supported by our software, the total cost of the transport way can be reduced significantly.
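the figures in table 1 and in its note can be re-derived directly; every number below is taken from the table:

```python
fuel_cost_saving = 54423 - 50800                    # 3623 huf (-6.66 %)
remaining_diff_l = 76.97 - 11.12                    # 65.85 litres left over
remaining_saving = remaining_diff_l * (376 - 300)   # ~5004 huf (note to table 1)
total = fuel_cost_saving + remaining_saving
print(round(total), round(total / 54423 * 100, 2))  # ~8627 huf, ~15.85 %
```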
9. communication between vehicles and the dispatch station
the developed software is installed at the dispatch station. based on the information provided by the software, the drivers can make the best decision relating to refuelling, and the total costs of the transport trip can be minimized. a two-way communication system has to be formed in which vehicles can communicate with a dispatch station, providing each other with information. the vehicle has to provide the required information on-line for the optimal route planning and the selection of the optimal fuel station. these data are the following: the actual gps position of the vehicle, the actual fuel amount of the vehicle, the actual fuel consumption, etc. the decisions calculated by the planning software can also be transferred to the mobile vehicle. figure 12 shows the operation and the system elements of the two-way communication, which involves feedback from the receiver to the sender.

figure 12. vehicle to dispatcher communication.

elements of the communication system:
(1.) gps system for detecting the vehicle's position with the help of satellites.
(2.) communication channel for two-way communication between the vehicles and the dispatch station. this data communication is based on gprs.
(3.) on-board unit (obu), a communication device mounted in the vehicle. it allows the two-way gprs communication of the vehicles with the dispatch station. the obu collects the actual data related to the vehicle, e.g. the actual fuel amount of the vehicle, the actual fuel consumption, etc.
(4.) dispatch station comprising:
• planning software at the dispatch station for calculating and visualizing the optimal transport route. the transport manager can define or reconfigure the optimal transport routes and select the optimal petrol station.
• a database for the properties of the vehicles (capacity, fuel consumption, etc.), part of the software.
• a database for the actual fuel prices at the different petrol stations, also part of the planning software.
• a digital (vector graphical) map database with important defined points (poi), e.g. preferred petrol stations, frontier stations, parking places, etc.; part of the planning software.

10. conclusion
transportation is one of the most expensive logistical activities; therefore, carriers and forwarding agents put great emphasis on the optimization of the road transportation and the reduction of the transport costs, especially the reduction of the fuel costs. the topic of the research is the fuel cost reduction; therefore, the study is very important and relevant. recently, in practice, the selection of the petrol station for refuelling and the amount of the refilled fuel have not been determined centrally at most transport companies; they are not supported by software, but depend on the individual decision of the drivers. the result of wrong decisions is that the cost of the consumed fuel is not optimal. during the research, an absolutely new and unique mathematical model and method were elaborated for optimizing the road transport activity, both by selecting the optimal petrol station (depending on the fuel price and the distance from the original transport way) and by determining the optimal quantity of the refilled fuel (only the required amount of the fuel, not more). this method can result in significant cost savings, mainly in the case of international routes, when a high amount of fuel is consumed. this is the reason why this research is important. based on the precise and reliable mathematical model and method, unique software was developed, which can be used very efficiently and widely in practice for everyday use. further advantages of the software are that it absolutely suits the customer demands and it is very cost effective. in this study, the fuel cost reduction was shown in a case study by a comparative analysis (section 8). it can be concluded that the fuel costs of the road transport trips can be reduced significantly by applying the developed software.

references
[1] bookbinder, h. j. (editor): handbook of global logistics, transportation in international supply chains. springer, 2013.
[2] kovács, gy., kot, s.: new logistics and production trends as the effect of global economy changes. polish journal of management studies, 14(2), 2016, p. 115–126.
[3] fraunhofer institute: executive summary. 2015. http://www.scs.fraunhofer.de/content/dam/scs/de/dokumente/studien/top%20100%20eu%202015%20executive%20summary.pdf [2017-06-30].
[4] ross, d. f.: distribution planning and control. springer, 2015.
[5] lukinskiy, v., dobromirov, v.: methods of evaluating transportation and logistics operations in supply chains. transport and telecommunication, 17(1), 2016, p. 55–59.
[6] mocková, d.: allocation and location of transport logistics centres. acta polytechnica, 50(1), 2010, p. 30–34.
[7] birge, j. r., linetsky, v.: handbooks in operations research and management science. north holland, 2007.
[8] simchi-levi, d., xin, c., bramel, j.: the logic of logistics: theory, algorithms, and applications for logistics management. springer, 2014.
[9] anbuudayasankar, s. p., ganesh, k., mohapatra, s.: models for practical routing problems in logistics. design and practices. springer, 2014.
[10] sinha, k. c., labi, s.: transportation decision making. john wiley & sons inc., 2007.
[11] caramia, m., dell'olmo, p.: multi-objective management in freight logistics. springer, 2008.
[12] rushton, a., croucher, p., baker, p.: the handbook of logistics & distribution management. kogan page limited, 2010.
[13] ehmke, j. f.: integration of information and optimization models for routing in city logistics. springer, 2012.
[14] kovács, gy., cselényi, j.: utilization of historic data evaluation obtained from computer database during the organization of international transport activity. in: proceedings of the 2nd conference with international participation management of manufacturing systems, presov, slovakia, 2006, p. 18.
[15] cziczer, n.: optimal design of refuelling method for hungarian road freight transport activity (in hungarian). graduation thesis, university of miskolc, 2016.
[16] reiter, i.: c# technologies (in hungarian). 2010. http://devportal.hu/download/e-bookok/csharp%20jegyzet/csharp.pdf [2017-06-30].

acta polytechnica vol. 43 no. 2/2003

a note on normalised distributions of dc partial microdischarges

t. ficker, j. macur

statistical distributions (exponential and pareto) of dc partial microdischarges running within sandwich electrode systems are discussed from the viewpoint of a normalisation procedure which may influence some features of the final distribution.

keywords: exponential and pareto statistical distributions, partial microdischarges, normalisation procedure.

1 introduction
when studying the statistics of partial microdischarges within sandwich electrode systems loaded by dc voltages in excess of paschen breakdown values, highly asymmetric distributions can be encountered [1–4] in both the time and height domains. time statistics, i.e. the densities of probability w_t(t) of time intervals t between microdischarge pulses, follow an exponential distribution

w_t(t) = a·e^(−at), t ∈ (0, ∞), a = const., (1)

while the heights u of microdischarge pulses (their peak values) obey a power law of the pareto type (pareto distribution of the first kind) [5]

w_p(u) = c·u^(−α), u ∈ (0, ∞), α > 0. (2)

some problems may arise when normalised forms of these highly asymmetric distributions should be used, especially with pareto's distribution (2), certain probability moments of which diverge. the goal of this paper is to discuss problems connected with the normalisation of pareto's statistic (2).

2 normalising exponential distribution
in fact, there is no principal problem with the normalisation of a measured (unnormalised) exponential distribution, because the normalising integral

s = ∫₀^∞ w(u) du (3)

has no singular point in the interval (0, ∞). for the measured distribution

w_t(t) = b·e^(−at), b ≠ a, (4)

the norm follows from (3) as s_t = b/a, and the normalised form corresponding to (1) can easily be found:

w*_t(t) = (b/s_t)·e^(−at) = a·e^(−at). (5)
to verify the exponential character of a function, semilogarithmic co-ordinates are usually employed. but what will happen to the shape and position of the graph of an exponential function plotted in the semilogarithmic system when the normalising procedure (5) is performed? this is nicely seen from the following two equations, (6) and (7), which were obtained after logarithmic operations had been applied to eqs. (4) and (5):

ln w_t(t) = ln b − at ⟹ y = k − at, (6)
ln w*_t(t) = ln b − ln s_t − at ⟹ y* = k* − at, (7)
k* = k − ln s_t. (8)

therefore, after normalisation the graph of the exponential distribution function plotted in a semilogarithmic system will conserve its shape (straight line), but it will shift its position in the vertical direction by a constant value ln s_t. in fact, there is no need to look for the value of s_t to obtain the normalised form (5), since the normalisation constant b/s_t = a is an invariant appearing in the argument of the unnormalised function (4). fig. 1 shows an unnormalised distribution in semilogarithmic co-ordinates with the following fitting equation

y = 0.0685 − 0.0895x, (9)

which enables us straightforwardly to determine the corresponding normalised exponential distribution

w*_t(t) = 0.0895·e^(−0.0895t) (ms)^(−1). (10)

fig. 1: probability density of time intervals t (ms) between microdischarge pulses [4].

3 normalising pareto distribution
the situation with the pareto distribution (2) is more problematic. due to the singularity at the point u = 0, the integral (3) over the interval (0, ∞) diverges. fortunately, the pareto distribution can be normalised in a standard way when the used interval (u1, u2) does not possess the zero point, i.e. u1 > 0. real measurements are usually performed in such intervals. in addition, the intervals in real experiments are usually finite, i.e. u2 < ∞, so that the normalised form reads

w*_p(u) = (c/s_p)·u^(−α), u ∈ (u1, u2), (11)

with the norm

s_p = ∫ w_p(u) du over (u1, u2), (12)
s_p = c·(u1^(1−α) − u2^(1−α))/(α − 1), α ≠ 1. (13)

applying logarithmic operations to (2) and (11), as in the exponential case, we obtain

ln w_p(u) = ln c − α ln u ⟹ y = k − αx, (14)
ln w*_p(u) = ln c − ln s_p − α ln u ⟹ y* = k* − αx, (15)
k* = k − ln s_p. (16)

similar conclusions as for the exponential distribution can be drawn: in bilogarithmic systems the normalisation procedure changes only the vertical position of the graph of the pareto distribution and does not influence its functional character (slope or asymmetricity). fig. 2 shows an unnormalised pareto distribution within the interval u ∈ (0.5, 5) mv in bilogarithmic co-ordinates with the following fitting equation

y = −0.514 − 1.42x, (17)

which again enables us to determine the corresponding normalised form of the pareto distribution. according to (11) we can find

w*_p(u) = 0.506·u^(−1.42) (mv)^(−1). (18)

fig. 2: probability density of heights u (mv) of microdischarge pulses [4].
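both normalisations can be cross-checked numerically from the fitted constants alone; a small sketch, in which only the constants of (9) and (17) enter (the evaluated norm (13) is used as reconstructed above):

```python
import numpy as np

# exponential: fit (9) gives ln w = 0.0685 - 0.0895 t, i.e. b = e^0.0685, a = 0.0895
b, a = np.exp(0.0685), 0.0895
s_t = b / a
print(round(b / s_t, 4))          # 0.0895 -> the prefactor of (10), the invariant a

# pareto: fit (17) gives ln w = -0.514 - 1.42 ln u, i.e. c = e^-0.514, alpha = 1.42
c, alpha, u1, u2 = np.exp(-0.514), 1.42, 0.5, 5.0
s_p = c * (u1**(1 - alpha) - u2**(1 - alpha)) / (alpha - 1)   # norm (13)
print(round(c / s_p, 3))          # 0.506 -> the prefactor of (18)
```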
real measurements are lisually performed in such intervals. in addition, the interval s in real experiments are usually finite, i.e. u2 < y=k-ux, (14) lnw; =lnc-lnsp -ulnu => y' =k' -ux, (15) k'=k-lnsp, (16) similar conclusions as for exponential distribution can be drawn: in bilogarithmic systems the normalisation procedure changes only the vertical position of the graph of the pareto distribution and does not influence its functional character (slope or asymmetricity). fig. 2 shows an unnorm~lise~ pareto distribution within the interval u e (05, 5) mv lil bllogarithmic co-ordinates with the following fming equation y = -0514 -l.42x (17) which again enables us to determine the corresponding normalised form of the pareto distribution. according to (ll) we can find fig. 2: probability density of heights of microdischarge pulses [4) 60 scan59 scan60 acta polytechnica doi:10.14311/ap.2018.58.0189 acta polytechnica 58(3):189–194, 2018 © czech technical university in prague, 2018 available online at http://ojs.cvut.cz/ojs/index.php/ap primary peak ratio correlation to the measurement accuracy of piv method jan novotný, ilona machovská∗ department of fluid dynamics and thermodynamics, czech technical university in prague, czech republic ∗ corresponding author: ilona.machovska@fs.cvut.cz abstract. an estimation of a measurement accuracy at each measured point is crucial regarding the applicability of results of the measurements. the aim of this work is to determine the correlation between individual metrics and the measurement accuracy by using corrected metrics of the correlation plane. this work is based on defining a corrected metric using known metrics corrected by the displacement measured in the last iteration, the number of the particles and the velocity gradient inside the interrogation area. the resulting tests are performed using conventional synthetic tests. the discovered dependencies between individual corrected metrics are subsequently approximated in order to determine the measurement accuracy. and, finally, the most suitable variant for the determination of the accuracy of the measurement by the particle image velocimetry method is specified. keywords: primary peak ratio; cross-correlation; measurement accuracy; piv method; synthetic tests. 1. introduction in this work, we focus on a description of the interconnection between the ratio of the primary and secondary peak in the correlation plane, known as primary peak ratio (ppr), and the error of evaluation of the measured displacement by the piv method. the aim of this work is the correction of the ppr metric by the main parameters that define the quality of the measured data in order to find a clear correlation between the metric and the measurement accuracy of the piv method. one option to evaluate the measurement accuracy at a specific point is to track the dependency of the primary peak ratio. it is possible to split the cross-correlation of particle images rs into the crosscorrelation of the noise rc, cross-correlation of the signal and noise rf and cross-correlation of the signal rd. a brief illustration is shown in figure 1. article [1, 2] state that the height of the signal peak and the ppr are proportional to the number of particles ni. another factor that affects the height of the peak is the number of lost pairs fi. the diameter and shape of the peak are also influenced by the particle diameter di and by the velocity gradient f∆ inside the interrogation area ia. 
if this inequality (nifif∆ > 5) applies, there is a 95 % chance that the highest peak is equal to the signal peak. the evaluation of the mentioned parameters (ni, fi, f∆) is not so trivial and that is why these parameters are not currently used to determine the measurement accuracy. though it is difficult to evaluate the mentioned parameters, more than 30 years ago, a clear context has been found between the measurement accuracy and the shape of the correlation plane [3, 4]. when using the classic piv iterative algorithm, the measurement, where the highest peak represents the displacement and the second highest peak represents the result of the random correlation — noise peak, can be considered as the correct one. over the 30-year existence of the piv method, a whole range of metrics has been defined [6, 7]. the above-mentioned metric ppr complements the ratio of quadrates of the primary peak height and the root mean square of all peak heights, known as peak to root mean square ratio (prmsr), and the ratio of the primary peak height quadrate and the energy in the correlation plane, known as peak to correlation energy (pce). to the mentioned main metrics, a metric, called the mutual information mi, has been added. the definition of the metric mi is different from the main metrics. the metrics ppr, prmsr and pce are defined only by resulting the correlation plane, from which the displacement is also defined. to determine the mi metric, it is necessary to evaluate the final cross-correlation plane and, moreover, the autocorrelation plane defined by the autocorrelation of the average particle [8]. the metric mi is then defined as the ratio of maximum values in the crosscorrelation and autocorrelation plane: mi = max. val. in corr. pl.(cross-corr. of all part.) max. val. in corr. pl.(autocorr. of av. part.) . (1) a schematic procedure of the calculation of the metric mi is shown in figure 2. this metric represents the number of particles ni inside the interrogation area and it is a very good measure of the quality of the correlation plane. in [8], the synthetic tests, which have been done with the number of particles ni from 5 to 30 particles in an interrogation area with a size of 32 × 32 pixels, are very well described. results of these synthetic tests were divided into 40 subgroups according to the value 189 http://dx.doi.org/10.14311/ap.2018.58.0189 http://ojs.cvut.cz/ojs/index.php/ap jan novotný, ilona machovská acta polytechnica figure 1. the effect of individual components of the signal on the resulting shape of correlation plane; ia = 32 × 32 pixels, ni = 10, f∆ = 0 [5]. of the tested metric. for each range of the value of the metric, the error was evaluated as the root mean square error of the measurement in each subgroup. for a real use of the individual dependencies, authors of [8] also defined the approximation function of the mentioned dependencies. the following equation is based on the model, which was presented in [7], and then modified in [8] to the following form: σ2 = ( m exp ( − 1 2 (φ − n s )2 ))2 +(aφb )2 +c2. (2) the dependency of the evaluated root mean square error on the values of the ppr is shown in figure 3. the factor m in (2) is proportional to the total error of the measurement and φ is the value of the adequate metric. the coefficient s is proportional to the real achievable value of each metric. the minimum value of n can easily be derived from the definition of the appropriate metric. 
for the ppr, the minimum value of n is nmin = 1, for the prmsr, the minimum value of n is nmin = 4, for the pce, the minimum value of n is nmin = 1. for the mi, the minimum value of n is nmin = 0, considering no particles inside the interrogation area. the next part of (2) approximates the effect of correctly measured data and their contribution to the total error of the measurement. the parameter a is proportional to the deviation from the correct value and, vice versa, the parameter c corresponds to the minimum deviation of the measurement. according to [8], the total error of the measurement dependency on the ppr metric can be defined as follows: σ2scc = ( 10.47 exp ( − 1 2 (ppr − 1 1.12 )2 ))2 + (1.913φ−1.371)2 + (2.221 · 1014)2. (3) 2. experiment the effect of specific parameters on the measurement accuracy can be tested using synthetic tests. two different synthetic tests were used – uniform flow test (uft) and couette flow test (cft) [10]. uniform flow test simulates a velocity field with a constant displacement, couette flow test simulates a velocity field with a constant velocity gradient within the whole interrogation area. the mean displacement of the particles’ deviation from the assumed value of the particle displacement is monitored during the tests. this deviation is denoted as a systematic error β. another monitored parameter is the fluctuation of the displacement deviation around the mean value of the assumed displacement, which is denoted as a random error δ. sum of these errors is the total error of σ the measured displacement. according to [11], mentioned 190 vol. 58 no. 3/2018 primary peak ratio correlation to the measurement accuracy of piv method figure 2. schematic illustration of procedure of calculation metric mi [9]. figure 3. root mean square error dependency on metric ppr [8]. errors are defined as follows: β = u − ū, δ = √√√√ 1 n n∑ i=1 (ui − ū)2, δ = √√√√ 1 n n∑ i=1 (ui − u)2, (4) where u is the value of the assumed displacement of the particle, ū is the measured mean value of the displacement and ui is the measured displacement. when generating synthetic data, it is important to ensure that the synthetic data are generated correctly. an excellent guideline for a proper data generation can be found in [4, 12]. in our work, the standard cross-correlation (scc) method was used to process the results of the synthetic tests. the resulting dependency between the specific value of the corrected metric and the measurement accuracy is applicable only to the scc algorithm. a detailed description and results of the synthetic tests can be found in [13]. an example of the results of both synthetic tests are shown in figure 4. 3. corrected metric although the authors of the presented functions have done a great job, the use of these functions is only indicative. the metric ppr is not considering the effect of the number of the particles inside the interrogation area, the diameter of the particles and the displacement measured in the last iteration on the total error of the measurement. also, the usefulness of (3) is arguable, because this dependency is mostly monitoring the set of the data of apparently wrongly measured vectors and not just correctly measured data. for this reason, it was necessary to define a new metric to address these mentioned shortcomings. at first, the new metric was corrected by the measured displacement in the last iteration, and also by the number of the particles inside the interrogation area ia. 
3. Corrected metric

Although the authors of the presented functions have done a great job, the use of these functions is only indicative. The PPR metric does not consider the effect of the number of particles inside the interrogation area, the particle diameter, or the displacement measured in the last iteration on the total error of the measurement. The usefulness of (3) is also debatable, because the dependency mostly tracks data from apparently wrongly measured vectors rather than only correctly measured data. For this reason, it was necessary to define a new metric addressing these shortcomings. First, the new metric was corrected by the displacement measured in the last iteration and by the number of particles inside the interrogation area (IA). When defining the new metric, it is also necessary to consider the velocity gradient inside the IA; its value is defined as the ratio of the maximum displacement difference inside the IA to the particle diameter. The corrected metric $\mathrm{PPR}_{\Delta,\mathrm{MI},gr}$ can be formulated as

$$\mathrm{PPR}_{\Delta,\mathrm{MI},gr} = \frac{\mathrm{PPR} - 1}{0.01 + \sqrt{\Delta x^2 + \Delta y^2}}\,\mathrm{MI}^{0.42} + \frac{\mathrm{sgn}(c_r)}{4}\,\exp\left(\frac{\mathrm{MI}}{20}\right)\bigl(\exp(gr) - 1\bigr). \quad (5)$$

Figure 4. Example of the results of the synthetic tests — UFT (top), CFT (bottom) [13].

To evaluate the dependency of the corrected metric $\mathrm{PPR}_{\Delta,\mathrm{MI},gr}$ on the total error of the measurement, the synthetic tests mentioned in Section 2 were performed for several numbers of particles inside the IA and several values of the velocity gradient inside the IA. Based on these tests, a new formulation of the dependency of the measurement error on the corrected metric was determined:

$$\sigma^2 = \left( -1.5 \exp\left( -\mathrm{PPR}^2_{\Delta,\mathrm{MI},gr} \right) \right)^2 + 0.8 \left( \mathrm{PPR}^{-1.1}_{\Delta,\mathrm{MI},gr} \right)^2. \quad (6)$$

The final graph showing the dependency of the total error of the measurement on the corrected metric is shown in Figure 5.
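The following Python sketch implements Equations (5) and (6). Because the typeset fractions in (5) did not survive text extraction cleanly, the grouping of terms used here is a best-effort reading and should be checked against the typeset paper; the input values in the example are illustrative placeholders.

```python
import numpy as np

def corrected_ppr(ppr, dx, dy, mi, gr, cr):
    # Eq. (5), under an assumed grouping of the extracted fraction:
    # displacement-corrected PPR scaled by MI^0.42, plus a gradient term.
    return ((ppr - 1.0) / (0.01 + np.sqrt(dx**2 + dy**2)) * mi**0.42
            + np.sign(cr) / 4.0 * np.exp(mi / 20.0) * (np.exp(gr) - 1.0))

def total_error_sq(ppr_corr):
    # Eq. (6): error model fitted to the UFT/CFT synthetic tests.
    return (-1.5 * np.exp(-ppr_corr**2))**2 + 0.8 * (ppr_corr**-1.1)**2

# Illustrative values: PPR = 4, last-iteration displacement (0.3, 0.2) px,
# MI = 10, velocity gradient 0.5, positive residual correlation cr.
m = corrected_ppr(4.0, 0.3, 0.2, 10.0, 0.5, cr=1.0)
print(m, total_error_sq(m))
```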
4. Conclusions

This work deals with the evaluation of measurement accuracy based on the ratio of the primary and secondary signal peaks — the PPR — and on its correction for other parameters influencing the measurement accuracy of the particle image velocimetry (PIV) method. Among other things, a procedure for determining the number of particles inside the IA is given. To define the corrected metric, the velocity gradient inside the IA was determined using the particle diameter rather than, as is usual, the size of the edge of the IA. The corrected metric $\mathrm{PPR}_{\Delta,\mathrm{MI},gr}$ is introduced, together with its dependency on the total error of the measurement.

Figure 5. Dependency of the corrected metric $\mathrm{PPR}_{\Delta,\mathrm{MI},gr}$ on the total error of the measurement. Results of the synthetic tests UFT and CFT for four different numbers of particles; the velocity gradient inside the IA was between 0 and 1.5.

List of symbols
∆x horizontal displacement [pixel]
∆y vertical displacement [pixel]
A coefficient of the correct value of displacement [1]
B exponent [1]
C minimum deviation of measurement [1]
d_i particle diameter [pixel]
F_∆ velocity gradient [1]
F_i number of lost pairs [1]
gr velocity gradient [1]
M coefficient of total error of measurement [1]
MI mutual information [1]
N minimum value of specific metric [1]
N_i number of particles [1]
PCE peak to correlation energy [1]
PPR primary peak ratio [1]
PPR_{∆,MI,gr} corrected primary peak ratio [1]
PRMSR peak to root-mean-square ratio [1]
R_C cross-correlation of noise [1]
R_D cross-correlation of signal [1]
R_F cross-correlation of signal and noise [1]
R_S cross-correlation of particle images [1]
S coefficient of real value of metric [1]
u assumed displacement [1]
u_i measured displacement [1]
ū measured mean displacement [1]
β systematic error [pixel]
δ random error [pixel]
σ total error [pixel]
Φ general metric value [1]
IA interrogation area
CFT Couette flow test
PIV particle image velocimetry
SCC standard cross-correlation
UFT uniform flow test

Acknowledgements
Technological Agency, Czech Republic, program Center of Competence, project #TE01020020 Josef Božek Competence Centre for Automotive Industry; the Ministry of Education, Youth and Sports program NPU I (LO), project #LO1311, Development of Vehicle Centre of Sustainable Mobility. This support is gratefully acknowledged.

References
[1] R. J. Adrian, C. Yao, "Pulsed laser technique application to liquid and gaseous flows and the scattering power of seed materials," Applied Optics, vol. 24, pp. 44–52, 1985. doi:10.1364/AO.24.000044
[2] J. Westerweel, D. Dabiri, M. Gharib, "The effect of a discrete window offset on the accuracy of cross-correlation analysis of digital PIV recordings," Experiments in Fluids, vol. 23, pp. 20–28, 1997. doi:10.1007/s003480050082
[3] R. J. Adrian, "Image shifting technique to resolve directional ambiguity in double-pulsed velocimetry," Applied Optics, vol. 25, no. 21, pp. 3855–3858, 1986. doi:10.1364/AO.25.003855
[4] J. Westerweel, Digital Particle Image Velocimetry: Theory and Application, Delft: Delft University Press, 1993. https://repository.tudelft.nl/islandora/object/uuid:85455914-6629-4421-8c77-27cc44e771ed?collection=research [2018-06-17].
[5] J. Novotný, Analysis of factors influencing the evaluation of signal displacement in measurements using the Particle Image Velocimetry method (in Czech), habilitation thesis, Prague, 2016.
[6] J. Charonko, P. Vlachos, "Estimation of uncertainty bounds for individual particle image velocimetry measurements from cross-correlation peak ratio," Measurement Science and Technology, vol. 24, no. 6, 2013. doi:10.1115/FEDSM2012-72475
[7] A. Eckstein, P. Vlachos, "Digital particle image velocimetry (DPIV) robust phase correlation," Measurement Science and Technology, vol. 20, no. 5, 2009. doi:10.1088/0957-0233/20/5/055401
[8] Z. Xue, J. Charonko, P. Vlachos, "Signal-to-noise ratio, error and uncertainty of PIV measurement," in International Symposium on Particle Image Velocimetry, Delft, The Netherlands, July 1–3, 2013. https://repository.tudelft.nl/islandora/object/uuid:a6270b28-1132-4186-817e-259eef0e9d87/datastream/OBJ [2018-06-17].
[9] J. Novotný, I. Machovská, "Corrected metric for uncertainty estimation methods in particle image velocimetry," in 9th World Conference on Experimental Heat Transfer, Fluid Mechanics and Thermodynamics, Foz do Iguazu, 2017.
[10] M. Raffel, C. Willert, J. Kompenhans, Particle Image Velocimetry, Berlin: Springer-Verlag, 2007. doi:10.1007/978-3-662-03637-2
[11] T. Astarita, G. Cardone, "Analysis of interpolation schemes for image deformation methods in PIV," Experiments in Fluids, vol. 38, pp. 233–243, 2005. doi:10.1007/s00348-004-0902-3
[12] K. Okamoto, T. Nishio, T. Saga, T. Kobayashi, "Standard images for particle image velocimetry," Measurement Science and Technology, vol. 11, pp. 685–691, 2000. doi:10.1088/0957-0233/11/6/311
[13] J. Novotný, "Influence of data quality on PIV measurement accuracy," Journal of Flow Visualization and Image Processing, vol. 19, pp. 215–230, 2012. doi:10.1615/JFlowVisImageProc.2013003304

Acta Polytechnica 58(3):189–194, 2018

doi:10.14311/AP.2014.54.0042
Acta Polytechnica 54(1):42–51, 2014 © Czech Technical University in Prague, 2014

Technical and economic optimization of cogeneration technology using combustion and gasification

Martin Lisý*, Marek Baláš, Michal Špiláček, Zdeněk Skála
Brno University of Technology, Faculty of Mechanical Engineering, Energy Institute
* Corresponding author: lisy@fme.vutbr.cz

Abstract. This paper presents the technical and economic optimization of a new microcogeneration technology, based on biomass combustion or biomass gasification, used for the cogeneration of electrical energy and heat in a 200 kW unit. During the development phase, six possible connection solutions were investigated, elaborated and optimized. The paper gives a basic description of the technology and the technological solutions, and in particular the results of the balance and financial calculations, ending with a comparison and evaluation of the results.

Keywords: microcogeneration, biomass, combustion, heat balance, economic indicators.

1. Introduction — microcogeneration

The Energy Institute at the Faculty of Mechanical Engineering, Brno University of Technology, has been conducting research and development in the area of gasification since 2000. The institute's laboratory is equipped with a Biofluid 100 atmospheric fluidized generator with 150 kWt capacity [1]. The development activities focus on catalytic technologies for cleaning gases. A cogeneration unit based on a combustion engine with 22 kWe capacity is attached to Biofluid 100, which enables tests to be run on operation of the unit fired by gas generated in the gasification process.
The main problem when applying gasification technologies integrated with combustion engines is the unreliability of the units [1–3]. Gas cleaning is also an energy-intensive and expensive process. The Energy Institute at the Faculty of Mechanical Engineering has therefore participated in several projects working on the design and manufacture of brand-new cogeneration units based on biomass combustion or a gasification unit.

Cogeneration of energy from biomass and from wastes produces heat that is often difficult to utilize. It is therefore beneficial to design small-capacity units, so-called microcogeneration stations, where the problem with heat utilization is not so striking. These units are constructed directly on the site where the fuel comes from, most commonly logging, the wood-processing industry or the agricultural industry. Transport costs may have a negative impact on the overall cost-effectiveness of the operation of these energy stations, especially when fuel is to be transported over long distances; fuel transport may also contribute to heavy traffic on the roads. All of these drawbacks are eliminated when microcogeneration units are introduced. Other benefits include a reduction of electrical energy losses in the power transmission system, since the electrical energy is consumed in the close vicinity of the microgeneration unit. Research carried out at the Energy Institute at Brno University of Technology (EI BUT) has recently focused on microcogeneration units of this kind. In cooperation with commercial businesses, we have been participating in the development of the Stirling engine and of an efficient steam engine designed for generating electrical energy from steam at low capacities. We are also developing a cogeneration station incorporating gasification technology and a hot-air microturbine. The basic principle is shown in Figure 1. It involves the use of a biomass combustion or gasification unit: flue gas released from the boiler heats up the compressed air from the compressor in a heat exchanger, and the heated air is supplied to a turbo-set, where it expands. The air leaves the turbine at a high temperature and can partly be used as combustion air; the rest is mixed with the flue gas to dry biomass, or serves for heating purposes.

2. Design and balance calculations for a microcogeneration unit

We will now compare six basic options, and modifications to them, for a unit with 200 kW designed electric power which utilizes waste heat for drying biomass. Due to limited space in this paper, we do not provide full specifications of each solution, only the basic parameters and the main differences between the designs. The basic design parameters were the mass flow of the air and the temperature and pressure of the air before and beyond the turbine. The desired flue gas temperature at the end point of the technology, i.e. at the inlet to the laboratory oven, was 175 °C.

Figure 1. Scheme of the designed technology — Solution 1.
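As a rough plausibility check of the hot-air cycle described above, the expansion and compression powers can be scripted from first principles. This is only an order-of-magnitude sketch: the specific heat, the compressor inlet temperature and the isentropic efficiency below are assumptions, and turbine, mechanical and generator losses are ignored; the mass flow and temperatures are those of Solution 4 in Table 1.

```python
# Hot-air microturbine cycle: rough power estimate (assumed properties).
cp = 1.05e3       # J/(kg K), mean specific heat of air (assumed)
kappa = 1.4       # heat capacity ratio of air
m_dot = 2.33      # kg/s, air mass flow of Solution 4 (Table 1)

# Turbine: expansion from 850 C to 597 C (Table 1 values).
T_in, T_out = 850.0 + 273.15, 597.0 + 273.15
P_turbine = m_dot * cp * (T_in - T_out)   # ~620 kW gross expansion power

# Compressor: ~20 C ambient air from 101.3 kPa to the 400 kPa stated in the text;
# isentropic efficiency of 0.75 is an assumption.
T1, pr, eta_c = 293.15, 400.0 / 101.3, 0.75
dT_c = T1 * (pr**((kappa - 1.0) / kappa) - 1.0) / eta_c
P_compressor = m_dot * cp * dT_c

P_net = P_turbine - P_compressor          # order of the 200 kW design target
print(f"gross {P_turbine/1e3:.0f} kW, compressor {P_compressor/1e3:.0f} kW, net {P_net/1e3:.0f} kW")
```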
Table 1. Basic parameters and calculation results for Solutions 1–4.

Solution | Air flow [kg/s] | T before turbine [°C] | T beyond turbine [°C] | p beyond turbine [kPa] | Fuel [kg/h] | Combustion air T [°C] | Flue gas flow [kg/h] | Electric power [kW] | Thermal power [kW] | Electrical eff. [%] | Thermal eff. [%] | Overall eff. [%]
1 | 3.21 | 750 | 460 | 101.3 | 608.4 | 240 | 27195 | 200 | 790.7 | 9.86 | 39.0 | 48.9
2 | 2.59 | 750 | 503 | 101.3 | 416.0 | 503 | 20688 | 200 | 601.4 | 14.42 | 43.4 | 57.8
3 | 3.13 | 750 | 518 | 111.3 | 490.7 | 518 | 25343 | 200 | 737.1 | 12.23 | 45.0 | 57.3
4 | 2.33 | 850 | 597 | 111.3 | 379.5 | 597 | 18112 | 200 | 527.3 | 15.81 | 41.7 | 57.5

The flue gas temperature at the inlet to the heat exchanger was restricted to 1000 °C because of the service life of the exchanger and the material intensity. A decrease in the flue gas temperature was achieved via higher amounts of excess air during the combustion process; however, this requires bigger dimensions of the exchanger. The air pressure before the turbine was 400 kPa. The inlet data for the solutions are given in the tables below. The solutions differ in the parameters of the working media, and also in several key factors of their circuits. The basic construction characteristics and differences can briefly be summed up as follows:

Solution 1 — Air is heated in an exchanger located beyond the boiler for sawdust; flue gas from the boiler, at a temperature of roughly 1000 °C, is the heat source. Air is compressed in the compressor, passes through the exchanger and enters the turbine. Part of the air leaving the turbine is used as combustion air; the rest is mixed with the flue gas beyond the heat exchanger. A recuperator for preheating the air beyond the compressor is located in the air stream beyond the turbine; the degree of recuperation was selected as ε = 0.73. The mixture of flue gas and hot air is blended with cold air so that the flue gas reaches the desired temperature before entering the dryer. The air pressure beyond the turbine is atmospheric, so an exhauster has to be integrated into the system. For a scheme, see Figure 1.

Solution 2 — Differs from Solution 1 by the absence of a recuperator in the system. The parameters of the air flow rate and the air temperature beyond the turbine are also different.

Figure 2. Scheme of the technology — Solution 6.

Table 2. Comparison of solutions with a pellet burner — Solution 5.

Excess-air coefficient [–] | Fuel [kg/h] | Flue gas T from burner [°C] | Cooling power beyond burner [kW] | Cooler power beyond turbine [kW] | Thermal input of dryer [kW] | Total thermal power [kW] | Electric power [kW] | Electrical eff. [%] | Thermal eff. [%] | Total eff. [%]
3.6 | 427.6 | 807 | 21 | 851 | 270 | 1142 | 193.2 | 9.4 | 55.6 | 65.0
3.0 | 508.5 | 913 | 346 | 855 | 271 | 1772 | 193.2 | 7.9 | 60.2 | 68.1
2.4 | 627.2 | 1060 | 815 | 862 | 272 | 1950 | 193.2 | 6.4 | 64.7 | 71.1
1.8 | 818.0 | 1281 | 1559 | 872 | 275 | 2706 | 193.2 | 4.9 | 68.8 | 73.7

Solutions 3 and 4 — These also omit the recuperator. The overpressure of the air leaving the turbine equals 10 kPa; the combustion chamber will therefore be designed as an overpressure chamber, and again no exhauster will be installed.
A major difference in Solution 4 is that the temperature of the air before the turbine is 850 °C, much higher than in the other solutions.

2.1. Partial evaluation of basic solutions

These four basic solutions were compared on the basis of balance calculations that did not yet account for the heat losses and pressure drops; in addition, the internal consumption of the equipment, especially the input of the exhauster, was not accounted for. The basic design parameters for the particular solutions and the key results are presented in Table 1. A comparison of the solutions showed that Solution 4 seems to be the most useful, as it has the highest electrical efficiency. The low overall efficiency must be caused mainly by the significant dilution of the flue gas with cold air, carried out so that the temperature decreases before entering the dryer. If recuperation is used, preheated air enters the exchanger and the temperature of the outlet flue gas is higher by approximately 200 °C; the flue gas flow rate after dilution then basically doubles, the loss due to heat in the flue gas increases, and the input of the exhauster becomes disproportionate. Solution 1 has therefore not been included in further specifications. On the other hand, Solution 4 has an integrated overpressure system, so there is no need to install an exhauster; however, this may complicate the design of the overpressure furnace, and may cause difficulties in regulating the boiler and in maintaining optimum efficiency of operations [4].

Table 3. Overall comparison of all solutions.

Solution | Air flow [kg/s] | T before turbine [°C] | T beyond turbine [°C] | p beyond turbine [kPa] | Fuel [kg/h] | Flue gas flow [kg/s] | Exhauster input [kW] | Electrical power [kW] | Thermal input of dryer [kW] | Thermal output for heating [kW] | Electrical eff. [%] | Thermal eff. [%] | Overall eff. [%]
2 | 2.59 | 750 | 503 | 101.3 | 416.0 | 5.75 | 0 | 200 | 791 | 0 | 14.4 | 43.4 | 57.8
2a | 2.59 | 750 | 503 | 101.3 | 559 | 5.30 | 35 | 165 | 557 | 0 | 8.85 | 29.9 | 38.8
2b | 2.59 | 750 | 503 | 101.3 | 559 | 2.77 | 18 | 182 | 297 | 378 | 9.75 | 36.2 | 45.9
3 | 3.13 | 750 | 518 | 111.3 | 490.7 | 7.04 | 0 | 200 | 601 | 0 | 12.2 | 45.0 | 57.3
3a | 3.13 | 750 | 518 | 111.3 | 652.7 | 6.48 | 0 | 200 | 681 | 0 | 9.2 | 32.3 | 40.5
3b | 3.13 | 750 | 518 | 111.3 | 652.7 | 3.34 | 0 | 200 | 364 | 461 | 9.2 | 37.9 | 47.1
4 | 2.33 | 850 | 597 | 111.3 | 379.5 | 5.03 | 0 | 200 | 737 | 0 | 15.8 | 41.7 | 57.5
4a | 2.33 | 850 | 597 | 111.3 | 515 | 4.76 | 0 | 200 | 501 | 0 | 11.6 | 29.2 | 40.8
4b | 2.33 | 850 | 597 | 111.3 | 515 | 2.50 | 0 | 200 | 268 | 336 | 11.6 | 35.2 | 46.8
5 | 3.6 | 750 | 503 | 101.3 | 428 | 2.59 | 6.8 | 193.2 | 270 | 872 | 9.4 | 55.6 | 65.0
6 | 3.6 | 750 | 518 | 111.3 | 517 | 3.13 | 0 | 200 | 326 | 1106 | 8.0 | 57.7 | 65.7

2.2. Specification and extension of the designed solutions

On the basis of the acquired data and values, we decided to refine our calculations for Solutions 2–4 in two further variants.

Solutions a — Heat loss in the circuit is taken into account. The heat loss was determined partly by calculation and partly from operational experience with a similar 80 kWe unit [5]. Further, the assumed input of the exhauster was specified on the basis of the known flue gas flow rate [6]. These solutions always handle the subsequent cooling of the flue gas by mixing it with cold air and utilizing the mixture in a biomass dryer.
Solutions b — These solutions, in contrast to Solutions a, replace the mixing of flue gas with cold air by cooling in a flue gas/water exchanger, with subsequent utilization of the flue gas in a dryer. This design reduces the flow of flue gas into the stack, the losses due to heat in the flue gas, and also the input of the exhauster.

We further evaluated modifications of the previous solutions that integrate an overpressure burner for pellets and a direct supply of flue gas into the turbo-set. Compressed air is supplied into the overpressure pellet burner. The hot flue gas is cooled in a flue gas/water exchanger to the desired turbine inlet temperature, and the required flow rate also has to be met. The flue gas leaving the turbine is cooled again in a flue gas/water exchanger to reach the temperature required at the dryer inlet. Solution 5 integrates a turbo-set identical to Solution 2 (750 °C, 101.3 kPa beyond the turbine); Solution 6 integrates a turbo-set identical to Solution 3 (750 °C, 111.3 kPa beyond the turbine), i.e. a dryer with overpressure and without an exhauster — see Figure 2.

The calculations for Solutions 5 and 6 were carried out for different values of the excess-air coefficient α, in the interval 1.8–3.6. An increase in excess air leads to a decrease in the flue gas temperature in the chamber, and also to a decrease in the need to cool the flue gas before it enters the turbine (i.e., in the input of the hot-water exchanger). The boundary value is roughly α = 3.6, at which the final flue gas temperature, including the heat loss, matches the required flue gas temperature at the turbine inlet — see Table 2. In general, greater excess air leads to greater electrical efficiency and lower thermal power, and also lowers the overall efficiency. Considering the analysed solutions, we recommend adopting the solution with maximum excess air, which does not require an exchanger between the chamber and the turbine; the flue gas temperature is then controlled directly in the chamber. This solution requires less investment, eliminates the need to utilize the extracted heat, and offers maximum electrical efficiency.

2.3. Overall evaluation and a comparison of the solutions

Table 3 presents an evaluation of Solutions 2a, 2b, 3a, 3b, 4a, 4b, 5 and 6, and a comparison with the original Solutions 2–4, which did not consider the heat losses, the pressure drops and the internal consumption of the exhauster. The results correspond with data obtained while participating in the development and commissioning of a similar 80 kWe unit [5].
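As a cross-check of the efficiency figures in Tables 1–3, the following sketch recomputes them from the fuel flow. The lower heating value of the fuel is not stated in this excerpt; back-solving from Table 1 suggests the calculations used roughly 12 MJ/kg, which is assumed below.

```python
LHV = 12.0e6  # J/kg — assumed lower heating value, back-solved from Table 1

def efficiencies(p_el_kw, p_th_kw, fuel_kg_h):
    # eta = power / fuel heat input, with Q_fuel = m_fuel * LHV
    q_fuel_kw = fuel_kg_h / 3600.0 * LHV / 1e3
    return 100.0 * p_el_kw / q_fuel_kw, 100.0 * p_th_kw / q_fuel_kw

# Solution 2 from Table 1: 200 kWe, 601.4 kWt, 416.0 kg/h of fuel.
eta_el, eta_th = efficiencies(200.0, 601.4, 416.0)
print(f"eta_el = {eta_el:.1f} %, eta_th = {eta_th:.1f} %")  # ~14.4 % and ~43.4 %, matching Table 1
```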
3. Economic comparison of the proposed solutions

Several solutions for microcogeneration units combusting biomass were proposed in the previous sections. The optimum solution has to be selected on the basis of the cost-effectiveness of the whole project, and the assessment may be carried out using generally known indicators. These include the net present value (NPV), which expresses the appreciation of the investment, including the cost of capital, over a given period of time, usually the service life of the particular piece of equipment. Other indicators include the simple return on the investment (which does not take into account the change in the value of money over time), the discounted cash flow (DCF) and the internal rate of return (IRR). Less significant indicators are the payback period (PP), the profitability index (IR) and the return on investment (ROI).

A basic economic assessment requires precise identification of the input and output commodities. Above all, it is crucial to have available the real value of the investments and the fixed costs. For the purposes of our analysis, the prices of the components of the basic units were determined; small payments and extra work are either included in the components or neglected. The fixed costs include only costs directly linked with the operation of the unit; expenses linked with other services (e.g. book-keeping, company overhead, etc.) are neglected. In addition, we have to identify the prices of the input materials (fuel) and the selling prices of the output commodities (electrical energy and heat). Energy prices may be identified using trends in market prices or, if possible, using state-guaranteed feed-in prices and green premiums.

Table 4. Overview of investment costs and fixed costs.
Investment: gasifier €80,000; turbo-generator €200,000; flue gas/air exchanger €80,000; flue gas/water exchanger €20,000.
Fixed costs: services €15,000/year; service costs €3,005/year.

Table 5. Overview of fuel prices and feed-in energy prices.
Variable expenses and selling prices: fuel price — sawdust €84/t; fuel price — pellets €168/t; feed-in price of electricity €155.2/MWh; selling price of heat €14/GJ (including green premium).

The calculations assume an operational time of 7500 hours/year and a discount rate of 6 %. We may neglect pressure drops and heat losses, as mentioned above, as well as potential changes in fuel prices and in the prices of both generated energies over time; predicting these prices is rather complex and lies beyond the scope of this paper.

We analysed the solutions on the basis of these values for expenses and revenues. The results of the analysis are given in the chart in Figure 3 and in the DCF Table 6.

Figure 3. Discounted cash flow.
Figure 4. Chart showing the dependence of the NPV on the fuel price.

Table 6. Discounted cash flow.

Solution | Simple payback [years] | Discounted payback [years] | PP | NPV [€] | IRR [%] | IR | ROI [%]
2a | 11.19 | 19.5 | −63.03 | 9145 | 0.30 | 1.03 | 5.10
2b | 3.94 | 6.86 | 11.14 | 727423 | 18.00 | 2.91 | 14.60
3a | 5.93 | 10.34 | −1.17 | 336364 | 9.40 | 1.93 | 9.70
3b | 3.3 | 5.75 | 15.56 | 940695 | 22.80 | 3.48 | 17.40
4a | 4.53 | 7.9 | 7.3 | 550978 | 14.70 | 2.53 | 12.70
4b | 3.21 | 5.6 | 16.22 | 977548 | 23.60 | 3.57 | 17.90
5 | 3.04 | 5.3 | 17.52 | 832697 | 25.30 | 3.78 | 18.90
6 | 2.87 | 5.01 | 18.8 | 897129 | 27.10 | 3.99 | 20.00

It is clear that Solutions 5 and 6 have the shortest payback period, thanks to their low initial investment costs. However, the highest revenue over the whole service life may be expected from Solutions 4b and 3b. The solutions that do not make use of the residual heat in an exchanger (Solutions a) are the worst options; the investment in Solution 2a pays off only at the very end of its service life. Solutions 3b, 4b, 5 and 6 all have a reasonable payback period of 5–6 years.
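The indicators in Table 6 can be reproduced in a few lines once the annual cash flow of a solution is known. The cash-flow value and the 20-year service life in the example below are placeholders (the excerpt does not state the life explicitly); the sketch only illustrates the mechanics of the NPV and the discounted payback period at the 6 % discount rate used in the paper.

```python
def npv(investment, annual_cf, years, discount=0.06):
    # Net present value of a constant annual cash flow.
    return -investment + sum(annual_cf / (1.0 + discount)**t for t in range(1, years + 1))

def discounted_payback(investment, annual_cf, discount=0.06, max_years=100):
    # First year in which the cumulative discounted cash flow turns positive.
    cumulative = -investment
    for t in range(1, max_years + 1):
        cumulative += annual_cf / (1.0 + discount)**t
        if cumulative >= 0.0:
            return t
    return None  # never pays back within max_years

# Placeholder numbers: EUR 380,000 invested, EUR 80,000/year net revenue, 20-year life.
print(npv(380_000, 80_000, years=20), discounted_payback(380_000, 80_000))
```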
Assessing the economic analyses, we also have to carry out a sensitivity analysis of the quantities, so that we know in advance how a change in one quantity may influence the expected outcome. The following basic quantities were selected for the purposes of our sensitivity analysis: the investments, the fuel prices, the feed-in price of electrical energy and the selling price of heat. The sensitivity of each solution to changes in these quantities was essentially the same; a thorough calculation was therefore performed for Solutions 4b and 5. We examined the influence of the selected quantities on the NPV and on the discounted cash flow. The results of the sensitivity analysis are shown in the charts in Figures 4–9, which show the dependence of the NPV and the discounted cash flow on the prices of fuel and heat and on the investment. On the basis of these calculations we may say that the NPV and the discounted cash flow depend strongly on the fuel price, the feed-in price of electrical energy, and the selling price of heat; the dependence of the discounted cash flow is exponential. In contrast, the dependence on the investment is minimal.

Figure 5. Chart showing the dependence of the discounted payback period on the fuel price.
Figure 6. Chart showing the dependence of the NPV on the selling price of heat.

4. Overall evaluation and a comparison between the options

In Solution 2, the electrical efficiency decreased by 5 % and the total efficiency by more than 10 %, due to the impact of heat loss and the decrease in net electrical power. Although Solution 3 does not have to account for the input of the exhauster, its electrical and total efficiencies are not significantly higher than those of Solutions 2a and 2b. The highest electrical efficiency among all originally designed solutions was achieved with Solution 4, thanks to the higher temperature of the air entering the turbine. The internal consumption of the exhauster is also eliminated; however, the whole technology would need to operate at higher internal pressure, which may make it problematic to seal the combustion equipment. A technical solution to this problem may nevertheless be possible, for instance cooling the boiler and the grate during combustion using very hot combustion air, which enables even inferior and very wet types of fuel to be combusted.

All values show that the b variants are the most suitable option: instead of a supply of cold air, an exchanger for heating water reduces the flue gas temperature to the value required at the dryer inlet (175 °C). The greatest electrical efficiency was achieved with the original Solution 4, with a conventional boiler and air heating in the exchanger. The greatest overall efficiency was identified for the solutions with a direct supply of flue gas into the turbo-set using a pellet burner.

Figure 7. Chart showing the dependence of the discounted payback period on the selling price of heat.
Figure 8. Chart showing the dependence of the NPV on investments.

Although the electrical efficiency did not achieve the expected values of 15–20 %, the results are adequate in comparison with ORC technology. We also have to bear in mind that the published data describing the electrical efficiency of ORC technology in the interval 17–18 % [7] refer to the ORC unit only: the heat transferred by the oil heat medium is considered as the input energy flow, and the boiler efficiency, the internal consumption of the boiler and the consumption of the oil heat-medium circuit (oil pump) are not included in the provided data.
This is therefore somewhat misleading. After including all losses and drops, as well as the internal consumption and the efficiency of the boiler, we find that the net electrical efficiency of ORC technology relative to the energy in the fuel is approximately 11 %. If the power output is reduced to 50 %, the total electrical efficiency decreases below 9 % for a 1 MWe unit.

Solutions 3b, 4b, 5 and 6 offer the best results in the economic analyses, with little difference between their performances; their payback period is 5–6 years. We may therefore choose the desired solution from among these options on the basis of technical criteria, and the choice is then presented to the client. The benefits and drawbacks of the particular solutions may briefly be summed up as follows:

• Solutions 3b and 4b produce less heat for the same electric power, which makes easier and complete use of the heat possible.
• Compared with Solution 3b, Solution 4b requires a higher air heating temperature of 850 °C, which is the limit value of the turbo-set manufacturer and may complicate the size, the heat load and/or the cooling of the exchanger. This has been substantiated by practical experience from other installations.
• However, Solution 4b achieves higher electrical efficiency than Solution 3b, and the overall efficiency is comparable.
• Both solutions rely on hot combustion air at 10 kPa overpressure, so the combustion unit will be operated at higher internal pressure. This leads to problems such as the threat of fires during fuel transport into the combustion chamber.
• Solutions 5 and 6 do not require the installation of a flue gas/air exchanger, which significantly reduces the investment. In addition, the total efficiency is much higher.
• However, the drawbacks of Solutions 5 and 6 include greater heat production, the need to use standardized high-quality pellets, a burner designed for combustion at 4 bar pressure, and the requirement of zero solid particulate matter in the flue gas, which flows directly into the turbo-set. It is important to bear in mind that modification and compression of the fuel is an expensive and energy-intensive process, which has a great impact on the energy savings and the environmental benefits of the whole technology [8–10].

Figure 9. Chart showing the dependence of the discounted payback period on investments.

5. Conclusions

This paper has summarized the situation in energy production in the Czech Republic, with special focus on the targets and direction of the Czech energy sector, and has discussed the use of renewable energy sources and wastes. The second part of the paper presents the main research areas studied at the Energy Institute at the Faculty of Mechanical Engineering, Brno University of Technology, focusing on microcogeneration using biomass combustion. The technical and economic optimization of the design of a new 200 kW unit is outlined, and the paper ends with a comparison of the results. The range of electrical efficiency is 9–12 %, and the overall efficiency of the unit is 65 %. Our economic analysis shows a payback period of 5–8 years for the unit. These units may be operated in logging, in the wood-processing industry and in agricultural production, where fuel is available in the form of waste products; at the same time, all-year-round consumption of the residual heat must be secured.
The head of the research project anticipates that the technology can be exported and used in developing countries with a poor distribution network; these regions typically have abundant supplies of very cheap biomass and a lack of electrical energy. We conclude that the microcogeneration technology presented here is a viable and plausible option.

Acknowledgements
This paper reports on a study that was made possible by financial support from the Ministry of Industry and Trade of the Czech Republic within project FR-TI4/353 and the NETME Centre — New Technologies for Mechanical Engineering project (project CZ.1.05/2.1.00/01.0002).

References
[1] Ochrana, L., Skála, Z., Dvořák, P., Kubíček, J., Najser, J.: Gasification of solid waste and biomass, VGB PowerTech, 84 (6), pp. 70–74, 2004.
[2] Lisý, M., Baláš, M., Moskalík, J., Štelcl, O.: Biomass gasification — primary methods for eliminating tar, Acta Polytechnica, 52 (3), pp. 66–70, 2012.
[3] Baláš, M., Lisý, M., Štelcl, O.: The effect of temperature on the gasification process, Acta Polytechnica, 52 (4), pp. 7–11, 2012.
[4] Plaček, V., Oswald, C., Hrdlička, J.: Optimal combustion conditions for a small-scale biomass boiler, Acta Polytechnica, vol. 52 (3), pp. 89–92, 2012.
[5] Lisý, M., Štelcl, O., Baláš, M., Skála, Z.: New cogeneration technology for small industrial applications, Acta Metallurgica Slovaca, vol. 2, no. 1, pp. 126–131, 2011. ISSN 1338-1660.
[6] Černý, V., Janeba, B., Teysler, J.: Parní kotle (Steam Boilers), Technický průvodce 32, SNTL Praha, 1983.
[7] Kunc, J.: ORC technologie v realizaci (II) — Trhové Sviny, srovnání (ORC technology in practice (II) — Trhové Sviny, a comparison), http://www.tzb-info.cz/2834-orc-technologie-v-realizaci-ii-trhove-sviny-srovnani [10.3.2013]
[8] Beniak, J., Ondruška, J., Čačko, V.: Design process of energy effective shredding machines for biomass treatment, Acta Polytechnica, vol. 52, no. 5, pp. 133–137, 2012. ISSN 1210-2709.
[9] Matúš, M., Križan, P.: Modularity of pressing tools for a screw press producing solid biofuels, Acta Polytechnica, 52 (3), pp. 71–76, 2012.
[10] Beniak, J.: Monitoring of operational load of disintegrative machines, in: Applied Mechanics and Materials, vol. 309, 3rd Central European Conference on Logistics (CECOL 2012), November 28–30, 2012, Trnava, Slovak Republic (2013), pp. 88–95. ISSN 1660-9336, ISBN 978-3-03785-636-9.
Acta Polytechnica 54(1):42–51, 2014

doi:10.14311/AP.2015.55.0096
Acta Polytechnica 55(2):96–100, 2015 © Czech Technical University in Prague, 2015

Immobilization of humic substances using plasma modification

Pavlína Hájková (a, b, *), David Tichý (b), Jiří Čmelík (a), Petr Antoš (a)
(a) Research Institute of Inorganic Chemistry, Revoluční 84, 400 01 Ústí nad Labem, Czech Republic
(b) Technical University of Liberec, Department of Material Science, Studentská 2, 461 17 Liberec, Czech Republic
* Corresponding author: pavlina.hajkova@vuanch.cz

Abstract. This paper presents a study of the immobilization of humic substances (HSs) on a polypropylene (PP) nonwoven fabric. In order to attach the HSs, the PP nonwoven fabric was modified in the volume of a nonthermal atmospheric-pressure dielectric barrier discharge (DBD) under defined conditions. An unmodified PP nonwoven fabric was used as a reference sample. The modified and unmodified samples were both dipped in an aqueous solution of potassium humate, the samples were then washed in water, and the amount of HSs attached to the PP fabric was monitored. An aqueous solution of cadmium salts was filtered through the treated fabric, the content of Cd2+ in the solution was monitored using ICP-OES analysis, and the Cd2+ sorbed on the fabric was detected by SEM/EDS analysis. The efficiency of the PP plasma modification was verified by XPS analysis, and the presence and distribution of the HSs along the fibers was shown by SEM analysis.

Keywords: DBD plasma modification; PP nonwoven fabric; humic substances; cadmium.

1. Introduction

In recent times, specialists have been taking an increased interest in humic substances (HSs) because of their ability to absorb heavy metals, such as Cd2+, Pb2+, Cu2+, Ba2+, Ni2+, Zn2+, Co2+, Mn2+, Mg2+ and Ca2+. HSs are the major organic constituents of soils, natural waters, river, lake and sea sediments, peat, brown-black coals and other natural materials, formed as products of the chemical and biological transformation of animal and plant residues [1–7]. The interaction of humic substances with metals could play an important role in removing these hazardous substances from the environment [4, 6–10]; many studies therefore focus on the interaction of heavy metals in wastewater with HSs [3–5, 8–12]. However, HSs are used mainly in powder form or in solution, which limits their applications. It is therefore necessary to attach humic substances to porous carriers, which could then be used as replaceable filters for filtering liquids or gases. HSs can be immobilized on a porous material by various methods, e.g., by chemical treatment or by plasma treatment. For plasma treatment, several plasma types can be used; the most widely used are low-pressure plasmas [13] and atmospheric plasmas [14–16]. Low-pressure plasma treatments usually allow better control of the surface chemistry, but quite complicated apparatus is needed.
For our study, an atmospheric-pressure dielectric barrier discharge (DBD) was used, because it can easily be integrated into a production line, it has a relatively simple design, and there is no need for vacuum pumps, which are relatively expensive and also require long pumping times. Our paper reports on the immobilization of humic substances (HSs) on a polypropylene (PP) nonwoven fabric. In order to attach the HSs, the PP nonwoven fabric was modified in the volume of a nonthermal atmospheric-pressure DBD under defined conditions. Cadmium, which is highly toxic to the environment, was selected for the tests of the PP fabric with HSs for wastewater treatment: cadmium in the form of salts contaminates wastewater and air, and it is a major threat to human health because it accumulates in tissues.

2. Experimental

2.1. Humic substances (HSs)

Potassium humate was obtained from the raw material by alkaline extraction (KOH solution) at an elevated temperature, as described in detail in [3]. The raw material was young coal (oxyhumolite) from the Václav mine in North Bohemia (near Bílina). The content of humic substances in the prepared sample was determined, and the sample was subsequently diluted with water to a final HS content of 15 %. Analysis of the potassium humate (standard chemical procedures, described in detail in [3]): water 79.62 %; ash in dry state 29.88 %; HSs in dry state 70.12 %; pH 10; density 1111 kg/m³; HSs in dry state, leachable: 69.82 %; fulvic acids: 0.99 %; total acidity of humic substances: 8.59 meq/g.

2.2. Polypropylene (PP) nonwoven fabric

The PP nonwoven fabric was 0.5 mm thick with an areal density of 0.019 g/cm², and the diameter of a single fiber was 20 µm. For the experiments, a strip of nonwoven PP fabric 100 m in total length and 5 cm in width was used; the strip was cut into 4 parts used for the 4 types of samples described in Section 2.5.

Figure 1. DBD plasma reactor with parallel electrode alignment: a) powered electrode, b) dielectric barrier 1 — corundum, c) PP textile sample, d) dielectric barrier 2 — rubber, e) grounded electrode.

2.3. Chemicals

An aqueous solution of cadmium salts with a Cd2+ concentration of 1 mmol/l (112.4 mg/l) was used to determine the sorption capacity of the plasma-modified and unmodified PP with and without humates. A 7 % acetic acid solution was used for acidification of the potassium humate.

2.4. Plasma modification

The dielectric barrier discharge treatments were conducted in air at atmospheric pressure, and the discharges were operated in filamentary mode, i.e., they consisted of many tiny microdischarges (filaments) randomly distributed over the entire area of the electrodes. The plasma reactor (Figure 1) consists of two plane-parallel electrodes made from Ag80Cu alloy covered by a 1 mm layer of dielectric (corundum and rubber). Both electrodes are rectangular, with dimensions of 60 × 50 mm and a thickness of 8 mm, without active cooling; a detailed description is given, e.g., in [17]. The distance between the electrodes was 4.5 mm. The discharge was ignited by means of an AC power source and filled the entire space between the electrodes, which means that the samples were completely immersed in the plasma.
Plasma treatment conditions and parameters:
• AC source voltage: 20 kV;
• AC source frequency: 3 kHz;
• nominal power: 120 W;
• distance of electrodes: 4.5 mm;
• modification time: 3 s;
• working gas: ambient air.

The efficiency of the plasma treatment of the PP fabrics was evaluated by the liquid drop test (Arcotest) and by XPS analysis.

2.5. Method

The PP fabric samples were dried at 105 °C for 1 hour to remove adsorbed water, and were then weighed and divided into 4 groups. The first group was modified in the DBD plasma for 3 s in order to increase the wettability and the adhesion of the HSs. Immediately after modification, the samples were immersed in the aqueous solution of potassium humate for 60 seconds, dried at room temperature for 24 hours and then for 1 hour at 105 °C. After weighing, the samples were dipped in acetic acid to stabilize the humic acids, and were then washed for 30 min in distilled water. Finally, the samples were dried and weighed again; this allowed us to determine the amount of humate immobilized on the PP fabric. The second group was prepared in the same way as the first group, but without plasma modification. The third and fourth groups were control samples without humates: the third group was DBD plasma modified under the same conditions as the first group, and the fourth group was not plasma treated.

All 4 types of samples were tested for cadmium sorption capacity using a column apparatus. The column was filled with circular cuttings of the PP fabrics, cut to fit the column; the total area of the PP fabrics was 62.8 cm². The aqueous solution of cadmium salts was dripped through at a flow rate of 0.5 ml/min. The eluted solution was tested after each 1 ml for the presence of Cd2+, using inductively coupled plasma optical emission spectrometry (ICP-OES). When the Cd concentration at the column output reached the input concentration, the total amount of cadmium trapped in the column was used to determine the sorption capacity (i.e., the difference between the amount of Cd entering and exiting the column was measured). The placement of the HSs on the fibers was observed by SEM analysis, and the Cd2+ sorbed on the fabric was detected by SEM/EDS analysis.

3. Results

The XPS analysis confirmed a significant change in the chemical composition of the surface of the PP fibers after the DBD plasma modification. There were visible changes in the C1s peak (an increased signal corresponding to carbon bonds with oxygen and OH groups); the deconvolutions were performed by comparing the spectra with the literature [18–20]. A quantification of the carbon and oxygen content using the C1s and O1s signals is given in Table 1. The plasma-modified PP fibers contained 14 at% of oxygen, in comparison with 6 at% for the unmodified PP fibers, and the O/C ratio of the modified fabric was more than 2.5 times higher. The literature [14–16] reports similar results for PP modified by plasma at atmospheric pressure. The change in the chemical composition of the surface-modified PP fabric corresponds to its significantly increased wettability by an aqueous solution of potassium humate, and is consistent with the increase in surface energy detected using the specified Arcotest liquids.
pp/hs before after washing washing content content content of sorbed cd sorptive sorptive of hs of hs hs/filling capacity capacity [mg/cm2] [mg/cm2] [mg] [mg/filling] [µgcd/cm2] [µgcd/mghs] unmodified 32 3 188.4 0153 2.44 0.81 dbd modified 73 9 565.2 0.549 8.74 0.97 table 2. effectiveness of modifying fabrics with hss. figure 2. placement of hss on unmodified pp after washing. increase in surface energy detected using the specified arcotest liquids. the unmodified pp showed surface energy of only 28 mn/m, while the modified pp fabrics had surface energy more than twice higher (more than 56 mn/m) according to the arcotest. due to the higher wettability of the modified pp fabrics, the modified samples retained a significantly higher amount of immobilized humate. both groups of samples lost 90 % of the humate after the pp fabrics were treated in acetic acid and washed in water. plasma-treated pp retained 9 mg/cm2 of hss, in comparison with 3 mg/cm2 of hss immobilized on the unmodified fabric (tab. 2). figures 2 and 3 show an obvious difference in the distribution of immobilized particles of hss on individual fabric fibers. there are particles of hss very densely attached to the dbdtreated fabric, whereas the placement of the hss on the fabric without treatment is sparse, confirming the effectiveness of dbd plasma treatment. there is a relatively good correlation between the triple amount figure 3. placement of hss on dbd-modified pp after washing. of hss on the dbd-treated fabric and the 3.5 times higher amount (according to icp-oes) of sorbed cd2+ in comparison with the unmodified fabric. the higher hs content allowed a greater amount of cd to be captured, and in addition the distribution and the size of individual particles of hss immobilized on the pp fabric can also have a positive influence on the cd sorption capacity. the amount of hss was measured by weighing, and the amount of trapped cadmium was estimated by measuring the concentration of the cd salt solution entering and exiting the column by icp-oes (see section 2.5). sorption of cd2+ on hss particles was confirmed by an sem/eds analysis, see figures 4 and 5, where fig. 4 shows the same location as fig. 3. the experiments where the aqueous solution of cadmium salts was filtered through the pp fabric with immobilized hss confirmed a positive effect of dbd modification of fabrics for increasing cd2+ filtration 98 vol. 55 no. 2/2015 immobilization of humic substances using plasma modification figure 4. placement of cd2+ (green) on particles of hss on dbd-modified pp after the experiment with an aqueous solution of cd salts. figure 5. eds of pp fabrics after adsorption of cd2+. yellow indicates unmodified pp textiles with hss, and green indicates dbd-modified pp textile with hss. efficiency. the cadmium concentration used in the solution was more than 100 times higher than the permissible limit for surface water. for a modified pp fabric with hss, the eluted solution reached its original concentration after 25 ml had been dripped (fig. 6), whereas for the unmodified fabric with hss the eluted solution reached its original concentration after just 7 ml. in addition, for unmodified fabric with hss, the hss particles released together with sorbed cd2+, as indicated by following fluctuation of values in range 7–16 ml.in addition, for the unmodified fabric with hss, the hss particles were released together with sorbed cd2+, as indicated by the following subsequent fluctuation of values between 7–16 ml. . 4. 
4. Conclusion

Our experiments have confirmed the positive effect of DBD plasma treatment on the immobilization of HSs on the PP nonwoven fabric. DBD treatment of the PP fabric increased the weight of the immobilized HSs on the fabric threefold in comparison with untreated PP fabrics, and the modified fabrics with immobilized HSs show a more than 3 times higher Cd2+ sorption capacity. These results can be very important in the field of wastewater treatment and air purification, and can offer an economical and ecological solution for removing heavy metals, particularly cadmium, from the environment.

Acknowledgements
This paper was prepared using the infrastructure supported by the UniCRE project (reg. no. CZ.1.05/2.1.00/03.0071), funded by the EU Structural Funds and by the state budget of the Czech Republic. The work was also supported by institutional funds (Ministry of Industry and Trade of the Czech Republic). We thank the EFS OPVK — EnviMod project reg. CZ.1.07/2.2.00/28.0205 of the Ministry of Education, Youth and Sports of the Czech Republic. In addition, the authors thank Pavel Kejzlar for the SEM-EDS investigations, Jindřich Matoušek for the XPS analysis, and the technological centre of MSV Systems CZ s.r.o. for its help in constructing the DBD plasma apparatus.

References
[1] Stevenson, F. J., Humus Chemistry. Genesis, J. Chem. Educ., 1995, 72 (4), p. A93. doi:10.1021/ed072pA93.6
[2] Dick, W. A., Humic substances in the global environment and implications on human health, Journal of Environmental Quality, 1995, vol. 24, no. 3, p. 558. doi:10.2134/jeq1995.00472425002400030029x
[3] Novák, J. et al., Humic acids from coals of the North-Bohemian coal field I. Preparation and characterisation, React. Funct. Polym., 2001, (47), 101–109. doi:10.1016/S1381-5148(00)00076-6
[4] Čežíková, J. et al., Humic acids from coals of the North-Bohemian coal field II. Metal-binding capacity under static conditions, React. Funct. Polym., 2001, (47), 111–118. doi:10.1016/S1381-5148(00)00078-X
[5] Madronová, L. et al., Humic acids from coals of the North-Bohemian coal field III. Metal-binding properties of humic acids — measurements in a column arrangement, React. Funct. Polym., 2001, (47), 119–123. doi:10.1016/S1381-5148(00)00077-8
[6] Rate, A. W., Sorption of Cd(II) and Cu(II) by soil humic acids: temperature effects and sorption heterogeneity, Chemistry and Ecology, vol. 26, issue 5, 2010. doi:10.1080/02757540.2010.504666
doi:10.1007/bf00291834 [10] sposito g., weber j.h., sorption of trace metals by humic materials in soils and natural waters. critical reviews in environmental control (1986), vol. 16, iss. 2, pp. 193-229. doi:10.1080/10643388609381745 [11] castro, g. r. et al, lability of cd, cr, cu, mn and pb complexed by aquatic humic substances, ecl. quím., são paulo, 30(2): 45-51, 2005. doi:10.1590/s0100-46702005000200006 [12] jing-fu lio et al, coating fe3o4 magnetic nanoparticles with humic acid for high efficient removal of heavy metals in water. environ.sci. technol., 2008, 42 (18), pp 6949–6954. doi:10.1021/es800924c [13] r. shishoo, plasma technologies for textiles, woodhead publishing limited, 2007, isbn-13: 978-1-84569-073-1 [14] fang, z., xie, x., li, j., yang, h., qiu, y., kuffel, e., comparison of surface modification of polypropylene film by filamentary dbd at atmospheric pressure and homogeneous dbd at medium pressure in air j. phys. d: appl. phys. 42 (2009). doi:10.1088/0022-3727/42/8/085204 [15] cui, n. y., brown n. m. d., modification of the surface properties of a polypropylene (pp) film using a dielectric barrier discharge plasma, applied surface science 189, (2002), 31-38. doi:10.1016/s0169-4332(01)01035-2 [16] wang changquan et al, surface treatment of polypropylene films using dielectric barrier discharge with magnetic field 2012, plasma sci. technol. 14, 891. doi:10.1088/1009-0630/14/10/07 [17] tichý d., hájková p., modification of composite material fillers by atmospheric plasma discharge, acta polytechnica 53 (2):237–240, 2013 [18] liu, c., frenkel, a. i., vairavamurthy, a., huang, p. m., sorption of cadmium on humic acid: mechanistic and kinetic studies with atomic force microscopy and x-ray absorption fine structure spectroscopy, can. j. soil. sci., 01/02/12, p. 337-348. doi:10.4141/s00-070 [19] briggs, d., grant, j.t., surface analysis by auger and x-ray photoelectron spectroscopy, isbn 1-901019-04-7 [20] b.v. crist, handbook of monochromatic xps spectra, (vol. 
Acta Polytechnica 55(2):96–100, 2015

doi:10.14311/AP.2015.55.0128
Acta Polytechnica 55(2):128–135, 2015 © Czech Technical University in Prague, 2015

Comparison between 2D turbulence model ESEL and experimental data from AUG and COMPASS tokamaks

Peter Ondáč (a, b, *), Jan Horáček (b), Jakub Seidl (b), Petr Vondráček (a, b), Hans Werner Müller (c), Jiří Adámek (b), Anders Henry Nielsen (d), and ASDEX Upgrade Team (c)
(a) Department of Surface and Plasma Science, Faculty of Mathematics and Physics, Charles University, V Holešovičkách 2, 180 00 Prague 8, Czech Republic
(b) Institute of Plasma Physics, Academy of Sciences, Za Slovankou 1782/3, 182 00 Prague 8, Czech Republic
(c) Max-Planck-Institut für Plasmaphysik, Boltzmannstraße 2, D-85748 Garching, Germany
(d) Association EURATOM–Risø National Laboratory, Technical University of Denmark, OPL-128 Risø, Roskilde, DK-4000, Denmark
* Corresponding author: shreder.peter.ondac@gmail.com

Abstract. In this article we have used the 2D fluid turbulence numerical model ESEL to simulate turbulent transport in edge tokamak plasma. Basic plasma parameters from the ASDEX Upgrade and COMPASS tokamaks are used as input for the model, and the output is compared with experimental observations obtained by reciprocating probe measurements from the two machines. Agreement was found in the radial profiles of the mean plasma potential and temperature, and in the level of density fluctuations. Disagreements, however, were found in the levels of plasma potential and temperature fluctuations. This implies a need for an extension of the ESEL model from 2D to 3D to fully resolve the parallel dynamics and the coupling from the plasma to the sheath.

Keywords: turbulence; tokamak; computer model; probe measurements.

1. Introduction

Transport (mainly turbulent) in the outermost plasma region, in contact with material surfaces, regulates the particle and heat loads on plasma-facing components. Control of this transport is very important for particle and heat confinement in tokamaks and other magnetized plasma experiments (stellarators, linear devices, reversed field pinches). Intermittent turbulent structures alone account for more than 50 % [1] of the radial plasma transport towards the material surfaces. Cross-field particle and heat losses from the central plasma represent a very high risk of damaging the tokamak first wall and other plasma-facing components. At the same time, impurities released by these components spread through the boundary region into the central plasma, cooling it down by their radiation and decreasing the fusion rate through dilution of the fuel. Interchange instability is one of the candidates for explaining these significant cross-field plasma losses observed at the tokamak edge.

The numerical model ESEL (Edge-SOL-ELectrostatic) [2–4] simulates the edge turbulent plasma as three interacting fluid fields: electron temperature, density and vorticity. In this article, following [5], we compare the predictions of the model with experimental measurements on two tokamaks with ITER-like geometry, ASDEX Upgrade [6] and COMPASS [7]. In the past, similar comparisons were also made for the JET [8], TCV [3], TEXTOR [9] and MAST [10] tokamaks. From [8] it follows that the radial transport in the edge plasma is dominated by electrostatic interchange turbulence. Initially, ESEL was applied to TCV plasmas with high collisionality, and was very successful: quantities calculated from the measured density were in good agreement with the model [3, 11–13]. However, in comparisons between ESEL and the ASDEX Upgrade plasma, which has a higher temperature and therefore lower collisionality than the TCV tokamak, there were large discrepancies [5]. In this case the temperature and potential were measured as well: the radial profile of the density in ESEL was too flat, and the relative temperature and electric potential fluctuations in the model were too large compared with the experimental values; the radial profile of the electric potential also differed between the model and the experiment. Later, a new extra linear term (described below) was added into one of the ESEL governing equations. With this term, the model's radial profile of the electric potential was finally in agreement with the ASDEX data, and the radial profile of the density was more consistent with the ASDEX data; however, the relative density, temperature and electric potential fluctuations were even larger than before [5]. Since then, a change in the extra term has been made, together with an extensive investigation of the parameter dependence in ESEL matching the ASDEX Upgrade plasma; first-time comparisons between ESEL and the COMPASS tokamak have also been made. A brief summary of the results is presented in this paper. In this context, the purpose of this work is to show the actual possibilities of interpreting the experimental data with the help of the interchange turbulence paradigm in the 2D model ESEL.

Figure 1. A photograph of the probe head with 4 Langmuir probes (LPs) and 4 ball-pen probes (BPPs) used in the ASDEX Upgrade. The vertical axis of the image corresponds to the direction parallel to B, and the horizontal axis corresponds to the poloidal direction in the tokamak [16].

As far as the experiment is concerned, a high temporal (1 µs) and spatial (2 mm) resolution was necessary to measure individual turbulent structures (blobs). This was provided by a set of Langmuir probes (LPs) and ball-pen probes (BPPs) [14] mounted on a reciprocating manipulator. The description of the experimental setup and of the ESEL model is given in Chapters 2 and 3; in Chapter 4 we present the results obtained from the comparison between the experiment and the model. A detailed study of the comparison can be found in the master thesis [15].

2. Experiments

The data from the ASDEX Upgrade tokamak shown in this article come from discharge #24349. They were previously analysed in [5] and [16]; here, however, we use more general assumptions, such as a ratio of ion to electron temperature larger than one.
The numerical model ESEL (Edge-SOL-Electrostatic) [2–4] simulates the edge turbulent plasma as three interacting fluid fields: electron temperature, density and vorticity. In this article, following [5], we compare the predictions of the model with experimental measurements on two tokamaks with ITER-like geometry, ASDEX Upgrade [6] and COMPASS [7]. In the past, similar comparisons were also made for the tokamaks JET [8], TCV [3], TEXTOR [9] and MAST [10]. From [8] it follows that the radial transport in edge plasma is dominated by electrostatic interchange turbulence.

Initially ESEL was used for the TCV plasma with high collisionality, and was very successful: quantities calculated from the measured density were in good agreement with the model [3, 11–13]. However, in comparisons between ESEL and the ASDEX Upgrade plasma, with higher temperature and therefore lower collisionality than in the TCV tokamak, there were significant discrepancies [5]. In this case the temperature and potential were measured as well. The radial profile of the density in ESEL was too flat, and the relative temperature and electric potential fluctuations in the model were too large compared with the experimental values. The radial profile of the electric potential also differed between the model and the experiment. Later, a new extra linear term (described below) was added into one ESEL governing equation. With this term, the model electric potential radial profile was finally in agreement with the ASDEX data, and the radial profile of density was more consistent with the ASDEX data; however, the relative density, temperature and electric potential fluctuations were even larger than before [5]. Since then, the extra term has been modified, and the parameter dependence of ESEL simulations matching the ASDEX Upgrade plasma has been investigated extensively. The first comparisons between ESEL and the COMPASS tokamak have also been made. A brief summary of the results is presented in this paper. In this context, the purpose of this work is to show the actual possibilities of interpreting the experimental data with the help of the interchange-turbulence paradigm in the 2D model ESEL.

Figure 1. A photograph of the probe head with 4 Langmuir probes (LPs) and 4 ball-pen probes (BPPs) used in the ASDEX Upgrade. The vertical axis of the image corresponds to the direction parallel to B and the horizontal axis corresponds to the poloidal direction in the tokamak [16].

As far as the experiment is concerned, a high temporal (1 µs) and spatial (2 mm) resolution was necessary to measure individual turbulent structures (blobs). This was provided by a set of Langmuir probes (LPs) and ball-pen probes (BPPs) [14] mounted on a reciprocating manipulator. The description of the experimental setup and the ESEL model is given in sections 2 and 3. In section 4 we present the results obtained from the comparison between the experiment and the model. A detailed study of the comparison can be found in the master thesis [15].

2. Experiments

The data from the ASDEX Upgrade tokamak shown in this article come from discharge #24349. They were previously analysed in [5] and [16]; here, however, we use more general assumptions, such as an ion-to-electron temperature ratio larger than one.
The experiment was performed using the reciprocating horizontal manipulator located just above (z = 0.3125 m) the LFS mid-plane in the SOL [16]. The probe head used is shown in figure 1. The diameter of the ball-pen [14] collectors was 4 mm and the interior diameter of the shielding tubes was 6 mm. The diameter of the LP pins was 0.9 mm. The BPPs and LPs were located at different poloidal positions and, due to the poloidal inclination of the magnetic flux surface with respect to the probe head surface of ≈ 12°, also at different radial positions. In this work we use data from the probes LP1, LP2, BPP1 and BPP2.

The measurements were performed in a D-shaped deuterium plasma, during an L-mode with a neutral beam injection power of ≈ 1 MW, a plasma current of $I_p \approx 800$ kA, line-averaged density $n_e \approx 3\cdot10^{19}$ m$^{-3}$ and toroidal magnetic field $B_t \approx 2.5$ T in the plasma centre and ≈ 1.9 T in the SOL. The sampling frequency of the data acquisition system was 2 MHz. For the purpose of statistical analysis, the radial probe head movement was divided into 60 time intervals, and each was processed separately. These analysed time intervals, from which the experimental data points (further marked with circles) were obtained, had a duration of 2.9 ms, and each of them corresponded to 5826 measured values. Each time interval thus corresponds to the radial position where the probes were situated during that interval. In the figures, the experimental data from the radial motion both into and out of the plasma are shown (two lines with circles in each graph for ASDEX Upgrade, labelled "ASDEX"; the smaller values are related to the slower motion out).

The probes BPP1, BPP2 and LP2 were floating and measured the floating potentials $V_f^{BPP1}$, $V_f^{BPP2}$ and $V_f^{LP2}$, respectively. LP1 was biased and measured the ion saturation current $I_{is}$. The plasma potential $\phi_p$ (in V), electron temperature $T_e$ (in eV) and plasma density $n_e$ at the position of LP1 (in m$^{-3}$) were calculated as follows ($V_f^{BPP} \approx \phi_p - 0.6\,T_e/e$; the coefficient 0.6 has been found experimentally [17]):

$$\phi_p = \frac{V_f^{BPP1} + V_f^{BPP2}}{2}, \qquad (1)$$

$$T_e = \frac{T_e^{BPP1-LP2} + T_e^{BPP2-LP2}}{2}, \qquad (2)$$

$$T_e^{BPP1-LP2} = \frac{(V_f^{BPP1} - V_f^{LP2})\,e}{\alpha - 0.6}, \qquad T_e^{BPP2-LP2} = \frac{(V_f^{BPP2} - V_f^{LP2})\,e}{\alpha - 0.6},$$

$$n_e = n_e^{LP1} = \frac{13\cdot10^{18}\,\mathrm{m^{-3}\,(eV)^{1/2}}\,\sqrt{2}}{0.014\,\mathrm{A}}\;\frac{I_{is}}{\sqrt{T_e + T_i}}, \qquad (3)$$

where the coefficient α quantifies how temperature fluctuations are represented in the measured $V_f^{LP2}$:

$$\alpha = -\frac{1}{2}\,\ln\left[2\pi\left(\frac{m_e}{m_i}\right)\left(1 + \frac{T_i}{T_e}\right)\right],$$

with the electron and ion masses $m_e$ and $m_i$ and the electron and ion temperatures $T_e$ and $T_i$. Secondary electron emission from the probes is neglected. In [16], only the value α ≈ 2.8, for $T_i/T_e = 1$, has been used. Since the ratio $T_i/T_e$ can be up to 10 ([18], [20]), the corresponding range is 2 < α < 2.8. In equation (3), the constant $13\cdot10^{18}\,\mathrm{m^{-3}\,(eV)^{1/2}}/0.014\,\mathrm{A}$ is a calibration factor chosen to match the lithium beam diagnostic measurement.

The experimental data from the COMPASS tokamak were obtained in discharge #6092. The probe head shown in figure 2 is somewhat smaller and so less perturbing to the plasma. It was radially inserted into the plasma by the fast reciprocating horizontal mid-plane manipulator located at the outer mid-plane (z = 0).

Figure 2. Photograph of the COMPASS probe head with 2 Langmuir probes (LPs) and 3 ball-pen probes (BPPs). The vertical and horizontal directions of the image parallel to the probe head surface correspond to the toroidal and poloidal directions, respectively [18].
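The probe formulas (1)–(3) above are straightforward to evaluate numerically. The following minimal sketch (ours, not part of the original analysis) implements them in Python; the function names and the sample signal values are illustrative choices, the calibration constant and the placement of √2 follow equation (3) as printed, and the assumed ion-to-electron temperature ratio enters only through α.

```python
import numpy as np

M_E = 9.109e-31      # electron mass [kg]
M_I = 2 * 1.673e-27  # deuteron mass [kg] (deuterium plasma)

def alpha_coef(ti_over_te):
    """alpha = -0.5 ln[2 pi (m_e/m_i)(1 + Ti/Te)], as quoted in the text."""
    return -0.5 * np.log(2.0 * np.pi * (M_E / M_I) * (1.0 + ti_over_te))

def aug_probe_quantities(v_bpp1, v_bpp2, v_lp2, i_is, ti_over_te=2.0):
    """phi_p [V], Te [eV] and ne [m^-3] from the probe signals, eqs. (1)-(3).
    With Te in eV and voltages in V, the elementary charge in eq. (2)
    drops out numerically."""
    a = alpha_coef(ti_over_te)
    phi_p = 0.5 * (v_bpp1 + v_bpp2)                               # eq. (1)
    te = 0.5 * ((v_bpp1 - v_lp2) + (v_bpp2 - v_lp2)) / (a - 0.6)  # eq. (2)
    ti = ti_over_te * te
    # eq. (3): 13e18 m^-3 (eV)^{1/2} / 0.014 A is the lithium-beam
    # calibration factor; sqrt(2) placed as in the printed formula.
    ne = 13e18 * np.sqrt(2.0) / 0.014 * i_is / np.sqrt(te + ti)
    return phi_p, te, ne

print(round(alpha_coef(1.0), 2))   # 2.84, consistent with alpha ~ 2.8 in [16]
print(round(alpha_coef(10.0), 2))  # ~2.0, the lower end of 2 < alpha < 2.8
print(aug_probe_quantities(40.0, 42.0, -15.0, 0.05))  # illustrative signals
```

A quick self-check is that α(T_i/T_e = 1) ≈ 2.84 and α(T_i/T_e = 10) ≈ 2.0, reproducing the range 2 < α < 2.8 quoted above.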
The ball-pen probes had collectors with diameters of 2 mm and shieldings with an interior diameter of 5 mm. The LP pins had a diameter of 0.9 mm and protruded 1.5 mm into the plasma. The poloidal distance between the LPs and BPPs was around 4 mm. The measurements were performed in a D-shaped plasma with ohmic heating in L-mode, with typical values of toroidal magnetic field $B_t = 1.16$ T (at the magnetic axis, R ≈ 0.56 m), plasma current $I_p = 110$ kA and line-averaged electron density $n_e \approx 5\cdot10^{19}$ m$^{-3}$. The sampling frequency of the data acquisition system was 5 MHz. The analysed time intervals had durations of approximately 2 ms, a compromise between long enough statistics (we typically caught 10 blobs during this interval) and short enough probe movement (2 mm, i.e., much less than the typical blob size or the SOL radial decay length). Similarly to the data from ASDEX Upgrade, we derive $n_e$, $T_e$ and $\phi_p$ from the probe data, and for each time interval we calculate the first two statistical moments. The relative fluctuations were calculated as the ratio between the second and the first statistical moment. In the case of the COMPASS measurements, we used only the data corresponding to the probe motion into the plasma (one experimental line with circles); the movement out of the plasma was not processed due to arcing of the probes.

The probes BPP2 and LP2 were floating and measured the floating potentials $V_f^{BPP2}$ and $V_f^{LP2}$, respectively. LP1 was biased and measured the ion saturation current $I_{is}$. $T_e$ (in eV), $\phi_p$ (in V) and $n_e$ (in m$^{-3}$) were calculated as follows:

$$T_e \simeq T_e^{BPP2-LP2} = \frac{(V_f^{BPP2} - V_f^{LP2})\,e}{\alpha - 0.6}, \qquad (4)$$

$$\phi_p = V_f^{BPP2} + 0.6\,\frac{T_e}{e}, \qquad (5)$$

$$n_e = n_e^{LP1} = \left(\frac{2\sqrt{m_i}}{eS}\right)\frac{I_{is}}{\sqrt{T_e + T_i}}, \qquad (6)$$

where S is the Langmuir probe effective collection area. Since the ion temperature $T_i$ was not measured, we assumed, for both tokamaks, its value to be in the range of 1 to 10 times the electron temperature, in agreement with [19] or [20]. This uncertainty is represented by error bars in all figures with radial profiles of the mean plasma density and electron temperature. In all ESEL simulations, the ion temperature was set to twice the electron temperature.

3. Model ESEL

The ESEL model (Edge-SOL Electrostatic) is a 2D drift-fluid numerical simulation of scrape-off layer turbulence. It is an electrostatic model (so the magnetic field is constant in time). There are three governing equations describing the evolution of the plasma: electron density n, electron temperature $T_e$ and electric potential φ. They are solved in slab geometry perpendicular to the magnetic field (axes x′, y′, z′) and, using the Bohm normalization (dimensionless quantities are marked with a prime ′), take the following form [21]:

$$\frac{dn'}{dt'} + n'\,\mathcal{C}'(\phi') - \mathcal{C}'(n'T_e') = D'_{\perp n}\nabla'^2_{\perp}n' - \frac{n'}{\tau'_{\parallel n}}, \qquad (7)$$

$$\frac{dT_e'}{dt'} + \frac{2}{3}T_e'\,\mathcal{C}'(\phi') - \frac{7}{3}T_e'\,\mathcal{C}'(T_e') - \frac{2}{3}\frac{T_e'^2}{n'}\,\mathcal{C}'(n') = D'_{\perp T_e}\nabla'^2_{\perp}T_e' - \frac{T_e'}{\tau'_{\parallel T_e}}, \qquad (8)$$

$$\frac{d\omega'}{dt'} - \mathcal{C}'(n'T_e') = D'_{\perp\omega}\nabla'^2_{\perp}\omega' - \frac{\omega'}{\tau'_{\parallel\omega}} + \frac{2c_s}{\omega_{ci0}L_{\parallel D}}\,\psi, \qquad (9)$$

$$\psi = \left[1 - \exp\left(\alpha - \frac{\phi'}{T_e'}\right)\right], \qquad (10)$$

where $\omega_{ci0} = eB_0/m_i$ is the ion gyro-frequency, $c_s$ is the warm ($T_i \neq 0$) ion plasma sound speed, and $L_{\parallel D}$ is the parallel connection length for the SOL between the two divertors. The vorticity, the advective derivative, the inhomogeneous magnetic field and the curvature operator are defined by

$$\omega' = \nabla'^2_{\perp}\phi', \qquad \frac{d}{dt'} = \frac{\partial}{\partial t'} + \frac{1}{B'}\,\hat z\times\nabla'\phi'\cdot\nabla', \qquad \frac{1}{B'} = 1 + \frac{r_0 + \rho_{s0}x'}{R_0}, \qquad \mathcal{C}' = -\frac{\rho_{s0}}{R_0}\frac{\partial}{\partial y'}, \qquad (11)$$

with the hybrid thermal gyro-radius $\rho_{s0} = c_{s0}/\omega_{ci0}$, and the tokamak major and minor radii $R_0$ and $r_0$.
Here $c_{s0} = \sqrt{T_{e0}/m_i}$ is the cold ion plasma sound speed, $T_{e0}$ is the electron temperature at the LCFS, and $B_0$ is the magnetic field at the plasma centre. For the purpose of this paper, we added the last term of equation (9) into the ESEL model, in order to better match the experimental data (see below). In dimensional form, this last term has the meaning of the average of the divergence of the parallel electric current to the divertor plasma sheath, multiplied by $B/(nm_i)$. The average is taken over the parallel direction with a connection length $L_{\parallel D}$. Thus we have [22]:

$$\langle\nabla\cdot j_\parallel\rangle \approx \langle b\cdot\nabla j_\parallel\rangle = \big\langle\nabla_\parallel\big(e n c_s\,[1 - \exp(eV_{fd}/T_{ed})]\big)\big\rangle \approx \frac{2\,e n c_s}{L_{\parallel D}}\,\big[1 - \exp(eV_{fd}/T_e)\big], \qquad (12)$$

where b is the unit vector along the magnetic field, $T_{ed} \approx T_e$ is the electron temperature in the vicinity of the divertor, and $-V_{fd} \approx \phi - \alpha(T_e/e)$ is the floating potential at the divertor. This means that the term describes the parallel outflow from the blobs, which affects the radial turbulent transport (via the E × B drift). The parallel terms are apparently the weakest point of ESEL, and there has been a lot of effort to develop a more sophisticated model of parallel transport. A similar sheath dissipation term has already been used in other 2D interchange codes [23, 24], but this is the first time it has been used in ESEL.

The parallel transport along the magnetic field lines is estimated in the model by characteristic loss times for particles ($\tau'_{\parallel n}$), electron heat ($\tau'_{\parallel T_e}$) and momentum ($\tau'_{\parallel\omega}$). Perpendicular diffusion due to collisions is represented by the neoclassical Pfirsch-Schlüter perpendicular collisional diffusion coefficients for particles ($D'_{\perp n}$), electron heat ($D'_{\perp T_e}$) and momentum ($D'_{\perp\omega}$). The relations for computing the loss times and the diffusion coefficients are given in [21]. The parallel particle density loss time in ESEL is estimated as $\tau_{\parallel n} \simeq L_{\parallel m}/v_\parallel$, where the parallel velocity $v_\parallel$ is roughly half (±30 %) of the warm ion sound speed. It is derived for time-independent parallel transport, and it neglects the recycling of neutrals in the vicinity of the divertor as a particle source to the computational domain. It would be more correct to use the length scale of the parallel particle density variation, $L_{\parallel n}$, in place of $L_{\parallel m}$, but this is difficult to estimate [21]. The three governing equations (7, 8, 9) were derived from the equation of continuity, the fluid equation of motion and the energy transport equation [15, 25].

The input data for the ESEL model are mainly the electron temperature, electric potential and electron density at the separatrix or on the left edge (labelled "LE" in the figures) of the 2D computational domain, together with the value of the magnetic field at the plasma centre, tokamak geometry parameters, and a selection of radial Neumann or Dirichlet boundary conditions. In the poloidal direction, only periodic boundary conditions are used. The model then provides time series of turbulent fluctuations at predefined spatial points. Radial profiles and relative fluctuations are the model's output. Finite Larmor radius effects and, in some cases, also the ion temperature are neglected. The computational domain at the outboard midplane of the tokamak is shown in figure 3.

Figure 3. 2D computational domain of ESEL. The black dots represent the location of the numerical probes. The vertical axis represents the poloidal direction and the horizontal axis represents the radial direction. The colour code represents the plasma density. The SOL region width was around 27 mm for ASDEX Upgrade and around 37 mm for COMPASS.
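For concreteness, the Bohm-normalization scales entering equations (7)–(11), and hence the physical size of the domain just quoted, can be estimated directly from the machine parameters given in section 2. The sketch below is purely illustrative and ours: the electron temperature at the LCFS, $T_{e0}$, is not listed in the text, so the values used here are placeholders.

```python
import numpy as np

E = 1.602e-19        # elementary charge [C]
M_I = 2 * 1.673e-27  # deuteron mass [kg]

def bohm_scales(te0_ev, b0_t):
    """Cold-ion sound speed c_s0 = sqrt(Te0/m_i), ion gyro-frequency
    omega_ci0 = e B0 / m_i and hybrid thermal gyro-radius rho_s0."""
    cs0 = np.sqrt(te0_ev * E / M_I)
    omega_ci0 = E * b0_t / M_I
    return cs0, omega_ci0, cs0 / omega_ci0

# B0 from the discharges above; Te0 at the LCFS is an assumed placeholder.
for name, te0, b0 in [("ASDEX Upgrade", 40.0, 2.5), ("COMPASS", 30.0, 1.16)]:
    cs0, wci0, rs0 = bohm_scales(te0, b0)
    print(f"{name}: c_s0 = {cs0:.2e} m/s, omega_ci0 = {wci0:.2e} 1/s, "
          f"rho_s0 = {1e3 * rs0:.2f} mm")
```

With such inputs, the hybrid gyro-radius comes out at a fraction of a millimetre, which sets the unit of the normalized radial coordinate $\rho_{s0}x'$ appearing in (11) relative to the 27–37 mm SOL widths above.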
The edge, SOL and wall shadow regions represent the confined plasma, the SOL plasma and the plasma outside the midplane wall major radius, respectively. The left edge of the edge region is impermeable for the plasma; in this region there are no parallel losses (infinite loss times). The ESEL model does not include drift waves or 3D effects that may be important in the edge region. Moreover, the edge region can be influenced by the inner boundary conditions. Therefore, only the SOL and wall shadow regions should be considered physically relevant. As the loss times are directly proportional to $L_{\parallel m}$, they are finite and long in the SOL, and finite and short in the wall shadow. However, wall effects such as clouds of neutrals are not included in the wall shadow region. We focus here only on the results in the SOL region. To model the experimental probes, evenly radially distributed numerical point probes were inserted into the ESEL computational domain. The time data of the fields n, $T_e$ and φ from the numerical probes at a certain radial position were processed in the same way as the corresponding experimentally obtained data of $n_e$, $T_e$ and $\phi_p$ in the time interval belonging to the same radial position.

4. Results

We performed 15 different ESEL simulations using the ASDEX Upgrade plasma parameters and 3 different simulations using the COMPASS plasma parameters. The simulations differed in the radial boundary conditions (Neumann or Dirichlet) for the plasma density, plasma potential, radial electric field and electron temperature, in the value of the neoclassical Pfirsch-Schlüter perpendicular collisional diffusion coefficient for particles $D'_{\perp n}$, in the value of the parallel loss time for particles $\tau'_{\parallel n}$, or in the form of the bracket ψ in the last term of equation (9) (where the exponential form is shown).

First, we present outputs of the ESEL model with the new sheath dissipation term. We note that the mechanism of sheath dissipation was considered inappropriate for highly collisional plasmas [21], but we may show that its presence in equation (9) via the last term, with ψ in exponential form, can improve the agreement between the model and the ASDEX Upgrade experiment. In figures 4 and 5, the exponential and linear forms of the bracket ψ in the term are compared for the ASDEX Upgrade tokamak.

Figure 4. Time-averaged radial profiles of plasma potential. The solid lines represent ESEL data. In the simulation with the exponential form of ψ (blue line), the value of the electric potential on the LE of the computational domain was set to φ′(LE) = −40.

Figure 5. Time-averaged radial profiles of relative plasma density fluctuations. Again, the solid lines represent ESEL data, with the lowest-lying blue line for the exponential form of the sheath dissipation term.

The Bohm criterion yields the exponential form of the bracket ψ. The linear form of ψ has also been tested, mainly for these two reasons: first, we do not know the exact divergence $\nabla\cdot j_\parallel$ in the area of the computational domain [26]; second, the assumption of a sheath-connected plasma implies that it should be collisionless, while the SOL plasma often has significant collisionality. There was thus a possibility that, for non-zero collisionality, the exponential form of ψ would not fully apply.
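The relation between the two brackets can be made explicit: the linear form is the first-order Taylor expansion of the exponential one around the sheath-floating value $\phi'/T_e' = \alpha$, so the two coincide for small potential perturbations and depart from each other inside large blobs. A minimal sketch of ours (with α fixed at its $T_i/T_e = 1$ value purely for illustration):

```python
import numpy as np

ALPHA = 2.8  # sheath coefficient for Ti/Te = 1, as quoted in section 2

def psi_exponential(phi_n, te_n):
    """Bohm-criterion (exponential) bracket of eq. (10)."""
    return 1.0 - np.exp(ALPHA - phi_n / te_n)

def psi_linear(phi_n, te_n):
    """Linear bracket used in the 'lin.' simulations: phi'/Te' - alpha."""
    return phi_n / te_n - ALPHA

# With x = phi'/Te' - alpha, the exponential bracket is 1 - exp(-x) ~ x,
# so both forms agree to first order in x and differ for |x| of order 1.
for x in np.linspace(-1.0, 2.0, 7):
    print(f"x = {x:+.1f}: exp form {psi_exponential(ALPHA + x, 1.0):+.3f}, "
          f"lin form {psi_linear(ALPHA + x, 1.0):+.3f}")
```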
In these two figures we see agreement, for the exponential form, in the radial profile of the plasma electric potential and in the relative plasma density fluctuations. The simulation labelled "lin." has the sheath dissipation term in the linear form with $\psi = [\phi'/T_e' - \alpha]$, and the simulations labelled "lin.average" have the term in the linear 'averaged' form with $\psi = [\langle\phi'\rangle/\langle T_e'\rangle - \alpha]$, where $\langle\cdot\rangle$ denotes the average over the poloidal direction y′ of the ESEL computational domain.

Figure 6. Time-averaged radial profiles of plasma density. The solid lines represent the ESEL data. The sheath dissipation term was used in the simulation represented by the pink solid line.

In figure 4, the experimental data of the plasma potential labelled "Doppler" were obtained from the data measured by a Doppler reflectometer. The Doppler reflectometer provides poloidal plasma flow velocity data, which we converted to radial electric field data assuming equality with the usual E × B drift. The assumption that the poloidal motion of the density fluctuations is in fact governed by the E × B drift is valid close to the separatrix [27]; in the far SOL, the situation might become more complicated [28]. We converted the radial electric field data into plasma potential data by numerical integration with respect to the radial distance. The integration constant was chosen to achieve the best agreement with the probe data labelled "ASDEX".

In the case of the COMPASS tokamak, we can point to an improvement of ESEL using the sheath dissipation term with the bracket ψ in the linear form only. A comparison of the plasma density radial profiles for the COMPASS simulations without the sheath dissipation term and with the linear sheath dissipation term is shown in figure 6. The words "lower" and "greater" in the labels in this figure refer only to lower and greater n′(LE) on the left edge of the ESEL computational domain. Similar comparisons with and without the sheath dissipation term for the ASDEX Upgrade tokamak can be found in [15] or [5]. In summary, the exponential form of the sheath dissipation term has turned out to be the best form, and all simulations for ASDEX Upgrade presented below were made with the term in this form. In the next section, the consequences of adding this term to the ESEL code are examined in more detail. For clarity, in all figures in this paper we present only a few representative simulations, and experimental data from only one discharge for each tokamak.

Figure 7. Time-averaged radial profiles of electron temperature (solid line: model, circles: experimental data). The experimental data consist of probe data (to the right of the LCFS line) and Thomson scattering data (to the left of the LCFS line, i.e., the confined plasma).

4.1. General observations

For ESEL and the ASDEX Upgrade scrape-off layer profile, good agreement is found for the radial profile of the mean plasma potential (blue solid line in fig. 4), the relative plasma density fluctuations (blue solid line in fig. 5), and also the radial profile of the mean electron temperature (fig. 7). General disagreements for the ASDEX Upgrade tokamak were found, however, in the relative fluctuations of the electron temperature and plasma potential [15] and also in the radial profile of the plasma density (black line in figure 8): the relative fluctuations are too large compared with the experiment.
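The Doppler-to-potential conversion described above amounts to one multiplication and one cumulative integral. The sketch below is our illustration of that procedure, not the authors' code: the sign convention $E_r = v_{pol}B$ depends on the geometry and is an assumption here, and the radius grid, velocity profile and reference value are hypothetical.

```python
import numpy as np

def potential_from_doppler(r, v_pol, b, phi_ref, r_ref):
    """Convert poloidal E x B velocity [m/s] to plasma potential [V].
    Assumes E_r = v_pol * b (sign convention is an assumption) and
    phi(r) = -integral of E_r over r + const, with the integration
    constant fixed by matching the probe value phi_ref at radius r_ref."""
    e_r = v_pol * b
    # cumulative trapezoidal integration of -E_r along r
    phi = -np.concatenate(([0.0],
                           np.cumsum(0.5 * (e_r[1:] + e_r[:-1]) * np.diff(r))))
    return phi + (phi_ref - np.interp(r_ref, r, phi))

# Hypothetical inputs, for illustration only
r = np.linspace(2.14, 2.18, 41)            # radial positions [m]
v_pol = 2e3 * np.exp(-(r - 2.14) / 0.01)   # measured poloidal velocity [m/s]
phi = potential_from_doppler(r, v_pol, b=1.9, phi_ref=20.0, r_ref=2.15)
```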
The gradient of the radial profile of the plasma density in the ESEL simulations matching the ASDEX Upgrade plasma is too small compared with the experiment. In the case of the COMPASS tokamak, the experimental density profile has a smaller gradient than in ASDEX Upgrade; the ESEL simulation matches the COMPASS plasma quite well and describes the experimental plasma density radial profile (pink line in fig. 6). In the case of the COMPASS tokamak, the electron collisionality (defined by $\nu^*_e = L_{\parallel m}/\lambda_{ee}$, with the electron-electron collisional mean free path $\lambda_{ee}$ [21]) was lower than in the case of the ASDEX Upgrade tokamak, as we have seen in the ESEL output.

In all figures, the absolute values (vertical shifting of the graphs) of the mean radial profiles are not so crucial; the gradients of the radial profiles are more important. The values of the density and electron temperature at the LCFS are ESEL inputs. For COMPASS, in one case (pink line in fig. 6), the absolute value of the plasma density at the LCFS was intentionally set greater, for reasons of computing speed. For a slightly greater absolute value of $n_e$ at the LCFS, the diffusion in the computational domain is higher (the values of the diffusion coefficients are greater). Also $\tau'_{\parallel T_e}$ is greater, but the mean radial profile of the electron temperature does not depend on it very much; the values $\tau'_{\parallel n}$ and $\tau'_{\parallel\omega}$ do not depend on the density. Overall, the simulation with a slightly greater value of $n_e$ at the LCFS is more stable (the time step of the simulation can be longer) due to the dissipation of the smallest structures (because of the higher diffusion), while the change in the dynamics of the turbulent structures existing at larger spatial scales is negligible.

Figure 8. Time-averaged radial profiles of plasma density. The two solid lines represent ESEL simulations with the theoretical value of the diffusion coefficient for particles multiplied by two different factors.

We have tried to improve the general mismatches for the ASDEX Upgrade tokamak by changing the ESEL parameters, and some of the results are presented in the next section.

4.2. Parameter dependence

We changed the parameters $D'_{\perp n}$ and $\tau'_{\parallel n}$ and observed their impact on the ESEL dynamics. Our main purpose was to increase the gradient of the radial profile of the plasma density in the ESEL simulations matching the ASDEX Upgrade plasma. This increase was achieved by an increase in $D'_{\perp n}$, as we can see in fig. 8. There is a rather large uncertainty in the value of the theoretically derived coefficient $D'_{\perp n}$. First, it is only an effective radial value obtained by flux surface averaging. Second, it assumes the existence of closed flux surfaces, which are, indeed, not present in the SOL; a rigorous treatment of neoclassical transport on open flux surfaces has yet to be performed [21]. Therefore we have tested the consequences of increasing the theoretical value of $D'_{\perp n}$ by a factor in the range of 1.5–12, which corresponds to absolute values in the range of around $8.9\times10^{-4}$ to $7.1\times10^{-3}$. For $D'_{\perp n}$ multiplied by the factor 12, the gradient of the plasma density radial profile is in good agreement with the experiment (fig. 8). The simulation, however, represents a plasma with almost no blobs crossing the LCFS and consequently almost no density fluctuations close to the LCFS (see fig. 9); therefore such a large diffusion coefficient seems to be incorrect.
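The collisionality parameter $\nu^*_e$ that distinguishes the two machines above is easy to estimate. The text does not give an explicit expression for $\lambda_{ee}$; the sketch below uses a standard formulary estimate, $\lambda_{ee} \approx 1.44\times10^{17}\,T_e^2/(n\,\ln\Lambda)$ metres with $T_e$ in eV and n in m$^{-3}$, which is our assumption rather than the paper's formula, and the input values are illustrative placeholders, not the actual SOL parameters of the discharges.

```python
def nu_star_e(te_ev, n_m3, l_par_m, coulomb_log=12.0):
    """Electron collisionality nu*_e = L_par / lambda_ee.
    lambda_ee from a standard formulary estimate (an assumption here):
    lambda_ee ~ 1.44e17 * Te[eV]^2 / (n[m^-3] * lnLambda) metres."""
    lambda_ee = 1.44e17 * te_ev**2 / (n_m3 * coulomb_log)
    return l_par_m / lambda_ee

# Placeholder SOL values: a hotter, less collisional case vs a colder one
print(nu_star_e(te_ev=40.0, n_m3=1.0e19, l_par_m=15.0))
print(nu_star_e(te_ev=15.0, n_m3=1.0e19, l_par_m=10.0))
```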
We also tested whether an increase in the parallel loss (along the magnetic field) of particles may increase the gradient of the plasma density radial profile. We found that a decrease in $\tau'_{\parallel n}$ (within the theoretical uncertainty) does not increase the gradient significantly, and it also causes an increase of the relative density and temperature fluctuations well above the experimentally observed level [15]. Therefore, increasing the parallel losses of particles, characterized by the parallel particle density loss time, is not a way to obtain a steeper radial density profile.

Figure 9. Time-averaged radial profiles of relative plasma density fluctuations. The solid lines represent ESEL data with the theoretical value of the diffusion coefficient multiplied by factors of 3 and 12.

5. Conclusions

We compared 15 ESEL simulation runs with experimental data from ASDEX Upgrade discharge #24349, and 3 simulations with experimental data from the COMPASS tokamak. In the standard ESEL code we have included a sheath dissipation term in the vorticity equation, to investigate whether we can observe an improved agreement with the experimental observations for the profiles of density, electric potential and electron temperature. We considered exponential and linear forms of the term. We observed a strong effect of the sheath dissipation term, which generally gives a better match to the experimental observations of ASDEX Upgrade; however, e.g., the gradient of the radial profile of the plasma density remains too small compared to the experiment. We also tested an increase in the perpendicular collisional diffusion coefficient $D'_{\perp n}$ and in the parallel losses of density, characterized by the parallel particle density loss time $\tau'_{\parallel n}$, to obtain a better agreement with the experimental observations, but without success. A summary of which profiles obtained from ESEL match the ASDEX Upgrade observations is shown in table 1.

Table 1. General conclusions about the model ESEL matching the ASDEX Upgrade tokamak plasma.
  Agreements:    ⟨Te(t)⟩(r), ⟨φ(t)⟩(r), σn(t)/⟨n(t)⟩(r)
  Disagreements: ⟨n(t)⟩(r), σTe(t)/⟨Te(t)⟩(r), σφ(t)/⟨φ(t)⟩(r)

For the simulations using the COMPASS parameters, we observe a better agreement with the experiment for the density gradient, but only 3 simulations were studied, of which two do not include the sheath dissipation term and one has only the linear form of it. Generally, we conclude that the present ESEL model cannot fully simulate the experimental observations from ASDEX Upgrade. There may be other dissipative processes, such as drift waves [29], which are not included in the ESEL model and which dissipate the turbulent structures, resulting in a steeper experimental radial density profile. An extension of the ESEL model from 2D to 3D, to fully resolve the parallel dynamics and the coupling from the plasma to the sheath, is necessary to improve the agreement.

Acknowledgements

This research has been supported by the Czech Science Foundation project P205/12/2327 and MSMT project No. LM2011021. Access to computing and storage facilities owned by parties and projects contributing to the National Grid Infrastructure MetaCentrum, provided under the programme "Projects of Large Infrastructure for Research, Development, and Innovations" (LM2010005), is appreciated.

References

[1] Antar, G. Y., et al. On the scaling of avaloids and turbulence with the average density approaching the density limit. Physics of Plasmas 12, 082503 (2005). doi:10.1063/1.1953592

[2] Garcia, O. E., et al. Turbulence and intermittent transport at the boundary of magnetized plasmas. Physics of Plasmas 12, 062309 (2005).
doi:10.1063/1.1925617

[3] Garcia, O. E., et al. Interchange turbulence in the TCV scrape-off layer. Plasma Phys. Control. Fusion 48, L1–L10 (2006). doi:10.1088/0741-3335/48/1/l01

[4] Garcia, O. E., et al. Turbulence simulations of blob formation and radial propagation in toroidally magnetized plasmas. Phys. Scr. T122, 89–103 (2006). doi:10.1088/0031-8949/2006/t122/013

[5] Vondráček, P. Study of edge plasma physics of tokamak COMPASS by means of two reciprocating probes. Master thesis, CTU in Prague, Czech Republic, 2012.

[6] Herrmann, A., Gruber, O. ASDEX Upgrade – introduction and overview. Fusion Science and Technology, Vol. 44, No. 3, 569–577 (2003).

[7] Pánek, R., et al. Reinstallation of the COMPASS-D tokamak in IPP ASCR. Czechoslovak Journal of Physics, Vol. 56, Issue 2, pp. B125–B137 (2006). doi:10.1007/s10582-006-0188-1

[8] Naulin, V., et al. Turbulence modeling of JET SOL plasma. In Proceedings of the 21st IAEA Fusion Energy Conference (Chengdu, China, 2006), TH/P6-22. Online at http://www.iop.org/jet/fulltext/efdc060519.pdf [2015-03-01]

[9] Vergote, M., et al. Discussion of SOL turbulence properties in TEXTOR by means of ESEL simulations. In 38th EPS Conference on Plasma Physics, 2011, P5.068. Online at http://ocs.ciemat.es/eps2011pap/pdf/p5.068.pdf [2015-03-01]

[10] Militello, F., et al. Simulations of edge and scrape off layer turbulence in Mega Ampere Spherical Tokamak plasmas. Plasma Phys. Control. Fusion 54, No. 9, 095011 (2012). doi:10.1088/0741-3335/54/9/095011

[11] Garcia, O. E., et al. Fluctuations and transport in the TCV scrape-off layer. Nucl. Fusion 47, No. 7, pp. 667–676 (2007). doi:10.1088/0029-5515/47/7/017

[12] Garcia, O. E., et al. Collisionality dependent transport in TCV SOL plasmas. Plasma Phys. Control. Fusion 49, B47–B57 (2007). doi:10.1088/0741-3335/49/12b/s03

[13] Garcia, O. E., et al. Turbulent transport in the TCV SOL. Journal of Nuclear Materials 363–365, 575 (2007). doi:10.1016/j.jnucmat.2006.12.063

[14] Adámek, J., et al. A novel approach to direct measurement of the plasma potential. Czechoslovak Journal of Physics, Vol. 54, pp. C95–C99 (2004). doi:10.1007/bf03166386

[15] Ondáč, P. Study of tokamak plasma turbulence by means of reciprocating probes. Master thesis, Charles University in Prague, Czech Republic, 2014.

[16] Horacek, J., et al. Interpretation of fast measurements of plasma potential, temperature and density in SOL of ASDEX Upgrade. Nucl. Fusion 50, No. 10, 105001 (2010). doi:10.1088/0029-5515/50/10/105001

[17] Adámek, J., et al. Ball-pen probe measurements in L-mode and H-mode on ASDEX Upgrade. Contrib. Plasma Phys. 50, No. 9, 843–859 (2010). doi:10.1002/ctpp.201010145

[18] Adámek, J., et al. Direct plasma potential measurements by ball-pen probe and self-emitting Langmuir probe on COMPASS and ASDEX Upgrade. Contrib. Plasma Phys. 54, Issue 3, pp. 279–284 (2014). doi:10.1002/ctpp.201410072

[19] Adámek, J., et al. Fast ion temperature measurements using ball-pen probes in the SOL of ASDEX Upgrade during L-mode. 38th EPS Conference on Plasma Physics, Vol. 35G, P1.059 (2011).

[20] Kočan, M., et al. Edge ion-to-electron temperature ratio in the Tore Supra tokamak. Plasma Phys. Control. Fusion 50, No.
12, 125009 (2008). doi:10.1088/0741-3335/50/12/125009

[21] Fundamenski, W., et al. Dissipative processes in interchange driven scrape-off layer turbulence. Nucl. Fusion 47, 417–433 (2007). doi:10.1088/0029-5515/47/5/006

[22] Garcia, O. E., et al. Radial interchange motions of plasma filaments. Physics of Plasmas 13, 082309 (2006). doi:10.1063/1.2336422

[23] Sarazin, Y., Ghendrih, Ph. Intermittent particle transport in two-dimensional edge turbulence. Physics of Plasmas 5, 4214 (1998). doi:10.1063/1.873157

[24] Bisai, N., et al. Simulation of plasma transport by coherent structures in scrape-off-layer tokamak plasmas. Physics of Plasmas 11, 4018 (2004). doi:10.1063/1.1771658

[25] Garcia, O. E., et al. Turbulence and intermittent transport at the boundary of magnetized plasmas. Physics of Plasmas 12, 062309 (2005). doi:10.1063/1.1925617

[26] Krasheninnikov, S. I., et al. Recent theoretical progress in understanding coherent structures in edge and SOL turbulence. J. Plasma Physics 74, Part 5, pp. 679–717 (2008). doi:10.1017/s0022377807006940

[27] Müller, H. W., et al. Latest investigations on fluctuations, ELM filaments and turbulent transport in the SOL of ASDEX Upgrade. Nucl. Fusion 51, 073023, 11pp (2011). doi:10.1088/0029-5515/51/7/073023

[28] Müller, H. W., et al. Characterization of scrape-off layer turbulence changes induced by a nonaxisymmetric magnetic perturbation in an ASDEX Upgrade low density L-mode. Contrib. Plasma Phys. 54, No. 3, 261–266 (2014). doi:10.1002/ctpp.201410079

[29] Angus, J. R., et al. Effect of drift waves on plasma blob dynamics. Phys. Rev. Lett. 108, 215002 (2012). doi:10.1103/physrevlett.108.215002

Acta Polytechnica Vol. 41 No. 1/2001

Management of people by managers with a technical background – research results

D. Dobrovská

In 1999 and 2000, the Masaryk Institute of Advanced Studies of CTU in Prague launched a study of the efficiency of human resource management in Czech enterprises, with emphasis on technically educated managers. About 85 managers, each responsible for 5–250 employees, assessed their own HRM activities and attitudes and those of their firm. The following aspects were analysed: evaluating individual management areas, assessing the general management standard of the given company, awareness of the company's personnel management policy, the manager's own contribution to formulating the company's personnel management strategy, developing job descriptions, using professional methods of employee selection, promoting employees to the status of manager, periodic assessment of employees, training of staff responsible for assessment, employee remuneration and other motivation tools, etc. Results and data analysis are given in the paper. The managers of the Czech companies reviewed are still involved mainly in operating management, and devote limited time to the conceptual work needed in order to formulate an integral company policy in the area of personnel management. On the basis of this analysis, further training for managers with technical education will be designed and organised by the Masaryk Institute of Advanced Studies.

Keywords: human resource management, effectiveness of personnel management, employee selection, employee assessment, motivation tools, broadening employee skills.

1 Introduction

Employees are a potentially creative element in each work organization – no company has a chance to achieve its objectives unless its staff works effectively. A manager, irrespective of the area he is responsible for in the company, does not make decisions regarding his employees in a vacuum. He has to deal with economic pressure to increase productivity and improve product quality, and he has to reflect on the changes taking place in the areas of production and information technologies, marketing, and financing. Managers at all levels assume a number of roles that influence each other and overlap. A keen observer, when looking at management activities, will note that the differences between individual managers are particularly influenced by the roles that are given priority in their management activity.
The recent trend of devoting increased attention to personnel management (or "human resource management", as this area is often called) is accompanied by the introduction of the benchmarking method, which consists in comparing company results with current best practice in leading companies worldwide or nationwide. Benchmarking studies show that personnel managers of Czech companies, when compared with those in other European countries, are still obliged to spend a part of their working time on ineffective administrative procedures, dealing with problems of current concern to their company rather than addressing conceptual issues related to the company's future strategy (according to a report by Price Waterhouse–Coopers, 1999). As stated in this source, while foreign companies list among their personnel management priorities such items as "management functions development", "changes in organization and company culture" and "internal communication", in other words activities of a non-administrative nature, Czech personnel managers spend as much as half of their working hours dealing with routine administrative work. Consequently, they "serve" an average of 57 employees, as compared with 69 employees in Western Europe and the Near East countries.

2 Aim of the research

In 1999, the Masaryk Institute of Advanced Studies of the Czech Technical University launched a research project to study the effectiveness of personnel management in Czech companies, focusing on managers with a technical background. In the first stage, the research into this extensive and complex subject, involving a number of relationships, was carried out in the form of a pilot study that also focused on methodology. The aim was to select certain personnel management activities and study them in detail. While conducting the pilot study, the researchers visited 10 companies and made 5 analyses in each of them; their evaluation resulted in a set of 47 analyses that allowed a preliminary, rough generalization, though statistically at a low level of significance. The correlation analysis showed consistently higher overall effectiveness of the analyzed personnel management activities when these were carried out by a specialized company and not by the company using its own resources. Similarly, outsourcing as a rule resulted in overall higher effectiveness of the activities assessed regarding the company's financial results, individual performance, and employee and customer satisfaction. The pilot study data also showed that those who were successful in personnel management were people with a larger knowledge base in the given area. However, the expectation that activities initiated by top management would be more effective than, for instance, activities initiated by personnel managers was not confirmed by our study.
3 Conducting the research and research results

In the first half of 2000, a questionnaire survey was carried out on a sample of 87 managers (75 men and 12 women) coming from 45 Czech and Moravian companies. The sample comprised 18 managing directors, 20 technical directors and 49 representatives of lower management. The number of employees reporting to them varied, ranging from 5 to 250. The managers were within the 25–56 age range. Most of the managers surveyed were graduates of technical universities, mainly of civil, mechanical and electro-technical faculties; other respondents held degrees in chemistry, mining and agriculture. 7 respondents were attending post-graduate courses, including MBA programs. About half of those surveyed had more than 11 years' experience of managing people.

The questionnaire was in two parts. Part A comprised 28 items concerning the following: evaluating individual management areas, assessing the general management standard of the given company, awareness of the company's personnel management policy, the manager's own contribution to formulating the company's personnel management strategy, developing job descriptions, using professional methods of employee selection, staff responsible for employee selection, promoting employees to the status of manager, new employee adaptation, periodic employee assessment, training of staff responsible for assessment, managers' attitude to work motivation, employee remuneration and other motivation tools, improving and broadening employee qualifications, company spending on staff training, stimulating improvement of staff qualifications, information sources used in the company, and relationships with the trade union organization.

The second part of the questionnaire (Part B) comprised data concerning the respondents. They were asked to state their current position, sex, age, number of subordinates (including indirect subordinates), the highest level of education received, their own education, length of managerial experience, personal attitude to personnel management (a personal "creed"), and sources of information about personnel management. In most cases, the guided process of completing the questionnaires took place at the manager's workplace. The questionnaire administrators were the research team members of the CVUT Masaryk Institute of Advanced Studies.
Filling in the questionnaire usually took 1–2 hours.

When evaluating the items related to specific management areas, managers arranged them in order of importance for their everyday work. In their view, the most important item is operating management, followed by information technologies, employee motivation, company internal communication and strategic management. A scale of one to six was used to judge the standard of management in the managers' own companies, and the respondent had to opt for either a positive or a negative evaluation (there was no "average standard" rating on the scale). Managers gave all the areas assessed an above-average rating (a certain "identification with the company"); the highest ratings were given to the areas seen as most important in the previous items. Operating management in these areas is the most frequent activity of the managers at work, being carried out usually on a daily basis or several times a week. In order to assess the managers' own knowledge and skills regarding specific personnel management areas, a classical scale of one to five was used (excellent, above-average, average, below-average, totally inadequate), this time offering the possibility of a "modest" answer – an "average" rating. And, in fact, this was the answer that most managers chose. They gave a higher rating to their knowledge in the areas to which they devote most of their time in their professional practice. If we compare the answers to all four items concerning the evaluation of specific management areas, it is obvious that in all cases the main emphasis is placed on "operating management".

Other items concerned the evaluation of the existence of a comprehensive company personnel management policy and the manager's role in formulating its strategy. Answers to the first question indicate that about one third of the respondents admit that an integral policy is lacking, and more than one third have no precise information about the integral company policy in the area of personnel management. The next question was answered along the same lines: most managers do not participate in formulating personnel management strategy at all and, if they do, it is only when it is closely linked with their own job responsibilities.

The next set of questions covered personnel planning and selection. It follows from the answers that staff requirements are based primarily on operating experience, or are determined on the basis of a rough analysis of the company's goal. Only a rough estimate is made of future personnel requirements, in terms both of their number and of their structure. Ordinary employees are not chosen by professional selection methods, and fewer than one third of the respondents claimed to use such methods to select key employees. Moreover, the majority of the managers surveyed keep minimal records of the costs of activities related to personnel management (although these answers may be connected with the different work responsibilities of the managers interviewed). Consequently, it is either the manager himself, or the manager in conjunction with the personnel department, who participates in the selection process. Employees are promoted to the rank of manager on the basis of their qualifications and performance. Adaptation of new employees means, in most cases, getting to know the workplace and assigning a fellow employee to acquaint the newcomer with the job.

Other items focused on employee assessment.
The results show that most companies surveyed assess their employees approximately once a year, mainly on the basis of achieving company goals, in the form of a free description or an assessment interview; these methods and/or a combination of them are considered most suitable. The assessment is seen as supporting information for personnel-related decision-making (promotion, dismissal, remuneration, or training and development). The assessment results are communicated to employees in a detailed interview; only exceptionally are they not communicated at all. The assessment staff receive a short briefing instructing them on how to handle the assessment.

Work motivation was the subject of the other part of the questionnaire. The personal attitude of the managers to this subject was tested by a classical method of indirect assessment of attitudes: managers were offered a set of statements about motivation and asked to choose the principles they most identify with. Among the choice of 12 "principles", the preferred one was so-called "distributional equity", expressed in the relevant literature by the formula "my performance : your performance = my remuneration : your remuneration". Managers also manifested their attitude to employees' financial motivation: while the average amount of the floating component of the salary was stated to be 24 % (with probable statistical distortion, because in one case the figure given was as high as 75 %), the managers themselves would like it to increase to a level of 35 %. From the total choice of 34 positive and negative motivation factors (e.g. salary, praise, disciplinary measures, promotion, personal development, benefits, reprimand with the threat of dismissal), managers regard financial incentives as the most efficient (e.g. extraordinary financial rewards, target bonuses, a floating salary component, differentiation of rewarded employees, and incentive bonuses). Other motivation factors included the manager's show of personal interest and performance encouragement. On average, the managers themselves deal with motivation just once a month; however, they show personal interest and give recognition and praise more often. In their opinion, what employees appreciate most is an extraordinary financial bonus, recognition and praise, and the manager's help in difficult life situations.

Another personnel management area assessed was employee training. In the companies surveyed, employees usually improve their skills in short-term training courses, and less frequently in long-term courses and correspondence courses combining work and study. Those who study are mainly younger workers, especially those regarded as having great potential; however, other employees who show interest in improving their skills are also given the opportunity to do so. The companies reviewed spend an annual average of CZK 1000–5000 per employee on training. Employees are encouraged to improve their skills primarily by their immediate superiors. Work experience is also acquired at the workplace when less experienced employees are assigned to work with more experienced colleagues.

At the end of the questionnaire, the respondents listed their most frequently used information sources.
The sources seen as most important (in order of frequency of use) are the internet, information from employees, employee suggestions, minutes taken at company management meetings, public company media and company documents. On the other hand, they hardly ever use, or do not have available, career plans, the company's ethical code, or company questionnaires and surveys. Managers take part in drawing up company documents. The purpose of the complementary item of the questionnaire was to map out the relationship of the company to trade unions. The answers to this question show that in more than half of the companies there are no trade union organizations at all, and where they do exist, their relationship with company managers may be regarded as practically free of any conflict.

4 Conclusion

The research results showed that the managers of the Czech companies reviewed are still involved mainly in operating management and devote limited time to the conceptual work needed to formulate an integral company policy in the area of personnel management. This policy is either nonexistent in the company or is implemented spontaneously, responding to an immediate need. It is often the case that this policy is carried out separately from the company's principal tasks. The same applies to personnel planning and selection, in that employees are hardly ever chosen by professional selection methods. Work motivation is regarded by managers as an important part of management. In their view, the most effective (and the "easiest") incentives are differentiated financial rewards given to individual employees for their work results. Besides financial motivation, they use other incentives of a non-material nature in their professional practice, mainly praise and a show of personal interest and support for employees. (According to these managers, they themselves would prefer to be rewarded for their achievements not only by bonuses but also by recognition and appreciation, enhancing their own satisfaction with the work done.) In the Czech companies under review, employees with good prospects and potential have the opportunity to improve their skills, with encouragement from their immediate superior or on their own initiative; however, the amount of money spent on education is not commensurate with the importance attached to education in Western companies. In most cases, managers share the same preferences regarding the information sources they use: alongside classical company sources (minutes taken at company management meetings and information obtained from employees), the internet has become very popular. Other personnel management tools, the use of which has been neglected so far, include career planning, a company ethical code, and company questionnaires and surveys.

The research carried out on a sample of managers with a technical background shows that Czech companies will have to formulate and improve an integral company strategy in the area of personnel policy, which should be seen as an organic part of meeting company goals. Only when a management strategy is viewed in this way will top managers be partly liberated from excessive, ineffective routine work leading to reinforcement of the administrative stereotype. As a result, they will be able to address conceptual issues: developing management functions, making changes in organization and company culture, and improving internal communication in the company.

References

[1] Dobrovská, D.: HRM and technically educated managers.
International Symposium IGIP, Klagenfurt 2001, p. 4.

[2] Moulis, J.: Motivation myth. Modern Management 1988, (33), pp. 53–56.

[3] HRM in Czech companies. Price Waterhouse–Coopers report, 1999.

PhDr. Dana Dobrovská, CSc.
Masaryk Institute of Advanced Studies of CTU in Prague, Horská 3, 128 00 Prague 2, Czech Republic
Phone/fax: +420 2 24915319, 0603 342 339, e-mail: dobrovd@muvs.cvut.cz

Acta Polytechnica 56(3):224–235, 2016. doi:10.14311/ap.2016.56.0224. © Czech Technical University in Prague, 2016. Available online at http://ojs.cvut.cz/ojs/index.php/ap

The Aharonov-Bohm Hamiltonian with two vortices revisited

Petra Košťáková, Pavel Šťovíček (*)
Department of Mathematics, Faculty of Nuclear Sciences and Physical Engineering, Czech Technical University in Prague, Trojanova 13, 120 00 Praha, Czech Republic
(*) Corresponding author: stovipav@kmlinux.fjfi.cvut.cz

Abstract. We consider an invariant quantum Hamiltonian $H = -\Delta_{LB} + V$ in the $L^2$ space based on a Riemannian manifold $\tilde M$ with a discrete symmetry group $\Gamma$. To any unitary representation $\Lambda$ of $\Gamma$ one can relate another operator on $M = \tilde M/\Gamma$, called $H_\Lambda$, which formally corresponds to the same differential operator as $H$ but which is determined by quasi-periodic boundary conditions. As originally observed by Schulman in theoretical physics and Sunada in mathematics, one can construct the propagator associated with $H_\Lambda$ provided one knows the propagator associated with $H$. This approach is reviewed and demonstrated on a quantum model describing a charged particle on the plane with two Aharonov-Bohm vortices. The construction of the propagator is explained in full detail, including all substantial intermediate steps.

Keywords: Aharonov-Bohm effect; propagator; covering space; Bloch decomposition.

1. Introduction

Suppose there is given a Riemannian manifold $\tilde M$ with a discrete symmetry group $\Gamma$ and a $\Gamma$-periodic Hamilton operator $H$ on $L^2(\tilde M)$. To any unitary representation $\Lambda$ of $\Gamma$ one can relate another operator on $M = \tilde M/\Gamma$, called $H_\Lambda$, which is determined by quasi-periodic boundary conditions. A formula relating the propagators $K_t^\Lambda(x,x_0)$ and $K_t(x,x_0)$, associated with $H_\Lambda$ and $H$ respectively, has been derived in the framework of the Feynman path integral [18, 19]. An analogous formula is also known for heat kernels [4]. An opposite point of view is taken when one decomposes the operator $H$ into a direct integral with components $H_\Lambda$, where $\Lambda$ runs over all irreducible unitary representations of $\Gamma$ [3, 6, 23]. The evolution operator then decomposes correspondingly. This type of decomposition is a substantial step in the Bloch analysis. The two relations, the propagator formula on the one hand and the generalized Bloch decomposition on the other, are in a sense mutually inverse [11, 12]. In the current paper we wish to demonstrate how this relationship can be used effectively on a concrete example of interest.

We consider the formula for propagators in the case of the Aharonov-Bohm effect with two vortices. In this quantum model, $\tilde M$ is identified with the universal covering space of the plane with two excluded points, and $\Gamma$ is the fundamental group of the same manifold. This problem has already been treated by one of the authors quite a long time ago in [21], but the topic is by no means exhausted, and this quantum model has been intensively discussed in a number of papers, in some cases very recently.
These discussions rely on completely different approaches, however, such as asymptotic methods for widely separated vortices, semiclassical analysis and a complex scaling method [1, 2, 9, 25]. One may also mention more complex models comprising, apart from magnetic vortices, additional potentials or magnetic fields [14, 16], or models with an arbitrary finite number of magnetic vortices, or even with countably many vortices arranged in a lattice [15, 17, 22]. On the other hand, the method stemming from the original ideas of Schulman and Sunada has turned out to be fruitful also in the analysis of other interesting models, such as Brownian random walk on the twice punctured plane [5, 7].

Here we return to the article [21], which is in character a brief letter presenting the final formulas without a detailed derivation. But the technique applied therein is of independent interest and can prove useful in other situations as well, as already mentioned above. This is why, in the present paper, we focus primarily on the method itself and aim to explain the approach on a concrete example while indicating all necessary intermediate steps in full detail. Hopefully, the provided analysis may open the way to new applications of the method.

The paper is organized as follows. The main ideas and results of the general approach are outlined in section 2, following the papers [11, 12]. Section 3 is the key section of the present paper. In subsection 3.1, a formula for the propagator on the universal covering space of the twice punctured plane, as originally presented in [21], is briefly recalled. Subsection 3.2 has a preliminary character and provides a summary of some auxiliary useful identities. Subsection 3.3 is fully dedicated to the proof of the propagator formula, as given in (9), (10); more precisely, the goal of this subsection is a verification of equation (17). As a corollary, section 4 provides more details, compared to [21], about a formal application of the Schulman-Sunada formula, as recalled in (5), to the studied example, making use of the knowledge of the propagator on the covering space.

2. A summary of the general approach

2.1. Periodic Hamiltonians

Let $\tilde M$ be a connected Riemannian manifold with a discrete and at most countable symmetry group $\Gamma$. The action of $\Gamma$ on $\tilde M$ is assumed to be smooth, free and proper. Let us recall that, under these assumptions, any element $s \in \Gamma$ different from the unity has no fixed points on $\tilde M$, and for any compact set $K \subset \tilde M$ the intersection $K \cap s\cdot K$ is nonempty for at most finitely many elements $s \in \Gamma$. This also implies that any point $y \in \tilde M$ has a neighborhood $U$ such that the sets $s\cdot U$, $s \in \Gamma$, are mutually disjoint [13, Corollary 12.10]. Denote by $\tilde\mu$ the measure on $\tilde M$ induced by the Riemannian metric. The quotient $M = \tilde M/\Gamma$ is a connected Riemannian manifold with an induced measure $\mu$. This way one gets a principal fiber bundle $\pi: \tilde M \to M$ with the structure group $\Gamma$. All $L^2$ spaces based on the manifolds $M$ and $\tilde M$ are everywhere tacitly understood with the measures $\mu$ and $\tilde\mu$, respectively. In a number of important examples, $\tilde M$ is the universal covering space of $M$ and $\Gamma = \pi_1(M)$ is the fundamental group of $M$. In particular, this is the case when one is considering the Aharonov-Bohm effect.
to a unitary representation λ of γ in a separable hilbert space lλ one relates the hilbert space hλ formed by λ-equivariant vector-valued functions on m̃. this means that any function ψ ∈ hλ is measurable with values in lλ and satisfies, for all s ∈ γ, ψ(s·y) = λ(s)ψ(y) almost everywhere on m̃. moreover, the norm of ψ induced by the following scalar product is required to be finite. if ψ1, ψ2 ∈ hλ then the function y ↦ 〈ψ1(y), ψ2(y)〉 defined on m̃ is γ-invariant and so it projects to a function s_{ψ1,ψ2} defined on m, and the scalar product is defined by
$$\langle \psi_1, \psi_2 \rangle = \int_M s_{\psi_1,\psi_2}(x)\, d\mu(x).$$
our discussion focuses on γ-periodic hamiltonians on m̃ of the form h = −∆lb + v, where ∆lb is the laplace-beltrami operator and v is a γ-invariant, semibounded and locally integrable real function on m̃. clearly, the differential operator −∆lb + v is semibounded on the domain formed by test functions (i.e. smooth and compactly supported functions), and h is defined as its friedrichs extension. the same choice will also be made in other instances below in the paper. this is to say that in the presented approach we distinguish the friedrichs extension as the preferred self-adjoint extension of a given semibounded symmetric operator. here we are referring to the widely used result ensuring the existence of an unambiguously defined and, in some sense, minimal self-adjoint extension of a semibounded symmetric operator, the so-called friedrichs extension [10, § vi.2]. this choice is encountered very frequently in various applications, and it also makes it possible to avoid the discussion of the domain of the self-adjoint operator in question, which sometimes may be quite tedious. to the same differential operator, −∆lb + v, one can relate a self-adjoint operator hλ in the space hλ for any unitary representation λ of γ in lλ. let us define φλ : c∞0(m̃) ⊗ lλ → hλ by
$$\forall \varphi \in C_0^\infty(\tilde{M}),\ \forall v \in L_\lambda,\quad (\Phi_\lambda\, \varphi \otimes v)(y) = \sum_{s \in \Gamma} \varphi(s \cdot y)\, \lambda(s^{-1})\, v. \tag{1}$$
since the action of γ is proper, the vector-valued function φλ ϕ⊗v is smooth. moreover, φλ ϕ⊗v is λ-equivariant, the norm of φλ ϕ⊗v in hλ is finite, and the range of φλ is dense in hλ. the laplace-beltrami operator is well defined on ran(φλ) and it holds that ∆lb φλ[ϕ⊗v] = φλ[∆lb ϕ⊗v]. one can also verify that the differential operator −∆lb is positive on the domain ran(φλ) ⊂ hλ. since the function v(y) is γ-invariant, the multiplication operator by v is well defined in the hilbert space hλ. the hamiltonian hλ is defined as the friedrichs extension of the differential operator −∆lb + v considered on the domain ran φλ. the reader is referred to [11, 12] for more details.

2.2. a generalization of the bloch decomposition

let γ̂ be the dual space to γ (the quotient space of the space of irreducible unitary representations of γ). in the first step of the generalized bloch analysis one decomposes h into a direct integral over γ̂ with components equal to hλ. to achieve this goal, a well defined harmonic analysis on γ is necessary. it is known that the harmonic analysis is well established for locally compact groups of type i [20]. so all formulas presented below are well defined provided γ is a type i group. as shown in [26, satz 6], a countable discrete group is of type i if and only if it has an abelian normal subgroup of finite index. this means that there exist multiply connected configuration spaces of interest whose fundamental groups are not of type i.
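as a brief aside (ours, not part of the paper): the averaging map (1) and the notion of λ-equivariance are easy to make tangible in the simplest nontrivial setting, m̃ = ℝ with γ = ℤ acting by integer shifts and the one-dimensional representation λ(n) = e^{2πiαn}. in the sketch below, the test function, the flux value and the truncation of the sum are arbitrary illustrative choices.

```python
import numpy as np

# illustrative sketch: the averaging map (1) for M~ = R, Gamma = Z acting by
# y -> y + n, with the one-dimensional representation lambda(n) = exp(2*pi*i*alpha*n)
alpha = 0.3              # assumed flux parameter, 0 < alpha < 1
N = 200                  # truncation of the (in principle infinite) sum over Gamma

def phi(y):
    # a smooth, effectively compactly supported test function on the covering space
    return np.exp(-10.0 * (y - 0.5) ** 2)

def psi(y):
    # (Phi_lambda phi)(y) = sum_s phi(s.y) lambda(s^{-1}), cf. equation (1)
    n = np.arange(-N, N + 1)
    return np.sum(phi(y + n) * np.exp(-2j * np.pi * alpha * n))

# check the lambda-equivariance psi(s.y) = lambda(s) psi(y) for s = 1:
y = 0.17
print(psi(y + 1.0), np.exp(2j * np.pi * alpha) * psi(y))  # the two values agree
```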
for example, the fundamental group in the case of the aharonov-bohm effect with two vortices is the free group with two generators, and it is not of type i. this problem is avoided, however, if m̃ is the maximal abelian covering of m rather than the universal covering [12, 24]. let us recall basic properties of the harmonic analysis on discrete type i groups [20]. the haar measure on γ is simply the counting measure. let dm̂ be the plancherel measure on γ̂. it is known that if γ is a countable discrete group of type i then dim lλ, the dimension of the carrier representation space, is a bounded function of λ ∈ γ̂ [26, korollar i]. denote by i2(lλ) ≡ lλ ⊗ l ∗λ the hilbert space formed by hilbert-schmidt operators on lλ (l ∗λ is the dual space to lλ). the fourier transform is defined as a unitary mapping f : l2(γ) → ∫ ⊕ γ̂ i2(lλ) dm̂(λ). note that in this situation f ∈ l1(γ) just means that the values of f on γ represent a summable sequence. since every summable sequence is also square summable we have f ∈ l1(γ) ⊂ l2(γ), and then f [f](λ) = ∑ s∈γ f(s)λ(s). conversely, if f is of the form f = g ∗h (the convolution) where g,h ∈ l1(γ), and f̂ = f [f] then f(s) = ∫ γ̂ tr [ λ(s)∗f̂(λ) ] dm̂(λ). using unitarity of the fourier transform one finds that m̂(γ̂) ≤ 1. the following rule is of crucial importance: ∀s ∈ γ, ∀f ∈ l2(γ), f [f(s ·g)](λ) = λ(s−1)f [f(g)](λ). now we are going to construct a unitary map φ : l2(m̃) → ∫ ⊕ γ̂ hλ ⊗ l ∗λ dm̂(λ) making it possible to decompose h. observe that the tensor product hλ ⊗ l ∗λ can be naturally identified with the hilbert space of λ-equivariant operator-valued functions on m̃ with values in i2(lλ). for f ∈ l2(m̃) and y ∈ m̃ set ∀s ∈ γ, fy(s) = f(s−1 ·y). the norm ‖fy‖ in l2(γ) is a γ-invariant function of y ∈ m̃ whose projection onto m is square integrable. hence for almost all x ∈ m and all y ∈ π−1({x}) one has fy ∈ l2(γ). we define components φ[f](λ), λ ∈ γ̂, by( φ[f](λ) ) (y) = f [fy](λ) ∈ i2(lλ). in particular, if f ∈ l1(m̃) ∩l2(m̃) then( φ[f](λ) ) (y) = ∑ s∈γ f(s−1 ·y)λ(s). equivalently, referring to (1), one can define φ in the following way. for ϕ ∈ c∞0 (m̃), v ∈ lλ and y ∈ m̃ set( φ[ϕ](λ) ) (y)v = (φλϕ⊗v)(y). (2) then φ introduced in (2) is an isometry and extends unambiguously to a unitary mapping. finally one can verify the formula φhφ−1 = ∫ ⊕ γ̂ hλ ⊗ 1 dm̂(λ) which represents the sought bloch decomposition. as a corollary we have φu(t)φ−1 = ∫ ⊕ γ̂ uλ(t) ⊗ 1 dm̂(λ). (3) 226 vol. 56 no. 3/2016 the aharonov-bohm hamiltonian with two vortices revisited 2.3. propagators associated with periodic hamiltonians in (3), the evolution operator u(t) is expressed in terms of uλ(t), λ ∈ γ̂. it is possible to invert this relationship and to derive a formula for the propagator associated with hλ which is expressed in terms of the propagator associated with h. the propagators are regarded as distributions which are introduced as kernels of the corresponding evolution operators. recall that, by the schwartz kernel theorem (see, for example, [8, theorem 5.2.1]), to every b ∈ b(l2(m̃)) there exists one and only one β ∈ d′(m̃ ×m̃) such that ∀ϕ1,ϕ2 ∈ c∞0 (m̃), β(ϕ1 ⊗ϕ2) = 〈ϕ1,bϕ2〉. one calls β the kernel of b. the kernel theorem can be extended to hilbert spaces formed by λ-equivariant vector-valued functions. in this case the kernels are operator-valued distributions. to every b ∈ b(hλ) there exists one and only one β ∈ d′(m̃ ×m̃) ⊗ b(lλ) such that ∀ϕ1,ϕ2 ∈ c∞0 (m̃), ∀v1,v2 ∈ lλ,〈 v1,β(ϕ1 ⊗ϕ2)v2 〉 lλ = 〈 φλϕ1 ⊗v1,bφλϕ2 ⊗v2 〉 . 
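stepping back to subsection 2.2 for a moment: for the abelian group γ = ℤ (which is of type i), the dual γ̂ is the circle, all spaces lλ are one-dimensional, and the fourier transform together with the shift rule recalled above reduce to classical fourier series. the following small numerical sketch is ours and only mirrors those definitions in this special case.

```python
import numpy as np

# Gamma = Z with lambda_t(n) = exp(2*pi*i*t*n), t in [0,1) parametrizing the dual
rng = np.random.default_rng(1)
support = np.arange(-8, 9)                       # finite support, so f is in l1(Gamma)
f = rng.normal(size=support.size) + 1j * rng.normal(size=support.size)

def fourier(values, points, t):
    # F[f](lambda_t) = sum_s f(s) * lambda_t(s)
    return np.sum(values * np.exp(2j * np.pi * t * points))

t, s0 = 0.37, 3
# the shifted function g(n) = f(s0 + n) lives on the shifted support:
lhs = fourier(f, support - s0, t)                # F[f(s0 . g)](lambda_t)
rhs = np.exp(-2j * np.pi * t * s0) * fourier(f, support, t)
print(np.allclose(lhs, rhs))                     # -> True: the shift rule holds
```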
the distribution β is λ-equivariant: ∀s ∈ γ, β(s ·y1,y2) = λ(s)β(y1,y2) and β(y1,s ·y2) = β(y1,y2)λ(s−1) (4) denote by kt ∈ d′(m̃ ×m̃) the kernel of u(t) ∈ b(l2(m̃)), and by kλt ∈ d′(m̃ ×m̃) ⊗b(lλ) the kernel of uλ(t) ∈ b(hλ). here and everywhere in this section, t is a real parameter. the kernel kλt is λ-equivariant in the sense of (4). first, we can rewrite the bloch decomposition (3) in terms of kernels. for all ϕ1,ϕ2 ∈ c∞0 (m̃), kt(ϕ1 ⊗ϕ2) = ∫ γ̂ tr [ kλt (ϕ1 ⊗ϕ2) ] dm̂(λ), with the integral being convergent. an inverse relation was derived by schulman in the framework of path integration [18, 19] and reads kλt (x,y) = ∑ s∈γ λ(s)kt(s−1 ·x,y). (5) it is possible to give (5) the following rigorous interpretation [11, 12]. suppose that ϕ1,ϕ2 ∈ c∞0 (m̃) are fixed but otherwise arbitrary. set ft(s) = kt ( ϕ1(s−1 ·y1) ⊗ϕ2(y2) ) for s ∈ γ, gt(λ) = kλt (ϕ1 ⊗ϕ2) ∈ i2(lλ) for λ ∈ γ̂. one can show that ft ∈ l2(γ) and gt is bounded on γ̂ in the hilbert-schmidt norm. recalling that m̂(γ̂) ≤ 1 we have ‖gt(·)‖∈ l1(γ̂) ∩l2(γ̂). then ft = f−1[gt] and, consequently, gt = f [ft]. (6) rewriting (6) formally yields (5). 3. the propagator on the universal covering space 3.1. a formula for the propagator the configuration space for the aharonov-bohm effect with two vortices is the plane with two excluded points, m = r2 \{a,b}. this is a flat riemannian manifold and the same is true for the universal covering space m̃. let π : m̃ → m be the projection. it is convenient to complete the manifold m̃ by a countable set of points a∪b lying on the border of m̃ and projecting onto the excluded points, π(a) = {a} and π(b) = {b}. m̃ looks locally like r2 but differs from the euclidean space by some global features. first of all, not every two points from m̃ can be connected by a geodesic segment. fix a point x ∈ m̃. the symbol d(x), as introduced below in (20), stands for the set of points y ∈ m̃ which can be connected with x by a segment. then d(x) is a sheet of the covering m̃ → m. it can be identified with r2 cut along two half-lines with the limit points a and b, respectively. the border ∂d(x) is formed by four half-lines. the universal covering space m̃ can be imagined as a result of an infinite process of gluing together countably many copies of d(x) with each copy having four neighbors. 227 petra košťáková, pavel šťovíček acta polytechnica the fundamental group of m, called γ, is known to be the free group with two generators ga and gb. for the generator ga one can choose the homotopy class of a simple positively oriented loop winding once around the point a and leaving the point b in the exterior. analogously one can choose gb by interchanging the role of a and b. one-dimensional unitary representations λ of γ are determined by two numbers α, β, 0 ≤ α,β < 1, so that λ(ga) = e2πiα, λ(gb) = e2πiβ. the standard way how to define the aharonov-bohm hamiltonian hab with two vortices is to choose a vector potential −→ a for which curl −→ a = 0 on m and such that the nonintegrable phase factor [27] for a closed path from the homotopy class ga or gb equals e2πiα or e2πiβ, respectively (assuming that 0 < α,β < 1). hab then acts as the differential operator (−i∇− −→ a)2 in l2(m). here again, to be more rigorous, hab is the friedrichs extension of the positive operator (−i∇− −→ a)2 defined on test functions on m. for our purposes it would be more convenient to pass to a unitarily equivalent formulation. this is done in two steps. first, the differential operator (−i∇− −→ a)2 is lifted to m̃. 
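before completing the two-step construction, it may help to see formula (5) at work in the simplest possible setting: m̃ = ℝ, γ = ℤ acting by shifts of length l, i.e. a particle on a circle threaded by the flux α. the sketch below (ours) compares the image sum (5), built from the free kernel on ℝ, with the independent eigenfunction expansion of the same propagator; the complexified time τ = t − iε is a purely illustrative regularization that makes both oscillatory sums absolutely convergent.

```python
import numpy as np

L, alpha = 1.0, 0.3
tau = 1.0 - 0.2j              # complexified time t - i*eps (illustrative regularization)
x, y = 0.23, 0.71

def k_free(a, tau):
    # free kernel on the covering space R, for H = -d^2/dx^2 (hbar = 1, m = 1/2)
    return np.exp(1j * a**2 / (4 * tau)) / (4j * np.pi * tau) ** 0.5

# schulman's formula (5): k_lambda(t,x,y) = sum_n lambda(n) k(x - n*L, y)
n = np.arange(-60, 61)
k_sum = np.sum(np.exp(2j * np.pi * alpha * n) * k_free(x - y - n * L, tau))

# independent check: eigenfunction expansion on the circle with flux alpha
j = np.arange(-200, 201)
kj = 2 * np.pi * (j + alpha) / L
k_eig = np.sum(np.exp(-1j * kj**2 * tau) * np.exp(1j * kj * (x - y))) / L

print(k_sum, k_eig)           # the two evaluations agree
```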
the lift yields the unitarily equivalent operator h̃ab, acting as the differential operator (−i∇ − ã)² in the hilbert space of γ-periodic functions on m̃ that are square integrable over a fundamental domain of the γ action. once more, h̃ab is rigorously introduced with the aid of the friedrichs extension. second, curl ã = 0 holds again on m̃. but this time m̃ is simply connected, and therefore the vector potential can be removed by a globally well defined gauge transformation. this gauge transformation induces a unitary mapping between the hilbert space of γ-periodic functions on m̃ and the hilbert space hλ of λ-equivariant functions on m̃, as introduced in subsection 2.1. the resulting operator, which is unitarily equivalent to hab, is nothing but the hamiltonian hλ = −∆ acting in hλ, as introduced in the same subsection. remember that simultaneously one considers the free hamiltonian h = −∆ in l2(m̃); h is γ-periodic. in order to apply (5) and compute the propagator kλ(t,x,y) associated with hλ, one has to rely on a known formula for the free propagator k(t,x,y) on m̃. let us recall the formula for k(t,x,y) as presented in [21]. let ϑ stand for the heaviside step function. for x, y ∈ m̃ ∪ a ∪ b set χ(x,y) = 1 if the points x, y can be connected by a geodesic segment, and χ(x,y) = 0 otherwise. given t ∈ r, we define
$$z(t,x,y) = \vartheta(t)\,\chi(x,y)\,\frac{1}{4\pi i t}\exp\Big(\frac{i}{4t}\,\mathrm{dist}^2(x,y)\Big). \tag{7}$$
furthermore, for x1, x3 ∈ m̃ ∪ a ∪ b and x2 ∈ a ∪ b obeying χ(x1,x2) = χ(x2,x3) = 1, and for t1, t2 > 0, we set
$$v\Big(\begin{matrix} x_3,x_2,x_1 \\ t_2,t_1 \end{matrix}\Big) = 2i\left(\Big(\theta - \pi + i\ln\frac{t_2 r_1}{t_1 r_2}\Big)^{-1} - \Big(\theta + \pi + i\ln\frac{t_2 r_1}{t_1 r_2}\Big)^{-1}\right), \tag{8}$$
where θ = ∠x1,x2,x3 ∈ r is the oriented angle and r1 = dist(x1,x2), r2 = dist(x2,x3). note that θ can take any real value. we claim that the free propagator on m̃ equals
$$k(t,x,x_0) = \sum_{\gamma \in C(x,x_0)} k_\gamma(t,x,x_0), \tag{9}$$
where c(x,x0) stands for the set of all piecewise geodesic curves γ : x0 → c1 → ··· → cn → x with the inner vertices cj, 1 ≤ j ≤ n, belonging to the set of extreme points a ∪ b. this means that it should hold χ(x0,c1) = χ(c1,c2) = ··· = χ(cn,x) = 1. let us denote by |γ| = n the length of the sequence (c1,c2,...,cn). in particular, if |γ| = 0 then γ designates the geodesic segment x0 → x. to simplify notation we set, everywhere where convenient, c0 = x0 and cn+1 = x. with this convention, the terms in (9) equal
$$k_\gamma(t,x,x_0) = \int_{\mathbb{R}^{n+1}} dt_n \cdots dt_0\; \delta(t_n + \cdots + t_0 - t)\prod_{j=0}^{n-1} v\Big(\begin{matrix} c_{j+2},c_{j+1},c_j \\ t_{j+1},t_j \end{matrix}\Big)\prod_{j=0}^{n} z(t_j,c_{j+1},c_j). \tag{10}$$
in particular, if |γ| = 0 then kγ(t,x,x0) = z(t,x,x0), and if |γ| = 1 then γ designates a path composed of two geodesic segments x0 → c → x, with c ∈ a ∪ b, and
$$k_\gamma(t,x,x_0) = \vartheta(t)\int_0^t v\Big(\begin{matrix} x,c,x_0 \\ t-s,\,s \end{matrix}\Big)\, z(t-s,x,c)\, z(s,c,x_0)\, ds.$$
in what follows we aim to provide a detailed verification of formulas (9), (10).

3.2. auxiliary relations

in r², it holds true that
$$\Big(\frac{\partial}{\partial x} + i\frac{\partial}{\partial y}\Big)\frac{1}{x + iy} = 2\pi\,\delta(x)\delta(y) \quad\text{and}\quad \Delta\,\frac{1}{x + iy} = 2\pi\big(\delta(y)\delta'(x) - i\,\delta(x)\delta'(y)\big).$$
it follows that
$$\Big(\frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial^2}{\partial\theta^2}\Big)\Big(\theta + i\ln\frac{t}{r}\Big)^{-1} = \frac{2\pi t}{r^2}\big(\delta(t-r)\delta'(\theta) - ir\,\delta'(t-r)\delta(\theta)\big) \tag{11}$$
holds on the domain t > 0, r > 0, θ ∈ r. on the same domain,
$$\Big(r\frac{\partial}{\partial r} + t\frac{\partial}{\partial t}\Big)\Big(\theta + i\ln\frac{t}{r}\Big)^{-1} = 0. \tag{12}$$
combining (11) and (12) one finds that
$$\Big(i\frac{\partial}{\partial t} + \frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial^2}{\partial\theta^2}\Big)\Big(\theta + i\ln\frac{t}{r}\Big)^{-1}\frac{1}{t}\exp\Big(i\frac{r^2}{4t}\Big) = \frac{2\pi}{r^2}\exp\Big(i\frac{r^2}{4t}\Big)\big(\delta(t-r)\delta'(\theta) - ir\,\delta'(t-r)\delta(\theta)\big). \tag{13}$$
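identity (12) simply states that the function depends on t and r only through the ratio t/r; this smooth part of the auxiliary relations can be checked symbolically. the sketch below is ours and is purely a sanity check away from the singular support (the distributional identities (11), (13) are of course beyond such a check).

```python
import sympy as sp

r, t = sp.symbols('r t', positive=True)
theta = sp.symbols('theta', real=True)
g = 1 / (theta + sp.I * sp.log(t / r))

# (r d/dr + t d/dt) g = 0, i.e. identity (12)
print(sp.simplify(r * sp.diff(g, r) + t * sp.diff(g, t)))   # -> 0
```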
equipped with (13) one can prove the identity
$$\Big(i\frac{\partial}{\partial t} + \frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial^2}{\partial\theta^2}\Big)\int_0^t \Big(\theta + i\ln\frac{(t-s)r_0}{sr}\Big)^{-1}\frac{1}{t-s}\exp\Big(i\frac{r^2}{4(t-s)}\Big) f(s)\,ds = \frac{2\pi r_0}{r^2(r+r_0)}\exp\Big(i\frac{(r+r_0)r}{4t}\Big)\Big[f\Big(\frac{tr_0}{r+r_0}\Big)\delta'(\theta) - i\frac{r}{r+r_0}\Big(\Big(1 + i\frac{r_0(r+r_0)}{4t}\Big)f\Big(\frac{tr_0}{r+r_0}\Big) + \frac{tr_0}{r+r_0}\,f'\Big(\frac{tr_0}{r+r_0}\Big)\Big)\delta(\theta)\Big], \tag{14}$$
which is true in the sense of distributions for any r0 > 0 and f ∈ c¹([0,+∞[), again on the domain t > 0, r > 0, θ ∈ r. note that (1/ε) exp(ir²/(4ε)) → 0 as ε → 0+ in d′(]0,+∞[). in particular, letting f(s) = (1/s) exp(ir0²/(4s)) one derives the identity
$$\Big(i\frac{\partial}{\partial t} + \frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial^2}{\partial\theta^2}\Big)\int_0^t \Big(\theta + i\ln\frac{(t-s)r_0}{sr}\Big)^{-1}\frac{1}{(t-s)s}\exp\Big(i\Big(\frac{r^2}{4(t-s)} + \frac{r_0^2}{4s}\Big)\Big)ds = \frac{2\pi}{tr^2}\exp\Big(i\frac{(r+r_0)^2}{4t}\Big)\delta'(\theta). \tag{15}$$
let us further recall a basic fact concerning the generalized laplacian. if g ⊂ m̃ is an open set with a piecewise smooth boundary, χg is the characteristic function of g, n⃗ is the normalized outer normal vector field on ∂g and η is a smooth function on m̃, then, in the sense of distributions,
$$\Delta(\eta\,\chi_G) = (\Delta\eta)\,\chi_G - \frac{\partial\eta}{\partial\vec{n}}\,\delta_{\partial G} - \frac{\partial}{\partial\vec{n}}\big(\eta\,\delta_{\partial G}\big). \tag{16}$$
the distribution δ∂g is a single layer supported on the curve ∂g and fulfilling
$$\forall\varphi \in C_0^\infty(\tilde{M}),\quad \delta_{\partial G}(\varphi) = \int_{\partial G}\varphi\,d\ell.$$
the double layer ∂/∂n⃗(η δ∂g) is defined by
$$\forall\varphi \in C_0^\infty(\tilde{M}),\quad \Big(\frac{\partial}{\partial\vec{n}}\big(\eta\,\delta_{\partial G}\big)\Big)(\varphi) = -\int_{\partial G}\eta\,\frac{\partial\varphi}{\partial\vec{n}}\,d\ell.$$

3.3. verification of the propagator formula

we have to show that, for x0 ∈ m̃ fixed, the propagator k(t,x,x0) defined in (9), (10) verifies the condition
$$\Big(i\frac{\partial}{\partial t} + \Delta\Big)k(t,x,x_0) = i\,\delta(t)\,\delta(x,x_0)\quad\text{on } \mathbb{R}\times\tilde{M}. \tag{17}$$
this is equivalent to showing that
$$\lim_{t\to 0+} k(t,x,x_0) = \delta(x,x_0) \tag{18}$$
and
$$\Big(i\frac{\partial}{\partial t} + \Delta\Big)k(t,x,x_0) = 0\quad\text{for } t > 0,\ x \in \tilde{M}. \tag{19}$$
equation (18) is obvious. in fact, since the form of z(t,x,x0) on the sheet {x; χ(x,x0) = 1} is that of the free propagator on r², we have lim_{t→0+} z(t,x,x0) = δ(x,x0). by a similar reasoning, lim_{t→0+} z(t,x,c) = 0 if c ∈ a∪b and x runs over m̃. hence lim_{t→0+} kγ(t,x,x0) = 0 if |γ| ≥ 1. concerning (19), we first introduce some notation related to the geometry of the universal covering space. denote by ϱ the distance dist(a,b). observe that if c1, c2 ∈ a∪b then χ(c1,c2) = 1 if and only if dist(c1,c2) = ϱ. if this is the case then necessarily c1 ∈ a and c2 ∈ b, or vice versa. for x ∈ m̃ ∪ a ∪ b set
$$d(x) = \{\,y \in \tilde{M};\ \chi(x,y) = 1\,\}. \tag{20}$$
if x ∈ m̃ then d(x) can be identified with the plane cut along two half-lines with the limit points a and b, respectively. the border of d(x) consists of two pairs of half-lines. one pair has a common limit point a ∈ a and is denoted ∂d(x; a); the other pair has a common limit point b ∈ b and is denoted ∂d(x; b). we have
$$\partial d(x) = \partial d(x;a) \cup \partial d(x;b). \tag{21}$$
if c ∈ a∪b then d(c) resembles the universal covering space in the one-vortex case. it can be viewed as a union of countably many sheets glued together in a staircase-like way. each sheet contributes to the border of d(c) by a pair of half-lines with a common limit point c′. thus the border ∂d(c) is formed by a countable union of pairs of half-lines:
$$\partial d(c) = \bigcup_{c' \in D,\ \mathrm{dist}(c,c') = \varrho} \partial d(c;c'), \tag{22}$$
where d = b if c ∈ a, and d = a if c ∈ b. let us first examine the case |γ| = 0. one has (i ∂/∂t + ∆) z(t,x,x0) = 0 for t > 0 and x ∈ d(x0). observe also that ∂z(t,x,x0)/∂n⃗ = 0 for x ∈ ∂d(x0). this is so since, in polar coordinates centered at x0, z(t,x,x0) does not depend on the angle variable. let us also note that z(t,x,x0) can be continued smoothly in the variable x over the borderline of the domain d(x0).
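the two facts just used, that z is radial around x0 and solves the free schrödinger equation on the sheet d(x0), can be confirmed with sympy. the sketch below is ours; it drops the heaviside and characteristic-function factors and works in polar coordinates, where the angular term vanishes because z is radial.

```python
import sympy as sp

r, t = sp.symbols('r t', positive=True)
z = sp.exp(sp.I * r**2 / (4 * t)) / (4 * sp.pi * sp.I * t)   # kernel (7), r = dist(x, x0)

# radial free Schrodinger equation: i dz/dt + z_rr + (1/r) z_r = 0
expr = sp.I * sp.diff(z, t) + sp.diff(z, r, 2) + sp.diff(z, r) / r
print(sp.simplify(expr))   # -> 0
```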
thus, in virtue of (16), for t > 0 and x ∈ m̃, ( i ∂ ∂t + ∆ ) z(t,x,x0) = − ∂ ∂−→n ( z(t,x,x0)δ∂d(x0) ) . (23) remark. in (23) as well as everywhere in this section we use the following convention. the value of a density (which is in this case z(t,x,x0)) on the border ∂d(x0) is understood as the limit value taken from the interior of the domain d(x0). 230 vol. 56 no. 3/2016 the aharonov-bohm hamiltonian with two vortices revisited next we discuss the case |γ| = 1. then γ designates a piecewise geodesic curve x0 → c → x, with c ∈a∪b. denote by γ′ the geodesic segment x0 → x provided x ∈ d(x0). we have kγ(t,x,x0) = 1 8π2i χ(x,c)χ(c,x0) × ∫ t 0 (( θ −π + i ln (t−s)r0 sr )−1 − ( θ + π + i ln (t−s)r0 sr )−1) 1 (t−s)s exp ( i ( r2 4(t−s) + r20 4s )) ds (24) where r = dist(x,c), r0 = dist(c,x0) and θ = ∠x0,c,x. application of the differential operator (i∂t + ∆) to the rhs of (24) in the sense of distributions results in several singular terms supported on one-dimensional submanifolds. first, due to the discontinuity of the characteristic function χ(x,c), the application of ∆ leads to two terms supported on the boundary ∂d(c) (see (16)). second, as it follows from (15), the singularity of the integrand for the values θ = ±π and r0/s = r/(t−s) produces terms supported on the submanifold determined by θ = ±π, and this set is nothing but a part of the boundary of the domain d(x0), namely ∂d(x0; c). notice that for θ = ±π it holds r + r0 = dist(x,x0) and ∂/∂−→n = ±r−1∂/∂θ. moreover, in polar coordinates centered at c, δ∂d(x0;c) = 1 r ( δ(θ −π) + δ(θ + π) ) . thus the latter contribution takes the form 1 4πitr2 ∂ ∂θ ( exp ( i 4t dist(x,x0)2 )( δ(θ −π) − δ(θ + π) )) = ∂ ∂−→n ( kγ′(t,x,x0)δ∂d(x0,c) ) , where kγ′(t,x,x0) = z(t,x,x0). in summary, we obtain ( i ∂ ∂t + ∆ ) kγ(t,x,x0) = − ( ∂ ∂−→n kγ(t,x,x0) ) δ∂d(c) − ∂ ∂−→n ( kγ(t,x,x0)δ∂d(c) ) + ∂ ∂−→n ( kγ′(t,x,x0)δ∂d(x0,c) ) . (25) finally, let us consider the case |γ| ≥ 2. thus γ is a piecewise geodesic curve x0 → c1 → ··· → cn → x, n ≥ 2. denote by γ′ the truncated geodesic curve x0 → c1 → ··· → cn−1 → x provided x ∈ d(cn−1). recalling (7), (8), one can express kγ(t,x,x0) = ∫ rn dtn−1 . . . dt0v ( x,cn,cn−1 t− τ,tn−1 ) z(t− τ,x,cn)fγ(t0, . . . , tn−1,x0) = 1 2π χ(x,cn) ∫ rn−1 dtn−2 . . . dt0 ∫ t−τ′ 0 dtn−1 (( θ −π + i ln (t− τ′ − tn−1)% tn−1r )−1 − ( θ + π + i ln (t− τ′ − tn−1)% tn−1r )−1) 1 t− τ′ − tn−1 exp ( i r2 4(t− τ′ − tn−1) ) fγ(t0, . . . , tn−1,x0), (26) where τ = t0 + · · · + tn−2 + tn−1, τ′ = t0 + · · · + tn−2, r = dist(cn,x), θ = ∠cn−1,cn,x, and fγ(t0, . . . , tn−1,x0) = n−2∏ j=0 v ( cj+2,cj+1,cj tj+1, tj )n−1∏ j=0 ztj (cj+1,cj). application of (i∂t + ∆) to the rhs of (26) in the sense of distributions again produces several singular terms. as a consequence of the discontinuity of the characteristic function χ(x,cn) a single and a double layer supported on the boundary ∂d(cn) occur (see (16)). the singularity of the integrand for the values θ = ±π and %/tn−1 = r/(t− τ′ − tn−1) produces terms supported on the part of the boundary of the domain d(cn−1), namely on ∂d(cn−1; cn). this time one can apply identity (14). in order to treat the resulting terms the following equations are useful. suppose that θ = ±π and so x ∈ ∂d(cn−1; cn). set r′ = r + % = dist(cn−1,x), θ′ = ∠cn−2,cn−1,x. if %/tn−1 = r/(t− τ′ − tn−1) then tn−1 = %(t− τ′) r′ and t− τ′ − tn−1 r = t− τ′ r′ . 
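the algebraic step just stated, together with the stationarity of the phase at the same point, which is used in the next manipulation, can be spot-checked with sympy. the sketch is ours; capital t stands for t − τ′ and rho for ϱ.

```python
import sympy as sp

r, rho, T = sp.symbols('r rho T', positive=True)
s = sp.symbols('s', positive=True)
s_star = rho * T / (r + rho)                       # the claimed value of t_{n-1}

# rho/s = r/(T - s) is solved by s = rho*T/r' with r' = r + rho ...
print(sp.simplify(rho / s_star - r / (T - s_star)))              # -> 0
# ... and then (T - s)/r = T/r'
print(sp.simplify((T - s_star) / r - T / (r + rho)))             # -> 0
# the phase r^2/(4(T-s)) + rho^2/(4s) is stationary exactly there:
phase = r**2 / (4 * (T - s)) + rho**2 / (4 * s)
print(sp.simplify(sp.diff(phase, s).subs(s, s_star)))            # -> 0
# and its stationary value is r'^2/(4T):
print(sp.simplify(phase.subs(s, s_star) - (r + rho)**2 / (4 * T)))  # -> 0
```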
231 petra košťáková, pavel šťovíček acta polytechnica moreover, % r′ exp ( ir2 4(t− τ) ) z(tn−1,cn,cn−1) = z(t− τ′,x,cn−1) and v ( cn,cn−1,cn−2 %s2/r ′,s1 ) = v ( x,cn−1,cn−2 s2,s1 ) . observe also that ∂ ∂s ( exp ( ir2 4(t− τ′ −s) ) exp (i%2 4s ))∣∣∣∣ s=%(t−τ′)/r′ = 0, exp ( ir2 4(t− τ′ −s) ) exp (i%2 4s )∣∣∣∣ s=%(t−τ′)/r′ = exp ( ir′ 2 4(t− τ′) ) , and for θ = π, ∂ ∂s v ( cn,cn−1,cn−2 s,tn−2 )∣∣∣∣ s=%(t−τ′)/r′ = ir′ %(t− τ′) ∂ ∂θ′ v ( x,cn−1,cn−2 t− τ′, tn−2 ) . a similar relation holds for θ = −π. after a bit tedious but quite straightforward manipulations one arrives at the final identity ( i ∂ ∂t + ∆ ) kγ(t,x,x0) = − ( ∂ ∂−→n kγ(t,x,x0) ) δ∂d(cn) − ∂ ∂−→n ( kγ(t,x,x0)δ∂d(cn) ) + ( ∂ ∂−→n kγ′(t,x,x0) ) δ∂d(cn−1;cn) + ∂ ∂−→n ( kγ′(t,x,x0)δ∂d(cn−1;cn) ) . (27) now we can show (19) when taking into account (23), (25) and (27). it is true that ( i ∂ ∂t + ∆ ) k(t,x,x0) = ∑ |γ|≥2 [ − ( ∂ ∂−→n kγ(t,x,x0) ) δ∂d(cn) − ∂ ∂−→n ( kγ(t,x,x0)δ∂d(cn) ) + ( ∂ ∂−→n kγ′(t,x,x0) ) δ∂d(cn−1;cn) + ∂ ∂−→n ( kγ′(t,x,x0)δ∂d(cn−1;cn) )] + ∑ |γ|=1 [ − ( ∂ ∂−→n kγ(t,x,x0) ) δ∂d(c) − ∂ ∂−→n ( kγ(t,x,x0)δ∂d(c) ) + ∂ ∂−→n ( z(t,x,x0)δ∂d(x0;c) )] − ∂ ∂−→n ( z(t,x,x0)δ∂d(x0) ) = 0, where we have used (21) and (22). 4. conclusion. the propagator for two aharonov-bohm vortices in conclusion we present a formula for the propagator of a charged particle on the plane pierced by two aharonov-bohm magnetic fluxes. without loss of generality we can suppose that the vortices are located at the points a = (0, 0) and b = (%, 0). in order to express the propagator for the aharonov-bohm hamiltonian hab we again pass to a unitarily equivalent formulation. let us cut the plane along two half-lines, la = ]−∞, 0[ ×{0} and lb = ]%, +∞[ ×{0}. let (ra,θa) be polar coordinates centered at the point a and (rb,θb) be polar coordinates centered at the point b. the angle variables are chosen so that the values θa = ±π correspond to the two sides of the cut la, and similarly for θb and lb. then an explicit and commonly used choice of the aharonov-bohm vector potential reads −→ a = α∇θa + β∇θb. denote by u a unitary operator in l2(r2,d2x) acting as the multiplication operator uψ = ei(αθa+βθb)ψ, 232 vol. 56 no. 3/2016 the aharonov-bohm hamiltonian with two vortices revisited and let h′λ = u −1habu. then h′λ acts as −∆ in l 2(r2,d2x), and its domain is determined by the boundary conditions along the cut la ∪lb: ψ(ra,θa = π) = e2πiαψ(ra,θa = −π), ∂θaψ(ra,θa = π) = e 2πiα∂θaψ(ra,θa = −π), ψ(rb,θb = π) = e2πiβψ(rb,θb = −π), ∂θbψ(rb,θb = π) = e 2πiβ∂θbψ(rb,θb = −π). in addition, one imposes the regular boundary condition at the vortices, namely ψ(a) = ψ(b) = 0. we wish to find a formula for the propagator kab(t,x,x0) associated with the hamiltonian hab. note that kab(t,x,x0) = expi(αθa(x)+βθb(x)) k′ λ(t,x,x0)e−i(αθa(x0)+βθb(x0)) where k′λ(t,x,x0) is the propagator associated with h′λ. let us denote d = r2 \ (la ∪lb). then one can embed d ⊂ m̃ as a fundamental domain. k′λ(t,x,x0) is simply obtained as the restriction to d of the propagator kλ(t,x,x0) associated with the hamiltonian hλ. on the other hand, to construct kλ(t,x,x0) one can apply formula (5) and the knowledge of the free propagator on m̃, see (9), (10). thus we get kλ(t,x,x0) = ∑ g∈γ ∑ γ∈c (g·x,x0) λ(g−1)kγ(t,g ·x,x0). (28) fix t > 0 and x0,x ∈ d. one can classify piecewise geodesic paths in m̃, γ : x0 → c1 →···→ cn → g ·x, (29) with cj ∈ a∪b and g ∈ γ, according to their projections to m. let γ be a finite alternating sequence of points a and b, i.e. γ = (c1, . . . 
,cn), where cj ∈ {a,b} and cj ≠ cj+1. the empty sequence γ = () is admissible. relate to γ a piecewise geodesic path in m, namely x0 → c1 → ··· → cn → x. suppose that this path is covered by a path γ in m̃, as given in (29). then cj ∈ a iff cj = a, and cj ∈ b iff cj = b. denote the angles ∠x0,c1,c2 = θ0 and ∠cn−1,cn,x = θ. then the angles in the path γ in (29) take the values ∠x0,c1,c2 = θ0 + 2πk1, ∠cn−1,cn,g·x = θ + 2πkn and ∠cj,cj+1,cj+2 = 2πkj+1 for 1 ≤ j ≤ n−2 (if n ≥ 3), where k1,...,kn are integers. any values k1,...,kn ∈ ℤ are possible. in that case the representation λ applied to the group element g occurring in (29) takes the value λ(g) = exp(2πi(k1σ1 + ··· + knσn)), where σj ∈ {α,β}, with σj = α if cj = a and σj = β if cj = b. using the equation
$$\sum_{k\in\mathbb{Z}} e^{-2\pi i\alpha k}\Big(\frac{1}{\theta + 2\pi k - \pi + is} - \frac{1}{\theta + 2\pi k + \pi + is}\Big) = -2\sin(\pi\alpha)\,\frac{e^{-\alpha(s - i\theta)}}{1 + e^{-s + i\theta}},$$
which is valid for 0 < α < 1, |θ| < π, one can carry out a partial summation in (28) over the integers k1,...,kn. this way the double sum in (28) reduces to a sum over finite alternating sequences γ. here is the resulting formula for k′λ(t,x,x0). we set ζa = 1, or ζa = e^{2πiα}, or ζa = e^{−2πiα}, depending on whether the segment x0x does not intersect la, or x0x intersects la and x0 lies in the lower half-plane, or x0x intersects la and x0 lies in the upper half-plane. analogously, ζb = 1, or ζb = e^{2πiβ}, or ζb = e^{−2πiβ}, depending on whether the segment x0x does not intersect lb, or x0x intersects lb and x0 lies in the upper half-plane, or x0x intersects lb and x0 lies in the lower half-plane. furthermore, let us write ζa = e^{iαηa}, ζb = e^{iβηb}, where ηa, ηb ∈ {0, 2π, −2π}. then one has
$$k'_\lambda(t,x,x_0) = \zeta_a\zeta_b\,\frac{1}{4\pi it}\exp\Big(i\,\frac{|x-x_0|^2}{4t}\Big) - \sum_{c\in\{a,b\}}\zeta_c\,\frac{\sin(\pi\sigma)}{4\pi^2 i}\int_0^\infty\frac{dt_1}{t_1}\int_0^\infty\frac{dt_0}{t_0}\,\delta(t_1+t_0-t)\exp\Big(i\Big(\frac{r_c^2}{4t_1}+\frac{r_{0c}^2}{4t_0}\Big)\Big)\frac{\exp[-\sigma(s_c - i(\theta_c-\theta_{0c}-\eta_c))]}{1+\exp(-s_c+i\theta_c-i\theta_{0c})}$$
$$+\ \frac{1}{4\pi i}\sum_{\gamma,\,n\ge 2}(-1)^n\int_0^\infty\frac{dt_n}{t_n}\cdots\int_0^\infty\frac{dt_0}{t_0}\,\delta(t_n+\cdots+t_0-t)\,\exp\Big(\frac{i}{4}\Big(\frac{r^2}{t_n}+\frac{\varrho^2}{t_{n-1}}+\cdots+\frac{\varrho^2}{t_1}+\frac{r_0^2}{t_0}\Big)\Big)\,s_\gamma(s,\theta,\theta_0),$$
where
$$s_\gamma(s,\theta,\theta_0) = \frac{\sin(\pi\sigma_n)}{\pi}\,\frac{\exp[-\sigma_n(s_n-i\theta)]}{1+\exp(-s_n+i\theta)}\cdot\frac{\sin(\pi\sigma_{n-1})}{\pi}\,\frac{\exp(-\sigma_{n-1}s_{n-1})}{1+\exp(-s_{n-1})}\times\cdots\times\frac{\sin(\pi\sigma_1)}{\pi}\,\frac{\exp[-\sigma_1(s_1-i\theta_0)]}{1+\exp(-s_1+i\theta_0)},$$
and
$$s_a = \ln\frac{t_1 r_{0a}}{t_0 r_a},\qquad s_b = \ln\frac{t_1 r_{0b}}{t_0 r_b},\qquad s_j = \ln\frac{t_j r_{j-1}}{t_{j-1} r_j}\quad\text{for } 1\le j\le n.$$
furthermore, in the first sum on the rhs, (rc,θc) and (r0c,θ0c) are the polar coordinates of x and x0 with respect to the center c, respectively, and σ = α (resp. β) if c = a (resp. b). the second sum, over γ with n ≥ 2, runs over all finite alternating sequences of length at least two, γ = (c1,...,cn), and (r,θ) are the polar coordinates of x with respect to cn, (r0,θ0) are the polar coordinates of x0 with respect to c1, and σj = α (resp. β) depending on whether cj = a (resp. b).
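since the two fractions in the summand of the summation identity above differ only by the shift ±π, the series converges absolutely (the terms decay like 1/k²), so the identity is easy to spot-check numerically. the sketch below is ours and compares a symmetric partial sum with the closed form for sample parameter values.

```python
import numpy as np

def lhs(alpha, theta, s, K=20000):
    # symmetric partial sum of the winding series
    k = np.arange(-K, K + 1)
    a = theta + 2 * np.pi * k + 1j * s
    return np.sum(np.exp(-2j * np.pi * alpha * k) * (1 / (a - np.pi) - 1 / (a + np.pi)))

def rhs(alpha, theta, s):
    # closed form, valid for 0 < alpha < 1 and |theta| < pi
    return -2 * np.sin(np.pi * alpha) * np.exp(-alpha * (s - 1j * theta)) \
           / (1 + np.exp(-s + 1j * theta))

alpha, theta, s = 0.3, 1.1, 0.4
print(lhs(alpha, theta, s), rhs(alpha, theta, s))   # agree to several digits
```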
acknowledgements

one of the authors (p.š.) wishes to acknowledge gratefully partial support from grant ga13-11058s of the czech science foundation.

references

[1] i. alexandrova, h. tamura. resonance free regions in magnetic scattering by two solenoidal fields at large separation. j. funct. anal. 260:1836-1885, 2011. doi:10.1016/j.jfa.2010.12.005
[2] i. alexandrova, h. tamura. resonances in scattering by two magnetic fields at large separation and a complex scaling method. adv. math. 256:398-448, 2014. doi:10.1016/j.aim.2014.01.022
[3] j. asch, h. over, r. seiler. magnetic bloch analysis and bochner laplacians. j. geom. phys. 13:275-288, 1994. doi:10.1016/0393-0440(94)90035-3
[4] m. f. atiyah. elliptic operators, discrete groups and von neumann algebras. astérisque 32-33:43-72, 1976.
[5] o. giraud, a. thain, j. h. hannay. shrunk loop theorem for the topology probabilities of closed brownian (or feynman) paths on the twice punctured plane. j. phys. a: math. gen. 37:2913-2935, 2004. doi:10.1088/0305-4470/37/8/005
[6] m. j. gruber. bloch theory and quantization of magnetic systems. j. geom. phys. 34:137-154, 2000. doi:10.1016/s0393-0440(99)00059-5
[7] j. h. hannay, a. thain. exact scattering theory for any straight reflectors in two dimensions. j. phys. a: math. gen. 36:4063-4080, 2003. doi:10.1088/0305-4470/36/14/310
[8] l. hörmander. the analysis of linear partial differential operators i. berlin: springer, 2003.
[9] h. t. ito, h. tamura. aharonov-bohm effect in scattering by point-like magnetic fields at large separation. ann. h. poincaré 2:309-359, 2001. doi:10.1007/pl00001036
[10] t. kato. perturbation theory for linear operators. new york: springer-verlag, 1966.
[11] p. kocábová, p. šťovíček. generalized bloch analysis and propagators on riemannian manifolds with a discrete symmetry. j. math. phys. 49:033518, 2008. doi:10.1063/1.2898484
[12] p. košťáková, p. šťovíček. noncommutative bloch analysis of bochner laplacians with nonvanishing gauge fields. j. geom. phys. 61:727-744, 2011. doi:10.1016/j.geomphys.2010.12.004
[13] j. m. lee. introduction to topological manifolds. berlin: springer-verlag, 2000.
[14] s. mashkevich, j. myrheim, s. ouvry. quantum mechanics of a particle with two magnetic impurities. phys. lett. a 330:41-47, 2004. doi:10.1016/j.physleta.2004.07.040
[15] m. melgaard, e. ouhabaz, g. rozenblum. negative discrete spectrum of perturbed multivortex aharonov-bohm hamiltonians. ann. h. poincaré 5:979-1012, 2004. doi:10.1007/s00023-004-0187-3
[16] t. mine. periodic aharonov-bohm solenoids in a constant magnetic field. ann. h. poincaré 6:125-154, 2005. doi:10.1007/s00023-005-0201-4
[17] t. mine, y. nomura. periodic aharonov-bohm solenoids in a constant magnetic field. rev. math. phys. 18:913-934, 2006. doi:10.1142/s0129055x06002826
[18] l. s. schulman. approximate topologies. j. math. phys. 12:304-308, 1971. doi:10.1063/1.1665592
[19] l. s. schulman. techniques and applications of path integration. new york: wiley, 1981.
[20] a. i. shtern. unitary representation of a topological group. in: the online encyclopaedia of mathematics. berlin: springer, 2001. online: http://eom.springer.de/
[21] p. šťovíček. the green function for the two-solenoid aharonov-bohm effect. phys. lett. a 142:5-10, 1989. doi:10.1016/0375-9601(89)90702-0
[22] p. šťovíček. scattering on a finite chain of vortices. duke math. j. 76:303-332, 1994. doi:10.1215/s0012-7094-94-07611-4
[23] t. sunada. fundamental groups and laplacians. in: geometry and analysis on manifolds. lect. notes math. 1339. berlin: springer, 1988, pp. 248-277.
[24] t. sunada. a periodic schrödinger operator in an abelian cover. j. fac. sci. univ. tokyo, sect. 1a math. 37:575-583, 1990.
[25] h. tamura. semiclassical analysis for magnetic scattering by two solenoidal fields: total cross sections. ann. h. poincaré 8:1071-1114, 2007. doi:10.1007/s00023-007-0329-5
[26] e. thoma. über unitäre darstellungen abzählbarer, diskreter gruppen. math. annalen 153:111-138, 1964.
[27] t. t. wu, c. n. yang. concept of nonintegrable phase factors and global formulation of gauge fields. phys. rev. d 12:3845-3857, 1975. doi:10.1103/physrevd.12.3845

acta polytechnica 54(6):426–429, 2014, doi:10.14311/ap.2014.54.0426, © czech technical university in prague, 2014, available online at http://ojs.cvut.cz/ojs/index.php/ap

infrared photoluminescence spectra of pbs nanoparticles prepared by the langmuir-blodgett and laser ablation methods

zdenek remes a,b,∗, tomas novak b, jiri stuchlik a, the-ha stuchlikova a, vladislav dřínek c, radek fajgar c, konstantin zhuravlev d

a institute of physics of the ascr, v.v.i., praha 6, czech rep.
b czech technical university in prague, faculty of biomedical engineering, kladno, czech rep.
c institute of chemical process fundamentals of the ascr, v.v.i., praha 6, czech rep.
d the institute of semiconductor physics, siberian branch of the russian academy of sciences, novosibirsk, russia
∗ corresponding author: remes@fzu.cz

abstract. we have optimized the optical setup originally designed for photoluminescence measurements in the spectral range 400–1100 nm. the new design extends the spectral range into the near infrared region 900–1700 nm and enables colloidal solution measurements in cuvettes as well as measurements of nanoparticles deposited in the form of thin films on glass substrates. the infrared photoluminescence spectra of pbs nanoparticles prepared by the langmuir-blodgett technique show a higher photoluminescence intensity and a shift to shorter wavelengths than the infrared photoluminescence spectra of pbs nanoparticles prepared by laser ablation from a pbs target. we have also proved that pbs nanoparticles prepared in the form of thin layers have high stability.

keywords: infrared photoluminescence, pbs, langmuir-blodgett, laser ablation.

1. introduction

in the visible spectral region, functionally specific targeting and imaging have been demonstrated using semiconductor nanoparticles [1]. using visible fluorescent tags, however, deep organs such as the liver and the spleen could not be detected because of the limited penetration depth of visible light. deep-tissue imaging requires the use of infrared light within a spectral window separated from the major absorption peaks of hemoglobin and water. infrared optical imaging of living tissue is therefore an area of growing interest, for example to provide improved tumor sensitivity.
high-transparency spectral bands in the near infrared region enable depths of detection of 5–10 cm, a capability that provides surgeons with direct infrared visual guidance throughout a sentinel-lymph-node mapping procedure, minimizing incision and dissection inaccuracies and permitting real-time confirmation of complete resection [2]. infrared light-emitting diodes based on nanoparticles have the potential to offer low manufacturing cost, compatibility with a range of substrates, and quantum-size-effect tunability. they can be integrated on cmos-processed silicon electronics in chip-to-chip and board-to-board optical interconnections and in fiber-optic and optical wireless communications. large-area infrared emitters for biomedical imaging would enable optical diagnosis in near infrared biological transparency windows [3]. photoluminescence (pl) spectroscopy is a powerful technique for investigating semiconducting and semi-insulating nanoparticles [4]. we apply infrared pl to study the relative pl quantum efficiency and the thermal stability of pbs nanoparticles prepared in the form of thin layers by the langmuir-blodgett and laser ablation methods on glass substrates.

2. experimental part

samples of thin-layered amorphous hydrogenated silicon with embedded lead sulphide (pbs) nanoparticles were prepared in a combined radio frequency (13.56 mhz) plasma-enhanced chemical vapor deposition (cvd) and reactive laser ablation reactor. the reactor was designed by ing. j. stuchlik from the institute of physics of the academy of sciences of the czech republic (ascr), v.v.i., prague, and it is being developed at the institute of chemical process fundamentals of the ascr, v.v.i. in prague for in-situ growth, deposition and embedding of nanoparticles in a-si:h based pin and led structures. the 2-litre glass capacitively coupled plasma (ccp) reactor is equipped with:
• two electrodes for deposition of silicon layers (glass and copper substrates were fixed to the grounded electrode);
• a quartz window entrance for a focused laser beam;
• a pbs target for ablation.
the reactor can be evacuated down to 10−3 pa. the pbs nanoparticles are deposited by ablation of a solid pbs target (diameter 9 mm) in a vacuum (arf excimer laser operating at 193 nm; energy 60 mj/pulse; 600 pulses).

figure 1. photoluminescence setup: laser: excitation light source; l1–5: fused silica lenses 1″ in diameter; ch: chopper; a1–2: apertures; f1–3: optical filters (f1: grey filters, f2: band-pass filters, f3: long-pass filter); d1–2: detectors connected to lock-in amplifiers referenced to the chopper frequency; m1–2: mirrors (m2 is removable); b: beamsplitter; mch: horiba h20ir monochromator; s: sample.
monolayer transfer was carried out by y-type at a surface pressure of 30 mn/m and a temperature of 22 °c. using this method, samples containing up to hundreds of monolayers can be prepared with a typical nanoparticle diameter of 3–10 nm (k. zhuravlev, private communication). sulphurizing the films using hydrogen sulfide gas was carried out for a period of 1–1.5 hours at a pressure of 7–13 kpa and a temperature of 22 °c. as a result of the interaction between lead behenate and hydrogen sulfide, pbs ncs distributed in the lb matrix were formed. the final step was to remove the lb film matrix by thermally-induced desorption of the behenic acid in ammonia and argon gas under atmospheric pressure at temperatures of 100 and 200 °c. the steady-state photoluminescence spectra were measured at the institute of physics of the ascr, v.v.i. in prague by the setup depicted in fig. 1. the essential components in this setup are a green laser, which is used as the excitation energy source, the objective l5 collecting the photoluminescence signal with a relative aperture of f /1, the monochromator (mch2 ) and the ingaas photodiode (d2 ) to detect the photoluminescence spectra. 0 2 4 6 8 10 12 14 500 700 900 1100 1300 1500 1700 p l ( a u .u ) wavelength (nm) si ingaas figure 2. the spectrally corrected and normalised photoluminescence spectrum of the corning glass #7059 substrate measured by si (550–1050 nm) and ingaas (900–1650 nm) photodiodes. 3. results and discussion unlike fused silica glass, other types of glass show significant photoluminescence. the spectrally corrected and normalized photoluminescence spectrum of a clean corning glass #7059 substrate is shown in fig. 2. the higher noise in the ir spectra is due to the lower detectivity of the ingaas photodiode. the ir spectrum is dominated by the glass-related pl band at wavelengths 850–950 nm and the pl band centered at 1064 nm. however, no measurable photoluminescence was detected on fused silica glass. 427 z. remes, t. novak, j. stuchlik et al. acta polytechnica 0 20 40 60 80 800 1000 1200 1400 1600 n o rm a li se d p h o to lu m in e sc e n c e (a .u .) wavelength (nm) 110c 100c 25c 140c 160c 130c 120c 150c 170c 180c 190c 200c figure 3. the room temperature spectrally corrected and normalized pl spectra of a 50 nm layer of pbs nanoparticles deposited on a corning glass #7059 substrate and annealed for 5 min in air at various temperatures. the spectra were normalised on the peak intensity at 1064 nm. the normalized spectra in fig. 3 clearly show the trends of pbs nanoparticle-related pl upon annealing in air at various temperatures. the glass related pl bands at 850–950 nm and at 1064 nm are assumed to be constant. therefore, the variations in their intensity in the spectrally corrected spectra are random changes related to handling the sample. we therefore normalized the corrected spectra on the peak intensity at 1064 nm, see fig. 3. first of all, the pl spectra show the maximum at 1530 nm and prove that pbs nanoparticles are stable at room temperatures with no observable changes in the as-grown samples in a period of several months between the first measurements and the late measurements. we have observed that pl can be enhanced by moderate annealing at temperatures of about 110 °c. we speculate that this pl enhancement may be due to drying nanoparticles, but this hypothesis must be confirmed by further studies. however, after annealing at higher temperatures the pl intensity gradually deteriorates with increasing temperature. in fig. 
in fig. 4 we compare a thin layer of pbs nanoparticles prepared by the langmuir-blodgett technique and by laser ablation from the pbs target, with both samples annealed at 110 °c to promote the infrared photoluminescence. the comparison clearly shows a higher intensity of the infrared photoluminescence of the pbs nanoparticles prepared by the langmuir-blodgett technique, and also a shift of the photoluminescence maximum to shorter wavelengths. this indicates the smaller size of the pbs nanoparticles deposited by the langmuir-blodgett technique. it should be noted that the langmuir-blodgett technique is a well optimized process at the institute of semiconductor physics in novosibirsk, russia, whereas pbs deposition by the laser ablation technique at the institute of chemical process fundamentals of the ascr, v.v.i. in prague has only recently been started. more optimization is therefore needed to enhance the quantum efficiency of pbs nanoparticles deposited by the laser ablation technique. one way to achieve pbs nanoparticles of smaller size and with higher quantum efficiency is to apply reactive laser ablation of a metallic pb target in an h2s atmosphere [5].

figure 4. a room temperature comparison of the photoluminescence spectra (logarithmic intensity scale) of a layer of pbs nanoparticles about 50 nm in thickness prepared by the laser ablation method (la, on corning glass 7059) and by the langmuir-blodgett method (lb, on fused silica glass).

4. conclusion

we have investigated the thermal stability of pbs nanoparticles deposited as a thin film on glass substrates by laser ablation from a pbs target. we have discussed in detail the calibration procedures used to correct the spectra on the spectral efficiency of the setup. we have shown the importance of the background signal for normalizing the pl spectra and have proved that the pbs nanoparticles have high stability. the infrared photoluminescence spectra of the pbs nanoparticles prepared by the langmuir-blodgett technique clearly show a higher photoluminescence intensity and a shift to shorter wavelengths than the infrared photoluminescence spectra of pbs nanoparticles prepared by laser ablation from the pbs target. more optimization is therefore needed to reduce the size of the pbs nanoparticles deposited by the laser ablation technique.

acknowledgements

the authors gratefully acknowledge meys kontakt ii grant no. lh12236, project ld14011 (hint cost action mp1202) and czech science foundation project ga14-05053s. we thank martin müller for the profilometer measurements.

references

[1] havlik, j., petrakova, v., rehor, i., petrak, v., gulka, m., stursa, j., kucka, j., ralis, j., rendler, t., lee, s.-y., reuter, r., wrachtrup, j., ledvina, m., nesladek, m., cigler, p.: boosting nanodiamond fluorescence: towards development of brighter probes. nanoscale 5(8), 2013, p. 3208. doi: 10.1039/c2nr32778c
[2] sargent, e. h.: infrared quantum dots. adv. mater. 17(5), 2005, p. 515–522. doi: 10.1002/adma.200401552
[3] konstantatos, g., huang, c., levina, l., lu, z., sargent, e. h.: efficient infrared electroluminescent devices using solution-processed colloidal quantum dots. adv. funct. mater. 15(11), 2005, p. 1865–1869. doi: 10.1002/adfm.200500379
[4] pelant, i.: luminescence spectroscopy of semiconductors. new york: oxford university press, 2012.
[5] xi, m., xie, h., zheng, j., wu, z., li, j., zhao, g.: laser reactive ablation deposition of pbs film. in: proc. spie 3122, infrared spaceborne remote sensing v, 1997, p. 420–425. doi: 10.1117/12.292703

acta polytechnica 56(6):478–491, 2016, doi:10.14311/ap.2016.56.0478, © czech technical university in prague, 2016, available online at http://ojs.cvut.cz/ojs/index.php/ap

testing the accuracy of measured values in continuous long-term geodetic monitoring

jan vaněček∗, martin štroner

department of special geodesy, faculty of civil engineering, czech technical university in prague, thakurova 7, prague 6, czech republic
∗ corresponding author: jan.vanecek.2@fsv.cvut.cz

abstract. geodetic measurement of shifts and deformations by a total station is a well-known and widely used method. in this paper, an analysis of the accuracy of the measured values in continuous geodetic monitoring, and of its changes over time, is presented. for the analysis, a set of data measured in the period between january 2006 and july 2010 was used. the main method of the analysis is a linear-harmonic function approximation.

keywords: measurement of displacements and deformations, mean squared error adjustments, linear and harmonic approximation, total station, slope stability.

1. introduction

geodetic monitoring, or the measurement of shifts and deformations, is the branch of geodesy dedicated to measuring objects periodically over time, e.g. dams, bridges, buildings, landslides, etc. the principle of this monitoring is measuring observed points on the object and comparing their positions with previous results regularly in time. the measurement of shifts is also specified in standards; e.g., in the czech republic it is csn 730405. this subject is widely described in [1]. the measurement of shifts covers many areas of surveying, e.g. sub-millimetre level measurement of structures (e.g. [2]), bridge monitoring (e.g. [3]), tunnel monitoring (e.g. [4]), historical building monitoring (e.g. [5]), landslide monitoring (e.g. [6]) or vertical ground deformation monitoring (e.g. [7]). geodetic monitoring can also be used in rescue operations [8]. geodetic monitoring is usually realized by terrestrial methods of measuring (e.g. [9]), by methods of global navigation satellite systems (gnss, e.g. [10, 11]) or by complex monitoring systems consisting of geodetic [12] and geotechnical instruments [13]. special applications of monitoring can require special methods of determining the shifts, e.g. terrestrial laser scanning [14], photogrammetry [15] or radar interferometry, which is suitable especially for measuring dynamic movements [16]. a special case is continuous monitoring, where the measuring machine records the measured values 24 hours a day, 7 days a week and 365 days a year. this procedure is used for the monitoring of specific structures or other objects that create a large risk of damage to lives and property. an example might be the monitoring of slopes which are vulnerable to a landslide (e.g. [17, 18]). in the processing and evaluation of the results obtained, it is necessary to know the accuracy with which each of the variables is measured.
on the one hand, there is the accuracy of the instrument stated by the manufacturer, which was most likely determined in a laboratory or on a calibration baseline in good weather conditions; on the other hand, there is the actual accuracy that the instrument is able to achieve in a particular situation, during long-term monitoring in the specific conditions of the site. the majority of scientific works deal with determining the accuracy of the measurements according to the standard iso 17123 before the measurement starts (described e.g. in [19] or [20]). this procedure is suitable for short-term measurements, when local conditions do not change much, but it is not suitable for long-term monitoring, when the conditions of the site change in time. this article deals with identifying and evaluating the accuracy of the measurements of an automatic total station during the monitoring of the stability of slopes in a brown coal mine.

2. description of the monitoring system

the data that were used for the analysis of the accuracy of the measurements come from the monitoring system of the čsa surface mine. the mine is situated in north bohemia under the ore mountains (wgs84: 50.5380206n, 13.5254592e). the measured data were obtained from severní energetická a.s., which owns the čsa mine. unfortunately, the data cannot be published without the agreement of severní energetická a.s. the monitoring system consists of several parts: the main element, which is the automatic total station leica tcr 2003a, the observed and reference points, and a management center. the total station is located in the middle of the mine in a protective shelter (see fig. 1). the observed points are located on the slopes of the ore mountains and the reference points are located on objects on the opposite side of the mine.

figure 1. total station in protective shelter.

the protective shelter for the total station is made from metal, with windows made of ordinary glass, and is equipped with heating and air conditioning (to keep the internal temperature in the range of the operating temperatures of the total station). the total station was purchased especially for this purpose in 2005. the manufacturer declares the standard deviations of the horizontal direction and the zenith angle as 0.15 mgon (measured in both faces), and the standard deviation of the slope distance as 1 mm + 1 ppm · d. the total station measures all observed and reference points every hour, 365 days a year, with the exception of service outages and periods when visibility is not sufficient. the outside temperature and the atmospheric pressure are stored in addition to the standard values measured by the total station (the horizontal direction, the zenith angle and the slope distance). shifts are calculated for each round of measurements in the management center. the relevant employees of the company are automatically informed by a text message in the case of an imminent danger. a detailed description of the monitoring system is given in [21]. the measured values from the period from the beginning of 2006 to july 2010 were used for the analysis of the accuracy of the horizontal direction, the zenith angle and the slope distance.

3. mathematical basis of the analysis of the accuracy of the measurements

at first, an estimate of the accuracy of the measurements can be calculated from the measured values.
this estimate of the accuracy is based on the calculation of the accuracy from the differences between consecutive measurements according to (1), which can be regarded as pairs of measurements. the estimate of the standard deviation of the measurements is then calculated according to (2) for a specific interval (1 day, 7 days, 30 days, 90 days, etc.). the derivation of the calculation of the standard deviation for pairs of measurements is shown e.g. in [22]. the differences of the pairs of measurements are calculated according to
$$\Delta_i = l_{i+1} - l_i, \qquad \Delta_i \le \Delta_M, \tag{1}$$
where li is the measurement at the time i, and the calculated difference enters the next step of the calculation only on condition that it is less than the specified limit ∆m. the estimate of the standard deviation is then
$$s_l = \sqrt{\frac{\sum_{j=1}^{n}\Delta_j^2}{2n}}, \tag{2}$$
where n is the number of the differences of measurements in the given time interval (e.g. one day = 24 hours). the calculated standard deviation is assigned a time index that equals the center of the time interval, which is more important for longer time intervals such as 30 or 90 days. the basis of the analysis of the accuracy of the measurements are the standard deviations calculated as above for each day of the whole reference period (i.e., january 2006–july 2010). the analysis was made by approximating the computed values of the standard deviations by a specific curve (the l-h function), which contains a linear and a harmonic part, and whose equation is the following:
$$y = a + bx + c\sin\Big(\frac{2\pi x}{T} + d\Big), \tag{3}$$
where a is the absolute coefficient, b is the linear coefficient (the slope of the regression), c is the amplitude of the harmonic part, d is the phase shift and t is the period. this function was chosen because the linear trend indicates a worsening of the accuracy of the measurements in time, while the harmonic part describes the periodic changes of the accuracy during the year, depending on the temperature and the variation of other weather conditions. the coefficients of the l-h function were obtained by an approximation using the method of least squares. the calculation was very sensitive to the choice of the approximate values of the unknowns, due to the large number of standard deviations and their similar size. the same weight was assigned to all standard deviations in the calculation. the linear coefficient of the l-h function was further examined by tests of statistical hypotheses, to decide whether the data comprise a linear trend, namely whether the calculated coefficient b corresponds to the expected value θ. in this case, the premise is that the parameter b has the value θ = 0, which means that the standard deviation has no trend. the null hypothesis is of the form
$$H_0: b = \theta \;\Rightarrow\; b = 0. \tag{4}$$

figure 2. standard deviation of the horizontal direction for all tested reference points.
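the procedure in (1)–(3) is straightforward to reproduce. the following python sketch is our illustration only, with synthetic hourly readings standing in for the non-public total-station data, and with scipy's least-squares fitter in place of whatever adjustment software the authors used.

```python
import numpy as np
from scipy.optimize import curve_fit

# illustrative sketch of equations (1)-(3); synthetic hourly readings [gon]
rng = np.random.default_rng(0)
day = np.repeat(np.arange(365), 24)                     # day index of each hourly reading
noise_std = 0.6e-3 * (1.0 + 0.3 * np.sin(2 * np.pi * day / 365))
l = rng.normal(0.0, noise_std)                          # seasonally varying noise level

delta_m = 0.01                                          # limit difference, eq. (1) [gon]
d = np.diff(l)
keep = np.abs(d) <= delta_m
day_d = day[:-1]

t_days, s_l = [], []
for k in np.unique(day_d):                              # daily estimate, eq. (2)
    dj = d[keep & (day_d == k)]
    if dj.size:
        t_days.append(k + 0.5)                          # time index = center of interval
        s_l.append(np.sqrt(np.sum(dj ** 2) / (2 * dj.size)))
t_days, s_l = np.array(t_days), np.array(s_l)

def lh(x, a, b, c, d_phase, T):                         # l-h function, eq. (3)
    return a + b * x + c * np.sin(2 * np.pi * x / T + d_phase)

p0 = [s_l.mean(), 0.0, 0.2 * s_l.mean(), 0.0, 365.0]    # approximate unknowns matter here
popt, pcov = curve_fit(lh, t_days, s_l, p0=p0)
print('b =', popt[1], '+-', np.sqrt(pcov[1, 1]))        # slope and its standard deviation
```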
the second step of the analysis was a comparison of the l-h curve of the standard deviation of the measurements with the l-h curve of the temperature of the air, which was determined by the same calculation as the l-h curve of the standard deviation of the measurements from the equation (3) of the daily average air temperature values. the premise of this comparison is that the accuracy of the measurements depends on the temperature changes during the year. the third step in the analysis of the accuracy is the calculation of the accuracy for a longer time intervals and their comparison. these time intervals were chosen in the lengths of 7 days, 30 days and 90 days. the analysis was carried out for the reference points 91, 92, 93 and 95. reference point no. 94 was not tested because it was measured for a short period of time in comparison with the other points. 4. results of the analysis of the accuracy of the horizontal directions analysis of the accuracy of the horizontal directions was carried out according to the process described in the previous chapter. in the calculation of the differences of pairs of measurements according to (1), the size of the limit difference ∆m was chosen at the value of 0.01 gon. the choice was not based on the accuracy of the horizontal direction specified by the manufacturer of the device (standard deviation to calculate the limit difference), because the accuracy of the manufacturer may not be retained in the specific conditions of the mine. therefore, the size of the marginal difference was determined experimentally from the graphs of measured values of the horizontal direction. in fig. 2, the calculated daily standard deviation for all tested reference points is shawn. it is seen that the progress of the size accuracy is approximately the same for all points, only in rare cases, there are outliers. at first, approximation of daily standard deviations of the horizontal direction was made by using the l-h function as an approximation curve. the computed values of the important coefficients of approximation curves for individual reference points are listed in table 1. there is not an coefficient a that is an absolute part of the l-h function (approximate mean value), and a coefficient d, which is a phase shift of the periodic part of the function. standard deviations and the approximation curve of measurements for the 480 vol. 56 no. 6/2016 testing the accuracy of geodetic monitoring coefficient reference point 91 92 93 95 b [ mgonyear ] 0.01 −0.01 0.002 0.01 σb 3 · 10−6 3 · 10−6 3 · 10−6 3 · 10−6 c [mgon] 0.12 0.05 0.13 0.12 σc 2 · 10−6 2 · 10−6 2 · 10−6 4 · 10−6 t [day] 366.1 380.1 362.1 368.8 σt 0.001 0.008 0.001 0.002 table 1. important coefficients of approximation curves of horizontal direction standard deviations. figure 3. approximation curve of the standard deviations of horizontal direction – reference point 91. reference point 91 are shown in fig. 3. it can be clearly seen from the chart that the values of the standard deviation lie between the values 0.2 and 1.0 mgon, and that the approximation curve has a small amplitude. the mean value lies between the values 0.5 and 0.6 mgon. the value of the linear trend obtained from table 1 is 0.01 mgon per year, which is a value that can be neglected regardless of the fact that the results of the tests of statistical hypotheses confirm the linear trend. 
the next step of the analysis is a comparison of the approximation curve of the temperature and the approximation curve of the standard deviations of the horizontal direction. the comparison was carried out graphically, and the result is in fig. 4. an apparent dependence of the standard deviation of the horizontal direction on the average daily air temperature can be seen from this chart: when the average daily temperature increases, the value of the standard deviation of the horizontal direction increases too. the length of the period is one year, which is apparent in the chart and is listed in table 1.

the last chart, in fig. 5, shows a comparison of the standard deviations of the horizontal direction calculated for the longer time intervals. we can see that even in the long time intervals there are periodic changes. there is also a noticeable slight increase of the size of the standard deviation at the end of the reference period.

a part of the analysis was the calculation of the average standard deviations of the horizontal direction for all the reference points. the average standard deviations were calculated as the root mean square of the daily standard deviations of the horizontal directions. the average standard deviations are listed in table 3, together with the smallest and the largest daily standard deviations and the numbers of the daily standard deviations.

figure 4. comparison of the approximation curves of the temperature and the standard deviation of the horizontal direction – reference point 91.
figure 5. standard deviations of the horizontal direction in different time intervals – reference point 91.

table 2. results of the tests of the statistical hypothesis (horizontal direction):

reference point | 91          | 92         | 93       | 95
f               | 10 369 984  | 9 567 181  | 590 403  | 28 835 683
fα              | 4           | 4          | 4        | 4
t               | 3220        | −3093      | 768      | 5370
tα/2            | 2           | 2          | 2        | 2
n′              | 1574        | 1588       | 1605     | 1132

table 3. average daily standard deviations of the horizontal direction, min./max. values:

reference point | 91    | 92    | 93    | 95
σ̄ϕ [mgon]      | 0.66  | 0.75  | 0.63  | 0.61
min σϕ [mgon]   | 0.02  | 0.09  | 0     | 0.12
max σϕ [mgon]   | 2.11  | 2.87  | 2.17  | 1.9
n               | 1579  | 1593  | 1610  | 1137

table 4. important coefficients of the approximation curves of the zenith angle standard deviations:

reference point | 91      | 92      | 93      | 95
b [mgon/year]   | −0.01   | 0.002   | 0.0005  | 0.002
σb              | 2·10⁻⁵  | 4·10⁻⁷  | 5·10⁻⁷  | 5·10⁻⁷
c [mgon]        | 0.03    | 0.04    | 0.01    | 0.03
σc              | 4·10⁻⁵  | 7·10⁻⁷  | 4·10⁻⁶  | 2·10⁻⁷
t [day]         | 390.6   | 391.8   | 424.4   | 408.2
σt              | 0.04    | 0.0001  | 0.10    | 0.003

5. analysis of the accuracy of the zenith angle

the analysis of the accuracy of the zenith angle is based on the same approach as the previous analysis of the accuracy of the horizontal direction. also in this case, the limit difference $\Delta_m$ was chosen as the value of 0.01 gon for the calculation of the differences of pairs of measurements (1), for the same reasons as described in the section dedicated to the accuracy of the horizontal direction. fig. 6 shows the calculated daily standard deviations for all tested reference points, from which it can be seen that the progress of the accuracy is similar for all points; only for reference point 91 are the values of the standard deviations lower than the average of the other points. in comparison with the horizontal direction, there are also more outlying values, but this can be expected due to the considerably greater influence of refraction in the vertical direction.
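the averaging step behind tables 3 and 6 is a root-mean-square of the daily standard deviations; a minimal sketch, with made-up numbers, follows.

```python
# the average standard deviation as the root mean square of the daily
# values, as used for tables 3 and 6; the numbers below are illustrative.
import numpy as np
daily_sigma = np.array([0.55, 0.61, 0.70, 0.48, 0.66])   # mgon, illustrative
avg_sigma = np.sqrt(np.mean(daily_sigma**2))             # rms of daily sigmas
print(round(avg_sigma, 2), daily_sigma.min(), daily_sigma.max(), daily_sigma.size)
```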
firstly, the approximation of the calculated standard deviations for each day was done using the l-h function. the values of the important coefficients of the curves for all tested reference points are listed in table 4. the chart of the values of the standard deviations and the approximation curve for the reference point 91 is shown in fig. 7. fig. 7 shows that a large part of the values of the standard deviation lies in the range from 0.2 to 0.8 mgon. the approximation curve is very flat and the mean value is approximately 0.5 mgon. the value of the linear trend given in table 4 is −0.01 mgon per year (the highest absolute value of all the reference points). in this situation, the value of the linear trend is so small that it can be neglected, despite the fact that the linear trend can be considered as proven, based on the results of the tests of statistical hypotheses listed in table 5. furthermore, it is obvious from the approximation curve in fig. 7 that the period of the harmonic part and the max/min of the curve do not correspond to the seasons; it is obvious from table 4 that the value of the period is around 400 days.

the next step of the analysis was a comparison of the calculated approximation curves of the standard deviations of the zenith angle with the approximation curve of the air temperature. the comparison was carried out graphically; the result is shown in fig. 8. we can conclude the same as from fig. 7: the data may contain a certain period, but its dependence on the temperature cannot be proved. in the case of the zenith angle, the size of the standard deviation is probably mainly affected by influences other than the air temperature (direct sunlight, wind, precipitation, etc.).

the chart in fig. 9 clearly shows that even in the longer time intervals there are noticeable periodic changes of the size of the standard deviation; however, for the interval of 90 days, the curve is almost flat.

similarly as for the horizontal direction, the average standard deviation of the zenith angle was calculated for all reference points, using the root mean square of the daily standard deviations. the average standard deviations are listed in table 6, which also contains the smallest and the largest daily standard deviations and the numbers of daily standard deviations used for the analysis.

figure 6. calculated daily standard deviations of the zenith angle for all reference points.
figure 7. values of the standard deviation of the zenith angle and the approximation curve – reference point 91.

table 5. results of the tests of the statistical hypothesis (zenith angle):

reference point | 91      | 92          | 93       | 95
f               | 67 349  | 22 528 972  | 794 872  | 11 338 453
fα              | 4       | 4           | 4        | 4
t               | −260    | 4746        | 892      | 3367
tα/2            | 2       | 2           | 2        | 2
n′              | 1574    | 1588        | 1605     | 1132

figure 8. comparison of the approximation curves of the temperature and the standard deviation of the zenith angle – reference point 91.
figure 9. standard deviations of the zenith angle in different time intervals – reference point 91.

table 6. average daily standard deviations of the zenith angle and min./max. values:

reference point | 91    | 92    | 93    | 95
σ̄z [mgon]      | 0.49  | 0.6   | 0.63  | 0.59
min σz [mgon]   | 0.01  | 0.11  | 0.03  | 0.2
max σz [mgon]   | 2.13  | 2.22  | 2.33  | 2.77
n               | 1579  | 1593  | 1610  | 1137
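the flatness of the zenith-angle curve can be seen by evaluating (3) with the table 4 coefficients for point 91. note that the mean level a and the phase d are not listed in the paper, so the values below are assumptions made only for illustration.

```python
# evaluating the l-h curve (3) with the table 4 coefficients for point 91;
# a and d are not published, so the mean level and phase are assumed here.
import numpy as np
a, b, c, T, d = 0.5, -0.01 / 365.0, 0.03, 390.6, 0.0   # b converted to mgon/day
x = np.arange(0, 2 * int(T))                            # two periods, in days
y = a + b * x + c * np.sin(2 * np.pi * x / T + d)
print(y.min(), y.max())   # spread of only about 2c around the mean: a flat curve
```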
table 7. important coefficients of the approximation curves of the standard deviations of the slope distance:

reference point | 91      | 92      | 93      | 95
b [mm/year]     | 0.06    | 0.001   | 0.03    | 0.02
σb              | 1·10⁻⁷  | 5·10⁻⁷  | 1·10⁻⁶  | 2·10⁻⁷
c [mm]          | 0.19    | 0.36    | 0.18    | 0.19
σc              | 2·10⁻⁶  | 1·10⁻⁶  | 2·10⁻⁵  | 7·10⁻⁷
t [day]         | 367.1   | 365.6   | 367.6   | 373.1
σt              | 0.0001  | 0.0001  | 0.001   | 0.0002

6. analysis of the accuracy of the slope distance

the last quantity directly measured by the total station is the slope distance. the analysis of the accuracy was carried out using the same process as the previous analyses of the accuracy of the horizontal direction and the zenith angle. in this case, the limit difference was selected as the value $\Delta_m = 0.007$ m for the calculation of the differences of pairs of measurements (1). the choice of the limit difference was based on the same assumptions as in the case of the horizontal direction and the zenith angle. the standard deviation of the slope distance, according to the manufacturer, is $\sigma_s = 1\,\mathrm{mm} + 1\,\mathrm{ppm}\cdot d$ ($d$ is the measured distance); the standard deviation of the slope distance to the reference point 91 is then 2.3 mm. the limit difference can be determined as $\Delta_m = \sigma_s \cdot \sqrt{2} \cdot u_p$ ($u_p = 2$ for a probability of 95 %; $u_p$ is the coefficient of the standard normal distribution). the calculated value of the limit difference is then 6.5 mm, which was rounded up to 7 mm and used in (1), because in the experimental testing of the measurements there was only a minimum of occurrences of differences larger than this limit.

the calculated daily standard deviations of the slope distance for all tested reference points are shown in fig. 10. it can be seen that the progress of the accuracy is similar for all points; only for point 92 is the standard deviation slightly higher than for the other points. this is probably caused by the stabilisation of reference point 92, which is a high lattice tower.

the first step of the analysis was, again, an approximation of the calculated daily standard deviations by the l-h function. the calculated values of the important coefficients of the approximation curves are listed in table 7. the chart of the values of the standard deviations and the approximation curve for the reference point 91 is shown in fig. 11. from the graph in fig. 11, it can be seen that the values of the standard deviation are almost all located in an interval with an upper limit of 2.5 mm, which is slightly more than the standard deviation declared by the manufacturer. in this case the approximation curve is not flat; there is an obvious linear trend. the value of the linear trend for the reference point 91 is, as can be seen in table 7, 0.06 mm a year (the highest of the tested reference points). the linear trend can be considered as proven, based on the results of the tests of the statistical hypotheses listed in table 8, but the value of the linear trend is very small, and it is possible to neglect this trend for the given purpose. it is important to note that a positive linear trend reflects a deteriorating accuracy of the slope distance. there is a visible decrease of the scatter of the plotted daily values (the values of the standard deviation of the slope distance) over time in fig. 11. this phenomenon is observable mostly for the slope distance; for the horizontal direction it is less noticeable, and for the zenith angle it is practically not observable.
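the derivation of the limit difference above is easy to verify numerically. in the sketch below, the distance to reference point 91 is an assumed value chosen to reproduce the quoted 2.3 mm.

```python
# checking the limit difference for the slope distance, as derived in the
# text: sigma_s = 1 mm + 1 ppm * d, and delta_m = sigma_s * sqrt(2) * u_p.
import math
d = 1300.0                                   # assumed distance [m] giving 2.3 mm
sigma_s = 1.0 + 1e-6 * d * 1000.0            # [mm]; 1 ppm of d expressed in mm
u_p = 2.0                                    # coefficient for ~95 % probability
delta_m = sigma_s * math.sqrt(2.0) * u_p     # [mm]
print(sigma_s, round(delta_m, 1))            # 2.3 mm -> ~6.5 mm, rounded to 7 mm
```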
the next step of the analysis was a comparison of the calculated approximation curves of the standard deviations of the slope distance with the approximation curve of the air temperature. the comparison was carried out graphically; the result is shown in fig. 12. this figure expresses the same conclusion that is obvious from fig. 11: the size of the standard deviation of the slope distance is dependent on the air temperature. the standard deviation of the slope distance increases in the warm months, while reaching minimum values in the winter. the values of the period lead to the same conclusion as the graph: the length of the period is about 367 days (listed in table 7).

the last graph, in fig. 13, expresses a comparison of the standard deviations of the slope distance calculated for different time intervals. we can see in fig. 13 that noticeable periodic changes of the value of the standard deviation remain even in the longer time intervals, and that there is a slight increase of the value of the standard deviation since the beginning of 2009.

similarly as in the sections on the horizontal direction and the zenith angle, the average standard deviation of the slope distance was calculated for all the reference points as the root mean square of the daily standard deviations. the averages of the standard deviations are listed in table 9, which also contains the smallest and the largest daily standard deviations of the slope distance and the numbers of daily standard deviations used for the analysis for the specific reference points.

figure 10. calculated daily standard deviations of the slope distance for all reference points.
figure 11. approximation curve of the standard deviations of the slope distance for reference point 91.

table 8. results of the tests of the statistical hypothesis (slope distance):

reference point | 91               | 92         | 93           | 95
f               | 173 610 514 013  | 2 970 776  | 883 731 125  | 10 549 655 517
fα              | 4                | 4          | 4            | 4
t               | 416 666          | 1724       | 29 728       | 102 712
tα/2            | 2                | 2          | 2            | 2
n′              | 1574             | 1588       | 1605         | 1132

figure 12. comparison of the approximation curves of the temperature and the standard deviations of the slope distance – reference point 91.
figure 13. standard deviations of the slope distance in different time intervals – reference point 91.

table 9. average standard deviations of the slope distance and min./max. values:

reference point | 91    | 92    | 93    | 95
σ̄s [mm]        | 1.06  | 1.37  | 0.93  | 0.84
min σs [mm]     | 0     | 0.09  | 0     | 0.09
max σs [mm]     | 3.49  | 4.46  | 5.27  | 3.67
n               | 1579  | 1593  | 1610  | 1137

it is clear from the values listed in table 9 that the average standard deviations correspond to the size of the fixed part of the standard deviation of the slope distance specified by the manufacturer.
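the longer-interval estimates compared in figs. 5, 9 and 13 follow from the same differences, only grouped into 7-, 30- or 90-day windows before applying (2). a minimal sketch of that grouping, with synthetic data, follows; the function name and the series are assumptions.

```python
# a sketch of the interval estimates behind figs. 5/9/13: the same pair
# differences, grouped into windows of 7, 30 or 90 days before applying (2).
import numpy as np

def interval_std(diffs, days, window):
    """rms-based std (2) from differences grouped into `window`-day bins."""
    out = []
    for start in range(int(days.min()), int(days.max()) + 1, window):
        d = diffs[(days >= start) & (days < start + window)]
        if d.size:
            out.append((start + window / 2.0, np.sqrt(np.sum(d**2) / (2 * d.size))))
    return out

rng = np.random.default_rng(0)
days = np.repeat(np.arange(180), 24).astype(float)   # 24 differences per day
diffs = 0.6e-3 * rng.standard_normal(days.size)      # illustrative, in gon
for w in (7, 30, 90):
    print(w, interval_std(diffs, days, w)[:2])
```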
7. discussion of the findings

the analysis of the accuracy of the measurements has brought very interesting results. the results of the analysis of the individual variables are often similar; this mainly applies to the horizontal direction and the zenith angle. the most important results are the values of the linear trends, the absolute values of the standard deviations, and the dependence of the size of the standard deviation on the air temperature.

in the analysis of the accuracy of the horizontal direction, the average standard deviation of the horizontal direction was determined with the value of 0.66 mgon. this value is higher than the value declared by the manufacturer, but the accuracy is sufficient for the purpose of this monitoring. the calculated standard deviation includes, in addition to the accuracy of the measurement and the accuracy of the automatic target recognition, the effects of the surrounding environment (for the consecutive measurements), e.g. the influence of the atmosphere and the changes in the optical parameters of the glazing of the shelter. however, even more important is the fact that the calculated values of the daily standard deviation show a very small linear trend. this is essential for the conclusion that the accuracy of the horizontal direction is constant and almost does not change over several years. the values of the standard deviation of the horizontal direction change periodically, with an annual period and an average amplitude of 0.1 mgon. these periodic changes are most likely caused by meteorological phenomena, specifically the air temperature, as is apparent from the comparison of the approximation curves of the temperature and the standard deviation of the horizontal direction. it can be said that the values of the standard deviation are generally higher in the summer than in the winter.

the average standard deviation of the zenith angle was determined with the value of 0.58 mgon in the analysis of the accuracy of the zenith angle. this value is lower by 0.08 mgon than the average standard deviation of the horizontal direction. this is a very good result, because it confirms the manufacturer's claim that the accuracy of the measurement of the horizontal direction and of the zenith angle is the same. similarly as for the horizontal direction, the calculated accuracy of the zenith angle also includes the accuracy of the automatic target recognition and other effects of the surrounding environment. the linear trend of the accuracy of the zenith angle is negligible, as in the case of the accuracy of the horizontal direction; for the reference point 91 the trend is mildly decreasing. as already described above, very important is the fact that the accuracy of the zenith angle is constant throughout the analyzed period. however, compared to the horizontal direction, the standard deviation of the zenith angle does not show changes with a significant annual periodic amplitude; the calculated period is around 400 days. also, a temperature dependence is not apparent in the standard deviation of the zenith angle. consequently, it can be assumed that the changes in the air temperature and other atmospheric phenomena do not have a long-term influence on the size of the standard deviation of the zenith angle.

the last analyzed measured value was the slope distance, for which the average standard deviation was 1.1 mm. this value of the average standard deviation is half of the one declared by the manufacturer for these distances. the value of the average standard deviation corresponds to the constant part of the formula for the calculation of the standard deviation of the distance given by the manufacturer; it seems that the part of the formula depending on the distance has no effect in this case. it can be assumed that the accuracy of the automatic target recognition does not enter into the accuracy of the slope distance. the detected linear trend of the accuracy of the slope distance is very small, and it is insignificant in this application of the monitoring.
however, in the charts for all reference points there is a noticeable increase of the number of standard deviations with a higher value since mid-2008. one can infer that the accuracy of the slope distance is getting slightly worse. it is apparent from the charts, but the size of the change is negligible compared to the accuracy of the slope distance. this deterioration may be caused, for example, by wear of the laser source used for measuring the lengths and for the automatic target recognition. the values of the standard deviation of the slope distance show annual periodic changes with an amplitude of about 0.2 mm. the maximum and minimum of the approximation curve correspond to the seasons: the standard deviations reach higher values in the summer and lower values in the winter. it is clear that the value of the standard deviation of the slope distance is dependent on the air temperature.

8. conclusions

the analysis of the values measured by the total station determined the average standard deviations of the horizontal direction (0.66 mgon), the zenith angle (0.58 mgon) and the slope distance (1.1 mm). another result of the analysis is the very small linear trend of the standard deviations of all the measured values. the value of the standard deviations of the horizontal direction and of the slope distance is significantly dependent on the air temperature. finally, it can be said that the accuracy of the angle measurements is stable and the accuracy of the slope distance is getting slightly worse, but the change is negligible in the analyzed time period. on the one hand, the average standard deviations of the horizontal direction and the zenith angle are almost identical and also higher than the manufacturer declares; on the other hand, the average standard deviation of the slope distance is lower than the manufacturer declares. the determined standard deviations of the measurements meet the requirements on the accuracy of the monitoring of landslides in the čsa mine.

list of symbols

li  measurement in time i (horizontal direction, zenith angle, slope distance) [gon, m]
sl  estimate of the standard deviation of the measurement l [mgon, mm]
x   time of measurement in the l-h function [day]
y   estimate of the measurement in the l-h function [gon, m]

acknowledgements

supported by grant sgs 2017 – optimization of acquisition and processing of 3d data for purpose of engineering surveying, geodesy in underground spaces and laser scanning.

references

[1] r. urban. surveying works during the deformation measurement of buildings. first printing. ctu publishing house, prague, 2015. isbn 978-80-01-05786-5.
[2] t. beran, l. danisch, a. chrzanowski, m. bazanowski. measurement of deformations by mems arrays, verified at sub-millimetre level using robotic total stations. geoinformatics fce ctu 12:30–40, 2014. doi:10.14311/gi.12.6.
[3] y. g. he, c. j. zhao. large-scale bridge distortion measuring technique discussion. international conference on mechanics and civil engineering (icmce), wuhan, people's republic of china, 2014. doi:10.2991/icmce-14.2014.120.
[4] a. berberan, m. machado, s. batista. automatic multi total station monitoring of a tunnel. survey review 39(305):203–211, 2007. doi:10.1179/003962607x165177.
[5] m. štroner, r. urban, p. rys, j. balek. prague castle area local stability determination assessment by the robust transformation method. acta geodynamica et geomaterialia 11(4):325–336, 2014. doi:10.13168/agg.2014.0020.
[6] a. aryal, b. a. brooks, m. e. reid, et al. displacement fields from point cloud data: application of particle imaging velocimetry to landslide geodesy. journal of geophysical research – earth surface 117(f01029), 2012. doi:10.1029/2011jf002161.
[7] v. ballu, j. ammann, o. pot, et al. a seafloor experiment to monitor vertical deformation at the lucky strike volcano, mid-atlantic ridge. journal of geodesy 83(2):147–159, 2009. doi:10.1007/s00190-008-0248-3.
[8] j. lucic. monitoring a cruise shipwreck. construction equipment guide, may 2, page 8, 2012. http://www.constructionequipmentguide.com/web_edit/southeast/-%202012/0912/se00802se.pdf.
[9] c. tse, j. luk. design and implementation of automatic deformation monitoring system for the construction of railway tunnel: a case study in west island line. joint international symposium on deformation monitoring, hong kong, china, 2011.
[10] d. dzurisin, m. lisowski, c. w. wicks, et al. geodetic observations and modeling of magmatic inflation at the three sisters volcanic center, central oregon cascade range, usa. journal of volcanology and geothermal research 150(1-3):35–54, 2006. doi:10.1016/j.jvolgeores.2005.07.011.
[11] l. m. peci, m. berrocoso, r. m. paez, et al. iesid: automatic system for monitoring ground deformation on the deception island volcano (antarctica). computers & geosciences 48:126–133, 2012. doi:10.1016/j.cageo.2012.05.004.
[12] w. stempfhuber. performance of geodetic monitoring systems. bautechnik 89(11):794–800, 2012. doi:10.1002/bate.201201575.
[13] b. klappstein, g. bonci, w. maston. implementation of real time geotechnical monitoring at an open pit mountain coal mine in western canada. international multidisciplinary scientific symposium universitaria simpro 2014, petrosani, romania, 2014.
[14] a. c. bala, f. m. brebu, a. m. moscovici. using terrestrial laser scanning technologies for high construction monitoring. 12th international multidisciplinary scientific geoconference (sgem), albena, bulgaria, 2012.
[15] m. scaioni, l. barazzetti, a. giussani, et al. photogrammetric techniques for monitoring tunnel deformation. earth science informatics 7(2):83–95, 2014. doi:10.1007/s12145-014-0152-8.
[16] i. lipták, j. erdélyi, p. kyrinovič, a. kopáčik. monitoring of bridge dynamics by radar interferometry. geoinformatics fce ctu 12:10–15, 2014. doi:10.14311/gi.12.2.
[17] c. castagnetti, e. bertacchini, a. corsini, a. capra. multi-sensors integrated system for landslide monitoring: critical issues in system setup and data management. european journal of remote sensing 46:104–124, 2013. doi:10.5721/eujrs20134607.
[18] j. blachowski, s. ellefmo, e. ludvigsen. monitoring system for observations of rock mass deformations caused by sublevel caving mining system. acta geodynamica et geomaterialia 8(3):335–344, 2011.
[19] v. nestorović. on the precision of measuring horizontal directions in engineering projects. geonauka 3(1):33–40, 2013. doi:10.14438/gn.2013.22.
[20] t. owerko, m. strach. examining coherence of accuracy tests of total station surveying and geodetic instruments based on the comparison of the results of the complete test procedures according to iso 17123. reports on geodesy 87(2):291–299, 2009.
[21] p. stanislav, j. blín. technical support of the service of the automatic total station leica tcr 2003a in the operating conditions of the company mostecká uhelná a.s. acta montanistica slovaca 12(special issue 3/2007):554–558, 2007. issn 1335-1788.
[22] m. štroner, m. hampacher. processing and analysis of measurements in engineering surveying. first printing. ctu publishing house, prague, 2011.

acta polytechnica 55(6):384–387, 2015

calculation of stress and deformation in fuel rod cladding during pellet-cladding interaction

dávid halabuk*, jiří martinec

department of power engineering, energy institute, faculty of mechanical engineering, brno university of technology, technická 2896/2, brno 616 69, czech republic
* corresponding author: david.halabuk@gmail.com

abstract. the elementary parts of every fuel assembly, and thus of the reactor core, are the fuel rods. the main function of the cladding is the hermetic separation of the nuclear fuel from the coolant. the fuel rod works in very specific and difficult conditions, so there are high requirements on its reliability and safety. during the irradiation of fuel rods, a state may occur in which the fuel pellet and the cladding interact. this state is accompanied by changes of stress and deformation in the fuel cladding. the article focuses on a stress and deformation analysis of the fuel cladding, in which two fuels are compared: a fresh one, and a spent one which is in contact with the cladding. the calculations are done for 4 different shapes of fuel pellets. based on the obtained results, it is possible to evaluate which shape of fuel pellet is the most appropriate with regard to the stress and deformation forming in the fuel cladding, the axial dilatation of the fuel, and the radial temperature distribution in the fuel rod.

keywords: nuclear fuel; fuel rod; stress and deformation analysis; fuel cladding.

1. introduction

nuclear fuel used in pressurized water reactors is formed from uranium dioxide (uo2). this ceramic fuel has the shape of small cylindrical pellets, and it is hermetically encased in long tubes made of zirconium alloy.
zirconium alloys have been selected for their good qualities, such as geometric stability, small parasitic capture of neutrons, compatibility with uo2, corrosion resistance, good mechanical properties, etc. [1]. during irradiation, most of the nuclear fuel properties change. the mechanical and thermal properties change, as do the shape, structure and volume of the fuel pellet [1]. at a very early stage, the cracking of pellets occurs due to thermal stresses. moreover, the ends of the pellets expand more than their centre, and the pellet acquires an hourglass shape. cracked fragments or a deformed rim may exert a local force on the cladding and thus concentrate a stress. there are also some specific phenomena that affect the nuclear fuel properties, which are mostly direct consequences of the irradiation (swelling, densification, radiation growth, fission gas release, etc.) [2].

in certain cases, fuel pellets have modifications of their shapes which improve some of their properties. a typical example is a central hole, which reduces the peak temperature in the centre of the fuel and also increases the free volume of the fuel rod, into which fission gas can be released. another modification is the dished chamfered pellet shape, which reduces the axial dilatation of the fuel column. each of these changes of the pellet shape has an effect on the amount of stress and deformation in the cladding during pellet-cladding interaction (pci) [1].

figure 1. pellet cracking due to thermal stress [2].

2. pellet-cladding interaction

the gap between the fuel and the cladding is gradually reduced during operation (due to the thermal expansion and the volume growth of the fuel pellet), and at high burnup the gap can completely disappear. if contact between the pellet and the cladding is created, the stress and deformation of the cladding are increased. this interaction may occur even earlier, due to local power ramps (because of the different thermal expansion of the fuel and the cladding) [3].

fission products, such as iodine and cesium, create an aggressive corrosive environment inside the fuel rods. if a small part of a cracked pellet is missing, fission products accumulate in the free space between the pellet and the cladding. at that place, a brittle layer susceptible to cracking may form on the inner surface of the cladding. if the cladding is exposed for a sufficiently long time to a corrosive environment and to the increased tension caused by the contact between the fuel and the cladding, stress-corrosion cracking (scc) may occur [3].

figure 2. pci failure due to scc [3].

3. stress and deformation analysis

the main goal of the analysis was to find the influence of the pellet shape on the deformation and stress which occur in the cladding during pci. the axial dilatations of the fuel columns and the radial temperature distribution in the fuel rods were also compared. the calculation was done for two different states. the first state involved a fresh fuel with a gap between the fuel and the cladding; the linear heat generation rate was 30 kw/m and the pressure inside the fuel rod was 0.6 mpa. the second state was characterized by contact between the fuel and the cladding; because it was a spent fuel, the linear heat generation rate was only 10 kw/m and the pressure increased to 5 mpa [4]. the diameter of the fuel pellets was 7.6 mm and their height was 12 mm [5].
four various modifications of the basic pellet shape were compared in the analysis. the first two pellets had a cylindrical shape, with and without a central hole, and the other two had a dished chamfered shape, also with and without a central hole. the central hole had a diameter of 1.2 mm [5], and the geometry of the dished chamfered pellets is shown in fig. 3.

the geometrical model consisted of three fuel pellets, the cladding and a spacer grid. because of the 3 symmetry axes, the final model was only 1/6 of the geometrical model. the outer diameter of the cladding was 9.1 mm and its thickness was 0.685 mm [1]. the spacer grid, with a thickness of 0.25 mm and a height of 10 mm [7], was placed at the interface of the 2nd and 3rd pellets.

figure 3. dished chamfered face of fuel pellet [6].

the external conditions correspond to the conditions in the vver 440 reactor during normal operation (coolant temperature: 297 °c; pressure: 12.26 mpa [4]). uo2 was considered as the material of the fuel pellets, the gap between the fuel and the cladding was filled with pure helium, and the cladding and the spacer grid were made of zirconium alloy. a zr1nb alloy is used as the cladding in the vver 440 reactor, but in the absence of the information necessary for the calculation, a zircaloy-4 alloy was used for the analysis. the thermal and mechanical properties of both alloys needed for the calculation (thermal conductivity, thermal expansion, modulus of elasticity, poisson's ratio) are very similar in the used temperature range, so the obtained results can also be used informatively for the zr1nb alloy; for more accurate results, an analysis using the material properties of the zr1nb alloy would be required. all material properties were considered as temperature dependent, and the thermal conductivity of uo2 was additionally dependent on burnup [8]. the calculation, for the elastic zone of material deformations, was done with the fem (finite element method) programme ansys 14.5.

4. results

fig. 4 shows the stress and deformation of the cladding for state 2. concentrations of stress and deformation occurred at the pellet interfaces, because of the non-uniform deformation of the fuel pellets (hourglassing). the maximum value of stress was calculated on the inner surface, but the maximal deformations occurred on the outer surface of the cladding. a comparison of the maximum values of the stresses and deformations that occurred in the cladding, for all the various shapes of fuel pellets, is shown in tab. 1.

if the contact was not created, the cladding was loaded only by the pressure and thermal load, which were not affected by the small changes of the pellet's shape. for this reason, the deformation and stress for state 1 were essentially the same for all shapes. for state 2, the biggest deformation and stress occurred in the pellet without a central hole.

figure 4. stress (left) and deformation (right) of the cladding for the state with contact.
table 1. maximum value of deformation and stress in the cladding, axial dilatation of the fuel pellets, and temperature in the centre of the pellet:

pellet shape                                 | cylindrical | cylindrical, hole | dished  | dished, hole
cladding deformation, no contact [mm]        | 0.0127      | 0.0127            | 0.0127  | 0.0127
cladding deformation, contact [mm]           | 0.0192      | 0.0186            | 0.0193  | 0.0187
difference [mm]                              | 0.0065      | 0.0059            | 0.0066  | 0.006
relative difference [%]                      | 51.2        | 46.5              | 52.0    | 47.2
stress in cladding, no contact [mpa]         | 199.2       | 199.2             | 199.2   | 199.2
stress in cladding, contact [mpa]            | 312.2       | 301.2             | 303.3   | 294.2
difference [mpa]                             | 113.0       | 102.0             | 104.1   | 95.0
relative difference [%]                      | 56.7        | 51.2              | 52.3    | 47.7
axial dilatation of 3 pellets, no contact [mm] | 0.473     | 0.434             | 0.392   | 0.38
axial dilatation of 3 pellets, contact [mm]  | 0.186       | 0.185             | 0.165   | 0.156
temperature in centre of pellet, no contact [°c] | 1462    | 1371              | 1462    | 1371
temperature in centre of pellet, contact [°c]    | 646     | 616               | 646     | 616

the dished chamfered shape had no influence on the deformation of the cladding, but when comparing the stresses, there was some effect. however, the dished chamfered shape had the biggest influence on the axial dilatation of the fuel column. in the model, the difference of the axial dilatations between the dished chamfered and the cylindrical shapes is only hundredths of a millimetre, but when considering 2.5 m as the height of the fuel column, it already reaches a few millimetres. predictably, the central hole had the biggest impact on the temperature in the centre of the pellets: there was a difference of 91 °c between the pellets with and without a central hole for state 1, and of 30 °c for state 2.

according to the obtained results, the most appropriate shape with regard to the stress and deformation forming in the fuel cladding, as well as to the axial dilatation of the fuel and the radial temperature distribution in the fuel rod, is the dished chamfered shape with a central hole. the shape without dished chamfered faces and without the central hole is the most inappropriate one. however, it contains more mass, and thus more fuel for fission reactions. this factor may be more influential, especially when the values of stress, deformation and temperature fall within a permissible range.

the results of the analysis show that the stress was about 200 mpa for state 1, which is under the ultimate strength. for state 2, there were higher values of stress only in a small region, and even there the strength limit was not exceeded, because the ultimate strength of the zirconium alloy is increased after irradiation. a problem may occur only if other degradation effects (like grid-to-rod fretting, stress-corrosion cracking, hydriding, etc.) are added. in that case, the thickness of the cladding may be reduced or the surface may be defective, and together with the increased tension it may induce a failure.
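as a quick numerical cross-check of tab. 1, the quoted relative differences can be recomputed from the absolute values; the snippet below is only an illustration and is not part of the original analysis.

```python
# recomputing the relative differences quoted in table 1 from its absolute
# values (state without contact vs. state with contact), for a quick check.
no_contact_def, contact_def = 0.0127, [0.0192, 0.0186, 0.0193, 0.0187]  # mm
no_contact_str, contact_str = 199.2, [312.2, 301.2, 303.3, 294.2]       # mpa
for c in contact_def:
    print(round(100 * (c - no_contact_def) / no_contact_def, 1))  # 51.2, 46.5, 52.0, 47.2
for c in contact_str:
    print(round(100 * (c - no_contact_str) / no_contact_str, 1))  # 56.7, 51.2, 52.3, 47.7
```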
5. conclusions

the influence of the pellet shape on stress and deformation is not very strong for the selected model conditions. when the most appropriate and the most inappropriate shapes are compared, the differences of the maximal values are less than 6 %. a larger difference, of almost 24 %, is seen for the axial dilatations of the fuel column. the results of the calculations show that the stress is under the yield strength for each shape, even in full contact; a problem may occur only in combination with other degradation effects.

the main goals of development in nuclear research are to increase the safety, reliability and economic efficiency of nuclear fuel. burnup, as well as the fuel cycle length, is being increased. this increase is possible due to a higher fuel enrichment and more uranium mass in the reactor. in some reactors, pellets with an extended diameter (from 7.6 mm to 7.8 mm) and without a central hole are already loaded [5]. when the basic dimensions of the fuel assemblies and fuel rods are kept, the thickness of the cladding has to be decreased. these changes produce benefits such as a higher efficiency of fuel utilization and a decreasing amount of spent fuel. however, the thickness of the cladding may be decreased only while still respecting the strength limits and the safety of the nuclear fuel.

references

[1] bečvář, j.: jaderné elektrárny. praha: sntl, 1978.
[2] patterson, c., strasser, a., et al.: processes going on in nonfailed rod during normal operation. skultuna, sweden: advanced nuclear technology international, 2010.
[3] international atomic energy agency: review of fuel failures in water cooled reactors. vienna: iaea, 2010.
[4] heřmanský, b.: termomechanika jaderných reaktorů. praha: academia, 1986.
[5] ugryumov, a.: nuclear fuel for npp: current status and main fields of the development. international conference vver 2013, praha, 2013.
[6] wolfgang, d., hoff, a., et al.: nuclear reactor green and sintered fuel pellets, corresponding fuel and fuel assembly. us20110176650, 2011.
[7] rämä, t., et al.: effect of spacer grid mixing vanes on coolant outlet temperature distribution. kerntechnik 77(4):265–270, 2012. doi:10.3139/124.110252
[8] siefken, l.: scdap/relap5/mod 3.3 code manual. matpro – a library of materials properties for light-water-reactor accident analysis. washington, dc: division of systems technology, office of nuclear regulatory research, u.s. nuclear regulatory commission, 2001.

acta polytechnica vol. 42 no. 1/2002

system framework for the design of an avionics architecture with upgrade potential

n. rao, r. kusumo, a. k. sinha, m. l. scott

the technological growth of 'avionics systems' has outpaced the service-life of aircraft, resulting in avionics upgrade as a preferred cost-effective option to new design. mid-life upgrade of 'avionics systems' with state-of-the-art mission systems has been a challenging engineering task. the complexity of the avionics upgrade process is due to the design rigidity of the avionics systems architecture. an avionics architecture with growth potential is required to optimise an avionics upgrade with state-of-the-art systems. a research program that partially addresses avionics systems upgrade by developing a methodology to design an avionics architecture with in-built growth potential is discussed in this paper. a 'system approach' is adopted to develop a methodology that identifies the design parameters that will facilitate the design of an avionics architecture with upgrade potential.

keywords: avionics systems, avionics upgrade, system approach, avionics architecture.

1 introduction

during the service life of military aircraft, advances in avionics technology render certain systems onboard either obsolete or of limited capability compared to a state-of-the-art system. mid-life upgrade of military aircraft that includes the insertion of advanced avionics systems into the 'avionics architecture' is a cost-effective option to new design [1]. the major challenge in an avionics upgrade design process is the integration of an advanced system with the systems already onboard. the integration process is governed by the avionics architecture of the aircraft [2]. the architecture for military aircraft is based on a functional format: flight control, navigation, identification of friend or foe, and communication are the common functional formats [3]. the design rigidity of such an architecture format limits the degree to which integration can be achieved. the development of 'multi-functional avionics systems', coupled with architecture rigidity, has made the avionics upgrade process an engineering challenge. a new design approach, named integrated modular avionics (ima), is an attempt to deal with the current design drawbacks of avionics architecture and to address the problem of technology insertion [4], [5]. the principles on which these concepts are formulated are still premature, and there is no major literature on the subject in the public domain. a design methodology needs to be developed for an avionics architecture with upgrade potential, from a system perspective – one that will holistically address all design parameters and constraints, including technological insertion [6]. the architecture needs to be in an 'open format', to provide in-built growth potential and to facilitate the insertion of state-of-the-art systems into the architecture on a continuous basis during the service life of the aircraft. this paper presents a framework for the development of a system methodology to design an avionics architecture with upgrade potential.

2 system methodology

a system methodology to study the operational needs and operational environment for deriving the mission requirements of military aircraft was developed by sinha et al [7]. based on the derived mission requirements, a mid-life upgrade system (mlus) was structured by sinha et al [6] to identify the system elements (components, attributes and relationships) and develop the system hierarchy [8]. the mlus hierarchy has aided the identification of state-of-the-art mission systems for the mid-life upgrade of in-service military aircraft [9]. the mission systems identified include advanced avionics systems as replacements for obsolete systems onboard, or as additional systems to enhance mission capability. the insertion of these state-of-the-art avionics systems onboard as part of the upgrade process depends on the 'avionics system architecture' (asa) – the platform on which all avionics systems rest. the design structure of the asa is based on the state of the art during the design phase of the asa. as the asa remains an integral part of the aircraft during its service life, the technological parameters on which the design was based remain static. on the other hand, advances in avionics systems technology continue, resulting in new or modified design parameters. hence, to facilitate the insertion of advanced systems, there is a need for an asa with upgrade potential (asa-up) – one that focuses on the design parameters of future avionics systems. the ima concept and the methodology for mid-life upgrade analysis of military aircraft provide the foundation to formulate a research program on an asa-up design methodology. the system methodology developed by sinha et al [6], [7], [9] can be explored to identify advanced avionics systems. the methodology could then be further explored to identify the asa-up design parameters. in order to develop a methodology that holistically [10] addresses an asa-up design, a system structure [6] for
the architecture needs to be in an ‘open format’, to provide in-built growth potential, and to facilitate insertion of state-of-the-art systems into the architecture on a continuous basis during the service life of the aircraft. this paper presents a framework for the development of a system methodology to design avionics architecture with upgrade potential. 2 system methodology a system methodology to study the operational needs and operational environment for deriving the mission requirements of military aircraft was developed by sinha et al [7]. based on the derived mission requirements, a mid life upgrade system (mlus) was structured by sinha et al [6] to identify the system elements (components, attributes and relationships) and develop the system hierarchy [8]. the mlus hierarchy has aided the identification of state-of-the art mission systems for mid-life upgrade of in-service military aircraft [9]. the mission systems identified include advanced avionics systems as replacements for obsolete systems on board, or as additional systems to enhance mission capability. the insertion of these state-of-the-art avionics systems on board as part of the upgrade process depends on the “avionics system architecture” (asa) – the platform on which all avionics systems rest. the design structure of the asa is based on state-of-the-art technology during the design phase of asa. as the asa remains an integral part of the aircraft during its service life, the technological parameters on which the design was based remain static. on the other hand, advances in avionics systems technology continue, resulting in new or modified design parameters. hence, to facilitate the insertion of advanced systems there is a need for an asa with upgrade potential (asa-up) – one that focuses on the design parameters of future avionics systems. the ima concept and the methodology for mid-life up-grade analysis of military aircraft provides the foundation to formulate a research program on asa-up design methodology. the system methodology developed by sinha et al [6], [7], [9] can be explored to identify advanced avionics systems. the methodology could then be further explored to identify the asa-up design parameters. in order to develop a methodology that holistically [10] addresses an asa-up design, a system structure [6] for avion© czech technical university publishing house http://ctn.cvut.cz/ap/ 59 acta polytechnica vol. 42 no. 1/2002 system framework for the design of an avionics architecture with upgrade potential n. rao, r. kusumo, a. k. sinha, m. l. scott the technological growth of ‘avionics systems’ has outpaced the service-life of aircraft, resulting in avionics upgrade as a preferred cost-effective option to new design. mid-life upgrade of “avionics systems” by state-of-the-art mission systems has been a challenging engineering task. the complexity of avionics upgrade process is due to the design rigidity of avionics systems architecture. an avionics architecture with growth potential is required to optimise avionics upgrade with state-of-the-art systems. a research program that partially addresses avionics systems upgrade by developing a methodology to design an avionics architecture with in-built growth potential is discussed in this research paper. a ‘system approach’ is adopted to develop a methodology that identifies the design parameters that will facilitate design of an avionics architecture with upgrade potential. keywords: avionics systems, avionics upgrade, system approach, avionics architecture. 
ics upgrade first needs to be initially formulated. the system structure should facilitate the identification of system elements based on the slated functions of the avionics upgrade and system (aus). keeping the provisions of technological insertion as the focus, the functions of the aus to be structured are as follows: � identify state-of-the art avionics systems; � formulate technological growth parameters; � identify the avionics architecture parameters of the aircraft system; and � integrate growth and architecture parameters to identify the design parameters of asa-up. the structure of the aus formulated considering the above functions is presented in fig. 1. system framework after conceptualising the avionics upgrade process from a system perspective the framework for the design of an asa-up can be developed. the aus structure identifies the need for four components – two analysers, an integrator, and a tester and validator – to aid the design of an asa-up. the components and their stated functions are as follows: � analysers: to provide an analysis of the current architecture and advanced systems, and the identify the architecture and technological growth parameters; � integrator: to integrate the architecture and technological growth parameters, and to update the design parameters for an asa-up; and � tester & validator: to test and validate the asa-up design parameters for functionality, compatibility and performance. when the above modules (components) and their functions have been identified, the system framework for the design of an avionics architecture with upgrade potential can be developed. the system framework with the various modules and the functional flow is presented in fig. 2. 60 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 42 no. 1/2002 built-up area mountains relationship (inter & intra) component test & validation input architecture advanced system component architecture parameter analyser (attributes) component advance system parameter analyser (attributes) station size weight power requirement data transfer interface & compatibility component parameter integrator (attributes) output architecture with upgrade potential fig. 1: system structure of an avionics upgrade system © czech technical university publishing house http://ctn.cvut.cz/ap/ 61 acta polytechnica vol. 42 no. 1/2002 architecture existing architecture parameter avionics architecture dtprwtst si ic design requirements of avionics architecture with upgrade potential requirements analysis requirements prioritisation or system breakdown structure sub-architecture integration avionics architecture with growth potential md-‘n’md 1 md 2 smd – 1 smd – 2 smd –‘n’ sad – 1 sad – 2 sad –‘n’ architecture design verification system design analysis odfd pd design decision support validation degree of design requirements fulfill upgrade potential acceptance new architecture design yesno reconfigure system design avionics upgrade analysis operational needs df oflg operational environment wetitesn mission requirements technological growth parameters state-of-the-art avionics mission systems onboard mission systems fig. 
2: system framework for the design of an avionic architecture with upgrade potential legends: cs: costs df: defensive mission dt: data transfer ic: interference & compatibility lg: logistic mission mp: module parameters of: offensive mission pr: power requirement sad: sub-architecture design sc: system capability sf: system functionality si: size smp: sub-module parameters sn: situation sp: system performance st: station te: terrain ti: time we: weather wt: weight results and discussion the application of generic system methodology for the identification of advanced mission systems resulted in the following: � development of a “system structure of an avionics upgrade system”; and � development of a “system framework for the design of an avionics architecture with upgrade potential”. system structure the system structure that has been developed provides a system perspective of an avionics upgrade problem. it identifies the components of the problem and the functional requirements that need to be addressed for a holistic solution. an interrelationship analysis of the components and their attributes (functional requirements) provides the methodology to integrate the design parameters of the architecture and advanced avionics systems. system framework the system framework identifies four analysis modules – architecture analysis, avionics upgrade analysis, architecture design and decision support. each of these modules comprises various sub-modules that aid analysis and decision support. the sub-module frameworks need to be further developed in detail for functionality of the modules. the system framework caters for feedback loops to optimise the design. conclusion the system approach adopted for developing the framework for the design of an avionics architecture with upgrade potential provides an avenue for a holistic analysis of the problem. the methodology developed addresses all design parameters that need to be considered in the design process. reference [1] little, r.: advanced avionics for military needs. proceedings of royal aeronautical society on avionics in the future land-air battle, london, u.k., hamilton place, 12 december 1990, pp. 11.1–11.10 [2] morgan, d. r.: military avionics twenty years in the future. aiaa/ieee digital avionics systems conference, 5–9 november, massachusetts, cambridge, 1995, pp. 483–490 [3] rushby, j.: partitioning in avionics architectures: requirements, mechanisms, and assurance. nasa/cr-1999-209347, california 1999 [4] giddings, b. j.: some fundamentals of integrated modular avionics. proceedings of royal aeronautical society, april, maryland, annapolis, 1999, pp. 4.1– 4.7 [5] design guidance for integrated modular avionics. arinc specification 651, no. 1991, aeronautical radio inc., annapolis, md [6] sinha, bil, scott: design pf payloads for mid-life upgrade of maritime helicopters: stages i, ii, iii and iv. third australian pacific vertiflite conference on helicopter technology, 12–14 july, act, canberra 2000 [7] sinha, kem, wood: a system approach to helicopter modifications for multi-mission roles. first australian system conference, wa, perth 2000 [8] blanchard, b. s., fabrycky, w. j.: systems engineering and analysis. prentice hall, new jersey, 1990 [9] sinha, bil, scott: design of payloads for mid-life upgrade of maritime helicopters: stages v and vi. presented at the international conference on systems thinking in management, 08–10 november, melbourne, vic, 2000 [10] flood, r. l., jackson, m. 
c.: creative problem solving total systems intervention. john wiley and sons, new york 1991 nagendra rao e-mail: nagendra.rao@rmit.edu.au raden kusumo department of aerospace engineering 226 lorimer street fishermens bend port melbourne, vic 3207, australia dr. arvind kumar sinha murray l. scott the sir lawrence wackett centre for aerospace design technology department of aerospace engineering royal melbourne institute of technology gpo box 2476v, melbourne vic 3001, australia 62 acta polytechnica vol. 42 no. 1/2002 table of contents partial discharge measurements in hv rotating machines in dependence on pressure of coolant 3 i. kršòák, i. kolcunová large-scale file system design and architecture 6 v. dynda, p. rydlo thermal comfort and optimum humidity part 1 12 m. v. jokl non-isothermal diffusion of water vapour in porous building materials 25 t. ficker, z. podešvová thermal comfort and optimum humidity part 2 28 m. v. jokl the risks of investments in transport infrastructure projects 33 o. pokorná, d. mocková pumping efficiency of screw agitators in a tube 37 f. rieger education for production and operations management 40 m. kavan design of a three surfaces r/c aircraft model 44 d. p. coiro, f. nicolosi numerical modelling of overburden deformations 53 j. barták, m. hilar, j. pruška system framework for the design of an avionics architecture with upgrade potential 59 n. rao, r. kusumo, a. k. sinha, m. l. scott acta polytechnica doi:10.14311/ap.2017.57.0379 acta polytechnica 57(6):379–384, 2017 © czech technical university in prague, 2017 available online at http://ojs.cvut.cz/ojs/index.php/ap lie algebra representations and rigged hilbert spaces: the so(2) case enrico celeghinia, b, manuel gadellaa, mariano a del olmoa, ∗ a departamento de física teórica and imuva, universidad de valladolid, 47011 valladolid, spain b dipartimento di fisica, università di firenze and infn-sezione di firenze, i50019 sesto fiorentino, firenze, italy ∗ corresponding author: marianoantonio.olmo@uva.es abstract. it is well known that related with the representations of the lie group so(2) we find a discrete basis as well a continuous one. in this paper we revisited this situation under the light of rigged hilbert spaces, which are the suitable framework to deal with both discrete and continuous bases in the same context and in relation with physical applications. keywords: lie groups representations; special functions; rigged hilbert spaces. 1. introduction in the last years we have been involved in a program of revision of the connection between special functions (in particular, classical orthogonal polynomials), lie groups, differential equations and physical spaces. we have obtained the ladder algebraic structure for different orthogonal polynomials, like hermite, legendre, laguerre [1], associated laguerre polynomials, spherical harmonics, etc. [2, 3]. in all cases, we have obtained a symmetry group. the corresponding orthogonal polynomial is associated to a particular representation of its lie group. for instance, for the associated laguerre polynomials and the spherical harmonics, we obtain the symmetry group so(3, 2) and in both cases they support a unitary irreducible representation (uir) with quadratic casimir −5/4. both are bases of square integrable functions defined on (−1, 1) ×z and on the sphere s2, respectively. in any case we get discrete and continuous bases. 
on the other hand, the rigged hilbert space (rhs) is a suitable framework for a description of quantum states, when the use of both discrete bases, i.e., complete orthonormal sets, and generalized continuous bases like those used in the dirac formalism are necessary [4]–[15]. as mentioned above, this is a typical situation arisen when we deal with special functions, which hold discrete labels and depend on continuous variables. we have analysed this situation for the hermite and laguerre functions motivated also by possible applications on signal theory in recent papers [11, 12]. moreover, the rhs fit very well with lie groups [16] and also with semigroups, see [17] and references therein. in this paper, we continue the study of the relation between lie algebras, special functions and rhs. here we propose a revision of the elementary case associated to the lie group so(2). since the representations of so(2) admit continuous and discrete bases [13] it is necessary a rhs for a mathematical rigorous description. the relation with the fourier series and its interest in quantum physics make that this case provides a relevant example of these mathematical objects. 2. rigged hilbert spaces there are several reasons to assert that hilbert spaces are not sufficient for a thoroughly formulation of quantum mechanics even within the non-relativistic context. we can mention, for instance, the dirac formulation [18] where operators with continuous spectrum play a crucial role (see also [10] and references therein) and their eigenvectors are not in the hilbert space of square integrable wave functions. another example is related with the proper definition of gamow vectors [9], which are widely used in calculations including unstable quantum systems and are non-normalizable. we can also refer to formulations of time asymmetry in quantum mechanics that may require the use of tools more general than hilbert spaces [19]. the proper framework that includes naturally the hilbert space and its features, which are widely used in quantum mechanics, is the rhs. the rigged hilbert spaces were introduced by gelfand and collaborators [4] in connection with the spectral theory of self-adjoint operators. they also proved, together with maurin [14], the nuclear spectral theorem [10, 15]. the rhs formulation of quantum mechanics was introduced by bohm and roberts around 1965 [5, 8]. a rigged hilbert space (also called gelfand triplet) is a triplet of spaces φ ⊂h⊂ φ×, with h an infinite dimensional separable hilbert space, φ (test vectors space) a dense subspace of h endowed with its own topology, and φ× the dual /antidual space of φ. 379 http://dx.doi.org/10.14311/ap.2017.57.0379 http://ojs.cvut.cz/ojs/index.php/ap enrico celeghini, manuel gadella, mariano a del olmo acta polytechnica the topology considered on φ is finer (contains more open sets) than the topology that φ has as subspace of h, and φ× is equipped with a topology compatible with the dual pair (φ, φ×) [20], usually the weak topology. one consequence of the topology of φ [10, 21] is that all sequences which converge on φ, also converge on h, the converse being not true. the difference between topologies gives rise that the dual space of φ, φ×, is bigger than h, which is self-dual. here, the dual φ× of φ, i.e., any f ∈ φ× is a continuous linear mapping from φ into c. 
The linearity or antilinearity of $F \in \Phi^\times$ means, respectively, that for any pair of vectors $\psi, \varphi \in \Phi$ and any pair $\alpha, \beta \in \mathbb{C}$ we have
$$\langle F|\alpha\psi + \beta\varphi\rangle = \alpha\,\langle F|\psi\rangle + \beta\,\langle F|\varphi\rangle, \qquad \langle F|\alpha\psi + \beta\varphi\rangle = \alpha^*\langle F|\psi\rangle + \beta^*\langle F|\varphi\rangle,$$
where we have followed the Dirac bra-ket notation and the star denotes complex conjugation.

A crucial property to be taken into consideration is the following: if $A$ is a densely defined operator on $\mathcal{H}$ such that $\Phi$ is a subspace of its domain and $A\varphi \in \Phi$ for all $\varphi \in \Phi$, we say that $\Phi$ reduces $A$, or that $\Phi$ is invariant under the action of $A$ (i.e., $A\Phi \subset \Phi$). In this case, $A$ may be extended unambiguously to the dual $\Phi^\times$ by making use of the duality formula
$$\langle A^\times F|\varphi\rangle := \langle F|A\varphi\rangle, \qquad \forall \varphi\in\Phi,\ \forall F\in\Phi^\times. \qquad (1)$$
If $A$ is continuous on $\Phi$, then $A^\times$ is continuous on $\Phi^\times$.

The topology on $\Phi$ is given by an infinite countable set of norms $\{\|\cdot\|_n\}_{n=1}^{\infty}$. A linear operator $A$ on $\Phi$ is continuous if and only if for each norm $\|\cdot\|_n$ there is a $K_n > 0$ and a finite sequence of norms $\|\cdot\|_{p_1}, \|\cdot\|_{p_2}, \dots, \|\cdot\|_{p_r}$ such that for any $\varphi \in \Phi$ one has [22]
$$\|A\varphi\|_n \le K_n\big(\|\varphi\|_{p_1} + \|\varphi\|_{p_2} + \cdots + \|\varphi\|_{p_r}\big). \qquad (2)$$
The same result applies to check the continuity of any linear or antilinear mapping $F : \Phi \to \mathbb{C}$; in this case, the norm $\|A\varphi\|_n$ should be replaced by the modulus $|F(\varphi)|$.

3. A paradigmatic case: RHS for SO(2)

As mentioned before, we consider the most elementary situation, provided by SO(2), where we have two RHS serving as support of unitarily equivalent representations of SO(2). One of these RHS is a concrete RHS constructed with functions or generalised functions, and the other one is an abstract RHS. A mapping of the test vectors of the abstract RHS gives the test functions of the concrete one. We also have to adjust the topologies so that the elements of the Lie algebra are continuous operators on both test spaces and their corresponding duals.

Let us recall that SO(2) is the group of rotations of the Euclidean plane. It is a one-dimensional abelian Lie group, parametrized by $\phi \in [0, 2\pi)$. The elements $R(\phi)$ of SO(2) satisfy the product law $R(\phi_1)\cdot R(\phi_2) = R(\phi_1 + \phi_2)$ (mod $2\pi$). Here we consider two equivalent families of UIR of SO(2): one of them supported by the Hilbert space $L^2[0, 2\pi]$ (via the regular representation, which contains each UIR exactly once, each one related to an integer number), and another set of UIR (also labelled by $\mathbb{Z}$) supported by an abstract infinite-dimensional separable Hilbert space $\mathcal{H}$.

3.1. UIR supported by the HS $L^2[0, 2\pi]$

We consider the UIR characterised by the unitary operator on $L^2[0, 2\pi]$
$$U_m(\phi) := e^{-im\phi}, \qquad \forall \phi\in[0,2\pi),\ m\in\mathbb{Z}\ \text{(fixed)}. \qquad (3)$$
An orthonormal basis for $L^2[0, 2\pi]$ is given by the sequence of functions $\phi_m$ labelled by $m \in \mathbb{Z}$,
$$\phi_m \equiv \frac{1}{\sqrt{2\pi}}\, e^{-im\phi}, \qquad m\in\mathbb{Z}.$$
Thus, any Lebesgue square integrable function $f(\phi)$ of $L^2[0, 2\pi]$ can be written as
$$f(\phi) = \sum_{m=-\infty}^{\infty} f_m\,\phi_m, \qquad (4)$$
with
$$f_m = \frac{1}{\sqrt{2\pi}} \int_0^{2\pi} e^{im\phi}\, f(\phi)\, d\phi, \qquad (5)$$
under the condition that
$$\sum_{m=-\infty}^{\infty} |f_m|^2 = \int_0^{2\pi} |f(\phi)|^2\, d\phi < +\infty.$$
Note that the complex numbers $f_m$ are the Fourier coefficients of $f(\phi)$. The functions $U_m = e^{-im\phi}$ satisfy the following orthogonality and completeness relations:
$$\frac{1}{2\pi}\int_0^{2\pi} U_m^\dagger(\phi)\, U_n(\phi)\, d\phi = \delta_{m,n}, \qquad \frac{1}{2\pi}\sum_{m=-\infty}^{\infty} U_m^\dagger(\phi)\, U_m(\phi') = \delta(\phi-\phi').$$

3.2. UIR on an infinite-dimensional separable HS

Equivalently, we may construct another set of UIR's of SO(2), labelled by $\mathbb{Z}$ and supported on an abstract infinite-dimensional separable Hilbert space $\mathcal{H}$. Let $\{|m\rangle\}_{m\in\mathbb{Z}}$ be an orthonormal basis of $\mathcal{H}$. There is a unique natural unitary mapping $S$ such that
$$S:\ \mathcal{H} \longrightarrow L^2[0,2\pi], \qquad |m\rangle \longmapsto S|m\rangle = \phi_m, \qquad \forall m\in\mathbb{Z}.$$
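As an aside, the orthogonality, completeness and Parseval relations of Section 3.1 are easy to check numerically. The following sketch is our addition, not part of the original paper; the test function $e^{\cos\phi}$ is an arbitrary choice of a smooth periodic function.

```python
import numpy as np

# Uniform grid on [0, 2*pi); for smooth periodic integrands the plain
# Riemann sum below is spectrally accurate.
N = 2048
phi = np.linspace(0.0, 2.0 * np.pi, N, endpoint=False)
dphi = 2.0 * np.pi / N

def integrate(values):
    return np.sum(values) * dphi

def u(m):
    """The representation functions U_m(phi) = exp(-i*m*phi) of (3)."""
    return np.exp(-1j * m * phi)

# Orthogonality: (1/2*pi) * int conj(U_m) U_n dphi = delta_{m,n}.
for m in range(-5, 6):
    for n in range(-5, 6):
        overlap = integrate(np.conj(u(m)) * u(n)) / (2.0 * np.pi)
        assert abs(overlap - (1.0 if m == n else 0.0)) < 1e-10

# Parseval: for f(phi) = exp(cos(phi)) the coefficients f_m of (5) satisfy
# sum_m |f_m|^2 = int |f|^2 dphi; both sides are ~14.32 here.
f = np.exp(np.cos(phi))
f_m = [integrate(np.exp(1j * m * phi) * f) / np.sqrt(2.0 * np.pi)
       for m in range(-40, 41)]
print(sum(abs(c) ** 2 for c in f_m), integrate(np.abs(f) ** 2))
```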
Let us consider the subspace $\Phi$ of $\mathcal{H}$ of vectors
$$|f\rangle = \sum_{m=-\infty}^{\infty} a_m\,|m\rangle \in \mathcal{H}, \qquad a_m\in\mathbb{C}, \qquad (6)$$
such that
$$\langle f|f\rangle_p \equiv \|f\|_p^2 := \sum_{m=-\infty}^{\infty} |a_m|^2\,|m+i|^{2p} < \infty, \qquad (7)$$
for any $p = 0, 1, 2, \dots$ The imaginary unit $i$ has been introduced to have $|m+i| \neq 0$ for all $m \in \mathbb{Z}$. Since $\Phi$ contains all finite linear combinations of the basis vectors $|m\rangle$, it is dense in $\mathcal{H}$. We endow $\Phi$ with the metrizable topology generated by the norms $\|f\|_p$ $(p = 0, 1, 2, \dots)$. In this way we have constructed a RHS: $\Phi \subset \mathcal{H} \subset \Phi^\times$. Considering that the unitary mapping $S$ transports the topologies, we get two RHS,
$$\Phi \subset \mathcal{H} \subset \Phi^\times, \qquad S\Phi \subset L^2[0,2\pi] \subset (S\Phi)^\times,$$
such that $\Phi$ and $\mathcal{H}$ have the discrete basis $\{|m\rangle\}_{m\in\mathbb{Z}}$, while $S\Phi$ and $L^2[0, 2\pi]$ have its equivalent discrete basis $\{\phi_m\}_{m\in\mathbb{Z}}$.

Now we may define continuous bases in both RHS as follows. Since these two RHS are unitarily equivalent, it is enough to construct the continuous basis on the abstract RHS and to induce the equivalent one in the other RHS. Let us consider the abstract RHS $\Phi \subset \mathcal{H} \subset \Phi^\times$. Since $|m\rangle \in \mathcal{H}$, we can consider $\langle m| \in \mathcal{H}^\times = \mathcal{H}$. Then, for any $\phi \in [0, 2\pi)$, we can define a ket $|\phi\rangle$ such that
$$\langle m|\phi\rangle := \frac{1}{\sqrt{2\pi}}\, e^{im\phi}.$$
From the duality relation $\langle\phi|m\rangle = \langle m|\phi\rangle^*$, for any $|f\rangle = \sum_{m=-\infty}^{\infty} a_m|m\rangle \in \Phi$ we get
$$\langle\phi|f\rangle = \sum_{m=-\infty}^{\infty} a_m\,\langle\phi|m\rangle = \frac{1}{\sqrt{2\pi}}\sum_{m=-\infty}^{\infty} a_m\, e^{-im\phi},$$
where $a_m = f_m$ as in (4). The action of $\langle\phi|$ on $\Phi$, i.e. $\langle\phi|f\rangle$, is well defined since the following series is absolutely convergent:
$$\sum_{m=-\infty}^{\infty} |a_m| = \sum_{m=-\infty}^{\infty} \frac{|a_m|\,|m+i|}{|m+i|} \le \sqrt{\sum_{m=-\infty}^{\infty} |a_m|^2|m+i|^2}\ \sqrt{\sum_{m=-\infty}^{\infty} \frac{1}{|m+i|^2}}. \qquad (8)$$
Note that both series on the right converge: the first one because it verifies (7) for $p = 1$, which is obvious for the second series as well. Since $|\langle\phi|f\rangle| \le C\,\|f\|_1$ with
$$\|f\|_1 = \sqrt{\sum_{m=-\infty}^{\infty} |a_m|^2\,|m+i|^2}, \qquad C = \sqrt{\sum_{m=-\infty}^{\infty} \frac{1}{|m+i|^2}},$$
we conclude that $\langle\phi| \in \Phi^\times$. Note that $\langle\phi|f\rangle = \langle f|\phi\rangle^*$, and since $\langle\phi|$ is a linear map on $\Phi$, $|\phi\rangle$ is antilinear.
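The estimate (8) can also be checked numerically. The following sketch is ours; the rapidly decreasing coefficients $a_m$ (random numbers damped by $e^{-|m|}$) are an arbitrary admissible choice.

```python
import numpy as np

rng = np.random.default_rng(1)
m = np.arange(-200, 201)
a = rng.normal(size=m.size) * np.exp(-np.abs(m))   # coefficients of |f> in (6)

w2 = m.astype(float) ** 2 + 1.0                    # |m + i|^2
norm1 = np.sqrt(np.sum(np.abs(a) ** 2 * w2))       # ||f||_1, cf. (7) with p = 1
C = np.sqrt(np.sum(1.0 / w2))                      # the constant C in the text

# <phi|f> = (1/sqrt(2*pi)) * sum_m a_m exp(-i*m*phi); check |<phi|f>| <= C*||f||_1.
for phi in np.linspace(0.0, 2.0 * np.pi, 9):
    bracket = np.sum(a * np.exp(-1j * m * phi)) / np.sqrt(2.0 * np.pi)
    assert abs(bracket) <= C * norm1
print("C =", C, " ||f||_1 =", norm1)
```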
On the other hand, $\{|\phi\rangle\}_{\phi\in[0,2\pi)}$ is a continuous basis. In fact, if we apply the map $S$ to an arbitrary $|f\rangle \in \Phi$ as in (6), we obtain that $S|f\rangle \in S\Phi \subset L^2[0, 2\pi]$ and
$$S|f\rangle = \sum_{m=-\infty}^{\infty} a_m\, S|m\rangle = \sum_{m=-\infty}^{\infty} a_m\,\frac{e^{-im\phi}}{\sqrt{2\pi}} = \langle\phi|f\rangle = f(\phi). \qquad (9)$$
If $|f\rangle, |g\rangle \in \Phi$, then $f(\phi) = S|f\rangle$ and $g(\phi) = S|g\rangle$ belong to $S\Phi \subset L^2[0, 2\pi]$. Thus, due to the unitarity of $S$, we get
$$\langle f|g\rangle = \int_0^{2\pi} f^*(\phi)\, g(\phi)\, d\phi = \int_0^{2\pi} \langle f|\phi\rangle\langle\phi|g\rangle\, d\phi, \qquad (10)$$
and thus
$$I = \int_0^{2\pi} |\phi\rangle\langle\phi|\, d\phi. \qquad (11)$$
Applying this identity to $|f\rangle \in \Phi$, we have
$$I|f\rangle = \int_0^{2\pi} |\phi\rangle\langle\phi|f\rangle\, d\phi = \int_0^{2\pi} f(\phi)\,|\phi\rangle\, d\phi. \qquad (12)$$
This gives a span of $|f\rangle$ in terms of $|\phi\rangle$ with coefficients $f(\phi)$ for all $\phi \in [0, 2\pi)$, which shows that $\{|\phi\rangle\}$ is a continuous basis on $\Phi$, although its elements are not in $\Phi$ but in $\Phi^\times$. Since $\langle\phi|$ acts on $\Phi$ only (not on all of $\mathcal{H}$), for an arbitrary $|g\rangle \in \Phi$ we have
$$\langle g|If\rangle = \int_0^{2\pi} \langle g|\phi\rangle\langle\phi|f\rangle\, d\phi = \langle g|f\rangle.$$
Because of the definition of RHS, to any $|f\rangle \in \Phi$ there corresponds a $\langle f| \in \Phi^\times$, and the action of $\langle f|$ on any $|g\rangle \in \Phi$ is given by the scalar product $\langle f|g\rangle$ of (10) from $L^2[0, 2\pi]$. Thus, $I$ is the canonical injection $I : \Phi \to \Phi^\times$, and it is continuous. Consequently,
$$f(\phi) = \langle\phi|f\rangle = \int_0^{2\pi} \langle\phi|\phi'\rangle\langle\phi'|f\rangle\, d\phi',$$
and therefore
$$\langle\phi|\phi'\rangle = \delta(\phi-\phi'). \qquad (13)$$
Hence the set $\{|\phi\rangle\}$ satisfies the relations of orthogonality (13) and completeness (11) that allow us, once more, to write (12) and to show that $\{|\phi\rangle\}$ is a continuous basis. However, the above formulae are not rigorously correct for elements of $\mathcal{H}$. Effectively, formula (10), and hence all formulae derived from it, including (12), is a consequence of the Gelfand-Maurin theorem [4, 16].

Since this theorem is only valid for $|f\rangle, |g\rangle \in \Phi$, we conclude that, from a strictly rigorous point of view, (12) is only valid for $|f\rangle \in \Phi$. However, we may write formal expressions like
$$|h\rangle = \int_0^{2\pi} \phi\,|\phi\rangle\, d\phi$$
for the function $h(\phi) = \phi \in L^2[0, 2\pi]$, i.e., $|h\rangle \in \mathcal{H}$. But this expression is meaningless from the point of view of the Gelfand-Maurin theorem, since $|h\rangle \notin \Phi$, as one easily checks in the following. Taking into account formulae (4) and (5), we have
$$h_m = \frac{1}{\sqrt{2\pi}}\int_0^{2\pi} \phi\, e^{im\phi}\, d\phi = \begin{cases} \dfrac{i}{\sqrt{2\pi}}\,\dfrac{1}{m}, & m\neq 0,\\[4pt] \sqrt{2\pi}, & m=0,\end{cases}$$
so that $\sum_m |h_m|^2 = \frac{13\pi}{6} < \infty$. However, $\sum_m |h_m|^2\,|m+i|^{2p}$ diverges for $p \ge 1$. This proves that $h(\phi) \equiv \phi$ is not in $S\Phi$ and, hence, $|h\rangle \notin \Phi$.

There are some formal relations between the two bases, $\{|m\rangle\}$ and $\{|\phi\rangle\}$. For instance, replacing $|f\rangle$ by $|m\rangle$ in (12), we have
$$|m\rangle = \int_0^{2\pi} \langle\phi|m\rangle\,|\phi\rangle\, d\phi = \frac{1}{\sqrt{2\pi}}\int_0^{2\pi} e^{-im\phi}\,|\phi\rangle\, d\phi.$$
Since $\{|m\rangle\}$ is a basis in $\mathcal{H}$, the following completeness relation holds:
$$\sum_{m=-\infty}^{\infty} |m\rangle\langle m| = \mathbb{I}, \qquad (14)$$
where $\mathbb{I}$ is the identity on $\mathcal{H}$ (and also on $\Phi$). Do not confuse this identity with the operator $I$ previously defined in (11), which is the canonical injection from $\Phi$ to $\Phi^\times$. Because $|m\rangle \in \Phi$, we may apply to it any element of $\Phi^\times$, so that $\mathbb{I}$ becomes a well-defined identity on the dual $\Phi^\times$:
$$\langle\phi|\mathbb{I} = \langle\phi| = \sum_{m=-\infty}^{\infty} \langle\phi|m\rangle\langle m| = \frac{1}{\sqrt{2\pi}}\sum_{m=-\infty}^{\infty} e^{-im\phi}\,\langle m|,$$
which gives the second formal identity associated with (14). Moreover, due to the absolute convergence of the series
$$\langle\phi|f\rangle = \sum_{m=-\infty}^{\infty} a_m\,\langle\phi|m\rangle = \frac{1}{\sqrt{2\pi}}\sum_{m=-\infty}^{\infty} a_m\, e^{-im\phi},$$
it is easy to prove that $\langle\phi|\mathbb{I}$ converges in the weak topology on $\Phi^\times$.

3.3. Action of SO(2) on the RHS

The Hilbert space $L^2[0, 2\pi]$ also supports the regular representation of SO(2), $R(\theta)$, defined by
$$[R(\theta) f](\phi) := f(\phi-\theta)\ (\mathrm{mod}\ 2\pi), \qquad \forall f\in L^2[0,2\pi],$$
for any $\theta \in [0, 2\pi)$. The unitary map $S : \mathcal{H} \to L^2[0, 2\pi]$ also allows us to transport $R$ to an equivalent representation $\mathcal{R}$ supported on $\mathcal{H}$ by $\mathcal{R}(\theta) = S^{-1}R(\theta)S$, such that $\mathcal{R}(\theta)\Phi = \Phi$ for all $\theta \in [0, 2\pi)$. Since $R(\theta)$ is unitary on $L^2[0, 2\pi]$ for any value of $\theta$, $\mathcal{R}(\theta)$ is also unitary on $\mathcal{H}$, due to the unitarity of $S$. Unitary operators leaving $\Phi$ invariant can be extended to $\Phi^\times$ by the duality formula (1), i.e.,
$$\langle \mathcal{R}(\theta) f|F\rangle = \langle f|\mathcal{R}(-\theta) F\rangle, \qquad \forall |f\rangle\in\Phi,\ \forall F\in\Phi^\times.$$
Therefore,
$$\langle \mathcal{R}(\theta)\phi|f\rangle = [R(-\theta) f](\phi) = f(\phi+\theta) = \langle\phi+\theta|f\rangle.$$
Combining both expressions and dropping the arbitrary $|f\rangle \in \Phi$, we arrive at
$$\langle \mathcal{R}(\theta)\phi| \equiv \langle\phi|\mathcal{R}(\theta) = \langle\phi+\theta|\ (\mathrm{mod}\ 2\pi),$$
which is a rigorous expression in $\Phi^\times$. In fact, let $|f\rangle \in \Phi$ be as in (6). Then we have
$$\mathcal{R}(\theta)\sum_{m} a_m|m\rangle = S^{-1}\sum_{m} a_m\, R(\theta)\,\frac{e^{-im\phi}}{\sqrt{2\pi}} = S^{-1}\sum_{m} a_m\,\frac{e^{-im(\phi-\theta)}}{\sqrt{2\pi}} = S^{-1}\sum_{m} a_m e^{im\theta}\,\frac{e^{-im\phi}}{\sqrt{2\pi}} = \sum_{m} a_m e^{im\theta}\,|m\rangle \in \Phi.$$
Hence we see that $\mathcal{R}(\theta)\Phi \subset \Phi$, and since $\mathcal{R}^{-1}(\theta) = \mathcal{R}(-\theta)$, also $\Phi \subset \mathcal{R}(-\theta)\Phi$. So $\Phi = \mathcal{R}(\theta)\Phi$ for all $\theta$.

The UIR $\mathcal{U}_m$ on $\mathcal{H}$ is given in terms of the $U_m$ defined in (3) by
$$\mathcal{U}_m(\phi) := S^{-1}\, U_m(\phi)\, S, \qquad \forall m\in\mathbb{Z},\ \forall\phi\in[0,2\pi).$$
Let $J$ be the infinitesimal generator associated to this representation, i.e., $\mathcal{U}_m(\phi) = e^{-iJ\phi}$. Since $\mathcal{U}_m$ is unitary, $J$ is self-adjoint. Its action on the vectors $|m\rangle$ is $J|m\rangle = m\,|m\rangle$. Hence, for any $|f\rangle \in \Phi$ as in (6), we have
$$J|f\rangle = \sum_{m=-\infty}^{\infty} a_m\, m\,|m\rangle.$$
For the seminorms $\|Jf\|_p$, $p = 0, 1, 2, \dots$, defined through (7), we obtain the inequality
$$\sum_{m=-\infty}^{\infty} |a_m|^2\, m^2\, |m+i|^{2p} \le \sum_{m=-\infty}^{\infty} |a_m|^2\, |m+i|^{2p+2},$$
which shows that $J\Phi \subset \Phi$. This inequality may also be read as
$$\|Jf\|_p^2 \le \|f\|_{p+1}^2, \qquad \forall |f\rangle\in\Phi,\ \forall p\in\mathbb{N}.$$
Thus, we have proved that $J$ is continuous on $\Phi$.
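Both the seminorm inequality just used and the earlier counterexample $h(\phi) = \phi$ lend themselves to a quick numerical illustration; note that $m^2(m^2+1)^p \le (m^2+1)^{p+1}$ is immediate, since $m^2 \le m^2+1$. The sketch below is ours and simply uses the coefficient values displayed above.

```python
import numpy as np

# Squared Fourier coefficients of h(phi) = phi as displayed above:
# |h_m|^2 = 1/(2*pi*m^2) for m != 0 and |h_0|^2 = 2*pi.
m = np.arange(1, 10 ** 6 + 1).astype(float)
h2 = 1.0 / (2.0 * np.pi * m ** 2)

p0 = 2.0 * np.pi + 2.0 * np.sum(h2)          # p = 0 seminorm squared
print(p0, 13.0 * np.pi / 6.0)                # both ~6.8068: h lies in H

# With the p = 1 weight |m + i|^2 = m^2 + 1 the terms approach the constant
# 1/pi, so the partial sums grow linearly in the cutoff: h does not lie in Phi.
for cut in (10 ** 2, 10 ** 4, 10 ** 6):
    print(cut, 2.0 * np.pi + 2.0 * np.sum(h2[:cut] * (m[:cut] ** 2 + 1.0)))
```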
Moreover, since the self-adjoint operator $J$ verifies $J\Phi \subset \Phi$, it can be extended to $\Phi^\times$ using the duality form, i.e.,
$$\langle Jf|F\rangle = \langle f|JF\rangle, \qquad \forall |f\rangle\in\Phi,\ \forall F\in\Phi^\times.$$
Furthermore, since $J$ is continuous on $\Phi$, this extension is weakly continuous (continuous with respect to the weak topology) on $\Phi^\times$. In fact, since the series
$$\mathbb{I}|\phi\rangle = |\phi\rangle = \sum_{m=-\infty}^{\infty} |m\rangle\langle m|\phi\rangle = \frac{1}{\sqrt{2\pi}}\sum_{m=-\infty}^{\infty} e^{im\phi}\,|m\rangle$$
is weakly convergent, we have
$$J|\phi\rangle = \frac{1}{\sqrt{2\pi}}\sum_{m=-\infty}^{\infty} e^{im\phi}\, J|m\rangle = \frac{1}{\sqrt{2\pi}}\sum_{m=-\infty}^{\infty} e^{im\phi}\, m\,|m\rangle = -i D_\phi\,|\phi\rangle,$$
where the operator $D_\phi$ is defined as follows: for any $|f\rangle \in \Phi$ we know that $S|f\rangle = f(\phi) \in S\Phi$, as in (9). Then
$$i\frac{d}{d\phi}\, f(\phi) = i\frac{d}{d\phi}\sum_{m=-\infty}^{\infty} a_m\,\frac{e^{-im\phi}}{\sqrt{2\pi}} = \sum_{m=-\infty}^{\infty} a_m\, m\,\frac{e^{-im\phi}}{\sqrt{2\pi}}. \qquad (15)$$
We easily conclude that the operator $i\,d/d\phi$ is continuous on $S\Phi$ with the topology transported by $S$ from $\Phi$ (the norms on $S\Phi$ look exactly like the norms on $\Phi$). Hence
$$-i D_\phi := S^{-1}\, i\frac{d}{d\phi}\, S.$$
This definition implies that $-iD_\phi$ is continuous on $\Phi$. Moreover, it is self-adjoint on $\mathcal{H}$, so that it can be extended to a weakly continuous operator on $\Phi^\times$, as the last identity in (15) shows. Therefore, on $\Phi^\times$,
$$J \equiv -i D_\phi.$$

4. Conclusions

We have constructed two RHS that support the UIR of the Lie group SO(2), $\Phi \subset \mathcal{H} \subset \Phi^\times$ and $S\Phi \subset L^2[0, 2\pi] \subset (S\Phi)^\times$. The first one is related to the discrete basis $\{|m\rangle\}$ and is, in some sense, an abstract RHS; the second one, related to the continuous basis $\{|\phi\rangle\}$, is obtained by means of the unitary map $S : |m\rangle \to e^{-im\phi}/\sqrt{2\pi}$, which allows us to translate the topologies of the first RHS, as well as all its properties, to the second one.

Another interesting point to stress is the fact that RHS, on one side, and Lie algebras and universal enveloping algebras, on the other, are closely related. This means that, starting from a Lie algebra, we can construct a RHS that supports it in such a way that generators and universal enveloping elements can be represented by operators in the RHS while avoiding domain difficulties [5]. Vice versa, a RHS itself contains the symmetries that allow one to construct its related algebraic structures.

Acknowledgements
Partial financial support is acknowledged from the Spanish Junta de Castilla y León and FEDER (project VA057U16) and MINECO (project MTM2014-57129-C2-1-P).

References
[1] E. Celeghini, M. A. del Olmo. Coherent orthogonal polynomials. Ann. Phys. (NY) 335, 78–85 (2013). http://dx.doi.org/10.1016/j.aop.2013.04.017
[2] E. Celeghini, M. A. del Olmo. Algebraic special functions and SO(3,2). Ann. Phys. (NY) 333, 90–103 (2013). http://dx.doi.org/10.1016/j.aop.2013.02.010
[3] E. Celeghini, M. A. del Olmo, M. A. Velasco. Lie groups, algebraic special functions and Jacobi polynomials. J. Phys.: Conf. Ser. 597, 012023 (2015). doi:10.1088/1742-6596/597/1/012023
[4] I. M. Gelfand, N. Ya. Vilenkin. Generalized Functions: Applications to Harmonic Analysis. Academic, New York, 1964.
[5] J. E. Roberts. Rigged Hilbert spaces in quantum mechanics. Comm. Math. Phys. 3, 98–119 (1966). https://doi.org/10.1007/bf01645448
[6] J. P. Antoine. Dirac formalism and symmetry problems in quantum mechanics. I. General Dirac formalism. J. Math. Phys. 10, 53–69 (1969). https://doi.org/10.1063/1.1664761
[7] O. Melsheimer. Rigged Hilbert space formalism as an extended mathematical formalism for quantum systems. I. General theory. J. Math. Phys. 15, 902–916 (1974). https://doi.org/10.1063/1.1666769
[8] A. Bohm. The Rigged Hilbert Space and Quantum Mechanics. Springer Lecture Notes in Physics 78, Springer, Berlin, 1978.
[9] A. Bohm, M. Gadella. Dirac Kets, Gamow Vectors and Gelfand Triplets. Springer Lecture Notes in Physics 348, Springer, Berlin, 1989.
[10] M. Gadella, F. Gómez. A unified mathematical formalism for the Dirac formulation of quantum mechanics. Found. Phys. 32, 815–869 (2002). https://doi.org/10.1023/a:1016069311589
[11] E. Celeghini, M. A. del Olmo. Quantum physics and signal processing in rigged Hilbert spaces by means of special functions, Lie algebras and Fourier-like transforms. J. Phys.: Conf. Ser. 597, 012022 (2015). https://doi.org/10.1088/1742-6596/597/1/012022
[12] E. Celeghini, M. Gadella, M. A. del Olmo. Applications of rigged Hilbert spaces in quantum mechanics and signal processing. J. Math. Phys. 57, 072105 (2016). http://dx.doi.org/10.1063/1.4958725
[13] Wu-Ki Tung. Group Theory in Physics, Chap. 6. World Scientific, Singapore, 1985.
[14] K. Maurin. Bull. Acad. Polon. Sci. Ser. Sci. Math. Astronom. Phys. 7, 471 (1959).
[15] M. Gadella, F. Gómez. Eigenfunction expansions and transformation theory. Acta Appl. Math. 109, 721–742 (2010). https://doi.org/10.1007/s10440-008-9342-z
[16] K. Maurin. General Eigenfunction Expansions and Unitary Representations of Topological Groups. Monografie Matematyczne 48, PWN-Polish Scientific Publishers, Warsaw, 1968.
[17] M. Gadella, F. Gómez-Cubillo, L. Rodríguez, S. Wickramasekara. Point-form dynamics of quasistable states. J. Math. Phys. 54, 072303 (2013). https://doi.org/10.1063/1.4811563
[18] P. A. M. Dirac. The Principles of Quantum Mechanics. Clarendon Press, Oxford, 1958.
[19] A. R. Bohm, M. Gadella, P. Kielanowski. Time asymmetric quantum mechanics. SIGMA 7, 086 (2011). https://doi.org/10.3842/sigma.2011.086
[20] J. Horvath. Topological Vector Spaces and Distributions. Addison-Wesley, Reading, MA, 1966.
[21] A. Pietsch. Nuclear Topological Vector Spaces. Springer, Berlin, 1972.
[22] M. Reed, B. Simon. Functional Analysis. Academic, New York, 1972.

Acta Polytechnica 57(6):446–453, 2017, doi:10.14311/AP.2017.57.0446
© Czech Technical University in Prague, 2017, available online at http://ojs.cvut.cz/ojs/index.php/ap

Algebraic Description of Shape Invariance Revisited

Satoshi Ohya
Institute of Quantum Science, Nihon University, Kanda-Surugadai 1-8-14, Chiyoda, Tokyo 101-8308, Japan
Correspondence: ohya@phys.cst.nihon-u.ac.jp

Abstract. We revisit the algebraic description of the shape invariance method in one-dimensional quantum mechanics. In this note we focus on four particular examples: the Kepler problem in flat space, the Kepler problem in spherical space, the Kepler problem in hyperbolic space, and the Rosen–Morse potential problem. Following the prescription given by Gangopadhyaya et al., we first introduce certain nonlinear algebraic systems. We then show that, if the model parameters are appropriately quantized, the bound-state problems can be solved solely by means of representation theory.

Keywords: exactly solvable models; shape invariance; representation theory.

1. Introduction

The purpose of this note is to revisit a couple of one-dimensional quantum-mechanical bound-state problems that can be solved exactly.
In this note we shall focus on four particular examples: the Kepler problem in flat space, the Kepler problem in spherical space [1–3], the Kepler problem in hyperbolic space [4, 5], and the Rosen–Morse potential problem [6, 7], all of whose bound-state spectra are known to be exactly calculable. The Hamiltonians of these problems¹ are respectively given by
$$H_{\text{Kepler}} = -\frac{d^2}{dx^2} + \frac{j(j-1)}{x^2} - \frac{2g}{x},$$
$$H_{\text{spherical Kepler}} = -\frac{d^2}{dx^2} + \frac{j(j-1)}{\sin^2 x} - 2g\cot x,$$
$$H_{\text{hyperbolic Kepler}} = -\frac{d^2}{dx^2} + \frac{j(j-1)}{\sinh^2 x} - 2g\coth x,$$
$$H_{\text{Rosen–Morse}} = -\frac{d^2}{dx^2} - \frac{j(j-1)}{\cosh^2 x} - 2g\tanh x, \qquad (1.1)$$
where $j$ and $g$ are real parameters. The potential energies and bound-state spectra are depicted in Figure 1.

There exist several methods for solving the eigenvalue problems of the Hamiltonians (1.1). Among them is the shape invariance method [9],² which is based on the factorization of the Hamiltonian and the Darboux transformation. As discussed by Gangopadhyaya et al. [11] (see also the reviews [12, 13]), shape invariance can always be translated into a (Lie-)algebraic description, the so-called potential algebra.³ The spectral problem can then be solved by means of representation theory. However, as far as we have noticed, the representation theory of the potential algebra has not been fully analyzed yet. In particular, the spectral problems of the above Hamiltonians have not been solved in terms of the potential algebra. The purpose of this note is to fill this gap. As we will see below, these very old spectral problems require us to introduce rather nontrivial nonlinear algebraic systems. The goal of this note is to show that these bound-state problems can be solved by the representation theory of operators $\{J_3, J_+, J_-\}$ that satisfy the linear commutation relations between $J_3$ and $J_\pm$,
$$[J_3, J_\pm] = \pm J_\pm, \qquad (1.2)$$
and the nonlinear commutation relations between $J_+$ and $J_-$,
$$\text{(Kepler)}\qquad [J_+, J_-] = -\frac{g^2}{J_3^2} + \frac{g^2}{(J_3-1)^2},$$
$$\text{(spherical Kepler)}\qquad [J_+, J_-] = J_3^2 - \frac{g^2}{J_3^2} - (J_3-1)^2 + \frac{g^2}{(J_3-1)^2},$$
$$\text{(hyperbolic Kepler \& Rosen–Morse)}\qquad [J_+, J_-] = -J_3^2 - \frac{g^2}{J_3^2} + (J_3-1)^2 + \frac{g^2}{(J_3-1)^2}. \qquad (1.3)$$
We will see that, if $j$ is a half-integer, the bound-state problems of (1.1) can be solved from these operators.

The rest of the note is organized as follows: in Section 2 we introduce the potential algebra for the Kepler problem in flat space and solve the spectral problem by means of representation theory. In Sections 3 and 4 we generalize to the other problems. We shall see that the bound-state spectra of the hyperbolic Kepler and Rosen–Morse Hamiltonians correspond to two distinct representations of the same algebraic system. We conclude in Section 5.

¹These names for the Hamiltonians, though not so popular nowadays, are borrowed (with slight modifications) from Infeld and Hull [8]. Notice that they are different from those commonly used in the supersymmetric quantum mechanics literature [9].
²Recently it has been demonstrated that a spectral intertwining relation provides yet another scheme for solving the eigenvalue problems of $H_{\text{Kepler}}$, $H_{\text{spherical Kepler}}$, and $H_{\text{hyperbolic Kepler}}$ [10].
³A similar algebraic description of shape invariance has also been discussed by Balantekin [14].

Figure 1. Potential energies (thick solid curves) and discrete energy levels (blue lines). Panels: (a) Kepler; (b) spherical Kepler; (c) hyperbolic Kepler; (d) Rosen–Morse.
2. Kepler

Let us start with the Kepler problem in flat space. As is well known, the Kepler Hamiltonian $H_{\text{Kepler}}$ in (1.1) can be factorized as follows:
$$H_{\text{Kepler}} = A_-(j)\, A_+(j) - \frac{g^2}{j^2}, \qquad (2.1)$$
where $A_\pm(j)$ are the first-order differential operators
$$A_\pm(j) = \pm\frac{d}{dx} - \frac{j}{x} + \frac{g}{j}. \qquad (2.2)$$
Let us next introduce the potential algebra of this system. Following [11] with slight modifications, we first introduce an auxiliary periodic variable $\theta \in [0, 2\pi)$, then upgrade the parameter $j$ to an operator $J_3 = -i\partial_\theta$, and then replace $A_+(j)$ and $A_-(j)$ by $J_+ = e^{i\theta} A_+(J_3)$ and $J_- = A_-(J_3)\, e^{-i\theta}$. The resultant operators that we wish to study are thus:
$$J_3 = -i\partial_\theta, \qquad J_+ = e^{i\theta}\Big(\partial_x - \frac{J_3}{x} + \frac{g}{J_3}\Big), \qquad J_- = \Big(-\partial_x - \frac{J_3}{x} + \frac{g}{J_3}\Big)\, e^{-i\theta}. \qquad (2.3)$$
Here one may wonder about the meaning of $1/J_3$. The operator $1/J_3$ could be defined via the spectral decomposition $1/J_3 = \sum_j (1/j)\, P_j$, where $P_j$ stands for the projection operator onto the eigenspace of $J_3$ with eigenvalue $j$. This definition is well-defined unless the spectrum of $J_3$ contains $j = 0$. An alternative way to give a meaning to $1/J_3$ would be the (formal) power series
$$\frac{1}{J_3} = \frac{1}{\lambda}\,\frac{1}{1-(1-J_3/\lambda)} = \frac{1}{\lambda}\sum_{n=0}^{\infty}\Big(1-\frac{J_3}{\lambda}\Big)^n,$$
where $\lambda$ is an arbitrary constant. This expression would be well-defined if the operator norm of $1 - J_3/\lambda$ satisfies $\|1 - J_3/\lambda\| < 1$. For the moment, however, we proceed at the formal level.

It is not difficult to show that the operators (2.3) satisfy the following commutation relations:
$$[J_3, J_\pm] = \pm J_\pm, \qquad [J_+, J_-] = -\frac{g^2}{J_3^2} + \frac{g^2}{(J_3-1)^2}, \qquad (2.4)$$
which follow from $e^{\mp i\theta} J_3\, e^{\pm i\theta} = J_3 \pm 1$, or $J_3\, e^{\pm i\theta} = e^{\pm i\theta}(J_3 \pm 1)$. It is also easy to check that the invariant operator of this algebraic system is given by
$$H = J_- J_+ - \frac{g^2}{J_3^2} = J_+ J_- - \frac{g^2}{(J_3-1)^2} = -\partial_x^2 + \frac{J_3(J_3-1)}{x^2} - \frac{2g}{x}, \qquad (2.5)$$
which commutes with $J_\pm$ and $J_3$.⁴ Notice that if $g = 0$ the commutation relations (2.4) are just those of the Lie algebra iso(2) of the two-dimensional Euclidean group. In this case the invariant operator $H$ is nothing but the Casimir operator of the Lie algebra iso(2).

⁴The commutation relation $[H, J_3] = 0$ is trivial. In order to prove $[H, J_\pm] = 0$, one should first note that $HJ_+ - J_+H = g^2\big(J_+\frac{1}{J_3^2} - \frac{1}{(J_3-1)^2}J_+\big)$ and $HJ_- - J_-H = g^2\big(J_-\frac{1}{(J_3-1)^2} - \frac{1}{J_3^2}J_-\big)$, which follow from (2.5). Then, by using (2.3) and $e^{-i\theta}\frac{1}{(J_3-1)^2}\,e^{i\theta} = \frac{1}{J_3^2}$, one arrives at $[H, J_\pm] = 0$.

Now, let $|E, j\rangle$ be a simultaneous eigenstate of $H$ and $J_3$ that satisfies the eigenvalue equations
$$H|E,j\rangle = E|E,j\rangle, \qquad J_3|E,j\rangle = j|E,j\rangle, \qquad (2.6)$$
and the normalization condition $\||E,j\rangle\| = 1$. We wish to find the possible values of $E$ and $j$. To this end, let us next consider the states $J_\pm|E,j\rangle$. As usual, the commutation relations (2.4) lead to $J_3 J_\pm|E,j\rangle = (j \pm 1)\, J_\pm|E,j\rangle$, which implies that $J_\pm$ raise and lower the eigenvalue $j$ by $\pm 1$:
$$J_\pm|E,j\rangle \propto |E,j\pm 1\rangle. \qquad (2.7)$$
The proportionality coefficients are determined by calculating the norms $\|J_\pm|E,j\rangle\|$. By using the relations $\|J_\pm|E,j\rangle\|^2 = \langle E,j|J_\mp J_\pm|E,j\rangle$, $J_-J_+ = H + g^2/J_3^2$, and $J_+J_- = H + g^2/(J_3-1)^2$, we get
$$\|J_+|E,j\rangle\|^2 = E + \frac{g^2}{j^2} \ge 0, \qquad \|J_-|E,j\rangle\|^2 = E + \frac{g^2}{(j-1)^2} \ge 0. \qquad (2.8)$$
These equations not only fix the proportionality coefficients in (2.7) but also provide nontrivial constraints on $E$ and $j$. In fact, together with the ladder equations (2.7), the conditions (2.8) completely fix the possible values of $E$ and $j$.
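Before turning to the representation-theoretic argument, it may be reassuring to verify (2.4) and (2.5) symbolically. The following SymPy sketch is our addition; it checks the action on the $\theta$-eigenspace with $J_3$-eigenvalue $j$, on which $J_-J_+$ acts as $A_-(j)A_+(j)$ and $J_+J_-$ as $A_+(j-1)A_-(j-1)$.

```python
import sympy as sp

x, j, g = sp.symbols('x j g', positive=True)
f = sp.Function('f')(x)

def A_plus(jv, u):                        # A_+(j) of (2.2) acting on u(x)
    return sp.diff(u, x) - (jv / x) * u + (g / jv) * u

def A_minus(jv, u):                       # A_-(j) of (2.2) acting on u(x)
    return -sp.diff(u, x) - (jv / x) * u + (g / jv) * u

# [J_+, J_-] on the eigenspace: A_+(j-1)A_-(j-1) - A_-(j)A_+(j), cf. (2.4).
comm = A_plus(j - 1, A_minus(j - 1, f)) - A_minus(j, A_plus(j, f))
print(sp.simplify(comm - (-g**2 / j**2 + g**2 / (j - 1)**2) * f))    # 0

# Invariant operator (2.5): A_-(j)A_+(j) - g^2/j^2 equals H_Kepler.
H = A_minus(j, A_plus(j, f)) - (g**2 / j**2) * f
H_exp = -sp.diff(f, x, 2) + (j * (j - 1) / x**2) * f - (2 * g / x) * f
print(sp.simplify(H - H_exp))                                         # 0
```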
To see this, let us consider a negative-energy state $|E,j\rangle$ corresponding to an arbitrary point in the lower half of the $(E,j)$-plane. By applying the ladder operators $J_\pm$ to the state $|E,j\rangle$, one can easily see that such an arbitrary point eventually falls into the region in which the squared norms become negative. (The figure accompanying this argument in the original shows the $(E,j)$-plane with the curves $E = -g^2/j^2$ and $E = -g^2/(j-1)^2$ bounding the region where $\|J_\pm|E,j\rangle\|^2 < 0$.) The only way to avoid this is to terminate the sequence $\{\cdots, |E,j-1\rangle, |E,j\rangle, |E,j+1\rangle, \cdots\}$ from both above and below. This is possible if and only if there exist both a highest and a lowest weight state, $|E,j_{\max}\rangle$ and $|E,j_{\min}\rangle$, in the sequence, such that $J_+|E,j_{\max}\rangle = 0 = J_-|E,j_{\min}\rangle$, $-g^2/j_{\max}^2 = -g^2/(j_{\min}-1)^2$, $j_{\max} - j_{\min} \in \mathbb{Z}_{\ge 0}$, and $j_{\max} \ge 1/2$ and $j_{\min} \le 1/2$. It is not difficult to see that these conditions are fulfilled if and only if the eigenvalue of the invariant operator takes the value $E = -g^2/\nu^2$, $\nu \in \{\frac12, 1, \frac32, 2, \cdots\}$. With this $\nu$ the eigenvalues of $J_3$ take the values $\{j_{\max} = \nu, \nu-1, \cdots, 2-\nu, j_{\min} = 1-\nu\}$. Note, however, that if $\nu$ is an integer, the spectrum of $J_3$ contains $j = 0$, which makes the operator $1/J_3$ ill-defined. Thus we should disregard this case. To summarize, the representation of the potential algebra is specified by a half-integer $\nu \in \{\frac12, \frac32, \cdots\}$, and the representation space is spanned by the following $2\nu$ vectors:
$$\Big\{\,|E,j\rangle\ :\ E = -\frac{g^2}{\nu^2}\ \text{and}\ j\in\{\nu,\nu-1,\cdots,1-\nu\}\,\Big\}. \qquad (2.9)$$
These $2\nu$-dimensional representations are schematically depicted in Figure 2(a).

Figure 2. Representations of the potential algebras. Gray shaded regions are the domains in which the squared norms $\|J_\pm|E,j\rangle\|^2$ become negative. Red circles represent the finite-dimensional representations, whereas blue circles represent the infinite-dimensional representations. Right and left arrows indicate the actions of the ladder operators $J_+$ and $J_-$, respectively. Panels: (a) Kepler; (b) spherical Kepler; (c) hyperbolic Kepler & Rosen–Morse, case $g > 1/4$.

Now it is straightforward to solve the original spectral problem of the Kepler Hamiltonian $H_{\text{Kepler}}$. To this end, let $j \in \{\pm\frac12, \pm\frac32, \cdots\}$ be fixed. Since the Hamiltonian is invariant under $j \to 1-j$, without any loss of generality we can focus on the case $j \in \{\frac12, \frac32, \cdots\}$. The discrete energy eigenvalues then read
$$E_n = -\frac{g^2}{(j+n)^2}, \qquad n\in\{0,1,\cdots\}. \qquad (2.10)$$
The energy eigenfunction $\psi_{E_n,j}(x)$ that satisfies the Schrödinger equation $H_{\text{Kepler}}\,\psi_{E_n,j} = E_n\,\psi_{E_n,j}$ can be determined from the formula $|E_n,j\rangle \propto (J_-)^n\,|E_n,j+n\rangle$. Noting that $|E,j\rangle$ corresponds to the function $\psi_{E,j}(x)\, e^{ij\theta}$ and that $J_-$ is given by $J_- = A_-(J_3)\, e^{-i\theta}$, we get the following Rodrigues-like formula:
$$\psi_{E_n,j}(x) \propto A_-(j)\, A_-(j+1)\cdots A_-(j+n-1)\,\psi_{E_n,j+n}(x), \qquad (2.11)$$
where $\psi_{E_n,j+n}(x)$ is a solution to the first-order differential equation $A_+(j+n)\,\psi_{E_n,j+n}(x) = 0$ and is given by $\psi_{E_n,j+n}(x) \propto x^{j+n}\exp\big(-\frac{g}{j+n}x\big)$. All of this exactly coincides with the well-known results.
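The spectrum (2.10) can also be checked against a brute-force discretization of $H_{\text{Kepler}}$. The minimal sketch below is ours; $j = 3/2$ and $g = 1$ are arbitrary sample values, and the grid parameters are purely illustrative. With these choices the lowest finite-difference eigenvalues should reproduce $-4/9$, $-4/25$ and $-4/49$ to roughly three decimal places.

```python
import numpy as np
from scipy.linalg import eigh_tridiagonal

j, g = 1.5, 1.0                            # sample half-integer j and coupling g
N, L = 6000, 150.0                         # grid points and box size
x = np.arange(1, N + 1) * (L / (N + 1))    # interior points, Dirichlet walls
h = L / (N + 1)

V = j * (j - 1) / x**2 - 2.0 * g / x
E = eigh_tridiagonal(2.0 / h**2 + V, -np.ones(N - 1) / h**2,
                     eigvals_only=True, select='i', select_range=(0, 2))
print(E)                                        # lowest three numerical levels
print([-g**2 / (j + n)**2 for n in range(3)])   # (2.10): -4/9, -4/25, -4/49
```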
In the rest of the note we apply the same idea to the spectral problems for the spherical Kepler, hyperbolic Kepler, and Rosen–Morse Hamiltonians. We shall first introduce the potential algebras, then classify their representations, and then solve the bound-state problems. As we will see below, the spherical Kepler problem is rather straightforward, but the hyperbolic Kepler and Rosen–Morse potential problems are more intriguing and require careful analysis.

3. Spherical Kepler

Let us next move on to the spherical Kepler problem [1–3], whose Hamiltonian is factorized as follows:
$$H_{\text{spherical Kepler}} = A_-(j)\, A_+(j) + j^2 - \frac{g^2}{j^2}, \qquad (3.1)$$
where
$$A_\pm(j) = \pm\frac{d}{dx} - j\cot x + \frac{g}{j}. \qquad (3.2)$$
Just as in the previous section, let us introduce the following operators:
$$J_3 = -i\partial_\theta, \qquad J_+ = e^{i\theta}\Big(\partial_x - \cot x\, J_3 + \frac{g}{J_3}\Big), \qquad J_- = \Big(-\partial_x - \cot x\, J_3 + \frac{g}{J_3}\Big)\, e^{-i\theta}, \qquad (3.3)$$
which satisfy the following commutation relations:
$$[J_3, J_\pm] = \pm J_\pm, \qquad [J_+, J_-] = J_3^2 - \frac{g^2}{J_3^2} - (J_3-1)^2 + \frac{g^2}{(J_3-1)^2}. \qquad (3.4)$$
The invariant operator that commutes with $J_3$ and $J_\pm$ is given by
$$H = J_- J_+ + J_3^2 - \frac{g^2}{J_3^2} = J_+ J_- + (J_3-1)^2 - \frac{g^2}{(J_3-1)^2} = -\partial_x^2 + \frac{J_3(J_3-1)}{\sin^2 x} - 2g\cot x. \qquad (3.5)$$
It should be noted that, if $g = 0$, (3.4) reduces to the standard commutation relations of the Lie algebra so(3) under the appropriate shift $J_3 \to J_3 + 1/2$. In this case the invariant operator $H$ is nothing but the Casimir operator of so(3), and it provides a well-known example of the interplay between shape invariance and Lie algebras; see, e.g., the review [12].

Now, let $|E,j\rangle$ be a simultaneous eigenstate of $H$ and $J_3$ that satisfies the eigenvalue equations
$$H|E,j\rangle = E|E,j\rangle, \qquad J_3|E,j\rangle = j|E,j\rangle, \qquad (3.6)$$
as well as the normalization condition $\||E,j\rangle\| = 1$. Then we have the following conditions:
$$\|J_+|E,j\rangle\|^2 = E - j^2 + \frac{g^2}{j^2} \ge 0, \qquad \|J_-|E,j\rangle\|^2 = E - (j-1)^2 + \frac{g^2}{(j-1)^2} \ge 0, \qquad (3.7)$$
which, together with the ladder equations $J_\pm|E,j\rangle \propto |E,j\pm 1\rangle$, restrict the possible values of $E$ and $j$. As discussed in the previous section, these conditions are compatible with each other if and only if the eigenvalue of the invariant operator takes the value $E = \nu^2 - g^2/\nu^2$, $\nu \in \{\frac12, \frac32, \cdots\}$. Now let $\nu \in \{\frac12, \frac32, \cdots\}$ be fixed. Then the representation space is spanned by the following $2\nu$ vectors:
$$\Big\{\,|E,j\rangle\ :\ E = \nu^2 - \frac{g^2}{\nu^2}\ \text{and}\ j\in\{\nu,\nu-1,\cdots,1-\nu\}\,\Big\}. \qquad (3.8)$$
These $2\nu$-dimensional representations are schematically depicted in Figure 2(b).

Now it is easy to find the spectrum of the original Hamiltonian $H_{\text{spherical Kepler}}$. For fixed $j \in \{\frac12, \frac32, \cdots\}$ the energy eigenvalues and eigenfunctions read
$$E_n = (j+n)^2 - \frac{g^2}{(j+n)^2}, \qquad n\in\{0,1,\cdots\}, \qquad (3.9)$$
and
$$\psi_{E_n,j}(x) \propto A_-(j)\, A_-(j+1)\cdots A_-(j+n-1)\,\psi_{E_n,j+n}(x), \qquad (3.10)$$
where $\psi_{E_n,j+n}(x) \propto (\sin x)^{j+n}\exp\big(-\frac{g}{j+n}x\big)$. We note that (3.9) and (3.10) are consistent with the known results [1–3].

4. Hyperbolic Kepler & Rosen–Morse

Let us finally move on to the study of the potential algebras for the hyperbolic Kepler and Rosen–Morse Hamiltonians. We shall see that the bound-state spectra of these problems correspond to two distinct representations of a single algebraic system.

4.1. Hyperbolic Kepler

The Hamiltonian for the hyperbolic Kepler problem [4, 5] can be factorized as follows:
$$H_{\text{hyperbolic Kepler}} = A_-(j)\, A_+(j) - j^2 - \frac{g^2}{j^2}, \qquad (4.1)$$
where
$$A_\pm(j) = \pm\frac{d}{dx} - j\coth x + \frac{g}{j}. \qquad (4.2)$$
We then introduce the following operators:
$$J_3 = -i\partial_\theta, \qquad J_+ = e^{i\theta}\Big(\partial_x - \coth x\, J_3 + \frac{g}{J_3}\Big), \qquad J_- = \Big(-\partial_x - \coth x\, J_3 + \frac{g}{J_3}\Big)\, e^{-i\theta}, \qquad (4.3)$$
which satisfy the following commutation relations:
$$[J_3, J_\pm] = \pm J_\pm, \qquad [J_+, J_-] = -J_3^2 - \frac{g^2}{J_3^2} + (J_3-1)^2 + \frac{g^2}{(J_3-1)^2}. \qquad (4.4)$$
The invariant operator is given by
$$H = J_- J_+ - J_3^2 - \frac{g^2}{J_3^2} = J_+ J_- - (J_3-1)^2 - \frac{g^2}{(J_3-1)^2} = -\partial_x^2 + \frac{J_3(J_3-1)}{\sinh^2 x} - 2g\coth x. \qquad (4.5)$$
We note that, if $g = 0$, (4.4) reduces to the standard commutation relations of the Lie algebra so(2,1) under the shift $J_3 \to J_3 + 1/2$. In other words, the operators (4.3) provide one of the differential realizations of so(2,1) if $g = 0$ and $J_3 \to J_3 + 1/2$.
Unfortunately, however, this Lie-algebraic structure is less useful in the present problem, because the invariant operator (4.5) does not possess discrete eigenvalues if $g = 0$ and $J_3$ has real eigenvalues. As we will see shortly, this situation changes if $g$ is non-vanishing.

Now, let $|E,j\rangle$ be a simultaneous eigenstate of $H$ and $J_3$:
$$H|E,j\rangle = E|E,j\rangle, \qquad J_3|E,j\rangle = j|E,j\rangle. \qquad (4.6)$$
Then, under the normalization condition $\||E,j\rangle\| = 1$, the squared norms $\|J_\pm|E,j\rangle\|^2$ are evaluated as follows:
$$\|J_+|E,j\rangle\|^2 = E + j^2 + \frac{g^2}{j^2} \ge 0, \qquad \|J_-|E,j\rangle\|^2 = E + (j-1)^2 + \frac{g^2}{(j-1)^2} \ge 0. \qquad (4.7)$$
These conditions are enough to classify the representations. In contrast to the previous two examples, there are several nontrivial representations, depending on the range of $j$. For $g > 1/4$ we have the following three distinct representations (see Figure 2(c)):

• Case $j \in (-\infty, -\sqrt{g})$: infinite-dimensional representation. Let $\nu \in (-\infty, -\sqrt{g})$ be fixed. Then the representation space is spanned by the following infinitely many vectors:
$$\Big\{\,|E,j\rangle\ :\ E = -\nu^2 - \frac{g^2}{\nu^2}\ \text{and}\ j\in\{\nu,\nu-1,\cdots\}\,\Big\}. \qquad (4.8)$$
We emphasize that in this case the parameter $\nu \in (-\infty, -\sqrt{g})$ is not necessarily restricted to an integer or half-integer. This is a one-parameter family of infinite-dimensional representations of the algebraic system $\{J_3, J_+, J_-\}$.

• Case $j \in (1-\sqrt{g}, \sqrt{g})$: finite-dimensional representation. Let $\nu \in \{\frac12, \frac32, \cdots, \nu_{\max}\}$ be fixed, where $\nu_{\max}$ is the maximal half-integer smaller than $\sqrt{g}$, i.e., $\nu_{\max} = \max\{\nu \in \frac12\mathbb{N} : \nu < \sqrt{g}\}$. Then the representation space is spanned by the following $2\nu$ vectors:
$$\Big\{\,|E,j\rangle\ :\ E = -\nu^2 - \frac{g^2}{\nu^2}\ \text{and}\ j\in\{\nu,\nu-1,\cdots,1-\nu\}\,\Big\}. \qquad (4.9)$$
This is a $2\nu$-dimensional representation of the algebraic system $\{J_3, J_+, J_-\}$.

• Case $j \in (1+\sqrt{g}, \infty)$: infinite-dimensional representation. Let $\nu \in (1+\sqrt{g}, \infty)$ be fixed. Then the representation space is spanned by the following infinitely many vectors:
$$\Big\{\,|E,j\rangle\ :\ E = -(\nu-1)^2 - \frac{g^2}{(\nu-1)^2}\ \text{and}\ j\in\{\nu,\nu+1,\cdots\}\,\Big\}. \qquad (4.10)$$
Note that $\nu \in (1+\sqrt{g}, \infty)$ is a continuous parameter and need not be an integer or half-integer. This is another one-parameter family of infinite-dimensional representations of the algebraic system $\{J_3, J_+, J_-\}$.

One may notice that the region $[-\sqrt{g}, 1-\sqrt{g}] \cup [\sqrt{g}, 1+\sqrt{g}]$ is excluded from the above classification. This is because there is no bound state in this region for either the hyperbolic Kepler or the Rosen–Morse potential problem. We note that the finite-dimensional representation (4.9) disappears for $g \le 1/4$, whereas the infinite-dimensional representations (4.8) and (4.10) remain present for $g \le 1/4$.

Now we have classified the representations of the potential algebra. The next task is to understand which representations are realized in the hyperbolic Kepler problem. To see this, let us consider the potential $V(x) = j(j-1)/\sinh^2 x - 2g\coth x$. In order to have a bound state, it is necessary that $V(x)$ has a minimum on the half line.⁵ This is achieved if and only if $j$ lies in the range $\big(\frac12 - \sqrt{g+\frac14}, \frac12 + \sqrt{g+\frac14}\big)$, which includes $(1-\sqrt{g}, \sqrt{g})$; see Figure 2(c). Hence the bound-state spectrum should be related to the finite-dimensional representation (4.9).

⁵This is, of course, not a sufficient condition.
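Since the finite-dimensional representations (4.9) carry the whole bound-state spectrum, the bookkeeping is easily scripted. The sketch below is ours; $g = 9$ is an arbitrary sample value with $g > 1/4$, for which $\sqrt{g} = 3$ and $\nu_{\max} = 5/2$. For $j = 3/2$ it reproduces, via $\nu = j + n$, exactly the two levels $E_0 = -38.25$ and $E_1 = -19.21$ that (4.11) below assigns ($N = \nu_{\max} - j = 1$).

```python
import numpy as np

g = 9.0                                    # sample coupling, g > 1/4
sqrt_g = np.sqrt(g)                        # here sqrt(g) = 3

# Half-integers nu < sqrt(g) label the representations (4.9); each carries
# the single energy E = -nu^2 - g^2/nu^2 and the 2*nu weights j = nu, ..., 1-nu.
for v in np.arange(0.5, sqrt_g, 1.0):
    E = -v**2 - g**2 / v**2
    print(f"nu = {v}: dim = {int(2 * v)}, E = {E:.4f}")
```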
Now it is easy to solve the original eigenvalue problem $H_{\text{hyperbolic Kepler}}\,\psi_{E_n,j} = E_n\,\psi_{E_n,j}$ for the hyperbolic Kepler Hamiltonian. For fixed $j \in \{\frac12, \frac32, \cdots, \nu_{\max}\}$, the energy eigenvalues and eigenfunctions are given by
$$E_n = -(j+n)^2 - \frac{g^2}{(j+n)^2}, \qquad n\in\{0,1,\cdots,N\}, \qquad (4.11)$$
and
$$\psi_{E_n,j}(x) \propto A_-(j)\, A_-(j+1)\cdots A_-(j+n-1)\,\psi_{E_n,j+n}(x), \qquad (4.12)$$
where $N = \max\{n \in \mathbb{Z}_{\ge 0} : j+n < \sqrt{g}\} = \nu_{\max} - j$ and $\psi_{E_n,j+n}(x) \propto (\sinh x)^{j+n}\exp\big(-\frac{g}{j+n}x\big)$. Notice that these results are consistent with the known results [5].

Before closing this subsection it is worthwhile to comment on the case $g \le 1/4$. As mentioned before, the finite-dimensional representation (4.9) disappears for $g \le 1/4$. However, new finite-dimensional representations appear in this case. The relevant one is the following one-dimensional representation spanned by a single vector:
$$\Big\{\,|E,j\rangle\ :\ E = -j^2 - \frac{g^2}{j^2}\ \text{and}\ j = \frac12 - \sqrt{\frac14 - g}\,\Big\}, \qquad (4.13)$$
where $g \in (0, 1/4)$. Notice that this $j$ is one of the solutions to the condition $-j^2 - g^2/j^2 = -(j-1)^2 - g^2/(j-1)^2$. One can easily check that this state vector satisfies $J_\pm|E,j\rangle = 0$. It is also easy to see that, for $g \in (0, 1/4)$, $j = 1/2 - \sqrt{1/4 - g}$ satisfies the condition $j < \sqrt{g}$, which is the necessary condition for the ground-state wavefunction to be normalizable. The point is that, just as in the case $g > 1/4$, $j$ must be quantized in a particular manner in this representation-theoretic approach.

4.2. Rosen–Morse

Let us finally move on to the bound-state problem of the Rosen–Morse Hamiltonian [6, 7]. First, the Hamiltonian $H_{\text{Rosen–Morse}}$ in (1.1) is factorized as follows:
$$H_{\text{Rosen–Morse}} = A_-(j)\, A_+(j) - j^2 - \frac{g^2}{j^2}, \qquad (4.14)$$
where
$$A_\pm(j) = \pm\frac{d}{dx} - j\tanh x + \frac{g}{j}. \qquad (4.15)$$
Let us then introduce the following operators:
$$J_3 = -i\partial_\theta, \qquad J_+ = e^{i\theta}\Big(\partial_x - \tanh x\, J_3 + \frac{g}{J_3}\Big), \qquad J_- = \Big(-\partial_x - \tanh x\, J_3 + \frac{g}{J_3}\Big)\, e^{-i\theta}, \qquad (4.16)$$
which satisfy the commutation relations
$$[J_3, J_\pm] = \pm J_\pm, \qquad [J_+, J_-] = -J_3^2 - \frac{g^2}{J_3^2} + (J_3-1)^2 + \frac{g^2}{(J_3-1)^2}. \qquad (4.17)$$
The invariant operator is
$$H = J_- J_+ - J_3^2 - \frac{g^2}{J_3^2} = J_+ J_- - (J_3-1)^2 - \frac{g^2}{(J_3-1)^2} = -\partial_x^2 - \frac{J_3(J_3-1)}{\cosh^2 x} - 2g\tanh x. \qquad (4.18)$$
Note that the commutation relations (4.17) are exactly the same as those for the hyperbolic Kepler problem. Hence the bound-state spectrum should be related to the representations classified in the previous subsection.

To understand which representations are realized, let us study the minimum of the potential $V(x) = -j(j-1)/\cosh^2 x - 2g\tanh x$. Thanks to the symmetry $j \to 1-j$, without any loss of generality we can focus on the case $j \ge 1/2$. It is then easy to see that the potential has a minimum if $j$ lies in the range $\big(\frac12 + \sqrt{g+\frac14}, \infty\big)$, which contains the region $(1+\sqrt{g}, \infty)$; see Figure 2(c). Hence, in contrast to the previous case, the bound-state problem for the Rosen–Morse Hamiltonian should be related to the infinite-dimensional representation (4.10).

Now it is easy to find the energy eigenvalues of the original Hamiltonian. Let $j \in (1+\sqrt{g}, \infty)$ be fixed. Then the energy eigenvalues and eigenfunctions read
$$E_n = -(j-n-1)^2 - \frac{g^2}{(j-n-1)^2}, \qquad n\in\{0,1,\cdots,N\}, \qquad (4.19)$$
and
$$\psi_{E_n,j}(x) \propto A_+(j-1)\, A_+(j-2)\cdots A_+(j-n)\,\psi_{E_n,j-n}(x), \qquad (4.20)$$
where $N = \max\{n \in \mathbb{Z}_{\ge 0} : 1+\sqrt{g} < j-n\}$ and $\psi_{E_n,j-n}(x) \propto (\cosh x)^{-j+n+1}\exp\big(\frac{g}{j-n-1}x\big)$. Notice that (4.19) and (4.20) are consistent with the known results [7].
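Both factorizations of this section can be verified symbolically at once, since (4.1)-(4.2) and (4.14)-(4.15) differ only in the function accompanying $j$ (coth versus tanh) and in the sign of the centrifugal-like term. The following SymPy sketch is our addition.

```python
import sympy as sp

x, j, g = sp.symbols('x j g', positive=True)
f = sp.Function('f')(x)

def check(W, V):
    """With A_pm(j) = pm*d/dx - j*W(x) + g/j, verify
       A_-(j) A_+(j) - j^2 - g^2/j^2 = -d^2/dx^2 + V(x)."""
    Apf = sp.diff(f, x) - j * W * f + (g / j) * f
    AmApf = -sp.diff(Apf, x) - j * W * Apf + (g / j) * Apf
    lhs = AmApf - (j**2 + g**2 / j**2) * f
    return sp.simplify(lhs - (-sp.diff(f, x, 2) + V * f))

# Hyperbolic Kepler, (4.1)-(4.2):
print(check(sp.coth(x), j * (j - 1) / sp.sinh(x)**2 - 2 * g * sp.coth(x)))   # 0
# Rosen-Morse, (4.14)-(4.15):
print(check(sp.tanh(x), -j * (j - 1) / sp.cosh(x)**2 - 2 * g * sp.tanh(x)))  # 0
```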
5. Conclusions

In this note we have revisited the bound-state problems for the Kepler, spherical Kepler, hyperbolic Kepler, and Rosen–Morse Hamiltonians, none of which had been solved before in terms of the potential algebra. We have introduced three nonlinear algebraic systems and solved the problems by means of representation theory. We have seen that the discrete energy spectra can be obtained just from the four conditions $J_\pm|E,j\rangle \propto |E,j\pm 1\rangle$ and $\|J_\pm|E,j\rangle\|^2 \ge 0$. These conditions correctly reproduce the known results in a purely algebraic fashion. The price to pay, however, is that in this approach $j$ must be a half-integer (except for the Rosen–Morse potential problem and the hyperbolic Kepler problem in the domain $g \in (0, 1/4)$); otherwise inconsistencies arise. This is a weakness of this representation-theoretic approach.

References
[1] E. Schrödinger. A method of determining quantum-mechanical eigenvalues and eigenfunctions. Proc. Roy. Irish Acad. (Sect. A) 46:9–16, 1940.
[2] L. Infeld. On a new treatment of some eigenvalue problems. Phys. Rev. 59:737–747, 1941. doi:10.1103/PhysRev.59.737.
[3] A. F. Stevenson. Note on the “Kepler problem” in a spherical space, and the factorization method of solving eigenvalue problems. Phys. Rev. 59:842–843, 1941. doi:10.1103/PhysRev.59.842.
[4] M. F. Manning, N. Rosen. A potential function for the vibrations of diatomic molecules. Phys. Rev. 44:951–954, 1933. doi:10.1103/PhysRev.44.951.
[5] L. Infeld, A. Schild. A note on the Kepler problem in a space of constant negative curvature. Phys. Rev. 67:121–122, 1945. doi:10.1103/PhysRev.67.121.
[6] C. Eckart. The penetration of a potential barrier by electrons. Phys. Rev. 35:1303–1309, 1930. doi:10.1103/PhysRev.35.1303.
[7] N. Rosen, P. M. Morse. On the vibrations of polyatomic molecules. Phys. Rev. 42:210–217, 1932. doi:10.1103/PhysRev.42.210.
[8] L. Infeld, T. E. Hull. The factorization method. Rev. Mod. Phys. 23:21–68, 1951. doi:10.1103/RevModPhys.23.21.
[9] F. Cooper, A. Khare, U. Sukhatme. Supersymmetry and quantum mechanics. Phys. Rept. 251:267–385, 1995. arXiv:hep-th/9405029, doi:10.1016/0370-1573(94)00080-M.
[10] T. Houri, M. Sakamoto, K. Tatsumi. Spectral intertwining relations in exactly solvable quantum-mechanical systems. PTEP 2017:063A01, 2017. arXiv:1701.04307, doi:10.1093/ptep/ptx074.
[11] A. Gangopadhyaya, J. V. Mallow, U. P. Sukhatme. Translational shape invariance and the inherent potential algebra. Phys. Rev. A 58:4287–4292, 1998. doi:10.1103/PhysRevA.58.4287.
[12] C. Rasinariu, J. V. Mallow, A. Gangopadhyaya. Exactly solvable problems of quantum mechanics and their spectrum generating algebras: a review. Central Eur. J. Phys. 5:111–134, 2007. doi:10.2478/s11534-007-0001-1.
[13] J. Bougie, A. Gangopadhyaya, J. Mallow, C. Rasinariu. Supersymmetric quantum mechanics and solvable models. Symmetry 4:452–473, 2012. doi:10.3390/sym4030452.
[14] A. B. Balantekin. Algebraic approach to shape invariance. Phys. Rev. A 57:4188–4191, 1998. arXiv:quant-ph/9712018, doi:10.1103/PhysRevA.57.4188.
Acta Polytechnica 53(Supplement):626–630, 2013, doi:10.14311/AP.2013.53.0626
© Czech Technical University in Prague, 2013, available online at http://ojs.cvut.cz/ojs/index.php/ap

XMM-Newton Scientific Highlights: X-ray Spectroscopic Population Studies of AGN

Matteo Guainazzi*
European Astronomy Center of the European Space Agency, Camino Bajo del Castillo, s/n, Urbanización Villafranca del Castillo, Villanueva de la Cañada, E-28692 Madrid, Spain
* Corresponding author: matteo.guainazzi@sciops.esa.int

Abstract. In this paper I review the contribution that the XMM-Newton ESA X-ray mission, together with other operational and complementary X-ray facilities, has given to our understanding of active galactic nuclei. I focus on answering three basic questions: a) to which extent do AGN share the same engine?; b) to which extent are AGN “relativistic machines”?; c) to which extent do AGN affect their immediate environment?

Keywords: X-rays: galaxies, X-rays: general, galaxies: active.

1. Goals of this paper

After more than 10 years of successful scientific operations, XMM-Newton has provided the scientific community with a rich harvest of data in all fields of astrophysical investigation. The nature of active galactic nuclei (AGN) is the most productive field of investigation based on XMM-Newton data in terms of the number of papers published in refereed journals (Guainazzi [26], to which readers are referred for a short description of the XMM-Newton scientific payload and performance). We have achieved significant progress in our understanding of how accretion onto super-massive black holes (SMBHs) works, of the similarity of the “central engines” among different classes of observationally distinct AGN, and of the interaction between the AGN and the surrounding gas and dust. In this paper I review how XMM-Newton has contributed to this progress. I concentrate on spectroscopic studies of sizable samples (number of objects ≥ 25) of nearby (z ≤ 0.1) radio-quiet AGN. Due tribute will be paid, whenever relevant, to the contribution of other, largely complementary, X-ray facilities which have been successfully operating alongside XMM-Newton, such as Chandra, INTEGRAL, and Swift.

2. To which extent do AGN share the same engine?

AGN exhibit a very diverse phenomenology. One of the basic observational classifications of radio-quiet AGN is based on their optical spectrum. Emission line profiles in “broad-line” (or “type 1”) AGN have widths ≳ 1000 km s⁻¹; by contrast, the optical spectra of “narrow-line” (“type 2”) AGN exhibit narrower (≲ 1000 km s⁻¹) profiles, primarily from forbidden transitions.
However, this observational diversity hides a more fundamental identity: a recent study of 165 Seyfert galaxies (the most common class of nearby radio-quiet AGN) in the INTEGRAL/IBIS catalogue [61] shows that the average intrinsic spectral shapes of type 1 and type 2 objects are statistically indistinguishable above 15 keV. The distributions of the α_ox parameter (the logarithm of the flux density ratio between the X-ray band and 5500 Å) are very similar in type 1 and type 2 objects, once the X-ray flux density is calculated at 20 keV [7].

Why does one need high-energy measurements to discover this hidden identity? Because the spectra of radio-quiet AGN are significantly affected by obscuration due to intervening gas and dust along the line of sight. Even over the two decades in energy between 0.1 and 10 keV (where most, and the most sensitive, X-ray spectroscopic measurements of AGN are currently available), the observational appearance of type 1 and type 2 AGN is remarkably different. The latter are characterized by neutral-gas photoelectric absorption column densities ≥ 10²² cm⁻² (a fact known since the dawn of AGN high-energy astrophysics; [4, 75]; cf. [72] for an analysis of the most carefully selected sample of Seyfert galaxies to date). This feature is almost entirely absent in type 1 objects.

This observational fact lends support to the so-called “unification scenario”, originally postulated by Antonucci & Miller [2] to explain the appearance of broad lines in spectropolarimetric observations of “narrow-line” AGN. The 0th-order formulation of this scenario (see [1] for a review) posits an azimuthally symmetric gas and dust structure surrounding the AGN. This structure (traditionally referred to as the “torus”) impedes the view of the central engine, as well as of the broad line regions (BLRs), i.e. the gas clouds responsible for the production of broad optical lines, if the line of sight to the AGN intercepts it. In the simplest interpretation of the unified scenario, the torus is a pc-scale compact structure. The most recent observational and theoretical developments suggest rather that the torus is clumpy, extending on scales from a fraction to tens of parsecs [19, 46], and that gas in the BLR also contributes to the line-of-sight obscuration (see [11] for a review).

Notwithstanding the detailed structure of the obscuring matter, X-ray surveys confirm the basic predictions of the unified scenario to a very high level of accuracy. Only 3 % of type 1 AGN are X-ray obscured, and 3 % of type 2 AGN are X-ray unobscured, in the XMM-Newton wide-area survey (486 objects; [41]). The former group of exceptions can easily be explained by assuming that the host galaxy contributes to the X-ray obscuration along some lines of sight, or by anisotropy of the AGN emission. More intriguingly, the latter class could be living proof of an elusive class of “bare” type 2 AGN, in which the BLR is missing rather than obscured. This class of objects could demonstrate that a minimum accretion rate is required for the torus and/or the BLR to form [20, 47, 49]. This hypothesis has only recently received observational support from a dedicated XMM-Newton experiment [40]. It goes without saying that one cannot rule out that the original optical classification of these outliers is simply wrong.

3. To which extent are AGN relativistic machines?
There is nowadays almost unanimous consensus that AGN are powered by accretion onto SMBHs, following the hypothesis originally formulated by Lynden-Bell [38]. In this picture, high-energy radiation comes from the innermost regions of the accretion flow, a few to tens of gravitational radii from the event horizon, due to the steep increase of the disk dissipation [60, and references therein] and to the disk temperature profile in a Shakura-Sunyaev accretion disk [55, 69]. More recently, quasar micro-lensing [14] and X-ray occultation experiments [62, 63] have unambiguously demonstrated that the size of the X-ray source is ≲ 10 gravitational radii, in agreement with early suggestions based on short-timescale X-ray variability [6].

An X-ray illuminated accretion disk extending down to the innermost stable circular orbit should Compton-scatter the primary radiation. Models of “disk reflection” have been developed since the early 1990s [23, 64], following the discovery of spectral features expected from disk reflection, such as the Compton “hump” and iron fluorescent lines [45, 53]. Besides Newtonian dynamical effects, the profile of disk emission lines is distorted by special and general relativity [21, 42]. The profiles depend on the emissivity profile along the disk, on the exact location and size of the line-emitting region, on the disk inclination ı with respect to the line of sight, and on the black hole spin. Recently, several studies have tried to validate our standard picture of the innermost accretion flow by looking for spectroscopic signatures of relativistic broadening of the line profiles [9, 16, 44]. The results of these studies are summarised in Fig. 1.

Figure 1. Cumulative distribution function of the fraction of AGN exhibiting relativistically broadened Fe Kα lines in two different samples: FERO+GREDOS (red; [16, 29]) and the “Nandra et al.” sample (blue; [9, 44]). The filled surfaces correspond to different statistical criteria for the detection of broadened profiles: 5σ in the former, 90 % confidence level for one interesting parameter in the latter studies. The distribution functions are calculated against the “disk reflection parameter” R [39], which is proportional to the accretion disk solid angle as seen from a point-like isotropic X-ray source. R ≡ 1 for a plane-parallel, infinite slab when relativistic light bending is negligible.

They are consistent with ≃ 90 % of nearby Seyfert galaxies in well-defined flux-limited samples exhibiting relativistically broadened profiles. The main systematic uncertainties in these results are due to the uneven statistical quality of the data [28], as well as (and more importantly) to pending ambiguities in the spectral deconvolution of CCD-resolution X-ray spectra, even those of the highest statistical quality (see [74] for a review of these issues; those authors propose an alternative scenario which does not require any relativistic effect). In Fig. 1 the cumulative distributions are plotted against the “disk reflection” parameter R, which is proportional to the solid angle that the accretion disk covers as seen from the (supposedly point-like) X-ray source. With the definition used in Fig. 1, R = 1 if the disk is a plane-parallel slab and the X-ray source is isotropic. This should be the most common geometry.

Figure 2. Accretion disk inclination angle ı derived from X-ray relativistic spectroscopy on the FERO and GREDOS samples [29] against the host galaxy inclination.
However, the cumulative distribution function also increases steeply for R < 1, and keeps growing for R > 1. This evidence requires modifications of the standard picture; these issues are discussed in [29, 44]. High values of R may indicate objects where relativistic “light bending” is important, increasing the relative strength of the reflected emission when the X-ray source is located a few gravitational radii above the disk [43]. We observe that in this scenario R can be significantly larger even than 2 (the value corresponding to a reflecting slab covering a solid angle equal to 4π), because light bending increases the relative fraction of primary photons bent towards the accretion disk while at the same time decreasing the fraction of primary photons directly reaching the observer. When the relativistic light bending is strong, the observed ratio between the reflected and the primary fluxes is therefore unrelated to the true geometry of the reflector. Low values of R, on the other hand, may be indicative of relativistic aberration in a mildly outflowing corona [8].

X-ray relativistic spectroscopy is one of the few direct ways to measure the inclination of the innermost disk region, down to a few gravitational radii from the BH event horizon. In Fig. 2 we compare ı against the inclination of the host galaxy. There is no correlation between the two quantities. Naively, one might expect the accretion disk to preserve the host galaxy orientation, due to conservation of the overall angular momentum. Figure 2 confirms previous indications that the distribution of the angle between the radio jet and the host galaxy disk is consistent with being random [34, 67]. Galaxy-galaxy merging, or precession and warping of the innermost accretion disk due to the Bardeen & Petterson effect [5, 13], have been invoked to explain this lack of correlation.

Realistic accretion disk models predict strong emission in the soft X-ray band (below 2 keV), due to the combination of low opacity and emission lines [64]. “Soft excesses” are indeed commonly found in the X-ray spectra of AGN and quasars [50, and references therein]. A possible explanation in terms of relativistically smeared high-density outflows [24] has recently been ruled out [68]; however, reflection from an ionized disk fits the observed XMM-Newton EPIC (Strüder et al. 2001, Turner et al. 2001) spectra well [15]. A systematic study of the physical nature of soft excesses that explicitly takes the time dimension into account (soft X-ray emission in AGN is variable on all timescales, from hours to years) is, unfortunately, still missing.

4. To which extent do AGN affect their environment?

Disk line-driven winds are a natural prediction of relativistic accretion disk models [18, 32, 56, 57, 68, 70, 71]. Spectroscopic studies of samples of bright Seyfert galaxies with ASCA unveiled that at least 50 % of nearby bright Seyferts exhibit absorption features which can be associated with ionized gas. Blustin et al. [12] published a systematic compilation of literature results based on data from the Reflection Grating Spectrometer on board XMM-Newton [17]. These “warm absorbers” have typical ionization parameters¹ ξ ∼ 100 erg cm s⁻¹, column densities² N_H ∼ 10²¹÷²² cm⁻², and outflow velocities ≤ 1000 km s⁻¹. The covering fraction, as derived from the fraction of “warm absorbed” AGN in well-defined samples, is possibly larger than 80 % [76].

¹ξ ≡ L/(nr²), where L is the ionizing luminosity, n is the electron density, and r is the distance between the ionizing source and the inner side of the ionized cloud.
²To clarify an often confusing issue for the non-expert: we refer here to the column density of ionized absorbers, as opposed to the photoelectric absorption by neutral gas discussed in Sect. 2. AGN exhibiting “warm absorbers” are typically “type 1”, i.e. X-ray unobscured in the sense described in Sect. 2. I apologize on behalf of the AGN community for this confusing nomenclature.
More recently [54, 59], a population of high-density (N_H ∼ 10²³÷²⁴ cm⁻²), highly ionized (ξ ≥ 1000 erg cm s⁻¹), high-velocity (≳ 0.1c) outflows has been discovered in XMM-Newton CCD spectra of more luminous quasars. Evidence for these more massive counterparts of the historical “warm absorbers” has been found in 40÷60 % of AGN in the local Universe [73].

Ionized outflows are important because they are a fundamental ingredient of our AGN structure model, and also because they provide us with a complementary view of the accretion flow. They may be tracing AGN feedback onto the surrounding gas. In order to quantitatively assess the mass rate and the kinetic energy carried by these outflows, one needs to know their location, or launch radius. Photoionization models yield only the product between the gas density and its distance from the ionizing continuum. An independent estimate of the density can be derived when the X-ray continuum is variable (the ionization and recombination times depend on n⁻¹; [35, 48]), if meta-stable transitions are detected [3, 22], through density diagnostics on He-like emission line triplets ([52]; it must be assumed in this case that absorption and emission occur in the same system), or if the gas is heated by free-free absorption [65]. Blustin et al. [12] estimate upper and lower limits on the launch radius by using the escape velocity, and by imposing that the width of the absorbing slab is much smaller than r, respectively (otherwise stated, that the electron number density for a given ionization parameter falls rapidly with r; see [37] for a criticism of this approach). Under these assumptions, warm absorbers should be launched at parsec scale, probably via radiation pressure on gas clouds detached from the rim of the torus, and the outflow rate is of the same order as the accretion rate, with huge uncertainties.
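This launch-radius argument reduces to simple algebra once ξ and N_H are fixed: ξ = L/(nr²) gives nr² = L/ξ, and N_H ≈ nΔr ≲ nr then yields r ≲ L/(ξN_H), while requiring the outflow velocity to exceed the escape velocity gives r ≳ 2GM/v². The back-of-the-envelope script below is ours; every input value is an illustrative assumption, not a measurement, and with these (arbitrary) numbers the absorber is confined between roughly 0.4 and 30 pc, consistent with the parsec-scale launch region quoted above.

```python
# Order-of-magnitude launch-radius limits in the spirit of Blustin et al. [12].
G  = 6.674e-8          # gravitational constant [cgs]
pc = 3.086e18          # parsec [cm]

L_ion = 1.0e44         # assumed ionizing luminosity [erg/s]
xi    = 100.0          # assumed ionization parameter [erg cm/s]
N_H   = 1.0e22         # assumed column density [cm^-2]
v_out = 5.0e7          # assumed outflow velocity, 500 km/s [cm/s]
M_bh  = 1.0e7 * 2.0e33 # assumed black-hole mass, 10^7 solar masses [g]

r_max = L_ion / (xi * N_H)          # from N_H <~ n*r and xi = L/(n*r^2)
r_min = 2.0 * G * M_bh / v_out**2   # escape-velocity condition

print(f"r_min ~ {r_min / pc:.2f} pc   r_max ~ {r_max / pc:.1f} pc")
```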
notwithstanding these uncertainties, some general conclusions can be drawn:

• at least the most powerful agn, and maybe even more prosaic seyferts, could release 10^8 m⊙ of hot gas to the interstellar medium in bulges across their lifetime.
• standard seyferts could release enough energy to heat the ism gas to temperatures ≈ 10^7 k.
• powerful quasars (≥ 10^44 erg s^−1) could release more than the binding energy of a 10^11 m⊙ bulge with dispersion velocity σ ∼ 300 km s^−1, or the energy necessary to control the evolution of the host galaxy and the surrounding intergalactic medium.

readers interested in a more detailed discussion of this subject are referred to [31, 66] or [30]. the latter authors propose a “two-stage” feedback process, whereby a disk wind drives a weak wind or outflow in the hot, diffuse interstellar medium. this latter component is able to drive cold gas clouds in a direction perpendicular to the incident flow. this mechanism could reduce the energy requirement for feedback to affect the host galaxy and quench star formation by a factor ∼ 10 (see also [58]). on larger scales, these outflows probably connect with x-ray emitting diffuse structures, extended on scales as large as a hundred parsecs (extended x-ray narrow-line regions, exnlrs; [10]). a systematic spectroscopic study of a sample of over 90 type 2 agn observed by the xmm-newton/rgs unveiled that the gas is photoionized, with a small or negligible contribution from nuclear starburst or shocked plasma ([27]; see [33] for a detailed description of the underlying astrophysics). the photoionization diagnostics are proportional to the hard x-ray luminosity, suggesting that the agn is the ultimate ionizing power of the exnlrs (fig. 3).

figure 3. ratio between the oviii ly-α and the forbidden component of the ovii as a function of the x-ray luminosity in the 14–195 kev band for the type 2 agn of the cielo sample [27]. the inverse of the quantity on the y-axis is an indicator of photoionization. the luminosity of compton-thick agn (nh > σ_t^−1 = 1.6 × 10^24 cm^−2) was adjusted by a factor of 30. the dot-dashed line indicates the best fit, the dashed lines indicate the envelope corresponding to the 1σ uncertainties on the best-fit parameters.

5. a final comment

i would like to conclude this brief review with a comment. while many physical insights can be obtained from the detailed study of a few “archetypal” individual agn, only the global view provided by a large sample of good-quality spectroscopic measurements will allow us to make robust progress in our understanding of the astrophysics of such complex systems. spectroscopy is still the only available tool to investigate astrophysical complexity on spatial scales which will remain unresolved for many generations to come. with the main x-ray spectroscopic missions (chandra, suzaku, and xmm-newton) in their full operational maturity, enough observing time should be allocated to sizable, well-defined, complete samples of agn. complete, homogeneous sample coverage in the science archives over the largest possible parameter space is an obligation towards the coming generations of high-energy astronomers, should this class of researcher still exist in the future (cf. ubertini, this volume).

acknowledgements

i acknowledge useful discussions with enrico piconcelli.
references
[1] antonucci r., 1993, ara&a, 31, 473
[2] antonucci r.r.j., miller j.s., 1985, apj, 297, 621
[3] arav n., et al., 2008, apj, 681, 954
[4] awaki h., et al., 1991, pasj, 43, 195
[5] bardeen j.m., petterson b.c., 1975, apj, 195, 65
[6] barr p., mushotzky r.f., 1986, nat, 320, 421
[7] beckmann v., et al., 2009, a&a, 505, 417
[8] beloborodov a.m., 1999, apj, 510, 123
[9] bhayani s., nandra k., 2011, mnras, 416, 629
[10] bianchi s., et al., 2006, a&a, 448, 499
[11] bianchi s., et al., 2012, adast2012e, 17
[12] blustin a.j., et al., 2005, a&a, 431, 111
[13] caproni a., et al., 2006, apj, 653, 112
[14] chartas g., et al., 2009, apj, 693, 174
[15] crummy j., et al., 2006, mnras, 365, 1067
[16] de la calle-pérez i., et al., 2010, a&a, 524, 50
[17] den herder j.w., et al., 2001, a&a, 365, 7
[18] dorodnitsyn a., et al., 2008, apj, 675, 5
[19] elitzur m., 2012, apj, 747, 33
[20] elitzur m., shlosman i., 2006, apj, 648, 101
[21] fabian a.c., et al., 1989, mnras, 238, 729
[22] gabel j.r., et al., 2003, apj, 583, 178
[23] george i.m., fabian a.c., 1991, mnras, 249, 352
[24] gierliński m., done c., 2004, mnras, 349, 7
[25] gierliński m., et al., 2008, nat, 455, 369
[26] guainazzi m., 2010, mmsai, 81, 226
[27] guainazzi m., bianchi s., 2007, mnras, 374, 1290
[28] guainazzi m., et al., 2006, an, 327, 1032
[29] guainazzi m., et al., 2011, a&a, 531, 131
[30] hopkins p.f., elvis m., 2010, mnras, 401, 7
[31] king a., 2003, apj, 596, 27
[32] king a.r., pounds k.a., 2003, mnras, 345, 657
[33] kinkhabwala a., et al., 2002, apj, 575, 732
[34] kinney a.l., et al., 2000, apj, 537, 152
[35] krolik j.h., kriss g.a., 1995, apj, 447, 512
[36] krongold y., et al., 2007, apj, 659, 1022
[37] krongold y., et al., 2010, apj, 710, 360
[38] lynden-bell d., 1969, nat, 223, 690
[39] magdziarz p., zdziarski a.a., 1995, mnras, 273, 837
[40] marinucci a., et al., 2012, apj, 748, 130
[41] mateos s., et al., 2010, a&a, 510, 35
[42] matt g., et al., 1991, a&a, 247, 25
[43] miniutti g., fabian a.c., 2004, mnras, 349, 1435
[44] nandra k., et al., 2007, mnras, 382, 194
[45] nandra k., pounds k.a., 1994, mnras, 268, 405
[46] nenkova m., et al., 2008, apj, 685, 147
[47] nicastro f., 2000, apj, 530, 65
[48] nicastro f., et al., 1999, apj, 512, 184
[49] nicastro f., et al., 2003, apj, 589, 13
[50] piconcelli e., et al., 2005, a&a, 432, 15
[51] ponti g., et al., 2012, a&a, 542, 83
[52] porquet d., dubau j., 2000, a&as, 143, 495
[53] pounds k.a., et al., 1990, nat, 344, 132
[54] pounds k.a., et al., 2003, mnras, 345, 705
[55] pringle j.e., 1981, ara&a, 19, 137
[56] proga d., 2000, apj, 538, 684
[57] proga d., kallman t.r., 2004, apj, 616, 688
[58] menci r., 2008, apj, 686, 219
[59] reeves j.n., et al., 2003, apj, 593, 65
[60] reynolds c.s., fabian a.c., 2008, apj, 675, 1048
[61] ricci c., et al., 2011, a&a, 532, 102
[62] risaliti g., et al., 2007, apj, 659, 111
[63] risaliti g., et al., 2009, mnras, 393, 1
[64] ross r.r., fabian a.c., 2005, mnras, 358, 211
[65] rózańska a., et al., 2008, a&a, 487, 89
[66] scannapieco e., oh s.p., 2004, apj, 608, 62
[67] schmitt h.r., et al., 2001, apj, 555, 663
[68] schurch n.j., et al., 2009, apj, 649, 1
[69] shakura n.i., sunyaev r.a., 1973, a&a, 24, 337
[70] sim s.a., et al., 2008, mnras, 388, 611
[71] sim s.a., et al., 2010, mnras, 408, 1396
[72] singh v., et al., 2011, a&a, 533, 128
[73] tombesi f., et al., 2010, a&a, 521, 57
[74] turner t.j., miller l., 2009, a&arv, 17, 47
[75] turner t.j., pounds k.a., 1989, mnras, 240, 833
[76] winter l., 2010, apj, 735, 126

discussion

maurice van putten — could you comment on the detection of quasi-periodic oscillations (qpos) in agn?
matteo guainazzi — the only positive detection of a ∼ 1 hour qpo in an agn was reported by gierliński et al. (2008) in rej1034+396. a recent systematic analysis of over 160 bright agn observed by xmm-newton has not discovered any further qpo (ponti et al. 2012).

acta polytechnica doi:10.14311/ap.2014.54.0116 acta polytechnica 54(2):116–121, 2014 © czech technical university in prague, 2014 available online at http://ojs.cvut.cz/ojs/index.php/ap

a bose-einstein condensate with pt-symmetric double-delta function loss and gain in a harmonic trap: a test of rigorous estimates

daniel haag, holger cartarius, günter wunner∗
institut für theoretische physik 1, universität stuttgart, 70550 stuttgart, germany
∗ corresponding author: wunner@itp1.uni-stuttgart.de

abstract. we consider the linear and nonlinear schrödinger equation for a bose-einstein condensate in a harmonic trap with pt-symmetric double-delta function loss and gain terms. we verify that the conditions for the applicability of a recent proposition by mityagin and siegl on singular perturbations of harmonic oscillator type self-adjoint operators are fulfilled. in both the linear and the nonlinear case we calculate numerically the shifts of the unperturbed levels with quantum numbers n of up to 89 in dependence on the strength of the non-hermiticity, and compare with the rigorous estimates derived by those authors. we confirm that the predicted 1/n^(1/2) estimate provides a valid upper bound on the shrink rate of the numerical eigenvalues. moreover, we find that a more recent estimate of log(n)/n^(3/2) is in excellent agreement with the numerical results. with nonlinearity the shrink rates are found to be smaller than without nonlinearity, and the rigorous estimates, derived only for the linear case, are no longer applicable.

keywords: pt symmetry, bose-einstein condensates, perturbed harmonic oscillator.

1. introduction

bose-einstein condensates with pt-symmetric loss and gain have been proposed [1] as a first experimental realisation of pt symmetry in a real quantum system. the idea is to confine the condensate in a double well potential, and to create a pt-symmetric situation by coherently injecting atoms into one well and removing them from the other. a particular difficulty in the theoretical treatment is that, on account of the s-wave scattering of the atoms, the gross-pitaevskii equation, which describes the condensates, contains a term g|ψ|², and is thus nonlinear in the wave function that is sought. in a series of papers, both for realistic set-ups and for delta-function models of the double wells, we have shown [2–7] that the nonlinearity introduces new features in the evolution of the eigenvalue spectrum as the non-hermiticity is increased, but that pt symmetry of the wave function is nevertheless preserved if both nonlinearity and non-hermiticity are not too strong. bose-einstein condensates are usually trapped in harmonic potentials produced by counterpropagating laser beams. therefore condensates with an additional pt-symmetric double well potential can be regarded as a “perturbation” of the harmonic oscillator.
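the qualitative scenario invoked here (real eigenvalues that merge at a branch point and continue as a complex conjugate pair once the non-hermiticity grows too strong) can be illustrated with the canonical two-level matrix often used in the pt-symmetry literature. this sketch illustrates only the generic mechanism, not the condensate hamiltonian itself:

import numpy as np

# canonical pt-symmetric 2x2 model: h = [[i*g, a], [a, -i*g]] has
# eigenvalues +/- sqrt(a^2 - g^2): real for g < a, a complex conjugate
# pair beyond the branch point g = a.
a = 1.0
for g in (0.5, 1.0, 1.5):
    h = np.array([[1j * g, a], [a, -1j * g]])
    print(g, np.round(np.linalg.eigvals(h), 3))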
adduci and mityagin [8] and siegl and mityagin [9] have recently analysed perturbations of harmonic-like operators from a mathematical point of view. the second paper in particular also allows for singular perturbations, such as delta functions. these authors have proved that the eigenvalues of the perturbed operator eventually become simple and that the root system forms a riesz basis. their results are valid if the following criterion is fulfilled:

∀ m, n ∈ ℕ: |⟨ψm|B̂|ψn⟩| ≤ M/(m^α n^α), α > 0, M > 0, (1)

where B̂ is the perturbation operator, M is a constant that depends on the type of the perturbation, and the ψm are the harmonic oscillator eigenfunctions. a further prediction is that the shrink rate of the disks into which the eigenvalues can be shifted is proportional to 1/n^(2α), with α the exponent appearing in (1).

it is the purpose of this paper to test the estimates numerically for the example of a bose-einstein condensate in a double well potential confined by a harmonic trap. to this end we consider the model of two pt-symmetric delta-function wells, since in this case simple analytical estimates can be obtained, whereas in the case of the realistic double well discussed in refs. [4, 5] complicated estimates in terms of hypergeometric functions result. it should be mentioned that, because of their simplicity, delta functions have of course been widely used in the literature in the context of pt symmetry. spectral properties of scattering and bound states in pt-symmetric double- and multiple-delta-function potentials have been investigated e.g. in refs. [2, 3, 11–18]. in all these papers no external potential was present in addition to the delta potentials. a paper in which pt-symmetric point interactions were studied embedded in an external potential is that by jakubský and znojil [19], who positioned the delta functions in an infinitely high square well and analysed the spectrum in dependence on the position of the delta functions within the well. their work was extended by krejčiřík and siegl [20, 21], who replaced the delta functions by pt-symmetric robin boundary conditions at the edges of the square well. we note that the spectrum of a harmonic oscillator perturbed by two identical real-valued point interactions has been analysed by fassari and rinaldi [22]. however, to our knowledge the situation of two pt-symmetric delta functions in an external harmonic potential has not yet been investigated. while in the square well the delta functions can be placed only within the well, the harmonic oscillator potential has the advantage that the delta functions can in principle be shifted to any position on the real axis.

2. bose-einstein condensate in a pt-symmetric harmonic trap

at low temperatures and densities bose-einstein condensates are well described by the gross-pitaevskii equation [23, 24]

(−d²/dx² + v(x) + g|ψ|²) ψ = µψ. (2)

here ψ denotes the condensate wave function, the eigenvalue µ is the chemical potential, and v(x) is the trapping potential confining the condensate. the nonlinear term in (2) arises from the s-wave scattering interaction of the atoms; g is a measure of the strength of this interaction. we consider a harmonic trapping potential and model a pt-symmetric double well with equilibrated loss and gain by imaginary delta functions. thus the hamiltonian that we consider here is given by

Ĥ = −d²/dx² + x² + iγ (δ(x − b) − δ(x + b)) + g|ψ(x)|². (3)

here ±b denotes the positions of the imaginary delta functions and γ the strength of the non-hermiticity. we will later consider the effects of the nonlinearity on the spectrum, but for the time being we assume that the nonlinearity is negligible, in order to be in a position to compare with the predictions of mityagin and siegl [9]. the eigenvalues of the unperturbed spectrum are given by µn = 2n + 1 (n = 0, 1, 2, . . .). figure 1 shows the unperturbed spectrum together with the wave functions of the lowest five states. the dashed vertical lines designate different positions at which the delta functions are placed. from the figure it is already obvious that only such states will be significantly affected by the perturbation which are within the classically allowed region at the positions of the delta functions. by contrast, states for which the delta functions lie in the classically forbidden (exponentially decaying) regime will not be affected. this means that, as the delta functions are shifted further and further out, an increasing number of low-lying eigenvalues will not be changed by the perturbation.

figure 1. unperturbed spectrum of the harmonic oscillator and the lowest five wave functions. vertical dashed lines designate different positions of the pt-symmetric delta-function perturbations.

it is easy to show that the mityagin-siegl criterion (1) is fulfilled for the imaginary delta-function perturbation in (3), since

|⟨ψm|B̂|ψn⟩| = |γ| |ψm(b)ψn(b) − ψm(−b)ψn(−b)|. (4)

the oscillator eigenfunctions have either even or odd parity, therefore |⟨ψm|B̂|ψn⟩| = 0 if m and n are both even or both odd, while for m even and n odd, and vice versa, we have

|⟨ψm|B̂|ψn⟩| = 2|γ| |ψm(b)| |ψn(b)| ≤ 2|γ| c̃ m^(−1/4) n^(−1/4). (5)

in the last line we have exploited an inequality given by mityagin and siegl [9] which is valid for 2(2n + 1) ≥ b². the prediction then is that eigenvalues can only move up to a distance M, where M is a constant, uniform for all eigenvalues, and in particular that there exists an n0 such that for n ≥ n0 all eigenvalues stay in disjoint disks with shrinking radii, with the shrink rate being bounded from above by c n^(−1/2). in fact, one of the authors of ref. [9] has pointed out [10] that for the case of two delta potentials the shrink rate can be estimated even more precisely to behave as log(n)/n^(3/2). note that no such statements can be made from their theorems for the case that the nonlinearity is also included as a perturbation.
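as a quick numerical consistency check of (4) and (5) (not part of the authors' derivation), one can evaluate the matrix elements with the standard three-term recurrence for the normalised oscillator eigenfunctions (in units ħ = m = ω = 1) and verify that |b_mn| (mn)^(1/4) stays bounded, which corresponds to α = 1/4 in criterion (1):

import numpy as np

def osc_eigenfunctions(nmax, x):
    # normalised harmonic-oscillator eigenfunctions psi_0..psi_nmax at x,
    # via the stable recurrence
    # psi_{n+1} = sqrt(2/(n+1)) x psi_n - sqrt(n/(n+1)) psi_{n-1}
    psi = np.empty(nmax + 1)
    psi[0] = np.pi**-0.25 * np.exp(-0.5 * x * x)
    if nmax > 0:
        psi[1] = np.sqrt(2.0) * x * psi[0]
    for n in range(1, nmax):
        psi[n + 1] = (np.sqrt(2.0 / (n + 1)) * x * psi[n]
                      - np.sqrt(n / (n + 1.0)) * psi[n - 1])
    return psi

gamma, b, nmax = 1.0, 0.2, 89
pp, pm = osc_eigenfunctions(nmax, b), osc_eigenfunctions(nmax, -b)
# |<psi_m| B |psi_n>| for B = i*gamma*(delta(x-b) - delta(x+b)), cf. (4)
B = gamma * np.abs(np.outer(pp, pp) - np.outer(pm, pm))
# if (5) holds, |B_mn| * (m n)^(1/4) stays bounded for all m, n >= 1
m = np.arange(1, nmax + 1)
print("max of |B_mn| (m n)^(1/4):", (B[1:, 1:] * np.outer(m, m)**0.25).max())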
3. eigenvalue spectra

the (real and complex) eigenvalues µ of the hamiltonian (3) and its eigenstates ψ are obtained, in both the linear and the nonlinear case, by integrating the wave functions outward from x = 0 and varying the initial values of re ψ(0), ψ′(0) ∈ ℂ, and µ ∈ ℂ to find square-integrable normalised solutions (the arbitrary global phase is exploited by choosing im ψ(0) = 0).
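the shooting procedure just described involves a root search in several complex parameters at once. as a simpler cross-check (not the authors' method), one can diagonalise the linear (g = 0) hamiltonian (3) in a truncated harmonic-oscillator basis, with off-diagonal elements given by (4); only the lower part of the spectrum is trustworthy at a given truncation:

import numpy as np

def osc_eigenfunctions(nmax, x):
    # same stable recurrence as in the sketch above
    psi = np.empty(nmax + 1)
    psi[0] = np.pi**-0.25 * np.exp(-0.5 * x * x)
    if nmax > 0:
        psi[1] = np.sqrt(2.0) * x * psi[0]
    for n in range(1, nmax):
        psi[n + 1] = (np.sqrt(2.0 / (n + 1)) * x * psi[n]
                      - np.sqrt(n / (n + 1.0)) * psi[n - 1])
    return psi

nmax, b, gamma = 200, 0.2, 5.0
pp, pm = osc_eigenfunctions(nmax, b), osc_eigenfunctions(nmax, -b)
# h_mn = (2n+1) delta_mn + i*gamma*(psi_m(b)psi_n(b) - psi_m(-b)psi_n(-b))
h = np.diag(2.0 * np.arange(nmax + 1) + 1.0).astype(complex)
h += 1j * gamma * (np.outer(pp, pp) - np.outer(pm, pm))
mu = np.sort_complex(np.linalg.eigvals(h))
print(np.round(mu[:10], 3))  # complex conjugate pairs appear beyond branch points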
figure 2 shows, for the lowest 30 levels, the real and imaginary parts of the eigenvalues of (3) (for g = 0) as functions of the strength γ of the non-hermiticity, for a position of the delta functions at b = 0.2, close to the centre of the oscillator.

figure 2. real parts (top) and imaginary parts (bottom) of the eigenvalues of (3) (with g = 0) for b = 0.2, evolving from the unperturbed levels with n = 0, . . . , 29, in dependence on the strength γ of the non-hermiticity.

one recognises that successive pairs of eigenvalues coalesce at branch points, from where onwards they turn into complex conjugate pairs. the branch points are shifted to larger values of γ as one goes up in the spectrum. what is surprising is that both the real and the imaginary parts of the complex eigenvalues emerging from the branch point of the eighth and ninth excited states experience huge shifts, but eventually saturate, like the other levels. for the ninth excited state, the real parts remain approximately constant beyond the branch points, and the imaginary parts quickly tend to zero.

in figure 3 we move the delta functions further away from the centre of the harmonic oscillator, to the classical turning points b = 1 and b = √7 of the unperturbed levels with n = 0 and n = 3, respectively. at the turning points the unperturbed wave functions enter the classically forbidden, i.e. exponentially decreasing, region. it is therefore no surprise that for b = 1 the ground state no longer “feels” the delta functions, and no longer unites with the first excited state at a branch point. rather, its eigenvalue remains real for any strength of the non-hermiticity. for b = √7 it is the states with n = 0, 1, 2, 3 which exhibit this property.

figure 3. real parts of the eigenvalues of (3) (with g = 0), evolving from the unperturbed levels with n = 0, . . . , 29, with the delta functions placed into the classical turning point of the unperturbed ground state (b = 1, top) and the third excited state (b = √7, bottom).

we note that similar behaviour was found by jakubský and znojil [19] in their pt-symmetric square well model. in their terminology, energy levels which coalesce at a branch point and turn complex are called “fragile”, while energy levels whose eigenvalues remain real for any strength of the perturbation are called “robust”. the latter also include states where the delta functions happen to be at or close to a node of the wave function, and which therefore remain unaffected by the perturbation. examples of this can be seen in figure 3. in figure 2 the ground state coalesces with the first excited state, while in figure 3, at b = 1 (top panel), it has become a single real level for any value of γ, and the first excited state coalesces with the second excited one. the question arises how the transition between the different coalescence behaviours occurs. this is illustrated in figure 4, where the real and imaginary parts of the eigenvalues emerging from the ground state and the first two excited levels are shown as functions of γ, for three positions of the delta functions around b ≈ 0.9.

figure 4. change in the coalescence and bifurcation behaviour of the three lowest eigenstates as a function of γ at three different positions of the delta functions in the vicinity of b = 0.9 (top: real parts, bottom: imaginary parts of the eigenvalues).

it is evident that at b = 0.897 the ground and first excited states still coalesce, giving rise to a pair of complex conjugate eigenvalues which “collides” with the real eigenvalue of the second excited level. the latter then turns into another pair of complex conjugate eigenvalues. the behaviour is similar for b = 0.915, but here the pair of complex eigenvalues resulting from the merger of the ground and first excited states disappears by splitting into two real eigenvalues, the lower of which remains real for any γ, while the higher, after a small interval of γ, coalesces with the real eigenvalue of the second excited state and a new pair of complex eigenvalues is born.
finally, at b = 0.925 the transition has occurred: the ground state has become a single real level, and the first two excited states come together. we note again that similar behaviour was found by krejčiřík and siegl [20] in their studies of eigenvalues in a square well with robin boundary conditions (cf. figure 6 in [20]).

we now proceed to results for nonvanishing nonlinearity. figure 5 shows the spectrum for a value of g = 5 with the delta functions placed at b = 1. the overall behaviour is similar (many states coalesce at branch points), but there are significant differences. as in our previous studies of pt-symmetric hamiltonians with nonlinearity [2–6, 25], pairs of complex conjugate eigenvalues (imaginary parts not shown) appear before the branch points are reached. again there exist “robust” levels whose eigenvalues remain real for any value of γ.

figure 5. real parts of the eigenvalues of (3) with finite nonlinearity g = 5, evolving from the unperturbed levels with n = 0, . . . , 29, for b = 1 in dependence on the strength of the non-hermiticity. blue lines denote purely real eigenvalues, whereas the real parts of complex eigenvalues (single lines) are shown in red.

4. eigenvalue shifts

for vanishing nonlinearity, the upper part of figure 6 shows, for γ = 1 and different positions of the delta functions, the shifts of the eigenvalues as a function of the quantum number n, in comparison with the predicted shrink rate of n^(−1/2) of [9] and the improved estimate of log(n)/n^(3/2) [10]. it can be recognized that the n^(−1/2) dependence is indeed an upper bound on the shrink rate and, moreover, that the improved estimate is in excellent agreement with the numerical data, irrespective of the position of the deltas!

figure 6. shift of the eigenvalues for γ = 1 and different positions of the delta functions in comparison with the mathematically estimated n^(−1/2) and log(n)/n^(3/2) dependence. top: vanishing nonlinearity, g = 0; bottom: g = 2.

the picture changes when the nonlinearity is switched on. the bottom part of figure 6 shows eigenvalue shifts for g = 2 and different positions of the imaginary delta potentials. from the comparison of the scales of the vertical axes in figure 6 it can be seen that with nonlinearity the shifts even for the highest states are still bigger than 0.1, while they have already dropped well below 10^−2 in the linear case. furthermore, the shrink rate is found to be slower (approximately proportional to n^(−0.37)) than predicted by both rigorous mathematical estimates for the linear case. we therefore find significant differences in the shrink rates with and without nonlinearity.
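the slope of approximately −0.37 quoted above is the kind of number extracted by a least-squares line in log-log coordinates; the following sketch demonstrates the procedure on synthetic (invented) shifts, not on the actual data:

import numpy as np

rng = np.random.default_rng(1)
n = np.arange(2, 90)
# mock eigenvalue shifts with a known power law and some scatter
dmu = 0.8 * n**-0.37 * np.exp(0.1 * rng.standard_normal(n.size))

slope, _ = np.polyfit(np.log(n), np.log(dmu), 1)
print(f"fitted shrink-rate exponent: {slope:.2f}")  # recovers roughly -0.37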
for small γ, all eigenvalues are still real. the question therefore suggests itself how the situation changes when eigenvalues are involved that are shifted into the complex plane, which happens for increasing γ. figure 7 shows, for the eigenvalue spectra with b = 0.2, vanishing nonlinearity and different values of γ, the distances |∆µ| of the real or complex eigenvalues from their original values (cf. also figure 2).

figure 7. shift of the real and complex eigenvalues for growing values of γ, for b = 0.2 and g = 0.

the outlier at n = 8 and 9 is caused by the giant change of the real and imaginary parts of the complex eigenvalues beyond the branch point of the corresponding states, observed already in figure 2. a similar outlier occurs around n = 75, and a monotonous decrease of the eigenvalue shifts only sets in for n ≈ 80 in our calculations.

finally, in figure 8 we investigate the effect of the nonlinearity on the eigenvalue shifts for γ = 1.5 and b = 0.5 and b = 1. we find that the eigenvalue shifts oscillate around straight lines with slopes of −0.37, irrespective of the strength of the nonlinearity. the amplitude of the oscillations, however, and their number depend on the position of the delta functions. again the slope of the estimate proportional to n^(−1/2), also shown in the figure, is steeper than the actual slopes found in the numerical results.

figure 8. shift of the eigenvalues evolving from µn = 2n + 1 for four different values of the nonlinearity at γ = 1.5, for b = 0.5 (top) and b = 1 (bottom).

5. conclusions

we have carried out a numerical analysis of a pt-symmetric double-delta perturbation of the harmonic oscillator. we have also considered the case where in addition a gross-pitaevskii nonlinearity proportional to |ψ|² is present. with the latter, the system can be considered as a model of a bose-einstein condensate in a double well with loss and gain of atoms. we have checked that the mityagin-siegl criterion for the perturbed eigenfunctions to form a riesz basis is fulfilled, and have compared rigorous mathematical estimates for the shrink rate of the eigenvalue shifts in dependence on the harmonic oscillator quantum number n for various strengths of the non-hermiticity and the nonlinearity. we have verified that in the linear case the mathematical prediction for the shrink rates proportional to 1/n^(1/2) is a valid estimate, and that the improved estimate proportional to log(n)/n^(3/2) is in excellent agreement with the behaviour of the shrink rates found in the numerical results. by contrast, with nonlinearity we find slopes of approximately −0.37, less steep than both mathematical estimates. evidently, deriving estimates also for the nonlinear case remains a mathematical challenge. a peculiarity that is found is the occurrence of outliers in the eigenvalue spectra, which appear beyond branch points with unusually large real and imaginary parts of their complex conjugate eigenvalues. they are the reason why, for growing strength of the non-hermiticity, the asymptotic shrink-rate behaviour is attained only for high values of n. the nature of these outliers and their mathematical importance should certainly be clarified in future studies.

acknowledgements

we are grateful to petr siegl for helpful explanations and discussions. we are also grateful to boris mityagin for very helpful remarks, and for communicating his improved estimate for the two-delta potentials. furthermore we thank two anonymous referees for valuable comments.
in particular, one referee pointed out that a deeper analytic insight into the bottom of the spectra could probably be obtained using the strategies described in ref. [22] for the self-adjoint case, if adapted to the pt-symmetric perturbation. this is certainly also a useful suggestion for future work.

references
[1] s. klaiman, et al. visualization of branch points in pt-symmetric waveguides. phys rev lett 101:080402, 2008. doi: 10.1103/physrevlett.101.080402
[2] h. cartarius, et al. model of a pt-symmetric bose-einstein condensate in a δ-function double-well potential. phys rev a 86:013612, 2012. doi: 10.1103/physreva.86.013612
[3] h. cartarius, et al. nonlinear schrödinger equation for a pt-symmetric delta-function double well. j phys a 45:444008, 2012. doi: 10.1088/1751-8113/45/44/444008
[4] h. cartarius, et al. stationary and dynamical solutions of the gross-pitaevskii equation for a bose-einstein condensate in a pt-symmetric double well. acta polytechnica 53(3):259–267, 2013.
[5] d. dast, et al. a bose-einstein condensate in a pt-symmetric double well. fortschr physik 61:124–139, 2013. doi: 10.1002/prop.201200080
[6] d. dast, et al. eigenvalue structure of a bose-einstein condensate in a pt-symmetric double well. j phys a 46:375301, 2013. doi: 10.1088/1751-8113/46/37/375301
[7] m. kreibich, et al. hermitian four-well potential as a realization of a pt-symmetric system. phys rev a 87:051601(r), 2013. doi: 10.1103/physreva.87.051601
[8] j. adduci, et al. eigensystem of an l2-perturbed harmonic oscillator is an unconditional basis. cent eur j math 10:569, 2012. doi: 10.2478/s11533-011-0139-3
[9] b. mityagin, et al. root system of singular perturbations of harmonic oscillator type operators, 2013. arxiv:1307.6245.
[10] b. mityagin, private communication, january 2014.
[11] h. f. jones. the energy spectrum of complex periodic potentials of the kronig-penney type. phys lett a 262:242, 1999. doi: 10.1016/s0375-9601(99)00672-6
[12] z. ahmed. energy band structure due to a complex, periodic, pt-invariant potential. phys lett a 286:231, 2001.
[13] e. demiralp. bound states of n-dimensional harmonic oscillator decorated with dirac delta functions. j phys a 38:4783, 2005. doi: 10.1088/0305-4470/38/22/003
[14] e. demiralp. properties of a pseudo-hermitian hamiltonian for harmonic oscillator decorated with dirac delta interactions. czech j phys 55:1081, 2005. doi: 10.1007/s10582-005-0110-2
[15] a. mostafazadeh. delta-function potential with a complex coupling. j phys a 39:13495, 2006. doi: 10.1088/0305-4470/39/43/008
[16] a. mostafazadeh, et al. spectral singularities, biorthonormal systems and a two-parameter family of complex point interactions. j phys a 42:125303, 2009. doi: 10.1088/1751-8113/42/12/12503
[17] h. mehri-dehnavi, et al. application of pseudo-hermitian quantum mechanics to a complex scattering potential with point interactions. j phys a 43:145301, 2010. doi: 10.1088/1751-8113/43/14/145301
[18] h. uncu, et al. bose-einstein condensate in a harmonic trap with an eccentric dimple potential. laser phys 18:331, 2008.
[19] v. jakubský, et al. an explicitly solvable model of the spontaneous pt-symmetry breaking. czech j phys 55:1113–1116, 2005. doi: 10.1007/s10582-005-0115-x
[20] d. krejčiřík, et al. pt-symmetric models in curved manifolds. j phys a 43:485204, 2010. doi: 10.1088/1751-8113/43/48/485204
[21] p. siegl. pt-symmetric square well-perturbations and the existence of metric operator. int j theor phys 50:991, 2011. doi: 10.1007/s10773-010-0593-x
[22] s. fassari, et al. on the spectrum of the schrödinger equation of the one-dimensional harmonic oscillator perturbed by two identical attractive point interactions. rep math phys 69:353, 2012.
[23] e. p. gross. structure of a quantized vortex in boson systems. nuovo cimento 20:454, 1961.
[24] l. p. pitaevskii. vortex lines in an imperfect bose gas. sov phys jetp 13:451, 1961.
[25] w. d. heiss, et al. spectral singularities in pt-symmetric bose-einstein condensates. j phys a 46:275307, 2013. doi: 10.1088/1751-8113/46/27/275307

acta polytechnica doi:10.14311/ap.2014.54.0271 acta polytechnica 54(4):271–274, 2014 © czech technical university in prague, 2014 available online at http://ojs.cvut.cz/ojs/index.php/ap

search for isolated black holes: past, present, future

sergey karpov∗, grigory beskin, vladimir plokhotnichenko
special astrophysical observatory of russian academy of sciences
∗ corresponding author: karpov@sao.ru

abstract. the critical property of a black hole is the presence of an event horizon. it may be detected only by means of a detailed study of the emission features of its surroundings. the temporal resolution of such observations has to be comparable to r_g/c, which is in the 10^−6–10 s range, depending on the mass of the black hole. at sao ras we have developed the mania hardware and software complex, based on the panoramic photon counter, and we use it in observations on the 6 m telescope for searching for and investigating the optical variability of various astronomical objects on time scales of 10^−6–10^3 s. we present here the hardware and methods used for these photometric, spectroscopic and polarimetric observations, together with the principles and criteria for object selection. the list of objects includes objects with featureless optical spectra (dc white dwarfs, blazars) and long microlensing events.

keywords: black holes, accretion, observations.

1. accretion onto isolated stellar-mass black holes

even though more than 60 years have passed since the theoretical prediction of black holes as astrophysical objects [1], in a certain sense they still have not been discovered. the features of a black hole — the compactness (the size for mass m is close to the schwarzschild radius r_g = 2gm/c²) and a mass larger than 3 m⊙ — are necessary but not sufficient. the key property of a black hole is the presence of an event horizon instead of a usual surface. it is necessary to detect and study the emission generated very close to the event horizon to decide whether the horizon is present in a given compact and massive object.
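as a quick illustration of the r_g/c scaling behind the quoted temporal-resolution requirement, the light-crossing time of the schwarzschild radius can be tabulated (cgs constants; the masses are example values):

G, c, Msun = 6.674e-8, 2.998e10, 1.989e33  # cgs

for mass in (3.0, 10.0, 1e6, 1e8):         # in solar masses
    t = 2.0 * G * mass * Msun / c**3       # r_g/c with r_g = 2GM/c^2
    print(f"m = {mass:9.0e} msun -> r_g/c = {t:.1e} s")

these are the timescales that any horizon-probing observation must resolve.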
this is a very complicated task that cannot be easily performed in x-ray binaries and agns due to the high accretion rate and, consequently, the high optical depth of the accreting gas. at the same time, single stellar-mass black holes which accrete an interstellar medium of low density (10^−2–1 cm^−3) are the ideal case for detecting and studying an event horizon [2]. the analysis of existing data on possible black hole masses and velocities, in comparison with the interstellar medium structure, shows that in the majority of cases in the galaxy (> 90 %) the dimensionless accretion rate ṁ = Ṁc²/l_edd cannot exceed 10^−6–10^−7 [3]. for typical interstellar medium inhomogeneity the captured specific angular momentum is smaller than that on the bh last stable orbit, and the accretion is always spherically symmetric [4, 5]. the only possible exception is the case of a slowly-moving (v < 10 km/s) black hole in cold dense clouds of interstellar hydrogen (n ∼ 10^2–10^5 cm^−3, t ∼ 10^2 k), where the accretion may become disk-like and the accretion rate is high enough to provide luminosities up to 10^38–10^40 erg/s.

the plasma in the accretion flow is collisionless all the way down to the event horizon, and the continuity of the flow is provided by the magnetic field. correct treatment of adiabatic heating, the major heating mechanism in a spherically symmetric accretion flow, shows that it is 25 % more efficient than for an ideal gas [3]. as a result, the plasma temperature in the accretion flow grows much faster and the electrons become relativistic earlier: the radius where the electrons become relativistic is r_rel ≈ 6000 r_g, in contrast to r_rel ≈ 1300 r_g in [6] and r_rel ≈ 200 r_g in [7]. the accretion flow is much hotter, and our estimate of the “thermal” luminosity,

l = ηṀc² = 9.6 · 10^33 m_10^3 n_1^2 (v² + c_s²)_16^−3 erg/s, (1)

is significantly higher than the estimates derived by ipser & price [7],

l_ip = 1.6 · 10^32 m_10^3 n_1^2 (v² + c_s²)_16^−3 erg/s, (2)

and by bisnovatyi-kogan & ruzmaikin [6],

l_bkr = 2 · 10^33 m_10^3 n_1^2 (v² + c_s²)_16^−3 erg/s, (3)

while the optical spectrum shape is nearly the same (see fig. 1). the efficiency of accretion as a function of the accretion rate is shown in fig. 4.

figure 1. decomposition of the emission spectrum of a single black hole (with mass 10 m⊙) into thermal and nonthermal parts. the accretion rate is 1.4 · 10^10 g/s, which corresponds to ṁ = 10^−8.

figure 4. efficiencies of the synchrotron emission of the thermal and non-thermal electron components of the accretion flow.
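for orientation, the sketch below compares the three estimates (1)–(3) at fiducial parameters. the reading of the normalisations (m_10 = m/(10 m⊙), n_1 = n/(1 cm^−3), and (v² + c_s²) in units of (16 km/s)²) is an interpretation of the subscripts in (1)–(3) and should be checked against [3, 6, 7]:

m10, n1, u16 = 1.0, 1.0, 1.0  # 10 msun, 1 cm^-3, v^2 + c_s^2 = (16 km/s)^2

common = m10**3 * n1**2 / u16**3
L_here = 9.6e33 * common      # eq. (1), this work [3]
L_ip   = 1.6e32 * common      # eq. (2), ipser & price [7]
L_bkr  = 2.0e33 * common      # eq. (3), bisnovatyi-kogan & ruzmaikin [6]

print(f"L(1)/L(2) = {L_here / L_ip:.0f}")   # ~ 60
print(f"L(1)/L(3) = {L_here / L_bkr:.1f}")  # ~ 4.8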
correct treatment of the magnetic field behaviour in the accretion flow is an extremely complicated task (see, for example, [8] and references therein). the basis of our analysis is the assumption of energy equipartition in the accretion flow (shvartsman’s theorem, see [2]), which determines the radial dependencies of both the infall velocity and the magnetic field strength. a direct consequence of this assumption is the necessity of magnetic energy dissipation, at a rate defined as the difference between the rate of magnetic energy increase for a purely frozen-in magnetic field and the rate for equipartition [6]. in the accretion models proposed earlier this dissipation runs continuously [9] in the turbulent flow and its mechanism is not examined in detail. we considered conversion of the magnetic energy in discrete turbulent current sheets [10] as a mechanism providing such dissipation, and studied the observational consequences of this process. these include the generation of various modes of plasma oscillations (mostly ion-acoustic and langmuir plasmons) and the acceleration of electrons, which is very important for the observational appearance of the whole accretion flow. the beams of accelerated electrons emit their energy due to their motion in the magnetic field and generate an additional nonthermal component on top of the synchrotron emission of the thermal particles (see figs. 1 and 2). a large fraction of the black hole emission is generated inside the 2r_g sphere, and so carries information about the physical conditions of space-time very close to the event horizon. an important property of the nonthermal emission is its flaring nature — the electron ejection process is discrete, and the typical light curve of a single beam is shown in fig. 3. the light curve of each flare has a stage of fast intensity increase and a sharp cut-off, and its shape reflects the properties of the magnetic field and of space-time near the horizon. a study of these flares in black hole emission therefore provides a way to directly probe the extreme space-time regions very close to the horizon.

figure 3. internal structure of a flare as a reflection of the electron cloud evolution. the prevailing physical mechanisms defining the observed emission are denoted and typical durations of the stages are shown.

2. observational appearance of an isolated black hole

a black hole at a distance of 100 pc (a sphere with this radius must contain several tens of such objects, see [11]) looks like a 15–25m optical object (due to the “thermal” spectral component) with a strongly variable counterpart in the high-energy spectral bands (the “nonthermal” component) [3].

figure 2. spectra of the accretion flow onto a 10 m⊙ black hole for various accretion rates.

the hard emission consists of flares, the majority of which are generated within a distance of 5r_g from the bh. these events have durations of ∼ r_g/c (∼ 10^−4 s), a rate of 10^3–10^4 flares per second, and an amplitude of 2–6 %. the variable x-ray emission of the bh can be detected by modern space-borne telescopes. the optical emission consists of a quasi-stationary “thermal” part and a low-frequency tail of the nonthermal flaring emission. the rate and duration of the optical flares are the same as for the x-ray flares, while their amplitudes are significantly smaller. indeed, the contribution of the nonthermal component to the optical emission is approximately 2 · 10^−2 for ṁ = 10^−8–10^−6, so the mean amplitudes of the optical flares are 0.04–0.12 %, while the peak amplitudes may be 1.5–2 times higher and may reach 0.2 %. certainly, it is nearly impossible to detect such single flares, but their collective power reaches 18–24m and may therefore be detected in observations with high time resolution (< 10^−4 s) by large optical telescopes [3]. of course, the variability of the bh emission is related not only to the electron acceleration processes described here. additional variability may result from plasma oscillations of various kinds, or from other types of instabilities. the time scale of such variability may range from r_g/c to r_c/c (where r_c = 2gm/(v² + c_s²) is the gas capture radius [12]), i.e., from microseconds to years.
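the two bracketing timescales just quoted can be made explicit for a 10 m⊙ black hole; the effective velocity sqrt(v² + c_s²) below is an assumed placeholder, and smaller values push r_c/c up by orders of magnitude towards the “years” end of the range:

G, c, Msun = 6.674e-8, 2.998e10, 1.989e33  # cgs
M = 10.0 * Msun
v_eff = 16e5                               # cm/s, assumed sqrt(v^2 + c_s^2)

r_g = 2.0 * G * M / c**2                   # schwarzschild radius
r_c = 2.0 * G * M / v_eff**2               # capture radius r_c = 2GM/(v^2 + c_s^2)

print(f"r_g/c = {r_g / c:.1e} s, r_c/c = {r_c / c:.1e} s")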
3. the search for isolated stellar mass black holes in the optical band

the most striking property of the accretion flow onto a single black hole is its inhomogeneity — the clots of plasma act as probes testing the space-time properties near the horizon. the characteristic timescale of the emission variability is τv ∼ r_g/c ∼ 10^−4–10^−5 s, and this short stochastic variability may be considered a distinctive property of a black hole as the smallest possible physical object with a given mass. its parameters — spectra, energy distribution and light curves — carry important information on the space-time properties of the horizon [3]. the general observational appearance of a single stellar-mass black hole at typical interstellar medium densities — its brightness and featureless optical spectrum — is the same as that of other optical objects without spectral lines — dc-dwarfs and rocoses (radio objects with continuous optical spectra, a subclass of blazars) [13]. the suggestion that isolated bhs can be among them forms the basis of the observational programme in search of isolated stellar-mass black holes — mania (multichannel analysis of nanosecond intensity alterations). it uses photometric observations of candidate objects with high time resolution, special hardware and data analysis methods [14, 15], and is based on the fact that fast variability is the critical property of isolated black hole emission. in observations of 40 dc-dwarfs and rocoses using the 6-meter telescope of the special astrophysical observatory and the standard high-time-resolution photometer based on photomultipliers, only upper limits for the variability levels, of 20 % down to 5 % on time scales of 10^−6–10 s respectively, were obtained, i.e., bhs were not detected [15–17].

in the new stage of the mania experiment, since the end of the 1990s, we have developed the multichannel panoramic spectro-polarimeter (mppp), based on a position-sensitive detector (psd) with 1 µs time resolution ([18, 19]). such detectors use a set of microchannel plates (mcp) for electron multiplication and a multi-electrode collector to determine the incoming photon position. the psd used in our observations has the following parameters: quantum efficiency of 10 % in the 3700–7500 å range (s20 photocathode), mcp stack gain of 10^6, spatial resolution of 70 µm (0.21′′ for the 6 m telescope), 700 ns time resolution, 7 · 10^4 pixels within a 22 mm working diameter, and 200–500 counts/s detector noise. the “quantochron 4-480” spectral time-code converter, with 30 ns time resolution and a maximal count rate of 10^6 counts/s, is used as the acquisition system. this equipment allows us to study 20–22m objects with a 1-hour exposure (under good weather conditions) with microsecond temporal resolution [20, 21].
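the statistical core of such a fast-variability search can be sketched schematically (this is an illustration of the idea, not the actual mania software): bin the photon arrival times on a grid of trial timescales and compare the count variance with the poisson expectation,

import numpy as np

rng = np.random.default_rng(0)
rate, t_total = 1e4, 10.0                          # counts/s, seconds
n_phot = rng.poisson(rate * t_total)
arrivals = np.sort(rng.uniform(0.0, t_total, n_phot))

for dt in (1e-4, 1e-3, 1e-2, 1e-1):                # trial timescales in s
    counts, _ = np.histogram(arrivals, bins=int(t_total / dt),
                             range=(0.0, t_total))
    # normalised excess variance: ~0 for pure poisson noise, > 0 if variable
    nxv = (counts.var() - counts.mean()) / counts.mean()**2
    print(f"dt = {dt:.0e} s: excess variance = {nxv:+.1e}")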
in recent years, the population of objects with featureless optical spectra and without known localizations has been extended by means of the cross-correlation of surveys in various wave bands (from radio to gamma) and follow-up spectroscopic observations [13, 22]. these objects are the major targets of the new stage of the mania experiment. in addition, some evidence has recently appeared that isolated stellar-mass black holes may be among the unidentified gamma-ray sources [23, 24]. another large class of candidate objects for isolated black holes are those where an independent estimate of the mass is possible, e.g., long-lasting macho microlensing events [25]. another possibility is a black hole in a binary system with a white dwarf (though, strictly speaking, this type of black hole is not isolated, it accretes from interstellar gas only, and may therefore be considered in the same way). this type of binary may be detected by means of its periodic brightness amplification on the tens-of-seconds time scale due to gravitational microlensing [26]. by using the technical equipment of sdss (23m limit in a 6-square-degree field), roughly 15 such objects may be detected in the course of 5 years. it is clear that in such a system it is easy to estimate the mass of the black hole.

acknowledgements

this work has been supported by intas grant no 04-787366, by the russian foundation for basic research (grant no 04-02-17555), and by the russian science support foundation. s.k. has also been supported by a grant from the dynasty foundation. g.b. thanks the landau network–centro volta and the cariplo foundation for a fellowship, and the brera observatory for hospitality.

references
[1] j. r. oppenheimer, et al. on continued gravitational contraction. physical review 56:455–459, 1939. doi:10.1103/physrev.56.455.
[2] v. f. shvartsman. halos around “black holes”. astronomicheskij zhurnal 48:479–488, 1971.
[3] g. m. beskin, et al. low-rate accretion onto isolated stellar-mass black holes. a&a 440:223–238, 2005. arxiv:astro-ph/0403649 doi:10.1051/0004-6361:20040572.
[4] h. mii, et al. ultraluminous x-ray sources: evidence for very efficient formation of population iii stars contributing to the cosmic near-infrared background excess? apj 628:873–878, 2005. arxiv:astro-ph/0501242 doi:10.1086/430942.
[5] g. beskin, et al. search for the event horizon by means of optical observations with high temporal resolution. in v. karas, et al. (eds.), iau symposium, vol. 238 of iau symposium, pp. 159–163. 2007. doi:10.1017/s1743921307004899.
[6] g. s. bisnovatyi-kogan, et al. the accretion of matter by a collapsing star in the presence of a magnetic field. apss 28:45–59, 1974. doi:10.1007/bf00642237.
[7] j. r. ipser, et al. synchrotron radiation from spherically accreting black holes. apj 255:654–673, 1982. doi:10.1086/159866.
[8] i. v. igumenshchev, et al. three-dimensional magnetohydrodynamic simulations of spherical accretion. apj 566:137–147, 2002. arxiv:astro-ph/0105365 doi:10.1086/338077.
[9] g. s. bisnovatyi-kogan, et al. influence of ohmic heating on advection-dominated accretion flows. apjl 486:l43–l46, 1997. arxiv:astro-ph/9704208 doi:10.1086/310826.
[10] l. a. pustilnik. stability of accretion models. apss 252:353–362, 1997.
[11] e. agol, et al. x-rays from isolated black holes in the milky way. mnras 334:553–562, 2002. arxiv:astro-ph/0109539 doi:10.1046/j.1365-8711.2002.05523.x.
[12] h. bondi, et al. on the mechanism of accretion by stars. mnras 104:273–282, 1944.
[13] s. a. pustilnik. the list of radio objects with purely continuum optical spectra. preliminary analysis of their features. soobshcheniya spetsial’noj astrofizicheskoj observatorii 18:3–41, 1976.
[14] v. f. shvartsman. the mania [multichannel analysis of nanosecond intensity alterations] experiment. astrophysical problems, mathematical methods, instrumentation complex, results of the first observations. soobshcheniya spetsial’noj astrofizicheskoj observatorii 19:5–38, 1977.
[15] g. m. beskin, et al. the investigation of optical variability on time scales of 10^−7–10^2 s: hardware, software, results. experimental astronomy 7:413–420, 1997.
[16] v. f. shvartsman, et al. the results of a search for superrapid optical variability of radio objects with continuous optical spectra. astrofizika 31(3):685–690, 1989.
[17] v. f. shvartsman, et al. a search for 0.5-microsecond to 40-second optical variability in dc white dwarfs. soviet astronomy letters 15:145–149, 1989.
[18] v. debur, et al. position-sensitive detector for the 6-m optical telescope. nuclear instruments and methods in physics research a 513:127–131, 2003. arxiv:astro-ph/0310353 doi:10.1016/s0168-9002(03)02233-2.
[19] v. plokhotnichenko, et al. the multicolor panoramic photometer-polarimeter with high time resolution based on the psd. nuclear instruments and methods in physics research a 513:167–171, 2003. arxiv:astro-ph/0310354 doi:10.1016/s0168-9002(03)02249-6.
[20] v. l. plokhotnichenko, et al. high-temporal resolution multimode photospectropolarimeter. astrophysical bulletin 64:308–316, 2009. doi:10.1134/s1990341309030109.
[21] v. g. debur, et al. high temporal resolution coordinate-sensitive detector with gallium-arsenide photocathode. astrophysical bulletin 64:386–391, 2009. doi:10.1134/s1990341309040087.
[22] g. tsarevsky, et al. a search for very active stars in the galaxy. first results. a&a 438:949–955, 2005. arxiv:astro-ph/0502235 doi:10.1051/0004-6361:20042274.
[23] n. gehrels, et al. discovery of a new population of high-energy γ-ray sources in the milky way. nature 404:363–365, 2000.
[24] n. la palombara, et al. x-ray and optical coverage of 3eg j0616-3310 and 3eg j1249-8330. mem. soc. astron. italiana 75:476, 2004. arxiv:astro-ph/0403705.
[25] d. p. bennett, et al. gravitational microlensing events due to stellar-mass black holes. apj 579:639–659, 2002. arxiv:astro-ph/0109467 doi:10.1086/342225.
[26] g. beskin, et al. detection of compact objects by means of gravitational lensing in binary systems. a&a 394:489–503, 2002.

acta polytechnica vol. 42 no. 3/2002

prague loretto – measurements of moisture content in sculptural group material

j. římal

abstract: an innovative method of spatial measurement of the moisture distribution in a sculptural group has been developed and verified. this method facilitates monitoring of the moisture distribution and, therefore, of the condition of the sculptural group in a spatial perspective. the measurements were carried out over seven months in order to take into account all climatic effects influencing the sculpture. these measurements resulted from the practical requirement to provide source material for future conservation interventions to be conducted on the sculptural group in the prague loretto.

keywords: prague loretto, measurement of moisture fields, mass transport.

1 introduction

the requirement to measure the material moisture content of a sculptural group within the prague loretto complex was based on its condition in 1991. the sculptural group was in critical condition, with typical symptoms of deep sandstone degradation in the form of chinks, cracks and surface spalling of the larger parts. the objective causes of this condition were directly connected with the properties of the sandstone, which is relatively heterogeneous, with numerous hydrolytic transformations of ferrous minerals (limonite).
erosive failures had presumably also occurred in the sandstone in the more distant past, always provoked by water, and several attempts had already been made to conserve the sculptural group.

fig. 1: prague loretto

fig. 2: fountain decorated with the sculpture of the resurrection of our lord – jan michael brüderle and richard prachner (1739–1740)

previous conservation attempts on the sculptural group are the second, subjective cause of its contemporary dilapidated condition. the remains of these conservation attempts can still be identified, namely in the form of fragments of various fillings and surface crusts, originally meant in good faith, but which today are the fundamental sources of secondary erosion and corrosion processes. these interventions, in particular, have violated the basic requirements for natural water migration within the porous structure of the stone. in turn, water tends to accumulate in various intermediate stages of the natural stone, and in its artificial forms. this has produced the usual negative effects, such as frost heaving pressures during winter periods, hydrolytic effects, hydration and crystallization pressures, etc. it is obvious that the condition is not improved by the increased moisture content in the stone, due both to storm water and to capillary elevation from the subsoil of the sculptural group. there is nowadays heavy air pollution in prague, especially in the hradčany locality. all types of water within the porous system of the stone, in addition to the above-mentioned negative effects, develop into sorption or absorption centres for the accumulation of acid emissions. it is therefore necessary to know the exact values of the mass moisture content and, on the basis of this knowledge, to design measures for the long-term protection of the sculptural group.

based on this critical assessment of the situation and additional observations of the sculptural group, an extensive programme was designed in agreement with the monastery management, with the aim of creating a descriptive model of the condition of the entire volume of the sculptural group with regard to the distribution of moisture maps inside the structure in relation to climatic conditions, namely rainfall intensity and the moisture content due to capillary elevation from the subsoil.

2 measurement technique

it was decided to select a conductometric method for the measurements. this method involves a low input voltage and the use of copper electrodes fixed at pre-selected points of the sculptural group that remained stable throughout the year.
in the given configuration, the method is highly accurate (the slight changes in the polarization and corrosion potentials of the copper electrodes in the course of the year may be neglected), allowing, in particular, an investigation of the entire structure. by combining the measured points (electrodes) into electrode pairs and making sensitive measurements of the current flowing between them, practically all vertical and horizontal cross sections of the sculptural group could be mapped. the probes consisted of copper wire with a diameter of 2 mm and a length of 60 mm, mounted by means of lime dab into previously prepared boreholes with a diameter of 3 mm. this type of copper pin anchorage guarantees an alkaline environment in the immediate vicinity of the electrodes for a period of at least one year, thus stabilizing the electrolytic medium in terms of the appearance of variable polarization and corrosion potentials. the points for anchoring the boreholes and electrodes were selected with due care in order not to damage the original, valuable relief surface, and also to include points exposed to rainfall or, by contrast, those situated in rainfall shades, within all horizontal and vertical sections of the sculptural group.

the measurements of the currents were implemented by a digital system using apparatus by system ultrakust (frg), multimeter (uk) and sanwa electric (japan), with an input working potential of 2 v, which, at currents ranging up to 10 ma, ensured only slight excess values of the generated hydrogen voltage. the use of copper electrodes of the first type was preferred to electrodes of the second type with electrolytic bridges. the latter type of electrode has the fundamental advantage of eliminating polarization voltages but, over a longer period of time, shows changes in electrolyte concentrations and potentially also in electrolyte volumes, which may subsequently have a decisive effect on the values of the read currents. copper electrodes may be regarded as the most suitable solution as, when used in a sandstone medium with the respective moisture contents, they become covered with well-defined corrosion films, which secondarily passivate the surface and thus provide long-term electrical stabilization. the configuration used even allowed us to measure the surface layer of the sculptural group by means of neighbouring electrodes, which was of interest namely during the period immediately following rainfall. the distribution of the selected measured points is indicated in fig. 2. in all, 14 measured points were numbered from the bottom edge of the sculptural group almost to its top.

moisture content gauging was performed using samples of the original sandstone after it had been bored off. the samples came both from the socle and from the scaled layers. simple analyses did not show extreme contents of free anions, which could, namely in heterogeneous distribution within the volume of the sculptural group, considerably affect the conductometric data. gravimetric gauging using equally fixed electrodes in the original sandstone samples practically eliminated all negative effects of electrolytic interfaces. in fact, it turned out during the measurements that only about two electrode combinations showed some signs of fluctuations in the results, namely after heavy moistening, which could be attributed to electrolytic concentration effects.
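as an illustration of how the gravimetric calibration described above might be applied in practice (the calibration pairs below are invented placeholders, not the measured data), a current reading can be mapped to a mass moisture content by interpolation:

import numpy as np

cal_current  = np.array([0.05, 0.2, 0.8, 2.5, 6.0, 10.0])  # ma, calibration samples
cal_moisture = np.array([1.0, 2.0, 4.0, 6.0, 9.0, 12.0])   # % mass moisture (gravimetric)

def moisture_from_current(i_ma):
    # interpolate on log(current), since conductivity rises steeply with moisture
    return np.interp(np.log(i_ma), np.log(cal_current), cal_moisture)

print(f"{moisture_from_current(1.5):.1f} % for a 1.5 ma reading")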
the charts treat the actual values of the mass moisture content in various ways, in relation to the time of measurement and the mutual positions of the measured points, over a seven-month period, in order to show nearly a full cycle, including spring, summer and autumn. graph 2 shows the gradients of the moisture field measured at a selected basic point in relation to the other (14) points measured all over the statue. the charts make it clear that we managed to collect complete sets of moisture data for the measured period. their analysis and interpretation enable us to draw wider conclusions on the moisture distribution inside the sculptural group and on the movement of the moisture zones inside the structure during one season of the year. it is also evident how the sculptural group reacted to heavy rainfall and how the water moved or was retained inside chinks and caverns, or how it migrated through the porous sandstone system.

graph no. 1: prague loretto – moisture content measurement of the material of the sculptural group resurrection of our lord, 9. 5.–14. 10. 1992 (measured points 1–15; dates of measurement: 9th may, 24th may, 5th june, 25th august, 14th october; moisture content 0–12 %).
graph no. 2: prague loretto – moisture content measurement of the material of the sculptural group, 1. 10. 1992 (measured points related to the basic points 2, 4, 5, 7, 9 and 11; moisture content 0–10 %).

a study of the moisture profiles enabled the water contributions due to capillary elevation and rainfall to be estimated. though some combinations of measured points show considerable fluctuations of values depending on the actual moisture contents, it is possible to trace the basic moistening trends during the different seasons, and also during the period of intensive summer drying of the surface parts and their successive wetting after a heavy downpour. in absolute values, the highest mass moisture contents range above 10 %, but with regard to the erosion and corrosion behaviour of the sandstone, the steady average moistening of roughly the lower half of the sculptural group, ranging from 6 to 8 %, seems to be more dangerous, as this is roughly twice the value of the corresponding balanced moisture content of the respective sandstone. these values of moistening are probably the basic source of failures in the intermediate material stages of the structure.

4 conclusion

the above-described measurements of the mass moisture content values in the sculptural group resurrection of our lord, located in the paradise court of the prague loretto, show that the structure is exposed to excessive moistening practically throughout the seven-month cycle. on average, the values of sandstone moistening are twice as high as would correspond to its average balanced moisture content.
the basic source of the current moistening of the lower parts of the sculptural group is capillary elevation of water from the subsoil. contributions after rainfall cause high-level sandstone saturation, reaching values of around 12 % of the mass moisture content, which must be regarded as critical moistening, namely in connection with the existence of watertight fillings and surface crusts. it should be emphasised that it is water, in its various forms within the porous sandstone structure, that is causing accelerating damage to the sculptural group. to conclude, we recommend, as long-term rehabilitation measures on the sculptural group, the installation of a damp-proofing barrier against capillary elevation of moisture from the subsoil and, at the same time, the installation of roofing above the structure. only complete elimination of excessive moisture sources can guarantee the success of any preservation or conservation interventions that will be needed on the sculptural group.

acknowledgement

this project is being carried out with the participation of ph.d. students ing. jan pytel and ing. jana zaoralová and undergraduate student jakub římal. this research has been supported by the grant agency of the czech republic under grant no. 103/00/0776.

doc. rndr. jaroslav římal, drsc.
phone: +420 224 354 702
fax: +420 233 333 226
e-mail: rimal@fsv.cvut.cz
czech technical university
faculty of civil engineering
thákurova 7
166 29 prague 6, czech republic

acta polytechnica 55(5):347–351, 2015, doi:10.14311/ap.2015.55.0347

changes in the surface layer of rolled bearing steel

oskar zemčík∗, josef chladil, josef sedlák
institute of manufacturing technology, technická 2896/2, brno 616 69, czech republic
∗ corresponding author: zemcik.o@fme.vutbr.cz

abstract. this paper describes changes observed in bearing steel due to roller burnishing. hydrostatic roller burnishing was selected as the most suitable method for performing roller burnishing on hardened bearing steel. the hydrostatic roller burnishing operation was applied as an additional operation after standard finishing operations. all tests were performed on samples of 100cr6 material (en 10132-4), and changes in the surface layer of the workpiece were then evaluated. several simulations using finite element methods were used to obtain the best possible default parameters for the tests. the residual stress and the plastic deformation during roller burnishing were the major parameters that were tested.

keywords: bearing steel; roller burnishing; finite element method; residual austenite; residual stress.

1.
races of rolling-element bearings

roller bearings form an integral part of a large number of machines and devices. they are widely used in the automotive industry, in transport machinery, in machine tools, in aerospace engineering, etc. their mechanical properties and their reliability have a significant impact on the operation of the entire system. various types of roller bearings are made, ranging from the most common ball bearings through tapered, cylindrical and barrel bearings to special types for specific purposes. the critical factor is the dynamically loaded surface layer of the bearing ring. this is what ultimately leads to fatigue failure (see fig. 1). in the case of bearing rings made of bearing steel 14109 (din 1.3505, 100cr6) and 14209 (din 1.3520, 100crmn6), the layer is thermally treated, and is then ground and superfinished. to increase the life and the reliability of a roller bearing, the dynamically loaded bearing surface may be submitted to roller burnishing, which improves the properties of the surface layer of the ring without actually changing the dimensions of a ring that has already been machine processed [11].

2. hydrostatic roller burnishing

the roller burnishing method does not eliminate the total residual stress induced, for example, by previous grinding, but it causes compressive stresses by plastification of the surface layer [1, 3, 4, 9]. the compressive stresses on the surface of the burnished area then prevent the development of cracks and eliminate the effects of micronotches. it is suitable to use a hydrostatic roller burnishing head with a ball-shaped element roughly 3 mm in diameter as a roller burnishing tool for hardened bearing steel. hydrostatic roller burnishing heads made by hegenscheidt or by ecoroll [14] can be given as examples; in both cases, constant pressure of the roller burnishing element on the surface of the workpiece is ensured, along with a supply of process fluid to the point of contact. convenient conditions were provided by speed values in the range of 20–100 m min−1, feeds of 0.05 to 0.2 mm, and a working force derived from the fluid pressure of 500–2500 n [1, 6].

figure 1. pitting on a bearing ring.
figure 2. the hegenscheidt hydrostatic roller burnishing tool [11].

simulation by means of the finite element method can provide a better understanding of the mechanism of plastic deformation in the surface layer of a roller-burnished material [3, 4]. the ansys program was selected as a suitable candidate, and a simulation was made of the stress and plastic deformations in the immediate vicinity of the contact point between the workpiece and the tool [12, 13]. the model itself was reduced to a depth of 0.5 mm for the workpiece, and the ball segment was reduced to a length of 0.3 mm [4]. the resultant values showed a high stress value in the vicinity of the contact point between the workpiece and the roller-burnishing element, and also showed the formation of a plastically deformed area in the surface layer, see figs. 3 and 4. these figures present the results for a 3 mm ball made from sintered carbide, and a working force of 2000 n.

figure 3. 3d simulation of roller burnishing, reduced stress affected by friction in the surface layer [mpa].
figure 4. 3d simulation of roller burnishing, intensity of the plastic deformation affected by friction in the surface layer [–].
the lines marked as iv and v represent the paths selected for evaluating the residual stress and the intensity of the plastic deformation below the surface, see fig. 5. the areas marked as mx, mn represent places with maximum and minimum values. the simulations show that the resultant course of the stress below the roller-burnishing element is also influenced to a relatively large extent by the roller-burnishing method that has been used; particularly when using a friction element, a more significant plastically deformed area is shifted to the surface of the workpiece. the intensity of the stress and of the plastic deformation [12] used in the simulation is defined by
$$\sigma_i = \max\{\sigma_1 - \sigma_2,\ \sigma_2 - \sigma_3,\ \sigma_3 - \sigma_1\}, \qquad \varepsilon_i = \max\{\varepsilon_1 - \varepsilon_2,\ \varepsilon_2 - \varepsilon_3,\ \varepsilon_3 - \varepsilon_1\}.$$
to provide a better illustration of the course of the plastic deformation itself in the area below the tool, the cross-section of the courses can be shown, see fig. 5. the simulations that were performed showed an excess of stress in the surface layer of the workpiece up to a depth of roughly 0.2 mm. this leads to plastic deformation, and has a significant impact on the examined material layer.

3. experimental measurements

for an evaluation of the actual changes in the surface layer, the residual stress was measured and the surface roughness of the roller-burnished surface was evaluated. next, the surface layer was investigated, and images were obtained from optical and electron microscopes. various values of the working force acting on the forming element were tested. the forming element, made of sintered carbide, was 3 mm in diameter and spherical in shape. various other technological parameters did not have such a major impact as the change in the working force. a working speed of vk = 30 m min−1 was therefore chosen, and the feed per revolution was f = 0.05 mm, which corresponds to the recommended values. the arithmetic mean roughness ra and also the maximum height of profile rz [6, 8] were evaluated, as shown in fig. 6.

figure 5. intensity of the plastic deformation, depending on the depth below the surface.
figure 6. arithmetical mean roughness ra and maximum height of profile rz depending on the size of the working force f.

the resulting measurements show a more significant increase in surface roughness when the working force exceeds 1000–1500 n. however, this value still meets the parameters required for the final state of the surface of the workpiece. the roentgenographic method also evaluates the crystal lattice deformation in the individual phases of the material under consideration [7, 10]. the philips d500 equipment that was used also allowed an assessment of the percentage representation of each metal phase. it was therefore possible, for each measurement of the residual stress, to determine the amount of residual austenite in the surface layer. the resultant percentage representation value was measured with an accuracy of ±1 % of the total volume of material. however, the depth of the layer measured with this method was limited to 40 µm below the surface. as part of the measurements, changes to the amount of residual austenite were evaluated, depending on the size of the working force.

figure 7. dependence of the residual stress in the surface layer on the size of the working force.
figure 8. dependence of the amount of residual austenite in the surface layer on the size of the working force.
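for reference, the stress and strain intensity definitions used in the simulations above are straightforward to evaluate numerically. the sketch below does so for hypothetical principal values; the numbers are illustrative, not results of the ansys model.

```python
# a minimal numeric sketch of the intensity definitions used above:
# sigma_i = max over the pairwise differences of the principal stresses,
# eps_i likewise for the principal strains. the sample values are
# hypothetical, not results of the ansys simulation.
def intensity(p1, p2, p3):
    """max principal-value difference, as in the definitions above."""
    return max(p1 - p2, p2 - p3, p3 - p1)

# hypothetical principal stresses [MPa] near the contact point
s1, s2, s3 = -300.0, -900.0, -1500.0
print("stress intensity [MPa]:", intensity(s1, s2, s3))   # 600.0

# hypothetical principal strains [-]
e1, e2, e3 = 0.004, 0.001, -0.003
print("strain intensity [-]  :", intensity(e1, e2, e3))   # 0.004
```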
subsequently, the amount of residual stress was also assessed in relation to the working force [6]. both of these dependencies are shown in figs. 7 and 8. the negative values in fig. 7 represent compressive stress [7]. within the measured interval, the compressive residual stress value rises approximately linearly with the working force. light and x-ray microscopy were used for evaluating the state of the structure in the surface layer. the respective steel, with 1 % c, 0.4 % mn and 1.5 % cr, is used for rolling elements and for rings up to 25 mm in thickness. like mass-produced rings of rolling bearings, the samples were hardened and tempered at a low temperature. a comparison between the images of the subsurface layer (fig. 9, left) and the layer a few tenths of a millimetre below the surface (fig. 9, right) shows that there is a reduction in the residual austenite areas [2, 5].

4. conclusions

our results lead to the conclusion that the roller-burnishing method applied here can offer significant improvements to the properties of the surface layer of dynamically loaded bearing rings. the tested samples showed both improvements in the state of the residual stress and also changes in the roughness of the surface layer. significant plastic deformation occurred even though the material was in a hardened state. this result may be ascribed to plastic deformation or to possible phase changes in the material leading to volume changes. a positive impact on the lifetime of a bearing has been experimentally verified for bearings where an additional roller-burnishing operation was applied to the rolling track. the service life of these bearings was more than doubled [11]. due to these good results for surface roughness, it is possible to consider skipping the superfinishing operation and replacing it by roller burnishing.

figure 9. steel for roller bearings 14109.4, 10000×, roller-burnished by a working force of 2500 n: (left) a replica at a depth of 0.2 mm below the roller-burnished surface; (right) a replica 0.01 mm below the roller-burnished surface.

acknowledgements

the work has been supported by the department of trade and industry of the czech republic under grant fr–ti4/247. the support provided from this source is very gratefully acknowledged.

references

[1] yen, y.c., p. sartkulvanich and t. altan. finite element modeling of roller burnishing process. cirp annals – manufacturing technology. 2005, vol. 54, issue 1, pp. 237–240. doi:10.1016/s0007-8506(07)60092-4
[2] kundin, julia, evgeny pogorelov and heike emmerich. numerical investigation of the interaction between the martensitic transformation front and the plastic strain in austenite. journal of the mechanics and physics of solids. 2015, no. 76, pp. 65–83. doi:10.1016/j.jmps.2014.12.007
[3] balland, pascale, laurent tabourot, fabien degre and vincent moreau. mechanics of the burnishing process. precision engineering. 2013, vol. 37, issue 1, pp. 129–134. doi:10.1016/j.precisioneng.2012.07.008
[4] balland, pascale, laurent tabourot, fabien degre, vincent moreau and yan-wing ng. an investigation of the mechanics of roller burnishing through finite element simulation and experiments. international journal of machine tools and manufacture. 2013, vol. 65, pp. 29–36. doi:10.5353/th_b3196038
[5] lehnhoff, g.r., k.o. findley and g.r. chanani. influence of austenite stability on predicted cyclic stress-strain response of metastable austenitic steels. procedia engineering. 2011, vol.
10, pp. 1097–1102. doi:10.2172/4114261
[6] mezlini, s., s. mzali, s. sghaier, c. braham, ph. kapsa and j. martin. effect of a combined machining/burnishing tool on the roughness and mechanical properties. lubrication science. 2013, vol. 26, issue 3, pp. 175–187. doi:10.2172/821697
[7] newby, m., m. n. james and d. g. hattingh. finite element modelling of residual stresses in shot-peened steam turbine blades. ffems. 2014, vol. 37, issue 7, pp. 707–716. doi:10.1111/ffe.12165
[8] davim, j. paulo. surface integrity in machining. 1st ed. london: springer, 2010, 215 pp. isbn 978-1-84882-873-5.
[9] vajskebr, jiří and zdeněk špeta. dokončování a zpevňování povrchu strojních součástí válečkováním. praha: sntl, 1984.
[10] kraus, ivo and nikolaj ganev. difrakční analýza mechanických napětí. 1st ed. praha: čvut, 1995. isbn 80-01-01366-9.
[11] zemčík, oskar. změna vlastností oběžných drah valivých ložisek po aplikaci válečkování: thesis. brno: cerm, 2001. 136 pp. isbn 80-214-2131-2. dizertační práce. vut v brně, fsi, úst.
[12] ansys theory reference: v. 13 [online]. 2011. southpointe: ansys, inc. [cit. 2011-07-14].
[13] lawrence, kent l. ansys workbench tutorial: structural & thermal analysis using the ansys workbench release 12.1 environment. mission: schroff development corp, 2010, 252 pp. isbn 1585035807.
[14] ecoroll ag. ecoroll ag werkzeugtechnik [online]. celle, 2013 [cit. 2013-05-14]. http://www.ecoroll.de

acta polytechnica 55(5):301–305, 2015, doi:10.14311/ap.2015.55.0301

the compacting process of the en aw 6060 alloy

lukáš dragošeka, róbert kočiškoa, andrea kováčováa, róbert bidulskýa,∗, milan škrobianb
a department of metal forming, faculty of metallurgy, technical university of košice, vysokoškolská 4, 042 00 košice, slovakia
b sapa profily a.s., na vartičke 7, 965 01 žiar nad hronom, slovakia
∗ corresponding author: robert.bidulsky@tuke.sk

abstract. this study reports on an investigation of factors affecting the process of compacting al chips, which are used for direct scrap processing by the forward extrusion method. en aw 6060 chips of different geometry and types were mainly used as the experimental material. the chips were compacted in a die with a vertical channel (10.3 mm in diameter). to provide a range of processing conditions, three different weights were selected and compacting was performed under five different compacting pressures. the movement of the chips within the die during compacting was analysed through numerical simulations using deform 2d software. based on a study of the compacting process, optimal parameters for increasing the density and enhancing the density distribution were defined. the results from our study clearly show that optimal conditions are obtained when the proportion of d/h is 1/1.1. moreover, it was recognized that compacting small chips yields a lower density than compacting large chips.
keywords: compaction; aluminium chips; finite element method; porosity.

1. introduction

in general, primary aluminium processing is one of the most energy-demanding fields in metal production [1, 2]. depending on the technology that is applied, 168 gj or 200 gj of energy is needed to produce one ton of aluminium, which is 10 times greater than the energy requirement for steel production [2]. al products are at present mostly manufactured by gravity or high-pressure casting, by extrusion, and also with the use of powder metallurgy (pm) methods. in the final step, it is also necessary to modify the geometrical parameters of the end product through machining. in recent years, chips have become a popular commodity for enhancing the efficiency of the recycling process [3]. re-melting is a well-known method for processing al scrap [4]. re-melting reduces the energy consumption for the whole process of al production, but, due to the high affinity of al to oxygen, re-melting also leads to significant material losses through the oxidation process in the molten metal [5, 6]. due to the large specific surface, the material losses can grow in the process of re-melting al chips. for small chips, the material losses account for almost 50 % [7, 8]. a solution, proposed and designed by max stern and patented by him in 1945, was forward extrusion of the chips, performed at an elevated temperature [9]. in 1977, this method was modified and innovated by sharma and takahashi [10, 11]. this kind of chip recycling can also be used for steels, for cast iron, for copper and for aluminium alloys [12]. the method of forward extrusion of chips involves compacting the chips at ambient temperature, followed by extrusion at an elevated temperature. in the first stage, the chips are mechanically compacted and are then pressed [13]. within the arrangement of the chips, open porosity emerges, which is not fully removed in the second stage of processing. in addition, depending on the degree of porosity of the compact, some of the pores are closed. in general, pores which are not able to move to the surface, and therefore remain inside, are influenced by deformation during processing. they are subjected to great compacting pressure, and this has a negative impact on the mechanical properties of the final product. as the real length of a shaft depends on its density, the porosity of the compact has a significant influence on the quality of the final product (manufacturing line productivity) [14–16]. the first stage of processing is therefore very important. the most widely-used methods for extruding compacted chips are forward extrusion [17, 18], extrusion through bridge dies [2, 4, 13], and new technologies, such as equal channel angular pressing (ecap) [20–23] and friction stir wire extrusion [24].

2. experimental material and methods

two types of chips, delivered by sapa profily a.s. slovakia, žiar nad hronom, were used as experimental material: small chips — which originated from chip cutting operations (drilling, milling, turning). they are spiral and bow-shaped (fig. 1). the average dimensions of the chips (t × w × l) were 0.1 × 0.2–1 × 2–10 mm.

table 1. chemical composition of the small and large chips:

              si    fe   cu    mn    mg        cr    zn    ti    v     co
small chips   0.44  0.3  0.03  0.04  0.35–0.6  0.05  0.44  0.1   0.01  0.01
large chips   0.45  0.3  0.17  0.3   0.33      2     0.27  0.02  0.01  0.01

the effect of mechanical
strengthening was eliminated after the chips had been cleaned chemically, by annealing at a temperature of 688.15 k for 9000 s. large chips — which originated from a mechanical saw used for cutting shafts at an elevated temperature. the chips were uniform in shape, and consisted of tangled bands (fig. 2). the average dimensions of the chips (t × w × l) were 0.5 × 2.5 × 20–80 mm. the chips were not mechanically strengthened and their surfaces were not polluted. both types of chips were analyzed using an optical emission spectrometer. the chemical composition is given in table 1 (the chemical composition of both types of chips corresponds to en aw 6060).

figure 1. small chips.
figure 2. large chips.
figure 3. schematic illustration of the division of the sample.

2.1. compacting chips

the two types of chips were compacted under identical conditions (weight 2, 5, 10 g; compacting pressure 300, 400, 500, 600, 700 mpa; speed 0.83 mm/s). to ensure reproducibility of the measurement, each experimental measurement was repeated 3 times. for the compacting process, we used a die with a round, vertical channel 10.3 mm in diameter and 250 mm in height. to determine the porosity of the compacts, the density of both types of chips was established as 0.0027 g/mm3 (according to the chemical composition). the average porosity was calculated using the following formula [22–24], and then the reciprocal density was estimated:
$$P = \frac{\rho_{\text{chips}} - \rho_{\text{specimens}}}{\rho_{\text{chips}}} \cdot 100\,\%, \tag{1}$$
where p is the porosity of the compacted samples; ρchips is the density of the chips; and ρspecimens is the density of the compacted specimens.

2.2. density distribution across the compact height

to distinguish the density distribution across the height of the sample, each sample was divided into 5 mm segments (see fig. 3). the density distribution was determined for each sample weight, compacted from large chips under a pressure of 300 mpa.

2.3. material flow during compacting

the material flow in the die during the compacting process was analysed by mathematical simulations in deform 2d software. the simulation was constructed as the compacting process of a porous material, with the density determined from the height and the weight of the chips. the simulation conditions correspond to the real compacting conditions, i.e., the weight of the sample was 10 g (large chips) within the simulation. we used a material from the deform database with similar mechanical properties to those of the experimental material. the stop criterion for the simulation was the distance (or the final length of the compact).

3. results and discussion

figures 4 and 5 provide graphical illustrations of the density versus the compacting pressure of the samples.

figure 4. density vs. compacting pressure – the small chips.
figure 5. density vs. compacting pressure – the large chips.

figures 4 and 5 show that the large chips provide a greater density than the small chips under the same compacting pressure. an explanation for this can be that the large specific surface and the suspension of the small chips cause the samples to break during handling. it is also seen (fig. 4) that the density increases as the compacting pressure grows. however, there are some particular cases when growth in pressure does not have a significant influence on the final density of the compact [?].

table 2. the ratio of the diameter to the height in the compact:

weight   ratio of d/h
10 g     1/5
5 g      1/2.5
2 g      1/1.1

figure 6. density distribution across the height of the sample.
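as a numerical companion to eq. (1) and table 2, the sketch below evaluates the porosity of a compact and its expected height (and hence the d/h ratio) for the three sample weights. the die diameter of 10.3 mm and the theoretical density of 0.0027 g/mm3 are taken from the text; the assumed compact density of 0.0024 g/mm3 is only an illustrative value, with which the d/h ratios of table 2 are roughly reproduced.

```python
# a sketch of eq. (1) and of the geometry behind table 2. the die
# diameter (10.3 mm) and the theoretical density (0.0027 g/mm^3) are
# taken from the text; the compact density below is hypothetical.
import math

RHO_CHIPS = 0.0027        # g/mm^3, density by chemical composition
DIE_DIAMETER = 10.3       # mm, round vertical channel of the die

def porosity(rho_specimen):
    """porosity P [%] of a compact, eq. (1)."""
    return (RHO_CHIPS - rho_specimen) / RHO_CHIPS * 100.0

def compact_height(mass_g, rho_specimen):
    """height [mm] of a cylindrical compact of the given mass."""
    area = math.pi * (DIE_DIAMETER / 2.0) ** 2
    return mass_g / (rho_specimen * area)

for mass in (2.0, 5.0, 10.0):
    rho = 0.0024          # hypothetical compact density, g/mm^3
    h = compact_height(mass, rho)
    print(f"{mass:4.0f} g: P = {porosity(rho):4.1f} %, "
          f"h = {h:5.1f} mm, d/h = 1/{h / DIE_DIAMETER:.1f}")
```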
the experimental studies show that the density decreases as the weight of the sample increases (i.e., as the height of the sample grows relative to its diameter), in both cases (small chips vs. large chips). the ratio of the cross-section of the sample to the height (table 2) has a significant impact on the density distribution across the height of the compact. in the direction from the piston to the lower part of the die, the density declines for all weights (see the graphical illustrations in fig. 6). the most significant variation in the density distribution was observed when the sample weighed 10 g. the decrease in density (in the direction of the bottom part of the shaft) is caused by the difference in the vectorial movement of the chips in particular parts inside the die during compaction (fig. 7). in general, the most significant vectorial movement of the chips is assumed to be directly below the piston. it is, therefore, easier to carry out processes such as rearranging, pressing and deforming the chips in these parts than in other parts. in general, the increasing density of the compact under the piston leads to a growth in the radial pressure and in the friction area (resulting in enhanced friction at the surface of the die). further forward movement of the piston in the compacting direction gradually spreads the region with higher density and the contact surface between the shaft and the die, which finally enhances the friction on the die walls. in the compacting process of 5 g and 10 g samples made from small chips, we observed an increase in the friction force above the value of the compacting force, after which the material inside the die broke. it was later observed that the samples in the die began to break at one third of their height. these samples were not suitable for a study of the density distribution (not included in fig. 4).

figure 7. numerical simulation of the compacting process on a 10 g sample under 300 mpa.

to ensure sufficient compacting efficiency and, at the same time, the best density distribution, it is preferable to compact the chips into shafts with a d/h ratio of 1/1.1. a high d/h ratio can be obtained by putting the chips into the die gradually, in layers. however, this increases the friction forces, resulting in breakage when the shaft is pressed. the same phenomenon is recognized when the al cover is compacted. another option is to use a two-sided compaction method.

4. conclusions

(1.) the dependence of the density on the amount of compacting pressure is influenced by the size and the geometry of the chips. it is, therefore, not possible to specify a general optimal value for the compacting pressure (for all types of chips).

(2.) at the same time, the size and the geometry of the chips have a significant influence on the consistency of the compact while it is being handled. samples compacted from small chips under all considered pressure conditions (300, 400, 500, 600, 700 mpa) were brittle.

(3.) as the height of the compact relative to its diameter decreases (i.e., as the d/h ratio approaches 1/1.1), the density increases and the uniformity of the density distribution across the height of the shaft improves. the best density distribution was observed when the d/h ratio was 1/1.1.

(4.) due to non-uniformity in the vectorial movement of the chips, it is preferable to process shafts with a d/h ratio higher than 1/2.5, using a two-sided compaction method.
acknowledgements

this work was realized within the frame of the operational programme research and development: “the centre of competence for industrial research and development in the field of light metals and composites”, project code itms 26220220154, financially supported by the european regional development fund.

references

[1] j. gronostajski, a. matuszak: journal of materials processing technology, 92–93:35–41, 1999, doi:10.1016/s0924-0136(99)00166-1
[2] v. güley, a. güzel, a. jäger, n. ben khalifa, a.e. tekkaya, w.z. misiolek: materials science and engineering a, 574:163–175, 2013, doi:10.1016/j.msea.2013.03.010
[3] m. samuel: journal of materials processing technology, 135(1):117–124, 2003, doi:10.1016/s0924-0136(02)01133-0
[4] v. güley, n. ben khalifa, a.e. tekkaya: international journal of material forming, 3:853–856, 2010, doi:10.1007/s12289-010-0904-z
c. schmitz: handbook of aluminium recycling, vulkan-verlag gmbh, essen, germany, 2006
[5] a.e. tekkaya et al.: journal of materials processing technology, 209:3343–3350, 2009, doi:10.1016/j.jmatprotec.2008.07.047
[6] v. güley, n. ben khalifa, a.e. tekkaya: aip conference proceedings, 1353:1609–1614, 2011, doi:10.1063/1.3589746
[7] m. haase, a.e. tekkaya: journal of materials processing technology, 217:356–367, 2015, doi:10.1016/j.jmatprotec.2014.11.028
[8] m. stern: method for treating aluminium or aluminium alloy scrap, us 2391752 a, 25.12.1945
[9] c.s. sharma, t. nakagawa, n. takenaka: recent developments in the recycling of machining swarfs and blanking scraps by sintering and powder forging, stc f, 26/1/1977, p. 121
[10] t. takahashi: development of scrap extrusion reformation and utilization process. in: proceedings of 2nd international aluminum extrusion technology seminar, band 1, 1977, p. 123–128
[11] j. gronostajski, h. marciniak, a. matuszak: journal of materials processing technology, 106(1–3):34–39, 2000, doi:10.1016/s0924-0136(00)00634-8
[12] r. kočiško, r. bidulský, l. dragošek, m. škrobian: acta metallurgica slovaca, 20(3):302–308, 2014, doi:10.12776/ams.v20i3.366
[13] j. bidulská et al.: archives of metallurgy and materials, 58(2):371–375, 2013, doi:10.2478/amm-2013-0002
[14] m. maccarini, r. bidulsky, m. actis grande: acta metallurgica slovaca, 18(2–3):69–75, 2012.
[15] b. pawlowska: acta metallurgica slovaca, 20(1):28–34, 2014, doi:10.12776/ams.v20i1.182
[16] z.s. ji, l.h. wen, x.l. li: journal of materials processing technology, 209:2128–2134, 2009, doi:10.1016/j.jmatprotec.2008.05.007
[17] m.l. hu, z.s. ji, x.y. chen, q.d. wang, w.j. ding: transactions of nonferrous metals society of china (english edition), 22:68–73, 2012, doi:10.1016/s1003-6326(12)61685-9
[18] m. haase, n. ben khalifa, a.e. tekkaya, w.z. misiolek: materials science and engineering a, 539:194–204, 2012, doi:10.1016/j.msea.2012.01.081
[19] w. tang, a.p. reynolds: journal of materials processing technology, 210:2231–2237, 2010, doi:10.1016/j.jmatprotec.2010.08.010
[20] y. hangai et al.: materials transactions, 53(8):1515–1520, 2012, doi:10.2320/matertrans.m2012125
[21] j.b. fogagnolo, e.m. ruiz-navas, m.a. simón, m.a. martinez: journal of materials processing technology, 143–144:792–795, 2003, doi:10.1016/s0924-0136(03)00380-7
[22] j. bidulská et al.: acta physica polonica a, 122(3):553–556, 2012.
[23] j. bidulská et al.: chemicke listy, 105(16):s471–s473, 2011.
[24] r.a. behnagh, r. mahdavinejad, a. yavari, m. abdollahi, m.
narvan: metallurgical and materials transactions b: process metallurgy and materials processing science, 45(4):1484–1489, 2014, doi:10.1007/s11663-014-0067-2

acta polytechnica 56(2):126–131, 2016, doi:10.14311/ap.2016.56.0126

ultrasonic soldering of cu and al2o3 ceramics by use of bi-la and bi-ag-la solders

roman koleňák, michal prach, igor kostolný∗
slovak university of technology in bratislava, faculty of materials science and technology in trnava, institute of production technologies, paulínska 16, 917 24 trnava, slovak republic
∗ corresponding author: igor.kostolny@stuba.sk

abstract. this work deals with the effect of alloying a solder with a small amount of lanthanum on joint formation with a metallic and a ceramic substrate. the bi–ag-based solder with a 2 wt. % lanthanum addition and the bi solder with a 2 wt. % lanthanum addition were studied. soldering was performed by a fluxless process in air, with activation by power ultrasound. it was found that, during the process of ultrasonic soldering, lanthanum is distributed on the boundary, both with the copper and with the ceramic substrate, which enhances the joint formation. the bond with the al2o3 ceramics is of an adhesive character, without the formation of a new contact interlayer.

keywords: ultrasonic soldering; bi–based solder; lanthanum; al2o3 ceramic.

1. introduction

the bi-based solders belong to the group of solders for higher application temperatures. soldering technology for higher application temperatures is widely spread at present, and it provides irreplaceable properties to the resultant products, such as excellent thermal conductivity or high reliability [2, 3]. these solders are used mainly in electronics, but also in the automotive, space, aviation and power industries [4–7]. for this group of solders, an exemption from the ban on using lead solders applies until a full-value substitute for the pb-sn solders is developed [8]. several alternative alloys based on bi, zn and au have been developed up to now, but none of them was capable of fully substituting the pb5sn and pb10sn solders [5]. the bismuth-based solders have melting points in the required temperature range, suitable for soldering at higher working temperatures (from 250 to 400 °c). the alloy containing 2.6 wt. % ag exhibits a eutectic temperature of 262.5 °c [9]. the bi-based solders offer excellent properties, such as high toughness, a low elasticity modulus in shear and resistance against thermal fatigue [10]. brittleness and a lower electric conductivity, when compared to high-lead solders, belong to their disadvantages. the conductivity of the bi10ag solder is only 1.0 · 10^6 (ω m)^−1, which is much lower than the conductivity of the pb5sn (3.5 · 10^6 (ω m)^−1) or sac 307 (8.66 · 10^6 (ω m)^−1) solders [11]. therefore, this solder is alloyed with silver, from 2.5 to 11 wt. % ag. such a solder was also used for soldering cu and ni, which is described in work [12]. we have also chosen to use the bi2.5ag and bi11ag solders.
interfacial reactions between a cu substrate and the bi–ag solder were investigated by the authors of [13]. without forming intermetallic compounds (imcs), the molten solder grooved and further penetrated along the grain boundaries (gb) of the cu substrate. other authors [14] studied the melting range, the wetting behaviour and the thermal conductivity of the bi–ag alloy (with an ag content between 2.6 and 12 wt. %) and compared these characteristics with pb-based alloys. the thermal conductivity was measured at 30 and 100 °c. pure bismuth has a low thermal conductivity (7 w/mk at 30 °c). the addition of up to 12 wt. % ag increases the thermal conductivity by 50 %, to 10.5 w/mk. reactions on the boundary of bi-solders and metallic substrates were studied in works [15–19]. examples of other solders are given in the studies [20–28]. however, in the industrial applications for which these solders are developed, it is also necessary to join non-metallic and ceramic materials. therefore, the bi-based solders must be alloyed with an active element, to ensure wetting of the non-metallic or ceramic substrate by the solder. an example of alloying the bi–ag solder is mentioned in the work [29]. the authors added 0.1 wt. % of ce into the bi–ag alloy and varied the ag amount in the alloy from 2.5, 5 and 7.5 to 10 wt. %. the wettability of the bi–ag solder on the cu substrate is fair, but it is still inferior to the pb5sn solder. an increase in the ag content of the solder has a positive effect on the wettability on the cu substrate. moreover, it is clear that a lanthanide addition may promote the wetting property. the reason is related to the surface-active action of the lanthanides. the aim of the presented work was to study the experimental solders type bi11ag2la and bi2la, to prove the solderability of cu and al2o3 ceramics with these solders, and to analyse the fabricated soldered joints.

figure 1. soldered material combinations.

table 1. soldering parameters:

ultrasound power               400 w
working frequency              40 khz
amplitude                      2 µm
soldering temperature          290 °c
time of ultrasonic activation  5–10 s

2. experimental

the bi2la (2 wt. % la) and bi11ag2la (11 wt. % ag and 2 wt. % la) solders, manufactured in a cast condition in a high vacuum, were used in the experiments. the manufacturing procedure was as follows: the calculated charges of the alloys were inserted into a graphite boat; the boat with the charge was placed into a tube of 50 mm-diameter silicon glass; the tube was then laid into a vacuum resistance furnace in such a manner that it was situated in the heated zone; if needed, the tube could be purged with ar gas by the use of a flange at its beginning and an outlet at its end. the charge was subjected to a temperature above 900 °c, depending on the type of manufactured alloy. the experimentally prepared bi2la and bi11ag2la solders were used for the fabrication of soldered joints. a schematic representation of the joints is shown in fig. 1. alumina and copper substrates of 2n5 and 4n purity, respectively, were used for joining. the substrates were in the form of rings with dimensions of ∅ 15 × 2 mm. soldering was performed by the use of the ultrasonic equipment type hanuz ut2, with the parameters given in table 1. solder activation was realised via an encapsulated ultrasonic transducer consisting of a piezo-electric oscillating system and a titanium sonotrode with an end tip of diameter 3 mm.
a scheme of ultrasonic soldering through the layer of molten solder is shown in fig. 2. the soldering temperature was 20 °c above the liquidus temperature of the appropriate solder. control of the soldering temperature was realised via continuous temperature measurement on the hot plate by a nicr/nisi thermocouple. the soldering procedure was performed in such a manner that the solder was placed on the substrate heated to the soldering temperature, and was then heated to the liquidus temperature. the molten solder was subjected to power ultrasound for a time of 5 to 10 s, without the application of flux or a shielding atmosphere. after the ultrasound activation, the excess solder layer and surface oxides were removed from the substrate surface. the copper substrate was prepared in the same way. the cu substrate was laid and centred on the ceramic substrate, on the surface onto which the solder layer had been deposited by ultrasound activation. in this way, the desired joint was fabricated.

figure 2. ultrasonic soldering through the layer of molten solder.

metallographic preparation of the specimens from the soldered joints was realised by standard metallographic procedures used for specimen preparation. grinding was performed by the use of sic emery papers with 240, 320 and 1200 grains/cm2 granularity. polishing was performed with diamond suspensions with grain sizes of 9, 6 and 3 µm. final polishing was performed by the use of the polishing emulsion type op-s (struers) with 0.2 µm granularity. the solder microstructure was studied on the light optical microscope type neophot 32, with application of the image analyser nis-elements, type e, and by the use of scanning electron microscopy (sem) on a jeol 7600 f with the x-ray micro-analyser type microspec wdx-3pc for performing the qualitative and semi-quantitative chemical analysis. dsc analysis of the bi2la and bi11ag2la solders was performed on equipment type netzsch sta 409 c/cd in shielding ar gas with 6n purity.

3. experimental results

for determination of the soldering temperature and other phase transformations, dsc analysis of the bi11ag2la solder was performed. figure 3 shows the results of the dsc analysis of the bi11ag2la solder at a heating rate of 1 k/min. the solder matrix, formed of a fine eutectic (bi + 3–4 wt. % ag), starts to melt at a temperature of 261.4 °c. the temperature of 262.8 °c corresponds to the temperature of the eutectic reaction in the bi–ag binary system. this assumption was also proved by the binary bi–ag diagrams of the authors of [?]. melting of the eutectic is fully completed at a temperature of 263.8 °c. from the dsc curve of the bi2la solder at a heating rate of 1 k/min, it follows that melting of the bi2la solder starts at 270.2 °c and the peak temperature is 271.2 °c. according to the binary bi–la diagram, a peritectic phase transformation is concerned. the peak temperature of 271.5 °c corresponds to the melting point of pure bismuth, whereas melting is fully completed at 271.6 °c. it can be stated that the presence of 2 wt. % la in the matrix of the bi solder affects the melting point of pure bismuth to a minimum extent.

figure 3. dsc analysis of bi11ag2la solder (1 k/min).
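onset and peak temperatures such as those quoted above are conventionally read from a dsc heating curve as follows: the peak is the extremum of the signal, and the onset is the intersection of the extrapolated baseline with the tangent at the steepest point of the leading edge. the sketch below demonstrates this evaluation on a synthetic curve; it is not the measured bi11ag2la data, and the baseline and peak parameters are assumptions.

```python
# a sketch of how onset and peak temperatures can be read from a dsc
# heating curve (peak = extremum of the signal, onset = intersection of
# the extrapolated baseline with the tangent at the steepest point).
# the curve below is synthetic; it is not the measured bi11ag2la data.
import numpy as np

T = np.linspace(250.0, 275.0, 2501)                 # temperature [degC]
baseline = 0.02 * (T - 250.0)                       # sloping baseline
peak = -8.0 * np.exp(-((T - 262.8) / 0.8) ** 2)     # endothermic peak
dsc = baseline + peak                               # synthetic signal

i_peak = np.argmin(dsc)                             # endothermic minimum
d = np.gradient(dsc, T)
i_infl = np.argmin(d[:i_peak])                      # steepest descent point

# tangent at the inflection point: y = dsc[i] + d[i] * (T - T[i]);
# onset: where the tangent meets the baseline b0 + b1 * T
b1, b0 = 0.02, 0.02 * (-250.0)
t_onset = (dsc[i_infl] - d[i_infl] * T[i_infl] - b0) / (b1 - d[i_infl])

print(f"peak temperature : {T[i_peak]:.1f} degC")
print(f"onset temperature: {t_onset:.1f} degC")
```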
figure 4. planar analysis of bi11ag2la solder.

3.1. microstructural analysis of bi11ag2la and bi2la solders

figure 4 shows the microstructure of the bi11ag2la solder. silver crystals can be seen in the solder, since the solubility of ag in bi is low (up to 4.9 wt. %). therefore, the matrix is formed of a fine eutectic (bi + 3–4 wt. % ag). the surroundings of the ag crystals are depleted of ag; therefore, they are formed only of bi, which is obvious from the planar analysis shown in fig. 4. lanthanum in the solder is uniformly segregated in the bi matrix; the only exceptions are the ag phases, which show considerably less lanthanum. microstructural observation of the bi2la solder has revealed that the solder consists of the bi matrix, with uniformly distributed particles of the la phases. smaller clusters of phases with a globular character were observed in the microstructure (fig. 5). twins may also be seen in the more detailed pictures in fig. 5.
no la was observed in the bi matrix of the solder (fig. 6). table 2 documents the results of the analysis of the chemical composition in the lanthanum phases (spectra 1 to 4) and in the solder matrix (spectra 5 to 9). the results of the analyses suggest that the globular phases have a chemical composition which, according to the binary diagram, corresponds to the composition of the labi2 phase. the analysed lanthanum phases oxidize fast, owing to the presence of oxygen from the air. lanthanum generally has a high affinity to oxygen. probably, a mixture of la2o3 and bi2o3 oxides is formed. owing to this fact, 7–8 wt. % o was identified in the analyses.

figure 5. microstructure of bi2la solder with la phases and microstructure with twins.
figure 6. analysed zones of microstructure in bi2la solder.

table 2. quantitative analysis of the chemical composition in selected zones of the bi2la solder:

spectrum   bi [wt. %]   la [wt. %]   o [wt. %]
1          77.27        12.26        10.47
2          66.66        22.88        10.46
3          72.08        16.93        10.99
4          82.99        10.52         6.49
5          99.05        –             0.95
6          98.74        –             1.26
7          98.81        –             1.19
8          98.87        –             1.13
9          98.97        –             1.03

4. analysis of cu/bi2la joint

the boundary of the soldered joint between the cu substrate and the bi2la solder is shown in fig. 7. a very narrow transition zone was formed on the boundary. bismuth from the solder does not form a new intermetallic phase with cu, nor a solid solution. from this viewpoint, the interaction between bi and cu is weaker. joint formation between the copper surface and the bi-based solder occurs due to the eutectic reaction between cu and bi. lanthanum from the bi2la matrix diffuses during the ut process to the boundary with the cu substrate, where it enhances the joint formation. in the case of ultrasonic soldering without flux application, the joint is formed in a short time with the contribution of la. edx line analysis has revealed an increased concentration of la on the cu/bi2la boundary, which is also documented in fig. 7.

figure 7. the edx analysis of cu/bi2la joint boundary with concentration of bi, la and cu elements.

5. analysis of al2o3/bi11ag2la joint

the soldered al2o3/bi11ag2la joint is documented in fig. 8. an increased concentration of la (fig. 8) was observed on the joint boundary, similarly to the previous case. a new layer with an increased la content, 0.5 µm in thickness, was formed. the formation of a new phase was not observed. it is supposed that an adhesive bond is formed between the solder and the ceramic substrate. la enhances the formation of the adhesive bond. the activated la element guarantees the wetting of the ceramic substrate at the activation by power ultrasound, and thus contributes to the joint formation.

figure 8. concentration profiles of bi, la, ag, al and o elements on the boundary of al2o3/bi11ag2la joint.

6. conclusions

the aim of this work was to determine the effect of a small amount of la in the solder on the formation of the joint with the al2o3 and cu substrates at the application of fluxless soldering by ultrasound. the subjects of study were the development solders type bi2la and bi11ag2la. the following results were achieved:

(1.) the matrix of the bi11ag2la solder is formed of bismuth, where silver crystals with a fine eutectic (bi + 3–4 wt. % ag) are segregated. phases of the labi2 type occur in a globular shape in the matrix of the bi2la solder.

(2.) the dsc analysis has proved that the eutectic of the bi11ag2la solder melts at 262.8 °c. the melting point of the bi2la solder is at 271.2 °c,
which approximately corresponds to the melting point of pure bismuth. the la addition affects its melting point to a minimum extent.

(3.) in the case of the soldered joints with the metallic and the ceramic substrate, it was found that, during the ultrasonic soldering process, lanthanum is distributed to the boundary with the substrate, thus enhancing the joint formation. the bond with the al2o3 is of an adhesive character, without the formation of new phases. the bond between the copper substrate and the bi-based solder is formed owing to the eutectic reaction between cu and bi.
(4.) from the viewpoint of the mechanism of the joint formation, we suppose that the bond with the metallic material is of a metallurgical-diffusion character. the bond with the ceramic material is of an adhesive character.

acknowledgements

the contribution was prepared with the support of apvv–0023–12: research of new soldering alloys for fluxless soldering with application of beam technologies and ultrasound, and vega 1/0455/14: research of modified solders for fluxless soldering of metallic and ceramic materials. the authors thank ing. marián drienovský, phd. for the dsc analysis, and ing. martin sahul, phd., ing. pavel bílek and rndr. petr harcuba for the microscopic and edx analysis.

references

[1] suganuma k, kim s j, kim k s. high-temperature lead-free solders: properties and possibilities. jom, vol. 61, no. 1, 2009, pp. 64–71
[2] chidambaram v, hattel j, hald j. design of lead-free candidate alloys for high-temperature soldering based on the au–sn system. materials and design, vol. 31, 2010, pp. 4638–4645. doi:10.1016/j.matdes.2010.05.035
[3] watson j, castro g. high-temperature electronics pose design and reliability challenges. analog dialogue, vol. 46-04, 2012, pp. 1–7
[4] chidambaram v, hattel j, hald j. high-temperature lead-free solder alternatives. microelectronic engineering, vol. 88, 2011, pp. 981–989. doi:10.1016/j.mee.2010.12.072
[5] manikam v r, cheong k y. die attach materials for high temperature applications: a review. ieee transactions on components, packaging and manufacturing technology, vol. 1, no. 4, 2011, pp. 457–478
[6] gayle f w, becka g, badgett j, et al. high temperature lead-free solder for microelectronics. jom, 2001, pp. 17–21
[7] kroupa a, andersson d, hoo n, et al. current problems and possible solutions in high-temperature lead-free soldering. journal of materials engineering and performance, vol. 21, is. 5, 2012, pp. 629–637. doi:10.1007/s11665-012-0125-3
[8] koleňák r, hlavatý i. lead-free solders intended for higher temperatures. transactions of the všb — technical university of ostrava, mechanical series, vol. lv, no. 3, 2009, article no. 1727, pp. 113–117
[9] schoeller h, bansal s, knobloch a, et al. effect of alloying elements on the creep behaviour of high pb-based solders. materials science and engineering, vol. 528, 2011, pp. 1063–1070. doi:10.1016/j.msea.2010.10.083
[10] shi y, fang w, xia z, et al. investigation of rare earth-doped biag high-temperature solders. journal of materials science: materials in electronics, vol. 21, 2010, pp. 875–881. doi:10.1007/s10854-009-0010-5
[11] song j m, chuang h y, wu z m. interfacial reactions between bi-ag high-temperature solders and metallic substrates. journal of electronic materials, vol. 35, no. 5, 2006, pp. 1041–1049
[12] song j m, chuang h y, wu z m. substrate dissolution and shear properties of the joints between bi-ag alloys and cu substrate for high-temperature soldering applications. journal of electronic materials, vol. 36, no. 11, 2007, pp. 1516–1523. doi:10.1007/s11664-007-0222-5
[13] rettenmayr m, lambracht p, kempf b, graff m. high melting pb-free solder alloys for die-attach applications. advanced engineering materials, vol. 7, no. 10, 2005, pp. 965–969. doi:10.1002/adem.200500124
[14] chachula m, koleňák r, augustín r, koleňáková m. wettability of bi11ag solder during flux application. metal, brno, 2011
[15] yamada y, takaku y, yagi y, et al. pb-free high temperature solders for power device packaging.
acta polytechnica vol. 42 no. 3/2002
using the correlation function in ultrasonic non-destructive testing
m. kreidl, p. houfek

this paper deals with ultrasonic signal de-noising by means of correlation. it is commonly known that the cross-correlation function shows the statistical dependence between two signals. in ultrasonic inspection, the measured signal is taken as the first signal. the most important aspect of this method is the choice of the second signal. various types of the second signal can be tried.

keywords: ultrasonic testing, flaw echo, signal noise reduction, cross-correlation.

1 introduction
ultrasonic testing is a widely used non-destructive testing method for inspection and monitoring. ultrasonics can be used in many ways for testing. the basic characteristics of an ultrasonic instrument and probe are their sensitivity and resolution [2].
evaluation of flaws is based on deciding whether a reflector is or is not a flaw echo; consequently, noise reduction is a method that raises the quality of the test equipment [3]. signal de-noising is a critical property of any ultrasonic test system.

2 theoretical
this paper describes an algorithm for noise suppression using the cross-correlation function. the algorithm is based on the well-known equation, eq. (1):

r̂xy(rtv) = (1/N) ∑ x(ntv) · y(ntv − rtv),  n = 1 … N,  (1)

where tv is the sampling period, x(ntv) the ultrasonic signal, y(ntv) the simulated signal of the ultrasonic impulse (y(ntv) = 0 for ntv < 0), N the number of samples, and r = −N, …, N. the correlation function reaches its maximum for signals of similar shape. when we measure an ultrasonic signal x(ntv) with additive noise, we can use a suitable simulated signal y(ntv) to contrast the flaw echoes. this paper discusses two possible shape models, i.e., two possible simulated signals y(ntv) of the ultrasonic impulse, with varying parameters. the parameters should be chosen according to an analysis of the ultrasonic signal with a flaw echo.

the first computer-simulated ultrasonic impulse signal is expressed by the equation

y(ntv) = 1 for 0 ≤ ntv < t/2,  y(ntv) = −1 for t/2 ≤ ntv < t,  y(ntv) = 0 otherwise.  (2)

the model given above represents two rectangular impulses of width t/2; the second pulse is shifted in relation to the first by the value t/2. assuming that the reflected ultrasonic impulse has the shape of a damped sinusoidal curve, the signal y(ntv) is then the sign function of one period of the reflected ultrasonic impulse, and t/2 is a parameter.

for the second simulated ultrasonic impulse signal y(ntv), we selected a modulated sinusoidal gaussian pulse, which is very similar to the ultrasonic pulses reflected from the material. the modulated sinusoidal gaussian pulse is defined by the following equation:

y(ntv) = cos(2πf·ntv + φ0) · exp(−δ·(ntv)²),  (3)

where φ0 is the phase shift, f the frequency, and δ the damping coefficient. the phase shift φ0 can be used for setting the initial phase of the impulse; in our case the value φ0 = 7π was chosen. two variable parameters can be used to achieve conformity with the reflected ultrasonic impulse: the frequency of the impulse f, and the damping coefficient δ.

in order to verify the algorithms, the signal x(ntv) was simulated in the matlab environment [1]. this signal simulates a classical high-frequency ultrasonic signal with the initial, flaw and final echoes, without any noise (fig. 1). the parameters of this synthetic signal correspond to the signal at the output of an ultrasonic device with an acoustic-wave working frequency of 20 mhz and an equivalent sampling frequency of 256 mhz. the simulated flaw is situated approximately in the middle of a material 10 mm in thickness. from the statistical analysis of the real ultrasonic noise it is evident that the probability distribution is non-standard. therefore the simulation of the ultrasonic noise signal was performed by numerical methods, applying the statistical characteristics of the real noise [1].

fig. 1: synthetic signal x(ntv)
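to make the models above concrete, the following is a minimal sketch of the de-noising step, not the authors' original matlab code: it builds the gaussian-modulated reference pulse of eq. (3) and evaluates the cross-correlation estimate of eq. (1) with numpy. the pulse parameters f = 20 mhz and δ = 9 and the 256 mhz sampling frequency are the values quoted in the text; the flaw position, the noise level and the zero initial phase are illustrative assumptions.

```python
import numpy as np

# reference pulse of eq. (3): gaussian-modulated cosine, centred on t = 0
def gaussian_pulse(t_us, f_mhz=20.0, delta=9.0, phi0=0.0):
    # t_us in microseconds, f in mhz, so 2*pi*f*t is dimensionless
    return np.cos(2 * np.pi * f_mhz * t_us + phi0) * np.exp(-delta * t_us**2)

fs = 256.0                     # equivalent sampling frequency [mhz]
tv = 1.0 / fs                  # sampling period [us]
n = 1024
t = np.arange(n) * tv          # time axis [us]

# illustrative noisy measurement: one flaw echo at 2 us plus additive noise
x = gaussian_pulse(t - 2.0) + 0.5 * np.random.default_rng(0).standard_normal(n)

# eq. (1): cross-correlation estimate between measurement x and model pulse y
y = gaussian_pulse(t - t.mean())
rxy = np.correlate(x, y, mode="same") / n
rxy /= np.abs(rxy).max()       # normalise, as in the paper's dimensionless form

print("flaw echo located near t =", t[np.argmax(np.abs(rxy))], "us")
```

the flaw echo stands out in the correlation estimate because the reference pulse matches its shape, while the noise is largely uncorrelated with it.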
the variable value of the signal-to-noise ratio snr was obtained by increasing the noise level and adding the noise to the synthetic signal. the signal-to-noise ratio snr can be expressed by eq. (4):

snr = 20 log (uef / nef)  [db],  (4)

where uef is the rms (root mean square) value of the noiseless ultrasonic signal and nef the rms value of the additive noise. for verification purposes, an analysis was made using the computer-simulated noise signal, corresponding to values of the snr coefficient in the range from 0 db to 21 db. to compare the algorithms, the coefficient of noise reduction Δsn was defined as follows, eq. (5):

Δsn = 20 log [(uef2 / nef2) / (uef1 / nef1)]  [db],  (5)

where nef1 is the rms value of the additive noise in the signal, nef2 the rms value of the noise included in the signal after application of a given algorithm, uef1 the rms value of the ultrasonic signal (e.g., corresponding to the flaw echo in the measured signal), and uef2 the rms value of the ultrasonic signal after applying the tested algorithm. the rms values of the simulated signal were obtained from the digital samples corresponding to the flaw echoes before and after application of a given algorithm. all the above items are non-dimensional: before correlation they are normalised by the maximum of the measured signal, and after correlation by the maximum of the cross-correlation estimate r̂xy.

first, the optimum choice of the value t was given by a correlation analysis between the synthetic signal x(ntv) and the simulated ultrasonic impulse y(ntv) according to eq. (2). it is obvious that the value t is determined as the reciprocal value of the acoustic wave frequency; for a frequency of 20 mhz it is consequently t = 0.05 µs. then, numerical calculation determined the increase of the coefficient Δsn depending on the value of the coefficient snr. the result is shown in fig. 2.

second, the optimum choice of the parameter values according to eq. (3) was tested. fig. 3 displays a 3d chart of the coefficient of noise reduction Δsn depending on the acoustic wave frequency and on the gaussian pulse damping coefficient. the noise level was chosen at the value snr = 21 db. fig. 3 shows that the maximum noise reduction of the simulated signal x(ntv) (26.7 db) was achieved with a frequency f = 20 mhz and a damping coefficient δ = 9. to determine the noise reduction range, the Δsn values are displayed in fig. 4 for different values of the simulated noise levels.

fig. 2: dependence of the noise reduction coefficient Δsn on the value of the coefficient snr for the optimum value t = 0.05 µs. the simulated signal was defined by eq. (2)
fig. 3: relationship between the noise reduction coefficient Δsn of the synthetic signal x(ntv) with the noise ratio snr = 21 db, the frequency f and the damping coefficient δ of the simulated ultrasonic impulse defined by eq. (3)
fig. 4: dependence of the noise reduction coefficient Δsn on the value of the coefficient snr for the optimum values of the parameters f = 20 mhz and δ = 9
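the snr and Δsn bookkeeping of eqs. (4) and (5), used for the evaluations plotted in figs. 2 to 4, can be written compactly; the helpers below are an illustrative sketch, not code from the paper:

```python
import numpy as np

def rms(v):
    return np.sqrt(np.mean(np.asarray(v, dtype=float) ** 2))

def snr_db(u_signal, n_noise):
    # eq. (4): 20*log10(uef/nef)
    return 20 * np.log10(rms(u_signal) / rms(n_noise))

def delta_sn_db(u1, n1, u2, n2):
    # eq. (5): improvement of the signal-to-noise ratio after the algorithm,
    # evaluated on the samples of the flaw echo before (1) and after (2)
    return 20 * np.log10((rms(u2) / rms(n2)) / (rms(u1) / rms(n1)))
```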
as mentioned previously, the results show that this method reduces the noise, owing to the optimum choice of the simulated signal parameters. the suitability of the algorithms is conditioned by the agreement between the flaw wave shape of the real and the simulated ultrasonic signals. the ability of the method to reduce noise was then tested on real data.

3 experimental
the proposed noise reduction algorithms, based on the cross-correlation function, were tested on real data [1]. the measurement was performed with the use of a special gauge, made of two welded metal sheets 9.2 mm in thickness. a flaw was artificially manufactured on one part of the gauge: spark technology was used to make a hole 0.5 mm in diameter.

firstly, in order to test the algorithms, we used the model of the ultrasonic impulse expressed by eq. (2). fig. 5 illustrates the signal de-noising, with the exception of a residual component at the frequency of 20 mhz; this part of the real signal is caused by the reflection of ultrasonic waves from the grain boundaries of the structure and/or from microscopic reflectors in the material.

fig. 5: results of noise reduction with the use of the computation of the cross-correlation function for a simulated impulse according to eq. (2) with t = 0.05 µs: a) detail of the real signal with flaw echo, b) estimation of the cross-correlation function r̂xy

next, a simulated signal modulated gaussian pulse was used, according to eq. (3). the optimum parameters of this impulse were computed. the change of the ratio Δsn in dependence on the parameters is shown in fig. 6 and in fig. 7. the values of the noise are drawn without units, because the determination of the noise level is inaccurate.

fig. 6: dependence of the noise reduction coefficient Δsn on different values of t for the simulated ultrasonic impulse according to eq. (2)
fig. 7: dependence of the noise reduction coefficient Δsn on different frequencies f and damping coefficients δ for the simulated ultrasonic impulse according to eq. (3)

it is evident from the graphs in fig. 6 and in fig. 7 that the optimum parameters of the simulated signal are f = 20 mhz and δ = 20. the following figures demonstrate the real ultrasonic signal and this signal after applying the algorithm based on the cross-correlation function with the optimum parameters of the gaussian pulse.

fig. 8: results of noise reduction, using the computation of the cross-correlation function for the simulated impulse according to eq. (3) with parameters f = 20 mhz and δ = 20

conclusion
one of the most important properties of ultrasonic measurement is the suppression of additive noise. undesirable signal noise arises from the contact between the probe and the material, from the amplifier, and as a result of scattering at inhomogeneities in the material structure. this paper has provided two algorithms for noise reduction in an ultrasonic signal based on the cross-correlation function. the use of the two models of the ultrasonic impulse according to eq. (2) and eq. (3), assuming the correct choice of model parameters, indicates that the new algorithms are suitable for all practical purposes.
acknowledgement
this research work has received support from research program no. j04/98:210000015 „research of new methods for physical quantities measurement and their application in instrumentation“ of the czech technical university in prague (sponsored by the ministry of education, youth and sports of the czech republic).

references
[1] houfek, p.: metody zvyšování citlivosti a rozlišovací schopnosti v ultrazvukové defektoskopii. [methods for improving sensitivity and recognition in ultrasonic defectoscopy]. ph.d. thesis. praha: čvut, fakulta elektrotechnická, 2001.
[2] kreidl, m., et al.: diagnostické systémy. [diagnostic systems]. monografie, praha: vydavatelství čvut, 2001, p. 212–228.
[3] kreidl, m., houfek, p.: reducing ultrasonic signal noise by algorithms based on wavelet thresholding. acta polytechnica, journal of advanced engineering, 2002, vol. 42, no. 2, p. 60–65.

doc. ing. marcel kreidl, csc., phone: +420 224 352 346, e-mail: kreidl@fel.cvut.cz
ing. petr houfek, ph.d., phone: +420 224 352 189, e-mail: houfek@fel.cvut.cz
department of measurement, czech technical university in prague, faculty of electrical engineering, technická 2, 166 27 praha 6, czech republic

acta polytechnica vol. 42 no. 2/2002
reducing ultrasonic signal noise by algorithms based on wavelet thresholding
m. kreidl, p. houfek

traditional techniques for reducing ultrasonic signal noise are based on the optimum frequency of the acoustic wave, the ultrasonic probe construction and low-noise electronic circuits. this paper describes signal processing methods for noise suppression using a wavelet transform. computer simulations of the proposed testing algorithms are presented.

keywords: ultrasonic testing, wavelet transform, thresholding of wavelet coefficients, de-noising algorithms.

1 introduction
ultrasonic non-destructive testing is a versatile technique that can be used in a wide variety of materials analysis applications. there are several sources of noise that can hide a fault. the difficulties of ultrasonic methods result from strong signal attenuation, caused mainly by scattering at the inhomogeneities in the structure of the material. the decreasing transmitted ultrasonic signal causes strong coherent noise. the optimum frequency of an acoustic wave provides the highest signal-to-noise ratio compatible with the detection of a specific discontinuity; each combination of discontinuity type and material may have a different optimum frequency. noise occurs on the contact between the probe and the material, and, finally, noise is caused by the electronics that are used. this noise can totally mask even large backwall echoes. the basic characteristics of an ultrasonic instrument and probe are their sensitivity and resolution. sensitivity is the characteristic of ultrasonic testing that determines the ability to detect small signals, limited by the signal-to-noise ratio. resolution is the ability of an ultrasonic flaw detection system to give separate indications of discontinuities that have almost the same range and/or lateral position.

2 theoretical
reduction of the s/n ratio using algorithms of wavelet thresholding is described in this paper. the method is based on replacing small wavelet coefficients by zero, and keeping or shrinking the coefficients of the discrete wavelet transform with an absolute value above the threshold (hard or soft thresholding) [1]. the wavelet procedure consists of three steps: multiple-level decomposition into approximation and detail coefficients (dwt – discrete wavelet transform), thresholding of the detail coefficients, and reconstruction (idwt – inverse discrete wavelet transform).
the choice of the threshold is relevant for noise reduction. the thresholding methods proposed in this research were the following:

rigrsure – an adaptive soft threshold selected using a quadratic loss function (stein's unbiased risk estimate) that gives an estimate of the risk for a particular threshold value,
sqtwolog – the threshold is set to a fixed value, computed as the square root of twice the logarithm of the number of discrete values of the signal,
heursure – a mixture of the previous options,
minimaxi – the threshold value is calculated for the minimum of the mean square error against an ideal procedure.

in the matlab environment (wavelet toolbox) the following types of wavelets were used for the decomposition of the signal into the approximation and detail coefficients: daubechies, symlet, coiflet, and biorthogonal pairs.

the model of a classical high-frequency signal, as an a-scan with a probe frequency of 20 mhz and an equivalent sampling frequency of 256 mhz, was implemented by computer from real recorded data for the testing of the algorithms. noise at different levels was added to this signal, and the noise-to-signal ratio was expressed by the coefficient of the noise level nsr according to equation (1):

nsr = 20 log (nef / sef),  (1)

where nef is the rms (root mean square) value of the noise and sef the rms value of the signal. the nsr rate was chosen for a better understanding of the following graphs. the coefficient of noise reduction Δsn was defined for a comparison of the success of the used algorithms, according to eq. (2):

Δsn = 20 log [(sef2 / nef2) / (sef1 / nef1)]  [db],  (2)

where nef1 is the rms value of the noise included in the signal, nef2 the rms value of the noise included in the signal after application of a given algorithm, sef1 the rms value of the signal corresponding to the fault echo, and sef2 the rms value of the signal after application of a given algorithm. the rms values of the simulated signal were obtained from the digital samples corresponding to the fault echoes before and after application of a given algorithm.

while analysing the signal it is convenient to set the extraction of the approximations and details of the given signal to a certain level l. fig. 1 shows a graph of the noise reduction in the simulated signal as a function of the decomposition level l (i.e., the maximum order of the used approximations and details). the noise-to-signal ratio expressed by the coefficient nsr (1) was chosen as −20 db. this picture shows that the influence of the decomposition level roughly saturates at a value of l = 6 in all cases, and higher values are not appropriate. a value of l = 6 was therefore chosen for the maximum decomposition level (i.e., the maximum level of the used approximations and details), according to the analysis of the ultrasonic signal. the four above thresholding methods were applied to the simulated signal with the coefficient nsr = −20 db after dwt analysis (with l = 6), with a steady threshold at the particular levels of the analysis.
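the three-step procedure (dwt decomposition, thresholding of the detail coefficients, idwt reconstruction) was run by the authors in the matlab wavelet toolbox; a roughly equivalent sketch in python using the pywavelets package is given below, purely as an illustration. the sqtwolog-style fixed threshold sqrt(2 ln n) is scaled here by a robust noise estimate from the finest detail level, a common practice that the paper does not spell out.

```python
import numpy as np
import pywt  # pywavelets; the paper itself used the matlab wavelet toolbox

def denoise(signal, wavelet="bior3.9", level=6, mode="soft"):
    # 1) multiple-level dwt decomposition: [a_l, d_l, d_{l-1}, ..., d_1]
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    # 2) sqtwolog-style fixed threshold sqrt(2*ln(n)), scaled by a noise
    #    estimate from the finest detail level (median/0.6745 estimator)
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745
    thr = sigma * np.sqrt(2 * np.log(len(signal)))
    # 3) threshold the detail coefficients only, then reconstruct (idwt)
    coeffs[1:] = [pywt.threshold(d, thr, mode=mode) for d in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)
```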
the wavelets in the table were chosen empirically. the resulting noise reduction, expressed as the value of the coefficient Δsn for different kinds of wavelets and thresholding methods, is shown in tab. 1. results for wavelets that gave a change in the noise-to-signal ratio Δsn smaller than 10 db are not presented in this table; these wavelets probably have a wave shape different from the echoes contained in the ultrasonic signal, and therefore they are not suitable for noise reduction.

fig. 1: the average dependence of noise reduction on the decomposition level for particular kinds of wavelets

type of wavelet   rigrsure   heursure   sqtwolog   minimaxi
db4               21.59      23.73      27.07      26.91
db6               19.37      20.59      26.42      27.42
db7               18.52      22.64      25.05      23.39
db8               20.90      25.61      28.24      27.43
db9               18.14      19.55      26.93      25.70
db10              18.84      21.53      26.63      26.32
bior1.3           11.55      11.48      28.28      28.76
bior3.9           18.98      22.55      29.81      27.26
bior4.4           18.63      22.95      27.59      26.48
coif1             17.61      17.47      25.29      21.55
coif2             17.88      17.94      25.55      22.14
coif4             18.99      23.62      26.10      23.35
sym3              17.59      22.15      27.51      25.81
sym4              20.82      20.84      26.89      25.22
sym5              19.54      19.86      25.96      25.76
sym6              17.16      20.71      26.70      24.86
sym7              17.45      20.74      27.65      24.25
sym8              17.22      20.45      24.79      22.82

tab. 1: values of the noise reduction coefficient Δsn [db] for different types of wavelets and thresholding methods

the results presented in table 1 show that the values of the noise reduction coefficient vary from 17 to 30 db. the different effects of the different thresholding methods are also apparent: the “rigrsure” and “heursure” methods gave worse results than the other methods, and the best results were obtained using the “sqtwolog” method. only this method will therefore be considered from now on. the four wavelets “db8”, “bior3.9”, “bior1.3” and “sym3”, which produced the highest values of the coefficient Δsn, were chosen.

in the following section, the ultrasonic signal was used with additive noise, generated on the basis of the model of the real noise, and the noise reduction rates were simulated by computer. the noise-to-signal ratio was consequently increased in the range of the nsr coefficient from −45 db to −3 db. a graph expressing the dependence of the noise reduction coefficient Δsn on the coefficient of the noise level in the signal for the wavelets given above is shown in fig. 2. the dependence in fig. 2 shows that the maximum value of the nsr coefficient for which the Δsn value remains positive varies in the range from −7.5 db to −4 db.

until now we have considered only the standard thresholding methods for noise reduction. now a new method [2] will be presented. it involves computing the optimum threshold value individually for every level of detail coefficients of the wavelet decomposition. for verification purposes, an analysis was made of the simulated signal, primarily up to the level l = 6. this analysis was made using the “bior3.9” wavelet, i.e., the wavelet that gave the best results in the prior analysis. fig. 3 shows the details and approximations obtained during the wavelet analysis of the signal with noise, given an nsr coefficient value of −7.15 db. the time curve of the analysed signal is also displayed in fig. 3.
fig. 3 shows that the simulated echo is best displayed in detail no. 3 and particularly in detail no. 4, while the other details are representations of the noise. these results show that it is necessary to perform maximum suppression of the noise part of the signal contained in all the details except details no. 3 and no. 4.

a simple algorithm was written in the matlab environment to determine the optimum thresholds (a sketch of the idea is given below). this algorithm goes gradually through the combinations of threshold values for every detail of the signal in the range from 0 to 1. after the threshold processing, the signal is reconstructed for every combination of thresholds and the noise reduction coefficient Δsn is computed. in tab. 2, the optimum values are written for the wavelet “bior3.9”, together with the value of Δsn, which characterises the noise reduction.

fig. 2: dependence of the noise reduction coefficient on the noise level in the signal for the four wavelets with the maximum noise reduction effect
fig. 3: decomposition of the simulated signal for l = 6 (drawings of all details of the wavelet analysis and one selected approximation)

detail no. 1   detail no. 2   detail no. 3   detail no. 4   detail no. 5   detail no. 6   Δsn [db]
0.5            0.3            0.2            0.1            0.1            0.3            26.5

tab. 2: optimum choice of thresholds for maximum noise reduction

an analysis was then made of the influence of changes in the particular thresholding levels. a program was written which gradually shifted the threshold value of one detail and computed the noise reduction effect, while the levels of the other details were set to the optimum values according to tab. 2. the results of this analysis for the wavelet “bior3.9” are drawn in fig. 4. the dependence of the noise reduction coefficient on the threshold value increases before settling at a particular value, for all details except detail no. 3. since details no. 1, no. 2, no. 4, no. 5 and no. 6 include mostly noise (see fig. 3), it may be assumed that a maximum reduction will be achieved if we totally cut out the discrete values of these details. fig. 4 also shows that the settling of the higher-level details is faster; this is because the discrete values of the particular details are decreasing, and total removal is achieved immediately at lower threshold levels.
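the exhaustive per-detail threshold search described above can be sketched as follows. this is an illustrative python reimplementation, not the original matlab program; the coarse threshold grid and the use of the known noiseless reference signal for evaluating eq. (2) (available in the simulated case) are assumptions made for the demonstration.

```python
import itertools
import numpy as np
import pywt

def delta_sn_db(clean, noisy, denoised, echo):
    # eq. (2), evaluated on the samples of the fault echo (slice `echo`);
    # the residual noise after de-noising is estimated against the clean
    # reference, which is possible only for a simulated signal
    s1 = np.sqrt(np.mean(clean[echo] ** 2))
    n1 = np.sqrt(np.mean((noisy - clean)[echo] ** 2))
    s2 = np.sqrt(np.mean(denoised[echo] ** 2))
    n2 = np.sqrt(np.mean((denoised - clean)[echo] ** 2))
    return 20 * np.log10((s2 / n2) / (s1 / n1))

def optimum_thresholds(clean, noisy, echo, wavelet="bior3.9", level=6,
                       grid=(0.0, 0.1, 0.2, 0.3, 0.4, 0.5)):
    coeffs = pywt.wavedec(noisy, wavelet, level=level)
    best = (-np.inf, None)
    # try every combination of per-detail thresholds
    for thrs in itertools.product(grid, repeat=level):
        c = [coeffs[0]] + [pywt.threshold(d, t, mode="soft")
                           for d, t in zip(coeffs[1:], thrs)]
        rec = pywt.waverec(c, wavelet)[: len(noisy)]
        score = delta_sn_db(clean, noisy, rec, echo)
        if score > best[0]:
            best = (score, thrs)
    return best  # (max delta_sn in db, thresholds for details l .. 1)
```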
if we focus on the dependence of Δsn on the level of thresholding of detail no. 3, we will see that this dependence has a certain maximum, representing the optimum noise reduction. the existence of this maximum can be explained as follows: detail no. 3 contains information concerning the fault echo, but it also contains a certain noise value. if we cut down the noise value, then the noise in the signal will be reduced; the amplitude of the fault echo will also be reduced, but the noise reduction effect dominates. if we increase the threshold value beyond the point where the noise is totally removed, then any further increase only reduces the fault echo, and the value of the coefficient Δsn decreases.

fig. 4: influence of the threshold values on the signal noise reduction

a comparison of the value of the coefficient Δsn obtained after applying the newly proposed thresholding method with the value achieved using the standard thresholding methods provides the following result: the value of the coefficient Δsn is about 20 db higher when the new thresholding method is used.

3 experimental
the proposed algorithm, based on applying the wavelet transformation, was then tested on real data [2]. the measurement was done on a special gauge, made of two metal sheets 9.2 mm in thickness, which were jump welded. a fault was created artificially on one part of the gauge by a 0.5 mm hole, which was manufactured by spark technology. the results of the measurements are given in fig. 5.

fig. 5: results of noise reduction, using the proposed new algorithm with the wavelet “sym3”

4 conclusion
our analysis indicates that the wavelets from the db family are suitable for noise reduction in an ultrasonic signal, followed by the coiflet and symlet type wavelets and some chosen biorthogonal pairs (especially “bior3.9” or “bior4.4”). noise reduction for these wavelets is characterised by a noise reduction coefficient Δsn ≥ 12 db. on the other hand, it is not recommended to use the “db1”, “db5”, “bior3.1” to “bior3.7” and “coif5” wavelets; these wavelets have a shape [4] different from the echoes in an ultrasonic signal.

we then investigated the different thresholding methods with respect to noise reduction. the standard methods indicate that no optimum algorithm exists among them. our proposed method produced the best results, consisting of the optimum threshold value chosen individually for each detail coefficient of the wavelet analysis of the signal. by choosing a suitable threshold, the noise reduction coefficient can be increased by about 9 db on the real signal, and by even 10 db on a simulated signal.

acknowledgement
this research work has received support from research program no. j04/98:210000015 „research of new methods for physical quantities measurement and their application in instrumentation“ of the czech technical university in prague (sponsored by the ministry of education, youth and sports of the czech republic).

references
[1] donoho, d. l.: de-noising by soft-thresholding.
ieee transactions on information theory, 1995, vol. 41, no. 3, p. 613–627.
[2] houfek, p.: metody zvyšování citlivosti a rozlišovací schopnosti v ultrazvukové defektoskopii. [methods for improving sensitivity and recognition in ultrasonic defectoscopy]. ph.d. thesis, fakulta elektrotechnická čvut, praha, 2001.
[3] kreidl, m., et al.: diagnostické systémy. [diagnostic systems]. monografie, praha: vydavatelství čvut, 2001, p. 123–128.
[4] misiti, m., misiti, y., oppenheim, g., poggi, j. m.: wavelet toolbox user's guide. natick (usa): the mathworks, 1996.
[5] strang, g., nguyen, t.: wavelets and filter banks. wellesley (usa): wellesley-cambridge press, 1996.

ing. petr houfek, ph.d., e-mail: houfek@fel.cvut.cz
doc. ing. marcel kreidl, csc., e-mail: kreidl@fel.cvut.cz
department of measurement, czech technical university in prague, faculty of electrical engineering, technická 2, 166 27 prague 6, czech republic

acta polytechnica 56(2):132–137, 2016. doi:10.14311/ap.2016.56.0132. © czech technical university in prague, 2016, available online at http://ojs.cvut.cz/ojs/index.php/ap
parametric studies on the component-based approach to modelling beam bottom flange buckling at elevated temperatures
guan quan, shan-shan huang, ian burgess
university of sheffield, department of civil and structural engineering, sheffield, uk
corresponding author: g.quan@sheffield.ac.uk

abstract. in this study, an analytical model of the combination of beam-web shear buckling and bottom-flange buckling at elevated temperatures has been introduced. this analytical model is able to track the force-deflection path during post-buckling. a range of 3d finite element models has been created using the abaqus software. comparisons have been carried out between the proposed analytical model, finite element modelling and an existing theoretical model by dharma (2007). comparisons indicate that the proposed method is able to provide accurate predictions for class 1 and class 2 beams, and performs better than the existing dharma model, especially for beams with high flange-to-web thickness ratios. a component-based model has been created on the basis of the analytical model, and will in due course be implemented in the software vulcan for global structural fire analysis.

keywords: shear buckling; bottom-flange buckling; component-based model; fire.

1. introduction
the structural behaviour of a real composite frame observed in the full-scale cardington fire tests in 1995–96 [1, 2] was very different from that observed in furnace tests on isolated composite beams. this stimulated a general awareness of the importance of performance-based design, which sufficiently considers the interactions between different members in the structure. wald et al.
[3] reported on the results of a collaborative research project, including tensile membrane action and the robustness of structural steel joints under natural fire, to investigate the global structural behaviour of a compartment of the 8-storey steel–concrete composite frame building at the cardington laboratory during a bre large-scale fire test. this report further revealed the structural integrity of structures under fire conditions. however, full-scale fire tests are expensive, and carrying out finite-element modelling of an entire structure, including its joints, using solid elements is computationally demanding. the component method provides a practical alternative approach to modelling the joints and their adjacent zones; component-based joint models can be included in global structural analysis using much lower numbers of beam-column and shell elements.

the cardington fire tests [4] indicated that beam-web shear buckling, as well as beam bottom-flange buckling, near the ends of steel beams, is very prevalent under fire conditions, as shown in fig. 1. these phenomena can have significant effects on the beam deflection, as well as on the force distribution within the adjacent joint. as joints are among the most vulnerable elements in steel structures, due to their complex behaviour [5], it is important to fully consider the effects of the buckling elements in the vicinity of beam-column joints in fire.

figure 1. shear buckling and bottom-flange buckling in a cardington fire test [4].

research by elghazouli [6] addressed the influence of local buckling on frame response, although local buckling may have an insignificant influence on the fire resistance of isolated members. however, the local buckling model presented in elghazouli's work is based on elastic plate buckling theory [7], which is not appropriate for representing the buckling behaviour of class 1 and 2 sections.

in this study, the principles of the analytical model [8], which combines beam-web shear buckling and bottom-flange buckling at elevated temperatures, are briefly reviewed. this model is capable of predicting local buckling behaviour in the post-buckling stage. together with the assumption that the characteristics of the buckling zone are identical to those of the normal beam in the pre-buckling stage, the analytical model is able to track the complete force-deflection path of the end-zone of the beam, from initial loading to post-buckling. the analytical model has for the first time been validated under a uniformly distributed loading condition. a range of 3d finite element models has been created using abaqus, and comparisons have been carried out between these, the proposed analytical model, and dharma's model [9], which indicate that the proposed method gives better predictions overall than dharma's model. a component-based model has been created on the basis of the analytical model, and will be implemented in the software vulcan for global structural fire analysis.

2. development of analytical model
the development of the analytical model is explained using a short cantilever i-beam section (fig. 2) as an example. by changing the cantilever length and by applying different combinations of uniformly distributed load and shear force at the beam-end, the cantilever is able to represent the part of a fixed-ended beam from its end to the point of contraflexure under a uniformly distributed load. it can be calculated that the distance from one end of the beam to its adjacent point of contraflexure is equal to 0.2113 of the whole beam length.

figure 2. the analytical model.

when the bottom-flange buckling wavelength calculated from eqns. (1) and (2) is shorter than the beam depth, the length of the buckling zone is taken as identical to the lp calculated from these equations; otherwise, the length of the buckling zone is capped at the beam depth d:

β = 0.713 √(275/fy) · (d/b)^(1/4) · (tf/tw)^(3/4) · ((ke/ky)/0.7),  (1)
lp = 2βc,  (2)

where c is the outstand of the flange.

if the material properties shown in fig. 3, for steel at temperatures higher than 400 °c, are used, the vertical force-deflection relationship of the buckling panel can be illustrated qualitatively as in fig. 4. the proposed analytical model divides the loading procedure into three stages: pre-buckling, plateau and post-buckling.

figure 3. stress-strain relationship of structural steel.
figure 4. schematic force–deflection of a buckling panel.

in the pre-buckling stage, the calculation rules follow those for beams at elevated temperatures. the plateau ab occurs when the sectional plastic moment capacity is reached at the middle of the buckling zone; point b is the point at which bottom-flange buckling occurs. the plastic buckling mechanism is shown in fig. 5; in the post-buckling stage it is assumed that the collapse mechanism is composed of a combination of yield lines and plastic yield zones. the calculation principle is based on the equality of the internal plastic work and the loss of potential energy of the external load. the internal plastic work wint includes the work done in the flanges, which is composed of ∑i (wl)i due to rotation about the yield lines and ∑j (wz)j due to axial deformation of the plastic zones, as well as the work ww done by the beam web due to its deformation during shear buckling. the total internal work can be expressed as

wint = (1/4) ∑i (lp t² fy θ1)i + ∑j (ap t fy ε)j + ww.  (3)

figure 5. plastic buckling mechanism.

the total deflection of the buckling zone is composed of the deflections caused by simultaneous bottom-flange buckling and beam-web shear buckling, as shown in fig. 6. the influence of bottom-flange buckling is to cause a rotation of the whole beam-end about the top corner of the beam, while the effect of shear buckling is a parallel movement of the opposite edges of the shear panel.

figure 6. the effects of flange buckling and beam-web shear buckling on beam vertical deflection.

when applying a uniformly distributed load on the top of the beam, as well as a shear force at the beam-end, the total external work can be expressed as

wext = ∑i pi Δi = 0.5(0.2113l)² q θ1 + (0.2113l) f θ1 + 0.5 lp² q θ2 + q(0.2113l − lp) lp θ2 + f lp θ2.  (4)

the theoretical model can be applied to different load and boundary conditions.
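as an illustration of the energy-balance principle, note that eq. (4) is linear in the end shear force f, so once the internal plastic work wint of eq. (3) has been evaluated for a given mechanism rotation (which requires the yield-line geometry of fig. 5), the equilibrium end shear follows directly. the sketch below is not the authors' implementation; consistent units (e.g. n, mm, rad) are assumed.

```python
def end_shear_from_energy_balance(w_int, q, l, lp, theta1, theta2):
    """solve w_int = w_ext (eq. 4) for the beam-end shear force f.

    w_int is the internal plastic work of eq. (3), evaluated elsewhere;
    theta1 and theta2 are the mechanism rotations of fig. 6."""
    # terms of w_ext that do not multiply f (all proportional to q)
    w_q = q * (0.5 * (0.2113 * l) ** 2 * theta1
               + 0.5 * lp ** 2 * theta2
               + (0.2113 * l - lp) * lp * theta2)
    # coefficient of f in w_ext
    a_f = 0.2113 * l * theta1 + lp * theta2
    return (w_int - w_q) / a_f
```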
3. validation against finite element modelling
3.1. development of the abaqus models
the commercial finite element software abaqus was used to simulate the buckling phenomena in the vicinity of beam-column connections at 615 °c. the four-noded shell element s4r [10], which is capable of simulating buckling behaviour, was adopted. a 15 mm × 15 mm element size was used, after a mesh sensitivity analysis. the riks approach was used in order to identify the descending curve in the post-buckling stage. cantilever models with the same loading and boundary conditions as the analytical model were set up. an image of the abaqus model is shown in fig. 7 (a); the cross-section dimensions are shown in fig. 7 (b). all the cantilevers shared the same configuration except for the beam web and flange thicknesses. the beam cross-section dimensions were based on the universal beam ub356x171x51, whose web and flange thicknesses are 7.4 mm and 11.5 mm respectively. as the analytical model applies generally to class 1 and 2 beams, the thicknesses of the beam webs and flanges vary within this range: the web thicknesses were varied from 5.5 mm to 8 mm, while the flange thicknesses were varied between 10 mm and 13 mm. in the cases validated, the potential beam length was 6 m, on the basis that a beam depth-to-length ratio of 1/20 is commonly used in design practice. the cantilever length was 1267.8 mm, which is identical to the distance from the beam-end to its adjacent point of contraflexure. the shear force applied to the beam-end was 1732.2q, which enabled the cantilevers to be in the same loading condition as the corresponding end zones of the 6 m fixed-ended beams.

figure 7. finite element model: (a) image of the finite element model; (b) cross-section dimensions.

the stress-strain relationship of the beam material at 615 °c was defined according to eurocode 3 [11]. the details of the material properties used in the abaqus models are shown in table 1.

fy,θ (n/mm²)   εy,θ (%)   εt,θ (%)   εu,θ (%)   ea,θ (n/mm²)
224.4          2          15         20         201697

table 1. material properties.

3.2. comparisons between the analytical model, dharma's model and fea
the force-displacement relationships given by the proposed analytical model, dharma's model and the abaqus analyses have been compared. the first group of beams compared have the same flange thickness of 11.5 mm, while their web thickness varies. the detailed curves are shown in fig. 8. the lines with diamond markers, denoted “elastic-plastic”, represent the force-deflection relationships when the full plastic moment resistance is reached at the middle of the flange buckling zone. the smooth lines without markers represent the results of the finite element modelling. the descending solid and dashed lines are the results from the newly proposed buckling model and from dharma's model respectively. it can be seen that both the proposed analytical model and dharma's model compare very well with the fe modelling for beams with thicker webs. the proposed model is able to provide acceptable results for beams with webs within the class 3 range; however, dharma's model tends to over-estimate the beam capacity considerably for those with more slender webs.

figure 8. comparison between the analytical model, dharma's model and fe analysis (web thickness varies).

fig. 9 shows the second group of comparisons, for which the beam web thickness remains at 7.4 mm and the flange thickness varies between 10.0 mm and 13.0 mm, to guarantee that the beam classification lies in the class 1 to 2 range. it can be seen that the proposed analytical model compares well for all the selected flange thicknesses, while dharma's model over-estimates the capacity for beams with stocky flanges. this is possibly because the length of the buckling zone is related to the ratio between tf and tw, according to eq. (1): decreasing the web thickness or increasing the flange thickness both increase this ratio. dharma's model seems more sensitive to the flange-to-web thickness ratio, and therefore considerably over-estimates the capacity when this ratio increases.

figure 9. comparison between the proposed analytical model and dharma's model (flange thickness varies).
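the sensitivity of the buckling-zone length to the flange-to-web thickness ratio discussed above can be illustrated directly from eqns. (1) and (2). the sketch below is not from the paper: the cross-section dimensions are taken from fig. 7, the flange outstand is approximated as (b − tw)/2, and the elevated-temperature reduction ratio ke/ky is left as an input (the value 0.7 used in the demonstration simply makes the last factor of eq. (1) equal to one).

```python
import numpy as np

def buckling_zone_length(fy, d, b, tf, tw, ke_over_ky):
    # eq. (1): wavelength parameter beta; eq. (2): lp = 2*beta*c,
    # with c the flange outstand, capped at the beam depth d
    beta = 0.713 * np.sqrt(275.0 / fy) * (d / b) ** 0.25 \
           * (tf / tw) ** 0.75 * (ke_over_ky / 0.7)
    c = 0.5 * (b - tw)  # flange outstand (approximation)
    return min(2.0 * beta * c, d)

# parametric sweep mirroring section 3: ub356x171x51-like section, fy = 275 n/mm2
for tw in (5.5, 6.0, 7.4, 8.0):
    lp = buckling_zone_length(fy=275.0, d=332.0, b=171.5,
                              tf=11.5, tw=tw, ke_over_ky=0.7)
    print(f"tw = {tw} mm -> lp = {lp:.0f} mm")
```

thinner webs (or thicker flanges) lengthen the predicted buckling zone, which is consistent with the growing discrepancy of dharma's model at high tf/tw ratios.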
fig. 9 shows the second group of comparisons, for which the beam web thickness remains at 7.4 mm and the flange thickness varies between 10.0 mm and 13.0 mm, to guarantee that the beam classification lies in the class 1 to 2 range. it can be seen that the proposed analytical model compares well for all the selected flange thicknesses, while dharma's model over-estimates the capacity for beams with stocky flanges. this is possibly because the length of the buckling zone is related to the ratio between tf and tw according to eq. (1); decreasing the web thickness or increasing the flange thickness can both increase this ratio. dharma's model seems more sensitive to the flange-to-web thickness ratio, and therefore considerably over-estimates the capacity when this ratio increases.

figure 9. comparison between the proposed analytical model and dharma's model (flange thickness varies): reaction force at one end versus vertical deflection at mid-span for flange thicknesses of 10.0, 11.5, 12.0 and 13.0 mm.

4. component-based model
the bottom-flange buckling component, representing the beam-end buckling element, is a compressive spring. its characteristic relies on the flange buckling behaviour, which is related to θ1 (fig. 6). the effect of the displacement of the top and bottom springs is effectively a rotation around the top point, connected to the connection element. the characteristic of the shear-buckling component is related to θ2. the effect of shear buckling is a transverse drift of the two opposite edges of the buckling panel. the length of the buckling component has been defined as identical to the beam depth (d). a schematic sketch of such a two-spring buckling element is given below.
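the sketch below is only a schematic illustration of this two-spring idea, not the authors' implementation: the multilinear helper and all numerical spring characteristics are placeholder assumptions.

```python
# schematic sketch of a buckling element with a flange-buckling spring
# and a shear-buckling spring, each of length equal to the beam depth d.
# the multilinear characteristics below are placeholders, not calibrated
# component curves.
import numpy as np

def multilinear(displacements, forces):
    """force-displacement curve interpolated between the given points."""
    return lambda d: float(np.interp(d, displacements, forces))

d = 0.355  # beam depth [m] (ub356x171x51)

# elastic rise, plateau, then post-buckling descent (placeholder values)
flange_spring = multilinear([0.0, 2e-3, 4e-3, 20e-3], [0.0, 120e3, 120e3, 60e3])
shear_spring = multilinear([0.0, 1e-3, 3e-3, 20e-3], [0.0, 80e3, 80e3, 50e3])

theta1, theta2 = 0.010, 0.004  # rotations associated with the two modes [rad]
f_flange = flange_spring(theta1 * d)  # compressive spring force [n]
f_shear = shear_spring(theta2 * d)    # shear-buckling spring force [n]
print(f"flange spring: {f_flange/1e3:.1f} kn, shear spring: {f_shear/1e3:.1f} kn")
```

in an actual frame analysis, the placeholder curves would be replaced by characteristics derived from the analytical model reviewed above.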
one of the major objectives of this research is to develop new components representing beam-web shear buckling and flange buckling, and to implement these, together with the adjacent joint element, as shown in fig. 10, to carry out performance-based frame analysis under fire conditions.

figure 10. component-based model for buckling components (buckling element with flange-buckling and shear-buckling components adjacent to the joint element; rotation centre at the top, l1 = 0, l2 = d, node at the column face).

5. conclusions
this paper has briefly reviewed an analytical model proposed to predict the post-buckling behaviour of the end zones of steel beams at elevated temperatures. the load resistance of a steel shear panel at elevated temperatures involves three stages: non-linear pre-buckling, plateau and post-buckling. a whole force-deflection relationship has been postulated, from initial loading to the post-buckling stage. a range of finite element models has been created using the finite element software abaqus. the loading of the cantilever models was uniformly distributed on the top flange, together with a shear force at the beam end. the proposed analytical model was compared with dharma's existing model and with the fe models. the comparisons were carried out for beams with different web and flange thicknesses within the class 1 and 2 range, according to the classification method provided by eurocode 3 [11]. the comparisons showed that the proposed method provides a stable upper bound in terms of the reaction force-deflection relationship within this range, while dharma's model tends to over-estimate the post-buckling capacity, especially when the beam flange-to-web thickness ratio is relatively large. component-based models of the beam-web shear buckling and the bottom flange buckling have been created. the newly developed components will be implemented, in conjunction with the adjacent component-based joint element, to carry out performance-based analysis of steel and composite framed structures under fire conditions.

references
[1] kirby b. the behaviour of a multi-storey steel framed building subject to fire attack - experimental data. british steel swinden technology centre, united kingdom; 1998.
[2] huang z, burgess i, plank r, reissner m. three-dimensional modelling of two full-scale, fire tests on a composite building. proceedings of the ice - structures and buildings. 1999;134:243-55. http://dx.doi.org/10.1680/istbu.1999.31567
[3] wald f, da silva ls, moore d, lennon t, chladna m, santiago a, et al. experimental behaviour of a steel structure under natural fire. fire safety journal. 2006;41:509-22. http://dx.doi.org/10.1016/j.firesaf.2006.05.006
[4] newman gm, robinson jt, bailey cg. fire safe design: a new approach to multi-storey steel-framed buildings. steel construction institute; 2000.
[5] bijlaard fsk. connections in steel structures v: behaviour, strength & design; proceedings of the fifth international workshop. zoetermeer: bouwen met staal; 2005.
[6] elghazouli a, izzuddin b. significance of local buckling for steel frames under fire conditions. 4th international conference on steel and aluminium structures (icsas 99): elsevier science bv; 1999. p. 727-34. http://dx.doi.org/10.1016/b978-008043014-0/50187-6
[7] timoshenko sp, gere jm. theory of elastic stability. new york: mcgraw-hill; 1961.
[8] quan g, huang s-s, burgess i.
component-based model of buckling panels of steel beams at elevated temperatures. journal of constructional steel research. 2016;118:91-104. http://dx.doi.org/10.1016/j.jcsr.2015.10.024
[9] dharma rb. buckling behaviour of steel and composite beams at elevated temperatures; 2007.
[10] hibbit d, karlsson b, sorenson p. abaqus reference manual 6.7. pawtucket: abaqus inc; 2005.
[11] cen. bs en 1993-1-2. design of steel structures. part 1.2: general rules - structural fire design. uk: british standards institution; 2005.

acta polytechnica 58(3):184–188, 2018. doi:10.14311/ap.2018.58.0184. © czech technical university in prague, 2018. available online at http://ojs.cvut.cz/ojs/index.php/ap

hydrophobic impregnation of geopolymer composite by ethoxysilanes
zdeněk mašek∗, linda diblíková
výzkumný a zkušební letecký ústav - composite technologies department, beranových 130, 19005 prague, czech republic
∗ corresponding author: zdenek.masek@vzlu.cz

abstract. a geopolymer composite was impregnated by incorporating hydrophobic alkyl groups on the outer surface and in the inner structure of the geopolymer. the ethoxysilanes 1h,1h,2h,2h-perfluoroctyltriethoxysilane and hexadecyltrimethoxysilane were used as the source of the hydrophobic groups. three types of solutions based on the ethoxysilanes were prepared according to adapted procedures. the modification of the geopolymer composites was done by their immersion into the hydrophobic solutions, followed by drying at a laboratory or elevated temperature. the effectivity of the procedure was evaluated by measuring the water contact angle on the surface of the modified composite and by measuring the water uptake and stiffness of the composite. the results confirmed that silanes hydrolyzed in a sol containing sio2 nanoparticles have a higher hydrophobization effect than solutions of simply hydrolyzed silanes. the resulting impregnation procedure changed the geopolymer composite surface from hydrophilic to hydrophobic.

keywords: geopolymer; sol; hydrophobicity; alkyltrisilane; tetraethoxysilane; contact angle.

1. introduction
geopolymers are alkaline amorphous inorganic polymers, which belong to the group of aluminosilicates [1]. the polymeric spatial structure of geopolymers is created by silicon and aluminium atoms, which are coordinated by atoms of oxygen. the oxygen atoms serve as bridges connecting the silicon and aluminium atoms. the bridges are created by a condensation of (alo2)− and (sio4)4− anions and poly-anions. uncondensed −oh and −o− groups are the reason for the hydrophilic character of the geopolymer. the geopolymer is in its nature similar to concrete, which means that the level of protection against water penetration into geopolymers can be described with a nomenclature similar to that used for concrete structures. it can be divided into three groups [2]: i) protective coating, ii) impregnation and iii) hydrophobic impregnation. the protective coating forms an impermeable layer only on the surface.
the impregnation uses special agents, which penetrate into the inner structure of the material and fill the pores. in contrast, a hydrophobic impregnation does not affect the pores, i.e., it acts only as a water repellent. an inappropriate application of the penetration could cause harmful stress and damage to the treated material. therefore, the hydrophobic impregnation is a widely researched area in the field of conservation and restoration of monuments, because it is material friendly [3, 4]. the hydrophobic impregnation of geopolymers is presented in this paper. the geopolymer is a microporous material, which can be used as a matrix in fine carbon fiber-reinforced composites. however, such a composite is usually more porous than the original geopolymer, because of a low adhesion between the matrix and the surface of the fibers and an insufficient penetration of the matrix into the unidirectional bundles of carbon fibers. this fact, together with the hydrophilic character of the geopolymer matrix, makes the final composite very sensitive to humidity. duan et al. [5] proposed the hydrophobic impregnation of a geopolymer by attaching the molecules of palmitic acid into its structure by esterification. to our knowledge, this is the only treatment used for geopolymers. other treatments providing a hydrophobic effect were tested on different types of materials, although, in principle, they can be applied also to geopolymers. it is well known that a small addition of alkyltrialkoxysilanes to a suspension of sio2 nanoparticles during a surface treatment of textiles increases its hydrophobicity [6–9]. similarly, polydopamine is used as the carrier of hydrophobic groups created by the hydrolysis of alkyltriethoxysilane [10]. in both cases, the formation of nanoparticles with attached hydrophobic groups originating from alkyltriethoxysilanes was observed. a preparation process of a binder based on sol-gel nanoparticles was described in the field of zinc-silicate anticorrosive coatings. these binders are usually prepared by the hydrolysis and condensation of polyethoxysilicates and boric acid in a non-aqueous environment [11]. all the stated examples of sols represent the concept of carrier nanoparticles with an ability to be attached to a specific substrate, to react with alkyltriethoxysilanes and to affect the hydrophobic properties of the surface treatment.

figure 1. chemical formula of: (left) 1h,1h,2h,2h-perfluoroctyltriethoxysilane, (right) hexadecyltrimethoxysilane.

the main principle of the hydrophobic impregnation in our work is the exposure of the geopolymer composite to a reagent, which is created by an acidic hydrolysis of tetraethoxysilane with a small addition of hexadecyltrimethoxysilane or 1h,1h,2h,2h-perfluoroctyltriethoxysilane. these reagents probably condense with the hydrophilic −oh and −o− groups and are usually used together with carrier nanoparticles. the objective of this work is to design a gentle treatment of the carbon fiber/geopolymer composite, which will improve its negative properties given by the combination of porosity and the hydrophilic character of the geopolymer resin.

2. materials and methods
2.1. preparation of geopolymer resin and composite samples
the geopolymer resin was prepared by an alkaline activation of the burnt clay shale mefisto l05 (české lupkové závody, a.s., czech republic) and amorphous silica (thermal silica, saint-gobain, france). the alkaline activator of the reaction was an aqueous solution of potassium silicate dv 1.7 (vodní sklo, a.s., czech republic). the mixture was homogenized with a dispermat ca60-m1 (vma-getzmann gmbh, germany) in a ratio ensuring the best strength properties of the pure geopolymer resin; the exact chemical composition of the resin is classified information within the project. the mixing vessel was placed in a water-chilled container. firstly, the amorphous silica was mixed with the chilled solution (3 °c) of potassium silicate at 9600 rpm for 20 min. then, the mixture was cooled at −20 °c for 20 min and the metakaolin was added under continuous stirring at 9600 rpm for 5.5 min. finally, the mixture was left in the freezer for 10 min and degassed by stirring under vacuum. the geopolymer composite was prepared by a lamination of 10 layers of a carbon fiber fabric in the form of a plain weave, with an areal density of 93 g/m², 1k (havel composite cz s.r.o.). the manual lamination was done using a metal spatula at a dosage of 282 g of resin per square metre of the fabric. the fabric with the geopolymer was layered and pressed by a roller. the last layer was covered by a separation foil with a p3 perforation and a non-woven staple. the whole panel was sealed in a pp foil and cured under vacuum at 23 °c for 72 hours. finally, test samples with dimensions of 10 × 8 × 1.7 mm were cut from the panel and left at a laboratory temperature for 14 days.

2.2. preparation of hydrophobic solutions and impregnation of samples
four types of impregnation agents were used: i) 1h,1h,2h,2h-perfluoroctyltriethoxysilane (pfoteos), purchased under the product name dynasylan f-8261; ii) hexadecyltrimethoxysilane (hdteos), purchased under the name dynasylan 9116; iii) tetraethoxysilane (teos), purchased under the name dynasylan a; and iv) polyethoxysilane (peos), which contains from three to seven ethoxysilane units and is produced by a partial hydrolysis, purchased under the product name dynasylan 40 (all evonik industries). the chemical structures of dynasylan f-8261 and dynasylan 9116 are displayed in figure 1. six hydrophobic solutions were prepared from the agents and labelled a, b, c, d, e and f. solutions a and d were mixed according to the application manual of the glass and ceramic materials manufacturer. the preparation of solutions b and e was performed by a modified procedure for cotton fabrics published in [3]; we used isopropyl alcohol instead of ethanol and adjusted the dosage of the silane. solutions c and f were synthesized based on a procedure for the binder of a zinc-silicate paint [11]; we used the idea of the peos hydrolysis in a non-aqueous solution and improved it by adding alkylsilanes. the solution a was prepared by mixing the pfoteos with isopropanol in a weight ratio of 1 : 99. one hundred grams of this solution was mixed with 10 g of distilled water and 0.2 g of 37 % hcl. the whole mixture was stirred at 500 rpm for 5 hours. samples were immersed into the solution for 1 hour, wiped with a linen cloth and placed in a drying oven at 105 °c for 20 hours.
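for convenience, the arithmetic of the solution-a recipe scales linearly with the batch size; a minimal sketch follows (the function name and the 250 g example are hypothetical additions, not from the paper):

```python
# minimal sketch: scaling the solution-a recipe (pfoteos : isopropanol
# = 1 : 99 by weight, plus 10 g of water and 0.2 g of 37 % hcl per
# 100 g of the silane solution) to an arbitrary batch size.
def solution_a_batch(silane_solution_g: float) -> dict:
    return {
        "pfoteos [g]": 0.01 * silane_solution_g,       # 1 part in 100
        "isopropanol [g]": 0.99 * silane_solution_g,   # 99 parts in 100
        "distilled water [g]": 10.0 * silane_solution_g / 100.0,
        "37 % hcl [g]": 0.2 * silane_solution_g / 100.0,
    }

print(solution_a_batch(250.0))  # e.g. a 250 g batch of the silane solution
```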
the solution b was obtained by stirring 35 g of distilled water and 39.45 g of isopropanol at 500 rpm; then 7 ml of 0.01 m hcl was added. consequently, 15.65 g of teos was added dropwise under a vigorous stirring at 1000 rpm. after each addition of teos, the solution became opalescent; the next part of the teos was always added after the solution had cleared. when the whole amount of the teos had been added, 5.47 g of pfoteos was added dropwise and the obtained colloidal suspension was stabilized by stirring at 500 rpm for 30 min. at the end, samples were immersed into the solution for 30 min, wiped with a linen cloth and dried at 80 °c for 20 hours. the solution c was prepared by a hydrolysis and condensation of peos. the reaction was carried out in a solution of isopropanol and 2-ethoxyethanol in the presence of boric acid: 49.4 g of peos was mixed with 16.7 g of isopropanol and 11.1 g of 2-ethoxyethanol, and 4.1 g of boric acid was added. the reaction mixture was stirred for 1 hour under vacuum with reflux. then, 18.6 g of pfoteos was added and stirred at the boiling temperature in a reflux configuration for 1 hour. samples were immersed into this solution at 80 °c for 2 hours, wiped with a linen cloth and kept at a laboratory temperature for 2 days. finally, they were dried at 80 °c for 6 hours. the solution d was obtained by the same procedure as the solution a, except that hdteos was used instead of pfoteos as the hydrophobic agent. the solution e was obtained by stirring 35 g of distilled water and 39.4 g of isopropanol at 500 rpm and adding 15 ml of 0.01 m hcl. then, 10.26 g of teos was added dropwise under an intensive stirring at 1000 rpm. after each addition of teos, the solution became milky; the next part of the teos was always added after the solution had cleared. when the whole amount of the teos had been added, 4.0 g of hdteos was added dropwise and the obtained colloidal suspension was stabilized by stirring at 500 rpm for 30 min. at the end, samples were immersed into the solution for 30 min, wiped with a linen cloth and dried at 80 °c for 20 hours. the solution f was prepared by the same procedure as the solution c, except that 18.6 g of hdteos was added instead of 18.6 g of pfoteos. in summary, solutions a and d were not based on a sio2 sol, solutions b and e were sio2 sol solutions, and solutions c and f were non-aqueous solutions with a sio2 sol and h3bo3.

2.3. testing of the hydrophobic agents' effectivity
prior to the water uptake measurements, the samples were dried at 80 °c for 4 hours and then at a temperature ramp from 80 to 170 °c for 16 hours. the sorption of distilled water was measured during the sample immersion by weighing the samples over time until the saturation reached an equilibrium. the sorption was measured with a 5, 50 or 100 mm water column height, where the water column means the water level in a beaker. the samples were then dried at 100 °c. three samples were used for each experiment and the average value was calculated. the water vapour sorption was tested by placing samples into a closed glass container, where water vapour was generated by heating water to the boiling point (100 °c) and maintaining these conditions for 24 hours. before the test, the samples were dried at 80 °c for 4 hours and then at a temperature ramp from 80 to 170 °c for 16 hours. three samples were used for each experiment and the average value was calculated.
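the water uptake reported later in table 2 is expressed as a weight gain; a minimal sketch of that arithmetic follows, assuming the gain is taken relative to the dry mass and averaged over the three samples per experiment (all mass values are illustrative):

```python
# minimal sketch of the water-uptake evaluation assumed here: weight
# gain as a percentage of the dry mass, averaged over three samples.
dry = [4.12, 4.05, 4.18]  # dry sample masses [g] (assumed)
wet = [4.26, 4.19, 4.31]  # saturated masses [g] (assumed)

gains = [100.0 * (w - d) / d for d, w in zip(dry, wet)]
print(f"average weight gain: {sum(gains) / len(gains):.1f} %")
```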
the surface of the geopolymer composite was analysed by contact angle measurements (cam) using the surface energy evaluation system (advex instruments s.r.o., cz), based on the sessile-drop method and equipped with a 2 mpix (1600 × 1200) uvc camera. water was used as the test liquid; the volume of its droplets was 1.5 µl. the contact angle (ca) is defined as the angle between the specimen surface and the tangent to the droplet surface at the interface of the three phases (specimen surface, droplet and ambient air). the ca was calculated as the average value of 5 measurements for each sample; the calculated standard deviation was lower than 9 for all the impregnations. the image analysis was performed using the see system software. the influence on the mechanical properties of the composite was investigated as well. a non-destructive measurement of the storage modulus was performed using a dma q800 (ta instruments, usa) at a frequency of 1 hz and a strain amplitude of 20 µm. the storage modulus is the real part of the complex modulus of elasticity and is related to the sample's stiffness.

impregnation solution | contact angle [°] | storage modulus before [gpa] | storage modulus after [gpa]
none | 0 | 21.5 | 19.0
a | 70 | 23.6 | 19.4
b | 123 | 24.5 | 20.9
c | 98 | 24.0 | 21.1
d | 0 | 23.0 | 19.7
e | 105 | 25.0 | 21.9
f | 87 | 24.3 | 22.8
table 1. contact angle and storage modulus before and after impregnation.

3. results and discussion

3.1. effect of treatments on contact angle and storage modulus
the water contact angles stated in table 1 show that all the hydrophobic treatments except d had a significant effect on the increase of the angle when compared to the sample without an impregnation. the treatment in solutions b, c and e changed the surface properties of the geopolymer composite from hydrophilic to hydrophobic, i.e., the ca was greater than 90°, which is the value determining the transition between hydrophilicity and hydrophobicity. figure 2 displays the contact angle measurement for the most hydrophobic surface, impregnated by the solution based on pfoteos with the sio2 sol. the ca of the untreated sample and of sample d could not be measured exactly, because of a very fast absorption of the water drop. regarding the storage modulus, the impregnation led to a small reduction in stiffness. the decrease of the storage modulus for the sample without an impregnation was caused by a time lag between the measurements, which indicates the behaviour of the composite itself.
all samples, including the unpenetrated one, showed a similar water uptake between 3.2 and 4.3 % when expressed as a weight gain as can be seen in table 2. other data stated in the table show that the impregnation effect was not influenced by an exposure to water. contrary, the hot water vapour caused a distinctive decrease of an average contact angle under 90°. regarding mechanical properties, storage modulus decreased only slightly after the vapour sorption test. in both cases, the decrease of storage modulus was lower for impregnated samples. thus, the results show that the modification of geopolymer composite samples belongs to the category of hydrophobic impregnation, which means that it doesn´t prevent moisture penetration, but repels liquid water and allows a diffusion of air humidity. it should also be noted that the geopolymer tend to generate free alkali and white efflorescence of alkali carbonates on the surface. however, no traces of the efflorescence were observed on the surface of our samples after three months during which they were stored at a laboratory temperature (23 °c) and at a relative humidity in the range of 35 to 50 %, and not even after the exposition in extremely humid environment – saturated steam at 100 °c for 24 hours. 4. conclusion our results show that we have successfully modified a hydrophobic treatment for a cotton fabric and applied it on a geopolymer composite. by the application of the hydrophobic agent 1h,1h,2h,2hperfluoroctyltriethoxysilane (pfoteos) in the form of a solution prepared by hydrolysis and condensation of the silane in fresh sio2 sol, we achieved the lowest wettability of the composite surface, i.e. the water contact angle was 123°. the non-aqueous solution based on pfoteos and the solution with hexadecyltriethoxysilan (hdteos) and sio2 sol also led to high contact angles of the impregnated composites, 98° and 105° respectively. the presence of sio2 sol was found to be crucial as worse results were obtained when the impregnation in the solution without it was used. in principle, the modification was done by attaching hydrophobic group by sio2 nanoparticles on the surface and into the inner structure of the geopolymer. it was confirmed by water uptake test that water did not influence the contact angle values; the storage modulus characterizing the stiffness of the composite slightly decreased. the amount of water absorbed by the composite was in the range of 3.2 to 4.3 % regardless of the treatment. this means that the treatment is a hydrophobic impregnation, which does not affect pores, i.e. acts only as a water repellent. acknowledgements this result was obtained within the institutional support of the ministry of industry and trade of the czech republic for the development of a research organization (decision no. 12/2017). references [1] davidovits j. (2011). geopolymer chemistry and applications, 3rd ed. institute géopolymère. isbn: 9782951482050 [2] international standard en 1504-2:2004. products and systems for the protection and repair of concrete structures definitions, requirements, quality control and evaluation of conformity part 2: surface protection systems for concrete 187 zdeněk mašek, linda diblíková acta polytechnica [3] nurhan o. et al. (2015): water-and oil-reppelency properties of cotton fabric treated with silane. international journal of textile science, 4(4), 84-96. doi:10.5923/j.textile.20150404.03 [4] zhang h., liu q., liu t., zhang b. 
(2013): the preservation damage of hydrophobic polymer coating materials in conservation of stone relics. progress in organic coatings, 76, 1127-34. doi:10.1016/j.porgcoat.2013.03.018
[5] duan p., yan ch., luo w., zhou w. (2016): a novel surface waterproof geopolymer derived from metakaolin by hydrophobic modification. materials letters, 164, 172–175. doi:10.1016/j.matlet.2015.11.006
[6] mahltig b., böttcher h. (2003): modified silica sol coatings for water-repellent textiles. journal of sol-gel science and technology, 27(1), 43–52. doi:10.1023/a:1022627926243
[7] daoud w.a., xin j.h., tao x. (2004): superhydrophobic silica nanocomposite coating by a low-temperature process. journal of the american ceramic society, 87, 1782-1784. doi:10.1111/j.1551-2916.2004.01782.x
[8] daoud w.a., xin j.h., tao x. (2006): synthesis and characterization of hydrophobic silica nanocomposites. applied surface science, 252, 5368-5371. doi:10.1016/j.apsusc.2005.12.020
[9] daoud w.a., xin j.h., zhang y.h., mak c.l. (2006): pulsed laser deposition of superhydrophobic thin teflon films on cellulosic fibers. thin solid films, 515, 835-837. doi:10.1016/j.tsf.2005.12.245
[10] hongxia w. et al. (2017): durable, self-healing, superhydrophobic fabrics from fluorine-free, waterborne, polydopamine/alkyl silane coatings. rsc advances, 7, 33986-33993. doi:10.1039/c7ra04863g
[11] e. i. du pont de nemours and company. binder for zinc-rich paint. inventor: aaron oken. ipc c09d 5/10. us3649307, 14.03.1972, wipo.

acta polytechnica 58(4):226–231, 2018. doi:10.14311/ap.2018.58.0226. © czech technical university in prague, 2018. available online at http://ojs.cvut.cz/ojs/index.php/ap

verification of behaviour of human enamel for fracture toughness determination
petra hájková∗, aleš jíra, luboš řehounek
czech technical university in prague, faculty of civil engineering, thákurova 7, 166 29 prague, czech republic
∗ corresponding author: petra.hajkova@fsv.cvut.cz

abstract. enamel is the hardest biological tissue in the human body because of its structure and composition. the structure of interlocking rods enables this biomaterial to resist the stresses of mastication. unfortunately, enamel is prone to fracture initiation and growth. determining the fracture toughness of enamel is a difficult task: the lack of thickness makes it impossible to prepare samples which could be analysed by the usual methods. other authors ordinarily use the vickers indentation fracture test (vif) to determine the fracture toughness of enamel. the vif is, however, not generally accepted. the aim of this study is a verification of the fracture behaviour of enamel using nanoindentation.
in this study, the impact of the changes of hardness (hit) and reduced modulus (er) caused by crack initiation and growth on the fracture toughness determination is observed. the next goal is an evaluation of the impact of the loading rate.

keywords: human enamel, fracture behaviour, mechanical properties, loading rate, work of indentation.

1. introduction
enamel is a tissue that forms the external part of human teeth. conventionally, it is the only visible dental material in the oral cavity. enamel also protects the inner parts of the human teeth: dentin and pulp. its mechanical properties correspond to its functions and the conditions it is under: enamel must withstand the stresses caused by mastication and other stimuli, such as sudden changes of temperature caused by drinking hot or cold drinks and consuming acidic or sugary foods. because of that, enamel is the hardest tissue in the human body [1]. unfortunately, hard materials also tend to be brittle and exhibit little resistance towards the initiation and propagation of cracks. the cracks can then compromise the overall health of the tooth, especially if dental cavities are located beneath the enamel layer. the mechanical properties of enamel are determined by its structure and chemical composition. it consists of 96 % of inorganic substances (hydroxyapatite); the remaining 4 % is water and organic compounds (the protein enamelin). densely formed crystals of hydroxyapatite form prisms that pass through the volume from the dentino-enamel junction (dej) to the outer tooth layer. the shape of the prisms enables the interconnection of the convex and concave surfaces of the surrounding prisms and forms a strong bond. the different orientation of the hydroxyapatite crystals in the prisms and in the interprismatic substance also contributes towards a higher mechanical durability [1]. authors who describe the mechanical properties of dental tissues have previously dedicated more attention to dentin than to enamel. this is caused by the fact that samples of healthy, intact enamel are hard to come by, and enamel has a maximum thickness of 2.5 mm. the thickness of the samples also limits the number of viable testing methods. the most common method used for the determination of the hardness hit and the young's modulus e of dentin and enamel is nanoindentation. enamel, which is a very brittle material, has another important property: the fracture toughness. the fracture toughness describes the resistance of the material against the propagation of cracks at given stress values. the property most commonly used for describing the fracture toughness of enamel is the stress intensity factor kic. the mean value of the fracture toughness of enamel, according to published research, ranges between 0.45–1.55 mpa m1/2 [2–5]. the great range of the experimentally determined values is caused by multiple factors. the authors of [2, 3] mention an increasing brittleness from the dej towards the outer tooth surface for old enamel (age ≥ 50/55 years). other published works mention the different properties of enamel in deciduous and permanent teeth [4] and differences caused by the enamel structure: indentation parallel or perpendicular to the enamel rods [5]. the problems connected with determining the values of the fracture toughness of enamel do not stem only from the inhomogeneous nature of enamel and its time-dependent changes, but also from the chosen testing method. although authors almost solely use microindentation and the vickers indentation fracture test (vif) for the fracture toughness determination of enamel, the formulas used for calculating kic differ. the most common formula used to calculate the values of kic is as follows [6]:
$$k_{ic} = \delta \left( \frac{e}{h_{it}} \right)^{1/2} \frac{p}{a^{3/2}} \qquad (1)$$
where e is the young's modulus, hit is the hardness, p is the applied load, a is the crack length and δ is a calibration constant.
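a minimal numerical sketch of eq. (1) follows. the calibration constant δ = 0.016 is only an assumed, commonly quoted value (the text stresses that published constants differ), and the load and crack length are illustrative:

```python
# minimal sketch of eq. (1); delta, the load and the crack length are
# illustrative assumptions, not values measured in this study.
import math

def k_ic_vif(e_gpa, h_gpa, p_n, a_m, delta=0.016):
    """fracture toughness [mpa m^(1/2)] from the vif relation (1)."""
    ratio = math.sqrt((e_gpa * 1e9) / (h_gpa * 1e9))  # (e/h_it)^(1/2)
    return delta * ratio * p_n / a_m ** 1.5 / 1e6     # pa m^0.5 -> mpa m^0.5

# e and h_it of the order reported for enamel; 5 n load, 100 um crack
print(f"k_ic = {k_ic_vif(80.0, 3.5, 5.0, 100e-6):.2f} mpa m^0.5")
```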
the differences in the values of the calibration constants seem to be one of the main reasons for the different outcomes among various researchers. authors also use other formulas, which take into account other parameters, such as the size of the indent, the face angle of the indentation tip, etc. because of these discrepancies, the vif method is not seen as acceptable for the fracture toughness testing by some researchers. quinn and bradt, in their review [7], point out other problems of the vif method. the problems are, for example, that the vif method does not meet the definition of linear fracture mechanics, as there is no crack in the specimen at the beginning of a test. the overall results can also be influenced by the polishing of a specimen, by the range of the applied stress, which causes the initiation of different types of cracks (palmqvist, median, etc.), or by the accuracy of the measurement (of the length of cracks). a part of the review was also dedicated to a comparison of the results of the fracture toughness, determined by the vif method using three different popular equations, for a standard reference material (srm). as the authors expected, none of these equations yielded the correct certified values of the fracture toughness for the srm. the main goal of this study is to assess the brittle behaviour of enamel during nanoindentation tests, where the sample is loaded by 10–100× smaller forces than during the (micro)indentation, which is used in the experiments for the fracture toughness determination by other authors [2–4, 8]. the possibility of using nanoindentation for the fracture toughness determination by the vif method is examined. a further effort is dedicated towards identifying the contribution of other factors to the overall results of the fracture toughness test. the work focuses on the variations of the reduced modulus er and the hardness hit as a result of the initiation and propagation of cracks that form under 10–150 mn force loads. another observed factor that can influence the measurements is the loading velocity. at the end of this work, an alternative method for the fracture toughness determination, which is based on a dissipation of energy, is presented.

2. materials and methods

2.1. specimen preparation
we chose a human molar with no visible defects as the test specimen for our research. the molar was cleaned of all impurities and embedded into a technical dentacryl solution from the spofadental company. it was subsequently transversally cut from both sides to form a 15 mm wide specimen. the sections were made using a water-cooled saw cutter (atm brillant 210, austria) with a diamond disc. the first section was situated above the tooth crown and the second inside its root. the sample was ground with a coarse silicon-carbide (sic) paper (coarseness 320) from the side of the crown until the section reached a desirable depth of enamel. then it was ground again with finer papers (coarseness 1000, 2500) and polished with a diamond paste (0.25 µm).
the thorough grinding and polishing was the most important part of the specimen preparation, because scratches in the surface of the specimens could impair the whole measurement. finally, the specimen was cleaned with ultrasonic waves.

2.2. nanoindentation method
the nanoindentation was performed on the csm instruments nano hardness tester (anton paar, austria) equipped with a diamond cube-corner tip. the indentation tip was chosen with respect to the experiment, as the cube-corner tip has a smaller face angle (35.26°) than the berkovich (65.27°) or vickers (68°) tip, so it is sharper and induces the cracks more easily. the sample was loaded gradually, with the force increasing without oscillations. the unloading curve slope was the same as the loading curve, using the same velocity. a force-controlled test was used for each cycle of indentation. for determining the hardness hit and the reduced modulus er, the methodology of oliver & pharr was used [9]. this method uses the unloading curve, and the reduced modulus is calculated as follows:
$$e_r = \frac{\sqrt{\pi}}{2}\,\frac{s}{\sqrt{a}} \qquad (2)$$
where s is the contact stiffness (dp/dh) and a is the contact area of the indent, determined from the measured contact depth hc. unlike the young's modulus, the reduced modulus takes into account the fact that the measured contact depth is the sum of the specimen deformation and the deformation of the indentation tip itself. the hardness, which represents the material's resistance to a localised plastic deformation induced by a mechanical indentation, is calculated from the maximum load pmax and the contact area a:
$$h_{it} = \frac{p_{max}}{a} \qquad (3)$$
a minimal numerical sketch of relations (2) and (3) is given below.
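the following minimal sketch of eqs. (2) and (3) assumes the ideal cube-corner area function a = 2.598 hc²; a real instrument uses a calibrated area function, so the printed values are only illustrative and do not reproduce table 1:

```python
# minimal sketch of eqs. (2) and (3), assuming the ideal cube-corner
# area function a = 2.598*hc^2 (tip imperfections are neglected, so
# the hardness comes out higher than the calibrated values in table 1).
import math

def oliver_pharr(p_max, s, hc):
    """p_max [n], contact stiffness s [n/m], contact depth hc [m];
    returns (hardness [gpa], reduced modulus [gpa]) per eqs. (2)-(3)."""
    a = 2.598 * hc ** 2                                 # contact area [m^2]
    h_it = p_max / a                                    # eq. (3)
    e_r = math.sqrt(math.pi) / 2 * s / math.sqrt(a)     # eq. (2)
    return h_it / 1e9, e_r / 1e9

# inputs of the order seen at the 10 mn load (hc ~ 787 nm); s is assumed
h, er = oliver_pharr(10e-3, 1.15e5, 787e-9)
print(f"h_it = {h:.2f} gpa, e_r = {er:.1f} gpa")
```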
for determining the changes of the values of er and hit due to the initiation and propagation of cracks, and for evaluating the use of nanoindentation for measuring the fracture toughness of enamel, 5×16 indents were performed. each of these 5 matrices had a different value of the maximum applied load (10, 20, 40, 80 and 150 mn). the indentation cycle of all the matrices was identical. the loading consisted of 3 steps: the loading curve (120 mn/min), a constant load (10 sec) and unloading (120 mn/min). the 10-second-long constant load was applied in order to eliminate the influence of creep on the final results. the contact depth varied between 787–3759 nm, depending on the maximum applied load. the distance of the individual indents was chosen with regard to the anticipated contact depth, so that the distance between the individual indents was at least 2× greater than the indent itself, but also as small as possible, so that it would not be influenced by the inhomogeneities caused by the transition from the dej to the outer tooth surface. a distance which is too small can influence the final results due to an overlap of the plastic areas. for evaluating the influence of the loading velocity on the brittle behaviour of enamel, 3×25 indents were performed. the maximum applied load (150 mn) was identical for all the indents. the maximum value of the loading force applicable by the testing instrument was chosen with regard to the maximization of the crack propagation. the indentation cycle for the individual matrices consisted of loading with a velocity of 60, 250 and 450 mn/min up to the maximum value of the force (150 mn) and an immediate unloading at the same velocity. the contact depth varied between 3260–3835 nm. the distance between the individual indents was 50 µm, just as during the previous experiment for the maximum load of 150 mn.

3. results
the brittle behaviour of enamel during the nanoindentation, when the sample was loaded with forces of 10–150 mn, was different from the behaviour other authors describe for the (micro)indentation and loading with forces of 1–10 n [2, 4, 10]. they describe the initiation of either palmqvist or half-penny cracks, which are necessary for the determination of the fracture toughness by the vif method. these cracks were rarely encountered in our study (fig. 1a); splitting off and delamination (fig. 1b) were seen more often. radial cracks were also observed, but they did not radiate from all the peaks of the indent; there was just one crack going through the indent (fig. 1c). to highlight the enamel's brittleness and to show a comparison between different dental tissues, a picture describing an indent performed with the same force in dentin (fig. 1d) was also included. no cracks were visible in dentin, which indicates a greater fracture toughness of dentin than that of enamel. the determination of the mechanical properties of dentin was a part of our previous efforts [11, 12].

figure 1. picture of indents performed by scanning electron microscopy (tescan mira, czech republic). different types of cracks: a) palmqvist or half-penny cracks, b) splitting off, delamination and c) radial cracks going through the indent. d) indent performed in dentin.

the brittle behaviour of enamel has been shown to be very heterogeneous and very difficult to define precisely. the final values of the fracture toughness can be influenced by many factors. equation (1) shows that the fracture toughness determined by the vif method depends on the hardness hit and the reduced modulus er; therefore, changes of these characteristics have to be one of the main factors affecting the output.

indentation load [mn] | loading rate [mn/min] | hc [nm] | hit [gpa] | er [gpa]
10 | 120 | 787 ± 40 | 3.79 ± 0.402 | 79.65 ± 4.924
20 | 120 | 1139 ± 65 | 3.71 ± 0.437 | 80.33 ± 4.618
40 | 120 | 1673 ± 104 | 3.50 ± 0.386 | 80.95 ± 4.198
80 | 120 | 2491 ± 114 | 3.28 ± 0.297 | 81.55 ± 5.276
150 | 120 | 3497 ± 263 | 3.25 ± 0.421 | 85.66 ± 5.606
150 | 60 | 3470 ± 239 | 3.29 ± 0.391 | 85.33 ± 6.456
150 | 250 | 3465 ± 309 | 3.31 ± 0.500 | 86.16 ± 9.811
150 | 450 | 3534 ± 300 | 3.20 ± 0.461 | 82.22 ± 9.070
table 1. values of the hardness hit, the reduced modulus er and the contact depth hc in relation to a) the applied load and b) the loading velocity.

figure 2. graphs describing the changes of a) the hardness hit and b) the reduced modulus er due to the changes of the applied load.

table 1 shows that a different applied load influences the values of hit and er. the values of the hardness decreased with an increasing load, from 3.79 gpa at 10 mn to 3.25 gpa at 150 mn (fig. 2a); the hardness decreased by 14 % overall. the reduced modulus exhibited a different trend and increased with an increasing load, from 79.65 gpa to 85.66 gpa (fig. 2b); the reduced modulus increased by 7 % overall.
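for transparency, the arithmetic behind these two percentages is sketched below, using the mean values from table 1:

```python
# relative change of the mean hardness and reduced modulus between the
# 10 mn and 150 mn matrices (mean values taken from table 1).
h10, h150 = 3.79, 3.25      # hardness [gpa]
e10, e150 = 79.65, 85.66    # reduced modulus [gpa]
print(f"hardness change: {100*(h150-h10)/h10:+.1f} %")  # about -14 %
print(f"modulus change:  {100*(e150-e10)/e10:+.1f} %")  # about +7 %
```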
these changes are likely attributed to the initiation and propagation of cracks. the characteristic indentation curve (fig. 3), obtained at the maximum load of 150 mn, shows the initiation of cracks and clarifies the changes in the values of the micromechanical properties. the initiation and propagation of cracks cause an energy dissipation, which affects the indentation curve by a pop-in effect. cracks formed in a close proximity of the indentation tip cause a sudden increase of the contact depth ∆hc. the methodology described in §2.2 implies that changes in the contact depth influence the value of the contact area a and, therefore, also the hardness (3) and the reduced modulus (2). with regard to the fact that the contact area a appears in the denominator in (2), we assumed a decreasing trend of the reduced modulus, as in the case of the hardness. therefore, the opposite trend of the reduced modulus had to be the consequence of the increasing contact stiffness s. it is possible to deduce that the values of the hardness and the reduced modulus measured at the minimal applied load of 10 mn are the most accurate; the indentation curves at this load did not show the initiation and propagation of cracks. a verification of the values is possible by applying lower loads, but the anticipated variation is negligible with regard to the uncertainties of the fracture toughness calculation. table 1 also shows that the velocity of loading does not have any dramatic effect on the values of the hardness, the reduced modulus and the crack initiation. although the velocity of 450 mn/min caused a decrease of the hardness by 2.7 % (compared to the velocity of 60 mn/min), no continuous decreasing trend was observed. the changes of the reduced modulus do not correspond to the previous experiment: in contrast to the increasing trend, in this experiment, the reduced modulus decreased by 3.8 %. it is possible that the decrease was caused by the location of the matrix of indents in the enamel, which is inhomogeneous.

4. discussion
the brittle behaviour of enamel during the nanoindentation test was found to be very heterogeneous. a number of different types of cracks were induced (palmqvist, half-penny), but splitting off and delamination were prevalent. the lack of clear cracks emanating from the corners of the indentation tip makes it impossible to determine the fracture toughness by the vif method. an alternative option is a method based on the dissipation of energy.

figure 3. indentation curve showing the initiation and propagation of cracks (pop-in effect). the initiation of a crack causes a sudden increase in the contact depth hc, which affects the contact area a, and consequently the hardness hit and the reduced modulus er. during the initiation of cracks, the energy is dissipated and the total energy increases.

the method uses the energy released during the crack initiation and growth, which is called the fracture energy ufrac. the release of energy is obvious from the indentation curve in fig. 3. the crack initiation causes a shift of the loading curve ∆hc (pop-in effect) and an increase in the total energy of the indentation, which is displayed as the area under the loading curve. the fracture energy can be separated from the total energy wtot [13]:
$$w_{tot} = w_{el} + w_{pl} + u_{frac} + w_{other} \qquad (4)$$
where wel is the energy of the elastic deformation, which is displayed as the area under the unloading curve, wpl is the energy of the plastic deformation and wother are other energies, for example the energy of creep or the energy associated with changes in temperature. the sum of the fracture energy, the energy of the plastic deformation and the other energies is called the irreversible energy; this is displayed as the area enclosed by the loading and unloading curves (the grey area). the energy of the plastic deformation cannot be derived directly from the indentation curve, so it is necessary to use a linear relationship between the ratio wpl/wtot and the ratio hf/hmax [14]:
$$\frac{w_{pl}}{w_{tot}} = (1 + \lambda)\,\frac{h_f}{h_{max}} - \lambda \qquad (5)$$
where hf is the final indentation depth, hmax is the maximal indentation depth and λ = 0.27. the other energies can be eliminated by the test conditions, or by the dwell time in the case of the creep. if the fracture energy ufrac is derived, the fracture toughness kic is calculated from formulas based on the principles of linear elastic fracture mechanics (lefm) [15]:
$$g_c = \frac{u_{frac}}{a_{frac}} \qquad (6)$$
$$k_{ic} = \sqrt{e\, g_c} \qquad (7)$$
where afrac is the area of the fracture whose initiation caused the energy dissipation.
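a minimal sketch of the whole energy-based route of eqs. (4)-(7) follows; it assumes wother has been eliminated by the test conditions, and every input value is an illustrative placeholder rather than a measured result:

```python
# minimal sketch of the energy-based route of eqs. (4)-(7); all inputs
# are illustrative placeholders, and w_other is assumed to have been
# eliminated by the test conditions.
import math

lam = 0.27                   # lambda in eq. (5), from [14]
w_tot = 120e-9               # area under the loading curve [j] (assumed)
w_el = 35e-9                 # area under the unloading curve [j] (assumed)
hf, hmax = 2500.0, 3500.0    # final / maximal indentation depth [nm] (assumed)
a_frac = 7.0e-10             # fracture area [m^2] (assumed)
e = 80e9                     # elastic modulus of enamel [pa], order of table 1

w_pl = w_tot * ((1 + lam) * hf / hmax - lam)  # eq. (5)
u_frac = w_tot - w_el - w_pl                  # from eq. (4)
g_c = u_frac / a_frac                         # eq. (6)
k_ic = math.sqrt(e * g_c)                     # eq. (7)
print(f"u_frac = {u_frac*1e9:.1f} nj, k_ic = {k_ic/1e6:.2f} mpa m^0.5")
```

with these placeholders, the route yields a kic of the order reported for enamel in §1, around 1 mpa m1/2.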
the method based on the dissipation of energy enables the determination of the fracture toughness in the case of the initiation of any cracks (palmqvist, half-penny, median, delamination), in contrast to the vif method. it was found out that the values of the applied load used for measuring the hardness and the reduced modulus of a brittle material can significantly influence the overall results of these characteristics, and consequently the fracture toughness determined by the vif method. the degree to which the fracture toughness is affected depends on the value of the applied load and on the formula used for the calculation of kic, because the exponent of the ratio e/hit differs in the formulas derived by different authors. in this test, the applied load (10–150 mn) was significantly lower than in the experiments performed by other authors. the authors usually apply a load of 1–10 n, which means that the changes of the hardness and the reduced modulus will probably be more obvious. the problem is even more augmented because authors do not use the same methods to determine the values of the hardness and the modulus. while some authors [8] use the mean values of the hardness and the modulus cited in the literature, others [2, 3] use nanoindentation, or they determine the parameters directly while testing the fracture toughness. padmanabhan [10], for example, describes a hardening of enamel with an increasing loading, but the question is how much the results are influenced by the values of the hardness and the modulus, which are determined in the same test as the fracture toughness. it is, therefore, necessary to provide these parameters by other means rather than directly from the test of the fracture toughness. in this test, the choice of the loading velocity showed no significant influence on the outcome of the measurement. as in the previous experiment, the changes can be more obvious if authors use a loading velocity greater than 450 mn/min.

5. conclusions
the present study assesses the brittle behaviour of enamel during nanoindentation tests and the possibility of using nanoindentation for a fracture toughness determination. the vif method, almost solely used by other authors, was found to be inapplicable. therefore, an alternative method based on a dissipation of energy was described; this method can eliminate some negative aspects of the vif method. the factors influencing the overall results of the fracture toughness test were examined. while the values of the applied load were found to be an important factor of the fracture toughness determination by the vif method, the choice of the loading velocity did not affect the result to any noticeable degree. a further study will be dedicated to the determination of the stress intensity factor kic on the basis of the fracture energy ufrac and the fracture area afrac. since the fracture energy is relatively easy to determine, the measurement of the fracture area will be crucial.

6. acknowledgements
the financial support by the faculty of civil engineering, czech technical university in prague (sgs project no. sgs17/168/ohk1/3t/11) is gratefully acknowledged.

references
[1] d. j. chiego. essentials of oral histology and embryology: a clinical approach. elsevier health sciences, 2014.
[2] s. park, j. quinn, e. romberg, d. arola. on the brittleness of enamel and selected dental materials. dental materials 24(11):1477–1485, 2008. doi:10.1016/j.dental.2008.03.007
[3] q. zheng, h. xu, f. song, et al. spatial distribution of the human enamel fracture toughness with aging. journal of the mechanical behavior of biomedical materials 26:148–154, 2013. doi:10.1016/j.jmbbm.2013.04.025
[4] s. hayashi-sakai, j. sakai, m. sakamoto, h. endo. determination of fracture toughness of human permanent and primary enamel using an indentation microfracture method. journal of materials science: materials in medicine 23(9):2047–2054, 2012. doi:10.1007/s10856-012-4678-3
[5] h. xu, d. smith, s. jahanmir, et al. indentation damage and mechanical properties of human enamel and dentin. journal of dental research 77(3):472–480, 1998. doi:10.1177/00220345980770030601
[6] f. sergejev, m. antonov. comparative study on indentation fracture toughness measurements of cemented carbides. proc estonian acad sci eng 12(4):388–398, 2006.
[7] g. d. quinn, r. c. bradt. on the vickers indentation fracture toughness test. journal of the american ceramic society 90(3):673–680, 2007. doi:10.1111/j.1551-2916.2006.01482.x
[8] r. hassan, a. caputo, r. bunshah. fracture toughness of human enamel. journal of dental research 60(4):820–827, 1981. doi:10.1177/00220345810600040901
[9] w. c. oliver, g. m. pharr. measurement of hardness and elastic modulus by instrumented indentation: advances in understanding and refinements to methodology. journal of materials research 19(1):3–20, 2004. doi:10.1557/jmr.2004.19.1.3
[10] s. k. padmanabhan, a. balakrishnan, m.-c. chu, et al. micro-indentation fracture behavior of human enamel. dental materials 26(1):100–104, 2010. doi:10.1016/j.dental.2009.07.015
[11] p. hájková, a. jíra. micromechanical analysis of complex structures by nanoindentation. in key engineering materials, vol. 731, pp. 60–65. trans tech publ, 2017. doi:10.4028/www.scientific.net/kem.731.60
[12] a. jíra, j. němeček. nanoindentation of human tooth dentin. in key engineering materials, vol. 606, pp. 133–136. trans tech publ, 2014. doi:10.4028/www.scientific.net/kem.606.133
[13] j. chen, s. bull. indentation fracture and toughness assessment for thin optical coatings on glass. journal of physics d: applied physics 40(18):5401, 2007. doi:10.1088/0022-3727/40/18/s01
[14] y.-t. cheng, z. li, c.-m. cheng. scaling relationships for indentation measurements. philosophical magazine a 82(10):1821–1829, 2002. doi:10.1080/01418610208235693
[15] e. rocha-rangel. fracture toughness determinations by means of indentation fracture. in nanocomposites with unique properties and applications in medicine and industry. intech, 2011.
acta polytechnica vol. 41 no. 3/2001

investigation into airborne dust in a wool textile mill
a. k. haghi

airborne dust samples were gathered from the vicinity of various commonly performed processes in the wool-preparation industry. samples of airborne wool dust were collected on membrane filters during the processing of wool lots. the chemical composition of the inorganic particles present in the total inspirable and respirable dust fractions was determined with the use of a scanning electron microscope (sem). the widely differing morphologies of the particles collected raise questions about the validity of trying to correlate minor respiratory symptoms with dust concentrations, as some particle types will penetrate the respiratory system more easily than others. the results are discussed with respect to the sampling methodology used.

keywords: airborne dust, wool, textile mill, sem.

1 introduction
dust is generally defined as an aerosol of solid particles, mechanically produced, with individual diameters of 0.1 µm upwards [1]. exposure limits have been defined for many kinds of industrial dusts dispersed in workspaces, as a function of the health hazard they present, with the aim of protecting workers from possible respiratory diseases. the iso and the acgih (american conference of governmental industrial hygienists) definitions of inspirable dust [2] refer to the mass concentration of ambient airborne particles inspired through the nose and mouth during breathing, which is available for deposition anywhere in the respiratory tract. the aim of this work is to deepen the understanding of the factors relevant to this problem by conducting a survey of the nature of airborne wool dust in an industrial environment. wool in its raw state contains a variety of associated materials, which are regarded as impurities [3]. among these, mineral impurities, such as dust and dirt, are picked up by the animal from the pasture during its growth and may account for 5–20 % of the raw weight. most foreign inorganic materials are removed by scouring. however, a certain amount remains as a deposit on the fiber surface or trapped within the entangled fiber mass, and becomes a preferential target of the strong mechanical stresses developed during carding [4]. substantial quantities of airborne dust are generated during the processing stages of wool textiles, especially during combing and carding operations. in these processes, dust is associated with the raw material and also arises as a consequence of the mechanical stresses to which the fibers are subjected. during carding and combing, about 40 % of the fibers are broken under normal operating conditions [5], developing fragments in the form of airborne dust.
3 experimental details
3.1 sampling techniques
the gravimetric concentration of airborne dust in an occupational environment is determined by drawing a measured volume of air through a filter medium and calculating the mass of dust collected on the filter by weighing the filter before and after sampling (a short illustrative sketch of this calculation is given at the end of this subsection). in the present work, the sampling apparatus consisted of a pump that produces a defined air volume flow through a filter assembled in a duct. the air intake was regulated to simulate the aerodynamic conditions of human breathing (air velocity 1.1–1.2 m/s). the filters used were relatively flat and easy to coat with electrically conducting material, and all particles on them were easily visible. in addition, they had pores of a precisely controlled size (0.8 µm), and it was possible to measure this size directly by microscopy, thereby providing an independently measured lower boundary to the dust-particle size collected. the collection time needed to be as long as possible, to maximize the quantity of dust collected, but short enough to allow several samples to be collected; a sampling period of 20 minutes was therefore used. the samples were gathered by simply positioning the sampling head, with the pump running, in the working region of the process in question. the primary concern of this project was to explore the nature of the dust rather than its quantity. it was therefore not considered necessary to sample throughout a shift. after collection, the samples were placed in static-neutralizing conductive pots and taken back to the laboratory for analysis.
the chemical substances tlv committee of the acgih recommends the following definitions [6] for particulate materials, which are intended to correspond to the fractions that penetrate to specific regions of the human respiratory system:
• inhalable dust fraction (corresponds to the total inspirable fraction): for those materials that are hazardous when deposited anywhere in the respiratory tract,
• thoracic dust fraction: for those materials that are hazardous when deposited anywhere within the lungs, airways and the gas exchange region (bronchiolar and alveolar tract),
• respirable dust fraction: for those materials that are dangerous when deposited in the unciliated gas exchange region of the lungs (alveoli).
such definitions ignore exhalation loss: they represent conventional diameter size ranges correlated with experimental curves for penetration into the respiratory system of spherical aerosol particles of density 1 g/cm³. airborne dust sampling instruments and their operating characteristics make reference to these recommended values.
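the gravimetric calculation described above reduces to dividing the collected mass by the sampled air volume. the following minimal python sketch illustrates it; the flow rate and the filter masses are invented values, since the text fixes only the air velocity and the 20-minute sampling period.

    def dust_concentration_mg_m3(m_before_mg, m_after_mg, flow_l_per_min, minutes):
        # gravimetric dust concentration: mass gained by the filter divided by
        # the volume of air drawn through it during the sampling period
        volume_m3 = flow_l_per_min * minutes / 1000.0  # litres -> cubic metres
        return (m_after_mg - m_before_mg) / volume_m3

    # hypothetical weighings of one filter, 20-minute run at an assumed 2 l/min
    print(dust_concentration_mg_m3(100.00, 100.12, 2.0, 20))  # -> 3.0 mg/m^3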
3.2 sampling methodologies
three analytical techniques were used, as follows: (i) scanning electron microscopy (sem) allowed the morphology of the dust to be explored and facilitated both size and x-ray analyses; to prepare a sample for microscopy, it was gold-coated in order to prevent charging when exposed to the electron beam. (ii) size analysis involved measuring the dimensions of the individual particles. (iii) in x-ray analysis, the electron microscope was used to bombard the target with a beam of electrons.
the acgih quantitative definition of the inhalable dust fraction is the mass fraction of particles that are captured according to the collection efficiency, defined as follows, regardless of the sampler orientation with respect to the wind direction [6]:
$e(d) = 50\,(1 + e^{-0.06\,d})$ for $0 < d \le 100\ \mu\mathrm{m}$,
where e = collection efficiency (%) and d = aerodynamic diameter (µm), defined as the diameter of a sphere of density 1 g/cm³ having, in the gravitational field, the same aerodynamic behavior (terminal velocity in air) as the examined particle. (a short numerical sketch of this convention appears in section 4, ahead of table 1.)
the membrane filters containing the airborne wool dust were analyzed with a scanning electron microscope (sem). small filter sections (approximately 8 × 8 mm) were cut out with a razor blade from the center of each filter. after coating with a thin gold film, the mounted filter sections were first observed at low magnification and then scanned at 2000×. ten to 20 fields were selected, according to the particle density, in such a way as to exclude specimen edges, ensuring that the same distance existed among consecutive fields and that each part of the specimen surface was sampled.
4 results and discussion
collection efficiencies representative of several sizes of particles are shown in table 1. the particulate produced during wool processing is formed by organic and inorganic components, present at the same time on the filter surface. the former have morphological and chemical characteristics that can interfere with the analysis of the latter. microscope magnification is a critical parameter which enables the detection of dust particles and distinguishes them from filter surface features. since the average size of the inorganic particles is rather small, and some of them have a diameter lower than 1 µm, the minimum magnification was set to 2000×, so as not to miss a significant part of the smallest particles. most of these small particles were found to be characterized by a relatively high concentration of calcium (ca). the area of the section removed from the membrane filter represented about 5 % of the total area where the wool dust was collected. in order to ascertain whether the inorganic dust particles were homogeneously distributed all over the filter surface, we cut three subsamples (inner, middle, and outer) along a radial direction and analyzed about 100 particles for each section. the results of the chemical analysis showed that neither the relative elemental abundance nor the detection frequency of the different particle types (table 2) changed significantly from one section to the other. this indicated that sampling one section in the middle of the membrane filter was enough to obtain a complete and exhaustive description of the average chemical composition of the inorganic dust fraction.
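a minimal sketch of the acgih inhalable convention quoted in section 3.2; the diameters in the loop match the inhalable column of table 1 below, so the printout can be compared directly with the (rounded) tabulated values.

    import math

    def inhalable_efficiency(d_um):
        # acgih inhalable convention: e(d) = 50 (1 + exp(-0.06 d)) percent,
        # defined for aerodynamic diameters 0 < d <= 100 micrometres
        if not 0 < d_um <= 100:
            raise ValueError("convention defined only for 0 < d <= 100 um")
        return 50.0 * (1.0 + math.exp(-0.06 * d_um))

    for d in (1, 2, 5, 10, 20, 30, 40, 50, 100):
        print(d, round(inhalable_efficiency(d), 1))
    # -> 97.1, 94.1, 87.0, 77.4, 65.1, 58.3, 54.5, 52.5, 50.1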
the minimum number of fields and particles to be analyzed for each filter section, as well as the field position within the section area, were determined as follows. 100 and 10 fields were scanned on two different sections from the same membrane filter; about 1400 and 150 particles, respectively, were detected and analyzed. the results reported in table 3 show that the two sets of data are very similar. the analyses of 10 fields, corresponding to about 150 particles, thus permit characterization of the sample with an accuracy which is not significantly improved by a tenfold increase in the number of fields.

table 1: particulate mass [%] for several sizes of particles in each of the respective theoretical mass fractions

  inhalable fraction          thoracic fraction           respirable fraction
  aer. diam.    part. mass    aer. diam.    part. mass    aer. diam.    part. mass
  [µm]          [%]           [µm]          [%]           [µm]          [%]
  0             100           0             100           0             100
  1              97           2              94           1              97
  2              94           4              89           2              91
  5              87           6              80           3              74
  10             77           8              67           4              50
  20             65           10             50           5              30
  30             58           14             23           6              17
  40             54           16             15           7               9
  50             52           18              9           8               5
  100            50           25              2           10              1

as regards the field position, the entire section area was divided into 15–20 regions (according to the dust density) with an imaginary grid, taking care not to include the filter edges. one field was then selected and scanned in the center of each grid unit, the distance between adjacent fields being at least five times the field width (a short sketch of this selection procedure is given after tables 2 and 3). this approach reduced the influence of poor local homogeneity and allowed a reproducible characterization of the inorganic material collected on the filter surface to be achieved. in fact, adjacent fields may sometimes differ in particle density, especially for those particles whose detection frequency is quite low. the constituent particles of each sample fell into several broad groups, from long fibers, several millimeters in length, to fragments of cortical cells, whose longest dimension was less than 5 µm. representative particles appear in the electron micrographs shown in figures 1–3.

table 2: elemental composition (count %) and detection frequency [%] of inorganic particles in different sections of the same membrane filter

  elements               outer section   middle section   inner section
  al                          3.0             4.0              3.0
  si                         19.0            16.0             22.0
  s                           7.0             4.0              5.0
  ca                          4.0             2.0              4.0
  fe                         62.0            68.0             59.0
  others                      5.0             6.0              7.0

  particle type          outer section   middle section   inner section
  slm: si-fe-k-ca-al         29.3            28.2             28.3
  slm: si-p-zr                6.8             5.2              7.5
  slm: si                     9.5            13.5             13.7
  slm: ca                     0.8             2.5              1.8
  hm: fe                     34.4            29.9             28.3
  hm: fe-s                   11.1            11.0             11.9
  hm: fe-x                    8.1             9.7              8.5
  total no. of part.          123              91              156

  (slm – silicates and light metal particles, hm – heavy metal particles)

table 3: elemental composition (count %) and detection frequency [%] of inorganic particles as a function of the number of fields scanned

  elements               100 fields [count %]   10 fields [count %]
  al                            3.0                    3.0
  si                           22.0                   22.0
  s                             5.0                    5.0
  ca                            4.0                    4.0
  fe                           58.0                   59.0
  others                        8.0                    7.0

  particle type          100 fields [%]         10 fields [%]
  slm: si-fe-k-ca-al           28.1                   28.9
  slm: si-p-zr                  6.7                    7.7
  slm: si                       9.3                   13.5
  slm: ca                       1.6                    1.3
  hm: fe                       33.0                   28.8
  hm: fe-s                     11.0                   11.5
  hm: fe-x                     10.3                    8.3
  total no. of part.           1398                    156

  (slm – silicates and light metal particles, hm – heavy metal particles)
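the field-selection rule above (an imaginary grid, one field per grid unit, adjacent fields at least five field-widths apart) can be sketched in a few lines of python. the section size, grid shape and field width below are assumed values chosen for illustration, not figures taken from the study.

    def select_fields(section_w_mm, section_h_mm, n_rows, n_cols, field_w_mm):
        # one scan field at the centre of each cell of an imaginary grid
        cell_w = section_w_mm / n_cols
        cell_h = section_h_mm / n_rows
        # the text requires adjacent fields to be >= 5 field widths apart
        if min(cell_w, cell_h) < 5 * field_w_mm:
            raise ValueError("grid too dense for the five-field-width spacing rule")
        return [((j + 0.5) * cell_w, (i + 0.5) * cell_h)
                for i in range(n_rows) for j in range(n_cols)]

    # e.g. an 8 x 8 mm section split into 4 x 4 = 16 regions, 0.05 mm wide fields
    print(select_fields(8.0, 8.0, 4, 4, 0.05))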
5 conclusions
the dust was found to fall broadly into four categories: (i) long fibers, with lengths greater than 500 µm, (ii) fiber fragments, much shorter lengths of fiber with length/width ratios of less than 10/1, and scales, (iii) mineral dust particles, less than 50 µm in the longest dimension but usually around 20 µm, and (iv) cortical cells with lengths of 50–100 µm, but with widths of less than 5 µm.
inorganic components were found in the earlier stages of processing, but these decreased in quantity with further treatment. they were identified as soil minerals, but residues from suint and compounds from skin-wool processing were shown to be present in some batches, even after scouring. of the inorganic components, the most common substances were silica, presumably from sand, and aluminum silicates, often containing trace amounts of sodium, magnesium, iron and calcium, presumably from clay, both major constituents of soil.
in this study it was found that dust on ledges caused damage to the lungs of rodents. the ledge dust was found to contain microscopic growths, possibly of a type of fungus which can produce allergy-inducing spores. these may cause respiratory symptoms. in the light of this, regular fettling of machinery and cleaning of the mill in general would seem to be desirable, especially in warm and humid environments.
fig. 1: electron micrograph
fig. 2: electron micrograph
fig. 3: electron micrograph
acknowledgment
the author would like to thank the management and staff of iran barak company for the generous assistance provided during dust collection.
references
[1] health and safety executive, occup. med. hyg. lab.: general methods for the gravimetric determination of respirable and total inhalable dust. mdhs 14, may 1986
[2] international standards organisation: air quality – particle size fraction definitions for health-related sampling. tech. rep. iso/tr 7708, 1983
[3] dusenbury, j. h.: wool handbook. werner von bergen, j. p. stevens & co. inc., interscience publishers, j. wiley & sons, n.y., usa, 1963
[4] bownass, r.: changes in fiber length during the early worsted processing. report in raw wool length, iwto tech. comm., paris, 1984
[5] love, r. g. et al.: further studies of respiratory health of wool textile workers. institute of occupational medicine report, tm/88/16, october 1988
[6] american conference of governmental industrial hygienists – committee on threshold limits: particle size-selective sampling in the workplace. cincinnati, ohio, usa, 1991
dr a. k. haghi, e-mail: haghi@kadous.gu.ac.ir, university of guilan, p.o.box 3756, rasht, iran
colorimetry and tv colour splitting systems
j. kaiser, e. košt’ál
the colorimetric standard of the present-day television system goes back to the american ntsc system from 1953. in this rgb colorimetric system it is not possible, for basic reasons, to produce a scanning device which will provide signals suitable for controlling any display unit. from the very beginning of the television system the scanning device has produced inevitable colour deformation. the range of reproductive colours is not fully utilized either by a contemporary cathode ray tube display unit or by a liquid crystal display. in addition, the range is not sufficient for true reproduction of colours. specific technical and scientific applications in which colour bears a substantial part of the information (space research, medicine) demand high fidelity colour reproduction. the colour splitting system, working in the rgb colorimetric system, continues to be universally used. this article submits the results of a design for a colour splitting system working in the xyz colorimetric system (hereafter referred to as the xyz prism). a way to obtain theoretical spectral reflectances of partial xyz prism filters is briefly described. these filters are then approximated by real optical interference filters, and the geometry of the xyz prism is established.
keywords: tv colorimetry, colour splitting system, interferential filters, tv reproduction, colour gamut.
1 introduction
the colorimetry of all present-day tv standards (analog as well as digital) emanates from the original ntsc tv standard. when this tv standard was put forward, the cathode ray tube (crt) was the only available display device. the conventional scanning system (the rgb prism) is characterized by the reproductive lights of the display and by a comparative white light. the existence of negative parts of the colour matching functions r(λ), g(λ), b(λ) causes complications in the optical separation of the partial pictures r, g, b in the classic scanning system. this leads to distortion in the reproduction of colour images. the present-day market offers many kinds of display devices, most notably lcd and plasma displays. these new displays have other, sometimes preferable, colorimetric features, but we cannot take advantage of these better features due to the dependence on the conventional rgb system [1]. however, the tv camera, which scans in a colorimetric system of unreal lights x, y, z, is not predetermined by the colorimetric features of any display device. the theoretical spectral reflectances of the partial filters of the xyz prism correspond to the colour matching functions x(λ), y(λ), z(λ), which are only positive [2]. this solves a number of problems that are encountered when realizing the conventional scanning system (rgb prism). a camera working in the xyz colorimetric system produces electrical analogs of the trichromatic components x, y, z. so every kind of display device has a circuit of colorimetric transformation from the system of lights x, y, z into the system of primary (reproductive) lights in the given display device. another advantage of the xyz colorimetric system is that one of the scanning channels is directly channel y. in this case, its noise is determined only by its own scanning device. the two remaining channels (x, z) carry colour information, and just by realizing such a camera we can anticipate lower resolution of details in the x and z channels.
2 colorimetric systems – a comparison
the basis for each colorimetric space is created by three lights. for each colorimetric system, the colour matching functions are defined. these functions are determined by the basic lights and the comparative white light. each colorimetric system has its chromaticity diagram, which is usually derived from the colorimetric system of unreal lights x, y, z. as the triangle (in the cie chromaticity diagram, see fig. 1) whose vertices are the lights x, y, z overlays the whole gamut of existing colours, the colour matching functions x(λ), y(λ), z(λ) are only positive (fig. 2). the xyz colorimetric system is the only one with only positive colour matching functions. the rgbntsc colorimetric system has been used in tv since 1953. the lights r, g, b create a triangle which is inscribed in the gamut of all existing colours, hence the corresponding colour matching functions r(λ), g(λ), b(λ) are bipolar (fig. 3).
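the inscribed-triangle picture lends itself to a small computational illustration: a chromaticity is reproducible by a set of primaries only if it falls inside their triangle. the sketch below uses the commonly quoted chromaticities of the 1953 ntsc primaries and of illuminant c; these numbers are quoted from memory as an assumption, not taken from the article.

    def inside_gamut(pt, r, g, b):
        # barycentric sign test: pt lies inside triangle r-g-b iff the three
        # cross products share the same sign
        def cross(o, u, v):
            return (u[0] - o[0]) * (v[1] - o[1]) - (u[1] - o[1]) * (v[0] - o[0])
        d1, d2, d3 = cross(r, g, pt), cross(g, b, pt), cross(b, r, pt)
        has_neg = d1 < 0 or d2 < 0 or d3 < 0
        has_pos = d1 > 0 or d2 > 0 or d3 > 0
        return not (has_neg and has_pos)

    R, G, B = (0.67, 0.33), (0.21, 0.71), (0.14, 0.08)  # commonly quoted 1953 ntsc primaries
    print(inside_gamut((0.310, 0.316), R, G, B))  # illuminant c: True (inside)
    print(inside_gamut((0.08, 0.84), R, G, B))    # saturated spectral green: False (outside)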
negative parts of the rgb colour matching functions correspond to those parts of the area of all existing colours which lie in the second and the fourth quadrants of the rgbntsc chromaticity diagram (fig. 4). mutual conversion among colorimetric systems is conducted by means of the general colorimetric transformation [3]. for correct scanning of colour information, the three channels of the tv camera scanning set must have sensitivities equal to some set of colour matching functions. it is apparent that only positive sensitivities can be realized optically. the reason for this is fundamental: there is no negative radiation intensity, there is no negative medium transparency, and the photoeffect is likewise a response of the output quantity (charge, current, voltage) only to the radiation intensity. hence the real sensitivities of the rgb prism channels follow at the very most only the positive parts of the ideal sensitivities (fig. 4). colour information about the scanned scene is in this way knowingly discarded ahead of the optical–electrical conversion on the image sensors. the end effect of this incorrect scanning is reduced fidelity of colour reproduction on a display unit, and also on the crt from which the channel sensitivities of the rgb prism are derived. the summing curve of any set of colour matching functions embodies a characteristic minimum at a wavelength of around 500 nm; the summing curves of the colour matching functions x(λ), y(λ), z(λ) and r(λ), g(λ), b(λ) are shown in fig. 5. it is interesting that, for evaluating colour, human vision does not use information that is contained at wavelengths of around 500 nm.
fig. 1: cie chromaticity diagram with the lights x, y, z and (r, g, b)ntsc
fig. 2: colour matching functions x(λ), y(λ), z(λ)
fig. 3: rgbntsc chromaticity diagram with the lights x, y, z and (r, g, b)ntsc
fig. 4: colour matching functions r(λ), g(λ), b(λ)
fig. 5: summing curves of the colour matching functions x(λ), y(λ), z(λ) (a) and r(λ), g(λ), b(λ) (b)
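the general colorimetric transformation mentioned above is a 3×3 linear map between trichromatic components. the sketch below applies the standard xyz-to-linear-srgb matrix purely as an illustration of such a transform; the article's own target would be the primaries of the given display, whose matrix is not reproduced here.

    import numpy as np

    # illustrative 3x3 colorimetric transformation: xyz tristimulus values to
    # linear srgb components (the widely published iec 61966-2-1 matrix, used
    # here only as an example; a real xyz camera would use the matrix of the
    # actual display primaries)
    M_XYZ_TO_SRGB = np.array([
        [ 3.2406, -1.5372, -0.4986],
        [-0.9689,  1.8758,  0.0415],
        [ 0.0557, -0.2040,  1.0570],
    ])

    def xyz_to_display(xyz):
        return M_XYZ_TO_SRGB @ np.asarray(xyz)

    # the d65 white point (x, y, z) maps close to equal drive values (1, 1, 1)
    print(xyz_to_display([0.9505, 1.0000, 1.0890]))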
3 spectral reflectances of partial filters of the xyz prism
on the basis of fig. 6, a system of three equations for the spectral sensitivities of the partial channels x′(λ), y′(λ), z′(λ) of the proposed scanning system can be compiled. the spectral reflectances of the partial filters ay(λ), bx(λ) and cz(λ) are the unknowns [4]:
$y'(\lambda) = a_y(\lambda)$, (1)
$x'(\lambda) = b_x(\lambda)\,[1 - a_y(\lambda)]$, (2)
$z'(\lambda) = c_z(\lambda)\,[1 - a_y(\lambda)]\,[1 - b_x(\lambda)]$. (3)
the spectral sensitivities of the partial channels x′(λ), y′(λ), z′(λ) of the proposed scanning system are the colour matching functions x(λ), y(λ), z(λ) (cie 1931, 2-deg), corrected for the maximum efficiency of the transmission of the light flux through the colour splitting system, for the maximum transparency of the splitting system, and for the spectral sensitivities of the ccd image sensors. by solving equations (1), (2), (3), we acquire the spectral reflectances of the partial filters ay(λ), bx(λ) and cz(λ). the solutions show that the colour splitting system is not orthogonal: the curves of the spectral sensitivities of the scanning system overlap one another above the wavelength axis, so during the separation the light energy is split into two, and in some places even into three, paths. the ideal spectral reflectances of the partial filters of the xyz prism are
$a_y(\lambda) = y(\lambda)$, (4)
$b_x(\lambda) = \frac{x(\lambda)}{1 - y(\lambda)}$, (5)
$c_z(\lambda) = \frac{z(\lambda)}{1 - y(\lambda) - x(\lambda)}$. (6)
the approximations of the ideal spectral reflectances ay(λ), bx(λ), cz(λ) by real optical interference filters (see fig. 7) were made using the synopsys programme [5]. the technical solution of the real optical interference filters involves producing dichroic layers [6], [7], [8]. these form a coating of a shiny pellucid medium (e.g., boro-silicate glass bk7) with thicknesses comparable with the wavelength of light. the layers are deposited step by step with alternating higher and lower refractive indices. filter ay(λ) is built up from eight layers of three materials (mgf2, cef3, ceo2), filter bx(λ) is built up from twelve layers of four materials (mgf2, cef3, ceo2, sio), and filter cz(λ) has twelve layers of three materials (mgf2, cef3, zro2) [11].
fig. 6: the sequence of separation of the partial components (images)
fig. 7: ideal spectral reflectances ay(λ), bx(λ), cz(λ) and their approximation by real optical interference filters (approximation by real filters – full line)
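equations (4)–(6) translate directly into a few lines of code. the sketch below assumes the cie 1931 colour matching functions are available as sampled arrays; the three-sample stub is made up purely to keep the example self-contained.

    import numpy as np

    # made-up three-sample stub of the cie 1931 colour matching functions;
    # in practice x_bar, y_bar, z_bar would be tabulated over 380-780 nm
    x_bar = np.array([0.3, 0.1, 0.05])
    y_bar = np.array([0.2, 0.8, 0.02])
    z_bar = np.array([0.9, 0.05, 0.0])

    # ideal reflectances of the partial filters, equations (4)-(6)
    ay = y_bar
    bx = x_bar / (1.0 - y_bar)
    cz = z_bar / (1.0 - y_bar - x_bar)

    # sanity check: running the cascade of eqs. (1)-(3) must reproduce the
    # original channel sensitivities
    assert np.allclose(ay, y_bar)
    assert np.allclose(bx * (1 - ay), x_bar)
    assert np.allclose(cz * (1 - ay) * (1 - bx), z_bar)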
4 geometry of the xyz colour splitting system
the xyz colour splitting system (see fig. 8) consists of four prisms and three interference filters. the colour splitting system constitutes a three-band frequency-selective splitter of light pencils and a three-band amplitude splitter. the pencils generate the partial images x, y, z at the outputs of the splitter. the images are scanned by three sensors, e.g., ccd sensors, and the video signals ex, ey, ez are obtained as electrical analogs of the trichromatic components x, y, z. the prisms are made of bk7 glass.
fig. 8: the xyz colour splitting system (1, 2, 3, 4 – glass prisms; 5, 6, 7 – dielectric multilayers (filters); 8, 9, 10 – image sensors; 11, 12, 13 – air interspaces)
the third prism functions only as an adjusting shim, to provide sufficient room for image sensor z. each of the prisms is designed and arranged in the set so that the trace lengths of the partial light tubes are identical and the proportions of the prisms enable trouble-free transit of light tubes of the required diameter. all three filters are of the reflective-interference type. the rear surfaces of the first, second and fourth prisms, i.e., the surfaces in the direction in which the beams travel, are coated with these filters. the filters are built up of dielectric multilayers of the following materials: sio, mgf2, cef3, zro2, ceo2. the spectral reflectances of the partial filters ay(λ), bx(λ), cz(λ) are illustrated in fig. 7. each partial light tube executes two reflections in the xyz prism. the first reflections of the partial light tubes occur on the filters ay(λ), bx(λ) and cz(λ); these reflections are frequency- and amplitude-selective. the second reflections of the light tubes are total, and occur on the front walls (on the glass–air passages) of the first, second and fourth prisms. after the second reflections, the partial light tubes with spectral sensitivities y′(λ), x′(λ) and z′(λ), respectively, reach the image sensors. due to the transmissivity of the xyz prism, and due to the summing curve of the spectral sensitivities x′(λ) + y′(λ) + z′(λ), only a part of the incident light spectrum is used to obtain the trichromatic components x, y, z (the partial images x, y, z). the unused light spectrum, mainly the section around the wavelength of 500 nm, passes through filter cz(λ) and leaves the xyz prism. this light must be absorbed in the camera (e.g., by velvet) to prevent it being reflected back into the prism; otherwise it would cause spurious artefacts in the picture during reproduction. in order to create glass–air passages, i.e., total reflections also for components x and z, there has to be a slim air interspace 0.1–0.2 mm in thickness between each two prisms. this air interspace is also needed between the second and the third prism: it does not engender a total reflection there, but it has a favourable effect on the number of layers and on the kinds of material of filter bx(λ). in other words, the impedance match of the filters will be less demanding if there is a substance with a different impedance on all sides of the filter.
5 conclusion
this paper aims to show how the colorimetry of the tv scanning set could proceed to full exploitation of the new range of colorimetric display devices. the xyz colour splitting system encounters none of the insuperable difficulties during the optical separation of the partial components which are found in the classic rgb scanning system. its ideal spectral sensitivities are, in contrast to the ideal spectral sensitivities of the classic scanning system, only positive. hence, there is no longer any need to introduce additional corrections for areas with negative spectral sensitivities of the partial channels. all three filters in the xyz prism are of the reflective-interference type, unlike the green filter of the rgb prism, which is coloured and therefore absorptive. the colorimetric system of primary lights x, y, z overlays the whole gamut of existing colours. for light of any colour, the trichromatic components x, y, z are only positive. the end effect is that the colour gamut of reproduction will not be reduced [10], [11].
footnote
the xyz colour splitting system for tv cameras was submitted by doc.
ing. emil košťál, csc., ing. jan kaiser and ing. jiří slavík as a utility model and patent application [12]. a registration certificate for the utility model was granted on 29. 5. 2000; the number of the utility model is 10026. the certificate of patent registration was granted on 19. 4. 2001; the number of the patent is 288456.
references
[1] košt’ál, e.: obrazová a televizní technika ii – televize (image and television technology ii – tv). učební text čvut, 1998, p. 135
[2] http://cvision.ucsd.edu/index.html, cie standards, color spectra databases
[3] ptáček, m.: přenosové soustavy barevné a digitální televize (transmission systems of colour and digital tv). 2nd edition, nadas, praha, 1981, p. 488
[4] slavík, j.: návrh světlodělící soustavy pro kameru pracující v kolorimetrickém systému x, y, z (design of a colour splitting system for a tv camera working in the x, y, z colorimetric system). unpublished manuscript, 1999
[5] http://www.gwi.net/osd, synopsys program
[6] dobrowolski, j. a.: completely automatic synthesis of optical thin film systems. applied optics, vol. 4, no. 8, aug. 1965, p. 937
[7] ditchburn, r. w.: light. 3rd edition, acad. press, 1976
[8] novák, z.: optické soustavy snímacích zařízení (optical systems of scanning devices). učební text pro postgraduální studium, čvut, 1971
[9] pazderák, j.: kolorimetrie snímacích soustav barevné televize a elektronické kolorimetrické korekce (colorimetry of scanning systems of colour tv and electronic colorimetric corrections). edice čs. televize, řada ii, svazek 16, praha, 1974, p. 150
[10] svoboda, v.: kolorimetrie a zdokonalené televizní soustavy (colorimetry and improved tv systems). in: televize 94, č. 1, ivp čt, praha, 1994, pp. 65–114
[11] kaiser, j.: kolorimetrie zdokonalených tv soustav (colorimetry of improved tv systems). diploma project, čvut, 2001
[12] košt’ál, e., kaiser, j., slavík, j.: hranolová světlodělící soustava pro televizní kamery (the colour splitting system for tv cameras). přihláška vynálezu (patent application) č. pv 2000-1167, 30. 3. 2000
ing. jan kaiser, e-mail: xkaiserj@feld.cvut.cz, department of radioelectronics, czech technical university in prague, faculty of electrical engineering, technická 2, 166 27 praha 6, czech republic
doc. ing. emil košt’ál, csc., e-mail: kostalem@worldonline.dk, ryvangs allé 14, 2100 copenhagen, czech embassy, denmark
farewell and welcome new editor-in-chief
my seven years with acta polytechnica ended on june 30. the opportunity to lead this journal from a local list of reviewed periodicals to the prestigious databases – web of science, scopus, inspec, cas and more – was a rewarding experience and privilege for me. all this would not have been possible without the continual cooperation and understanding of the many individuals who contribute to putting the journal together. in the first place there are the authors of excellent submissions, and the reviewers who contributed their time, expertise, knowledge and experience to help the authors improve their manuscripts and, through their comments and questions, gave them inspiration for their further research. during my tenure i saw very satisfying trends in our submissions, mainly the increasing quality of the manuscripts submitted by young researchers. i greatly appreciate the continual support of all the members of the editorial board. as experts in their fields they contributed to the editorial process and provided invaluable advice and recommendations. in particular i would like to thank the internal board members, who collaborated with their colleagues from the faculties and institutes. without this motivating communication and cooperation across the university the ap would not be what it is. a special thanks belongs to tomáš hejda, who seven years ago created the image of the journal and set up its typographic rules and standards. last but not least, the ap could not grow without the work of the language editors, especially robin healey.
and, of course, i would like to thank all the colleagues from the ctu central library for their support and intensive cooperation. without them it would not be possible to set the necessary publishing standards and processes. i would especially like to thank lenka němečková for being a source of energy and inspiration for the ap, and the director of the ctu central library, marta machytková, for her support. finally, i would like to introduce and welcome an outstanding colleague and the new editor-in-chief, tereza karlová. i greatly respect her professional skills, i am convinced that she has the right motivation for the successful management of acta polytechnica, and i wish her all the best.
iva adlerová, outgoing editor-in-chief

on connecting weyl-orbit functions to jacobi polynomials and multivariate (anti)symmetric trigonometric functions
jiří hrivnák, lenka motlochová∗
department of physics, faculty of nuclear sciences and physical engineering, czech technical university in prague, břehová 7, cz-115 19 prague, czech republic
∗ corresponding author: lenka.motlochova@fjfi.cvut.cz
abstract. the aim of this paper is to make an explicit link between the weyl-orbit functions and the corresponding polynomials, on the one hand, and several other families of special functions and orthogonal polynomials on the other. the cornerstone is the connection that is made between the one-variable orbit functions of a1 and the four kinds of chebyshev polynomials. it is shown that there exists a similar connection for the two-variable orbit functions of a2 and a specific version of two-variable jacobi polynomials. the connection with recently studied g2-polynomials is established. formulas for the connection between the four types of orbit functions of bn or cn and the (anti)symmetric multivariate cosine and sine functions are explicitly derived.
keywords: weyl-orbit functions, chebyshev polynomials, jacobi polynomials, (anti)symmetric trigonometric functions.
1. introduction
special functions associated with the root systems of simple lie algebras, e.g. weyl-orbit functions, play an important role in several domains of mathematics and theoretical physics, in particular in representation theory, harmonic analysis, numerical integration and conformal field theory. the purpose of this paper is to link weyl-orbit functions with various types of orthogonal polynomials, namely chebyshev and jacobi polynomials and their multivariate generalizations, and thus to motivate further development of the remarkable properties of these polynomials in connection with orbit functions. the collection of weyl-orbit functions includes four different families of functions, called c–, s–, ss– and sl–functions [6, 15, 16, 25]. they are induced from the sign homomorphisms of the weyl groups of geometric symmetries related to the underlying lie algebras. the symmetric c–functions and antisymmetric s–functions also appear in the representation theory of simple lie algebras [32, 34]; the s–functions appear in the weyl character formula, and every character of an irreducible representation of a simple lie algebra can be written as a linear combination of c–functions.
unlike c– and s–functions, ss– and sl–functions exist only in the case of simple lie algebras with two different lengths of roots. a review of several pertinent properties of the weyl-orbit functions is contained in [8, 15, 16, 25]. these functions possess symmetries with respect to the affine weyl group – an infinite extension of the weyl group by translations in the dual root lattice. therefore, we consider c–, s–, ss– and sl–functions only on specific subsets of the fundamental domain f of the affine weyl group. within each family, the functions are continuously orthogonal when integrated over f and form a hilbert basis of square-integrable functions on f [25, 27]. they also satisfy discrete orthogonality relations, which is of major importance for the processing of multidimensional digital data [8, 10, 27]. using discrete fourier-like transforms arising from the discrete orthogonality, digital data are interpolated in any dimension and for any lattice symmetry afforded by the underlying simple lie algebra. several special cases of simple lie algebras of rank two are studied in [28–30]. the properties of orbit functions also lead to numerical integration formulas for functions of several variables. these approximate a weighted integral of any function of several variables by a linear combination of function values at points called nodes. in general, such formulas are required to be exact for all polynomial functions up to a certain degree [3]. furthermore, the c–functions and s–functions of the simple lie algebra a1 coincide, up to a constant, with the common cosine and sine functions, respectively. they are therefore related to the extensively studied chebyshev polynomials and, consequently, to the integration formulas, quadratures, for functions of one variable [5, 31]. in [24], it is shown that there are analogous formulas for numerical integration, for multivariate functions, that depend on the weyl group of the simple lie algebra an and the corresponding c– and s–functions. the resulting rules for functions of several variables are known as cubature formulas. the idea of [24] is extended to any simple lie algebra in [9, 25, 26]. optimal cubature formulas, in the sense of the number of nodal points required, are known only for s– and ss–functions. besides the chebyshev polynomials, the weyl-orbit functions are related to other orthogonal polynomials. for example, the orbit functions of a2 and c2 coincide with two-variable analogues of jacobi polynomials [21]. it can also be shown that the c–, s–, ss– and sl–functions arising in connection with the simple lie algebras bn and cn become, up to a constant, (anti)symmetric multivariate cosine functions and (anti)symmetric multivariate sine functions [17]. note that these generalizations lead to multivariate analogues of chebyshev polynomials and are used to derive optimal cubature formulas. this fact therefore indicates that it might be possible to obtain such formulas for all families of orbit functions. as for chebyshev polynomials, it is of interest to study the accuracy of approximations and interpolations using cubature formulas. this paper starts with a brief introduction to weyl groups in section 2. then there is a review of the relations of the weyl-orbit functions with other special functions which are associated with the weyl groups.
in section 3.1, the connection of the c– and s–functions of one variable with chebyshev polynomials is recalled. in sections 3.2 and 3.4, we show that each family of weyl-orbit functions corresponding to a2 and c2 can be viewed as a two-variable analogue of jacobi polynomials [21]. in section 3.4, we also provide the exact connection with generalizations of trigonometric functions [34].
2. weyl groups of simple lie algebras
in this section, we summarize the properties of the weyl groups that are needed for the definition of orbit functions. there are four series of simple lie algebras an (n ≥ 1), bn (n ≥ 3), cn (n ≥ 2), dn (n ≥ 4) and five exceptional simple lie algebras e6, e7, e8, f4 and g2. each of these algebras is connected with its corresponding weyl group [1, 12, 13, 18, 34]. they are completely classified by dynkin diagrams (see e.g. the figure in [15]). a dynkin diagram characterizes a set ∆ of simple roots α1, . . . , αn generating a euclidean space isomorphic to $\mathbb{R}^n$ with the scalar product denoted by ⟨·, ·⟩. each node of the dynkin diagram represents one simple root αi. the number of links between the two nodes corresponding to αi and αj is equal to $\langle \alpha_i, \alpha_j^\vee\rangle\langle \alpha_j, \alpha_i^\vee\rangle$, where $\alpha_i^\vee \equiv 2\alpha_i/\langle\alpha_i,\alpha_i\rangle$. note that we use the standard normalization of the lengths of the roots, namely ⟨αi, αi⟩ = 2 if αi is a long simple root. in addition to the basis of $\mathbb{R}^n$ consisting of the simple roots αi, it is convenient for our purposes to introduce the basis of fundamental weights ωj, given by $\langle \omega_j, \alpha_i^\vee\rangle = \delta_{ij}$. this allows us to express the weight lattice p, defined by $P \equiv \{\lambda\in\mathbb{R}^n \mid \langle\lambda,\alpha_i^\vee\rangle\in\mathbb{Z},\ i = 1,\ldots,n\}$, in terms of $\mathbb{Z}$-linear combinations of the ωj. the subset of dominant weights $P^+$ is standardly given as $P^+ \equiv \mathbb{Z}^{\ge0}\omega_1 + \cdots + \mathbb{Z}^{\ge0}\omega_n$. we consider the usual partial ordering on p given by $\mu \preceq \lambda$ if and only if $\lambda - \mu$ is a sum of simple roots with non-negative integer coefficients. to each simple root αi there corresponds a reflection ri with respect to the hyperplane orthogonal to αi,
$r_i(a) \equiv r_{\alpha_i}(a) = a - \frac{2\langle a,\alpha_i\rangle}{\langle\alpha_i,\alpha_i\rangle}\,\alpha_i, \qquad a\in\mathbb{R}^n.$
the finite group w generated by the reflections ri, i = 1, . . . , n, is called the weyl group. for the properties of the weyl groups see e.g. [12, 14]. in the case of simple lie algebras with two different lengths of roots, we need to distinguish between short and long simple roots. we therefore denote by ∆s the set of short simple roots, and by ∆l the set of long simple roots. we also define the following vectors:
$\rho \equiv \sum_{i=1}^{n}\omega_i, \qquad \rho^s \equiv \sum_{\alpha_i\in\Delta^s}\omega_i, \qquad \rho^l \equiv \sum_{\alpha_i\in\Delta^l}\omega_i. \qquad (1)$
3. weyl-orbit functions
each type of weyl-orbit function arises from a sign homomorphism of the weyl group, $\sigma: W \to \{\pm1\}$. there exist only two different sign homomorphisms on the w connected to simple lie algebras with one length of roots: the identity, denoted by $\mathbb{1}$, and the determinant [8, 25]. they are given by their values on the generators ri of w as
$\mathbb{1}(r_i) = 1$ for all $\alpha_i\in\Delta$, $\qquad \det(r_i) = -1$ for all $\alpha_i\in\Delta$.
in the case of simple lie algebras with two different lengths of roots, i.e. bn, cn, f4 and g2, there are two additional sign homomorphisms, denoted by σs and σl and given as
$\sigma^s(r_i) = \begin{cases} -1 & \text{if } \alpha_i\in\Delta^s,\\ \phantom{-}1 & \text{if } \alpha_i\in\Delta^l,\end{cases} \qquad \sigma^l(r_i) = \begin{cases} \phantom{-}1 & \text{if } \alpha_i\in\Delta^s,\\ -1 & \text{if } \alpha_i\in\Delta^l.\end{cases}$
labelled by the parameter $a\in\mathbb{R}^n$, the weyl-orbit function of the variable $b\in\mathbb{R}^n$ corresponding to the sign homomorphism σ is introduced via the formula
$\varphi^\sigma_a(b) = \sum_{w\in W}\sigma(w)\,e^{2\pi i\langle w(a),b\rangle}, \qquad a, b\in\mathbb{R}^n.$
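a quick numerical illustration of this definition, under the normalisation ⟨ω1, α1∨⟩ = 1 used above: for a1 the weyl group has two elements, acting on the one-dimensional weight space as multiplication by ±1, and the sum reproduces the a1 formulas quoted in section 3.1 below.

    import cmath

    # w(a1) acts on the one-dimensional weight space as {+1, -1};
    # sigma(w) = 1 gives the c-function, sigma(w) = det(w) the s-function
    W_A1 = (1.0, -1.0)

    def orbit_function_a1(a1, b1, sigma):
        # phi^sigma_a(b) = sum_w sigma(w) exp(2 pi i <w(a), b>)
        return sum(sigma(w) * cmath.exp(2j * cmath.pi * (w * a1) * b1) for w in W_A1)

    a1, b1 = 3.0, 0.1
    print(orbit_function_a1(a1, b1, lambda w: 1.0))                     # = 2 cos(2 pi a1 b1)
    print(orbit_function_a1(a1, b1, lambda w: 1.0 if w > 0 else -1.0))  # = 2i sin(2 pi a1 b1)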
each sign homomorphism $\mathbb{1}$, det, σs, σl determines one family of complex-valued weyl-orbit functions, called c–, s–, ss– and sl–functions respectively, and denoted by
c–functions: $\sigma \equiv \mathbb{1}$, $\varphi^\sigma \equiv \Phi$; s–functions: $\sigma \equiv \det$, $\varphi^\sigma \equiv \varphi$; ss–functions: $\sigma \equiv \sigma^s$, $\varphi^\sigma \equiv \varphi^s$; sl–functions: $\sigma \equiv \sigma^l$, $\varphi^\sigma \equiv \varphi^l$.
several remarkable properties of the weyl-orbit functions, such as continuous and discrete orthogonality, are usually achieved by restricting a to some subsets of the weight lattice p (see for example [8, 10, 15, 16, 27]). note also that the symmetric c–functions and the antisymmetric s–functions appear in the theory of irreducible representations of simple lie algebras [32, 34]. it is also convenient to use an alternative definition of the weyl-orbit functions via sums over the weyl group orbits [15, 16, 25]. these orbit sums $C_a$, $S_{a+\rho}$, $S^s_{a+\rho^s}$, $S^l_{a+\rho^l}$ differ from $\Phi_a$, $\varphi_{a+\rho}$, $\varphi^s_{a+\rho^s}$, $\varphi^l_{a+\rho^l}$ only by a constant:
$C_a(b) = \frac{\Phi_a(b)}{|\mathrm{Stab}_W a|}, \quad S_{a+\rho}(b) = \varphi_{a+\rho}(b), \quad S^s_{a+\rho^s}(b) = \frac{\varphi^s_{a+\rho^s}(b)}{|\mathrm{Stab}_W(a+\rho^s)|}, \quad S^l_{a+\rho^l}(b) = \frac{\varphi^l_{a+\rho^l}(b)}{|\mathrm{Stab}_W(a+\rho^l)|},$
with $|\mathrm{Stab}_W c|$ denoting the number of elements of w which leave c invariant.
3.1. case a1
the symmetric c–functions and antisymmetric s–functions of a1 are, up to a constant, the common cosine and sine functions [15, 16],
$C_a(b) = 2\cos(2\pi a_1 b_1), \qquad S_a(b) = 2i\sin(2\pi a_1 b_1),$
where $a = a_1\omega_1$, $b = b_1\alpha_1^\vee$. it is well known that such functions appear in the definition of the extensively studied chebyshev polynomials [5, 31]. several types of chebyshev polynomials are widely used in mathematical analysis, in particular as efficient tools for numerical integration and approximation. the chebyshev polynomials of the first, second, third and fourth kind are denoted by $T_m(x)$, $U_m(x)$, $V_m(x)$ and $W_m(x)$, respectively. if $x = \cos\theta$, then for any $m\in\mathbb{Z}^{\ge0}$ it holds that
$T_m(x) \equiv \cos(m\theta), \quad U_m(x) \equiv \frac{\sin((m+1)\theta)}{\sin\theta}, \quad V_m(x) \equiv \frac{\cos((m+\frac12)\theta)}{\cos\frac{\theta}{2}}, \quad W_m(x) \equiv \frac{\sin((m+\frac12)\theta)}{\sin\frac{\theta}{2}}.$
therefore, for specific choices of the parameter $a_1$ and for $2\pi b_1 = \theta$, we can view the weyl-orbit functions of a1 as these chebyshev polynomials. recall also that the chebyshev polynomials are actually, up to a constant $c_{\alpha,\beta}$, special cases of the jacobi polynomials $P^{(\alpha,\beta)}_m(x)$, $m\in\mathbb{Z}^{\ge0}$. the jacobi polynomials are given as orthogonal polynomials with respect to the weight function $(1-x)^\alpha(1+x)^\beta$, $-1 < x < 1$, where the parameters α, β are subject to the condition α, β > −1 [4, 33]. in particular, it holds that
$T_m(x) = c_{-\frac12,-\frac12}\,P^{(-\frac12,-\frac12)}_m(x), \quad U_m(x) = c_{\frac12,\frac12}\,P^{(\frac12,\frac12)}_m(x), \quad V_m(x) = c_{-\frac12,\frac12}\,P^{(-\frac12,\frac12)}_m(x), \quad W_m(x) = c_{\frac12,-\frac12}\,P^{(\frac12,-\frac12)}_m(x).$
for both chebyshev polynomials and jacobi polynomials there exist various multivariate generalizations, see for example [3, 21, 22]. in sections 3.2, 3.3 and 3.4, we identify some of the two-variable analogues of these orthogonal polynomials with specific weyl-orbit functions.
3.2. case a2
since the two simple roots of a2 are of the same length, there are only two corresponding families of weyl-orbit functions, the c– and s–functions. for $a = a_1\omega_1 + a_2\omega_2$ and $b = b_1\alpha_1^\vee + b_2\alpha_2^\vee$, the explicit formulas for the c–functions and s–functions are given by
$C_a(b) = \frac{1}{|\mathrm{Stab}_W a|}\big(e^{2\pi i(a_1b_1+a_2b_2)} + e^{2\pi i(-a_1b_1+(a_1+a_2)b_2)} + e^{2\pi i((a_1+a_2)b_1-a_2b_2)} + e^{2\pi i(a_2b_1-(a_1+a_2)b_2)} + e^{2\pi i((-a_1-a_2)b_1+a_1b_2)} + e^{2\pi i(-a_2b_1-a_1b_2)}\big),$
$S_a(b) = e^{2\pi i(a_1b_1+a_2b_2)} - e^{2\pi i(-a_1b_1+(a_1+a_2)b_2)} - e^{2\pi i((a_1+a_2)b_1-a_2b_2)} + e^{2\pi i(a_2b_1-(a_1+a_2)b_2)} + e^{2\pi i((-a_1-a_2)b_1+a_1b_2)} - e^{2\pi i(-a_2b_1-a_1b_2)},$
with the values $|\mathrm{Stab}_W a|$ given in table 1 of [9].
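the six-term sums above are easy to evaluate directly. the sketch below encodes each exponent together with its sign in the s-function, and checks that $S_a$ vanishes on a reflection hyperplane ($a_1 = 0$), as antisymmetry requires.

    import cmath

    # each term of the a2 formulas: its exponent as a function of
    # (a1, a2, b1, b2), together with its sign in the s-function
    TERMS = [
        (lambda a1, a2, b1, b2:  a1*b1 + a2*b2,             +1),
        (lambda a1, a2, b1, b2: -a1*b1 + (a1 + a2)*b2,      -1),
        (lambda a1, a2, b1, b2: (a1 + a2)*b1 - a2*b2,       -1),
        (lambda a1, a2, b1, b2:  a2*b1 - (a1 + a2)*b2,      +1),
        (lambda a1, a2, b1, b2: -(a1 + a2)*b1 + a1*b2,      +1),
        (lambda a1, a2, b1, b2: -a2*b1 - a1*b2,             -1),
    ]

    def phi_a2(a1, a2, b1, b2):
        return sum(cmath.exp(2j * cmath.pi * f(a1, a2, b1, b2)) for f, _ in TERMS)

    def s_a2(a1, a2, b1, b2):
        return sum(s * cmath.exp(2j * cmath.pi * f(a1, a2, b1, b2)) for f, s in TERMS)

    # the s-function vanishes whenever the label sits on a reflection
    # hyperplane, e.g. a1 = 0
    print(abs(s_a2(0.0, 2.0, 0.3, 0.7)))  # ~0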
these functions are related to the generalized cosine and sine functions $TC_k$ and $TS_k$ studied in [23] and defined as
$TC_k(t) = \frac13\big(e^{\frac{i\pi}{3}(k_2-k_3)(t_2-t_3)}\cos(k_1\pi t_1) + e^{\frac{i\pi}{3}(k_2-k_3)(t_3-t_1)}\cos(k_1\pi t_2) + e^{\frac{i\pi}{3}(k_2-k_3)(t_1-t_2)}\cos(k_1\pi t_3)\big),$
$TS_k(t) = \frac13\big(e^{\frac{i\pi}{3}(k_2-k_3)(t_2-t_3)}\sin(k_1\pi t_1) + e^{\frac{i\pi}{3}(k_2-k_3)(t_3-t_1)}\sin(k_1\pi t_2) + e^{\frac{i\pi}{3}(k_2-k_3)(t_1-t_2)}\sin(k_1\pi t_3)\big),$
where $t = (t_1,t_2,t_3)\in\mathbb{R}^3$ with $t_1+t_2+t_3 = 0$ and $k = (k_1,k_2,k_3)\in\mathbb{Z}^3$ with $k_1+k_2+k_3 = 0$. the explicit correspondence
$C_a = \frac{6}{|\mathrm{Stab}_W a|}\,TC_k, \qquad S_a = 6\,TS_k$
is obtained by the following change of variables and parameters:
$k_1 = a_1,\ k_2 = a_2,\ k_3 = -a_1-a_2, \qquad t_1 = 2b_1-b_2,\ t_2 = -b_1+2b_2,\ t_3 = -b_1-b_2.$
it is possible to express $C_a$ and $S_{a+\rho}/S_\rho$ with $a\in P^+$ as polynomials in $C_{\omega_1}$ and $C_{\omega_2}$ [1]. taking into account that $\overline{C_{\omega_1}} = C_{\omega_2}$, one can pass to real variables by the natural change of variables
$x = \frac{C_{\omega_1}+C_{\omega_2}}{2} = \cos 2\pi b_1 + \cos 2\pi b_2 + \cos 2\pi(b_1-b_2), \qquad y = \frac{C_{\omega_1}-C_{\omega_2}}{2i} = \sin 2\pi b_1 - \sin 2\pi b_2 - \sin 2\pi(b_1-b_2).$
since the c–functions and s–functions are continuously orthogonal, their polynomial versions inherit the orthogonality property. one can verify that the corresponding polynomials are special cases of two-dimensional analogues of jacobi polynomials, orthogonal with respect to the weight function
$w_\alpha(x,y) = \big(-(x^2+y^2+9)^2 + 8(x^3-3xy^2) + 108\big)^\alpha$
on the region bounded by the three-cusped deltoid called steiner's hypocycloid [21], with the boundary given by
$-(x^2+y^2+9)^2 + 8(x^3-3xy^2) + 108 = 0,$
see fig. 1. more precisely, the polynomials $C_a$ and $S_{a+\rho}/S_\rho$ correspond to the choices $\alpha = -\frac12$ and $\alpha = \frac12$, respectively.
figure 1. the region of orthogonality bounded by the three-cusped deltoid.
3.3. case g2
since the two simple roots of g2 are of different lengths, all four families of c–, s–, ss– and sl–functions are obtained [15, 16, 25]. the symmetric and antisymmetric orbit functions are given by the following formulas for $a = a_1\omega_1 + a_2\omega_2$ and $b = b_1\alpha_1^\vee + b_2\alpha_2^\vee$:
$C_a(b) = \frac{2}{|\mathrm{Stab}_W a|}\big(\cos 2\pi(a_1b_1+a_2b_2) + \cos 2\pi(-a_1b_1+(3a_1+a_2)b_2) + \cos 2\pi((a_1+a_2)b_1-a_2b_2) + \cos 2\pi((2a_1+a_2)b_1-(3a_1+a_2)b_2) + \cos 2\pi((-a_1-a_2)b_1+(3a_1+2a_2)b_2) + \cos 2\pi((-2a_1-a_2)b_1+(3a_1+2a_2)b_2)\big),$
$S_a(b) = 2\big(\cos 2\pi(a_1b_1+a_2b_2) - \cos 2\pi(-a_1b_1+(3a_1+a_2)b_2) - \cos 2\pi((a_1+a_2)b_1-a_2b_2) + \cos 2\pi((2a_1+a_2)b_1-(3a_1+a_2)b_2) + \cos 2\pi((-a_1-a_2)b_1+(3a_1+2a_2)b_2) - \cos 2\pi((-2a_1-a_2)b_1+(3a_1+2a_2)b_2)\big).$
the hybrid cases can be expressed as
$S^s_a(b) = \frac{2i}{|\mathrm{Stab}_W a|}\big(\sin 2\pi(a_1b_1+a_2b_2) + \sin 2\pi(-a_1b_1+(3a_1+a_2)b_2) - \sin 2\pi((a_1+a_2)b_1-a_2b_2) - \sin 2\pi((2a_1+a_2)b_1-(3a_1+a_2)b_2) - \sin 2\pi((-a_1-a_2)b_1+(3a_1+2a_2)b_2) - \sin 2\pi((-2a_1-a_2)b_1+(3a_1+2a_2)b_2)\big),$
$S^l_a(b) = 2\big(\sin 2\pi(a_1b_1+a_2b_2) - \sin 2\pi(-a_1b_1+(3a_1+a_2)b_2) + \sin 2\pi((a_1+a_2)b_1-a_2b_2) - \sin 2\pi((2a_1+a_2)b_1-(3a_1+a_2)b_2) - \sin 2\pi((-a_1-a_2)b_1+(3a_1+2a_2)b_2) + \sin 2\pi((-2a_1-a_2)b_1+(3a_1+2a_2)b_2)\big).$
these functions have been studied in [22], under the notation $CC_k$, $SS_k$, $SC_k$ and $CS_k$, with
$CC_k(t) = \frac13\big(\cos\tfrac{\pi(k_1-k_3)(t_1-t_3)}{3}\cos(\pi k_2t_2) + \cos\tfrac{\pi(k_1-k_3)(t_2-t_1)}{3}\cos(\pi k_2t_3) + \cos\tfrac{\pi(k_1-k_3)(t_3-t_2)}{3}\cos(\pi k_2t_1)\big),$
$SS_k(t) = \frac13\big(\sin\tfrac{\pi(k_1-k_3)(t_1-t_3)}{3}\sin(\pi k_2t_2) + \sin\tfrac{\pi(k_1-k_3)(t_2-t_1)}{3}\sin(\pi k_2t_3) + \sin\tfrac{\pi(k_1-k_3)(t_3-t_2)}{3}\sin(\pi k_2t_1)\big),$
$SC_k(t) = \frac13\big(\sin\tfrac{\pi(k_1-k_3)(t_1-t_3)}{3}\cos(\pi k_2t_2) + \sin\tfrac{\pi(k_1-k_3)(t_2-t_1)}{3}\cos(\pi k_2t_3) + \sin\tfrac{\pi(k_1-k_3)(t_3-t_2)}{3}\cos(\pi k_2t_1)\big),$
$CS_k(t) = \frac13\big(\cos\tfrac{\pi(k_1-k_3)(t_1-t_3)}{3}\sin(\pi k_2t_2) + \cos\tfrac{\pi(k_1-k_3)(t_2-t_1)}{3}\sin(\pi k_2t_3) + \cos\tfrac{\pi(k_1-k_3)(t_3-t_2)}{3}\sin(\pi k_2t_1)\big),$
where the variable $t = (t_1,t_2,t_3)\in\mathbb{R}^3_h = \{t\in\mathbb{R}^3 \mid t_1+t_2+t_3 = 0\}$ and the parameter $k = (k_1,k_2,k_3)\in\mathbb{Z}^3\cap\mathbb{R}^3_h$. indeed, performing the following change of variables and parameters,
$t_1 = -b_1+3b_2,\ t_2 = 2b_1-3b_2,\ t_3 = -b_1, \qquad k_1 = a_1+a_2,\ k_2 = a_1,\ k_3 = -2a_1-a_2,$
we obtain the following relations:
$C_a = \frac{12}{|\mathrm{Stab}_W a|}\,CC_k, \quad S_a = -12\,SS_k, \quad S^s_a = \frac{12i}{|\mathrm{Stab}_W a|}\,SC_k, \quad S^l_a = \frac{12i}{|\mathrm{Stab}_W a|}\,CS_k.$
in [22], the functions $C_a$, $S_{a+\rho}/S_\rho$, $S^s_{a+\rho^s}/S^s_{\rho^s}$ and $S^l_{a+\rho^l}/S^l_{\rho^l}$ are expressed as two-variable polynomials in the variables
$x = \tfrac16 C_{\omega_2} = \tfrac13\big(\cos 2\pi b_2 + \cos 2\pi(b_1-b_2) + \cos 2\pi(-b_1+2b_2)\big), \qquad y = \tfrac16 C_{\omega_1} = \tfrac13\big(\cos 2\pi b_1 + \cos 2\pi(-b_1+3b_2) + \cos 2\pi(2b_1-3b_2)\big),$
and it is shown that they are orthogonal within each family with respect to a weighted integral on the region (see fig. 2) containing the points (x, y) satisfying
$(1+2y-3x^2)(24x^3-y^2-12xy-6x-4y-1) \ge 0,$
with the weight function $w_{\alpha,\beta}(x,y)$ equal to
$(1+2y-3x^2)^\alpha(24x^3-y^2-12xy-6x-4y-1)^\beta,$
with the parameters α = β = −1/2 for the c–functions, α = β = 1/2 for the s–functions, α = 1/2, β = −1/2 for the ss–functions, and α = −1/2, β = 1/2 for the sl–functions.
figure 2. the region of orthogonality for the case g2.
3.4. cases bn and cn
it is shown in this section that the c–, s–, ss– and sl–functions arising from bn and cn are related to the symmetric and antisymmetric multivariate generalizations of the trigonometric functions [17]. the symmetric cosine functions $\cos^+_\lambda(x)$ and the antisymmetric cosine functions $\cos^-_\lambda(x)$ of the variable $x = (x_1,\ldots,x_n)\in\mathbb{R}^n$ are labelled by the parameter $\lambda = (\lambda_1,\ldots,\lambda_n)\in\mathbb{R}^n$ and are given by the following explicit formulas:
$\cos^+_\lambda(x) \equiv \sum_{\sigma\in S_n}\prod_{k=1}^{n}\cos(\pi\lambda_{\sigma(k)}x_k), \qquad \cos^-_\lambda(x) \equiv \sum_{\sigma\in S_n}\operatorname{sgn}(\sigma)\prod_{k=1}^{n}\cos(\pi\lambda_{\sigma(k)}x_k),$
where $S_n$ denotes the symmetric group consisting of all permutations of the numbers 1, . . . , n, and sgn(σ) is the signature of σ. the symmetric sine functions $\sin^+_\lambda(x)$ and the antisymmetric sine functions $\sin^-_\lambda(x)$ are defined similarly:
$\sin^+_\lambda(x) \equiv \sum_{\sigma\in S_n}\prod_{k=1}^{n}\sin(\pi\lambda_{\sigma(k)}x_k), \qquad \sin^-_\lambda(x) \equiv \sum_{\sigma\in S_n}\operatorname{sgn}(\sigma)\prod_{k=1}^{n}\sin(\pi\lambda_{\sigma(k)}x_k).$
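the four families just defined are straightforward to evaluate by brute force for small n; the following sketch is illustrative only (it sums over all n! permutations).

    import math
    from itertools import permutations

    def _sgn(perm):
        # signature of a permutation, via its inversion count
        inv = sum(1 for i in range(len(perm))
                    for j in range(i + 1, len(perm)) if perm[i] > perm[j])
        return -1 if inv % 2 else 1

    def multivariate_trig(lam, x, fn=math.cos, antisymmetric=False):
        # cos^+/-_lambda(x) (fn=math.cos) or sin^+/-_lambda(x) (fn=math.sin):
        # sum over permutations sigma of products fn(pi * lam[sigma(k)] * x[k])
        n = len(lam)
        total = 0.0
        for perm in permutations(range(n)):
            term = math.prod(fn(math.pi * lam[perm[k]] * x[k]) for k in range(n))
            total += (_sgn(perm) * term) if antisymmetric else term
        return total

    # the antisymmetric families vanish for repeated labels, e.g. lambda = (2, 2)
    print(multivariate_trig((2, 2), (0.3, 0.5), math.sin, antisymmetric=True))  # 0.0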
Similar connections are valid for S^s-functions and S^l-functions:

$$\varphi^s_a(b) = (2i)^n \sin^+_a(2b), \qquad \varphi^l_a(b) = 2^n \cos^-_a(2b).$$

Since the Lie algebras Bn and Cn are dual to each other, we can deduce that the symmetric and antisymmetric generalizations are also connected to the Weyl-orbit functions of Cn. In order to obtain explicit relations, one can proceed by analogy with the case Bn and introduce an orthogonal basis $\{f_1,\dots,f_n\}$ such that for $i = 1,\dots,n-1$

$$\langle f_i,f_i\rangle = \tfrac12, \qquad \alpha_i = f_i - f_{i+1}, \qquad \alpha_n = 2f_n.$$

We denote by $\tilde a_i$ the coordinates of any point $a \in \mathbb{R}^n$ with respect to the basis $\{f_1,\dots,f_n\}$, i.e., $a = (\tilde a_1,\dots,\tilde a_n) = \tilde a_1 f_1 + \dots + \tilde a_n f_n$. The generators $r_i$, $i = 1,\dots,n-1$, and $r_n$ of the Weyl group $W(C_n)$ corresponding to Cn are also given by

$$r_i(\tilde a_1,\dots,\tilde a_i,\tilde a_{i+1},\dots,\tilde a_n) = (\tilde a_1,\dots,\tilde a_{i+1},\tilde a_i,\dots,\tilde a_n), \qquad r_n(\tilde a_1,\dots,\tilde a_{n-1},\tilde a_n) = (\tilde a_1,\dots,\tilde a_{n-1},-\tilde a_n).$$

Thus, proceeding as before, we derive the following:

$$\Phi_a(b) = 2^n \cos^+_a(b), \qquad \varphi_a(b) = (2i)^n \sin^-_a(b), \qquad \varphi^s_a(b) = 2^n \cos^-_a(b), \qquad \varphi^l_a(b) = (2i)^n \sin^+_a(b).$$

Note that the S^s-functions are related to $\cos^-_a$ and the S^l-functions are related to $\sin^+_a$ in the case of Cn, whereas the S^s-functions correspond to $\sin^+_a$ and the S^l-functions correspond to $\cos^-_a$ if we consider the simple Lie algebra Bn. This follows from the fact that the short (long) roots of Cn are dual to the long (short) roots of Bn.

Setting $n = 2$, the construction of the polynomials

$$P^{I,+}_{(k_1,k_2)} \equiv \cos^+_{(k_1,k_2)}, \qquad P^{I,-}_{(k_1,k_2)} \equiv \frac{\cos^-_{(k_1+1,k_2)}}{\cos^-_{(1,0)}}, \qquad P^{III,+}_{(k_1,k_2)} \equiv \frac{\cos^+_{(k_1+\frac12,k_2+\frac12)}}{\cos^+_{(\frac12,\frac12)}}, \qquad P^{III,-}_{(k_1,k_2)} \equiv \frac{\cos^-_{(k_1+\frac32,k_2+\frac12)}}{\cos^-_{(\frac32,\frac12)}},$$

labelled by $k_1 \ge k_2 \ge 0$ and in the variables

$$X_1 \equiv \cos^+_{(1,0)}(x_1,x_2) = \cos(x_1) + \cos(x_2), \qquad X_2 \equiv \cos^+_{(1,1)}(x_1,x_2) = 2\cos(x_1)\cos(x_2),$$

yields special cases of the two-variable polynomials built in [19–21]. These polynomials are constructed by orthogonalization of the monomials $1, u, v, u^2, uv, v^2, \dots$ of generic variables $u, v$ with respect to the weight function

$$(1-u+v)^{\alpha}(1+u+v)^{\beta}(u^2-4v)^{\gamma}$$

in the domain bounded by the curves $1-u+v = 0$, $1+u+v = 0$ and $u^2-4v = 0$; see Fig. 3. The parameters $\alpha, \beta, \gamma$ are required to satisfy the conditions $\alpha,\beta,\gamma > -1$, $\alpha+\gamma+\tfrac32 > 0$ and $\beta+\gamma+\tfrac32 > 0$. The resulting polynomials with the highest term $u^{m-k}v^k$ are denoted by $P^{\alpha,\beta,\gamma}_{m,k}(u,v)$, where $m \ge k \ge 0$. The polynomial variables $X_1$ and $X_2$ are related to the variables $u$ and $v$ of [19–21] by $X_1 = u$, $X_2 = 2v$, and it can easily be shown that:

[Figure 3. The region of orthogonality bounded by two lines and a parabola.]

• $P^{I,+}_{(k_1,k_2)}$ coincides, up to a constant, with $P^{\alpha,\beta,\gamma}_{k_1,k_2}(u,v)$ for $\alpha = \beta = \gamma = -\tfrac12$;
• $P^{III,+}_{(k_1,k_2)}$ coincides, up to a constant, with $P^{\alpha,\beta,\gamma}_{k_1,k_2}(u,v)$ for $\alpha = \gamma = -\tfrac12$ and $\beta = \tfrac12$;
• $P^{I,-}_{(k_1,k_2)}$ coincides, up to a constant, with $P^{\alpha,\beta,\gamma}_{k_1,k_2}(u,v)$ for $\alpha = \beta = -\tfrac12$ and $\gamma = \tfrac12$;
• $P^{III,-}_{(k_1,k_2)}$ coincides, up to a constant, with $P^{\alpha,\beta,\gamma}_{k_1,k_2}(u,v)$ for $\alpha = -\tfrac12$ and $\beta = \gamma = \tfrac12$.
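Before turning to the concluding remarks, a quick consistency check (our own sketch): parametrizing $u = \cos\theta_1 + \cos\theta_2$ and $v = \cos\theta_1\cos\theta_2$ (so that $u = X_1$ and $v = X_2/2$), the three weight factors factorize as $1-u+v = (1-\cos\theta_1)(1-\cos\theta_2)$, $1+u+v = (1+\cos\theta_1)(1+\cos\theta_2)$ and $u^2-4v = (\cos\theta_1-\cos\theta_2)^2$, so the image of the variable map automatically lies in the domain of Fig. 3.

```python
# Check that (u, v) built from two cosines always satisfies the three
# inequalities defining the domain of orthogonality.
import math, random

for _ in range(10_000):
    c1 = math.cos(random.uniform(0.0, math.pi))
    c2 = math.cos(random.uniform(0.0, math.pi))
    u, v = c1 + c2, c1 * c2
    assert 1 - u + v >= -1e-12     # = (1 - c1)(1 - c2)
    assert 1 + u + v >= -1e-12     # = (1 + c1)(1 + c2)
    assert u*u - 4*v >= -1e-12     # = (c1 - c2)^2
```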
4. Concluding remarks
(1.) Symmetric and antisymmetric cosine functions can be used to construct multivariate orthogonal polynomials analogous to the Chebyshev polynomials of the first and third kind. The method of construction is based on decomposition of the products of these functions and is fully described in [7]. To build polynomials analogous to the Chebyshev polynomials of the second and fourth kind, it seems that the symmetric and antisymmetric generalizations of sine functions have to be analysed. This hypothesis is supported by the decomposition of the products of two-dimensional sine functions, which can be found in [11].

(2.) Another approach to generalization of the multivariate polynomials related to the Weyl-orbit functions stems from the shifted orthogonality of the orbit functions developed in [2]. This generalization encompasses shifts of the points of the sets over which the functions are discretely orthogonal, and also shifts of the labelling weights. As a special case it contains, for A1, all four kinds of Chebyshev polynomials. The existence of analogous polynomials obtained through this approach and their relations to already known generalizations deserve further study.

(3.) Besides the methods of polynomial interpolation and numerical integration, the Chebyshev polynomials are connected to other efficient methods in numerical analysis, such as numerical solutions of differential equations, solutions of difference equations, fast transforms and spectral methods. The existence and the form of these methods, connected in a multivariate setting to Weyl-orbit functions, are open problems.

Acknowledgements
The authors gratefully acknowledge the support received for this work from RVO68407700. This work is supported by the European Union through the project Support of Inter-sectoral Mobility and Quality Enhancement of Research Teams at the Czech Technical University in Prague, CZ.1.07/2.3.00/30.0034.

References
[1] N. Bourbaki, Groupes et algèbres de Lie, Chapitres IV, V, VI, Hermann, Paris, 1968.
[2] T. Czyżycki, J. Hrivnák, Generalized discrete orbit function transforms of affine Weyl groups, J. Math. Phys. 55 (2014), 113508, doi:10.1063/1.4901230.
[3] C. F. Dunkl, Y. Xu, Orthogonal Polynomials of Several Variables, Cambridge University Press, Cambridge, 2001, doi:10.1017/cbo9781107786134.
[4] A. Erdélyi, W. Magnus, F. Oberhettinger, F. G. Tricomi, Higher Transcendental Functions, Vol. II, Robert E. Krieger Publishing Co., Inc., Melbourne, Fla., 1981.
[5] D. C. Handscomb, J. C. Mason, Chebyshev Polynomials, Chapman & Hall/CRC, USA, 2003, doi:10.1201/9781420036114.
[6] L. Háková, J. Hrivnák, J. Patera, Four families of Weyl group orbit functions of B3 and C3, J. Math. Phys. 54 (2013), 083501, 19 pp., doi:10.1063/1.4817340.
[7] J. Hrivnák, L. Motlochová, Discrete transforms and orthogonal polynomials of (anti)symmetric multivariate cosine functions, SIAM J. Numer. Anal. 52 (2014), no. 6, 3021–3055, doi:10.1137/140964916.
[8] J. Hrivnák, L. Motlochová, J. Patera, On discretization of tori of compact simple Lie groups II, J. Phys. A 45 (2012), 255201, 18 pp., doi:10.1088/1751-8113/45/25/255201.
[9] J. Hrivnák, L. Motlochová, J. Patera, Cubature formulas of multivariate polynomials arising from symmetric orbit functions, Symmetry 8 (2016), no. 7, 63, doi:10.3390/sym8070063.
[10] J. Hrivnák, J. Patera, On discretization of tori of compact simple Lie groups, J. Phys. A: Math. Theor. 42 (2009), 385208, doi:10.1088/1751-8113/42/38/385208.
[11] J. Hrivnák, L. Motlochová, J. Patera, Two-dimensional symmetric and antisymmetric generalizations of sine functions, J. Math. Phys. 51 (2010), 073509, 13 pp., doi:10.1063/1.3430567.
[12] J. E. Humphreys, Reflection Groups and Coxeter Groups, Cambridge Studies in Advanced Mathematics 29, Cambridge University Press, Cambridge, 1990, doi:10.1017/cbo9780511623646.
[13] J. E. Humphreys, Introduction to Lie Algebras and Representation Theory, Springer-Verlag, New York, 1978, doi:10.1007/978-1-4612-6398-2.
[14] R. Kane, Reflection Groups and Invariant Theory, Springer-Verlag, New York, 2001, doi:10.1007/978-1-4757-3542-0.
[15] A. U. Klimyk, J. Patera, Orbit functions, SIGMA 2 (2006), 006, 60 pp., doi:10.3842/sigma.2006.006.
[16] A. U. Klimyk, J. Patera, Antisymmetric orbit functions, SIGMA 3 (2007), 023, 83 pp., doi:10.3842/sigma.2007.023.
[17] A. Klimyk, J. Patera, (Anti)symmetric multivariate trigonometric functions and corresponding Fourier transforms, J. Math. Phys. 48 (2007), 093504, 24 pp., doi:10.1063/1.2779768.
[18] A. W. Knapp, Lie Groups Beyond an Introduction, Birkhäuser Boston Inc., Boston, MA, 1996.
[19] T. H. Koornwinder, Orthogonal polynomials in two variables which are eigenfunctions of two algebraically independent partial differential operators I–II, Kon. Ned. Akad. Wet. Ser. A 77 (1974), 46–66.
[20] T. H. Koornwinder, Orthogonal polynomials in two variables which are eigenfunctions of two algebraically independent partial differential operators III–IV, Indag. Math. 36 (1974), 357–381.
[21] T. H. Koornwinder, Two-variable analogues of the classical orthogonal polynomials, in: Theory and Application of Special Functions, edited by R. A. Askey, Academic Press, New York (1975), 435–495, doi:10.1016/b978-0-12-064850-4.50015-x.
[22] H. Li, J. Sun, Y. Xu, Discrete Fourier analysis and Chebyshev polynomials with G2 group, SIGMA 8 (2012), 067, 29 pp., doi:10.3842/sigma.2012.067.
[23] H. Li, J. Sun, Y. Xu, Discrete Fourier analysis, cubature and interpolation on a hexagon and a triangle, SIAM J. Numer. Anal. 46 (2008), 1653–1681, doi:10.1137/060671851.
[24] H. Li, Y. Xu, Discrete Fourier analysis on fundamental domain and simplex of A_d lattice in d-variables, J. Fourier Anal. Appl. 16 (2010), 383–433, doi:10.1007/s00041-009-9106-9.
[25] R. V. Moody, L. Motlochová, J. Patera, Gaussian cubature arising from hybrid characters of simple Lie groups, J. Fourier Anal. Appl. 20 (2014), issue 6, 1257–1290, doi:10.1007/s00041-014-9355-0.
[26] R. V. Moody, J. Patera, Cubature formulae for orthogonal polynomials in terms of elements of finite order of compact simple Lie groups, Advances in Applied Mathematics 47 (2011), 509–535, doi:10.1016/j.aam.2010.11.005.
[27] R. V. Moody, J. Patera, Orthogonality within the families of C-, S-, and E-functions of any compact semisimple Lie group, SIGMA 2 (2006), 076, 14 pp., doi:10.3842/sigma.2006.076.
[28] J. Patera, A. Zaratsyan, Discrete and continuous sine transform generalized to semisimple Lie groups of rank two, J. Math. Phys. 47 (2006), 043512, 22 pp., doi:10.1063/1.2191361.
[29] J. Patera, A. Zaratsyan, Discrete and continuous cosine transform generalized to Lie groups SU(3) and G(2), J. Math. Phys. 46 (2005), 113506, 17 pp., doi:10.1063/1.2109707.
[30] J. Patera, A. Zaratsyan, Discrete and continuous cosine transform generalized to Lie groups SU(2) × SU(2) and O(5), J. Math. Phys.
46 (2005), 053514, 25 pp., doi:10.1063/1.1897143.
[31] T. J. Rivlin, The Chebyshev Polynomials, Wiley, New York, 1974.
[32] J.-P. Serre, Complex Semisimple Lie Algebras, Springer Monographs in Mathematics, Springer-Verlag, Berlin, 2001, doi:10.1007/978-3-642-56884-8.
[33] G. Szegő, Orthogonal Polynomials, American Mathematical Society, Providence, R.I., 1975.
[34] N. Ja. Vilenkin, A. U. Klimyk, Representation of Lie Groups and Special Functions, Kluwer Academic Publishers Group, Dordrecht, 1995, doi:10.1007/978-94-017-2885-0.

Acta Polytechnica 56(3):214–223, 2016, doi:10.14311/ap.2016.56.0214
© Czech Technical University in Prague, 2016

Laplace equations, conformal superintegrability and Bôcher contractions

Ernest Kalnins (Department of Mathematics, University of Waikato, Hamilton, New Zealand), Willard Miller, Jr. (School of Mathematics, University of Minnesota, Minneapolis, Minnesota 55455, USA; corresponding author: miller@ima.umn.edu), Eyal Subag (Department of Mathematics, Pennsylvania State University, State College 16802, Pennsylvania, USA)

Abstract. Quantum superintegrable systems are solvable eigenvalue problems. Their solvability is due to symmetry, but the symmetry is often "hidden". The symmetry generators of 2nd order superintegrable systems in 2 dimensions close under commutation to define quadratic algebras, a generalization of Lie algebras. Distinct systems and their algebras are related by geometric limits, induced by generalized Inönü-Wigner Lie algebra contractions of the symmetry algebras of the underlying spaces. These have physical/geometric implications, such as the Askey scheme for hypergeometric orthogonal polynomials. The systems can be best understood by transforming them to Laplace conformally superintegrable systems and using ideas introduced in the 1894 thesis of Bôcher to study separable solutions of the wave equation. The contractions can be subsumed into contractions of the conformal algebra so(4,C) to itself. Here we announce main findings, with detailed classifications in papers submitted and under preparation.

Keywords: conformal superintegrability; contractions; Laplace equations.

1. Introduction
A quantum superintegrable system is an integrable Hamiltonian system on an n-dimensional Riemannian/pseudo-Riemannian manifold with potential, $H = \Delta_n + V$, that admits $2n-1$ algebraically independent partial differential operators $L_j$ commuting with $H$, the maximum possible:

$$[H, L_j] = 0, \qquad j = 1, 2, \dots, 2n-1.$$

Superintegrability captures the properties of quantum Hamiltonian systems that allow the Schrödinger eigenvalue problem (or Helmholtz equation) $H\Psi = E\Psi$ to be solved exactly, analytically and algebraically [1–5].
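The defining commutation property is easy to test symbolically. As a hedged illustration (our own toy example, not a system from this paper), the following sketch verifies superintegrability of the 2D isotropic oscillator written as $H = \partial_x^2 + \partial_y^2 - (x^2+y^2)$, which admits the maximal number $2n-1 = 3$ of algebraically independent symmetries.

```python
# Sketch: verify [H, S] = 0 for three independent symmetries of the
# 2D isotropic oscillator. This toy example is ours, not the paper's.
from sympy import symbols, Function, diff, expand, simplify

x, y = symbols('x y')
f = Function('f')(x, y)   # generic function: operator identities tested on f

H  = lambda g: diff(g, x, 2) + diff(g, y, 2) - (x**2 + y**2)*g
S1 = lambda g: diff(g, x, 2) - x**2*g               # 2nd order symmetry
S2 = lambda g: diff(g, x, 1, y, 1) - x*y*g          # 2nd order symmetry
J  = lambda g: x*diff(g, y) - y*diff(g, x)          # 1st order symmetry

for S in (S1, S2, J):
    assert simplify(expand(H(S(f)) - S(H(f)))) == 0   # [H, S] = 0
```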
A system is of order $k$ if the maximum order of the symmetry operators, other than $H$, is $k$. For $n = 2$, $k = 1, 2$, all systems are known; see, e.g., [6, 7].

We review quickly the facts for free 2nd order superintegrable systems (i.e., no potential, $k = 2$) in the case $n = 2$, $2n-1 = 3$. The complex spaces with Laplace-Beltrami operators admitting at least three 2nd order symmetries were classified by Koenigs in 1896 [8]. They are:
• The two constant curvature spaces (flat space and the complex sphere), with six linearly independent 2nd order symmetries and three 1st order symmetries.
• The four Darboux spaces (one with a parameter), with four 2nd order symmetries and one 1st order symmetry [9]:
$$ds^2 = 4x(dx^2+dy^2), \qquad ds^2 = \frac{x^2+1}{x^2}(dx^2+dy^2), \qquad ds^2 = \frac{e^x+1}{e^{2x}}(dx^2+dy^2), \qquad ds^2 = \frac{2\cos 2x + b}{\sin^2 2x}(dx^2+dy^2).$$
• Eleven 4-parameter Koenigs spaces, with no 1st order symmetries. An example is
$$ds^2 = \Big( \frac{c_1}{x^2+y^2} + \frac{c_2}{x^2} + \frac{c_3}{y^2} + c_4 \Big)(dx^2+dy^2).$$

For 2nd order systems with non-constant potential, $k = 2$, the following is true [6, 7, 10–12]:
• The symmetry operators of each system close under commutation to generate a quadratic algebra, and the irreducible representations of this algebra determine the eigenvalues of $H$ and their multiplicity.
• All the 2nd order superintegrable systems are limiting cases of a single system: the generic 3-parameter potential on the 2-sphere, S9 in our listing [13], or are obtained from these limits by a Stäckel transform (an invertible structure-preserving mapping of superintegrable systems [6]). Analogously, all quadratic symmetry algebras of these systems are limits of that of S9:
$$S9: \quad H = \Delta_2 + \frac{a_1}{s_1^2} + \frac{a_2}{s_2^2} + \frac{a_3}{s_3^2}, \qquad s_1^2+s_2^2+s_3^2 = 1,$$
with $L_1 = (s_2\partial_{s_3} - s_3\partial_{s_2})^2 + \frac{a_3 s_2^2}{s_3^2} + \frac{a_2 s_3^2}{s_2^2}$ and analogous symmetries $L_2$, $L_3$.
• 2nd order superintegrable systems are multiseparable.

Here we consider only the nondegenerate superintegrable systems: those with 4-parameter potentials (the maximum possible),
$$V(\mathbf{x}) = a_1 V^{(1)}(\mathbf{x}) + a_2 V^{(2)}(\mathbf{x}) + a_3 V^{(3)}(\mathbf{x}) + a_4,$$
where $\{V^{(1)}(\mathbf{x}), V^{(2)}(\mathbf{x}), V^{(3)}(\mathbf{x}), 1\}$ is a linearly independent set. For these, the symmetry algebra generated by $H, L_1, L_2$ always closes under commutation and gives the following quadratic algebra structure. Define the 3rd order commutator $R$ by $R = [L_1, L_2]$. Then
$$[L_j, R] = a_1^{(j)} L_1^2 + a_2^{(j)} L_2^2 + a_3^{(j)} H^2 + a_4^{(j)}\{L_1,L_2\} + a_5^{(j)} HL_1 + a_6^{(j)} HL_2 + a_7^{(j)} L_1 + a_8^{(j)} L_2 + a_9^{(j)} H + a_{10}^{(j)},$$
$$R^2 = b_1L_1^3 + b_2L_2^3 + b_3H^3 + b_4\{L_1^2,L_2\} + b_5\{L_1,L_2^2\} + b_6 L_1L_2L_1 + b_7 L_2L_1L_2 + b_8 H\{L_1,L_2\} + b_9 HL_1^2 + b_{10} HL_2^2 + b_{11} H^2L_1 + b_{12} H^2L_2 + b_{13} L_1^2 + b_{14} L_2^2 + b_{15}\{L_1,L_2\} + b_{16} HL_1 + b_{17} HL_2 + b_{18} H^2 + b_{19} L_1 + b_{20} L_2 + b_{21} H + b_{22},$$
where $\{L_1,L_2\} = L_1L_2 + L_2L_1$ and the $a_i^{(j)}, b_k$ are constants.

All 2nd order 2D superintegrable systems with potential and their quadratic algebras are known. There are 44 nondegenerate systems, on a variety of manifolds (just the manifolds classified by Koenigs), but under the Stäckel transform they divide into 6 equivalence classes with representatives on flat space and the 2-sphere [14]. Every 2nd order symmetry operator on a constant curvature space takes the form $L = K + W(\mathbf{x})$, where $K$ is a 2nd order element in the enveloping algebra of o(3,C) or e(2,C). An example is S9, where
$$H = J_1^2 + J_2^2 + J_3^2 + \frac{a_1}{s_1^2} + \frac{a_2}{s_2^2} + \frac{a_3}{s_3^2},$$
where $J_3 = s_1\partial_{s_2} - s_2\partial_{s_1}$ and $J_1, J_2$ are obtained by cyclic permutations of the indices. Basis symmetries are
$$L_1 = J_1^2 + \frac{a_3 s_2^2}{s_3^2} + \frac{a_2 s_3^2}{s_2^2}, \qquad L_2 = J_2^2 + \frac{a_1 s_3^2}{s_1^2} + \frac{a_3 s_1^2}{s_3^2}, \qquad L_3 = J_3^2 + \frac{a_2 s_1^2}{s_2^2} + \frac{a_1 s_2^2}{s_1^2}.$$

Theorem 1. There is a bijection between quadratic algebras generated by 2nd order elements in the enveloping algebra of o(3,C), called free, and 2nd order nondegenerate superintegrable systems on the complex 2-sphere. Similarly, there is a bijection between quadratic algebras generated by 2nd order elements in the enveloping algebra of e(2,C) and 2nd order nondegenerate superintegrable systems on 2D complex flat space.

Remark. This theorem is constructive [15]. Given a free quadratic algebra $\tilde Q$, one can compute the potential $V$ and the symmetries of the quadratic algebra $Q$ of the nondegenerate superintegrable system.

Special functions arise from these systems in two distinct ways: 1) as separable eigenfunctions of the quantum Hamiltonian (second order superintegrable systems are multiseparable [6]); 2) as interbasis expansion coefficients relating distinct separable coordinate eigenbases [16, 17, 19, 20]. Most of the classical special functions in the Digital Library of Mathematical Functions, as well as Wilson polynomials, appear in these ways [21].

1.1. The big picture: contractions and special functions
• Taking coordinate limits starting from the quantum system S9 we can obtain other superintegrable systems.
• These coordinate limits induce limit relations between the special functions associated as eigenfunctions of the superintegrable systems.
• The limits induce contractions of the associated quadratic algebras and, via the models, limit relations between the associated special functions.
• For constant curvature systems the required limits are all induced by Inönü-Wigner-type Lie algebra contractions of o(3,C) and e(2,C) [22–24].
• The Askey scheme for orthogonal functions of hypergeometric type fits nicely into this picture [25].

Lie algebra contractions. Let $(A; [\,;\,]_A)$, $(B; [\,;\,]_B)$ be two complex Lie algebras. We say that $B$ is a contraction of $A$ if for every $\epsilon \in (0,1]$ there exists a linear invertible map $t_\epsilon: B \to A$ such that for every $X, Y \in B$, $\lim_{\epsilon\to 0} t_\epsilon^{-1}[t_\epsilon X, t_\epsilon Y]_A = [X,Y]_B$. Thus, as $\epsilon \to 0$ the 1-parameter family of basis transformations can become singular but the structure constants go to a finite limit.

Contractions of e(2,C) and o(3,C). These are the symmetry Lie algebras of free (zero potential) systems on constant curvature spaces. Their contractions have long since been classified [15]. There are 6 nontrivial contractions of e(2,C) and 4 of o(3,C). They are each induced by coordinate limits.

Example: an Inönü-Wigner contraction of o(3,C). We use the classical realization for o(3,C) acting on the 2-sphere, with basis $J_1 = s_2\partial_{s_3} - s_3\partial_{s_2}$, $J_2 = s_3\partial_{s_1} - s_1\partial_{s_3}$, $J_3 = s_1\partial_{s_2} - s_2\partial_{s_1}$, commutation relations
$$[J_2,J_1] = J_3, \qquad [J_3,J_2] = J_1, \qquad [J_1,J_3] = J_2,$$
and Hamiltonian $H = J_1^2 + J_2^2 + J_3^2$. Here $s_1^2+s_2^2+s_3^2 = 1$. We introduce the basis change
$$\{J'_1, J'_2, J'_3\} = \{\epsilon J_1,\, \epsilon J_2,\, J_3\}, \qquad 0 < \epsilon \le 1,$$
with coordinate implementation $x = s_1/\epsilon$, $y = s_2/\epsilon$, $s_3 \approx 1$. The structure relations become
$$[J'_2,J'_1] = \epsilon^2 J'_3, \qquad [J'_3,J'_2] = J'_1, \qquad [J'_1,J'_3] = J'_2.$$
As $\epsilon \to 0$ these converge to
$$[J'_2,J'_1] = 0, \qquad [J'_3,J'_2] = J'_1, \qquad [J'_1,J'_3] = J'_2,$$
the Lie algebra e(2,C).
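The contracted structure constants can be verified mechanically. A minimal sketch (ours), realizing $J_1, J_2, J_3$ as $3\times3$ matrices and rescaling the basis as above:

```python
# Verify the o(3,C) relations and their epsilon-rescaled contraction to e(2,C).
import sympy as sp

eps = sp.symbols('epsilon', positive=True)
J1 = sp.Matrix([[0, 0, 0], [0, 0, 1], [0, -1, 0]])
J2 = sp.Matrix([[0, 0, -1], [0, 0, 0], [1, 0, 0]])
J3 = sp.Matrix([[0, 1, 0], [-1, 0, 0], [0, 0, 0]])
br = lambda A, B: A*B - B*A

# o(3,C): [J2,J1] = J3, [J3,J2] = J1, [J1,J3] = J2
assert br(J2, J1) == J3 and br(J3, J2) == J1 and br(J1, J3) == J2

J1p, J2p, J3p = eps*J1, eps*J2, J3          # rescaled basis
assert sp.simplify(br(J2p, J1p) - eps**2*J3p) == sp.zeros(3, 3)
assert br(J3p, J2p) == J1p and br(J1p, J3p) == J2p
# As eps -> 0 the first bracket vanishes, giving the e(2,C) relations.
```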
Contractions of quadratic algebras. Just as for Lie algebras, we can define a contraction of a quadratic algebra in terms of 1-parameter families of basis changes in the algebra: as $\epsilon \to 0$ the 1-parameter family of basis transformations becomes singular but the structure constants go to a finite limit [15].

Motivating idea: Lie algebra contractions induce quadratic algebra contractions. For constant curvature spaces we have the following theorem.

Theorem 2 [15]. Every Lie algebra contraction of $A = e(2,C)$ or $A = o(3,C)$ induces a contraction of a free (zero potential) quadratic algebra $\tilde Q$ based on $A$, which in turn induces a contraction of the quadratic algebra $Q$ with potential. This is true for both classical and quantum algebras.

1.2. The problems and the proposed solutions
The various limits of 2nd order superintegrable systems on constant curvature spaces and their applications, such as to the Askey-Wilson scheme, can be classified and understood in terms of generalized Inönü-Wigner contractions [15]. However, there are complications for spaces not of constant curvature. For Darboux spaces the Lie symmetry algebra is only 1-dimensional, so limits must be determined on a case-by-case basis. There is no Lie symmetry algebra at all for Koenigs spaces. Furthermore, there is the issue of finding a more systematic way of classifying the 44 distinct Helmholtz superintegrable eigenvalue systems on different manifolds, and their relations. These issues can be clarified by considering the Helmholtz systems as Laplace equations (with potential) on flat space. This point of view was introduced in the paper [26] and applied in [27] to solve important classification problems in the case $n = 3$. It is the aim of this paper to describe the Laplace equation mechanism and how it can be applied to systematize the classification of Helmholtz superintegrable systems and their relations via limits. The basic idea is that families of (Stäckel-equivalent) Helmholtz superintegrable systems on a variety of manifolds correspond to a single conformally superintegrable Laplace equation on flat space. We exploit this relation in the case $n = 2$, but it generalizes easily to all dimensions $n \ge 2$. The conformal symmetry algebra for Laplace equations with constant potential on flat space is the conformal algebra so(n+2,C).

In his 1894 thesis [28], Bôcher introduced a limit procedure based on the roots of quadratic forms to find families of R-separable solutions of the ordinary (zero potential) flat space Laplace equation $\Delta_n\Psi = 0$ in $n$ dimensions. (That is, he constructed separable solutions of the form $\Psi = R(u)\prod_{j=1}^{n}\Psi_j(u_j)$, where $R$ is a fixed gauge function and $\Psi_j$ depends only on the variable $u_j$ and the separation constants.) We show that his limit procedure can be interpreted as constructing generalized Inönü-Wigner Lie algebra contractions of so(4,C) to itself. We call these Bôcher contractions and show that all of the limits of the Helmholtz systems classified before for $n = 2$ [15] are induced by the larger class of Bôcher contractions. Here we present the main constructions and findings. Detailed proofs and the lengthy classifications are in [32].

2. The Laplace equation
Systems of Laplace type are of the form
$$H\Psi \equiv \Delta_n\Psi + V\Psi = 0.$$
Here $\Delta_n$ is the Laplace-Beltrami operator on a conformally flat nD Riemannian or pseudo-Riemannian manifold. A conformal symmetry of this equation is a partial differential operator $S$ in the variables $\mathbf{x} = (x_1,\dots,x_n)$ such that $[S,H] \equiv SH - HS = R_S H$ for some differential operator $R_S$.
The system is maximally conformally superintegrable (or Laplace superintegrable) for $n \ge 2$ if there are $2n-1$ functionally independent conformal symmetries $S_1,\dots,S_{2n-1}$ with $S_1 = H$ [26]. It is second order conformally superintegrable if each symmetry $S_i$ can be chosen to be a differential operator of at most second order.

Every 2D Riemannian manifold is conformally flat, so we can always find a Cartesian-like coordinate system with coordinates $(x,y) \equiv (x_1,x_2)$ such that the Helmholtz eigenvalue equation takes the form
$$\tilde H\Psi = \Big( \frac{1}{\lambda(x,y)}(\partial_x^2 + \partial_y^2) + \tilde V(\mathbf{x}) \Big)\Psi = E\Psi. \tag{1}$$
However, this equation is equivalent to the flat space Laplace equation
$$H\Psi \equiv \big( \partial_x^2 + \partial_y^2 + V(\mathbf{x}) \big)\Psi = 0, \qquad V(\mathbf{x}) = \lambda(\mathbf{x})(\tilde V(\mathbf{x}) - E). \tag{2}$$
In particular, the symmetries of (1) correspond to the conformal symmetries of (2). Indeed, if $[S,\tilde H] = 0$ then $[S,H] = [S,\lambda(\tilde H - E)] = [S,\lambda](\tilde H - E) = [S,\lambda]\lambda^{-1}H$. Conversely, if $S$ is an $E$-independent conformal symmetry of $H$, we find that $[S,\tilde H] = 0$. Further, the conformal symmetries of the system $(\tilde H - E)\Psi = 0$ are identical with the conformal symmetries of (2). Thus, without loss of generality, we can assume the manifold is flat space with $\lambda \equiv 1$.

The conformal Stäckel transform. Suppose we have a second order conformal superintegrable system
$$H = \partial_{xx} + \partial_{yy} + V(x,y) = 0, \qquad H = H_0 + V,$$
where $V(x,y) = W(x,y) - E\,U(x,y)$ for an arbitrary parameter $E$.

Theorem 3 [26]. The transformed (Helmholtz) system $\tilde H\Psi = E\Psi$, with
$$\tilde H = \frac{1}{U}(\partial_{xx} + \partial_{yy}) + \tilde V,$$
is superintegrable (not just conformally superintegrable), where $\tilde V = W/U$.

There is a similar definition of ordinary Stäckel transforms of Helmholtz superintegrable systems $H\Psi = E\Psi$, which takes superintegrable systems to superintegrable systems, essentially preserving the quadratic algebra structure [29]. Thus any second order conformal Laplace superintegrable system admitting a nonconstant potential $U$ can be Stäckel transformed to a Helmholtz superintegrable system. By choosing all possible special potentials $U$ associated with the fixed Laplace system, we generate the equivalence class of all Helmholtz superintegrable systems obtainable through this process.

Theorem 4. There is a one-to-one relationship between flat space conformally superintegrable Laplace systems with nondegenerate potential and Stäckel equivalence classes of superintegrable Helmholtz systems with nondegenerate potential.

Indeed, for a Stäckel transform induced by the function $U^{(1)}$, we can take the original Helmholtz system to have Hamiltonian
$$H = H_0 + V = H_0 + U^{(1)}\alpha_1 + U^{(2)}\alpha_2 + U^{(3)}\alpha_3 + \alpha_4, \tag{3}$$
where $\{U^{(1)}, U^{(2)}, U^{(3)}, 1\}$ is a basis for the 4-dimensional potential space. A 2nd order symmetry $S$ would have the form $S = S_0 + W^{(1)}\alpha_1 + W^{(2)}\alpha_2 + W^{(3)}\alpha_3$, where $S_0$ is a symmetry of the potential-free Hamiltonian $H_0$. The Stäckel transformed symmetry and Hamiltonian take the form
$$\tilde S = S - W^{(1)}\tilde H, \qquad \tilde H = \frac{1}{U^{(1)}}H_0 + \frac{U^{(1)}\alpha_1 + U^{(2)}\alpha_2 + U^{(3)}\alpha_3 + \alpha_4}{U^{(1)}}.$$
Note that the parameter $\alpha_1$ cancels out of the expression for $\tilde S$; it is replaced by a term $-\alpha_4 W^{(1)}/U^{(1)}$. Now suppose that $\Psi$ is a formal eigenfunction of $H$ (not required to be normalizable): $H\Psi = E\Psi$. Without loss of generality we can absorb the energy eigenvalue into $\alpha_4$, so that $\alpha_4 = -E$ in (3) and, in terms of this redefined $H$, we have $H\Psi = 0$. It follows immediately that $\tilde S\Psi = S\Psi$. Thus, for the 3-parameter system $H'$ and the Stäckel transform $\tilde H'$,
$$H' = H_0 + V' = H_0 + U^{(1)}\alpha_1 + U^{(2)}\alpha_2 + U^{(3)}\alpha_3, \qquad \tilde H' = \frac{1}{U^{(1)}}H_0 + \frac{-U^{(1)}E + U^{(2)}\alpha_2 + U^{(3)}\alpha_3}{U^{(1)}},$$
we have $H'\Psi = E\Psi$ and $\tilde H'\Psi = -\alpha_1\Psi$.
The effect of the Stäckel transform is to replace $\alpha_1$ by $-E$ and $E$ by $-\alpha_1$. Further, $S$ and $\tilde S$ must agree on eigenspaces of $H'$. We know that the symmetry operators of all 2nd order nondegenerate superintegrable systems in 2D generate a quadratic algebra of the form
$$[R,S_j] = f^{(j)}(S_1,S_2,\alpha_1,\alpha_2,\alpha_3,H'), \quad j = 1,2, \qquad R^2 = f^{(3)}(S_1,S_2,\alpha_1,\alpha_2,\alpha_3,H'), \tag{4}$$
where $\{S_1,S_2,H\}$ is a basis for the 2nd order symmetries and $\alpha_1,\alpha_2,\alpha_3$ are the parameters for the potential [6]. It follows from the above considerations that the effect of a Stäckel transform generated by the potential function $U^{(1)}$ is to determine a new superintegrable system with structure
$$[\tilde R,\tilde S_j] = f^{(j)}(\tilde S_1,\tilde S_2,-\tilde H',\alpha_2,\alpha_3,-\alpha_1), \quad j = 1,2, \qquad \tilde R^2 = f^{(3)}(\tilde S_1,\tilde S_2,-\tilde H',\alpha_2,\alpha_3,-\alpha_1). \tag{5}$$
Of course, the switch of $\alpha_1$ and $H'$ is only for illustration; there is a Stäckel transform that replaces any $\alpha_j$ by $-H'$ and $H'$ by $-\alpha_j$, and similar transforms apply to any basis that we choose for the potential space.

Formulas (4) and (5) are just instances of the quadratic algebras of the superintegrable systems belonging to the equivalence class of a single nondegenerate conformally superintegrable Hamiltonian
$$\hat H = \partial_{xx} + \partial_{yy} + \sum_{j=1}^{4} \alpha_j V^{(j)}(x,y). \tag{6}$$
Let $\hat S_1, \hat S_2, \hat H$ be a basis of 2nd order conformal symmetries of $\hat H$. From the above discussion we can conclude the following.

Theorem 5. The symmetries of the 2D nondegenerate conformal superintegrable Hamiltonian $\hat H$ generate a quadratic algebra
$$[\hat R,\hat S_1] = f^{(1)}(\hat S_1,\hat S_2,\alpha_1,\alpha_2,\alpha_3,\alpha_4), \qquad [\hat R,\hat S_2] = f^{(2)}(\hat S_1,\hat S_2,\alpha_1,\alpha_2,\alpha_3,\alpha_4), \qquad \hat R^2 = f^{(3)}(\hat S_1,\hat S_2,\alpha_1,\alpha_2,\alpha_3,\alpha_4), \tag{7}$$
where $\hat R = [\hat S_1,\hat S_2]$ and all identities hold mod($\hat H$). A conformal Stäckel transform generated by the potential $V^{(j)}(x,y)$ yields a nondegenerate Helmholtz superintegrable Hamiltonian $\tilde H$ with quadratic algebra relations identical to (7), except that we make the replacements $\hat S_\ell \to \tilde S_\ell$ for $\ell = 1,2$ and $\alpha_j \to -\tilde H$. These modified relations (7) are now true identities, not mod($\hat H$).

Every 2nd order conformal symmetry is of the form $S = S_0 + W$, where $S_0$ is a 2nd order element of the enveloping algebra of so(4,C). The dimension of this space of 2nd order elements is 21, but there is an 11-dimensional subspace of symmetries congruent to 0 mod($H_0$), where $H_0 = P_1^2 + P_2^2$. Thus, mod($H_0$), the space of 2nd order symmetries is 10-dimensional.

3. The Bôcher method
In his 1894 thesis, Bôcher [28] developed a geometrical method for finding and classifying the R-separable orthogonal coordinate systems for the flat space Laplace equation $\Delta_n\Psi = 0$ in $n$ dimensions. It was based on the conformal symmetry of these equations. The conformal Lie symmetry algebra of the flat space complex Laplacian is so(n+2,C). We will use his ideas for $n = 2$, but applied to the Laplace equation with potential $H\Psi \equiv (\partial_x^2 + \partial_y^2 + V)\Psi = 0$. The so(4,C) conformal symmetry algebra in the case $n = 2$ has the basis
$$P_1 = \partial_x, \quad P_2 = \partial_y, \quad J = x\partial_y - y\partial_x, \quad D = x\partial_x + y\partial_y, \quad K_1 = (x^2-y^2)\partial_x + 2xy\partial_y, \quad K_2 = (y^2-x^2)\partial_y + 2xy\partial_x.$$
Bôcher linearizes this action by introducing tetraspherical coordinates. These are 4 projective complex coordinates $(x_1,x_2,x_3,x_4)$ confined to the null cone $x_1^2+x_2^2+x_3^2+x_4^2 = 0$. They are related to complex Cartesian coordinates $(x,y)$ via
$$x = -\frac{x_1}{x_3+ix_4}, \qquad y = -\frac{x_2}{x_3+ix_4}, \qquad H = \partial_{xx} + \partial_{yy} + \tilde V = (x_3+ix_4)^2\Big( \sum_{k=1}^{4} \partial_{x_k}^2 + V \Big),$$
where $\tilde V = (x_3+ix_4)^2 V$. We define $L_{jk} = x_j\partial_{x_k} - x_k\partial_{x_j}$, $1 \le j,k \le 4$, $j \ne k$, where $L_{jk} = -L_{kj}$. These operators are clearly a basis for so(4,C).
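For concreteness, one convenient section of the null cone over a point $(x,y)$ is the gauge $x_1 = -x$, $x_2 = -y$, $x_3 = \tfrac12(1-x^2-y^2)$, $x_4 = -\tfrac{i}{2}(1+x^2+y^2)$, for which $x_3+ix_4 = 1$. This gauge choice is our own illustration; the defining relations are verified below.

```python
# Check one explicit parametrization of the tetraspherical null cone.
import sympy as sp

x, y = sp.symbols('x y')
x1, x2 = -x, -y
x3 = (1 - x**2 - y**2) / 2
x4 = -sp.I * (1 + x**2 + y**2) / 2

assert sp.simplify(x1**2 + x2**2 + x3**2 + x4**2) == 0   # null cone
assert sp.simplify(x3 + sp.I*x4) == 1                    # gauge: x3 + i*x4 = 1
assert sp.simplify(-x1/(x3 + sp.I*x4) - x) == 0          # recovers x
assert sp.simplify(-x2/(x3 + sp.I*x4) - y) == 0          # recovers y
```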
The generators for flat space conformal symmetries are related to these via
$$P_1 = \partial_x = L_{13} + iL_{14}, \qquad P_2 = \partial_y = L_{23} + iL_{24}, \qquad D = iL_{34}, \qquad J = L_{12}, \qquad K_1 = L_{13} - iL_{14}, \qquad K_2 = L_{23} - iL_{24}.$$

3.1. Relation to separation of variables
Bôcher uses symbols of the form $[n_1,n_2,\dots,n_p]$, where $n_1 + \dots + n_p = 4$, to define coordinate surfaces as follows. Consider the quadratic forms
$$\Omega = x_1^2+x_2^2+x_3^2+x_4^2 = 0, \qquad \Phi = \frac{x_1^2}{\lambda-e_1} + \frac{x_2^2}{\lambda-e_2} + \frac{x_3^2}{\lambda-e_3} + \frac{x_4^2}{\lambda-e_4}.$$
If the parameters $e_j$ are pairwise distinct, the elementary divisors of these two forms are denoted by $[1,1,1,1]$. Given a point in 2D flat space with Cartesian coordinates $(x_0,y_0)$, there corresponds a set of tetraspherical coordinates $(x_1^0,x_2^0,x_3^0,x_4^0)$, unique up to multiplication by a nonzero constant. If we substitute into $\Phi$, we see that there are exactly 2 roots $\lambda = \rho,\mu$ such that $\Phi = 0$. (If $e_4 \to \infty$ these correspond to elliptic coordinates on the 2-sphere.) They are orthogonal with respect to the metric $ds^2 = dx^2+dy^2$ and are R-separable for the Laplace equations $(\partial_x^2+\partial_y^2)\Theta = 0$ or $(\sum_{j=1}^{4}\partial_{x_j}^2)\Theta = 0$.

Example. Consider the potential
$$V_{[1,1,1,1]} = \frac{a_1}{x_1^2} + \frac{a_2}{x_2^2} + \frac{a_3}{x_3^2} + \frac{a_4}{x_4^2}.$$
It is the only potential $V$ such that the equation $(\sum_{j=1}^{4}\partial_{x_j}^2 + V)\Theta = 0$ is R-separable in elliptic coordinates for all choices of the parameters $e_j$. The separation is characterized by 2nd order conformal symmetry operators that are linear in the parameters $e_j$. In particular, the symmetries span a 3-dimensional subspace of symmetries, so the system $H\Theta = (\sum_{j=1}^{4}\partial_{x_j}^2 + V_{[1,1,1,1]})\Theta = 0$ must be conformally superintegrable.

3.2. Bôcher limits
Suppose some of the $e_i$ become equal. To obtain separable coordinates we cannot just set them equal in $\Omega$, $\Phi$ but must take limits; Bôcher develops a calculus to describe this. Thus the process of making $e_1 \to e_2$ is described by the mapping, which in the limit as $\epsilon \to 0$ takes the null cone to the null cone:
$$e_1 = e_2 + \epsilon^2, \qquad x_1 \to \frac{i(x'_1+ix'_2)}{\sqrt2\,\epsilon}, \qquad x_2 \to \frac{x'_1+ix'_2}{\sqrt2\,\epsilon} + \epsilon\,\frac{x'_1-ix'_2}{\sqrt2}, \qquad x_3 \to x'_3, \qquad x_4 \to x'_4.$$
In the limit we have
$$\Omega = x_1'^2 + x_2'^2 + x_3'^2 + x_4'^2 = 0, \qquad \Phi = \frac{(x'_1+ix'_2)^2}{2(\lambda-e_2)^2} + \frac{x_1'^2+x_2'^2}{\lambda-e_2} + \frac{x_3'^2}{\lambda-e_3} + \frac{x_4'^2}{\lambda-e_4},$$
which has elementary divisors $[2,1,1]$; see [30, 31]. In the same way as for $[1,1,1,1]$, these forms define a new set of orthogonal coordinates, R-separable for the Laplace equations. We can show that the coordinate limit induces a contraction of so(4,C) to itself:
$$L'_{12} = L_{12}, \qquad L'_{34} = L_{34},$$
$$L'_{13} = -\frac{i\sqrt2}{\epsilon}(L_{13} - iL_{23}) - \frac{i\epsilon}{\sqrt2}L_{13}, \qquad L'_{23} = -\frac{i\sqrt2}{\epsilon}(L_{13} - iL_{23}) - \frac{\epsilon}{\sqrt2}L_{13},$$
$$L'_{14} = -\frac{i\sqrt2}{\epsilon}(L_{14} - iL_{24}) - \frac{i\epsilon}{\sqrt2}L_{14}, \qquad L'_{24} = -\frac{i\sqrt2}{\epsilon}(L_{14} - iL_{24}) - \frac{\epsilon}{\sqrt2}L_{14}.$$
We call this the Bôcher contraction $[1,1,1,1] \to [2,1,1]$. There are analogous Bôcher contractions of so(4,C) to itself corresponding to limits from $[1,1,1,1]$ to $[2,2]$, $[3,1]$, $[4]$. Similarly, there are Bôcher contractions $[2,1,1] \to [2,2]$, etc.

If we apply the contraction $[1,1,1,1] \to [2,1,1]$ to the potential $V_{[1,1,1,1]}$, we get the finite limit
$$V_{[2,1,1]} = \frac{b_1}{(x'_1+ix'_2)^2} + \frac{b_2(x'_1-ix'_2)}{(x'_1+ix'_2)^3} + \frac{b_3}{x_3'^2} + \frac{b_4}{x_4'^2}, \tag{8}$$
provided the parameters transform as
$$a_1 = -\frac12\Big( \frac{b_1}{\epsilon^2} + \frac{b_2}{2\epsilon^4} \Big), \qquad a_2 = -\frac{b_2}{4\epsilon^4}, \qquad a_3 = b_3, \qquad a_4 = b_4.$$
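The statement that the substitution takes the null cone to the null cone can be confirmed directly: expanding $\Omega$ in the primed coordinates, all singular terms cancel and the error is $O(\epsilon^2)$. A sympy sketch (ours):

```python
# Verify Omega -> x1'^2 + x2'^2 + x3'^2 + x4'^2 + O(eps^2) under the
# Bocher substitution realizing e1 = e2 + eps^2.
import sympy as sp

eps = sp.symbols('epsilon', positive=True)
y1, y2, y3, y4 = sp.symbols('y1 y2 y3 y4')   # stand-ins for x1', ..., x4'
I, r2 = sp.I, sp.sqrt(2)

x1 = I*(y1 + I*y2)/(r2*eps)
x2 = (y1 + I*y2)/(r2*eps) + eps*(y1 - I*y2)/r2
x3, x4 = y3, y4

Omega = sp.expand(x1**2 + x2**2 + x3**2 + x4**2)
target = y1**2 + y2**2 + y3**2 + y4**2
# The eps^{-2} terms cancel exactly; the remainder is eps^2 (y1 - i y2)^2 / 2.
assert sp.simplify(Omega - target - eps**2*(y1 - I*y2)**2/2) == 0
```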
if we had chosen a basis for v[1,1,1,1] specially adapted to this contraction, we could have achieved aj = bj, 1 ≤ j ≤ 4. bôcher contractions obey a composition law: theorem 6. let a: (∆x + va(x))ψ = 0, b : (∆y + vb(y))ψ = 0 c : (∆z + vc(z))ψ = 0, be bôcher superintegrable systems such that a bôchercontracts to b and b bôcher-contracts to c. then there is a one-parameter contraction of a to c. a fundamental advantage in recognizing bôcher’s limit procedure as contractions is that whereas the bôcher limits had a fixed starting and ending point, say [1, 1, 1, 1] → [2, 1, 1], contractions can be applied to any nondegenerate conformally superintegrable system and are guaranteed to result in another nondegenerate conformally superintegrable system. this greatly increases the range of applicability of the limits. 4. the 8 classes of nondegenerate conformally superintegrable systems the possible laplace equations (in tetraspherical coordinates) are ( ∑4 j=1 ∂ 2 xj + v )ψ = 0 with potentials: v[1,1,1,1] = 4∑ j=1 aj x2j , (9) v[2,1,1] = a1 x21 + a2 x22 + a3(x3 − ix4) (x3 + ix4)3 + a4 (x3 + ix4)2 , v[2,2] = a1 (x1 + ix2)2 + a2(x1 − ix2) (x1 + ix2)3 + a3 (x3 + ix4)2 + a4(x3 − ix4) (x3 + ix4)3 , v[3,1] = a1 (x3 + ix4)2 + a2x1 (x3 + ix4)3 + a3(4x12 + x22) (x3 + ix4)4 + a4 x22 , v[4] = a1 (x3 + ix4)2 + a2 x1 + ix2 (x3 + ix4)3 +a3 3(x1 + ix2)2 − 2(x3 + ix4)(x1 − ix2) (x3 + ix4)4 , v[0] = a1 (x3 + ix4)2 + a2x1 + a3x2 (x3 + ix4)3 +a4 x21 + x22 (x3 + ix4)4 , v (1) = a1 1 (x1 + ix2)2 + a2 1 (x3 + ix4)2 +a3 (x3 + ix4) (x1 + ix2)3 + a4 (x3 + ix4)2 (x1 + ix2)4 , v (2) = a1 1 (x3 + ix4)2 + a2 (x1 + ix2) (x3 + ix4)3 +a3 (x1 + ix2)2 (x3 + ix4)4 + a4 (x1 + ix2)3 (x3 + ix4)5 . (the last 3 systems do not correspond to elementary divisors; they appear as bôcher contractions of systems that do correspond to elementary divisors.) each of the 44 helmholtz nondegenerate superintegrable (i.e., 3-parameter) eigenvalue systems is stäckel equivalent to exactly one of these systems. thus, with one caveat, there are exactly 8 equivalence classes of helmholtz systems. the caveat is the singular family of systems with potentials vs = (x3 +ix4)−2h(x1+ix2x3+ix4 ) where h is an arbitrary analytic function except that vs 6= v (1),v (2). this family is unrelated to the other systems. expressed as flat space laplace equations (∂2x +∂2y + ṽ )ψ = 0 in cartesian coordinates, the potentials are ṽ[1,1,1,1] = a1 x2 + a2 y2 + 4a3 (x2 + y2 − 1)2 − 4a4 (x2 + y2 + 1)2 , ṽ[2,1,1] = a1 x2 + a2 y2 −a3(x2 + y2) + a4, ṽ[2,2] = a1 (x + iy)2 + a2(x− iy) (x + iy)3 +a3 −a4(x2 + y2), ṽ[3,1] = a1 −a2x + a3(4x2 + y2) + a4 y2 , ṽ[4] = a1 −a2(x + iy) +a3 ( 3(x + iy)2 + 2(x− iy) ) −a4 ( 4(x2 + y2) + 2(x + iy)3 ) , ṽ[0] = a1 − (a2x + a3y) + a4(x2 + y2), ṽ (1) = a1 (x + iy)2 + a2 − a3 (x + iy)3 + a4 (x + iy)4 , ṽ (2) = a1 + a2(x + iy) + a3(x + iy)2 + a4(x + iy)3. (10) 4.1. summary of bôcher contractions of laplace superintegrable systems table 1 contains a partial list of contractions. the full list is presented in [32]. we have omitted some contractions, such as [3, 1] → [4], because they are consequences of other contractions in the table. 5. helmholtz contractions from bôcher contractions we describe how bôcher contractions of conformal superintegrable systems induce contractions of helmholtz superintegrable systems. we consider the conformal stäckel transforms of the conformal system [1, 1, 1, 1] with potential v[1,1,1,1]. 219 e. kalnins, w. miller, e. 
4.1. Summary of Bôcher contractions of Laplace superintegrable systems
Table 1 contains a partial list of contractions; the full list is presented in [32]. We have omitted some contractions, such as $[3,1] \to [4]$, because they are consequences of other contractions in the table.

Table 1. Bôcher contractions of Laplace superintegrable systems:
• $[1,1,1,1] \to [2,1,1]$: $V_{[1,1,1,1]} \to V_{[2,1,1]}$; $V_{[2,1,1]} \to V_{[2,1,1]}$; $V_{[2,2]} \to V_{[2,2]}$; $V_{[3,1]} \to V^{(1)}$; $V_{[4]} \to V_{[0]}$; $V_{[0]} \to V_{[0]}$; $V^{(1)} \to V^{(1)}$; $V^{(2)} \to V^{(2)}$.
• $[1,1,1,1] \to [2,2]$: $V_{[1,1,1,1]} \to V_{[2,2]}$; $V_{[2,1,1]} \to V_{[2,2]}$; $V_{[2,2]} \to V_{[2,2]}$; $V_{[3,1]} \to V^{(1)}$; $V_{[4]} \to V^{(2)}$; $V_{[0]} \to V_{[0]}$; $V^{(1)} \to V^{(1)}$; $V^{(2)} \to V^{(2)}$.
• $[2,1,1] \to [3,1]$: $V_{[1,1,1,1]} \to V_{[3,1]}$; $V_{[2,1,1]} \to V_{[3,1]}$; $V_{[2,2]} \to V_{[0]}$; $V_{[3,1]} \to V_{[3,1]}$; $V_{[4]} \to V_{[0]}$; $V_{[0]} \to V_{[0]}$; $V^{(1)} \to V^{(2)}$; $V^{(2)} \to V^{(2)}$.
• $[1,1,1,1] \to [4]$: $V_{[1,1,1,1]} \to V_{[4]}$; $V_{[2,1,1]} \to V_{[4]}$; $V_{[2,2]} \to V_{[0]}$; $V_{[3,1]} \to V_{[4]}$; $V_{[4]} \to V_{[0]}$; $V_{[0]} \to V_{[0]}$; $V^{(1)} \to V^{(2)}$; $V^{(2)} \to V^{(2)}$.
• $[2,2] \to [4]$: $V_{[1,1,1,1]} \to V_{[4]}$; $V_{[2,1,1]} \to V_{[4]}$; $V_{[2,2]} \to V_{[4]}$; $V_{[3,1]} \to V^{(2)}$; $V_{[4]} \to V^{(2)}$; $V_{[0]} \to V_{[0]}$; $V^{(1)} \to V^{(2)}$; $V^{(2)} \to V^{(2)}$.
• $[1,1,1,1] \to [3,1]$: $V_{[1,1,1,1]} \to V_{[3,1]}$; $V_{[2,1,1]} \to V_{[3,1]}$; $V_{[2,2]} \to V_{[0]}$; $V_{[3,1]} \to V_{[3,1]}$; $V_{[4]} \to V_{[0]}$; $V_{[0]} \to V_{[0]}$; $V^{(1)} \to V^{(2)}$; $V^{(2)} \to V^{(2)}$.

5. Helmholtz contractions from Bôcher contractions
We describe how Bôcher contractions of conformal superintegrable systems induce contractions of Helmholtz superintegrable systems. We consider the conformal Stäckel transforms of the conformal system $[1,1,1,1]$ with potential $V_{[1,1,1,1]}$. As we show explicitly in [32], the various possibilities are S9 above and 2 more Helmholtz systems on the sphere, S7 and S8, 2 Darboux systems D4B and D4C, and a family of Koenigs systems.

Example 1. Using Cartesian coordinates $x, y$, we consider the $[1,1,1,1]$ Hamiltonian
$$H = \partial_x^2 + \partial_y^2 + \frac{a_1}{x^2} + \frac{a_2}{y^2} + \frac{4a_3}{(x^2+y^2-1)^2} + \frac{4a_4}{(x^2+y^2+1)^2}.$$
Dividing on the left by $1/x^2$, we obtain
$$\hat H = x^2(\partial_x^2 + \partial_y^2) + a_1 + \frac{a_2x^2}{y^2} + \frac{4a_3x^2}{(x^2+y^2-1)^2} - \frac{4a_4x^2}{(x^2+y^2+1)^2},$$
the Stäckel transform corresponding to the case $(A_1,A_2,A_3,A_4) = (1,0,0,0)$. This becomes more transparent if we introduce variables $x = e^{-a}$, $y = r$. The Hamiltonian $\hat H$ can be written as
$$\hat H = \partial_a^2 + e^{-2a}\partial_r^2 + a_1 + \frac{a_2 e^{-2a}}{r^2} + \frac{4a_3}{\big( e^{-a} + e^{a}(r^2-1) \big)^2} - \frac{4a_4}{\big( e^{-a} + e^{a}(r^2+1) \big)^2}.$$
Recalling horospherical coordinates on the complex two-sphere, viz.
$$s_1 = \frac{i}{2}\big( e^{-a} + (r^2+1)e^{a} \big), \qquad s_2 = re^{a}, \qquad s_3 = \frac12\big( e^{-a} + (r^2-1)e^{a} \big),$$
we see that the Hamiltonian $\hat H$ can be written as
$$\hat H = \partial_{s_1}^2 + \partial_{s_2}^2 + \partial_{s_3}^2 + a_1 + \frac{a_2}{s_2^2} + \frac{a_3}{s_3^2} + \frac{a_4}{s_1^2},$$
and this is explicitly the superintegrable system S9.

More generally, let $H$ be the initial Hamiltonian. In terms of tetraspherical coordinates, a general conformal Stäckel transformed potential will take the form
$$V = \frac{\frac{a_1}{x_1^2} + \frac{a_2}{x_2^2} + \frac{a_3}{x_3^2} + \frac{a_4}{x_4^2}}{\frac{A_1}{x_1^2} + \frac{A_2}{x_2^2} + \frac{A_3}{x_3^2} + \frac{A_4}{x_4^2}} = \frac{V_{[1,1,1,1]}}{F(\mathbf{x},\mathbf{A})}, \qquad F(\mathbf{x},\mathbf{A}) = \frac{A_1}{x_1^2} + \frac{A_2}{x_2^2} + \frac{A_3}{x_3^2} + \frac{A_4}{x_4^2},$$
and the transformed Hamiltonian will be $\hat H = \frac{1}{F(\mathbf{x},\mathbf{A})}H$, where the transform is determined by the fixed vector $(A_1,A_2,A_3,A_4)$.

Now we apply the Bôcher contraction $[1,1,1,1] \to [2,1,1]$ to this system. In the limit as $\epsilon \to 0$, the potential $V_{[1,1,1,1]} \to V_{[2,1,1]}$, (8), and $H \to H'$, the $[2,1,1]$ system. Now consider
$$F(\mathbf{x}(\epsilon),\mathbf{A}) = V'(\mathbf{x}',\mathbf{A})\,\epsilon^{\alpha} + O(\epsilon^{\alpha+1}),$$
where the integer exponent $\alpha$ depends upon our choice of $\mathbf{A}$. We will provide the theory to show that the system defined by the Hamiltonian
$$\hat H' = \lim_{\epsilon\to 0}\epsilon^{\alpha}\hat H(\epsilon) = \frac{1}{V'(\mathbf{x}',\mathbf{A})}H'$$
is a superintegrable system that arises from the system $[2,1,1]$ by a conformal Stäckel transform induced by the potential $V'(\mathbf{x}',\mathbf{A})$. Thus the Helmholtz superintegrable system with potential $V = V_{[1,1,1,1]}/F$ contracts to the Helmholtz superintegrable system with potential $V_{[2,1,1]}/V'$. The contraction is induced by a generalized Inönü-Wigner Lie algebra contraction of the conformal algebra so(4,C). Always the $V'$ can be identified with a specialization of the $[2,1,1]$ potential. Thus a conformal Stäckel transform of $[1,1,1,1]$ has been contracted to a conformal Stäckel transform of $[2,1,1]$.

[Figure 1. Relationship between conformal Stäckel transforms and Bôcher contractions.]
The results follow and generalize to all Laplace systems. The basic idea is that the procedure of taking a conformal Stäckel transform of a conformal superintegrable system, followed by a Helmholtz contraction, yields the same result as taking a Bôcher contraction followed by an ordinary Stäckel transform: the diagrams commute. The possible Helmholtz contractions obtainable from these Bôcher contractions number well over 100; they will be classified in another paper. All quadratic algebra contractions are induced by Lie algebra contractions of so(4,C), even those for Darboux and Koenigs spaces.

6. Conclusions and discussion
We have pointed out that the use of Lie algebra contractions based on the symmetry groups of constant curvature spaces to construct quadratic algebra contractions of 2nd order 2D Helmholtz superintegrable systems is incomplete, because it does not satisfactorily account for Darboux and Koenigs spaces, and because, even for constant curvature spaces, there are abstract quadratic algebra contractions that cannot be obtained from the Lie symmetry algebras. However, this gap is filled in when one extends these systems to 2nd order Laplace conformally superintegrable systems with conformal symmetry algebra. Classes of Stäckel-equivalent Helmholtz superintegrable systems are now recognized as corresponding to a single Laplace superintegrable system on flat space with underlying conformal symmetry algebra so(4,C). The conformal Lie algebra contractions are induced by Bôcher limits associated with invariants of quadratic forms. They generalize all of the Helmholtz contractions derived earlier. In particular, contractions of Darboux and Koenigs systems can be described easily. All of the concepts introduced in this paper are clearly also applicable for dimensions $n \ge 3$ [33].

[Figure 2. The bigger picture.]

In papers submitted [32] and under preparation we will: (1.) give a complete detailed classification of 2D nondegenerate 2nd order conformally superintegrable systems and their relation to Bôcher contractions; (2.) present a detailed classification of all Bôcher contractions of 2D nondegenerate 2nd order conformally superintegrable systems; (3.) present tables describing the contractions of nondegenerate 2nd order Helmholtz superintegrable systems and how they are induced by Bôcher contractions; (4.) introduce so(4,C) → e(3,C) contractions of Laplace systems and show how they produce conformally 2nd order superintegrable 2D time-dependent Schrödinger equations.

From Theorem 1 we know that the potentials of all Helmholtz superintegrable systems are completely determined by their free quadratic algebras, i.e., the symmetry algebra that remains when the parameters in the potential are set equal to 0. Thus, for classification purposes, it is enough to classify free abstract quadratic algebras. In a second paper under preparation we will: (1.) apply the Bôcher construction to degenerate (1-parameter) Helmholtz superintegrable systems (which admit a 1st order symmetry); (2.) give a classification of free abstract degenerate quadratic algebras and identify which of those correspond to free 2nd order superintegrable systems; (3.) classify abstract contractions of degenerate quadratic algebras and identify which of those correspond to geometric contractions of Helmholtz superintegrable systems;
(4.) classify free abstract nondegenerate quadratic algebras and identify those corresponding to free nondegenerate Helmholtz 2nd order superintegrable systems; (5.) classify the abstract contractions of nondegenerate quadratic algebras.

We note that by taking contractions step-by-step from a model of the S9 quadratic algebra we can recover the Askey scheme [25]. However, the contraction method is more general. It applies to all special functions that arise from the quantum systems via separation of variables, not just polynomials of hypergeometric type, and it extends to higher dimensions. The functions in the Askey scheme are just those hypergeometric polynomials that arise as the expansion coefficients relating two separable eigenbases that are both of hypergeometric type. Thus, there are some contractions which do not fit in the Askey scheme, since the physical system fails to have such a pair of separable eigenbases. In a third paper under preparation we will analyze the Laplace 2nd order conformally superintegrable systems, determine which of them are exactly solvable or quasi-exactly solvable, and identify the spaces of polynomials that arise. Again, multiple Helmholtz superintegrable systems will correspond to a single Laplace system. This will enable us to apply our results to characterize polynomial eigenfunctions not of Askey type and their limits.

Acknowledgements
This work was partially supported by a grant from the Simons Foundation (# 208754 to Willard Miller, Jr).

References
[1] N. W. Evans, Super-integrability of the Winternitz system, Phys. Lett. A 147, 483–486 (1990), doi:10.1016/0375-9601(90)90611-q.
[2] P. Tempesta, A. Turbiner, P. Winternitz, Exact solvability of superintegrable systems, J. Math. Phys. 42, 4248–4257 (2001), doi:10.1063/1.1386927.
[3] Superintegrability in Classical and Quantum Systems, P. Tempesta, P. Winternitz, W. Miller, G. Pogosyan, editors, AMS, Vol. 37, 2005, ISBN-13: 978-0-8218-3329-2.
[4] A. P. Fordy, Quantum super-integrable systems as exactly solvable models, SIGMA 3, 025 (2007), doi:10.3842/sigma.2007.025.
[5] W. Miller Jr., S. Post, P. Winternitz, Classical and quantum superintegrability with applications, J. Phys. A: Math. Theor. 46, 423001 (2013).
[6] E. G. Kalnins, J. M. Kress, W. Miller Jr., Second order superintegrable systems in conformally flat spaces. I: 2D classical structure theory, J. Math. Phys. 46, 053509 (2005); II: The classical 2D Stäckel transform, J. Math. Phys. 46, 053510 (2005); III: 3D classical structure theory, J. Math. Phys. 46, 103507 (2005); IV: The classical 3D Stäckel transform and 3D classification theory, J. Math. Phys. 47, 043514 (2006); V: 2D and 3D quantum systems, J. Math. Phys. 47, 093501 (2006); Nondegenerate 2D complex Euclidean superintegrable systems and algebraic varieties, J. Phys. A: Math. Theor. 40, 3399–3411 (2007), doi:10.1088/1751-8113/40/13/008.
[7] C. Daskaloyannis, Y. Tanoudis, Quantum superintegrable systems with quadratic integrals on a two dimensional manifold, J. Math. Phys. 48, 072108 (2007).
[8] G. Koenigs, Sur les géodésiques a intégrales quadratiques, a note appearing in "Leçons sur la théorie générale des surfaces", G. Darboux, Vol. 4, 368–404, 1896; Chelsea Publishing, 1972.
[9] E. G. Kalnins, J. M. Kress, W. Miller Jr., P. Winternitz, Superintegrable systems in Darboux spaces, J. Math. Phys. 44, 5811–5848 (2003), doi:10.1063/1.1619580.
[10] Ya. I. Granovskii, A. S. Zhedanov, I.
M. Lutsenko, Quadratic algebras and dynamics in curved spaces. I. Oscillator, Theoret. and Math. Phys. 91, 474–480 (1992), doi:10.1007/bf01018846; Quadratic algebras and dynamics in curved spaces. II. The Kepler problem, Theoret. and Math. Phys. 91, 604–612 (1992), doi:10.1007/bf01017335.
[11] D. Bonatsos, C. Daskaloyannis, K. Kokkotas, Deformed oscillator algebras for two-dimensional quantum superintegrable systems, Phys. Rev. A 50, 3700–3709 (1994), doi:10.1103/physreva.50.3700.
[12] P. Letourneau, L. Vinet, Superintegrable systems: polynomial algebras and quasi-exactly solvable Hamiltonians, Ann. Phys. 243, 144–168 (1995), doi:10.1006/aphy.1995.1094.
[13] E. G. Kalnins, J. M. Kress, W. Miller Jr., G. S. Pogosyan, Completeness of superintegrability in two-dimensional constant curvature spaces, J. Phys. A: Math. Gen. 34, 4705–4720 (2001), doi:10.1088/0305-4470/34/22/311.
[14] J. M. Kress, Equivalence of superintegrable systems in two dimensions, Phys. Atomic Nuclei 70, 560–566 (2007).
[15] E. G. Kalnins, W. Miller Jr., Quadratic algebra contractions and 2nd order superintegrable systems, Anal. Appl. 12, 583–612 (2014), doi:10.1142/s0219530514500377.
[16] E. G. Kalnins, W. Miller Jr., S. Post, Wilson polynomials and the generic superintegrable system on the 2-sphere, J. Phys. A: Math. Theor. 40, 11525–11538 (2007), doi:10.1088/1751-8113/40/38/005.
[17] E. G. Kalnins, W. Miller Jr., S. Post, Models for quadratic algebras associated with second order superintegrable systems, SIGMA 4, 008, 21 pages (2008), arXiv:0801.2848, doi:10.3842/sigma.2008.008.
[18] E. G. Kalnins, W. Miller Jr., S. Post, Two-variable Wilson polynomials and the generic superintegrable system on the 3-sphere, SIGMA 7, 051, 26 pages (2011), doi:10.3842/sigma.2011.051.
[19] S. Post, Models of quadratic algebras generated by superintegrable systems in 2D, SIGMA 7, 036, 20 pages (2011), arXiv:1104.0734, doi:10.3842/sigma.2011.036.
[20] Q. Li, W. Miller Jr., Wilson polynomials/functions and intertwining operators for the generic quantum superintegrable system on the 2-sphere, J. Phys.: Conf. Ser. 597, 012059 (2015), http://iopscience.iop.org/1742-6596/597/1/012059.
[21] Digital Library of Mathematical Functions, http://dlmf.nist.gov.
[22] E. Inönü, E. P. Wigner, On the contraction of groups and their representations, Proc. Nat. Acad. Sci. (US) 39, 510–524 (1953), doi:10.1073/pnas.39.6.510.
[23] E. Weimar-Woods, The three-dimensional real Lie algebras and their contractions, J. Math. Phys. 32, 2028–2033 (1991), doi:10.1063/1.529222.
[24] M. Nesterenko, R. Popovych, Contractions of low-dimensional Lie algebras, J. Math. Phys. 47, 123515 (2006).
[25] E. G. Kalnins, W. Miller
Jr., S. Post, Contractions of 2D 2nd order quantum superintegrable systems and the Askey scheme for hypergeometric orthogonal polynomials, SIGMA 9, 057, 28 pages (2013), doi:10.3842/sigma.2013.057.
[26] E. G. Kalnins, J. M. Kress, W. Miller Jr., S. Post, Laplace-type equations as conformal superintegrable systems, Advances in Applied Mathematics 46, 396–416 (2011).
[27] J. J. Capel, J. M. Kress, Invariant classification of second-order conformally flat superintegrable systems, J. Phys. A: Math. Theor. 47, 495202 (2014).
[28] M. Bôcher, Ueber die Reihenentwickelungen der Potentialtheorie, B. G. Teubner, Leipzig, 1894.
[29] E. G. Kalnins, W. Miller Jr., S. Post, Coupling constant metamorphosis and Nth order symmetries in classical and quantum mechanics, J. Phys. A: Math. Theor. 43, 035202, 20 pages (2010), doi:10.1088/1751-8113/43/3/035202.
[30] E. G. Kalnins, W. Miller Jr., G. J. Reid, Separation of variables for complex Riemannian spaces of constant curvature. I. Orthogonal separable coordinates for S_nC and E_nC, Proc. R. Soc. Lond. A 394, 183–206 (1984), doi:10.1098/rspa.1984.0075.
[31] T. J. A. Bromwich, Quadratic Forms and Their Classification by Means of Invariant Factors, Cambridge Tract No. 3, Cambridge University Press, 1906; reprint Hafner, New York, 1971.
[32] E. G. Kalnins, W. Miller Jr., E. Subag, Bôcher contractions of conformally superintegrable Laplace equations (submitted), arXiv:1512.09315, 2016; E. G. Kalnins, W. Miller Jr., E. Subag, Bôcher contractions of conformally superintegrable Laplace equations: detailed computations, arXiv:1601.02876, 2016.
[33] J. J. Capel, J. M. Kress, S. Post, Invariant classification and limits of maximally superintegrable systems in 3D, SIGMA 11, 038, 17 pages (2015), arXiv:1501.06601, doi:10.3842/sigma.2015.038.

Acta Polytechnica Vol. 41 No. 6/2001

Power characteristics of a screw agitator in a tube

F. Rieger

Abstract: Screw agitators rotating in tubes are very efficient tools for mixing and pumping viscous liquids. The power characteristic of the agitator-tube assembly must be known to enable its power consumption in a given configuration to be calculated. The dimensionless power characteristic is described by Eq. (6). An estimate of power consumption from the power characteristic is schematically shown in Fig. 1. The dependence of the coefficients in Eq. (6) on the Reynolds number is shown in Fig. 5. The power characteristics for selected Reynolds number values are shown in Figs. 6–9.

Keywords: power, power characteristic, screw agitator, pump.

1 Introduction
Screw agitators rotating in tubes are very efficient tools for mixing and pumping viscous liquids. They are also suitable for cases where the viscosity of the liquid changes during operation. The power characteristic of the agitator in the tube must be known to enable its power consumption in a given configuration to be calculated. An estimate of power consumption P from the power characteristic is schematically shown in Fig. 1, where the specific energy E, for which the power is determined, is given by the intersection of the pumping characteristic of the screw and the hydraulic characteristic of the system (dependencies of flow rate Q on specific energy).

[Fig. 1: Estimation of power consumption from operating characteristics (creeping flow region); curves: agitator power characteristic, agitator pumping characteristic, hydraulic characteristic of the agitating system.]
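As a numerical illustration of the operating-point construction in Fig. 1 (a sketch of ours with invented characteristic curves, not measured data), one intersects the pumping and hydraulic characteristics to find E, then reads the power characteristic:

```python
# Hypothetical smooth characteristics; all coefficients are made up.
from scipy.optimize import brentq

q_pump = lambda e: 10.0 - 2.0 * e   # screw pumping characteristic Q(E)
q_sys  = lambda e: 1.5 * e          # hydraulic characteristic of the system
power  = lambda e: 40.0 + 25.0 * e  # linear power characteristic, cf. Eq. (6)

e_op = brentq(lambda e: q_pump(e) - q_sys(e), 0.0, 5.0)   # intersection
print(f"E = {e_op:.3f}, Q = {q_sys(e_op):.3f}, P = {power(e_op):.1f}")
```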
The information available in the literature on the power consumption of screws concerns mainly the creeping flow regime (screws in extruders). However, screw agitators are often used in transition and eventually turbulent regimes. A method for calculating the power characteristics of screw agitators in a creeping flow regime was proposed in [1]. The power characteristics at higher Reynolds number values must be determined experimentally. An experimental method based on measurements in several configurations was reported previously [2]. However, it should be noted that it is difficult to keep the same geometrical parameters (especially the clearance between tube and screw) in different configurations. Therefore, a new method based on measurement in a single configuration was used [3], [4].

2 Theoretical
The power characteristic is the dependence of power consumption P on specific energy E. Applying inspection analysis of the governing equations, the following relationship for the dimensionless power characteristic was proposed in [2]:

$$P^* = P^*(E^*, \mathrm{Re}) \tag{1}$$

where the dimensionless power is

$$P^* = \frac{P}{\eta n^2 D^3} \tag{2}$$

the dimensionless specific energy is

$$E^* = \frac{E}{\nu n} \tag{3}$$

and the Reynolds number is

$$\mathrm{Re} = \frac{nD^2}{\nu} \tag{4}$$

In the creeping flow regime, Eq. (1) reduces to

$$P^* = P^*(E^*) \tag{5}$$

and the dimensionless power characteristic is linear:

$$P^* = c + aE^* \tag{6}$$

At higher Reynolds number values the power characteristics are generally non-linear. However, the dependence of dimensionless power on dimensionless specific energy is not very strong and decreases with increasing Reynolds number, as shown in Fig. 2, based on the experimental results presented in [3].

[Fig. 2: Power characteristics at different Reynolds number values (legend: Re = 1.9–11.1, 48.7–54.2, 152–155.8, 199.1–202, 358.4–363.5, 514.5–517.7).]
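Evaluating the dimensionless groups (2)–(4) is straightforward; the sketch below (our own, with illustrative property values, and using the definitions as reconstructed above) converts one physical operating state into the (Re, P*, E*) coordinates of the characteristics:

```python
# Illustrative inputs only: a viscous liquid stirred by a D = 0.1 m screw.
rho, nu = 1200.0, 1.0e-3      # density [kg/m^3], kinematic viscosity [m^2/s]
eta = rho * nu                # dynamic viscosity [Pa.s]
n, D = 5.0, 0.1               # agitator speed [1/s], agitator diameter [m]
P, E = 8.0, 0.4               # power [W], specific energy [J/kg]

Re     = n * D**2 / nu            # Eq. (4): Re = 50, transition region
P_star = P / (eta * n**2 * D**3)  # Eq. (2)
E_star = E / (nu * n)             # Eq. (3)
print(f"Re = {Re:.0f}, P* = {P_star:.1f}, E* = {E_star:.1f}")
```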
inserting in eq. (6) we can write

p*_min = c + a e*_min (7)

p*_max = c + a e*_max (8)

the value of the dimensionless minimum specific energy e*_min was calculated (in the creeping flow region – see [5], in the turbulent region – see [6]), and the value of the dimensionless maximum specific energy e*_max was determined experimentally (see [7] or [3]). subtracting the equations above we obtain the following equation for a

a = (p*_max − p*_min) / (e*_max − e*_min) (9)

and coefficient c can be expressed from (8)

c = p*_max − a e*_max (10)
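a direct numerical transcription of eqs. (7)–(10) may be sketched as follows; the asymptotic readings fed in are invented for illustration only.

```python
# coefficients of the linear power characteristic p* = c + a*e* from the two
# asymptotic points (e*_min, p*_min) and (e*_max, p*_max); cf. eqs. (7)-(10).
def fit_power_characteristic(e_min, p_min, e_max, p_max):
    a = (p_max - p_min) / (e_max - e_min)   # eq. (9)
    c = p_max - a * e_max                   # eq. (10)
    return a, c

# hypothetical asymptotic readings at one reynolds number value
a, c = fit_power_characteristic(e_min=50.0, p_min=120.0,
                                e_max=2000.0, p_max=900.0)
print(f"a = {a:.4f}, c = {c:.1f}")
print(f"p* at e* = 500: {c + a * 500.0:.1f}")   # straight line of eq. (6)
```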
3 results and discussion

the procedure proposed above was applied for a screw agitator with the following parameters: pitch s = 2D, root diameter d = 0.2D and length l = 1.4D – see fig. 3. the asymptotic power characteristics for this agitator, obtained experimentally and reported in [4], are presented in fig. 4. the characteristic at minimum specific energy was measured in a relatively large vessel with low hydraulic resistance. the characteristic at maximum specific energy (zero pumping capacity) was attained by covering the draught tube. using the values of dimensionless power p* = po·re obtained from fig. 4, the values of coefficients a and c were calculated from eqs. (9) and (10). the dependencies of a, c and c/re on the reynolds number are presented in fig. 5. from this figure it follows that in the creeping flow regime the values a and c are constant, and in the turbulent region the values a and c/re are constant. for this reason the equation of the dimensionless power characteristic (6) in a turbulent region transforms to

po = c_t + a e⁺ (11)

where c_t = c/re. equation (11) is recommended for standard pumps used for pumping low viscosity liquids. fig. 5 also shows that the value of a in a creeping flow regime is greater than the corresponding value in a turbulent region, which means that the dependence of dimensionless power on specific energy (hydraulic resistance of the system) is less pronounced in a turbulent region. using the coefficients depicted in fig. 5, the power characteristics for selected values of reynolds number can be obtained. power characteristics in the form of eq. (6) for a creeping flow region and re = 100 are shown in fig. 6. power characteristics in the form of eq. (11) for re = 1000 and a turbulent region are shown in fig. 7. using the pumping characteristics presented in [8], the power characteristics can be expressed in an alternative form, more frequently used in the literature on pumps (see e.g. [9]), with the dimensionless flow rate q* on the axis of the abscissas, as shown in figs. 8 and 9.

fig. 3: screw agitator used in experiments
fig. 4: asymptotic power curves of the screw agitator
fig. 5: dependence of dimensionless coefficients on reynolds number
fig. 6: power characteristics at small reynolds number values
fig. 7: power characteristics at high reynolds number values
fig. 8: power characteristics at small reynolds number values
fig. 9: power characteristics at high reynolds number values

4 acknowledgement

this research was supported by the grant agency of the czech republic under grant no. 101/99/0638.

5 symbols

a, c coefficients in eq. (6)
c_t coefficient in eq. (11)
d screw root diameter
D agitator diameter
e specific energy per unit mass
e* dimensionless specific energy, e* = e ρ / (η n)
e⁺ dimensionless specific energy, e⁺ = e / (n² d²)
l length of screw
n agitator speed
p power
p* dimensionless power, p* = p / (η n² d³)
po power number, po = p / (ρ n³ d⁵)
q volumetric flow rate
q* dimensionless volumetric flow rate, q* = q / (n d³)
re reynolds number, re = n d² ρ / η
s pitch of screw
η dynamic viscosity
ν kinematic viscosity
ρ density

references

[1] rieger, f.: the power input of screw rotors and agitators. collect. czech. chem. commun. 54, 1989, pp. 1575–1588
[2] rieger, f., weiserová, h.: determination of operating characteristics of agitators in tubes. chem. eng. technol. 16, 1993, pp. 172–179
[3] jirout, t., brož, j., rieger, f.: measurement of power characteristics of screw agitators in a tube. proceedings of czech mixing conference, techmix, brno 2000 (cd-rom, in czech)
[4] rieger, f.: measurement of asymptotic power characteristics of screw agitators in a tube. chisa conference, czech society of chemical engineering, srní 1999 (cd-rom, in czech)
[5] rieger, f.: power characteristics of screw agitators in creeping flow region. proceedings of sschi conference, slovak society of chemical engineering, tatranske matliare 2000 (cd-rom, in czech)
[6] blenke, h., bohner, k., hirner, w.: druckverlust bei der 180° – strömungsumlenkung im schlaufenreaktor. verfahrenstechnik 3, 1969, pp. 444–452
[7] rieger, f.: pumping characteristics of a screw agitator in a tube. chem. eng. j. 66, 1997, pp. 73–77
[8] sedláček, l., rieger, f.: influence of geometry on pumping characteristics of screw agitators. collect. czech. chem. commun. 62, 1997, pp. 1871–1878
[9] bláha, j., brada, k.: pump handbook. prague, čvut publishing house 1997 (in czech)

prof. ing. františek rieger, drsc.
e-mail: rieger@fsid.cvut.cz
department of process engineering
czech technical university in prague
faculty of mechanical engineering
technická 4, 166 07 praha 6, czech republic

fatigue crack initiation and early growth in glare 3 fiber-metal laminate subjected to mixed tensile and bending loading

a. chlupová, j. heger, a. vašek

a special open-hole sheet specimen of glare fiber-metal laminate was used to simulate mixed loading similar to the loading of a riveted hole in a real fuselage skin structure. the effect of cyclic tension and secondary bending loading was studied. the notch region was observed through a microscope in order to detect the first fatigue crack initiation during fatigue loading. the number of cycles prior to crack initiation was measured, and the crack growth rate in the surface layer of the laminate was evaluated. the specimens were subjected to fatigue damage investigation in the inner layers of the laminate after termination of the test. a significant effect of secondary bending on fatigue crack initiation and early crack growth was found. the experimental results are discussed in terms of local stress-strain conditions in the notch region evaluated by means of finite element calculation.

keywords: laminate, composite, fatigue, mixed loading, finite element modeling.

1 notation

a crack length
a_f crack length at test termination
e distance filler
k bending factor
n number of cycles
n_in number of cycles prior to crack initiation
r stress cycle ratio
v fatigue crack growth rate
ε_pl,l local plastic strain in the notch root of a specimen
σ_bend induced bending stress
σ_layer local stress in the individual metal layer of a laminate
σ_max maximum applied tension stress

2 introduction

fiber-metal laminates are a family of structural aerospace sheet materials, as shown in vogelesang et al [1], that consist of thin metal sheets alternating with adhesive layers reinforced by high strength fibers. glare 3 fiber-metal laminate is composed of aluminium alloy layers and pre-impregnated layers of cross-plied high strength glass fibers in an epoxy-resin matrix. the cross-ply structure of the fibers makes this material suitable for applications subjected to bi-axial loading. glare 3 has been designed as an excellent material for the pressurized fuselage skin structure of large aircraft of the future.
glare 3 laminate is currently being evaluated and tested in the airbus a340 fuselage section, and is considered a promising candidate for the new 21st century airbus a3xx, as mentioned by vlot et al [2]. since riveting of large sheets is still the most common joining technology for a fuselage structure, rows of rivet holes receive the highest attention. the holes locally induce stress concentration and act as possible fatigue crack initiation sites. the pressurization of the fuselage gives hoop stress, which is distributed over the joints. the load in the riveted lap joints is a combination of tension, bending, rivet load, clamp up and friction, as was shown by müller [3]. in this study, attention is paid to the effect of mixed tension and bending loading on fatigue crack initiation and early growth in an open-hole sheet specimen.

3 experiments

the glare 3 laminate used in the tests had been stacked with three 0.3 mm thick 2024-t3 aluminium alloy sheets alternating with two 0.25 mm thick epoxy resin layers reinforced by cross-plied continuous glass fibers (0°/90°) – see figure 1. the laminate plate of total thickness 1.35 mm was cured in an autoclave at 120 °c under a pressure of 5 bar for 90 minutes. a special asymmetric test configuration, as suggested by nam et al [4], was adopted in order to introduce cyclic tension and bending in the sheet specimen loaded in a uniaxial fatigue machine. for this purpose both ends of the sheet specimen were flat bonded on in-axis sheet extensions made of the same material. an open hole of diameter 4.8 mm was drilled in the centre of the 40 mm wide and 100 mm long sheet specimens. the specimen surface in the vicinity of the central hole was mechanically polished in order to make fatigue cracks visible from their initiation. the specimens were cyclically loaded in the mts 880 servo-hydraulic fatigue-testing machine, with the controlled load cycle of maximum stress σ_max varying between 90 mpa and 150 mpa in individual tests. the stress cycle ratio was kept at r = 0.04. the eccentricity of the specimen was preset by the different thickness of the filler inserted in the bonded lap joints. this enabled the bending moment to vary in different specimens. the geometry and dimensions of the specimen and the thickness of the fillers enabled the bending factor k = σ_bend/σ_max to be adjusted in the interval between 0.5 and 1.5. cyclic loading under pure tension (k = 0) was also applied. the specimen geometry is shown in figure 1.
we performed finite element stress and strain modelling in glare 3 laminate sheet subjected to mixed tensile and secondary bending. each prepreg layer was modelled by two element layers, and each aluminium layer was modelled by four element layers. a denser mesh of finite elements was used in the vicinity of the rivet hole (figure 2) to give a true picture of the real stress and strain concentration effect. meshing was performed by the "mapped meshing" technique, using six-sided solid elements exclusively. even though double symmetry was used, the total number of elements exceeded 20,000 and the total number of nodes 23,000. a special fe procedure was developed in order to take into account both the stress-strain non-linear response in the metal layers and the geometrical non-linearity of the large specimen deflection due to the bending moment, as suggested by heger [5]. the ansys finite element system was adopted for the solution. the notch region was observed through a light microscope in order to detect the initiation of the first fatigue crack and to measure the number of cycles prior to crack initiation during loading. the growth of the fatigue cracks was periodically measured after initiation on the specimen surface with the aim of evaluating the crack growth rate in the surface metal layer of the laminate. the loading was terminated when the surface crack, measured from the hole edge, was about 10 mm in length. the specimens were subjected to a destructive investigation of the fatigue damage in the inner layer of the laminate after termination of the test. the surface metal layers were electrochemically milled off and the adhesive layers with glass fibers were removed mechanically in the procedure. then the disclosed surface of the inner metal layer was electrochemically polished and the fatigue cracks in this originally hidden layer could be revealed.

fig. 1: specimen geometry and laminate composition of glare 3 (distance filler: 1) e = 0 mm, 2) e = 1 mm)
fig. 2: detail of the fe mesh in the vicinity of the rivet hole

4 results

first fatigue cracks were initiated in the surface metal layer that was under the greatest bending stress. the number of cycles prior to crack initiation versus maximum applied stress σ_max is plotted in figure 3. the different positions of the curves confirm that the additional bending stress reduces the initiation period in glare 3 laminate. the effect is more obvious at low cyclic stress, when the bending/tension rate k is higher.

fig. 3: graph of the number of cycles prior to initiation n_in versus maximum applied stress σ_max for different thicknesses of the distance filler

the crack growth of surface cracks was measured during the experimental loading. figure 4 shows graphs of the crack length a, measured from the notch root, dependent on the number of cycles n. the experimental data in the figure were acquired in tests with σ_max = 125 mpa and with different bending stress σ_bend = 0, 80 and 137 mpa (k = 0, 0.64 and 1.1). as shown in figure 4, the crack growth data can fit a linear dependence.
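because the a(n) records are close to straight lines, the growth rate can be extracted by an ordinary least-squares fit; in the sketch below the data points are invented stand-ins for the measured records.

```python
# least-squares estimate of the constant growth rate v = da/dn from an
# approximately linear a(n) record; the data points are invented stand-ins.
import numpy as np

cycles = np.array([1.0e4, 2.0e4, 3.0e4, 4.0e4, 5.0e4])   # n [cycles]
length = np.array([0.9, 2.1, 3.0, 4.1, 5.0])             # a [mm]

v, a0 = np.polyfit(cycles, length, 1)    # slope = growth rate, intercept
print(f"crack growth rate v = {v * 1e3:.3f} um/cycle")
```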
this implies that the crack growth rate v = da/dn was approximately constant in all performed tests. the length of the cracks in the inner metal layer was measured after termination of the test. a significant difference between final crack lengths was found comparing cracks in the individual metal layers. the difference was affected by the bending factor k. a higher k resulted in more different crack lengths on opposite surfaces of the laminate plate. figure 5 shows an example of crack lengths in all three metal layers of laminate specimens subjected to σ_max = 125 mpa with varying bending stress σ_bend = 0, 80 and 137 mpa (k = 0, 0.64 and 1.1). the relative crack length a/a_f (where a_f is the crack length in the upper layer at test termination) was plotted for better comparison.

5 discussion

first fatigue cracks were found in the surface metal layer that was subjected to the maximum resulting stress. cracks in the inner metal layer occurred later, and the latest fatigue cracks were initiated in the surface layer that was subjected to the minimum resulting stress. in accordance with a recent investigation of fatigue crack initiation mechanisms described in polák [6], the local plastic strain ε_pl,l in the notch root was calculated by the finite element method in order to find a parameter determining the number of cycles n_in necessary for the initiation of fatigue cracks. the experimental data of n_in from all tests vs. local plastic strain are plotted in figure 6, where a power-law relation fits the curve:

ε_pl,l = 1745 n_in^(−1.313) (1)

although the experimental data in figure 6 were taken only from the surface layers, relation (1) can be used for predicting n_in also in the inner layer when the local plastic strain in the layer ε_pl,l is known. the investigation revealed that the crack growth rate in the performed tests was approximately constant. the crack growth in this kind of material is determined predominantly by an effect of the crack bridging fibers, which restrain the crack opening and thus decrease the stress intensity factor. the crack bridging effect increases with increasing crack length, due to an increase in the number of fibers affected in longer cracks. depending on the laminate structure, the bridging effect can reduce the crack growth rate expected for monolithic metallic materials and can even cause a decrease in the crack growth rate with increasing crack length (e.g., glare 2 in vašek [7]). in the special case of the glare 3 laminate the crack growth rate was found to be constant and independent of the crack length. assuming that the accelerating effect of increasing crack length is compensated by the retarding effect of the crack bridging fibers, the stress intensity factor and the crack growth rate can be determined by a local stress acting in a separate metal layer.

fig. 4: dependence of crack length on the number of cycles for the applied stress σ_max = 125 mpa (at different bending factor k)
fig. 5: relative crack length in the individual metal layers for the applied stress σ_max = 125 mpa and various distance fillers
fig. 6: graph of the experimental data of n_in from all tests versus local plastic strain ε_pl,l in the notch root
figure 7 shows the dependence of the crack growth rate v on the stress in the surface metal layer calculated by means of finite element modeling. a semi-logarithmic relation

v = 7.17×10⁻¹⁰ exp(σ_layer / 10.8) (2)

was derived that fits the experimental results. relation (2) holds for v in μm and σ_layer in mpa, and can also be used for predicting the crack growth rate and/or residual life in all metal layers of the laminate when σ_layer is known.

fig. 7: dependence of crack growth rate on the stress in a surface metal layer σ_layer for different values of the distance filler
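the two fitted relations can be combined into a small life-estimation sketch; note that the constants stem from eqs. (1) and (2) as reconstructed above from a damaged source, so they should be treated as approximate.

```python
# life-estimation sketch built on the fitted relations; the constants come
# from the reconstructed eqs. (1) and (2) and should be taken as approximate.
import math

def cycles_to_initiation(eps_pl):
    """invert eq. (1), eps_pl = 1745 * n_in**(-1.313), for n_in."""
    return (1745.0 / eps_pl) ** (1.0 / 1.313)

def growth_rate(sigma_layer):
    """eq. (2): v in micrometres per cycle, sigma_layer in mpa."""
    return 7.17e-10 * math.exp(sigma_layer / 10.8)

print(f"n_in at eps_pl = 1e-2: {cycles_to_initiation(1.0e-2):.2e} cycles")
print(f"v at sigma_layer = 200 mpa: {growth_rate(200.0):.2e} um/cycle")
```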
6 conclusions

an experimental investigation of fatigue crack initiation and early growth in glare 3 fiber metal laminate plates subjected to mixed tensile-bending loading was analysed by means of finite element modelling of local stress and strain in the individual laminate layers. the most significant results are as follows:
• an asymmetric lap-joint tension specimen was adopted in order to produce a mixed tension-bending cyclic loading.
• the first fatigue cracks were found in the upper surface layer subjected to the maximum cyclic resulting stress. cracks in the central layer initiated later, and the latest cracks initiated in the opposite surface layer under minimum resulting stress. the number of cycles necessary for crack growth through the laminate thickness increased with the higher rate σ_bend/σ_max. the through-thickness crack was separated by intact fibers bridging the crack.
• the additional cyclic bending loading shortened the crack initiation period and increased the crack growth rate in comparison with pure tensile cyclic loading.
• the number of cycles necessary for fatigue crack initiation in metal layers of the laminate was determined by the local plastic strain at the notch edge in the individual layers. the obtained relation can be used to predict the number of cycles prior to the initiation in outer or inner metal layers.
• the surface crack growth rate was found to be constant and independent of the crack length. knowledge of the stress in individual metal layers enabled the crack growth rate also to be evaluated in the inner metal layers.

acknowledgment

this work was financially supported by the academy of sciences of the czech republic (grant no. s/204/1001), the grant agency of the academy of sciences of the czech republic (grant no. c/204/1104), the grant agency of the czech republic (grants no. 106/01/1464, 106/01/1164, 106/99/0728) and by the cez research project 322/98: 262100001.

references

[1] vogelesang, l. b., gunning, j. w.: materials and design. vol. 7, 1986, pp. 278–300
[2] vlot, a., vogelesang, l. b., de vries, t. j.: towards application of fibre metal laminates in large aircraft. http://www.glareconference.com/glare.htm, 2001
[3] müller, r. p. g.: an experimental and analytical investigation on the fatigue behaviour of fuselage riveted lap joints. phd thesis, delft university of technology, 1995
[4] nam, k. w., et al: fatigue life and penetration behaviour of a surface-cracked plate under combined tension and bending. fatigue and fracture of engineering materials and structures, vol. 17, 1994, pp. 873–882
[5] heger, j.: combined tensile-bending load simulation of the notched thick wall multi-layer laminate composite used for aircraft body building. ninth international ansys users conference and exhibition, pittsburgh, pa, iv/2, 2000
[6] polák, j.: cyclic plasticity and low cycle fatigue life of metals. academia praha, 1991
[7] vašek, a., prášilová, a.: retarding effect of reinforcing fibres on early crack growth in fatigued notched laminate glare 2. engineering mechanics, vol. 5, 1998, pp. 219–223

ing. alice chlupová
phone: +420 5 4163 6344
fax: +420 5 4121 8657
e-mail: prasil@ipm.cz
institute of physics of materials
academy of sciences
žižkova 22, 616 62 brno, czech republic

rndr. alois vašek, csc.
phone: +420 5 4712 5328
fax: +420 5 4712 5310
e-mail: vasek@bibus.cz
institute of physics of materials
academy of sciences
žižkova 22, 616 62 brno, czech republic
temporary address: bibus s.r.o.
vídeňská 125, 639 27 brno, czech republic

ing. jaromír heger, csc.
phone: +420 5 4114 2888
fax: +420 5 4121 8657
e-mail: heger@umt.fme.vutbr.cz
institute of solid mechanics
technical university
technická 2, 616 69 brno, czech republic

stress distribution in a coal seam before and after bump initiation

j. vacek

this paper deals with the behaviour of rock that occurs, for example, during longwall mining in coal mines, in deep tunnel, or shaft excavation. longwall instability leads to extrusion of rock mass into an open space. this effect is mostly referred to as a bump, or a rock burst. for bumps to occur, the rock has to possess certain particular rock burst properties leading to accumulation of energy and the potential to release this energy. such materials may be brittle, or the bumps may arise at the interfacial zones of two parts of the rock that have principally different material properties. the solution is based on experimental and mathematical modelling. these two methods have to allow the problem to be studied on the basis of three presumptions:
• the solution must be time dependent
• the solution must allow the creation of cracks in the rock mass
• the solution must allow an extrusion of rock into an open space (bump effect)

keywords: rock bursts, bumps, mining, rock mechanics, mathematics and physical modelling.

1 introduction

in july 1995 our institute was awarded funding for a project on "model experiments on the pressure distribution in a coal seam, or in a wide coal pillar, before and after bump initiation" in the framework of the volkswagen foundation. the project was led by the technische universität münchen, represented by prof. dr. dr.h.c. horst lippmann, and the co-applicant was the czech technical university in prague, klokner institute, represented by jaroslav vacek. tum supervised the project and supplied technicians to operate the 10 mn testing machine and to help with assembling and disassembling the device. this took a total of 30 days, in which the testing machine was exclusively reserved for the volkswagen project. all other research work was carried out in prague. research work is now continuing in the framework of a czech grant.

2 testing devices

2.1 loading cell

fig. 1 shows the loading cell. it consists of the lower steel tank, which is designed for the horizontal forces caused by the vertical load in araldite specimens. the loading cell is equipped with plexiglass on its sides, which allows the samples to be observed during the tests. the tank is shown in photo 2. this loading cell models (simulates) the rock mass in the vicinity of the seam. in the loading cell we placed two araldite specimens (with dimensions of 160/400/40 mm), which model a real seam. the gap between them corresponds to the width of a working gallery in a mine. we observed the mechanism and the history of coal bumps. the araldite specimen was covered with a soft dural sheet, and force meters were placed on it in the following manner: 5 comparatively thick force meters were placed near its outer edge and another 15 thinner force meters were placed next to them (see fig. 1 and photo 2).

fig. 1: scheme of loading cell
in order to embed the force meters properly and to prevent them from tilting, another double steel sheet, 1 mm thick, was placed over the force meters. a 300 mm high block of duraluminium was placed over this sheet. this block simulates the hanging wall and facilitates a stress distribution similar to that in reality (see photo 1).

2.2 force meters

photo 2 shows the force meters. the force meters are 160 mm in length, 68 mm in height, and 16 or 32 mm in width. there are 4 strain gauges on each force meter – 2 on one side 30 mm from the edge of the force meter and 2 on the other side 60 mm from the edge of the force meter. these allow us to measure the deformation along its full length. the strain gauges are connected in series in order to be able to gauge each force meter separately and at the same time to increase the gauging sensitivity. the force meters indicated by numbers 1, 2, … 30 (width 16 mm), 31, 32, … 40 (width 32 mm) were calibrated within the expected range of forces, i.e., 0 to 250 (500) kn.

2.3 experimental verification of the measuring system

the stability of the measuring system in the unloaded state was verified for approximately 26 hours. the values of the readings in the unloaded force meters were measured at 5 minute intervals. the random error of the measuring system is characterised by the floating deviation (derived from 480 values); it is approximately 0.50 kn. the errors of the gauging electronic system are small, with a value less than approximately 0.3 % of the maximum gauged range.

3 survey of recorded bumps

table 1 surveys the forces that acted during the recorded bumps. the evaluation of their intensity is subjective. we can conclude that
• the first bumps were recorded at forces 1100–1700 kn. the lack of bumps for vw 01-03 is probably because we failed to recognise them until we had gained some experience. the best way to recognize a bump is by noticing the drop in force at the press or the changes in the stress registered by the force meters.

photo 1: testing device. two camcorders recorded the test.
photo 2: loading cell. right with sample and force meters.

results of the tests

these results are presented in the form of p_i/q. an example is given in fig. 2. here p_i is the stress registered by the i-th force meter and q is the average stress inside the loading cell. we can observe the following kinds of dependence:
• for bumps which occurred when the load was low (1–1.5 mn), when there were not too many cracks and the araldite was brittle, the maximum p_i/q ratio was greater than 4.
• the bumps that occur at higher loads exhibit a lower ratio; for 3 mn the maxima are only three times higher, and over 5 mn the araldite becomes plastic and the values of p_i are only slightly higher than twice that of q.
this is despite the fact that the empty space has increased by 30–50 mm on each side. for six groups of stronger bumps, which occur for similar loads from 2.582 mn onward (where the bumps are listed together with average values and corresponding model numbers), we calculated the average values of p_i/q. fig. 3 provides an example.
• the maximum p_i equals approximately 2q (1.8–2.2), which is the case for all loads on a loading cell bigger than 3 mn.
• the maximum for 3.2 mn occurs at the fifth force meter from the gap in the middle (80 mm), for 5.4 mn at the 8th to 12th force meter (130–190 mm) and for a load of 6.4 mn at the 12th force meter, i.e., 190 mm from the original edge of the gap in the middle.

table 1: total number of determined bumps. models vw 01, 02, 03, 04, 06, 07, 11, 12, 13, 14, 20, 21, 22 (test location: vw 01 at zus, vw 02 at ki, all others at tum). the total forces (mn) acting during the bumps, with a subjective evaluation of their intensity (w – weak, m – medium, m-s – medium strong, s – strong, vs – very strong), were: 1.500 w, 1.150 w, 1.400 w, 1.100 w, 1.600 w, 1.150 w, 1.700 w, 1.900 w, 2.900 w, 2.600 w, 2.723 m, 2.200 w, 2.150 w, 2.700 s, 2.222 w, 2.900 s, 2.500 w, 2.760 w, 2.550 w, 3.500 s, 3.280 s, 3.189 s, 3.367 s, 3.100 w, 3.080 m, 3.400 m-s, 3.169 m, 3.500 m, 3.700 w, 3.700 w, 4.080 m-s, 4.147 s, 4.060 m, 4.300 w, 4.270 m-s, 4.170 vs, 3.970 m, 3.790 w, 4.400 s, 4.260 s, 4.200 m, 4.880 s, 4.900 s, 4.980 s, 5.365 s, 4.520 m, 5.000 s, 4.940 s, 4.960 m-s, 5.170 w, 4.640 m-s, 4.900 w, 5.300 vs, 5.500 w, 5.400 m, 5.540 vs, 5.592 s, 5.540 s, 5.670 s, 5.480 s, 5.550 w, 5.570 w, 5.760 m-s, 6.000 s, 6.200 vs, 6.470 vs, 6.000 s, 6.445 vs, 6.000 vs, 6.280 m, 6.320 s, 6.820 s.

fig. 2: stress distribution of coal seam loading before (thick line) and after bump vw 03
fig. 3: mean stress distribution of coal seam loading before (thick line) and after a bump. the force is 4170–4265 kn, covering bumps of models vw 03, 06, 07, 11, 12, 13, 14, 20, 21, 22
fig. 4: loading of the coal seam during test vw 21 in axonometry (σ [mpa])
photo 3: detail of araldite samples at force 60 kn
photo 4: detail of araldite samples at force 6.5 mn. at the centre extruded material can be seen.

4 mathematical model

pfc2d (particle flow code in two dimensions), developed by itasca, usa, was used for the numerical modelling part of the project. a physical problem concerning the movement and interaction of circular particles may be modelled directly by pfc2d. pfc2d models the movement and interaction of circular particles by the distinct element method (dem), as described by cundall and strack (1979). particles of arbitrary shape can be created by bonding two or more particles together: these groups of particles act as autonomous objects, provided that their bond strength is high. as a limiting case, each particle may be bonded to its neighbour: the resulting assembly can be regarded as a "solid" that has elastic properties and is capable of "fracturing" when the bonds break in a progressive manner. pfc2d contains extensive logic to facilitate the modelling of solids as close-packed assemblies of bonded particles; the solid may be homogeneous, or it may be divided into a number of discrete regions or blocks.
the calculation method is a time stepping, explicit scheme. modelling with pfc2d involves the execution of many thousands of time steps. at each step, newton's second law (force = mass × acceleration) is integrated twice for each particle to provide updated velocities and new positions, given a set of contact forces acting on the particle. based on these new particle positions, contact forces are derived from the relative displacements for pairs of particles: a linear or non-linear force/displacement law at contacts may be used.
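the cycle just described (contact forces from a force/displacement law, then a double integration of newton's second law) can be illustrated by a toy one-dimensional analogue; this is only a schematic stand-in, not the itasca pfc2d implementation.

```python
# toy 1d distinct-element cycle in the spirit of pfc2d: contact forces from
# particle overlaps via a linear force/displacement law, then two
# integrations of newton's second law per time step.
import numpy as np

n, radius, k_n = 5, 0.5, 1.0e4     # particles, radius [m], stiffness [n/m]
mass, dt = 1.0, 1.0e-4             # particle mass [kg], time step [s]
x = np.linspace(0.0, 3.6, n)       # row of discs with initial overlaps
v = np.zeros(n)

for step in range(2000):
    f = np.zeros(n)
    for i in range(n - 1):         # contact detection between neighbours
        overlap = 2.0 * radius - (x[i + 1] - x[i])
        if overlap > 0.0:          # linear contact law: f = k_n * overlap
            f[i] -= k_n * overlap
            f[i + 1] += k_n * overlap
    a = f / mass                   # newton's second law: acceleration
    v += a * dt                    # first integration: velocities
    x += v * dt                    # second integration: positions

print(x)   # the initially overlapping assembly has pushed itself apart
```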
fig. 5 and fig. 6 show details of the mathematical model after a bump with ball velocities, and fig. 7 shows the typical stress distribution along the coal mine. the stress grows (i, ii) until the first bump initiation (iii). this occurs at approximately the 5th measurement cycle; then the stress decreases in this location, but it is increasing simultaneously at the 7th measurement cycle, where the second bump initiation occurs (iv). the subsequent bump initiations can be expected in the 8th and 10th measurement cycles (v, vi).

fig. 5: detail of the model immediately after a bump (with ball velocities); the first cracks are clearly visible
fig. 6: detail of the model some time after a bump (with ball velocities)
fig. 7: stress distribution along the coal mine (left side): i – 5000 cycles, ii – 7000 cycles, iii – 9000 cycles, iv – 12000 cycles, v – 15000 cycles, vi – 26000 cycles

5 conclusion

a combination of experimental and mathematical models appears very appropriate for a study of the stress distribution in a coal seam before and after bump initiation. both methods enable a time dependent study of the problem, and make it possible to study the development of cracks during bump initiation and the extrusion of material into an open space during a bump. they thus offer a description of the problem that is very close to reality.

acknowledgements

this research and this article have been sponsored by the grant agency of the czech republic (gačr), grant number 103/00/0530 "experimental and mathematical study of coal seam loading before and after a bump" ("experimentální a teoretická studie zatížení uhelné sloje před a po otřesu").

references

[1] foss, m. m., westman, e. c.: seismic method for in-seam coal mine ground control problems. seg international exposition and 64th annual meeting, los angeles, 1994, pp. 547–549
[2] goodman, r. e.: introduction to rock mechanics. john wiley & sons, 1989, p. 562
[3] torańo, j., rodríguez, r., cuesta, a.: using experimental measurements in elaboration and calibration of numerical models in geomechanics. computation methods and experimental measurements x, alicante, 2001, pp. 457–476
[4] vacek, j., procházka, p.: rock bumps occurrence during mining. computation methods and experimental measurements x, alicante, 2001, pp. 437–446
[5] vacek, j., bouška, p.: stress distribution in coal seam before and after bump initiation. geotechnika 2000, glivice-ustroň 2000, pp. 55–66
[6] vacek, j., procházka, p.: behaviour of brittle rock in extreme depth. 25th conference on our world in concrete & structures, singapore, 2000, pp. 653–660
[7] williams, e. m., westman, e. c.: stability and stress evaluation in mines using in-seam seismic methods. 13th conference on ground control in mining, us bureau of mines, 1994, pp. 290–297
[8] wood, d. m.: soil behaviour and critical state soil mechanics. cambridge university press, 1996, p. 462

doc. ing. jaroslav vacek, drsc.
phone: +420 2 2435 3549
fax: +420 2 2435 3855
e-mail: vacek@klok.cvut.cz
czech technical university in prague
klokner institute
šolínova 7, 166 08 praha 6, czech republic

reliability analysis of a steel frame

m. sýkora

a steel frame with haunches is designed according to eurocodes. the frame is exposed to self-weight, snow, and wind actions. lateral-torsional buckling appears to represent the most critical criterion, which is considered as a basis for the limit state function. in the reliability analysis, the probabilistic models proposed by the joint committee for structural safety (jcss) are used for basic variables. the uncertainty model coefficients take into account the inaccuracy of the resistance model for the haunched girder and the inaccuracy of the action effect model. the time invariant reliability analysis is based on turkstra's rule for combinations of snow and wind actions. the time variant analysis describes snow and wind actions by jump processes with intermittencies. assuming a 50-year lifetime, the obtained values of the reliability index β vary within the range from 3.95 up to 5.56. the cross-profile ipe 330 designed according to eurocodes seems to be adequate. it appears that the time invariant reliability analysis based on turkstra's rule provides considerably lower values of β than those obtained by the time variant analysis.

keywords: steel frame, lateral-torsional buckling, reliability, jump processes.

1 introduction

application of the partial factor method introduced in operational european standards for structural design often leads to unequal reliability of structures or structural members made of different building materials and exposed to different combinations of actions. well-balanced structural reliability can be achieved using design procedures based on probabilistic methods. this approach to the verification of structural reliability is allowed in the fundamental european document on structural design, en 1990 basis of structural design [1]. at present, the basic principles and data for the design and verification of structural members using probabilistic methods are partly provided in the technical literature and also in recent iso and en standards. detailed guidelines can be found in the jcss working materials [2]. it is expected that probabilistic design will become a practical design tool. unfortunately, its implementation is limited by a lack of required input data. the reliability analysis presented in this paper provides reliability verification of a steel frame designed according to recommendations given in the eurocodes [1,3]. the reliability index β, as a basic indicator of the level of reliability, is determined using both time invariant and time variant analysis provided by the software product comrel [4]. the basic variables are described using probabilistic models recommended by jcss [2]. the submitted analysis indicates possible procedures for implementing probabilistic methods of structural reliability in the design of civil engineering structures.

2 deterministic design

2.1 geometry

the portal frame analysed in this study is a double-pinned frame stiffened by haunches in the frame corners as indicated in fig. 1. the span of the frame is 17.71 m. the height of the structure is 7.26 m. the slope of the roof is approximately 15°. the maximum loading width is 6.48 m. the cross-section of the frame consists of the rolled i-profile ipe 330. in the location of the haunches, a t-section of variable height (10–280 mm) is welded on it (see fig. 1). the maximum section height is 610 mm in the frame corner. the lengths of the haunches are 2.0 and 2.8 m, respectively.

fig. 1: geometry of the frame [m]

2.2 effects of actions

the frame is exposed to the self-weight of the load bearing girders and the roof, snow, and wind action. the effect of the imposed action and thermal actions is negligible. the action effects considered in the analysis consist of an axial force n and bending moment m. in the design calculation, the axial force and bending moment are represented by the design values n_d and m_d. the combination of actions is determined considering expression (6.10b) given in en 1990 [1]. if the snow load is the leading variable action, then it follows that:

n_d = ξ γ_g (n_frame,k + n_roof,k) + γ_q n_snow,k + γ_q ψ_0,w n_wind,k (1)

m_d = ξ γ_g (m_frame,k + m_roof,k) + γ_q m_snow,k + γ_q ψ_0,w m_wind,k (2)

where ξ = 0.85 is the reduction factor for permanent actions, γ_g = 1.35 is the partial factor for permanent actions, γ_q = 1.5 is the partial factor for variable actions, and ψ_0,w = 0.6 is the factor for the combination value of the wind action.
n_frame,k is the characteristic value of the axial force due to the self-weight of the frame (the rolled sections), estimated as 0.49 kn/m. in the location of the haunches, it ranges from 0.49 to 0.76 kn/m. n_roof,k is the characteristic value of the axial force due to the self-weight of the roof structure. the load, including the secondary longitudinal girders, is estimated as 0.15 kn/m². n_snow,k is the characteristic value of the axial force due to snow action. the characteristic value of the snow load s_k determined according to [5] is given as:

s_k = μ_1 c_e c_t s_g,k (3)

where μ_1 = 0.8 is the load shape coefficient considered for a uniform snow load covering a whole roof area and for a roof slope of about 15°. both the exposure coefficient c_e and the heat coefficient c_t are chosen equal to 1, and the characteristic value of the snow load on the ground at the weather station is taken as s_g,k = 1.33 kn/m², considering a given site locality (approximately corresponding to region iii for the czech republic). n_wind,k is the characteristic value of the axial force due to wind action. following the recommendations provided in [6], the characteristic value of the wind pressure w_k is given as:

w_k = c_p q_p(z) (4)

where c_p is the pressure coefficient dependent on the building geometry and the size of the loaded area (here, the loaded area is assumed to be larger than 10 m²). it describes the outside pressure and suction combined with either the inside suction or the inside overpressure. in this case, more unfavourable effects are caused by a combination of outside pressure and inside suction. the peak velocity pressure can be written as:

q_p(z) = c_g(z) c_r²(z) (1/2) ρ v_b² (5)

where c_g(z) = 2.4 is the gust factor specified for the height of the structure z = 7.5 m and for the terrain category ii – open terrain with isolated obstacles [6]. the roughness factor c_r(z = 7.5 m) = 0.95 is also defined for the terrain category ii. the air density ρ is taken as 1.25 kg/m³. the reference wind speed v_b is 26 m/s. this yields q_p(z) = 0.92 kn/m². the design values of the bending moments are derived from the same assumptions as for the design values of the axial forces.
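a compact recalculation of eqs. (3)–(5) and of combination (6.10b) from eqs. (1)–(2) is sketched below; the pressure coefficient c_p is omitted (it depends on the loaded surface), and the characteristic action effects fed into the combination are illustrative placeholders standing for results of the frame analysis.

```python
# sketch of eqs. (3)-(5) and of combination (6.10b) from eqs. (1)-(2);
# the characteristic action effects at the end are illustrative placeholders.
s_gk, mu_1, c_e, c_t = 1.33, 0.8, 1.0, 1.0
s_k = mu_1 * c_e * c_t * s_gk                        # eq. (3): 1.06 kn/m2

rho_air, v_b, c_g, c_r = 1.25, 26.0, 2.4, 0.95
q_p = c_g * c_r**2 * 0.5 * rho_air * v_b**2 / 1000.0 # eq. (5) [kn/m2]
print(f"s_k = {s_k:.2f} kn/m2, q_p = {q_p:.2f} kn/m2")

def design_combination(e_gk, e_sk, e_wk,
                       xi=0.85, gamma_g=1.35, gamma_q=1.5, psi_0w=0.6):
    """expression (6.10b) with snow leading, cf. eqs. (1)-(2)."""
    return xi * gamma_g * e_gk + gamma_q * e_sk + gamma_q * psi_0w * e_wk

# hypothetical characteristic axial forces [kn] (permanent, snow, wind)
print(f"n_d = {design_combination(9.8, 44.4, 15.1):.1f} kn")
```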
2.3 structural analysis

the internal forces were determined using the deformation method. the structure has been modelled as a double-pinned frame. to model the real behaviour of the frame, the haunches of the girder were divided into 6 parts, each having a constant height corresponding to the middle cross-section of the relevant part. it appears that the shear does not affect the bending capacity and need not be taken into account. the structure is classified as a sway frame, and consequently the sway moments caused by the wind action are increased by the factor k = 1.28. the buckling length of the column with respect to axis y, l_y = 12.48 m, is taken as a 2.6-multiple of the length of the column, following the approximate procedure for sway frames shown in [3]. the buckling length l_z with respect to axis z is chosen as 2.2 m, which is the distance between the stays for lateral buckling restraint. as for the diaphragm beam, l_y = 8.855 m is half of the beam span. l_z is again 2.2 m. each of the cross-sections within the haunch is checked against buckling without lateral-torsional buckling and buckling with lateral-torsional buckling. the design criterion for buckling without lateral-torsional buckling seems to yield the most critical criterion for the checking of the column. the most critical criterion for the diaphragm beam is the criterion for buckling with lateral-torsional buckling. it appears that the critical cross-sections within the column and diaphragm beam are just at the origin of the haunches. the design criterion for buckling without lateral-torsional buckling is expressed as:

n_sd / (χ a f_y,k / γ_m1) + k_y c_my m_sd,y / (w_pl,y f_y,k / γ_m1) ≤ 1 (6)

where, in the critical cross-section of the column, n_sd = 115 kn is the design value of the axial force due to the actions, χ = 0.63 is the buckling coefficient (the lower of the values χ_y = 0.63 and χ_z = 0.80), a is the area of the relevant cross-section (a_ipe330 = 6261 mm²), f_y,k = 275 mpa is the characteristic value of the yield strength of the steel s275, γ_m1 = 1.1 is the partial factor for the material property, k_y = 1.09 is the moment amplification factor, c_my = 0.95 is the equivalent uniform moment factor, m_sd,y = 132 knm is the design value of the bending moment due to the actions, and w_pl,y is the plastic sectional modulus (w_pl,y,ipe330 = 804×10³ mm³). the design criterion for buckling with lateral-torsional buckling is given as:

n_sd / (χ_z a f_y,k / γ_m1) + m_sd,y / (χ_lt w_pl,y f_y,k / γ_m1) ≤ 1 (7)

where, in the critical cross-section of the beam, n_sd is 91 kn, χ_z is 0.80, m_sd,y is 161 knm, and χ_lt = 0.89 is the buckling coefficient of lateral-torsional buckling. the eurocodes [3] do not provide a procedure for determining the critical bending moment at the limit of lateral-torsional buckling m_cr of the haunched girder, which is required for the calculation of χ_lt. therefore, the critical bending moment m_cr is approximately calculated neglecting the effect of the haunch. it is assumed that the i-section alone, without the haunch, resists lateral-torsional buckling. considering the criterion for buckling, the ratio between the design action effect and the design resistance for the critical cross-section of the column at the origin of the haunch is 0.8. for the critical cross-section of the beam at the origin of the haunch, the ratio is 0.97, taking into account the criterion for buckling with lateral-torsional buckling. thus, this cross-section is also the most critical one within the whole structure, and for this reason its reliability is verified in the following analysis.
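the two utilisation checks can be reproduced directly from the quoted values; the sketch below should return approximately 0.80 and 0.97.

```python
# reproduces the utilisation ratios of eqs. (6) and (7) from the values
# quoted in the text; expected output: about 0.80 (column), 0.97 (beam).
f_yd = 275.0 / 1.1                  # f_y,k / gamma_m1 [mpa]
a, w_pl = 6261.0, 804.0e3           # ipe 330 area [mm2], w_pl,y [mm3]

# eq. (6): column, buckling without lateral-torsional buckling
n_sd, m_sd = 115.0e3, 132.0e6       # design values [n], [n.mm]
chi, k_y, c_my = 0.63, 1.09, 0.95
u_column = n_sd / (chi * a * f_yd) + k_y * c_my * m_sd / (w_pl * f_yd)

# eq. (7): beam, buckling with lateral-torsional buckling
n_sd, m_sd = 91.0e3, 161.0e6
chi_z, chi_lt = 0.80, 0.89
u_beam = n_sd / (chi_z * a * f_yd) + m_sd / (chi_lt * w_pl * f_yd)

print(f"column utilisation: {u_column:.2f}")
print(f"beam utilisation:   {u_beam:.2f}")
```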
3 limit state function

as mentioned above, the reliability analysis concentrates on the critical cross-section of the beam at the origin of the haunch. the limit state function is derived from the design criterion for lateral-torsional buckling (7). in addition, the uncertainty model coefficients are used to take into account the inaccuracy of the resistance model for the haunched girder and the inaccuracy of the action effect model. the limit state function reads as:

g(x) = 1 − θ_en n / (θ_rn χ_z a f_y) − θ_em m_y / (θ_rm χ_lt w_pl,y f_y) ≥ 0 (8)

where θ_en is the coefficient of the model uncertainties for the axial force and θ_em for the bending moment, while θ_rn is the coefficient of the model uncertainties for the axial force resistance and θ_rm for the bending moment resistance. utilizing the results of the structural analysis, the internal forces in the critical cross-section can be simply written as:

n = 6.7 (b g_roof + g_frame) + 6.47 b s + 2.54 b w (9)

m_y = 8.37 (b g_roof + g_frame) + 8.09 b s + 13.27 b w (10)

where b = 6.48 m is the loading width, g_roof is the self-weight of the roof [kn/m²], g_frame is the self-weight of the load bearing girders [kn/m], s is the snow load [kn/m²] and w is the wind action [kn/m²]. the limit state function given by equation (8) is applied in the following reliability analysis considering appropriate probabilistic models for the basic random variables described below.

4 theoretical models for basic variables

4.1 basic variables

probabilistic models for basic variables are used in accordance with the models proposed by the joint committee for structural safety (jcss). the sectional area a, the plastic sectional modulus w_pl,y, the loading width b and the span of the girder l are assumed to be deterministic values (d), while the others are considered as random variables. the statistical properties of the random variables are described by the normal distribution (n), lognormal distribution (ln) and the gumbel distribution (g), indicated by the moment characteristics (the mean μ and standard deviation σ) [2,7], as listed in tab. 1. the skewnesses γ are implicitly given by the type of distribution as: γ_n = 0, γ_ln = 3v_x + v_x³, and γ_g = 1.14.
table 1: statistical properties of basic variables (x_k is the characteristic value of the variable, μ_x is the mean, σ_x is the standard deviation and v_x is the coefficient of variation; mp = material properties)

group | symbol | name of basic variable | dist. | dim. | x_k | μ_x | μ_x/x_k | σ_x | v_x
mp | f_y | yield strength | ln | mpa | 275 | 327 | 1.19 | 26.2 | 0.08
geometric data | a | sectional area | d | mm² | 6261 | 6261 | 1.00 | – | –
geometric data | w_pl,y | plastic sectional modulus | d | mm³ | 804000 | 804000 | 1.00 | – | –
geometric data | b | loading width | d | m | 6.48 | 6.48 | 1.00 | – | –
geometric data | l | girder span | d | m | nom | nom | 1.00 | – | –
geometric data | χ_z | buckling coefficient | n | – | 0.64 | 0.67 | 1.04 | 0.04 | 0.06
geometric data | χ_lt | coefficient of lateral-torsional buckling | n | – | 0.79 | 0.82 | 1.03 | 0.04 | 0.05
model uncertainties | θ_en | axial force action effect | n | – | – | 1 | – | 0.05 | 0.05
model uncertainties | θ_em | bending action effect | n | – | – | 1 | – | 0.1 | 0.1
model uncertainties | θ_rn | axial force resistance | n | – | – | 1.1 | – | 0.07 | 0.07
model uncertainties | θ_rm | bending resistance | n | – | – | 1.1 | – | 0.07 | 0.07
actions | g_frame | self-weight due to girders | n | kn/m | 0.49 | 0.49 | 1 | 0.03 | 0.05
actions | g_roof | self-weight due to roof | n | kn/m² | 0.15 | 0.15 | 1 | 0.02 | 0.1
actions | s_50 | 50-year extremes of snow | g | kn/m² | 1.06 | 1.18 | 1.11 | 0.32 | 0.27
actions | s_1 | annual extremes of snow | g | kn/m² | 1.06 | 0.38 | 0.36 | 0.27 | 0.72
actions | w_50 | 50-year extremes of wind | g | kn/m² | 0.92 | 0.64 | 0.7 | 0.21 | 0.33
actions | w_1 | annual extremes of wind | g | kn/m² | 0.92 | 0.28 | 0.3 | 0.14 | 0.52
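as a rough cross-check of the reported reliability levels, the limit state function (8) with the internal forces (9)–(10) and the table 1 models can be sampled by crude monte carlo; the paper itself uses form/sorm via comrel, so the sketch below, using the turkstra combination s_50 + w_1, is only indicative.

```python
# crude monte carlo cross-check of the limit state function (8) with the
# internal forces (9)-(10) and the table 1 models; turkstra combination
# s_50 + w_1. the paper uses form/sorm (comrel); this is indicative only.
import numpy as np

rng = np.random.default_rng(0)
m_sim = 1_000_000
b, a_sec, w_pl = 6.48, 6261.0, 804.0e3        # [m], [mm2], [mm3]

def gumbel(mean, std, size):                  # moment-fitted gumbel (max)
    beta = std * np.sqrt(6.0) / np.pi
    return rng.gumbel(mean - 0.5772 * beta, beta, size)

v_fy = 0.08                                   # lognormal yield strength
f_y = rng.lognormal(np.log(327.0 / np.sqrt(1.0 + v_fy**2)),
                    np.sqrt(np.log(1.0 + v_fy**2)), m_sim)
chi_z = rng.normal(0.67, 0.04, m_sim)
chi_lt = rng.normal(0.82, 0.04, m_sim)
th_en, th_em = rng.normal(1.0, 0.05, m_sim), rng.normal(1.0, 0.10, m_sim)
th_rn, th_rm = rng.normal(1.1, 0.07, m_sim), rng.normal(1.1, 0.07, m_sim)
g_fr, g_ro = rng.normal(0.49, 0.03, m_sim), rng.normal(0.15, 0.02, m_sim)
s = gumbel(1.18, 0.32, m_sim)                 # 50-year snow extremes [kn/m2]
w = gumbel(0.28, 0.14, m_sim)                 # annual wind extremes [kn/m2]

n_f = 6.7 * (b * g_ro + g_fr) + 6.47 * b * s + 2.54 * b * w     # eq. (9) [kn]
m_y = 8.37 * (b * g_ro + g_fr) + 8.09 * b * s + 13.27 * b * w   # eq. (10) [knm]
g = 1.0 - (th_en * n_f * 1e3 / (th_rn * chi_z * a_sec * f_y)
           + th_em * m_y * 1e6 / (th_rm * chi_lt * w_pl * f_y))  # eq. (8)
print(f"p_f ~ {np.mean(g < 0.0):.1e}")
```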
4.3 wind action the statistical parameters of wind pressure w are determined assuming that: w c c c m q� p g r q b 2 (15) 30 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 42 no. 4/2002 parametersvar. types symbol x name of basic variable dist. dim. xk �x �x/xk x vx mp fyp yield strength (s235) ln mpa 235 280 1.19 22.4 0.08 � imperfection coefficient n 0.32 0.275 0.028 0.028 0.1 c 1 loading and end restraint factor n 1.59 1.90 1.19 0.19 0.1 c 2 loading and end restraint factor n 0.78 0.67 0.86 0.067 0.1c oe ff ic ie n ts table 2: statistical properties of the basic variables used to derive coefficients �z and �lt s � �1 ce c t sg parametersvar. types symbol x name of basic variable dist. dim. xk �x �x/xk x vx �1c e shape and exposure coef. n 0.8 0.8 1 0.12 0.15 c t heat coefficient d 1 1 1 s g,1 annual extremes of snow load on ground g kn/m2 1.33 0.47 0.35 0.33 0.7 s g,50 50-year extremes of snow load on ground g kn/m2 1.33 1.48 1.11 0.33 0.22 c oe f. a ct io n s table 3: variables used in calculating the parameters of the snow load where mq is the model coefficient describing the ratio between the expected and computed value of the basic wind pressure qb, which can be written as: q vb b 2� 1 2 � (16) the characteristic value qb � 0.42 kn/m 2 is defined to have the probability p � 0.02 to be exceeded by the annual extremes. the coefficient of variation of the annual extremes of the reference wind speed vb is vvb,1 � 0.2 [8]. supposing that the annual extremes of the reference wind speed can be modelled by a gumbel distribution, the coefficient of variation of the annual extremes of the basic wind pressure vqb,1 � 0.43 results from: v v v v qb vb vb vb , , , , . 1 1 1 1 2 2 1 114 1 � � � (17) the mean of the annual extremes of the basic wind pressure �qb,1 � 0.20 kn/m 2, the mean of the 50-year extremes �qb,50 � 0.46 kn/m 2 and the standard deviation qb � 0.085 kn/m 2 can be derived identically as for the statistical parameters of the extremes of the snow load on the ground (13,14). the statistical parameters of the other variables used in calculating the statistical parameters of the wind pressure w are taken in accordance with the jcss probabilistic model code [2] as listed in table 4. 5 reliability analysis climatic actions due to snow and wind are complex time-variant quantities that significantly complicate the reliability analysis. two different approximations for describing them are considered in the following analysis. firstly, turkstra’s rule is accepted in conjunction with time invariant analysis. secondly, the ferry borges-castanheta model (fbc) is applied together with time variant reliability analysis. the variable actions due to snow and wind are assumed to be uncorrelated. the software product comrel [4] has been applied in both types of analysis. 5.1 time invariant analysis in accordance with turkstra’s rule, the leading action is described by its lifetime (assumed as 50 years) extreme while the accompanying action is considered by its point-in-time value (approximated by annual extremes). in the following analysis, each climatic action, the snow and the wind action, is considered to be either a leading or an accompanying action. the probability densities of the 50-year extremes of the snow load s50 (considered as the leading variable action) and the annual extremes of the wind pressure w1 are shown in fig. 2. 
the characteristic value of the wind pressure w being the 98-percentage fractile of gumbel’s distribution is denoted as wk, and wd denotes the design value. 5.2 time variant analysis the time variant reliability analysis is based on the fbc model for the snow and wind actions. both the climatic © czech technical university publishing house http://ctn.cvut.cz/ap/ 31 acta polytechnica vol. 42 no. 4/2002 w � cp cg cr2 mq qb symbol x xk �x �x xk x vx c p c g c r 2 mq q b,1 2 q b,50 2 parametersvar. types name of basic variable dist. dim. / pressure coefficient n nom nom 1 0.1nom 0.1 gust factor n 2.4 2.4 1 0.24 0.1 roughness factor n 0.91 0.73 0.8 0.073 0.1 model coefficient n 1 0.8 0.8 0.16 0.2 annual extremes of basic wind pressure g kn/m 0.42 0.20 0.44 0.085 0.43 50-year extremes of basic wind pressure g kn/m 0.42 0.46 1.10 0.085 0.18 c oe f. a ct io n s table 4: variables used in calculating the parameters of the wind action � w1 s50 0 1 2 3 4 0 0.5 1 1.5 wk = 0.92 wd = 1.38 w s, [kn/m ] 2 fig. 2: the probability densities of the 50-year extremes of the snow load s50 and the annual extremes of wind pressure w1 actions are described by jump processes without intermittencies (actions sometimes take a zero value), which approximate their real variation in time by rectangular wave renewal functions. each jump process with intermittencies is characterized by the jump rate � (the average number of magnitude changes of the square waves in a reference time tref) and by the interarrival-duration intensity � (the product of the arrival rate � and the mean duration with respect to a reference time tref). as for the snow load, it is assumed that it takes its extreme five times a year. considering the reference time tref � 1 year, the arrival rate of on-times is therefore �s � 5. the mean duration (the time during which the structure is loaded by the extreme snow load) is supposed to be about 14 days. the interarrival-duration intensity is thus �s � 5 × 14/365 � 0.19. the possible approximation of the snow load during the reference time tref � 1 year is shown in fig. 3. windstorms are expected to appear ten times a year (�w � 10) and the mean duration of the storm is estimated as 8 hours. the interarrival-duration intensity is then �w � 10 × 8/(24 × 365) � 0.009. 5.3 results of reliability analysis according to the results of the time invariant analysis, the reliability of a structure of the ipe 300 profile seems to be rather low (� is less than 3), while the cross-profile ipe 330 seems to be acceptable. for the higher profile the resulting reliability index � � 3.95 corresponds well to the recommended value � � 3.8 [1], as shown in table 5. nevertheless, it should be mentioned that the time invariant analysis based on turkstra’s rule provides considerably lower values for reliability index � than those obtained by the time variant analysis. the time variant analysis predicts the interval at which the reliability index � can be expected (a higher value of � corresponds to the lower bound of a failure probability while a lower value of � corresponds to the upper bound of a failure probability). the results obtained by the time variant analysis are more favourable and indicate that even the smaller profile ipe 300 might be acceptable. the expected values of the reliability index � for ipe 300 are within the range from 3.57 (the upper bound) to 4.84 (the lower bound)� for ipe 330 from 4.49 up to 5.56 as listed in table 5. fig. 
5.3 Results of the reliability analysis

According to the results of the time-invariant analysis, the reliability of the structure with the IPE 300 profile seems rather low (β is less than 3), while the profile IPE 330 seems acceptable. For the higher profile, the resulting reliability index β = 3.95 corresponds well to the recommended value β = 3.8 [1], as shown in Table 5. Nevertheless, it should be noted that the time-invariant analysis based on Turkstra's rule provides considerably lower values of the reliability index β than the time-variant analysis. The time-variant analysis predicts the interval within which the reliability index β can be expected (the higher value of β corresponds to the lower bound on the failure probability, the lower value of β to its upper bound). The results obtained by the time-variant analysis are more favourable and indicate that even the smaller profile IPE 300 might be acceptable. The expected values of the reliability index β for IPE 300 lie within the range from 3.57 (the upper bound on the failure probability) to 4.84 (the lower bound); for IPE 330 from 4.49 up to 5.56, as listed in Table 5. Fig. 4 shows the reliability index β determined by both analyses as a function of the plastic section modulus W_pl,y.

Fig. 3: Approximation of the snow load used in the time-variant analysis (five on-intervals of about 14 days within 365 days = 1 year, spaced about 73 days apart)

Table 5: Results of the reliability analysis

  analysis        load models                           β, IPE 300  β, IPE 330
  time-invariant  S50 + W1                              2.87        3.95
                  S1 + W50                              2.97        3.97
  time-variant    jump processes with intermittencies   3.57–4.84   4.49–5.56

Fig. 4: Reliability index β as a function of the plastic section modulus W_pl,y (lower and upper bounds on the failure probability from the time-variant analysis, the time-invariant result based on Turkstra's rule, and the target value of β)

Fig. 5: Reliability index β as a function of the jump rate of the snow load λ_s (lower and upper bounds on the failure probability)

5.4 Effect of the input data on the resulting reliability index β

The model parameters λ and ν required for the time-variant analysis are very difficult to specify. Nevertheless, parametric studies indicate that the uncertainties in λ and ν have an insignificant effect on the resulting reliability. For example, if the jump rate of the snow load λ_s increases from 1 to 5 (i.e., if the snow load attains an extreme lasting 14 days five times a year, which is not realistic), the upper bound on β decreases by approximately 0.4 for the profile IPE 330, as shown in Fig. 5. Parametric studies of the jump rate of the wind action λ_w and of the interarrival-duration intensities ν_s and ν_w of the two climatic actions provide similar results.

5.5 Probabilistic optimisation

Probabilistic optimisation is based on the minimisation of a simplified objective function expressed as the sum of the initial, marginal and expected malfunction costs. The decisive parameter is the sectional area a. The total cost C_tot can be expressed as

C_tot = C_0 + C_m a + C_f p_f(a),   (18)

where C_0 denotes the initial cost, independent of the parameter a and of the failure probability p_f. The marginal cost is the product of the unit cost C_m of the sectional area and the sectional area a. The expected malfunction cost is the product of the failure probability p_f and the malfunction cost C_f incurred when failure occurs. For the probabilistic optimisation, equation (18) may be rewritten as

C'_tot = C_tot / C_m = C_0 / C_m + a + (C_f / C_m) p_f(a).   (19)

The relative increment of the total cost C'_tot therefore depends only on the decisive parameter a and on the ratio C_f/C_m. For different values of this ratio, different cross-sections turn out to be adequate according to the results of the probabilistic optimisation shown in Fig. 6. The arrows point to the minima of the relative increment ΔC'_tot for the assumed ratios C_f/C_m. The dot-and-dash curve shows the resulting reliability index β as a function of the sectional area a assuming Turkstra's rule (the 50-year extremes of the snow load combined with the annual extremes of the wind action). The horizontal dashed line marks the target value of β (β_t = 3.8).
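To illustrate how the minima indicated by the arrows in Fig. 6 arise, the following sketch minimises the relative cost (19) over the sectional area for several cost ratios. The failure-probability curve p_f(a) is a purely hypothetical monotone placeholder, not the COMREL results of the paper, so only the qualitative trend (the optimum area growing with C_f/C_m) is meaningful.

```python
import numpy as np

def total_cost_rel(a, pf, cf_over_cm, c0_over_cm=0.0):
    """Relative total cost C'_tot = C_0/C_m + a + (C_f/C_m) * p_f(a), eq. (19)."""
    return c0_over_cm + a + cf_over_cm * pf

# Hypothetical failure-probability curve: p_f drops steeply with the sectional area.
# (Placeholder for the values a reliability analysis would supply; not the paper's data.)
a = np.linspace(5000.0, 6500.0, 301)        # sectional area [mm2]
pf = np.exp(-(a - 4000.0) / 300.0)          # illustrative, monotonically decreasing p_f(a)

for ratio in (1e5, 5e5, 1e6, 5e6):          # assumed cost ratios C_f/C_m
    cost = total_cost_rel(a, pf, ratio)
    print(f"C_f/C_m = {ratio:.0e}: optimum a ~ {a[np.argmin(cost)]:.0f} mm2")
```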
Obviously, with increasing cost ratio C_f/C_m the optimum cross-sectional area a increases. The profile IPE 330 seems to be optimal for C_f/C_m = 5 × 10^6. To obtain credible results from the optimisation, the values of C_m and C_f need to be determined accurately.

Fig. 6: Relative increment of the total cost C'_tot as a function of the sectional area a using Turkstra's rule (S50 + W1), for the cost ratios C_f/C_m = 10^5, 5 × 10^5, 10^6 and 5 × 10^6

6 Conclusions

The structural analysis of the frame shows that lateral-torsional buckling represents the most critical design criterion, and indicates that the snow load is the leading variable action. Considering a 50-year lifetime, the reliability index β for IPE 330 varies within the range from 3.95 up to 5.56. According to the results of the reliability analysis, the profile IPE 330 designed using the Eurocodes seems to be adequate.

The time-invariant analysis based on Turkstra's rule provides considerably lower values of β than the time-variant analysis. It seems that Turkstra's rule leads to a rather conservative reliability level for a combination of variable actions with significant intermittencies. The great differences between the lower and upper bounds are most likely caused by the considerable intermittencies of the two variable actions. The model parameters required for the time-variant analysis are, however, very difficult to estimate. Nevertheless, parametric studies indicate that this uncertainty has an insignificant effect on the resulting reliability.

The structural analysis of a beam with haunches exposed to lateral-torsional buckling is a very complicated task. It is foreseen that more precise results may be obtained by an analysis based on a finite element model.

Acknowledgement

This research has been conducted at the Klokner Institute, Czech Technical University in Prague, Czech Republic, as a part of research project CTU0214417 "Basis for probabilistic design of structural elements in accordance with Eurocodes".

References

[1] EN 1990 Basis of structural design. Brussels: European Committee for Standardisation, 2002.
[2] JCSS Probabilistic Model Code. Joint Committee for Structural Safety, 2001 (http://www.jcss.ethz.ch).
[3] prEN 1993-1-1 Design of steel structures. Brussels: European Committee for Standardisation, 2001.
[4] RCP Reliability Consulting Programs: STRUREL: A structural reliability analysis program system. COMREL & SYSREL user's manual. Munich: RCP Consult, 1995.
[5] prEN 1991-1-3 Actions on structures – Snow loads. Final draft. Brussels: European Committee for Standardisation, 2001.
[6] prEN 1991-1-4 Actions on structures – Wind actions. Final draft. Brussels: European Committee for Standardisation, 2001.
[7] Holický, M.: Reliability based calibration of Eurocodes considering steel component. JCSS Workshop on Reliability Based Code Calibration, Zürich: Joint Committee for Structural Safety, 2001.
[8] Tichý, M.: Zatížení stavebních konstrukcí (Actions on structures). Technický průvodce 45, Praha: SNTL, 1987.
Ing. Miroslav Sýkora
phone: +420 224 353 850, fax: +420 224 355 232
e-mail: sykora@klok.cvut.cz
Czech Technical University in Prague, Klokner Institute
Šolínova 7, 166 08 Praha 6, Czech Republic

Acta Polytechnica 58(6):339–345, 2018
doi:10.14311/ap.2018.58.0339
© Czech Technical University in Prague, 2018, available online at http://ojs.cvut.cz/ojs/index.php/ap

Evaluation of sensor signal processing methods in terms of information theory

Patrik Flegner*, Ján Kačur
Institute of Control and Informatization of Production Processes, Faculty BERG, Technical University of Košice, Boženy Němcovej 3, 040 01 Košice, Slovak Republic
* corresponding author: patrik.flegner@tuke.sk

Abstract. The paper examines basic methods of evaluating sensor signals in terms of the information content achievable with the given method and the technical means used. In this respect, methods based on classical analog systems, digital systems operating in the time domain, hybrid systems, and digital systems evaluating the signal in the frequency domain are compared. A significant increase in entropy in the individual systems is demonstrated when the signal is evaluated in a more complex way. For each measuring system, the experimental setup, results and discussion are described. The issue described in the article is particularly topical in connection with the development of modern technologies used in production processes and the subsequent use of information. The main purpose of the article is to show that the information content of a signal increases as the signal is processed more complexly.

Keywords: entropy, information, analog and digital system, spectrum, spectrogram.

1. Introduction

The term data mining is used today mainly in management and marketing, where it is understood as the process of obtaining information from the available data. In this "mining", various methods and procedures are used, employing, among other things, modern information technologies. The concept is much less common in the field of control of production and technological processes. Yet it is precisely in this area that information, as a basis for deciding on an appropriate intervention in the process, is of fundamental importance. With the increasing complexity of processes as controlled objects, with increasing computing and communication power, and with progress in scientific disciplines such as control theory and artificial intelligence, new and modern methods are offered alongside the classical exact control methods. These methods are based on the acquisition of qualitatively new types of information about the controlled and monitored process.

The sketched problem of "data mining" from sensor signals can be framed in terms of information theory and signal theory. Information, from the viewpoint of information theory, eliminates uncertainty (i.e., entropy). The measure of information is the increment of probability after receiving the message. If we receive information A that can be expected with probability p(A), then, in the sense of Shannon's entropy theorem [1], we obtain the amount of information (in bits)

I(A) = −log_2 p(A)  (bit).   (1)

Fig. 1: Relationship between the probability of information and its entropy
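A minimal sketch of equation (1), showing how the information content grows as the probability of the received message decreases:

```python
import math

def self_information_bits(p):
    """Shannon self-information I(A) = -log2 p(A) of an event with probability p, eq. (1)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("probability must lie in (0, 1]")
    return -math.log2(p)

for p in (0.5, 0.99, 0.01):
    print(f"p = {p:4}: I = {self_information_bits(p):6.2f} bit")
# p = 0.5 gives exactly 1 bit; rare events (p -> 0) carry ever more information.
```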
If we quantify the information according to Shannon's theorem [2, 3], then:

p(a_i) = 0.5 ⇒ I(a_i) = 1 (bit),
p(a_i) → 1 ⇒ I(a_i) → 0 (bit),
p(a_i) → 0 ⇒ I(a_i) → ∞ (bit).

From (1) and from Figure 1 it is clear that the less probable a specific piece of information is, the larger the amount of information we obtain when it occurs and we receive it [4–6].

For the purposes of an active, periodically repeated reception of information, information theory defines the information source [7]. With some simplification, based on information theory and probability theory, we can define the information source as a probabilistic space [8], written formally as

Φ = (X*, P),   (2)

where X is a finite set of elements X = {x_1, x_2, ..., x_n}, called the source alphabet, whose elements are the letters of the source; X* is the set of all finite sequences of elements of X and represents the set of possible source messages; and P is a probability function defined on the set X*, with the properties P = {p(x_i); x_i ∈ X} ∈ ⟨0, 1⟩.

Obviously, the longer the message, the more information it can convey. Therefore, the information content of an information source (the entropy of the information source) is defined as the average entropy of the source per message x (the probability-weighted average) [9]. For stationary and ergodic information sources we then obtain

H(Φ) = −Σ_{x∈X*} p(x) log_2 p(x)  (bit).   (3)

Analysing relation (3) leads to an important conclusion: the information content of a source is the higher, the larger the amount of information it generates with uniform probability. That is, the size of the probability space (i.e., the number of elements of the set X*) directly determines the "content" of a specific information source [10, 11].

From a functional point of view, the process of acquiring information can be divided into several basic functions. It is clear that the key role, in terms of both the adequacy and the quantity of the information obtained, is played by the sensor during the measurement [12, 13]. The subsequent processing can only degrade the acquired information or destroy it altogether; it is not possible to add relevant information about the measured variable later in the processing chain [14]. The problem is how to "mine" and then "use" the maximum of information contained in the signal from the sensing element. The output analog signal of the sensor in operation can thus be understood as the carrier of information, i.e., as a continuous information source. It is demonstrated in the literature [15, 16] that the maximum amount of information is contained in a sensor signal that has a limited average power P_m and whose amplitude probability density p(x) follows the Gaussian distribution on the interval x ∈ ⟨x_min, x_max⟩:

p(x) = (1/sqrt(2π P_m)) exp(−x^2/(2P_m)).   (4)

Its information content then attains the maximum value

max H_a = (1/2) log_2(2π e P_m)  (bit),   (5)

where e inside the logarithm is Euler's number and P_m is understood as the expected (mean) signal power. The information content given by (5) is only a theoretical value, because it assumes that the sensor can generate infinitely many amplitude levels within the interval x ∈ ⟨x_min, x_max⟩. With a real sensor this is not possible, owing to its limited sensitivity and its inaccuracy.
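The two entropy expressions above can be checked numerically; the sketch below evaluates (3) for a uniform discrete source and (5) for a unit-power Gaussian signal:

```python
import math

def source_entropy_bits(probs):
    """Shannon entropy H = -sum p*log2(p), eq. (3); zero-probability terms contribute nothing."""
    assert abs(sum(probs) - 1.0) < 1e-9, "probabilities must sum to 1"
    return -sum(p * math.log2(p) for p in probs if p > 0.0)

def gaussian_diff_entropy_bits(p_m):
    """Differential entropy of a zero-mean Gaussian with power (variance) P_m, eq. (5)."""
    return 0.5 * math.log2(2.0 * math.pi * math.e * p_m)

n = 4096
print(source_entropy_bits([1.0 / n] * n))   # uniform source: log2(4096) = 12 bit
print(gaussian_diff_entropy_bits(1.0))      # ~2.05 bit for unit signal power
```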
A sensor with the accuracy class δ can generate a signal with about n = 1/(2δ) + 1 distinguishable amplitude levels. This reduces the information content of the sensor below the theoretical value (5), in accordance with (3). Another important factor that essentially decides how much of the information contained in the analog signal of the sensor can be "mined" is the signal evaluation process itself. At present we can speak of two basic approaches:

• evaluation of the amplitude of the analog signal in the time domain, by a standard analog or a modern digital system;
• evaluation of the amplitude of the analog signal in the frequency domain, using a digital measuring system.

As mentioned above, it follows from (3) that the sensor, as a discrete information source, has a higher information content the more amplitude levels of its output signal x(t) we can distinguish. With certain simplifications, neglecting the sensitivity and the accuracy class of the real sensor, the entropy H_a of an analog signal that is continuous both in time and in amplitude over a finite amplitude range can be derived from (4).

2. Analog measuring system

A measuring system generally represents the set of elements that accomplish the measuring task. The behaviour of the measured signal is interpreted mainly by analysing the signal in the amplitude, time and frequency domains. From these characteristics it is possible to obtain information about the process that could not be captured using the basic signal processing functions alone. This includes computing average values of the data, determining their distribution, correlations and transforms, as well as the functions needed to describe deterministic or stochastic signals in steady states or in transients [17]. Signal analyses are most often carried out by an external host computer, without a real-time requirement [18]. As an example, consider the determination of the sampling period of a process variable based on the analysis of the frequency spectrum of the measured signal according to the Shannon sampling theorem [19].

Processing a signal in the time domain amounts to analysing its overall amplitude. In the past, and in many cases even today, this is the most common way of evaluating measurements of physical variables [20]. The corresponding amplitude of the one-way (DC) signal of the sensor is displayed by an analog instrument calibrated in the corresponding physical units (see Figure 2).

Fig. 2: Signal evaluation by an analog measuring instrument

As mentioned above, analog measuring systems are classified by the accuracy class δ (%), e.g., 0.01, 0.02, 0.05, 1.0, 1.5 and 2.5. For the accuracy class it holds that δ = ±(Δ_max/range) × 100 %. It gives rise to a so-called uncertainty band of relative width ε = 2δ around the result of a measurement. An analog measuring system with a relative error δ therefore provides n distinguishable amplitudes of the measured physical variable:

n = 1/(2δ) + 1 = 1/ε + 1.   (6)

If, for simplicity, we assume a uniform probability density of the measured quantity, i.e., all values have the same probability of occurrence p = 1/n, the differential entropy of an analog measuring system with the given accuracy class δ simplifies, based on (3), to

H_aδ = log_2 n = log_2(1/(2δ) + 1)  (bit).   (7)
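A small sketch of equations (6) and (7) for a few accuracy classes; with δ = 0.01 it reproduces the 51 levels and 5.67 bit used in the example that follows:

```python
import math

def analog_levels(delta):
    """Number of distinguishable levels of an analog instrument with relative error delta, eq. (6)."""
    return 1.0 / (2.0 * delta) + 1.0

def analog_entropy_bits(delta):
    """Entropy of one reading assuming equally probable levels, eq. (7)."""
    return math.log2(analog_levels(delta))

for delta in (0.01, 0.02, 0.05):
    print(f"delta = {delta:5}: n = {analog_levels(delta):5.0f}, H = {analog_entropy_bits(delta):5.2f} bit")
# delta = 0.01 gives n = 51 and H = 5.67 bit, matching the worked example in the text.
```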
Equation (7) gives the maximum amount of information that one measurement can contain. If, for example, the relative error of the analog measuring system is δ = 0.01 (an accuracy class of 1 %), the instrument distinguishes n = 51 different measured values over the given range. According to (7), the information content of this measuring system is then H_aδ = log_2 51 = 5.67 (bit).

3. Digital measuring system

Nowadays, in practical applications of automatic control theory and digital signal processing [21], we very often encounter the problem of communication between discrete technical devices and a continuous environment [22]. The bridges connecting the digital and continuous worlds are digital-to-analog (DAC) and analog-to-digital (ADC) converters. At present, the evaluation of measured variables by digital systems prevails.

Digital measuring systems are based on the digitization [23] of the analog signal by an m-bit analog-to-digital converter. If the width of the AD converter is m bits, the converter distinguishes, on the interval x ∈ ⟨x_min, x_max⟩, a total of n = 2^m amplitude levels of the signal. The differential entropy of the sampled signal is generally given by (3). In the case of a uniform probability distribution of the signal, the simplified equation applies:

H_digm = log_2 n = log_2 2^m = m  (bit).   (8)

Fig. 3: Digital signal evaluation by a digital system

If, for example, we consider a 12-bit AD converter, common in technical practice, it distinguishes n = 2^12 = 4096 different levels, i.e., measured values, over the given signal range. Assuming a uniform probability distribution of the measured values, the information content of this measuring system is H_dig12 = log_2 4096 = 12 (bit). The comparison shows that in practice H_aδ < H_digm. Numerical methods thus achieve a significantly higher accuracy than analog methods; they have better static properties, at the price of worse dynamic properties. An illustrative diagram of signal evaluation by a digital system is shown in Figure 3.

4. Hybrid measuring system

Another type of digital measuring system is based not on processing a sampled sensor signal, but on evaluating the analog signal of the sensor itself. The analog signal of the sensor is evaluated by a special set of analog and digital circuits. This is a hybrid measuring system, although its core is the use of special programmable digital circuits. To calculate the differential entropy of such measuring systems, we usually have to treat them individually.

As an example of a digital, or rather hybrid, measuring system we can mention a device for measuring the Young's modulus of elasticity of steel ropes [24]. This is a method of indirectly measuring the elasticity modulus of a steel rope under traction, based on measuring the propagation velocity of a longitudinal wave caused by a mechanical shock. From the physical point of view, the method relies on the known relation between the velocity of sound propagation in the material, v (m s^−1), the modulus of elasticity E (Pa) of the steel rope, and the mass density ρ (kg m^−3) of its material [25, 26]:

E = v^2 ρ  (Pa).   (9)

The velocity of propagation of the longitudinal acoustic wave in the steel rope can be converted, using suitable sensors and pre-amplifiers, into two pulses shifted in time by τ [27].
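A sketch of the digital-system entropy (8) and of the wave-speed relation (9); the wave speed and density used below are generic textbook values for steel, not measurements from the cited rope experiments:

```python
import math

def adc_entropy_bits(m):
    """Entropy of one reading from an m-bit ADC with equally probable codes, eq. (8)."""
    return math.log2(2 ** m)  # = m

def youngs_modulus(v, rho):
    """E = v**2 * rho, eq. (9): wave speed v in m/s, density rho in kg/m3, result in Pa."""
    return v * v * rho

print(adc_entropy_bits(12))                  # 12 bit, as in the text
# Illustrative values only: v ~ 5000 m/s and rho ~ 7850 kg/m3 are typical of steel.
E = youngs_modulus(5000.0, 7850.0)
print(f"E = {E / 1e9:.0f} GPa")              # ~196 GPa
```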
By the time shift τ we mean the time after which the mechanical shock passes from one cross-section of the rope to the other. An implemented flip-flop circuit converts these two time-shifted pulses into a single width-modulated pulse. The counter is gated by this pulse so that it runs only for the duration τ, commensurable with the wave propagation velocity and hence with the indirectly measured modulus of elasticity of the steel rope (see Figure 4). The presented hybrid measuring system has been implemented in practice and is still in use for assessing the quality and damage of steel ropes [28].

Fig. 4: Hybrid measuring system with a pulse-width-modulated signal

Counting the pulses generated during the time τ may involve an error of one count. This means that in one measurement of τ f_g counted pulses we obtain an uncertainty ε = (τ f_g)^−1. The maximum distinguishable number of levels n of the measured variable is

n = 1/ε + 1 = τ f_g + 1,   (10)

where the added one represents the zero amplitude value. This basic equation of digital measurement shows that the number of distinguishable levels, and thus the entropy of the measurement, is increased by increasing the clock frequency f_g of the generator; it is assumed that the frequency f_g of the generator is known without error. If, for example, we use a generator with the frequency f_g = 10 MHz, then on a unit interval τ ∈ ⟨0, 1⟩ s we can distinguish n ≈ 10^7 levels, which corresponds to the entropy H_hyb = log_2 10^7 = 23.25 (bit).
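The gated-counter resolution (10) is easy to check; for the 10 MHz generator of the example:

```python
import math

def hybrid_levels(tau_max, f_g):
    """Distinguishable levels of a gated-counter measurement, eq. (10): n = tau*f_g + 1."""
    return tau_max * f_g + 1

def hybrid_entropy_bits(tau_max, f_g):
    return math.log2(hybrid_levels(tau_max, f_g))

# 10 MHz clock, measured interval up to 1 s, as in the worked example.
print(f"n = {hybrid_levels(1.0, 10e6):.0f}")
print(f"H_hyb = {hybrid_entropy_bits(1.0, 10e6):.2f} bit")   # ~23.25 bit
```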
5. Processing the signal in the frequency domain

The basic method of signal processing in the frequency domain is the analysis of the signal spectrum (see Figure 5). It relies on the fact that a sequence of n samples (i.e., a record x_s) of any real signal can be approximated by the sum of a unique series of n harmonic components, each with its complex amplitude F_k, frequency f_k and phase shift φ_k, k = 0, 1, 2, ..., n − 1:

x_s(t) = Σ_{k=0}^{n−1} F_k e^{i(k 2π f_1 t + φ_k)},  t ∈ ⟨0, T⟩.   (11)

Equation (11) holds for a real signal band-limited by the highest frequency component f_s/2, assuming periodicity with the base period n/f_s. For the first frequency component of the spectrum (the so-called base frequency) f_1, and for the frequency resolution Δf of the spectrum, it holds that f_1 = Δf = 1/T = f_s/n, where T is the length of the analysed record of the sensor signal in time units; the record length depends on the number of samples and on the sampling frequency, T = n/f_s. In equation (11) the index k is limited to the range 0 to n − 1, because in the discrete Fourier transform (DFT) the number of spectral lines must correspond to the number of samples in the record. The spectrum is complex, and thus comprises an amplitude spectrum and a phase spectrum; the number of spectral lines equals the number n of samples in the analysed record. Owing to aliasing and to the symmetry of the discrete spectrum about f_s/2, the usable part of the complex spectrum extends only up to the Nyquist frequency f_s/2. Therefore, for frequency analysis in industrial practice, the usable number of discrete complex spectral lines according to (11) is n/2 (see Figure 6).

To assess the amount of information contained in the signal spectrum, we must start from the number n_|F| of possible shapes of the amplitude spectrum, and likewise from the number n_φ of possible phase spectra. Owing to the discrete evaluation of the signal, both counts are finite. For simplicity, consider only the analysis of the amplitude spectrum, which is more common in practice. When counting the possible amplitude spectra of the signal, we must realize that the spectrum consists of n/2 spectral lines, each of which can take one of 2^m values. From the combinatorial point of view, these are variations with repetition of class n/2 from 2^m elements, since each amplitude level can occur on several spectral lines. Then

n_|F| = V'_{n/2}(2^m) = (2^m)^{n/2}.   (12)

The entropy H_F (bit) of a measurement based on examining the amplitude spectrum of the sensor signal is then given by

H_F = log_2 n_|F| = log_2 (2^m)^{n/2} = (n/2) log_2 2^m = (n/2) m  (bit).   (13)

If, for example, an m = 12 bit AD converter is used to digitize the analog signal of the sensor, and the two-sided complex amplitude spectrum of a record of length n = 1024 is evaluated, the entropy of such a measurement is H_F = (1024/2) × 12 = 6144 (bit).

Fig. 5: Signal evaluation by a digital system in the frequency domain

In Figure 6, as an example from practice, the two-sided complex amplitude spectrum of the accompanying acoustic signal generated during the disintegration of rock by rotary drilling is shown [29]. The entropy of the spectrum has the value of 6144 bit. The measurement was carried out on a horizontal laboratory drilling stand: a record of n = 1024 samples, obtained at the sampling frequency f_s = 18 kHz from the microphone signal, was evaluated using an m = 12 bit AD converter. The purpose of analysing this acoustic signal is to find information that can be used for an optimal control of the drilling process [30]; the basic optimization criteria are, in this case, the minimum specific energy of disintegration and the maximum drilling speed [31].

Fig. 6: The two-sided complex amplitude spectrum of an acoustic signal from the rock drilling process
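A sketch of the spectrum entropy (13), together with a short synthetic record illustrating that an n-sample real record yields usable spectral lines only up to f_s/2 (the 2 kHz tone is hypothetical, not the drilling signal):

```python
import numpy as np

def spectrum_entropy_bits(n, m):
    """Potential entropy of an n-sample amplitude spectrum from an m-bit ADC, eq. (13)."""
    return n // 2 * m

print(spectrum_entropy_bits(1024, 12))     # 6144 bit, as in the worked example

# Synthetic record with the sampling frequency and record length from the text:
fs, n = 18000.0, 1024
t = np.arange(n) / fs
x = np.sin(2 * np.pi * 2000.0 * t)         # hypothetical 2 kHz tone
spectrum = np.fft.rfft(x)                  # one-sided spectrum: n/2 + 1 complex lines
print(spectrum.shape[0], "complex lines from DC up to f_s/2 =", fs / 2, "Hz")
```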
In practice, changes of the spectrum are sometimes examined in dependence on a change of a given variable. For example, in the technical diagnostics of rotary machines it is of interest to observe how the spectrum of the vibrations changes with increasing revolutions (rpm). We then speak of a so-called spectrogram, i.e., the dependence of the spectrum on time (or, in this example, on the increasing revolutions). Let us assume that we have measured s spectra corresponding to time instants 0, 1, 2, ..., s − 1. This sequence of spectra represents the spectrogram as a highly integrative information source. When calculating its entropy as a potential information content, we must count the number n_|F|s of possible spectrograms consisting of s spectra. Based on the previous combinatorial considerations, a spectrogram containing s spectra of signal records of length n samples, obtained by an m-bit AD converter, represents variations with repetition of class s from n_|F| elements; each spectrum may occur at several time instants. Hence

n_|F|s = V'_s(n_|F|) = n_|F|^s.   (14)

The entropy H_|F|s (bit) of a measurement based on examining the spectrogram of the sensor signal is then given by

H_|F|s = log_2 n_|F|s = log_2 n_|F|^s = s log_2 n_|F| = s (n/2) m  (bit).   (15)

If, for example, we used an AD converter with the width m = 12 bit to digitize the analog signal of the sensor, and evaluated a spectrogram containing s = 10 complex amplitude spectra, each obtained by analysing a signal record of length n = 1024 samples, then the entropy of such a measurement would, according to (15), be H_|F|s = 10 × (1024/2) × 12 = 61440 (bit).

As an example of spectrogram analysis we can present the spectral analysis of the acoustic signal accompanying the rock drilling process [32, 33]. The aim of the analysis is to obtain information on the actual conditions of rock disintegration by rotary drilling, for the purposes of an optimal control of this process (see Figure 7) [34–37].

Fig. 7: Spectrogram of the acoustic signal as the accompanying noise in the rotary disintegration of granite

The increase in entropy compared with the classical analog technique, and also with the time-domain digital technique, is thus significant when the signal is evaluated in the frequency domain. This is illustrated in Table 1.

Table 1: Approximate entropy values for the individual methods of evaluating the sensor signal

  measuring system     entropy
  1 – analog system    H_aδ = 5.67 bit
  2 – digital system   H_digm = 12 bit
  3 – hybrid system    H_hyb = 23.25 bit
  4 – spectrum         H_F = 6144 bit
  5 – spectrogram      H_|F|s = 61440 bit

To highlight the differences between the measuring systems, the potential entropy values of the individual signal processing methods were recalculated to the decimal logarithm log_10 H(Φ), as shown in Figure 8.

Fig. 8: Potential entropy values of the individual signal processing methods
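The table can be reproduced, including the decimal logarithms used for Figure 8, with the sketch below; the parenthesised parameters simply restate the worked examples above:

```python
import math

def spectrogram_entropy_bits(s, n, m):
    """Potential entropy of a spectrogram of s spectra, eq. (15): H = s*(n/2)*m."""
    return s * (n // 2) * m

systems = {
    "analog (delta = 0.01)":      math.log2(51),
    "digital (m = 12)":           12,
    "hybrid (tau*f_g = 1e7)":     math.log2(1e7),
    "spectrum (n = 1024, m = 12)": 1024 // 2 * 12,
    "spectrogram (s = 10)":       spectrogram_entropy_bits(10, 1024, 12),
}
for name, h in systems.items():
    # log10 of the entropy, as plotted in figure 8
    print(f"{name:28s} H = {h:8.2f} bit, log10 H = {math.log10(h):.2f}")
```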
6. Summary and conclusions

Table 1 compares the individual measuring systems. Based on the entropy values of the sensor signal evaluation, it can be seen that the analog measuring system has the lowest information value. This is understandable, since it belongs to the classical measuring systems; nevertheless, it is still used at the lowest process level of control. The digital measuring system extends the analog system by a part that converts the analog variable into a number, in a form suitable for subsequent processing. The hybrid system is an example of a measuring system in which the benefits of both systems are combined. Processing the sensor signal in the frequency domain has, in terms of entropy, a high information value. This is confirmed by its numerous uses in industrial practice, in areas ranging from mining (e.g., processing of signals from geological survey wells), through the automotive industry (e.g., processing of signals generated by the car and their influence on the driver), to medicine (e.g., processing of ECG cardiac and EEG brain signals). Whether a developed experimental measuring system can be implemented successfully, and is thus practically applicable, is always decided by its deployment in a real environment.

It should be noted that current industrial distributed control systems involve increasingly complex and extensive transmission and processing of data. Distributed control systems use a variety of communication buses. This means that at the lower levels of control, the necessary technical means with digital information processing are used, from intelligent sensors and analyzers to PLC systems and workstations. At this lower level, the current state is characterized by the use of classical measuring systems together with intelligent or smart elements that are capable of cooperating through industrial communication networks. The described problem is serious enough, when implementing new measuring systems or signal processing, to deserve increased attention. The correctness and effectiveness of the presented measuring systems were verified within research activities and problem-oriented projects.

Acknowledgements

This work was supported by the Slovak Research and Development Agency under contract APVV-14-0892 and by grant VEGA 1/0273/17 from the Slovak Grant Agency for Science.

References

[1] C. Shannon. A mathematical theory of communication. Bell System Technical Journal 27(3):379–423 and 623–656, 1948. doi:10.1109/9780470544242.ch1.
[2] C. Shannon, W. Weaver. The mathematical theory of communication. The University of Illinois Press, Urbana, IL, 1964. doi:10.2307/3611062.
[3] M. Beliş, S. Guiaşu. A quantitative-qualitative measure of information in cybernetic systems. IEEE Trans. Inf. Theory IT-14:593–594, 1968. doi:10.1109/tit.1968.1054185.
[4] E. T. Jaynes. Information theory and statistical mechanics. Phys. Rev. 106(4):620–630, 1957. doi:10.1103/physrev.108.171.
[5] A. Delgado-Bonal, J. Martín-Torres. Human vision is determined based on information theory. Scientific Reports 6(1), 2016. doi:10.1038/srep36038.
[6] J. Shore, R. Johnson. Axiomatic derivation of the principle of maximum entropy and the principle of minimum cross-entropy. IEEE Trans. Inf. Theory 26(1):26–37, 1980. doi:10.1109/tit.1980.1056144.
[7] A. Rényi. On measures of entropy and information. Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, Berkeley–Los Angeles, pp. 547–561, 1961.
[8] M. Donald. On the relative entropy. Commun. Math. Phys. 105:13–34, 1986. doi:10.1007/bf01212339.
[9] L. R. Nemzer. Shannon information entropy in the canonical genetic code. Journal of Theoretical Biology 415:158–170, 2017. doi:10.1016/j.jtbi.2016.12.010.
[10] S. Yu, T.-Z. Huang, X. Liu, W. Chen. Information measures based on fractional calculus. Inf. Process. Lett. 112(23):916–921, 2012. doi:10.1016/j.ipl.2012.08.019.
[11] S. Yu, T.-Z. Huang. Exponential weighted entropy and exponential weighted mutual information. Neurocomputing 249:86–94, 2017. doi:10.1016/j.neucom.2017.03.075.
[12] K. Krechmer. Relational measurements and uncertainty. Measurement 93:36–40, 2016. doi:10.1016/j.measurement.2016.06.058.
[13] K. Krechmer. Relative measurement theory, the unification of experimental and theoretical measurements. Measurement 116:77–82, 2018. doi:10.1016/j.measurement.2017.10.053.
[14] N. Travers. Exponential bounds for convergence of entropy rate approximations in hidden Markov models satisfying a path-mergeability condition. Stochastic Processes and their Applications 124(12):4149–4170, 2014. doi:10.1016/j.spa.2014.07.011.
[15] M. Thomas, J. Thomas. Elements of information theory. John Wiley and Sons, 1991.
doi:10.1002/0471200611.
[16] T. Schneider. Information theory primer with an appendix on logarithms. National Cancer Institute, 2007.
[17] P. Duhamel, M. Vetterli. Fast Fourier transforms: a tutorial review and a state of the art. Signal Processing 19:259–299, 1990. doi:10.1016/0165-1684(90)90158-u.
[18] A. V. Oppenheim, R. W. Schafer. Discrete-time signal processing. Prentice-Hall, 1989.
[19] K. Nelson. A definition of the coupled-product for multivariate coupled-exponentials. Physica A: Statistical Mechanics and its Applications 422:187–192, 2015. doi:10.1016/j.physa.2014.12.023.
[20] M. Frigo, S. G. Johnson. FFTW: an adaptive software architecture for the FFT. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, vol. 3, pp. 1381–1384, 1998. doi:10.1109/icassp.1998.681704.
[21] M. H. Hayes. Statistical digital signal processing and modeling. John Wiley and Sons, 1996.
[22] J. W. Cooley, J. W. Tukey. An algorithm for the machine computation of the complex Fourier series. Mathematics of Computation 19:297–301, 1965. doi:10.2307/2003354.
[23] S. Shreedharan, C. Hegde, S. Sharma, H. Vardhan. Acoustic fingerprinting for rock identification during drilling. International Journal of Mining and Mineral Engineering 5(2):89–105, 2014. doi:10.1504/ijmme.2014.060193.
[24] H. Zheng, Y. Mingjun, S. Fuyu. A new method for measuring Young's modulus by optical fiber sensor. In Proceedings of the 2012 Third International Conference on Mechanic Automation and Control Engineering (MACE '12), vol. 3, pp. 1662–1664. IEEE Computer Society, 2012.
[25] J. Boroška, J. Krešák, P. Peterka. Estimation of quality for steel wire ropes according to their mechanical properties. Acta Montanistica Slovaca 1:37–42, 1997.
[26] E. Štroffek, I. Leššo. Acoustic method for measurement of Young's modulus of steel wire ropes. Metalurgija 40(4):219–221, 2001.
[27] I. Leššo, J. Futó, F. Krepelka, et al. Control with acoustic method of disintegration of rocks by rotary drilling. Metalurgija 43(2):119–121, 2004.
[28] P. Peterka, P. Kačmáry, J. Krešák, et al. Prediction of fatigue fractures diffusion on the cableway haul rope. Engineering Failure Analysis 59:185–196, 2016. doi:10.1016/j.engfailanal.2015.10.006.
[29] I. Leššo, P. Flegner, et al. New principles of process control in geotechnics by acoustic methods. Metalurgija 46(3):165–168, 2007.
[30] G. Wittenberger, M. Cehlár, Z. Jurkasová. Deep hole drilling modern disintegration technologies in process of HDR technology. Acta Montanistica Slovaca 17(4):241–246, 2012.
[31] I. Leššo, P. Flegner, et al. Research of the possibility of application of vector quantisation method for effective process control of rocks disintegration by rotary drilling. Metalurgija 49(1):61–65, 2010.
[32] Masood, H. Vardhan, M. Aruna, B. R. Kumar. A critical review on estimation of rock properties using sound levels produced during rotary drilling. International Journal of Earth Sciences and Engineering 5(6):1809–1814, 2012.
[33] P. Flegner, J. Kačur, M. Durdán, et al. Measurement and processing of vibro-acoustic signal from the process of rock disintegration by rotary drilling. Measurement (Journal of the International Measurement Confederation) 56:178–193, 2014. doi:10.1016/j.measurement.2014.06.025.
[34] J. Jurko, A. Panda, M. Gajdoš. Study of changes under the machined surface and accompanying phenomena in the cutting zone during drilling of stainless steels with low carbon content. Metalurgija 50(2):113–117, 2011.
[35] J. Jurko, M. Dzupon, A. Panda, et al.
Deformation of material under the machined surface in the manufacture of drilling holes in austenitic stainless steel. Chemicke Listy 105(16):600–602, 2011.
[36] P. Flegner, J. Kačur, M. Durdán, et al. Significant damages of core diamond bits in the process of rocks drilling. Engineering Failure Analysis 59:354–365, 2016. doi:10.1016/j.engfailanal.2015.10.016.
[37] I. Leššo, P. Flegner, J. Futó, Z. Sabová. Utilization of signal spaces for improvement of efficiency of metallurgical process. Metalurgija 53(1):75–77, 2014.

Acta Polytechnica 57(5):316–320, 2017
doi:10.14311/ap.2017.57.0316
© Czech Technical University in Prague, 2017, available online at http://ojs.cvut.cz/ojs/index.php/ap

Strength and dynamic analysis of a structural node limiting the multi-output gear mechanism

Mária Kačalová(a,*), Slavko Pavlenko(b)
a Department of Technological Systems Design, Bayerova 1, 080 01 Prešov, Slovak Republic
b Faculty of Manufacturing Technologies, Bayerova 1, 080 01 Prešov, Slovak Republic
* corresponding author: maria.kacalova@tuke.sk

Abstract. The rapidly advancing technological development leads to designing and researching a new multi-output gear mechanism. The investigated new double-output gear mechanism has two output coaxial shafts located opposite the input shaft, on the other side of the gearbox. The gear mechanism achieves high gear ratios. Its limiting structural node is the output stage, to which the gears belong. The problem is addressed through an analysis of the resistance of the tooth flanks to contact and bending stresses. The paper compares the analytical computations with a modal analysis performed on the model. We expect that the new findings will be beneficial for a further optimization of the gear mechanism.

Keywords: double-output gear mechanism; gears; strength and dynamic analysis.

1. Introduction

From the continuing trends of current development it is believed that the 21st century will see a rise in production to meet higher customer demands for products. This places demands on a flexible response to customer requirements, and also on the construction machinery of the manufacturing technology related to a concept with the required technical parameters. The development process includes the search for new principles of construction of gear mechanisms.
These have been developed with the aim of increasing the load-bearing capacity and lifespan, and with the possibility of creating a gear mechanism with a wide range of gear ratios and with one or several outputs. As the closest current counterpart, in terms of comparable principles and possible applications in production engineering, high-precision gears have been chosen. The term "high-precision gears", or "gears of high accuracy", was coined by the producers of the respective products, who declare the high accuracy of the transfer as their characteristic feature. In this regard it should be noted that this kind of reducer has a relatively short history of development and implementation. Currently, the world's leading companies producing these reducers include the Japanese companies Teijin Seiki, Sumitomo (Cyclo) and Harmonic Drive. High-precision gears are predestined for applications in machines and equipment that require virtually zero backlash, high positioning accuracy and repeatability, and high stiffness, together with a higher gear ratio.

Fig. 1: Functional model of the double-output gear mechanism

2. Materials and methods

The Department of Technical Systems Design at the Faculty of Manufacturing Technologies in Prešov of the Technical University in Košice has designed the high-precision multi-output transmission gear mechanism shown in Figure 1, based on the existing utility model No. 3937, "Harmonic double-output gear mechanism", by Ing. Jozef Haľko, PhD., and prof. Ing. Vladimír Klimo, CSc. The gear mechanism achieves small and large gear ratios at the same time. The limiting output stage is a hollow shaft with an internal gearing at its end, which needs to be investigated from the strength and dynamics points of view.

Figure 2 shows the basic scheme of the double-output gear mechanism, which has two output coaxial shafts II and III located opposite the input shaft I on the other side of the gearbox. The output stage denoted II in Figure 2 was investigated.

Fig. 2: Principal scheme of the double-output gear mechanism: I – input shaft, II – output shaft, III – output shaft, S – central axis, e – eccentricity, z1–z6 – numbers of teeth, 1 – frame, 2 – gear wheel

Fig. 3: Output stage of the gear mechanism

Table 1: Results of the strength analysis for fatigue in contact

                                                 gear    pinion
  diameter d [mm]                                101     101
  Hertz pressure under ideal load σ_HO [MPa]     180.41  99.52
  coefficient of the additional forces K_H [–]   2.486   2.486
  allowable Hertz pressure σ_HP [MPa]            281.15  281.15
  Hertz stress in contact σ_H [MPa]              284.46  156.91

Table 2: Results of the strength analysis for a one-off (disposable) load in contact

                                                            gear    pinion
  circumferential force at the first load degree F_t1 [N]   3376.7  419.69
  allowable contact stress σ_HP,max [MPa]                   620     620
  contact stress at maximum load σ_H,max [MPa]              348.39  192.17
  coefficient of the dynamic external forces K_AS [–]       1.5     1.5
  Hertz stress in contact σ_H [MPa]                         284.46  156.91

2.1. Strength analysis of the output stage

The output stage shown in Figure 3 is the most stressed part and is loaded with high torques. It is a hollow shaft with an internal gearing at its end. This internal gearing has been subjected to a strength analysis.
The strength analysis was performed on the gearing for:
• fatigue in contact (Table 1);
• a one-off (disposable) load in contact (Table 2);
• bending fatigue (Table 3);
• a one-off (disposable) repeated bending load (Table 4).

The strength calculations of the investigated gearing were carried out according to the standard STN 01 4686, based on the basic input parameters of the given gearing.

Table 3: Results of the strength analysis for bending fatigue

                                                                  gear    pinion
  coefficient of the additional load K_F [–]                      2.465   2.465
  bending stress in the dangerous dedendum section σ_F [MPa]      260.71  33.74
  permissible bending stress σ_FP [MPa]                           254.29  254.29
  safety factor against fatigue fracture in the dedendum S_F,min  1.4     1.4

Table 4: Results of the strength analysis for a one-off (disposable) repeated bending load

                                                        gear    pinion
  largest local bending stress in dedendum σ_F,max [MPa]  391.07  50.06
  permissible bending stress at maximum load σ_FP,max [MPa]  712  712
  bending stress at maximum load σ_F,st [MPa]             890     890

2.2. Dynamic analysis of the output stage

The output stage of the gear mechanism shown in Figure 3 was analysed dynamically for fatigue. The following condition has to be fulfilled: k_c < k, where k_c is the minimum dynamic safety and k is the overall safety margin. The minimum dynamic safety for a less accurate calculation without experimental verification is k_c = 1.5–1.8. The overall safety margin is calculated as

k = k_σ k_τ / sqrt(k_σ^2 + k_τ^2),   (1)

where k_τ is the safety against fatigue failure under torsional loading and k_σ is the safety against fatigue failure under bending stress.

Table 5: Results of the dynamic analysis of the output stage (cross-sections a and b)

        a      b
  k_σ   5.23   9.79
  k_τ   21.25  23.34
  k     5.08   9.03
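Because the printed form of equation (1) was garbled, the reconstruction k = k_σ k_τ / sqrt(k_σ^2 + k_τ^2) can be verified against Table 5; the sketch below reproduces both tabulated values of k:

```python
import math

def overall_safety_margin(k_sigma, k_tau):
    """Combined safety factor k = k_sigma*k_tau / sqrt(k_sigma**2 + k_tau**2), eq. (1)."""
    return k_sigma * k_tau / math.sqrt(k_sigma**2 + k_tau**2)

# Cross-sections a and b of the output stage, values from table 5:
for name, (ks, kt) in {"a": (5.23, 21.25), "b": (9.79, 23.34)}.items():
    k = overall_safety_margin(ks, kt)
    print(f"section {name}: k = {k:.2f}, satisfies k_c < k for k_c = 1.5-1.8: {k > 1.8}")
# prints k = 5.08 and k = 9.03, matching table 5
```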
2.3. Modal analysis of the output stage

The modal analysis method can solve many technical problems encountered in the design, manufacture and operation of mechanical systems or parts. In this work the finite element method (FEM), a frequently used numerical method, was employed. The analysed output stage of the double-output gear mechanism was modelled in ANSYS Workbench 14.5. The calculation model was created with exactly defined geometry, a high-quality finite element mesh and the corresponding load. Figure 4 shows the finite element mesh of the output stage of the double-output gear mechanism, and Figures 5 and 6 show the stress in the meshing of the mating teeth. On the wheel, the maximum stress of 335.66 MPa is reached at the point of contact of the teeth; the maximum stress reached on the pinion is 187.50 MPa.

Fig. 4: Finite element mesh created on the model of the output stage
Fig. 5: Analysis of the pinion teeth at contact
Fig. 6: Analysis of the gear at the point of contact of the teeth

3. Discussion

Using the finite element method on the output stage of the double-output gear mechanism, it has been found that the greatest stress, 335.66 MPa, occurs on the gear wheel, as opposed to 187.50 MPa on the pinion. The analytically calculated stress values for the wheel and the pinion are 284.46 MPa and 156.91 MPa, respectively. The analytical values thus differ from the values generated by the ANSYS analysis, probably owing to rounding of the results and to the distribution of the finite element mesh. The finite element method (FEA) is widely used to optimize gearing; workplaces such as the College of Information and Control Engineering in China and the Department of Mechanical Engineering, Curtin University, Bentley, Western Australia, also use this method in their work. In a similar way, by comparing calculated and generated values, the School of Mechanical Engineering, Department of Design and Automation, VIT University, Vellore, Tamil Nadu, India followed these principles; they used this approach to analyse the gear system of a turboshaft aero engine reduction gearbox.

4. Conclusion

The main task of this paper was to find the limit states of the output stage of the double-output gear mechanism. Based on these results, we can continue to optimize the entire double-output gear mechanism. The authors will continue their work on the research, development, testing and diagnostics of these newly developed transmission mechanisms at the Department of Technical Systems Design, Faculty of Manufacturing Technologies. We can state that the finite element method (FEA) is a reliable method for optimizing individual parts of the gear mechanism as well as other components of manufacturing technologies.

References

[1] J. Haľko, V. Klimo. Dvojstupňový viacvýstupový harmonický prevod (Two-stage multi-output harmonic gear). Utility model No. 3937, registered by the Industrial Property Office of the Slovak Republic, 2004.
[2] J. Haľko, I. Vojtko. Simulácia prevodovky vo virtuálnom prostredí (Simulation of a gearbox in a virtual environment). Acta Mechanica Slovaca 11(4-A):89–94, 2007. ISSN 1335-2393.
[3] J. Paško, S. Pavlenko, J. Haľko. Conditions for toothing in two-stage multi-output transmissions. In: Scientific Bulletin, vol. 20, serie C, pp. 315–320, 2006. ISSN 1224-3264.
[4] J. Paško, J. Haľko, S. Pavlenko. Možné varianty riešenia dvojstupňových prevodov s vnútorným obežným kolesom (Possible solution variants of two-stage gears with an inner impeller). In: Sborník mezinárodní 46. konference kateder částí a mechanismů strojů, Liberec: TU, 2005, pp. 269–273. ISBN 8070839511.
[5] J. Haľko, S. Pavlenko. Design strength calculation of cycloidal lantern gear. In: BarSU Herald, Physical and Mathematical Sciences, Engineering Sciences, No. 1, pp. 58–65, 2013. ISSN 2309-1339.
[6] M. Byrtus. Modeling of gearboxes with time dependent meshing stiffness. In: Zeszyty Naukowe Katedry Mechaniki Stosowanej, No. 21, Proceedings of the conference Applied Mechanics, Jaworzynka, Poland, 2003, pp. 35–30. ISBN 83-917224-3-0.
[7] X. L., S. Liu, Y. Wang. Design and analysis of a stator HTS field-modulated machine for direct drive applications. IEEE Transactions on Applied Superconductivity, 2017. ISSN 1051-8223.
[8] S. Xue, I. Howard. Dynamic modelling of flexibly supported gears using iterative convergence of tooth mesh stiffness. Mechanical Systems and Signal Processing, 2016. ISSN 0888-3270.
[9] R. J. Boby, E. Raj Kumar. Design and analysis of gear system for turboshaft aero engine reduction gearbox. International Journal of Mechanical Engineering and Technology, 2016. ISSN 0974-6846.
[10] V. Balambica, T. J. Prabhu, R. V. Babu. Finite element application of gear tooth analysis. Advanced Materials Research 889–890:527–531, 2014 (4th International Conference on Advances in Materials and Manufacturing, ICAMMP 2013, Kunming, China). ISSN 1022-6680.
[11] A. A. Umar, A. S. Ahmad, A. G. Yusuf, Z. I. Bibi Farouk.
Effect of face width on bending stress of spur gear using AGMA and FEA. Advanced Materials Research 945–949:840–844, 2014. Trans Tech Publications Ltd. ISSN 1022-6680.
[12] X. Q. Sun, Y. Zhang, D. W. Shi. Dynamics simulation and fatigue life study of the drive system of rack and pinion climbing vertical ship lift. Applied Mechanics and Materials 687–691:93–96, 2014 (2014 International Conference on Manufacturing Technology and Electronics Applications, ICMTEA 2014, Taiyuan, China). ISSN 1660-9336.

Stellar image interpretation system using artificial neural networks: unipolar function case

F. I. Younis, A. El-Bassuny Alawy, B. Šimák, M. S. Ella, M. A. Madkour

Abstract. An artificial neural network based system for interpreting astronomical images has been developed. The system is based on feed-forward artificial neural networks (ANNs) with error back-propagation learning. Knowledge about images of stars, cosmic ray events and noise found in images is used to prepare two sets of input patterns to train and test our approach. The system has been developed and implemented to scan astronomical digital images in order to segregate stellar images from other entities. It has been coded in C language for users of personal computers. An astronomical image of a star cluster is undertaken as a test case. The obtained results are found to be in very good agreement with those derived from the DAOPHOT II package, which is widely used in the astronomical community. It is proved that our system is simpler, much faster and more reliable. Moreover, no prior knowledge, or initial data from the frame to be analysed, is required.

Keywords: neural networks, knowledge-based system, stellar images, image processing.

1 Introduction

During the last three decades progress has been made in the field of observational astronomy. First, sophisticated low-light-level detectors were invented, developed and used for direct imaging. Among these detectors, charge coupled device (CCD) chips have become the dominant detector for various applications, because they possess a high detective quantum efficiency, a linear response and a very wide dynamic range. Secondly, fast electronic (mainframe, personal and workstation) computing machines with huge memories have greatly supported astronomy, not only in automating the observational techniques but also in data acquisition, reduction and analysis. In the field of stellar astronomy, CCD detectors and electronic computers have played a vital role, since numerous frames have been taken and analysed by various packages to identify stellar images and extract their astronomical parameters. This has been achieved by modelling a stellar image either empirically [1, 2, 3], or mathematically [4, 5, 6], or semi-empirically [7, 8]. All these approaches require user intervention to set initial values for the parameters of the adopted model, as well as the form of the model itself. Moreover, several non-linear function-fitting processes have to be performed for many parameters, which requires much computing time. In addition, in many cases false results have been obtained, identifying images as stellar where they are not, and vice versa.

This paper deals with the development of an artificial neural network approach for stellar image recognition. It is structured as follows. Section 1 introduces the project. Section 2 states the problem and the objective. Section 3 describes the developed system, specifying all relevant concepts and the implementation details. In Section 4 a case study is selected and used to verify the applicability of the system; the results of this approach are reported and discussed, taking into account standard published approaches, and are compared with the results derived by one of the best known and most widely applied methods in astronomy. Some concluding remarks are presented in Section 5.

2 Statement of problem and objective

CCD detectors are capable of imaging a huge number of celestial objects (stars, galaxies, etc.) on a single frame under the same conditions. In such a frame, each picture element (pixel) contains a data number representing, linearly, the detected photons coming from one or more of several sources: 1) a seen object, 2) an unseen object, 3) localised image defects, 4) cosmic ray events, and 5) diffuse sources, including, but not necessarily limited to, the terrestrial night sky and scattered light in the camera.
Hence, an obtained frame may contain many objects, some of which are actually stars, while others are not, though their data distributions may be similar to distributions of stellar origin, owing to some of the sources stated above. The main objective of the present work is to develop a simple, fast and reliable stellar image identification method that is able to identify stellar images effectively and to differentiate between them and other objects on CCD frames. The approach adopted here is based on artificial neural network concepts.

3 The present method

An artificial neural network (ANN) can be defined as a category of mathematical algorithms that produces solutions for various specific problems. ANNs have been biologically inspired to emulate the neural networks found in living organisms. An extremely important feature of an ANN is its learning capability, which is very useful and powerful in a wide range of applications. Supervised learning can be advantageously used here, since it is a straightforward task to prepare the necessary input/output patterns for network training. This justifies the selection of multi-layer feed-forward networks for the required classification task. The error back-propagation learning algorithm is used in the present work to train the network. We aim at investigating the possibility of establishing a stellar image interpretation system using
3.2 input patterns
two pattern sets were adopted, one for training and the other for testing the present siis-ann. each set contains 341 input patterns (119 stellar images, 111 cosmic ray events and 111 noise patterns). each pattern comprises the data of a 5 × 5 pixel array, where the central pixel represents the centre of its relevant identity. all values were normalised with respect to the central value, and the patterns were randomised within each set. the chosen stellar input patterns cover a wide range of brightness, while the cosmic ray event patterns span energies from low to high. the input patterns were constructed as follows:
1. mark the object of interest (star image, cosmic ray event, or noise) on some ccd frames.
2. find the brightest pixel.
3. construct a 5 × 5 window with the brightest pixel at its centre.
4. extract the values of the pixels corresponding to the constructed window.
5. normalise the obtained values by dividing them by the value of the brightest pixel.
figure 2 demonstrates the 24 network input values z_i: each is the intensity of one of the 24 pixels surrounding the brightest central pixel, normalised by the central value, i.e.

z_i = i(x + Δx, y + Δy) / i(x, y),   Δx, Δy ∈ {-2, -1, 0, 1, 2}, (Δx, Δy) ≠ (0, 0),

where i(x, y) is the value of the brightest central pixel. examples of these input patterns are given below.
fig. 2: diagram illustrating the input pattern values
3.2.1 star input pattern: figures 3 and 4 represent the input patterns for a bright star and a faint star, respectively.
3.2.2 cosmic ray event input pattern: figures 5, 6 and 7 show examples of input patterns for high, intermediate and low energy cosmic ray events, respectively.
fig. 3: bright star input pattern; a) 5 × 5 window pixel (original values), b) 5 × 5 window pixel (normalised values)
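the five steps above map directly onto a few lines of code; the sketch below builds one normalised input pattern, with a hypothetical frame matrix img and a peak assumed to lie at least two pixels from the border.

  % build one normalised 5x5 input pattern around a brightness peak
  img = rand(9);  img(5,5) = 2;   % hypothetical frame; peak forced at (5,5)
  x = 5; y = 5;                   % peak position (>= 2 pixels from the border)
  w = img(x-2:x+2, y-2:y+2);      % 5x5 window centred on the brightest pixel
  w = w / w(3,3);                 % normalise by the central (brightest) value
  z = w(:);  z(13) = [];          % drop the centre, leaving the 24 inputs z_i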
fig. 4: faint star input pattern; a) 5 × 5 window pixel (original values), b) 5 × 5 window pixel (normalised values)
fig. 5: high-energy cosmic ray input pattern; a) original values, b) normalised values
fig. 6: intermediate-energy cosmic ray input pattern; a) original values, b) normalised values
3.2.3 noise input pattern: figure 8 provides an example of the noise input pattern.

3.3 output patterns
the present siis-ann was trained with all desired output values (o_i) set to zero, except the value for the class of the input pattern, which was set to unity. table 1 summarises the desired output values for a star, a cosmic ray, and noise.

3.4 training error
for the purpose of weight adjustment in each training step, the error to be reduced is usually that computed only for the pattern currently being processed. to assess the quality and success of the training, however, the joint errors must be computed for the entire batch of training patterns. it should be pointed out that networks in classification applications may perform as excellent classifiers and exhibit zero decision errors while still yielding substantial continuous response (cumulative and root mean square) errors. in this case, the decision error adequately reflects the accuracy of the neural network classifier, and this error was adopted for the network training. while training the present siis-ann, all the desired output values were set to the values listed in table 1. in addition, the decision error was used to terminate the training process when it reached, practically, a zero value. this error is defined by

e_d = n_err / (p k),

where n_err is the total number of bit errors resulting at the k thresholded outputs over the complete cycle, and p is the number of patterns. in our siis-ann, k = 3, and n_err is computed as follows:
- at the beginning of each training cycle, set n_err = 0;
- for an individual pattern: if the desired output = 1 and the actual output <= 0.9, or the desired output = 0 and the actual output >= 0.1, then n_err is incremented and the step is re-executed;
- at the end of the cycle, compute e_d = n_err / (p k).
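a minimal sketch of this decision-error count over one cycle follows; the desired and actual output matrices here are hypothetical stand-ins for the training batch.

  % decision error e_d over one cycle (p patterns, k thresholded outputs)
  desired = [1 0 0; 0 1 0; 0 0 1; 1 0 0];                        % hypothetical
  actual  = [0.95 0.02 0.01; 0.12 0.93 0.05; 0.03 0.04 0.97; 0.60 0.20 0.08];
  [p, k]  = size(desired);
  miss  = (desired == 1 & actual <= 0.9) | (desired == 0 & actual >= 0.1);
  n_err = sum(miss(:));     % total bit errors at the thresholded outputs
  e_d   = n_err / (p * k);  % training stops when e_d is practically zero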
3.5 learning factors
implementation of the error back-propagation learning algorithm may encounter various difficulties. one of the problems is that the error minimisation procedure may produce only shallow local minima in many of the training cases. this may be sufficiently avoided by including some form of randomness in the algorithm, concerning:
a) initial weights. the weights of the network to be trained are typically initialised at small random values; the choice of initial weights is, however, only one of several factors affecting the training of the network toward an acceptable minimum error. the initial weights of the present network were initialised at small random values (between 0.0 and 0.1), except the bias values (v1,0, v2,0, v3,0, v4,0, w1,0, w2,0 and w3,0) of the hidden and output layers, which were initialised at small negative random values (between 0.0 and -0.1).
b) learning constant. various values were adopted for the learning constant η. it was found that the cumulative, root mean square and decision errors reached their minimum and acceptable values after 400 and 5000 learning cycles for η = 0.1 and η = 0.01, respectively [9]; the latter value (η = 0.01) accelerated the convergence of the present siis-ann without overshooting the solution.
c) momentum term. the momentum term was set to 0.5 in order to accelerate the convergence of the error back-propagation learning algorithm of our siis-ann.
fig. 7: low-energy cosmic ray input pattern; a) original values, b) normalised values
fig. 8: noise input pattern; a) original values, b) normalised values

table 1: the desired output values for a star, a cosmic ray, and noise
              o1   o2   o3
  star         1    0    0
  noise        0    1    0
  cosmic ray   0    0    1

3.6 siis-ann implementation
the developed siis-ann scans ccd image frames, searching for a local central peaked pixel (lcpp) whose datum is larger than those of the surrounding pixels [9, 10]. this technique significantly reduces both the search space and the required computing time. when such a pixel is found, the surrounding 24 inputs are adopted, normalised to the value of the lcpp, and then mapped to the network. the outputs are then evaluated, and the following three cases are considered for each input pattern:
case 1: if the first output > 0.9 and the second output < 0.1 and the third output < 0.1, then classify this pattern as a star image;
case 2: if the first output < 0.1 and the second output > 0.9 and the third output < 0.1, then classify this pattern as noise;
case 3: if the first output < 0.1 and the second output < 0.1 and the third output > 0.9, then classify this pattern as a cosmic ray.
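a minimal sketch of this scan-and-classify loop is given below; the frame and the network forward pass are hypothetical placeholders (the paper's own trained network and frames are not reproduced here).

  % sketch of the lcpp scan-and-classify loop of section 3.6
  img = rand(40);               % hypothetical ccd frame
  forward = @(z) rand(3,1);     % placeholder for the trained network's forward pass
  [nr, nc] = size(img);
  for x = 3:nr-2
    for y = 3:nc-2
      w = img(x-2:x+2, y-2:y+2);
      if img(x,y) < max(w(:)), continue, end   % not a local central peaked pixel
      z = w(:) / img(x,y);  z(13) = [];        % the 24 normalised inputs
      o = forward(z);
      if     o(1) > 0.9 && o(2) < 0.1 && o(3) < 0.1
        fprintf('star at (%d,%d)\n', x, y);
      elseif o(2) > 0.9 && o(1) < 0.1 && o(3) < 0.1
        fprintf('noise at (%d,%d)\n', x, y);
      elseif o(3) > 0.9 && o(1) < 0.1 && o(2) < 0.1
        fprintf('cosmic ray at (%d,%d)\n', x, y);
      end
    end
  end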
3.7 siis-ann test
the present method was coded in the c computer language and applied to the test pattern set (sec. 3.2). the results obtained agree exactly with the prior known input. however, it is necessary to consider a practical application in order to verify the applicability, reliability and limitations of siis-ann against one of the known packages currently used by astronomers. this was performed through the test case given in the following section.

4 application
4.1 test case
a test case was applied in order to evaluate our method in comparison with daophot [7], which is widely used by astronomers. a ccd frame of the star cluster m67 was available, together with the recently updated version of this code (daophotii) as well as all relevant and necessary input files, which need tedious and laborious work to acquire [11]. this cluster is well studied, and accurate astronomical investigations are available; under these circumstances, a comparison between the two methods was realistic. the frame was 320 × 350 pixels and was taken through the visual optical band within 30 seconds of exposure time.

4.2 results and discussion
the adopted frame was reduced employing the two codes. the present code identified 134 stellar images, while 137 stellar images were recognised by the other code. the two methods agree on 132 images, which are displayed as asterisks in figure 9. on the one hand, 2 images were identified only by the present method; these represent faint star images and are designated by circles in the figure. the other method could not find these. on the other hand, 5 images were recognised only by daophotii. of these, three stellar images are located at the first or the second pixel close to the frame borders; these are plotted as filled squares in the figure. this implies that these stars are partially imaged and hence of no astronomical importance; in addition, unreliable astronomical data are obtained by gaussian fitting, as in the daophotii case, from such incomplete data. our method is not able to deal with cases for which the 5 × 5 array data are not available. the two other images are plotted as open squares at a position where our method identified one image only. only one star can be seen at this location on the palomar observatory sky survey photographic plate; the id charts given by johnson [12], eggen [13] and kent et al. [14] also resolve this controversy in favour of our method. inspection of the data around this location reveals that the image is near saturation, and for this case daophotii assigns two overlapped images through a non-linear fitting process. the computer time needed for executing the siis-ann code has been estimated at 45 seconds on a pentium ii (233 mhz processor) personal computer; this includes displaying the image on the monitor, identifying stellar images and encircling them on the display. by contrast, a very much longer time is required for running daophotii, as reported in the manual [15]: daophotii needs a few minutes to a few hours, mainly for preparing the auxiliary input files necessary for execution. moreover, as pointed out by stetson (op. cit.), at least 4 mb of free disk space is required to reduce a 512 × 512 pixel frame, whereas, regardless of the frame size, a few kb of free disk space is sufficient for our method to save the position and peak data of the star and cosmic ray images found.

5 conclusions
the present study shows that, in comparison with daophotii, the method has the following advantages: a) better reliability; b) considerably higher recognition ability for faint star images; c) extremely short execution time; d) neither user intervention nor prior knowledge about the ccd frame is needed; e) no complicated computation or numerical fitting process is performed; f) no large free disk space is required, even for large image frames.
the only limitation of the present method is its inability to identify objects having centres at the first or second pixel close to the frame borders. such an object, even if it is a star image, is of no astronomical significance, as it is an incomplete image; hence this limitation can be disregarded. the siis-ann method possesses remarkable features for practical applications, and it is planned to extend its capability to identify images of galaxies.

references:
[1] tody, d.: spie, 1980, 264, p. 171
[2] lupton, r. et al.: astronomical journal, 1986, 91, p. 317
[3] linde, p.: highlights in astronomy, 1989, 8, p. 651
[4] bleacha, r.: astronomy and astrophysics, 1984, 135, p. 401
[5] penny, a. et al.: mnras, 1986, 220, p. 845
[6] mateo, m. et al.: proceedings of the first eso/st-ecf data analysis workshop, 1989, p. 69
[7] stetson, p.: publications of the astronomical society of the pacific, 1987, 99, p. 191
[8] gilliland, r. et al.: publications of the astronomical society of the pacific, 1988, 100, p. 754
[9] younis, f. i.: msc thesis, faculty of engineering, al-azhar university, cairo, egypt, 1998
[10] alawy, a. el-bassuny: astrophysics and space science, 2001 (in press)
[11] bojan, d.: http://david.fiz.uni-lj.si/daophotii, 1996
[12] johnson, h. et al.: astrophysical journal, 1955, 121, p. 616
[13] eggen, o. et al.: astrophysical journal, 1964, 140, p. 130
[14] kent, a. et al.: astronomical journal, 1993, 106, p. 181
[15] stetson, p.: user's manual for daophotii. dominion astrophysical observatory, victoria, canada, 1996

assoc. prof. ahmed el-bassuny alawy, e-mail: abalawy@hotmail.com, national research institute of astronomy and geophysics (nriag), helwan, cairo, egypt
doc. ing. boris šimák, csc., e-mail: simak@feld.cvut.cz, department of telecommunications engineering, czech technical university in prague, faculty of electrical engineering, technická 2, 166 27 praha 6, czech republic
eng. farag ibrahim younis elnagahy, msc, e-mail: faragelnagahy@hotmail.com, department of telecommunications engineering, czech technical university in prague, faculty of electrical engineering, technická 2, 166 27 praha 6, czech republic
prof. eng. mohamed ashraf madkour, department of systems & computers engineering, al-azhar university, faculty of engineering, cairo, egypt
assoc. prof. mohamed s. ella, national research institute of astronomy and geophysics (nriag), helwan, cairo, egypt

fig. 9: map of the images of stars recognised in the m67 cluster frame. north is up and east is to the left. asterisks: images identified by both methods; circles: images identified by the siis-ann method only; squares: images identified by the daophotii method only.

effect of the distribution system on drinking water quality

a. grünwald, b. šťastný, k. slavíčková, m. slavíček

the overall objective of this paper is to characterise the main aspects of water quality deterioration in a distribution system. the effect of residence time on chlorine uptake and the formation and evolution of disinfection by-products in distributed drinking water are discussed.
keywords: drinking water, distribution system, chlorine uptake, thm and haa formation.

1 introduction
in recent years attention has been drawn to understanding, characterising and predicting water quality behaviour in drinking water distribution systems. these systems act as large-scale chemical and biological reactors with a considerable residence time. improper design and operation may result in water of diminished quality in terms of increased water age, reduced disinfectant residual, increased levels of disinfection by-products and bacteria, and may impact the level of compliance with current and impending water-quality regulations [1]. in the past, several authors have proposed algorithms for simulating the spatial and temporal variations of distribution system water quality.
2 experimental part
this study concentrates on changes of water quality during transport in the pipelines that take drinking water from wtp plav to the town of tábor and to the towns and villages that lie on this route [2]. the length of these pipelines is about 80 km; the pipeline material is steel without any coating. they carry about 285 l/s from wtp plav. there are six reservoirs along the pipelines, at various intervals, with a total capacity of about 48 000 m3. the samples for this study were taken from six locations along the pipelines towards tábor (fig. 1). two methods were used to predict the decrease of residual chlorine in the distributed water.
fig. 1: the samples for this study were taken from six locations along the pipelines travelling to the town of tábor

3 modeling of chlorine decay
the first method for the decrease of chlorine in the bulk flow is based on first-order kinetics [3]:

c_t = c_0 e^(-k_1 t),

where c_t is the chlorine concentration at time t, in mg/l, c_0 the chlorine concentration at time t = 0, in mg/l, t the time in days, and k_1 the first-order decay coefficient in d^-1. the chlorine decay coefficient k_1 calculated from the chlorine concentrations measured in the period 1997-2000 ranged from 0.252 d^-1 to 1.336 d^-1, and the coefficients of correlation varied from 0.637 to 0.999; a value lower than 0.9 occurred only 5 times.
the second method for modelling chlorine loss in a pipe [4] combines the effects of bulk reaction, wall reaction and mass transfer. the overall rate of chlorine decay can be expressed as follows:

r = -k_b c - (k_f / r_h)(c - c_w),

where r is the rate of chlorine decay in mg·l^-1·d^-1, k_b the first-order bulk decay coefficient in d^-1, c the chlorine concentration in the bulk flow in mg/l, k_f the mass transfer coefficient in m/d, r_h the hydraulic radius in m, and c_w the chlorine concentration at the pipe wall in mg/l. the coefficient of chlorine uptake calculated by the second method includes two factors: chlorine uptake in the bulk flow, and chlorine transfer from the bulk liquid to the wall with the subsequent reaction with biofilm and consumption of chlorine in the corrosion process. the calculated values of k_b for the whole period 1997-2000 ranged from 0.00 to 1.336 d^-1, and the constant k_w1 (the constant of chlorine uptake on the wall) from 0.828 to 1.000 m/d. a coefficient of correlation lower than 0.9 occurred in only one case. this model matched the measured values more closely than the previous one. it was shown that the piping procedure has an important effect on the residence time in different parts of the distribution system: piping from wr hosín ii into wr chotýčany takes place during the night and is complete by 5 o'clock in the morning, and the residence time difference between the samples taken from wr chotýčany and wr hosín ii is about 5 hours.
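a minimal sketch of fitting the first-order bulk decay coefficient k_1 from sampled concentrations and predicting the residual is given below; the sample values are hypothetical, as the paper's raw measurements are not reproduced here.

  % fit first-order chlorine decay c_t = c0*exp(-k1*t) by regression on log c
  t = [0 1 2 3 5];                 % residence time, days (hypothetical samples)
  c = [0.85 0.55 0.38 0.24 0.11];  % measured chlorine, mg/l (hypothetical)
  p  = polyfit(t, log(c), 1);      % log c = log c0 - k1*t
  k1 = -p(1);                      % first-order decay coefficient, 1/d
  c0 = exp(p(2));
  r  = corrcoef(t, log(c));        % correlation coefficient of the fit
  ct = c0 * exp(-k1 * 4.56);       % predicted residual at, e.g., 4.56 d residence time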
the decay of active chlorine with increasing residence time in the distribution system is given in fig. 2. the concentration of active chlorine in the effluent from wtp plav or wr hosín ii ranged during the whole period between 0.15 and 0.85 mg/l. in the influent to wr sv. anna and veselí n/l, the concentration of active chlorine was zero. the residence time at the end of the pipeline (wr sv. anna) was between 4.56 and 7.26 days; the average residence time was 5.85 days.
fig. 2: decay of active chlorine residual as a function of residence time

4 deposition formation
previous research on the south bohemian transport pipeline has shown that deposits formed from particles of various origins and sizes can, under specific conditions, deteriorate the quality of distributed drinking water [5]. as a consequence of the resuspension of these deposits, there is a rise in the consumption of active chlorine and in the formation of disinfection by-products in the bulk flow. in our research a number of deposits were sampled from different sampling points along the transport pipeline: sample 1 comes from the shaft behind wr hosín, sample 2 from the shaft behind wr chotýčany, sample 3 from the shaft before wr zlukov and sample 4 from the shaft before wr sv. anna. the characteristics of these samples are given in table 1; they show no dependency on the sampling point. the content of suspended solids varied between 353 and 892 mg/l, and the content of volatile solids was rather low (1.3-3.1 % of suspended solids). the main part of the suspended solids consisted of iron; its content ranged between 84.5 and 215 mg/g (8.45-21.5 % of total suspended solids), with the highest values found in samples 1 and 2. the content of other metals, e.g. manganese (0.28-0.67 %), nickel (0.01-0.04 %) and zinc (0.007-0.05 %), was much lower.
sedimentation analysis was used to analyse the characteristics of the deposit suspension. from the measured values, the sedimentation rates of the particles and their proportions in the suspension were calculated; the comparison of the sedimentation curves is given in fig. 3. the results show that the largest proportion of particles depositing at higher rates was in sample 1, taken from the shaft near wtp plav; the reason could be its high iron content. the sedimentation curves of samples 2, 3 and 4 were similar, but a larger well-sedimenting proportion was found in sample 4, which contained the highest content of suspended solids and the highest content of toc. fig. 3 also shows that the deposits consisted mainly of particles that sediment at low rates and are able to pass from the deposits into the bulk flow when the hydraulic conditions in the distribution system change; these particles can deteriorate the quality of the transported drinking water.

table 1: characteristics of deposits from the pipeline wtp plav - town of tábor
  sample no.                        1       2       3       4
  total solids [mg/l]            1103     535     630     833
    volatile solids [mg/l]         84      35      81     119
    nonvolatile solids [mg/l]    1019     500     549     714
  suspended solids [mg/l]         892     353     439     627
    volatile solids [mg/l]         65      11      31      71
    nonvolatile solids [mg/l]     827     342     408     556
  cod_mn [mg/l]                  21.0     5.8     7.4    19.0
  cod_mn per susp. s. [mg/g]     16.1     1.4     3.0     7.0
  fe [mg/g]                     215.4   214.1   153.0    84.5
     [%]                         21.5    21.4    15.3    8.45
  mn [mg/g]                       3.0     2.8     4.2    6.70
     [%]                          0.3    0.28    0.42    0.67
  ni [mg/g]                      0.16    0.10    0.18    0.39
     [%]                        0.016    0.01    0.02    0.04
  tc [%]                         4.81    7.85    8.50    7.02
  ic [%]                         1.72    7.77    7.62    3.81
  toc [%]                        3.09    0.08    0.88    3.21
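the paper does not reproduce its sedimentation-rate calculation; a minimal sketch under the usual stokes'-law assumption (small spherical particles settling in the laminar regime, which may differ from the authors' actual procedure) would be:

  % settling velocity of a small spherical particle by stokes' law (assumption)
  g    = 9.81;      % gravitational acceleration, m/s^2
  d    = 50e-6;     % particle diameter, m (hypothetical)
  rhop = 2600;      % particle density, kg/m^3 (hypothetical, mineral-like)
  rhow = 998;       % water density, kg/m^3
  mu   = 1.0e-3;    % dynamic viscosity of water, pa*s
  v = g * d^2 * (rhop - rhow) / (18 * mu);   % settling velocity, m/s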
fig. 3: sedimentation curves

5 formation of disinfection by-products
the possible formation of disinfection by-products by reaction with chlorine was studied in laboratory experiments under controlled conditions (temperature 20 °c, residence time 72 hours, chlorine doses 0.5-4.0 mg/l; for chloramination, a constant addition of nh4+ of 0.5 mg/l). deposits from the shaft before wr sv. anna were used; their characteristics were as follows: content of suspended solids 346 mg/l, toc 7.46 mg/l, ph 7.6. the experimental results are given in fig. 4. it was shown that the concentration of chloroform increased with increasing doses of chlorine, from 15.4 µg/l (at a cl2 dose of 0.5 mg/l) to 77.7 µg/l (at a cl2 dose of 4.0 mg/l), whereas during chloramination with the same chlorine doses the thm content ranged only from 11.5 µg/l to 14.1 µg/l; the increase of thm concentration was negligible.
fig. 4: formation of thm in suspension with various doses of chlorine or chloramine

6 conclusions
the research has shown that particles of deposits resuspended into the bulk flow during changes of hydraulic conditions in the distribution system can be a real source of thm and haa. the formation of these compounds depends mainly on the concentration of resuspended particles in the water, the chlorine dose, and the reaction time.

acknowledgements
this research has been supported by gačr grant no. 103/99/0659 and by research program cez: j04/98211100002. the authors would like to thank prof. v. janda, doc. n. strnadová and msc p. fišar from the department of water technology and environmental engineering, všcht prague, for their assistance.

references
[1] kiéné, l., lu, w., lévi, y.: relative importance of the phenomena responsible for chlorine decay in drinking water distribution systems. wat. sci. tech., vol. 38, 1998, pp. 219-272
[2] el-shafy, m. a., grünwald, a.: thm formation in water supply in south bohemia, czech republic. water res., vol. 34, no. 13/2000, pp. 3453-3459
[3] vasconcelos, j. j. et al.: kinetics of chlorine decay. journal awwa, vol. 89, no. 7/1997, pp. 54-65
[4] ozdemir, o. n. et al.: realistic numerical simulation of chlorine decay in pipes. water res., vol. 32, no. 11/1998, pp. 3307-3312
[5] grünwald, a. et al.: effect of deposits on water quality in distribution system (2000). ctu reports, proceedings of workshop 2000, part b, vol. 4, no. 7/2000, p. 568

prof. ing. alexander grünwald, csc., grunwald@fsv.cvut.cz; ing. bohumil šťastný, stastny@fsv.cvut.cz; ing. kateřina slavíčková, slavickova@fsv.cvut.cz; ing. marek slavíček, slavicek@fsv.cvut.cz; department of sanitary engineering, czech technical university in prague, faculty of civil engineering, thákurova 7, 166 29 praha 6, czech republic
acta polytechnica 59(2):153-161, 2019, doi:10.14311/ap.2019.59.0153
© czech technical university in prague, 2019, available online at http://ojs.cvut.cz/ojs/index.php/ap

general calculation of winding factor for multi-phase/-layer electrical machines irrespective of poles number

daoud ouamara (a, b, *), frédéric dubas (a), sid ali randi (c), mohamed nadjib benallal (b), christophe espanet (d)
a univ. bourgogne franche-comté, département energie, femto-st, cnrs, f90000 belfort, france
b khemis miliana university, laboratory of energy & smart system lesi, ain defla, algeria
c renault s.a., 78280 guyancourt, france
d univ. bourgogne franche-comté, 90000 belfort, france
* corresponding author: daoud.ouamara@gmail.com

abstract. in this paper, a method to calculate the winding factor by considering only stator parameters, without the rotor ones, is developed. this is interesting because it allows the separation of the stator and rotor design, unlike the existing methods in the literature. a general method based on the matrix representation of a winding is presented. this approach requires the knowledge of four parameters: i) slots number, ii) phases number, iii) layers number, and iv) single-phase spatial distribution. a new feature of multi-layer windings is introduced, called false-zero windings, which is divided into two categories: i) α-windings (i.e., odd false-zero windings), and ii) β-windings (i.e., even false-zero windings). windings having no false-zero are categorized as γ-windings. the calculations are applied to single and multi-phase/-layer windings. the results of the comparison are satisfactory. the code used for the calculation is given in the appendix.
keywords: false-zero windings, matrix representation, multi-phase/-layer, winding factor.

1. introduction
1.1.
context
rotational force generating devices, such as motors, electric generators and sirens, are generally composed of two components, called the stator and the rotor, in which a winding is inserted in one or both. the winding is one of the most important and critical elements in such machines. several types of windings have been developed and studied to achieve the desired performance for a given application, viz.: full-/short-/long-pitch windings, windings concentrated around teeth, ..., distributed windings, etc. recently, ouamara et al. (2018) [1] published an overview of winding design and developed a research tool called anfractus tool 1.0, allowing an automatic generation of all windings in multi-phase/-layer electrical machines by using the matrix representation. the way the coils are distributed in the stator and/or rotor slots directly affects the winding factor of each spatial harmonic and, therefore, the electromagnetic performance of electrical machines (e.g., the self-/mutual-inductances, the back electromotive force, the electromagnetic torque, unbalanced magnetic forces, ..., the permanent-magnet eddy-current losses, the efficiency). the winding factor is defined as the ratio of the flux linked by an actual winding to the flux that would have been linked by a full-pitch winding with the same number of turns [2]. the winding is spatially distributed in the stator and/or rotor slots, whereby the flux penetrating the coils does not pass through all the coils simultaneously but with a certain phase shift. consequently, the back electromotive force of the winding is not directly calculated from the number of turns; the winding factor corresponding to each harmonic must be taken into account [3]. in order to compensate for the low torque of an electrical machine having a low winding factor, it is necessary to supply it with a higher current or to use more turns [4]. in the literature, different methods for calculating the winding factor have been proposed. using the star of slots method [5], the computation is performed for fractional-slot three-phase synchronous machines with single and two layers [6]-[8] and for multi-phase machines [9]. the electromotive force phasor vector is used to calculate the winding factor of concentrated multi-layer windings, as explained in [10]. using analytical expressions, the winding factor can be calculated from the distribution factor and the pitch factor [11]-[13]. in [14], the equations for a specific winding layout have been applied to multi-phase permanent-magnet synchronous machines having all-teeth-wound concentrated windings.
1.2. objective of this paper
the purpose of this work is to establish a general and simple method for calculating the winding factor regardless of the number of slots, phases and layers; only the knowledge of the spatial distribution of the single-phase coils in the slots is required. to the best of the authors' knowledge, a winding factor calculation that does not take the number of pole pairs into account has not been reported in the literature. this allows windings to be designed without constraints on the number of pole pairs. the terms used to describe the method and the notion of false-zero windings are described in section 2, where the difference between α-windings and β-windings is explained.
in section 3, γ-windings are introduced, and the equations used to calculate the winding factor for each case are explained in detail with some examples. section 4 deals with multi-layer windings in order to validate the calculations outlined in this article: some specific windings from [15]-[17] have been used and compared with the results obtained by the developed method. finally, the algorithm written in matlab allowing the calculation of the winding factor is given in the appendix.

2. notions of the method
2.1. matrix representation
instead of using the usual diagrams, the stator and/or rotor winding distribution can be represented by a connection matrix [mw] linking the m phases (i.e., the matrix rows) to the stator and/or rotor slots (i.e., the matrix columns). in [18], this concept was used to design fractional-slot windings; a table was used for the coil arrangement in [19]. in order to represent multi-layer windings, [1] subdivided the columns of [mw] by the number of layers. this new connection matrix [cw] (of dimensions m × qs·ly, where m, qs and ly are respectively the numbers of phases, slots and layers) is defined by:

[cw] = [ [c^(1,1)_wk] [c^(1,2)_wk] ... [c^(1,qs)_wk] ; [c^(2,1)_wk] [c^(2,2)_wk] ... [c^(2,qs)_wk] ; ... ; [c^(m,1)_wk] [c^(m,2)_wk] ... [c^(m,qs)_wk] ]   (1)

where row i corresponds to phase i and column u to slot u. the elements [c^(i,u)_wk] (where i = 1, ..., m; u = 1, ..., qs and k = 1, ..., ly) are equal to:
• 0 if no conductor of phase i lies in the u-th slot and k-th layer;
• +1 if a forward conductor of phase i lies in the u-th slot and k-th layer;
• -1 if a return conductor of phase i lies in the u-th slot and k-th layer.
the false-zero windings are visible thanks to the use of [cw] (see section 2.2). nevertheless, the winding distribution will be represented by:

[dw] = (1/ly) × [cw]   (2)

for example, figure 1 represents [dw] for a stator of an electrical machine with qs = 6, m = 3 and ly = 2. throughout this study, only the first row (i.e., the first phase) of [dw] will be considered, viz. [vw] = [d^(1,u)_wk] (of dimension 1 × qs·ly); any other phase could equally be chosen, since the phases are balanced. for the last example, [vw] is given by:

[vw] = (1/2) [ 0 1 0 0 -1 0 0 -1 0 0 1 0 ]   (3)

the vector [vw] may be separated into ly sub-vectors [v'w] (of dimension 1 × qs) according to:

[v'w]_k = [vw]{k + (u - 1)·ly},  u = 1, ..., qs   (4)

for the last example, the [v'w] of each layer are:

[v'w]_1 = [vw]{1, 3, 5, 7, 9, 11}   (5a)
[v'w]_2 = [vw]{2, 4, 6, 8, 10, 12}   (5b)

[v'w]_1 = (1/2) [ 0 0 -1 0 0 1 ]   (6a)
[v'w]_2 = (1/2) [ 1 0 0 -1 0 0 ]   (6b)

figure 2 shows the sub-vectors [v'w]_1 and [v'w]_2 extracted from the winding in figure 1.
figure 1. two-layer winding (qs = 6, m = 3 and ly = 2) and its connection matrix.
figure 2. two-layer winding (qs = 6, m = 3 and ly = 2): (a) first sub-vector extracted, and (b) second sub-vector extracted.
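a minimal sketch of building [vw] for the example of figure 1 and extracting the layer sub-vectors per eq. (4):

  % layer sub-vectors of the first-phase distribution (example of figure 1)
  qs = 6; ly = 2;
  vw = (1/2) * [0 1 0 0 -1 0 0 -1 0 0 1 0];   % eq. (3), dimension 1 x qs*ly
  for k = 1:ly
      vwp(k,:) = vw(k + (0:qs-1)*ly);         % eq. (4): [v'w]_k
  end
  % vwp(1,:) = 1/2*[0 0 -1 0 0 1] and vwp(2,:) = 1/2*[1 0 0 -1 0 0],
  % matching eqs. (6a)-(6b)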
2.2. notion of false-zero
2.2.1. definition
figure 3 shows an example of a one-phase distribution for a false-zero winding with qs = 9, m = 3 and ly = 2. using the connection matrix [mw], the first phase of this winding can be represented by:

[m^(1,u)_w] = (1/2) [ 1 -1 0 0 -1 0 1 0 0 ]   (7)

the sixth column (i.e., the sixth slot) is filled with a zero, which indicates the absence of the first phase in this slot. indeed, the sixth slot is filled by the first phase in opposite directions (two-layer winding), as given by:

[vw] = (1/2) [ 1 0 0 -1 0 0 0 0 -1 0 -1 1 0 1 0 0 0 0 ]   (8)

this case is called a false-zero. note that a single-layer winding cannot have a false-zero (each slot is fully occupied by the phase). by introducing this new concept, multi-layer windings having a false-zero are subdivided into two categories: i) α-windings (i.e., odd false-zero windings), and ii) β-windings (i.e., even false-zero windings). windings having no false-zero are categorized as γ-windings.
figure 3. example of a one-phase distribution for a false-zero winding with qs = 9, m = 3 and ly = 2.
2.2.2. α-windings (i.e., odd false-zero windings)
a false-zero winding is odd if the forward (+1) and return (-1) conductor numbers are unequal, viz.,

Σ_u [v'w]_1(u) ≠ 0   (9)

by applying (4) to the vector (8) representing the winding in figure 3, the following sub-vector is obtained:

[v'w]_1 = (1/2) [ 1 0 0 0 -1 -1 0 0 0 ]   (10)

the sum of the [v'w]_1 elements is nonzero, and thus this winding is an α-winding (i.e., an odd false-zero winding).
2.2.3. β-windings (i.e., even false-zero windings)
a false-zero winding is even if the forward (+1) and return (-1) conductor numbers are equal, viz.,

Σ_u [v'w]_1(u) = 0   (11)

the winding given in figure 4 is represented by:

[vw] = (1/2) [ 1 -1 0 -1 0 0 0 0 0 0 -1 0 -1 1 0 1 0 0 0 0 0 0 1 0 ]   (12)

by applying (4) to (12), the first sub-vector is obtained:

[v'w]_1 = (1/2) [ 1 0 0 0 0 -1 -1 0 0 0 0 1 ]   (13)

the sum of the [v'w]_1 elements is zero, and thus this winding is a β-winding (i.e., an even false-zero winding).
figure 4. example of a one-phase distribution for a β-winding with qs = 12, m = 3 and ly = 2.

3. winding factor calculation
starting from the usual calculation of the winding factor based on the distribution and pitch factors [2] and the formulas given in [20], an adaptation to [vw] is made, and a generalized equation is obtained regardless of the number of slots, phases and layers. for the sake of clarity of the winding factor solution for each case, the following notations are adopted throughout the paper:

n = u ⊗ [o]_(1,ly)   (14)
s = 1, ..., qs·ly   (15)

where ⊗ represents the kronecker product, u = [1, 2, ..., qs] the slot-index vector, and [o]_(1,ly) the ones matrix of dimensions 1 × ly. the flowchart of the method is presented in figure 5, showing the steps of the calculation and the equations used. first, the vector [vw] of the phase spatial distribution is tested for a false-zero; the studied winding is then categorized and the adequate equation is applied.
figure 5. flowchart of the methodology.

3.1. γ-windings
3.1.1. description
for all single-layer windings and for multi-layer windings with no false-zero, the winding factor is given by:

ξγ_wh = (m/qs) × | Σ_s [vw](s) · e^(-j·(2π/qs)·n(s)·h) |   (16)

where j = √-1 and h is the spatial harmonic order.
3.1.2. application: qs = 20, m = 5 and ly = 2
• step 1: the distribution of the first phase of a γ-winding is given in figure 6.(a) and represented by:

[vw] = (1/2) [ 0 1 -1 -1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -1 1 1 -1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ]   (17)

• step 2: according to (16), the calculated winding factor ξγ_wh is given in figure 6.(b).
figure 6. γ-winding with qs = 20, m = 5 and ly = 2: (a) first-phase distribution, and (b) winding factor ξγ_wh.
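as a usage illustration, eq. (16) can be evaluated in a few lines; this is a condensed form of the appendix code, applied here to the two-layer γ-winding of eq. (3).

  % winding factor of eq. (16) for the two-layer example of eq. (3)
  qs = 6; m = 3; ly = 2; h = 1:30;                % harmonic orders
  vw = (1/2) * [0 1 0 0 -1 0 0 -1 0 0 1 0];
  n  = kron(1:qs, ones(1, ly));                   % eq. (14)
  xi = zeros(size(h));
  for hh = h
      xi(hh) = (m/qs) * abs(sum(vw .* exp(-1i*2*pi*n*hh/qs)));  % eq. (16)
  end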
3.2. β-windings (i.e., even false-zero windings)
3.2.1. description
the winding factor for this category is given by:

ξβ_wh = ly × ξγ_wh   (18)

3.2.2. application: qs = 12, m = 3 and ly = 2
• step 1: the distribution of the first phase of a β-winding is given in figure 7.(a) and represented by:

[vw] = (1/2) [ 1 -1 0 -1 0 0 0 0 0 0 -1 0 -1 1 0 1 0 0 0 0 0 0 1 0 ]   (19)

• step 2: the extracted sub-vector is:

[v'w]_1 = (1/2) [ 1 0 0 0 0 -1 -1 0 0 0 0 1 ]   (20)

• step 3: the forward (+1) and return (-1) conductor numbers are equal, viz.,

Σ_u [v'w]_1(u) = 0   (21)

• step 4: according to (18), the calculated winding factor ξβ_wh is given in figure 7.(b).
figure 7. β-winding with qs = 12, m = 3 and ly = 2: (a) first-phase distribution, and (b) winding factor ξβ_wh.

3.3. α-windings (i.e., odd false-zero windings)
3.3.1. description
the winding factor for this category is given by:

ξα_wh = c × ξβ_wh   (22)

where c = qs/λ, with λ a correction coefficient defined by:

λ = t_(z+1) for t_z < qs < t_(z+1),   λ = t_z for qs = t_z   (23)

in which

t_z = 3 for z = 1;  t_z = 6 for z = 2;  t_z = 12·(z - 2) for z > 2   (24)

with z ∈ N*.
3.3.2. application: qs = 15, m = 3 and ly = 2
• step 1: the spatial distribution of the first phase of an α-winding is given in figure 8.(a) and represented by:

[vw] = (1/2) [ 1 0 0 -1 0 0 0 0 -1 0 0 1 0 0 1 0 1 -1 0 -1 0 0 -1 0 0 1 0 0 0 0 ]   (25)

• step 2: the extracted sub-vector is:

[v'w]_1 = (1/2) [ 1 0 0 0 -1 0 0 1 1 0 0 -1 0 0 0 ]   (26)

• step 3: the forward (+1) and return (-1) conductor numbers are unequal, viz.,

Σ_u [v'w]_1(u) ≠ 0   (27)

• step 4: the coefficient λ = 24, which gives c = 0.625.
• step 5: the calculated winding factor ξα_wh is given in figure 8.(b).
figure 8. α-winding with qs = 15, m = 3 and ly = 2: (a) first-phase distribution, and (b) winding factor ξα_wh.

4. multi-layer windings
in order to validate the method described in this paper for the winding factor calculation with ly > 2, specific windings from the literature have been used and compared with the results obtained by the developed method.
4.1. windings extracted from qi et al. [15]
the authors of [15] presented a new multi-layer winding design method based on the winding function, in which the multi-layer winding is obtained by a superposition of several double-layer windings. the example given in [15] is an 18-slots/16-poles fractional-slot concentrated-windings permanent-magnet machine. three multi-layer winding designs are obtained using the proposed method; the winding arrangements are shown in figure 9, viz., i) a double-layer, ii) two four-layer windings (i.e., type 1 and type 2), and iii) a six-layer winding.
figure 9. different multi-layer winding types (for qs = 9 and m = 3) [15].
the winding factor is calculated for each winding and the results are compared with those calculated by qi et al. [15], as shown in figure 10. except for some harmonics for which the negative direction is taken, the amplitudes are the same between the compared results for each winding.
figure 10. winding factor calculated by qi et al. [15] and by the method of this paper: (a) double-layer winding, (b) type 1 four-layer winding, (c) type 2 four-layer winding, (d) six-layer winding.
4.2. windings extracted from alberti et al. [16]
a general theory of fractional-slot multi-layer windings has been presented by alberti et al. [16].
by applying the star of slots method, two 9-slots/8-poles four-layer windings (see figures 11.(b) and 11.(c)) are obtained from a 9-slots/8-poles two-layer winding (see figure 11.(a)). in [16], the winding factor of the four-layer windings is computed starting from that of the two-layer winding with a shift angle, based on the star of slots method. the compared results of the calculation are reported in figure 12, where the same winding factor amplitude is obtained for each harmonic with both methods.
figure 11. winding layout of phase a (for qs = 9 and m = 3) [16]: (a) 2-layer, (b) 4-layer i, (c) 4-layer ii.
figure 12. winding factor calculated by alberti et al. [16] and by the method of this paper: (a) 2-layer, (b) 4-layer i, (c) 4-layer ii.
4.3. windings extracted from cistelecan et al. [17]
in [17], the winding diagram for a 12-slots/10-poles single-layer layout is presented in figure 13.(a). the double-layer winding presented in figure 13.(b) may be obtained from the single-layer winding by doubling the first with half the number of turns per coil and shifting by 5 geometrical slots, as explained in [17]. by doubling the double-layer winding again and shifting by another 5 geometrical slots, a four-layer winding may be obtained, as shown in figure 13.(c). the winding factor has been calculated using an analytical method based on the star of slots and compared with the calculation made in this paper; the results are identical. a comparison of the winding factors obtained from both methods is presented in figure 14.
figure 13. winding diagrams for 12-slots/10-poles (only one phase is represented) [17]: (a) single layer, (b) double layer, and (c) four-layer winding.
figure 14. winding factor calculated by cistelecan et al. [17] and by the method of this paper: (a) single-layer winding, (b) double-layer winding, (c) four-layer winding.
a special 12-slots/10-poles three-layer winding is presented in [17] and shown in figure 15. this winding has an unequal number of turns, whereas in the developed method the number of turns is assumed to be equal; the small difference in the calculated winding factors, given in figure 16, is due to the unequal number of turns not being taken into consideration in this paper.
figure 15. winding diagrams for 12-slots/10-poles: special three-layer winding [17].
since this method is limited to windings of an equal number of turns, future work will include expanding its potential to allow the computation of windings with an unequal number of turns. the hybrid windings having unequal number of layers (e.g., winding with three-four layers) may also be considered in future studies. 6. appendix clear all, close all, clc, winding’s parameters titrefig=’winding’’s parameters’; titreprop={’qs : number of slots’;... ’m : number of phases’;... ’ly : number of layers’;... ’vw : vector of distribution of phase a’}; num_lines=1; val_init={’’;’’;’’;’’}; datacell=inputdlg(titreprop,titrefig,... num_lines,val_init,’on’); qs=str2double(datacell{1}); m=str2double(datacell{2}); ly=str2double(datacell{3}); vw=str2num(datacell{4}); begin of calculation h=30; % number of harmonics ns=qs*ly; n=kron(1:qs,ones(1,ly)); % vector n for h=1:h s=1:ns; % vector s fabs{h} = abs((sum(vw(s).* exp((-1i*2*pi*n(s)*h)/qs)))); end fabs = cell2mat(fabs); % detect a false-zero winding x=1:2:ns-1; vect=vw(x).*vw(x+1); % detect if an alpha-/beta-winding vw_p=vw(x); sum=sum(vw_p); % alpha-winding if (ly > 1) && (min(vect)<0) && (sum~=0) % calculation of coefficient lambda t1=[3 6]; t2=12:12:240; t=[t1 t2]; for z=1:size(t,2); if qs > t(z) && qs < t(z+1) lambda=t(z+1); elseif qs >= t(z) && qs < t(z+1) lambda=t(z); end end % winding factor of alpha-winding fw=((m*ly)/lambda)*fabs; % beta-winding elseif (ly > 1) && (min(vect)<0) && (sum==0) % winding factor of beta-winding fw=((m*ly)/qs)*fabs; else % gamma-winding % winding factor of gamma-winding fw=(m/qs)*fabs; end idx= ( fw < 1e-10); fw(idx)=0.0000; post-processing create figure with bars f=figure; axefig=axes(’parent’,f,’position’, [.1 .19 0.8 .74],’xlim’,[0,5],’ylim’,[0,5],... ’xgrid’,’on’,’ygrid’,’on’,’xminorgrid’,’off’); stem(1:h,fw(1:h),’b’,’marker’,’none’, ... ’linewidth’,10); set(gca,’ygrid’,’on’); xlabel(’harmonics’,’fontname’, ’garamond’,... ’fontsize’, 11); xlim([0 31]), set(gca,’xtick’, [0:2:30]); 160 vol. 59 no. 2/2019 general calculation of winding factor ylabel(’amplitude [-]’,’fontname’, ... ’garamond’, ’fontsize’, 11 ); ylim([0 1]); title(’winding factor : \xi_{w}’, ... ’fontname’,’garamond’,’color’, ’red’, ... ’fontsize’, 18); % create the uitable and display fw t = uitable(’parent’, f,’data’,fw, ... ’fontname’,’garamond’ ,’fontsize’, 14, ... ’position’,[30 10 1300 80], ... ’rearrangeablecolumns’,’on’, ... ’foregroundcolor’, ... [0.2,0.3,0.8], ’columnname’,’numbered’,... ’rowname’,’fw’); references [1] d. ouamara, f. dubas, m. n. benallal, et al. automatic winding generation using matrix representation anfractus tool 1.0. acta polytechnica 58(1):37, 2018. doi:10.14311/ap.2018.58.0037. [2] j. l. kirtley. class notes 5. massachussets institute of technology, department of electrical engineering and computer science pp. 1–9, 1997. [3] j. pyrhonen, t. jokinen, v. hrabovcova. design of rotating electrical machines, secon edition. wiley & sons, ltd, 2nd edn., 2014. [4] f. magnussen, c. sadarangani. winding factors and joule losses of permanent magnet machines with concentrated windings. in ieee international electric machines and drives conference, 2003. iemdc’03., vol. 1, pp. 333–339. ieee, 2003. doi:10.1109/iemdc.2003.1211284. [5] m. m. liwschitz. distribution factors and pitch factors of the harmonics of a fractional-slot winding. transactions of the american institute of electrical engineers 62(10):664–666, 1943. doi:10.1109/t-aiee.1943.5058623. [6] n. bianchi, s. bolognani, m. pre, g. grezzani. 
design considerations for fractional-slot winding configurations of synchronous machines. ieee transactions on industry applications 42(4):997–1006, 2006. doi:10.1109/tia.2006.876070. [7] s. e. skaar, ø. krøvel, r. nilssen. distribution, coil-span and winding factors for pm machines with concentrated windings. in icem-2006, chania (greece). 2006. [8] g. ugalde, j. poza, m. a. rodriguez, a. gonzalez. space harmonic modeling of fractional permanent magnet machines from star of slots. in 2008 18th international conference on electrical machines, pp. 1– 6. ieee, 2008. doi:10.1109/icelmach.2008.4799937. [9] x. shangguan, j. zhang, w. zhang. calculation on the winding factor and armature reaction mmf of a pmsm with 5-phase fractional-slot winding. journal of computers 8(3):725–732, 2013. doi:10.4304/jcp.8.3.725-732. [10] h.-j. kim, d.-j. kim, j.-p. hong. characteristic analysis for concentrated multiple-layer winding machine with optimum turn ratio. ieee transactions on magnetics 50(2):789–792, 2014. doi:10.1109/tmag.2013.2279100. [11] a. mohammadpour, a. gandhi, l. parsa. winding factor calculation for analysis of back emf waveform in air-core permanent magnet linear synchronous motors. iet electric power applications 6(5):253, 2012. doi:10.1049/iet-epa.2011.0292. [12] j. cros, p. viarouge. synthesis of high performance pm motors with concentrated windings. in ieee international electric machines and drives conference. iemdc’99. proceedings (cat. no.99ex272), vol. 17, pp. 725–727. ieee, 1999. doi:10.1109/iemdc.1999.769226. [13] y. yokoi, t. higuchi, y. miyamoto. general formulation of winding factor for fractional-slot concentrated winding design. iet electric power applications 10(4):231–239, 2016. doi:10.1049/iet-epa.2015.0092. [14] d. ouamara, f. dubas, s. a. randi, et al. electromagnetic comparison of 3, 5and 7-phases permanent-magnet synchronous machines : mild hybrid traction application. mediterranean journal of modeling & simulation 6(1):012–022, 2016. [15] l. qi, t. fan, x. wen, et al. a novel multi-layer winding design method for fractional-slot concentrated-windings permanent magnet machine. 2014 ieee conference and expo transportation electrification asia-pacific (itec asia-pacific) (2):1–5, 2014. doi:10.1109/itec-ap.2014.6940615. [16] l. alberti, n. bianchi. theory and design of fractional-slot multilayer windings. ieee transactions on industry applications 49(2):841–849, 2013. doi:10.1109/tia.2013.2242031. [17] m. cistelecan, f. j. t. e. ferreira, m. popescu. three phase tooth-concentrated multiple-layer fractional windings with low space harmonic content. in 2010 ieee energy conversion congress and exposition, pp. 1399– 1405. ieee, 2010. doi:10.1109/ecce.2010.5618267. [18] p. wach. algorithmic method of design and analysis of fractional-slot windings of ac machines. electrical engineering 81(3):163–170, 1998. doi:10.1007/bf01236235. [19] m. v. cistelecan, b. cosan, m. d. popescu. analysis and design criteria for fractional unbalanced windings of three-phase motors. in in proc. 6th int. symp. adv. electromech. motion syst. (electromotion), pp. 1–5. 2005. [20] f. scuiller, e. semail, j.-f. charpentier. general modeling of the windings for multi-phase ac machines. the european physical journal applied physics 50(3):31102, 2010. doi:10.1051/epjap/2010058. 
computer aided design of transformer station grounding system using cdegs software

s. nikolovski, t. barić

this paper presents a computer-aided design of a transformer station grounding system. fault conditions in a transformer station can produce huge damage to transformer station equipment if the grounding system is not designed properly. a well designed grounding system is a very important part of the project for transformer station design as a whole. this paper analyses a procedure for transformer grounding system design and the spatial distribution of touch and step voltages on the ground surface level, using the cdegs (current distribution electromagnetic interference grounding and soil structure analysis) software. the spatial distribution is needed for checking and finding dangerous step and touch voltages above and around the transformer station. apparent earth resistivity data is measured and analysed using the resap module of the cdegs software. because of the very high current flow into the grounding system during a single line to ground fault or a three-phase fault in the transformer station, very high and dangerous potentials can be induced on the metallic structures, including the fence, which can cause dangerous situations for people and animals near the station and for the personnel inside the station. the plot module of cdegs is used to view the results of the scalar potential, step and touch voltages on the surface. graphic displays include equipotential contour lines and potential profiles (gradients) in 3d and 2d perspective, and apparent soil resistivity (Ωm) versus inter-electrode spacing (m). the results of alternative grid designs may be displayed simultaneously for the purpose of comparison.

keywords: computer-aided design, substation, grounding grid, soil, safety, touch and step voltage.

1 introduction

with ever increasing fault current levels in today's interconnected power systems it is necessary to ensure a very low grounding resistance of transformer stations. in order to get the best techno-economic solution in the design of grounding systems for given safety criteria (ieee, iec, or some other national standard), only computer-aided design can give an optimal and fast solution of the given task. a low grounding resistance and an acceptable distribution of touch and step voltages (as uniform as possible) for a high fault current level can be simultaneously achieved only by using a grid-grounding system. the task for designers of grounding systems is to arrange a buried metallic conductor with adequate equivalent radii to achieve the safety criteria. the safety criteria in this paper are based on ieee std. 80 (1986 edition) and iec 479-1 (1984), and on some croatian safety requirements. another task is to check some existing grounding grids for an increased fault current level, i.e., to determine the fault current level which satisfies the safety criteria. a typical grounding grid is designed and simulated using the hifreq module. a design problem is described and verified for several (110/35/10 kv) transformer substations: "ts našice", "ts osijek", and "ts valpovo". these transformer substations are in the eastern part of the croatian power system. due to their operational time, the ts are exposed to potential fault conditions. in this sample case, a single line to ground fault is relevant for the grounding design. according to the technical specifications, single line to ground fault currents entering the grounding system of these transformer substations are in the order of 6 to 10 ka. the analysed grounding system consists of a 110 × 80 m grounding grid with 20-30 additional grounding rods of 3 m length. when such a fault happens, the potentials on the surfaces inside the ts and nearby are on a much higher level than at a reference point away from the ts.

fig. 1: simplified geometrical model of a transformer station grounding system and the conducting part

2 mathematical expression of the electromagnetic field

the conductor network is subdivided into small segments. this allows the thin-wire approximation to be used, which in turn enables us to use linear current sources. the method used to obtain the scalar potential and the electromagnetic field in the frequency domain is described by the following equations. maxwell's equations can be used to describe the electric field e and the magnetic field h in terms of the scalar potential \varphi and the vector potential \mathbf{a}:

\varphi = -\frac{1}{\mu \sigma^{*}}\, \nabla \cdot \mathbf{a} \qquad (1)

\mathbf{e} = -j\omega \mathbf{a} - \nabla \varphi \qquad (2)

\mathbf{h} = \frac{1}{\mu}\, \nabla \times \mathbf{a} \qquad (3)

\sigma^{*} = \sigma + j\omega \varepsilon \qquad (4)

where:
\sigma^{*} complex conductivity of the medium,
\mu permeability of the medium,
\sigma conductivity of the medium,
\varepsilon permittivity of the medium.

3 example of transformer station grounding design

though there is no conventional rule in designing transformer station grounding, some guidelines can be suggested on the basis of this example using cdegs, as follows. the first step in the design of transformer station grounding is to determine an appropriate soil model for predicting the effect of the underlying soil characteristics on the performance of the grounding system. from the resistivity measurement data obtained using arbitrarily spaced 4-electrode configuration methods (including the wenner or schlumberger methods), the resap module determines the equivalent soil layers. the earth layers may be vertical or horizontal (a one-layer, a two-layer or a multi-layer soil model). in this study, soil resistivity is measured by a 4-point measurement (schlumberger method).
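before turning to the soil model, a brief remark on the field equations of section 2 (our addition, assuming the lorenz-type gauge relation implied by eq. (1)): substituting (1) into (2) expresses the electric field in terms of the vector potential alone,

\mathbf{e} = -j\omega \mathbf{a} + \frac{1}{\mu \sigma^{*}}\, \nabla \left( \nabla \cdot \mathbf{a} \right),

which is the form that can be evaluated along the thin-wire segments once the linear current sources are known.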
the influence of seasonal variations over the last several years on soil resistivity is estimated indirectly using statistical data collected in the course of preventive measurements of the grounding impedance by a local utility. according to this simplified method, a soil resistivity of 100 Ωm can be assumed as the worst soil top-layer resistivity. the resap program interprets the measured apparent earth resistivity (or resistance) data to determine the equivalent earth structure model that is necessary for analysing the grounding systems. the measured soil resistivity data is shown in table 1. the measured data entered in the resap (computations) window is shown in fig. 2. the 4-point schlumberger measurement method with the data necessary for calculating soil resistivity is also presented.

table 1: measurement data
measurement number           1      2      3      4      5      6      7      8      9     10
s_e [m]                   0.25   0.75   1.25   1.75   2.25   2.75   3.25   3.75   4.75   5.75
s_i [m]                   0.50   0.50   0.50   0.50   0.50   0.50   0.50   0.50   0.50   0.50
measured resistance r [Ω] 42.200 6.130  2.530  1.372  0.906  0.604  0.446  0.338  0.214  0.148

fig. 2: resap (computations) window with measurement data relevant in this study

the resap module calculates the resistivity using the following generalised equation:

\rho = \frac{2\pi r}{\dfrac{1}{s_{e1}} - \dfrac{1}{s_{e1} + s_i} + \dfrac{1}{s_{e2}} - \dfrac{1}{s_{e2} + s_i}} \qquad (5)

where:
s_{e1} distance between current electrode c1 and potential electrode p1,
s_{e2} distance between current electrode c2 and potential electrode p2,
s_i distance between the potential electrodes,
\rho resistivity of the soil,
r measured soil resistance.

the apparent soil resistivity in Ωm (versus inter-electrode spacing in m) relevant in this study is shown in fig. 3. according to fig. 3, after 2 m below the surface the apparent resistivity becomes practically constant. fig. 3 represents one measurement series, which cannot represent the resistivity over the seasons. it is useful to remember that the depth of current penetration in the soil depends on the current electrode spacing. it is therefore useful to perform some additional measurements with a very close inter-electrode spacing to get a better interpolation of the soil layer just below the surface, since the apparent resistivity changes rapidly in one layer.
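the curve in fig. 3 can be reproduced directly from table 1 and equation (5); for the symmetric schlumberger array used here, s_{e1} = s_{e2} = s_e, so eq. (5) reduces to \rho = \pi r\, s_e (s_e + s_i)/s_i. a minimal sketch of this computation (our addition):

se  = [0.25 0.75 1.25 1.75 2.25 2.75 3.25 3.75 4.75 5.75];  % spacings from table 1 [m]
si  = 0.50;                               % potential-electrode spacing [m]
R   = [42.200 6.130 2.530 1.372 0.906 ...
       0.604 0.446 0.338 0.214 0.148];    % measured resistances [ohm]
rho = 2*pi*R./(2./se - 2./(se + si));     % apparent resistivity, eq. (5) [ohm m]
semilogx(se, rho, 'o-')
xlabel('inter-electrode spacing [m]'), ylabel('apparent resistivity [\Omega m]')
% the values fall from roughly 50 ohm m towards about 33.5 ohm m, consistent
% with the deep-layer resistivity reported by resap below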
the computed soil resistivity and layer thickness for a 2-layer soil model are given by the text report, shown below as the computer-generated report.

fig. 3: apparent soil resistivity versus inter-electrode spacing

resistivity (system information summary)
system of units: meters
soil type selected: multi-layer horizontal
rms error between measured and calculated resistivities: 2.08052 %

layer number   resistivity (Ωm)   thickness (m)   coefficient (p.u.)   contrast ratio
1              infinite           infinite        0.0                  1.0
2              78.17224           0.2106636       -1.0000              0.78172e-18
3              33.55941           infinite        -0.39929             0.42930

the soil layer resistivity calculated by resap is the initial data for the further study. from the computer-generated report for soil resistivity, a two-layer soil model is made, which gives satisfactory solutions in most cases, including this study. the data entered in the second module, hifreq, is shown in fig. 4. the data for relative permeability and permittivity is irrelevant in this study.

fig. 4: a hifreq (soil type) window is used to specify the type of soil

the grounding system described in this study consists of a 110 × 80 m grounding grid plus an additional ring 5 m away from the grid. the grounding system is buried 0.8 meters below the transformer station surface. the ground fault current in this study is assumed to be 7 ka. the grounding grid consists of rectangular cross-section fe/zn conductors (30 × 4 mm) with an equivalent radius equal to 10.83 mm. the geometrical position of the grounding grid is shown in fig. 5. this first iteration is without additional rods; e.g., the grounding could consist of the grounding grid, with rods and an additional ring around it at the same depth. division of the grid into smaller windows is not appropriate in the first design iteration.

fig. 5: geometry of the grounding grid with data relevant in this study

the results of the simulation are presented in the following figures. first, a 3d plot of the scalar potential at the surface of the ts is shown in fig. 6. the scalar potential peaks correspond to the nodes of the grid, while the valleys correspond to the centres of the windows in the grid (see fig. 6). this 3d view is useful for the visualisation of the scalar potential shape. this shape should look as uniform as possible.

fig. 6: scalar potential at the surface of ts

fig. 7 shows the reach touch voltages in a 3d view with reference to the worst system grounding potential rise (gpr). fig. 7 shows that the reach touch voltages are worst in the corners of the ts, and additional measures must be taken to decrease these voltages below the safety limits.

fig. 7: reach touch voltages in 3d plot, corresponding to the worst system gpr

since a 3d plot loses some detailed information about the scalar potential or the step/touch voltages, an additional investigation must be made for suspicious areas or profiles. an additional inspection can be made in a 2d view for arbitrary profiles (see fig. 8). reach touch voltages for the worst system gpr at the surface of the ts, for the three profiles numbered 1, 3 and 100, are shown in fig. 8.

fig. 8: reach touch voltages/worst system gpr for profiles number 1, 3 and 100

a similar investigation can be made in a 2d view, as shown in fig. 9. these 2d contour views show lines of the same reach touch voltages with the superposed grid.

fig. 9: reach touch voltages/worst system gpr contour plot
fig. 10: reach step voltages/worst system gpr contour plot

the safety limits for step and touch voltages are generated by cdegs according to user-defined standards. several different standards can be chosen (ieee, iec or some national standard). in this study, the fibrillation current is calculated according to the c1-iec standard and the body resistance for 95 % of the population, see fig. 11. if crushed rock is used to improve the safety limits for step and touch voltages, it can be included and entered in the window.

4 conclusion

manual calculation of the equivalent resistance of composite grounding can lead to incorrect results if the mirroring effect is not taken into account. because of the large geometric structure of a grid, all additional metallic conductors such as rods, rings and other additional structures can influence each other.
this gives advantages to computer design, especially to specialised programs that take into account all the physical phenomena. since a ts is a restricted area for non-operating personnel (civilian personnel), it is protected by a metallic fence. attention must be paid to prevent a rise in the potential of the fence due to fault conditions in the ts. fatal accidents caused by dangerous touch voltages when someone touches the fence can be avoided by placing an additional buried ring conductor outside the ts, one meter away from the fence, which is in galvanic contact with the main grounding but not with the fence. this additional ring is added in this study. the main purpose of this additional ring conductor is to shape the electric potential at the surface around the fence.

fig. 11: window for safety assessment limits with data relevant in this study
fig. 12: computed safety step and touch voltages for an additional layer (rocks) at the surface of ts

references

[1] ma j., dawalibi f. p.: "modern computational methods for the design and analysis of power system grounding." proceedings of the 1998 international conference on power system technology, beijing, august 18-21, 1998, p. 122-126.
[2] dawalibi f. p., barbeito n.: "measurements and computations of the performance of grounding systems buried in multilayer soils." ieee transactions on power delivery, vol. 6, no. 4, october 1991, p. 1483-1490.
[3] nikolovski s., fortin s.: "lightning transient response of 400 kv transmission tower with associated grounding system." emc 1998 roma international symposium on electromagnetic compatibility, rome, italy, september 14, 1998.
[4] nikolovski s., fortin s.: "frequency domain analysis of 110/35 kv transformer station grounding system subject to lightning strike." proceedings of the ieee powertech '99 conference, budapest, hungary, august 29-september 2, 1999.

srete nikolovski, ph.d., e-mail: srete.nikolovski@etfos.hr
tomislav barić, b.sc., e-mail: tomislav.baric@etfos.hr
power system department, faculty of electrical engineering osijek, k. trpimira 2b, 31000 osijek, croatia
neural networks for self-tuning control systems

a. noriega ponce, a. aguado behar, a. ordaz hernández, v. rauch sitar

in this paper, we present a self-tuning control algorithm based on a three-layer perceptron-type neural network. the proposed algorithm is advantageous in the sense that practically no previous training of the net is required, and some changes in the set-point are generally enough to adjust the learning coefficient. optionally, it is possible to introduce a self-tuning mechanism for the learning coefficient, although at the moment it is not possible to give final conclusions about this possibility. the proposed algorithm has the special feature that the regulation error, instead of the net output error, is backpropagated for the weighting coefficient modifications.

keywords: neural networks, feedforward, backpropagation, self-tuning control.

1 introduction

the use of artificial neural nets for system identification and control has been the subject of a vast number of publications in recent years. it is possible to mention, for instance, the paper of narendra and parthasarathy [5], in which several possible structures of the neural controller that suppose an a priori knowledge of the plant dynamic structure are proposed. in bhat and mcavoy [3], and in many other papers, a scheme based on an inverse dynamic neural model is used. a similar approach is adopted by aguado and del pozo [2], but here the inverse neural regulator is complemented by a self-tuning pid algorithm which guarantees that the stationary error goes to zero. in the above-mentioned paper and in many others, an exhaustive previous training of the neural net is required, which is a serious obstacle for the practical implementation of the proposed solution in industry. at the same time, the controller structure in some cases, as in the mentioned paper of narendra and parthasarathy [5], is very complex, including several nets, each one with two hidden layers of many neurons, which must be trained using large data samples. an important contribution to enhancing the possibilities of neural nets in the solution of practical problems is the paper by cui and shin [4]. in that work, a direct neural control scheme is proposed for a wide class of non-linear processes, and it is shown that in many cases the net can be trained by directly backpropagating the regulation error instead of the net output error. however, that paper does not discuss the influence of some training parameters, particularly the learning coefficient, on the closed-loop dynamics. it is also not remarked that, with a direct neural control scheme, the training stage can practically be substituted by a permanent, real-time adaptation of the weighting coefficients of the neural net. in the present paper, a self-tuning neural regulator is proposed, inspired by the ideas of cui and shin [4], but with the particular feature that the previous training is substituted by a permanent adjustment of the weighting coefficients based on the control error. at the same time, the influence of the learning coefficient on the closed-loop dynamics is shown, and some criteria are given about how to choose that parameter. finally, some examples are given where the possibilities of the proposed method for the control of difficult non-linear systems are shown, especially when there exists a considerable pure time delay.

2 self-tuning neural controller structure

in fig. 1, the proposed scheme of the self-tuning neural regulator is shown.

fig. 1: scheme of the self-tuning neural regulator

the neural net which assumes the regulator function is a 3-layer perceptron (one hidden layer) whose weighting coefficients are adjusted by a modified backpropagation algorithm.
in this case, instead of the net output error

e_u(t) = u(t) - u_d(t), \qquad (1)

the process output error

e_y(t) = y(t) - y_r(t) \qquad (2)

is used to adjust the weighting coefficients. in fig. 2, the neural net controller structure is represented.

fig. 2: neural net controller structure

the output layer has only one neuron because, for the moment, we limit the analysis to one-input, one-output processes. in the input layer, the present and some previous values of the regulation error are introduced, i.e.

x(t) = \left[ e_y(t),\, e_y(t-1),\, \ldots,\, e_y(t-n) \right]. \qquad (3)

for the cases simulated so far, the value n = 2 was sufficient to obtain an adequate closed-loop performance; the number of input neurons can thus be 3. a similar number of hidden-layer neurons was equally satisfactory.

3 weighting coefficients adaptation algorithm

in fig. 2, the weighting coefficients w_{ji} and v_j for the input connections of the hidden layer and of the output layer, respectively, are shown. in what follows, we detail the adaptation algorithm for these coefficients, which ensures the minimisation of a function of the regulation error e_y(t). the output of the j-th hidden-layer neuron may be calculated by means of

h_j = \frac{1}{1 + e^{-s_j}}, \quad j = 1, 2, 3, \qquad (4)

where

s_j = \sum_{i=1}^{3} w_{ji}\, x_i. \qquad (5)

at the same time, the output value of the output-layer neuron will be

u(t) = \frac{1}{1 + e^{-r}}, \qquad (6)

where

r = \sum_{j=1}^{3} v_j\, h_j. \qquad (7)

as a criterion to be minimised, we define the function

E(t) = \frac{1}{2} \sum_{k=1}^{t} e_y^2(k), \qquad (8)

where it is supposed that time has been discretised using an equally spaced small time interval. the minimisation procedure consists, as is known, in a movement in the negative-gradient direction of the function E(t) with respect to the weighting coefficients v_j and w_{ji}. the gradient of E(t) is a multi-dimensional vector whose components are the partial derivatives \partial E(t)/\partial v_j and \partial E(t)/\partial w_{ji}, i.e.

\nabla E(t) = \left( \frac{\partial E(t)}{\partial v_j},\; \frac{\partial E(t)}{\partial w_{ji}} \right). \qquad (9)

let us first obtain the partial derivatives with respect to the coefficients of the output neuron.
applying the chain rule, we get

\frac{\partial E(t)}{\partial v_j} = \frac{\partial E(t)}{\partial e_y} \cdot \frac{\partial e_y}{\partial e_u} \cdot \frac{\partial e_u}{\partial u(t)} \cdot \frac{\partial u(t)}{\partial r} \cdot \frac{\partial r}{\partial v_j}, \qquad (10)

\frac{\partial E(t)}{\partial v_j} = e_y\, \frac{\partial e_y}{\partial e_u}\, u(t) \left( 1 - u(t) \right) h_j. \qquad (11)

in (11), the well-known relation

\frac{\partial u(t)}{\partial r} = \frac{e^{-r}}{\left( 1 + e^{-r} \right)^2} = u(t) \left( 1 - u(t) \right) \qquad (12)

is used. let us define

\delta_1 = e_y\, u(t) \left( 1 - u(t) \right); \qquad (13)

then

\frac{\partial E(t)}{\partial v_j} = \delta_1\, h_j\, \frac{\partial e_y}{\partial e_u}. \qquad (14)

in equation (14), the partial derivative \partial e_y / \partial e_u appears, which can be interpreted as some kind of "equivalent gain" of the process. further on, we will make some considerations about that term. the partial derivative of the function E(t) with respect to the weighting coefficients w_{ji} can be obtained by applying the chain rule again:

\frac{\partial E(t)}{\partial w_{ji}} = \frac{\partial E(t)}{\partial e_y} \cdot \frac{\partial e_y}{\partial e_u} \cdot \frac{\partial e_u}{\partial u(t)} \cdot \frac{\partial u(t)}{\partial r} \cdot \frac{\partial r}{\partial h_j} \cdot \frac{\partial h_j}{\partial s_j} \cdot \frac{\partial s_j}{\partial w_{ji}}, \qquad (15)

\frac{\partial E(t)}{\partial w_{ji}} = e_y\, \frac{\partial e_y}{\partial e_u}\, u(t) \left( 1 - u(t) \right) v_j\, h_j \left( 1 - h_j \right) x_i. \qquad (16)

let us define

\delta_{2j} = \delta_1\, v_j\, h_j \left( 1 - h_j \right); \qquad (17)

and then

\frac{\partial E(t)}{\partial w_{ji}} = \delta_{2j}\, x_i\, \frac{\partial e_y}{\partial e_u}. \qquad (18)

using equations (14) and (18), the adjustments of the weighting coefficients v_j, w_{ji} can be made by means of the expressions

v_j(t+1) = v_j(t) - \eta\, \delta_1\, \frac{\partial e_y}{\partial e_u}\, h_j, \qquad (19)

w_{ji}(t+1) = w_{ji}(t) - \eta\, \delta_{2j}\, \frac{\partial e_y}{\partial e_u}\, x_i, \qquad (20)

where \eta is the so-called learning coefficient and \partial e_y / \partial e_u is the "equivalent gain" of the plant. the main obstacle to applying the adjustment equations (19) and (20) is that, in general, the plant equivalent gain \partial e_y / \partial e_u is unknown. however, in the above-mentioned paper by cui and shin [4], it is demonstrated that only the sign of that term is required to ensure the convergence of the weighting coefficients, because the magnitude can be incorporated in the learning coefficient \eta if the non-restrictive condition \left| \partial e_y / \partial e_u \right| < \infty is fulfilled. besides, the sign of \partial e_y / \partial e_u can be easily estimated by means of a very simple auxiliary experiment, for instance, applying a step function at the process input. the assumption that the sign of the gain remains constant in a neighbourhood of the operating point of the process is not very strong, and it is fulfilled in most practical cases. finally, in the worst of cases, a real-time estimation of the gain sign could be incorporated, without great difficulties, into the control algorithm. having the above considerations in mind, equations (19) and (20) can be written as follows:

v_j(t+1) = v_j(t) - \eta\, \delta_1\, \mathrm{sign}\!\left( \frac{\partial e_y}{\partial e_u} \right) h_j, \qquad (21)

w_{ji}(t+1) = w_{ji}(t) - \eta\, \delta_{2j}\, \mathrm{sign}\!\left( \frac{\partial e_y}{\partial e_u} \right) x_i. \qquad (22)

the right value of the learning coefficient \eta can be experimentally determined from the observation of the closed-loop system performance when some changes are made in the controlled variable set-point. for the proposed neural controller structure, equations (21) and (22) may be interpreted as the regulator adaptation equations instead of training equations, as is normally done. indeed, the simulation study done until now using the scheme shown in fig.
1, permanently adjusting the weighting coefficients v_j and w_{ji} by means of (21) and (22), allowed us to arrive at the following conclusions:
• a previous training of the net is in general not required; once the control loop is closed, the weighting coefficients self-adjust in a few control periods, carrying the regulation error e_y(t) to zero.
• the dynamical performance of the closed loop depends exclusively on the magnitude of the learning coefficient: to higher values of \eta corresponds a faster response, which can even present a considerable over-shoot. diminishing the value of \eta, the system is damped up to the point at which the desired response is obtained. the system, however, keeps its stability for a wide range of \eta values.
• it is very convenient to use a variable learning coefficient, using an expression such as

\eta^{*} = \eta \left( 1 + \alpha \left| e_y(t) \right| \right). \qquad (23)

in this way, a small basic value of \eta can be used, for instance \eta = 0.1, and the effective value \eta^{*} is incremented depending on the regulation error magnitude. the right value of \alpha can be tuned experimentally without trouble. the use of equation (23) gives the system the possibility to present a fast response when the errors are big and then to approach the reference value slowly. that behaviour is, indeed, very convenient in practice.

4 some simulation results

although the algorithm described above has been tested on many examples, here we only show the results obtained in two simulated cases corresponding to non-linear processes which can be described by means of the equation

y(t) = \frac{k\, e^{-t_d s}}{\left( t_1 s + 1 \right) \left( t_2 s + 1 \right)}\, u^2(t). \qquad (24)

the simulation was carried out in a real-time environment provided by the cpg system [1], so the obtained results are very close to those that could be expected in a real process. in fig. 3, the closed-loop behaviour corresponding to the following process and regulator parameters is represented:

t_1 = 5\,\mathrm{s}, \quad t_2 = 3\,\mathrm{s}, \quad t_d = 5\,\mathrm{s}, \qquad (25)

\eta = 0.6, \quad \alpha = 0.4. \qquad (26)

as can be seen, a practically perfect response is obtained when positive and negative step changes in the reference output value are applied. given that the time constants are relatively small, the values of \eta and \alpha can be chosen relatively big without producing positive or negative over-shoots. this type of performance is observed in general; it means that for fast-dynamics processes, a fast learning of the net is possible and convenient. we have observed that for time constants in the order of milliseconds, values of \eta of 5 and more can be used.

fig. 3: closed-loop behaviour for the process in equations (25)

in fig. 4, the closed-loop performance for the following values of the constants is represented:

t_1 = 60\,\mathrm{s}, \quad t_2 = 10\,\mathrm{s}, \quad t_d = 20\,\mathrm{s}, \qquad (27)

\eta = 0.05, \quad \alpha = 0.20. \qquad (28)

as can be observed, we have a large time-delay non-linear system which can hardly be controlled by a conventional adaptive algorithm, for instance a self-tuning pid. however, with the adaptive neural regulator, a response is obtained that can be considered very good, given the process characteristics. notice that in this case, the values of \eta and \alpha are considerably smaller, as expected, given that the process dynamics is much slower.

fig. 4: closed-loop behaviour for the process in equations (27)
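to make the adaptation loop concrete, the following matlab sketch (our addition) wires equations (3)-(8), (13), (17) and (21)-(23) around a hypothetical discretised first-order plant with a pure delay; the plant model, the set-point profile and all numerical values are illustrative assumptions and do not correspond to the processes of equations (24)-(28).

Ts = 1; N = 400;                    % sampling period [s], number of steps
a = exp(-Ts/20); b = 1 - a;         % assumed first-order plant, time constant 20 s
d = 3;                              % assumed pure delay of 3 samples
eta = 0.1; alpha = 0.5;             % basic learning coefficient and eq. (23) gain
sgn = +1;                           % sign of the plant "equivalent gain"
v = 0.1*ones(3,1);                  % output-layer weights v_j
w = 0.1*ones(3,3);                  % hidden-layer weights w_ji
y = zeros(N,1); ey = zeros(N,1);
u = 0.5*ones(N,1);
yr = 0.6*ones(N,1); yr(200:end) = 0.3;   % set-point with one step change
sig = @(z) 1./(1 + exp(-z));        % logistic function of eqs. (4) and (6)
for t = 4:N-1
    y(t)  = a*y(t-1) + b*u(t-d);    % plant step (illustrative model)
    ey(t) = y(t) - yr(t);           % eq. (2)
    x = [ey(t); ey(t-1); ey(t-2)];  % eq. (3) with n = 2
    h = sig(w*x);                   % eqs. (4)-(5)
    u(t+1) = sig(v'*h);             % eqs. (6)-(7), control action
    d1 = ey(t)*u(t+1)*(1 - u(t+1)); % eq. (13)
    d2 = d1*(v.*h.*(1 - h));        % eq. (17), vector of the delta_2j
    etae = eta*(1 + alpha*abs(ey(t)));   % eq. (23), variable learning coefficient
    v = v - etae*sgn*d1*h;          % eq. (21)
    w = w - etae*sgn*(d2*x');       % eq. (22)
end
plot(1:N, y, 1:N, yr)               % output against the set-point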
5 conclusions

the self-tuning control algorithm based on a neural net presented in this paper promises to be a very interesting option for the control of processes with difficult dynamics that cannot be adequately controlled with pid regulators, even in their self-tuning versions, as was shown in the simulation cases presented above. the class of processes to which the algorithm could be applied is very wide, and it includes most of the cases that can appear in practice. in the near future, we plan to apply the algorithm to some real laboratory processes and to extend the obtained results to the case of multivariable and multiconnected systems.

references

[1] aguado a.: controlador de propósito general. memorias del v congreso latinoamericano de control automático, la habana, cuba, 1992.
[2] aguado a., del pozo a.: esquema de control combinado usando redes neuronales. memorias de cimaf 97, la habana, cuba, 1997.
[3] bhat n., mcavoy t. j.: "use of neural nets for dynamic modelling and control of chemical process systems." computers on chem. engineering, vol. 14 (1990), no. 4/5, p. 573-583.
[4] cui x., shin k. g.: "direct control and coordination using neural networks." ieee transactions on systems, man and cybernetics, vol. 23 (1992), no. 3, p. 686-697.
[5] narendra k. s., parthasarathy k.: "identification and control of dynamical systems using neural networks." ieee transactions on neural networks, vol. 1 (1990), no. 1.

m. c. alfonso noriega ponce, phone: 01 (442) 1921264, fax: 01 (442) 1921264 ext. 103, e-mail: anoriega@uaq.mx
dr. alberto aguado behar
m. c. antonio ordaz hernández
dr. vladimir rauch sitar
universidad autónoma de querétaro, centro universitario, santiago de querétaro, qro., c. p. 76010, méxico

suitable production tools selection with the use of evolutionary algorithms

petr hynek (a,*), viktor kreibich (a), roman firt (b)
(a) czech technical university in prague, faculty of mechanical engineering, department of manufacturing technology, technická 4, 166 07 praha 6, czech republic
(b) volkswagen ag, department of automation engineering, letter box 011/13990, 38436 wolfsburg, germany
(*) corresponding author: hynekp1@centrum.cz

abstract. this paper deals with the use of a production equipment simulation in the design of production systems, more specifically of the welding equipment in the automotive industry. based on the simulation results, a matrix is created which defines the possibility of using given manufacturing tools (in this case, welding guns are considered) to connect the plates using the electrical resistance spot welding process. this matrix generates a set of several solutions depending on other parameters, such as the lowest price, the lowest number of used welding guns, etc. the goal is to solve this task. the solution is presented using mathematical programming; specifically, the method of genetic evolutionary algorithms is used. the solver software is used to optimize the selection of the welding guns' combination. solver is an add-on in ms excel. the case study shows a weldment with 15 welding points, on which the availability of 20 types of welding guns was simulated.
the result is an ideal combination of 2 types of guns for the lowest price.

keywords: design process, production system, production process, simulation, body shop, automotive industry, evolutionary algorithms.

1. introduction

generally, the design and the optimization of equipment in the automotive industry are driven forward by the constant search for an ideal solution. this can be generalized to any production industry. equipment can be considered ideal when it meets the following basic conditions:
• lowest price
• minimal production area
• required capacity
• compliance with legal regulations and standards
• flexibility
• ergonomics

the system design, and even the overall design of production systems, requires a step-by-step modelling method, i.e. the creation of options and their technical, organizational and economic evaluation. for more complex tasks, it is necessary to use simulations to evaluate the dynamic ability to coordinate all the functions and elements of the production system over time, occupying space with qualitative and quantitative requirements [1]. production planning is a problem of multidimensional optimization, where there is a number of partial issues, such as product selection, product allocation, manufacturing sequence, etc., that need to be solved at the same time. in the following text, an improved genetic algorithm is introduced to find an ideal solution. experimental results have proved that the proposed genetic algorithm structure is better than a conventional structure. this is because the proposed genetic algorithm allows "learning" from its own experience [2]. running a simulation to support production planning can be used for an early issue detection [3]. simulations are commonly used for a design evaluation during the late stages of product development [4]. simulation also provides the ability to run production capacity tests, simulation model experiments and the creation of various scenarios [3]. hence, research suggests an early-stage systematic physical analysis in the product design. there are also examples using different simulations to prove a product's manufacturability. however, there is a dearth of research that investigates the application of simulation tools that can support the assessment of preliminary production operations that utilize a variety of production resources to produce the same emerging product variety [4]. big data analysis has been successfully used in many areas, including product lifecycle management, supply chain management, and predictive maintenance. the aim is to design a machining optimization based on a big data analysis. each production machine is represented by its attributes. once the data are ready, the resources are optimized. this approach is validated by a simplified case study with implemented hybrid genetic algorithms [5]. the main technology used in automotive bodywork production lines is electrical resistance spot welding. when designing this type of production line, it is necessary to select the appropriate production tools;
in order to run the simulation with production tools, it is necessary to prepare a simulation model at first. for designing welding lines, the czech car manufacturer škoda auto a.s. uses planning software called process designer by siemens. this software is the first used to prepare a product, in a specified case, it is part of the whole bodywork. 2. simulation every assembly is defined by its part list, every pressed metal sheet has a 3d model with an exact location (x, y, z) in the space. the part list is a list of all input parts (pressed metal sheets, weldments) with the part name and number. the specific level is assigned to every part and defines the time when the part enters the bodywork (welding). the part list also has more attributes describing the product (material, surface finish, sheet thickness, production depth, start and finish date of usage of the bodywork). the 3d weldments model is created using the cad system, in this case, the catia software is used, see figures 1 and 2. welding elements are specific joint types, such as electric resistance spot welds, fusion welds, adhesives, bolts, etc. the elements are designed in the cad system as well. they contain the following attributes: weld name, weld type, position in space (x, y, z), a combination of joined parts (2 sheets welds, 3 sheet welds) and clear assignments to pressed metal sheets that need to be welded. in order to simulate the welded part’s manufacturability, production tools library has to be created for the df (digital factory) system. in this case, it is library of standard welding guns, such as in figure 3. now all the necessary resources are reached to make the first production tool selection, the suitable welding guns need to be selected. the selection method is called a welding-guns availability simulation. this simulation method checks selected welding guns with the requested weld points. the method is about placing a 3d model of welding guns on each welding point. the guns than rotate around the point in 10°steps. next step is the rotation of welding guns around their axis by 180°and the process is repeated. the simulation tool evaluates whether there is at least one position of the welding guns relative to the weld point, where there is no collision between the guns and the assembly. the result of the simulation process checking the availability of welding guns is a matrix. this matrix holds for the set of welding points and welding guns’ information about collision-free welding possibility. in figure 4 is an example of such a simulation where the “+” sign indicates a collision-free situation of the guns and the welding point and the “-” sign indicates a collision. 3. ideal solution using the simulation, welding guns, which can be used to weld to a specific welding point were identified. the best result is when one type of welding guns can weld all selected points. a case may also occur, where the weld point is not weldable by any guns 57 p. hynek, v. kreibich, r. firt acta polytechnica figure 2. assembly (product). figure 3. welding guns library. figure 4. simulation result. 58 vol. 60 no. 1/2020 suitable production tools selection. . . figure 5. optimization task. saved in the library. in this case, the construction of special welding guns is necessary, so the unique point can be welded. usually, the simulation confirms the weldability of all the points by a combination of several guns. the set n solution is obtained where there are n combinations of guns that can weld the selected set of points. 
the goal is to choose the ideal solution from this set of n solutions, as shown in figure 5.

figure 5. optimization task.

the selection can be, for example, based on the lowest price. the unit price of every welding gun can be added to the solution matrix, so one can look for the combination with the lowest price. mathematical programming can be used to solve such a task. the task of mathematical analysis is to find the extremes (maximum, minimum) of a function of multiple variables with boundary conditions. if the function for which an extreme is sought is linear, and the boundary conditions are linear as well, the whole task is treated as a linear programming (lp) problem [6]. the task of the lp is to find the extreme of a linear function in which the variables are bound by a set of linear limiting conditions [6]:

m < n; \quad b_i \ge 0

\sum_{j=1}^{n} a_{ij}\, x_j \le b_i \quad \text{for } i = 1, 2, \ldots, m; \qquad x_j \ge 0, \quad j = 1, 2, \ldots, n

f = \sum_{j=1}^{n} c_j\, x_j = \min

the basic variables are defined in table 1.

n_t                        number of tools (for example, welding guns)
n_w                        number of tasks (for example, welding points)
h ∈ {0,1}^(n_t × n_w)      solution matrix
  h_ij = 0                 with the i-th tool it is not possible to reach the j-th task
  h_ij = 1                 with the i-th tool it is possible to reach the j-th task
p ∈ R^(n_t)                price vector; p_i is the price of the i-th tool
k ∈ N_0^(n_t)              configuration vector; k_i is the planned amount of the i-th tool
t = Σ_{j=1}^{n_t} k_j p_j  total price
u^k ∈ {0,1}^(n_w)          success vector for the configuration k
  u^k_i = 0                the i-th task is not possible in configuration k (i.e. ∀j: h_ji = 0 ∨ k_j = 0)
  u^k_i = 1                the i-th task is possible in configuration k (i.e. ∃j: h_ji = 1 ∧ k_j > 0)
v^k = Σ_{j=1}^{n_w} u^k_j  overall success of the configuration k

table 1. basic variables.

the goal is to find a configuration k minimizing the total price t under the condition of overall success (that is, v^k = n_w, meaning that the selected configuration can meet all the goals). since this is a multidimensional task with discrete parameters, one of the variants of global evolutionary optimization is chosen.

4. evolutionary algorithms

evolutionary algorithms are global optimization algorithms; they can approximately solve tasks that are not possible to solve exactly with the current computing power, that are extremely time-consuming, or that require human intuition. these algorithms use the principles known from evolutionary biology, especially darwin's survival-of-the-fittest principle. to make sure the evolution will work, three things are necessary:
(1.) two existing solutions can form a new "averaged" solution; this is called crossing.
(2.) the created solution can be randomly manipulated; this is called mutation.
(3.) for any individual, another suitable individual is selected; this is called natural selection.

this principle often finds high-quality problem solutions without the need to invent a specialized algorithm for a particular problem. the only thing the algorithm needs to know is which solutions are stronger and which are weaker. this is a simple task carried out by a human; however, creating a good solution can be unbelievably difficult [7]. different tasks make it possible to represent the solution in different ways. the solution is represented by a chromosome that can be specified as:
• a binary numbers vector (a sequence of zeros and ones)
• a real numbers vector
• charts and others

the entire vector that represents the solution is called the chromosome. the chromosome consists of genes that can have different values. the specific value of one gene is called an allele, see figure 7. the size of the genotype may vary. in this case, both the mutations and the crossing would have to be precisely designed to work with different sizes of chromosomes [4].
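in the notation of table 1 above, evaluating one candidate configuration takes only a few lines; the following sketch (our addition) uses small hypothetical numbers purely to illustrate the definitions of u, v and t.

h = [1 0 1 1 0; 0 1 1 0 1; 1 1 0 0 1];  % illustrative 3 gun types x 5 weld points
p = [100; 80; 60];                      % illustrative unit prices
k = [1; 0; 1];                          % candidate: one gun each of types 1 and 3
u = any(h(k > 0, :), 1);                % success vector u: point covered by a used gun
V = sum(u);                             % overall success v
T = k'*p;                               % total price t
feasible = (V == size(h, 2));           % v equals n_w: every weld point covered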
∀j : hji = 0° ∨ kj = 0) u −→ k i = 1 i-task is possible in configuration −→ k (i.e. ∃j : hji = 1° ∧ kj > 0) v −→ k = ∑nw j=1 u −→ k j overall success of configuration −→ k table 1. basic variables. figure 6. general ea process. figure 7. bit string. figure 8. roulette wheel. crossing one of the most common ways of crossing is a simple crossover method. it is usually done with one or two points that are randomly selected in the chromosome and the genes are exchanged between those two. [7] mutation for the mutation, it should apply that bigger changes occur with fewer probabilities than small changes. the mutation method depends on the representation method. if a binary chromosome is present, the mutation can be done by random bit exchange. in the case of the real number chromosome, random values (given e.g. by the normal distribution) can be added to gene values. if the solution by a chart is presented, the mutation can be adding a node, an edge, changing the order, etc. [8] selection the commonly used genetic-algorithm selection method is the roulette wheel selection. imagine a roulette wheel divided into different size fields. the size of each piece corresponds to the individual’s fitness value (see figure 8). when the roulette spins, there is a higher chance that larger piece will be selected. individuals with a higher fitness rating have a higher chance of being selected for crossing and to pass on their genes. [7] 60 vol. 60 no. 1/2020 suitable production tools selection. . . figure 9. start matrix. g1 – g20 welding gun wp1 – wp15 welding point costs costs of welding gun quantity number of welding guns required for the best solution result quantity vs. costs checksum the check sum, number 1 is necessary for the correct result table 2. description of the matrix.. 5. ideal solution the problem described above will be solved using the genetic evolution algorithm. the solver optimization software will be used, this is the ms excel add-on. first, the matrix of the result is converted (composed of welding guns’ availability) into the binary code and other parameters are added. individual variables are described in table 2. now in ms excel, under the “data” tab, the “solver” add-on is used and the basic parameters are set (see figure 10). since the task is relatively complex and many variants need to be checked, it is recommended to change the basic optimization time setting to 3 minutes (see figure 11). the optimization itself is started with the “solve” command, see figure 12. after the calculation is complete, the ideal solution is confirmed to keep (see figure 13). the ideal solution is the use of only two welding gun types g3 and g5. the total cost of this combination is 240. it is still necessary to check the checksum indicator, all values need to equal to 1 and the total sum should match the number of welding points, in this case, the total sum is 15. in the considered case, for 20 welding gun types, where each type is capable to weld at least one welding point, more than 1.4 million combinations can be found. with solver, the ideal combination was found after trying 119,500 combinations, which is about 8 % of all possible combinations. 6. conclusion the main feature of the production system simulation is to create multiple possibilities. the described partial simulation of the welding guns’ availability solves the designing issue of the selection of welding guns. 
the result shows guns that are able to weld the selected point with the electric resistance welding method without a collision, but choosing the right combination of these guns is not resolved. therefore, further analysis is required, with new variables such as price being added. this process gives many additional result combinations. the global evolutionary analysis 61 p. hynek, v. kreibich, r. firt acta polytechnica figure 10. solver parameters. figure 11. solver options. 62 vol. 60 no. 1/2020 suitable production tools selection. . . figure 12. solver results. figure 13. result matrix. 63 p. hynek, v. kreibich, r. firt acta polytechnica is used to find the ideal solution out of the additional combinations including new variables. the ms excel add-on called solver was used for this analysis. this software allows you to check about 10,000 possibilities in a short amount of time and come out with the ideal one. as an example, the availability matrix is presented with the combination of 15 welding points and 20 types of welding guns. to find the ideal combination of welding guns using the genetic evolution algorithm in ms excel add-on solver, about 8 % of all combinations were necessary to verify. in today’s bodywork, the weldments can reach up to 400 welding points. the standardized production is an attempt to have only certain types of welding guns. for the case of 50 welding gun types and 400 welding points, lowering the necessary combinations to find the ideal value below 10 % is very beneficial. references [1] a. zelenka. projektování výrobních procesů a systémů. ctu publishing house, prague, 1st edn., 2007. [2] s. d. dao, k. abhary, r. marian. an improved genetic algorithm for multidimensional optimization of precedence-constrained production planning and scheduling. journal of industrial engineering international 13(2):143 – 159, 2017. doi:10.1007/s40092-016-0181-7. [3] š. václav, p. košťál, š. lecký, et al. assembly system planning in automotive industry with use of discrete event simulation. lecture notes in mechanical engineering pp. 503–515, 2018. doi:10.1007/978-3-319-75677-6_44. [4] x. gong, j. landahl, h. johannesson, r. jiao. simulation-driven manufacturing planning for product-production variety coordination. in 2017 ieee international conference on industrial engineering and engineering management (ieem), pp. 2039–2043. 2017. doi:10.1109/ieem.2017.8290250. [5] w. ji, s. yin, l. wang. a big data analytics based machining optimisation approach. journal of intelligent manufacturing 30(3):1483–1495, 2019. doi:10.1007/s10845-018-1440-9. [6] j. kožíšek, b. stieberová. statistická a rozhodovací analýza. ctu publishing house, prague, 2014. [7] f. streichert. introduction to evolutionary algorithms. http://www.ra.cs.uni-tuebingen.de/mitarb/ streiche/publications/introduction_to_ evolutionary_algorithms.pdf, 1991. university of tuebingen. [8] y. xinjie. introduction to evolutionary algorithms. springer, london, 2010. doi:10.1007/978-1-84996-129-5. 
the influence of silver nanoparticles synthesis on their properties

anna mražíková (a,*), oksana velgosová (a), jana kavuličová (b), stanislav krum (c), jaroslav málek (c)
(a) institute of materials and quality engineering, faculty of materials, metallurgy and recycling, technical university of kosice, slovakia
(b) institute of metallurgy, faculty of materials, metallurgy and recycling, technical university of kosice, slovakia
(c) department of materials engineering, faculty of mechanical engineering, czech technical university in prague, czech republic
(*) corresponding author: anna.mrazikova@tuke.sk

abstract. the application of green methods to replace the physical and chemical methods for the synthesis of silver nanoparticles (agnps) has become necessary not only from the economic aspect but especially due to their significant impact on the ecosystem. the properties of agnps biologically synthesized using the green algae parachlorella kessleri (p. kessleri) and of chemically prepared agnps were investigated and compared. the uv-vis analysis confirmed a high stability of the biosynthesized agnps as well as of the chemically synthesized gelatin-modified citrate-agnps. scanning electron microscopy (sem) and transmission electron microscopy (tem) revealed different sizes and shapes of the agnps synthesized in the different ways. biosynthesized agnps have a similar inhibitory antimicrobial activity as gelatin/sodium citrate-agnps.

keywords: silver nanoparticles; biosynthesis; chemical reduction; gelatin; anti-microbial activity.

1. introduction

in recent years, the use of noble metal nanomaterials in many industrial applications, including physics, chemistry, electronics, optics and material science, has rapidly increased. furthermore, silver-containing materials have gained great attention, especially due to their antimicrobial properties. therefore, agnps are now being used to reduce infections and to prevent biofilm formation on prostheses, catheters, dental materials and also on stainless steel materials [1-5]. the physical and chemical methods generally used for agnps synthesis very often involve toxic chemicals that can contaminate the nanoparticles [6]. such nanoparticles are released into the environment at different stages of their production and application, and even during the disposal of nanowastes, which can consequently lead to a contamination of the whole ecosystem. finally, the majority of the nanoparticles accumulate in fresh and marine ecosystems [7]. oukarroum et al. [8] in their reports outlined a negative effect of such agnps on both freshwater and marine algae, shown by a strong decrease in viable algal cells.
therefore, there is a need to replace the physical and chemical techniques of agnps preparation by green alternatives, which are cost-effective, safe, environment-friendly and easily scaled up for large syntheses of nps. the use of biomolecules like proteins and lipids present on np surfaces has a great potential in agnps synthesis due to their non-toxic nature and gentle synthetic procedures [9, 10]. therefore, there is a growing interest in applying biomimetics, which use plants, bacteria, fungi, yeast, actinomycetes and algae for the synthesis of nanostructures of biocompatible metals and semiconductors [5]. the common problem of the application of agnps typically prepared via the reduction of a silver precursor using chemical or physical means is the dispersion instability against aggregation. one of the possibilities to enhance the stabilization of the nanoparticles is the addition of surface-protecting agents, such as organic ligands, namely chitosan, polysaccharides and gelatin, or inorganic capping materials [11-13]. lee [12] and sivera [13] reported that gelatin-modified agnps exhibited a long-term stability against aggregation and maintained unchanged optical and physical properties and a high antibacterial activity for several months at ambient temperature. similar properties were also observed in biologically synthesized agnps. functional groups of the biological materials are responsible for the reduction of ag+ and the subsequent stabilization of the nanoparticles [9, 14, 15]. green algae are well known for a biomass containing different organic, biologically active compounds, such as chlorophylls, carotenoids, flavonoids, proteins, vitamins and minerals [5]. the natural polymer of the algae is considered to be suitable for the stabilization of inorganic silver nanoparticles. we have recently reported that agnps biosynthesized using p. kessleri showed a long-term stability at higher ph values [16]. the novelty of this work is the investigation and comparison of the properties of agnps biologically synthesized using the algae p. kessleri and of chemically synthesized agnps. the chemical synthesis of agnps was performed in two different ways: using sodium citrate alone, and using both sodium citrate and gelatin as reducing and capping agents. this study also compares the antimicrobial activities of the agnps against the algae p. kessleri.

figure 1. uv-vis spectra of (a) biosynthesized agnps, (b) gelatin/sodium citrate-agnps, (c) sodium citrate-agnps.

2. materials and methods

2.1. synthesis of silver nanoparticles

the green algae p. kessleri were cultivated on agar plates in petri dishes for 3 weeks at ambient temperature and a light regime of 12:12. the extract was filtered, the filtrate was centrifuged at 9000 rpm for 15 min, and the supernatant was added into erlenmeyer flasks containing 250 ml of agno3 solution (0.29 mm) and used for the biosynthesis of agnps. the erlenmeyer flasks were stored under lighting conditions at ambient temperature to allow the silver ions to be reduced into agnps. chemically synthesized agnps were prepared using the chemical reduction method [7, 17]. 15 ml of a sodium citrate solution (0.5 wt.%) as a reducing agent was added drop by drop to 250 ml of aqueous agno3 (0.29 mm).
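as a quick arithmetic aid (our addition, not part of the protocol): the agno3 mass behind the 0.29 mm stock can be checked in one line, taking the molar mass of agno3 as 169.87 g/mol.

c = 0.29e-3;            % concentration [mol/l]
V = 0.250;              % volume [l]
M = 169.87;             % molar mass of agno3 [g/mol]
m_mg = c*V*M*1000       % ~12.3 mg of agno3 per 250 ml batch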
in the case of chemically synthesized agnps using mixed gelatin/sodium citrate, first, to prepare the stock solution of agno3, gelatin (0.01 wt.%) was dispersed in 250 ml of 0.29 mm agno3 to prevent particle agglomeration. silver nanoparticles were then prepared by the drop-wise addition of 15 ml of sodium citrate solution (0.5 wt.%) into the agno3 solution. both solutions of chemically prepared agnps were stirred at 700 rpm with a magnetic stirring bar at 70 °c for 30 minutes. the erlenmeyer flasks were stored under lighting conditions at the ambient temperature, and the end point of the reaction was the appearance of a pale yellow-brown and a dark brown colour, respectively.
2.2. antimicrobial assay
the ability of agnps to inhibit the formation of algal biofilm was tested by the standard disk-diffusion method [18]. 1 ml of algal suspension (10^5 cfu/ml) was used to seed agar plates consisting of 2 % agar and culture medium (milieu bristol). 25 µl of the colloidal agnps solutions was added to sterile swabs (6 mm) placed on the agar plates seeded with the microorganisms. the minimum inhibitory concentration (mic) was read after 7 days of incubation at the ambient temperature and light mode (12 : 12).
2.3. characterizations
the nanoparticle colloidal solutions were stirred with a magnetic stirrer with heating (ika c-mag hs4). the absorbance of the agnps dispersions was analyzed by uv-vis spectroscopy from 300 to 800 nm with a unicam uv/vis spectrometer uv4. the absorbance was recorded on days 7 and 120. a transmission electron microscope (tem; jem2000fx, jeol) at 200 kv was used to determine the size and morphology of the agnps on day 120. scanning electron microscopy (sem) analysis was done using a jeol jsm-7600f and was used to determine the surface morphology properties on day 120. an eve (nanoentek) instrument was used to automate cell counting by the standard trypan blue technique. the observation of algal cell eradication on the agar plates was done with a leica wild m32 macroscope.
3. results and discussion
the formation of silver nanoparticles synthesized either chemically or biologically was clearly observed after 3 and 24 hours of reaction time by solution colour changes and confirmed by uv-vis spectroscopy, as depicted in fig. 1. in the case of p. kessleri the solution colour changed from pale yellow to yellow-brown. in the presence of citrate and gelatin/sodium citrate the solution colour changed to pale yellow-brown and dark brown, respectively. the uv-vis measurements showed an increase of the absorption maximum within 120 days in all three nanoparticle samples. this indicated the continuing formation of agnps in the solution over that time. the biologically synthesized nanoparticles exhibited an increase of the broad absorption band on day 120 and only a minor shift of the uv-visible spectrum (fig. 1a), which indicated a long-term stability of the silver nanoparticles [19]. kadukova [15] reported that agnps produced by p. kessleri can be stable even for more than 6 months. the agnps chemically produced using gelatine as a capping agent exhibited the most significant increase of the absorption maximum, at 433 nm, on day 120 (fig. 1b). as the reaction time increased, more amine residues of gelatine were being released into the reaction system and, consequently, the reduction of silver ions slowly proceeded [12].
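the spectral arguments in this section all reduce to locating the surface plasmon resonance (spr) absorption maximum and following its position between day 7 and day 120. a minimal numpy sketch of that bookkeeping is given below; the wavelength grid and the two absorbance arrays are hypothetical placeholders standing in for digitized spectra, not data from this study.

```python
import numpy as np

def spr_maximum(wavelength, absorbance, lo=350.0, hi=500.0):
    """position and height of the absorption maximum in [lo, hi] nm."""
    sel = (wavelength >= lo) & (wavelength <= hi)
    i = np.argmax(absorbance[sel])
    return wavelength[sel][i], absorbance[sel][i]

# hypothetical smoothed spectra on a 300-800 nm grid
wl = np.linspace(300.0, 800.0, 501)
day7 = np.exp(-0.5 * ((wl - 416.0) / 30.0) ** 2)          # peak near 416 nm
day120 = 1.3 * np.exp(-0.5 * ((wl - 433.0) / 40.0) ** 2)  # higher, red-shifted

p7, _ = spr_maximum(wl, day7)
p120, _ = spr_maximum(wl, day120)
print(f"spr shift: {p7:.0f} nm -> {p120:.0f} nm ({p120 - p7:+.0f} nm)")
```

a growing peak height with little shift reads as continued particle formation with stable sizes, while a red-shift of the maximum, as in this toy example, reads as growth of the average particle size.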
figure 2. sem and tem images and the antimicrobial effect of (a, top) biosynthesized agnps (b, middle) gelatin/sodium citrate-agnps (c, bottom) sodium citrate-agnps.
the shift of the uv-visible spectrum (fig. 1b) from 416 to 433 nm points to the creation of larger average particle sizes, as reported by many authors [19, 20]. for the agnps obtained and stabilized using sodium citrate only, the uv-vis increase of the absorption maximum on day 120 was not as strong as in the case of the biologically synthesized agnps and the gelatin-modified sodium citrate-agnps. the most significant shift of the uv-vis absorption maximum from day 7 to day 120 (from 360 to 450 nm) was observed in the solution with citrate-agnps without the addition of capping agents. such broadening and shift of the spr indicate the presence of nanoparticle sizes larger than those mentioned above and also indicate their only short-term stability [20]. the occurrence of symmetrical sharp uv-vis absorption peaks typically located around 400 nm, observed on day 120 in the solutions with biologically synthesized agnps as well as gelatin/sodium citrate-agnps (fig. 1ab), indicated the presence of stable nanoparticles. the broad absorption band observed in citrate-agnps (fig. 1c) indicated much less uniform and less stabilized nanoparticles. the sem and tem micrographs (fig. 2c) obtained after 120 days revealed the presence of large particle sizes (from 7 to 85 nm) and also small agglomerates and some dispersed agnps. it is very likely that the agglomeration was caused by diminishing electrostatic repulsion [21]. based on our results, the addition of gelatin to the ag+ solution in the process of agnps formation improved the stability and dispersibility of the nanoparticles. biomolecules like peptides and proteins present in gelatin are able to readily interact with metals, and hydrophilic ligands protect gelatin-coated agnps in aqueous solution. the coating serves to provide a proper gap between the silver cores [12, 13]. the silver nanoparticles obtained using the biological approach appear in the sem and tem micrographs (fig. 2a) as spherical particles with an average particle size of 15 nm. the role of the active compounds of the biomass responsible for the process of agnps formation and stabilization was confirmed by the work of several authors [6, 15, 23]. their results indicated that the various functional groups, especially the amine, carboxyl, sulphydryl and hydroxyl moieties present in the proteins, primarily cause the reduction of ag+ and the formation of nanoparticles. the formation of smaller gelatin/sodium citrate-agnps (from 4 to 55 nm) might be caused by the presence of a higher gelatin concentration, as demonstrated in [14, 22]. the sem and tem images (fig. 2b) of the gelatin-protected particles also revealed nanoparticles of different spherical and pyramidal shapes. antimicrobial effects of the biosynthesized and chemically synthesized agnps against the green alga p. kessleri were observed in all three cases. the results revealed that the chemically synthesized sodium citrate-agnps caused only a partial inhibition (fig. 2c) of the algae. the double zone of inhibition observed around the swabs impregnated with sodium citrate-agnps was attributable to their bigger particle sizes: the agnps were not able to pass through the pores in the cell wall. such aggregate formation might act as a binding agent between cells and inhibit algal cell growth [8].
a stronger extent of algal cell eradication and a clear circular inhibition zone were observed around the swabs impregnated with gelatin/sodium citrate-agnps and biosynthesized agnps (fig. 2ab). according to the literature [3, 14], nanoparticles of smaller sizes have a higher antibiofilm activity due to their larger surface/volume ratio, which makes it easier for them to reach cellular proximity. such agnps cause structural changes and damage to the cellular membrane that lead to cell death [3]. our results indicated that biosynthesized agnps have a similar inhibitory antimicrobial activity against biofilm formation as gelatin/sodium citrate-agnps and, owing to their easy and inexpensive synthesis, appear to be a good alternative to chemically prepared agnps.
4. conclusion
silver nanoparticles were synthesized by bio- and chemical reduction of ag+ ions. the uv-vis spectroscopy revealed that the addition of gelatin positively affected the size and long-term stability of the chemically synthesized citrate-agnps. gelatin-coated citrate-agnps also displayed enhanced antialgal effects in comparison with citrate-agnps. the uv-vis, sem and tem analyses revealed that the agnps biosynthesized using the algae extract exhibited long-term stability and also a good antimicrobial activity against the green algae, which could be attributed to the smallest sizes of the agnps. the extract from the green alga p. kessleri can adequately act as both reducing and capping agent. the results implied that biosynthesized agnps can be a good alternative for the preparation of materials which inhibit biofilm formation.
acknowledgements
this work was financially supported by the slovak grant agency (vega 1/0134/19).
references
[1] zhang x., wang h., li j., he x., hang r., yang y., tang b.: the fabrication of ag-containing hierarchical micro/nano-structure on titanium and its antibacterial activity. mater. lett. 193, 2017, p. 97-100. doi:10.1016/j.matlet.2017.01.094
[2] guzmán m. g., dille j., godet s.: synthesis of silver nanoparticles by chemical reduction method and their antibacterial activity. int. j. of chemical and biomolecular eng., 2:3, 2009, p. 104-111. doi:10.1016/j.nano.2011.05.007
[3] inbakandan d., kumar c., abraham l. s., kirubagaran r., venkatesan r., khan s. a.: silver nanoparticles with anti microfouling effect: a study against marine biofilm forming bacteria. colloids and surfaces b: biointerfaces 111, 2013, p. 636-643. doi:10.1016/j.colsurfb.2013.06.048
[4] oluwafemi o. s., vuyelwa n., scriba m., songca s. p.: green controlled synthesis of monodispersed, stable and smaller sized starch-capped silver nanoparticles. mater. lett. 106, 2013, p. 332-336. doi:10.1016/j.matlet.2013.05.001
[5] shankar p. d., shobana s., karuppusamy i., pugazhendhi a., ramkumar v. s., arvindnarayan s., kumar g.: a review on the biosynthesis of metallic nanoparticles (gold and silver) using bio-components of microalgae: formation mechanism and applications. enz. and microb. technol. 95, 2016, p. 28-44. doi:10.1016/j.enzmictec.2016.10.015
[6] bogireddy n. k. r., kumar h. a. k., mandal b. k.: biofabricated silver nanoparticles as green catalyst in the degradation of different textile dyes. j. of environ. chem. eng. 4, 2016, p. 56-64. doi:10.1016/j.jece.2015.11.004
[7] girilal m., krishnakumar v., poornima p., fayaz a. m., kalaichelvan p. t.: a comparative study on biologically and chemically synthesized silver nanoparticles induced heat shock proteins on fresh water fish oreochromis niloticus. chemosphere, 139, 2015, p. 461-468.
doi:10.1016/j.chemosphere.2015.08.005
[8] oukarroum a., bras s., perreault f., popovic r.: inhibitory effects of silver nanoparticles in two green algae, chlorella vulgaris and dunaliella tertiolecta. ecotoxicology and environ. safety, 78, 2012, p. 80-85. doi:10.1016/j.ecoenv.2011.11.012
[9] sharma d., kanchi s., bisetty k.: biogenic synthesis of nanoparticles: a review. arabian j. of chemistry. 2015. doi:10.1016/j.arabjc.2015.11.002
[10] srinithia b., kumar v. v., vadivel v., pemaiah b., anthony s. p., mathuraman m. s.: synthesis of biofunctionalized agnps using medicinally important sida cordifolia leaf extract for enhanced antioxidant and anticancer activities. mater. lett. 170, 2016, p. 101-104. doi:10.1016/j.matlet.2016.02.019
[11] abdulla-al-mamun m., kusumoto y., muruganandham m.: simple new synthesis of copper nanoparticles in water/acetonitrile mixed solvent and their characterization. mater. lett. 63, 2009, p. 2007-2009. doi:10.1016/j.matlet.2009.06.037
[12] lee ch. and zhang p.: facile synthesis of gelatin-protected silver nanoparticles for sers applications. j. raman spectrosc. 44, 2013, p. 823-826. doi:10.1002/jrs.4304
[13] sivera m., kvitek l., soukupova j., panacek a., prucek r., vecerova r., zboril r.: silver nanoparticles modified by gelatin with extraordinary ph stability and long-term antibacterial activity. plos one, 9(8) e103675, 2014. doi:10.1371/journal.pone.0103675
[14] ethiraj a. s., jayanthi s., ramalingam ch., benerjee ch.: control of size and antimicrobial activity of green synthesized silver nanoparticles. mater. lett. 185, 2016, p. 526-529. doi:10.1016/j.matlet.2016.07.114
[15] kadukova j.: surface sorption and nanoparticle production as a silver detoxification mechanism of the freshwater alga parachlorella kessleri. biores. tech. 216, 2016, p. 406-413. doi:10.1016/j.biortech.2016.05.104
[16] velgosová o., mražíková a., marcinčáková r.: influence of ph on green synthesis of ag nanoparticles. mater. lett. 180, pp. 336-339 (2016). doi:10.1016/j.matlet.2016.04.045
[17] sileikaite a., prosycevas i., puiso j., juraitis a., guobiens a.: analysis of silver nanoparticles produced by chemical reduction of silver salt solution. materials science (medziagotyra) 12(4) 2006, p. 287–291.
[18] kavita k., singh v. k., jha b.: 24-branched ∆5 sterols from laurencia papillosa red seaweed with antibacterial activity against human pathogenic bacteria. microbial research. 169 (4), 2014, p. 301-306. doi:10.1016/j.micres.2013.07.002
[19] villanueva-ibáñez m., yañez-cruz m. g., álvarez-garcía r., hernández-pérez m. a., flores-gonzález m. a.: aqueous corn husk extract – mediated green synthesis of agcl and ag nanoparticles. mater. lett. 152, 2015, p. 166-169. doi:10.1016/j.matlet.2015.03.097
[20] rashid m. u., bhuiyan m. k. h., quayum m. e.: synthesis of silver nano particles (ag-nps) and their uses for quantitative analysis of vitamin c tablets. j. pharm.
sci. 12, 2013, p. 29-33. doi:10.3329/dujps.v12i1.16297
[21] roh j., umh h. n., sim j., park s., yi j., kim y.: dispersion stability of citrate- and pvp-agnps in biological media for cytotoxicity test. korean j. chem. eng. 30 (3), 2013, p. 671-674. doi:10.1007/s11814-012-0172-3
[22] pootawang p., saito n., takai o.: ag nanoparticle incorporation in mesoporous silica synthesized by solution plasma and their catalysis for oleic acid hydrogenation. mater. lett. 65, 2011, p. 1037-1040. doi:10.1016/j.matlet.2011.01.009
[23] patel v., berthold d., puranik p., gantar m.: screening of cyanobacteria and microalgae for their ability to synthesize silver nanoparticles with antibacterial activity. biotechnol. reports. 5, 2015, p. 112-119. doi:10.1016/j.btre.2014.12.001
1 introduction
this paper describes the conceptual design process of octaslide redundant parallel kinematics for a machine tool. redundantly actuated parallel kinematics is a recently developed new concept for machine tools. it enables all mechanical properties of machine tools to be improved several times simultaneously. this is particularly demonstrated on the design of the octaslide. this is a concept of a five-axis machine tool centre. the paper describes the critical initial design phases and the accessible mechanical properties. the design process follows the newly developed design methodology for parallel kinematics machines.
2 concept of redundant actuation for parallel kinematics
parallel kinematics refers to a mechanism in which a body carrying a tool (called a platform) is supported by several independent links (legs) from the frame. parallel kinematics has the advantage that in principle all drives can be on the frame (decreased moving masses) and that its structure is a truss (increased stiffness). however, it also has severe disadvantages, mainly the occurrence of singular positions in the workspace and a generally smaller workspace due to link collisions. these problems of parallel kinematics can be removed by the principle of redundant actuation [1]. if the platform of parallel kinematics is supported on a redundant number of legs (links), i.e. on more than the needed number of dofs – in plane, on more than 3 legs, and in space, on more than 6 legs – then the following simplified consideration can be applied. if a certain combination of 6 legs from the redundant number of legs in the given position of the kinematics leads to a singular position, then another combination of another 6 legs in the same position will be in a non-singular position (fig. 1). certainly, switching between different selected combinations of legs is just an idealized consideration. the kinematics must use all redundant legs simultaneously and the control must be correspondingly smooth.
the actuators of parallel structures can be realized by different principles (fig. 2), not only by telescopic links.
acta polytechnica vol. 44 no. 3/2004
design and properties of octaslide redundant parallel kinematics
v. bauma, m. valášek, z. šika
this paper describes the conceptual design process of octaslide redundant parallel kinematics for a machine tool. redundantly actuated parallel kinematics is a recently developed new concept for machine tools. it enables all mechanical properties of machine tools to be improved several times simultaneously. this is particularly demonstrated on the design of the octaslide. this is a concept of a five-axis machine tool centre. the paper describes the critical initial design phases and the accessible mechanical properties. the design process has followed the newly developed design methodology for parallel kinematics machines.
keywords: parallel kinematics, redundant actuation, five axes, machine tool, dexterity, stiffness.
fig. 1: the concept of redundant actuation of parallel kinematics
fig. 2: the different constructions of link actuators (telescopic, sliding, rotational)
using telescopic links, the octapod has been proposed as a redundant variant of the traditional hexapod. this paper deals with the conceptual design of redundant parallel kinematics with sliding links.
3 design methodology for parallel kinematics
the design of parallel kinematics is a difficult problem. the fundamental problem is that almost all design parameters are mutually dependent. this leads to significantly increased computational complexity. a special design methodology has therefore been developed for parallel kinematics [2]. this methodology has succeeded in decreasing the computational complexity of the design by decomposing the design process into several hierarchical levels and by using computational tools capable of computing the mechanical properties globally (for the whole workspace, rather than for one position in it).
4 octaslide concept
the octaslide is a redundant version of the hexaslide. it is a parallel kinematics whose links have sliding actuators. the principal concept of such a structure is shown in fig. 3. the platform is suspended on 8 links with actuators, unlike the hexaslide or pentaslide, which are suspended on only 6 or 5 links.
5 workspace and dexterity optimization
the goal is to design the octaslide as a machine tool for a cylindrical workspace with a diameter of 1200 mm (the length is almost unlimited). the platform is again a cylinder, with a diameter of 400 mm and a length of 700 mm, for the axial suspension of the spindle. the goal is to maximize the orientation angles for five-axis machining. the investigation and optimization were performed on three testing trajectories (fig. 4). the first is a planar circular trajectory with chords of radius 540 mm in plane x-z, the second is a conical trajectory from the origin with a variable deflection angle, and the third is a conical trajectory on a circular path in plane x-z of radius 540 mm with a variable deflection.
fig. 3: kinematic concept of the octaslide
fig. 4: three testing trajectories
first, the hexaslide was optimized. two structures were investigated. the first kinematic structure is symmetric (fig. 5). the second structure is asymmetric, with links of different lengths (fig. 6). the influence of structure asymmetry on the important kinematic property of dexterity (characterizing the distance from the singular position) is very strong.
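dexterity of this kind is essentially the inverse condition number of the kinematic jacobian, and the benefit of a redundant leg can be illustrated on a toy planar analogue: a point platform pulled by prismatic legs, where each row of the inverse jacobian is the unit vector along one leg. the sketch below is an illustration of the principle only, under simplifying assumptions of ours; it is not the octaslide geometry or the authors' computational tool.

```python
import numpy as np

def leg_jacobian(p, anchors):
    """rows are unit vectors along each prismatic leg:
    d(leg length)/d(platform position) for a planar point platform."""
    d = p - anchors                      # shape (n_legs, 2)
    return d / np.linalg.norm(d, axis=1, keepdims=True)

def dexterity(p, anchors):
    """inverse condition number: 1 = isotropic, 0 = singular."""
    s = np.linalg.svd(leg_jacobian(p, anchors), compute_uv=False)
    return s.min() / s.max()

# 2-dof point platform: non-redundant (2 legs) vs redundant (3 legs)
base2 = np.array([[-1.0, 0.0], [1.0, 0.0]])
base3 = np.array([[-1.0, 0.0], [1.0, 0.0], [0.0, 1.5]])

# platform on the line joining the two anchors -> both legs collinear
p_sing = np.array([0.0, 0.0])
print(dexterity(p_sing, base2))   # 0.0: singular pose
print(dexterity(p_sing, base3))   # ~0.71: the redundant leg removes it
```

the non-redundant arrangement loses a direction of force transmission exactly where the legs become collinear, while the extra leg keeps the dexterity bounded away from zero, which is the planar counterpart of the 8-versus-6-leg argument above.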
a comparison of the dexterity behaviour of the two hexaslide structures on three testing trajectories is shown in fig. 9 (left).
fig. 5: hexaslide with a symmetric structure
fig. 6: hexaslide with an asymmetric structure (different lengths of links)
the same investigation was done for the octaslide. two structures were designed and optimized – symmetric and asymmetric (figs. 7 and 8).
fig. 7: octaslide with a symmetric structure
fig. 8: octaslide with an asymmetric structure (different lengths of links)
the influence of these structures on dexterity on the testing trajectories is again very strong. a comparison of the dexterity behaviour of the two octaslide structures on three testing trajectories is shown in fig. 9 (right). a comparison of the dexterity on the testing trajectories of the optimized hexaslide and octaslide is shown in fig. 10 a, b, c. the behaviour of the octaslide is significantly better, as the dexterity minimum is increased and its variation is decreased. the principle of asymmetry was further investigated and the structure of the octaslide was optimized (fig. 11) by varying the link lengths and their positioning on the frame. the dexterity was further significantly improved above the previous levels (fig. 10c, comparing the optimized hexaslide and the two octaslides). the octaslide from fig. 11 has better dexterity in almost the whole workspace.
fig. 9: dexterity on three testing trajectories for the hexaslide (left) and the octaslide (right)
fig. 10: dexterity comparison on three testing trajectories of the optimized hexaslide and octaslide
fig. 11: further optimized octaslide with increased asymmetry of links
fig. 12: octaslide with a conical platform
fig. 13: stiffness in the workspace for the hexaslide (left) and the octaslide (right) in directions x, y, z
simultaneously with the dexterity, the declination angles characterizing the orientation capabilities were also investigated and optimized. the declination of the hexaslide for trajectory 2 is 14°, and for trajectory 3 it is 13°.
the declination of the octaslide for trajectory 2 is 16°, and for trajectory 3 it is 12°. this means that the two structures are comparable in terms of their orientation capabilities. then the structure of the octaslide was further optimized. the platform was made conical instead of cylindrical (fig. 12). this modification enables the declination angle to be extended to 33°.
6 octaslide stiffness
the stiffness in the whole workspace was compared for the two optimized variants of the hexaslide and the octaslide. they are shown in fig. 13. the comparison of the maximum, minimum and average values demonstrates the significant superiority of the octaslide. the increase in the maximum values is 65–74 %, and the increase in the average values is 43–54 %.
7 conclusions
the concept of redundant actuation enables the design of new parallel kinematics with significantly improved mechanical properties. this has again been confirmed by the design of the octaslide.
8 acknowledgment
the authors appreciate the support of msmt project j04/98:212200008.
references
[1] valášek m. et al: "new concept of redundant parallel robot". in: proc. of mechatronics and robotics 97, vut brno, brno 1997, p. 269–274.
[2] valášek m., šika z., bauma v., vampola t.: design methodology for redundant parallel robots, in proc. of aed 2001, 2nd int. conf. on advanced engineering design, glasgow, 2001, p. 243–248.
ing. václav bauma, csc., phone: +420 224 357 373, e-mail: vaclav.bauma@fs.cvut.cz
prof. ing. michael valášek, drsc., phone: +420 224 357 361, fax: +420 224 916 709, e-mail: michael.valasek@fs.cvut.cz
ing. zbyněk šika, ph.d., phone: +420 224 357 452, e-mail: zbynek.sika@fs.cvut.cz
department of mechanics, czech technical university in prague, faculty of mechanical engineering, karlovo nám. 13, 121 35 praha 2, czech republic
acta polytechnica vol. 44 no. 2/2004
image analysis of eccentric photorefraction
j. dušek, m. dostálek
this article deals with image and data analysis of the recorded video-sequences of strabistic infants. it describes a unique noninvasive measuring system based on two measuring methods (position of the i. purkynje image with relation to the centre of the lens, and eccentric photorefraction) for infants. the whole process is divided into three steps. the aim of the first step is to obtain video sequences on our special system (eye movement analyser). image analysis of the recorded sequences is performed in order to obtain curves of basic eye reactions (accommodation and convergence). the last step is to calibrate these curves to corresponding units (diopter and degrees of movement).
keywords: eccentric photorefraction, purkynje images, strabismus, image analysis.
1 notation
pi purkynje image
roi region of interest
col centre of lens
xt co-ordinate of the centre of the lens
n number of pixels
j, m pixel position
d(x) value of the general difference at point x
l half width of the weighting window
k point of the surroundings and its weight
y polynomial of interpolation
a0, a1 polynomial coefficients
2 introduction
accommodation and convergence are synkinetic ocular reflex actions co-ordinated by high brain controllers. convergence-accommodation synkinesis is a fundamental prerequisite for single binocular vision. appropriate co-ordination of the two actions is fundamental. there is clinical evidence for the presumption that the co-ordination of the two parts of the synkinesis is tuned in early infancy. disturbance of this co-ordination may lead to strabismus. the likelihood of successful treatment falls with increasing age of the patient. therefore, it is important to initiate treatment in infant patients. this requires a careful approach: noninvasive and completely automatic. we have designed a new noninvasive system for measuring basic eye reactions (accommodation and convergence), especially for infants. the system is based on determining the horizontal position of the i. purkynje image (pi) and on eccentric photorefraction. purkynje images were discovered in 1823 by j. e. purkynje [1]. purkynje images are reflections of the light from the optical boundaries of the eye, as shown in fig. 1. eccentric photorefraction is based on the retinal reflex of incoming light.
fig. 1: formation of purkynje images
according to the position of the measuring light source and the estimation algorithm of this light reflection from the eye, two principal subtypes of photorefraction are distinguished:
1. co-axial methods are based on a light source located on the axis of the camera lens. four 70° pi-shaped cylinder lenses attached in front of the camera lens were typical for the orthogonal modification of the co-axial method, while a defocus of the camera lens is distinctive for the isotropic modification of the method.
2. eccentric methods are named after the eccentric position of the measuring light source in front of the camera lens on the shield occluding the part of the lens beneath the light source. the distance between the sharp edge of the shield and the source is called eccentricity and is the crucial parameter of the method. the method was completely described by bobier and braddick in 1985 [2]. a principal methodological improvement was brought about by schaeffel (1987) [3], when the light source design was changed from a point source to an array of point sources. the impractical measurement of the light crescent of the reflected measuring light in the pupilla was replaced by measuring the slope of the measuring light intensity in the pupilla. roorda (1997) showed that if the size of the light source is increased, the intensity profiles become more linear and the slope of the reflex changes linearly with the refractive state (accommodation).
3 methodology
3.1 measuring system
for the measurement, we designed a special noninvasive system. the scheme of the system is shown in fig. 2. this system periodically stimulates the patient's eye system with the two fixation pictures displayed on the fixation monitors. the face of the mother is used as the fixation picture. the ratio between accommodation and convergence is given by the geometric position of these monitors. to eliminate the effect of the patient's attention being disturbed, an infrared light point source is used as the measuring light. the doctor finds the position of the eye and starts capturing the picture sequences. the selected capturing camera was a dalsa ca-d1 with 360 frames per second, which is capable of registering an infrared measuring light.
3.2 image analysis
the image analysis is the same for each picture of the captured sequence.
the first step in our image analysis is to choose a region of interest (roi) that reduces the amount of image data (comprising only the lens and the necessary surroundings). for the subsequent image analysis, the only remaining necessary image data is a rectangular roi with the lens and a part of the iris (see fig. 3). the second step in the image analysis is to determine the threshold for partial thresholding [5]. for an automatic setting of the threshold we choose another, smaller rectangular roi on the border of the lens and the iris. in this roi, the program finds the minimum and maximum value and computes the average value, which is set as the threshold. finally, partial thresholding is applied to the roi. the third step is 8-neighborhood identification (for more details see [5]), which labels the connected shapes in the roi and removes any undesirable objects or areas other than the lens caused by head and eye movements. then the image analysis is divided into two parts. the first part is for convergence, and the second is for accommodation. for the convergence analysis it is necessary to find the horizontal position of the centre of the lens, using the following equation

xt = (1/n) Σ_j xj(j, m),

where xj is the horizontal co-ordinate of the pixel (j, m) of the shape in the picture, n is the number of pixels in the object, and xt is the co-ordinate of the centre of the lens (col). the next step determines the horizontal position of the pi. first, we do an average vertical summation (fig. 4), which is the vertical summation in the pixel columns divided by the number of nonzero pixels in the same column. then a general difference with a weighting window that eliminates local extremes is applied twice,

d(x) = ( Σ_{k=−l..l} k f(x + k) ) / ( Σ_{k=−l..l} k² ),

where x is the point where the general difference is computed, d(x) is the value of the general difference at point x, l is half of the width of the weighting window, and k is the point of the surroundings and its weight. then we find the zero-crossing point of the first difference and, using the second difference, select the crossing that represents the maximum of the original curve – the global extreme. in order to achieve precision, we interpolate the surroundings of this extreme. the result is the horizontal position of the pi. for the convergence analysis we subtract the pi position and xt, which represents the distance between the pi and the col and gives us the time course of convergence. the first step in the accommodation analysis is to remove the pi from the thresholded roi by a new partial thresholding. the new threshold is set at 70 % of the dynamic range, which eliminates the higher values of brightness that represent the i. pi.
fig. 2: scheme of the measuring system
fig. 3: extracted roi with lens and pi
fig. 4: average vertical summation
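a compact numpy sketch of these steps (partial thresholding, the centroid xt, the averaged profile and the weighted general difference reconstructed above) might look as follows. the synthetic test image and all array names are illustrative placeholders of ours; the original system of course works on real camera frames.

```python
import numpy as np

def centre_of_lens(mask):
    """xt = (1/n) * sum of x co-ordinates of the thresholded shape."""
    _, xs = np.nonzero(mask)
    return xs.mean()

def average_vertical_summation(img, mask):
    """column sums divided by the number of nonzero pixels per column."""
    counts = np.maximum(mask.sum(axis=0), 1)
    return (img * mask).sum(axis=0) / counts

def general_difference(f, l=3):
    """weighted derivative d(x) = sum_k k*f(x+k) / sum_k k^2, k = -l..l."""
    k = np.arange(-l, l + 1)
    d = np.zeros_like(f, dtype=float)
    for x in range(l, len(f) - l):
        d[x] = (k * f[x + k]).sum() / (k ** 2).sum()
    return d

# synthetic frame: dim 'lens' area with a bright purkynje spot
img = np.zeros((64, 64))
img[20:44, 18:46] = 0.4            # lens
img[30:33, 40:43] = 1.0            # i. purkynje image
mask = img > 0.2                   # partial thresholding

profile = average_vertical_summation(img, mask)
d1 = general_difference(profile)   # its zero crossing locates the pi peak
print("col:", centre_of_lens(mask), "pi column:", profile.argmax())

# accommodation part: slope a1 of the middle of the horizontal profile
rows = average_vertical_summation(img.T, mask.T)
mid = slice(len(rows) // 4, 3 * len(rows) // 4)
a1, a0 = np.polyfit(np.arange(len(rows))[mid], rows[mid], 1)
print("slope a1:", a1)
```

the subtraction of the pi column from the col then yields one sample of the convergence curve per frame, and the fitted slope a1 yields one sample of the accommodation curve, exactly as described in the text.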
the average horizontal summation (the same as in the convergence analysis, but in the horizontal direction) is presented in fig. 5. this is the horizontal summation in the pixel rows divided by the number of nonzero pixels in the same row. by fitting the middle part of this curve we get the following polynomial:

y = a0 + a1 x.

for the accommodation analysis we use the coefficient a1, which represents the slope of the curve.
3.3 calibrating the curves
the last step involves calibrating both curves to the corresponding units (diopter and degrees of movement). this calibration depends on the geometrical position of both monitors and of the patient. for the calibration of accommodation we use the distances of the fixation monitors, which represent the relative defocus, and the starting dioptric power, which is computed as the average of the first 25 values of a1 (before accommodation starts – fixation on the first monitor). the calibration of the convergence is an angle transformation between the view axis of the eye and the camera, and it is done separately for each eye. the range of convergence is given by the position of the fixation monitors. the start angle is computed as the average of the first 25 values of the relative distance between the pi and the col (before convergence starts – fixation on the first monitor).
4 results
a unique system for automatic measurement of accommodation and convergence has been designed and implemented, as shown in fig. 2. tests have shown that the system is capable of detecting the i. pi and the eccentric retinal refraction. fig. 3 shows the most important part of the recorded picture (roi) with high system resolution and sensitivity. the results of the automatic data analysis and calibration are presented in fig. 6 as a graph of the time course of accommodation and convergence.
5 conclusions
we have designed a universal system for measuring the synkinetic reaction (accommodation and convergence) based on the i. pi position and eccentric photorefraction. this system has many advantages, e.g., it is noninvasive, automatic, cheap and easy to operate. its precision is good enough for clinical practice. using this system we are able to obtain a time curve of the synkinetic reaction. these curves enable us to recognize and diagnose a range of eye defects and squints.
6 acknowledgment
this research work has been supported by research program no. msm 210000012 "transdisciplinary biomedical engineering research" of the czech technical university in prague (sponsored by the ministry of education youth and sports of the czech republic) and partly supported by the grant-funded project gačr no. 102/00/1494.
fig. 5: average horizontal summation
fig. 6: results of image analysis (time course of convergence, angle [°], and accommodation, dioptric power [d], versus time [ms])
references
[1] purkyně j. e.: commentario de examine physiologico organi visus, breslau, 1823.
[2] bobier w. r. & braddick: "eccentric photorefraction: optical analysis and empirical measures". am. j. optom. physiol. opt., vol. 62, 1985, p. 614–620.
[3] schaeffel f., farkas l., howland h. c.: "infrared photoretinoscope". appl. opt., vol. 26, 1987, p. 1505–1509.
[4] roorda a., campbell m. c. w., bobier w. r.: "slope-based eccentric photorefraction: theoretical analysis of different light source configurations and effects of ocular aberrations". j. opt. soc. am. a., vol. 14, 1997, p. 2547–2556.
[5] šonka m., hlaváč v., boyle r.: image processing, analysis and machine vision.
pws, boston, usa, second edition, 1998, p. 695.
ing. jaroslav dušek, phone: +420 224 352 113, fax: +420 233 339 801, e-mail: xdusekj@feld.cvut.cz, department of radioelectronics, czech technical university, faculty of electrical engineering, technická 2, 168 00 prague 6, czech republic
mudr. miroslav dostálek, phone: +420 604 148 517, e-mail: dostalek@lit.cz, department of ophthalmology, litomyšl hospital, purkyňova 919, 70 01 litomyšl, czech republic
acta polytechnica vol. 42 no. 1/2002
pumping efficiency of screw agitators in a tube
f. rieger
most information on pumping efficiency that is available in the literature is limited to the turbulent region (centrifugal pumps). the aim of this paper is to show the effect of the reynolds number on the pumping efficiency of screw agitators for a wide range of reynolds number values, from the creeping to the turbulent flow region. the dependence of pumping efficiency on the reynolds number extends our knowledge about the efficiency of classical impeller pumps, which is usually restricted to the turbulent region.
keywords: pumping efficiency, screw agitator.
1 introduction
screw agitators rotating in tubes are very efficient tools for mixing and pumping viscous liquids. they are also suitable for cases where the viscosity changes during operation. the effect of the reynolds number on the pumping characteristics was shown in [1]. the influence of geometry on the pumping characteristics of screw agitators was presented in [2]. paper [3] was devoted to the effect of the reynolds number on the power characteristics. both the pumping and the power characteristics enable us to calculate the pumping efficiency, defined as the ratio of fluid power to power input. the aim of this paper is to show the effect of the reynolds number on the pumping efficiency.
2 theoretical background
the pumping characteristic is the dependence of the specific energy e (transferred to the unit mass of fluid by the agitator) on the pumping capacity q. as shown in [4], it is advantageous to express this in a dimensionless form as

e* = f(q*, re). (1)

as can be seen from [1] and [2], for screw agitators the dependence of e* on q*, for a given re, can be approximated by a straight line

e* = e*max (1 − q*/q*max). (2)

the values e*max and q*max are independent of the reynolds number in the creeping flow region.
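the dimensionless groups used throughout this section follow the definitions listed at the end of the paper (q* = q/(nd³), e* = e/(νn), re = nd²/ν). a short helper like the one below converts a dimensional operating point into these groups; the numerical values in the example are hypothetical and are not taken from the cited measurements.

```python
def dimensionless_point(q, e, n, d, nu):
    """q [m^3/s], e [J/kg], n [1/s], d [m], nu [m^2/s] ->
    (q*, e*, Re) per the symbol list of the paper."""
    q_star = q / (n * d ** 3)
    e_star = e / (nu * n)
    re = n * d ** 2 / nu
    return q_star, e_star, re

# hypothetical operating point of a laboratory screw agitator
print(dimensionless_point(q=2.0e-4, e=1.5, n=5.0, d=0.1, nu=1.0e-4))
# -> (0.04, 3000.0, 500.0)
```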
in the turbulent region, the value e*max ∝ re, and eq. (2) transforms to the form

e+ = e+max (1 − q*/q*max) (3)

independent of the reynolds number. the power characteristic is the dependence of the power consumption p on the specific energy e. after an inspection analysis of the governing equations, the following relationship for the dimensionless power characteristic was proposed in [4]:

p* = p*(e*, re). (4)

as was shown in [3], the dimensionless power characteristic can be approximated by a linear relation

p* = c + a e*. (5)

the values of the coefficients c and a are independent of the reynolds number in the creeping flow region. in the turbulent region, the values of a and c/re are independent of re, and the power characteristic can be approximated by the following equation:

po = c/re + a e+. (6)

the pumping efficiency η (defined as the ratio of fluid power to power input) can be calculated by the following relation (see e.g. [5])

η = ρ q e / p (7)

or in dimensionless form

η = q* e* / p* = q* e+ / po. (8)

3 effect of reynolds number on pumping efficiency
the procedure for calculating the pumping efficiency will be illustrated on a screw agitator characterized by the following dimensionless geometrical parameters: s/d = 2, d1/d = 0.2, l/d = 1.4, dt/d = 1.1 (see fig. 3 in [3]). with reference to [1], the plots of the parameters q*max and e*max against the reynolds number shown in figs. 1 and 2 can be obtained using the values presented in [2]. inserting these values into eq. (2), we obtain the pumping characteristics depicted in fig. 3. the dimensionless specific energy e+ instead of e* is recommended for regions with high reynolds number values. the dependence of e+max on the reynolds number is depicted in fig. 4. inserting the values from figs. 1 and 4 into eq. (3), the pumping characteristics shown in fig. 5 are obtained. the dependencies of c and a on the reynolds number and the power characteristics of the given screw agitator were presented in [3].
fig. 1: dependence of maximum dimensionless pumping capacity on reynolds number
using the pumping and power characteristics, the values of the pumping efficiency can be calculated from eq. (8). the dependence of the efficiency on the dimensionless pumping capacity calculated for selected reynolds number values is shown in fig. 6. from this figure it can be seen that the efficiency is very low in the creeping flow region (re < 10). this is due to the fact that most of the energy is spent on viscous friction in the screw at low reynolds number values. it can also be seen that the efficiency increases with increasing reynolds number values. the dependence of the maximum efficiency on the reynolds number is shown in fig. 7.
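combining the linear pumping characteristic (2) with the linear power characteristic (5) gives η(q*) directly from eq. (8), and the location of the maximum can simply be read off a grid. the sketch below evaluates that composition; the coefficient values are hypothetical placeholders, since the actual fitted values of e*max, q*max, c and a are given in [1–3], not here.

```python
import numpy as np

def efficiency(q_star, e_max, q_max, c, a):
    """eta = q* e* / p*, with e* = e_max (1 - q*/q_max), p* = c + a e*."""
    e_star = e_max * (1.0 - q_star / q_max)
    p_star = c + a * e_star
    return q_star * e_star / p_star

# hypothetical coefficients for a single reynolds number
e_max, q_max, c, a = 300.0, 0.4, 60.0, 1.2
q = np.linspace(0.0, q_max, 401)
eta = efficiency(q, e_max, q_max, c, a)
i = eta.argmax()
print(f"max efficiency {eta[i]:.3f} at q* = {q[i]:.3f}")
```

because e* vanishes at q* = q*max and q* vanishes at the origin, η is zero at both ends of the characteristic and the maximum always lies in between, which is the shape seen in fig. 6.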
from fig. 7 it can be seen that the maximum efficiency increases from 4.7 % in the creeping flow region to 50 % in the turbulent region. this means that only 4.7 % of the power is transferred to the fluid near the maximum in the creeping flow region, and 50 % in the turbulent region. the relation between efficiency and reynolds number presented in fig. 6 extends our knowledge about the efficiency of classical impeller pumps, for which valid results have until now been restricted mostly to the turbulent flow region.
acknowledgement
this research was supported by ga čr grant no. 101/99/0638.
fig. 2: dependence of maximum dimensionless specific energy (e*max) on reynolds number
fig. 3: dimensionless pumping characteristics at low reynolds number values
fig. 4: dependence of maximum dimensionless specific energy (e+max) on reynolds number
fig. 5: dimensionless pumping characteristics at high reynolds number values
fig. 6: plots of pumping efficiency on dimensionless pumping capacity at different reynolds number values
fig. 7: dependence of maximum efficiency on reynolds number
symbols
a, c coefficients in eq. (5)
d agitator diameter [m]
d1 root diameter [m]
dt tube diameter [m]
e specific energy [j kg⁻¹]
e* dimensionless specific energy, e* = e/(ν n)
e*max maximum dimensionless specific energy
e+ dimensionless specific energy, e+ = e/(n² d²)
e+max maximum dimensionless specific energy
l length [m]
n agitator speed [s⁻¹]
p power [w]
p* dimensionless power, p* = p/(μ n² d³)
po power number, po = p/(ρ n³ d⁵)
q volumetric flow rate [m³ s⁻¹]
q* dimensionless pumping capacity, q* = q/(n d³)
q*max maximum dimensionless pumping capacity
re reynolds number, re = n d²/ν
s pitch [m]
η efficiency
μ dynamic viscosity [pa s]
ν kinematic viscosity [m² s⁻¹]
ρ density [kg m⁻³]
references
[1] rieger, f.: pumping characteristics of a screw agitator in a tube. chem. eng. j. 66, 1997, pp. 73–77
[2] sedláček, l., rieger, f.: influence of geometry on pumping characteristics of screw agitators. collect. czech. chem. commun. 62, 1997, pp. 1871–1878
[3] rieger, f.: power characteristics of a screw agitator in a tube. acta polytechnica, vol. 41, no. 6/2001, pp. 3–6
[4] rieger, f., weiserová, h.: determination of operating characteristics of agitators in tubes. chem. eng. technol. 16, 1993, pp. 172–179
[5] mccabe, w. l., smith, j. c., harriott, p.: unit operations of chemical engineering. mcgraw-hill, n.y. 1993
prof. ing. františek rieger, drsc., e-mail: rieger@fsid.cvut.cz, department of process engineering, czech technical university in prague, faculty of mechanical engineering, technická 4, 166 07 prague 6, czech republic
acta polytechnica doi:10.14311/ap.2014.54.0191 acta polytechnica 54(3):191–196, 2014 © czech technical university in prague, 2014 available online at http://ojs.cvut.cz/ojs/index.php/ap
distinguishing between spot and torus models of high-frequency quasiperiodic oscillations
vladimír karas a,∗, pavel bakala b, gabriel török b, michal dovčiak a, martin wildner b, dalibor wzientek b, eva šrámková b, marek abramowicz b,c,d, kateřina goluchová b, grzegorz p. mazur d,e, frédéric h. vincent d,f
a astronomical institute, boční ii 1401, cz-14100 prague, czech republic
b institute of physics, faculty of philosophy and science, silesian university in opava, bezručovo nám. 13, cz-74601 opava, czech republic
c physics department, gothenburg university, se-412 96 göteborg, sweden
d copernicus astronomical center, ul. bartycka 18, pl-00 716 warszawa, poland
e institute for theoretical physics, university of warsaw, hoza 69, pl-00 681 warsaw, poland
f laboratoire astroparticule et cosmologie, cnrs, université paris diderot, 10 rue alice domon et leonie duquet, 75205, paris cedex 13, france
∗ corresponding author: vladimir.karas@cuni.cz
abstract. in the context of high-frequency quasi-periodic oscillations (hf qpos) we further explore the appearance of an observable signal generated by hot spots moving along quasi-elliptic trajectories close to the innermost stable circular orbit in the schwarzschild spacetime. the aim of our investigation is to reveal whether observable characteristics of the fourier power-spectral density can help us to distinguish between the two competing models, namely, the idea of bright spots orbiting on the surface of an accretion torus versus the scenario of intrinsic oscillations of the torus itself. we take the capabilities of the present observatories (represented by the rossi x-ray timing explorer, rxte) into account, and we also consider the proposed future instruments (represented here by the large observatory for x-ray timing, loft).
keywords: x-rays: binaries, accretion, accretion disks, black hole physics.
1. introduction
low-mass x-ray binaries (lmxbs) represent a particular example of astronomical sources in which puzzling quasi-periodic modulation of the observed signal can develop and reach very high (kilohertz) frequencies [25]. orbiting inhomogeneities, a.k.a. spots residing on the surface of an accretion disk, have been introduced as a tentative explanation for persistent high-frequency oscillations (hf qpos) observed in x-rays from accreting black holes and neutron stars in lmxbs. these qpos have been discussed, in particular, in a series of papers [12, 16, 20–23] that explain qpos as a direct manifestation of modes of relativistic epicyclic motion of blobs at various radii r in the inner parts of the accretion disc. within the model, the two observed hf qpo peaks arise due to the keplerian and periastron precession of the relativistic orbits. in a similar manner, the authors of [7, 8, 14] introduced a concept according to which the qpos are generated by a tidal disruption of large accreting inhomogeneities. recently, [9] has argued that the behaviour of the azimuth phase φ(t) for non-closed quasi-elliptic orbits in the curved spacetime can be responsible for the observed pairs of hf qpos. in this contribution we investigate the behaviour of the observable signal produced by radiating small circular hotspots.
we discuss the detectability of the produced signal propagated from the strongly curved spacetime region, and in particular we concentrate on the properties of the fourier power-spectral density (pds), as predicted in different scenarios. it is known that the pds reflects the physical properties of the source, but that it does not provide a complete description. we thus explore a possible way to discriminate via pds between the two specific models that are thought to provide a promising scheme for hf qpos in black hole sources. in fact, according to the wiener-khinchin theorem, the power spectral function s(ω) can be calculated from the autocovariance function c(t) as [5]

s(ω) = f[c](ω). (1)

it can be seen that only the gaussian processes are completely determined by their power spectra. nonetheless, it is interesting to investigate the predicted profiles of the power spectrum for well-defined processes and to reveal specific features, their differences and similarities. in our discussion we consider the capabilities of present x-ray observatories, represented by the rossi x-ray timing explorer (rxte), as well as the proposed future instruments, represented here by the large observatory for x-ray timing (loft). we also compare the signal produced by spots to the signal obtained from another specific kind of simulations assuming axisymmetric epicyclic disc-oscillation modes. our present paper is based on a recent work [4] where further details can be found. in particular, in the cited paper we explore the width of the qpo peaks with a 3:2 frequency ratio that has been reported to occur in a number of sources.
figure 1. left: sketch of the qpo model based on spots orbiting close to two preferred radii, producing the 3:2 ratio of observed frequencies. right: illustration of the role of lensing and doppler effects in the visual appearance of a torus located at the resonant orbit r3:2. figure adopted from [4].
figure 2. comparison between multiple spot and oscillating torus pds obtained for the two instruments. superimposed red curves indicate various multi-lorentzian models.
2. set-up of the model
qualitative mathematical properties of fourier power spectra, as predicted by the hot-spot scenario, have been studied in [17, 18]. in [12], an example of a quantitative study of the expected pds was presented for a particular case of orbital motion of the spots; their distribution within a narrow range of radii was assumed and explored. the adopted phenomenological description of the source employs spots or clumps orbiting in the schwarzschild metric as an approximation to more realistic models of inhomogeneous disc-type accretion flows around black holes and neutron stars. to reproduce the quality factor of the observed qpos, it was found necessary that the spots must be distributed in a zone of only several gravitational radii from the central black hole, with their observed luminosity influenced by doppler and lensing effects. furthermore, we found that the expected values of the quality factor of the oscillations can reach about several hundred (typically, q ≈ 3 × 10²).
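the quoted quality factors can be made concrete with a toy monte-carlo: a sinusoid whose phase decoheres on a finite lifetime produces a lorentzian-like pds peak whose q = ν0/fwhm is set by that lifetime. the sketch below works under simplified assumptions of ours (fully re-randomized phase at fixed intervals, no relativistic modulation); it is a schematic illustration, not the ray-traced simulation used in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# oscillation with finite coherence: the phase is re-drawn at random
# every 'lifetime' seconds, mimicking spots being created and destroyed
fs, T, f0, lifetime = 2048.0, 256.0, 160.0, 1.0      # hz, s, hz, s
t = np.arange(0, T, 1.0 / fs)
seg = (t // lifetime).astype(int)
x = np.sin(2 * np.pi * f0 * t + rng.uniform(0, 2 * np.pi, seg.max() + 1)[seg])

# averaged periodogram over 8-s segments, then q = f0 / fwhm of the peak
nfft = int(8 * fs)
chunks = x[: (x.size // nfft) * nfft].reshape(-1, nfft)
pds = (np.abs(np.fft.rfft(chunks, axis=1)) ** 2).mean(axis=0)
freq = np.fft.rfftfreq(nfft, 1.0 / fs)

peak = pds.argmax()
above = np.nonzero(pds > 0.5 * pds[peak])[0]
q = freq[peak] / (freq[above.max()] - freq[above.min()])
print(f"peak {freq[peak]:.1f} hz, q ~ {q:.0f}")   # of order 10^2 here
```

longer spot lifetimes narrow the peak and raise q, which is why the observed q of several hundred constrains the spots to survive for many orbital periods.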
however, the assumption of strictly circular orbits was clearly a simplification which calls for further discussion in the present paper. furthermore, the effect of background noise has to be taken properly into account [26]. the left panel of figure 1 illustrates the spot scenario which we investigate. the radial distribution or drifting of the spots can clearly result in various levels of signal coherence. nevertheless, small circular spots related to a single preferred radius cannot reproduce the often observed 3:2 frequency ratio. we also consider a more elaborate scheme where the multiple spots are created and drifted around radii close to two preferred orbits with keplerian frequencies roughly in the 3:2 ratio. these orbits are set as r2 = 8m (the radius where the radial epicyclic frequency reaches the maximum value) and r3 = 6m (isco). spots are then created within the regions [ri − δr, ri] with the size given by δr = 0.75m. since we assume the black hole mass m ≈ 11 m⊙, our setup leads to the main observable frequencies around 110 hz and 160 hz (see figure 2 drawn for the signal fraction n = 10% and d = 65°). in the following consideration we compare the pds obtained for this setup to the pds resulting from the model of the oscillating optically thin torus slowly drifting through the resonant radius r3:2.
figure 3. the resulting pds obtained for different levels of the signal fraction n assuming d = 80° and e = 1m. for the sake of clarity, the individual pds are rescaled by a unifying factor s. left: outputs considering the rxte capabilities. right: outputs considering loft capabilities. one should note that the lowest displayed values of n corresponding to gray and yellow colour do not indicate any significant features within the rxte pds. on the other hand, the loft pds already reveal the keplerian frequency (gray and yellow pds), respectively its first two harmonics (yellow pds).
to define the torus kinematics, we assume the m = 0 radial and vertical oscillations with equal intrinsic amplitudes. the possible qpo origin in the resonances between this or similar disc oscillation modes has been extensively discussed [1–3, 10, 13, 24]. here we adopt the concept previously investigated by [6], who focused on an optically thin torus with a slender geometry. the visual appearance of a torus influenced by lensing and doppler effects is illustrated in the right panel of figure 1. within the adopted concept the periodic changes of the observed luminosity are partially governed by the radial oscillations, due to changes of the torus volume, while the vertical oscillations modulate the flux just due to lensing effects in the strong gravitational field. the contribution of the two individual oscillations to the variations of the observed flux thus strongly depends on the inclination angle (see also [15]). here we set d = 65°, where the fractions of the power in the two observed peaks are comparable. we set the black hole mass m = 5.65 m⊙ and a = 0 (r3:2 = 10.8m), implying that the two oscillatory frequencies are νθ(r3:2) = 160 hz and νr(r3:2) ≈ 110 hz. assuming this setup we produce torus drift lightcurves for the interval r/r3:2 ∈ [0.97, 1.03]. the resulting pds drawn for the signal fraction n = 10 % is included in figure 2.
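a schematic (and deliberately naive) way to see how two such oscillation modes imprint a 3:2 pair of peaks on the pds is to modulate a constant flux at νθ = 160 hz and νr = 110 hz and fourier-transform the result. the amplitudes and noise level below are arbitrary choices of ours, and none of the relativistic transfer effects of the actual model are included.

```python
import numpy as np

# toy torus lightcurve: flux modulated at the vertical and radial
# m = 0 mode frequencies; amplitudes and noise are arbitrary
fs, T = 2048.0, 64.0
t = np.arange(0, T, 1.0 / fs)
flux = 1.0 + 0.05 * np.sin(2*np.pi*160.0*t) + 0.05 * np.sin(2*np.pi*110.0*t)
flux += 0.02 * np.random.default_rng(0).standard_normal(t.size)

pds = np.abs(np.fft.rfft(flux - flux.mean())) ** 2 / t.size
freq = np.fft.rfftfreq(t.size, 1.0 / fs)
top = freq[np.argsort(pds)[-2:]]
print(sorted(top))   # ~[110.0, 160.0] -> the 3:2 pair
```

unlike the orbiting-spot signal, this purely harmonic modulation carries no overtone structure, which is precisely the discriminating feature discussed below.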
we note that similar pds can be reached assuming, e.g., a near-extreme rotating black hole with a = 0.98 and m ≈ 18 m⊙. using figure 2 we can finally confront the predictions obtained for spots drifting around preferred radii with those expected for the oscillating torus slowly passing the resonant orbit r3:2. inspecting this figure, we find that the rxte pds obtained for the given setup of the two models are rather similar. on the other hand, the loft pds clearly reveal the presence/absence of the harmonics additional to the 3:2 peaks, representing the signature of spot motion. we assume the global source flux described by approximations of the spectral distribution n(e) and the power density spectrum p(ν),

n(e) = k e^(−2.5), (2)

p(ν) = p0 ν^(p1) + (1/π) p3 p4 / ((ν − p2)² + p3²), (3)

where k is chosen to normalize the assumed count rate roughly to 1 crab and pi = [0.001, −1.3, 2.5, 0.8, 0.002]. this setup roughly corresponds to the so-called high steep power law (hspl) state in grs 1915+105.

3. expected properties of pds

figure 3 shows the pds resulting from the rxte and loft simulations assuming various levels of n and the inclination d = 80° (i.e., a nearly equatorial view). since the signal from the spot strongly depends on the source inclination d (see, e.g., [19]), we will compare the results for two representative values of d corresponding to the nearly equatorial view, d = 80°, and the view close to the vertical axis, d = 30°. in the following, we assume a spot orbiting at r = 6.75m with constant angular velocity of the keplerian value ωk. the spot trajectory deviates slightly from the circular shape due to the radial epicyclic oscillation, having a small amplitude e > 0.

3.1. nearly equatorial (edge-on) view

in figure 4 (upper panels) we include amplitude spectra and time-dependent energy spectra of the net spot signal calculated for the distant observer at d = 80°.

figure 4. expected net spot flux measured by a distant observer for different inclination angles. left: amplitude spectrum. right: time-dependent energy spectra drawn for the distant observer.

the spot signal is dominated by the keplerian frequency and its harmonics amplified by relativistic effects, which is well illustrated by the amplitude spectrum in the upper left panel of figure 4. the eccentricity corresponding to the amplitude of the radial epicyclic oscillation e = 0.1m causes only negligible modulation at the radial and precession frequencies. the increased eccentricity corresponding to the amplitude of the radial epicyclic oscillation e = 1m can be well recognized in the amplitude spectra, but the signal is still dominated by the keplerian frequency and its harmonics. the time-dependent energy spectra of the spot are depicted in the upper right panel of figure 4. we can see that they clearly reveal the signatures of relativistic redshift effects. so far we have reproduced just the variability and the spectra of the net spot flux. in order to assess the observable effects we have to study the total composition of the net spot flux together with the global source flux given by equations (2) and (3). assuming this composed radiation, we can consider the capabilities of the rxte and loft instruments using their response matrices and the provided software tools. the time-dependent spectra describing the composed radiation are then convolved with the appropriate response matrix, giving an estimate of the observed data. these are fourier transformed to obtain the resulting power spectra.
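for illustration, the background model of eqs. (2) and (3) can be evaluated directly. the sketch below uses the parameter vector quoted above; the normalisation k to roughly 1 crab depends on the instrument response and is therefore left as a free argument.

import numpy as np

p0, p1, p2, p3, p4 = 0.001, -1.3, 2.5, 0.8, 0.002

def n_spec(e_kev, k=1.0):
    """photon spectrum n(e) = k e^-2.5, eq. (2); k set by the ~1 crab rate."""
    return k * e_kev**-2.5

def pds_model(nu):
    """power-law continuum plus a lorentzian hump, eq. (3)."""
    return p0 * nu**p1 + (p3 * p4 / np.pi) / ((nu - p2)**2 + p3**2)

nu = np.logspace(-2, 2, 5)
print(pds_model(nu))          # background power at a few frequencies [hz]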
within such a consideration, the detectability of the spot signatures obviously depends on the fraction of photons from the spot in the total flux. we refer to this signal-to-noise ratio as the signal fraction n for short. figure 3 includes pds resulting from the rxte and loft simulations assuming various levels of n and the inclination d = 80°. it includes the cases when the signal is too weak for the rxte and there are no significant features within its pds, as well as the high signal fraction when the first two harmonics of the keplerian frequency can be seen. comparing the two panels of this figure we can deduce that when the weak qpo signal corresponding to the hot-spot keplerian frequency is around the limits of rxte detectability, the loft observations can clearly reveal its first and second harmonics. we checked that there is, in practice, no qualitative difference between the cases of e = 0.1m and e = 1m. it is therefore unlikely that the periastron precession or radial epicyclic frequency can be detected in addition to the harmonics when the inclination angle is close to the equatorial plane.

3.2. view close to the vertical axis

for d = 30° (lower panels of figure 4), the signal is dominated by the keplerian frequency, but the harmonics are much less amplified in comparison to the nearly equatorial view (see the bottom left panel of figure 4). the eccentricity corresponding to the amplitude of the radial epicyclic oscillation e = 0.1m again causes a rather negligible modulation at the radial and precession frequencies. nevertheless, we can see that the increased eccentricity of e = 1m affects the variability more than for the large inclination angle. furthermore, its influence is comparable to that of the second harmonic of the keplerian frequency. the time-dependent energy spectra are again depicted in the lower right panel of figure 4. finally, figure 5 shows the pds resulting from the rxte and loft simulations assuming various levels of n. it is drawn for e = 1m and includes a few cases when the signal is too weak for the rxte and there are no significant features within its pds, plus one case when some feature at the keplerian frequency can be seen.

figure 5. the resulting pds obtained for different levels of n, assuming d = 30° and e = 1m (peaks near 160, 320 and 480 hz). for the sake of clarity, the individual pds are rescaled by a unifying factor s. left: outputs considering the rxte capabilities. right: outputs considering the loft capabilities. we note that the rxte pds include a barely significant excess of power at 160 hz only for the highest displayed value of n. for the same value, the loft pds reveal the first two harmonics and also the radial epicyclic frequency (53 hz).

comparing the two panels of this figure we can deduce that when the weak qpo signal corresponding to the hot-spot keplerian frequency is around the limits of rxte detectability, the loft observations can clearly reveal its first and second harmonics and also the radial epicyclic frequency νr. although it is not directly shown, we checked that decreasing the eccentricity to e = 0.1m leads to similar pds but with the missing feature at νr.
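the role of the signal fraction n can be illustrated with a deliberately crude toy model: a periodic spot modulation with a few harmonics (a stand-in for the doppler and lensing amplification), diluted in a steady background and sampled with poisson counting noise. the rates and amplitudes below are invented for the sketch and are not taken from the simulations discussed above.

import numpy as np

rng = np.random.default_rng(1)
dt, T = 1.0 / 4096, 64.0                        # time bin [s], duration [s]
t = np.arange(0.0, T, dt)
rate_total, n_frac, nu_k = 1.0e4, 0.10, 160.0   # cts/s, signal fraction, hz

# non-negative spot modulation with two overtones of the keplerian frequency
spot = 1.0 + 0.8 * np.cos(2*np.pi*nu_k*t) + 0.5 * np.cos(4*np.pi*nu_k*t) \
           + 0.3 * np.cos(6*np.pi*nu_k*t)
rate = rate_total * ((1.0 - n_frac) + n_frac * spot / spot.mean())
counts = rng.poisson(rate * dt)

# leahy-normalised periodogram: poisson noise level ~ 2
pds = 2.0 * np.abs(np.fft.rfft(counts - counts.mean()))**2 / counts.sum()
freqs = np.fft.rfftfreq(len(counts), dt)
for k in (1, 2, 3):                             # power near 160/320/480 hz
    sel = np.abs(freqs - k * nu_k) < 2.0
    print(k * nu_k, pds[sel].max())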
4. conclusions

we can identify the signatures of spot motion mostly with the harmonic content of the observable signal. for large inclination angles, the loft observations could easily reveal the keplerian frequency of the spot together with its first and second harmonics when the strongest (but weak) single signal is around the limits of rxte detectability. nevertheless, the radial epicyclic frequency could also be found, provided that the inclination is small. in our analysis we have paid attention to the timing signatures of the motion of small circular spots radiating isotropically from slightly eccentric geodesic orbits. the case of highly eccentric orbits and/or spots having a large azimuthal shear will be presented elsewhere. we have studied the comparison between the spot and torus scenarios serving as fiducial representations of two very specific kinematic models. obviously, any general validity of this discussion is limited. for instance, a consideration of resonance-driven effects or of the role of the torus geometrical thickness could give rise to some harmonic content in the signal from the oscillating tori. despite these uncertainties, the elaborated comparison indicates clearly that the increased sensitivity of the proposed loft mission can be crucial for resolving the nature of qpos. we refer the reader to [4] for a more extended discussion and additional details.

acknowledgements

we acknowledge the czech science foundation projects gačr 14-37086g – albert einstein center for gravitation and astrophysics in prague (vk) and gačr 209/12/p740 (gt, eš), the european union 7th framework programme no. 312789 'stronggravity' (md), and the cost action mp1304 'exploring fundamental physics with compact stars'. international collaboration of the institute of physics is supported by the 'synergy' project cz.1.07/2.3.00/20.0071, and the astronomical institute is supported by research program rvo:67985815.

references

[1] abramowicz, m. a., & kluźniak, w. 2001, a&a, 374, l19, doi: 10.1051/0004-6361:20010791
[2] abramowicz, m. a., bulik, t., bursa, m., & kluźniak, w. 2003a, a&a, 404, l21, doi: 10.1051/0004-6361:20030737
[3] abramowicz, m. a., karas, v., kluźniak, w., lee, w. h., & rebusco, p. 2003b, pasj, 55, 466, doi: 10.1093/pasj/55.2.467
[4] bakala, p., török, g., karas, v., dovčiak, m., wildner, m., wzientek, d., šrámková, e., abramowicz, m., goluchová, k., mazur, g. p., & vincent, f. h. 2014, mnras, 439, 1933, doi: 10.1093/mnras/stu076
[5] bendat, j. s., & piersol, a. g. 2000, random data: analysis and measurement procedures (new york: wiley)
[6] bursa, m., abramowicz, m. a., karas, v., & kluźniak, w. 2004, apj, 617, l45, doi: 10.1086/427167
[7] čadež, a., calvani, m., & kostić, u. 2008, a&a, 487, 527, doi: 10.1051/0004-6361:200809483
[8] czerny, b., lachowicz, p., dovčiak, m., karas, v., pecháček, t., & das, t. k. 2010, a&a, 524, id. a26, doi: 10.1051/0004-6361/200913724
[9] germana, c. 2013, mnras, 430, l1, doi: 10.1093/mnrasl/sls036
[10] horák, j. 2005, astronomische nachrichten, 326, 824, doi: 10.1002/asna.200510421
[11] karas, v. 1999a, apj, 526, 953, doi: 10.1086/308015
[12] karas, v. 1999b, pasj, 51, 317, doi: 10.1093/pasj/51.3.317
[13] kluźniak, w., abramowicz, m. a., kato, s., lee, w. h., & stergioulas, n.
2004, apj, 603, l89, doi: 10.1086/383143
[14] kostić, u., čadež, a., calvani, m., & gomboc, a. 2009, a&a, 496, 307, doi: 10.1051/0004-6361/200811059
[15] mazur, g. p., vincent, f. h., johansson, m., šrámková, e., török, g., bakala, p., & abramowicz, m. a. 2013, a&a, 554, id. a57, doi: 10.1051/0004-6361/201321488
[16] morsink, s. m., & stella, l. 1999, apj, 513, 827, doi: 10.1086/306876
[17] pecháček, t., goosmann, r. w., karas, v., czerny, b., & dovčiak, m. 2013, a&a, 556, id. a77, doi: 10.1051/0004-6361/201220339
[18] pecháček, t., karas, v., & czerny, b. 2008, a&a, 487, 815, doi: 10.1051/0004-6361:200809720
[19] schnittman, j. d., & bertschinger, e. 2004a, apj, 606, 1098, doi: 10.1086/383180
[20] stella, l., & vietri, m. 1998a, in abstracts of the 19th texas symposium on relativistic astrophysics and cosmology, eds. j. paul, t. montmerle, & e. aubourg (saclay, france: cea)
[21] stella, l., & vietri, m. 1998b, apj, 492, l59, doi: 10.1086/311075
[22] stella, l., & vietri, m. 1999, physical review letters, 82, 17, doi: 10.1103/physrevlett.82.17
[23] stella, l., & vietri, m. 2002, in the ninth marcel grossmann meeting, proc. mgixmm meeting, eds. v. g. gurzadyan, r. t. jantzen, & r. ruffini, part a, 426
[24] török, g., abramowicz, m. a., kluźniak, w., & stuchlík, z. 2005, a&a, 436, 1, doi: 10.1051/0004-6361:20047115
[25] van der klis, m. 2006, in compact stellar x-ray sources, ed. w. h. g. lewin & m. van der klis (cambridge: cambridge univ. press), p. 39
[26] witzel, g., eckart, a., bremer, m., zamaninasab, m., shahzamanian, b., valencia-s., m., schödel, r., karas, v., lenzen, r., marchili, n., sabha, n., garcía-marín, m., buchholz, r. m., kunneriath, d., & straubmeier, c. 2012, apjs, 203, id. 18, doi: 10.1088/0067-0049/203/2/18

acta polytechnica 54(3):191–196, 2014

biological systems thinking for control engineering design

d. j. murray-smith

abstract artificial neural networks and genetic algorithms are often quoted in discussions about the contribution of biological systems thinking to engineering design. this paper reviews work on the neuromuscular system, a field in which biological systems thinking could make specific contributions to the development and design of automatic control systems for mechatronics and robotics applications. the paper suggests some specific areas in which a better understanding of this biological control system could be expected to contribute to control engineering design methods in the future. particular emphasis is given to the nonlinear nature of elements within the neuromuscular system and to processes of neural signal processing, sensing and system adaptivity. aspects of the biological system that are of particular significance for engineering control systems include sensor fusion, sensor redundancy and parallelism, together with advanced forms of signal processing for adaptive and learning control.

keywords: biology, control systems, design, neuro-muscular system.

1 introduction

for many years engineering researchers have been interested in exploring ways in which knowledge about biology can contribute to engineering design. an early stimulus to research efforts in that area came from the much-publicised and innovative work of norbert wiener, including his introduction of the word "cybernetics", meaning the study of control principles in man and machine. later milestones include the publication in the 1960s of the proceedings of an agard conference [1] on the theme of "bionics", which included some fascinating contributions that showed just how much biologically-inspired engineering research was under way by then, both in europe and in north america. more recent examples where biology has provided important links that have led to widely-used design tools include evolutionary methods of optimisation (such as genetic algorithms and genetic programming) and artificial neural networks.
in both of these areas biological analogies and biological thinking have contributed to the tools as they are used at present, but it is clear that the biological context of much of the original research is now largely irrelevant to those who routinely use the techniques in solving engineering problems. also, although biological thinking did contribute significantly to their development, it is important to point out that the most widely used forms of artificial neural networks have only some features that reflect properties of networks of real biological neurons. similarly, in the evolutionary computing field, although the algorithms are firmly based on the principles of survival of the fittest, only some aspects of evolutionary biology have been applied within the forms of genetic algorithm that are in general use today. in considering living systems from an engineering perspective it is clear that there are some fundamental similarities and differences between natural systems and human-engineered systems. both types of system show behaviour that is based on the same physical, chemical and thermodynamic laws and principles. on the other hand, natural systems sample a design space through the processes of evolution and optimisation that involves small changes to an existing system. they are also strongly integrated at a variety of levels including the molecular, cellular, organ and eco-system levels. human-engineered sub-systems are never, at the current level of our technology, integrated as closely as biological systems and these engineering sub-systems can often function independently. indeed structure and function are related at all scales and levels in biological systems in a way that is seldom approached in present-day man-made systems. 2 biological control systems man-made control systems still lag far behind natural systems in terms of their performance. this is especially clear in mobile robots where human performance in complex tasks, such as tennis playing or riding a standard bicycle, goes beyond the capabilities of any present-day man-made system. present-day manipulator robots also show a level of performance in terms of control system capabilities that is far below that of the equivalent biological systems. a manipulator robot has a dynamic response that is highly dependent on the load being carried by the arm. human arms on the other hand show dynamic responses that are largely independent of the loading conditions and have characteristics that strongly suggest adaptive properties in the underlying control system. the question arises of how these adaptive properties are provided from the basic elements of the neuromuscular system and whether some of these properties could be translated into equivalent characteristics in robots. 3 the neuromuscular control system the biological system that controls posture and movement in mammals is highly complex, with significantly nonlinear and adaptive properties. although the system presents a considerable challenge to biologists and to engineers, it has for many years attracted much attention because of the potential benefits in terms of the treatment of diseases that can affect our ability to control posture and move effectively. the neuromuscular system has also been taken into account in modelling the human operator for control tasks where the human-in-the-loop has a critically important role, such as in flying high-performance military aircraft. 
much use has been made of relatively simple neuromuscular system sub-models within pilot models used for aircraft handling qualities and flight control systems studies. interest shown by engineers in the structure and function of the neuromuscular system has also been very considerable because they have recognised the potential benefits for engineering design. a better understanding of the way in which the central nervous system controls complex movements, and of the role of reflex systems in the regulation of posture, could well lead to the design of better control systems for manipulator robots or walking robots. it is generally accepted that the neuromuscular control system is organised in a hierarchical fashion. the muscular and skeletal subsystems and the associated sensory elements (neural receptors) lie at the lowest level and are directly involved as elements in the feedback loops that provide reflex actions through the spinal cord. these reflex actions are influenced by a network of interneurones, which is thought to provide the coordination and pattern-generation elements of the system for control tasks involving more than one muscle. the brain stem, cerebellum, motor cortex and other similar areas of the central nervous system are involved in the upper levels of this hierarchy. neural communication pathways, both ascending and descending, interconnect elements at the various different levels. fig. 1 shows a simplified schematic diagram of the system. most physiologists and engineers involved in neuromuscular system modelling and experimental investigations have adopted a modular approach in which attempts are made to understand the function and structure of the many different elements of the system. the basic actuator is the anatomical muscle. a single anatomical muscle may contain hundreds of active contractile elements, known as motor units, which are connected in parallel to a common tendon. each motor unit itself consists of a single motoneurone and a group of about 250 muscle cells that can be activated simultaneously by the motoneurone. each muscle fibre resembles a form of active nonlinear spring in which the force developed is dependent upon both the external loading and the neural input.
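to make the "active nonlinear spring" picture concrete, the following sketch implements a generic hill-type approximation of a single motor unit, with the force scaled by neural activation and shaped by a bell-shaped force-length factor and a hyperbolic force-velocity factor. this is a textbook-style caricature with invented constants, offered only as an illustration; it is not a model proposed in this paper.

import math

F_MAX, L_OPT, V_MAX = 10.0, 1.0, 5.0   # peak force, optimum length, max shortening velocity

def unit_force(activation, length, velocity):
    """force of one motor unit for activation in [0, 1]; shortening velocities only."""
    f_l = math.exp(-((length - L_OPT) / 0.45)**2)   # bell-shaped force-length curve
    v = min(max(velocity, 0.0), V_MAX)              # clamp to the valid range
    f_v = (1.0 - v / V_MAX) / (1.0 + 3.0 * v / V_MAX)   # hill force-velocity curve
    return activation * F_MAX * f_l * f_v

print(unit_force(0.5, 1.0, 0.0))   # half activation, optimal length, isometric -> 5.0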
at a microscopic level the fibres contain many smaller elements known as fibrils, and each fibril is itself packed with microfilaments, which are of two types: thin filaments of actin and thick filaments of myosin. the active contractile properties of muscle are derived from the active sliding of one type of filament over the other. the signal transmission paths in the neuromuscular system are nerve fibres. a typical nerve fibre has a cell body, which contains the nucleus, and a number of so-called "processes" that may be more than one metre in length. most nerve cells have several short processes, called dendrites, and a thinner and longer process called the axon. electrical impulses are generated by a discharge at a receptor organ. activity at one point in a nerve fibre leads to activation of adjacent parts of the fibre, and at points where a nerve divides the activity is carried into all branches. in electrical terms the activity at a given point takes the form of voltage impulses or "spikes", known as action potentials, separated by periods of inactivity. it is widely believed that neural information is transmitted using a form of pulse-frequency modulation. since the transmission of an action potential is an active process that depends upon ion transport across membranes, there are usually significant time delays. the velocity of propagation of a neural impulse is directly related to the nerve fibre diameter. the point of connection between one nerve cell and another is known as a synapse. a single impulse in one pre-synaptic fibre may not have sufficient effect to activate the post-synaptic cell, and there is a resulting change in the pattern of activity. in many ways a synaptic junction may be regarded as analogous to an ideal summing junction with additional attenuating properties. the pre-synaptic connections are classified as either facilitatory or inhibitory and are equivalent to inputs with positive or negative signs, respectively, at the summing element. the sensory elements of the stretch reflex feedback system are of two main types. these are the muscle spindle receptors and the tendon organ receptors, and they are known to be very different, both anatomically and functionally. it is believed that the muscle spindle receptors have a particularly important role in the stretch reflex system and in the overall control of the complete neuromuscular system. it has long been accepted that a more complete understanding of the muscle spindle receptor would throw light on many unsolved problems concerning neuromuscular control mechanisms. the muscle spindle receptors are sense organs that lie in parallel with the main load-bearing (extrafusal) muscle fibres. each spindle consists of a group of special-purpose muscle fibres, known as intrafusal fibres, with a set of receptors located in the central region of the spindle.

fig. 1: diagram of pathways connecting the muscle spindle and its parent muscle to the spinal cord (adapted from diagram in [2])

although the complexity of the muscle spindle is considerable and some of its features are still not understood, there are some points upon which there is general agreement. it is recognised that intrafusal fibres vary in size and fall into two distinct categories in terms of visible microscopic features, these being known as nuclear bag fibres and chain fibres.
nuclear bag fibres consist of a central region which has few myofibrils and is thought to have properties that are essentially elastic. this central region is linked in series on either side to contractile regions that have visco-elastic characteristics. nuclear chain fibres are more uniform in structure and therefore are not expected to show such markedly viscous responses. in most mammals the muscle spindle consists of two or three bag fibres and several chain fibres. a typical anatomical muscle may have a large number of muscle spindles, in some cases as many as eighty. the properties of the intrafusal fibres are similar to the properties of ordinary extrafusal muscle, except that the intrafusal fibres have their own motor pathways through which the central nervous system can influence the spindle quite independently of the main extrafusal fibres. in the context of neuromuscular control, one of the most important points that has been established about the muscle spindle receptor is that there are several different types of efferent nerve fibre leading to intrafusal fibres. in addition to the well-known gamma efferents, of which there are two types, muscle spindles may also be innervated by collaterals of the alpha motor nerve fibres which supply the extrafusal fibres of the main muscle. these collaterals are known as beta fibres. the sensory regions of the muscle spindle can therefore be affected both by mechanical changes in the main load-bearing muscle and by activity in the gamma and beta pathways. functionally, the muscle spindle receptors are thought to be somewhat analogous to strain gauges and provide neural output signals in response to an applied extension of the main muscle or in response to neural activation of the intrafusal fibres of the spindle itself. there are two types of sensory receptor in the mammalian muscle spindle, and these are known as primary and secondary endings. the significant difference between primary and secondary endings is that while the primary endings are found to display a high sensitivity to the velocity of applied stretch, the secondary endings are not so velocity sensitive. the primary output of the muscle spindle is usually transmitted to the spinal cord by a single large axon that winds around the nuclear bag. mechanical deformation of the sensory endings of this axon leads to the generation of neural impulses at a frequency that is related to the velocity of applied stretch as well as to the static deformation. this rate sensitivity is thought to be of considerable significance in the study of the neuromuscular control system. secondary endings do not display any marked rate sensitivity. tendon organs lie in series with the main load-bearing muscle, and the neural output from this sensory receptor is directly related to the overall muscle tension. the tendon organ is known to have an inhibitory effect on the motoneurones of its own load-bearing muscle, whereas the spindle output has a facilitatory effect. in the 1950s a servomechanism hypothesis of neuromuscular control was formulated by hammond, merton et al. [3], in which it was suggested that some movements are produced not as a direct result of neural signals from higher centres acting on the alpha motoneurones, but indirectly through the gamma pathway to cause contraction of the intrafusal fibres of the muscle spindle. the spindle output signal is transmitted to the motoneurone pool and thus stimulates the motoneurone, which causes contraction of the load-bearing muscle.
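as an aside, the distinction between the two endings can be caricatured by a linear length-plus-velocity model with output rectification (a firing rate cannot be negative, so strong negative inputs simply silence the discharge). the gains below are invented for the sketch; it only illustrates the dynamic overshoot of the primary ending during a ramp-and-hold stretch.

import numpy as np

def spindle_outputs(stretch, dt, rest_rate=20.0, k_len=80.0, k_vel=8.0):
    """firing rates [imp/s] of primary and secondary endings (rectified)."""
    velocity = np.gradient(stretch, dt)
    primary = np.maximum(0.0, rest_rate + k_len * stretch + k_vel * velocity)
    secondary = np.maximum(0.0, rest_rate + k_len * stretch)
    return primary, secondary

dt = 1e-3
t = np.arange(0.0, 1.0, dt)
ramp = np.clip(t - 0.2, 0.0, 0.3)          # ramp-and-hold stretch profile
prim, sec = spindle_outputs(ramp, dt)
print(prim.max(), sec.max())               # primary overshoots during the ramp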
although hammond et al. [3] were the first to employ the servomechanism analogy and carried out experimental investigations that supported this theory, the existence of closed-loop neuromuscular systems had been recognised for many years before that. the supposed advantage of the indirect route through the gamma efferent pathways is that feedback through the muscle spindle receptor facilitates the maintenance of the conditions appropriate to the fusimotor input in the presence of external disturbances. the system becomes largely independent of changes in load and will be relatively insensitive to muscle fatigue. later work suggested that the action of the stretch reflex system must be more complicated than the description given by hammond et al. it is known, for example, that beta motoneurones innervate both extrafusal muscle and the intrafusal fibres of the muscle spindle. a number of researchers pointed out that the controlled variable cannot be defined in a simple way because of the complex dynamic interactions in the muscle spindle, which acts as the comparator within the closed-loop system. examination of the possible consequences of either position or tension control suggests that neither of these would be suited to the tasks performed by skeletal muscle. more recently, the ideas of pre-calculated feed-forward control have contributed to an understanding of some aspects of the action of the cerebellum in controlling rapid movements in the presence of delays in the feedback pathways. 4 computational models of the neuromuscular system a number of mathematical and computer-based models have been proposed for describing control mechanisms in the neuromuscular system, and these models have been reviewed recently [4]. a modular approach is generally favoured and has been highly successful in studies of the dynamic properties of individual neural elements of this complex nonlinear dynamic system and of interactions between them [2]. what remains uncertain in this closed-loop system model is the nature of the controlled variable and the origin of properties that produce insensitivity to changes of external load and smooth movements under a very wide range of experimental conditions. these apparently adaptive features are of enormous interest for robotics research. if these and other features of the biological system could be more fully understood and reproduced in the design of control systems for manipulator robots, there could be significant benefits. it appears likely that the key element in providing insight about the complex properties of this biological system is the muscle spindle receptor. this appears to function as a comparator and also as a combined sensor that provides a signal that depends primarily on muscle length and velocity but has dynamic properties that are dependent on the gamma efferent signal from the central nervous system. there is substantial evidence [2, 5] that the fusimotor inputs to the muscle spindle are responsible for changes in the dynamic response of this sensory element in the system. this gives a strong clue about the possible source of some of the complex self-adaptive properties observed in the intact closed-loop system, but it is still not known how the overall behaviour of the control system is influenced by these changes at the spindle.
the difficulty in approaching this problem lies in the inherent nonlinearities of the system and in the complex nature of the interactions at the spinal cord between the various sensory feedback pathways, spinal pattern generators and local neural feedback pathways. there is also evidence from experiments on intact human subjects that the neuromuscular system changes the loop gain with the load on the system – a feature that is not replicated in many present-day man-made control systems. this has been the subject of a further servomechanism hypothesis by marsden, merton et al. [7], one of whom (merton) was closely associated with the earlier hypothesis of hammond et al. [3]. within the control paths for a single muscle, the alpha motoneurone provides the actuating input that causes the extrafusal muscle fibres to contract. the alpha motoneurone output is determined by a number of feedback loops as well as the inputs from higher levels in the central nervous system. these loops include the feedback from muscle spindle receptors and golgi tendon organs, feedback from joint receptors and skin receptors, and local neural feedback pathways at the spinal cord involving what are known as renshaw cells, which provide recurrent inhibition. also, while the activities of the alpha and gamma motoneurones are essentially independent, the activity of the beta motoneurones causes contraction of the intrafusal fibres as well as being related to the activity of the alpha motoneurones, which leads to contraction of the extrafusal fibres. the situation becomes even more complex when the model attempts to represent the coordinated action of a pair of muscles working together on a joint. it is far from clear how the sensory feedback signals and descending control signals interact. the effect of a single feedback pathway on the overall system behaviour may appear obvious, but the complex nonlinear properties of muscle and the threshold phenomena that are known to be present in some of the feedback pathways make any meaningful quantitative analysis or simulation very difficult. detailed models that attempt to incorporate the anatomical features of the real biological system require knowledge of a large number of parameter values, many of which are difficult, or impossible at present, to estimate from experimental data. access to intermediate variables for model validation is also extremely difficult in models of this type. 5 an alternative nonlinear model of the neuromuscular system one common assumption in models of the neuromuscular system is that the efferent signal that actuates the load-bearing muscle is a pulse-frequency modulated signal of strength proportional to the weighted sum of the relevant afferent signals. there is, however, considerable well-established experimental evidence suggesting that the force developed by a muscle is controlled primarily by regulation of the number of motor units that are active. as the required force in the muscle increases, more and more units become active. it is known that in a single unit the pulse frequency increases with the overall load force until a saturation level is reached, beyond which no further increase occurs. if recruitment of motor units is the most significant factor in the development of tension in intact muscle, this feature must be incorporated into neuromuscular control system models. for a muscle subjected to an imposed constant tension p(t), it is reasonable to assume initially that the applied tension is shared equally by the active units.
if the number of active units is n(t), the tension in each single unit of the load-bearing muscle must be p(t)/n(t). traditional models of muscle make the implicit assumption that all the motor units are stimulated in synchronous fashion, with the complete muscle acting as a single unit. this is clearly incorrect, except in some rather artificial experimental conditions, but it is reasonable to take such a description as a model of a single active motor unit [6]. this provides a basis for quantitative assessment of the ideas of a variable-gain system that are implied by the hypothesis of marsden et al. [7]. it is first of all assumed that, as in most previous models, the outputs of the muscle spindle are proportional to the overall length of the main muscle. it is also assumed that some form of composite afferent signal from the muscle spindles and other sensory receptors controls the number of active motor units in the extrafusal load-bearing muscle. possible models representing the stretch reflex under conditions of controlled tension or controlled length would have the forms shown in figs. 2 and 3 respectively.

fig. 2: block diagram showing stretch reflex model for conditions of controlled muscle tension

fig. 3: block diagram showing stretch reflex model for conditions of controlled muscle length

if, in the model for isotonic conditions, the fusimotor input is such that at a particular level of applied tension p1 the number of active units is n1 and the muscle length is l1, then an increase of applied tension will cause the length to increase, and this will give rise to an increase in the number of active units. any increase in the number of active units will tend to reduce the tension within each active unit because of the action of the divider in the forward path of the nonlinear feedback system. this
conversely, any reduction of fusimotor activity will cause the muscle to lengthen. although fusimotor inputs are known to play an important part in the control of the neuromuscular system, it is also known that active movements can occur in preparations in which the feedback path from the muscle spindle receptors has been opened. it follows that the system must incorporate a second input that acts directly at the motoneurone pool. this is the feed-forward input, referred to previously, that appears to be used for the initiation of urgent movements for which the time delays inherent in the gamma pathway would introduce an unacceptable lag. although the two inputs are normally considered as being functionally distinct it is known that they often work together and it has been suggested that the relative amounts of activity in the two pathways may be controlled by the central nervous system to suit the task being performed. it has been suggested that the feedback properties of the neuromuscular system only apply for fusimotor inputs but this is not necessarily true since in many forms of feedback control input signals applied at different points in the loop may have similar effects at the output. if, however, the contraction initiated by the direct pathway is so rapid that the spindle discharge ceases, the system does become transiently open-loop. this is one of the interesting effects of neural signal encoding through pulse frequency modulation. large negative effects at a sensory receptor may cause the neural discharge to cease complete and the feedback signal to be interrupted. this gives rise to a subtle form of nonlinearity in a closed-loop system that may produce open-loop operation on a transient basis. the models of the muscle spindle currently available suggest that there is a significant rate sensitivity component on the spindle primary output but no similar rate sensitivity in the secondary output. this could be regarded as a form of rate feedback that would be beneficial in compensating for the effects of pure time delays and other lags. there is a further interesting observation that can be made here about the pulse frequency modulated form of signal in neural pathways. since the signal strength is proportional to the inverse of the time between adjacent spikes observed in the afferent nerve, the primary discharge from the spindle could be said to show a form of adaptive sampling. when the mechanical input is changing rapidly the instantaneous frequency in the afferent nerve will be high due to the rate sensitivity and the sampling rate will be correspondingly high. this is clearly a desirable situation in a control system. the changes in the dynamic response of muscle spindles observed when fusimotor inputs are applied suggests that, in addition to producing changes of the equilibrium point, these inputs may provide some form of adaptive compensation of the stretch reflex through parametric changes at the muscle spindles. it is possible, therefore that the static and dynamic fusimotor pathways to the intrafusal fibres provide a means of relatively independent control of the operating point and the dynamic characteristics of the complete closed-loop system. discussion one technological development that makes the study of recruitment phenomena in muscle particularly interesting is the recent development at a number of research centres, © czech technical university publishing house http://ctn.cvut.cz/ap/ 7 acta polytechnica vol. 44 no. 2/2004 fig. 
3: block diagram showing stretch reflex model for conditions of controlled muscle length including sri international at menlo park, california [8], of actuators that expand and contract in response to electrical stimuli. the development of a robotic actuator made up of many strands of plastic artificial muscle fibre provides interesting possibilities for a wide range of applications. the inclusion of large numbers of sensors that can be embedded in these artificial muscles, like muscle spindles and tendon organs in real muscle, opens up the further possibility of highly redundant control systems that resemble the structure of the neuromuscular system and could begin to offer the high levels of redundancy in terms of sensors, actuation units and signal transmission pathways that seem to offer advantages in the neuromuscular control system. adaptive systems that vary the gain and other system parameters with load would become relatively easy to implement. the pulse frequency modulation that is inherent in nervous signal transmission has interesting consequences in terms of control. as already pointed out, extreme conditions can effectively cause feedback pathways to open transiently and switch the control system from closed-loop to an open-loop mode. this form of nonlinearity could well be beneficial in some circumstances and has already been introduced in some forms of bang-bang control system, which effectively operate in open–loop mode with maximum control effort applied except when the error becomes small and they change to linear closed-loop operation. as has been noted above, the pulse frequency modulation inherent in nervous signal transmission can also be regarded as a form of adaptive sampling and this has interesting implications in terms of information transmission in man-made systems. conclusions any engineer must inevitably have respect for the excellence of the design that can be seen in biological systems. unfortunately much of the fine detail is well beyond our present understanding and cases where the results of these natural design processes may appear to us to be imperfect or puzzling may simply represent our failure to comprehend fully the complexities of the situation. one of the key elements of engineering design is that, generally, we aim for simplicity and accept added complexity only when we have convincing evidence that the gains exceed the losses. such losses may be associated with the inevitable extra development or materials cost and the possible added system life-time costs caused by a reduction in reliability and additional maintenance demands. in contrast, complexity does not appear to be expensive in nature, where prototype testing is carried out on such a very large scale that an enormous number of refinements can be tried [9]. while biological systems appear to have a major advantage in terms of the scope for innovative design by evolutionary optimisation, there are important limitations set by the kinds of materials that can be used and the mechanisms that can be employed. biological systems are also limited, if one accepts evolutionary principles, in the sense that every new feature must develop from an existing feature and there is no chance of making the sudden giant developmental leaps that are so common in the history of technology. nevertheless it is clear that the processes of evolution can tell us much that is of value for specific engineering control systems that have functions equivalent to easily identifiable biological systems. 
the neuromuscular system is an excellent example because of the obvious link to manipulator robots and other types of robotic system. the subtle nonlinear and self-adaptive features of the neuromuscular control system are beginning to be understood and better knowledge of the biological control system could well provide further ideas to explore in the development of improved forms of man-made system. improved understanding of the normal operation of the neuromuscular control system could also assist rehabilitation engineers in developing ways of assisting , through controlled functional electrical stimulation techniques, those who have lost the functions of normal limb control. references [1] von gierke h. e., keidel w. d., oestreicher h. l. (editors): principles and practice of bionics. agard conference proceedings no. 54. slough (england): technivision services, 1970. [2] rosenberg j. r., murray-smith d. j., rigas a.: “an introduction to the application of system identification techniques to elements of the neuromuscular system.” trans. institute measurement and control, (u.k.), vol. 4 (1982), p. 187–201. [3] hammond p. h., merton p. a., sutton g. g.: “nervous gradation of muscular contraction.” brit. med. bulletin, vol. 12 (1956), p. 214–218. [4] he j., maltenfort m. g., wang q., hamm t. m.: “modelling neural control.” ieee control systems magazine, vol. 21 (2001), p 55-69. [5] halliday d., murray-smith d. j., rosenberg j. r., rigas a.: “a frequency-domain identification approach to the study of neuromuscular systems-a combined experimental and modelling study.” trans. institute measurement and control, u.k., vol. 14 (1992), p. 79–90. [6] murray-smith d. j.: an application of modelling techniques to the neuromuscular control system. phd thesis, university of glasgow, chapter 6, 1970. [7] marsden c. d., merton p. a., morton h. b.: “servo action in human voluntary movement.” nature, vol. 238 (1972), p. 140–143. [8] anon: “artificial muscles: expansive thinking.” the economist, february 5th 2000. [9] french m.: invention and evolution. 2th edition. cambridge, cambridge university press, 1994. david j. murray-smith centre for systems and control department of electronics and electrical engineering rankine building the university of glasgow glasgow g12 8lt, scotland, u.k. 8 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 44 no. 2/2004 table of contents biological systems thinking for control engineering design 3 d. j. murray-smith computational fluid dynamic simulation (cfd) and experimental study on wing-external store aerodynamic interference of a subsonic fighter aircraft 9 tholudin mat lazim, shabudin mat, huong yu saint dynamics of micro-air-vehicle with flapping wings 15 k. sibilski the role of cad in enterprise integration process 22 m. ota, i. jelínek development of a technique and method of testing aircraft models with turboprop engine simulators in a small-scale wind tunnel – results of tests 27 a. v. petrov, y. g. stepanov, m. v. shmakov developing a conceptual design engineering toolbox and its tools 32 r. w. vroom, e. j. j. van breemen, w. f. van der vegte knowledge support of simulation model reuse 39 m. valášek, p. steinbauer, z. šika, z. zdráhal the effect of pedestrian traffic on the dynamic behavior of footbridges 47 m. studnièková control of systems of reservoirs with the use of risk analysis 52 p. fošumpaur, l. satrapa a coding and on-line transmitting system 56 v. zagursky, i. zarumba, a. 
acta polytechnica 55(3):169–176, 2015, doi: 10.14311/ap.2015.55.0169
© czech technical university in prague, 2015, available online at http://ojs.cvut.cz/ojs/index.php/ap

decentralized multi-robot planning to explore and perceive

laetitia matignon (a,*), laurent jeanpierre (b), abdel-illah mouaddib (b)
(a) université lyon 1, liris umr5205, f-69622, france
(b) université caen basse-normandie, greyc umr6072, f-14032, france
(*) corresponding author: laetitia.matignon@univ-lyon1.fr

abstract. in a recent french robotic contest, the objective was to develop a multi-robot system able to autonomously map and explore an unknown area while also detecting and localizing objects. as a participant in this challenge, we proposed a new decentralized markov decision process (dec-mdp) resolution based on distributed value functions (dvf) to compute multi-robot exploration strategies. the idea is to take advantage of sparse interactions by allowing each robot to calculate locally a strategy that maximizes the explored space while minimizing robot interactions. in this paper, we propose an adaptation of this method to also improve object recognition by integrating into the dvf the interest in covering explored areas with photos. the robots will then act to maximize the explored space and the photo coverage, ensuring better perception and object recognition.

keywords: cooperative multi-robot systems, robot coordination, robot planning, multi-robot exploration, active perception.

1. introduction

some key challenges of robotics reported in the recent roadmap for u.s. robotics [1], e.g., planetary missions and service robotics, require mobile robots to travel autonomously around unknown environments and to augment metric maps with higher-order semantic information such as the location and the identity of objects in the environment. the ability of mobile robots to gather the necessary information to obtain a useful map for navigation is called autonomous exploration. this was the central topic of a dga (defense procurement agency) / nra (french national research agency) robotic challenge, in which multiple robots have to explore and map some unknown indoor area while recognizing and localizing objects in this area. the scientific issues of this project involve slam (simultaneous localization and mapping), object recognition and multi-robot collaboration for exploration. as a participant in this challenge, we mainly focused on multi-robot collaboration for exploration. we were particularly interested in multi-robot exploration strategies. we proposed a new dec-mdp (decentralized markov decision process) resolution technique based on the distributed value function (dvf) to consider sparse interactions. our dec-mdp model for exploration and its resolution based on dvf were applied for the contest.
it allowed robots to explore an unknown area cooperatively by reducing the overlap between the explored areas of each robot. in a second phase, we focused on improving object recognition by integrating into the planning stage the interest in covering explored areas with photos. thus each robot will act to explore and to perceive. the objective of this paper is to present our interaction-sparse dec-mdp resolution adapted to achieve improved perception. in the following, we first present the context of this work, with details concerning the robotic challenge and the system that we developed to participate in the competition. second, related works on multi-robot exploration, active perception and interaction-oriented models are introduced. then we present our dec-mdp resolution, based on dvf, and its application to multi-robot exploration. we then introduce and show experiments aimed at extending dvf to improve the photo coverage of the space. finally, some concluding remarks are made.

2. context

2.1. the carotte challenge

this work was carried out within the framework of a french robotic contest called defi carotte (cartography by a robot of a territory; http://www.defi-carotte.fr/), which involved exploring and mapping an unknown indoor area (an enclosed arena made up of a set of rooms) with one or several autonomous robots. the competition took place in an arena of approximately 120 m², in which objects had been laid out. the arena contained several rooms, typically 10 or more, with variable grounds and various difficulties to be dealt with (fitted carpet, grid, sand, etc.). several kinds of objects were present, isolated or gathered together, e.g., chairs, books, fans, furniture, etc. the carotte challenge proceeded over a period of three years, with an increase in the level of difficulty over the years. the required outcome was to produce 2d, 3d and topological maps of the arena and to detect, localize and classify objects present in the arena. the trials within the framework of the competition consisted of several missions with a time limit. during each mission, robots navigated autonomously to map, detect and recognize objects. the robots were required to return to their starting position before the end of the mission. five teams entered this challenge [2, 3], in which the goal was to maximize the explored space, the precision of the map and the quality of object detection.

2.2. the robots_malins system

we developed the robots_malins (robots for mapping and localization, using intelligent navigation and search; https://robots_malins4carotte.greyc.fr/) system for the carotte challenge. our system uses wifibot µ-trooper m robots (www.wifibot.com). these are six-wheel robots characterized by great flexibility, which allows the robots to cover rough terrain. each µ-trooper embeds an intel core 2 duo processor, 2 gb of ram and 4 gb of flash storage, and is equipped with a hokuyo lidar 30m laser range scanner (www.hokuyo-aut.jp) for localization and mapping and a hokuyo lidar 4m tilted toward the ground, plus an ultrasonic rangefinder for detecting nearby obstacles and glass walls. an avt marlin firewire camera is used for object detection. the software running on-board these robots is based on a data distribution system (dds) implementation from opensplice (http://www.opensplice.com).
this middleware allows several programs to run concurrently, even on different computers. in our architecture, this implies that various modules can run asynchronously: laser acquisition, slam, decision, mobility and object recognition. the architecture allows the robots to exchange their laser scans and their poses. thus each robot knows the areas explored by the others, and updates its local map with local and distant scans. in particular, the slam module, based on [4], receives laser readings and provides the other modules and the other robots with the robot pose (location and heading). so each robot knows the relative positions of all the robots and the map of the zones explored by the team. the mobility module implements an advanced point-and-shoot algorithm, along with a backtrack feature that prevents the robot from being stuck by reverting back along its own trajectory. the point-and-shoot algorithm consists of turning to the desired heading and then going forward for a specified distance, while correcting the heading drift. the object recognition module uses the different pictures taken by the camera to recognize predefined classes of objects. pictures are taken every 4 seconds. objects to be detected are known in advance, and a database has been built containing each object over different points of view. object detection was performed using a shape/template matching technique based on dominant orientation template models [5]. the decision module runs asynchronously, computing a new strategy about once per second on average. in this paper, we focus on the decision module. details on the algorithm used to compute a joint policy of the robots to efficiently explore the arena and cover it with pictures are given in sections 5 and 6.

3. state of the art

3.1. multi-robot exploration

multi-robot exploration has received considerable attention in recent years due to its obvious advantages over single-robot systems: it is faster, robust, fault-tolerant, etc. most multi-robot approaches assume that robots share the information that they have gathered to build a shared map and know their locations in this map. the robots cooperate through the shared map, but coordination techniques are required to exploit the parallelism inherent to the team. in [6], each robot uses a greedy approach in the shared map, i.e., each robot chooses the nearest exploration frontier, where frontiers are defined as regions on the boundary between open space and unexplored space. therefore, there is no coordination, and multiple robots can be assigned to the same frontier. in [7–9], the coordination is centralized. a cost-utility model is used, where the gain expected at a target is the expected area discovered when the target is reached, taking into account the possible overlap between robot sensors. coordination is accomplished by assigning different targets to the robots, thus maximizing the coverage and reducing the overlap between the explored areas of each robot. coordination can also be decentralized. in [10] robots bid on targets to negotiate their assignments. classically, frontiers are rated using the distance to the robot, but bautin et al. [11] chose to favor a well-balanced spatial distribution of robots in the environment: for each robot-frontier pair, the cost function is the number of robots closer than it to the considered frontier. accordingly, each robot is allocated to the frontier for which it has the lowest rank.
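this rank-based allocation can be stated in a few lines of code. the sketch below uses euclidean distances for brevity, whereas a real system would use path costs on the shared map; the tie-breaking by distance is our own choice, not necessarily that of [11].

import numpy as np

def allocate(robots, frontiers):
    """robots, frontiers: arrays of 2d positions; returns a frontier index per robot."""
    d = np.linalg.norm(robots[:, None, :] - frontiers[None, :, :], axis=2)
    # rank[i, f] = number of robots strictly closer than robot i to frontier f
    rank = (d[None, :, :] < d[:, None, :]).sum(axis=1)
    # lexicographic choice: lowest rank first, then shortest distance
    score = rank + d / (d.max() + 1.0)
    return score.argmin(axis=1)

robots = np.array([[0.0, 0.0], [10.0, 0.0]])
frontiers = np.array([[1.0, 1.0], [9.0, 1.0], [5.0, 5.0]])
print(allocate(robots, frontiers))   # [0 1]: the robots spread over distinct frontiers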
all these strategies have been devised, but few efforts have been made to compare them. an exception is a recent article [12] that compares some methods for autonomous exploration and mapping using various criteria, e.g., exploration time and map quality.

3.2. active perception combined with exploration

early works on object recognition were based on passive approaches. during the last decade, some approaches have investigated the field of active object recognition: the viewpoint of the camera can be controlled to improve the recognition rate. for example, looking at an object from different poses decreases ambiguities in object recognition, thanks to the choice of different viewpoints. several active approaches have been proposed for planning optimal sequences of views for a camera. an approach for viewpoint selection based on reinforcement learning is proposed in [13]. in [14], the objective is to plan the minimum number of actions and observations required to achieve recognition. active perception planning is formulated as a pomdp. however, the issue of active planning combined with exploration is not addressed in most of these works, as the location of the object to be recognized is known. other works are interested in adding other objectives to the trajectories planned for the exploration. for example, integrated exploration [15, 16] involves integrating the path planned for exploration with slam in order to plan trajectories that favour the creation of a high-quality map. some actions, e.g., closing a loop or returning to previous positions, may reduce the uncertainty of the robot pose and the uncertainty of the map. a utility function is used which trades off the cost of exploring new terrain against the potential reduction of uncertainty by making measurements at selected positions. in these works, there is no verification of the assumption of having an accurate position estimate during exploration. integrated exploration considers the problem of acting for better localization, but not for better recognition.

4. interaction-oriented models

decision-theoretic models based on dec-mdps provide an expressive means for modeling cooperative teams of decision makers. in this section, we present this formalism and recent advances in resolving it.

4.1. background on dec-mdp

dec-mdp [17] is an extension of mdp [18] for decentralized control domains. a dec-mdp is defined with a tuple ⟨i, s, a, t, r, ω, o⟩. i is the number of agents, s is the set of joint states and a = {a_i} is the set of joint actions (a state of the problem can be written as a tuple s = (s_1, ..., s_i) such that s_j is the state of robot j; a_j defines the set of actions of robot j). t : s × a × s → [0; 1] is a transition function and t(s, a, s′) is the probability of the i robots transitioning from joint state s to s′ after performing joint action a. r : s → ℝ is a reward function that represents the global immediate reward for the robots being in s. ω is a set of joint observations that agents can receive about the environment and o : s × a × s × ω → [0; 1] is an observation function giving the probability of receiving o ∈ ω after performing joint action a and transitioning from s to s′. if the global state of the system is collectively totally observable, the dec-pomdp is reduced to a dec-mdp. we can see an mdp as a dec-mdp where i = 1. it is defined with a tuple ⟨s, a, t, r⟩. the goal of mdp planning is to find a sequence of actions maximizing the long-term expected reward. such a plan is called a policy π : s → a. an optimal policy π* specifies for each state s the optimal action to execute in the current step, assuming that the agent will also act optimally in future time steps. the value of π* is defined by the optimal value function v* that satisfies the bellman optimality equation:

v^*(s) = r(s) + \gamma \max_{a \in a} \sum_{s' \in s} t(s, a, s') \, v^*(s'), \qquad (1)

where γ is the discount factor. an mdp is solved using dynamic programming, with polynomial time complexity. a dec-pomdp is solved similarly by computing the optimal joint policy. however, its time complexity is nexp-complete [17], which is extremely hard.
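a minimal python sketch of solving (1) by value iteration (standard dynamic programming, not the authors' implementation) is given below; the tabular model (states, actions, t, r) is assumed as input.

# minimal value-iteration sketch for the bellman optimality equation (1);
# t[s][a][s2] is the transition probability, r[s] the reward, gamma the discount.

def value_iteration(states, actions, t, r, gamma=0.95, eps=1e-6):
    """return the optimal value function v* and a greedy policy."""
    v = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            backup = max(sum(t[s][a][s2] * v[s2] for s2 in states) for a in actions)
            new_v = r[s] + gamma * backup
            delta = max(delta, abs(new_v - v[s]))
            v[s] = new_v
        if delta < eps:
            break
    policy = {s: max(actions, key=lambda a, s=s: sum(t[s][a][s2] * v[s2] for s2 in states))
              for s in states}
    return v, policy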
4.2. interaction-oriented models

when faced with real-world applications such as multi-robot systems, dec-(po)mdp models are very difficult to apply, due to their high complexity. recent advances in dec-(po)mdp resolution have allowed a notable increase in the size of the problems that can be solved. an interesting direction that has emerged recently involves taking advantage of local or sparse interactions between agents. these methods relax the most restrictive and complex assumption, which considers that the agents interact permanently with all the others. the complexity of solving dec-(po)mdps is then reduced by solving a set of interactive individual decision making problems. the nd-pomdp model [19] is a static interaction approach, i.e., an agent always interacts with the same subset of neighbors. however, this assumption is not realistic. models have therefore been proposed that use dynamic interactions so that each agent interacts with an evolving set of agents. the dec-simdp model assumes full local observability, and unlimited, free communication between agents interacting together in some specific states [20]. dylim is similar but applies to partial observation and no communications [21]. it considers dec-pomdps as a set of pomdps, and interactive situations are solved separately by deriving joint policies. for non-interactive situations, each agent has its local policy to behave solely. these are promising approaches for real-world applications of decentralized decision makers.

5. dvf for multi-robot exploration

during the robotic contest, we were particularly interested in multi-robot exploration strategies. in this section, we present our fully decentralized approach based on dvf and its application to multi-robot exploration.

5.1. interaction sparse dec-mdp with dvf

to reduce the complexity of solving dec-mdps, we proposed an interaction-oriented resolution based on distributed value functions (dvf). dvfs were introduced by [22] as a way to distribute reinforcement learning knowledge through different agents. our approach decouples the multi-agent problem into a set of individual agent problems, and considers possible interactions among a team as a separate layer. this currently seems to be one of the main tracks for tackling scalability in dec-(po)mdps (cf. section 4.2). we represent a dec-mdp with two classes:

• the global interaction class, defined as a collection of augmented mdps {mdp_1^aug, ..., mdp_i^aug}. there is one augmented mdp per agent, which is defined by mdp_i^aug = ⟨s_i, a_i, t_i, r_i, γ_i⟩, where ⟨s_i, a_i, t_i, r_i⟩ individually models agent i in the absence of other agents and γ_i is some additional information. γ_i can be communicated by other agents or can be inferred locally.
it provides the agent with global information, enabling interaction between mdps of the global interaction class. then, each agent solves its augmented mdp on its own so that interactions are minimized. the global dec-mdp is solved as a set of local mdps augmented by information from the other agents. this significantly reduces the computational complexity: the nexp complexity of solving a dec-mdp is reduced to the complexity of solving one mdp (polynomial) per agent.

• the local interaction class is for close interactions. indeed, each agent computes strategies with dvf in its augmented mdp to minimize the interactions. however, when situations of interaction occur, dvf does not handle these situations and the local coordination must be resolved separately with another technique. for example, joint policies can be computed off-line for the specific joint states of close interactions, including only interacting agents.

in the exploration context, the additional information of the augmented mdp is limited to the last known state of other agents: agent i knows at each step the state s_ij ∈ s_i of the other agents j. then it computes its distributed value function dvf_i according to

dvf_i(s_i) = r_i(s_i) + \gamma \max_{a_i \in a_i} \Big( \sum_{s' \in s_i} t_i(s_i, a_i, s') \big( dvf_i(s') - \sum_{j \neq i} f_{ij} \, pr(s' | s_{ij}) \, v_j(s') \big) \Big) \qquad (2)

for all s_i ∈ s_i, where pr(s′|s_ij) is the probability of agent j transitioning from state s_ij to state s′; v_j(s′) is the value function of agent j, computed locally by agent i; f_ij is a weighting factor that determines how strongly the value function of agent j reduces the value function of agent i. the dvf technique allows each agent to choose a goal which should not be considered by the others, and in this way the interactions are minimized. the value of a goal depends on the expected rewards at this goal and on the fact that it is unlikely to be selected by other agents. more details about dvf and its extension under communication constraints can be found in [23].

figure 1. reward propagation mechanisms with stars as resulting rewards. a) from frontier hexagons at the top: an unknown hexagon (grey) propagates its reward over a radius (dotted circle) on free neighborhood hexagons (white). propagation is stopped if occupied (black) or unknown hexagons are encountered. white arrows show impossible propagations, and black arrows represent active propagations. b) from free and non-covered hexagons at the bottom: a free and non-covered hexagon (yellow) propagates varying rewards at a best view point distance (solid circle).

5.2. dvf applied to multi-robot exploration

in the second year of the contest, dvf was used in our decision module to compute multi-robot exploration strategies in a decentralized way. thanks to the slam module (cf. section 2.2), each robot has access to a map updated with all explored areas and to the position of all the robots. we can then assume that the location of the robot and of the other robots is known at decision time.

5.2.1. mdp model

each robot generates its local augmented mdp from a four-layer grid. the first layer is the real world layer where the robots move. the pixel layer is an occupancy grid of pixels, where each pixel is initialized as unknown and updated as free (no obstacle) or occupied (something there) by the data acquisition process. the hexagon layer is an occupancy grid of hexagons (each hexagon is composed of a set of pixels and is considered as unknown, free or occupied according to the value of its pixels). the hexagon and voronoi layers are each computed from the pixel layer, and are used to generate the data structures of the local augmented mdp. states and rewards are based on the hexagonal layer, while actions and transitions are based on the hexagonal and voronoi layers.
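to make equation (2) concrete, the following minimal python sketch performs one backup sweep of the dvf for agent i over such a tabular model; the input names (others, f_ij, pr_j, v_j) are our assumptions, not the authors' code. iterating the sweep to a fixed point yields dvf_i.

# minimal sketch of one sweep of the dvf backup (2) for agent i.
# model: states, actions, t[s][a][s2] (transition), r[s] (reward), gamma.
# others: list of (f_ij, pr_j, v_j), where pr_j[s2] approximates pr(s2 | s_ij)
# and v_j[s2] is the locally computed value function of agent j.

def dvf_sweep(states, actions, t, r, gamma, others, dvf):
    new_dvf = {}
    for s in states:
        best = float("-inf")
        for a in actions:
            total = 0.0
            for s2 in states:
                penalty = sum(f_ij * pr_j[s2] * v_j[s2] for f_ij, pr_j, v_j in others)
                total += t[s][a][s2] * (dvf[s2] - penalty)
            best = max(best, total)
        new_dvf[s] = r[s] + gamma * best
    return new_dvf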
figure 2. (a) simulation environment. (b) simulation screenshots. in b), areas that have been explored but not yet covered with photos are in yellow; explored areas covered with photos are in white; non-explored areas are in grey.

the exploration reward function is computed with a reward propagation mechanism based on the expected information gain in each state, as in [24]. we propagate rewards in some radius around the frontier hexagons, taking into account line-of-sight constraints (see fig. 1a). further details about our model are given in [25].

5.2.2. dvf

to apply dvf (2), we consider that the robots are homogeneous. the value functions v_j of the other robots j can therefore be computed only once by robot i (if the robots are not homogeneous, we just need to compute one value function for each type of robot). robot i computes an empathic value function with the standard value iteration algorithm [26] in its local augmented mdp. to evaluate the transition probability of other robot j, i applies a wavefront propagation algorithm from the last known state s_j of robot j.

5.2.3. decision step

a decision step involves building the model, computing the policy from dvf, and producing a smoothed trajectory. the agent plans continuously, executing a decision step as it perceives changes in its environment. we can observe that exploration rewards will never be gained by the robot. indeed, as the robot comes close enough to the frontier, it will gather new information with its sensors and unknown cells will become known before they are reached. therefore, the exploration rewards will disappear before the robot can claim them, and the frontier between known and unknown areas, which is the source of the rewards, retreats as the robot approaches it. the action plan must then be updated quickly to react as soon as possible to this kind of information gained en route. however, this requires the decision step to be quick enough for on-line use. given that the model will be updated at each decision step, we use the greedy approach, which plans on a short-term horizon.

5.2.4. experiments

experimental results from simulation and real-world scenarios can be found in [23, 25]. videos (available at http://liris.cnrs.fr/laetitia.matignon/research.html) present various exploration tasks with real robots, and some interesting situations are underlined, such as global task repartition or local coordination. the experiments show that this method is able to coordinate a team of robots effectively during exploration. the global interaction class addresses the allocation of exploration goals, and also minimizes close interactions between robots. each robot locally computes a strategy that minimizes the interactions between the robots and maximizes the explored space.
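the wavefront propagation used in section 5.2.2 to evaluate another robot's transition probability can be sketched as a breadth-first sweep over the map; for brevity, the sketch below uses a square grid instead of the paper's hexagons, and all names are our assumptions.

# minimal sketch of a wavefront (breadth-first) propagation from another robot's
# last known cell; grid[y][x] is true when the cell is traversable (free).
from collections import deque

def wavefront_distances(grid, start):
    """return a dict mapping each reachable (x, y) cell to its hop distance from start."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        x, y = queue.popleft()
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if (0 <= ny < len(grid) and 0 <= nx < len(grid[0])
                    and grid[ny][nx] and (nx, ny) not in dist):
                dist[(nx, ny)] = dist[(x, y)] + 1
                queue.append((nx, ny))
    return dist

the resulting distance map can then be turned into a rough estimate of pr(s′|s_j), e.g., by weighting cells according to how well their distance from s_j matches the time elapsed since robot j was last observed; this normalization step is our assumption, not a detail given in the paper.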
6. planning to explore and perceive

in the first two years of the contest, pictures were taken to locate and recognize objects. this was done separately from the decision module. pictures taken by the camera and analyzed by the object recognition module were gathered along the way, i.e., the camera took pictures at a specified rate along the trajectory planned for the exploration. however, this led to poor performance in terms of object recognition results, primarily because some objects in the arena were not photographed. indeed, the pictures that are taken depend on the trajectory computed for the exploration, so some objects were not covered by photos, while some areas without objects were photographed several times. to improve the coverage of the objects with pictures, we extended the decision module so that it does not only explore the arena but must also ensure that the explored areas are covered by photos. indeed, if all the explored space has been photographed, each object must be in at least one picture and the recognition module will handle it. in this section, we introduce a dvf extension to improve the photo coverage of the space, and we also present some experiments.

6.1. dvf extension to improve photo coverage of the space

the decision module must simultaneously combine the interest in exploring an area with the interest in covering it with pictures. to take into account both the exploration criterion and the picture coverage criterion, we modified the reward function r_i of the augmented mdp. we introduced a specific reward for areas explored and covered with photos, in addition to the exploration reward. mdps allow planning while optimizing several criteria at once: in our case, the expected information gain, the picture-taking and the cost to reach the chosen location. during the challenge, we also optimized the return-to-base criterion at the end of the mission and the ball-pushing feature when the ball was detected. these two criteria are not considered in this paper.

figure 3. number of objects detected and mission time (in seconds) versus coverage rate (produced from 340 simulations).

the dvf equation is then adapted to:

\forall s_i \in s_i: \quad dvf_i(s_i) = (1 - \alpha) r_{i,exp}(s_i) + \alpha r_{i,cov}(s_i) + \gamma \max_{a_i \in a_i} \Big( \sum_{s' \in s_i} t(s_i, a_i, s') \big( dvf_i(s') - \sum_{j \neq i} f_{ij} \, pr(s' | s_{ij}) \, v_j(s') \big) \Big) \qquad (3)

where r_{i,exp}(s_i) is the reward function for exploring a state s_i and r_{i,cov}(s_i) is the reward function for taking a photo of s_i. α ∈ [0, 1] is the picture coverage rate for balancing the exploration and picture coverage behaviors of the robots. with α = 0, the robots will only optimize the exploration without covering the space with photos. with α = 1, the robots will only optimize photo coverage; exploration then occurs only as a side-effect. to compute the cover reward function r_{i,cov}, a cover grid of pixels counts the number of times that each pixel has been photographed. each time the robot takes a picture, the cover grid is updated by tracing a set of rays covering the optimal recognition area of the camera. the pixels are updated with bresenham's algorithm [27]. the cover reward is then computed with a hexagonal reward propagation mechanism similar to the exploration rewards, multiplied by a bell-shaped factor to boost the rewards at the optimal recognition distance (see fig. 1b). this allows the mdp to select the best viewpoint for the next picture. regarding communications, this new reward implies that new data is sent: each robot needs to know where a picture has been taken. thus, each robot sends the location of its photos to all the other robots. then they can update their coverage grid and maintain an accurate reward function. in order to keep the computing complexity low, we chose not to add an action for taking a picture. when the policy brings the robot to a location where a picture must be taken, the robot will stop there and face the reward. when a picture is taken, the reward disappears and the new policy drives the robot farther. in order to avoid having blurred pictures where the recognition process will fail, we specify that photos must be taken only when the robot velocity is low.
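as an illustration of this cover-grid update, the following sketch traces rays over an assumed camera sector using bresenham's line algorithm [27]; the field of view, reach and ray count are hypothetical parameters, not values from the paper.

# minimal sketch of the cover-grid update: each picture traces rays over the
# camera's recognition area and increments the photographed-pixel counters.
import math

def bresenham(x0, y0, x1, y1):
    """integer line from (x0, y0) to (x1, y1) using bresenham's algorithm [27]."""
    dx, dy = abs(x1 - x0), -abs(y1 - y0)
    sx, sy = (1 if x0 < x1 else -1), (1 if y0 < y1 else -1)
    err, pts = dx + dy, []
    while True:
        pts.append((x0, y0))
        if (x0, y0) == (x1, y1):
            return pts
        e2 = 2 * err
        if e2 >= dy:
            err += dy
            x0 += sx
        if e2 <= dx:
            err += dx
            y0 += sy

def update_cover_grid(cover, x, y, heading, fov=math.radians(60), reach=40, n_rays=25):
    """increment photo counters along rays spanning the camera sector (assumed geometry)."""
    for k in range(n_rays):
        ang = heading - fov / 2 + fov * k / (n_rays - 1)
        x1 = int(round(x + reach * math.cos(ang)))
        y1 = int(round(y + reach * math.sin(ang)))
        for px, py in bresenham(x, y, x1, y1):
            if 0 <= py < len(cover) and 0 <= px < len(cover[0]):
                cover[py][px] += 1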
6.2. experiments

6.2.1. simulated robots

the stage simulator (http://playerstage.sourceforge.net/) was used with an architecture that mimics the real robots. dds is replaced by an inter-process communication shared memory segment. laser acquisition is simulated by a "ranger" virtual sensor. a "position" virtual device simulates both the slam module, by providing odometric data, and the mobility module, by executing the point and shoot algorithm. the stage blobfinder model is used to simulate (poorly) the camera and object detection, as it can track areas of color in a simulated 2d image, giving the location and the size of the color blobs. thanks to this architecture, the decision module of the real robots can be used with the simulator without modification.

6.2.2. results

we conducted a set of experiments in the autolab environment (see fig. 2a). figure 2b shows successive screenshots of one simulation. the robots are initially in the starting zone and objects are regularly positioned in the environment. for each coverage rate, we compute the number of objects detected and the mission time (see fig. 3). with α = 0, the mission is fastest, given that the robot does not care about taking pictures. the only goal of the robot is to map the entire arena, and it only takes a few pictures when it stops between two actions. few objects are therefore detected. with α > 0, all objects are detected but the mission time varies with α. indeed, α modifies the priority of the two tasks. when α is low, pictures are taken at the end of the mission, if there is sufficient time remaining.

figure 4. percentage of objects detected versus percentage of exploration (environment coverage by pictures for the object recognition, plotted against environment coverage by laser for the exploration/mapping, as a function of the coverage rate).

this is illustrated in fig. 4, with α = 0.0001, where the number of detected objects increases and 100% of the arena has been explored. with such a low α, the robot takes pictures once the exploration has been finished, so it travels the arena twice and the mission time is high. when α is high, pictures are taken throughout the mission, and the robot deals with both tasks at the same time. with α = 1, we obtain the second fastest mission time and all the objects are detected. so this new approach, based on exploration and photo coverage of the space, allows all of the objects to be photographed. the α factor must be chosen to balance the priority of exploration versus object detection. for example, if the time is limited and α is high, the robot may not explore the whole map, since taking pictures is time consuming, but it will photograph all objects in the explored area.

7. conclusion and perspectives

we have shown a new algorithm for coordinating multiple robots that explore some unknown area. dvf enables the expensive dec-mdp to be solved as a set of augmented mdps in a distributed way.
the versatile reward function can be adapted so that the robots explore the whole area, but also so that they choose the right positions to take meaningful pictures. as a result, we can observe that the robots take longer to explore, but that they also take many more pictures, efficiently covering the whole space. we can then expect a better detection rate, as any object should be in at least one picture. an immediate perspective is therefore to compare object recognition results with different parameters on real robots to confirm through a real experiment that our method provides improved perception. another perspective of our work is to plan to perceive by interlacing detection and decision modules to achieve more robust object recognition. the idea is not to take more pictures, but to take better pictures. the objective would be to plan viewpoints where the recognition process would be more reliable. to achieve this kind of active perception planning, the decision module should use information from the recognition module. in our work, the recognition module is based on the dominant orientation template method [5]. for each object, a set of views is defined, and the method gives a weighting for each view that is equivalent to its precision (or positive predictive value). a positive detection is reliable when the point of view has a high weighting. for each object detected, the module also gives the location of the object and scores (matching results) for each view. thus the decision module could manage a set of hypotheses about the probability of the presence of each object. these hypotheses could then be confirmed or discarded by taking pictures of the object from viewpoints that maximize the precision. this could be done easily by generating specific high rewards at these viewpoints, so that the decision module will adapt the planned trajectory to ensure that the robots take pictures from there.

acknowledgements

this work has been supported by anr and dga (anr-09cord-103) and was developed jointly with the members of the robots malins consortium. it has also received support from the insa foundation crome project.

references

[1] from internet to robotics: a roadmap for u.s. robotics. 2013.
[2] a. bautin, p. lucidarme, r. guyonneau, et al. cart-o-matic project: autonomous and collaborative multi-robot localization, exploration and mapping. in proc. of 5th workshop on planning, perception and navigation for intelligent vehicles, pp. 210–215. 2013.
[3] d. filliat, e. battesti, s. bazeille, et al. rgbd object recognition and visual texture classification for indoor semantic mapping. in proc. of tepra. 2012. doi:10.1109/tepra.2012.6215666.
[4] j. xie, f. nashashibi, n. m. parent, o. garcia-favrot. a real-time robust slam for large-scale outdoor environments. in 17th its world congress. 2010. doi:10.1028/its.slam.nashashibi.
[5] s. hinterstoisser, v. lepetit, s. ilic, et al. dominant orientation templates for real-time detection of texture-less objects. in cvpr, pp. 2257–2264. 2010.
[6] b. yamauchi. frontier-based exploration using multiple robots. in proceedings of the second international conference on autonomous agents, agents '98, pp. 47–53. 1998.
[7] r. simmons, d. apfelbaum, w. burgard, et al. coordination for multi-robot exploration and mapping. in proc. of the aaai national conf. on artificial intelligence. 2000.
[8] w. burgard, m. moors, c. stachniss, f. schneider. coordinated multi-robot exploration. ieee transactions on robotics 21:376–386, 2005. doi:10.1109/tro.2004.839232.
[9] k. m. wurm, c. stachniss, w. burgard. coordinated multi-robot exploration using a segmentation of the environment. in proc. of iros, pp. 1160–1165. 2008. doi:10.1109/iros.2008.4650734.
[10] r. zlot, a. stentz, m. dias, s. thayer. multi-robot exploration controlled by a market economy. in proc. of icra, vol. 3, pp. 3016–3023. 2002. doi:10.1109/robot.2002.1013690.
[11] a. bautin, o. simonin, f. charpillet. minpos: a novel frontier allocation algorithm for multi-robot exploration. in icira, vol. 7507 of lecture notes in computer science, pp. 496–508. 2012.
[12] m. julia, a. gil, ó. reinoso. a comparison of path planning strategies for autonomous exploration and mapping of unknown environments. autonomous robots 33(4):427–444, 2012. doi:10.1007/s10514-012-9298-8.
[13] f. deinzer, c. derichs, h. niemann, j. denzler. a framework for actively selecting viewpoints in object recognition. ijprai 23(4):765–799, 2009.
[14] s. brandao, m. veloso, j. p. costeira. active object recognition by offline solving of pomdps. in proc. of the international conference on mobile robots and competitions, pp. 33–38. 2011.
[15] a. a. makarenko, s. b. williams, f. bourgault, h. f. durrant-whyte. an experiment in integrated exploration. in proceedings of iros, pp. 534–539. 2002. doi:10.1109/irds.2002.1041445.
[16] c. stachniss, g. grisetti, w. burgard. information gain-based exploration using rao-blackwellized particle filters. in proc. of robotics: science and systems (rss). cambridge, ma, usa, 2005.
[17] d. s. bernstein, r. givan, n. immerman, s. zilberstein. the complexity of decentralized control of markov decision processes. math oper res 27:819–840, 2002.
[18] m. l. puterman. markov decision processes. 1994.
[19] r. nair, p. varakantham, m. tambe, m. yokoo. networked distributed pomdps: a synthesis of distributed constraint optimization and pomdps. in proc. of aaai, pp. 133–139. 2005.
[20] f. s. melo, m. m. veloso. decentralized mdps with sparse interactions. artificial intelligence 175(11):1757–1789, 2011. doi:10.1016/j.artint.2011.05.001.
[21] a. canu, a.-i. mouaddib. collective decision-theoretic planning for planet exploration. in proc. of ictai. 2011.
[22] j. schneider, w.-k. wong, a. moore, m. riedmiller. distributed value functions. in proc. of icml, pp. 371–378. 1999.
[23] l. matignon, l. jeanpierre, a.-i. mouaddib. coordinated multi-robot exploration under communication constraints using decentralized markov decision processes. in proc. of aaai. 2012.
[24] s. le gloannec, l. jeanpierre, a.-i. mouaddib. unknown area exploration with an autonomous robot using markov decision processes. in proc. of taros, pp. 119–125. 2010.
[25] l. matignon, l. jeanpierre, a.-i. mouaddib. distributed value functions for multi-robot exploration. in proc. of icra. 2012. doi:10.1109/icra.2012.6224937.
[26] r. bellman. dynamic programming: markov decision process. 1957.
[27] j. e. bresenham. algorithm for computer control of a digital plotter. ibm systems journal 4(1):25–30, 1965.
measurement of moisture fields in the bridge structure of charles bridge

j. římal

this paper describes measurements of the moisture field of charles bridge in prague. the measurements were scheduled to cover a one-year cycle, including the spring, summer, autumn and winter, in order to monitor the behaviour of the bridge structure over a period of one year.

keywords: charles bridge, measurement of moisture fields, heat and mass transport.

1 introduction

moisture of building materials and structures is very variable. it is dependent on external climatic conditions, and also on the internal conditions in each specific building structure. while external climatic conditions are independent, internal conditions may be affected by the selection of a structural system and suitable materials. the moisture distribution in construction materials is non-uniform, being dependent on pressure, temperature and the texture of the applied material and the structural type. moisture penetration into building materials and structures, including subsequent moisture transfer in them, is brought about and affected by the following factors:

• climatic conditions
• water-vapour diffusion and capillary moisture conductivity due to temperature and moisture gradients
• capillarity and capillary absorption capacity
• sorption and de-sorption.

moisture field changes are caused not only by different and time-dependent external conditions, but also by stable conditions, characterized by temperature, moisture and pressure gradients. moisture may be found in a gaseous, liquid, or even solid state.

2 anticipated moisture effects in the structure of charles bridge in prague

2.1 anticipated moisture effects of fully functional hydroinsulation

charles bridge, a stone structure across the vltava river, is exposed to climatic effects, including rain and snowfall, solar radiation and the flow of air and water vapour that arises over the water level of the river, which does not freeze over in winter. water from rain and melting snow affects the surface of the sandstone bridge blocks. the water partly drips down its spandrel and breast walls and partly soaks in, particularly at the base of the spandrel walls, and leaks through cracks in joints between the blocks. as a result of wetting of the breast walls and spandrels by water, the sandstone blocks become soaked with water. fully functional hydroinsulation eliminates water leakage from the pavement surface. it creates a closed internal space enclosed by the arch, breast walls and pavement, filled with hand-placed arenaceous marl, reinforced concrete slab, expanded-clay concrete and dab below the insulation.
photo 1: charles bridge in prague

in this space, on the surface of the breast walls and below the insulation, water vapour condenses due to temperature differences, reaching the dew point, and it drips down the extrados of the arch. in winter time, the entire structure freezes through, but the freezing process occurs more slowly at the crown of the arch towards the piers. in the hand-placed arenaceous marl close to the piers, the core remains unfrozen. it can only freeze through under extremely long-lasting low temperatures. this phenomenon probably affects the moisture transfer and brings about changes in the moisture field of the structure. the moisture field of the bridge structure is also influenced by the air flow from the wind and the impact of the free water level of the river, which continuously releases water vapour, thus maintaining a permanently high air moisture content. this moisture affects the soffit of the bridge arch, particularly at the footings close to the pier. this state is further worsened by moisture accumulating at the extrados of the arch. higher summer temperatures and solar radiation, which is more or less one-sided (coming from the south side), also affect the moisture field of the bridge structure. finally, we can assume that there are some chemical effects from the acid rain and earlier bridge pavement salting on the moisture field changes.

2.2 anticipated moisture field of poorly functioning hydroinsulation

if hydroinsulation is not applied correctly, particularly at the ends and the connections to the spandrel walls, massive leakage into the bridge occurs. water accumulates at the extrados of the arch and its level increases if not drained. at the same time, layers of the pavement and the hand-placed arenaceous marl become to a very large extent saturated with water. the high water content brings about excessive ice formation in the winter season, which negatively affects the bridge structure at the spandrel walls and in the bridge structure itself. continuous thin accretion is formed on the breast walls, mostly at places of maximum leakage. on the extrados, particularly at the footings, ice wedges may even appear. therefore, high-quality, long-functioning hydroinsulation is essential for the long service life of the bridge. moreover, continuous monitoring of the effectiveness and the condition of the insulation by continual measurement of the bridge moisture field, even after the completed reconstruction, is vital.

3 measuring methods and the measuring system

3.1 methods of moisture measurement

measurement methods can involve destructive and non-destructive procedures. they can also be classified, according to the manner of moisture measurement, as direct and indirect methods. direct methods involve measuring the water content in the construction material. the main principle of indirect methods is measurement of the functional dependence of moisture on a selected physical variable, such as electric resistance, capacity, absorption of gamma radiation, thermal conductivity, etc.
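for reference, the quantity obtained by the direct (gravimetric) method is the mass moisture content; the following is the standard definition, added here for clarity and not quoted from the paper:

w = \frac{m_{wet} - m_{dry}}{m_{dry}} \cdot 100\,\%,

where m_wet is the mass of the sample as taken from the structure and m_dry its mass after oven-drying to constant weight.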
overview of the methods:

direct methods
• gravimetric method

indirect methods
• chemical method
• thermal method
• tensiometric method
• gammascopic method
• neutron method
• electric method (resistance, capacity, electromotive tension, etc.)

the gravimetric method is a destructive procedure, its major advantage over all the other measurement methods being that it can be used for any building material, with any moisture content, without advance calibration. the gravimetric method is the fundamental procedure for calibrating various indirect measurement methods. it has disadvantages, though, which lead to the search for other measurement procedures. its disadvantages include the use of sampling, which makes continuous monitoring of moisture at a certain place in the structure impossible. further, gravimetric methods cannot be used in places with difficult access. another weak point is the need for laboratory processing, which results in high time demands. the gravimetric method is effective for gathering information on moisture in the monitored material at the place of sampling, provided no moisture changes take place during transport and laboratory measurement. it is entirely inappropriate for monitoring moisture field changes with space and time dependency. indirect measurement methods enable researchers to follow changes continuously. however, practically all of them require a calibration curve to be determined for the empirical dependency of moisture on a selected measured parameter, e.g. electric resistance or capacity. the accuracy of measurement is therefore limited in comparison with the gravimetric method, which is a universal and fundamental method for measuring the moisture of construction materials.

3.2 conditions for moisture field measurement

it is necessary to measure changes in temperature and moisture fields at the same measurement points and at equal time points, i.e. almost at the same time. the temperature field and the moisture field are closely related. moisture field measurement cannot be performed without parallel temperature field measurement. this principle was respected during measurements on charles bridge. as part of the construction works done within temperature field measurement, the moisture field sensors were placed in advance. this helped to eliminate the need for new building works, and saved costs and time. temperature field measurement was conducted independently throughout a one-year cycle. in the following year of experimental measurements on charles bridge, temperature and moisture field measurements were performed at the same time.

3.3 the measuring system and measuring probes

the measuring system was tested in an experimental construction of a feeder road in prague – spořilov no. e-d/47 within the project new technologies of hydroinsulation of bridge pavements. the measurements were conducted by the author. moisture field measurements of the structure of charles bridge were conducted with a digital system using devices such as the ultrakust (frg), multimetre (uk) and sanwa electric (japan) systems. a patent for the moisture field measurement methods for building structures is currently being applied for. photo 2 shows the moisture sensors built in rollers made of the original material together with the temperature sensors.
photo 2: arrangement of the temperature and conductivity sensors

the cable line from the moisture sensors is led along the external side of the protective tube of the temperature sensor line (see graph xy) and the cable line leading to the measuring centre. the following table presents basic information on the placement of the moisture and temperature sensors in the structure of the bridge. the depth of placement was measured from the bottom of the working spaces of a – pier and b – arch, which were laid on insulation material.

point a – pier (point of measurement – material – installation depth [m]):
a – loose material – 1.37
b – expanded-clay concrete – 0.48
c – dab – 0.10
d – reinforced concrete slab – 0.56

point b – arch (point of measurement – material – installation depth [m]):
f – sandstone – 0.99
g – expanded-clay concrete – 0.47
h – reinforced concrete slab – 0.48
i – dab – 0.10
j – below the pavement of the bridge deck
l – air temperature at the bridge deck level
m – lower soffit of arch no. x (depth of placement measured from the bottom part of the arch) – 0.07

4 results of moisture and temperature field measurement

results of measurement of the moisture field are presented graphically (see graph no. 1).

graph no. 1

marking of moisture sensors (in graph no. 1):
point a – pier: a – loose material; b – expanded-clay concrete; c – dab; d – reinforced concrete slab.
point b – arch: f – sandstone; g – expanded-clay concrete; h – reinforced concrete slab; i – dab; j – below the pavement of the bridge deck.

in order to solve transfer differential equations describing mass and heat transport applied to charles bridge, it is necessary to know a number of material constants. further, it is essential to determine the initial and limit conditions and time courses of the temperature and moisture fields. temperature field measurements therefore had to be conducted parallel to moisture field measurements of the charles bridge structure. the results of the temperature field measurements are shown in graph no. 2.

graph no. 2

markings of temperature sensors in the temperature field graph (graph no. 2):
point a – pier: a – loose material; b – expanded-clay concrete; c – dab; d – reinforced concrete slab.
point b – arch: f – sandstone; g – expanded-clay concrete; h – reinforced concrete slab; i – dab; j – below the pavement of the bridge deck; l – air temperature at the bridge deck level; m – lower soffit of arch no. x (in the middle of the arch).

5 conclusion

high mass moisture values of expanded-clay concrete (see graph no. 1) indicate water leakage into this structural layer placed above the reinforced concrete slab. the material retains the gained moisture over a longer period of time. the revealed high value of the mass moisture of expanded-clay concrete is a worrying finding of the moisture field measurement experiment. it is suggested that a detailed analysis of this should be made and that further experimental testing should be carried out in order to find out the suitability of applying expanded-clay concrete in the forthcoming reconstruction of charles bridge.
measurement of the moisture field and of the changes in the moisture field allows researchers to judge the impact of moisture on the bridge structure and its durability, and to specify the causes of failures brought about by moisture. as a result, the main aim remains the collection of information on the quantity of water present in the structure and on changes in the moisture field related to changes in the temperature field. experimental results suggest interaction and interrelation between the moisture and temperature fields. the graphical documentation shows the temperature and moisture behaviour of the materials due to climatic changes within a time dependency in a one-year season. it is recommended that measurements of the temperature and moisture field of charles bridge should be continued, using all the measuring methods, until conclusions are finalized for the reconstruction of charles bridge in prague. the temperature and moisture field sensors remain in their places in the charles bridge structure and can be reactivated to facilitate continuation of the experiment. this fact is very significant for the current situation of charles bridge after the floods in august 2002.

acknowledgement

this project is being carried out with the participation of ph.d. students ing. jan pytel and ing. jana zaoralová and undergraduate student jakub římal. this research has been supported by the grant agency of the czech republic under grant no. 103/00/0776.
doc. rndr. jaroslav římal, drsc.
phone: +420 224 354 702, fax: +420 233 333 226
e-mail: rimal@fsv.cvut.cz
czech technical university, faculty of civil engineering
thákurova 7, 166 29 prague 6, czech republic

acta polytechnica 59(1):77–87, 2019. doi:10.14311/ap.2019.59.0077

evaluation of railway ballast layer consolidation after maintenance works

mykola sysyn (a,*), olga nabochenko (b), vitalii kovalchuk (b), ulf gerber (a)
(a) technical university of dresden, hettnerstr. 1/3, dresden, germany
(b) dnipropetrovsk national university of railway transport, blagkevich 12a, lviv, ukraine
(*) corresponding author: mykola.sysyn@tu-dresden.de

abstract. the results of the study of the ballast layer consolidation after the work of ballast-tamping machines of different types are given in the article. the existing methods of determining the degree of consolidation of the ballast layer are analysed. the seismic method was improved by means of a complex dynamic and kinematic interpretation of the impulse response. for the dynamic interpretation with the use of statistical analysis, the features are selected so that they correspond to the degree of consolidation of the ballast layer. on the basis of the research, a device and software were developed that allow an automated evaluation of the ballast layer consolidation based on the kinematic and dynamic analysis of the measured impulse response. the measurements of the degree of the ballast layer consolidation after the operation of ballast-consolidation machines in different sequences allowed establishing the efficiency of the consolidation and the feasibility of the machines' application.

keywords: ballast layer consolidation; tamping machines; seismic method; impulse response interpretation; signal processing; feature selection; classification; clustering.

1. introduction

the long-term deformation of the track geometry, to a great extent, depends on the initial quality of the track. one of the most effective means to improve the track operation is to achieve the highest possible initial quality of the track during its construction [1–6]. at present, the quality of the construction of the ballast layer and its consolidation is estimated by the stability of the track geometry. the degree of the ballast layer consolidation corresponds to its greatest bearing capacity and resistance to a shape change. the information about the degree of the ballast consolidation under sleepers is important because it gives the possibility to evaluate the quality of the repair works, as well as the further deformation of the track geometry. as studies show [7–11], a well-consolidated ballast layer after the track repair makes it possible to significantly extend the lifecycle period of the ballasted track and track structures. among the known nondestructive methods (ndm) of determining the soil consolidation degree are the following: pneumatic, hydrometric, radiometric (isotope), radio engineering, seismic, hammer test and others. the author of the research [12] gives a classification of possible methods of determining the crushed stone consolidation (figure 1). the seismic methods and radio engineering techniques have found a wide field of application.
figure 1. monitoring methods of ballast layer consolidation [12].

the falling weight deflectometer (fwd) is a method for measuring the subgrade elasticity, which is based on the measurement of the velocity response of the soil oscillation under the influence of the impulsed load caused by some falling mass [13]. this method is adapted to measure the ballast layer elasticity of railways. the sonic echo method measures the time of the wave propagation from the impact point through the ballast and back to the sensor. an accelerometer is used as a receiver, as a rule, and the created oscillation excitation is relatively small, usually with a 1 kg mass. the result of the measurement is the time-shift in oscillation impulses. with a known thickness of the ballast layer, one can draw a conclusion about the ballast consolidation. in the method of ground penetrating radar (gpr) [14, 15], instead of the physical wave of oscillation, short impulses of electromagnetic radiation are generated in the soil, which move from the transmitting antenna to the receiver. the ballast, which has different dielectric properties in depth, reflects the radiation waves in different ways. according to the amplitude and time of the measured wave's passage, the distribution profile of the heterogeneity of wave propagation in the ballast layer is constructed. the gpr method can determine the presence of pollutants in the ballast layer, such as fine and clay constituents. however, this method is not suitable for the measurement of the initial degree of consolidation of the pure ballast. for the measurement of the ballast layer consolidation distribution after tamping machines, the seismic technique with a kinematic interpretation and the device uvp-diit was applied [12]. the method is based on techniques from seismic survey. the traditional field of seismic survey application is the study of relatively homogeneous masses of soils of a large size. unlike the soil, the railway track is a more complicated object in terms of the design and variety of material properties. the impact of sleepers and the subgrade on the overall measurement result can be considerable. in most cases, the soil density analysis uses the method of kinematic interpretation of a signal, which is based on determining the propagation speed or time of flight (tof) of the acoustic impulse in an elastic medium. the method uses only a minimal part of the information recorded by seismic sensors, and does not take into account the whole process of oscillation and its nature. the application of this kinematic method of interpretation to determine the properties of the ballast layer is complicated, as the track is a physically non-uniform object and the velocity of the impulse propagation is affected not only by the ballast layer, but by other elements too. that is confirmed by a considerable ambiguity and spread of the measured values. in addition, the purpose of kinematic interpretation in seismic investigation was traditionally not to determine the properties of the elastic medium, but to determine the boundaries between different sections of the medium.
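for clarity, the kinematic interpretation reduces each measurement to a single number: with an emitter-receiver base l and a measured first-arrival time t_tof, the apparent propagation velocity is

v = \frac{l}{t_{tof}},

so all information contained in the later part of the record is discarded; this compact restatement is ours, added to underline the limitation discussed above.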
recently, the dynamic interpretation of seismic data is increasingly being used, which, in addition to the speed, also takes into account the entire recording of the dynamic response signal to the impulsed load, analogous to the methods used in the seismic investigation [13, 16, 17]. among these methods, spectral methods of dynamic interpretation are increasingly being applied, such as the so-called rail road response factor (rrrf), used to evaluate the condition of the subgrade by means of waves excited by trains of different types and speeds [18]. in addition, non-destructive seismic methods for assessing the dynamic response of the superstructure of the track are used in the so-called hammer test [19–23], which is used to evaluate the dynamic properties of the track and the ballast layer in its composition. the peculiarity of the measurements of the ballast layer consolidation is that the results of the measurements, in addition to the degree of consolidation, are influenced by many other factors, such as the ballast layer thickness, the influence of sleepers, etc., which determine the nature of the wave passage and distribution in the ballast layer. the influence of all the undetermined factors causes considerable deviations of the measurement results. this fact, together with the big number of measurements, demands the application of computer-based processing and data mining methods. the methods, together with laboratory tests, make it possible to reduce the influence of unknown factors and to select the informative features of the ballast consolidation. the data mining methods are widely used in transportation research. the general methodology of a big data analytics application for a track maintenance planning system is described in the study [24]. the practical application of the methods for the track maintenance is shown in another paper [25] on an example of a track quality index analysis. the data-driven rail-infrastructure monitoring that is based on data fusion, feature extraction, selection and other data mining methods is presented in the literature [26–28]. in the present paper, one part of the data mining is used: the feature extraction and selection that is based on the statistical classification and clustering methods. the purpose of this experimental study is to establish the effectiveness of ballast tamping with tamping machines and dynamic stabilizers of various designs and an optimal mode of their operation. the problem is solved in three subsequent steps:

• development of the measurement device together with signal processing software;
• laboratory measurements and consolidation feature selection with statistical learning methods;
• in-situ measurement of the ballast consolidation after tamping works.

figure 2. principal scheme of the device for assessing ballast consolidation: 1 – server for signal processing; 2 – wi-fi microcontroller unit with adc; 3 – electrodynamic emitter of the pulsed acoustic signal (shaker); 4 – seismic sensor sv-20p; 5 – rail; 6 – sleeper; 7 – ballast layer.

2. experimental measurements of the impulse response of the ballast layer with different consolidation degrees

the propagation of elastic waves through a grainy medium, which is the ballast layer of the railroad, is determined by the mineralogical and granulometric composition of the grains. to a large extent, it depends on the number of contacts between the grains, that is, on the ballast layer consolidation.
thus, by measuring the velocity of the elastic wave propagation after each passage of a dynamic rail stabilizer, it is possible to evaluate the degree of the ballast consolidation. for the measurements, a seismic sensor of the type sv-20p is used, which operates on the principle of measuring the velocity of oscillations and the effect of electromagnetic induction [16]. the creation of the impulse in the ballast layer is carried out by means of a plate, which distributes the load between ballast grains on the surface of the ballast layer, and a hammer connected to it in an electric circuit. the source of elastic waves is an impact on the metal disk (diameter 100 mm, height 10 mm), which is placed on the ballast. the size of the plate is enough to provide simultaneous support on three neighbouring stones with a maximal size of 60 mm. at the same time, a bigger plate could produce some ambiguity in the interpretation due to the short distance between the receiver and the emitter. the registration of the wave velocity was carried out by a digital analyser of the uvi-2 type. the emitter is located in the ballast boxes adjacent to the signal receiver along the track axis. in addition, other layout schemes are considered, such as the transverse direction, location through several ballast boxes, etc. the considerations on the optimal measurement scheme are described in the study [29]. the distance between the shaker and the receiver is 0.5–1.5 m. it has been established that the granular medium can be considered as homogeneous with corresponding characteristics at a distance between the shaker and the receiver of approximately 10 d (where d is the mean grain diameter). in parallel with the measurements of the wave passage velocity, the signal recording from the seismic receiver and further kinematic and dynamic analyses based on the developed automatic system were used (figure 2). the prototype of the device includes the following units:

• an electrodynamic emitter of the impulsed acoustic signal (shaker), whose amplitude-frequency characteristic allows generating signals in the low-frequency part of the spectrum;
• an amplifier of the generated impulsed acoustic signal, necessary for amplifying the output signal of the emitter;
• a seismic sensor sv-20p;
• an mcu (esp32 programmable microcontroller), which executes, together with the shaker, the command of the server program for the impulse excitation, performs the numerical registration of the analog signal from the geophone by the ads1256 module with a frequency of 15 khz, and sends the information to the server for further processing;
• special software that manages the controller and the acceptance and processing of the dynamic response signal.

to ensure the operation of the system, the software for the microcontroller was developed, based on the eclipse ide. to determine the criteria for evaluating the impulse response of the ballast, the software was developed using the matlab program package [30], which allows plotting the recorded signal with zooming and viewing functions. the connection with the controller is executed by the wireless protocol udp. the developed software also performs kinematic and dynamic analyses of a signal, or its separate components, using dft methods or a periodogram with averaging of the results, and calculates the analytical signal and the instantaneous frequency. in addition, the statistical processing of the obtained results is carried out and the need for additional measurements is determined. the processing program example is shown in figure 3.

figure 3. the program interface for measuring and processing the results.
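for illustration only, the server side of such an acquisition chain could look like the following python sketch; the port number, datagram layout and sample format are purely hypothetical assumptions, since the actual protocol of the device is not given in the text.

# hypothetical sketch of a server-side udp acquisition loop; the port and the
# packet layout (consecutive little-endian int32 samples) are assumptions only.
import socket
import struct

PORT = 5005        # assumed port
FS = 15000         # sampling frequency of the adc, hz (per the text)

def receive_record(n_samples):
    """collect one impulse-response record of n_samples samples from the mcu."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", PORT))
    samples = []
    while len(samples) < n_samples:
        data, _addr = sock.recvfrom(4096)
        samples.extend(struct.unpack("<%di" % (len(data) // 4), data))
    sock.close()
    return samples[:n_samples]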
In addition, the statistical processing of the obtained results is carried out and the need for additional measurements is determined. An example of the processing program is shown in Figure 3.

2.1. Dynamic interpretation and signal processing

The measured response signal is usually quite complex, so there are many ways of interpreting it. The most common way is to interpret the frequency response (hereinafter FR) of the measured signal. Conventionally, a Fourier series is used for this purpose, which allows decomposing the time series of the signal into elementary harmonic components of different amplitudes, frequencies and phases. The disadvantage of this method is that a part of the information associated with time is lost during the Fourier transformation. To eliminate this disadvantage, other criteria for evaluating the signal are used: the instantaneous frequency (IF) and the analytic signal found by the Hilbert transformation (hereinafter HT), the spectrogram and the wavelet analysis [31]. In contrast to the FR, the instantaneous frequency allows detecting the change of the average frequency of the oscillations of the response signal during the oscillation time. The analytic signal is a two-way envelope of a real signal and allows simplifying it into a form that is easier to interpret. The spectrogram and wavelet analyses provide additional interpretation parameters, but also demand a more complicated subsequent statistical processing. Therefore, in this study, it is proposed to analyse the recorded dynamic response signal using the two given spectral methods of dynamic interpretation, FR and HT, and also to compare the results with the results of the kinematic interpretation, carried out in parallel.

A nonparametric periodogram [31] is used in the practical calculations of the power spectrum of the discrete signal. The method estimates the spectral power from $N$ samples of one realization of the random process. The periodogram is calculated by the formula:
$$ W(\omega) = \frac{1}{N f_d} \left| \sum_{k=0}^{N-1} x(k)\, e^{-j\omega k T} \right|^2 , \tag{1} $$
where $T$ is the sampling period, $f_d$ the sampling rate, $x(k)$ the signal data, and $\omega$ the angular frequency of the spectrum.
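A minimal sketch of eq. (1) in Python follows (the paper's own processing was done in MATLAB); the DFT evaluates the sum at the discrete frequencies $\omega = 2\pi f$, and no additional one-sided scaling factor is applied, matching (1) bin by bin.

```python
import numpy as np

def periodogram(x, fd):
    """Periodogram of eq. (1): W = |sum_k x(k) exp(-j w k T)|^2 / (N fd).
    Returns the one-sided frequency axis in Hz and the spectral estimate."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    spectrum = np.fft.rfft(x)          # DFT of the record
    w = np.abs(spectrum) ** 2 / (n * fd)
    freqs = np.fft.rfftfreq(n, d=1.0 / fd)
    return freqs, w
```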
In addition to the criterion of the signal spectrum, the criteria of the analytic signal and of the instantaneous frequency are used to estimate the ballast layer consolidation. In the frequency analysis of the dynamic ballast response with the help of the discrete Fourier transform (DFT), important information is obtained about the spectral composition of the signal; however, the part of the information that corresponds to the timing of each component during the oscillation period is lost. As is known from experimental and theoretical studies, waves of different frequencies occupy different areas in the elastic environment and pass along different paths from the emitter to the receiver. Therefore, one should expect the high frequencies to dominate at the beginning of the signal in these measurements. However, the peculiarity of the dynamic response in elastic environments is that the dynamic response recorded by the receiver always has a stable harmonic component, regardless of the shape and duration of the load pulse of the emitter. Thus, the task of evaluating the dependence of the oscillation parameters on time arises.

Taking into account the existing problem as well as the properties of the signal, the actual signal $s(t)$ can be replaced by a combination of three functions:
$$ s(t) = a(t) \cos\big(\omega_0 t + \varphi(t)\big), \tag{2} $$
where $a(t)$, $\varphi(t)$ are the amplitude and the phase, which are functions of time. In the complex form, $s(t)$ can be represented by the real part of the analytic signal $z(t)$:
$$ z(t) = s(t) + i\, s_{\perp}(t), \tag{3} $$
where $s_{\perp}(t)$ is the conjugate signal. In the exponential form, the dependence (2) reads
$$ z(t) = a(t)\, e^{i\phi(t)}, \tag{4} $$
where $\phi(t)$ is the instantaneous phase. The unknown component is the conjugate signal $s_{\perp}(t)$. To find it, the Hilbert transformation is used:
$$ s_{\perp}(t) = \frac{1}{\pi} \int_{-\infty}^{\infty} \frac{s(t')}{t - t'}\, dt'. \tag{5} $$
The conjugate signal found in this way is used to determine the instantaneous values of the amplitude $a(t)$, phase $\phi(t)$ and frequency $\omega(t)$:
$$ a(t) = \sqrt{s^2(t) + s_{\perp}^2(t)}, \qquad \phi(t) = \operatorname{arctg}\frac{s_{\perp}(t)}{s(t)}, \qquad \omega(t) = \frac{d\phi(t)}{dt} = \frac{s'_{\perp}(t)\, s(t) - s'(t)\, s_{\perp}(t)}{s^2(t) + s_{\perp}^2(t)}. \tag{6} $$
The function $a(t)$ is a two-way envelope of the signal $s(t)$ and is also called the analytic signal. The instantaneous frequency $\omega(t)$ differs from the frequency of the spectral components by the fact that, at a given point of time, it takes only one value, whereas the number of components with different frequencies at one instant of time may be infinite. Unlike the DFT, this technique discards the part relating to the distribution of the signal over the spectrum.
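Equations (3)–(6) are implemented in most signal-processing libraries; a minimal Python sketch is shown below (scipy.signal.hilbert returns the analytic signal directly, so the conjugate signal of eq. (5) does not need to be integrated explicitly).

```python
import numpy as np
from scipy.signal import hilbert

def analytic_signal_features(s, fd):
    """Envelope a(t) and instantaneous frequency from eqs. (3)-(6)."""
    z = hilbert(s)                        # analytic signal z = s + i*s_perp
    a = np.abs(z)                         # instantaneous amplitude a(t), eq. (6)
    phi = np.unwrap(np.angle(z))          # instantaneous phase phi(t)
    inst_f = np.gradient(phi) * fd / (2 * np.pi)   # instantaneous frequency in Hz
    return a, inst_f

# e.g. the consolidation indicator discussed in section 2.3 is the maximum of
# inst_f near the onset of the oscillation: inst_f[: int(0.001 * fd)].max()
```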
Before the field measurements, laboratory bench test measurements were performed to determine the characteristic features. After that, field measurements were made on the sections of the track where repairs had been carried out. To acquire comparative values of the criteria, the measurements were initially carried out during laboratory bench tests on consolidated and unconsolidated crushed stone, and then the natural measurements on the track were carried out. The most rational arrangement of the signal emitter and receiver on the track was chosen.

Figure 4. The dependence of the wave propagation velocity on the consolidation of the crushed stone layer.

2.2. Laboratory tests

To determine the characteristic features distinguishing consolidated from unconsolidated ballast, laboratory research was carried out on a special test-bench that allows creating consolidated ballast. A box of 1.05 × 0.70 × 0.80 m size, filled with crushed stone, was used as the test-bench. The box was filled with a railroad crushed stone mixture to a depth of 0.60 m without consolidation. The degree of its compaction is calculated as the ratio of the mass of the crushed stone to its volume. This is the least consolidated ballast. Three levels of compaction are considered: maximally unconsolidated, maximally consolidated and intermediate. A gradual consolidation of the ballast is then performed with the help of a vibro-plate, which is attached to an electric track-tamping machine. At each stage of the compaction, the measurements of the wave passage velocity from the striker to the receiver are carried out. The measurements and records are performed on a base of 0.5 m, according to the described algorithm. Different conditions of the contact between the sensor and the ballast are considered. From the measured data, a kinematic and dynamic interpretation of the impulse response is performed. The results of the kinematic interpretation can be seen in the form of graphs of the dependence of the wave propagation velocity on the density of the crushed stone (Figure 4).

As can be seen from the diagram, the fine crushed stone consolidates more than the railroad crushed stone mixture: the compaction degree of the fine crushed stone reaches 1650 kg/m³, compared with 1550 kg/m³ for the railroad crushed stone mixture. Comparing the data obtained during the research, conclusions can be drawn about the crushed stone consolidation on the experimental sections of the railway.

2.3. Selection of informative features for consolidation

For a statistically substantiated determination of the characteristic features of consolidated or unconsolidated ballast, a special statistical analysis is carried out. It consists of a complex analysis and identification of common and distinctive features of the periodograms by means of cluster and classification analysis [32].

Figure 5. The results of the distribution into two hierarchical clusters without taking the measurement amplitude into account.

The cluster analysis can answer the question into which groups the aggregated multivariate experimental data collected in the general sample can be divided, but it does not show by which characteristic features one or another group can be distinguished. In this study, the hierarchical cluster grouping method is used. The above mentioned methods of classification and grouping are used initially for the analysis of the laboratory measurements. The group of laboratory measurements includes 12 records of the impulse response of consolidated and unconsolidated crushed stone under different contact conditions. Since, in this study, the amplitude of the signal is a rather subjective value, which depends on the contact between the excitation source, the crushed stone and the sensor, this value is neglected, and the amplitude is shown as a percentage of the maximum value. In the hierarchical cluster analysis, the measurements are divided into groups using a similarity measure of the correlation type. The results of the distribution into two groups without taking the amplitude of the measurements into account are shown in Figure 5. Each of the charts shows one of the two most distinct groups. The group in Figure 5a contains only periodograms of measurements of the consolidated ballast. The group in Figure 5b contains two periodograms of measurements of the consolidated ballast and all periodograms of measurements of the unconsolidated ballast. When the entire range of the periodograms in the graphs is considered, it is impossible to decide definitely whether the ballast is consolidated or not. However, if only the initial frequency range of 0 to 200 Hz is considered, the division into clusters is carried out without errors. Figure 6 shows the clusters in the frequency range 60–150 Hz for a better visualization. This distinctive signal recognition is due to the fact that there is a significant difference between the periodograms of the signals in consolidated and unconsolidated ballast in this range of the spectrum. The inaccurate grouping of the periodograms of the signals when the entire spectrum is considered is due to the fact that the expressed similarity at the beginning of the spectrum is lost in the general statistical uncertainty of the rest of the spectrum. The cluster analysis statistically confirms that the measurements made can be rather accurately attributed to the measured groups.
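A minimal sketch of this grouping step is given below, assuming the 12 periodograms are stacked row-wise in an array P over a frequency axis freqs. The paper names only a correlation-type similarity; the average-linkage variant used here is a plausible assumption, not a stated choice of the authors.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import linkage, fcluster

def cluster_periodograms(P, freqs, band=(60.0, 150.0), n_clusters=2):
    """Hierarchical clustering of periodograms with a correlation-type
    similarity, restricted to the informative band (60-150 Hz in the paper)."""
    sel = (freqs >= band[0]) & (freqs <= band[1])
    dist = pdist(P[:, sel], metric="correlation")   # 1 - Pearson correlation
    tree = linkage(dist, method="average")
    return fcluster(tree, t=n_clusters, criterion="maxclust")
```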
Therefore, it can be assumed that there is a certain set of informative features that allows assigning the periodograms of the signals to one or the other group. To find them, a linear discriminant analysis is performed [33]. The results are shown in Figure 7 with 2 double vertical axes that correspond to the average power spectral densities for the unconsolidated and the compacted ballast, together with the measure of the difference.

Figure 6. The results of the division into two hierarchical clusters while considering the part of the signal spectrum from 60 to 150 Hz.

The measure is Fisher's criterion. Considering Figure 7, one can see that the maximum value of Fisher's criterion, F = 22, corresponds to the frequency of 118 Hz, while the critical value of Fisher's criterion for 12 levels and a probability of 0.95 is F_k = 3.89 [33]. This means that the frequency near the 118 Hz point is representative for evaluating the ballast consolidation. In this case, the estimation of the difference between the periodograms of the signals according to Fisher's criterion is performed over the entire frequency range, which shows that the distinction between the groups of signals differs at the various frequencies of the spectrum. Hereinafter, the selected frequency feature will be called the low side spectrum (LSS). It is possible to conclude that the maximally consolidated ballast state under laboratory conditions corresponds to an LSS frequency of approximately 122 Hz, and the unconsolidated ballast corresponds to 103 Hz.

The graphs of the instantaneous frequency change during the response signal passage before and after the ballast consolidation are shown in Figure 8. Considering the graphs, one can see that, at the beginning of the signal (up to 0.0003 s), the prevailing frequency is 450–650 Hz, while the average value of the frequency over the rest of the signal is 300 Hz. Large fluctuations of the frequency occur due to the superposition of the direct and reflected waves in the signal. This leads to an uncertainty in the frequency estimation. The greatest difference between the graphs of the instantaneous frequency of the consolidated and the unconsolidated ballast is observed at the beginning of the oscillation, so this point can be used as a characteristic sign of consolidation. From the physical point of view, the maximum value of the instantaneous frequency is connected with the leading edge of the wave, which corresponds to the time from the beginning of the oscillations to the amplitude in the first half-wave of the oscillation, and is of an inversely proportional magnitude. The leading edge of a wave is traditionally used in the analysis of the physical properties of soils. One of the disadvantages of evaluating the leading edge of a wave is a rather large uncertainty in the determination of the starting point of the oscillation, caused by the smoothness of the growth from the zero signal level, which leads to some error. The instantaneous frequency criterion does not have this disadvantage, since only its maximum value is selected. The laboratory studies showed the following results:
• before the consolidation, the leading edge of the wave is 0.9285 ms with a relative error of 14.2 %, and the maximum instantaneous frequency is 538.9 Hz with a relative error of 4.01 %;
• after the consolidation, the leading edge of the wave is 0.869 ms with a relative error of 12.9 %, and the maximum instantaneous frequency is 586.1 Hz with a relative error of ±6.71 %.
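The per-frequency separability score behind the LSS selection described above can be sketched as follows. The exact variant of Fisher's criterion used in the paper is not stated; the common two-class ratio of squared mean difference to pooled variance is assumed here.

```python
import numpy as np

def fisher_criterion(P_cons, P_uncons):
    """Two-class Fisher ratio per frequency bin.
    P_cons, P_uncons: periodogram groups of shape (n_records, n_freq)."""
    m1, m2 = P_cons.mean(axis=0), P_uncons.mean(axis=0)
    v1 = P_cons.var(axis=0, ddof=1)
    v2 = P_uncons.var(axis=0, ddof=1)
    return (m1 - m2) ** 2 / (v1 + v2)

# the LSS feature is then the frequency of maximal separation:
# f_lss = freqs[np.argmax(fisher_criterion(P_cons, P_uncons))]
```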
Figure 7. Graphs of the averaged values of the periodograms of the signals for the groups of consolidated and unconsolidated ballast, and the value of Fisher's criterion.

Figure 8. The change of the instantaneous frequency of the signals at the initial moment of the signal.

The simplified statistical study of the grouping and classification of the periodograms of the ballast layer impulse response, without regard to the amplitude, obtained under laboratory conditions shows a high degree of reliability of the ballast consolidation assessment and a low error rate.

3. In-situ measurements of the ballast layer consolidation after track repair with tamping and stabilization

On the basis of the developed device and technique, experimental measurements of the ballast consolidation after track repair with the use of the ballast-tamping machines Duomatic and the dynamic stabilizers DGS (produced by the Austrian firm Plasser & Theurer) and DSP (produced by the Russian company Remputmash, DSP-C4) were performed. The field measurements were performed on the line Khangzhenkovo–Khartsyzsk of the Donetska railway. The 3 km test track was divided into 6 equal parts, 500 m each, with different tamping machine types and numbers of passages. The absolute settlement measurements were performed by track levelling immediately after the tamping and after 0.3 and 0.8 Mt of traffic load. All measurements are made with a base distance of 0.5 m along the axis of the track (the emitter and the receiver are in the neighbouring ballast boxes) at different stages of the ballast consolidation on the different sections of the track where the repair works had been carried out. In each case, a group of measurements in 20 neighbouring ballast boxes was made to increase the validity. The reliability of the parameter measurements at each ballast box is ensured automatically by a repeated number of impacts and measurements until their mean deviation reaches an acceptable value. The errors of the measurement are minimized by outlier removal and statistical processing.

No.  Machinery         LSS (Hz)  ∆       WLE (ms)  ∆       IF (Hz)  ∆       WV (m/s)  ∆
1    HV + DTE          82.3      10.6 %  0.836     8.8 %   495.7    9.2 %   282       36 %
2    HV + DTE + DGS    86.5      5.9 %   0.785     14.9 %  526.8    13.5 %  318       28 %
3    HV + DTE + DSP    96.6      4.6 %   0.659     9.7 %   624.8    4.3 %   336       22 %
4    HV + DTE + 2DSP   103.6     7.9 %   0.562     24.0 %  641.5    11.9 %  352       18 %

Table 1. The analysis of the results of the experimental in-situ measurements: HV – hopper-vagon; DTE – Dynamic Tamping Express (Duomatic 09-3X); DSP – stabilization with DSP (Remputmash); DGS – stabilization with DGS (Plasser & Theurer).

The results of the analysis of the low side spectrum (LSS), the velocity of the wave propagation (WV), the leading edge of the wave (WLE) and the instantaneous frequency (IF), as well as their relative errors, are given in Table 1.
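The repeat-until-stable protocol described above can be sketched as follows; the acceptable deviation, the minimum number of impacts and the outlier rule are hypothetical, since the paper does not state its thresholds.

```python
import numpy as np

def stable_mean(measure_once, rel_tol=0.05, n_min=5, n_max=30):
    """Repeat impacts until the relative deviation of the retained
    measurements drops below rel_tol (hypothetical threshold)."""
    vals = []
    while len(vals) < n_max:
        vals.append(measure_once())
        if len(vals) >= n_min:
            v = np.asarray(vals)
            med = np.median(v)
            v = v[np.abs(v - med) <= 3.0 * np.std(v) + 1e-12]  # crude outlier removal
            if len(v) >= 2 and np.std(v, ddof=1) / max(abs(np.mean(v)), 1e-12) < rel_tol:
                return float(np.mean(v))
    return float(np.mean(vals))
```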
4. Discussion

As can be seen from the table, the statistical spread of the values of the low side spectrum, the instantaneous frequency and the leading edge of the wave is much smaller than the spread of the average velocity of the wave propagation. The results of the measurements along the track are taken into account, where the emitter and the receiver are arranged in the neighbouring ballast boxes.

The analysis of the mean values of the LSS showed the following. The LSS increases with the number of stabilizer passages: by 22.9 % and 31.8 % for the second and third passages of the DSP, respectively. That is, the difference of the LSS values between the first and second passages was significant and amounted to 22.9 %, while the difference of the values between the second and third passages was negligible and amounted to 8.9 %. After the first passage, the DGS stabilizer showed better LSS results than the DSP: by 4.7 % more with the HV as the last unit in the machine chain, and by 10 % more with the DTE as the last unit in the chain.

The analysis of the mean values of the leading edge of the wave showed that, with a larger number of passages of the DSP, the leading edge of the wave decreases. Between the first and third passages of the DSP, the difference was 39.7 %, and 17.3 % between the second and third passages, respectively. The value of the leading edge of the wave after the first passage of the DGS stabilizer, with the subsequent passage of the DTE and HV, is the largest, which corresponds to an unconsolidated crushed stone.

The analysis of the average values of the instantaneous frequency showed the following. The IF increases with the number of stabilizer passages: by 18.6 % and 21.8 % at the second and third DSP passages, respectively. That is, the difference of the values of the instantaneous frequencies between the first and second passages was significant and amounted to 18.6 %, while the difference of the values between the second and third passages was negligible and amounted to 3.2 %. The value of the instantaneous frequency after the first passage of the DGS stabilizer, with the subsequent passage of the DTE and HV, is the smallest, which also corresponds to the unconsolidated crushed stone. Thus, it can be concluded that the first and second passages of the stabilizer are the most effective, and the third passage is unproductive.

5. Conclusions

The performed theoretical and laboratory studies have shown that the considered parameters of the impulse response in the ballast have a stable correlation with the degree of consolidation of the ballast layer. The experimental studies of the degree of the ballast layer compaction after the work of the ballast-consolidating machines make it possible to conclude that there is a significant increase in the degree of consolidation after the first passages of the ballast-compaction machines. Further passages, while discharging the entire ballast layer, are ineffective. This indicates the necessity of a step-by-step ballast layer discharge and stabilization. As for further research in the field of the determination of the ballast layer consolidation degree, the following directions are promising:
• the study of the interconnection of the measured kinematic and dynamic values that characterize the degree of the ballast consolidation with the long-term processes of uniform and uneven subsidence of the ballast layer;
• the forecast of track geometry deformation on the basis of the measurements of the consolidation degree.

The improvement of the seismic technique for the consolidation degree measurement is possible in the following directions:
• multisensory synchronized impulse response measurements;
• measurements of the longitudinal and transverse components of the oscillation waves based on triaxial accelerometers;
• determination of the spatial distribution of the crushed stone consolidation along the sleeper and along the track on the basis of seismotomographic methods.
List of symbols

LSS – low side spectrum [Hz]
WLE – leading edge of a wave [ms]
IF – instantaneous frequency [Hz]
WV – velocity of wave propagation [m/s]
TOF – time of flight [s]
NDM – nondestructive method
MCU – micro-controller unit
IDE – integrated development environment
UDP – user datagram protocol
ADC – analog-digital converter
DFT – discrete Fourier transform
GPR – ground penetrating radar
FWD – falling weight deflectometer
RRRF – rail road response factor
HT – Hilbert transform
HV – hopper-vagon
DTE – Dynamic Tamping Express
UVP-DIIT – a type of consolidation measurement device
SV-20P – a type of geophone
ESP32 – a type of MCU
ADS1256 – a type of ADC

References

[1] L. Fendrich, W. Fengler. Handbuch Eisenbahninfrastruktur. Second edition. Springer-Verlag Berlin Heidelberg, 2013. doi:10.1007/978-3-540-31707-4.
[2] S. Fischer. Breakage test of railway ballast materials with new laboratory method. Periodica Polytechnica Civil Engineering 61(4):794–802, 2017. doi:10.3311/ppci.8549.
[3] L. Izvolt, J. Sestakova, M. Smalo. Tendencies in the development of operational quality of ballasted and ballastless track superstructure and transition areas. IOP Conference Series: Materials Science and Engineering 236(1):012038, 2017. doi:10.1088/1757-899x/236/1/012038.
[4] L. Izvolt, J. Harusinec, M. Smalo. Optimisation of transition areas between ballastless track and ballasted track in the area of the tunnel Turecky Vrch. Communications – Scientific Letters of the University of Zilina 20(3):67–76, 2018.
[5] M. Sysyn, U. Gerber, V. Kovalchuk, O. Nabochenko. The complex phenomenological model for prediction of inhomogeneous deformations of railway ballast layer after tamping works. Archives of Transport 3(46):91–107, 2018. doi:10.5604/01.3001.0012.6512.
[6] B. Wang, U. Martin, S. Rapp. Discrete element modeling of the single-particle crushing test for ballast stones. Computers and Geotechnics 88:61–73, 2017. doi:10.1016/j.compgeo.2017.03.007.
[7] B. Lichtberger. Handbuch Gleis: Unterbau, Oberbau, Instandhaltung, Wirtschaftlichkeit. Second edition. Hamburg: Tetzlaff Verlag, 2003.
[8] S. Fischer, E. Juhász. Railroad ballast particle breakage with unique laboratory test method. Acta Technica Jaurinensis 12(1):26–54, 2019. doi:10.14513/actatechjaur.v12.n1.489.
[9] L. Izvolt, J. Sestakova, M. Smalo. Analysis of results of monitoring and prediction of quality development of ballasted and ballastless track superstructure and its transition areas. Communications – Scientific Letters of the University of Zilina 18(4):19–29, 2016.
[10] V. Kovalchuk, Y. Kovalchuk, M. Sysyn, et al. Estimation of carrying capacity of metallic corrugated structures of the type Multiplate MP 150 during interaction with backfill soil. Eastern-European Journal of Enterprise Technologies 1(91):18–26, 2018. doi:10.15587/1729-4061.2018.123002.
[11] O. Nabochenko, M. Sysyn, V. Kovalchuk, et al. Studying the railroad track geometry deterioration as a result of an uneven subsidence of the ballast layer. Eastern-European Journal of Enterprise Technologies 97(1):50–59, 2019. doi:10.15587/1729-4061.2019.154864.
[12] A. Atamanyuk. The technology for ballast layer compaction with machines of type VPO after deep cleaning of ballast layer. PhD thesis. Sankt-Petersburg State University of Railway Transport, 2010.
[13] R. D. Bold. Non-destructive evaluation of railway trackbed ballast. PhD thesis. Institute for Infrastructure and Environment, School of Engineering, University of Edinburgh, 2011.
[14] J. Sadeghi, M. Motieyan-Najar, J. Zakeri, et al.
Improvement of railway ballast maintenance approach, incorporating ballast geometry and fouling conditions. Journal of Applied Geophysics 151:263–273, 2018. doi:10.1016/j.jappgeo.2018.02.020.
[15] R. D. Bold, G. O'Connor, J. Morrissey, M. Forde. Benchmarking large scale GPR experiments on railway ballast. Construction and Building Materials 92:31–42, 2015. doi:10.1016/j.conbuildmat.2014.09.036.
[16] A. Glickman. About the principles of the spectral seismic survey. Geology, Geophysics and Development of Oil and Gas Fields 12:19–24, 1998.
[17] J. Sadeghi. Field investigation on dynamics of railway track pre-stressed concrete sleepers. Advances in Structural Engineering 13(1):139–151, 2010. doi:10.1260/1369-4332.13.1.139.
[18] C. Esveld. Modern railway track. Second edition. MRT-Production, 2001.
[19] H. Lam, M. Wong. Railway ballast diagnose through impact hammer test. Procedia Engineering, The Twelfth East Asia-Pacific Conference on Structural Engineering and Construction 14:185–194, 2011. doi:10.1016/j.proeng.2011.07.022.
[20] J. Smutny, V. Nohal. Vibration analysis in the gravel ballast by measuring stone method. Akustika 25(1):22–28, 2016.
[21] J. Sadeghi, P. Barati. Comparisons of the mechanical properties of timber, steel and concrete sleepers. Structure and Infrastructure Engineering 8(12):1151–1159, 2012. doi:10.1080/15732479.2010.507706.
[22] M. Sysyn, V. Kovalchuk, D. Jiang. Performance study of the inertial monitoring method for railway turnouts. International Journal of Rail Transportation 4(4), 2018. doi:10.1080/23248378.2018.1514282.
[23] J. Sadeghi. Effect of unsupported sleepers on rail track dynamic behavior. Proceedings of the Institution of Civil Engineers – Transport 171(5):286–298, 2018. doi:10.1680/jtran.16.00161.
[24] F. Ghofrani, Q. He, R. Goverde, X. Liu. Recent applications of big data analytics in railway transportation systems: a survey. Transportation Research Part C: Emerging Technologies 90:226–246, 2018. doi:10.1016/j.trc.2018.03.010.
[25] A. Lasisi, N. Attoh-Okine. Principal components analysis and track quality index: a machine learning approach. Transportation Research Part C: Emerging Technologies 91:230–248, 2018. doi:10.1016/j.trc.2018.04.001.
[26] G. Lederman, S. Chen, J. Garrett, et al. A data fusion approach for track monitoring from multiple in-service trains. Mechanical Systems and Signal Processing 95:363–379, 2017. doi:10.1016/j.ymssp.2016.06.041.
[27] M. Sysyn, D. Gruen, U. Gerber, et al. Turnout monitoring with vehicle based inertial measurements of operational trains: a machine learning approach. Communications – Scientific Letters of the University of Zilina 21(1):42–48, 2019.
[28] S. Rapp, U. Martin, M. Strähle, M. Scheffbuch. Track-vehicle scale model for evaluating local track defects detection methods.
Transportation Geotechnics 19:9–18, 2019. doi:10.1016/j.jrtpm.2016.03.001.
[29] M. Sysyn, U. Gerber, V. Rybkin, O. Nabochenko. Determination of the ballast layer degree compaction with dynamic and kinematic analysis of the acoustic waves impacts. Sborník přednášek Železniční dopravní cesta, VOŠ a SPŠ stavební, pp. 123–130, 2010.
[30] J. Mathews, K. Fink. Numerical methods using MATLAB. Third edition. Williams Publishing House, 2001.
[31] A. Sergiyenko. Digital signal processing. Second edition. SPb: Piter, 2003.
[32] D. Larose, T. Larose. Discovering knowledge in data: an introduction to data mining. Second edition. Wiley, 2014.
[33] F. Heijden, R. Duin, D. Ridder, D. Tax. Classification, parameter estimation and state estimation: an engineering approach using MATLAB. Second edition. John Wiley & Sons, 2014.

Acta Polytechnica 59(1):77–87, 2019

Acta Polytechnica doi:10.14311/ap.2017.57.0462
Acta Polytechnica 57(6):462–466, 2017 © Czech Technical University in Prague, 2017, available online at http://ojs.cvut.cz/ojs/index.php/ap

Crypto-Hermitian approach to the Klein–Gordon equation

Iveta Semorádová (a, b)
a Nuclear Physics Institute, Czech Academy of Science, Řež near Prague, Czech Republic
b Faculty of Nuclear Science and Physical Engineering, Czech Technical University in Prague, Czech Republic
Correspondence: semorive@fjfi.cvut.cz

Abstract. We explore the Klein-Gordon equation in the framework of crypto-Hermitian quantum mechanics. Solutions to the common problems with the probability interpretation and the indefinite inner product of the Klein-Gordon equation are proposed.

Keywords: Klein-Gordon equation; probability interpretation; metric operator; crypto-Hermitian operator; quasi-Hermitian operator.

1. Introduction

The urge to unite the special theory of relativity with quantum theory emerged shortly after their discovery. The first relativistic wave equation was introduced in 1926 simultaneously by Klein [1], Gordon [2], Kudar [3], Fock [4, 5] and de Donder and van Dungen [6]. Schrödinger himself had formulated it earlier in his notes, together with the Schrödinger equation [7]. However, several problems arose with the introduction of the Klein-Gordon equation. For a given momentum, the equation allows solutions with both positive and negative energy; it has an extra degree of freedom due to the presence of both first and second derivatives; and, mainly, its density function is indefinite and therefore cannot be consistently interpreted as a probability density. Also, the predictions based on this equation seemed to disagree with experiments (cf., e.g., the historical remark in [8]). Therefore, a few years later, all the attention shifted to the Dirac equation. The more than ninety years old problem of the proper probability interpretation of the Klein-Gordon equation was first solved in 1934 by Pauli and Weisskopf [9], by reinterpreting the Klein-Gordon equation in the context of quantum field theory.
The quantum mechanical approach to the Klein-Gordon equation was forgotten until Ali Mostafazadeh brought it back in 2003 [10]. In his work, he made use of the pseudo- or quasi-Hermitian approach to quantum mechanics. The mathematical ideas of the quasi-Hermitian theory originate from the works of Dieudonné [11] and Dyson [12], though it was not until 1992 that the theory was consistently explained and applied in nuclear physics by Scholtz, Geyer and Hahne [13]. This groundbreaking work initiated a fast growth of interest, popularized in 1998 by Bender and Boettcher [14]. Nowadays, the applications of the theory are moving away from quantum mechanics to other branches of physics, such as optics.

We would like to return to the problem of the proper interpretation of the Klein-Gordon equation in the framework of quantum mechanics only. Several publications concerning this subject have appeared [15–18] or [19–21], but even these studies did not provide an ultimate answer to all of the open questions. Some of them will be addressed in what follows.

2. Klein-Gordon equation in Schrödinger form

The Klein-Gordon equation for a free particle can be written in the common form
$$ \Big( \Box + \frac{m^2 c^2}{\hbar^2} \Big) \psi(t, \mathbf{x}) = 0, \tag{1} $$
where $\Box = \frac{1}{c^2}\partial_t^2 - \Delta = \partial^\mu \partial_\mu$ is the d'Alembert operator. From now on we will use the natural units $c = \hbar = 1$; furthermore, we can denote $K = -\Delta + m^2$ and rewrite (1) as
$$ (i\partial_t)^2 \psi(t, \mathbf{x}) = K \psi(t, \mathbf{x}). \tag{2} $$
The fact that the Klein-Gordon equation is a differential equation of second order in time gives it an extra degree of freedom. Feshbach and Villars [22] suggested a solution to this problem by introducing a two-component wave function, thereby making the extra degree of freedom more visible. Following their ideas, together with even earlier ideas of Foldy [23], we can replace the Klein-Gordon equation with two differential equations of first order in time. Inspired by the convention introduced in [19], we put
$$ \psi^{(1)} = i\partial_t \psi, \qquad \psi^{(2)} = \psi. \tag{3} $$
Now, equation (2) can be decomposed into the pair of partial differential equations
$$ i\partial_t \psi^{(1)} = K\psi^{(2)}, \tag{4} $$
$$ i\partial_t \psi^{(2)} = \psi^{(1)}, \tag{5} $$
which, written in matrix form, become
$$ i\partial_t \begin{pmatrix} \psi^{(1)} \\ \psi^{(2)} \end{pmatrix} = \begin{pmatrix} 0 & K \\ I & 0 \end{pmatrix} \begin{pmatrix} \psi^{(1)} \\ \psi^{(2)} \end{pmatrix}. \tag{6} $$
The Hamiltonian of the quantum system takes the form
$$ H = \begin{pmatrix} 0 & K \\ 1 & 0 \end{pmatrix} \tag{7} $$
and enters the Schrödinger equation
$$ i\partial_t \Psi(t, \mathbf{x}) = H \Psi(t, \mathbf{x}), \qquad \Psi = \begin{pmatrix} \psi^{(1)} \\ \psi^{(2)} \end{pmatrix}. \tag{8} $$
The two-component vectors $\Psi(t)$ belong to
$$ \mathcal{H} = L^2(\mathbb{R}^3) \oplus L^2(\mathbb{R}^3) \tag{9} $$
and the Hamiltonian $H$ may be viewed as acting in $\mathcal{H}$. The so-called Schrödinger form (8) of the Klein-Gordon equation is equivalent to the original Klein-Gordon equation (1). It is of a more familiar form; however, a new challenge arises with the manifest non-Hermiticity of the Hamiltonian (7).

2.1. Eigenvalues

The new form (8) of the Klein-Gordon equation has many benefits. One of them is the simplification of the calculation of its eigenvalues to merely solving the eigenvalue problem for the operator $K$,
$$ K\psi_n = \epsilon_n \psi_n. \tag{10} $$
The relationship between the eigenvalues $\epsilon_n$ of the operator $K$ and the eigenvalues $E_n$ of the non-Hermitian operator $H$ of the Schrödinger form of the Klein-Gordon equation,
$$ \begin{pmatrix} 0 & K \\ I & 0 \end{pmatrix} \begin{pmatrix} \psi^{(1)} \\ \psi^{(2)} \end{pmatrix} = E \begin{pmatrix} \psi^{(1)} \\ \psi^{(2)} \end{pmatrix}, \tag{11} $$
can be easily seen. Equation (11) consists of two algebraic equations,
$$ K\psi^{(2)} = E\psi^{(1)}, \qquad \psi^{(1)} = E\psi^{(2)}. \tag{12} $$
After inserting the second one into the first one, we obtain
$$ K\psi_n^{(2)} = E_n^2 \psi_n^{(2)}, \tag{13} $$
which, compared with equation (10), gives us the following relation between the eigenvalues:
$$ \epsilon_n = E_n^2. \tag{14} $$
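Relation (14) is easy to verify numerically for a finite-dimensional analogue of $H$; a minimal sketch with a random positive diagonal $K$ follows (all names are illustrative, not part of the paper).

```python
import numpy as np

n = 6
eps = np.random.uniform(1.0, 5.0, n)       # positive spectrum of K
K = np.diag(eps)
Z = np.zeros((n, n))
H = np.block([[Z, K], [np.eye(n), Z]])     # H of eq. (7)

E = np.linalg.eigvals(H)
# each eps_n yields the pair E = +/- sqrt(eps_n), so E^2 reproduces eps twice
print(np.allclose(np.sort(E.real ** 2), np.sort(np.repeat(eps, 2))))  # True
```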
We can see that the eigenvalues $E_n$ remain real under the assumption $\epsilon_n > 0$. The relationship between the corresponding eigenvectors,
$$ H\Psi_n^{(\pm)} = E_n^{(\pm)} \Psi_n^{(\pm)}, \qquad \Psi_n^{(\pm)} = \begin{pmatrix} \pm\sqrt{\epsilon_n}\,\psi_n \\ \psi_n \end{pmatrix}, \tag{15} $$
is also easy to see.

2.2. Free Klein–Gordon equation

In the case of the free Klein–Gordon equation, the operator
$$ K = -\Delta + m^2 \tag{16} $$
acting on $\mathcal{H} = L^2(\mathbb{R}^3)$ is positive and Hermitian. It has a continuous and degenerate spectrum. As suggested in [10], we identify the space $\mathbb{R}^3$ with the volume of a cube of side $L$, as $L$ tends to infinity. Then we can treat the continuous spectrum of $K$ as the limit of the discrete spectrum corresponding to this approximation. The eigenvalues are given by
$$ \epsilon_{\vec{k}} = k^2 + m^2 \tag{17} $$
and the corresponding eigenvectors $\psi_{\vec{k}} = \psi^{(2)}_{\vec{k}}$ are
$$ \psi_{\vec{k}}(\vec{x}) = \langle \vec{x}|\vec{k}\rangle = (2\pi)^{-3/2}\, e^{i\vec{k}\cdot\vec{x}}, \tag{18} $$
where $\vec{k} \in \mathbb{R}^3$ and $\vec{k}\cdot\vec{k} = k^2$. We can see that $\psi_{\vec{k}} \notin L^2(\mathbb{R}^3)$. They are generalized eigenvectors, i.e., vectors which eventually become 0 if $(K - \lambda I)$ is applied to them enough times successively, describing scattering states [10]. The vectors $\psi_{\vec{k}}$ satisfy the orthonormality and completeness conditions
$$ \langle \vec{k}|\vec{k}'\rangle = \delta(\vec{k} - \vec{k}'), \qquad \int d^3k\, |\vec{k}\rangle\langle\vec{k}| = 1, \tag{19} $$
and the operator $K$ can be expressed by its spectral resolution as
$$ K = \int d^3k\, (k^2 + m^2)\, |\vec{k}\rangle\langle\vec{k}|. \tag{20} $$
From relations (14) and (15) we see that the eigenvalues and eigenvectors of $H$ are given by
$$ E^{(\pm)}_{\vec{k}} = \pm\sqrt{\vec{k}^2 + m^2}, \qquad \Psi^{(\pm)}_{\vec{k}} = \begin{pmatrix} \pm\sqrt{\vec{k}^2 + m^2} \\ 1 \end{pmatrix} \psi_{\vec{k}}. \tag{21} $$
The eigenvectors $\Phi^{(\pm)}_{\vec{k}}$ of the adjoint operator $H^\dagger$ are
$$ \Phi^{(\pm)}_{\vec{k}} = \begin{pmatrix} 1 \\ \pm\sqrt{\vec{k}^2 + m^2} \end{pmatrix} \psi_{\vec{k}}, \tag{22} $$
which form together with $\Psi^{(\pm)}_{\vec{k}}$ a complete biorthogonal system,
$$ \langle \Phi^{(\nu)}_{\vec{k}'} | \Psi^{(\nu')}_{\vec{k}} \rangle = \delta(\vec{k} - \vec{k}')\,\delta_{\nu\nu'}\, 2E^{(\nu)}_{\vec{k}}, \tag{23} $$
where $\nu, \nu' = \pm 1$.

3. Crypto-Hermitian approach

The apparent non-Hermiticity of the Hamiltonian (7) can be dealt with by means of the crypto-Hermitian theory (sometimes also called quasi-Hermitian [24] or PT-symmetric [25]). The Hamiltonian is non-Hermitian, $H \neq H^\dagger$, only in the false Hilbert space $\mathcal{H}^{(F)} = (V, \langle\cdot|\cdot\rangle)$. The underlying vector space of states is fixed, given by the physical system. However, we have a freedom in the choice of the inner product. If we represent our Hamiltonian in a different, secondary Hilbert space $\mathcal{H}^{(S)} = (V, \langle\langle\cdot|\cdot\rangle)$, with the newly defined inner product
$$ \langle\langle\varphi|\psi\rangle = \langle\varphi|\Theta|\psi\rangle, \tag{24} $$
it may become Hermitian. The so-called metric operator $\Theta$ must be positive definite, everywhere-defined, Hermitian and bounded with a bounded inverse. Operators for which such an inner product exists will be called crypto-Hermitian (cf. [26]). They satisfy the so-called Dieudonné equation
$$ H^\dagger \Theta = \Theta H \tag{25} $$
and they are similar to Hermitian operators,
$$ h = \Omega H \Omega^{-1}, \tag{26} $$
where $\Theta = \Omega^\dagger \Omega$ is invertible and $h = h^\dagger$. In such a scenario, the problem of the negative probability interpretation of the Klein-Gordon equation can be reinterpreted as the problem of a wrong choice of the metric operator $\Theta$. If we are able to find a more appropriate choice of the representation space $\mathcal{H}^{(S)}$, this problem disappears.

3.1. Computation of the metric

One of the possible ways to construct the metric operator $\Theta$ for a given crypto-Hermitian Hamiltonian $H$ is by summing the spectral resolution series. It requires the solution of the eigenvalue problem for $H^\dagger$. In what follows, we try to construct the metric operator for the free Klein-Gordon equation,
$$ \Theta = \int d^3k \left( \alpha^{(+)} |\Phi^{(+)}_{\vec{k}}\rangle\langle\Phi^{(+)}_{\vec{k}}| + \alpha^{(-)} |\Phi^{(-)}_{\vec{k}}\rangle\langle\Phi^{(-)}_{\vec{k}}| \right), \tag{27} $$
where we insert the eigenvectors $\Phi^{(\pm)}_{\vec{k}}$ as computed in (22):
$$ \Theta = \int d^3k \begin{pmatrix} \alpha & \beta\sqrt{k^2 + m^2} \\ \beta\sqrt{k^2 + m^2} & \alpha(k^2 + m^2) \end{pmatrix} |\vec{k}\rangle\langle\vec{k}|, \tag{28} $$
where $\alpha = \alpha^{(+)} + \alpha^{(-)}$, $\beta = \alpha^{(+)} - \alpha^{(-)}$. By means of equation (20) we obtain the family of metric operators
$$ \Theta = \begin{pmatrix} \alpha & \beta K^{1/2} \\ \beta K^{1/2} & \alpha K \end{pmatrix}, \tag{29} $$
where
$$ K^{1/2} = \int d^3k\, \sqrt{k^2 + m^2}\, |\vec{k}\rangle\langle\vec{k}|. \tag{30} $$
With the knowledge of the metric operator (29), we can construct the positive definite inner product defining the Hilbert space $\mathcal{H}^{(S)}$,
$$ \langle\langle\psi|\varphi\rangle = \alpha\big(\langle\psi|K|\varphi\rangle + \langle\dot{\psi}|\dot{\varphi}\rangle\big) + i\beta\big(\langle\psi|K^{1/2}|\dot{\varphi}\rangle - \langle\dot{\psi}|K^{1/2}|\varphi\rangle\big), \tag{31} $$
where $\dot{\varphi}$, $\dot{\psi}$ denote the corresponding time derivatives (in fact, this equation is just an explicit version of equation (24)).

3.2. The discrete case

Unfortunately, the metric operator (29) is unbounded and therefore does not satisfy all the properties we require of a metric operator. As was emphasized in [27], the boundedness of the metric operator $\Theta$ is a very important property; it guarantees that the convergence of Cauchy sequences is not affected by the introduction of the new inner product (24). The possibility of using unbounded metrics is treated, e.g., in the last chapter of [28]. To overcome the problems with the unboundedness of the metric operator (29), we choose to shift our attention to a discrete model. In the discrete approximation, the metric operator stays bounded. We make use of the equidistant, Runge-Kutta grid-point coordinates
$$ x_k = kh, \qquad k = 0, \pm 1, \pm 2, \ldots, \tag{32} $$
so that the Laplacian can be expressed as
$$ -\frac{\psi(x_{k+1}) - 2\psi(x_k) + \psi(x_{k-1})}{h^2}. \tag{33} $$
The explicit occurrence of the parameter $h$ will be important for the study of the continuum limit, in which the value of $h$ decreases to zero. Otherwise we may set $h = 1$ in suitable units. Following further ideas from [29], the Laplace operator $\Delta$ can be discretized into the matrix form
$$ \Delta^{(N)} = \begin{pmatrix} 2 & -1 & & & \\ -1 & 2 & -1 & & \\ & -1 & 2 & \ddots & \\ & & \ddots & \ddots & -1 \\ & & & -1 & 2 \end{pmatrix}. \tag{34} $$
The matrix (34) is Hermitian and therefore diagonalizable, i.e., similar to a diagonal matrix. Hence, for our purposes it is enough to compute with the $N \times N$ real diagonal matrix
$$ K = \begin{pmatrix} a_1 & 0 & \cdots & 0 \\ 0 & a_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_N \end{pmatrix}. \tag{35} $$
Let $A$, $B$, $C$ be real $N \times N$ matrices, where $A = A^T$, $B = B^T$. Then we can write the Dieudonné equation (25) by means of block matrices,
$$ \begin{pmatrix} 0 & I \\ K & 0 \end{pmatrix} \begin{pmatrix} A & C^T \\ C & B \end{pmatrix} = \begin{pmatrix} A & C^T \\ C & B \end{pmatrix} \begin{pmatrix} 0 & K \\ I & 0 \end{pmatrix}. \tag{36} $$
We obtain the following conditions:
$$ C = C^T, \qquad KC = C^T K, \qquad B = KA = AK. \tag{37} $$
A real symmetric matrix which commutes with a diagonal matrix must be diagonal. Thus the form of our metric operator is as follows:
$$ \Theta = \begin{pmatrix} \operatorname{diag}(\alpha_1, \ldots, \alpha_N) & \operatorname{diag}(\beta_1, \ldots, \beta_N) \\ \operatorname{diag}(\beta_1, \ldots, \beta_N) & \operatorname{diag}(a_1\alpha_1, \ldots, a_N\alpha_N) \end{pmatrix}. \tag{38} $$
It depends on the $2N$ parameters $\alpha_1, \ldots, \alpha_N$, $\beta_1, \ldots, \beta_N$. The requirement of positive-definiteness of the metric puts the following conditions on our parameters:
$$ \alpha_i > 0, \qquad a_i \alpha_i^2 > \beta_i^2, \qquad i = 1, 2, \ldots, N. \tag{39} $$
We can construct the corresponding inner product
$$ \langle\langle\psi|\varphi\rangle = \sum_{i=1}^{N} \alpha_i \psi_i^* \varphi_i + \sum_{i=1}^{N} \beta_i \big(\psi_i^* \varphi_{N+i} + \psi_{N+i}^* \varphi_i\big) + \sum_{i=1}^{N} a_i \alpha_i \psi_{N+i}^* \varphi_{N+i}, \tag{40} $$
where $\psi = (\psi_1, \psi_2, \ldots, \psi_{2N})^T$, $\varphi = (\varphi_1, \varphi_2, \ldots, \varphi_{2N})^T$ are complex vectors.
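The discrete construction (35)–(39) can be checked numerically; a minimal sketch follows (the choice of beta below is one illustrative way to satisfy the positivity condition (39), not a prescription from the paper).

```python
import numpy as np

n = 5
a = np.random.uniform(0.5, 4.0, n)          # diagonal of K, eq. (35)
alpha = np.random.uniform(0.5, 2.0, n)
beta = 0.9 * alpha * np.sqrt(a)             # satisfies a_i alpha_i^2 > beta_i^2, eq. (39)

K = np.diag(a)
Z = np.zeros((n, n))
H = np.block([[Z, K], [np.eye(n), Z]])
Theta = np.block([[np.diag(alpha), np.diag(beta)],
                  [np.diag(beta), np.diag(a * alpha)]])   # eq. (38)

print(np.allclose(H.T @ Theta, Theta @ H))      # Dieudonne equation (25) holds
print(np.all(np.linalg.eigvalsh(Theta) > 0))    # metric is positive definite
```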
4. Conclusions

In our work, we familiarized the reader with the crypto-Hermitian approach to the Klein-Gordon equation. We computed the metric operator in both the continuous and the discrete case. The corresponding positive definite inner product for the free Klein-Gordon equation was also computed. That is considered a crucial step towards a proper probability interpretation of the Klein-Gordon equation. The next step of this process would be the construction of an appropriate metric operator for the Klein-Gordon equation with a nonzero potential $V$, as was done for special cases in [16, 17, 19, 21]. It is also possible to broaden the formalism by admitting a manifest non-Hermiticity in the operator, $K \neq K^\dagger$, as was shown in [20]. Related complicated problems with locality, the definition of physical observables and attempts to construct a conserved four-current can be studied thoroughly in the further references [16, 17]. The problems become much simpler if we narrow our attention to real Klein-Gordon fields only. It was shown that, in such a case, the inner product is uniquely defined [16, 30].

Acknowledgements

The work of Iveta Semorádová was supported by the CTU grant SGS16/239/OHK4/3T/14.

References

[1] O. Klein. Quantentheorie und fünfdimensionale Relativitätstheorie. Zeitschrift für Physik 37(12):895–906, 1926. doi:10.1007/bf01397481.
[2] W. Gordon. Der Comptoneffekt nach der Schrödingerschen Theorie. Zeitschrift für Physik 40(1-2):117–133, 1926. doi:10.1007/bf01390840.
[3] J. Kudar. Zur vierdimensionalen Formulierung der undulatorischen Mechanik. Annalen der Physik 386(22):632–636, 1926. doi:10.1002/andp.19263862208.
[4] V. Fock. Über die invariante Form der Wellen- und der Bewegungsgleichungen für einen geladenen Massenpunkt. Zeitschrift für Physik 39(2-3):226–232, 1926. doi:10.1007/bf01321989.
[5] V. Fock. Zur Schrödingerschen Wellenmechanik. Zeitschrift für Physik A Hadrons and Nuclei 38(3):242–250, 1926.
[6] T. de Donder, H. van den Dungen. La quantification déduite de la gravifique einsteinienne. Comptes Rendus 183:22–24, 1926.
[7] E. Schrödinger. Quantisierung als Eigenwertproblem. Annalen der Physik 385(13):437–490, 1926. doi:10.1002/andp.19263851302.
[8] F. Constantinescu, E. Magyari. Problems in quantum mechanics. Elsevier, 2013. doi:10.1119/1.1986548.
[9] W. Pauli, V. Weisskopf. Über die Quantisierung der skalaren relativistischen Wellengleichung. Helv. Phys. Acta 7:709–731, 1934.
[10] A. Mostafazadeh. Hilbert space structures on the solution space of Klein-Gordon type evolution equations. Class. Quantum Grav. 20:155–171, 2003. doi:10.1088/0264-9381/20/1/312.
[11] J. Dieudonné. Quasi-Hermitian operators. Proc. Internat. Sympos. Linear Spaces (Jerusalem, 1960), Pergamon, Oxford, pp. 115–122, 1961.
[12] F. J. Dyson. General theory of spin-wave interactions. Phys. Rev. 102(5):1217, 1956. doi:10.1103/physrev.102.1217.
[13] F. Scholtz, H. Geyer, F. Hahne. Quasi-Hermitian operators in quantum mechanics and the variational principle. Ann. Phys. 213:71–101, 1992. doi:10.1016/0003-4916(92)90284-s.
[14] C. M. Bender, S. Boettcher. Real spectra in non-Hermitian Hamiltonians having PT-symmetry. Phys. Rev. Lett. 80(24):5243, 1998. doi:10.1103/physrevlett.80.5243.
[15] A. Mostafazadeh. Quantum mechanics of Klein-Gordon-type fields and quantum cosmology. Ann. Phys. (New York) 309:1–48, 2004. doi:10.1016/j.aop.2003.08.010.
[16] A. Mostafazadeh, F. Zamani. Quantum mechanics of Klein–Gordon fields I: Hilbert space, localized states, and chiral symmetry. Ann. Phys. 321(9):2183–2209, 2006. doi:10.1016/j.aop.2006.02.007.
[17] A. Mostafazadeh, F. Zamani. Quantum mechanics of Klein–Gordon fields II: relativistic coherent states. Ann. Phys. 321(9):2210–2241, 2006. doi:10.1016/j.aop.2006.02.008.
[18] A. Mostafazadeh. A physical realization of the generalized PT-, C-, and CPT-symmetries and the position operator for Klein-Gordon fields. International Journal of Modern Physics A 21(12):2553–2572, 2006. doi:10.1142/s0217751x06028813.
[19] M. Znojil. Relativistic supersymmetric quantum mechanics based on Klein-Gordon equation. J. Phys. A: Math. Gen. 37:9557–9571, 2004. doi:10.1088/0305-4470/37/40/016.
[20] M. Znojil. Solvable relativistic quantum dots with vibrational spectra. Czech. J. Phys. 55:1187–1192, 2005. doi:10.1007/s10582-005-0127-6.
[21] M. Znojil, H. Bíla, V. Jakubský. Pseudo-Hermitian approach to energy-dependent Klein-Gordon models. Czech. J. Phys. 54(10):1143–1148, 2004. doi:10.1023/b:cjop.0000044017.33267.58.
[22] H. Feshbach, F. Villars. Elementary relativistic wave mechanics of spin 0 and spin 1/2 particles. Rev. Mod. Phys. 30(1):24, 1958. doi:10.1103/revmodphys.30.24.
[23] L. L. Foldy. Synthesis of covariant particle equations. Phys. Rev. 102(2):568, 1956. doi:10.1103/physrev.102.568.
[24] A. Mostafazadeh. Pseudo-Hermitian representation of quantum mechanics. Int. J. Geom. Meth. Mod. Phys. 7:1191–1306, 2010. doi:10.1142/s0219887810004816.
[25] C. M. Bender. Making sense of non-Hermitian Hamiltonians. Reports on Progress in Physics 70(6):947, 2007. doi:10.1088/0034-4885/70/6/r03.
[26] M. Znojil. Three-Hilbert-space formulation of quantum mechanics. Symmetry, Integrability and Geometry: Methods and Applications 5(001):19, 2009.
[27] R. Kretschmer, L. Szymanowski. Quasi-Hermiticity in infinite-dimensional Hilbert spaces. Phys. Lett. A 325(2):112–117, 2004.
[28] F. Bagarello, J.-P. Gazeau, F. H. Szafraniec, M. Znojil. Non-selfadjoint operators in quantum physics: mathematical aspects. John Wiley & Sons, 2015.
[29] M. Znojil. N-site-lattice analogues of V(x) = ix³. Ann. Phys. 327(3):893–913, 2012. doi:10.1016/j.aop.2011.12.009.
[30] F. Kleefeld. On some meaningful inner product for real Klein-Gordon fields with positive semi-definite norm. Czech. J. Phys. 56(9):999–1006, 2006. doi:10.1007/s10582-006-0395-9.
Acta Polytechnica 57(6):462–466, 2017

Acta Polytechnica doi:10.14311/app.2016.56.0076
Acta Polytechnica 56(1):76–80, 2016 © Czech Technical University in Prague, 2016, available online at http://ojs.cvut.cz/ojs/index.php/ap

Analysis of weld joint deformations by optical 3D scanning

Ján Urminský*, Miroslav Jáňa, Milan Marônek, Ladislav Morovič
Slovak University of Technology in Bratislava, Faculty of Materials Science and Technology in Trnava, Institute of Production Technologies, J. Bottu 25, 917 24 Trnava, Slovakia
* Corresponding author: jan.urminsky@stuba.sk

Abstract. This paper presents an analysis of weld joint deformation using optical 3D scanning. The weld joints of bimetals were made by explosion welding (EXW). GOM ATOS II Triplescan SO MV320 equipment with a measuring volume of 320 × 240 × 240 mm and a 5.0 Mpix camera resolution, and GOM ATOS I 350 with a measuring volume of 250 × 200 × 200 mm and a 0.8 Mpix camera resolution, were used for the experimental deformation measurements of the weldments. The scanned samples were compared with reference specimens. The angular and transverse deformations were visualized by colour deviation maps. The maximum observed deformations of the weld joints ranged from −1.96 to +1.20 mm.

Keywords: analysis of weldment deformation; explosion welding; 3D scanning; GOM ATOS.

1. Introduction

Non-uniform heating of welded components due to the thermal cycle of welding, thermal expansion during heating and cooling combined with the clamping stiffness, and the formation of non-equilibrium structures in the heat affected zone (HAZ) cause the formation of transient, variable and permanent stresses. These stresses lead to a local or total deformation of the weldments. The resulting deformations can be classified, according to their position, as longitudinal, transverse and angular deformations. Angular deformations are the most adverse of the group and can degrade the utility of the weldment. The size of the angular deformation depends especially on the design of the weld joint. A correct structural design and positioning of the welds in terms of stress are a precondition for making high quality welded structures. Generally, deformations deteriorate the properties of the weldment [1, 2]. The residual stress can be measured by an electrical resistance strain gauge, by X-ray diffraction analysis, by the ultrasonic method, and also by standard methods of contact metrology. Callipers, protractors and dial indicators are used for detecting shape and position deviations. These methods for determining the dimensional and shape deviations of a weldment are mostly time consuming, and require repeated measurements over the whole area. Nowadays, products are checked by time-saving quality control methods [3–5]. The new quality control methods implemented into continuous production processes are based on coordinate measurements.
Modern optical 3D scanners are now used more frequently than the standard traditional measurement methods. These digitization devices are based on obtaining surface coordinate points by scanning the surface illuminated by structured light. The spatial coordinates of the points forming the surface of the scanned object are determined by active triangulation. The digital model of the object is formed by polygonising the cloud of points that is obtained. In order to scan reflective surfaces and transparent objects, antireflection layers are applied to the surface of the measured object. The final deformation can then be visualized by colour deviation maps. This paper evaluates the suitability of a GOM ATOS 3D optical scanner for measuring the deformations of weld joints [6–9].

2. Experiment

The weld joints were produced by explosion welding. The flyer plate was made of a Cr-Ni austenitic steel sheet. AZ31B Mg magnesium alloy, AW-1050A aluminium alloy and GJS-500-7 ductile iron were used as the parent plates. The chemical compositions of the materials are presented in Tables 1 to 4. The austenitic steel was selected on the basis of its corrosion resistance and its applicability from cryogenic to high temperatures. The AZ31B alloy has good plastic properties and weldability. AZ31B is used mainly for formed, non-heat-treated components. Cr-Ni steel – Mg alloy bimetals are used in the automotive industry. The AW-1050A alloy is suitable for cold forming. In combination with Cr-Ni steel, it is applied in the automotive industry as a semi-product for the production of bumpers and car doors, where increased corrosion resistance, low weight and an aesthetic look are required. These combinations of materials are also used in cryogenic environments in the chemical industry and in the shipbuilding industry. GJS-500-7 ductile iron shows a good machinability and a low abrasion resistance. This combination is used for producing bimetal valves, and it significantly reduces the manufacturing cost. The dimensions of the welded materials are shown in Table 5 [10–13]. The weld joints were prepared in cooperation with the Research Institute of Industrial Chemistry in Pardubice-Semtin.

Element   C          Mn        Si        Cr         Ni        P           S
wt. %     max. 0.07  max. 2.0  max. 1.0  17.0–20.0  9.0–11.5  max. 0.045  max. 0.03

Table 1. Chemical composition of the Cr-Ni austenitic steel [10].

Element   Mg     Al       Zn       Mn       Si        Ni          Fe
wt. %     97.05  2.5–3.5  0.6–1.4  0.2–1.0  max. 0.1  max. 0.005  max. 0.005

Table 2. Chemical composition of the AZ31B Mg alloy [11].

Element   Al    Fe        Si        Zn         Cu         Ti         Fe
wt. %     99.5  max. 0.4  max. 0.3  max. 0.07  max. 0.05  max. 0.05  max. 0.005

Table 3. Chemical composition of the AW-1050A Al alloy [11].

Element   C        Mn       Si       P         S          Ti         Fe
wt. %     2.7–3.7  0.3–0.7  0.8–2.9  max. 0.1  max. 0.02  max. 0.05  max. 0.005

Table 4. Chemical composition of the GJS-500-7 ductile iron [12].

Sample   Flyer plate                    Parent plate
1        Cr-Ni steel 176 × 146 × 1 mm   AW-1050A 146 × 116 × 15 mm
2        Cr-Ni steel 250 × 135 × 1 mm   GJS-500-7 200 × 95 × 24 mm
3        Cr-Ni steel 370 × 260 × 1 mm   AZ31B 190 × 155 × 20 mm

Table 5. Dimensions of the welded materials.

Bimetal   Combination of materials   Dimension (L × W × H)
1         Cr-Ni steel – AW-1050A     116 × 146 × 16 mm
2         Cr-Ni steel – GJS-500-7    95 × 200 × 25 mm
3         Cr-Ni steel – AZ31B        155 × 190 × 21 mm

Table 6. Dimensions of the bimetals.
There are several parameters that affect the quality and the character of the weld joints, in particular the detonation velocity, the welding speed, the set-up distance, and the geometrical dimensions and properties of the explosive charge. A parallel set-up of the welded materials was used (Figure 1a). The explosive charge was initiated by a Starline 12 detonating fuse with a detonation speed of 6800 m s⁻¹. The following explosive charges were used: Semtex 10 SE for the Cr-Ni steel – AZ31B alloy bimetal; Semtex S 25 for the Cr-Ni steel – GJS-500-7 bimetal; and Semtex S 35 for the Cr-Ni steel – AW-1050A bimetal. The parameters and explosion welding conditions that were used are presented in Figure 1b–d [10, 12, 13]. Immediately after the detonation of the explosive charge, the flyer material is deformed and is accelerated toward the parent plate. The welded materials therefore deform before the weld joint is formed.

The experimental measurements of the deformation of the welded joints were performed using GOM ATOS II Triplescan SO MV320 equipment (measuring volume: 320 × 240 × 240 mm, camera resolution 2448 × 2050 pixels (5.0 Mpx)) and GOM ATOS I 350 (measuring volume: 250 × 200 × 200 mm, 1032 × 776 pixels (0.8 Mpx)), owned by the Slovak University of Technology in Bratislava, Faculty of Materials Science and Technology in Trnava (Figure 2) [6, 7].

Figure 1. Welding parameters: a) parallel set-up of the welded materials; b) explosion welding parameters, sample 3; c) explosion welding parameters, sample 2; d) explosion welding parameters, sample 1 [10].

Figure 2. The heads of the 3D optical scanners: a) GOM ATOS II Triplescan SO MV320; b) GOM ATOS I 350 [6, 7].

Before making the measurements, it was necessary to clean up the bimetals. In order to increase the precision of the assembly of the individual scans from the various measuring positions, non-coding reference points were stuck onto the cleaned surface of the bimetal. An anti-reflective coating was sprayed on the bimetal surface to prevent light reflections during scanning. The bimetal was placed on a rotary table and was scanned in 12 positions (Figure 3a). A 3D digital model in the .stl format was generated by the scanner after obtaining a cloud of points and a subsequent polygonization (Figure 3b). To detect the deformation of the weld joint, a comparison was made between the deformed bimetal model (after welding) and the CAD model (before welding) using the GOM ATOS Professional v7.5 software. Colour deviation maps were generated as an output (Figures 4 to 6). The quality of the weld interface and the total quality of the produced bimetals were evaluated by ultrasonic testing, by light microscopy, by micro-hardness measurements, and by EDX and EBSD analyses.

3. Results

Deformation measurements were made of the weld joints made by explosion welding. The dimensions of the scanned bimetals are shown in Table 6. Bimetal 1 was scanned using a GOM ATOS I 350 device. Bimetals 2 and 3 were scanned using a GOM ATOS II Triplescan device. Transverse, longitudinal and angular deformations were observed as the shape deviations on the weld joints. The deformations of each bimetal were caused by the detonation pressure and by the final high velocity collision of the welded materials. An intensive plastic deformation occurred in the weld joint interface during the formation of the weld joint. The scanned bimetals were compared with individual reference samples created by CAD software.
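The scan-to-CAD deviation evaluation is performed internally by the GOM software. Purely as an illustration of the underlying idea, the sketch below computes point-to-nearest-point deviations between a scanned cloud and a reference cloud; it yields unsigned distances only, whereas the commercial software computes signed surface deviations against the CAD geometry.

```python
import numpy as np
from scipy.spatial import cKDTree

def deviation_map(scan_points, reference_points):
    """Nearest-neighbour deviation of each scanned point from the reference
    model. Both inputs: (n, 3) arrays of xyz coordinates in mm."""
    tree = cKDTree(reference_points)
    distances, _ = tree.query(scan_points)
    return distances   # one deviation value per scanned point, for colour mapping
```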
the initiation point and the direction of the explosive detonation are indicated by a red arrow on the scanned bimetal. deformations of various patterns were observed on bimetal 1 (figure 4). the greatest deformations were seen on the edges of the bimetal. the bimetal lost its flatness due to deformation in the direction of the z-axis and transverse deformation in the x-axis direction. angular deformations were not observed throughout the thickness of the material. deformations of various sizes were identified in individual planes perpendicular to the thickness of the bimetal. the maximum deformations were observed on the top of the bimetal, and varied from −2.00 to +0.40 mm. the deformations on the bottom side of the bimetal ranged from −0.80 to +1.07 mm.

the colour deviation map of bimetal 2 (figure 5) revealed angular deformation (displacement in the direction of the z-axis). the most significant deformation occurred on the surface of the bimetal, and ranged from −0.20 to +1.00 mm. the deformation range on the bottom side of the bimetal varied from −0.40 to +0.60 mm.

angular deformations were also detected in the evaluation of bimetal 3 (figure 6). at the corner adjacent to the explosive initiation place (in the x-axis direction), a surface “hump” was created. this deformation was created as a consequence of the detonation of the explosive fuse. however, to verify this statement, more experiments under the same conditions should be carried out. the deformations showed different values on the top and bottom sides of the bimetal. the deformation values on the top side were in the range from −1.25 to +1.20 mm. the hump amplitude was 1.55 mm, and no defect was observed at this location in the bimetal. the deformation values on the bottom side of the bimetal ranged from −1.96 to +0.40 mm in comparison with the reference sample. the various deformation values of the bimetals can be explained by the different pressures of the detonating explosive charges. the detonation pressure was sufficient for creating the weld joint, but was not high enough to achieve a similar deformation on the opposite side of the bimetal. the angular deformations decreased towards the bottom side of the parent material. it is obvious that the deformation values depend on the mechanical properties of the welded materials.

figure 3. digitization process of the welded bimetal; a) the scanning process provided by structured light and non-coded reference points; b) the scanned object after meshing [10].
figure 4. the colour deviation map of the bimetal; a) top view, b) bottom view [13].

4. conclusion

the gom atos ii triplescan so mv320 and gom atos i 350 devices are versatile tools for the quality evaluation of manufactured products. the results confirmed that these devices can also be used for the evaluation of weld joints. the data analysis showed that the scanning precision was adequate not only for evaluating deformations, but also for revealing the details of the weld joints. the 3d scanners that were used provide a sufficient amount of input data for the analysis of weldment deformation, and they are able to accurately determine the size and direction of the bimetal distortion. however, the scanners are limited in measuring volume. with a suitable optical configuration, it is possible to scan an area in the range from 38 × 29 to 2000 × 1500 mm. for larger bimetals, it is necessary to use scanning equipment with an appropriate measuring volume.
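the "appropriate measuring volume" requirement can be expressed as a tiny feasibility check; the matching rule and the margin below are our own simplification, and the measuring volumes are the ones quoted in § 2.

```python
def fits(measuring_volume, part, margin=1.2):
    """true if a part (l, w, h in mm) fits a scanner's measuring volume,
    with a margin for positioning the part on the rotary table."""
    return all(p * margin <= v
               for p, v in zip(sorted(part), sorted(measuring_volume)))

scanners = {"atos i 350": (250, 200, 200),
            "atos ii triplescan so mv320": (320, 240, 240)}
bimetal_2 = (95, 200, 25)   # sample 2 from table 6
for name, volume in scanners.items():
    print(name, fits(volume, bimetal_2))
```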
the tritop, scan box and atos plus systems could be suitable alternatives. the photogrammetric tritop system can scan an area of 10 × 10 m. its disadvantage is the necessity of using calibration bars, which cannot be moved relative to the scanned object during scanning. the scan box unit allows scanning a sample with a maximum dimension of 3 m. the atos plus system is a photogrammetry-based hardware extension for the atos triplescan and atos core 3d scanners. the dimensions of the scanned areas range from 300 × 225 to 2000 × 3000 mm.

figure 5. colour deviation map of the bimetal; a) top view, b) bottom view [12].
figure 6. colour deviation map of the bimetal; a) top view, b) bottom view [10].

acknowledgements

publication of this paper was supported by the vega project of the ministry of education, science, research and sport of the slovak republic, 1/0470/14 – utilization of modern optical 3d scanning methods for weldment deformation analysis.

references

[1] turňa, m.: the special welding methods, bratislava: alfa, 1989, 379 p., isbn 80-05-00097-90.
[2] marônek, m.: welding metals by explosion, bratislava: stu, 2009, 147 p., isbn 978-80-227-3128-7.
[3] marônek, m., bárta, j.: multimedia guide of welding technology, trnava: alumnipress, 2008, 328 p., isbn 978-80-8096-066-7.
[4] kálna, k.: the welding stress and deformation, bratislava: weldtech, 1998, 50 p., isbn 80-88734-25-8.
[5] olabi, a. g., lorza, r. l., benyounis, k. y.: quality control in welding process, in: comprehensive materials processing, volume 6: welding and bonding technologies, 2014, pp. 193–212, [2014-12-28], doi:10.1016/b978-0-08-096532-1.00607-5.
[6] atos triplescan: user manual – hardware. https://support.gom.com, 47 p., [2014-11-01].
[7] atos v7 hardware – benutzerinformation – atos i, atos i so. https://support.gom.com, 64 p., [2014-11-01].
[8] atos v7.5 sr2 manual advanced – scanning with atos – advanced/units a-c. https://support.gom.com, 44 p., [2014-11-01].
[9] atos v7.5 sr2 manual basic – scanning with atos – basic/units a-j. https://support.gom.com, 104 p., [2014-11-01].
[10] jáňa, m.: the effect of atmosphere and vacuum on the character of welded joints fabricated by explosion, master thesis. trnava: stu in bratislava, 2012, 95 p.
[11] asm handbook volume 2, properties and selection: nonferrous alloys and special-purpose materials, asm international, 1990, 1328 p., isbn 978-0-087170-378-1.
[12] ondruška, m.: explosion welding of malleable cast iron with other metals, dissertation thesis. trnava: stu in bratislava, 2012, 132 p.
[13] benák, m.: physicometallurgical aspects of explosion bonding of dissimilar metals, dissertation thesis. trnava: stu in bratislava, 2011, 160 p.

acta polytechnica 57(6):424–429, 2017, doi:10.14311/ap.2017.57.0424
© czech technical university in prague, 2017, available online at http://ojs.cvut.cz/ojs/index.php/ap

non-unitary transformation of quantum time-dependent non-hermitian systems

mustapha maamache

laboratoire de physique quantique et systèmes dynamiques, faculté des sciences, université ferhat abbas sétif 1, sétif 19000, algeria. correspondence: maamache@univ-setif.dz

abstract.
we provide a new perspective on non-hermitian evolution in quantum mechanics by employing the same method as in hermitian quantum evolution. we first give a precise description of the non-unitary transformation and the associated evolution, collect the basic results around it, and postulate the preservation of the norm. this cautionary postulate, imposing that the time evolution of a non-hermitian quantum system preserves the inner products between the associated states, must not be read naively. we also give an example showing that the solutions of time-dependent non-hermitian hamiltonian systems given by a linear combination of su(1, 1) and su(2) generators are obtained thanks to a time-dependent non-unitary transformation.

keywords: non-hermitian quantum mechanics; time-dependent hamiltonian systems; non-unitary time-dependent transformation.

1. introduction

one of the postulates of quantum mechanics is that the hamiltonian is hermitian, as this guarantees that the eigenvalues are real. this postulate results from a set of postulates representing the minimal assumptions needed to develop the theory of quantum mechanics. one of these postulates concerns the time evolution of the state vector $|\psi(t)\rangle$, governed by the schrödinger equation, which describes how a state changes with time:

$$i\hbar\,\frac{\partial}{\partial t}|\psi(t)\rangle = H|\psi(t)\rangle, \tag{1}$$

where $H$ is the hamiltonian operator corresponding to the total energy of the system. the time-dependent schrödinger equation is the most general way of describing how a state changes with time. formally, we can evolve a wavefunction forward in time by applying the time-evolution operator. for a hamiltonian which is time-independent, we have $|\psi(t)\rangle = U(t,t_0)|\psi(t_0)\rangle$, where $U(t,t_0) = \exp(-iH(t-t_0)/\hbar)$ denotes the time-evolution operator. the time-evolution operator is an example of a unitary operator. the latter are defined as transformations which preserve the scalar product, $\langle\psi(t)|\psi(t)\rangle = \langle\psi(t_0)|\psi(t_0)\rangle$, i.e., the norm $\langle\psi(t)|\psi(t)\rangle$ is time-independent.

the study of time-dependent systems has been a growing field, not only for its fundamental physical perspective but also for its applicability, for example in quantum optics. the analytical solutions of the one-dimensional schrödinger equation with a time-dependent hamiltonian have attracted the attention of physicists. the origin of this development was no doubt the discovery of an exact invariant by lewis [1, 2] and lewis and riesenfeld [3], who exploited the invariant operators to solve quantum-mechanical problems. the invariants method [3] is very simple due to the relationship between the eigenstates of the invariant operator and the solutions to the schrödinger equation by means of the phases. exploiting the invariant operator theory, several authors, for instance [4–14], have extensively studied two models. one of them is the time-dependent generalized harmonic oscillator with the symmetry of the su(1, 1) dynamical group; the other is the two-level system possessing an su(2) symmetry. in this respect, m. maamache [15] has shown that, with the help of an appropriate time-dependent unitary transformation instead of the invariant operator, the hamiltonian of the su(1, 1) and su(2) algebra can be transformed into a time-independent hamiltonian multiplied by an overall time-dependent factor. quantum mechanics is also capable of working for some non-hermitian quantum systems.
however, in non-hermitian quantum mechanics the hermiticity is relaxed to pseudo-hermiticity [16], $H^{\dagger} = \eta H \eta^{-1}$, where $\eta$ is a linear hermitian or an anti-linear anti-hermitian operator, or to pt symmetry, where p and t stand for the parity and time-reversal operators, respectively. the theories of non-hermitian quantum mechanics have developed quickly in recent decades; the reader can consult the articles [17, 18] and the references cited therein. systems with time-dependent non-hermitian hamiltonian operators have been studied in [19–35]. the most recent monograph [36] can be consulted for an introduction to the non-stationary theory.

in this work, we use the same strategy as was applied in [15] to time-dependent hermitian hamiltonian systems. we introduce the non-unitary transformation $V(t)$ mapping the solution $|\psi(t)\rangle$ of the time-dependent schrödinger equation involving a non-hermitian hamiltonian $H(t)$ to a solution $|\phi(t)\rangle$ involving a non-hermitian hamiltonian $\mathcal{H}(t)$, required to be a product of a simple time-independent hamiltonian $H_0$ and a time-dependent factor $g(t)$. after performing the transformation of the schrödinger equation, the problem becomes exactly solvable, but the evolution is not unitary and consequently does not preserve the scalar product. in order to obtain a conserved norm, we postulate that the time evolution of a quantum system preserves not just the normalization of the quantum states, but also the inner products between the associated states, $\langle\phi(t)|\phi(t)\rangle = \langle\phi(0)|\phi(0)\rangle$. this is the main result of this paper. as an illustration of our method, we present a specific quantum system given by a linear combination of su(1, 1) and su(2) generators. for this we introduce, in § 2, a formalism based on time-dependent non-unitary transformations, and we show that the time-dependent non-hermitian hamiltonian is related to an associated time-independent hamiltonian multiplied by an overall time-dependent factor. in § 3, we illustrate the formalism introduced in the previous section by treating a non-hermitian su(1, 1) and su(2) time-dependent quantum problem and finding the exact solution of the schrödinger equation without making recourse to the pseudo-invariant operator theory, as was done in [34, 35], or to the technique presented in [28, 29]. finally, § 4 concludes our work. our analysis has shown that the key to solving the time-dependent schrödinger equation is to find a way to transform the problem to a standard integrable form.

2. formalism

consider the time-dependent schrödinger equation

$$i\,\frac{\partial}{\partial t}|\psi(t)\rangle = H(t)|\psi(t)\rangle, \tag{2}$$

with $\hbar = 1$, where $H(t)$ is the time-dependent non-hermitian hamiltonian operator. coming back to the evolution equation (2), we perform a non-unitary transformation of the wavefunction as follows:

$$|\phi(t)\rangle = V(t)|\psi(t)\rangle. \tag{3}$$

this is essentially a change of representation from $|\psi(t)\rangle$ to $|\phi(t)\rangle$, so that the evolution of the quantal system in the new representation is governed by the following evolution equation:

$$i\,\frac{\partial}{\partial t}|\phi(t)\rangle = \mathcal{H}(t)|\phi(t)\rangle. \tag{4}$$

the operator $H(t)$ changes into $\mathcal{H}(t)$:

$$\mathcal{H}(t) = V(t)H(t)V^{-1}(t) + i\,\frac{\partial V(t)}{\partial t}\,V^{-1}(t). \tag{5}$$

the evolution equation (4) shares the same form as the original evolution equation in (2). however, this equation can readily be solved if we make a proper choice for the non-unitary operator $V(t)$.
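for completeness, the generator (5) follows in one line by substituting (3) into (2) and using $|\psi(t)\rangle = V^{-1}(t)|\phi(t)\rangle$:

$$i\,\frac{\partial}{\partial t}|\phi(t)\rangle = i\,\frac{\partial V}{\partial t}\,|\psi(t)\rangle + V\left(i\,\frac{\partial}{\partial t}|\psi(t)\rangle\right) = \left(i\,\frac{\partial V}{\partial t}\,V^{-1} + V H(t)\,V^{-1}\right)|\phi(t)\rangle = \mathcal{H}(t)\,|\phi(t)\rangle.$$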
in this way, we are seeking a representation in which the associated evolution equation can be solved easily. this is done by employing the following criterion: $\mathcal{H}(t)$ governing the evolution of (4) is required to be of the form

$$\mathcal{H}(t) = g(t)H_0. \tag{6}$$

the implication of these results is clear: the transformed hamiltonian is a product of a simple time-independent hamiltonian $H_0$ and a time-dependent factor $g(t)$. consequently, the original time-dependent non-hermitian quantum problem is completely solved. if we now define $|\zeta_n\rangle$ as an eigenstate of $H_0$ with a constant eigenvalue $\lambda_n$, we can write the eigenvalue equation in the form

$$H_0|\zeta_n\rangle = \lambda_n|\zeta_n\rangle. \tag{7}$$

as is easily verified, the solution $|\phi_n(t)\rangle$ of the schrödinger equation (4) can be written as

$$|\phi_n(t)\rangle = \exp\left(i\lambda_n\int_0^t g(t')\,dt'\right)|\zeta_n\rangle. \tag{8}$$

of note, it follows immediately that the time evolution of a quantum system described by $|\phi_n(t)\rangle$ does not preserve the normalization, i.e., the inner product of the evolved states $|\phi_n(t)\rangle$ depends on time:

$$\langle\phi_n(t)|\phi_n(t)\rangle = \exp\left(-2\,\mathrm{Im}\left(\lambda_n\int_0^t g(t')\,dt'\right)\right)\langle\phi_n(0)|\phi_n(0)\rangle. \tag{9}$$

at this stage, we will postulate, as in the hermitian case, that the time evolution of a quantum system preserves not just the normalization of the quantum states, but also the inner products between the associated states.

postulate: the time evolution of a non-hermitian quantum system preserves the normalization of the associated ket. the preservation of the norm of the state is associated with conservation of probability, $\langle\phi_n(t)|\phi_n(t)\rangle = \langle\phi_n(0)|\phi_n(0)\rangle$, implying that the imaginary part of the phase vanishes, which imposes that $g(t)$ and $\lambda_n$ are real, and consequently that the hamiltonian $\mathcal{H}(t)$ should be hermitian.

the important implication of this result is clear. the requirement of the stationarity of the scalar product between the states $|\phi_n(t)\rangle$ implies that the evolution operator associated with $\mathcal{H}(t)$ is diagonal in the basis $\{|\zeta_n\rangle\}$ with eigenvalues $\exp\left(i\lambda_n\int_0^t g(t')\,dt'\right)$. in the original representation, the solution $|\psi_n(t)\rangle$ to the evolution equation in (2) will then be given by

$$|\psi_n(t)\rangle = \exp\left(i\lambda_n\int_0^t g(t')\,dt'\right)V^{-1}(t)|\zeta_n\rangle. \tag{10}$$

as we are only changing our description of the system by changing basis, we must preserve the inner product between vectors. explicitly, from the preservation of this inner product between the states $|\phi_n(t)\rangle$, we can now define the inner product between the states $|\psi_n(t)\rangle = V^{-1}(t)|\phi_n(t)\rangle$ as

$$\langle\psi_n(t)|V^{\dagger}(t)V(t)|\psi_n(t)\rangle = \langle\psi_n(0)|V^{\dagger}(0)V(0)|\psi_n(0)\rangle, \tag{11}$$

which has both a positive definite signature and leaves the norms of vectors stationary in time.

3. application: su(1, 1) and su(2) non-hermitian time-dependent systems

we now consider a hamiltonian $H(t)$ of the following form:

$$H(t) = 2\omega(t)K_0 + 2\alpha(t)K_- + 2\beta(t)K_+, \tag{12}$$

where $\omega(t), \alpha(t), \beta(t) \in \mathbb{C}$ are arbitrary functions of time. the hermitian operator $K_0$ and the pair $K_+ = (K_-)^{\dagger}$ form a closed lie algebra. in this paper we shall concentrate on the particular time-dependent non-hermitian hamiltonian (12), which comprises su(1, 1) and su(2) group generators, where $K_0$, $K_-$ and $K_+$ form the su(1, 1) and su(2) lie algebras, written in the following unified form:

$$[K_0, K_+] = K_+, \qquad [K_0, K_-] = -K_-, \qquad [K_+, K_-] = dK_0. \tag{13}$$

the lie algebras of su(1, 1) and su(2) correspond to $d = -2$ and $d = 2$ in the commutation relations (13), respectively. $K_+$ is the creation operator and $K_-$ is the annihilation operator when acting on eigenfunctions of $K_0$.
then the non-unitary transformation operator $V(t)$ can be expressed locally in the following form:

$$V(t) = \exp\left[2\varepsilon(t)K_0 + 2\mu(t)K_- + 2\mu^*(t)K_+\right], \tag{14}$$

where $\varepsilon$ and $\mu$ are arbitrary real and complex time-dependent parameters, respectively. we shall disentangle this exponential operator into a product of exponential operators [37, 38]. this procedure provides a way to uncouple exponential operators which are not necessarily unitary. now we have

$$V(t) = e^{\vartheta_+(t)K_+}\, e^{\ln\vartheta_0(t)\,K_0}\, e^{\vartheta_-(t)K_-}. \tag{15}$$

we chose this particular form (15) for $V(t)$ because it is expressed as a product of exponential operators, and direct differentiation with respect to time for this operator can readily be carried out. the time-dependent coefficients $\vartheta_0(t)$ and $\vartheta_{\pm}(t)$ read

$$\vartheta_0(t) = \left(\cosh\theta - \frac{\varepsilon}{\theta}\sinh\theta\right)^{-2}, \qquad \theta = \sqrt{\varepsilon^2 + 2d|\mu|^2},$$
$$\vartheta_+(t) = \frac{2\mu^*\sinh\theta}{\theta\cosh\theta - \varepsilon\sinh\theta}, \qquad \vartheta_-(t) = \frac{2\mu\sinh\theta}{\theta\cosh\theta - \varepsilon\sinh\theta}. \tag{16}$$

the notation may be simplified even further by introducing some new quantities [29]:

$$z = \frac{2\mu}{\varepsilon} = |z|e^{i\varphi}, \qquad \phi = \frac{|z|}{1 - \frac{\theta}{\varepsilon}\coth\theta}, \qquad \chi(t) = -\frac{\cosh\theta + \frac{\varepsilon}{\theta}\sinh\theta}{\cosh\theta - \frac{\varepsilon}{\theta}\sinh\theta}. \tag{17}$$

with this adopted notation, the coefficients in (16) simplify to

$$\vartheta_{\pm} = -\phi\, e^{\mp i\varphi}, \qquad \vartheta_0 = -\frac{d}{2}\phi^2 - \chi. \tag{18}$$

using the relations

$$e^{\vartheta_- K_-} K_0\, e^{-\vartheta_- K_-} = K_0 + \vartheta_- K_-, \qquad e^{\vartheta_+ K_+} K_0\, e^{-\vartheta_+ K_+} = K_0 - \vartheta_+ K_+, \tag{19}$$

$$e^{\ln\vartheta_0 K_0} K_-\, e^{-\ln\vartheta_0 K_0} = \frac{K_-}{\vartheta_0}, \qquad e^{\vartheta_+ K_+} K_-\, e^{-\vartheta_+ K_+} = K_- + d\vartheta_+ K_0 - \frac{d}{2}\vartheta_+^2 K_+, \tag{20}$$

$$e^{\ln\vartheta_0 K_0} K_+\, e^{-\ln\vartheta_0 K_0} = \vartheta_0 K_+, \qquad e^{\vartheta_- K_-} K_+\, e^{-\vartheta_- K_-} = K_+ - d\vartheta_- K_0 - \frac{d}{2}\vartheta_-^2 K_-, \tag{21}$$

$$i\,\frac{\partial V(t)}{\partial t}\,V^{-1}(t) = \frac{i}{\vartheta_0}\left((\dot\vartheta_0 + d\vartheta_+\dot\vartheta_-)K_0 + \dot\vartheta_- K_- + \left(\vartheta_0\dot\vartheta_+ - \vartheta_+\dot\vartheta_0 - \frac{d}{2}\vartheta_+^2\dot\vartheta_-\right)K_+\right), \tag{22}$$

and putting them into (5), we obtain, after some algebra, the transformed hamiltonian

$$\mathcal{H}(t) = 2w(t)K_0 + 2q(t)K_- + 2y(t)K_+, \tag{23}$$

where the coefficient functions are

$$w(t) = \frac{1}{\vartheta_0}\left(\omega\left(\frac{d}{2}\vartheta_+\vartheta_- - \chi\right) + d(\vartheta_+\alpha + \vartheta_-\beta\chi) + \frac{i}{2}(\dot\vartheta_0 + d\vartheta_+\dot\vartheta_-)\right), \tag{24}$$

$$q(t) = \frac{1}{\vartheta_0}\left(\omega\vartheta_- + \alpha - \frac{d}{2}\beta\vartheta_-^2 + \frac{i\dot\vartheta_-}{2}\right), \tag{25}$$

$$y(t) = \frac{1}{\vartheta_0}\left(\omega\chi\vartheta_+ - \frac{d}{2}\alpha\vartheta_+^2 + \beta\chi^2 + \frac{i}{2}\left(\vartheta_0\dot\vartheta_+ - \vartheta_+\dot\vartheta_0 - \frac{d}{2}\vartheta_+^2\dot\vartheta_-\right)\right). \tag{26}$$

the evolution equation (4) under $\mathcal{H}(t)$ can readily be solved if we make a proper choice for the non-unitary operator $V(t)$. in this way, we are seeking a representation in which the associated evolution equation can be solved easily. this is done by employing the following criterion: $\mathcal{H}(t)$ in (23) is required to be diagonal in the eigenbasis $\{|\zeta_n\rangle\}$ of $K_0$. the above requirement can be achieved if and only if the inner product between the associated states $|\phi_n(t)\rangle$ is preserved, which is achieved by imposing $q(t) = 0$, $y(t) = 0$ and $\mathrm{Im}\,w(t) = 0$. writing $\omega = |\omega|e^{i\varphi_\omega}$, $\alpha = |\alpha|e^{i\varphi_\alpha}$ and $\beta = |\beta|e^{i\varphi_\beta}$ for the polar decompositions of the complex coefficients, these conditions lead, by using (18) and after some algebra, to the following constraints:

$$\dot\varphi = 2|\omega|\cos\varphi_\omega - \frac{2|\alpha|}{\phi}\cos(\varphi_\alpha - \varphi) + d\phi|\beta|\cos(\varphi + \varphi_\beta), \tag{27}$$

$$\dot\phi = -2\phi|\omega|\sin\varphi_\omega + 2|\alpha|\sin(\varphi_\alpha - \varphi) - d\phi^2|\beta|\sin(\varphi + \varphi_\beta), \tag{28}$$

$$\dot\vartheta_0 = \frac{2\vartheta_0}{\phi}\left(-2\phi|\omega|\sin\varphi_\omega + |\alpha|\sin(\varphi_\alpha - \varphi) + (\chi - d\phi^2)|\beta|\sin(\varphi + \varphi_\beta)\right), \tag{29}$$

by which $\vartheta_-$, $\vartheta_+$ and $\vartheta_0$ are determined for given values of $\omega(t)$, $\alpha(t)$ and $\beta(t)$. it is important to note here that when the time-dependent coefficient $\mu$ is considered to be a real function instead of a complex one, i.e., when the polar angle $\varphi$ vanishes, the auxiliary equations (27)–(29) that appear automatically in this process are identical to equations (28)–(30) of maamache et al. [35], who used the general method of lewis and riesenfeld to derive them. then the transformed hamiltonian $\mathcal{H}(t)$ becomes

$$\mathcal{H}(t) = 2\,\mathrm{Re}(w(t))\,K_0, \qquad \mathrm{Re}(w(t)) = |\omega|\cos\varphi_\omega + d\phi|\beta|\cos(\varphi + \varphi_\beta). \tag{30}$$
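the disentangled form (15) with the coefficients (16) can be spot-checked numerically in the two-dimensional (spin-1/2) representation of su(2), i.e. for $d = 2$. the check below, with arbitrary parameter values, is only a sanity test of the reconstructed formulas and is not part of the original derivation.

```python
import numpy as np
from scipy.linalg import expm

# spin-1/2 representation of su(2): K0 = sigma_z/2, K+ = sigma_+, K- = sigma_-
K0 = np.diag([0.5, -0.5]).astype(complex)
Kp = np.array([[0, 1], [0, 0]], dtype=complex)
Km = np.array([[0, 0], [1, 0]], dtype=complex)

d = 2.0
eps = 0.7            # arbitrary real epsilon
mu = 0.3 + 0.2j      # arbitrary complex mu

# single exponential (14)
V = expm(2*eps*K0 + 2*mu*Km + 2*np.conj(mu)*Kp)

# coefficients (16)
theta = np.sqrt(eps**2 + 2*d*abs(mu)**2)
den = theta*np.cosh(theta) - eps*np.sinh(theta)
th0 = (np.cosh(theta) - (eps/theta)*np.sinh(theta))**(-2)
thp = 2*np.conj(mu)*np.sinh(theta)/den
thm = 2*mu*np.sinh(theta)/den

# product form (15)
V_product = expm(thp*Kp) @ expm(np.log(th0)*K0) @ expm(thm*Km)
print(np.allclose(V, V_product))   # True
```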
the implication of these results is clear. the original time-dependent quantum-mechanical problem posed through the hamiltonian (12) is completely solved if the wave function (8) for the related transformed hamiltonian $\mathcal{H}(t)$ is obtained. the exact solution of the original equation (2) can now be found by combining the above results. we finally obtain

$$|\psi_n(t)\rangle = \exp\left(i\lambda_n\int_0^t 2\left(|\omega|\cos\varphi_\omega + d\phi|\beta|\cos(\varphi + \varphi_\beta)\right)dt'\right)V^{-1}(t)|\zeta_n\rangle. \tag{31}$$

we first consider the su(1, 1) case, where $d = -2$. the su(1, 1) lie algebra has a realization in terms of boson creation and annihilation operators $a^{\dagger}$ and $a$ such that

$$K_0 = \frac{1}{2}\left(a^{\dagger}a + \frac{1}{2}\right), \qquad K_- = \frac{1}{2}a^2, \qquad K_+ = \frac{1}{2}a^{\dagger 2}. \tag{32}$$

indeed, using $[a, a^{\dagger}] = 1$, one checks that $[K_+, K_-] = \frac{1}{4}[a^{\dagger 2}, a^2] = -\left(a^{\dagger}a + \frac{1}{2}\right) = -2K_0$, consistent with $d = -2$ in (13). then the hamiltonian (12) describes the generalized time-dependent swanson hamiltonian [29]. if $\omega(t)$, $\alpha(t)$ and $\beta(t)$ are real constants, this hamiltonian has been studied extensively in the literature by several authors, for instance [39–46]. substitution of $d = -2$ and $\lambda_n = \frac{1}{2}\left(n + \frac{1}{2}\right)$ into (31) yields

$$|\psi_n(t)\rangle = \exp\left(i\left(n + \frac{1}{2}\right)\int_0^t\left(|\omega|\cos\varphi_\omega - 2\phi|\beta|\cos(\varphi + \varphi_\beta)\right)dt'\right)V^{-1}(t)|n\rangle, \tag{33}$$

where $|\zeta_n\rangle = |n\rangle$ are the eigenvectors of $K_0$. for $d = 2$, the hamiltonian (12) possesses the symmetry of the dynamical group su(2). a spin in a complex time-varying magnetic field is a practical example of this case [47–53]. let $K_0 = J_z$ and $K_{\mp} = J_{\mp}$; $|\zeta_n\rangle = |j,n\rangle$ are the eigenvectors of $J_z$, i.e., $J_z|j,n\rangle = n|j,n\rangle$. the next step is the calculation of the solutions (31), which are given by

$$|\psi_n(t)\rangle = \exp\left(in\int_0^t\left(|\omega|\cos\varphi_\omega + 2\phi|\beta|\cos(\varphi + \varphi_\beta)\right)dt'\right)V^{-1}(t)|j,n\rangle. \tag{34}$$

4. conclusion

it has been established [21–23] that the general framework for a description of unitary time evolution for time-dependent non-hermitian hamiltonians can be based on the use of a time-dependent metric operator. the unitarity of the time evolution can be guaranteed, but the hamiltonian (the generator of the schrödinger time-evolution) must remain unobservable in general. the latter results were recently illustrated in [28, 29]. in the present work, we adapted another approach, based on a time-dependent non-unitary transformation of time-dependent hermitian hamiltonians [15], to solve the schrödinger equation for the time-dependent non-hermitian hamiltonian. starting with the original time-dependent non-hermitian hamiltonian $H(t)$, we derive, through a non-unitary transformation $V(t)$, the transformed $\mathcal{H}(t)$ as a time-independent hamiltonian multiplied by a time-dependent factor. then, we postulate that the time evolution of a non-hermitian quantum system preserves the inner products between the associated states, which allows us to identify this transformed hamiltonian $\mathcal{H}(t) = 2\,\mathrm{Re}(w(t))K_0$ as hermitian. thus, our problem is completely solved. to illustrate this theory, we have presented the su(1, 1) and su(2) non-hermitian time-dependent systems described by the hamiltonian (12): when applying the non-unitary transformation $V(t)$, we obtain the transformed hamiltonian $\mathcal{H}(t)$ as a linear combination of $K_0$ and $K_{\mp}$. consequently, we must disregard the prefactors of the operators $K_{\mp}$; to this end, we require that the coefficients $q(t)$ and $y(t)$ defined in (25)–(26) vanish. then the postulate that the inner products between the associated states are preserved allows us to require that $\mathrm{Im}\,w(t) = 0$ and to identify the transformed hamiltonian $\mathcal{H}(t) = 2\,\mathrm{Re}(w(t))K_0$ as hermitian. the su(1, 1) example provided was previously solved in [29] with the requirement $q(t) = y^+(t)$. at first glance, this means our new solution is just a special case of the one provided in [29].
in fact, this is not the case, because the solution of the schrödinger equation can never be obtained using this requirement alone, i.e. $q(t) = y^+(t)$. in order to solve the schrödinger equation associated with their time-dependent hermitian hamiltonian obtained by the requirement $q(t) = y^+(t)$, the authors of [29] adapted the lewis and riesenfeld time-dependent invariants technique. thus, they needed two steps to solve the generalized swanson hamiltonian. our method, by contrast, obtains the solution of the generalized swanson hamiltonian directly, and the lewis and riesenfeld time-dependent invariant follows as a consequence. finally, we also found the exact solutions of the generalized swanson model and of a spinning particle in a time-varying complex magnetic field.

acknowledgements

i would like to thank professor omar cherbal for the interesting discussions on the notion of time-dependent non-hermitian systems.

references

[1] h. r. lewis, phys. rev. lett. 18, 636 (1967).
[2] h. r. lewis, j. math. phys. 9, 1976 (1968).
[3] h. r. lewis and w. b. riesenfeld, j. math. phys. 10, 1458 (1969).
[4] c. m. cheng and p. c. w. fung, j. phys. a 21, 4115 (1988).
[5] d. a. molares, j. phys. a 21, l889 (1988).
[6] j. m. cervero and j. d. lejarreta, j. phys. a 22, l663 (1989).
[7] n. datta and g. ghosh, phys. rev. a 40, 526 (1989).
[8] x. gao, j. b. xu and t. z. qian, ann. phys., ny 204, 235 (1990).
[9] d. b. monteoliva, h. j. korsch, and j. a. nunez, j. phys. a 27, 6897 (1994).
[10] s. s. mizrahi, m. h. y. moussa, and b. baseia, int. j. mod. phys. b 8, 1563 (1994).
[11] j. y. ji, j. k. kim, s. p. kim and k. s. soh, phys. rev. a 52, 3352 (1995).
[12] m. maamache, phys. rev. a 52, 936 (1995); j. phys. a 29, 2833 (1996); phys. scr. 54, 21 (1996).
[13] y. z. lai, j. q. liang, h. j. w. müller-kirsten and j. g. zhou, phys. rev. a 53, 3691 (1996); j. phys. a 29, 1773 (1996).
[14] y. c. ge and m. s. child, phys. rev. lett. 78, 2507 (1997).
[15] m. maamache, j. phys. a 31, 6849 (1998).
[16] f. g. scholz, h. b. geyer, f. j. hahne, ann. phys. 213, 74 (1992).
[17] c. m. bender, rep. prog. phys. 70, 947 (2007).
[18] a. mostafazadeh, int. j. geom. methods mod. phys. 07, 1191 (2010).
[19] c. figueira de morisson faria and a. fring, j. phys. a: math. theor. 39, 9269 (2006).
[20] c. figueira de morisson faria and a. fring, laser physics 17, 424 (2007).
[21] a. mostafazadeh, phys. lett. b 650, 208 (2007).
[22] m. znojil, phys. rev. d 78, 085003 (2008).
[23] m. znojil, sigma 5, 001 (2009) (e-print overlay: arxiv:0901.0700).
[24] h. bíla, "adiabatic time-dependent metrics in pt-symmetric quantum theories", e-print arxiv:0902.0474.
[25] j. gong and q. h. wang, phys. rev. a 82, 012103 (2010).
[26] j. gong and q. h. wang, j. phys. a 46, 485302 (2013).
[27] m. maamache, phys. rev. a 92, 032106 (2015).
[28] a. fring and m. h. y. moussa, phys. rev. a 93, 042114 (2016).
[29] a. fring and m. h. y. moussa, phys. rev. a 94, 042128 (2016).
[30] b. khantoul, a. bounames and m. maamache, eur. phys. j. plus 132, 258 (2017).
[31] a. fring and t. frith, phys. rev. a 95, 010102(r) (2017).
[32] f. s. luiz, m. a. pontes and m. h. y. moussa, unitarity of the time-evolution and observability of non-hermitian hamiltonians for time-dependent dyson maps, arxiv:1611.08286.
[33] f. s. luiz, m. a. pontes and m. h. y. moussa, gauge linked time-dependent non-hermitian hamiltonians, arxiv:1703.01451; evolution of the time-dependent non-hermitian hamiltonians: real phases, arxiv:1705.06341.
[34] m. maamache, o-k. djeghiour, n. mana and w. koussa, eur. phys. j. plus 132, 383 (2017).
[35] m. maamache, o-k. djeghiour, w. koussa and n. mana, time evolution of quantum systems with time-dependent non-hermitian hamiltonian and the pseudo hermitian invariant operator, arxiv:1705.08298.
[36] f. bagarello, j. p. gazeau, f. h. szafraniec and m. znojil, "non-selfadjoint operators in quantum physics: mathematical aspects", wiley (2015).
[37] a. b. klimov and s. m. chumakov, a group-theoretical approach to quantum optics: models of atom-field interactions (wiley-vch, weinheim, 2009).
[38] s. m. barnett and p. radmore, methods in theoretical quantum optics (oxford university press, new york, 1997).
[39] z. ahmed, phys. lett. a 294, 287 (2002).
[40] m. s. swanson, j. math. phys. 45, 585 (2004).
[41] h. f. jones, j. phys. a 38, 1741 (2005).
[42] b. bagchi, c. quesne and r. roychoudhury, j. phys. a 38, l647 (2005).
[43] d. p. musumbu, h. b. geyer and w. d. heiss, j. phys. a 40, f75 (2007).
[44] c. quesne, j. phys. a 40, f745 (2007).
[45] a. sinha and p. roy, j. phys. a 40, 10599 (2007).
[46] eva-maria graefe, hans jurgen korsch, alexander rush and roman schubert, j. phys. a 48, 055301 (2015).
[47] j. c. garrison and e. m. wright, phys. lett. a 128, 177 (1988).
[48] g. dattoli, r. mignani, and a. torre, j. phys. a 23, 5795 (1990).
[49] c. miniature, c. sire, j. baudon, and j. bellissard, europhys. lett. 13, 199 (1990).
[50] a. mondragon and e. hernandez, j. phys. a 29, 2567 (1996).
[51] a. mostafazadeh, phys. lett. a 264, 11 (1999).
[52] x.-c. gao, j.-b. xu, and t.-z. qian, phys. rev. a 46, 3626 (1992).
[53] h. choutri, m. maamache, and s. menouar, j. korean phys. soc. 40, 358 (2002).

acta polytechnica vol. 44 no. 2/2004

developing a conceptual design engineering toolbox and its tools

r. w. vroom, e. j. j. van breemen, w. f. van der vegte

abstract: in order to develop a successful product, a design engineer needs to pay attention to all relevant aspects of that product. many tools are available: software, books, websites, and commercial services. to unlock these potentially useful sources of knowledge, we are developing c-det, a toolbox for conceptual design engineering. the idea of c-det is that designers are supported by a system that provides them with a knowledge portal on the one hand, and a system to store their current work on the other. the knowledge portal is to help the designer to find the most appropriate sites, experts, tools etc. at short notice. such a toolbox offers opportunities to incorporate extra functionalities to support the design engineering work. one of these functionalities could be to help the designer to reach a balanced comprehension in his work. furthermore, c-det enables researchers in the area of design engineering and design engineers themselves to find each other or their work earlier and more easily. newly developed design tools that can be used by design engineers but have not yet been developed up to a commercial level could be linked to by c-det. in this way these tools can be evaluated at an early stage by design engineers who would like to use them. this paper describes the first prototypes of c-det, an example of the development of a design tool that enables designers to forecast the use process and an example of the future functionalities of c-det such as balanced comprehension.

keywords: toolbox, conceptual design engineering, development process, use process forecast, balanced comprehension.

1 introduction

design engineering is a knowledge and information intensive activity. however, there are many design tools, the internet and digital libraries containing publications on almost every subject. we suffer from the enormous amount of available information, tools and knowledge, which makes the search for the most useful bits a time-consuming activity. we are exploring this problem area and have found that design students have problems finding appropriate components for their designs. for example, one student searched on the internet for about eight hours without finding a proper motor for his design of a hairdryer. furthermore, design engineers in practice use the internet for gaining knowledge, but many of them do not have a systematic approach that helps them to find the knowledge quickly and effectively. another problem is that although many tools have been developed or are under development to help the designer to optimize the speed, quality or cost of his work, design engineers are often not aware of the existence of these tools, nor do they have the time to regularly scan the market in order to discover them. finally, design engineers not only look for information and knowledge, they also create it when designing new products. we have not encountered existing tools that help the designer in the early phases of design to get an overview of his work in order to see which items and aspects have been covered and which are still missing or insufficiently handled.
in order to overcome the information and knowledge search and process problems of industrial design engineers, we aim to develop a conceptual design engineering toolbox (c-det) in which existing tools, links, information and knowledge are made available for designers at an appropriate point in time and place in their conceptual development process, and which helps designers to document and structure the output of their conceptual design work. this paper begins by presenting the development of the early prototypes of c-det. then, the development of a tool that fits very well into c-det is described, as an example of tool development that is monitored closely to keep the framework function of c-det up to date. this is a tool that supports the designer in forecasting the usage of his product during conceptual design. finally, 'extra functionality' that can be incorporated in c-det is discussed, for example helping the designer to reach a balanced comprehension in his work. this extra functionality could be incorporated after the framework function of c-det is running properly.

2 the development of c-det

in the early 1990s, several research programs aimed at developing some kind of integrated design environment. the ambition of these programs was very high [1]. mostly, the aim was to build an integrated database in which all output could be stored in such a way that all people concerned could get their (part of the) information in the format that they needed. to achieve this, all kinds of mappings between different kinds of information were required. some of these mappings are still unavailable [2]. that is why the goal for c-det is to function as a framework for all kinds of useful bits, rather than as a comprehensive tool in which everything is linked through an integrated database. the starting point for developing c-det is that, to some extent, design engineers work systematically (for example, they orientate themselves on the design problem, they make a list of requirements, they follow some design process) and that design engineers use their own intelligence. in other words, they are able to design and develop products for people. c-det will only support them in their work, for example by making knowledge and tools, and knowledge about tools, better available, by helping them structure their work, and by giving an overview of the work done. these functionalities can have some level of intelligence, but will never ignore the intelligence of the design engineer.

2.1 objectives

the goal of this research is to develop a toolbox for conceptual design engineering (c-det) in which existing tools,
links, information and knowledge are made available for designers at an appropriate point (in time and place) in their conceptual development process, and which helps the designer to document and structure the output of his conceptual design phase. however, the aim is not to create a central integrated database that is able to convert and translate all kinds of data from one tool to another. tools to be integrated vary from planning tools to cad tools, simulation tools, etc. interaction between the researchers, developers and users of c-det should lead to a useful toolbox in which design engineers properly rate links, tools, knowledge, etcetera, for use during the concept phase. monitoring the use of c-det and active rating enhance its usefulness. the idea for c-det is that the designer chooses which phase or what aspect he wants to work on. after that selection, the designer can select helpful tools and links specifically for that phase or aspect. he can also create and/or insert his own specific design results for that phase or aspect. in an early prototype, the way the output of his design project is structured and documented will be similar to the way our design students structure their design reports.

2.2 approach

the approach is both theoretical and practical. c-det will be based on a theoretical framework including, for example: the descriptions of conceptual design described in [2]; the methods, techniques and methodologies used in design education at the faculty of industrial design engineering at delft, e.g. described in [3]; the conceptual development part of the induced model in [4]; and other papers, e.g. [5], in which some categorizations of knowledge are presented. the practical track is intuitive, trial and error, but supported with prototypes and experiments. the early prototypes of c-det (will) have a small kernel, only serving a small design domain. this can be extended, within limits, to maintain a balance between creating a toolbox that is too comprehensive (another internet/world) and a toolbox that is too small (not useful). the early prototypes of c-det will be mainly focused on the knowledge-portal function. research will cover items such as the way the knowledge, tools, etc. should be structured and made accessible (interface and entrance structure) in order to optimally support design engineers in finding the information they need. we foresee that a single way of categorizing knowledge will not be sufficient.
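before turning to examples of these categorizations, the sketch below shows one conceivable data model in which a single collection of entries is tagged with several independent structures (phases, aspects, product domains, keywords) and carries a rating; the class and field names are invented for illustration and do not describe the actual prototype, which was implemented in asp.

```python
from dataclasses import dataclass, field

@dataclass
class ToolboxEntry:
    """one website, tool or knowledge source in the c-det knowledge portal."""
    title: str
    url: str
    phases: set = field(default_factory=set)    # e.g. {"conceptual design"}
    aspects: set = field(default_factory=set)   # e.g. {"materials"}
    domains: set = field(default_factory=set)   # e.g. {"bikes"}
    keywords: set = field(default_factory=set)
    rating: float = 0.0                         # aggregated user feedback

def search(entries, phase=None, aspect=None, keyword=None):
    """matrix-style lookup: any combination of phase, aspect and keyword."""
    hits = [e for e in entries
            if (phase is None or phase in e.phases)
            and (aspect is None or aspect in e.aspects)
            and (keyword is None or keyword in e.keywords)]
    return sorted(hits, key=lambda e: e.rating, reverse=True)

entries = [ToolboxEntry("material selector", "http://example.org/matsel",
                        phases={"embodiment design"}, aspects={"materials"},
                        keywords={"polymer", "recycling"}, rating=4.2)]
print(search(entries, aspect="materials", keyword="recycling")[0].title)
```

because every entry carries all of its tags at once, the same collection stays reachable through whichever entrance structure suits the designer.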
for example, in order to support the design engineer in structuring his output, we need a structure according to design process phases; and, in order to enable some support for balanced comprehension (see the later section) of a design result, we need a structure including design aspects (such as ergonomics, manufacturability, etc.). furthermore, in a parallel project carried out by restrepo ([6, 7]), a prototype has been created in which information is structured according to product domain, such as bikes and offices, which might be another entrance structure in c-det. furthermore, we expect that finding knowledge, tools, etc. by using keywords would also be very valuable for designers. in two prototypes of the c-det system, we have established two approaches to making the tools and knowledge of our system available. one approach is to link tools and knowledge sources to a phase structure, so designers can search and browse for knowledge using the phase they are in as guidance to the sought knowledge. the other approach is based on the designer's notion of the specific aspect that he/she is working on. we are investigating the appropriateness of these approaches. both prototypes are discussed briefly in this paper.

besides the entrance structure, we need to find links, knowledge, tools, etc. to be made available within c-det. for this we need criteria, a rating system and many search hours (by students). c-det will function as a framework in which existing material (including tools newly developed by colleagues or others) is made available. in this way, c-det might be able to visualize the gaps in existing design support. in the early stages of developing c-det, we also need to investigate in what way the communication with design engineers – more specifically, their feedback on the links, tools and knowledge – should be incorporated. this feedback will be used to improve c-det, to improve the criteria used for selecting knowledge, tools, etc., to improve the ratings and/or the rating system, and also to evaluate the newly developed tools that are integrated in c-det. another topic that needs attention in the early stages of c-det is the way the work of users could be inserted and stored. the output structure of this work needs to be developed and evaluated. the idea is that when the design work is inserted in c-det, it can be printed in such a way that the designer has an intermediate or final design report presenting his conceptual design stage. for this we need to differentiate between public and private data. c-det might be able to support cooperation between design engineers, even when these engineers are not at the same location. the required functionality for this kind of support needs further investigation.

2.3 progress: two prototypes

the first two prototypes we have made deal only with the knowledge-portal part of c-det. the goal of the first prototype was to create a knowledge portal for some 100 websites and tools that can be used during the conceptual design process, in order to evaluate our initial ideas about c-det [8]. the entrance structure should be some kind of basic design process. the questions to be answered in this part are: which phases are walked through by design engineers during the design process, and what are the most common names for each of these phases; which sites and tools are relevant for design engineers; and how should these sites and tools be categorized and presented. the first question is addressed by studying the literature, e.g. [3, 4].
design engineers have their own way of working, and the methods and techniques used vary enormously between design engineers and between design projects. since it seems impossible to set up one concrete design process for all methods, we decided to set up an extensive design process in which most phases can be found. not every design project will walk through all phases, and of course design engineers using c-det will not be forced to do so. the design engineer himself determines in which phase of his design process he is. the common basis of the design processes described in the literature is used, and is supplemented by relevant steps described in design methods and techniques. in this way the following basic design process has been put forward:

• phase 1: problem analysis (information phase, orientation phase, goal finding). sub-phases are: lay down outline timetables; clarify and define the task; acquire general information; analyse the user environment; analyse usage; analyse competitive products; form a vision (collages); create a life cycle analysis; determine the list of requirements.
• phase 2: conceptual design (product design phase). sub-phases are: start with the first sketches; evaluate and choose from ideas; possibly cluster ideas; search for solution principles; combine into concept variants; determine specifications of the concept variants; compare concept variants and choose from them; finally make a solution concept.
• phase 3: embodiment design (detail design, optimisation phase). sub-phases are: summarise aspects still to be improved; improve specific aspects where needed; conduct studies into possible forms and shapes; conduct studies into possible colours; apply ergonomics data; investigate materials; investigate electronics; investigate environmental consequences; conduct mechanical calculations; develop plans for production and assembly; create an exploded view; generate cost optimisation; optimise and complete the form design; put together a definitive design; construct a plan for building the prototype.
• phase 4: presentation. sub-phases are: generate design documentation; create a rendering; build a prototype.
• phase 5: evaluation. sub-phases are: test the prototype (user trials); test against the list of requirements; evaluate the process and its results; make recommendations for adjustments; prepare for production.

the initial contents (sites and tools) for the toolbox were collected in three ways: using a questionnaire for our design students; searching literature that is referenced in our courses; and searching the internet. the initial categorization is based on our own experience and knowledge, and will be evaluated by testing the prototype. the first prototype was created in asp. in the use test that we carried out with this prototype, only design students participated. we asked the students to carry out a small design task in which some search items were included. the results of this use test are promising. users were enthusiastic about the knowledge portal. they were able to find information and tools, and they began "playing" with the prototype because of the interesting links they found. however, the entrance structure based exclusively on design phases was not considered optimal. to gain more indications about an entrance structure based on design aspects, we set up a questionnaire. based on internal reports, e.g.
[9], and the literature [3, 10], the following design aspects are expected to be important for design engineering projects: technology/technical aspects; materials; manufacturing; design/aesthetics; ergonomics; management science; ecology; regulations; marketing.

fig. 1: first prototype of c-det

furthermore, we wanted to evaluate the criteria used for selecting websites and tools. these criteria are based on [8], [10, 11] and are: advertisements and links are not the only information on the site; it is a non-commercial site; there is an option for requesting more specific information; links to enable further searching are present; the site includes a search engine; there is an easy way of navigating within the site; the site provides a guest book (or forum); the site suggests that a serious effort was put into its aesthetics; technical pictures support the text; the site contains moving animations (e.g. flash intros); the time required for loading the site is short; a notification of the last update of the site is present; the last update was performed recently; and the available links are all still present and "alive". based on the answers to the questionnaire, two criteria were skipped, namely the criteria about the guest book and about the moving animations. furthermore, for our user group, the language in which the site is presented is limited to dutch and english.

in the second prototype [12], the entrance structure is a matrix consisting of design phases (rows) and design aspects (columns). some 400 sites are linked. on the page with the search results we have added two features, based on the answers of respondents and the user tests of the first prototype: a price indicator and a language indicator. see fig. 1.

for the use test with the second prototype of c-det, we set up twelve search assignments that a tester has to carry out within 10 minutes, after a short introduction to the toolbox. the search assignments are, for example: you are making a cover for a presentation or a design report and you want to gain some ideas and inspiration; you want to know the recycling options and the environmental impact of a material that you are using in a product design (e.g. polyethylene); you are looking for a component in solidworks format to insert in your own cad drawing; you want to know what material is most appropriate to use in your design of an mp3 player; etcetera. fifteen users participated in the user tests. the users did not understand the principle of c-det right away. however, once they had used the toolbox for only a few minutes, their insight into the entrance structure increased enormously, which resulted in faster completion of the search assignments. some users would have categorized the sites slightly differently than we did. not all search assignments were carried out successfully. that is because some subjects could be filed under multiple aspects, which caused doubt about which button to use when searching for the information.

3 development of the use-forecast tool

probably the most crucial phase in a product life cycle is the stage in which the product is used by humans to fulfil its intended functions. one of the issues is how to include knowledge related to the use stage of products in computer-aided conceptual design, as a supplement to functional modelling, artefact modelling and artefact-behaviour modelling.
this issue has to be considered in the context of the increasing deployment of knowledge-intensive systems in computer support of design. the two key elements of our approach are (1) the deployment of ontologies as a knowledge repository and (2) introducing knowledge-intensiveness into models by applying a nucleus-based approach. the application of ontologies has proven to be an advantageous paradigm in the research field of knowledge representation and knowledge processing. the nucleus-based modelling approach makes it possible to apply the ontology paradigm to so-called design concepts that offer integrated support to the product designer for modelling artefact geometry and artefact behaviour, in close connection to the artefact's function and its intended use by humans.

3.1 ontology-based support of usage consideration in conceptual design

for the application of ontologies in our project, cooperation was initiated with experts in the field of functional-ontology modelling at the department of knowledge systems at osaka university, japan. the objectives of functional-ontology modelling are a) to provide insight into the design rationale by making the intended product behaviour explicit and b) to provide computer-generated suggestions for alternatives based on the given functions in a product. the collaborative project aims to extend towards use-process modelling, with a specific focus on forecasting the different use processes that are possible if the behaviour of the user u and the environment e are also taken into account. unlike the product p being designed, u and e are not likely to be changed by the designer. a product on its own may sometimes not behave as was intended by the designer, but the risk that u or e does not behave as was intended, causing unwanted effects, is even more plausible. moreover, unintended behaviour of p itself is likely to be prompted by u or e. therefore, we want to offer assistance in managing the knowledge about possible (i.e., intended and unintended) use processes so that designers can anticipate them. this knowledge can originate from various sources, such as simulations, insight gained from previous products, or data collected from interactive user participation sessions. needless to say, it will not be realistic to capture and manage knowledge about all possible use processes, but we do not strive to exclude any particular use a priori. so far, the research into the deployment of ontologies for the consideration of use processes has been at an explorative level, and it has not reached the stage at which its results can be applied or tested by designers in the form of computer support. nevertheless, the exploration of use-related knowledge resulted in categorizations of a) typical and atypical forms of unintended use, b) forms of possible computer support, ranging from finding unintended use to solving the problems it causes, and c) principal approaches that can be taken to solve the problems. c-det offers an attractive opportunity to present these results to designers so that they can use them for guidance in their own approach to use consideration, and for gathering feedback about the prospect of the proposed forms of computer support.

3.2 knowledge-intensive models

in the early stage of design, designers typically create the first concrete models of products.
unfortunately, the modelling techniques and representations currently used in analysis and behavioural simulation do not provide effective means for modelling the use of products. one of the reasons is that in modelling use, not only p has to be modelled, but also u and e. the concrete problems with the available geometry-oriented modelling environments and simulation tools are: 1) the created models are supposed to be complete and valid, which is not necessary in conceptual design; 2) the simulations are orientated toward the behaviour of artefacts in a given environment, but they typically only include passive behaviour of the human; 3) unlike knowledge describing the pure physical behaviour of p and e, knowledge related to the active behaviour of u cannot straightforwardly be embedded in geometry; 4) knowledge that associates p with different us and different es cannot be included; 5) simulations tend to be restricted to behaviour that is completely determined by one initial state. consequently, current simulation packages cannot cope with interventions that naturally occur in use processes.

currently, our research focuses on the generation of kernel u-p-e models to support modelling and forecasting use in conceptual design with a future system for generating resource-integrated models. the models are partly generated based on knowledge about use stored in ontologies, and partly by the designer applying his or her own insights. for forecasting use processes, the knowledge in the models has to provide input for simulations. to overcome the problem that simulations cannot cope with multiple use processes, produced by a multitude of users, user behaviours, and environments, our idea is to apply multiple simulations based on scenarios, which can be associated with the u-p-e system in order to provide the starting conditions for simulations and forecasting. in the resource-integrated model, the scenarios are the carriers of the process knowledge that complements the physical behaviour processed in simulations and that usually cannot be captured in object-type models. instantiations of nuclei serve as building blocks in modelling the actors u, p and e. the fact that nuclei can represent the physical characteristics of the actors, in addition to their geometric and structural characteristics, makes them attractive for modelling and simulating use cases. furthermore, it is also possible to consider them as function carriers, and to include them in a functional ontology. this ontology can incorporate scenarios consisting of situations known to occur in use processes. in generating models, we distinguish two cases: a) conceptual design of a new product and b) conceptual redesign of an existing product or of a product belonging to a known class of products. in generating resource-integrated models, four activities are involved: modelling the user, modelling the product, modelling the environment, and modelling a scenario in which u, p and e appear as a system, involving situations that can be used as input for simulations. to investigate the applicability of resource-integrated modelling in conceptual design, we have developed a nucleus-based model of an existing product, a pedal bin. the level of detailing of the object-type models of u, p and e corresponds to what we presumed to be appropriate in conceptual design.
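a scenario in this sense can be read as an ordered list of situations that supply the starting conditions for simulations over the u-p-e system. the sketch below is one conceivable encoding of that reading; all names and values are invented for illustration and do not come from the actual system.

```python
from dataclasses import dataclass

@dataclass
class Actor:
    """nucleus-based actor: user (u), product (p) or environment (e)."""
    name: str
    state: dict                   # physical state variables of the nuclei

@dataclass
class Situation:
    description: str
    overrides: dict               # state changes that start this situation

@dataclass
class Scenario:
    """carrier of process knowledge over a u-p-e system."""
    user: Actor
    product: Actor
    environment: Actor
    situations: list

    def initial_conditions(self, step):
        """starting conditions for simulating situation number `step`."""
        return {**self.user.state, **self.product.state,
                **self.environment.state, **self.situations[step].overrides}

disposal = Scenario(
    user=Actor("standing person", {"foot_force_N": 0.0}),
    product=Actor("pedal bin", {"lid_angle_deg": 0.0}),
    environment=Actor("kitchen floor", {"friction": 0.6}),
    situations=[Situation("press pedal", {"foot_force_N": 40.0}),
                Situation("drop garbage", {"garbage_height_m": 0.3})])
print(disposal.initial_conditions(0))
```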
in our understanding, this model represents what the ontology can provide for us as a basis for a use-oriented redesign of an existing product – case (b). but it can also be understood as an intermediate stage in the conceptual design of a new product – case (a). we generated a qualitative description of a simple use scenario, the disposal of a piece of garbage, which specifies the situations and the initial conditions for a simulation. the actual simulation was performed with msc working model® 2d (wm2d). this package was also used to create most of the nucleus-based models of u, p and e.

fig. 2: (a) conceptual model of a pedal bin: simulation of a use scenario (left); (b) improved conceptual model of a pedal bin: simulation of a use scenario (right)

fig. 2a shows the initial resource-integrated conceptual model of the pedal bin in a simulation of its usage. when the pedal is pressed, the bin starts to tumble and the lid tends to oscillate around its highest position. if the object is launched from the shown position, this behaviour of the bin prevents proper disposal. fig. 2b shows an improved conceptual design with a counterweight to ensure a more determined movement of the bin and the lid. the garbage now lands successfully inside the bin. this application case study demonstrates the method and issues of using nucleus-based modelling in conceptual design. the homogeneous representation of u, p and e enables us (1) to model the known use processes in the form of common scenarios and (2) to predict ad-hoc use processes based on simulations. based on the results, an improved concept product can be realized at the level of detail that is typical for conceptual design. as a matter of course, we have to validate the findings for a wider range of products and use processes. in this respect, it has to be made clear that the set-up of this tabletop research has a priori limitations in terms of an exhaustive validation of the hypothesis, for two reasons: (1) the generation or retrieval of known use processes was not based on a fully developed use ontology, nor on fully elaborated scenario-based modelling, and (2) the applied simulation technique, provisionally offered by the wm2d system, also posed limitations. despite these restrictions, the results revealed attractive prospects for the application of resource-integrated models to represent use processes in conceptual design. designers can anticipate the use process without consulting external knowledge sources about, and models of, users and environments, and without having to switch between object-type models and process-type models (including functional models). that is, processes involving simulation-based forecasting can be seamlessly included in the modelling environment, and even intervention-type interactions can be studied. this development of a resource-integrated modelling environment has not yet reached the stage at which its results can be applied or tested by designers in the form of computer support. however, based on the currently available demonstrative examples, c-det can give designers an impression of the future possibilities and can gather feedback about the actual need for the proposed form of design support.
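the kind of forecast made in the pedal-bin example can be imitated with an even smaller quasi-static check: does the bin tip about the front edge of its base when the pedal is pressed? the rigid-lever abstraction and all numbers below are invented for illustration; the actual study used the full 2d dynamics of wm2d.

```python
def bin_tips_over(foot_force, pedal_arm, bin_mass, base_half_width, g=9.81):
    """quasi-static tipping check about the front edge of the support base:
    the pedal force acts on a lever arm, the weight (incl. any counterweight)
    provides the restoring moment. forces in n, lengths in m, masses in kg."""
    tipping_moment = foot_force * pedal_arm
    restoring_moment = bin_mass * g * base_half_width
    return tipping_moment > restoring_moment

# light bin, no counterweight: tips over, as in fig. 2a
print(bin_tips_over(foot_force=40, pedal_arm=0.12, bin_mass=1.5, base_half_width=0.08))
# with a counterweight the bin stays put, as in fig. 2b
print(bin_tips_over(foot_force=40, pedal_arm=0.12, bin_mass=8.0, base_half_width=0.08))
```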
to go one step further and provide designers with insight into the potential of a full-featured system that combines the power of ontologies with knowledge-intensive models, more advanced development of the two pillars of tool development discussed above will most probably be required. progress is needed in particular in the following areas:
1) further development of the methodology and technology for capturing and processing knowledge related to the use of products,
2) further refinement of the fundamentals and methodology of modelling based on scenarios prescribing the use of products,
3) development of a dedicated simulation environment that can benefit from resource-integrated conceptual models.
4 extra functionality
once c-det is put into use and starts to gain appreciation from design engineers, and once our research has resulted in some well-functioning foundations, we can let c-det grow in extent and in functionality. then the following considerations will need to be taken into account. when a design engineer is working with c-det, his work could be monitored by keeping track of the links he is using and by analysing the information he has inserted. monitoring the design engineering work would give c-det opportunities to support the designer by: advising him where to store parts of his work; offering him the tools he could use best at that time (giving information about tools other users used in comparable cases); and advising him which aspects might need more attention (balanced comprehension). therefore, we may have to consider monitoring the work of the designer. this has to be approached with care, since unwanted advice is annoying, and system users do not always want the computer to keep track of what they are doing. the first form of design support we would like to incorporate in c-det, after it has survived its first stage as a knowledge portal and output-structuring tool, is helping the designer to come to a balanced comprehension of his design work. for this we need a tracking or monitoring tool, to be able to follow the aspects the designer has given attention to. this part of design support still needs a lot of research: how to monitor, how to evaluate, how to visualize the current status, etc. furthermore, it might be valuable for designers to be able to personalize their use of c-det. this could mean, for example, that they are able to change certain ratings for their own specific use. this item needs a closer look to decide whether or not it is an option we want to deal with. we can think of some future scenarios in which c-det is very valuable for developing competitive products. our first, very immature, ideas in this direction concern a c-det box (a small suitcase) containing disk space, a tablet pc, a scanner etc., with which the design engineer can plug in to the internet from anywhere. a picture search and a shape search engine could be integrated, and in the very long term some advanced input devices might be developed and incorporated. this section will be extended during the project.
5 conclusions and future results
we have found criteria for selecting websites and tools based on their relevance and usability for industrial design engineering, and thus for inclusion in c-det. we aim to find aspect-specific criteria and design-phase-specific criteria as well. the matrix we used as the entrance structure in the second prototype is preferred to the entrance structure based on design phases only, as used in the first prototype.
the order of the aspects has to be adjusted in order to achieve a more logical array of aspects. we will also continue searching for and evaluating other entrance structures, such as the structure based on the knowledge categorization of [9]. an introduction page in which the usage of the toolbox is explained may improve the ease of use. to keep c-det up to date, developments of tools for design engineering will have to be watched closely. one of these tools under development is discussed in this paper: a tool for forecasting the use process of a concept design offers added value to the designer and fits into c-det very well. as we indicated, we can also include early prototypes of such tools in c-det, to give designers a preview of future developments and to provide the opportunity to give feedback to the researchers involved in new-tool development. from our studies into the use of the two prototypes, we concluded that a phase-based structure alone does not facilitate optimal access to the tools and knowledge sources available within c-det. combined with the aspects in the matrix of the second prototype, most tasks for finding information, tools and knowledge sources were completed within the set limit of ten minutes. however, to come to a balanced comprehension (an overview that the designer can apply to determine the imbalance in the completion of the task at hand), a more complete taxonomy of design aspects is needed. the aspect or combination of aspects that a designer is considering should be unambiguously designated in the entrance structure of c-det, in order to be able to find the right tool for the right job. therefore we will work on fine-tuning the list of design aspects or sub-aspects that we will use to make the contents of c-det available.
references
[1] vroom r. w. et al.: overview of the initiatives on the development of designer's toolkits. procs. design 2002. zagreb: faculty of mechanical engineering and naval architecture, 2002.
[2] horváth i.: conceptual design: inside and outside. procs. ediprod 2000. zielona góra: technical university of zielona góra, 2000, p. 63–72.
[3] roozenburg n. f. m., eekels j.: product design: fundamentals and methods. chichester: wiley, 1995.
[4] vroom r. w.: zicht op product- en procesontwikkeling. doctoral dissertation, delft: delft university press, 2001.
[5] owen r., horváth i.: towards product-related knowledge asset warehousing in enterprises. procs. tmce 2002. wuhan: hust press, 2002, p. 155–170.
[6] garcía j., restrepo j.: context modelling in a collaborative virtual reality application as support to the design process. procs. design 2002. zagreb: faculty of mechanical engineering and naval architecture, 2002.
[7] christiaans h., restrepo j.: designer conditioning by early representations. procs. skcf '01. sydney, 2001.
[8] knijnenburg r., braak g.: intermediate technical report: developing c-det. report ide 350. delft: delft university of technology, industrial design engineering, 2003.
[9] nijhuis k.: het programma van eisen als instrument in de praktijk. internal graduation report, delft university of technology, industrial design engineering, 1996.
[10] eder w. e.: edc engineering design and creativity. procs. workshop edc, november 1995, series wdk no. 24. zürich: heurista, 1995.
[11] bauer c., scharl a.: "quantitative evaluation of web site content and structure." internet research: electronic networking applications and policy, vol. 10, (2000), no. 1, http://www.emerald-library.com.
[12] koekkoek m., leving j.: weblink library for conceptual design engineering. report ide 350. delft: delft university of technology, industrial design engineering, 2003.
[13] van der vegte w. f., horváth i.: consideration and modeling of use processes in computer-aided conceptual design: a state of the art review. transactions of the sdps, vol. 6, (2002), no. 2, p. 25–59.
[14] kitamura y., mizoguchi r.: "ontology-based description of functional design knowledge and its use in a functional way server." expert systems with applications, vol. 24, (2003), no. 2, p. 153–166.
[15] horváth i., van der vegte w. f.: nucleus-based product conceptualization – part 1: principles and formalization. to be published in procs. iced 2003, stockholm, 2003.
[16] van der vegte w. f. et al.: ontology-based modeling of product functionality and use – part 2: considering use and unintended behavior. procs. ediprod '02. zielona góra: technical university of zielona góra, 2002.
[17] van der vegte w. f., horváth i.: nucleus-based product conceptualization – part 2: application in designing for use. procs. iced 2003, stockholm, 2003.
[18] rivera t., mitiguy t. (last revisers): working model – user's manual. msc software corporation, san mateo, 2001.
dr. regine w. vroom
phone: +31 152 781 342, fax: +31 152 781 839
e-mail: r.w.vroom@io.tudelft.nl
ernest j. j. van breemen, m.sc.
e-mail: e.j.j.vanbreemen@io.tudelft.nl
wilhelm f. (wilfred) van der vegte, mtd
e-mail: w.f.vandervegte@io.tudelft.nl
industrial design, delft university of technology
landbergstraat 15, 2265 cl delft, the netherlands
acta polytechnica 57(2):105–115, 2017. doi:10.14311/ap.2017.57.0105. © czech technical university in prague, 2017. available online at http://ojs.cvut.cz/ojs/index.php/ap
implementation of a 3d vof tracking algorithm based on binary space-partitioning
filip kolařík*, bořek patzák
ctu in prague, faculty of civil engineering, department of mechanics, thákurova 7, 166 29 praha 6, czech republic
* corresponding author: filip.kolarik@fsv.cvut.cz
abstract. the paper focuses on modeling free surface flow. the interface is modeled using the volume-of-fluid method, where the advection of volume fractions is treated by a purely geometrical method. the novelty of the work lies in the way that it incorporates binary space-partitioning trees for computing the intersections of polyhedra. volume-conserving properties and shape-preserving properties are presented on two benchmarks and on a simulation of the famous broken dam problem.
keywords: free surface; two-fluid flow; vof; binary space-partitioning.
1. introduction
free surface flows appear in a wide range of industrial applications, including molten metal production in steel processing [1], resin-infusion processes in structural manufacturing [2], water waves in ship hydrodynamics [3], and fresh concrete casting in civil engineering [4]. the development of numerical techniques for describing the evolution of a free surface can be traced back to the early 1980s; a good review of the early development of this field can be found in [5]. since that time, a number of different approaches and methods have been proposed. the existing methods can be divided into so-called interface-tracking and interface-capturing methods [6]. in the interface-tracking techniques, the underlying discretization follows the interface in a lagrangian manner (a lagrangian formulation is typically employed). because of the large mesh distortions during the flow, the mesh needs to be updated as the interface evolves. the need for frequent re-meshing can sometimes involve an excessive expense, especially in the case of complex 3d flows. this disadvantage is balanced by the natural representation of the interface. interface-capturing methods are usually formulated in an eulerian context, where the computations are performed on a fixed mesh. the free surface flow is then realized as the flow of two immiscible fluids, where one is the fluid of interest (the reference fluid) and the other represents the ambient air. the interface between the fluids is determined with the help of a suitably chosen (scalar) interface function, defined on the underlying mesh. a direct consequence of this assumption is that the interface is captured within the resolution of the underlying mesh. fundamental examples of interface-capturing methods are the level set method [7–9], the volume-of-fluid (vof) method [5, 10], and their combination, clsvof [11]. in level set methods, the interface is typically represented as the zero contour of the signed distance function of the interface, which is defined over the whole computational domain. the interface is then tracked by solving the transport equation, which is of hyperbolic type and therefore introduces some computational difficulties. in volume-of-fluid, the volume fractions of the representative fluid are tracked. many different numerical schemes have been developed in this area.
most of them are based on finite difference schemes [8], or employ stabilized finite elements (supg), see [12, 13]. the work presented here employs the volume-of-fluid method which, within the interface-capturing methods, belongs to the family of so-called volume tracking methods [5]. here, instead of tracking the interface itself according to the transport equation (as in the level set method), we track the material volume fraction inside each cell along the streamlines. the interface is then reconstructed by projecting these advected volumes onto the original mesh. the volume fraction f coincides with the characteristic function of the domain occupied by the reference fluid, and can therefore be advected in the same fashion as in the level set methods. in the case of an incompressible fluid, this approach leads to numerical schemes where it is necessary to compute the fluxes across the element boundaries [14]. this approach is sometimes referred to as eulerian [15]. a different treatment of volume fraction advection has been proposed by dukowicz and baumgardner in [16]. their method is based on two simple facts. the first is that the volume of the reference fluid inside a sub-domain is equal to its volume fraction integrated over that sub-domain. the second states that a certain volume of the reference fluid cannot change in time (mass conservation). by requiring that an arbitrary volume projected along the streamlines at time $t_{n+1}$ is equal to the same volume at time $t_n$, they derived a simple method that can be based purely on geometrical procedures, i.e. intersections of polygons and polyhedra. our approach in this work follows dukowicz and baumgardner [16], with later developments by shahbazi [15]. the novelty lies in incorporating so-called binary space-partitioning (bsp) trees. bsp trees were originally developed in connection with 3d computer graphics and are nowadays heavily used in the gaming industry. an early mention of bsp trees can be found in [17], and a modern exposition, together with implementation notes, is contained in [18]. the main idea behind bsp trees lies in subdividing space into convex sets by hyper-planes. this subdivision can be represented by a binary tree structure, which is used in our work for an efficient implementation of polyhedra intersections. the main advantages of this formulation include its natural geometrical interpretation, mass conservation, and applicability to structured and unstructured meshes. the paper is organized as follows. the first section provides an overview of the governing equations of incompressible immiscible flow, together with discretization using the finite element method. then the vof-based interface tracking method is presented in detail, covering the interface reconstruction based on the volume fraction distribution, a mass-conserving interface update using a three-step procedure, and a description of the bsp algorithm for truncating advected volumes of the representative fluid. finally, the interface tracking technique is presented on the basis of several examples that illustrate its capabilities and its performance.
2. a description of the fluid
this section provides a description of the governing equations for the flow of two immiscible fluids. as the problem is by nature fully transient, the navier-stokes equations (nse) govern the motion of both fluids.
let us denote by $\Omega \subset \mathbb{R}^3$ a three-dimensional domain which is completely filled with the two immiscible fluids, occupying the corresponding subsets $\Omega_1(t)$ and $\Omega_2(t)$, where $t$ stresses the time dependence of the two subsets. the boundary of the whole domain, denoted $\partial\Omega$, can be decomposed into two mutually disjoint parts, $\Gamma_D$ and $\Gamma_N$, on which the so-called dirichlet and neumann boundary conditions are prescribed, respectively. the interface between the two fluids is denoted $\Sigma(t)$. then, for each phase $j = 1, 2$ in the domain $\Omega$ and on its boundary $\partial\Omega$, and for each time $t \in [0,T]$, the problem can be formulated as follows, see [19]:
$$\rho_j\Big(\frac{\partial \mathbf{v}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{v} - \mathbf{b}\Big) - \nabla\cdot\boldsymbol{\sigma} = 0 \quad \text{in } \Omega_j, \qquad \nabla\cdot\mathbf{v} = 0 \quad \text{in } \Omega_j,$$
$$\mathbf{v} = \mathbf{g} \quad \text{on } \Gamma_D, \qquad \boldsymbol{\sigma}\cdot\mathbf{n} = \mathbf{h} \quad \text{on } \Gamma_N, \qquad (1)$$
$$[\mathbf{v}]_{\Sigma(t)} = 0 \quad \text{on } \Sigma, \qquad [\mathbf{n}\cdot\boldsymbol{\sigma}]_{\Sigma(t)} = 0 \quad \text{on } \Sigma, \qquad \mathbf{v}|_{t=0} = \mathbf{v}_0 \quad \text{in } \Omega_j.$$
in the equations presented above, $\mathbf{n}$ denotes the normal vectors to both $\partial\Omega$ and $\Sigma(t)$, since it will be clear from the context which normal vector we have in mind. the unknowns are the velocity field $\mathbf{v}$ and the pressure field $p$. the density $\rho$, the body forces $\mathbf{b}$ and the functions $\mathbf{g}$, $\mathbf{h}$ and $\mathbf{v}_0$ are assumed to be known. the square brackets $[\cdot]_{\Sigma(t)}$ in the interface conditions on $\Sigma(t)$ denote a jump in the velocity and the normal stress components. in the case of a fluid with surface tension, the jump in the normal stress will be proportional to the curvature of the interface $\Sigma(t)$. the standard decomposition of the stress tensor $\boldsymbol{\sigma}$ into the deviatoric stress $\boldsymbol{\tau}$ and the hydrostatic pressure $p$ is used,
$$\boldsymbol{\sigma} = \boldsymbol{\tau} - p\boldsymbol{\delta}. \qquad (2)$$
both fluids are considered as newtonian fluids, so the constitutive equation reads
$$\boldsymbol{\tau} = \mu\mathbf{d}, \qquad (3)$$
where $\mathbf{d}$ denotes the strain rate tensor, defined as the symmetric part of the velocity gradient,
$$\mathbf{d} = \tfrac{1}{2}\big(\nabla\mathbf{v} + (\nabla\mathbf{v})^T\big), \qquad (4)$$
and $\mu$ is the dynamic viscosity of the fluid. the numerical solution of (1) is based on stabilized finite elements as introduced in [12]. provided that suitable finite-dimensional sub-spaces $S^h \subset S$, $V^h \subset V$ and $Q^h \subset Q$ are defined, the discretized problem states: find $\mathbf{v}^h \in S^h$ and $p^h \in Q^h$ such that $\forall \mathbf{w}^h \in V^h, \forall q^h \in Q^h$:
$$\int_\Omega \rho_j \mathbf{w}^h\cdot\frac{\partial \mathbf{v}^h}{\partial t}\,dx + \int_\Omega \rho_j \mathbf{w}^h\cdot(\mathbf{v}^h\cdot\nabla)\mathbf{v}^h\,dx + \int_\Omega \nabla\mathbf{w}^h : \boldsymbol{\tau}\big(\mathbf{d}(\mathbf{v}^h)\big)\,dx - \int_\Omega (\nabla\cdot\mathbf{w}^h)p^h\,dx - \int_\Omega \mathbf{w}^h\cdot\mathbf{b}\,dx - \int_{\Gamma_N} \mathbf{w}^h\cdot\mathbf{h}\,ds + \int_\Omega q^h(\nabla\cdot\mathbf{v}^h)\,dx + \sum_{el}\int_{\Omega^e} \tau_{SUPG}\,(\mathbf{v}^h\cdot\nabla)\mathbf{w}^h\cdot\mathbf{r}(\mathbf{v}^h,p^h)\,dx + \sum_{el}\int_{\Omega^e} \tau_{PSPG}\,\frac{1}{\rho}\nabla q^h\cdot\mathbf{r}(\mathbf{v}^h,p^h)\,dx = 0, \qquad (5)$$
where $\mathbf{r}(\mathbf{v}^h,p^h)$ denotes the residuum of the linear momentum balance part of (1). the need for stabilization follows from the use of linear tetrahedra (i.e. both the velocity and the pressure fields are approximated by linear functions) in order to circumvent issues with the lbb condition, see [20] for details. note that the effective density $\rho$, and also other material parameters such as the viscosity $\mu$, have to be averaged in elements which are cut by the interface. the averaging is performed using the original material parameters of both fluids, weighted by the respective volume fractions of the fluids in the cut element:
$$\mu = f\mu_1 + (1-f)\mu_2, \qquad (6)$$
$$\rho = f\rho_1 + (1-f)\rho_2, \qquad (7)$$
where $f$ stands for the vof value in the cut element. the additional stabilization terms are defined only element-wise, and are added as the sum over all elements $\Omega^e$. the parameters $\tau_{SUPG}$ and $\tau_{PSPG}$ depend on the velocity and on the material parameters; details on how they are determined can be found in [12, 13].
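as a quick illustration of the element-wise averaging in (6)-(7), here is a minimal python sketch; the numerical values are the water/air properties used later in section 4.2, and the function name is our own:

```python
import numpy as np

# fluid 1 = reference fluid (water), fluid 2 = ambient air (section 4.2 values)
mu1, mu2 = 1.0e-3, 1.0e-5      # dynamic viscosities [pa s]
rho1, rho2 = 1000.0, 1.0       # densities [kg/m^3]

def effective_properties(f):
    """vof-weighted material parameters of eqs. (6)-(7) for an array
    of element volume fractions f (f = 1 means pure reference fluid)."""
    mu = f * mu1 + (1.0 - f) * mu2
    rho = f * rho1 + (1.0 - f) * rho2
    return mu, rho

# e.g. an empty, a cut and a fully filled element
mu_e, rho_e = effective_properties(np.array([0.0, 0.3, 1.0]))
```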
3. evolution of the interface
this section presents in detail the computational representation of the interface between the two fluids, based on the vof technique. first, we present the algorithms for interface reconstruction and tracking. then there is a description of the method for truncating the polyhedra, based on bsp trees. it should be noted that there is only one-way coupling between solving (5) and the interface evolution; the interface advection is computed with the given velocity field obtained as a solution to (5). the position of the interface can be obtained from the spatial vof distribution. the interface is located either between two elements, when one is completely filled with the reference fluid (vof = 1) while the other is fully filled with the second fluid (vof = 0), or it passes through a partially filled element. on every partially filled element, we adopted a piece-wise linear approximation of the interface, by a line segment in 2d and a planar segment in 3d. this is sometimes referred to as plic-vof [5]. the interface evolution algorithm is required in order to update the interface position with the flow. the algorithm presented here is based on the paper by shahbazi et al. (see [15]), and consists of three parts (see also figure 1):
(1) the lagrangian part, in which the original mesh is projected along trajectories. this is achieved by advecting the nodal positions with the flow; the time integration is based on forward trajectory remapping to evolve positions and volumes in time (a minimal numerical sketch of this step is given after this list). to determine the trajectories, the velocity field is integrated from the original grid to the updated grid,
$$x_L(t+\Delta t) = x(t) + \int_t^{t+\Delta t} \mathbf{v}\,dt, \qquad (8)$$
where $x_L$ is the coordinate of the updated (lagrangian) grid and $x$ is the original coordinate. the discretized form of equation (8), employing the midpoint runge-kutta method, is
$$k_1 = \Delta t\,\mathbf{v}\big(t, x(t)\big), \qquad k_2 = \Delta t\,\mathbf{v}\big(t + \tfrac{\Delta t}{2},\, x(t) + \tfrac{1}{2}k_1\big), \qquad x_L^{t+\Delta t} = x(t) + k_2.$$
the time step $\Delta t$ is taken equal to the time step used for solving (5). in order to satisfy the cfl condition, we used adaptive time stepping, where the time step is determined by the element with the most unfavorable ratio $h/\|\mathbf{v}^h\|$, given the velocity field previously obtained by solving (5).
(2) the reconstruction part, in which the fluid volumes are reconstructed on the updated grid, assuming that the vof values remain constant during the lagrangian phase. the 3d polyhedron representation of the volume occupied by the reference fluid is established on the basis of the spatial vof distribution in each cell and its neighborhood. this includes the calculation of the interface segment normal, the segment constant, and the material volume truncation at each lagrangian cell, as described in section 3.1.
(3) the remapping part. this involves assembling the polyhedral representation of the reference fluid material for each cell on the updated (lagrangian) grid and redepositing it back on the target grid (i.e. on the original grid) to obtain the updated vof values. this phase is performed by a series of 3d polyhedron intersection procedures between the polyhedron representing the reference fluid on the updated grid and the polyhedra representing the original mesh cells, yielding the contributions to the updated reference fluid volumes in the particular cells. the details of this algorithm are given in section 3.2.
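a minimal python sketch of the lagrangian step: the midpoint runge-kutta update of the node positions together with the cfl-limited adaptive time step described in part (1). the function and argument names are our own, and `velocity` stands for whatever interpolation of the fem velocity field is available:

```python
import numpy as np

def advect_nodes(x, t, dt, velocity):
    """midpoint runge-kutta discretization of eq. (8): advect the mesh
    node coordinates x (n_nodes x 3 array) along the streamlines.
    `velocity(t, x)` is an assumed callable returning nodal velocities."""
    k1 = dt * velocity(t, x)
    k2 = dt * velocity(t + 0.5 * dt, x + 0.5 * k1)
    return x + k2

def cfl_time_step(h, v_norm, safety=0.9):
    """adaptive time step limited by the element with the most
    unfavourable ratio h / ||v||, as described in the text; h and
    v_norm are per-element arrays, the safety factor is our choice."""
    return safety * float(np.min(h / np.maximum(v_norm, 1e-30)))
```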
3.1. interface reconstruction
as was mentioned above, the interface in each partially filled cell (a so-called interface cell) is represented as a planar segment, which can be described as
$$\mathbf{n}\cdot\mathbf{x} - c = 0, \qquad (9)$$
where $\mathbf{n}$ is the unit normal to the planar surface representing the interface, $\mathbf{x}$ is a position vector, and $c$ is the plane constant. the interface segment normal in each cell can be determined from the spatial volume fraction distribution in the neighborhood of the cell; the segment constant is then determined from the volume conservation requirement. the method for determining the normal is based on an extension of young's second method, developed originally by rider and kothe [5]. the idea is to form a taylor series expansion $f_i^{ts}$ of the volume fraction for each interface cell $i$ with vof value $f_i$, towards each of its adjacent cells $k$ with vof values $f_k$. then the sum of $(f_i^{ts} - f_k)^2$ over all cells neighboring cell $i$,
$$\sum_k \Big(f_i^{ts}(\mathbf{x}_i) + \nabla f_i^{ts}(\mathbf{x}_i)\cdot(\mathbf{x}_k - \mathbf{x}_i) - f_k(\mathbf{x}_k)\Big)^2, \qquad (10)$$
is minimized in the least squares sense. the minimization yields the volume fraction gradient $\nabla f_i^{ts}$, corresponding to the interface segment normal $\mathbf{n}_i$ for cell $i$, as the solution of the resulting system of normal equations. all coordinates are evaluated at the mass centers of the cells. this method guarantees exact reconstruction of the gradients for a linear function of $f$. however, even for a linear interface, the distribution of $f$ over the tessellation is not linear; this method is therefore only first-order accurate. in [15], a second-order accurate method is proposed for the 2d case. this method is based on geometric minimization of the differences between the real volume fractions and the fractions determined by the given line. the value of the planar segment constant $c$ is determined from the volume conservation requirement. its value is constrained such that the resulting planar segment (with the fixed normal determined in the previous step) passes through the cell with a truncation volume equal to the cell material volume $V$. the planar segment constant is determined from the following relation:
$$f(c) = V(c) - V = 0, \qquad (11)$$
where $V(c)$ is the material volume in the cell bounded by the planar interface segment with constant $c$ and the portion of the cell boundary surfaces within the material. the cell material volume $V$ is equal to the total cell volume multiplied by the volume fraction. brent's method has been used to find the zero of the $f(c)$ function (and thus to determine the volume-conserving segment constant). this method uses inverse parabolic interpolation, so it is well suited to finding the root of $f(c)$, as $V$ often varies quadratically with $c$.
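the reconstruction step can be sketched as follows: the least-squares problem of (10) is solved for the gradient, and the segment constant of (11) is found with scipy's brent solver (brent's method is what the paper uses; the `truncated_volume` helper, which returns the cell volume cut off by the plane, is an assumption of this sketch and not part of the paper):

```python
import numpy as np
from scipy.optimize import brentq

def interface_normal(xc_i, f_i, xc_nb, f_nb):
    """least-squares gradient of the volume fraction around cell i,
    eq. (10): minimize sum_k (f_i + grad_f . (x_k - x_i) - f_k)^2.
    at least three (non-coplanar) neighbours are needed in 3d.

    xc_i  : (3,) mass centre of interface cell i
    f_i   : vof value of cell i
    xc_nb : (k, 3) mass centres of the neighbouring cells
    f_nb  : (k,) vof values of the neighbouring cells
    """
    a = xc_nb - xc_i                      # rows are x_k - x_i
    b = f_nb - f_i
    grad, *_ = np.linalg.lstsq(a, b, rcond=None)
    return grad / np.linalg.norm(grad)    # interface segment normal

def plane_constant(n, truncated_volume, material_volume, c_lo, c_hi):
    """volume-conserving segment constant of eq. (11): find c with
    V(c) - V = 0 inside the bracketing interval [c_lo, c_hi]."""
    return brentq(lambda c: truncated_volume(n, c) - material_volume,
                  c_lo, c_hi)
```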
3.2. a bsp tree based approach to remapping the tetrahedral mesh
the remapping part of the proposed algorithm is of a purely geometric nature. it involves establishing a representation of the reference fluid volume in each cell (in the form of polyhedra) on the updated grid, projecting it onto the original grid, and summing up the contributions of the truncated volumes (representing the contributing volume fractions) in each cell. the pseudo-code is presented in algorithm 1.

algorithm 1: remapping
    procedure dointerfaceremapping
        for el := 1 to nel do
            if el is cut by interface then
                n = el.getplanenormal()
                c = el.getplaneconstant()
                p = el.truncatepoly(n, c)
                for all neighbours of el do
                    q = neighbour.formpoly()
                    vol = intersectpoly(p, q)
                    neighbour.addvolume(vol)
                end for
            end if
        end for
    end procedure

the overall procedure requires two basic operations. the first operation is polyhedron formation based on knowledge of the interface segment that intersects a given cell. the second necessary operation is to compute the intersection between two (convex) polyhedra and the volume of that intersection. polyhedron formation is represented by the truncatepoly procedure in algorithm 1. once the tetrahedron and the corresponding intersecting plane are given, truncating the polyhedron is a trivial task, at least from the theoretical point of view; the only difficulty lies in the need to distinguish correctly between the individual cases of tetrahedron-plane intersection. the truncatepoly procedure consists of a loop over the tetrahedron faces, where each of the faces is cut by the intersecting plane. each such plane-face intersection results in a sub-face that belongs to the newly formed polyhedron, such that the new sub-face lies on the side of the interface plane filled with the reference fluid. lastly, the closing sub-face of the new polyhedron (the sub-face coincident with the interface plane) is formed as a cross-section originating from the union of the intersecting lines. the formpoly function forms a polyhedron from a plain (not intersected) element.
figure 1: illustration of the three phases of the interface update algorithm: (a) lagrangian step, (b) reconstruction step, (c) remapping step.
although the evaluation of the intersection of two arbitrary polyhedra is theoretically trivial, implementing it presents an interesting problem. in this work, the intersections are computed with the help of so-called binary space-partitioning (bsp) trees. the original idea of space partitioning with binary trees comes from a paper by fuchs, kedem and naylor [17] from 1980. binary space-partitioning is, generally speaking, a method for recursively subdividing a space into convex sets by hyper-planes. this subdivision of the space enables polyhedra (in general, not necessarily convex polyhedra) to be represented within the space by means of a tree data structure known as a bsp tree, see [18]. the idea is that a given plane $\mathbf{n}\cdot\mathbf{x} - c = 0$ partitions the space (here, we assume that the space is bounded) into two sub-spaces. according to the plane normal $\mathbf{n}$, one of the sub-spaces can be called positive (the points in that sub-space lie on the side to which $\mathbf{n}$ points) and the other sub-space can be called negative. note that points in the positive sub-space satisfy the inequality $\mathbf{n}\cdot\mathbf{x} - c > 0$, and vice versa. both the positive and the negative sub-spaces can be further partitioned by another plane, and so on. by applying this recursive procedure, one obtains a partitioning of the original space represented by a binary tree. the nodes of the tree represent the splitting planes. hence, the left child of a node corresponds to the positive sub-space created by the plane that the node represents; the right child corresponds to the negative sub-space. the leaf nodes then represent the convex regions obtained by the partitioning. the idea is illustrated in fig. 2, for clarity only in 2d; splitting lines are denoted as $p_i$, convex regions as $c_j$, and the sign over each edge of the tree indicates the positive or negative subspace.
figure 2: bsp tree 2d illustration. the nodes correspond to splitting lines denoted as p's; the leaf nodes represent the final convex sub-regions, denoted as c's.
the structure of the bsp tree for partitioning the space can easily be used for partitioning a polyhedron into convex sub-polyhedra. this can be done by creating the nodes of the tree in such a way that each node represents a suitable splitting plane (that splitting plane may or may not coincide with an actual face of the polyhedron).
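before continuing with the tree construction, here is a minimal sketch of the face-by-face clipping behind truncatepoly: one convex polygonal face is clipped against the half-space of the interface plane. we assume here, for the sake of the sketch, that the reference fluid occupies the side with $\mathbf{n}\cdot\mathbf{x} - c \le 0$; looping this over all tetrahedron faces and adding the closing cross-section on the interface plane yields the truncated polyhedron described above:

```python
import numpy as np

def clip_face(face, n, c, eps=1e-11):
    """clip one convex polygonal face (ordered list of 3d vertices)
    against the half-space n.x - c <= 0 (assumed reference-fluid side);
    returns the kept sub-face of the truncated polyhedron."""
    n = np.asarray(n, dtype=float)
    verts = [np.asarray(v, dtype=float) for v in face]
    kept = []
    m = len(verts)
    for i in range(m):
        a, b = verts[i], verts[(i + 1) % m]
        da, db = n @ a - c, n @ b - c
        if da <= eps:                       # vertex a is inside or on plane
            kept.append(a)
        if (da > eps and db < -eps) or (da < -eps and db > eps):
            t = da / (da - db)              # edge crosses the plane
            kept.append(a + t * (b - a))
    return kept
```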
other faces are split at that node by using the corresponding splitting plane. all the faces that are on the positive side of that plane are sent to the positive child, and all the faces on the negative side are sent to the negative child of the tree. the process is then repeated, so the binary tree is computed recursively. a face which accidentally lies fully on the splitting plane is also stored at the corresponding node, and is denoted as a coincident face. the pseudo-code for constructing a bsp tree, motivated by [18], is shown below.

algorithm 2: bsp tree construction
    procedure constructbsptree(facelist)
        p = getplanefromface(facelist.first())
        for all faces in facelist do
            type = classifyface(p, face)
            if type = crosses then
                (posface, negface) = splitface(p, face)
                addtoposlist(posface)
                addtoneglist(negface)
            else if type = positive then
                addtoposlist(face)
            else if type = negative then
                addtoneglist(face)
            else if type = coincident then
                addtocoincident(face)
            end if
        end for
        constructtree(poslist)
        constructtree(neglist)
    end procedure

the key stage of bsp tree construction is the classification of the mutual position of the splitting plane and a given face, represented by the classifyface function. the classification is performed with the help of standard methods of analytic geometry. however, due to round-off errors, it is necessary to set up an appropriate tolerance $\varepsilon$ under which we consider the face and the splitting plane to be coincident. our experience is that $\varepsilon$ depends on the problem to be solved (i.e. on the element size and the time step); a typical value is $10^{-11}$.
3.2.1. intersection of polyhedra
the intersection of two polyhedra is theoretically a trivial task, and can be computed in a straightforward manner. however, an implementation in floating point arithmetic requires careful treatment of some special situations; we will go into this matter below. let us say that we want to intersect polyhedron p with polyhedron q. the idea is to intersect each face of p with polyhedron q and to keep all portions of the intersected faces of p that lie inside q as part of the intersection between p and q. then we proceed the other way around, i.e. we intersect all faces of q with polyhedron p and keep all portions of the intersected faces of q lying inside p.

algorithm 3: intersection of polyhedra
    procedure intersectpoly(p, q)
        q.constructbsptree(q.facelist)
        p.intersectwith(q)
        p.constructbsptree(p.facelist)
        q.intersectwith(p)
    end procedure

if we were to test each face of p for intersection with each face of q, the resulting algorithm would have quadratic complexity. however, the use of a bsp tree reduces the number of comparisons, because a face on one side of a splitting plane does not need to be tested against faces on the other side of the splitting plane [18].
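the recursion of algorithm 2, including the tolerance-based classification, can be sketched in python as follows; `plane_from_face` and `split_face` are assumed helpers (the plane carried by a face, and the splitting of a face by a plane, e.g. via the clipping sketch above), and the 10^-11 tolerance follows the value reported in the text:

```python
import numpy as np

EPS = 1e-11   # coincidence tolerance; the text reports 1e-11 as typical

def classify_face(plane, face):
    """mutual position of a polygonal face (m x 3 vertex array) and a
    plane (n, c): 'positive', 'negative', 'coincident' or 'crosses'."""
    n, c = plane
    d = np.asarray(face, dtype=float) @ n - c
    if np.all(np.abs(d) <= EPS):
        return "coincident"
    if np.all(d >= -EPS):
        return "positive"
    if np.all(d <= EPS):
        return "negative"
    return "crosses"

class BspNode:
    """recursive construction of algorithm 2: the first face supplies
    the splitting plane, the remaining faces are classified, split if
    necessary, and sent to the positive/negative children."""
    def __init__(self, faces, plane_from_face, split_face):
        self.plane = plane_from_face(faces[0])
        self.coincident = [faces[0]]
        pos, neg = [], []
        for face in faces[1:]:
            kind = classify_face(self.plane, face)
            if kind == "crosses":
                p_part, n_part = split_face(self.plane, face)
                pos.append(p_part)
                neg.append(n_part)
            elif kind == "positive":
                pos.append(face)
            elif kind == "negative":
                neg.append(face)
            else:
                self.coincident.append(face)
        self.pos = BspNode(pos, plane_from_face, split_face) if pos else None
        self.neg = BspNode(neg, plane_from_face, split_face) if neg else None
```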
the main part of algorithm 3 is contained in the intersectwith function. basically, this function takes the faces of one polyhedron and sends them, in a loop, into the bsp tree of the other polyhedron for further processing, see algorithm 4.

algorithm 4: intersection with a polyhedron
    procedure intersectwith(p)
        for all faces of q do
            t = p.givebsptree()
            (in, coin) = getpartition(t, face)
        end for
    end procedure

the intersection of the two polyhedra can be composed of two types of faces/sub-faces. to the first type belong those faces of polyhedron p which lie inside polyhedron q (and, symmetrically, faces of q lying inside p). to the second type belong faces of p which accidentally coincide with some faces of q (and vice versa). therefore, the output of the intersectwith function is composed of two sets of faces, namely in and coin: in contains the sub-faces and faces of q which lie inside polyhedron p, and coin contains the sub-faces and faces of q which coincide with some of the faces of p. the intersection volume is computed from these two sets of faces, as will be described below. the getpartition function is the core part of the intersectwith procedure. its purpose is to partition a given face by the bsp tree into sub-faces which belong to a positive sub-space or to a negative sub-space, or which are coincident with some of the splitting planes. the face is processed at each node (representing a splitting plane). according to the result of the classifyface function, the face is sent to the positive child or to the negative child for further processing. if the face is crossed by the splitting plane, one part of the face lies on the positive side and the other part on the negative side; the part on the positive side is then sent to the positive child, and the other part to the negative child. the most complicated case is when the face lies in the splitting plane, i.e. when the face and the splitting plane are coincident. in this case, the overlapping portion of the processed face and the face which served as the basis for the splitting plane has to be computed. note that in the case of convex polyhedra it is enough to keep just the overlapping part of the face; the rest is of no importance. in the case of general polyhedra, the non-overlapping portions of coincident faces would have to be processed by the positive or negative child of the bsp tree. the modification of the algorithm proposed in [18] is given in algorithm 5.

algorithm 5: partitioning of a face
    procedure getpartition(tree, face)
        p = getplanefromface(tree.coinface())
        type = classifyface(p, face)
        if type = crosses then
            (posface, negface) = splitface(p, face)
            sendtoposchild(posface)
            sendtonegchild(negface)
        else if type = positive then
            sendtoposchild(face)
        else if type = negative then
            sendtonegchild(face)
        else if type = coincident then
            overlap = intersect(coinface, face)
            storeintersection(overlap)
        end if
    end procedure

in general, the result of the getpartition procedure is a representation of a face as a union of segments (sub-faces) contained in positive, negative or coincident sub-spaces of the bsp tree. however, all we need here is to compute the volume of the intersection of the polyhedra; therefore, only the segments in the negative and coincident sub-spaces of the bsp tree are needed. they are denoted as in and coin, see algorithm 4.
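one possible realization of the routing performed by getpartition, reusing the bsp node sketch above; we assume here the convention that an empty negative child marks the interior of the polyhedron, and `split_face` and `overlap` are assumed helpers (face splitting by a plane; intersection of two coplanar faces):

```python
def get_partition(tree, face, split_face, overlap):
    """route one face through a bsp tree (algorithm 5), collecting the
    IN set (sub-faces ending up in negative sub-spaces, i.e. inside)
    and the COIN set (overlaps with coincident faces)."""
    in_set, coin_set = [], []

    def visit(node, f):
        kind = classify_face(node.plane, f)      # from the sketch above
        if kind == "crosses":
            pos_part, neg_part = split_face(node.plane, f)
            descend(node.pos, pos_part)
            descend(node.neg, neg_part, inside=True)
        elif kind == "positive":
            descend(node.pos, f)
        elif kind == "negative":
            descend(node.neg, f, inside=True)
        else:                                    # coplanar: keep overlap only
            coin_set.append(overlap(node.coincident[0], f))

    def descend(child, f, inside=False):
        if child is not None:
            visit(child, f)
        elif inside:
            in_set.append(f)                     # negative leaf: inside

    visit(tree, face)
    return in_set, coin_set
```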
finally, the volume is computed with the help of the stokes formula
$$\int_{V_P} \nabla\cdot\mathbf{f}\,dx = \int_{S_P} \mathbf{f}\cdot\mathbf{n}\,ds, \qquad (12)$$
which transforms the volume integral of the divergence of a vector field $\mathbf{f}$ into a surface integral of the normal component of that field. $V_P$ and $S_P$ denote the volume and the surface of the polyhedron, respectively. in order to compute the volume $V$ of the polyhedron resulting from the intersection, we can proceed as follows:
$$V = \int_{V_P} 1\,dx = \int_{V_P} \nabla\cdot\Big(\frac{\mathbf{x}}{3}\Big)\,dx = \frac{1}{3}\int_{S_P} \mathbf{x}\cdot\mathbf{n}\,ds = \frac{1}{3}\sum_{f_i\in\{IN\,\cup\,COIN\}}\int_{f_i} \mathbf{x}\cdot\mathbf{n}_i\,ds = \frac{1}{3}\sum_{f_i\in\{IN\,\cup\,COIN\}}\int_{f_i} c_i\,ds = \frac{1}{3}\sum_{f_i\in\{IN\,\cup\,COIN\}} c_i a_i, \qquad (13)$$
where $\mathbf{n}_i$ and $c_i$ denote the normal vector and the plane constant of face $f_i$, respectively, and the area of face $f_i$ is denoted as $a_i$. the volume of the intersection of the two polyhedra then reduces to an evaluation of the areas and the plane constants of the faces contained in the set $\{IN\cup COIN\}$. all the procedures were implemented into the oofem code [21, 22]; all the results presented in the following section were also obtained with oofem.
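eq. (13) translates into a few lines of python; planar convex faces with vertices ordered counter-clockwise when seen from outside the polyhedron (outward normals) are assumed in this sketch:

```python
import numpy as np

def face_area_normal_constant(face):
    """area a_i, unit normal n_i and plane constant c_i (n.x = c) of a
    planar convex face given as an (m, 3) array of vertices."""
    v = np.asarray(face, dtype=float)
    area_vec = np.zeros(3)
    for i in range(1, len(v) - 1):          # triangle fan around v[0]
        area_vec += 0.5 * np.cross(v[i] - v[0], v[i + 1] - v[0])
    area = np.linalg.norm(area_vec)
    n = area_vec / area
    return area, n, float(n @ v[0])

def intersection_volume(faces):
    """volume of the closed polyhedron bounded by the faces of the
    {IN u COIN} set, using eq. (13): V = (1/3) sum_i c_i a_i."""
    total = 0.0
    for f in faces:
        a_i, _, c_i = face_area_normal_constant(f)
        total += c_i * a_i
    return total / 3.0
```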
4. numerical examples
in this section, we present several numerical examples that illustrate the capability of the proposed bsp tree-based vof method. the first example focuses on the shape-preserving ability of the method, and also on conservation of volume. the second example models the famous experiment performed by martin and moyce in the early 1950s [23].
4.1. cube propagation through the unstructured mesh
the example presented here considers the propagation of a cube through an unstructured mesh in the case of two basic motions of a fluid: translation and rotation. in both cases, the velocity field is prescribed in the whole domain, so we can focus exclusively on the vof method itself. the focus here is on the shape-preserving and volume conservation properties.
4.1.1. cube translation
the computational domain is a 10 × 10 × 20 prismatic block with a 5 × 5 × 5 cube inside, see figure 3. the velocity field is prescribed as a uniform field, i.e. (1, 0, 0), along the longest side of the block in the x-direction. for the computation presented here, we used an unstructured mesh with 27910 nodes and 150909 elements.
figure 3: cube translation — initial position, t = 0 s.
figure 4: cube translation — middle position, t = 5 s.
figure 5: cube translation — end position, t = 10 s.
figure 6: cube translation. slices with the normal in the x, y, z-direction (columns) and times t = 0, 5, 10 s (rows).
as regards shape preservation, figures 3–5 show the initial, middle and final positions of the cube; it can be seen that the shape preservation is very good. figure 6 displays slices through the cube by planes with normal vectors in the x, y and z directions, at times t = 0 s (initial position), t = 5 s (middle position) and t = 10 s (final position). visually, there is no significant error in the original shape of the cube. detailed information about shape preservation is provided in table 1, which presents a quantitative evaluation of the vof values at times t = 0 s and t = 10 s; the first column shows the relative error of the vof values (denoted as h) on each side of the cube, the mean value of vof is shown in the middle column, and the standard deviation is in the third column. the theoretically exact value is $h_{ex} = 0.5$. it can be seen that there are no significant changes in the mean value, and the first column shows an average error of approximately 12 %. there is a slight increase in the standard deviation, which indicates a non-uniform distribution of vof values across the planes of the cube and increasing fluctuations. other tests, which are not reported here, show that these fluctuations are reduced when finer meshes are used. the volume conservation is also excellent: the ratio between the final and the initial volume is only 1.000076, which represents an increase in volume of less than 0.1 ‰.

table 1: cube propagation — shape preservation evaluation.
cube side      |e(h)-hex|/hex      e(h)            σ(h)
time           t0      t10         t0      t10     t0      t10
nx  front      0.122   0.005       0.439   0.503   0.121   0.143
    back       0.127   0.186       0.437   0.407   0.121   0.125
ny  left       0.144   0.108       0.428   0.446   0.118   0.141
    right      0.141   0.121       0.429   0.439   0.121   0.133
nz  top        0.123   0.113       0.439   0.443   0.129   0.146
    bottom     0.132   0.136       0.434   0.432   0.124   0.152

figure 7: cube rotation. from left to right and from top to bottom: rotation angles α = 0°, 90°, 180°, 270°, 360°. the bottom-right figure shows the top view of the fully unstructured mesh.
4.1.2. cube rotation
the second example, performed in order to investigate the shape- and volume-preserving capabilities, deals with rotation of the cube. the geometry is a 1 × 1 × 1 cube inside a cylinder with radius 1 and height 1.5. the velocity changes linearly from 0 at the axis of rotation to 1 at the surface of the cylinder. the computational mesh consists of 24448 nodes and 132971 elements, and it is fully unstructured. the whole setup is shown in figure 7, where the cube is presented in its initial position and then after 90°, 180°, 270° and 360° of rotation; the last picture shows the top view of the unstructured mesh. the conservation of volume is again excellent: the ratio of the cube volume between the initial position and the final position (after 360° of rotation) is 0.9998012, which represents an error of less than 0.2 ‰. figure 8 shows slices through the cube by planes with the normal in the x, y and z direction, again for the initial position and after 90°, 180°, 270° and 360° of rotation, demonstrating the shape preservation. quantitative results on shape preservation are summarized in table 2. similarly to the cube translation, the first column shows the relative error in the vof values on each side of the cube for the initial position and after 360° of rotation, the second column shows the mean vof values, and the third column shows the standard deviation. note that the trend is the same as in the cube translation, i.e. the vof values do not change significantly on average, but, as the cube rotates, the fluctuations around the mean value increase slightly.
table 2: cube rotation — shape preservation evaluation.
cube side      |e(h)-hex|/hex      e(h)            σ(h)
angle          0°      360°        0°      360°    0°      360°
nx  top        0.139   0.186       0.431   0.407   0.135   0.171
    bottom     0.135   0.199       0.433   0.400   0.136   0.175
ny  left       0.059   0.077       0.471   0.461   0.100   0.158
    right      0.058   0.061       0.471   0.469   0.101   0.156
nz  front      0.054   0.019       0.473   0.491   0.100   0.181
    back       0.059   0.040       0.470   0.480   0.104   0.187

figure 8: cube rotation. slices with the normal in the x, y, z-direction (columns) and rotation angles α = 0°, 90°, 180°, 270°, 360° (rows).
4.2. the martin and moyce broken dam experiment
the final example makes use of the famous martin and moyce experiment with the problem of a broken dam (also known as the water column collapse problem), see [23]. the setup of the experiment is illustrated in the top-left picture of figure 9. a rectangular column of water is initially in a state of hydrostatic equilibrium. after time t0, the water column starts to collapse under its own gravity, forming an advancing water wave, as can be seen in the remaining pictures of figure 9. the initial configuration is chosen as a cubic water block with edge length a = 0.05715 m, and the bounding box is a prismatic block with dimensions of 0.2286 × 0.05715 × 0.085725 m. frictionless boundary conditions are assumed on the bottom and on the vertical walls. the density and the viscosity of water are taken as 1000 kg/m3 and 1 × 10−3 pa s; ambient air with density equal to 1 kg/m3 and viscosity 1 × 10−5 pa s is assumed. four different discretizations, with 2454, 12589, 19111 and 29604 nodes, have been used to verify the objectivity of the description with respect to mesh size. the position of the water front and the residual water column height with respect to time are compared with the experimental results in figure 10 and figure 11, respectively. for convenience, the dimensionless lengths $x^* = x/a$, $z^* = z/a$ and time $t^* = t\sqrt{g/a}$ are used in the figures. it can be seen from figure 10 that the numerical results converge to the experimental results with increasing fineness of the mesh. in the case of the residual height of the column in figure 11, the results are less sensitive to the mesh, and all the meshes used seem to provide reasonably accurate results.
5. conclusion
the research presented here has dealt with numerical simulations of flow with a free surface. the free surface is handled using the volume-of-fluid method, and the strategy employed for advancing the interface is based on purely geometric algorithms. in particular, we have proposed a new approach for the interface reconstruction and the remapping phases of the advection algorithm, based on binary space-partitioning. to the best of our knowledge, this approach has not been published before. the presented examples show very good volume-conserving properties; the shape-preserving properties are also satisfactory. we believe that this is achieved mainly due to the geometrical nature of the algorithm, which is implemented in a very effective manner with the help of the idea of binary space-partitioning. although this work is considered as a proof of concept, and additional efforts will be needed in order to develop a more complex code, the results presented here are more than promising. in particular, more accurate time integration in the lagrangian phase of the interface advection algorithm, and an analysis of its influence on the volume-conserving properties, would be beneficial.
figure 9: martin and moyce. from left to right and from top to bottom, the times are t = 0, 0.025, 0.05, 0.075, 0.1 and 0.125 s.
figure 10: martin and moyce. dependence of the wave front on time for different meshes, compared with the experimental results.
figure 11: martin and moyce. dependence of the residual height of the column on time for different meshes, compared with the experimental results.
acknowledgements
the authors would like to acknowledge the support provided by the czech science foundation under project 13-23584s.
references
[1] g. trapaga, e. f. matthys, j. j. valencia, j. szekely. fluid flow, heat transfer, and solidification of molten metal droplets impinging on substrates: comparison of numerical and experimental results. metallurgical and materials transactions b 23, 1992. doi:10.1007/bf02656450.
[2] w. brouwer, e. van herpt, m. labordus. vacuum injection moulding for large structural applications. composites part a: applied science and manufacturing 34, 2003. doi:10.1016/s1359-835x(03)00060-5.
[3] i. akkerman, y. bazilevs, d. j. benson, et al. free-surface flow and fluid-object interaction modeling with emphasis on ship hydrodynamics. journal of applied mechanics 79, 2012. doi:10.1115/1.4005072.
[4] b. patzák, z. bittnar. modeling of fresh concrete flow. computers & structures 87(15–16):962–969, 2009. doi:10.1016/j.compstruc.2008.04.015.
[5] w. j. rider, d. b. kothe. reconstructing volume tracking. journal of computational physics 141, 1998. doi:10.1006/jcph.1998.5906.
[6] e. stein, r. d. borst, t. j. hughes. encyclopedia of computational mechanics, vol. 3. john wiley, 1st edn., 2004.
[7] s. osher, j. a. sethian. fronts propagating with curvature-dependent speed: algorithms based on hamilton-jacobi formulations. journal of computational physics 79, 1988. doi:10.1016/0021-9991(88)90002-2.
[8] j. a. sethian. level set methods and fast marching methods: evolving interfaces in computational geometry, fluid mechanics, computer vision, and materials science. cambridge monographs on applied and computational mathematics. cambridge university press, 2nd edn., 1999.
[9] s. osher, r. fedkiw. level set methods and dynamic implicit surfaces. applied mathematical sciences 153. springer-verlag new york, 1st edn., 2003.
[10] c. hirt, b. nichols. volume of fluid (vof) method for the dynamics of free boundaries. journal of computational physics 39, 1981. doi:10.1016/0021-9991(81)90145-5.
[11] m. sussman, e. g. puckett. a coupled level set and volume-of-fluid method for computing 3d and axisymmetric incompressible two-phase flows. journal of computational physics 162, 2000. doi:10.1006/jcph.2000.6537.
[12] t. e. tezduyar. stabilized finite element formulations for incompressible flow computations. advances in applied mechanics 28:1–44, 1991.
[13] t. e. tezduyar, y. osawa. finite element stabilization parameters computed from element matrices and vectors. computer methods in applied mechanics and engineering 190:411–430, 2000.
[14] j. e. pilliod, e. g. puckett. second-order accurate volume-of-fluid algorithms for tracking material interfaces. journal of computational physics 199, 2004. doi:10.1016/j.jcp.2003.12.023.
[15] k. shahbazi, m. paraschivoiu, j. mostaghimi. second order accurate volume tracking based on remapping for triangular meshes. journal of computational physics 188:100–122, 2003.
[16] j. k. dukowicz, j. r. baumgardner. incremental remapping as a transport/advection algorithm. journal of computational physics 160, 2000. doi:10.1006/jcph.2000.6465.
[17] h. fuchs, m. kedem, b. f. naylor. on visible surface generation by a priori tree structures. acm siggraph computer graphics 14, 1980. doi:10.1145/965105.807481.
[18] p. schneider, d. h. eberly. geometric tools for computer graphics. the morgan kaufmann series in computer graphics and geometric modeling. boston: morgan kaufmann publishers, 2003.
[19] h. sauerland, t. p. fries. the extended finite element method for two-phase and free-surface flows: a systematic study. journal of computational physics 230:3369–3390, 2011.
[20] i. babuska. the finite element method with lagrangian multipliers. numerische mathematik 20:179–192, 1973.
[21] b. patzák. oofem home page, 2000. http://www.oofem.org.
[22] b. patzák, z. bittnar. design of object oriented finite element code. advances in engineering software 32:759–767, 2001.
[23] j. c. martin, w. j. moyce. part iv. an experimental study of the collapse of liquid columns on a rigid horizontal plane. philosophical transactions: mathematical, physical and engineering sciences 244, 1952. doi:10.1098/rsta.1952.0006.

acta polytechnica 55(4):215–222, 2015. doi:10.14311/ap.2015.55.0215. © czech technical university in prague, 2015. available online at http://ojs.cvut.cz/ojs/index.php/ap
study of l–h transition and pedestal width based on two-field bifurcation and fixed point concepts
boonnyarit chatthong*, thawatchai onjun
school of manufacturing systems and mechanical engineering, sirindhorn international institute of technology, thammasat university, pathum thani, thailand
* corresponding author: boonyarit.chatthong@gmail.com
abstract. the l–h transition in magnetic confinement plasmas is investigated on the basis of concepts of two-field bifurcation and fixed-point stability. a set of heat and particle transport equations with both neoclassical and anomalous effects included is used to study etb formation and also pedestal width and dynamics. it is found that plasmas can exhibit bifurcation, where a sudden jump in the gradients can be achieved at the transition point corresponding to the critical flux. furthermore, it is found that the transport barrier expands inward, whereby the radial growth of the pedestal initially appears to be superdiffusive but later slows down and stops. in addition, the time of barrier expansion is found to be much longer than the time that the plasma takes to evolve from l-mode to h-mode. a sensitivity study is also performed, in which the barrier width is found to be sensitive to various parameters, e.g. heating, transport coefficients and suppression strength.
keywords: plasma; tokamak; fusion; l–h transition; bifurcation.
1. introduction
experimental observations in various magnetic confinement fusion devices have revealed that the formation of an edge transport barrier (etb) results in a sudden transition from low-confinement mode (l-mode) to high-confinement mode (h-mode), with great improvement in plasma performance [1].
this improvement is necessary for future nuclear fusion machines, such as iter [2]. understanding the physics of the l–h transition [3] is still a key issue in fusion research. experimentally, plasmas make the transition to h-mode when the injected heat exceeds a threshold, marked by the formation of a transport barrier located near the edge of the plasma with a relatively high gradient profile (pressure/density) [4]. consequently, the core profile rises, resulting in enhanced fusion performance. it is therefore crucial to be able to explain the physical mechanisms of transport barrier formation and dynamics. although the underlying physics of the l–h transition is still unclear, many hypotheses have been put forward based on the concept of suppression of the turbulent transport by the flow shear and/or the magnetic shear [5]. it is known that turbulent transport can be stabilized by flow shear because of the breaking of convection cells [6]. experimental results support the idea that turbulent fluxes can be reduced or quenched by a sheared flow in the transport barrier region [5, 7]. consequently, an etb is formed and the plasma makes an abrupt transition from l-mode to h-mode. some earlier research based on bistable s-curve bifurcation models [8–15] provided insights into qualitative aspects of l–h transition physics. these works described the l–h transition using an s-curve in the space of nonlinear flux versus gradient, with both stable and unstable branches. the bifurcation model was introduced to explain particle and energy confinement in tokamaks [9]. ref. [11] utilized a simple one-field bifurcation model to portray the spatiotemporal behavior of the plasma, and found the hysteresis loop. malkov and diamond later applied this concept to analyze coupled heat and particle transport equations simultaneously, and illustrated that when the hyper-diffusion effect is included, the transition follows maxwell's rule [12]. recently, the model has included heat and momentum density transports for an analytical study of the impact of external torque on the formation of an internal transport barrier (itb) [13]. other directions can also be taken; for example, in [14], bifurcation theory was used to explain the transition and also the dithering h-mode. this work attempts to reveal the fundamentals of the l–h transition via a simple bifurcation model, which can describe an overall view of the intrinsic bi-stability of the plasma. the variation of the system state, indicated by the pressure gradient, with respect to a control parameter such as the heat flux exhibits an s-curve shape. intrinsic behaviours of the etb, including its dynamics and width, are also investigated. the study is therefore based on the assumption of an elm-free plasma, i.e. any gradient-limit instability is neglected in order to study the growth, or the expansion, of a transport barrier. this paper is organized as follows: brief descriptions of bifurcation and fixed-point concepts are presented in section 2; numerical results of the l–h transition and pedestal analysis are illustrated in section 3; and conclusions are presented in section 4.
2. Bifurcation model and fixed-point analysis

This section introduces the bifurcation model and presents a conceptual and visual discussion of the local stability of a plasma and the dynamics of the L–H transition, as well as its locations in the bifurcation diagram. A simplified version of the heat and particle transport equations, in slab geometry, can be expressed, respectively, in the form:

\frac{3}{2}\frac{\partial p}{\partial t} - \frac{\partial}{\partial x}\left[\chi_{neo} + \frac{\chi_{ano}}{1+\alpha(\nu_E')^2}\right]\frac{\partial p}{\partial x} = H(x,t), \qquad (1)

\frac{\partial n}{\partial t} - \frac{\partial}{\partial x}\left[D_{neo} + \frac{D_{ano}}{1+\alpha(\nu_E')^2}\right]\frac{\partial n}{\partial x} = S(x,t), \qquad (2)

where p is the plasma pressure, n is the plasma density, χ_neo and D_neo represent the neoclassical transport coefficients, χ_ano and D_ano represent the anomalous transport coefficients, ν′_E is the flow shear suppression, α is a positive constant representing the strength of the suppression, and H and S are the thermal and particle sources, localized at the centre and at the edge of the plasma, respectively. The formation of an ETB is a result of anomalous transport reduction caused by the flow shear effect [16]. Hence, the main ingredient for stabilizing the anomalous transport is the flow shear, which accounts for the known reduction of the turbulent transport by a sheared radial electric field [6]. It couples the two transport equations according to the force balance equation, as shown in [12]:

\nu_E' = \frac{c\,E_r'}{B} \approx -\frac{c}{eBn^2}\,p'n'. \qquad (3)

Note that the contributions of the curvature and of the toroidal and poloidal rotation are neglected here. Equations (1) and (2) can be rewritten to give the time variation of the pressure and density as:

\frac{\partial p}{\partial t} = H - \frac{\partial}{\partial x}\left\{\left[\chi_{neo} + \frac{\chi_{ano}}{1+\alpha(\nu_E')^2}\right] g_p\right\}, \qquad (4)

\frac{\partial n}{\partial t} = S - \frac{\partial}{\partial x}\left\{\left[D_{neo} + \frac{D_{ano}}{1+\alpha(\nu_E')^2}\right] g_n\right\}, \qquad (5)

where g_p ≡ −p′ and g_n ≡ −n′. Integration of these two equations with respect to x yields the following:

\dot{W} = q - \left[\chi_{neo} + \frac{\chi_{ano}}{1+\alpha(\nu_E')^2}\right] g_p, \qquad (6)

\dot{\eta} = \Gamma - \left[D_{neo} + \frac{D_{ano}}{1+\alpha(\nu_E')^2}\right] g_n, \qquad (7)

where q = \int H\,dx and \Gamma = \int S\,dx are the heat and particle fluxes given to the plasma. These two equations represent the time variation of the energy and particle contents flowing through a flux surface, with \dot{W} = \partial W/\partial t and \dot{\eta} = \partial \eta/\partial t defined as follows:

\int p\,dx = \sum_i p_i\,\Delta x_i = \sum_i \frac{F_i}{A_i}\,\Delta x_i = \sum_i \frac{W_i}{A_i} = \sum_i w_i \equiv W, \qquad (8)

\int n\,dx = \sum_i n_i\,\Delta x_i = \sum_i \frac{N_i}{V_i}\,\Delta x_i = \sum_i \frac{N_i}{A_i} = \sum_i \eta_i \equiv \eta, \qquad (9)

where F_i is the total force acting on a flux surface of area A_i due to the plasma pressure p_i, W_i is the work done by the pressure, W is the total work done per surface area (the energy density of the plasma within the flux surface), and η is the particle surface density of the plasma. These are first-order nonlinear differential equations for the thermal and particle transport, coupled through the shear term of (3). In the steady-state limit \dot{W} = \dot{\eta} \cong 0, a simple decoupling technique can be applied, namely D_{ano} g_n \times (6) - \chi_{ano} g_p \times (7), resulting in:

g_n = \frac{\Gamma\,\chi_{ano}\,g_p}{q\,D_{ano} + D_{neo}\,\chi_{ano}\,g_p - \chi_{neo}\,D_{ano}\,g_p}. \qquad (10)

This can be substituted into (6) to decouple the two fields. From this point on, the analysis is based on the heat transport equation for the L–H transition. A similar discussion can be carried out for the particle transport equation, as [17] has shown that there exist both heating power and density thresholds for the L–H transition. Physically, (6) represents the change of the energy density with time; it depends on both the heat flux and the pressure gradient. This can be seen in Figure 1, where each panel illustrates a \dot{W} versus g_p diagram (or \dot{\eta} versus g_n for the particle field) for various values of q (or Γ).
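To make the fixed-point discussion concrete, the sketch below numerically locates the roots of Ẇ(g_p; q) = 0 for several flux values and classifies their stability from the sign change of Ẇ. It is a minimal illustration, not the paper's code: it assumes a one-field reduction in which ν′_E is taken proportional to the gradient itself, and all constants are arbitrary, in the same spirit as the paper's own arbitrary units.

```python
import numpy as np

# One-field reduction of eq. (6): W_dot(g; q) = q - [chi_neo + chi_ano/(1 + alpha*g^2)] * g,
# taking nu'_E ~ g for illustration (g_n slaved to g_p). All constants are arbitrary.
chi_neo, chi_ano, alpha = 0.1, 3.0, 1.0

def w_dot(g, q):
    return q - (chi_neo + chi_ano / (1.0 + alpha * g**2)) * g

def fixed_points(q, g_max=25.0, n=50001):
    """Roots of W_dot(g; q) = 0 with a stability label from the local sign change."""
    g = np.linspace(0.0, g_max, n)
    f = w_dot(g, q)
    out = []
    for i in np.nonzero(np.sign(f[:-1]) != np.sign(f[1:]))[0]:
        g0 = g[i] - f[i] * (g[i + 1] - g[i]) / (f[i + 1] - f[i])  # linear interpolation
        # W_dot > 0 below the root and < 0 above it means perturbations decay: stable.
        out.append((round(g0, 3), "stable" if f[i] > 0 else "unstable"))
    return out

# Mimicking the progression of panels (a)-(e) in figure 1: one root, then three, then one.
for q in (0.6, 1.3, 2.0):
    print(f"q = {q}: {fixed_points(q)}")
```

For q between the two critical fluxes the routine reports three fixed points (stable, unstable, stable), exactly the configuration of panel (c).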
Note that all the constants are chosen arbitrarily in this work, so the quantitative values have no physical meaning. With respect to the time evolution, different things can happen to a local plasma gradient point in this figure. Firstly, in the regions where \dot{W} > 0 (\dot{\eta} > 0), the pressure gradient (density gradient) increases with time (rightwards arrow), because the plasma energy (density) increases. Secondly, in the regions where \dot{W} < 0 (\dot{\eta} < 0), the pressure gradient (density gradient) decreases with time (leftwards arrow), because the plasma energy (density) decreases. Lastly, if the point lies at \dot{W} = 0 (\dot{\eta} = 0), implying that it is in equilibrium, the gradients do not change; such points are called fixed points. These three behaviours of the local plasma gradients make it possible to analyze the stability properties of the plasma.

Figure 1. Fixed points for each heat/particle flux value and their stabilities: a solid dot for stable fixed points, an open dot for unstable fixed points, and a semi-open dot for semi-stable fixed points.

Panel (a) of Figure 1 shows the curve at relatively low flux values. Apparently, only one stable fixed point can be observed, at a relatively low pressure gradient, because any deviation from it will bring the system back, as shown by the arrows. Panel (b) shows the case where the fluxes reach their first critical values (q_{1,crit} and Γ_{1,crit}): another, semi-stable fixed point appears. At higher fluxes, as shown in panel (c), the new fixed point splits into two fixed points, making a total of three fixed points: two stable and one unstable. Panel (d) shows the case where the fluxes are equal to the second critical values (q_{2,crit} and Γ_{2,crit}): the two fixed points on the left merge into a single semi-stable point. Panel (e) shows the case where the second critical fluxes are exceeded: the semi-stable point is destroyed, so only one stable point, at relatively high gradients, exists.

The dynamics of the local gradients can thus be described using a graphical interpretation, and the stability analysis method can lead to an understanding of the forward L–H transition and the backward H–L transition. In particular, the fluxes can be treated as independent variables, which can be changed. As they are increased or decreased, the qualitative behaviour of the plasma system, e.g., the stability of a fixed point, can be altered. The important assumption used throughout this paper is that the plasma relaxation time is sufficiently small. Figure 2 shows the fixed points and their stability in the flux-versus-gradient space for each flux value. Note that a closed circle represents stable fixed points and an open circle represents unstable fixed points. They form a traditional bifurcation diagram, similar to those shown in [11–13]. Evidently, q_{L→H} = q_{2,crit}, Γ_{L→H} = Γ_{2,crit}, q_{H→L} = q_{1,crit}, and Γ_{H→L} = Γ_{1,crit}.

Figure 2. Bifurcation diagram illustrating two stable branches and one unstable branch with L–H and H–L transitions.

Figure 3. Bifurcation diagram of the pressure field at different particle flux values.

Essentially, the diagram shows that the gradients depend non-monotonically on their respective fluxes. The two stable branches of the s-curve correspond to the low (L-branch) and high (H-branch) gradients. The third branch, containing only unstable fixed points, cannot physically be reached, because there the system is at an unstable equilibrium.
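A diagram of the kind shown in Figure 2 can be reproduced from the same toy closure by plotting the steady-state relation q(g) and reading the two critical fluxes off its turning points. Again, this is only an illustrative sketch with arbitrary constants; it presumes that χ_ano/χ_neo is large enough for turning points to exist (see the sketch in Section 3.2 below).

```python
import numpy as np

chi_neo, chi_ano, alpha = 0.1, 3.0, 1.0
g = np.linspace(1e-3, 25.0, 20000)
q = (chi_neo + chi_ano / (1.0 + alpha * g**2)) * g   # steady state of eq. (6)

dq = np.gradient(q, g)
turns = np.nonzero(np.sign(dq[:-1]) != np.sign(dq[1:]))[0]   # local max and min of q(g)
q_crit2, q_crit1 = q[turns].max(), q[turns].min()
print(f"q_L->H = q_2,crit ~ {q_crit2:.3f};  q_H->L = q_1,crit ~ {q_crit1:.3f}")
print("hysteresis: the back transition occurs at the lower of the two fluxes")
```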
The onset of the L–H transition is where the gradient jumps from a relatively low value to a high value when the flux is increased slightly above the L–H threshold. The figure also shows that an H–L back transition occurs when the flux is reduced below the H–L threshold, with the gradient dropping from a relatively high value to a low value. The flux value of the H–L transition is lower than that of the L–H transition, implying hysteresis behaviour. Although each of the pressure and density fields can have a bifurcation diagram with threshold fluxes for the L–H transition and the hysteresis loop, they interact with each other. This is illustrated in Figure 3, which shows three bifurcation curves in q versus g_p space at different particle flux values (Γ_1 < Γ_2 < Γ_3). As the particle flux is increased, the suppression strength due to the flow shear is also increased. It is therefore physically relevant that the requirement q_{L→H} for the transition becomes less stringent.

Figure 4. Plasma density (top) and pressure (bottom) profiles as a function of the normalized minor radius at times 200 ms apart.

3. Numerical results and discussions

In this section, the two transport equations (1) and (2) are solved simultaneously using a discretization method for the partial differential equations. The heat and particle sources are localized at the centre and at the edge of the plasma, respectively, and they are assumed to be constant in time. The numerical results yield the time evolution of the plasma profiles, i.e., the pressure, the density, and their gradients. The neoclassical transport coefficients are simply set to be constant, while the anomalous transport coefficients follow a critical gradient transport model similar to that described in [18]:

\chi_{ano} = C_\chi\,(p' - p_c')\,\theta(p' - p_c'), \qquad (11)

D_{ano} = C_D\,(n' - n_c')\,\theta(n' - n_c'), \qquad (12)

where C_χ and C_D are constants, p′_c and n′_c are the critical gradients for the pressure and density fields, respectively, and θ represents the Heaviside step function.

3.1. Pedestal dynamics

Figure 5. Pressure (top) and density (bottom) pedestal widths as a function of time for a constant source (horizontal line) scenario.

Figure 6. Pressure (top) and density (bottom) pedestal widths as a function of time for a heat-ramping scenario.

This section illustrates the pedestal growth in the plasma. The crucial assumption to be noted here is that the pedestal is allowed to grow without any constraint, e.g., MHD instability, as the aim of this paper is to show the intrinsic properties of the tokamak plasma system. Hence, these results are presumed to picture what would happen to the plasma and to its pedestal if the loss mechanisms, e.g., ELMs, could be controlled. Firstly, the two criteria (the minimum flux and the minimum diffusivity ratio) for the possibility of an L–H transition according to the bifurcation model are satisfied [19], so the plasma is ensured to reach the H-mode in a steady state. Figure 4 demonstrates the time evolution of the plasma density and pressure profiles at times approximately 200 ms apart. It shows that the plasma profiles make bigger increases early on; the change slows down as the plasma reaches a steady state. It can also be seen that the central pressure is almost doubled from L-mode to H-mode, whereas the central density is increased by around 50 %. One other thing to note here is that the density profiles tend to be flatter in the plasma core.
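The following sketch shows one way the coupled system (1)–(2) with the closures (3), (11) and (12) can be integrated. It is a minimal explicit finite-difference scheme with arbitrary constants and simplified boundary conditions, not the authors' actual solver; the time step is kept small because the explicit scheme is only conditionally stable. It also records the barrier width history used in the growth discussion of this subsection.

```python
import numpy as np

nx = 201
x = np.linspace(0.0, 1.0, nx)
dx = x[1] - x[0]
dt, nsteps = 1.0e-6, 200000

chi_neo = d_neo = 0.1
c_chi = c_d = 2.0              # C_chi, C_D of eqs. (11)-(12)
g_pc = g_nc = 0.5              # critical gradients
alpha = 30.0                   # suppression strength

H = np.exp(-(x / 0.15) ** 2)           # heating localized at the core
S = np.exp(-((1.0 - x) / 0.15) ** 2)   # fuelling localized at the edge

p = np.full(nx, 1e-3)
n = np.full(nx, 1e-2)
times, widths = [], []

for it in range(nsteps):
    gp = -np.gradient(p, dx)
    gn = -np.gradient(n, dx)
    shear = gp * gn / np.maximum(n, 1e-6) ** 2      # nu'_E ~ p'n'/n^2, eq. (3)
    supp = 1.0 / (1.0 + alpha * shear ** 2)
    chi = chi_neo + c_chi * np.maximum(gp - g_pc, 0.0) * supp   # eq. (11)
    dif = d_neo + c_d * np.maximum(gn - g_nc, 0.0) * supp       # eq. (12)
    p += dt * (2.0 / 3.0) * (H - np.gradient(chi * gp, dx))     # eq. (1)
    n += dt * (S - np.gradient(dif * gn, dx))                   # eq. (2)
    p[0], n[0] = p[1], n[1]      # symmetry at the core
    p[-1] = n[-1] = 0.0          # fixed edge values
    if it % 2000 == 0:           # contiguous edge region with quenched turbulence
        w, k = 0.0, nx - 2
        while k >= 0 and supp[k] < 0.5:
            w += dx
            k -= 1
        times.append((it + 1) * dt)
        widths.append(w)

t, w = np.array(times), np.array(widths)
early = (w > 0) & (w < 0.8 * w.max())
if early.sum() > 4:              # Delta_ped ~ t^b on the early expansion phase
    b = np.polyfit(np.log(t[early]), np.log(w[early]), 1)[0]
    print(f"early growth exponent b ~ {b:.2f}")
print(f"final pedestal width ~ {w[-1]:.3f} (normalized radius)")
```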
This makes sense, because the density flux is generated from the plasma edge, while the thermal flux comes from the plasma core. It appears that when the gradient-limiting instability is neglected, the pedestal is intrinsically able to expand inward. This growth of the pedestal is shown in Figure 5, which illustrates the width of the pedestal as a function of time for both the pressure channel and the density channel. The heat and particle sources are assumed to be constant in time. Evidently, the pedestal is formed first in the density channel. The pedestal grows rapidly at first, and then it slows down and eventually reaches its steady state. The pedestal growth initially appears to be strongly superdiffusive (∆_ped ∝ t^b, b > 0.5), corresponding to the turbulent nature of the plasma, because in this phase the suppression effect is still low and turbulent transport therefore plays a dominant role. Later, a wider region of the plasma is suppressed, so only neoclassical transport takes effect in the pedestal region, resulting in slower pedestal growth (subdiffusion or even slower). At some later time, a pedestal is also formed for the pressure. Two interesting points are worth mentioning here. Firstly, although the pedestals of the two channels do not form at the same time, they have the same width; this is likely explained by the symmetry between the two transport equations. Secondly, the time it takes the plasma to evolve during H-mode, i.e., the pedestal expansion time, is around one order of magnitude longer than the time it takes for the plasma to evolve from the L-mode to the H-mode. This characteristic of the model remains open to question because, in a real tokamak plasma, instabilities at the edge cannot yet be controlled fully and efficiently; moreover, in order to observe this behaviour, it is necessary to make sure that the sole mechanism for plasma loss is via transport.

Figure 6 shows a different scenario, in which the heat source is no longer constant in time. The plasma heating (blue line) ramps up to a constant value, as in the previous scenario, while the particle source (red line) is kept constant at all times. Again, the pedestal is formed first in the density channel, but the pedestal grows more slowly. It takes a longer time for the plasma to reach a steady state, because the heat is being ramped up. Eventually, the pedestal widths become the same as in the previous scenario in a steady state. This makes sense, because in the end the heat and particle fluxes given to the plasma are the same.

3.2. Pedestal width

This section focuses on an analysis of the pedestal width in a steady state. The relationships between the pedestal widths and various plasma parameters are shown in Figures 7–12. In these figures, the square bullets represent the pedestal width, which is the same for both the pressure and density channels. The triangular bullets represent the central plasma pressure normalized to its value at the onset of the L–H transition. The cross bullets represent the central plasma density normalized to its value at the onset of the L–H transition.

Figure 7. Pedestal width and central pressure and density in a steady state as a function of the heat source.

Figure 8. Pedestal width and central pressure and density in a steady state as a function of the particle source.

Figure 7 shows the influence of the heat source on the plasma. It shows that there exists an L–H transition threshold for heating: below the threshold, there is no formation of a transport barrier. As the heating is increased above the threshold, the pedestal width widens, but at a slower rate.
Apparently, the change in the heat source has a greater effect on the plasma pressure than on the plasma density. Numerically, for this particular case, the central pressure is increased to 3.76 times the lowest value, and the central density is increased to 1.18 times the lowest value, as the heat source is increased to 10 times the lowest value.

Figure 8 illustrates the influence of the particle source on the plasma. Similarly, it shows that there is also an L–H transition threshold for the particle flux: below the threshold, there is no formation of a transport barrier. This finding and the heat source results are qualitatively in agreement with the stability analysis of Section 2. In contrast with the previous case, as the particle source is increased above the threshold, the pedestal width is reduced. This can be explained by the suppression form of (3) used in the simulations: as the particle source is increased, the plasma density rises, resulting in a lower suppression value. Consequently, the plasma performance is reduced, while the pedestal width and also the pressure profile are decreased. Evidently, the change in the particle source has a greater effect on the plasma density than on the plasma pressure. Numerically, for this particular case, the central density is increased to 3.31 times the lowest value, and the central pressure is reduced to 0.69 times the lowest value, as the particle source is increased to 10 times the lowest value.

Figure 9. Pedestal width and central pressure and density in a steady state as a function of thermal anomalous transport.

Figure 10. Pedestal width and central pressure and density in a steady state as a function of particle anomalous transport.

The effects of thermal anomalous transport are considered in Figure 9. This study is carried out as a variation of the proportionality constant C_χ from (11), which controls the strength of the thermal anomalous transport coefficient. Previous analysis in [12, 19] concluded that the L–H transition is possible only if the ratio of the anomalous transport to the neoclassical transport exceeds a critical value; generally, this value is of the order of 1 to 2. Physically, this condition always holds in a real plasma, because the anomalous transport is normally about 10 times higher in the ion channel and can even reach 100 times higher in the electron channel [20]. Figure 9 confirms the existence of this critical value for the realization of an L–H transition: if the anomalous transport is too low, there is no formation of a transport barrier. Furthermore, as the strength of the anomalous transport is increased, the pedestal width narrows, and the central plasma pressure and the central plasma density are reduced. This makes sense: the plasma loss through transport is enhanced, so the plasma performance should be reduced. The reductions of the profiles appear to be stronger in the plasma pressure than in the plasma density. Numerically, for this particular case, the central pressure is reduced to 0.71 times the lowest value, and the central density is reduced to 0.93 times the lowest value, as the proportionality constant C_χ is increased to 10 times the lowest value.

Figure 11. Pedestal width and central pressure and density in a steady state as a function of thermal neoclassical transport.

Figure 12. Pedestal width and central pressure and density in a steady state as a function of particle neoclassical transport.
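The critical diffusivity ratio can be demonstrated explicitly in the constant-χ_ano toy closure used in the earlier sketches, where the steady flux is q(g) = (χ_neo + χ_ano/(1 + αg²))g; a bisection on χ_ano/χ_neo then finds the smallest ratio for which q(g) is non-monotonic. Note that this simplified functional form gives a critical ratio of 8, independent of α, whereas the paper's critical-gradient model [12, 19] yields the smaller values of order 1 to 2 quoted above; the sketch only illustrates that such a threshold exists.

```python
import numpy as np

def bistable(ratio, alpha=1.0, g_max=50.0, n=200001):
    """True if q(g) = chi_neo * (1 + ratio/(1 + alpha g^2)) * g has turning points."""
    g = np.linspace(0.0, g_max, n)
    q = (1.0 + ratio / (1.0 + alpha * g**2)) * g   # chi_neo factored out
    return np.any(np.diff(q) < 0.0)

lo, hi = 1.0, 100.0      # bistable(lo) is False, bistable(hi) is True
for _ in range(50):      # bisection on the diffusivity ratio
    mid = 0.5 * (lo + hi)
    lo, hi = (lo, mid) if bistable(mid) else (mid, hi)
print(f"critical chi_ano/chi_neo ~ {hi:.2f}")   # analytic value for this form: 8
```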
The effects of particle anomalous transport are shown in Figure 10. This study is carried out as a variation of the proportionality constant C_D from (12), which controls the strength of the particle anomalous transport coefficient. This figure also confirms the existence of a critical value for the possibility of an L–H transition: if the anomalous transport is too low, there is no formation of a transport barrier. The results are similar to those presented in Figure 9, in that as the strength of the anomalous transport is increased, the pedestal width narrows and the central plasma pressure and density are reduced. However, the reductions of the profiles appear to be stronger in the plasma density than in the plasma pressure. Numerically, for this particular case, the central pressure is reduced to 0.96 times the lowest value, and the central density is reduced to 0.78 times the lowest value, as the proportionality constant C_D is increased to 10 times the lowest value.

The effects of thermal neoclassical transport are shown in Figure 11, where the transport coefficient is varied. The critical ratio for the possibility of obtaining an L–H transition is also apparent here, because if the neoclassical transport is increased too greatly, the H-mode cannot be reached. Also, as the strength of the thermal neoclassical transport is increased, the pedestal width narrows and the central plasma pressure and density are reduced; the reductions of the profiles appear to be stronger in the plasma pressure. Numerically, for this particular case, the central pressure is reduced to 0.23 times the lowest value, and the central density is reduced to 0.98 times the lowest value, as the thermal neoclassical transport coefficient is increased to 10 times the lowest value.

Figure 12 illustrates the effects of particle neoclassical transport on the pedestal width and on the central plasma values. The critical ratio for the possibility of obtaining an L–H transition is also evident here. Moreover, as the strength of the particle neoclassical transport is increased, the pedestal width is enlarged, the central pressure is increased and the central density is reduced. These results seem counterintuitive when compared with the results presented in Figure 11. The explanation is that when the strength of the particle neoclassical transport is increased, the plasma particle loss is enhanced; subsequently, the plasma density is reduced, which increases the flow shear suppression, resulting in an increase in the pressure profiles and also in the pedestal width. The changes in the profiles appear to be stronger in the plasma density. Numerically, for this particular case, the central pressure is increased to 1.32 times the lowest value, but the central density is reduced to 0.72 times the lowest value, as the particle neoclassical transport coefficient is increased to 10 times the lowest value.

4. Conclusions

A numerical method has been used to simultaneously solve the two-field (heat and particle) transport equations. The transport considered here is a combination of the neoclassical transport, which is assumed to be constant, and the anomalous transport, which follows the critical gradient transport model.
The suppression mechanism is the flow shear, calculated from the shear of the radial electric field equation. An analytical study based on bifurcation and the stability of fixed points shows that an abrupt increase in the local gradients occurs at the onset of an L–H transition. This transition is also found to depend on the direction of the heat ramping: a backward H–L transition can occur at lower fluxes than the forward L–H transition, implying hysteresis phenomena. Numerically, it is found that without a gradient-limiting instability, the pedestal width can initially expand superdiffusively and later subdiffusively. The time that the plasma takes for the pedestal expansion is about one order of magnitude longer than the time it takes to transit from L-mode to H-mode. The pedestal tends to form first in the density channel, but in a steady state both pedestals have the same width. Furthermore, the pedestal width in a steady state and the central plasma pressure appear to be proportional to the heat source and the particle neoclassical transport, and inversely proportional to the particle source, the thermal and particle anomalous transports and the thermal neoclassical transport. The central plasma density appears to be proportional to the heat and particle sources, and inversely proportional to all the plasma transports.

Acknowledgements

This work was supported by the Commission on Higher Education (CHE) and the Thailand Research Fund (TRF) under contract No. RSA5580041. The authors greatly thank Y. Sarazin and IRFM, CEA. B. Chatthong thanks the Royal Thai Scholarship and the Thailand Institute of Nuclear Technology (TINT) for financial support and acknowledges the discussion at the 3rd APTWG, Korea 2013.

References

[1] K. H. Burrell. Summary of experimental progress and suggestions for future work (H mode confinement). Plasma Phys. Control. Fusion 36:A291, 1994. doi:10.1088/0741-3335/36/7a/043
[2] R. Aymar, P. Barabaschi, Y. Shimomura. The ITER design. Plasma Phys. Control. Fusion 44:519, 2002. doi:10.1088/0741-3335/44/5/304
[3] J. W. Connor, H. R. Wilson. A review of theories of the L–H transition. Plasma Phys. Control. Fusion 42:R1, 2000. doi:10.1088/0741-3335/42/1/201
[4] F. Wagner et al. Regime of improved confinement and high beta in neutral-beam-heated divertor discharges of the ASDEX tokamak. Phys. Rev. Lett. 49:1408, 1982. doi:10.1103/physrevlett.49.1408
[5] K. H. Burrell. Effects of E × B velocity shear and magnetic shear on turbulence and transport in magnetic confinement devices. Phys. Plasmas 4:1499–1518, 1997. doi:10.1063/1.872367
[6] H. Biglari, P. H. Diamond, P. W. Terry. Influence of sheared poloidal rotation on edge turbulence. Physics of Fluids B: Plasma Physics 2:1–4, 1990. doi:10.1063/1.859529
[7] J. W. Connor et al. A review of internal transport barrier physics for steady-state operation of tokamaks. Nucl. Fusion 44:R1, 2004. doi:10.1088/0029-5515/44/4/r01
[8] F. L. Hinton. Thermal confinement bifurcation and the L- to H-mode transition in tokamaks. Physics of Fluids B: Plasma Physics 3:696–704, 1991. doi:10.1063/1.859866
[9] F. L. Hinton, G. M. Staebler. Particle and energy confinement bifurcation in tokamaks. Physics of Fluids B: Plasma Physics 5:1281–1288, 1993. doi:10.1063/1.860919
[10] G. M. Staebler, F. L. Hinton, J. C. Wiley. Designing a VH-mode core/L-mode edge discharge. Plasma Phys. Control. Fusion 38:1461, 1996. doi:10.1088/0741-3335/38/8/055
[11] V. B. Lebedev, P. H. Diamond.
Theory of the spatiotemporal dynamics of transport bifurcations. Phys. Plasmas 4:1087–1096, 1997. doi:10.1063/1.872196
[12] M. A. Malkov, P. H. Diamond. Analytic theory of L → H transition, barrier structure, and hysteresis for a simple model of coupled particle and heat fluxes. Phys. Plasmas 15:122301, 2008. doi:10.1063/1.3028305
[13] Hogun Jhang, S. S. Kim, P. H. Diamond. Role of external torque in the formation of ion thermal internal transport barriers. Phys. Plasmas 19:042302, 2012. doi:10.1063/1.3701560
[14] W. Weymiens et al. Bifurcation theory for the L–H transition in magnetically confined fusion plasmas. Phys. Plasmas 19:072309, 2012. doi:10.1063/1.4739227
[15] B. Chatthong, T. Onjun. Investigation of toroidal flow effects on L–H transition in tokamak plasma based on bifurcation model. J. Phys.: Conf. Ser. 611:012003, 2015. doi:10.1088/1742-6596/611/1/012003
[16] F. Wagner. A quarter-century of H-mode studies. Plasma Phys. Control. Fusion 49:B1, 2007. doi:10.1088/0741-3335/49/12b/s01
[17] ASDEX Team. The H-mode of ASDEX. Nucl. Fusion 29:1959, 1989. doi:10.1088/0029-5515/29/11/010
[18] X. Garbet et al. Profile stiffness and global confinement. Plasma Phys. Control. Fusion 46:1351, 2004. doi:10.1088/0741-3335/46/9/002
[19] B. Chatthong et al. Analytical and numerical modelling of transport barrier formation using bifurcation concept. 38th EPS Conference on Plasma Physics (Strasbourg, France), paper P4.097. http://ocs.ciemat.es/eps2011pap/pdf/p4.097
[20] R. C. Wolf. Internal transport barriers in tokamak plasmas. Plasma Phys. Control. Fusion 45:R1, 2003. doi:10.1088/0741-3335/45/1/201

Acta Polytechnica Vol. 42 No. 2/2002

OPTIMISATION AND JUST-IN-TIME

V. Beran

Arranging production activities to fit in with other construction activities is one of the basic ideas of the just-in-time approach. In the construction industry it has never been very fully applied. This is a mistake [1]. Construction works, particularly expensive parts of them, are a field where the approach can and should be applied.

Keywords: just-in-time approach, production speeds, production volumes, optimisation savings, dependent capacity expansion, risk of extra costs.

1 Introduction

Organising construction activities according to the latest possible internal time schedule, taking into account the organisational and technological process, has been called the just-in-time method (JIT). This method has never been consistently applied in the construction industry, though this paper will argue that it should be. Finishing processes in construction production are an appropriate place for practising JIT methods. Two significant pillars of civil engineering, tradition and experience, have built a whole series of overt and covert myths into the civil engineering profession. Many of these relate to work organization. Some are useful and perpetuate the tradition and ethics of the profession, but many are outdated and no longer apply in the high-speed production conditions of modern construction.
This refers to the following principles:

1. To build quickly (under any circumstances) means to build economically.
2. A contract manager fulfilling the earliest possible deadline is a good contract manager.
3. To build continuously means to build economically.
4. Cumbersome technologies are disadvantageous.
5. Payment for work in progress is a good principle.

An important characteristic of a good contract manager has always been the ability to meet and fulfil deadlines. In other words, the agreed works should be completed before the deadline stated in the contract. From the modern construction point of view, early completion may be neither necessary nor economically useful. However, the idea that what is completed can be counted on has extraordinary strength in some areas of the construction industry. The efforts of many contract managers to create a time reserve and to lower the risk of breaching the construction deadline go so far as to perform a series of works earlier than is technologically and organizationally necessary. This tendency can be observed in numerous projects. A textbook example is the cooling towers of the Temelín nuclear power plant. In an effort to engage in expensive construction work (and to create time reserves for the future), the dominant construction feature during decades of construction was the cooling towers. Cooling towers are, however, technologically simple structures that should have been provided much later in the project. Nevertheless, they were the first technological structure to be erected on the site.

Fig. 1: Input project situation, organizational time layout structure

In the technologies and organizational processes for industrial buildings, railways and road reconstruction, public utilities and housing developments there are assembly procedures that are very appropriate for the given purpose. However, the cooling towers syndrome is also found here. The organisational process in the construction industry seems to favour extensive early completion of parts of a project. The application of JIT-type procedures would certainly be economically more suitable. Why does JIT enjoy so little popularity in the construction industry?
Let us put forward some of the main reasons, in no particular order:

a) low production speeds and large production volumes,
b) the need to create large work forces (numbers of workers and quantities of machinery on a building site) to complete a project (assembly, surfaces, greenery, pavements, etc.),
c) the preference for modest production technologies with low-cost material inputs,
d) the reluctance of workers to adapt working practices and work hours to current construction needs,
e) the reluctance of management to organise and pay for a special work regime in the final phases of construction (accommodation, shift work, transportation),
f) the low motivation of workers to re-train for new working practices and methods.

2 Effect of JIT – what was the task like?

Let us for a moment leave aside considerations, theories and detailed analysis, and direct our attention to a substantial economic problem: how can the possible effects of JIT be applied in the construction industry? Let us assume that we are implementing and completing a simplified, extensive reconstruction of an administration building. The critical production activities can be divided into sections A, B, C (see Fig. 1). In addition to these activities, there are some production activities that may be freely movable (e.g., some finishing works such as surfaces/floors (pavements, puddles, underlying insulating layers), soundproofing, vapour- and water-proofing, etc.). The possible schedules for these activities are presented graphically in the scheme in Fig. 2 (see the possible schedule for the segments inside the technological sections). Let us now see (and evaluate) the overall effect of construction execution by calculating the cost of the necessary operation credits for commonly carried out works as they were proposed.

Fig. 2: Completed data, deadlines and costs for a reconstruction project

Fig. 3: Comparison of total and minimum costs (limited project schedule); monthly costs in thousand EUR over a duration of 25 months

The time axis is given for the calculation in months. For the sake of simplicity, we assume that there will not be any delays in payments (invoice payments, salary payments, payments to contractors and subcontractors). Fig. 3 provides fuller information about fixed and non-fixed construction capacities, drawn as two separate lines. Changes in technology and in the organisation of the time schedule may be financially productive. However, the basic scheme is not very flexible in terms of time. We will investigate the advantages to be gained through optimisation under various conditions. Relocating the non-fixed activity segments may provide a first orientation. A simple shift of production activities to the latest time limit is illustrated in Fig. 4: the production activities are carried out at the latest possible time.
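The financing effect of such a shift can be sketched numerically. The fixed-section rows below are taken from the Fig. 2 data (they sum to the 1120 of Table 1); the placement of the movable 480-unit block and the credit model (interest accruing on each payment from its month until project completion) are simplifying assumptions, so the totals only approximate the paper's figures. The point is that moving the movable segments later reduces the financed total.

```python
import numpy as np

months = 25
fixed = np.zeros(months)                                   # critical sections A, B, C
fixed[0:10]  = [40, 40, 50, 60, 30, 20, 20, 10, 10, 10]    # section A
fixed[10:15] = [60, 50, 70, 80, 50]                        # section B
fixed[15:25] = [20, 30, 40, 50, 60, 70, 80, 80, 70, 20]    # section C

movable_total = 480.0            # movable finishing segments (1600 - 1120, cf. Table 1)
early = np.zeros(months); early[1:9]  = movable_total / 8  # earliest placement
late  = np.zeros(months); late[17:25] = movable_total / 8  # latest placement (JIT)

def financed_total(outlays, annual_rate=0.10):
    """Each month's payment is financed by credit until the end of the project."""
    r = annual_rate / 12.0
    return sum(c * (1.0 + r) ** (months - t - 1) for t, c in enumerate(outlays))

print(f"earliest schedule: {financed_total(fixed + early):7.1f}")
print(f"latest schedule:   {financed_total(fixed + late):7.1f}")
```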
The final and most difficult opportunity for reducing the cost of the whole process of carrying out the construction work is through optimisation. Optimisation brings major savings (see Table 1). In the given case, the deadlines and financial payments take into account some restricting conditions; one of these is the production speed of the individual production activities, determined by technological considerations. The aim was to minimise the costs, including interest payments, for completing the construction. The optimum solution is in many ways surprising; see Fig. 5 and compare the results with the calculation tables in Fig. 2 (earliest possible execution) and Fig. 4 (latest possible execution). The total difference between the starting situation (Fig. 2) and the solution with an empirical shift to the latest possible deadlines is relatively low (1.89 %). Sophisticated changes, using optimisation, lead to a radical drop in the total costs (6.17 %). Even if not all of these gains can be achieved in practice, there is a range of possible managerial manipulation that could, if skilfully exploited, produce cost savings.

Note: each optimisation can lead to further possible improvements of the solution, and can show which production sources (limits) incorporated into the restricting conditions will limit further improvements. A more detailed analysis can show under what conditions the production sources (production speeds) can be increased in such a way that a further improvement of the solution can be achieved.

Fig. 4: Deadlines and costs for latest possible execution

Table 1: Comparison of solutions

Actions, total (in mil. Kč) | Earliest possible deadlines | Latest possible deadlines | Optimisation of deadlines and speeds
Total costs (without credits) | 1600 | 1600 | 1600
Total critical activities (without credits) | 1120 | 1120 | 1120
Including bank credits (rate 10 %) | 1849 | 1813 | 1735
Average production speed | 28.66 | 28.66 | 32.45

Let us compare the results in Fig. 5 with the non-optimised solution in Fig. 2, and let us look at Table 1, where columns 2 to 4 give the main parameters of the task in Figs. 2, 4 and 5:

a) The duration times are changed for all production activities.
b) The production speeds are changed in the course of execution for all production activities.
c) The total duration of the construction work is changed (or new time reserves are created to cover risks connected with the production speed).
d) The main production processes are speeded up; the duration of the project is shorter.
e) The average production speed of the construction work increases from 28.6 to 32.4 thousand € monthly, i.e., it increases to 113.2 % of the original production speed.
f) The total costs decrease to 93.83 % in comparison with the initial solution (see Fig. 5 and the proposed technological structure).

Other scenarios could also be presented.
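For the record, the relative savings quoted above follow directly from the financed totals in Table 1; a few lines of arithmetic reproduce them (small deviations from the rounded table values are to be expected):

```python
earliest, latest, optimised = 1849.0, 1813.0, 1735.0   # totals incl. credits, Table 1

print(f"shift to latest deadlines: {(1 - latest / earliest) * 100:.2f} % saved")
print(f"optimisation:              {(1 - optimised / earliest) * 100:.2f} % saved")
print(f"optimised cost relative to initial: {optimised / earliest * 100:.2f} %")
```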
The main outcome of the whole task is an increase in production speeds and a reduction of time margins (floats). The overall effect is, in essence, a change in the organisation of project completion. Most construction work is financed by credits. The credit contract is governed by the provisions of the Commercial Code. This contract states that the creditor will provide funds up to the agreed amount for the benefit of the borrower, and it requires the borrower to return the provided funds and to pay interest. The major characteristic features of a credit are:

• the creditor's obligation to provide the borrower with funds on request,
• the borrower's obligation to return the provided funds,
• the borrower's obligation to pay interest.

The outlined example shows a method for completing a floor (arrangements of surfaces and connecting parts of structures), i.e., a technology that can substantially influence the final effect of the construction. A 6 % reduction in the cost of a component, and its effect on the profitability of the project, is a considerable argument and motivation for technological and organizational changes in the direction of JIT.

3 A generalization of time dependent capacity expansion

The problem of optimal capacity expansion of construction work as a time dependent problem has been studied in recent years in many different application contexts. Traditional capacity planning usually begins with a forecast of demand on the basis of organizational or technological needs. Planning and scheduling has for many years been the dominant approach in Central European management methodology; new approaches adopting a more productive methodology seem to be needed. Modern management of time dependent capacity expansion enables applications in production planning, strategic planning, inventory control, and network design. Applications in telecommunications have been published by Laguna [4].

The time dependent capacity problem consists of finding the combination of activities j (j = 1, 2, …, n), with price p_j and capacity c_j, that should be employed in each time period t (t = 1, …, T). The limitations are given by the total demand d_t, at a minimum discounted cost. The problem then becomes

\min \sum_{j=1}^{n}\sum_{t=1}^{T} \beta^{t} p_j x_{jt} \qquad (1)

subject to

\sum_{j=1}^{n}\sum_{\tau=1}^{t} c_j x_{j\tau} \ge d_t \quad \text{for all } t, \qquad (2)

x_{jt} \ge 0, \qquad (3)

where x_{jt} is the production speed for all j and t, and β is the discount factor (0 < β < 1) for x_{jt}, for activities j in time t. The structure of the capacity may be very variable; Table 2 shows a general example of this interpretation.

Fig. 5: Optimised deadlines and total cost of a construction project
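Problem (1)–(3), with continuous x_jt, is an ordinary linear program and can be sketched with off-the-shelf tools. The data below are invented for illustration; note how the discount factor β < 1 makes later installation cheaper, so the optimizer defers capacity as late as the demand constraints allow, which is precisely the just-in-time effect.

```python
import numpy as np
from scipy.optimize import linprog

T, n = 6, 2
p = np.array([3.0, 5.0])                          # prices p_j (assumed)
c = np.array([1.0, 2.0])                          # capacities c_j (assumed)
d = np.array([2.0, 4.0, 7.0, 9.0, 12.0, 14.0])    # cumulative demand d_t (assumed)
beta = 0.9

# decision vector: x[j, t] flattened row-wise; objective of eq. (1)
cost = np.array([beta ** (t + 1) * p[j] for j in range(n) for t in range(T)])

A_ub, b_ub = [], []
for t in range(T):                                # -(cumulative capacity) <= -d_t, eq. (2)
    row = np.zeros(n * T)
    for j in range(n):
        row[j * T : j * T + t + 1] = -c[j]
    A_ub.append(row)
    b_ub.append(-d[t])

res = linprog(cost, A_ub=np.array(A_ub), b_ub=b_ub, bounds=(0, None), method="highs")
print(res.x.reshape(n, T).round(2))
print(f"minimum discounted cost: {res.fun:.2f}")
```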
Table 2: General scheme of a production structure

           | t = 1 | t = 2 | t = 3 | … | t = T
Activity 1 | x_11  | x_12  | x_13  | … | x_1T
Activity 2 | x_21  | x_22  | x_23  | … | x_2T
…          | …     | …     | …     | … | …
Activity n | x_n1  | x_n2  | x_n3  | … | x_nT
Demand     | d_1   | d_2   | d_3   | … | d_T

Demand d_t may be structured not only with respect to t as a particular time period, but also into demand blocks related to different activities j, and even into blocks of technologically or organizationally related activities. If the matrix of variables in time t, where t = (1, 2, 3, …, T), is indexed for particular scenarios s, where s = (1, 2, …, S), say as z_{ts}, the problem becomes

\min_{x} f = \sum_{j=1}^{n}\sum_{t=1}^{T} \beta^{t} p_j x_{jt} + w \sum_{t=1}^{T}\sum_{s=1}^{S} \rho_s\,\varphi(z_{ts}) \qquad (4)

subject to

\sum_{j=1}^{n}\sum_{\tau=1}^{t} c_j x_{j\tau} + z_{ts} \ge d_{ts} \quad \forall\, t, s, \qquad (5)

x_{jt} \in \{0, 1, 2, …\} \quad \forall\, j, t, \qquad (6)

z_{ts} \ge 0 \quad \forall\, t, s, \qquad (7)

where w is a weighting factor, ρ_s is the probability of scenario s, and φ is a function of the negative demand consequences related to the unmet demand z_{ts}. The demand d_{ts} is thus presented with an uncertainty component z_{ts}; see eq. (5). This represents an imaginary demand associated with the risk of a shortage of capacity, with a probability ρ_s related to scenario s at each period t. The function φ may take many forms; it usually reflects the risk attitude of the decision maker. The risk may be associated with the probability of a shortage of capacity, the risk of extra costs, or the risk of a lack of quality if the production speed exceeds certain limits. Further applications and interpretations are possible.

4 Conclusion

The implementation of a technical project carried out in conditions of high production speeds and low time reserves requires changes in the technologies, organization and preparation of construction. In each specific case, a civil engineer needs to know the economic impacts (and must be capable of the applicable calculations). The next important factor in the preparation and choice of management and organisation is the ability to calculate the risks inherent in the chosen technology [1], [2], [3]. It is obvious from the given illustrative example, which has the same features as the execution of a series of construction projects in recent years, that the myth of the importance of executing works in large volumes ahead of the deadlines has significant financial consequences. The interest rate applied here (10 %) is very low for current Czech business conditions, but may correspond to the conditions of a forthcoming recession in the current EU countries. It is very probable that wherever construction work has been carried out at a loss or at a low profit, bad time and volume scheduling will have played a significant role in the bad economic results.

Acknowledgment

This paper originated as part of a CTU in Prague, Faculty of Civil Engineering research project on Management of Sustainable Development of the Life Cycle of Buildings, Building Enterprises and Territories (MSM: 210000006), financed by the Ministry of Education, Youth and Sports of ČR. The calculations draw upon the methodological documents created within the project on Harmonisation of Engineering Activities with EU (proposal methods, implementation and life cycle of construction works according to EN ISO), funded by the Grant Agency of the Czech Republic (GAČR), Prague 1, Národní Street 3.

References

[1] Beran, V.: Základy teorie rozhodování. [Foundations of Decision Theory]. Praha: Vydavatelství ČVUT v Praze, 1985.
[2] Beran, V., Macek, D.: Programové vybavení Balance Sensitivity. [Software]. Praha, 1998.
[3] Beran, V., Macek, D.: Programové vybavení Fault Cell. [Software]. Praha, 2000.
[4] Laguna, M.: Applying robust optimization to capacity expansion of one location in telecommunications with demand uncertainty.
In Management Science, Vol. 44, No. 11, 1998.
[5] Mulvey, J. M., Vanderbei, R. J., Zenios, S. A.: Robust optimisation of large scale systems. Oper. Res. 43, 1995.

Doc. Ing. Václav Beran, DrSc.
e-mail: beran@fsv.cvut.cz
Department of Economics and Construction Management
Czech Technical University in Prague
Faculty of Civil Engineering
Thákurova 7, 166 29 Praha 6, Czech Republic

Acta Polytechnica
https://doi.org/10.14311/AP.2021.61.0448
Acta Polytechnica 61(3):448–455, 2021
© 2021 The Author(s). Licensed under a CC-BY 4.0 licence, published by the Czech Technical University in Prague.

IMPROVING THE EFFICIENCY OF A STEAM POWER PLANT CYCLE BY INTEGRATING A ROTARY INDIRECT DRYER

Michel Sabatini*, Jan Havlík, Tomáš Dlouhý

Czech Technical University in Prague, Faculty of Mechanical Engineering, Department of Energy Engineering, Technická 4, 166 07 Prague 6, Czech Republic
* Corresponding author: michel.sabatini@fs.cvut.cz

Abstract. This article deals with the integration of a rotary indirect dryer, heated by low-pressure extraction steam, into the Rankine cycle. The article evaluates the power generation efficiency of a steam power plant with an integrated indirect dryer, which combusts waste biomass with a high moisture content, and compares it to the same plant without the dryer. The benefits of the dryer's integration are analysed for various moisture contents of the biomass before and after the drying. The evaluation of the power generation efficiency is based on parameters determined from experiments carried out on the steam-heated rotary indirect dryer, such as the specific energy consumption and the evaporation capacity. The dryer's integration improves the efficiency of the cycle in comparison with a cycle without a dryer, where moist biomass is combusted directly. This improvement increases along with the difference between the moisture content before and after the drying. For the reference state, a fuel with a moisture content of 50 % was dried to 20 % and the efficiency rose by 4.38 %. When a fuel with a moisture content of 60 % is dried to 10 %, the power generation efficiency increases by a further 10.1 %. However, the required dryer surface for drying the fuel with a moisture content of 60 % to 10 % is 1.9 times greater compared to the reference state. The results of this work can be used both for the prediction of the power generation efficiency of a power plant with this type of dryer, based on the moisture content in the fuel, and for the design of a biomass indirect dryer.

Keywords: indirect drying, biomass drying, power generation efficiency.

1. Introduction

Climate change and environmental degradation may become a threat to Europe and the world in the future. The European Union has therefore created the Green Deal, which places decarbonisation demands on the energy sector. One way to achieve this is to increase the efficiency of power plants and to increase the share of carbon-neutral fuels in the energy mix. Among these fuels are many kinds of biomass, which provide an excellent source of clean energy and possess a great potential for further use; nevertheless, the potential of high-quality dry biomass in the Czech Republic is now almost fully exploited.
Therefore, there is an effort to find ways to use low-grade biomass with a high moisture content efficiently for electricity and heat generation. The quality of a fuel for electricity and heat generation is described by its heating value, which depends on the amount of water and ash in the fuel. The ash content in biomass is relatively low, ranging from 0.5 % to 12 % on a dry basis [1]. In general, the ash content is not significant compared to the moisture content, which varies from 10 % to 70 % on a wet basis. Due to this fact, the quality of the fuel can be improved significantly by reducing the moisture content. The cheapest method appears to be solar drying, but the combustion of biomass in a steam power plant requires a significant mass flow, so a large storage space would be needed. Moreover, the storage of wet fuel causes microbial activity, and especially fungal growth, which reduces the quality of the fuel and may cause health problems [2, 3]. Therefore, the fuel is usually dried in a dryer before the combustion process, or the fuel enters the boiler wet and the drying takes place directly there. In the latter case, the heat released from the combustion is partially consumed by the drying process and is not involved in the steam generation [4]. This reduces the efficiency of the boiler, which consequently decreases the efficiency of the entire power plant [5]. Furthermore, it is difficult to combust a wood fuel with a moisture content of 60 % or more on its own [1]. Additionally, a boiler designed for dry fuels can have smaller dimensions compared to a boiler designed for wet fuel, due to the reduced flue gas production at a higher temperature; therefore, the boiler ducts as well as the area of the heating surfaces may be smaller [6].

The fuel in a power plant is usually dried in convective (direct) dryers heated by flue gas taken from the boiler, or by hot air preheated in the boiler. These two methods still use the heat released from the combusted fuel, and therefore contribute only partially to the improvement of the power plant's efficiency. The utilisation of the flue gas leaving the boiler for drying a very moist fuel to a sufficiently low moisture content would require very high flue gas temperatures [7] and, moreover, poses a fire risk [6]. Many works are devoted to the issue of drying the fuel previous to its combustion or further use. Fuel drying for energy purposes in convective dryers using flue gas was researched in [8, 9]. The thermodynamics and economics of fuel drying for an organic Rankine cycle were analysed in [10]. An investigation into the lignite drying process in an indirect tubular dryer was conducted in [11]. The integration of a dryer or a thermal mechanical dewatering unit into a power plant was investigated in [12]. However, all of these articles mainly concern the drying of lignite. Several methods for drying biomass for power generation are explained in [13]. In [14], the drying of biomass for the production of a second generation of biofuels was investigated. Integrated drying in a gasification plant is analysed in [15, 16]. However, the previous works are mostly concerned with the drying of lignite, or with biomass drying for synfuel generation, pyrolysis and gasification plants.
None of these studies is focused on increasing the power generation efficiency of a power plant in which very moist biomass is combusted. Our research focuses on the evaluation of the power generation efficiency of a power plant with an integrated rotary indirect dryer. Indirect drying is a specific type of drying where the drying medium is separated from the material being dried by a heat transfer surface [17]. The heat is fed to the dried material through this surface, which defines the drying space. This type of dryer, heated with extraction steam, should be a suitable option for a small steam power plant. Low-pressure steam extracted from the steam turbine can be utilized effectively for drying, because this steam has already done the major part of its work in the turbine for electricity generation, but it still has enough energy in the form of condensation heat, which would otherwise be mostly lost in the condenser. This principle is similar to regenerative feedwater preheating; it is generally known that this method increases the efficiency of the thermodynamic cycle in a steam power plant, and the effect is explained as an example in [18]. Furthermore, the common energy consumption for the evaporation of 1 kg of water for this type of dryer ranges between 2800–3600 kJ/kg, in comparison with direct dryers, whose energy consumption ranges between 4000–6000 kJ/kg [19].

In this work, the indirect rotary dryer, which uses extraction steam from the turbine, is integrated into the steam power plant cycle, and the contribution of drying the wet fuel to the efficiency of power generation is evaluated. The efficiency is calculated at various moisture contents in the fuel entering the cycle and is then compared to the cycle where the fuel is not dried.

2. Methodology

2.1. Dryer integration into a steam power plant

Figure 1a illustrates a simple power plant scheme based on the steam Rankine cycle, with an extraction turbine and steam extraction for deaeration. Wet fuel enters the boiler directly. The aim of this work is to integrate the rotary indirect biomass dryer into this basic scheme and to evaluate its contribution to the efficiency of the plant. Figure 1b depicts the integration of the indirect dryer heated by low-pressure extraction steam, which has the same parameters as the deaeration steam. The fuel is first dried in the dryer and is then transported directly into the boiler for combustion.

Figure 1. Power plant cycle (a) without and (b) with the dryer (1 boiler; 2 HP steam turbine; 3 LP steam turbine; 4 electric generator; 5 condenser; 6 condensate pump; 7 deaerator; 8 feedwater pump; 9 indirect dryer).

The parameters of the steam power plant cycle are based on the typical parameters of biomass power plants located in the Czech Republic:

Power plant output: P = 10 MWe
Admission steam temperature: t_1 = 490 °C
Admission steam pressure: p_1 = 6.7 MPa
Extraction steam pressure: p_2 = 2.32 bar
Emission steam temperature: t_3 = 45 °C
Turbine thermodynamic efficiency: η_T = 84 %
Mechanical efficiency: η_m = 99 %
Electric motor efficiency: η_mot = 95 %
Generator efficiency: η_G = 98 %

2.2. Boiler efficiency

The moisture content affects the efficiency of the boiler. The boiler efficiency (Fig. 2) is calculated via the indirect method described in [20] and is related to the lower heating value of the fuel. The method is based on the estimation of the heat losses in the boiler. The effect of the variable moisture content in the fuel was reflected only in the change of the chimney loss, which expresses the relative heat loss in the flue gas leaving the boiler; in this case, at a flue gas temperature of 150 °C and with an excess air ratio of 1.5.

Figure 2. Boiler efficiency in dependence on the moisture content in the fuel.
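The chimney loss underlying Fig. 2 can be approximated from elementary combustion stoichiometry. The sketch below is a rough stand-alone estimate: the dry-basis wood composition, the heat-capacity values and the heating-value correlation are generic assumptions, not the inputs of [20], but it reproduces the qualitative trend of the figure, namely that the boiler efficiency rises as the fuel gets drier.

```python
def boiler_efficiency(w, t_fg=150.0, t_ref=25.0, excess=1.5, other_losses=0.035):
    """Indirect-method estimate (LHV basis) for wood at moisture w (wet basis)."""
    c, h, o, ash = 0.50*(1-w), 0.06*(1-w), 0.43*(1-w), 0.01*(1-w)  # per kg wet fuel
    lhv = 34.8*c + 93.9*h - 10.8*o - 2.45*w          # MJ/kg, rough correlation
    air = excess * (11.5*c + 34.3*h - 4.3*o)         # combustion air, kg/kg fuel
    m_h2o = 9.0*h + w                                # water vapour in the flue gas
    m_dry = 1.0 + air - m_h2o - ash                  # dry flue gas by mass balance
    cp_dry, cp_h2o = 1.05e-3, 1.97e-3                # MJ/(kg K), assumed mean values
    stack_loss = (m_dry*cp_dry + m_h2o*cp_h2o) * (t_fg - t_ref) / lhv
    return (1.0 - stack_loss - other_losses) * 100.0

for w in (0.1, 0.3, 0.5, 0.6):
    print(f"w = {w:.0%}:  eta_b ~ {boiler_efficiency(w):.1f} %")
```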
When burning a drier fuel, the volume of flue gas is lower, the chimney loss decreases, and the efficiency of the boiler increases. The values of the other losses, i.e., the losses due to incomplete combustion, the heat in the bottom ash, and radiation and convection to the surroundings, were considered constant and amounted to a total of 3.5 %. A fuel with a moisture content higher than 60 % on a wet basis is difficult to burn on its own; drying is essential for the utilisation of waste biomass with a very high moisture content. As the efficiency of the boiler increases with a decreasing moisture content in the fuel, the efficiency of the entire power plant also increases. For a comparison of the cycles with and without a dryer, it is necessary to know the energy balance for the chosen type of integrated dryer and to verify its suitability for use with the considered material; these data are specific to each dryer type and to the material used. Therefore, a set of experiments had to be prepared to provide the data needed for the evaluation (the energy consumption, the surface evaporation capacity and the energy loss of the dryer) for the steam rotary indirect dryer and the fuel used in the power plant.

2.3. Evaluation of the power generation efficiency

The evaluation of the power generation efficiency in dependence on the moisture content in the fuel (Fig. 5) is based on the following equations.

Power generation efficiency:

\eta = \frac{P_G - P_{pump}}{\dot{m}_f \cdot LHV} \cdot 100, \qquad (1)

where \dot{m}_f [kg/s] is the fuel mass flow rate and LHV [kJ/kg] is the lower heating value of the fuel.

Feedwater pump power consumption:

P_{pump} = \frac{\Delta h_{7-8} \cdot \dot{m}_l}{\eta_{mot} \cdot \eta_m}, \qquad (2)

where \dot{m}_l [kg/s] is the water mass flow rate and \Delta h_{7-8} [kJ/kg] is the enthalpy change of the water in the feedwater pump.

Steam generation:

\dot{m}_g = \frac{\dot{m}_f \cdot LHV \cdot \eta_b}{h_1 - h_8}, \qquad (3)

where \dot{m}_g [kg/s] is the steam mass flow rate and \eta_b [-] is the boiler efficiency. The differences between the cycles with and without the drying lie in the boiler efficiency (Fig. 2) and in the heating value of the fuel.

Power plant output:

P_G = (P_{HPT} + P_{LPT}) \cdot \eta_T \cdot \eta_m \cdot \eta_G. \qquad (4)

High-pressure turbine power output:

P_{HPT} = \dot{m}_g \cdot (h_1 - h_2). \qquad (5)

Figure 3. Steam rotary indirect dryer.

Low-pressure turbine power output:

P_{LPT} = (\dot{m}_g - \dot{m}_o) \cdot (h_2 - h_3), \qquad (6)

where \dot{m}_o [kg/s] is the mass flow rate of the extracted steam. The steam mass flow through the LPT is reduced due to the extraction of the steam:

• for deaeration in the cycle without the drying,
• for deaeration and drying in the cycle with the drying.

Mass flow rate of the extracted steam:

• for the cycle without drying

\dot{m}_{o1} = \dot{m}_g \cdot \frac{h_7 - h_6}{h_2 - h_6}, \quad \dot{m}_{o2} = 0; \qquad (7)

• for the cycle with drying

\dot{m}_{o1} = \frac{\dot{m}_{pv1}}{h_2 - h_9} \cdot \left[ \left( \frac{1 - w^r_{in}}{1 - w^r_{out}} \right) \cdot h_{f,out} - h_{f,in} + \left( \frac{w^r_{in} - w^r_{out}}{1 - w^r_{out}} \right) \cdot h_{wv} \right], \qquad (8)

where h_{wv} [kJ/kg] is the waste vapour enthalpy, h_f [kJ/kg] is the enthalpy of the fuel, w^r [-] is the moisture content in the fuel, and \dot{m}_{pv1} [kg/s] is the mass flow rate of the wet fuel entering the dryer;

\dot{m}_{o2} = \frac{\dot{m}_{o1} \cdot (h_6 - h_9) + \dot{m}_g \cdot (h_7 - h_6)}{h_2 - h_6}, \qquad (9)

\dot{m}_o = \dot{m}_{o1} + \dot{m}_{o2}. \qquad (10)
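Equations (1)–(7) for the cycle without drying can be evaluated end-to-end; the sketch below does so per 1 kg/s of fuel, obtaining the water/steam enthalpies from the CoolProp library and treating the turbine enthalpy drops as isentropic, with η_T applied at the power level as in eq. (4). The boiler efficiency and fuel LHV are assumed values for w ≈ 50 % (cf. Fig. 2), so the result is indicative only.

```python
from CoolProp.CoolProp import PropsSI as prop

p1, t1 = 6.7e6, 490.0 + 273.15       # admission steam
p2 = 2.32e5                           # extraction / deaerator pressure
t3 = 45.0 + 273.15                    # condenser temperature
eta_t, eta_m, eta_g, eta_mot = 0.84, 0.99, 0.98, 0.95
eta_b, lhv, mf = 0.86, 8.0e6, 1.0     # assumptions: boiler eff., LHV [J/kg], fuel [kg/s]

h1 = prop('H', 'P', p1, 'T', t1, 'Water')
s1 = prop('S', 'P', p1, 'T', t1, 'Water')
h2 = prop('H', 'P', p2, 'S', s1, 'Water')             # isentropic extraction state
p3 = prop('P', 'T', t3, 'Q', 0, 'Water')
h3 = prop('H', 'P', p3, 'S', s1, 'Water')             # isentropic emission state
h6 = prop('H', 'T', t3, 'Q', 0, 'Water')              # condensate (pump work neglected)
h7 = prop('H', 'P', p2, 'Q', 0, 'Water')              # deaerator outlet, sat. liquid
h8 = h7 + (p1 - p2) / prop('D', 'P', p2, 'Q', 0, 'Water')   # feed pump, incompressible

mg = mf * lhv * eta_b / (h1 - h8)                     # eq. (3)
mo = mg * (h7 - h6) / (h2 - h6)                       # eq. (7): deaeration steam only
pg = (mg * (h1 - h2) + (mg - mo) * (h2 - h3)) * eta_t * eta_m * eta_g   # eqs. (4)-(6)
p_pump = (h8 - h7) * mg / (eta_mot * eta_m)           # eq. (2)
print(f"eta = {(pg - p_pump) / (mf * lhv) * 100:.1f} %")   # eq. (1), roughly 28-29 %
```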
Dryer dimensions: based on the moisture content of the fuel used in the cycle and the moisture content required after drying, the amount of evaporated water $\dot{m}_w$ can be calculated.

Dryer volume needed for the evaporation of the water:
$$V = \frac{\dot{m}_w \cdot 3600}{O_V}. \tag{11}$$

Dryer surface needed for the evaporation of the water:
$$S = \frac{\dot{m}_w \cdot 3600}{O_S}. \tag{12}$$

The required surface and volume of the dryer can be reached by changing the length or the diameter of the shell in the process of the dryer design. Additional surface area can be added through heated flights.

2.4. Experimental dryer

For many kinds of inhomogeneous biomass materials used for power generation, the rotary indirect dryer is a suitable option; the indirect dryer is more energy efficient than conventional direct dryers [19]. For the design of such an industrial dryer, it is necessary to know its operating characteristics, which are not usually available. For this purpose, experiments were carried out on a laboratory rotary steam indirect dryer (Fig. 3). A determination of the surface and volumetric evaporation capacities is required for designing the dryer and determining its optimal size. The dimensions of the dryer and the heating steam parameters are summarized in Table 1.

Figure 3. Steam rotary indirect dryer.

Table 1. Dimensions of the dryer and steam parameters.
  dryer dimensions:
    inner surface:          6.42 m²
    inner volume:           0.579 m³
    diameter of the shell:  0.6 m
    length of the shell:    2 m
  steam extracted from the turbine:
    steam temperature:      135 °C
    steam pressure:         3.2 bar

2.5. Material

The tested type of biomass was predominantly spruce wood chips, which were bought from an external supplier and stored in an outdoor open area, so the inlet moisture content ranged between 55 % and 66 %. This material is very inhomogeneous, which may cause small deviations in the experimental results. Figure 4 shows a comparison of the material before and after drying to approximately 10 %.

Table 2. Material parameters.
  moisture content:                 65 %
  lower heating value (w = 65 %):   4.4 MJ/kg
  bulk density:                     420 kg/m³
  average thickness of the wood chips: 10 mm

Figure 4. Material before drying (left side) and after drying (right side).

3. Results and discussion

A series of drying experiments was performed. Their goal was to verify the functionality of the dryer and to determine the energy consumption of drying under different conditions. The input parameters were chosen to represent common conditions in a biomass power plant: the steam pressure was 3.2 bar and its saturation temperature was 135 °C. Table 3 shows the results of selected experiments, which differed in the initial and final moisture content of the biomass and, mainly, in the volumetric filling ratio of the dryer. The determined values of the energy consumption ranged between 3 and 3.52 MJ per kg of evaporated water, the surface evaporation capacity varied between $O_S$ = 1.59–2.03 kg/(m²·h) and the volumetric evaporation capacity between $O_V$ = 17.1–21.8 kg/(m³·h). The results of the experiments showed that these parameters are most affected by the volumetric filling ratio of the dryer. According to [21], there is no recommended filling ratio for indirect dryers, so there is considerable room for finding the optimum in sizing and operating a real dryer. Optimizing other parameters, such as the heating medium temperature and pressure, should be the result of a technical-economical assessment for each specific case.
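With the measured evaporation capacities, the sizing step of equations (11) and (12) can be illustrated directly. The water flow to be evaporated below is a placeholder value; the capacities are taken from the experimental range in Table 3.

```python
# Minimal sketch of the dryer sizing step, eqs. (11) and (12).

m_w = 0.35   # water to be evaporated [kg/s], assumed for illustration
O_S = 1.85   # surface evaporation capacity [kg/(m2 h)], from table 3
O_V = 19.9   # volumetric evaporation capacity [kg/(m3 h)], from table 3

V = m_w * 3600 / O_V   # required dryer volume [m3], eq. (11)
S = m_w * 3600 / O_S   # required heated surface [m2], eq. (12)
print(f"required volume {V:.0f} m3, required surface {S:.0f} m2")
```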
The energy consumption of 3 MJ per kg of water evaporated from the biomass was used for the following calculations of the contribution of biomass drying to improving the power generation efficiency in Fig. 5.

3.1. The impact of the dryer's integration on the power generation efficiency

Based on the experimentally determined drying characteristics, the efficiency of the power plant with an integrated dryer (Fig. 1b) was evaluated. Figure 5 shows the power generation efficiency calculated for various moisture contents of the fuel after the drying process. The figure indicates that the highest efficiency gain is achieved when very wet fuel enters the cycle and is dried to values of about 10 % to 20 %. Fuel with a moisture content of 60 % or more is difficult to burn directly, and co-combustion with a quality fuel is usually necessary. In practice, the moisture content of very moist biomass, such as bark stored outdoors during winter, can be up to 65 %. A fuel with such a high moisture content can be used without drying only for co-combustion with a high-quality fuel. Therefore, a moisture content of the fuel of up to 70 % was considered for the purpose of evaluating the efficiency benefits of the dryer's integration. Fuel drying has a significant impact on the boiler efficiency, which affects the efficiency of the entire power plant. Moreover, drying also increases the heating value of the fuel; in this way, the heat from the extraction steam is partially returned to the steam cycle by means of the fuel. It is apparent that the curves begin to flatten in the range of 10–20 %, so drying the fuel to a very low moisture level will not significantly increase the efficiency compared to the costs incurred.

3.2. Increase in the power plant's efficiency

Figure 6 describes the increase in the power generation efficiency as a function of the moisture content after drying, for two different moisture contents, 50 % and 60 %, of the fuel entering the power plant. The comparison is based on the assumption that the fuel enters the power plant cycles with the same moisture content and is either burned directly or dried to various moisture contents and then burned. The greatest benefit is achieved when the moisture content of the fuel entering the power plant is the highest and the moisture content of the fuel after drying is the lowest. Commonly used biomass, e.g. wood chips, has a moisture content of about 50 % and can be dried to 20 %, which results in an increase in efficiency of 4.4 %. If the moisture content of the fuel entering the cycle were 60 % and the moisture content after drying were 10 %, the efficiency could increase by 10.1 %. Using a fuel with a lower moisture content reduces the contribution of the dryer to the efficiency. The contribution of the dryer also changes with the moisture content of the fuel after drying: in the cycle where fuel with a 60 % moisture content is used, the contribution decreases from 10.1 % to 4.7 % as the moisture content after drying rises from 10 % to 40 %. To obtain drier fuel, more water has to be evaporated, depending on the moisture content before and after drying. This results in considerably higher demands on the dryer size, quantified by the surface area.
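This scaling can be sketched with a simplified dry-matter balance: per kilogram of dry matter, the evaporated water is w/(1−w) before drying minus after, and the required surface is proportional to the evaporated water flow (eq. (12) with a constant surface evaporation capacity). The change of the fuel mass flow with the heating value is neglected here, so the result is only an approximation of the ratios discussed below (cf. Fig. 7).

```python
# Simplified dry-matter balance for the relative dryer surface.
# Reference case: drying from 50 % to 20 % moisture content.

def evaporated_per_kg_dry(w_in, w_out):
    """Water evaporated per kg of dry matter [kg]."""
    return w_in / (1 - w_in) - w_out / (1 - w_out)

ref = evaporated_per_kg_dry(0.50, 0.20)
for w_in, w_out in [(0.60, 0.20), (0.60, 0.10), (0.50, 0.30)]:
    rel = evaporated_per_kg_dry(w_in, w_out) / ref
    print(f"{w_in:.0%} -> {w_out:.0%}: {rel:.2f} x reference surface")
# prints roughly 1.7, 1.9 and 0.8 -- close to the values reported in fig. 7
```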
Table 3. Experiment conditions and results.
  experiment number:                                  1      2      3      4      5      6
  filling ratio [%]:                                  9     10     15     17     27     27
  moisture content of the fuel before drying [%]:   69.2   62.2   64.5   64.5   65.5   60.6
  moisture content of the fuel after drying [%]:    16.6   21.1    4.1    3.7   14.8   17.0
  surface evaporation capacity [kg/(m²·h)]:          1.85   1.89   1.59   1.68   2.03   1.94
  volumetric evaporation capacity [kg/(m³·h)]:      19.9   20.3   17.1   18.0   21.8   20.9
  energy consumption [MJ/kg of water]:               3.43   3.46   3.52    –     3.24   3

Figure 5. Power generation efficiency as a function of the moisture content of the fuel after drying.

Figure 6. Efficiency increase in the cycle with an integrated dryer compared to the cycle without a dryer.

Figure 7 shows the increase in the required surface area relative to the reference case, which is the surface required for drying biomass from 50 % to 20 % moisture content. The dryer surface area for these conditions and for the fuel flow required by the power plant with the parameters given in Chapter 2 has to be 238 m². If the moisture content of the fuel is 60 % and it is dried to a moisture content of 20 %, the required surface increases 1.7 times; for drying to a 10 % moisture content, the surface must be 1.9 times larger than in the reference case. For drying from 50 % to 30 % moisture content, the required surface would be only 70 % of the reference case. The optimal size of the dryer has to be determined on the basis of a technical-economical assessment for each specific case.

Figure 7. Relative increase in the dryer's surface.

4. Conclusion

The integration of an indirect biomass dryer heated by extracted steam into the cycle of a steam power plant improves its efficiency and combustion conditions. The benefits of the dryer integration are most noticeable when the fuel is dried to a low moisture content. Should the biomass entering the dryer have a moisture content of 50 % and be dried to 20 %, the efficiency of the entire power plant rises by 4.4 % and the dryer surface area has to be 238 m². If fuel with a moisture content of 60 % is used and dried to 10 %, the power generation efficiency rises by 10.1 %, but the dryer surface area has to be 1.9 times larger. It has been experimentally verified that a rotary steam indirect dryer is suitable for drying waste biomass such as wood chips or bark. The experiments showed the influence of the drying conditions on the energy consumption; therefore, further investigation into dryer optimization is needed. To increase the intensity of the drying process, it would be suitable to use heating steam at a higher temperature, thereby reducing the dryer's surface area and the investment costs, but consequently also reducing the electricity generation due to a lower steam flow through the low-pressure part of the steam turbine. The optimum dryer size would be determined by the results of a technical-economical assessment. The ability to dry fuel will widen the range of combustible biomass types – it will allow the burning of less valuable moist fuel, thus decreasing the fuel costs further. In addition, a boiler designed for a dry fuel can be smaller and thus less expensive to invest in.

Acknowledgements

This work was supported by the project from the Research Centre for Low Carbon Energy Technologies, CZ.02.1.01/0.0/0.0/16_019/0000753. We gratefully acknowledge support from this grant.
List of symbols
  h    specific enthalpy [kJ/kg]
  LHV  lower heating value [kJ/kg]
  ṁ    mass flow rate [kg/s]
  O    evaporation capacity [kg/(m² h); kg/(m³ h)]
  p    pressure [bar; MPa]
  P    power input/output [kW; MW]
  S    surface [m²]
  t    temperature [°C]
  V    volume [m³]
  w    moisture content [–; %]

Greek symbols
  η    efficiency [–; %]

Subscripts
  o1 steam extracted for the deaerator; o2 steam extracted for the dryer; 1 boiler outlet; 2 high-pressure turbine outlet; 3 low-pressure turbine outlet; 5 condenser outlet; 6 condensate pump outlet; 7 deaerator outlet; 8 feedwater pump outlet; 9 dryer outlet; b boiler; f fuel; g gas; HPT high-pressure turbine; in in; l liquid; LPT low-pressure turbine; m mechanical; mot electric motor; out out; pump pump; S surface; t turbine; g generator; V volume; w water; wv waste vapour

References

[1] S. van Loo, J. Koppejan (eds.). The handbook of biomass combustion and co-firing. Earthscan, London, 2008.
[2] R. Jirjis. Storage and drying of wood fuel. Biomass and Bioenergy 9(1-5):181–190, 1995. https://doi.org/10.1016/0961-9534(95)00090-9
[3] O. Gislerud. Drying and storing of comminuted wood fuels. Biomass 22(1-4):229–244, 1990. https://doi.org/10.1016/0144-4565(90)90019-g
[4] B. G. Miller, D. A. Tillman (eds.). Combustion engineering issues for solid fuel systems. Academic Press, 2008.
[5] L. Dzurenda, A. Banski. The effect of firewood moisture content on the atmospheric thermal load by flue gases emitted by a boiler. Sustainability 11(1):284, 2019. https://doi.org/10.3390/su11010284
[6] W. A. Amos. Report on biomass drying technology. Tech. rep., National Renewable Energy Laboratory, Colorado, USA, 1998. https://doi.org/10.2172/9548
[7] T. Dlouhý, F. Hrdlička. Regenerace tepla ze sušení biomasy. In Kotle a energetická zařízení, pp. 16–22. VUT, Brno, 2003.
[8] K. Atsonios, I. Violidakis, M. Agraniotis, et al. Thermodynamic analysis and comparison of retrofitting pre-drying concepts at existing lignite power plants. Applied Thermal Engineering 74:165–173, 2015. https://doi.org/10.1016/j.applthermaleng.2013.11.007
[9] X. Han, M. Liu, K. Wu, et al. Exergy analysis of the flue gas pre-dried lignite-fired power system based on the boiler with open pulverizing system. Energy 106:285–300, 2016. https://doi.org/10.1016/j.energy.2016.03.047
[10] X. Han, S. Karellas, M. Liu, et al. Integration of organic Rankine cycle with lignite flue gas pre-drying for waste heat and water recovery from dryer exhaust gas: thermodynamic and economic analysis. Energy Procedia 105:1614–1621, 2017. https://doi.org/10.1016/j.egypro.2017.03.518
[11] K. Hatzilyberis, G. P. Androutsopoulos, C. E. Salmas. Indirect thermal drying of lignite: design aspects of a rotary dryer. Drying Technology 18(9):2009–2049, 2000. https://doi.org/10.1080/07373930008917824
[12] E. Kakaras, P. Ahladas, S. Syrmopoulos. Computer simulation studies for the integration of an external dryer into a Greek lignite-fired power plant. Fuel 81(5):583–593, 2002. https://doi.org/10.1016/s0016-2361(01)00146-6
[13] R. Wimmerstedt. Recent advances in biofuel drying. Chemical Engineering and Processing: Process Intensification 38(4-6):441–447, 1999. https://doi.org/10.1016/s0255-2701(99)00041-0
[14] L. Fagernäs, J. Brammer, C.
Wilén, et al. Drying of biomass for second generation synfuel production. Biomass and Bioenergy 34(9):1267–1277, 2010. https://doi.org/10.1016/j.biombioe.2010.04.005
[15] S. Tuomi, E. Kurkela, I. Hannula, C.-G. Berg. The impact of biomass drying on the efficiency of a gasification plant co-producing Fischer-Tropsch fuels and heat – a conceptual investigation. Biomass and Bioenergy 127:105272, 2019. https://doi.org/10.1016/j.biombioe.2019.105272
[16] M. A. Adnan, M. M. Hossain. Integrated drying and gasification of wet microalgae biomass to produce H2 rich syngas – a thermodynamic approach by considering in-situ energy supply. International Journal of Hydrogen Energy 44(21):10361–10373, 2019. https://doi.org/10.1016/j.ijhydene.2019.02.165
[17] C. W. Hall. Dictionary of drying. M. Dekker, 1979.
[18] S. Basu, A. K. Debnath. Power plant instrumentation and control handbook. Academic Press, 2015. https://doi.org/10.1016/c2018-0-01231-1
[19] A. S. Mujumdar (ed.). Handbook of industrial drying. CRC Press, 4th edn., 2014.
[20] T. Dlouhý. Výpočty kotlů a spalinových výměníků. ČVUT, 2nd edn., 2002.
[21] J. Havlík, T. Dlouhý, M. Sabatini. The effect of the filling ratio on the operating characteristics of an indirect drum dryer. Acta Polytechnica 60(1):49–55, 2020. https://doi.org/10.14311/ap.2020.60.0049

Acta Polytechnica 61(3):448–455, 2021

acta polytechnica vol. 44 no. 2/2004

Recycling and networking

T. Bányai

Abstract
In recent years, the notion has been gaining ground that, for environmental and legislative reasons, improvements in national environmental policies and practice, including recycling strategies, are desirable and in many cases may be economically beneficial. Although according to recent surveys the state of the environment in Hungary is in line with the average values of the European Union, the main challenge for the country is to achieve sustainability in economic, environmental and technological terms. With a view to accession to the European Union, a harmonisation strategy must be worked out and implemented. This harmonisation strategy includes not only legislative aspects, but also social, technological, financial and logistic considerations. Because of the high logistic costs of achieving closed-loop recycling systems, the author focuses on the logistic aspects and tasks of the improvement phases and concentrates on the possibilities of networking and co-operation. The paper describes some possible alternative solutions for co-operative recycling processes, to improve the following logistic parameters: delivery times, accuracy of supply, running times, utilization of capacities, stock quantities, flexibility, transparency of the system, high forwarding capability, quality of product. The logistic aspects of co-operation will be analysed from the viewpoint of a closed-loop economy.

Keywords: co-operation, logistics, networking, optimisation, recycling.

1 Introduction

Ensuring people's quality of life within the means of nature involves many responsibilities for present-day society. Strategies for environmental protection and the recycling of used appliances and of industrial and communal wastes help to create a closed-loop economy. In recent years, the notion has been gaining ground that, for environmental and legislative reasons, improvements in national environmental policies and practice, including recycling strategies, are desirable and in many cases may be economically beneficial. Globalisation and the reduction of the life cycle of products, technologies and services have led to an increase in competition among enterprises in the field of recycling [1]. Individual enterprises are no longer able to respond to the changing market needs and recycling activities because of their outdated organisational structures. In this way, the role of co-operating forms has been rising, because they can achieve economic and technological advantages that companies working separately could not achieve on their own. Within the framework of this paper, the author presents the main logistic tasks and the design conception for co-operative recycling processes to improve the existing closed-loop economy of future EU member states like Hungary [2].
A precise definition of these logistic tasks and the elaboration of an execution plan are important for the new EU member states, and thus for Hungary, because shortly after accession a system has to be in operation that meets the EU norms and the technological, logistic, economic, legal, political and social conditions.

2 State of the art vs. EU practice

Nowadays the governments of the OECD states definitely consider the environment a strategic sector. This involves not only ecological aspects; it is also important that the environment has become a major economic and market factor, and several enterprises owe their success to their environmental approach. In the OECD states, general environment-oriented services account for nearly one quarter of the total industrial output, mainly in the form of engineering and consultative services: process planning, regulation planning, project management, influence tests, environment surveys, environment monitoring, risk management, expert systems, financial analysis, and database administration [3].

The governments of the CEE countries should do their best to bring their laws, economy and regulations in line with EU directives and regulations. Hot areas are environmental protection, the development of environmental awareness, the development of an environmental industry, waste management, the application of clean technologies, closed-loop production, and the introduction of environment control systems in companies [4], [5]. Collection, take-back and recycling of used household appliances have been under discussion in Europe for many years. The main reasons for addressing this subject are as follows: reduction of waste volume, promoting the recycling of materials, closing the loop, use of fewer resources, better control over toxic substances, and reducing environmental risks [6]. The recycling of products is also dealt with by international agreements:
• the Rio de Janeiro conference, which stated that "the natural load is not restricted and we do not pay enough",
• BS (British Standard) amendment 7750,
• the EMAS environmental management and audit system,
• the introduction of ISO 14000 in environmental auditing,
• the Basel convention on the qualification and export of waste in Europe, with its recommendation of the 'polluter pays' principle.

Although the objective of the act on environmental product fees and on the environmental product fees of individual products (Act LVI of 1995) is to create financial resources for the reduction and prevention of damage to the environment, or to one of its elements, caused by the production or use of a product, or by directly or indirectly imposing harm or stress on the environment as a consequence of the consumption of a product, and to reduce pollution of the environment and stimulate the economical management of natural resources, the Hungarian directives still need to be harmonised with the directives of the European Union.
The European Union has decided that improving the closed-loop economy is the most important step in the field of equipment containing hazardous materials, such as electric and electronic equipment. The European Parliament and the Council have made proposals for directives on electric and electronic equipment wastes. On the basis of these directives, the milestones for possible future improvement are the following: treatment, selective collection, recovery, financing in terms of WEEE, information for users, information for treatment facilities, information requirements, reporting obligation and adaptation to scientific and technical progress [7].

3 General collection structures of recycling

The first and principal step in forming a potential recycling collection system structure is to explore the potential system elements. Although two main types of recycling technology exist (disassembly and shredding), based on previous Hungarian experience we can say that the future recycling system will basically depend on disassembly technology. The target during the formation of the recycling system is to form a logistic system that is capable of collecting used electronic products from the end users in an economical manner and delivering them to disassembly plants [8], [9]. The number of disassembly plants can be determined according to the required disassembly capacity. When determining the number of disassembly plants, the basic question is whether one large plant is to be set up on one site, or several plants with smaller capacity on a number of sites. It is important to determine the types of products to be disassembled by a plant (homogeneous, inhomogeneous or mixed disassembly). Determining the range of products to be disassembled at a plant is a very important strategic decision, like determining the range of products drawn into the orbit of the collection process, since it will define the companies working in strategic co-operation in the collection and recycling system. Currently there is one disassembly plant in Hungary, which can be used to disassemble discarded refrigerators, freezers and freezing boxes.
Thus the number of disassembly plants and the types of products that can be linked to them is still an open question, especially because there is still no agreement among the Hungarian companies that make electronic household appliances on a collection-disassembly system that can be operated in co-operation. However, the future points towards the formation of an all-round co-operating system, because establishing and operating several collection-disassembly systems does not seem practical in a small country.

A collection system can basically comprise the following elements: end users, waste courts, stores, services, distributors [10]. Using these elements, a single- or multi-level logistic collection system has to be formed, which, taking into account the characteristics of the above listed elements, offers the users and recyclers an economical way to collect discarded products. End users can be communal users and private users. It is important to distinguish between them during the formation of the collection system. While in the case of private users a single item is usually collected, in the case of communal users a larger quantity is usually collected from each communal unit. Thus it would be useful for the collection system to deliver the used products of private and communal users to different collection points.

Fig. 1: Structure of collection systems.

When determining the number of collection levels, two important target functions have to be considered. One target function is to minimize the implementation costs, which can be achieved by reducing the number of collection points. The other target function is to minimize the delivery costs linked to the operation of the collection system, which can be achieved by increasing the number of collection levels and collection points. When electronic household products began to be recycled (in about 1993), mobile as well as stationary disassembly plants were analysed, but for economic and geographical reasons, the planning and partial realization of versions of the collection system were based on stationary disassembly plants. Pre-disassembly is especially important for products of this kind, when the volume of large items can be reduced in an early phase of collection. However, the applicability of pre-disassembly has to be investigated carefully from the economic viewpoint, because the implementation and operation of a pre-disassembly plant considerably increase the operation and maintenance costs. Figure 1 shows the general structure of the logistic collection system, from which the characteristic system versions can be derived. As shown in Fig. 1, the logistic collection system consists of a number of participants; therefore such a system can be operated only in the form of co-operation. The next section introduces the forms of co-operation in the framework of which these collection and recycling systems can be operated.

4 Networking possibilities of recycling

Operation of the recycling system can take two co-operation forms. One is the virtual enterprise, and the other is the cluster. A virtual enterprise is a temporary network of companies that come together rapidly to exploit fast-changing opportunities. In a virtual enterprise, companies can share costs, skills and access to global markets, with each partner contributing what it is best at.
This network is based on economic relationships, which can be flexibly built up or broken up depending on demand. The most important partners of a general virtual enterprise can be purchasers, producers, distributors and end users [11, 12, 13, 14, 15, 16]. Forming a cluster is a process that may start with an initial natural advantage or a chance use of a location, leading to collaboration among several firms. The present-day official OECD and EU definition of a cluster is as follows: the ability of companies, industries, regions, nations and supranational regions to produce relatively high income alongside relatively high employment while exposed to international competition. The advantages of co-operation based on geographical concentration and reliance are: low transaction costs (communication), low transport costs, opportunities for special services, face-to-face communication, comparatively simple feasibility of the JIT conception, cost reduction because of common quality assurance, etc. Since Hungary is not a large country, the co-operation forms operating in it can be classified as cluster-type co-operation.

Co-operation-type operating forms can be successfully applied in two important fields of collection and recycling systems: the harmonized operation of (a) the collection system and (b) the disassembly plants. The virtuality of the collection system means that the potential participants in the collection system (end users, waste courts, stores, services, distributors) do not operate in a common organization in a legal sense, but their operation is harmonized by a virtual logistic centre. The virtual disassembly system, which ensures the co-operation of plants specialized in disassembling various product groups, can work on the same principle. Obviously, co-operation between the collection system, the disassembly system and the distribution-recycling system is essential.

Fig. 2: Co-operative subsystems of the recycling process.

This high-level co-operation is basically carried out by communication among the virtual logistic centres operating the individual sub-systems, but it is also possible for one virtual logistic centre to ensure the harmonized operation of the whole system (see Figure 2). Obviously, the virtual logistic centres do not only deal with harmonizing the technological and logistic processes; they also have to keep in contact with the partners concerning the whole business process (banking, exchange, insurance, dealing with the authorities, ministries, external services, etc.). In Hungary, because of its small size, it seems practical to operate the whole system through a single logistic centre, but the final number of logistic centres and the structure of the system can be determined only after performing quantitative and qualitative analyses of the products to be collected and recycled, taking into account technology, logistics and economics; the result can be considered a long-term strategic decision from the point of view of the co-operating partners.

5 Conception of the design of collection systems

When an optimal logistic collection system is being formed, both internal and external logistic aspects need to be considered.
External logistic aspects are basically connected with the collection processes: the products to be collected and recycled, the topology of the collection/distribution system, hierarchical and heterarchical characteristics, collection/distribution routes, route planning, planning of storage, choice of transport and loading devices, and scheduling of supply/distribution. Internal logistic aspects have to receive due consideration when the disassembly system is being designed: the topology of the disassembly systems, the topology of the disassembly plants, the choice of material handling devices, the planning of stores of used products and reusable components/materials, the scheduling of disassembly, and pre-disassembly possibilities [17].

The first task during the formation of the logistic collection system is to determine the product group to be collected and the quantitative and qualitative parameters of these products. While the European Union directives specify the quantity to be collected as 4 kg/person, the quantity related to a company can be determined from its market share. For this, the first step is to make a long-term trend analysis considering the marketing results of the previous period. In this trend analysis it is practical to determine, based on the regression method, the marketing trend of each product, because the marketing trend of a product that has already been on the market is determined differently than in the case of a new product on the market. For example, in Hungary the market for built-in ovens and cookers is quite small, but dynamic growth is expected, so it is practical to apply an exponential or logistic function. Meanwhile, in the case of products that have already been on the market for a long time, linear regression methods are applicable (a short numerical sketch of this step is given below). Based on the values estimated this way, the future market share and the quantity to be collected can be calculated. The recommended next step in the quantitative analysis is to determine the quantity of products to be collected according to region. When the quantity distribution in the regions is estimated, not only the population but also the standard of living has to be considered, for example by taking the GDP or the unemployment ratio as a weighting factor. Based on this distribution, the parameters of the required collection structure can be determined. The aim of the qualitative analysis is to determine the distribution of the products to be collected in order of age. For this, an amortization function needs to be used (see Fig. 3), which defines how much of a given type of product becomes waste as a function of its age [18]. This amortization function may differ for different products, as a function of the product life-cycle graph.

Fig. 3: Amortization rates, $\sum_{i=1}^{n} a_i = 1$.

When the quantitative and qualitative parameters are known, the collection system can be planned, which basically means a selection based on a comparative analysis of the optimized system versions. During the formation of system versions, an analysis of single- and multi-level collection systems is recommended, taking into account different region-forming methods and disassembly system structures. Typical system versions are as follows: a single-level collection system with one or more disassembly plants, and a multi-level collection system with one or more disassembly plants.

Fig. 4: Single-level collection system.

Although the investment costs are relatively low in the case of a single-level collection system with a single disassembly place (see Fig. 4), the operation costs will be high, since the customers have to transport their waste products over large distances. In this system, private users as well as communal users take their products to the same collection system. In the case of Hungary, this system version requires the formation of 18–20 collection centres, if each county has a collection centre, and the customers have to transport their waste products an average distance of 30 km.
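As an illustration of the quantitative analysis described above, the following sketch fits a linear trend to the past sales of an established product and an exponential trend to a newly introduced one, and then uses an amortization function a(τ) (summing to 1) to turn past sales into the expected discarded quantity. All sales figures and amortization rates are invented for the example.

```python
# Sketch of the trend-analysis and amortization steps; all data invented.
import numpy as np

years = np.arange(1996, 2004)
sales_old = np.array([310, 325, 332, 350, 361, 378, 390, 401])  # mature product
sales_new = np.array([5, 9, 16, 30, 52, 95, 170, 300])          # new product

# linear regression for the mature product
b, a = np.polyfit(years, sales_old, 1)
# exponential trend: least squares on the logarithm of the sales
q, p = np.polyfit(years, np.log(sales_new), 1)

year = 2005
print(f"forecast {year}: mature {a + b*year:.0f}, new {np.exp(p + q*year):.0f}")

# amortization rates a(tau): share of units discarded tau years after sale
a_tau = np.array([0.0, 0.02, 0.05, 0.13, 0.20, 0.25, 0.20, 0.15])
assert abs(a_tau.sum() - 1.0) < 1e-9

# units expected to be discarded in 2003, from sales 1996-2003
discards_2003 = float((sales_old[::-1] * a_tau).sum())
print(f"expected discards in 2003: {discards_2003:.0f}")
```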
in this system, private users as well as communal users take their products to the same collection system. in the case of hungary, this system version requires the formation of 18–20 collection centres, if each country has a collection centre, and the customers have to transport their waste products an average distance of 30 km. when forming the collection © czech technical university publishing house http://ctn.cvut.cz/ap/ 93 acta polytechnica vol. 44 no. 2/2004 fig. 3: amortization rates ai i n � � � 1 1 system, it is recommended to consider having regional collection centres at institutions that already exist or will be established obligatorily during connection to eu systems. if the tasks of the collection centres are carried out by waste courts, the regulations of the operations of waste courts have to be modified in a way that will enable large amounts (from communal users) of dangerous wastes to be accepted. according to the current executive decree (98/2001 executive decree 12§) a waste collection court can be operated in order to collect dangerous wastes originating in small amounts from households or from other sources. the operator of the waste collection court is allowed to receive a small amount of dangerous waste that does not originate from household sources by agreement with the owner of the waste. the owner of the waste can use the waste collection court for collection only if the quantity of dangerous waste originating from him in the course of a year is not more than 500 kg. the operator of the waste collection court has to pass the collected dangerous waste over to a licensed waste treatment body. dangerous wastes may not be kept in the waste collection court longer than 1 year from the date of receipt. it is not necessary to lift the ban on storage longer than 1 year in the case of a collection system that functions with a well-operating disassembly plant that has the required disassembly capacity. however, in the case of communal users the upper limit of 500 kg is not acceptable. naturally if these modifications cannot be carried out, the establishment of a new institution is required. in the case of the single-level collection system shown in fig. 5, the collection and disassembly processes are harmonized by a single virtual logistic centre. although the investment costs are relatively high in the case of a multi level collection system, the operation costs will be quite low, since the customers have to transport their waste products over only short distances. although the investment costs are considerable in the case of a the multilevel collection system, because the extra collection levels include building up quite a large number of collection points, it is obviously a comfortable solution for the end users because they have to deliver their used products over only short distances [19, 20]. however, to type of a collection system shown in fig. 5 can be also built up by using plants or plant networks already in operation, because low investment costs can make the product distribution network capable of doing the tasks of an element of the collection system. the collection points can be department stores, servicing stations, distributors, while the collection centres can be working waste courts or other plants. a distinction needs to be made between private and communal users in this collection system, because the receivable quantities are limited at the collection points due to the limited logistic capacities. 
It therefore seems logical to link the communal users directly to the collection centres. The system shown in Fig. 5 consists of two virtual logistic centres, which harmonize the operation of the two large subsystems. The need for co-operation is visible in the case of these two systems. The tasks of the virtual logistic centres basically involve harmonizing the collection process and the disassembly process in order to minimize delivery times, increase the accuracy of the supply process, minimize running times, increase the utilization of resources (human, technological and logistic), decrease the stock quantities, increase the flexibility of the whole system, and increase the forwarding capability.

Fig. 5: Multi-level collection system.

If the target function during the formation of the collection system is to minimize the performance and the inherent costs of materials handling, then we also have to consider the optimal formation of the collection runs, especially during the formation of the collection levels near to the end user. It is expedient to form a collection structure in which delivery tasks close to the end user are done by collection routes, and delivery tasks going into the disassembly system are done by direct routes. When forming the logistic system for collection, another important task is to determine the storage capacities in addition to the delivery capacities. The quantitative and time distribution of the used products to be stored is highly stochastic at the collection level near to the end user. Since it is
during the implementation of a closed loop economy not only the economic and technological aspects have to be considered, but also the logistic aspects, because a considerable part of the costs of a closed loop economy are the costs of logistic processes. since harmonized co-operation among many economic participants is required during the realization of a closed loop economy, such co-operation is economically best implemented in the form of a network. in the case of hungary, because of its small size, co-operation can be realized in the future in the form of clusters and virtual companies. in order to improve the standards of services in the collection system, it would be practical to form a multi-level collection system, although the costs of forming and operating a large number of institutions can be higher than the costs of a single level collection system. forming a multilevel collection system requires considerable coordination, because of the many co-operating partners. this can be establishing a suitable informatic system. 7 acknowledgment this paper is based on work supported by a bolyai jános scholarship from the hungarian academy of sciences and the hungarian scientific research fund (project number: f037525). any opinions, findings and conclusions or recom© czech technical university publishing house http://ctn.cvut.cz/ap/ 95 acta polytechnica vol. 44 no. 2/2004 fig. 5: multi-level collection system mendations expressed in this material are those of the author and do not necessarily reflect the views of the hungarian academy of sciences. this research work is a part of the project of the applied co-operation research centre of mechatronics and material science at the university of miskolc entitled “development of take-back system of end of life household appliances from the point of view of eu directives”. references [1] enright m. j.: enhancing the competitiveness of smes in the global economy, strategies and policies. in: proceedings of the conference for ministers responsible for smes and industry ministers, bologna, 2000, p. 38. [2] bányai t.: environmental improvement from the point of view if eu practice – closed loop economy, a publication of the university of miskolc, 1, 2003, p. 155–162. [3] hungary – environmental performance review (2000), conclusions and recommendations. [4] hungary’s eu integration website. (2002). http://www.mfa.gov.hu/euint/ index.html. [5] regular report on hungary’s progress towards accession. (2002). http://www.mfa.gov.hu/euint/ index.html. [6] proposal for a directive of the european parliament and of the council on waste electrical and electronic equipment (2000). [7] frentz w.: kreislaufwirtschaftsund abfallgesetz. heymanns verlag, 1996. [8] haase h., gerecke a.: “logistische gestaltungsgrundsätze beim einsatz von integrierter sammelfahrzeugtechnik”. in: müll und abfall. vol. 12, 2000, p. 718–725. [9] haase h.: “logistische aufgabenstellungen im spannungsfeld der kreislaufwirtschaft”. in: logistikeffekte in der kreislaufwirtschaft. magdeburger schriften zur logistik, vol. 1, 1999. [10] sodtke m. w.: redistribution vertrieb und umweltschutz. entwicklung von strategien und maßnahmen zur rückführung von altprodukten auf der grundlage der produktdistribution. tectum verlag, 1997. [11] bányai á.: “das virtuelle logistikzentrum als koordinator der logistischen aufgaben”. in: modelling and optimization of logistic systems (editors: bányai t. and cselényi, j.), university of miskolc, 1999, p. 42–50. 
[12] bányai t.: “logistic analysis of lifecycle of virtual enterprises”. in: proceedings of the xvii. international conference on material flow, machines and devices in industry (editor: tosic s. b.), university of belgrade, 2002, p. 3.1–3.5. [13] filos e., ouzounis v.: “virtual organisations: technologies, trends, standards and the contribution of the european rtd programme”. international journal of computer applications in technology, special issue: applications in industry of product and process modelling using standards ”virtual-organisation.net”, ”newsletter”, vol. 1, 1997, no. 3–4. [14] gesmann-nuissl d.: “virtuelle unternehmens-organisation – eine gesellschaftsund kartell-rechtliche betrachtung”. in: virtuelle organisationen im zeitalter von e-business und e-government, einblicke und ausblicke (editors: gora w. and bauer h.), springer-verlag, berlin, 2001, p. 43–58. [15] karnani f.: “virtuelle wertschöpfungskette – mit revolutionären strategiekonzepte die märkte erobern”. in: virtuelle organisationen im zeitalter von e-business und e-government, einblicke und ausblicke (editors: gora w. and bauer h.), springer-verlag, berlin, 2001, p. 95–104. [16] sydow j., winand u.: “unternehmensvernetzung und -virtualisierung: die zukunft unternehmerischer partnerschaften”. in: unternehmensnetzwerke und virtuelle organisationen, schäffer-poeschel, 1998, p. 11–31. [17] cselényi j. et al: “recycling logistics as a new field of research of logistics”. economy, culture and science of north-hungary, vol. 5, 1996, no. 7–8, p. 19–26. [18] wallau f.: kreislaufwirtschaftssystem altauto. eine empirische analyse der akteure und märkte der altautoverwertung in deutschland. deutscher universitäts-verlag, 2001. [19] bányai t. et al: development of multi-level collection system of used household appliances. report acrc-mms. applied co-operation research centre of mechatronics and material sciences at the university of miskolc, 2003, p. 32. [20] horváth e., cselényi j.: “der wahl des optimalen sammlungsmittels im falle der integrierten mehrstufigen abfallsammlung”. in: modelling and optimization of logistic systems (editors: bányai t. and cselényi j.), university of miskolc, 1999, p. 42–50. tamás bányai ph.d. phone: (36)(46)565111 fax: (36)(46)563399 e-mail: alttamas@gold.uni-miskolc.hu department of materials handling and logistics university of miskolc miskolc-egyetemváros h-3515 96 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 44 no. 2/2004 acta polytechnica doi:10.14311/ap.2018.58.0402 acta polytechnica 58(6):402–413, 2018 © czech technical university in prague, 2018 available online at http://ojs.cvut.cz/ojs/index.php/ap multidimensional hybrid boundary value problem marzena szajewskaa,∗, agnieszka tereszkiewiczb a institute of mathematics, university of bialystok, 1m ciolkowskiego, pl-15-245 bialystok, poland b bialystok university of technology, faculty of civil and environmental engineering, 45 e wiejska, pl-15-351 bialystok, poland ∗ corresponding author: m.szajewska@math.uwb.edu.pl abstract. the purpose of this paper is to discuss three types of boundary conditions for few families of special functions orthogonal on the fundamental region. boundary value problems are considered on a simplex f in the real euclidean space rn of dimension n > 2. keywords: hybrid functions, dirichlet boundary value problem, neumann boundary value problem, mixed boundary value problem. 1. 
1. Introduction

The boundary value problems considered in this paper are a generalization of [24], in which the authors presented two-dimensional hybrids with mixed boundary value problems. Here we take a real Euclidean space $\mathbb{R}^n$ of dimension $n$ and finite regions $F \subset \mathbb{R}^n$ that are polyhedral domains. The aim of this paper is to seek solutions of the Helmholtz equation with mixed boundary conditions by analogy with the two-dimensional cases. The solutions are presented as expansions into series of special functions that satisfy the required conditions on the $(n-1)$-dimensional boundaries of $F$. The recent discovery of special functions [5, 10, 11, 16, 19, 23] makes the realization of this idea easy and straightforward in any dimension. The new functions, called 'multidimensional hybrids', satisfy the Dirichlet boundary condition on some parts of the boundary of $F$ and the Neumann condition on the remaining ones. The methods used in the paper are the standard method of separation of variables for differential equations (see for example [15, 18]) and the branching rule method for orbits of reflection groups (see for example [19, 24, 28]). Boundary value conditions play an important role in mathematics and physics; they are used, for example, in the theory of elasticity, electrostatics and fluid mechanics [4, 9, 26].

In § 2, we present the well-known Helmholtz equation and three types of boundary conditions. In § 3, we recall some facts about finite reflection groups. The next section is devoted to special functions, projection matrices and branching rules. In § 5 we present the 3D cases in detail, namely $B_3$, $C_3$, $C_2 \times A_1$, $G_2 \times A_1$, $A_1 \times A_1 \times A_1$. In the appendix we list tables containing the values of the functions on the boundaries of the fundamental region.

2. Helmholtz equation and boundary conditions

In this paper we consider the partial differential equation called the homogeneous Helmholtz equation [15, 18, 25, and references therein]:
$$\Delta\psi(x) = -w^2\psi(x), \tag{1}$$
where $w$ is a positive real constant, $x = (y_1, \ldots, y_n)$ is given in Cartesian coordinates and $\Delta = \sum_{i=1}^{n} \frac{\partial^2}{\partial y_i^2}$.

Using the standard method of separation of variables for (1) (see for example [15]) and searching for the solutions in the form $\psi(x) = X_1(y_1)\cdots X_n(y_n)$, we obtain the differential equation
$$X_1''X_2\cdots X_n + X_1X_2''\cdots X_n + \cdots + X_1X_2\cdots X_n'' + w^2 X_1X_2\cdots X_n = 0. \tag{2}$$
By introducing the so-called separation constants $-k_1^2, \ldots, -k_n^2$, we get the solutions of (2) in the form
$$X_i^1(y_i) = \cos(k_i y_i), \qquad X_i^2(y_i) = \sin(k_i y_i), \qquad i = 1, \ldots, n, \tag{3}$$
where $k_n := \sqrt{w^2 - \sum_{i=1}^{n-1} k_i^2}$ and $k_i \neq 0$ for $i = 1, \ldots, n$. The way of choosing the separation constants is not unique. In this paper $k_i$, $i = 1, \ldots, n-1$, are selected according to a branching rule method [19, 24, 28]; see the next sections.

Three types of boundary conditions are considered.

D: A Dirichlet boundary condition defines the value of the function itself,
$$\psi(x) = f(x), \quad \text{for } x \in \partial F,$$
where $f(x)$ is a given function defined on the boundary.

N: A Neumann boundary condition defines the value of the normal derivative of the function,
$$\frac{\partial\psi}{\partial n}(x) = f(x), \quad \text{for } x \in \partial F,$$
where $n$ denotes the normal vector to the boundary $\partial F$.

M: A mixed boundary condition defines the value of the function itself on one part of the boundary and the value of the normal derivative of the function on the other part of the boundary:
$$\text{D: } \psi|_{\partial F_0} = f_0, \qquad \text{N: } \frac{\partial\psi}{\partial n}\Big|_{\partial F_1} = f_1,$$
where $\partial F = \partial F_0 \cup \partial F_1$ and $f_0, f_1$ are given functions defined on the appropriate boundaries.
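A quick symbolic check of the separation ansatz (2)–(3), here for n = 3 and one choice of the factor solutions, can be done with the sympy library; this is an illustrative sketch, not part of the paper.

```python
# Verify that a product solution with k3 = sqrt(w^2 - k1^2 - k2^2)
# satisfies the Helmholtz equation (1) in three dimensions.
import sympy as sp

y1, y2, y3, k1, k2, w = sp.symbols('y1 y2 y3 k1 k2 w', positive=True)
k3 = sp.sqrt(w**2 - k1**2 - k2**2)

psi = sp.cos(k1*y1) * sp.sin(k2*y2) * sp.cos(k3*y3)
laplacian = sum(sp.diff(psi, y, 2) for y in (y1, y2, y3))

print(sp.simplify(laplacian + w**2 * psi))  # prints 0
```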
3. Finite reflection groups

Our method is general and can be presented for any crystallographic finite reflection group $G$ of any rank and any dimension, associated with simple and semisimple Lie algebras/groups [1, 7, 10, 27]. There is a complete classification of finite reflection groups given by Dynkin diagrams [2, 3, 10]. These graphs provide the relative angles and relative lengths of the vectors of a set of simple roots of the root system. There are two kinds of root systems according to the number of different root lengths: systems with one root length, and systems with two root lengths. A reflection in a hyperplane orthogonal to a long/short root and passing through the origin of $\mathbb{R}^n$ is denoted by $r_l$/$r_s$ respectively.

When working with finite reflection groups, it is convenient to use four bases in $\mathbb{R}^n$, namely the natural $e$-basis, the simple root $\alpha$-basis, the co-root $\check\alpha$-basis and the weight $\omega$-basis [2, 7, 10]. The co-root basis $\check\alpha$ is defined by the formula
$$\check\alpha_i = \frac{2\alpha_i}{\langle\alpha_i|\alpha_i\rangle}.$$
The $\omega$-basis is dual to the simple root basis. The relationship between the considered bases is standard in group theory and is expressed by $\langle\check\alpha_i|\omega_j\rangle = \delta_{ij}$.

There are two types of fundamental region: a simplex for a simple Lie group $G$, or a prism for a semisimple one. The simplex with $n+1$ vertices has the coordinates
$$F = \left\{0, \frac{\omega_1}{q_1}, \ldots, \frac{\omega_n}{q_n}\right\},$$
where $q_i$, $i = 1, \ldots, n$, called co-marks, can be found in [6, 10] for any simple Lie group $G$ of any rank and any dimension. The fundamental region for prisms can be given in the following sense. Let $G = G_1 \times G_2$, where $G_1, G_2$ are finite reflection groups. Let $\omega_1, \ldots, \omega_k$ be a set of generating elements of $G_1$ and $\omega_{k+1}, \ldots, \omega_n$ of $G_2$. Then the prism can be written as
$$F = \left\{0, \frac{\omega_i}{q_i}, \frac{\omega_j}{q_j}, \frac{\omega_i}{q_i} + \frac{\omega_j}{q_j}\right\},$$
where $i = 1, \ldots, k$, $j = k+1, \ldots, n$ and $q_i, q_j$ are co-marks [6, 10].

Let $\partial F_i$ be contained in the hyperplane generated by the set of orthogonal reflections $r_0, r_1, \ldots, r_{i-1}, r_{i+1}, \ldots, r_n$, $i \in \{0, \ldots, n\}$, where $r_0$ is an affine reflection (it corresponds to a long reflection). If $r_i$ corresponds to the reflection orthogonal to a short/long root, then we denote the corresponding part of the boundary by $\partial F_s$ or $\partial F_l$ respectively. In other words, the boundary $\partial F$ of the fundamental region $F$ is denoted by $\partial F_l$/$\partial F_s$ if its normal vector is perpendicular to the long/short root $\alpha$ respectively.

4. Special functions as solutions of the Helmholtz equation

There are four kinds of special functions of interest to us whose orthogonality on the lattice fragment $F$ is known for any simple Lie group [5, 6, 10, 16, 17, 19, 20, 23, and references therein]. The general formula for the special functions (called orbit functions) [10, 11] corresponding to the finite reflection group $G$ is given by
$$\sum_{w \in G} \sigma(w)\, e^{2\pi i\langle w\lambda|x\rangle}, \qquad \lambda \in P^+,\ x \in F, \tag{4}$$
where the summation extends over the whole group $G$, $P^+$ denotes the set of dominant weights [10], and $\sigma(w) = \pm 1$ depends on the type of the orbit function. The homomorphism $\sigma: G \to \{\pm 1\}$ is a product of $\sigma(r_l), \sigma(r_s) \in \{\pm 1\}$. There are four types of maps $\sigma$ [16, 17]:
$$\begin{aligned} \sigma(r_l) = \sigma(r_s) = 1 &\implies C, \\ \sigma(r_l) = \sigma(r_s) = -1 &\implies S, \\ \sigma(r_l) = -1,\ \sigma(r_s) = 1 &\implies S^l, \\ \sigma(r_l) = 1,\ \sigma(r_s) = -1 &\implies S^s. \end{aligned} \tag{5}$$
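Definition (4) can be sketched generically in code: given the group elements as matrices together with their signs σ(w), the orbit function is the signed sum of exponentials. The toy demonstration below uses the group A1, whose two elements {1, −1} reproduce 2 cos(2πμx) and 2i sin(2πμx); the function and variable names are ours, not from the paper.

```python
# Generic orbit function, eq. (4), for a group given by explicit matrices.
import numpy as np

def orbit_function(group, signs, lam, x):
    """Sum of sigma(w) * exp(2 pi i <w lam | x>) over the group elements."""
    return sum(s * np.exp(2j * np.pi * np.dot(w @ lam, x))
               for w, s in zip(group, signs))

A1 = [np.array([[1.0]]), np.array([[-1.0]])]   # identity and the reflection
lam, x = np.array([0.7]), np.array([0.3])

C = orbit_function(A1, [1, 1], lam, x)    # sigma = 1 on both elements
S = orbit_function(A1, [1, -1], lam, x)   # sigma(r) = -1 on the reflection

assert np.isclose(C, 2*np.cos(2*np.pi*0.7*0.3))
assert np.isclose(S, 2j*np.sin(2*np.pi*0.7*0.3))
print(C, S)
```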
All four families of functions defined above are formed as finite sums of exponential terms. The first two families, namely the C- and S-functions, are generalized cosine and sine functions. They are symmetric and skew-symmetric, respectively, with respect to the finite reflection group [6, 10, 16, 19–21, 23]. The other two, the $S^s$- and $S^l$-functions [11, 12, 16, 17, 23], have properties analogous to the C- and S-functions. The main difference between them is their behaviour at the boundary of their domain of orthogonality in $\mathbb{R}^n$.

Every finite group $G$ generated by reflections can be reduced to a subgroup $A_1 \times \cdots \times A_1$ using a branching rule method described in [13, 14, 19, 22, 24, 28]. This method allows us to perform the separation of variables for the special functions (5) corresponding to the group $G$. As a result, we have all the functions written as products of sine and cosine functions.

Remark 1. All four families of functions (5) presented above are solutions of the Helmholtz equation (1), where $w^2 = 4\pi^2\langle\lambda|\lambda\rangle$, with one of the three types of boundary conditions described in § 2.

The projection matrix reduces any $n$-dimensional group $G$ to a subgroup $A_1 \times \cdots \times A_1$ [13, 19]. The branching rule allows one to divide any orbit of the group $G$ into a union of orbits of the group $A_1$. As an example, see the 3D cases described in § 5.

Remark 2. The union of orbits obtained after the reduction determines our choice of the separation constants used in the solution of the Helmholtz equation (1).

The behaviour of the functions C, S, $S^s$ and $S^l$ on the boundary $\partial F$ can be summarized as in Table 1.

Table 1. Behaviour of the functions C, S, Ss and Sl on the boundary ∂F for any finite reflection group G, where ∗ denotes a function non-equivalent to 0.

                 D               N
             ∂Fs    ∂Fl      ∂Fs    ∂Fl
  Cλ(x)       ∗      ∗        0      0
  Sλ(x)       0      0        ∗      ∗
  Ssλ(x)      0      ∗        ∗      0
  Slλ(x)      ∗      0        0      ∗

For any group $G$ considered in the paper, the C-functions fulfil the Dirichlet condition with a value non-equivalent to 0 and the Neumann condition with the value 0 on the whole boundary. The S-functions behave inversely. The $S^s$-functions fulfil the Dirichlet condition with a value non-equivalent to 0 on the part of the boundary denoted by $\partial F_l$ and the Neumann condition with a value non-equivalent to 0 on the part of the boundary denoted by $\partial F_s$. The $S^l$-functions behave inversely. In the case of the C-functions we thus speak of a Dirichlet boundary condition, for the S-functions of a Neumann boundary condition, and for the $S^s$- and $S^l$-functions of a mixed boundary condition. In the next section we present the 3D cases in detail.

5. 3D finite reflection groups

The 3-dimensional groups considered here are $B_3$, $C_3$, $C_2 \times A_1$, $G_2 \times A_1$, $A_1 \times A_1 \times A_1$ [2, 7, 8, 10]. We use the following notation for the coordinates:
$$\mathbb{R}^3 \ni \lambda = (a,b,c)_\omega = a\omega_1 + b\omega_2 + c\omega_3, \qquad \mathbb{R}^3 \ni x = (x_1,x_2,x_3)_{\check\alpha} = (y_1,y_2,y_3)_e,$$
where the indices $e$, $\omega$ and $\check\alpha$ denote the natural, $\omega$- and $\check\alpha$-basis, respectively. The action of the Laplace operator on functions given in the different bases can be found in [10]. In the following subsections we describe each case in detail. For each case we present the functions which are the solutions of the Helmholtz equation (1). We give the exact forms of the projection matrices and the branching rules, which allow us to choose the separation constants used in (3). All the functions described below fulfil one of the three types of boundary conditions described in § 2.

5.1. B3 and C3 groups

The α-basis vectors in Cartesian coordinates are
$$B_3:\quad \alpha_1 = (1,-1,0)_e,\quad \alpha_2 = (0,1,-1)_e,\quad \alpha_3 = (0,0,1)_e;$$
$$C_3:\quad \alpha_1 = \tfrac{1}{\sqrt2}(1,-1,0)_e,\quad \alpha_2 = \tfrac{1}{\sqrt2}(0,1,-1)_e,\quad \alpha_3 = \tfrac{1}{\sqrt2}(0,0,2)_e.$$
As one can easily notice, the short root for $B_3$ is $\alpha_3$, while for $C_3$ the short roots are $\alpha_1, \alpha_2$. The fundamental regions $F$ for the $B_3$ and $C_3$ groups, written in the ω-basis, have the vertices
$$F_{B_3} = \{0, \omega_1, \tfrac12\omega_2, \omega_3\}, \qquad F_{C_3} = \{0, \omega_1, \omega_2, \omega_3\},$$
and are shown in Figure 1.

Figure 1. The fundamental region F for the B3 and C3 groups.

The reduction of $B_3$ and $C_3$ to a subgroup $A_1 \times A_1 \times A_1$ is given by the projection matrices
$$P_{B_3} = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 1 & 1 \\ 0 & 2 & 1 \end{pmatrix}, \qquad P_{C_3} = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}.$$
The branching rules are then the following:
$$O_{(a,b,c)} \xrightarrow{P_{B_3}} O_{(2a+2b+c)}O_{(2b+c)}O_{(c)} \cup O_{(2b+c)}O_{(2a+2b+c)}O_{(c)} \cup O_{(2a+2b+c)}O_{(c)}O_{(2b+c)} \cup O_{(c)}O_{(2a+2b+c)}O_{(2b+c)} \cup O_{(2b+c)}O_{(c)}O_{(2a+2b+c)} \cup O_{(c)}O_{(2b+c)}O_{(2a+2b+c)},$$
$$O_{(a,b,c)} \xrightarrow{P_{C_3}} O_{(a+b+c)}O_{(b+c)}O_{(c)} \cup O_{(b+c)}O_{(a+b+c)}O_{(c)} \cup O_{(a+b+c)}O_{(c)}O_{(b+c)} \cup O_{(b+c)}O_{(c)}O_{(a+b+c)} \cup O_{(c)}O_{(a+b+c)}O_{(b+c)} \cup O_{(c)}O_{(b+c)}O_{(a+b+c)}.$$
as one can easily notice, the short root of B_3 is α_3, while for C_3 the short roots are α_1, α_2. the fundamental regions F for the B_3 and C_3 groups, written in the ω-basis, have the vertices

F_{B_3} = {0, ω_1, ½ω_2, ω_3},   F_{C_3} = {0, ω_1, ω_2, ω_3},

and are shown in fig. 1.

figure 1. the fundamental region F for the B_3 and C_3 groups.

the reduction of B_3 and C_3 to the subgroup A_1 × A_1 × A_1 is given by the projection matrices

P_{B_3} = (1 1 0; 1 1 1; 0 2 1),   P_{C_3} = (1 1 1; 0 1 1; 0 0 1).

then the branching rules are the following:

O(a,b,c) →(P_{B_3}) O(2a+2b+c)O(2b+c)O(c) ∪ O(2b+c)O(2a+2b+c)O(c) ∪ O(2a+2b+c)O(c)O(2b+c)
  ∪ O(c)O(2a+2b+c)O(2b+c) ∪ O(2b+c)O(c)O(2a+2b+c) ∪ O(c)O(2b+c)O(2a+2b+c),

O(a,b,c) →(P_{C_3}) O(a+b+c)O(b+c)O(c) ∪ O(b+c)O(a+b+c)O(c) ∪ O(a+b+c)O(c)O(b+c)
  ∪ O(b+c)O(c)O(a+b+c) ∪ O(c)O(a+b+c)O(b+c) ∪ O(c)O(b+c)O(a+b+c).

according to remarks 1 and 2, the separation constants for the B_3 group can be chosen as

−k_1² = −π²(2a+2b+c)²,  −k_2² = −π²(2b+c)²,  −k_3² = −π²c²,   (6)

where w² = 4π²(a² + 2ab + 2b² + ac + 2bc + ¾c²). the separation constants for the C_3 group are

−k_1² = −π²(a+b+c)²,  −k_2² = −π²(b+c)²,  −k_3² = −π²c²,   (7)

where w² = 4π²(½a² + ab + b² + ac + 2bc + (3/2)c²).

the explicit forms of the orbit functions for the B_3 and C_3 groups are

B_3:
C_{a,b,c}(x) = c_{2a+2b+c}(x_1)c_{2b+c}(x_2)c_c(x_3) + c_{2b+c}(x_1)c_{2a+2b+c}(x_2)c_c(x_3) + c_{2a+2b+c}(x_1)c_c(x_2)c_{2b+c}(x_3) + c_c(x_1)c_{2a+2b+c}(x_2)c_{2b+c}(x_3) + c_{2b+c}(x_1)c_c(x_2)c_{2a+2b+c}(x_3) + c_c(x_1)c_{2b+c}(x_2)c_{2a+2b+c}(x_3),

S^L_{a,b,c}(x) = c_{2a+2b+c}(x_1)c_{2b+c}(x_2)c_c(x_3) − c_{2b+c}(x_1)c_{2a+2b+c}(x_2)c_c(x_3) − c_{2a+2b+c}(x_1)c_c(x_2)c_{2b+c}(x_3) + c_c(x_1)c_{2a+2b+c}(x_2)c_{2b+c}(x_3) + c_{2b+c}(x_1)c_c(x_2)c_{2a+2b+c}(x_3) − c_c(x_1)c_{2b+c}(x_2)c_{2a+2b+c}(x_3),

S_{a,b,c}(x) = s_{2a+2b+c}(x_1)s_{2b+c}(x_2)s_c(x_3) − s_{2b+c}(x_1)s_{2a+2b+c}(x_2)s_c(x_3) − s_{2a+2b+c}(x_1)s_c(x_2)s_{2b+c}(x_3) + s_c(x_1)s_{2a+2b+c}(x_2)s_{2b+c}(x_3) + s_{2b+c}(x_1)s_c(x_2)s_{2a+2b+c}(x_3) − s_c(x_1)s_{2b+c}(x_2)s_{2a+2b+c}(x_3),

S^S_{a,b,c}(x) = s_{2a+2b+c}(x_1)s_{2b+c}(x_2)s_c(x_3) + s_{2b+c}(x_1)s_{2a+2b+c}(x_2)s_c(x_3) + s_{2a+2b+c}(x_1)s_c(x_2)s_{2b+c}(x_3) + s_c(x_1)s_{2a+2b+c}(x_2)s_{2b+c}(x_3) + s_{2b+c}(x_1)s_c(x_2)s_{2a+2b+c}(x_3) + s_c(x_1)s_{2b+c}(x_2)s_{2a+2b+c}(x_3);

C_3:
C_{a,b,c}(x) = c_{a+b+c}(x_1)c_{b+c}(x_2)c_c(x_3) + c_{b+c}(x_1)c_{a+b+c}(x_2)c_c(x_3) + c_{a+b+c}(x_1)c_c(x_2)c_{b+c}(x_3) + c_c(x_1)c_{a+b+c}(x_2)c_{b+c}(x_3) + c_{b+c}(x_1)c_c(x_2)c_{a+b+c}(x_3) + c_c(x_1)c_{b+c}(x_2)c_{a+b+c}(x_3),

S^S_{a,b,c}(x) = c_{a+b+c}(x_1)c_{b+c}(x_2)c_c(x_3) − c_{b+c}(x_1)c_{a+b+c}(x_2)c_c(x_3) − c_{a+b+c}(x_1)c_c(x_2)c_{b+c}(x_3) + c_c(x_1)c_{a+b+c}(x_2)c_{b+c}(x_3) + c_{b+c}(x_1)c_c(x_2)c_{a+b+c}(x_3) − c_c(x_1)c_{b+c}(x_2)c_{a+b+c}(x_3),

S_{a,b,c}(x) = s_{a+b+c}(x_1)s_{b+c}(x_2)s_c(x_3) − s_{b+c}(x_1)s_{a+b+c}(x_2)s_c(x_3) − s_{a+b+c}(x_1)s_c(x_2)s_{b+c}(x_3) + s_c(x_1)s_{a+b+c}(x_2)s_{b+c}(x_3) + s_{b+c}(x_1)s_c(x_2)s_{a+b+c}(x_3) − s_c(x_1)s_{b+c}(x_2)s_{a+b+c}(x_3),

S^L_{a,b,c}(x) = s_{a+b+c}(x_1)s_{b+c}(x_2)s_c(x_3) + s_{b+c}(x_1)s_{a+b+c}(x_2)s_c(x_3) + s_{a+b+c}(x_1)s_c(x_2)s_{b+c}(x_3) + s_c(x_1)s_{a+b+c}(x_2)s_{b+c}(x_3) + s_{b+c}(x_1)s_c(x_2)s_{a+b+c}(x_3) + s_c(x_1)s_{b+c}(x_2)s_{a+b+c}(x_3).

the functions on the right-hand sides of the above equations are the special functions corresponding to the group A_1,

c_μ(x_i) = 2 cos(2πμx_i),   s_μ(x_i) = 2i sin(2πμx_i),

i.e., the sums (4) taken over the group A_1, where μ ∈ P^+_{A_1}, x_i ∈ F_{A_1}, i = 1, 2, 3. the coordinate x_i corresponds to the i-th coordinate in A_1 × A_1 × A_1. the functions c_μ(x_i) and s_μ(x_i) are the solutions of the helmholtz equation (1) in the form (3) in the 1d case. for the group B_3 the functions C and S^L are real-valued and S and S^S are purely imaginary. in the case of C_3, the functions C and S^S are real-valued and S and S^L are purely imaginary.

the normal vectors shown in fig. 2 are

B_3: n_1 = (−1/√2, 1/√2, 0), n_2 = (0, −1/√2, 1/√2), n_3 = (0, 0, −1), n_4 = (1/√2, 1/√2, 0);
C_3: n_1 = (0, 0, 1), n_2 = (0, 1/√2, −1/√2), n_3 = (1/√2, −1/√2, 0), n_4 = (1, 0, 0).
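the six-term product formulas above lend themselves to a direct numerical check. the sketch below (python with numpy; all names are ours) builds the four B_3 families as signed sums over the permutations of (2a+2b+c, 2b+c, c) and verifies the claimed symmetry: C and S^S are invariant under a swap of two coordinates, while S and S^L change sign. this is an illustration only, under the assumption that the arguments x_1, x_2, x_3 are the coordinates in which the product formulas are written.

```python
from itertools import permutations
import numpy as np

def c1(mu, x):       # one-dimensional a1 c-function: 2 cos(2 pi mu x)
    return 2 * np.cos(2 * np.pi * mu * x)

def s1(mu, x):       # one-dimensional a1 s-function: 2i sin(2 pi mu x)
    return 2j * np.sin(2 * np.pi * mu * x)

def sign(perm):      # sign of a permutation given as a tuple of indices
    s, p = 1, list(perm)
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]
            s = -s
    return s

def orbit_b3(kind, a, b, c, x):
    """c-, s-, sl- or ss-function of b3 at x = (x1, x2, x3)."""
    mus = (2*a + 2*b + c, 2*b + c, c)
    f = c1 if kind in ('c', 'sl') else s1    # c, sl use cosines; s, ss sines
    signed = kind in ('s', 'sl')             # s, sl carry permutation signs
    total = 0
    for perm in permutations(range(3)):
        term = np.prod([f(mus[perm[i]], x[i]) for i in range(3)])
        total += (sign(perm) if signed else 1) * term
    return total

x = np.array([0.13, 0.27, 0.41])
for kind in ('c', 's', 'sl', 'ss'):
    v = orbit_b3(kind, 1, 2, 1, x)
    w = orbit_b3(kind, 1, 2, 1, x[[1, 0, 2]])    # swap x1 and x2
    print(kind, np.isclose(w, -v if kind in ('s', 'sl') else v))
```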
figure 2. normal vectors for B_3 and C_3.

in the case of the B_3 group the normal vector n_4 is perpendicular to the short simple root; the rest of them, namely n_1, n_2, n_3, are perpendicular to the long simple roots. hence the boundaries corresponding to the normal vectors are ∂F_s for n_4 and ∂F_l for the others. the values of the functions on the boundaries are summarized in the appendix in tab. 2. in the case of C_3 the situation is slightly different: the normal vectors n_2, n_3 are perpendicular to the short simple roots and n_1, n_4 to the long simple roots. the values of the functions on the boundaries are given in the appendix in tab. 3.

5.2. C_2 × A_1 and G_2 × A_1 groups

the α-basis vectors in cartesian coordinates have the form

C_2 × A_1: α_1 := (1/√2)(1, −1, 0)_e,  α_2 := (1/√2)(0, 2, 0)_e,  α_3 := (1/√2)(0, 0, 2)_e;
G_2 × A_1: α_1 := (√2, 0, 0)_e,  α_2 := (−1/√2, 1/√6, 0)_e,  α_3 := (1/√2)(0, 0, 2)_e.

the vertices of the fundamental regions F for the C_2 × A_1 and G_2 × A_1 groups, shown in fig. 3 and written in the ω-basis, are

F_{C_2×A_1} = {0, ω_1, ω_2, ω_3, ω_1 + ω_3, ω_2 + ω_3},
F_{G_2×A_1} = {0, ½ω_1, ω_2, ω_3, ½ω_1 + ω_3, ω_2 + ω_3}.

figure 3. the fundamental region F for the C_2 × A_1 and G_2 × A_1 groups.

the groups C_2 × A_1, G_2 × A_1 can be reduced to the subgroup A_1 × A_1 × A_1 using the branching rule method described in [19, 24]. the projection matrices and the branching rules are

P_{C_2×A_1} = (1 1 0; 0 1 0; 0 0 1),   P_{G_2×A_1} = (1 1 0; 3 1 0; 0 0 1),

O(a,b)O(c) →(P_{C_2×A_1}) O(a+b)O(b)O(c) ∪ O(b)O(a+b)O(c),
O(a,b)O(c) →(P_{G_2×A_1}) O(a+b)O(3a+b)O(c) ∪ O(2a+b)O(b)O(c) ∪ O(a)O(3a+2b)O(c).

the separation constants for C_2 × A_1 are

−k_1² = −π²(a+b)²,  −k_2² = −π²b²,  −k_3² = −π²c²,   (8)

where w² = 4π²(a² + ab + b² + ½c²), and for G_2 × A_1 they equal

−k_1² = −2π²(2a+b)²,  −k_2² = −(2/3)π²b²,  −k_3² = −2π²c²,
−l_1² = −2π²(a+b)²,  −l_2² = −(2/3)π²(3a+b)²,  −l_3² = −2π²c²,
−m_1² = −2π²a²,  −m_2² = −(2/3)π²(3a+2b)²,  −m_3² = −2π²c²,   (9)

where w² = 4π²(2a² + 2ab + (2/3)b² + ½c²).

the explicit forms of the orbit functions are

C_2 × A_1:
C_{a,b,c}(x) = c_{a+b}(x_1)c_b(x_2)c_c(x_3) + c_b(x_1)c_{a+b}(x_2)c_c(x_3),
S^S_{a,b,c}(x) = c_{a+b}(x_1)c_b(x_2)c_c(x_3) − c_b(x_1)c_{a+b}(x_2)c_c(x_3),
S_{a,b,c}(x) = s_{a+b}(x_1)s_b(x_2)s_c(x_3) − s_b(x_1)s_{a+b}(x_2)s_c(x_3),
S^L_{a,b,c}(x) = s_{a+b}(x_1)s_b(x_2)s_c(x_3) + s_b(x_1)s_{a+b}(x_2)s_c(x_3);

G_2 × A_1:
C_{a,b,c}(x) = c_a(x_1)c_{3a+2b}(x_2)c_c(x_3) + c_{a+b}(x_1)c_{3a+b}(x_2)c_c(x_3) + c_{2a+b}(x_1)c_b(x_2)c_c(x_3),
S^L_{a,b,c}(x) = s_a(x_1)c_{3a+2b}(x_2)s_c(x_3) − s_{a+b}(x_1)c_{3a+b}(x_2)s_c(x_3) + s_{2a+b}(x_1)c_b(x_2)s_c(x_3),
S_{a,b,c}(x) = s_a(x_1)s_{3a+2b}(x_2)s_c(x_3) − s_{a+b}(x_1)s_{3a+b}(x_2)s_c(x_3) + s_{2a+b}(x_1)s_b(x_2)s_c(x_3),
S^S_{a,b,c}(x) = c_a(x_1)s_{3a+2b}(x_2)c_c(x_3) − c_{a+b}(x_1)s_{3a+b}(x_2)c_c(x_3) − c_{2a+b}(x_1)s_b(x_2)c_c(x_3),

where c_μ(x_i), s_μ(x_i) for i = 1, 2, 3 are the same as in the previous cases. for the group C_2 × A_1 the functions C and S^S are real-valued and S and S^L are purely imaginary. in the case of G_2 × A_1, the functions C and S^L are real-valued and S and S^S are purely imaginary.

figure 4. normal vectors of F for the C_2 × A_1 and G_2 × A_1 groups.

the normal vectors shown in fig. 4 are

C_2 × A_1: n_1 = (0, 0, −1), n_2 = (0, −1, 0), n_3 = (−1/√2, 1/√2, 0), n_4 = (1, 0, 0), n_5 = (0, 0, 1);
G_2 × A_1: n_1 = (0, 0, −1), n_2 = (√3/2, −1/2, 0), n_3 = (−1, 0, 0), n_4 = (1/2, √3/2, 0), n_5 = (0, 0, 1).

in the case of the C_2 × A_1 group the normal vector n_3 is perpendicular to the short simple root; the rest of them, namely n_1, n_2, n_4, n_5, are perpendicular to the long simple roots. hence the boundaries corresponding to the normal vectors are ∂F_s for n_3 and ∂F_l for the others.
the values of the functions on the boundaries are summarized in the appendix in tab. 4. in the case of G_2 × A_1, the normal vector n_2 corresponds to the short simple root, and hence to the boundary ∂F_s, and the remaining normal vectors to the long simple roots, i.e. to the boundaries ∂F_l. the values of the functions on the boundaries are given in the appendix in tab. 5.

5.3. A_1 × A_1 × A_1 group

although the root system of A_1 × A_1 × A_1 does not have two different lengths of roots, it is still an interesting case for us. the α-basis vectors in cartesian coordinates have the form

α_1 := (√2, 0, 0)_e,  α_2 := (0, √2, 0)_e,  α_3 := (0, 0, √2)_e.

according to (4) and (5) there are two families of special functions, C and S. by analogy with the homomorphism (5) we can define new families of functions:

σ(r_1) = σ(r_2) = σ(r_3) = 1 ⟹ ccc,
σ(r_1) = σ(r_2) = σ(r_3) = −1 ⟹ sss,
σ(r_1) = σ(r_2) = 1, σ(r_3) = −1 ⟹ ccs,
σ(r_1) = σ(r_2) = −1, σ(r_3) = 1 ⟹ ssc,
σ(r_1) = σ(r_3) = 1, σ(r_2) = −1 ⟹ csc,
σ(r_1) = −1, σ(r_2) = σ(r_3) = 1 ⟹ scc,
σ(r_1) = 1, σ(r_2) = σ(r_3) = −1 ⟹ css,
σ(r_1) = σ(r_3) = −1, σ(r_2) = 1 ⟹ scs,

where ccc and sss correspond to the C- and S-functions, respectively, and the rest of them to the S^L- and S^S-functions. all families of functions defined on the fundamental region

F_{A_1×A_1×A_1} = {0, ω_1, ω_2, ω_3, ω_1 + ω_2, ω_1 + ω_3, ω_2 + ω_3, ω_1 + ω_2 + ω_3}

fulfil a mixed boundary condition (see tab. 6).

figure 5. the fundamental region F with normal vectors of the A_1 × A_1 × A_1 group.

the projection matrix is the identity matrix, and the choice of separation constants is then trivial:

−k_1² = −π²a²,  −k_2² = −π²b²,  −k_3² = −π²c².   (10)

according to the branching rule O(a,b,c) → O(a)O(b)O(c) we have

ccc_{a,b,c}(x) := c_a(x_1)c_b(x_2)c_c(x_3),   scs_{a,b,c}(x) := s_a(x_1)c_b(x_2)s_c(x_3),
css_{a,b,c}(x) := c_a(x_1)s_b(x_2)s_c(x_3),   ssc_{a,b,c}(x) := s_a(x_1)s_b(x_2)c_c(x_3),
sss_{a,b,c}(x) := s_a(x_1)s_b(x_2)s_c(x_3),   csc_{a,b,c}(x) := c_a(x_1)s_b(x_2)c_c(x_3),
ccs_{a,b,c}(x) := c_a(x_1)c_b(x_2)s_c(x_3),   scc_{a,b,c}(x) := s_a(x_1)c_b(x_2)c_c(x_3),

where c_μ(x_i), s_μ(x_i) for i = 1, 2, 3 are the same as in the previous cases. the first four families of functions are real-valued and the rest are purely imaginary. the normal vectors shown in fig. 5 are

n_1 = (0, 0, −1), n_2 = (0, −1, 0), n_3 = (−1, 0, 0), n_4 = (0, 1, 0), n_5 = (1, 0, 0), n_6 = (0, 0, 1).

the values of the functions on the boundaries are shown in the appendix in tab. 6.
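the boundary pattern of tab. 6 is easy to observe numerically. a small sketch (python with numpy; the helper name family is ours) builds the families above as products of the one-dimensional functions c_μ = 2cos(2πμx) and s_μ = 2i sin(2πμx) and checks the face x_3 = 0: sss vanishes there (a zero dirichlet value), while the normal derivative of ccc vanishes (a zero neumann value).

```python
import numpy as np

def family(pattern, lam, x):
    """pattern such as 'ccs' selects a c- or s-factor per coordinate."""
    out = 1.0 + 0j
    for p, mu, t in zip(pattern, lam, x):
        out *= 2 * np.cos(2*np.pi*mu*t) if p == 'c' else 2j * np.sin(2*np.pi*mu*t)
    return out

lam, x, y, h = (1, 2, 1), 0.23, 0.37, 1e-6
print(abs(family('sss', lam, (x, y, 0.0))))                        # 0.0
dz = (family('ccc', lam, (x, y, h)) - family('ccc', lam, (x, y, -h))) / (2*h)
print(abs(dz) < 1e-6)                                              # True
```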
6. appendix

in tables 2–6 we collect the values of the special functions on the boundaries of the fundamental region F for each of the 3d finite reflection groups presented in the paper.

references

[1] borel, a., de siebenthal, j.: les sous-groupes fermés de rang maximum des groupes de lie clos. comment. math. helv. 23 (1949), 200–221.
[2] bourbaki, n.: groupes et algèbres de lie, chapters iv, v, vi. hermann, paris, 1968.
[3] dynkin, e. b.: semisimple subalgebras of semisimple lie algebras. ams translations, series 2, vol. 6 (1957), 111–244.
[4] griffiths, d. j.: introduction to electrodynamics. prentice hall, new jersey, 1999.
[5] háková, l., hrivnák, j., patera, j.: four families of weyl group orbit functions of b3 and c3. j. math. phys. 54 (2013), 083501.
[6] hrivnák, j., patera, j.: on discretization of tori of compact simple lie groups. j. phys. a: math. theor. 42 (2009), 385208; arxiv:0905.2395.
[7] humphreys, j. e.: introduction to lie algebras and representation theory. springer, new york, 1972.
[8] humphreys, j. e.: reflection groups and coxeter groups. cambridge univ. press, cambridge, 1990.
[9] jung, c.: an exactly soluble three-body problem in one dimension. can. j. phys. 58 (1980), 719–728.
[10] klimyk, a., patera, j.: orbit functions. sigma (symmetry, integrability and geometry: methods and applications) 2 (2006), 006, 60 pages; math-ph/0601037.
[11] klimyk, a., patera, j.: antisymmetric orbit functions. sigma 3 (2007), 023, 83 pages; math-ph/0702040v1.
[12] lemire, f. w., patera, j., szajewska, m.: dominant weight multiplicities in hybrid characters of bn, cn, f4, g2. internat. j. theoret. phys. 54 (11) (2015), 4011–4026.
[13] mckay, w. g., patera, j.: tables of dimensions, indices, and branching rules for representations of simple lie algebras. marcel dekker, new york, 1981.
[14] mckay, w. g., patera, j., sankoff, d.: the computation of branching rules for representations of semisimple lie algebras. in: computers in nonassociative rings and algebras (eds. j. beck, b. kolman), academic press, new york, 1977.
[15] miller, w.: symmetry and separation of variables. encyclopedia of mathematics and its applications 4, addison-wesley, reading, mass., 1977.
[16] moody, r. v., motlochová, l., patera, j.: gaussian cubature arising from hybrid characters of simple lie groups. j. fourier analysis and its applications (2014), 23 pp.; arxiv:1202.4415, doi:10.1007/s00041-014-9355-0.
[17] moody, r. v., patera, j.: characters of elements of finite order in simple lie groups. siam j. on algebraic and discrete methods 5 (1984), 359–383.
[18] moon, p., spencer, d. e.: field theory handbook, including coordinate systems, differential equations, and their solutions. 2nd ed., springer-verlag, new york, 1988.
[19] nesterenko, m., patera, j., szajewska, m., tereszkiewicz, a.: orthogonal polynomials of compact simple lie groups: branching rules for polynomials. j. phys. a: math. theor. 43 (2010), 495207, 1–27.
[20] nesterenko, m., patera, j., tereszkiewicz, a.: orthogonal polynomials of compact simple lie groups. int. j. math. math. sci. (2011), 969424, 1–23.
[21] patera, j.: compact simple lie groups and their c-, s-, and e-transforms. sigma 1 (2005), 025, 6 pages; math-ph/0512029.
[22] patera, j., sankoff, d.: branching rules for representations of simple lie algebras. presses de l'université de montréal, montréal, 1973, 99 pages.
[23] szajewska, m.: four types of special functions of g2 and their discretization. integral transform. spec. funct. 23 (6) (2012), 455–472.
[24] szajewska, m., tereszkiewicz, a.: two-dimensional hybrids with mixed boundary value problems. acta polytechnica 56 (3) (2016), 245–253.
[25] tikhonov, a. n., samarskii, a. a.: equations of mathematical physics. dover, new york, 1990.
[26] timoshenko, s., goodier, j. n.: theory of elasticity. mcgraw-hill, new york, 1961.
[27] vinberg, e. b., onishchik, a. l.: lie groups and lie algebras. springer, new york, 1994.
[28] whippman, m. l.: branching rules for simple lie groups. j. math. phys. 6 (1965), 1534.
table 2. the values of the C-, S-, S^L- and S^S-functions on the boundaries of the fundamental region F of B_3; the separation constants k_i, i = 1, 2, 3, are given by (6) in § 5.1. (the table body, listing the dirichlet and neumann values on the faces F_1–F_4 as sums of products of the one-dimensional functions c_μ and s_μ, is not reproduced here.)

table 3. the values of the C-, S-, S^L- and S^S-functions on the boundaries of the fundamental region F of C_3; the separation constants k_i, i = 1, 2, 3, are given by (7) in § 5.1. (table body not reproduced.)

table 4. the values of the C-, S-, S^L- and S^S-functions on the boundaries of the fundamental region F of C_2 × A_1; the separation constants k_i, i = 1, 2, 3, are given by (8) in § 5.2. (table body not reproduced.)
table 5. the values of the C-, S-, S^L- and S^S-functions on the boundaries of the fundamental region F of G_2 × A_1; the separation constants k_i, l_i, m_i, i = 1, 2, 3, are given by (9) in § 5.2. (table body not reproduced.)

F     ccc: d                  ccc: n   sss: d   sss: n
F_1   c_a(x)c_b(y)c_c(0)      0        0        −√2πik_3 s_a(x)s_b(y)c_c(0)
F_2   c_a(x)c_b(0)c_c(z)      0        0        −√2πik_2 s_a(x)c_b(0)s_c(z)
F_3   c_a(0)c_b(y)c_c(z)      0        0        −√2πik_1 c_a(0)s_b(y)s_c(z)
F_4   c_a(x)c_b(1/√2)c_c(z)   0        0        √2πik_2 s_a(x)c_b(1/√2)s_c(z)
F_5   c_a(1/√2)c_b(y)c_c(z)   0        0        √2πik_1 c_a(1/√2)s_b(y)s_c(z)
F_6   c_a(x)c_b(y)c_c(1/√2)   0        0        √2πik_3 s_a(x)s_b(y)c_c(1/√2)

F     ccs: d                  ccs: n                         ssc: d                  ssc: n
F_1   0                       −√2πik_3 c_a(x)c_b(y)c_c(0)    s_a(x)s_b(y)c_c(0)      0
F_2   c_a(x)c_b(0)s_c(z)      0                              0                       −√2πik_2 s_a(x)c_b(0)c_c(z)
F_3   c_a(0)c_b(y)s_c(z)      0                              0                       −√2πik_1 c_a(0)s_b(y)c_c(z)
F_4   c_a(x)c_b(1/√2)s_c(z)   0                              0                       √2πik_2 s_a(x)c_b(1/√2)c_c(z)
F_5   c_a(1/√2)c_b(y)s_c(z)   0                              0                       √2πik_1 c_a(1/√2)s_b(y)c_c(z)
F_6   0                       √2πik_3 c_a(x)c_b(y)c_c(1/√2)  s_a(x)s_b(y)c_c(1/√2)   0

F     csc: d                  csc: n                         scs: d                  scs: n
F_1   c_a(x)s_b(y)c_c(0)      0                              0                       −√2πik_3 s_a(x)c_b(y)c_c(0)
F_2   0                       −√2πik_2 c_a(x)c_b(0)c_c(z)    s_a(x)c_b(0)s_c(z)      0
F_3   c_a(0)s_b(y)c_c(z)      0                              0                       −√2πik_1 c_a(0)c_b(y)s_c(z)
F_4   0                       √2πik_2 c_a(x)c_b(1/√2)c_c(z)  s_a(x)c_b(1/√2)s_c(z)   0
F_5   c_a(1/√2)s_b(y)c_c(z)   0                              0                       √2πik_1 c_a(1/√2)c_b(y)s_c(z)
F_6   c_a(x)s_b(y)c_c(1/√2)   0                              0                       √2πik_3 s_a(x)c_b(y)c_c(1/√2)

F     scc: d                  scc: n                         css: d                  css: n
F_1   s_a(x)c_b(y)c_c(0)      0                              0                       −√2πik_3 c_a(x)s_b(y)c_c(0)
F_2   s_a(x)c_b(0)c_c(z)      0                              0                       −√2πik_2 c_a(x)c_b(0)s_c(z)
F_3   0                       −√2πik_1 c_a(0)c_b(y)c_c(z)    c_a(0)s_b(y)s_c(z)      0
F_4   s_a(x)c_b(1/√2)c_c(z)   0                              0                       √2πik_2 c_a(x)c_b(1/√2)s_c(z)
F_5   0                       √2πik_1 c_a(1/√2)c_b(y)c_c(z)  c_a(1/√2)s_b(y)s_c(z)   0
F_6   s_a(x)c_b(y)c_c(1/√2)   0                              0                       √2πik_3 c_a(x)s_b(y)c_c(1/√2)

table 6. the values of the eight families of functions on the boundaries of the fundamental region F of A_1 × A_1 × A_1 (d and n denote the dirichlet and neumann values on the faces F_1, . . . , F_6). the separation constants k_i, i = 1, 2, 3, are given by (10) in § 5.3.

acta polytechnica 58(6):402–413, 2018

reducing time for the product development process by evaluation in the phase of solution searching

b. jokele, d. k. fuchs

less and less time is available for the product development process. to prevent product failures and the resulting time-intensive and cost-intensive iteration steps, preventive measures must be taken. within the scope of quality management, fmea anticipates possible problems concerning product and process properties. nevertheless, in industrial practice designed products can have failures which were not considered within fmea. the time pressure is immense, and efforts which do not make a contribution to a successful solution are regarded as lost time. this paper introduces a systematic approach to troubleshooting, with the aim of reducing the time for solution searching by considering the feasibility of ideas at an early stage.

keywords: troubleshooting, change management, concurrent use of methods.

1 introduction

due to global competition, industrial companies need to develop competitive products in a shorter time. the methods of integrated product development are aimed at supporting designers in this endeavour [2]. the methods and tools of integrated product development are objects of investigation in design research. the circumstances in an industrial environment, such as time pressure and immense quality requirements, necessitate an effective and efficient approach.

fig. 1 illustrates possible stages in the product development process. the x-axis represents the life cycle of a product. the y-axis represents the requirements which are not yet fulfilled (maximum number at the beginning of a project), or the failures which have occurred. curve 1 shows a trouble-free procedure. the conceptual design of the product is detailed step by step according to the design approach of analysis, searching for solutions, evaluating and selection. before the launch deadline the product meets all criteria required by the customer. curve 2 represents a project which is characterised by some failure. to handle the situation, the results of preventive measures (i.e., fmea) are implemented. curve 3 exemplifies a project disturbed by extensive failures. compared to curve 2, the results of preventive measures are not applicable.

fig. 1: possible stages in the product development process (the phases clarifying the task, conceptual design, embodiment design, detail design, production preparation and product launch on the time axis; unfulfilled requirements or failings on the vertical axis; curves 1–3 as described above)
this situation is well known in industrial practice. despite methods of risk management and quality management, troubleshooting situations caused by product failure occur in the product development process. this can be due to time reasons: it is impossible to consider all possible contingencies within the scope of quality management. as shown in fig. 1, the time for searching for a solution is very short (the worst case would be a serious failure after product launch). therefore there is extremely high pressure to succeed. curve 3 is a case for troubleshooting: an error has occurred which seriously endangers the success of the project.

the basic steps of conventional product development are usually supported by methods and tools. methods for systematic searching (e.g., a list of physical effects) and creativity techniques (e.g., brainstorming, triz) support the design teams in the phase of solution searching. the goal is to create a lot of innovative solutions. afterwards the ideas are evaluated. the criteria for evaluation result from the customer's requirements and the economic and technical feasibility. depending on the level of detail and on the number of solutions, some provisional criteria are considered for evaluation. accordingly, simple or more extensive evaluation methods can be used.

in troubleshooting situations, time spent on creating ideas which are not appropriate for solving the problem is lost. due to the fact that the problem occurs shortly before or after product launch, a characteristic feature of a troubleshooting situation is lack of time. furthermore, engineering change costs rise progressively during product development [2]. the inadequacy of an idea frequently results from lack of feasibility. this can be due to manufacturing tools which would be necessary but are not available. another reason can be a very high level of risk, because the company has no experience with the technical principle of an innovative idea, and appropriate testing tools are lacking.
the established approach and the established methods and tools do not take these boundary conditions into account. therefore, there is a growing need for a systematic approach and an adaptation of methods, aimed at avoiding time- and cost-intensive iterations in troubleshooting phases.

2 approach to troubleshooting situations

evaluating and searching for solutions should not be strictly separated [1]. concurrent searching and evaluating of solutions provides an opportunity to reduce the process time (fig. 2).

fig. 2: reducing time by concurrent searching and evaluating

in order to arrange the solution search more efficiently, the possibilities of modification should already be considered within the solution search. the possibilities of modification are reflected in the degrees of freedom of the product and the degrees of freedom of the process (fig. 3).

fig. 3: general trends relating to design freedom, knowledge about product properties, and engineering change costs (acc. to [2]); curve i: engineering change costs, curve ii: design freedom, curve iii: knowledge about product properties; the goal is an early evaluation of product properties

the aim is to concentrate on generating solutions which can be realised with adequate effort, and which promise
therefore, in a troubleshooting situation the impact of the changes has to be considered. the aim is to solve the problem without causing too many changes. for this purpose the system components have to be analysed concerning their impact and the expected success in solving the problem. the results of this analysis are shown as an example in fig. 5. 5 identifying the degrees of freedom of the process the next step is to identify the process steps which can be modified easily. for this purpose it is necessary to record the company’s manufacturing and assembly capabilities. depending on the level of detail of the product, the modification of some process steps is eliminated. for example: if expensive tools have already been bought, changing the tool would have intensive cost implications. furthermore, it is not advisable to use new process steps without experience. in a troubleshooting situation, the risk of yet another failure has to be reduced. 6 searching solutions by expressing questions degrees of freedom are not just a basis for obtaining criteria for evaluation. the fundamental idea is to use the degrees of freedom systematically in searching for solutions. it seems beneficial to present some aims of a problem solver in terms of simple structures of problems [4]. in the style of triz, a systematic approach is to express terms which simulate the designer to create solutions according to realistic possibilities. on the basis of the earlier steps, the table in fig. 6 lists the degrees of freedom. an appropriate term is expressed : ”try to find a solution by modifying /component x/ by /process-step y/.” 42 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 43 no. 5/2003 fig. 4: structuring of the problem fig. 5: matrix for prioritisation fig. 6: morphological box of degrees of freedom this ensures that the focus in solution searching is on realistic ideas. not the most innovative idea, but the most efficient idea is sought. 7 example: troubleshooting in the automotive industry in cooperation with an automobile supplier the procedure was applied in practice. a new product was derived from an earlier successful product. however, the redesigned product did not maintain the product quality. the main problem was some unwanted acoustic effects, caused by an unexpected gap between two parts of the product. this problem occurred just before the start of production. cost-intensive forming tools had already been installed. the customer was dissatisfied und demanded a quick solution. the first meeting a cause analysis. specialists in technical design, manufacturing and assembly analysed the situation. it was found that the gap was caused by a difference in temperature of the parts. compared to the earlier design, the length of an edge of the redesigned product was longer. due to this, the expansion effect had a greater influence and caused the gap. in the subsequent analysis of degrees of freedom, it became clear that modifications could hardly be realised. geometric modification of the affected parts would have involved costintensive modification of the forming tools. one option which resulted from the degrees of freedom was: ”try to prevent the gap by modifying part a by welding.” the answer to this question produced quite simple solutions which were easy to realise. by welding the two parts, the expansion and thus the gap could be prevented. there was no unwanted acoustic effect anymore. 
the company has a lot of experience in welding, and has the appropriate tools. the customer was satisfied and production started as scheduled. 8 conclusions and/or recommendations a troubleshooting situation is characterized by intensive time pressure. thus, not so much innovative as practicable solutions should be the aim of product development in this situation. in conventional product development procedures the search for a solution is followed by an evaluation of the generated ideas. an important criterion in the evolution phase is the necessary effort, taking into account technical realisation and the cost impacts. the degrees of freedom concerning product and process should be taken into account, leading to the development of suitable solutions. the introduced approach reduces the time for solution searching and feasibility studies, and is appropriate not only in troubleshooting situations but also in ordinary product development process. references [1] wulf, j.: elementarmethoden zur lösungssuche. münchen: dr. hut 2002. (produktentwicklung münchen, bd. 50); zugl. münchen: tu, diss. 2002. [2] stetter, r.: method implementation in integrated product development. münchen: dr. hut 2000. (produktentwicklung münchen, bd. 41); zugl. münchen: tu, diss. 2002. [3] clarkson, j. et al: prediction for product redesign. proceedings of iced 2001. [4] savransky, s.: engineering of creativity. crc press llc 2000, isbn 0-88493-2255-3. dipl. ing. bernd jokele e-mail: jokele@pe.mw.tum.de dipl. ing. daniel-karl fuchs lehrstuhl für produktentwicklung technische universität münchen boltzmannstrasse 15 85748 garching germany © czech technical university publishing house http://ctn.cvut.cz/ap/ 43 acta polytechnica vol. 43 no. 5/2003 acta polytechnica doi:10.14311/ap.2016.56.0462 acta polytechnica 56(6):462–471, 2016 © czech technical university in prague, 2016 available online at http://ojs.cvut.cz/ojs/index.php/ap itineraries induced by exchange of three intervals zuzana masákováa, edita pelantováa, štěpán starostab, ∗ a faculty of nuclear sciences and physical engineering,czech technical university in prague, prague, czech republic b faculty of information technology, czech technical university in prague, prague, czech republic ∗ corresponding author: stepan.starosta@fit.cvut.cz abstract. we focus on a generalization of the three gap theorem well known in the framework of exchange of two intervals. for the case of three intervals, our main result provides an analogue of this result implying that there are at most 5 gaps. to derive this result, we give a detailed description of the return times to a subinterval and the corresponding itineraries. keywords: interval exchange transformation; three gap theorem; return time; first return map. 1. introduction the well-known three gap theorem provides information about gaps between consecutive integers n for which {αn} < β with α,β ∈ (0, 1), where the notation {x} = x−bxc stands for the fractional part of x. in particular, the gaps take at most three values, the longest one being the sum of the other two. this result was first proved by slater [14], but it appeared in different versions and generalizations multiple times since. for a nice overview of this problem we refer to [1]. the reason is that the theorem can be interpreted in the framework of coding with respect to intervals [0,β), [β, 1) of rotation by the angle α on the unit circle. for irrational α and β = 1 −α, such coding gives rise to famous sturmian words. 
the three gap theorem is then intimately connected with the dynamical system associated with codings of rotations, induced transformations, return times and return itineraries. see also [8] for a distinct setting of exchange of two intervals related to the three gap theorem. codings of rotations are advantageously interpreted in the language of interval exchange. the simplest case provides sturmian words as codings of an exchange of two intervals. we follow the study of itineraries induced by an exchange of two intervals, presented in [13], to study the exchange of three intervals.

in this article we focus on codings of a non-degenerate symmetric exchange T : J → J of three intervals. the main result is the description of the return times to a general interval I ⊂ J and an insight into the structure of the set of I-itineraries, i.e., the finite words that are codings of the return trajectories to I (see the definition below). these results are given in theorem 4.1 and then interpreted as analogues of the well-known three gap and three distance theorems (see section 6). particular attention is paid to the special cases when the set of I-itineraries has only three elements. these cases are among the most interesting from the combinatorial point of view, since they provide information about return words to factors, and about the morphisms preserving three interval exchange words. these specific cases are studied in section 5. this allows us to describe the return words to palindromic bispecial factors, which can be seen as a complement to some of the results in [6]. we also focus on substitutions fixing words coding interval exchange transformations. the latter has implications [12] for the question of hof, knill and simon [10] about palindromic substitution invariant words, with application to aperiodic schrödinger operators.

2. preliminaries

2.1. combinatorics on words

a finite word w = w_0 · · · w_{n−1} is a concatenation of letters of a finite alphabet A. the number n of letters in w is the length of w and is denoted by |w|. the set of all finite words over an alphabet A, including the empty word ε, with the operation of concatenation forms a monoid, denoted by A^∗. one considers also infinite words u = u_0u_1u_2 · · · ∈ A^ℕ. if u ∈ A^∗ ∪ A^ℕ and u = vwz for some v, w ∈ A^∗ and z ∈ A^∗ ∪ A^ℕ, we say that v, w, z are factors of u. in particular, v is a prefix and z is a suffix of u. we write wz = v^{−1}u and vw = uz^{−1}.

an infinite word u is said to be aperiodic if it does not have a suffix of the form www · · · . if the infinite word u contains at least two occurrences of each of its finite factors, then it is said to be recurrent. if the distances between consecutive occurrences of every factor are bounded, then u is uniformly recurrent. if w, v are factors of u such that vw is also a factor of u and vw contains w as its prefix and as its suffix, but not anywhere else, then v is a return word to w in u. thus u is uniformly recurrent if and only if each factor w of u has finitely many return words.

the language of an infinite word u is the set of all its finite factors, denoted by L(u). the number of factors of u of length n defines the factor complexity function C_u : ℕ → ℕ. it is known that aperiodic infinite words have complexity C_u(n) ≥ n + 1 for n ∈ ℕ. sturmian words are aperiodic words with minimal factor complexity.
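the notions above are easy to experiment with. the following sketch (python; the helper names sturmian and return_words are ours) generates a prefix of a sturmian word as the coding of the rotation x ↦ {x + α} with respect to the intervals [0, 1 − α) and [1 − α, 1), and lists the return words of a factor; for a sturmian word every factor has exactly two return words, a classical characterization, while the analogous count for non-degenerate 3iet words, used later in § 5.1, is three.

```python
import math

def sturmian(alpha, rho, n):
    """prefix of length n of the coding of the rotation by alpha, intercept rho."""
    w, x = [], rho
    for _ in range(n):
        w.append('a' if x < 1 - alpha else 'b')
        x = (x + alpha) % 1.0
    return ''.join(w)

def return_words(u, w):
    """distinct words between consecutive occurrences of the factor w in u."""
    occ = [i for i in range(len(u) - len(w) + 1) if u[i:i + len(w)] == w]
    return {u[i:j] for i, j in zip(occ, occ[1:])}

u = sturmian((math.sqrt(5) - 1) / 2, 0.0, 2000)
w = u[100:103]                      # a guaranteed factor of u
print(w, return_words(u, w))        # should print exactly two return words
```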
2.2. exchange of three intervals

given a partition of an interval J into a disjoint union of intervals J_1, . . . , J_k, an exchange of k intervals is a bijection determined by a piecewise translation permuting the intervals according to a prescribed permutation π. in this article we consider the following case: let k = 3, 0 < α < β < 1, and let T : [0, 1) → [0, 1) be given by

T(x) = x + 1 − α         if x ∈ [0, α) =: J_A,
T(x) = x + 1 − α − β     if x ∈ [α, β) =: J_B,
T(x) = x − β             if x ∈ [β, 1) =: J_C.   (1)

the transformation T is an exchange of three intervals with the permutation (321), often called a 3iet for short. the orbit {T^n(ρ) : n ∈ ℤ} of a point ρ ∈ [0, 1) can be coded by an infinite word u_ρ = u_0u_1u_2 . . . over the alphabet {A, B, C} given by u_n = X if T^n(ρ) ∈ J_X for X ∈ {A, B, C}. the infinite word u_ρ is called a 3iet word, and the point ρ is called the intercept of u_ρ.

an exchange of intervals satisfies the minimality condition if the orbit of any given ρ ∈ [0, 1) is dense in [0, 1), which amounts to requiring that 1 − α and β be linearly independent over ℚ. then the word u_ρ is aperiodic and uniformly recurrent, and the language of u_ρ does not depend on the intercept ρ. the complexity of the infinite word u_ρ is known to satisfy C_{u_ρ}(n) ≤ 2n + 1 (see [6]). if equality is achieved for every n ∈ ℕ, then the transformation T and the word u_ρ are said to be non-degenerate. a necessary and sufficient condition for a 3iet T to be non-degenerate is that T is minimal and

1 ∉ (1 − α)ℤ + βℤ;   (2)

see [6].

3. itineraries in an exchange of three intervals

definition 3.1. given a subinterval I ⊂ [0, 1), we define the return time to I as the mapping r_I : I → ℤ⁺ = {1, 2, 3, . . .} given by

r_I(x) = min{n ∈ ℤ⁺ : T^n(x) ∈ I}.

the prefix of the word u_x of length r_I(x) coding the orbit of x ∈ I under a 3iet T is called the I-itinerary of x and denoted R_I(x). the set of all I-itineraries is denoted by It_I = {R_I(x) : x ∈ I}. the map T_I : I → I defined by T_I(x) = T^{r_I(x)}(x) is called the first return map of T to I, or the induced map of T on I. the subscripts in r_I or R_I are usually omitted if this causes no confusion.

for a given subinterval I ⊂ [0, 1) there exist at most five I-itineraries under a 3iet T. in particular, from the article of keane [11] one can deduce what the intervals of points with the same itinerary are. we summarize this in the following lemma.

lemma 3.2. let T be a 3iet defined by (1) and let I = [γ, δ) ⊂ [0, 1) with δ < 1. denote

k_α := min{k ∈ ℤ, k ≥ 0 : T^{−k}(α) ∈ (γ, δ)},
k_β := min{k ∈ ℤ, k ≥ 0 : T^{−k}(β) ∈ (γ, δ)},
k_γ := min{k ∈ ℤ, k ≥ 1 : T^{−k}(γ) ∈ (γ, δ)},
k_δ := min{k ∈ ℤ, k ≥ 1 : T^{−k}(δ) ∈ (γ, δ)},

and further

a := T^{−k_α}(α),  b := T^{−k_β}(β),  c := T^{−k_γ}(γ),  d := T^{−k_δ}(δ).

for x ∈ I, let K_x be a maximal interval such that for every y ∈ K_x we have R(y) = R(x). then K_x is of the form [c, d) with c, d ∈ {γ, δ, a, b, c, d}. consequently, # It_I ≤ 5.

for a 3iet T, lemma 3.2 implies that T_I is an exchange of at most 5 intervals. consequently, the transformation T_I has at most four discontinuity points. in fact, the following result of [9] says that, independently of the number of I-itineraries, the induced map T_I always has at most two discontinuity points.

proposition 3.3 ([9]). let T : J → J be a 3iet with the permutation (321) satisfying the minimality condition, and let I ⊂ J be an interval. the first return map T_I is either a 3iet with permutation (321) or a 2iet with permutation (21). in particular, in the notation of lemma 3.2, we have d ≤ c.

we will use two other facts about itineraries of an interval exchange, stated as propositions 3.4 and 3.5. both were proven in [12] for general interval exchange transformations with a symmetric permutation, and thus hold also for 3iets with permutation (321). note that a 3iet T is right-continuous; therefore, if I = [γ, δ), then every word w ∈ It_I = {R(x) : x ∈ I} is the I-itinerary R(x) of infinitely many x ∈ I, which form an interval, again left-closed and right-open.

proposition 3.4. let T be a 3iet satisfying the minimality condition and let I = [γ, δ) ⊂ [0, 1). there exist neighbourhoods H_γ and H_δ of γ and δ, respectively, such that for every γ̃ ∈ H_γ and δ̃ ∈ H_δ with 0 ≤ γ̃ < δ̃ ≤ 1 one has It_Ĩ ⊇ It_I, where Ĩ = [γ̃, δ̃). in particular, if # It_I = 5, then It_Ĩ = It_I.

for an interval K = [c, d) ⊂ [0, 1) denote K̄ = [1 − d, 1 − c). then

T(J_X)‾ = J_X for any letter X ∈ {A, B, C}.   (3)
we will use two other facts about itineraries of an interval exchange, stated as propositions 3.4 and 3.5. both of these were proven in [12] for general interval exchange transformations with symmetric permutation and thus hold also for 3iets with permutation (321). note that a 3iet t is right-continuous. therefore, if i = [γ,δ), then every word w ∈ iti = {r(x) : x ∈ i} is an i-itinerary r(x) for infinitely many x ∈ i, which form an interval, again left-closed right-open. proposition 3.4. let t be a 3iet satisfying the minimality condition and let i = [γ,δ) ⊂ [0, 1). there exist neighbourhoods hγ and hδ of γ and δ, respectively, such that for every γ̃ ∈ hγ and δ̃ ∈ hδ with 0 ≤ γ̃ < δ̃ ≤ 1 one has itĩ ⊇ iti, where ĩ = [γ̃, δ̃). in particular, if # iti = 5, then itĩ = iti. for an interval k = [c,d) ⊂ [0, 1) denote k = [1 −d, 1 − c). then t(jx) = jx for any letter x ∈{a,b,c}. (3) 463 z. masáková, e. pelantová, š. starosta acta polytechnica proposition 3.5. let t : [0, 1) → [0, 1) be a 3iet with permutation (321) satisfying the minimality condition. let i ⊂ [0, 1) and let r1, . . . ,rm be the iitineraries. the i-itineraries are the mirror images of the i-itineraries, namely r1, . . . ,rm. moreover, if [γj,δj) := {x ∈ i : ri(x) = rj } and [γ′j,δ ′ j) := ti ( [γj,δj) ) , for j = 1, . . . ,m, then {x ∈ i : r i (x) = rj } = [1 −δ′j, 1 −γ ′ j). convention: for the rest of the article, let t be a non-degenerate exchange of three intervals with permutation (321) given by (1). 4. return time in a 3iet the aim of this section is to describe the possible return times of a non-degenerate 3iet t to a general subinterval i ⊂ [0, 1). our aim is to prove the following theorem. theorem 4.1. let t be a non-degenerate 3iet and let i ⊂ [0, 1). there exist positive integers r1,r2 such that the return time of any x ∈ i takes value in the set {r1,r1 + 1,r2,r1 +r2,r1 +r2 + 1} or {r1,r1 + 1,r2,r2 + 1,r1 + r2 + 1}. first, we will formulate an important lemma, which needs the following notation. given letters x,y,z ∈ {a,b,c} and a finite word w ∈ {a,b,c}∗, let ωxy→z(w) be the set of words obtained from w replacing one factor xy by the letter z, i.e., ωxy→z(w) = {w1zw2 : w = w1xy w2 }. (4) similarly, ωz→xy (w) = {w1xy w2 : w = w1zw2 }. (5) clearly, v ∈ ωxy→z(w) ⇔ w ∈ ωz→xy (v). (6) by abuse of notation, we write v = ωxy→z(w) instead of v ∈ ωxy→z(w). lemma 4.2. assume that the orbits of points α,β,γ and δ are mutually disjoint. for sufficiently small ε > 0, we have the following relations between iitineraries of points in i: (a) r(a−ε) = ωb→ac ( r(a + ε) ) , (b) r(a + ε) = ωac→b ( r(a−ε) ) , (c) r(b−ε) = ωca→b ( r(b + ε) ) , (d) r(b + ε) = ωb→ca ( r(b−ε) ) , (e) r(d + ε) = r(d−ε)r(δ −ε), (f) r(c−ε) = r(c + ε)r(γ + ε), where a,b,c,d are given in lemma 3.2. proof. we will first demonstrate the proof of the case (a). let k = [a − ε,a + ε] with ε chosen such that k ⊂ i and α,β,γ,δ 6∈ ti(k) for all 0 ≤ i ≤ kα with the only exception of tkα(a) = α. (7) for simplicity, denote t = max{ri(x) : x ∈ k} the maximal return time. the existence of such ε follows trivially from the definition of the interval exchange transformation and the assumptions of the lemma. let k− = [a−ε,a) and k+ = [a,a+ε]. it follows from the definition of a and condition (7) that for all i such that 0 < i ≤ kα we have ti(k) ∩ i = ∅. moreover, condition (7) implies that all such ti(k) are intervals. it implies that for any x,y ∈ k, the prefixes of r(x) and r(y) of length kα + 1 are the same. denote this prefix by w. 
proof. we first demonstrate the proof of case (a). let K = [a − ε, a + ε] with ε chosen such that K ⊂ I and

α, β, γ, δ ∉ T^i(K) for all 0 ≤ i ≤ k_α, with the only exception T^{k_α}(a) = α.   (7)

for simplicity, denote by t = max{r_I(x) : x ∈ K} the maximal return time. the existence of such an ε follows directly from the definition of the interval exchange transformation and the assumptions of the lemma. let K⁻ = [a − ε, a) and K⁺ = [a, a + ε]. it follows from the definition of a and condition (7) that for all i with 0 < i ≤ k_α we have T^i(K) ∩ I = ∅. moreover, condition (7) implies that all such T^i(K) are intervals. this implies that for any x, y ∈ K, the prefixes of R(x) and R(y) of length k_α + 1 coincide. denote this prefix by W.

the definition of k_α implies that α ∈ T^{k_α}(K). since T^{k_α}(K⁺) = [α, α + ε] ⊂ J_B, we obtain T^{k_α+1}(K⁺) = [T(α), T(α) + ε). furthermore, since T^{k_α}(K⁻) = [α − ε, α) ⊂ J_A, we obtain T^{k_α+1}(K⁻) = [1 − ε, 1) ⊂ J_C, and thus T^{k_α+2}(K⁻) = [T(α) − ε, T(α)). this implies that the set

K′ = T^{k_α+2}(K⁻) ∪ T^{k_α+1}(K⁺) = [T(α) − ε, T(α) + ε]

is an interval. as above, condition (7) implies that the set T^i(K′) is an interval for all i with 0 ≤ i ≤ t − k_α − 1. it follows that min{i : T^i(K′) ∩ K ≠ ∅} = t − k_α − 2, and condition (7) moreover implies that T^{t−k_α−2}(K′) ⊂ K. thus the iterations x, T(x), . . . , T^{t−k_α−2}(x) of every x ∈ K′ are coded by the same word, say V. the whole situation is depicted in figure 1.

figure 1. situation in the proof of lemma 4.2, case (a).

from what is said above, we can write down the I-itineraries of points from K:

R(x) = W AC V  if x ∈ K⁻,
R(x) = W B V   if x ∈ K⁺.

this finishes the proof of (a). the claim in item (c) is analogous to (a). cases (b) and (d) are derived from (a) and (c) by the use of the equivalence (6).

let us now demonstrate the proof of case (e). denote s = min{n ∈ ℤ⁺ : T^n(δ) ∈ I}. let K = [d − ε, d + ε] with ε chosen such that K ⊂ I and

α, β, γ, δ ∉ T^i(K) for all 0 ≤ i ≤ k_δ + s, with the only exception T^{k_δ}(d) = δ.   (8)

the existence of such an ε again follows directly from the definition of the interval exchange transformation and the assumptions of the lemma. condition (8) implies that T^i(K) is an interval for all i with 0 < i ≤ k_δ + s. moreover, T^i(K) ∩ I = ∅ for all i with 0 < i < k_δ, and we obtain T^{k_δ}(K) ∩ I = [δ − ε, δ). in other words, the I-itineraries of all points of K start with a prefix of length k_δ equal to R(d − ε). condition (8) and the definition of s imply that for all i with k_δ < i < s + k_δ we have T^i(K) ⊂ J_X for some X ∈ {A, B, C} and T^i(K) ∩ I = ∅; moreover, T^i(K) ⊂ I for i = k_δ + s. thus the iterations of points of T^{k_δ}(K) = [δ − ε, δ) ∪ T^{k_δ}([d, d + ε]) are coded by the same word of length s, namely R(δ − ε). altogether, we conclude that the I-itinerary of points in the interval [d, d + ε] equals R(d − ε)R(δ − ε). the situation is depicted in figure 2.

figure 2. situation in the proof of lemma 4.2, case (e).

case (f) can be treated in a way analogous to case (e).

now we can prove the main theorem describing the return times in a 3iet. in the proof it is sufficient to focus on the case # It_I = 5, since, as we have seen from proposition 3.4, the set of I-itineraries, and thus also their return times, in the other cases is only a subset of It_Ĩ for some "close enough" generic subinterval Ĩ ⊂ [0, 1). so throughout the rest of this section, suppose that # It_I = 5. this means by lemma 3.2 that the points a, b, c, d lie in the interior of the interval I = [γ, δ) and are mutually distinct. moreover, by proposition 3.3, we have d < c. these conditions allow 12 possible orderings of a, b, c, d, which give rise to 12 cases in the study of return times. we will describe them in the proof of theorem 4.1 as cases (i)–(xii) and then show in example 4.5 that all 12 cases may occur.

remark 4.3. note that if γ = 0, i.e., we induce on an interval I = [0, δ), we have T^{−1}(γ) = β and therefore necessarily b = c. thus there are at most four I-itineraries. due to proposition 3.5, a similar situation happens if δ = 1.
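before turning to the proof, the statement of theorem 4.1 can be checked numerically. the sketch below (python; all names ours) computes the return times to a subinterval for the 3iet with the parameters of example 4.5 further on, α = (√5 − 1)/5 and β = (√5 + 2)/6; for the interval [29/100, 99/100) it should print at most five values, here the itinerary lengths {1, 2, 3, 4} (two itineraries share length 2).

```python
import math

alpha = math.sqrt(5)/5 - 1/5
beta  = math.sqrt(5)/6 + 1/3

def t(x):                                # the 3iet (1)
    if x < alpha: return x + 1 - alpha
    if x < beta:  return x + 1 - alpha - beta
    return x - beta

def return_time(x, gamma, delta):
    y, n = t(x), 1
    while not (gamma <= y < delta):
        y, n = t(y), n + 1
    return n

gamma, delta = 0.29, 0.99
times = {return_time(gamma + k*(delta - gamma)/5000, gamma, delta)
         for k in range(5000)}
print(sorted(times))                     # at most five values occur
```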
proof of theorem 4.1. we will discuss the 12 possibilities of ordering of the points a, b, c, d in the interior of the interval [γ, δ) under the condition d < c. the structure of the set of I-itineraries is best shown in terms of the I-itineraries of points in the left neighbourhood of the point d and the right neighbourhood of the point c. for simplicity, we denote, for sufficiently small positive ε,

R_1 = R(d − ε),  R_2 = R(c + ε),  and  |R_1| = t_1, |R_2| = t_2.

in order to be allowed to use lemma 4.2, we assume that the orbits of the points α, β, γ and δ are mutually disjoint. otherwise, we use proposition 3.4 to find a modified interval Ĩ where this is satisfied and It_Ĩ = It_I.

(i) let a < b < d < c. we know that R(x) is constant on the intervals [γ, a), [a, b), [b, d), [d, c) and [c, δ). by definition R(x) = R_2 for x ∈ [c, δ) and R(x) = R_1 for x ∈ [b, d). we can derive from rule (e) of lemma 4.2 that if x ∈ [d, c), then R(x) = R_1R_2. further, we use rule (c) to show that R(x) = Ω_{CA→B}(R_1) for x ∈ [a, b), and then, by applying rule (a), we obtain R(x) = Ω_{B→AC}(Ω_{CA→B}(R_1)) for x ∈ [γ, a). summarized:

R(x) = Ω_{B→AC}(Ω_{CA→B}(R_1)) for x ∈ [γ, a),
R(x) = Ω_{CA→B}(R_1) for x ∈ [a, b),
R(x) = R_1 for x ∈ [b, d),
R(x) = R_1R_2 for x ∈ [d, c),
R(x) = R_2 for x ∈ [c, δ).

it is easy to see that the lengths of the above I-itineraries are t_1, t_1 − 1, t_1, t_1 + t_2, t_2, respectively. to see this, realize that by the definitions (4) and (5) of the action of Ω_{xy→z} and Ω_{z→xy}, the length of the word Ω_{B→AC}(Ω_{CA→B}(R_1)) is equal to the length of the itinerary R_1, i.e., t_1. setting r_1 = t_1 − 1 and r_2 = t_2, we obtain the desired return times. the proofs of the other cases are analogous; we state the results in terms of R_1 and R_2.

(ii) let d < c < a < b. we obtain

R(x) = R_1 for x ∈ [γ, d),
R(x) = R_2R_1 for x ∈ [d, c),
R(x) = R_2 for x ∈ [c, a),
R(x) = Ω_{AC→B}(R_2) for x ∈ [a, b),
R(x) = Ω_{B→CA}(Ω_{AC→B}(R_2)) for x ∈ [b, δ),

with lengths t_1, t_1 + t_2, t_2, t_2 − 1, t_2, respectively. we set r_1 = t_1 and r_2 = t_2 − 1.

(iii) let b < a < d < c. we obtain

R(x) = Ω_{CA→B}(Ω_{B→AC}(R_1)) for x ∈ [γ, b),
R(x) = Ω_{B→AC}(R_1) for x ∈ [b, a),
R(x) = R_1 for x ∈ [a, d),
R(x) = R_1R_2 for x ∈ [d, c),
R(x) = R_2 for x ∈ [c, δ),

with lengths t_1, t_1 + 1, t_1, t_1 + t_2, t_2, respectively. we set r_1 = t_1 and r_2 = t_2.

(iv) let d < c < b < a. we obtain

R(x) = R_1 for x ∈ [γ, d),
R(x) = R_2R_1 for x ∈ [d, c),
R(x) = R_2 for x ∈ [c, b),
R(x) = Ω_{B→CA}(R_2) for x ∈ [b, a),
R(x) = Ω_{AC→B}(Ω_{B→CA}(R_2)) for x ∈ [a, δ),

with lengths t_1, t_1 + t_2, t_2, t_2 + 1, t_2, respectively. we set r_1 = t_1 and r_2 = t_2.

(v) let a < d < b < c. we obtain

R(x) = Ω_{B→AC}(R_1) for x ∈ [γ, a),
R(x) = R_1 for x ∈ [a, d),
R(x) = R_1R_2 for x ∈ [d, b),
R(x) = R_2Ω_{B→AC}(R_1) for x ∈ [b, c),
R(x) = R_2 for x ∈ [c, δ),

with lengths t_1 + 1, t_1, t_1 + t_2, t_1 + t_2 + 1, t_2, respectively. we set r_1 = t_1 and r_2 = t_2.

(vi) let d < a < c < b. we obtain

R(x) = R_1 for x ∈ [γ, d),
R(x) = R_1Ω_{B→CA}(R_2) for x ∈ [d, a),
R(x) = R_2R_1 for x ∈ [a, c),
R(x) = R_2 for x ∈ [c, b),
R(x) = Ω_{B→CA}(R_2) for x ∈ [b, δ),

with lengths t_1, t_1 + t_2 + 1, t_1 + t_2, t_2, t_2 + 1, respectively. we set r_1 = t_1 and r_2 = t_2.

(vii) let b < d < a < c. we obtain

R(x) = Ω_{CA→B}(R_1) for x ∈ [γ, b),
R(x) = R_1 for x ∈ [b, d),
R(x) = R_1R_2 for x ∈ [d, a),
R(x) = R_2Ω_{CA→B}(R_1) for x ∈ [a, c),
R(x) = R_2 for x ∈ [c, δ),
We set $r_1 = t_1 - 1$ and $r_2 = t_2$.

(viii) Let $d < b < c < a$. We obtain
$$r(x) = \begin{cases} r_1 & \text{for } x \in [\gamma,d), \\ r_1\,\omega_{AC\to B}(r_2) & \text{for } x \in [d,b), \\ r_2 r_1 & \text{for } x \in [b,c), \\ r_2 & \text{for } x \in [c,a), \\ \omega_{AC\to B}(r_2) & \text{for } x \in [a,\delta), \end{cases}$$
with lengths $t_1$, $t_1+t_2-1$, $t_1+t_2$, $t_2$, $t_2-1$, respectively. We set $r_1 = t_1$ and $r_2 = t_2 - 1$.

(ix) Let $a < d < c < b$. We obtain
$$r(x) = \begin{cases} \omega_{B\to AC}(r_1) & \text{for } x \in [\gamma,a), \\ r_1 & \text{for } x \in [a,d), \\ r_1\,\omega_{B\to CA}(r_2) & \text{for } x \in [d,c), \\ r_2 & \text{for } x \in [c,b), \\ \omega_{B\to CA}(r_2) & \text{for } x \in [b,\delta), \end{cases}$$
with lengths $t_1+1$, $t_1$, $t_1+t_2+1$, $t_2$, $t_2+1$, respectively. We set $r_1 = t_1$ and $r_2 = t_2$.

(x) Let $b < d < c < a$. We obtain
$$r(x) = \begin{cases} \omega_{CA\to B}(r_1) & \text{for } x \in [\gamma,b), \\ r_1 & \text{for } x \in [b,d), \\ r_1\,\omega_{AC\to B}(r_2) & \text{for } x \in [d,c), \\ r_2 & \text{for } x \in [c,a), \\ \omega_{AC\to B}(r_2) & \text{for } x \in [a,\delta), \end{cases}$$
with lengths $t_1-1$, $t_1$, $t_1+t_2-1$, $t_2$, $t_2-1$, respectively. We set $r_1 = t_1 - 1$ and $r_2 = t_2 - 1$.

(xi) Let $d < a < b < c$. We obtain
$$r(x) = \begin{cases} r_1 & \text{for } x \in [\gamma,d), \\ r_1 r_2 & \text{for } x \in [d,a), \\ \omega_{CA\to B}(r_2 r_1) & \text{for } x \in [a,b), \\ r_2 r_1 & \text{for } x \in [b,c), \\ r_2 & \text{for } x \in [c,\delta), \end{cases}$$
with lengths $t_1$, $t_1+t_2$, $t_1+t_2-1$, $t_1+t_2$, $t_2$, respectively. We set $r_1 = t_1 - 1$ and $r_2 = t_2$.

(xii) Let $d < b < a < c$. We obtain
$$r(x) = \begin{cases} r_1 & \text{for } x \in [\gamma,d), \\ r_1 r_2 & \text{for } x \in [d,b), \\ \omega_{B\to AC}(r_2 r_1) & \text{for } x \in [b,a), \\ r_2 r_1 & \text{for } x \in [a,c), \\ r_2 & \text{for } x \in [c,\delta), \end{cases}$$
with lengths $t_1$, $t_1+t_2$, $t_1+t_2+1$, $t_1+t_2$, $t_2$, respectively. We set $r_1 = t_1$ and $r_2 = t_2$.

Remark 4.4. When describing the $I$-itineraries using the words $r_1$, $r_2$, we could apply the rules of Lemma 4.2 in a different order. By doing so, we would obtain the itineraries expressed differently, which yields interesting relations between the words $r_1, r_2$. For example, in case (ix), we derive that the $I$-itinerary of $x \in [d,c)$ is $r(x) = r_1\,\omega_{B\to CA}(r_2) = r_2\,\omega_{B\to AC}(r_1)$. Note also the symmetries between the cases (i) and (ii), (iii) and (iv), (v) and (vi), (vii) and (viii), in consequence of Proposition 3.5. Indeed, if we exchange the pair of points $d \leftrightarrow c$, $b \leftrightarrow a$, the letters $A \leftrightarrow C$, and finally the inequalities "<" and ">", we obtain a symmetric situation in the list of cases discussed in the proof. In this sense, each of the cases (ix) to (xii) is symmetric to itself.

Example 4.5. Set $\alpha = \frac15\sqrt5 - \frac15$ and $\beta = -\frac16\sqrt5 + \frac23$. Table 1 shows 12 choices of $I = [\gamma,\delta)$ which produce the 12 distinct orderings of the points $a$, $b$, $c$ and $d$, shown in the third column. The last column contains the respective lengths of the 5 distinct $I$-itineraries.

γ       | δ       | type            | lengths
6/25    | 99/100  | a < b < d < c   | [2, 1, 2, 3, 1]
29/100  | 71/100  | d < c < a < b   | [1, 15, 14, 13, 14]
77/100  | 4/5     | b < a < d < c   | [88, 89, 88, 109, 21]
7/25    | 3/4     | d < c < b < a   | [1, 13, 12, 13, 12]
1/100   | 3/4     | a < d < b < c   | [2, 1, 2, 3, 1]
1/100   | 29/100  | d < a < c < b   | [2, 14, 13, 11, 12]
1/4     | 99/100  | b < d < a < c   | [1, 2, 3, 2, 1]
71/100  | 99/100  | d < b < c < a   | [2, 13, 14, 12, 11]
1/25    | 37/50   | a < d < c < b   | [2, 1, 4, 2, 3]
29/100  | 99/100  | b < d < c < a   | [1, 2, 4, 3, 2]
1/100   | 99/100  | d < a < b < c   | [1, 2, 1, 2, 1]
1/4     | 3/4     | d < b < a < c   | [1, 12, 13, 12, 11]

Table 1. The cases (i)–(xii) from the proof of Theorem 4.1 for $\alpha = \frac15\sqrt5 - \frac15$, $\beta = -\frac16\sqrt5 + \frac23$ as in Example 4.5. The endpoints of the interval $I = [\gamma,\delta)$ are in the first and second columns. The last column lists the lengths of the $I$-itineraries of all 5 subintervals of $I$, starting from the leftmost one.

Let us describe one of the cases in detail, namely the case $b < d < c < a$. The induced interval is determined by setting $\gamma = \frac{29}{100}$ and $\delta = \frac{99}{100}$.
One can verify that
$$b = T^{-0}(1-\beta) = \tfrac16\sqrt5 + \tfrac13 \approx 0.70601; \quad d = T^{-2}(\delta) = \tfrac{11}{30}\sqrt5 + \tfrac{37}{300} \approx 0.94322;$$
$$c = T^{-3}(\gamma) = \tfrac{8}{15}\sqrt5 - \tfrac{73}{300} \approx 0.94924; \quad a = T^{-1}(\alpha) = \tfrac{11}{30}\sqrt5 + \tfrac{2}{15} \approx 0.95322.$$
This corresponds to case (x) in the proof of Theorem 4.1, with $r_1 = CA$ and $r_2 = CAC$. The $I$-itinerary of a point $x \in I = [\gamma,\delta)$ is
$$r(x) = \begin{cases} B & \text{for } x \in [\gamma,b), \\ CA & \text{for } x \in [b,d), \\ CACB & \text{for } x \in [d,c), \\ CAC & \text{for } x \in [c,a), \\ CB & \text{for } x \in [a,\delta). \end{cases}$$

5. Description of the case of three I-itineraries

The cases (i)–(xii) in the proof of Theorem 4.1 correspond to the generic instances of a subinterval $I$ in a non-degenerate 3IET which lead to 5 different $I$-itineraries. Let us now focus on the cases where, on the contrary, the set of $I$-itineraries has only 3 elements. First we recall two reasons why such cases are interesting.

For a factor $w$ from the language of a non-degenerate 3IET transformation $T$, denote
$$[w] = \{\rho \in [0,1) : w \text{ is a prefix of } u_\rho\}.$$
It is easy to see that $[w]$ (usually called the cylinder of $w$) is a semi-closed interval whose boundaries belong to the set $\{T^{-i}(Z) : 0 \le i < n,\ Z \in \{\alpha,\beta\}\}$. Clearly, a factor $v$ is a return word to the factor $w$ if and only if $v$ is a $[w]$-itinerary. It is well known [16] that any factor of an infinite word coding a non-degenerate 3IET has exactly three return words, and thus the set $\mathrm{It}_{[w]}$ has three elements.

The second reason to study intervals $I$ yielding three $I$-itineraries is that any morphism fixing a non-degenerate 3IET word corresponds to such an interval $I$. Details of this correspondence will be explained further in this section.

Proposition 5.1. Let $T$ be a non-degenerate 3IET and let $I = [\gamma,\delta) \subset [0,1)$ be such that $\#\,\mathrm{It}_I = 3$. One of the following cases occurs:
(i) $b = d < a = c$ and
$$r(x) = \begin{cases} r_1 & \text{for } x \in [\gamma,b), \\ \omega_{B\to CA}(r_1 r_2) = \omega_{B\to AC}(r_2 r_1) & \text{for } x \in [b,a), \\ r_2 & \text{for } x \in [a,\delta); \end{cases}$$
(ii) $a = d < b = c$ and
$$r(x) = \begin{cases} r_1 & \text{for } x \in [\gamma,a), \\ \omega_{AC\to B}(r_1 r_2) = \omega_{CA\to B}(r_2 r_1) & \text{for } x \in [a,b), \\ r_2 & \text{for } x \in [b,\delta); \end{cases}$$
(iii) $b < a = c = d$ and
$$r(x) = \begin{cases} \omega_{CA\to B}(r_1) & \text{for } x \in [\gamma,b), \\ r_1 & \text{for } x \in [b,a), \\ r_2 & \text{for } x \in [a,\delta); \end{cases}$$
(iv) $b = c = d < a$ and
$$r(x) = \begin{cases} r_1 & \text{for } x \in [\gamma,b), \\ r_2 & \text{for } x \in [b,a), \\ \omega_{AC\to B}(r_2) & \text{for } x \in [a,\delta); \end{cases}$$
(in the source the middle two interval endpoints of case (iv) appear transposed; they are restored here consistently with $b < a$);
(v) $a = c = d < b$ and
$$r(x) = \begin{cases} r_1 & \text{for } x \in [\gamma,a), \\ r_2 & \text{for } x \in [a,b), \\ \omega_{B\to CA}(r_2) & \text{for } x \in [b,\delta); \end{cases}$$
(vi) $a < b = c = d$ and
$$r(x) = \begin{cases} \omega_{B\to AC}(r_1) & \text{for } x \in [\gamma,a), \\ r_1 & \text{for } x \in [a,b), \\ r_2 & \text{for } x \in [b,\delta). \end{cases}$$

Sketch of the proof. Since by Lemma 3.2 the subintervals of $I$ corresponding to the same itinerary are delimited by the points $a, b, c$ and $d$, we may have $\#\,\mathrm{It}_I = 3$ only if some of these points coincide, more precisely if $\#\{a,b,c,d\} = 2$. The non-degeneracy of the considered 3IET implies that always $a \neq b$, which further limits the discussion. The six cases listed in the statement are the possibilities of how this may happen, respecting the condition $d < c$ or $d = c$. In order to describe the itineraries, denote again $r_1 = r(d-\varepsilon)$ and $r_2 = r(c+\varepsilon)$ for $\varepsilon > 0$ sufficiently small. One can then follow the ideas of the proof of Lemma 4.2.

5.1. Return words to factors of a 3IET

Let us apply Proposition 5.1 in order to describe the return words to factors of a non-degenerate 3IET word. If a factor $w$ has a unique right prolongation in the language $\mathcal{L}(T)$, i.e., there exists only one letter $a \in \mathcal{A}$ such that $wa \in \mathcal{L}(T)$, then the set of return words to $w$ and the set of return words to $wa$ coincide.
Almost analogously, if a factor $w$ has a unique left prolongation in the language $\mathcal{L}(T)$, say $aw$ for some $a \in \mathcal{A}$, then a word $v$ is a return word to $w$ if and only if $ava^{-1}$ is a return word to $aw$. Consequently, to describe the structure of return words to a given factor $w$, we can restrict ourselves to factors which have at least two right and at least two left prolongations. Such factors are called bispecial. It is readily seen that the language of an aperiodic recurrent infinite word $u$ contains infinitely many bispecial factors.

Before giving the description of return words to bispecial factors, we state the following lemma. Let us recall that for a word $w$ the notation $\overline{w}$ denotes the reversal of $w$, while for an interval $[c,d)$ the notation $\overline{[c,d)}$ denotes the interval $[1-d, 1-c)$.

Lemma 5.2. Let $w$ belong to the language of a non-degenerate 3IET $T$. Denote $n = |w|$. For the cylinder of its reversal $\overline{w}$, one has $[\overline{w}] = \overline{T^n([w])}$.

Proof. According to the definition of $[w]$, for each $[w]$-itinerary $r$, the word $rw$ belongs to the language and $w$ occurs in $rw$ exactly twice, as a prefix and as a suffix. In other words, $r$ is a return word to $w$. Moreover, $[w]$ is the maximal (with respect to inclusion) interval with this property. It follows also that if $r$ is a $[w]$-itinerary, then the word $w^{-1}rw$ is a $T^n([w])$-itinerary. Applying Proposition 3.5 to the interval $T^n([w])$, we obtain that $s := \overline{w^{-1}rw}$ is a $\overline{T^n([w])}$-itinerary. Since the word $s\overline{w} = \overline{rw}$ has a prefix $\overline{w}$ and a suffix $\overline{w}$, with no other occurrences of $\overline{w}$, the word $s$ is a return word to $\overline{w}$, and thus, by the definition of the cylinder, $s$ is a $[\overline{w}]$-itinerary; this holds for any $\overline{T^n([w])}$-itinerary $s$. From the maximality of the cylinder we have $\overline{T^n([w])} \subset [\overline{w}]$. Since the lengths of the intervals $[w]$ and $\overline{T^n([w])}$ coincide, we see, in particular, that the length of the interval $[w]$ is less than or equal to the length of the interval $[\overline{w}]$. But from the symmetry of the roles of $w$ and $\overline{w}$, their lengths must be equal, and thus $\overline{T^n([w])} = [\overline{w}]$.

The language of $T$ contains two types of bispecial factors: palindromic and non-palindromic. In [6], Ferenczi, Holton and Zamboni studied the structure of return words to non-palindromic bispecial factors. The following proposition completes this description.

Proposition 5.3. Let $w$ be a bispecial factor. If $w$ is a palindrome, then its return words are described by cases (i) and (ii) of Proposition 5.1. If $w$ is not a palindrome, then its return words are described by cases (iii)–(vi) of Proposition 5.1.

Proof. Let $w$ be a bispecial factor. If $w$ is not a palindrome, the claim follows from Theorem 4.6 of [6]. Assume that $w$ is a palindrome and let $[w] = [T^{-\ell}(L), T^{-r}(R))$ with $L, R \in \{\alpha,\beta\}$ and $0 \le \ell, r < |w|$. By Lemma 5.2 we have $[\overline{w}] = \overline{T^{|w|}([w])}$. Since $\overline{w} = w$, we have $[\overline{w}] = [w]$, and thus
$$[w] = [T^{-\ell}(L),\ T^{-r}(R)) = [1 - T^{n-r}(R),\ 1 - T^{n-\ell}(L)) = [\overline{w}].$$
Since the considered 3IET is non-degenerate, the parameters $\alpha,\beta$ satisfy (2). Consequently, the equation $T^{-\ell}(L) = 1 - T^{n-r}(R)$ has a solution if and only if $R \neq L$. Thus, we have neither $a = c = d$ nor $b = c = d$, and we are in case (i) or (ii) of Proposition 5.1.
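The statement that every factor has exactly three return words [16] is easy to observe numerically. The sketch below, under the same assumed parametrization as the earlier 3IET snippet, generates a long prefix of the coding word and extracts the return words of several factors directly.

```python
import numpy as np

ALPHA = np.sqrt(5)/5 - 1/5
BETA  = 2/3 - np.sqrt(5)/6

def T(x):
    if x < ALPHA:     return x + 1 - ALPHA
    if x < 1 - BETA:  return x + BETA - ALPHA
    return x - (1 - BETA)

# code the orbit of rho = 0 into a long prefix of the infinite word u
x, out = 0.0, []
for _ in range(200_000):
    out.append('A' if x < ALPHA else 'B' if x < 1 - BETA else 'C')
    x = T(x)
u = ''.join(out)

for n in (2, 3, 5, 8):
    w = u[:n]                       # the length-n prefix is a factor of u
    occ, i = [], u.find(w)
    while i != -1:                  # collect (possibly overlapping) occurrences
        occ.append(i); i = u.find(w, i + 1)
    rets = {u[p:q] for p, q in zip(occ, occ[1:])}
    print(w, len(rets))             # prints 3 for every factor, by [16]
```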
5.2. Substitutions fixing 3IET words

Another application of Proposition 5.1 is to provide some information about substitutions having a non-degenerate 3IET word as a fixed point. A substitution over an alphabet $\mathcal{A}$ is a morphism $\eta: \mathcal{A}^* \to \mathcal{A}^*$ such that $\eta(b) \neq \epsilon$ for $b \in \mathcal{A}$ and there is a letter $a \in \mathcal{A}$ satisfying $\eta(a) = aw$ for some non-empty word $w$. The action of $\eta$ can be naturally extended to infinite words $u \in \mathcal{A}^{\mathbb{N}}$ by setting $\eta(u) = \eta(u_0)\eta(u_1)\eta(u_2)\cdots$. If $\eta(u) = u$, then $u$ is said to be a fixed point of $\eta$. Obviously, a substitution always has the fixed point $\lim_{n\to\infty}\eta^n(a)$, where the limit is taken in the product topology. A substitution $\eta$ is primitive if there exists an integer $k$ such that for all $a, b \in \mathcal{A}$ the letter $b$ occurs in $\eta^k(a)$.

3IET words fixed by a substitution were studied in [3] and [5]. In [3] it was shown that a substitution fixing a non-degenerate 3IET word corresponds to an interval $I$ such that the induced transformation is homothetic to the original one. More precisely, we have the following theorem.

Theorem 5.4 [3]. Let $\xi$ be a primitive substitution over $\{A, B, C\}$ and let $T$ be a non-degenerate 3IET. If the substitution $\xi$ fixes the word $u_\rho$ coding the orbit of a point $\rho \in [0,1)$ under $T$, then there exists an interval $I \subset [0,1)$ such that the induced transformation $T_I$ is homothetic to $T$, the set of $I$-itineraries is equal to $\mathrm{It}_I = \{r_A, r_B, r_C\}$, and the substitution $\xi$ satisfies either $\eta = \xi$ or $\eta = \xi^2$, where $\eta(A) = r_A$, $\eta(B) = r_B$ and $\eta(C) = r_C$.

Using the above theorem together with Proposition 5.1, one can derive information about the itineraries which determine the substitution $\eta$.

Corollary 5.5. Let $\eta$ be a primitive substitution as in Theorem 5.4 fixing a non-degenerate 3IET word over the alphabet $\{A, B, C\}$. We have
$$\eta(B) = \omega_{AC\to B}(\eta(AC)) = \omega_{CA\to B}(\eta(CA)) \quad\text{or}\quad \eta(B) = \omega_{B\to CA}(\eta(AC)) = \omega_{B\to AC}(\eta(CA)).$$

Proof. By Theorem 5.4, $\eta$ corresponds to an interval $I$ such that $T_I$ is homothetic to $T$. Since $T$ is non-degenerate, $T_I$ is also non-degenerate, and therefore its discontinuity points $c, d$ are distinct. By Proposition 5.1, the three $I$-itineraries are of the form given in cases (i) or (ii).

Example 5.6. We can illustrate the above corollary on the substitution
$$\eta(A) = BCACAC, \quad \eta(B) = BCACBBCAC, \quad \eta(C) = BCAC.$$
The morphism $\eta$ satisfies the property given in Corollary 5.5 (see [12]). Namely, we have
$$\eta(B) = BCACBBCAC = \omega_{AC\to B}(\eta(AC)) = \omega_{AC\to B}(BCACACBCAC) = \omega_{CA\to B}(\eta(CA)) = \omega_{CA\to B}(BCACBCACAC).$$

Corollary 5.5 implies a relation between the numbers of occurrences of letters in the letter images of $\eta$, which may be used to obtain an interesting relation for the incidence matrix $M_\eta$ of $\eta$. It is the integer-valued matrix defined by $(M_\eta)_{ab} = |\eta(b)|_a$ for $a, b \in \mathcal{A}$. As a consequence of Corollary 5.5, we have for the columns of the incidence matrix $M_\eta$ that
$$M_\eta \begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix} = \begin{pmatrix} |\eta(A)|_A \\ |\eta(A)|_B \\ |\eta(A)|_C \end{pmatrix} - \begin{pmatrix} |\eta(B)|_A \\ |\eta(B)|_B \\ |\eta(B)|_C \end{pmatrix} + \begin{pmatrix} |\eta(C)|_A \\ |\eta(C)|_B \\ |\eta(C)|_C \end{pmatrix} = \pm \begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix}.$$
Thus, $(1,-1,1)^\top$ is an eigenvector of $M_\eta$ corresponding to the eigenvalue $1$ or $-1$, respectively. This fact has already been derived in [2] by other methods.
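Example 5.6 can be checked mechanically. The snippet below builds the incidence matrix of $\eta$ with its columns holding the letter counts of $\eta(A)$, $\eta(B)$, $\eta(C)$, as in the displayed identity, and confirms that $(1,-1,1)^\top$ is an eigenvector, here with eigenvalue $+1$.

```python
import numpy as np

eta = {'A': 'BCACAC', 'B': 'BCACBBCAC', 'C': 'BCAC'}
letters = 'ABC'

# column b of M holds the letter counts of eta(b): M[a, b] = |eta(b)|_a
M = np.array([[eta[b].count(a) for b in letters] for a in letters])

v = np.array([1, -1, 1])
print(M @ v)   # -> [ 1 -1  1 ], i.e. an eigenvector with eigenvalue +1
```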
6. Gaps and distance theorems

Let us reinterpret the statement of the main result (Theorem 4.1) from the point of view of the three gap and three distance theorems, which are closely connected with the exchange of two intervals. The name three gap theorem usually refers to the description of the gaps between neighbouring elements of the set
$$G(\alpha,\delta) := \{ n \in \mathbb{N} : \{n\alpha\} < \delta \} \subset \mathbb{N},$$
where $\alpha \in \mathbb{R}\setminus\mathbb{Q}$ and $\delta \in (0,1)$; see [15]. Sometimes one uses a more general formulation, namely the set
$$G(\alpha,\rho,\gamma,\delta) := \{ n \in \mathbb{N} : \gamma \le \{n\alpha + \rho\} < \delta \} \subset \mathbb{N},$$
where moreover $\rho \in \mathbb{R}$ and $0 \le \gamma < \delta < 1$. The three gap theorem states that there exist integers $r_1, r_2$ such that the gaps between neighbours in $G(\alpha,\rho,\gamma,\delta)$ take at most three values, namely in the set $\{r_1, r_2, r_1+r_2\}$.

Let us interpret the three gap theorem in the framework of an exchange of two intervals $J_0 = [0, 1-\alpha)$, $J_1 = [1-\alpha, 1)$. The transformation $T: [0,1) \to [0,1)$ is of the form
$$T(x) = \begin{cases} x + \alpha & \text{for } x \in [0, 1-\alpha), \\ x + \alpha - 1 & \text{for } x \in [1-\alpha, 1), \end{cases} \tag{9}$$
i.e., $T(x) = \{x+\alpha\}$. Therefore we can write
$$G(\alpha,\rho,\gamma,\delta) = \{ n \in \mathbb{N} : T^n(\rho) \in [\gamma,\delta) \}, \tag{10}$$
and the gaps in this set correspond to the return times to the interval $[\gamma,\delta)$ under the transformation $T$. Our Theorem 4.1 is an analogue of the three gap theorem in the form (10), generalized to the case when the transformation $T$ is a non-degenerate 3IET. We have seen that there are 5 gaps, but they are still expressed using two basic values $r_1, r_2$.

The so-called three distance theorem focuses on the distances between neighbours of the set
$$D(\alpha,\rho,N) := \{ \{\alpha n + \rho\} : n \in \mathbb{N},\ n < N \} \subset [0,1).$$
The three distance theorem ensures the existence of $\Delta_1, \Delta_2 > 0$ such that the distances between neighbours in $D(\alpha,\rho,N)$ take at most three values, namely in $\{\Delta_1, \Delta_2, \Delta_1+\Delta_2\}$. In the framework of a 2IET $T$, we can write for the distances
$$D(\alpha,\rho,N) = \{ T^n(\rho) : n \in \mathbb{N},\ n < N \} \subset [0,1), \tag{11}$$
where $\alpha$ is the defining parameter of $T$ as in (9), and $\rho \in [0,1)$. One can study the analogue of the three distance theorem in the form (11) for exchanges of three intervals. In fact, it can be derived from the results of [9] that if $T$ is a 3IET with discontinuity points $\alpha, \beta$, then
$$D(\alpha,\beta,\rho,N) := \{ T^n(\rho) : n \in \mathbb{N},\ n < N \}$$
again has at most three distances, $\Delta_1$, $\Delta_2$, and $\Delta_1 + \Delta_2$, for some positive $\Delta_1, \Delta_2$.

The three distance theorem can also be used to derive that the frequencies of factors of length $n$ in a Sturmian word take at most three values. Recall that the frequency of a factor $w$ in the infinite word $u = u_0u_1u_2\cdots$ is given by
$$\mathrm{freq}(w) := \lim_{N\to\infty} \frac{1}{N}\,\#\{0 \le i < N : w \text{ is a prefix of } u_iu_{i+1}\cdots\},$$
if the limit exists. It is a well-known fact that the frequencies of factors of length $n$ in a coding of an exchange of intervals are given by the lengths of the cylinders corresponding to the factors. The boundary points of these cylinders are $T^{-j}(1-\alpha)$ for $j = 0, \ldots, n-1$. Consequently, the distances in the set $D(\alpha, 1-\alpha, n)$ are precisely the frequencies of factors, and the three distance theorem implies the well-known fact that Sturmian words have, for each $n$, only three values of frequencies of factors of length $n$, namely $\varrho_1, \varrho_2, \varrho_1+\varrho_2$.

The frequencies of factors of length $n$ in 3IET words are given by the distances between neighbours of the set
$$\{ T^{-j}(\alpha) : j \in \mathbb{N},\ j < n \} \cup \{ T^{-j}(\beta) : j \in \mathbb{N},\ j < n \}.$$
In [4] it is shown, based on the study of Rauzy graphs, that the number of distinct values of frequencies in infinite words with reversal-closed language satisfies
$$\#\{\mathrm{freq}(w) : w \in \mathcal{L}(u),\ |w| = n\} \le 2\big(\mathcal{C}_u(n) - \mathcal{C}_u(n-1)\big) + 1,$$
which in the case of 3IET words reduces to $\le 5$. The article [7] shows that the set of integers $n$ for which this bound is attained has density 1 in $\mathbb{N}$.
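Both gap statements are easy to observe numerically. A minimal sketch with illustrative parameter choices: first the classical three-gap set $G(\alpha,\rho,\gamma,\delta)$ of Eq. (10) for an irrational rotation, then the same experiment for the 3IET used in the earlier snippets, where the return times to $I$ take at most five values, as the analogue of Theorem 4.1 asserts.

```python
import numpy as np

# three gap theorem for a rotation (2IET): at most three distinct gaps
alpha, rho, gamma, delta = np.sqrt(2) - 1, 0.1, 0.2, 0.5
hits = [n for n in range(1, 100_000) if gamma <= (n*alpha + rho) % 1.0 < delta]
print(sorted({b - a for a, b in zip(hits, hits[1:])}))   # of the form r1, r2, r1+r2

# analogue for the 3IET of the earlier sketches: at most five return-time values
ALPHA, BETA = np.sqrt(5)/5 - 1/5, 2/3 - np.sqrt(5)/6
def T(x):
    if x < ALPHA:     return x + 1 - ALPHA
    if x < 1 - BETA:  return x + BETA - ALPHA
    return x - (1 - BETA)

gamma, delta, x = 29/100, 99/100, 0.3
visits = []
for n in range(1, 100_000):
    x = T(x)
    if gamma <= x < delta:
        visits.append(n)
print(sorted({b - a for a, b in zip(visits, visits[1:])}))
# -> [1, 2, 3, 4] here: five itineraries, four distinct lengths (cf. table 1)
```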
Acknowledgements

The authors acknowledge financial support by the Czech Science Foundation grant GAČR 13-03538S.

References

[1] P. Alessandri and V. Berthé, Three distance theorems and combinatorics on words, Enseign. Math. (2), 44 (1998), pp. 103–132, doi:10.5169/seals-63900.
[2] P. Ambrož, Z. Masáková, and E. Pelantová, Matrices of 3-IET preserving morphisms, Theor. Comput. Sci., 400 (2008), pp. 113–136, doi:10.1016/j.tcs.2008.02.044.
[3] P. Arnoux, V. Berthé, Z. Masáková, and E. Pelantová, Sturm numbers and substitution invariance of 3IET words, Integers, 8 (2008), article A14, http://eudml.org/doc/117362.
[4] L. Balková and E. Pelantová, A note on symmetries in the Rauzy graph and factor frequencies, Theor. Comput. Sci., 410 (2009), pp. 2779–2783, doi:10.1016/j.tcs.2009.04.002.
[5] P. Baláži, Z. Masáková, and E. Pelantová, Characterization of substitution invariant words coding exchange of three intervals, Integers, 8 (2008), article A20, http://eudml.org/doc/117368.
[6] S. Ferenczi, C. Holton, and L. Q. Zamboni, Structure of three-interval exchange transformations II: a combinatorial description of the trajectories, J. Anal. Math., 89 (2003), pp. 239–276, doi:10.1007/bf02893083.
[7] S. Ferenczi and L. Q. Zamboni, Structure of K-interval exchange transformations: induction, trajectories, and distance theorems, J. Anal. Math., 112 (2010), pp. 289–328, doi:10.1007/s11854-010-0031-2.
[8] J. Florek and K. Florek, Billiard and the five-gap theorem, Discrete Math., 309 (2009), pp. 4123–4129, doi:10.1016/j.disc.2008.12.010.
[9] L. S. Guimond, Z. Masáková, and E. Pelantová, Combinatorial properties of infinite words associated with cut-and-project sequences, J. Théor. Nombres Bordeaux, 15 (2003), pp. 697–725, doi:10.5802/jtnb.422.
[10] A. Hof, O. Knill, and B. Simon, Singular continuous spectrum for palindromic Schrödinger operators, Comm. Math. Phys., 174 (1995), pp. 149–159, doi:10.1007/bf02099468.
[11] M. Keane, Interval exchange transformations, Math. Z., 141 (1975), pp. 25–31, doi:10.1007/bf01236981.
[12] Z. Masáková, E. Pelantová, and Š. Starosta, Exchange of three intervals: substitutions and palindromicity, submitted.
[13] Z. Masáková and E. Pelantová, Itineraries induced by exchange of two intervals, Acta Polytechnica, 53 (2013), pp. 444–449, doi:10.14311/ap.2013.53.0444.
[14] N. B. Slater, The distribution of the integers N for which θN < φ, Proc. Cambridge Philos. Soc., 46 (1950), pp. 525–534, doi:10.1017/s0305004100026086.
[15] N. B. Slater, Gaps and steps for the sequence nθ mod 1, Math. Proc. Cambridge Phil. Soc., 63 (1967), pp. 1115–1123, doi:10.1017/s0305004100042195.
[16] L. Vuillon, On the number of return words in infinite words constructed by interval exchange transformations, Pure Math. Appl. (PU.M.A.), 18 (2007), pp. 345–355, http://www.mat.unisi.it/newsito/puma/public_html/18_3_4/vuillon.pdf.
Acta Polytechnica 56(6):462–471, 2016

Acta Polytechnica
doi:10.14311/ap.2015.55.0313
Acta Polytechnica 55(5):313–318, 2015
© Czech Technical University in Prague, 2015
available online at http://ojs.cvut.cz/ojs/index.php/ap

Analysis of mechanical properties of hydrothermally cured high strength matrix for textile reinforced concrete

Ondřej Holčapek*, Filip Vogel, Petr Konvalinka
Czech Technical University in Prague, Faculty of Civil Engineering, Experimental Centre, Thákurova 7, 166 29 Prague 6, Czech Republic
* corresponding author: ondrej.holcapek@fsv.cvut.cz

Abstract. The main objective of this article is to describe the influence of hydrothermal curing conditions in an autoclave device (different pressures and temperatures), applied at various ages of a fresh mixture (cement matrix, CM, and fibre-reinforced cement matrix, FRCM), on textile reinforced concrete production. The positive influence of autoclaving has been evaluated through the results of physical and mechanical testing (compressive strength, flexural strength, bulk density and dynamic modulus of elasticity), measured on specimens with dimensions of 40 × 40 × 160 mm³. It has been found that increasing the pressure and temperature results in higher values of the measured characteristics. The results indicate that the most suitable surrounding conditions are 0.6 MPa and 165 °C at the age of 21 hours; the final compressive strength of the cement matrix is then 134.3 MPa and its flexural strength is 25.9 MPa (standard-cured samples achieve 114.6 MPa and 15.7 MPa). Hydrothermal curing is even more effective for a cement matrix reinforced by steel fibres (for example, the compressive strength can reach 177.5 MPa, while laboratory-cured samples achieve a compressive strength of 108.5 MPa).

Keywords: compressive strength; flexural strength; dynamic modulus; steel fibres; hydrothermal curing; boundary conditions.

1. Introduction

Current research activities in the field of material science and civil engineering focus on investigations into the properties of advanced materials and their implementation into real practical use. In the context of sustainability and global efforts to minimize energy intensity, new types of cement composites and special curing conditions that help achieve higher mechanical properties and durability are investigated.
This paper deals with an experimental analysis of the influence of hydrothermal curing on the mechanical properties of a cement matrix for textile reinforced concrete. The aim of this study is to quantify the mechanical properties of a textile reinforced concrete cement matrix autoclaved under various boundary conditions (pressure in the autoclave vessel and age of the concrete). Hydrothermal curing took place at various ages after the first contact of water with cement (from 12 to 24 hours). Two types of concrete were used (cement matrix and fibre-reinforced cement matrix). The material presented in this research was originally developed at the Experimental Centre, Faculty of Civil Engineering, CTU in Prague, for textile reinforced concrete production in general. Taking into account the international political situation and terrorist risks, it can find its application in the production of protective façade panels or slabs. Projectile impact on slabs made from this composite (reinforced by various types of textiles, in some cases in combination with steel fibres), as well as their response to dynamic load, have been experimentally investigated.

Textile reinforced concrete (TRC) is a new type of advanced composite material with successful practical use in real constructions. The philosophy of TRC combines the advantages of a cement matrix (high-performance concrete, HPC, or ultra-high-performance concrete, UHPC) and the tensile characteristics of a textile, usually made from a high-strength material (glass, carbon, basalt) [1]. The typical compressive strength of the cement matrix lies in the range from 70 to 130 MPa. This material finds its application in façade panel production, strengthening of existing reinforced concrete load-bearing elements, elements for footbridge manufacture [2], etc. Several elements of such structures belong to the category of prefabricates, where the time development of the mechanical properties determines the removal from the formwork and the economic efficiency of the production. The possibility of accelerating the growth of mechanical properties thus becomes more relevant, and therefore the process of hydrothermal curing in an autoclave device becomes more important. The effect of hydrothermal curing on the mechanical properties of CM and FRCM is presented in this article.

Several works of research deal with the effect of the curing conditions of concrete and high-performance cement composites on the final mechanical properties and their development in time. The final properties depend on the composition of the mixture (use of active admixtures, type of cement, plasticizer, etc.) and on the applied curing method. After 28 days, a system of HPC shows a compressive strength of 137.2/149.6 MPa (cured in water/air), and after 180 days it reaches 154.9/168.3 MPa (cured 28 days in water/air) [3]. Samples of UHPC stored for 48 hours in hot steam at 90 °C and 95 % relative humidity are characterized by a 19 % increase of the modulus of elasticity (52.7 GPa) compared to laboratory-cured samples [4]. The maximum flexural strength of ultra-high-performance fibre-reinforced concrete (UHPFRC) increases by 11 % after being exposed for 72 hours to a temperature of 90 °C in a humid environment [5]. Vogel et al. [6] describe the time development of the compressive strength and flexural strength of CM and FRCM for textile concrete production cured under laboratory conditions.
At the age of one day the compressive strength reached 49.7/27.5 MPa (CM/FRCM), while after 28 days it reached 114.0/108.5 MPa (CM/FRCM). A rapid increase of mechanical properties during the first 7 days characterizes cement composites classified as HPC or UHPC (containing basalt aggregate and CEM I 42.5 R) [7]. In that work, Reiterman et al. described the time dependence of the compressive strength from an early age until 180 days.

Hydrothermal curing in an autoclave device (a combination of high pressure and temperature, under the simultaneous effect of steam) provides a sophisticated possibility to achieve a high final strength in a short time after the first contact between water and cement (approximately 2 days). A major effect of this solution is the formation of a denser microstructure with the formation of a calcium silicate hydrate (C-S-H) phase, which results in higher mechanical properties [8]. The texture of autoclaved UHPC shows a homogeneous, dense cement paste consisting of closely networked crystal fibres with a length of up to one micrometre in a specimen cured at 200 °C and 15 bar [9]. A remarkable increase of compressive and flexural strength was achieved on high-performance concrete with the addition of reactive powder concrete (RPC): compared to a reference mixture (80 MPa), the combination of 2 MPa and 210 °C resulted in a compressive strength of 202 MPa after 8 hours of curing [9]. In addition, at such high strength levels, a matrix for TRC can be expected to respond differently to special curing conditions.

The evaluation of the hydrothermal curing process took into account basic material and mechanical properties (bulk density, flexural strength, compressive strength and dynamic modulus of elasticity). During the experimental program, specimens with dimensions of 40 × 40 × 160 mm³ were made from the high-strength cement matrix and the steel fibre-reinforced cement matrix, and various mechanical tests were performed.

2. Hydrothermal curing

2.1. Curing before the hydrothermal process

Samples of CM and FRCM with dimensions of 40 × 40 × 160 mm³ were produced under laboratory conditions. The fresh mixtures were put into steel forms and their surface was covered by an impermeable foil to avoid evaporation of the technological water. The specimens remained in the steel forms under laboratory conditions (20 °C, at a relative humidity of 50 %) until the hydrothermal process took place. The hydrothermal process took place at various ages of the fresh samples (12, 15, 18, 21 and 24 hours).

Figure 1. Pressure vessel and steel holder with specimens.

2.2. Hydrothermal process

"Hydrothermal curing conditions" refers to the combination of high pressure (0.1–0.8 MPa) and temperature (100–250 °C) in a steam-saturated environment, applied at an early stage of the hydration process. These conditions can be achieved in the autoclave device. The arrangement of the high-pressure vessel enables the simultaneous treatment of six samples with dimensions of 40 × 40 × 160 mm³. The process itself consists of several phases: measurement of the dimensions and weight and placement of the samples into a stainless steel stand; relocation of the specimens into the pressure chamber of the autoclave (Figure 1), along with approximately 1.0 litre of water; placement of the cover onto the pressure vessel and locking by nuts. The second phase of the autoclaving process involves setting the required pressure at the set-point pressure gauge, opening the pressure-release valve, and turning on the heating units.
After reaching 100 °C, steam starts to escape from the pipe of the pressure-release valve, which then has to be closed. Due to the closed pressure-release valve, the pressure and temperature in the high-pressure chamber rise until the required values are reached (in this experimental program: 135 °C and 0.3 MPa; 150 °C and 0.45 MPa; 165 °C and 0.60 MPa). After four hours the heating units are switched off and the cooling fan is switched on. During the cooling phase, the pressure and temperature gradually decrease to 100 °C and 0.1 MPa, when the precision pressure-release valve opens; from this moment the pressure in the pressure vessel equals the atmospheric pressure. The pressure vessel was opened and the specimens were removed when the temperature had decreased to laboratory conditions (approximately 20 °C). At the end of this process, all the samples have reached their final properties and are suitable for expedition (in the case of real production) or for testing of material properties (in laboratory practice).

Figure 2. Pressure and temperature curves of the hydrothermal processes.

Figure 2 schematically shows all three hydrothermal processes (characterized by different boundary conditions: temperature and pressure), with the pressure curve and the temperature curve plotted as functions of time. The pressure starts to rise when the temperature exceeds 100 °C and the pressure-release valve is closed.

3. Composition

A maximum grain size of 2 mm characterizes cement matrices for TRC production in general; the mixture designed for this experiment has a maximum grain size of 1.2 mm. Due to the excellent workability of the fresh mixture, the water-to-binder ratio reaches a slightly higher value (0.29, or even 0.31) than in the case of UHPC or UHPFRC (0.19) [10], or of high-performance cement composites for high-temperature applications (0.25) [11]. The composition of the mixtures used is given in Table 1. The slightly higher value of the water-to-binder ratio of FRCM is due to the steel fibre dosage. The applied CEM I 42.5 R cement replaces the CEM 52.5 R commonly used in the UHPC category [12, 13].

The philosophy of manufacture complies with the technology of UHPC production. In short, the homogenization of silica sand and silica fume takes place for 5 minutes; the mixing continues with the addition of silica powder and cement and homogenization for the next 5 minutes. The water and the plasticizer are added at the same time, and the mixing of the cement matrix ends after 5 more minutes. In the case of the fibre-reinforced cement matrix, the gradual addition of the full fibre dosage and its even mixing takes the last 5 minutes.

Component                     CM [kg/m³]   FRCM [kg/m³]
CEM I 42.5 R cement             680           680
silica fume                     129           129
silica sand 0.1/0.6             326           326
silica sand 0.3/0.8             340           340
silica sand 0.6/1.2             258           258
silica powder                   326           326
SVC 1035 superplasticizer         6.8           7.5
potable water                   238           252
steel fibres 0.14 × 13 mm         –           102
water–binder ratio                0.29          0.31

Table 1. Composition of the mixtures used.
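As a quick cross-check of the last row of Table 1: if the binder is taken as cement plus silica fume (an assumption about what counts as binder, consistent with the tabulated values), the stated ratios follow directly.

```python
cement, silica_fume = 680.0, 129.0    # kg/m3, from table 1
for mix, water in {'CM': 238.0, 'FRCM': 252.0}.items():
    print(mix, round(water / (cement + silica_fume), 2))   # CM 0.29, FRCM 0.31
```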
4. Investigated properties

The evaluation of the optimal age of the cement matrix or fibre-reinforced cement matrix and of the conditions in the autoclave device (temperature and pressure) took place with the help of non-destructive measurement of the dynamic modulus ($E_{cu}$), calculation of the bulk density ($\varrho$), and destructive measurement of the following mechanical parameters: compressive strength ($f_{cm}$) and flexural strength ($f_{tm}$). All properties were measured on three samples and the final value is an average, except for the compressive strength (an average of six values, obtained from the fragments left after the bending tests).

4.1. Dynamic modulus of elasticity

Non-destructive methods play a significant role in civil engineering, material science, diagnostics of structures, and descriptions of gradual changes in materials due to temperature, moisture and dynamic loading [14, 15, 16]. A Proceq Punditlab+ ultrasonic velocity test instrument was used to determine the ultrasound speed $v_L$ with a 54 kHz pulse transducer. Ultrasonic direct transmission is the most frequently used arrangement, in which the pulse amplitude reaching the receiving transducer is the highest. The one-dimensional arrangement requires the pulse transducer and the receiver to be positioned on opposite sides of the 40 × 40 × 160 mm³ specimen and to be pointed directly at each other (Figure 3).

Figure 3. Principle of ultrasonic testing.

The following equation determines the value of the dynamic modulus of elasticity (the bulk density was calculated from the dimensions and weight; the pulse velocity was measured by the Proceq Punditlab+; the characteristic of the environment equals one because of the one-dimensional arrangement):
$$E_{cu} = \varrho\,\frac{v_L^2}{k^2}, \tag{1}$$
where $E_{cu}$ is the dynamic modulus of elasticity [MPa], $\varrho$ is the bulk density of the measured material [kg/m³], $v_L$ is the pulse velocity of the ultrasonic waves [m/s], and $k$ is the characteristic of the environment [–] [17].
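For illustration, Eq. (1) with $k = 1$ (the one-dimensional arrangement) turns a pulse-velocity reading directly into a modulus. The velocity below is a hypothetical value, back-computed so that the result lands near the 28-day reference modulus in Table 2; measured velocities are not reported in the paper.

```python
rho = 2260.0    # bulk density [kg/m3], cf. table 2
v_l = 3890.0    # pulse velocity [m/s]; hypothetical reading
k = 1.0         # environment characteristic, one-dimensional arrangement
E_cu = rho * v_l**2 / k**2
print(round(E_cu / 1e9, 1), 'GPa')   # -> 34.2 GPa
```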
4.2. Mechanical properties

All investigated parameters were determined on prismatic specimens with dimensions of 40 × 40 × 160 mm³. The measurement of the flexural strength $f_{tm}$ was organized as a three-point bending test with supports at a distance of 100 mm according to [18], and the strength was calculated from the maximum force reached. An MTS 100 universal loading machine was used for this testing, and the test was controlled by the increase of deformation (1.0 mm/min). The compressive strength test $f_{cm}$ was performed on the two fragments left after the bending test, with the help of the EU 40 machine with a maximum force of 400 kN. The area under compressive load (40 × 40 mm²) was demarcated by a Matest loading device fitted to the EU 40. This arrangement fully complies with the requirements for testing refractory cement composites, HPC, UHPC and UHPFRC [11, 13].

5. Results

First, it is necessary to say that all testing of the mechanical properties of the hydrothermally cured specimens was performed on the day following the hydrothermal curing process (the final properties are achieved at the end of the hydrothermal curing process in the autoclave device). The values measured on the samples cured under laboratory conditions and tested after 1 day and after 28 days (the age at which the strength parameters of cement composites are usually evaluated in practice) are listed for better comparison.

Table 2 shows the results of all measurements (compressive strength, flexural strength, dynamic modulus of elasticity and bulk density) performed during the experimental program. The values of all these parameters for the 1-day-old and 28-day-old samples are shown for comparison with the autoclaved samples. Three different sets of conditions (0.3 MPa and 135 °C; 0.45 MPa and 150 °C; 0.60 MPa and 165 °C), together with the different ages at which the samples were hydrothermally cured, are listed.

Conditions          Age    f_cm [MPa]       f_tm [MPa]      E_cu [GPa]     ϱ [kg/m³]
                           CM      FRCM     CM     FRCM     CM     FRCM    CM      FRCM
0.30 MPa, 135 °C    12 h    83.5    94.1   13.3    25.2    38.9    37.4    2150    2250
                    15 h   117.5   133.8   13.0    23.5    41.7    41.6    2190    2285
                    18 h   113.5   145.2   15.5    34.6    42.0    41.8    2200    2300
                    21 h   116.9   144.0   10.0    28.6    41.9    41.7    2190    2305
                    24 h   120.4   154.7   16.9    39.8    45.0    43.1    2240    2320
0.45 MPa, 150 °C    12 h    58.3    63.4    9.5    24.0    30.2    29.2    2165    2295
                    15 h   102.4   106.7   16.9    29.6    43.7    39.7    2180    2250
                    18 h   122.5   163.0   17.2    40.0    41.2    43.3    2140    2310
                    21 h   118.8   175.3   19.2    39.2    44.7    45.2    2220    2320
                    24 h   125.5   171.9   15.6    46.0    46.1    45.6    2225    2260
0.60 MPa, 165 °C    12 h    30.1    48.7    4.1    11.4    13.9    20.1    2200    2260
                    15 h    84.2    70.7   12.0    23.3    38.2    32.2    2190    2250
                    18 h   123.0   150.1   16.3    38.4    44.7    48.9    2190    2300
                    21 h   134.3   177.5   25.9    48.4    44.9    45.5    2225    2320
                    24 h   127.3   169.2   16.0    52.3    45.3    46.1    2190    2315
Reference, 1 day            49.7    27.5    9.4    23.9    23.1    22.1    2240    2270
Reference, 28 days         114.6   108.5   15.7    34.1    34.2    34.3    2260    2230

Table 2. Summary of the performed experiments (measured data).

The dependence of the compressive strength ($f_{cm}$) of CM (Figure 4) and FRCM (Figure 5) on the pressure and temperature is shown in the graphs for better illustration. Figures 6 and 7 show the values of the flexural strength ($f_{tm}$) of CM and FRCM after the different hydrothermal conditions and ages of the samples. Especially in Figures 4 and 5 we can see the influence of rising pressure on the compressive strength of specimens of various ages. The red line in each graph marks the value of the respective mechanical property measured on the reference samples at 28 days; all hydrothermally cured samples above this line thus exceed the 28-day reference.

Figure 4. Average values of the compressive strength $f_{cm}$ of CM after curing under different conditions.

Figure 5. Average values of the compressive strength $f_{cm}$ of FRCM after curing under different conditions.
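The relative gains quoted in the conclusions below can be reproduced from Table 2; they appear to be computed against the autoclaved value rather than the 28-day reference (an assumption; the small residual differences, e.g. 14.7 vs. 14.5 %, are presumably due to rounding).

```python
best  = {'f_cm CM': 134.3, 'f_cm FRCM': 177.5, 'f_tm FRCM': 52.3}  # best autoclaved
ref28 = {'f_cm CM': 114.6, 'f_cm FRCM': 108.5, 'f_tm FRCM': 34.1}  # 28-day reference
for key in best:
    print(key, round((best[key] - ref28[key]) / best[key] * 100, 1), '%')
# -> 14.7 %, 38.9 %, 34.8 %
```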
6. Conclusions and discussion

Based on the performed experiments, on the summarization of the results, and on the graphical evaluation, we can identify several dependencies and draw several conclusions. The compressive strength, flexural strength and dynamic modulus of elasticity were measured on 90 specimens in total, tested after curing under various conditions in the autoclave device (135 °C and 0.3 MPa; 150 °C and 0.45 MPa; 165 °C and 0.6 MPa). According to the outcomes of the performed experimental program, the following conclusions can be presented:

(1.) The compressive strength ($f_{cm}$) of the cement matrix does not increase as rapidly as that of the fibre-reinforced cement matrix. The maximum increase achieved for CM is 14.5 % in comparison with the samples tested at 28 days; the increase observed for FRCM is 38.9 %. The maximum compressive strength is observed on samples hydrothermally cured at the age of 21 or 24 hours. Based on the values of the flexural strength ($f_{tm}$), we can confirm the assumption of theoretical research that the positive influence of hydrothermal curing lies mainly in the tensile characteristics. An increase of 34.7 % characterizes the flexural strength of FRCM hydrothermally cured at 24 hours under the conditions 165 °C and 0.6 MPa.

Figure 6. Average values of the flexural strength $f_{tm}$ of CM after curing under different conditions.

Figure 7. Average values of the flexural strength $f_{tm}$ of FRCM after curing under different conditions.

(2.) The application of a pressure higher than 0.60 MPa at a temperature higher than 160 °C would probably result in even better mechanical characteristics, but to the detriment of the economy of production. A higher pressure means more expensive production (energy costs), and the hydrothermal process then has to be applied to older samples (their initial strength must be sufficient to resist the high pressure).

(3.) Increasing the temperature and pressure in the autoclave leads to lower values of the mechanical properties of younger samples (typically 12 and 15 hours). Samples at this age are not strong enough to resist the pressure, which results in micro-cracks and decreased strength. The application of the lower pressure of 0.3 MPa at the age of 12 hours leads to values of compressive strength roughly three times higher than the application of 0.6 MPa.

(4.) The increasing mechanical parameters are accompanied by a growth of the bulk density and of the dynamic modulus of elasticity. The bulk density growth is associated with the denser microstructure and hydration products formed under the hydrothermal curing conditions.

We can conclude from the performed experiments that hydrothermal curing has a great potential for textile reinforced concrete production (especially of façade panels, load-bearing elements and other precast structural parts). The most suitable combination of boundary conditions (pressure and temperature) and final mechanical properties must always take into account technological and especially economical aspects. The denser microstructure developed in autoclave-cured specimens is connected with a presumably lower permeability for liquids and gases; low permeability, in turn, is connected with higher durability and better long-term properties. The financial costs of hydrothermally cured elements have to be compensated by the final mechanical characteristics; therefore, such elements are expected to be used in highly exposed places, with high requirements on durability and resistance to extreme static or dynamic loads.

Acknowledgements

This research work was supported by the Czech Science Foundation under project GAP 105/12/G059 "Cumulative time dependent processes in building materials and structures".
The authors gratefully acknowledge the assistance given by the technical staff of the Experimental Centre, a department of the Faculty of Civil Engineering, CTU in Prague.

References

[1] Brückner, A., Ortlepp, R., Curbach, M.: Textile reinforced concrete for strengthening in bending and shear. Materials and Structures (2006), pp. 741–748. doi:10.1617/s11527-005-9027-2
[2] Schladitz, F., Frenzel, M., Ehlig, D., Curbach, M.: Bending load capacity of reinforced concrete slabs strengthened with textile reinforced concrete. Engineering Structures 40 (2012), pp. 317–326. doi:10.1016/j.engstruct.2012.02.029
[3] Holčapek, O., Vogel, F., Vavřiník, T., Keppert, M.: Time progress of compressive strength of high performance concrete. Applied Mechanics and Materials 486 (2014), pp. 167–172. doi:10.4028/www.scientific.net/AMM.486.167
[4] Graybeal, B. A.: Compressive behavior of ultra-high-performance fiber-reinforced concrete. ACI Materials Journal 104 (2007), pp. 146–152. doi:10.14359/18577
[5] Hong, K. N., Kang, S. T., Kim, S. W., Park, J. J., Han, S. H.: Material properties of air-cured ultra-high-performance steel-fiber-reinforced concrete at early ages. International Journal of the Physical Sciences 5 (2010). ISSN 1992-1950
[6] Vogel, F., Holčapek, O., Jogl, M., Kolář, K., Konvalinka, P.: Development of mechanical properties of steel fibers reinforced high strength concrete. Advanced Materials Research 1077 (2015), pp. 113–117. doi:10.4028/www.scientific.net/AMR.1077.113
[7] Reiterman, P., Jogl, M., Bäumelt, V., Seifrt, J.: Development and mix design of HPC and UHPFRC. Advanced Materials Research 982 (2014), pp. 130–135. doi:10.4028/www.scientific.net/AMR.982.130
[8] Lehmann, C., Fontana, P., Müller, U.: Evolution of phases and micro structure in hydrothermally cured ultra-high performance concrete (UHPC). In: Nanotechnology in Construction: Proceedings of NICOM3 (2009), ISBN 978-3-642-00979-2, pp. 287–293. doi:10.1007/978-3-642-00980-8_38
[9] Yazici, H., Deniz, E., Baradan, B.: The effect of autoclave pressure, temperature and duration time on mechanical properties of reactive powder concrete. Construction and Building Materials 42 (2013), pp. 53–63. doi:10.1016/j.conbuildmat.2013.01.003
[10] Sovják, R., Vogel, F., Beckmann, B.: Triaxial compressive strength of ultra high performance concrete. Acta Polytechnica 53 (2013). doi:10.14311/ap.2013.53.0901
[11] Holčapek, O., Reiterman, P., Konvalinka, P.: Experimental analysis of hydrothermal curing influence on properties of fiber-cement composite. Applied Mechanics and Materials 732 (2015), pp. 55–58. doi:10.4028/www.scientific.net/AMM.732.55
[12] Sovják, R., Vavřiník, T., Zatloukal, J., Máca, P., Mičunek, T., Frydrýn, M.: Resistance of slim UHPFRC targets to projectile impact using in-service bullets. International Journal of Impact Engineering 76 (2015), pp. 166–177. doi:10.1016/j.ijimpeng.2014.10.002
[13] Reiterman, P., Holčapek, O., Polozhiy, K., Konvalinka, P.: Fracture properties of cement pastes modified by fine ground ceramic powder. Advanced Materials Research 1054 (2014), pp. 182–187. doi:10.4028/www.scientific.net/AMR.1054.182
[14] Holčapek, O., Reiterman, P., Konvalinka, P.: Fracture characteristics of refractory composites containing metakaolin and ceramic fibers. Advances in Mechanical Engineering 7(3) (2015), pp. 1–13. doi:10.1177/1687814015573619
[15] Holčapek, O., Reiterman, P., Jogl, M., Konvalinka, P.: Destructive and non-destructive testing of high temperature influence on refractory fiber composite. Advanced Materials Research 982 (2014), pp. 145–148. doi:10.4028/www.scientific.net/AMR.982.145
[16] Pavlů, T., Šefflová, M.: The static and the dynamic modulus of elasticity of recycled aggregate concrete. Advanced Materials Research 1054 (2014), pp. 221–226. doi:10.4028/www.scientific.net/AMR.1054.221
[17] ČSN EN 12504-4: Testing of concrete – Part 4: Determination of ultrasonic pulse velocity (2005).
[18] ČSN EN 196-1: Methods of testing cement – Part 1: Determination of strength (2005).

Acta Polytechnica 55(5):313–318, 2015

Acta Polytechnica
doi:10.14311/app.2016.56.0067
Acta Polytechnica 56(1):67–75, 2016
© Czech Technical University in Prague, 2016
available online at http://ojs.cvut.cz/ojs/index.php/ap

Adaptive distribution of a swarm of heterogeneous robots

Amanda Prorok*, M. Ani Hsieh, Vijay Kumar
General Robotics, Automation, Sensing & Perception (GRASP) Laboratory, University of Pennsylvania, Philadelphia, USA
* corresponding author: prorok@seas.upenn.edu

Abstract. We present a method that distributes a swarm of heterogeneous robots among a set of tasks that require specialized capabilities in order to be completed. We model the system of heterogeneous robots as a community of species, where each species (robot type) is defined by the traits (capabilities) that it owns. Our method is based on a continuous abstraction of the swarm at a macroscopic level, as we model robots switching between tasks. We formulate an optimization problem that produces an optimal set of transition rates for each species, so that the desired trait distribution is reached as quickly as possible. Since our method is based on the derivation of an analytical gradient, it is very efficient with respect to state-of-the-art methods. Building on this result, we propose a real-time optimization method that enables an online adaptation of transition rates. Our approach is well suited for real-time applications that rely on the online redistribution of large-scale robotic systems.

Keywords: heterogeneous multi-robot systems; swarm robotics; stochastic systems; task allocation.

1. Introduction

Technological advances in embedded systems, such as component miniaturization and the improved efficiency of sensors and actuators, are enabling the deployment of very large-scale robot systems, i.e., swarms of robots.
However, the smaller we design our platforms, the more stringent the trade-offs we need to make with respect to the endowed capabilities. As a consequence, investigators are composing their robot systems of multiple, heterogeneous types of robots in order to tackle increasingly challenging tasks [1, 2]. Our premise is that, in a swarm of robots, a single type of robot cannot cater to all aspects of the task at hand, because at the individual level it is governed by design rules that limit the scope of its capabilities. In this work, our objective is to distribute a swarm of heterogeneous robots as quickly and efficiently as possible among tasks that require specialized competences. This objective is part of a larger vision of developing control and coordination strategies for teams of heterogeneous robots with specific capabilities. For example, a larger robot may be able to carry more powerful sensors, but may be less agile than its smaller counterpart. Or consider the limited payload of aerial robots: if a given task requires rich sensory feedback, multiple heterogeneous aerial robots can complement each other by carrying distinct sensors, altogether more than a single one could carry on its own.

Initially, we consider tasks that are to be performed in parallel, continuously, and independently, yet we develop a framework that can easily be extended to accommodate temporal and precedence constraints. Instances of information gathering lend themselves naturally to our problem formulation, with applications to surveillance, environmental monitoring, and situational awareness [3–5]. Given a set of tasks, and knowledge of the task requirements, our problem considers which robots should be allocated to which tasks. This problem is an instance of the MT-MR-TA (multi-task robots, multi-robot tasks) problem [6], and can be reformulated as a set-covering problem from combinatorial optimization. This formulation considers subsets of robots in a multi-robot system and pairs them optimally (given a cost function) to tasks. The problem is strongly NP-hard [7]. A number of heuristic algorithms have been proposed; however, their running times are functions of the sizes of the feasible subsets (of robots paired to a task), and hence become very expensive for large robot teams and swarms. Furthermore, these algorithms are penalized when multiple robot-subset-to-task combinations are feasible (this is the case, for example, when robots have overlapping capabilities). Such algorithms are not suitable for large-scale systems such as robot swarms. In particular, for systems that are required to adapt to changing task requirements online, we need algorithms that are efficient and implementable on mobile platforms running them in real time.

Our method builds on previous work in the domain of the dynamic redistribution of homogeneous robot swarms [8–10]. We consider a strategy that is scalable in the number of robots and their capabilities, and is robust to changes in the robot population. The key property of this strategy is its inherently decentralized architecture, with robots switching between behaviors (tasks) stochastically. This model is inspired by previous work in the swarm-robotic domain that explores the self-organized behavior of natural systems [11, 12].
The present work focuses on the optimization of the transition rates that enables the robot swarm to converge quickly to a configuration satisfying a desired trait distribution. The key difference between our work and previous work is that we formulate the desired state as a distribution of traits among sites, instead of specifying it as a direct measure of the robot distribution. In other words, our framework allows a user to specify how much of a given capability is needed for a given task, irrespective of which robot type satisfies that need. As a consequence, we do not employ optimization methods that utilize final robot distributions in their formulations (as is the case in the works presented in [8] and [13]). Instead, we explicitly optimize the distribution of traits, and implicitly solve the combinatorial problem of distributing the right number of robots of a given type to the right tasks. Indeed, we will later show that there are cases where multiple final robot distributions satisfy the desired trait distribution. In such cases, state-of-the-art strategies require that we first determine the best final robot distribution (one that can be reached the fastest), and subsequently optimize the transition rates using methods such as those in [8].

2. Problem formulation

Heterogeneity and diversity are core concepts of this work. To develop our formalism, we borrow terminology from the biodiversity literature [14, 15]. We define our robot system as a community of robots. Each robot belongs to a species, which defines the unique set of traits that encodes the robots' capabilities. In this work, we consider binary instantiations of traits (corresponding to the presence or absence of a given trait in a species). As an example, one trait might consider the presence or absence of a particular sensor, such as a camera or a laser range finder. Another trait might consider the capability of fitting through a passageway of a fixed width. We assume that the tasks have been encoded through binary characteristics that represent the skill sets critical to task completion.

2.1. Notation

We consider a community of $S$ robot species, with a total number of robots $N$, and $N^{(s)}$ robots per species, such that $\sum_{s=1}^{S} N^{(s)} = N$. The community is defined by a set of $U$ traits, and each robot species owns a subset of these traits. A species is defined by a vector $\mathbf{q}^{(s)} = [q_1^{(s)}, q_2^{(s)}, \ldots, q_U^{(s)}]$. We can then define an $S \times U$ matrix $\mathbf{Q}$ with rows $\mathbf{q}^{(s)}$:
$$Q_{su} = \begin{cases} 0 & \text{if species } s \text{ does not have trait } u, \\ 1 & \text{if species } s \text{ has trait } u. \end{cases}$$

We model the interconnection topology of the $M$ sites via a directed graph $\mathcal{G} = (\mathcal{E}, \mathcal{V})$, where the set of vertices $\mathcal{V}$ represents the task sites $\{1, \ldots, M\}$ and the set of edges $\mathcal{E}$ represents the ordered pairs $(i,j)$ such that $(i,j) \in \mathcal{V} \times \mathcal{V}$ and $i$ and $j$ are adjacent. We assume that $\mathcal{G}$ is strongly connected, i.e., a path exists between any pair of vertices (in contrast to a fully connected graph, where an edge exists between any pair of vertices). In other words, if the nodes of our graph are physically distributed sites, then, via some road, we can reach any other site. We assign to every edge in $\mathcal{E}$ a transition rate $k_{ij}^{(s)} > 0$, where $k_{ij}^{(s)}$ defines the transition probability per unit time for one robot of species $s$ at site $i$ to switch to site $j$. Here $k_{ij}^{(s)}$ is a stochastic transition rule. We assume every robot has knowledge of $\mathcal{G}$ as well as of all the transition rates of its species, $k_{ij}^{(s)}$. We note that this information is represented by a small number of values (at most $M^2 \cdot S$ values, or much fewer if the graph is sparse), and needs to be transmitted to the robots only once, at the start of a run.
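As a toy illustration of the trait matrix (the species and traits below are hypothetical, not taken from the paper), consider three species sharing four binary traits; the rank condition discussed in Section 2.3 below is then easy to check:

```python
import numpy as np

# hypothetical community: S = 3 species, U = 4 traits
Q = np.array([[1, 0, 1, 1],   # species 1: e.g. camera, gripper, radio
              [0, 1, 1, 0],   # species 2: e.g. lidar, gripper
              [1, 1, 0, 1]])  # species 3: e.g. camera, lidar, radio
print(np.linalg.matrix_rank(Q))  # 3 = S: no species is a combination of the others
```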
we note that this information is represented by a small number of values (at most $M^2 \cdot S$ values, or much fewer if the graph is sparse), and needs to be transmitted to the robots only once, at the start of a run. we impose a limitation on the maximum rate of each edge, $k^{(s)}_{ij} < k^{(s)}_{ij,\max}$. these values can be determined by applying system identification methods to the actual system. for example, in a system where nodes represent physically distributed sites, the transition rate represents the rate at which a specific path is chosen. this value can depend on observed factors, such as typical road congestion or the condition of the terrain.

the distribution of the robots belonging to a species $s$ at time $t$ is described by a vector $x^{(s)}(t) = [x^{(s)}_1(t), \dots, x^{(s)}_M(t)]^\top$. then, if $x^{(s)}$ are the columns of $X(t)$, and $q^{(s)}$ are the rows of $Q$, we have the $M \times U$ matrix $Y$ that describes the distribution of traits over the sites. for time $t$, this relationship is given by

$$Y(t) = X(t) \cdot Q. \quad (1)$$

as we will see in section 2.3, there may be several robot distributions $X(t)$ that satisfy this equation for a given $Y(t)$.

2.2. problem statement

the initial state of the system is described by $X(0)$, and hence, the initial configuration of traits at the sites is described by $Y(0)$. the time evolution of the number of robots of species $s$ at site $i$ is given by the linear law

$$\frac{dx^{(s)}_i}{dt} = \sum_{\forall j \,|\, (j,i) \in E} k^{(s)}_{ji}\, x^{(s)}_j(t) \;-\; \sum_{\forall j \,|\, (i,j) \in E} k^{(s)}_{ij}\, x^{(s)}_i(t). \quad (2)$$

then, for all species $s$, our base model is given by

$$\frac{dx^{(s)}}{dt} = K^{(s)} x^{(s)} \quad \forall s \in 1, \dots, S, \quad (3)$$

where $K^{(s)} \in \mathbb{R}^{M \times M}$ is a rate matrix with the properties

$$K^{(s)\top} \mathbf{1} = \mathbf{0}, \quad (4)$$

$$K^{(s)}_{ij} \ge 0 \quad \forall (i,j) \in E. \quad (5)$$

these two properties result in the following definition:

$$K^{(s)}_{ij} = \begin{cases} k^{(s)}_{ji}, & \text{if } i \ne j,\ (j,i) \in E,\\ 0, & \text{if } i \ne j,\ (j,i) \notin E,\\ -\sum_{l=1,\,(j,l)\in E}^{M} k^{(s)}_{jl}, & \text{if } i = j. \end{cases}$$

since the total number of robots and the number of robots per species are conserved, the system in eq. 3 is subject to the constraint

$$X^\top \cdot \mathbf{1} = [N^{(1)}, N^{(2)}, \dots, N^{(S)}]^\top. \quad (6)$$

the goal is to find an optimal rate matrix $K^{(s)f}$ for each species $s$ so that we have

$$\bar{Y} = \bar{X} \cdot Q. \quad (7)$$

in other words, the task is to redeploy the robots of each species, configured according to $X(0)$ initially, so that a desired trait configuration $\bar{Y}$ is reached. in doing this, we reach a robot configuration $\bar{X}$ that satisfies eq. 1, subject to eq. 6.

2.3. system properties

since we describe the desired configuration of our system through $\bar{Y}$, the final robot distribution $\bar{X}$ is not known a priori. given knowledge of $Q$, we can infer the following properties: if a solution to eq. 7 subject to eq. 6 exists, then

(1.) if $\operatorname{rank}(Q) < S$, the system is underdetermined, and an infinite number of solutions $\bar{X}$ will satisfy eq. 7. in other words, at least one species in the system can be replaced by a combination of the other species.

(2.) if $\operatorname{rank}(Q) = S$, only one solution $\bar{X}$ exists that satisfies eq. 7. in other words, no species in the system can be expressed as a combination of the other species.

we note that solving case (1) is relevant when we embed redundancy into the robot system, and case (2) is relevant when we consider fully complementary robot species.
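the base model of eqs. 1–6 can be simulated directly. the sketch below (an illustrative continuation of the snippet above, not code from the paper) assembles each rate matrix $K^{(s)}$ from the edge rates, propagates the species distributions with the matrix exponential, and maps them to a trait distribution via eq. 1.

```python
import numpy as np
from scipy.linalg import expm

def rate_matrix(k_s):
    """Assemble K(s) from edge rates so that dx/dt = K(s) x (eqs. 2-5)."""
    K = k_s.T.copy()               # K[i, j] = k_ji for (j, i) in E
    K -= np.diag(k_s.sum(axis=1))  # K[j, j] = -sum_l k_jl
    return K

def trait_distribution(x0_list, k_list, Q, t):
    """Y(t) = sum_s e^(K(s) t) x0(s) q(s), cf. eqs. 1 and 3."""
    Y = np.zeros((k_list[0].shape[0], Q.shape[1]))
    for x0, k_s, q_s in zip(x0_list, k_list, Q):
        x_t = expm(rate_matrix(k_s) * t) @ x0   # species distribution at t
        Y += np.outer(x_t, q_s)                 # its trait contribution
    return Y

# Example: 100 robots per species, all starting at site 0. The columns of
# e^(K t) sum to one, so robots per species are conserved (eq. 6).
x0 = [np.r_[100.0, np.zeros(M - 1)] for _ in range(S)]
Y_t = trait_distribution(x0, k, Q, t=5.0)
```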
3. methodology

in this section, we describe our methodology for obtaining an optimal transition matrix $K^{(s)f}$ for each species, so that the desired trait distribution is reached. berman et al. [8] present an exposé of optimization methods that can be used to obtain optimal transition rates for a homogeneous robot swarm that is required to converge to a desired distribution. two general approaches are considered: convex optimization and stochastic optimization. the convex optimization approach requires knowledge of the desired final robot distribution. however, our problem formulation specifies a desired trait distribution $\bar{Y}$ without an explicit definition of the final robot distribution $\bar{X}$. hence, convex optimization strategies as in [8] are not applicable to our problem, unless $\operatorname{rank}(Q) = S$ and we can infer $\bar{X}$. given this rationale, we choose an optimization approach that is able to find optimal transition rates with knowledge of $\bar{Y}$ and $X(0)$, without knowledge of $\bar{X}$. although fully stochastic schemes such as metropolis optimization have been shown to produce similar results [8], they are not computationally efficient, and are ill-suited to real-time applications. in the following, we present a differentiable objective function that can be efficiently minimized through gradient descent techniques. we show that our method has a computational complexity that is well-suited to real-time applications. additionally, our method explicitly minimizes the convergence time of $K^{(s)}$, unlike the convex optimization methods presented in [8], which approximate $K^{(s)}$ with a symmetric equivalent (forcing bidirectionally equal transition rates between sites).

3.1. design of optimal transition rates

we combine the solution of the linear ordinary differential equation, eq. 3, and eq. 7 to obtain the solution for a desired trait distribution

$$\bar{Y} = \sum_{s=1}^{S} e^{K^{(s)f} \tau}\, x^{(s)}_0 \cdot q^{(s)}, \quad (8)$$

where $\tau$ is the time at which the desired state is reached. we design our objective function as follows. to find the values of $K^{(s)f}$ for all species, for the given initial configuration $X(0)$, we consider the optimization

$$\text{minimize} \quad \mathcal{J}^{(1)} = \Big\| \bar{Y} - \sum_{s=1}^{S} e^{K^{(s)} \tau}\, x^{(s)}_0 \cdot q^{(s)} \Big\|_F^2 \quad \text{such that} \quad k^{(s)}_{ij} < k^{(s)}_{ij,\max}, \quad (9)$$

which formulates that a minimum cost is found when the final trait distribution corresponds to the desired trait distribution, subject to the maximum transition rates $k^{(s)}_{ij,\max}$. the notation $x^{(s)}_0$ is shorthand for $x^{(s)}(0)$. the operator $\| \cdot \|_F$ denotes the frobenius norm of a matrix.

there is no closed-form solution to the optimization problem in eq. 9, but we can use the derivatives of $\mathcal{J}^{(1)}$ with respect to the parameters to perform gradient descent. for the implementation of the optimization to be efficient, it is important that the function is differentiable and that an analytical gradient can be computed. by applying the chain rule, the derivative of our objective function with respect to the transition matrix $K^{(s)}$ is

$$\frac{\partial \mathcal{J}^{(1)}}{\partial K^{(s)}} = \frac{\partial \mathcal{J}^{(1)}}{\partial e^{K^{(s)}\tau}} \cdot \frac{\partial e^{K^{(s)}\tau}}{\partial K^{(s)}\tau} \cdot \frac{\partial K^{(s)}\tau}{\partial K^{(s)}}. \quad (10)$$

we first compute the derivative of the cost with respect to the expression $e^{K^{(s)}\tau}$:

$$\frac{\partial \mathcal{J}^{(1)}}{\partial e^{K^{(s)}\tau}} = 2 \Big[ \sum_{r \in S} e^{K^{(r)}\tau}\, x^{(r)}_0 \cdot q^{(r)} - \bar{Y} \Big] \cdot \big[ x^{(s)}_0 \cdot q^{(s)} \big]^\top. \quad (11)$$

the derivation of the second element of eq. 10 requires the derivative of the matrix exponential, which is not trivial to compute. we adapt the closed-form solution given in [16] to our problem, and write the gradient of our cost function as

$$\frac{\partial \mathcal{J}^{(1)}}{\partial K^{(s)}} = V^{-\top} \Big[ V^\top \frac{\partial \mathcal{J}^{(1)}}{\partial e^{K^{(s)}\tau}}\, V^{-\top} \odot W(\tau) \Big] V^\top \tau, \quad (12)$$

where $\odot$ is the hadamard product and $K^{(s)} = V D V^{-1}$ is the eigendecomposition of $K^{(s)}$. $V$ is the $M \times M$ matrix whose $j$th column is a right eigenvector corresponding to the eigenvalue $d_j$, and $D = \operatorname{diag}(d_1, \dots, d_M)$. the matrix $W(t)$ is composed as follows:

$$W_{ij}(t) = \begin{cases} (e^{d_i t} - e^{d_j t}) / (d_i t - d_j t) & \text{if } i \ne j,\\ e^{d_i t} & \text{if } i = j. \end{cases}$$

here, we assume that $K^{(s)}$ has $M$ distinct eigenvalues. if this is not the case, an analogous decomposition of $K^{(s)}$ into jordan canonical form is possible, as elaborated in [16]; for most models of interest, however, repeated eigenvalues are rarely encountered.
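eq. 12 translates almost line by line into code. the sketch below is our illustrative transcription (assuming distinct eigenvalues, as noted above, and scipy/numpy as in the paper); `dJ_dexp` stands for the outer derivative of eq. 11.

```python
import numpy as np

def expm_gradient(K, dJ_dexp, tau):
    """Gradient of J w.r.t. K through e^(K tau), following eq. 12 ([16]).

    K       : (M, M) rate matrix, assumed diagonalizable with distinct
              eigenvalues.
    dJ_dexp : (M, M) derivative of the cost w.r.t. e^(K tau), cf. eq. 11.
    """
    d, V = np.linalg.eig(K)            # K = V diag(d) V^{-1}
    Vinv = np.linalg.inv(V)
    dt = d * tau
    # W(tau): divided differences of the exponential (definition above).
    num = np.exp(dt)[:, None] - np.exp(dt)[None, :]
    den = dt[:, None] - dt[None, :]
    W = np.where(np.eye(len(d), dtype=bool), np.exp(dt)[:, None],
                 num / np.where(den == 0, 1.0, den))
    inner = (V.T @ dJ_dexp @ Vinv.T) * W       # Hadamard product
    return np.real(Vinv.T @ inner @ V.T) * tau
```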
3.2. optimization of convergence time

the cost function in eq. 9 does not consider the convergence time $\tau$ as a variable. by adding a term that penalizes high convergence time values, we can compute transition rates that explicitly optimize the convergence time. the modified objective function is

$$\text{minimize} \quad \mathcal{J}^{(2)} = \mathcal{J}^{(1)} + \alpha \tau^2 \quad \text{such that} \quad k^{(s)}_{ij} < k^{(s)}_{ij,\max} \ \text{ and } \ \tau > 0, \quad (13)$$

with $\alpha > 0$. by increasing $\alpha$, we increase the importance of the convergence time (by penalizing high values of $\tau$). the derivative with respect to the transition rates is

$$\frac{\partial \mathcal{J}^{(2)}}{\partial K^{(s)}} = \frac{\partial \mathcal{J}^{(1)}}{\partial K^{(s)}}. \quad (14)$$

in order to optimize the convergence time, we also need the derivative with respect to $\tau$. this derivative is computed analogously to the derivative with respect to $K^{(s)}$ (cf. eq. 12). we have

$$\frac{\partial \mathcal{J}^{(2)}}{\partial \tau} = \frac{\partial \mathcal{J}^{(1)}}{\partial \tau} + 2\alpha\tau \quad (15)$$

with

$$\frac{\partial \mathcal{J}^{(1)}}{\partial \tau} = \sum_{s=1}^{S} \mathbf{1}^\top V^{-\top} A_1 V^\top \cdot K^{(s)} \mathbf{1} \quad (16)$$

and

$$A_1 = V^\top \frac{\partial \mathcal{J}^{(1)}}{\partial e^{K^{(s)}\tau}}\, V^{-\top} \odot W(\tau). \quad (17)$$

the optimization of eq. 13 produces transition rates that lead to the desired trait distribution quickly, but there is no guarantee that this state is the steady state of $K^{(s)}$. if we compute the transition rates at the outset of the experiment (without refining them online), we may wish to ensure that the state reached at the optimal time $\tau^f$ remains near-constant. hence, we modify our cost function in eq. 13 as follows:

$$\text{minimize} \quad \mathcal{J}^{(3)} = \mathcal{J}^{(2)} + \beta \sum_{s=1}^{S} \big\| e^{K^{(s)}\tau} x^{(s)}_0 - e^{K^{(s)}(\tau+\nu)} x^{(s)}_0 \big\|_2^2 \quad (18)$$

such that $k^{(s)}_{ij} < k^{(s)}_{ij,\max}$ and $\tau > 0$, with $\beta > 0$. the additional term in our cost function allows us to ensure that the robot distribution reached by employing $K^{(s)f}$ remains near-constant for arbitrarily long time intervals $\nu$. this is possible because our model in eq. 3 is stable [9], and the difference between the current robot distribution and the one at steady state can only decrease monotonically over time. by increasing the value of $\beta$, the difference between the robot distributions at times $\tau$ and $\tau + \nu$ is decreased. in other words, as we will see in section 4, the trait distribution corresponding to the steady-state robot distribution of $K^{(s)}$ gets arbitrarily close to the desired trait distribution $\bar{Y}$ as $\beta$ increases (the same is true when we increase $\nu$). note that when $\alpha = 0$, $\beta$ should not be infinitely large, as in this case $K^{(s)f} = 0$; for all practical purposes, however, $\beta$ is bounded and $\alpha > 0$.

let us refer to this additional third term of $\mathcal{J}^{(3)}$ (the second term of eq. 18) as $\mathcal{J}^{(3,3)}$. then, the derivative of the new objective function with respect to the transition rates can be expressed as

$$\frac{\partial \mathcal{J}^{(3)}}{\partial K^{(s)}} = \frac{\partial \mathcal{J}^{(2)}}{\partial K^{(s)}} + \frac{\partial \mathcal{J}^{(3,3)}}{\partial K^{(s)}}. \quad (19)$$

again, we apply the chain rule to obtain

$$\frac{\partial \mathcal{J}^{(3,3)}}{\partial K^{(s)}} = \frac{\partial \mathcal{J}^{(3,3)}}{\partial e^{K^{(s)}\tau}} \frac{\partial e^{K^{(s)}\tau}}{\partial K^{(s)}\tau} \frac{\partial K^{(s)}\tau}{\partial K^{(s)}} - \frac{\partial \mathcal{J}^{(3,3)}}{\partial e^{K^{(s)}(\tau+\nu)}} \frac{\partial e^{K^{(s)}(\tau+\nu)}}{\partial K^{(s)}(\tau+\nu)} \frac{\partial K^{(s)}(\tau+\nu)}{\partial K^{(s)}}. \quad (20)$$

the outer derivative is

$$\frac{\partial \mathcal{J}^{(3,3)}}{\partial e^{K^{(s)}\tau}} = \frac{\partial \mathcal{J}^{(3,3)}}{\partial e^{K^{(s)}(\tau+\nu)}} = 2\beta \big[ e^{K^{(s)}\tau} x^{(s)}_0 - e^{K^{(s)}(\tau+\nu)} x^{(s)}_0 \big] \cdot x^{(s)\top}_0. \quad (21)$$

we apply the same development as in eq. 12 to obtain

$$\frac{\partial \mathcal{J}^{(3,3)}}{\partial K^{(s)}} = V^{-\top} A_2 V^\top \cdot \tau - V^{-\top} A_3 V^\top \cdot (\tau + \nu) \quad (22)$$

with

$$A_2 = V^\top \cdot \frac{\partial \mathcal{J}^{(3,3)}}{\partial e^{K^{(s)}\tau}} \cdot V^{-\top} \odot W(\tau) \quad (23)$$

and

$$A_3 = V^\top \cdot \frac{\partial \mathcal{J}^{(3,3)}}{\partial e^{K^{(s)}(\tau+\nu)}} \cdot V^{-\top} \odot W(\tau + \nu). \quad (24)$$
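for reference, the full cost of eq. 18 can also be evaluated numerically, which is useful for checking the analytical gradient above. this is our illustrative sketch (the function name `j3` is ours, not the paper's); the default parameter values match those used for the fixed-nc method in section 4.

```python
import numpy as np
from scipy.linalg import expm

def j3(K_list, tau, x0_list, Q, Y_bar, alpha=1.0, beta=5.0, nu=2.0):
    """Cost J(3) of eq. 18: trait mismatch + time penalty + settling term."""
    mismatch = -Y_bar.astype(float)
    settle = 0.0
    for K, x0, q in zip(K_list, x0_list, Q):
        x_tau = expm(K * tau) @ x0
        x_tau_nu = expm(K * (tau + nu)) @ x0
        mismatch += np.outer(x_tau, q)               # sum_s e^(K tau) x0 q
        settle += np.sum((x_tau - x_tau_nu) ** 2)    # J(3,3) settling term
    return np.sum(mismatch ** 2) + alpha * tau ** 2 + beta * settle
```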
for all of the above cost functions, $z = 1, 2, 3$, the derivative with respect to the off-diagonal elements $ij$ of the matrix $K^{(s)}$, with $(i,j) \in E$, is

$$\frac{\partial \mathcal{J}^{(z)}}{\partial k^{(s)}_{ij}} = \Big\{ \frac{\partial \mathcal{J}^{(z)}}{\partial K^{(s)}} \Big\}_{ij} - \Big\{ \frac{\partial \mathcal{J}^{(z)}}{\partial K^{(s)}} \Big\}_{jj}, \quad (25)$$

where $\{\cdot\}_{ij}$ denotes the element on row $i$ and column $j$. the derivative with respect to time $\tau$ is analogous. finally, we summarize our optimization problem as follows:

$$K^{(s)f}, \tau^f = \operatorname*{argmin}_{K^{(s)}, \tau} \mathcal{J}^{(3)}, \quad (26)$$

under the constraints shown in eq. 18. to solve the system, we implement a basin-hopping optimization algorithm [17], which attempts to find the global minimum of a smooth scalar function. locally, our basin-hopping algorithm uses a quasi-newton method (namely, the broyden–fletcher–goldfarb–shanno algorithm [18] with bound constraints).

3.3. computational complexity

the computational complexity of computing the gradient of our objective function is $\mathcal{O}(S \cdot M^3 + S \cdot M^2 \cdot U)$. the first part is dictated by the eigenvalue decomposition, which is known to be $\mathcal{O}(M^3)$ for non-sparse matrices [19] (in the special case where all eigenvalues are distinct, it can be reduced to $\mathcal{O}(M^{2.376} \log(M))$ [20]). we compute this decomposition only once per optimization of $K^{(s)}$ (see eq. 12, where $K^{(s)} = VDV^{-1}$). the second part is dictated by the multiplication of the matrices in eq. 12, for which the cost is $\mathcal{O}(M^2 \cdot U)$. globally speaking, the computation grows linearly with the number of species and traits, and cubically with the number of tasks. overall, for the results shown in section 4, the average time to compute the gradient for a system with $M = 8$, $U = 4$, and $S = 4$ is around 1.35 ms with $\nu = 0$, and 2.2 ms with $\nu > 0$ (the number of parameters to optimize can be as large as 225 in this case, depending on the graph's adjacency matrix). the code was implemented in python using the numpy and scipy libraries, and tested on a 2 ghz intel core i7 using a single cpu.

3.4. continuous optimization of $K^{(s)}$

as shown above, the optimization of the objective function $\mathcal{J}^{(3)}$ is efficient and can be performed in real-time. building on this result, we can implement a continuous, online optimization strategy that allows us to refine the optimal $K^{(s)f}$ as a function of the current state. in noisy systems, where the trajectories of individual agents deviate from the predicted macroscopic trajectories, this strategy inevitably leads to an improvement of the convergence time, since it recomputes updated optimal transition rates by taking the actual robot distribution into account. practically, we initially compute $K^{(s)f}$ at time $t = 0$ to control the system over a finite period $\delta$, from $t = 0$ to $t = \delta$. after that period (at time $t = \delta$), we optimize a new value of $K^{(s)f}$ that controls the system over the next period, as a function of the actual robot distribution encountered at time $t = \delta$. this process can be repeated indefinitely. the value $\delta$ is called the sampling time. formally, we write our control policy as

$$K^{(s)f}(t), \tau^f(t) = \operatorname*{argmin}_{K^{(s)}, \tau} \tilde{\mathcal{J}}^{(3)}(X(t_p)) \quad \text{with} \quad t_p \le t < t_p + \delta, \quad t_p = k\delta,\ k \in \mathbb{N}, \quad (27)$$

where we rewrite our cost as a function of the robot distribution

$$\tilde{\mathcal{J}}^{(3)}(X) = \Big\| \bar{Y} - \sum_{s=1}^{S} e^{K^{(s)}\tau} x^{(s)} \cdot q^{(s)} \Big\|_F^2 + \alpha\tau^2 + \beta \sum_{s=1}^{S} \Big\| e^{K^{(s)}\tau} x^{(s)} - e^{K^{(s)}(\tau+\nu)} x^{(s)} \Big\|_2^2. \quad (28)$$

note that $\tilde{\mathcal{J}}^{(3)}(X(0))$ is equal to $\mathcal{J}^{(3)}$.
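a minimal control loop implementing eq. 27 could look as follows. this is a sketch under the assumption that an optimizer for eq. 26 is available (here a hypothetical `optimize_rates`, e.g. wrapping scipy's basin-hopping with l-bfgs-b, together with a state estimate `observe_distribution`); none of these names come from the paper.

```python
def adaptive_control(observe_distribution, optimize_rates, apply_rates,
                     delta, n_periods, K_init):
    """Receding-horizon refinement of K(s)f (eq. 27), warm-started."""
    K_list = K_init
    for p in range(n_periods):
        X = observe_distribution()          # actual robot distribution X(t_p)
        # Warm start: seed the optimizer with the previous window's solution.
        K_list, tau = optimize_rates(X, K_list)
        apply_rates(K_list, duration=delta) # control the swarm until t_p+delta
    return K_list
```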
in practice, if $\delta$ is small with respect to the rates at which robots transition between sites (cf. $k^{(s)}_{ij,\max}$), we set $\beta = 0$, since the continuous optimization of $K^{(s)}$ enforces a current state that is close to the desired state, irrespective of the steady-state distribution. for cases where the optimization time becomes large (implying that $\delta$ also becomes large), we either need to set $\beta > 0$, or use a strategy that accounts for the computation delay, such as those presented in [21]. we also note that we can accelerate the computations by seeding the present sampling window with the optimized values of the preceding sampling window (i.e., a warm start).

4. results

previous work has shown the benefit of validating methods over multiple, complementary levels of abstraction (sub-microscopic, microscopic, and macroscopic) [22]. in the present work, we evaluate our methods on two levels: microscopic and macroscopic. indeed, the most efficient way of simulating the swarm of robots is to consider a continuous macroscopic model, derived directly from the ordinary differential equation, eq. 3. in order to validate the control policy at a lower level of abstraction, we also implement a discrete microscopic model that emulates the behavior of individual robot controllers. this agent-level control is based on the transition rates $k^{(s)}_{ij}$ encoded by the transition matrix $K^{(s)}$: a robot of species $s$ at site $i$ transitions to site $j$ according to a probability $p^{(s)}_{ij}$ that is an element of the matrix $P^{(s)} = e^{K^{(s)}\Delta t}$, where $\Delta t$ is the duration of one time-step. running multiple iterations of the microscopic model enables us to capture the stochasticity resulting from our control system.

figure 1. a strongly connected instance of a graph with 10 nodes representing spatially distributed sites (nodes) and their paths (edges). the system includes 4 traits; the trait abundance at each node is represented by a bar plot, and nodes 5 and 6 are highlighted. (a) initial configuration, (b) desired final configuration.

our performance metric considers the degree of convergence to $\bar{Y}$, expressed by the fraction of misplaced traits

$$\mu(Y) = \frac{\| Y - \bar{Y} \|_1}{\| Y \|_1}. \quad (29)$$

we say that one system converges faster than another if it takes less time for $\mu(Y)$ to decrease to some relative error $\mu_{\mathrm{thresh}}$, such as $\mu_{\mathrm{thresh}} = 5\,\%$. similar performance metrics have been proposed in [8–10]. we consider three optimization methods, two of which stem from this paper, and one of which stems from [8]:

fixed-nc – we consider the time-constant, non-convex optimization problem posed in eq. 26, with $\alpha = 1$, $\beta = 5$, and $\nu = 2$, producing a fixed $K^{(s)f}$ for each species. these values were not tuned in any way to improve performance.

adaptive-nc – we consider the time-continuous optimization problem in eq. 27, with $\alpha = 1$, $\beta = 0$, and a sampling time $\delta = 0.08$ s, producing an adaptive control policy $K^{(s)f}(t)$ for each species.

fixed-c – we adapt the convex optimization method presented in [8], denoted in that work as [p1]. this adapted method optimizes the asymptotic convergence rate (of a system of homogeneous robots) by minimizing the second eigenvalue $\lambda_2$ of a symmetric matrix $S^{(s)}$, such that $\lambda_2(S^{(s)}) \ge \operatorname{re}(\lambda_2(K^{(s)}))$. since this method requires knowledge of the desired species distribution $\bar{X}$, we compute a random instantiation of $\bar{X}$ that satisfies the desired trait distribution defined by eq. 7. we note that in practical applications, computing a good instantiation of $\bar{X}$ is not trivial.

for all optimization methods, we set $k^{(s)}_{ij,\max} = 2\ \mathrm{s}^{-1}$.
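the microscopic update and the metric of eq. 29 are compact enough to state directly. the following sketch (our illustration, with an assumed time-step `dt`) samples individual robot transitions from $P^{(s)} = e^{K^{(s)}\Delta t}$ and evaluates the fraction of misplaced traits.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)

def microscopic_step(x_s, K_s, dt):
    """One time-step of the agent-level model: P(s) = e^(K(s) dt)."""
    P = expm(K_s * dt)                  # P[j, i]: probability of i -> j
    x_new = np.zeros_like(x_s)
    for i, n_i in enumerate(np.asarray(x_s, dtype=int)):
        p = np.clip(P[:, i], 0.0, None)         # guard tiny negatives
        dest = rng.choice(len(x_s), size=n_i, p=p / p.sum())
        np.add.at(x_new, dest, 1)
    return x_new

def misplaced_traits(Y, Y_bar):
    """mu(Y) of eq. 29: entrywise l1 mismatch relative to |Y|."""
    return np.abs(Y - Y_bar).sum() / np.abs(Y).sum()
```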
4.1. example

to illustrate our method, we consider an example of $N = 3893$ robots moving between 10 sites. we sample a random initial robot distribution $X(0)$, and generate a random, feasible desired trait distribution $\bar{Y}$. the initial trait distribution is visualized in fig. 1(a), and the desired trait distribution is visualized in fig. 1(b). the graph is generated randomly according to the watts–strogatz model [23] (with a neighboring node degree of $k = 3$, and a rewiring probability of $\gamma = 0.6$; the graph is guaranteed to be connected). the robot community consists of 3 species and 4 traits, and is defined as follows:

$$Q = \begin{bmatrix} 1 & 0 & 1 & 0\\ 1 & 0 & 0 & 1\\ 0 & 1 & 0 & 1 \end{bmatrix} \quad \text{with} \quad X^\top \cdot \mathbf{1} = [987, 1490, 1416]^\top. \quad (30)$$

we solve the system for $\bar{Y}$ as shown in fig. 1(b). we show the evolution of the robot distribution in fig. 2(a) and of the trait distribution in fig. 2(b) – to avoid clutter, we plot two selected sites (5 and 6) and one selected trait (the 4th trait, shown as the top bar in cyan in fig. 1). the plots demonstrate a good agreement between the macroscopic model and the discrete microscopic model.

figure 2. macroscopic model and the average over 20 iterations of the microscopic model, on the graph shown in fig. 1, for nodes 5 and 6. (a) number of robots of species 1 present at nodes 5 and 6, (b) number of instances of trait 4 (top bar in cyan in fig. 1) present at nodes 5 and 6.

fig. 3 shows the ratio of misplaced traits $\mu(Y)$ over time for the initial and desired trait distributions depicted in fig. 1. the macroscopic model demonstrates that the system successfully achieves a negligible error.

figure 3. ratio of misplaced traits over time for the initial and desired trait distributions depicted in fig. 1, with 3893 robots and a total of 7786 present traits. the plot shows the macroscopic model as well as the average over 20 iterations of the microscopic model with and without continuous optimization of the transition rates; the error bars show the standard deviation.

we run 20 iterations of the discrete microscopic model, once with method fixed-nc and once with method adaptive-nc. since the adaptive optimization method takes the current robot configuration into account, it produces transition rates that lead to the desired configuration faster. also, since it continuously optimizes toward the desired final trait distribution, the error remains low. we note that, due to the stochasticity of the microscopic model, the error ratio (which counts absolute differences) will always be larger than 0.

4.2. comparison of methods

we compare the three optimization methods, fixed-nc, adaptive-nc and fixed-c, and evaluate their performance with respect to the metric in eq. 29. we instantiate 40 random graphs with $M = 6$ nodes, and random matrices $Q$ with $S = 4$ species and $U = 4$ traits, and generate a random desired trait distribution $\bar{Y}$ for each graph. the microscopic model is iterated 4 times on each graph instantiation. for the method fixed-c, we compute a random robot distribution $\bar{X}$ that satisfies the desired trait distribution. we measure the time $t_{\mu,\mathrm{thresh}}$ at which the system converges to a value $\mu_{\mathrm{thresh}} = 5\,\%$ of misplaced traits.
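the benchmark setup of this subsection can be scripted along the following lines – an illustrative sketch of our own (assuming networkx for the graphs; the paper does not state its graph tooling). a feasible $\bar{Y}$ is generated by sampling a robot distribution first, and $t_{\mu,\mathrm{thresh}}$ is read off a simulated $\mu(Y)$ trace.

```python
import numpy as np
import networkx as nx

rng = np.random.default_rng(3)

def random_instance(M=6, S=4, U=4, n_per_species=100):
    G = nx.connected_watts_strogatz_graph(M, k=3, p=0.6)
    A = nx.to_numpy_array(G)
    Q = rng.integers(0, 2, size=(S, U))          # random binary traits
    # Feasible desired trait distribution: derive it from a robot
    # distribution, so a satisfying X_bar is guaranteed to exist (eq. 7).
    X_bar = np.stack([rng.multinomial(n_per_species, np.ones(M) / M)
                      for _ in range(S)], axis=1)
    return A, Q, X_bar @ Q

def t_thresh(times, mu_trace, mu_thresh=0.05):
    """First time at which the misplaced-trait ratio drops below threshold."""
    below = np.nonzero(np.asarray(mu_trace) < mu_thresh)[0]
    return times[below[0]] if below.size else np.inf
```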
since we sample random matrices $Q$, we obtain different rank values. fig. 4(a) shows the results for $\operatorname{rank}(Q) = 3$, i.e., a system with redundant species, and fig. 4(b) shows the results for $\operatorname{rank}(Q) = 4$, i.e., a system with complementary species (cf. the description in sec. 2.3).

figure 4. convergence time of the three optimization methods, evaluated on the microscopic model, with $t_{\mu,\mathrm{thresh}}$ for $\mu_{\mathrm{thresh}} = 0.05$, for 40 random graphs with $M = 6$ and random matrices $Q$ with 4 species and 4 traits; the microscopic model was iterated 4 times over each graph instantiation. (a) $\operatorname{rank}(Q) = 3$, (b) $\operatorname{rank}(Q) = 4$. the boxplots show the median and the 25th and 75th percentiles.

the plots show that our method fixed-nc is able to improve upon fixed-c: by 16 % for $\operatorname{rank}(Q) = 4$, and by 25 % for $\operatorname{rank}(Q) = 3$. the stronger improvement in the lower-rank case points towards the importance of finding a good final robot distribution $\bar{X}$ when several are possible that satisfy eq. 7. the results for adaptive-nc confirm the fast convergence towards desired trait distributions, with a 75 % improvement over fixed-nc in both cases. this method outperforms the other two because of its ability to take the current state of the robot distribution into account, and to adapt the transition rates as a function of it.

finally, we compute the error obtained by our method fixed-nc by comparing the analytical steady-state distribution of traits (obtained by taking the eigenvectors that correspond to the zero eigenvalues of each rate matrix $K^{(s)}$ and multiplying them by $Q$) with the desired trait distribution $\bar{Y}$. the median, 90th percentile and maximum errors from the steady state to the desired trait distribution are 0.108 %, 0.572 % and 0.812 %, respectively. these results demonstrate that, although our method does not explicitly optimize for the steady state, it reaches a steady-state error smaller than the system noise (at steady state).
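the steady-state check described above can be scripted directly: the null-space eigenvector of each $K^{(s)}$ (scaled to the species population) gives the steady-state robot distribution, which eq. 1 maps to traits. a brief sketch, ours rather than the paper's:

```python
import numpy as np

def steady_state_traits(K_list, N_s, Q):
    """Trait distribution at the steady state of each K(s) (zero eigenvalue)."""
    Y_inf = np.zeros((K_list[0].shape[0], Q.shape[1]))
    for K, n, q in zip(K_list, N_s, Q):
        d, V = np.linalg.eig(K)
        v = np.real(V[:, np.argmin(np.abs(d))])  # eigenvector for d ~ 0
        x_inf = n * v / v.sum()                  # scale to N^(s) robots
        Y_inf += np.outer(x_inf, q)
    return Y_inf
```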
5. conclusion

we present a method that distributes a swarm of heterogeneous robots among a set of tasks, with the goal of satisfying a desired distribution of robot capabilities among those tasks. we propose a formulation of heterogeneous robot systems through species and traits, and show how this formulation is used to achieve an optimal distribution of robots as a function of a desired final trait configuration. to find the optimal transition rates, we pose an optimization problem, and develop a solution based on an analytical gradient that is computationally efficient and capable of producing fast convergence times. building on this result, we propose a variant, real-time optimization method that enables an online adaptation of the transition rates as a function of the current robot distribution. we validate our approach on random graph instantiations, and show that our baseline method outperforms a classical alternative approach. we also show that the variant adaptive optimization yields a significant gain in convergence speed. we believe that this method is well-suited to applications that control large-scale teams of robots that need to converge quickly to desired configurations as a function of their capabilities, and that need to adapt to changes in real-time. future work will include a more in-depth study of the implications of diversity in swarm-robotic systems, as well as an implementation of the proposed framework on real robots.

acknowledgements

we gratefully acknowledge the support of onr grants n00014-15-1-2115 and n00014-14-1-0510, arl grant w911nf-08-2-0004, nsf grant iis-1426840, and terraswarm, one of six centers of starnet, a semiconductor research corporation program sponsored by marco and darpa.

references

[1] t. balch, l. e. parker. special issue on heterogeneous multi-robot systems. autonomous robots 8:207–383, 2000.
[2] e. g. jones, b. browning, m. b. dias, et al. dynamically formed heterogeneous robot teams performing tightly coordinated tasks. international conference on robotics and automation (icra), pp. 1–8, 2006. doi:10.1109/robot.2006.1641771.
[3] b. charrow. information-theoretic active perception for multi-robot teams. ph.d. thesis, university of pennsylvania, 2015.
[4] a. m. hsieh, a. cowley, f. j. keller, et al. adaptive teams of autonomous aerial and ground robots for situational awareness. journal of field robotics 24:991–1014, 2007. doi:10.1002/rob.20222.
[5] p. tokekar, j. vander hook, d. mulla, v. isler. sensor planning for a symbiotic uav and ugv system for precision agriculture. ieee/rsj international conference on intelligent robots and systems (iros), 2013. doi:10.1109/iros.2013.6697126.
[6] b. p. gerkey, m. j. mataric. a formal analysis and taxonomy of task allocation in multi-robot systems. international journal of robotics research 23(9):939–954, 2004. doi:10.1177/0278364904045564.
[7] b. korte, j. vygen. combinatorial optimization: theory and algorithms. springer-verlag, berlin, 2000.
[8] s. berman, á. halasz, m. a. hsieh, v. kumar. optimized stochastic policies for task allocation in swarms of robots. ieee transactions on robotics 25:927–937, 2009. doi:10.1109/tro.2009.2024997.
[9] á. halasz, m. a. hsieh, s. berman, v. kumar. dynamic redistribution of a swarm of robots among multiple sites. ieee/rsj international conference on intelligent robots and systems (iros), 2007. doi:10.1109/iros.2007.4399528.
[10] m. a. hsieh, á. halasz, s. berman, v. kumar. biologically inspired redistribution of a swarm of robots among multiple sites. swarm intelligence 2(2–4):121–141, 2008. doi:10.1007/s11721-008-0019-z.
[11] m. j. b. krieger, j. b. billeter, l. keller. ant-like task allocation and recruitment in cooperative robots. nature 406:992–995, 2000.
[12] t. h. labella, m. dorigo, j.-l. deneubourg. division of labor in a group of robots inspired by ants' foraging behavior. acm transactions on autonomous and adaptive systems 1:4–25, 2006.
[13] l. matthey, s. berman, v. kumar. stochastic strategies for a swarm robotic assembly system. ieee international conference on robotics and automation (icra), pp. 1953–1958, 2009. doi:10.1109/robot.2009.5152457.
[14] d. tilman. functional diversity. encyclopedia of biodiversity 3:109–120, 2001.
[15] o. l. petchey, k. j. gaston. functional diversity: back to basics and looking forward. ecology letters 9(6):741–758, 2006. doi:10.1111/j.1461-0248.2006.00924.x.
[16] j. d. kalbfleisch, j. f. lawless. the analysis of panel data under a markov assumption. journal of the american statistical association 80:863–871, 1985.
[17] d. j. wales, j. p. k. doye. global optimization by basin-hopping and the lowest energy structures of lennard-jones clusters containing up to 110 atoms. arxiv.org, pp. 1–8, 1998.
[18] a. mordecai. nonlinear programming: analysis and methods. courier corporation, 2003.
[19] j. demmel, i. dumitriu, o. holtz. fast linear algebra is stable. arxiv.org, pp. 1–26, 2007.
[20] v. y. pan, z. q. chen. the complexity of the matrix eigenproblem. acm symposium on theory of computing, pp. 507–516, 1999.
[21] m. milam, r. franz, j. hauser, m. murray. receding horizon control of vectored thrust flight experiment. iee proceedings on control theory and applications 152:340–348, 2005.
[22] a. martinoli. swarm intelligence in autonomous collective robotics: from tools to the analysis and synthesis of distributed collective strategies. ph.d. thesis, ecole polytechnique fédérale de lausanne (epfl), 1999.
[23] d. j. watts, s. h. strogatz. collective dynamics of 'small-world' networks. nature 393:440–442, 1998.

acta polytechnica 56(3):166–172, 2016, doi:10.14311/ap.2016.56.0166

a superintegrable model with reflections on s3 and the rank two bannai-ito algebra

hendrik de bie (a,*), vincent x. genest (b), jean-michel lemay (c), luc vinet (c)

(a) department of mathematical analysis, faculty of engineering and architecture, ghent university, galglaan 2, 9000 ghent, belgium
(b) department of mathematics, massachusetts institute of technology, 77 massachusetts ave., cambridge, ma 02139, usa
(c) centre de recherches mathématiques, université de montréal, c.p. 6128, succ. centre-ville, montréal, qc, canada, h3c 3j7
* corresponding author: hendrik.debie@ugent.be

abstract. a quantum superintegrable model with reflections on the three-sphere is presented. its symmetry algebra is identified with the rank-two bannai-ito algebra. it is shown that the hamiltonian of the system can be constructed from the tensor product of four representations of the superalgebra osp(1|2), and that the superintegrability is naturally understood in that setting. the exact separated solutions are obtained through the fischer decomposition and a cauchy-kovalevskaia extension theorem.

keywords: bannai-ito algebra; cauchy-kovalevskaia extension; quantum superintegrable model.

1. introduction

superintegrability shares an intimate connection with exact solvability. for classical systems, this connection is fully understood, while it remains an empirical observation for general quantum systems. the study of superintegrable models has proved fruitful in understanding symmetries and their algebraic description, and has also contributed to the theory of special functions. a quantum system in $n$ dimensions with hamiltonian $H$ is said to be maximally superintegrable if it possesses $2n - 1$ algebraically independent constants of motion $C_1, C_2, \dots$
$\dots, C_{2n-1}$ commuting with $H$, that is, $[H, C_i] = 0$ for $i = 1, \dots, 2n-1$, where one of these constants is the hamiltonian itself. such a system is further said to be superintegrable of order $l$ if the maximum order in momenta of the constants of motion (except $H$) is $l$. one of the important quantum superintegrable models is the so-called generic three-parameter system on the two-sphere [12], whose symmetries generate the racah algebra, which characterizes the wilson and racah polynomials sitting atop the askey scheme [1]. all two-dimensional second-order superintegrable models of the form $H = \Delta + V$, where $\Delta$ denotes the laplace-beltrami operator, have been classified [12] and can be obtained from the generic three-parameter model through contractions and specializations [11]. a similar model with four parameters defined on the three-sphere has also been introduced, and its connection to bivariate wilson and racah polynomials has been established [10].

recently, superintegrable models defined by hamiltonians involving reflection operators have been the subject of several investigations [2–5, 9]. one of the interesting features of these models is their connection to lesser-known bispectral orthogonal polynomials referred to as −1 polynomials. many efforts have been deployed to characterize these polynomials, which can be organized in a tableau similar to the askey one [13–19]. of particular relevance to the present paper is the laplace-dunkl equation on the two-sphere studied in [6, 7], which has the rank-one bannai-ito algebra as its symmetry algebra [17]. this bannai-ito algebra encodes the bispectrality of the bannai-ito polynomials, which depend on four parameters and stand at the highest level of the hierarchy of −1 orthogonal polynomials. as such, this laplace-dunkl system on the two-sphere can be thought of as a generalization with reflection operators of the generic three-parameter model (without reflections) on the two-sphere, which is recovered when wavefunctions with definite parities are considered. the goal of this paper is to introduce a novel quantum superintegrable model with reflections on the three-sphere which similarly embodies the generic four-parameter model introduced and studied in [10].

the paper is divided as follows. in section 2, we introduce a superintegrable model with four parameters on the three-sphere and exhibit its symmetries explicitly. in section 3, it is shown how the hamiltonian of the model can be constructed from four realizations of the superalgebra osp(1|2); moreover, the symmetry algebra is characterized and is seen to correspond to a rank-two generalization of the bannai-ito algebra. in section 4, the structure of the space of polynomial solutions is exhibited using a fischer decomposition, and an explicit basis for the eigenfunctions is constructed with the help of a cauchy-kovalevskaia extension theorem. some concluding remarks are offered in section 5.

2. a superintegrable model on s3

let $s_1, s_2, s_3, s_4$ be the cartesian coordinates of a four-dimensional euclidean space, and take the restriction to the embedded three-sphere $s_1^2 + s_2^2 + s_3^2 + s_4^2 = 1$. consider the system with four parameters $\mu_1, \mu_2, \mu_3, \mu_4$, with $\mu_i \ge 0$ for $i = 1, 2, 3, 4$, governed by the hamiltonian $H = \sum_{1 \le i < j \le 4} \cdots$

[...] only laser hits of brightness $b > b_{\mathrm{th}}$ are included in the statistics at a given $b_{\mathrm{th}}$.
a strong relationship $c(b_{\mathrm{th}})$ and a wide range of optimal $b_{\mathrm{th,opt}}$ are seen ($b_{\mathrm{th,opt}}$ is defined by $c(b_{\mathrm{th,opt}}) = c_{\mathrm{fc}}$). the aperture obviously affects the scale of the abscissa, and hence the values of $b_{\mathrm{th,opt}}$. from the increase of $c$ with increasing $b_{\mathrm{th}}$, it can be deduced that laser hits of low brightness are identified predominantly far away from the wall: increasing the value of $b_{\mathrm{th}}$ filters out the distant hits, the mean position of the remaining hits shifts closer to the wall, and the evaluated concentration increases.

we also tested an improved method for the estimation of the mean free path from the measured data [2]. the improved method was expected to prevent the effect of the laser light attenuation by setting a maximum distance within which the hits are counted. nevertheless, we obtained qualitatively the same result – although the individual values of $b_{\mathrm{th,opt}}$ shifted slightly, the sensitivity of the evaluated concentration to $b_{\mathrm{th}}$ remained unchanged.

figure 8. histograms of hits' brightness. colour lines – distinct apertures on camera.

the role of $b_{\mathrm{th}}$ in the data evaluation is to filter out points corresponding to the observation of the refracted and reflected laser sheet (i.e. "false" hits). the smooth course of the curves in figure 7 indicates that there is no sharp division between the "true" and "false" hits. this can also be deduced from the wide range of maximum brightness values observed in figure 5b. however, the two groups of hits are expected to predominantly occupy distinct parts of the brightness histogram, forming two peaks. these histograms are plotted in figure 8 (the ordinates are normalized to give a unit area under the histograms). the histograms at the same concentration show a very similar course and differ only in the vertical and horizontal scales. the expected peaks are recognized only at the highest concentration. nevertheless, let us try to identify the positions of these peaks. this can be done by fitting the histogram with two superposed probability density functions. as $b_{\mathrm{th}}$ must be positive, the lognormal distribution is a natural choice. the parameters of the lognormal distribution can be conveniently fitted when a transformation to normal coordinates is performed. after the transformation, the two peaks can be clearly recognized at all concentrations (figure 9). further analysis showed that the lognormal distribution (i.e. a normal distribution in the normal coordinates) is not suitable for fitting the histograms – a very good fit was achieved when two triangles were used in the normal coordinates.

figure 9. histograms of hits' brightness transformed to normal coordinates. colour lines – distinct apertures on camera; black line – triangles fitting the histogram at cfc = 21 % and a = 16.

the histograms for the measurements with an aperture higher than f/9.5 cannot be fitted, because the second peak lies beyond the range of $b$ – the corresponding parts of the histograms are shrunk to the value of $b = 255$ in figure 8. these measurements are excluded from the further analysis. as $b_{\mathrm{th}}$ should be used to distinguish between the "true" and "false" hits, it should be related to the positions of the peaks. this hypothesis is confirmed in figure 10, where the histograms are plotted in coordinates normalised by $b_2$ – the position of the second peak. it can be seen that: 1) all the histograms at the same concentration fall onto the same curve after the transformation; 2) the curve is successfully fitted by the two triangles (after the transformation back from the normal coordinates, the triangles are distorted into the black curves); 3) when the values of $b_{\mathrm{th,opt}}$ deduced from figure 7 are normalised by $b_2$, they fall close to the mean value of 0.57; 4) the normalised $b_{\mathrm{th,opt}}$ falls in the region where the two peaks overlap.

figure 10. histograms of hits' brightness normalised by the position of the second peak. colour lines – distinct apertures on camera; black line – fit of the two peaks (triangles in normal coordinates); vertical lines – normalised values of $b_{\mathrm{th,opt}}$.
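the two-peak identification can be prototyped as follows. this is our illustrative sketch (the paper does not publish its fitting code), using a probit transform of the brightness histogram and a least-squares fit of two triangular densities; all parameter choices here are assumptions.

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import least_squares

def fit_two_triangles(b_samples, bins=64):
    """Fit two triangular pdfs to the brightness histogram in normal coords."""
    hist, edges = np.histogram(b_samples, bins=bins, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    # Transform the abscissa to normal coordinates via the empirical cdf.
    ecdf = np.cumsum(hist) * np.diff(edges)
    z = norm.ppf(np.clip(ecdf, 1e-6, 1 - 1e-6))

    def tri(x, c, w, h):                    # symmetric triangular shape
        return np.maximum(h * (1 - np.abs(x - c) / w), 0.0)

    def resid(p):
        c1, w1, h1, c2, w2, h2 = p
        return tri(z, c1, w1, h1) + tri(z, c2, w2, h2) - hist

    p0 = [-1.0, 1.0, hist.max(), 1.0, 1.0, hist.max() / 2]
    fit = least_squares(resid, p0).x
    b2 = np.interp(fit[3], z, centers)      # second peak back in brightness
    return fit, b2
```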
based on these observations, the following guideline for the evaluation of the threshold $b_{\mathrm{th}}$ can be proposed: the brightness of the laser hits has to be within the range of the grey scale recorded by the camera (the aperture, exposure time and iso have to be set carefully); then, the histograms of the hits' brightness can be fitted by triangles in the normalized coordinates, and $b_{\mathrm{th}}$ is obtained by multiplying the position of the peak of the second triangle, $b_2$, by the factor 0.57. the predicted values of $b_{\mathrm{th}}$ are compared with $b_{\mathrm{th,opt}}$ in figure 11, showing a reasonable match. if the measured data are processed with the predicted $b_{\mathrm{th}}$, the evaluated concentrations match those measured in the fluidization cell perfectly (figure 12). the good match indicates that both crucial parameters of the method, $b_{\mathrm{th}}$ and $\Delta y_0$, were treated correctly.
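in code, the proposed guideline reduces to a one-line threshold on top of the fit above (a sketch continuing the previous snippet; the 0.57 factor is the calibration constant reported for these particles and this laser):

```python
import numpy as np

def predict_threshold(b_samples, factor=0.57):
    """Guideline: b_th = factor * b2, with b2 from the two-triangle fit."""
    _, b2 = fit_two_triangles(np.asarray(b_samples))
    return factor * b2

# Usage: keep only hits above the threshold (the "true" hits of the second
# peak), then evaluate the concentration from the retained hits as usual.
# mask = brightness >= predict_threshold(brightness)
```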
the particles we tested were almost spherical – the shape for which the lsm method was derived. however, the authors of the method showed that it also works with non-spherical particles [5]. the shape of a particle, as well as the optical properties of the particle material with respect to the laser light (reflectivity and opacity), is expected to affect the mutual positions and shapes of the peaks recognized in the brightness histogram. hence, the optimal value of the $b_{\mathrm{th}}/b_2$ ratio has to be determined from calibration tests for each combination of particles and laser employed. our results show that it is then virtually independent of the concentration and – within a certain range – of the exposure. it should be noted that the typical image of particles hit by a laser obtained by [5] differs from our image in figure 5a – the laser stripe was narrower and sharper in [5]. this can be explained by the different optical properties of the tested particles: our particles are probably partially translucent, whereas those used in [5] are not. when perfectly opaque particles are used, the false hits can diminish, and the laser stripe measurement becomes insensitive to $b_{\mathrm{th}}$ in a wide range of brightness values.

figure 11. comparison of the predicted threshold brightness with the optimal values needed to obtain the correct concentration.

figure 12. comparison of the concentration evaluated from the lsm measurement with the concentration measured in the fluidization cell (only measurements with a > 9.5 included).

5. conclusions

three parameters enter the process of evaluating the laser stripe measurement (lsm). while the observation angle affects the measurement results insignificantly, the threshold brightness for the data filtering and the wall position correction are found to play much more important roles. we calculated the wall position correction from the position of the light stripes drawn by the laser sheet on the rectification labels stuck to the wall of the conduit. this procedure seems to provide sufficiently accurate results. the lsm is expected to be practically insensitive to the threshold brightness if the particle and laser optical properties are ideal (particularly the particle opacity). however, for partially translucent particles, it appears to be very sensitive to the threshold brightness; a need for guidelines for the determination of the threshold arises in the latter case. we attempted to relate the threshold to the shape of a histogram of the brightness of the laser hits evaluated from our measurements in a fluidization cell. we found that two peaks can be distinguished in the histogram: the first peak can be attributed to "false" hits, which have to be filtered out, and the second one to the "true" observations of particles hit by the laser sheet. a perfect match is achieved between the concentrations evaluated from the lsm and the actual concentrations within the fluidization cell, provided that the threshold is related to the second peak position scaled by a factor of 0.57. presumably, the particular value of the factor may be sensitive to parameters associated with the properties of the particles and the laser.

acknowledgements

the research has been supported by the czech science foundation through grant project no. 16-21421s.

references

[1] berzi, d.: analytical solution of collisional sheet flows. journal of hydraulic engineering-asce, vol. 137, no. 10, pp. 1200–1207, 2011. issn: 0733-9429. doi:10.1061/(asce)hy.1943-7900.0000420
[2] capart, h., fraccarollo, l.: transport layer structure in intense bed-load. geophysical research letters, vol. 38, art. no. l20402, 2011. issn: 0094-8276. doi:10.1029/2011gl049408
[3] berzi, d., fraccarollo, l.: inclined, collisional sediment transport. physics of fluids, vol. 25, no. 10, art. no. 106601, 2013. issn: 1070-6631. doi:10.1063/1.4823857
[4] matoušek, v., zrostlík, š.: laboratory testing of granular kinetic theory for intense bed load transport. journal of hydrology and hydromechanics, vol. 66, no. 3 (in press). doi:10.2478/johh-2018-0012
[5] spinewine et al.: laser stripe measurements of near-wall solid fraction in channel flows of liquid-granular mixtures. experiments in fluids, vol. 50, no. 6, pp. 1507–1525, 2011. issn: 0723-4864. doi:10.1007/s00348-010-1009-7
[6] krupička et al.: validation of laser-penetration- and electrical-conductivity-based methods of concentration measurement in flow with intense transport of coarse sediment. proceedings of the international conference experimental fluid mechanics 2017, mikulov (czech republic), 2017, pp. 322–327.

developing model for supply chain management – the case of croatia

e. jurun, i. veza

this paper describes a model of supply chain management (scm). it explains overall supply chain issues, the strategic importance of scm, supply chain strategies, and an example of a mathematical formulation. a supply chain is a global network of organizations that cooperate to improve the flows of material and information between suppliers and customers at the lowest cost and the highest speed. the objective of a supply chain is customer satisfaction. at the strategic level, a supply chain can be considered as being composed of five activities: buy, make, move, store and sell. each activity is a module; the set of modules, along with its links, constitutes a model of the supply chain. our paper presents some insights into the supply chain strategies of companies in croatia. the major goal of this paper is to show a model for supply chain management in mathematical terms, with an example of mathematical formulation.

keywords: supply chain, supply chain management, system modeling, logistic system, supply chain strategies.

1 introduction

at the turn of the third millennium, world industry finds itself in probably the largest restructuring since the first industrial revolution.
the progress is determined by two trends:
• the dynamic progress of information and communication technologies, which have enabled the creation of new markets and a redefinition of entire professions,
• the globalization of the economy, thanks to new purchasing and selling markets.

such progress forces enterprises to modify their production strategies. new competitors and greater changes in demand over time in a stagnant market put enormous cost pressure on many enterprises. in order to meet customers' needs everywhere in the world, enterprises should increase their flexibility. the main requirements put upon industry are the following:

1. cost pressure:
• stagnant markets,
• great changes in demand,
• producers with a cheap work force.

2. economic globalization:
• new markets for sales (asia/eastern europe),
• new purchasing markets,
• new competitors.

3. the third industrial revolution:
• new potentials obtained by the application of information and communication technologies,
• structural changes in entire professions,
• new products and markets.

nowadays, the main conditions for a successful enterprise are as follows:
• the existence of products/ideas/services,
• the existence of a market,
• maximum fulfilment of customers' requirements,
• minimum use of resources.

long-lasting success will be achieved only by enterprises which not only achieve the necessary optimization of their production process, but also identify and conquer new markets. one possible conception for survival in a turbulent world market is chain cooperation involving the producer, his suppliers and customers, according to the concept of supply chain management.

2 the basics of supply chain management

supply chain management (scm) comprises the process-oriented integration of the planning, execution, coordination and control of material and information flows in a one-stage or multi-stage supply chain, in the areas of planning, purchasing, production and distribution. the basic idea of scm is a further development of supply logistics. classical supply management is engaged in optimizing the material and information flows within a particular enterprise, while scm focuses on the entire integration of all partners in a supply chain. the aim of scm is to reduce the costs of the entire production process, from raw materials purchasing to the delivery to the final customer, by optimizing across all partners in the chain. the partners in a supply chain can be either departments of a single enterprise or separate enterprises.

a number of scm initiatives have already been taken, scor being the most important of them; it represents a new generation of scm systems. the supply chain operations reference (scor) initiative was founded in the usa in 1996 by the supply chain council (scc), an independent organization (web pages: http://www.supply-chain.org), which uses scm systems and further improves them. the scor model is a significant auxiliary device, which standardizes the cooperation processes among several enterprises and makes them transparent. the main aim of the scor initiative is the modelling of the business processes that take place among several enterprises in a supply chain, so that scm can be realized; the processes are made comparable by the process reference model. the scor model consists of four fundamental processes: plan, source, make and deliver, which are interlinked in a chain (fig. 1).
the figure shows the interconnection in the chain between the supplier, the producer and the distributor, in order to achieve a common optimum with reference to costs, delivery time, delivery amount and inventories.

fig. 1: the basis of the scor reference model (top-level) [1]

according to the definition, scm contains the following potentials:
• supplier service improvement,
• reduction of net inventories,
• shortening of the entire production cycle,
• enhancement of prediction accuracy,
• a rise in productivity,
• lower costs for purchasing, production and distribution in the chain.

by applying the scm concept, it is possible to reduce inventories by up to 60 %, shorten the production cycle by up to 50 %, increase profit by up to 30 % and reduce total costs by up to 25 %. thus, for example, the coca cola company, after introducing scm and using its potentials for rationalization, achieved a 3.5 % increase in profit in the european market [2, 3].

with regard to the planning horizon and planning objects, the global and local planning assignments within scm are divided into three time and logic levels:

• strategy level. the main task of the strategic level of planning is to define the enterprise strategy by shaping a configurationally optimal production and supply network among several enterprises. on the basis of alternative configurations, a simulation with reference to given criteria chooses an optimal solution. the distribution channels, from the raw material supplier to the selling market, are analysed in this phase. this is done on the basis of the annual planned quantity, the production quantity and the stock situation. the aim of shaping is to obtain the deliverer's real supply chain with reference to all relevant limits.

• tactical level. based on the data obtained at the strategic level, the particular members of the production network are defined in this phase, with reference to long-term production and transport plans. the aim of planning is to synchronize the medium-term and long-term planning programme with reference to capacities and terms (between 3 and 6 months).
the input data for this planning are the necessary information on the supply chain structure, sales predictions and customers' needs. rough planning of purchasing, production and distribution is done on the basis of these input data. planning is done by means of simulating different alternatives with reference to resources, costs and delivery times.

• operational level. the operational realization of the set programmes takes place through production planning and control (ppc). the existing ppc organizational structures can be used to realize scm, but they have to be extended in dependence on the exterior partner. it is indispensable to ensure a quick exchange of information between the supplier and the customer, so that a quick reaction to unplanned events (e.g. disturbances, short-term special orders and others) can be provided. typical planning functions at the operational level are detailed scheduling (on the basis of the plans at the tactical level) and the controlling of orders, warehouses and transport.

the planning assignments with regard to the planning period are presented in fig. 2.

fig. 2: planned assignments with regard to the planned period

the major goal of this paper is to show a model for supply chain management in mathematical terms, with an example of mathematical formulation.

3 model for supply chain management

in most industries, the cost of raw materials and component parts from external vendors constitutes the main part of the total product cost; in some cases it can account for up to 80 % of the total product cost. for this reason, supplier selection is one of the most important tasks in every industry. basically, there are two kinds of supplier problems. the first is supplier selection when there is no constraint, in other words, when all suppliers can satisfy the buyer's requirements for demand, quality, delivery, etc. in this kind of supplier selection, management needs to make only one decision – which supplier is the best. the second type of supplier selection problem arises when there are some limitations on the suppliers' capacity, quality and so on. in other words, no single supplier can satisfy the buyer's total requirements, and the buyer needs to purchase partly from one supplier and partly from another to compensate for the shortage of capacity or the low quality of the first supplier. the enterprise must decide which suppliers it should contract, and it must determine the appropriate order quantity for each supplier selected.

this paper presents a cost-minimizing model for supplier selection with price breaks. the existence of variable prices offered by a supplier usually complicates the selection process for the purchaser [5]. the change in prices depends upon the size of the order. in cases when variable prices are combined with capacities or limited conditions of delivery, supplier selection can be very complicated.

consider a buyer who requires an amount $d$ of a particular product over a fixed planning period. there are $n$ suppliers that can meet the buyer's demand. let $x_i$ be the order quantity of this product placed with supplier $i$, where

$$\sum_{i=1}^{n} x_i = d. \quad (1)$$

supplier $i$ can provide up to $v_i$ units of the product over the planning period, i.e.
$$x_i \le v_i. \quad (2)$$

it is assumed that the aggregate amount of the product available from this multiple-sourcing network is sufficient to satisfy $d$, i.e.

$$\sum_{i=1}^{n} v_i \ge d. \quad (3)$$

depending on the type of product, aggregate supplier performance measures of quality and delivery may or may not be meaningful to the buyer; we assume here that they are, for the purpose of illustration. the order quantities must therefore satisfy

$$\sum_{i=1}^{n} q_i x_i \ge q d \quad \text{and} \quad \sum_{i=1}^{n} l_i x_i \le l d, \quad (4)$$

where:
$q_i$ – quality of a unit of the product of supplier $i$,
$q$ – buyer's minimum quality requirement for a unit of the product,
$l_i$ – lead time the supplier needs to fulfil an order,
$l$ – deadline, measured from the order date.

let $p_i(x_i)$ be the price that the buyer must pay to supplier $i$ for supplying the order quantity $x_i$. assume that the suppliers offer some type of price breaks. the net price that the buyer must pay for $d$ is then $\sum_{i=1}^{n} p_i(x_i)$, which the buyer desires to minimize. the supplier selection problem of the buyer may then be modeled by the following mathematical program:

$$\min \sum_{i=1}^{n} p_i(x_i) \quad \text{(objective: minimize the buyer's price for ordering } d\text{)} \quad (5)$$

$$\sum_{i=1}^{n} q_i x_i \ge q d \quad \text{(aggregate quality constraint)} \quad (6)$$

$$\sum_{i=1}^{n} l_i x_i \le l d \quad \text{(aggregate delivery constraint)} \quad (7)$$

$$\sum_{i=1}^{n} x_i = d \quad \text{(demand constraint)} \quad (8)$$

$$x_i \le v_i, \quad i = 1, \dots, n \quad \text{(supplier capacity constraints)} \quad (9)$$

$$x_i \ge 0, \quad i = 1, \dots, n \quad \text{(non-negativity constraints)} \quad (10)$$

the objective function is nonlinear due to the presence of price breaks. a special case of supplier selection involves cumulative price breaks. suppose each supplier offers all-unit price breaks. this implies that the cost of purchasing $x_i$ units from supplier $i$ is defined as

$$p_i(x_i) = p_{ij} x_i \quad \text{if } b_{i,j-1} \le x_i < b_{ij}, \quad 1 \le j \le m(i), \quad (11)$$

where:
$p_{ij}$ – the unit price at level $j$,
$b_{ij}$ – the quantities at which price breaks occur,
$m(i)$ – the number of quantity ranges in supplier $i$'s price schedule.

if $p_{ij}$ is a decreasing (increasing) function of $j$, then quantity discounts (surcharges) are being offered. we define a more specific set of decision variables that describes the order quantities: let $x_{ij}$ be the number of units purchased from supplier $i$ at price level $j$. it follows that $x_i$ is related to these $x_{ij}$ by

$$x_i = \sum_{j=1}^{m(i)} x_{ij}, \quad i = 1, \dots, n. \quad (12)$$

since $x_{ij}$ can be nonzero only if

$$b_{i,j-1} \le x_{ij} < b_{ij}, \quad (13)$$

the objective function becomes

$$p_i(x_i) = \sum_{j} p_{ij} x_{ij}, \quad (14)$$

which is nonlinear. to circumvent this nonlinearity, we introduce binary integer variables $y_{ij}$ in the following way:

$$y_{ij} = \begin{cases} 0 & \text{if } x_{ij} = 0,\\ 1 & \text{if } x_{ij} > 0. \end{cases} \quad (15)$$

this definition can be realized by adding constraints of the form

$$b_{i,j-1}\, y_{ij} \le x_{ij} \le b^{*}_{ij}\, y_{ij}, \quad j = 1, \dots, m(i) \quad \text{(lower and upper quantity range constraints)}, \quad (16)$$

where $b^{*}_{ij}$ is a number slightly lower than $b_{ij}$, introduced to realize

$$x_{ij} < b_{ij} \quad (17)$$

for all $i$ and $j = 1, \dots, m(i) - 1$, and

$$b^{*}_{i,m(i)} = d \quad (18)$$

for all $i$, since $d$ is the most any supplier can supply. actually, for $i = 1, \dots, n$, the constraints

$$b_{i0}\, y_{i1} \le x_{i1} \quad (19)$$

are redundant, since $b_{i0} = 0$ and $x_{i1}$ is defined to be non-negative. a further constraint for each supplier is that at most one price level can be chosen, i.e.

$$\sum_{j=1}^{m(i)} y_{ij} \le 1, \quad i = 1, \dots, n \quad \text{(price level constraints)}. \quad (20)$$
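once the binary $y_{ij}$ are introduced, the program in eqs. 5–20 is a mixed-integer linear program. as an illustration (our sketch, not the authors' code; it assumes the open-source pulp package and invented toy data), a small instance can be written out as:

```python
import pulp

# Illustrative data: 2 suppliers, 2 price levels each (all-unit breaks).
d = 100                                   # total demand
b = {1: [0, 60, 100], 2: [0, 50, 100]}    # break points b_i0 .. b_i,m(i)
p = {1: [5.0, 4.5], 2: [4.8, 4.4]}        # unit price p_ij at level j
eps = 1e-3                                # b*_ij = b_ij - eps (eqs. 16-17)

prob = pulp.LpProblem("supplier_selection", pulp.LpMinimize)
x, y = {}, {}
for i in b:
    for j in range(len(p[i])):
        x[i, j] = pulp.LpVariable(f"x_{i}_{j}", lowBound=0)
        y[i, j] = pulp.LpVariable(f"y_{i}_{j}", cat="Binary")
        hi = d if j == len(p[i]) - 1 else b[i][j + 1] - eps
        prob += x[i, j] >= b[i][j] * y[i, j]     # lower range (eq. 16)
        prob += x[i, j] <= hi * y[i, j]          # upper range (eqs. 16-18)
    prob += pulp.lpSum(y[i, j] for j in range(len(p[i]))) <= 1   # eq. 20

prob += pulp.lpSum(x[i, j] for (i, j) in x) == d                 # eq. 8
# Aggregate quality (eq. 6), delivery (eq. 7) and capacity (eq. 9)
# constraints would be added in the same way.
prob += pulp.lpSum(p[i][j] * x[i, j] for (i, j) in x)            # eq. 5
prob.solve()
```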
The situation can be even more complex. Namely, let $x_{ijk}$ be the number of units of good $k$ (the kind of material required) offered by supplier $i$ at price level $j$. At the same time, there are $n$ different suppliers ($i = 1, \dots, n$), $l$ kinds of goods ($k = 1, \dots, l$), and each supplier offers $m(i)$ price levels ($j = 1, \dots, m(i)$), where the number of different discount categories depends on the supplier. Let us assume that the producer requires the total quantity of goods $d$; therefore

$$\sum_{i=1}^{n} \sum_{j=1}^{m(i)} \sum_{k=1}^{l} x_{ijk} = d \qquad (21)$$

must be valid. At the same time, there are individual requirements for each kind of goods $d_k$, i.e., the following relations must be valid:

$$\sum_{i=1}^{n} \sum_{j=1}^{m(i)} x_{ijk} = d_k \quad \text{and} \quad \sum_{k=1}^{l} d_k = d. \qquad (22)$$

If we consider in this way all the kinds of goods required together, the model becomes extremely complex, with a great number of variables and constraints. Such a model can be subdivided into a number of small models, each to be solved separately, which makes the problem simpler.

4 Supply chain management – the case of Croatia

The construction of a modern and efficient economic structure is certainly one of the fundamental objectives of the reconstruction and development of the Republic of Croatia. It can be safely assumed that many different factors and processes, especially privatisation and restructuring, influence the realisation of such an objective. Namely, privatization in Croatia is entering a new stage, which is characterized by the further privatization of small and medium-sized enterprises, as well as by the end of the privatization of large production systems, banks and public enterprises. Restructuring is oriented toward the creation of a network of small and medium-sized enterprises, characterized by a high degree of flexibility, adaptability and creativity. They should be based on the principles of entrepreneurial economics and entrepreneurial spirit. Restructuring is being carried out at all levels, especially at the level of large and complex systems. The Republic of Croatia is a country in transition from a socialist, state-dictated economy toward market principles. Such a process implies a need for the overall transformation of Croatian companies, which includes all segments of the organizational restructuring process. All these factors are also related to the need for the internationalization of Croatian enterprises, and to the imperative to be included in the international processes of economic cooperation and globalization. Croatian companies are facing severe competition on both the international and the domestic market, which is becoming more and more dynamic and complex. In such a context, supply chain management (i.e., logistic chain management) becomes a significant factor that contributes to the creation of competitive advantage, which can be justified by the share of logistics costs in the total costs of goods produced in Croatia (estimated to be higher than 40 %). Threats from the environment, as well as rising competition, require Croatian enterprises to utilize new strategies of logistics and distribution, with the objective of optimizing the flows of both goods and materials, from their source to the final consumer. Such an objective is to be implemented with the idea of partnership in all stages of the supply chain. The supply chain philosophy is based on partnership among all those involved. Instead of the traditional competition, the supply chain is being built upon the concept of "co-operation", which aims to mutually maximize the profit of all members of the supply chain.
Lowering the costs and improving the quality of the shared logistics (e.g., by better planning of the material and goods requirements, by the utilization of contemporary information technology in information exchange, etc.) can achieve this. Even in Croatian enterprises, the traditional approach to competitive advantage, usually referred to as the 4Ps (product, price, promotion and place, i.e., the philosophy of a quality product at an acceptable price, with appropriate promotion, found in the right place), is being replaced by the 3R concept (referring to reliability, responsiveness and relationships). This concept is based on the principles of reliable and responsive delivery, as well as on stable relationships. Such a concept is being increasingly used by a range of Croatian enterprises. AD Plastik d.d. can be identified as one such enterprise, as evidenced by its long-term cooperation with Renault and Peugeot.

5 Conclusion

With regard to the development trend of supply chains between enterprises in different locations, inter-enterprise networking is constantly increasing today in Europe, and particularly in the USA. In the USA, there are already well-known examples of networking among several enterprises (e.g., the Wal-Mart trade network). These potentials, available to those who introduce SCM, should also be used in Croatian enterprises in the future. In order to realise this, it is necessary to network the enterprises of a particular region and to connect them, for the purpose of knowledge transfer, with science and research institutions. An important role in this network should be played by state institutions (economic chambers, regional and municipal governments, etc.), which should decide on a development strategy and on projects for a faster transfer to a market economy and entry into European integration.

References
[1] Alard, R., Hartel, I., Hieber, R.: "Innovationstreiber im Supply Chain Management". IO Management, Vol. 5 (1999), p. 64–67.
[2] Beckmann, H.: "Integrale Logistik als Wachstumskonzept – Supply Chain Management – Strategie der Kooperation". In: Jahrbuch Logistik 1998 (editors: K. Radeker and V. Kirsten). Düsseldorf: VDI, 1998, p. 23–29.
[3] Dinges, M.: "Supply Chain Management – Logistikrevolution oder Wein in alten Schläuchen?" Information Management & Consulting, Vol. 13 (1998), No. 3, p. 22–27.
[4] Kansky, D.: "Supply Chain Management". Industrie Management, Vol. 15 (1999), No. 5, p. 14–17.
[5] Jurun, E., Babic, Z., Tomic-Plazibat, N.: "A model approach to the vendor selection problem". Proceedings of the 8th Conference on Operational Research KOI 2000, Mathematical Communications, Vol. 1 (2001), No. 1, p. 103–110.

Elza Jurun, Ph.D.
Želimir Dulcic, Ph.D.
University of Split, Faculty of Economics, Matice hrvatske 31, 21000 Split, Croatia

Ivica Veza, Ph.D.
University of Split, Faculty of Electrical Engineering, Mechanical Engineering and Naval Architecture, R. Boskovica bb, 21000 Split, Croatia
Acta Polytechnica 57(3):182–200, 2017, doi:10.14311/AP.2017.57.0182
© Czech Technical University in Prague, 2017, available online at http://ojs.cvut.cz/ojs/index.php/ap

Approaches to the gas control in UCG

Ján Kačur*, Karol Kostúr
Institute of Control and Informatization of Production Processes, Faculty BERG, Technical University of Košice, Boženy Němcovej 3, 040 01 Košice, Slovak Republic
* corresponding author: jan.kacur@tuke.sk

Abstract. Underground coal gasification represents an alternative to conventional coal mining, and this technological approach is less expensive than traditional mining. It is expected that coal will remain an important energy source in the coming decades. The gasification process must be improved so that the combustion reactions generate enough energy to heat the reactants. This can be achieved by controlling the flow of the gasification agents and the pressure at the exit point of the UCG reactor. This paper proposes the stabilization of the air flow, which is the main gasification agent injected into the gasification process, of the underground temperature, and of the O2 concentration in the syngas. We also propose a mechanism that can cope with uncertainties in the UCG process and its stabilization. The paper further presents the utilization of a discrete controller with adaptation for stabilizing the UCG process variables. The controllers were verified on an ex-situ reactor.

Keywords: UCG; syngas; controller; stabilization; ex-situ reactor; adaptation.

1. Introduction

A successful application of the UCG (underground coal gasification) process requires the integration of a wide range of technical disciplines, which explains its slow commercial acceptance. Special skills from the fields of chemistry, chemical engineering, geology, geotechnical engineering and geohydrology are all necessary to plan and execute a successful UCG project. Information about the process conditions must be constantly monitored and updated as the gasification process moves forward. The ideal temperatures of above-ground coal gasification are about 1000 °C; however, it is not possible to achieve these temperatures in the UCG, mainly because of the lack of control over the water influx and the reactant gas flow patterns. It would be more useful to couple the UCG process models with a full-scale process simulator, so that the entire process can be modelled at once, rather than sequentially [1].
While it is impossible to monitor some variables of the process directly (e.g., the temperatures in the georeactor), special systems have been developed for an indirect measurement of these variables. The changing conditions of the process may cause problems; for this reason, the control system should be robust and able to continuously adapt to changes in the process.

1.1. Geology

The UCG requires an understanding of various aspects of the selected site. The geology, hydrology, mining, drilling, exploration, chemistry and thermodynamics of the gasification reactions in the cavity are important parameters for a successful operation [2]. There are many practical difficulties still to be overcome, and it is already clear that the technology can only be applied to certain types of coal seams. The hydrogeology of the seam is important, since an excessive ingress of water would render the process uneconomic, and gas leakage into underground water supplies could represent an environmental hazard [1]. Groundwater contamination around the UCG reactor is caused by the dispersion and penetration of the products of the coal pyrolysis through the migration of groundwater and escaped gases. During the gasification process, air or oxygen is injected with a high pressure, equal to or greater than the surrounding hydrostatic pressure. Some of the gas products are therefore lost to the surrounding permeable media, and perhaps to the overlying strata, as a result of cracks in the overburden [3]. Both air and oxygen gasification have been tried in the world. In the case of air, syngas with a very low calorific value is produced, whereas with oxygen, the cost of the blast and the losses make the process very costly [4].

1.2. Factors affecting UCG

Long-lasting experiments have confirmed that the efficiency of the gasification depends on factors such as the method of gasification, the temperature of the relevant zones, the type of oxidant, the physical and chemical properties of the coal, the process control, the geology of the coal, the operating pressure, the mass and energy balance of the underground reactor, and other gasification parameters [1]. The syngas can be produced using a variety of oxidants, including air and oxidation mixtures (e.g., O2/H2O, CO2/O2, etc.). The effect of humidity on the parameters of the syngas is significant, and the extent of the dependence varies with the coal bed. The most important performance parameter of the UCG is the calorific value of the syngas. The UCG is carried out as an auto-thermic process in which the injection of the gasification agents, through the injection hole into the coal bed, generates heat by the combustive reactions of the carbon. Due to the need to improve the gasification process, we must ensure that the combustive reactions generate enough energy to heat the reactants and to overcome the heat loss from the reactor. It is also necessary to support the rate of the endothermic reactions.

Figure 1. Reaction zones of UCG [5].

Figure 1 shows the basic chemical reactions and the temperatures defining the distributed reaction zones in the UCG. The injected air causes the coal to burn, which leads to exothermic processes that release heat and consume oxygen. The hot oxidized product gases migrate toward the exit borehole, passing through a reducing region where, in the absence of oxygen, they are converted to combustible gases by a number of endothermic reactions which absorb heat.
The somewhat cooler, but still hot, combustible gases continue to migrate towards the exit borehole, where they cool further as they pass through a coal pyrolysis region. In this region, the coal is heated to drive off its volatile components, which are tars, hydrocarbons and other gases, both combustible and non-combustible. Then, the entire gas mixture exits the seam through a production borehole. The oxygenation of the coal increases its temperature, and the desired temperature in the oxidation zone should be above 1000 °C, so that reduction can occur. After the ignition of the coal at the inlet, the coal seam is heated to temperatures above 600 °C; hydrocarbons and gaseous components, represented mainly by methane (CH4), make up the majority of the matter burning in that region. The formation of the plastic state of the coal at temperatures up to 400 °C will not occur, because of the burn-out of the volatile hydrocarbons, i.e., the plasticizers [6]. The carbon is burnt out last, and its reactivity will be high with regard to the increased surface and porosity created by the escape of the volatile components. The combustion products CO2 and H2O, at temperatures above 1000 °C and flowing through a horizontal channel, will create three zones: oxidation, reduction and distillation. In the oxidation zone, the reactions of burning gases dominate, and the temperature of the generated exhaust gases ranges from 1000 to 1200 °C. In the reduction zone, reactions producing CO and H2 run, and owing to the endothermic reactions of the further coal gasification in this zone, the temperature of the outgoing exhaust gases decreases. In the distillation zone, temperatures are expected to be in a range from 130 °C (drying) to 500 °C (the release of the volatile components of the coal, CH4 and CnHm). The output gas, with temperatures above 600 °C, will be enriched by these volatile components of the coal.

1.3. Control of UCG

The principle of the automatic process control of the UCG depends on the nature of the information obtained from the system and on the possibility to identify the controlled processes. The UCG process is difficult to identify and manage, considering that the process takes place in several stages and that, during the operation, there are changes of the underground coal gasifier (e.g., cavity enlargement, a shift of the combustion front, gas leaks, cracks, ground water, etc.). Usually, we can create a mathematical model from an input-output analysis. This model will represent the mathematical relationship between the input and output process variables (e.g., the relationship between the oxidant flow and the syngas calorific value), but the main control issue of the UCG is the relatively large degree of uncertainty of the controlled object (i.e., the coal seam), which, unlike an industrial system, was created by nature. This uncertainty can be partly reduced by a more detailed geological survey; but even doing this does not guarantee the elimination of the uncertainty, as evidenced by the long-term experience with the traditional coal mining technology. Unfortunately, there is not much evidence of an advanced UCG control in the world. Most of the published works are focused on the control of the UCG with the utilization of a process model. Recently, a one-dimensional model of a coal bed was proposed for the control. This model can be used in a closed control loop with a robust sliding mode controller (SMC) to maintain the desired heating value of the syngas.
The model is able to predict the chemical composition of the syngas based on the composition of the injected gas, its flow, and the properties of the coal [7, 8]. Wei and Liu [9] have proposed a principle of the optimal control of coal gasification based on iterative adaptive dynamic programming (ADP) and a learning scheme. The authors have used a neural network to construct the dynamics of the coal gasification process. Some works are focused only on the prediction of the output parameters of the UCG (e.g., the syngas composition, temperature, or calorific value); their authors have tried various principles of soft sensors or a thermodynamic model [10, 11]. There is little to no evidence about the optimisation of the input parameters in the UCG [12] and about statistical analyses of the UCG process variables [13].

The control system of the UCG should represent a set of components, modules and subsystems that are used to control the whole process. It is mechanically united as a whole with the programmable logic controller (PLC), which can perform several control algorithms (i.e., cyclic tasks). Based on the model analysis, and due to gas leaks from the cavity and the gasification channel, the following cases can occur:
• danger due to an explosive concentration,
• poisoning due to increased concentrations of toxic gases, in or near the area of active mining, but also on the surface.

In order to control the gas production in the underground gasifier, the following means of the UCG control (see Figure 2) can be used:
• Overpressure control: the flow of the injected oxidizer is regulated.
• Under-pressure control: the under-pressure is regulated by adjusting the power of the exhaust ventilator.
• Combined control: both of the aforementioned are used.

Various control algorithms can be used in the feedback control, such as on-off control (e.g., the controlled switching of the compressors according to the pressure in the tank), proportional control, or proportional-integral-derivative (PID) control. The control can also be realized as a feedforward control based on the personal presence of the operator. A very widespread type of feedback control is regulation. The following control approaches can be implemented for the syngas production control:
• Regulation of the injected oxidizer flow to a desired value. By changing the volume of the flow, we can influence the course of the chemical reactions during gasification and the quality of the syngas at the outlet. The addition of O2 improves the energy effect, but the volume of CO2 increases. The air volume flow can be measured from the pressure difference on an orifice plate, or by the use of flowmeters. For measuring the flow rate of oxygen, flowmeters with a special treatment should be used to avoid explosions. A higher rate of the injected oxidizer improves the calorific value and the rate of the gas production. Too much oxidizer can, however, cause a negative effect, i.e., reducing the temperature in the reaction zone (cooling) and, consequently, also the calorific value of the syngas. Therefore, there is a need to look for an optimal flow of the gasification agents.
• Control of the operating pressure of the injected oxidizer. The advantage of high-pressure coal gasification over conventional atmospheric combustion is that the major pollutants can be economically separated from the product gas, resulting in a substantial reduction in the amount of pollutants released into the environment per ton of coal processed [14].
In the future, the separation and sequestration of CO2 can also lead to further significant reductions in greenhouse gas emissions. When the gasification takes place at high pressures, the reaction of hydrogen and char can become important. However, an increased injection pressure will tend to increase the gas loss from the underground reaction zone to the surrounding strata. A higher operating pressure has a positive effect on the concentrations of methane and hydrogen, which ultimately affect the calorific value of the produced gas [15]. A higher operating pressure of the oxidizer displaces methane from the coal pore structure and causes cracking of the coal. Also, a high injection pressure can cause the loss of gas from the underground reaction zone into the surrounding layers. These conclusions result from the analysis of the gasification at Rocky Mountain I (CRIP) and Newman Spinney [16].
• Temperature stabilization by the regulated injected air. In general, stabilization means maintaining the measured process variable at a desired value (e.g., the volume flow of the injected oxidizers or the oxygen concentration in the syngas) [17]. At higher temperatures, the endothermic reactions (e.g., the Boudouard reaction: CO2 + C = 2CO) predominate over the exothermic ones. The values of methane decrease, while the production rate of fission products, such as CO and H2, increases. The maximal effect of CO2 consumption is at temperatures above 1000 °C; these are the ideal theoretical conditions for the reaction of the entire amount of CO2 to CO, assuming the presence of coal.
• Stabilization of the oxygen concentration in the produced syngas by an exhaust ventilation system, or by the control of the injected air flow. The basis is a feedback control with a discrete controller. The verification of the proportional-integrating (PI) controller will be shown in Section 5.3. It is necessary that the oxygen content of the syngas is close to zero. The components of the syngas can be measured at the outlet of the gasifier by stationary analysers based on the principle of infrared or electrochemical sensors.
• Control of burning with an exhaust fan. It is a process that uses active exhaust ventilation to ensure a complete combustion of the coal in an underground coal bed. An exhaust fan is connected to a piping system that leads to the underground mine. The fan creates a vacuum and supports the smouldering of the coal. The air gets into the mine by flowing from the surface through natural cracks and crevices, and also through the injection borehole. The increased airflow improves the combustion, and the generated gases are extracted through an output borehole by the fan. The main advantage of this system is that all the resulting gas is extracted through one point, with no gas losses into the surrounding layers. It was further found that the losses of heat through conduction are lower.

Figure 2. Basic control loops on the stabilization level.

This technique is not suitable
achieving the goal of the control is ensured by processing of information obtained by measuring the real output from the system (or deviation of the measured value from the set point). such control in this case is derived directly from the current status of the system. the following sections, will describe the feedback approach to the stabilizing of the gasification process, which is based on a digital controller. 2. experimental ucg in lab scale within the experimental gasification, several control loops that were verified utilising experimental gasification equipment (i.e., ex-situ reactor, respectively syngas generator with a control system) were proposed in order to increase the performance of gasification. figure 3 shows the base structure of gas control in the ucg. the symbol y represents is the controlled variable, w is the desired value of the controlled variable i.e., the set point or the reference value, e is the control error, u is the manipulated variable or the control variable. a basic level represents a monitoring system created within a scada/hmi application promotic. the monitoring system provides a visualization of process variables, data recording and a setup screen for adjustment of controllers. the monitoring system is implemented on a pc. the upper level represents stabilization algorithms (i.e., cyclic tasks) implemented on the plc. stabilizing level will stabilize inputs into the process of the ucg (e.g., pressures, flows figure 3. block diagram of the feedback control of oxidants) as well as outputs (e.g., temperature, concentration levels of oxygen in the syngas). the stabilization level is supported with an adaptation mechanism due to uncertainties in the ucg process. the control system is based on the plc which contains several control algorithms, which are adjusted from the environment of the monitoring system. the algorithms of the control system were programmed in an integrated development environment (ide) of b&r automation studio 2.6 with utilization of the automation basic language. for gasification, the following stabilization loops were created: • stabilization of pressure (on–off control) in a pressure vessel. • injected air flow stabilization. • stabilization of the temperature in the generator. • stabilization of the o2 concentration in the produced syngas. proposed control algorithms were verified on experimental gasification equipment. two ex-situ reactors (i.e., syngas generators) for the experimental trials of the ucg were constructed. generators were employed ‘to bed’ the coal, so that they may simulate the underground coal seam. the scheme on figure 4 shows the supply system of gasification agents and the outlet system for the syngas exhaust. all relevant devices for measuring and control that were used are also shown. the air supply was supported by two 185 ján kačur, karol kostúr acta polytechnica figure 4. scheme of experimental gasification equipment (modified after [20]). compressors. the air flow was measured by utilization of a centric orifice. a servo valve was used for air flow regulation. additional oxygen was supplied from pressure cylinders. for syngas analysis, two analysers and a calorimeter were used. for syngas exhaust, an under-pressure ventilator and a frequency inverter were used. the temperature inside the generator (i.e., in a various layers of the coal) was measured with k type thermocouples. the coal ‘bedded into’ generator represents a physical model of a real coal seam. 
Various models were tried (e.g., with a channel, counter-flow, uni-flow, crushed coal or blocks of coal) [21]. Similar gasification equipment was designed by Dobbs and Krantz [22]. The following is an overview of the devices for measurement and control used in the experimental gasification:
• two piston compressors (Schneider, 400 V, 10 bar),
• pressure vessel (Tlakon, 16 bar),
• pressure transducer on the pressure vessel (Siemens, Sitrans P300, 16 bar, 4–20 mA),
• servo valve (Honeywell, 24 V),
• pressure transducer behind the servo valve (Siemens, Sitrans P, series Z, 0–6 bar, 4–20 mA),
• centric orifice and differential pressure sensor for the injected air flow measurement (Siemens, 4–20 mA),
• vortex flow meter for the injected oxygen (Krohne, Microwell, VFC070, Optiswirl 4070/C, 4–20 mA),
• pressure transducer for air (Keyence, 0–1 MPa, 0–5 V),
• two pressure transducers for air (Siemens, Sitrans P, series Z, 0–10 bar, 4–20 mA),
• pressure transducer for air (Keller, 0–5 bar, 4–20 mA),
• gas analyser #1 (Madur, CMS-7, O2: 0–21 %, CO: 0–50 %, CO2: 0–50 %, CH4: 0–100 %, 4–20 mA),
• gas analyser #2 (ABB, Caldos, H2: 0–40 %, 4–20 mA),
• portable gas analyser #1 for CO (Testo 512),
• portable gas analyser #2 for CO, O2 and CO2 (Wohler, A97),
• calorimeter CWD 2005 (0–8 MJ/m3, 4–20 mA),
• thermocouple probes of K type (Omega, Omegaclad® sheathing, up to 1335 °C),
• reducing valves for the injected air, O2, N2 (cleaning of the H2 analyser), CH4 (carrier gas for the calorimeter) and calibration gas,
• pressure transducer for the syngas at the outlet (Kimo MP52, ±10000 Pa, 4–20 mA),
• analyser and gas humidity probe (Madur, MaPressII, RS 232),
• segment orifice and differential pressure sensor for the syngas flow measurement (Siemens, 4–20 mA),
• 26 electromagnetic valves (Siemens, 24 V),
• under-pressure ventilator (Siemens, 400 V),
• frequency inverter (Siemens, Micromaster, 400 V, 50 Hz, 0–10 V, 4–20 mA),
• PLC B&R X20 (X20CP1485, CPU Intel Celeron 400 MHz, 32 MB, TCP/IP, RS232) with I/O modules (3× DO, 1× DI, 4× AI, 4× AO, 11× thermocouple modules),
• PC, Intel P4 (dual core, 3 GHz, 4 GB RAM, 1 TB HDD, 2× RS232, Ethernet TCP/IP) with the SCADA/HMI system Promotic.

Table 1. Coal sample analysis (abbreviations: r = received, d = dry, daf = dry ash-free, a = analytical) [23].

Parameter: Value
total moisture W_t^r (%): 22.25
ash A^d (%): 26.33
volatiles V^daf (%): 60.39
carbon C^daf (%): 64.79
hydrogen H^daf (%): 5.59
nitrogen N^daf (%): 1.04
calorific value Q_i^daf (MJ/kg): 24.94
calorific value Q_i^d (MJ/kg): 18.37
calorific value Q_i^r (MJ/kg): 13.74
ash A^r (%): 20.47
carbon C^r (%): 37.11
hydrogen H^r (%): 3.20
nitrogen N^r (%): 0.59
CaO (%): 1.12
MgO (%): 0.62
SiO2 (%): 12.10
Al2O3 (%): 5.26
Fe2O3 (%): 2.89
Na2O (%): 0.14
P2O5 (%): 0.02
TiO2 (%): 0.17
K2O (%): 0.55
volatiles V^r (%): 34.59
analytical moisture W^a (%): 9.56
total sulphur S_t^r (%): 1.93
sulphate sulphur S_s^r (%): 0.01
pyritic sulphur S_p^r (%): 1.35
organic sulphur S_o^r (%): 0.57
oxygen Q^daf (%): 26.34
oxygen Q^d (%): 19.4

For the experiments, lignite from the Cigeľ mine was used. The Cigeľ mine belongs to the Upper Nitra coal basin [23]. The analysis of the coal from this mine is summarized in Table 1. The analysis of the coal sample was performed in an accredited laboratory.

3. Selection of a suitable site for in-situ gasification

For the in-situ trials of the UCG, three suitable locations were selected in Slovakia, i.e., the mines Cigeľ, Handlová and Nováky [24]. It is a coal deposit region of the Hornonitrian basin.
The underburden and overburden are formed by gravel, clay, coal clays and andesites [25, 26]. These are sites with a coal bed thickness of 3.2–4.5 m and with a calorific value of the coal in the range of 10–15 MJ/kg. Seams that are close to the surface and with a coal bed thickness of less than 2 m are unsuitable for gasification. For a comparison with Table 1, we present some parameters of the coal from two other coal sites:
• Nováky: W_t^r = 36 %, Q_i^r = 11.81 MJ/kg, A^d = 21.59 %, C^daf = 65.61 %, N^daf = 0.88 %, H^daf = 5.5 %, S_t^d = 4.03 %, S_p^d = 2.84 %, S_s^d = 0.08 % and S_o^d = 1.12 %.
• Veľká Trňa: W_t^r = 1.53–14.83 %, A^r = 11.74–39.92 %, A^d = 11.74–39.92 %, Q_i^r = 14.44–27.19 MJ/kg, Q_i^d = 14.90–31.72 MJ/kg, S^d = 1.00–1.14 %, As^d = 100–560 ppm, C^daf = 89.83 %, H^daf = 1.34 %, N^daf = 0.77 %, S_t^r = 1.12 %, S_p^r = 0.69 %, S_o^r = 0.4 % and S_s^r = 0.02 % [27].

In addition, hydrogeological parameters such as the permeability, the effective porosity and the storage ratio have also been taken into account in assessing the suitability of the coal bed. These parameters are most suitable in the Cigeľ deposit (mining sector VII). Brown coal and lignites have a significantly higher permeability, and their use for the UCG is more appropriate, but due to their lower quality, we cannot expect syngas of the same quality as the gas from anthracite. The moisture, the calorific value, and the sulphur and ash contents affect the quality parameters of the coal in the UCG process. Humidity affects the rate of heating of the coal and thus also the gasification time itself. The calorific value of the coal directly influences the quality of the syngas. A higher sulphur content means higher costs for the syngas processing. The ash content of the coal is an important factor in the UCG. In the UCG, coal with a substantially higher ash content can be gasified, but the ash content affects the amount of syngas. At a high ash content, there is a risk that the coal will not be burned sufficiently. Large amounts of groundwater in the surrounding layers can cause a decrease in the underground temperature, a reduction of the efficiency, or even a halt of the UCG process. Other important properties of the coal seam that affect its use for the UCG are the depth, thickness, slope and structure. Various optimal parameters and criteria for the UCG are described in the literature [24, 25, 27]. The next section describes the basic form of the controller, the methods for calculating the parameters of the discrete controller, and the principles of the system identification, which were used within the laboratory trials of the UCG.

4. Digital feedback control

The proposed stabilization level of the UCG is based on the principle of using a feedback proportional-integral (PI) controller. A discrete controller in an incremental form was used. This controller has an almost universal application, but it is especially suited for tracking and servo-mechanism control. This controller can sufficiently eliminate sudden disturbances and, in most cases, it improves the stability of the control loop [28]. The PI controller (2) was derived from the standard PID controller (1) [29] by omitting the derivative part. The continuous behaviour of the regulation errors (i.e., deviations) was replaced by rectangles, and the integral was replaced with a sum, according to the practice
continuous behaviour of regulation errors (i.e., deviations) was replaced by rectangles and an integral was replaced with a sum according to the practice 187 ján kačur, karol kostúr acta polytechnica presented in [17, 30]: u(t) = kp [ e(t) + 1 ti τ∫ 0 e(τ)dτ + td de(t) dt ] , (1) u(t) = kp [ e(t) + 1 ti τ∫ 0 e(τ)dτ ] , (2) where kp is the proportional gain, ti is integral time (i.e., constant of integration), td is the derivative time [31], e(t) is the control error (i.e., deviation) in time t, e(t) = w(t) − y(t), w(t) is the desired value or so-called set point, y(t) is controlled variable (e.g., measured air flow, temperature in the coal, concentration of some component in syngas), u(t) is the control variable [29, 32]. the integral is approximated by simple summation, so that a continuous function is approximated by sections t0 of constant function (e.g., step function, rectangles). using so-called feedback rectangular method then we get the so-called recursive, speed form of the ps (i.e., proportional-summing) controller [17]: ∆u(k) = u(k) − u(k − 1) = kp [ e(k) + (t0 ti − 1 ) e(k − 1) ] = kpe(k) − kp ( 1 − t0 ti ) e(k − 1) = q0e(k) + q1e(k − 1), (3) where ∆u(k) is an increase of the control variable, u(k) is the value of the control variable in step k (e.g., servo valve opening in percentage or power of the exhaust fan frequency inverter), u(k − 1) is the previous value of the control variable (i.e., control value from the step k − 1, e(k) is the control error, where e(k) = w(k) − y(k), y(k) is the controlled variable (e.g., measured air flow, temperature, o2 concentration), w(k) is the desired value in step k (e.g., desired air flow, temperature or o2 concentration), e(k − 1) is previous value of control error, q0, q1 are defined parameters of the resultant discrete controller by the following substitution: q0 = kp, q1 = −kp(1 − t0/ti), t0 is the sampling period calculated as t0 = (0.25 ÷ 0.125)τd where τd is the dead time or t0 = (0.35 ÷ 1.2)tu and tu is the delay time of the step response [30]. the job of a control system based on the discrete pi controller is to maintain the controlled variable y at its set point w. the discrete controller and controlled system in the form of z-transfer function can be simulated in a matlab simulink. discrete z-transfer of pi controller (3) has the following form: gr(z) = u(z) e(z) = q0 + q1z−1 1 − z−1 (4) the properties of the actuator are the most significant limitation in the choice of the sampling period (e.g., in case of servo-drives non-sensitivity and operating time). ucg endeavours to measure several process variables that can be used for evaluation of dependencies and process behaviour (i.e., flows of gasification agents, pressures, temperatures, concentrations of gasses and calorific value of the syngas). the task of the identification process is as important as the role of the controller synthesis [33]. in the case of discrete systems, the parameters of discrete arx model (i.e., auto-regressive model with exogenous inputs) (5), are directly estimated and the least-squares method as the most common method was used [34, 35]. from the identified parameters of the model, the unknown parameters of the controller can be directly calculated. 
the proposal of an adaptive controller is usually based on a regression arx model of the system, which is modells the system output according the following equation [18, 36, 37]: y(k) = − na∑ i=1 aiy(k − i) + nb∑ i=1 biu(k − i) + nd∑ i=1 div(k − i) + es(k), (5) where y(k) is the value of the output variable in the step k of the sampling period t0 i.e., at the time t = kt0, u(k) is the controller’s output (i.e., control variable) in the step k, es(k) is the fictive noise respectively random non-measurable component, v(k) is disturbance, ai, bi, ci, di are unknown parameters of the model, that we want to identify from measured data [33]. the functions to calculate the parameters of the discrete model from the measured data using the least squares method are already programmed in the mathematical program matlab (i.e., functions: load(), iddata(), arx(), filt()) [38]. the sampling period and order of the regression model determines the quality of the obtained model. a new identification of the controlled system is done when the proposed controller has a low quality of control or new identification is carried out at regular intervals. identification represents the creating of the test signal ∆u — wake up the system, measurement the values of the output variable y (i.e., controlled variable), editing and filtering record, and finally the mathematical model calculation in the form of a discrete z-transfer. for laboratory gasification these systems were controlled untill now: • controlled system #1: y — air flow, u — the percentage of servo valve opening. • controlled system #2: y — coal temperature, u — the percentage of servo valve opening. • controlled system #3: y — o2 concentration in syngas, u — servo valve opening. • controlled system #4: y — o2 (%) in syngas, u — exhaust fan power frequency. 188 vol. 57 no. 3/2017 approaches to the gas control in ucg 4.1. adaptation of controller due to uncertainties in ucg the vast majority of processes that occur in industrial practice has a stochastic nature. the control and identification of the ucg is often a difficult process, because the underground reactor is gradually changing (i.e., increasing cavity, moving of burning front, gas leaks through the cracks in the overburden, underground water, various anomalies etc.). it is stated as a process control in conditions of uncertainty. in addition, the coal bed was formed by an act of nature and therefore it is different from industrial systems. some uncertainty can be reduced by more detailed geological survey, but even that does not guarantee elimination of such uncertainty, as evidenced by the long-term experience in the traditional mining technology of coal. classic controllers with fixed parameters are often not appropriate for such processes, because in the case of changing parameters of the process, the control is not optimal and it leads to a loss of material, energy and a reduction of equipment life etc. one possibility of increasing the quality management of such processes is the use of adaptive control systems. in automated control the adaptive control system adapts the parameters or structure of one part of the system (i.e., controller) to change the parameters or structure of another part of the system (i.e., controlled system), so that based on chosen criterion the steady optimal behaviour of the whole system would be ensured, independent of occurring changes. 
4.1. Adaptation of the controller due to uncertainties in UCG

The vast majority of processes that occur in industrial practice have a stochastic nature. The control and identification of the UCG is often a difficult process, because the underground reactor is gradually changing (i.e., a growing cavity, a moving combustion front, gas leaks through cracks in the overburden, underground water, various anomalies, etc.). This is referred to as process control under conditions of uncertainty. In addition, the coal bed was formed by an act of nature, and it therefore differs from industrial systems. Some of the uncertainty can be reduced by a more detailed geological survey, but even that does not guarantee the elimination of the uncertainty, as evidenced by the long-term experience with the traditional coal mining technology. Classic controllers with fixed parameters are often not appropriate for such processes, because when the parameters of the process change, the control is not optimal, and this leads to losses of material and energy, a reduction of the equipment life, etc. One possibility for increasing the control quality of such processes is the use of adaptive control systems. In automated control, an adaptive control system adapts the parameters or the structure of one part of the system (i.e., the controller) to changes of the parameters or the structure of another part of the system (i.e., the controlled system), so that, based on a chosen criterion, the steady optimal behaviour of the whole system is ensured, independent of the occurring changes. The adaptation to the changing parameters or structure of the system can be done in three ways:
• an appropriate change of the controller parameters,
• a change of the structure of the controller,
• the generation of a suitable auxiliary input signal (adaptation by the input signal).

The addition of a feedback controller can be understood as a feedback of a higher level, which changes the controller parameters according to the quality of the control process. Numerous approaches to adaptive control have been published in the world [30]. In order to calculate the controller parameters continuously during the gasification process, the identification of the regulated system was carried out repeatedly, and from these new identifications, the new parameters of the discrete controller were calculated. For the simulation, a discrete model of the system was used, in the form of a discrete transfer function or a difference equation. For the calculation of the discrete controller, either the parameters of a continuous controller were calculated from the dynamic characteristics of the system, or the parameters of the discrete controller were calculated directly from the parameters of the discrete transfer function of the system. From the character of the process, which eventually changes its parameters, it follows that it is necessary to repeat the identification and the calculation of the controller parameters. For assessing the quality of the regulation, the quadratic criterion ISE (i.e., integrated squared error) was used in the following discrete form:

$$\mathrm{ISE} = \sum_{k=1}^{\infty} e^2(k), \qquad (6)$$

where $e(k)$ is the control error in step $k$ [39]. The principled scheme of the adaptation of the discrete controller that was applied in the gas control within the experimental UCG is shown in Figure 5.

Figure 5. Block diagram of the auto-tuning controller with repeated single-shot identification [30].
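The overall loop described by Figure 5 can also be outlined in code. The following Python sketch is only a rough outline under assumed parameters: the window length, the ISE limit, and all injected callables are illustrative placeholders, not taken from the paper:

```python
# Rough sketch of the auto-tuning loop of figure 5: run the controller,
# watch the windowed ISE criterion (6), and when the quality of control
# degrades, re-identify the system and recompute the controller parameters.
def adaptive_loop(controller, w, read_y, write_u, reidentify, retune,
                  n_steps, window=100, ise_limit=50.0):
    errors = []
    for _ in range(n_steps):
        y = read_y()                      # measured controlled variable
        u = controller.step(w, y)         # e.g. the incremental PI above
        write_u(u)                        # e.g. pulses to the servo valve
        errors.append((w - y) ** 2)
        if len(errors) == window:
            if sum(errors) > ise_limit:   # discrete ISE over the window, eq. (6)
                model = reidentify()      # repeated single-shot identification
                controller = retune(model)  # new q0, q1 for the current state
            errors.clear()
    return controller
```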
a possible 189 ján kačur, karol kostúr acta polytechnica way of designing discrete control loop is an approximate method based on a substitution sampler and shaper by a dead time member. many of the known methods of continuous control loop synthesis have also its equivalent for the discrete control loops. it includes for example the method of the optimal module or the method of required model. there is also a discrete version of the ziegler–nichols method or methods in which the values of the delay time, rise time and gain of the controlled system are read from the behaviour of the step response of the controlled systems [40]. 4.2.1. modified ziegler – nichols open loop tuning method experimental setup of the parameters of a continuous pid controller, proposed by ziegler and nichols [41] more than half a century ago, is still used in industrial practice. in this very famous and popular approach the pid controller parameters are calculated from the critical proportional gain kpk and critical oscillations period tk of the closed control loop. these critical parameters are obtained by gradually increasing the gain of the proportional controller untill the output variable of the closed control loop oscillates with a constant amplitude and the control loop is on the boundary of the stability. in this case, the poles of the closed control loop are located on the imaginary axis of the complex s–plane. then we can read the critical proportional gain kpk (i.e., ultimate gain) and from the record of the controlled variable’s behaviour the critical period of oscillations tk (i.e., ultimate period) is read. the parameters of the pi and pid controller are determined from the following equations [30, 33]: pi: kp = 0.5kpk, ti = 0.83tk, (7) pid: kp = 0.6kpk, ti = 0.5tk, td = 0.125tk (8) disadvantages of an experimental determination of the critical parameter is that the system can be brought into an unstable state, and that a search for the stability limit in systems with large time constants is time consuming. these disadvantagesdo not have a modified method for setting the parameters of the digital controller. this method assumes that the discrete model includes the dead time to the size of t0/2. dead time does not change the amplitude, but the phase shift increases linearly with the increasing frequency: ϕ = − t0ω 2 (9) on the critical frequency ωk the system has the phase shift −π and gain ak so that we have: akkpk = −1 (10) because the discreet control is by influence of the phase shift ϕ caused by discretisation, which changes the critical frequency, and because at a different frequency the system has other gains; the critical gain of the system is changed too. critical values depend on the selected sampling period t0 [33]. the algorithm for calculating critical parameters for the model of the 2nd order will be presented. it assumes a discrete transfer equation of a controlled system in the form of gs(z) = y (z) u(z) = z−db(z−1) a(z−1) (11) with polynomials: a(z−1) = 1 + 2∑ i=1 aiz −i = 1 + a1z−1 + a2z−2, (12) b(z−1) = 2∑ i=1 biz −i = b1z−1 + b2z−2, (13) where parameter d is the number of steps of dead time. calculation of the critical gain and critical period of oscillations depends on the location of the poles on the unit circle in the complex z-plane [33]. for calculation of the real part of the complex conjugated pole α and critical control parameters kpk and tk the algorithm depicted in figure 6 can be used. 
Figure 6. Algorithm for calculating the critical parameters of the PID controller [30].

4.2.2. Time-optimal control (dead beat method)

The method of time-optimal control addresses the design of a common digital controller. It is a method of synthesis whose criterion is the completion of the regulatory process in a finite number of steps. The general form of the controlled system transfer function is considered in the form

$$G_S(z) = \frac{Y(z)}{U(z)} = \frac{b_1 z^{-1} + b_2 z^{-2} + \cdots + b_n z^{-n}}{1 + a_1 z^{-1} + \cdots + a_n z^{-n}} = \frac{B(z)}{A(z)}. \qquad (14)$$

The value of the parameter $n$ specifies the number of control steps. By dividing the numerator polynomial $B(z)$ by the denominator polynomial $A(z)$, the z-transfer function of the controlled system, expressed with the polynomials $P(z)$ and $Q(z)$, is obtained:

$$G_S(z) = \frac{Y(z)}{U(z)} = \frac{Y(z)/W(z)}{U(z)/W(z)} = \frac{P(z)}{Q(z)} = \frac{p_1 z^{-1} + p_2 z^{-2} + \cdots + p_n z^{-n}}{q_0 + q_1 z^{-1} + \cdots + q_n z^{-n}}. \qquad (15)$$

This discrete transfer function is converted to the standard form ($q_0 = 1$) and compared with the original polynomials in (14):

$$G_S(z) = \frac{\frac{p_1}{q_0} z^{-1} + \frac{p_2}{q_0} z^{-2} + \cdots + \frac{p_n}{q_0} z^{-n}}{1 + \frac{q_1}{q_0} z^{-1} + \cdots + \frac{q_n}{q_0} z^{-n}} = \frac{b_1 z^{-1} + b_2 z^{-2} + \cdots + b_n z^{-n}}{1 + a_1 z^{-1} + \cdots + a_n z^{-n}}. \qquad (16)$$

By comparing the coefficients at the powers of $z^{-1}$ in equation (16), we obtain the system of equations for calculating the coefficients $q_i$ and $p_i$ on the basis of the coefficients $a_i$ and $b_i$ ($i = 1, 2, \dots, n$) of the controlled system:

$$q_i = a_i q_0, \quad p_i = b_i q_0, \qquad (17)$$

$$\sum_{i=1}^{n} p_i = \sum_{i=1}^{n} b_i q_0 = 1. \qquad (18)$$

The parameter $q_0$ can be determined as follows:

$$q_0 = \frac{1}{\sum_{i=1}^{n} b_i}. \qquad (19)$$

The transfer function of the controller can be obtained by substituting the values of the coefficients obtained by equations (17) and (19) into the expression

$$G_R(z) = \frac{1}{G_S(z)} \cdot \frac{G_W(z)}{1 - G_W(z)} = \frac{Q(z)}{P(z)} \cdot \frac{P(z)}{1 - P(z)} = \frac{q_0 + q_1 z^{-1} + \cdots + q_n z^{-n}}{1 - p_1 z^{-1} - p_2 z^{-2} - \cdots - p_n z^{-n}}. \qquad (20)$$

The control algorithm for a system of the 2nd order has the following form [17, 42]:

$$u(k) = q_0 e(k) + q_1 e(k-1) + q_2 e(k-2) + p_1 u(k-1) + p_2 u(k-2). \qquad (21)$$

The derived algorithm can be applied to processes with a time delay, and, similarly, the controller equation can be derived for a higher order of the system. In solving the UCG control, the presented Ziegler–Nichols method of the critical gain (i.e., the open-loop method according to equation (7)) was used, by means of which the parameters of the continuous controllers were calculated. Then, the continuous controller parameters were recalculated, at a selected sampling period, to the parameters of a discrete controller in the incremental form. The dead beat method was not used for calculating the parameters of the controller, because it was not considered suitable (see Section 5.1). For calculating the parameters of the mathematical model of the controlled system, the ARX model was used, computed by the least-squares method in MATLAB from the data of the experimental identification.
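Although the dead beat design was eventually rejected for the air flow loop (see Section 5.1), the computation of its coefficients from equations (17)–(19), together with the control law (21), is simple. A short Python sketch:

```python
# Sketch of the dead beat design of eqs. (17)-(19) and control law (21).
def dead_beat_params(a, b):
    """a = [a1..an], b = [b1..bn] of the model (14) -> q = [q0..qn], p = [p1..pn]."""
    q0 = 1.0 / sum(b)                    # eq. (19)
    q = [q0] + [ai * q0 for ai in a]     # eq. (17): qi = ai * q0
    p = [bi * q0 for bi in b]            # eq. (17): pi = bi * q0
    return q, p

def dead_beat_step(q, p, e, u):
    """Eq. (21) for n = 2; e = [e(k), e(k-1), e(k-2)], u = [u(k-1), u(k-2)]."""
    return q[0]*e[0] + q[1]*e[1] + q[2]*e[2] + p[0]*u[0] + p[1]*u[1]
```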
5. Experimental results and discussions

5.1. Air flow stabilization

A basic variable in the control of the gasification process is the air flow that is supplied to the experimental gasifier. By changing the flow rate, we can influence the behaviour of the chemical reactions during gasification. For controlling the air flow in the experimental gasification in the laboratory gasifier, a servo valve installed on the pipe supplying air from the pressure vessel was used. The air pressure in the pressure vessel was maintained between a minimum and a maximum set point; an on-off control method (i.e., bang-bang control) was used for the pressure stabilization [17, 29]. The servo valve behind the pressure vessel was connected to the control PLC and operated via two digital signals: one signal was used for opening and the second for closing. The total time necessary to fully open or fully close the servo valve is 30 s. To ensure the full opening of the valve (i.e., opening to 100 %), the PLC must send 60 pulses with a duration of 500 ms; a similar rule applies to the complete closing of the valve. These are discontinuous pulses, i.e., each pulse of the signal with a duration of 500 ms is followed by a zero amplitude. The conversion of the percentage to pulses is done using the following equation:

$$u_{\mathrm{pulses}} = u_{\%} \cdot 0.6, \qquad (22)$$

where $u_{\mathrm{pulses}}$ is the calculated change of the servo valve position in pulses (one pulse being a digital signal with an amplitude of 24 VDC and a duration of 500 ms), and $u_{\%}$ is the required change of the servo valve position (%). The calculated pulses are rounded to the nearest integer. The created control system has its own programmed counter of pulses, because the electronics used in the servo valve does not provide any feedback information on the current servo valve position. The control system can also be configured for the broadcasting of uninterrupted pulses.

Figure 7. Control scheme of the air flow stabilization by the servo valve.

The automatic flow control is based on the principle of stabilizing the measured airflow at the desired value. The air flow is calculated from the measured pressure difference on the centric orifice and the air pressure before the orifice. The task of the control algorithm is, consecutively, in specified time steps, to close or open the servo valve so that the regulation deviation between the desired and the measured flow is eliminated. This problem is solved by the equation of the discrete PI controller (3). The discrete controller calculates the percentage increase (or decrease) of the servo valve opening Δu on the basis of the deviation e. The control deviation e is calculated as the difference between the desired value w (i.e., the set point) and the measured flow y of the air. Figure 7 shows the scheme for stabilizing the air flow through the servo valve. For the calculation of the unknown controller parameters q0, q1, an experimental identification of the controlled system was performed. The experimental identification is based on the principle of a step excitation of the regulated system, and on the following and recording of the response of the controlled variable, the air flow (see Figure 8).

Figure 8. System identification with a 5 % servo valve opening.
Figure 9. Comparison of the measured data with the models.

For the calculation of the model parameters, MATLAB functions of parametric identification were implemented [43]. The functions of the System Identification Toolbox were applied to calculate the parameters of the ARX model. The result is a model of the second order in the form of a discrete z-transfer function (23). This model was converted to a difference equation in the form (25). The parameters a1, a2, b1, b2 of the second-order difference equation are the results achieved from the identification [43].
Overall, models of the 1st, 2nd and 3rd orders were calculated and then compared with the measured data. The models in the form of z-transfer functions were transformed into the difference equations (24)–(26):

$$G_S(z^{-1}) = \frac{b_1 z^{-1} + b_2 z^{-2}}{1 + a_1 z^{-1} + a_2 z^{-2}} = \frac{2.46 z^{-1} + 2.453 z^{-2}}{1 - 1.118 z^{-1} + 0.367 z^{-2}}, \qquad (23)$$

$$y_m(k) = 0.7584\, y(k-1) + 5.324\, u(k-1), \qquad (24)$$

$$y_m(k) = 1.118\, y(k-1) - 0.367\, y(k-2) + 2.46\, u(k-1) + 2.453\, u(k-2), \qquad (25)$$

$$y_m(k) = 1.26\, y(k-1) - 0.6419\, y(k-2) - 0.1392\, y(k-3) + 1.619\, u(k-1) + 1.619\, u(k-2) + 1.619\, u(k-3). \qquad (26)$$

For the verification of the models, a graphical comparison of the measured and model data was used; it also served as a review of the equality of the stabilized values, together with a quantitative statement by the quadratic criterion (27) [43]:

$$J = \int_0^{\tau} \left[ y(t) - y_m(t) \right]^2 \mathrm{d}t = \sum_{k=1}^{N} \left( y(k) - y_m(k) \right)^2 \Delta\tau, \qquad (27)$$

where $y(k)$ is the output from the real system, $y_m(k)$ is the output from the model, $k$ is the time step, $t$ is the time, and $N$ is the number of the measured samples [43]. Since the lowest value of the criterion (27) was achieved by the model of the 2nd order, this model was used for the calculation of the controller's parameters [43]. The result of the comparison is depicted in Figure 9.

For the calculation of the parameters of the discrete PI controller (i.e., q0, q1) for a specified sampling period, the modified Ziegler–Nichols method was used (see Figure 6). This method enables us to estimate the parameters of the PI or PID controller directly from the parameters of the discrete model (24)–(26). Equation (28) represents the discrete PI controller in the incremental form:

$$\Delta u(k) = u(k) - u(k-1) = q_0 e(k) + q_1 e(k-1) = 0.116123\, e(k) - 0.086580\, e(k-1), \qquad (28)$$

where $u$ is the control variable, $e$ is the control error, $k$ is the index of the control period, $w$ is the desired value, $y$ is the controlled variable, $K_P$ is the proportional constant, $T_I$ is the integration constant, and $T_0$ is the sampling period [43]. In consideration of the different pressure conditions, it is necessary to use other controller parameters, because the controlled system changes its parameters; therefore, several controllers for different pressures were calculated. The programmed algorithm of the discrete PI controller always chooses the parameters q0, q1 that correspond to the controller proposed for the selected overpressure. Similarly, the controller parameters are changed for the various air flows. From the previously described facts, it follows that the system identification must be completed at several air flow levels and at different pressures adjusted on the reducing valves. An opening of the servo valve with a smaller inlet pressure results in a lower airflow rate, and the same opening, but with a higher overpressure, results in a higher airflow rate. The behaviour of the stabilization of the air flow for different pressure conditions during the experimental gasification is shown in Figure 10.

Figure 10. Behaviour of the air flow stabilization during gasification at various overpressures.
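The identified model and the designed controller can also be checked together in a short closed-loop simulation. The following sketch replays the model (25) under the PI law (28); the set point, the horizon and the initial conditions are illustrative only, and the model holds only near the identified operating point:

```python
# Closed-loop check: flow model (25) driven by the incremental PI (28).
import numpy as np

a1, a2 = -1.118, 0.367            # model (25) / (23)
b1, b2 = 2.46, 2.453
q0, q1 = 0.116123, -0.086580      # controller (28)

n, w = 60, 150.0                  # steps and desired air flow (illustrative)
y, u, e = np.zeros(n), np.zeros(n), np.zeros(n)
for k in range(2, n):
    y[k] = -a1*y[k-1] - a2*y[k-2] + b1*u[k-1] + b2*u[k-2]   # process model
    e[k] = w - y[k]
    u[k] = u[k-1] + q0*e[k] + q1*e[k-1]                     # eq. (28)
    u[k] = min(max(u[k], 0.0), 100.0)                       # valve limits, %

print("final flow:", round(y[-1], 2))   # settles near the set point w
```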
a dead beat controller (29), calculated according to equations (17)–(21), was not used during the ucg experiments, because in the first step of the regulation it caused a major change of the control variable u (i.e., in some cases, the servo valve was opened up to 100 %), which is an undesired state. an excessive opening of the servo valve in the first step of the control vents the air from the pressure vessel and causes a significant pressure drop. in the next step, the almost closed or only slightly opened valve then causes large oscillations of the system. the classical dead beat controller attempts to stabilize the system in a small number of steps, which makes the control variable jump from one extreme to the other. the result of the practical verification of the dead beat controller is shown in figure 11: the figure shows a short stretch of the control with the discrete pi controller, followed by the control with the dead beat controller. the graphical comparison shows that the dead beat controller is not suitable for stabilizing the flow of air supplied from the pressure vessel; an alternative allowing the application of the dead beat controller could be the use of a high-pressure blower. this type of controller was also successfully implemented in the control of a time-delayed system [44]:

∆u(k) = q0 e(k) + q1 e(k−1) + q2 e(k−2) + p1 u(k−1) + p2 u(k−2) = 0.2035 e(k) − 0.2275 e(k−1) + 0.0746 e(k−2) + 0.5 u(k−1) + 0.5 u(k−2) . (29)

figure 11. control with discrete pi and dead beat controller.

5.2. temperature stabilization
the aim was to inject air into the ex-situ reactor (i.e., the syngas generator) with a flow that would ensure the highest possible content of the heating components in the gas. this aim can be achieved only at higher temperatures, above 1000 °c. with an increasing temperature, the production of co increases (see figure 12) and this component remains dominant. similarly, the ratio co/(co + co2), which is an important indicator during gasification, increases (see figure 13) [45]. however, the higher temperature should be maintained in the long term so that the syngas is produced for as long as possible (see section 1.2). for this reason, it is necessary to implement an algorithm for stabilizing the temperature at the desired value.

figure 12. increase of co concentration in syngas when temperature increases.
figure 13. example of the increasing ratio co/(co + co2) when temperature increases during gasification.
figure 14. scheme of pi controller connections for the temperature stabilization.

temperature stabilization in the experimental gasifier is based on the same principle as the stabilization of the air flow. the controlled variable y is the current highest temperature of the coal in the channel, or a temperature selected by the operating personnel (i.e., a thermocouple selected by its number). the feedback control algorithm includes an auxiliary algorithm that identifies the maximum (i.e., the highest) temperature tmax from all measured temperatures. the control variable u is the percentage opening of the servo valve, which at a given pressure corresponds to the air flow. the discrete pi controller calculates the percentage increment of the servo valve opening or closing; the desired percentage change is then converted in the plc to a number of pulses (see figure 14). the discrete pi controller changes the supplied air flow so that the selected temperature is stabilized: if it is necessary to increase the temperature, the controller increases the air flow, and if the measured temperature is above the set point, the controller reduces the air flow.
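the loop just described — the auxiliary maximum-temperature search followed by the incremental pi step and the pulse conversion — can be sketched as follows, reusing the incremental_pi helper from the earlier sketch. this is a simplified illustration: the thermocouple handling and the usage values are assumptions, and the parameter pair is taken from equation (32) below.

    def select_controlled_temperature(temps, selected=None):
        """controlled variable y: an operator-selected thermocouple,
        or the auxiliary maximum-temperature search (tmax)."""
        return temps[selected] if selected is not None else max(temps)

    pi_temperature = incremental_pi(q0=0.00027, q1=-0.00004115)  # cf. (32)

    def temperature_control_step(w, temps):
        y = select_controlled_temperature(temps)   # °c
        du = pi_temperature(w, y)                  # % change of valve opening
        return round(0.6 * du)                     # pulses for the plc, cf. (22)

    # hypothetical reading of three thermocouples, set point 1000 °c
    pulses = temperature_control_step(w=1000.0, temps=[830.0, 910.0, 875.0])

note that with the small gains of (32) the per-step increments are tiny and often round to zero pulses; the correction accumulates over many control periods, which matches the slow dynamics of the temperature loop.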
the calculated model of the controlled system in the form of a discrete transfer function is expressed by (30); the model of the controller is expressed in the form of a discrete transfer function (31) and by the control algorithm in the form of a difference equation (32):

g_s(z^{-1}) = \frac{199.1\,z^{-1} + 199.1\,z^{-2}}{1 - 0.437\,z^{-1} - 0.05025\,z^{-2}} , (30)

g_r(z) = \frac{q_0 + q_1 z^{-1}}{1 - z^{-1}} = \frac{0.00027 - 0.00004115\,z^{-1}}{1 - z^{-1}} , (31)

∆u(k) = u(k) − u(k−1) = q0 e(k) + q1 e(k−1) = 0.00027 e(k) − 0.00004115 e(k−1) , (32)

where ∆u(k) is the increment of the control variable u, k is the time step, e(k) is the control error (e(k) = w(k) − y(k)), y(k) is the measured temperature, i.e., the controlled variable (°c), and w(k) is the desired value (°c).

figure 15. temperature stabilization at 980 °c.
figure 16. adaptation of the controller to the controlled system.

in figures 15 and 16, the temperature stabilization behaviour with the discrete pi controller is graphically illustrated for different set points. the graphs also show how the calorific value of the syngas varied during the temperature stabilization. the controller's parameters need to be recalculated continuously, as the controlled system gradually changes. figure 16 shows the time behaviour of the temperature stabilization during the gasification process; in the second half of the graph, we can see the behaviour after a new identification and calculation of new controller parameters. the controller was adapted to the current state of the process and, in the next steps, stabilized the temperature sufficiently.

because the temperature cannot be measured directly inside the cavity, a proxy is needed that can provide a reliable indication of the temperature [46]:
• a proxy by measured carbon isotopes [46];
• a proxy by the measured concentration of co and co2 and their ratio (e.g., co/(co + co2)) [45];
• a proxy by the measured emanation of radon from underground to the surface [47];
• a proxy by a mathematical model based on the theory of heat conduction [48, 49].

monitoring of temperature by measuring carbon isotopes: the results show that the isotope value increases with each temperature step, until it stabilizes at a certain temperature [46]. this trend shows that fractionation occurs in the low temperature range, where relatively light components are released. once the value stabilizes in the high temperature range, fractionation no longer takes place and the isotope signature represents the value for bulk coal [46]. in general, the temperature at which the value stabilizes increases with an increasing rank of the coal. (isotopes are variants of a particular chemical element which differ in neutron number, although all isotopes of a given element have the same number of protons in each atom.) the delta value would appear to be a good proxy, but only if the stabilization temperature were higher than the operational temperature of the ucg [46].

monitoring by radon measurement: this technique can be used for the detection of the length and of the moving velocity of the gasification working face; in the experiment, these were detected with the help of the radon measuring technique. radon (222rn) is the only gaseous derivative in the disintegration of the natural radioelement, and it possesses the unique feature of emanating vertically from underground to the ground surface [50]. the movement of radon is closely linked to the temperature: when the temperature rises, the emanation coefficient of radon increases remarkably [47]. therefore, measuring the radon concentration on the surface can reflect the temperature beneath. the rn emanation factor increases dramatically when the temperature exceeds 700 °c.
radon solubility decreases as the temperature increases [51]. this method can be used for the detection of the ucg burn-front and for the estimation of its migration speed.

5.3. stabilization of oxygen concentration in the syngas
during gasification, the task is to maintain the concentration of o2 in the gas at the lowest possible value; a higher concentration of co is then achieved and the resulting calorific value of the gas is higher. figure 17 shows the behaviour of the measured concentration of o2 and the corresponding calorific value calculated from the gas composition. oxygen is measured by an oxygen analyser with a range of 0–21 %. a high concentration of oxygen (10–21 %) in the produced gas means that a large volume of oxidant is blown in, causing the cooling of the coal, which decreases the calorific value. a high concentration of o2 in the gas can also be caused by an insufficient carbon oxidation during the combustion in the oxidation zone, with consequently low temperatures in the oxidation and reduction zones and less co produced. a high concentration of o2 in the gas can also mean that air is sucked into the reactor through gaps, porosity, rifts or leakages in the pipe system, and then enriches the produced gas with its oxygen. for the stabilization of the o2 concentration in the produced gas, an alternate feedback controller was programmed.

figure 17. effect of o2 concentration on the calorific value.
figure 18. connection scheme of the stabilization of the o2 concentration by the air flow.

it is a discrete form of a continuous pi controller, as in the case of the stabilization of the air flow and the temperature (see section 4). the basis of the control algorithm is the controller equation (3), which calculates a new increment of the control variable in each step of the control algorithm. the control variable u is the percentage change in the position of the servo valve or in the power frequency of the exhaust fan motor. a change in the opening of the servo valve changes the air flow and affects the behaviour of the chemical reactions that take place during gasification. in the second case, a change of the fan power (i.e., by a change of the power frequency) changes the amount of the intake air, the under-pressure and the flow of the produced gas. the equation of the controller (3) calculates the manipulated variable in the range of 0–1 (i.e., 0–100 % of the valve opening). regarding the control of the exhaust fan power, the control variable is also in the range of 0–1 (i.e., 0–100 % of the exhaust fan motor power); this range corresponds to the frequency range of 0–50 hz, which the plc sends to the frequency inverter (see figure 19).

figure 19. variable frequency drive of the exhaust fan.

in the first case, the control algorithm converts, in each step, the control variable percentage calculated according to equation (3) to the number of pulses opening or closing the servo valve. the connection diagram of the feedback controller for the stabilization of the o2 concentration by the air flow is shown in figure 18. if it is necessary to increase the concentration of o2, the controller increases the air flow, and vice versa. similarly, a second stabilizing loop was created: the o2 concentration in the produced gas is in this case stabilized by the power of the exhaust fan. the control variable is the power frequency of the motor, calculated by the control system according to equation (33) and sent as an analog voltage signal (0–10 vdc) to the frequency inverter (see figure 20).
we have used a variable-frequency drive (vfd), a type of adjustable-speed drive used in electro-mechanical drive systems to control the ac motor speed and torque by varying the motor input frequency and voltage [52]. a three-phase ac motor was used for the exhaust of the syngas. the power frequency can be set up directly on the inverter panel (i.e., the operator interface) or via a controller (i.e., a control task) actioned via the plc. in the case of the need to increase the o2 concentration, the controller increases the frequency; if the measured concentration of o2 is above the set point, the controller reduces the power frequency of the inverter. globally, there is a practice of using a combustion-control technique that uses only the exhaust fan at the end of the outlet hole during the gasification process [18]. the voltage signal is obtained as

u_v = 0.2 · u_hz , (33)

where u_v is the value of the analogue voltage signal (v) and u_hz is the desired power frequency (hz).

figure 20. connection scheme of o2 stabilization by exhaust fan power.

for the design of the controller, it was necessary to perform an experimental identification of the controlled system (u is the servo valve opening or the power frequency of the fan motor, y is the concentration of o2 in the syngas). the record of the measured values was used for the identification. the model of the controlled system "servo valve opening – o2 concentration" is described by the following z-transfer function:

g_s(z^{-1}) = \frac{6.509\,z^{-1} + 6.509\,z^{-2}}{1 - 0.7325\,z^{-1} + 0.1985\,z^{-2}} . (34)

using the ziegler–nichols method and the algorithm shown in figure 6, the parameters of the continuous pi controllers were calculated and converted to the discrete form. the model of the controlled system "exhaust fan power frequency – o2 concentration" is described by the z-transfer function

g_s(z^{-1}) = \frac{3.8095\,z^{-1} + 3.8095\,z^{-2}}{1 - 0.7218\,z^{-1} + 0.0804\,z^{-2}} . (35)

the model of the controller for the stabilization of the o2 concentration by the air flow is as follows:

∆u(k) = q0 e(k) + q1 e(k−1) = 0.021 e(k) − 0.0123 e(k−1) . (36)

the model of the controller for the stabilization of the o2 concentration by the frequency inverter of the exhaust fan has the form

∆u(k) = q0 e(k) + q1 e(k−1) = 0.1086241 e(k) − 0.073843 e(k−1) , (37)

where ∆u(k) is the increment of the control variable u, k is the time step, e(k) is the control error (e(k) = w(k) − y(k)), y(k) is the concentration of o2, i.e., the controlled variable (%), and w(k) is the desired value (%).
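the chain "pi increment → power frequency → analogue voltage" of equations (37) and (33) can be sketched as follows; an illustrative python fragment, with the clamping limits, the initial state and the usage values being assumptions:

    def o2_fan_control_step(w, y, state):
        """one step of o2 stabilization by the exhaust fan power, cf. (37);
        u is kept in 0..1 (0..100 % of the motor power)."""
        e = w - y                                  # o2 deviation (%)
        u = state["u"] + 0.1086241 * e - 0.073843 * state["e_prev"]
        u = max(0.0, min(1.0, u))                  # clamp to the allowed range
        state["e_prev"], state["u"] = e, u
        f_hz = 50.0 * u                            # 0..1 -> 0..50 hz for the vfd
        return 0.2 * f_hz                          # eq. (33): hz -> 0..10 v signal

    state = {"e_prev": 0.0, "u": 0.5}              # assume the fan at 50 % power
    u_v = o2_fan_control_step(w=1.0, y=3.2, state=state)

with the measured o2 above the set point, the deviation is negative and the returned voltage (and hence the fan frequency) decreases, which matches the behaviour described above.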
in figures 21 and 22, the testing of the stabilization of the o2 concentration by the air flow during gasification is presented graphically. the sampling period was set to t0 = 10 min, and the set point w(k) for the o2 stabilization was changed gradually. the last set point of the measured o2 concentration was reached after 210 min; the sampling period was then increased to t0 = 15 min. in terms of the quality of the gasification process, the stabilization of the o2 concentration at 1 % reached the maximum calorific value of 5 mj/m3 (see figure 22). this test of the controller was performed during a trial on the experimental coal gasifier. in the second test (see figure 23), an o2 concentration of 5 % and a sampling period of t0 = 5 min were desired; the measured concentration reached the desired value after 35 min. in terms of the quality of the gasification process, after the stabilization of the o2 concentration at 5 %, the maximum calorific value reached 0.35 mj/m3 on average (see figure 24). this test took place during gasification in the laboratory gasifier.

figure 21. stabilization of o2 concentration at various desired values.
figure 22. calorific value and the o2 concentration during the stabilization.
figure 23. stabilization of o2 concentration to 5 %.
figure 24. calorific value during the stabilization of the o2 concentration.

the above results indicate that the stabilization of the o2 concentration at lower values allows a higher calorific value of the produced gas; the ideal situation occurs when the concentration of o2 in the syngas is stabilized at 0 %.

similarly, the controller for the stabilization of the o2 concentration by the fan was tried. figure 25 shows the practical verification of this controller: it shows how the value set on the frequency inverter changes and how the o2 concentration responds. the controller reached the set point after 40 minutes and maintained it until a fault occurred. the disturbances arose as a result of changes in the air intake, because this controller usually starts along with the controller for the temperature stabilization or with the extremal controller. the controller was verified during the experiment with gasification in the laboratory gasifier. we can see that a new setting of the controller (i.e., an adaptation) increases the quality of the o2 stabilization (see figure 25).

figure 25. stabilization of o2 by the exhaust fan power frequency.

in the case that the monitoring system indicates a high concentration of poisonous and explosive gases on the surface, caused by gas leaks from underground, we propose to change the ucg control from pressurized to under-pressurized. in the under-pressure control, the gasification is controlled only by the sucked air, and the injection of the oxidizer is closed. the under-pressure can be adjusted by the power of the suction fan. the control of the ucg with under-pressure prevents gas leaks from underground to the surface, but the concentration of oxygen in the syngas should then be controlled.

6. conclusions
the construction of an underground industrial gasification system for a real coal seam requires not only the knowledge of geology and of the process itself, but also an investment needed to build systems for the measurement and automatic control of the gasification process. in this paper, the problem of the gas control in the ucg was solved; the issue in question is the stabilization level and the supporting adaptation level. knowledge gained from the manual process-control tests was used to design the controllers of the stabilization level. the stabilization level was built on the basis of a discrete proportional-summing controller. the outcome is a control system for the stabilization of the oxidant supplied to the gasifier, a control system for the stabilization of the temperature in the gasifier, and a control system for the stabilization of the oxygen concentration in the produced gas. the main objective in the verification of the control algorithms was the maximization of the calorific value of the produced syngas. despite the complexity of the control system, the implemented system could be improved in the future; the benefit would be an improvement of the system for the centralized control of the controllers and of the system for the visualization and evaluation of specific measured values. in the experimental gasification on a lab scale, we have used lignite from slovak mines.
we suggest building automated control systems for the gas control with the following recommendations:
• we propose to build the stabilization level on discrete feedback controllers (pi/pid), using methods of discrete parametric process identification and a model-based method of the controller design.
• we propose to support the stabilization level with an adaptation of the controllers due to the recognized uncertainties, using the principle of self-tuning controllers (stc). the stabilization level will stabilize the inputs into the ucg process (i.e., the pressure and the flow of oxidants) as well as the outputs (i.e., the underground temperature and the concentration of o2 in the syngas).
• in the case that the ucg is controlled by the pressure of the injected gas, a consistent monitoring of gas leaks from underground is essential.
• in the case that there is any reason to inject an oxidant with pressure into the cavity, we propose to extend the control system by a monitoring of syngas leakage.
• if the monitoring system detects the presence of syngas in vulnerable areas (or even far from the vulnerable area), we recommend switching from the pressurized system to an under-pressurized one controlled by the fan power. in this case, the concentration of o2 in the syngas must be checked.
• if there is no serious reason to transport the oxidizer in the overpressure regime, then we recommend the control system to be built on the principle of an under-pressure control.
• we suggest stabilizing the temperature at 1000 °c in the oxidizing zone and keeping the oxygen concentration close to zero for the best performance of the ucg reactor.

although the automated control system for the gas control was tested only in laboratory conditions, it can be used, with some modifications, in real ucg operational environments.

acknowledgements
this work was supported by the project cogar rfcr-ct-2013-00002, by the slovak research and development agency under contract apvv-14-0892 and by grants vega 1/0273/17 and vega 1/0529/15 from the slovak grant agency for science.

references
[1] a. w. bhutto, a. a. bazmi, g. zahedi. underground coal gasification: from fundamentals to applications. progress in energy and combustion science 39(1):189–214, 2013. doi:10.1016/j.pecs.2012.09.004.
[2] a. khadse, m. qayyumi, s. mahajani, p. aghalayam. underground coal gasification: a new clean coal utilization technique for india. energy 39(11):2061–2071, 2007. doi:10.1016/j.energy.2007.04.012.
[3] r. p. verma, r. mandal, s. k. chaulya, et al. contamination of groundwater due to underground coal gasification. international journal of water resources and environmental engineering 6(12):303–311, 2014. doi:10.5897/ijwree2014.0520.
[4] c. higman, m. van der burgt. gasification. 2nd edition. gulf professional publishing, elsevier, oxford, uk, 2008.
[5] anon. underground coal gasification program. technical report erda 77-51/4 on contract no. ex-76-c-01-2343, booz, allen & hamilton, inc., us energy research and development administration, 1977.
[6] t. ishikawa, t. nagaoki. recent carbon technology. jec press inc., cleveland, ohio, usa, 1983.
[7] a. a. uppal, a. i. bhatti, e. aamir, et al. control oriented modeling and optimization of one dimensional packed bed model of underground coal gasification. journal of process control 24:269–277, 2014. doi:10.1016/j.jprocont.2013.12.001.
[8] a. arshad, a. i. bhatti, r. samar, et al. model development of ucg and calorific value maintenance via sliding mode control. in 2012 international conference on emerging technologies, pp. 1–6. 2012. doi:10.1109/icet.2012.6375477.
[9] q. wei, d. liu. adaptive dynamic programming for optimal tracking control of unknown nonlinear systems with application to coal gasification. transactions on automation science and engineering, ieee 11(4):1020–1036, 2014. doi:10.1109/tase.2013.2284545.
[10] p. ji, x. gao, d. huang, y. yang. prediction of syngas compositions in shell coal gasification process via dynamic soft-sensing method. in proceedings of the 10th ieee international conference on control and automation (icca), pp. 244–249. 2013. doi:10.1109/icca.2013.6565140.
[11] m. laciak, j. kačur, k. kostúr. the verification of thermodynamic model for ucg process. in iccc 2016: 17th international carpathian control conference, pp. 424–428. 2016. doi:10.1109/carpathiancc.2016.7501135.
[12] m. laciak, d. ráškayová. the using of thermodynamic model for the optimal setting of input parameters in the ucg process. in iccc 2016: 17th international carpathian control conference, pp. 418–423. 2016. doi:10.1109/carpathiancc.2016.7501134.
[13] m. benková, m. durdán. statistical analyzes of the underground coal gasification process realized in the laboratory conditions. in sgem 2016: 16th international multidisciplinary scientific geoconference, bulgaria sofia: stef92 technology, pp. 405–412. 2016.
[14] z. hyder, n. s. ripepi, m. e. karmis. a life cycle comparison of greenhouse emissions for power generation from coal mining and underground coal gasification. mitigation and adaptation strategies for global change, springer netherlands 21(4):515–546, 2014. doi:10.1007/s11027-014-9561-8.
[15] k. kostúr, j. kačur. indirect measurement of syngas calorific value. in iccc 2015: 16th international carpathian control conference (iccc), pp. 418–423. ieee, 2015. doi:10.1109/carpathiancc.2015.7145079.
[16] g. perkins, a. saghafi, w. sahajwalla. numerical modelling of underground coal gasification and its application to australian coal seam conditions. school of materials science and engineering, university of new south wales, sydney, australia, 2001.
[17] j. balátě. automatické řízení. 2. přepracované vydání. ben technická literatúra, praha, 2009.
[18] a. gibb. the underground gasification of coal. sir isaac pitman & sons ltd, london, 1964.
[19] r. f. chaiken, j. w. martin. in situ gasification and combustion of coal. sme mining engineering handbook, pp. 1954–1970, 1998.
[20] j. kačur, m. durdán, m. laciak, p. flegner. impact analysis of the oxidant in the process of underground coal gasification. measurement 51:147–155, 2014. doi:10.1016/j.measurement.2014.01.036.
[21] k. kostúr, m. laciak, m. durdán, et al. low-calorific gasification of underground coal with a higher humidity. measurement 63:69–80, 2015. doi:10.1016/j.measurement.2014.12.016.
[22] r. l. dobbs, w. b. krantz. combustion front propagation in underground coal gasification, final report, work performed under grant no. de-fg22-86pc90512. technical report, university of colorado, boulder, department of chemical engineering, 1990.
[23] m. laciak, k. kostúr, m. durdán, et al. the analysis of the underground coal gasification in experimental equipment. energy 114:332–343, 2016. doi:10.1016/j.energy.2016.08.004.
[24] k. kostúr, et al. research report of the project "underground gasification by thermal decomposition no. apvv-0582-06 for year 2008". technical report, technical university of košice, faculty berg and hbp a.s. prievidza, 2008.
[25] k. kostúr, et al. research report of the project "underground gasification by thermal decomposition no. apvv-0582-06 for year 2007". technical report, technical university of košice, faculty berg, 2007.
[26] b. stehlíková, p. flegner. possibilities to compare vibration in drilling rocks. in proceedings of the 2014 15th international carpathian control conference (iccc). ieee, 2014. doi:10.1109/carpathiancc.2014.6843666.
[27] t. sasvári, m. blišťanová, et al. (eds). možnosti získavania energetického plynu z uhoľných ložísk. 1st ed. edičné stredisko tu v košiciach, faculty berg, 2007.
[28] ľ. dorčák, j. terpák, f. dorčáková. teória automatického riadenia spojité lineárne systémy. edičné stredisko ams, faculty berg, tu v košiciach, košice, 2003.
[29] k. åström, t. hagglund. pid controllers: theory, design, and tuning. 2nd ed. instrument society of america, usa, 1995.
[30] v. bobál, j. böhm, r. prokop, j. fessl. praktické aspekty samočinne se nastavujícich regulátorú: algoritmy a implementace, 1. vydání. vysoké učení technické v brně, vutium, brno, 1999.
[31] a. visioli. basics of pid control. in advances in industrial control, pp. 1–18. springer london, 2006. doi:10.1007/1-84628-586-0_1.
[32] c. a. smith, a. b. corripio. principles and practice of automatic process control. john wiley & sons, inc.; 2nd edition, danvers, usa, 2008.
[33] v. bobál, j. böhm, j. fessl, j. macháček. process modelling and identification for use in self-tuning controllers. in digital self-tuning controllers: algorithms, implementation and applications (advanced textbooks in control and signal processing), pp. 21–52. springer-verlag, 2005. doi:10.1007/1-84628-041-9_3.
[34] k. åström, b. wittenmark. computer controlled systems theory and design. 3rd ed. prentice hall, new jersey, usa, 1996.
[35] o. modrlák. teória automatického riadenia ii. úvod do diskrétnej parametrickej identifikácie. tu v liberci, liberec, 2003.
[36] k. kostúr, m. laciak. the development of technology for the underground coal gasification in a laboratory conditions. metalurgija 47(3):263, 2008.
[37] b. hanuš, o. modrlák, m. olehla. číslicová regulace technologických procesů. algoritmy, matematicko-fyzikálni analýza, identifikace, adaptace, 1. vydání. vysoké učení technické v brňe, vutium, brno, 2000.
[38] j. kačur, k. kostúr. analysis of algorithms for control time delay systems. in iccc 2007: 8th international carpathian control conference, pp. 256–259. 2007.
[39] r. isermann. digital control systems. springer-verlag, berlin, heidelberg, 1981. doi:10.1007/978-3-662-02319-8.
[40] o. davidová. porovnanie metód syntézy diskrétnych regulačných obvodov. automa 12:55–60, 2006.
[41] n. b. nichols, j. g. ziegler. optimum settings for automatic controllers. journal of dynamic systems, measurement and control, transactions of the asme 115(2b):220–222, 1993. doi:10.1115/1.2899060.
[42] m. alexík. modification of dead beat algorithm for control processes with time delay. in 16th triennial world congress of international federation of automatic control, ifac 2005, prague, czech republic, vol. 38, pp. 278–283. 2005. doi:10.3182/20050703-6-cz-1902.00617.
[43] k. kostúr, j. kačur. developing of optimal control system for ucg. in proceedings of the 13th international carpathian control conference (iccc). ieee, 2012. doi:10.1109/carpathiancc.2012.6228666.
[44] j. kačur, k. kostúr. the algorithms for control of heating massive material. acta montanistica slovaca 13(1):87–93, 2008. http://actamont.tuke.sk/pdf/2008/n1/12kacur.pdf.
[45] j. kačur, m. durdán, g. bogdanovská. monitoring and measurement of the process variable in ucg. in sgem 2016: 16th international multidisciplinary scientific geoconference, bulgaria sofia: stef92 technology, pp. 295–302. 2016.
[46] m. koenen, f. bergen, p. david. isotope measurements as a proxy for optimising future hydrogen production in underground coal gasification, news in depth, 2015. https://www.tno.nl/media/2624/information20-nid1.pdf.
[47] h. s. wu. the measuring methods of radon and its application. beijing: nuclear energy press, 1995.
[48] m. durdán, k. kostúr. modeling of temperatures by using the algorithm of queue burning movement in the ucg process. acta montanistica slovaca 20(3):181–191, 2015. http://actamont.tuke.sk/pdf/2015/n3/3durdan.pdf.
[49] m. durdán, j. kačur. indirect temperatures measurement in the ucg process. in proceedings of the 14th international carpathian control conference (iccc). ieee, 2013. doi:10.1109/carpathiancc.2013.6560514.
[50] j. m. wu. radon distribution under the mine and the application of radon measuring in the monitoring of the natural fire zone, 1994.
[51] h. s. wu. the transferring effects of radon moving. geophysics journal 14(1):136, 1995.
[52] india sme technology services ltd. 15 variable water requirement with vfd. http://www.techsmall.com/eadmin/knowledgebankfile/ee%20in%20pump%20through%20auto%20control.pdf.
acta polytechnica 56(5):379–387, 2016, doi:10.14311/ap.2016.56.0379
© czech technical university in prague, 2016, available online at http://ojs.cvut.cz/ojs/index.php/ap

optimalization of afterburner channel in biomass boiler using cfd analysis

jiri pospisil∗, martin lisy, michal spilacek
brno university of technology, faculty of mechanical engineering, energy institute, technická 2896/2, 616 69 brno, czech republic
∗ corresponding author: pospisil.j@fme.vutbr.cz

abstract. this contribution presents the results of parametrical studies focused on the mixing process in a small rectangular duct within a biomass boiler. the first study investigates the influence of a local narrowing located in the central part of the duct. this narrowing works as an orifice with a very simple rectangular geometry. four different free cross sections of the orifice were considered in the center of the duct, namely 100 %, 70 %, 50 % and 30 % of the free cross-section area of the duct. the second study is focused on the investigation of the influence of the secondary air distribution pipe diameter on the mixing process in a flue gas duct without a narrowing.

keywords: cfd analysis; flue-gas; mixing; emissions.

1. introduction
an increase in the use of renewable energy sources goes hand in hand with requirements on improvements of the relevant technologies. technologies for energy from biomass very often use poor-quality fuels, that is, fuels with a high water content. improper fuels cause many troubles in the control of the combustion process and require advanced combustion equipment designs. besides the low quality of the fuels, the industry has to face growing requirements on the boiler and combustion technology efficiency, and on the decrease of emitted pollutants. the development of combustion technologies is commonly supported by computational simulations, which stimulate fast development and savings in the development of prototypes [1].
2. conditions of combustion process
combustion is a chemical reaction which provides thermal energy from the energy chemically bound in the combusted fuel: the fuel burns and produces heat and light. the active combustible matter in the fuel (c, h, and s) reacts with atmospheric oxygen (o2). combustion reactions occur under convenient temperatures, and the temperature is the main factor affecting the speed of the process (the higher the temperature, the faster the reaction). combustion is called burning if it is accompanied by a light effect, such as flames [2, 3].

as already stated above, combustion is a reaction of the fuel and the oxygen contained in the combustion air. combustion equations provide simplifications for the so-called stoichiometric amount of oxygen necessary for the reaction. however, this amount of oxygen is never sufficient in reality, and the combustion process takes place with a certain amount of excess air, which increases the probability of the reaction. on the other hand, the amount of excess air cannot rise too much: first, it increases the production of nox, and second, it reduces the temperature in the furnace, which worsens the conditions for the combustion of the fuel.

when describing the conditions necessary for an optimum course of combustion reactions, the so-called 3ts are usually mentioned: time, temperature, and turbulence. the combustible components of the fuel must be in contact with an oxidizer so that the combustion reaction is optimal; the contact must occur under optimum temperature conditions and must last for a specific period of time. the specific parameters are affected by various factors, such as the type of fuel, the furnace design, the pressure in the furnace, etc. optimization of the three parameters, that is, temperature, time, and turbulence, helps to decrease the surplus of combustion air [2, 3] (a simple numerical estimate of the excess air is sketched at the end of this section).

biomass contains a high share of volatile combustible matter (75–80 %) and therefore it is vital to comply with the optimum conditions during the combustion of biomass. volatile combustible matter is released from the fuel at temperatures exceeding 150 °c; an intensive release occurs at 200–250 °c. a further increase in temperature causes the burning of the first components of the fuel. hydrogen and hydrocarbons are ignited at temperatures exceeding 450 °c, and the temperature of the flame steeply rises to 900–1400 °c (depending on the fuel moisture content and the excess combustion air). solid combustible components in charcoal are ignited at temperatures around 600 °c. flame temperatures must be kept above 900 °c and a sufficient amount of oxygen must be supplied in order to achieve a complete burnout of the combustible [4, 5].

figure 1. geometry of the area – mesh of control volumes.

all "3t" parameters were considered and analyzed during the development of the technology discussed below. the original combustion chamber had a rectangular cross section with an inclined grate and no built-in internals. the fuel was supplied from one side above the grate and the flue gas flow left through the opposite outlet branch. primary air was supplied under the grate; secondary air was supplied above the grate.
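as a side note to the excess-air discussion above, the excess-air coefficient is commonly estimated from the o2 concentration measured in dry flue gas; the following one-liner is a standard combustion-engineering approximation, not a formula taken from this paper:

    def excess_air_coefficient(o2_dry_percent):
        """approximate excess-air coefficient (lambda) from the o2
        concentration in dry flue gas, assuming complete combustion:
        lambda ~ 21 / (21 - o2)."""
        return 21.0 / (21.0 - o2_dry_percent)

    # e.g., 6 % o2 in dry flue gas corresponds to lambda of about 1.4
    print(excess_air_coefficient(6.0))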
a major problem with operating the original technology was the mixing of the flue gas flow, containing a significant share of combustibles, with the combustion air flow, that is, an insufficient mixing of the flow under low temperatures in the afterburning and exchanger sections of the chamber. the insufficient mixing resulted in high emissions, especially of co. installing an inclined partition inside the combustion chamber significantly helped to increase the temperature in the chamber and the retention time of the flue gas at high temperature levels (see figure 1). the partition in the chamber further helped to improve the mixing of the air and the released combustible; however, the mixing was still insufficient. it was necessary to improve the mixing of the gaseous mixture, especially the mixing with the supplied secondary air. there are various technical solutions to this problem: the simple solutions include a reduction of the channel cross section by using built-in internals with an orifice; the more complicated ones use the so-called static mixer, a device for a continuous mixing of fluids. a thorough assessment of the available options resulted in opting for the installation of a built-in internal with an orifice. the following procedures helped optimize the shape and location of the orifice so that the mixture of flue gas and secondary air may be sufficiently mixed.

3. computational modelling
computational modelling based on the control volume method was used for a parametrical study testing the impact of a local reduction of the cross section of a flue gas duct on the mixing of the flue gas. a computational model of an afterburner channel in a 200 kw boiler for the combustion of wood was built for the purposes of the study. the geometry of the combustion channel was transposed into a mesh comprising hexagonal volume elements; see figure 1. flue gas flowing in an unobstructed flue gas duct of a rectangular cross section (0.72×0.4 m), with no local reduction, was modelled as the initial basic geometrical configuration. the other geometrical configurations resulted from modifications of the basic configuration of the flue gas duct, that is, by installing a thin partition with one rectangular orifice with a free cross section corresponding to 70 %, 50 %, and 30 % of the free cross section of the original unobstructed duct.

3.1. boundary conditions
inlet boundary conditions. the inlet was identified at a lateral cross section at the bottom part of the boiler (the grate surface) where the flue gas enters the relevant section of the boiler. the uniform inlet velocity of the flue gas was set to 10 m/s, the temperature to 750 °c, and the flue-gas density to 0.24536 kg/m3 (at the given surface). a similar boundary condition was used for the trace gas injected from the boiler wall. the trace gas entered the flow from the wall level at a velocity of 5 m/s and more, in the direction perpendicular to the boiler wall.

figure 2. trace gas concentration close to the flue gas duct walls.
the "pressure" outlet boundary condition. it was set to the outlet surface at the end of the combustion duct in the computational model.

the wall boundary condition. it was set to the other surface areas of the model. the solution considers an adiabatic wall with friction.

turbulence model. the solution used a k–ε turbulence model. pure nitrogen was used as a substitute for the real flue gas in this study. nitrogen at a temperature of 750 °c serves as a non-reactive flow pattern for the subsequent study of the mixing and dispersion processes. the combustion chamber was modelled with adiabatic walls to simulate an isothermal flow behavior. these simplifications enable us to investigate the mixing processes with the exclusion of secondary mixing mechanisms caused by chemical reactions or by a temperature gradient.

4. narrowing the duct
a flow of a gaseous mixture comprising 5 % hydrogen and 95 % nitrogen, the so-called trace gas, was supplied into the flue gas flow from the side walls of the boiler, 0.4 m before the local reduction of the duct (see figure 1 – yellow dots). there were two sources of the trace gas, located at the same horizontal line, in the middle of the duct height. the mixing of the flue gas was assessed using concentration maps of the trace gas, which were acquired in a vertical cut of the combustion duct 0.5 m beyond the reduction. 3d fields of the trace gas concentration in the combustion equipment were calculated in the parametrical study. a similar boundary setting was used for all analyzed geometries. examples of the concentration fields are given in figure 2 (no reduction of the flue gas duct).

the assessment of the distribution of the trace gas concentration in the flue gas was done in the vertical cut led 0.5 m beyond the local reduction of the flue gas duct. the concentration maps of the trace gas from this cut are given in figure 4. the concentration was evaluated by identifying a minimum concentration, a maximum concentration, and an average concentration. the obtained values of the assessed parameters are given in table 1 for all researched configurations; the chart in figure 3 presents an overview of the assessed parameters.

option     free cross section (%)   width (m)   height (m)   minimum (10−2 g/m3)   maximum (10−2 g/m3)   average (10−2 g/m3)
option 1   100                      0.720       0.402        0.025                 6.43                  1.52
option 2   70                       0.602       0.336        0.355                 4.98                  1.72
option 3   50                       0.509       0.284        0.179                 3.60                  1.70
option 4   30                       0.394       0.220        0.123                 2.61                  1.56

table 1. values of trace gas concentrations in the assessed cut (width and height refer to the free cross section in the reduction place).

figure 3. graphical display of maximum, minimum, and average concentration.

the calculated concentration maps (figure 4) clearly show a difference in the trace gas concentration distribution for the particular researched options. option 1 shows two independent flows of the trace gas close to the walls where the gas enters the flue gas flow. the flue gas duct without any reduction does not intensify the mixing of gases in the combustion duct; the mixing is thus a consequence of physical and turbulent diffusion. the highest difference between the maximum and minimum concentration in the assessed cut was obtained for option 1. the local reduction tested in the flue gas duct for options 2–4 shows a different nature of the concentrations, but the decrease in the maximum concentration in the cut is almost linear. this proves that a reduction of the cross section helps mix the flue gas. the minimum concentrations of all researched options are close to zero; this concentration may be found in parts of the flow which were exposed to the trace gas only minimally. the average concentration of the trace gas, with an identical mass of the entering gas, should be identical for all options, which is not true for option 2 (70 % reduction). the deviation from that assumption may be explained by the use of flat averaging, which does not reflect the different mass flows of flue gas in various parts of the cross section.
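the flat-averaging caveat can be made explicit: an area-weighted mean of the trace gas concentration generally differs from a mass-flow-weighted mean whenever the velocity over the cut is non-uniform. the following is a minimal python sketch of both metrics over per-cell data from the evaluation cut; the array names and the availability of per-cell velocity and area data are assumptions:

    import numpy as np

    def mixing_metrics(c, rho, v, a):
        """uniformity metrics over the cells of an evaluation cut.
        c: trace gas concentration, rho: density, v: velocity normal
        to the cut, a: cell face area (all per-cell numpy arrays)."""
        area_avg = np.sum(c * a) / np.sum(a)           # flat (area) average
        m_dot = rho * v * a                            # per-cell mass flow
        flux_avg = np.sum(c * m_dot) / np.sum(m_dot)   # mass-flow-weighted
        return float(np.min(c)), float(np.max(c)), float(area_avg), float(flux_avg)

for option 2, the difference between the area average and the mass-flow-weighted average is exactly the kind of deviation discussed above.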
a quantification of the impact of the cross-section reduction on the mixing of the gases may stem from the linear decrease in the maximum concentration related to smaller free cross sections. the boundary states of that linearization may be option 1 (no reduction in the duct) with a maximum concentration of 100 %, and option 4 (30 % of the free cross section), where the maximum concentration reached 41 % of the maximum concentration of the free duct.

a reduction of the free cross section of the channel results in an increase in pressure losses. for a correct evaluation of the pressure losses, the static pressure fields were analyzed for a constant mass flux rate of the flue gas. the static pressure values necessary for the flow of the flue gas through the entire boiler were obtained from the carried-out parametrical study. the relationship between the static pressure drop and the free cross section of the duct is presented in figure 5. the local reduction of the free cross section does not influence the total retention time of the gaseous species in the combustion chamber: the local cross-section reduction causes a local increase in the flue gas velocity and forms a very complex structure of the velocity field of the flue gas, which results in longer paths of the individual streamlines.

pressure losses and the mixing intensity are directly influenced by the actual value of the kinetic energy of turbulence. an increase in the kinetic energy of turbulence influences the fluid flow like for a more viscous fluid; a more turbulent flow generally results in higher pressure drops. more intensive mixing processes are encouraged by a more complex vortex structure of the flow with a higher kinetic energy of turbulence. the corresponding kinetic energy of turbulence was obtained from the numerical calculations in a position behind the channel reduction, in the position of the evaluating cut. the values of the kinetic energy of turbulence obtained from the carried-out parametrical study are presented in figure 6.

the kinetic energy of turbulence generally increases with an increasing velocity of the flue gas flow. for the studied configuration of the flue gas duct, we can investigate the relation between the velocity of the flue gas in the orifice of the reduction and the kinetic energy of turbulence in the evaluated cut; this relation is expressed by utilizing relative changes in figure 7. figure 7 shows that an increasing flue gas velocity increases the kinetic energy of turbulence behind the reduction orifice. the significant distance of the assessed cut from the reduction orifice causes the less intensive increase in the kinetic energy of turbulence in comparison with the observed flue gas velocity.

5. size of secondary air distribution pipes
the following parametrical study was carried out on a numerical model of a flue gas duct without any reduction. the parametrical modification of the geometry is focused on testing changes of the diameter of the distribution pipe that is used for the supply of the trace gas. four pipe diameters were tested: 50, 100, 150 and 200 mm. the distribution pipe is stacked into the combustion chamber and ends in the wall surface 0.4 m before the baffle formed by the vault (the same position as for the source of the trace gas in the previous chapter). the cool trace gas (15 °c) is supplied into the flow of the hot flue gas via the distribution pipe.
figure 4. concentration of trace gas in the cut of maximum concentrations. left of each subfigure: highest concentrations – section plan view; right: concentration in the cut – view in the direction of flue gas flow. a) option 1 – free cross section 100 %; b) option 2 – free cross section 70 %; c) option 3 – free cross section 50 %; d) option 4 – free cross section 30 %.
figure 5. relationship between the static pressure drop and the free cross section.
figure 6. relationship between the kinetic energy of turbulence and the free cross section.
figure 7. relative changes of kinetic energy of turbulence in the assessed cut and flue gas velocity in the duct orifice.

the mixing of the flows of flue gas and trace gas was analyzed using concentration maps of hydrogen (the trace gas compound) obtained from a vertical cross section of the flue gas duct, performed before the second twist of the flue gas duct; see the yellow cross section in figure 1. 3d concentration fields of the trace gas were obtained from the computation. no active chemical reactions were considered in this parametrical study; hydrogen worked as a trace gas for the assessment of the mixing quality and for its visualization. the same model setting was used in all four geometry options, which differed in the diameter of the distribution pipe for the secondary air.

the distribution of the trace gas in the flue gas was assessed in the vertical cross section before the second twist of the flue gas duct, 0.7 m before the end of the modeled duct. an example of the obtained concentration field of the trace gas in the central cross section of the duct is given in figure 8, for a trace gas distribution pipe diameter of 100 mm and a trace gas velocity of 5.1 m/s.

pipe diameter (mm)   injection velocity (m/s)   minimum (vol%)   maximum (vol%)   difference (vol%)   average (vol%)
200                  5.1                        18               68               50                  36
150                  9.1                        26               61               35                  36
100                  20.4                       22               48               27                  34

table 2. comparison of options for a constant mass flow (secondary air concentration in the assessed cross section).

figure 8. example of trace gas distribution.

the calculated concentration map for the velocity of 5.1 m/s clearly shows the character of the concentration distributions of the trace gas for the particular analyzed distribution pipe diameters. if the trace gas is distributed through a bigger cross section (a bigger diameter), more of the flue gas duct is affected by the trace gas, with a local maximum concentration in the lower part of the flue gas duct. a decrease in the diameter of the trace gas pipe at the same flow velocity of the trace gas shifts the local maximum concentration into the upper part of the assessed flue gas duct. when the smallest diameter of the trace gas pipe was tested, the local maximum concentration of the trace gas was in the close vicinity of the top surface of the flue gas duct. these changes are a result of a change in the flow distribution that is affected by the collision of the flue gas flow and the secondary combustion air flow. this conclusion may be generalized for all analyzed velocities of the entering secondary air. figure 9 shows a comparison of the average concentrations of the trace gas for all four analyzed diameters of the distribution pipes. the dependencies in the results obtained for different inlet velocities of the trace gas were acquired by a detailed quantification of the lowest and highest concentrations of the trace gas in the analyzed cross section.
it is obvious that the difference between the minimum and maximum concentration increases with an increase in the diameter of the distribution pipe of the trace gas; this dependency is displayed in more detail in figure 10. the particular difference ranges from ca. 10–12 % for a distribution pipe diameter of 50 mm (where the velocity has no impact) to a 50 % difference for a diameter of 200 mm and the lowest velocity of 5.1 m/s. the minimum concentration is another important parameter which may be identified: the higher the air velocity, the higher the values of the minimum concentration of the trace gas. the minimum concentration of the trace gas is close to zero for the smallest pipe diameter and for the velocities of 5.1 and 9.1 m/s; this means that there are places in the cross section which are not affected by the flow of the trace gas. the minimum concentration does not drop below 10 % for the highest trace gas velocity of 20.4 m/s.

figure 9. assessment of concentrations of trace gas (secondary air) for an inlet velocity of air of 5.1 m/s.
figure 10. difference of minimum and maximum concentrations of trace gas.

6. conclusions
the carried-out parametrical study aimed to increase the retention time of the flue gas in the gasification chamber and to intensify the mixing of the flue gas flow with the secondary air. using the computational tool star-cd, we designed a system of built-in internals for the regulation of the flue gas flow in the chamber equipped with a turbulator, which significantly helps mix the flue gas flow with the combustion air. the main task was to identify the impact of the reduction of the duct on the quality of the mixing of the flue gas and secondary air flows. the mathematical modelling clearly shows a linear dependency of the flow mixing on the duct reduction, and a decrease in the maximum concentration of the secondary air (substituted by the trace gas) in the duct. in terms of practical applications, the optimum options seem to be between 50 % and 30 % of the cross-section reduction. the solution with the 30 % reduction of the flue gas duct was implemented and tested in practice. the emissions of pollutants significantly decreased and comply with the emission limits stipulated in the most stringent class 5 according to en 303-5. the co emissions were lower than 100 mg/m3n; ogc ranged in the 2–3 mg/m3n interval, and pm reached c. 30–40 mg/m3n.

the second objective of the analyses was to observe the impact of the diameter of a secondary air supply pipe and of the secondary air velocity on the mixing of the secondary air with the flue gas in the combustion chamber without a narrowing; in this study, the secondary air was substituted by the trace gas. a distribution of the same amount of secondary air using more distribution pipes with smaller diameters may be recommended. in general, a higher velocity of the distributed air enhances the mixing processes and results in a more uniform concentration of air in the afterburning duct.

acknowledgements
this work is an output of research and scientific activities of netme centre plus (lo1202) enabled by financial means from the ministry of education, youth and sports under the "national sustainability programme i".

references
[1] yevgeniya, h., reinhard, l., horst, m., aliya, a.: cfd code florean for industrial boilers simulations (2009) wseas transactions on heat and mass transfer, 4 (4), pp. 98-107.
[2] j. horák, p. kubesa: the combustion of solid fuels in local heating (1), cited from: http://energetika.tzb-info.cz/8618-o-spalovani-tuhych-paliv-v-lokalnich-topenistich-1, [2015-03-20].
[3] j. horák, p. kubesa: the combustion of solid fuels in local heating (2), cited from: http://energetika.tzb-info.cz/8644-o-spalovani-tuhych-paliv-v-lokalnich-topenistich-2, [2015-03-20].
[4] z. lyčka: wood pellet ii., combustion in a local boilers, krnov, (2011).

acta polytechnica 60(6):518–527, 2020, doi:10.14311/ap.2020.60.0518
© czech technical university in prague, 2020, available online at https://ojs.cvut.cz/ojs/index.php/ap

heat transfer enhancement and friction in double pipe heat exchanger with various number of longitudinal grooves

putu wijaya sunu a,∗, daud simon anakottapary a, i made suarta a, i dewa made cipta santosa a, ketut suarsana b
a bali state polytechnic, mechanical engineering department, kampus bukit jimbaran, jalan uluwatu 45 kuta selatan, 80361 badung, bali, indonesia
b udayana university, faculty of engineering, mechanical engineering department, kampus sudirman, jalan p.b. sudirman, 80232 denpasar, bali, indonesia
∗ corresponding author: wijayasunu@pnb.ac.id

abstract. it has been found out that heat exchangers with longitudinal grooves produce a better heat transfer than those without longitudinal grooves. however, up to now, there have been few investigations and applications of longitudinal grooves in relation to heat transfer associated with friction in the annulus of a heat exchanger. the present investigation examined the effects of longitudinal grooves in a double pipe heat exchanger on the characteristics of heat transfer and friction. longitudinal rectangular grooves were carved into the outer side of a tube at a specified depth (t) and width (l). the effect of the number of longitudinal grooves and of the reynolds number (re) on the thermal and hydraulic performance was evaluated based on the heat exchanger experimental data. a total of four pipes were used: one pipe with 2 grooves, one pipe with 4 grooves, one pipe with 6 grooves and one pipe with 8 grooves. water, hot and cold, was used as the working fluid. the test was performed with the cold water as the working fluid, with the reynolds number from about 33 000 to 46 000, in a counter-flow scheme. the results showed that the number of grooves improved the heat transfer and caused a pressure drop. the increase in heat transfer ranged from 1.05 to 1.15, and the pressure loss of the system reached almost 30 % as compared with the smooth annulus, the annulus with no groove. the installation of longitudinal grooves in a heat exchanger system enhanced the process of the heat flow through the boundary but provided a compensation for the pressure loss, which was correlated with the friction and pumping power.

keywords: heat transfer, heat exchanger, longitudinal grooves, friction.
a heat exchanger is one of the most significant pieces of industrial equipment and is widely used for handling thermal energy [1, 2]. one widely used type of heat exchanger is the double pipe heat exchanger. such a heat exchanger has a compact structure that can effectively transfer heat. a special interest has been spurred in designing and manufacturing a heat exchanger that is compact in its structure yet efficient in terms of costs, material, and energy. there are some significant performance parameters of a double pipe heat exchanger, such as the temperature of the hot and cold fluid, the flow rate of each working fluid, and the pressure difference [3]. the thermal performance of a double pipe heat exchanger can be improved by a heat transfer enhancement technique. the heat transfer mechanism in a heat exchanger can potentially be improved fundamentally by generating turbulence in the fluid flow, for example by adding a corrugated tube [4, 5], a twisted tape [6, 7], a groove [8, 9], etc. this method is widely known as the passive technique: a technique that extends the heat transfer surface area through a surface modification and surface extension with minor changes in the diameter, without using any additive or energy input, yet is easy to install [10]. grooving, with its thermo-hydraulic performance, is the most promising passive technique. such a technique has been extensively studied in engineering applications. over the last few decades, numerical and experimental investigations have been carried out to examine various types of internally grooved pipes within thermal and fluid science [11–14]. the grooves potentially improve the heat transfer by extending the surface area with a slight change in the pipe diameter. although it offers a considerable enhancement of heat transfer, the internally grooved tube usually causes an increase in friction and pressure drop [15, 16]. the interaction that occurs between the grooves and the fluid flow inside the heat exchanger system is a very important mechanism. many studies have been carried out to examine various parameters of grooves. the details of the flow structure have been shown to be able to control and improve the flow condition [17]. using various parameters, [18] investigated a spirally corrugated tube. moreover, the spirally-grooved tube was found to be very feasible for an application in seawater desalination and caused an enhancement of the heat transfer [19]. in another groove shape [20], a transversely grooved tube with molten salt as the hot fluid flowing through the inner concentric tube, with reynolds numbers ranging from 300 to 60 000, was found to enhance the heat transfer by a factor of 1.6. [21] investigated the flow pattern and heat transfer in a transversally grooved channel for a pulsatile flow. they found that the heat transfer enhancement factor was 2.74 when the oscillatory fraction was 1.4. [22] investigated the implementation of a nano-fluid and found that it greatly enhanced the heat transfer of a helically corrugated tube. [23] explored a roughened tube surface and found a maximum enhancement factor of 2.73. some researchers have examined the performance of rectangular grooves [24–28], and discovered that the rectangular grooves influenced the thermal and hydraulic performance.
[29] studied the micro-grooved surface to examine whether it could enhance flow boiling. grooves were found to generate bubbles increasing the heat transfer coefficient by about 10 % − 15 %. [30] studied the effects of horizontal and vertical surface grooves on hydrodynamic performance of heated and non-heated spheres at room temperature and found that the horizontal grooves had a remarkable effect on the performance. [31] investigated the thermal behaviour of chamfer v-grooves used as a vortex generator to affect the flow in a heat exchanger channel at reynolds number ranging from 5 300 to 23 000. the findings of the present study corroborated those of [31] that a heat exchanger with grooves performed better than that without any grooves. for an application in the renewable and clean energy field, the grooving technology should improve the thermal systems. [32] conducted a study to investigate the thermal performance of a material enriched with square, circular and trapezoidal internal grooves and discovered that the structure could increase the heat transfer rate by 27 %. according to these researches, the application of differently shaped grooves generally leads to an increase in the amount of heat transfer. although several papers have studied the heat transfer in heat exchangers using modified surfaces, the influence of a longitudinal rectangular groove pattern on the heat transfer and pressure reduction has not been comprehensively studied. the axial/longitudinal corrugated surface has been found to increase the heat transfer by 15 − 30 % with a constant pumping power in the laminar flow [33, 34] and turbulent airflow [35]. the axial/longitudinal and radial grooves also bring benefits to the heat pipe, the drag in pipe flow, reduction of aerodynamic noise, and the pressure difference between the projectile head and the concrete wall [36– 41]. [42] examined a longitudinally grooved channel to enhance the diffusive transport on moderate re and discovered the saturated flow that could intensify spanwise motions enhancing the advective transport. there have been few investigations into heat transfer in terms of friction, especially in a longitudinally grooved heat exchanger. the grooved annulus room of a double pipe heat exchanger has received little attention, especially those using a high reynold number (re). however, the features of longitudinal grooves have not been clarified yet, which motivates the present investigation. therefore, a special consideration has been given to increasing the heat transfer with a minimum pressure loss penalty by using longitudinal grooves. the novelty of the findings of the present study lies in the number of longitudinal grooves installed in the annulus used for the intensification of the heat transport at high (re > 33 000) values of the reynolds number. the specific objective of the study was to evaluate the effectiveness of such a technique, identifying the number of the transfer unit (ntu), the enhancement of heat transfer, and friction of heat exchangers constructed using a longitudinal rectangular groove pattern to improve the thermo-hydraulic performance of the heat exchanger. the next section (section 2) presents the experimental procedure, describing the details of the experiment. moreover, well-defined mathematical equations used in this study in the data reduction part are also presented. subsequently, the results and discussion are presented from the experimental data (section 3). 
the last part (section 4) contains the conclusion of the present study. 2. experimental method the outline schema of the experimental test rig is shown in figure 1 below. it mainly consists of three major groups, namely a heat exchanger test section, a hot and cold fluid loop, and an instrumentation/data acquisition monitoring system. figure 2 shows the horizontal orientation of the double pipe heat exchanger test section equipped with longitudinal grooves. to allow changes and modifications of the test section for the experimental design parameters, the test rig was divided into the heat exchanger test section, the flange assembly connections of the test section, and the piping loop system. furthermore, the loop component consisted of hot- and cold-water centrifugal pumps that circulate the fluid within the loop, a full set of rotameters used to adjust and control the flowrate at the desired level, and storage tanks that contained the working fluid circulating within the loop. meanwhile, the data acquisition device digitally recorded the temperature and pressure at the inlet and outlet of the annulus. all of the instrumentation equipment was installed at each important measurement point in the test section of the experimental apparatus. this investigation used the counter-flow schema in a double pipe heat exchanger, so that the hot fluid entry section came in contact with the cold fluid exit section and vice versa. to avoid a heat loss from the heat exchanger to the ambient, all loop and test section surfaces were well insulated by a thermal insulator.
figure 1. the outline of the experimental apparatus.
figure 2. test section.
the temperature of the hot fluid was 50 ± 0.5 °c and that of the cold fluid was 30 ± 0.5 °c. the cold fluid volume flowrate varied from 11 litres per minute (lpm) to 15 litres per minute (lpm), in a direct correlation to reynolds numbers (re) between 33 000 and 46 000, respectively. in addition, the reynolds number of the hot fluid was kept constant at around 30 000. 2.1. specification details of groove the configuration of the geometrical features and the details of the groove are shown in figure 3. the 50 cm long test section was a double pipe heat exchanger placed horizontally. the tube was made from an aluminium tube with a 19.8 mm outer diameter. the shell was made of an acrylic tube with a 27.5 mm inner diameter. the outer surface of the aluminium tube was etched with longitudinal rectangular grooves. the grooves were created using a conventional etching technique, with a groove depth (t) of 0.3 mm and a groove width (l) of 1 mm. the experiment was carried out with 4 pipes: one with 2 longitudinal grooves, one with 4, one with 6 and one with 8. the longitudinal grooves were incised into the pipe walls using conventional techniques. in this investigation, the smooth annulus was used as the control case. 2.2. measurement k-type thermocouples 0.3 mm in diameter were used to measure the temperature at both the entry and the exit points of the working fluid in the test section.
figure 3. groove cross-sectional view of the double pipe heat exchanger.
a data logger was used to digitalize the thermocouples' signal and record it for 600 s.
to measure the pressure of the cold fluid section, mpx 5050 d pressure transducers were used at a frequency of 10 hz. rotameters were used to measure the flowrate of the hot and cold fluid loops. 2.3. data reduction and uncertainty analysis in this study, important parameters were calculated to evaluate the effect of the number of longitudinal grooves on a heat exchanger at different reynolds numbers.
$$q_h = q_c \qquad (1)$$
given that the difference between $q_h$ and $q_c$ was less than 5 %, the actual heat transferred from the hot water to the cold water flowing in the annulus room was calculated from the cold fluid side using the following equations,
$$q = \dot{m}\,c\,\Delta t_c \qquad (2)$$
$$q = \rho\,v\,c\,(t_{c,\mathrm{out}} - t_{c,\mathrm{in}}) \qquad (3)$$
the other indicators of the heat transfer performance were the reynolds number (re), the friction factor (f), the real heat transfer (q), the heat capacity ratio (c), the number of transfer units (ntu), and the effectiveness (ε). the reynolds number (re) was computed using the equation below:
$$re = \frac{u\,d_h}{\nu} \qquad (4)$$
the annulus hydraulic diameter was calculated using the equation below:
$$d_h = d_2 - d_1 \qquad (5)$$
the friction factor (f) inside the annulus was calculated from the value of the pressure drop using the equation below:
$$f = \frac{2\,d_h\,\Delta p}{l\,\rho\,u^2} \qquad (6)$$
the enhancement of the heat transfer process and the heat capacity ratio were estimated using the following equations:
$$e_h = \frac{q_{\mathrm{groove}}}{q_{\mathrm{smooth}}} \qquad (7)$$
$$c = \frac{c_{\min}}{c_{\max}} \qquad (8)$$
the ntu, effectiveness, and $ua$ values were computed using the equations below,
$$ntu = \frac{u\,a}{c_{\min}} \qquad (9)$$
$$\varepsilon = \frac{1 - \exp(-ntu\,(1-c))}{1 - c\,\exp(-ntu\,(1-c))} \qquad (10)$$
$$q = u\,a\,\Delta t_{lm} \qquad (11)$$
2.4. uncertainty analysis the accuracy level of the temperature data acquisition was about 0.4 %. the difference between the pressure inlet and outlet of the annulus, measured by the pressure transducers, had an accuracy level of about 2.5 %. then, a rotameter with an accuracy level of about 5 % was used to measure the working fluid volume flow rate. the uncertainty of these physical and flow parameters was calculated by the method offered by [43],
$$u_m = \pm\left[\left(\frac{\Delta t}{t}\right)^2 + \left(\frac{\Delta p}{p}\right)^2 + \left(\frac{\Delta v}{v}\right)^2\right]^{1/2} \qquad (12)$$
using equation (12), it was found that this experiment's uncertainty was less than 5 %. 3. results and discussion 3.1. validation test of friction of smooth annulus experimental data firstly, the result obtained in this experiment on the characteristics of the pressure drop in the smooth annulus was verified in terms of the friction factor using equation (6) above. to achieve a level of confidence in the experimental procedure, the friction factor value from the smooth annulus was compared to the blasius equation [44],
$$f = 0.448\,re^{-0.275} \qquad (13)$$
figure 4. validation test for friction of smooth pipe.
the flow resistance of the smooth annulus, shown as the friction factor, is presented in figure 4. it was found that the experimental value of the friction factor corresponded with the predicted result from the blasius equation. the absolute deviations obtained for $f$ ranged from 1 % to 6.5 %.
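to illustrate how equations (6) and (13) work together in this validation step, here is a minimal python sketch; the numerical values are illustrative placeholders, not the measured data of this experiment:

```python
def friction_factor(dp, dh, length, rho, u):
    """Experimental friction factor from the pressure drop, eq. (6)."""
    return 2.0 * dh * dp / (length * rho * u**2)

def blasius(re):
    """Smooth-annulus correlation used for validation, eq. (13)."""
    return 0.448 * re**-0.275

# illustrative numbers only (not the measured data of the paper)
dh, length, rho = 0.0077, 0.5, 995.7   # hydraulic diameter [m], test length [m], water density [kg/m^3]
u, dp = 1.5, 1750.0                    # mean velocity [m/s], pressure drop [Pa]
re = 40_000.0

f_exp = friction_factor(dp, dh, length, rho, u)
f_cor = blasius(re)
deviation = abs(f_exp - f_cor) / f_cor * 100.0
print(f"f_exp = {f_exp:.4f}, f_blasius = {f_cor:.4f}, deviation = {deviation:.1f} %")
```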
3.2. the effect of the number of longitudinal grooves on the heat transfer figure 5 shows the value of the heat transfer in the annulus for various cold fluid reynolds numbers. as presented, the heat transfer rate increased as the reynolds number of the cold water increased. this phenomenon is caused by the heat transfer process through the tube surface, which depends on the heat capacity of the cold fluid. as figure 5 shows, the heat transfer of the annulus with 8 grooves was higher than that of the annuli with 2, 4 and 6 grooves. it can also be seen that the heat transfer that occurred in the latter three types of annuli was higher than that occurring in the annulus without any grooves. the highest increase of heat transfer occurred at the highest re, and the observed heat transfer enhancement in the annulus with 8 grooves was a factor of 1.05–1.15. due to the resulting turbulence intensity, the intensity of the heat transferred and absorbed by the cold fluid in the annulus with 8 grooves was higher than in the case of the other annuli.
figure 5. real heat transfer at various re for different numbers of grooves.
figure 6. overall heat transfer coefficient at various reynolds numbers.
figure 6 shows the pattern of the relationship between the overall heat transfer coefficient and the reynolds number of the cold fluid. it is evident that the trend of the u value was similar to the pattern of the heat transfer trend in figure 5. in this case, the same justification described for figure 5 can be made for figure 6. the turbulence intensity, the recirculation region and the fluid momentum are the phenomena that disrupt the thermal boundary layer, so that the resistance to heat transfer becomes smaller. it was also found that the increase in the real heat transfer was accompanied by a similar increase in the cold-water mass flow rate, indicating an overall increase in heat transfer. looking at figure 7, which describes the correlation between the number of transfer units (ntu) and the effectiveness, it can be seen that the number of grooves affected the ntu and effectiveness. the increase in the heat transfer and the u value was the product of increasing the number of grooves, from the smooth pipe to 2, 4, 6 and 8 grooves. 3.3. comparison of the effectiveness and ntu obtained from the smooth pipe and from the grooved pipes figure 7 shows the relationship between the ntu and the effectiveness obtained from the smooth pipe and the longitudinally-grooved pipes. numbers 1 through 5 denote the heat capacity ratio at a specified reynolds number in this experiment. it is obvious from figure 7 that, at a specified heat capacity ratio, the correlation point between the ntu and effectiveness from the grooved pipe was located more towards the right side as compared with that obtained from the smooth annulus. points lying further to the right and higher up in figure 7 indicate an increase in the effectiveness and ntu. the actual increase in the heat transfer was indicated by the increase in the effectiveness value. the increase in the ntu value represents the increase in the (ua) value of the annulus system. looking at figure 7, it is clear that the correlation between the effectiveness and ntu of the grooved annulus, indicated by the dot and dash line, was similar to the result obtained by [45]. increasing the number of longitudinal grooves in the annulus led to an increase in the ntu and effectiveness. figure 7 also shows that the annulus having eight grooves had the highest ntu and effectiveness. the swirl flow generated in the groove valley was increased by increasing the number of grooves. this secondary flow was responsible for weakening the thermal boundary layer.
figure 7. relationship of ntu and effectiveness for different numbers of grooves.
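the ntu–effectiveness chain used in this comparison follows equations (8)–(10); below is a minimal python sketch of that chain with assumed heat capacity rates and an assumed ua value (illustrative inputs only, not the experimental data):

```python
import math

def effectiveness(ntu, c):
    """Counter-flow effectiveness, eq. (10)."""
    if abs(1.0 - c) < 1e-12:           # limiting case c = 1
        return ntu / (1.0 + ntu)
    e = math.exp(-ntu * (1.0 - c))
    return (1.0 - e) / (1.0 - c * e)

# assumed heat capacity rates [W/degC] for illustration only
c_cold, c_hot = 900.0, 1100.0
c_min, c_max = min(c_cold, c_hot), max(c_cold, c_hot)
c = c_min / c_max                      # heat capacity ratio, eq. (8)

ua = 700.0                             # assumed UA of the annulus [W/degC]
ntu = ua / c_min                       # eq. (9)
print(f"c = {c:.2f}, NTU = {ntu:.2f}, effectiveness = {effectiveness(ntu, c):.3f}")
```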
figure 8. friction factor and reynolds number correlation for various numbers of grooves.
3.4. comparison of the friction factor of the smooth annulus and the grooved annuli figure 8 presents the assessment of the friction produced in the smooth annulus and in the annuli with different numbers of grooves. the friction factor slightly decreased with an increase of the reynolds number of the cold water in the smooth annulus, and a similar decreasing trend also occurred in the grooved annuli. the flow in the grooved annuli was more complex than that occurring in the smooth annulus. the friction factor was generated by swirls and augmented turbulence on the surface of the annulus and in the groove valley area. this phenomenon increased the velocity gradient inside the groove, the shear stress on the surface of the tube, and the recirculation region. moreover, it also increased the pressure drop and the friction factor of the grooved annulus. as expected, the friction obtained in all cases of the grooved annuli was significantly higher than that obtained in the annulus with no grooves. the value of the friction factor was affected by the formation and interaction of large-scale and small-scale fluid motion. the increase in the energetic small-scale fluid motion increased the velocity gradient and the shear stress at the tube surface and inside the groove valley. it is well identified that the pressure drop is proportional to the square of the velocity. figure 8 also reveals that the annulus having four grooves had the highest friction factor as compared to the others. meanwhile, the annulus with eight grooves had a moderate friction and the lowest thermal resistance, as indicated by the value of the heat transfer inside the heat exchanger shown in figures 5, 6, and 7.
figure 9. heat transfer enhancement, friction factor enhancement and reynolds number correlation for various numbers of grooves.
3.5. compensation of heat transfer and friction let us now turn to the heat transfer performance and the compensation for the friction occurring in the grooved annulus tubes compared with the annulus with no grooves. figure 9 shows the compensation for the heat transfer and friction. the analysis, conducted under the constraint of an identical re, showed that the heat transfer performance of the annulus with eight grooves increased by up to 15 % and was much better than that of the annulus without any grooves and of the others with fewer grooves. meanwhile, the friction of the annulus with four grooves increased by up to 27 % and was much higher than that of the annulus without any grooves and of the others. for a favourable compensation, the point of the heat transfer enhancement should lie above the point of the friction enhancement. the increase in the gap between these two parameters should constitute the indicator of increasing overall grooved system performance. in the present study, the annulus with eight grooves was the best candidate for the compensation between the heat transfer and friction.
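the trade-off assessment of figure 9 amounts to comparing the heat transfer enhancement of equation (7) with the analogous friction factor ratio $e_f = f_{\mathrm{groove}}/f_{\mathrm{smooth}}$; a short python sketch with illustrative numbers chosen only to match the reported orders of magnitude (not the raw data):

```python
def enhancement(grooved, smooth):
    """Generic enhancement ratio, cf. eq. (7): grooved value over smooth value."""
    return grooved / smooth

# illustrative values in the spirit of figure 9 (not the measured data)
f_smooth, q_smooth = 0.0244, 5.0e3
cases = {
    "4 grooves": {"q": 5.40e3, "f": 0.0310},
    "8 grooves": {"q": 5.75e3, "f": 0.0283},
}

for name, d in cases.items():
    eh = enhancement(d["q"], q_smooth)   # heat transfer enhancement e_h
    ef = enhancement(d["f"], f_smooth)   # friction factor enhancement e_f
    # the smaller the penalty gap ef - eh, the better the compensation
    print(f"{name}: eh = {eh:.2f}, ef = {ef:.2f}, gap = {ef - eh:+.2f}")
```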
4. conclusions this study examined how a heat transfer enhancement and friction occurred in a double pipe heat exchanger with longitudinal grooves. the effects of the number of longitudinal grooves in an annulus and of the cold-water flow rate on the heat transfer and the friction factor were measured. the swirl flow and turbulence augmentation occurring near the annulus' surface and inside the groove valley caused the convective heat transfer within the grooved annulus to be higher than that of the smooth annulus. the study found that the annulus with eight grooves produced good results (compared with the smooth annulus), with a better compensation between the heat transfer, which increased by about 15 %, and the friction, which increased by about 16 %. this study showed that incising a longitudinal-grooved structure on an annulus of a heat exchanger influenced the flow on the pipe surface, leading to an increase in the heat transfer enhancement as well as in the friction factor. acknowledgements the authors would like to express their gratitude to the drpm ristek dikti – indonesian government, grant no. 833/pl8/lt/2019, for providing financial support, and to the bali state polytechnic for all the administrative support. list of symbols
a surface area [m²]
c heat capacity ratio
c_min the smallest heat capacity rate [w/°c]
c_max the highest heat capacity rate [w/°c]
d_h hydraulic diameter [m]
d1 outer diameter of the tube side [m]
d2 inner diameter of the shell side [m]
ε effectiveness
e_h heat transfer enhancement
e_f friction enhancement
f friction factor
l pipe length [m]
ṁ mass flow rate [kg/s]
ntu number of transfer units
p pressure [pa]
q heat transfer [j]
re reynolds number
t temperature [°c]
u velocity [m/s]
u overall heat transfer coefficient [w/m²°c]
u_m uncertainty
ν kinematic viscosity [m²/s]
v volumetric flow rate [m³/s]
∆p pressure drop [pa]
∆t fluid temperature difference in a time period [°c s]
∆v volumetric flow rate difference [m³/s]
ρ density [kg/m³]
subscripts: c cold fluid; h hot fluid; in inlet; lm log-mean temperature difference (lmtd); out outlet
references [1] y. cao, h. ke, y. lin, et al. investigation on the flow noise propagation mechanism in pipelines of shell-and-tube heat exchangers based on synergy principle of flow and sound fields. applied thermal engineering 122:339 – 349, 2017. doi:10.1016/j.applthermaleng.2017.04.057. [2] h. akhavan-zanjani, m. saffar-avval, m. mansourkiaei, et al. experimental investigation of laminar forced convective heat transfer of graphene–water nanofluid inside a circular tube. international journal of thermal sciences 100:316 – 323, 2016. doi:10.1016/j.ijthermalsci.2015.10.003. [3] j. opatřil, j. havlík, o. bartoš, t. dlouhý. an experimental assessment of the plate heat exchanger characteristics by wilson plot method. acta polytechnica 56(5):367 – 372, 2016. doi:10.14311/ap.2016.56.0367. [4] d. ndiaye. transient model of a refrigerant-to-water helically coiled tube-in-tube heat exchanger with corrugated inner tube. applied thermal engineering 112:413 – 423, 2017. doi:10.1016/j.applthermaleng.2016.10.045. [5] l. yang, h. han, y. li, x. li. a numerical study of the flow and heat transfer characteristics of outward convex corrugated tubes with twisted-tape insert. journal of heat transfer 138(2):024501, 2016. [6] y. hong, j. du, s. wang. experimental heat transfer and flow characteristics in a spiral grooved tube with overlapped large/small twin twisted tapes. international journal of heat and mass transfer 106:1178 – 1190, 2016. doi:10.1016/j.ijheatmasstransfer.2016.10.098. [7] n. piriyarungrod, m. kumar, c. thianpong, et al.
intensification of thermo-hydraulic performance in heat exchanger tube inserted with multiple twisted-tapes. applied thermal engineering 136:516 – 530, 2018. doi:10.1016/j.applthermaleng.2018.02.097. [8] s. eiamsa-ard, p. promvonge. thermal characteristics of turbulent rib-grooved channel flows. international communications in heat and mass transfer 36(7):705 – 711, 2009. doi:10.1016/j.icheatmasstransfer.2009.03.025. [9] p. sunu, m. rasta. heat transfer enhancement and pressure drop of grooved annulus of double pipe heat exchanger. acta polytechnica 57(2):125 – 130, 2017. doi:10.14311/ap.2017.57.0125. [10] p. w. sunu, i. n. g. wardana, a. a. sonief, n. hamidi. the effect of wall groove numbers on pressure drop in pipe flows. international journal of fluid mechanics research 42(2):119 – 130, 2015. doi:10.1615/interjfluidmechres.v42.i2.30. [11] k. aroonrat, c. jumpholkul, r. leelaprachakul, et al. heat transfer and single-phase flow in internally grooved tubes. international communications in heat and mass transfer 42:62 – 68, 2013. doi:10.1016/j.icheatmasstransfer.2012.12.001. [12] s. huang. viv suppression of a two-degree-offreedom circular cylinder and drag reduction of a fixed circular cylinder by the use of helical grooves. journal of fluids and structures 27(7):1124 – 1133, 2011. doi:10.1016/j.jfluidstructs.2011.07.005. [13] s. eiamsa-ard, p. promvonge. numerical study on heat transfer of turbulent channel flow over periodic grooves. international communications in heat and mass transfer 35(7):844 – 852, 2008. doi:10.1016/j.icheatmasstransfer.2008.03.008. [14] p. sunu, i. wardana, a. sonief, n. hamidi. flow behavior and friction factor in internally grooved pipe wall. advanced studies in theoretical physics 8(14):643 – 647, 2014. doi:10.12988/astp.2014.4573. [15] p. sunu. the characteristics of increased pressure drop in pipes with grooved. advanced studies in theoretical physics 9(2):57 – 61, 2015. doi:10.12988/astp.2015.412152. [16] t. adachi, y. tashiro, h. arima, y. ikegami. pressure drop characteristics of flow in a symmetric channel with periodically expanded grooves. chemical engineering science 64(3):593 – 597, 2009. doi:10.1016/j.ces.2008.10.041. [17] sutardi, c. ching. effect of a transverse square groove on a turbulent boundary layer. experimental thermal and fluid science 20(1):1 – 10, 1999. doi:10.1016/s0894-1777(99)00031-x. [18] s. rainieri, g. pagliarini (22):4525 – 4536, 2002. [19] c. ua qi, x. han, h. qing lv, et al. experimental study of heat transfer and scale formation of spiral grooved tube in the falling film distilled desalination. international journal of heat and mass transfer 119:654 – 664, 2018. doi:10.1016/j.ijheatmasstransfer.2017.11.148. [20] y. chen, j. tian, y. fu, et al. experimental study of heat transfer enhancement for molten salt with transversely grooved tube heat exchanger in laminar-transition-turbulent regimes. applied thermal engineering 132:95 – 101, 2018. doi:10.1016/j.applthermaleng.2017.12.054. [21] j. pan, y. bian, y. liu, et al. characteristics of flow behavior and heat transfer in the grooved channel for pulsatile flow with a reverse flow. international journal of heat and mass transfer 147:118932, 2020. doi:10.1016/j.ijheatmasstransfer.2019.118932. [22] a. a. r. darzi, m. farhadi, k. sedighi. experimental investigation of convective heat transfer and friction factor of al2o3/water nanofluid in helically corrugated tube. experimental thermal and fluid science 57:188 – 199, 2014. doi:10.1016/j.expthermflusci.2014.04.024. 
[23] p. kathait, a. patil. thermo-hydraulic performance of a heat exchanger tube with discrete corrugations. applied thermal engineering 66(1 – 2):162 – 170, 2014. doi:10.1016/j.applthermaleng.2014.01.069. [24] s. lorenz, d. mukomilow, w. leiner. distribution of the heat transfer coefficient in a channel with periodic transverse grooves. experimental thermal and fluid science 11(3):234 – 242, 1995. doi:10.1016/0894-1777(95)00055-q. [25] t. adachi, h. uehara. correlation between heat transfer and pressure drop in channels with periodically grooved parts. international journal of heat and mass transfer 44(22):4333 – 4343, 2001. doi:10.1016/s0017-9310(01)00070-9. [26] m. jain, a. rao, k. nandakumar. numerical study on shape optimization of groove micromixers. microfluidics and nanofluidics 15(5):689 – 699, 2013. doi:10.1007/s10404-013-1169-x. [27] c. wang, z. liu, g. zhang, m. zhang. experimental investigations of flat plate heat pipes with interlaced narrow grooves or channels as capillary structure. experimental thermal and fluid science 48:222 – 229, 2013. doi:10.1016/j.expthermflusci.2013.03.004. [28] p. w. sunu, d. s. anakottapary, w. g. santika. temperature approach optimization in the double pipe heat exchanger with groove. in the 3rd bali international seminar on science & technology (bisstech 2015), vol. 58, p. 04006. 2016. doi:10.1051/matecconf/20165804006. [29] m. c. vlachou, c. efstathiou, a. antoniadis, t. d. karapantsios. micro-grooved surfaces to enhance flow boiling in a macro-channel. experimental thermal and fluid science 108:61 – 74, 2019. doi:10.1016/j.expthermflusci.2019.05.015. [30] a. mehri, p. akbarzadeh. hydrodynamic characteristics of heated/non-heated and grooved/un-grooved spheres during free-surface water entry. journal of fluids and structures 97:103100, 2020. doi:10.1016/j.jfluidstructs.2020.103100. [31] p. promvonge, p. tongyote, s. skullong. thermal behaviors in heat exchanger channel with v-shaped ribs and grooves. chemical engineering research and design 150:263 – 273, 2019. doi:10.1016/j.cherd.2019.07.025. [32] r. naveenkumar, n. karthikeyan, s. gopan, et al. analysis of heat transfer in grooved plain carbon steel tube for solar applications. in international conference on nanotechnology: ideas, innovation and industries, materials today: proceedings, part 7, vol. 33, pp. 4219 – 4223. 2020. doi:10.1016/j.matpr.2020.07.234.
[33] s. k. saha. thermohydraulics of laminar flow of viscous oil through a circular tube having axial corrugations and fitted with centre-cleared twisted-tape. experimental thermal and fluid science 38:201 – 209, 2012. doi:10.1016/j.expthermflusci.2011.12.008. [34] s. k. saha, b. swain, g. l. dayanidhi. friction and thermal characteristics of laminar flow of viscous oil through a circular tube having axial corrugations and fitted with helical screw-tape inserts. journal of fluids engineering, transactions of the asme 134(5):051210, 2012. doi:10.1115/1.4006669. [35] s. k. saha. thermohydraulics of turbulent flow through rectangular and square ducts with axial corrugation roughness and twisted-tapes with and without oblique teeth. experimental thermal and fluid science 34(6):744 – 752, 2010. doi:10.1016/j.expthermflusci.2010.01.003. [36] a. r. anand. analytical and experimental investigations on heat transport capability of axially grooved aluminium-methane heat pipe. international journal of thermal sciences 139:269 – 281, 2019. doi:10.1016/j.ijthermalsci.2019.01.028. [37] a. r. anand. investigations on effect of evaporator length on heat transport of axially grooved ammonia heat pipe. applied thermal engineering 150:1233 – 1242, 2019. doi:10.1016/j.applthermaleng.2019.01.078. [38] a. bahmanabadi, m. faegh, m. b. shafii. experimental examination of utilizing novel radially grooved surfaces in the evaporator of a thermosyphon heat pipe. applied thermal engineering 169:114975, 2020. doi:10.1016/j.applthermaleng.2020.114975. [39] b. zhou, x. wang, w. guo, et al. experimental measurements of the drag force and the near-wake flow patterns of a longitudinally grooved cylinder. journal of wind engineering and industrial aerodynamics 145:30 – 41, 2015. doi:10.1016/j.jweia.2015.05.013. [40] n. fujisawa, k. hirabayashi, t. yamagata. aerodynamic noise reduction of circular cylinder by longitudinal grooves. journal of wind engineering and industrial aerodynamics 199:104129, 2020. doi:10.1016/j.jweia.2020.104129. [41] j. han, y. zhang, w. wang, et al. effect of grooves on the double-nosed projectile penetrating into plain concrete target. international journal of impact engineering 140:103544, 2020. doi:10.1016/j.ijimpeng.2020.103544. [42] s. w. gepner, n. yadav, j. szumbarski. secondary flows in a longitudinally grooved channel and enhancement of diffusive transport. international journal of heat and mass transfer 153:119523, 2020. doi:10.1016/j.ijheatmasstransfer.2020.119523. [43] r. j. moffat. describing the uncertainties in experimental results. experimental thermal and fluid science 1(1):3 – 17, 1988. doi:10.1016/0894-1777(88)90043-x. [44] m. n. ozisik. heat transfer: a basic approach. mcgraw-hill international editions, new york, 1985. [45] w. m. kays, a. l. london. compact heat exchanger. mcgraw-hill international editions, new york, 3rd edn., 1984. 
acta polytechnica vol. 44 no. 5 – 6/2004
modelling of size effect with regularised continua
h. askes, a. simone, l. j. sluys
abstract: a nonlocal damage continuum and a viscoplastic damage continuum are used to model size effects. three-point bending specimens are analysed, whereby a distinction is made between unnotched specimens, specimens with a constant notch and specimens with a proportionally scaled notch. numerical finite element simulations have been performed for specimen sizes in a range of 1:64. size effects are established in terms of nominal strength and compared to existing size effect models from the literature.
keywords: size effect, modelling, statistic and deterministic size effect, stress fluctuation.
1 introduction when the dimensions of a structure are scaled proportionally (i.e. keeping the ratios between the relevant dimensions fixed), the mechanical properties of the various samples differ. this phenomenon is known as size effect, see for instance [1, 2, 3, 4] and references therein. being able to model size effects is relevant since the size range of specimens that can be tested experimentally is limited, and extrapolations towards very large specimen sizes must be made theoretically. it has sometimes been claimed that a proper modelling of size effects includes extensions towards the very large size range as well as the very small size range. however, in the very small size range the mesoscopic or microscopic constituents start to play a dominant role [5], and it must be questioned whether the traditional assumptions of continuum mechanics still hold. size effects have different origins. originally, it was thought that merely a statistical size effect exists, which can be understood from the fact that larger specimens have a bigger chance of containing a weakest link. more recently it has been recognised that also a deterministic size effect (also known as an energetic size effect) exists, which has been exemplified in tests where theoretically predicted stress gradients are dominant over local stress fluctuations (e.g. bending tests). furthermore, different size effect trends are observed for unnotched specimens, on the one hand, and notched specimens where the notch scales proportionally, on the other. in the past decades, many formulas have been proposed in order to capture size effects, most notably the multi-fractal scaling law (mfsl) by carpinteri and co-workers and the size effect law (sel) by bažant.
these proposed models more or less coincide in the size range where experimental data are available; however, for the small-size limit and the large-size limit important differences may be present. to contribute additional arguments in this discussion, an attempt is made here to interpret size effects purely from a material modelling point of view. size effects occur when characteristic lengths at the structural level and at the material level interact – increasing the specimen size while using the same material implies that the material length scale remains constant. classical continua in an elasticity, plasticity or damage context do not contain a material length scale. as a result, they are unable to describe size effects. in contrast, enhanced continua may be used in which an intrinsic length scale is incorporated as an additional material parameter. with these so-called nonlocal continuum descriptions, size effects can be described, as has been shown for instance in [6, 7, 2, 8, 9, 10]. in [11, 12] the occurrence of size effects has been used to estimate the internal length scale, thus further establishing the relation that exists between the presence of an internal length scale and the occurrence of size effects. finally, in [13] a nonlocal damage continuum is used in conjunction with stochastic finite elements in order to capture deterministic and statistic size effects simultaneously. in this paper, we will employ a nonlocal damage continuum of the differential format proposed in [14] and a viscoplastic damage continuum as described in [15]. three-point bending specimens are analysed numerically by means of the finite element method, and a distinction is made between three types of beams: unnotched beams, notched beams where the notch dimensions do not scale with the specimen size (referred to as "constant notch") and notched beams where the notch dimensions scale proportionally with the other dimensions of the specimen (referred to as "proportional notch"). whereas the cases of unnotched beams and beams with a proportional notch have been covered extensively in the literature and used as validation for the above-mentioned size effect models, beams with a constant notch may be more relevant for engineering practice: when notches are supposed to represent defects in the material, the notch dimensions are set by the material length scale and not by the structural length scale. 2 constitutive relationships common to the two classes of regularised constitutive models considered in this contribution is the use of the damage framework to represent void development. the stress-strain relation is written as
$$\sigma = (1 - \omega)\,C\,\varepsilon \qquad (1)$$
where $\sigma$ and $\varepsilon$ contain the components of stresses and strains, $C$ contains the elastic moduli and $\omega$ is a scalar damage variable. to avoid mesh dependence, damage evolution must be postulated as some function of a regularised, monotonically increasing deformation history invariant $\kappa$. for the gradient-enhanced continuum damage model, damage evolution is made a function of the nonlocal equivalent strain, while in the rate-dependent elastoplastic damage model, damage is made a function of the equivalent viscoplastic strain. the two frameworks differ significantly in the nature of the regularisation involved (temporal regularisation versus spatial regularisation) and in the dissipation mechanism. it is noted that in the rate-dependent elastoplastic damage model, damage is plastically-induced.
in the gradient-enhanced continuum damage model, an equivalent strain is defined according to the modified von mises criterion as
$$\varepsilon_{eq} = \frac{k-1}{2k(1-2\nu)}\,I_1 + \frac{1}{2k}\sqrt{\left(\frac{k-1}{1-2\nu}\right)^2 I_1^2 + \frac{12k}{(1+\nu)^2}\,J_2} \qquad (2)$$
where $I_1$ and $J_2$ are the first and second invariant of the strain tensor and the deviatoric strain tensor, respectively, and $k$ denotes the compressive-to-tensile strength ratio. the local equivalent strain $\varepsilon_{eq}$ defined in equation (2) is translated into its nonlocal counterpart $\bar\varepsilon_{eq}$ via a helmholtz-type relation as
$$\bar\varepsilon_{eq} - \tfrac{1}{2}\,l^2\,\nabla^2\bar\varepsilon_{eq} = \varepsilon_{eq} \qquad (3)$$
where $l$ is a material length parameter that represents the underlying microstructure. equation (3) is augmented by neumann boundary conditions throughout and solved simultaneously with the equilibrium equations that involve equation (1), thus leading to a coupled system of equations, see for instance [14]. as usual, a history variable $\kappa$ is introduced as
$$\kappa = \max(\varepsilon_i, \bar\varepsilon_{eq}) \qquad (4)$$
where $\varepsilon_i$ represents the crack initiation strain. upon loading, damage grows according to
$$\omega = 1 - \frac{\varepsilon_i}{\kappa}\exp\big(-\beta(\kappa - \varepsilon_i)\big) \quad \text{if } \kappa \ge \varepsilon_i \qquad (5)$$
in which $\beta$ is a material parameter that sets the slope of the stress-strain relation in the softening regime. the differential format of nonlocal damage as described above can equally be written in an integral form, whereby the weight function takes the format of a green's function [16]. indeed, the differences between the differential format and the integral format of nonlocal damage are merely quantitative [17, 18, 16]. in the rate-dependent elastoplastic damage model, damage evolution is postulated as [15]
$$\omega = \alpha\big(1 - \exp(-\beta\kappa)\big) \qquad (6)$$
with $\alpha$ and $\beta$ model parameters and $\kappa_i$ the threshold of damage initiation. the viscoplastic strain rate is expressed, in presence of plastic flow ($f > 0$, where $f$ is the yield function), in the associative form [19]
$$\dot{\varepsilon}^{vp} = \frac{1}{\eta}\,\phi(f)\,\frac{\partial f}{\partial \sigma} \qquad (7)$$
where the overstress function is given the following power-law form:
$$\phi(f) = \left(\frac{f}{\bar\sigma_0}\right)^{N} \qquad (8)$$
with $\bar\sigma_0$ the initial yield stress and $N$ ($N \ge 1$) a real number. the yield stress has been given an exponential form according to
$$\bar\sigma(\kappa) = \bar\sigma_0\big((1+a)\exp(-b\kappa) - a\exp(-2b\kappa)\big) \qquad (9)$$
with $a$ and $b$ model parameters and $\bar\sigma_0$ the initial cohesion (or yield stress). the rate-dependent isotropic elastoplastic damage model is discussed in detail in [15], where the algorithmic treatment is presented and the regularisation properties are illustrated. details of the implicit gradient-enhanced continuum damage model are discussed in [14].
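to make the local part of the gradient-damage model concrete, the following python sketch implements equations (2), (4) and (5) with the parameter values quoted in section 3 below; the helmholtz smoothing of equation (3) is omitted here because it couples all material points and requires a discretised solver. this is an illustrative sketch, not the authors' implementation:

```python
import math

NU, K = 0.15, 10.0           # Poisson ratio and strength ratio k (values of section 3)
EPS_I, BETA = 1.0e-4, 500.0  # crack initiation strain and softening slope

def eq_strain(i1, j2):
    """Modified von Mises equivalent strain, eq. (2)."""
    a = (K - 1.0) / (2.0 * K * (1.0 - 2.0 * NU)) * i1
    b = ((K - 1.0) / (1.0 - 2.0 * NU))**2 * i1**2 + 12.0 * K / (1.0 + NU)**2 * j2
    return a + math.sqrt(b) / (2.0 * K)

def update_history(kappa, eps_eq_nl):
    """History variable, eq. (4): initialised at EPS_I, kept monotonically increasing."""
    return max(kappa, eps_eq_nl)

def damage(kappa):
    """Exponential softening law, eq. (5); no damage below the initiation strain."""
    if kappa < EPS_I:
        return 0.0
    return 1.0 - EPS_I / kappa * math.exp(-BETA * (kappa - EPS_I))
```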
3 description of numerical simulations for the size effect analysis, three-point bending tests are simulated. fig. 1 shows the three sets of geometries that are studied, namely:
• unnotched beams: the dimensions of the beam are simply taken as span × height = 4d × d and no notch is present.
• beams with a constant notch: the outer dimensions of the beam are the same as for the unnotched case, but a wedge-shaped notch is assumed with dimensions base × height = 0.5 mm × 0.5 mm. the dimensions of the notch do not scale with the dimensions of the beam.
• beams with a proportional notch: again, the outer dimensions of the beam are taken similar to the unnotched case. wedge-shaped notches are assumed that scale proportionally with the beam dimensions: for the notch we take base × height = 0.25d × 0.25d.
fig. 1: geometries of three-point bending specimens: unnotched beam (top), beam with constant notch (middle) and beam with proportional notch (bottom)
for all three cases we study size ranges of d = 1 mm up to d = 64 mm. the constant notch case and the proportional notch case coincide for d = 2 mm. the elastic constants used in the nonlocal damage continuum are $E = 30\,000$ MPa and $\nu = 0.15$, while plane strain conditions are assumed. for the damage evolution we use $\varepsilon_i = 0.0001$ and $\beta = 500$; the compressive-to-tensile strength ratio is taken as $k = 10$. the material length scale in all analyses equals $l = 1$ mm. this implies that for the smallest specimens ($d = 1$ mm) the structural dimensions and the material length are of the same order of magnitude, whereas for the largest specimens ($d = 64$ mm) the material length scale is negligibly small compared to the structural dimensions. in the viscoplastic damage continuum we have used the same elastic constants as in the nonlocal damage continuum and a plane stress smoothed rankine yield function with $\bar\sigma_0 = f_t = 2$ MPa and with softening parameters $a = -1$, $b = 200$, $\alpha = 0.999$ and $\beta = 5000$. the exponent of the overstress function is $N = 1$ and the relaxation time is $\eta = 10$ s. for the numerical analyses we have used finite element meshes consisting of three-node triangles with linear shape functions for the displacements as well as for the nonlocal equivalent strain [20]. the mesh density is determined by two factors: in the central section of the beams, a sufficiently fine mesh must be used to capture damage initiation and damage propagation accurately; moreover, a sufficient number of elements over the beam height must be used to describe the bending behaviour without locking effects. representative element sizes of 0.25 mm have been employed in the centre of the beams, while it has been ensured that at least 5 elements over the beam height are present.
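similarly, the viscoplastic ingredients of equations (7)–(9), with the parameter values just listed, can be sketched in one dimension; the scalar yield function below is a one-dimensional stand-in for the smoothed rankine criterion actually used, so this is only an illustration:

```python
import math

SIGMA_0 = 2.0            # initial yield stress = f_t [MPa]
A, B = -1.0, 200.0       # softening parameters of eq. (9)
N, ETA = 1.0, 10.0       # overstress exponent and relaxation time [s]

def yield_stress(kappa):
    """Exponential softening of the yield stress, eq. (9)."""
    return SIGMA_0 * ((1.0 + A) * math.exp(-B * kappa) - A * math.exp(-2.0 * B * kappa))

def vp_strain_rate(sigma, kappa):
    """1D Perzyna flow rule, eqs. (7)-(8); in 1D, df/dsigma = 1."""
    f = sigma - yield_stress(kappa)      # 1D stand-in for the smoothed Rankine criterion
    if f <= 0.0:
        return 0.0                       # no flow without positive overstress
    return (f / SIGMA_0)**N / ETA        # viscoplastic strain rate

# example: stress held 10 % above the (softened) yield stress
kappa = 0.001
print(vp_strain_rate(1.1 * yield_stress(kappa), kappa))
```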
4 results size effects are investigated in terms of nominal strength, denoted as $\sigma$, which is defined as the peak load divided by $d \times 1\ \mathrm{mm}^2$. in figs. 2 and 3 the nominal strength is plotted as a function of the structural dimension d in the usual logarithmic scale. it can be seen that for both regularised continuum formulations and for all three geometries a size effect in nominal strength is obtained. furthermore, this size effect persists in the whole size range. for larger specimen sizes, the unnotched specimens and the specimens with a constant (non-proportional) notch behave similarly in that the same slope is obtained in the size effect curves. however, the presence of a notch lowers the nominal strength, even if the notch dimensions are negligible compared to the structural dimensions. the beams with a proportionally scaled notch behave significantly differently from the other two test series: a steeper (i.e. more negative) slope in the size effect diagrams is obtained.
fig. 2: nonlocal damage: size effects for unnotched beams (circles), beams with constant notch (triangles) and beams with proportional notch (diamonds)
fig. 3: viscoplastic damage: size effects for unnotched beams (circles), beams with constant notch (triangles) and beams with proportional notch (diamonds)
the two regularised continua follow the same trends, apart from the extreme sensitivity of the constant notch beam with small dimensions. in the nonlocal damage formulation the constant notch case does not differ very much from the other two cases for small specimen sizes, whereas in the viscoplastic damage formulation a more significant difference exists. the small size range implies that the notch dimension is in the same order of magnitude as the material length scale. as is argued in [21], the imperfection (here: the notch) dominates the response in a viscosity-enhanced continuum when the material length scale and the notch dimensions are in the same order of magnitude, while a nonlocal continuum is much less sensitive to notch dimensions. for theoretical predictions it is relevant to validate the obtained numerical results with the size effect models proposed in the literature. to this end, we have fitted the numerical results of the unnotched beams with the multi-fractal scaling law and the numerical results of the proportionally notched beams with the size effect law, see figs. 4 and 5, using the nonlocal damage formulation.
fig. 4: nonlocal damage: size effects for unnotched beams: numerical experiments (solid with circles) versus multi-fractal scaling law (dashed)
fig. 5: nonlocal damage: size effects for proportionally notched beams: numerical experiments (solid with diamonds) versus size effect law (dashed)
fig. 6: damage contours at peak load in beams with constant notch: d = 16 mm (top), d = 32 mm (middle) and d = 64 mm (bottom). deformed configurations are plotted with magnification factor 100
a least-squares fitting procedure has been used. with the current set of material and geometrical parameters, the size effect models quantify as
$$\sigma = 0.5482\,\sqrt{1 + \frac{14.7492}{d}} \qquad (10)$$
for the multi-fractal scaling law, and
$$\sigma = \frac{1}{\sqrt{1 + d/4.4640}} \qquad (11)$$
for the size effect law.
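a least-squares fit of this kind can be reproduced with standard tools; in the python sketch below the (d, σ) data are synthetic placeholders generated from equation (10) plus noise, since the actual numerical values are only available from figs. 4 and 5:

```python
import numpy as np
from scipy.optimize import curve_fit

def mfsl(d, ft, lch):
    """Multi-fractal scaling law, cf. eq. (10)."""
    return ft * np.sqrt(1.0 + lch / d)

def sel(d, b, d0):
    """Bazant's size effect law, cf. eq. (11)."""
    return b / np.sqrt(1.0 + d / d0)

np.random.seed(0)
d = np.array([1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0])                # beam sizes [mm]
sigma = mfsl(d, 0.5482, 14.7492) + 0.005 * np.random.randn(d.size)  # placeholder data

params, _ = curve_fit(mfsl, d, sigma, p0=(0.5, 10.0))
print("fitted ft = %.4f, lch = %.4f" % tuple(params))
```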
as can be seen in fig. 4, the multi-fractal scaling law provides a reasonable prediction of the numerically obtained results, although a finite slope in the size effect diagram is found numerically that cannot be recovered by means of the multi-fractal scaling law (which approaches a horizontal plateau in the large-size range). this can be understood when studying the damage contours at peak load, for instance of the beams with a constant notch, as displayed for d = 16 mm, d = 32 mm and d = 64 mm in fig. 6. a horizontal slope in the size effect curve would imply that the uncracked portion of the beam height is proportional to the structural size d, thereby assuming that the same distributions of damage and stress will be reached along the vertical symmetry axis. however, from fig. 6 it is seen that this is clearly not the case if the present nonlocal damage continuum is used: the uncracked portion of the beam decreases as the structural dimension d increases. more severe deviations are found for the proportionally notched beams, cf. fig. 5: the numerical results exhibit a convexity in the large-size range where the size effect law is strictly concave. for the large-size range, the size effect law predicts a slope of the size effect curve of 1:2 in a log-log scale, while the numerical results tend to converge towards a much less steep slope. for the large-size range, the material model that underlies the size effect law is linear elastic fracture mechanics (lefm).
fig. 7: damage contours at peak load in beams with proportional notch: d = 16 mm (top), d = 32 mm (middle) and d = 64 mm (bottom). deformed configurations are plotted with magnification factor 100
compared to lefm, a nonlocal continuum has an increased nominal strength due to the material length scale, which is lacking in lefm. also, singularities are removed in the present nonlocal continuum [16, 22], which leads to an increased nominal strength compared to lefm. in the small-size range, large differences are obtained between the numerically obtained size effect and the predictions via the size effect law. however, following the argumentation outlined in the introduction, these differences are considered to be of minor relevance. in fig. 7 the damage contours at peak load are plotted for d = 16 mm, d = 32 mm and d = 64 mm. the damaged region has extended significantly, even for the largest size, so that the ductility is significantly increased. 5 closure size effects are studied from a numerical point of view. a nonlocal damage continuum of the differential type and a viscoplastic damage continuum have been employed, both of which include a material length scale. three-point bending tests have been analysed with and without notches, where notches were considered that do or do not scale with the structural dimensions (proportional notches and constant notches, respectively). the size effects exhibited by unnotched beams and by beams with a constant notch are virtually identical in the large-size range. the numerically obtained results differ somewhat from the multi-fractal scaling law, which is due to the different mechanics that underlies the nonlocal continuum. the size effects shown by proportionally notched beams are in reasonable agreement with the size effect law. the difference between the two types of regularisation (spatial gradients versus viscosity) is negligible for the size-effect predictions. it is emphasised that the slopes of the numerically obtained size effects in the usual log-log diagrams are bounded by the multi-fractal scaling law from above and by the size effect law from below. using the two regularised continua discussed in this paper, neither the multi-fractal scaling law nor the size effect law is obtained in the large-size range. references [1] carpinteri a. (editor): "size-scale effects in the failure mechanisms of materials and structures." e. & f.n. spon, london, 1996. [2] aifantis e. c.: "strain gradient interpretation of size effects." international journal of fracture, vol. 95 (1999), p. 299–314. [3] bažant z. p.: "size effect." international journal of solids and structures, vol. 37 (2000), p. 69–80. [4] karihaloo b. l., xiao q. z.: "size effect in the strength of concrete structures." sadhana, vol. 27 (2002), p. 449–459. [5] van mier j. g. m., van vliet m. r. a.: "influence of microstructure of concrete on size/scale effects in tensile fracture." engineering fracture mechanics, vol. 70 (2003), p. 2281–2306.
[6] fleck n. a., muller g. m., ashby m. f., hutchinson j. w.: "strain gradient plasticity: theory and experiment." acta metallurgica et materialia, vol. 42 (1994), p. 475–487. [7] zhu h. t., zbib h. m., aifantis e. c.: "strain gradients and continuum modeling of size effect in metal matrix composites." acta mechanica, vol. 121 (1997), p. 165–176. [8] efremidis g., carpinteri a., aifantis e. c.: "griffith's theory versus gradient elasticity in the evaluation of porous materials tensile strength." journal of the mechanical behavior of materials, vol. 12 (2001), p. 95–105. [9] efremidis g., carpinteri a., aifantis e. c.: "multifractal scaling law versus gradient elasticity in the evaluation of disordered materials compressive strength." journal of the mechanical behavior of materials, vol. 12 (2001), p. 107–120. [10] askes h., aifantis e. c.: "numerical modeling of size effects with gradient elasticity – formulation, meshless discretization and examples." international journal of fracture, vol. 117 (2002), p. 347–358. [11] carmeliet j.: "optimal estimation of gradient damage parameters from localization phenomena in quasi-brittle materials." mechanics of cohesive-frictional materials, vol. 4 (1999), p. 1–16. [12] le bellégo c., dubé j. f., pijaudier-cabot g., gérard b.: "calibration of nonlocal damage model from size effect tests." european journal of mechanics a/solids, vol. 22 (2003), p. 33–46. [13] gutiérrez m. a., de borst r.: "deterministic and stochastic analysis of size effects and damage evolution in quasi-brittle materials." archive of applied mechanics, vol. 69 (1999), p. 655–676. [14] peerlings r. h. j., de borst r., brekelmans w. a. m., de vree j. h. p.: "gradient enhanced damage for quasi-brittle materials." international journal for numerical methods in engineering, vol. 39 (1996), p. 3391–3403. [15] simone a., sluys l. j.: "the use of displacement discontinuities in a rate-dependent medium." computer methods in applied mechanics and engineering, vol. 193 (2004), p. 3015–3033. [16] peerlings r. h. j., geers m. g. d., de borst r., brekelmans w. a. m.: "a critical comparison of nonlocal and gradient-enhanced softening continua." international journal of solids and structures, vol. 38 (2001), p. 7723–7746. [17] peerlings r. h. j., de borst r., brekelmans w. a. m., de vree j. h. p., spee i.: "some observations on localisation in non-local and gradient damage models." european journal of mechanics, a/solids, vol. 15 (1996), p. 937–953. [18] askes h., pamin j., de borst r.: "dispersion analysis and element-free galerkin solutions of second and fourth-order gradient-enhanced damage models." international journal for numerical methods in engineering, vol. 49 (2000), p. 811–832. [19] perzyna p.: "fundamental problems in viscoplasticity." advances in applied mechanics, vol. 9 (1966), academic press, new york, p. 243–377. [20] simone a., askes h., peerlings r. h. j., sluys l. j.: "interpolation requirements for implicit gradient-enhanced continuum damage models." communications in numerical methods in engineering, vol. 19 (2003), p. 563–572. (see also the corrigenda: communications in numerical methods in engineering, vol. 20 (2004), p. 163–165.) [21] wang w. m., sluys l. j., de borst r.: "interaction between material length scale and imperfection size for localisation phenomena in viscoplastic media." european journal of mechanics a/solids, vol. 15 (1996), p. 447–464.
[22] Simone A., Askes H., Sluys L. J.: "Incorrect initiation and propagation of failure in non-local and gradient-enhanced media." International Journal of Solids and Structures, Vol. 41 (2004), p. 351–363.

H. Askes
Faculty of Civil Engineering and Geosciences, Delft University of Technology, PO Box 5048, NL-2600 GA Delft, Netherlands

A. Simone
Faculty of Mathematics and Natural Sciences, University of Groningen, Nijenborgh 4, NL-9747 AG Groningen, Netherlands

L. J. Sluys
Faculty of Civil Engineering and Geosciences, Delft University of Technology, PO Box 5048, NL-2600 GA Delft, Netherlands

Acta Polytechnica 56(3):202–213, 2016, doi:10.14311/ap.2016.56.0202
© Czech Technical University in Prague, 2016

On cubature rules associated to Weyl group orbit functions

Lenka Háková (a,*), Jiří Hrivnák (b), Lenka Motlochová (b)
a) Department of Mathematics, Faculty of Chemical Engineering, University of Chemistry and Technology, Prague, Technická 5, CZ-166 28 Prague, Czech Republic
b) Department of Physics, Faculty of Nuclear Sciences and Physical Engineering, Czech Technical University in Prague, Břehová 7, CZ-115 19 Prague, Czech Republic
*) Corresponding author: lenka.hakova@vscht.cz

Abstract. The aim of this article is to describe several cubature formulas related to the Weyl group orbit functions, i.e. to the special cases of the Jacobi polynomials associated to root systems. The diagram containing the relations among the special functions associated to the Weyl group orbit functions is presented, and the link between the Weyl group orbit functions and the Jacobi polynomials is explicitly derived in full generality. The four cubature rules corresponding to these polynomials are summarized for all simple Lie algebras and their properties simultaneously tested on model functions. The Clenshaw–Curtis method is used to obtain additional formulas connected with the simple Lie algebra $C_2$.

Keywords: Weyl group orbit functions; Jacobi polynomials; cubature formulas.

1. Introduction
The purpose of this paper is to explicitly overview in full generality the link between the Weyl group orbit functions and the Jacobi and Macdonald polynomials, and further to examine and compare related methods of numerical integration. These methods of numerical integration, known as cubature rules, emerged recently for all four cases of the Weyl group orbit functions.
The four families of the Weyl group orbit functions [10, 18, 19, 26, 30] are connected to four families of orthogonal polynomials via relations similar to those between the Chebyshev polynomials of the first and second kinds and the ordinary cosine and sine functions. A full set of four families of orbit functions, called C-, S-, $S^s$- and $S^l$-functions, arises from root systems of simple Lie algebras with two different lengths of roots. These four families of orthogonal polynomials are in fact special cases of the multivariate Jacobi polynomials [11, 12]. The Jacobi polynomials associated to root systems are in turn limiting cases of the Macdonald polynomials [24]. This connection between the four cases of the Jacobi polynomials and the underlying orbit functions allows one to formulate the corresponding methods for numerical integration in terms of the Jacobi polynomials.
Among methods for numerical integration, the quadrature and cubature formulas related to polynomials of a bounded degree hold a prominent place [1, 6–8, 35, 38]. Such formulas estimate a given weighted integral over a fixed domain in Euclidean space. This estimation holds exactly for all polynomials up to a certain degree. The significant effort put into the development of various types of cubature formulas has resulted in a multitude of types of integration domains with varying efficiencies. The shapes of the integration domains and the nodes for cubature formulas corresponding to the orthogonal polynomials of the Weyl group orbit functions are determined by the symmetries of the affine Weyl groups and a certain transform [15, 22, 23, 26, 27]. This transform is generated by the transform which induces the given set of orthogonal polynomials. Moreover, a specific notion of the modified degree of multivariate polynomials is essential for establishing the final cubature formulas.
One of the specific methods of deriving quadrature formulas, known as the Clenshaw–Curtis method [5], is classically related to Chebyshev polynomials of one variable [9]. Its two-dimensional version related to the two-variable Chebyshev polynomials of the root system $A_2$ has also been developed [31]. The importance of this method lies e.g. in its utilization for practical optimization of the shapes of integration domains. The shapes of the integration domains are determined by the underlying Lie algebra [15] and are, however, of non-standard form. In the case of simple Lie algebras related to two-variable functions, one of the possible optimizations of these shapes is, similarly to [31], inscribing a triangle into the original fundamental domain.
The focus of the present article is on the simple Lie algebra $C_2$ and its corresponding cubature rules. The integration domain in the case of $C_2$ is a region bounded by two lines and a parabola, depicted in Fig. 4. Apart from the general perspective in [15, 26, 27], integration over this region is studied in [32]. Similarly to [31] for $A_2$, the non-standard shape of this integration domain motivates further exploration of the Clenshaw–Curtis method. This method crucially depends on the choice of the weight function and the inscribed integration region, and it has not yet been studied in detail for the case of $C_2$. In this case, the domain inscribed in the original integration region is considered to be either the original domain itself or the triangle depicted in Fig. 6. Two fundamental choices of the weight functions are detailed. Prior to practical implementation, exact values of certain integrals are also needed. One of the goals of this article is to provide all data necessary for practical implementation of these cubature formulas. This is achieved by tabulating and calculating all needed stabilizer coefficients and exact integral values for each choice of the inscribed integration domain and weight function. To demonstrate the usefulness and viability of the presented methods, numerical tests on model functions, including multidimensional step-functions, are also performed.
The development of novel cubature formulas is motivated by their widespread use in applied numerical simulations and engineering problems. Among the direct numerical applications of the cubature formulas is the induced method of polynomial approximation.
The cubature formulas are ubiquitous in the modern theory of electromagnetism, especially in its branches of electromagnetic wave propagation [33], magnetostatic modeling [39] and micromagnetic simulations [4]. Other fields include fluid flow simulations, laser optics and stochastic dynamics.
In Section 2, the notions necessary for the definition of the Weyl group orbit functions are reviewed and the relation between the orbit functions and the Jacobi polynomials is detailed. In Section 3, the cubature formulas from [15, 26, 27] are summarized, the Clenshaw–Curtis method is described and used to derive additional cubature formulas, and numerical test results are presented.

2. Special functions associated to root systems
2.1. Basic definitions
This section reviews the basic concepts and notation from the theory of root systems, Weyl groups and Weyl group orbit functions. It is consistent with the notation used in recent papers regarding the topic, such as [10, 13–15, 26] and others.
We consider simple Lie algebras, i.e. the four infinite families $A_n\,(n\geq 1)$, $B_n\,(n\geq 3)$, $C_n\,(n\geq 2)$ and $D_n\,(n\geq 4)$ and the five exceptional algebras $E_6$, $E_7$, $E_8$, $F_4$ and $G_2$ (for the classification see [2, 16]). In particular, we focus on the algebra $C_2$ as the simplest non-trivial example. Each simple Lie algebra is completely described by its set of simple roots $\Delta = \{\alpha_1,\dots,\alpha_n\}$, which forms a non-orthogonal basis of the Euclidean space $\mathbb{R}^n$ equipped with a scalar product denoted by $\langle\cdot,\cdot\rangle$. Simple roots are either of the same length or of two different lengths; in the latter case we distinguish so-called short and long roots and write $\Delta = \Delta_s \cup \Delta_l$. The set of dual roots is denoted by $\Delta^\vee = \{\alpha_1^\vee,\dots,\alpha_n^\vee\}$, where
\[ \alpha_i^\vee = \frac{2\alpha_i}{\langle\alpha_i,\alpha_i\rangle}. \]
In addition to the bases of simple roots and dual roots, we introduce the weight basis $\omega_1,\dots,\omega_n$ and the dual weight basis $\omega_1^\vee,\dots,\omega_n^\vee$, where $\langle\alpha_i^\vee,\omega_j\rangle = \langle\alpha_i,\omega_j^\vee\rangle = \delta_{ij}$. The Cartan matrix $C$ is defined as
\[ C_{ij} = \frac{2\langle\alpha_i,\alpha_j\rangle}{\langle\alpha_j,\alpha_j\rangle} \]
and its determinant is denoted by $c$. Each simple root $\alpha_i$ relates to a reflection $r_i$ defined for every $a\in\mathbb{R}^n$ as
\[ r_i a = a - \frac{2\langle a,\alpha_i\rangle}{\langle\alpha_i,\alpha_i\rangle}\,\alpha_i. \]
The set of reflections $\{r_1,\dots,r_n\}$ generates a finite group $W$ called the Weyl group. By the action of $W$ on the set of simple roots we obtain the root system $\Pi = W\Delta$. Analogously, we define $\Pi^\vee = W\Delta^\vee$, $\Pi_s = W\Delta_s$ and $\Pi_l = W\Delta_l$. Every element of $\Pi$ can be written as a combination of simple roots with only non-negative (positive roots) or non-positive (negative roots) integer coefficients. The set of positive roots is denoted by $\Pi^+$. We define a partial ordering $\preceq$ of roots: $\mu\preceq\lambda$ if $\lambda-\mu$ is a sum of simple roots with non-negative integer coefficients. There is a unique highest root $\xi$ with respect to this ordering; its coordinates in the basis of simple roots are called marks and are denoted by $m_1,\dots,m_n$. The dual root system $\Pi^\vee$ contains the highest dual root $\eta = m_1^\vee\alpha_1^\vee + \dots + m_n^\vee\alpha_n^\vee$ with the coefficients $m_i^\vee$ called dual marks. An infinite extension of the Weyl group $W$ is the affine Weyl group $W^{\mathrm{aff}}$, which is obtained by adding to the set of generators of $W$ the affine reflection $r_0$,
\[ r_0 a = r_\xi a + \frac{2\xi}{\langle\xi,\xi\rangle}, \qquad r_\xi a = a - \frac{2\langle a,\xi\rangle}{\langle\xi,\xi\rangle}\,\xi. \]
It can also be written as a semidirect product of $W$ and the set of shifts by integer combinations of dual roots [13]. We denote by $\psi$ the retraction homomorphism $W^{\mathrm{aff}}\to W$ [14]. The fundamental domain $F$, a set containing exactly one point from each $W^{\mathrm{aff}}$ orbit, can be chosen as
\[ F = \big\{ b_1\omega_1^\vee + \dots + b_n\omega_n^\vee \;\big|\; b_i\in\mathbb{R}^{\geq 0},\ b_0 + b_1 m_1 + \dots + b_n m_n = 1 \big\}. \tag{1} \]
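The reflections $r_i$ are elementary to implement numerically. The following minimal C++ sketch (our illustration, not code from the paper) generates the orbit $W\lambda$ of a point by closing it under the generating reflections of $C_2$. The orthonormal realization of the simple roots, $\alpha_1 = (1,-1)$ short and $\alpha_2 = (0,2)$ long, so that $\xi = 2\alpha_1+\alpha_2 = (2,0)$, is a standard choice that is not fixed in the text, and the generic point $\lambda = (1.5, 0.5)$ is an arbitrary example.

    #include <array>
    #include <cmath>
    #include <cstdio>
    #include <set>
    #include <vector>

    using Vec = std::array<double, 2>;

    double dot(const Vec& a, const Vec& b) { return a[0]*b[0] + a[1]*b[1]; }

    // the reflection r_alpha(a) = a - (2<a,alpha>/<alpha,alpha>) alpha
    Vec reflect(const Vec& a, const Vec& alpha)
    {
        const double c = 2.0 * dot(a, alpha) / dot(alpha, alpha);
        return { a[0] - c * alpha[0], a[1] - c * alpha[1] };
    }

    int main()
    {
        const Vec alpha1 = { 1.0, -1.0 };        // short simple root (assumed realization)
        const Vec alpha2 = { 0.0,  2.0 };        // long simple root
        std::vector<Vec> orbit{ { 1.5, 0.5 } };  // a generic point lambda
        // rounded keys to detect duplicates despite floating-point noise
        auto key = [](const Vec& v) {
            return std::array<long, 2>{ std::lround(1e6 * v[0]), std::lround(1e6 * v[1]) };
        };
        std::set<std::array<long, 2>> seen{ key(orbit[0]) };
        // breadth-first closure under the generating reflections r1, r2
        for (std::size_t i = 0; i < orbit.size(); i++)
            for (const Vec& alpha : { alpha1, alpha2 }) {
                const Vec w = reflect(orbit[i], alpha);
                if (seen.insert(key(w)).second)
                    orbit.push_back(w);
            }
        std::printf("|W lambda| = %zu\n", orbit.size());  // prints 8 = |W| of C2
    }

For a generic point the closure terminates with the eight signed permutations of the coordinates, which is exactly the order $|W| = 8$ of the Weyl group of $C_2$ entering the cubature weights later on.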
Analogously, we define the dual affine Weyl group as a semidirect product of $W$ and the shifts by integer combinations of simple roots. We introduce three lattices $P$, $P^+$ and $P^\vee$ as
\[ P = \mathbb{Z}\omega_1 + \dots + \mathbb{Z}\omega_n, \qquad P^+ = \mathbb{Z}^{\geq 0}\omega_1 + \dots + \mathbb{Z}^{\geq 0}\omega_n, \qquad P^\vee = \mathbb{Z}\omega_1^\vee + \dots + \mathbb{Z}\omega_n^\vee. \]
Note that the root system $\Pi$ is contained in $P$; therefore, the partial ordering $\preceq$ can be extended to the lattice $P$. A function $k\colon \alpha\in\Pi \to k_\alpha\in\mathbb{R}^{\geq 0}$ such that $k_\alpha = k_{w(\alpha)}$ for all $w\in W$ is known as a multiplicity function on $\Pi$. The trivial example is to take $k_\alpha = \mathrm{const}$ for all $\alpha\in\Pi$, which we denote by $k^{\mathrm{const}}$. For simple Lie algebras with two different root lengths, it is natural to distinguish between short and long roots by defining
\[ k^s_\alpha \equiv \begin{cases} 1 & \text{if } \alpha\in\Pi_s,\\ 0 & \text{if } \alpha\in\Pi_l, \end{cases} \qquad k^l_\alpha \equiv \begin{cases} 0 & \text{if } \alpha\in\Pi_s,\\ 1 & \text{if } \alpha\in\Pi_l. \end{cases} \]
The notion of a multiplicity function allows us to define the sums of positive roots $\varrho(k)$ and the numbers $h(k)$,
\[ \varrho(k) \equiv \frac12 \sum_{\alpha\in\Pi^+} k_\alpha\,\alpha, \qquad h(k) = k_\xi + \sum_{i=1}^n m_i k_{\alpha_i}. \tag{2} \]
In particular, with the choice of $k^t$, where $t$ is one of the symbols $\{0,1,s,l\}$, we have
\[ \varrho^0 \equiv \varrho(k^0) = 0, \quad \varrho^1 \equiv \varrho(k^1) = \sum_{i=1}^n \omega_i, \quad \varrho^s \equiv \varrho(k^s) = \sum_{\alpha_i\in\Delta_s} \omega_i, \quad \varrho^l \equiv \varrho(k^l) = \sum_{\alpha_i\in\Delta_l} \omega_i \tag{3} \]
and
\[ h^0 \equiv h(k^0) = 0, \quad h^1 \equiv h(k^1) = 1 + \sum_{i=1}^n m_i, \quad h^s \equiv h(k^s) = \sum_{\alpha_i\in\Delta_s} m_i, \quad h^l \equiv h(k^l) = 1 + \sum_{\alpha_i\in\Delta_l} m_i. \tag{4} \]
The number $h \equiv h^1$ is called the Coxeter number; analogously, we call $h^s$ and $h^l$ the short and long Coxeter numbers.
The set of simple roots $\Delta = \{\alpha_1,\alpha_2\}$ of the algebra $C_2$ decomposes into the set of the short simple roots $\Delta_s = \{\alpha_1\}$ and the set of the long simple roots $\Delta_l = \{\alpha_2\}$. The highest root is of the form $\xi = 2\alpha_1+\alpha_2$ and the dual highest root is $\eta = \alpha_1^\vee + 2\alpha_2^\vee$. Thus, the sets of marks and dual marks are $(m_1,m_2) = (2,1)$ and $(m_1^\vee,m_2^\vee) = (1,2)$. The vectors $\varrho^t$ are of the form $(\varrho^1,\varrho^s,\varrho^l) = (\omega_1+\omega_2,\ \omega_1,\ \omega_2)$ and the Coxeter numbers are $(h^1,h^s,h^l) = (4,2,2)$. The roots and dual roots, the weights and dual weights, together with the fundamental domain $F$ and the vectors $\varrho^t$, are depicted in Figure 1.

Figure 1: Root system of $C_2$. The white circles denote the roots, the black dots depict the dual roots. The triangle denotes the fundamental domain $F$. The lines denoted $r_1$, $r_2$ and $r_0$ depict the reflecting mirrors which realize the corresponding reflections.

2.2. Weyl group orbit functions
The definition of the Weyl group orbit functions uses the notion of the sign homomorphisms $\sigma^t\colon W \to \{\pm 1\}$, where $t\in\{0,1,s,l\}$. These can be defined by their values on the reflections $r_i$ corresponding to the simple roots, namely
\[ \sigma^0(r_i) = 1,\ \alpha_i\in\Delta, \qquad \sigma^1(r_i) = -1,\ \alpha_i\in\Delta, \qquad \sigma^s(r_i) = \begin{cases} 1 & \text{if } \alpha_i\in\Delta_l,\\ -1 & \text{if } \alpha_i\in\Delta_s, \end{cases} \qquad \sigma^l(r_i) = \begin{cases} 1 & \text{if } \alpha_i\in\Delta_s,\\ -1 & \text{if } \alpha_i\in\Delta_l. \end{cases} \]
Several families of special functions are connected with each Weyl group $W$. They are labelled by vectors $\lambda\in P^+$ and defined as weighted sums over the corresponding Weyl group orbit, i.e. the set $W\lambda = \{w\lambda \mid w\in W\}$. For every $x\in\mathbb{R}^n$ and $t\in\{0,1,s,l\}$ we define
\[ S^t_{\lambda+\varrho^t}(x) = \sum_{\mu\in W(\lambda+\varrho^t)} \sigma^t(\mu)\, e^{2\pi i\langle\mu,x\rangle}, \tag{5} \]
where $\varrho^t$ is given by (3) and $\sigma^t(\mu) \equiv \sigma^t(w)$ for $w$ such that $\mu = w(\lambda+\varrho^t)$. Functions corresponding to the choices $t=0$ and $t=1$ are usually called C- and S-functions, respectively; in the formulas in the next sections we use the notation $S^0$ and $S^1$ for simplicity. The families of C-, S-, $S^s$- and $S^l$-functions are complex multivariate functions with remarkable properties such as (anti-)invariance with respect to the action of the affine Weyl group, and continuous and discrete orthogonality. They were studied in many papers, see for example [18, 19, 26] for the general properties and [13, 14, 28] for their discretization.
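To see how definition (5) reduces to classical trigonometric functions, consider the rank-one algebra $A_1$; this worked example is added here for illustration and is not part of the original text. The Weyl group is $W = \{1, r_1\}$, and with $\omega_1 = \omega_1^\vee = \alpha_1/2$ and $\langle\alpha_1,\alpha_1\rangle = 2$ one has $\langle m\omega_1, b\,\omega_1^\vee\rangle = mb/2$. Writing $x = b\,\omega_1^\vee$, for $\lambda = m\omega_1$ with $m\geq 1$,
\[ C_m(x) = S^0_{m\omega_1}(x) = e^{\pi i m b} + e^{-\pi i m b} = 2\cos(\pi m b), \qquad S_m(x) = S^1_{(m-1)\omega_1+\varrho^1}(x) = e^{\pi i m b} - e^{-\pi i m b} = 2i\sin(\pi m b), \]
so the C- and S-functions are precisely the cosine and sine functions mentioned in the introduction.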
The fundamental domains $F^t$ are defined as subsets of $F$ such that we omit the part of the boundary of $F$ which is stabilized by certain generating reflections $r\in R = \{r_0, r_1, \dots, r_n\}$. More precisely, with the notation
\[ R^t = \{ r\in R \mid \sigma^t\circ\psi(r) = -1 \}, \qquad H^t = \{ a\in F \mid \exists\, r\in R^t,\ ra = a \}, \]
we define $F^t = F\setminus H^t$. The explicit forms are obtained from (1) and can be found in [14].
The C-, S-, $S^s$- and $S^l$-functions can be viewed as functional forms of elements from the algebra $\mathbb{C}[P]$ containing all complex linear combinations of formal exponentials $e^a$, $a\in P$, with multiplication defined by $e^a\cdot e^b = e^{a+b}$, the inverse given by $(e^a)^{-1} = e^{-a}$ and the identity $e^0 = 1$. The connection is based on the exponential mapping from a Lie algebra to the corresponding Lie group [2, 16, 27].

2.3. Jacobi polynomials
We assume that the multiplicity function $k$ satisfies $k_\alpha\geq 0$. The Jacobi polynomial $P(\lambda,k)$ [11, 12] associated to the root system $\Pi$, with highest weight $\lambda\in P^+$ and multiplicity function $k$ as parameter, is defined by the following formulas:
\[ P(\lambda,k) \equiv \sum_{\substack{\mu\in P^+\\ \mu\preceq\lambda}} c_{\lambda\mu}(k)\, C_\mu, \qquad C_\mu = \sum_{\mu'\in W\mu} e^{\mu'}, \tag{6} \]
where the coefficients $c_{\lambda\mu}(k)$ are recursively given by
\[ \big( \langle\lambda+\varrho(k),\lambda+\varrho(k)\rangle - \langle\mu+\varrho(k),\mu+\varrho(k)\rangle \big)\, c_{\lambda\mu}(k) = 2\sum_{\alpha\in\Pi^+} k_\alpha \sum_{j=1}^{\infty} \langle\mu+j\alpha,\alpha\rangle\, c_{\lambda,\mu+j\alpha}(k) \]
along with the initial value $c_{\lambda\lambda} = 1$ and the assumption $c_{\lambda\mu} = c_{\lambda,w(\mu)}$ for all $w\in W$. Recall that $\varrho(k)$ is defined by (2).
By setting $k_\alpha = 0$ for all $\alpha\in\Pi$, the Jacobi polynomials lead trivially to C-functions. In the case $k = k^1$, the formula for the calculation of the coefficients becomes Freudenthal's recurrence formula [16]. Therefore, each $P(\lambda,k^1)$ specializes to a character $\chi_\lambda$ of an irreducible representation of the simple Lie algebra of the highest weight $\lambda$, i.e.,
\[ P(\lambda,k^1) = \chi_\lambda = \frac{S_{\lambda+\varrho^1}}{S_{\varrho^1}}. \]
In addition, we show that the Jacobi polynomials are related to the $S^s$- and $S^l$-functions in the following way:
\[ P(\lambda,k^s) = \frac{S^s_{\lambda+\varrho^s}}{S^s_{\varrho^s}} \qquad\text{and}\qquad P(\lambda,k^l) = \frac{S^l_{\lambda+\varrho^l}}{S^l_{\varrho^l}}. \]
We first observe that $S^s_{\lambda+\varrho^s}/S^s_{\varrho^s}$ are Weyl group invariant elements of $\mathbb{C}[P]$ (see Proposition 4.2 of [26]) with the well-known basis formed by C-functions [2]. Therefore, each $S^s_{\lambda+\varrho^s}/S^s_{\varrho^s}$ can be expressed as a linear combination of C-functions. By the definition of the Weyl group, the weights of the exponentials in $S^s_{\lambda+\varrho^s}$ are of the form $\lambda+\varrho^s-g(\alpha_1,\dots,\alpha_n)$, where $g(\alpha_1,\dots,\alpha_n)$ denotes a sum of simple roots with non-negative integer coefficients, and the unique maximal weight is $\lambda+\varrho^s$. Similarly, the unique maximal weight of $S^s_{\varrho^s}$ is $\varrho^s$. Therefore, we have
\[ \frac{S^s_{\lambda+\varrho^s}}{S^s_{\varrho^s}} = \sum_{\substack{\mu\in P^+\\ \mu\preceq\lambda}} b_\mu C_\mu, \qquad b_\lambda = 1. \]
To prove $b_\mu = c_{\lambda\mu}(k^s)$, we proceed by using an equivalent definition of the Jacobi polynomials with the multiplicity function satisfying $k_\alpha\in\mathbb{Z}^{\geq 0}$ [12]. For any $f = \sum_\lambda a_\lambda e^\lambda$, we define $\bar f \equiv \sum_\lambda a_\lambda e^{-\lambda}$ and $\mathrm{ct}(f)\equiv a_0$. If we introduce the scalar product $(\cdot,\cdot)$ on $\mathbb{C}[P]$ by
\[ (f,g) \equiv \mathrm{ct}\big( f\,\bar g\, \delta(k)^{\frac12}\,\overline{\delta(k)^{\frac12}} \big), \quad f,g\in\mathbb{C}[P], \qquad \delta(k)^{\frac12} \equiv \prod_{\alpha\in\Pi^+} \big( e^{\frac12\alpha} - e^{-\frac12\alpha} \big)^{k_\alpha}, \]
then the Jacobi polynomials $P(\lambda,k)$ are the unique polynomials of the form (6) satisfying the requirement $\big(P(\lambda,k),P(\mu,k)\big) = 0$ for all $\mu\in P^+$ such that $\mu\preceq\lambda$ and $\lambda\neq\mu$, assuming $c_{\lambda\lambda} = 1$. Using Proposition 4.1 of [26], i.e. $\delta(k^s)^{\frac12} = S^s_{\varrho^s}$, we obtain
\[ \Big( \frac{S^s_{\lambda+\varrho^s}}{S^s_{\varrho^s}}, \frac{S^s_{\mu+\varrho^s}}{S^s_{\varrho^s}} \Big) = \mathrm{ct}\big( S^s_{\lambda+\varrho^s}\,\overline{S^s_{\mu+\varrho^s}} \big) = \mathrm{ct}\Big( \sum_{\lambda'\in W(\lambda+\varrho^s)} \sum_{\mu'\in W(\mu+\varrho^s)} \sigma^s(\lambda')\,\sigma^s(\mu')\, e^{\lambda'-\mu'} \Big). \]
Clearly $\lambda' = \mu'$ if and only if there exists $w\in W$ such that $\lambda+\varrho^s = w(\mu+\varrho^s)$. Since we consider $\lambda\in P^+$ different from $\mu\in P^+$, it is not possible to have $\lambda' = \mu'$. This implies that
\[ \Big( \frac{S^s_{\lambda+\varrho^s}}{S^s_{\varrho^s}}, \frac{S^s_{\mu+\varrho^s}}{S^s_{\varrho^s}} \Big) = 0 \qquad\text{and}\qquad P(\lambda,k^s) = \frac{S^s_{\lambda+\varrho^s}}{S^s_{\varrho^s}}. \]
The proof for the long root case is similar.
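Continuing the $A_1$ illustration from Section 2.2 (again an added worked example, not part of the original derivation), the relations just proven reproduce the classical Chebyshev polynomials: with $\theta = \pi b$ and $m\geq 1$,
\[ P(m\omega_1, k^0) = C_m = 2\cos(m\theta) = 2\,T_m(\cos\theta), \qquad P(m\omega_1, k^1) = \frac{S_{m+1}}{S_1} = \frac{\sin\big((m+1)\theta\big)}{\sin\theta} = U_m(\cos\theta), \]
i.e. the Jacobi polynomials $P(\lambda,k)$ interpolate between the Chebyshev polynomials of the first kind ($k = k^0$, up to the factor 2) and of the second kind ($k = k^1$), exactly as indicated in the diagram of Figure 2 below.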
Finally, note that the Jacobi polynomials can be viewed as the limiting case of the Macdonald polynomials $P_\lambda(q,t_\alpha)$ when $t_\alpha = q^{k_\alpha}$ with $k_\alpha$ fixed and $q\to 1$; see [24] for more details. The relations among several special functions associated with the Weyl groups, which are summarized in [29], are depicted in Figure 2.

Figure 2: The diagram of relations among several special functions associated with Weyl groups: the Macdonald polynomials $P_\lambda(q,t_\alpha)$, $t_\alpha = q^{k_\alpha}$ [24] specialize, for $q\to 1$ with $k_\alpha$ fixed, to the Jacobi polynomials associated to root systems $P(\lambda,k)$ [11] (Section 2.3). The choices $k\in\{k^0,k^1\}$ yield the C- and S-functions [18, 19] with $C_\lambda = P(\lambda,k^0)$ and $S_{\lambda+\varrho^1} = P(\lambda,k^1)S_{\varrho^1}$; the choices $k\in\{k^s,k^l\}$ yield the $S^s$- and $S^l$-functions [26] with $S^s_{\lambda+\varrho^s} = P(\lambda,k^s)S^s_{\varrho^s}$ and $S^l_{\lambda+\varrho^l} = P(\lambda,k^l)S^l_{\varrho^l}$. Via the algebras $A_1$, $A_2$, $C_2$, $B_n$, $C_n$ and $G_2$, these are further related to the functions $CC_k$ and $SS_k$, $SC_k$ and $CS_k$ [21], the (anti)symmetric trigonometric functions $\cos^\pm$ and $\sin^\pm$ [17], the Chebyshev polynomials $T_m$, $U_m$, $V_m$ and $W_m$ [9], the two-variable Jacobi polynomials $P^{\alpha,\beta,\gamma}_{k_1,k_2}$ with $\alpha,\beta,\gamma\in\{\pm\frac12\}$ [20], the two-variable Jacobi polynomials on Steiner's hypocycloid [20], and the classical Jacobi polynomials $P^{(\alpha,\beta)}_m$ with $\alpha,\beta\in\{\pm\frac12\}$ [36].

3. Cubature formulas
3.1. General form of cubature formulas
Analogously to the Chebyshev polynomials, we identify the polynomial variables $y_1,\dots,y_n$ with real-valued functions in the following way. Let $z_j \equiv C_{\omega_j}$; then
\[ A_{2k}:\ y_j = \Re(z_j),\ y_{2k-j+1} = \Im(z_j),\ j = 1,\dots,k, \]
\[ A_{2k+1}:\ y_j = \Re(z_j),\ y_{k+1} = z_{k+1},\ y_{2k-j+2} = \Im(z_j),\ j = 1,\dots,k, \]
\[ D_{2k+1}:\ y_j = z_j,\ j = 1,\dots,2k-1,\ y_{2k} = \Re(z_{2k}),\ y_{2k+1} = \Im(z_{2k}), \]
\[ E_6:\ y_1 = \Re(z_1),\ y_2 = \Re(z_2),\ y_3 = z_3,\ y_4 = \Im(z_2),\ y_5 = \Im(z_1),\ y_6 = z_6; \]
otherwise we put $y_j = z_j$. We say that a monomial $y_1^{\lambda_1}\cdots y_n^{\lambda_n}$ has the m-degree
\[ \deg_m y_1^{\lambda_1}\cdots y_n^{\lambda_n} = m_1^\vee\lambda_1 + \dots + m_n^\vee\lambda_n, \]
and any polynomial $p$ in $\mathbb{C}[y_1,\dots,y_n]$ has an m-degree equal to the largest m-degree of the monomials occurring in $p$. We denote the subspace containing all polynomials of m-degree at most $M$ by $\Pi_M$. For $\lambda = \lambda_1\omega_1 + \dots + \lambda_n\omega_n$, the C-functions $C_\lambda$, and thus all Jacobi polynomials $P(\lambda,k)$, can be rewritten as orthogonal polynomials in the variables $y_1,\dots,y_n$ of m-degree equal to $m_1^\vee\lambda_1 + \dots + m_n^\vee\lambda_n$ [15]. The variables $y_1,\dots,y_n$ viewed as functions induce a map $\Xi\colon\mathbb{R}^n\to\mathbb{R}^n$, $\Xi(x) = (y_1(x),\dots,y_n(x))$. The map $\Xi$ is used to define the integration region $\Omega$ and the sets of nodes $\Omega^t_M$, $t\in\{0,1,s,l\}$, by
\[ \Omega \equiv \Xi(F^1), \qquad \Omega^t_M \equiv \Xi\Big( \tfrac{1}{M+h^t}\,P^\vee \cap F^t \Big). \]
Let
\[ \mathrm{Stab}_{W^{\mathrm{aff}}}(x) \equiv \{ w\in W^{\mathrm{aff}} \mid wx = x \}, \qquad \varepsilon(x) \equiv \frac{|W|}{|\mathrm{Stab}_{W^{\mathrm{aff}}}(x)|}, \]
and define a map $\tilde\varepsilon\colon\Omega^0_M\to\mathbb{N}$ by $\tilde\varepsilon(y) \equiv \varepsilon(\Xi_M^{-1}y)$, where $\Xi_M = \Xi|_{\frac{1}{M+h^t}P^\vee\cap F^0}$. Denoting $K(y_1,\dots,y_n) \equiv \sqrt{S_{\varrho^1}\overline{S_{\varrho^1}}}$, the weight functions are given by
\[ s^t(y_1,\dots,y_n) \equiv S^t_{\varrho^t}\overline{S^t_{\varrho^t}}, \qquad w^t(y_1,\dots,y_n) \equiv \frac{s^t(y_1,\dots,y_n)}{K(y_1,\dots,y_n)}, \qquad t\in\{0,1,s,l\}. \]
Since the products $S^t_{\varrho^t}\overline{S^t_{\varrho^t}}$, $t\in\{0,1,s,l\}$, are $W$-invariant sums of exponentials from $\mathbb{C}[P]$, they are expressible as functions in the polynomial variables $y_1,\dots,y_n$. In [15, 26, 27] it is shown that the following cubature formulas are exact equalities for any $M\in\mathbb{N}$ and any polynomial $p$ which satisfies the following constraints:
• $\deg_m p \leq 2M-1$ for $t\in\{0,l\}$,
• $\deg_m p \leq 2M+1$ for $t\in\{1,s\}$.
Thus, it holds that
\[ \int_\Omega p(y)\, w^t(y)\, dy = \frac{\kappa}{c\,|W|} \Big( \frac{2\pi}{M+h^t} \Big)^{\!n} \sum_{y\in\Omega^t_M} \tilde\varepsilon(y)\, s^t(y)\, p(y), \tag{7} \]
where
\[ \kappa = \begin{cases} 2^{-\lfloor \frac n2 \rfloor} & \text{for } A_n,\\ \frac12 & \text{for } D_{2k+1},\\ \frac14 & \text{for } E_6,\\ 1 & \text{otherwise.} \end{cases} \]
To numerically compare the efficiency of these cubature formulas, we may consider an integrable function $f$, such that $f/w^t$ is well defined on $\Omega^t_M$, and rewrite the cubatures (7) in the following form:
\[ I(f) \equiv \int_\Omega f(y)\, dy \approx I^t_M(f), \qquad I^t_M(f) \equiv \frac{\kappa}{c\,|W|} \Big( \frac{2\pi}{M+h^t} \Big)^{\!n} \sum_{y\in\Omega^t_M} \tilde\varepsilon(y)\, K(y)\, f(y). \tag{8} \]

3.2. Cubature formulas of C2
In this section, the general cubature formulas (8) are specialized and tested on model examples for the case of the algebra $C_2$. The region $F^t$ of $C_2$ with the points from $\frac{1}{10}P^\vee\cap F^t$ is depicted in Fig. 3, whereas the corresponding integration region $\Omega$ with the transformed grid points is depicted in Fig. 4.

Figure 3: The fundamental domain $F = F^0$ of $C_2$ is depicted as the triangle with the dashed boundary $H^s$ and the dot-and-dashed boundary $H^l$. The black dots correspond to the points from $\frac{1}{10}P^\vee\cap F$. The numbers 1, 2, 4 of the dots are the values of $\varepsilon(x)$; the inner dots have $\varepsilon(x) = 8$.

Figure 4: The integration region $\Omega$ of $C_2$ contains the points of the grid $\Omega^0_{10}$. The inner points of $\Omega$ correspond to the grid $\Omega^1_6$, the points not lying on the dashed boundary correspond to the grid $\Omega^s_8$ and, finally, the points not lying on the dot-and-dashed boundary correspond to the grid $\Omega^l_8$. The numbers 1, 2, 4 are the values of $\tilde\varepsilon(y)$; the inner dots have $\tilde\varepsilon(y) = 8$. The numbers of points in the grids $\Omega^t_M$, $t\in\{0,1,s,l\}$, are the same.

Note that fixing the basis $x = b_1\omega_1^\vee + b_2\omega_2^\vee$ results in the polynomial variables expressed as
\[ y_1 = 2\big( \cos\pi(2b_1+b_2) + \cos\pi b_2 \big), \qquad y_2 = 2\big( \cos 2\pi(b_1+b_2) + \cos 2\pi b_1 \big). \]
Formula (8) specializes into
\[ I^t_M(f) = \frac{\pi^2}{4(M+h^t)^2} \sum_{y\in\Omega^t_M} \tilde\varepsilon(y)\, K(y)\, f(y), \tag{9} \]
where:
• $M\in\mathbb{N}$ is arbitrary;
• $h^0 = 0$, $h^1 = 4$ and $h^s = h^l = 2$, in agreement with (4);
• the integration region $\Omega$, depicted in Fig. 4, is bounded by the two lines $y_2 = \pm 2y_1 - 4$ and the parabola $y_2 = \frac{y_1^2}{4}$;
• the finite grid $\Omega^t_M$, depicted for $M = 10$ in Fig. 4, consists of the points $(y_1(x), y_2(x))$, where $x = \frac{s^t_1}{M+h^t}\,\omega_1^\vee + \frac{s^t_2}{M+h^t}\,\omega_2^\vee$ with $s^t_i$ satisfying
\[ s^0_i\in\mathbb{Z}^{\geq 0},\ 2s^0_1+s^0_2\leq M; \qquad s^1_i\in\mathbb{Z}^{>0},\ 2s^1_1+s^1_2 < M+4; \]
\[ s^s_1\in\mathbb{Z}^{\geq 0},\ s^s_2\in\mathbb{Z}^{>0},\ 2s^s_1+s^s_2\leq M+2; \qquad s^l_1\in\mathbb{Z}^{>0},\ s^l_2\in\mathbb{Z}^{\geq 0},\ 2s^l_1+s^l_2 < M+2; \]
• the weight function $K$ becomes
\[ K(y_1,y_2) = \sqrt{ (y_1^2-4y_2)\big( (y_2+4)^2 - 4y_1^2 \big) }; \]
• the weight function $\tilde\varepsilon$ is equal to
\[ \tilde\varepsilon(y) = \begin{cases} 1 & \text{if } (y_1,y_2) = (\pm 4, 4),\\ 2 & \text{if } (y_1,y_2) = (0,-4),\\ 8 & \text{if } (y_1,y_2) \text{ is an inner point of } \Omega,\\ 4 & \text{otherwise.} \end{cases} \]
A minimal implementation of this specialized rule for $t = 0$ is sketched at the end of this section.
For the purpose of numerical tests and comparison, we choose as model functions
\[ f_1(y_1,y_2) = q_1\big( y_1^{20} - y_1 y_2 + y_2^{20} \big), \qquad f_2(y_1,y_2) = q_2\, e^{-\left(y_1^2 + (y_2+1.8)^2\right)/(2\cdot 0.35^2)}, \qquad f_3(y_1,y_2) = \frac{q_3}{1+y_1^2+y_2^2}, \]
\[ f_4(y_1,y_2) = q_4\, e^{y_1+y_2}, \qquad f_5(y_1,y_2) = \begin{cases} q_5 & \text{if } y_1^2 + (y_2+1.5)^2 \leq 1,\\ 0 & \text{otherwise.} \end{cases} \]
Each value of $q_i\in\mathbb{R}$ is set to satisfy the condition $I(f_i) = 1$. Fig. 5 shows, for $M = 10, 15, 20, \dots, 195$ and $t\in\{0,1,s\}$, the graphs of the absolute value of the difference $|1 - I^t_M(f_i)|$.

Figure 5: The graphs of the error values $|1 - I^t_M(f_i)|$ of the integral $I(f_i) = 1$ and its estimations $I^t_M(f_i)$, $M = 10, 15, 20, \dots, 195$, given by (8). The values for $t = 0, 1, s$ are depicted as circles, "+" signs and diamonds, respectively.

Note that the cases $t = s$ and $t = l$ give the same results, since $h^s = h^l = 2$ and $K(y)f_i(y)$ vanish on the boundary of $\Omega$.
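The following self-contained C++ sketch implements the specialized rule (9) for $t = 0$, so that $h^0 = 0$ and the nodes run over $2s_1 + s_2 \leq M$. It is an illustration assembled from the data listed above, not code from the original paper, and all function and variable names are ours; $\tilde\varepsilon$ is evaluated from the position of $(s_1,s_2)$ relative to the three edges of $F$.

    #include <cmath>
    #include <cstdio>

    // hypothetical sketch of rule (9) for t = 0 on the C2 integration region Omega
    template <typename F>
    double cubatureC2(int M, F f)
    {
        const double pi = std::acos(-1.0);
        double sum = 0.0;
        for (int s1 = 0; 2 * s1 <= M; s1++)
            for (int s2 = 0; 2 * s1 + s2 <= M; s2++) {
                const double b1 = double(s1) / M;
                const double b2 = double(s2) / M;
                // polynomial variables of section 3.2
                const double y1 = 2.0 * (std::cos(pi * (2*b1 + b2)) + std::cos(pi * b2));
                const double y2 = 2.0 * (std::cos(2*pi * (b1 + b2)) + std::cos(2*pi * b1));
                // weight function K; fabs guards against tiny negative round-off
                const double K = std::sqrt(std::fabs(
                    (y1*y1 - 4*y2) * ((y2 + 4)*(y2 + 4) - 4*y1*y1)));
                // epsilon-tilde: 8 inside, 4 on edges, 1 or 2 at the corners of F
                const int onEdge = (s1 == 0) + (s2 == 0) + (2*s1 + s2 == M);
                int eps = 8;
                if (onEdge == 1) eps = 4;
                else if (onEdge >= 2) eps = (s2 == 0 && 2*s1 + s2 == M) ? 2 : 1;
                sum += eps * K * f(y1, y2);
            }
        return pi * pi / (4.0 * M * M) * sum;   // prefactor of (9) with h^0 = 0
    }

    int main()
    {
        // crude usage example: estimate the area of Omega via f = 1, cf. (8)
        std::printf("%.6f\n", cubatureC2(100, [](double, double) { return 1.0; }));
    }

For general $f$ this is the approximation (8); by (7), the rule with $t = 0$ is exact for integrands of the form $p\,w^0$ with $\deg_m p \leq 2M-1$.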
4. Clenshaw–Curtis cubature formulas
4.1. Clenshaw–Curtis method
Assuming that we have an interpolation of a function $f$ in terms of $P(\lambda,k^t)\in\Pi_M$ in the points $\Omega^t_M$, i.e.
\[ f \approx \sum_{\substack{\lambda\in P^+\\ \langle\lambda,\eta\rangle\leq M}} b^t_\lambda\, P(\lambda,k^t), \qquad f(y) = \sum_{\substack{\lambda\in P^+\\ \langle\lambda,\eta\rangle\leq M}} b^t_\lambda\, P(\lambda,k^t; y), \quad y\in\Omega^t_M, \]
we estimate a weighted integral of $f$ with a weight function $w$ over a domain $D\subset\Omega$ by
\[ \sum_{\substack{\lambda\in P^+\\ \langle\lambda,\eta\rangle\leq M}} b^t_\lambda \int_D P(\lambda,k^t; y)\, w(y)\, dy. \]
Such a construction of the Clenshaw–Curtis cubature rule implies the exact equality for any polynomial $f$ of m-degree at most $M$. Denoting
\[ A^t_\lambda(w) \equiv \int_D P(\lambda,k^t; y)\, w(y)\, dy, \]
the Clenshaw–Curtis cubature is thus given by
\[ \int_D f(y)\, w(y)\, dy \approx \sum_{\substack{\lambda\in P^+\\ \langle\lambda,\eta\rangle\leq M}} b^t_\lambda\, A^t_\lambda(w), \]
where the coefficients $b^t_\lambda$ and $A^t_\lambda(w)$ need to be determined.
The coefficients $b^t_\lambda$ are readily obtained using the discrete orthogonality relations of the orbit functions from [13, 14]. Denoting the order of the stabilizer of $\frac{\lambda+\varrho^t}{M+h^t}$ with respect to the dual affine Weyl group by $h^\vee_{\lambda+\varrho^t}$, it holds that
\[ b^t_\lambda = \frac{|\mathrm{Stab}_W(\lambda+\varrho^t)|^2}{c\,|W|\,(M+h^t)^n\, h^\vee_{\lambda+\varrho^t}} \sum_{y\in\Omega^t_M} \tilde\varepsilon(y)\, s^t(y)\, f(y)\, P(\lambda,k^t; y). \tag{10} \]
It remains to evaluate the integrals $A^t_\lambda(w)$, which depend on the chosen weight and the integration domain $D\subset\Omega$. Since the Jacobi polynomials have several properties connected to the domain $\Omega$ (e.g. continuous and discrete orthogonality), we firstly take $D = \Omega$. The cubature rules with the choice $w = w^t$ coincide, for any simple Lie algebra, with the formulas (7). The difference lies in the fact that the Clenshaw–Curtis method guarantees the exact equality only for polynomials up to m-degree $M$.

4.2. Integration domain Ω of C2
In this section, the Clenshaw–Curtis integration method is applied to the algebra $C_2$. The values of $|\mathrm{Stab}_W(\lambda+\varrho^t)|$ and $h^\vee_{\lambda+\varrho^t}$, needed in (10), are tabulated in Tab. 1.

Table 1: The values of $|\mathrm{Stab}_W(\lambda+\varrho^t)|$ and $h^\vee_{\lambda+\varrho^t}$ of $C_2$, where $\lambda+\varrho^t = \lambda_1\omega_1+\lambda_2\omega_2$ and $\lambda_0 \equiv M + h^t - \lambda_1 - 2\lambda_2$. Asterisks denote non-zero positive integers.

  [λ0, λ1, λ2]    |Stab_W(λ+ϱ^t)|    h^∨_{λ+ϱ^t}
  (*, *, *)              1                1
  (0, *, *)              1                2
  (*, 0, *)              2                2
  (*, *, 0)              2                2
  (0, 0, *)              2                4
  (0, *, 0)              2                8
  (*, 0, 0)              8                8

Since the choice of $w = w^t$ gives the standard cubature formulas, the next natural choice of the weight function is to set $w = 1$. In this case the coefficients $A^t_\lambda(1)$, denoted by $A^t_\lambda$, are expressed as the following integrals:
\[ A^t_\lambda = 2\pi^2 \begin{cases} \int_F C_\lambda(x)\, S_{\varrho^1}(x)\, dx & \text{if } t = 0,\\ \int_F S_{\lambda+\varrho^1}(x)\, dx & \text{if } t = 1,\\ \int_F S^s_{\lambda+\varrho^s}(x)\, S^l_{\varrho^l}(x)\, dx & \text{if } t = s,\\ \int_F S^l_{\lambda+\varrho^l}(x)\, S^s_{\varrho^s}(x)\, dx & \text{if } t = l. \end{cases} \tag{11} \]
The exact values of $A^t_\lambda$ are explicitly calculated in Tab. 2.

Table 2: The values of $A^t_\lambda$ (11) for $t\in\{0,1,s,l\}$ and $\lambda = \lambda_1\omega_1+\lambda_2\omega_2$; the indices $i,j$ are non-negative integers.

$A^0_{(\lambda_1,\lambda_2)}$:
• $(\lambda_1,\lambda_2) = (2i,2j)$: $\sum_{(\mu_1,\mu_2)\in M(2i,2j)} \frac{64\,(4\mu_2^2+4\mu_1\mu_2-3)}{|\mathrm{Stab}_W(2i,2j)|\,(\mu_1^2-1)(4\mu_2^2-1)(4\mu_2^2-9)}$;
• $(\lambda_1,\lambda_2) = (2i,2j+1)$: $\sum_{(\mu_1,\mu_2)\in M(2i,2j+1)} \frac{-64\,(4\mu_2^2+4\mu_1\mu_2+3)}{|\mathrm{Stab}_W(2i,2j+1)|\,(\mu_1^2-4)(4\mu_2^2-1)(4\mu_2^2-9)}$;
• otherwise 0.

$A^1_{(\lambda_1,\lambda_2)}$:
• $(2i,2j)$: $\frac{32(i+j+1)}{(2i+4j+3)(2j+1)(2i+1)}$;
• $(2i,2j+1)$: $\frac{32(j+1)}{(2i+4j+5)(2i+2j+3)(2i+1)}$;
• otherwise 0.

$A^s_{(\lambda_1,\lambda_2)}$:
• $(2i,2j)$: $\sum_{(\mu_1,\mu_2)\in M^s_1(2i+1,2j)} \frac{8(\mu_1+\mu_2)}{|\mathrm{Stab}_W(2i+1,2j)|\,\mu_2(\mu_1^2-1)(\mu_2^2-1)}$;
• $(2i,2j+1)$: $\sum_{(\mu_1,\mu_2)\in M^s_2(2i+1,2j+1)} \frac{-8(\mu_1+\mu_2)}{\mu_2(\mu_1^2-1)(\mu_2^2-1)}$;
• otherwise 0.

$A^l_{(\lambda_1,\lambda_2)}$:
• $(2i,2j)$: $\Big(\sum_{(\mu_1,\mu_2)\in M^l_1(2i,2j+1)} - \sum_{(\mu_1,\mu_2)\in M^l_2(2i,2j+1)}\Big) \frac{32\mu_2}{|\mathrm{Stab}_W(2i,2j+1)|\,\mu_1(4\mu_2^2-1)}$;
• $(2i,2j+1)$: $\Big(\sum_{(\mu_1,\mu_2)\in M^l_2(2i,2j+2)} - \sum_{(\mu_1,\mu_2)\in M^l_1(2i,2j+2)}\Big) \frac{16(2\mu_1\mu_2+1)}{|\mathrm{Stab}_W(2i,2j+2)|\,(\mu_1^2-1)(4\mu_2^2-1)}$;
• otherwise 0.

The index sets are
\[ M(\lambda_1,\lambda_2) = \big\{ (\lambda_1+\lambda_2, \tfrac{\lambda_1}{2}+\lambda_2),\ (\lambda_2, \tfrac{\lambda_1}{2}+\lambda_2),\ (\lambda_1+\lambda_2, \tfrac{\lambda_1}{2}),\ (\lambda_2, -\tfrac{\lambda_1}{2}) \big\}, \]
\[ M^s_1(\lambda_1,\lambda_2) = \big\{ (\lambda_2, \tfrac{\lambda_1}{2}+\lambda_2),\ (\lambda_2, -\tfrac{\lambda_1}{2}) \big\}, \qquad M^s_2(\lambda_1,\lambda_2) = \big\{ (\lambda_1+\lambda_2, \tfrac{\lambda_1}{2}+\lambda_2),\ (\lambda_1+\lambda_2, \tfrac{\lambda_1}{2}) \big\}, \]
\[ M^l_1(\lambda_1,\lambda_2) = \big\{ (\lambda_1+\lambda_2, \tfrac{\lambda_1}{2}+\lambda_2),\ (\lambda_2, \tfrac{\lambda_1}{2}+\lambda_2) \big\}, \qquad M^l_2(\lambda_1,\lambda_2) = \big\{ (\lambda_1+\lambda_2, \tfrac{\lambda_1}{2}),\ (\lambda_2, -\tfrac{\lambda_1}{2}) \big\}. \]

4.3. Triangular domain of C2
The next choice of the domain $D$, for which we derive the Clenshaw–Curtis cubature rules, is the triangle $T\subset\Omega$ depicted in Fig. 6 and given explicitly by
\[ T \equiv \Big\{ (y_1,y_2) \;\Big|\; y_2\leq 0,\ -\tfrac{y_2}{2}-2 \leq y_1 \leq \tfrac{y_2}{2}+2 \Big\}. \]
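As a quick consistency check (added here; it is implicit in Fig. 6 of the original), the two slanted sides of $T$ lie on the straight parts of $\partial\Omega$:
\[ y_1 = \tfrac{y_2}{2}+2 \iff y_2 = 2y_1-4, \qquad y_1 = -\tfrac{y_2}{2}-2 \iff y_2 = -2y_1-4, \]
so the vertices of $T$ are $(\pm 2, 0)$ and $(0,-4)$. The vertex $(0,-4)$ is the common corner of $T$ and $\Omega$, while $(\pm 2, 0)$ lie on the two lines strictly below the parabola $y_2 = y_1^2/4$, confirming that $T$ is inscribed in $\Omega$.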
Figure 6: The domain bounded by the two lines and the parabola, with the inscribed triangle $T$, corresponds to the integration region $\Omega$ of $C_2$.

Choosing the weight function $w = w^t$, the integrals $A^t_\lambda(w^t)$ are calculated by a change of variables induced by the map $\Xi$. Denoting $\tilde A^t_\lambda \equiv A^t_\lambda(w^t)$, it holds that
\[ \tilde A^t_\lambda = 2\pi^2 \int_P S^t_{\lambda+\varrho^t}(x)\, S^t_{\varrho^t}(x)\, dx, \tag{12} \]
where $P$ is the pre-image of the triangle $T$ under the map $\Xi$. This pre-image $P$, depicted as a square in Fig. 7, contains the points $b_1\omega_1^\vee + b_2\omega_2^\vee$ satisfying $2b_1+b_2\geq \frac12$, $2b_1+b_2\leq 1$, $b_2\geq 0$ and $b_2\leq \frac12$. The exact values of $\tilde A^t_\lambda$ are tabulated in Tab. 3.

Figure 7: The fundamental domain $F$ corresponding to $C_2$ is depicted as the triangle containing the square $P$ with the boundaries $\alpha$, $\beta$, $\gamma$ and $\delta$.

Table 3: The values of $\tilde A^t_\lambda$, given by (12), for $t\in\{0,1,s,l\}$ and $\lambda = \lambda_1\omega_1+\lambda_2\omega_2$. The indices $i,j$ are non-negative integers.

$\tilde A^0_{(\lambda_1,\lambda_2)}$:
• $(0,0)$: $\frac{\pi^2}{4}$;
• $(2i,2j+1)$: $(-1)^{i+1}\, \frac{8}{|\mathrm{Stab}_W(2i,2j+1)|\,(2i+2j+1)(2j+1)}$;
• otherwise 0.

$\tilde A^1_{(\lambda_1,\lambda_2)}$:
• $(0,0)$: $2\pi^2 + \frac{128}{9}$;
• $(2i,2j)$, $i+j\neq 0$: $(-1)^{i+1}\, \frac{128(i+j+1)}{(2i+2j+1)(2j-1)(2i+2j+3)(2j+3)}$;
• $(2i,2j+1)$: $(-1)^{i+1}\, \frac{128(j+1)}{(2j+1)(2i+2j+1)(2j+3)(2i+2j+5)}$;
• otherwise 0.

$\tilde A^s_{(\lambda_1,\lambda_2)}$:
• $(0,0)$: $\pi^2 + 8$;
• $(2i,2j)$, $i+j\neq 0$: $(-1)^{i+1}\, \frac{16}{|\mathrm{Stab}_W(2i+1,2j)|\,(2j+1)(2i+2j+1)(2j-1)}$;
• $(2i,2j+1)$: $(-1)^{i+1}\, \frac{16}{(2j+1)(2i+2j+3)(2i+2j+1)}$;
• otherwise 0.

$\tilde A^l_{(\lambda_1,\lambda_2)}$:
• $(0,0)$: $\pi^2$;
• $(2i,2j+1)$: $(-1)^{i+1}\, \frac{128(i+j+1)(j+1)}{|\mathrm{Stab}_W(2i,2j+2)|\,(2i+2j+3)(2j+3)(2i+2j+1)(2j+1)}$;
• otherwise 0.

Finally, choosing $w = 1$, we calculate the coefficients $B^t_\lambda \equiv A^t_\lambda(1)$ on $T$ as the following integrals:
\[ B^t_\lambda = 2\pi^2 \begin{cases} \int_P C_\lambda(x)\, S_{\varrho^1}(x)\, dx & \text{if } t = 0,\\ \int_P S_{\lambda+\varrho^1}(x)\, dx & \text{if } t = 1,\\ \int_P S^s_{\lambda+\varrho^s}(x)\, S^l_{\varrho^l}(x)\, dx & \text{if } t = s,\\ \int_P S^l_{\lambda+\varrho^l}(x)\, S^s_{\varrho^s}(x)\, dx & \text{if } t = l. \end{cases} \tag{13} \]
The exact values of $B^t_\lambda$ are tabulated in Tab. 4.

Table 4: The values of $B^t_\lambda$, given by (13), for $t\in\{0,1,s,l\}$ and $\lambda = \lambda_1\omega_1+\lambda_2\omega_2$. The indices $i,j$ are non-negative integers.

$B^0_{(\lambda_1,\lambda_2)}$:
• $(0,1)$: $-\frac{32}{3}$;
• $(2,0)$: $-\frac{16}{3}$;
• $(2i,2)$, $i\neq 0$: $\frac{16\,(1+(-1)^{i+1})}{3i(i+1)}$;
• $(2i,1)$, $i\neq 0$: $16\Big[ \frac{2}{(2i+3)(2i-1)} + \frac{1+(2i+1)(-1)^{i+1}}{3i(i+1)} \Big]$;
• $(2i,2j+1)$, $j\neq 0$: $\frac{64}{|\mathrm{Stab}_W(2i,2j+1)|}\Big[ \frac{-1+(2j+1)(-1)^{j}}{\left((2i+2j+1)^2-4\right)\left((2j+1)^2-1\right)} + \frac{-1+(2i+2j+1)(-1)^{i+j}}{\left((2i+2j+1)^2-1\right)\left((2j+1)^2-4\right)} \Big]$;
• $(2i,2j)$, $j\neq 1$, $i+j\neq 1$: $\frac{16}{|\mathrm{Stab}_W(2i,2j)|}\Big[ \frac{1+(-1)^{i+j}}{\left((i+j)^2-1\right)(4j^2-1)} + \frac{1+(-1)^{j}}{\left(4(i+j)^2-1\right)(j^2-1)} \Big]$;
• otherwise 0.

$B^1_{(\lambda_1,\lambda_2)}$:
• $(2i,2j)$: $\frac{4\,(1+(-1)^{i+j})}{(i+j+1)(2j+1)}$;
• $(2i,2j+1)$: $\frac{-4\,(1+(-1)^{j})}{(2i+2j+3)(j+1)}$;
• otherwise 0.

$B^s_{(\lambda_1,\lambda_2)}$:
• $(0,0)$: 8;
• $(2i,1)$: $\frac{16}{(2i+3)(2i+1)}$;
• $(2i,2j)$, $i+j\neq 0$: $\frac{8\,\big(1+(2i+2j+1)(-1)^{i+j+1}\big)}{|\mathrm{Stab}_W(2i+1,2j)|\,(i+j+1)(i+j)(4j^2-1)}$;
• $(2i,2j+1)$, $j\neq 0$: $\frac{8\,\big(-1+(2j+1)(-1)^{j}\big)}{(2i+2j+3)(2i+2j+1)\,j(j+1)}$;
• otherwise 0.

$B^l_{(\lambda_1,\lambda_2)}$:
• $(0,0)$: 8;
• $(2i,0)$, $i\neq 0$: $8\Big[ \frac{2i+1+(-1)^{i+1}}{2i(i+1)} + \frac{1}{2i+1} \Big]$;
• $(2i,2j)$, $j\neq 0$: $\frac{4}{|\mathrm{Stab}_W(2i,2j+1)|}\Big[ \frac{2i+2j+1+(-1)^{i+j+1}}{(i+j+1)(2j+1)(i+j)} + \frac{2j+1+(-1)^{j+1}}{(2i+2j+1)(j+1)\,j} \Big]$;
• $(2i,2j+1)$: $\frac{16}{|\mathrm{Stab}_W(2i,2j+2)|}\Big[ \frac{\big(-1+(-1)^{j+1}\big)(i+j+1)}{(2i+2j+3)(j+1)(2i+2j+1)} + \frac{\big(-1+(-1)^{i+j+1}\big)(j+1)}{(i+j+1)(2j+3)(2j+1)} \Big]$;
• otherwise 0.

We choose the following functions as model functions for the numerical tests:
\[ g_1(y_1,y_2) = r_1\big( y_1^{20} - y_1 y_2 + y_2^{20} \big), \qquad g_2(y_1,y_2) = r_2\, e^{-\left(y_1^2 + (y_2+1.8)^2\right)/(2\cdot 0.35^2)}, \]
\[ g_3(y_1,y_2) = \frac{r_3}{1+y_1^2+y_2^2}, \qquad g_4(y_1,y_2) = r_4\, e^{y_1+y_2}, \]
each value of ri ∈ r is set to satisfy the normalization condition ∫ t gi(y) dy = 1. we compute the approximations itm (gi) = ∑ λ∈p+ 〈λ,η〉≤m btλb t λ (14) of ∫ t gi(y) dy = 1 with the formula for btλ given by (13). figs. 8 and 9 show for t ∈ {0, 1,s, l} the graphs of of the absolute value of the difference |1 −itm (gi)|. 5. concluding remarks (1.) establishing the explicit connection between the jacobi and macdonald polynomials and the weyl group orbit functions in section 2.3 forms a crucial step for generalizing known cubature formulas to the entire class of the jacobi polynomials. (2.) numerical tests results in figs. 5 and 8 indicate in general excellent convergence rates of the developed cubature rules including their clenshaw-curtis 211 l. háková, j. hrivnák, l. motlochová acta polytechnica figure 8. the graphs of error values |1−itm (gi)| of the integral ∫ t gi(y) dy = 1, i = 1, . . . , 4 and its approximations itm (gi), m = 10, 11, . . . , 50 given by (14). the values for t = 0, 1,s, l are depicted as circles, “+”, diamonds and “×”, respectively. figure 9. the graphs of error values |1 −itm (g5)| of the integral ∫ t g5(y) dy = 1 and its approximations itm (g5), m = 10, 15, 20, . . . , 170 given by (14). the values for t = 0, 1,s, l are depicted as circles, “+”, diamonds and “×”, respectively. versions for the case c2. the convergence rate of the multidimensional step-functions, even though less uniform, still appears to be very good. developing similar methods for the two-variable case g2 and extending the rules to higher dimensions poses an open problem. (3.) the hyperinterpolation methods [3, 25, 34, 35] are among the tools which directly use cubature rules. for the standard cubature rules of the weyl group orbit functions, several tests with very good results are also performed in [15]. developing and testing hyperinterpolation methods for the presented cubature rules merits further study. (4.) the present work demonstrates wide variety of possibilities of constructing the cubature rules in the orbit functions setting. comparison of the developed methods is necessary for establishing range of their viable applications. especially, comparison of the gauss and clenshaw-curtis cubature methods, similar to [37], regarding their efficiency, speed, model function and integration domain dependence merits further research. 6. acknowledgments lm and jh gratefully acknowledge the support of this work by rvo68407700. references [1] h. berens, h. j. schmid, y. xu, multivariate gaussian cubature formulae, arch. math. (basel) 64 (1995), no. 1, 26–32, doi:10.1007/bf01193547. [2] n. bourbaki, groupes et algèbres de lie, chapitres iv, v, vi, hermann, paris, 1968. [3] m. caliari, s. de marchi, m. vianello, hyperinterpolation in the cube, comput. & math. with appl. 55 (2008), 2490–2497, doi:10.1016/j.camwa.2007.10.003. [4] d. chernyshenko, h. fangohr, computing the demagnetizing tensor for finite difference micromagnetic simulations via numerical integration, j. magn. magn. mat. 381 (2015) 440–445, doi:10.1016/j.jmmm.2015.01.013. [5] c.w. clenshaw, a.r. curtis, a method for numerical integration on an automatic computer, numer. math. 2 (1960), 197–205, doi:10.1007/bf01386223. [6] r. cools, an encyclopaedia of cubature formulas, journal of complexity 19 (2003), 445–453, doi:10.1016/s0885-064x(03)00011-6. [7] r. cools, i. p. mysovskikh, h. j. schmid, cubature formulae and orthogonal polynomials, j. comput. appl. math. 127 (2001), no. 1-2, 121–152, doi:10.1016/s0377-0427(00)00495-7. 
212 http://dx.doi.org/10.1007/bf01193547 http://dx.doi.org/10.1016/j.camwa.2007.10.003 http://dx.doi.org/10.1016/j.jmmm.2015.01.013 http://dx.doi.org/10.1007/bf01386223 http://dx.doi.org/10.1016/s0885-064x(03)00011-6 http://dx.doi.org/10.1016/s0377-0427(00)00495-7 vol. 56 no. 3/2016 on cubature rules associated to weyl group orbit functions [8] p. de la harpe, c. pache, b. venkov, construction of spherical cubature formulas using lattices, algebra i analiz 18 (2006), no. 1, 162–186; reprinted in st. petersburg math. j. 18 (2007), no. 1, 119–139, doi:10.1090/s1061-0022-07-00946-6. [9] d. c. handscomb, j. c. mason, chebyshev polynomials, chapman&hall/crc, usa, 2003, doi:10.1201/9781420036114. [10] l. háková, j. hrivnák, j. patera, four families of weyl group orbit functions of b3 and c3, j. math. phys. 54 (2013), 083501, 19, doi:10.1063/1.4817340. [11] g. heckman, h. schlichtkrull, harmonic analysis and special functions on symmetric spaces, academic press inc., san diego, 1994. [12] g. heckman, e. m. opdam, root systems and hypergeometric functions. i, ii, composition math. 64 (1987), 329-373. [13] j. hrivnák, j. patera, on discretization of tori of compact simple lie groups, j. phys. a: math. theor. 42 (2009) 385208, doi:10.1088/1751-8113/42/38/385208. [14] j. hrivnák, l. motlochová, j. patera, on discretization of tori of compact simple lie groups ii., j. phys. a 45 (2012), 255201, 18, doi:10.1088/1751-8113/45/25/255201. [15] j. hrivnák, l. motlochová, j. patera, cubature formulas of multivariate polynomials arising from symmetric orbit functions, arxiv:1512.01710. [16] j. e. humphreys, introduction to lie algebras and representation theory, spinger-verlag, new york, 1978, doi:10.1007/978-1-4612-6398-2. [17] a. klimyk, j. patera, (anti)symmetric multivariate trigonometric functions and corresponding fourier transforms, j. math. phys. 48 (2007), 093504, 24, doi:10.1063/1.2779768. [18] a. u. klimyk, j. patera, orbit functions, sigma 2 (2006), 006, 60 pages, doi:10.3842/sigma.2006.006. [19] a. u. klimyk, j. patera, antisymmetric orbit functions, sigma 3 (2007), paper 023, 83 pages, doi:10.3842/sigma.2007.023. [20] t. h. koornwinder, two-variable analogues of the classical orthogonal polynomials, theory and application of special functions, edited by r. a. askey, academic press, new york (1975) 435–495, doi:10.1016/b978-0-12-064850-4.50015-x. [21] h. li, j. sun, y. xu, discrete fourier analysis and chebyshev polynomials with g2 group, sigma 8, paper 067, 29, 2012, doi:10.3842/sigma.2012.067. [22] h. li, j. sun, y. xu, discrete fourier analysis, cubature and interpolation on a hexagon and a triangle, siam j. numer. anal. 46 (2008), 1653–1681, doi:10.1137/060671851. [23] h. li, y. xu, discrete fourier analysis on fundamental domain and simplex of ad lattice in d-variables, j. fourier anal. appl. 16, 383–433, (2010), doi:10.1007/s00041-009-9106-9. [24] i. g. macdonald, orthogonal polynomials associated with root systems, sém. lothar. combin. 45 (2000/01), art. b45a, 40, doi:10.1007/978-94-009-0501-6_14. [25] s. de marchi, m. vianello, y. xu, new cubature formulae and hyperinterpolation in three variables, bit. numerical mathematics 49 (2009), number 1, 55–73, doi:10.1007/s10543-009-0210-7. [26] r. v. moody, l. motlochová, j. patera, gaussian cubature arising from hybrid characters of simple lie groups , j. fourier anal. appl. 20 (2014), issue 6, 1257–1290, doi:10.1007/s00041-014-9355-0. [27] r. v. moody, j. 
[27] R. V. Moody, J. Patera: Cubature formulae for orthogonal polynomials in terms of elements of finite order of compact simple Lie groups. Adv. in Appl. Math. 47 (2011), 509–535, doi:10.1016/j.aam.2010.11.005.
[28] R. V. Moody, J. Patera: Orthogonality within the families of C-, S-, and E-functions of any compact semisimple Lie group. SIGMA 2 (2006), 076, 14 pages, doi:10.3842/sigma.2006.076.
[29] L. Motlochová: Special functions of Weyl groups and their continuous and discrete orthogonality. Ph.D. thesis, Université de Montréal (2014), http://hdl.handle.net/1866/11153.
[30] H. Z. Munthe-Kaas, M. Nome, B. N. Ryland: Through the kaleidoscope: symmetries, groups and Chebyshev-approximations from a computational point of view. In: Foundations of Computational Mathematics, Budapest 2011, 188–229, London Math. Soc. Lecture Note Ser. 403, Cambridge Univ. Press, Cambridge, 2013.
[31] B. N. Ryland, H. Z. Munthe-Kaas: On multivariate Chebyshev polynomials and spectral approximation on triangles. In: Spectral and High Order Methods for Partial Differential Equations, Lecture Notes in Computational Science and Engineering, Springer, 2011, doi:10.1007/978-3-642-15337-2_2.
[32] H. J. Schmid, Y. Xu: On bivariate Gaussian cubature formulae. Proc. Amer. Math. Soc. 122 (1994), 833–841, doi:10.2307/2160762.
[33] I. Stevanovic, F. Merli, P. Crespo-Valero, W. Simon, S. Holzwarth, M. Mattes, J. R. Mosig: Integral equation modeling of waveguide-fed planar antennas. IEEE Antenn. Propag. M. 51 (2009), 82–92, doi:10.1109/map.2009.5433099.
[34] I. H. Sloan: Polynomial interpolation and hyperinterpolation over general regions. J. Approx. Theory 83 (1995), no. 2, 238–254, doi:10.1006/jath.1995.1119.
[35] A. Sommariva, M. Vianello, R. Zanovello: Nontensorial Clenshaw–Curtis cubature. Numer. Algorithms 49 (2008), no. 1–4, 409–427, doi:10.1007/s11075-008-9203-x.
[36] G. Szegő: Orthogonal Polynomials. American Mathematical Society, Providence, R.I., 1975.
[37] L. N. Trefethen: Is Gauss quadrature better than Clenshaw–Curtis? SIAM Rev. 50 (2008), 67–87, doi:10.1137/060659831.
[38] J. Waldvogel: Fast construction of the Fejér and Clenshaw–Curtis quadrature rules. BIT 46 (2006), no. 1, 195–202, doi:10.1007/s10543-006-0045-4.
[39] J. C. Young, S. D. Gedney, R. J. Adams: Quasi-mixed-order prism basis functions for Nyström-based volume integral equations. IEEE Trans. Magn. 48 (2012), 2560–2566, doi:10.1109/tmag.2012.2197634.
Acta Polytechnica 61(SI):122–134, 2021, doi:10.14311/ap.2021.61.0122
© 2021 The author(s). Licensed under a CC-BY 4.0 licence. Published by the Czech Technical University in Prague.

TNL: numerical library for modern parallel architectures

Tomáš Oberhuber*, Jakub Klinkovský, Radek Fučík
Czech Technical University in Prague, Faculty of Nuclear Sciences and Physical Engineering, Department of Mathematics, Trojanova 13, 120 00 Praha, Czech Republic
* Corresponding author: tomas.oberhuber@fjfi.cvut.cz

Abstract. We present the Template Numerical Library (TNL, www.tnl-project.org) with native support for modern parallel architectures like multi-core CPUs and GPUs. The library offers an abstract layer for accessing these architectures via a unified interface tailored for easy and fast development of high-performance algorithms and numerical solvers. The library is written in C++ and it benefits from template meta-programming techniques. In this paper, we present the most important data structures and algorithms in TNL together with scalability on multi-core CPUs and speed-up on GPUs supporting CUDA.

Keywords: parallel computing, GPU, explicit schemes, semi-implicit schemes, C++ templates.

1. Introduction
TNL aims to be an efficient and user-friendly library for numerical simulations. To fulfill this goal, it must support modern parallel architectures like GPUs (graphics processing units) and multi-core CPUs on one hand, and offer a simple and flexible interface for the implementation of complex numerical solvers on the other hand. If high computational efficiency is required, we cannot follow the typical rules of object-oriented programming, which usually lead to an inefficient organization of data in the computer memory; for example, the use of virtual methods may lower the performance of the final code. The design of TNL profits from the advantages of C++ templates.
Template specializations in particular are a natural tool for generating specialized, architecture-dependent code with no run-time overhead. There is no doubt that modern numerical libraries must support accelerators like GPUs. They provide high memory bandwidth as well as great computational power, which is obtained not by high clock frequencies but by a massively parallel design, which is more power efficient. Programming of GPUs is easier with the CUDA framework. Nevertheless, deep knowledge of the GPU hardware is still necessary to produce efficient code, which makes the GPUs almost unavailable to many experts in numerical mathematics.
Adding support for GPUs to existing numerical libraries is nearly impossible. The GPU architecture is so different from the CPU that most of the numerical methods and algorithms must be completely rewritten. In CUDA, we deal with two different address spaces: one associated with the CPU and the other with the GPU. Communication between them is remarkably slow and it must be fully managed by the programmer. To get the maximum performance from the GPU, the programmer must take care of the organization of data in the memory, minimize the divergence of CUDA threads, deal with the limited shared memory, and many other details of the GPU design [1].
In some cases, one may use the GPU for numerical computing relatively easily. An implicit time discretization of linear problems allows one to construct the linear system once on the CPU, transfer it to the GPU and solve it repeatedly there by some iterative solver like CG (conjugate gradients). Such solvers require only common linear-algebraic operations implemented in libraries like cuBLAS or cuSPARSE, which are part of the CUDA Toolkit. Difficulties arise with non-linear problems, where the linear-system matrix must be updated in each time step. Transfer of the matrix from the CPU to the GPU makes any speed-up impossible. Therefore the matrix must be assembled on the GPU. This process requires manipulation with an underlying format for sparse matrices and also efficient access to the numerical mesh. Neither is trivial on the GPU.
Table 1 presents a comparison of TNL with several other HPC libraries like cuBLAS [2], cuSPARSE [3], Thrust [4], Kokkos [5], ViennaCL [6], PETSc [7] and Eigen [8]. We chose primarily libraries with GPU support. The table shows which of them have a modern interface in C++ and which of them support distributed computation using MPI. Parallel for, parallel reduction and scan are emerging programming patterns in HPC which allow the same code to serve different parallel architectures. Sorting is another common operation defined on arrays or vectors. BLAS operations are well known in the HPC community. BLAS is a standard library which, however, does not profit from the modern features of C++. Especially BLAS level 1 operations can be better expressed with expression templates, as the sketch below illustrates.
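The following toy example illustrates the expression-templates technique referred to above; it is a generic illustration of the idea, not TNL's actual implementation, and all names in it are ours. The sum x + y + x builds lightweight expression objects and is evaluated in a single fused loop on assignment, avoiding the temporary vectors that a naive operator+ returning a full vector would allocate.

    #include <cstddef>
    #include <vector>

    // lazy node representing the element-wise sum of two operands
    template <typename L, typename R>
    struct Sum {
        const L& l;
        const R& r;
        double operator[](std::size_t i) const { return l[i] + r[i]; }
        std::size_t size() const { return l.size(); }
    };

    struct Vec {
        std::vector<double> data;
        explicit Vec(std::size_t n, double v = 0.0) : data(n, v) {}
        double  operator[](std::size_t i) const { return data[i]; }
        double& operator[](std::size_t i)       { return data[i]; }
        std::size_t size() const { return data.size(); }
        // assignment from any expression: one fused loop, no temporaries
        template <typename E>
        Vec& operator=(const E& e) {
            for (std::size_t i = 0; i < size(); i++)
                data[i] = e[i];
            return *this;
        }
    };

    // building an expression node instead of computing the sum eagerly
    template <typename L, typename R>
    Sum<L, R> operator+(const L& l, const R& r) { return { l, r }; }

    int main()
    {
        Vec x(5, 1.0), y(5, 2.0), z(5);
        z = x + y + x;   // each element computed as 1 + 2 + 1 in a single pass
    }

The same mechanism generalizes to scalar multiples and longer BLAS-1 expressions; the compiler inlines the whole expression tree, so the fused loop is as fast as a hand-written one.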
Sparse matrices belong, with no doubt, to one of the most important data structures in HPC. GPUs are very sensitive to the sparse matrix pattern. Regardless of the great advance in the design of sparse matrix formats for GPUs, it seems that there is no such format which would dominate the others on a large class of sparse matrices. It is profitable for a numerical library to offer several different sparse matrix formats. Having implemented the sparse matrix-vector multiplication, it is relatively simple to incorporate the sparse matrices into iterative linear-system solvers. Efficient preconditioning for GPUs is still quite rare, though very important. Good preconditioning on the CPU can easily outperform the high memory bandwidth and computational performance of GPUs. Nonlinear system solvers, eigenvalue computations or ODE system solvers are slightly less common compared to linear system solvers; nevertheless, these algorithms belong to the core of HPC as well. Stencil computations cover a large class of numerical algorithms performed on structured rectangular grids. Unstructured meshes are necessary for finite volume or finite element methods. Some libraries also cooperate with Python to offer a more efficient user interface.

Table 1: Comparison of TNL with other numerical libraries.

                            TNL   cuBLAS  cuSPARSE  Thrust  Kokkos  ViennaCL  PETSc  Eigen
  GPU                       yes   yes     yes       yes     yes     yes       weak   no
  MPI                       weak  no      no        no      no      no        yes    no
  C++                       yes   no      no        yes     yes     yes       no     yes
  parallel for              yes   no      no        yes     yes     no        no     no
  parallel reduction, scan  yes   no      no        yes     yes     no        no     no
  sorting                   no    yes     no        yes     yes     no        yes    no
  BLAS 1                    yes   yes     yes       no      yes     yes       yes    yes
  BLAS 2                    weak  yes     yes       no      yes     yes       yes    yes
  BLAS 3                    no    yes     yes       no      no      yes       yes    yes
  expression templates      yes   no      no        no      no      yes       no     yes
  sparse matrices           yes   no      yes       no      yes     yes       yes    yes
  linear system solvers     yes   no      weak      no      no      yes       yes    yes
  preconditioners           weak  no      no        no      no      yes       yes    weak
  nonlinear system solvers  no    no      no        no      no      yes       yes    no
  eigenvalue computation    no    no      no        no      no      yes       no     no
  ODE system solvers        yes   no      no        no      no      no        yes    no
  stencil computations      yes   no      no        no      no      no        yes    no
  unstructured meshes       yes   no      no        no      no      no        yes    no
  Python interface          weak  no      no        no      no      yes       yes    yes

Similar to PETSc or ViennaCL, we believe that the main advantage of TNL is that it offers a wider class of data structures and algorithms with a unified interface. For decades, numerical libraries were developed in a very modular way, and the users at the end must combine different libraries to solve the problem at hand. This is becoming more difficult on modern architectures like GPUs. Modern features of C++ can make it easier. Our aim is to develop a library with an STL-like consistent templated user interface and native support of GPUs, which would make the development of HPC algorithms more efficient.
The rest of the paper is organized as follows. First, we briefly explain the basics of GPU programming (Section 2). Then, we describe some basic data structures and solvers already implemented in TNL (Section 3). Finally, we demonstrate the performance of TNL by showing the scalability on multi-core CPUs and the speed-up of GPUs (Section 4).

2. Programming GPUs
The GPU is an accelerator connected to the CPU via the PCI Express interface. It is equipped with its own memory, referred to as the global memory. Though its memory bandwidth is several times higher compared to the common DDR4 memory connected to the CPU, communication between the GPU and the CPU is remarkably slow. This limits the design of algorithms for GPUs, because frequent communication between the GPU and the CPU may negatively affect the overall performance.
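As a concrete illustration of the two address spaces, the following minimal sketch uses the standard CUDA runtime API (an added example, not code from the paper): the host must allocate device memory and move the data across the PCI Express bus explicitly.

    #include <cuda_runtime.h>
    #include <cstdio>

    __global__ void scale(double* a, int n, double q)
    {
        const int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            a[i] *= q;
    }

    int main()
    {
        const int n = 1 << 20;
        double* host = new double[n];
        for (int i = 0; i < n; i++)
            host[i] = 1.0;

        double* dev = nullptr;
        cudaMalloc(&dev, n * sizeof(double));                               // global memory on the GPU
        cudaMemcpy(dev, host, n * sizeof(double), cudaMemcpyHostToDevice);  // slow PCIe transfer
        scale<<<(n + 255) / 256, 256>>>(dev, n, 2.0);                       // kernel works on device data
        cudaMemcpy(host, dev, n * sizeof(double), cudaMemcpyDeviceToHost);  // slow PCIe transfer back
        cudaFree(dev);
        std::printf("%f\n", host[0]);
        delete[] host;
    }

In an iterative PDE solver, the two cudaMemcpy calls should bracket the whole time loop rather than a single step, which is exactly the strategy described in the next paragraph.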
In the case of iterative numerical solvers for PDEs, it is often necessary to copy all necessary data to the GPU before the start of the iterations. The result is then copied back to the CPU for post-processing. Data in the global memory of the GPU need to be accessed in large continuous blocks (this is referred to as coalesced memory access); random access is slower by an order of magnitude. Therefore, the data must often be organized in a completely different way than on the CPU. This is also the reason why porting an older code to the GPU is so difficult.
The GPU consists of several independent multiprocessors which cannot communicate with each other. They can access the global memory, and each multiprocessor has its own shared memory, which is remarkably smaller (tens of kilobytes only) than the global memory, but much faster in comparison. The shared memory can work as a cache or it can be managed by the programmer. Each multiprocessor may process 32 CUDA threads simultaneously, which are referred to as a warp. Threads in the same warp behave like the SIMD architecture, i.e., they should follow the same instruction at the same time. If not, we call it divergent threads. In this case, efficiency is diminished due to serialization. For more details about CUDA we refer to [1].

3. Data structures and algorithms in TNL
In this section, we describe the basic data structures and algorithms implemented in TNL. In the following text, we refer to the GPU as the device and to the CPU as the host. Methods declared as __cuda_callable__ can be executed on both the device and the host. If the CUDA framework is not installed, this attribute has no effect. TNL uses template parameters; a few of them are present in the majority of the code:
• Device can be either TNL::Devices::Host or TNL::Devices::Cuda. It defines where the data are going to be stored and where the related algorithms will be executed.
• Real defines the precision of the floating-point arithmetic (float or double).
• Index defines the integer type used for indexing within a data structure/algorithm (int or long int).
• Allocator controls the allocation of memory. It can be TNL::Allocators::Host for the allocation of memory on the host system, TNL::Allocators::Cuda for the allocation of memory on the CUDA device, TNL::Allocators::CudaHost for the allocation of page-locked memory (a part of the memory that cannot be swapped out) using CUDA, and TNL::Allocators::CudaManaged for the allocation of CUDA unified memory (a memory space shared between the CPU and the GPU).

3.1. Arrays and vectors
Arrays are the basic structures for memory management in TNL. An array is a template TNL::Containers::Array< Value, Device, Index, Allocator > with the template parameters Value (the array element type), Device (the device where the array elements are stored), Index (the indexing type) and Allocator (controlling the memory allocation). The array provides common methods such as allocation (setSize), comparison and assignment operators, or I/O methods (binary load and save). Array elements may be manipulated by setElement, getElement and the __cuda_callable__ operator[]. The first two methods can be called from the host even for arrays allocated on the device, but they cannot be called from CUDA kernels. The last one, on the other hand, is defined as __cuda_callable__, and it can be called from CUDA kernels only if the array is allocated on the device, and from the host system only if the array is allocated on the host.
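A short usage sketch based on the interface just described follows. The header path is our assumption (it matches the layout of the TNL sources), and the element-wise setElement/getElement transfer is meant only to illustrate host-side access to device data; it is far too slow for real data movement.

    #include <TNL/Containers/Array.h>   // assumed header location

    using namespace TNL;

    int main()
    {
        // an array allocated on the host
        Containers::Array< double, Devices::Host, int > a;
        a.setSize( 5 );
        for( int i = 0; i < 5; i++ )
            a[ i ] = i;   // operator[] is legal here: the data live on the host

        // an array allocated on the GPU: from the host, its elements are
        // accessible only via setElement/getElement
        Containers::Array< double, Devices::Cuda, int > b;
        b.setSize( 5 );
        for( int i = 0; i < 5; i++ )
            b.setElement( i, a.getElement( i ) );
        return 0;
    }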
A method bind takes a pointer to an array of a given size or an instance of another array. An array adopted in this way is not deallocated in the array destructor. Such a mechanism serves for data sharing between several arrays or for wrapping data allocated outside of TNL. It also serves for partitioning large arrays into a set of smaller ones while keeping them allocated as one continuous block in the memory.
Vectors in TNL (TNL::Containers::Vector< Real, Device, Index, Allocator >) extend arrays with vector operations from linear algebra like vector addition, scalar products and norms, but also the prefix-sum (scan) [9]. The BLAS level 1 operations are handled by expression templates, which are more general and user-friendly with no loss of performance.

3.2. Matrices
TNL supports dense matrices as well as several formats for sparse ones (in the namespace TNL::Matrices). All formats are available for both the host system and the CUDA device. Optimized sparse matrix formats are crucial especially for a good efficiency of the GPU solvers. The user may choose between tridiagonal and multi-diagonal matrices, the Ellpack format [10], the Sliced Ellpack format [11, 12] (in [11], we referred to the Sliced Ellpack format as Row-grouped CSR; it is, however, almost identical to Sliced Ellpack from [12], and since it uses padding zeros, the name Ellpack seems to be more convenient), the Chunked Ellpack format [13], the BiEll format [14] and the CSR format (for the GPU, it is currently in an experimental form).
The sparse matrix formats are usually optimized for the matrix-vector multiplication. However, the matrix construction time is also important, especially for non-linear problems where the linear system must be recomputed in each time step. In general, the insertion of a matrix element can be very expensive for the majority of the sparse matrix formats: global changes of the data structure or even a reallocation can be evoked. Fortunately, in many applications the sparse matrix pattern does not change during the computation. In TNL, the matrix assembling process proceeds in two steps (a short sketch follows below):
(1.) Matrix format meta-data initiation is based on the information about the number of non-zero elements in each row (we refer to it as compressed row lengths or matrix row capacities). The user calls a method setCompressedRowLengths, which accepts a vector having the same size as the number of matrix rows; the i-th element of the vector gives the number of non-zero elements in the i-th row. The matrix format is initialized based on this information. Most importantly, it means that the necessary memory for each matrix row is allocated. If the matrix pattern does not change, this method can be called only once.
(2.) Matrix elements insertion consists of the insertion of the non-zero matrix elements. Since the necessary memory for each row is already allocated, this can be done in parallel. Each row can only have as many non-zero elements as its capacity allows.
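The two-step assembly may look as follows. This is a sketch only: setCompressedRowLengths and setElement are described in the text, while the header paths, the CSR class name and the setDimensions/setValue helpers are our assumptions about the TNL interface of that time.

    #include <TNL/Containers/Vector.h>   // assumed header locations
    #include <TNL/Matrices/CSR.h>

    using namespace TNL;

    int main()
    {
        Matrices::CSR< double, Devices::Host, int > matrix;
        matrix.setDimensions( 4, 4 );                     // assumed method name

        // step 1: set the row capacities -- here 3 non-zeros per row
        Containers::Vector< int, Devices::Host, int > rowLengths;
        rowLengths.setSize( 4 );
        rowLengths.setValue( 3 );                         // assumed helper
        matrix.setCompressedRowLengths( rowLengths );

        // step 2: insert the elements of a 1D Laplacian stencil; the
        // capacities are fixed now, so the rows could be filled in parallel
        for( int i = 0; i < 4; i++ ) {
            matrix.setElement( i, i, 2.0 );
            if( i > 0 ) matrix.setElement( i, i - 1, -1.0 );
            if( i < 3 ) matrix.setElement( i, i + 1, -1.0 );
        }
        return 0;
    }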
3.3. numerical meshes
currently, tnl supports regular orthogonal structured grids and unstructured conforming homogeneous meshes. in this paper we deal only with the regular grids. computational results obtained with a multidimensional mixed-hybrid finite element method on both structured and unstructured meshes were presented in [15]. the regular orthogonal grids are one, two and three dimensional – tnl::meshes::grid< dimension, real, device, index >. the template parameter dimension corresponds to the grid dimension; the others are described at the beginning of section 3. the grid consists of mesh entities. a mesh entity with topological dimension 0 is a vertex, and a mesh entity with the same topological dimension as the mesh itself is a cell. each cell has a boundary made of faces, i.e. mesh entities having topological dimension dimension−1. in 3d, each face has a boundary consisting of 1-dimensional edges. within all mesh entities having the same dimension, each mesh entity has a unique index and coordinates, as depicted in figure 1. in 2d, we have two kinds of faces – those parallel to the x-axis (marked with light gray) and those parallel to the y-axis (marked with dark gray). they can be distinguished by their index – the faces parallel to the x-axis are indexed first (indexes 0−11), followed by the faces parallel to the y-axis (indexes 12−23) – and by their orientation defined by the unit normal. the same holds similarly for edges in 3d, whose orientation is given by the axis they are parallel to. a mesh entity may be obtained using the method getentity defined in tnl::meshes::grid. all information necessary for the implementation of a numerical scheme is accessible via the mesh entity, especially the mesh entity center, proportions, measure and the indexes of its neighbor entities. there are several template specializations for each type of the mesh entity depending on what information is precomputed and stored for repetitive use in the numerical scheme; in the case of cells, for example, the indexes of the neighbor cells can be queried directly from the cell entity.

figure 1. a 2d grid consisting of 3 × 3 cells, faces with gray box labels (vertical with dark gray, horizontal with light gray) and vertexes with white rounded box labels. coordinates are depicted on the top, indexing on the bottom.

3.4. solvers
tnl provides solvers for odes and systems of odes arising from the method of lines. currently, the user may choose between the first order accurate euler solver and the fourth order runge-kutta-merson solver [16] with an adaptive time stepping. both may run on the gpu. systems of linear equations can be solved by several krylov subspace methods like cg, bicgstab, gmres and tfqmr (which may be executed on the gpu as well [17]), or by the stationary sor method (which does not run on the gpu yet).

4. solution of parabolic pdes
the performance of the presented data structures and algorithms is demonstrated on a solution of the heat equation. the reason for choosing this simple problem is that, especially with an explicit time discretization and a zero right-hand side f(x,t) = 0, we get a low computational intensity, which is the worst case for the gpu and thus gives the lower bound of the speed-up. we consider a domain $\omega \equiv [0,1]^2$, the initial condition $u_{\mathrm{ini}} \in C^2(\omega)$ and the dirichlet boundary conditions $g \in C^2(\partial\omega \times (0,T])$ defined as:
$$\frac{\partial u(x,t)}{\partial t} - \Delta u(x,t) = f(x,t) \quad \text{in } \omega\times(0,T], \tag{1}$$
$$u(x,0) = u_{\mathrm{ini}}(x) \quad \text{in } \omega, \tag{2}$$
$$u(x,t) = g(x,t) \quad \text{on } \partial\omega\times(0,T]. \tag{3}$$
the solution u is approximated on the vertexes of the 2d grid. let h be the space step and n the number of cells along the x and y axes such that h = 1/n. let $\omega_h = \{(ih,jh) \mid 1 \le i,j \le n-1\}$ denote the set of interior grid vertexes and let
$$\bar{\omega}_h = \{(ih,jh) \mid 0 \le i,j \le n\} \tag{4}$$
stand for the set of all grid vertexes. then, by $\partial\omega_h = \bar{\omega}_h \setminus \omega_h$, we denote the boundary vertexes. for a function $u:\bar{\omega}\to\mathbb{R}$, we define the projection on $\bar{\omega}_h$ as $u_{ij} = u(ih,jh)$. for the interior vertexes from $\omega_h$, we define the approximation $\Delta_h$ of the laplace operator as follows:
$$\Delta_h u\left((ih,jh),\cdot\right) \approx \frac{u_{i+1,j} + u_{i-1,j} + u_{i,j+1} + u_{i,j-1} - 4u_{ij}}{h^2} = \Delta_h u_{ij}.$$
the explicit time discretization is done using the method of lines, which leads to the following system of odes
$$\frac{\mathrm{d}}{\mathrm{d}t}\, u_{ij}(t) = \Delta_h u_{ij}(t) + f_{ij}(t) \tag{5}$$
at the interior vertexes, and $u_{ij}(t) = g_{ij}(t)$ at the boundary vertexes. this system can be solved by the runge-kutta method. for the semi-implicit time discretization, we introduce a time step τ and we denote $u^k_{ij} := u(ih,jh,\tau k)$. with this notation, the semi-implicit scheme is given by a system of linear equations
$$\frac{u^{k+1}_{ij} - u^k_{ij}}{\tau} - \frac{u^{k+1}_{i+1,j} + u^{k+1}_{i-1,j} + u^{k+1}_{i,j+1} + u^{k+1}_{i,j-1} - 4u^{k+1}_{ij}}{h^2} = f^{k+1}_{ij} \tag{6}$$
at the interior vertexes and
$$u^{k+1}_{ij} = g^{k+1}_{ij} \tag{7}$$
at the boundary vertexes. (the scheme given by (6)–(7) is in fact fully implicit. in general, implicit schemes for non-linear parabolic pdes involve the newton method, which is not implemented in tnl yet; such problems must be discretized by semi-implicit schemes.) both (6) and (7) can be written in the form of a linear system
$$A x^{k+1} = b^k, \tag{8}$$
where $x^{k+1}_{iN+j} \equiv u^{k+1}_{ij}$ and, for $(i,j) \in \omega_h$,
$$a_{iN+j,(i-1)N+j} = a_{iN+j,iN+j-1} = a_{iN+j,iN+j+1} = a_{iN+j,(i+1)N+j} = -\lambda,$$
$$a_{iN+j,iN+j} = 1 + 4\lambda, \qquad b^k_{iN+j} = \tau f^{k+1}_{ij} + u^k_{ij},$$
where we set $\lambda = \tau/h^2$. the dirichlet boundary conditions (3) are approximated on $(i,j) \in \partial\omega_h$ as $a_{iN+j,iN+j} = 1$ and $b^k_{iN+j} = g^{k+1}_{ij}$. the discretization of the 1d and 3d problems is done in the same way.

algorithm 1: algorithm for the time dependent pde solver
1: initialize the numerical mesh (4)
2: allocate degrees of freedom for $u^0_{ij}$
3: setup the initial condition ($u_{\mathrm{ini}}$)
4: setup the numerical solver (runge-kutta or linear system solver)
5: $t := 0$, $k := 0$, $\tau :=$ initial time step
6: while $t < T$ do
7:   $\tau := \min\{\tau, T - t\}$
8:   if we have an explicit time discretization then
9:     evaluate the right hand side of (5)
10:    update $u^k_{ij}$ by the runge-kutta solver to get $u^{k+1}_{ij}$
11:  else /* we have a semi-implicit time discretization */
12:    assemble the linear system (8)
13:    solve (8) by the linear system solver to get $u^{k+1}_{ij}$
14:  end if
15:  $t := t + \tau$, $k := k + 1$
16:  if t is a snapshot time then
17:    make a snapshot of the solution $u^k_{ij}$
18:  end if
19: end while

a numerical solver for such a parabolic problem may be written as algorithm 1. each of the steps in this algorithm may be non-trivial for more complex problems or parallel computations. tnl comes with a framework based on what we call a problem-solver model. on one hand, there is a problem defined by the user in the form of a (templated) class. on the other hand, there is a solver provided by tnl. the solver interacts with the problem via predefined methods and c++ type definitions.
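for concreteness, one explicit time step of scheme (5) with f = 0 can be written in a few lines of standalone c++. this is only a sketch: it uses a plain std::vector instead of tnl containers and first-order euler time stepping instead of the runge-kutta-merson solver actually used in tnl.

#include <vector>

// one explicit euler step of (5) with f = 0 on a grid of (n+1) x (n+1)
// vertexes; u is stored row-wise, vertex (i,j) at index i*(n+1)+j.
// boundary values are left untouched (dirichlet condition (3)).
// stability of this simple scheme requires lambda = tau/h^2 <= 1/4.
void explicitStep( std::vector<double>& u, int n, double tau, double h )
{
   const double lambda = tau / ( h * h );
   std::vector<double> uNew( u );
   for( int i = 1; i < n; i++ )
      for( int j = 1; j < n; j++ ) {
         const int c = i * ( n + 1 ) + j;
         const double lap = u[ c + n + 1 ] + u[ c - n - 1 ]
                          + u[ c + 1 ] + u[ c - 1 ] - 4.0 * u[ c ];
         uNew[ c ] = u[ c ] + lambda * lap;   // u^{k+1} = u^k + tau * Delta_h u^k
      }
   u.swap( uNew );
}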
4.1. pde solver design
the design of the solver is depicted in figure 2. the problem to be solved is defined and managed by three template parameters of the solver tnl::solvers::solver, i.e. problemconfig, problemsetter and problembuild. they serve for the definition of the problem configuration parameters, for the resolution of the problem template parameters and for the control of the problem build process (a complete build may take a lot of time), respectively.

the static method tnl::solvers::solver::run takes the command line arguments argc and argv. the tnl solver first calls a static method problemconfig::configsetup to get a definition of the configuration keywords. in the next step, the command line arguments are parsed and, based on the definition of the configuration keywords, the configuration parameters are resolved and stored in a container tnl::config::parametercontainer. the control is then passed to the solverinitiator, which reads the file with the numerical mesh given by the value of the configuration keyword --mesh. it is a binary file created by the tnl tool tnl-grid-setup and it contains a binary image of the templated class grid. in tnl, most objects may be saved in a binary format using the method save. a header of such a file contains the object type written in the c++ style, e.g., tnl::meshes::grid< 2, double, tnl::devices::host, int >. the solver initiator parses the template arguments of this object type and so it can resolve the mesh type completely. the values of its template parameters real and index are used as default values for the floating point arithmetics and the indexing type in the problem. this can be changed by the configuration keywords --real-type and --index-type. together with the argument --device-type, the solver initiator can resolve the primary template arguments (realtype, devicetype, and indextype) and meshtype for the problem. the solver may, however, depend on some other template types like the numerical scheme or the boundary conditions. we refer to them as secondary template arguments. to resolve them, the control is passed to the problemsetter. the solver starter (based on the configuration parameters) sets up the templated types for the time discretization (timestepper) and the related solvers – the runge-kutta solver for the explicit time discretization or a linear system solver for a semi-implicit scheme. at the end, the solver starter creates an instance of the class problem, passes it to the pdesolver and starts the solver. pdesolver loads the numerical mesh from the file and, subsequently, it calls the methods of the class problem, which we describe in the following section.

figure 2. structure of the pde framework. the green boxes on the right are part of a problem written by the user. the blue ones on the left are the solver part implemented in tnl. the yellow boxes represent the template parameters and the violet ones stand for data.

4.2. pde problem structure
the class problem, representing the heat equation or a similar parabolic problem, could be parametrized by four template parameters – mesh, boundaryconditions, righthandside and differentialoperator – defining the mesh type, the boundary conditions (3), the right-hand side of equation (1) and the differential operator in the same equation (1), respectively. it is inherited from a templated class tnl::problems::pdeproblem which defines the following methods (we list only the most important ones; a skeletal problem class using this interface is sketched below):
• setup: this method serves for the set-up of the problem configuration parameters which were parsed from the command-line arguments.
• getdofs: based on the numerical mesh, the problem type (a single pde or a system of pdes) and the type of the mesh entities linked with the dofs (cells, faces or vertexes), this method returns the number of dofs for the unknown mesh function (it is $u_{ij}(t)$ in (5) or $u^k_{ij}$ in (6) in our example). the dofs are then allocated by the solver.
• binddofs: this method serves for binding the dofs into mesh functions.
• setinitialcondition: the initial condition $u_{\mathrm{ini}}$ (2) may be set here.
• getexplicitupdate: this method evaluates the right-hand side (5) of the explicit numerical scheme. it is called for the explicit time discretization only.
• setuplinearsystem: in the case of the semi-implicit time discretization, this method serves for the setup of the (sparse) matrix format storing the linear system (8), as described in section 3.2. it is called only for the semi-implicit time discretization.
• assemblylinearsystem: this method is responsible for the construction of the linear system (8). it is called for the semi-implicit time discretization only.
• makesnapshot: this method stores the state of the time dependent problem into a file.

5. tnl tools
tnl offers several helper tools (see figure 3). tnl-grid-setup is a simple utility for creating a file with the numerical grid definition. tnl-init serves for an easier set-up of initial conditions; it produces a file with a binary image of a mesh function. after the problem solver finishes the computation and saves the solution in the form of a sequence of binary files, they may be post-processed by tnl-view to convert the binary data to the vtk (or gnuplot) format, or by tnl-diff to evaluate differences between different mesh functions or the experimental order of convergence (eoc). tnl-image-converter can convert images to binary tnl files and vice versa. currently, tnl supports pgm (natively), png (via libpng), jpeg (via libjpeg) and dicom (via dcmtk).

figure 3. tnl tools for computation data preprocessing and post-processing.
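to illustrate the problem interface of section 4.2, a skeletal user-side problem class might look as follows. the method signatures are simplified placeholders of ours; the real base class tnl::problems::pdeproblem prescribes the exact ones.

template< typename Mesh,
          typename BoundaryConditions,
          typename RightHandSide,
          typename DifferentialOperator >
class HeatEquationProblem
// : public TNL::Problems::PDEProblem< ... >   // the base class named in the text
{
   public:
      bool setup( /* configuration parameters parsed from the command line */ );
      long int getDofs( const Mesh& mesh ) const;   // e.g. the number of grid vertexes
      void bindDofs( /* wraps the dof vector into a mesh function */ );
      void setInitialCondition( /* u_ini from (2) */ );
      void getExplicitUpdate( /* evaluates the right-hand side of (5) */ );
      void setupLinearSystem( /* allocates the sparse pattern of (8) */ );
      void assemblyLinearSystem( /* fills the matrix A and the vector b of (8) */ );
      void makeSnapshot( /* stores u^k_ij into a file */ );
};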
6. performance tests
computational benchmarks were performed on an intel xeon e5-2640 running at 2.4 ghz. this cpu is equipped with 8 computational cores and 25 mb of l3 cache, and it was connected to ddr4 memory modules. the turbo-boost technology was turned off only for measuring the weak scalability on the cpu; for all other computations it was turned on. the gpu solvers were tested on an nvidia tesla v100 with 5120 cuda cores running at 1455 mhz and equipped with 16 gb of global memory. the heat equation (1)–(3) was solved in a domain $(-1,1)^n$, where n = 1, 2, 3 denotes the dimension of the domain. the initial condition was $u_{\mathrm{ini}}(x) = \exp(-4\|x\|^2)$. the final time T was set to 0.1.

6.1. explicit numerical schemes
firstly, we test the explicit numerical scheme. the integration in time is done by the runge-kutta-merson solver with an adaptive choice of the time step. the results are presented in tables 2–4. the number of dofs is shown in the first column. the following columns show the times of the cpu solver on one, two, four and eight cores, together with the parallel efficiency. the time values in bold face show the best cpu computation time in each row. the last two columns belong to the gpu solver; they show the time and the speed-up compared to the best time obtained on the cpu. the simulations in 1d (table 2) have too small a number of dofs and thus we obtain neither a good parallel scalability nor a speed-up on the gpu. in 2d (table 3), the cpu solver scales relatively well up to four cores on larger numerical meshes. the best time is obtained on eight cores, but the parallel efficiency is lower. the gpu speed-up is more than nine on large problems. in 3d, similar results can be seen in table 4, reporting a gpu speed-up of more than eight. note that the turbo-boost was active on the cpu, which can affect the parallel efficiency. the heat equation (1) with f(x,t) = 0 exhibits a low computational intensity; therefore, tables 2–4 demonstrate the lower bound of the gpu speed-up that can be obtained for explicit solvers in tnl. results with arithmetically more intensive computations are presented in tables 5–7, where we set
$$f(x,t) = \cos(t)\left(-\frac{2a}{\sigma^2}\,e^{-\frac{x^2}{\sigma^2}} + \frac{4ax^2}{\sigma^4}\,e^{-\frac{x^2}{\sigma^2}}\right), \tag{9}$$
$$f(x,t) = \cos(t)\left(-\frac{2a}{\sigma^2}\,e^{-\frac{x^2+y^2}{\sigma^2}} + \frac{4ax^2}{\sigma^4}\,e^{-\frac{x^2+y^2}{\sigma^2}} - \frac{2a}{\sigma^2}\,e^{-\frac{x^2+y^2}{\sigma^2}} + \frac{4ay^2}{\sigma^4}\,e^{-\frac{x^2+y^2}{\sigma^2}}\right), \tag{10}$$
and
$$f(x,t) = \cos(t)\left(-\frac{2a}{\sigma^2}\,e^{-\frac{x^2+y^2+z^2}{\sigma^2}} + \frac{4ax^2}{\sigma^4}\,e^{-\frac{x^2+y^2+z^2}{\sigma^2}} - \frac{2a}{\sigma^2}\,e^{-\frac{x^2+y^2+z^2}{\sigma^2}} + \frac{4ay^2}{\sigma^4}\,e^{-\frac{x^2+y^2+z^2}{\sigma^2}} - \frac{2a}{\sigma^2}\,e^{-\frac{x^2+y^2+z^2}{\sigma^2}} + \frac{4az^2}{\sigma^4}\,e^{-\frac{x^2+y^2+z^2}{\sigma^2}}\right) \tag{11}$$
for 1d, 2d and 3d, respectively. the setup is the same as for tables 2–4, only for the 2d and 3d simulations (tables 6 and 7) the final time T was set to 0.01. in this situation we get a certain speed-up even for the 1d problem. in 2d and 3d, the speed-up reaches almost one hundred compared to the eight-core cpu time. this is an estimate of the upper bound of the speed-up one can obtain with the explicit solver.

6.2. semi-implicit numerical schemes
tables 8–10 show the results obtained by the semi-implicit numerical scheme (6)–(7). in this case, the majority of the time is spent solving the linear system (8). the sparse matrix A in (8) is stored in the csr format on the cpu and in the slicedellpack format on the gpu. the linear system was solved by the gmres method on the cpu and by cwygmres [18] on the gpu. the use of the gpu makes no sense in 1d, where the speed-up is smaller than one, just as in the case of the explicit solver. in 2d and 3d, the speed-up is more than 12 and 10, respectively. figure 4 shows graphs of the efficiency of the cpu solvers and so it demonstrates the strong parallel scalability of different solvers on different problem sizes. the left column shows the scalability in 1d, where the problem size is usually too small for an efficient parallelization. in 2d and 3d (the second and the third column), especially for larger grid sizes, the efficiency grows. the middle row shows the results of the arithmetically more intensive explicit solver with f given by (9)–(11); the efficiency is significantly higher. this indicates that the cpu solvers are probably limited by the memory bandwidth. the weak scalability is reported in table 11 and figure 5. to study the weak scalability, we increase the problem size linearly with the number of threads by extending the domain ω along the x axis. the initial condition is scaled in the same way. we set the domain ω as $\omega \equiv (0,p)$, $\omega \equiv (0,p)\times(0,1)$ and $\omega \equiv (0,p)\times(0,1)\times(0,1)$ in 1d, 2d and 3d, respectively, where p denotes the number of threads. the resolution of the discrete numerical mesh $\omega_h$ is set to 10000p, 200p×100 and 100p×50×50 in 1d, 2d and 3d, respectively.

dofs | 1 core: time (s) | 2 cores: time (s) / eff. | 4 cores: time (s) / eff. | 8 cores: time (s) / eff. | gpu: time (s) / speed-up
16 | 0.003 | 0.002 / 0.75 | 0.003 / 0.25 | 0.003 / 0.12 | 0.04 / 0.05
32 | 0.003 | 0.003 / 0.5 | 0.003 / 0.25 | 0.003 / 0.12 | 0.04 / 0.07
64 | 0.005 | 0.005 / 0.5 | 0.005 / 0.25 | 0.005 / 0.12 | 0.05 / 0.1
128 | 0.015 | 0.015 / 0.5 | 0.015 / 0.25 | 0.01 / 0.18 | 0.12 / 0.08
256 | 0.06 | 0.06 / 0.5 | 0.06 / 0.25 | 0.06 / 0.12 | 0.43 / 0.13
512 | 0.28 | 0.29 / 0.48 | 0.33 / 0.21 | 0.34 / 0.10 | 1.36 / 0.20
1024 | 1.45 | 1.8 / 0.40 | 1.85 / 0.19 | 2.2 / 0.08 | 5.32 / 0.27
2048 | 8.57 | 9.111 / 0.47 | 8.5 / 0.25 | 9.38 / 0.11 | 20.95 / 0.40
4096 | 56.37 | 50.94 / 0.55 | 40.9 / 0.34 | 41.9 / 0.16 | 84.66 / 0.48
table 2. performance of the explicit numerical solver of (5) for the heat equation in 1d.
bold face stresses the best time on the cpu (with turbo-boost turned on), based on which we compute the gpu speed-up, written in bold face as well.

dofs | 1 core: time (s) | 2 cores: time (s) / eff. | 4 cores: time (s) / eff. | 8 cores: time (s) / eff. | gpu: time (s) / speed-up
16² | 0.003 | 0.003 / 0.5 | 0.003 / 0.25 | 0.008 / 0.04 | 0.003 / 0.03
32² | 0.005 | 0.006 / 0.41 | 0.006 / 0.20 | 0.006 / 0.10 | 0.07 / 0.07
64² | 0.03 | 0.027 / 0.55 | 0.022 / 0.34 | 0.028 / 0.13 | 0.09 / 0.33
128² | 0.41 | 0.27 / 0.75 | 0.17 / 0.60 | 0.16 / 0.32 | 0.22 / 0.72
256² | 6.17 | 3.67 / 0.84 | 2.12 / 0.72 | 1.46 / 0.52 | 0.79 / 1.84
512² | 96 | 55.47 / 0.86 | 30.84 / 0.77 | 18.7 / 0.64 | 5.73 / 3.26
1024² | 1743 | 990.7 / 0.87 | 556.9 / 0.78 | 381.6 / 0.57 | 64.82 / 5.88
2048² | 31226 | 17627.7 / 0.88 | 10403.4 / 0.75 | 8297.8 / 0.47 | 911.11 / 9.10
table 3. performance of the explicit numerical solver of (5) for the heat equation in 2d. bold face stresses the best time on the cpu (with turbo-boost turned on), based on which we compute the gpu speed-up, written in bold face as well.

dofs | 1 core: time (s) | 2 cores: time (s) / eff. | 4 cores: time (s) / eff. | 8 cores: time (s) / eff. | gpu: time (s) / speed-up
16³ | 0.005 | 0.005 / 0.5 | 0.004 / 0.31 | 0.004 / 0.15 | 0.08 / 0.05
32³ | 0.07 | 0.049 / 0.71 | 0.031 / 0.56 | 0.02 / 0.43 | 0.08 / 0.25
64³ | 2.46 | 1.47 / 0.83 | 0.8 / 0.76 | 0.49 / 0.62 | 0.23 / 2.13
128³ | 94.32 | 53.08 / 0.88 | 30.66 / 0.76 | 23.75 / 0.49 | 3.48 / 6.82
256³ | 3050.47 | 1720.04 / 0.88 | 1005.89 / 0.75 | 827.16 / 0.46 | 103.46 / 7.99
table 4. performance of the explicit numerical solver of (5) for the heat equation in 3d. bold face stresses the best time on the cpu (with turbo-boost turned on), based on which we compute the gpu speed-up, written in bold face as well.

dofs | 1 core: time (s) | 2 cores: time (s) / eff. | 4 cores: time (s) / eff. | 8 cores: time (s) / eff. | gpu: time (s) / speed-up
16 | 0.003 | 0.003 / 0.5 | 0.003 / 0.25 | 0.003 / 0.12 | 0.02 / 0.15
32 | 0.003 | 0.003 / 0.5 | 0.003 / 0.25 | 0.003 / 0.12 | 0.05 / 0.05
64 | 0.007 | 0.007 / 0.5 | 0.007 / 0.25 | 0.007 / 0.12 | 0.07 / 0.09
128 | 0.032 | 0.032 / 0.5 | 0.03 / 0.26 | 0.03 / 0.13 | 0.11 / 0.27
256 | 0.19 | 0.2 / 0.47 | 0.2 / 0.23 | 0.2 / 0.11 | 0.36 / 0.52
512 | 1.35 | 1.37 / 0.49 | 1.5 / 0.22 | 1.5 / 0.11 | 1.37 / 0.98
1024 | 10.58 | 6.08 / 0.87 | 4.21 / 0.62 | 3.64 / 0.36 | 5.25 / 0.69
2048 | 78.26 | 43.08 / 0.90 | 26.86 / 0.72 | 20.13 / 0.48 | 20.74 / 0.97
4096 | 609.17 | 321.73 / 0.94 | 190.37 / 0.79 | 128.69 / 0.59 | 83.55 / 1.54
table 5. performance of the explicit numerical solver of (5) for the heat equation in 1d with f given by (9) and T = 0.1. bold face stresses the best time on the cpu (with turbo-boost turned on), based on which we compute the gpu speed-up, written in bold face as well.

dofs | 1 core: time (s) | 2 cores: time (s) / eff. | 4 cores: time (s) / eff. | 8 cores: time (s) / eff. | gpu: time (s) / speed-up
16² | 0.004 | 0.004 / 0.5 | 0.004 / 0.25 | 0.006 / 0.08 | 0.04 / 0.1
32² | 0.01 | 0.007 / 0.71 | 0.006 / 0.41 | 0.007 / 0.17 | 0.05 / 0.12
64² | 0.03 | 0.02 / 0.75 | 0.01 / 0.75 | 0.01 / 0.37 | 0.08 / 0.12
128² | 0.69 | 0.34 / 1.01 | 0.19 / 0.90 | 0.14 / 0.61 | 0.06 / 2.33
256² | 11.03 | 5.65 / 0.97 | 3.13 / 0.88 | 1.83 / 0.75 | 0.12 / 15.2
512² | 181.6 | 93.57 / 0.97 | 53.65 / 0.84 | 29.29 / 0.77 | 0.63 / 46.4
1024² | 2996.2 | 1557.51 / 0.96 | 865.7 / 0.86 | 481.47 / 0.77 | 6.41 / 75.1
2048² | 48965 | 25065.9 / 0.97 | 13716.7 / 0.89 | 8002.77 / 0.76 | 87.6 / 91.3
table 6. performance of the explicit numerical solver of (5) for the heat equation in 2d with f given by (10) and T = 0.01. bold face stresses the best time on the cpu (with turbo-boost turned on), based on which we compute the gpu speed-up, written in bold face as well.

dofs | 1 core: time (s) | 2 cores: time (s) / eff. | 4 cores: time (s) / eff. | 8 cores: time (s) / eff. | gpu: time (s) / speed-up
16³ | 0.03 | 0.02 / 0.75 | 0.014 / 0.53 | 0.011 / 0.34 | 0.01 / 1.1
32³ | 0.33 | 0.16 / 1.03 | 0.1 / 0.82 | 0.09 / 0.45 | 0.01 / 9
64³ | 4.5 | 2.33 / 0.96 | 1.31 / 0.85 | 0.81 / 0.69 | 0.03 / 27
128³ | 174.2 | 88.33 / 0.98 | 50.11 / 0.86 | 28.65 / 0.76 | 0.37 / 77
256³ | 6047.66 | 3071.05 / 0.98 | 1696 / 0.89 | 982.2 / 0.76 | 9.87 / 99
table 7. performance of the explicit numerical solver of (5) for the heat equation in 3d with f given by (11) and T = 0.01. bold face stresses the best time on the cpu (with turbo-boost turned on), based on which we compute the gpu speed-up, written in bold face as well.

dofs | 1 core: time (s) | 2 cores: time (s) / eff. | 4 cores: time (s) / eff. | 8 cores: time (s) / eff. | gpu: time (s) / speed-up
16 | 0.003 | 0.003 / 0.5 | 0.003 / 0.25 | 0.003 / 0.12 | 0.03 / 0.1
32 | 0.003 | 0.003 / 0.5 | 0.004 / 0.18 | 0.004 / 0.09 | 0.13 / 0.02
64 | 0.006 | 0.006 / 0.5 | 0.007 / 0.21 | 0.011 / 0.06 | 0.26 / 0.02
128 | 0.016 | 0.018 / 0.44 | 0.019 / 0.21 | 0.025 / 0.08 | 0.9 / 0.02
256 | 0.081 | 0.1 / 0.40 | 0.09 / 0.22 | 0.11 / 0.09 | 1.8 / 0.04
512 | 0.48 | 0.51 / 0.47 | 0.54 / 0.22 | 0.55 / 0.1 | 6.05 / 0.08
1024 | 3.42 | 2.98 / 0.57 | 2.83 / 0.30 | 3.57 / 0.11 | 21.6 / 0.17
2048 | 24.88 | 17.29 / 0.71 | 14.05 / 0.44 | 15.5 / 0.20 | 85.8 / 0.16
4096 | 189.7 | 114.12 / 0.83 | 79.53 / 0.59 | 76.11 / 0.31 | 342.8 / 0.22
table 8. performance of the implicit numerical solver (6)–(7) for the heat equation in 1d. bold face stresses the best time on the cpu (with turbo-boost turned on), based on which we compute the gpu speed-up, written in bold face as well.

dofs | 1 core: time (s) | 2 cores: time (s) / eff. | 4 cores: time (s) / eff. | 8 cores: time (s) / eff. | gpu: time (s) / speed-up
16² | 0.004 | 0.004 / 0.5 | 0.01 / 0.1 | 0.03 / 0.02 | 0.04 / 0.1
32² | 0.015 | 0.012 / 0.62 | 0.011 / 0.34 | 0.01 / 0.18 | 0.12 / 0.08
64² | 0.12 | 0.07 / 0.85 | 0.047 / 0.63 | 0.05 / 0.3 | 0.3 / 0.15
128² | 1.2 | 0.65 / 0.92 | 0.37 / 0.81 | 0.26 / 0.57 | 0.83 / 0.31
256² | 17.51 | 9.17 / 0.95 | 4.79 / 0.91 | 3.03 / 0.72 | 2.86 / 1.05
512² | 230.9 | 120.83 / 0.95 | 66.49 / 0.86 | 39.62 / 0.72 | 12.07 / 3.28
1024² | 3634.24 | 1928.66 / 0.94 | 1081.04 / 0.84 | 752.73 / 0.60 | 86.9 / 8.66
2048² | 59185 | 32048 / 0.92 | 18080.1 / 0.81 | 13167.5 / 0.56 | 1057 / 12.4
table 9. performance of the implicit numerical solver (6)–(7) for the heat equation in 2d. bold face stresses the best time on the cpu (with turbo-boost turned on), based on which we compute the gpu speed-up, written in bold face as well.

dofs | 1 core: time (s) | 2 cores: time (s) / eff. | 4 cores: time (s) / eff. | 8 cores: time (s) / eff. | gpu: time (s) / speed-up
16³ | 0.021 | 0.013 / 0.80 | 0.01 / 0.52 | 0.011 / 0.23 | 0.07 / 0.14
32³ | 0.46 | 0.25 / 0.92 | 0.14 / 0.82 | 0.11 / 0.52 | 0.18 / 0.61
64³ | 10.04 | 5.21 / 0.96 | 3.02 / 0.83 | 1.83 / 0.68 | 0.6 / 3.05
128³ | 240.9 | 125.11 / 0.96 | 81.24 / 0.74 | 51.46 / 0.58 | 6.54 / 7.86
256³ | 6429.6 | 3815.9 / 0.84 | 2313.79 / 0.69 | 1536.4 / 0.52 | 140.1 / 10.96
table 10. performance of the implicit numerical solver (6)–(7) for the heat equation in 3d. bold face stresses the best time on the cpu (with turbo-boost turned on), based on which we compute the gpu speed-up, written in bold face as well.

figure 4.
graphs of the cpu efficiency – the first row represents the explicit solver from tables 2–4, the second row is the explicit solver with f given by (9)–(11) from tables 5–7 and the last row is the implicit solver from tables 8–10. the first column contains the results of computations in 1d, the second column in 2d and the third one in 3d. the red curve is the efficiency of the simulation with two threads, the green one with four threads and the blue one with eight threads. the horizontal axes represent the grid size of the simulation and the vertical axes show the efficiency.

figure 5. graphs of the weak parallel scalability on the cpu – the figure on the top shows the results of the explicit solver, the one on the bottom shows the implicit solver. the red curve represents computations in 1d, the green one in 2d and the blue one in 3d. the problem size grows linearly with the number of threads; therefore, the straighter the curve is, the better parallel scalability the solver exhibits.

time (s):
threads | explicit 1d | explicit 2d | explicit 3d | implicit 1d | implicit 2d | implicit 3d
1 | 0.56 | 1.06 | 1.34 | 1.25 | 2.84 | 5.59
2 | 0.58 | 1.31 | 1.36 | 1.29 | 3.54 | 5.82
3 | 0.61 | 1.42 | 1.35 | 1.30 | 3.61 | 6.09
4 | 0.59 | 1.51 | 1.49 | 1.31 | 3.80 | 6.30
5 | 0.60 | 1.54 | 1.80 | 1.31 | 3.80 | 6.59
6 | 0.61 | 1.51 | 1.94 | 1.31 | 3.68 | 7.07
7 | 0.60 | 1.42 | 1.91 | 1.32 | 3.60 | 7.59
8 | 0.66 | 1.41 | 1.97 | 1.36 | 3.54 | 8.31
table 11. the weak scalability on the cpu – the columns represent the cpu time of a simulation on a domain growing linearly with the number of threads (the first column). the less the times grow with the number of threads, the better the weak scalability is.

7. future work
in the future, we would like to improve the support of mpi to make computations on clusters easier, to implement support for gpus by the amd company via the rocm toolkit [19] and to create an interface from tnl to julia [20]. as we mentioned before, efficient preconditioners for linear system solvers are extremely important. currently, we are working on a more flexible implementation of sparse matrices, on adaptive numerical grids and on distributed unstructured numerical meshes.

8. conclusion
we have presented the template numerical library, tnl, for an easy development of numerical solvers on modern parallel architectures. we have described the details of the tnl design and we have presented a number of numerical experiments to demonstrate the scalability of tnl on multi-core cpus together with the speed-up on gpus. the explicit solver achieves a speed-up of 8 to 99, depending on the arithmetic intensity. the semi-implicit solver gives a speed-up of almost 11. both results were obtained by comparing an 8-core intel xeon cpu with an nvidia tesla v100 gpu.

acknowledgements
the work was supported by the ministry of education, youth and sports opvvv project no. cz.02.1.01/0.0/0.0/16_019/0000765: research center for informatics, the large infrastructures for research, experimental development and innovations project it4innovations national supercomputing center – lm2015070 and project no. 18-09539s of the czech science foundation.

references
[1] j. cheng, m. grossman, t. mckercher. professional cuda c programming. wrox, 2014.
[2] cublas. https://developer.nvidia.com/cublas.
[3] cusparse. https://developer.nvidia.com/cusparse.
[4] thrust. https://developer.nvidia.com/thrust.
[5] kokkos. https://github.com/kokkos/kokkos.
[6] viennacl. http://viennacl.sourceforge.net.
[7] petsc. https://www.mcs.anl.gov/petsc.
[8] eigen. http://eigen.tuxfamily.org/index.php.
[9] m. harris, s. sengupta, j. d. owens. gpu gems 3, chap. parallel prefix sum (scan) with cuda, pp. 851–876. 2007.
[10] n. bell, m. garland. efficient sparse matrix-vector multiplication on cuda. technical report nvr-2008-004, nvidia corporation, 2008.
[11] t. oberhuber, a. suzuki, j. vacata. new row-grouped csr format for storing the sparse matrices on gpu with implementation in cuda. acta technica 56:447–466, 2011.
[12] a. monakov, a. lokhmotov, a. avetisyan. automatically tuning sparse matrix-vector multiplication for gpu architectures. in hipeac 2010, pp. 111–125. springer-verlag berlin heidelberg, 2010.
[13] m. heller, t. oberhuber. improved row-grouped csr format for storing of sparse matrices on gpu. in a. handlovičová, z. minarechová, d. ševčovič (eds.), proceedings of algoritmy 2012, pp. 282–290. 2012.
[14] c. zheng, s. gu, t.-x. gu, et al. biell: a bisection ellpack-based storage format for optimizing spmv on gpus. journal of parallel and distributed computing 74(7):2639–2647, 2014.
[15] r. fučík, j. klinkovský, j. solovský, et al. multidimensional mixed-hybrid finite element method for compositional two-phase flow in heterogeneous porous media and its parallel implementation on gpu. computer physics communications 238:165–180, 2019.
[16] t. oberhuber, a. suzuki, v. žabka. the cuda implementation of the method of lines for the curvature dependent flows. kybernetika 47:251–272, 2011.
[17] t. oberhuber, a. suzuki, j. vacata, v. žabka. image segmentation using cuda implementations of the runge-kutta-merson and gmres methods. journal of math-for-industry 3:73–79, 2011.
[18] y. yamamoto, y. hirota. a parallel algorithm for incremental orthogonalization based on the compact wy representation. jsiam letters 3(0):89–92, 2011.
[19] rocm. https://github.com/radeonopencompute/rocm.
[20] julia. https://julialang.org.

acta polytechnica 60(6):469–477, 2020. doi:10.14311/ap.2020.60.0469. © czech technical university in prague, 2020. available online at https://ojs.cvut.cz/ojs/index.php/ap

analyses of waste products obtained by laser cutting of aw-3103 aluminium alloy
jan loskot (a,∗), maciej zubko (a,b), zbigniew janikowski (c)
(a) university of hradec králové, department of physics, rokitanského 62, 500 03 hradec králové, czech republic
(b) university of silesia in katowice, institute of materials engineering, 75 pułku piechoty 1a, 41 500 chorzów, poland
(c) "silver" pphu, ul. rymera 4, 44 270 rybnik, poland
(∗) corresponding author: jan.loskot@uhk.cz

abstract. in the presented research, the methods of scanning electron microscopy, energy-dispersive x-ray spectroscopy, x-ray diffraction and transmission electron microscopy were applied to analyse the powder waste obtained by cutting of aw-3103 aluminium alloy using a fibre laser.
the scanning electron microscopy made it possible to analyse the morphology of the waste microparticles, and the energy-dispersive x-ray spectroscopy revealed their chemical composition, which was compared with the composition of the original cut material. in the waste powder, mainly plate-like particles were observed that contain almost pure aluminium. x-ray powder diffraction measurements confirmed that the waste powder is composed of an aluminium phase with only a slight presence of other phases (magnetite, austenite and graphite), and the transmission electron microscopy revealed the presence of nanoscale particles in this waste powder. furthermore, it was found that the average size of the microparticles depends on the thickness of the cut material: particles obtained from a thicker workpiece were substantially bigger than those obtained from the thinner material. in contrast, the dimensions of the workpiece have only a little impact on the particles' shape and no significant influence on their chemical composition. the results also suggest that the microparticles could be used as an input material for powder metallurgy. but there is also a certain health risk connected with the inhalation of such tiny particles, especially the nanoparticles, which can penetrate deep into the human pulmonary system.

keywords: laser cutting, aluminium alloy, waste products, scanning electron microscopy, x-ray diffraction.

1. introduction
over the last few decades, a considerable number of laser beam machining (lbm) technologies have been developed, offering a wide range of applications [1, 2]. a substantial advantage of the lbm is that it does not cause mechanically induced damage to the processed material, machine vibrations or tool wear. especially suitable for the lbm are materials with a high degree of hardness or brittleness, as well as materials having a low thermal conductivity and diffusivity [1]. one of the lbm technologies is laser cutting, which is often used in the machine industry for processing almost all types of engineering materials, thus offering many applications too [3]. a widespread usage of this technology is the machining of metallic materials, such as steels or non-ferrous metal alloys. laser cutting makes it possible to produce components of various shapes with a clean cut edge [4], but it is also applied, for example, in waste management, where it serves to disassemble discarded products [5, 6]. it is worth mentioning that laser beam irradiation is also used in other manufacturing technologies, for example, in welding processes [7], sintering, turning, milling [1] or in laser surface alloying, which can enhance material properties (e.g., it can improve hardness) [8].

during the laser cutting process, the material to be cut is targeted by a high power laser beam. when the laser beam hits a metal surface, the local temperature rises to the melting point, which results in the material melting at this place. this liquid film is subsequently ejected from the kerf area to the surrounding environment by the assist gas (usually compressed air, nitrogen or argon), which is blown out from a nozzle of the cutting head. it is worth mentioning that, on the one hand, the liquid layer becomes thicker with an increasing laser output power, because the energy absorbed is higher and thus the melting rate of the solid substrate increases. on the other hand, the liquid layer thickness decreases with an increasing assist gas velocity. this is caused by the shearing force between the assist gas and the melt surface, which accelerates the melt flow out of the workpiece in the direction of the assist gas. the assist gas also reduces the melt surface temperature due to convective heat transfer [9]. a typical imperfection generated on the workpiece cut edge is dross. it arises as a consequence of the molten material agglomeration at the lower wedge of the cut edge [10]. the dross formation depends on the cutting properties (such as the assist gas velocity or the kerf size), as well as on the properties of the liquid
this is caused by the shearing force between the assist gas and the melt surface, which accelerates the melt flow out of the workpiece in the direction of the assist gas. the assist gas also reduces the melt surface temperature due to convective heat transfer [9]. a typical imperfection generated on the workpiece cut edge is dross. it arises as a consequence of the molten material agglomeration at the lower wedge of the cut edge [10]. the dross formation depends on the cutting properties (such as assist gas velocity or kerf size), as well as on the properties of the liquid 469 https://doi.org/10.14311/ap.2020.60.0469 https://ojs.cvut.cz/ojs/index.php/ap j. loskot, m. zubko, z. janikowski acta polytechnica film (viscosity, density, surface tension) [9]. the more the molten material flows, the larger the droplet size is. if the droplets are close enough to each other, they can merge into larger drops [10]. while solidifying, the droplets are subjected to the shear force caused by the assist gas. it can cause their deformation, break-up or detachment followed by a subsequent ejection from the workpiece. the process of detachment occurs mostly in close proximity to the cutting front, because the liquid velocity is higher here (thanks to its lower viscosity), whereas its surface tension is lower [10]. in [11], another mechanism of the droplet formation is described. it is the so-called kelvin-helmholtz instability, which occurs on the interface of two fluids (in our case the assist gas and the molten material) in shear. this instability causes surface waves, which lead to turbulences of the liquid. as a consequence, metal drops can eventually break off from the liquid surface. the size of the resulting particle is of the order of the surface wave wavelength, which is comparable to the depth of the affected liquid. particles formed by this mechanism are typically small as compared to the particles formed from dross at the lower wedge of the cut edge. while the ejected metal droplets are passing through the atmosphere, they solidify again and become powder waste that is commonly thrown away [12]. this powder consists of various microand nanoparticles whose characteristics depend on the kind of the cut material and on the type of the used laser and technological parameters of the cutting process. when compared to conventional machining technologies (such as sawing or chipboard milling), laser cutting produces less cutting dross, because the kerfs created by the laser beam are quite narrow [13]. even so, a considerable amount of dross is produced by laser cutting of metallic materials this way [12]. this waste material is a source of difficulties for companies, which are performing laser cutting, because there is still not enough available possibilities to use this waste material and it is problematic to dispose of it legally. recently, there has been a growing pressure on reusing waste generated by industrial production in developed countries. several studies concerning the possible usage of powder waste produced by laser cutting have been conducted. it was found out that small dross particles obtained by laser cutting of metals can serve as an input material for powder metallurgy [14], there is also a potential to use them as a carrier for various substances, such as pesticides, fertilizers or medical drugs [12]. a considerable attention is also paid to health hazards arising from tiny dross particles generated by laser cutting and related material processing methods [13, 15]. 
while the ejected metal droplets are passing through the atmosphere, they solidify again and become a powder waste that is commonly thrown away [12]. this powder consists of various micro- and nanoparticles whose characteristics depend on the kind of the cut material, on the type of the used laser and on the technological parameters of the cutting process. when compared to conventional machining technologies (such as sawing or chipboard milling), laser cutting produces less cutting dross, because the kerfs created by the laser beam are quite narrow [13]. even so, a considerable amount of dross is produced by laser cutting of metallic materials this way [12]. this waste material is a source of difficulties for companies performing laser cutting, because there are still not enough available possibilities to use this waste material and it is problematic to dispose of it legally. recently, there has been a growing pressure to reuse waste generated by industrial production in developed countries. several studies concerning the possible usage of the powder waste produced by laser cutting have been conducted. it was found that small dross particles obtained by laser cutting of metals can serve as an input material for powder metallurgy [14]; there is also a potential to use them as a carrier for various substances, such as pesticides, fertilizers or medical drugs [12]. considerable attention is also paid to health hazards arising from tiny dross particles generated by laser cutting and related material processing methods [13, 15].

in this research, we analysed waste particles generated during laser cutting of the widely used aw-3103 aluminium alloy. the results of this study should provide knowledge for further research on reusing these waste materials, as well as for an assessment of the health risks connected especially with the inhalation of such small particles.

2. material and methods
research samples were prepared by laser cutting of aw-3103 alloy tubes. the cutting process was done using the blm lt 5 automated fibre laser cutting machine equipped with an ylr-1000-mm-wc multimode fibre laser. the laser worked at a wavelength of 1070 nm and its average power was 300 w. the laser beam diameter during the cutting process was 100 µm, the cutting speed was 0.22 m·s−1. to prevent oxidation of the cut material, the cutting process was performed in a protective nitrogen atmosphere: gaseous nitrogen was blown out from a nozzle to the cutting area. the pressure at the nozzle was 12 bar. tubes of two different sizes were cut: a thicker tube with an outer diameter of 20 mm and a wall width of 1.2 mm, and a thinner tube with an outer diameter of 16 mm and a wall width of 1.3 mm. both tubes were cut in a plane perpendicular to the axis of symmetry, so that small rings were cut from them. the waste powder generated during the cutting process was subsequently collected. some of the cut rings are shown in figure 1, an example of the powder sample is given in figure 2.

figure 1. rings cut from the thicker tube.
figure 2. metal waste powder collected after laser cutting (stub diameter is 12 mm).

the morphology of the powder waste particles was studied using the hitachi flexsem 1000 scanning electron microscope (sem); the used accelerating voltage was 20 kv. the sem was equipped with an x-ray energy-dispersive spectroscopy (eds) attachment from oxford instruments (detection area: 30 mm², type of detector: sdd silicon drift detector, energy resolution: 137 ev at the mn kα line), which was used for the elemental composition analysis of the samples. measurements of the smallest particles were done using the jeol jem-3010 high-resolution transmission electron microscope (tem) operated at a 300 kv acceleration voltage, equipped with a ccd camera gatan 2k×2k orius™ 833 sc200d. to get the smallest particles from the powder material, the powder was dispersed in an isopropanol solution and a 30-minute ultrasonic bath was applied. this suspension was then left still for approximately 2 minutes, so that heavy particles could sediment to the bottom of the flask. after that, samples of the liquid with the remaining tiny particles were taken from the flask by a pipette and deposited on a carbon grid for the measurements. the structural analysis of the waste powder was based on x-ray powder diffraction (xrd) measurements. for this purpose, the malvern panalytical empyrean diffractometer with a copper anode (cu kα: λ = 1.5418 å) and with the pixcel3d detector was used. the working current was 30 ma and the working voltage was 40 kv. the xrd measurements were performed in the range of angles 2θ = 10° to 110°, with a step size of 0.026°.

3. results and discussion
3.1. microscopic observations
the particles obtained from both tubes were studied using the sem in the secondary electrons mode. as can be seen in figure 3, most of the particles are plate-like, but smaller bumpy spheres were also found in the powders (fig. 4). the particle image analysis was done in the imagej software.
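for orientation, the shape descriptors discussed below follow the usual imagej definitions (our reading of [16], not a quotation from it), with a the projected particle area, p its perimeter and $d_{\mathrm{major}}$ the major axis of the fitted ellipse:
$$\mathrm{circularity} = \frac{4\pi a}{p^2}, \qquad \mathrm{roundness} = \frac{4a}{\pi\, d_{\mathrm{major}}^2}, \qquad \mathrm{solidity} = \frac{a}{a_{\mathrm{convex}}},$$
where $a_{\mathrm{convex}}$ is the area of the particle's convex hull; feret's diameter is the longest distance between any two points on the particle boundary.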
table 1 shows the mean values and medians of feret's diameter, solidity, circularity and roundness of the particles (these characteristics were calculated according to their definitions in [16]). statistical distributions of these characteristics are given in figure 5. this analysis showed that the waste particles are quite solid, but mostly not very circular. particles obtained from the thicker tube are substantially bigger on average compared with the particles from the thinner tube, whereas their mean solidity, roundness and circularity are only somewhat higher. this suggests that the dimensions of the original aluminium material have quite a significant impact on the waste particles' size, but only a little impact on their shape. the observed dependency of the particles' size on the original material thickness is consistent with the findings of the study [17], where particles generated during laser cutting of mild and stainless steels were examined. in that case, plates of thickness 1, 2 and 3 mm (for mild steel also 4 mm) were cut at a comparable laser power (900 w), and it was found that the particles' size tends to increase with an increasing thickness of the cut material too. according to [17], this general trend can be attributed to the following factors:
(1.) the vertical flow of the assist gas diminishes as the depth of the cut increases. therefore, the shearing action of the gas is reduced, which suppresses the formation of the tiniest particles.
(2.) with an increasing thickness of the cut piece, the average temperature of the molten material in the kerf decreases. due to this, the surface tension of the molten material gets higher, which results in the formation of bigger particles.
the observed microparticles were compared to the results of studies dealing with microparticles generated during laser cutting of steels. the most obvious difference is in the shape: aw-3103 alloy particles are mostly plate-like, whereas particles from mild steel as well as stainless steel are spherical [9, 11, 12, 17–19]. the reason might be that the steel particles had solidified before they fell to the ground, while the aw-3103 alloy particles had not solidified enough at the moment of their impact, so they became plastically deformed. regarding the sizes of the microparticles, laser cutting of 2 mm thick 0.25 %c mild steel samples under various processing parameters produced particles with a mean diameter of 150 µm [11]. the study [11] also showed that the particle size distribution depends on the laser power and the cutting speed. the mean feret's diameter of the aw-3103 alloy particles is somewhat smaller for both the thicker tube (114 µm) and the thinner tube (81 µm), but this has to be considered carefully due to the different sample thickness and different processing parameters used. anyway, the statistical distributions of feret's diameters show that the sizes of particles from both aluminium tubes should be suitable for powder metallurgy [20, 21]. particles in a desired size range can easily be obtained from the waste powder by using laboratory sieves with appropriate mesh dimensions. using the tem, nanoparticles with feret's diameters mostly in the range from 20 to 80 nm were also found in the waste powder. all observed nanoparticles were below 150 nm in feret's diameter. a tem image of these nanoparticles is shown in figure 6. for a comparison, the study [19] describes nanoparticles produced by laser cutting of a 6.35 mm thick sae-1010
steel plate. these nanoparticles were spherical with a mean diameter of about 20 nm. hence it seems that the sizes of the aw-3103 alloy and sae-1010 steel nanoparticles are comparable (of course, it can depend on the processing parameters again), while the aw-3103 alloy nanoparticles have a less spherical shape.

figure 3. metal powders produced by laser cutting of the thicker tube (left) and the thinner tube (right).
figure 4. spherical microparticles produced by laser cutting of the thicker tube (left) and the thinner tube (right).
figure 5. histograms describing the sizes and shapes of particles obtained by laser cutting of the thicker tube (left) and the thinner tube (right).
figure 6. nanoparticles in the waste powder. image recorded by a tem in a bright field.

descriptor | thicker tube (mean) | thinner tube (mean) | thicker tube (median) | thinner tube (median)
feret's diameter [µm] | 114 (82) | 81 (67) | 95 | 66
solidity | 0.91 (0.08) | 0.87 (0.09) | 0.93 | 0.89
circularity | 0.68 (0.16) | 0.60 (0.18) | 0.70 | 0.61
roundness | 0.53 (0.19) | 0.47 (0.19) | 0.53 | 0.46
table 1. statistical characteristics of the waste powder particles. numbers in brackets have the meaning of standard deviations.

based on the sizes of the observed aw-3103 alloy micro- and nanoparticles, it can be stated that there is a certain health risk connected especially with the inhalation of the waste particles. the smaller the particle size is, the deeper it can penetrate into the respiratory system and thus the greater health problems it can cause. in general, microparticles with an aerodynamic diameter of less than 10 µm can easily be inhaled into the human respiratory system and cause various respiratory diseases, e.g. allergy, chronic obstructive pulmonary diseases or lung cancer. particles larger than 5 µm usually deposit before reaching the lungs, while particles with an aerodynamic diameter between 1 and 5 µm are more likely to reach the central and peripheral airways and the alveoli [22]. particles smaller than 2.5 µm in aerodynamic diameter can also cause chronic bronchitis, the development of asthma [23], a decreased function of the lungs, and even a premature decease [22]. ultrafine particles (less than 100 nm in aerodynamic diameter) can even penetrate the alveolar epithelium, get into the bloodstream [24] and harm other parts of the body, especially the cardiovascular system. for instance, they may contribute to coronary atherosclerosis and worsen its consequences [22]. such tiny particles exhibit an enhanced inflammatory potential too [24]. the mentioned health risks connected with micro- and nanoparticles imply that the described aluminium waste powders should be treated carefully to avoid the emergence of such health problems.

3.2. elemental composition analysis
using the eds, it was found that the waste microparticles contain predominantly aluminium with manganese and iron as trace elements. the tube size has
figure 7 shows the typical eds spectra of the waste microparticles, a laser cut tube surface and a polished tube surface (not affected by laser cutting). the content of silicon on the polished surface is probably overestimated due to its possible contamination from a sandpaper, which was used to polish the sample. low concentrations of silicon (around 0.3 wt%) were detected also in some areas of the laser cut surface, but not everywhere. a reason for this could be an inhomogeneous distribution of silicon on the cut surface, which caused that in some areas, the content of silicon was below the detection limit. a similar situation occurs with iron, which was also detected only in some areas of the laser cut surface. the declared composition of aw-3103 alloy is shown in table 2. the given concentrations are in compliance with the values obtained by the eds measurements. (only there is a big difference between the declared concentration of si and the si concentration measured on the polished surface, which can be explained by the mentioned sample contamination during its polishing.) 3.3. structural analysis phase composition of the waste powder particles was determined using the xrd method. the results confirmed that the particles are composed predominantly of aluminium phase with only a slight presence of other phases. based on the performed structural analysis using icdd pdf4+ database, the other phases were identified as magnetite (fe3o4 – icdd pdf card number 04-008-4512), austenite (γ-fe – icdd pdf card number 04-020-7293) and graphite (icdd pdf card number 04-016-6288). diffraction patterns of powders from both tubes are shown in figure 8, all six strongest peaks (marked with green triangles) belong to the aluminium phase. 4. conclusions the presented research revealed that waste powder microparticles generated during laser cutting of aw-3103 474 vol. 60 no. 6/2020 analyses of waste products obtained by laser cutting. . . figure 7. eds summary spectra of the polished tube surface (top left), laser cut tube surface (top right) and a waste particle surface (bottom). figure 8. x-ray diffraction patterns of the waste products obtained by laser cutting of the thicker tube (black curve) and the thinner tube (red curve). 475 j. loskot, m. zubko, z. janikowski acta polytechnica alloy are mostly plate-like with feret’s diameters up to hundreds of µm, but some smaller bumpy spheres are also present in the powder. the microparticles consist of almost pure aluminium with only a slight presence of other phases (magnetite, austenite and graphite). the average size of the microparticles increases with increasing thickness of the aluminium tube from which they originate, while there is only a little impact of the tube thickness on their shape. the chemical composition of the particles is not affected by the tube dimensions. the waste powder characteristics suggest that this material has a potential to be used, for instance, as an input material for additive manufacturing. furthermore, the presence of nanoscale particles (with an aerodynamic diameter below 150 nm) was revealed in the waste powder. inhalation of the observed microand especially nanoparticles can cause various health problems, such as cancer or damage to the cardiovascular system. future research should be focused on studying waste powders originating from other types of alloys and on assessing the impact of various cutting process parameters on both the powders and the cut surfaces. 
acknowledgements
the presented research was financially supported by the specific research project 2107/2019 at the faculty of science, university of hradec králové.

references
[1] a. k. dubey, v. yadava. laser beam machining – a review. international journal of machine tools and manufacture 48(6):609–628, 2008. doi:10.1016/j.ijmachtools.2007.10.017.
[2] l. yang, j. wei, z. ma, et al. the fabrication of micro/nano structures by laser machining. nanomaterials 9(12):1789, 2019. doi:10.3390/nano9121789.
[3] d. teixidor, j. ciurana, c. rodriguez. dross formation and process parameters analysis of fibre laser cutting of stainless steel thin sheets. the international journal of advanced manufacturing technology 71(9-12):1611–1621, 2014. doi:10.1007/s00170-013-5599-0.
[4] a. amulevicius, k. mazeika, c. sipavicius. oxidation of stainless steel by laser cutting. acta physica polonica series a 115:880–885, 2009. doi:10.12693/aphyspola.115.880.
[5] k. krot, e. chlebus, b. kuźnicka. laser cutting of composite sandwich structures. archives of civil and mechanical engineering 17(3):545–554, 2017. doi:10.1016/j.acme.2016.12.007.
[6] a. khan, j. blackburn. laser size reduction of radioactively contaminated structures. journal of laser applications 30(3):032607, 2018. doi:10.2351/1.5040650.
[7] a. lisiecki, a. kurc-lisiecka. automated laser welding of aisi 304 stainless steel by disk laser. archives of metallurgy and materials 63(4):1663–1672, 2018. doi:10.24425/amm.2018.125091.
[8] z. brytan. the erosion resistance and microstructure evaluation of laser surface alloyed sintered stainless steels. archives of metallurgy and materials 63(4):2039–2049, 2018. doi:10.24425/amm.2018.125141.
[9] b. s. yilbas, b. j. a. aleem. dross formation during laser cutting process. journal of physics d: applied physics 39(7):1451–1461, 2006. doi:10.1088/0022-3727/39/7/017.
[10] a. riveiro, f. quintero, f. lusquiños, et al. study of melt flow dynamics and influence on quality for co2 laser fusion cutting. journal of physics d: applied physics 44(13):135501, 2011. doi:10.1088/0022-3727/44/13/135501.
[11] l. lobo, k. williams, j. tyrer. the effect of laser processing parameters on the particulate generated during the cutting of thin mild steel sheet. proceedings of the institution of mechanical engineers, part c: journal of mechanical engineering science 216(3):301–313, 2002. doi:10.1243/0954406021525016.
[12] r. mercader, s. marchetti, f. bengoa, et al. characterization of scraps produced by the industrial laser cutting of steels. hyperfine interactions 195(1-3):249–255, 2010. doi:10.1007/978-3-642-10764-1_38.
[13] a. lopez, e. assunção, i. pires, l. quintino. secondary emissions during fiber laser cutting of nuclear material. nuclear engineering and design 315:69–76, 2017. doi:10.1016/j.nucengdes.2017.02.012.
[14] j. souza, c. motta, t. machado, et al. analysis of metallic waste from laser cutting for utilization in parts manufactured by conventional powder metallurgy. international journal of research in engineering and science 4(11):1, 2016.
[15] k. elihn, p. berg. ultrafine particle characteristics in seven industrial plants. the annals of occupational hygiene 53(5):475–484, 2009. doi:10.1093/annhyg/mep033.
[16] t. ferreira, w. rasband. imagej user guide: ij 1.46r. imagej: image processing and analysis in java. https://imagej.nih.gov/ij/docs/guide/user-guide.pdf, 2012. accessed: 17 april 2019.
[17] j. powell, a. ivarson, c. magnusson. laser cutting of steels: a physical and chemical analysis of the particles ejected during cutting. part ii. journal of laser applications 5(1):25–31, 1993. doi:10.2351/1.4745321.
[18] e. cabanillas, m. creus, r. mercader. microscopic spheroidal particles obtained by laser cutting. journal of materials science 40(2):519–522, 2005. doi:10.1007/s10853-005-6118-y.
[19] e. cabanillas. transmission electron microscopy observation of nanoparticles obtained by cutting power laser. journal of materials science 39(11):3821–3823, 2004. doi:10.1023/b:jmsc.0000030748.97677.81.
[20] malvern panalytical. material characterization solutions for powder metallurgy. https://www.malvernpanalytical.com/en/assets/mrk2319_tcm50-55142.pdf, 2017. accessed: 18 september 2019.
laser cutting of steels: a physical and chemical analysis of the particles ejected during cutting. part ii. journal of laser applications 5(1):25 – 31, 1993. doi:10.2351/1.4745321. [18] e. cabanillas, m. creus, r. mercader. microscopic spheroidal particles obtained by laser cutting. journal of materials science 40(2):519 – 522, 2005. doi:10.1007/s10853-005-6118-y. [19] e. cabanillas. transmission electron microscopy observation of nanoparticles obtained by cutting power laser. journal of materials science 39(11):3821 – 3823, 2004. doi:10.1023/b:jmsc.0000030748.97677.81. [20] malvern panalytical. material characterization solutions for powder metallugy. https://www.malvernpanalytical.com/en/assets/ mrk2319_tcm50-55142.pdf, 2017. accessed: 18 september 2019. 476 http://dx.doi.org/10.1016/j.ijmachtools.2007.10.017 http://dx.doi.org/10.3390/nano9121789 http://dx.doi.org/10.1007/s00170-013-5599-0 http://dx.doi.org/10.12693/aphyspola.115.880 http://dx.doi.org/10.1016/j.acme.2016.12.007 http://dx.doi.org/10.2351/1.5040650 http://dx.doi.org/10.24425/amm.2018.125091 http://dx.doi.org/10.24425/amm.2018.125141 http://dx.doi.org/10.1088/0022-3727/39/7/017 http://dx.doi.org/10.1088/0022-3727/44/13/135501 http://dx.doi.org/10.1243/0954406021525016 http://dx.doi.org/10.1007/978-3-642-10764-1_38 http://dx.doi.org/10.1016/j.nucengdes.2017.02.012 http://dx.doi.org/10.1093/annhyg/mep033 https://imagej.nih.gov/ij/docs/guide/user-guide.pdf https://imagej.nih.gov/ij/docs/guide/user-guide.pdf http://dx.doi.org/10.2351/1.4745321 http://dx.doi.org/10.1007/s10853-005-6118-y http://dx.doi.org/10.1023/b:jmsc.0000030748.97677.81 https://www.malvernpanalytical.com/en/assets/mrk2319_tcm50-55142.pdf https://www.malvernpanalytical.com/en/assets/mrk2319_tcm50-55142.pdf vol. 60 no. 6/2020 analyses of waste products obtained by laser cutting. . . [21] j. liu, j. silveira, r. groarke, et al. effect of powder metallurgy synthesis parameters for pure aluminium on resultant mechanical properties. international journal of material forming 12:79 – 87, 2019. doi:10.1007/s12289-018-1408-5. [22] d. vallero. fundamentals of air pollution. academic press, waltham, 5th edn., 2014. [23] g. yu, f. wang, j. hu, et al. value assessment of health losses caused by pm2.5 in changsha city, china. international journal of environmental research and public health 16(11):2063, 2019. doi:10.3390/ijerph16112063. [24] m. elmes, m. gasparon. sampling and single particle analysis for the chemical characterisation of fine atmospheric particulates: a review. journal of environmental management 202:137 – 150, 2017. doi:10.1016/j.jenvman.2017.06.067. [25] aalco. aluminium alloy 3103 h14 datasheet. http://www.aalco.co.uk/datasheets/ aluminium-alloy-3103-h14-sheet_298.ashx, 2020. accessed: 21 july 2020. 477 http://dx.doi.org/10.1007/s12289-018-1408-5 http://dx.doi.org/10.3390/ijerph16112063 http://dx.doi.org/10.1016/j.jenvman.2017.06.067 http://www.aalco.co.uk/datasheets/aluminium-alloy-3103-h14-sheet_298.ashx http://www.aalco.co.uk/datasheets/aluminium-alloy-3103-h14-sheet_298.ashx acta polytechnica 60(6):469–477, 2020 1 introduction 2 material and methods 3 results and discussion 3.1 microscopic observations 3.2 elemental composition analysis 3.3 structural analysis 4 conclusions acknowledgements references ap02_04.vp 1 introduction the interferometric measurement technique known as electro-optic holography [1-5] is a modern noncontact measurement method based on the interference phenomenon [6,7] and phase shifting [1,7,8]. 
like other modern nondestructive digital interferometric techniques, this method can be used for very accurate measurement of static and dynamic shape deformation of structures in many areas of industry [9–15]. subsequent processing of the measured data enables a strain and stress analysis of the structures to be performed [2,16]. in contrast to classical holographic measurement techniques [17], modern optoelectronic array detectors, e.g. ccd, are used for recording the intensity of the interference field, together with highly precise phase shifting devices that enable very accurate evaluation of the phase change of the object wave field. this phase change of the object wave field is closely related to changes in the shape of the measured object surface that are caused, for example, by loading of the structure under investigation. electro-optic holography is an attractive modern method for measuring displacements and strains in the field of experimental stress analysis. the technique can be used for measuring both optically smooth and rough surfaces during static or dynamic events. the automatic evaluation process is studied during static displacement measurement, using the electro-optic holographic method to obtain the required measurement accuracy with various types of phase calculation algorithms. several multistep phase evaluation algorithms are proposed, and a complex analysis is carried out with respect to the main factors that influence the measurement and evaluation process in practice [18–21]. this paper proposes a mathematical model for analysing the main measurement factors. this model enables an analysis of the accuracy and stability of the proposed phase evaluation algorithms with respect to chosen parameters of the affecting factors. an analysis is performed of several phase calculation algorithms using this model. it is shown that the influence of various measurement errors can be effectively reduced by a suitable choice of phase measuring algorithms. the analysis can be used for a general comparison of any phase evaluation algorithm in phase shifting.

2 principle of the measurement method
the method uses the interaction of arbitrary coherent wave fields with the tested object in order to determine the change in the shape of the object. information about the displacement of the object surface is then coded into the phase of the object field, the physical properties of which are modified after reflection from the tested object. to determine this phase we allow the object wave field to interfere with the reference wave field. from the measured values of the recorded intensity of the interference field we are able to obtain phase values. consider now for simplicity two linearly polarized coherent wave fields with the same polarization vector. then for the resulting intensity of the interference field in the plane (x,y) of the detector for two different states of the tested object we obtain [5,6]

$$i_i(x,y) = a(x,y) + b(x,y)\cos\bigl[\varphi(x,y) + \alpha_i\bigr],\qquad(1)$$
$$i_{zi}(x,y) = a(x,y) + b(x,y)\cos\bigl[\varphi(x,y) + \Delta\varphi(x,y) + \alpha_i\bigr],\qquad(2)$$

where $a$ and $b$ are functions that characterize the mean intensity and modulation of the recorded interference signal, $\varphi$ is the phase difference between the object and the reference field, $i_i$ and $i_{zi}$ are the values of the intensity in the i-th frame with phase shift $\alpha_i$, and $\Delta\varphi$ is the change of the phase of the object field.
it is necessary to capture at least three phase-shifted interferograms to determine phase values $\Delta\varphi$ unambiguously with the phase shifting technique [1,7]. for static measurements, n phase-shifted interference patterns are recorded in two different states of the investigated object, e.g. in different loading states. in the general case we can derive the following equation for the phase change $\Delta\varphi$ of the object wave field at some point (x,y)

$$\tan\Delta\varphi(x,y) = \frac{\displaystyle\sum_{i=1}^{n}\sum_{j=1}^{n} i_{zj}(x,y)\,i_i(x,y)\,\bigl(c_i\,d_j - c_j\,d_i\bigr)}{\displaystyle\sum_{i=1}^{n}\sum_{j=1}^{n} i_{zj}(x,y)\,i_i(x,y)\,\bigl(c_i\,c_j + d_i\,d_j\bigr)},\qquad(3)$$

where

$$c_i = q_{11} + q_{12}\cos\alpha_i + q_{13}\sin\alpha_i,\qquad(4a)$$
$$d_i = q_{21} + q_{22}\cos\alpha_i + q_{23}\sin\alpha_i,\qquad(4b)$$

and n is the number of phase shifted intensity measurements and $\alpha_i$ is the phase shift. the quantities $q_{kl}$ can be expressed from

$$q_{11} = g_{12}g_{23} - g_{13}g_{22},\quad q_{12} = g_{12}g_{13} - g_{11}g_{23},\quad q_{13} = g_{11}g_{22} - g_{12}^2,$$
$$q_{21} = g_{12}g_{33} - g_{13}g_{23},\quad q_{22} = g_{13}^2 - g_{11}g_{33},\quad q_{23} = g_{11}g_{23} - g_{12}g_{13},\qquad(5)$$

where the matrix $\mathbf{g}$ is given by

$$\mathbf{g} = \begin{pmatrix} g_{11} & g_{12} & g_{13}\\ g_{12} & g_{22} & g_{23}\\ g_{13} & g_{23} & g_{33}\end{pmatrix} = \sum_{i=1}^{n}\begin{pmatrix} 1 & \cos\alpha_i & \sin\alpha_i\\ \cos\alpha_i & \cos^2\alpha_i & \cos\alpha_i\sin\alpha_i\\ \sin\alpha_i & \cos\alpha_i\sin\alpha_i & \sin^2\alpha_i\end{pmatrix}.\qquad(6)$$

equation (3) is a general phase calculation algorithm. the calculated phase values $\Delta\varphi$ are located in the range $[-\pi,\pi]$. the discontinuous distribution of the evaluated phase values, so called wrapped phase values, must be reconstructed (unwrapped) using suitable mathematical techniques [23,24]. the unwrapped phase values $\Delta\varphi$ are then closely related to the optical path difference between the object and reference beam and subsequently also to the displacement of the object surface. this relation can be expressed for any observed point p on the object surface as

$$\Delta\varphi(P) = \frac{2\pi}{\lambda}\,w(P) = \mathbf{s}(P)\cdot\mathbf{d}(P),\qquad(7)$$

where $\Delta\varphi$ is the phase change of the object wave field for two different states of the object, $\lambda$ is the wavelength of light, $w$ is the optical path difference, $\mathbf{d}$ is the displacement vector, and $\mathbf{s}$ is the sensitivity vector. the sensitivity vector is defined as [2]

$$\mathbf{s}(P) = \frac{2\pi}{\lambda}\bigl(\mathbf{a}(P) - \mathbf{b}(P)\bigr),\qquad(8)$$

where $\mathbf{a}$ is the illumination direction and $\mathbf{b}$ is the observation direction. the displacement vector $\mathbf{d}$ can be determined from (7) [25]. the principal scheme for measurement of displacements is shown in fig. 1.
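equations (3)–(6) can be read as the closed form of a per-pixel linear least-squares fit of the three unknowns $a$, $b\cos\varphi$ and $b\sin\varphi$ in (1), applied to each object state; the matrix of eq. (6) is exactly the normal matrix of that fit. a minimal numpy sketch of this reading (the function names are ours, not the paper's):

```python
import numpy as np

def recover_phase(intensities, alphas):
    # least-squares fit of i_i = a + b*cos(phi + alpha_i)
    #                        = a + (b cos phi) cos(alpha_i) - (b sin phi) sin(alpha_i);
    # the normal matrix of this fit is the matrix g of eq. (6)
    alphas = np.asarray(alphas, float)
    A = np.column_stack([np.ones_like(alphas), np.cos(alphas), np.sin(alphas)])
    a, u, v = np.linalg.lstsq(A, intensities, rcond=None)[0]
    return np.arctan2(-v, u)          # u = b cos(phi), v = -b sin(phi)

def phase_change(i_state1, i_state2, alphas):
    d = recover_phase(i_state2, alphas) - recover_phase(i_state1, alphas)
    return np.arctan2(np.sin(d), np.cos(d))   # wrapped to [-pi, pi], cf. eq. (3)
```

for n ≥ 3 distinct phase shifts the fit is well posed; evaluating it pixel by pixel yields the wrapped phase map that is subsequently unwrapped [23,24].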
fig. 1: measurement of displacements
fig. 2: experimental scheme of the measurement system

3 experimental arrangement for deformation measurement
we now focus on practical implementation of the described measurement technique for measuring the change in the shape of the measured object. figure 2 shows an experimental scheme of the measurement system with a piezotranslator used as a phase shifting device. phase shifting is implemented into the reference beam by shifting a small plane mirror m1 mounted on a very precise piezoelectric transducer pzt. the beam of light from the source of coherent radiation (laser) is divided into two beams by the beamsplitter bs1. the first beam (reference beam) reflects successively from mirror m1, mirror m2 and beamsplitter bs2. the second beam (object beam) reflects from mirror m3 and test object o. then the object beam passes through beamsplitter bs2. both beams (reference and object) interfere, and the ccd sensor detects the resulting intensity of the interference field in a chosen plane (x,y). the main element in the whole experimental measuring system is the computer with the control unit, which controls the precise shifting of the piezotranslator and detection of the intensity of the light with a ccd sensor.

4 analysis of the measurement and phase evaluation process
the overall accuracy of interferometric measuring techniques is expressed in terms of systematic and random errors during the measurement process. there are many factors that can influence the measurement accuracy. the sensitivity of phase calculation with respect to parameters $a$, $b$ and $\varphi$ in the interference equation (2) depends on the specific phase measuring algorithm used for measurement evaluation. generally, errors in interferometric measurements can be classified into two distinct categories: systematic and random errors. in order to identify the parameters which introduce errors into the measurement and evaluation process, the different components of the interferometric system are considered (see table 1). in practice, some of these errors can be avoided in advance, e.g. by a proper choice of components of the measuring system. the most important types of errors in the described measuring technique are random and systematic errors caused by the phase shifting device and by the detector. an important practical task is to find out the accuracy of the method for a given measurement arrangement. in the case of small changes in $\Delta\varphi$, the error of the phase difference $\Delta\varphi$ can be expressed as

$$\delta(\Delta\varphi) = \cos^2(\Delta\varphi)\;\delta(\tan\Delta\varphi),\qquad(9)$$

where the function $\delta(\tan\Delta\varphi)$ depends on the values of the intensity detection error, the phase shift error and the form of the particular evaluation algorithm. functions $\tan\Delta\varphi$ can be derived for different values of the phase shift $\alpha$ from (3). in our work, a numerical model was proposed for determining the influence of the most important measurement factors on the phase evaluation process. a study was made of the impact on the overall accuracy and stability of the phase evaluation algorithms in this method. random and systematic errors of the phase shifting device and the detector were simulated with a computer program and the resulting phase error was determined. it was assumed that the random errors behave as normally distributed quantities with the mean value zero.
table 1: error sources in the components of the measurement system (parameter – error origin – classification)
- laser: variation of mean intensity (systematic), variation of coherence (systematic), variation of laser frequency (systematic), photon noise (random)
- phase shifting device: miscalibration of phase shift (systematic), non-linearity of phase shift (systematic), inequality of phase shift (random)
- detector: electronic noise (random), quantization noise (random), non-linear detection (systematic)
- optical parts: geometrical aberrations (systematic)
- environmental parameters: vibrations (random), fluctuations of refractive index (random)
- sensitivity of the measuring system: improper arrangement of the measurement system (systematic)

the error of the phase shifting device can be modelled by the expression [19, 26]

$$\alpha_i' = \alpha_i\left(1 + \sum_{k\ge 1} c_k\,\alpha_i^{\,k-1}\right) + \sigma,\qquad(10)$$

where $\alpha_i$ is the phase shift, $c_k$ are coefficients of systematic errors of the phase shifting device, and $\sigma$ is the random error of the phase shifting device. coefficients $c_k$ describe the real (nonlinear) behaviour of the chosen phase shifting device. however, the first two coefficients $c_1$ and $c_2$ are most significant for the measurement and evaluation process in practice. the standard deviation of the random error distribution can be determined for our model from the accuracy of the phase shifting device. assume now that we use a very precise piezoelectric translator for phase shifting. the non-linearity is then in the range 0.01–0.2 % and the repeatability of the shifting is (1–10) nm [27]. from the repeatability of the phase shifting device we can calculate the corresponding phase change from the relation

$$\Delta\alpha = \frac{2\pi}{\lambda}\,\Delta w,\qquad(11)$$

where $\Delta w$ is the change of the optical path difference caused by shifting a small mirror in the path of the reference beam, and $\lambda$ is the wavelength of light. in the case of a he-ne laser with the wavelength $\lambda$ = 632.8 nm, the phase error will be approximately in the range 0.01–0.2 radians. the error in detection of the intensity of the observed interference field can be modelled on the basis of

$$i' = i\left(1 + \sum_{k\ge 1} d_k\,i^{\,k}\right) + \sigma_i,\qquad(12)$$

where $i$ is the intensity of the interference field, $d_k$ are coefficients that characterize the systematic errors in intensity detection, and $\sigma_i$ is the random error in intensity detection. coefficients $d_k$ describe the real (nonlinear) behaviour of the given detector of the intensity of the interference field. the most important factor for a real description of the detector response to the incident light is coefficient $d_1$, which describes the second order non-linear response of the detector. the standard deviation that characterizes the distribution of random errors during the detection process of the interference signal can be simulated as a fraction of the intensity incident onto the detector, i.e. $\sigma_i = p\,i$, where values of $p$ can be considered in the range 0.1–1 % with respect to the properties of the currently produced detectors used for recording the intensity of the interference field. it is important to know which properties of the individual elements of the measurement system are needed in order to obtain the required accuracy of the calculated phase values using some of the phase calculation algorithms. these factors were implemented into a numerical model that can simulate the impact on the measurement accuracy of the individual parameters that describe these factors [20].
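for illustration, the two error models can be written down directly, truncated to the dominant coefficients $c_1$, $c_2$ and $d_1$. this is a sketch under our reading of (10) and (12); the parameter defaults anticipate the values of table 2 below:

```python
import numpy as np

rng = np.random.default_rng(1)

def perturb_shifts(alphas, c1=0.01, c2=0.001, sigma=0.05):
    # eq. (10): miscalibration c1, non-linearity c2, random error sigma [rad]
    return alphas * (1.0 + c1 + c2 * alphas) + rng.normal(0.0, sigma, alphas.shape)

def detect(i, d1=0.01, p=0.01):
    # eq. (12): second-order non-linear response d1, random noise sigma_i = p*i
    return i * (1.0 + d1 * i) + rng.normal(0.0, p * np.abs(i))
```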
the model of the intensity distribution for the i-th measurement can be expressed as

$$i_i = a + b\cos\bigl(\varphi + \Delta\varphi + \alpha_i + \delta\alpha_i\bigr) + \delta_i,\qquad(13a)$$
$$a = i_r + i_0,\qquad b = 2\sqrt{i_r\,i_0},\qquad(13b)$$

where $i_r$ is the intensity of the reference beam, $i_0$ is the intensity of the object beam, $a$ is the mean intensity of the interference signal, $b$ is the modulation of the interference signal, $\Delta\varphi$ is the phase change of the object beam, $\alpha_i$ is the phase shift in the i-th intensity measurement, $\delta\alpha_i$ is the phase shift error, and $\delta_i$ is the detection error. the resulting error of phase values $\Delta\varphi$ is then given by

$$\delta(\Delta\varphi) = \Delta\tilde{\varphi} - \Delta\varphi,\qquad(14)$$

where $\Delta\tilde{\varphi}$ are the calculated phase values and $\Delta\varphi$ are the original phase values. for the performed error analysis, values $\Delta\varphi$ were considered in the range $(-\pi, \pi)$. now we can study the influence of the described factors on the accuracy of phase calculation for individual phase measuring algorithms in electro-optic holography. a root-mean-square $\sigma_{\Delta\varphi}$ of calculated phase errors was chosen as an error characteristic, i.e.

$$\sigma_{\Delta\varphi} = \sqrt{\frac{1}{m}\sum_{i=1}^{m}\delta(\Delta\varphi)_i^2},\qquad(15)$$

where $m$ is the number of computer simulations of a phase evaluation. more than 500 simulation cycles were performed to guarantee the reliability of the results. the parameters considered in the error analysis of the phase calculation algorithms are shown in table 2.

table 2: parameters considered in the error analysis (coefficient – value – units)
- c1 = 1 %
- c2 = 0.1 %
- σ = 0.05 rad
- d1 = 1 %
- p = 1 %

from (3) we can derive many phase calculation algorithms by a suitable choice of phase shift values $\alpha$ and the number of recorded intensity frames n needed for calculation. in identical measurement conditions, i.e. with the same error factors, the algorithms will differ in their sensitivity to these factors. the following text describes several phase calculation algorithms for electro-optic holography, and these algorithms are compared using our model. for simplification of description, the differences between the different intensity measurements were denoted as

$$a_{i,k} = i_i - i_k,\qquad b_{i,k} = i_{zi} - i_{zk},\qquad(16)$$

where $i_i$ and $i_{zi}$ are the i-th intensity measurements in two different states of the observed object. $i_i$ and $i_{zi}$ are functions of the phase shift $\alpha_i$ in the i-th measurement of the intensity of the interference field. the derived phase calculation algorithms are shown in table 3. they were denoted as a1–a9. figure 3 shows the relationship between the error of phase values $\delta(\Delta\varphi)$ and phase values $\Delta\varphi$ from the range $(-\pi,\pi)$, which enables a comparison of the accuracy and stability of the phase calculation algorithms.

fig. 3: relationship between phase error δ(Δφ) and phase values Δφ

we can observe that the resulting phase error $\delta(\Delta\varphi)$ is very dependent on phase values $\Delta\varphi$, and the algorithms differ in the accuracy and stability of phase calculation in the given range. with increasing number of steps n the phase error decreases, but the phase error also depends on the properties of the particular algorithm, not only on the number of steps.
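putting the pieces together, i.e. the interference model (13), the perturbations (10) and (12) and the rms characteristic (15), a compact monte carlo driver of the kind described above might look as follows. it reuses the sketches given earlier (500 cycles as in the text; all names and default values are illustrative):

```python
def rms_phase_error(n, alpha_step, dphi_true=1.0, phi=0.3,
                    i_r=1.0, i_0=1.0, m=500):
    a, b = i_r + i_0, 2.0 * np.sqrt(i_r * i_0)          # eq. (13b)
    alphas = alpha_step * np.arange(n)
    errors = np.empty(m)
    for k in range(m):
        shifts = perturb_shifts(alphas)                  # eq. (10)
        frame = lambda ph: detect(a + b * np.cos(ph + shifts))  # eqs (13a), (12)
        d = phase_change(frame(phi), frame(phi + dphi_true), alphas) - dphi_true
        errors[k] = np.arctan2(np.sin(d), np.cos(d))     # eq. (14), wrapped
    return np.sqrt(np.mean(errors ** 2))                 # eq. (15)

# e.g. a hypothetical five-step algorithm with alpha = pi/2:
print(rms_phase_error(n=5, alpha_step=np.pi / 2))
```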
from a practical viewpoint, the time needed for the measurement and its automatic evaluation is also important. therefore the time for phase calculation using various phase evaluation algorithms was also determined. table 4 shows the relative computing time, which is taken as the ratio of the computing time for a given algorithm and the minimum computing time for all the algorithms. the computing time was obtained using the computer simulation of phase evaluation with different phase calculation algorithms. it is reasonable to assume that phase calculation algorithms with a larger number of steps n are more time consuming, but on the basis of our analysis we can see that the increase in computing time is not very rapid. the difference in computing time between © czech technical university publishing house http://ctn.cvut.cz/ap/ 39 acta polytechnica vol. 42 no. 4/2002 a1: n � 3, � � � 4 tan , , , , , , , , �� � � � a b a b a b a b 12 3 2 3 2 12 12 12 3 2 3 2 a2: n � 3, � � � 2 � � � � tan , , , , , , , , , , �� � � � � � � a b b b a a a b a a 1 3 2 3 12 1 3 2 3 12 1 3 1 3 2 3 1� �� �2 2 3 12b b, ,� a3: n � 3, � � 2 3� � � � �� � tan , , , , , , , , , �� � � � � � � 3 3 3 2 12 1 3 3 2 12 1 3 3 2 3 2 12 a b b b a a a b a a� �� �1 3 12 1 3, , ,b b� a4: n � 5, � � � 2 � � � � � � tan , , , , , , , , , �� � � � � � � 2 2 4 3 1 3 5 2 4 3 1 3 5 3 1 3 5 3 1 3 b a a a b b a a b b� �, , ,5 2 4 2 44� a b a5: n � 5, � � � 2 � � tan , , , , , , , , �� � � � � � � �7 3 3 3 34 2 1 3 5 3 12 5 4 4 2 1 3 5 3a b b b b b a a a� �� � � � 12 5 4 4 2 4 2 1 3 5 3 12 5 4 1 3 549 3 3 3 3 , , , , , , , , , , � � � � � � a a b a a a a b b� �3 12 5 4� �b b, , a6: n � 7, � � � 2 � �� � � �� � tan , , , , , , , , �� � � � � � �7 4 4 7 4 4 4 3 5 17 4 2 4 6 3 5 17 4 2 4 6b b a a a a b b � �� � � �� �a a b b a a b b4 2 4 6 4 2 4 6 3 5 17 3 5 174 4 4 7 7, , , , , , , ,� � � � � a7: n � 7, � � � 3 � �� � � � tan , , , , , , , , , �� � � � � � � � �3 6 2 5 3 1 3 2 4 6 4 7 5 6 2 5 3 1 3a a b b b b b b a� �� � � �� � a a a a a b b a a a a 2 4 6 4 7 5 6 2 5 3 6 2 5 3 1 3 2 4 6 43 , , , , , , , , , , � � � � � � � �� �� �7 5 1 3 2 4 6 4 7 5, , , , ,b b b b� � � a8: n � 9, � � � 4 � �� � tan , , , , , , , , , �� � � � � � � � �b b b a a a a a a2 8 3 7 4 6 4 1 5 2 5 8 6 9 2 8 32 2� �� � � � 7 4 6 4 1 5 2 5 8 6 9 2 8 3 7 4 6 2 8 3 72 2 � � � � � � � a b b b b a a a b b , , , , , , , , , ,� � � �� �� � � � � � � �b a a a a b b b b4 6 4 1 5 2 5 8 6 9 4 1 5 2 5 8 6 9, , , , , , , , , a9: n �11, � � � 2 � � � �� � tan , , , , , , , �� � � � � � � �4 8 15 2 4111 9 3 5 7 2 4 10 8 6 4 6 8 1b b b a a a a a� � � �� �, , , , , , , , , 11 9 3 5 7 2 4 10 8 6 4 6 8 111 9 3 8 15 2 8 1 � � � � � � � a a b b b b a a� �� � � �� �5 8 15 16 25 7 111 9 3 5 7 2 4 10 8 6 4 6 8 2a b b b a a a a b, , , , , , , , ,� � � � � � � �� �4 10 8 6 4 6 82� � �b b b, , , table 3 the fastest three-step and the slowest eleven-step algorithm is approximately 25 %. it can also be seen that the computing time does not depend directly on the increasing number of steps n. for example, the computing time for algorithms a1, a2 and a4, which differ in the number of steps required for phase evaluation, is practically the same. in order to determine the computing time it is necessary to consider the number of mathematical operations needed for phase calculation with each particular algorithm. however, it should be noted that the time for the phase shifting process itself, i.e. 
shifting the piezotranslator between individual captured intensity frames, needs to be included in the total time for phase evaluation. if we try to summarize the results of the performed analysis, it will in most cases in practice be sufficient to use five-step phase calculation algorithms, which are very accurate and less time consuming. to obtain greater measurement accuracy, algorithms with a larger number of steps can be used, but the practical application of phase calculation algorithms with a greater number of intensity measurements depends on the specific character of the measurement. these algorithms need a longer time to record all frames, which may not satisfy the requirements for the measurement, e.g. in the case of a measurement in an environment with quickly changing thermo-mechanical parameters.

5 conclusion
we have described a noncontact interferometric measurement technique that can be used for deformation measurement in industry. the method is based on the principle of interference of arbitrary coherent wave fields and the phase shifting technique for automatic analysis of a measurement in real time. it can be used for very precise testing of various types of structures and objects in science and engineering. in order to detect the interference field, modern optoelectronic elements are used together with computers. this enables the measurement analysis to be carried out automatically in real time using suitable phase calculation algorithms. a general equation for phase evaluation was described, and several phase calculation algorithms were derived. complex error analysis was performed on them. the influence of the main factors that affect the accuracy of phase evaluation was considered in the error analysis. it is shown that phase measurement errors can be decreased by a proper choice of the phase calculation algorithm. the analysis can be applied for comparing any phase measurement algorithms.

acknowledgement
this work was supported by grant no. 103/03/p001 of the grant agency of the czech republic.

references
[1] malacara, d.: optical shop testing. new york: john wiley & sons, 1992.
[2] kreis, t.: holographic interferometry: principles and methods. berlin: akademie verlag, 1996.
[3] stetson, k. a.: electro-optic holography and its application to hologram interferometry. applied optics, 1985, vol. 24, no. 21, p. 3631.
[4] stetson, k. a., brohinsky, w. r.: electro-optic holography system for vibration analysis and non-destructive testing. optical engineering, 1987, vol. 26, no. 12, p. 1234.
[5] miks, a.: applied optics 10 (in czech). prague: ctu publishing house, 2000.
[6] born, m., wolf, e.: principles of optics. 6th ed. new york: pergamon press, 1980.
[7] creath, k.: phase-measurement interferometry techniques. progress in optics, vol. xxvi, amsterdam: elsevier science, 1988.
[8] robinson, d. w., reid, g. t.: interferogram analysis: digital fringe pattern measurement techniques. bristol: institute of physics publishing, 1993.
[9] rastogi, p. k.: digital speckle pattern interferometry and related techniques. new york: john wiley & sons, 2001.
[10] cloud, g.: optical methods of engineering analysis. cambridge: cambridge univ. press, 1998.
[11] rastogi, p.
k.: handbook of optical metrology. boston: artech house publishing, 1997.
[12] jacquot, p., fournier, j. m. (ed.): interferometry in speckle light: theory and applications. berlin: springer verlag, 2000.
[13] osten, w., jüptner, w. (ed.): fringe 2001: the 4th international workshop on automatic processing of fringe patterns. paris: elsevier, 2001.
[14] osten, w., jüptner, w., kujawinska, m. (ed.): optical measurement systems for industrial inspection ii. spie proceedings vol. 4398, washington: spie, 2001.
[15] rastogi, p. k., inaudi, d.: trends in optical non-destructive testing and inspection. amsterdam: elsevier, 2000.
[16] kobayashi, a. s. (ed.): handbook of experimental mechanics. new jersey: prentice hall, 1987.
[17] vest, ch. m.: holographic interferometry. new york: john wiley & sons, 1979.
[18] miks, a., novak, j.: non-contact measurement of static deformations in civil engineering. proceedings of odimap iii: optoelectronic distance/displacement measurements and applications. pavia: university of pavia, 2001, p. 57–62.
[19] novak, j.: computer simulation of phase evaluation process with phase shifting technique. physical and material engineering 2002. prague: ctu, 2002, p. 87–88.
[20] novak, j.: error analysis of three-frame algorithms for evaluation of deformations. interferometry of speckle light: theory and applications. berlin: springer verlag, 2000, p. 439–444.
[21] miks, a., novak, j.: application of multi-step algorithms for deformation measurement. spie proceedings vol. 4398, washington: spie, 2001, p. 280–288.
[22] novak, j.: error analysis for deformation measurement with electro-optic holography. fine mechanics and optics, vol. 6, 2000, p. 166.
[23] ghiglia, d. c., pritt, m. d.: two-dimensional phase unwrapping: theory, algorithms and software. new york: john wiley & sons, 1998.
[24] novák, j.: methods for 2-d phase unwrapping. in matlab 2001 proceedings. prague: všcht publishing house, 2001.
[25] stetson, k. a.: use of sensitivity vector variations to determine absolute displacements in double exposure hologram interferometry. applied optics, 1990, vol. 29, no. 4, p. 502.
[26] zhang, h., lalor, m. j., burton, d. r.: error-compensating algorithms in phase-shifting interferometry: a comparison by error analysis. optics and lasers in engineering, 1999, vol. 31, p. 381.
[27] physik instruments catalogue, 2002.

ing. jiří novák, ph.d., phone: +420 224 354 435, fax: +420 233 333 226, e-mail: novakji@fsv.cvut.cz
department of physics, czech technical university in prague, faculty of civil engineering, thákurova 7, 166 29 prague 6, czech republic

inlet channel for a ducted fan propulsion system of a light aircraft
e. ritschl, r. theiner, d. hanus
abstract
so-called "cold-jet" propulsion units consist of a piston engine, a blower and the necessary air duct. till now, all attempts to utilize "cold-jet" propulsion units to maintain the thrust of an airplane have been unsuccessful. analysis has shown that the main difficulty is the deformation of the flow field at the entry to the blower [1].
keywords: aircraft, ducted fan, cold-jet, inlet channel, aerodynamics.

1 introduction
the department of aircraft engineering at the czech technical university in prague is developing a light aircraft powered by a ducted fan ("cold-jet") propulsion unit. the plane is a replica of the well-known czech l-39 training jet plane. a drawing of the airplane is depicted in fig. 1.

main technical data
two-tandem seated, low wing monoplane, all composite aircraft. dimensions:
- wing span: 7.2 m
- wing section: ms 03xx
- length: 7.34 m
- height: 3.0 m
- wing area: 8.5 m²
- weight empty/take-off: 280 kg / 450 kg
- motor performance: 110 kw

the proposed aircraft will be certificated in the very light or micro-light (ultra-light) aircraft category. hence, the maximum take-off weight must be less than 450 kg (including two pilots). the structure of the aircraft is made of composite materials.
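two derived figures follow immediately from the quoted data (this is our arithmetic, not stated in the paper):

```python
takeoff_mass = 450.0     # kg, quoted maximum take-off weight
wing_area = 8.5          # m^2, quoted wing area
power = 110.0e3          # w, quoted motor performance

print(f"wing loading: {takeoff_mass / wing_area:.1f} kg/m^2")        # ~52.9
print(f"power-to-weight: {power / (takeoff_mass * 9.81):.2f} w/n")   # ~24.9
```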
previous unsuccessful attempts to utilize a "cold jet" power unit in an airplane structure showed the crucial lack of effective power output of such a propulsion unit. the problem of the micro-light or ultra-light aircraft category is the strict restriction on maximum airplane weight. usually, the utilization of a more powerful engine substantially increases the weight of the plane structure. analysis of engines used nowadays for the micro-light aircraft category shows that their effective power output is insufficient for "cold jet" propulsion of a small aircraft. the problem of the necessary power output for our engine was solved by using a yamaha yzf-r1 motorcycle piston engine. the engine of the yamaha yzf-r1 is derived from a motorcycle racing engine and its performance to weight ratio is 2.5. the maximum performance of the yamaha yzf-r1 engine is 110 kw. in addition, neither a reducer nor a built-in engine gearbox is necessary for propulsion of the ducted fan.

fig. 1: drawing of the l-39 replica

the thin-walled shaft (fig. 2) implements the power transmission directly from the engine to the fan. the ducted fan and the inlet channel are designed to keep a laminar boundary layer. the fan consists of fourteen rotor blades and seven stator blades. the fan with an outlet nozzle provides 1.8 kn effective thrust. according to our calculation, this thrust has to be generated to maintain a maximum speed of 300 km/h. during the landing, a fowler flap will be used. the landing speed will be 65 km/h.

2 the design of the inlet channels (3d model)
the inlet channel is designed to keep a laminar boundary layer all along the channel length. in this case, the minimum inner energy of the flow will dissipate. our boundary layer calculation [5] shows that the optimum channel cross-section taper is 0.54 per 1 m of channel length (fig. 3). the chord ratio of the inlet channel is 0.64. input data for design:
1. the outlet cross-section is given by the fan dimensions. these are: inner diameter d = 0.300 m, outer diameter D = 0.580 m.
2. the shape of the inlet cross section has to fulfill the flow solution and the fuselage design condition.
3. the requirement of minimum curvature of the curve joining the channel cross-section centroids (minimum secondary flow [4]).
4. smoothness and continuity of the channel chord ratio are the main requirements of channel design.
to maintain the laminar boundary layer throughout the length of the channel, it is important to continually downsize the cross-section areas of the channel [2, 3]. fig. 4 represents the visual smoothness control of the designed channel and the shape of the channel face curves. during the development process, more than ten versions of the channel design were made. some of them are shown in fig. 5. the first version depicted in fig. 5 was made fully analytically. however, the analytical solution of this task is very tedious and time consuming.
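as a quick check of the design input data, the outlet (fan annulus) area follows from the quoted fan diameters; the tapered area schedule below is only an illustration of the quoted "taper of 0.54 per 1 m" under our linear reading of that figure:

```python
import math

d_inner, d_outer = 0.300, 0.580                        # m, quoted fan dimensions
a_outlet = math.pi / 4.0 * (d_outer**2 - d_inner**2)   # ~0.1934 m^2
print(f"outlet (fan annulus) area: {a_outlet:.4f} m^2")

def area(x, a_inlet, taper=0.54, length=1.0):
    # hypothetical linear reduction of the cross-section area along the channel
    return a_inlet * (1.0 - (1.0 - taper) * x / length)
```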
the other channel model versions were made using the unigraphics professional 3d cad system. the versions have varying shapes of the leading part, air entrance and the relative position of the fan axis. the aim of all design variants was to achieve the best velocity distribution at the channel exit. the aerodynamic quality of each variant was checked using the fluent professional cfd calculation system. for these calculations, similar boundary conditions were used for each design variant. velocity w1 = 50 m/s was used as the "inlet velocity", and a uniform distribution was assumed. for aerodynamic calculations, the incompressible condition of flow was used. the standard k–ε turbulence model with a turbulence intensity of 0.2 % was used.

fig. 2: accommodation of pilots, engine and fan – 3d cad model
fig. 3: the channel cross-section layout
fig. 4: visual smoothness control of the channel faces

3 description of the presented channel design variants
version no. 1: fully analytical design. the channel has the shape of a diffuser. the velocity distribution is very unsatisfactory. at the bottom of the channel the value of the air flow velocity is about 20 m/s, while at the top it is 120 m/s.
version no. 5: in this version, the main focus was on the geometrical similarity of the inlet and outlet cross-section shape. the resultant shape of the channel does not correspond to that of a real l-39 airplane. the velocity distribution in the channel has been improved; however, the main flow occurred in the top part of the channel cross-section.
version no. 10: the shape of the channel cross-section was improved by filleting the corners. the main axis of the fan was moved up. as can be seen in fig. 5, the air-flow distribution in the channel is not uniform. there are three areas of higher flow velocity. this can negatively influence the stress of the rotor blades.
version no. 12: the final version. the inlet cross-section has been enlarged in the bottom area. the cfd calculation showed that the flow distribution of this channel is relatively uniform. the velocity in the channel cross section varies by max. 6 % about the mean value of the flow velocity.

fig. 5: some of the design versions of the proposed channel (channel design and cfd flow computation for versions no. 1, 5, 10 and 12)

4 experiment
the aerodynamic behavior of the final version of the inlet channel design was experimentally tested. for the experimental tests, a model of the proposed channel was made, on a scale of 1:1.934. the model was handmade from glass fiber and a positive form was used. the form itself was made with the use of cam technology based on a ug model. the aim of the aerodynamic experiment was to confirm the expected outlet velocity field. the experiment was carried out in the wind tunnel of laboratory u 207.1 at ctu in prague. a five-hole probe was used for investigating the velocity field of the outlet channel. the investigation itself was in two steps.

fig. 6: experimental measurement of proposed channel
fig. 7: comparison of the 2d velocity field data (cfd method – fluent 6.0 vs. experimental data), re = 4.12×10⁶

5 comparison of the cfd calculation and the 2d experimental measurement (re = 4.12×10⁶)
in the first step, the 2d outlet velocity field was determined. the probe position setting and data recording are manual. the field is mapped at 250 points. the second step of the experimental measurement is the mapping of the 3d outlet velocity field. semiautomatic probe setting and automatic data recording are used for 3d mapping of the flow field, see fig. 7. the outlet velocity map has about 500 points. the two experimental measurements are now in progress and the data is being evaluated.

conclusion
this paper attempts to demonstrate the possibility of applying a ducted fan in the design of micro-light or ultra-light aircraft. however, only our experimental results will prove the accuracy of the calculations presented here. the preliminary results from the experimental test show that the inner dissipation of the flow energy is below 3 percent and that utilization of the "cold jet" propulsion unit is possible.

references
[1] marc de piolenc, f., wright jr., g. e.: ducted fan design. 1997, v. 1, p. 61–86.
[2] goldsmith, e., seddon, j.: practical intake aerodynamic design. aiaa, 1993.
[3] seddon, j., goldsmith, e.: intake aerodynamics. second edition, aiaa, 1999.
[4] jerie, j.: elementare theorie der doppelwirbel in gekrummten kanalen. wiss. zeit. d. tu dresden, 37, 1988, p. 87–93.
[5] thwaites, b.: approximate calculations of the laminar boundary layer. aero quart., 1, 1949, p. 245–280.

ing. erik ritschl, phone: +420 224 357 492, e-mail: erik.ritschl@fs.cvut.cz
ing. robert theiner, phone: +420 224 357 423, e-mail: robert.theiner@fs.cvut.cz
ing. daniel hanus, csc., phone: +420 224 357 482, e-mail: daniel.hanus@fs.cvut.cz
department of aerospace engineering, czech technical university, faculty of mechanical engineering, karlovo nám. 13, 121 35 prague 2, czech republic

experimental analysis of sandstone and travertine
t. doležel, m. drdácký, p. konvalinka, l. kopecký
abstract
sandstone and travertine are sedimentary rocks. the former is clastic, while the latter is sourced by chemical precipitation from hot springs. their applications in civil engineering structures are mostly influenced by the ability to carry compression loading.
a three-point bending experiment is usually used to determine material characteristics. however, it does not correspond very well to applications in structures. for this reason we used a uniaxial compression test to obtain the modulus of elasticity and the stress-strain diagram. to obtain detailed information about the crystalline structure of sandstone and travertine, a microscopic analysis was carried out, using optical microscopy and an edax multichannel spectrometer for elementary microanalysis.
keywords: sandstone, travertine, edax point analysis, microscopic analysis, stress-strain diagram in compression, modulus of elasticity.

1 introduction
the sandstone and travertine that we tested are brittle materials. a three-point bending experiment is usually used to investigate their material parameters. however, because of the brittleness of the materials and their application in engineering structures, it was found that a uniaxial compression test is more appropriate. if the loading is performed by increasing the deformation in the course of the experiment, we can obtain a stress-strain diagram including a part of the descending branch of the curve, and we can observe the compressive softening. if we use the same methodology for an experimental investigation of a different type of engineering materials, we have an opportunity to compare their material characteristics. together with detailed information about the mineralogical composition and rock structure from a microscopic analysis, we can obtain efficient calculation data.

fig. 1: micrograph of a thin section of sandstone (in crossed nicols, m = 200)
fig. 2: edax point analysis of a quartz grain. very pure quartz without accessories

2 experimental
all the experimental tests were prepared in the laboratory of the department of structural mechanics at the faculty of civil engineering of ctu in prague. the uniaxial compression tests were carried out using the grond dsm 2500 apparatus, which consists of a stiff loading frame, and is provided with a hydraulic servomechanism which was used when loading a specimen under deformation control. a constant strain rate of $10^{-5}$ was used. the axial strains were measured by means of tensometric strain gauges located on the loading frame. sandner exa strain gauges were used, with the measuring base equal to 10 mm. special care was taken when preparing the specimens [1]. in accordance with the pilot tests, the sizes of the prism specimens were specified as 50×50×200 mm. five specimens of each material were investigated. those tests providing the maximum and minimum values were ignored. the microstructure of the specimen was observed on the xl 30 esem-tmp phillips environmental scanning electron microscope equipped with an edax multichannel spectrometer for elementary microanalysis. the esem mode facilitates measurement at a different vacuum level and in a chamber with different environmental conditions.

2.1 sandstone
in the case of sandstone we recognized that the sedimentary rock consists of clastic grains and cement. almost 99 % of the grains are quartz particles. the cement consists of chlorites, clay minerals and limonite. the microscopic pattern in the crossed nicols of a thin section of the sandstone is shown in fig. 1. a grey palette of quartz grains, dark coloured particles of biotite and microcrystallic aggregate of chlorite minerals and white limonite can be seen. the highest strength of the sandstone specimen was achieved by specimen no. 2 (see fig. 4). its maximum strength was 27.5 mpa. the average value of the modulus of elasticity was found to be 11.85 gpa.

fig. 3: edax point analysis of the chlorite group minerals in the cement matrix of the rocks. probably diabanite
fig. 4: stress-strain diagram of a sandstone specimen
fig. 5: specimen of sandstone after the experiment
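the paper does not spell out how the modulus of elasticity was extracted from the records; a minimal sketch of the usual approach, a linear fit over a chosen window of the ascending branch of the stress-strain record (the window fractions below are illustrative, not the authors' procedure):

```python
import numpy as np

def young_modulus(strain, stress, window=(0.1, 0.4)):
    """slope of a linear fit over a fraction of the pre-peak branch;
    result is in stress units per unit strain (e.g. mpa -> e in mpa)."""
    strain, stress = np.asarray(strain), np.asarray(stress)
    peak = int(np.argmax(stress))               # end of the ascending branch
    lo = int(window[0] * peak)
    hi = max(lo + 2, int(window[1] * peak))     # keep at least two points
    return np.polyfit(strain[lo:hi], stress[lo:hi], 1)[0]
```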
2.2 travertine
from the point of view of the mineralogical composition, travertine is a simpler but structurally very porous rock, consisting of aragonite and of clay minerals and chlorites. the microscopic pattern of a thin section of travertine is shown in fig. 6. allotriomorphous particles of aragonite and gray agglomerates of clay minerals and chlorites filled with finely dispersed limonite can be observed. the highest strength of a travertine specimen was found for specimen no. 3 (see fig. 9). the maximum strength reached 50.8 mpa. the average value of the modulus of elasticity was 20.19 gpa.

3 conclusion
this work has presented an experimental analysis of the material characteristics of sedimentary rocks: sandstone and travertine. the strength and modulus of elasticity in uniaxial compression and the microstructural analysis of the specimen were investigated. the results show that:
- the use of uniaxial compression tests for determining the material characteristics provides more efficient data for calculations,
- knowledge of the microstructure of sedimentary rocks enables a comparison of the different materials and their material characteristics,
- it is useful to have a database of such materials which contains all the required engineering parameters.

acknowledgment
this work on an experimental investigation of sandstone and travertine was supported by the ministry of education of the czech republic under grant no. msm 210000004.

references
[1] konvalinka p. et al.: "methodics of determination of mechanical characteristics of concrete in compression." workshop ctu, prague, 2000, pp. 250–251.
[2] snethlage r., meinhardt-degen j.: "requirements for re-treating natural stone facades. an overview over the assessment parameters." bavarian state department of historic monuments, munich, germany, 2002.

fig. 6: micrograph of travertine (in crossed nicols, m = 100)
fig. 7: edax point analysis of travertine (aragonite)
fig. 8: edax point analysis of the interstitial matrix of travertine – clay minerals and chlorites
fig. 9: stress-strain diagram of a travertine specimen
fig. 10: specimen of travertine after the experiment

ing. tomáš doležel, phone: +420 224 355 417, e-mail: tomas.dolezel@fsv.cvut.cz
ing. miloš drdácký, drsc., phone: +420 286 885 382, fax: +420 286 884 634, e-mail: drdacky@itam.cas.cz, útam av čr, prosecká 76, 190 00 praha 9, czech republic
doc. ing. petr konvalinka, csc., phone: +420 224 354 306, e-mail: conwa@fsv.cvut.cz
rndr. lubomír kopecký, phone: +420 224 354 823, e-mail: lubomir.kopecky@fsv.cvut.cz
dept of structural mechanics, czech technical university in prague, faculty of civil engineering, thákurova 7, 166 29 praha 6, czech republic

triangulation of 3d surfaces recovered from stl grids
d. rypl, z. bittnar
abstract
in the present paper, an algorithm for the discretization of parametric 3d surfaces has been extended to the family of discrete surfaces represented by stereolithography (stl) grids. the stl file format, developed for the rapid prototyping industry, is an attractive alternative to surface representation in solid modeling. initially, a boundary representation is constructed from the stl file using feature recognition. then a smooth surface is recovered over the original stl grid using an interpolating subdivision procedure. finally, the reconstructed surface is subjected to the triangulation accomplished using the advancing front technique operating directly on the surface. the capability of the proposed methodology is illustrated on an example.
keywords: 3d surface, stereolithography format, interpolating subdivision, advancing front technique.

1 introduction
in recent decades, the finite element method (fem) has become the most powerful tool for structural analysis. automatic generation of consistent, reproducible, high quality meshes without user intervention makes the power of the finite element analysis accessible to those not expert in the area of mesh generation. therefore, tools for automated and efficient mesh generation, including the discretization of 3d surfaces, are important prerequisites for the complete integration of the finite element method with design processes in cad, cae, and cam systems. an important class of 3d surfaces is the group of surfaces described by the stereolithography (stl) file format. this format approximates 3d surfaces of a solid model with oriented triangles (facets) of different size and shape (aspect ratio) in order to achieve a smooth enough representation suitable for industrial processing of 3d parts using stereolithography machines.
however, such a representation is not appropriate for computational analysis using fem. the aim of this paper is to extend a recently developed algorithm for discretization of discrete 3d surfaces [1] to the class of surfaces whose geometry is described by discrete data in the stl format. the actual discretization consists of several phases. initially, a boundary representation of the entire model is constructed from the stl file using feature recognition based on appropriate topological and geometrical operations. in this way, distinct model entities (vertices, curves, and surfaces) of a topological nature (topological features) or with important geometrical aspects (sharp features) are established. in the current implementation, the geometrical operations are based on the dihedral and turning angle, and on the aspect ratio of two neighbouring facets. note that the current implementation makes no attempt to detect the volumes. in the next phase, a smooth (limit) surface is recovered over the original stl grid. this is accomplished using the interpolating subdivision based on the modified butterfly scheme, which yields c1 surfaces (even in a topologically irregular setting). similarly, the limit boundary curves are recovered using one-dimensional interpolating subdivision producing c1 curves. in the last phase, the reconstructed limit surface is subjected to triangulation accomplished using the advancing front technique operating directly on the limit surface. this avoids difficulties with the construction of smooth parametrization of the whole surface.

the paper is organized as follows. in section 2, the stl file format is described. section 3 outlines the extraction of the boundary representation of the model. the reconstruction of a smooth 3d surface from the discrete stl data using the subdivision technique is recalled in section 4. section 5 briefly describes the actual mesh generation and presents the capabilities of the algorithm on an example. the paper ends with concluding remarks in section 6.

2 stl file format
an stl file is a triangular representation of a 3d surface geometry. the surface is tessellated logically into a set of oriented triangles (facets). each facet is described by the unit outward normal and three points listed in counterclockwise order representing the vertices of the triangle. while the aspect ratio and orientation of the individual facets is governed by the surface curvature, the size of the facets is driven by the tolerance controlling the quality of the surface representation in terms of the distance of the facets from the surface. the choice of the tolerance is strongly dependent on the target application of the produced stl file. in industrial processing, where stereolithography machines perform a computer controlled layer by layer laser curing of a photo-sensitive resin, the tolerance may be in the order of 0.1 mm to make the produced 3d part precise with highly worked out details. however, much larger values are typically used in pre-production stl prototypes, for example for visualization purposes. the native stl format has to fulfill the following specifications:
(i) the normal and each vertex of every facet are specified by three coordinates each, so there is a total of 12 numbers stored for each facet.
(ii) each facet is part of the boundary between the interior and the exterior of the object. the orientation of the facets (which way is "out" and which way is "in") is specified redundantly in two ways, which must be consistent.
first, the direction of the normal is outward. second, the vertices are listed in counterclockwise order when looking at the object from the outside (right-hand rule).
(iii) each triangle must share two vertices with each of its adjacent triangles. this is known as the vertex-to-vertex rule.
(iv) the object represented must be located in the all-positive octant (all vertex coordinates must be positive).
however, for non-native stl applications, the stl format can be generalized. the normal, if not specified (three zeros might be used instead), can be easily computed by the application from the coordinates of the vertices using the right-hand rule. moreover, the vertices can be located in any octant. and, finally, the facet can even be on the interface between two objects (or two parts of the same object). this makes the generalized stl format suitable for modelling 3d non-manifold objects.

3 extraction of boundary representation
although an stl file represents a fully conforming grid, the construction of a boundary representation suitable for further processing is not trivial. first of all, the stl file has to be converted to a topologically more consistent form (called a background (bg) file hereafter) containing initially a list of vertices (called bg nodes hereafter), each one defined by an identification number (id) and three coordinates, and a list of facets (called bg faces hereafter), each one defined by id and three bg nodes ids. the background file is then extended by a list of bg edges, each defined by id and two bg nodes ids. note that each bg edge must coincide with a side (called a bg side hereafter) of at least one bg face. firstly, bg edges corresponding to topological features are detected. these are all those bg sides that are not shared exactly by two bg faces. then bg edges corresponding to geometrical (sharp) features are identified. three criteria are used in the following order. the first criterion is based on finding all neighbouring bg faces that form a continuous plane. then those bg sides that are shared only by one bg face corresponding to a particular plane formed by at least three bg faces are marked as bg edges. the second criterion whether a bg side forms a sharp feature uses the dihedral angle (based on the angle of normals) between the two neighbouring bg faces sharing that bg side. should this angle exceed a user specified threshold, the bg side is marked as a bg edge.
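two of the steps just described are easy to make concrete: the conversion of the stl data into bg nodes and bg faces, and the dihedral-angle test for sharp bg edges. a minimal python sketch (ascii stl only; the function names and the 30° threshold are illustrative, and coincident vertices are merged by exact coordinate match, relying on the vertex-to-vertex rule):

```python
import numpy as np

def read_ascii_stl(path):
    """bg nodes (unique vertex coordinates) and bg faces (vertex index triples)."""
    nodes, index, faces, tri = [], {}, [], []
    with open(path) as f:
        for line in f:
            parts = line.split()
            if parts[:1] == ["vertex"]:
                xyz = tuple(float(p) for p in parts[1:4])
                if xyz not in index:              # merge coincident vertices
                    index[xyz] = len(nodes)
                    nodes.append(xyz)
                tri.append(index[xyz])
                if len(tri) == 3:
                    faces.append(tuple(tri))
                    tri = []
    return nodes, faces

def normal(nodes, face):
    a, b, c = (np.asarray(nodes[i], float) for i in face)
    n = np.cross(b - a, c - a)
    return n / np.linalg.norm(n)

def sharp_side(nodes, face1, face2, threshold_deg=30.0):
    """dihedral-angle criterion for the bg side shared by two bg faces."""
    cosang = np.clip(np.dot(normal(nodes, face1), normal(nodes, face2)), -1.0, 1.0)
    return np.degrees(np.arccos(cosang)) > threshold_deg
```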
taking into account that the aspect ratio of bg faces corresponds to the curvature of the surface, the third criterion is based on the ratio of heights of two neighbouring bg faces with respect to the shared bg side. should this ratio exceed a user prescribed threshold, and the normals of those bg faces are not the same and do not exceed the angle threshold, then the bg side is marked as a bg edge. when all bg edges are identified, the bg nodes corresponding to topological features are detected and classified as model vertices. these are all bg nodes that are not shared exactly by two bg edges. then the bg nodes corresponding to geometrical features are searched for. again three criteria are used. the first criterion is based on finding all neighbouring bg edges that form a continuous straight line. then those bg nodes that are shared only by one bg edge corresponding to a particular line formed by at least two bg edges are classified as model vertices. the second criterion classifies as a model vertex all those bg nodes that are shared by two bg edges at a turning angle above a user prescribed value. and thirdly, if the ratio of the lengths of two neighbouring non-colinear bg edges exceeds a user specified value, the bg node shared by those bg edges is also classified as a model vertex. note that the current implementation makes no attempt to detect a sharp vertex not connected to any bg edge (e.g. the tip of a cone). note also that each model vertex keeps a list of all bg edges connected to it. once the model vertices are identified, model curves can be determined by traversing the chains of connected bg edges from a starting vertex until the ending vertex is reached. every visited bg edge and its not yet classified end nodes are classified to the corresponding model curve. loops of not visited bg edges (there is no model vertex on any of those bg edges) and their end nodes are also classified to a particular model curve. finally, the model surfaces are identified. this is simply accomplished by assembling all neighbouring bg faces that do not share the same bg edge. each face and its not yet classified corner nodes are classified to the corresponding model surface. since the border of each model surface is formed by bg edges, which are in turn classified to model curves, it is easy to set up for each surface the list of boundary curves. this makes the extraction of the boundary representation complete.

the problem of the above concept consists in the fact that it gives no guidance on how to choose the individual thresholds. this is the consequence of the fact that the stl file does not possess information about the original tolerance used to generate it, and that the identification of the original geometry from the stl file is generally not unique. in other words, there may exist several geometries that are represented by the same stl file generated for the same tolerance. a reasonable strategy to tackle this problem is to use an iterative approach in which the algorithm accepts the already identified features from previous iterations. initially, conservative values of thresholds, which yield only really "sharp" features, are chosen to produce the first boundary representation of the object. alternatively, no values may be specified at all, resulting in a boundary representation based solely on topological features. next, suitable values are interactively specified for individual entities of the model to further define the boundary representation. however, even such an approach can fail to detect some significant features of the object (or can detect them only at the cost of detecting simultaneously some undesirable ones). therefore the identification procedure cannot rely only on the threshold values and must also accept an interactive input of manually selected (or deselected) bg edges. this, however, makes the process of extracting the boundary representation tedious.

4 reconstruction of a limit surface
the smooth surface over the original stl triangulation is reconstructed using a suitable subdivision technique based on the hierarchical recursive refinement of triangular simplices forming the stl mesh (fig. 1). each step of the refinement consists of two stages – splitting and averaging (fig. 2). in the splitting stage, new bg nodes are introduced exactly in the middle of individual bg edges. during the averaging, the bg nodes are repositioned to a new location evaluated as a weighted average of bg nodes in the neighbourhood (according to the so called averaging mask). as the level of refinement grows, the resulting grid approaches the so called limit surface.
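the splitting stage is easy to make concrete; a sketch of one refinement step that inserts a midnode on every bg edge and replaces each bg face by four children (the averaging stage, which in an interpolating scheme repositions only the newly created nodes, would be applied separately):

```python
def split(nodes, faces):
    nodes, mid = list(nodes), {}
    def midnode(i, j):
        key = (min(i, j), max(i, j))
        if key not in mid:
            nodes.append(tuple((a + b) / 2.0 for a, b in zip(nodes[i], nodes[j])))
            mid[key] = len(nodes) - 1
        return mid[key]
    children = []
    for a, b, c in faces:
        ab, bc, ca = midnode(a, b), midnode(b, c), midnode(c, a)
        children += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
    return nodes, children
```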
however, even such an approach can fail to detect some significant features of the object (or can detect them only at the cost of detecting simultaneously some undesirable ones). therefore the identification procedure cannot rely only on the threshold values and must also accept an interactive input of manually selected (or deselected) bg edges. this, however, makes the process of extracting the boundary representation tedious. 4 reconstruction of a limit surface the smooth surface over the original stl triangulation is reconstructed using a suitable subdivision technique based on the hierarchical recursive refinement of triangular simplices forming the stl mesh (fig. 1). each step of the refinement consists of two stages – splitting and averaging (fig. 2). in the splitting stage, new bg nodes are introduced exactly in the middle of individual bg edges. during the averaging, the bg nodes are repositioned to a new location evaluated as a weighted average of bg nodes in the neighbourhood (according to the so called averaging mask). as the level of refinement grows, the resulting grid approaches the so called limit surface. 62 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 44 no. 5 – 6/2004 in the presented implementation, the recursive subdivision based on the modified butterfly scheme [2] has been employed. it is the interpolating non-uniform stationary scheme, in which the existing bg nodes (on the current level of refinement) remain unchanged and the position of a new bg node s on the next level (fig. 3a) is calculated as s w w pr i i i n � � � � 1 , (1) where pi are bg nodes connected to the surface bg node r of valence n. the weights wr and wi corresponding to the surface averaging mask (fig. 3a) are given by w r � 3 4 , (2) w n i n i n i � � � � �� � � � 1 1 4 2 1 1 2 4 1 cos ( ) cos ( ) , � � for n > 4, (3) w w w w1 2 3 43 8 0 1 8 0� � � � �, , , , for n � 4 , (4) w w w1 2 35 12 1 12 1 12� � � � �, , , for n � 3. (5) © czech technical university publishing house http://ctn.cvut.cz/ap/ 63 acta polytechnica vol. 44 no. 5 – 6/2004 fig. 1: two levels of hierarchical refinement of a bg face fig. 2: two stages of refinement – splitting and averaging fig. 3: averaging masks: a) surface mask (n � 7), b) 4-point curve mask, c) 3-point curve mask the modified butterfly scheme exhibits favourable properties: � generality – it works with a control grid of any topology, � smoothness – it yields c1 continuous limit surface, � locality – it uses only a one-level neighbourhood and � simplicity – it ensures easy and efficient evaluation. similarly, the limit boundary curves are recovered using a one-dimensional interpolating subdivision [3] producing c1 continuous curves. the adopted 4-point (for a new bg node between two curve bg nodes) and 3-point (for a new bg node between a vertex bg node and a curve bg node) averaging masks are depicted in fig. 3b and 3c. the final interpolating procedure evaluates the position of a new bg node according to the classification and regularity of the end bg nodes of its parent bg edge (a surface bg node of valence 6 is called regular, otherwise it is called irregular): 1. for every surface bg edge bounded by an irregular surface bg node and a regular surface bg node, compute the bg midnode position using the surface averaging mask with respect to the irregular surface bg node, 2. 
2. for every surface bg edge bounded by two irregular or by two regular surface bg nodes, compute the bg midnode position using the surface averaging mask with respect to both end bg nodes and take the average,
3. for every surface bg edge bounded by a surface bg node and a non-surface bg node, compute the bg midnode position using the surface averaging mask with respect to the surface bg node,
4. for every surface bg edge bounded by two non-surface bg nodes, compute the bg midnode position as the average of the positions of the end bg nodes,
5. for every curve bg edge bounded by two curve bg nodes, compute the bg midnode position using the 4-point curve averaging mask,
6. for every curve bg edge bounded by a curve bg node and a vertex bg node, compute the bg midnode position using the 3-point curve averaging mask,
7. for every curve bg edge bounded by two vertex bg nodes, compute the bg midnode position as the average of the positions of these vertices.

an example of the application of the modified butterfly scheme to a simple domain is depicted in fig. 4. the control grid is derived from 24 triangular facets covering a regular 12-sided polygon by shifting two interior nodes (opposite with respect to the polygon centre) out of the plane of the polygon (each one in the opposite direction).

fig. 4: application of the modified butterfly scheme to a simple domain: original control grid (top left), four refinement levels (top right, middle row, and bottom left), and the limit surface (bottom right)

the properties of the limit surface (and also of the limit curve) can be deduced by a standard examination of the eigenstructure of the local subdivision matrix corresponding to the adopted subdivision scheme [4]. this includes especially the derivation of the so-called derivative masks used for the calculation of vectors tangent to the limit surface at the limit position of a bg node of the control grid. these tangent vectors are then in turn used for evaluating the limit surface normal. an important aspect is that the above seven rules for evaluating the limit position of new bg nodes were derived for more or less isotropic control grids, i.e., for control grids with elements of approximately unit aspect ratio. however, stl meshes typically consist of elements of very large aspect ratio, which is a consequence of the curvature-based tessellation procedure used to generate them. in this context, the 4th rule for the subdivision of a surface bg edge bounded by non-surface bg nodes was found too restrictive (rigid) and was replaced by the following rule:

4. for every surface bg edge bounded by two non-surface bg nodes, find the off-diagonal bg nodes of the quadrilateral formed by the two bg faces sharing that bg edge and compute the bg midnode position
(a) according to the appropriate of rules 1–3 using the temporarily swapped bg edge, if any of the off-diagonal bg nodes is a surface bg node,
(b) as the weighted average of the positions calculated according to rule 4 for the original (weight 0.75) and the temporarily swapped bg edge (weight 0.25), if both off-diagonal nodes are non-surface bg nodes.

it should be emphasized that the topological change to the stl mesh caused by swapping the bg edge should be of a temporary character only (just for the evaluation of the position of the corresponding bg midnode).
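as a check on eqs. (1)–(5), the snippet below tabulates the surface averaging mask and verifies its affine invariance (all mask weights must sum to one). the 4-point curve mask uses the standard coefficients of the four-point scheme [3]; the 3-point end mask shown is the usual quadratic-reproduction rule and is only an assumption, since fig. 3c is not reproduced here.

```python
import numpy as np

def butterfly_surface_weights(n):
    """weights (w_r, [w_1..w_n]) of the surface averaging mask, eqs. (2)-(5)"""
    w_r = 3.0 / 4.0
    if n == 3:
        w = [5/12, -1/12, -1/12]
    elif n == 4:
        w = [3/8, 0.0, -1/8, 0.0]
    else:                                  # n > 4, eq. (3) with i = 1..n
        w = [(1/4 + np.cos(2*np.pi*(i-1)/n) + 0.5*np.cos(4*np.pi*(i-1)/n)) / n
             for i in range(1, n + 1)]
    return w_r, np.asarray(w)

# affine invariance: w_r plus the p-weights must sum to one for every valence
for n in (3, 4, 5, 6, 7, 12):
    w_r, w = butterfly_surface_weights(n)
    assert abs(w_r + w.sum() - 1.0) < 1e-12

# curve masks: 4-point mask between two curve bg nodes (fig. 3b) and an assumed
# 3-point end mask between a vertex bg node and a curve bg node (fig. 3c)
mask_4pt = np.array([-1/16, 9/16, 9/16, -1/16])
mask_3pt = np.array([3/8, 3/4, -1/8])   # applied to (vertex, curve, next curve) nodes
```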
otherwise the change of connectivity (in the case of stl meshes often regular) induced by the swapping may seriously deteriorate the quality of the recovered limit surface.

5 mesh generation

having obtained the limit surface, the goal is now to generate a new triangular mesh over it, respecting a given mesh density distribution (typically prescribed at the nodes of the stl mesh serving as the control grid and/or curvature based) that makes the mesh suitable for a subsequent computational analysis. the discretization is carried out in a hierarchical manner. firstly, the model vertices are discretized. then the (limit) curves are segmented using the mass curve of the required element density along each curve. and finally, the individual (limit) surfaces are triangulated. in order to control the element size distribution, an octree is built around the domain to be discretized. the size of the individual octants corresponds approximately to the required element spacing, while the nodes (corners) of the octants store the required spacing exactly. to ensure a gradual variation of the element size, a maximum one-level difference between octants sharing an edge is enforced. this guarantees the creation of well-shaped triangles. during the actual mesh generation, the required element size at a given location is extracted from the octree by interpolating the octree nodal values of the element size.

in the presented implementation, the surfaces are discretized by the advancing front technique constrained directly to the surface and modified to reflect the surface curvature [5]. firstly, the initial front consisting of mesh edges at the boundary curves of the surface (including inner loops) is established. once the initial front has been set up, mesh generation continues on the basis of an edge removal algorithm according to the following steps until the front becomes empty:
– the first available edge ab is selected from the front,
– the position of the "ideal" point p (forming the new triangle abc) is calculated taking into account the local curvature of the limit surface and the element size variation,
– the projection p′ of point p to the limit surface is evaluated,
– the local neighbourhood of point p′ is established (in terms of a set of octants),
– the neighbourhood is searched for the most suitable candidate c to form a new triangle abc,
– an intersection check is carried out to avoid overlapping of the candidate triangle with an already existing one in the neighbourhood,
– the front is updated to account for the newly formed triangle abc.

the generated mesh is then subjected to an optimization in order to improve the quality of the final mesh. the laplacian smoothing technique in combination with topological transformations (diagonal edge swapping) is adopted. this yields the optimized grid after only a few cycles of smoothing (typically up to six). note that, unlike smoothing carried out in 2d, the repositioning of a node during the smoothing is likely to shift the node out of the surface. therefore, the node has to be projected back to the surface to satisfy the surface constraint; a minimal sketch of this constrained smoothing is given below. a crucial aspect of the proposed mesh generation strategy is related to the point-to-surface projection. simple and efficient algorithms available for the projection to parametric surfaces cannot be adopted, simply because a parametrization of the limit surface is missing.
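the constrained smoothing can be outlined as follows. this is only a schematic sketch under stated assumptions: the connectivity structure and relaxation factor are invented for the example, the projection routine is a stand-in for the progressive-refinement projection of [6] discussed next, and the special handling of curve and vertex bg nodes (which must stay on their curves or remain fixed) is omitted.

```python
import numpy as np

def smooth_on_surface(coords, neighbours, project, n_cycles=6, relax=0.5):
    """laplacian smoothing of surface nodes constrained to the limit surface.

    coords     : (n, 3) array of node coordinates
    neighbours : list of index lists describing the mesh connectivity
    project    : callable returning the projection of a point onto the limit surface
    """
    for _ in range(n_cycles):
        updated = coords.copy()
        for i, nbr in enumerate(neighbours):
            if not nbr:                  # isolated node: leave untouched
                continue
            centroid = coords[nbr].mean(axis=0)
            moved = (1.0 - relax) * coords[i] + relax * centroid
            updated[i] = project(moved)  # restore the surface constraint
        coords = updated
    return coords
```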
the situation is further complicated by the fact that the normal to the limit surface can be evaluated only at the bg nodes of the original or refined control grid. therefore, in order to make the projection sufficiently accurate (in terms of the distance from the limit surface and the match with the exact normal), it is necessary to subdivide the original control grid up to a high level. this results in a huge amount of data to be stored, which is not acceptable. in [6], an efficient and reliable approach for projecting a point to the limit surface has been proposed. this approach is based on a localized progressive refinement of the control grid towards the actual projection. a recursive implementation of this algorithm enables a virtually unlimited number of refinements to be performed with constant memory requirements. note that the refinement is of a temporary character and is discarded after the projection is completed. since some of the projections during mesh generation can be accomplished with a considerably lower accuracy (without any significant impact on the resulting mesh), an alternative approximate, but more efficient, projection technique was suggested. this is based on approximating the bg faces of the control grid by quadratic bezier triangular patches and on employing a standard projection technique applicable to parametric surfaces (see [6] for details); a sketch of such a patch evaluation is given below.

with respect to the application of the above meshing technique to the limit surfaces reconstructed from stl meshes, it should be noted that planar surfaces (seemingly the simplest case) must be handled in a special way. the reason is that a planar surface, having zero curvature in all directions, prevents the use of a curvature-based mechanism to tessellate it into stl bg faces. therefore, the aspect ratio and the orientation (in plane) of the individual bg faces cannot be related to the curvature. this allows two neighbouring bg faces forming the same planar surface to differ considerably in the aspect ratio measured with respect to the shared bg side. as a direct consequence, the limit surface folds over itself, possibly crossing the boundary of the surface. even though such a surface still seems to be planar, it is not any more, since the normal changes orientation from point to point, which is fatal for the meshing algorithm. therefore, when triangulating a planar surface, the subdivision is not invoked and the bg faces of the control grid are used for localization purposes only.

the proposed methodology for the discretization of surfaces described by stl meshes is demonstrated on the example of a propeller. the original stl mesh is depicted in fig. 5. the triangulation with curvature-based element size control is presented in fig. 6. although the original stl representation is rather coarse, the final mesh captures the shape of the propeller well.

fig. 5: stl mesh of a propeller
fig. 6: graded mesh of a propeller
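the approximate projection mentioned above relies on evaluating quadratic bezier triangular patches. the helper below evaluates such a patch at barycentric coordinates; it is a generic textbook formulation, not the code of [6], and the construction of the six control points from the control grid (which must encode the local limit-surface shape) is left out.

```python
import numpy as np
from math import factorial

def quadratic_bezier_triangle(ctrl, u, v):
    """evaluate a quadratic triangular bezier patch at barycentric (u, v, 1-u-v).

    ctrl : dict of six control points keyed by multi-index (i, j, k) with i+j+k = 2
    """
    w = 1.0 - u - v
    point = np.zeros(3)
    for (i, j, k), cp in ctrl.items():
        bernstein = (factorial(2) // (factorial(i) * factorial(j) * factorial(k))
                     * u**i * v**j * w**k)
        point += bernstein * np.asarray(cp, dtype=float)
    return point
```

projecting a point onto such a patch then reduces to a standard closest-point iteration in the (u, v) parameters, for which the efficient parametric-surface techniques mentioned above are directly applicable.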
6 conclusions

this paper has introduced an approach for the direct triangulation of 3d surfaces described by stl meshes. although the stl mesh is a valid, fully conforming triangulation, its special designation for rapid prototyping makes it very specific. the actual discretization consists of several phases. firstly, a boundary representation of the object is constructed from the stl file using a feature recognition approach. it has been shown that this is an ambiguous task which cannot, in general, be fully automated. a successful completion of this procedure often requires user intervention in the framework of an interactive environment. in the next phase, a smooth (limit) surface is reconstructed over the original stl mesh using a subdivision technique yielding a differentiable c1 limit surface. an interpolating subdivision based on a slightly modified butterfly scheme has been adopted. the modifications make the limit surface smoother in situations where the original strategy seems to be insufficiently flexible, which is often the case for stl meshes containing elements of large aspect ratio spanning the whole extent of the surface in a particular direction. finally, the limit surface is subjected to a triangulation based on the advancing front technique constrained directly to the limit surface. the vitality of the proposed approach has been demonstrated on several examples. further research will focus primarily on additional improvements of the construction of the boundary representation in order to enable as automated a processing of complex stl models as possible.

acknowledgment

this work was supported by ministry of education of czech republic project no. msm 210000003.

references

[1] rypl, d., bittnar, z.: "triangulation of 3d surfaces: from parametric to discrete surfaces." cd-rom proceedings of the sixth international conference on engineering computational technology, civil-comp press, 2002.
[2] zorin, d., schröder, p., sweldens, w.: "interpolating subdivision for meshes with arbitrary topology." computer graphics proceedings (siggraph '96), 1996, p. 189–192.
[3] dyn, n., gregory, j. a., levin, d.: "a four-point interpolatory subdivision scheme for curve design." computer aided geometric design, 1987, p. 257–268.
[4] halstead, m., kass, m., derose, t.: "efficient, fair interpolation using catmull-clark surfaces." computer graphics proceedings (siggraph '93), 1993, p. 35–44.
[5] bittnar, z., rypl, d.: "direct triangulation of 3d surfaces using the advancing front technique." numerical methods in engineering '96 (eccomas '96), wiley & sons, 1996, p. 86–99.
[6] rypl, d., bittnar, z.: "triangulation of 3d surfaces reconstructed by interpolating subdivision." to appear in computers and structures.

doc. dr. ing. daniel rypl
phone: +420 224 354 369
fax: +420 224 310 775
e-mail: drypl@fsv.cvut.cz

prof. ing. zdeněk bittnar, drsc.
phone: +420 224 353 869
fax: +420 224 310 775
e-mail: bittnar@fsv.cvut.cz

department of structural mechanics
czech technical university in prague
faculty of civil engineering
thákurova 7
166 29 prague 6, czech republic

acta polytechnica 57(5):331–339, 2017, doi:10.14311/ap.2017.57.0331

underground air duct to control rising moisture in historic buildings: improved design and its drying efficiency

jiří pazderka, eva hájková∗, martin jiránek

department of building structures, faculty of civil engineering, ctu in prague, thákurova 7, 166 29 prague, czech republic
∗ corresponding author: eva.hajkova@fsv.cvut.cz

abstract.
underground air ducts along the peripheral walls of a building are a remediation method whose principle is to enable an air flow along the moist building structure's surface, allowing a sufficient evaporation of moisture from the structure. this measure reduces the water transport (rising moisture) into the higher parts of the wall, where a high water content in the masonry is undesirable. presently, underground air ducts are designed as masonry structures, whose durability in contact with ground moisture is limited. the article describes a new design of an underground air duct, which is based on specially shaped concrete blocks (without wet processes, because the blocks are completely precast). the air duct made from the concrete blocks is situated completely below the ground surface (exterior) or below the floor (interior). thanks to this, the system is invisible and does not disturb the authentic look of rehabilitated historic buildings. the efficiency of the technical solution of the air duct was verified by the results of tests (based on the measured moisture values) conducted on a laboratory model. the experimental study showed that the moisture in the masonry equipped with the presented underground air duct had decreased considerably compared to the reference sample, namely by 43 % on average. the experimental study was validated through numerical simulations performed with the program wufi 2d.

keywords: moisture; masonry; refurbishment; air ducts; drying.

1. introduction

an increased moisture content of masonry structures has always had a negative impact on the utility value of the particular building. this constitutes a serious problem in the reconstruction of historic buildings, and its solution requires a comprehensive and highly professional approach. in historic structures, an increased moisture of vertical masonry structures occurs rather often. this phenomenon is attributable to the absence or the poor function of the waterproofing envelope of the substructure [1, 2]. as a rule, a higher masonry moisture is manifested by distinct moist patches on wall surfaces, or even by a disintegration of the surface layer (fig. 1). most frequently, a plaster degradation followed by a degradation of the building material and mortar is caused by the crystallisation pressure of soluble salts in the masonry or by a repeated freezing of the damp structure [3]. increased moisture levels result in changes in mechanical and physical properties [4–7] (e.g., modulus of elasticity and strength), which may lead to a reduced load carrying capacity of masonry structures [8, 9]. another adverse effect attributable to a higher moisture is the deterioration of the thermal insulation properties of masonry, leading to a higher thermal conductivity. a specific problem is the growth of microorganisms and moulds on damp wall surfaces, causing an unhealthy environment within the building. additional protection of historic buildings against ground moisture and subsurface water can be performed by several remediation methods. the final remediation measure is typically a set of several methods (the aim is also the elimination of other negative influences [10, 11]). the main task of the rehabilitation method described in this article is a significant increase of the evaporation of moisture from the historic masonry structure.

figure 1. long-term effect of rising moisture (unmaintained historic building) – church of st. catherine (built in 1315, modified in the 19th century), havlickuv brod, cz.
the influence of salts was not considered, although it is known that it can be important; it is a parameter to be considered in future work.

figure 2. the traditional air cavity based on a masonry structure.
figure 3. a specially shaped concrete block.

1.1. the principle of underground air ducts

air cavities are a remediation method whose basic principle is to enable an air flow along the surface of moist structures to ensure a sufficient evaporation of moisture from the structure. the subsequently described air duct, which is installed along the perimeter walls of historic buildings, is one of these methods. a system based on a similar principle has been experimentally verified on limestone walls with a thickness of 20 cm [12, 13]. in that case, however, there was an air duct made of cement blocks on both sides of the outer wall masonry. ventilated underground ducts also enable a protection of the building against external degradation processes. therefore, the structure of the air duct itself must be very durable. air ducts situated in the ground are most often designed as a masonry structure in combination with a traditional concrete slab and concrete screed (fig. 2). these structures are characterized by their high laboriousness (brickwork, concreting), but the main problem is a high risk of a low durability arising from the contact of the structure with the ground moisture and also with a high level of air humidity inside the duct. it is evident that the durability of a traditional air duct (based on masonry) is limited.

figure 4. air duct (concrete blocks) installed into a historic masonry building.

2. materials and method

2.1. air duct based on concrete blocks

the weaknesses of traditional air ducts (arising from their structural and material design) should be eliminated by the special set of concrete blocks presented below. the blocks are designed with the aim of achieving a high durability and a simple installation. the system consists of a special set of concrete blocks in the shape of the letter "c" (fig. 3). the blocks are placed side by side along the walls of the refurbished building and create a continuous cavity. air ducts can be applied on either the outer (exterior) or the inner (interior) side of the outer walls (fig. 4). the application of a ventilated cavity on the interior side of masonry walls is highly invasive and might be unfeasible for some historic buildings, as it requires dismantling of the floor. in such cases, only the exterior ventilation shaft can be installed. the blocks are equipped with a tongue and groove system (on their sides). the purpose of these elements is to ensure an equitable settlement of the entire system (the blocks "cooperate"). the installation of the system should be possible by manual handling only; for this reason, the dimensions of the blocks are limited (optimally to 0.35 × 0.45 × 0.3 m, which results in a weight of 37 kg). during heavy rains, water can leak inside the duct – for this case, the block is equipped with a small drainage hole in its bottom part. the system also contains specially shaped blocks, which are adapted for use at corners etc. the durability of the concrete blocks has to be ensured by designing the concrete composition for the relevant environmental influence degree under en 206 [14].
this can be ensured, for example, by the use of a crystalline admixture in the concrete [15–17]. this way of improving the concrete should ensure a completely waterproof concrete structure [18, 19], which has also been confirmed by many independent test labs [20–24]. a sufficient velocity of the air flow is necessary for a proper function of the ventilated duct. the design and placement of the ventilation holes are, therefore, extremely important. the correctness of the design must be verified by a numerical calculation/simulation of the air flow in the duct and subsequently confirmed by a measurement. if a sufficient air flow rate is not measured in the piping, the required flow rate has to be provided by means of a fan. therefore, the technical solution was analysed on a laboratory model (§ 2.2) and by a numerical simulation (§ 4).

figure 5. laboratory model.
figure 6. placement of measuring points in the masonry (on the laboratory model).

2.2. the drying efficiency of the presented air duct – experimental study

an experimental analysis of the efficiency of the ventilated duct was made in spite of the fact that the above technical design does not aim at outperforming current design solutions in their effect on reducing moisture in masonry (it outperforms them with its greater durability and the ease and speed of execution). a laboratory model (fig. 5) of the above-mentioned technical solution was created (an air duct near the peripheral wall). the aim was to analyse the impact of this structural measure on the moisture level in the wall. the model simulated the peripheral wall of a historic building (thickness 600 mm) made from classic solid bricks, but at a reduced scale of 1 : 2 (the number of joints between the bricks was neglected).

2.3. laboratory testing

the first step in the experiment was to build the laboratory model itself. historic bricks of the format 290 × 140 × 65 mm and a historic lime mortar were used for the model. table 1 shows the properties of the respective materials: brick and lime mortar. all properties (except the specific heat) were experimentally determined in the experimental centre at ctu in prague, in accordance with the respective european standards [25–28]. the dimensions of the simulated perimeter wall were 300 × 780 × 350 mm. the concrete ventilated duct was simulated using a metal sheet, which was shaped precisely according to the scheme in figure 5. the whole set was installed into a "steel bath" and backfilled with fine aggregate (simulating the terrain). a constant level of water in the embankment and in the lower part of the masonry (50 mm above the bottom of the "steel bath") was maintained. the simulation of the duct using a shaped metal sheet was not a significant problem, because the aim of the measurement was to analyse the reduction of the moisture level in the masonry (not in the structure of the duct). a second, reference model was created without a ventilated duct. an electrical resistivity moisture meter was used for the measurement of the moisture level in the masonry. the measurement was performed using two probes applied into holes drilled in the masonry. the principle of the masonry moisture measurement using an electrical resistivity moisture meter is described in detail in [30]. the measuring points (holes) were placed in the model as shown in figure 6. samples were extracted from the masonry after the end of the testing for a weight analysis.
the samples were taken from the places where the holes for the probes were located; thereby, the values measured by the electrical resistivity moisture meter were validated. the highest moisture level in the masonry, due to the capillary attraction, can be expected at the centre of the wall [29]. therefore, the probes (sensors) were placed approximately 100 mm below the surface of the masonry. the masonry blocks were vapour-tightly closed on the sides (simulating the continuation of the wall) in order to regard the masonry model as a 2d problem. after their completion, the models were kept in the laboratory for three months to achieve a stable level of moisture in the masonry (fig. 7).

table 1. material properties.

| property | brick | lime mortar |
| bulk density ρ | 1670 kg/m3 | 1650 kg/m3 |
| heat capacity c | 920 J/(kg K) | 840 J/(kg K) |
| porosity ε | 37.50 % | 37.70 % |
| thermal conductivity λ | 0.84 ± 0.025 W/(m K) | 0.67 ± 0.009 W/(m K) |
| vapour diffusion resistance factor µ (dry cup) | 19.5 | 8.9 |
| vapour diffusion resistance factor µ (wet cup) | 14.7 | 4.3 |
| water absorption coefficient A | 0.19 ± 0.002 kg/(m2 s0.5) | 0.22 ± 0.006 kg/(m2 s0.5) |
| free water saturation wsat | 370 kg/m3 | 359 kg/m3 |

figure 7. assembled models in the laboratory.

2.4. arrangement of the test

for the first three months after the completion of the models, the air flow inside the duct was not activated; the saturation of the masonry by rising water was reached during this period. in the fourth month after the completion of the models, a flow of air in the duct was created by a fan (fig. 8a) with a very low air flow speed (0.05 m/s) through a fabric, which covered the inlet opening. this value was determined based on the results of a calculation model that simulated a natural air flow in an exterior cavity applied to a real historical building [31]. in case the required air flow rate is not achieved in the cavity after the whole system in a real building is completed, the required air flow rate can be additionally provided by means of a fan. the air flow in the cavity is important because it ensures that the relative humidity in the duct does not reach the value of ϕ = 100 % (due to the transfer of water vapour from the masonry). it is also important that the incoming air should have the lowest possible relative humidity ϕ [%]. the influence of salts was not considered, although it is known that it can be important. salts may negatively affect the masonry surface structure in the course of the evaporation of salt-containing water. however, crystallisation of salts on a remediated building takes place even before the remediation itself, in places above the ground. after the remediation, the salt crystallisation area only shifts lower down, into the ventilated cavity area, which is also more convenient in aesthetic terms. the air cavity was kept ventilated using the fan for a period of 3 months. the moisture measurement results obtained in the first month are only informative; the effects of the flowing air on the masonry moisture reduction are negligible in such a short time. therefore, the results of this initial measurement are not included in the final evaluation. seven moisture measurements were made at intervals of 10 days during the next two months, when the fan was installed in the air cavity (fig. 8b). the moisture was measured using the electrical resistivity moisture meter. in addition, the relative humidity of the air and the temperature were measured in the laboratory and inside the air duct.
the relative air humidity measured within the cavity was 41.7 ± 2 %, whilst the humidity of the air in the lab was 39.1 ± 2 %. the air temperature in the lab was 23 ± 1 °c.

3. results and discussion

the main benefit of the measurement was the possibility of an accurate comparison between the same structures (under the same conditions) with and without the ventilated duct. the impact of the ventilated duct was expressed as the percentage decrease in the bulk moisture of the masonry, which was measured at the same places in both models. the results of the measurements at all measured points are shown in table 2. the presented values were measured at an air velocity of 0.05 m/s. the results demonstrated the effectiveness of the air duct – the decrease of the moisture level in the masonry was, on average, 43 %; a short calculation reproducing these figures is given at the end of this section.

figure 8. (a) fan providing an air flow in the duct. (b) measurements made using the electrical resistivity method.

table 2. the final values of the moisture readings w [wt.%] after 3 months of ventilation using a fan with an air velocity of 0.05 m/s.

| measuring point | reference sample w [%] | sample with air duct w [%] | moisture reduction |
| 1 | 21.2 | 20.1 | 5.19 % |
| 2 | 21.0 | 18.3 | 12.86 % |
| 3 | 16.5 | 7.8 | 52.73 % |
| 4 | 10.5 | 6.2 | 40.95 % |
| 5 | 9.9 | 2.1 | 78.79 % |
| 6 | 7.0 | 2.1 | 70.00 % |

the following charts show the rate of the moisture decrease at the individual measuring points (fig. 9). the initial moisture content in the masonry blocks, after the saturation with rising water for a period of 3 months (without using a fan, when no air flow occurred), is shown at time 0. the values of the moisture measurements during the next two months, carried out every 10 days after the fan was installed in the air duct, are shown for times 30–90 days. in the charts for the measuring points 1 and 2, the bulk moisture value of 21.3 %, indicating the highest water absorption of the brick (percent by mass) determined by gravimetry, is marked. it is evident that the moisture content at points 1 and 2 of the reference sample reached its maximum. the moisture values at these two points are almost the same for the reference sample and for the sample with the air cavity. these measuring points are only 60 mm above the water level, and thus the effect of the cavity cannot manifest itself here. the effect of the air cavity is evident at the measuring points above the air cavity. the final charts summarize the values of the moisture measured at the individual points for the reference model and the model with the air duct. the first chart shows the values of the moisture at the beginning of the measurement, after 30 days of using the fan (fig. 10). the second chart shows the final values of the moisture 90 days after the fan's involvement (fig. 11). the experimental analysis results show that the air cavity described above is effective in reducing the moisture in masonry, under the condition that an air flow is provided in the cavity (by a natural ventilation process or by installing a mechanical ventilation device).
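the percentage reductions in table 2 follow directly from the two moisture columns; the short script below reproduces them and the quoted 43 % average.

```python
ref  = {1: 21.2, 2: 21.0, 3: 16.5, 4: 10.5, 5: 9.9, 6: 7.0}  # reference sample, w [%]
duct = {1: 20.1, 2: 18.3, 3:  7.8, 4:  6.2, 5: 2.1, 6: 2.1}  # sample with air duct, w [%]

reduction = {p: 100.0 * (ref[p] - duct[p]) / ref[p] for p in ref}
for p, r in sorted(reduction.items()):
    print(f"point {p}: {r:5.2f} %")          # e.g. point 3: 52.73 %
print(f"average: {sum(reduction.values()) / len(reduction):.1f} %")   # 43.4 %
```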
4. verification of the test results — numerical simulation

4.1. simulation program used and validation

the experimental study was numerically validated through numerical simulations performed with the program wufi 2d. this software, developed at the fraunhofer institute for building physics (germany), permits an assessment of changes in the moisture content and temperature inside structures depending on the interior and exterior climatic conditions [32–34]. the use of the software for the numerical simulation requires a knowledge of the boundary conditions and of the thermal and moisture-related properties of the materials used [35, 36]. in this case, we used the thermal and moisture-related material properties determined experimentally in the laboratory study (see § 2.3 above). in addition, we entered the boundary conditions, the relative air humidity and air temperature in the laboratory, and the test duration. all these values were the same as in the laboratory experiment described above. the numerical simulation was made for the reference sample and for the sample with an air duct.

figure 9. the measured values of moisture at the individual measurement points.
figure 10. the measured values of moisture after 30 days of using the fan.
figure 11. the final values of moisture after 90 days of using the fan.
figure 12. the results of the numerical simulation (created in wufi 2d).

4.2. results and discussion

the results of the numerical simulation showed that, as in the laboratory experiment, the total water content in a brick block equipped with an air duct decreases. as is evident from the resulting water content and the overall picture of the moisture trend in the samples, the masonry dries at the air duct location due to an increased evaporating surface and due to the air flow in the duct (fig. 12). when compared to the reference sample, the moisture front in the sample with the air duct was lower. we can, therefore, conclude that the simulated behaviour of the test blocks in the calculation software was identical to the behaviour of the test blocks in the experimental study.

5. conclusions

the results of the experimental measurements on the laboratory models (verified by a software simulation) demonstrated that an air duct based on concrete blocks has a great potential in the field of the rehabilitation of moist buildings. unlike conventional air ducts (based on masonry structures), it is a much simpler solution (less laborious) and has a higher durability. the results of the laboratory tests demonstrated the effectiveness of the air duct – the decrease of the moisture level in the masonry was, on average, 43 % (in comparison with the reference model without an air duct). the simulated behaviour of the laboratory models in the calculation software was identical to the behaviour of the test blocks in the laboratory.

acknowledgements

this work was supported by the czech ministry of education, youth and sports under the grant no. sgs17/117/ohk1/2t/11 (provided within institutional support for ctu in prague).

references

[1] piaia, j. c. z. et al.: measurements of water penetration and leakage in masonry wall: experimental results and numerical simulation. building and environment, 61, 2013, p. 18–26. doi:10.1016/j.buildenv.2012.11.017
[2] janssen, h., derluyn, h., carmeliet, j.: moisture transfer through mortar joints: a sharp front analysis. cement and concrete research, 42(8), 2012, p. 1105–1112. doi:10.1016/j.cemconres.2012.05.004
[3] künzel, h. m.: simultaneous heat and moisture transport in building components: one- and two-dimensional calculation using simple parameters. dissertation, university of stuttgart, stuttgart, germany, 1994.
[4] sykora, j. et al.: computational homogenization of non-stationary transport processes in masonry structures. journal of computational and applied mathematics, 236(18), 2012, p. 4745–4755. doi:10.1016/j.cam.2012.02.031
[5] larsen, p. k.: determination of water content in brick masonry walls using a dielectric probe. journal of architectural conservation, 18(1), 2012, p. 47–62. doi:10.1080/13556207.2012.10785103
[6] hettmann, d.: zur beeinflussung des feuchte- und salzgehaltes in mauerwerk. [on influencing the moisture and salt content in masonry.] bautenschutz und bausanierung, 16(5), 1993, p. 72–75.
[7] d'agostino, d.: moisture dynamics in a historical masonry structure: the cathedral of lecce (south italy). building and environment, 63, 2013, p. 122–133. doi:10.1016/j.buildenv.2013.02.008
[8] witzany, j., zigler, r.: failure mechanism of compressed reinforced and non-reinforced stone columns. materials and structures, 48(5), 2015, p. 1603–1613. doi:10.1617/s11527-014-0257-z
[9] han, b., wang, t.: influence of water content on brick masonry's shear strength. journal of beijing jiaotong university, 35(1), 2011, p. 1–5.
[10] jiranek, m.: sub-slab depressurisation systems used in the czech republic and verification of their efficiency. radiation protection dosimetry, 162(1–2), 2014, p. 64–67. doi:10.1093/rpd/ncu219
[11] grytli, e. et al.: the impact of energy improvement measures on heritage buildings. journal of architectural conservation, 18(3), 2012, p. 89–106. doi:10.1080/13556207.2012.10785120
[12] torres, i. m.: wall base ventilation system to treat rising damp: the influence of the size of the channels. journal of cultural heritage, 15(2), 2014, p. 121–127. doi:10.1016/j.culher.2013.03.005
[13] torres, i. m., de freitas, v. p.: treatment of rising damp in historical buildings: wall base ventilation. building and environment, 42(1), 2007, p. 424–435. doi:10.1016/j.buildenv.2005.07.034
[14] en 206 concrete – specification, performance, production and conformity. brussels: european committee for standardization, 2014.
[15] rahhal, v. et al.: scheme of the portland cement hydration with crystalline mineral admixtures and other aspects. silicates industriels, 74(11), 2009, p. 347–352. scopus: 2-s2.0-73649090808.
[16] pazderka, j.: concrete with crystalline admixture for ventilated tunnel against moisture. key engineering materials, 677, 2016, p. 108–113. doi:10.4028/www.scientific.net/kem.677.108
[17] scancella, t., robert, j.: use of xypex admixture to concrete as an inhibitor to reinforcement steel corrosion. proceedings of the materials engineering conference, 2, 1996, p. 1276–1280. scopus: 2-s2.0-0030401904.
[18] dao, v. t. n. et al.: performance of permeability-reducing admixtures in marine concrete structures. aci materials journal, 107(3), 2010, p. 291–296. wos: 000278352700010.
[19] reiterman, p., pazderka, j.: crystalline coating and its influence on the water transport in concrete. advances in civil engineering, 2016 (2016), 2513514. doi:10.1155/2016/2513514
[20] reiterman, p., bäumelt, v.: long-term sorption properties of mortars modified by crystallizing admixture. advanced materials research, 1054, 2014, p. 71–74. doi:10.4028/www.scientific.net/amr.1054.71
[21] weng, t. l., cheng, a.: influence of curing environment on concrete with crystalline admixture. monatshefte für chemie, 145(1), 2014, p. 195–200. doi:10.1007/s00706-013-0965-z
[22] zhou, m. r. et al.: study on experiment of concrete compounding xypex and steel fiber. applied mechanics and materials, 105–107, 2012, p. 1755–1759. doi:10.4028/www.scientific.net/amm.105-107.1755
[23] munn, r. l., kao, g., chang, z. t.: performance and compatibility of permeability reducing and other chemical admixtures in australian concretes. proceedings of the 7th canmet/aci int. conference on superplasticizers and other chemical admixtures in concrete, 2003, p. 361–379.
[24] bohus, s., drochytka, r.: cement based material with crystal-growth ability under long term aggressive medium impact. applied mechanics and materials, 166–169, 2012, p. 1773–1778. doi:10.4028/www.scientific.net/amm.166-169.1773
[25] en iso 10456 building materials and products – hygrothermal properties – tabulated design values and procedures for determining declared and design thermal values. brussels: european committee for standardization, 2007.
[26] en iso 12571 hygrothermal performance of building materials and products – determination of hygroscopic sorption properties. brussels: european committee for standardization, 2013.
[27] en iso 12572 hygrothermal performance of building materials and products – determination of water vapour transmission properties. brussels: european committee for standardization, 2001.
[28] en iso 15148 hygrothermal performance of building materials and products – determination of water absorption coefficient by partial immersion. brussels: european committee for standardization, 2002.
[29] de freitas, v. p., abrantes, v., crausse, p.: moisture migration in building walls – analysis of the interface phenomena. building and environment, 31, 1996, p. 99–108. doi:10.1016/0360-1323(95)00027-5
[30] pazderka, j., hajkova, e.: analysis of moisture in masonry. building engineer, 89(9), 2014, p. 20–24. scopus: 2-s2.0-84916933495.
[31] tazky, l., sedlakova, a.: design of the ventilated air channel to resolve moisture problems in the historical church. energy procedia, 78, 2015, p. 1323–1328. doi:10.1016/j.egypro.2015.11.148
[32] de vries, d.: the theory of heat and moisture transfer in porous media revisited. international journal of heat and mass transfer, 30(7), 1987, p. 1343–1350. doi:10.1016/0017-9310(87)90166-9
[33] bomberg, m.: moisture flow through porous building materials. dissertation, university of lund, sweden, 1974.
[34] luikov, a. v.: systems of differential equations of heat and mass transfer in capillary-porous bodies. international journal of heat and mass transfer, 18(1), 1975, p. 1–14. doi:10.1016/0017-9310(75)90002-2
[35] holm, a., künzel, h. m.: two-dimensional transient heat and moisture simulations of rising damp with wufi-2d. proceedings of the 2nd international conference on building physics, leuven, belgium, 2003, p. 363–367.
[36] krus, m.: moisture transport and storage coefficients of porous mineral building materials: theoretical principles and new test methods. fraunhofer irb verlag, stuttgart, germany, 1996.

acta polytechnica vol. 44 no. 1/2004

effect of temperature and age of concrete on strength – porosity relation

t. zadražil, f. vodák, o. kapičková

abstract: the compressive strengths of unsealed samples of concrete at the age of 180 days have been measured after heating to temperatures of 20 °c, 300 °c, 600 °c and 900 °c. all of the tests were performed on the cooled material. we compared our results with those obtained in [10] for the same type of concrete (age 28 and 90 days, respectively, measured after heating at temperatures ranging from 20 °c to 280 °c). the dependencies of the compressive strength and porosity were correlated and compared for samples of age 28, 90 and 180 days. the behaviour of concrete at the age of 90 and 180 days confirms the generally accepted hypothesis that the strength of concrete decreases with increasing porosity. it has to be stressed, however, that concrete samples at the age of 28 days exhibit a totally opposite dependency.

keywords: compressive strength, temperature dependency, porosity.

1 introduction

as known from experimental studies of concrete behaviour at room temperature, there is a close correspondence between compressive strength s and porosity p. commonly this dependence is a decreasing function, i.e. strength s decreases with porosity p. there have been many attempts to express the relation s = s(p) in an analytical form. it has to be pointed out, however, that such expressions (for example, the bal'shin relation, the ryshkewitch expression, the schiller law, the hasselman formula – for details see, e.g., [1]) have mostly been derived for different types of materials. the present paper is based on the assumption that changes in porosity p are the main factor affecting changes in the compressive strength s of hardened cement pastes and concrete (see, e.g., [2]). this hypothesis is generally accepted; however, some doubts exist [3, 4]. moreover, in practically all the articles cited above, the porosity in the dependence s = s(p) was modified by a change of the w/c ratio. in the present paper, all specimens were prepared at the same w/c ratio and the porosity varied due to the heat treatment at different temperature levels. it is therefore interesting to study the dependence of compressive strength s on porosity p in that case and to check whether s = s(p) is at least qualitatively the same as for the porosity modified by the w/c ratio.

2 experiment

2.1 composition of concrete

the tested specimens for the whole experimental program were made from the concrete mixture whose composition is given in table 1. the composition corresponds to the concrete applied in real structures (in the containment of npp temelin). the mineral composition of the cement in percentage by weight can be found in [5]; the mineral composition in percent by volume is given in table 2. the applied starting material is portland cement with a small amount of fly ash and traces of slag. the clinker used for its production was burnt to the optimum content of free cao and has a high fraction of alite, with c4af prevailing over c3a. the average size of the crystals ranges between 0.025–0.030 mm and the size of the belite grains is about 0.20 mm on average.

table 1: composition of 1 m3 of the concrete mixture

| cement | 499 kg |
| water | 215 kg |
| plasticizer (ligoplast sf) | 4.9 kg |
| aggregates 0–4 mm | 710 kg |
| aggregates 8–16 mm | 460 kg |
| aggregates 16–22 mm | 530 kg |

table 2: quantitative composition of the cement (temelín, cem i 42.5 mokrá) in % by volume

| phase | content |
| clinker composition: c3s | 70.0 |
| c2s | 11.4 |
| c3a | 7.9 |
| c4af | 9.7 |
| free cao | 1.0 |
| total | 100.0 |
| fraction of components in cement: clinker | 93.3 |
| gypsum | 4.7 |
| fly ash | 1.8 |
| slag | 0.2 |
| total | 100.0 |

2.2 manufacturing of samples

the mixing of the concrete mixtures was carried out in a laboratory mixer with forced circulation. the careful procedure of loading the ingredients into the mixer and the kept time schedule of mixing guaranteed a homogeneous concrete mixture of constant quality in all batches. the tested specimens for the experimental program were manufactured in two series. attention was paid to the curing of the specimens. first, the specimens were covered with a polyethylene foil that restrained the leakage of excessive water from the surface layer of the tested body. then, approximately six hours after the manufacturing ended, the samples, including the forms, were covered with a damp fabric and then re-covered with a polyethylene foil. the basic specimen of the experimental program was a 0.1/0.1/0.4 m beam.

2.3 heating of specimens

the concrete samples were placed in the high-temperature furnace bvd 100/ky (temperature range up to 1250 °c) with a programmable heating and cooling regulator. the temperature was increased by 100 °c every 15 minutes during the heating up to the final temperature of 300 °c, 600 °c or 900 °c, respectively. the samples were exposed to the high temperature for 2 hours. the cooling rate was 100 °c per 30 minutes. the duration of the exposure at the elevated temperature (2 hours) practically guarantees that the samples are heated uniformly within the whole specimen volume. this time period was estimated from the solution of the heat conduction equation; a back-of-the-envelope version of this estimate is sketched below. the appropriate input parameters (in particular, the thermal conductivity coefficient, resp. the specific heat capacity, and their dependency on temperature) were taken from [6].
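the order of magnitude of the required heating time can be checked with the characteristic conduction time t ≈ l²/α, where α = λ/(ρc) is the thermal diffusivity and l the half-thickness of the specimen. the sketch below uses illustrative room-temperature concrete values, not the temperature-dependent data of [6] actually used by the authors.

```python
rho = 2300.0    # bulk density [kg/m3], illustrative value
c   = 900.0     # specific heat capacity [J/(kg K)], illustrative value
lam = 1.5       # thermal conductivity [W/(m K)], illustrative value

alpha = lam / (rho * c)          # thermal diffusivity [m2/s]
half_thickness = 0.05            # half of the 0.1 m beam cross-section [m]
t_char = half_thickness**2 / alpha

print(f"alpha = {alpha:.2e} m2/s, characteristic time = {t_char / 3600:.2f} h")
# prints roughly 1 h, consistent with the 2 h exposure being sufficient
```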
2.4 tests of concrete mechanical properties

the compressive strength was determined on the fragments of the beams of the original dimensions 0.1/0.1/0.4 m after the fracture toughness measurement reported in the previous paper [11]. all of the tests were performed according to the method of the czech national standard čsn 73 1318. the measurements of the compressive strength were performed on 6 specimens for each temperature level.

2.5 tests of concrete porosity properties

small fragments of the beams after the strength tests were utilized to determine the concrete texture. the samples were dried at a temperature of 105 °c for 4 hours before the measurement. the porosity of the hardened cement paste was measured by the method of mercury porosimetry using the high-pressure porosimeter micromeritics autopore 9200 (with a pressure range up to 400 mpa). the porosity was measured on 6 specimens for each temperature level.

3 results and discussion

the dependence of the compressive strength on the temperature level of heating is shown in fig. 1. furthermore, the ratio of the strength value after cooling of the sample to its original strength value (determined at indoor temperature before the heat pretreatment) is generally referred to as the "residual strength ratio" [7]. these results are depicted in fig. 2. the changes of the porosity with the temperature of heating are presented in fig. 3 (each point of these curves corresponds to the average of six values measured on different specimens heated at the same temperature level). from these measurements, representative values of strength (s) and porosity (p) for a given temperature were obtained and subsequently the functional dependency s = s(p) was established, see fig. 4; a sketch of fitting the analytical relations mentioned in the introduction to such data is given below.

fig. 1: effect of temperature on compressive strength for concrete of age 180 days
fig. 2: effect of temperature on compressive strength ratio for concrete of age 180 days
fig. 3: effect of temperature on porosity for concrete of age 180 days
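the analytical forms mentioned in the introduction can be fitted to such (p, s) pairs by a simple least-squares procedure. the sketch below fits the ryshkewitch exponential s = s0 exp(-k p); the data points are hypothetical values of the kind that would be read off fig. 4, not the measured results.

```python
import numpy as np

# hypothetical (porosity %, strength MPa) pairs of the kind plotted in fig. 4
p = np.array([ 8.0, 12.0, 17.0, 26.0])
s = np.array([70.0, 55.0, 30.0,  8.0])

# ryshkewitch: s = s0 * exp(-k p)  =>  ln s = ln s0 - k p (linear in p)
slope, intercept = np.polyfit(p, np.log(s), 1)
k, s0 = -slope, np.exp(intercept)
print(f"s0 = {s0:.1f} MPa, k = {k:.3f} per % porosity")
```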
© czech technical university publishing house http://ctn.cvut.cz/ap/ 55 czech technical university in prague acta polytechnica vol. 44 no. 1/2004 0 20 40 60 80 5 10 15 20 25 30 porosity [%] c o m p re s s iv e s tr e n g th [m p a ] age 180 days fig. 4: dependence of compressive strength on porosity for concrete of age 180 days 40 50 60 70 80 0 50 100 150 200 250 300 temperature [°c] c o m p re s s iv e s tr e n g th [m p a ] age 28 days age 90 days fig. 5: effect of temperature on compressive strength for concrete of age 28 and 90 days 0.7 0.8 0.9 1 1.1 1.2 0 50 100 150 200 250 300 temperature [°c] c o m p re s s iv e s tr e g th ra ti o age 28 days age 90 days fig. 6: effect of temperature on compressive strength ratio for concrete of age 28 and 90 days 10 12 14 16 18 20 0 50 100 150 200 250 300 temperature [°c] p o ro s it y [% ] age 28 days age 90 days fig. 7: effect of temperature on porosity for concrete of age 28 and 90 days 45 50 55 60 65 70 75 80 12 14 16 18 20 porosity [%] c o m p re s s iv e s tr e n g th [m p a ] age 28 days fig. 8: dependence of compressive strength on porosity for concrete of age 28 days c) microcracking due mainly to thermal incompatibility of the hardened cement paste and aggregate which increases porosity and decreases strength [7]. this process takes place throughout whole temperature interval. from presented measurements may be deduced that resulting balance of these three influences essentially depends on the age of concrete. increasing content of hydration products plays dominant role, especially for concrete at the age of 28 days (with relatively large volume of unhydrated cement) due to intensifying hydration with temperature (between �100 °c–300 °c�). this process causes similar behaviour of strength as certain curing treatments (so-called steam curing). these facts result in extraordinary dependence of strength on porosity (fig. 8), which behaves as increasing function, contrary to original hypothesis. on the other hand, hydration processes are practically terminated in concretes at the age of 90 days and the functional dependence s � s(p) (fig. 9) corresponds to initial hypothesis, since the major role is now played by microcracking. additional „minihydration“ is caused mostly by water molecules of relatively high kinetic energies released at temperatures above 250 °c. this phenomenon, which leads to low increasing of the strength and decreasing of the porosity (see figs. 5 and 7), is again in good agreement with initial hypothesis. the results for the concrete at the age 180 days are quite problemless. the function s � s(t) decreases, the function p � p(t) increases and the dependency s � s(p) is the decreasing function. we may suppose that all significant hydration processes are finished at this age and changes of porosity and strength are influence only by microcracing. acknowledgement this work was partially supported by the msmt cr (contract no j304-098:210000020). references [1] taylor h. f. w.: cement chemistry. london: thomas telford publ., 1997. [2] odler i., rossler m.: “investigations on the relationship between porosity, structure and strength of hydrated portland cement pastes i a ii.” cem. concrete res., vol. 15 (1985), p. 320–330, 401–410. [3] jambor j.: “influence of phase composition of hardened binder pastes on its pore structure and porosity.” in: pore structure and properties of materials (editor: s. modrý). prague: academia, 1973, p. d75–d95. [4] fagerlund g.: strength and porosity of concrete. 
in: pore structure and properties of materials (editor: s. modrý). prague: academia, 1973, p. d57–d73. [5] vydra v., vodák f., kapičková o., hošková š.: “effect of temperature on porosity of concrete for nuklear-safety structures.” cem. concr. res., vol. 301 (2001), p. 1023–1026. [6] vodák f. et al: “tables of physical properties of concretes for nuclear-safety structures.” in: ctu reports 4 (2000) (editor: f.vodák). prague: ctu publ. house, 2000. [7] bažant z. p., kaplan m. f.: concrete at high temperatures. essex: longman, 1996. [8] jambor j.: “porosity, pore structure and strength of cementeous composites.” staveb. čas., vol. 33 (1985), p. 743–764. [9] komarovskij a. n.: design of nuclear plants. moscow: atomizdat, 1965. [10] hošková š., kapičková o., trtík k., vodák f.: experimental study of relation among elevated temperature exposure, strength and structure of concrete employed in contaiment of npp temelin. in: research activities of physical departments of civil engineering faculties in the czech and slovak republics. (eds: l. pazdera and m. kořenská). brno, 2001, p. 25–28. [11] zadražil t., vodák f., trtík k.: “vliv teploty na pevnost betonu užitého při stavbě kontejnmentu jaderné elektrárny temelín.” stavební obzor, vol. 9 (2003), p. 272–274. ing. tomáš zadražil phone: +420 224 354 693 e-mail: xzadrazt@stu.fsv.cvut.cz prof. františek vodák, drsc. phone: +420 224 353 886 e-mail: vodak@fsv.cvut.cz rndr. olga kapičková, csc. phone: +420 224 354 696 e-mail: kapickov@fsv.cvut.cz department of physic czech technical university in prague faculty of civil engineering thákurova 7 166 29 praha 6, czech republic 56 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 44 no. 1/2004 czech technical university in prague 50 55 60 65 70 13 14 15 16 17 porosity [%] c o m p re s s iv e s tr e n g th [m p a ] age 90 days fig. 9: dependence of compressive strength on porosity for concrete of age 90 days acta polytechnica doi:10.14311/ap.2019.59.0476 acta polytechnica 59(5):476–482, 2019 © czech technical university in prague, 2019 available online at https://ojs.cvut.cz/ojs/index.php/ap the effect of soil grain size on the deformation properties of reinforced geocell layers barbara luňáčková∗, marek mohyla, miroslav pinka technical university of ostrava, faculty of civil engineering, department of geotechnics and underground engineering, 17. listopadu 2172/15, 70800 ostrava-poruba, czech republic ∗ corresponding author: barbara.lunackova@vsb.cz abstract. the effect of backfill material grading on the behaviour of geocell reinforced layers was experimentally investigated in this study. a series of loading tests were performed on a model with geocell reinforced and unreinforced layers. five types of crushed aggregates were used as backfill materials in the experiment. the results showed that geocell reinforcement increased the deformation parameters. the rate of increase of the deformation characteristics depended on the backfill material grading. keywords: geocell, grading, crushed material, load plate test, deformation modulus. 1. introduction geocells are a three-dimensional honeycomb type of geosynthetics and applied as a reinforcement to improve the behaviour of soil layers by providing lateral confinement [1–3]. the geocell system was first used by the us army corps of engineers to reinforce pavements in order to improve the bearing capacity of the soil [4]. 
the geocell system not only increases the bearing capacity of soil but also considerably increases its stiffness and strength and reduces settlement. this is achieved by confining failure wedges that would develop in an unreinforced soil because of the lateral and outward displacement. mandal [5] stated that lateral movement and shear failure are resisted by both the tensile hoop strength of the cell walls and the passive resistance of the full adjacent cells. han [6] reported that base courses reinforced with geocells reduce the vertical stresses at the interface between the subgrade and base course, reduce permanent and creep deformations and increase the elastic deformation, stiffness and bearing capacity of the base course. the most common applications of the geocell system include embankments [7], pavements [8] and erosion control [9], similar to reinforcing geogrids [10–12]. the current trend in geocell application is the use of geocells beneath foundations in order to reduce the costs associated with the construction of the foundation. the geocell system can reduce the thickness of not only the underlying layer itself but also the foundation [13]. hegde presented a summary of previous studies, the state of the art in geocells and the scope of future research directions in an extensive study [14]. the paper discussed numerous experimental, numerical, analytical and field performance studies related to geocells. hegde indicated several gaps in the research, such as a shortage of robust design methodologies and analytical formulations related to geocells and a lack of systematic documentation of case studies. the common geocell description includes cell dimensions, tensile strength, seam strength, strip thickness, density and aspect ratio. hegde [15] noted that the greater the increase in tensile strength of the material, the more confinement the geocell offers. the cell aspect ratio specifies the ratio of the geocell's aperture size to the medium grain size of the backfill. the optimum cell aspect ratio is about 15. according to mehrjardi [16], larger backfill particles (a smaller cell aspect ratio) deteriorate the interaction between the geocell and backfill, resulting in a lower bearing capacity. however, mehrjardi [16] also states that geocells with a cell aspect ratio of 4 have the best performance in improving the interface's shear strength. a series of direct shear tests were performed in that study to investigate the interfacial characteristics of grain-grain and grain-geocell interactions. three types of uniformly graded soils were used as backfill materials, classified as sp (poorly graded sand) and gp (poorly graded gravel) according to the unified soil classification system, together with geocells with a pocket size of 55 × 55 mm and a height of 50 mm. rajagopal [17] reported that geocell reinforcement adds an apparent cohesive strength even to cohesionless soils and does not affect frictional strength. many researchers have observed the bearing capacity of geocell reinforced soils (e.g., [5, 6, 15, 16, 18–20]). mandal [5] stated that the low-settlement bearing capacity of geocell-reinforced soil did not improve much, compared to unreinforced soil, but the large-settlement bearing capacity showed a considerable improvement. he recommended using a smaller geocell opening size for low-settlement structures and a larger size for large-settlement structures in order to obtain the maximum benefit from geocells.
the objective of the experiment in this study was to determine the effect of geocells on the deformation behaviour of backfill material independently of the subsoil.

2. material and methods
the experimental model was created for the town of polanka (czech republic). before the experiment, this location had been repeatedly used for storing crushed materials and was often driven through by heavy machinery. the upper part of the subsoil (10–15 cm) comprised crushed aggregate that extended continuously into the lower part of the subsoil, which was classified as silty gravel (gm). a non-woven geotextile was laid onto this terrain/subsoil (15 kn·m−2 tensile strength and 300 g·m−2 area density), followed by a 100 mm geocell mattress, which was covered with crushed backfill and compacted to 150 mm thickness using an ntc vdr 22 compacting machine (140 kg, 81 hz frequency). the total area of the experimental model was 2 × 2 m. a diagram and cross-section of the experiment is shown in figure 1. the backfill materials used in this experiment were five types of crushed granular aggregates from the hrabůvka quarry (czech republic). they were classified according to czech standard čsn 73 1005 [21] as gp (poorly graded gravel), s-f (sand with admixture of fine-grained soil) and g-f (gravel with admixture of fine-grained soil). the characteristics of these materials are summarized in table 1. the particle size distribution curves of the materials are shown in figure 2. the granularity curves were determined through a wet sieve analysis. the geocell used in this experiment was made from high density polyethylene strips. its basic characteristics are listed in table 2. the deformation modulus of reinforced and unreinforced soil layers was determined through a static plate load test (plt), which is a modulus-based compaction quality control system [22]. unfortunately, this test is restricted to a pointwise determination of deformation parameters because it is time-consuming and requires stopping the operation of the quarry. in order to overcome these limitations, a light dynamic load plate is often used instead of a static load plate. regrettably, the light dynamic load plate test proved to be unsuitable for geocell systems. the results of the static load test showed that the subsoil in the study area can be considered incompressible. with a contact pressure of 300 kpa, the maximum settlement in this test was 1.24 mm. the resulting modulus was 156 mpa. in terms of the experiment's boundary conditions, the subsoil could be considered qualitatively incompressible. the static load test results of the subsoil are shown in figure 3. the scheme of the experiment and the plt are shown in figure 1. the overall horizontal dimensions of the model (2 × 2 m) were selected according to shadmand [13] so that the ratio between the model's width (or length) and the loading plate diameter was greater than 5. the diameter of the loading plate was 300 mm. for a proper transmission of force across the entire plate, a thin layer of sand is recommended to level the surface of the soil/aggregate layer. in order to minimize any measurement errors caused, for example, by a poorly fitting plate, each assembly was tested three times. the plate was loaded vertically in three load steps. the vertical displacements were recorded using a single gauge.
the load was then incrementally reduced to zero (fig. 3). the deformation modulus edef,2 from the second loading stage in the load-displacement diagram was calculated with the following equation [23]:

$$e_{\mathrm{def},2} = \frac{1.5\,r}{a_1 + a_2\,p_{\max}} \qquad (1)$$

where $r$ is the radius of the plate, $p_{\max}$ is the maximum pressure exerted on the plate and $a_1$ and $a_2$ are the regression constants determined by overlaying the hysteresis loops with a quadratic polynomial. figure 3 shows the application of the loaded part of the second stage of the static load test on the ground (marked in red) and its corresponding quadratic function (a1 = 1.3104, a2 = 0.72).
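as a minimal illustration of how equation (1) is applied, the following python fragment (not part of the original paper) fits the quadratic regression to a second-loading branch and evaluates edef,2; the settlement values are synthetic, generated here from the quoted regression constants, and only the plate radius, the maximum pressure and the form of the regression follow the text above.

```python
import numpy as np

# sketch of the edef,2 evaluation from equation (1); the settlement values
# below are synthetic, generated from the quoted regression constants
# (a1 = 1.3104, a2 = 0.72) so that the fit can recover them
r = 150.0      # plate radius [mm] (300 mm plate diameter)
p_max = 0.3    # maximum contact pressure [MPa] (300 kPa)

p = np.array([0.0, 0.1, 0.2, 0.3])        # contact pressure [MPa]
s = np.array([0.0, 0.138, 0.291, 0.458])  # settlement [mm], synthetic

# quadratic regression of the second loading branch: s(p) = a1*p + a2*p**2
a1, a2 = np.linalg.lstsq(np.column_stack([p, p**2]), s, rcond=None)[0]

e_def2 = 1.5 * r / (a1 + a2 * p_max)  # equation (1), result in MPa
print(f"a1 = {a1:.4f}, a2 = {a2:.2f}, edef,2 = {e_def2:.0f} MPa")  # ~147 MPa
```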
static load tests were performed on all five backfill materials with and without geocells. the ratio of cell dimensions to the loading plate was 0.63. the effect of the loading plate size and other factors on the response of geocell reinforced backfill was investigated by mehrjardi [16]. mehrjardi stated that the bearing capacity's tendency to vary in all conditions is incremental with an increasing loading plate size.

3. results and discussion
the progress of the static loading tests on the geocell-reinforced and non-reinforced layers for each backfill material is shown in figure 4. the black curves show the test's progress on the unreinforced layer, the blue curves show the layer reinforced with geocells. the dashed line shows the first load/unload stage and the solid line shows the second load/unload stage. the deformation modulus edef,2 was determined from these. the static load tests were performed with a load range of 0–300 kpa. this limited load range, caused mainly by the technical equipment at the construction site, may explain why this study found a lesser benefit from the reinforcements in increasing the deformation modulus and bearing capacity of reinforced layers than other studies [5, 6, 15, 16, 18–20]. if the differences between the maximum vertical settlement of the reinforced and unreinforced layers are compared, fractions 0/8 and 0/32, due to the usage of geocells, demonstrated lower settlement values than the unreinforced layer (negative difference). in fractions 4/8 and 8/16, the differences between the maximum settlement values of the static load plate were practically the same. fraction 16/32 demonstrated a greater settlement in the reinforced layer with geocells than in the layer without geocells (fig. 5). fraction 16/32 had the lowest cell aspect ratio (7.4) and the highest medium grain size (23.6 mm) of all the backfill used.

figure 1. large-scale model: a) geometrical scheme of the model, b) backfill – 16/32 aggregate, c) backfill – 0/8 aggregate. dimensions of the model: 2 × 2 m (length × width).

characteristics                          4/8     8/16    16/32   0/8     0/32
classification (čsn 73 1005)             gp      gp      gp      s-f     g-f
specific gravity γs (g·cm−3)             2.658   2.652   2.689   2.752   2.715
moisture (%)                             0       0       0       3.09    3.82
fine fraction (%)                        0.2     0.1     0.0     10.9    6.6
sand fraction (%)                        1.8     0.5     0.3     47.1    18.6
gravel fraction (%)                      98.0    99.4    99.7    41.0    74.8
medium grain size d50 (mm)               5.2     11.2    23.6    1.5     8.3
coefficient of uniformity cu             2.2     1.43    1.51    35.3    53.1
coefficient of curvature cc              1.24    0.95    0.91    3.31    3.21
cell aspect ratio (cell diameter/d50)    33.7    15.6    7.4     116.7   21.1

table 1. characteristics of the backfill materials.

figure 2. particle size distribution curves.

figure 3. polygonal regression of the second part of the hysteresis loop of the static load plate test on the subsoil.

figure 4. load-displacement diagrams for static load plate tests: (a) subsoil, (b) 0/32 aggregate, (c) 0/8 aggregate, (d) 4/8 aggregate, (e) 8/16 aggregate, (f) 16/32 aggregate. the black curve represents the progress of the static load test without the geocell reinforcement, the blue curve represents the progress of the test with the geocell reinforcement.

figure 5. deformation modulus ratio (–) and settlement difference (mm) plotted against medium grain size (mm) and cell aspect ratio (–).

cell dimensions (mm)                  see fig. 1
cell height (mm)                      100
strip thickness (mm)                  1.98
cell surface                          perforated
ultimate tensile strength (kn·m−2)    20

table 2. geocell characteristics.

the objective, of course, was to achieve conditions where the vertical settlement values of the reinforced layer were less than the settlement values of the non-reinforced layer. success depended on selecting the most suitable cell aspect ratio, i.e. the most suitable geocell mesh size and backfill grain size. figure 5a shows that negative differences between the reinforced and unreinforced layers were achieved with a cell aspect ratio greater than 18.5. the change in the settlement is given by the difference between the maximum settlement values of the reinforced and non-reinforced layers from the second stage of the static load test. negative settlement difference values mean that lower settlement values were recorded for the reinforced layer than for the unreinforced layer. using the measurement results presented in figure 4, the deformation modulus was calculated for each backfill material. the results are summarized in figure 6 with the type of aggregate as abscissa and the deformation modulus as ordinate. each aggregate is characterized by two values/columns. the front column characterizes the aggregate without geocells, the rear column the aggregate with geocells. it is evident that the geocell reinforcement increases the deformation parameters. the results show that the highest deformation modulus value was for the 0/32 aggregate reinforced with geocells, and the lowest deformation modulus value was calculated for the 0/8 fraction without geocells. however, the magnitude of the deformation modulus is not as essential to this study as is the change after the use of geocells as a reinforcement system. the greatest effect of reinforcement on the deformation modulus was in the 0/8 aggregate, where the deformation modulus ratio ie (the ratio of the deformation modulus of reinforced backfill to unreinforced backfill) was 1.78. the 0/8 aggregate also had the highest cell aspect ratio of all the backfill types and the lowest medium grain size (table 3). if the deformation modulus ratio is plotted as a function of the cell aspect ratio (fig. 5a), it is clear that the deformation modulus ratio increased as the cell aspect ratio increased. in the previously mentioned studies, the effect of the reinforcement is usually assessed using the bearing capacity ratio. this ratio is defined as the bearing capacity of reinforced backfill to unreinforced backfill. most researchers generally agree that the larger the d50 (medium grain size = the diameter of the grain corresponding to 50 % of the backfill), the lower the bearing capacity ratio. the same dependence was observed in this study, however, not on the bearing capacity ratio but on the deformation modulus ratio (fig. 5b).
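the derived ratios discussed above can be recomputed directly from the d50 values in table 1 (they reappear in table 3 of the conclusions); the short python sketch below does this under the assumption of a cell diameter of about 175 mm, a value inferred here from the tabulated aspect ratios (e.g. 116.7 × 1.5 mm ≈ 175 mm) rather than stated explicitly in the paper.

```python
# recompute the cell aspect ratio and plate-diameter/d50 ratio from table 1;
# the 175 mm cell diameter is an inference, not a value stated in the paper
d50 = {"0/32": 8.3, "0/8": 1.5, "4/8": 5.2, "8/16": 11.2, "16/32": 23.6}  # [mm]
plate_diameter = 300.0  # [mm]
cell_diameter = 175.0   # [mm], inferred

for aggregate, d in d50.items():
    print(f"{aggregate:>6}: cell aspect ratio = {cell_diameter / d:6.1f}, "
          f"plate diameter/d50 = {plate_diameter / d:6.1f}")
```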
4. conclusions
the aim of the experiment was to determine the effect of geocells on the deformation behaviour of backfill material independently of the subsoil. a series of static plate load tests were performed to observe the vertical stress-displacement responses of unreinforced layers and layers reinforced with geocells with different backfill materials. five types of crushed aggregates were investigated in the study (0/8, 0/32, 4/8, 8/16, 16/32 aggregates). the subsoil was considered incompressible. the dimensions of the model were 2 × 2 m and the diameter of the loading plate was 300 mm. only one type of geocell was used in the experiment (the height of the geocell was 100 mm). from the results presented above, we can conclude that:
• the use of geocells as a reinforcement system led to an increase in layer deformation parameters in each of the 0/8, 0/32, 4/8, 8/16, 16/32 fractions used,
• the greatest effect of geocell reinforcement on the deformation modulus was in the case of the aggregate with fine fractions, the 0/8 aggregate,
• the greater the value of the cell aspect ratio (the ratio of the geocell's aperture size to the medium grain size of the backfill), the greater the deformation modulus ratio (fig. 5a),
• the geocell filler with a cell aspect ratio greater than 18.5 had lower settlement values in the reinforced layer than in the unreinforced layer (fig. 5a) in the second load/unload stage of the static load test,
• as the medium grain size of the used backfill increased, the deformation modulus ratio decreased.
further testing on materials is required to verify the presented dependencies and conclusions and to determine which medium grain size values, or more importantly, which cell aspect ratio values are the most suitable for a geocell system's performance.

figure 6. deformation modulus from the second loading stage of the load-displacement diagram.

aggregate   ie     cell aspect ratio   d50 (mm)   plate diameter/d50
0/32        1.41   21.1                8.3        36.15
0/8         1.78   116.7               1.5        200
4/8         1.41   33.7                5.2        57.69
8/16        1.11   15.6                11.2       26
16/32       1.28   7.4                 23.6       12.7

table 3. test results.

acknowledgements
this article was written under the support for long-term research projects in conceptual science and research development at všb – technical university of ostrava in 2018 provided by the ministry of education, youth and sports of the czech republic.

list of symbols
gp   poorly graded gravel
sp   poorly graded sand
gm   silty gravel
g-f  gravel with admixture of fine-grained soil
s-f  sand with admixture of fine-grained soil
γs   specific gravity [kn m−3]
d50  medium grain size [mm]
cu   coefficient of uniformity
cc   coefficient of curvature
e2   deformation modulus [pa]
ie   deformation modulus ratio
plt  plate load test

references
[1] s. kolathayar, p. suja, v. nair, et al. performance evaluation of seashell and sand as infill materials in hdpe and coir geocells. innovative infrastructure solutions 4:7, 2019. doi:10.1007/s41062-019-0203-6.
[2] v. hasthi, a. hegde. numerical analysis of machine foundation resting on the geocell reinforced soil beds. geotechnical engineering 49:55–62, 2018.
[3] o. kief, y. schary, s. pokharel. high modulus geocells for sustainable highway infrastructure. indian geotechnical journal 45:389–400, 2014. doi:10.1007/s40098-014-0129-z.
[4] s. l. webster, j. e. watkins. investigation of construction techniques for tactical bridge approach roads across soft ground, report s-77-1. tech. rep., soils and pavements laboratory, u.s.
army engineer waterways experiment station, vicksburg, mississippi, 1977. https://apps.dtic.mil/dtic/tr/fulltext/u2/a037351.pdf.
[5] j. n. mandal, p. gupta. stability of geocell-reinforced soil. construction and building materials 8(1):55–62, 1994. doi:10.1016/0950-0618(94)90009-4.
[6] j. han, j. thakur, r. parsons, et al. a summary of research on geocell-reinforced base courses. in international symposium on design and practice of geosynthetic-reinforced soil structures / 26th italian national conference on geosynthetics, pp. 331–340. bologna, italy, 2013. doi:10.13140/rg.2.1.4185.7129.
[7] b. leshchinsky, h. ling. effects of geocell confinement on strength and deformation behavior of gravel. journal of geotechnical and geoenvironmental engineering 139:340–352, 2012. doi:10.1061/(asce)gt.1943-5606.0000757.
[8] s. pokharel, j. han, d. leshchinsky, r. parsons. experimental evaluation of geocell-reinforced bases under repeated loading. international journal of pavement research and technology 11:114–127, 2017. doi:10.1016/j.ijprt.2017.03.007.
[9] k. j. wu, d. n. austin. three-dimensional polyethylene geocells for erosion control and channel linings. in r. m. koerner (ed.), geosynthetics in filtration, drainage and erosion control, pp. 275–284. elsevier, 1992. doi:10.1016/b978-1-85166-796-3.50023-2.
[10] i. vaníček, k. i. application and design of earth structures from the reinforced soils. acta polytechnica 40(2), 2000.
[11] v. hudeček, k. černá, l. gembalová, j. votoček. completion of restoration and rehabilitation of the central tailing heap of jan šverma mine in žacléř. acta montanistica slovaca 21(2):129–138, 2016.
[12] v. krivda, j. petru, k. zitnikova, i. mahdalova. road construction loaded by heavy vehicles. in 16th international multidisciplinary scientific geoconference sgem 2016, pp. 203–208. surveying geology & mining ecology management (sgem), albena, bulgaria, 2016.
[13] a. shadmand, m. ghazavi, n. ganjian. load-settlement characteristics of large-scale square footing on sand reinforced with opening geocell reinforcement. geotextiles and geomembranes 46:319–326, 2018. doi:10.1016/j.geotexmem.2018.01.001.
[14] a. hegde. geocell reinforced foundation beds – past findings, present trends and future prospects: a state-of-the-art review. construction and building materials 154:658–674, 2017. doi:10.1016/j.conbuildmat.2017.07.230.
[15] a. hegde, t. sitharam. experiment and 3d-numerical studies on soft clay bed reinforced with different types of cellular confinement systems. transportation geotechnics 10:73–84, 2017. doi:10.1016/j.trgeo.2017.01.001.
[16] g. t. mehrjardi, r. behrad, s. n. moghaddas tafreshi. scale effect on the behavior of geocell-reinforced soil. geotextiles and geomembranes 47:154–163, 2019. doi:10.1016/j.geotexmem.2018.12.003.
[17] k. rajagopal, n. r. krishnaswamy, g. m. latha. behaviour of sand confined with single and multiple geocells. geotextiles and geomembranes 17(3):171–184, 1999. doi:10.1016/s0266-1144(98)00034-x.
[18] s. pokharel, j. han, d. leshchinsky, et al. investigation of factors influencing behavior of single geocell-reinforced bases under static loading. geotextiles and geomembranes 28:570–578, 2010. doi:10.1016/j.geotexmem.2010.06.002.
[19] s. n.
moghaddas tafreshi, a. dawson. laboratory model tests for a strip footing supported on geocell reinforced sand bed. in ground improvement and geosynthetics, pp. 353–360. 2010. doi:10.1061/41108(381)46.
[20] n. k. kumawat, s. k. tiwari. bearing capacity of square footing on geocell reinforced fly ash beds. materials today: proceedings 4(9):10570–10580, 2017. doi:10.1016/j.matpr.2017.06.422.
[21] čsn p 73 1005 inženýrskogeologický průzkum (ground investigation). standard, úřad pro technickou normalizaci, metrologii a státní zkušebnictví, praha, 2016.
[22] y.-j. choi, d. ahn, t. nguyen, j. ahn. assessment of field compaction of aggregate base materials for permeable pavements based on plate load tests. sustainability 10:1–13, 2018. doi:10.3390/su10103817.
[23] d. adam, f. kopf, i. paulmichl. computational validation of static and dynamic plate load testing. acta geotechnica 4:35–55, 2009. doi:10.1007/s11440-008-0081-0.

acta polytechnica
doi:10.14311/ap.2018.58.0232
acta polytechnica 58(4):232–239, 2018
© czech technical university in prague, 2018
available online at http://ojs.cvut.cz/ojs/index.php/ap

ultra-high-performance fibre-reinforced concrete under high-velocity projectile impact. part i. experiments

sebastjan kravanja (a), radoslav sovják (b, ∗)
a faculty of civil and geodetic engineering, university of ljubljana, jamova cesta 2, ljubljana 1000, slovenia
b faculty of civil engineering, czech technical university in prague, thákurova 7, 166 29 prague 6, czech republic
∗ corresponding author: sovjak@fsv.cvut.cz

abstract. a series of cratering experiments were performed in which the response of ultra-high-performance fibre-reinforced concretes with various fibre volume fractions to high-velocity projectile impact loading was investigated. it was found that the increment of the fibre volumetric fraction did not have a significant influence on the depth of penetration, but it was very effective in reducing the crater area and volume.

keywords: projectile impact; uhpfrc; depth of penetration; mechanical properties; shear crack resistance.

1. introduction
as the occurrence of unexpected events is ever increasing in today's society, a primary loading type that needs to be addressed is the localized impact of projectiles. in a broad sense, the projectile impact might be understood as a fragment generated from a high-speed rotating machine, an explosion generated fragment, and last but not least a projectile generated from a direct armed attack. the projectile impact can be considered as a high strain-rate loading caused by an object travelling with a high speed and having a relatively low weight.
this kind of loading is characterized by its rapid increase in the release of energy in a very short time. several damage mechanisms are activated at once during the projectile penetration, such as compaction, compression with confining pressure and tension cracking. the complexity is further increased since the strain rate effect is different for each mechanism [1]. to simulate the effects of possible damage by the projectile impact, a series of small firearms tests with live ammunition was carried out. several authors suggested that structures are required to withstand impact loads generated by projectiles also in the case of plant-internal accidents [2, 3]. a typical case of this phenomenon is a fracture of a rotary machine, which could result in turbine missiles, which are projectiles travelling at high velocities [4]. perforating the wall of the turbine may result in a severe damage to the facilities and endanger the safety of the personnel [5]. concrete is commonly used as an engineering solution due to its ability to withstand impact and point loads. a further potential expansion in the use of concrete has been indicated by the development of ultra-high-performance fibre-reinforced concrete (uhpfrc). for instance, riedel et al. [6] implied that the uhpfrc is potentially suitable for improved protective structural elements. consistently, other researchers also noted that the uhpfrc is a feasible solution for protective structures due to its enhanced mechanical properties and impact resistance [7, 8]. the impact resistance of the uhpfrc was evaluated through three main damage degrees: the depth of penetration and the area and volume of the impact crater. besides the measurements of the impact resistance, several mechanical properties of the uhpfrc were determined in order to study their correlation with the damage degrees. in addition to the experimental results, a shear crack analysis was performed on samples cut through the point of the deepest penetration. the goal was to characterize the shear crack impact resistance of the uhpfrc for various fibre volume fractions.

2. material and methods
2.1. uhpfrc
the concrete base mixture was made using a low water-to-binder ratio and contained common high-performance concrete ingredients, such as ordinary portland cement binder, silica fume, superfine aggregates with a maximal grain size of 1.2 mm, a superplasticizer (high-range water reducer) in powder form, water and an anti-foaming agent [9]. for the fibre reinforcement of the specimens, straight, discrete, high-strength steel fibres were used with a length of 14 mm and a diameter of 0.13 mm. the material of the fibres had a density of 7850 kg/m3, a tensile strength of 2800 mpa and an elastic modulus of 200 gpa.

figure 1. experimental set-up for determination of tensile mechanical properties of uhpfrc: a) prism used for the 4-point flexural strength test (i.e. modulus of rupture), b) dog-bone shaped specimen used for the direct tensile strength test with clothoid transitions.

five different fibre volumetric fractions were set with gradually enlarging increments given by the doubling geometric sequence $v_{f,i} = 0.125 \cdot 2^{\,i-1}$ (in %) for $1 \le i \le 5$, which yielded: vf,1 = 0.125 %, vf,2 = 0.25 %, vf,3 = 0.5 %, vf,4 = 1 % and vf,5 = 2 %.
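as a one-line check of this doubling sequence (an illustrative sketch, not part of the original study):

```python
# fibre volume fractions from the doubling geometric sequence vf_i = 0.125 * 2**(i-1)
vf = [0.125 * 2 ** (i - 1) for i in range(1, 6)]
print(vf)  # [0.125, 0.25, 0.5, 1.0, 2.0]  (per cent)
```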
in addition, plain mixture specimens without fibre addition were also cast as control and comparison samples, labelled for the case i = 0 as vf,0 = 0 %. the plain mixture specimen is also referred to in the text as an ultra-high-performance concrete (uhpc).

2.2. mechanical properties
mechanical properties including the unconfined compressive strength, flexural tensile strength (modulus of rupture) and direct tensile strength were investigated in the framework of the experimental part of the study. at least six specimens were tested for every mechanical property in the framework of the individual fibre volumetric content considered in this study. compressive strength was determined on cubes (100 × 100 × 100 mm) by monotonic increments of the load by 0.6 mpa/s. the modulus of rupture was determined on prisms in a four-point bending configuration without a notch (fig. 1a). the prisms were 100 × 100 × 550 mm in size and the clear span was 450 mm. the constant moment region regarding the four-point bending configuration was 150 mm. the test was deformation controlled and the loading rate was 0.05 mm/min. a pair of lvdt (linear variable differential transformer) sensors was attached to the beam in the mid-span. the flexural toughness factor (ftδ) was determined according to a method for testing the flexural strength and flexural toughness of a fibre reinforced concrete [10, 11]. the flexural toughness factor was calculated through the area of the load-displacement diagram of the uhpfrc prism up to a load point deflection of span/150 [12]. direct tensile strength was determined by using a dog-bone shaped specimen with a central part 200 mm in length and a reduced cross-section of 50 × 100 mm. the total length of the dog-bone specimen was 750 mm and the narrowing of its cross-section was done by means of clothoids 100 mm in length and 25 mm in height (fig. 1b). the loading was deformation controlled with the loading rate set to 0.05 mm/min. furthermore, two additional mechanical properties were calculated. wille et al. [13] introduced the variable g, defined as the energy absorbing capacity prior to tension softening, and the variable G, defined as the fracture energy dissipated per crack surface. the energy absorbing capacity (g) and the fracture energy (G) under direct tension were obtained from the stress-strain diagram and the load-total crack width diagram, respectively. the strain of the specimen under direct tension was measured by using a pair of strain gauges 50 mm in length that were glued on the sides of the dog-bone specimen in its reduced cross-section. after the crack localization, the total crack width was measured using 4 inductive sensors that were placed over the reduced cross-section of the dog-bone specimen. the mechanical properties of the resulting mixture are listed in table 1.

fibre volume fraction (%)         plain   0.125   0.25    0.5     1       2
cube compressive strength (mpa)   111     118     131     133     134     144
modulus of rupture (mpa)          6.30    7.07    7.76    10.3    20.3    26.1
flexural toughness factor (mpa)   0.07    2.66    5.37    8.88    17.5    20.7
direct tensile strength (mpa)     3.54    3.58    3.60    3.83    5.05    6.52
energy abs. capacity (kj/m3)      0.170   0.171   0.215   0.229   0.303   1.270
fracture energy (kj/m2)           0.10    5.42    6.08    10.3    12.7    17.7
bulk density (kg/m3)              2346    2355    2358    2365    2381    2421

table 1. mechanical properties of the resulting uhpfrc mixture with various fibre volume fractions.

2.3. impact testing
in total, 38 specimens, which all had the same designed dimensions of a cube (200 × 200 × 200 mm), were cast and tested. six specimens were made for each fibre volumetric content and for the plain concrete mixture, except for the 2 % fibre content, for which eight specimens were made, considering that this was established as the optimal amount of fibres in the previous research [14, 15]. the impact test was conducted using a semi-automatic rifle with calibre 7.62 × 39 mm and two types of ogive-nosed projectiles (fig. 2).

figure 2. projectiles used in the framework of this study and their dimensions in millimetres: a) full-metal jacket with the mild-steel core (fmj-msc), b) full-metal jacket with the soft-lead core (fmj-slc).
the deformable projectiles had a full-metal jacket and a soft-lead core (fmj-slc or slc), while the non-deformable projectiles had a mild-steel core with a smaller lead tip (fmj-msc or msc). both types of projectiles had a mass of 8.04 g and an outer diameter of 7.92 mm. the diameter of the core of the msc projectile and of the slc projectile was 5.68 mm and 6.32 mm, respectively. in the research by sovják et al. [14] with the same type of projectiles impacting uhpfrc plates with 50 mm and 45 mm thicknesses, it was found out that the steel jacket was separated during the penetration process and rebounded from the target. from the experimental observations, it was concluded that the diameter which controls the penetration process is the diameter of the core of the projectile (dc). in addition, no pitch or yaw angles were noted in the records from the high-speed camera and it was concluded that a normal penetration occurred at all times. the influence of the shape of the projectile head can be considered with the nose shape factor. the latter can be determined by the use of the ratio between the length of the projectile head and the projectile diameter or with the use of the calibre radius head (crh). the crh is a factor of projectile head sharpness and it is defined as the ratio between the projectile head radius of curvature $r$ and the projectile diameter $d$ [16]:

$$\mathrm{crh} = \frac{r}{d}$$

higher values of the crh imply a sharper projectile head and a larger pressure at the target, which results in a deeper total penetration depth. the calibre radius head of both types of projectiles was calculated based on the whole projectile diameter d and the core diameter dc (table 2).

                                  fmj-msc   fmj-slc
external diameter (mm)            7.92      7.92
core diameter (mm)                5.68      6.32
total length (mm)                 26.60     23.20
jacket mass (g)                   4.15      2.78
core mass (g)                     3.65      5.26
tip mass (g)                      0.24      –
total mass (g)                    8.04      8.04
head radius of curvature r (mm)   28        31
crh(d)                            3.54      3.91
crh(dc)                           4.93      4.91

table 2. non-deformable and deformable projectile geometrical and mass properties.
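the crh values in table 2 follow directly from the definition crh = r/d; the following python fragment (an illustration added here, not from the paper) reproduces them:

```python
# calibre radius head crh = r/d for both projectiles, using the values
# listed in table 2 (head radius r and the diameters in mm)
projectiles = {
    "fmj-msc": {"r": 28.0, "d": 7.92, "dc": 5.68},
    "fmj-slc": {"r": 31.0, "d": 7.92, "dc": 6.32},
}
for name, p in projectiles.items():
    print(f"{name}: crh(d) = {p['r'] / p['d']:.2f}, "
          f"crh(dc) = {p['r'] / p['dc']:.2f}")
# -> crh(d) = 3.54 / 3.91 and crh(dc) = 4.93 / 4.91, matching table 2
```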
the average muzzle velocity for the msc and slc projectiles was 710 m/s, while all the velocities were within the range of 710 ± 20 m/s. the projectiles were fired from a 20 m distance to the targets and hit them perpendicularly in the approximate centre of the proximal face under normal penetration (fig. 3).

figure 3. experimental set-up of the cratering experiments induced by two various projectiles on uhpfrc semi-infinite targets.

the objective of the experimental part of this study was to investigate whether the increment in the fibre volumetric fraction within the concrete largely affects the damage intensity, by measuring the dimensions of the crater made by the aforementioned impact. after the impact test, three main damage degrees were accurately measured, namely the depth of penetration, the crater area and the crater volume, in order to quantitatively evaluate the difference in the damage resistance with an enlarging fibre volume fraction increment. the impact velocity was determined from the measured muzzle velocity by using a shooting chronograph. the 7.62 × 39 ak-47 (kalashnikov) projectile with a mass of 8.04 grams is defined by kneubuehl [17] to have a muzzle velocity of 710 m/s, and within 20 meters the velocity of the projectile is supposed to decrease to 688 m/s. thus, all muzzle velocities measured in the framework of this study were reduced by 22 m/s, which yielded the actual impact velocity.

2.4. shear crack analysis
the analysis of the shear cracks produced by the projectile impact of both types was conducted on slices with a width of 10 mm, which were cut longitudinally through the target at the point of the maximum penetration depth. cracks appeared in all of the samples; however, the intensity of the cracking was obviously decreasing with an increment of the fibre volumetric fraction. the intention was to quantitatively assess the crack impact resistance with respect to the fibre volumetric content increment. the shear crack angles and the dimensions of the cracks were measured using a ruler, a microscope and cad software. the crack impact resistance was evaluated through newly proposed factors of the impact resistance of brittle concrete composites reinforced with various fibres. the total shear crack length lc and the total depth of the shear cracks zc in each specimen were measured and the number of major cracks nc in each specimen was recorded. the shear crack widths were measured using a microscope (scale of 10 µm). the resistance against a shear crack propagation under the projectile impact was quantitatively evaluated through the use of the ultimate crack resistance factor ru, recommended by kankam [18] and revised by alhasanat and al qadi [19]:

$$r_u = \frac{n_i\,e_k}{l_c\,z_c\,w_c},$$

where $n_i$ is the number of blows, $e_k$ the impact (impulse) kinetic energy, $l_c$ the total length of all cracks, $z_c$ the maximum depth of the cracks and $w_c$ the maximum crack width that emerged due to the projectile impact. additionally, for the means of an efficient comparison with other composites or materials with different compressive strengths, the non-dimensional impact crack resistance ratio cr was defined as the ratio between the ultimate crack resistance factor and the unconfined compressive strength of the tested material:

$$c_r = \frac{r_u}{f'_c}.$$
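as a worked example of these two formulas, the following python sketch evaluates ru and cr for one fibre content; the input values are taken from tables 1, 6 and 7 of the paper, while ni = 1 (a single shot per crater) is an assumption made here.

```python
# ultimate crack resistance factor ru = ni*ek/(lc*zc*wc) and crack resistance
# ratio cr = ru/f'c, evaluated for the 0.125 % fmj-msc case; ek, lc, zc are
# from tables 6 and 7 below, wc = 0.4 mm is the estimate from section 3.2,
# f'c is from table 1, and ni = 1 (single impact) is assumed here
ni = 1          # number of blows
ek = 1915.8     # impact kinetic energy [J] = [N*m]
lc = 430.0      # total length of all cracks [mm]
zc = 90.0       # maximum crack depth [mm]
wc = 0.4        # maximum crack width [mm]
fc = 118.0      # cube compressive strength [MPa = N/mm^2]

ru = ni * ek * 1000.0 / (lc * zc * wc)  # J -> N*mm gives ru in N/mm^2
cr = ru / fc
print(f"ru = {ru:.1f} N/mm^2, cr = {cr:.2f}")  # ru = 123.8, cr ~ 1.0 (1.1 in table 7)
```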
3. results for the cratering experiments
the numerical values of the experimental research are summarized in tables 3 and 4.

vf (%)   dop (mm)   area (cm2)   volume (cm3)   cone diameter (mm)   failure angle (°)
0        36.41      84.70        –              101.7                26.83
0.125    35.78      50.58        30.33          84.38                33.42
0.25     35.72      37.53        23.17          72.17                39.75
0.5      35.13      21.97        14.50          64.18                47.08
1        33.37      24.48        15.67          63.10                46.17
2        31.43      18.99        11.50          54.83                51.81

table 3. average values for all measured or calculated damage degrees for the fmj-msc projectile impact (the last two columns refer to the equivalent cone failure).

vf (%)   dop (mm)   area (cm2)   volume (cm3)   cone diameter (mm)   failure angle (°)
0        20.59      114.4        96.33          114.8                22.33
0.125    19.61      40.87        25.33          74.63                35.42
0.25     18.80      39.71        24.00          74.08                40.75
0.5      20.46      30.12        20.00          68.55                47.58
1        20.09      33.20        20.67          73.49                41.83
2        20.04      21.12        13.88          57.22                46.31

table 4. average values for all measured or calculated damage degrees for the fmj-slc projectile impact (the last two columns refer to the equivalent cone failure).

in all of the cases, the dop was less than 38 mm, which is less than 1/5 of the target thickness, reaffirming the supposition of the semi-infinite target. in the cases of the msc projectiles that impacted on the plain mixture specimens, an
the basic correlation between the depth of the penetration or crater volume and inverse value of square root of unconfined compressive strength, on which the majority of models are derived, suggests that every concrete with the same unconfined strength exhibits an identical resistance to the impact penetration. it is known that the response of a concrete material to impact loading is conditioned on other concrete parameters as well, such as type, shape and size of aggregate, granulometric composition, additives, fibres type and volumetric content and so on, which can in different combinations result in a similar unconfined compressive strength but different impact resistance [20]. yankelevsky is suggesting that this relationship supposition lacks physical meaning and seems to be arbitrarily predetermined. in order to test this hypothesis and furthermore assess the recent findings, a statistical evaluation of the correlation between three main damage degrees and tested mechanical properties has been conducted. correlation coefficients between each of these quantities have been calculated and compared between values for each projectile type separately (table 5). it was already shown that in general, damage degrees’ values are decreasing whereas mechanical strengths are increasing with an increment of the fibre volumetric fraction. it can be seen that in the case of the msc projectile impact, the correlation coefficients are much larger than in the case of slc projectile impact, especially in the case of the dop. it can be also seen that the dop in the msc case is more correlated to tensile and flexural strength than unconfined compressive strength, whereas it is also strongly correlated to fracture energy, which is based on the direct tensile test. however, the crater area and crater volume are more correlated to the unconfined compressive strength in both projectile cases. 236 vol. 58 no. 4/2018 ultra-high-performance fibre-reinforced concrete. part i. fmj-msc fmj-slc dop area volume dop area volume f′c 0.84 0.93 0.93 0.15 0.83 0.80√ f′c 0.83 0.94 0.93 0.16 0.84 0.80 ft 0.99 0.59 0.72 0.16 0.49 0.44√ ft 0.99 0.61 0.73 0.17 0.50 0.45 ftδ 0.97 0.81 0.85 0.10 0.68 0.63 mor 0.99 0.67 0.77 0.17 0.55 0.50 g 0.89 0.48 0.62 0.08 0.42 0.36 g 0.94 0.90 0.88 0.04 0.81 0.77 table 5. correlation coefficients for the dependency of damage degrees to mechanical properties of the tested uhpfrc. fmj-msc fmj-slc vf zc lc nc zc lc nc (%) (mm) (mm) (–) (mm) (mm) (–) 0 141 – – 113 353 5 0.125 90 430 6 97 232 6 0.25 72 370 6 60 175 5 0.5 79 269 6 66 135 4 1 68 110 4 40 84 3 2 52 88 3 20 58 3 table 6. crack dimensions. this is, however, in conflict with previous studies [21] showing that the tensile strength, strain softening, fracture energy, and strain-rate effect on the tensile strength in the prediction model are crucial for the dimensions of the crater area and volume. the reason for this may be attributed to the shallow penetration in these experiments where the depth of the penetration was no more than two times of the projectiles’ length. as the tensile strength is crucial for the crater depth, the dop is sensitive to the tensile strength in the shallow penetration where the crater depth is more than half of the dop. furthermore, it should be noted that the compressive strength, tensile strength and flexural strength do not affect only the dop. 
during the penetration process, the concrete material surrounding the projectile is subjected to a high-intensity triaxial stress state that can be described by the shear failure surface of concrete that must be determined by a large amount data of triaxial stress state, while the unconfined compressive or tensile strength is only a uniaxial stress state [21]. in addition, the volume change of concrete occurs during the penetration, thus the equation of state that is used to describe the relationship between pressure and volumetric strain plays an important role. 3.2. shear crack resistance for the fmj-msc projectile impact, the average shear crack width value was 0.16 mm, while for the fmjfmj-msc fmj-slc vf ek ru cr ek ru cr (%) (j) (n/mm2) (–) (j) (n/mm2) (–) 0 1813.6 – – 1966.0 49.3 0.4 0.125 1915.8 123.8 1.1 1838.9 204.3 1.7 0.25 1913.9 179.6 1.4 1878.9 447.4 3.4 0.5 1873.5 220.4 1.7 1868.0 524.1 3.9 1 1878.9 628.0 4.7 1893.6 1409.0 10.5 2 1905.6 1041.1 7.2 1880.8 4053.4 28.2 table 7. impact kinetic energy ek, ultimate crack resistance factor ru, impact crack resistance ratio cr. slc, it was 0.12 mm. in order to estimate the ultimate impact crack resistance factor, the maximal width was estimated for both type of projectiles for the uhpfrc as wc,max,uhpfrc = 0.4 mm. in the case of the fmj-slc projectile impact on plain uhpc specimens, where the target was not defragmented, the maximal width was estimated as wc,max,uhpc = 1.0 mm, in the case of the fmj-msc projectile impact on plain uhpc specimens, all the targets were defragmented and no additional cracks appeared in the remaining segment of the target (table 6). with the measured data and calculated impact kinetic energies ek from estimated impact velocities, the ultimate impact crack resistance factor ru and crack resistance ratio cr were calculated (table 7). it can be observed that with the increment of the fibre volumetric fraction, the ultimate crack resistance factor of the uhpfrc is increasing with a constant trend (fig. 5). the crack resistance increases with larger increments in the case of the slc projectiles impact, since a lot of the impact energy is already dissipated with the significant deformation of the deformable lead core. it can be concluded, that the fibre volumetric fraction increment up to 2 % is efficient in providing the impact crack resistance to the otherwise brittle highperformance concrete matrix. this property is essential in providing further impact resistance against additional projectile impacts. 237 sebastjan kravanja, radoslav sovják acta polytechnica figure 5. ultimate crack resistance factor vs. fibre volumetric fraction for both types of projectiles. 4. conclusions in the experimental research, cube specimens made of the uhpfrc were subjected to a nondeformable and deformable projectile impact. cubes were differentiated on the basis of the fibre volumetric content and effect of projectile impact was investigated in terms of the cratering damage. the aim of the experimental part was to describe the shape of the crater that was produced on the uhpfrc targets as a result of the deformable and non-deformable projectile impact. based on the experimental results and statistical correlation analysis, the following conclusions can be drawn: (1.) an increase in the fibre volume fraction leads to an increase in mechanical properties. compressive mechanical properties are far less affected by the fibre volumetric content than tensile and flexural mechanical properties. (2.) 
an increase in fibre volume fraction also provides an efficient solution for decreasing the intensity of shear crack propagation due to the deformable and non-deformable projectile impact. (3.) statistical evaluation of the correlation between mechanical properties and main damage degrees emerged due to the rigid projectile impact showed that the depth of the penetration is more correlated to tensile and flexural strength than to unconfined compressive strength. however, the crater area and volume seem to be in a better correlation with the latter than with the former two. (4.) the conclusion that the dop is more correlated to the tensile and flexural strength than the unconfined compressive strength is valid within the framework of this study and is attributed to the nature of the shallow penetration in these experiments. (5.) it should be noted that the compressive strength, tensile strength and flexural strength do not affect only the dop and they should be taken with caution when standing alone. high-intensity triaxial stress state and the volume changes of concrete that occur during the penetration should also be taken into consideration. acknowledgements this work was supported by the ministry of interior of the czech republic project vi20172020061. the authors also acknowledge the assistance from the technical staff at the experimental centre, faculty of civil engineering, czech technical university in prague; and students who participated in the project. references [1] børvik t, langseth m, hopperstad os, polanco-loria ma. ballistic perforation resistance of high performance concrete slabs with different unconfined compressive strengths. proc. first int. conf. high perform. struct. compos. sevilla, spain wit press (isbn 1-85312-904-6), 2002, p. 273–82. [2] shirai t, kambayashi a, ohno t, taniguchi h, ueda m, ishikawa n. experiment and numerical simulation of double-layered rc plates under impact loadings. nucl eng des 1997;176:195–205. doi:10.1016/s0029-5493(97)00142-8. [3] ohno t, uchida t, matsumoto n, takahashi y. local damage of reinforced concrete slabs by impact of deformable projectiles. nucl eng des 1992;138:45–52. doi:10.1016/0029-5493(92)90277-3. [4] kar ak. residual velocity for projectiles. nucl eng des 1979;53:87–95. doi:10.1016/0029-5493(79)90042-6. [5] amde am, mirmiran a, a. walter t. local damage assessment of turbine missile impact on composite and multiple barriers. nucl eng des 1997;178:145–56. doi:10.1016/s0029-5493(97)00206-9. 238 http://dx.doi.org/10.1016/s0029-5493(97)00142-8. http://dx.doi.org/10.1016/0029-5493(92)90277-3. http://dx.doi.org/10.1016/0029-5493(79)90042-6. http://dx.doi.org/10.1016/s0029-5493(97)00206-9. vol. 58 no. 4/2018 ultra-high-performance fibre-reinforced concrete. part i. [6] riedel w, nöldgen m, straßburger e, thoma k, fehling e. local damage to ultra high performance concrete structures caused by an impact of aircraft engine missiles. nucl eng des 2010;240:2633–42. doi:10.1016/j.nucengdes.2010.07.036. [7] millon o, riedel w, mayrhofer c, thoma k. fiberreinforced ultra-high performance concrete – a material with potential for protective structures. in: li qm, hao h, li zx, yankelevsky d, editors. proc. first int. conf. prot. struct., manchester: manchester; 2010, p. no.-013. [8] nicolaides d, kanellopoulos a, petrou m, savva p, mina a. development of a new ultra high performance fibre reinforced cementitious composite (uhpfrcc) for impact and blast protection of structures. constr build mater 2015;95:667–74. 
doi:10.1016/j.conbuildmat.2015.07.136. [9] bažantová z, kolář k, konvalinka p, litoš j. multi-functional high-performance cement based composite. key eng mater 2016;677:53–6. doi:10.4028/www.scientific.net/kem.677.53. [10] jsce. test method for bending strength and bending toughness of steel fiber reinforced concrete. standard specification for concrete structures, test methods and specifications. 2005. [11] banthia n, majdzadeh f, wu j, bindiganavile v. fiber synergy in hybrid fiber reinforced concrete (hyfrc) in flexure and direct shear. cem concr compos 2014;48:91–7. doi:10.1016/j.cemconcomp.2013.10.018. [12] sovják r, shanbhag d, konrád p, zatloukal j. response of thin uhpfrc targets with various fibre volume fractions to deformable projectile impact. procedia eng., vol. 193, 2017, p. 3–10. doi:10.1016/j.proeng.2017.06.179. [13] wille k, el-tawil s, naaman ae. properties of strain hardening ultra high performance fiber reinforced concrete (uhp-frc) under direct tensile loading. cem concr compos 2014. doi:10.1016/j.cemconcomp.2013.12.015. [14] sovják r, vavřiník t, zatloukal j, máca p, mičunek t, frydrýn m. resistance of slim uhpfrc targets to projectile impact using in-service bullets. int j impact eng 2015;76:166–77. doi:10.1016/j.ijimpeng.2014.10.002. [15] máca p, sovják r, konvalinka p. mix design of uhpfrc and its response to projectile impact. int j impact eng 2013;63:158–63. doi:10.1016/j.ijimpeng.2013.08.003. [16] li qm, reid sr, wen hm, telford ar. local impact effects of hard missiles on concrete targets. int j impact eng 2005;32:224–84. doi:10.1016/j.ijimpeng.2005.04.005. [17] kneubuehl bp, sellier kg. wound ballistics. basics appl heidelb 2011. [18] kankam ck. impact resistance of palm kernel fibre-reinforced concrete pavement slab. j ferrocem 1999;29(4):279–86. [19] alhasanat mba, al qadi an. impact behavior of high strength concrete slabs with pozzolana as coarse aggregate. am j appl sci 2016;13:754–61. doi:10.3844/ajassp.2016.754.761. [20] yankelevsky dz. resistance of a concrete target to penetration of a rigid projectile revisited. int j impact eng 2017;106:30–43. doi:10.1016/j.ijimpeng.2017.02.021. [21] kong x, fang q, wu h, peng y. numerical predictions of cratering and scabbing in concrete slabs subjected to projectile impact using a modified version of hjc material model. int j impact eng 2016;95:61–71. doi:10.1016/j.ijimpeng.2016.04.014. 239 http://dx.doi.org/10.1016/j.nucengdes.2010.07.036. http://dx.doi.org/10.1016/j.conbuildmat.2015.07.136. http://dx.doi.org/10.4028/www.scientific.net/kem.677.53. http://dx.doi.org/10.1016/j.cemconcomp.2013.10.018. http://dx.doi.org/10.1016/j.proeng.2017.06.179. http://dx.doi.org/10.1016/j.cemconcomp.2013.12.015. http://dx.doi.org/10.1016/j.ijimpeng.2014.10.002. http://dx.doi.org/10.1016/j.ijimpeng.2013.08.003. http://dx.doi.org/10.1016/j.ijimpeng.2005.04.005. http://dx.doi.org/10.3844/ajassp.2016.754.761. http://dx.doi.org/10.1016/j.ijimpeng.2017.02.021. http://dx.doi.org/10.1016/j.ijimpeng.2016.04.014. 
acta polytechnica 58(4):232–239, 2018 1 introduction 2 material and methods 2.1 uhpfrc 2.2 mechanical properties 2.3 impact testing 2.4 shear crack analysis 3 results for the cratering experiments 3.1 correlation of damage degrees to mechanical properties 3.2 shear crack resistance 4 conclusions acknowledgements references acta polytechnica doi:10.14311/ap.2017.57.0477 acta polytechnica 57(6):477–487, 2017 © czech technical university in prague, 2017 available online at http://ojs.cvut.cz/ojs/index.php/ap rationally extended shape invariant potentials in arbitrary d dimensions associated with exceptional xm polynomials rajesh kumar yadava, nisha kumarib, avinash kharec, bhabani prasad mandalb, ∗ a department of physics, s. p. college, dumka (skmu dumka)-814101, india b department of physics, banaras hindu university, varanasi-221005, india c raja ramanna fellow, indian institute of science education and research (iiser), pune-411021, india ∗ corresponding author: bhabani.mandal@gmail.com abstract. rationally extended shape invariant potentials in arbitrary d-dimensions are obtained by using point canonical transformation (pct) method. the bound-state solutions of these exactly solvable potentials can be written in terms of xm laguerre or xm jacobi exceptional orthogonal polynomials. these potentials are isospectral to their usual counterparts and possess translationally shape invariance property. keywords: exceptional orthogonal polynomial; point canonical transformation; rationally extended potential; shape invariance property. 1. introduction in recent years the discovery of exceptional orthogonal polynomials (eops) (also known as x1 laguerre and x1 jacobi polynomials) [1, 2] has increased the list of exactly solvable potentials. the eops are the solutions of secondorder sturm-liouvilli eigenvalue problem with rational coefficients. unlike the usual orthogonal polynomials, the eops starts with degree n ≥ 1 and still form a complete orthonormal set with respect to a positive definite innerproduct defined over a compact interval. after the discovery of these two polynomials quesne et.al. reported three shape invariant potentials whose solutions are in terms of x1 laguerre polynomial (extended radial oscillator potentials) and x1 jacobi polynomials (extended trigonometric scarf and generalized pöschl teller (gpt) potentials) [3, 4]. subsequently odake and sasaki generalize these and obtain the solutions in terms of exceptional xm orthogonal polynomials [5]. the properties of these xm exceptional orthogonal polynomials have been studied in detail in ref. [6–9]. subsequently, the extension of other exactly solvable shape invariant potentials have also been done [10–13] by using different approaches such as supersymmetry quantum mechanics (susyqm) [17, 18], point canonical transformation (pct) [21], darboux-crum transformation (dct) [22] etc. the scattering amplitude of some of the newly found exactly solvable potentials in terms of xm eops are studied in ref. [14–16]. the bound state solutions of these extended (deformed) potentials are in terms of eops or some type of new polynomials (yn) ( which can be expressed in terms of combination of usual laguerre or jacobi orthogonal polynomials). the bound state spectrum of all these extended potentials are investigated in a fixed dimension (d = 1 or 3). recently the extension of some exactly solvable potentials have been made in arbitrary d dimensions whose solutions are in terms of x1 eops [26]. 
the obvious question then is whether one can extend this discussion and obtain potentials whose solutions are in terms of xm eops in arbitrary dimensions. the purpose of this paper is to answer this question. in particular, in this paper we apply the pct approach, which consists of a coordinate transformation and a functional transformation and allows the generation of normalized, exact analytic bound state solutions of the schrödinger equation, starting from an analytically solvable conventional potential. here we consider two analytically solved conventional potentials (the isotropic oscillator and the gpt potential) [3, 28], corresponding to which d-dimensional rationally extended potentials are obtained whose solutions are in terms of xm exceptional laguerre or jacobi polynomials.

this paper is organized as follows. in § 2, details of the point canonical transformation (pct) method for arbitrary d-dimensions are given. in § 3, we write down the differential equations corresponding to the xm eops and discuss some important properties of the eops. arbitrary d-dimensional rationally extended exactly solvable potentials whose solutions are in terms of xm laguerre or xm jacobi eops are obtained in § 4. the approximate solutions corresponding to the xm jacobi case are also discussed in this section. in § 5, new shape invariant potentials for the rationally extended xm laguerre and xm jacobi polynomials are obtained in arbitrary d dimensions. in particular, for d = 2 and d = 4, the shape invariant partner potentials for the extended radial oscillator are obtained explicitly. § 6 is reserved for results and discussions.

2. point canonical transformation (pct) method for arbitrary d-dimensions

in this section, we discuss a more traditional approach, the pct approach [21], to get the extension of conventional potentials by considering the radial schrödinger equation in arbitrary d-dimensional euclidean space [19, 20], given by (ħ = 2m = 1)

$$\frac{d^2\psi(r)}{dr^2} + \frac{(d-1)}{r}\frac{d\psi(r)}{dr} + \Big(E_n - V(r) - \frac{\ell(\ell+d-2)}{r^2}\Big)\psi(r) = 0. \qquad (1)$$

to solve this equation, we apply the pct approach and assume a solution of the form

$$\psi(r) = f(r)\,F(g(r)), \qquad (2)$$

where f(r) and g(r) are two undetermined functions and F(g(r)) will later be identified as one of the orthogonal polynomials, which satisfy a second-order differential equation

$$F''(g(r)) + Q(g(r))\,F'(g(r)) + R(g(r))\,F(g(r)) = 0. \qquad (3)$$

here a prime denotes the derivative with respect to g(r). using (2) in (1) and comparing the result with (3), we get

$$f(r) = N\,r^{-\frac{(d-1)}{2}}\,\big(g'(r)\big)^{-\frac{1}{2}}\exp\Big(\frac{1}{2}\int Q(g)\,dg\Big) \qquad (4)$$

and

$$E_n - V(r) - \frac{\ell(\ell+d-2)}{r^2} = \frac{1}{2}\{g(r),r\} + g'(r)^2\Big(R(g) - \frac{1}{2}Q'(g) - \frac{1}{4}Q^2(g)\Big) + \frac{(d-1)(d-3)}{4r^2}, \qquad (5)$$

where N is the integration constant and plays the role of the normalization constant of the wavefunctions, and {g(r),r} is the schwarzian derivative [29], defined as

$$\{g(r),r\} = \frac{g'''(r)}{g'(r)} - \frac{3}{2}\,\frac{g''^2(r)}{g'^2(r)}. \qquad (6)$$

here the prime denotes the derivative with respect to r. from (2) and (4), the normalizable wavefunction is given by

$$\psi(r) = \frac{\chi(r)}{r^{(d-1)/2}}, \qquad (7)$$

where

$$\chi(r) = N\,\big(g'(r)\big)^{-\frac{1}{2}}\exp\Big(\frac{1}{2}\int Q(g)\,dg\Big)\,F(g(r)). \qquad (8)$$

the radial wavefunction ψ(r) = χ(r)/r^(d−1)/2 has to satisfy the boundary condition χ(0) = 0; to be more precise, it must at least vanish as fast as r^(d−1)/2 as r goes to zero in order to rule out singular solutions [30].
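as a quick sanity check of (5) and (6) (ours, not part of the paper): for the quadratic choice g(r) = c₁r²/4 adopted later for the radial oscillator, the schwarzian derivative reduces to −3/(2r²) and (g′)²/g is the constant c₁, which is precisely what lets a constant term appear on the right-hand side of (5). a minimal sympy sketch:

```python
import sympy as sp

r, c1 = sp.symbols('r c_1', positive=True)
g = c1*r**2/4                        # the oscillator choice of g(r)

def schwarzian(g, r):                # eq. (6)
    return sp.diff(g, r, 3)/sp.diff(g, r) \
        - sp.Rational(3, 2)*(sp.diff(g, r, 2)/sp.diff(g, r))**2

print(sp.simplify(schwarzian(g, r)))       # -> -3/(2*r**2)
print(sp.simplify(sp.diff(g, r)**2 / g))   # -> c_1, a constant as required
```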
for (5) to be satisfied, one needs to find some function g(r) ensuring the presence of a constant term on its right-hand side to compensate E_n on its left-hand side, while giving rise to a potential V(r) with well behaved wavefunctions.

3. exceptional xm orthogonal polynomials

for completeness, we now give the differential equations corresponding to the xm eops and summarize the important properties of these two families of polynomials.

3.1. exceptional xm laguerre orthogonal polynomials

for an integer m ≥ 0, n ≥ m and k > m, the xm laguerre orthogonal polynomial satisfies the differential equation [23]

$$\hat{L}''^{(\alpha)}_{n,m}(g) + \frac{1}{g}\Big((\alpha+1-g) - 2g\,\frac{L^{(\alpha)}_{m-1}(-g)}{L^{(\alpha-1)}_{m}(-g)}\Big)\hat{L}'^{(\alpha)}_{n,m}(g) + \frac{1}{g}\Big(n - 2\alpha\,\frac{L^{(\alpha)}_{m-1}(-g)}{L^{(\alpha-1)}_{m}(-g)}\Big)\hat{L}^{(\alpha)}_{n,m}(g) = 0. \qquad (9)$$

the l² norms of the xm laguerre polynomials are given by

$$\int_0^\infty \big(\hat{L}^{(\alpha)}_{n,m}(g)\big)^2\,W^{\alpha}_{m}(g)\,dg = \frac{(\alpha+n)\,\Gamma(\alpha+n-m)}{(n-m)!}, \qquad (10)$$

where

$$W^{\alpha}_{m}(g) = \frac{g^{\alpha}e^{-g}}{\big(L^{(\alpha-1)}_{m}(-g)\big)^2} \qquad (11)$$

is the weight factor for the xm laguerre polynomials. in terms of the classical laguerre polynomials, the xm laguerre polynomials can be written as

$$\hat{L}^{(\alpha)}_{n,m}(g) = L^{(\alpha)}_{m}(-g)\,L^{(\alpha-1)}_{n-m}(g) + L^{(\alpha-1)}_{m}(-g)\,L^{(\alpha)}_{n-m-1}(g); \quad n \ge m. \qquad (12)$$

for m = 0, the above definitions reduce to their classical counterparts, i.e.

$$\hat{L}^{(\alpha)}_{n,0}(g) = L^{(\alpha)}_{n}(g), \qquad (13)$$

$$W^{\alpha}_{0}(g) = g^{\alpha}e^{-g}, \qquad (14)$$

and for m = 1 this reduces to [1, eq. 80]. the other properties related to the xm laguerre polynomials are discussed in detail in ref. [23].

3.2. exceptional xm jacobi orthogonal polynomials

for an integer m ≥ 1 and α, β > −1, the exceptional xm jacobi orthogonal polynomials [24, 27] satisfy the differential equation

$$\hat{P}''^{(\alpha,\beta)}_{n,m}(g) + \Big((\alpha-\beta-m+1)\,\frac{P^{(-\alpha,\beta)}_{m-1}(g)}{P^{(-\alpha-1,\beta-1)}_{m}(g)} - \frac{\alpha+1}{1-g} + \frac{\beta+1}{1+g}\Big)\hat{P}'^{(\alpha,\beta)}_{n,m}(g)$$
$$+ \frac{1}{1-g^2}\Big(\beta(\alpha-\beta-m+1)(1-g)\,\frac{P^{(-\alpha,\beta)}_{m-1}(g)}{P^{(-\alpha-1,\beta-1)}_{m}(g)} + m(\alpha-\beta-m+1) + (n-m)(\alpha+\beta+n-m+1)\Big)\hat{P}^{(\alpha,\beta)}_{n,m}(g) = 0. \qquad (15)$$

the l² norms of the xm jacobi polynomials are given by

$$\int_{-1}^{1}\big[\hat{P}^{(\alpha,\beta)}_{n,m}(g)\big]^2\,\hat{W}^{\alpha,\beta}_{m}\,dg = \frac{2^{\alpha+\beta+1}(1+\alpha+n-2m)(\beta+n)\,\Gamma(\alpha+1+n-m)\,\Gamma(\beta+n-m)}{(n-m)!\,(\alpha+1+n-m)(\alpha+\beta+2n-2m+1)\,\Gamma(\alpha+\beta+n-m+1)}, \qquad (16)$$

where

$$\hat{W}^{\alpha,\beta}_{m} = \frac{(1-g)^{\alpha}(1+g)^{\beta}}{\big(P^{(-\alpha-1,\beta-1)}_{m}(g)\big)^2} \qquad (17)$$

is the weight factor for the xm jacobi polynomials. the above l² norms of the xm jacobi polynomials hold when the denominator of the above weight factor is non-zero for −1 ≤ g ≤ 1. to ensure this, the following two conditions must be satisfied simultaneously:

(i) β ≠ 0, α, α − β − m + 1 ∉ {0, 1, . . . , m − 1},
(ii) α > m − 2, sgn(α − m + 1) = sgn(β), (18)

where sgn(g) is the signum function. in terms of the classical jacobi polynomials, the xm jacobi polynomials can be written as

$$\hat{P}^{(\alpha,\beta)}_{n,m}(g) = (-1)^m\Big(\frac{1+\alpha+\beta+j}{2(1+\alpha+j)}\,(g-1)\,P^{(-\alpha-1,\beta-1)}_{m}(g)\,P^{(\alpha+2,\beta)}_{j-1}(g) + \frac{1+\alpha-m}{\alpha+1+j}\,P^{(-2-\alpha,\beta)}_{m}(g)\,P^{(\alpha+1,\beta-1)}_{j}(g)\Big), \quad j = n-m \ge 0. \qquad (19)$$

for m = 0, the above definitions reduce to their classical counterparts, i.e.

$$\hat{P}^{(\alpha,\beta)}_{n,0}(g) = P^{(\alpha,\beta)}_{n}(g), \qquad \hat{W}^{\alpha,\beta}_{0}(g) = (1-g)^{\alpha}(1+g)^{\beta}, \qquad (20)$$

and for m = 1 this reduces to ref. [1, eq. 56]. the other properties related to the xm jacobi polynomials are discussed in detail in ref. [24].

4. extended potentials in d-dimensions

4.1. potentials associated with the xm exceptional laguerre polynomial

in this section, we consider the extension of the usual radial oscillator potential.
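before specializing, the construction (12) can be checked symbolically against the differential equation (9). a minimal sympy sketch (ours, not part of the paper), using arbitrary small test values of n, m and α:

```python
import sympy as sp

g = sp.symbols('g', positive=True)
L = sp.assoc_laguerre                 # classical generalized laguerre L_n^{(alpha)}(x)

def Lhat(n, m, a):                    # eq. (12), with the convention L_{-1} := 0
    tail = L(n - m - 1, a, g) if n - m - 1 >= 0 else 0
    return L(m, a, -g)*L(n - m, a - 1, g) + L(m, a - 1, -g)*tail

n, m, a = 3, 2, sp.Rational(5, 2)     # arbitrary small test values
y = Lhat(n, m, a)
ratio = L(m - 1, a, -g)/L(m, a - 1, -g)
ode = sp.diff(y, g, 2) + ((a + 1 - g) - 2*g*ratio)/g*sp.diff(y, g) \
    + (n - 2*a*ratio)/g*y
print(sp.simplify(ode))               # -> 0, i.e. (12) solves (9) for these values
```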
for this potential let us define the function f(g) as an xm (m ≥ 1) exceptional laguerre polynomial l̂ (α) n,m(g), where n = 0, 1, 2, 3, . . . , and α > 0, the associated second order differential equation (3) is equivalent to xm laguerre differential equation (9) where the functions q(g) and r(g) are q(g) = 1 g ( (α + 1 −g) − 2g l (α) m−1(−g) l (α−1) m (−g) ) , r(g) = 1 g ( n− 2α l (α) m−1(−g) l (α−1) m (−g) ) . (21) using q(g) and r(g) in (5), we get en −vm(r) = 1 2 {g,r} + (g′)2 ( − 1 4 + n g + (α + 1) 2g − (α + 1)(α− 1) 4g2 + l (α+1) m−2 (−g) l (α−1) m (−g) − (α + g − 1) g l (α) m−1(−g) l (α−1) m (−g) − 2 ( l(α)m−1(−g) l (α−1) m (−g) )2) + (d − 1)(d − 3) 4r2 . (22) to get en as explained in the above section, here we assume (g′(r))2 g(r) = c1 (a constant not equal to zero), and for the radial oscillator potential this constant c1 can be obtained by setting g(r) = 1 4 c1r 2. (23) putting g(r) in (22) above and defining the quantum number n → n + m, we get en = nc1, n = 0, 1, 2, . . . , (24) vm(r) = 1 16 c21r 2 + (α + 12 )(α− 1 2 ) r2 − c21r 2 4 l (α+1) m−2 (−g) l (α−1) m (−g) + c ( α + cr2 4 − 1 ) × l (α) m−1(−g) l (α−1) m (−g) + c2r2 2 ( l (α) m−1(−g) l (α−1) m (−g) )2 − c 2 (2m + α + 1) − (d − 1)(d − 3) 4r2 . (25) the wavefunction can be obtained by putting q(g) and g(r) in (8) and is given by χn,m(r) = nn,m r(α+ 1 2 ) exp(−c1r 2 8 ) l (α−1) m (−14cr 2) l̂ (α) n+m,m (c1r2 4 ) , (26) where nn,m is the normalization constant given by nn,m = ( n! (α + n + m)γ(α + n) )1 2 . (27) to get the correct centrifugal barrier term in d-dimensional euclidean space, we have to identify the coefficient of 1 r2 in (25) to be equal to `(` + d − 2), which fixes the value of α as α = ` + d − 2 2 (28) and identifying the constant c1 = 2ω , the energy eigenvalues (24), extended potential (25) and the corresponding wavefunction (26) in any arbitrary d-dimensions are en = 2nω, (29) vm(r) = v drad(r) −ω 2r2 l (l+ d2 ) m−2 (− ωr2 2 ) l (l+ d−42 ) m (−ωr 2 2 ) + ω(ωr2 + 2l + d − 4) l (l+ d−22 ) m−1 (− ωr2 2 ) l (l+ d−42 ) m (−ωr 2 2 ) + 2ω2r2 ( l (l+ d−22 ) m−1 (− ωr2 2 ) l (l+ d−42 ) m (−ωr 2 2 ) )2 − 2mω (30) 480 vol. 57 no. 6/2017 rationally extended shape invariant potentials and χn,m(r) = nn,m × r`+ d−1 2 exp(−ωr 2 4 ) l (l+ d−42 ) m (−ωr 2 2 ) l̂ (`+ d−22 ) n+m,m (ωr2 2 ) (31) respectively, where v drad(r) = 1 4ω 2r2 + `(`+d−2) r2 −ω(` + d2 ) is conventional radial oscillator potential in arbitrary d-dimensional space [31]. note that the full eigenfunction ψ as given by (7) with χ as given by (31). for a check on our calculations, we now discuss few special cases of the results obtained in (30) and (31). case (a). for m = 0, from (30) and (31) we get the well known usual radial oscillator potential in d-dimensions [31], v0(r) = v drad(r) = 1 4 ω2r2 + `(` + d − 2) r2 −ω ( ` + d 2 ) (32) and the corresponding wavefunctions which can be written in terms of usual laguerre polynomials χn,0(r) = nn,0 ×r`+ d−1 2 exp ( − ωr2 4 ) l (`+ d−22 ) n (ωr2 2 ) . (33) for d = 3 above expressions reduces to the well known 3-d harmonic oscillator potential. case (b). for m = 1, the obtained potential v1(r) = 1 4 ω2r2 + `(` + d − 2) r2 −ω ( ` + d 2 ) + 4ω (ωr2 + 2` + d − 2) − 8ω(2` + d − 2) (ωr2 + 2` + d − 2)2 (34) is the rationally extended d dimensional oscillator potential studied earlier in ref. [26] the corresponding wavefunctions in terms of exceptional x1 laguerre orthogonal polynomial can be written as χn,1(r) = nn,1 × r`+ d−1 2 exp(−ωr 2 4 ) (ωr2 + 2` + d − 2) l̂ (`+ d−22 ) n+1,1 (ωr2 2 ) . 
(35) for d = 3, the above expressions match exactly with the expressions given in ref. [3]. case (c). for m = 2, in this case the extended potential and the corresponding wavefunctions in terms of x2 laguerre orthogonal polynomials are given by v2(r) = 1 4 ω2r2 + `(` + d − 2) r2 −ω ( ` + d 2 ) + 8ω ( ωr2 − (2` + d) ) ω2r4 + 2ωr2(2` + d) + (2` + d − 2)(2` + d) + 64ω2r2(2` + d)( ω2r4 + 2ωr2(2` + d) + (2` + d − 2)(2` + d) )2 , (36) and χn,2(r) = nn,2 × r`+ d−1 2 exp(−ωr 2 4 ) ω2r4 + 2ωr2(2` + d) + (2` + d − 2)(2` + d) l̂ (`+ d−22 ) n+2,2 (ωr2 2 ) . (37) 4.2. potentials associated with xm exceptional jacobi polynomials let us consider the case where the second order differential equation (3) coincides with that satisfied by xm jacobi polynomial p̂ (α,β)n,m , where n = 1, 2, 3, . . . , m ≥ 1, α,β > −1 and α 6= β. thus the function f(g) in (3) is equivalent to p̂ (α,β)n,m (g) and the other two functions q(g) and r(g) are given by (15) q(g) = (α−β −m− 1) p (−α,β) m−1 (g) p (−α−1,β−1) m (g) − α + 1 1 −g + β + 1 1 + g , r(g) = β(α−β −m + 1) 1 + g p (−α,β) m−1 (g) p (−α−1,β−1)(g) m + 1 1 −g2 ( (α−β −m + 1) + (n−m)(α + β + n−m + 1) ) . (38) 481 r. k. yadav, n. kumari, a. khare, b. p. mandal acta polytechnica using the above equations in (5) and after doing some straightforward calculations for s-wave (` = 0), we get en −veff,m(r) = 1 2 {g(r),r} + 1 −α2 4 g′(r)2 (1 −g(r))2 + 1 −β2 4 g′(r)2 (1 + g(r))2 + 2n2 + 2n(α + β − 2m + 1) + 2m(α− 3β −m + 1) + (α + 1)(β + 1) 2 × g′(r)2 1 −g(r)2 + (α−β −m + 1) ( α + β + (α−β + 1)g(r) ) g′(r)2 1 −g(r)2 × p (−α,β) m−1 (g) p (−α−1,β−1) m (g) − (α−β −m + 1)2g′(r)2 2 ( p (−α,β) m−1 (g) p (−α−1,β−1) m (g) )2 , (39) and the wavefunction (8) becomes χn,m(r) = nn,m ×g′(r)− 1 2 (1 + g)(β+1)/2(1 −g)(α+1)/2 p (−α−1,β−1) m (g) p̂ (α,β)n,m (g), (40) where the effective potential, veff,m(r) is given by veff,m(r) = vm(r) + (d − 1)(d − 3) 4r2 , (41) and the normalization constant nn,m = ( n!(α + n + 1)2(α + β + 2n + 1)γ(α + β + n + 1) 2α+β+1(1 + α + n−m)(β + n + m)γ(α + n + 2)γ(β + n) )1 2 . (42) it is interesting here to note that when this extended potential is purely non-power law, the potential given by (41) has an extra term (d−1)(d−3)4r2 which behaves as constant background attractive inverse square potential in any arbitrary dimensions except for d = 1 or 3. for power law cases (e.g. radial oscillator potential), this background potential gives the correct barrier potential in arbitrary dimensions (as shown in (30)). on using (6) in (39) and assume a term g ′2(r) 1−g2(r) = c2 (a real constant not equal to zero), we get a constant term on right hand side which gives the energy eigenvalue en. there are several possibilities of g(r) which produces this constant c2. if we consider g(r) = cosh r, 0 ≤ r ≤ ∞ and define the parameters α = b − a − 12,β = −b − a − 1 2 , b > a + (d−1)2 > (d−1) 2 and the quantum number n → n + m, m ≥ 1, (39) and (40) give, en = −(a−n)2, n = 0, 1, 2, ......,nmax a− 1 ≤ nmax < a, (43) veff,m(r) = vm(r) + (d − 1)(d − 3) 4r2 = vgpt (r) + 2m(2b −m + 1) − (2b −m + 1) ( (2a + 1 − (2b + 1) cosh r) ) × p (−α,β) m−1 (cosh r) p (−α−1,β−1) m (cosh r) + (2b −m + 1)2 sinh2 r 2 ( p (−α,β) m−1 (cosh r) p (−α−1,β−1) m (cosh r) )2 (44) and the wave function χn,m(r) = nn,m × (cosh r − 1)(b−a)/2(cosh r + 1)−(b+a)/2 p (−b+a−12 ,−b−a− 3 2 ) m (cosh r) p̂ (b−a−12 ,−b−a− 1 2 ) n+m,m (cosh r). (45) where vgpt (r) = (b2 + a(a + 1)) cosech2 r −b(2a + 1) cosech r coth r (46) is the conventional generalized pöschl teller (gpt) potential. 
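the claimed isospectrality of the extended potential (44) with the conventional gpt potential (46), both sharing the spectrum (43), can also be probed numerically. the following finite-difference sketch (ours, not from the paper) treats the m = 1, d = 3, ℓ = 0 case, where the extension takes the x1 form quoted later in case (b) of § 4.2.1 with a′ = a, b′ = b; the parameter values a = 2.5, b = 5 are hypothetical and give three bound states:

```python
import numpy as np
from scipy.linalg import eigh_tridiagonal

A, B = 2.5, 5.0                      # hypothetical parameters with B > A + 1
r = np.linspace(1e-3, 25.0, 4000)
h = r[1] - r[0]

def v_gpt(a, b):                     # conventional gpt potential, eq. (46)
    return (b**2 + a*(a + 1))/np.sinh(r)**2 \
        - b*(2*a + 1)*np.cosh(r)/np.sinh(r)**2

def v_ext(a, b):                     # m = 1 rational extension (x1 jacobi case)
    den = 2*b*np.cosh(r) - 2*a - 1
    return v_gpt(a, b) + 2*(2*a + 1)/den - 2*(4*b**2 - (2*a + 1)**2)/den**2

for v in (v_gpt(A, B), v_ext(A, B)):
    # finite-difference radial hamiltonian with dirichlet boundaries
    evals = eigh_tridiagonal(2.0/h**2 + v, -np.ones(r.size - 1)/h**2,
                             select='i', select_range=(0, 2))[0]
    print(np.round(evals, 2))        # both ~ [-6.25, -2.25, -0.25] = -(A - n)**2
```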
note that the full eigenfunction is given by (7) with χ as given by (45). here we see that the energy eigenvalues of conventional potentials are same as the rationally extended d-dimensional potentials (i.e they are isospectral). it is interesting to note that compared to d = 3, the only change in the potential in d-dimensions is the extra centrifugal barrier term (d−1)(d−3)4r2 , and χ(r) is unaltered while only ψ(r) is slightly different due to r(d−1)/2. it is also worth pointing out that even in three dimensions, the gpt potential (46) can be analytically solved only in the case of s-wave, i.e. l = 0. we now show that approximate solution of the gpt potential problem for arbitrary l can, however, be obtained in d dimensions. 482 vol. 57 no. 6/2017 rationally extended shape invariant potentials 4.2.1. approximate solutions for arbitrary ` in this section, we solve the d-dimensional schödringer equation (1) with arbitrary ` and obtain the effective potential (as given in (39)) with an extra ` dependent term i.e., veff,m(r) = vm(r) + ( (d − 1)(d − 3) 4r2 + `(` + d − 2) r2 ) . (47) so, in order to get the appropriate centrifugal barrier terms in the above effective potential, we have to apply some approximation. following [25], we consider the approximation 1 r2 ' 1 sinh2 r . (48) thus effectively, one has approximated a problem for the l’th partial wave to that of l = 0 but with different set of parameters compared to the usual l = 0 case. in that case, the effective potential (41) becomes veff,m = vm(r) + ( (d − 1)(d − 3) 4 + `(` + d − 2) ) cosech2r. (49) now we define the parameters α and β in terms of modified parameters1 b′ and a′ i.e., α = b′ − a′ − 12 , β = −b′ −a′ − 12 , b ′ > a′ + d−12 > d−1 2 , where b′ = ( (ζ + 14 ) + ( (ζ + 14 + b(2a + 1))(ζ + 1 4 −b(2a + 1)) )1 2 2 )1 2 (50) and a′ = 1 2 ( 2b(a + 12 ) b′ − 1 ) , (51) while ζ = b2 + a(a + 1) + `(` + d − 2) + (d − 1)(d − 3) 4 . (52) for d = 3 and ` = 0; we get the usual parameters as defined in the above section i.e., b′ → b and a′ → a. on using these new parameters α and β, quantum number n → n + m; m ≥ 1, (39) and (40) give en = −(a′ −n)2, n = 0, 1, 2, . . . ,nmax a′ − 1 ≤ nmax < a′, (53) veff,m(r) = v (a′,b′) gpt (r) + 2m(2b ′−m+ 1)−(2b′−m+ 1)[(2a′ + 1−(2b′ + 1) cosh r)]× p (−α,β) m−1 (cosh r) p (−α−1,β−1) m (cosh r) + (2b′ −m + 1)2 sinh2 r 2 ( p (−α,β) m−1 (cosh r) p (−α−1,β−1) m (cosh r) )2 (54) and the wave functions χn,m(r) = nn,m × (cosh r − 1)(b ′−a′)/2(cosh r + 1)−(b ′+a′)/2 p (−b′+a′−12 ,−b ′−a′−32 ) m (cosh r) p̂ (b′−a′−12 ,−b ′−a′−12 ) n+m,m (cosh r). (55) where v (a′,b′) gpt (r) = (b ′2 + a′(a′ + 1)) cosech2 r −b′(2a′ + 1) cosech r coth r (56) is the conventional generalized pöschl teller (gpt) potential in arbitrary d and `. similar to the extended oscillator case, we now consider few special cases of the results of extended gpt potential obtained in equations (54) and (55). 1when we solve the schrödinger equation for usual gpt potential, vgp t (r) = (b2 +a(a+ 1))cosech2r −b(2a+ 1) coth rcosechr, we define the parameters α and β in terms of a and b. but, in the above d-dimensional effective potential the parameters α and β have been modified due to the presence of an extra d-dependent term. 483 r. k. yadav, n. kumari, a. khare, b. p. mandal acta polytechnica case (a). for m = 0, from eqs. 
(54) and (55), the potential and the corresponding wavefunctions in terms of usual jacobi polynomials are veff,0(r) = v (a′,b′) gpt (r) (57) and χn,0(r) = nn,0 × (cosh r − 1)(b ′−a′)/2(cosh r + 1)−(b ′+a′)/2p (b′−a′−12 ,−b ′−a′−12 ) n (cosh r). (58) for d = 3 and ` = 0 the effective potential veff,0 = vgpt (r). case (b). for m = 1, the obtained potential veff,1(r) = v (a′,b′) gpt (r) + 2(2a′ + 1) (2b′ cosh r − 2a′ − 1) − 2 ( 4b′2 − (2a′ + 1)2 ) (2b′ cosh r − 2a′ − 1)2 (59) is the rationally extended d dimensional gpt potential. the corresponding wavefunctions in terms of exceptional x1 jacobi orthogonal polynomials can be written as χn,1(r) = nn,1 × (cosh r − 1)(b ′−a′)/2(cosh r + 1)−(b ′+a′)/2 (2b′ cosh r − 2a′ − 1) p̂ (b′−a′−12 ,−b ′−a′−12 ) n+1,1 (cosh r). (60) for d = 3 and ` = 0, the above expressions match exactly with the results obtained in [4, 15]. case (c). in this case the potential and its wavefunctions in terms of x2 jacobi polynomials are given by veff,2(r) = v (a′,b′) gpt (r) + 4(2b ′ − 1) − 4 ( 3(2b′ − 1)(2a′ + 1) cosh r − 2b′(2b′ − 1) − 8a′(a′ + 1) ) (2b′ − 1)(2b′ − 2) cosh2 r − 2(2b′ − 1)(2a′ + 1) cosh r + 4a′(a′ + 1) + 2b′ − 1 + 8(2b′ − 1)2 sinh2 r ( (2a′ + 1) − (2b′ − 2) cosh r )2( (2b′ − 1)(2b′ − 2) cosh2 r − 2(2b′ − 1)(2a′ + 1) cosh r + 4a′(a′ + 1) + 2b′ − 1 )2 − 8 (61) and χn,2(r) = nn,2 (cosh r − 1)(b ′−a′)/2(cosh r + 1)−(b ′+a′)/2 (2b′ − 1)(2b′ − 2) cosh2 r − 2(2b′ − 1)(2a′ + 1) cosh r + 4a′(a′ + 1) + 2b′ − 1 × p̂ (b ′−a′−12 ,−b ′−a′−12 ) n+2,2 (cosh r). (62) 5. new shape invariant potentials (sips) in higher dimensions a quantum mechanical system is termed as si when the partner potentials in a unbroken susy [17, 18] satisfy the si property v (+)(x; a) = v (−)(x; b) + r(a), (63) where a is a set of parameters, b is a function of a (say b = f(a)) and the remainder r(a) is independent of x. the partner potentials, v ±(x) are obtained from the superpotential, w(x) through v ±(x) = w 2(x) ±w ′(x) + e0, ~ = 2m = 1, (64) where e0 is factorization energy2. the eigenstates of these partner potentials are related by e(+)n = e (−) n+1, e (0) 0 = 0, ψ (+) n ∝ aψ (−) n+1, ψ (−) n+1 ∝ a †ψ(+)n , (65) where a, a† and superpotential w(x) are defined as a = d dx + w(x), a† = − d dx + w(x), w(x) = − d dx ln ψ(−)0 (x). (66) the factorized hamiltonians in terms of a and a† or in terms of partner potentials are given by h(−) = a†a = − d2 dx2 + v (−)(x) −e, h(+) = aa† = − d2 dx2 + v (+)(x) −e. (67) in this section, we will show that various rationally extended d-dimensional potentials satisfy such si property. 2for radial oscillator potential the factorization energy e0 = ω(l + d2 ) and for gpt potential e0 = −a 2. 484 vol. 57 no. 6/2017 rationally extended shape invariant potentials 5.1. extended radial oscillator potentials we now show that the extended radial oscillator potentials that we have obtained in d dimensions in sec. 4 provide us with yet another example of shape invariant potentials with translation. for the radial oscillator case the ground state wave function χ(−)0,m(r) is given by (31) i.e. χ (−) 0,m(r) ∝ φ0(r)φm(r), (68) where φ0(r) ∝ rl+ d−1 2 exp ( − ωr2 4 ) and φm(r) ∝ l (l+ d−22 ) m (−ωr 2 2 ) l (l+ d−42 ) m (−ωr 2 2 ) . 
(69) here we see that the ground state wave function of the extended radial oscillator potential in higher dimensions whose solutions are in terms of eops differs from that of the usual potential by an extra term φm(r) and the corresponding superpotential w(r)(= − d dr [ln χ(−)0,m(r)]) is given by w(r) = w1(r) + w2(r), (70) where w1(r) = − φ′0(r) φ0(r) and w2(r) = − φ′m(r) φm(r) . (71) using w(r), we get v (−)m (r)(= w(r)2 −w ′(r)) same as in (30) and v (+) m (r)(= w(r)2 + w ′(r)) is given by v (+)m (r) = v d,l+1 rad (r) −ω 2r2 l (l+ d+22 ) m−2 (− ωr2 2 ) l (l+ d−22 ) m (−ωr 2 2 ) + ω(ωr2 + 2l + d − 2) l (l+ d2 ) m−1 (− ωr2 2 ) l (l+ d−22 ) m (−ωr 2 2 ) + 2ω2r2 ( l (l+ d2 ) m−1 (− ωr2 2 ) l (l+ d−22 ) m (−ωr 2 2 ) )2 − 2mω − ( l + d − 2 2 ) (72) from the above equations (30) and (72), the potential v (+)m (r) can be obtained directly by replacing l −→ l + 1 in v (−)m (r) and satisfy (63). this means these two partner potentials are shape invariant potentials (with translation). thus we see that the same oscillator potential v (r) = 14ω 2r2, where r = √ x21 + x22 + · · · + x2d, gives different sips in different dimensions. for example: for d = 2 v (−)m (r) = 1 4 ω2r2 + l2 r2 −ω2r2 l (l+1) m−2 (− ωr2 2 ) l (l−1) m (−ωr 2 2 ) + ω(ωr2 + 2l− 2) l (l) m−1(− ωr2 2 ) l (l−1) m (−ωr 2 2 ) + 2ω2r2 ( l (l) m−1(− ωr2 2 ) l (l−1) m (−ωr 2 2 ) )2 − 2mω −ω(l + 1) (73) and v (+)m (r) = 1 4 ω2r2 + (l + 1) r2 −ω2r2 l (l+2) m−2 (− ωr2 2 ) l (l) m (−ωr 2 2 ) + ω(ωr2 + 2l) l (l+1) m−1 (− ωr2 2 ) l (l) m (−ωr 2 2 ) + 2ω2r2 ( l (l+1) m−1 (− ωr2 2 ) l (l) m (−ωr 2 2 ) )2 − 2mω −ωl. (74) for d = 4 v (−)m (r) = 1 4 ω2r2 + l(l + 2) r2 −ω2r2 l (l+2) m−2 (− ωr2 2 ) l (l) m (−ωr 2 2 ) + ω(ωr2 + 2l) l (l+1) m−1 (− ωr2 2 ) l (l) m (−ωr 2 2 ) + 2ω2r2 ( l (l+2) m−1 (− ωr2 2 ) l (l) m (−ωr 2 2 ) )2 − 2mω −ω(l + 2) (75) and v (+)m (r) = 1 4 ω2r2 + (l + 1)(l + 3) r2 −ω2r2 l (l+3) m−2 (− ωr2 2 ) l (l+1) m (−ωr 2 2 ) +ω(ωr2 +2l+2) l (l+2) m−1 (− ωr2 2 ) l (l+1) m (−ωr 2 2 ) +2ω2r2 ( l (l+2) m−1 (− ωr2 2 ) l (l+1) m (−ωr 2 2 ) )2 − 2mω − (l + 1). (76) 485 r. k. yadav, n. kumari, a. khare, b. p. mandal acta polytechnica 5.2. extended pöschl-teller potentials let us first discuss the extended gpt (with l = 0) as discussed in sec. 4.2. in this case, the ground state wave function χ(−)0,m(r) is given by (45) i.e. χ (−) 0,m(r) ∝ φ0(r)φm(r) (77) where φ0(r) ∝ (cosh r − 1)(b−a)/2(cosh r + 1)−(b+a)/2 and φm(r) ∝ p (−b+a−32 ,−b−a− 1 2 ) m (cosh r) p (−b+a−12 ,−b−a− 3 2 ) m (cosh r) . (78) it is easy to check that due to the extra centrifugal term (d − 1)(d − 3)/4r2, the corresponding potentials in d dimensions are not shape invariant except when d = 3. however, if we consider the approximate extended pöschl-teller potentials as discussed in sec. 4.2.1, then as we now show, one gets shape invariant potentials with translation. in that case the corresponding superpotential (w(r) = − d dr [ln χ(−)0,m(r)]) is given by w(r) = w1(r) + w2(r), (79) where w1(r) = − φ′0(r) φ0(r) and w2(r) = − φ′m(r) φm(r) . (80) using w(r), the partner potential v (−)eff,m(r) is same as given in (54) and v (+) eff,m(r) is obtained by using v (+) m (r) = w 2(r) + w ′(r) or simply by replacing a′ −→ a′ − 1 in v (−) eff,m(r). hence the potentials v (−) eff,m(r) is shape invariant potential (with translation) and satisfy (63) for any arbitrary values of d and `. for a check we are giving here some simple cases for v (−)eff,m(r) and v (+) eff,m(r). case (a). 
for m = 0 and any arbitrary values of d and ℓ, (54) gives

$$v^{(-)}_{\mathrm{eff},0}(r) = \big(b'^2 + a'(a'+1)\big)\operatorname{cosech}^2 r - b'(2a'+1)\operatorname{cosech} r\,\coth r + a'^2, \qquad (81)$$

and the partner potential is given by

$$v^{(+)}_{\mathrm{eff},0}(r) = \big(b'^2 + a'(a'-1)\big)\operatorname{cosech}^2 r - b'(2a'-1)\operatorname{cosech} r\,\coth r + a'^2. \qquad (82)$$

the potential v(+)eff,0(r) can also be obtained simply by replacing a′ → a′ − 1 in v(−)eff,0(r), and it satisfies (63). for d = 3 and ℓ = 0 the parameters a′ → a, b′ → b, and these potentials correspond to the conventional shape invariant pöschl-teller potentials given in ref. [18].

case (b). for m = 1 and arbitrary ℓ, the partner potentials are

$$v^{(-)}_{\mathrm{eff},1}(a',b',r) = v^{(a',b')}_{\mathrm{gpt}}(r) + \frac{2(2a'+1)}{2b'\cosh r - 2a'-1} - \frac{2\big(4b'^2-(2a'+1)^2\big)}{(2b'\cosh r - 2a'-1)^2} + a'^2 \qquad (83)$$

and

$$v^{(+)}_{\mathrm{eff},1}(a',b',r) = v^{(a'-1,b')}_{\mathrm{gpt}}(r) + \frac{2(2a'-1)}{2b'\cosh r - 2a'+1} - \frac{2\big(4b'^2-(2a'-1)^2\big)}{(2b'\cosh r - 2a'+1)^2} + a'^2. \qquad (84)$$

these potentials satisfy the shape invariance property (63), i.e.

$$v^{(+)}_{\mathrm{eff},1}(a',b',r) = v^{(-)}_{\mathrm{eff},1}(a'-1,b',r) + 2a'-1. \qquad (85)$$

for d = 3 and ℓ = 0, the above expressions match exactly with the results obtained in [4, 14, 15].
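since (83)-(85) are purely algebraic identities, the shape invariance can be verified symbolically. a small sympy sketch (ours, not part of the paper), with a, b standing for a′, b′:

```python
import sympy as sp

r, a, b = sp.symbols('r a b', positive=True)

def v_gpt(a, b):                     # eq. (56)
    return (b**2 + a*(a + 1))/sp.sinh(r)**2 \
        - b*(2*a + 1)*sp.cosh(r)/sp.sinh(r)**2

def v_minus(a, b):                   # eq. (83)
    d = 2*b*sp.cosh(r) - 2*a - 1
    return v_gpt(a, b) + 2*(2*a + 1)/d - 2*(4*b**2 - (2*a + 1)**2)/d**2 + a**2

def v_plus(a, b):                    # eq. (84)
    d = 2*b*sp.cosh(r) - 2*a + 1
    return v_gpt(a - 1, b) + 2*(2*a - 1)/d - 2*(4*b**2 - (2*a - 1)**2)/d**2 + a**2

print(sp.simplify(v_plus(a, b) - v_minus(a - 1, b)))   # -> 2*a - 1, as in (85)
```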
6. results and discussions

in the present manuscript, using the pct approach we have generated exactly solvable rationally extended d-dimensional radial oscillator and gpt potentials and constructed their bound state wavefunctions in terms of xm exceptional laguerre and jacobi orthogonal polynomials, respectively. the extended potentials are isospectral to their conventional counterparts. for the oscillator case we have shown that the rationally extended d-dimensional oscillator potentials are shape invariant with translation. for the jacobi case this is not true unless d = 3. for ℓ ≠ 0 we have also obtained approximate extended gpt potentials, and these have been shown to be shape invariant. for the particular case d = 3 the potentials correspond to the potentials obtained by quesne et al. [3, 4] and others, which provides a powerful check on our calculations.

acknowledgements

one of us (nk) acknowledges financial support from ugc under the bhu-cret fellowship. b.p.m. acknowledges financial support from the department of science and technology (dst), govt. of india, under serc project sanction grant no. sr/s2/hep/0009/2012.

references

[1] d. gomez-ullate, n. kamran, r. milson, j. math. anal. appl. 359 (2009) 352.
[2] d. gomez-ullate, n. kamran, r. milson, j. phys. a 43 (2010) 434016.
[3] c. quesne, j. phys. a 41 (2008) 392001.
[4] b. bagchi, c. quesne, r. roychoudhary, pramana j. phys. 73 (2009) 337; c. quesne, sigma 5 (2009) 084; a. khare, unpublished.
[5] s. odake, r. sasaki, phys. lett. b 684 (2010) 173; ibid. 679 (2009) 414; j. math. phys. 51 (2010) 053513.
[6] c.-l. ho, s. odake, r. sasaki, sigma 7 (2011) 107.
[7] c.-l. ho, r. sasaki, arxiv:1102.5669.
[8] d. gomez-ullate, n. kamran, r. milson, arxiv:1204.2282.
[9] c. quesne, int. j. mod. phys. a 26 (2011) 5337.
[10] y. grandati, j. math. phys. 52 (2011) 103505.
[11] y. grandati, ann. phys. 327 (2012) 185.
[12] y. grandati, ann. phys. 326 (2011) 2074.
[13] c. quesne, sigma 8 (2012) 080.
[14] r. k. yadav, a. khare, b. p. mandal, phys. lett. b 723 (2013) 433.
[15] r. k. yadav, a. khare, b. p. mandal, annals of physics 331 (2013) 313.
[16] r. k. yadav, a. khare, b. p. mandal, arxiv:1309.6755.
[17] e. witten, nucl. phys. b 188 (1981) 513.
[18] f. cooper, a. khare, u. sukhatme, phys. rep. 251 (1995) 267; "supersymmetry in quantum mechanics", world scientific (2001).
[19] k. j. oyewumi, f. o. akinpelu, d. agboola, int. j. theor. phys. 47 (2008) 1039.
[20] e. schrödinger, proc. r. irish acad. a 46 (1940) 183.
[21] a. bhattacharjie, e. c. g. sudarshan, nuovo cimento 25 (1962) 864.
[22] g. darboux, théorie générale des surfaces, vol. 2, gauthier-villars, paris (1888).
[23] d. gomez-ullate, n. kamran, r. milson, arxiv:1002.2666v2.
[24] d. gomez-ullate, n. kamran, r. milson, contemporary mathematics 563 (2012) 51.
[25] gao-feng wei, chao-yun long, shi-hai dong, phys. lett. a 372 (2008) 2592.
[26] n. bhagawati, arxiv:1402.1265; r. k. yadav et al., unpublished.
[27] b. midya, b. roy, arxiv:1210.0119v1.
[28] a. khare, u. p. sukhatme, j. phys. a: math. gen. 21 (1988) l501.
[29] e. hille, "lectures on ordinary differential equations", addison-wesley (1969), p. 647.
[30] a. a. khelashvili, t. p. nadareishvili, am. j. phys. 79 (2011) 668.
[31] s. a. s. ahmed, int. j. theor. phys. 36 (1997) 8.

acta polytechnica 60(2):169–174, 2020, doi:10.14311/ap.2020.60.0169, © czech technical university in prague, 2020

extraction and evaluation of oil from green algae cladophora glomerata

adeyinka sikiru yusuff (*), donatus ewere

afe babalola university, college of engineering, department of chemical and petroleum engineering, afe babalola way, ado-ekiti, ekiti state, nigeria
(*) corresponding author: yusuffas@abuad.edu.ng

abstract. the overall aim of this study was to grow the green alga cladophora glomerata, extract oil from it, and characterize the extracted oil to gain insight into its physicochemical properties. the effects of the parameters affecting the solvent extraction process (temperature, time and biomass particle size) were investigated at a fixed solvent ratio of 4:1 hexane to ether. an optimization of the oil separation from the algae biomass via the solvent extraction method was studied. the obtained results showed that at an extraction temperature of 60 °c, an extraction time of 2.5 h and a particle size of ≤ 0.6 mm, the maximum oil yield from the process was 18.3 %. the functional group analysis revealed the presence of alkane, ester, carboxylic acid and unsaturated groups in the extracted oil, while the result of the fatty acid profile analysis confirmed the dominance of oleic acid. the physicochemical properties of the extracted algal oil conformed to the astm standard.

keywords: optimization, extraction, characterization, chlorophyta, microalgae.

1. introduction

it is no longer news that the use of biofuel as a source of energy has become a significant area of research, owing to the increase in the world population and the proliferation of industries, coupled with an expanding economy.
energy plays a crucial role in human life, both socially and economically, and the availability of energy is regarded worldwide as an important factor that determines the extent of a nation's development [1, 2]. for many years, fossil fuels such as coal, natural gas and petroleum have globally been the main sources of energy, whether for transportation, industrial or domestic purposes [2]. however, fossil fuels are associated with serious drawbacks, namely resource depletion and environmental degradation [3]. these problems necessitate the search for a suitable alternative, and biofuel provides a solution to both of them [3]. oil derived from plant and animal sources has a great potential for the synthesis of cleaner, eco-friendly fuels [4], and biofuels such as biodiesel have been produced from three generations of feedstocks. the first generation (g1) biofuel feedstocks are edible crops like groundnut, coconut, soyabean and palm kernel. the synthesis of biofuel from these sources competes with the food supply, which affects the global economy negatively [5]. the second generation (g2) biofuel feedstocks are derived from non-edible sources, such as jatropha curcas seed, leucaena leucocephala seed, tobacco seed, rubber seed, animal fat and used cooking/frying oil. the second generation feedstocks have quite a number of advantages over the conventional feedstocks, as they guarantee biofuel sustainability and do not lead to a food crisis [6]. however, most second generation biofuel feedstocks possess a high free fatty acid (ffa) content; thus, biofuel production from these sources may require multiple processing steps, which increases the cost of production and may result in wastewater generation [5]. the third generation of biofuel feedstocks comprises algae (both micro- and macroalgae), which have been considered as a starting material for bioenergy production and as a source of industrially important co-products or value-added products [7]. algae, as a third generation feedstock, have several advantages over the first and second generation feedstocks, which include higher profitability and maximum product yield; moreover, they neither require any arable land for cultivation nor cause food shortages [8]. all these features, coupled with their suitability as a feedstock for large-scale biofuel production, give the third generation feedstock an edge over the g1 and g2 biofuel feedstocks [4, 9].

in the present study, the green alga chlorophyta, also known as cladophora glomerata, is chosen as a source of third generation biofuel feedstock. it can be found either on shore rocks or in stagnant water, such as ponds and lakes. it possesses chlorophyll, carotene and xanthophylls, which all aid in photosynthesis. it also possesses a cell wall, which contains cellulose and pectins, as well as a pyrenoid, which helps in the synthesis and deposition of starch [10].

the recovery of oil from biomass via solvent extraction is considered one of the most commonly used methods. solvent extraction is the process in which the oil is removed from a solid by means of a liquid solvent; it
a solvent extraction is the process in which the oil is removed from a solid by means of a liquid solvent, it 169 https://doi.org/10.14311/ap.2020.60.0169 https://ojs.cvut.cz/ojs/index.php/ap adeyinka sikiru yusuff, donatus ewere acta polytechnica is also known as leaching [11]. various solvents are used for the extraction, such as organic and inorganic solvents, organic solvents are less dense than water while inorganic solvents are denser than water. commonly used organic solvents are diethyl ether, toluene, hexane, ethyl acetate, ethanol, and inorganic solvents are dichloromethane, chloroform and carbon tetrachloride [12]. the solvent extraction using n-hexane as a solvent results in the highest oil yield, which makes it the most commonly used solvent [11, 13]. generally, the process of solvent extraction can be considered in three parts: the change of phase of the solute as it dissolves in the solvent, its diffusion through the solvent in the pores of the solid to the outside of the particle, and the transfer of the solute from the solution in contact with the particles to the main bulk [1]. some reported scientific research studies showed that a solvent extraction using mixture of two or more solvents enhances the extraction yield [1, 4, 10, 14]. the present study was aimed to extract oil from cladophora glomerata (green algal) via the solvent extraction route. the effects of extraction process parameters, such as extraction temperature, particle size and extraction time, on the oil extraction yield were investigated. the algal oil extracted under the optimum conditions was characterized based on its physicochemical properties, functional groups and fatty acid profile. 2. material and method 2.1. sample cultivation and preparation the green microalgae cladophora glomerata was selfcultivated in enclosed plastic containers. the containers were filled with tap water and water containing algae sample collected from afe babalola university fish pond, ado-ekiti, nigeria. the water from the fish pond served as nutrient and growth medium for the algae. the enclosed containers were then exposed to sunlight to provide energy required for the growing algae so as to reproduce in the medium. the growth medium was monitored for colour changes and measured weekly to see if there was an increase in weight. due to the fact that it was microalgae that were being grown, it took weeks before any visible change was observed. after the algae had reached a substantial amount, it was harvested by draining out all the water leaving the algae on the sieve and the harvested algae sample was dried at an ambient temperature for 4 days. the dried algae biomass (ab) was thereafter gently pulverized and later sieved to obtain the required particle size. 2.2. extraction studies the extraction of oil from the algae biomass was carried out by using a mixture of n-hexane and ether in a mixing proportion of 4:1 as solvent. in the present work, the apparatus used was soxhlet extractor. the extractor consists of boiling flask, condenser, thimble and extraction chamber. 25 g of algae biomass powder was wrapped in a white muslin cloth and placed in a thimble of the extractor. a 250 ml boiling flask containing the required amount of solvent was gently connected to the end of the extractor and the whole set up was placed on a heating mantle in order to heat up the extractor contents. 
the extraction processes were performed under the following operating conditions: extraction temperatures of 40, 50, 60, 70 and 75 °c, extraction times of 1, 1.5, 2, 2.5 and 3 h, and particle sizes of 0.3, 0.4, 0.5, 0.6, 0.7 and 1 mm. the yield of the extracted oil was gravimetrically evaluated by eq. (1):

$$y = \frac{w_o}{w_{ab}} \times 100\ \%, \qquad (1)$$

where w_o is the weight of the extracted oil (g) and w_ab is the weight of the algae biomass used (g).

2.3. characterization of the extracted oil

the physicochemical properties, such as specific gravity, ph, acid value, free fatty acid content, iodine value, flash point and pour point, were determined according to the american society for testing and materials (astm) and european union (eu) standard procedures. the functional groups contained in the extracted oil were determined using a fourier transform infrared (ftir) spectrophotometer (ir affinity-1s, shimadzu, japan). moreover, a gas chromatography analysis combined with flame ionization detection (gc-fid) (hewlett packard 6890s, palo alto, usa) was conducted to determine the fatty acid profile of the extracted algal oil.

3. results and discussion

3.1. effect of operating variables on algal oil yield

3.1.1. effect of extraction temperature on oil yield

fig. 1 shows the influence of the extraction temperature on the yield of algal oil over the range of 40–75 °c. as can be seen from fig. 1, the algal oil yield increased with an increasing temperature up to 60 °c and then dropped as the temperature rose above 60 °c. the reason for this observation is that the temperature enhances the solubility of the oil, which results in a higher extraction rate. the higher oil yield at a higher temperature can also be attributed to the increase in the diffusion rate [15]. another presumed reason for this observation is that when the extraction temperature is increased, the mass transfer of the oil is rapid, which leads to a higher extraction yield. it could therefore be deduced that at the optimum temperature of 60 °c the solubility of the oil in the solvent and the diffusion rate were at their most favourable [1, 16].

figure 1. effect of the extraction temperature on algal oil yield at a fixed extraction time (2 h) and algal particle size (0.6 mm).

figure 2. effect of particle size on algal oil yield at a fixed extraction temperature (60 °c) and extraction time (2 h).

3.1.2. effect of particle size on oil yield

to study the influence of the particle size on the algal oil extraction yield, each experimental run was conducted at an extraction temperature of 60 °c for 2 h. the results are displayed in fig. 2. this figure shows that the oil extraction yield increases with the particle size up to a maximum at 0.6 mm and then decreases as the biomass particle size increases further. this may be attributed to several factors. at a large particle size, the interfacial area between the solid and the solvent becomes lower, and therefore the mass transfer rate of the material is reduced, which leads to a decrease in the extraction yield. it indicates that, as the particle size increases, the diffusion of the solute within the solid becomes slower [17].
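the interfacial-area argument can be made quantitative with a one-line estimate (ours, not from the paper), idealizing the particles as spheres, for which the surface-to-volume ratio is 6/d:

```python
import numpy as np

d = np.array([0.3, 0.4, 0.5, 0.6, 0.7, 1.0])   # particle sizes [mm] used in § 2.2
ssa = 6.0 / d                                  # sphere surface-to-volume ratio [1/mm]
for di, s in zip(d, ssa):
    print(di, round(s, 1))                     # specific surface drops ~3x from 0.3 to 1 mm
```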
a larger particle size also lowers the extraction rate, as it makes it difficult for the solvent to diffuse through the core of the algal biomass in order to leach the oil (solute). the particle size thus influences not only the extraction rate, but also the oil extraction yield. this is corroborated by the work reported by han et al. [18], who showed that an increased oil yield was a direct result of a decrease in the particle size, which in turn increases the interfacial surface area of the solute interacting with the solvent. similar observations were also reported for the oil extracted from the marine macroalga ulva lactuca [16] and the green alga chlorella sp. [9].

3.1.3. effect of extraction time on oil yield

time is one of the important parameters in the extraction process, as it helps in deciding the optimum residence time needed for the extraction process [1]. in this study, the effect of the extraction time on the oil extraction yield was investigated at different time intervals varying from 1 to 3 h at a fixed biomass particle size of ≤ 0.6 mm and an extraction temperature of 60 °c (fig. 3). the oil yield rapidly increased from 7.26 % to 21.82 % with an increase in the extraction time from 1 h to 2.5 h; the highest oil yield was obtained at an extraction time of 2.5 h. however, the oil yield dropped when the extraction time increased beyond 2.5 h. the presumed reason for this observation is that a moderate level of temperature or time is sufficient to improve the diffusion coefficient, which enhances the extraction rate; however, when the extraction is conducted at a high temperature for a long time, the solvent will form bubbles, which inhibit the extraction rate. hence, the extraction time of 2.5 h was found to be the optimum value.

figure 3. the effect of extraction time on the algal oil yield at a fixed extraction temperature (60 °c) and particle size (0.6 mm).
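only two of the yields along this curve are quoted in the text (7.26 % at 1 h and 21.82 % at 2.5 h); the intermediate points in the sketch below are hypothetical read-offs from fig. 3, used merely to illustrate how an optimum time could be estimated from such data by a simple response-surface fit (ours, not the paper's analysis):

```python
import numpy as np
from scipy.optimize import curve_fit

t = np.array([1.0, 1.5, 2.0, 2.5, 3.0])        # extraction time [h]
y = np.array([7.26, 13.0, 18.0, 21.82, 19.5])  # yield [wt.%]; only 7.26 and 21.82 quoted

def quad(t, a, b, c):                          # simple quadratic response surface
    return a*t**2 + b*t + c

p, _ = curve_fit(quad, t, y)
t_opt = -p[1] / (2*p[0])                       # vertex of the fitted parabola
print(round(t_opt, 2), round(quad(t_opt, *p), 2))   # optimum time and predicted yield
```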
the ph value of the algal oil was determined to be 7.18 ± 0.01 and this indicates that the oil is neutral, which is in an agreement with previous studies on oil extraction from algae by many researchers [4, 19]. the results obtained herein indicate that the algal oil could be used as a low-grade feedstock for biodiesel production and other industrial chemicals where non-edible plant oil can serve as a starting raw material. 3.2.2. ftir analysis fig. 4 shows the ftir spectrum of the extracted oil from algae biomass. the absorption band between approximately 3100 cm−1 and 3000 cm−1 can be attributed to the = c h stretching from aromatic, unsaturated hydrocarbons, while the frequency range 2990 − 2850 can be ascribed to the ch stretching in the alkane group. the absorption band at 1748.03 cm−1 confirms the presence of ester groups (c = o), which indicates that the extracted oil can be transformed into another ester and hence, it can serve as a biodiesel feedstock. the appearance of a trough at 950 − 1300 cm−1 can be attributed to the c c (o) c stretching of alcohol and esters. the observed band at around 710 cm−1 is due to the ch out of plane deformation in alkene. 3.2.3. fatty acid profile analysis in addition to the extracted oil quality characterization, the fatty acid profile of the oil was investigated using gc-fid technique. the chromatograms obtained with associated peaks are presented in table 2. the main fatty acids in the extracted oil were oleic and stearic acids. this is corroborated by earlier studies on fatty acid profile of algal oil [4, 9, 14]. 172 vol. 60 no. 2/2020 extraction and evaluation of oil from green algae. . . 0 20 40 60 80 100 120 40080012001600200024002800320036004000 % t wavenumber (cm-1) c -h c -h c = o h -c -h c -c (o )-c c -h = c -h figure 4. ftir spectrum of the extracted algal oil. peak no retention time (min) area (%) fatty acid systematic name 1 43.315 6.96 palmitoleic hexadecanedoic 2 47.258 69.07 oleic cis-9-octadecenoic 3 47.614 38.60 stearic octadecanoic 4 17.250 17.25 phthalic phthalic table 2. algal oil fatty acid profile. 4. conclusion the extraction of oil from green microalgae cladophora glomerata using a mixture of n-hexane and ether as a solvent was successfully carried out. the effect of extraction process parameters on the algal oil yield was investigated. the maximum oil yield, 18.3 %, was found at a particle size of ≤ 0.6 mm, extraction temperature of 60 °c, and extraction time of 2.5 hours. the result of the ftir analysis confirmed that the extracted oil possessed several functional groups, while the gc-fid analysis conducted that the extracted oil indicated a high oleic acid content. the free fatty acid content of the algal oil was much less, thus making it suitable for a biodiesel production via single-step transesterification. references [1] a. yusuff, m. lala, l. popoola, o. adesina. optimization of oil extraction from leucaena leucocephala seed as an alternative low-grade feedstock for biodiesel production. sn applied sciences 1:357, 2019. doi:10.1007/s42452-019-0364-0. [2] a. k. endalew, y. kiros, r. zanzi. inorganic heterogeneous catalysts for biodiesel production from vegetable oils. biomass and bioenergy 35(9):3787 – 3809, 2011. doi:10.1016/j.biombioe.2011.06.011. [3] a. refaat. biodiesel production using solid metal oxide catalyst. international journal of environmental science and technology 8:203 – 221, 2010. doi:10.1007/bf03326210. [4] m. yuvarani, d. kubendran, a. r. s. aathika, et al. 
4. conclusion

the extraction of oil from the green microalga cladophora glomerata using a mixture of n-hexane and ether as the solvent was successfully carried out, and the effect of the extraction process parameters on the algal oil yield was investigated. the maximum oil yield, 18.3 %, was found at a particle size of ≤ 0.6 mm, an extraction temperature of 60 °c and an extraction time of 2.5 hours. the result of the ftir analysis confirmed that the extracted oil possessed several functional groups, while the gc-fid analysis indicated a high oleic acid content. the free fatty acid content of the algal oil was low, making it suitable for a biodiesel production via a single-step transesterification.

references

[1] a. yusuff, m. lala, l. popoola, o. adesina. optimization of oil extraction from leucaena leucocephala seed as an alternative low-grade feedstock for biodiesel production. sn applied sciences 1:357, 2019. doi:10.1007/s42452-019-0364-0.
[2] a. k. endalew, y. kiros, r. zanzi. inorganic heterogeneous catalysts for biodiesel production from vegetable oils. biomass and bioenergy 35(9):3787–3809, 2011. doi:10.1016/j.biombioe.2011.06.011.
[3] a. refaat. biodiesel production using solid metal oxide catalyst. international journal of environmental science and technology 8:203–221, 2010. doi:10.1007/bf03326210.
[4] m. yuvarani, d. kubendran, a. r. s. aathika, et al. extraction and characterization of oil from macroalgae cladophora glomerata. energy sources, part a: recovery, utilization, and environmental effects 39(23):2133–2139, 2017. doi:10.1080/15567036.2017.1400608.
[5] k. v. thiruvengadaravi, j. nanadagopal, v. s. s. bala, et al. the solid acid catalyzed esterification of free fatty acids in pongamia pinnata oil. energy sources, part a: recovery, utilization, and environmental effects 34(21):2016–2022, 2012. doi:10.1080/15567036.2010.485176.
[6] a. yusuff, o. adeniyi, m. olutoye, u. akpan. performance and emission characteristics of diesel engine fuelled with waste frying oil derived biodiesel-petroleum diesel blend. international journal of engineering research in africa 32:100–111, 2017. doi:10.4028/www.scientific.net/jera.32.100.
[7] t. mathimani, a. pugazhendhi. utilization of algae for biofuel, bio-products and bio-remediation. biocatalysis and agricultural biotechnology 17:326–330, 2019. doi:10.1016/j.bcab.2018.12.007.
[8] h. chen, d. zhou, g. luo, et al. macroalgae for biofuels production: progress and perspectives. renewable and sustainable energy reviews 47:427–437, 2015. doi:10.1016/j.rser.2015.03.086.
[9] n. chi, d. pham, t. mathimani, a. pugazhendhi. evaluating the potential of green alga chlorella sp. for high biomass and lipid production in biodiesel viewpoint. biocatalysis and agricultural biotechnology 17:184–188, 2018. doi:10.1016/j.bcab.2018.11.011.
[10] m. abubakar. extraction and characterization of oil from microalgae through soxhlet extraction method. bachelor thesis, afe babalola university, ado-ekiti, nigeria, 2018.
[11] m. bhuiya, m. rasul, m. khan, et al. prospects of 2nd generation biodiesel as a sustainable fuel part: 1. selection of feedstocks, oil extraction techniques and conversion technologies. renewable and sustainable energy reviews 55:1109–1128, 2016. doi:10.1016/j.rser.2015.04.163.
[12] s. mani. extraction and characterization of jatropha curcas linnause seed oil through soxhlet method. bachelor thesis, universiti malaysia pahang, malaysia, 2016.
[13] a. atabani, a. silitonga, i. a. badruddin, et al. a comprehensive review on biodiesel as an alternative energy resource and its characteristics. renewable and sustainable energy reviews 16(4):2070–2093, 2012. doi:10.1016/j.rser.2012.01.003.
[14] a. zonouzi, m. auli, m. j. dakheli, m. a. hejazi. oil extraction from microalgae dunalliela sp. by polar and non-polar solvents. international journal of agricultural and biosystems engineering 10(10):642–645, 2016. doi:10.5281/zenodo.1126964.
[15] s. seth, y. agrawal, p. ghosh, et al. oil extraction rates of soya bean using isopropyl alcohol as solvent. biosystems engineering 97(2):209–217, 2007. doi:10.1016/j.biosystemseng.2007.03.008.
[16] t. suganya, s. renganathan. optimization and kinetic studies on algal oil extraction from marine macroalgae ulva lactuca. bioresource technology 107:319–326, 2012. doi:10.1016/j.biortech.2011.12.045.
[17] j. rodríguez-miranda, b. hernández-santos, e. herman-lara, et al.
effect of some variables on oil extraction yield from mexican pumpkin seeds. cyta journal of food 12(1):9–15, 2014. doi:10.1080/19476337.2013.777123.
[18] x. han, l. cheng, r. zhang, j. bi. extraction of safflower seed oil by supercritical co2. journal of food engineering 92(4):370–376, 2009. doi:10.1016/j.jfoodeng.2008.12.002.
[19] n. topare, s. raut, v. renge, et al. extraction of oil from algae by solvent extraction and oil expeller method. international journal of chemical sciences 9:1746–1750, 2011.

acta polytechnica vol. 44 no. 5–6/2004, © czech technical university publishing house, http://ctn.cvut.cz/ap/

the influence of disorder in multifilament yarns on the bond performance in textile reinforced concrete

m. konrad, r. chudoba

abstract: in this paper we analyze the performance of the bond layer between a multi-filament yarn and the cementitious matrix. the performance of the bond layer is a central issue in the development of textile reinforced concrete. the changes in the microstructure during loading result in distinct failure mechanisms on the micro, meso and macro scales. the paper provides a brief review of these effects and describes a modeling strategy capable of reflecting the failure process. using the model of the bond layer, we illuminate the correspondence between the disorder in the microstructure of the yarn and the bonding behavior at the meso and macro levels. particular interest is paid to the influence of irregularities in the micro-structure (relative differences in filament lengths, varying bond quality, bond-free length) for different levels of local bond quality between the filament surface and the matrix.

keywords: textile reinforced concrete, material model, bond performance, micro scale.

1 introduction

textile reinforced concrete (trc) has emerged in the last decade as a new composite material combining textile reinforcement with a cementitious matrix. its appealing feature is the possibility to produce filigree high-performance structural elements that are not prone to corrosion, as is the case for steel reinforced concrete. in contrast to other composite materials, in trc both the matrix and the reinforcement exhibit a high degree of heterogeneity of their material structure at similar scales of resolution. as a consequence, the fundamental failure mechanisms in the yarns, in the matrix and in the bond layer interact with each other and can result in several macroscopically different failure modes.

the development of a consistent material model for textile reinforced concrete requires the formulation and calibration of several sub-models at several scales of resolution. each of these models represents the material structure at the corresponding scale (fig. 1) with a focus on specific damage and failure mechanisms. the following correspondence between the scales and the observable components of the material structure and their interactions is specified:

• micro level: filament, matrix; bond filament-matrix
• meso level: yarn, matrix; bond yarn-matrix
• macro level: textile, matrix; bond textile-matrix

fig. 1: resolution scales

while models at the micro level are able to capture the fundamental failure and damage mechanisms of the material components (e.g. filament rupture and debonding from the matrix), their computational costs limit their application to small representative unit cells of the material structure. on the other hand, macro level models provide sufficient performance at the expense of a limited range of applicability. generally, all the scales must be included in the assessment of the material performance.
the chain of models at each scale may be coupled (1) conceptually, by clearly defining the correspondence between the material models at each level, or (2) adaptively, within a single multi-scale computation, to balance accuracy and performance in an optimal way [1, 2]. due to the complex structure of textile reinforced concrete at several levels (filament, yarn, textile, matrix) it is effective to develop a set of conceptually related sub-models for each structural level covering the selected phenomena of the material behavior. the homogenized effective material properties obtained at a lower level can be verified and validated using experiments and models at the higher level(s).

the present paper is focused on the role of disorder in the bond layer between the yarn and the matrix. in sec. 2 we review the elementary effects occurring in the bond layer during loading. after that, in sec. 3, the model capturing some of these effects is introduced. then, in sec. 4, the calibration for a particular combination of yarn and matrix is performed, and finally, in sec. 5, a parametric study shows the interaction effects between two failure mechanisms, namely the debonding of filaments from the matrix and the rupture of the filaments with an included disorder in the bundle.

2 elementary effects occurring in the bond layer

in the reinforcement, the elementary mechanisms of the material behavior are appointed to the filaments, with linear elastic behavior and brittle failure. the filament ensemble constituting the yarn exhibits nonlinear behavior due to disorder in the filament structure. the delayed activation of individual filaments leads to a gradual growth of stiffness at the beginning of the loading process, the friction between filaments influences the maximum stiffness reached during loading, and both these effects influence the rate of failure after reaching the maximum force. both in the filaments and in the yarn we may also observe the statistical size effect leading to reduced strength with increasing length [3].
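the delayed-activation and brittle-rupture effects just described can be illustrated with a minimal parallel-bundle sketch (ours, not the model of the paper; all parameter values are hypothetical, loosely inspired by ar-glass yarns):

```python
import numpy as np

rng = np.random.default_rng(0)
n_f, E, A = 1600, 70e3, 5.7e-4           # filaments, E-modulus [MPa], area [mm^2]
eps_act = rng.uniform(0.0, 0.004, n_f)   # delayed-activation strains (assumed uniform)
eps_u = rng.weibull(5.0, n_f) * 0.02     # rupture strains (assumed weibull-distributed)

def bundle_force(eps):
    """force of a parallel bundle under monotonic strain eps"""
    e_eff = np.clip(eps - eps_act, 0.0, None)   # slack filaments carry no load yet
    intact = e_eff < eps_u                      # broken filaments drop out
    return E * A * np.sum(e_eff * intact)

for eps in (0.002, 0.01, 0.02, 0.03):
    print(eps, round(bundle_force(eps), 1))     # stiffness grows, then the bundle softens
```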
however, for the purpose of the present study, focused solely on the role of heterogeneity in the yarn, this influence can be disregarded.

the interaction between the reinforcement and the matrix can be seen on the micro- and meso-scales shown in fig. 2. the fine scale interaction between the filament and the matrix includes the phases of bonding, debonding and friction. the interaction at the level of the yarn and matrix includes the same phases, but each of these phases includes fine scale interaction modes between the filaments and the matrix. due to the complex structure of the failure process zone, the yarn-matrix bonding behavior cannot be captured without analyzing the interaction effects in micro-mechanical terms, as is done in this paper. another interaction effect occurs upon cracking of the matrix and the evolution of crack bridges leading to the tension stiffening effect in the overall response. this interaction is studied using meso-scale models, and goes beyond the scope of the present study [4]. the same holds for the interaction at the level of textile structures embedded in the matrix, which can be addressed by a macromechanical treatment [5, 6].

fig. 2: correspondence between scales, components and effects used in this model

3 model of the bond layer

in this model the interface layer between the yarn and the matrix is regarded as a set of laminas interacting with the matrix through the given bond law. the laminas represent groups of filaments with the same characteristics, and are coupled with the matrix using zero thickness interface elements [7]. the disorder in the filament bundle is taken into account using one of three distributions of filament properties: (1) a distribution of the bond quality, diminishing from the outside to the inside of the yarn, (2) a distribution of the bond free length, increasing from the outside to the inside of the yarn, and (3) a distribution of the delayed activation of filaments within the bond free length. these distributions do not represent the disorder in the filament bundle directly. the bundle geometry is assumed in the form of a parallel set of filaments. the effect of disorder is reflected indirectly in terms of the mentioned distribution functions inducing an inhomogeneous stress transfer throughout the bond layer, which is assumed to occur in a similar way in the heterogeneous material structure. the model can serve the purpose of capturing the influence of the variations in the bond performance on the macroscopically observable failure process, so that these variations may be quantified in a calibration procedure.

the calibration of the model is performed using both the load-displacement curve and the curve representing the instantaneous fraction of broken filaments during the loading process. the latter is obtained experimentally by optical recording of the light transmission through the unbroken filaments [8]. using this model and the experimental data we are able to derive the effective bond law of the bond layer between the whole yarn
(filament bundle) and the matrix, which can be used at the higher modeling levels.

4 identification of the characteristic parameters for the bond layer

for the selected combination of the yarn and the matrix, the parameters characterizing the tensile behavior of the yarn and the parameters of the bond between the filament surface and the matrix can be determined from a preliminary numerical and experimental study. in particular, the characteristics of the yarn and of the filaments can be derived from tensile tests on the yarn. the applied stochastic modeling of the multi-filament bundle allowed us to obtain also the statistical distributions of strength and stiffness along the yarn, as described thoroughly in [9, 10]. the local bonding between the filament surface and the matrix has been characterized by a bond model with parameters calibrated using the single filament pull-out experiment [11].

the sought material characteristics are the distributions of the bond quality, bond free length and activation strain across the filament bundle in the bond layer. the calibration procedure [12, 13] is based on the experimental data shown in fig. 3. here, the left diagram shows the load-displacement curves and the right diagram shows the diminishing fraction of unbroken filaments during the pull-out test for four selected specimens.

fig. 3: load-displacement curve, fraction of unbroken filaments

before presenting the calibrated results, we first show the qualitative influence of the variations in the bond quality across the yarn cross section. three examples of assumed bond quality distributions are shown in fig. 4, with the maximum achievable shear flow (100 %) at the outside of the yarn and a linear, quadratic and cubic reduction in the internal layers.

fig. 4: possible functions representing the decrease in the bond quality

the influence of a linear, a quadratic and a cubic bond quality distribution (fig. 4) on the pull-out curve and on the progression of filament rupture is shown in fig. 5. while the linear and the quadratic interpolation functions result in a sharp kink in the pull-out curve at the onset of filament rupture, the cubic distribution leads to a curve with a higher deformation capacity and is able to qualitatively reproduce the pull-out behavior and the progression of filament breaks measured in the experiment.

fig. 5: influence of different bond quality functions

while the form of the bond quality distribution influences the post-peak slope of the pull-out curve, the maximum pull-out force primarily depends on the tensile strength of the filaments. this correspondence is documented in fig. 6. higher tensile strength of the filaments results in a higher pull-out force. furthermore, higher filament strength leads to a higher frictional force at the end of the pull-out test, because a greater number of filaments are pulled out prior to their rupture.

fig. 6: influence of filament strength ft

the effect of filament stiffness and strength and of the local bonding stiffness on the initial slope of the pull-out curve is not significant. on the other hand, their effect on the fraction of broken filaments is much higher. in other words, the initial stiffness in the pull-out test cannot be reproduced solely by reducing the bond quality and the tensile strength of the filaments. as a consequence, the reduced pull-out stiffness is explained by the existence of a free length inside the specimen between the macroscopic boundary of the matrix and the first contact of the filaments with the matrix inside the
specimen, i.e. the start of the microbonding between filament and matrix. similarly to the bond quality, we assume that the bond-free length increases from the outside of the yarn cross section to the inside, which is illustrated in fig. 7. the influence of this free length on the initial stiffness is demonstrated in fig. 8. the initial stiffness and the maximum pull-out force decrease with increasing bond-free length. due to the shorter embedding length of the filaments, the number of filaments being pulled out (i.e. debonded) increases, which results in a higher frictional force at the end of the pull-out test.

fig. 7: free length between matrix boundary and the beginning of microbonding
fig. 8: influence of free length

the filaments in the bundle exhibit a waviness, which is illustrated in fig. 9. within the internal free length the filaments have the possibility to straighten before they get activated. the delayed activation of the individual filaments is modeled by an activation strain, which has to be reached before a filament takes up force. the activation strain increases with increasing free length. fig. 10 shows the influence of linear distributions of the activation strain with different maxima. the increasing delay of the activation results in a further reduction of the initial stiffness and of the maximum pull-out force. in contrast to the parameters described above, it does not influence the number of broken filaments.

fig. 9: waviness of the filaments
fig. 10: influence of activation strain

using the parameters described above, a calibration of the model is possible, as exemplified in fig. 11. the calibrated distribution of the bond quality across the yarn cross section provides the basis for further modeling on the meso and macro level.

fig. 11: comparison of simulation and experiment

5 parametric study of the bond performance with disorder in the yarn

in the previous section the characteristics of the material structure in the bond layer were introduced and their influence on the bond performance was shown. in the following we will study the influence of the local bond quality on the overall bond performance. as already specified, the bond behavior is described by a bilinear bond law (fig. 12) including the phases of adhesive bond, debonding and friction. the local bond quality can be modified by changing the maximum bonding stress $\tau_{max}$, the frictional stress $\tau_{fr}$ and their ratio $\tau_{max}/\tau_{fr}$.

fig. 12: bilinear bond law
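to make the shape of such a bilinear bond law concrete, a minimal c++ sketch is given below. it is only an illustration of a law with the three phases named above; the slip values s1 (end of the adhesive phase) and s2 (onset of pure friction) are hypothetical parameters introduced here, not quantities taken from the authors' implementation.

    #include <cmath>

    // illustrative bilinear bond law: adhesive (elastic) branch up to tauMax,
    // linear softening (debonding) down to tauFr, then constant friction.
    // s1 and s2 are assumed slip parameters, not values from the paper.
    double bondStress(double slip, double tauMax, double tauFr,
                      double s1, double s2)
    {
        double s = std::fabs(slip);
        if (s <= s1)                      // adhesive bond phase
            return tauMax * s / s1;
        if (s <= s2)                      // debonding (softening) phase
            return tauMax - (tauMax - tauFr) * (s - s1) / (s2 - s1);
        return tauFr;                     // frictional phase
    }

varying tauMax and tauFr in such a law spans the parameter space $\tau_{max}$, $\tau_{fr}$ and $\tau_{max}/\tau_{fr}$ explored in the study below.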
the influence of the maximum bond stress $\tau_{max}$ and of the ratio of the maximum bond stress and the frictional stress $\tau_{max}/\tau_{fr}$ for an embedding length of 30 mm is shown in fig. 13. it is obvious that the effect of the maximum shear stress on the maximum pull-out force is negligible. this is a result of the long embedding length of 30 mm accumulating a high amount of the frictional stress. therefore the maximum pull-out force depends essentially only on the frictional stress. the dependency of the maximum pull-out force on the frictional stress $\tau_{fr}$ is shown in fig. 14.

fig. 13: maximum pull-out force as a function of $\tau_{max}/\tau_{fr}$ for different values of $\tau_{fr}$
fig. 14: maximum pull-out force and associated displacement as a function of $\tau_{fr}$ for different ratios $\tau_{max}/\tau_{fr}$

for a constant maximum bond stress $\tau_{max}$, the maximum pull-out force and the associated displacement decrease with increasing frictional stress $\tau_{fr}$. this effect is rather surprising. it means that an improvement of the bond performance of the filament surface results in a reduction of the resulting bond performance of the bundle. in order to illuminate this effect, the shear flow at maximum pull-out force along the filament is displayed for each lamina in fig. 15. the laminas in front represent the filaments inside the bundle with a low bond performance, while the rear laminas represent the outer filaments with a high bond performance. a constant shear flow indicates that up to this length the filaments have debonded. the left diagram shows the shear flow distribution across the bond layer for a low level of frictional stress $\tau_{fr}$. the length activated for the stress transfer between the filaments and the matrix is much longer than in the right diagram, showing the shear flow distribution for a higher frictional stress. the longer stress transfer length results in a lower strain of the filaments, and leads to filament rupture at larger control displacements.

fig. 15: shear flow for $\tau_{fr} = 0.5$ n/mm² and $\tau_{fr} = 6$ n/mm²

the reduction of the maximum pull-out force with increasing local bond strength is explained using fig. 16. the two diagrams show the accumulated pull-out response (thick curve) and the pull-out curves for each lamina separately. for the lower level of frictional stress ($\tau_{fr} = 0.5$ n/mm²) more filaments can be activated simultaneously (left diagram). at the maximum pull-out force 95 % of the filaments are active, and at the end of the loading only 15 % of the filaments get broken while all the rest remain intact. on the other hand, for a yarn with a higher level of friction ($\tau_{fr} = 6$ n/mm²) only 55 % of the filaments are active at the maximum pull-out force (30 % are still inactive and 15 % are already broken). at the end of the loading all the filaments are broken.

fig. 16: failure process for $\tau_{fr} = 0.5$ n/mm² and $\tau_{fr} = 6$ n/mm² (lamina forces and total pull-out force vs. displacement)

thus, even though the inner filaments were able to transfer a higher amount of force to the matrix, the resulting pull-out force was reduced due to the non-uniformity of the transfer. this qualitative comparison demonstrates the role of disorder represented by the varying bond-free length, the delayed activation and the spatial variations in the bond quality. the improvement of the local bond performance is counterproductive and results in an earlier failure of the outer filaments with higher bond performance. due to the increased pull-out stiffness, the pull-out force reaches its maximum at a smaller control displacement. at this displacement most of the inner filaments with lower bond performance have not been activated and cannot contribute to the total pull-out force.
6 conclusions

in this paper, a modeling strategy supporting the development of textile reinforced concrete is presented. it is based on the assumption that there is no ultimate model able to capture all the aspects of the material behavior. therefore the models currently being developed in the framework of the collaborative research center are classified and evaluated with respect to the failure mechanisms being captured. it is important that they have a defined validity and clearly specified interfaces. they are applied together in order to study the material response at various scales of material resolution.

the modeling of the bond layer demonstrated that we face a failure process zone with a complex interaction of elementary effects. the parametric study emphasized the role of disorder in this interaction and exemplified that it reverses the expected correlation between input parameters and material response. the final message of this paper can be put as follows: in the design of cementitious composites reinforced with multi-filament yarns, the issue of disorder must be carefully analyzed. only with a good knowledge of the phenomena in the microstructure is it possible to balance the performance of the individual components of the material structure to obtain the optimum performance of the composite.

7 acknowledgment

this work has been carried out in the framework of the project simulation of bond and crack behavior of textile-reinforced concrete at the meso level, included in the collaborative research center textile reinforced concrete: foundation of a new technology (sfb 532) sponsored by the german research foundation.

references

[1] krause r.: “investigation of failure mechanisms with a multiscale finite element method.” proc. european conference on computational mechanics, münchen, 1999.
[2] fish j., yu q.: “computational mechanics of fatigue and life predictions for composite materials and structures.” computer methods in applied mechanics and engineering, vol. 191 (2002), p. 4827–4849.
[3] phoenix s. l.: “stochastic strength and fatigue of fiber bundles.” international journal of fracture, vol. 14 (1978), no. 3, p. 327–344.
[4] hegger j., mombartz m., chudoba r.: “combined use of extended finite elements and h-adaptivity for simultaneous crack propagation and strain localization.” euromech colloquium 460, numerical modelling of concrete cracking, innsbruck, austria, 2005.
[5] hegger j., bruckermann o., chudoba r.: “modelling of the bond of filaments and rovings.” proc. 2nd colloquium on textile reinforced structures (ctrs2), dresden, 2003.
[6] hegger j., bruckermann o., voss s., chudoba r.: “a smeared model for the simulation of textile reinforced concrete tension tests.” proc. 3rd asia-pacific conference on fibre reinforced materials, changsha, 2003.
[7] kaliakin v. n., li j.: “insight into deficiencies associated with commonly used zero-thickness interface elements.” computers and geotechnics, vol. 17 (1995), no. 2, p. 225–252.
[8] brameshuber w., banholzer b., gries t., al-masri a.: “methode zur untersuchung des versagensmechanismus unter zugbelastung von multifilament-garnen für die betonbewehrung.” technische textilien, vol. 45 (2002), p. 98–99.
[9] chudoba r., vorechovsky m., konrad m.: “stochastic modeling of multi-filament yarns i: random properties within the cross-section and the size effect.” journal of engineering mechanics, submitted for publication, (2004).
[10] vorechovsky m., chudoba r.: “stochastic modeling of multi-filament yarns ii: random properties over the length and the size effect.” journal of engineering mechanics, submitted for publication, (2004).
[11] brameshuber w., banholzer b.: “bond characteristics of filaments embedded in fine grained concrete.” proc. 2nd colloquium on textile reinforced structures (ctrs2), dresden, 2003.
[12] chudoba r., butenweg c., peiffer f.: “textile reinforced concrete part i: process model for collaborative research and development.” proc. international conference on the applications of computer science and mathematics in architecture and civil engineering, weimar, 2003.
[13] chudoba r., pfeiffer f., meskouris k.: “experiment design and automated evaluation under the application of numerical material models.” proc. 2nd colloquium on textile reinforced structures (ctrs2), dresden, 2003.

ing. martin konrad, csc.
prof. ing. rostislav chudoba, csc.
e-mail: r.chudoba@aut.uni.de
aachen university of technology
chair of structural statics and dynamics
mies-van-der-rohe-str. 1
aachen, germany

a shear lag analysis for composite box girders with deformable connectors

v. křístek

a method is proposed for shear lag analysis which can be applied to steel-concrete composite box girders. the proposed method uses harmonic analysis and allows the determination of shear lag effects from simple calculations, so that the method is regarded as a design aid. the character of the method can illustrate the influence of certain key parameters upon the extent of the shear lag effect.

keywords: shear lag, composite girder, stress distribution, harmonic analysis.

1 introduction

the effects of shear lag can cause a significant increase in the longitudinal stresses developed in steel box girders. previous investigations have shown that the extent of shear lag within a flange plate is dependent on the ratio between the axial stiffness and the shear stiffness of the plate. the introduction of longitudinal stiffeners increases the axial stiffness without changing the shear stiffness, so that there is a consequent increase in shear lag. stiffeners are, of course, introduced to increase the resistance of the compressed flange to buckling. it has been proven in [1] that it is far more advantageous, from the point of view of shear lag, if the flange plate is stiffened with a layer of concrete that is made to act compositely with the steel plate (fig. 1a). the necessary composite action can be achieved by means of shear studs welded to the steel plate.

fig. 1: (a) steel-concrete composite girder – simplified form of cross section, (b) concrete layer added to the compression bottom flange plate in the hogging moment regions

among many applications of composite arrangements, the case of increasing the load carrying capacity of an existing steel box girder bridge may be mentioned as a special example. the bottom flange plate in the hogging moment regions over the internal supports of a continuous girder is particularly susceptible to the effects of shear lag. the most obvious way of strengthening these regions is to weld on more longitudinal stiffeners in the compression zone of the bottom flange. this will increase the buckling resistance of the flange, but it will also accentuate the shear lag problem. an alternative method of strengthening an existing bridge girder is to add a concrete layer to the compression flange so that it acts compositely with the steel (fig. 1b), which will increase the buckling resistance while also controlling the shear lag effect. although the method is applicable primarily for strengthening an existing bridge, it may well provide an economic alternative in the design of a new box girder.
a perfect connection between the steel flange and the concrete layer exists, however, only theoretically. although there certainly will be an intention to benefit from full composite interaction, the studs placed at regular distances, which are commonly used as connectors at the present time, exhibit some unavoidable deformability.

2 governing equations

shear flows ${}_s q$ and ${}_c q$, and normal forces ${}_s n_x$ and ${}_c n_x$ per unit width act on a typical element of the steel flange sheet or the concrete layer, respectively (see fig. 2). the equations governing the equilibrium in the longitudinal direction are: for the steel sheet (fig. 2a)

$$\frac{\partial\, {}_s n_x}{\partial x} + \frac{\partial\, {}_s q}{\partial y} - f = 0, \qquad (1)$$

for the concrete layer (fig. 2b)

$$\frac{\partial\, {}_c n_x}{\partial x} + \frac{\partial\, {}_c q}{\partial y} + f = 0, \qquad (2)$$

in which $f$ is the shear acting in the longitudinal direction at the interface between the steel flange and the concrete layer.

fig. 2: equilibrium conditions in the longitudinal direction

if the contribution of the small transverse forces to the strains is neglected, it may be written:

$${}_s\varepsilon_x = \frac{\partial u_s}{\partial x} = \frac{{}_s n_x}{t_s E_s}, \qquad (3)$$

$${}_c\varepsilon_x = \frac{\partial u_c}{\partial x} = \frac{{}_c n_x}{t_c E_c}, \qquad (4)$$

$${}_s\varepsilon_y = -\nu_s \frac{{}_s n_x}{t_s E_s}, \qquad (5)$$

$${}_c\varepsilon_y = -\nu_c \frac{{}_c n_x}{t_c E_c}, \qquad (6)$$

$${}_s\gamma = \frac{{}_s q}{t_s G_s}, \qquad (7)$$

$${}_c\gamma = \frac{{}_c q}{t_c G_c}, \qquad (8)$$

where $t_s$ and $t_c$ are the thicknesses of the steel flange and the concrete layer, respectively, $\varepsilon_x$ and $\varepsilon_y$ are the direct strains in the longitudinal and transverse directions, respectively, and $\gamma$ are the shear strains. $E$, $G$ and $\nu$ represent young's moduli, the shear moduli and poisson's ratios, respectively; $u$ are the longitudinal displacements. the general form of the condition of compatibility is as follows:

$$\frac{\partial^2 \varepsilon_x}{\partial y^2} + \frac{\partial^2 \varepsilon_y}{\partial x^2} = \frac{\partial^2 \gamma}{\partial x\, \partial y}. \qquad (9)$$

substituting the strains from eqs. (3)–(8), it is obtained: for the steel

$$\frac{\partial^2\, {}_s n_x}{\partial y^2} - \nu_s \frac{\partial^2\, {}_s n_x}{\partial x^2} = 2(1+\nu_s)\,\frac{\partial^2\, {}_s q}{\partial x\, \partial y}, \qquad (10)$$

for the concrete layer

$$\frac{\partial^2\, {}_c n_x}{\partial y^2} - \nu_c \frac{\partial^2\, {}_c n_x}{\partial x^2} = 2(1+\nu_c)\,\frac{\partial^2\, {}_c q}{\partial x\, \partial y}. \qquad (11)$$

substituting for the shear flows ${}_s q$ and ${}_c q$ from equations (1) and (2):

$$\frac{\partial^2\, {}_s n_x}{\partial y^2} + (2+\nu_s)\,\frac{\partial^2\, {}_s n_x}{\partial x^2} - 2(1+\nu_s)\,\frac{\partial f}{\partial x} = 0, \qquad (12)$$

$$\frac{\partial^2\, {}_c n_x}{\partial y^2} + (2+\nu_c)\,\frac{\partial^2\, {}_c n_x}{\partial x^2} + 2(1+\nu_c)\,\frac{\partial f}{\partial x} = 0. \qquad (13)$$

it may be assumed that the shear $f$ acting between the steel sheet and the concrete layer, being provided by deformable connectors, is proportional to the mutual longitudinal slip which occurs at the interface between the two components, i.e.

$$f = k\,(u_s - u_c), \qquad (14)$$

where $k$ is the connector stiffness. eq. (14) may be written in the form

$$\frac{\partial f}{\partial x} = k\left(\frac{\partial u_s}{\partial x} - \frac{\partial u_c}{\partial x}\right) = k\left(\frac{{}_s n_x}{t_s E_s} - \frac{{}_c n_x}{t_c E_c}\right). \qquad (15)$$
the following fourier series may express the searched functions:

$${}_s n_x = \sum_{j=1}^{\infty} {}_s n_j(y)\, \sin\frac{j\pi x}{l}, \qquad (16)$$

$${}_c n_x = \sum_{j=1}^{\infty} {}_c n_j(y)\, \sin\frac{j\pi x}{l}, \qquad (17)$$

$$f = \sum_{j=1}^{\infty} f_j(y)\, \cos\frac{j\pi x}{l}, \qquad (18)$$

where $l$ is the effective span-length. eqs. (12), (13) and (15) can be written in the form:

$${}_s n_j'' - (2+\nu_s)\frac{j^2\pi^2}{l^2}\, {}_s n_j + 2(1+\nu_s)\frac{j\pi}{l}\, f_j = 0, \qquad (19)$$

$${}_c n_j'' - (2+\nu_c)\frac{j^2\pi^2}{l^2}\, {}_c n_j - 2(1+\nu_c)\frac{j\pi}{l}\, f_j = 0, \qquad (20)$$

$$\frac{j\pi}{l}\, f_j + k\left(\frac{{}_s n_j}{t_s E_s} - \frac{{}_c n_j}{t_c E_c}\right) = 0, \qquad (21)$$

in which ${}_s n_j'' = \mathrm{d}^2\, {}_s n_j / \mathrm{d}y^2$, etc. these relations represent a set of three equations for the unknown functions ${}_s n_j(y)$, ${}_c n_j(y)$ and $f_j(y)$, which can be adjusted to the following system of two differential equations

$${}_s n_j'' - a_j\, {}_s n_j + c_j\, {}_c n_j = 0, \qquad (22)$$

$${}_c n_j'' - b_j\, {}_c n_j + d_j\, {}_s n_j = 0, \qquad (23)$$

where

$$a_j = (2+\nu_s)\frac{j^2\pi^2}{l^2} + 2(1+\nu_s)\frac{k}{t_s E_s}, \qquad
b_j = (2+\nu_c)\frac{j^2\pi^2}{l^2} + 2(1+\nu_c)\frac{k}{t_c E_c},$$

$$c_j = 2(1+\nu_s)\frac{k}{t_c E_c}, \qquad
d_j = 2(1+\nu_c)\frac{k}{t_s E_s}. \qquad (24)$$

it follows from eq. (22) that

$${}_c n_j = \frac{1}{c_j}\left(a_j\, {}_s n_j - {}_s n_j''\right), \qquad (25)$$

which, substituted into eq. (23), allows to obtain a differential equation of the fourth order

$$\frac{\mathrm{d}^4\, {}_s n_j(y)}{\mathrm{d}y^4} - \bar a_j\, \frac{\mathrm{d}^2\, {}_s n_j(y)}{\mathrm{d}y^2} + \bar b_j\, {}_s n_j(y) = 0, \qquad (26)$$

whose coefficients are

$$\bar a_j = a_j + b_j, \qquad \bar b_j = a_j b_j - c_j d_j. \qquad (27)$$

the general solution of (26), if the case of complex roots of the characteristic equation is assumed, is

$${}_s n_j(y) = c_{1,j}\, p_{1,j}(y) + c_{2,j}\, p_{2,j}(y) + c_{3,j}\, p_{3,j}(y) + c_{4,j}\, p_{4,j}(y), \qquad (28)$$

where

$$p_{1,j}(y) = \sinh\alpha_j y\, \sin\beta_j y, \qquad p_{2,j}(y) = \cosh\alpha_j y\, \cos\beta_j y,$$

$$p_{3,j}(y) = \cosh\alpha_j y\, \sin\beta_j y, \qquad p_{4,j}(y) = \sinh\alpha_j y\, \cos\beta_j y \qquad (29)$$

and

$$\alpha_j = \sqrt{\frac{\sqrt{\bar b_j}}{2} + \frac{\bar a_j}{4}}, \qquad \beta_j = \sqrt{\frac{\sqrt{\bar b_j}}{2} - \frac{\bar a_j}{4}}.$$

the amplitude function ${}_c n_j(y)$, according to eq. (25), is determined as

$${}_c n_j(y) = \frac{1}{c_j}\bigl[\, c_{1,j}\,(r_j\, p_{1,j}(y) - s_j\, p_{2,j}(y)) + c_{2,j}\,(r_j\, p_{2,j}(y) + s_j\, p_{1,j}(y)) + c_{3,j}\,(r_j\, p_{3,j}(y) - s_j\, p_{4,j}(y)) + c_{4,j}\,(r_j\, p_{4,j}(y) + s_j\, p_{3,j}(y)) \,\bigr], \qquad (30)$$

in which

$$r_j = a_j - \alpha_j^2 + \beta_j^2, \qquad s_j = 2\,\alpha_j\,\beta_j. \qquad (31)$$
3 boundary and loading conditions

the shear lag analysis is carried out for loads placed symmetrically on the girder cross-section. thus, assuming the origin of the transverse co-ordinate $y$ to be taken at the mid-width of the flange, i.e. at the axis of symmetry, then, because of the symmetry,

$$c_{3,j} = c_{4,j} = 0, \qquad (32)$$

so that, from equations (28) and (30), the distributions across the flange width of the normal forces in the steel flange and in the concrete layer are governed by:

$${}_s n_j(y) = c_{1,j}\, p_{1,j}(y) + c_{2,j}\, p_{2,j}(y), \qquad (33)$$

$${}_c n_j(y) = \frac{1}{c_j}\bigl[\, c_{1,j}\,(r_j\, p_{1,j}(y) - s_j\, p_{2,j}(y)) + c_{2,j}\,(r_j\, p_{2,j}(y) + s_j\, p_{1,j}(y)) \,\bigr]. \qquad (34)$$

the amplitude function governing the distribution of the shear at the interface between the steel and the concrete can be expressed from equation (21) as

$$f_j(y) = -\frac{k\,l}{j\pi}\left(\frac{{}_s n_j(y)}{t_s E_s} - \frac{{}_c n_j(y)}{t_c E_c}\right), \qquad (35)$$

with ${}_s n_j(y)$ and ${}_c n_j(y)$ given by (33) and (34). it is seen that also this distribution is symmetrical about the flange mid-width.

the values of the remaining constants $c_{1,j}$ and $c_{2,j}$ can be determined from the shear loading conditions at the edges of the steel flange and the concrete layer. combining equations (1), (16) and (18),

$$\frac{\partial\, {}_s q}{\partial y} = f - \frac{\partial\, {}_s n_x}{\partial x} = \sum_{j=1}^{\infty}\left[f_j(y) - \frac{j\pi}{l}\, {}_s n_j(y)\right]\cos\frac{j\pi x}{l}, \qquad (36)$$

so that (by integrating with respect to $y$) the shear flow in the steel flange at any point may be expressed in the form

$${}_s q(x, y) = \sum_{j=1}^{\infty}\bigl[c_{1,j}\, z_{1,j}(y) + c_{2,j}\, z_{2,j}(y)\bigr]\cos\frac{j\pi x}{l}, \qquad (37)$$

where the functions $z_{1,j}(y)$ and $z_{2,j}(y)$ collect the terms of the type $p_{3,j}(y)$ and $p_{4,j}(y)$ arising from the integration of (36), with coefficients composed of $j\pi/l$, $kl/(j\pi)$, $r_j$, $s_j$, $\alpha_j$, $\beta_j$ and the stiffness parameters $t_s E_s$ and $t_c E_c$. similarly, the shear flow in the concrete layer, combining equations (2), (17) and (18), is governed by the relation

$$\frac{\partial\, {}_c q}{\partial y} = -f - \frac{\partial\, {}_c n_x}{\partial x} = -\sum_{j=1}^{\infty}\left[f_j(y) + \frac{j\pi}{l}\, {}_c n_j(y)\right]\cos\frac{j\pi x}{l}. \qquad (38)$$
thus, the shear flow in the concrete layer at any point is expressed in the form

$${}_c q(x, y) = \sum_{j=1}^{\infty}\bigl[c_{1,j}\, z_{3,j}(y) + c_{2,j}\, z_{4,j}(y)\bigr]\cos\frac{j\pi x}{l}, \qquad (39)$$

with the functions $z_{3,j}(y)$ and $z_{4,j}(y)$ again composed of the terms $p_{3,j}(y)$ and $p_{4,j}(y)$.

from simple beam theory, the shear flow $q_e(x)$ transmitted from the web to the edge of the steel flange can be approximated as

$$q_e(x) = V(x)\left(t_s + t_c\,\frac{E_c}{E_s}\right)\frac{b\,e}{2I}, \qquad (40)$$

where $V(x)$ is the total shear force acting on the beam cross-section at position $x$; $I$ is the second moment of area of the composite cross-section (the contribution of the concrete layer being reduced by the ratio $E_c/E_s$), and $e$ is the distance from the cross-sectional neutral axis to the centroid of the composite flange. the shear flow transmitted at the edge of the flange can also be expressed in the form of a fourier series. for the case of simply supported ends, the series takes the form

$$q_e(x) = \sum_{j=1}^{\infty} q_{e,j}\, \cos\frac{j\pi x}{l}, \qquad (41)$$

where

$$q_{e,j} = \frac{2}{l}\int_0^l q_e(x)\cos\frac{j\pi x}{l}\,\mathrm{d}x = \left(t_s + t_c\,\frac{E_c}{E_s}\right)\frac{b\,e}{I\,l}\int_0^l V(x)\cos\frac{j\pi x}{l}\,\mathrm{d}x. \qquad (42)$$

values of the coefficients $q_{e,j}$, evaluated according to this formula, are listed in table 1 for a few typical cases.
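as a worked illustration of eq. (42), for an assumed load case (not necessarily one of the cases from table 1), consider a simply supported span carrying a uniform load $w$, so that $V(x) = w(l/2 - x)$. the integral in (42) then evaluates to

$$\int_0^l V(x)\cos\frac{j\pi x}{l}\,\mathrm{d}x = w\left[\frac{l}{2}\int_0^l \cos\frac{j\pi x}{l}\,\mathrm{d}x - \int_0^l x\cos\frac{j\pi x}{l}\,\mathrm{d}x\right] = \frac{w\,l^2}{j^2\pi^2}\bigl(1 - (-1)^j\bigr),$$

since the first integral vanishes and the second equals $(l^2/j^2\pi^2)\bigl((-1)^j - 1\bigr)$. hence only the odd harmonics contribute,

$$q_{e,j} = \left(t_s + t_c\,\frac{E_c}{E_s}\right)\frac{b\,e}{I\,l}\cdot\frac{2\,w\,l^2}{j^2\pi^2} = \frac{2\,w\,b\,e\,l}{I\,j^2\pi^2}\left(t_s + t_c\,\frac{E_c}{E_s}\right), \qquad j\ \text{odd},$$

and the rapid $1/j^2$ decay of the coefficients indicates why a few harmonics usually suffice in practical calculations.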
it must hold at the joint of the web and the steel flange that

$${}_s q\left(x, \frac{b}{2}\right) = q_e(x) \qquad (43)$$

and at the edge of the concrete layer that

$${}_c q\left(x, \frac{b}{2}\right) = 0, \qquad (44)$$

since the concrete layer is not directly connected to the web. combining relations (37), (39), (41), (43) and (44), it is possible to form two equations to determine the constants $c_{1,j}$ and $c_{2,j}$, which can be written in matrix form as

$$\begin{pmatrix} z_{1,j}(b/2) & z_{2,j}(b/2) \\ z_{3,j}(b/2) & z_{4,j}(b/2) \end{pmatrix}\begin{pmatrix} c_{1,j} \\ c_{2,j} \end{pmatrix} = \begin{pmatrix} q_{e,j} \\ 0 \end{pmatrix}. \qquad (45)$$

4 completion of shear lag analysis

having thus determined the two constants, the amplitudes of the normal longitudinal forces ${}_s n_j$ and ${}_c n_j$ for any particular harmonic can be obtained from equations (33) and (34). the magnitudes of these forces vary across the width of the flange; the peak value of the normal longitudinal force ${}_s n_j(y)$ acting in the steel sheet occurs at the edge, i.e. where $y = b/2$. the distribution of the normal longitudinal force ${}_c n_j(y)$ across the width of the concrete layer may have a more general character. knowing the amplitudes ${}_s n_j$ and ${}_c n_j$, the values of the longitudinal normal forces per unit width ${}_s n_x(x, y)$ and ${}_c n_x(x, y)$ may be determined from equations (16) and (17) for any position on the flange. also the shear flows ${}_s q(x, y)$ and ${}_c q(x, y)$ at any point may be determined from equations (37) and (39). to evaluate the forces taken by the studs, the shear $f$ acting at the interface between the steel flange and the concrete layer is to be determined according to equation (18). the amplitude function $f_j(y)$, describing the distribution of the shear across the flange width, is determined – knowing the constants $c_{1,j}$ and $c_{2,j}$ – by equation (35).

5 summary of calculations

for any particular girder with composite flanges, the first step in the calculation of the shear lag effect is to determine the value of the coefficient $q_{e,j}$ from equation (42). the value $q_{e,j}$ is then substituted into the right-hand side of equations (45) to give the values of the constants $c_{1,j}$ and $c_{2,j}$, from which, for any harmonic, the amplitudes of all the required functions follow. these, in turn, are substituted into equations (16), (17), (18), (37) and (39) to give the normal forces per unit width, the shear acting at the interface between the steel and concrete components, and the shear flows at any position on the composite flange. the corresponding value of the longitudinal stress in the steel component of the flange is then given by

$${}_s\sigma_x(x, y) = \frac{{}_s n_x(x, y)}{t_s} \qquad (46)$$

and the longitudinal stress in the concrete layer is obtained as

$${}_c\sigma_x(x, y) = \frac{{}_c n_x(x, y)}{t_c}. \qquad (47)$$

should the shear stress values also be required, then, having evaluated the shear flows ${}_s q(x, y)$ and ${}_c q(x, y)$ at any point, the shearing stress in the steel is obtained as

$${}_s\tau(x, y) = \frac{{}_s q(x, y)}{t_s} \qquad (48)$$

and the shear stress in the concrete layer is given by

$${}_c\tau(x, y) = \frac{{}_c q(x, y)}{t_c}. \qquad (49)$$

6 conclusions

this paper has described the development of an approximate analytical method for analysing the stress distribution in the flanges of composite steel-concrete beams with deformable connectors. its primary advantage is the closed form of the results obtained and its ease of application. the method is also very suitable for parametric studies investigating the influences of various arrangements, and for optimisation studies. to conclude, it should be noted that – besides the mechanical effects – the thermal effects can also play an important role in the structural performance of steel-concrete composite beams, see, e.g., [2].

7 acknowledgment

support for this research through grants 103/02/1005 and 103/02/0020 from the grant agency of the czech republic is gratefully acknowledged.

table 1: values of the coefficient $q_{e,j}$ for different types of loading

references

[1] křístek v., evans h. r.: “a hand calculation of the shear lag effect in unstiffened flanges and in flanges with closely spaced stiffeners.” civil engng practising design engrs, vol. 4 (1985), no. 2.
[2] římal j.: “measurement of temperature fields in composite steel and concrete bridges.” ctu publishing house, prague, 2003.
[3] křístek v., evans h. r., ahmad m. k. m.: “a shear lag analysis for composite box girders.” j. construct. steel research, vol. 16 (1990).
[4] evans h. r., ahmad m. k. m., křístek v.: “shear lag in composite box girders of complex cross sections.” j. construct. steel research, vol. 24 (1993).
[5] křístek v., studnička j.: “composite girders with deformable connection between steel and concrete.” chap. in composite steel-concrete structures, applied science publishers, elsevier, london, 1988, edited by r. narayanan.
[6] evans h. r., křístek v., škaloud m.: “strengthening steel box girder bridges by controlling the effects of shear lag.” proc. instn. civ. engrs. structs. and bldgs., vol. 104 (nov. 1994).

prof. ing. vladimír křístek, drsc.
phone: +420 224 353 875
e-mail: kristek@fsv.cvut.cz
department of concrete structures and bridges
czech technical university in prague
faculty of civil engineering
thákurova 7
praha 6, czech republic

parallelization of assembly operation in finite element method

michal bošanský, bořek patzák
czech technical university in prague, faculty of civil engineering, thákurova 7, 166 29 prague 6, czech republic
corresponding author: michal.bosansky@fsv.cvut.cz

abstract. efficient codes can take advantage of multiple threads and/or processing nodes to partition the work that can be processed concurrently. this can reduce the overall run-time or make the solution of a large problem feasible. this paper deals with the evaluation of different parallelization strategies of assembly operations for global vectors and matrices, which are one of the critical operations in any finite element software. different assembly strategies for systems with a shared memory model are proposed and evaluated, using open multi-processing (openmp), portable operating system interface (posix), and c++11 threads. the considered strategies are based on simple synchronization directives, various block locking algorithms and, finally, on smart locking-free processing based on a colouring algorithm. the different strategies were implemented in a free finite element code with object-oriented architecture oofem [1].

keywords: parallel computation, shared memory, finite element method, vector assembly, matrix assembly.

1. introduction

current development in computer hardware brings new opportunities in numerical modelling. the solutions of many engineering problems are extremely demanding, both in terms of time and computational resources. traditional serial computers permit simulation codes to run only sequentially on a single processing unit, where only one instruction can be processed at any moment in time. the current trend in technology is parallel processing, relying on the simultaneous use of multiple processing units to solve a given problem. any parallel algorithm is based on the idea of partitioning the overall work into a set of smaller tasks, which can be solved concurrently by a simultaneous use of multiple computing resources. parallel processing allows to concurrently perform tasks on a single computer with multiple processing units or even on multiple computers with multiple processing units. parallelization can significantly reduce the computational time by a more efficient use of the available hardware. it can also allow solving large problems that often do not fit a single machine. parallel programming requires the development of new techniques and relies on programming language extensions and the use of communication libraries. the principal issue in parallel assembly is to prevent race conditions, where the same memory location is to be updated by multiple threads. parallel computers can be classified, for example, by the type of memory architecture [2].
shared, distributed, and hybrid memory systems exist. in a shared memory system, the main memory with a global address space is shared between all processing units, which can directly address and access the same global memory. the global memory significantly facilitates the design of the parallel program, as the individual tasks can communicate via the global memory, but the memory bus performance is the limiting factor for the scalability of these systems. systems with distributed memory are built by connecting individual processing units equipped with a local memory. there is no global address space, and the individual tasks running on different processing units have to communicate using messages. the design and implementation of parallel algorithms is typically more demanding; however, these systems do not suffer from the scalability limitations of shared memory systems. finally, hybrid systems try to combine the advantages of shared and distributed systems by combining multi-core processing units with a local memory into powerful systems. the most powerful computers today are built using this paradigm [3].

the finite element method (fem) is one of the most popular methods to solve various problems in engineering. it actually consists of a broad spectrum of methods for finding an approximate solution to boundary value problems described by a system of partial differential equations. the differential equations are converted to an algebraic system of equations using variational methods. once the integral form is set up, the discretization of the problem domain into non-overlapping subdomains called elements is introduced to define the approximation and test functions. in structural mechanics, the resulting algebraic equations correspond to the discrete equilibrium equations at element nodes. in matrix notation, the resulting equilibrium equations for a linear static problem have the following form

$$K\,r = f, \qquad (1)$$

where $K$ and $f$ are the global stiffness matrix and load vector, respectively, and $r$ is the vector of unknown nodal displacements. the global stiffness matrix $K$ and the global load vector $f$ are assembled from individual element and nodal contributions. this process relies on the global numbering of equations, where the contributions of individual elements are assembled (added) to the global matrix/vector values according to the global equation numbers of the nodal unknowns. the typical sequential vector assembly algorithm is outlined in algorithm 1.

algorithm 1: prototype code for the assembly of the load vector

    for elem = 1, nelem
      fe = computeElementVector(elem)
      loce = giveElementCodeNumbers(elem)
      for i = 1, size(loce)
        f(loce(i)) += fe(i)

the typical sequential algorithm for the matrix assembly is similar to the algorithm of the vector assembly, see algorithm 2 for a reference.

algorithm 2: prototype code for the assembly of the stiffness matrix

    for elem = 1, nelem
      ke = computeElementMatrix(elem)
      loce = giveElementCodeNumbers(elem)
      for i = 1, size(loce)
        for j = 1, size(loce)
          k(loce(i), loce(j)) += ke(i, j)

as already mentioned, the assembly operations are one of the typical steps in the finite element analysis. in order to obtain a scalable algorithm, all the steps have to be parallelized, including the assembly phase, which could be costly.
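a minimal c++ rendering of algorithm 1 may look as follows. the dense global vector and the two element routines are simplifications introduced here for illustration; they do not reproduce the actual oofem interfaces.

    #include <cstddef>
    #include <vector>

    // hypothetical element interface, standing in for the fe code's api
    std::vector<double> computeElementVector(std::size_t elem);
    std::vector<std::size_t> giveElementCodeNumbers(std::size_t elem);

    // sequential load-vector assembly (algorithm 1): element contributions
    // are scattered into the global vector via the element code numbers
    void assembleLoadVector(std::vector<double>& f, std::size_t nelems)
    {
        for (std::size_t elem = 0; elem < nelems; ++elem) {
            const std::vector<double> fe = computeElementVector(elem);
            const std::vector<std::size_t> loce = giveElementCodeNumbers(elem);
            for (std::size_t i = 0; i < loce.size(); ++i)
                f[loce[i]] += fe[i];   // update of the global entry
        }
    }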
particularly when solving nonlinear problems, the evaluation of the individual element contributions (tangent stiffness matrix, internal force vector) can be computationally demanding and has to be performed for every load increment step and every iteration (the frequency of the stiffness matrix update depends on the actual solution algorithm). in general, the parallelization of the assembly operation consists in splitting the assembly loop over elements into disjoint subsets, which are processed by individual processing threads. within each thread, the individual element contribution is evaluated for each element in the subset (this part can be evaluated concurrently), followed by the assembly of the local contribution to the target global matrix/vector, see the simple demonstration in figure 1. the key problem is that this step cannot be performed concurrently, as multiple threads may update the same global entry at the same time. therefore, it is necessary to ensure that the same global entry is not being updated at the same time by multiple threads. this is known as a race condition. to prevent the race condition on update, various techniques can be used. they typically consist in using locking primitives to make sure that (i) the code performing the update can be executed only by a single thread, (ii) the specific memory location can be updated only by a single thread, or (iii) the evaluation of the element contributions is ordered in such a way that the conflict cannot occur.

one of the important characteristics of a parallel algorithm is its scalability. unfortunately, the ideal, linear scalability is difficult to obtain. almost every parallel algorithm has an overhead cost when compared to the sequential version. the individual tasks cannot be executed concurrently without synchronization and communication with other tasks. finally, some parts of the algorithm are essentially serial and have to be executed only by a single thread. different parallel assembly approaches have been presented, for example, in [4], where an approach based on openmp critical sections and openmp atomic directives was proposed.

2. shared memory frameworks

in this section, different strategies for the parallel vector/matrix assembly are presented. they use different techniques to prevent the race condition on the data update. these strategies are implemented using the different shared memory programming models available, including the open multi-processing (openmp), portable operating system interface (posix) threads, and c++11 threads programming interfaces.

2.1. openmp

openmp is a shared memory programming model that supports multi-platform shared memory multiprocessing programming in the c, c++, and fortran programming languages. it is available for a wide variety of processor architectures and operating systems. it consists of a set of compiler directives, library routines, and environment variables that influence the run-time behaviour, see [5] for details. programming in openmp consists in using so-called parallel constructs (compiler directives), which are inserted in the source code, instructing the compiler to generate a specific code. openmp defines various constructs allowing to parallelize serial code and synchronize the individual threads [6].
to reduce the granularity of the problem and reduce the overhead connected to thread creation and termination, it is usual to parallelize the outermost loops in the algorithm, which, in our particular case, corresponds to the parallelization of the loop over elements in the assembly operation.

figure 1. parallel algorithm for stiffness matrix assembly schema

2.1.1. synchronization using critical section, atomic update, simple lock, nested lock

in this subsection, we start with a simple solution preventing the occurrence of race conditions during the data update, based on the critical section, atomic update, simple lock, and nested lock constructs. the synchronization using the critical section is implemented in three variants. in the first variant (marked as a1), the whole assembly operation consisting of the loop over the element code numbers (row and column numbers of the global matrix/vector) is enclosed in a critical section. the second variant encloses only the actual update operation (a2). the last variant (a3) is similar to (a2), but uses the atomic update, see algorithm 3.

algorithm 3: prototype code for the matrix assembly with explicit synchronization

    # pragma omp parallel for private(ke, loce)
    for elem = 1, nelem
      ke = computeElementMatrix(elem)
      loce = giveElementCodeNumbers(elem)
      # pragma omp critical        (a1)
      for i = 1, size(loce)
        for j = 1, size(loce)
          # pragma omp critical    (a2)
          # pragma omp atomic      (a3)
          k(loce(i), loce(j)) += ke(i, j)

a similar approach is followed to synchronize the threads using locks. the lock is used to ensure that either only a single thread can process the loop over the element code numbers (a1.1, or a2.1 using the nested lock), or that only a single thread performs the actual update operation (a1.2), see algorithm 4.

algorithm 4: prototype code for the matrix assembly with explicit locks

    omp_init_lock(&my_lock)                 (a1.1) (a1.2)
    omp_init_nest_lock(&my_nest_lock)       (a2.1)
    # pragma omp parallel for private(ke, loce)
    for elem = 1, nelem
      ke = computeElementMatrix(elem)
      loce = giveElementCodeNumbers(elem)
      omp_set_lock(&my_lock)                (a1.1)
      omp_set_nest_lock(&my_nest_lock)      (a2.1)
      for i = 1, size(loce)
        for j = 1, size(loce)
          omp_set_lock(&my_lock)            (a1.2)
          k(loce(i), loce(j)) += ke(i, j)
          omp_unset_lock(&my_lock)          (a1.2)
      omp_unset_lock(&my_lock)              (a1.1)
      omp_unset_nest_lock(&my_nest_lock)    (a2.1)
    omp_destroy_lock(&my_lock)              (a1.1) (a1.2)
    omp_destroy_nest_lock(&my_nest_lock)    (a2.1)

the approach followed in this section is rather conservative, ensuring that only a single thread can perform any update operation. in reality, the race condition on the data update can happen only if two or more threads are attempting to update the same entry of the global vector/matrix. this essentially means that these threads are assembling the contributions of elements sharing the same node(s). the probability of this can be relatively low, so, in the next sections, we try to propose improved algorithms that allow performing the update operation in parallel, provided that different entries of the global vector/matrix are updated.
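for concreteness, the a3 (atomic) variant can be written in c++ with openmp as sketched below, assuming a dense matrix and the same hypothetical element interface as in the earlier sketch; a real fe code would operate on a sparse storage instead.

    #include <cstddef>
    #include <vector>

    // assumed element routines: ke is returned in row-major order
    std::vector<double> computeElementMatrix(std::size_t elem);
    std::vector<std::size_t> giveElementCodeNumbers(std::size_t elem);

    // variant a3: the evaluation of ke runs fully in parallel, and only
    // the individual scalar updates of k are protected by an atomic
    void assembleMatrixAtomic(std::vector<std::vector<double>>& k,
                              std::size_t nelems)
    {
        #pragma omp parallel for
        for (std::size_t elem = 0; elem < nelems; ++elem) {
            const std::vector<double> ke = computeElementMatrix(elem);
            const std::vector<std::size_t> loce = giveElementCodeNumbers(elem);
            const std::size_t n = loce.size();
            for (std::size_t i = 0; i < n; ++i)
                for (std::size_t j = 0; j < n; ++j) {
                    #pragma omp atomic
                    k[loce[i]][loce[j]] += ke[i * n + j];  // race-free update
                }
        }
    }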
2.1.2. synchronization using block locks

the algorithm presented in this section is based on the idea of having an array of locks, each corresponding to a consecutive block of values in the global vector/matrix. once a specific global value is to be updated by a specific thread, the corresponding lock is acquired, preventing other threads from updating values in the same block, but allowing other threads to update values in the other blocks. it may seem that the ideal situation is to have a unique lock for every global value, but as the global vectors/matrices are the dominant data structures (in terms of memory requirements) in a typical fe code, this approach is not feasible. in the presented approach, the individual groups correspond to blocks of rows of the global vector/matrix values, and the prototype implementation is presented in algorithm 5.

algorithm 5: prototype code for the matrix assembly with explicit block locks

    # define nblocks
    omp_lock_t my_lock[nblocks]
    for n = 1, nblocks
      omp_init_lock(&my_lock[n])
    blocksize = size(k.rows)/nblocks
    # pragma omp parallel for private(ke, loce)
    for elem = 1, nelem
      ke = computeElementMatrix(elem)
      loce = giveElementCodeNumbers(elem)
      for i = 1, size(loce)
        bi = loce(i)/blocksize    // integer division
        for j = 1, size(loce)
          omp_set_lock(&my_lock[bi])
          k(loce(i), loce(j)) += ke(i, j)
          omp_unset_lock(&my_lock[bi])
    for n = 1, nblocks
      omp_destroy_lock(&my_lock[n])

2.2. posix threads

the posix threads (pthreads) libraries are a standardized thread programming interface for the c/c++ language. pthreads allows one to spawn new concurrent threads. pthreads consists of a set of c/c++ language types and procedure calls, see [7] for further details. the posix threads algorithms are based on distributing the element set into contiguous subsets assigned to the individual threads.

2.2.1. synchronization using simple mutex and recursive mutex

the posix threads synchronization routines allow to protect shared data when multiple threads update the data. the concept is very similar to the locks in the openmp library. in posix threads, only a single thread can lock a mutex variable at any given time. in the case where several threads try to lock a mutex, only one thread will be successful. no other thread can own that mutex until the owning thread unlocks it. a simple mutex can be locked only once by a single thread: attempting to relock the mutex (trying to lock the mutex after a previous lock) causes a deadlock, and an attempt to unlock a simple mutex that the thread has not locked leads to an undefined behaviour.

a recursive mutex maintains the concept of a lock count. when a thread successfully acquires a recursive mutex for the first time, the lock count is set to one; each time the thread relocks it, the lock count is incremented, and each time the thread unlocks the recursive mutex, the lock count is decremented by one. when the lock count reaches zero, the recursive mutex becomes available for other threads to acquire.

in this section, the prototype algorithms are presented using simple and recursive mutexes to prevent the race condition on the data update. three variants are considered. the first, marked as b1.1, uses a simple mutex to protect the loop over the code numbers; the second, marked as b2.1, uses a recursive mutex, again protecting the loop; and the last one protects just the update operation using a simple mutex, marked as b1.2. all variants are illustrated in algorithm 6.
algorithm 6: prototype code for the matrix assembly with explicit synchronization using simple and recursive posix mutexes

    void assemblyElementMatrix( ... )
      for elem = 1, nelem
        ke = computeElementMatrix(elem)
        loce = giveElementCodeNumbers(elem)
        pthread_mutex_lock(&mutex_sim)        (b1.1)
        pthread_mutex_lock(&mutex_rec)        (b2.1)
        for i = 1, size(loce)
          for j = 1, size(loce)
            pthread_mutex_lock(&mutex_sim)    (b1.2)
            k(loce(i), loce(j)) += ke(i, j)
            pthread_mutex_unlock(&mutex_sim)  (b1.2)
        pthread_mutex_unlock(&mutex_sim)      (b1.1)
        pthread_mutex_unlock(&mutex_rec)      (b2.1)

2.3. synchronization using colouring algorithm

as already discussed in the previous sections, the conservative strategy of always protecting the update operation may not lead to optimal results. it enforces the serial execution of the update operation for the selected values, regardless of whether there is a real conflict or not. the fact that it prevents a parallel execution can have a significant impact on the scalability. this problem has been partially addressed by the algorithm using an array of locks, preventing the update to a block of rows of values. in this section, we present an alternative approach, which is based on the idea of assigning the individual elements into groups, where the elements in a group do not share any node. this essentially means that the elements in a group can be processed concurrently, as only distinct values of the global vector/matrix can be updated during the assembly operation. after determining the element distribution into the groups, the algorithm loops over the groups and each group is processed in parallel. the objective is to keep the number of groups minimal. this is known as the colouring algorithm in graph theory [8].

in this paragraph, an introduction to the vertex colouring algorithm is given. consider a graph of n mutually connected vertices (representing the fe elements). the edges (connections) represent the element connectivity, i.e., an edge between two vertices represents the case when two elements share the same node. the task is to assign a "colour" to each vertex under the condition that no neighbour has the same colour, and to keep the number of "colours" minimal. the algorithm for the greedy colouring of a graph is the following:

    (1) loop over the elements e = 1, ne
      (a) for the element e, find the set C of colours assigned to the neighbours of element e:
          loop over the nodes of element e, i = 1, n
            C = C + FN(i)
      (b) find an unused available colour not in C and assign it to element e:
          loop over the available colours, c = 1, m
            if c is not in C then EC(e) = c

where EC is the array of element colours and FN(i) is the function returning the set of colours assigned to the elements sharing the node i.

the computational cost of the greedy colouring algorithm depends heavily on the vertex ordering. in the worst case, the behaviour is poor and the colouring can take a lot of computation time. however, the graph construction and the graph colouring have to be performed only once, during the fe code initialization, and after that the colouring can be reused in any assembly operation.
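a compact c++ sketch of the greedy colouring described above is given below; the element-node connectivity layout (a vector of node lists per element) is an assumption made for illustration, not the data structure used in oofem.

    #include <cstddef>
    #include <vector>

    // greedy element colouring: elements sharing a node never get the same
    // colour, so all elements of one colour can be assembled concurrently
    std::vector<int> colourElements(
        const std::vector<std::vector<std::size_t>>& elemNodes,
        std::size_t nnodes)
    {
        const std::size_t ne = elemNodes.size();
        std::vector<int> colour(ne, -1);              // the array EC
        // colours already used by the elements attached to each node (FN)
        std::vector<std::vector<int>> nodeColours(nnodes);

        for (std::size_t e = 0; e < ne; ++e) {
            std::vector<bool> used(ne, false);        // colours taken by neighbours
            for (std::size_t n : elemNodes[e])
                for (int c : nodeColours[n])
                    used[c] = true;
            int c = 0;
            while (used[c]) ++c;                      // smallest free colour
            colour[e] = c;
            for (std::size_t n : elemNodes[e])
                nodeColours[n].push_back(c);
        }
        return colour;
    }

grouping the element indices by the returned colours then yields the colour group structure used in algorithms 7 and 8 below.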
As already noted, the colouring algorithm splits the elements into groups marked by different colours, keeping the number of colours small (a greedy colouring does not, in general, guarantee the true minimum). Once the colouring is available, the assembly algorithm consists of an outer loop over the individual colour groups and an inner loop over the individual elements of a group. Inside the inner loop, the element contributions are evaluated and assembled into the global vector/matrix. The key point is that the inner loop can be parallelized (individual members of a group can be processed by different threads) without any need for synchronization, as the colouring ensures that a race condition on the update cannot occur.

Even though this is appealing, the algorithm has its overhead. This includes the already discussed need to establish the colouring. Moreover, only the inner loop can be parallelized: all threads should finish processing the elements of one colour before the elements of the next colour are processed, and this requires synchronization. Finally, there is also an overhead connected to the creation and termination of threads for each colour, as the individual threads are created and terminated inside the outer loop over the individual colours; this overhead is typically very small compared to typical assembly times for real problems, so it has only a minor impact on the overall cost.

The prototype code for the colouring-based assembly is presented in Algorithm 7 using OpenMP directives. The additional arguments of the parallel loop directive are used to declare some variables as shared (i.e., each thread accesses the same variable) or private (each thread has its own copy of that variable). The for-loop clause allows a shared variable to be accumulated without an explicit synchronization.

Algorithm 7: Prototype code — matrix assembly without explicit synchronization, using the colouring algorithm (OpenMP)

    for ii = 0, number_of_colours
        #pragma omp parallel for private(Ke, loce)
        for ie = 1, size(colour_group[ii])
            elem = colour_group[ii][ie]
            Ke   = computeElementMatrix(elem)
            loce = giveElementCodeNumbers(elem)
            for i = 1, size(loce)
                for j = 1, size(loce)
                    K(loce(i), loce(j)) += Ke(i, j)

The POSIX Threads variant of the colouring-based assembly algorithm is presented in Algorithm 8.

Algorithm 8: Prototype code — matrix assembly without explicit synchronization, using the colouring algorithm (POSIX Threads)

    threads = new pthread_t[num_threads]
    for ii = 0, number_of_colours
        for jj = 0, num_threads - 1
            psize      = size(colour_group[ii]) / num_threads
            start_indx = jj * psize
            end_indx   = start_indx + psize
            pthread_create(&threads[jj], NULL, assemblyElementMatrix, ii, start_indx, end_indx)
        for jj = 0, num_threads - 1
            pthread_join(threads[jj], NULL)

    void assemblyElementMatrix( ... )
        for ie = start_indx, end_indx
            elem = colour_group[ii][ie]
            Ke   = computeElementMatrix(elem)
            loce = giveElementCodeNumbers(elem)
            for i = 1, size(loce)
                for j = 1, size(loce)
                    K(loce(i), loce(j)) += Ke(i, j)

2.4. C++11 threads

Alongside the well-established OpenMP and POSIX Threads, the C++11 standard has introduced native C++ thread support [9]. The C++11 thread library provides standardized utilities for creating and managing threads; the C++11 standard library contains classes for thread manipulation and synchronization, for protecting shared data, and for low-level atomic operations. A parallel program based on the C++11 standard library is constructed by defining a function that is to be executed by a thread, and then starting the new thread.
Synchronization in the C++11 standard is achieved by the classical synchronization mechanisms, such as mutex objects and condition variables, and by further mechanisms, such as locks and control features used when threads exchange computational data.

2.4.1. C++11 threads — simple lock

In this paragraph, multitasking synchronization using the mutex class is presented. A mutex can be used to protect shared data from being simultaneously accessed by multiple threads. The synchronization is enforced around the assembly loop, as illustrated in Algorithm 9.

Algorithm 9: Prototype code — matrix assembly with explicit synchronization using simple locks (C++11 threads)

    std::mutex mtx
    threads = new std::thread[number_of_threads]
    for i = 0, number_of_threads
        threads[i] = std::thread(assemblyElementMatrix, ...)
    for i = 0, number_of_threads
        threads[i].join()

    void assemblyElementMatrix( ... )
        for elem = 1, nelem
            Ke   = computeElementMatrix(elem)
            loce = giveElementCodeNumbers(elem)
            mtx.lock()
            for i = 1, size(loce)
                for j = 1, size(loce)
                    K(loce(i), loce(j)) += Ke(i, j)
            mtx.unlock()
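Algorithm 9 is again pseudocode. A self-contained C++11 sketch of the same pattern is given below; it is our illustration rather than the OOFEM code, the element data are trivial placeholders, and std::lock_guard is used instead of the explicit lock()/unlock() pair.

    // Minimal compilable C++11 illustration of the simple-lock assembly:
    // one global mutex guards the update of the shared "matrix".
    #include <mutex>
    #include <thread>
    #include <vector>

    static std::mutex mtx;
    static std::vector<double> K;   // global matrix, dense and tiny for the example

    void assemblyElementMatrix(int elemBegin, int elemEnd, int ndof) {
        for (int e = elemBegin; e < elemEnd; ++e) {
            int loce[2] = { e % ndof, (e + 1) % ndof };      // placeholder code numbers
            double Ke[2][2] = { {2.0, -1.0}, {-1.0, 2.0} };  // placeholder element matrix
            std::lock_guard<std::mutex> guard(mtx);          // RAII lock around the update
            for (int i = 0; i < 2; ++i)
                for (int j = 0; j < 2; ++j)
                    K[loce[i] * ndof + loce[j]] += Ke[i][j];
        }
    }

    int main() {
        const int ndof = 100, nelem = 1000, nthreads = 4;
        K.assign(ndof * ndof, 0.0);
        std::vector<std::thread> threads;
        const int chunk = nelem / nthreads;
        for (int t = 0; t < nthreads; ++t)
            threads.emplace_back(assemblyElementMatrix, t * chunk,
                                 (t == nthreads - 1) ? nelem : (t + 1) * chunk, ndof);
        for (auto& th : threads) th.join();
        return 0;
    }

The RAII-style guard releases the mutex automatically at the end of each iteration's scope, even if the loop body throws, which the explicit unlock in Algorithm 9 would not guarantee.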
3. Performance evaluation

In this section, the performance and efficiency of the presented approaches are compared for both the matrix and the vector assembly operations using two benchmark problems. The first benchmark problem is a 3D model of a nuclear containment, marked as jete and shown in Figure 2. The second benchmark problem is a model of a 3D porous micro-structure, marked as micro and shown in Figure 3.

Figure 2: The benchmark problem of the 3D finite element model of a nuclear containment (jete).

Figure 3: The benchmark problem of the 3D finite element model of a porous micro-structure cube (micro).

    Name        nNodes   nElems   nEqs
    jete250k    87k      67k      260k
    jete3m      899k     1M       3M
    micro250k   85k      80k      256k
    micro3m     1M       970k     3M

Table 1: Discretizations of the benchmark problems considered, where nNodes is the number of nodes, nElems the number of elements, and nEqs the number of equations.

Different discretizations were generated for both benchmark problems with an increasing number of elements, see Table 1. Tetrahedral elements with a linear approximation were used in the case of the nuclear containment benchmark, while a structured grid of brick elements with linear interpolation was used for the micro-structure benchmark. The porous micro-structure consists of two phases, representing the material and the voids, with different sets of elastic constants. In both cases, a linear structural analysis was performed, and the structures were loaded by self-weight, which implies a nonzero contribution of every element to the external load vector. The benchmark problems are characterized by a different sparsity of the system matrix: the model of the porous micro-structure has significantly more nonzero members than the model of the nuclear containment. For example, the number of nonzero members in the stiffness matrix is 433M in the case of jete3m, while the porous micro-structure model micro3m has 1528M nonzero entries, the number of unknowns being similar.

All the presented parallelization approaches have been implemented in the OOFEM finite element code [1]. The object-oriented C++ OOFEM code was compiled with the g++ 4.5.3.1 compiler using the optimization flag -O2. The tests were executed on a Linux workstation (running the Ubuntu 14.04 OS) with two Intel(R) Xeon(R) E5-2630 v3 CPUs @ 2.40 GHz and 132 GB RAM. Each CPU consists of eight physical and sixteen logical cores, allowing up to thirty-two threads to run simultaneously on the workstation. All the tests fit into the system memory.

For each benchmark problem, the individual strategies were executed on an increasing number of processing cores, and the speedups (with respect to a serial single-CPU execution) were evaluated as an average over three consecutive runs. The performance of the individual strategies for the vector assembly (in terms of the achieved speedup versus the number of processors) is presented for the jete250k test in Figure 4 and for the matrix assembly in Figure 8. Similarly, for the jete3m test, the results are presented in Figures 5 and 9, for the micro250k test in Figures 6 and 10, and finally for the micro3m test in Figures 7 and 11. The following notation is used in the above-mentioned figures to distinguish the individual parallelization strategies implemented: OpenMP with critical sections (OMP CS), OpenMP with simple locks (OMP L), OpenMP with nested locks (OMP NL), OpenMP with critical sections only in the update operation of the global matrix/vector (OMP LCS), OpenMP with the atomic update directive only in the update operation (OMP LATO), OpenMP with simple locks only in the update operation (OMP LL), OpenMP using blocks of simple locks (OMP B50, OMP B500, OMP B10^5), OpenMP based on colouring (OMP CP), POSIX Threads based on colouring (PTH CP), POSIX Threads with simple mutexes (PTH M), POSIX Threads with recursive mutexes (PTH RM), and finally C++11 threads with simple mutexes (THR M).

The assembly process of the matrix/vector is composed of the evaluation of the local matrix/vector followed by its assembly into the global matrix/vector. The execution times of these two operations, together with the total execution time, are presented in Table 2 for a selected benchmark problem. Note that the complexity of the greedy colouring algorithm is linearly proportional to the number of elements, and it is thus the same as the complexity of the assembly algorithm itself; however, as already mentioned, the colouring has to be determined only once and can be reused in all subsequent assembly steps. Theoretically, the algorithms that use colouring strategies should scale linearly up to the limit imposed by the memory bandwidth, but the implementations tested in this paper could not achieve these performance levels; this requires further investigations that are beyond the scope of this paper.

The results for the vector assembly show a very similar trend for all strategies: they yield approximately the same results in terms of scalability and speedups. From the results, it is evident that the speedups are far from ideal. For example, in the case of the jete3m test, the speedups for 16 CPUs are in the range of 2.5–4. There are several possible reasons why only a sub-optimal scalability was obtained. First, the individual tasks are not independent: the update of a global entry is a part that can be processed only by one thread at a time. Second, there is an additional overhead connected with the parallel algorithm (thread creation and management, synchronization) that is not present in the serial version. Third, the individual threads share common resources (the memory bus), which do not scale appropriately in performance. Moreover, one can observe a performance drop after reaching the limit of 16 threads.
This can be attributed to the hyper-threading technology specific to Intel processors [10], which shares some of the CPU resources (execution engine, caches, and bus interface) between the hyper-threaded cores. The trend is more pronounced on the larger tests (jete3m and micro3m). The POSIX Threads and C++11 threads implementations show a better performance than the OpenMP versions, particularly for smaller numbers of threads.

The results for the sparse matrix assembly show different trends. Some strategies (OMP-LCS, OMP-LL, OMP-B50, OMP-B500, OMP-B10^5) were not even able to reach the performance of the serial algorithm, and their speedups are less than 1. The colouring-based strategies (OMP-CP, PTH-CP) have a similar speedup trend for the jete250k and micro3m benchmark problems. However, the colouring-based strategy PTH-CP clearly shows a better scalability than OMP-CP on the jete3m benchmark, while the opposite trend can be observed on the micro250k benchmark (OMP-CP clearly scales better than PTH-CP). Overall, the implementations based on the colouring algorithm do not perform well, which is a somewhat surprising observation. This can be (partially) explained by so-called false sharing. A typical SMP system has a local data cache organized hierarchically in several levels [11], and the system guarantees cache coherence. False sharing occurs when one core modifies a memory location that also resides on a cache line of a different core: the cache lines on the other cores are invalidated, forcing them to be re-read every time any other thread has updated memory in the same cache line. False sharing seems to have a much bigger impact on the sparse matrix assembly performance than on the vector assembly performance.

The false sharing effect was confirmed in our case using the valgrind tool with monitoring of memory management and threading bugs. The benchmark problem jete250k, using OMP-CS or OMP-CP with 32 computational threads, serves as a representative example for illustrating the false sharing effect; the outputs from valgrind, listing the cache accesses for data, are presented in Table 3. The parameter "D refs" represents the number of data fetches (reads and writes combined) in the system cache memory, the parameter "D1 misses" represents the number of data fetches missing in the L1 cache, and the last parameter, "DLL misses", represents the instruction and data fetches missing at the last-level cache (L3). From the reported results, one can see a slight increase in the number of cache misses in the case of the colouring algorithm compared to the approach using the synchronization based on OpenMP critical sections.

    Number of threads                      1      2      4      8      12     16     32
    Total time [s]                         36.7   23.81  13.83  11.70  13.51  12.34  16.59
    Evaluation of local matrix [s]         13.83  7.07   3.61   1.73   1.23   0.93   0.465
    Localization into global matrix [s]    22.87  16.74  10.22  9.97   12.28  11.41  16.225

Table 2: Matrix assembly total times, split into the evaluation of the local matrices and their localization into the global matrix, for the jete3m benchmark problem.

Figure 4: Speedups of the external force vector assembly for benchmark jete250k (speedup versus number of threads; Intel Xeon, 32 computing units).
Figure 5: Speedups of the external force vector assembly for benchmark jete3m.

Figure 6: Speedups of the external force vector assembly for benchmark micro250k.

Figure 7: Speedups of the external force vector assembly for benchmark micro3m.

To get a further insight into the problem, the Intel VTune Amplifier tool was used to detect false sharing [12]. Its memory access analysis uses hardware event sampling to collect the data for the metric. This monitoring reveals the growing importance of the false sharing effect with an increasing number of threads (particularly for 16 and more threads, when the Intel hyper-threading technology is active). This is demonstrated by the growing memory bound (showing the fraction of cycles spent waiting due to load or store operations on each cache level) and the growing average memory latency (average load latency in cycles).

                  OMP-CS            OMP-CP
    D refs        547 165 746 467   558 299 751 110
    D1 misses     42 487 418 875    43 071 933 527
    DLL misses    16 625 383 973    16 650 025 976

Table 3: Number of cache data misses for the benchmark problem jete250k.

Figure 8: Speedups of the sparse matrix assembly for benchmark jete250k.

Figure 9: Speedups of the sparse matrix assembly for benchmark jete3m.

The individual speedups with and without synchronization for different scheduling options for a selected benchmark problem (jete3m), using the critical section synchronization (A1), are presented in Figures 12 and 13. The results are quite interesting. From Fig. 12 (matrix assembly), one can clearly see that all the executions with synchronization outperform the ones without it. In our opinion, this indicates that the synchronization overhead is more than balanced by the better performance, which is most likely due to a single memory access at a specific time (ensured by the critical section); this again can be attributed to false sharing.
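A standard remedy for false sharing of this kind — not investigated in the paper — is to pad each thread's frequently written data to a full cache line, as in the following minimal, hypothetical illustration:

    // Illustrative only: padding per-thread counters to the cache-line size
    // prevents neighbouring threads from invalidating each other's cache
    // lines on every update.
    #include <algorithm>
    #include <cstdio>
    #include <omp.h>

    constexpr int CACHE_LINE  = 64;   // typical x86 cache-line size (assumption)
    constexpr int MAX_THREADS = 64;

    struct PaddedCounter {
        alignas(CACHE_LINE) double value = 0.0;  // each counter owns a full line
    };

    static PaddedCounter partial[MAX_THREADS];

    int main() {
        const int nthreads = std::min(omp_get_max_threads(), MAX_THREADS);
        #pragma omp parallel num_threads(nthreads)
        {
            const int t = omp_get_thread_num();
            for (long k = 0; k < 1000000L; ++k)
                partial[t].value += 1.0;          // no false sharing between threads
        }
        double total = 0.0;
        for (int t = 0; t < nthreads; ++t) total += partial[t].value;
        std::printf("total = %g\n", total);
        return 0;
    }

Without the alignas padding, several counters would share one cache line and every update would ping-pong that line between cores, which is precisely the effect measured by the cache-miss counts in Table 3.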
At the same time, the results demonstrate that the effect of scheduling is negligible for the executions without synchronization (symbols without fill) and only partially relevant for the executions with synchronization (filled symbols); overall, the effect of scheduling is relatively small. This indicates that the limitation in memory bandwidth is the main reason for the sub-optimal scalability obtained. The results for the vector assembly (Fig. 13) are different, as the differences between the speedups with and without synchronization are much smaller; this is most likely due to the less complex memory access pattern in the case of the vector assembly. Similarly to the matrix assembly, the results demonstrate that the effect of scheduling, as well as the effect of the different scheduling options, is negligible. The results again confirm the significant role of the limited memory bandwidth. The POSIX Threads and C++11 threads implementations performed best for lower numbers of threads, but overall the OpenMP implementations (OMP-L, OMP-NL) and OMP-CS performed the best. In [4], the authors reported a speedup of 11 for 12 threads; however, those results were achieved on a slightly different architecture, where the computational node consisted of two Intel Xeon X5650 2.66 GHz processors with 6 cores each and results for 12 threads were given, so they were not affected by hyper-threading.

Figure 10: Speedups of the sparse matrix assembly for benchmark micro250k.

Figure 11: Speedups of the sparse matrix assembly for benchmark micro3m.

4. Conclusions

The paper evaluates different parallelization strategies for the right-hand side vector and stiffness matrix assembly operations, which are among the critical operations in any finite element software. The strategies use different techniques to ensure consistency and are implemented using the OpenMP, POSIX Threads and C++11 threads libraries. The performance of the individual strategies and libraries is evaluated using two benchmark problems (a 3D structural analysis of a nuclear containment and of a 3D micro-structure), each with two different mesh sizes. For the particular benchmark cases considered, the performance of nearly all strategies has been much better than the performance of the serial algorithm, with a relatively good scalability; however, in the case of the matrix assembly, considerable differences exist, and the presented work provides an insight into how to select the optimal strategy. The achieved results clearly show a performance drop on systems with hyper-threading technology when the number of processes exceeds the number of physical cores. Somewhat disappointing are the results of the assembly based on the colouring approach, which did not perform as expected, the performance most likely being affected by the false-sharing phenomenon.
The main conclusion of this study is that the performances of the individual libraries are comparable, but the performances of the individual strategies differ, often significantly.

Figure 12: Speedups of the sparse matrix assembly with and without synchronization for different scheduling options, for benchmark jete3m using the critical section synchronization (A1).

Figure 13: Speedups of the external force vector assembly with and without synchronization for different scheduling options, for benchmark jete3m using the critical section synchronization (A1).

In general, the presented paper illustrates the potential of parallel assembly operations and the importance of benchmarking, allowing an optimal strategy to be identified.

Acknowledgements

This work was supported by the Grant Agency of the Czech Technical University in Prague, grant No. SGS16/038/OHK1/1T/11, Advanced algorithms for numerical modeling in mechanics of structures and materials. The second author was supported by OP VVV, Research Center for Informatics, CZ.02.1.01/0.0/0.0/16019/0000765.

References
[1] B. Patzak. OOFEM. http://www.oofem.org, 2000.
[2] C. Hughes, T. Hughes. Parallel and Distributed Programming Using C++. Pearson Education, 2003.
[3] H. Meuer, E. Strohmaier, J. Dongarra, et al. TOP500 — The List. https://www.top500.org/.
[4] P. Jarzebski, K. Wisniewski, R. L. Taylor. On parallelization of the loop over elements in FEAP. Computational Mechanics 56(1):77–86, 2015. doi:10.1007/s00466-015-1156-z.
[5] B. Barney. OpenMP. https://computing.llnl.gov/tutorials/openmp/, 2014.
[6] A. Marowka. Book review [review of "Using OpenMP: Portable Shared Memory Parallel Programming" (Chapman, B. et al., 2007)]. IEEE Distributed Systems Online 9(1):3, 2008. doi:10.1109/MDSO.2008.1.
[7] B. Nichols, D. Buttlar, J. P. Farrell. Pthreads Programming. O'Reilly and Associates, 1996.
[8] D. B. West. Introduction to Graph Theory, vol. 2. Prentice Hall, Upper Saddle River, 2001.
[9] A. Williams. C++ Concurrency in Action. Manning Publications Co., 2012.
[10] D. T. Marr, F. Binns, D. L. Hill, et al. Hyper-threading technology architecture and microarchitecture. Intel Technology Journal 6(1), 2002.
[11] J. Handy. The Cache Memory Book. Academic Press, Inc., 1998.
[12] Intel Corporation. Intel VTune Performance Analyzer. http://software.intel.com/en-us/intel-vtune/, 2019.
Acta Polytechnica 60(1):25–37, 2020.

Computational Fluid Dynamic Simulation (CFD) and Experimental Study on Wing–External Store Aerodynamic Interference of a Subsonic Fighter Aircraft

Tholudin Mat Lazim, Shabudin Mat, Huong Yu Saint

Abstract: The main objective of the present work is to study the effect of an external store on a subsonic fighter aircraft. Most modern fighter aircraft are designed with external store installations. In this project, a subsonic fighter aircraft model was manufactured using a computer numerical control machine for the purpose of studying the effect of the external store's aerodynamic interference on the flow around the aircraft wing. Computational fluid dynamic (CFD) simulations and wind tunnel experiments were carried out to establish the aerodynamic characteristics of the model and to certify that the aircraft will not face any difficulties in stability and controllability. In the CFD experiment, a commercial CFD code is used to simulate the interference and the aerodynamic characteristics of the model. Subsequently, the model, together with an external store, was tested in a low speed wind tunnel with a test section sized 0.45 m × 0.45 m. The two-dimensional pressure distributions obtained by the two approaches are comparable: there is only a 12% deviation in the pressure distribution found in the wind tunnel testing compared to the result predicted by CFD. The results show that the aerodynamic interference due to the external store is mostly evident on the lower surface of the wing and almost negligible on the upper surface at low angles of attack. In addition, the area of the wing surface influenced by the store interference increases as the airspeed increases.

Keywords: computational fluid dynamics (CFD), wind tunnel testing, validation, aerodynamic interference.

1 Introduction

Fighter aircraft are mostly designed to carry stores, such as a launcher or an external tank, under the wing. When these stores are installed, the flow over surrounding components, such as the control surfaces, can change considerably. This may introduce several aerodynamic interference effects, such as changes in the aerodynamic forces, an increase in turbulence and possibly flow separation; these phenomena may have an adverse effect on other aircraft components, such as the horizontal tail and the vertical stabilizer, and consequently may affect the controllability and stability of the aircraft. Research on external store installation is complex and extensive: it covers several research areas, such as aerodynamics, structures, flutter, physical integration, trajectory prediction, aircraft performance, stability analysis and several other engineering disciplines. The focus of this work, however, is the aerodynamic interference, particularly the change in the aerodynamic characteristics. The aerodynamic characteristics are a prerequisite for the other analyses, since the aerodynamic data are required for the subsequent aircraft structural, stability, performance and store trajectory analyses. The investigation of the aerodynamic characteristics in an external store clearance program usually involves a complex flow field study with multi-component interference; a flow of such a nature is usually investigated through wind tunnel testing, besides the empirical methods.

The main objective of this study is to identify the interference effect of an external store installation on a subsonic fighter aircraft that is currently used by the Royal Malaysian Air Force. A generic model of one of these subsonic fighter aircraft was chosen for the study. Wind tunnel testing and computational fluid dynamics (CFD) simulations were conducted to investigate the interference effects: a low speed wind tunnel with a 450 mm × 450 mm test section was used for the experiments, and a commercial CFD code was used for the simulations. Other milestones of this study include the verification and validation process and an assessment of the suitability of a commercial CFD code for predicting the wing–external store aerodynamic interference effects.

2 Simulation and experimental works

The methodology adopted to conduct the study consists of a few steps. The first and foremost is to obtain the digitized wing section geometry; the digitization process was done using the PhotoModeler software. The second step is to construct a scale model of the wing, based on the digitized wing geometry, using a computer numerical control (CNC) machine. Then several series of experiments were carried out on the scale model in the wind tunnel at a low speed of approximately 22.8 m/s.
The digitized wing geometry was also used in the CFD simulation. The GAMBIT preprocessor software was used to produce the necessary mesh, and the setup was then simulated using the FLUENT 5 CFD software with various physical models, numerical algorithms, discretization methods and boundary conditions. In the final step, the study was wrapped up by comparing the computed and the measured results, investigating further the nature of the interference effect of the wing and external store configuration.

3 Aircraft wing external geometry digitization

Digitization of the wing geometry is vital in order to obtain an adequate aircraft model geometry that represents the real aircraft. The PhotoModeler 3.0 software was used to capture and digitize the aircraft wing external geometry. The software processes the images of 84 photographs of the aircraft wing taken from various angles with a digital camera. A number of points were marked on the aircraft wing and the adjacent fuselage part using masking tape, as shown in Fig. 1.
this figure also shown a total of 84 photographs were used to generate the wing profile and some part of the fuselage. output from the digitization process is a set of coordinates conforming to the wing geometry as shown in fig. 2. majority of the coordinates were on the wing surface and wingtip pylon. unfortunately, the wing geometry image is less quality in term of accuracy and perfection. therefore, cad software is used to smooth the image. after made minor adjustment, the image becomes as in fig. 3. 3.1 wind tunnel testing a wing model is required to perform the wind tunnel testing. therefore, a 20 % scale wing model of a fighter aircraft has been fabricated by using cnc machine. the model was made from a single solid piece of an aluminum10 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 44 no. 2/2004 fig. 1: photographs of the marked wing surface at the various projection fig. 2: aircraft geometry produced by photomodeller fig. 3: digitized wing geometry after smoothed with cad -alloy that having nine conduits each on the upper and lower surface. fig. 4 show the semi-span model of a generic fighter aircraft taken from the digitized geometry produced by photomodeller. the figure also indicates 3 main part of the wing which includes a root section, mid section and tip section. but the external storage is installed at the mid section. then we only decided to fabricate the mid section. furthermore, we unable to test the full set of the wing model due to the limitation of the size of the wind tunnel test section. the model has three main stations for pressure measurement study located at the chord wise that parallel to each other with equal distance placing. every station was stationed with static pressure-taping point on upper and lower surface respectively. besides built the mid wing model, we also built a 1/5 scale model of launcher and pylon as the external stores. these external storages were design in such a way that they are easily secured and removed from the wing section. fig. 5 shows the complete assembly of this aircraft wing together with the external storage, inside the test section. the experiments have been conducted using two different configurations of the wing model. the first configuration is without the external storages. meanwhile, the second configuration is with the external storages. both configurations were tested at zero angle of attack at two different speeds, which are 22 m/s and 27 m/s. 3.2 computational fluid dynamic simulation in the cfd simulation, the mid wing was simulated at two conditions. in the first condition, the mid wing have been mesh into 111 239 elements. meanwhile, and the second condition it have been mesh into 221 112 elements as shown in fig. 6a and 6b. the flow was simulated at the speed 22 m/s, incompressible flow and at laminar consideration. fig. 6c shows the simulation for wing with external storage with 122 158 elements. © czech technical university publishing house http://ctn.cvut.cz/ap/ 11 acta polytechnica vol. 44 no. 2/2004 fig. 4: semi span model of a generic fighter aircraft fig. 5: model installation inside the wind tunnel a) b) c) fig. 6: cfd model surface meshes: a) mesh for wing in tunnel, 111 239 elements, b) mesh for wing in tunnel, 221 112 elements, c) mesh for wing and store in tunnel, 122 158 elements 0.8 0.6 0.4 0.2 0 0.2 0.4 0.6 0.2 0 0.2 0.4 0.6 0.8 1 1.2 x/c c p coarse grid -lower coarse grid upper fine grid upper fine grid lower fig. 
7: pressure distributions at the mid span for upper and lower surface 12 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 44 no. 2/2004 pressure distribution st 1 -0.8 -0.6 -0.4 -0.2 0 0.2 0.4 0.6 0 0.2 0.4 0.6 0.8 1 x/c store-st1-lower store-st1-upper clean-st1-lower clean-st1-upper pressure distribution st3 -0.8 -0.6 -0.4 -0.2 0 0.2 0.4 0.6 0 0.2 0.4 0.6 0.8 1 x/c store-st3-lower store-st3-upper clean-st3-lower clean-st3-upper pressure distribution st2 -0.8 -0.6 -0.4 -0.2 0 0.2 0.4 0.6 -0.1 0.1 0.3 0.5 0.7 0.9 1.1 x/c clean-st2-lower clean-st2-upper store-st2-upper fig. 8: pressure distribution for a various chord wise location pressure distribution st 1 -0.8 -0.6 -0.4 -0.2 0 0.2 0.4 0.6 0 0.2 0.4 0.6 0.8 1 x/c store-st1-lower store-st1-upper clean-st1-lower clean-st1-upper pressure distribution st3 -0.8 -0.6 -0.4 -0.2 0 0.2 0.4 0.6 0 0.2 0.4 0.6 0.8 1 x/c store-st3-lower store-st3-upper clean-st3-lower clean-st3-upper pressure distribution st2 -0.8 -0.6 -0.4 -0.2 0 0.2 0.4 0.6 -0.1 0.1 0.3 0.5 0.7 0.9 1.1 x/c clean-st2-lower clean-st2-upper store-st2-upper fig. 9: pressure coefficient at three different span wise locations 4 results 4.1 wind tunnel testing results after conducted a series of experiment, the pressure distribution at mid span for the upper and lower surfaces was plotted as shown in fig. 8. from these results we found that at station 1, the difference in pressure coefficient is only 3 % on the upper surface of the wing due to the external store installations. the pressure coefficient shows there are substantial differences by having external configuration at the lower surface. station 2 and 3 indicate the same phenomenon that there is a little difference in pressure distribution on the upper surface. the lower surface shows some reduction in pressure distribution compared to the upper surface. these experimental results give an initial indication that the flow at the upper surface will not be severely affected by the external storage configuration compared to the lower surface. 4.2 computational fluid dynamics results fig. 9 shows the results of computational fluid dynamic simulation for the upper and lower surfaces of this aircraft. at station 3, it is found that the coefficient of pressure is almost constant and same from the leading edges to trailing edge and the value is not much different by having the external installation at the upper surface. this shows that the external storage has not affected the flow at the upper surface. in contrast the pressure coefficient is much differed at a lower surface by having the external configuration compared to the clean wing configuration. this shows that the external storage is only affecting the lower surface. the same phenomenon also happen at station 1 where the coefficient of pressure has not changed at the upper surface but there are some reduction about 12 % in coefficient of in the lower surface. the simulation at station 2 also provides the same results. 5 analysis and discussion 5.1 comparison between cfd and experimental works the study shows that the value of pressure coefficient on the upper surface predicted by simulation is around 0.4 compared to 0.6 performed by the experimental study. there is about 12 % difference. in the experimental study, problem during the setup of experiment such as misalignment in determining the angle of attack, accuracy of the model, blockage effects and wind tunnel calibration can significantly influenced the result. 
even though the wing is machined accurately by the computer numerical control (cnc), there is still doubt on the accuracy of the model. moreover, the fluid level of the manometer used for measuring pressure was fluctuating between always from 1 to 2 cm. 5.2 discussion in the study we observed that the external store configuration only affects the lower surface of the wing. fig. 10 shows the pressure distribution at the quarter chord point along the span wise location from the tip to the root. at the upper surface, the pressure distribution is almost constant from the span wise position, which is from tip to the root of the wing. the pressure coefficient at the lower surface was reduced by 40 % compared to the upper surface. with the external storage, the pressure distribution at the lower surface was increased around 20 % compared with the clean lower surface, but there was sudden increased in pressure distribution at the position � 0.2 span wise location where the external storage was mounted to the wing. conclusion the experimental study at speed 22.8 m/s and computational fluids dynamic simulation has been performed on the wing and store configuration in this project. the results show the that the flow over the upper surface of the wing has not affected much when the pylon and launcher are installed. the study also shows that the flow over the lower surface is much affected by the presence of external storage. the static pressure around the wing is about 12 % higher than the simulated values. references [1] manoj k. bhardwaj, rakesh k. kapania, reichenbach e., guru p. guruswamy: “computational fluid dynamics/computational structural dynamics interaction methodology for aircraft wings”. aiaa journal, vol. 36 (1998), no. 12, p. 2179–2185. [2] tomaro robert f., witzeman f. c., strang w. z.: “simulation of store separation for the f/a-18c using cobalt”. journal of aircraft, vol. 37 (2000), no. 3, p. 361–367. [3] prewitt n. c., belk d. m., maple r. c.: “multiple-body trajectory calculations using the beggar code”. journal of aircraft, vol. 36 (1999), no. 5, p. 802–808. [4] brock j. m., jr, jolly b. a.: application of computational fluid dynamics at eglin air force base. 1988, sae 985500. [5] shanker v., malmuth n.: “computational and simplified analytical treatment of transonic wing/fuselage/pylon/store interaction”. journal of aircraft, vol. 18 (1981), no. 8, p. 631–637. [6] hirsch: “numerical computation of internal and external flows”. volume 1. brussels: wiley, 1988. [7] jameson a.: ”re-engineering the design process through computation”. journal of aircraft, vol. 36 (1999), no. 1. © czech technical university publishing house http://ctn.cvut.cz/ap/ 13 acta polytechnica vol. 44 no. 2/2004 spanwise pressure distribution at quarter chord 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 -0.5 -0.4 -0.3 -0.2 -0.1 0 span position (tip root) c p store lower clean lower store upper clean -upper fig. 10: external store interference on pressure distribution [8] spradley l. w., lohner r., chung t. j.: ”generalized meshing environment for computational mechanics”. aiaa journal, vol. 36 (1996), no. 9, p. 1735–1737. [9] mavriplis d. j.: “viscous flow analysis using a parallel unstructured multigrid solver”. aiaa journal, vol. 38 (2000), no. 11, p. 2067–2075. [10] kallinderis y., khawaja a., mcmorris h.: “hybrid prismatic/tetrahedral grid generation for viscous flows around complex geometries”. aiaa journal, vol. 34 (1996), no. 2, p. 291–298. [11] koomullil r. p., soni b. 
k.: ”flow simulation using generalized static and dynamic grids”. aiaa journal, vol. 37 (1999), no. 12, p. 1551–1557. tholudin mat lazim, ph.d e-mail: tholudin@fkm.utm.my shabudin mat, m.sc e-mail: shabudin@fkm.utm.my department aeronautics & automotive faculty mechanical engineering universiti teknologi malaysia 81310, utm skudai johor malaysia huong yu saint, m.eng royal malaysian airforce wisma pertahanan jalan padang tembak 50634, kuala lumpur 14 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 44 no. 2/2004 table of contents biological systems thinking for control engineering design 3 d. j. murray-smith computational fluid dynamic simulation (cfd) and experimental study on wing-external store aerodynamic interference of a subsonic fighter aircraft 9 tholudin mat lazim, shabudin mat, huong yu saint dynamics of micro-air-vehicle with flapping wings 15 k. sibilski the role of cad in enterprise integration process 22 m. ota, i. jelínek development of a technique and method of testing aircraft models with turboprop engine simulators in a small-scale wind tunnel – results of tests 27 a. v. petrov, y. g. stepanov, m. v. shmakov developing a conceptual design engineering toolbox and its tools 32 r. w. vroom, e. j. j. van breemen, w. f. van der vegte knowledge support of simulation model reuse 39 m. valášek, p. steinbauer, z. šika, z. zdráhal the effect of pedestrian traffic on the dynamic behavior of footbridges 47 m. studnièková control of systems of reservoirs with the use of risk analysis 52 p. fošumpaur, l. satrapa a coding and on-line transmitting system 56 v. zagursky, i. zarumba, a. riekstinsh speech signal recovery in communication networks 59 v. zagursky, a. riekstinsh simulation of scoliosis treatment using a brace 62 j. èulík image analysis of eccentric photorefraction 68 j. dušek, m. dostálek a novel approach to power circuit breaker design for replacement of sf6 72 d. j. telfer, j. w. spencer, g. r. jones, j. e. humphries numerical analysis of the temperature field in luminaires 77 j. murín, m. kropáè, r. fric computer aided design of transformer station grounding system using cdegs software 83 s. nikolovski, t. bariæ recycling and networking 90 t. bányai response of a light aircraft under gust loads 97 p. chudý preliminary determination of propeller aerodynamic characteristics for small aeroplanes 103 s. slavík acta polytechnica doi:10.14311/app.2016.56.0041 acta polytechnica 56(1):41–46, 2016 © czech technical university in prague, 2016 available online at http://ojs.cvut.cz/ojs/index.php/ap the goodness of simultaneous fits in isis matthias kühnela, ∗, sebastian falknera, christoph grossbergera, ralf ballhausena, thomas dausera, fritz-walter schwarma, ingo kreykenbohma, michael a. nowakb, katja pottschmidtc, d, carlo ferrignoe, richard e. rothschildf, silvia martínez-núñezg, josé miguel torrejóng, felix fürsth, dmitry klochkovi, rüdiger stauberti, peter kretschmarj, jörn wilmsa a remeis-observatory & ecap, universität erlangen-nürnberg, sternwartstr. 
7, 96049 bamberg, germany b mit kavli institute for astrophysics, cambridge, ma 02139, usa c cresst, center for space science and technology, umbc, baltimore, md 21250, usa d nasa goddard space flight center, greenbelt, md 20771, usa e isdc data center for astrophysics, chemin d’écogia 16, 1290 versoix, switzerland f center for astronomy and space sciences, university of california, san diego, la jolla, ca 92093, usa g x-ray astronomy group, university of alicante, spain h cahill center for astronomy and astrophysics, caltech, pasadena, ca 91125, usa i institut für astronomie und astrophysik, universität tübingen, sand 1, 72076 tübingen, germany j european space astronomy centre (esa/esac), science operations department, villanueva de la cañada (madrid), spain ∗ corresponding author: matthias.kuehnel@sternwarte.uni-erlangen.de abstract. in a previous work, we introduced a tool for analyzing multiple datasets simultaneously, which has been implemented into isis. this tool was used to fit many spectra of x-ray binaries. however, the large number of degrees of freedom and individual datasets raise an issue about a good measure for a simultaneous fit quality. we present three ways to check the goodness of these fits: we investigate the goodness of each fit in all datasets, we define a combined goodness exploiting the logical structure of a simultaneous fit, and we stack the fit residuals of all datasets to detect weak features. these tools are applied to all rxte-spectra from gro 1008−57, revealing calibration features that are not detected significantly in any single spectrum. stacking the residuals from the best-fit model for the vela x-1 and xte j1859+083 data evidences fluorescent emission lines that would have gone undetected otherwise. keywords: data analysis; multiple datasets; x-ray binaries. 1. introduction nowadays, the still increasing computation speed and available memory allows us to analyze large datasets at the same time. using x-ray spectra of accreting neutron stars as an example, we have shown in a previous paper [1] that loading and fitting the spectra simultaneously has several advantages compared to the “classical” way of x-ray data analysis, which treats every observation individually. in particular, instead of fixing parameters to a mean value one can determine them by a joint fit to all datasets under consideration. due to the reduced number of degrees of freedom the remaining parameters can be better constrained (see fig. 1 as an example). furthermore, parameters no longer need to be independent, but can be combined into functions. for instance, the slope of the spectra might be described as a function of flux with the coefficients of this function as fit-parameters. the disadvantages of fitting many datasets simultaneously are, however, an increased runtime and a complex handling because of the large number of parameters. in [1], we introduced functions to facilitate this handling, which were implemented into the interactive spectral interpretation system (isis) [2]. while these functions are already available as part of the isisscripts1 they are continuously updated and new features are implemented. one important question, which we raised in [1], is about the goodness of a simultaneous fit as it is, e.g., calculated after the commonly used χ2-statistics, particularly the case where some datasets are not described well by the chosen model. due to the potential large total number of datasets, information about failed fits can be hidden in the applied fit-statistics. 
after we have given a reminder about the terminology of simultaneous fits in section 1.1, we describe 1http://www.sternwarte.uni-erlangen.de/isis [2016-0201] 41 http://dx.doi.org/10.14311/app.2016.56.0041 http://ojs.cvut.cz/ojs/index.php/ap http://www.sternwarte.uni-erlangen.de/isis m. kühnel, s. falkner, c. grossberger et al. acta polytechnica p = 0 p = 1 p = 2 10.80.60.40.20 15 10 5 relative uncertainty in γ n u m b e r figure 1. distribution of the relative uncertainties (90% confidence level) of the power-law photon indices, γ, in our study of all rxte-observations of gro j1008−57 [7]. fitting all 43 observations separately results in the green histogram. as soon as we perform a simultaneous fit with p = 1 global continuum parameter the uncertainties decrease significantly as shown by the red histogram. finally, using p = 2 global parameters (blue histogram) results in a median of ∼ 6 % in the uncertainties (compare the arrows on top). the problem of detecting failed fits in more detail in section 2, and we provide possible solutions. we will conclude this paper by applying these solutions to examples in section 3. 1.1. simultaneous fits as we have described in [1], a data-group contains all datasets which have been taken simultaneously in time or, in general, represent the same state of the observed object. in the example illustrated in fig. 2 two data-groups, a and b, have been added to the simultaneous fit, containing n and m datasets, respectively. thus, a dataset is labeled by the data-group it belongs to, e.g., b3 is the third dataset in the second data-group. after a model with p parameters has been defined each of these data-groups is fitted by an individual set of parameters, called group parameters. consequently, all datasets belonging to a specific datagroup are described by the same parameter values. a specific group parameter can now be marked as a so-called global parameter. the value of the corresponding group parameters will now be tied to this global parameter, i.e, this parameter has a common value among all data-groups. instead of tying group parameters together to a global value, a parameter function may be defined, to which the group parameters are set instead. this function takes, e.g., other parameters as input to calculate the value for each group parameter. in this case, correlations between model parameters, e.g, as predicted by theory can be implemented and fitted directly. 2. goodness of a simultaneous fit as an indicator for the goodness of a fit the analysis software used, e.g isis or xspec [3], usually displays the fit-statistics after the model has been fitted to the data. here, we chose χ2-statistics since the developed global parameters dataset a.1 dataset a.2 ... dataset a.n data-group a dataset b.1 dataset b.2 dataset b.3 ... dataset b.m data-group b model parameters group parameters group parameters figure 2. terminology of simultaneous fits in isis, according to [1]. a data-group consists of simultaneously taken datasets, here data-group a has n and b has m datasets. a model with p + p parameters is fitted such that each data-group has its own set of pi parameters, called group parameters. the common p parameters between the groups are the so-called global parameters. functions for a simultaneous fit were first applied to accreting neutron stars. the high count rates satisfies the gaussian approxmation of the uncertainties, which are actually poisson distributed. 
in principle, however, the discussed issues and their solutions can be generalized for any kind of fit-statistics. for each datapoint k the difference between the data and the model is calculated and normalized by the measurement uncertainty, error, of the data. the sum over all n datapoints is called the χ-square, χ2 = n∑ k=1 (datak − modelk)2 error2k (1) and is displayed after a fit. additionally, the sum is normalized to the total number of degrees of freedom, n−p with the number of free fit-parameters p, since the χ2 increases with n. this normalized sum, called the reduced χ-square, χ2red = χ2 n−p (2) is also displayed. for gaussian distributed data the expected value is χ2red = 1 for a perfect fit of the chosen model. however, once the probability distribution is changed, e.g., when a spectrum has been rebinned, the expected value changes as well. consequently, a reliable measure for the goodness of the fit has to be defined with some forethought. the χ2red threshold, for which a simultaneous fit is acceptable, depends strongly on the considered case. in particular, a few data-groups might not be described well by the chosen model, which would result in an unacceptable χ2red when fitted individually. however, in case of a simultaneous fit, this information might be hidden in the classical definition of the χ2red (eq. 2). let us consider n data-groups and a model with p group parameters and p global parameters. 42 vol. 56 no. 1/2016 the goodness of simultaneous fits in isis then, the total χ2red is χ2red = ∑n i=1 χ 2 i∑n i=1(ni −pi) −p , (3) with the number of degrees of freedom, ni −pi, and the χ2i for each data-group i after eq. 1. now, we assume a failed fit with χ2i ∼ 2 for a particular i to be present, while for the remaining data-groups χ2i ∼ 1. for n & 10 the χ 2 red after eq. 3 is still near unity and, thus, suggests a successful simultaneous fit. in the following, we present three possible ways to investigate the goodness of a simultaneous fit more carefully. 2.1. histogram of the goodness a trivial but effective solution is to check the goodness of the fit for each data-group individually. here, in the chosen case of χ2-statistics, the χ2red,i is calculated for each data-group, i, after χ2red,i = χ2i ni −pi , (4) where ni are the number of datapoints in the datagroup, and pi is the number of free group parameters. for this reason the global parameters are not taken into account here, the χ2red,i is, however, different from that performed by a single fit of the data-group. in the case of a large number of data-groups, it is more convenient to sort the χ2red,i into a histogram to help in investigating the goodness of the fit to all data-groups. we have added such a histogram to the simultaneous fit functions as part of isisscripts. after a fit has been performed using the fit-functions fit_groups or fit_global (see [1]) this histogram is added to the default output of the fit-statistics. in this way, failed fits of specific data-groups can be identified by the user at first glance. 2.2. combined goodness of the fit instead of a few failed fits to certain data-groups, one might ask if the chosen model fails in the global context of a simultaneous fit. to answer this question, a special goodness of the simultaneous fit is needed to take its logical structure into account. as was explained in section 1.1, a data-group represents a certain state of the observed object, e.g., the datasets were taken at the same time. 
thus, the data-groups are statistically independent of each other. calculating the goodness of the fit in a traditional way, which is the χ2red after eq. 3 in our case, does not, however, take this consideration into account. as a solution we propose to define the combined goodness of the fit calculating the weighted mean of the individual goodness of each data-group. in the case of χ2-statistics, a combined reduced χ2 is calculated by χ2red,comb. = 1 n n∑ i=1 χ2i ni −pi −µip , (5) with χ2i computed after eq. 1 for each data-group, i, and a weighting factor, µi, for the number of global parameters, p: µi ≈ (ni −pi) × n∑ j=1 1 nj −pj . (6) thus, µi normalizes the effect of data-group i on the determination of the global parameters, p, by its number of degrees of freedom relative to the total number of degrees of freedom of the simultaneous fit. equation 6 is, however, an approximation only. a data-group might not be sensitive to a certain global parameter, e.g, if the spectra in this data-group do not cover the energy range necessary to determine the parameter. a failed fit to a specific data-group, for example with a high individual χ2red,i, has a higher impact on the χ2red,comb. (eq. 5) than on the traditional χ 2 red (eq. 2). in general we expect that χ2red,comb. ≥ χ 2 red, even if all data-groups are fitted well. in the case of a good simultaneous fit (better than a certain threshold), a weak feature in the data might still remain unnoticed, if it is not detected in any individual data-group. such a feature can be investigated by stacking the residuals, as outlined in the following section. we note, however, that eq. 5 is the result of an empirical study. a more sophisticated goodness of a simultaneous fit should be based on a different type of fit-statistics suitable for a simultaneous analysis of many datasets, such as a bayesian approach or a joint likelihood formalism similar to [4]. 2.3. stacked residuals once datasets can be technically stacked to achieve a higher signal-to-noise ratio, e.g., when spectra have the same energy grid and channel binning, further weak features might become visible. this is a common technique in astrophysics [see, e.g., 5, 6]. however, when stacked datasets are analyzed, differences in the individual datasets, e.g., source intrinsic variability, can no longer be revealed. in the case of a simultaneous fit, the residuals of all data-groups can be stacked instead. the stacking dramatically increases the total exposure in each channel bin. thus, the stacked residuals of all datagroups, r(k), as a function of the energy bin, k, can be investigated to further verify the goodness of the simultaneous fit r(k) = n∑ i=1 datai,k − modeli,k. (7) this task can be achieved using, e.g, the plot_data function2 written by m. a. nowak. 2http://space.mit.edu/home/mnowak/isis_vs_xspec/ plots.html [2016-02-01], which is available through the isisscripts as well. 43 http://space.mit.edu/home/mnowak/isis_vs_xspec/plots.html http://space.mit.edu/home/mnowak/isis_vs_xspec/plots.html m. kühnel, s. falkner, c. grossberger et al. acta polytechnica a hexte pca 10+1 10+0 10−1 10−2 10−3 b 10 5 0 -5 c 50205 10010 10 5 0 -5 c o u n ts s− 1 k e v − 1 χ energy (kev) χ figure 3. the stacked spectra (a) and stacked residuals of all individual data-groups (b) containing 43 rxte-spectra of gro j1008−57 (blue: pca; red: hexte). residual features, which are caused by calibration uncertainities, are left in pca. 
we can show that the combined reduced χ² is effectively equal to the goodness of a fit of data stacked in the first place. assuming the same number of degrees of freedom, n − p, for each data-group, eq. 5 gives

\chi^2_{\mathrm{red,comb}} = \frac{1}{f} \sum_{i=1}^{N} \sum_{k=1}^{n} \frac{(\mathrm{data}_{i,k} - \mathrm{model}_{i,k})^2}{\mathrm{error}_{i,k}^2} \tag{8}

with f = N(n − p − μP) and having used eq. 1. now, the summand no longer depends on i or k explicitly. thus, the order of the sums in eq. 8 may be switched. if we finally interpret k as a spectral energy bin we end up with the goodness as a function of k:

\chi^2_{\mathrm{red,comb}}(k) \propto \sum_{i=1}^{N} \frac{(\mathrm{data}_{i,k} - \mathrm{model}_{i,k})^2}{\mathrm{error}_{i,k}^2} \propto \sum_{i=1}^{N} \mathrm{data}_{i,k}^2 \tag{9}

this means that in the combined reduced χ² all datasets of the simultaneous fit are first summed up for each energy bin. in contrast to stacking the data in the first place, however, source variability can still be taken into account during a simultaneous fit. note that once all data-groups have the same number of degrees of freedom, the χ²_red,comb (eq. 8) is equal to the classical χ²_red (eq. 2). to further investigate the goodness of the simultaneous fit in such a case, the histogram of the goodness of all data-groups (see sec. 2.1) and, if possible, the stacked residuals should be investigated.
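both statements are easy to check numerically. the sketch below (ours, gaussian synthetic data on a shared channel grid) verifies the equal-dof identity and adds a toy version of sec. 2.3: a weak shared feature, buried in the noise of every individual group, emerges in the stacked residuals.

```python
import numpy as np

rng = np.random.default_rng(3)
n_grp, n_chan, p_i, p_glob = 100, 300, 2, 1
k = np.arange(n_chan)
line = 0.5 * np.exp(-0.5 * ((k - 150) / 3.0) ** 2)   # weak shared feature
data = line + rng.standard_normal((n_grp, n_chan))   # error = 1 everywhere
model = np.zeros(n_chan)                             # continuum-only 'model'

chi2_i = np.sum((data - model) ** 2, axis=1)
dof = n_chan - p_i
mu = 1.0 / n_grp                                     # eq. (6) for equal dof
comb = np.mean(chi2_i / (dof - mu * p_glob))         # eq. (5)
classic = chi2_i.sum() / (n_grp * dof - p_glob)      # eq. (3)
print(abs(comb - classic) < 1e-10)                   # True: identical here

r = (data - model).sum(axis=0)                       # eq. (7)
print(int(r.argmax()))                               # ~150: the line emerges
# per group the feature (amplitude 0.5) is buried in unit noise, while
# stacking scales the signal by n_grp but the noise only by sqrt(n_grp).
```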
3. examples
3.1. gro j1008−57
the be x-ray binary gro j1008−57 was regularly monitored by rxte during outbursts in 2007 december, 2005 february, and 2011 april, with a few additional pointings by suzaku and swift. a detailed analysis of the spectra has been published in [7], and in [1] we demonstrated, as an example, the advantages of a simultaneous fit based on these data (see also fig. 1). the χ²_red of 1.10 with 3651 degrees of freedom (see table 4 of kühnel et al., 2013 [7]) calculated after eq. 3 indicates a good fit of the underlying model to the data. using the combined reduced χ² defined in eq. 5 we find, however, χ²_red,comb = 1.68. the reason for this significant worsening of the goodness is calibration uncertainties in rxte-pca, which are visible in the stacked residuals of all 43 data-groups as shown in fig. 3: the strong residuals below 7 kev are probably caused by insufficient modeling of the xe l-edges, the absorption feature at 10 kev by the be/cu collimator, and the sharp features around 30 kev by the xe k-edge [for a description of the pca see 8]. these calibration issues have also been detected in a combined analysis of the crab pulsar by [9]. however, the calibration issues that are responsible for the high χ²_red,comb do not affect the continuum model of gro j1008−57 because of their low significance in the individual data-groups. these calibration features might have an influence, however, on data with a much higher signal-to-noise ratio than the datasets used here, or when narrow features, such as emission lines, are studied.

3.2. vela x-1
another excellent example of a simultaneous fit was performed by [10]. these authors analyzed 88 spectra recorded by xmm-newton during a giant flare of vela x-1. although the continuum parameters were changing dramatically within the ∼100 ks observation, a single model consisting of three absorbed power-laws is able to describe the data with χ²_red = 1.43 with 9765 degrees of freedom [10]. due to a global photon index for all power-laws and data-groups the absorption column densities and iron line fluxes could be constrained well. because every data-group is a single spectrum taken by the xmm-epic-pn camera and a common energy grid was used, the χ²_red,comb equals the χ²_red here. thus, it is preferable to calculate the stacked residuals of all data-groups according to eq. 7. to demonstrate the advantage of this tool we have used the continuum model only, i.e., without any fluorescence line taken into account, and evaluated this model without any channel grouping to achieve the highest possible energy resolution. the resulting stacked residuals in the iron line region (5–9 kev) are shown in fig. 4. the iron kα line at ∼6.4 kev and the kβ line at ∼7.1 kev are nicely resolved. the tail following the kβ line is probably caused by a slight mismatch of the continuum model with the data and requires a more detailed analysis. note that the flux of this mismatch is a few 10⁻³ photons s⁻¹ cm⁻², which is detectable only in these stacked residuals featuring 100 ks of exposure time and after the strong continuum variability on ks-timescales has been subtracted.

figure 4: the iron line region of vela x-1 can be nicely studied in these stacked residuals of all 88 xmm-newton-spectra (a). the model includes the continuum shape only and, thus, does not take any fluorescent line emission into account. the residuals of the single spectrum with the highest signal show the kα emission line only (b). note that the residual flux in this line is ∼15 times lower compared to the stacked residuals.

figure 5: example of a histogram of the goodness of the fits. here, the distribution of the χ²_red of all individual data-groups of the 88 xmm-newton-spectra of vela x-1 is shown.

as a further demonstrative example, the histogram of the goodness of the fits of the 88 data-groups calculated after eq. 4 is shown in fig. 5. the median χ²_red is around 1.3, indicating that the model could still be improved slightly. investigating the three outliers with χ²_red > 2 indeed proves that residuals in the iron line region are left, which are responsible for the high χ²_red and are very similar to those shown in fig. 4. however, no extended residuals are visible. thus, the continuum parameters presented in [10] are still valid.

3.3. xte j1859+083
the last example shown in this work is the outburst of the transient pulsar xte j1859+083 in 2015 april. this source had been in quiescence since its bright outburst in 1996/1997 [11]. during the recent outburst several short observations by swift were performed. a first analysis of these data in combination with integral spectra reports an absorbed power-law shape of the source's x-ray continuum [12]. we extracted and analyzed seven swift-xrt spectra and can confirm these findings. however, after examining the stacked residuals of all spectra an iron kα emission line at 6.4 kev shows up that has not been detected before (see fig. 6).

figure 6: seven stacked spectra of swift-observations of xte j1859+083 (a) and the residuals to the model (b). a weak iron kα emission line at 6.4 kev is visible, which is not detected in the residuals of any individual spectrum (c).
we define the equivalent width of this line as a global parameter and find a value of 60 ± 40 ev (uncertainty at the 90% confidence level, χ²_red,comb = 0.98 with 585 degrees of freedom).

4. summary
we have continued developing functions to handle simultaneous fits in isis, which we introduced in [1]. in particular, we have concentrated on tools for checking the goodness of the fits to discover failed fits of individual data-groups or global discrepancies of the model. in order to find the global best-fit during a simultaneous fit, we propose to
• investigate the distribution of the goodness of fits to all individual data-groups,
• calculate a combined goodness, here the χ²_red,comb, which takes the individual nature of each data-group into account,
• look at the stacked residuals of all data-groups to reveal weak features.
we have demonstrated the tremendous benefit of analyzing the stacked residuals by observations of three accreting neutron stars, in which we could identify weak features that had not been detected before.

acknowledgements
m. kühnel was supported by the bundesministerium für wirtschaft und technologie under deutsches zentrum für luft- und raumfahrt grants 50or1113 and 50or1207. the slxfig module, developed by john e. davis, was used to produce all figures shown in this paper. we are thankful for the constructive and critical comments by the reviewers, which were helpful to significantly improve the quality of the paper.

references
[1] m. kühnel, s. müller, i. kreykenbohm, et al. simultaneous fits in isis on the example of gro j1008−57. acta polytechnica 55:1–4, 2015.
[2] j. c. houck, l. a. denicola. isis: an interactive spectral interpretation system for high resolution x-ray spectroscopy. in n. manset, c. veillet, d. crabtree (eds.), astronomical data analysis software and systems ix, vol. 216 of astron. soc. of the pacific conf. series, p. 591, 2000.
[3] k. a. arnaud. xspec: the first ten years. in g. h. jacoby, j. barnes (eds.), astronomical data analysis software and systems v, vol. 101 of astron. soc. of the pacific conf. series, p. 17, 1996.
[4] b. anderson, j. chiang, j. cohen-tanugi, et al. using likelihood for combined data set analysis. in y. fukazawa, y. tanaka, r. itoh (eds.), proceedings of the 5th fermi symposium, arxiv 1502.03081, 2015.
[5] c. ricci, r. walter, t. j.-l. courvoisier, s. paltani. reflection in seyfert galaxies and the unified model of agn. a&a 532:a102, 2011.
[6] e. bulbul, m. markevitch, a. foster, et al. detection of an unidentified emission line in the stacked x-ray spectrum of galaxy clusters. apj 789:13, 2014.
[7] m. kühnel, s. müller, i. kreykenbohm, et al. gro j1008−57: an (almost) predictable transient x-ray binary. a&a 555:a95, 2013.
[8] k. jahoda, c. b. markwardt, y. radeva, et al. calibration of the rossi x-ray timing explorer proportional counter array. apjs 163:401–423, 2006.
[9] j. a. garcía, j. e. mcclintock, j. f. steiner, et al. an empirical method for improving the quality of rxte pca spectra. apj 794:73, 2014.
[10] s. martínez-núñez, j. m. torrejón, m. kühnel, et al. the accretion environment in vela x-1 during a flaring period using xmm-newton. a&a 563:a70, 2014.
[11] r. h. d. corbet, j. j. m. in't zand, a. m. levine, f. e. marshall. rossi x-ray timing explorer and bepposax observations of the transient x-ray pulsar xte j1859+083. apj 695:30–35, 2009.
[12] d. malyshev, c. ferrigno, d. gotz. integral measures the hard x-ray spectrum of the be/x-ray binary xte j1859+083.
atel 7425, 2015.

modal logic – a tool for design process formalisation
i. jelínek
in this paper we show the possibility to formalize the design process by means of one type of non-standard logic – modal logic [1]. the type chosen for this study is modal logic s4. the reason for this choice is the ability of this formalism to describe modeling of the individual discrete steps of design, respecting necessity or possibility types of design knowledge.
keywords: modal logic, accessibility relation, design process, formalisation of design process.

1 introduction
the main problem of cad systems is how to represent designed objects, how to represent design knowledge and, primarily, how to represent the design process itself. the problem of representing the design process and its components has been worked on ever since cad systems came into existence, yet the question of how to represent the design process still remains open. there is no common universal theoretical model for describing the design process, nor are there any standards for design process representation. this situation is similar to that in computer graphics 20–30 years ago: there were a great number of geometrical modelers that served as the kernels of cad systems, and the consequence of this situation is obvious even today – cad systems have poor compatibility.

the principles of design theory were formulated, among others, in [2]. a general design theory based on the principle of the metamodel first appeared in [3], then later, e.g., in [4] and [5]. the idea of design theory again evokes new discussion activity: a fresh critical view on the extended general theory and the theory of the metamodel appeared in [6], and [7] discusses a new idea of using the formal means of modal logic for design process description. lately, a number of papers have appeared searching for new methods to describe the design process – for example by means of analysis of a design protocol [8], by means of a relational system [9], with the help of a set-theory model [10], or by an object-oriented approach [11]. there is a growing effort to overcome the gap in describing the design process. in this paper we try to formalize the design process by means of one type of non-standard logic – modal logic [1]. the reason for this choice is the ability of this formalism to describe modeling of the individual discrete steps of design, design process branching, and the respecting of necessity or possibility types of design knowledge.

2 modal logic
one way to define modal logic is by extending the classical axiomatic definition of propositional logic with some other connectives (modal connectives); the deductive system of propositional logic is then extended with axioms and inference rules characterizing modal logic. first, two alethic modalities – the modality of necessity and the modality of possibility – will be introduced, represented by the symbols □ and ◇. the expression □p will be read as "it is necessary that p" and the expression ◇p as "it is possible that p". the language of propositional modal logic is enriched by these two symbols, i.e., the modal logic language contains the following connectives: ¬, ∧, ∨, →, ↔, □ and ◇. modal logic language syntax is determined by two rules:
1. all propositional logic formulas are also modal propositional logic formulas.
2. if p is a modal propositional logic formula, then the formulas □p and ◇p are also formulas of this logic.
the connectives of necessity and possibility are connected according to the following definition:

◇p ≡_def ¬□¬p

which can be informally interpreted as, e.g., "if something is possible, then it is not true that it is necessary for it to be invalid". some other interpretations are given, e.g., in [1, 14].
several axiomatic systems exist which characterize the respective modal propositional logic systems [1]. here, some of the systems will be shown and the one most appropriate for our purpose will be chosen. attention will first be focused on the term accessibility relation, which is closely connected to the structure of modal logic axioms, after which some axiomatic systems will be shown [1].

unlike classical propositional logic formulas, modal logic formulas cannot be simply interpreted in an extensional way. modal connectives are operators of an intensional character – which means that interpretation of a formula that contains modal operators is not possible based only on knowledge of the individual subformula values and application of the logical operations represented by the semantics of these connectives. the interpretation is extended by different world images in which the logic formulas can be evaluated in different ways. a world is an element of a semantic environment for the interpretation of modal logic formulas; this environment has a finite non-zero number of elements, or worlds. classical propositional logic did not need this image because it is, from the formula interpretation point of view, an enclosed system: interpretation of a formula depends only on the individual subformulas "in one given world of interpretation – one given state of things". as an intuitive picture, a world can be imagined as one possible state of the investigated reality.

interpretation of modal logic formulas can depend not only on subformula values in the given world but also on formula values in worlds which have some connection with the considered actual world of evaluation. it is on the very properties of the given world's connection with other worlds that the character of the different modal logic systems is built. the connection of a given world to other worlds is characterized by the accessibility relation [1, 14]. modal logic formula interpretation is thus conducted within a frame of accessible worlds. let us introduce the frame of worlds [1] (structure [14]) formally as an ordered couple

F = ⟨W, R⟩

where W is a set of worlds and R ⊆ W × W is a binary relation (the accessibility relation). now let us introduce the term model (compare to the term model for propositional logic, where it was determined only by evaluation). let P be a set of atomic formulas, and let us symbolize as L(P) the set of formulas which can be generated from P. the model (P-model) within the frame F = ⟨W, R⟩ will then be called a triad

M = ⟨W, R, V⟩

where V: P → 2^W. V maps each atomic formula p ∈ P to the subset V(p) ⊆ W and therefore represents all worlds w ∈ W in which p is valid.
the validity of a formula a in a given world w of a model M will be symbolized as M ⊨_w a and is defined recursively:

M ⊭_w ⊥ (false is valid in no world)
M ⊨_w p iff w ∈ V(p), p ∈ P
M ⊨_w x → y iff M ⊨_w x implies M ⊨_w y
M ⊨_w □a iff M ⊨_t a for all worlds t ∈ W with wRt

and possibly we can add definitions of the other connectives:

M ⊨_w x ∧ y iff M ⊨_w x and simultaneously M ⊨_w y
M ⊨_w x ∨ y iff M ⊨_w x or M ⊨_w y
M ⊨_w ¬x iff M ⊭_w x
M ⊨_w ◇a iff M ⊨_t a for at least one world t ∈ W with wRt.

a formula a ∈ L(P) is valid in a model M = ⟨W, R, V⟩ if a is valid in all worlds of the model M, i.e. M ⊨_w a for all w ∈ W; this is symbolized as M ⊨ a. a formula a ∈ L(P) is valid within the frame F = ⟨W, R⟩ if a is valid in every model M = ⟨W, R, V⟩; symbolized as F ⊨ a. a formula a ∈ L(P) is valid if it is valid in every frame; symbolized as ⊨ a.

the properties of the accessibility relation are the key moment that characterizes the type of a modal logic system. the fundamental properties that the accessibility relation can have are the following (see [1, 14]):
1. reflexivity: ∀s: sRs – every world s ∈ W is accessible from itself.
2. symmetry: ∀s, t: sRt → tRs – if world t is accessible from world s, then world s is accessible from world t.
3. transitivity: ∀s, t, u: (sRt ∧ tRu) → sRu – if world t is accessible from world s and world u from world t, then world u is accessible from world s.

it is possible to show [1] that the individual properties of the accessibility relation correspond to the fundamental axioms of modal logic:
ad (1) reflexivity: □a → a (t)
ad (2) symmetry: a → □◇a (b)
ad (3) transitivity: □a → □□a (4)

now, it is possible to formulate the fundamental system of propositional modal logic as a system created by:
1. the propositional logic axiomatic system,
2. the modality distribution axiom □(a → b) → (□a → □b) (k),
3. the modal inference rule of necessitation: from a infer □a; and possibly, for derivations, the inference rule of modus ponens can be used: from x and x → y infer y.

a modal logic axiomatic system determined in this way is symbolized in the literature as the fundamental system k, according to its characteristic modal axiom. some other used axiomatic systems of modal propositional logic are shown in the following table: every type of modal logic is characterized by its set of modal axioms, and there exists an unambiguous relation between a modal logic axiom and a property of the accessibility relation [14] – see table 1.

table 1: connection between accessibility relation properties and modal logic axioms
system | accessibility relation properties | axioms
t | reflexivity: ∀s: sRs | (t): □a → a
b | reflexivity; symmetry: ∀s, t: sRt → tRs | (t): □a → a; (b): a → □◇a
s4 | reflexivity; transitivity: ∀s, t, u: (sRt ∧ tRu) → sRu | (t): □a → a; (4): □a → □□a
s5 | reflexivity; symmetry; transitivity | (t): □a → a; (b): a → □◇a; (4): □a → □□a

the modal logic s4 will be at the center of our attention. it has been shown that the connective ◇ can be defined by the relationship ◇a ≡_def ¬□¬a, so considerations concerning the connective □ can be transferred to similar considerations of the connective ◇. symmetrically to the axiom (t): □a → a, an axiom a → ◇a exists. in this paper, modal logic will be proposed for design process formalization: the modal logic worlds will represent the individual stages (or steps) of the design process, and the accessibility relation will represent the transitions between these stages.
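the semantics just defined is easy to make executable. the following python sketch is ours, not taken from [1] or [14]; the encoding of formulas as nested tuples and the particular frame are illustrative assumptions. it evaluates M ⊨_w f recursively over a finite reflexive and transitive frame.

```python
# a kripke model: worlds, an accessibility relation and a valuation,
# plus recursive evaluation of formulas such as ('box', ('implies', 'p', 'q')).
W = {0, 1, 2}                                            # worlds
R = {(0, 0), (1, 1), (2, 2), (0, 1), (1, 2), (0, 2)}     # reflexive, transitive
V = {'p': {0, 1, 2}, 'q': {1, 2}}                        # atoms -> sets of worlds

def holds(w, f):
    """m |=_w f, following the recursive definition above."""
    if isinstance(f, str):                     # atomic formula
        return w in V.get(f, set())
    op = f[0]
    if op == 'not':
        return not holds(w, f[1])
    if op == 'and':
        return holds(w, f[1]) and holds(w, f[2])
    if op == 'implies':
        return (not holds(w, f[1])) or holds(w, f[2])
    if op == 'box':                            # valid in all accessible worlds
        return all(holds(t, f[1]) for t in W if (w, t) in R)
    if op == 'dia':                            # valid in some accessible world
        return any(holds(t, f[1]) for t in W if (w, t) in R)
    raise ValueError(op)

def valid_in_model(f):
    """m |= f: valid in every world of the model."""
    return all(holds(w, f) for w in W)

print(valid_in_model(('implies', ('box', 'p'), 'p')))                    # (t)
print(valid_in_model(('implies', ('box', 'q'), ('box', ('box', 'q')))))  # (4)
print(holds(0, ('box', 'q')), holds(1, ('box', 'q')))                    # False True
```

the instances of axioms (t) and (4) come out valid here because the chosen relation is reflexive and transitive, exactly as table 1 states.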
3 design space and metamodel theory
in general, the design process is not a simple, "straightforward" mapping from the object specification to the field of design object attributes. on the contrary, the design process consists of gradual discrete steps, and it can branch out, because the designer can choose a transition to one of several steps. moreover, one part of the specification can be a set of conditions which must be valid in every step, while other limits are optional and may or may not be used. these requirements cannot be described by any classical logic. for this reason, modal logic was suggested for design process description; the reasons follow from the described design process characteristics – individual discrete steps, design process branching, and the respecting of necessity or possibility types of knowledge. we will try to show in the following text that modal logic, as determined above with its accessibility relation and connectives of necessity and possibility, is an appropriate apparatus for formalizing the given types of design process.

a set of accessible worlds W was introduced in the modal logic; each world was represented by a finite set of valid formulas using the mapping V, and the individual worlds were connected through the accessibility relation R. modal logic was built as a formal system based on a formal language, logical axioms and inference rules. the mutual connection between these two systems – a design process based on the metamodel theory [3, 4, 5] and modal logic – will be shown in this paper.

definition 1: the formalized metamodel theory T is a triad T = ⟨J, L, A⟩, where J is the theory language represented by modal logic syntax, L is the theory logic represented by the basic axiomatic system of modal logic, and A is the set of proper theory axioms.

this definition comes from the basic determination of a formalized theory. one part of such a theory is the system of proper theory axioms, which enriches the logical axioms of the deductive system of the formalized theory. the proper axioms of the metamodel theory represent the specification of the functional and other attributes of the designed objects.

definition 2: a metamodel m_i is a world w_i, i = 0, 1, 2, …, in the sense of modal logic.

it was shown above that the system of modal logic worlds is used for the interpretation of modal logic formulas. each world is characterized by the formulas valid in it; correspondingly, a metamodel is characterized by a finite set of specifications and attributes of the designed object.

definition 3: the set of metamodels m_0, m_1, …, m_m, … is understood as a non-empty set of worlds w_i, i = 0, 1, 2, …, m, …, of modal logic, where each m_i is represented by the world w_i.

each world is represented by a set of modal logic formulas. the functional and other design object attributes that represent the object specification are part of the world that represents the metamodel m_0 (the object specification). the individual modal logic worlds represent the elements of the set of metamodels. the logical axioms of modal logic and the proper design axioms form part of the metamodel theory [3].

definition 4: the design space over a set of metamodels is a couple of the set of worlds w_i, i = 0, 1, 2, …, m, …, and the accessibility relation R (R ⊆ W × W) of modal logic.

the design space ensures the possibility to test the set of all metamodels accessible from one of the metamodels, e.g., from the metamodel m_0 of the initial specification. definition 4 ensures that each design space can be described by a frame F = ⟨W, R⟩, where W represents the set of metamodels and R ⊆ W × W characterizes the accessibility relation between individual metamodels. the definition ensures that only those metamodels which are represented in the frame F = ⟨W, R⟩ are concerned in the design; their connection is represented by the accessibility relation R.
definition 5: the accessibility relation R of the design space is reflexive and transitive.

the definition comes from design reality. it is assumed that all conclusions derived in one metamodel (in one design stage) are usable in this particular step (reflexivity). similarly, conclusions derived in a transition from one design stage to another are also usable in the following design stages (transitivity). the third property of the accessibility relation – symmetry – does not have a practical sense here, so it is not considered: it would mean that the conclusions created in subsequent steps would be usable in the previous ones, which does not correspond to reality. in that case, the accessibility relation would cease to represent the basic principle of the design process – its evolutionary character. as the accessibility relation of the design space is now determined, the design system can be defined as follows.

definition 6: the design system is understood as a modal logic of type s4.

the definition only formulates the facts given in definition 5 in the notation used for modal logic type names according to table 1: an accessibility relation of modal logic of type s4 is characterized by the reflexivity and transitivity properties. the characteristic properties of the respective accessibility relation, which model the actual course of a design, are thus the reason for choosing modal logic of type s4 for design modeling. the design proceeds gradually from one step to another, and everything that was valid in a previous step is transitively usable in the subsequent steps (a design can be returned by rejecting the following step, i.e. by exemption of a world from the design process). similarly, the properties valid in one step are usable in that step (reflexivity). the given definitions enable us to use the modal logic apparatus for design process formalization – the modalities □p and ◇p, respectively, can be used. the most common informal interpretation of these formulas is "it must be true in the whole design" and "it will be true somewhere in the design".
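as an illustration of definitions 4–6 – a sketch under our own assumed encoding, with hypothetical metamodels m0…m4 and a hypothetical atom 'goal' – the designer's chosen transitions are completed to an s4 accessibility relation by a reflexive-transitive (warshall-style) closure, after which □ and ◇ read exactly as the two interpretations quoted above.

```python
def s4_closure(worlds, steps):
    """reflexive-transitive closure of the design-step relation (def. 5)."""
    r = {(w, w) for w in worlds} | set(steps)   # add reflexivity
    for u in worlds:                            # warshall: add transitivity
        for s in worlds:
            for t in worlds:
                if (s, u) in r and (u, t) in r:
                    r.add((s, t))
    return r

worlds = {'m0', 'm1', 'm2', 'm3', 'm4'}
steps = {('m0', 'm1'), ('m1', 'm2'), ('m1', 'm3'), ('m3', 'm4')}  # branching
R = s4_closure(worlds, steps)

# 'box p' at m0 means: the specification p must hold in every design
# stage reachable from the initial metamodel m0; 'dia goal' means some
# reachable stage satisfies the (hypothetical) atom 'goal'.
V = {'p': {'m0', 'm1', 'm2', 'm3', 'm4'}, 'goal': {'m2', 'm4'}}

def holds(w, f):
    if isinstance(f, str):
        return w in V[f]
    if f[0] == 'box':
        return all(holds(t, f[1]) for (s, t) in R if s == w)
    if f[0] == 'dia':
        return any(holds(t, f[1]) for (s, t) in R if s == w)
    raise ValueError(f[0])

print(holds('m0', ('box', 'p')))     # True: p holds in the whole design
print(holds('m0', ('dia', 'goal')))  # True: some reachable stage meets goal
print(holds('m2', ('dia', 'goal')))  # True only via reflexivity: m2 itself
```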
a number of definitions have been formulated, and a number of theorems that set conditions for the validity of formulas of the formalized theory T have been formulated and proved in our previous work [7]. let us introduce only the fundamental theorem with the main idea of its proof.

theorem:
1. if a formula f of a metamodel m_i, i = 0, 1, 2, …, contains no modal connectives, then the validity of formula f is determined as follows: let us suppose that formula f is invalid in the metamodel m_i and let us check recursively whether this assumption holds for the subformulas of f in the metamodel m_i, down to the level of atoms. if some logical conflict in the validity of the subformulas occurs during the evaluation, then formula f is valid; in the opposite case it is not valid.
2. if a formula f of a metamodel m_i contains some modal connectives, then we proceed as in paragraph 1 with the following exceptions:
a. if validity of a formula □p is assumed, then p must be valid in all metamodels accessible from the tested metamodel m_i.
b. if invalidity of a formula ◇p is assumed, then ¬p must be valid in all metamodels accessible from the tested metamodel m_i.
c. if invalidity of a formula □p is assumed, then at least one metamodel accessible from the tested metamodel m_i in which p is not valid must exist.
d. if validity of a formula ◇p is assumed, then at least one metamodel accessible from the tested metamodel m_i in which p is valid must exist.

proof: the validity of the first part of the theorem follows from the assumption of the consistency of modal logic of type s4: if invalidity of a modal logic formula is assumed, then a proof of a contradiction in the validity of its subformulas implies the validity of this formula. the second part of the theorem comes from the principle of formula evaluation using modal connectives, from the assumption of consistency of the modal logic system s4, and from the theorems published in [1].

the theorem can be understood as an algorithm for formula evaluation in a given interpretation. the axiomatic system and the derivation rules of modal logic s4 are used for the design process model determined by the previous definitions; in the axiomatic system s4 several useful theorems can be proved [1, 14].

4 conclusion
in this article one of the possibilities of describing the design process using the modal logic formalism has been demonstrated. modal logic serves as a good means for standard types of design processes (examples were published in [7]). however, the definition of the set of metamodel elements by means of modal logic need not always be so simple and direct: it will be necessary to add to the deduction methods of modal logic also methods capable of testing whether the found solutions actually reflect the given specification. some inconsistencies can also appear during the design process; abduction seems to be a possible appropriate formal tool here [13]. the idea of design theory again evokes new research in the field of design process description and simulation: synthesis-related methods [8], set-theoretic models [10], a synthesis design process model [9], general design theory [6], and an object-oriented approach [11]. it is a question which line of thought will become the basis for a true "common design theory".

acknowledgements
this research has been supported by gacr grant no. 102/01/0763.

references
[1] chellas, b. f.: modal logic: an introduction. cambridge university press, 1980.
[2] suh, n. p.: the principles of design. new york: oxford university press, 1990.
[3] tomiyama, t., yoshikawa, h.: extended general design theory. in: design theory for cad (yoshikawa, h., warman, e. a., eds.), proc. of the ifip wg 5.2 working conf. on design theory for cad, ifip, tokyo, 1987, p. 95–130.
[4] tomiyama, t.: general design theory and its extension and applications. in: universal design theory (grabowski, h., rude, s., grein, g.), aachen: shaker verlag, 1998, p. 25–46.
[5] akman, v., ten hagen, p. j., tomiyama, t.: a fundamental and theoretical framework for an intelligent cad system. computer aided design, vol. 22, no. 6, july/august 1990.
[6] reich, y.: a critical review of general design theory. research in engineering design, vol. 7, no. 1, 1995, p. 1–18.
[7] jelínek, i.: method for design process description.
6th international design conference design 2000, dubrovnik (croatia), may 2000, p. 107–112.
[8] tsumaya, a., takeda, h., tomiyama, t.: a synthesis-related analysis method of design protocol. in: proceedings of the 12th international conference on engineering design (lindeman, birkhofer, eds.), iced'99, munich, 1999, p. 1949–1952.
[9] washio, t., hew, k. p., tomiyama, t., umeda, y.: the modelling of synthesis – from the viewpoints of mathematical logic. in: proceedings of the 12th international conference on engineering design (lindeman, birkhofer, eds.), iced'99, munich, 1999, p. 1219–1222.
[10] zeng, y., gu, p.: a set-theoretic model of design process. in: proceedings of the 12th international conference on engineering design (lindeman, birkhofer, eds.), iced'99, munich, 1999, p. 1117–1120.
[11] pavkovič, n., marjanovič, d.: entities in the object-oriented design process model. 6th international design conf. design 2000, dubrovnik (croatia), may 2000, p. 29–34.
[12] smets, p., mamdani, e. h., dubois, d., prade, h.: non-standard logics for automated reasoning. academic press, london, 1988.
[13] takeda, h.: abduction for design. in: proceedings of the ifip tc5/wg 5.2 workshop on formal design methods for cad (gero, j. s., tyugu, e., eds.), tallinn, 1994, p. 221–244.
[14] hughes, g. e., cresswell, m. j.: introduction to modal logic. london: methuen, 1982.

prof. dr. ivan jelínek
phone: +420 224 357 214, fax: +420 224 923 325, email: jelinek@fel.cvut.cz
department of computer science and engineering, czech technical university in prague, faculty of electrical engineering, karlovo nám. 13, 121 35 prague 2, czech republic

frequency and magnitude analysis of the macro-instability related component of the tangential force affecting radial baffles in a stirred vessel
p. hasal, j. kratěna, i. fořt
experimental data obtained by measuring the tangential component of the force affecting radial baffles in a flat-bottomed cylindrical mixing vessel stirred with pitched blade impellers is analysed. the maximum mean tangential force is detected at the vessel bottom. the mean force value increases somewhat with decreasing impeller off-bottom clearance and is noticeably affected by the number of impeller blades. spectral analysis of the experimental data clearly demonstrated the presence of a macro-instability (mi) related low-frequency component embedded in the total force at all values of the impeller reynolds number. the dimensionless frequency of the occurrence of the mi force component is independent of stirring speed, position along the baffle, number of impeller blades and liquid viscosity; its mean value is about 0.074. the relative magnitude (Q_MI) of the mi-related component of the total force is evaluated by a combination of proper orthogonal decomposition (pod) and spectral analysis. the relative magnitude Q_MI was analysed in dependence on the frequency of the impeller revolution, the axial position of the measuring point in the vessel, the number of impeller blades, the impeller off-bottom clearance, and the liquid viscosity. higher values of Q_MI are observed at the higher impeller off-bottom clearance and (generally) Q_MI decreases slightly with increasing impeller speed. the Q_MI value decreases in the direction from the vessel bottom to the liquid level. no evident difference was observed between 4 blade and 6 blade impellers. liquid viscosity has only a marginal impact on the Q_MI value.
keywords: stirred vessel, baffles, tangential force, macro-instability, spectral analysis, proper orthogonal decomposition.

1 introduction
liquid flow in mechanically stirred vessels has been studied intensively in recent decades. numerous theoretical and experimental studies have been performed concerning various aspects of liquid flow in stirred vessels, e.g., mean flow velocity, intensity of turbulence, energy dissipation rate, or spatial and temporal scales of the turbulent velocity field. liquid flow in a stirred vessel operated under steady operational conditions may be considered a pseudo-stationary high-dimensional dynamical system constituted by hierarchically ordered unsteady flows (vortices and eddies), whose temporal and spatial scales span several decimal orders of magnitude. recently, a pseudo-periodic large-scale flow has been identified in stirred vessels, occurring with very low frequency and manifesting itself on spatial scales comparable to the size of the mixing vessel [1–10]. this flow was named the macro-instability (mi) of the flow pattern, and its existence has been confirmed by a special mechanical measuring device [7], by laser-doppler velocimetry (ldv) [8–10], and by visual observations [11, 12]. the presence of mi in the flow pattern is typically displayed by a distinct peak in the low-frequency part of the power spectrum of the local liquid velocity or of other liquid flow-related experimental data. comprehensive reviews of the broad spectrum of experimentally observed phenomena related to the macro-instability were given in our previous papers [13, 14]. macro-instability of the flow pattern has a strong impact on mixing processes closely linked to fluid motion, e.g., on local mass- and heat-transport rates [4], local gas hold-up [1], homogenisation rate, etc.
bittorf and kresta [15] identified the macro-instability of the flow pattern as a mechanism responsible for liquid mixing outside the active volume of the primary liquid circulation loop. macro-instability, however, also exerts strong forces on solid surfaces immersed in the stirred liquid, e.g., baffles, draft tubes, cooling and heating coils, etc. these forces may significantly affect the performance of the mixing vessel and, in certain cases, can even cause serious failures of the equipment. an axially located pitched blade impeller in a standard cylindrical mixing vessel equipped with radial baffles exhibits two main force effects: axial and tangential. the axial force affects the radial baffles only very slightly; conversely, it is the tangential force that exhibits most of the dynamic pressure affecting them. the vertical distribution (along the baffle) of the dynamic pressure was measured in a standard mixing vessel by kratěna et al [16–18] over a wide interval of impeller reynolds number values and in dependence on the impeller off-bottom clearance height. the force effects of two types of impellers were studied: pitched blade impellers with four or six blades and the standard rushton turbine. strong qualitative differences in the vertical distribution of the mean tangential force and of its variance were observed: pitched blade impellers elicit maximum force at the vessel bottom, with the magnitude of the force gradually falling in the direction towards the liquid level, while rushton turbine impellers produce maximum force at the impeller level, with the force rapidly decaying in both the below-impeller and above-impeller regions. few attempts have been reported in the literature [19, 20] to separate the deterministic and stochastic components of
mixing phenomena in stirred tanks. letellier et al [19] adopted a hilbert transform based procedure for separating the low-dimensional deterministic part of an experimental time series; kovacs et al [20] used the fourier transform for the same purpose. recently we have established a new technique for detecting the macro-instability of the flow pattern from local velocity data, evaluating its relative magnitude and reconstructing its temporal evolution [13, 14, 21]. the technique is based on a combination of spectral analysis with proper orthogonal decomposition [22–26], and in this paper we apply it to the experimental data obtained by kratěna et al [16–18] by measuring the tangential component of the force exerted on radial baffles by the liquid flow in a mixing vessel, in order to quantify the relative magnitude of its macro-instability related component, to analyse its vertical distribution in the vessel, and to assess the effects of the frequency of the impeller revolution, the number of impeller blades and the liquid viscosity. the frequency of occurrence of the macro-instability related force component is also analysed.

2 experimental
a cylindrical flat-bottomed vessel with four radial baffles was used in all experiments reported in this paper. the dimensions of the vessel and of the impellers are marked in fig. 1 and their values are listed in table 1.

fig. 1: mixing vessel geometry

table 1: dimensions of mixing vessel and impeller
dimension | value
vessel diameter | T = 0.3 m
liquid aspect ratio | H/T = 1
impeller to tank diameter ratio | D/T = 0.333
relative impeller off-bottom clearance | H2/T = 0.20 and 0.35
measuring target relative axial position | H_t/H = 0; 0.1; 0.2; 0.35; 0.5; 0.65; 0.8
relative baffle width | B/T = 0.1
relative impeller blade width | h/D = 0.2
number of impeller blades | n_b = 4 and 6

two pitched blade impellers with six and four blades (pitch angle 45°) were used for stirring; the impellers pumped the liquid towards the bottom of the vessel. water and hot and cold aqueous glycerine solutions (with viscosities of 3 and 6 mpa·s, respectively) were used as working liquids in order to extend the experimentally attainable interval of impeller reynolds number values. the frequency of impeller rotation f_m was varied from 5 s⁻¹ to 9.17 s⁻¹; the corresponding interval of Re_M values was (approximately) from 16000 to 83300. the tangential force affecting the baffles was measured by means of a trailing target (target height h_t = 10 mm, target width b = 28 mm) located in a slit made in the baffle and able to rotate around an axis parallel to the vessel axis. the target was balanced by a couple of (calibrated) springs, and its angular displacement was scanned via a photo-electronic sensor (see kratěna et al [16–18] for details of the measuring equipment). the sampling period was t_s = 20 ms. the signal from the sensor was recorded on a pc (after a/d conversion) and subsequently used for evaluation of the tangential force {F_i, i = 1, …, N_S} affecting the target. the duration of a single experiment was 20 minutes, and typically N_S = 60000 samples were stored. finally, the force F_i was converted to a dimensionless force according to the relation

F_i^* = \frac{F_i}{\rho f_m^2 D^4} \tag{1}

the time series of F_i^* values measured at seven distinct locations H_t/H of the target along the baffle (see table 1) were used for the analysis. at each target height, the value of Re_M was varied over the entire attainable interval. the total number of processed data sets was 335.

3 numerical analysis of experimental data
the numerical procedures used for the experimental data analysis are described only briefly, as details can be found either in the original papers [22–26] or in our previous reports [13, 14, 21].
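for orientation, here is a toy numbers-in/numbers-out sketch of eq. (1) and of the impeller reynolds number defined in the list of symbols at the end of the paper; the force record is synthetic, while the geometry and speeds follow table 1 and the text.

```python
import numpy as np

rng = np.random.default_rng(4)
T = 0.3                       # vessel diameter [m]
D = T / 3.0                   # impeller diameter [m], d/t = 0.333
f_m = 7.5                     # impeller speed [1/s]
rho, mu = 1000.0, 0.001       # water: density [kg/m^3], viscosity [pa s]
t_s = 0.02                    # sampling period [s]
N_s = 60000                   # samples per 20-minute run

F = 0.05 + 0.01 * rng.standard_normal(N_s)   # synthetic force record [n]
F_star = F / (rho * f_m**2 * D**4)           # eq. (1), dimensionless force
Re_M = f_m * D**2 * rho / mu                 # = 7.5e4, inside 16000-83300
print(round(F_star.mean(), 4), Re_M)
```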
the power spectra of the time records of the measured tangential force were used to detect the presence of the component generated by the macro-instability of the flow pattern. the power spectral densities were evaluated by means of an algorithm based on the fast fourier transform [27]. examples of power spectra of the analysed data are shown in fig. 3. the peaks located at the dimensionless frequency f/f_m ≈ 0.075 clearly demonstrate the presence of an mi-related component of the total measured force at all impeller reynolds number values. the value of the dimensionless frequency of the mi-related peak (f/f_m ≈ 0.075) agrees very well with the findings of other authors and of our previous studies, which were, however, based on entirely different experimental data (local liquid velocity) [7–10, 13, 14, 21].

a procedure based on an application of the proper orthogonal decomposition (pod) technique was used for extracting the mi-related low-frequency component from the experimental data. details of the procedure can be found in our previous papers [13, 14, 21], and the general principles of pod are described elsewhere [22]. first, the raw data, i.e., the time series {F_i^*, i = 1, …, N_S} recorded at a given measuring point, are centred by subtracting the mean force value F_m^*:

\varphi_i^* = F_i^* - F_m^*, \quad i = 1, \ldots, N_S \tag{2}

then, the so-called n-window [24–26] is applied to the centred data series {φ_i^*, i = 1, …, N_S}. the n-window extracts n consecutive elements from the series {φ_i^*} into a vector φ_k^*:

\boldsymbol{\varphi}_k^* = \left(\varphi_j^*\right), \quad j = k, \ldots, k + n - 1, \quad k = 1, \ldots, N_S - (n - 1) \tag{3}

the set of vectors {φ_k^*, k = 1, …, N_S − (n − 1)} resulting from the repeated application of the n-window to the centred data is then rearranged into the form of the so-called trajectory matrix W:

\mathbf{W} = \left(\boldsymbol{\varphi}_1^*, \boldsymbol{\varphi}_2^*, \ldots, \boldsymbol{\varphi}_{N_S - n + 1}^*\right)^{\mathrm{T}} \tag{4}

the autocovariance matrix R of the trajectory matrix W is then evaluated by the matrix multiplication

\mathbf{R} = \mathbf{W}^{\mathrm{T}} \mathbf{W} \tag{5}

and its (non-negative) eigenvalues λ_k and eigenvectors v_k, k = 1, …, n, are then determined by any proper algorithm, e.g., by singular value decomposition [27]. the eigenvalues λ_k sorted in decreasing order form the so-called spectrum of pod eigenvalues, and each λ_k value expresses the magnitude of the contribution of the k-th eigenmode to the total analysed signal. the eigenvectors v_k are used for evaluating the eigenmodes a_k(t) of the original data series {F_i^*} using the relation

a_k = \left(\mathbf{W} \cdot \mathbf{v}_k\right) \tag{6}

where (·) denotes the inner product. the eigenmodes are functions of time only and are often called chronos [24].
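the pipeline of eqs. (2)–(6) is compact enough to state in code. the sketch below is ours (numpy), run on a synthetic force record carrying an artificial mi-like oscillation at f/f_m = 0.075; the window length n = 300 matches the value used later in the paper, while all other parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)
f_m, t_s, N_s, n = 7.5, 0.02, 12000, 300
t = t_s * np.arange(N_s)
F_star = 0.01 + 0.004 * np.sin(2 * np.pi * 0.075 * f_m * t) \
         + 0.01 * rng.standard_normal(N_s)

phi = F_star - F_star.mean()                          # eq. (2), centring
W = np.lib.stride_tricks.sliding_window_view(phi, n)  # eqs. (3)-(4), rows
Rm = W.T @ W                                          # eq. (5), n x n matrix
lam, v = np.linalg.eigh(Rm)                           # eigen-decomposition
order = np.argsort(lam)[::-1]                         # sort, decreasing
lam, v = lam[order], v[:, order]
a = W @ v                                             # eq. (6), columns a_k(t)
print(np.round(lam[:5] / lam.sum(), 3))               # leading pod spectrum
```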
the eigenmodes a_k(t) can be used for reconstructing the temporal evolution of the macro-instability related component of the data:

F_{\mathrm{MI}}^*(t_j) = F_m^* + \sum_{k_{\mathrm{MI}}} v_{1,k_{\mathrm{MI}}}\, a_{k_{\mathrm{MI}}}(t_j), \quad j = 1, \ldots, N_S - n + 1 \tag{7}

the summation in eq. (7) is performed over a set of selected values of the index k_MI: the power spectra of the eigenmodes a_k(t) are evaluated first, and then only the eigenmodes with a single significant peak in their power spectra, located exactly at the macro-instability frequency f_MI (cf. fig. 7), are summed in eq. (7); thus the mi-related phenomenon is reconstructed. more details on the eigenmode selection and the summation procedure can be found in our previous papers [13, 14, 21]. the eigenvalues λ_k are used for evaluating the relative magnitude Q_MI of the macro-instability related component of the tangential force affecting the baffles:

Q_{\mathrm{MI}} = \frac{\sum_{j_{\mathrm{MI}}} \lambda_{j_{\mathrm{MI}}}}{\sum_{j=1}^{n} \lambda_j} \tag{8}

the summation in the numerator on the right-hand side of eq. (8) is performed over the same k_MI values as in eq. (7).
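continuing the synthetic example from the previous sketch (again ours and self-contained, with the same assumed signal; the pod steps are repeated in condensed form), the selection of mi eigenmodes by the location of their spectral peak, the relative magnitude after eq. (8) and the first-component reconstruction of eq. (7) look as follows; Q_MI then simply measures the λ-share of the selected modes.

```python
import numpy as np

rng = np.random.default_rng(5)
f_m, t_s, N_s, n = 7.5, 0.02, 12000, 300
t = t_s * np.arange(N_s)
x = 0.004 * np.sin(2 * np.pi * 0.075 * f_m * t) + 0.01 * rng.standard_normal(N_s)
phi = x - x.mean()
W = np.lib.stride_tricks.sliding_window_view(phi, n)
lam, v = np.linalg.eigh(W.T @ W)
lam, v = lam[::-1], v[:, ::-1]                     # decreasing order
a = W @ v                                          # eigenmodes a_k(t)

freqs = np.fft.rfftfreq(a.shape[0], d=t_s) / f_m   # dimensionless f/f_m
k_mi = []
for k in range(10):                                # inspect the leading modes
    psd = np.abs(np.fft.rfft(a[:, k])) ** 2
    if abs(freqs[psd[1:].argmax() + 1] - 0.075) < 0.01:  # sole peak at f_mi?
        k_mi.append(k)                             # typically the leading pair

Q_mi = lam[k_mi].sum() / lam.sum()                 # eq. (8)
F_mi = x.mean() + a[:, k_mi] @ v[0, k_mi]          # eq. (7), first components
print(k_mi, round(Q_mi, 3))
```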
4 results and discussion
first, the vertical distribution of the magnitude and of the variability of the dimensionless tangential force affecting the baffle in the mixing vessel was analysed. the mean values F_m^* were evaluated for each moving target location H_t/H and each mixing vessel configuration used in the experiments (a configuration is specified by the number of impeller blades n_b and by the impeller off-bottom clearance height H2/H). the averaging of the tangential force values was performed over all Re_M values used at a given target location and vessel configuration, i.e., the values measured at different impeller speeds f_m and at different liquid viscosities were averaged. the results are shown in fig. 2. this figure also shows the standard deviation of the dimensionless tangential force resulting from the averaging procedure (error bars); the standard deviation characterises the variability of the value of F_m^* due to the varying Re_M value.

fig. 2: vertical distribution of the mean dimensionless tangential force F_m^* and of its standard deviation in the mixing vessel. empty circles denote mean values; error bars depict the interval of the mean value ± one standard deviation: a) 4 blade impeller, H2/H = 0.2; b) 4 blade impeller, H2/H = 0.35; c) 6 blade impeller, H2/H = 0.2; d) 6 blade impeller, H2/H = 0.35.

it is obvious that the distribution of the mean tangential force F_m^* along the baffle is qualitatively the same for all configurations of the mixing vessel. the force attains its maximum at the vessel bottom and then decreases to a minimum located slightly above the impeller position. the minima attain negative values, i.e., at their positions the direction of the liquid flow and of the resulting force is reverted compared to the flow at the vessel bottom. the decrease in the mean force value is somewhat more rapid for impellers located closer to the vessel bottom (H2/H = 0.2). in the upper part of the vessel (H_t/H ≥ 0.5) the mean force gradually increases back to positive values that are, however, considerably lower than the force values recorded at the vessel bottom. the mean tangential force affecting the baffles is generally lower at the higher value of the impeller off-bottom clearance, i.e., at H2/H = 0.35 (panels b) and d) in fig. 2), and for the four blade impeller (compare panels a) and b) with panels c) and d) in fig. 2). the variability of the mean tangential force (error bars in fig. 2) decreases from the bottom to the top of the mixing vessel for all vessel configurations. the variability of the tangential force is particularly profound at the vessel bottom and in the adjacent below-impeller region, i.e., in the region where the baffles are affected by the impeller discharge stream. the effects of the impeller rotation speed and of the liquid viscosity are suppressed in the above-impeller region, i.e., in the ascending liquid stream, where the tangential force is generally weak.

spectral analysis of the experimental time series was performed in order to detect the macro-instability related component of the total tangential force embedded in the data. the analysis clearly demonstrates the presence of distinct peaks in the low-frequency part (f/f_m < 0.1) of the power spectra for all vessel configurations and all impeller reynolds number values. examples are shown in fig. 3, where the power spectra of the data measured at the lowest position of the measuring target (H_t/H = 0) are depicted. the heights of the mi-related peaks decrease with increasing Re_M, but the peaks are easily detectable even at the highest Re_M values used in the experiments.

fig. 3: power spectral densities of the tangential force measured at the vessel bottom, H_t/H = 0: a) 4 blade impeller, H2/H = 0.2; b) 4 blade impeller, H2/H = 0.35; c) 6 blade impeller, H2/H = 0.2; d) 6 blade impeller, H2/H = 0.35. full lines: Re_M = 16000; dotted lines: Re_M = 25000; dash-dotted lines: Re_M = 50000.

fig. 3 further suggests that the dimensionless frequency f_MI/f_m ≈ 0.075 corresponding to the maxima of the mi-related peaks does not depend on the impeller speed, the number of impeller blades or the impeller off-bottom clearance. in order to analyse this presumption in greater detail, the dimensionless peak frequencies f_MI/f_m were determined for each data set, and averaged values and standard deviations were then evaluated. the results are summarised in table 2.

table 2: mean values of the dimensionless frequency f_MI/f_m of the macro-instability related component of the dimensionless tangential force affecting the baffle
n_b | H2/H | mean value | sd | cv
4 | 0.20 | 0.0691 | 0.0081 | 0.117
4 | 0.35 | 0.0718 | 0.0057 | 0.079
6 | 0.20 | 0.0765 | 0.0078 | 0.102
6 | 0.35 | 0.0775 | 0.0077 | 0.099

the peak frequencies take practically the same value for all vessel configurations: the differences of the mean peak frequencies between configurations are comparable in magnitude with their standard deviations (the coefficient of variation takes values of about 10 % in all cases), and the differences are therefore not significant. nevertheless, we can speculate that the dimensionless frequency f_MI/f_m increases somewhat with an increasing number of impeller blades.

the dependencies of the frequency of the mi-related peaks in the power spectra on the frequency of the impeller revolution and on the axial position along the baffle are shown in figs. 4 and 5. fig. 4 shows the f_MI/f_m values averaged along the baffle, i.e., averaged across all moving target positions H_t/H, as a function of the frequency of the impeller revolution. the full lines in fig. 4 indicate the overall mean values from table 2, and the dashed lines delimit the interval of two standard deviations. it is evident that the f_MI/f_m value does not depend on the impeller speed or on the viscosity of the working liquid.
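the peak-frequency statistics of table 2 are straightforward to emulate. the following sketch (synthetic records; the band limits and the spread of the planted peak are our own illustrative choices) locates the psd maximum in the low-frequency band f/f_m < 0.1 for a set of records and reports the mean, sd and cv as in the table.

```python
import numpy as np

rng = np.random.default_rng(6)
f_m, t_s, N_s = 7.5, 0.02, 60000
t = t_s * np.arange(N_s)
peaks = []
for _ in range(12):                               # twelve synthetic 'data sets'
    f_mi = (0.075 + 0.005 * rng.standard_normal()) * f_m
    x = np.sin(2 * np.pi * f_mi * t) + 2.0 * rng.standard_normal(N_s)
    freqs = np.fft.rfftfreq(N_s, d=t_s) / f_m     # dimensionless f/f_m
    psd = np.abs(np.fft.rfft(x - x.mean())) ** 2
    band = (freqs > 0.01) & (freqs < 0.1)         # low-frequency band only
    peaks.append(freqs[band][psd[band].argmax()])
peaks = np.array(peaks)
print(peaks.mean(), peaks.std(), peaks.std() / peaks.mean())  # mean, sd, cv
```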
this value agrees well with the values reported by other authors [7–10, 15], although they used qualitatively different experimental data, for example local liquid velocity. quantification of the relative contribution of the mi-related component to the total tangential force affecting the baffle was performed by means of proper orthogonal decomposition of the experimental time series, as described in section 3. the used length of the n-window was n � 300. pod analysis unveiled a quite complex inner structure of the © czech technical university publishing house http://ctn.cvut.cz/ap/ 63 acta polytechnica vol. 42 no. 3/2002 nb h h2 mean value sd cv 4 0.20 0.0691 0.0081 0.117 4 0.35 0.0718 0.0057 0.079 6 0.20 0.0765 0.0078 0.102 6 0.35 0.0775 0.0077 0.099 table 2: mean values of dimensionless frequency f fmi m of macro-instability related component of dimensionless tangential force affecting the baffle fig. 3: power spectral densities of tangential force measured at vessel bottom, h ht � 0 : a) 4 blade impeller, h h2 0 2� . ; b) 4 blade impeller, h h2 0 35� . ; c) 6 blade impeller, h h2 0 2� . ; d) 6 blade impeller, h h2 0 35� . . full lines: rem �16000 ; dotted lines: rem � 25000 ; dash-dotted lines: rem � 50000. low-frequency domain of the force: the spectrum of pod eigenvalues in fig. 6 documents that the values of �k decay quite slowly with increasing index k, i.e., the total force is composed of a large number of significant components. there is typically one leading (maximum) eigenvalue (k � 1) followed by 7–9 medium-valued eigenvalues (k � 2–10) capturing the low-frequency components of the force and, finally, there is a tail of slowly decaying low-valued eigenvalues (k > 10) capturing the high-frequency (turbulent and noisy) components of the force. the leading eigenvalue �1 reflects the presence of a very slowly varying component in the measured signal. its dimensionless frequency is of the order of 10�4–10�3 (substantially below the typical dimensionless mi frequency: 10�2) and can be assigned to the unsteadiness of the operational conditions of the vessel or of the measuring equipment. the tail parts of the pod eigenvalue spectra in fig. 6 correspond to eigenmodes with frequencies considerably above the typical mi frequency. these eigenmodes capture the fast, low spatial scale liquid motions (turbulent eddies) in the vessel. the macro-instability component of the total tangential force is typically captured by one or two (exceptionally by three) of the medium-valued eigenvalues (k � 2–10). this is documented in fig. 7, where the power spectrum of the measured force is plotted (thick full line) together with the power spectra of the first ten pod eigenmodes. two eigenmodes (specifically eigenmodes with k � 2 and k � 3) contribute to the mi component as their sole peaks (emphasised by empty and filled circles in fig. 7) are located at exactly the same frequency as the mi-related peak in the spectrum of measured (non-decomposed) force. other eigenmodes (k � 4) do not contribute to the mi component, as their peaks (plotted with dashed lines) do not coincide with the mi peak of total force, but are evenly spread across the low-frequency domain. their peak frequencies are either lower or higher than fmi, but do not coincide with fmi. the relative magnitude q mi of the mi fraction of the total force was evaluated using eq. (8) with kmi � {2, 3}. in this way all data sets were processed and the q mi values were evaluated for all experimental configurations. the results are plotted in figs. 
fig. 4: dimensionless frequency f_MI/f_m of the macro-instability related fraction of the total tangential force, averaged along the vessel height, as a function of the frequency of impeller rotation f_m (circles: water; other symbols: cold and hot glycerine solutions): a) 4 blade impeller, H2/H = 0.2; b) 4 blade impeller, H2/H = 0.35; c) 6 blade impeller, H2/H = 0.2; d) 6 blade impeller, H2/H = 0.35. full line: mean value; dashed lines: mean value ± one standard deviation.
fig. 8: relative magnitude qmi of the mi-related fraction of the total tangential force affecting the baffle as a function of the impeller speed fm: a) 4 blade impeller, h2/h = 0.2; b) 4 blade impeller, h2/h = 0.35; c) 6 blade impeller, h2/h = 0.2; d) 6 blade impeller, h2/h = 0.35. full line: cold glycerine solution; dashed line: hot glycerine solution; dash-dotted line: water. the points plotted at the same fm value were obtained at distinct h/ht values.

no effect of the number of impeller blades on the qmi value is visible. the magnitude of the macro-instability related component of the tangential force varies significantly in the vertical direction (fig. 9). the qmi value is, in general, higher at the vessel bottom, h/ht = 0. this fact becomes very obvious when the impeller location is higher (h2/h = 0.35) – cf. panels b) and d) in fig. 9. when the impeller location is lower (h2/h = 0.20), no monotonic dependence of the qmi value on the vertical position is obvious, but a decreasing trend of the qmi value with increasing vertical position can still be deduced, keeping in mind that the relative error of the qmi evaluation is about ±15 %. fig. 9 confirms the conclusions deduced from fig. 8: the qmi value is substantially influenced by the impeller off-bottom clearance and by the number of impeller blades (higher values of both quantities invoke higher qmi values); liquid viscosity has only marginal effects on the qmi value.

5 conclusions
the existence of macro-instability of the flow pattern in stirred vessels has been noted many times in the literature, but few attempts have been made to evaluate its quantitative measures. recently we described a method for evaluating the relative magnitude of the mi-related component of the local liquid velocity [13, 14, 21]. in this paper the procedure was adapted for an analysis of qualitatively distinct data – the force affecting radial baffles in a stirred vessel. the procedure proved to be an efficient and reliable tool. the frequency of the mi-related component of the force and its relative magnitude were determined. the frequency of the mi-related component is independent of the operating conditions in the vessel and of the geometrical configuration of the vessel and the stirrer. the mi-related component forms a significant part of the total tangential force affecting the baffle.

acknowledgement
this project was supported by the ministry of education of the czech republic (projects: msm223400007 and j04/98:212200008).

fig. 9: vertical distribution of the relative magnitude qmi of the mi-related fraction of the total tangential force: a) 4 blade impeller, h2/h = 0.2; b) 4 blade impeller, h2/h = 0.35; c) 6 blade impeller, h2/h = 0.2; d) 6 blade impeller, h2/h = 0.35. full line: cold glycerine solution; dashed line: hot glycerine solution; dash-dotted line: water.
list of symbols
a(t) – pod eigenmode
b – baffle width [m]
b – moving target width [m]
d – impeller diameter [m]
fm – frequency of impeller revolution [s−1]
fmi – frequency of macro-instability [s−1]
f – tangential force affecting baffles [n]
f* – dimensionless tangential force affecting baffles
fm* – mean dimensionless force
h – impeller blade width [m]
ht – moving target height [m]
h – liquid filling height [m]
h2 – impeller off-bottom clearance [m]
ht – moving target axial position [m]
nb – number of impeller blades
ns – number of samples
qmi – relative magnitude of mi-related fraction of tangential force
r – autocovariance matrix
rem – impeller reynolds number, rem = fm d2 ρ/μ
t – time [s]
t – mixing vessel diameter [m]
ts – sampling period [s]
v – pod eigenvectors
w – trajectory matrix
λ – pod eigenvalue
ρ – liquid density [kg m−3]
μ – dynamical liquid viscosity [pa s]
f̃* – centred value of tangential force [n]
f̃* – vector of centred values of tangential force [n]

abbreviations
a/d – analog to digital conversion
cv – coefficient of variation
ldv – laser doppler velocimetry
mi – macro-instability
pod – proper orthogonal decomposition
psd – power spectral density
sd – standard deviation

references
[1] bakker, a., van den akker, h. e. a.: gas-liquid contacting with axial flow impellers. transactions of the institution of chemical engineers, 72a, 1994, p. 573–582.
[2] bakker, a., van den akker, h. e. a.: single-phase flow in stirred reactors. transactions of the institution of chemical engineers, 72a, 1994, p. 583–593.
[3] chapple, d., kresta, s.: the effect of geometry on the stability of flow patterns in stirred tanks. chemical engineering science, 21, 1994, p. 3651–3660.
[4] haam, s., brodkey, r. s., fasano, j. b.: local heat transfer in a mixing vessel using heat flux sensors. industrial and engineering chemistry research, 31, 1992, p. 1384–1391.
[5] kresta, s. m., wood, p. m.: the mean flow field produced by a 45° pitched blade turbine: changes in the circulation pattern due to off-bottom clearance. canadian journal of chemical engineering, 71, 1993, p. 42–53.
[6] winardi, s., nagase, y.: unstable phenomenon of flow in a mixing vessel with a marine propeller. journal of chemical engineering of japan, 24, 1991, p. 243–249.
[7] brůha, o., fořt, i., smolka, p., jahoda, m.: experimental study of turbulent macro-instabilities in an agitated system with axial high-speed impeller and with radial baffles. collection of czechoslovak chemical communications, 61, 1996, p. 856–867.
[8] montes, j.-l., boisson, h.-c., fořt, i., jahoda, m.: local study of velocity field macro-instabilities in mechanically agitated fluids in a mixing vessel. 12th international congress chisa’96, praha, aug. 25–30, 1996.
[9] montes, j.-l.: etude experimentale des fluctuations de vitesse de basse frequence dans une cuve agitee axialement. phd thesis, institut national polytechnique de toulouse, 1997.
[10] montes, j.-l., boisson, h.-c., fořt, i., jahoda, m.: velocity field macro-instabilities in an axially agitated mixing vessel. chemical engineering journal, 67, 1997, p. 139–145.
[11] guillard, f., trägårdh, c., fuchs, l.: new image analysis methods for the study of mixing patterns in stirred tanks. the canadian journal of chemical engineering, 78, 2000, p. 273–285.
[12] guillard, f., trägårdh, c., fuchs, l.: a study on the instability of coherent mixing structures in a continuously stirred tank. chemical engineering science, 55, 2000, p. 5657–5670.
[13] hasal, p., montes, j.-l., boisson, h.-c., fořt, i.: macro-instabilities of velocity field in stirred vessel: detection and analysis.
chemical engineering science, 55, 2000, p. 391–401.
[14] hasal, p., fořt, i.: macro-instabilities of the flow pattern in a stirred vessel: detection and characterization using local velocity data. acta polytechnica, 2000, vol. 40, p. 55–67.
[15] bittorf, k. j., kresta, s. m.: active volume of mean circulation for stirred tanks agitated with axial impellers. chemical engineering science, 55, 2000, p. 1325–1335.
[16] kratěna, j., fořt, i., růžička, m., brůha, o.: analysis of dynamic stress affecting a radial baffle in a mechanically agitated system. acta polytechnica, 1999, vol. 39, p. 11–38.
[17] kratěna, j., fořt, i., brůha, o., pavel, j.: local dynamic effect of mechanically agitated liquid on a radial baffle. in: “proceedings of the 10th european conference on mixing” (editors: h. e. a. van den akker and j. j. derksen), amsterdam, elsevier 2000, p. 369–376.
[18] kratěna, j., fořt, i., brůha, o., pavel, j.: distribution of dynamic pressure along a radial baffle in an agitated system with standard rushton turbine impeller. transactions of the icheme, 79a, 2001, p. 819–823.
[19] letellier, c., le sceller, l., gousbet, g., lusseyran, f., kemoun, a., izrar, b.: recovering deterministic behavior from experimental time series in mixing reactor. a.i.ch.e. journal, 43, 1997, p. 2194–2202.
[20] kovacs, t., trägårdh, c., fuchs, l.: fourier spectrum to recover deterministic and stochastic behavior in stirred tanks. a.i.ch.e. journal, 47, 2001, p. 2167–2176.
[21] hasal, p., montes, j.-l., boisson, h., fořt, i.: macro-instabilities of velocity field in stirred vessel: detection, analysis and modeling. 13th international congress chisa’98, prague, august 23–28, 1998.
[22] lumley, j. l.: stochastic tools in turbulence. new york: academic press, 1970.
[23] hožič, m., stefanovska, a.: karhunen–loève decomposition of peripheral blood flow signal. physica a, 280, 2000, p. 587–601.
[24] aubry, n., guyonnet, r., lima, r.: spatiotemporal analysis of complex signals: theory and applications. journal of statistical physics, 64, 1991, p. 683–739.
[25] broomhead, d. s., king, g. p.: extracting qualitative dynamics from experimental data. physica d, 20, 1986, p. 217–236.
[26] kolodner, p., slimani, s., aubry, n., lima, r.: characterization of dispersive chaos and related states of binary fluid convection. physica d, 85, 1995, p. 165–224.
[27] press, w. h., teukolsky, s. a., vetterling, w. t., flannery, b. p.: numerical recipes in fortran. cambridge: cambridge university press, 1992.

doc. ing. pavel hasal, csc.
phone: +420 224 353 167, fax: +420 233 337 335
e-mail: pavel.hasal@vscht.cz
institute of chemical technology, prague
department of chemical engineering & center for nonlinear dynamics of chemical and biological systems
technická 5, 166 28 praha 6, czech republic

ing. jiří kratěna
phone: +420 224 352 713
e-mail: kratena@student.fsid.cz

doc. ing. ivan fořt, drsc.
phone: +420 224 352 713, fax: +420 224 310 292
e-mail: fort@fsid.cvut.cz
dept. of process engineering
czech technical university, faculty of mechanical engineering
technická 4, 166 07 praha 6, czech republic
acta polytechnica 60(1):12–24, 2020
doi:10.14311/ap.2020.60.0012
© czech technical university in prague, 2020, available online at https://ojs.cvut.cz/ojs/index.php/ap

homogenization of transport processes and hydration phenomena in fresh concrete

michal beneš (a), radek štefan (b, ∗)
a – czech technical university in prague, faculty of civil engineering, department of mathematics, thákurova 7, 166 29 prague 6, czech republic
b – czech technical university in prague, faculty of civil engineering, department of concrete and masonry structures, thákurova 7, 166 29 prague 6, czech republic
∗ corresponding author: radek.stefan@fsv.cvut.cz

abstract. the problem of hydration and transport processes in fresh concrete is strongly coupled and non-linear, and therefore, very difficult for numerical modelling. physically accurate results can be obtained using fine-scale simulations, which are, however, extremely time consuming. therefore, there is an interest in developing new physically accurate and computationally effective models. in this paper, a new fully coupled two-scale (meso-macro) homogenization framework for modelling of simultaneous heat transfer, moisture flows, and hydration phenomena in fresh concrete is proposed. a modified mesoscale model is first introduced. in this model, concrete is assumed as a composite material with two periodically distributed mesoscale components, cement paste and aggregates. a homogenized model is then derived by an upscaling method from the mesoscale model. the coefficients for the homogenized model are obtained from the solution of a periodic cell problem. for solving the periodic cell problem, two approaches are used – a standard finite element method and a simplified closed-form approximation taken from the literature. the homogenization framework is then implemented in the matlab environment and finally employed for illustrative numerical experiments, which verify that the homogenized model provides physically accurate results comparable with the results obtained by the mesoscale model. moreover, it is verified that, using the homogenization framework with a closed-form approach to the periodic cell problem, significant computational cost savings can be achieved.

keywords: concrete, hydration, transport processes, homogenization, numerical experiments.

1. introduction
modelling of transport processes and hydration phenomena in early age concrete is a challenging engineering problem, since hydration of cement is an exothermic reaction, which may result in high temperatures and related deformations, stresses, and cracking in concrete structures, e.g. [1–7] and references therein. fresh concrete is a heterogeneous porous material highly saturated with liquid water, e.g. [4]. in the present paper, concrete is assumed as a composite material with two periodically distributed mesoscale components, cement paste and aggregates, represented by multiphase hygroscopic and heat conducting rigid porous media partially saturated with liquid water, e.g. [4, 8]. coupled transport processes in porous materials are associated with balance equations for mass of moisture and heat energy of the whole medium, e.g. [9]. for the mesoscopic description of transport processes in cement paste and aggregates, we use the balance equations based on the averaging theory, see [4] and references therein, particularly [10–12], for more details.
the governing equations at the mesoscale level are completed by an appropriate set of constitutive equations, material data, and physically relevant boundary and initial conditions. it is well known and closely described in the literature that the problem of hydration and transport processes in concrete is strongly coupled and non-linear, e.g. [4, 7] and references therein. this, together with the complexity of the mesoscopic structure of the concrete composite, makes detailed numerical simulations of the problem extremely time consuming. therefore, there is an interest in developing new physically accurate and computationally effective models based on a sophisticated numerical coupling at different scales, see e.g. [13–16], see also our previous work [17, 18]. the homogenization method is one of the most advanced techniques in upscaling the response of the microstructure of heterogeneous materials, e.g. [19–28]. in this method, the solution of a fine scale problem is used to examine the local material behaviour at the macroscale. combining the heat transfer together with mass flows through fresh concrete motivates us to construct a fully coupled multi-field and two-scale (meso-macro) homogenization framework.

2. the mesoscale model
in the mesoscale model, concrete is assumed as a composite material with two periodically distributed mesoscale components, cement paste and aggregates, represented by multiphase hygroscopic and heat conducting rigid porous media partially saturated with liquid water, e.g. [4, 8]. the detailed description of the assumptions adopted for developing the model is given in our previous work [17]. let $\Omega$ be a polygonal domain with boundary $\partial\Omega$. consider a concrete composite consisting of two flow regions (aggregates and cement paste) periodically distributed in a domain $\Omega$ with period $\varepsilon Y$, where $Y = (0,1)^2$ is a periodicity cell split into two complementary parts $Y_a$ and $Y_c$. throughout the paper, subscripts $a$ and $c$ refer to aggregates and cement paste, respectively. by $\chi_a$ and $\chi_c$, we denote the corresponding characteristic functions of $Y_a$ and $Y_c$, respectively, extended $Y$-periodically to $\mathbb{R}^2$. from the geometrical point of view, $\varepsilon$ is the characteristic length representing the small scale variability of the concrete composite. in this paper, we present the homogenization result for the problem (indexed by $\varepsilon$)

$\partial_t\bigl[\rho_w\bigl(\chi_c^\varepsilon\phi_c(\xi^\varepsilon)+\chi_a^\varepsilon\phi_a\bigr)s(p^\varepsilon)\bigr] - \nabla\cdot\Bigl[\tfrac{\rho_w k_r(s(p^\varepsilon))}{\mu(\vartheta^\varepsilon)}\bigl(\chi_c^\varepsilon k_c(\xi^\varepsilon)+\chi_a^\varepsilon k_a\bigr)\nabla p^\varepsilon\Bigr] = \alpha_1\chi_c^\varepsilon f(p^\varepsilon,\vartheta^\varepsilon,\xi^\varepsilon),$ (1)

$\partial_t\bigl[c_w\rho_w\bigl(\chi_c^\varepsilon\phi_c(\xi^\varepsilon)+\chi_a^\varepsilon\phi_a\bigr)s(p^\varepsilon)\vartheta^\varepsilon\bigr] + \partial_t\bigl[\bigl(\chi_c^\varepsilon\rho_{sc}c_{sc}(1-\phi_c(\xi^\varepsilon))+\chi_a^\varepsilon\rho_{sa}c_{sa}(1-\phi_a)\bigr)\vartheta^\varepsilon\bigr] - \nabla\cdot\bigl[\bigl(\chi_c^\varepsilon\lambda_c(p^\varepsilon,\vartheta^\varepsilon,\xi^\varepsilon)+\chi_a^\varepsilon\lambda_a(p^\varepsilon,\vartheta^\varepsilon)\bigr)\nabla\vartheta^\varepsilon\bigr] - \nabla\cdot\Bigl[c_w\rho_w\vartheta^\varepsilon\tfrac{k_r(s(p^\varepsilon))}{\mu(\vartheta^\varepsilon)}\bigl(\chi_c^\varepsilon k_c(\xi^\varepsilon)+\chi_a^\varepsilon k_a\bigr)\nabla p^\varepsilon\Bigr] = \alpha_2\chi_c^\varepsilon f(p^\varepsilon,\vartheta^\varepsilon,\xi^\varepsilon),$ (2)

coupled with an integral condition

$\xi^\varepsilon(x,t) = \int_0^t f\bigl(p^\varepsilon(x,s),\vartheta^\varepsilon(x,s),\xi^\varepsilon(x,s)\bigr)\,\mathrm{d}s.$ (3)

the unknowns in the model are the temperature $\vartheta^\varepsilon$, water pressure $p^\varepsilon$, and the memory function $\xi^\varepsilon$, the so-called degree of hydration. from the physical point of view, equations (1) and (2), respectively, represent the mass balance of moisture (liquid water) and the heat equation for the porous system, see e.g. [9]. equation (3) represents an additional integral condition. such a type of equation arises in the theory of heat conduction when the inner heat sources are of special types, in particular, in so-called problems of hydration heat. in this case, the intensity of the inner sources of heat also depends on the amount of heat already developed, see [29].
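in computations, the memory condition (3) is a volterra integral that is advanced together with the balance equations by one-step quadrature. the sketch below (python; purely illustrative — the rate function f_demo is a made-up placeholder, not the constitutive law specified later in section 7) shows the explicit rectangle-rule update, which reappears as eq. (15) of the fe scheme:

import math

def advance_hydration(xi, p, theta, f, dt):
    # one explicit step of (3): xi_{n+1} = xi_n + dt * f(p_n, theta_n, xi_n)
    return xi + dt * f(p, theta, xi)

def f_demo(p, theta, xi, xi_inf=0.8, b1=2.3e-5, qe_over_r=4600.0, theta_ref=293.15):
    # hypothetical placeholder: rate proportional to the remaining hydration
    # capacity, with an arrhenius-type temperature factor (equal to 1 at theta_ref)
    return b1 * max(xi_inf - xi, 0.0) * math.exp(qe_over_r * (1.0 / theta_ref - 1.0 / theta))

xi, dt = 0.0, 30.0
for _ in range(2880):                       # 24 hours with dt = 30 s
    xi = advance_hydration(xi, -2.155e6, 293.15, f_demo, dt)
print(round(xi, 3))                         # ~0.69, approaching xi_inf = 0.8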
in the mesoscale model, we assume different hydraulic and thermal characteristics in the aggregates and the cement paste, respectively. in particular, $\phi_\star$ ($\star = a, c$) is the porosity, and $s$ represents the degree of saturation with liquid water. further, $k_\star$ is the intrinsic permeability, $k_r$ represents the relative hydraulic conductivity, $\lambda_\star$ is the thermal conductivity function, and $\mu$ is the temperature dependent viscosity of the fluid. material constants are as follows: $\rho_w$ is the density of liquid water and $c_w$ represents the isobaric heat capacity of water. moreover, $\rho_{s\star}$ and $c_{s\star}$, respectively, are the mass density and the isobaric heat capacity of the solid. finally, $f$ is the hydration degree rate, and the parameters $\alpha_1$ and $\alpha_2$ on the right hand sides of (1) and (2) express the final values of the mass of hydrated water and the amount of released heat energy during hydration, respectively. to complete the model, it is further necessary to prescribe the initial and boundary conditions. the initial conditions specify the initial fields of the water pressure and temperature:

$p^\varepsilon(0) = p_0, \quad \vartheta^\varepsilon(0) = \vartheta_0 \quad \text{in } \Omega.$ (4)

the so-called newton (robin type) boundary conditions are prescribed on the boundary $\partial\Omega$, where the water flux through the boundary of the domain is proportional to the difference between the pressure at the boundary and the outer pressure $p_\infty$. similarly, the heat flux at the boundary of the system is proportional to the difference between the temperature at the boundary and the external temperature $\vartheta_\infty$.

3. the homogenized problem
the system (1)–(3) has been theoretically investigated in [17, 29]. instead of solving (1)–(3) on a fine mesh resolving the small scale variability of $\varepsilon$, the basic idea of the upscaling methods is to replace the heterogeneous medium by an “equivalent” homogeneous one. it has been shown in [17] that there is a sequence $\{\varepsilon_j\}$, $\lim_{j\to+\infty}\varepsilon_j = 0^+$, such that $[p^{\varepsilon_j},\vartheta^{\varepsilon_j},\xi^{\varepsilon_j}]$ converges in a suitable topology to the solution of the following upscaled equations.

water conservation equation:
$\partial_t\bigl[\rho_w(\chi_c^*\phi_c(\xi)+\chi_a^*\phi_a)s(p)\bigr] - \nabla\cdot\bigl[a^*(p,\vartheta,\xi)\nabla p\bigr] = \alpha_1\chi_c^* f(p,\vartheta,\xi)$ (5)

energy conservation equation:
$\partial_t\bigl[c_w\rho_w(\chi_c^*\phi_c(\xi)+\chi_a^*\phi_a)s(p)\vartheta\bigr] + \partial_t\bigl[\bigl(\chi_c^*\rho_{sc}c_{sc}(1-\phi_c(\xi))+\chi_a^*\rho_{sa}c_{sa}(1-\phi_a)\bigr)\vartheta\bigr] - \nabla\cdot\bigl[\lambda^*(p,\vartheta,\xi)\nabla\vartheta + c_w\vartheta\,a^*(p,\vartheta,\xi)\nabla p\bigr] = \alpha_2\chi_c^* f(p,\vartheta,\xi).$ (6)

the system is coupled with the integral condition
$\xi(x,t) = \int_0^t f\bigl(p(x,s),\vartheta(x,s),\xi(x,s)\bigr)\,\mathrm{d}s.$ (7)

in (5) and (6), the homogenized coefficient functions are given by
$\chi_c^* = \int_Y \chi_c(y)\,\mathrm{d}y, \qquad \chi_a^* = \int_Y \chi_a(y)\,\mathrm{d}y,$

$\lambda_{ij}^*(\omega,\eta,\zeta) = \int_Y \lambda(y,\omega,\eta,\zeta)\Bigl(\delta_{ij} + \delta_{ik}\frac{\partial v_j}{\partial y_k}\Bigr)\mathrm{d}y, \quad \text{with } \lambda(y,\omega,\eta,\zeta) = \chi_c(y)\lambda_c(\omega,\eta,\zeta)+\chi_a(y)\lambda_a(\omega,\eta),$ (8)

and
$a_{ij}^*(\omega,\eta,\zeta) = \int_Y a(y,\omega,\eta,\zeta)\Bigl(\delta_{ij} + \delta_{ik}\frac{\partial w_j}{\partial y_k}\Bigr)\mathrm{d}y, \quad \text{with } a(y,\omega,\eta,\zeta) = \frac{\rho_w k_r(s(\omega))}{\mu(\eta)}\bigl[\chi_c(y)k_c(\zeta)+\chi_a(y)k_a\bigr],$ (9)

where $w_i$ and $v_i \in W^{1,2}_{\mathrm{per}}(Y)$, $i = 1, 2$, are periodic solutions (in a weak sense) of the following cell problems, respectively,
$-\nabla_y\cdot\bigl(a(y,\omega,\eta,\zeta)(e_i + \nabla_y w_i)\bigr) = 0 \quad \text{in } Y$ (10)
and
$-\nabla_y\cdot\bigl(\lambda(y,\omega,\eta,\zeta)(e_i + \nabla_y v_i)\bigr) = 0 \quad \text{in } Y.$ (11)

here, a summation convention is used, i.e. summation is performed over repeated indices. further, $\delta_{ij}$ (or $\delta_{ik}$) denotes the kronecker delta and $\{e_1, e_2\}$ is the canonical basis of $\mathbb{R}^2$. note that the problems (10) and (11), respectively, define $w_i$ and $v_i$, $i = 1, 2$, only up to an additive constant. usually, we choose $w_i$ and $v_i$ such that
$\int_Y w_i\,\mathrm{d}y = 0 \quad \text{and} \quad \int_Y v_i\,\mathrm{d}y = 0.$ (12)
it is easily seen that the homogenized coefficients (8) and (9) do not depend on the choice of the additive constant.
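a useful sanity check on (10)–(12): in one space dimension the cell problem can be solved by hand, because the flux a(y)(1 + w′(y)) must be constant, and periodicity of w forces the homogenized coefficient to be the harmonic mean of a(y). the short python sketch below (illustrative coefficient values only) verifies this; in two dimensions no such closed form exists in general, which motivates the numerical treatment of section 4.

import numpy as np

def effective_coeff_1d(a_vals):
    # 1-d analogue of (10): -(a(y)(1 + w'(y)))' = 0 with y-periodic w.
    # the flux c = a(y)(1 + w'(y)) is constant, and periodicity of w gives
    # int_0^1 w'(y) dy = 0, hence c = (int_0^1 1/a(y) dy)^-1: the harmonic mean
    return 1.0 / np.mean(1.0 / np.asarray(a_vals, float))

y = np.linspace(0.0, 1.0, 1000, endpoint=False)
a = np.where((y > 0.25) & (y < 0.75), 0.1, 1.0)   # step coefficient as in (18) below
print(effective_coeff_1d(a))    # ~0.182, far below the arithmetic mean 0.55
# the corrector gradient follows as w'(y) = c / a(y) - 1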
4. numerical solution
4.1. macroscopic fe formulation
let $0 = t_0 < t_1 < \dots < t_N = T$ be an equidistant partitioning of the time interval $[0,T]$ with the discrete time step $\Delta t$, $\{t_n\}_{n=0}^{N}$, $\Delta t = T/N$. for any function $\zeta$, we will use the approximation $\zeta_n \approx \zeta(t_n)$ and denote the backward time difference as $\delta_{\Delta t}[\zeta_{n+1}]$, where $\delta_{\Delta t}[\zeta_{n+1}] := \frac{\zeta_{n+1}-\zeta_n}{\Delta t} \approx \frac{\mathrm{d}\zeta(t_{n+1})}{\mathrm{d}t}$. by $\mathcal{T}_h$, we denote an admissible quadrilateral partition of $\Omega$ with a mesh size $h$, with standard properties from the finite element theory (see e.g. [30]). let $W^h$ be the standard conforming linear finite element space over $\mathcal{T}_h$. let $p_0^h = p_0$, $\vartheta_0^h = \vartheta_0$, and $\xi_0^h = 0$ in $\Omega$. for $n = 0,\dots,N-1$, we seek $[p_{n+1}^h,\vartheta_{n+1}^h,\xi_{n+1}^h] \in W^h \times W^h \times C(\overline{\Omega})$, the approximate solution of $[p,\vartheta,\xi]$ at time $t_{n+1}$, such that

$\rho_w\chi_c^*\int_\Omega \delta_{\Delta t}\bigl[\phi_c(\xi_{n+1}^h)s(p_{n+1}^h)\bigr]\varphi^h\,\mathrm{d}x + \rho_w\chi_a^*\phi_a\int_\Omega \delta_{\Delta t}\bigl[s(p_{n+1}^h)\bigr]\varphi^h\,\mathrm{d}x + \int_\Omega a^*(p_n^h,\vartheta_n^h,\xi_n^h)\nabla p_{n+1}^h\cdot\nabla\varphi^h\,\mathrm{d}x + \int_{\partial\Omega}\beta_c(x)\,p_{n+1}^h\varphi^h\,\mathrm{d}s = \alpha_1\chi_c^*\int_\Omega f(p_n^h,\vartheta_n^h,\xi_n^h)\varphi^h\,\mathrm{d}x + \int_{\partial\Omega}\beta_c(x)\,(p_\infty)_{n+1}\varphi^h\,\mathrm{d}s$ (13)

holds for all $\varphi^h \in W^h$, further

$c_w\rho_w\chi_c^*\int_\Omega \delta_{\Delta t}\bigl[\phi_c(\xi_{n+1}^h)s(p_{n+1}^h)\vartheta_{n+1}^h\bigr]\psi^h\,\mathrm{d}x + c_w\rho_w\chi_a^*\phi_a\int_\Omega \delta_{\Delta t}\bigl[s(p_{n+1}^h)\vartheta_{n+1}^h\bigr]\psi^h\,\mathrm{d}x + \int_\Omega \delta_{\Delta t}\bigl[\chi_c^*\rho_{sc}c_{sc}(1-\phi_c(\xi_{n+1}^h))\vartheta_{n+1}^h\bigr]\psi^h\,\mathrm{d}x + \int_\Omega \delta_{\Delta t}\bigl[\chi_a^*\rho_{sa}c_{sa}(1-\phi_a)\vartheta_{n+1}^h\bigr]\psi^h\,\mathrm{d}x + \int_\Omega \lambda^*(p_n^h,\vartheta_n^h,\xi_n^h)\nabla\vartheta_{n+1}^h\cdot\nabla\psi^h\,\mathrm{d}x + \int_\Omega c_w\vartheta_n^h\,a^*(p_n^h,\vartheta_n^h,\xi_n^h)\nabla p_{n+1}^h\cdot\nabla\psi^h\,\mathrm{d}x + \int_{\partial\Omega}\alpha_c(x)\,\vartheta_{n+1}^h\psi^h\,\mathrm{d}s + c_w\int_{\partial\Omega}\beta_c(x)\,\vartheta_n^h p_{n+1}^h\psi^h\,\mathrm{d}s = \alpha_2\chi_c^*\int_\Omega f(p_n^h,\vartheta_n^h,\xi_n^h)\psi^h\,\mathrm{d}x + \int_{\partial\Omega}\alpha_c(x)\,(\vartheta_\infty)_{n+1}\psi^h\,\mathrm{d}s + c_w\int_{\partial\Omega}\beta_c(x)\,\vartheta_n^h(p_\infty)_{n+1}\psi^h\,\mathrm{d}s$ (14)

holds for all $\psi^h \in W^h$, and

$\xi_{n+1}^h = \xi_n^h + \Delta t\, f(p_n^h,\vartheta_n^h,\xi_n^h).$ (15)

the unknown vector field $X_{n+1} = (\mathbf{p}_{n+1},\boldsymbol{\vartheta}_{n+1},\boldsymbol{\xi}_{n+1})^T$ is introduced, and the galerkin procedure is applied. this leads to a system of non-linear algebraic equations

$B(X_{n+1}) - B(X_n) + \Delta t\, K_n X_{n+1} + \Delta t\, F_{n+1}(X_{n+1}) = 0,$ (16)

where $\mathbf{p}_{n+1}$, $\boldsymbol{\vartheta}_{n+1}$, and $\boldsymbol{\xi}_{n+1}$ store the unknown nodal values of the water pressure, temperature, and hydration degree at the time $t_{n+1}$, respectively. the non-linear system (16) is solved iteratively using newton’s method.
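the algebraic structure of (16) can be mimicked independently of the fe assembly. the following python/numpy sketch (an illustration only, not the authors' matlab implementation) applies newton's method with a forward-difference jacobian to a small system of the same form, b(x) − b(x_old) + Δt·k·x + Δt·f(x) = 0:

import numpy as np

def newton_step(x_old, b, k, f, dt, tol=1e-10, maxit=25, eps=1e-7):
    # solve b(x) - b(x_old) + dt*k@x + dt*f(x) = 0 for the new state x, cf. (16)
    def residual(x):
        return b(x) - b(x_old) + dt * (k @ x) + dt * f(x)
    x = x_old.astype(float).copy()
    for _ in range(maxit):
        r = residual(x)
        if np.linalg.norm(r) < tol:
            break
        jac = np.empty((x.size, x.size))
        for j in range(x.size):             # column-wise finite-difference jacobian
            xp = x.copy()
            xp[j] += eps
            jac[:, j] = (residual(xp) - r) / eps
        x = x - np.linalg.solve(jac, r)
    return x

# toy data: a cubic accumulation term b, a small 'diffusion' matrix k, no source
b = lambda x: x + 0.1 * x**3
f = lambda x: np.zeros_like(x)
k = np.array([[2.0, -1.0], [-1.0, 2.0]])
print(newton_step(np.array([1.0, 0.5]), b, k, f, dt=0.1))

in a production code one would of course assemble the exact jacobian from the fe matrices instead of approximating it by finite differences; the sketch only makes the iteration structure explicit.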
4.2. periodic cell problem
the problems (10) and (11) need to be solved in each discrete time step. for fixed $\omega$, $\eta$, and $\zeta \in \mathbb{R}$, both problems, (10) and (11), can be written in a general form as a periodic cell problem ($i = 1, 2$)
$-\nabla_y\cdot\bigl(a(y)(e_i + \nabla_y u_i)\bigr) = 0 \quad \text{in } Y,$ (17)
where $a$ is a given step function,
$a(y) := a_c \in \mathbb{R}^+ \ \text{for } y \in Y_c, \qquad a(y) := a_a \in \mathbb{R}^+ \ \text{for } y \in Y_a,$ (18)
and $u_i$ is the unknown function satisfying the periodic boundary conditions on $\partial Y$.

4.2.1. the classical fe method
the variational formulation of (17) reads as follows: we seek $u_i \in W^{1,2}_{\mathrm{per}}(Y)$ such that ($i = 1, 2$)
$\int_Y a(y)\nabla u_i\cdot\nabla\varphi\,\mathrm{d}y = -\int_Y a(y)e_i\cdot\nabla\varphi\,\mathrm{d}y$ (19)
for all $\varphi \in W^{1,2}_{\mathrm{per}}(Y)$. a finite element approximation is obtained by restricting the weak formulation (19) to a finite dimensional subspace $W^{1,2}_{\mathrm{per},h}(Y)$ of $W^{1,2}_{\mathrm{per}}(Y)$. then, we seek the approximate solution $u_i^h \in W^{1,2}_{\mathrm{per},h}(Y)$ satisfying the equation
$\int_Y a(y)\nabla u_i^h\cdot\nabla\varphi^h\,\mathrm{d}y = -\int_Y a(y)e_i\cdot\nabla\varphi^h\,\mathrm{d}y$ (20)
for all $\varphi^h \in W^{1,2}_{\mathrm{per},h}(Y)$.

4.2.2. computation of the homogenized coefficients by an analytical approximation
it is worth pointing out that the cell problems (10) and (11) need to be solved at each integration point and in each discrete time step $n = 0, 1, \dots, N$ of the problem (13)–(15). this means that a very large number of cell problems need to be solved in order to compute the homogenized coefficients $a^*$ and $\lambda^*$ in the whole macroscopic domain $\Omega$. in [20], the authors proposed an analytical approximation of the solution to the periodic cell problem (17) with a step function $a$ defined by (18). it can be easily shown that the functions
$\tilde u_1(y) = \Bigl(\int_0^{y_1}\frac{\mathrm{d}s}{a(s,y_2)}\Bigr)\Bigl(\int_0^1\frac{\mathrm{d}s}{a(s,y_2)}\Bigr)^{-1} - y_1$ (21)
and
$\tilde u_2(y) = \Bigl(\int_0^{y_2}\frac{\mathrm{d}s}{a(y_1,s)}\Bigr)\Bigl(\int_0^1\frac{\mathrm{d}s}{a(y_1,s)}\Bigr)^{-1} - y_2$ (22)
satisfy the equation (17) almost everywhere in $Y$, see [20, theorem 2.1]. the proposed solution can be seen as an approximation to the solution of (19) in $L^2(Y)$. the computational cost savings are evident because of the closed-form approximation of the solution to the periodic cell problem. it is worth noting that, for an aggregate of rectangular shape symmetrically placed within the periodic cell, which is the case considered in the following numerical experiments, the aforementioned approximation leads to simple formulae for the homogenized coefficients, see also [20–23]. for $\lambda^*$ (and for $a^*$ analogously), we can write
$\lambda_{11}^*(p_k,\vartheta_k,\xi_k) = 2d_2\,\lambda_c(p_k,\vartheta_k,\xi_k) + (1-2d_2)\Bigl(\frac{2d_1}{\lambda_c(p_k,\vartheta_k,\xi_k)} + \frac{1-2d_1}{\lambda_a(p_k,\vartheta_k)}\Bigr)^{-1},$ (23)
$\lambda_{22}^*(p_k,\vartheta_k,\xi_k) = 2d_1\,\lambda_c(p_k,\vartheta_k,\xi_k) + (1-2d_1)\Bigl(\frac{2d_2}{\lambda_c(p_k,\vartheta_k,\xi_k)} + \frac{1-2d_2}{\lambda_a(p_k,\vartheta_k)}\Bigr)^{-1},$ (24)
and
$\lambda_{12}^* = \lambda_{21}^* = 0,$ (25)
where $p_k$, $\vartheta_k$, and $\xi_k$, respectively, are the water pressure, temperature, and hydration degree at the $k$-th integration point of the macroscopic (homogenized) model, and $d_1$ and $d_2$ are the geometrical parameters of the periodic cell, see figure 1.

figure 1. cell geometry.

the solution of the cell problem in $Y$ is crucial for obtaining the homogenized coefficients in the global domain $\Omega$. therefore, a comparison of the solutions of the homogenized problem (13)–(15) with the homogenized coefficients obtained by using the analytical formulae (23)–(25) and the ones computed numerically according to (20) is one of the major goals of the present work, see the next section.
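formulae (23)–(25) translate directly into a few lines of code. the sketch below (python; the numerical values in the usage example are placeholders, loosely based on the dry conductivities quoted in section 7) evaluates the diagonal homogenized tensor for the centred rectangular inclusion of figure 1:

def homogenized_lambda(lam_c, lam_a, d1, d2):
    # closed-form approximation (23)-(25): each diagonal entry is an
    # arithmetic average of the pure-paste strips (weight 2*d) and a
    # harmonic (series) average across the strip containing the aggregate
    lam11 = 2*d2*lam_c + (1 - 2*d2) / (2*d1/lam_c + (1 - 2*d1)/lam_a)
    lam22 = 2*d1*lam_c + (1 - 2*d1) / (2*d2/lam_c + (1 - 2*d2)/lam_a)
    return lam11, lam22                     # off-diagonal entries vanish, (25)

print(homogenized_lambda(lam_c=1.0, lam_a=2.4, d1=0.2, d2=0.2))
# -> (1.323..., 1.323...): isotropic for a square inclusion, as expected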
5. numerical experiments
the mesoscale model and the homogenization framework have both been implemented in a matlab [31] code, which is employed for the following numerical experiments. for the spatial discretization, the finite element method is utilized by using four-node bilinear elements with 3×3 integration points. the temporal discretization is performed by a semi-implicit difference scheme, see section 4. the aim of the illustrative examples is to demonstrate some of the properties of the mesoscale model and the homogenization framework by analysing the effects of changing (i) the characteristic length ε, (ii) the aggregate shape, (iii) the fe mesh size, and (iv) the approach to solving the local problem. for the numerical experiments presented in this paper, the cell geometry shown in figure 1 is assumed. the rectangular aggregate shape was chosen for the reason that it can be easily analysed by the analytical approach and it can be simply discretized by four-node bilinear finite elements. similar examples have also been investigated by other researchers focusing on the homogenization problem, e.g. [19–23]. for real-world problems, the cell geometry can be modified in order to represent the material more accurately. concrete is a cementitious composite formed by aggregates of different shapes and sizes. this could be implemented in the present models by an appropriate setting of the periodic cell geometry. within the cell, multiple domains representing the aggregates can be modelled, each of them of a different shape and size, and the local problem can be solved by the finite element method. due to the limited scope of the paper, such a problem is not presented here and will be analysed in our future work. all the material parameters and properties of the concrete components adopted for the calculations, as well as the parameters of the initial and boundary conditions, are summarized in section 7.

5.1. convergence analysis – layered structure
the first example is focused on an analysis of hydration and transport processes in a structure (wall) formed by parallel layers of a fresh cement paste and aggregates. the objective of this hypothetical example is to show the convergence of the mesoscale solution towards the homogenized problem solution, as described for similar problems, e.g., in [19–23]. this type of example has been chosen because in this case (a layered structure) the homogenized coefficients can be obtained by an exact solution, see [23, p. 2262], cf. e.g. [26]. the geometrical parameters of the analysed problem are depicted in figure 2. the calculations are performed on the mesoscale level for three different values of ε: 0.1, 0.05, 0.01. in all the cases, we use the same finite element mesh of 500 elements in total with ∆x = 1 mm, see figure 2. the homogenized problem is solved twice: once with the same mesh as for the mesoscale level (∆x = 1 mm), and once with a coarse mesh with ∆x = 25 mm. the time step is set to ∆t = 30 s in all cases. the obtained distributions of water pressure across the analysed structure are displayed in figure 3. the temperature profiles are depicted in figure 4. the distributions of hydration degree are shown in figure 5.

figure 2. scheme of the analysed problem, finite element (fe) mesh, and boundary conditions (bc).

for the detailed convergence analysis, we focus on the resulting distributions of temperature and water pressure. in tables 1 and 2, the differences between the temperatures and water pressures determined by the mesoscale (ϑε, pε) and homogenized (ϑ, p) models for selected times are illustrated by the respective l2 error norms, see [19–23].

table 1: error norms ‖ϑε − ϑ‖2 (k m) for the temperature determined by the mesoscale and homogenized models for selected times.

ε       t = 8 h       t = 16 h      t = 24 h
0.1     8.9 × 10−3    5.8 × 10−3    2.4 × 10−3
0.05    2.2 × 10−3    1.5 × 10−3    7.6 × 10−4
0.01    8.3 × 10−5    7.9 × 10−5    1.0 × 10−4

table 2: error norms ‖pε − p‖2 (pa m) for the water pressure determined by the mesoscale and homogenized models for selected times.

ε       t = 8 h    t = 16 h    t = 24 h
0.1     2560.0     1520.0      819.8
0.05    734.9      358.4       197.9
0.01    29.9       14.5        9.3

from figures 3–5 and tables 1 and 2, it is clear that, with the decreasing value of ε, the mesoscale solution converges towards the homogenized problem solution. moreover, it can be seen that for the homogenized problem, a coarse finite element mesh provides almost the same results as a fine mesh.
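the error norms in tables 1 and 2 are discrete l2 norms of the difference between the mesoscale and homogenized fields sampled on a common grid. a minimal sketch of their evaluation (python; assuming a uniform 1-d grid, which matches the layered example — the exact quadrature used by the authors is not reproduced here):

import numpy as np

def l2_error(u_meso, u_hom, dx):
    # ||u_eps - u||_2 approximated by sqrt( sum (u_eps - u)^2 * dx )
    d = np.asarray(u_meso, float) - np.asarray(u_hom, float)
    return float(np.sqrt(np.sum(d**2) * dx))

# e.g. the 500-element wall with dx = 1 mm and a uniform 1e-3 discrepancy
x = np.linspace(0.0, 0.5, 501)
print(l2_error(np.sin(x), np.sin(x) + 1e-3, dx=1e-3))   # ~7.1e-4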
figure 3. distribution of water pressure across the analysed structure (panels for t = 8 h, 16 h, and 24 h; curves: pε for ε = 0.1, 0.05, 0.01, and p on the fine and coarse meshes).
figure 4. distribution of temperature across the analysed structure (same panels and curves as in figure 3).
figure 5. distribution of hydration degree across the analysed structure (panels at t = 8 h; curves: ξε for ε = 0.1, 0.05, 0.01, and ξ on the fine and coarse meshes).

5.2. illustrative example – concrete column cross-section
in the second example, the hydration and transport processes in a concrete column of a square cross-section are simulated. the objective of the example is to illustrate the usage of the numerical procedures for the determination of the homogenized material parameters, as described in section 4.2. the scheme of the analysed problem is depicted in figure 6. as can be seen from figure 6, we assume two variants of the aggregate shape: square (denoted as variant s) and rectangle (denoted as variant r). on the mesoscale level, the simulations are performed for ε = 0.025, i.e. with 8 × 8 cells for the analysed section. in both cases (variants s and r), we use the same finite element mesh of square elements of the size of 2.5 mm (6400 elements in total), see figure 7. for the homogenized model, a finite element mesh of 10 × 10 square elements is used on the macroscale level (element size of 20 mm). the homogenized material parameters are determined using two different approaches (see section 4.2): (i) using the finite element solution of the periodic cell problem with the mesh of 10 × 10 square elements for one quarter of the cell (element size of 50 mm; due to symmetry, only one quarter of the cell is modelled), and (ii) using the simplified approach proposed by sviercoski et al. [20]. in the following figures, the results obtained by the homogenized model using these two approaches are indexed as (fe) and (approx.), respectively. the time step is set to ∆t = 60 s in all the cases.

figure 6. scheme of the analysed problem, cell geometries, boundary conditions (bc).
figure 7. finite element mesh used for the mesoscale simulations: variant s (left), variant r (right).

the obtained profiles of water pressure, temperature, and hydration degree along the diagonal of the analysed section (xd, see figure 6, see also e.g. [19]) are displayed in figure 8. a more detailed comparison of the temperature distribution is presented using the isolines maps (contour plots) in figure 9, cf. e.g. [21–23].
for illustration, selected results (hydration degree and saturation) are shown using the filled contour maps in figures 10 and 11, cf. e.g. [23]. as can be seen, the presented homogenized model provides a sufficiently accurate approximation to the mesoscale problem of hydration and transport processes in early age concrete. this holds true mainly for the temperature, which is, from the engineering point of view, the most important variable. the analysed example also indicates that, in comparison with the classical fe approach, the approximate solution of the periodic cell problem (determination of the homogenized material coefficients) proposed by sviercoski et al. [20] gives almost the same results while achieving significant computational cost savings. the computational cost savings are evident because of the closed-form approximation of the solution to the periodic cell problem, which needs to be solved at each integration point (at the macroscale) and in each discrete time step. with the analytical solution, the homogenized coefficients are obtained directly by closed-form formulae – see formulae (23)–(25) – while for the finite element approach, the local problem needs to be solved numerically.

6. conclusions
in the paper, we have proposed a fully coupled two-scale (meso-macro) homogenization framework (theoretically investigated in [17, 29]) for the modelling of simultaneous heat transfer, moisture flows and hydration phenomena in fresh concrete. a modified mesoscale model has been introduced by assuming concrete as a composite material with two periodically distributed mesoscale components, cement paste and aggregates, partially saturated with liquid water and gas. for the mesoscale problem, an upscaling method has been utilized in order to obtain the homogenized model. the homogenized coefficients have been determined based on the solution of the periodic cell problem. two different approaches have been adopted for solving the periodic cell problem: (i) the classical finite element method, (ii) a simplified approach based on an analytical approximation proposed by sviercoski et al. [20]. the resulting algorithm, based on the finite element discretization in space and a semi-implicit finite difference discretization in time, has been implemented in the matlab environment and employed for several numerical experiments. the results of the presented examples have confirmed the following.
• with a decreasing value of the characteristic length ε, representing the small scale variability of the concrete composite, the mesoscale solution converges towards the homogenized problem solution.
• for the homogenized problem, a coarse finite element mesh provides almost the same results as the fine mesh.
• the presented homogenized model provides a sufficiently accurate approximation to the mesoscale problem of hydration and transport processes in early age concrete.
• for the homogenized model, the approximate solution of the periodic cell problem (determination of the homogenized material coefficients) proposed by sviercoski et al. [20] gives almost the same results as the classical fe approach while achieving significant computational cost savings.
in our future work, we will focus on:
• numerical experiments investigating the effect of the cell geometry (aggregates of different shapes and sizes);
• utilization of the homogenization approach for the analysis of transport processes in concrete exposed to high temperatures.

7. appendix – model parameters
based on a literature review, the model parameters employed for the numerical experiments presented in this paper have been adopted as follows (see also our previous work [18]). the values of density and heat capacity of the concrete components are taken as: ρw = 1000 kg m−3, ρsa = 2830 kg m−3, ρsc = 3220 kg m−3, cw = 4180 j kg−1 k−1, csa = 840 j kg−1 k−1, csc = 750 j kg−1 k−1 [16, table 1], [32, example 13.1]. the porosity of cement paste, φc(ξ), is determined by [4, (29)], with φc,∞ = 0.2 and aφ = 0.35 [4, fig. 1]. the porosity of aggregate, φa, is set to a constant value φa = 0.05, see e.g. [33]. the intrinsic permeability of cement paste, kc(ξ), is given by [4, (32)], with kc,∞ = 10−18 m2 [4, table i], [32, p. 308] and ak = 8 [4, p. 312]. the intrinsic permeability of aggregate, ka, is set to a constant value ka = 10−20 m2, see e.g. [33]. the thermal conductivity of cement paste and aggregate, λc(p,ξ) and λa(p), respectively, is determined by [4, (37)] with the constant thermal conductivity of a dry material, namely λc,dry = 1.0 w m−1 k−1 and λa,dry = 2.4 w m−1 k−1 [16, p. 136, table 1]. the degree of saturation with liquid water, s, is taken from [34, (20)] with the material parameters specified in [34, table 5] for ordinary concrete (a = 18.6237 mpa, b = 2.2748). here, we follow a simplified assumption that the capillary pressure, pc, and the liquid water pressure, p, are in the relation pc = patm − p, with patm being the atmospheric pressure [4, pp. 305–306], [13, chapter 4.2.2]. the relative hydraulic conductivity, kr(s), and the dynamic viscosity, µ(ϑ), respectively, are adopted from [34, (21)], [35, (38)], and [35, (ai.8)]. the hydration degree rate f is determined as (e.g. [4, (23)], [16, (7)], [32, (13.31)–(13.35)])

$f = a(\xi)\,\beta_\vartheta(\vartheta)\,\beta_{rh}(rh(p,\vartheta)),$ (26)

where the formulae and most of the material parameters for $a(\xi)$, $\beta_\vartheta(\vartheta)$, and $\beta_{rh}(rh(p,\vartheta))$ are adopted from [16, 32] with: b1 = 23.4/(24 × 3600) s−1, b2 = 7 × 10−4, η = 6.7, qe/r = 4600 k, and αe = 7.5. the notation is taken from [32]. the relative humidity, rh, is calculated from [4, (12)], by assuming pc = patm − p (see above). the ultimate hydration degree, ξ∞, is set to ξ∞ = 0.8 (see [2, (13)], [16, (9), (10)]).

figure 8. water pressure, temperature, and hydration degree profiles along the diagonal of the analysed section (t = 8 h, variants s and r; curves: mesoscale solution and homogenized solutions (fe) and (approx.)).
the constants α1 and α2 in the source terms in equations (1), (2), or (5), (6), respectively, are assumed as ([4, 16, 32], see also [29] and references therein)

$\alpha_1 = -q_{w,\mathrm{pot}}\, m_c,$ (27)
$\alpha_2 = q_{h,\mathrm{pot}}\, m_c,$ (28)

with qw,pot = 0.25, qh,pot = 510 × 103 j kg−1, and mc = (1 − φc(0)) ρsc. the initial conditions are set to p0 = −2.155 × 106 pa (s0 = 0.99) and ϑ0 = 293.15 k. the parameters of the boundary conditions are taken as: βc = 10−30 s m−1, p∞ = −6.9 × 107 pa (rh∞ = 0.6), αc = 4 w m−2 k−1, and ϑ∞ = 293.15 k, for the boundaries denoted as bc1 in section 5. the boundaries denoted as bc2 in section 5 are assumed to be perfectly insulated, i.e. βc = 0 and αc = 0.

figure 9. distribution of temperature in the analysed cross-section (isoline maps at t = 8 h and t = 24 h, variants s and r; curves: ϑε, ϑ(fe), ϑ(approx.)).
figure 10. distribution of hydration degree in the analysed cross-section.
figure 11. distribution of saturation degree in the analysed cross-section.

list of symbols
cw – isobaric heat capacity of water [j kg−1 k−1]
csa – isobaric heat capacity of aggregates [j kg−1 k−1]
csc – isobaric heat capacity of cement paste [j kg−1 k−1]
ka – intrinsic permeability of aggregates [m2]
kc – intrinsic permeability of cement paste [m2]
kc,∞ – intrinsic permeability of the matured concrete [m2]
kr – relative permeability of liquid phase [–]
mc – mass of cement [kg m−3]
p – liquid pressure [pa]
p0 – initial value of liquid pressure [pa]
p∞ – external liquid pressure [pa]
patm – atmospheric pressure [pa]
pc – capillary pressure [pa]
qh,pot – potential hydration heat [j kg−1]
qw,pot – chemically bound water [kg kg−1]
rh – relative humidity [–]
rh∞ – external relative humidity [–]
s – degree of saturation with liquid water [–]
s0 – initial degree of saturation with liquid water [–]

greek symbols
αc – convective heat transfer coefficient [w m−2 k−1]
βc – convective mass transfer coefficient [m s−1]
δij – kronecker delta [–]
ε – scale parameter (characteristic length) [–]
ϑ – temperature [k]
ϑ0 – initial value of temperature [k]
ϑ∞ – external temperature [k]
λc – thermal conductivity of cement paste [w m−1 k−1]
λc,dry – thermal conductivity of dry cement paste [w m−1 k−1]
λa – thermal conductivity of aggregates [w m−1 k−1]
λa,dry – thermal conductivity of dry aggregates [w m−1 k−1]
µ – water viscosity [pa s]
ξ – degree of hydration [–]
ξ∞ – final degree of hydration [–]
ρw – liquid density [kg m−3]
ρsa – density of aggregates [kg m−3]
ρsc – density of cement paste [kg m−3]
φc – porosity of cement paste [–]
φc,∞ – porosity at final stage of hydration [–]
φa – porosity of aggregates [–]
χa – characteristic function of aggregates [–]
χc – characteristic function of cement paste [–]
acknowledgements
the first author has been financially supported by the european regional development fund (project no. cz.02.1.01/0.0/0.0/16−019/0000778) within activities of the center of advanced applied sciences (caas). the second author has been supported by the czech science foundation, project no. ga17-23067s. the support is gratefully acknowledged.

references
[1] f. j. ulm, o. coussy. modeling of thermochemomechanical couplings of concrete at early ages. j eng mech 121(7):785–794, 1995. doi:10.1061/(asce)0733-9399(1995)121:7(785).
[2] m. cervera, j. oliver, t. prato. thermo-chemo-mechanical model for concrete. i: hydration and aging. j eng mech 125(9):1018–1027, 1999. doi:10.1061/(asce)0733-9399(1999)125:9(1018).
[3] m. cervera, r. faria, j. oliver, t. prato. numerical modelling of concrete curing, regarding hydration and temperature phenomena. comput struct 80(18):1511–1521, 2002. doi:10.1016/s0045-7949(02)00104-9.
[4] d. gawin, f. pesavento, b. a. schrefler. hygro-thermo-chemo-mechanical modelling of concrete at early ages and beyond. part i: hydration and hygro-thermal phenomena. int j numer meth eng 67(3):299–331, 2006. doi:10.1002/nme.1615.
[5] g. di luzio, g. cusatis. hygro-thermo-chemical modeling of high performance concrete. i: theory. cement concrete comp 31(5):301–308, 2009. doi:10.1016/j.cemconcomp.2009.02.015.
[6] b. klemczak, m. batog. heat of hydration of low-clinker cements. j therm anal calorim 123(2):1351–1360, 2016. doi:10.1007/s10973-015-4782-y.
[7] e. m. r. fairbairn, m. azenha (eds.). thermal cracking of massive concrete structures: state of the art report of the rilem technical committee 254-cms. springer, 2019. doi:10.1007/978-3-319-76617-1.
[8] g. xotta, g. mazzucco, v. a. salomoni, et al. composite behavior of concrete materials under high temperatures. int j solid struct 64-65:86–99, 2015. doi:10.1016/j.ijsolstr.2015.03.016.
[9] j. bear. dynamics of fluids in porous media. courier corporation, 1972.
[10] m. hassanizadeh, w. g. gray. general conservation equations for multi-phase systems: 1. averaging procedure. adv water resour 2:131–144, 1979. doi:10.1016/0309-1708(79)90025-3.
[11] m. hassanizadeh, w. g. gray. general conservation equations for multi-phase systems: 2. mass, momenta, energy, and entropy equations. adv water resour 2:191–203, 1979. doi:10.1016/0309-1708(79)90035-6.
[12] m. hassanizadeh, w. g. gray. general conservation equations for multi-phase systems: 3. constitutive theory for porous media flow. adv water resour 3(1):25–40, 1980. doi:10.1016/0309-1708(80)90016-0.
[13] k. maekawa, r. chaube, t. kishi. modelling of concrete performance: hydration, microstructure formation, and mass transport. e & fn spon, 1999.
[14] k. maekawa, t. ishida, t. kishi. multi-scale modeling of concrete performance. j adv concr technol 1(2):91–126, 2003. doi:10.3151/jact.1.91.
[15] y. zhang, c. pichler, y. yuan, et al. micromechanics-based multifield framework for early-age concrete. eng struct 47:16–24, 2013. doi:10.1016/j.engstruct.2012.08.015.
[16] l. jendele, v. šmilauer, j. červenka. multiscale hydro-thermo-mechanical model for early-age and mature concrete structures. adv eng softw 72:134–146, 2014. doi:10.1016/j.advengsoft.2013.05.002.
[17] m. beneš, i. pažanin. homogenization of degenerate coupled transport processes in porous media with memory terms. to appear in math meth appl sci. doi:10.1002/mma.5718.
[18] m. beneš, r. štefan. homogenization of transport processes in early age concrete. aip conference proceedings 2116(1):450028, 2019. doi:10.1063/1.5114495.
[19] j. f. bourgat. numerical experiments of the homogenization method. in r. glowinski, j. l. lions (eds.), computing methods in applied sciences and engineering, 1977, i, pp. 330–356. springer, 1979. doi:10.1007/bfb0063630.
[20] r. f. sviercoski, c. l. winter, a. w. warrick. analytical approximation for the generalized laplace equation with step function coefficient. siam j appl math 68(5):1268–1281, 2008. doi:10.1137/070683465.
[21] r. f. sviercoski, b. j. travis, j. m. hyman. analytical effective coefficient and a first-order approximation for linear flow through block permeability inclusions. comput math appl 55(9):2118–2133, 2008. doi:10.1016/j.camwa.2007.07.016.
[22] r. f. sviercoski, a. w. warrick, c. l. winter. two-scale analytical homogenization of richards’ equation for flows through block inclusions. water resour res 45(5), 2009. doi:10.1029/2006wr005598.
[23] r. f. sviercoski, p. popov, b. j. travis. zeroth and first-order homogenized approximations to nonlinear diffusion through block inclusions by an analytical approach. comput methods appl mech engrg 198(30):2260–2271, 2009. doi:10.1016/j.cma.2009.02.020.
[24] j. c. michel, h. moulinec, p. suquet. effective properties of composite materials with periodic microstructure: a computational approach. comput methods appl mech engrg 172(1):109–143, 1999. doi:10.1016/s0045-7825(98)00227-8.
[25] z. chen, w. den, h. ye. upscaling of a class of nonlinear parabolic equations for the flow transport in heterogeneous porous media. commun math sci 3(4):493–515, 2005. doi:10.4310/cms.2005.v3.n4.a2.
[26] h. w. zhang, s. zhang, j. y. bi, b. a. schrefler. thermo-mechanical analysis of periodic multiphase materials by a multiscale asymptotic homogenization approach. int j numer meth eng 69(1):87–113, 2007. doi:10.1002/nme.1757.
[27] d. perić, e. a. de souza neto, r. a. feijóo, et al. on micro-to-macro transitions for multi-scale analysis of non-linear heterogeneous materials: unified variational basis and finite element implementation. int j numer meth eng 87(1-5):149–170, 2011. doi:10.1002/nme.3014.
[28] v. nguyen, e. béchet, c. geuzaine, l. noels. imposing periodic boundary condition on arbitrary meshes by polynomial interpolation. comput mater sci 55:390–406, 2012. doi:10.1016/j.commatsci.2011.10.017.
[29] m. beneš, i. pažanin. on degenerate coupled transport processes in porous media with memory phenomena. z angew math mech 98(6):919–944, 2018. doi:10.1002/zamm.201700158.
[30] p. g. ciarlet. the finite element method for elliptic problems. elsevier, 1978.
[31] matlab. version 8.6.0 (r2015b). the mathworks, inc., natick, massachusetts, united states, 2015.
[32] z. p. bažant, m. jirásek. creep and hygrothermal effects in concrete structures. springer, 2018. doi:10.1007/978-94-024-1138-6.
[33] c. gonilho pereira, j. castro-gomes, l. pereira de oliveira. influence of natural coarse aggregate size, mineralogy and water content on the permeability of structural concrete. constr build mater 23(2):602–608, 2009. doi:10.1016/j.conbuildmat.2008.04.009.
[34] v. baroghel-bouny, m. mainguy, t. lassabatere, o. coussy. characterization and identification of equilibrium and transfer moisture properties for ordinary and high-performance cementitious materials. cement concrete res 29(8):1225–1238, 1999. doi:10.1016/s0008-8846(99)00102-7.
[35] c. t. davie, c. j. pearce, n. bićanić. coupled heat and moisture transport in concrete at elevated temperatures – effects of capillary pressure and adsorbed water. numer heat tr a-appl 49(8):733–763, 2006. doi:10.1080/10407780500503854.

acta polytechnica vol. 44 no. 1/2004

wavelet-based embedded rate scalable still image coders: a review
farag ibrahim younis elnagahy, b. šimák

embedded scalable image coding algorithms based on the wavelet transform have received considerable attention lately in academia and in industry in terms of both coding algorithms and standards activity. in addition to providing a very good coding performance, the embedded coder has the property that the bit stream can be truncated at any point and still decodes a reasonably good image. in this paper we present some state-of-the-art wavelet-based embedded rate scalable still image coders. in addition, the jpeg2000 still image compression standard is presented.
keywords: jpeg2000, embedded image coding, grayscale, color, scalable, progressive, compression, discrete wavelet transform.

1 introduction
the objective of an image compression algorithm is to exploit the redundancy in an image such that a smaller number of bits can be used to represent the image while maintaining an “acceptable” visual quality for the decompressed image. the redundancy of an image resides in the correlation of the neighboring pixels. for a color image, there is also a correlation, which can be exploited, between the color components [1, 2]. different applications require different data rates for the compressed images and different visual qualities for the decompressed images. in some applications, when browsing is required or the transmission bandwidth is limited, progressive transmission is used to send images in such a way that a low quality version of the image is transmitted first at a low data rate [3]. gradually, additional information is transmitted to progressively refine the image. a specific coding strategy known as embedded rate scalable coding is well suited for progressive transmission [4]. in embedded coding, all the compressed data is embedded in a single bit stream. the decompression algorithm starts from the beginning of the bit stream and can terminate at any data rate, and a decompressed image at that data rate can then be reconstructed. in embedded coding, any visual quality requirement can be fulfilled by transmitting the truncated portion of the bit stream. to achieve the best performance, the bits that convey the most important information need to be embedded at the beginning of the compressed bit stream [4].
the remainder of this paper is organized as follows: in section 2, the classifications of the image coder are presented. section 3 presents the discrete wavelet transform. section 4 presents embedded image coding in the wavelet domain. the objective and subjective image quality measures are presented in section 5. section 6 provides a review of the main algorithms and recent publications in embedded scalable grayscale image compression. embedded scalable color image compression codecs are presented in section 7. in section 8 the jpeg2000 still image compression standard is presented. finally, conclusions are drawn in section 9.

2 image coder classifications
image coders can be classified according to the organization of the compressed bit stream into scalable image coders and non-scalable image coders. in non-scalable image coders, the complete image is only obtained by decoding the entire compressed bit stream. truncating the compressed bit stream during the decoding process will produce an incompletely reconstructed image, as shown in fig. 1.

fig. 1: non-scalable image coder

scalable image coders are coders that allow the image data to be compressed once and then decompressed at multiple data rates or decompressed at different resolutions of the image. resolution refers to the size of the reconstructed image. a scalable image coder that is able to decode different resolutions of the image from the compressed bit stream is called a resolution scalable coder, while a scalable image coder that has the ability to decode a full resolution image with a certain bit rate from the compressed bit stream is called a rate (snr) scalable coder. both types of scalable (resolution/snr) image coders are further classified into embedded and non-embedded image coders. in an embedded resolution scalable image coder, a lower resolution of the image is obtained by just decoding the first portion of the compressed bit stream, as shown in fig. 2. in a non-embedded resolution scalable image coder, a lower resolution of the image is obtained by decoding certain portions of the compressed bit stream, as shown in fig. 3. in both cases no complete decoding of the whole compressed bit stream is needed. in an embedded rate (snr) scalable image coder, a lower quality of the image is obtained by just decoding the first portion of the compressed bit stream; by decoding more and more bits we can achieve a higher quality image (higher snr), as shown in fig. 4.
in an embedded rate (snr) scalable image coder, a lower quality of the image is obtained by decoding just the first portion of the compressed bit stream; by decoding more and more bits we achieve a higher quality image (higher snr), as shown in fig. 4. in a non-embedded rate (snr) scalable image coder, however, a lower quality of the image is obtained by decoding certain portions of the compressed bit stream, as shown in fig. 5. in both cases no complete decoding of the whole compressed bit stream is needed. it is important to note that if a compression technique produces an embedded bit stream, the technique is scalable, because embedded bit streams can be truncated at any point during decoding; however, not all scalable compression techniques are embedded. resolution scalability can be achieved by encoding whole wavelet sub-bands one after the other, without interleaving bits from different sub-bands. snr scalability can be achieved by distributing the encoded bits of each sub-band over the whole bit stream in an optimal way. another very interesting property of some coders is the ability to decode only certain regions of an image; this property is called random access. a coder can support the decoding of arbitrarily shaped regions or of any rectangular region. region-of-interest encoding and decoding is also related to random access: for region-of-interest encoding, only an arbitrarily shaped region of the input image is encoded, and the rest of the image, which does not fall into the region of interest, is discarded. random access can be achieved by independently encoding portions of the whole image; all image coders can support random access by tiling the image and encoding each tile independently of the rest.
fig. 4: embedded snr scalable image coder
fig. 5: non-embedded snr scalable image coder

3 discrete wavelet transform
a wavelet transform corresponds to two sets of analysis/synthesis digital filters, g/g̃ and h/h̃, where h is a low-pass filter (lpf) and g is a high-pass filter (hpf). in two dimensions, filtering is usually applied both horizontally and vertically. filtering in one direction decomposes the image into two components, so there is a total of four components after the vertical and horizontal decompositions. we will refer to these components as the image sub-bands ll, hl, lh, and hh. by using the filters g and h, an image can be decomposed into four sub-bands; this is the first level of the wavelet transform (fig. 6). one of the four sub-bands, the ll sub-band, contains low-pass information, which is essentially a low-resolution version of the image. sub-band hl contains low-pass information vertically and high-pass information horizontally, and sub-band lh contains low-pass information horizontally and high-pass information vertically. finally, sub-band hh contains high-pass information in both directions [5, 6]. the operations can be repeated on the low-low (ll) sub-band for some number of levels, d, producing a total of 3d + 1 sub-bands whose samples represent the original image. the total number of samples (coefficients) in all sub-bands is identical to that in the original image. thus, a typical 2-d discrete wavelet transform used in image processing generates a hierarchical pyramidal structure, as shown in fig. 7.
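to make the separable filtering concrete, the following is a minimal sketch (our illustration, not taken from the cited papers) of one decomposition level using the orthonormal haar filters; practical coders use longer filters such as the daubechies (9,7) pair, but the resulting ll/hl/lh/hh structure is the same. even image dimensions are assumed.

```python
import numpy as np

def haar_dwt2_level(img):
    """one level of a separable 2-d haar dwt: returns (ll, hl, lh, hh)."""
    x = img.astype(float)
    # filter + downsample along columns (vertical direction)
    lo = (x[0::2, :] + x[1::2, :]) / np.sqrt(2)   # low-pass h
    hi = (x[0::2, :] - x[1::2, :]) / np.sqrt(2)   # high-pass g
    # filter + downsample along rows (horizontal direction)
    ll = (lo[:, 0::2] + lo[:, 1::2]) / np.sqrt(2)
    hl = (lo[:, 0::2] - lo[:, 1::2]) / np.sqrt(2)  # low vertical, high horizontal
    lh = (hi[:, 0::2] + hi[:, 1::2]) / np.sqrt(2)  # high vertical, low horizontal
    hh = (hi[:, 0::2] - hi[:, 1::2]) / np.sqrt(2)
    return ll, hl, lh, hh

img = np.arange(64, dtype=float).reshape(8, 8)
ll, hl, lh, hh = haar_dwt2_level(img)
print(ll.shape)   # (4, 4); repeating on ll for d levels gives 3d + 1 sub-bands
```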
the ll sub-band at the highest level can be classified as the most important, and the other 'detail' sub-bands can be classified as of lesser importance, with the degree of importance decreasing from the top of the pyramid to the sub-bands at the bottom. the inverse wavelet transform is obtained by reversing the transform process, replacing the analysis filters by the synthesis filters, and using up-sampling (fig. 8). an example illustrating a three-level octave-band decomposition is shown in fig. 9.

4 embedded image coding
the wavelet transform can decorrelate the image pixel values and results in frequency and spatial-orientation separation. the transform coefficients in each sub-band exhibit unique statistical properties that can be used for encoding the image. the ll sub-band can be encoded alone, and information about the sub-bands hl, lh, and hh is added to the encoded bit stream; the decoder is then responsible for deciding whether to decode only ll or all sub-bands. sub-band data are usually real numbers, therefore they require a significant number of bits in their representation and do not directly offer any form of compression. the first approach towards compressing wavelet data is to apply a quantizer to all the coefficients, where uniform and dead-zone quantizers are the most common choices; after quantization, an entropy coder is applied to compress the quantization indices. in embedded coding, a wavelet transform is usually used to decorrelate the image pixels and achieve frequency and spatial-orientation separation. assuming that the wavelet transform c = W(p), where p is the collection of image pixels and c is the collection of transformed coefficients, is unitary, the distortion of the image after decompression is D(p) = D(c) = Σi D(ci), where the distortion measure is defined as the summation of the distortion at each pixel. the greatest distortion reduction can be achieved if the transformed coefficient with the largest magnitude is coded with infinite precision; thus attempts have been made to encode the transformed pixels with larger magnitudes first. furthermore, in order to distribute the bits strategically such that the decoded image will look "natural" at any data rate, progressive refinement or bit-plane coding is used. thus in the coding procedure, multiple passes through the data are adopted. in the first pass, those transformed pixels that are larger than a pre-selected threshold are added to the significance list and coded to the precision of the threshold. in the following passes, the threshold is halved, the pixels already in the list are refined to one more bit of precision, and more transformed pixels that are larger than the (halved) threshold are added to the list. it is critical that the positions of the enlisted pixels be encoded efficiently. it has been observed that a large number of transform coefficients are insignificant in comparison with the threshold. these coefficients will be quantized to zero, which will not reduce the distortion; thus spending more bits on coding these insignificant coefficients results in lower efficiency [7].
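the following toy example (ours; the function name and the midpoint reconstruction rule are illustrative assumptions) shows the successive-approximation idea on a handful of coefficients: the threshold is halved in each pass and the measured distortion drops as the approximation is refined.

```python
import numpy as np

def successive_approximation(coeffs, n_passes=6):
    """illustrate embedded coding: coefficients are approximated to the
    precision of a threshold that is halved in every pass."""
    c = np.asarray(coeffs, dtype=float)
    t = 2 ** np.floor(np.log2(np.abs(c).max()))   # initial threshold
    approx = np.zeros_like(c)
    for p in range(n_passes):
        significant = np.abs(c) >= t
        # significant coefficients are known to the decoder to precision t:
        # reconstruct them at the midpoint of their quantization interval
        q = np.floor(np.abs(c[significant]) / t) * t
        approx[significant] = np.sign(c[significant]) * (q + t / 2)
        mse = np.mean((c - approx) ** 2)
        print(f"pass {p}: t = {t:8.2f}, mse = {mse:10.3f}")
        t /= 2
    return approx

successive_approximation([63.0, -34.0, 49.0, 10.0, 7.0, 13.0, -12.0, 7.0])
```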
embedded image coding makes more efficient use of a communications channel, because images become visible more quickly, reducing the apparent retrieval time. moreover, progressive transmission enables efficient browsing of image databases: based on a quickly obtained preview of an image, a user might decide to terminate the transmission and proceed to the next image. alternatively, a user might select a low or medium resolution default for image retrieval; in this way, images of less value are not transmitted at full resolution. because of the small volume of data required to transmit relatively high-quality previews of an image, the savings in access time can be significant when browsing large databases of images.
fig. 6: one level of wavelet transform decomposition
fig. 7: pyramidal decomposition of an image: (a) single level decomposition, (b) two levels decomposition, (c) three levels decomposition
fig. 8: one level of the wavelet transform reconstruction
fig. 9: an example of wavelet decomposition: (a) original image, (b) three-level octave-band decomposition, (c) synthesized image

5 measures of image quality
two mathematical formulas are commonly used to compare the various image compression techniques: the mean square error (mse) and the peak signal-to-noise ratio (psnr). the mse is the cumulative squared error between the compressed and the original image, whereas the psnr in decibels (db) is a measure of the peak error. the mathematical formulas for the two are

mse = (1 / (m·n)) Σ_{y=1}^{m} Σ_{x=1}^{n} [ i(x, y) − i′(x, y) ]²,
psnr = 10 · log10( 255² / mse ),

where i(x, y) is the original image, i′(x, y) is the approximated version (which is actually the decompressed image), and m, n are the dimensions of the images. a lower value of mse means less error, and, as seen from the inverse relation between mse and psnr, this translates to a high value of psnr. logically, a higher value of psnr is good because it means that the ratio of signal to noise is higher: here, the 'signal' is the original image, and the 'noise' is the error in reconstruction. the value 255 in the psnr formula is the maximum decimal value of an unsigned 8-bit image. a high psnr value does not always correspond to an image with perceptually high quality. another measure of image quality is the mean opinion score, where the performance of a compression process is characterized by the subjective quality of the decoded image. for example, a five-point scale such as very annoying, annoying, slightly annoying, perceptible but not annoying, and imperceptible might be used to characterize the impairments in the decoder output [8]. more details about the performance of image quality measures can be found in [9].
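a small reference implementation of the two measures, assuming numpy arrays of equal size (our sketch, not part of any cited codec):

```python
import numpy as np

def mse(original, decompressed):
    """mean square error between two images of equal size."""
    diff = original.astype(float) - decompressed.astype(float)
    return np.mean(diff ** 2)

def psnr(original, decompressed, peak=255.0):
    """peak signal-to-noise ratio in db; peak = 255 for 8-bit images."""
    e = mse(original, decompressed)
    if e == 0:
        return float("inf")   # identical images
    return 10.0 * np.log10(peak ** 2 / e)
```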
6 wavelet-based embedded rate scalable grayscale image compression algorithms
several techniques have been proposed to achieve rate scalability in still image compression. two of the most important techniques are shapiro's embedded zerotree wavelet (ezw) [4] and said and pearlman's set partitioning in hierarchical trees (spiht) [10]. both make use of "spatial orientation trees" (sot): structures that use quad-tree representations of sets of wavelet coefficients that belong to different sub-bands but have the same spatial location. this structure, which can be efficiently represented by one symbol, has been used extensively in rate scalable image and video compression. ezw, spiht, and other recent works are described in the following sub-sections.

6.1 embedded zerotree wavelet (ezw) algorithm [4]
the ezw algorithm is based on four key concepts:
- a discrete wavelet transform or hierarchical sub-band decomposition;
- prediction of the absence of significant information across scales by exploiting the self-similarity inherent in images;
- entropy-coded successive approximation quantization;
- universal lossless data compression, which is achieved via adaptive arithmetic coding [11].
the embedded zerotree wavelet (ezw) algorithm exploits the interdependence between the coefficients of the wavelet decomposition of an image by grouping them into sots. this is based on the following observations:
- most of the energy is concentrated in the low-frequency sub-bands of the wavelet decomposition;
- if the magnitude of a wavelet coefficient in the lowest sub-band of a decomposition is insignificant with respect to a threshold, then it is more likely that wavelet coefficients having the same spatial location in different sub-bands will also be insignificant;
- the standard deviation of the wavelet coefficients decreases when proceeding from the lowest to the highest sub-bands of the wavelet pyramid.
the combination of the above observations allows a large number of insignificant wavelet coefficients to be coded by coding only the location of the root coefficient to which the entire set of coefficients is related. such a set is commonly referred to as a "zerotree". the sot used in ezw is shown in fig. 10. the root of the tree is the coefficient located in the lowest sub-band (ll) of the decomposition; its descendants are all the other coefficients in the tree. a wavelet coefficient in the ll sub-band has three children; the other coefficients, except for those in the highest-frequency sub-bands, have four children. the ezw algorithm consists of successive approximation of the wavelet coefficients and can be summarized as follows. let f(m, n) be a grayscale image, and let w[f(m, n)] be the coefficients of its wavelet decomposition.
1) set the threshold t = 2^⌊log2(max |w[f(m, n)]|)⌋.
2) dominant pass:
- compare the magnitude of each wavelet coefficient in a tree, starting with the root, to the threshold t.
- if the magnitudes of all the wavelet coefficients in the tree are smaller than the threshold t, then the entire tree structure (that is, the root and all its descendants) is represented by one symbol, known as the zerotree (ztr) symbol.
- otherwise, the root is said to be "significant" (when its magnitude is greater than t) or "insignificant" (when its magnitude is less than t). a significant coefficient is represented by one of two symbols, pos or neg, depending on whether its value is positive or negative; the magnitude of significant coefficients is then set to zero to facilitate the formation of zerotree structures. an insignificant coefficient is represented by the symbol iz, "isolated zero", if it has some significant descendant.
- this process is carried out such that all the coefficients in the tree are examined for possible sub-zerotree structures.
3) subordinate pass: the significant wavelet coefficients in the image are refined by determining whether their magnitudes lie within the interval [t, 3t/2), represented by the symbol low, or the interval [3t/2, 2t), represented by the symbol high.
4) set t = t/2 and go to step 2. only the coefficients that have not yet been found to be significant are examined.
this coding strategy is iterated until the target data rate is achieved.
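the dominant pass can be sketched as follows. this is our simplified illustration, with a hypothetical Node class; a real implementation also handles leaf-band zeros separately and follows the fixed scanning order of fig. 11.

```python
class Node:
    """minimal tree node for illustration (hypothetical helper)."""
    def __init__(self, value, children=()):
        self.value = value
        self.children = list(children)

def dominant_pass(root, t, symbols):
    """emit ezw symbols for one tree at threshold t (simplified sketch)."""
    def subtree_insignificant(n):
        return abs(n.value) < t and all(subtree_insignificant(c) for c in n.children)
    if subtree_insignificant(root):
        symbols.append("ztr")          # whole tree coded with one symbol
        return
    if abs(root.value) >= t:
        symbols.append("pos" if root.value > 0 else "neg")
        root.value = 0.0               # so it can join later zerotrees
    else:
        symbols.append("iz")           # insignificant, but significant descendant
    for c in root.children:
        dominant_pass(c, t, symbols)

# toy tree: insignificant root with one significant descendant
tree = Node(3.0, [Node(40.0), Node(-2.0), Node(1.0)])
out = []
dominant_pass(tree, 32.0, out)
print(out)   # ['iz', 'pos', 'ztr', 'ztr']
```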
the order in which the coefficients are examined is predefined and known by the decoder, as shown in fig. 11. the initial threshold is encoded in the header of the bit stream, followed by the symbols resulting from the dominant and subordinate passes, which are entropy coded using an arithmetic coder [11]. an important element of ezw is the "significance map", which is used to code the position of the significant coefficients in the wavelet decomposition. the simplest significance map is one where a significant coefficient is accompanied by the actual coordinates of its location; in ezw, the significance map is determined by having both the encoder and the decoder know the scanning order of the coefficients. in essence, the dominant pass of ezw determines the significance map, whereas the subordinate pass refines the wavelet coefficients that have been found significant. in consequence, the reduction in distortion during the dominant pass is smaller than during the subordinate pass. ezw was developed for grayscale images; however, it has been used for color images by applying ezw to each of the color components separately.

6.2 set partitioning in hierarchical trees (spiht) [10]
said and pearlman [10] investigated different tree structures that improved the quality of the decomposition. using set partitioning in hierarchical trees (spiht), the coefficients are divided into partitioning subsets using a known ordering in the sot. each subset is then classified as significant or insignificant according to a pre-specified rule based on the magnitude of the coefficients in the subset. in the sot, the descendants (a group of 2×2 adjacent coefficients) correspond to the pixels of the same spatial orientation in the next finer level of the decomposition. fig. 12 shows how the spatial orientation tree is defined in a pyramid constructed with recursive four-sub-band splitting. one wavelet coefficient in the ll sub-band (marked with "*") does not have a child; the other coefficients, except for those in the highest-frequency sub-bands, have four children. in contrast to shapiro's zerotree, one coefficient in each group does not have descendants. spiht groups the wavelet coefficients into three lists: a list of insignificant sets (lis), a list of insignificant pixels (lip), and a list of significant pixels (lsp). the spiht algorithm can be summarized as follows:
1) initialize the lis to the set of sub-tree descendants of the nodes in the highest level, the lip to the nodes in the highest level, and the lsp to an empty list.
2) sorting pass:
- traverse the lip, testing the magnitude of its elements against the current threshold and representing their significance by 0 or 1. if a coefficient is found to be significant, its sign is coded and it is moved to the lsp.
- examine the lis and check the magnitude of all the coefficients in each set. if a particular set is found to be significant, it is partitioned into subsets and tested for significance; otherwise, a single bit is appended to the bit stream to indicate an insignificant set.
3) refinement pass: examine the lsp, excluding coefficients added during the sorting pass. this pass is accomplished by using progressive bit-plane coding of the ordered coefficients.
4) set t = t/2 and go to step 2. the process is repeated until the target data rate is achieved.
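the list mechanics of the sorting and refinement passes can be illustrated on a flat array of integer coefficients. this is our simplification: the lis and the actual set-partitioning rules, which are the heart of spiht, are deliberately omitted to keep the sketch short.

```python
def spiht_like_passes(coeffs, n_planes=6):
    """sketch of spiht's lip/lsp bookkeeping on a flat integer array."""
    t = 2 ** (max(abs(c) for c in coeffs).bit_length() - 1)  # initial threshold
    lip = list(range(len(coeffs)))   # indices of insignificant pixels
    lsp = []                         # indices of significant pixels
    bits = []
    for _ in range(n_planes):
        # sorting pass over the lip
        still_insignificant, newly_significant = [], []
        for i in lip:
            sig = int(abs(coeffs[i]) >= t)
            bits.append(sig)
            if sig:
                bits.append(int(coeffs[i] < 0))   # sign bit
                newly_significant.append(i)
            else:
                still_insignificant.append(i)
        # refinement pass: next bit of previously significant coefficients
        for i in lsp:
            bits.append((abs(coeffs[i]) // t) % 2)
        lsp.extend(newly_significant)             # refined from the next plane on
        lip = still_insignificant
        t //= 2
        if t == 0:
            break
    return bits

print(spiht_like_passes([63, -34, 49, 10, 7, 13, -12, 7]))
```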
it is important to note that the locations of the coefficients being refined or classified are never explicitly transmitted. this is because all branching decisions made by the encoder as it searches through the coefficients are appended to the bit stream. the output of the sorting-refinement process is then entropy coded [11].

6.3 space frequency quantization (sfq) algorithm [12]
in [12], a wavelet-based image compression scheme is presented. this iterative algorithm seeks to minimize the distortion measure (mse) by successively pruning a tree for a given target rate. the surviving nodes in the tree are quantized and sent in the data stream along with significance map information. to further compress the stream, this map is predicted on the basis of statistical information. the spatial wavelet coefficient tree structure used in the sfq algorithm is shown in fig. 13. a spatial wavelet coefficient tree is defined as the set of coefficients from different sub-bands that represent the same spatial region in the image; the arrows in fig. 13 identify the parent-children dependencies in a tree. the lowest-frequency sub-band of the decomposition is represented by the root nodes (top) of the tree, the highest-frequency sub-bands by the leaf nodes (bottom) of the tree, and each parent node represents a lower-frequency component than its children. except for a root node, which has only three children nodes, each parent node has four children nodes: the 2×2 region of the same spatial location in the immediately higher-frequency sub-band. the sfq coder has the goal of jointly finding the best combination of the spatial zerotree quantization choice and the scalar frequency quantizer choice. the block diagram of this coder is shown in fig. 14 (assuming a 2-level wavelet decomposition). the sfq paradigm is conceptually simple: throw away, i.e. quantize to zero, a subset of the wavelet coefficients, and use a single simple uniform scalar quantizer on the rest. all nine possible complementary subsets for each depth-2 spatial wavelet coefficient tree are listed in the spatial zerotree quantization block, and there are three possible scalar quantizer choices in the frequency scalar quantization block. the results show a gain of about 0.5 db in psnr versus spiht for grayscale images.
fig. 10: parent-descendant relationships in the spatial-orientation tree of the ezw algorithm
fig. 11: scanning order of the sub-bands for encoding a significance map (ezw)
fig. 12: parent-descendant relationships in the spatial-orientation tree in the spiht algorithm
fig. 13: sfq spatial wavelet coefficient tree
fig. 14: block diagram of the sfq coder

6.4 multi-scale zerotree wavelet entropy (mzte) coding algorithm [13]
this paper describes the texture representation scheme adopted for mpeg-4 synthetic/natural hybrid coding (snhc) of texture maps and images. the scheme is based on the concept of the multi-scale zerotree wavelet entropy (mzte) coding technique. mzte was rated as one of the top five schemes in terms of compression efficiency in the jpeg2000 evaluation of november 1997, from among 27 submitted proposals. this scheme provides many levels of scalability layers in terms of either spatial resolution or picture quality. the mzte coding technique is based on zerotree entropy (zte) coding, but it utilizes a new framework to improve and extend the zte method to achieve a fully scalable yet very efficient coding technique. in this scheme, the low-low sub-band is separately encoded.
to achieve a wide range of scalability levels efficiently, as needed by the application, the other sub-bands are encoded using the multi-scale zerotree entropy coding scheme. this multi-scale scheme provides a very flexible approach to support the right tradeoff between layers and types of scalability, complexity, and coding efficiency for any multimedia application. fig. 15 shows the concept of this technique. the wavelet coefficients of the first spatial (and/or quality) layer are first quantized with the quantizer q0. these quantized coefficients are scanned using the zerotree concept, and the significance maps and quantized coefficients are then entropy coded. the output of the entropy coder at this level, bs0, is the first portion of the bit stream. the quantized wavelet coefficients of the first layer are also reconstructed and subtracted from the original wavelet coefficients. these residual wavelet coefficients are fed into the second stage of the coder, in which they are quantized with q1, zerotree scanned, and entropy coded. the output of this stage, bs1, is the second portion of the output bit stream. the quantized coefficients of the second stage are also reconstructed and subtracted from the original coefficients. as shown in fig. 15, n+1 stages of the scheme provide n+1 layers of scalability; each level represents one layer of snr quality, spatial scalability, or a combination of both. in mzte, the wavelet coefficients are scanned either in tree-depth order for each scalability layer or sub-band by sub-band, from the lowest to the highest frequency sub-bands. the wavelet coefficients are quantized by a uniform mid-step quantizer with a dead zone equal to the quantization step size, matched as closely as possible to each scalability layer. the results show that mzte performs about 2.5 db better than baseline jpeg for the luminance component, and about 6 db better for the chrominance components.
fig. 15: mzte encoding structure
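a sketch of the multi-stage residual structure follows. this is our illustration only: the real scheme uses a uniform mid-step deadzone quantizer and inserts zerotree scanning and entropy coding between the stages shown here.

```python
import numpy as np

def mzte_layers(coeffs, steps):
    """each layer quantizes the residual left by the previous layer, so
    n+1 quantizers give n+1 scalability layers (simplified: plain rounding)."""
    residual = np.asarray(coeffs, dtype=float)
    layers = []
    for q in steps:                        # q0, q1, ... one step per layer
        indices = np.round(residual / q)   # quantize the current residual
        layers.append(indices.astype(int))
        residual = residual - indices * q  # feed the residual to the next stage
    return layers

print(mzte_layers([37.2, -12.6, 3.1], steps=[16.0, 4.0, 1.0]))
```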
finally, ebcot organizes the final big stream in layers with optimized truncation so as to make it both resolution and snr scalable. the layered bit stream may be truncated at any point during decoding. ebcot uses the daubechies (9,7) wavelet filter [17] with a five level transform, though jpeg2000 can use other filters. the results show performance gains of about 0.5 db against spiht at various data rates. 7 wavelet-based embedded rate scalable color image compression algorithms both ezw and spiht were originally designed for grayscale images. a straightforward application to color images is to code the transformed data from each spectral channel independently without exploiting any correlation that might exist between the spectral channels. the sub-sections below describe the ezw and spiht algorithms for color image compression. 7.1 color image compression using an embedded rate scalable approach (cezw) [7] this paper presents a wavelet based coding algorithm for color images using a luminance/chrominance color space. data rate scalability is achieved by using an embedded coding scheme, which is similar to shapiro’s embedded zerotree wavelet (ezw) algorithm [4]. in a luminance/chrominance color space, the three color components have little statistical correlation. however, it is observed that at the spatial locations where chrominance signals have large transitions it is highly likely that the luminance signal will have large transitions. this interdependence between the color components is exploited in this algorithm. in this algorithm, the yuv space is used. the algorithm is developed on the basis of shapiro’s algorithm. the sot is established as follows: the original sot structure in shapiro’s algorithm is applied to all three-color components. each chrominance node is also a child node of the luminance node of the same location. thus each chrominance node has two parent nodes: one is of the same chrominance component in a lower frequency sub-band; the other node is of the luminance component. a diagram of the sot is shown in fig. 16, where the tree is developed on the basis of the tree structure in shapiro’s algorithm, and yuv color space is used. in this algorithm, the coding strategy is similar to shapiro’s algorithm, except for the following changes in the dominant pass. (1) the luminance component is first scanned. for each luminance pixel, all descendents, including those of the luminance component and those of the chrominance components, are examined and appropriate © czech technical university publishing house http://ctn.cvut.cz/ap/ 11 czech technical university in prague acta polytechnica vol. 44 no. 1/2004 fig. 15: mzte encoding structure symbols are assigned. (2) the two chrominance components are alternately scanned. the pixels that have already been encoded while scanning the luminance component are not examined. the subjective experiments showed that this algorithm produces images with the same quality as those from said and pearlman’s algorithm (http://ipl.rpi.edu:80/spiht/) at the same data rate. however the peak signal to noise (psnr) ratio is lower than that from said’s algorithm, where the psnr is obtained as � � � � � �� � psnr mse y mse u mse v � � � � � � � � �10 255 3 2 lg / 7.2 embedded color image coding using spiht with partially linked spatial orientation trees (cspiht) [18] this paper describes a variation of the set partitioning in hierarchical trees (spiht) scheme for color image coding. 
7.2 embedded color image coding using spiht with partially linked spatial orientation trees (cspiht) [18]
this paper describes a variation of the set partitioning in hierarchical trees (spiht) scheme for color image coding. by using partially linked spatial orientation tree structures across the different spectral planes, the new color-spiht scheme is able to embed both chrominance and luminance data in the coded bit stream. the performance is comparable to that of a spiht-based coding scheme, but with significantly lower computational complexity. in cspiht, spiht's sot structure [10] is used for all spectral planes. to create a comprehensive sot structure across the different spectral planes, luminance nodes in the ll sub-band that do not have any offspring are given descendants in the ll sub-bands of the chrominance planes, as shown in fig. 17.
fig. 16: parent-descendant relations in the cezw algorithm
fig. 17: linking sot structures in different spectral planes in the cspiht scheme
the results obtained are compared to said and pearlman's algorithm (spiht klt) (http://ipl.rpi.edu:80/spiht/), which performs the karhunen-loeve transform (klt) [19] on the spectral components of the image before coding the decorrelated color planes independently using the spiht scheme. for the case of four- and five-level dwt decompositions, cspiht has significantly better psnr performance than spiht klt, especially at low bit rates (up to 6 db improvement in performance).

8 jpeg2000 still image compression standard
jpeg2000 [20] is the new international standard for still image compression. the jpeg2000 standard is based on wavelet/sub-band coding techniques [17, 21] and supports lossy and lossless compression of single-component (e.g., grayscale) and multi-component (e.g., color) imagery. in order to facilitate both lossy and lossless coding in an efficient manner, reversible integer-to-integer [22–24] and nonreversible real-to-real transforms are employed. to code the transform data, the codec makes use of bit-plane coding techniques; for entropy coding, a context-based adaptive binary arithmetic coder [11] is used. in addition to this basic compression functionality, numerous other features are provided, including: 1) progressive recovery of an image by fidelity or resolution; 2) region-of-interest coding, whereby different parts of an image can be coded with differing fidelity; 3) random access to particular regions of an image without needing to decode the entire code stream; 4) a flexible file format with provisions for specifying opacity information and image sequences; and 5) good error resilience. due to its excellent coding performance and many attractive features, jpeg2000 has a very large potential application base. some possible application areas include image archiving, the internet, web browsing, document imaging, digital photography, medical imaging, remote sensing, and desktop publishing. the jpeg2000 core compression algorithm is primarily based on embedded block coding with optimized truncation (ebcot) of the embedded bit-streams [14–16]. the ebcot algorithm provides superior compression performance and produces a bit-stream with features such as resolution and snr scalability and random access.

8.1 jpeg2000 codec structure
the general structure of the jpeg2000 codec is shown in fig. 18. the input to the encoding process is an image consisting of one or more components. before any further processing takes place, an image is partitioned into one or more disjoint rectangular regions called tiles.
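a minimal sketch of the tiling step (ours; jpeg2000 additionally records tile grid offsets and anchor points, which are ignored here):

```python
import numpy as np

def tile_image(img, tile_h, tile_w):
    """partition an image into disjoint rectangular tiles; border tiles
    may be smaller when the image size is not a multiple of the tile size."""
    tiles = []
    for top in range(0, img.shape[0], tile_h):
        for left in range(0, img.shape[1], tile_w):
            tiles.append(img[top:top + tile_h, left:left + tile_w])
    return tiles

tiles = tile_image(np.arange(48).reshape(6, 8), 4, 4)
print([t.shape for t in tiles])   # [(4, 4), (4, 4), (2, 4), (2, 4)]
```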
this is done when the original image is quite large in comparison with the amount of memory available to the codec. in the preprocessing step, each image component has its sample values adjusted by an additive bias, in a process called dc level shifting. the bias is chosen such that the resulting sample values have a nominal dynamic range (approximately) centered about zero. the rgb color space is transformed to the ycrcb color space in the forward intercomponent transform step. in the forward intracomponent transform step, transforms that operate on individual components can be applied; the particular type of operator employed for this purpose is the wavelet transform. the resulting wavelet coefficients are quantized in the quantization step; a different quantizer is employed for the coefficients of each sub-band. in the case of lossless coding, reversible transforms must be employed and all quantizer step sizes are forced to be one. in the tier-1 coding step, the quantizer indices for each sub-band are partitioned into code blocks and each of the code blocks is independently embedded coded. the coding is performed using the bit-plane coder, with three coding passes per bit plane: the significance pass, the refinement pass, and the cleanup pass. in the tier-2 encoding step, the coding pass information generated during tier-1 is packaged into data units called packets, in a process referred to as packetization. the packetization process imposes a particular organization on the coding pass data in the output code stream; this organization facilitates many of the desired codec features cited above. rate control can be achieved through two distinct mechanisms: 1) the choice of quantizer step sizes, and 2) the selection of the subset of coding passes to include in the code stream. the decoder structure essentially mirrors that of the encoder; that is, with the exception of rate control, there is a one-to-one correspondence between the functional blocks in the encoder and decoder, and each functional block in the decoder either exactly or approximately inverts the effects of its corresponding block in the encoder. since the tiles are coded independently of one another, the input image is (conceptually, at least) processed one tile at a time. in the sections that follow, each of the above processes is briefly explained; for more details about the whole process, see [20].

8.2 preprocessing/postprocessing
the codec expects its input sample data to have a nominal dynamic range that is approximately centered about zero, and the preprocessing stage of the encoder simply ensures that this expectation is met. suppose that a particular component has p bits/sample. the samples may be either signed or unsigned, leading to a nominal dynamic range of [−2^(p−1), 2^(p−1) − 1] or [0, 2^p − 1], respectively. if the sample values are unsigned, the nominal dynamic range is clearly not centered about zero; thus, the nominal dynamic range of the samples is adjusted by subtracting a bias of 2^(p−1) from each of the sample values. if the sample values for a component are signed, the nominal dynamic range is already centered about zero, and no processing is required. the postprocessing stage of the decoder essentially undoes the effects of preprocessing in the encoder: if the sample values for a component are unsigned, the original nominal dynamic range is restored. lastly, in the case of lossy coding, clipping is performed to ensure that the sample values do not exceed the allowable range.
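a sketch of dc level shifting and the corresponding decoder-side inversion with clipping (our illustration, assuming numpy integer arrays):

```python
import numpy as np

def dc_level_shift(samples, p, signed=False):
    """centre unsigned p-bit samples about zero by subtracting 2**(p - 1);
    signed samples are already centred and are left untouched."""
    x = samples.astype(np.int64)
    return x if signed else x - 2 ** (p - 1)

def undo_dc_level_shift(samples, p, signed=False):
    """decoder side: restore the original range, then clip (lossy case)."""
    x = samples if signed else samples + 2 ** (p - 1)
    lo, hi = (-2 ** (p - 1), 2 ** (p - 1) - 1) if signed else (0, 2 ** p - 1)
    return np.clip(x, lo, hi)

x = np.array([0, 128, 255])
print(dc_level_shift(x, 8))                    # [-128    0  127]
print(undo_dc_level_shift(dc_level_shift(x, 8), 8))   # [  0 128 255]
```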
fig. 18: the general structure of the jpeg2000 codec: (a) encoder, (b) decoder

8.3 intercomponent transform
in the encoder, the preprocessing stage is followed by the forward intercomponent transform stage. here, an intercomponent transform can be applied to the tile-component data. such a transform operates on all of the components together and serves to reduce the correlation between the components, leading to improved coding efficiency. only two intercomponent transforms are defined in the baseline jpeg2000 codec: the irreversible color transform (ict) and the reversible color transform (rct). the ict is nonreversible and real-to-real in nature, while the rct is reversible and integer-to-integer. both of these transforms essentially map image data from the rgb to the ycrcb color space. the ict may only be used in the case of lossy coding, while the rct can be used in either the lossy or the lossless case. after the intercomponent transform stage in the encoder, the data from each component is treated independently. the ict is nothing more than the classic rgb to ycrcb color space transform; the rct is simply a reversible integer-to-integer approximation to the ict (similar to that proposed in [24]). the inverse intercomponent transform stage in the decoder essentially undoes the effects of the forward intercomponent transform stage in the encoder.

8.4 intracomponent transform
the intercomponent transform stage in the encoder is followed by the intracomponent transform stage. in this stage, transforms that operate on individual components can be applied; the particular type of operator employed for this purpose is the wavelet transform. through the application of the wavelet transform, a component is split into numerous frequency bands (i.e., sub-bands). due to the statistical properties of these sub-band signals, the transformed data can usually be coded more efficiently than the original untransformed data. both reversible integer-to-integer [22, 23, 25–27] and non-reversible real-to-real wavelet transforms are employed by the baseline codec. the inverse intracomponent transform stage in the decoder essentially undoes the effects of the forward intracomponent transform stage in the encoder.

8.5 quantization/dequantization
in the encoder, after the tile-component data has been transformed (by intercomponent and/or intracomponent transforms), the resulting coefficients are quantized. quantization allows greater compression to be achieved by representing the transform coefficients with only the minimal precision required to obtain the desired level of image quality; it is one of the two primary sources of information loss in the coding path. the transform coefficients are quantized using scalar quantization with a deadzone. a different quantizer is employed for the coefficients of each sub-band, and each quantizer has only one parameter, its step size. in the decoder, the dequantization stage tries to undo the effects of quantization. unless all of the quantizer step sizes are less than or equal to one, the quantization process will normally result in some information loss, and this inversion process is only approximate.
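a deadzone scalar quantizer and its approximate inverse can be sketched as follows (ours; the reconstruction offset r = 0.5 is one common choice, not something mandated by the standard):

```python
import numpy as np

def deadzone_quantize(coeffs, step):
    """deadzone scalar quantizer: sign(c) * floor(|c| / step)."""
    c = np.asarray(coeffs, dtype=float)
    return (np.sign(c) * np.floor(np.abs(c) / step)).astype(int)

def deadzone_dequantize(indices, step, r=0.5):
    """approximate inverse: nonzero indices are reconstructed at an offset r
    inside their interval; zero indices stay zero (the deadzone)."""
    q = np.asarray(indices, dtype=float)
    return np.sign(q) * (np.abs(q) + r) * step * (q != 0)

x = [10.4, -3.1, 0.7, 25.0]
q = deadzone_quantize(x, 4.0)
print(q, deadzone_dequantize(q, 4.0))   # [2 0 0 6] [10.  0.  0. 26.]
```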
the baseline codec has two distinct modes of operation, referred to as the integer mode and the real mode. in the integer mode, integer-to-integer transforms are employed and the quantization step size is fixed to one; lossy coding is still achieved by discarding bit-planes. in the real mode, real-to-real transforms are employed, and the quantization step sizes are chosen in conjunction with rate control. in this mode, lossy compression is achieved by discarding bit-planes, by changing the size of the quantization step, or both.

8.6 tier-1 coding
after quantization is performed in the encoder, tier-1 coding takes place. this is the first of two coding stages. the quantizer indices for each sub-band are partitioned into code blocks, and each of the code blocks is independently coded using the bit-plane coder. for each code block, an embedded code is produced, comprising numerous coding passes; the output of the tier-1 encoding process is therefore a collection of coding passes for the various code blocks. on the decoder side, the bit-plane coding passes for the various code blocks are input to the tier-1 decoder, these passes are decoded, and the resulting data is assembled into sub-bands. the tier-1 coding process essentially involves bit-plane coding. after all of the sub-bands have been partitioned into code blocks, each of the resulting code blocks is independently coded using a bit-plane coder. there are three coding passes per bit plane; in order, these are: 1) the significance pass, 2) the refinement pass, and 3) the cleanup pass. all three types of coding passes scan the samples of a code block in the same fixed order. the bit-plane encoding process generates a sequence of symbols for each coding pass, and some or all of these symbols may be entropy coded. for the purposes of entropy coding, a context-based adaptive binary arithmetic coder is used [28]. for each pass, all of the symbols are either arithmetically coded or raw coded (i.e., the binary symbols are emitted as raw bits with simple bit stuffing). the arithmetic and raw coding processes both ensure that certain bit patterns never occur in the output, allowing such patterns to be used for error-resilience purposes. the following subsections present these three passes.

8.6.1 significance pass
the first coding pass for each bit plane is the significance pass. this pass is used to convey the significance and (as necessary) sign information for samples that have not yet been found to be significant and are predicted to become significant during the processing of the current bit plane. the samples in the code block are scanned in a predefined order. if a sample has not yet been found to be significant and is predicted to become significant, the significance of the sample is coded with a single binary symbol; if the sample also happens to be significant, its sign is coded using a single binary symbol. if the most significant bit plane is being processed, all samples are predicted to remain insignificant; otherwise, a sample is predicted to become significant if any 8-connected neighbor has already been found to be significant. as a consequence of this prediction policy, the significance and refinement passes for the most significant bit plane are always empty (and need not be explicitly coded). the symbols generated during the significance pass may or may not be arithmetically coded. if arithmetic coding is employed, the binary symbol conveying significance information is coded using one of nine contexts; the particular context used is selected on the basis of the significance of the sample's 8-connected neighbors and the orientation of the sub-band with which the sample is associated (e.g., ll, lh, hl, hh).
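the neighbor-based prediction can be sketched as follows (our illustration on a boolean significance map):

```python
import numpy as np

def predicted_significant(sig_map, row, col):
    """a sample is predicted to become significant if any of its
    8-connected neighbours is already significant (sig_map is boolean)."""
    h, w = sig_map.shape
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            r, c = row + dr, col + dc
            if 0 <= r < h and 0 <= c < w and sig_map[r, c]:
                return True
    return False

sig = np.zeros((4, 4), dtype=bool)
sig[1, 1] = True
print(predicted_significant(sig, 2, 2), predicted_significant(sig, 0, 3))
# True False
```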
in the case that arithmetic coding is used, the sign of a sample is coded as the difference between the actual and the predicted sign; otherwise, the sign is coded directly. sign prediction is performed using the significance and sign information of the 4-connected neighbors.

8.6.2 refinement pass
the second coding pass for each bit plane is the refinement pass. this pass signals the subsequent bits after the most significant bit for each sample. if a sample was found to be significant in a previous bit plane, the next most significant bit of that sample is conveyed using a single binary symbol. like those of the significance pass, the symbols of the refinement pass may or may not be arithmetically coded. if arithmetic coding is employed, each refinement symbol is coded using one of three contexts; the particular context employed is selected according to whether the second msb position is being refined and the significance of the 8-connected neighbors.

8.6.3 cleanup pass
the third (and final) coding pass for each bit plane is the cleanup pass. this pass is used to convey significance and (as necessary) sign information for those samples that have not yet been found to be significant and are predicted to remain insignificant during the processing of the current bit plane. conceptually, the cleanup pass is not much different from the significance pass; the key difference is that the cleanup pass conveys information about samples that are predicted to remain insignificant, rather than those that are predicted to become significant. algorithmically, however, there is one important difference between the cleanup and significance passes: in the cleanup pass, samples are sometimes processed in groups, rather than individually as in the significance pass.

8.7 tier-2 coding
in the encoder, tier-1 encoding is followed by tier-2 encoding. the input to the tier-2 encoding process is the set of bit-plane coding passes generated during tier-1 encoding. in tier-2 encoding, the coding pass information is packaged into data units called packets, in a process referred to as packetization, and the resulting packets are then output to the final code stream. the packetization process imposes a particular organization on the coding pass data in the output code stream; this organization facilitates many of the desired codec features, including rate scalability and progressive recovery by fidelity or resolution. in the decoder, the tier-2 decoding process extracts the various coding passes from the code stream (i.e., depacketization) and associates each coding pass with its corresponding code block.

8.8 rate control
in the encoder, rate control can be achieved through two distinct mechanisms: 1) the choice of quantizer step sizes, and 2) the selection of the subset of coding passes to include in the code stream. when the integer coding mode is used (i.e., when only integer-to-integer transforms are employed), only the first mechanism may be used, since the quantizer step sizes must be fixed at one. when the real coding mode is used, either or both of these rate control mechanisms may be employed. when the first mechanism is employed, the quantizer step sizes are adjusted in order to control the rate: as the step sizes are increased, the rate decreases, at the cost of greater distortion. although this rate control mechanism is conceptually simple, it does have one potential drawback.
every time the quantizer step sizes are changed, the quantizer indices change, and tier-1 encoding must be performed again. since tier-1 coding requires a considerable amount of computation, this approach to rate control may not be practical in computationally constrained encoders. when the second mechanism is used, the encoder can elect to discard coding passes in order to control the rate. the encoder knows the contribution that each coding pass makes to the rate, and can also calculate the distortion reduction associated with each coding pass. using this information, the encoder can include the coding passes in order of decreasing distortion reduction per unit rate until the bit budget has been exhausted. this approach is very flexible in that different distortion metrics can be easily accommodated (e.g., mean squared error, visually weighted mean squared error, etc.). for a more detailed treatment of rate control, the reader is referred to [14] and [20].
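a sketch of the greedy pass selection (ours; a real encoder must additionally respect the embedded ordering of passes within each code block, which this toy version ignores):

```python
def select_passes(passes, budget):
    """greedy rate control: each pass is (bits, distortion_reduction);
    passes are included in order of decreasing reduction per bit until
    the bit budget is exhausted."""
    order = sorted(passes, key=lambda p: p[1] / p[0], reverse=True)
    chosen, used = [], 0
    for bits, gain in order:
        if used + bits <= budget:
            chosen.append((bits, gain))
            used += bits
    return chosen, used

passes = [(100, 900.0), (80, 300.0), (120, 240.0), (60, 540.0)]
print(select_passes(passes, 250))   # picks the two steepest passes, then fills up
```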
8.9 region of interest coding
the codec allows different regions of an image to be coded with differing fidelity; this feature is known as region-of-interest (roi) coding. when an image is to be coded with an roi, some of the transform coefficients are identified as being more important than the others. the coefficients of greater importance are referred to as roi coefficients, while the remaining coefficients are known as background coefficients. roi coefficients are encoded with greater precision than the background coefficients. for more information on roi coding, the reader is referred to [29, 30].

9 conclusions
this paper has presented a review of some embedded rate scalable image codecs. the embedded zerotree wavelet (ezw) coder is often used as a performance reference. similar visual performance, with the added advantage of low arithmetic complexity, is achieved by the set partitioning in hierarchical trees (spiht) algorithm. these algorithms use a tree structure or lists to detect and exploit similarities. the precise rate control that is achieved with these algorithms is a distinct advantage: the user can choose a bit rate and encode the image to exactly the desired bit rate. ezw and spiht offer only snr scalability. the complexity of the sfq algorithm lies mainly in the iterative zerotree pruning stage of the encoder, which can be substantially reduced with fast heuristics based on models rather than actual r-d data, which is expensive to compute. the sfq coder has two important characteristics: first, sfq is built around a linear transform that allows signal energy to be compacted both in frequency and in space, together with quantization modes designed to match this characterization; second, sfq provides a framework for optimizing (in the rate-distortion sense) the application of the quantization modes available to it. the ebcot image compression algorithm offers state-of-the-art compression performance together with an unprecedented set of bit-stream features, including resolution scalability, snr scalability and a "random access" capability; all of these features can coexist within a single bit-stream without substantial sacrifices in compression efficiency. the mzte algorithm describes the texture representation scheme adopted for mpeg-4 synthetic/natural hybrid coding (snhc) of texture maps and images. the scheme is based on the concept of the multi-scale zerotree wavelet entropy (mzte) coding technique, which provides many levels of scalability layers in terms of either spatial resolution or picture quality. mzte, with three different modes (single-q, multi-q, and bilevel), provides much improved compression efficiency and fine-gradual scalability, which is ideal for hybrid coding of texture maps and natural images. the mzte scheme has been adopted as the baseline technique for the visual texture coding profile in both the mpeg-4 video group and the snhc group; mzte was also rated as one of the top five schemes in terms of compression efficiency in the jpeg2000 evaluation of november 1997, among 27 submitted proposals. cezw is a coding algorithm for color images that uses a luminance/chrominance color space. in the cezw algorithm, data rate scalability is achieved by using an embedded coding scheme similar to the ezw algorithm. the cezw algorithm does not require image-dependent transforms such as the kl transform to decorrelate the color components: a spatial-orientation tree that links not only the frequency bands but also the color channels is used for scanning the wavelet coefficients, so that the interdependence between the different color components in luminance/chrominance spaces is automatically exploited. the cspiht scheme offers a simple solution for embedding both luminance and chrominance data into a single coded data stream. if an image goes through the maximum possible number of levels of dwt decomposition, cspiht provides performance comparable to spiht klt in terms of quality of reconstruction; for far fewer decomposition levels on the same image, cspiht offers much better quality of reconstruction without the need to perform the computationally intensive klt on the spectral components of the image. the advantage in the performance of cspiht lies in low bit-rate coding and in dwt maps with large ll sub-band dimensions. the jpeg2000 standard for still image compression was also presented. the jpeg2000 standard supports lossy and lossless compression of single-component and multi-component imagery. in addition to this basic compression functionality, it has many other features, such as progressive recovery of an image by fidelity or resolution; region-of-interest coding, whereby different parts of an image can be coded with differing fidelity; random access to particular regions of an image without needing to decode the entire code stream; a flexible file format with provisions for specifying opacity information and image sequences; and good error resilience. the performance of all the mentioned algorithms can be improved by using efficient sign coding and estimation of zero-quantized coefficients, as presented in [31].

references
[1] netravali a. n., rubinstein c. b.: "luminance adaptive coding of chrominance signals." ieee transactions on communications, com-27, 1979, p. 703–710.
[2] limb j. o., rubinstein c. b.: "plateau coding of the chrominance component of color picture signals." ieee transactions on communications, com-22, 1974, p. 812–820.
[3] rabbani m., jones p. w.: digital image compression techniques. bellingham, washington: spie optical engineering press, 1991.
[4] shapiro j. m.: "embedded image coding using zerotrees of wavelet coefficients." ieee transactions on signal processing, vol. 41 (1993), p. 3445–3462.
[5] shen k., delp e. j.: "wavelet based rate scalable video compression." ieee transactions on circuits and systems for video technology, vol. 9 (1999), p. 109–122.
[6] frazier m. w.: an introduction to wavelets through linear algebra. springer-verlag new york inc., 1999.
[7] shen k., delp e. j.: color image compression using an embedded rate scalable approach. proc. ieee int. conf. image processing, santa barbara, ca, oct. 26–29, 1997, p. iii-34–iii-37.
[8] bhaskaran v., konstantinides k.: image and video compression standards: algorithms and architectures (second edition). kluwer academic publishers, 1997.
[9] eskicioglu a. m., fisher p. s.: "image quality measures and their performance." ieee transactions on communications, vol. 43 (1995), p. 2959–2965.
[10] said a., pearlman w. a.: "a new, fast, and efficient image codec based on set partitioning in hierarchical trees." ieee transactions on circuits and systems for video technology, vol. 6 (1996), p. 243–250.
[11] witten i. h., neal r., cleary j. g.: "arithmetic coding for data compression." comm. acm, vol. 30 (1987), p. 520–540.
[12] xiong z., ramchandran k., orchard m. t.: "space-frequency quantization for wavelet image coding." ieee transactions on image processing, vol. 6 (1997), p. 677–693.
[13] sodagar i. et al.: "scalable wavelet coding for synthetic/natural hybrid images." ieee transactions on circuits and systems for video technology, vol. 9 (1999), p. 244–254.
[14] taubman d.: "high performance scalable image compression with ebcot." proc. of ieee international conference on image processing, kobe (japan), vol. 3 (1999), p. 344–348.
[15] taubman d.: "high performance scalable image compression with ebcot." ieee transactions on image processing, vol. 9 (2000), p. 1158–1170.
[16] taubman d. et al.: "embedded block coding in jpeg 2000." signal processing: image communication, vol. 17 (2002), p. 49–72.
[17] antonini m. et al.: "image coding using wavelet transform." ieee transactions on image processing, vol. 1 (1992), p. 205–220.
[18] kassim a. a., lee w. s.: "embedded color image coding using spiht with partially linked spatial orientation trees." ieee transactions on circuits and systems for video technology, vol. 13 (2003), p. 203–206.
[19] kouassi r. k. et al.: "application of the karhunen-loeve transform for natural color images analysis." conf. record of the 31st asilomar conf. on signals, systems & computers, vol. 2 (1997), p. 1740–1744.
[20] iso/iec 15444-1: information technology – jpeg2000 image coding system – part 1: core coding system, 2000.
[21] lewis a. s., knowles g.: "image compression using the 2-d wavelet transform." ieee transactions on image processing, vol. 1 (1992), p. 244–250.
[22] calderbank a. r. et al.: "wavelet transforms that map integers to integers." applied and computational harmonic analysis, vol. 5 (1998), p. 332–369.
[23] adams m. d.: reversible integer-to-integer wavelet transforms for image coding. ph.d. thesis, department of electrical and computer engineering, university of british columbia, vancouver, bc, canada, sept. 2002, available online from http://www.ece.uvic.ca/~mdadams.
[24] gormish m. j. et al.: lossless and nearly lossless compression of high-quality images. proc. of spie, san jose, ca, usa, vol. 3025, 1997, p. 62–70.
[25] chao h., fisher p., hua z.: an approach to integer wavelet transforms for lossless image compression. proc. of international symposium on computational mathematics, guangzhou, china, 1997, p. 19–38.
[26] adams m. d.: reversible wavelet transforms and their application to embedded image compression. m.sc. thesis, department of electrical and computer engineering, university of victoria, victoria, bc, canada, jan. 1998, available from http://www.ece.uvic.ca/~mdadams.
[27] adams m. d., kossentini f.: "reversible integer-to-integer wavelet transforms for image compression: performance evaluation and analysis." ieee transactions on image processing, vol. 9 (2000), p. 1010–1024.
[28] iso/iec 14492-1: lossy/lossless coding of bi-level images, 2000.
[29] christopoulos c., askelof j., larsson m.: "efficient methods for encoding regions of interest in the upcoming jpeg2000 still image coding standard." ieee signal processing letters, vol. 7 (2000), p. 247–249.
[30] askelof j., carlander m. l., christopoulos c.: "region of interest coding in jpeg2000." signal processing: image communication, vol. 17 (2002), p. 105–111.
[31] deever a. t., hemami s. s.: "efficient sign coding and estimation of zero-quantized coefficients in embedded wavelet image codecs." ieee transactions on image processing, vol. 12 (2003), p. 420–430.

doc. ing. boris šimák, csc., e-mail: simak@feld.cvut.cz
eng. farag ibrahim younis elnagahy, msc., e-mail: faragelnagahy@hotmail.com
department of telecommunications engineering, czech technical university in prague, faculty of electrical engineering, technická 2, 166 27 praha 6, czech republic

fdtd requires the computational domain to be limited by boundary conditions. two boundary conditions are used here: perfectly electric conductors (pec) and perfectly matched layers (pml). pec's are implemented by forcing the tangential component of the electric field along the boundaries to be zero (e_tangential = 0). in the pml technique, an artificial layer of absorbing material is placed around the outer boundary of the computational domain; the goal is to ensure that an electromagnetic wave incident into the pml region at an arbitrary angle is absorbed without reflection. the pml region is realized by implementing a new degree of freedom into the formulas of the fdtd code, which is done by splitting the field components [1]. fig. 1 shows the dimensions of the housing space under consideration from a top and a side view. the housing space under consideration is restricted to a volume of 4.2 m × 8.2 m × 5 m. five boundaries are considered to be walls of ferroconcrete and therefore good reflectors for high-frequency signals, as used for mobile communication; hence these walls are simulated by pec's. the 6th boundary is considered to be a window/door combination and is therefore electromagnetically transparent. in order to simulate the incidence of electromagnetic waves from radiating antennas far away from the housing, the window/door boundary is simulated as a source plane. for the calculation, the direction of propagation needs to be taken into account, so the electric field vector of the incident wave is simulated on the source plane by decomposition into components. the simulation is performed by setting the components of the electric field strength e_inc(z) and e_inc(y) on the source plane according to the electric field strength e_inc of the electromagnetic wave incident on the window/door combination. the time dependence is taken into account by setting the values of the components of the electric field strength on the source plane sinusoidally.
setting of the magnetic component was omitted, as the magnetic and electric fields are related by the impedance of free space. an impedance-matched simulation of free space along the window/door combination (source plane) is assured by implementing pmls. the pml structure numerically absorbs the energy of the electromagnetic wave traveling from the interior of the housing space towards the environment. 2 results as shown in fig. 1a, the numerical calculation was performed for an electromagnetic wave with magnetic and electric field vectors hinc and einc and direction of propagation k related to the dimensions of the defined cartesian system. the investigation of the electric field strength inside the housing space was based on different angles of incidence ranging from 5° to 85° in steps of 5°. for each angle of incidence a calculation was performed until a steady state of the electric field inside the housing space could be observed. the data analysis was restricted to the last time interval in the steady state. the last time interval was divided into 10 time points with equal time spacing. fixing the angle of incidence of the propagating electromagnetic wave, the maximum absolute value of the electric field strength emax was detected within the housing space and within the chosen time points for the steady state. in addition, emax was referred to the amplitude of the incident electric field strength einc. fig. 2 shows emax/einc over the angle of incidence. as may be seen, the maximum electric field strength depends strongly on the angle of the incident electromagnetic wave, and the maximum value of the electric field strength inside the housing exceeds the value of the electric field strength of the incident wave. the maximum may be observed for an angle of incidence of 20°, with a ratio emax/einc = 2.5. this may be explained by reflections and superposition on the perfectly conducting walls of the housing space, particularly in corners and edges, where superposition of reflected electric fields from several walls may occur. taking into account that the density of the electromagnetic energy is quadratically dependent on the electric field strength, it may be argued that the energy density may in the worst case be about 5 times higher than the energy density of the incident electromagnetic wave on the source plane. conclusion from fig. 2 it may be concluded that the effects of electromagnetic radiation from antennas for mobile communication should not only be judged by their electric field strength in free space or at boundaries between free space and housings. as electromagnetic waves with high frequencies may have negative effects on humans, attention should be paid to legal limits for electromagnetic radiation from radio transmitters for mobile communication in the vicinity of housings. legal limits referring to free-space propagation of electromagnetic waves should be regarded with care, as under unfavourable conditions humans inside housings may be subjected to electric field strengths which exceed the allowed limits. it should be taken into account that the effects of electromagnetic radiation on humans are quadratically dependent on the electric field strength, as these effects are mainly related to the energy density of the electromagnetic waves, and therefore the negative impacts on humans increase disproportionately with the electric field strength.
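the sweep over the angle of incidence and the extraction of emax/einc described in the results can be pictured with a small driver; run_fdtd_steady_state below is a placeholder stub (the actual 3d solver is not reproduced here), and its synthetic return values are invented only so that the sketch runs:

```python
import numpy as np

def run_fdtd_steady_state(angle_deg):
    # placeholder for the 3d fdtd run at one angle of incidence: it is
    # assumed to return |e| snapshots at 10 equally spaced time points
    # of the last steady-state interval; synthetic data stands in here.
    rng = np.random.default_rng(int(angle_deg))
    return rng.uniform(0.0, 2.5, size=(10, 8, 8, 8))

e_inc = 1.0                          # amplitude on the source plane
angles = np.arange(5, 90, 5)         # 5 deg ... 85 deg in 5 deg steps
ratios = [np.abs(run_fdtd_steady_state(a)).max() / e_inc for a in angles]

worst = angles[int(np.argmax(ratios))]
print(f"worst-case angle {worst} deg, e_max/e_inc = {max(ratios):.2f}")
print(f"worst-case energy-density factor: {max(ratios)**2:.1f}")  # ~5 in the paper
```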
fig. 2: dependency of the maximum electric field strength inside a housing space referred to the incident electric field strength (emax/einc) over the angle of incidence

references

[1] berenger, j. p.: a perfectly matched layer for the absorption of electromagnetic waves. journal of computational physics, 114, 1994, p. 185–200.

[2] sadiku, m.: numerical techniques in electromagnetics. 2nd edition, crc press, 2001, isbn 0-8493-1395-3, p. 121–186.

[3] simonyi, k.: theoretische elektrotechnik. deutscher verlag der wissenschaften, 10. auflage, isbn 3-335-00375-6.

dr. ing. h.-p. geromiller, phone: +49 3 715 313 354, fax: +49 3 715 313 417, email: hans-peter.geromiller@e-technik.tu-chemnitz.de
prof. dr. ing. habil. a. farschtschi
technical university of chemnitz, chair of fundamentals of electromagnetics, 09111 chemnitz, germany

acta polytechnica doi:10.14311/ap.2020.60.0313 acta polytechnica 60(4):313–317, 2020, © czech technical university in prague, 2020, available online at https://ojs.cvut.cz/ojs/index.php/ap

external rolling of a polygon on closed curvilinear profile

tetiana kresan, serhii pylypaka, zinovii ruzhylo, ivan rogovskii, oleksandra trokhaniak∗
national university of life and environmental sciences of ukraine, heroiv oborony str., 15, kyiv, ukraine
∗ corresponding author: klendii_o@ukr.net

abstract. the rolling of a flat figure in the form of an equilateral polygon on a curvilinear profile is considered. the profile is periodic. it is formed by a series connection of arcs of a symmetrical curve. the ends of the arcs rest on a circle of a given radius. the equation of the curve from which the curvilinear profile is constructed is found. this is done under the condition that the centre of the polygon, when it rolls along the profile, must also move in a circle. rolling occurs in the absence of sliding. therefore, the length of the arc of the curve is equal to the length of the side of the polygon. to find the equations of the curve of the profile, a first-order differential equation is constructed. its analytical solution is obtained. the parametric equations of the curve are obtained in the polar coordinate system. the limits of the change of the angular parameter for the construction of a profile element are found. it is a part of the arc of the curve. according to the obtained equations, curvilinear profiles with different numbers of elements are constructed.

keywords: equilateral polygon, curvilinear profile, external rolling, differential equation, centroids.

1. introduction

some flat figures, including polygons that can be rolled on a curvilinear periodic profile without sliding, are considered in [1]. the profile is formed by equal symmetrical curvilinear elements connected in series so that the ends of the elements abut on a straight line. when rolling a polygon on such a profile, its centre moves in a straight line. constructing a closed profile in which the curvilinear elements touch a circle is important for the design of centroids of non-circular wheels. when rolling a polygon on such a profile, its centre moves in a circle. if both centres (the centre of the curvilinear profile and the centre of the polygon) are stationary, then these figures can roll on one another while rotating around their centres. one centroid will be the polygon, the other will be the closed profile.
many works are devoted to the study of the rolling of flat figures on one another. there are common examples of rolling a straight line segment along a curve and, vice versa, of rolling a curvilinear profile along a straight line. the classic example of the first case is the rolling of a straight line along a circle, in which a point of the line describes the evolvent of the circle; the classic example of the second case is the rolling of a circle along a straight line, in which a point of the circle describes a cycloid [2]. in [3], information on the rolling of second-order curves along a straight line is given. the trajectories of the foci in such rolling are known curves. the formation of plane curves by given kinematic parameters is considered in [4]. the basics of designing non-circular wheels for gears are given in [5]. geometric modelling of centroids of non-circular wheels was further developed in the works [6–8]. the use of non-circular wheels in gears has been considered in [9–17], and in chain drives in a monograph [18]. the purpose of the article is to develop an analytical description of a curvilinear closed profile along which an equilateral polygon rolls without sliding while its centre moves in a circle of a given radius.

2. material and method

consider the rolling of a polygon by the example of a square. we need to find the form of a flat profile along which the square will roll without sliding while its centre moves in a circle of radius R (fig. 1,a). the rolling of a square can be considered through the rolling of a triangle whose base is the side of the square and whose altitude is the distance ac = a from the centre of the square to its side. in the initial position, the altitude ac is located on the ox axis (fig. 1,a). when the square rolls, its centre moves in a circle of radius R, and its side rolls along the curve that is to be found. the moment comes when the point of contact of the square with the curve becomes its vertex. the diagonal a'c' in this position passes through the origin of coordinates, the point o. it is obvious that the length s of the arc aa' is equal to the length of half of the side of the square (fig. 1,a). when rolling, the side of the square is tangent to the curve, and in the position where the point of contact is its vertex, both sides of the square touch the curves. with further rolling, the process is repeated. in this way, the curvilinear profile will consist of equal arcs that intersect on a circle of radius r₀ at a right angle (fig. 1,a).

figure 1. graphic illustrations for rolling a square along a curvilinear contour: a) schematic representation of two positions of the square when it is rolling; b) the current point of contact a' in the polar coordinate system.

in fig. 1,a, it is shown that in the position when the point of contact with the curve is the vertex of the square, its centre is on the radius-vector oc'. this applies to any current point of contact of the side of the square with the curve. let the triangle, which is the fourth part of the square, touch the curve at the current point a' (fig. 1,b). as the rolling occurs without sliding, the current point of contact a' can be considered as the instantaneous centre of rotation of the segment a'c' around it. in this case, the direction of the velocity of the point c' must be perpendicular to the segment a'c'.
however, on the condition that the point c' moves in a circle of radius R, it must also be perpendicular to the radius-vector oc'. thus, the centre of the square c' (or the vertex of the corresponding triangle, whose base is the side of the square) and the point of contact of the side of the square a' are located on a common radius-vector that starts at the origin. due to this, the equation of the curve of the profile is conveniently considered in the polar coordinate system. denote the distance from the origin of coordinates to the point of contact by oa' = ρ (fig. 1,b), where ρ is a function of the angle α: ρ = ρ(α). the constant distance R is the sum of two segments of variable lengths: R = ρ + a'c'. the length of the hypotenuse a'c' can be found from the right triangle a'c'b. we find the expressions of the lengths of the legs a'b and bc' for the current tangency point a'. for a better understanding of the rolling process, consider the moving coordinate system, the accompanying trihedral of the curve of the profile, in which the unit vector τ is tangent to the curve, the unit vector n is the principal normal, and the unit vector of the binormal b is projected to a point. the vertex of the trihedral is the point a in its initial position, with the altitude ac lying on the ox axis (fig. 1,a,b), which coincides with the unit normal vector n (fig. 1,b). when the triangle rolls, the point of contact a', which is the vertex of the trihedral, moves along the curve and the unit vector τ remains tangent to the curve. in the initial position τn of the trihedral, the points a and b coincided, and the altitude a of the triangle coincided with the unit vector n. when rolling the triangle along the curve, the trihedral occupies a new position τ′n′ with the vertex at the point a' (fig. 1,b). the coordinates of the point c' in this system are as follows. the segment a'b is equal to the length of the arc aa' of the curve: aa' = a'b = s. the altitude a of the triangle, when it is rolling, remains parallel to the principal normal n′. so the length of the segment a'c' is determined by the pythagorean theorem: a'c' = √(s² + a²). the expression R = ρ + a'c' can then be written as follows:

R = ρ + √(s² + a²).   (1)

let us solve equation (1) for s:

s = √((R − ρ)² − a²).   (2)

let us write the parametric equations of the curve of the profile in the polar coordinate system:

x = ρ cos α;  y = ρ sin α.   (3)

let us find the expression of the arc length s of the curve (3). to do this, we define its first derivatives:

x′ = ρ′ cos α − ρ sin α;  y′ = ρ′ sin α + ρ cos α.   (4)

by the known formula we write:

ds/dα = √(x′² + y′²) = √(ρ² + ρ′²).   (5)

the derivative of the arc s can be found by differentiation of expression (2):

ds/dα = −ρ′ (R − ρ) / √((R − ρ)² − a²).   (6)

we equate expressions (5) and (6) and solve for ρ′:

dρ/dα = (ρ/a) √((R − ρ)² − a²).   (7)

the first-order differential equation (7) is obtained on the basis of the equality of the arcs of the profile curve and the side of the square that rolls on it without sliding. the altitude a of the triangle can be found through the angle ε: a = r·cos ε, where r is the length of the lateral side of the triangle, which is equal to the radius of the circle circumscribed about the square. let us extend this expression to a polygon with an arbitrary number of sides n. the angle ε in this case will depend on the number of sides of the polygon: ε = π/n.
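the step from (5) and (6) to (7) can be checked symbolically; the sketch below squares both arc-length expressions to clear the roots and solves for the derivative, with p standing for dρ/dα (sympy is used here purely as a checking aid and is not part of the original work):

```python
import sympy as sp

# symbolic check that equating the arc-length derivatives (5) and (6)
# reproduces ode (7); p stands for d(rho)/d(alpha), and R, a, rho are
# treated as positive quantities.
R, a, rho, p = sp.symbols('R a rho p', positive=True)

# squares of (5) and (6), cross-multiplied to clear the square roots
eq = sp.Eq((rho**2 + p**2) * ((R - rho)**2 - a**2), p**2 * (R - rho)**2)

# expect p = (rho/a)*sqrt((R - rho)**2 - a**2), possibly in expanded form
print(sp.solve(eq, p))
```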
the differential equation (7) for a polygon with an arbitrary number of sides n, inscribed in a circle of radius r, will be written:

dρ/dα = (ρ / (r cos(π/n))) √((R − ρ)² − r² cos²(π/n)).   (8)

the differential equation (8) has an analytical solution. from the condition that ρ = R − a = R − r·cos(π/n) at α = 0, we find the corresponding value of the constant of integration. with consideration of this constant, the solution of equation (8) takes the final form:

ρ = (R² − r² cos²(π/n)) / (R + r cos(π/n) · cosh(α √(R²/(r² cos²(π/n)) − 1))).   (9)

substitution of (9) in (3) gives the parametric equations of the curve. we need a limited arc to construct a profile. the magnitude of this arc is determined by the minimum value of the radius-vector ρ = r₀ (fig. 1,a). inversely, r₀ = R − a'c' = R − r. let us substitute ρ = R − r in (9) and solve for α:

α₀ = ± (r cos(π/n) / √(R² − r² cos²(π/n))) · arccosh((R − r cos²(π/n)) / ((R − r) cos(π/n))).   (10)

the wanted arc of the curve at given values of R, r and n is constructed according to equations (3), taking into account (9), when the angle α changes within α = −α₀ … α₀. however, in this case, we will not generally be able to place the required number of arcs so that a closed profile is obtained. if we want to construct a profile of four arcs, then the angle α₀ = ±π/4 (this case is shown in fig. 1,a). the measure of the angle α₀ is determined by dividing the number π by the number of arcs. hence, for a given number of arcs of a profile and a given number of sides of the polygon, of the two radii R and r we can specify only one, since the angle α₀ will also be given. equation (10) cannot be solved in closed form with respect to one of the radii R or r, so numerical techniques must be used.

3. result

it should be noted that the curvilinear profile can consist of one arc. if the polygon is a square, then at α₀ = ±π, n = 4, R = 100, we find: r = 95.28. in fig. 2 (curvilinear profile with complete rolling during which the square makes a quarter of a turn), we can see a curvilinear profile and a square in two positions on opposite sides. when the profile is completely rolled, only one side of the square contacts it, that is, the square makes a quarter of a turn. if a curvilinear profile consists of four elements, then the square, during one complete rolling of the profile, makes one turn. repeating the calculation at α₀ = ±π/4, n = 4, R = 100, we find: r = 62.27. in fig. 3, we can see a curvilinear profile of four elements and a set of positions of a square when it is rolled along one of the elements. the sequential movement of the centre of the square when it is rolled is shown by circles. in the extreme positions of the square, when its vertices are points of contact with the profile, the sides of the square are depicted thickened. it should be noted that for the physical rolling of a polygon along a curvilinear profile, there is a limit on the number of its sides. the number of sides of a polygon cannot be less than four. this is explained as follows. when a polygon rolls, its vertex describes a known curve, the evolvent. its property is that at the moment of detachment from the curve, the point of the straight line that rolls on it (in our case the end of the side of the square) moves perpendicular to it. this can be seen from the enlarged fragment in fig. 3,b. if the polygon were a triangle, then the angle between neighbouring elements would be 60° and physical rolling would be impossible. the considered approach allows constructing a curvilinear profile that provides the required number of revolutions of the polygon at complete rolling along the profile.
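since (10) has no closed-form inverse, the circumradius r for a prescribed α₀ can be found numerically; the bisection sketch below is an assumed solution method (the paper does not state which numerical technique was used) and reproduces the values quoted above:

```python
import numpy as np

# evaluates alpha0(r) from eq. (10) and solves it for the circumradius r
# by bisection; reproduces r = 95.28 (alpha0 = pi) and r = 62.27
# (alpha0 = pi/4) for n = 4, R = 100.
def alpha0(r, R, n):
    c = np.cos(np.pi / n)
    k = r * c / np.sqrt(R**2 - (r * c)**2)
    return k * np.arccosh((R - r * c**2) / ((R - r) * c))

def solve_r(target, R, n):
    lo, hi = 1e-6, R * (1 - 1e-9)        # r must stay strictly below R
    for _ in range(200):                 # alpha0 grows monotonically with r
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if alpha0(mid, R, n) < target else (lo, mid)
    return 0.5 * (lo + hi)

print(round(solve_r(np.pi, 100, 4), 2))      # -> 95.28
print(round(solve_r(np.pi / 4, 100, 4), 2))  # -> 62.27
```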
it is determined by the ratio of the number of profile elements to the number of sides of the polygon. in fig. 4,a, a profile consisting of 8 elements is constructed. at R = 100, the radius of the circumscribed circle is r = 41.76. when completely rolling along the profile, the square makes 2 turns. to ensure 4 turns, the profile must have 16 elements, with a radius of r = 25.13 (fig. 4,b). with an unlimited increase in the number n of the sides of the polygon, it transforms into a circle, and the radius-vector ρ into the constant value R − r, that is, r₀ (fig. 1,a). the number of turns of a circle of radius r, after a complete rolling along a circle of radius r₀, is determined by the ratio of these radii.

figure 3. curvilinear profile of four elements; when completely rolling along it, the square makes one turn: a) the set of positions of the square when it is rolling along the arc of the profile; b) enlarged fragment of the profile element.

figure 4. curvilinear profiles with different numbers of turns of a square: a) a square, when completely rolling along the profile, makes two turns; b) a square, when completely rolling along the profile, makes four turns.

figure 5. polygons and curvilinear profiles with different numbers of sides and curvilinear elements at R = 100: a) a hexagon with a radius of the circumscribed circle r = 54.93 and a corresponding curvilinear profile with six elements; b) a pentagon with a radius of the circumscribed circle r = 71.47 and a corresponding curvilinear profile with three elements.

the correspondence of the number of sides of a polygon and the number of elements of a curvilinear profile can be different (fig. 5). such figures can roll on one another with simultaneous rotation around the fixed centres o and o1 with angular velocities ω and ω1, that is, they can serve as centroids for the design of non-circular gears [6–8]. the predetermined value R is the centre-to-centre distance. when one non-circular wheel rotates at a constant angular velocity, the second will rotate at a variable angular velocity. this is due to the variable radius from the centres of the wheels to the point of contact during the rotation. the radius-vector ρ changes from the maximum value at α = 0 to the minimum ρ = r₀ = R − r (fig. 1,a, points a and a'). this difference is the difference between the maximum and minimum values of the distance from the axis of rotation of the polygon to the point of contact with the curvilinear profile. it decreases as the number of sides of the polygon and the number of curvilinear elements of the profile increase, which can be traced by the example of the square in figs. 2–4.

4. conclusions and prospects for further research

an analytical description of the rolling of a polygon along a curvilinear closed profile can be used to design non-circular gear wheels. the number of sides of the polygon must be at least four in order for physical rolling to be possible. to design a curvilinear centroid, it is necessary to specify the centre-to-centre distance, the number of sides of the polygon (that is, of the other centroid), and the correspondence of the numbers of turns of these centroids. the length of a centroid element is equal to the length of the side of the polygon. the number of elements of a centroid can be any integer, starting with one.
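the construction used in the results above, one arc computed from (9) and (3) for α in [−α₀, α₀] and then rotated into place m times, can be sketched as follows (n, R, r are the illustrative square/four-element pair from the results):

```python
import numpy as np

# builds the closed profile as m rotated copies of the single arc given
# by (9) and (3); m arcs of angular width 2*alpha0 close the profile,
# since m * 2 * alpha0 = 2*pi.
n, R, r, m = 4, 100.0, 62.27, 4
c = np.cos(np.pi / n)
alpha0 = np.pi / m

alpha = np.linspace(-alpha0, alpha0, 201)
k = np.sqrt(R**2 / (r * c)**2 - 1)
rho = (R**2 - (r * c)**2) / (R + r * c * np.cosh(alpha * k))   # eq. (9)

profile = []
for j in range(m):                         # rotate the arc into place
    phi = alpha + 2 * alpha0 * j
    profile.append(np.column_stack((rho * np.cos(phi), rho * np.sin(phi))))
profile = np.vstack(profile)               # closed centroid, ready to plot

print(rho.min(), R - r)   # minimum rho should equal r0 = R - r = 37.73
```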
increasing the number of curvilinear centroid elements and the number of sides of the polygon increases the uniformity of rotation of one centroid with respect to the other.

references

[1] p. m. zaika. selected problems of agricultural mechanics. ukrainian agricultural academy publishing house, kyiv, 1992.

[2] a. a. savelov. flat curves. systematics, properties, applications. state publishing house of physical and mathematical literature, moscow, 1960.

[3] d. hilbert, s. cohn-vossen. visual geometry. science, moscow, 1981.

[4] v. bulgakov, s. pilipaka, v. adamchuk, j. olt. theory of motion of a material point along a plane curve with a constant pressure and velocity. agronomy research 12(3):937–948, 2014.

[5] f. l. litvin. theory of gears. science, moscow, 1968.

[6] v. v. kovrigin, i. v. molovik. analytical description of centroids of non-circular gears. applied geometry and engineering graphics, melitopol: tavriya state agrotechnological university 49(4):125–129, 2011.

[7] y. p. legeta, o. v. shoman. geometric modeling of centroids of non-circular gears by transfer function. geometric modeling and information technology, scientific journal: mykolaiv national university named after v. o. sukhomlinsky (2):59–63, 2016.

[8] y. p. legeta. description and construction of coupled centroids of non-circular gears. modern problems of modeling, melitopol: melitopol bogdan khmelnytsky state pedagogical university 3:87–92, 2014.

[9] a. p. padalko, n. a. padalko. tooth gear with non-circular wheel. theory of mechanisms and machines 11(2):89–96, 2013.

[10] a. n. sobolev, a. y. nekrasov, m. o. arbuzov. modeling of mechanical gears with non-circular gears. bulletin of moscow state university "stankin" 40(1):48–51, 2017.

[11] a. a. lyashkov, k. l. panchuk, i. a. khasanova. automated geometric and computer-aided non-circular gear formation modeling. journal of physics: conference series 1050:012049, 2018. doi:10.1088/1742-6596/1050/1/012049.

[12] j. han, d. z. li, t. gao, l. xia. research on obtaining of tooth profile of non-circular gear based on virtual slotting. 2015.

[13] d. mundo. geometric design of a planetary gear train with non-circular gears. mechanism and machine theory 41(4):456–472, 2006. doi:10.1016/j.mechmachtheory.2005.06.003.

[14] e. ottaviano, d. mundo, g. a. danieli, m. ceccarelli. numerical and experimental analysis of non-circular gears and cam-follower systems as function generators. mechanism and machine theory 43(8):996–1008, 2008. doi:10.1016/j.mechmachtheory.2007.07.004.

[15] v. marius, a. laurenţia. technologies for non-circular gear generation and manufacture.

[16] e. doege, m. hindersmann. optimized kinematics of mechanical presses with noncircular gears. cirp annals 46(1):213–216, 1997. doi:10.1016/s0007-8506(07)60811-7.

[17] w. smith. math of noncircular gearing. gear technology 17(4):18–21, 2000.

[18] n. p. ututov. chain drives with non-circular gears. lugansk: knowledge, 2011.

knowledge support of simulation model reuse

m. valášek, p. steinbauer, z. šika, z. zdráhal

this paper describes the knowledge support for engineering design based on virtual modelling and simulation. these are the results of the ec clockwork project. a typical and important step in the development of a simulation model is the phase of reuse. virtual modelling and simulation often use the components of previous models. the usual problem is that the only remaining part of the previous simulation models is the model itself. however, a large amount of knowledge and intermediate models have been used, developed and then lost. a special methodology and special tools have therefore been developed in support of storing, retrieving and reusing the knowledge from previous simulation models. the knowledge support includes informal knowledge, formal knowledge and intermediate engineering models. this paper describes the overall methodology and tools, using the example of developing a simulation model of trijoint, a new machine tool.

keywords: simulation model, reuse, knowledge management, machine tool.

1 introduction

this paper describes the application of the reuse methodology developed within the framework of the eu clockwork project (creating learning organisations with contextualised knowledge-rich work artifacts).
the presented reuse approach is based on knowledge support for engineering design based on virtual modelling and simulation. the use of the clockwork tools on an application example is shown for demonstration purposes. the target is to obtain an overview of the possibilities and functionality of the clockwork methodology and the tools from the clockwork toolkit. two examples are given. these are based on examples from the clockwork project trials. first, we describe the development of the simulation model of the dyna-m machine tool, as a collaborative development taking place from scratch between two geographically distributed partners. then we describe the development of the simulation model of the trijoint machine tool as a reuse development, where the simulation model of dyna-m is reused. both examples are motivated by current highly innovative developments in the machine-tool industry. it is expected that the new generation of machine tools with increased productivity will be based on hybrid kinematics. the development of such machine tools requires overall comprehensive modelling and simulation of the dynamic interactions between nonlinear kinematics, structural flexibility, drives and advanced control systems.

2 collaborative development of a simulation model of dyna-m from scratch

this simulation model is developed from scratch. the development process involves collaboration between two partners. one partner (a) is competent in modelling mechanics & control. the other partner (b) is competent in modelling of drives. they are geographically distributed and they cooperate through the internet. the development follows the methodology described in [1, 2]. the methodology follows the subsequent formulation of the simulation problem in four worlds: real (initial problem formulation), conceptual (decomposition of the problem into components), physical (creation of a physical model in ideal objects) and simulation (a proper simulation model). the problem in the real world is formulated above. the real world object is formed from a real world system, a question and a solution (fig. 1). the real world system is the dyna-m machine tool. the modelling goal is given by the question. the question concerns the dynamic interaction within the machine, in particular what level of dynamic accuracy can be achieved within the interaction between nonlinear kinematics, structural flexibility, drive dynamics and control. this question will be answered by a simulation experiment on the simulation model. this data is stored in the clockwork knowledge management tool (ckmt) (see fig. 1). a new case is opened. the texts (abstracts), pictures and possible further extending documents in files (e.g. papers, reports) are filled in and loaded into ckmt. this is the informal knowledge on the level of the real world object. the formal knowledge is represented by semantic indexing. the real world object is named "machine tool dyna-m", and it is classified in the product ontology (fig. 2) as "horizontal-machine-tool". these semantic indices will later be used for searching and retrieving suitable cases. this is done by partner a, who acts as the leading partner in the cooperation. the next step according to the clockwork methodology is conceptualization. the conceptualization consists in the decomposition of the real world system into components. in many cases such a decomposition is described by a cad drawing or by 3d volumetric geometric models. during this decomposition the first assumptions are raised, e.g.
neglecting parts of the machine. in our case, for example, the moving table and the spindle motion are not considered. as a part of the decomposition, the real world system is decomposed into the machine system (becoming the conceptual model) and the environment system (becoming the modelling environment) describing the interaction (excitation) of the system from outside. the question is formulated in greater detail and becomes the modelling objective. the conceptual model itself can be developed and described using any tool for graphical sketching. in our case it was exported from the cad drawing into corel draw and finally into jpg format.

fig. 1: dyna-m as a real world system in ckmt

fig. 2: semantic indexing of the dyna-m simulation problem

the drawing was extended by the names of particular components for later semantic indexing. the development and description of the conceptual model can also be done within the modelling tool from clockwork (fig. 3). more support can then be provided for saving appropriate knowledge. the modelling tool supports the annotation of particular components and the automatic export of their names for semantic indexing. the developed conceptual world object is stored into ckmt. a new case is opened. the texts (abstracts), pictures, files containing the conceptual model(s) and possible further extending documents in files (e.g. papers, reports) are filled in and loaded into ckmt. this is the informal knowledge on the level of the conceptual world object. the formal knowledge is again created by semantic indexing. the names of particular components are classified in the component ontology and stored in ckmt. these semantic indices will later be used for searching and retrieving suitable cases. all this is done by partner a. now the cooperation begins. the stored data in ckmt is transmitted to partner b. using the accompanying e-mail discussion space or using e-mail directly, it is agreed that partner b will develop the modelling and simulation part of the electric drives. he studies and analyses the material in ckmt, discusses it with partner a, and as a result he adds and/or modifies the data in ckmt on the level of the conceptual world object.
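the case structure used in the paragraphs above, informal knowledge (texts, pictures, files) stored next to formal knowledge (ontology indices and named relations), can be pictured with a small illustrative sketch; the record layout and all field names are hypothetical and do not reproduce the actual ckmt data model:

```python
from dataclasses import dataclass, field

# hypothetical, simplified picture of a ckmt "case": informal knowledge
# (abstract text, attached files) next to formal knowledge (ontology
# terms and named relations); all names here are illustrative only.
@dataclass
class Case:
    name: str
    abstract: str                                           # informal knowledge
    files: list[str] = field(default_factory=list)
    ontology_terms: set[str] = field(default_factory=set)   # formal knowledge
    relations: dict[str, str] = field(default_factory=dict)

rwo = Case(
    name="machine tool dyna-m",
    abstract="dynamic accuracy achievable within the interaction of "
             "nonlinear kinematics, flexibility, drives and control ...",
    files=["dyna-m_overview.pdf", "conceptual_model.jpg"],
    ontology_terms={"horizontal-machine-tool"},
)
cwo = Case(
    name="dyna-m conceptual model",
    abstract="decomposition into machine system and environment system",
    relations={"is conceptualised as": rwo.name},
)
print(cwo.relations)
```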
these iterations of the conceptual world object are stored by partner b into ckmt either as a modification of the existing conceptual world object or as a new version of it. as a result of these iterations, a conceptual world object is developed that is acceptable for both partners a and b. at this stage the semantic indexing can be completed. this refers to the semantic indices themselves, but especially to the description of the relation "is conceptualised as" between the real world object and the conceptual model. the following step according to the clockwork modelling and simulation methodology is the proper modelling [1, 2]. the modelling consists in the subsequent replacement of the components of the conceptual model by ideal objects, thus obtaining the physical model. during this process the interfaces between partners a and b must first be specified. then both partners develop the physical models of their parts of the complete model. during this process many different modelling assumptions are raised; e.g., the links of dyna-m are modelled as rigid bodies and the flexibility is taken into account in the models of the moving screws. the modelling environment is transformed into the model input, and the modelling objective is transformed into the model output, i.e. into the excitation of the model and the measures for evaluating the solution. the development and description of the physical model can also be done within the modelling tool (fig. 4), but any other graphical tool can be used. the modelling tool can provide more support for saving appropriate knowledge. the modelling tool supports the annotation of particular ideal objects and the automatic export of their names for semantic indexing. the developed modelled world object is stored into ckmt. a new case is opened. the texts (abstracts), pictures, files containing the physical model(s) and possible further extending documents in files (e.g. papers, reports) are filled in and loaded into ckmt. this is the informal knowledge on the level of the modelled world object. the formal knowledge is again created by semantic indexing. the names of particular components are classified in the ideal objects ontology and stored in ckmt. then the description of the relation "is modelled as" between the conceptual and physical models (fig. 5) is also created. these semantic indices and relations will later be used for searching and retrieving suitable cases. the next step according to the clockwork modelling and simulation methodology is the simulation model implementation and, later, simulation experimentation. the simulation model implementation consists in the subsequent replacement of the ideal objects in the physical model by the constructs of the simulation software, thus obtaining the simulation model.

fig. 3: dyna-m conceptual model including annotations

again the interfaces between partners a and b must be specified. then both partners develop the simulation models of their parts of the complete model. for the solution of the simulation model a solution procedure must be selected. during this process further assumptions are raised, e.g. the chosen accuracy of the integration procedure. during the implementation, testing model inputs and model outputs are used in order to test the correctness of the implementation during the testing simulation runs. in our case, for example, unit steps are used in order to check the dynamic behaviour.
the implementation of the simulation model is realized in the simulation software (in our case in matlab-simulink), but the simulation annotation tool (fig. 6) for saving appropriate knowledge is of great importance. the simulation annotation tool developed for simulink supports the annotation of particular simulation constructs and the automatic export of their names for semantic indexing.

fig. 4: dyna-m physical model

fig. 5: semantic indexing of relations between conceptual and physical models

3 lessons learned from the dyna-m case

the developed simulated world object is stored into ckmt. the final step according to the clockwork methodology is the simulation experiment. the tested simulation model is finally run with the model input developed on the level of the physical model. the model outputs are analyzed and interpreted in order to answer the question from the real world object. in the case of dyna-m, the real input was the motion along the desired circular trajectory, and the achieved dynamic accuracy was investigated. the results of the simulation experiment are then stored in ckmt at the simulated world object, and their interpretation is stored as the solution at the real world object. this finally closes the life cycle of modelling and simulation according to the clockwork methodology. at this stage, after finishing the first example, the advantages of the clockwork approach and the main differences between the clockwork approach and the traditional development process of simulation models can be summarized. first, the additional supporting tools for modelling and simulation developed in the clockwork project can be listed. leaving aside the knowledge part (the clockwork knowledge management tool, the apollo ontology tool, the ontology interface tool), there are tools that are applicable and useful when applied alone. these are the simulation annotation tool and the modelling tool. they support the description of the different models created within the development process of the simulation model. regarding the differences and advantages of the clockwork approach over the traditional development process of a simulation model, the most important consideration is the amount of work documents remaining after the end of the development process. in the domain of engineering, all models are fully developed, including the intermediate models that have been used during the development. in the domain of knowledge, both the informal and the formal knowledge associated with the simulation development have been captured. the advantage of the clockwork approach for the development of a cooperative simulation model is the fact that traditionally only the resulting simulation model has usually remained, without the previous and associated tacit models and knowledge.

4 development by reuse of the simulation model of the trijoint machine tool

the second modelling and simulation case investigates the dynamic properties of the new machine tool trijoint 900h (fig. 7). this machine tool has been developed within cooperation between kovosvit mas and the czech technical university in prague. it is a new machine tool concept patented worldwide by kovosvit mas and the czech technical university. it is also a machine tool with hybrid kinematics. its remarkable property is that it has simultaneously increased both the dynamics and the stiffness by a factor of 2–3 over traditional machine tools.
the development is done by partner a, and it is supported by the reuse methodology and the capability of the clockwork modelling and simulation methodology and the clockwork toolkit. first, the problem in the real world is formulated. the real world object is developed with the real world system, the question and the expected solution.

fig. 6: simulation model of dyna-m with annotations

fig. 7: search for a similar case to trijoint 900h using semantic indices

fig. 8: modified simulation model of trijoint based on the reused simulation model of dyna-m

the resulting data is stored into ckmt similarly as for dyna-m. a new case is opened. the texts (abstracts), pictures and possible further extending documents in files (e.g. papers, reports) are filled in and loaded into ckmt. this is the informal knowledge on the level of the real world object. the formal knowledge is created by semantic indexing. the real world object is named "machine tool ph" (ph, i.e., parallel horizontal machine tool, was the working name before the final name trijoint 900h), and it is classified in the product ontology as "horizontal-machine-tool". this is stored in ckmt. based on this semantic indexing, the possibility arises of reusing some previous case from the ckmt database. this can be done by a search (fig. 7) based on semantic indices. the search can be based on formal knowledge or even on informal knowledge. the search is based on looking for similarity. the similarity in formal knowledge is based on similarity in the ontology. the similarity in informal knowledge is based on the similarity of words of natural language (search through keywords). the retrieved similar case from ckmt is dyna-m. the reuse development process within the clockwork methodology can start. first the retrieved case of dyna-m is copied into the ph case on the level of the conceptual, modelling and simulation world objects. subsequently these levels are modified. first the conceptual model is modified and replaced in ckmt. the modelling tool for supporting modelling on the levels above the simulation model is advantageously applied. the same is done on the physical modelling level with the physical model and the mwo stored in ckmt. the final level of the simulation model is solved by replacing the blocks of the dynamics and kinematics of dyna-m (fig. 6) with the corresponding blocks for ph (trijoint) (fig. 8). the resulting simulation world object is stored in ckmt. the more material that can be reused, the more important is the testing of the implemented simulation model (fig. 9). after its successful realization, simulation experiments can be performed. the results of the simulation experiment are then stored in ckmt at the simulated world object, and their interpretation is stored as the solution at the real world object. this finally again closes the life cycle of modelling and simulation according to the clockwork modelling and simulation methodology.

5 lessons learned from the trijoint case

regarding the differences and advantages of the clockwork approach over the traditional development process of a simulation model, it may be stated that the reuse of traditional simulation models is much more difficult compared with the clockwork approach.
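the semantics-based retrieval sketched in fig. 7 can be illustrated generically; the toy ontology, the similarity measure and all names below are invented for illustration and are not the actual ckmt retrieval algorithm:

```python
# illustrative semantics-based case search over a toy ontology given as
# child -> parent edges; a generic taxonomy similarity, not ckmt's.
ontology = {
    "horizontal-machine-tool": "machine-tool",
    "vertical-machine-tool": "machine-tool",
    "machine-tool": "product",
}

def ancestors(term):
    chain = [term]
    while term in ontology:
        term = ontology[term]
        chain.append(term)
    return chain

def formal_similarity(a, b):
    # shared ancestry in the ontology: 1.0 for identical terms, smaller
    # for terms that only meet higher up in the taxonomy
    shared = set(ancestors(a)) & set(ancestors(b))
    return len(shared) / max(len(ancestors(a)), len(ancestors(b)))

cases = {"machine tool dyna-m": "horizontal-machine-tool",
         "milling head xy": "vertical-machine-tool"}
query = "horizontal-machine-tool"    # index of the new "machine tool ph" case
best = max(cases, key=lambda c: formal_similarity(cases[c], query))
print(best)                          # -> machine tool dyna-m
```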
in the traditional approach, abstract models and informal knowledge, especially annotations of the models, do not exist. this support for a proper understanding of the simulation model is then usually missing, preventing reuse altogether. the main supporting aspects for modelling and simulation developed within the clockwork project, with special importance for reuse, can be listed as follows:

- use of the facility for knowledge sharing in collaboration and reuse has been demonstrated.
- the domain of engineering is supported and organized on the basis of support for the knowledge domain.
- a semantics-based search through formal knowledge, informal knowledge, and reasoning over both.

fig. 9: simulation experiments with the simulation model of trijoint 900h (requested and realized trajectories xn, yn; transients owing to the difference between the initial and the requested position and to a jump change of the requested trajectory direction; requested velocity 46.6 m/min)

6 conclusions

the clockwork approach, including the clockwork methodology and tools, can be evaluated on the basis of the experience with the development of the simulation models of dyna-m and trijoint. it can be concluded that the clockwork approach supports both the collaborative development and the reuse of simulation models. the clockwork tools support all stages of the development process in the domains of both engineering and knowledge. finally, it can be concluded that the clockwork approach improves the whole engineering process of modelling and simulation.

acknowledgment

the authors appreciate the support provided by the eu ist 12566 project clockwork.

references

[1] steinbauer p., valášek m., zdráhal z., mulholland p., šika z.: "knowledge support of virtual modelling and simulation", in: proc. of nato asi virtual nonlinear multibody systems, prague, 2002, p. ii/241–246.

[2] valasek m., steinbauer p., kolar j., dvorak j.: "concurrent design of railway vehicles by simulation model reuse", in: aed 03 proceedings.

prof. ing. michael valášek, drsc., phone: +420 224 357 361, fax: +420 224 916 709, e-mail: michael.valasek@fs.cvut.cz
ing. pavel steinbauer, ph.d., e-mail: pavel.steinbauer@fs.cvut.cz
ing. zbyněk šika, ph.d., e-mail: zbynek.sika@fs.cvut.cz
department of mechanics, czech technical university in prague, faculty of mechanical engineering, karlovo nám. 13, 121 35 praha 2, czech republic
prof. zdeněk zdráhal, knowledge media institute, the open university, walton hall, milton keynes, uk
acta polytechnica doi:10.14311/ap.2016.56.0360 acta polytechnica 56(5):360–366, 2016, © czech technical university in prague, 2016, available online at http://ojs.cvut.cz/ojs/index.php/ap

experimental evaluation of wedm machined surface waviness

katerina mouralova∗, jiri kovar, pavel houska
brno university of technology, faculty of mechanical engineering, technicka 2896/2, 61669 brno, czech republic
∗ corresponding author: mouralova@fme.vutbr.cz

abstract. wire electrical discharge machining (wedm) is an unconventional machining technology which has become indispensable in many industries. the typical morphology of a surface machined using the electrical discharge technology is characterized by a large number of craters caused by the electro-spark discharges produced during the machining process. the study deals with an evaluation of the effect of the machine parameter setting on the profile parameters of surface waviness on samples made of two metal materials, al 99.5 and ti-6al-4v. attention was also paid to an evaluation of the surface morphology using 3d colour filtered and non-filtered images.

keywords: wedm; electrical discharge machining; titanium alloy ti-6al-4v; aluminium al 99.5; waviness.

1. introduction

the physical substance of stock removal in the electro-erosion process lies in periodically acting electrical discharges between the tool electrode and the workpiece. the erosion process itself takes place in a dielectric medium, a liquid with high electric resistance. microscopic particles are washed away by the dielectric medium and small craters are formed on the workpiece surface. wire electrical discharge machining (wedm), the diagram of which is in fig. 1, is one of the most productive electro-erosion applications. there are no limiting mechanical properties of the machined material, such as toughness or high hardness; only electric conductivity is necessary. the tool is a continuously unwinding wire electrode, removing material in all directions, and its geometry does not change as in conventional machining methods [1, 2].
mahapatra [3] focused on the significant machining parameters (metal removal rate, kerf and surface finish) as the performance measures in wedm. it was proved that every performance measure requires a different combination of these factors for its optimization. for their experiment, nonlinear regression analysis (for the study of the relationship between control factors and responses) and a genetic algorithm (for the wedm process optimization with multiple objectives) were employed. the experiments demonstrated that the wedm process parameters can be adjusted to improve the above-mentioned performance measures (metal removal rate, surface finish and kerf). sarkar [4] investigated the optimization of wedm of gamma titanium aluminide by employing artificial neural network modelling. according to the outcomes of the experiments and owing to the overall optimization strategy and the combination of single-pass and multi-pass cutting operations, a novel concept of critical surface roughness and effective cutting speed was obtained for the machining process selection in order to reach its maximum productivity. they focused on the influence of four process parameters (namely, pulse on time, peak current, dielectric flow rate, and effective wire offset) on the process performance.

figure 1. diagram of wire electrical discharge machining process.

there are many factors which have a substantial influence on the quality of the machined surface, and they can be found using various methods [5, 6]. although the machine setting parameters are a significant factor, it is the material characteristics of the workpiece that define the final surface quality. the surface quality parameters are influenced by a set of physical and mechanical characteristics of the machined material and the type of its heat treatment [7].

table 1. chemical composition of aluminium al 99.5 prescribed by standard.

contents     si    fe    cu     zn     ti
max. (wt%)   0.3   0.4   0.05   0.07   0.05

table 2. chemical composition of titanium alloy ti-6al-4v prescribed by standard, and one type of heat treatment.

contents     al     fe     o     v
min. (wt%)   5.5    –      –     3.5
max. (wt%)   6.75   0.25   0.2   4.5

heat treatment (ht): quenched and tempered; 940 °c / 45 min / water; 500 °c / 2 h / air

2. experimental setup and material

2.1. experimental material

samples for the experiment were made of pure aluminium al 99.5 and titanium alloy ti-6al-4v. aluminium al 99.5 is a material with a low specific weight. its indisputable advantages include excellent corrosion resistance, good weldability and suitability for anodizing, with a hardness of 15 hb, a tensile strength of 65–160 mpa, and the chemical composition given in tab. 1. it is used in almost all industrial sectors for structural elements and units that are only lightly mechanically stressed, requiring a highly ductile material with high corrosion resistance which is very well thermally and electrically conductive [8]. basically, it can be welded using any method [9]. the experiment used an initial rod 20 mm in diameter, out of which a square-shaped blank was made by electro-erosion machining. titanium alloy ti-6al-4v, with the chemical composition shown in tab. 2, was used in two sets: the first set without additional heat treatment, the second set with the heat treatment given in tab. 2.
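the standard limits of tab. 1 and tab. 2 lend themselves to a small conformity check; the helper below and the sample melt values are illustrative only:

```python
# checks a measured composition (wt%) against the standard limits of
# tab. 1 / tab. 2; the sample melt values are invented for illustration.
limits_al995 = {"si": (0.0, 0.3), "fe": (0.0, 0.4), "cu": (0.0, 0.05),
                "zn": (0.0, 0.07), "ti": (0.0, 0.05)}
limits_ti64 = {"al": (5.5, 6.75), "fe": (0.0, 0.25),
               "o": (0.0, 0.2), "v": (3.5, 4.5)}

def conforms(measured, limits):
    # every element must fall inside its [min, max] window
    return all(lo <= measured.get(el, 0.0) <= hi
               for el, (lo, hi) in limits.items())

sample = {"al": 6.1, "fe": 0.18, "o": 0.15, "v": 4.0}   # hypothetical melt
print(conforms(sample, limits_ti64))                     # -> True
```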
this alloy has a high tensile strength of 900 mpa and an excellent corrosion resistance. it has the highest strength-to-specific-weight ratio of all metal materials [10]. it has a high biocompatibility and a capacity to resist thermal loads up to a temperature of 315 °c. it is used for manufacturing constructional parts of weapons and aircraft, turbine blades, fasteners, medical and dental implants, and sport equipment [11]. the experiment used an initial square-shaped material 18 mm in thickness. for the purpose of increasing the hardness, heat treatment was carried out. a zhr 4150ak hardness tester, series rockwell, by zwick roell was used for the measurement of the hardness of the sample material. for the titanium alloy, a hardness of 46 hrc (432 hb) was measured.

table 3. machining parameters used in the experiments. gv – gap voltage (v); ton – pulse on time (µs); toff – pulse off time (µs); wot – wire off time (m/min); dc – discharge current (a).

sample   gv   ton   toff   wot   dc
  1      70    8     40     12    30
  2      60    8     30     12    30
  3      60    8     40     12    25
  4      60   10     40     12    30
  5      50    8     40     12    30
  6      60    8     50     12    30
  7      60    6     40     12    30
  8      60    8     40     12    35
  9      60    8     40     10    30
 10      60    8     40     14    30
 11      60    8     40     12    30
 12      50    6     30     10    35
 13      70   10     50     10    25
 14      70   10     30     10    35
 15      60    8     40     12    30
 16      70    6     50     10    35
 17      70   10     50     14    35
 18      60    8     40     12    30
 19      60    8     40     12    30
 20      70    6     50     14    25
 21      50    6     30     14    25
 22      60    8     40     12    30
 23      70   10     30     14    25
 24      50    6     50     10    25
 25      60    8     40     12    30
 26      50   10     50     14    25
 27      50   10     30     10    25
 28      50    6     50     14    35
 29      50   10     50     10    35
 30      70    6     30     14    35
 31      50   10     30     14    35
 32      60    8     40     12    30
 33      70    6     30     10    25

2.2. wedm machine setup

the wedm machine used in this study was the high-precision five-axis cnc machine makino eu64. as the electrode, a brass wire (60 % cu and 40 % zn) penta cut e with a diameter of 0.25 mm was used. the samples were immersed in deionized water, which served as the dielectric medium and also removed debris from the gap between the wire electrode and the workpiece during the process. to find out the effects of the parameters gap voltage, pulse on time (ton), pulse off time (toff), wire feed and discharge current on the machined surface, a different setting (tab. 3) was used for each of the 33 samples made of the individual materials. the values of the individual parameter settings were determined on the basis of previous tests [12].

figure 2. waviness measurement area on each sample (sample of titanium alloy).

figure 3. position of distribution of areas for waviness measurement.

figure 4. average values wa, wq and wz of samples made of al 99.5.

for the experiment, a "half response surface design" containing 33 runs grouped in two blocks (tab. 3) was chosen. in order to reduce the possibility of systematic errors, the individual runs were randomised; besides that, 7 central points were added to the experiment to ensure a better measure of error. this plan of data collection has been described in detail, for example, by montgomery [13].

3. results of experiment

the morphology and the waviness parameters of the machined surface were studied using the contactless measuring instrument ifm g4 from alicona. the measured data were analyzed using the software if-laboratory measurement supplied by alicona. the term profile waviness denotes a curvature that shows a certain periodicity (waving). the area of waviness measurement on each sample is shown in fig. 2.
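tab. 3 can be handled as structured data; the excerpt below (the full data is in the table above) picks out the 7 centre points of the half response surface design, i.e. the runs at gv = 60 v, ton = 8 µs, toff = 40 µs, wot = 12 m/min, dc = 30 a:

```python
# excerpt of tab. 3 as structured data: run number -> (gv, ton, toff, wot, dc)
runs = {
    1: (70, 8, 40, 12, 30),  11: (60, 8, 40, 12, 30), 15: (60, 8, 40, 12, 30),
    18: (60, 8, 40, 12, 30), 19: (60, 8, 40, 12, 30), 22: (60, 8, 40, 12, 30),
    25: (60, 8, 40, 12, 30), 32: (60, 8, 40, 12, 30), 33: (70, 6, 30, 10, 25),
}
center = (60, 8, 40, 12, 30)
center_runs = sorted(k for k, v in runs.items() if v == center)
print(center_runs)   # -> [11, 15, 18, 19, 22, 25, 32], the 7 centre points
```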
in this area, waviness measurement was carried out according to iso 4287 [14] on straight lines 4.2 mm long, always at exactly defined distances from one another, as shown in fig. 3. the waviness parameters evaluated by the profile method were the maximum height of the waviness profile wz, the arithmetic average of the waviness wa and the root mean square deviation of the waviness profile wq.

figure 5. colour filtered image of the surface of sample 23, 2.5× magnified (voltage 70 v, ton = 10 µs, toff = 30 µs, wire feed 14 m/min, current 25 a).

3.1. the results of the profile waviness measurements on samples made of al 99.5

the measured values of the waviness parameters wa, wz and wq were processed into fig. 4. each of the parameters was measured in 5 points on each sample and then the average value was calculated. the measured values of the waviness parameter wa range in an interval from 1.56 µm (sample 23) to 2.75 µm (sample 14). the average value of wq in the measured samples was 2.85 µm, and for wz it was 7.48 µm. overall, sample 23 has the lowest values of all measured parameters, wa = 1.56 µm, wq = 1.88 µm and wz = 3.23 µm, and its surface is shown in fig. 5.

figure 6. values wa in individual points of measurement on all samples made of al 99.5.

figure 7. average values wa, wq and wz of samples made of titanium alloy ti-6al-4v.

a box graph (fig. 6) was compiled from all measured values of wa in the individual points on the samples according to fig. 3. the average measured value of wa does not differ significantly in the individual points of measurement. only the maximum and minimum values measured in the individual points are different. the lowest values were measured in the middle of the sample, in points 3 and 4.

3.2. the results of the profile waviness measurements on samples made of titanium alloy ti-6al-4v

the average values of the waviness parameters in 5 points on the surfaces of the samples made of the titanium alloy were processed into fig. 7. the minimum measured value of the waviness parameter wa was in sample 9, specifically 1.21 µm, and the maximum in sample 20, specifically 2.26 µm. the average value of wq in the measured samples was 2.16 µm, and for wz it was 5.08 µm. the maximum values of all three waviness parameters were for sample 30; its surface morphology is shown in fig. 8.

figure 8. non-filtered image of the surface of sample 30, 2.5× magnified (voltage 70 v, ton = 6 µs, toff = 30 µs, wire feed 14 m/min, current 35 a).

figure 9. values wa in individual points of measurement on all samples made of titanium alloy ti-6al-4v.

the parameter wa was measured in 5 points on each sample, and its values did not differ significantly in these different points, which is apparent from fig. 9. neither in the parameters wq and wz was a significant deviation found in the individual points of measurement.
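the three profile parameters named above have simple definitions once the waviness profile w(x) has been separated from roughness by the standard profile filtering; the sketch below applies them to a synthetic profile, and wz is taken here as the overall peak-to-valley height over the whole 4.2 mm evaluation length for simplicity:

```python
import numpy as np

# minimal sketch of the iso 4287 waviness parameters used in the text,
# applied to a synthetic profile; a real evaluation would first separate
# waviness from roughness with the standard profile filters.
x = np.linspace(0.0, 4.2e-3, 2000)               # 4.2 mm evaluation length
w = 2e-6 * np.sin(2 * np.pi * x / 8e-4) \
    + 0.5e-6 * np.sin(2 * np.pi * x / 3e-4)      # synthetic profile [m]
w = w - w.mean()                                  # deviations from mean line

wa = np.mean(np.abs(w))                           # arithmetic average
wq = np.sqrt(np.mean(w**2))                       # root mean square deviation
wz = w.max() - w.min()                            # maximum height of profile

print(f"wa = {wa*1e6:.2f} um, wq = {wq*1e6:.2f} um, wz = {wz*1e6:.2f} um")
```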
3.3. The results of the waviness profile measurements on samples made of titanium alloy Ti-6Al-4V with heat treatment

Figure 10. Average values Wa, Wq and Wz of samples made of titanium alloy Ti-6Al-4V with heat treatment.

The profile measurements of the surface waviness parameters of the samples made of the heat-treated titanium alloy were processed into Fig. 10. The average value of Wq in the measured samples was 2.17 µm, and for Wz it was 6.01 µm. The minimum measured value of the waviness parameter Wa was 1.3 µm, for sample 4, and the maximum was 2.08 µm, for sample 27. No significant deviation between the individual points of measurement was found in any of the three examined waviness parameters. The values of the parameter Wa at the 5 points of measurement were compiled into Fig. 11.

Figure 11. Values of Wa at the individual points of measurement on all samples made of titanium alloy Ti-6Al-4V with heat treatment.

Figure 12. Average values of the waviness parameters of the individual sample sets of the tested materials.

4. Conclusions and discussion

The morphology of the sample surfaces is made up of a large number of craters formed by the erosion process [15, 16]. The profile parameters of the machined surface depend not only on the machine setting parameters [17], but also on the mechanical and physical properties of the machined material, which are a direct consequence of the microstructure of the studied materials [18]. The measured values of the surface waviness parameters are in accordance with the literature [19, 20, 21].

It is evident from Fig. 12 that the maximum values of Wa were measured on the samples made of Al 99.5. The high values of Wa unequivocally relate to the low melting temperature of the material, its low strength, and the relatively large grain size of the machined Al stock [22]. The surfaces of the samples made of the titanium alloy with and without heat treatment had almost the same average waviness value Wa, specifically 1.77 and 1.75 µm. The average value of the parameter Wz was highest in the set of samples made of Al 99.5, specifically 7.48 µm. The average value of Wz of the samples made of the titanium alloy without heat treatment was lower by 1 µm than that of the samples made of the same material with additional heat treatment (quenched and tempered). This is in accordance with the coarser microstructure of the titanium sample after heat treatment.

The above-mentioned experiments clearly show that the profile parameters of surface waviness depend significantly not only on the setting of the technological parameters during machining, but particularly on the chemical composition of the machined material and its mechanical properties after heat treatment. The effects of the machining parameters are the subject of further research.

Acknowledgements

This work is an output of the research and scientific activities of the NETME Centre, supported through the project NETME Centre Plus (LO1202) with financial means from the Ministry of Education, Youth and Sports under the "National Sustainability Programme I". This research work was supported by the BUT, Faculty of Mechanical Engineering, Brno, Specific research 2013, with the grant "Research of advanced technologies for competitive machinery", FSI-S-13-2138, ID 2138, and by the technical support of Intemac Solutions, Ltd., Kurim.

References
[1] Jain, Vijay Kumar: Advanced machining processes. Allied Publishers, 2009. ISBN 8177642944.
[2] Ghodsiyeh, Danial, Golshan, Abolfazl, Shirvanehdeh, Jamal Azimi: Review on current research trends in wire electrical discharge machining (WEDM). Indian Journal of Science and Technology, 2013, 6(2), pp. 4128–4140. doi:10.17485/ijst/2013/v6i2/30595
[3] Mahapatra, S. S., Patnaik, Amar: Optimization of wire electrical discharge machining (WEDM) process parameters using Taguchi method. The International Journal of Advanced Manufacturing Technology, 2007, pp. 911–925. doi:10.1007/s00170-006-0672-6
[4] Sarkar, S., et al.: An integrated approach to optimization of WEDM combining single-pass and multipass cutting operation. Materials and Manufacturing Processes, 2010, pp. 799–807. doi:10.1080/10426910903575848
[5] Matousek, Radomil, Bednar, Josef: Grammatical evolution: epsilon tube in symbolic regression task. In: Mendel 2009, Mendel Journal Series. Brno, BUT, pp. 9–15. ISBN 978-80-214-3884-2, ISSN 1803-3814.
[6] Matousek, Radomil, Bednar, Josef: Grammatical evolution and STE criterion: statistical properties of STE objective function. Lecture Notes in Electrical Engineering, vol. 68, pp. 131–142. doi:10.1007/978-90-481-9419-3_11
[7] Davim, J. Paulo: Surface integrity in machining. 1st ed. London: Springer, 2010, 215 pp. ISBN 978-1-84882-873-5.
[8] Niinomi, Mitsuo: Recent metallic materials for biomedical applications. Metallurgical and Materials Transactions A, 2002, pp. 477–486. doi:10.1007/s11661-002-0109-2
[9] Cao, X., Jahazi, M.: Effect of welding speed on butt joint quality of Ti–6Al–4V alloy welded using a high-power Nd:YAG laser. Optics and Lasers in Engineering, 2009, pp. 1231–1241. doi:10.1016/j.optlaseng.2009.05.010
[10] Alhazaa, A., Khan, T. I., Haq, I.: Transient liquid phase (TLP) bonding of Al7075 to Ti–6Al–4V alloy. Materials Characterization, 2010, pp. 312–317. doi:10.1016/j.matchar.2009.12.014
[11] Marsh, Elizabeth: A technological and market study on the future prospects for titanium to the year 2000. European Commission, 1996.
[12] Mouralova, Katerina: Moderní technologie drátového elektroerozivního řezání kovových slitin (Modern technology of wire electrical discharge machining of metal alloys). Doctoral thesis, VUT v Brně, FSI, ÚST. Brno: CERM, 2015, 98 pp. ISBN 80-214-2131-2.
[13] Montgomery, Douglas C.: Design and analysis of experiments. John Wiley & Sons, 2008. ISBN 978-0-470-12866-4.
[14] ISO 4287: Geometrical product specifications (GPS) – Surface texture: Profile method – Terms, definitions and surface texture parameters. 1997.
[15] Tosun, N., Pihtili, H.: The effect of cutting parameters on wire crater sizes in wire EDM. The International Journal of Advanced Manufacturing Technology, 2003, pp. 857–865. doi:10.1007/s00170-002-1404-1
[16] Han, Fuzhu, Jiang, Jun, Yu, Dingwen: Influence of machining parameters on surface roughness in finish cut of WEDM. The International Journal of Advanced Manufacturing Technology, 2007, pp. 538–546. doi:10.1007/s00170-006-0629-9
[17] Kumar, Anish, Kumar, Vinod, Kumar, Jatinder: Multi-response optimization of process parameters based on response surface methodology for pure titanium using WEDM process. The International Journal of Advanced Manufacturing Technology, 2007, pp. 538–546. doi:10.1007/s00170-013-4861-9
[18] Somashekhar, Kodalagara Puttanarasaiah, Ramachandran, Nottath, Mathew, Jose: Material removal characteristics of microslot (kerf) geometry in µ-WEDM on aluminum.
The International Journal of Advanced Manufacturing Technology, 2007, pp. 538–546. doi:10.1007/s00170-010-2645-z
[19] Spedding, T. A., Wang, Z. Q.: Study on modeling of wire EDM process. Journal of Materials Processing Technology, 1997, pp. 18–28. doi:10.1016/s0924-0136(96)00033-7
[20] Serwiński, Radoslaw: The wire electro discharge machine setting parameters analysis influencing the machined surface roughness and waviness. Prace Instytutu Lotnictwa, 2009, pp. 168–175.
[21] Padhi, P. C., Mahapatra, S. S., Yadav, S. N., Tripathy, D. K.: Performance characteristic prediction of WEDM process using response surface methodology and artificial neural network. Journal of Materials Processing Technology, 1997, pp. 18–28. doi:10.1504/ijise.2014.065616
[22] Hasçalik, Ahmet, Çaydaş, Ulaş: Electrical discharge machining of titanium alloy (Ti–6Al–4V). Applied Surface Science, 2007, pp. 9007–9016. doi:10.1016/j.apsusc.2007.05.031

Acta Polytechnica 56(5):360–366, 2016

Acta Polytechnica 57(6):385–390, 2017, doi:10.14311/ap.2017.57.0385
© Czech Technical University in Prague, 2017, available online at http://ojs.cvut.cz/ojs/index.php/ap

On the spectrum of the one-dimensional Schrödinger Hamiltonian perturbed by an attractive Gaussian potential

Silvestro Fassari (a, b, c, *), Manuel Gadella (a), Luis Miguel Nieto (a), Fabio Rinaldi (b, c)

(a) Departamento de Física Teórica, Atómica y Óptica, and IMUVA, Universidad de Valladolid, 47011 Valladolid, Spain
(b) CERFIM, PO Box 1132, Via F. Rusca 1, CH-6601 Locarno, Switzerland
(c) Dipartimento di Fisica Nucleare, Subnucleare e delle Radiazioni, Università degli Studi Guglielmo Marconi, Via Plinio 44, I-00193 Rome, Italy
(*) Corresponding author: silvestro.fassari@uva.es

Abstract. We propose a new approach to the problem of finding the eigenvalues (energy levels) in the discrete spectrum of the one-dimensional Hamiltonian $H = -\partial_x^2 - \lambda e^{-x^2/2}$, by using essentially the well-known Birman-Schwinger technique. However, in place of the Birman-Schwinger integral operator we consider an isospectral operator in momentum space, taking advantage of the unique feature of this Gaussian potential, that is to say its invariance under the Fourier transform. Given that such integral operators are trace class, it is possible to determine the energy levels in the discrete spectrum of the Hamiltonian as functions of λ with great accuracy by solving a finite number of transcendental equations.
We also address the important issue of the coupling constant thresholds of the Hamiltonian, that is to say the critical values of λ for which an additional bound state emerges out of the absolutely continuous spectrum.

Keywords: Schrödinger equation; Gaussian potential; Birman-Schwinger method; trace class operators; Fredholm determinant.

1. Introduction

Given its increasing relevance in the field of nanophysics, it is particularly interesting to investigate the Schrödinger Hamiltonian with an attractive Gaussian potential $V = -\lambda e^{-x^2/2}$, since the latter has the typical properties of short-range potentials, which imply the existence of bound states with negative energies below the absolutely continuous spectrum given by the semibounded interval $[0, +\infty)$, but also those of the harmonic oscillator near the bottom of the well. Hence, the original Schrödinger eigenvalue problem we are going to consider is

$$\Big(-\frac{d^2}{dx^2} - \lambda e^{-x^2/2}\Big)\psi = -\epsilon^2\psi, \qquad \epsilon > 0. \tag{1}$$

Although there have been quite a few papers on this model, the most rigorous of which is [1] by F. M. Fernández (see also [2, 3]), it seems that the implications of a remarkable feature of this potential, namely its invariance with respect to the Fourier transform, have been missed. Our method, whilst being essentially based on the (by now) classical Birman-Schwinger method (see, e.g., [4, 5]), considers, instead of the Birman-Schwinger integral operator associated with (1),

$$\lambda\, e^{-x^2/4}\Big(-\frac{d^2}{dx^2} + \epsilon^2\Big)^{-1} e^{-x^2/4}, \tag{2}$$

another integral operator which is isospectral to the Birman-Schwinger one given in (2) (see [6–8]):

$$\lambda B_\epsilon = \lambda\Big(-\frac{d^2}{dx^2} + \epsilon^2\Big)^{-1/2} e^{-x^2/2}\Big(-\frac{d^2}{dx^2} + \epsilon^2\Big)^{-1/2}. \tag{3}$$

Given that the function $e^{-x^2/2}$ is invariant under the Fourier transform, the unitary equivalent of the above integral operator, still denoted by $B_\epsilon$, is

$$B_\epsilon = \frac{1}{\sqrt{2\pi}}\,(p^2+\epsilon^2)^{-1/2}\, e^{-p^2/2} * (p^2+\epsilon^2)^{-1/2}, \tag{4}$$

where the star denotes the convolution. Therefore, the Schrödinger equation (1) can be reformulated in terms of the following integral equation:

$$\chi(p) = \frac{1}{\sqrt{2\pi}}\,(p^2+\epsilon^2)^{-1/2}\int_{-\infty}^{+\infty} e^{-(p-q)^2/2}\,(q^2+\epsilon^2)^{-1/2}\,\chi(q)\,dq, \tag{5}$$

with $\hat\psi(p) = (p^2+\epsilon^2)^{-1/2}\chi(p)$, in the sense that if ε is a value for which the above integral operator has an eigenvalue equal to one, then $-\epsilon^2$ is an eigenvalue of the original Schrödinger equation. As a consequence, understanding in depth the properties of the integral operator is crucial in order to get a detailed description of the discrete spectrum of the original Hamiltonian.

2. The integral operator $B_\epsilon$

We wish to open this section by stating and proving the key property of the integral operator $B_\epsilon$.

Theorem 2.1. The integral operator $B_\epsilon$ is trace class.

Proof. As a consequence of a well-known theorem (see [9]), given the positivity of the integral kernel and its smoothness, the trace is simply given by the integral of the kernel evaluated along the diagonal $p = q$, that is to say:

$$\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty}\frac{dp}{p^2+\epsilon^2} = \sqrt{\frac{\pi}{2}}\,\frac{1}{\epsilon}. \tag{6}$$

The fact that the trace diverges as $\epsilon \to 0^+$ guarantees that there is always at least one bound state, even for very small values of the coupling constant (shallow wells). We take this opportunity to remind the reader that the latter property is typical of one-dimensional quantum Hamiltonians, as shown in [5, 10].
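Both the reformulation (5) and the trace formula (6) lend themselves to a direct numerical check. A minimal sketch (the momentum cutoff and grid size are arbitrary illustration choices) that discretises the kernel of $B_\epsilon$ by the Nyström method, compares its trace with (6), and reads off the coupling at which a bound state of energy $-\epsilon^2$ exists from the largest eigenvalue:

```python
import numpy as np

def b_matrix(eps, pmax=12.0, n=1201):
    """Nystrom discretisation of the kernel of B_eps from (5)."""
    p = np.linspace(-pmax, pmax, n)
    h = p[1] - p[0]
    wgt = (p[:, None] ** 2 + eps ** 2) ** -0.5
    k = np.exp(-0.5 * (p[:, None] - p[None, :]) ** 2) / np.sqrt(2 * np.pi)
    return p, wgt * k * wgt.T * h        # symmetric matrix approximation

eps = 0.5
p, k = b_matrix(eps)
print(np.trace(k), np.sqrt(np.pi / 2) / eps)   # both ~2.5066, cf. (6)

mu = np.linalg.eigvalsh(k)[-1]                 # largest eigenvalue of B_eps
print("bound state at E = -eps^2 for lambda =", 1.0 / mu)
```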
We also notice that, by expanding the square in the exponent of the Gaussian in (5), the integral kernel $B_\epsilon(p,q)$ of the operator can be recast as

$$B_\epsilon(p,q) = \frac{1}{\sqrt{2\pi}}\,\frac{e^{-p^2/2}}{(p^2+\epsilon^2)^{1/2}}\, e^{pq}\, \frac{e^{-q^2/2}}{(q^2+\epsilon^2)^{1/2}}. \tag{7}$$

As an immediate consequence of (7), the operator $B_\epsilon$ can be written as a direct sum of two operators, one acting on the symmetric subspace (s), the other on the antisymmetric one (a), whose kernels are

$$B_\epsilon^s(p,q) = \frac{1}{\sqrt{2\pi}}\,\frac{e^{-p^2/2}}{(p^2+\epsilon^2)^{1/2}}\,\cosh(pq)\,\frac{e^{-q^2/2}}{(q^2+\epsilon^2)^{1/2}}, \tag{8}$$

$$B_\epsilon^a(p,q) = \frac{1}{\sqrt{2\pi}}\,\frac{e^{-p^2/2}}{(p^2+\epsilon^2)^{1/2}}\,\sinh(pq)\,\frac{e^{-q^2/2}}{(q^2+\epsilon^2)^{1/2}}. \tag{9}$$

By using the Taylor expansion of each hyperbolic function, the two integral operators can be written as infinite sums of rank one operators, namely

$$B_\epsilon^s = \frac{1}{\sqrt{2\pi}}\sum_{n=0}^{\infty}\frac{1}{(2n)!}\,\Big|\frac{p^{2n}e^{-p^2/2}}{(p^2+\epsilon^2)^{1/2}}\Big\rangle\Big\langle\frac{p^{2n}e^{-p^2/2}}{(p^2+\epsilon^2)^{1/2}}\Big|, \tag{10}$$

$$B_\epsilon^a = \frac{1}{\sqrt{2\pi}}\sum_{n=0}^{\infty}\frac{1}{(2n+1)!}\,\Big|\frac{p^{2n+1}e^{-p^2/2}}{(p^2+\epsilon^2)^{1/2}}\Big\rangle\Big\langle\frac{p^{2n+1}e^{-p^2/2}}{(p^2+\epsilon^2)^{1/2}}\Big|. \tag{11}$$

Obviously, as a consequence of (6), both operators are trace class, which implies that both series converge in the trace class norm. Furthermore, although neither operator has yet been diagonalised at this stage (because the rank one operators are not mutually orthogonal), their diagonalisation can be started from the first rank one operator in each expansion (n = 0), that is to say

$$B_\epsilon^{s,0} = \frac{1}{\sqrt{2\pi}}\,\Big|\frac{e^{-p^2/2}}{(p^2+\epsilon^2)^{1/2}}\Big\rangle\Big\langle\frac{e^{-p^2/2}}{(p^2+\epsilon^2)^{1/2}}\Big|, \tag{12}$$

$$B_\epsilon^{a,0} = \frac{1}{\sqrt{2\pi}}\,\Big|\frac{p\,e^{-p^2/2}}{(p^2+\epsilon^2)^{1/2}}\Big\rangle\Big\langle\frac{p\,e^{-p^2/2}}{(p^2+\epsilon^2)^{1/2}}\Big|. \tag{13}$$

Their norms can easily be computed, as the required improper integrals are well known:

$$\|B_\epsilon^{s,0}\| = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty}\frac{e^{-p^2}}{p^2+\epsilon^2}\,dp = \sqrt{\frac{\pi}{2\epsilon^2}}\,e^{\epsilon^2}\,\mathrm{erfc}(\epsilon), \tag{14}$$

$$\|B_\epsilon^{a,0}\| = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty}\frac{p^2\,e^{-p^2}}{p^2+\epsilon^2}\,dp = \frac{1}{\sqrt{2}}\Big(1 - \sqrt{\pi}\,\epsilon\,e^{\epsilon^2}\,\mathrm{erfc}(\epsilon)\Big). \tag{15}$$

As can be understood from (14), the divergence of the trace of the entire operator as $\epsilon\to 0^+$ is due only to the divergence of $B_\epsilon^{s,0}$. In fact, the trace class norm of the positive operator $B_\epsilon^{s,1} = B_\epsilon^s - B_\epsilon^{s,0}$ can be calculated as follows:

$$\|B_\epsilon^{s,1}\|_1 = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty}\frac{e^{-p^2}(\cosh p^2 - 1)}{p^2+\epsilon^2}\,dp. \tag{16}$$

By writing the hyperbolic cosine as a combination of two exponentials, the right hand side of (16) is easily seen to be equal to

$$\|B_\epsilon^{s,1}\|_1 = \sqrt{\frac{\pi}{2^3\epsilon^2}}\Big(1 + e^{2\epsilon^2}\,\mathrm{erfc}(2^{1/2}\epsilon) - 2\,e^{\epsilon^2}\,\mathrm{erfc}(\epsilon)\Big), \tag{17}$$

which, given its removable singularity at the origin, is almost immediately seen to converge to $\sqrt{2}-1$ as $\epsilon\to 0^+$.

Before determining the equation that will enable us to compute the ground state energy as a function of the coupling parameter λ, let us also consider the antisymmetric component $B_\epsilon^a$. As a result of the explicit expression of the integral kernel of $B_\epsilon^a$, it is not difficult to show that the latter operator converges weakly to $B_0^a$, where

$$B_0^a(p,q) = \frac{1}{\sqrt{2\pi}}\,\frac{e^{-p^2/2}}{|p|}\,\sinh(pq)\,\frac{e^{-q^2/2}}{|q|}. \tag{18}$$

The latter is a positive trace class operator with trace equal to

$$\|B_0^a\|_1 = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty}\frac{e^{-p^2}}{p^2}\,\sinh p^2\,dp = 1, \tag{19}$$

as follows easily by writing the hyperbolic sine as a combination of two exponentials and using integration by parts. It is worth pointing out that the latter quantity is nothing else but the limit, as $\epsilon\to 0^+$, of the trace class norm of $B_\epsilon^a$, since

$$\|B_\epsilon^a\|_1 = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty}\frac{e^{-p^2}\sinh p^2}{p^2+\epsilon^2}\,dp = \sqrt{\frac{\pi}{2^3\epsilon^2}}\Big(1 - e^{2\epsilon^2}\,\mathrm{erfc}(\sqrt{2}\,\epsilon)\Big), \tag{20}$$

which ensures the convergence in the norm topology of trace class operators, as a consequence of a well-known theorem on operators belonging not only to the trace class $T_1$ but to any ideal $T_p$ (see [11]).
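The closed forms (14), (15) and (20) are easy to confirm numerically; a short sketch using scipy quadrature (the value ε = 0.7 is an arbitrary test point):

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import erfc

eps = 0.7
c = 1.0 / np.sqrt(2 * np.pi)

# (14): norm of the first symmetric rank one operator
n_s0 = c * quad(lambda p: np.exp(-p**2) / (p**2 + eps**2), -np.inf, np.inf)[0]
print(n_s0, np.sqrt(np.pi / (2 * eps**2)) * np.exp(eps**2) * erfc(eps))

# (15): norm of the first antisymmetric rank one operator
n_a0 = c * quad(lambda p: p**2 * np.exp(-p**2) / (p**2 + eps**2), -np.inf, np.inf)[0]
print(n_a0, (1 - np.sqrt(np.pi) * eps * np.exp(eps**2) * erfc(eps)) / np.sqrt(2))

# (20): trace class norm of the antisymmetric component
t_a = c * quad(lambda p: np.exp(-p**2) * np.sinh(p**2) / (p**2 + eps**2),
               -np.inf, np.inf)[0]
print(t_a, np.sqrt(np.pi / (8 * eps**2)) * (1 - np.exp(2 * eps**2) * erfc(np.sqrt(2) * eps)))
```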
This result can be summarised in the following statement.

Theorem 2.2. The positive trace class operator $B_\epsilon^a$ converges, as $\epsilon\to 0^+$, to $B_0^a$, the positive trace class operator defined by its kernel in (18), in the norm topology of trace class operators.

It is crucial to point out that, whilst $B_\epsilon^{s,0}$, the first rank one summand in the expansion of the symmetric component of our integral operator, diverges as $\epsilon\to 0^+$, $B_\epsilon^{a,0}$ obviously converges to the following rank one operator:

$$B_0^{a,0} = \frac{1}{\sqrt{2\pi}}\,\Big|\frac{p\,e^{-p^2/2}}{|p|}\Big\rangle\Big\langle\frac{p\,e^{-p^2/2}}{|p|}\Big| = \frac{1}{\sqrt{2\pi}}\,\big|\mathrm{sgn}(p)\,e^{-p^2/2}\big\rangle\big\langle\mathrm{sgn}(p)\,e^{-p^2/2}\big|. \tag{21}$$

Although the antisymmetric function $\mathrm{sgn}(p)\,e^{-p^2/2}$ has a jump discontinuity at the origin, its square coincides with that of the unnormalised ground state eigenfunction of the harmonic oscillator in momentum space.

3. The two lowest eigenvalues of $-\partial_x^2 - \lambda e^{-x^2/2}$

3.1. The ground state

As pointed out earlier, the divergent behaviour of the first term (n = 0) of the expansion of the symmetric part (12) implies that, no matter how shallow the Gaussian well may be (small values of the coupling constant λ), there will always exist at least one bound state, the ground state, whose energy is $-\epsilon_0(\lambda)^2$, with $\epsilon_0(\lambda)$ given by the solution of a transcendental equation that can be derived from well-known facts regarding the Fredholm determinant of a trace class operator (see, e.g., [12]), namely

$$\det(1 - \lambda B_\epsilon^s) = 0. \tag{22}$$

By isolating the first, divergent rank one operator in the expansion of $\lambda B_\epsilon^s$ and taking advantage of the boundedness of $\lambda B_\epsilon^{s,1}$, the left hand side of (22) can be rewritten as

$$\det\big(1 - \lambda B_\epsilon^{s,0}(1-\lambda B_\epsilon^{s,1})^{-1}\big)\,\det(1-\lambda B_\epsilon^{s,1}). \tag{23}$$

As the first factor is the only one involving the divergent rank one operator $B_\epsilon^{s,0}$, the equation leading to the determination of the ground state energy reduces to

$$\det\big(1 - \lambda B_\epsilon^{s,0}(1-\lambda B_\epsilon^{s,1})^{-1}\big) = 1 - \mathrm{tr}\big(\lambda B_\epsilon^{s,0}(1-\lambda B_\epsilon^{s,1})^{-1}\big) = 0, \tag{24}$$

given that the second term inside the determinant is a rank one operator. By keeping only the terms up to λ in the expansion of the inverse inside the trace in (24), we get the following quadratic equation in λ:

$$\frac{g_0(\epsilon)}{2\pi}\,\lambda^2 + \frac{f_0(\epsilon)}{\sqrt{2\pi}}\,\lambda - 1 = 0, \tag{25}$$

with

$$f_0(\epsilon) = \int_{-\infty}^{+\infty}\frac{e^{-p^2}}{p^2+\epsilon^2}\,dp = \frac{\pi}{\epsilon}\,e^{\epsilon^2}\,\mathrm{erfc}(\epsilon), \tag{26}$$

$$g_0(\epsilon) = \int_{-\infty}^{+\infty}dp\int_{-\infty}^{+\infty}\frac{e^{-p^2}}{p^2+\epsilon^2}\,\big(\cosh(pq)-1\big)\,\frac{e^{-q^2}}{q^2+\epsilon^2}\,dq. \tag{27}$$

By using the standard Taylor expansion of the hyperbolic cosine, the latter double integral can be recast as the following convergent series, whose coefficients are expressed in terms of the gamma and the incomplete gamma functions:

$$g_0(\epsilon) = \sum_{n=1}^{\infty}\frac{1}{(2n)!}\Big(\int_{-\infty}^{+\infty}\frac{p^{2n}e^{-p^2}}{p^2+\epsilon^2}\,dp\Big)^2 = e^{2\epsilon^2}\sum_{n=1}^{\infty}\epsilon^{2(2n-1)}\,\frac{\big(\Gamma(n+1/2)\,\Gamma(-n+1/2,\epsilon^2)\big)^2}{\Gamma(2n+1)}. \tag{28}$$

Hence, the positive solution of (25) is given by the function from $[0,+\infty)$ to $[0,+\infty)$:

$$\lambda(\epsilon) = \frac{2\sqrt{2\pi}}{\big(f_0^2(\epsilon)+4g_0(\epsilon)\big)^{1/2}+f_0(\epsilon)}, \tag{29}$$

which can be inverted to get the required $\epsilon_0(\lambda)$, leading to the ground state energy $E_0(\lambda) = -\epsilon_0(\lambda)^2$, the plot of which is shown in Figure 1.

Figure 1. The ground state energy $E_0(\lambda) = -\epsilon_0(\lambda)^2$ as a function of the coupling parameter λ.
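A sketch of how the curve of Figure 1 can be reproduced: $f_0$ follows from (26), $g_0$ from a truncation of the series (28) with the inner integrals evaluated by quadrature (avoiding incomplete gamma functions of negative order), $\lambda(\epsilon)$ from (29), and $\epsilon_0(\lambda)$ by root finding. The truncation order and root bracket are arbitrary choices.

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq
from scipy.special import erfc, gammaln

def f0(eps):                                   # closed form (26)
    return np.pi / eps * np.exp(eps**2) * erfc(eps)

def g0(eps, nmax=20):                          # truncated series (28)
    total = 0.0
    for n in range(1, nmax + 1):
        inner = quad(lambda p: p**(2*n) * np.exp(-p**2) / (p**2 + eps**2),
                     -np.inf, np.inf)[0]
        total += np.exp(-gammaln(2*n + 1)) * inner**2   # 1/(2n)! * I_n^2
    return total

def lam(eps):                                  # positive root (29)
    f, g = f0(eps), g0(eps)
    return 2 * np.sqrt(2 * np.pi) / (np.sqrt(f**2 + 4*g) + f)

def ground_state_energy(coupling):
    eps0 = brentq(lambda e: lam(e) - coupling, 1e-6, 10.0)
    return -eps0**2

print(ground_state_energy(1.0))   # E0 at lambda = 1
```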
3.2. The first excited state

Let us now consider the antisymmetric component given by (9) or (11), so that we have to study the equation $\det(1-\lambda B_\epsilon^a) = 0$. As a result of the explicit expression of the integral kernel of $B_\epsilon^a$, it is not difficult to show that $\lambda B_\epsilon^a$ converges weakly to $\lambda B_0^a$, where the trace class operator $B_0^a$ is given by (18). As a consequence of Theorem 2.2, and in analogy to what has been done for the determination of the ground state energy, $E_1(\lambda) = -\epsilon_1(\lambda)^2$, the energy of the first antisymmetric bound state, can be determined by means of the following equation:

$$\det\big(1 - \lambda B_\epsilon^{a,0}(1-\lambda B_\epsilon^{a,1})^{-1}\big) = 0, \tag{30}$$

with $B_\epsilon^{a,1} = B_\epsilon^a - B_\epsilon^{a,0}$, which is obviously trace class. Taking account of the fact that $B_\epsilon^{a,0}$ is a rank one operator, the latter equation becomes

$$1 - \lambda\,\mathrm{tr}\big(B_\epsilon^{a,0}(1-\lambda B_\epsilon^{a,1})^{-1}\big) = 0. \tag{31}$$

By keeping only the terms up to λ in the expansion of the inverse, we get the equation

$$\frac{g_1(\epsilon)}{2\pi}\,\lambda^2 + \frac{f_1(\epsilon)}{\sqrt{2\pi}}\,\lambda - 1 = 0, \tag{32}$$

with

$$f_1(\epsilon) = \int_{-\infty}^{+\infty}\frac{p^2\,e^{-p^2}}{p^2+\epsilon^2}\,dp = \sqrt{\pi} - \pi\epsilon\,e^{\epsilon^2}\,\mathrm{erfc}(\epsilon), \tag{33}$$

$$g_1(\epsilon) = \int_{-\infty}^{+\infty}dp\,\frac{p\,e^{-p^2}}{p^2+\epsilon^2}\int_{-\infty}^{+\infty}dq\,\big(\sinh(pq)-pq\big)\,\frac{q\,e^{-q^2}}{q^2+\epsilon^2}. \tag{34}$$

By using now the standard Taylor expansion of the hyperbolic sine, the latter double integral can be recast as the following convergent series, whose coefficients are expressed in terms of the gamma and the incomplete gamma function:

$$g_1(\epsilon) = \sum_{n=1}^{\infty}\frac{1}{(2n+1)!}\Big(\int_{-\infty}^{+\infty}\frac{p^{2n+2}e^{-p^2}}{p^2+\epsilon^2}\,dp\Big)^2 = e^{2\epsilon^2}\sum_{n=1}^{\infty}\epsilon^{2(2n+1)}\,\frac{\big(\Gamma(n+3/2)\,\Gamma(-n-1/2,\epsilon^2)\big)^2}{\Gamma(2n+2)}. \tag{35}$$

Hence, the positive solution of (32) is given by

$$\lambda_1(\epsilon) = \frac{2\sqrt{2\pi}}{\big(f_1^2(\epsilon)+4g_1(\epsilon)\big)^{1/2}+f_1(\epsilon)}, \tag{36}$$

with domain $[0,+\infty)$ and codomain $[\lambda_1(0),+\infty)$, $\lambda_1(0)$ being

$$\lambda_1(0) = \frac{2\sqrt{6}}{\sqrt{4\pi-9}+\sqrt{3}} \approx 1.35311. \tag{37}$$

Hence, the latter function can be inverted to get $\epsilon_1(\lambda)$, as well as $E_1(\lambda) = -\epsilon_1(\lambda)^2$, whose plot is given in Figure 2.

Figure 2. The energy of the first excited state $E_1(\lambda) = -\epsilon_1(\lambda)^2$ as a function of the coupling parameter λ.

The plot of both eigenvalues $E_0$ and $E_1$ as functions of the coupling parameter λ is shown in Figure 3. The isolated points are those resulting from Table 1 in the aforementioned paper by Fernández [1].

Figure 3. The curves of both energies, $E_0$ (blue) and $E_1$ (yellow), as functions of the coupling parameter λ, and the points resulting from the table provided by Fernández in [1].

4. Conclusions

By combining a variation of the renowned Birman-Schwinger principle and the use of Fredholm determinants, given that all the integral operators involved are positive trace class operators, we have been able to determine the two lowest eigenenergies, that of the ground state and that of the lowest antisymmetric bound state, of the one-dimensional Hamiltonian $-\partial_x^2 - \lambda e^{-x^2/2}$ as functions of the coupling parameter. Whilst the ground state emerges out of the absolutely continuous spectrum at $\lambda = \lambda_0(0) = 0$, the first excited state emerges at $\lambda = \lambda_1(0)$, approximately equal to 1.35311. The method can be further exploited to determine other eigenvalues of the Hamiltonian. For example, by starting with the function

$$\hat\psi_2(p) = \frac{\big(p^2 - \epsilon(1-\epsilon)\sqrt{\pi}\,e^{\epsilon^2}\,\mathrm{erfc}(\epsilon)\big)\,e^{-p^2/2}}{(p^2+\epsilon^2)^{1/2}}, \tag{38}$$

orthogonal to the ground state eigenfunction $\hat\psi_0(p) = e^{-p^2/2}/(p^2+\epsilon^2)^{1/2}$ (by using the Gram-Schmidt procedure), and the Fredholm determinant $\det(1-\lambda B_\epsilon^{s,1})$, one can determine the energy of the first excited symmetric bound state as a function of the coupling parameter, emerging out of the absolutely continuous spectrum at the next coupling constant threshold. Work in this direction is in progress.
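The threshold value (37) admits a quick numerical cross-check: at ε = 0 one has $f_1(0) = \sqrt{\pi}$, and $g_1(0)$ can be summed from the series (35), so that (36) can be compared with the closed form. A minimal sketch:

```python
import numpy as np
from scipy.special import gamma

# closed form (37)
lam1_closed = 2 * np.sqrt(6) / (np.sqrt(4 * np.pi - 9) + np.sqrt(3))

# limiting value of (36) at eps = 0: the inner integral in (35)
# reduces to Gamma(n + 1/2), and Gamma(2n + 2) = (2n + 1)!
f1 = np.sqrt(np.pi)
g1 = sum(gamma(n + 1.5) ** 2 / gamma(2 * n + 2) for n in range(1, 30))
lam1_series = 2 * np.sqrt(2 * np.pi) / (np.sqrt(f1**2 + 4 * g1) + f1)

print(lam1_closed, lam1_series)   # both ~1.35311
```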
5. Acknowledgements

Fabio Rinaldi would like to thank Prof. M. Znojil for kindly inviting him to present the main points of this article in the final session of the conference "Analytic and Algebraic Methods in Physics", held in Prague, 11–14 September 2017. Partial financial support is acknowledged to the Spanish Junta de Castilla y León and FEDER (project VA057U16) and MINECO (project MTM2014-57129-C2-1-P). S. Fassari also wishes to thank the entire staff at Departamento de Física Teórica, Atómica y Óptica, Universidad de Valladolid, for their warm hospitality throughout his stay.

References
[1] Fernández, F. M.: Quantum Gaussian wells and barriers. American Journal of Physics 79(7), 2011, 752–754.
[2] Muchatibaya, G., Fassari, S., Rinaldi, F., Mushanyu, J.: A note on the discrete spectrum of Gaussian wells (I): the ground state energy in one dimension. Advances in Mathematical Physics, 2016.
[3] Nandi, S.: The quantum Gaussian well. American Journal of Physics 78, 2010, 1341–1345.
[4] Klaus, M.: Some applications of the Birman-Schwinger principle. Helvetica Physica Acta 55, 1982, 413–419.
[5] Klaus, M.: On the bound state of Schrödinger operators in one dimension. Annals of Physics 108, 1977, 288–300.
[6] Klaus, M.: A remark about weakly coupled one-dimensional Schrödinger operators. Helvetica Physica Acta 52, 1979, 223.
[7] Fassari, S.: An estimate regarding one-dimensional point interactions. Helvetica Physica Acta 68, 1995, 121–125.
[8] Fassari, S., Rinaldi, F.: On the spectrum of the Schrödinger Hamiltonian with a particular configuration of three point interactions. Reports on Mathematical Physics 64(3), 2009, 367–393.
[9] Reed, M., Simon, B.: Methods of Modern Mathematical Physics III: Scattering Theory. Academic Press, NY, 1979.
[10] Simon, B.: The bound state of weakly coupled Schrödinger operators in one and two dimensions. Annals of Physics 97, 1976, 279–288.
[11] Simon, B.: Trace Ideals and Their Applications. Cambridge University Press, Cambridge, 1979.
[12] Reed, M., Simon, B.: Methods of Modern Mathematical Physics IV: Analysis of Operators. Academic Press, NY, 1978.

Acta Polytechnica 45(1):48–55, 2005

3D information system of historical site – proposal and realisation of a functional prototype

J. Hodač

Abstract. The development of methods for 3D data acquisition, together with progress in information technologies, raises the question of creating and using 3D models and 3D information systems (IS) of historical sites and buildings. This paper presents the current state of the "Live Theatre" project. The theme of the project is the proposal and realisation of a 3D IS of the baroque theatre at Český Krumlov castle (UNESCO site). The project is divided into three main stages – creation of a 3D model, proposal of a conception for a 3D IS, and realisation of a functional prototype. 3D data was acquired by means of photogrammetric and surveying methods. An accurate 3D model (photo-realistic, textured) was built up with the MicroStation CAD system. The proposal of a conception of a 3D IS was the main outcome of the author's dissertation. The essential feature of the proposed conception is the creation of subsystems targeted on three spheres – management, research and presentation of the site. The functionality of each subsystem is connected with its related sphere; however, each subsystem uses the same database. The present stage of the project involves making a functional prototype (with sample data). During this stage we are working on several basic technological topics. At present we are concerned with 3D data, its formats, format conversions (e.g. DGN – VRML) and its connection to other types of data. After that, we will be seeking a convenient technical solution based on network technologies (internet) and an appropriate layout for the data (database). The project is being carried out in close co-operation with the administration of the castle and some other partners. This stage of the project will be completed in December 2005. A functional prototype and the information acquired by testing it will form the basis for the final proposal of a complex IS of a historical site. The final proposal and appropriate technology will be the outcome of the project. The realisation of a complex 3D IS will then follow. The results will be exploitable both for site management and for organisations working in the area of presenting historical sites and creating multimedia shows.

Keywords: cultural heritage, 3D information system, project, proposal, development, virtual reality, internet/web.

1 Introduction

Progress in information technologies together with the development of geodetic and photogrammetric documentation methods has led to new opportunities for creating and using 3D models of historical sites and buildings. This paper presents the current state of the "Live Theatre" project. The main goal of the project is to propose and implement a functional prototype of a 3D information system for a particular historical site. The theoretical foundations of the project were established in [1], which deals with the way from gathering 3D documentation to the creation of a 3D IS.

1.1 Background of the project

Information processing has been increasing in importance, and progress in information technologies and computer technologies has been accelerating. Information systems are coming into use in areas where they were not considered in the past. The cultural heritage is such an area in our country. Apart from the main content of information, localisation and time are becoming very important aspects today. There is a movement from IS working with 2D information (planar data) toward IS working with 3D information (3D models) and time, see e.g. [2]. Another movement is connected with the development of network technologies: information systems are moving from separated offices to the net and the internet.
Work with 3D data is closely connected with developments in computer graphics (virtual reality). The cultural heritage area is now also more important because of its relation to the tourist industry. Historical sites, together with areas of natural beauty, are the most popular targets both in the Czech Republic (CZ) and abroad. This is again connected with information: if a historical site is to attract potential visitors, it needs to be well presented, and presentation on the internet can be very effective. In this context we hear terms such as virtual tourism, virtual sightseeing and virtual monuments or sites, see e.g. [3]. The importance of 3D data in this area is growing.

The Laboratory of Photogrammetry at the Department of Mapping and Cartography has been cooperating with the administration of Český Krumlov castle and the Baroque Theatre Foundation on a project to create metrical documentation of the castle's baroque theatre. This co-operation has been running since 1996, see [4]. The theatre is primarily documented by students of geodesy and cartography working on diploma projects. The aim of the first phase of the project is to create new metrical documentation of the theatre in the form of a 3D model. In the second phase, the 3D model will form the core of a 3D information system.

1.2 Motivation and aim of the project

It is not a major technical problem today to acquire 3D data and to create a 3D model of an object, e.g., in the area of cultural heritage. The question I met with was: how to use this 3D model effectively? Currently, customers demand documentation predominantly in 2D form, and they do not want 3D data and models. Models are used for once-off
purposes, e.g., for presentation (visualization) or research of a site. Other applications for these models are uncertain. This is discouraging, because of the amount of work involved in creating the models. Utilisation of 3D data in a 3D IS is a promising prospect. In connection with IS of historical sites it is necessary to ask: what ideas do the owners or administrators of sites have? Do they need an IS, do they need 3D data? If yes, what data and for what purposes? The answers to these questions formed the basis for defining the aims of this project. The main aim of the project is: to find good ways from gathering 3D documentation to the creation of a 3D information system of a historical site.

2 Phases of the project

The project is divided into three main stages – creation of a 3D model, proposal of a conception for a 3D information system, and realisation of a functional prototype. This paper focuses on the second stage; a detailed description of this stage follows.

Discussions with administrators of sites, and also experience from professional meetings in the last decade (see, e.g., [3]), formed the basis for defining the basic areas for using 3D data. Fig. 1 describes these areas, the most important of which are: presentation of the site, research of the site, and management of the site. In all of these areas we use similar initial data; the methods for processing this data vary according to the user's requirements.

The second stage of the project was divided into three parts – analytical, synthetic and application. In the analytical part, attention was paid to two target areas: site management and site presentation. An analysis was made of the current state of the data used by site administrators. The types of operation using the data were also examined. The last part of this analysis dealt with the conditions for utilising information technologies in the analysed area. Another analysis focused on the current state of presentations of historical sites on the internet in the Czech Republic and abroad. Virtual tourism, virtual sites and the tourist industry were at the focus of this analysis. The results of these analyses formed the basis for the proposal of the conception of an IS of a historical site in the synthetic part of the project. This proposal works with three main application areas – site presentation, site research and site management. The application part of the project was also very important: a proposal was created for a conception of an information system for the baroque theatre in the castle at Český Krumlov (the "Live Theatre" project). Primary attention was given to designing data schemes for each part of the IS.
2.1 Analysis – site management

The data for this analysis was acquired with the use of a questionnaire, which included questions on the following topics: available plan documentation (geometrical data); available inventories and tallies (non-geometric data); technical and organisational background. The inquiry was conducted at 24 selected sites located in South Bohemia. The main condition for the selection of sites was accessibility to the public. Most of the selected sites were large (e.g. castles) and owned by the state.

The results of the questionnaire showed that the administrations of the sites tend to be rather small organisations, with not more than 5 employees. Some of them are able to use services provided by the founder (mainly sites owned by the state). The organisations are equipped with computers only at a basic level, and connectivity to the internet is limited. Utilisation of information technologies is rated positively by the staff. Unfortunately, the real situation in the administrative offices is not so good, and the use of these technologies is not widespread. The respondents considered cost to be the main barrier to wider use. The quality of the existing documentation is assessed by the respondents as sufficient for current utilisation. A combination of traditional paper documentation with digital documentation would be optimal according to most of the respondents.

2.2 Analysis – site presentation

The selected sites (in CZ) were the same as in the analysis above. The topics for investigation were: existence and content of the presentation, language of the presentation, use of multimedia, services, the place where the presentation is available, and graphical quality. Approximately 15 foreign presentations and projects were also analysed. The main focus was on the UK, because of the richness of its cultural heritage. Other sites and projects dealing with the virtual heritage and virtual tourism were included in the selection.

The results of the analysis of sites located in CZ showed that we cannot speak of virtual tourism today. Most presentations do not use interactive elements and geometrical data (plans, maps, and 3D models). Most are static, not living, presentations. Complex presentations consisting of voluminous sets of information are still only sporadic. Links to other types of information systems (town, region, national, tourist industry, etc.) are lacking.

Fig. 1: Fields of utilisation of 3D models
Fig. 2: Presentation of the cultural heritage – trends

The results of the analysis of how sites abroad are presented provided some interesting information about current trends, see Fig. 2. These trends all aim at the same goal: to help visitors understand the significance of the cultural heritage. Complex 3D models of real sites were not included in this selection. Interactivity was mainly used in connection with traditional forms of data (text, photos).

2.3 Proposal for a conception of a 3D IS of a historical site

The proposed conception of the IS is based on three areas of utilisation. These areas also determine the IS subsystems. A complex information system including 3D data should comprise subsystems dealing with site management, site research, and site presentation. The subsystems use the same data set for a variety of purposes. Fig. 3 shows the volume of data for each subsystem.
The results of the analytical part of the study formed a basis for defining the principal features of an information system of a historical site and its subsystems:

Subsystem – site management (S-SM)
- intended for site management, to support the decision-making process
- for use by site administrators
- the subsystem enables: updating and editing of data; making outputs with the use of various types of data (2D, 3D); simple spatial analysis, etc.

Subsystem – site research (S-SR)
- intended for studying the site and its resources
- for use by historians, conservationists, restorers and students
- the subsystem enables: work with other associated systems; basic analysis of the examined data, e.g., modelling; feedback – discussions, publications, etc.

Subsystem – site presentation (S-SP)
- intended for presentation of the site
- for use by potential visitors to the site
- the subsystem enables: work with virtual reality and interaction; work with other associated systems; a complete service for visitors, etc.

Fig. 4 shows the factors which were considered in the detailed specification of the subsystems.

User – future users determine the content and form of the IS. Users of an IS of a historical site may have various areas of interest, see above.
Data – the subsystems all have the same set of original data. The requirements of the subsystems differ, and therefore the volume and form of the accessible data also vary, see Fig. 3.
Functionality – the demands on the functionality of the whole IS are given by the previous two levels: user and data. Although the subsystems have the same set of basic functions, each subsystem also needs specialised functions.

2.3.1 Basic conditions for realisation

Apart from the content of the IS, the fundamental conditions for its successful realisation and use were also defined. These are: utilisation of network technologies; low installation cost; inexpensive hardware and software; simple handling; lucidity; updating of data.

A detailed proposal for the technical solution goes beyond the scope of this stage of the project. The concept of a distributed system is a possible approach: the data can be stored not on one central powerful server, but on several servers. This concept could be demonstrated, in the conditions of public cultural heritage care (in CZ), on the typical relations administration of the site – founder, or administration of the site – network (or fund) administrator. Fig. 5 illustrates this concept.

Fig. 3: Subsystems of an IS of a historical site
Fig. 4: Basic factors influencing the specification of the subsystems
Fig. 5: Distributed system

2.4 The "Live Theatre" project

An IS project was designed at the end of this stage. The main goal was to document and elaborate the results of the previous phases in detail. The "Live Theatre" project arose as a follow-on to previous work by students and teachers of the Department of Mapping and Cartography on the baroque theatre at Český Krumlov castle. Both geometric and non-geometric data were available. Subsystems for management, research and presentation of the site were elaborated in greater detail, in close connection with the general outcomes of the previous phases. Schemes of selected prototype data for each subsystem were created. The parameters of the technical solution were specified at the end of this phase.
The present state of the castle theatre in Český Krumlov is the result of a reconstruction of the castle's area in 1765–66. The theatre represents a baroque stage in its mature form. The original theatre fund is preserved both in actual objects, such as the building, auditorium, orchestra pit, stage, stage technology, machinery, decorations, costumes, props, lighting technology and fire extinguishers, and in rich archival documentation, such as librettos, scripts, texts, partituras, sheet music, inventories, accounts, iconographic material, and other information on theatre life in the 17th to 19th century. The Krumlov theatre was closed to the public from 1966 to 1997, and after a large part of the restoration had been completed, trial tours of the theatre began in September 1997.

2.4.1 Initial sources

Both geometric and non-geometric data were available.

Geometric data – this comprises a 3D model of the interior and exterior of the theatre. This model was created in stages. Photogrammetric and surveying methods of documentation were used, according to the type of area to be documented. The creation of the 3D model is described in greater detail in [4]. The CAD system MicroStation was used as the tool for creating the model. This 3D model is photo-realistic, and can be used for visualization and animation purposes. Fig. 6 shows a visualization of the upper stage part of the theatre.

Non-geometric data – the original wide-ranging theatre funds are gradually being restored, recorded and catalogued. The project to digitize the most important funds was completed in 2002. There is now a digital form of the "basic inventory of funds" (BIF), which contains the following items (funds): costumes and props, decorations, technical and lighting equipment. The BIF is based on card and photo documentation of each element. The fund of technical equipment and part of the fund of decorations were 3D metrically documented, see above. In these funds, 3D information is clearly relevant. Additional sources of non-geometric data are: photo documentation, panoramic videos (see [5]), videos (rehearsals), collections of sounds (scenic effects), historical plan documentation, etc.

For the purposes of the IS proposal, non-geometric data for two funds was chosen – the technical equipment fund and the decorations fund. The decorations fund consists of sets of 13 basic stage scenes. The technical equipment fund includes the complete original theatre machinery with sliding frames, winches, levers, pulleys, and a movable lighting rack. The floor of the stage is also original, including sliding and removable planks and trap doors.

2.4.2 Proposal for the IS

The IS project for the theatre has the working name "Live Theatre". The objective of the project is to create a living system, which can be used to gain an insight into the baroque theatre culture and, more widely, to recognise general interrelations in life in the baroque epoch.

Fig. 6: Visualization – upper stage, roping gear

The project is being implemented in close co-operation with the administration of Český Krumlov castle and the Baroque Theatre Foundation. The schemes of data for the two fundamental funds were designed for each subsystem. The following topics were further elaborated: user, functionality, data, form, and future development of the IS.
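The BIF card-and-photo documentation described above maps naturally onto a small relational layout: one record per element, with attached media of various kinds. A hypothetical sketch follows; all table and column names are invented for illustration and are not taken from the project documentation.

```python
import sqlite3

ddl = """
CREATE TABLE fund (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL               -- e.g. 'decorations', 'technical equipment'
);
CREATE TABLE element (
    id        INTEGER PRIMARY KEY,
    fund_id   INTEGER NOT NULL REFERENCES fund(id),
    bif_card  TEXT NOT NULL,         -- inventory card identifier
    name      TEXT NOT NULL,
    vrml_node TEXT                   -- node name inside the 3D model
);
CREATE TABLE media (
    id         INTEGER PRIMARY KEY,
    element_id INTEGER NOT NULL REFERENCES element(id),
    kind       TEXT CHECK (kind IN ('text','photo','video','sound','drawing')),
    uri        TEXT NOT NULL
);
"""

con = sqlite3.connect(":memory:")
con.executescript(ddl)
con.execute("INSERT INTO fund(name) VALUES ('technical equipment')")
con.execute("INSERT INTO element(fund_id, bif_card, name, vrml_node) "
            "VALUES (1, 'BIF-TE-001', 'winch', 'TE_winch_01')")
print(con.execute("SELECT name, vrml_node FROM element").fetchall())
```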
A basic segment of the proposed IS structure is an element. This element is connected with textual, graphical and other information. The existence of an element is given by the existence of a BIF card. Various ways to obtain information on an element are designed for the users. Geometric data forms the basis, see Fig. 7 and Appendix A. The user can choose a "traditional 2D approach", i.e., selecting a fund – an element on a 2D plan or from a list of elements. Another way is the "3D approach", i.e., selecting a fund – an element inside the 3D model. The direct route – a query to the database – is also available. All the above-mentioned approaches can be combined; for example, the user can switch between 2D and 3D data.

For geometric 3D data, functions enabling various operations with the 3D model are included in the IS proposal. The functions allow interactive work with the model (movement in virtual space, identification of elements); work with a detailed model of an element (browsing); visualization (making views); and viewing animations (walk-throughs, movement of elements, exchange of scenes, interrelations of elements). The baroque theatre is a stable system with a limited number of elements, so the 3D data will not need to be edited frequently.

Non-geometric data referring to an element consists of textual information (mainly BIF), photo documentation (BIF and others), video (rehearsals, panoramic views, etc.), sounds, and drawings (technical drawings of elements and their relations, etc.).

The main features of the subsystems were defined in chapter 2.3. The proposed subsystems have the following attributes: S-SM – collects a complete set of data from the selected funds; S-SR – attention is given to the elements, and especially to their interrelations and dependencies; S-SP – the main feature is interactive work with a 3D model, which enables the user to find his own way to get to know the site.

2.4.3 Technical solution

The technical solution will be based on the results mentioned in chapter 2.3.1. The requirements for the technical solution can be placed into three groups – operating, technological and data. The operating group concerns the requirements for low costs (hardware and software) and easy, intuitive operation. The technological group concerns the utilisation of network technology, open source software and a convenient database organisation. The data group concerns the utilisation of standard data formats, metadata, security of data, and multilingualism.

2.5 Functional prototype of the IS

The present stage of the project involves producing a functional prototype (with sample data, see 2.4.1). During this stage we are working step-by-step on the basic technological topics:
- 3D data: formats, conversion, interactivity,
- interconnection of geometric and non-geometric data,
- technical solution of the IS: network, database,
- functional prototype: realisation, testing.

Fig. 7: Scheme of data, subsystem – site research, technical equipment element

3D data: we are now concerned with 3D data, its formats, format conversions and interactive work with this type of data. The existing 3D model is in DGN format. This format is not very convenient for the purposes of the proposed IS. The existing formats of 3D data (VRML, VET, and X3D) were analysed, and the VRML format was selected. VRML is the standard format for 3D data on the internet, and it has been used successfully, see [2, 6]. It is sometimes very problematic to convert between the formats.
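Once an element has been converted to VRML, the element-to-record interconnection described in 2.4.2 can be expressed directly in the format itself: a VRML97 Anchor node attaches a URL to a piece of geometry, so that clicking the element in a VRML browser opens its non-geometric data. A minimal Python sketch that writes such a node; the server URL, element id and box geometry are hypothetical placeholders:

```python
VRML_TEMPLATE = """#VRML V2.0 utf8
Anchor {{
  url "{url}"
  description "{desc}"
  children [
    Shape {{
      appearance Appearance {{ material Material {{ }} }}
      geometry Box {{ size {sx} {sy} {sz} }}
    }}
  ]
}}
"""

def element_node(element_id, desc, size=(0.4, 0.4, 1.2)):
    # placeholder server address; a real deployment would point at the IS
    url = f"http://example.org/is/element?id={element_id}"
    sx, sy, sz = size
    return VRML_TEMPLATE.format(url=url, desc=desc, sx=sx, sy=sy, sz=sz)

with open("te_winch.wrl", "w") as f:
    f.write(element_node("BIF-TE-001", "winch - technical equipment"))
```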
The 3D model of the theatre is complex, comprising both the interior and the exterior of the building. Two approaches to the conversion are being tested: a standard export to DXF – optimization of the DXF (specialised software) – export to VRML; and an export interactively controlled by the user (directly DGN – VRML, or DGN – DXF – VRML). A special conversion program needs to be created for the second approach. The existing conversion results are documented in Fig. 8. The next step is to create a set of tools for interactive work with this type of model. This will enable, e.g., measurements of distances, work with layers, selection of elements, etc. Forthcoming steps will include searching for a suitable technological solution.

Interconnection of geometric and non-geometric data: elements of the VRML model will have a direct link to the additional data (see above).

Technical solution of the IS: future users will work with the IS via the internet. An appropriate layout of all types of data will play a major role in the usability of the IS. Storing 3D data directly in a database is a promising option that is now being studied, see e.g. [7].

Functional prototype – production, testing: at the end of this stage, a functional prototype of the IS will be built on the basis of the technologies verified in the previous steps, taking into account its functionality. This prototype will be tested by a sample of future users.

Outcomes of the project: the functional prototype and the information acquired by testing it will form the basis for the final proposal of a complex information system of a historical site. The final proposal and an appropriate technology will be the outcomes of the project. A complex 3D information system will then be created.

3 Conclusions

This project focuses on the cultural heritage area. The main aim is to create a 3D IS of a particular historical site. The results of the individual stages of the project can be summarised as follows:
- The analyses were made on a selected sample of historical objects. The first analysis dealt with the current state of the management of the sites, with special reference to the sources of data and their usage, and to the potential use of information technologies. An analysis was also made of the presentation of cultural heritage sites on the internet, using a 3D model and virtual reality.
- The proposal for a conception of a 3D IS was based on the information acquired in the analyses. The areas for using 3D data (and the IS) were defined, and the IS subsystems were drafted. These deal with site management, site research and site presentation. The subsystems all have the same set of original data. The basic technical requirement is network technology.
- The conclusions reached during the previous stages were used to create a conception of an IS for a specific site, the baroque theatre at Český Krumlov castle. An appropriate data set was selected and a scheme of data was created for each subsystem. We also paid attention to the functionality of the subsystems, taking into account work with a 3D model and 3D information.
- Producing a functional prototype of the IS is the current stage of the project. The focus is now on converting the 3D model format from DGN to VRML. Finally, the prototype will be tested. The results, together with the tested technology, will form the basis for creating the final proposal of a 3D IS of a historical site.
The project offers an overall, complex view of the topic, which is being studied comprehensively for the first time in CZ. We have tried to reflect current trends in 3D information technologies and their application in the area of cultural heritage, and to adapt them to the conditions of the Czech Republic.

4 Acknowledgments

This project is part of the research activity of the Department of Mapping and Cartography (supported by MSM:210000007). Presentation of the results of the project is supported by the Baroque Theatre Foundation in Český Krumlov.

References
[1] Hodač, J.: "Návrh koncepce prostorového informačního systému památkového objektu" (Proposal of a conception of a spatial information system of a historical site). Stavební obzor, vol. 13 (2004), no. 2, p. 45–50.
[2] Zlatanova, S., Rahman, A. A., Pilouk, M.: "Trends in 3D GIS development." Journal of Geospatial Engineering, vol. 4 (2002), no. 2, p. 1–10.
[3] Santana, M.: "A witness to enhanced realities in virtual heritage: potentials and limitations." Berkeley, USA, 2001, Virtual Systems and Multimedia Conference. www.virtualheritage.net/news/article.html
[4] Hodač, J.: "Documentation of the baroque theatre at Český Krumlov castle." In: The International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences. Potsdam, ISPRS, vol. XXXIV-5/C7, p. 121–125.
[5] OIS ČK: "Oficiální informační systém regionu Český Krumlov" (Official information system of the Český Krumlov region). www.ckrumlov.cz
[6] Schürle, T., Fritsch, D.: "CAFM data structures: a review and examples." In: IAPRS, vol. 33, Amsterdam, IAPRS, 2000, p. 124–128.
[7] Stoter, J., Zlatanova, S.: "Visualisation and editing of 3D objects organised in a DBMS." In: EuroSDR Com. V. Workshop, Enschede, EuroSDR, 2003, p. 32–48.

Ing. Jindřich Hodač, Ph.D., phone: +420 224 354 650, e-mail: hodac@fsv.cvut.cz, Department of Mapping and Cartography, Czech Technical University in Prague, Faculty of Civil Engineering, Thákurova 7, 166 29 Prague, Czech Republic

Appendix A: Scheme of data, subsystem – site research, technical equipment element

Nanoelectronic device structures at terahertz frequency

M. Horák

Abstract. Potential barriers of different types (rectangular, triangular, parabolic) with a dc bias and a small ac signal in the THz frequency band are investigated in this paper. The height of the potential barrier is modulated by the high-frequency signal. If electrons penetrate through the barrier, they can emit or absorb one or even more energy quanta $\hbar\omega$; thus the electron wave function behind the barrier is a superposition of different harmonics $\exp(-in\omega t)$. The time-dependent Schrödinger equation is solved to obtain the reflection and transmission amplitudes and the barrier transmittance corresponding to the harmonics. The electronic current density is calculated according to the Tsu-Esaki formula. If the harmonics of the electron current density are known, the complex admittance and other electrical parameters of the structure can be found.

Keywords: nanoelectronics, terahertz, potential barrier, transmittance, Schrödinger equation.

1 Introduction

The terahertz frequency band is usually considered nowadays to be the interval 300 GHz – 3 THz, which corresponds to the submillimeter wavelength range between 1 mm and 100 µm, or to photon energies within the range 1.2 – 12.4 meV. Despite great scientific interest, the terahertz frequency range remains one of the least tapped regions of the electromagnetic spectrum. Below 300 GHz we cross into the millimeter-wave bands; beyond 3 THz is more or less unclaimed territory, as the border between far-infrared and submillimeter radiation is still rather blurry.

Recent rapid progress in nanoelectronics and high-frequency technologies means that heterojunctions, superlattices, low-dimensional semiconductor structures, and quantum wells and barriers are today standard building blocks of modern electronic devices, which find their application in the field of microwave and submillimeter technology or in photonics. The existence of quantum wells and barriers results in quantum-based mechanisms of electron transport: thermionic emission across the barrier and tunnelling (thermionic-field emission) through the barrier. These effects should be treated by means of appropriate methods of quantum physics.
Although the frequency of 1 THz appears to be very high, this is only an appearance. The frequency of 1.8 GHz is at present in general use in mobile telephones. It is clear that 1.8 GHz cannot be equal to the transit frequency $f_T$ of the transistors in the integrated circuits of mobile telephones: the frequency 1.8 GHz should be even lower than the frequency $f_\beta$, which is defined by the 3 dB drop of the current gain. This means that the frequency $f_T$ should be of the order of (100 – 300) GHz. Even higher frequency bands are used in radar systems. The resonant tunneling diode (and similar structures with resonant tunneling) are at present typical devices for the terahertz frequency band [1]. Moreover, nearly the same theoretical approach that is used for investigating the terahertz frequency band can also be applied when the interaction of near-infrared radiation with photonic structures is studied. Quantum cascade lasers may provide terahertz bandwidth for communications [2].

A typical situation in electronics is that a dc bias together with a small ac signal is applied to the structure. The calculation of the potential barrier transmittance with a dc bias only is a classical and well-known problem of quantum mechanics [3]. The application of a high-frequency signal to the barrier has been studied only in the last decade. A general method of solution is described in [4, 5] and developed in [6, 7]. However, papers concerning high-frequency phenomena are usually devoted to resonant tunnelling diodes (RTD), and other types of potential barriers are rarely investigated, e.g. in [8] (the high-frequency potential step with zero steady-state potential). The aim of this paper is to present results achieved in a theoretical investigation of the high-frequency electron transport across the rectangular, triangular, trapezoidal or parabolic potential barriers that are most frequently used in various nanoelectronic or photonic structures.

2 Steady-state transmittance of a potential barrier

The term "steady-state" means that only a dc bias is applied to the potential barrier. Although we have mentioned above that the calculation of the steady-state barrier transmittance is a classical problem of quantum mechanics, we briefly summarise and generalise these results here. We will consider a rectangular, triangular, trapezoidal or parabolic potential barrier, see Fig. 1. We assume that there is a significant voltage drop only in the barrier region, i.e., outside the voltage barrier there is no electric field and the electrons can be described as free. The barrier height $u_{max}$, the barrier width $x_b$, and in fact the whole barrier profile, i.e., the potential energy $u(x)$, depend on the externally applied bias. The formulae for the potential energy $u(x)$ are given in Fig. 1.

Consider that the electrons are incident on the barrier from the left. In this case, in region A we have incident electrons described by the wave function $e^{ik_0x}$ ($k_0$ is the wave vector, related to the electron kinetic energy by $E = \hbar^2k_0^2/2m$) and electrons reflected from the barrier with wave function $r_0\,e^{-ik_0x}$ ($r_0$ is the reflection amplitude). The electrons inside the barrier region B are described by a specific wave function that depends on the profile (shape) of the potential barrier. In region C (behind the barrier) there are only transmitted electrons, with wave function $t_0\,e^{iq_0x}$ ($t_0$ is the transmission amplitude), the wave vector $q_0$ satisfying
$E + eV_b = \hbar^2q_0^2/(2m)$, where $V_b$ denotes the dc voltage drop across the barrier region. thus the electron wave functions in regions a, b, c are:

$\psi_a(x) = e^{ik_0x} + r_0e^{-ik_0x}$, $\psi_b(x) = a_0f(x) + b_0g(x)$, $\psi_c(x) = t_0e^{iq_0x}$.  (1)

let us turn our attention to the functions $f(x)$, $g(x)$ in eq. (1). these functions contain the information on the potential barrier. in general, the wave function $\psi_b(x)$ inside the barrier region is an eigenfunction of the corresponding hamiltonian, i.e. it is a solution of the stationary schrödinger equation $h_{dc}\psi_b = E\psi_b$, where

$h_{dc} = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + U(x)$.  (2)

for a rectangular barrier we obtain

$f(x) = e^{ip_0x}$, $g(x) = e^{-ip_0x}$, $p_0 = \frac{\sqrt{2m(E - U_{\max})}}{\hbar}$,  (3)

where $p_0$ is real for $E > U_{\max}$ (this corresponds to electron emission over the barrier) and imaginary, $p_0 = i\kappa_0$, for $E < U_{\max}$ (electron tunneling through the barrier). the results for the potential barriers in fig. 1 are summarized in table 1.

fig. 1: different types of potential barrier (rectangular, triangular, trapezoidal, parabolic); $E$ is the energy of the incident electron, $U(x)$ is the potential energy in the barrier region b

the electron wave functions for any type of barrier obey the standard boundary conditions at the interfaces $x = -x_b$ and $x = 0$ (to simplify the problem, an equal electron effective mass $m$ is considered throughout the structure):

$\psi_a(-x_b) = \psi_b(-x_b)$, $\psi_a'(-x_b) = \psi_b'(-x_b)$, $\psi_b(0) = \psi_c(0)$, $\psi_b'(0) = \psi_c'(0)$.  (4)

substituting the wave functions, we obtain a system of four linear equations for the unknown coefficients $r_0, t_0, a_0, b_0$ in (1). as the system is sufficiently simple, it can be solved analytically.
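written out explicitly, with (1) substituted into (4), the four equations for $r_0$, $a_0$, $b_0$, $t_0$ take the form (our rendering):

```latex
\begin{aligned}
e^{-ik_0x_b} + r_0\,e^{ik_0x_b} &= a_0 f(-x_b) + b_0 g(-x_b),\\
ik_0\,e^{-ik_0x_b} - ik_0\,r_0\,e^{ik_0x_b} &= a_0 f'(-x_b) + b_0 g'(-x_b),\\
a_0 f(0) + b_0 g(0) &= t_0,\\
a_0 f'(0) + b_0 g'(0) &= iq_0\,t_0 .
\end{aligned}
```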
table 1: electron wave functions in the barrier region

potential barrier (see fig. 1) | functions $f(x)$, $g(x)$ (see eq. (1)) | parameters
rectangular | $f(x) = e^{ip_0x}$, $g(x) = e^{-ip_0x}$ | $p_0 = \sqrt{2m(E-U_{\max})}/\hbar$
triangular | $f(x) = \mathrm{Ai}(\xi)$, $g(x) = \mathrm{Bi}(\xi)$ (airy functions) | $\xi = \beta\left(\frac{x}{x_b} + 1 - \frac{E}{U_{\max}}\right)$, $\beta = \left(\frac{2mU_{\max}x_b^2}{\hbar^2}\right)^{1/3}$
trapezoidal | $f(x) = \mathrm{Ai}(\xi)$, $g(x) = \mathrm{Bi}(\xi)$ (airy functions) | $\xi(x)$ linear in $x$, with scale $\beta = \left(\frac{2m\,eV_b\,x_b^2}{\hbar^2}\right)^{1/3}$ and shift $\alpha_b = \frac{U_{\max}-E}{eV_b}$
parabolic | $f(x) = U(\nu,\xi)$, $g(x) = V(\nu,\xi)$ (parabolic cylinder functions) | $\xi(x) = \left(\frac{8mU_{\max}}{\hbar^2x_b^2}\right)^{1/4}\left(x + \frac{x_b}{2}\right)$, $\nu = -\frac{Ex_b}{2\hbar}\sqrt{\frac{2m}{U_{\max}}}$

if the wave functions are known, the single-electron quantum mechanical current densities of the incident and transmitted electrons can be calculated according to the well-known formulae

$j_{inc} = \frac{e\hbar}{2im}\left(\psi_a^*\frac{\partial\psi_a}{\partial x} - \psi_a\frac{\partial\psi_a^*}{\partial x}\right)_{inc}$, $j_{trans} = \frac{e\hbar}{2im}\left(\psi_c^*\frac{\partial\psi_c}{\partial x} - \psi_c\frac{\partial\psi_c^*}{\partial x}\right)$.  (5)

the steady-state barrier transmittance $t_{dc}(E)$ is a function of the electron energy $E$, defined as the ratio $j_{trans}/j_{inc}$. we introduce the following short notation:

$a = f(-x_b)$, $b = g(-x_b)$, $a' = f'(-x_b)$, $b' = g'(-x_b)$, $c = f(0)$, $d = g(0)$, $c' = f'(0)$, $d' = g'(0)$.  (6)

the transmission amplitude $t_0$ defined in (1) is then given by

$t_0 = \frac{2ik_0\,e^{-ik_0x_b}\,(cd' - dc')}{(a' + ik_0a)(d' - iq_0d) - (b' + ik_0b)(c' - iq_0c)}$  (7)

and the transmittance reads

$t_{dc}(E) = \frac{q_0}{k_0}\,|t_0|^2$.  (8)

let us consider the n-al$_{1-x}$ga$_x$as / p$^+$-gaas abrupt heterojunction with the following parameters: aluminium mole fraction 0.35, donor concentration in the n-region $5\times10^{17}$ cm$^{-3}$, acceptor concentration in the p$^+$-region $1\times10^{19}$ cm$^{-3}$; the depletion layer extends into the n-region and its width is $x_n = 65$ nm for zero bias; the heterojunction built-in voltage is $v_{bi} = 1.8$ v. the electron effective mass is considered to be the same throughout the structure and equal to the effective mass of an electron in gaas, thus $m = 0.067\,m_{el}$. the energy $U_{\max}$ is related to the built-in voltage $v_{bi}$ and to the externally applied voltage $v_a$ as $U_{\max} = e(v_{bi} - v_a)$. the conduction band profile for various forward biases and the barrier transmittance are shown in fig. 2.

3 electron wave function in barrier region with high frequency modulation
we will now consider the case where the potential barrier is modulated by a high frequency signal $V_{ac}\cos(\omega t)$, where the angular frequency $\omega \approx (0.1-10)$ thz and the amplitude $V_{ac}$ is small and constant; such modulation is called homogeneous. the more general and more complicated case of non-homogeneous modulation $V_{ac}(x)\cos(\omega t)$ is not considered in this paper. the electron wave function $\psi_b(x,t)$ inside the barrier region is the solution of the time-dependent schrödinger equation with the hamiltonian $h_{dc} + h_{ac}$, where $h_{dc}$ according to (2) represents the barrier profile (including the dc bias) and $h_{ac}$ stands for the high frequency modulation:

$i\hbar\frac{\partial\psi_b}{\partial t} = (h_{dc} + h_{ac})\psi_b$, $h_{dc} = -\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2} + U(x)$, $h_{ac} = eV_{ac}\cos(\omega t)$.  (9)

it can be immediately verified that the wave function is

$\psi_b(x,t) = \exp\left(-\frac{i}{\hbar}Et\right)\exp\left(-\frac{ieV_{ac}}{\hbar\omega}\sin(\omega t)\right)\psi_b(x)$,  (10)

where $\psi_b(x)$ is the solution of the stationary schrödinger equation (2). we can see that the problem of describing electron wave functions in a uniform, sinusoidally oscillating potential (9) is identical to the problem of frequency modulation in telecommunications or in signal theory. the wave function (10) can be considered as a frequency modulated wave with carrier frequency $\omega_0 = E/\hbar$, see fig. 3.
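before expanding (10) into harmonics, it is worth cross-checking the steady-state results of section 2 numerically. the sketch below is our illustration, not code from the paper: it assembles the four matching equations for a rectangular barrier, solves for $t_0$, and evaluates (8). the barrier height, width and bias are arbitrary demonstration values, and the $eV_b$ shift of region c follows our reading of the $q_0$ relation above.

```python
import numpy as np

hbar = 1.054571817e-34   # [J s]
m_el = 9.1093837015e-31  # free electron mass [kg]
qe   = 1.602176634e-19   # elementary charge [C]

def tdc_rectangular(E, U_max, x_b, V_b=0.0, m=0.067 * m_el):
    """steady-state transmittance T_dc(E) of a rectangular barrier, eqs. (1)-(8);
    E and U_max in joules, x_b in metres, V_b the assumed dc drop in volts."""
    k0 = np.sqrt(2 * m * E) / hbar
    q0 = np.sqrt(2 * m * (E + qe * V_b)) / hbar
    p0 = np.sqrt(2 * m * (E - U_max + 0j)) / hbar   # imaginary when E < U_max
    f,  g  = (lambda x: np.exp(1j * p0 * x)), (lambda x: np.exp(-1j * p0 * x))
    df, dg = (lambda x: 1j * p0 * f(x)),      (lambda x: -1j * p0 * g(x))
    # linear system for (r0, a0, b0, t0) from the matching conditions (4)
    A = np.array([
        [np.exp(1j * k0 * x_b),            -f(-x_b),  -g(-x_b),  0],
        [-1j * k0 * np.exp(1j * k0 * x_b), -df(-x_b), -dg(-x_b), 0],
        [0,                                 f(0),      g(0),    -1],
        [0,                                 df(0),     dg(0),   -1j * q0],
    ], dtype=complex)
    rhs = np.array([-np.exp(-1j * k0 * x_b),
                    -1j * k0 * np.exp(-1j * k0 * x_b), 0, 0], dtype=complex)
    r0, a0, b0, t0 = np.linalg.solve(A, rhs)
    return float((q0 / k0) * abs(t0) ** 2)

# transmittance of a 0.3 eV, 5 nm barrier at a few incident energies
for E_eV in (0.1, 0.3, 0.5):
    print(E_eV, tdc_rectangular(E_eV * qe, 0.3 * qe, 5e-9))
```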
fig. 2: the parabolic potential barrier at the abrupt np$^+$-heterojunction for various forward biases $v_a$ (conduction band profiles $E_c(x)$ for $v_0 = 0$, $0 < v_0 < v_{bi}$, $v_0 = v_{bi}$; barrier width up to 70 nm, barrier height up to 1.4 ev) and the corresponding steady-state transmittance $t_{dc}(E)$ for external voltages $v_0$ = 0.8, 1.1, 1.3, 1.7 and 1.8 v $= v_{bi}$

we apply the bessel function expansion to the second term of (10):

$\exp\left(-\frac{ieV_{ac}}{\hbar\omega}\sin(\omega t)\right) = \sum_{p=-\infty}^{\infty} J_p\!\left(\frac{eV_{ac}}{\hbar\omega}\right)\exp(-ip\omega t)$.  (11)

this expansion enables us to consider the wave function (10) as a superposition of harmonics $\exp(-ip\omega t)$, $p = 0, \pm1, \pm2, \ldots$, see fig. 4. thus, passing the barrier region, the electron is able to absorb or emit one or more energy quanta $p\hbar\omega$. its energy can be $E \pm p\hbar\omega$, $p = 0, 1, 2, \ldots$; the $\pm$ sign corresponds to the absorption/emission of an energy quantum, and $p = 0$ means no emission or absorption. as the electron energy can be $E + n\hbar\omega$, $n = 0, \pm1, \pm2, \ldots$, the full electron wave function in regions a, b, c (see fig. 1) should be the superposition of waves corresponding to these values of energy:

$\psi_a(x,t) = \exp\left(-\frac{i}{\hbar}Et\right)\left[\exp(ik_0x) + \sum_{n=-\infty}^{\infty} r_n\exp(-in\omega t)\exp(-ik_nx)\right]$,

$\psi_b(x,t) = \exp\left(-\frac{i}{\hbar}Et\right)\sum_{n=-\infty}^{\infty}\exp(-in\omega t)\sum_{s=-\infty}^{\infty}\left[a_sf_s(x) + b_sg_s(x)\right]J_{n-s}\!\left(\frac{eV_{ac}}{\hbar\omega}\right)$,  (12)

$\psi_c(x,t) = \exp\left(-\frac{i}{\hbar}Et\right)\sum_{n=-\infty}^{\infty} t_n\exp(-in\omega t)\exp(iq_nx)$,

$E + n\hbar\omega = \frac{\hbar^2k_n^2}{2m}$, $E + n\hbar\omega + eV_b = \frac{\hbar^2q_n^2}{2m}$.

the function $\psi_a$ is the superposition of the incident wave and the reflected waves with the reflection amplitudes $r_n$; positive and negative values of $n$ correspond to the absorption and emission of energy quanta. the function $\psi_c$ is the superposition of transmitted waves with the transmission amplitudes $t_n$. the function $\psi_b$ describes the electron motion across the barrier region (both the emission and the tunnelling). the boundary conditions (4) should now be applied to the wave functions (12). evaluating these relations and equating the terms at the harmonics $\exp(-in\omega t)$, we obtain a system of linear equations for the unknown coefficients $a_n$, $b_n$, $r_n$, $t_n$, $n = 0, \pm1, \pm2, \ldots$. to calculate all these coefficients it would be necessary to solve an infinite set of linear algebraic equations. it is clear that the probability of emission or absorption of the energy $n\hbar\omega$ decreases with increasing $n$, so the system can be terminated at some finite value of the indices $n$, $s$ in (12). the series expansion in (11), which results in the double summation in (12), is well known in the theory of frequency modulated signals in telecommunications, and we can apply a result of signal theory: in the series expansion (11) it is sufficient to consider only the terms $n = 0, \ldots, \pm N$, where $N$ is approximately equal to $eV_{ac}/\hbar\omega$. if the high frequency signal is small, it is sufficient to consider only $N = 1$ or $N = 2$, i.e., the generation of the first or the second harmonics or, in other words, the absorption or emission of one energy quantum $\hbar\omega$ or two energy quanta $2\hbar\omega$. if the energy of the incident electron is $E$, the energy of the reflected or transmitted electron could be $E$ (unchanged, no absorption or emission), $E + \hbar\omega$ or $E + 2\hbar\omega$ (absorption of one or two quanta), or $E - \hbar\omega$ or $E - 2\hbar\omega$ (emission of one or two quanta).
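the truncation rule $N \approx eV_{ac}/\hbar\omega$ can be read off directly from the weights $J_p(eV_{ac}/\hbar\omega)$ in (11). a minimal illustration (ours, with an arbitrary small modulation index):

```python
from scipy.special import jv  # Bessel function of the first kind J_p

x = 0.5  # modulation index e*V_ac/(hbar*omega); arbitrary small-signal value
for p in range(5):
    print(f"|J_{p}({x})|^2 = {jv(p, x)**2:.2e}")
# the sideband weight falls off rapidly for p > x, which is why keeping
# N = 1 or 2 harmonics suffices for a small ac signal
```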
fig. 3: the real part of the electron wave function according to eq. (10) for a rectangular barrier (height 200 mev, width 20 nm), incident electron energy 50 mev, microwave signal frequency 1.2 thz
fig. 4: high-frequency modulation of a potential barrier; $E$ is the incident electron energy and $\hbar\omega$ stands for the high-frequency quantum: the incident electron wave gives rise to reflected waves $r_n$ and transmitted waves $t_n$ at the energies $E$, $E \pm \hbar\omega$, $E \pm 2\hbar\omega$, …

for $N = 1$ (or 2) we obtain from eq. (12) a system of 12 (or 20) linear equations for 12 (or 20) unknown coefficients; in general, $8N + 4$ linear equations for $8N + 4$ unknown coefficients. such a system can be solved analytically in principle, but in practice a numerical solution is used. for the purpose of illustrating the theory sketched above, it is useful to obtain some analytical results. we will consider the rectangular potential barrier in fig. 1. if the amplitude of the high frequency signal $V_{ac}$ is small and the absorption or emission of only one quantum $\hbar\omega$ is considered, the transmission amplitudes $t_{+1}$ (absorption, the electron energy in region c is $E + \hbar\omega$) and $t_{-1}$ (emission, the electron energy in region c is $E - \hbar\omega$) take the form

$t_{\pm1} = \frac{eV_{ac}}{2\hbar\omega}\,t_0\,\phi_{\pm1}(k_0, q_0, p_0, p_{\pm1}; x_b)$,  (13)

where the dimensionless factors $\phi_{\pm1}$ are built from $\cos(p_0x_b)$, $\sin(p_0x_b)$, $\cos(p_{\pm1}x_b)$ and $\sin(p_{\pm1}x_b)$, weighted by the wave vectors $k_0$, $q_0$, $p_0$ and $p_{\pm1}$. the transmission amplitude $t_0$ in (13) is given by the general formula (7); $k_0$, $q_0$, $p_0$ are the electron wave vectors defined in (1) and (3), and $p_{\pm1} = \sqrt{2m(E - U_{\max} \pm \hbar\omega)}/\hbar$. similarly as in (3), the quantities $p_0$, $p_{\pm1}$ are real for electron emission over the barrier, and imaginary, $p_0 = i\kappa_0$, $p_{\pm1} = i\kappa_{\pm1}$, for electron tunneling through the barrier. it can be seen in fig. 5 that the moduli $|t_{\pm1}|$ exhibit a strong resonant character at electron energies that correspond to the barrier height.

4 high frequency barrier transmittance
if the transmission and reflection amplitudes are known, the wave functions (12) can be substituted into the general formulae (5), and the single-electron quantum mechanical current densities of incident and transmitted electrons can be calculated. the high frequency barrier transmittance is defined as the ratio $j_{trans}/j_{inc}$ and can be found for each harmonic. if we adopt the approximation $k_n \approx k_0$, $q_n \approx q_0$ (as the electron energy is high compared with $\hbar\omega$) and restrict the calculation to the first harmonics, i.e. to the absorption or emission of one energy quantum, we obtain

$j_{inc} = \frac{e\hbar k_0}{m}$, $j_{trans} = \frac{e\hbar q_0}{m}\left[t_0t_0^* + (t_0^*t_{+1} + t_{-1}^*t_0)\,e^{-i\omega t} + (t_0t_{+1}^* + t_{-1}t_0^*)\,e^{i\omega t}\right]$.  (14)

we can see that $j_{trans}$ in (14) includes the dc component, proportional to $t_0t_0^*$ (it is related to those electrons that pass the barrier region without absorption or emission of energy), and the ac component $\propto e^{\pm i\omega t}$, related to electrons that emit or absorb one energy quantum in the barrier region. it is clear that the transmittance of the dc component is again given by (8), and it is not affected by the high frequency modulation.
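the numerical route mentioned above is easy to sketch. the following code is our construction, not the authors': it assumes the $J_{n-s}$ coupling structure of (12), a rectangular barrier, and the $eV_b$ shift of region c used earlier, then assembles and solves the truncated system of $8N + 4$ equations for the amplitudes $r_n$, $t_n$:

```python
import numpy as np
from scipy.special import jv

hbar, m_el, qe = 1.054571817e-34, 9.1093837015e-31, 1.602176634e-19

def sideband_amplitudes(E, U_max, x_b, hw, Vac, V_b=0.0, m=0.067*m_el, N=1):
    """truncated harmonic-matching system of section 3 for a rectangular
    barrier; returns {n: (r_n, t_n)}; hw = hbar*omega in joules."""
    eta = qe * Vac / hw                      # modulation index e*V_ac/(hbar*w)
    ns = range(-N, N + 1)
    k = {n: np.sqrt(2*m*(E + n*hw) + 0j)/hbar for n in ns}
    q = {n: np.sqrt(2*m*(E + n*hw + qe*V_b) + 0j)/hbar for n in ns}
    p = {s: np.sqrt(2*m*(E + s*hw - U_max) + 0j)/hbar for s in ns}
    M = 2*N + 1
    ir = {n: i for i, n in enumerate(ns)}               # r_n
    ia = {s: M + 2*i for i, s in enumerate(ns)}         # a_s
    ib = {s: M + 2*i + 1 for i, s in enumerate(ns)}     # b_s
    it = {n: 3*M + i for i, n in enumerate(ns)}         # t_n
    A, rhs = np.zeros((4*M, 4*M), complex), np.zeros(4*M, complex)
    for i, n in enumerate(ns):
        # value/derivative matching at x = -x_b (rows 4i, 4i+1) and x = 0
        A[4*i,   ir[n]] = np.exp(1j*k[n]*x_b)
        A[4*i+1, ir[n]] = -1j*k[n]*np.exp(1j*k[n]*x_b)
        A[4*i+2, it[n]] = -1.0
        A[4*i+3, it[n]] = -1j*q[n]
        for s in ns:
            J = jv(n - s, eta)
            fb, gb = np.exp(-1j*p[s]*x_b), np.exp(1j*p[s]*x_b)
            A[4*i,   ia[s]] -= J*fb;             A[4*i,   ib[s]] -= J*gb
            A[4*i+1, ia[s]] -= J*1j*p[s]*fb;     A[4*i+1, ib[s]] += J*1j*p[s]*gb
            A[4*i+2, ia[s]] += J;                A[4*i+2, ib[s]] += J
            A[4*i+3, ia[s]] += J*1j*p[s];        A[4*i+3, ib[s]] -= J*1j*p[s]
        if n == 0:   # the incident wave enters only the n = 0 equations
            rhs[4*i] = -np.exp(-1j*k[0]*x_b)
            rhs[4*i+1] = -1j*k[0]*np.exp(-1j*k[0]*x_b)
    sol = np.linalg.solve(A, rhs)
    return {n: (sol[ir[n]], sol[it[n]]) for n in ns}
```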
as usual in electronics, we use the goniometric functions $\sin(\omega t)$, $\cos(\omega t)$ in (14) instead of the complex functions $e^{\pm i\omega t}$, and denote $\varphi_1 = \arg(t_0^*t_{+1} + t_{-1}^*t_0)$; the ac component then reads

$\Delta j_{trans}^{(\omega)} = \frac{2e\hbar q_0}{m}\,|t_0^*t_{+1} + t_{-1}^*t_0|\left[\cos\varphi_1\cos(\omega t) + \sin\varphi_1\sin(\omega t)\right]$  (15)

and the corresponding transmittances are

$t_1^c = \frac{2q_0}{k_0}\,|t_0^*t_{+1} + t_{-1}^*t_0|\cos\varphi_1$, $t_1^s = \frac{2q_0}{k_0}\,|t_0^*t_{+1} + t_{-1}^*t_0|\sin\varphi_1$.  (16)

more generally, if the $N$-quantum approximation is considered, the transmitted single-electron quantum mechanical current density reads

$j_{trans} = j_{trans}^{(0)} + \sum_{n\geq1}j_{trans}^{(n)}$, $j_{trans}^{(n)} = \frac{e\hbar q_0}{m}\left(\tau_n\,e^{-in\omega t} + \tau_n^*\,e^{in\omega t}\right)$,  (17)

and the transmittance combinations in (17) (written $\tau_n$ here to distinguish them from the amplitudes $t_n$) for $N = 3$ are given by

$\tau_0 = \sum_{p=-3}^{3}|t_p|^2$, $\tau_n = \sum_{p=-3}^{3-n}t_p^*\,t_{p+n}$, $n = 1, 2, 3$.  (18)

fig. 5: transmission amplitudes $t_{-1}$, $t_{+1}$ according to eq. (13) for a rectangular barrier: moduli $|t_{-1}|$, $|t_{+1}|$ and phases $\arg(t_{-1})$, $\arg(t_{+1})$ as functions of the ratio of the electron energy to the barrier height

5 electrical parameters of quantum structure
if the high frequency transmittances are known, the electric current density $j^{(n\omega)}$ for each harmonic can be calculated by means of the well-known tsu-esaki formula [1, 4]. we denote by $F(\varepsilon)$ the fermi-dirac function integrated over the parallel-to-interface wave vector components,

$F(\varepsilon) = \ln\frac{1 + \exp\left(\frac{E_f - E_c}{k_BT} - \varepsilon\right)}{1 + \exp\left(\frac{E_f - E_c}{k_BT} - \varepsilon - \frac{eV_{dc}}{k_BT}\right)}$,  (19)

with the dimensionless energy $\varepsilon = E/k_BT$. the high frequency electric current harmonics can then be written in the following way:

$j^{(n\omega)} = a_n\cos(n\omega t) + b_n\sin(n\omega t)$,
$a_n = \frac{em(k_BT)^2}{2\pi^2\hbar^3}\int_0^\infty \frac{q_0}{k_0}\,|\tau_n(\varepsilon)|\cos\varphi_n(\varepsilon)\,F(\varepsilon)\,d\varepsilon$,
$b_n = \frac{em(k_BT)^2}{2\pi^2\hbar^3}\int_0^\infty \frac{q_0}{k_0}\,|\tau_n(\varepsilon)|\sin\varphi_n(\varepsilon)\,F(\varepsilon)\,d\varepsilon$,  (20)

with $\varphi_n(\varepsilon) = \arg\tau_n(\varepsilon)$. observe that the origin of the higher order harmonics is related to the quantum character of electron transport in the barrier region rather than to the nonlinearity of the current-voltage or capacitance-voltage characteristics; thus, their existence is an intrinsic property of the quantum structure. as our aim was to obtain the electrical parameters of the quantum structures, the relations (20) represent in fact the final result of the calculation. using these formulae it is possible to find, e.g., the modulus of the higher order current harmonics and their phase shift with respect to the modulating signal (9), or the complex admittance and its real and imaginary parts. all these quantities can be investigated as functions of the potential barrier profile (included in the barrier transmittance $\tau_n(\varepsilon)$), the dc bias (included in $\tau_n(\varepsilon)$ and in $F(\varepsilon)$), or the angular frequency of the high frequency modulating signal (included again in $\tau_n(\varepsilon)$). the real and imaginary parts of the complex admittance of the rectangular barrier for the first three harmonics, as functions of frequency, are shown in fig. 6.
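to make (19)–(20) concrete, the following sketch (ours) evaluates the dc ($n = 0$) harmonic of the tsu-esaki current by simple quadrature; the band offset and bias are placeholder values, and any transmittance function, e.g. the rectangular-barrier one sketched in section 2, can be passed in:

```python
import numpy as np

hbar, kB = 1.054571817e-34, 1.380649e-23
m_el, qe = 9.1093837015e-31, 1.602176634e-19

def tsu_esaki_dc(Tdc, T=300.0, EfEc_eV=0.05, Vdc=0.1, m=0.067*m_el,
                 eps_max=40.0, n=4000):
    """dc Tsu-Esaki current density [A/m^2], eqs. (19)-(20) with n = 0.
    Tdc(E): transmittance vs energy in joules; EfEc_eV = (E_f - E_c) in eV."""
    kT = kB * T
    eps = np.linspace(1e-6, eps_max, n)            # dimensionless energy E/kT
    F = np.log((1 + np.exp(qe*EfEc_eV/kT - eps)) /
               (1 + np.exp(qe*EfEc_eV/kT - eps - qe*Vdc/kT)))   # eq. (19)
    pref = qe * m * kT**2 / (2 * np.pi**2 * hbar**3)
    Tvals = np.array([Tdc(e * kT) for e in eps])
    return pref * np.trapz(Tvals * F, eps)

# usage with the rectangular-barrier transmittance sketched earlier:
# J0 = tsu_esaki_dc(lambda E: tdc_rectangular(E, 0.3*qe, 5e-9))
```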
the slope of the imaginary part of the admittance, $\mathrm{im}\,y(\omega) = \omega c$, implies that the capacitance is frequency independent.

fig. 6: the real and imaginary parts of the complex admittance $y/y_{norm}$ as functions of the modulation signal angular frequency $\omega$ (0.1–10 thz) for a rectangular potential barrier of height 300 mev and width 16 nm, shown for the first three harmonics. the admittance is normalized by the quantity $y_{norm} = e^2mk_BT/(2\pi^2\hbar^3) = 2.72\times10^{11}\ \Omega^{-1}\mathrm{m}^{-2}$

6 conclusions
the theory related to the transmittance of different types of potential barriers with a dc bias and a small high frequency ac signal in the terahertz frequency band was presented in this paper. we have followed the way from the hamiltonian and the time dependent schrödinger equation to the electric current densities and the complex admittance that can be measured in experiments. at such high frequencies the following effects can play an important role: the electron inside the barrier region can emit or absorb one or even more energy quanta $\hbar\omega$, where $\omega$ is the signal angular frequency, and the electron wave function outside the barrier, and consequently the electric current, is a superposition of different harmonics $\exp(-in\omega t)$. as we know from classical electronics, the generation of higher-order harmonics is due to the non-linearity of the current-voltage or capacitance-voltage characteristics, and it occurs only if the amplitude of the signal is sufficiently large. the origin of the higher-order harmonics at potential barriers is different: it is caused by the emission or absorption of one or more energy quanta and occurs even for a small signal; thus their generation is an intrinsic property of the single-barrier structure. the high frequency quantum effect at potential barriers represents an additional conductivity channel and contributes a small parallel admittance to the electronic parameters of the structure.

7 acknowledgments
this research has been supported by the czech ministry of education in the framework of research plan msm 262200022 mikrosyt microelectronic systems and technologies.

references
[1] roblin, p., rohdin, h.: high-speed heterostructure devices: from device concepts to circuit modeling. cambridge university press, 2002.
[2] shore, k. a.: "qc lasers may provide thz bandwidth for communications". laser focus world, june 2002, p. 85–91.
[3] bransden, b. h., joachain, c. j.: introduction to quantum mechanics. addison-wesley longman ltd., 1989.
[4] coon, d. d., liu, h. c.: "time-dependent quantum-well and finite-superlattice tunnelling". journal appl. phys., vol. 58 (1985), p. 2230–2235.
[5] liu, h. c.: "analytical model of high-frequency resonant tunnelling: the first order ac current response". phys. rev. b, vol. 43 (1991), p. 12538–12548.
[6] truscott, w. s.: "wave functions in the presence of a time-dependent field: exact solutions and their application to tunnelling". phys. rev. lett., vol. 70 (1993), p. 1900–1903.
[7] fernando, ch. l., frensley, w. r.: "intrinsic high-frequency characteristics of tunneling heterostructure devices". phys. rev. b, vol. 52 (1995), p. 5092–5103.
[8] tkachenko, o. a., tkachenko, v. a., baksheyev, d. g.: "multiple-quantum resonant reflection of ballistic electrons from a high-frequency potential step". phys. rev. b, vol. 53 (1996), p. 4672–4675.
rndr. michal horák, csc.
e-mail: horakm@feec.vutbr.cz
department of microelectronics
brno university of technology, faculty of electrical engineering and communication
údolní 53, 602 00 brno, czech republic

analysis of unsteady transonic flow fields by means of the colour streak schlieren method
j. ulrych
this article deals with a new approach to the investigation of unsteady transonic flow fields around aerodynamic models and in blade cascades using a schlieren method of flow visualisation. the principle and the application of the colour streak schlieren method (cssm) are defined.
the characteristic flow field features were observed and analysed around an oscillating naca 0012 airfoil under the conditions of a transonic free stream mach number ($M_\infty = 0.9$), an initial angle of attack ($\alpha = +4$ deg), one amplitude of oscillation ($\Delta\alpha = \pm3$ deg), and three frequencies of model oscillation ($f$ = 1, 15, 30 hz). the terminal shock wave hysteresis across the investigated area, which was revealed in particular cases, is described. application possibilities of cssm and its further development are discussed.
keywords: high-speed aerodynamics, experiment, flow field analysis, transonic flow fields, flow visualisation, colour streak schlieren method.

notation
c [mm] chord
d [mm] distance between streaks
f [hz] frequency of oscillation
$M_\infty$ [1] free stream mach number
t [mm] streak thickness
$\alpha$ [deg] angle of attack
$\Delta\alpha$ [deg] amplitude of oscillation
$\delta$ [deg] streak deflection
$\rho$ [kg/m³] density
seq. video sequence

1 introduction
optical methods of flow visualisation are firmly connected with experimental research techniques in the field of high-speed aerodynamics. all the optical methods utilize the change of density $\rho$ of the fluid passing the test section of a wind tunnel. there are three basic methods – shadowgraph, schlieren, and interferometric – which provide images of the flow field around bodies in test sections. in recent decades many modifications of these methods have been developed in order to enhance the quality of aerodynamic research. although a great effort has been made to obtain quantitative information from optical records based on the shadowgraph and schlieren methods, only the interferometric method remains an effective way to acquire such data. nevertheless, qualitative information from shadowgraph and schlieren pictures gives a helpful insight into real transonic flow fields. moreover, video records enable us to observe the changes in the investigated area under unsteady conditions, e.g. in the case of oscillating models, a changing free stream mach number, angle of attack or frequency of oscillation, etc. the aim of this paper is to describe a new method of flow field analysis which can reveal some hardly observable phenomena in unsteady transonic flows, and can obtain some quantitative data from schlieren video sequences.

2 experimental devices and aerodynamic model
the schlieren video sequences that are analysed in this article were taken in the test section of a continuous transonic wind tunnel for single airfoils. the dimensions of the test section were 0.14 m × 0.32 m. all the records were obtained by means of a portable schlieren device, which is schematically depicted in fig. 1. a halogen lamp was used as the source of light. the beam of rays was deflected by mirrors and then passed through a lens and an optical window into the test section. on the opposite wall of the wind tunnel the light was reflected from the aluminised surface of the optical window and passed through the test section and lens once more. after passing through the colour filter the condensed rays were recorded by a high-speed camera. the camera operated at frequencies up to 2 khz.

fig. 1: optical arrangement for the schlieren method – oscillating profile configuration (with vibratory device)

a hydraulic crankshaft vibratory device was placed on the outer side of the tunnel behind the aluminised window. the apparatus was used to generate harmonic oscillation of the model within the range of frequencies $f$ = 1–40 hz. for the transonic flow field investigation around a body in the test section, the naca 0012 airfoil was employed. the length of the airfoil chord $c$ is 10 cm.

3 optical measurement results
the optical measurement results are represented by three movie sequences (seq. 1–3), which give an idea of the flow field around a single naca 0012 airfoil oscillating under transonic conditions [1], [2]. the initial angle of attack was set to $\alpha = +4$ deg, the amplitude of oscillation to $\Delta\alpha = \pm3$ deg, and the mach number of the incoming flow to $M_\infty = 0.9$. the frequency was set to $f$ = 1, 15, and 30 hz. all the records were taken at a frequency of 1 khz. the first sequence introduces the situation characterised by the initial angle of attack $\alpha = +4$ deg and the frequency $f = 1$ hz. we can observe a red region of flow deceleration in the vicinity of the leading edge, a green region of acceleration, the compression region on the upper side of the profile, the interaction between the terminal shock wave and the boundary layer, and the separation point. downstream of the separation point, the shear layer and wake are clearly visible. the response of the flow field features to the unsteady boundary value conditions is represented by the oscillations of the separation point, shock wave, and shear layer. the boundaries of the deceleration and acceleration areas close to the leading edge also vary according to the current angle of attack. at the bottom dead centre ($\alpha = 1$ deg) the flow around the airfoil is nearly symmetric. the terminal shock wave on the upper side is situated at approximately 75 % of the chord length and is almost perpendicular to the chord. the separation point is at the place of interaction between the shock wave and the boundary layer. the situation on the lower side is similar, only the terminal shock wave is slightly nearer to the trailing edge. as the angle of attack increases, the compression area on the upper side becomes more intensive and a λ-shock is formed. close to the top dead centre the shock wave becomes less intensive and oscillates with a significantly higher frequency. an increase in the oscillation frequency ($f$ = 15, 30 hz) results again in the development of a λ-shock on the upper side (fig. 2b), and the flow field features become more distinctive. close to the top dead centre of the profile oscillation the shock wave starts to divide and again oscillates with a significantly higher frequency. figs. 2a – c demonstrate the characteristic positions of the aerofoil in the case of seq. 3 at $f = 30$ hz.
fig. 2: a) bottom dead centre, b) increasing $\alpha$, c) top dead centre

4 optical measurement analysis – colour streak schlieren method
the colour streak schlieren method (cssm) has been developed in order to examine the response of the flow around an aerodynamic model to unsteady boundary value conditions [1], [2]. the principle of the method is to observe the change of the position of flow field phenomena in time according to the change of the free stream conditions or the model oscillation. at the very beginning, the area of interest (streak) in the investigated field has to be defined. the dimensions, position and deflection of the streak must be determined with respect to the intelligibility of the final picture.

fig. 3: principle of the colour streak schlieren method

the deflection $\delta$ is
2 the red area on the left relates to the deceleration of the flow upstream the leading edge, the green area represents the flow acceleration downstream the stagnation point, the compression region is grey, and the brown line defines the terminal shock wave. the decreasing intensity of the shock close to the top dead centre of airfoil oscillation is clearly visible – the brown line becomes lighter. (the technical reasons do not allow to print in colours.) a comparison of the flows characterised by different oscillation frequencies is made in figs. 5a – c. the pictures concentrate on one cycle; the streak related to the top dead centre ( � � 7 deg) is first from the top, and the bottom dead centre © czech technical university publishing house http://ctn.cvut.cz/ap/ 5 acta polytechnica vol. 44 no. 3/2004 fig. 4: cssm – the final flow field image a) b) c) fig. 5: a) m � � 0.9, � � 4 deg, �� � �3 deg, f � 1 hz, y � 17 mm, t � 2.2 mm, b) m � � 0.9, � � 4 deg, �� � �3 deg, f � 15 hz, y � 17 mm, t � 2.2 mm, c) m � � 0.9, � � 4 deg, �� � �3 deg, f � 30 hz, y � 17 mm, t � 2.2 mm corresponds to the streak in the middle of the picture. all the characteristics are the same as in the case shown in fig. 4 except the frequencies, with values are 1, 15, and 30 hz, respectively. in the case of f � 1 hz it is evident that the oscillations of the flow field features are harmonic and keep the same frequency. as the oscillation frequency increases the shock wave oscillation loses its harmonic character and hysteresis occurs in the vicinity of the middle angle of attack (� � 4 deg). the comparison of the shock wave position in time and the harmonic oscillation of the model shows no phase shift. shock wave splitting can be observed near to the top dead centres and the subsequent increase in the oscillation frequency of individual branches. a better analysis of the shock wave hysteresis was done for the oscillation frequency f � 30 hz, where the effect is most noticeable. the process is depicted in fig. 6. the flow field was examined in six streak positions. streak no. 1 is situated approximately 8 mm above the airfoil centre of rotation, and the next streaks are placed at a distance 17, 26, 35, 44, and 53 mm. it is obvious from the picture introducing one cycle that the hysteresis has its maximum value around � � 3.5 deg. close to the upper side of the airfoil the difference between the shock wave positions during the increase and decrease of � is almost 1 cm (approximately 10 % of a chord length) and going further away from the surface this difference is even greater. this effect can be visualised using a combination of two photographs in fig. 7. the first image is related to the decreasing angle of attack in the position � � 3.5 deg (the first shock wave from the left), while the latter is related to the increasing angle of attack in the same position. the difference between the shock wave locations is clearly visible. due to the velocity distribution around the airfoil the effective angle of attack seems to be bigger than � � 3.5 deg when the leading edge is moving downward. subsequently the separation point shifts upstream (two red points in fig. 7) and so does the terminal shock wave. during the upward move of the leading edge the situation is converse. following the black lines along the shock waves from the airfoil it can be seen that the distance between shocks becomes smaller close to the surface. then the difference increases again. 
the actual shape of a sonic region around the airfoil causes this curvature. fig. 8 shows a cssm picture of 5 cycles recorded in the same six positions as in fig. 6. the terminal shock oscillation becomes less harmonic as the distance from the airfoil increases. at a distance of approximately one half of the chord length from the airfoil centre of rotation it is practically impossible to distinguish the streaks related to the dead centres of model oscillation. 6 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 44 no. 3/2004 fig. 6: terminal shock wave hysteresis 6 conclusions the colour streak schlieren method represents a new approach to flow field analysis based on optical records. it can be used as an independent method for evaluation of experimental and cfd research in the field of high-speed aerodynamics. basically it utilizes video sequences taken by means of the schlieren method. applications to shadowgraph and interferometric methods are possible, but the results are not of the same quality. cssm has been developed for the investigation of unsteady high subsonic, transonic, and supersonic flow fields around the bodies or in blade channels in test sections of wind tunnels. the method enables hardly observable phenomena to be revealed, and their time dependence to be examined. some quantitative information related to the position of flow field features can be obtained. the possibilities of cssm were demonstrated in an nalysis of the flow around an oscillating airfoil under transonic conditions. the shock wave hysteresis was disclosed, described, and quantified. the results meet the results reported in [3]. further development of the method will focus on the application of fourier analysis to the final cssm pictures. this improvement could be of significant importance for the investigation of phenomena like oscillations of shock waves with higher frequency described in this paper, or the transonic instability that has been recorded in blade cascades [1]. acknowledgments all the experiments were carried out at the department of high-speed aerodynamics, aeronautical research and test institute, prague. the electronic data processing was accomplished in cooperation with the department of aerospace engineering, czech technical university in prague. the support for this research from the aerospace research centre is gratefully acknowledged. references [1] ulrych j., benetka j., šafařík p.: “records of unsteady transonic flow past blade cascades by means of optical methods”. in: measuring techniques in turbomachinery, proceedings of the 16th symposium on measuring techniques for transonic and supersonic flow in cascades and turbomachines, cambridge, 2002, to be printed in 2003. [2] ulrych j., benetka j.: “records of unsteady transonic flow by means of schlieren video images”. in: fluid dynamics 2002, proceedings of the colloquium at institute of thermomechanics, academy of sciences of the czech republic, prague, 2002, p. 179–182. [3] benetka j., pernica z.: “harmonically oscillating airfoil in transonic stream of air”. arti report, z-34, prague, 1980. ing. jiří ulrych phone/fax: +420 284 825 347 e-mail: ulrych@aerospace.fsik.cvut.cz department of high-speed aerodynamics aeronautical research and test institute beranových 130 199 05 prague – letňany, czech republic © czech technical university publishing house http://ctn.cvut.cz/ap/ 7 acta polytechnica vol. 44 no. 3/2004 fig. 7: shock wave hysteresis, � � 3.5 deg fig. 
fig. 8: cssm pictures of terminal shock wave oscillation above the naca 0012 airfoil

is pt-symmetric quantum theory false as a fundamental theory?
miloslav znojil
nuclear physics institute ascr, hlavní 130, 250 68 řež, czech republic
correspondence: znojil@ujf.cas.cz
acta polytechnica 56(3):254–257, 2016, doi:10.14311/ap.2016.56.0254

abstract. yi-chan lee et al. claim (cf. phys. rev. lett. 112, 130404 (2014)) that the "recent extension of quantum theory to non-hermitian hamiltonians" (which is widely known under the nickname of "pt-symmetric quantum theory") is "likely false as a fundamental theory". in their opinion, their results "essentially kill any hope of pt-symmetric quantum theory as a fundamental theory of nature". in the present text we explain that their toy-model-based considerations are misleading and that they do not imply any such conclusions.

keywords: quantum mechanics; pt-symmetric representations of observables; measurement outcomes; locality; quantum communication.

1. introduction
at present it is still necessary to admit that, even after almost a hundred years of the study of relativistic kinematics and/or quantum dynamics, the peaceful coexistence between our intuitive perceptions of the underlying classical- and quantum-physics concepts and principles is often fragile. this fragility dates back to the publication of the epr paradox [1], and it may still be sampled in some of the freshest preprints [2]. in the present paper we intend to critically reanalyze a re-emergence of this conflict, which we noticed in one of the very recent and very well visible publications [3]. first of all, let us emphasize that the questions asked in [3] are important, with possible relevance ranging from the entirely pragmatic applications of the current quantization principles in information theory [4] up to pure mathematics [5]. in what follows we intend to complement the related discussions (to be sampled, e.g., by [6]) with a deeper analysis and reinterpretation of some technical aspects (and, mainly, the non-locality) of the toy model used in [3]. we may briefly summarize that our analysis will support the affirmative answer to the question "could pt-symmetric quantum models offer a sensible description of nature?". this conclusion will be based, first of all, on the explicit construction of all of the eligible physical inner products in all of the possible related and potentially physical, "standard" hilbert spaces h(s). in this manner, the two-parametric family of all of the eligible fundamental pt-symmetric probabilistic interpretations of the system in question is constructed. in full accord with the textbooks, the observables become represented by operators which are not self-adjoint in a "false" hilbert space but self-adjoint, as required, in another, non-equivalent, "standard" hilbert space. subsequently, a few implications of our construction will be discussed. in particular, it will be emphasized that the conclusions of yi-chan lee et al. [3], which refer to signalling, are based on an unfortunate use of one of the simplest but still inadequate, manifestly unphysical hilbert spaces.

2. toy model
in letter [3] yi-chan lee et al.
came with a very interesting proposal: to analyze what happens, during standard quantum entangled-state-mediated information transmission between alice and bob, when alice's local, spatially separated part $h$ of the total hamiltonian (say, of $h_{tot} = h \otimes I$, where the identity operator $I$ represents bob's spatially separated component) is chosen in the well known pt-symmetric two-level toy-model form

$h = s\begin{pmatrix} i\sin\alpha & 1\\ 1 & -i\sin\alpha \end{pmatrix}$, $s, \alpha \in \mathbb{R}$.  (1)

the conclusions of [3] look impressive (see, i.a., the title "local pt symmetry violates the no-signaling principle"). unfortunately, many of them (like, e.g., the very last statement that the "results essentially kill any hope of pt-symmetric quantum theory as a fundamental theory of nature") are based on several unfortunate misunderstandings. in what follows we intend to separate the innovative and inspiring aspects of the idea from those conclusions of [3] which ignore the overall non-locality of the toy model and which must be classified as strongly misleading and/or inadequate, if not plainly incorrect.

2.1. pt-symmetry
our task will be simplified by the elementary nature of the toy-model hamiltonian $h$ of (1) with the property $hPT = PTh$, called pt-symmetry (for the sake of clarity, let us recall that one may choose here the operator $P$ in the form of the pauli $\sigma_x$ matrix, while $T$ may be defined simply as complex conjugation). secondly, our task will also be simplified by the availability of several published reviews of the formalism (let us call it pt-symmetric quantum mechanics, ptqm) and, in particular, of its recent history of development (let us recall its most exhaustive descriptions [5, 7, 8]). incidentally, it is extremely unfortunate that the latter three ptqm summaries obviously remained unknown to, or at least uncited by, the authors of letter [3] (for the sake of brevity, let us call this letter "paper i" in what follows). otherwise, the authors of paper i would have been able to replace their first and already manifestly incorrect description of the birth of the formalism (in fact, the first sentence of their abstract, which states that in 1998 "bender et al. [9] have developed pt-symmetric quantum theory as an extension of quantum theory to non-hermitian hamiltonians") by a more appropriate outline of the history, reminding the readers, e.g., that for the majority of active researchers in the field (who meet, every year, at a dedicated international conference [10]) the presently accepted form of the pt-symmetric quantum theory was only finalized, roughly speaking, after the publication of the "last erratum" [11] in 2004. naturally, even the year 2004 was not the end of the history, since during 2007, for example, the description of the so-called pt-symmetric brachistochrone [12] moved a bit out of the field and had to be followed by a thorough (basically, open-system-related) re-clarification of the concept (cf. [13] and also, a year later, [14, 15]). during the same years, the methods of extension of the intrinsically non-local ptqm formalism to the area of scattering experiments were also developed [16–18].

2.2. eligible physical inner products
unfortunately, the authors of paper i have missed the latter messages.
having restricted their attention solely to the brachistochronic quantum-evolution context of the initial publication [12], they remained unwarned that in this context the role of the generator $h$ may be twofold. it is used either in the unitary quantum-evolution context of [7, 8] (cf. also the highly relevant, circa fifteen years older review paper [19]) or in application to open quantum systems. in the latter case one is allowed to speak about a genuinely non-unitary, truly brachistochronic quantum evolution within a subspace of a "full" hilbert space of states [14, 15]. naturally, the quantum world of the above-mentioned alice cannot belong to such a category. in other words, "her" hamiltonian (1) must necessarily be made self-adjoint. according to the standard theory (briefly reviewed also in our compact review [20]), this should be done via a replacement of the "friendly but false" hilbert space h(f) (chosen, in paper i, as h(f) ≡ c² for model (1)) by another, "standard, sophisticated" hilbert space h(s), which only differs from h(f) in its use of a different, locality-violating inner product between its complex two-dimensional column-vector elements $|a\rangle = (a_1, a_2)^T$ and $|b\rangle = (b_1, b_2)^T$. the usual and "friendly", f-superscripted inner product

$\langle a|b\rangle = \langle a|b\rangle^{(f)} = \sum_{i=1,2} a_i^*b_i$  (2)

defines the hilbert-space structure in the false and manifestly unphysical, ill-chosen and purely auxiliary friendlier space h(f). thus, what is now required by the ptqm postulates is the introduction of a different, non-local, s-superscripted product

$\langle a|b\rangle^{(s)} = \sum_{i,j=1,2} a_i^*\,\theta_{ij}\,b_j$  (3)

containing an ad hoc (i.e., positive and hermitian [19]) "hilbert-space-metric" matrix $\theta = \theta^{(s)}$. precisely this enables us to reinterpret our given hamiltonian $h$ with real spectrum as living in a manifestly physical, new hilbert space h(s). naturally, one requires that such a hamiltonian generates a unitary evolution in the correct, physical hilbert space h(s) or, in mathematical language, that it becomes self-adjoint with respect to the upgraded inner product (3).

3. physics
3.1. admissible probabilistic interpretations of the model
for our two-dimensional matrix model (1) the latter condition proves equivalent to the set

$h^{\dagger}\theta = \theta h$  (4)

of four linear algebraic equations, with the general solution

$\theta = a^2\begin{pmatrix} 1 & u - i\sin\alpha\\ u + i\sin\alpha & 1 \end{pmatrix}$, $a, u \in \mathbb{R}$, $|u| < |\cos\alpha|$.  (5)

any choice of an admissible parameter $u$ is easily shown to keep this metric (as well as its inverse) positive. thus, the reason why the parameter $\alpha$ was called "the non-hermiticity of h" in [14] is purely conventional, based on a tacit assumption that one speaks, say, about an open quantum system. on the contrary, once we restrict our attention to the world of alice (who must live in the physical hilbert space h(s)), we must speak about the unitarily evolving quantum states and about the relevant generator (1), which is, by construction, hermitian inside any pre-selected physical hilbert space given by our choice of the free parameter $u$. for this reason the calculation of what, according to paper i, "bob will measure using conventional quantum mechanics" must again be performed in the physical hilbert space. in particular, the trace formulae as used in paper i are incorrect and must be complemented by the pre-multiplication of the bra vectors by the "shared" metric from the right, $\langle\psi_f| \to \langle\psi_f|\tilde\theta$ (in the most elementary scenario one could simply recall (5) and choose $\tilde\theta = \theta \otimes I$).
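the reality of the spectrum of (1), the quasi-hermiticity condition (4) and the positivity of the metric (5) are all easy to verify numerically. the following lines are our own illustration, with arbitrary admissible parameter values:

```python
import numpy as np

s, alpha = 1.0, 0.4
H = s * np.array([[1j * np.sin(alpha), 1.0],
                  [1.0, -1j * np.sin(alpha)]])
print("eigenvalues of h:", np.round(np.linalg.eigvals(H), 6))
# +- s*cos(alpha): real for |alpha| < pi/2, coalescing at the exceptional point

for u in (-0.5, 0.0, 0.5):          # any |u| < cos(alpha) is admissible
    Theta = np.array([[1.0, u - 1j * np.sin(alpha)],
                      [u + 1j * np.sin(alpha), 1.0]])   # eq. (5) with a^2 = 1
    ok = np.allclose(H.conj().T @ Theta, Theta @ H)     # eq. (4)
    ev = np.linalg.eigvalsh(Theta)                      # Theta is hermitian
    print(f"u = {u:+.1f}: quasi-hermiticity {ok}, "
          f"spectrum of Theta {np.round(ev, 4)}, positive: {bool(np.all(ev > 0))}")

# the physical inner product (3): <a|b>_S = sum_ij a_i* Theta_ij b_j
a, b = np.array([1.0, 0.0]), np.array([0.0, 1.0])
Theta = np.array([[1.0, -1j * np.sin(alpha)], [1j * np.sin(alpha), 1.0]])
print("<a|b> in h(s):", a.conj() @ Theta @ b)
```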
3.2. observables
many of the related comments in paper i (like, e.g., the statement that "these states [given in the unnumbered equation after (2)] are not orthogonal to each other in conventional quantum theory") must also be modified accordingly. the point is that the non-orthogonality of the eigenstates of $h$ in the manifestly unphysical hilbert space h(f) is entirely irrelevant. in contrast, what remains decisive and relevant is that, in the words of paper i, "when α = ±π/2, they become the same state, and this is the pt symmetry-breaking point". indeed, one easily checks that in such an "out-of-the-theory" limit (towards the so-called kato exceptional point [21]) the metric (and, hence, the physical hilbert space) ceases to exist. one has to admit that the currently accepted ptqm terminology is a bit unfriendly towards newcomers. strictly, one would have to speak about the hermiticity of any two-by-two matrix observable $\lambda$, i.e., equivalently, about the validity of the necessary hermiticity condition in the physical space,

$\lambda^{\dagger}\theta = \theta\lambda$.  (6)

naturally, this condition can only be tested after we choose a definite form of the metric (5), i.e., in our toy model, after we choose the inessential scale factor $a^2 > 0$ and the essential metric-determining parameter $u$ in (5). it is worth adding that, in order to minimize possible confusion, the authors of the oldest review paper [19] recommended, firstly, that whenever one decides to work with a nontrivial (sometimes also called "non-dirac") metric $\theta \neq I$, the natural hermiticity condition in the "hidden" physical space should rather be called "quasi-hermiticity". secondly, they also recommended that, having a hamiltonian, there may still be reasons for picking a suitable candidate $\lambda$ for another observable in advance. then, equation (6) would acquire a new role of an additional phenomenological constraint imposed upon the metric. incidentally, in the ptqm context the latter idea found an extremely successful implementation in which one requires that the second observable $\lambda$ represents a charge of the quantum system in question. it is rather amusing to verify that such a specific requirement (called, sometimes, pct symmetry [7]) removes all of the ambiguities from the metric of (5), simply by fixing the value $u = 0$ as well as $a^2 = 1/\cos\alpha$.

4. conclusions
we are now prepared to return to the two key ptqm assumptions as formulated in paper i. their main weakness is that they use the concept of the physical hilbert space (i.e., in essence, of the unitarity of evolution) in a very vague manner. one should keep in mind that, even in the phenomenologically extremely poor two-dimensional toy models, the predictions and physical content of the theory are very well understood as given not only by the generator of evolution $h$ but also by the second observable $\lambda$ (say, the charge; for both, naturally, we require that the spectrum is real). thus, what can be measured in the model is the energy and, say, the charge. in other words, the theory does not leave any space for any kind of coexistence between different "conventional" metrics and/or between different normalization conventions (i.e., typically, for the simultaneous use of different parameters $u$ in (5)). at the same time, in the light of paper [18] on the ptqm-compatible unitarity of scattering, the ptqm theory still leaves space for a consistent implementation of important phenomenological concepts like locality, etc.
the concluding remarks of paper i about a conjectured "trichotomy of possible situations" must be thoroughly reconsidered, keeping in mind the necessary separation of alternative ptqm-related problems and eliminating, first of all, any mixing between the two well defined categories, viz., the quantum models characterized by unitary and/or non-unitary evolution. definitely, the theories of the ptqm type have not yet exhausted their potential. it is truly impossible to agree with the final statement of paper i that its "results essentially kill any hope of pt-symmetric quantum theory as a fundamental theory of nature".

acknowledgements
the work on the project was supported by the institutional research plan rvo61389005 and by gačr grant 16-22945s.

references
[1] a. einstein, b. podolsky, and n. rosen, phys. rev. 47, 777 (1935).
[2] j. soucek, the restoration of locality: the axiomatic formulation of the modified quantum mechanics, https://www.academia.edu/20127671/the_restoration_of_locality_the_axiomatic_formulation_of_the_modified_quantum_mechanics [2016-04-01]
[3] yi-chan lee, min-hsiu hsieh, steven t. flammia, and ray-kuang lee, phys. rev. lett. 112, 130404 (2014).
[4] s. croke, phys. rev. a 91, 052113 (2015).
[5] f. bagarello, j. p. gazeau, f. h. szafraniec, and m. znojil, editors, "non-selfadjoint operators in quantum physics: mathematical aspects", john wiley & sons, hoboken, 2015.
[6] dorje c. brody, j. phys. a: math. theor. 49, 10lt03 (2016).
[7] c. m. bender, rep. prog. phys. 70, 947 (2007).
[8] a. mostafazadeh, int. j. geom. meth. mod. phys. 7, 1191 (2010).
[9] c. m. bender and s. boettcher, phys. rev. lett. 80, 5243 (1998).
[10] http://gemma.ujf.cas.cz/~znojil/conf/index.html [2016-04-01]
[11] c. m. bender, d. c. brody, and h. f. jones, phys. rev. lett. 92, 119902(e) (2004).
[12] c. m. bender, d. c. brody, h. f. jones, and b. k. meister, phys. rev. lett. 98, 040403 (2007).
[13] a. mostafazadeh, phys. rev. lett. 99, 130502 (2007).
[14] u. günther and b. f. samsonov, phys. rev. lett. 101, 230404 (2008).
[15] u. günther and b. f. samsonov, phys. rev. a 78, 042115 (2008).
[16] f. cannata, j.-p. dedonder, and a. ventura, ann. phys. (ny) 322, 397 (2007).
[17] h. f. jones, phys. rev. d 76, 125003 (2007); h. f. jones, phys. rev. d 78, 065032 (2008).
[18] m. znojil, phys. rev. d 78, 025026 (2008); m. znojil, j. phys. a: math. theor. 41, 292002 (2008).
[19] f. g. scholtz, h. b. geyer, and f. j. w. hahne, ann. phys. (ny) 213, 74 (1992).
[20] m. znojil, sigma 5 (2009) 001 (arxiv overlay: 0901.0700).
[21] t. kato, perturbation theory for linear operators, springer-verlag, berlin, 1966.
process design for hot forging of asymmetric to symmetric rib-web shaped steel
h. cho, b. oh, c. jo, k. lee
the process design for hot forging of asymmetric to symmetric rib-web shaped steel, which is used for the turnouts of express rails, has been studied. owing to the great difference in shape between the initial billet and the final forged product, it is impossible to hot forge the rail in a single stage operation. therefore, multi stage forging, and also a die design for each stage, are necessary for the production process.
Fig. 1: The shape of the transformed parts from KS70S to KS60kg in turnouts: a) photograph of turnouts, b) schematic comparison of the two rail shapes

For the simulation, the following assumptions for the variables are implied: the dies are rigid bodies, the left die is fixed, and the moving velocity of the right and upper dies is 2 mm/s. The boundary conditions are an initial rail temperature of 1050 °C, a preheated die temperature of 100–200 °C, heat transfer considered over all surfaces and the internal body of the dies and insert, a friction factor of 0.1 between the dies, and a friction factor of 0.3 between the rail and the dies. The thermal properties of the materials are listed in Table 1.

Fig. 2: Schematic illustration of the forging dies

Table 1: Thermal properties of materials

Material            Thermal conductivity [W/(m·K)]   Heat capacity [J/(g·K)]
AISI 1055 (rail)    51.9                             0.472
AISI H-13 (die)     28.6                             0.460

Fig. 3 displays the initiation of folding during the subsequent forging processes when there is a fillet angle of 45° between the side dies and the insert. Designing with an increased fillet angle of 60° can prevent the folding phenomenon and reduce the friction wear due to stress concentration between the die and the insert. Fig. 4 shows the folding initiation at the crossing point during web forming by the upper die. This figure shows that the metal flow and the folding phenomenon depend on the beginning point of the tapered corner in the upper die.

Fig. 3: Folding initiation at a small fillet angle (45°) of the moving die and inserts
Fig. 4: Folding initiation between rib and web by the top die

Fig. 5 shows the simulated results when the machining allowance is 0.6 mm for the whole part. The simulated final shape was within the machining allowance of 0.6 mm, as shown in Fig. 5b. However, the load increased rapidly at the end of the first forging stage, as shown in Fig. 6. The rapid increase in the load is attributed to the increase in the contacting area caused by the filling of the internal die cavity once the forming of the head of the rail is finished at this stage. Therefore, an optimal die design is necessary in order to avoid an increase in the forming load. Fig. 7 shows the process of Fig. 5 modified to prevent the load increase during forming. The machining allowances are 1.0 mm for the head of the rail and 0.6 mm for the other surfaces. The large machining allowance at the head of the rail reduces the amount of metal flow through the web to the relatively short rib, but the high amount of metal flow in the 3-stage process makes it possible to form the full shape of the rib.

Fig. 5: Simulated forging process (die velocity 2.0 mm/s, preheating 100 °C): a) the end of the 1st stage, b) the end of the 5th stage
Fig. 6: Impressed load per unit length of the side die [N/mm]
Fig. 7: Simulated results of the modified process of Fig. 5: a) the end of the 1st stage, b) the end of the 5th stage
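As a rough check on how the two steels in Table 1 exchange heat at the die-rail interface, their thermal diffusivities can be compared. The densities below are not given in the paper; they are typical handbook values for a plain carbon steel and for H13 tool steel, assumed here for this sketch only.

```python
# Thermal-diffusivity comparison for the two steels in Table 1.
# alpha = k / (rho * cp); the densities rho are assumed handbook values,
# NOT data from the paper.

MATERIALS = {
    # name: (k [W/(m K)], cp [J/(kg K)], rho [kg/m^3] -- assumed)
    "AISI 1055 (rail)": (51.9, 472.0, 7850.0),
    "AISI H-13 (die)":  (28.6, 460.0, 7800.0),
}

for name, (k, cp, rho) in MATERIALS.items():
    alpha = k / (rho * cp)  # thermal diffusivity [m^2/s]
    print(f"{name}: alpha = {alpha:.2e} m^2/s")
```

Under these assumed densities the rail steel diffuses heat nearly twice as fast as the die steel, which is one reason the die preheat of 100–200 °C matters for the temperature gradient at the contact zone.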
Fig. 8 compares the final and the forged shape of the rail. The final forged shape has a sufficient machining allowance. Fig. 9 shows that the maximum forging load per unit length impressed on the right die at each stage was about 30 kN/mm. As shown in the figure, the first stage (Fig. 5a and Fig. 7a), which forms the rail head, does not show the rapid increase in load that appeared in the process of Fig. 5. However, the load increase of the second and third stages is similar to that of the first stage. Though the amount of deformation in the subsequent stages decreased, the load increased rapidly because of the increased contacting area between the material and the dies, and the cooling of the material. The load per unit length impressed on the upper die was similar to that impressed on the right die (Fig. 9b). Therefore, the required press load is 30 kN/mm × 720 mm = 21,600 kN. If we add 2,940 kN as the load tolerance of the press, the required press load becomes 24,540 kN.

3 Experiment

3.1 Testing facility and conditions

On the basis of the simulation results, dies and an insert at 70 % of the actual size were made of Al 5052. Fig. 10 shows the dies and the facilities set up on the press. The rail-shaped specimens were made of plasticine simulation material, stacked in the flow direction from 3 mm thick black and white plates, as shown in Fig. 11a. The lubricant between the specimen and the dies was soapy water. The actual process condition was approximated by a plane-strain condition, achieved by closing the opposite side of the dies to prevent material flow.

3.2 Results

The experiment was carried out on the basis of the simulation. Fig. 11a shows the flow of the plasticine when the first stage finished. The experimental results correspond to the DEFORM-2D simulation of the material flow shown in Fig. 11b. Fig. 12 displays the final forged shape of the two specimens. As the process proceeds, material flows towards the fixed dies, simultaneously to the rib, then nearly perpendicularly, and finally to the long rib. During forming with the upper die, the long rib shows a reduction in thickness because material flow occurs towards the short rib. The result corresponds to the simulation result for the material flow. Therefore, the process design in the present study seems to be applicable to real production.

Fig. 8: Comparison of the final forged product and the desired shape
Fig. 9: Impressed load per unit length of the side and top die [N/mm]: a) side die, b) top die
Fig. 10: Experimental set-up
Fig. 11: Comparison of the experimental result and the FEM simulation after the first process: a) result of the experiment, b) result of the simulation
Fig. 12: Experimental result after the final stage

4 Conclusions

1) Folding behavior and friction due to stress concentration were reduced by increasing the fillet angle between the dies and the insert. In order to prevent folding at the cross point between the rib and the web, the beginning point of the taper that drives the metal flow has to be close to the short rib.

2) In order to reduce the required press capacity, the rail head area has to have a machining allowance, and its forming has to be finished in the first stage.

3) The experimental result corresponded to the simulated result; therefore, the process design is applicable to actual production.
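As a quick numerical cross-check of the press-load estimate in Section 2 (a peak load of 30 kN/mm over the 720 mm rail section, plus the 2,940 kN press tolerance assumed in the text), the arithmetic is sketched below; all values are taken from the text.

```python
# Press-load estimate from Section 2 (values quoted from the text).
max_load_per_length = 30.0   # kN/mm, peak forging load per unit length
rail_length = 720.0          # mm, length of the forged section
press_tolerance = 2940.0     # kN, load tolerance assumed for the press

forming_load = max_load_per_length * rail_length      # = 21,600 kN
required_press_load = forming_load + press_tolerance  # = 24,540 kN
print(f"forming load: {forming_load:,.0f} kN")
print(f"required press load: {required_press_load:,.0f} kN")
```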
References

[1] Rodrigues, J. M. C., Martins, P. A. F.: Finite element modelling of the initial stages of a hot forging cycle. Finite Elements in Analysis and Design, Vol. 38, 2002, p. 295–305.
[2] Fujikawa, S.: Application of CAE for hot forging of automotive components. J. of Materials Processing Technology, Vol. 98, 2000, p. 176–181.
[3] Guo, Y. M., Nakanishi, K.: A hot forging simulation by the volumetrically elastic and deviatorically rigid-plastic finite element method. J. of Materials Processing Technology, Vol. 89, 1999, p. 111–116.
[4] Ward, M. J., Miller, B. C.: Simulation of a multi-stage railway wheel and tyre forming process. J. of Materials Processing Technology, Vol. 80–81, 1998, p. 206–212.
[5] Qingbin, L., Zengxiang, F.: Coupled thermo-mechanical analysis of the high-speed hot-forging process. J. of Materials Processing Technology, Vol. 69, 1997, p. 190–197.
[6] Doege, E., Bohnsack, R.: Closed die technologies for hot forging. J. of Materials Processing Technology, Vol. 98, 2000, p. 165–170.
[7] Choi, J. C., Kim, B. M., Kim, S. W.: Computer-aided design of blockers for rib-web type forgings. J. of Materials Processing Technology, Vol. 54, 1995, p. 314–321.

Prof. Dr. Haeyong Cho, phone: +820 432 612 464, fax: +820 432 632 448, e-mail: hycho@cbucc.chungbuk.ac.kr
Byungki Oh, Ph.D., Department of Mechanical Engineering, Chungbuk National University, Chungbuk, 361-763, Korea
Dr. Changyong Jo, High Temperature Group, Korea Institute of Machinery & Materials, Gyungnam, 641-010, Korea
Ing. Kijoung Lee, LG Industrial Systems Co., Chungbuk, 361-720, Korea

Acta Polytechnica 57(2):149–158, 2017, doi:10.14311/ap.2017.57.0149

Experimental determination of temperatures in spark-generated bubbles oscillating in water

Karel Vokurka, Physics Department, Technical University of Liberec, Studentská 2, 461 17 Liberec, Czech Republic; correspondence: karel.vokurka@tul.cz

Abstract. The surface temperatures of the plasma core in the final stages of the first contraction phase of spark-generated bubbles oscillating under ordinary laboratory conditions in a large expanse of water are determined experimentally. The measurement method is based on an analysis of the optical radiation from the bubbles and on the assumption that the plasma core radiates as a black body. It is found that the maximum surface temperatures of the plasma core range over 4300–8700 K.

Keywords: spark-generated bubbles; temperatures in bubbles; bubble oscillations.

1. Introduction

Bubble oscillations remain an important topic in fluid dynamics. While they are traditionally associated with erosion damage [1, 2], recent efforts have been aimed at medical applications, such as contrast enhancement in ultrasonic imaging [3] and shock wave lithotripsy [4]. Both spark-generated bubbles [2, 5–7, 9–11] and laser-generated bubbles [1, 8, 12–18] are very useful tools in experimental studies of free bubble oscillations. The temperature in the interior of a bubble is a very important quantity, and it is necessary to know its variation with time in order to understand the physical processes, such as light emission, taking place in an oscillating bubble. However, it is very difficult to determine the temperatures in spark-generated and laser-generated bubbles experimentally.
Until now, only a limited number of estimates based on spectral analysis of the light emitted from the bubbles have been reported [5, 8, 11, 12, 14, 15]. The difficulties associated with the temperature measurements have their origin in the fact that the temperature in the bubble interior varies very rapidly in the final stages of the contraction phase, and the emitted light used for spectral analysis therefore has the form of a very short flash. Another difficulty in the measurements originates from the fact that experimental spark-generated bubbles have very low reproducibility (in terms of bubble size and intensity of oscillation). Although the reproducibility of laser-generated bubbles may be better, it remains relatively low. This paper will show that useful estimates of the instantaneous surface temperature of the plasma core in the bubble interior can be determined using the experimental method described in [19]. In this way, the autonomous behaviour of the plasma can also be demonstrated. The analysis presented here is devoted to free bubble oscillations under ordinary laboratory conditions in a large expanse of liquid. The results discussed here have been presented in a shortened form at a conference in Svratka [20].

2. Experimental setup

Freely oscillating bubbles were generated by discharging a capacitor bank via a sparker submerged in a laboratory water tank with dimensions of 6×4×5.5 m (length×width×depth). The experiments were performed in tap water at constant hydrostatic pressure p∞ = 125 kPa, at room temperature θ∞ = 292 K, and far from any boundaries. The capacitance of the capacitor bank could be varied in steps by connecting 1–10 capacitors in parallel; each of these capacitors had a capacitance of 16 µF. The capacitors were charged from a high-voltage source of 4 kV. An air-gap switch was used to trigger the discharge through the sparker. A schematic diagram of the experimental setup is given in Figure 1.

Both the spark discharge and the subsequent bubble oscillations were accompanied by intensive optical and acoustic radiation. The optical radiation was monitored by a detector consisting of a fibre-optic cable, a photodiode, an amplifier, and an A/D converter. The input surface of the fibre cable was positioned in water at the same level as the sparker, at a distance of 0.2 m to the side, pointing perpendicular to the sparker gap and the electrodes. A Hamamatsu photodiode type S2386-18L was positioned at the output surface of the fibre-optic cable. The usable spectral range of the photodiode is 320–1100 nm. An analysis of the optical spectra given in the literature showed that the maximum temperatures in spark-generated and laser-generated bubbles range over 5800–8150 K [5, 11, 15, 18]. Using the Wien and Planck laws, it can then be verified that the spectral maxima of the optical radiation are within the photodiode passband, and that the prevailing part of the radiation is received by the detector. The load resistance of the photodiode was 75 Ω, so the rise time of the measured pulses is about 50 ns. A broadband amplifier (0–10 MHz) was connected to the photodiode output terminals. The output voltage from the amplifier was recorded using a data acquisition board (National Instruments PCI 6115, 12-bit A/D converter) with a sampling frequency of 10 MHz.
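A small consistency check on the detector bandwidth can be made from the quoted load resistance and rise time. This sketch assumes the usual first-order relations t_r ≈ 2.2RC and f_3dB ≈ 0.35/t_r; neither these relations nor the capacitance inferred from them are stated in the paper.

```python
# Detector bandwidth check: 75-ohm load and ~50 ns rise time.
# Assumes a first-order RC response; the inferred capacitance is an
# estimate, not a value given in the paper.
R_LOAD = 75.0    # ohm, photodiode load resistance (from the text)
T_RISE = 50e-9   # s, observed 10-90 % rise time (from the text)

C_eff = T_RISE / (2.2 * R_LOAD)  # implied effective capacitance [F]
f_3db = 0.35 / T_RISE            # rise-time/bandwidth rule of thumb [Hz]
print(f"C_eff ~= {C_eff * 1e12:.0f} pF")   # ~300 pF
print(f"f_3dB ~= {f_3db / 1e6:.0f} MHz")   # ~7 MHz
```

The implied 3 dB bandwidth of about 7 MHz is consistent with the 0–10 MHz amplifier and the 10 MHz sampling rate.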
Figure 1. Experimental setup used to generate oscillating bubbles and to record the optical and acoustic radiation from them (DAQ – data acquisition board, HV – high voltage).

The acoustic radiation was monitored using a Reson broadband hydrophone type TC 4034. The hydrophone was positioned with the sensitive element at the same depth as the sparker. The distance between the hydrophone acoustic centre and the sparker gap was r = 0.2 m. The output of the hydrophone was connected via a 10:1 divider to the second channel of the A/D converter. Prior to the measurements reported here, a limited number of high-speed camera records were taken at framing rates of 2800–3000 fps (frames per second). A more detailed description of the experimental setup is given in an earlier work [19].

In the experiments, a large number of almost spherical bubbles freely oscillating in a large expanse of liquid were successively generated. The size of these bubbles, as described by the first maximum radius R_M1, ranged over 18.5–56.5 mm, and the bubble oscillation intensity, as described by the nondimensional peak pressure in the first acoustic pulse p_zp1 = (p_p1/p∞)(r/R_M1), ranged from 24 to 153. Here p_p1 is the peak pressure in the first acoustic pulse p_1(t), p∞ is the ambient (hydrostatic) pressure at the position of the sparker, and r is the distance of the hydrophone from the sparker centre. Both R_M1 and p_zp1 were determined in each experiment from the respective pressure record, using an iterative procedure described in [21].
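The oscillation-intensity parameter defined above is simple enough to compute directly; a minimal sketch follows, with invented sample values for illustration only.

```python
# Nondimensional peak pressure p_zp1 = (p_p1 / p_inf) * (r / R_M1),
# as defined above.  The sample numbers are hypothetical.
def p_zp1(p_p1, p_inf, r, R_M1):
    """Bubble oscillation intensity from the first acoustic pulse.

    p_p1  -- peak pressure of the first acoustic pulse [Pa]
    p_inf -- ambient (hydrostatic) pressure at the sparker [Pa]
    r     -- hydrophone distance from the sparker centre [m]
    R_M1  -- first maximum bubble radius [m]
    """
    return (p_p1 / p_inf) * (r / R_M1)

# Hypothetical example: a 50 mm bubble measured at r = 0.2 m, p_inf = 125 kPa.
print(p_zp1(p_p1=2.0e6, p_inf=125e3, r=0.2, R_M1=0.050))  # -> 64.0
```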
3. Results and discussion

Examples of several frames from a film record taken with a high-speed camera are given in Figure 2. The first frame was taken in the bubble growth phase, the second to sixth frames correspond to the first contraction phase, and the seventh and eighth frames correspond to the first expansion phase. In the frames taken in the growth and first contraction phases, the glowing plasma core in the bubble interior can be seen. The small bright objects floating in the vicinity of the bubble are plasma packets [19].

Figure 2. Selected frames from a film record of a spark-generated bubble. The bubble is R_M1 = 51.5 mm in size and oscillates with intensity p_zp1 = 70.3. The times below each frame refer to the time origin, which is set at the instant of liquid breakdown. The spots of light on the sides of the frames are due to the illuminating lamps.

The variation of the bubble radius R with time t is shown in Figure 3. The experimental data were determined from frames such as those shown in Figure 2. As the spark-generated bubble is not ideally spherical (it is slightly elongated in the vertical direction), the data points represent an average from two perpendicular directions, one horizontal and one vertical. The individual frames in Figure 2 can be traced to the corresponding points on the plot of bubble radius vs. time given in Figure 3.

Figure 3. Variation of the bubble radius R with time t: '◦' experimental data, '–' fit to the experimental data (in the vicinity of t_m1 this fit is only very approximate). The time origin is set at the instant of liquid breakdown. The time at which the bubble attains the first maximum radius R_M1 is denoted t_1; the time at which the bubble is compressed to the first minimum radius R_m1 is denoted t_m1; and the time when the bubble attains the second maximum radius R_M2 is denoted t_2. The growth phase is defined to be within the interval (0, t_1), the first contraction phase within (t_1, t_m1), and the first expansion phase within (t_m1, t_2).

The record of the optical radiation (represented by the voltage u(t) at the output of the optical detector) consists of a pulse u_0(t), radiated during the electric discharge and the subsequent explosive bubble growth, and a pulse u_1(t), radiated during the first bubble contraction and the subsequent bubble expansion [19]. The dynamic range of the optical detector (photodiode, amplifier, A/D converter) was not sufficiently high to record both u_0(t) and u_1(t) in a single experiment with good fidelity. Two sets of experiments were therefore performed: the first set was aimed at recording the pulse u_0(t) undisturbed, and the second set at recording the pulse u_1(t) with acceptable noise. A link between the two sets of experiments is achieved by using statistical averages from the first set of records to compute the respective values for the second set. An example of the optical pulse u_1(t) from the second set of experiments is given in Figure 4.

Figure 4. Voltage u(t) at the output of the optical detector. The spark-generated bubble is R_M1 = 55.2 mm in size and oscillates with intensity p_zp1 = 153.2. In this figure, the time origin is set at the instant of liquid breakdown, and this instant coincides with the beginning of the steep growth of the initial pulse u_0(t). The time at which the bubble attains the first maximum radius R_M1 is denoted t_1, and the time when the bubble attains the second maximum radius R_M2 is denoted t_2. Pulse u_0(t) is defined to be within the interval (0, t_1), and pulse u_1(t) within (t_1, t_2). The blip before t = 0 is noise due to the air-gap switch.

In Figure 4 the pulse u_0(t) is clipped due to the limited dynamic range of the optical detector. The maximum value of pulse u_1(t) is denoted u_M1, and the time of its occurrence t_u1. As can be seen in Figure 4, the optical radiation from the bubble decreases rapidly to zero after t_u1. Another interesting fact that can be observed in Figure 4 is the occurrence of optical radiation from the bubble during the whole first oscillation, i.e., in the interval lasting approximately (0, t_u1). As can be observed in the photographs presented in Figure 2, the source of this persisting optical radiation is the plasma core. It can also be seen in these photographs that the bubble interior is filled with two substances: first, some transparent matter, which is most probably hot water vapour; second, opaque plasma at the bubble centre. The existence of this hot plasma core during the whole first bubble oscillation, i.e., even long after the electric discharge has terminated (in the case of the experimental data shown in Figure 4 the electric discharge lasted only approximately 0.5 ms), is an astonishing phenomenon, observed already by Golubnichii et al. [9, 10], who called this outlasting plasma core "long-living luminescence formations". Similar long-lasting optical radiation has also been observed by Baghdassarian et al. [18].
Baghdassarian et al. explain this radiation as the "luminescence from metastable atomic and molecular states injected into the water during or just after the plasma flash, which then recombine very slowly". However, it can be seen in Figure 2 that the light was emitted from the bubble interior, and not from the surrounding water. A direct comparison of Figure 4 with Figure 1 in [18] can also be used as further proof that in both spark-generated and laser-generated bubbles the glowing plasma core is present in the interior throughout the first bubble oscillation. And since there is no discharge current flowing through laser-generated bubbles, it follows that the persisting plasma core in spark-generated bubbles is not due to a persisting discharge current.

Figure 5. Time variation of the plasma core surface temperature θ and of the bubble wall radius R. The size of the experimental bubble is R_M1 = 49.0 mm, and the bubble oscillation intensity is p_zp1 = 142.1. The time origin is set at the instant when the bubble radius equals R_M1, i.e., it coincides with t_1. The maximum surface temperature of the plasma core at the first bubble contraction is denoted θ_M1.

Under the assumption that the hot plasma core in the bubble centre radiates as a black body (this assumption is justified, e.g., by the results published in [12]), an equation enabling the determination of the plasma surface temperature θ(t) has been derived in [19]. The derivation is based on the Stefan–Boltzmann law, the equation of energy partition during the electric discharge, the time variation of the bubble radius R(t), and the voltage u(t) at the output of the optical detector. In particular, for a voltage record u_1(t) from the second set of experiments, the corresponding temperature θ(t) is given by

$$\theta^4(t) = \frac{\langle \theta_{M0} \rangle^4 \, \langle R_{M0} \rangle^2}{\langle u_{M0} \rangle}\,\frac{u_1(t)}{R_p^2(t)}. \quad (1)$$

Here u_M0 is the maximum voltage in pulse u_0(t), and this voltage corresponds to the bubble radius R_M0. The surface temperature of the plasma when the bubble attains radius R_M0 during its growth is θ_M0. The angle brackets ⟨⟩ denote average values over the first set of experiments. For a given bubble size R_M1, these average values can be computed using the regression lines and the polynomial derived in [19]: ⟨θ_M0⟩ = −0.11 R_M1 + 17.4 [kK, mm], ⟨R_M0⟩ = 0.1836 R_M1, and ⟨u_M0⟩ = 1.25·10⁻⁴ R_M1² [V, mm].
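Equation (1) is straightforward to evaluate once the regression averages are in hand; the sketch below implements it in that form, where R_p is the radius of the light-emitting plasma core discussed in the next paragraph. The sample inputs are invented for illustration.

```python
# Sketch of eq. (1): plasma surface temperature from the optical record.
# The three regression averages are the ones quoted in the text (from [19]).
def theta_of_t(u1, R_p, R_M1):
    """Plasma surface temperature [kK] from eq. (1).

    u1   -- detector voltage u_1(t) [V]
    R_p  -- radius of the light-emitting plasma core [mm]
    R_M1 -- first maximum bubble radius [mm]
    """
    theta_M0 = -0.11 * R_M1 + 17.4   # [kK]
    R_M0 = 0.1836 * R_M1             # [mm]
    u_M0 = 1.25e-4 * R_M1 ** 2       # [V]
    return (theta_M0 ** 4 * R_M0 ** 2 / u_M0 * u1 / R_p ** 2) ** 0.25

# Hypothetical sample: u_1 = 1.2 mV near maximum contraction of a 49 mm
# bubble whose plasma core is compressed to about 1.2 mm.
print(f"theta ~= {theta_of_t(u1=1.2e-3, R_p=1.2, R_M1=49.0):.1f} kK")
```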
In (1), R_p is the radius of the light-emitting hot plasma core. An estimate of the radius R_p can be obtained from knowledge of the bubble wall radius R and of the volume that the plasma core occupies in the bubble interior. Denoting the reduction factor as q (q < 1), one has R_p = qR. The variation of the bubble wall radius R with time can be computed using a theoretical bubble model. The exact value of the reduction factor q is not known at present. A value of q = 0.2 was used in [19] for the region in the vicinity of the maximum bubble radius R_M1; this value was estimated from high-speed film records of the oscillating bubbles. When photographs of the bubble (e.g., in Figure 2) are inspected, it can be seen that during the contraction phase both the transparent vapour and the opaque plasma are gradually compressed, and remain mutually separated. The proportion of vapour and plasma changes in such a way that the value of the reduction factor q increases in the final stages of the contraction phase (the volume occupied by the plasma decreases more slowly in the later stages of the contraction phase than the volume of the vapour). Unfortunately, no estimate of q can be made from the high-speed camera frames for the important time interval covering the closest surroundings of the instant t_u1, when the bubble is compressed to its first minimum radius R_m1, because the frames there were overexposed by the intensive light emitted from the bubble (see the sixth frame in Figure 2). However, Ohl [13] has studied the luminescence from a laser-generated bubble and gives (Figure 6 in [13]) the size of the plasma core.

Figure 6. The scaling function z_m1 = f(p_zp1) computed with Herring's modified model of an ideal gas bubble oscillating in a large expanse of liquid.

The bubble studied by Ohl was R_M1 = 0.8 mm in size, and the radiating plasma core at the maximum contraction of the bubble had dimensions of 31 µm (horizontal diameter) and 44 µm (vertical diameter). This gives the mean value of the first minimum plasma core radius R_pm1 = 14 µm. When expressed in non-dimensional form, one obtains z_pm1 = R_pm1/R_M1 = 0.023. An estimate of the nondimensional first minimum bubble wall radius given in [19] is z_m1 = R_m1/R_M1 = 0.03. This gives an estimate of the reduction factor q = z_pm1/z_m1 ≈ 0.8, and this is the value of q used in this work for the closest surroundings of R_m1.

An example of the variation of the plasma core surface temperature θ with time t during the first bubble contraction and the subsequent expansion, computed with (1), is given in Figure 5. Equation (1) has been derived under the assumption that the plasma core is a black-body radiator. This assumption seems to be correct at those instants when the pressure and temperature in the bubble interior are high, which is fulfilled only in the vicinity of t_u1. Hence the computed temperature θ(t) shown in Figure 5 is correct only in the vicinity of the maximum value θ_M1; at other instants the computed temperatures represent just a rough estimate.

In (1), only the voltages u_1(t) and u_M0 are measured directly. The radii R(t) and R_M0 are computed using a theoretical model. In this work, Herring's modified model of a bubble freely oscillating in a compressible liquid far from boundaries is used [21]. In this model, the bubble is assumed to be an ideal gas bubble which behaves adiabatically. However, it has been shown earlier, by evaluating experimental pressure records [22], that spark-generated and laser-generated bubbles are vapour bubbles. This means that liquid evaporation and vapour condensation at the bubble wall play an important role during the oscillation of the bubble. It has further been shown, by evaluating experimental optical records [19], that spark-generated and laser-generated bubbles do not behave adiabatically. Finally, in real spark-generated and laser-generated bubbles there are energy losses, the nature of which has not yet been clarified [19]; these processes are therefore not encompassed in the theoretical model used to compute R(t).
However, it can be shown [23] that, during an interval lasting for almost the whole time of the bubble oscillation (except for short intervals at the beginning of the bubble growth and in the final stages of the bubble contraction), the bubble wall motion is governed by the transformation of the bubble potential energy into liquid kinetic energy, and vice versa. Neither the bubble potential energy nor the liquid kinetic energy depends on any of the processes mentioned above. That is, these energies and their transformations do not depend on the compressibility of the liquid, the liquid evaporation and vapour condensation at the bubble wall, the adiabatic assumption concerning the thermal behaviour of the bubble, or any energy losses from the bubble. The computed variation of R(t) (and thus also the computed values of R_M0) may therefore be considered relatively correct in this long interval. There is only a very short time interval in the vicinity of the minimum radius R_m1 in which all the above-mentioned processes manifest themselves and play an important role. Unfortunately, both the experimental and the theoretical description of the bubble are still insufficient precisely in this short interval.

It is a very difficult task to determine R_m1 experimentally. For example, the framing rate of the high-speed camera used here was too low to enable R_m1 to be determined. Even those researchers who have succeeded in making an experimental observation of R_m1 encountered extreme difficulties. It is therefore not surprising that few experimentally determined values of R_m1 are given in the literature on the oscillation of spark-generated and laser-generated bubbles, and that those values that have been given vary greatly. In addition, the authors unfortunately usually do not provide further data enabling the computation of p_zp1. Because knowledge of the variation of R_m1 with p_zp1 is important in this work, a theoretical computation of R_m1 is preferred in the following.

The following procedure is used to determine the value of R_pm1, which is needed in (1) for the computation of the temperature θ_M1. First, using Herring's modified bubble model [21], the scaling function z_m1 = R_m1/R_M1 = f(p_zp1) is determined. The computed scaling function z_m1 = f(p_zp1) is shown in Figure 6. The physical constants used in the computation are: water density ρ = 10³ kg/m³, polytropic exponent of the gas in the bubble interior γ = 1.25, velocity of sound in water c = 1480 m/s, and hydrostatic pressure p∞ = 125 kPa. The scaling function z_m1 = f(p_zp1) has been computed for p_zp1 ranging from 1 to 200, as this is the range of bubble oscillation intensities encountered in experiments with spark-generated bubbles [21]. In the experiments analysed in this work, the parameters R_M1 and p_zp1 have been determined for each record u_1(t). Then, using the scaling function z_m1(p_zp1) given in Figure 6, an estimate of the first minimum radius R_m1 = z_m1 R_M1 can be obtained for each experimental record u_1(t). An estimate of the corresponding plasma core radius is then R_pm1 = 0.8 R_m1. Thus, using the measured values of u_M1, R_M1, and p_zp1 from the second set of experiments, and the average values of ⟨θ_M0⟩, ⟨R_M0⟩, and ⟨u_M0⟩ determined for a given bubble size R_M1 from the regression lines and the polynomial given above, the temperature θ_M1 can be computed.
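The whole pipeline just described can be sketched in a few lines. The true scaling function comes from Herring's modified model (Figure 6); since that curve is not reproduced here, the sketch below stands in for it with a log-linear interpolation through two points quoted in the next section (z_m1 ≈ 0.1 at p_zp1 = 25 and z_m1 ≈ 0.032 at p_zp1 = 150), which is a crude approximation. The sample record values are hypothetical. The wall-pressure relation P_p1 = p_zp1 p∞/z_m1 used in the numerical examples below is included as well.

```python
import math

def z_m1(p_zp1):
    """Crude log-linear stand-in for the Figure 6 scaling function."""
    x0, y0, x1, y1 = 25.0, 0.1, 150.0, 0.032  # points quoted in the text
    t = (p_zp1 - x0) / (x1 - x0)
    return 10.0 ** (math.log10(y0) + t * (math.log10(y1) - math.log10(y0)))

def theta_M1(u_M1, R_M1, p_zp1):
    """Maximum plasma surface temperature [kK] at the first contraction.

    u_M1  -- maximum voltage of pulse u_1(t) [V]
    R_M1  -- first maximum bubble radius [mm]
    p_zp1 -- nondimensional bubble oscillation intensity [-]
    """
    R_m1 = z_m1(p_zp1) * R_M1        # first minimum bubble radius [mm]
    R_pm1 = 0.8 * R_m1               # plasma core radius, q = 0.8 [mm]
    theta_M0 = -0.11 * R_M1 + 17.4   # regression averages from [19]
    R_M0 = 0.1836 * R_M1             # [mm]
    u_M0 = 1.25e-4 * R_M1 ** 2       # [V]
    return (theta_M0 ** 4 * R_M0 ** 2 / u_M0 * u_M1 / R_pm1 ** 2) ** 0.25

def wall_peak_pressure(p_zp1, p_inf=125e3):
    """Peak pressure at the bubble wall [Pa]: P_p1 = p_zp1 * p_inf / z_m1."""
    return p_zp1 * p_inf / z_m1(p_zp1)

# Hypothetical record: R_M1 = 49 mm, p_zp1 = 142.1, u_M1 = 1.2 mV.
print(f"theta_M1 ~= {theta_M1(1.2e-3, 49.0, 142.1):.1f} kK")
print(f"P_p1(25)  ~= {wall_peak_pressure(25.0) / 1e6:.0f} MPa")   # ~31 MPa
print(f"P_p1(150) ~= {wall_peak_pressure(150.0) / 1e6:.0f} MPa")  # ~586 MPa
```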
The values of θ_M1 determined in this way for different bubble sizes R_M1 and different bubble oscillation intensities p_zp1 are displayed in Figures 7 and 8. Figures 7 and 8 show that the temperatures θ_M1 determined in this work range over approximately 4300–8700 K. The regression lines for the mean values of the temperature θ_M1 in dependence on R_M1 and p_zp1 are ⟨θ_M1⟩ = −0.04 R_M1 + 7.57 [kK, mm] and ⟨θ_M1⟩ = 0.01 p_zp1 + 4.61 [kK, –]. It can be seen that the temperatures θ_M1 vary only moderately with the bubble sizes and the oscillation intensities. Whereas the small variation of θ_M1 with R_M1 is as expected (the bubbles studied here have sizes for which it can be assumed that the scaling law is valid [24]), the small dependence of θ_M1 on p_zp1 is surprising. It shows that the plasma in a bubble behaves rather autonomously, i.e., the plasma temperature varies very little with the pressure in the interior of the bubble. This can be shown easily in the following numerical examples.

Let us denote the peak pressure at the bubble wall in the final stages of the first contraction as P_p1. A rough estimate of the value of P_p1 can be obtained from the experimentally determined values of p_zp1 and the computed scaling function z_m1 = f(p_zp1) displayed in Figure 6. At the bubble wall it holds that r = R_m1 and p_p1 = P_p1. After substituting these equalities into the definition formula of p_zp1 given in Section 2 (i.e., into the relation p_zp1 = (p_p1/p∞)(r/R_M1)), one obtains p_zp1 = P_p1 z_m1/p∞. This equation can be rearranged to give P_p1 = p_zp1 p∞/z_m1. Then, for p_zp1 = 25 and p∞ = 125 kPa, it can be read from Figure 6 that z_m1 ≈ 0.1, and thus P_p1 ≈ 31 MPa. For p_zp1 = 150 and p∞ = 125 kPa, it can be read from Figure 6 that z_m1 ≈ 0.032, and thus P_p1 ≈ 586 MPa. It follows that the variation of the peak pressure at the bubble wall P_p1 from 31 MPa to 586 MPa, i.e., by a factor of approximately 19, is accompanied by a variation of the mean temperature ⟨θ_M1⟩ from 4860 K to 6110 K, i.e., only by a factor of 1.26 (the temperatures ⟨θ_M1⟩ have been computed using the regression lines given above). The difference between the variations of P_p1 and θ_M1 provides surprising evidence of the autonomous behaviour of the plasma in the bubble interior.

Finally in this section, the temperatures θ_M1 given in Figure 7 can be compared with the experimental results of other researchers obtained with bubbles oscillating in water under ordinary laboratory conditions and far from boundaries. A first rough estimate of θ_M1 can be derived from data published by Buzukov and Teslenko [8]. These researchers studied bubbles generated by a laser, and found that the optical radiation associated with the first optical pulses u_1(t) had continuum spectra with spectral maxima occurring in the range 375–440 nm. Using Wien's law, it can be calculated that the temperatures θ_M1 ranged over 6600–7700 K. Unfortunately, the sizes of the bubbles generated in these experiments are not available.

Figure 7. Variation of the experimentally determined maximum surface temperatures of the plasma core during the first bubble contraction, θ_M1, with bubble size R_M1: '◦' – the values of θ_M1 determined in this work; '∗' – the values of θ_M1 determined in [5] (GO), [12] (BA), [14] (BW), [15] (BR); these values are discussed at the end of this section.
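The Wien's-law estimates quoted in this comparison are easy to reproduce; the conversion below uses the standard displacement constant b ≈ 2.898×10⁻³ m·K.

```python
# Wien's displacement law: T = b / lambda_max.
WIEN_B = 2.898e-3  # m*K

def wien_temperature(lambda_max_nm):
    """Black-body temperature [K] for a given spectral maximum [nm]."""
    return WIEN_B / (lambda_max_nm * 1e-9)

for lam in (375.0, 440.0, 500.0):
    print(f"lambda_max = {lam:.0f} nm -> T ~= {wien_temperature(lam):.0f} K")
# 375-440 nm gives ~7730-6590 K, the 6600-7700 K range quoted for [8];
# 500 nm gives ~5800 K, the value quoted for [5] in the next paragraph.
```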
Figure 8. Variation of the experimentally determined maximum surface temperatures of the plasma core during the first bubble contraction, θ_M1, with bubble oscillation intensity p_zp1.

Golubnichii et al. [5] studied bubbles generated by an exploding-wire technique, and found that the maximum in the spectrum of the optical radiation associated with the first optical pulse u_1(t) lies approximately at 500 nm. Then, using Wien's law, a temperature θ_M1 = 5800 K is obtained. In this case, the size of the bubbles was R_M1 = 30 mm. Baghdassarian et al. [12] studied laser-generated bubbles R_M1 = 0.6–0.8 mm in size, and from the gated optical spectra they determined θ_M1 = 7800 K. Finally, Brujan and Williams [14], and Brujan et al. [15], also studied laser-generated bubbles, and from the gated optical spectra they determined θ_M1 = 8150 K; in this case, the bubble sizes R_M1 ranged from 0.65 mm to 0.75 mm. Unfortunately, no data are given in the quoted references to enable the bubble oscillation intensity to be determined.

It should be pointed out that the temperatures obtained in this work represent instantaneous values, and can be determined for any instant at which the assumption of black-body radiation is valid. In the works of other authors [5, 8, 12, 14, 15], however, no exact instant is given at which the temperature was determined; it is only assumed that the measured temperature corresponds to the maximum bubble contraction. When the autonomous plasma behaviour is taken into account, this assumption may not always be correct, and experimental verification is still required. The temperatures measured in this work can also be assigned to particular experiments. This is again in contrast with other works [12, 14, 15], where the spectra have been averaged over 25–50 experiments. Although care was taken in those works to average only the spectra corresponding to bubbles of almost the same size, the variation of the temperature with the bubble oscillation intensity (and maybe even with some other, as yet unknown, factors) has been lost during the averaging procedure.

4. Conclusions

The surface temperatures of the plasma core inside spark-generated bubbles in the final stages of the first contraction phase have been determined experimentally. It was found that these temperatures range from 4300 K to 8700 K. As statistical averages from another set of experiments were used in the computations, the method employed gives only approximate results. Nevertheless, it has been shown that the values obtained are in good agreement with the temperatures published by other researchers. However, unlike the results presented by other researchers, the results presented here were obtained on a large set of bubbles of different sizes oscillating with different intensities. It has also been shown that the plasma inside bubbles behaves rather autonomously, i.e., the surface temperature of the plasma varies only very little with the pressure at the bubble wall. The average maximum surface temperature ⟨θ_M1⟩ increases moderately with the bubble oscillation intensity p_zp1, and decreases moderately with the bubble size R_M1.
In any case, it can be concluded that the plasma in bubbles behaves rather differently from the ideal gas that has so often been considered. The experimental method described in this work offers an alternative to spectral methods. As has been discussed here, the method has certain advantages and disadvantages. It can be anticipated that, after further improvements, the precision of this method may be increased.

Acknowledgements

This work has been supported by the Ministry of Education of the Czech Republic as research project MSM 245 100 304. The experimental part of this work was carried out during the author's stay at the Underwater Laboratory of the Italian Acoustics Institute, CNR, Rome, Italy. The author wishes to thank Dr. Silvano Buogo from this laboratory for his very valuable help in preparing the experiments.

References

[1] S. J. Shaw, W. P. Schiffers, D. C. Emmony. Experimental observation of the stress experienced by a solid surface when a laser-created bubble oscillates in its vicinity. J. Acoust. Soc. Am. 110, 1822–1827, 2001. doi:10.1121/1.1397358
[2] A. Jayaprakash, C.-T. Hsiao, G. Chahine. Numerical and experimental study of the interaction of a spark-generated bubble and a vertical wall. Trans. ASME J. Fluid Eng. 134, 031301, 2012. doi:10.1115/1.4005688
[3] V. Sboros. Response of contrast agents to ultrasound. Adv. Drug Deliver. Rev. 60, 1117–1136, 2008. doi:10.1016/j.addr.2008.03.011
[4] T. G. Leighton, C. K. Turangan, A. R. Jamaluddin, G. J. Ball, P. R. White. Prediction of far-field acoustic emissions from cavitation clouds during shock wave lithotripsy for development of a clinical device. Proc. R. Soc. A 469, 20120538, 2013. doi:10.1098/rspa.2012.0538
[5] P. I. Golubnichii, V. M. Gromenko, A. D. Filonenko. The nature of an electrohydrodynamic sonoluminescence impulse (in Russian). Zh. Tekh. Fiz. 50, 2377–2380, 1980.
[6] S. W. Fong, D. Adhikari, E. Klaseboer, B. C. Khoo. Interactions of multiple spark-generated bubbles with phase differences. Exp. Fluids 46, 705–724, 2009. doi:10.1007/s00348-008-0603-4
[7] Y. Huang, H. Yan, B. Wang, X. Zhang, Z. Liu, K. Yan. The electro-acoustic transition process of pulsed corona discharge in conductive water. J. Phys. D: Appl. Phys. 47, 255204, 2014. doi:10.1088/0022-3727/47/25/255204
[8] A. A. Buzukov, V. S. Teslenko. Sonoluminescence following the focusing of laser radiation into a liquid (in Russian). Pisma Zh. Tekh. Fiz. 14, 286–289, 1971.
[9] P. I. Golubnichii, V. M. Gromenko, Ju. M. Krutov. Formation of long-lived luminescent objects under decomposition of a dense low-temperature water plasma (in Russian). Zh. Tekh. Fiz. 60, 183–186, 1990.
[10] P. I. Golubnichii, V. M. Gromenko, Ju. M. Krutov. Long-lived luminescent formations inside a pulsating cavern initiated by powerful energy emission in water (in Russian). Dokl. Akad. Nauk SSSR 311, 356–360, 1990.
[11] L. Zhang, X. Zhu, H. Yan, Y. Huang, Z. Liu, K. Yan. Luminescence flash and temperature determination of the bubble generated by underwater pulsed discharge. Appl. Phys. Lett. 110, 034101, 2017. doi:10.1063/1.4974452
[12] O. Baghdassarian, H.-C. Chu, B. Tabbert, G. A. Williams. Spectrum of luminescence from laser-created bubbles in water. Phys. Rev. Lett. 86, 4934–4937, 2001. doi:10.1103/PhysRevLett.86.4934
[13] C.-D. Ohl. Probing luminescence from nonspherical bubble collapse. Phys. Fluids 14, 2700–2708, 2002. doi:10.1063/1.1489682
[14] E. A. Brujan, G. A. Williams. Luminescence spectra of laser-induced cavitation bubbles near rigid boundaries. Phys. Rev. E 72, 016304, 2005. doi:10.1103/PhysRevE.72.016304
[15] E. A. Brujan, D. S. Hecht, F. Lee, G. A. Williams. Properties of luminescence from laser-created bubbles in pressurized water. Phys. Rev. E 72, 066310, 2005. doi:10.1103/PhysRevE.72.066310
[16] H. J. Park, G. J. Diebold. Generation of cavitation luminescence by laser-induced exothermic chemical reaction. J. Appl. Phys. 114, 064913, 2013. doi:10.1063/1.4818516
[17] T. Sato, M. Tinguely, M. Oizumi, M. Farhat. Evidence for hydrogen generation in laser- or spark-induced cavitation bubbles. Appl. Phys. Lett. 102, 074105, 2013. doi:10.1063/1.4793193
[18] O. Baghdassarian, B. Tabbert, G. A. Williams. Luminescence characteristics of laser-induced bubbles in water. Phys. Rev. Lett. 83, 2437–2440, 1999. doi:10.1103/PhysRevLett.83.2437
[19] K. Vokurka, J. Plocek. Experimental study of the thermal behavior of spark-generated bubbles in water. Exp. Therm. Fluid Sci. 51, 84–93, 2013. doi:10.1016/j.expthermflusci.2013.07.004
[20] K. Vokurka. Determination of temperatures in oscillating bubbles: experimental results. In: 22nd International Conference Engineering Mechanics 2016, Svratka, Czech Republic, 9–12 May 2016 (proceedings: Institute of Thermomechanics, Academy of Sciences of the Czech Republic, v.v.i., Prague 2016, ISBN 978-80-87012-59-8, ISSN 1805-8248, editors: Igor Zolotarev, Vojtěch Radolf, pp. 581–584).
[21] S. Buogo, K. Vokurka. Intensity of oscillation of spark-generated bubbles. J. Sound Vib. 329, 4266–4278, 2010. doi:10.1016/j.jsv.2010.04.030
[22] K. Vokurka. A model of spark and laser generated bubbles. Czech. J. Phys. B38, 27–34, 1988. doi:10.1007/BF01596516
[23] K. Vokurka. Significant intervals of energy transforms in bubbles freely oscillating in liquids. J. Hydrodyn. 29, 217–225, 2017. doi:10.1016/S1001-6058(16)60731-X
[24] K. Vokurka. The scaling law for free oscillations of gas bubbles. Acustica 60, 269–276, 1986.

Acta Polytechnica 56(5):402–408, 2016, doi:10.14311/ap.2016.56.0402

Cyclic stress analysis of polyester, aramid, polyethylene and liquid crystal polymer yarns

Felipe Vannucchi de Camargo (a,∗), Carlos Eduardo Marcos Guilherme (a), Cristiano Fragassa (b), Ana Pavlovic (b)
(a) POLICAB – Escola de Engenharia, Universidade Federal do Rio Grande, Av. Itália km 08, Campus Carreiros, 96203-900 Rio Grande/RS, Brazil
(b) DIN – Department of Industrial Engineering, Alma Mater Studiorum University of Bologna, Viale Risorgimento 2, 40136 Bologna, Italy
(∗) Corresponding author: fevannucchi@gmail.com

Abstract.
Mooring ropes used in offshore oil platforms are exposed to a set of extreme environmental conditions that can be crucial to their behaviour in service. Considering the elevated mechanical demands imposed on these ropes by both the undersea environment and the station keeping of the vessel, this paper focuses on the experimental determination of the fatigue behaviour of the yarns. In order to foresee and compare their general wear rates, a diagram correlating the force to which the specimens are submitted with the number of cycles to failure is obtained for each material. The analyzed fibres are polyester, aramid, polyethylene and liquid crystal polymer (henceforth quoted as PET, AR, PE and LCP, respectively); the work followed a pattern composed of a fixed test frequency and an established maximum stress for the diagrams.

Keywords: fatigue; polymer; synthetic fiber; offshore oil platform; ultra-deep water.

1. Introduction

The discovery of oceanic oil basins, specifically those in ultra-deep water regions (with depths over 2000 meters), has been both exciting news for the oil industry and a challenge for engineers worldwide, entrusted with the endeavour of making oil extraction feasible. Previous technological boundaries had to be pushed, including those related to mooring systems, since the undersea environment, with its intense pressure gradient and other environmental influences, is particularly severe for the system's components and plants. Simultaneously, the industrial usage of polymers has grown for decades due to their light weight and good mechanical properties, as exemplified by polyethylene used as a matrix for wood-reinforced composites, providing enhanced creep resistance [1], aramid used for structural components in racing cars [2], and polyvinylidene difluoride (PVDF) reinforced composites used for parts of high-speed woodworking machines [3].

Figure 1. Pivoted mooring system with synthetic ropes used on an FPSO oil platform.

This growing trend strengthened the concept of using synthetic ropes to moor oil platforms, since long steel cable lines would bring along factors such as high maintenance fees and high weight. Figure 1 shows a synthetic mooring line used on an FPSO platform through a pivoted turret anchoring layout. This work is focused on the case of semi-submersible platforms: along with the material substitution, a change in the mooring configuration itself was implemented, replacing the catenary system by the taut-leg system. In the catenary system, the end of the line is laid entirely on the seabed, causing exclusively horizontal stress on it. In the taut-leg mode, however, the mooring line is always axially tensioned, exposing the importance of a proper experimental knowledge of the mechanical behaviour of polymeric materials, which is encouraged by the literature [4]. An illustration of the aforementioned mooring configurations is shown in Figure 2.

Figure 2. Comparison between the mooring radii of two layouts: catenary (a) and taut-leg (b).

Figure 3. The yarn scale compared to superior aggregations.

With the advent and large-scale usage of the taut-leg mooring system over the catenary system, the ropes are constantly under tension, either with or against the wind, intermittent forces (i.e., waves) and stream influences, characterizing a fatigue condition.
Therefore, the aim of this paper is to provide an experiment-based prediction of the resistance of the main commercial synthetic fibres currently used for this application, by analyzing their endurance under certain cyclic loads. Seeking to predict the mechanical behaviour of these fibres, research has been carried out to evaluate wear mechanisms such as creep at low temperatures [5] and stiffness [6] of these yarns. Thus, to enhance the technical analysis in this paper, other tests besides fatigue were conducted, including creep, tensile and linear density tests (the latter two allowing the determination of the linear tenacity).

2. Material and methods

2.1. Fatigue

Components of a mechanical system may be exposed to cyclical efforts inherent to their function, and are thereby subjected to structural fatigue wear. Unlike in metals, where fatigue is identified by crack propagation, this phenomenon has different characteristics in polymeric materials. In this case, the sensitivity to wear is greatly increased by factors such as the mode, rate and amplitude of loading; frequency, temperature, relative humidity [7], and environmental pH [8, 9]. In real operating conditions, sea water represents a considerable degradation factor for the properties of synthetic fibres [10]. Within this application, fatigue is the main degradation mechanism of the synthetic fibres [4], rather than factors such as surface wear, tensile and structural imperfections and environmental factors, which can be corrected with a carefully planned rope design. The fatigue effect on these fibres can be further subdivided into two subsequent steps of resistance decrease [4]: hysteresis heating and axial compression [11].

Figure 4. Specimen samples with socketed end in detail.

2.2. Feasibility of the experimental phase

Polyester ropes are known to have a resistance to fatigue equal to or better than that of steel cables [4]. The life of a polyester cable subjected to continuous cycling between 70 % and 0 % MBL (maximum breaking load) is approximately 100 000 cycles. When the peak load is decreased to 60 % MBL, the endurance rises to 1 million cycles [4]. Therefore, chronological and economic feasibility is a concern for this work, since accelerating the test through frequencies higher than those of operation would produce results unfaithful to reality [7].

2.3. Testing parameters

The establishment of a test script suitable for polymeric materials was necessary, since the literature fails to present one for non-metallic materials [7]. The standard specimen scale is the yarn, due to the availability of technical references such as standards and papers. The mooring ropes used for offshore station keeping are a group of twisted sub-cables, which in turn are composed of twisted strands; a visual description of the yarn scale compared to a strand is shown in Figure 3. The specimens (see Figure 4) were 500 mm long, untwisted (with their fibres in a parallel configuration), with cardboard-sandwich resin-socketed ends [12] (because of their superior performance over other alternatives [4]). The testing machines used in this work are:
• Ohaus Adventurer AR2140: electronic precision scale with a resolution of 1×10⁻⁴ g, used for the linear density tests.
• Instron 3365: pneumatic, used for the tensile and creep tests, with a pneumatic clamp.
• Instron 8801: servo-hydraulic, used for the fatigue tests, with a mechanical screw-tightening clamp.
Figure 5. Commercial pictures of the three testing machines used for this work: Ohaus scale (a), Instron 3365 (b) and Instron 8801 (c).

For all experiments, specimens were obtained by extracting samples of yarn directly from the manufacturer's coil using a manual reference tension, without direct hand contact with the material [13–15]. After that, the specimens remained for the indicated time in a standard atmosphere [16] of 20 ± 2 °C and 65 ± 4 % RH until the time of testing, during which the air flow was controlled to avoid the Gough–Joule effect [13]. In order to determine the main mechanical characteristics of the yarns of each material, preliminary tests of linear density, tension and creep were conducted.

Linear density (LD) specimens, unlike all the others, were one meter long and had no socketed ends. Ten specimens per material were weighed on the scale, with a stabilization time of 5 minutes per test. The tensile tests were performed with a pre-load of 1 newton. The loading rate was imposed as a percentage of the specimen length per minute: 50 % for AR (250 mm/min) and 100 % for the other materials (500 mm/min) [13], providing a loading rate low enough to avoid the wear effect caused by impact. Thirty specimens of each material were tested. The creep tests started with a pre-load of 1 newton, followed by a static load ramp of 250 N/min and a force-based hold at 80 % or 90 % of the average yarn breaking load (YBL) determined by the tensile tests. For each of the two creep loads, 3 specimens were tested for each material, i.e., 24 tests were carried out. The pre-load applied before the tensile, creep and fatigue tests is a standardized parameter [13], important to soften any impact-related wear; in other polymeric materials such as composites, for example, it has been shown that the application of a pre-load decreases the wear caused by impact [17].

Once the results of these preliminary tests are available, the linear tenacity (LT), expressed in [N/tex], can be calculated from them through the following equation:

$$\mathrm{LT}\ [\mathrm{N/tex}] = \frac{\mathrm{YBL}\ [\mathrm{N}]}{\mathrm{LD}\ [\mathrm{tex}]} \quad (1)$$

Regarding fatigue, as quoted above, an experimental loading simulation totally faithful to the conditions of a mooring rope operation is impractical and possibly inconclusive, due to the small sample that could be tested. To back up this premise, one can quote the loading condition ranging from 50 % to 40 % YBL, in which a polyester rope would take thousands of years to fracture according to numerical simulations [4]. Nowadays, mooring ropes are designed to withstand utmost loadings: in a severe storm condition the rope might cycle with an amplitude of 15 % MBL and a mean stress of 30 % MBL. It is also known that, even though fatigue is the main wear mechanism of these ropes, severe storms are considerably rare and do not represent a major concern. Instead, the leading wear factor is the cycling at small amplitudes and frequencies that goes on uninterruptedly over the 20 years of operation of a rope [4]. Accordingly, it is clear that fatigue wear rates should be better understood in order to provide an optimal rope design, backing up the importance of introducing an experimental fatigue study [4].
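Both the linear-tenacity relation (1) and the loading programme adopted in the next subsection are simple to express in code. The sketch below checks (1) against the PET row of Table 2 and derives the mean load and amplitude of each loading range from the fixed 90 % YBL peak and the varied trough load (see Table 1 below).

```python
# Linear tenacity (eq. 1) and the cyclic loading programme (Table 1).
def linear_tenacity(ybl_n, ld_tex):
    """LT [N/tex] = YBL [N] / LD [tex], eq. (1)."""
    return ybl_n / ld_tex

# Check against the PET row of Table 2: expected ~0.733 N/tex.
print(f"PET: LT = {linear_tenacity(165.441, 225.64):.3f} N/tex")

PEAK = 90.0  # % YBL, fixed peak load of every loading range
for trough in range(10, 90, 10):
    mean = (PEAK + trough) / 2.0        # mean load [% YBL]
    amplitude = (PEAK - trough) / 2.0   # load amplitude [% YBL]
    print(f"trough {trough:2d} % -> mean {mean:.0f} %, amplitude {amplitude:.0f} %")
```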
Therefore, a frequency of 0.1 Hz, used in certification tests [18], was adopted for this work, along with loads simulating extreme conditions, in pursuance of obtaining tangible experimental data: keeping the peak load at 90 % YBL as a fixed parameter, 8 loading ranges were studied through the variation of the trough load from 10 to 80 % YBL in steps of 10 % YBL, as Table 1 indicates.

Table 1. Loading ranges analyzed, in % YBL.

Trough load   Mean load   Amplitude
10            50          40
20            55          35
30            60          30
40            65          25
50            70          20
60            75          15
70            80          10
80            85          5

The fatigue test is force-controlled (CRL, constant rate of loading) and consists of a pre-tension of 1 newton [13], followed by a ramp with a controlled rate [13] which rises until the force corresponds to the first quadrant of the sinusoidal cycling to which the material is subjected, as shown in Figure 6. The sinusoidal wave shape was chosen because of its smoother loading-unloading transition when compared to other wave shapes, such as triangular, saw-tooth and square, which would cause enhanced impact-related wear. In total, 96 specimens were submitted to fatigue tests: 24 specimens per material, 3 per load range considered.

Figure 6. Illustration of the fatigue test.

3. Results and discussion

The preliminary test results are shown in Table 2.

Table 2. Results of preliminary tests.

Material   LD [tex]   YBL [N]   LT [N/tex]   CR at 90 % YBL [h]   CR at 80 % YBL [h]
PET        225.64     165.441   0.733        0.32                 35.886
AR         358.68     641.250   1.788        0.479                31.028
PE         170.69     552.175   3.235        0.0178               2.113
LCP        170.70     359.911   2.108        0.036                2.814

Finally, to define the behaviour of the materials when subjected to sheer fatigue, curves were built relating the logarithm of the number of cycles to rupture (log N) to the trough force of the cyclic loading, expressed in % YBL. The highlighted points represent the average number of cycles for each load range, and the trend curves were selected according to the best correlation coefficient (R²) possible (even though the fitting functions may be distinct among the fibres), in order to lend statistical significance to Figures 7–10. Figure 11 displays the fatigue curves of the four materials studied altogether. Also, Table 3 shows the average number of cycles to rupture for each material and each load range.

3.1. Preliminary tests

On the one hand, the tensile tests revealed that the bigger the YBL, the more fatigue-resistant the material: the direct relation between the two tests is evidenced by the fact that the resistance order of the materials (AR, PE, LCP, PET, decreasingly) is the same in both experiments. On the other hand, the linear tenacity results clarified that the strength-to-weight ratio is not determinant for predicting the fatigue resistance, since the two show no mutually proportional relation whatsoever. The creep resistance does not have a proportional relation with the yarn breaking load either: even though PET and AR present the lowest and the highest yarn breaking loads, respectively, both materials have the highest resistance to creep among the four analyzed materials, with wide superiority over the others.

3.2. Fatigue

The exponential and polynomial behaviours of PET and LCP, respectively, follow the same tendency: their fatigue resistance decreases as the loading amplitude decreases. With the most significant average fatigue resistance, AR presents an optimal loading amplitude for endurance of approximately 15 % to 20 % YBL. Its average resistance draws attention: being far superior to the others, it is about fifteen times higher than the second highest average (see Table 3).
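The "about fifteen times" figure, and the total averages themselves, can be checked directly from the per-range values reported in Table 3 (transcribed below from the table on the next page of the paper).

```python
# Row averages of Table 3 and the AR-to-PE ratio quoted above.
CYCLES = {  # average cycles to failure, trough loads 10-80 % YBL
    "PET": [36, 79, 36, 26, 13, 57, 11, 13],
    "AR":  [600, 834, 1167, 1352, 993, 1947, 776, 660],
    "PE":  [51, 120, 80, 71, 131, 80, 24, 5],
    "LCP": [143, 146, 62, 18, 20, 8, 30, 9],
}

averages = {m: sum(v) / len(v) for m, v in CYCLES.items()}
for m, avg in averages.items():
    print(f"{m}: total average ~= {avg:.0f} cycles")  # 34, 1041, 70, 55

ratio = averages["AR"] / averages["PE"]  # PE has the second highest average
print(f"AR / PE ~= {ratio:.1f}")         # ~14.9, i.e. 'about fifteen times'
```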
similarly to ar, pe demonstrates an optimal loading amplitude of about 20 % ybl, at the corresponding loading of 90 %–50 % ybl. nevertheless, both fibres present smaller resistances to fatigue at high and low loading amplitudes (of 40 % ybl and 5 % ybl), backed up by a polynomial tendency.

4. conclusions
the presented work showed that it is feasible to carry out an experimental study on the fatigue of synthetic fibres, although, as numerical simulations indicated, it involves extensive testing routines.

figure 7. fatigue trend of polyester.
figure 8. fatigue trend of aramid.
figure 9. fatigue trend of polyethylene.
figure 10. fatigue trend of liquid crystal polymer.
figure 11. comparative fatigue diagram.

trough load [% ybl]   10    20    30    40    50    60    70    80    total avg.
pet                   36    79    36    26    13    57    11    13    34
ar                    600   834   1167  1352  993   1947  776   660   1041
pe                    51    120   80    71    131   80    24    5     70
lcp                   143   146   62    18    20    8     30    9     55
table 3. average number of cycles to failure by fatigue.

also, despite the tests being carried out exclusively under extreme loading conditions, it is possible to observe an endurance tendency particular to each studied material when submitted to cyclic stress. the fibres pet and lcp present a similar tendency, having their cyclic endurance decreased as the amplitude decreases, while ar and hmpe both have an optimal endurance amplitude of about 20 % ybl. despite this difference, all four materials have their average fatigue resistance decreased at low loading amplitudes. the results obtained for pet must be taken into account as a representation of a real polyester rope behaviour, since experimental tests with yarns are known to provide results faithful and proportional to what an actual rope would produce [19].

5. future work
due to time and cost feasibilities, the eight cyclic loading tracks studied had a fixed peak load of 90 % ybl for each fibre. therefore, as the behaviour of the materials studied is still unknown for other loading conditions, it is suggested that the same study be carried out with other fixed peak loads in order to analyse whether the behaviour of all fibres remains the same on both occasions: when compared with each other, and when analyzed individually. also, the same analysis aimed at one specific fibre from different manufacturers could verify whether the mechanical behaviour is inherent to the material only, or whether the manufacturing process plays a significant role in the fatigue resistance of the polymer. the influence of the specimen scale might as well be studied by carrying out the same tests under identical proportional loading conditions and with the same materials. always supported by proper standard testing regulations for the scale chosen (i.e., fibre, strand, sub-cable), the discovery of a constant or function that defines the change in fatigue resistance with the number of yarns associated would be a great gain in applying the research to industry needs, since mathematical extrapolations could then be constructed to predict, in the best-case scenario, the behaviour of an actual full-scale rope.

list of symbols
pet polyester
ar aramid
pe polyethylene
lcp liquid crystal polymer
mbl maximum breaking load [n]
ybl yarn breaking load [n]
rh relative humidity
cr creep resistance [h]
ld linear density [tex]
lt linear tenacity [n/tex]
crl constant rate of loading
n number of cycles
r2 correlation coefficient

acknowledgements
this investigation was realized with the contribution of the brazilian company petrobras and the anp (national petroleum agency).

references
[1] zivkovic, i., pavlovic, a., fragassa, c.: improvements in wood thermoplastic composite materials properties by physical and chemical treatments. in: international journal of quality research (editors: z. krivokapic and s. arsovski), 2016, 10, p. 205–218.
[2] dawson, d.: focus on design: solar-powered composite car designed to win. in: high-performance composites, september 2007.
[3] fotouhi, m., saghafi, h., brugo, t., minak, g., fragassa, c., zucchelli, a., ahmadi, m.: effect of pvdf nanofibers on the fracture behavior of composite laminates for high-speed woodworking machines. in: proceedings of the institution of mechanical engineers, part c: journal of mechanical engineering science, 2016, doi: 10.1177/0954406216650711.
[4] bosman, r. l.: on the origin of heat build-up in polyester ropes. fort lauderdale: oceans conference, 1996.
[5] husak, g. s., chimisso, f. e. g.: construction of a device to test creep behavior of synthetic multifilaments, submerged in cold water, used in offshore mooring systems, and results. brasov: 11th youth symposium on experimental solid mechanics, 2012.
[6] stumpf, f. t., guilherme, c. e. m., chimisso, f. e. g.: preliminary assessment of the change in the mechanical behavior of synthetic yarns submitted to consecutive stiffness tests. in: acta polytechnica ctu proceedings, 2016, 3, p. 75.
[7] zoroufi, m.: significance of fatigue testing parameters in plastics versus metals. jacksonville: 13th international astm/esis symposium on fatigue and fracture mechanics, 2013.
[8] fragassa, c.: investigations into the degradation of ptfe surface properties by accelerated aging tests. in: tribology in industry (editor: slobodan mitrovic), 2016, 38, no. 2, p. 241–248.
[9] giorgini, l., fragassa, c., zattini, g., pavlovic, a.: acid aging effects on surfaces of ptfe gaskets investigated by fourier transform infrared spectroscopy. in: tribology in industry (editor: slobodan mitrovic), 2016, 38, no. 3, p. 286–296.
[10] poodts, e., minak, g., zucchelli, a.: impact of sea-water on the quasi static and fatigue flexural properties of gfrp. in: composite structures (editor: a. j. m. ferreira), elsevier, 2013, 97, p. 222–230.
[11] mckenna, h. a., hearle, j. w. s., o'hear, n.: handbook of fibre rope technology. woodhead publishing ltd and crc press llc, 2004.
[12] pfarrius, j. d. et al.: theoretical and experimental modeling of a socket sandwich for use in tension tests of synthetic ropes. vrnjacka banja: 6th youth symposium on experimental solid mechanics, 2007.
[13] american society for testing and materials, astm d885: standard test methods for tire cords, tire cord fabrics, and industrial filament yarns made from manufactured organic-base fibers. west conshohocken, 1998.
[14] associacao brasileira de normas tecnicas, nbr 13214: determinacao de titulo de fios. rio de janeiro, 1994.
[15] international standardization organization, iso 2060: textiles: yarns from packages, determination of linear density. geneva, 1994.
[16] international standardization organization, iso 139: textiles: standard atmospheres for conditioning and testing.
geneva, 2005.
[17] saghafi, h., brugo, t. m., zucchelli, a., fragassa, c., minak, g.: comparison of the effect of preload and curvature of composite laminate under impact loading. in: fme transactions (editor: bosko rasuo), 2016, 44, p. 353–357.
[18] det norske veritas, dnv: yarns for offshore mooring fibre ropes. hovik, 2010.
[19] rossi, r. r.: cabos de poliester para ancoragem de plataformas de petroleo em aguas ultraprofundas. m.sc. thesis in oceanic engineering, universidade federal do rio de janeiro, rio de janeiro, 2002.

acta polytechnica 56(5):402–408, 2016

stellar image interpretation system using artificial neural networks: ii – bi-polar function case
a. el-bassuny alawy, f. i. y. elnagahy, a. a. haroon, y. a. azzam, b. šimák

abstract. a supervised artificial neural network (ann) based system is being developed employing the bi-polar function for identifying stellar images in ccd frames. it is based on feed-forward artificial neural networks with error back-propagation learning. it has been coded in c language. the learning process was performed on a 341 input pattern set, while a similar set was used for testing. the present approach has been applied on a ccd frame of the open star cluster m67. the results obtained have been discussed and compared with those derived in our previous work employing the uni-polar function and by a package known in the astronomical community (daophot-ii). full agreement was found between the present approach, that of elnagahy et al, and the standard astronomical data for the cluster. it has been shown that the developed technique resembles that of the uni-polar function, possessing a simple, much faster yet reliable approach. moreover, neither prior knowledge of, nor initial data from, the frame to be analysed is required, as it is for daophot-ii.

keywords: neural networks, knowledge-based system, stellar images, image processing.

1 introduction
not only is stellar astronomy the oldest topic in astronomical studies, but it continues to be of importance in astronomical research. this is because stars are the principal objects from which others are formed (binary and multiple star systems, star clusters and galaxies). stellar imaging has become an essential and effective technique in astronomy since the invention of photographic plates. its importance increased with the advent of sophisticated low-light-level two-dimensional detectors such as image intensifier tubes (iits) and charge coupled devices (ccds). by the end of the 1970s, astronomical optical observations had made a great leap with the use of cameras equipped with ccd chips after their invention at bell labs by w. boyle and g. smith [1]. this may be attributed to reasons such as superior quantum efficiency, large dynamic and spectral sensitivity ranges, fairly uniform response, high linearity and relatively low noise in comparison with other detectors, particularly photographic plates. in addition, digital image data is directly accessible, with no need for measurement or for cumbersome and imprecise calibration. the invention of ccd chips and their use coincided with great advances in electronic computing machines. as a consequence of these advances, a tremendous number of astronomical images has been acquired that needs reliable, precise and fast reduction techniques and approaches. in the astronomical community various methods have been designed, developed, coded and applied. these are based on viewing the star image through a mathematical model [2, 3, 4], an empirical model [5, 6, 7] or a semi-empirical model [8, 9]. the codes based on these models, usually a bivariate gaussian function, require a user-computer interface facility for providing the form of the model adopted and also for setting the initial values of the model parameters. several runs of such codes are necessary to optimise and derive the final set of parameters, through some non-linear fitting process, to be applicable for ccd frame reduction. such circumstances require a fast computing machine provided with a large memory and working space area, as well as an expert user. however, incorrect and/or false results are possible, mainly due to imprecise parameter estimates and/or improper user intervention. recently, we have developed two approaches employing artificial intelligence techniques to recognise stellar images [10], besides deriving all relevant astronomical data [11]. in our previous communication [12], hereinafter paper i, an artificial neural network based system employing a unipolar function was proposed.
very good agreement was achieved between the results of our system and those obtained by applying the most widely used software in the astronomical community, daophot-ii [13], through application on a test case (a ccd frame of the star cluster m67). in addition, exact coincidence was found between the results of paper i and the cluster standard data found in the astronomical literature and databases. in the present work, a bipolar function has been adopted and applied on the same frame, and the outcome has been investigated and compared with the previous ones.

2 the problem and the present method
2.1 the problem
as known and outlined in paper i, a ccd frame may contain entities that are images of astronomical objects (star, galaxy, etc.) as well as those caused by other sources (cosmic rays, noise, etc.). all of them are gathered at the same time and under the same atmospheric and instrumental conditions. after acquiring an image, the frame needs to be reduced, first by identifying the stellar images among the others and then by deriving their positions and magnitudes. the present work concerns the first step. this has been realised by a supervised artificial neural network based system (anns) as a discriminating approach.

2.2 the architecture of the present anns
the anns used here is similar to that adopted in paper i. it comprises the two-layer feed-forward network depicted in fig. 1. it includes four and three neurones for the first (hidden) layer and the second (output) layer, respectively. the input is z_i (for i = 1 to 24), yielding the hidden layer weight matrix as v(4×24) and that of the output as w(3×4). these are illustrated in the figure, which also shows the biases (v_{1,0}, v_{2,0}, v_{3,0} & v_{4,0} and w_{1,0}, w_{2,0} & w_{3,0}) for the neurones of the layers. the discriminating function adopted is one of the known bi-polar functions, given by the hyperbolic tangent function as

f(x) = (1 − e^(−λx)) / (1 + e^(−λx)),   −1 < f(x) < 1,

where λ (the descending factor) is an arbitrarily small positive value.
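for readers who prefer code to formulas, a minimal sketch of this activation follows. python is used here purely for illustration (the authors' system is coded in c), and the algebraic identity f(x) = tanh(λx/2) serves as a built-in check.

```python
# a minimal sketch of the bipolar activation just defined:
# f(x) = (1 - exp(-lam*x)) / (1 + exp(-lam*x)), which is algebraically
# identical to tanh(lam*x/2) and maps any input into the open interval (-1, 1).
import math

def bipolar(x: float, lam: float = 1.0) -> float:
    return (1.0 - math.exp(-lam * x)) / (1.0 + math.exp(-lam * x))

for x in (-5.0, -1.0, 0.0, 1.0, 5.0):
    assert abs(bipolar(x) - math.tanh(x / 2.0)) < 1e-12
    print(f"f({x:+.1f}) = {bipolar(x):+.6f}")
```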
first, the weights for this function have to be determined through the learning process on the selected pattern set. secondly, and before application, both the function adopted and the weights derived have to be verified against a similar known set. finally, the function and its weights can be applied to an unknown ccd frame to be reduced for the purposes of discrimination.

2.3 learning and test input patterns
for the learning and test tasks, we adopted the same two pattern sets used in paper i. in each set, a total of 341 input patterns are included (119 stars, 111 cosmic ray events and 111 noise patterns). it is obvious that the pattern sets are of large size and almost equally distributed over the three entities. this assures proper learning and reliable weight determination. for the purpose of learning, the set patterns were randomised and fed to the anns. each pattern consists of the data of a 5×5 pixel array centred at the peaked pixel, taken from some ccd frames. the pattern samples were selected to represent stars of different brightness and events of cosmic rays with different energies, as well as various noise patterns. in such a case, the data of the central pixels of the patterns differs widely. in order to perform the learning process properly, this data has to be scaled. this was achieved, for each pattern, by normalising the data of all pixels to that of the central pixel, which assigns unity to the central pixel of every pattern. fig. 2 demonstrates the raw data, the image and the 3-d view of a sample of these patterns. for the images and 3-d views, normalised data is used. it is evident from the pattern sets, as illustrated in the figure, that:
a) for a star, the data shows a gradual decrease from the centre outward, causing an extended bright area to the limits of the image and a pronounced (but not extreme) sharp peak in the 3-d view.
b) for a cosmic-ray sample, the central pixel datum is much larger than the data of the other 24 pixels, which are very close to each other, with values much lower than those at the centre. this provides a very sharp localised peak in the 3-d view and highly concentrated brightness at the centre of the image.
c) for noise, the data of the array is more or less close to each other and distributed randomly, causing some peaks of low height in the 3-d view, and a featureless image.

fig. 1: the architecture of the adopted anns

these discriminative characteristics of the three identities are very helpful in applying anns.

2.4 training error and learning factors
the training error and the learning factors adopted here are those described in detail by one of us elsewhere [10] and summarised in [12]. among the different types of errors workable in ann, the decision error is invoked to terminate the network training process. it was computed for the entire batch of training pattern samples (p) via

e_d = n_err / (p k),

where k is the threshold output over one cycle (set as 3) and n_err is the total number of bit errors that resulted. regarding the learning mode, some precautions were undertaken for the learning factors in order to avoid the pitfalls generally associated with error minimisation techniques, such as instability, oscillation, divergence and shallow local minima.
first, the network weights were initialised randomly to positive values between 0.0 and 0.1, while negative values between these limits were assigned to the biases for the hidden and output layer neurons, v_{i,0} and w_{i,0}. secondly, the learning constant was set to a value of 0.01, which was found to accelerate the convergence without overshooting the solution. finally, the momentum term was set to 0.5 to speed up the convergence of the error back-propagation learning algorithm. the learning factor and the momentum term values are in accordance with the similar values used in paper i.

2.5 training the present anns
table 1 lists the desired three components of the output patterns, o_i, designated for the star, noise and cosmic event entities. in the learning mode, many cycles were performed through the adopted patterns. at the beginning of any cycle, n_err was set to zero and all input patterns' data were mapped to the anns sequentially. for each cycle, the first pattern's data was fed to the anns and the three outputs (i.e., the actual output) were computed. for each one of these outputs the associated error was calculated. the total number of bit errors (i.e., n_err) is increased if the difference between the desired output (table 1) and the actual output is equal to (or greater than) 0.1; then the weights of the output and the hidden layers are adjusted, respectively. this process is then applied to the next pattern, up to the last one, ensuring that the error of each pattern is computed. at the end of the cycle, the decision error (e_d) is computed. if e_d is not equal to zero, the whole cycle is repeated until e_d is zero, and hence the learning task is completed. the decision components of the output patterns, o_i, are given in table 2 for the three entities (star, noise and cosmic ray event). these are used together with the adopted bi-polar function and the derived weights in order to classify the unknown frame entities. in such a case, when a local central peaked pixel (lcpp) is found, the 5×5 pixel array data is extracted and the anns is employed to compute the relevant outputs, applying the following:
• if o1 > 0.9, o2 < 0.1 and o3 < 0.1, then the object is a star image.
• if o1 < 0.1, o2 > 0.9 and o3 < 0.1, then the object is noise.
• if o1 < 0.1, o2 < 0.1 and o3 > 0.9, then the object is a cosmic event.

fig. 2: raw data, image and a 3-d view of star, cosmic and noise samples of the input pattern

2.6 implementation and test
the present approach was developed to scan the ccd frame to be reduced in search of any lcpp whose content is larger than those of the four adjacent pixels in the cardinal directions (see [10, 11]). if such a case is found, the data of the 5×5 pixel array is then normalised with respect to the lcpp datum and mapped to the ann system. the outputs obtained are compared to the decision outputs listed in table 2 to identify the image identity type as star, noise or cosmic ray event within a value of 0.1, which has been adopted as the tolerance limit for learning (sec. 2.5). the present ann system has been coded in c language and has been tested through application to the test pattern set (sec. 2.3). exact agreement was found between the entities recognised and the prior knowledge of the patterns adopted.
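a compact sketch of this scan-and-classify loop is given below. it is a hypothetical reconstruction in python, not the authors' c program: net_forward stands in for the trained 24-4-3 network, and dropping the (always-unity) central pixel to obtain the 24 inputs z_i is our reading of sec. 2.2.

```python
# a minimal sketch of sections 2.5-2.6: find each local central peaked pixel
# (lcpp), normalise its 5x5 neighbourhood to the peak value, feed it to a
# trained network and apply the table 2 decision rule.
import numpy as np

def classify(outputs):
    """table 2 decision rule; 'outputs' is (o1, o2, o3) from the network."""
    o1, o2, o3 = outputs
    if o1 > 0.9 and o2 < 0.1 and o3 < 0.1:
        return "star"
    if o1 < 0.1 and o2 > 0.9 and o3 < 0.1:
        return "noise"
    if o1 < 0.1 and o2 < 0.1 and o3 > 0.9:
        return "cosmic ray"
    return "unclassified"

def scan_frame(frame, net_forward):
    """yield (row, col, label) for every lcpp in a 2-d ccd frame.
    net_forward is assumed to map 24 normalised pixels to (o1, o2, o3)."""
    rows, cols = frame.shape
    for r in range(2, rows - 2):
        for c in range(2, cols - 2):
            centre = frame[r, c]
            # lcpp test: larger than the four cardinal neighbours
            if all(centre > frame[rr, cc] for rr, cc in
                   ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))):
                patch = frame[r - 2:r + 3, c - 2:c + 3] / centre  # normalise
                z = np.delete(patch.ravel(), 12)  # drop central pixel -> 24 inputs
                yield r, c, classify(net_forward(z))
```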
3 application
after testing the present anns, it was applied on a standard ccd frame previously analysed by different methods, in order to verify the capability and limitations of the present anns in comparison with these methods and the known standard data.

3.1 the case adopted
a ccd frame of the star cluster m67 [14] was adopted for applying the present anns, for the following reasons:
a) the cluster is one of the standard, well-known and well-studied star clusters, where precise data on the stars in its vicinity is available from several publications and from an accurate database. the region of the cluster imaged in the frame includes faint, intermediate and bright stars, while some stars are close together and some are far apart.
b) the frame can be considered as an ideal case for stellar ccd imaging, where the images of the stars are generally circular. it was obtained together with the widely used software daophot-ii [13] and the relevant and necessary auxiliary files. hence, reduction of the frame using this code can be achieved properly.
c) the frame was reduced by two other approaches, through a knowledge-based system (kbs) [10, 11] and an anns employing the uni-polar function [12].
these reasons facilitate an objective comparison of these approaches. the frame is 320×350 pixels, acquired for a period of 30 seconds through the visual optical band, where ccd chips have maximum quantum efficiency. the full well capacity value is 16252 adu (analog-to-digital converter unit).

3.2 results and discussion
the selected case frame was reduced by applying the present anns approach as well as that of paper i and the daophot-ii code. the last two approaches identify 134 and 137 star images, respectively. a comparison between, and a discussion of, these findings are given in paper i. the execution of the present technique on the frame showed that:
a) the maximum pixel datum is 16252 adu, where saturation occurred.
b) the frame background level is 21.0 (p.e. ±1.0).
c) three cosmic events have been identified.
d) the frame contains 131 stellar images.
the second finding (i.e., the frame background level) is in good agreement with that derived previously as 21.1 [see 11]. the third finding is exactly like that identified by [10, 11]. the computing time is similar to that needed for paper i, i.e., 45 seconds employing a pentium ii pc (233 mhz processor) for scanning the image, finding the data limits of the pixels, displaying the frame via the monitor, recognising stellar images and cosmic events and saving the output in the relevant files. during the recognition step, each star image found was marked by an open circle, while an open square is displayed around each cosmic event. in the following, the present results are discussed and compared with those obtained by the other two methods.

3.2.1 the bi-polar and the uni-polar function anns approaches
fig. 3 shows the results of applying the two codes. the two approaches agree in identifying the three cosmic events. all stellar images recognised in the present work were identified when the uni-polar function was adopted in paper i. three more stellar images were found by applying the latter approach. for these images, it is worthwhile to state that:
a) the central pixel data are 33, 35 and 35 (see table 3). these values are very close to the frame background level (21) and very far from the saturation value (16252). this implies that these are very faint stars having a very low peak-to-background ratio (compare this data with the data in fig. 2).
hence they are of much less astronomical importance.
b) the star discriminator output element o1 has the values 0.91, 0.92 and 0.94. these values are lower than those of other stars, which are generally larger than 0.99.

            o1   o2   o3
star         1    0    0
noise        0    1    0
cosmic ray   0    0    1
table 1: desired output values for star, noise and cosmic ray event

            o1      o2      o3
star       > 0.9   < 0.1   < 0.1
noise      < 0.1   > 0.9   < 0.1
cosmic ray < 0.1   < 0.1   > 0.9
table 2: decision output values for star, noise and cosmic ray event

3.2.2 the bi-polar function anns approach versus the daophot-ii code
as stated above, applying the daophot-ii code on the test case frame leads to the recognition of 137 star images; out of these, 131 images were identified by the present approach, showing very good agreement. these common images are displayed in fig. 4 by filled circles. the remaining six images are shown with different symbols, on which some comments are given. first, three images (denoted by filled squares) are centred at the first or second row (column), close to the frame border. the present approach is not applicable there, since no 5×5 pixel array data is available. nevertheless, such locations imply that the stars are partially imaged, and hence the images are of no importance from the astronomical point of view. secondly, two other images (denoted by open circles) are assigned by the daophot-ii code where no images could be found by the present code. at this position, only one very bright star can be seen from the palomar observatory sky survey (poss) and the cluster identification charts found in the literature (e.g. [15, 16, 17]). because of the high brightness of the star, the four central pixels are saturated, having the full well capacity value of 16252 (see table 4a). another such case was found and designated by an open triangle in fig. 4. in such cases the local area of the frame is treated as two overlapped images by the daophot-ii code, while it is skipped by the present approach because no lcpp is found, due to saturation. finally, the sixth image is at the frame top-left corner and denoted by an open square. on the one hand, one image was identified in the present work. on the other hand, the other code assigned two images at almost the same position. according to poss and the cluster charts, there is only one star in this position. inspection of the data in the pixels shows that the star image departs slightly from circularity, with no radial symmetry around the peaked pixel; the data shows almost two close peaks (see table 4b). due to the principles of the present technique, only one lcpp is adopted, leading to one star image. in the daophot-ii code, two overlapped images are considered.

fig. 3: map of the images of stars recognised in the m67 star cluster frame (north is up and east to the left). see text. filled circles: images identified by both uni-polar and bi-polar functions; open circles: images identified by the uni-polar function only

star no. 1:
21 19 21 23 24
21 27 30 25 21
21 28 33 28 25
24 27 32 30 18
23 22 26 22 21

star no. 2:
25 25 29 26 21
24 25 29 27 22
24 29 35 27 23
21 27 28 31 21
18 21 22 16 23

star no. 3:
21 22 25 24 21
21 27 33 26 23
23 30 35 28 23
24 21 25 21 21
22 23 24 23 21

table 3: 5×5 pixel array data for the three very faint stars
fig. 4: map of images of stars recognised in the m67 star cluster frame (north is up and east to the left). see text. filled circles: images identified by the bi-polar function and daophot-ii; open circles, square and triangle: images identified by daophot-ii only

4 conclusions
some conclusions can be drawn from the present study: both uni-polar and bi-polar functions are good discriminating functions when used in an ann approach to identify stellar images among the other entities in a ccd frame. both possess, in comparison with mathematical or empirical star image modelling, the following advantages:
1) extremely short execution time.
2) better and higher recognition ability.
3) the two methods do not require complicated non-linear fitting computation, user intervention, prior knowledge or initial values for the star model, a fast computer or large free hd space.
4) the ability of the present approach, i.e., to display the image and mark the entities found, is helpful for visual inspection and for evaluating the outcome of the anns technique.
5) the results of the bi-polar function approach agree very well with those obtained in the uni-polar function case, except for the images of the three very faint stars. however, this may be accounted for by reducing the tolerance value adopted for the training error to less than 0.1 (sec. 2.5), in order to enhance the weights of the hidden and output layers.
the two approaches are limited to images centred within the frame up to the third pixel from the frame borders. any image outside such a region (even if it is a star image) is of no astronomical significance, since it is an incomplete image. hence this limitation can be ignored.

references
[1] boyle w. s., smith g. e.: "charge coupled semiconductor devices." bell systems technical journal, vol. 49 (1970), p. 587–593.
[2] blecha r.: "electronographic stellar photometry of globular clusters." astronomy and astrophysics, vol. 135 (1984), p. 401–409.
[3] penny a., dickens r.: "ccd photometry of the globular cluster ngc 6752." monthly notices of the royal astronomical society, vol. 220 (1986), p. 845–867.
[4] mateo m., schechter p.: the daophot two-dimensional photometry program. proceedings of the first eso/st-ecf data analysis workshop, 1989, p. 69.
[5] tody d.: "stellar photometry in crowded fields." spie, vol. 264 (1980), p. 171–179.
[6] lupton r., gunn j.: "m13: main sequence photometry and the mass function." astronomical journal, vol. 91 (1986), p. 317–325.
[7] linde p.: highlights in astronomy. vol. 8 (1989), p. 651.
[8] stetson p.: "daophot - a computer program for crowded-field stellar photometry." publications of the astronomical society of the pacific, vol. 99 (1987), p. 191–222.
[9] gilliland r., brown t.: "time-resolved ccd photometry of an ensemble of stars." publications of the astronomical society of the pacific, vol. 100 (1988), p. 754–765.
[10] elnagahy f. i. y.: msc thesis, faculty of engineering, al-azhar univ., cairo, egypt, 1988.
[11] alawy a. el-bassuny: "stellar ccd photometry: new approach, principles and application." astrophysics and space sciences, vol. 277 (2001), p. 473–495.
[12] elnagahy f. i. y., alawy a. e., šimák b., ella m., madkour m.: "stellar image interpretation system using artificial neural networks: uni-polar function case." acta polytechnica, vol. 41, no. 6 (2001), p. 33–38.
[13] stetson p.: user's manual for daophot-ii. dominion astrophysical observatory, victoria, canada, 1996.
[14] bojan d.: http://david.fiz.uni-lj.si/daophotii, 1996.
[15] eggen o., sandage a.: "new photoelectric observations of stars in the old galactic cluster m67." astrophysical journal, vol. 140 (1964), p. 130–143.
[16] johnson h., sandage a.: "the galactic cluster m67 and its significance for stellar evolution." astrophysical journal, vol. 121 (1955), p. 616–627.
[17] kent a. et al.: "ccd photometry of the old open cluster m67." astronomical journal, vol. 106 (1993), p. 181–219.

a) data for the saturated star image denoted by (d&s) in fig. 4:
79  124  221  335   429   369  181  93
132 251  624  1386  2108  1175 399  145
198 605  2048 5983  8592  3526 720  219
283 1005 4864 16252 16252 6156 1097 263
295 1065 5534 16252 16252 5659 1005 253
195 623  2437 7443  8213  2985 656  203
121 281  733  1653  1679  844  336  132
83  125  224  365   374   261  137  79

b) data for the deformed star image denoted by (d) in fig. 4:
25 22 21 26  27  25 25 21
19 24 26 37  38  31 22 22
17 25 38 64  70  47 25 24
25 22 45 97  100 41 25 21
25 23 38 60  67  37 25 21
22 23 31 29  30  27 19 21
22 22 19 25  25  25 24 19
21 20 19 23  24  22 25 21

table 4: 8×8 pixel array data for the two stars identified by daophot-ii

assoc. prof. ahmed el-bassuny alawy, e-mail: abalawy@hotmail.com
researcher ali abdel wahab haroon, phd, e-mail: alyharoon@hotmail.com
asst. researcher eng. yosry ahmed azzam, msc, e-mail: yosryahmed@hotmail.com
national research institute of astronomy and geophysics (nriag), helwan, cairo, egypt
doc. ing. boris šimák, csc., e-mail: simak@feld.cvut.cz
eng. farag ibrahim younis elnagahy, msc, e-mail: faragelnagahy@hotmail.com
department of telecommunications engineering, czech technical university in prague, faculty of electrical engineering, technická 2, 166 27 praha 6, czech republic

acta polytechnica 57(6):391–398, 2017, doi:10.14311/ap.2017.57.0391

highly accurate calculation of the real and complex eigenvalues of one-dimensional anharmonic oscillators
francisco marcelo fernández∗, javier garcia
inifta (unlp, cct la plata-conicet), blvd. 113 s/n, sucursal 4, casilla de correo 16, 1900 la plata, argentina
∗ corresponding author: fernande@quimica.unlp.edu.ar

abstract. we draw attention to the fact that the riccati-padé method developed some time ago enables the accurate calculation of bound-state eigenvalues as well as of resonances embedded either in the continuum or in the discrete spectrum. we apply the approach to several one-dimensional models that exhibit different kinds of spectra. in particular we test a wkb formula for the imaginary part of the resonance in the discrete spectrum of a three-well potential.

keywords: anharmonic oscillators; bound states; resonances; riccati-padé method; wkb asymptotic expression.

1. introduction
in a recent paper gaudreau et al. [1] proposed a method for the calculation of the eigenvalues of the schrödinger equation for one-dimensional anharmonic oscillators.
in their analysis of some of the many approaches proposed earlier for that purpose, they resorted to statements of the form: "however, the existing numerical methods are mostly case specific and lack uniformity when faced with a general problem." "as can be seen by the numerous approaches which have been developed to solve this problem, there is a beautiful diversity yet lack of uniformity in its resolution. while several of these methods yield excellent results for specific cases, it would be favorable to have one general method that could handle any anharmonic potential while being capable of computing efficiently approximations of eigenvalues to a high pre-determined accuracy." "various methods have been used to calculate the energy eigenvalues of quantum anharmonic oscillators given a specific set of parameters. while several of these methods yield excellent results for specific cases, there is a beautiful diversity yet lack of uniformity in the resolution of this problem." the authors put forward an approach that they termed the double exponential sinc collocation method (descm) and reported results of remarkable accuracy for a wide variety of problems. in fact they stated that "in the present work, we use this method to compute energy eigenvalues of anharmonic oscillators to unprecedented accuracy", which may perhaps be true for some of the models chosen but not for other similar examples. for example, in an unpublished article trott [2] obtained the ground-state energy of the anharmonic oscillator with potential v(x) = x^4 with more than 1000 accurate digits. his approach is based on the straightforward expansion of the wavefunction in a taylor series about the origin. one of the methods mentioned by gaudreau et al. [1] is the riccati-padé method (rpm), based on a rational approximation to the logarithmic derivative of the wavefunction, which satisfies a well-known riccati equation [3, 4]. in their brief analysis of the rpm, the authors did not mention that this approach not only yields the bound-state eigenvalues but also the resonances embedded in the continuum [5]. what is more, the same rpm quantization condition, given by a hankel determinant, produces the bound-state eigenvalues, the resonances embedded in the continuum, as well as some kind of strange resonances located in the discrete spectrum of some multiple-well oscillators [6]. it is not clear from the content of [1] whether the descm is also suitable for the calculation of such complex eigenvalues. the accuracy of the calculated eigenvalues not only depends on the chosen method but also on the available computation facilities and on the art of programming. for this reason, the comparison of the accuracy of results reported in a number of papers spread in time should be carried out with care. the purpose of this paper is two-fold. first, we show that the rpm can in fact yield extremely accurate eigenvalues because it exhibits exponential convergence. to that end it is only necessary to program the quantization condition in an efficient way on a convenient platform. second, we stress the fact that the rpm yields both real and complex eigenvalues with similar accuracy through the same quantization condition. more precisely: it is not necessary to modify the algorithm in order to obtain such apparently dissimilar types of eigenvalues, which are associated with different boundary conditions of the eigensolution. in section 2 we outline the rpm for even-parity potentials.
in section 3 we apply this approach to some of the examples discussed by gaudreau et al. [1] and obtain eigenvalues with remarkable accuracy. in this section we also calculate several resonances supported by anharmonic oscillators that were not taken into account by those authors. we consider examples of resonances embedded in the continuous as well as in the discrete spectrum. finally, in section 4 we summarize the main results and draw conclusions.

2. the riccati-padé method
the dimensionless schrödinger equation for a one-dimensional model reads

ψ′′(x) + [e − v(x)]ψ(x) = 0, (1)

where e is the eigenvalue and ψ(x) is the eigenfunction that satisfies some given boundary conditions. for example, lim_{|x|→∞} ψ(x) = 0 determines the discrete spectrum, and the resonances are associated with outgoing waves in each channel (for example, ψ(x) ∼ a e^{ikx}). in this paper we restrict ourselves to anharmonic oscillators with even-parity potentials v(−x) = v(x), to facilitate the comparison with the results reported by gaudreau et al. [1], but it should be taken into account that the approach also applies to non-symmetric potentials [7]. in order to apply the rpm we define the regularized logarithmic derivative of the eigenfunction

f(x) = s/x − ψ′(x)/ψ(x), (2)

that satisfies the riccati equation

f′(x) + 2s f(x)/x − f(x)^2 + v(x) − e = 0, (3)

where s = 0 or s = 1 for even or odd states, respectively. if v(x) is a polynomial function of x, or it can be expanded in a taylor series about x = 0, then one can also expand f(x) in a taylor series about the origin:

f(x) = x ∑_{j=0}^{∞} f_j(e) x^{2j}. (4)

on arguing as in earlier papers we conclude that we can obtain approximate eigenvalues of the schrödinger equation from the roots of the hankel determinant

h^d_D(e) = det
| f_{d+1}   f_{d+2}    · · ·  f_{d+D}    |
| f_{d+2}   f_{d+3}    · · ·  f_{d+D+1}  |
| ...       ...        ...    ...        |
| f_{d+D}   f_{d+D+1}  · · ·  f_{d+2D−1} |
= 0, (5)

where D = 2, 3, . . . is the dimension of the determinant and d is the difference between the degrees of the polynomials in the numerator and denominator of the rational approximation to f(x) [3–6]. in those earlier papers we have shown that there are sequences of roots e^{[D,d]}, D = 2, 3, . . . of the determinant h^d_D(e) that converge towards the bound states and resonances of the quantum-mechanical problem. we have at our disposal a set of sequences for each value of d, but it is commonly sufficient to choose d = 0. for this reason, in this paper we restrict ourselves to the sequences of roots e^{[D]} = e^{[D,0]} (unless stated otherwise). in this paper we are concerned with anharmonic-oscillator potentials of the form

v(x) = ∑_{j=1}^{k} v_j x^{2j}. (6)

the spectrum is discrete when v_k > 0 and continuous when v_k < 0. in the latter case there may be resonances embedded in the continuous spectrum, which are complex eigenvalues. the real part of any such eigenvalue is the resonance position, and the imaginary part is half its width γ (|im e| = γ/2).

3. examples
four examples chosen by gaudreau et al. [1] are quasi-exactly solvable problems; that is to say, one can obtain exact solutions for some states:

v1(x) = x^2 − 4x^4 + x^6,  e0 = −2
v2(x) = 4x^2 − 6x^4 + x^6,  e1 = −9
v3(x) = (105/64)x^2 − (43/8)x^4 + x^6 − x^8 + x^10,  e0 = 3/8
v4(x) = (169/64)x^2 − (59/8)x^4 + x^6 − x^8 + x^10,  e1 = 9/8. (7)
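before turning to the examples, a small numerical sketch of the quantization condition (5) may help. it is not the authors' implementation: inserting the series (4) into the riccati equation (3) gives the coefficient recursion stated in the comments below, the hankel matrix is built from those coefficients, and a real bound-state eigenvalue is bracketed by a sign change of the determinant. the test case is the quartic oscillator v(x) = x^2 + x^4 treated below.

```python
# a minimal numerical sketch of the rpm (equations (3)-(5)). substituting the
# series (4) into (3) with v(x) = v_1 x^2 + v_2 x^4 + ... gives
#   f_0 = e / (2s + 1),
#   f_j = (sum_{i=0}^{j-1} f_i f_{j-1-i} - v_j) / (2j + 2s + 1),   j >= 1,
# with v_j = 0 beyond the degree of the potential.
import numpy as np

def hankel_det(e, v, s=0, dim=8, d=0):
    """determinant of the dim x dim hankel matrix with entries f_{d+i+j-1}."""
    f = [e / (2 * s + 1)]
    for j in range(1, d + 2 * dim):
        conv = sum(f[i] * f[j - 1 - i] for i in range(j))
        vj = v[j - 1] if j - 1 < len(v) else 0.0
        f.append((conv - vj) / (2 * j + 2 * s + 1))
    m = [[f[d + i + j + 1] for j in range(dim)] for i in range(dim)]
    return np.linalg.det(np.array(m))

def rpm_root(v, lo, hi, **kw):
    """crude bisection; assumes the determinant changes sign on [lo, hi]."""
    flo = hankel_det(lo, v, **kw)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        fm = hankel_det(mid, v, **kw)
        if flo * fm <= 0:
            hi = mid
        else:
            lo, flo = mid, fm
    return 0.5 * (lo + hi)

# ground state of v(x) = x^2 + x^4 (v = [1, 1]); the roots should approach
# e_0 = 1.3923516... as dim grows.
print(rpm_root([1.0, 1.0], 1.0, 2.0))
```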
6/2017 one-dimensional anharmonic oscillators the rpm yields the exact result for all these particular cases because in all of them the logarithmic derivative f(x) is a rational function of the coordinate. the hankel determinants of lowest dimension for each case are: h02 (e) = 1 4725 (e + 2)(e5 − 2e4 − 23e3 − 602e2 + 1030e − 1412), (8) h02 (e) = 1 4465125 (e + 9)(e5 − 9e4 − 187e3 − 8217e2 + 78336e − 348624), (9) h03 (e) = 1 3189612751764848640000 (8e − 3)(8589934592e11 + 3221225472e10 − 1887235473408e9 − 399347250364416e8 − 1634745666502656e7 + 10770225531715584e6 − 836065166572191744e5 − 905684630058491904e4 + 5197219286067104256e3 − 2944302537136698432e2 − 12283878786837315912e + 22452709866105906693), (10) h03 (e) = 1 431028319209742820966400000 (8e − 9)(8589934592e11 + 9663676416e10 − 5569096187904e9 − 2064531673055232e8 − 15362232560910336e7 + 158709729905344512e6 − 23752960275863896064e5 − 84068173973645402112e4 + 2318080070178601634304e3 − 6274577633554290840768e2 − 75410626140297229262472e + 655367638076442656931879), (11) respectively. it is clear that the second factor of each hankel determinant yields the exact eigenvalue of the corresponding model in equation (7). as a nontrivial example we consider the quartic anharmonic oscillator v (x) = x2 + λx4. (12) gaudreau et al [1] calculated the ground state for λ = 1 with remarkable accuracy. the rpm also enables great accuracy because of its exponential convergence. for example, with determinants of dimension d ≤ 623 we obtained e0 = 1.39235164153029185565750787660993418460006671122083408890634932387756743187564652 85909735634677917591211513753417388174455516240463837130438178697370013460935168 15484208574889656901800305541236648743218953435715417409382624057229519998568711 18140968922702273638169811112603107034293861341959645684859182914634898518858148 63025469392145221031177208948219643654580541741801366088701870825264349698158700 82340760759574319226851138960019685449394982096240756162094619633463447377455701 49211492623468905916373385630626814055709925106270580909505786666030935831448351 97352905560061049224302849821825415119194035000689109989896675454979833183805654 19975466162573031052729404581567529262538228672118076018319975294595611113245756 78445653018419567798509749315372254188588216960225999726980950846580656370213654 47651793869049904755455309191949465274340562585980971938979595684138772300267900 68177673277845708654477245631366268184519934644126051969150124972306172724393638 74511499751517142498813649966422950045954851519165072488133686158144218817306000 39773536840117104637678735672726392478420532548924901523470626991951440934018875 83071929546817823113125377471312004221881276679422460872268510606766179549130792 640798558850522732484547554994100518213983, which is considerably more accurate than the result reported by those authors. such an accuracy is unnecessary but it clearly shows the stability and remarkable rate of convergence of the rpm. for some approaches the pure quartic oscillator v (x) = x4 (13) 393 francisco marcelo fernández, javier garcia acta polytechnica may be more demanding than the oscillator treated previously. 
however, this is not the case for the rpm that yields e0 = 1.06036209048418289964704601669266354551520872852829779332162452416959435630443444 21126896299134671720351054624435858252558087980821029314701317683637238249357892 26246004708175446960141637488417282256290593575779088806178879026360154939569027 51961489200942934873584409442694897901213971464290951923352453382834703350575761 51120257039888523720240221842110308657373109139891545365841031116794058335486020 09227440069631126702388622971429699610592155832226671376935508673610000831830027 51792623357391390621361807764985969618149941279280927284070795610604240722946809 94913627572927387279136890279842472226217169444889547513704380684054391877877295 32342458274372543178323190603810687416044034374530146847272813918612940470431034 01351071607110353008929823272542766151898695056504716025275608952626219102568822 00964410287815640052705292932405076382650282591122477362538471854714402572285438 48529745045857097828402490669995704768445877091762029124375273254907211643344023 02947306923981908956853745359884460160202313291933059395869304916644281633946163 32428700242614612377430099522342042085977356901535654168502308941851348795734106 58547971946759646679661346762885864379526545195605682867159583388847434670120422 42071491874787103842957338913898524589402226347126961769965604409311709985471606 46641857421281143028818111495112214843140887121662059313076923418022298272468836 26045356507913236221596486925870033200274440968806404623978817839469837807048268 60217427219460350750696191658224983009606134572666392863592217643534013718920448 14846483730289412529638634402446954353934473733433447707230478215508820964235121 06900382833900237848230939194834 from determinants of dimension d ≤ 806. we think that present result is more accurate than the one reported by trott[2], the discrepancy being in his last 9 figures. in the case of the quartic anharmonic oscillators (6) and (12) it was proved that e[d,0]0 and e [d,1] 0 are lower and upper bounds to the actual eigenvalue, respectively[3, 4]. in order to verify the accuracy of present calculation we verified that both bounds agreed to the last digit. as stated in the introduction, the rpm yields not only the bound states but also the resonances. for example, in the case of the anharmonic oscillator (12) with λ = −0.1 (inverted double well) we obtain 4 and n > 1. (2.) palindromic primes: the first few decimal palindromic primes are (sequence a002385 in the oeis [2]): 2, 3, 5, 7, 11, 101, 131, 151, 181, 191, 313, 353, 373, 383, 727, 757, 787, 797, 919, 929, . . . except for 11, all palindromic primes have an odd number of digits because the divisibility test for 11 tells us that every palindromic number with an even number of digits is divisible by 11. on the one hand, it is not known if there are infinitely many palindromic primes in base 10; the largest known decimal palindromic prime has 474,501 digits (found in 2014): 10474500 + 999 · 10237249 + 1. on the other hand, it is known that, for any base, almost all palindromic numbers are composite [6]. it means that the ratio of palindromic composites and all palindromic numbers less than n tends to 1. binary palindromic primes include the mersenne primes and the fermat primes 1. all binary palindromic primes, except the number 3 (having the expansion 11 in base 2), have an odd number of digits; palindromic numbers with an even number of digits are divisible by 3. let us write down 1a mersenne prime is a prime of the form 2p − 1, where p is a prime. 
a fermat prime is a prime of the form 2^(2^n) + 1. the sequence of binary expansions of the first binary palindromic primes (sequence a117697 in the oeis [2]) is: 11, 101, 111, 10001, 11111, 1001001, 1101011, 1111111, 100000001, 100111001, 110111011, . . .

(3.) multi-base palindromic numbers: any positive integer n is palindromic in all bases b with b ≥ n + 1, because n is then a single-digit number, and also in base n − 1, because the expansion of n in base n − 1 equals 11. but it is more interesting to consider bases smaller than the number itself. for instance, the number 105 is palindromic in bases 4, 8, 14, 20, 34, 104; the expansions of 105 in those bases are: (105)_4 = 1221, (105)_8 = 151, (105)_14 = 77, (105)_20 = 55, (105)_34 = 33, (105)_104 = 11. a palindromic number in base b whose expansion is made up of palindromic sequences of length ℓ arranged in a palindromic order is palindromic in base b^ℓ. for example, the number 24253 has the expansion in base 2 equal to (24253)_2 = 101 111 010 111 101, i.e., it is made up of palindromes of length 3, and its expansion in base 2^3 = 8 is equal to (24253)_8 = 57275.

(4.) sum of palindromes: every positive integer can be written as the sum of at most three palindromic numbers in every number system except the binary system. it was proven first for bases greater than or equal to 5 in [7] and for bases 3, 4 in [8]. moreover, the optimal bound for base 2 is four, see [8].

in this paper, we deal with antipalindromic numbers in various integer bases. we examine and compare properties of palindromic numbers and antipalindromic numbers, bringing a number of new results. these are structured as follows. in section 2, we introduce the definition of an antipalindromic number in an integer base and its basic properties, following from the definition. section 3 brings unexpected results concerning divisibility and antipalindromic primes. in section 4, antipalindromic squares and higher powers are examined. section 5 contains information about numbers that are antipalindromic in two or more bases at the same time. in section 6, we summarize our results and provide a list of conjectures and open problems.

2. definition and basic properties
let us start with a formal definition of palindromic and antipalindromic numbers and their basic properties.

definition 1. let b ∈ ℕ, b ≥ 2. consider a natural number m whose expansion in base b is of the following form: m = a_n b^n + · · · + a_1 b + a_0, where a_0, a_1, . . . , a_n ∈ {0, 1, . . . , b − 1}, a_n ≠ 0. we usually write (m)_b = a_n . . . a_1 a_0. then m is called
(1.) a palindromic number in base b if its digits satisfy the condition a_j = a_{n−j} for all j ∈ {0, 1, . . . , n}, (1)
(2.) an antipalindromic number in base b if its digits satisfy the condition a_j = b − 1 − a_{n−j} for all j ∈ {0, 1, . . . , n}. (2)
the length of the expansion of the number m is usually denoted |m|.

example 1. consider distinct bases b and have a look at antipalindromic numbers in these bases:
• 395406 is an antipalindromic number in base b = 10.
• (1581)_3 = 2011120, i.e., 1581 is an antipalindromic number in base b = 3.
• (52)_2 = 110100, i.e., 52 is an antipalindromic number in base b = 2.
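a direct transcription of definition 1 into code is straightforward; the sketch below (illustrative python, not the authors' application [9]) verifies the three cases of example 1.

```python
# a minimal sketch of definition 1: m is antipalindromic in base b when
# a_j + a_{n-j} = b - 1 holds for every digit pair of its expansion.

def digits(m: int, b: int) -> list[int]:
    """expansion of m in base b, most significant digit first."""
    out = []
    while m:
        m, r = divmod(m, b)
        out.append(r)
    return out[::-1] or [0]

def is_antipalindromic(m: int, b: int) -> bool:
    d = digits(m, b)
    return all(d[j] + d[-1 - j] == b - 1 for j in range(len(d)))

# the three cases of example 1:
assert is_antipalindromic(395406, 10)
assert is_antipalindromic(1581, 3)
assert is_antipalindromic(52, 2)
```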
observation 1. if an antipalindromic number in base b has an odd number of digits, then b is an odd number and the middle digit is equal to (b−1)/2.

proof. it follows from the definition that twice the middle digit is equal to b − 1. thus, b is an odd number and the middle digit is equal to (b−1)/2.

observation 2. a number is simultaneously palindromic and antipalindromic if and only if b is an odd number and all the digits are equal to (b−1)/2.

proof. consider an antipalindromic number with digits a_0, a_1, . . . , a_n. for this number to be palindromic, a_j = a_{n−j} must be true for each j ∈ {0, 1, . . . , n}. from the definition of an antipalindromic number, it follows that a_j + a_{n−j} = b − 1. for each j ∈ {0, 1, . . . , n}, we obtain a_j = (b−1)/2, i.e., all digits are equal to (b−1)/2 and the base b must, therefore, be odd. the opposite implication is obvious.

3. divisibility and antipalindromic primes
let us first study the divisibility of antipalindromic numbers, which will be used in the sequel to show interesting results on antipalindromic primes.

lemma 1. let m be a natural number and let its expansion in base b be equal to a_n b^n + a_{n−1} b^{n−1} + . . . + a_1 b + a_0. then m is divisible by b − 1 if and only if the sum of its digits is divisible by b − 1, i.e., a_n + a_{n−1} + . . . + a_1 + a_0 ≡ 0 (mod b − 1).

proof. the statement follows from the fact that b^k ≡ 1 (mod b − 1) for any k ∈ ℕ.

theorem 2. any antipalindromic number with an even number of digits in base b is divisible by b − 1.

proof. consider an antipalindromic number m = a_n b^n + a_{n−1} b^{n−1} + . . . + a_1 b + a_0 for an odd n. from the definition, it is true that a_j + a_{n−j} = b − 1 for each j ∈ {0, 1, . . . , n}. the number of digits is even, hence a_n + a_{n−1} + . . . + a_1 + a_0 = (b − 1)(n + 1)/2 ≡ 0 (mod b − 1). using lemma 1, the antipalindromic number m is divisible by the number b − 1.

theorem 3. an antipalindromic number with an odd number of digits in base b is divisible by (b−1)/2.

proof. consider the antipalindromic number m = a_{2n} b^{2n} + a_{2n−1} b^{2n−1} + . . . + a_1 b + a_0. the digit sum of the number m − a_n b^n is divisible by b − 1, so from lemma 1 the number m − a_n b^n itself is divisible by b − 1 and, therefore, by (b−1)/2. from observation 1, a_n = (b−1)/2. the number m is thus a sum of two numbers divisible by (b−1)/2.

let us now turn our attention to antipalindromic primes. while palindromic primes occur in various bases, antipalindromic primes occur (except for some trivial cases) only in base 3.

theorem 4. let the base b > 3. then there exists at most one antipalindromic prime p in base b: p = (b−1)/2.

proof. theorems 2 and 3 show that every antipalindromic number is divisible either by (b−1)/2 or by b − 1. although b − 1 may be a prime number, it is never antipalindromic.

theorem 5. let the base b = 2. then there exists only one antipalindromic prime p = 2, (p)_2 = 10.

proof. every antipalindromic number in base b = 2 is even. 2 is the only even prime number.

theorem 6. let the base b = 3. every antipalindromic prime in this base has an odd number of digits n ≥ 3.

proof. from theorem 2, antipalindromic numbers with an even number of digits in base b = 3 are even. the only antipalindromic number in this base with one digit is 1.

lemma 7. antipalindromic numbers in base b = 3 beginning with the digit 2 are divisible by 3.

proof. consider an antipalindromic number m = a_n 3^n + a_{n−1} 3^{n−1} + . . . + a_1 3 + a_0, where a_n = 2. the sum of a_n and a_0 needs to be equal to 2, therefore a_0 = 0. all the summands are then divisible by 3.

theorem 8. all antipalindromic primes in base b = 3 can be expressed as 6k + 1, where k ∈ ℕ.
proof. consider an antipalindromic prime m = a_{2n} 3^{2n} + a_{2n−1} 3^{2n−1} + . . . + a_1 3 + a_0 (the number of digits must be odd). from lemma 7, a_{2n} is equal to 1, and hence a_0 = 1. let us pair the digits of the antipalindromic number m (except for a_{2n}, a_n, and a_0): a_{2n−j} 3^{2n−j} + a_j 3^j, j ∈ {1, . . . , n−1}. let us prove that for each j ∈ {1, . . . , n − 1}, the following expression is divisible by 6:

3^j (a_{2n−j} 3^{2n−2j} + a_j).

we only need to consider three possibilities: a_{2n−j} = 2, a_j = 0, or a_{2n−j} = a_j = 1, or a_{2n−j} = 0, a_j = 2. in each case, the number in question is divisible by 6, because there is an even number inside the bracket. we then get m = a_{2n} 3^{2n} + a_n 3^n + a_0 + 6ℓ = 3^{2n} + 3^n + 1 + 6ℓ = 3^n (3^n + 1) + 1 + 6ℓ for some ℓ ∈ ℕ. the first summand is also divisible by 6, therefore m can indeed be expressed as 6k + 1 for some k ∈ ℕ.

the application [9] can be used for searching for antipalindromic primes in base 3. during an extended search, the first 637807 antipalindromic primes have been found. let us now list at least the first 10 of them, along with their expansions in base 3:

13      111
97      10121
853     1011121
1021    1101211
1093    1111111
7873    101210121
8161    102012021
8377    102111021
9337    110210211
12241   121210101

4. squares and other powers as antipalindromes
for palindromic numbers, squares and higher powers were considered in [4, 5] by simmons more than thirty years ago. he proved that there are infinitely many palindromic squares, cubes, and biquadrates. however, his conjecture was that for k > 4, k ∈ ℕ, no integer m exists such that m^k is a palindromic number (in the decimal base). this conjecture is still open. that is definitely not the case for antipalindromic numbers, as 3^7 = 2187 is antipalindromic in base 10. for a more recent study of palindromic powers, see [10]. let us answer the following question:

question 1. are there any antipalindromic integer squares?

our initial observation suggested that bases b = n^2 + 1, n ∈ ℕ, have the most antipalindromic squares, and the computer application provided the additional insight needed to prove this observation not only for squares but for other powers as well. table 1 expresses the number of antipalindromic squares smaller than 10^12 in bases n^2, n^2 + 1, and n^2 + 2, to underline the differences between the bases of the form n^2 + 1 and the others.

base       n=20   21   22   23   24   25
n^2          3    13    3   14    9   11
n^2 + 1     47    44   48   53   55   57
n^2 + 2      2     2    2    2    2    1
table 1. number of antipalindromic squares smaller than 10^12 in particular bases.

as the exponent of n^k is raised, the differences between the number of antipalindromic k-th powers in the base n^k + 1 and the others become even more significant, but the numbers grow faster, thus we meet the computational limits of our program already for small values of n, see table 2.

base       n=4    5    6    7    8    9   10   11   12
n^4          0    1    0    1    0    1    0    0    0
n^4 + 1      6    6    8   10   13   13   13   13   13
n^4 + 2      0    0    0    0    0    0    0    0    0
table 2. number of antipalindromic biquadrates smaller than 10^15 in particular bases.

example 2. consider the base b = 10 = 3^2 + 1. any antipalindromic number in this base must be divisible by 9. every double-digit number divisible by 9 (except 99) is antipalindromic: 18, 27, 36, 45, 54, 63, 72, 81, 90. the number 9 is a square, so if a square is divided by 9, the result is still a square. 36 = 4 · 9 = 2^2 · 3^2 = 6^2, 81 = 9 · 9 = 9^2. thus 36 and 81 are antipalindromic squares.
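the counts in tables 1 and 2 can be reproduced in spirit by a brute-force search. the sketch below (again illustrative, not the application [9]) counts antipalindromic squares in bases n^2, n^2 + 1 and n^2 + 2 for n = 20, with a smaller bound of 10^8 so that it runs quickly; the same dominance of base n^2 + 1 can be checked at this smaller bound.

```python
# a minimal brute-force sketch behind table 1: count antipalindromic squares
# below a bound in the three families of bases. the bound 10**8 (instead of
# the paper's 10**12) is chosen only to keep the run time negligible.

def digits(m, b):
    out = []
    while m:
        m, r = divmod(m, b)
        out.append(r)
    return out[::-1]

def is_antipalindromic(m, b):
    d = digits(m, b)
    return all(d[j] + d[-1 - j] == b - 1 for j in range(len(d)))

n, bound = 20, 10**8
for base in (n * n, n * n + 1, n * n + 2):
    count = sum(1 for k in range(1, int(bound ** 0.5) + 1)
                if is_antipalindromic(k * k, base))
    print(f"base {base}: {count} antipalindromic squares below {bound}")
```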
proposition 1. for $b = n^2 + 1$, $n \in \mathbb{N}$, and $m \in \{2, 3, \dots, n\}$, the number $(m \cdot n)^2$ is antipalindromic.

proof. since $b = n^2 + 1$, we can modify the expression as follows: $(m \cdot n)^2 = m^2 \cdot (b - 1)$. this number has the expansion in base $b$ equal to $(m^2 - 1)(b - m^2)$, hence it is antipalindromic.

further on, we will answer the following question.

question 2. are there any higher integer powers that are also antipalindromic numbers?

example 3. consider the base $b = 28 = 3^3 + 1$. any antipalindromic number in this base with an even number of digits must be divisible by 27. every double-digit number divisible by 27 (except the one with expansion $(27)(27)$) is antipalindromic. the number 27 is a third power of 3, so a third power divisible by 27 is 27 times a third power of an integer: $(7)(20) = (216)_{28}$ and $216 = 8 \cdot 27 = 2^3 \cdot 3^3 = 6^3$; $(26)(1) = (729)_{28}$ and $729 = 27 \cdot 27 = 3^3 \cdot 3^3 = 9^3$. thus 216 and 729 are antipalindromic cubes.

theorem 9. for $b = n^k + 1$, where $n, k \in \mathbb{N}$, $k \geq 2$, and $m \in \{2, 3, \dots, n\}$, the number $(m \cdot n)^k$ is antipalindromic.

proof. since $b = n^k + 1$, we can modify the expression as follows: $(m \cdot n)^k = m^k \cdot (b - 1)$. this number has the expansion in base $b$ equal to $(m^k - 1)(b - m^k)$, thus it is antipalindromic.

for odd powers and high enough bases, other patterns exist.

theorem 10. for integers $m > 1$ and odd $k > 1$, there exists a number $c$ such that in every base $b \geq c$, the number $[m \cdot (b-1)]^k$ is antipalindromic. it suffices to put $c = \binom{k}{\frac{k-1}{2}} \cdot m^k$.

proof. the binomial theorem reads

$[m(b-1)]^k = m^k \sum_{i=0}^{k} (-1)^i \binom{k}{i} b^{k-i}.$

since $\binom{k}{\frac{k-1}{2}}$ is the maximum among the numbers $\binom{k}{i}$ for $i \in \{0, 1, \dots, k\}$, the expansion in base $b$ equals

$\big([m(b-1)]^k\big)_b = \Big(m^k \tbinom{k}{0} - 1\Big)\Big(b - m^k \tbinom{k}{1}\Big)\Big(m^k \tbinom{k}{2} - 1\Big) \dots \Big(b - m^k \tbinom{k}{k-2}\Big)\Big(m^k \tbinom{k}{k-1} - 1\Big)\Big(b - m^k \tbinom{k}{k}\Big).$

5. multi-base antipalindromic numbers

let us study the question whether there are numbers that are antipalindromic simultaneously in more bases. bašić showed in [11, 12] that for any $n \in \mathbb{N}$ and any $d \geq 2$, there exists an integer $m$ and a list of $n$ bases such that $m$ is a $d$-digit palindromic number in each of the bases. we do not know whether something similar holds for antipalindromic numbers. we will show a weaker statement: for any $n \in \mathbb{N}$, there exists an integer $m$ and a list of $n$ bases such that $m$ is antipalindromic in each of the bases.

table 3. antipalindromic expansions of the number 3276 in 21 bases:

| base | expansion |
| 2 | 110011001100 |
| 4 | 303030 |
| 10 | 3276 |
| 64 | (53)(10) |
| 79 | (41)(37) |
| 85 | (38)(46) |
| 92 | (35)(56) |
| 118 | (27)(90) |
| 127 | (25)(101) |
| 157 | (20)(136) |
| 183 | (17)(165) |
| 235 | (13)(221) |
| 253 | (12)(240) |
| 274 | (11)(262) |
| 365 | (8)(356) |
| 469 | (6)(462) |
| 547 | (5)(541) |
| 820 | (3)(816) |
| 1093 | (2)(1090) |
| 1639 | (1)(1637) |
| 6553 | (3276) |

in 2014, bérczes and ziegler [13] discussed multi-base palindromic numbers and proposed a list of the first 53 numbers palindromic in bases 2 and 10 simultaneously. our application [9] has only been able to find one number with an antipalindromic expansion in both of these bases. this number, 3276, is also antipalindromic in 19 other distinct bases, see table 3. the next greater number that is antipalindromic both in base 2 and base 10 must be greater than $10^{10}$ and divisible by 18.

it is not uncommon for a number to be antipalindromic in more bases. in this section, we show that if a number is antipalindromic in a unique base, then the number must be prime or equal to 1, see theorem 11.

definition 2. an antipalindromic number is called multi-base if it is antipalindromic in at least two different bases.

observation 3. every number $m \in \mathbb{N}$ is antipalindromic in base $2m + 1$.
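a table such as table 3 can be regenerated directly. a sketch, again assuming the `is_antipalindromic` helper from above (observation 3 guarantees that base $2m + 1$ is the largest base worth testing):

```python
# a sketch reproducing table 3: list every base in which a number
# is antipalindromic.
def antipalindromic_bases(m):
    return [b for b in range(2, 2 * m + 2) if is_antipalindromic(m, b)]

bases = antipalindromic_bases(3276)
print(len(bases), bases)   # table 3 lists 21 bases, including 2 and 10
```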
example 4. the number 3276 is a multi-base antipalindromic number, as illustrated in table 3.

theorem 11. for any composite number $a \in \mathbb{N}$, we can find at least two bases $b, c$ such that this number has an antipalindromic expansion in both of them.

proof. assume that $a = m \cdot n$, $m, n \in \mathbb{N}$, $m \geq n \geq 2$, and set $b = \frac{a}{n} + 1$, $c = 2a + 1$. the expansions of $a$ in bases $b, c$ are equal to

$(a)_b = (n - 1)\Big(\frac{a}{n} - n + 1\Big), \qquad (a)_c = (a).$

theorem 12. for every $n \in \mathbb{N}$, there exist infinitely many numbers that are antipalindromic in at least $n$ bases.

proof. consider any number $a$ divisible by $(2n)!$. theorem 11 indicates that the number $a$ is antipalindromic in the bases $\frac{a}{2} + 1, \frac{a}{3} + 1, \dots, \frac{a}{n} + 1$ and also $2a + 1$.

theorem 13. let $b \in \mathbb{N}$, $b \geq 2$. then there exists $m \in \mathbb{N}$ such that $m$ is antipalindromic in base $b$ and in at least one more base less than $m$.

proof.

| base $b$ | $m$ | two expansions |
| 2 | 12 | $(12)_2 = 1100$, $(12)_4 = 30$ |
| 3 | 72 | $(72)_3 = 2200$, $(72)_9 = 80$ |
| $\geq 4$ | $4(b-1)$ | $(m)_b = (3)(b-4)$, $(m)_{2b-1} = (1)(2b-3)$ |

theorem 14. let $p, q \in \mathbb{N}$ be such that $\gcd(p, q) = d$ and $p \geq \frac{q}{d} > 1$, $q \geq \frac{p}{d} > 1$. then the number $m = \frac{pq}{d}$ is antipalindromic in bases $p + 1$ and $q + 1$.

proof. we have

$(m)_{p+1} = \Big(\frac{q}{d} - 1\Big)\Big(p + 1 - \frac{q}{d}\Big), \qquad (m)_{q+1} = \Big(\frac{p}{d} - 1\Big)\Big(q + 1 - \frac{p}{d}\Big).$

example 5. let $p = 4$, $q = 6$; then $\gcd(4, 6) = 2$. the number $m = 12$ is antipalindromic in bases 5 and 7: $(12)_5 = 22$, $(12)_7 = 15$.

in the introduction, we mentioned that a palindromic number in base $b$ whose expansion is made up of palindromic sequences of length $\ell$ arranged in a palindromic order is palindromic in base $b^\ell$. let us present a similar statement for antipalindromic numbers. for its proof, we will need the following definition.

definition 3. let $b \in \mathbb{N}$, $b \geq 2$. consider a string $u = u_0 u_1 \dots u_n$, where $u_i \in \{0, 1, \dots, b-1\}$. the antipalindromic complement of $u$ in base $b$ is $a(u) = (b - 1 - u_n)(b - 1 - u_{n-1}) \dots (b - 1 - u_0)$.

theorem 15. let $b \in \mathbb{N}$, $b \geq 2$. an antipalindromic number $m$ in base $b^n$, where $(m)_{b^n} = u_k \dots u_1 u_0$ and $u_k \geq b^{n-1}$, is simultaneously antipalindromic in base $b$ if and only if the expansion of $u_j$ in base $b$ of length $n$ (i.e., completed with zeroes if necessary) is a palindrome for all $j \in \{0, 1, \dots, k\}$.

proof. the digits of $m$ in base $b^n$ satisfy $0 \leq u_j \leq b^n - 1$. let us denote the expansion of $u_j$ in base $b$ by $(u_j)_b = v_{j,n-1} \dots v_{j,1} v_{j,0}$ (completed with zeroes in order to have the length $n$ if necessary). the antipalindromic complement $a(u_j)$ of $u_j$ in base $b^n$ equals $b^n - 1 - u_j$, and its expansion in base $b$ equals $(a(u_j))_b = (b - 1 - v_{j,n-1}) \dots (b - 1 - v_{j,1})(b - 1 - v_{j,0})$. since $m$ is antipalindromic in base $b^n$, we have $u_{k-j} = a(u_j) = b^n - 1 - u_j$ for all $j \in \{0, 1, \dots, k\}$. let us now consider the expansion of $m$ in base $b$: it is obtained by concatenation of the expansions of $u_j$ in base $b$ for $j \in \{0, 1, \dots, k\}$, i.e., $(m)_b = (u_k)_b \dots (u_1)_b (u_0)_b$. by the assumption $u_k \geq b^{n-1}$, the expansion $(u_k)_b$ starts in a non-zero digit; thus, the length of the expansion $(m)_b$ equals $n \cdot |(m)_{b^n}|$. the number $m$ is antipalindromic in base $b$ if and only if $(u_{k-j})_b = a((u_j)_b)$ for all $j \in \{0, 1, \dots, k\}$, i.e.,

$(b - 1 - v_{j,n-1}) \dots (b - 1 - v_{j,1})(b - 1 - v_{j,0}) = a(v_{j,n-1} \dots v_{j,1} v_{j,0}) = (b - 1 - v_{j,0})(b - 1 - v_{j,1}) \dots (b - 1 - v_{j,n-1}).$

consequently, $m$ is antipalindromic in base $b$ if and only if $(u_j)_b = v_{j,n-1} \dots v_{j,1} v_{j,0}$ is a palindrome for all $j \in \{0, 1, \dots, k\}$.
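the criterion of theorem 15 is also easy to test numerically. a sketch, reusing the `digits` and `is_antipalindromic` helpers defined earlier; the function name is ours:

```python
# check theorem 15: for m antipalindromic in base b**n with leading
# digit >= b**(n-1), antipalindromicity in base b is equivalent to
# every base-(b**n) digit being a base-b palindrome of length n.
def theorem15_check(m, b, n):
    big = digits(m, b**n)
    assert is_antipalindromic(m, b**n) and big[0] >= b**(n - 1)
    blocks = [[0] * (n - len(d)) + d for d in (digits(u, b) for u in big)]
    all_pal = all(d == d[::-1] for d in blocks)
    return all_pal == is_antipalindromic(m, b)

print(theorem15_check(6633442277556633, 10, 2))  # cf. example 7 below
```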
example 6. consider $m = 73814$. then $(m)_{27} = (3)(20)(6)(23) = u_3 u_2 u_1 u_0$, thus $m$ is antipalindromic in base $27 = 3^3$. however, $(m)_3 = 10202020212$, thus $m$ is not antipalindromic in base 3. if we cut $(m)_3$ into blocks of length 3, then all of them are palindromic; however, the first one equals 010 and it starts with a zero, hence the assumption $u_3 \geq 9$ of theorem 15 is not met.

example 7. consider $b = 10$. the number 6633442277556633 is an antipalindromic number both in base 10 and in base 100.

6. conclusion and open problems

in this paper, we carried out a thorough study of antipalindromic numbers and described known results regarding palindromic numbers in order to draw a comparison. it brings a number of new results:

• we described the divisibility of antipalindromic numbers and showed that non-trivial antipalindromic primes may be found only in base 3.
• we found several classes of antipalindromic squares and higher powers.
• we described pairs of bases such that there is a number antipalindromic in both of these bases.

moreover, we obtained the following interesting results concerning multi-base antipalindromic numbers:

• for any composite number, there exist at least two bases such that this number is antipalindromic in both of them.
• for every $n \in \mathbb{N}$, there exist infinitely many numbers that are antipalindromic in at least $n$ bases.
• for every $b \in \mathbb{N}$, $b \geq 2$, there exists $m \in \mathbb{N}$ such that $m$ is antipalindromic in base $b$ and in at least one more base less than $m$.

this paper is based on the bachelor thesis [14], where several more results were obtained:

• the number of (anti)palindromic numbers of a certain length, and the maximum and minimum number of antipalindromic numbers between palindromic numbers and vice versa;
• an explicit formula for the length of gaps between neighboring antipalindromic numbers.

we created a user-friendly application for all the questions studied [9], which is freely available to the reader. based on computer experiments, we state the following conjectures and open problems:

(1.) are there infinitely many antipalindromic primes in base 3? (we know there is never more than one antipalindromic prime in any base other than 3.) during an extended search, the first 637807 antipalindromic primes have been found.
(2.) we conjecture that it is possible to express any integer (except for 24, 37, 49, 117, and 421) as the sum of at most three antipalindromic numbers in base 3. our computer program shows that the answer is positive up to $5 \cdot 10^6$.
(3.) we conjecture that it is possible to express any palindromic number in base 3 as the sum of at most three antipalindromic numbers in base 3. this conjecture follows evidently from the previous one, and we verified it for even larger numbers, up to $10^8$.
(4.) is there a pair of bases such that it is impossible to find any number that has an antipalindromic expansion in both of them? according to our computer experiments, suitable candidates seem to be the bases 6 and 8. this is to be studied in the future.

acknowledgements

we would like to thank the reviewers for their useful comments and suitable suggestions concerning references. l. dvořáková received funding from the ministry of education, youth and sports of the czech republic through the project no. cz.02.1.01/0.0/0.0/16_019/0000778.
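conjecture (2) can be probed with a small brute-force script. a sketch over a modest range, assuming the `is_antipalindromic` helper defined earlier (the range and the structure are ours; the paper's program reaches $5 \cdot 10^6$):

```python
# a small test of conjecture (2): sums of at most three base-3
# antipalindromic numbers.
LIMIT = 3000
aps = [a for a in range(1, LIMIT + 1) if is_antipalindromic(a, 3)]
ap_set = set(aps)

def sum_of_three(n):
    if n in ap_set:
        return True
    for a in aps:
        if a > n:
            break
        if (n - a) in ap_set:
            return True
        for c in aps:
            if a + c > n:
                break
            if (n - a - c) in ap_set:
                return True
    return False

failures = [n for n in range(1, 1000) if not sum_of_three(n)]
print(failures)   # the conjecture predicts [24, 37, 49, 117, 421]
```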
references

[1] j. joyce. ulysses. first edition. shakespeare and company, 12, rue de l'odéon, paris, 1922.
[2] n. j. a. sloane. the on-line encyclopedia of integer sequences. https://oeis.org.
[3] a. tripathi. characterization and enumeration of palindromic numbers whose squares are also palindromic. rocky mountain j math 50(3):1115–1124, 2020. https://doi.org/10.1216/rmj.2020.50.1115.
[4] g. j. simmons. on palindromic squares of non-palindromic numbers. j recreational math 5(1):11–19, 1972.
[5] g. j. simmons. palindromic powers. j recreational math 3:93–98, 1970.
[6] w. d. banks, d. n. hart, m. sakata. almost all palindromes are composite. math res lett 11(5-6):853–868, 2004. https://doi.org/10.4310/mrl.2004.v11.n6.a10.
[7] j. cilleruelo, f. luca, l. baxter. every positive integer is a sum of three palindromes. math comp 87(314):3023–3055, 2018. https://doi.org/10.1090/mcom/3221.
[8] a. rajasekaran, j. shallit, t. smith. additive number theory via automata theory. theory comput syst 64(3):542–567, 2020. https://doi.org/10.1007/s00224-019-09929-9.
[9] s. kruml. antipalindromic numbers (application). [2020-08-10], https://github.com/kruml3/antipalindromic-numbers/.
[10] j. cilleruelo, f. luca, i. e. shparlinski. power values of palindromes. j comb number theory 1(2):101–107, 2009.
[11] b. bašić. on d-digit palindromes in different bases: the number of bases is unbounded. int j number theory 8(6):1387–1390, 2012. https://doi.org/10.1142/s1793042112500819.
[12] b. bašić. on "very palindromic" sequences. j korean math soc 52(4):765–780, 2015. https://doi.org/10.4134/jkms.2015.52.4.765.
[13] a. bérczes, v. ziegler. on simultaneous palindromes. j comb number theory 6(1):37–49, 2014.
[14] s. kruml. antipalindromic numbers. bachelor thesis, czech technical university in prague, 2020. available on request.

acta polytechnica 61(3):428–434, 2021

acta polytechnica 59(2):144–152, 2019, doi:10.14311/ap.2019.59.0144

a study of ferrofluid lubrication based rough sine film slider bearing with assorted porous structure

mohmmadraiyan m. munshi (a,*), ashok r. patel (b), gunamani b. deheri (c)

(a) gujarat technological university, alpha college of engineering and technology, kalol, gujarat, 382721, india
(b) gujarat technological university, vishwakarma government engineering college, ahmedabad, gujarat, 382424, india
(c) sardar patel university, department of mathematics, vallabh vidhyanagar, gujarat, 388120, india
(*) corresponding author: raiyan.munshi@gmail.com

abstract. this paper attempts to study a ferrofluid lubrication based rough sine film slider bearing with assorted porous structure using a numerical approach. the fluid flow of the system is regulated by the neuringer-rosensweig model. the impact of the transverse surface roughness of the system has been derived using the christensen and tonder model. the corresponding reynolds equation has been used to calculate the pressure distribution, which, in turn, has been the key to formulate the load carrying capacity equation.
a graphical representation is made to demonstrate the calculated value of the dimensionless load carrying capacity. the numbers thus derived have been used to prove that ferrofluid lubrication aids the load carrying capacity. the study suggests that the positive impact created by magnetization in the case of negatively skewed roughness helps to partially nullify the negative impact of the transverse roughness. further investigation implies that when the kozeny-carman model is used, the overall performance is enhanced. the kozeny-carman model is an empirical equation used to calculate permeability that depends on various parameters such as pore shape, tortuosity, specific surface area and porosity. the success of the model can be credited to its simplicity and its efficiency in describing measured permeability values. the obtained equation was used to predict the permeability of fibre mat systems and of vesicular rocks.

keywords: porous structure, roughness, ferrofluid, load carrying capacity.

1. introduction

the last decade has seen a considerable shift wherein many tribological researches have been dedicated to studying surface roughness and the impact of hydrodynamic lubrication. this is because every solid surface carries some amount of surface roughness, the height of which is usually comparable with the mean separation between lubricated contacts. as many researchers have suggested, studying the surface roughness helps to improve the performance of a bearing system. for this reason, many researchers [1–3] studied the performance of various bearing systems using the stochastic concept of [4–6].

among the biggest innovations in the field is the use of ferrofluid as a bearing system lubricant. a number of authors [7–11] have worked to explain the performance and applications of ferrofluid when used in different types of bearing systems. these studies have suggested that ferrofluid impacts the bearing performance positively. many researchers have used different types of film geometries in order to study the effect of a ferrofluid based squeeze film: [12] studied an exponential slider bearing, [13] worked on a secant shaped slider bearing, [14–16] analysed an inclined slider bearing, [17] studied curved slider bearings, [18] examined a parallel slider bearing, [19] evaluated a rayleigh step bearing, [20] investigated a parabolic slider bearing, [21] worked on a hyperbolic slider bearing, [22] discussed a sine film thrust bearing, [23] studied a convex pad slider bearing and [24] examined an infinitely long slider bearing. all the articles above show that the characteristics of ferrofluid and its effect on the load bearing capacity are positive.

the lubrication theory of porous bearings was first studied by [25]. porous structures are usually described using two common parameters, porosity and permeability. porosity is a measure of the voids existing within a dense material structure. permeability defines the ease with which fluids can flow through the material in the case of open cell porosity; darcy's law is generally used to determine the permeability. porous metallic materials have many applications, including vibration and sound absorption, light-weight structures, heat transfer media, sandwich cores for panels, various membranes and, in recent years, biomaterial structures for the design of medical implants.
a porous matrix decreases the load carrying capacity and increases the frictional force on the slider; however, the porous layer has the beneficial property of self-lubrication, making it an important area of study. [26] compared porous structures and their impact on the load carrying capacity of a magnetic fluid based rough short bearing. the study found that while magnetization has a positive impact on the bearing system's performance, transverse roughness impacts it negatively; in the case of the kozeny-carman model, this negative impact is comparatively lower. in this model, the negative impact of porosity on the bearing performance can be neutralized by the positive impact of a negatively skewed roughness. [27] investigated the performance of a magnetic fluid based double layered rough porous slider bearing considering combined porous structures; for a considerable range of the combined porous structure, magnetization neutralizes the adverse effect of roughness. [28] studied a shliomis model-based magnetic squeeze film in rotating rough curved circular plates, contrasting two different porous structures. it was found that by choosing a proper rotation ratio and appropriate curvature parameters, the negative impact of transverse roughness on the bearing's load carrying capacity can be nullified by the positive impact of magnetization with a negatively skewed roughness. [29] studied a ferrofluid based squeeze film in curved porous circular plates with various porous structures; with concave plates and the porous structure given by kozeny-carman, there was a considerable increase in the load bearing capacity. different forms of modification of darcy's law have been studied in [30]. [31] investigated a bearing system based on a hyperbolic slider, experimenting with porous structure as well as roughness under the influence of a sinusoidal magnetic field; the load bearing capacity is enhanced by the magnetization with the slip parameter kept within a limited boundary. recently, [32] analysed an inclined slider bearing, treating in detail all aspects of surface roughness, porosity and magnetic field; somewhat surprisingly, the load bearing capacity proved very effective when the sinusoidal magnetic field was applied in the form which appears in the present study.

none of the above-mentioned researchers worked on the impact of sine films in a slider bearing. in order to explore this field, this paper studies a ferrofluid lubrication based rough sine film slider bearing with an assorted porous structure.

2. analysis

figure 1 represents the geometry and configuration of the given bearing system. $u$ denotes the uniform velocity of the system in the direction $x$. the film thickness $h$ is considered as

$h = \bar{h} + h_s \qquad (1)$

where the mean film thickness $\bar{h}$ is taken as [22]

$\bar{h} = h_0 + (h_1 - h_0)\Big[1 - \sin\Big(\frac{\pi x}{2l}\Big)\Big].$

following the works of [4–6], the study models the roughness deviation $h_s$ using the probability density function

$f(h_s) = \begin{cases} \dfrac{35}{32c}\Big(1 - \dfrac{h_s^2}{c^2}\Big)^3, & -c \leq h_s \leq c \\ 0, & \text{elsewhere} \end{cases} \qquad (2)$

$c$ being the maximum deviation from the mean film thickness.
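to make the geometry of eq. (1) and the density of eq. (2) concrete, here is a quick numerical sketch; the parameter values are our own sample choices, not taken from the paper:

```python
# evaluate the sine film profile and the christensen-tonder density.
import numpy as np

h0, h1, L, c = 1.0, 2.0, 1.0, 0.2          # assumed sample values
x = np.linspace(0.0, L, 5)
h_bar = h0 + (h1 - h0) * (1.0 - np.sin(np.pi * x / (2 * L)))

hs = np.linspace(-c, c, 401)
f = 35.0 / (32.0 * c) * (1.0 - (hs / c) ** 2) ** 3

print(h_bar)            # the film falls from h1 at x = 0 to h0 at x = l
print(np.trapz(f, hs))  # the density integrates to ~1
```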
$\alpha$, $\sigma$ and $\varepsilon$ are defined by the relationships

$\alpha = E(h_s), \qquad \sigma^2 = E(h_s - \alpha)^2, \qquad \varepsilon = E(h_s - \alpha)^3 \qquad (3)$

where $E(\cdot)$ denotes the expectancy operator given by

$E(\cdot) = \int_{-c}^{c} (\cdot)\, f(h_s)\, dh_s \qquad (4)$

[9] formulated the equations describing the steady flow of a magnetic fluid:

equation of motion
$\rho(\bar q \cdot \nabla)\bar q = -\nabla p + \eta \nabla^2 \bar q + \mu_0 (\bar M \cdot \nabla)\bar H \qquad (5)$

equation of magnetization
$\bar M = \bar\mu \bar H \qquad (6)$

equation of continuity
$\nabla \cdot \bar q = 0 \qquad (7)$

maxwell equations
$\nabla \times \bar H = 0 \qquad (8)$
and
$\nabla \cdot (\bar H + \bar M) = 0 \qquad (9)$

where $\rho$, $\bar q$, $\bar M$, $p$ and $\eta$ are the fluid density, fluid velocity, magnetization vector, film pressure and fluid viscosity, respectively. also,

$\bar q = u\bar i + v\bar j + w\bar k \qquad (10)$

where $u$, $v$, $w$ are the components of the film fluid velocity in the $x$, $y$ and $z$ directions, respectively. further, the magnitude of the magnetic field is given by

$H^2 = kx(l - x) \qquad (11)$

where $k$ is a suitable constant and, assuming the external magnetic field to arise from a potential function, the inclination angle $\phi = \phi(x, z)$ of the magnetic field satisfies the equation [7]

$\cot\phi\,\frac{\partial\phi}{\partial x} + \frac{\partial\phi}{\partial z} = \frac{2x - l}{2x(l - x)} \qquad (12)$

the governing equation of motion of the fluid flow in the film region [33] is

$\frac{\partial^2 u}{\partial z^2} = \frac{1}{\eta}\,\frac{\partial}{\partial x}\Big(p - \frac{1}{2}\mu_0\bar\mu H^2\Big) \qquad (13)$

figure 1: configuration of a sine film porous slider bearing including squeeze action [22].

by solving equation (13) with the no-slip boundary conditions $u = 0$ at $z = h$ and $u = u$ at $z = 0$, one finds

$u = \frac{1}{\eta}\Big(\frac{z^2}{2} - \frac{h}{2}z\Big)\frac{\partial p}{\partial x} + u\Big(1 - \frac{z}{h}\Big) \qquad (14)$

integrating equation (14) over the film region yields

$\int_0^h u\, dz = -\frac{h^3}{12\eta}\frac{dp}{dx} + \frac{uh}{2} \qquad (15)$

using equation (15) in the continuity equation

$\frac{\partial}{\partial x}\int_0^h u\, dz + w_h - w_0 = 0 \qquad (16)$

yields

$\frac{\partial}{\partial x}\Big[-\frac{h^3}{12\eta}\frac{dp}{dx} + \frac{uh}{2}\Big] + w_h - w_0 = 0 \qquad (17)$

where $w_h = -\dot h_0$ and $w_0 = 0$. equation (17) leads to

$\frac{d}{dx}\Big[h^3 \frac{d}{dx}\Big(p - \frac{1}{2}\mu_0\bar\mu H^2\Big)\Big] = 6\eta u\,\frac{dh}{dx} + 12\eta\dot h_0 \qquad (18)$

which is the reynolds equation [7, 21] modified according to the general hydrodynamic lubrication assumptions. applying the stochastic averaging process of [4], equation (18) becomes

$\frac{d}{dx}\Big[E(h^3)\frac{d}{dx}\Big(p - \frac{1}{2}\mu_0\bar\mu H^2\Big)\Big] = 6\eta u\,\frac{d}{dx}[E(h)] + 12\eta\dot h_0,$

i.e.

$\frac{d}{dx}\Big[g(\bar h, \alpha, \sigma, \varepsilon, \psi)\frac{d}{dx}\Big(p - \frac{1}{2}\mu_0\bar\mu H^2\Big)\Big] = 6\eta u\,\frac{d}{dx}\Big[g(\bar h, \alpha, \sigma, \varepsilon, \psi)^{1/3}\Big] + 12\eta\dot h_0 \qquad (19)$

where

$g(\bar h, \alpha, \sigma, \varepsilon, \psi) = \bar h^3 + 3\alpha\bar h^2 + 3(\sigma^2 + \alpha^2)\bar h + 3\sigma^2\alpha + \alpha^3 + \varepsilon + 12\psi l_1 \qquad (20)$

2.1. a globular sphere model

a porous material filled with globular particles (of mean particle size $d_c$) is shown in figure 2.

figure 2: structure model of a porous sheet given by kozeny-carman [34].

in fluid dynamics, the kozeny-carman equation [35] plays a major role in calculating the pressure drop of a fluid flowing through a packed bed of solids, although the equation remains valid only for laminar flow. the equation captures a few general experimental trends, which makes it an efficient quality control tool for both physical and digital experimental results; it is commonly displayed as permeability versus porosity, pore size and tortuosity. the pressure gradient is assumed to be linear here. following the ideas discussed in [36], the kozeny-carman formula becomes

$\psi = \frac{d_c^2\, e^3}{72(1 - e)^2}\,\frac{l}{l'}$

where $e$ is the porosity and $\frac{l}{l'}$ is the length ratio.
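the permeability expression above is a one-line computation. a direct transcription in python (the sample argument values are ours, not from the paper):

```python
# kozeny-carman permeability: psi = d_c^2 e^3 / (72 (1-e)^2) * (l / l').
def kozeny_carman_permeability(d_c, e, length_ratio):
    return d_c**2 * e**3 / (72.0 * (1.0 - e) ** 2) * length_ratio

print(kozeny_carman_permeability(d_c=1e-4, e=0.15, length_ratio=1.75))
```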
the following dimensionless quantities are used:

$\bar g(\bar h, \bar\alpha, \bar\sigma, \bar\varepsilon, \bar\psi) = \frac{g(\bar h, \alpha, \sigma, \varepsilon, \psi)}{h_0^3}, \quad \bar x = \frac{x}{l}, \quad a = \frac{h_1}{h_0}, \quad \bar\alpha = \frac{\alpha}{h_0}, \quad \bar\sigma = \frac{\sigma}{h_0}, \quad \bar\varepsilon = \frac{\varepsilon}{h_0^3},$

$\bar p = \frac{p h_0^2}{\eta u l}, \quad \mu^* = \frac{\mu_0 \bar\mu h_0^2 l k}{\eta u}, \quad \beta_1 = -\frac{u h_0}{2 \dot h_0 l}, \quad \bar h = 1 + (a - 1)\Big[1 - \sin\Big(\frac{\pi}{2}\bar x\Big)\Big], \quad l^* = \frac{l}{l'}, \quad \bar\psi = \frac{d_c^2 l_1}{h_0^3} \qquad (21)$

the associated boundary conditions are

$\bar p = 0 \text{ at } \bar x = 0, 1 \qquad (22)$

with the aid of equation (22), the pressure distribution in non-dimensional form comes out to be

$\bar p = \frac{1}{2}\mu^* \bar x(1 - \bar x) + 6\int_0^{\bar x} \frac{\bar g(\bar h, \bar\alpha, \bar\sigma, \bar\varepsilon, \bar\psi)^{1/3} - \bar g(1, \bar\alpha, \bar\sigma, \bar\varepsilon, \bar\psi)^{1/3} + \beta_1^{-1}(1 - \bar x)}{\bar g(\bar h, \bar\alpha, \bar\sigma, \bar\varepsilon, \bar\psi)}\, d\bar x \qquad (23)$

where

$\bar g(\bar h, \bar\alpha, \bar\sigma, \bar\varepsilon, \bar\psi) = \bar h^3 + 3\bar\alpha \bar h^2 + 3(\bar\sigma^2 + \bar\alpha^2)\bar h + 3\bar\sigma^2\bar\alpha + \bar\alpha^3 + \bar\varepsilon + \frac{\bar\psi\, e^3 l^*}{6(1 - e)^2} \qquad (24)$

the load bearing capacity in dimensionless form is obtained as

$\bar w = \frac{h_0^2}{\eta u l^2 b}\, w \qquad (25)$

$\bar w = \frac{\mu^*}{12} + 6\int_0^1 \frac{\bar g(\bar h, \bar\alpha, \bar\sigma, \bar\varepsilon, \bar\psi)^{1/3} - \bar g(1, \bar\alpha, \bar\sigma, \bar\varepsilon, \bar\psi)^{1/3} + \beta_1^{-1}(1 - \bar x)}{\bar g(\bar h, \bar\alpha, \bar\sigma, \bar\varepsilon, \bar\psi)}\,(1 - \bar x)\, d\bar x \qquad (26)$

where the load bearing capacity is calculated using

$w = \int_0^l p\, b\, dx \qquad (27)$

3. results and discussion

the results calculated for the dimensionless load carrying capacity $\bar w$ given by equation (26) are found using simpson's one-third rule with a step size of 0.2 for the kozeny-carman model. it shows that magnetization increases the load bearing capacity by $\frac{\mu^*}{12}$. equation (26) suggests that even in the absence of a flow, the bearing system can support a certain load for the kozeny-carman model. by setting the roughness to zero, the study reduces to the impact of an assorted porous structure on the neuringer-rosensweig model based ferrofluid squeeze film for a slider bearing [7]. considering the magnetization parameter to be zero, it reduces, in the absence of porosity, to the study of [37].

equation (26) clearly shows that the expression for $\bar w$ is linear with respect to the magnetization parameter $\mu^*$. thus, when the kozeny-carman model is applicable, increasing the magnetization increases the load bearing capacity (fig. 3).

figures 3–10 display a graphical representation of the kozeny-carman model results. they suggest that:

(1.) according to fig. 4, the standard deviation has a relatively lower impact when compared to porosity.
(2.) as the positive variance increases, the load carrying capacity decreases, while a decrease in the negative variance leads to an increase in the load carrying capacity (fig. 5). as suggested by fig. 6, the impact of skewness on the load carrying capacity is similar to that of variance.
(3.) the effect of $\bar\psi$ on $\bar w$ with respect to $e$ and $l^*$ is seen to be adverse from fig. 7.
(4.) figure 8 demonstrates the impact of porosity on the distribution of the load carrying capacity; porosity considerably reduces the load bearing capacity. in the case of a measure of symmetry, this scenario is further exaggerated.
(5.) figure 9 displays the impact of the ratio $l^*$ on the load bearing capacity: with an increase in $l^*$, the load bearing capacity decreases, and the rate of decrease grows with an increase in the porosity parameter $e$.

if we look at the results presented in fig. 10a and correlate them with the rest of fig. 10, we can firmly conclude that the kozeny-carman model performs strongly in comparison with the conventional porosity case.

4. validation

undoubtedly, tables 1–5 underline that an enhancement in the load bearing capacity by almost 5 % is registered here.
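a numerical sketch of the quadrature behind eq. (26) is straightforward. the roughness and porosity parameters below follow the table headers; the inlet-outlet ratio $a$ is not stated in the tables, so its value here is our assumption, and the result is illustrative rather than a reproduction of the paper's figures:

```python
# sketch of the load-capacity integral of eq. (26), kozeny-carman case.
import numpy as np
from scipy.integrate import simpson

a, mu_star = 2.0, 0.02                    # a = h1/h0 is our assumption
alpha, sigma, eps = 0.01, 0.01, 0.05
inv_beta1, l_star, e, psi = 0.01, 1.75, 0.15, 0.02

def g(hbar):
    return (hbar**3 + 3*alpha*hbar**2 + 3*(sigma**2 + alpha**2)*hbar
            + 3*sigma**2*alpha + alpha**3 + eps
            + psi * e**3 * l_star / (6.0 * (1.0 - e)**2))

x = np.linspace(0.0, 1.0, 201)
hbar = 1.0 + (a - 1.0) * (1.0 - np.sin(0.5 * np.pi * x))
integrand = (g(hbar)**(1/3) - g(1.0)**(1/3) + inv_beta1*(1 - x)) / g(hbar) * (1 - x)
w_bar = mu_star / 12.0 + 6.0 * simpson(integrand, x=x)
print(w_bar)
```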
(d). figure 4. profile of load bearing capacity with regards to σ. 148 vol. 59 no. 2/2019 a study of ferrofluid lubrication based rough sine film slider bearing. . . (a). (b). figure 5. profile of load bearing capacity with regards to α. figure 6. profile of load bearing capacity with regards to ε. µ∗ load bearing capacity calculated for (α = 0.01,σ = 0.01,ε = 0.05, ψ = 30, 1/β1 = 0.01, l∗ = 1.75, e = 0.15,ψ∗ = 0.02) result for assorted porosity result for conventional porosity 0.01 0.1651655 0.1567618 0.02 0.1659988 0.1575952 0.03 0.1668321 0.1584285 0.04 0.1676655 0.1592618 0.05 0.1684988 0.1600952 table 1. comparison of w calculated for µ∗. (a). (b). figure 7. profile of load bearing capacity with regards to ψ. figure 8. profile of load bearing capacity with regards to e. α load bearing capacity calculated for (µ∗ = 0.02,σ = 0.01,ε = 0.05, ψ = 30, 1/β1 = 0.01, l∗ = 1.75, e = 0.15,ψ∗ = 0.02) result for assorted porosity result for conventional porosity -0.02 0.1716939 0.1623983 -0.01 0.1697629 0.1607764 0 0.1678648 0.1591755 0.01 0.1659988 0.1575952 0.02 0.1641643 0.1560354 table 2. comparison of w calculated for α. 149 m. m. munshi, a. r. patel, g. b. deheri acta polytechnica (a). (b). figure 9. profile of load bearing capacity with regards to l∗. σ load bearing capacity calculated for (µ∗ = 0.02,α = 0.01,ε = 0.05, ψ = 30, 1/β1 = 0.01, l∗ = 1.75, e = 0.15,ψ∗ = 0.02) result for assorted porosity result for conventional porosity 0.01 0.1659988 0.1575952 0.03 0.1658797 0.1574952 0.05 0.1656424 0.1572959 0.07 0.1652885 0.1569985 0.09 0.1648207 0.1566048 table 3. comparison of w calculated for σ. (a). (b). (c). figure 10. profile of load bearing capacity with regards to µ∗, σ and � for the comparison of e and ψ∗. ε load bearing capacity calculated for (µ∗ = 0.02,α = 0.01,σ = 0.03, ψ = 30, 1/β1 = 0.01, l∗ = 1.75, e = 0.15,ψ∗ = 0.02) result for assorted porosity result for conventional porosity -0.02 0.1692074 0.1602794 -0.01 0.168718 0.1598716 0 0.1682335 0.1594672 0.01 0.1677536 0.1590663 0.02 0.1672784 0.1586686 table 4. comparison of w calculated for ε. 150 vol. 59 no. 2/2019 a study of ferrofluid lubrication based rough sine film slider bearing. . . e load bearing capacity calculated for (µ∗ = 0.02,α = 0.01,σ = 0.01, ψ = 30,ε = −0.01, 1/β1 = 0.01, l∗ = 0.08,ψ∗ = 0.02) result for assorted porosity result for conventional porosity 0.1 0.1706303 0.1599767 0.15 0.1699394 0.1599767 0.2 0.1684007 0.1599767 0.25 0.1655293 0.1599767 0.3 0.1607807 0.1599767 table 5. comparison of w calculated for e. 5. conclusions this paper has studied the effect of ferrofluid lubrication when used with a rough sine film slider bearing with an assorted porous structure on the load carrying capacity. a modified reynolds’ equation used for the sine profile slider bearing lubrication has been derived with the ferrohydrodynamic theory by neuringer-rosensweig and equation of continuity for film as well as porous region. the reynolds’ equation has also been used to determine the pressure equation and an expression for dimensionless load-carrying capacity. from the numerical calculations, the following conclusions have been derived: (1.) by increasing the strength of the external magnetic field, a bearing system’s pressure and its load bearing capacity can be increased considerably. also, unlike conventional lubricants, this type of a system can carry a given amount of load even if there is no flow. 
additionally, as suggested by equation (13), when the neuringer-rosensweig ferrofluid flow model is applicable, a constant magnetic field does not increase the load bearing capacity. (2.) comparing the present paper with [15] makes it evident that the system, in this case, enhances the load carrying capacity threefold at minimum. also, when a sine film profile is used to design the slider bearing, it enhances the bearing capacity, as can be seen when compared with an inclined slider bearing. lastly, the article determines that, when kozenycarman’s model is appropriate, the surface roughness must be studied properly in order to design a more efficient bearing system. list of symbols a inlet-outlet ratio b breath of the bearing g function of different parameters h film thickness [mm] h mean film thickness [mm] hs deviation from mean level h1 maximum film thickness h0 minimum film thickness h non dimensional film thickness h external magnetic field ḣ0 squeeze velocity in z-direction h0 thickness of porous facing l length of the bearing p non-dimensional film pressure w0 values of w at z = 0 wh values of w at z = h w load capacity [n] w non-dimensional load capacity α variance [mm] α non-dimensional variance β1 squeeze parameter ε skewness [mm3] ε skewness in dimensionless form µ0 magnetic characteristic µ magnetic susceptibility of particles µ∗ dimensionless magnetization parameter σ standard deviation [mm] σ dimensionless standard deviation ψ permeability of porous region ψ∗ dimensionless conventional porosity acknowledgements the authors would like to thank the reviewers for their comments and suggestions, which resulted in an improvement of the materials presented in the paper. references [1] p. andharia, j. l. gupta, g. deheri. effect of surface roughness on hydrodynamic lubrication of slider bearings. tribology transaction 44(2):291–297, 2001. doi:10.1080/10402000108982461. [2] n. naduvinamani, t. biradar. effects of surface roughness on porous inclined slider bearings lubricated with micropolar fluids. journal of marine science and technology 15(4):278–286, 2007. [3] n. naduvinamani, s. apparao, h. a. gundayya, s. n. biradar. effect of pressure dependent viscosity on couple stress squeeze film lubrication between rough parallel plates. tribology online 10(1):76–83, 2015. doi:10.2474/trol.10.76. [4] h. christensen, k. c. tønder. tribology of rough surfaces: stochastic models of hydrodynamic lubrication. sintef report 10/69, sintef, 1969a. [5] h. christensen, k. c. tønder. tribology of rough surfaces: parametric study and comparison of lubrication model. sintef report 22/69, sintef, 1969b. [6] h. christensen, k. c. tønder. the hydrodynamic lubrication of rough bearing surfaces of finite width. in asme-asle lubrication conference, pp. 12–15. cincinnati, 1970. [7] m. v. bhat. lubrication with a magnetic fluid. team spirit pvt. ltd, india, 2003. 151 http://dx.doi.org/10.1080/10402000108982461 http://dx.doi.org/10.2474/trol.10.76 m. m. munshi, a. r. patel, g. b. deheri acta polytechnica [8] b. j. hamrock. fundamentals of fluid film lubrication. mcgraw-hill, new york, 1994. [9] j. l. neuringer, r. e. rosensweig. magnetic fluids. physics of fluids 7(12):1927–1937, 1964. doi:10.1063/1.1711103. [10] n. patel, d. vakharia, g. deheri. hydrodynamic journal bearing lubricated with a ferrofluid. industrial lubrication and tribology 69(5):754–760, 2017. doi:10.1108/ilt-08-2016-0179. [11] y. d. vashi, r. m. patel, g. deheri. 
ferrofluid based squeeze film lubrication between rough stepped plates with couple stress effect. journal of applied fluid mechanics 11(3):597–612, 2018. doi:10.29252/jafm.11.03.27854.
[8] b. j. hamrock. fundamentals of fluid film lubrication. mcgraw-hill, new york, 1994.
[9] j. l. neuringer, r. e. rosensweig. magnetic fluids. physics of fluids 7(12):1927–1937, 1964. doi:10.1063/1.1711103.
[10] n. patel, d. vakharia, g. deheri. hydrodynamic journal bearing lubricated with a ferrofluid. industrial lubrication and tribology 69(5):754–760, 2017. doi:10.1108/ilt-08-2016-0179.
[12] r. shah, m. v. bhat. lubrication of a porous exponential slider bearing by ferrofluid with slip velocity. turkish journal of engineering and environmental sciences 27:183–187, 2003.
[13] r. shah, m. v. bhat. porous secant shaped slider bearing with slip velocity lubricated by ferrofluid. industrial lubrication and tribology 55(3):113–115, 2003. doi:10.1108/00368790310470930.
[14] n. b. naduvinamani, s. apparao. on the performance of rough inclined stepped composite bearings with micropolar fluid. journal of marine science and technology 18(2):233–242, 2010.
[15] n. d. patel, g. deheri. a ferrofluid lubrication of a rough, porous inclined slider bearing with slip velocity. journal of mechanical engineering and technology 4(1):15–44, 2012.
[16] p. ram, p. verma. ferrofluid lubrication in porous inclined slider bearing. indian journal of pure and applied mathematics 30(12):1273–1281, 1999.
[17] u. p. singh. on the performance of pivoted curved slider bearings: rabinowitsch fluid model. in proceedings of national tribology conference. iit roorkee, india, 2011.
[18] j. patel, g. deheri. performance of a ferrofluid based rough parallel plate slider bearing: a comparison of three magnetic fluid flow models. advances in tribology 2016:1–9, 2016. doi:10.1155/2016/8197160.
[19] s. snehal, g. deheri. effect of slip velocity on magnetic fluid lubrication of rough porous rayleigh step bearing. journal of mechanical engineering and sciences 4:532–547, 2013. doi:10.15282/jmes.4.2013.17.0050.
[20] n. d. patel, g. deheri. hydromagnetic lubrication of a rough porous parabolic slider bearing with slip velocity. journal of applied mechanical engineering 3(3):1–8, 2014. doi:10.4172/2168-9873.1000143.
[21] s. patel, g. deheri, j. patel. ferrofluid lubrication of a rough porous hyperbolic slider bearing with slip velocity. tribology in industry 36(3):259–268, 2014.
[22] j. r. lin. dynamic stiffness and damping characteristics of a sine film thrust bearing. in proceedings of the international conference on advanced manufacture technology and industrial application. china, 2016. doi:10.12783/dtetr/amita2016/3580.
[23] g. deheri, j. patel, n. patel. shliomis model based ferrofluid lubrication of a rough porous convex pad slider bearing. tribology in industry 38(1):57–65, 2016.
[24] j. patel, g. deheri. a study of thin film lubrication at nanoscale for a ferrofluid based infinitely long rough porous slider bearing. facta universitatis, series: mechanical engineering 14(1):89–99, 2016. doi:10.22190/fume1601089p.
[25] v. t. morgan, a. cameron. mechanism of lubrication in porous metal bearings. in proceedings of the conference on lubrication and wear. institution of mechanical engineers, london, 1957.
[26] j. patel, g. deheri. a comparison of porous structures on the performance of a magnetic fluid based rough short bearing. tribology in industry 35(3):177–189, 2013.
[27] j. patel, g. deheri. performance of a magnetic fluid based double layered rough porous slider bearing considering the combined porous structures. acta technica corviniensis, bulletin of engineering 7(4):115–125, 2014.
[28] j. patel, g. deheri. shliomis model-based magnetic squeeze film in rotating rough curved circular plates: a comparison of two different porous structures. international journal of computational materials science and surface engineering 6(1):29–49, 2014.
doi:10.1504/ijcmsse.2014.063760.
[29] r. shah, d. b. patel. squeeze film based on ferrofluid in curved porous circular plates with various porous structure. applied mathematics 2(4):121–123, 2012. doi:10.5923/j.am.20120204.04.
[30] b. prajapati. on certain theoretical studies in hydrodynamic and electro-magneto hydrodynamic lubrication. ph.d. thesis, s.p. university, vallabh vidyanagar, 1995.
[31] m. barik, s. mishra, g. c. dash. effect of sinusoidal magnetic field on a rough porous hyperbolic slider bearing with ferrofluid lubrication and slip velocity. tribology, materials, surfaces & interfaces 10(3):131–137, 2016. doi:10.1080/17515831.2016.1235843.
[32] s. mishra, m. barik, g. c. dash. an analysis of hydrodynamic ferrofluid lubrication of an inclined rough slider bearing. tribology, materials, surfaces & interfaces 12(1):17–26, 2018. doi:10.1080/17515831.2017.1418280.
[33] p. verma. magnetic fluid-based squeeze film. international journal of engineering science 24(3):395–401, 1986. doi:10.1016/0020-7225(86)90095-9.
[34] k. yazdchi, s. srivastava, s. luding. on the validity of the carman-kozeny equation in random fibrous media. in ii international conference on particle-based methods, fundamentals and applications, particles, pp. 1–10. barcelona, 2011.
[35] p. c. carman. fluid flow through granular beds. transactions of the institute of chemical engineering 15:150–166, 1937.
[36] j. liu. analysis of a porous elastic sheet damper with a magnetic fluid. journal of tribology 131(2), 2009. doi:10.1115/1.3075870.
[37] s. k. basu, s. n. sengupta, b. b. ahuja. fundamentals of tribology. phi private limited, new-delhi, india, 2009.

acta polytechnica vol. 45 no. 2/2005

realization of logical circuits with majority logical function as symmetrical function

j. bokr, v. jáneš

the paper deals with the "production" and design of symmetrical functions, particularly aimed at the design of circuits with majority elements, which lead to interesting solutions of logical structures. the solutions are presented in several examples, which show the applicability of the procedures to the design of fpga morphology on chips.

keywords: shannon extension development, hamming weight, derivation of boolean function, symmetrical and majority function.

1 introduction

binary logical circuits designed with respect to boolean symmetrical, particularly majority, output functions are certainly worth attention. the article, therefore, makes an evaluation both by controlling binary one-digit adders and by using functions interpreted by arithmetic polynomials. it also demonstrates how effectively the shannon decomposition of the output functions can be used in designing a circuit with majority elements.

2 boolean function

let a boolean function $f: \{0,1\}^m \to \{0,1\}: (x_1, x_2, \dots, x_m) \mapsto y$ be given. if we denote the set $\{x_i\}_{i=1}^m$ of arguments $x_i$ by the symbol $x$, we can briefly write $f(x)$ instead of $f(x_1, x_2, \dots, x_m)$. in addition, instead of $f(x_1, x_2, \dots, x_{i-1}, \sigma_i, x_{i+1}, \dots, x_m)$, in which $\sigma_i \in \{0,1\}$, let us simply write $f(x_i = \sigma_i)$.
let $x^\sigma = \bar x$ for $\sigma = 0$ and $x^\sigma = x$ for $\sigma = 1$; any boolean function $f(x_1, x_2, \dots, x_m)$ can be expressed, without loss of generality, by the shannon extension development

$f(x) = \bigvee_{\sigma_1, \sigma_2, \dots, \sigma_n} x_1^{\sigma_1} x_2^{\sigma_2} \cdots x_n^{\sigma_n}\, f(\sigma_1, \sigma_2, \dots, \sigma_n, x_{n+1}, \dots, x_m)$

where $n \leq m$; especially,

$f(x) = \bar x_i\, f(x_i = 0) \vee x_i\, f(x_i = 1).$

the functions $f(\sigma_1, \dots, \sigma_n, x_{n+1}, \dots, x_m)$ will be called remainder functions.

by the hamming weight $w_h(f(x))$ of the function $f(x)$ we understand the value of the arithmetic formula

$w_h(f(x)) = \sum_{\sigma_1, \sigma_2, \dots, \sigma_m} f(\sigma_1, \sigma_2, \dots, \sigma_m).$

the partial derivation $\frac{\partial f(x)}{\partial x_i}$ [1] of the function $f(x)$ by the argument $x_i$ will be termed the boolean function

$\frac{\partial f(x)}{\partial x_i} = f(x_1, \dots, x_{i-1}, 0, x_{i+1}, \dots, x_m) \oplus f(x_1, \dots, x_{i-1}, 1, x_{i+1}, \dots, x_m),$

defining the conditions under which $f(x)$ changes its value while the value of the argument $x_i$ is changed. for example, for $y = x_2 x_3 \vee x_1 \bar x_2 \bar x_3$ we have

$w_h\Big(\frac{\partial y}{\partial x_1}\Big) = w_h\big((x_2 x_3) \oplus (x_2 x_3 \vee \bar x_2 \bar x_3)\big) = w_h(\bar x_2 \bar x_3) = 1;$

the function $y$ changes its value with a change of the argument $x_1$ under a single condition, namely $\bar x_2 \bar x_3 = 1$.

3 boolean formulae and arithmetic polynomials

let us have $f(x_1, x_2, \dots, x_m) = \bigvee_i x_{i_1}^{\sigma_{i_1}} x_{i_2}^{\sigma_{i_2}} \cdots x_{i_k}^{\sigma_{i_k}}$, where $k \in \{1, 2, \dots, m\}$ and $i \in \{0, 1, \dots, 2^m - 1\}$; the function $f(x)$ is then represented by a normal disjunctive formula (ndf $f(x)$). if there holds

$x_{i_1}^{\sigma_{i_1}} \cdots x_{i_k}^{\sigma_{i_k}} \cdot x_{j_1}^{\sigma_{j_1}} \cdots x_{j_l}^{\sigma_{j_l}} = 0 \quad \text{for } i \neq j,$

the conjuncts presented are termed orthogonal; in particular, all conjuncts of a complete normal disjunctive formula of a symmetrical boolean function (see section 4) are mutually orthogonal. if all conjuncts of ndf $f(x)$ are mutually orthogonal, we can also write $f(x) = \sum_i x_{i_1}^{\sigma_{i_1}} x_{i_2}^{\sigma_{i_2}} \cdots x_{i_k}^{\sigma_{i_k}}$.

note that the function $f(x)$ can also be conveniently expressed by the boolean (zegalkin) polynomial [4]. since, as can easily be confirmed, the following equalities hold:

$x \vee y = x + y - xy$ (probability addition), $\quad x \cdot y = xy, \quad \bar x = 1 - x,$

and also ($x$ and $\bar x y$ being orthogonal)

$x \vee y = x \vee \bar x y = x + (1 - x)y,$

each ndf $f(x)$ can be expressed by an arithmetic polynomial

$f(x_1, x_2, \dots, x_m) = a_0 + a_1 x_1 + a_2 x_2 + \dots + a_m x_m + a_{m+1} x_1 x_2 + \dots + a_{2^m - 1} x_1 x_2 \cdots x_m$

with integer coefficients $a_i$ $(i = 0, 1, \dots, 2^m - 1)$; this can be done either by applying the equality $x \vee y = x + y - xy$, or by orthogonalizing all conjuncts of ndf $f(x)$ and applying $x \vee y = x + (1 - x)y$. note that if the boolean function $f(x)$ is expressed by the arithmetic polynomial $a(x)$, then $f(x) = \operatorname{sign}(a(x))$.

example 1. let ndf $f(x_1, x_2, x_3) = (x_1 \oplus x_2)x_3 \vee x_1 x_2$ be given. express the given formula by means of an arithmetic polynomial $a(x_1, x_2, x_3)$, using $x \oplus y = x + y - 2xy$:

$(x_1 \oplus x_2)x_3 \vee x_1 x_2 = (x_1 + x_2 - 2x_1 x_2)x_3 \vee x_1 x_2 = (x_1 x_3 + x_2 x_3 - 2x_1 x_2 x_3) + x_1 x_2 - x_1 x_2(x_1 x_3 + x_2 x_3 - 2x_1 x_2 x_3) = x_1 x_2 + x_1 x_3 + x_2 x_3 - 2x_1 x_2 x_3,$

since $x^2 = x$ for $x \in \{0, 1\}$ makes the last product term vanish. the same polynomial is obtained by orthogonalizing the conjuncts and applying $\bar x = 1 - x$.

4 symmetrical boolean function

let $\tilde x: x_1, x_2, \dots, x_m \mapsto x_{i_1}, x_{i_2}, \dots, x_{i_m}$ run over the set of all permutations of the arguments from $x$; the function $f(x)$ is called symmetrical if $f(x_1, x_2, \dots, x_m) = f(x_{i_1}, x_{i_2}, \dots, x_{i_m})$; e.g., $0$, $x \oplus y$ and $xy \vee xz \vee yz$ are symmetrical functions.

let $\{p_j\}_{j=1}^k$ be a set of integers $p_j$ (called operational or characteristic numbers) such that $0 \leq p_j \leq m$. it can be demonstrated [2] that $f(x)$ is symmetrical just if $f(\sigma_1, \dots, \sigma_m) = 1$ exactly when $w_h(\sigma_1, \dots, \sigma_m) = p_j$ for some $j$. the symmetrical function with characteristic numbers $p_j$ will be denoted $s^m_{\{p_j\}}$; obviously, $s^m_\emptyset = 0$ and $s^m_{\{0, 1, \dots, m\}} = 1$. for example: $s^2_{\{1\}} = x \oplus y$, $s^3_{\{2,3\}} = xy \vee xz \vee yz$.

the symmetrical function $s^m_{\{p\}}$ is elementary; for the length $c(\text{ndf } s^m_{\{p\}})$ of its complete normal disjunctive formula there holds $c(\text{ndf } s^m_{\{p\}}) = \binom{m}{p}$, since the complete ndf contains exactly one conjunct $x_1^{\sigma_1} x_2^{\sigma_2} \cdots x_m^{\sigma_m}$ for every $(\sigma_1, \dots, \sigma_m)$ with $w_h(\sigma_1, \dots, \sigma_m) = p$. there also holds [2] that the number of symmetrical functions of $m$ arguments equals $2^{m+1}$ (one for every subset of $\{0, 1, \dots, m\}$), as well as $s^m_{\{p_i\}} \cdot s^m_{\{p_j\}} = 0$ for $p_i \neq p_j$.

any symmetrical function $s^m_{\{p_j\}_{j=1}^k}$ can be written in the form of its complete ndf

$s^m_{\{p_j\}_{j=1}^k} = \bigvee_{w_h(\sigma_1, \dots, \sigma_m) \in \{p_1, \dots, p_k\}} x_1^{\sigma_1} x_2^{\sigma_2} \cdots x_m^{\sigma_m},$

since [2] $s^m_{\{p_j\}_{j=1}^k} = \bigvee_{j=1}^k s^m_{\{p_j\}}$. we can also write $s^m_{\{p_j\}_{j=1}^k} = \bigvee_{i=0}^m \delta_i\, s^m_{\{i\}}$, where $\delta_i = 1$ for $i \in \{p_j\}$ and $\delta_i = 0$ otherwise.

denote by $\mathbf{s}^m_n$ ($n = 0, 1, \dots, m$) the elementary symmetrical boolean functions whose representations in the form of ndf do not contain negated variables. for example $\mathbf{s}^m_0 = 1$, $\mathbf{s}^m_1 = x_1 \vee x_2 \vee \dots \vee x_m$, $\mathbf{s}^3_2 = x_1 x_2 \vee x_1 x_3 \vee x_2 x_3$, $\mathbf{s}^m_m = x_1 x_2 \cdots x_m$.

every function $s^m_{\{n\}}$ with $n < m$ (for $n = m$ we have $s^m_{\{m\}} = x_1 x_2 \cdots x_m$) can be expressed by the composition $s^m_{\{n\}} = \mathbf{s}^m_n\, \overline{\mathbf{s}^m_{n+1}}$. for example

$s^3_{\{2\}} = \mathbf{s}^3_2\, \overline{\mathbf{s}^3_3} = (x_1 x_2 \vee x_1 x_3 \vee x_2 x_3)\, \overline{x_1 x_2 x_3} = x_1 x_2 \bar x_3 \vee x_1 \bar x_2 x_3 \vee \bar x_1 x_2 x_3.$

symmetrical functions are discussed in greater detail, e.g., in [2, 3, 4]. the majority function refers to the symmetrical function whose characteristic numbers are the majorities $i > \frac{m}{2}$; for the three-variable majority function $\mathrm{maj}^3_{\{2,3\}} = x_1 x_2 \vee x_1 x_3 \vee x_2 x_3$ an infix notation $x_1 \# x_2 \# x_3$ can be used. there obviously holds $x_1 \# x_1 \# x_3 = x_1$ and $x_1 \# \bar x_1 \# x_3 = x_3$.
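the notions above translate directly into code. a minimal python sketch (truth-table style; the helper names are ours) of the hamming weight, the boolean derivative, and the symmetrical functions $s^m_P$, checked against the examples in the text:

```python
# a minimal sketch of the notions above; helper names are ours.
from itertools import product

def hamming_weight(f, m):
    """w_h(f): number of argument tuples on which f equals 1."""
    return sum(f(*s) for s in product((0, 1), repeat=m))

def derivative_weight(f, m, i):
    """w_h(df/dx_i): tuples of the remaining arguments on which
    flipping x_i flips the value of f."""
    return sum(f(*s[:i], 0, *s[i:]) ^ f(*s[:i], 1, *s[i:])
               for s in product((0, 1), repeat=m - 1))

def s(P, args):
    """symmetrical function s^m_P: 1 iff the count of 1s lies in P."""
    return 1 if sum(args) in P else 0

# y = x2*x3 v x1*(1-x2)*(1-x3): w_h(dy/dx1) = 1, as in section 2
y = lambda x1, x2, x3: (x2 & x3) | (x1 & (1 - x2) & (1 - x3))
assert derivative_weight(y, 3, 0) == 1

# the majority identities x1#x1#x3 = x1 and x1#(not x1)#x3 = x3
maj3 = lambda a, b, c: s({2, 3}, (a, b, c))
assert all(maj3(a, a, c) == a and maj3(a, 1 - a, c) == c
           for a in (0, 1) for c in (0, 1))
```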
5 numerical representation of symmetrical functions

we might construct a minimal normal disjunctive formula for a given symmetrical function [4], or decompose the given function ($s^m_{\{n\}} = \mathbf{s}^m_n\, \overline{\mathbf{s}^m_{n+1}}$) and design, according to the constructed formulae, a structural model of the given function in one of the structurally complete systems of static elements [5].

further, consider a one-digit binary half-adder or adder (fig. 1: a) half-adder, b) adder), where $a_i$, $b_i$ are the binary addends, $\sigma_i$ is the sum in position $i$, and $c_i^-$, $c_i^+$ denote the carry from position $i - 1$ and into position $i + 1$, respectively. the half-adder can be modeled by the system of output functions $\sigma_i = a_i \oplus b_i = s^2_{\{1\}}$ and $c_i^+ = a_i b_i = s^2_{\{2\}}$; by analogy, for the adder we obtain $\sigma_i = a_i \oplus b_i \oplus c_i^- = s^3_{\{1,3\}}$ and $c_i^+ = a_i b_i \vee a_i c_i^- \vee b_i c_i^- = s^3_{\{2,3\}}$.

it is therefore sufficient to provide the half-adder with an inverse disjunctor (fig. 1a) and the adder with a decoder (fig. 1b), and we obtain the products $s^2_{\{0\}}, s^2_{\{1\}}, s^2_{\{2\}}$ and $s^3_{\{0\}}, s^3_{\{1\}}, s^3_{\{2\}}, s^3_{\{3\}}$, since

$s^3_{\{1\}} = s^3_{\{1,3\}}\, \overline{s^3_{\{2,3\}}}, \quad s^3_{\{2\}} = \overline{s^3_{\{1,3\}}}\, s^3_{\{2,3\}}, \quad s^3_{\{3\}} = s^3_{\{1,3\}}\, s^3_{\{2,3\}}, \quad s^3_{\{0\}} = \overline{s^3_{\{1,3\}}}\, \overline{s^3_{\{2,3\}}}.$

example 2. design a structural model producing the symmetrical functions $s^4_{\{i\}}$ with adders and half-adders modeled by systems of symmetrical output functions $s$ (fig. 2). indeed,

$\sigma_1 = x_1 \oplus x_2 \oplus x_3 = s^3_{\{1,3\}}(x_1, x_2, x_3), \qquad c_1 = s^3_{\{2,3\}},$

$\sigma_2 = \bar x_4\, s^3_{\{1,3\}} \vee x_4\, \overline{s^3_{\{1,3\}}} = s^4_{\{1,3\}}, \qquad c_2 = x_4\, s^3_{\{1,3\}},$

$\sigma_3 = c_1 \oplus c_2 = s^4_{\{2,3\}}, \qquad c_3 = c_1 c_2 = s^4_{\{4\}}, \qquad \sigma_4 = \overline{\sigma_2}(\sigma_3 \vee c_3) = s^4_{\{2,4\}}.$

it is easy to obtain $s^4_{\{1,3\}} = \overline{s^4_{\{0,2,4\}}}$, $s^4_{\{2,3\}} = \overline{s^4_{\{0,1,4\}}}$ and $s^4_{\{2,4\}} = \overline{s^4_{\{0,1,3\}}}$; hence, for the decoder:

$s^4_{\{0\}} = \overline{s^4_{\{1,3\}}}\cdot\overline{s^4_{\{2,4\}}}, \quad s^4_{\{1\}} = s^4_{\{1,3\}}\cdot\overline{s^4_{\{2,3\}}}, \quad s^4_{\{2\}} = s^4_{\{2,3\}}\, s^4_{\{2,4\}}, \quad s^4_{\{3\}} = s^4_{\{1,3\}}\, s^4_{\{2,3\}}, \quad s^4_{\{4\}} = \overline{s^4_{\{2,3\}}}\, s^4_{\{2,4\}}.$

figure 2: production of the symmetrical functions $s^4_{\{i\}}$ $(i = 0, 1, 2, 3, 4)$ from example 2.
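the adder chain of example 2 can be simulated exhaustively. a sketch (our own reconstruction of the wiring, checked against the symmetrical-function identities stated above):

```python
# simulate the adder chain of example 2 on all 16 input tuples.
from itertools import product

def full_adder(a, b, c):
    # sum = s3_{1,3}, carry = s3_{2,3}
    return a ^ b ^ c, (a & b) | (a & c) | (b & c)

for x in product((0, 1), repeat=4):
    x1, x2, x3, x4 = x
    s1, c1 = full_adder(x1, x2, x3)
    s2, c2 = s1 ^ x4, s1 & x4          # half-adder
    s3, c3 = c1 ^ c2, c1 & c2          # half-adder
    s4 = (1 - s2) & (s3 | c3)
    w = sum(x)
    assert s2 == (1 if w in (1, 3) else 0)   # s4_{1,3}
    assert s3 == (1 if w in (2, 3) else 0)   # s4_{2,3}
    assert c3 == (1 if w == 4 else 0)        # s4_{4}
    assert s4 == (1 if w in (2, 4) else 0)   # s4_{2,4}
```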
the parametric notation � � cndf s p m j j k �1 being � � s x x m m p x p m p p m p jj j k j j j � � � � � � � � � � � � � ! ! " # $ $ � � 1 1 ( ) (x pj ) m p p j j � 16 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 45 no. 2/2005 czech technical university in prague a) b) fig. 1: a) half-adder , b) adder hence b x m m p xi i i m j m p p j j� � � � � � � � � � � � ! ! " # $ $ � 0 1x pj ( ) we can easily (!) determine the values of coefficients b, provided the structure of polynomials x i is known. example 3.: construct an arithmetic polynomial of the symmetrical function � �s 1 2 3 , . since � �s x x x x x x x x x wh 1 2 3 1 2 3 1 2 3 1 2 1 2 3 1 2 3 1 2 3 , , , ( ) � � � � � � � � � � � � � 3 1 2 3 1 2 3 1 2 3 1 2 3 3 3 � � � � � � � � � �� � � �� � ! " # $ x x x x x x x x x x x x j x j ( ) ( ) , 1 3 3 1 1 3 3 2 3 1 2 2 � � � � � �� � � �� � ! " # $ � � � � �� � � � x x x j j � �� � ! " # $ � � � � � � � � x x x x b b x b x b x 2 2 0 1 1 2 2 3 3 1( ) . hence b b0 3 0� � , b2 1� , b3 1� � provided that x x x x1 1 2 3� � � and x x x x x x x2 1 2 1 3 2 3� � � ; because � �s x x x x x x x x x x x 1 2 3 1 2 3 1 2 3 1 2 3 1 1 1 1 1 1 , ( ) ( ) ( ) ( )( � � � � � � � � � � 2 3 1 2 3 1 2 3 1 2 3 1 1 1 1 2 ) ( ) ( ) ( )( ) ( ) ( x x x x x x x x x x x � � � � � � � � � � � 1 2 1 3 2 3 1 2 3 1 2 3 1 2 1 3 2 3 3x x x x x x x x x x x x x x x x x � � � � � � � � � � ) ( ) ( ) .� 3 1 2 3x x x . 5 circuits with majority elements let us limit ourselves to majority elements modeled with the function � �maj 2 3 . since, as can be easily confirmed, there holds x y x y� � # #1 , xy x y� # #0 , the ndf of the given function f x( ) can be rewritten according to the above quoted equities and we can design the respective static structural model. © czech technical university publishing house http://ctn.cvut.cz/ap/ 17 czech technical university in prague acta polytechnica vol. 45 no. 2/2005 fig. 2: production of symmetrical functions � �s ii 4 0 1 2 3 4( , , , , )� from example 2 fig. 3: structural model from example 4 example 4.: construct a structural model given by the output function y x x x x x� �( )1 2 3 4 5 hence � � y x x x x x� ( # # )# ( # # ) # # #1 2 3 4 50 0 1 0, and thus fig. 3. it is also helpful to use the shannon extension development of the given boolean output function y f x� ( ) � � y x f x x f x x f x x f x i i i i i i i i � � � � � � � � ( ) ( ) # ( )# # # ( )# # 0 1 0 0 1 0 1, and it remains only to decide according to which argument to start and according to which arguments to continue the repeated application of the development. we, therefore, heuristically develop f x( ), first according to the arguments whose change of values leads to the change of values f x( ) under the highest number of conditions, i.e., at the highest. hamming weights pertaining to the derivation of the function f x( ) according to the respective arguments. example 5.: design a structural model given by the output function y x x x x x x x x x x x x x x x x x x� � � � � �1 2 3 1 3 4 1 3 5 1 2 4 2 3 5 3 4 5 with majority elements. according to the map of the given output function, the ndf of the remainder functions of its shannon extension development can easily be constructed (fig. 4.). 
hence � � w y x w y x y x w x x x x x x x h h h � � 1 1 1 2 5 3 4 3 5 2 0 1� � � � � � � � � � ( ) ( ) ( ) ( x x x x x x x x x3 5 2 4 3 4 5 3 5 7� � � �) when stating the formula which expresses the derivation of the function we will preferably use a map, to each field of which we will write the value in the form of a fraction: y x y xi i( ) ( )� �0 1 ; the resulting value of the remainder function formula as well as the weight of its derivation is evident (fig. 5). there also holds � � w y x w y x y x w x x x x x x x x x h h h � � 2 2 2 1 3 4 1 3 3 5 4 0 1� � � � � � � � � ( ) ( ) ( 5 1 4 1 3 5 3 4 3 5 5 ) ( ) , � � � � � �x x x x x x x x x � � w y x w y x y x w x x x x x x x h h h � � 3 3 3 1 4 2 4 4 5 1 0 1� � � � � � � � � � ( ) ( ) ( ) ( x x x x x x5 1 4 5 2 5 8� � �) , � w y x w y x y x w x x x x x x x x x h h h � � 4 4 4 1 2 5 1 3 5 1 4 5 0 1� � � � � � � � ( ) ( ) (� ) ( ) , � � � � �x x x x x x1 3 1 3 2 3 5 � � w y x w y x y x w x x x x x x x x h h h � � 5 5 5 1 3 1 3 4 2 3 4 0 1� � � � � � � � � ( ) ( ) ( ) � � � � � �( ) .x x x x x x x x x x x1 3 2 3 2 3 3 4 1 3 4 7 since max i h i hw y x w y x � � � � � � � % & ' � � 3 8, we write y x y x x y x� � � �3 3 3 30 1( ) ( ) , where y x x x x x x x( )3 1 4 2 4 4 50� � � � and y x x x x x x x( )3 1 4 1 5 2 51� � � � (fig. 6). and, further, there is � w y x x w y x x y x x w x x h h h � � ( ) ( , ) ( , ) ( 3 1 1 3 1 3 4 0 0 0 1 0 � � � � � � � � � �� 5 2 4 4 5 2) ( ) ,� � �x x x x 18 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 45 no. 2/2005 czech technical university in prague fig. 4: the karnaugh map of the output function from example 5 fig. 5: recording of remainder functions � � y x1 from example 5 � w y x x w y x x y x x w x x h h h � � ( ) ( , ) ( , ) ( 3 2 2 3 2 3 1 4 0 0 0 1 0 � � � � � � � � � � � � � �x x x x4 5 4 5 2) ( ) , � w y x x w y x x y x x w x x h h h � � ( ) ( , ) ( , ) ( 3 4 3 4 3 4 5 0 0 0 0 1 � � � � � � � � � �� 1 2 4� �x ) , � w y x x w y x x y x x w x x h h h � � ( ) ( , ) ( , ) ( 3 5 3 5 3 5 1 4 0 0 0 0 1 � � � � � � � � � � � � � � �x x x x x2 4 1 2 4 4) ( ) , � w y x x w y x x y x x w x x h h h � � ( ) ( , ) ( , )3 1 1 3 1 3 2 5 1 0 1 1 1 � � � � � � � � � �� ( ) ,x x x2 4 5 5� � � . � w y x x w y x x y x x w x x h h h � � ( ) ( , ) ( , ) ( 3 2 2 3 2 3 1 1 0 1 1 1 � � � � � � � � � �� 5 1 4 1 5 3) ( ) ,� � �x x x x � w y x x w y x x y x x w x x h h h � � ( ) ( , ) ( , ) ( 3 4 3 4 3 4 1 5 1 1 0 1 1 � � � � � � � � � � � � � �x x x x x2 5 1 2 5 1) ( ) , � w y x x w y x x y x x w x x h h h � � ( ) ( , ) ( , ) ( 3 5 3 5 3 5 1 1 1 0 1 1 � � � � � � � � � �� 2 1 4 3� �x x ) . since max ( ) ( ) i i h i hw y x x w y x x � �� � � % & ' � � � 3 3 3 4 0 0 4 � � � � and max ( ) ( ) i i h i hw y x x w y x x � �� � � % & ' � � � 3 3 3 1 1 1 5 � � � � we write y x x y x x x y x x x x y x x � � � � � � � � 3 4 3 4 4 3 4 3 1 1 3 0 0 0 1 0 ( ( , ) ( , )) ( ( , � � � �1 1 11 1 3) ( , )),x y x x where y x x x y x x x x( , ) , ( , ) ,3 4 5 3 4 1 20 0 0 1� � � � � � � y x x x x( , )1 3 2 50 1� � � and © czech technical university publishing house http://ctn.cvut.cz/ap/ 19 czech technical university in prague acta polytechnica vol. 45 no. 2/2005 a) b) fig. 6: map entries of a) y x( )3 0� , b) y x( )3 1� from example 5 fig. 7: structural model with majority elements from example 5 y x x x x x( , )1 3 2 4 51 1� � � � � , or y x x x x x x x x x x x x x x� � � � � � �3 4 5 4 1 2 3 1 2 5 1 2 4 5( ( )) ( ( )). 
in other words,

y = ( x̄3 # (( x̄4 # x5 # 0) # (x4 # (x1 # x2 # 1) # 0) # 1) # 0 ) # ( x3 # (( x̄1 # (x2 # x5 # 0) # 0) # (x1 # (x2 # (x4 # x5 # 0) # 1) # 0) # 1) # 0 ) # 1,

and hence also the structural model (fig. 7). obviously, there is also

x1 # x2 # x3 = x1x2 ∨ x1x3 ∨ x2x3 = s^3_{2,3}(x1, x2, x3) = x1x2 + x1x3 + x2x3 − 2x1x2x3.

7 conclusion

it appears that it is feasible to produce symmetrical boolean functions in a sufficiently simple way by a suitable control of one-digit binary adders or by numerical representation of the values of the respective arithmetic polynomials, and to design logical circuits with majority elements by applying the shannon decomposition of the given output function through an effective selection of the arguments by which the decomposition is carried out.

doc. ing. josef bokr, csc.
e-mail: bokr@kiv.zcu.cz
department of information and computer science
university of west bohemia
faculty of applied sciences
universitní 22, 306 14 pilsen, czech republic

doc. ing. vlastimil jáneš, csc.
phone: +420 224 357 289
e-mail: janes@fel.cvut.cz
department of computer science and engineering
czech technical university in prague, faculty of electrical engineering
karlovo nám. 13, 121 35 praha 2, czech republic

the impact of odors

m. v. jokl

abstract: the odor microclimate is formed by gaseous airborne components perceived either as an unpleasant smell or as a pleasant smell. smells enter the building interior partly from outdoors (exhaust fumes, flower fragrance) and partly from indoors (building materials, cigarette smoke, cosmetics, dishes). they affect the human organism through the olfactory center, which is connected to the part of the brain that is responsible for controlling people's emotions and sexual feelings: smells therefore participate to a high level in mood formation. the sense of smell diminishes slowly in people over the age of 60, but all female age categories have a better sense of smell than males. smell is extremely sensitive, e.g., during pregnancy, or if an illness is coming. bad smells cause a decrease in human performance, loss of concentration, and loss of taste. sweet smells have a positive impact on human feelings and on human performance. criteria for odor microclimate appraisal are presented (concentration limits of co2, tvoc, olf, decipol, decicarbdiox, decitvoc).

keywords: odors, microenvironment, hygiene, indoor air quality, microclimate.

1 introduction

the odor microclimate is the constituent of the microenvironment characterized by odor agents occurring in a given space in a building interior and affecting the total state of the human organism [6]. odor agents (odors) are gas components in the atmosphere perceived as pleasant or as unpleasant, bad smells. they are organic or inorganic in nature, usually produced by man himself, or as a result of his activity, or else are released from building materials. toxic odors are not a part of the odor microclimate – they are included in the toxic microclimate. there is an increasing tendency for odors to occur in the interiors of buildings [7].

2 sources of odors

the sources of unpleasant and pleasant odors are as follows:

2.1 the sources of unpleasant odors

the unpleasantness of an odor is characterized by the so-called hedonic tone, i.e., by the subjective feelings of human beings [5], [12]. there are five principal types of odors according to the zwaardemaker scale: 1. etheric (human smells), 2. aromatic (smells of ripe fruit decay), 3. isovalerate (smells from smoking tobacco, smells of animal sweat), 4. rancid (smells of milk products), 5. narcotic (smells of protein decay and the smell of tobacco).
odors enter the microenvironment from outdoor as well as indoor sources, as a consequence of human activity, and after being released from building materials, especially from insulations, coatings, and from chemicals for wood protection (fig. 1).

fig. 1: sources of unpleasant odors in a building interior

fifty to 80 % of odors enter the room from outdoors – products of combustion engines (oxides of nitrogen and sulphur, carbon monoxide, sulphates and compounds of nitrogen, sulphur and carbon), air pollution by industrial plants, and smoke from power stations and boiler rooms. the indoor air treatment system may cause problems with tobacco smoke (if smoking is allowed in air-conditioned rooms) and oil decay (if filters are not cleaned properly). formaldehyde is released from some building materials – plywood, chipboard, parquets, varnishes and other paintwork, wallpaper (treated with artificial resin against abrasion), and from cork flooring and linings. pentachlorophenol (pcp), lindane and pyrethroid (occasionally permethrin) are given off by impregnating preparations. isocyanate (used as a binder) is produced by non-formaldehyde chipboards. petroleum spirit and xylol, organic solvents, are released from paints based on alkyd resins; glycol ether is released from acrylate paints, isocyanate from polyurethane materials, isoaliphate from paints based on natural resins. chloroprene is produced by adhesives based on artificial rubber; isocyanate (epichlorohydrin or the ethyl ester of cyanoacrylate acid) is released from acrylate glue. dioxin and furan are produced by burning pvc. surprisingly, nothing is released from new floorings based on polyolefin. during human activity, various body smells are produced (acetone, isoprene); smoke from cigarettes, cigars and pipes is produced (pyridine), and there are odors from cosmetics (an inversion can cause serious unpleasantness), cleaning chemicals, and from rubbish. (inversion is the change of a pleasant odor into an unpleasant one, i.e., a bad smell into a sweet one or vice versa, as a result of one odor mixing with another, or as a result of a change in concentration.)
mostly volatile organic compounds are produced (vocs are defined by who as compounds with boiling points ranging from 50 to 260 °c). the complex of vocs (without formaldehyde) produced by human beings within a building, by building materials and by other sources is referred to as tvoc (total volatile organic compounds). cigarette smoking causes a special problem. most people perceive it negatively, as a bad smell threatening their health; the response of former smokers seems to be the most sensitive. smoking, especially intensive and long-term, causes narrowing of veins, nervousness, damage to breathing passages, a decrease in the differentiation ability of the olfactory and taste cells, damage to the lungs by tar and other harmful combustion products, leading to tumors, and so on. it is supposed that smoking one cigarette shortens a human life by seven (according to some specialists even by eleven) minutes. to call a cigarette a nail in one's coffin seems to be quite realistic. recent studies indicate that tobacco smoke forms part of the toxic microclimate.

2.2 sources of pleasant odors

from outdoors, the smell of blossoming flowers, mown grass, hay and melting snow comes into the interior of a building (fig. 2). indoors, flowers and man-made aromas (cosmetics and detergents, smoke from an open fireplace) are sources of pleasant smells. even some building materials (e.g., wood) are perceived as pleasant.

fig. 2: sources of pleasant odors in a building interior

what is perceived as an unpleasant odor by some may be perceived as pleasant by others. this is very often a matter of personal experience: e.g., smoke from an open fireplace induces a feeling of comfort for one person (e.g., based on the smell of smoked sausages) and a feeling only of smoke for another (having no experience of an open fireplace). the history of pleasant aromas is very long. they have been used by ladies to make themselves more attractive, and also for other purposes: e.g., a smoke clock was used by the japanese in the middle ages (there was a different smell for each hour).

3 the impact of odors on man

the principal question concerns the impact of smells on the human organism. a schematic representation of the smell organ is shown in fig. 3a, with a picture in fig. 3b [15]. inhaled air passes the nasal shells in the olfactory zone; they are covered with smell cells with mucous membranes on their surface. odoriferous substances have to come into contact with a mucous membrane if smell perception is to be evoked. only about 1 % of the inhaled molecules come into contact with smell cells, producing electric impulses that pass through the bulbus olfactorius to the smell center, which is in the front part of the brain (rhinencephalon), where they are processed. the electroolfactogram (eog), i.e., the record of this electrical activity, registers an increase with the odor concentration. the part of the brain dealing with smell perception (bulbus) is located above the nose in the center of the forehead. most of the nervous pathways from the olfactory bulbs end in the oldest (from the evolutionary point of view) area of the central brain, which is responsible for controlling emotional and sexual reactions. it is supposed, on the basis of this evidence, that smells participate to a high extent in the formation of moods.
for example, it has been proved that if somebody links his relaxation with a sea smell, then a sea smell (without the sea) or the smell of sea salt is sufficient for him to relax. the perceived odor concentration increases only up to the moment of saturation of the mucous membrane (a lipid double-layer with odor receptors); after that there is no response even to a great increase in the odor. on the contrary, the odor perception level decreases in the course of time, and after 5–15 minutes stabilizes at a minimal value (fig. 4) as a result of short-time odor adaptation caused by fatigue [10], [11]. the odor perception is recovered rapidly after the removal of the odor. long-term odor adaptation, depending on age, also exists. based on an analysis of the ability to smell of 1955 subjects (1158 women and 797 men aged between 5 and 99), it was found at the clinical center for smell and taste research at pennsylvania university that both sexes have the best sense of smell between the ages of 30 and 60; then, between the ages of 60 and 80, the sense of smell deteriorates slowly (at the age of 80, 60 % of the subjects had a very bad sense of smell, and 25 % of the subjects smelled almost nothing), and afterwards it decreases rapidly (over the age of 80, more than 80 % of the subjects had a very bad sense of smell, and 25 % of the subjects smelled almost nothing). it was also found at the clinical center that all categories (at each age) of women have a better sense of smell than men, and also that nonsmokers have a better sense of smell than smokers. the ability to smell is very sensitive during pregnancy, during nursing, if the sugar content in the blood is increased, during kidney inflammation, during neuralgic pains (hemicrania), and before any coming illness. odor perception can be very individual: johann wolfgang goethe complained in one of his letters that the air liked by schiller impacted on him like a poison. schiller's favorite was the smell of rotten apples, which could even be found in the drawer of his desk.

fig. 3: the human olfactory organ (a – scheme, b – anatomical picture)
fig. 4: odor perception over a period of time

3.1 the impact of negative odors on man

odors do not threaten human health, but they impact man's performance, concentration, taste, and sensation of well-being. odors are removed from the atmosphere for psychological reasons, and also for economic and hygienic reasons, because they often indicate increased contamination of the microenvironment by microbes. long-term exposure can lead to anxiety, depression and chronic fatigue. the smell of mold, humidity, stagnant water and mice can produce feelings of derangement, depression, and headaches. various coatings, oils, photomaterials, and some laundry agents, referred to as smells of civilization, are the worst; they produce a bad mood, feelings of anger and disturbance in 80 % of subjects. the smell of lilies and orchids makes subjects feel chilly, and even produces stupefaction in some of them. the odor of disinfectant and of some strong aromatic medications impacts subjects in the same way, especially women.
3.2 the impact of positive odors on man

pleasant odors are attracting increasing interest, because they can have a positive effect on people's feelings and performance. the impact of odors on people was estimated by stanford and reynolds on the basis of experiments on 5000 subjects aged 7 to 85 years exposed to 260 smells. subjects spent one to eight hours in chambers filled with odors of high intensity. then they were asked for their impressions, and were submitted to medical examination. 260 odors were used: smells of resin, hay, herbs, odors of eighty perfumes, various cosmetics, fresh bread, baked meat, cheeses, spices, rotten wood, smoke from various types of wood, manure, molds, disinfectants and detergents, varnishes, enamels, etc. despite the differences in the responses of the subjects to odors, there is a typical response of subjects of the same age and sex. children up to fifteen, of both sexes: there is a positive response to camomile, mint, melting snow, and freshly mown grass. children, after several hours in a chamber filled with one of these smells, solved problems (corresponding to their age) much better than those without the smell; they thought faster, and were able to reason better. the smell of resin, lime in flower, fresh dough, hay, and honey refreshed women and subjects of both sexes younger than 35. the smell of conifers, fresh apples, lavender, sea salt, rushes and thyme refreshed 80 % of the subjects. some subjects were even refreshed by the smell of smoke, burning wood, roasted potatoes, gunpowder, and roast meat. men and women of all ages: the smell of roses and pansies, and the smell of fresh currants, oranges and lemons, stimulated pleasant sensations, an appetite for life, an eagerness to work. the smell of narcissi and violets produced a calming effect, and even a sentimental mood. old people recalled their youth; other people thought about their loved ones; some subjects remembered pleasant melodies. the smell of jasmine, lilac and reseda stimulated feelings of calm and relaxation, a pleasant condition of inactivity. shopping activity can be stimulated by smells. according to the german consumer association, about ten thousand hypermarkets and small shops have already used suitable smells to stimulate consumers to more intensive shopping. a new term has been introduced: air design. from the statistics it is evident that in rooms with an appropriate smell consumers spend 16 % more time, shopping increases by 15 %, and turnover by 6 %. when asked on leaving the shop, the consumers often responded that they had bought goods they did not need and had purchased them from some unknown internal incentive. german mercedes and bmw car showrooms favor the smell of the leather of new seats, and deutsche bank offices, wempe jewellers and holiday hotels prefer perfumes used during gala occasions (the theatre, a concert or a ball). deutsche bank even claims that its secret special smell composition creates an atmosphere of confidence and assurance in bank offices. fruit and vegetable shops have successfully used the smell of lemons, which induces a feeling of cleanliness and freshness of the goods. boutiques change the smell in their ladies' department according to the time of year.

4 odor microclimate level assessment

the basic criteria for odor microclimate level assessment, if there is no special source of agents in the room, are the carbon dioxide (co2) and total volatile organic compound (tvoc) concentrations in the building interior.
other criteria are derived from these basic criteria: the outdoor air rate per person (prescribed in most countries), the olf and the decipol (recommended in the european union), and the newest units called decicarbdiox (dcd) and decitvoc (dtv).

4.1 carbon dioxide co2

for a long time, the odor microclimate has been evaluated on the basis of the co2 concentration and its limit value of 1000 ppm, introduced by von pettenkofer (1818–1901, professor at the university in munich), which is used to determine the minimum amount of fresh air (25 m3/h per person). co2 is the most important biologically active agent, whose production is proportional to the human metabolic rate. there are many limit values of co2 concentration at present (see table 1), but the pettenkofer value is still widely used, and it is a starting point for the prescribed limits in most developed countries. according to eur 14449 en (report no. 11, guidelines for ventilation requirements in buildings, brussels, luxembourg 1992), a recommended regulation of the european union, 25 m3/h per person is also prescribed for a maximum of 20 % of dissatisfied people within a building interior. according to the us bsr/ashrae standard 62-1989r ventilation for acceptable indoor air quality, the following basic values should be respected:
a) for nonadapted persons 7.5 l/s per person (27 m3/h per person),
b) for adapted persons only 2.5 l/s per person (9 m3/h per person).
the value for adapted persons was introduced here for the first time in the world. it is the result of short-time odor adaptation – see section 3, the impact of odors on man. in practice, monitoring the co2 level for the purpose of controlling the fresh air supply has proved satisfactory for lecture theatres, halls, cinemas, theatres and similar spaces where the load imposed by occupants can vary rapidly.

no. | [mg/m3] | [ppm] | [dcd] | limit | source
1 | 875 | 485 | 0 | threshold | 5.8 % dissatisfied
2 | 1080 | 600 | 8 | usaf warning | usaf armstrong laboratory (1992)
3 | 1110 | 615 | 9 | un asthm. optimal | 10 % dissatisfied unadapted
4 | 1440 | 800 | 20 | osha warning | osha: federal register (1994)
5a | 1800 | 1000 | 28 | optimal limit | pettenkofer (1858)
5b | 1800 | 1000 | 28 | acceptable limit | bsr/ashrae 62-1989r (1989)
5c | 1800 | 1000 | 28 | opt. long-term | who/euro: air quality guidelines, eur 14449 en (1992)
6a | 1825 | 1015 | 29 | un optimal limit | 20 % dissatisfied unadapted
6b | 1825 | 1015 | 29 | un asthm. admissible |
6c | 2000 | 1110 | 32 | concentration of no concern for non-industrial buildings | who; levy (1992)
7a | 2160 | 1200 | 35 | opt. short-term | who/euro: air quality guidelines, eur 14449 en (1992)
7b | 2200 | 1225 | 36 | ad asthm. optimal | 10 % dissatisfied adapted
8 | 2830 | 1570 | 46 | un admissible | 30 % dissatisfied unadapted
9a | 4350 | 2420 | 63 | ad optimal limit | 20 % dissatisfied adapted
9b | 4350 | 2420 | 63 | ad asthm. admissible | bsr/ashrae standard 62-1989r
10a | 5035 | 2800 | 68 | limit for direct gas-fired air heaters | bs 5990: 1981 of the british standards institution
10b | 5035 | 2800 | 68 | limit for direct gas-fired air heaters | bs 6230: 1982 of the british standards institution
11a | 6300 | 3500 | 77 | long-term acceptable | env. health directorate, canada (1989)
11b | 7000 | 3890 | 81 | concentration of concern for non-industrial buildings | who; levy (1992)
11c | 7360 | 4095 | 83 | ad admissible | 30 % dissatisfied adapted
12a | 9000 | 5000 | 91 | long-term exposure limit (8 h) | guidance note eh 40/90 from hse of gb
12b | 9000 | 5000 | 91 | average concentration for industrial and non-industrial buildings | commission de la sante et de la securite du travail
12c | 9000 | 5000 | 91 | long-term tolerable (sbs range ends) | ussr space research
13a | 18000 | 10000 | 118 | maximum allowable concentration for ind. and non-ind. buildings | commission de la sante et de la securite du travail
13b | 18000 | 10000 | 118 | short-term tolerable | ussr space research
14a | 27000 | 15000 | 134 | short-term tolerable | toxic range begins
14b | 27000 | 15000 | 134 | short-term exposure limit (10 min) | guidance note eh 40/90 from hse of gb
(un – unadapted persons; ad – adapted persons; asthm. – asthmatic persons)

table 1: limits for co2 concentrations in a building interior
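a note of ours (not in the paper): the paired [mg/m3] and [ppm] columns of table 1 are consistent with the usual ideal-gas conversion for co2 at room conditions, mg/m3 ≈ ppm · M/Vm with molar mass M = 44.0 g/mol and molar volume Vm ≈ 24.45 l/mol, i.e., a factor of about 1.8. a quick check in python:

M_CO2, V_MOLAR = 44.0, 24.45            # g/mol; l/mol at ~25 degc, 101.3 kpa
to_mgm3 = lambda ppm: ppm * M_CO2 / V_MOLAR
for ppm, mgm3 in ((485, 875), (1000, 1800), (5000, 9000), (15000, 27000)):
    print(ppm, round(to_mgm3(ppm)), mgm3)  # computed vs. table 1 value
# the computed values agree with the table to within rounding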
4.2 total volatile organic compounds (tvoc)

although co2 is a good indicator of the air quality perceived by the people present, it can sometimes be an unsuitable indicator: it does not represent perceived sources of air contamination such as building materials and fittings, especially carpets and other floor-covering materials that produce voc. the complex of these substances is called tvoc. various tvoc limits are listed in table 2.

no. | [µg/m3] | [dtv] | limit | source
1 | 50 | 0 | threshold | 5.8 % dissatisfied (1.0 on the yaglou psycho-physical scale)
2 | 85 | 12 | un asthm. optimal | 10 % dissatisfied unadapted, eur 14449 en (1992)
3a | 200 | 30 | un optimal limit | 20 % dissatisfied unadapted, eur 14449 en (1992)
3b | 200 | 30 | un asthm. admissible | old dwelling houses, old office buildings
4 | 250 | 35 | ad asthm. optimal | 10 % dissatisfied adapted
5 | 300 | 39 | target guideline | seifert (1990)
6 | 360 | 43 | un admissible | 30 % dissatisfied unadapted, eur 14449 en (1992); new dwelling houses, new office buildings
7a | 500 | 50 | level of concern | national health and medical research council of australia, dingle and murray (1993)
7b | 580 | 53 | ad optimal limit | 20 % dissatisfied adapted
7c | 580 | 53 | ad asthm. admissible |
8 | 1040 | 66 | ad admissible | 30 % dissatisfied adapted
9a | 3000 | 89 | long-term tolerable | sbs range ends
9b | 3000 | 89 | multifactorial exposure range limit | molhave (1990)
10a | 25000 | 135 | short-term tolerable | toxic range begins
10b | 25000 | 135 | discomfort range limit | molhave (1990)
(un – unadapted persons; ad – adapted persons; asthm. – asthmatic persons)

table 2: limits for tvoc concentrations in a building interior

for this reason fanger [3] proposed a new system based on units known as the "olf" and the "decipol". one olf is the odor pollution produced by one standard person – an average adult sitting person having thermal comfort during office or similar nonindustrial work, whose hygienic standard is 0.7 baths per day. one decipol is the air pollution by one standard person (one olf) ventilated by an air volume of 10 l/s (36 m3/h) of clean air. on the basis of the estimated air quality within an interior, in decipols, the necessary outdoor air quantity can be calculated following the instructions in the recommended eur 14449 en; for some basic values see table 3.

quality level (category) | perceived air quality [% dissatisfied] | perceived air quality [decipol] | required air rate [l/s per olf] | required air rate [m3/h per person]
a | 10 | 0.6 | 16 | 58
b | 20 | 1.4 | 7 | 25
c | 30 | 2.5 | 4 | 14

table 3: outdoor air rates per person according to eur 14449 en

there are certain obvious problems with this system, as was pointed out by [12], [8] and especially by [13]. thus these units were not accepted for the bsr/ashrae standard 62-1989r. for example, the units are presented as a system of overall indoor air quality valuation, although they are based on odor perception. thus, e.g., air polluted with radon is o.k., because radon has no smell.

4.3 co2 plus tvoc

the most progressive systems – the us ashrae standard 62-1989r and a new european standard using the decibel units dtv and dcd – are based on two basic criteria: co2 and tvoc. carbon dioxide is a criterion for odor air pollution in the presence of people, while tvoc is a criterion for odor air pollution by building materials and fittings. the necessary outdoor air rate for ventilation is the sum of the two air quantities, calculated on the basis of co2 and tvoc.

ashrae standard 62-1989r is a complex of tables in its main part (prescriptive requirements) (e.g., see table 4). the minimum outdoor air rates, first for people (rp, in l/s per person), second for tvoc production by building and other materials (rb, in l/s per m2 of floor), are listed directly in the tables.

space | people rp [l/s per person] | building rb [l/s per m2]
educational facilities:
day care (through age 4) | 3.5 | 0.70
classrooms (ages 5–8) | 3.0 | 0.70
general classrooms | 3.0 | 0.55
lecture classroom | 3.0 | 0.55
lecture hall (fixed seats) | 2.5 | 0.35
art classroom | 5.0 | 1.85
science laboratories | 3.5 | 2.85
wood/metal shop | 4.0 | 1.85
media center | 3.0 | 0.70
music/theater/dance | 7.0 | 0.70
multi-use assembly hall | 3.0 | 0.55

table 4: minimum requirements for ventilation (table 6.1a in ashrae 62-1989r)

the minimum outdoor air rate for people, in contradiction to other regulations, is valid for adapted persons (the basic value is only 2.5 l/s per person, i.e., 9 m3/h per person). for unadapted persons (e.g., persons coming from outdoors, i.e., visitors) all values must be increased by 5 l/s per person, i.e., by 18 m3/h per person. thus the total value for people is 27 m3/h per person, which does not differ too much from pettenkofer's value of 25 m3/h per person.
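the definitions above imply the relation c [decipol] = 10 · G [olf] / Q [l/s], so the required air rate is Q = 10 · G / c. the short sketch below (python; ours, variable names assumed) recomputes table 3 from the decipol targets; small deviations come from rounding in the table (which rounds the l/s figure first, e.g., 16 l/s · 3.6 = 58 m3/h).

# perceived air quality c [decipol] produced by g [olf] with outdoor air q [l/s]:
#   c = 10 * g / q   =>   q = 10 * g / c
for category, c_decipol in (("a", 0.6), ("b", 1.4), ("c", 2.5)):
    q_ls = 10.0 * 1.0 / c_decipol   # l/s per standard person (1 olf)
    q_m3h = q_ls * 3.6              # m3/h per person
    print(f"{category}: {q_ls:.1f} l/s per olf = {q_m3h:.0f} m3/h per person")
# approximately the 16 / 7 / 4 l/s and 58 / 25 / 14 m3/h values of table 3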
the newest valuation system of the odor microclimate uses decibel units derived similarly to the decibel units for noise: the new units are modified briggs' (common) logarithms of the concentrations (so-called odor levels) of co2 and tvoc measured in the investigated room, related to threshold values (the weakest odors that can be detected) [9]. the odor level based on the co2 concentration is defined by the equation

L_odor(CO2) = 90 log ( ρ_i,CO2 / 485 ),  ρ_i,CO2 in ppm,  [decicarbdiox], [dcd]   (1)

and the odor level based on the tvoc concentration is defined by the equation

L_odor(TVOC) = 50 log ( ρ_i,TVOC / 50 ),  ρ_i,TVOC in µg/m3,  [decitvoc], [dtv]   (2)

where ρ_i are the measured concentrations in the building interior.
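equations (1) and (2) are straightforward to evaluate; the sketch below (python; ours) computes dcd and dtv values and reproduces several rows of tables 1 and 2.

import math

def dcd(co2_ppm):
    # odor level from the co2 concentration, eq. (1): 90*log10(rho/485 ppm)
    return 90.0 * math.log10(co2_ppm / 485.0)

def dtv(tvoc_ugm3):
    # odor level from the tvoc concentration, eq. (2): 50*log10(rho/50 ug/m3)
    return 50.0 * math.log10(tvoc_ugm3 / 50.0)

print(round(dcd(1000)))   # 28 dcd: pettenkofer's optimal limit (table 1, row 5a)
print(round(dcd(15000)))  # 134 dcd: short-term tolerable limit (table 1, row 14a)
print(round(dtv(300)))    # 39 dtv: seifert's target guideline (table 2, row 5)
print(round(dtv(25000)))  # 135 dtv: short-term tolerable limit (table 2, row 10a)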
optimal, admissible, long-term and short-term tolerable values and their ranges are presented in figs. 5 and 6. optimal values correspond to a percentage of dissatisfied persons pd = 20 % (for asthmatics pd = 10 %), and admissible values to pd = 30 %; the long-term tolerable values determine the range of the so-called "sick building syndrome" (sbs). all these values are given for adapted and nonadapted persons.

fig. 5: optimal, admissible and tolerable limits for co2 in a building interior
fig. 6: optimal, admissible and tolerable limits for tvoc in a building interior

the main advantages of the newly proposed evaluation system can be summed up as follows:
1. the undoubted benefit of using the decibel scale is that it gives a much better approximation to the human perception of odor intensity than the co2 and tvoc concentration scales. this is because the human olfactory organ (see jokl 1989 [6]) reacts to a logarithmic change in level, which corresponds to the decibel scale.
2. the new decicarbdiox and decitvoc values also fit very well with the db values for sound, e.g., the optimal odor value of 30 db corresponds to the iso noise rating acceptable value nr 30 for libraries and private offices. they can therefore be compared to each other.
3. it is possible, by comparing dcd and dtv values, to estimate which component – co2 or tvoc – plays the more important role, and hence which sources of contamination are more serious.
4. the dcd and dtv units can be estimated by direct measurements of tvoc and co2 concentrations – instruments can be calibrated directly in the new units.
5. the dcd and dtv units, as indoor air quality criteria, allow an optimal range to be defined and a corresponding optimal ventilation rate to be estimated for unadapted and adapted persons.
6. the units allow the sbs range to be defined (corresponding to the long-term tolerable range) and the corresponding long-term tolerable ventilation rates to be estimated.
7. the units allow the efficiency of air cleaners (and of other indoor air-improving measures, e.g., the use of less polluting building materials) to be expressed, i.e., what the decrease in air contamination is after their application.

references

[1] bsr/ashrae standard 62-1989r: ventilation for acceptable indoor air quality.
[2] eur 14449 en: guidelines for ventilation requirements in buildings. report no. 11, commission of ec, luxembourg, 1992.
[3] fanger, p. o.: introduction of the olf and the decipol units to quantify air pollution perceived by humans indoors and outdoors. energy and buildings, vol. 12 (1988), no. 1, p. 1–6.
[4] hicks, j. et al.: building bake-out during commissioning: effects on voc concentration. in: proc. of the fifth int. conference on indoor air quality and climate, vol. 3, toronto (canada), 1990.
[5] iaqu: odor evaluation as an investigative tool. indoor air quality update, 1991, p. 10–13.
[6] jokl, m.: microenvironment: the theory and practice of indoor climate. springfield (illinois, u.s.a.): thomas publisher, 1989, 416 p.
[7] jokl, m.: the theory of the indoor environment of buildings. (in czech) praha: vydavatelství čvut, 1993, 261 p.
[8] jokl, m. v., leslie, g. b., levy, l. s.: new approaches for the determination of ventilation rates: the role of sensory perception. indoor environment, vol. 2 (1993), no. 2, p. 143–148.
[9] jokl, m. v.: evaluation of indoor air quality using the decibel concept. int. j. of environmental health research, vol. 7 (1997), no. 4, p. 289–306.
[10] kaiser, e. r.: odor and its measurement. in: air pollution, academic press, 1962, p. 50–527.
[11] mc burney, d. h., levine, j. m., cavanaugh, p. h.: psychological and social ratings of human body odor. personality and social psychology bulletin, 1977, no. 3, p. 135–138.
[12] oseland, n. a.: a review of odor research and the new units of perceived air pollution. bre, watford, 1993, 24 p.
[13] parine, n.: the use of odor in setting ventilation rates. indoor environment, vol. 3 (1994), no. 3, p. 87–95.
[14] pettenkofer, m.: über den luftwechsel in wohngebäuden. münchen, 1858.
[15] the human body. bratislava: gemini, 1992.
miloslav v. jokl, ph.d., sc.d., university professor
phone: +420 2 2435 4432, fax: +420 2 3333 9961
e-mail: miloslav.jokl@fsv.cvut.cz
czech technical university in prague, faculty of civil engineering
thákurova 7, 166 29 prague 6, czech republic

acta polytechnica 56(1):10–17, 2016. doi: 10.14311/app.2016.56.0010. © czech technical university in prague, 2016. available online at http://ojs.cvut.cz/ojs/index.php/ap

multi-robot motion planning: a modified receding horizon approach for reaching goal states

josé m. mendes filho (a, b), eric lucet (a, *)

a) cea, list, interactive robotics laboratory, gif-sur-yvette, f-91191, france
b) ensta paristech, unité d'informatique et d'ingénierie des systèmes, 828 bd des maréchaux, 91762, france
*) corresponding author: eric.lucet@cea.fr

abstract. this paper proposes the real-time implementation of an algorithm for collision-free motion planning based on a receding horizon approach, for the navigation of a team of mobile robots in the presence of obstacles of different shapes. the method is simulated with three robots. the impact of the parameters is studied with regard to computation time, obstacle avoidance and travel time.

keywords: multi-robot motion planning; nonholonomic mobile robot; distributed planning; receding horizon.

1. introduction

the control of mobile robots is a long-standing subject of research in the robotics domain. a trending application of mobile robotic systems is their use in industrial supply chains for processing orders and optimizing the storage and distribution of products. for example, amazon employs the kiva mobile multi-robot system, and the logistics provider idea group employs the scallog system for autonomously processing client orders [1, 2]. such logistics tasks have become increasingly complex as sources of uncertainty, such as human presence, are admitted in the work environment.

one basic requirement for such mobile multi-robot systems is the capacity for motion planning, that is, generating admissible configuration and input trajectories connecting two arbitrary states. to solve the motion planning problem, various constraints must be taken into account, in particular the robot's kinematic and geometric constraints. the former derive directly from the mobile robot architecture, implying, in particular, nonholonomic constraints. geometric constraints result from the need to prevent the robot from assuming specific configurations, in order to avoid collisions, communication loss, etc. we are particularly interested in solving the problem of planning a trajectory for a team of nonholonomic mobile robots in a partially known environment occupied by static obstacles, while being efficient with respect to the travel time (the amount of time to go from the initial configuration to the goal).

a great amount of work towards collision-free motion planning for cooperative multi-robot systems has been proposed. this work can be split into centralized and distributed approaches. centralized approaches are usually formulated as an optimal control problem that takes all robots in the team into account at once. this produces solutions closer to the optimum than distributed approaches. however, the computation time, security vulnerability and communication requirements can make them impracticable, especially for a great number of robots [3].
distributed methods based on probabilistic [4] and artificial potential field [5] approaches, for instance, are computationally fast. however, they deal with collision avoidance as a cost function to be minimized; rather than having a cost that increases as paths leading to collision are considered, collision avoidance should be treated as a hard constraint of the problem.

other distributed algorithms are based on receding horizon approaches. in [6], a brief comparison of the main distributed receding horizon methods is made, and the base approach extended in our work is presented. in this approach, each robot optimizes only its own trajectory at each computation/update horizon. in order to avoid robot-to-robot collisions and loss of communication, neighboring robots exchange information about their intended trajectories before performing the update. intended trajectories are computed by each robot ignoring the constraints that take the other robots into account. identified drawbacks of this approach are its dependence on several parameters for achieving real-time performance and good solution optimality, the difficulty of adapting it to handle dynamic obstacles, the impossibility of bringing the robots to a precise goal state, and the limited geometric representation of obstacles.

therefore, in this paper, we propose a motion planning algorithm that extends the approach presented in [6]. in this modified algorithm, goal states can be precisely reached and more complex forms of obstacles can be handled. furthermore, we investigate how the method's parameters impact a set of performance criteria. thus, this distributed algorithm is able to find collision-free trajectories, and the trajectory generation approach computes the corresponding angular and longitudinal velocities, for a multi-robot system in the presence of static obstacles perceived by the robots as they evolve in their environment.

this paper is structured as follows. the second section states the problem to be resolved, pointing out the cost function for motion planning and all the constraints that need to be respected by the computed solution. the third section explains the algorithm for resolving the motion planning problem and makes some remarks on how to resolve the constrained optimization problems associated with the method. the fourth section is dedicated to the results found using this method, and analyses specific performance criteria and how they are impacted by the algorithm parameters. finally, in the last section we present our conclusions and perspectives.

2. problem statement

2.1. assumptions

in the development of the approach presented in this paper, the following assumptions are made:

(1.) the motion of the multi-robot system begins at the instant tinit and goes until the instant tfinal.
(2.) the team of robots consists of a set R of B nonholonomic mobile robots.
(3.) a robot (denoted Rb, Rb ∈ R, b ∈ {0, ..., B−1}) is geometrically represented by a circle of radius ρb centered at (xb, yb).
(4.) all obstacles in the environment are considered static. they can be represented by a set O of M static obstacles.
(5.) an obstacle (denoted Om, Om ∈ O, m ∈ {0, ..., M−1}) is geometrically represented either as a circle or as a convex polygon. in the case of a circle, its radius is denoted rOm, centered at (xOm, yOm).
(6.) for a given instant tk ∈ [tinit, tfinal], any obstacle Om is considered detected by the robot Rb whenever the distance between their geometric centers is less than or equal to the detection radius db,sen of the robot Rb. this obstacle Om is then part of the set Ob (Ob ⊂ O) of detected obstacles.
(7.) a robot has precise knowledge of the position and the geometric representation of a detected obstacle, i.e., obstacle perception issues are neglected.
(8.) a robot can access information about any robot in the team by using a wireless communication link. latency, communication outages and other problems associated with this communication link are neglected.
(9.) the dynamic model of the multi-robot system is neglected.
(10.) the input (or control) vector of a mobile robot Rb is bounded.

2.2. constraints and cost functions

having given the assumptions in the previous subsection, we can define the constraints and the cost function for multi-robot navigation.

(1.) the solution of the motion planning problem for robot Rb – represented by the pair (q*b(t), u*b(t)), with q*b(t) ∈ R^n the solution trajectory for the robot's configuration and u*b(t) ∈ R^p the solution trajectory for the robot's input – must satisfy the robot kinematic model equation

q̇*b(t) = f(q*b(t), u*b(t)), ∀t ∈ [tinit, tfinal],   (1)

where f : R^n × R^p → R^n is the vector-valued function modeling the robot kinematics.

(2.) the planned initial configuration and initial input for robot Rb must be equal to the initial configuration and initial input of Rb:

q*b(tinit) = qb,init,   (2)
u*b(tinit) = ub,init.   (3)

(3.) the planned final configuration and final input for robot Rb must be equal to the goal configuration and goal input for Rb:

q*b(tfinal) = qb,goal,   (4)
u*b(tfinal) = ub,goal.   (5)

(4.) practical limitations of the input impose the following constraint:

|u*b,i(t)| ≤ ub,i,max, ∀t ∈ [tinit, tfinal], ∀i ∈ {1, 2, ..., p}.   (6)

(5.) the cost for the multi-robot system navigation is defined as

L(q(t), u(t)) = Σ_{b=0}^{B−1} Lb(qb(t), ub(t), qb,goal, ub,goal),   (7)

where Lb(qb(t), ub(t), qb,goal, ub,goal) is the integrated cost for one robot's motion planning [6].

(6.) to ensure the avoidance of collisions with obstacles, the euclidean distance between a robot and an obstacle (denoted d(Rb, Om), Om ∈ Ob, Rb ∈ R) has to satisfy

d(Rb, Om) ≥ 0.   (8)

for the circle representation of an obstacle, the distance d(Rb, Om) is defined as sqrt((xb − xOm)^2 + (yb − yOm)^2) − ρb − rOm. for the convex polygon representation, the distance was calculated using three different definitions, according to the voronoi region [7] in which Rb is located.

(7.) in order to prevent inter-robot collisions, the following constraint must be satisfied:

d(Rb, Rc) − ρb − ρc ≥ 0, ∀(Rb, Rc) ∈ R × R, b ≠ c, c ∈ Cb,   (9)

where d(Rb, Rc) = sqrt((xb − xc)^2 + (yb − yc)^2) and Cb is the set of robots that present a collision risk with Rb.

(8.) finally, the need for a communication link between two robots (Rb, Rc) yields the following constraint:

d(Rb, Rc) − min(db,com, dc,com) ≤ 0, ∀Rc ∈ Db,   (10)

with db,com, dc,com the communication link reach of each robot, and Db the set of robots that present a communication-loss risk with Rb.
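the obstacle-avoidance constraint (8) is simple to evaluate for the circle case; the sketch below (python; ours – the point-to-segment fallback for polygons is a common simplification, not the voronoi-region computation of [7]) shows how the clearance could be checked.

import math

def dist_robot_circle(xb, yb, rho_b, xo, yo, r_o):
    # constraint (8), circle case: signed clearance between robot and obstacle
    return math.hypot(xb - xo, yb - yo) - rho_b - r_o

def dist_point_segment(px, py, ax, ay, bx, by):
    # clamp the projection of p onto segment ab; one common way to get the
    # point-to-polygon-edge distance (the paper uses voronoi regions [7])
    abx, aby = bx - ax, by - ay
    t = ((px - ax) * abx + (py - ay) * aby) / (abx**2 + aby**2)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * abx), py - (ay + t * aby))

def dist_robot_polygon(xb, yb, rho_b, vertices):
    # clearance from a robot (circle) to a convex polygon boundary,
    # assuming the robot is outside the polygon
    edges = zip(vertices, vertices[1:] + vertices[:1])
    return min(dist_point_segment(xb, yb, *a, *b) for a, b in edges) - rho_b

# example: a robot of radius 0.5 m at (1, 1) and obstacle o0 of table 1 below
print(dist_robot_circle(1.0, 1.0, 0.5, 0.55, 1.91, 0.31) >= 0.0)  # True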
3. distributed motion planning

3.1. receding horizon approach

since the environment is perceived progressively by the robots, and new obstacles may appear as time passes, planning the whole motion from the initial to the goal configuration before the beginning of the motion is not a satisfying approach. planning locally and replanning is more suitable for taking new information into account as it comes. besides, the computational cost of finding a motion plan using the first approach may be prohibitively high if the planning complexity depends on the distance between the start and goal configurations. therefore, an on-line motion planner is proposed.

in order to implement it, a receding horizon control approach [8] is used. two fundamental concepts of this approach are the planning horizon tp and the update/computation horizon tc. tp is the timespan for which a solution will be computed, and tc is the time horizon during which a plan is executed while the next plan, for the next timespan tp, is being computed. the problem of producing a motion plan during a tc time interval is called here a receding horizon planning problem. for each receding horizon planning problem, the following steps are performed:

step 1. all robots in the team compute an intended solution trajectory (denoted (q̂b(t), ûb(t))) by solving a constrained optimization problem. the coupling constraints (9) and (10), which involve other robots in the team, are ignored.

step 2. robots involved in a potential conflict (that is, a risk of collision or loss of communication) update their trajectories computed during step 1 by solving another constrained optimization problem that additionally takes into account the coupling constraints (9) and (10). this is done by using the other robots' intended trajectories computed in the previous step as an estimate of the final trajectories of those robots. if a robot is not involved in any conflict, step 2 is not executed and its final solution trajectory is identical to the trajectory estimated in step 1.

all robots in the team use the same tp and tc, which assures synchronization when exchanging information about their positions and about their intended trajectories. for each of these steps and for each robot in the team, one constrained optimization problem is resolved. the cost function to be minimized in these optimization problems is the geodesic distance from a robot's current configuration to its goal configuration. this assures that the robots are driven towards their goal. this two-step scheme is explained in detail in [6, 9], where the constrained optimization problems associated with the receding horizon optimization problem are formulated. however, constraints related to the goal configuration and the goal input of the motion planning problem are neglected in their method: constraints (4) and (5) are left out of the planning. to take them into account, a termination procedure is proposed in the following that enables the robots to reach their goal state.

3.2. motion planning termination

after stopping the receding horizon planning algorithm, we propose a termination planning that considers those constraints related to the goal state. this enables the robots to reach their goal states. the criterion used to pass from the receding horizon planning to the termination planning is based on the distance between the goal and the current position of the robots. it is defined by equation (11):

drem ≥ dmin + tc · vmax.   (11)

this condition ensures that the termination plan will be planned for at least a dmin distance from the robot's goal position. this minimal distance is assumed to be sufficient for the robot to reach the goal configuration.
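criterion (11) is cheap to evaluate at every update; a minimal check for one robot (python; ours – the goal position is taken from the single-robot scenario used later in section 4, table 1, while t_c and d_min are assumed values):

import math

def switch_to_termination(pos, goal, d_min, t_c, v_max):
    # eq. (11): stay in receding-horizon mode while the remaining distance
    # d_rem >= d_min + t_c * v_max; below that, plan the termination leg
    d_rem = math.hypot(goal[0] - pos[0], goal[1] - pos[1])
    return d_rem < d_min + t_c * v_max

# goal (0.10, 7.00) and v_max = 1 m/s from the scenario; t_c, d_min assumed
print(switch_to_termination((0.0, 6.5), (0.10, 7.00),
                            d_min=0.3, t_c=0.5, v_max=1.0))  # True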
before solving the termination planning problem, new parameters for representing and computing the solution are calculated, taking into account the estimated remaining distance and the typical distance traveled during a tp planning horizon. this is done in order to rescale the representation of the solution, since the timespan of the termination plan is not necessarily equal to the previous planning horizon. potentially, this rescaling will decrease the computation time for the termination planning. the pseudo code in algorithm 1 summarizes the planning algorithm, and figure 1 illustrates how plans would be generated through time by the algorithm. in the pseudo code, we see the call of a plansec procedure; it corresponds to the resolution of the receding horizon planning problem as defined in subsection 3.1. planlastsec is the procedure for solving the termination planning problem. this problem is similar to the receding horizon planning problems: it also has the two steps already presented, for computing an intended plan and for updating it, if need be, so that conflicts are avoided. the difference consists in how the associated optimization problems are defined. the optimization problem defined in (12) and (13) is the problem solved in the first step; the optimization problem associated with the second step is defined in (14) and (15). besides, in both new constrained optimization problems the planning horizon is not a fixed constant as before; instead, it is a part of the solution to be found.

then, for generating the intended plan, the following is resolved:

min over q̂b(t), ûb(t), tf of Lb,f(q̂b(t), ûb(t), qb,goal, ub,goal)   (12)

under the following constraints, for τk = k·tc, with k the number of receding horizon problems solved before the termination problem:

  dq̂b(t)/dt = f(q̂b(t), ûb(t)), ∀t ∈ [τk, τk + tf],
  q̂b(τk) = q*b(τk−1 + tc),
  ûb(τk) = u*b(τk−1 + tc),
  q̂b(τk + tf) = qb,goal,
  ûb(τk + tf) = ub,goal,
  |ûb,i(t)| ≤ ub,i,max, ∀i ∈ {1, ..., p}, ∀t ∈ (τk, τk + tf),
  d(Rb, Om) ≥ 0, ∀Om ∈ Ob, ∀t ∈ (τk, τk + tf);   (13)

and for generating the final solution:

min over q*b(t), u*b(t), tf of Lb,f(q*b(t), u*b(t), qb,goal, ub,goal)   (14)

under the following constraints:

  dq*b(t)/dt = f(q*b(t), u*b(t)), ∀t ∈ [τk, τk + tf],
  q*b(τk) = q*b(τk−1 + tc),
  u*b(τk) = u*b(τk−1 + tc),
  q*b(τk + tf) = qb,goal,
  u*b(τk + tf) = ub,goal,
  |u*b,i(t)| ≤ ub,i,max, ∀i ∈ {1, ..., p}, ∀t ∈ (τk, τk + tf),
  d(Rb, Om) ≥ 0, ∀Om ∈ Ob, ∀t ∈ (τk, τk + tf),
  d(Rb, Rc) − ρb − ρc ≥ 0, ∀Rc ∈ Cb, ∀t ∈ (τk, τk + tf),
  d(Rb, Rd) − min(db,com, dd,com) ≤ 0, ∀Rd ∈ Db, ∀t ∈ (τk, τk + tf),
  d(q*b(t), q̂b(t)) ≤ ξ, ∀t ∈ (τk, τk + tf).   (15)

a possible definition for the Lb,f cost function present in the equations above is simply tf. the sets Ob, Cb and Db are functions of τk.

algorithm 1: motion planning algorithm
 1: procedure plan
 2:   qlatest ← qinitial
 3:   drem ← |pos(qfinal) − pos(qlatest)|
 4:   while drem ≥ dmin + tc · vmax do
 5:     initsolrepresentation(...)
 6:     qlatest ← plansec(...)
 7:     drem ← |pos(qfinal) − pos(qlatest)|
 8:   end while
 9:   rescalerepresentation(...)
10:   tf ← planlastsec(...)
11: end procedure

figure 1. receding horizon scheme with termination plan. the timespan tf represents the duration of the plan for reaching the goal configuration.
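to make the role of the free final time tf concrete, here is a minimal, self-contained analogue of problems (12)–(15) (python with scipy's slsqp, the solver discussed in section 3.3.3; entirely ours, not the authors' implementation): a single 1-d "robot" with a cubic position profile and speed bound vmax travels a distance d between rest states while minimizing lb,f = tf, with the inequality constraints enforced at ns time samples. the analytic optimum for this profile is tf = 3d/(2 vmax) = 7.5 s.

import numpy as np
from scipy.optimize import minimize

d, v_max, n_s = 5.0, 1.0, 10     # goal distance, speed bound, time samples

def vel(c, t):
    # velocity of the cubic profile x(t) = c0*t + c1*t**2 + c2*t**3
    return c[0] + 2.0 * c[1] * t + 3.0 * c[2] * t**2

def cost(z):
    return z[3]                  # L_b,f = t_f: minimize the final time itself

def eq_cons(z):
    c, tf = z[:3], z[3]
    x_tf = c[0] * tf + c[1] * tf**2 + c[2] * tf**3
    return [x_tf - d, vel(c, 0.0), vel(c, tf)]   # reach goal, start/end at rest

def ineq_cons(z):
    c, tf = z[:3], z[3]
    ts = np.linspace(0.0, tf, n_s)               # time sampling: n_s equations
    return np.concatenate([v_max - vel(c, ts), v_max + vel(c, ts)])

res = minimize(cost, x0=[1.0, 0.0, 0.0, 10.0], method="SLSQP",
               bounds=[(None, None)] * 3 + [(1e-2, None)],
               constraints=[{"type": "eq", "fun": eq_cons},
                            {"type": "ineq", "fun": ineq_cons}])
print(res.x[3])   # ~7.5 s; because the speed bound is only enforced at the
# n_s samples, the result can slightly undershoot the analytic value – the
# same sampling effect discussed in section 4.1.2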
3.3. strategies for solving the constrained optimization problems

3.3.1. flatness property

as explained in [9], all mobile robots consisting of a solid block in motion can be modeled as flat systems. this means that a change of variables is possible in a way that enables the states and inputs of the kinematic model of the mobile robot to be written in terms of a new variable, called the flat output (z), and its l first derivatives. the value of l (l ≤ n) depends on the kinematic model of the mobile robot. the flat output can therefore completely determine the behavior of the system. searching for a solution to our problem in the flat space rather than in the actual configuration space of the system presents advantages: it avoids the need for integrating the differential equations of the system (constraint (1)) and reduces the dimension of the problem of finding an optimal admissible trajectory. after finding (optimal) trajectories in the flat space, it is possible to retrieve the original configuration and input trajectories.

3.3.2. parametrization of the flat output by b-splines

another important aspect of this approach is the parametrization of the flat output trajectory. as done in [10], the use of b-spline functions presents interesting properties:

• it is possible to specify a level of continuity c^k when using b-splines without additional constraints;
• a b-spline presents local support – changes in the values of the parameters have a local impact on the resulting curve.

the first property is very well suited for parametrizing the flat output, since its l first derivatives will be needed when computing the actual state of the system and the input trajectories. the second property is important when searching for an admissible solution in the flat space; such a parametrization is more efficient and better conditioned than, for instance, a polynomial parametrization [10]. this choice for parametrizing the flat output introduces a new parameter to be set in the motion planning algorithm, i.e., the number of non-null knot intervals (denoted simply nknots). this parameter plus the value of l determines how many control points will be used for generating the b-splines.
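the paper does not spell out the flat-output formulas; for the common unicycle kinematics q = (x, y, θ) with inputs u = (v, ω), the standard choice of flat output is z = (x, y), and the states and inputs follow from its first two derivatives (so l = 2). the sketch below (python with numpy/scipy; ours) parametrizes z by a clamped cubic b-spline and recovers θ, v and ω from it.

import numpy as np
from scipy.interpolate import BSpline

k = 3                                    # cubic b-spline (continuity c^2)
ctrl = np.array([[0., 0.], [1., 0.], [2., 1.], [3., 1.], [4., 2.]])
n = len(ctrl)
# clamped knot vector on [0, 1]: len = n + k + 1
knots = np.concatenate([np.zeros(k), np.linspace(0., 1., n - k + 1), np.ones(k)])
z = BSpline(knots, ctrl, k)              # flat output z(t) = (x(t), y(t))
dz, ddz = z.derivative(1), z.derivative(2)

t = np.linspace(0., 1., 50)
xd, yd = dz(t).T
xdd, ydd = ddz(t).T
theta = np.arctan2(yd, xd)               # heading from the flat output
v = np.hypot(xd, yd)                     # linear speed
omega = (xd * ydd - yd * xdd) / (xd**2 + yd**2)  # angular speed
print(v.max(), np.abs(omega).max())      # peak |v| and |omega| along the curve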
3.3.3. optimization solver

the optimization problems associated with finding the solution q*(t), u*(t) are solved using a numerical optimization solver. for all time-dependent constraints, time sampling is used. this introduces a new parameter into the algorithm: the time sampling for optimization, ns. each constraint that must be satisfied ∀t ∈ (τk, τk + tf) implies ns equations. the need for a solver that supports nonlinear equality and inequality constraints restricts the number of numerical optimization solvers to be considered. for our initial implementation of the motion planning algorithm, the slsqp optimizer stood out as a good option: besides being able to handle nonlinear equality and inequality constraints, its availability in the minimization module of the open-source scientific package scipy [11] facilitates the motion planner implementation. however, an error was experienced using this optimizer, which uses the slsqp optimization subroutine originally implemented by dieter kraft [12]. as the cost function value becomes too high (typically for values greater than 10^3), the optimization algorithm finishes with the "positive directional derivative for linesearch" error message. this appears to be a numerical stability problem also experienced by other users, as discussed in [13].

to work around this problem, we proposed a change in the objective functions of the receding horizon optimization problems. this change aims to keep the evaluated cost of the objective function around a known value when close to the optimal solution, instead of having a cost depending on the goal configuration (which can be arbitrarily distant from the current position). we simply exchanged the goal position point in the cost function for a new point computed as follows:

pb,new = pb(τs−1 + tc) + ( (pb,goal − pb(τs−1 + tc)) / ‖pb,goal − pb(τs−1 + tc)‖ ) · α · tp · vb,max,

where pb,goal and pb(τs−1 + tc) are the positions associated with the configurations qb,goal and qb(τs−1 + tc) respectively, α (α ≥ 1, α ∈ R) is a constant controlling how far from the current position the new point is placed, the product tp · vb,max is the maximum possible distance covered by Rb during a planning horizon, and s (s ∈ [0, k), s ∈ N) is the index of the current receding horizon problem.

4. simulation results

the results and their analysis for the motion planner presented in the previous sections are presented here. the trajectory and the velocities shown in figures 2 and 3 illustrate a motion planning solution found for a team of three robots. they plan their motion in an environment where three static obstacles are present. each point along the trajectory line of a robot represents the beginning of a tc update/computation horizon. these figures show that the planner generates configuration and input trajectories satisfying the constraints associated with the goal states. in particular, in figure 2, the resulting plan is computed ignoring the coupling constraints (step 2 is never performed), and consequently two points of collision occur. a collision-free solution is presented in figure 3. especially near the regions where the collisions occurred, a change in the trajectory from figure 2 to figure 3 can be seen, avoiding the collision. at the same time, changes in the (linear) velocities of the robots across the charts in both figures can be observed. finally, the charts at the bottom show that the collisions were indeed avoided: the inter-robot distances in figure 3 are greater than or equal to zero all along the simulation.

figure 2. motion planning solution without collision handling (generated trajectories, linear speeds, and inter-robot distances d(Rb, Rc) − ρb − ρc throughout the simulation)
figure 3. motion planning solution with collision handling (same charts as figure 2)

to perform these two simulations, a considerable number of parameters had to be set. these parameters can be categorized into two groups: algorithm-related parameters and optimization-solver-related parameters. among the algorithm-related group, the most important parameters are:

• the number of samples for time discretization (ns);
• the number of internal knots for the b-spline curves (nknots);
• the planning horizon for the sliding window (tp);
• the computation horizon (tc);
• the detection radius of the robot (dsen).

the optimization-related parameters depend on the numerical optimization solver adopted. however, since most solvers are iterative methods, it is common to have at least a maximum-number-of-iterations parameter and a stop-condition parameter. this considerable number of parameters makes the search for a satisfactory set of parameter values a laborious task.
therefore, it is important to have a better understanding of how some performance criteria are impacted by changes in the algorithm parameters.

4.1. impact of parameters

three criteria considered important for validating this method were studied: the maximum computation time during the planning over the computation horizon (the mct/tc ratio); the obstacle penetration area (p); and the travel time (ttot). different parameter configurations and scenarios were tested in order to highlight how they influence those criteria.

4.1.1. maximum computation time over computation horizon mct/tc

the significance of this criterion lies in the need to assure the real-time property of this algorithm. in a real implementation of this approach, the computation horizon would always have to be greater than the maximum time taken to compute a plan. table 1 summarizes one of the scenarios studied for a single robot.

vmax | 1.00 m/s
ωmax | 5.00 rad/s
qinitial | [−0.05 0.00 π/2]^T
qfinal | [0.10 7.00 π/2]^T
uinitial | [0.00 0.00]^T
ugoal | [0.00 0.00]^T
o0 | [0.55 1.91 0.31]
o1 | [−0.08 3.65 0.32]
o2 | [0.38 4.65 0.16]

table 1. values for scenario definition

the results obtained from simulations in this scenario are presented in figure 4 for various parameters. each dot along the curves corresponds to the average of mct/tc over different tp's for a given value of (tc/tp, ns).

figure 4. computation cost behavior in the three-obstacle scenario: mct/tc versus tc/tp for ns = 10, ..., 15 and a) four, b) five, c) six internal knots

the absolute values observed in the charts depend on the processing speed of the machine on which the algorithm is run. these simulations were run on an intel xeon cpu 2.53 ghz processor. rather than observing the absolute values, it is interesting to analyze the impact of changes in the values of the parameters. in particular, increasing ns increases mct/tc for a given tc/tp. similarly, an increase in mct/tc as the number of internal knots nknots increases can be observed from chart 4a to chart 4c. further analysis of the data shows that finding the solution using the slsqp method requires o(nknots^3) and o(ns) time. although augmenting nknots can lead to an impractical computation time, typical nknots values did not need to exceed 10 in our simulations, which is a sufficiently small value.

another parameter having a direct impact on the mct/tc ratio is the detection radius of the robot's sensors. as the detection radius of the robot increases, more obstacles are seen at once, which, in turn, increases the number of constraints in the optimization problems. the impact of increasing the detection radius dsen on the mct/tc ratio can be seen in figure 5 for a scenario with seven obstacles. the computation time stops increasing as soon as the robot sees all the obstacles present in the environment.

figure 5. increasing the detection radius, and its impact on the mct/tc ratio (fitted curve: −5.29 exp(−0.50 ρd) + 3.23)

4.1.2. obstacle penetration p

the obstacle penetration area p gives a metric for obstacle avoidance, and consequently for the solution quality. a solution where the planned trajectory does not pass through an object at any instant of time gives p = 0. the solution quality decreases with increasing p. however, since time sampling is performed during the optimization, p is usually greater than zero. a way of assuring p = 0 would be to increase the radius of the obstacles computed by the robot's perception system by the maximum distance that the robot can travel within the time span tp/ns. however simple, this approach represents a loss of optimality and is not considered in this work. it is relevant, then, to observe the impact of the algorithm parameters on the obstacle penetration area. the impacts of the tc/tp ratio, nknots and dsen on this criterion are only significant in degraded cases, meaning that around typical values those parameters do not change p significantly. however, the time sampling ns is a relevant parameter. figure 6 shows the penetration area decreasing as the number of samples increases.

figure 6. obstacle penetration decreasing as the sampling increases: p(ns) with fitted curve 9395.01 exp(−0.48 ns) + 3.13
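the conservative fix mentioned in section 4.1.2 – inflating each obstacle radius by the maximum distance the robot can cover between two constraint samples – is a one-liner; a small sketch of it and of the real-time condition on mct follows (python; ours, with an assumed tp value).

def inflated_radius(r_obstacle, v_max, t_p, n_s):
    # guarantee p = 0 at the price of optimality: grow the obstacle by the
    # maximum distance the robot can travel between two time samples (tp/ns)
    return r_obstacle + v_max * (t_p / n_s)

def is_real_time(mct, t_c):
    # the next plan must be ready before the current window ends
    return mct < t_c

# scenario of table 1: obstacle o0 with radius 0.31 m, vmax = 1 m/s
print(inflated_radius(0.31, 1.0, t_p=2.0, n_s=10))  # 0.51 m (tp = 2 s assumed)
print(is_real_time(mct=0.8, t_c=1.0))               # True: plan computed in time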
The impact of increasing the detection radius $d_{sen}$ on the MCT/$T_c$ ratio can be seen in Figure 5, for a scenario with seven obstacles. The computation time stops increasing as soon as the robot sees all the obstacles present in the environment.

Figure 5. Impact of increasing the detection radius on the MCT/$T_c$ ratio (fitted curve: $-5.29\exp(-0.50\,\rho_d) + 3.23$).

Figure 6. Obstacle penetration decreasing as the time sampling increases (fitted curve: $9395.01\exp(-0.48\,N_s) + 3.13$).

4.1.2. Obstacle penetration $P$

The obstacle penetration area $P$ gives a metric for the obstacle avoidance and, consequently, for the solution quality. A solution where the planned trajectory does not pass through an obstacle at any instant of time gives $P = 0$, and the solution quality decreases with increasing $P$. However, since time sampling is performed during the optimization, $P$ is usually greater than zero. A way of assuring $P = 0$ would be to increase the radius of the obstacles computed by the robot's perception system by the maximum distance that the robot can travel within the time span $T_p/N_s$ (a sketch of this idea is given below). However simple, this approach represents a loss of optimality and is not considered in this work. It is relevant, then, to observe the impact of the algorithm parameters on the obstacle penetration area. The impacts of the $T_c/T_p$ ratio, $n_{knots}$ and $d_{sen}$ on this criterion are only significant for degraded cases, meaning that around typical values these parameters do not change $P$ significantly. Time sampling $N_s$, however, is a relevant parameter: Figure 6 shows the penetration area decreasing as the number of samples increases.
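The dilation workaround mentioned above (not adopted in the paper, but easy to state) amounts to growing every perceived obstacle before the constraints are built. A minimal sketch, with obstacles given as (x, y, radius) triples in the style of Table 1; the function name and the numeric values of $T_p$ and $N_s$ are illustrative, not from the paper.

    def inflate_obstacles(obstacles, v_max, t_p, n_s):
        """Dilate each perceived obstacle radius by the maximum distance the
        robot can cover between two consecutive time samples (T_p / N_s).
        Satisfying the sampled collision constraints then implies P = 0 for
        the continuous trajectory, at the price of some optimality."""
        margin = v_max * t_p / n_s
        return [(ox, oy, rad + margin) for (ox, oy, rad) in obstacles]

    # obstacles of Table 1; v_max from Table 1, T_p and N_s illustrative
    print(inflate_obstacles([(0.55, 1.91, 0.31), (-0.08, 3.65, 0.32),
                             (0.38, 4.65, 0.16)], v_max=1.0, t_p=1.2, n_s=12))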
4.1.3. Travel time $T_{tot}$

Another complementary metric for characterizing the solution quality is the travel time $T_{tot}$. Analyses of data from several simulations show a tendency that, for given values of $n_{knots}$, $N_s$ and $T_c$, the travel time decreases as the planning horizon $T_p$ decreases. This can be explained by the simple fact that, for a given $T_c$, a better solution (in terms of travel time) can be found if the planning horizon $T_p$ is smaller. Another relevant observation is that the overall travel time is shorter for smaller values of $N_s$. This apparent improvement is misleading, as it does not take into account the fact that the fewer the samples, the greater the obstacle penetration area, as shown previously in Figure 6. Furthermore, Figure 7 shows that the travel time is invariant to changes in the detection radius away from degraded, too small values. This indicates that local knowledge of the environment provides enough information for finding good solutions.

Figure 7. Impact of increasing the detection radius on $T_{tot}$ (total execution time against $d_{b,sen}$).

5. Conclusions

We have proposed a distributed motion planner based on a receding horizon approach, modified to take termination constraints into account. Near the goal configuration neighborhood, the receding horizon approach is stopped and a termination planning problem is solved to bring the robots to their precise final states. The problem is stated as a constrained optimization problem: it minimizes the time for reaching a goal configuration through a collision-free trajectory while securing the communication between robots. Both circle and convex polygon representations of obstacles are supported. The key techniques for implementing the motion planner are the flatness property of the system, the B-spline parametrization of the flat output and the SLSQP optimizer. Finally, solutions using this planner were generated for different scenarios in order to validate the method, and the impact of the different parameters on the computation time and on the quality of the solution was analyzed. Future work will be performed in a physical simulation environment, where the dynamics will be taken into account, as well as models of the sensors and the communication latency.
Acta Polytechnica 60(2):127–144, 2020
doi:10.14311/ap.2020.60.0127
© Czech Technical University in Prague, 2020, available online at https://ojs.cvut.cz/ojs/index.php/ap

Investigation of the dynamic behaviour of non-uniform thickness circular plates resting on Winkler and Pasternak foundations

Saheed Salawu^a,∗, Gbeminiyi Sobamowo^b, Obanishola Sadiq^a
^a University of Lagos, Department of Civil and Environmental Engineering, Akoka 100213, Nigeria
^b University of Lagos, Department of Mechanical Engineering, Akoka 100213, Nigeria
∗ Corresponding author: safolu@outlook.com

Abstract. The study of the dynamic behaviour of non-uniform thickness circular plates resting on elastic foundations is very important in the design of structural systems. The present research investigates the free vibration of isotropic circular plates of varying density and non-uniform thickness resting on Winkler and Pasternak foundations. The governing differential equation is analysed using the Galerkin method of weighted residuals. Both the linear and the nonlinear case are considered, and the surface radial and circumferential stresses are also determined. Thereafter, the accuracy and consistency of the analytical solutions obtained are ascertained by comparison with existing results available in the literature, and they are confirmed to be in good agreement. It is also observed that very accurate results can be obtained with few computations, and that issues relating to the singularity of the circular plate governing equation are handled with ease. The analytical solutions obtained are used to determine the influence of the elastic foundations, homogeneity and thickness variation on the dynamic behaviour of the circular plate, the effect of vibration on a free surface of the foundation, as well as the influence of the radial and circumferential stresses on the mode shapes of the plate considered. From the results, a maximum percentage difference of 8.1 % is observed with respect to the solutions obtained by other analytical methods. Furthermore, increasing the elastic foundation parameter increases the natural frequency. Extrema of the modal displacement occur due to the radial and circumferential stresses. The natural frequency increases as the thickness of the circular plate increases and, conversely, a decrease in the natural frequency is observed as the density varies.
It is envisioned that the present study will contribute to the existing knowledge of the classical theory of vibration.

Keywords: free vibration, natural frequency, Winkler and Pasternak, circular plate, Galerkin method of weighted residual.

1. Introduction

Applications of plates are commonly found in a variety of contexts in civil, mechanical, naval, marine and aeronautic engineering, and an understanding of the dynamic behaviour of plates is germane in several of them: for example, a plate resting on a foundation, fluid-structure interaction problems, panel flutter in aeronautics, and the coupling effect of electro-magnetic and thermal fields on a plate. Several engineering structures, such as railways, storage tank foundations and telecommunication masts, require information on the vibration of a plate resting on a foundation before the design is undertaken. The simplest model of the mechanical behaviour of the soil-foundation interaction is the Winkler foundation. The Winkler foundation has the limitation that there is no interaction between the lateral springs, which can lead to unreliable results; two-parameter elastic foundations were established to cater for this interaction, and their use offers a truer account of the soil-foundation interaction.

In a study of the vibration of a plate immersed in fluid, Jeong et al. [1] formulated a hypothetical approach for the hydroelastic analysis of a clamped-edge circular plate partially in contact with a liquid; the Rayleigh-Ritz method was adopted for obtaining the eigenvalue equation, and the influence of the thickness and the depth of the liquid on the natural frequencies was determined. In another work, Sari and Butcher [2] proposed a numerical approach for analysing the free vibration response of isotropic rectangular and annular Mindlin plates under damaged clamped boundary conditions using the Chebyshev collocation method; the results obtained show that the damaged boundary condition influences the natural frequency of the plate. Also, Fletcher [3] worked on a finite element formulation derived from a weighted residual method. In another study, Afsharmanesh et al. [4] investigated the vibration of laminated circular plates under in-plane loading resting on a Winkler foundation; the solutions were obtained using the Ritz method, and it was observed that the foundation stiffness has an increasing influence on the natural frequencies and that the critical buckling load was increased for the laminated circular plate.

Figure 1. Varying-thickness circular plate resting on two-parameter foundations.

On the application of semi-analytical methods, Lee et al. [5] proposed a new semi-analytical method for obtaining the eigenvalues of vibrating circular plates with several holes, and the influence of the holes on the natural frequencies was determined. Previous studies show that the inherent singularity issue of the circular plate is not easy to handle. Numerical methods are a reliable means of handling the governing equations of related problems but the convergence, the volume of iterations and the stability studies associated with them increase the computation time and cost.
Meanwhile, an attempt to obtain a symbolic solution for the dynamic behaviour of a circular plate resting on Winkler and Pasternak foundations requires the adoption of semi-analytical and analytical methods. Yun and Temuer [6, 7] formulated an improved homotopy perturbation method (HPM) to obtain a solution for the large deflection of a simply supported circular plate, and reliable results were obtained. In a later work, Zhong and Liao [8] used the homotopy analysis method (HAM) to obtain the solution of the nonlinear problem. In another study, Yalcin et al. [9] adopted the differential transform method (DTM) for the free vibration of a circular plate. Also, Li et al. [10] analysed the bending large deflection of a simply supported circular plate using an analytical method. The DTM is an equally versatile method, good at handling singularities and non-trivial systems of differential equations, but it requires a manipulation of the governing equation before the singularity problem is resolved and, subsequently, involves transforming the governing equation to an algebraic form; the number of iterations in the DTM is very cumbersome compared to the Galerkin method of weighted residuals. Meanwhile, the HPM suffers the setback of finding the embedded parameter and an initial approximation of the governing equation that satisfies the given conditions. Nonetheless, several studies on the free vibration of circular plates using different methods have been presented in the literature [11–25].

Moreover, the reliability and flexibility of the Galerkin method of weighted residuals [26] have made it more effective than other semi-numerical methods, and it is much simpler than other approximating methods of solution: it handles the circular plate vibration problem without any manipulation of the governing equation, with very precise results compared to numerical and experimental ones. The review of past studies shows that the dynamic behaviour of varying-thickness, non-homogeneous circular plates resting on Winkler and Pasternak foundations has not yet been analysed with the aid of the Galerkin method of weighted residuals. Therefore, the present study focuses on the application of the Galerkin method of weighted residuals to the dynamic analysis of a non-uniform thickness circular plate resting on Winkler and Pasternak foundations. The novelty of the present study also includes resolving the singularity problem associated with the circular plate without modifying the governing equation.

2. Problem formulation and mathematical analysis

A circular plate of varying thickness and non-homogeneous material resting on Winkler and Pasternak foundations, shown in Fig. 1, is considered under various boundary conditions: simply supported, free and clamped edges. Following Kirchhoff plate theory, the model of the governing equation rests on the following assumptions:
(1.) The plate thickness is small compared to the dimensions of the plate.
(2.) The normal stress in the transverse direction of the plate is negligible.
(3.) The rotary inertia effect is negligible.
(4.) Normals to the undeformed middle surface remain straight and normal to the deformed middle surface without stretching.
The differential governing equation for the transverse free vibration of an isotropic circular plate, as shown in Fig. 1, may be written as [12, 27, 28]

\[
g(r)\left\{\frac{\partial^4 w}{\partial r^4} + \frac{2}{r}\frac{\partial^3 w}{\partial r^3} - \frac{2m^2+1}{r^2}\frac{\partial^2 w}{\partial r^2} + \frac{2m^2+1}{r^3}\frac{\partial w}{\partial r} - \frac{4m^2-m^4}{r^4}\,w\right\}
+ \frac{\partial g}{\partial r}\left\{2\frac{\partial^3 w}{\partial r^3} + \frac{2+\nu}{r}\frac{\partial^2 w}{\partial r^2} - \frac{2m^2+1}{r^2}\frac{\partial w}{\partial r} + \frac{3m^2}{r^3}\,w\right\}
+ \frac{\partial^2 g}{\partial r^2}\left\{\frac{\partial^2 w}{\partial r^2} + \nu\left(\frac{1}{r}\frac{\partial w}{\partial r} - \frac{m^2}{r^2}\,w\right)\right\}
- k_s\frac{\partial^2 w}{\partial r^2} - \frac{k_s}{r}\frac{\partial w}{\partial r} + k_s\frac{m^2}{r^2}\,w + k_w\,w + \rho h\frac{\partial^2 w}{\partial t^2} = 0, \tag{1}
\]

where $r$ and $\theta$ are the polar coordinates, $w$ is the transverse displacement, $\rho$ is the material density, $h$ is the thickness of the plate, $D = Eh^3/12(1-\nu^2)$ is the flexural rigidity, $\nu$ is the Poisson ratio, $E$ is the elasticity coefficient, $k_w$ is the Winkler foundation parameter and $k_s$ is the Pasternak foundation parameter. For the free vibration problem, the solution is sought in the form of a Kantorovich-type approximation,

\[ w(r,t) = w(r)\,e^{i\omega t}. \tag{2} \]

To present the solution in a more general form, the following dimensionless parameters are used:

\[ r = \frac{r}{a}, \quad w = \frac{w}{h}, \quad \omega^2 = \frac{\rho_0 h_0 a^4}{D^*}\,\omega^2, \quad k_w = \frac{k_w a^4}{D^*}, \quad k_s = \frac{k_s a^2}{D^*}. \tag{3} \]

Applying Eq. (2) to Eq. (1), we have

\[
g(r)\left\{\frac{d^4 w}{dr^4} + \frac{2}{r}\frac{d^3 w}{dr^3} - \frac{2m^2+1}{r^2}\frac{d^2 w}{dr^2} + \frac{2m^2+1}{r^3}\frac{dw}{dr} - \frac{4m^2-m^4}{r^4}\,w\right\}
+ \frac{dg}{dr}\left\{2\frac{d^3 w}{dr^3} + \frac{2+\nu}{r}\frac{d^2 w}{dr^2} - \frac{2m^2+1}{r^2}\frac{dw}{dr} + \frac{3m^2}{r^3}\,w\right\}
+ \frac{d^2 g}{dr^2}\left\{\frac{d^2 w}{dr^2} + \nu\left(\frac{1}{r}\frac{dw}{dr} - \frac{m^2}{r^2}\,w\right)\right\}
- \frac{a^2 k_s}{D^*}\frac{d^2 w}{dr^2} - \frac{a^2 k_s}{D^* r}\frac{dw}{dr} + \frac{a^2 k_s}{D^*}\frac{m^2}{r^2}\,w + \frac{a^4 k_w}{D^*}\,w + \frac{\rho a^4 h}{D^*}\,\omega^2 w = 0, \tag{4}
\]

where $\omega$ is the natural frequency and $m$ is an integer. A linear variation of the thickness is considered, $h = h_0(1+\eta r)$, so that the thickness and the flexural rigidity of the plate can be expressed as

\[ h = h_0 f(r), \qquad D = D_0 f^3(r) = D_0 g(r), \tag{5} \]

where $h_0$ is the thickness at the centre, $D_0 = \frac{E h_0^3}{12(1-\nu^2)}$, and $\eta$ and $\xi$ are the thickness and density parameters, respectively. Hence

\[
\left(\eta^3 r^3 + 3\eta^2 r^2 + 3\eta r + 1\right)\left\{\frac{d^4 w}{dr^4} + \frac{2}{r}\frac{d^3 w}{dr^3} - \frac{2m^2+1}{r^2}\frac{d^2 w}{dr^2} + \frac{2m^2+1}{r^3}\frac{dw}{dr} - \frac{4m^2-m^4}{r^4}\,w\right\}
+ \left(3\eta^3 r^2 + 6\eta^2 r + 3\eta\right)\left\{2\frac{d^3 w}{dr^3} + \frac{2+\nu}{r}\frac{d^2 w}{dr^2} - \frac{2m^2+1}{r^2}\frac{dw}{dr} + \frac{3m^2}{r^3}\,w\right\}
+ \left(6\eta^3 r + 6\eta^2\right)\left\{\frac{d^2 w}{dr^2} + \nu\left(\frac{1}{r}\frac{dw}{dr} - \frac{m^2}{r^2}\,w\right)\right\}
- k_s\frac{d^2 w}{dr^2} - \frac{k_s}{r}\frac{dw}{dr} + k_s\frac{m^2}{r^2}\,w + k_w\,w + (1+\xi r)(1+\eta r)\,\omega^2 w = 0. \tag{6}
\]

2.1. Boundary conditions

The boundary conditions considered are, as stated earlier, the simply supported, clamped and free edges. Their dimensionless form may be written in terms of the deflection $w(r)$ as follows [8]:

• Clamped:
\[ w(r)\big|_{r=1} = 0, \qquad \frac{dw}{dr}\bigg|_{r=1} = 0. \tag{7} \]

• Simply supported:
\[ w(r)\big|_{r=1} = 0, \qquad M_r\big|_{r=1} = -D\left[\frac{d^2w}{dr^2} + \nu\left(\frac{1}{r}\frac{dw}{dr} + \frac{m^2}{r^2}\,w\right)\right] = 0. \tag{8} \]

• Free edge:
\[ M_r\big|_{r=1} = -D\left[\frac{d^2w}{dr^2} + \nu\left(\frac{1}{r}\frac{dw}{dr} + \frac{m^2}{r^2}\,w\right)\right] = 0, \qquad V_r\big|_{r=1} = \left[\frac{d^3w}{dr^3} + \frac{1}{r}\frac{d^2w}{dr^2} + \frac{m^2\nu - 2m^2 - 1}{r^2}\frac{dw}{dr} + \frac{3m^2 - m^2\nu}{r^3}\,w\right] = 0. \tag{9} \]

Here $M_r$ is the bending moment and $V_r$ the radial shear force per unit length. An $n$th-order differential equation requires $n$ boundary conditions; since the dimensionless equation (6) is a fourth-order governing equation, four boundary conditions are needed to resolve it. Two of the conditions are obtained from the condition at the outer edge of the plate, while the remaining two are obtained from the condition at the centre of the plate.
The regularity conditions at the centre are given as:

• Symmetric case:
\[ \frac{dw}{dr}\bigg|_{r=0} = 0, \qquad V_r\big|_{r=0} = \frac{d^3w}{dr^3}\bigg|_{r=0} = 0 \quad (m = 0, 2, 4, \dots). \tag{10} \]

• Axisymmetric case:
\[ w(r)\big|_{r=0} = 0, \qquad M_r\big|_{r=0} = \frac{d^2w}{dr^2}\bigg|_{r=0} = 0 \quad (m = 1, 3, 5, \dots). \tag{11} \]

3. Method of solution: principle of Galerkin weighted residuals

The Galerkin method was first proposed by Walther Ritz but is credited to the Soviet engineer Boris Galerkin [26]. The method is used for handling differential equations: the approximate solution of the differential equation is assumed to be thoroughly approximated by a finite sum of test functions, and the chosen method of weighted residuals is used to obtain the coefficients of the resulting test functions $\phi_i$. The coefficients are chosen so as to minimize the error between the linear combination of test functions and the actual solution, in a chosen norm. The technique is a reliable approximate-solution method capable of solving a variety of problems while eliminating the search for a variational formulation. Assume the governing equation

\[ L\big(\phi_s(r) + g(r)\big) = 0, \quad r \in \Phi, \tag{12} \]

with the boundary conditions

\[ B\left(\phi_s(r), \frac{d\phi_s(r)}{dr}\right) = 0, \quad r \in \partial\Phi, \tag{13} \]

where $L$ is the linear operator, $r$ the independent variable, $\phi_s(r)$ the unknown function, $\Phi$ the domain, $B$ the boundary operator and $\partial\Phi$ the boundary of the domain. The residual is

\[ L\big(\phi_s(r) + g(r)\big) = \tilde R(r) \quad \text{(residual)}. \tag{14} \]

The residual is minimized by multiplying it by a weight $w$ and integrating over the domain,

\[ \int_0^1 \tilde R(r)\, w(r)\, dr = 0. \tag{15} \]

The weight function $w_r$ may be found through the Galerkin, collocation, sub-domain or least-squares method; for this study, the Galerkin approach is adopted. The approximate solution $\bar u(r)$ is a linear combination of trial functions which must satisfy the essential boundary conditions,

\[ \bar u(r) \approx \sum_{i=1}^{n} c_i \phi_i(r), \tag{16} \]

where $\phi_i(r)$ is the trial function, $n$ the number of trial functions and $i$ the node in the domain, so that

\[ \int_0^1 \tilde R(r)\,\phi_i(r)\, dr = 0, \quad i = 1, 2, \dots, n. \]

3.1. Application of the Galerkin method of weighted residuals to the governing equation

For the sake of brevity, the symmetric regularity condition with the simply supported edge condition is presented here; the same approach is used to determine the other conditions treated in this study. The choice of the polynomial solution is based on the highest order of the derivative in the governing equation: this is a fourth-order equation, so the chosen polynomial is of order five. Assume a polynomial solution for the fourth-order differential equation,

\[ w(r) = a + br + \frac{c r^2}{2!} + \frac{d r^3}{3!} + \frac{e r^4}{4!} + \frac{g r^5}{5!}. \tag{17} \]

Applying the boundary conditions at $r=0$ for the symmetric case, Eq. (10), and the simply supported edge:

\[ w'(0) = b = 0, \tag{18} \]
\[ w'''(0) = d = 0, \tag{19} \]
\[ w(1) = a + \frac{c}{2} + \frac{e}{24} + \frac{g}{120} = 0, \tag{20} \]
\[ -D\left[w''(1) + \nu\left(\frac{1}{r}w'(1) + \frac{m^2}{r^2}w(1)\right)\right] \;\Rightarrow\; -\frac{221793\,c}{1000} - \frac{187671\,e}{2000} - \frac{244541\,g}{8000} = 0, \tag{21} \]

where $m = 0$ for the symmetric case and $m = 1$ for the asymmetric case. Solving the simultaneous equations (20) and (21) for the unknowns and substituting back into Eq. (17),

\[ w \;\Rightarrow\; a + \frac{1}{2}c r^2 + \frac{1}{24}\left(-\frac{5160a}{83} - \frac{2268c}{83}\right) r^4 + \left(\frac{6360c}{83} + \frac{15840a}{83}\right)\frac{r^5}{120}. \tag{22} \]

The chain functions $N_i(r)$ in Eq. (22) are associated with $a$ and $c$. As reported in Eqs. (7)–(9), the boundary conditions span the range $0 \to 1$; the integration limits for the Galerkin method are therefore 0 to 1.
The Galerkin equation is

\[ \int_0^1 N_i(r)\,\tilde R\, dr = 0, \tag{23} \]

with the chain functions

\[ N_1 = \frac{dw}{da} \;\Rightarrow\; 1 - \frac{215\,r^4}{83} + \frac{132\,r^5}{83}, \tag{24} \]
\[ N_2 = \frac{dw}{dc} \;\Rightarrow\; \frac{1}{2}r^2 - \frac{189\,r^4}{166} + \frac{53\,r^5}{83}. \tag{25} \]

Applying Eq. (23), simultaneous equations are obtained from the chain functions:

\[
\int_0^1 \left(1 - \frac{215\,r^4}{83} + \frac{132\,r^5}{83}\right)\Bigg[
\left(\eta^3r^3 + 3\eta^2r^2 + 3\eta r + 1\right)\left\{\frac{d^4w}{dr^4} + \frac{2}{r}\frac{d^3w}{dr^3} - \frac{2m^2+1}{r^2}\frac{d^2w}{dr^2} + \frac{2m^2+1}{r^3}\frac{dw}{dr} - \frac{4m^2-m^4}{r^4}\,w\right\}
+ \left(3\eta^3r^2 + 6\eta^2 r + 3\eta\right)\left\{2\frac{d^3w}{dr^3} + \frac{2+\nu}{r}\frac{d^2w}{dr^2} - \frac{2m^2+1}{r^2}\frac{dw}{dr} + \frac{3m^2}{r^3}\,w\right\}
+ \left(6\eta^3 r + 6\eta^2\right)\left\{\frac{d^2w}{dr^2} + \nu\left(\frac{1}{r}\frac{dw}{dr} - \frac{m^2}{r^2}\,w\right)\right\}
- k_s\frac{d^2w}{dr^2} - \frac{k_s}{r}\frac{dw}{dr} + k_s\frac{m^2}{r^2}\,w + k_w\,w + (1+\xi r)(1+\eta r)\,\omega^2 w
\Bigg]\,dr = 0. \tag{26}
\]
ω = √√√√√√√√√ ( 1.57 × 108ξ2 +1.17 × 109ξ +1.48 × 109 ) √√√√√√√   1.63 × 1017ks2ξ2 − 2.15 × 1018kskwξ2 + 1.10 × 1017kw2ξ2 +1.61 × 1019ks2ξ − 9.55 × 1018kskwξ + 6.88 × 1020ksξ2 +1.09 × 1020kwξ2 + 1.08 × 1020ks2 + 7.85 × 1021ksξ +4.98 × 1019kwξ + 6.05 × 1022ξ2 + 1.43 × 1022ks + 2.73 × 1023ξ +3.19 × 1023  − −1.08 × 109ksξ − 5.8 × 108kwξ + 1.08 × 1010ks − 1.48 × 109kw + . . .   (1.57 × 108ξ2 + 1.17 × 109ξ + 1.48 × 109) ; (34) ω = − √√√√√√√√√ ( 1.57 × 108ξ2 +1.17 × 109ξ +1.48 × 109 )1 ×   √√√√√√√   1.63 × 1017ks2ξ2 − 2.15 × 1018kskwξ2 + 1.10 × 1017kw2ξ2 +1.61 × 1019ks2ξ − 9.55 × 1018kskwξ + 6.88 × 1020ksξ2 +1.09 × 1020kwξ2 + 1.08 × 1020ks2 + 7.85 × 1021ksξ +4.98 × 1019kwξ + 6.05 × 1022ξ2 + 1.43 × 1022ks +2.73 × 1023ξ + 3.19 × 1023  − −1.08 × 109ksξ − 5.8 × 108kwξ + 1.08 × 1010ks − 1.48 × 109kw + . . .     (1.57 × 108ξ2 + 1.17 × 109ξ + 1.48 × 109) ; (35) ω = √√√√√√√√√ ( 1.57 × 108ξ2 +1.17 × 109ξ +1.48 × 109 )−1 ×   √√√√√√√   1.63 × 1017ks2ξ2 − 2.15 × 1018kskwξ2 + 1.10 × 1017kw2ξ2 +1.61 × 1019ks2ξ − 9.55 × 1018kskwξ + 6.88 × 1020ksξ2 +1.09 × 1020kwξ2 + 1.08 × 1020ks2 + 7.85 × 1021ksξ +4.98 × 1019kwξ + 6.05 × 1022ξ2 + 1.43 × 1022ks + 2.73 × 1023ξ +3.19 × 1023  − −1.08 × 109ksξ − 5.8 × 108kwξ + 1.08 × 1010ks − 1.48 × 109kw + . . .     (1.57 × 108ξ2 + 1.17 × 109ξ + 1.48 × 109) ; (36) ω = − √√√√√√√√√ ( 1.57 × 108ξ2 +1.17 × 109ξ +1.48 × 109 )1 ×  −1 ×   √√√√√√√   1.63 × 1017ks2ξ2 − 2.15 × 1018kskwξ2 + 1.10 × 1017kw2ξ2 +1.61 × 1019ks2ξ − 9.55 × 1018kskwξ + 6.88 × 1020ksξ2 +1.09 × 1020kwξ2 + 1.08 × 1020ks2 + 7.85 × 1021ksξ +4.98 × 1019kwξ + 6.05 × 1022ξ2 + 1.43 × 1022ks +2.73 × 1023ξ + 3.19 × 1023  − −1.08 × 109ksξ − 5.8 × 108kwξ + 1.08 × 1010ks + . . .       (1.57 × 108ξ2 + 1.17 × 109ξ + 1.48 × 109) ; (37) the numerical computation of the natural frequencies requires substituting values to solutions obtained in eq. (34-37) − 14923ω2 46030 − 28704 3259 + √ (4.17 × 1019ω4 + 2.52 × 1021ω2 + 2.52 × 1022) 2.00 × 108 , (38) solving the quadratic eq. (38) gives the natural frequency. ω = 4.957175854, −4.957175854, (39) substitute the positive root obtained in eq. (39) into eq. (31) gives,[ −1378654058 − 111560 8717 35062 31091 22757 53569 ]{ a c } (1) = { 0 0 } , (40) putting c = 1 in eq. (40), then a is calculated as,{ a c } (1) = { −2.654608759 1 } , (41) therefore, the deflection solution of the governing eq. (1) gives w(r) = 1 − 25785r2 20356 + 32609r4 93147 − 9721r5 116589 , (42) 3.2. application of galerkin method of weighted residual to nonlinear governing equation considering a non-uniform thickness circular plate resting on nonlinear foundation in fig. 2, vonkármán analogue is employed due to geometric nonlinearity condition involved. 133 s. salawu, g. sobamowo, o. sadiq acta polytechnica figure 2. varying thickness circular plate resting on three-parameter foundations. g(r)   ∂4w(r,t) ∂r4 + 2 r ∂3 ∂r3 w(r,t) − ( 2m2+1 r2 ) ∂2w(r,t) ∂r2 + ( 2m2+1 r3 ) ∂w(r,t) ∂r −( 4m2−m4 r4 ) w(r,t)   + ∂g(r) ∂r { 2 ∂3w(r,t) ∂r3 + 2 + ν r ∂2w(r,t) ∂r2 − 2m2 + 1 r2 ∂w(r,t) ∂r + 3m2 r3 w(r,t) } + ∂2g(r) ∂r2 { ∂2w(r,t) ∂r2 + ν ( 1 r ∂w(r,t) ∂r − m2 r2 w(r,t) )} − a2ks d∗ ∂2w(r,t) ∂r2 − a2ks d∗r ∂w(r,t) ∂r + a2ks d∗ m2 r2 w(r,t) − 3 4 a4kp d∗ (w(r,t))3 + a4kw d∗ w(r,t) + ρa4h d∗ ∂2w(r,t) ∂t2 − 1 r ∂ ∂r ( ϕ ∂w(r,t) ∂r ) = 0, (43) ∂2ϕ ∂r2 + 1 r ∂ϕ ∂r − ϕ r2 + eh 2r ( ∂w(r,t) ∂r )2 = 0 (44) where kp is the nonlinear winkler foundation and ϕ is the airy stress function. 
an approximate solution is obtained by assuming the non-linear free vibrations to have the same spatial shape, i.e., w(r,t) = ( c4r 4 + c2r2 + 1 ) φ (t) (45) substitute eq. (45) into eq. (44) and solve the ode ϕ(r,t) = c1 r − c3r + (φ(t))2r3 ( 2c42r4 + 4c2c4r2 + 3c22 ) 6 , (46) the value of ϕ is accordingly found to be finite at the origin c1 = 0. additionally, is the constant of integration is to be determined from in plane boundary conditions. maximum value of phi(t) coincides with the maximum deflection wmax divided by plate thickness h. the substitution of the expressions for w and ϕ given by eqs. (45) and (46) respectively into eq. (44) and the application of the galerkin procedure in the nonlinear time differential equation obtained in the form∫ 1 0 l′(w,ϕ)wrdr = 0 (47) l′(w,ϕ) = ( r3η3 + 3r2η2 + 3ηr + 1 )( 72c4φ (t) − ( 12c4r2 + 2c2 ) φ (t) r2 + ( 4c4r3 + 2c2r ) φ (t) r3 ) + + ( 3r3η3 + 6rη2 + 3η )( 48c4rφ (t) + (2 + ν) ( 12c4r2 + 2c2 ) φ (t) r − ( 4c4r3 + 2c2r ) φ (t) r2 ) + + ( 6rη3 + 6η2 )(( 12c4r2 + 2c2 ) φ (t) + ν ( 4c4r3 + 2c2r ) φ (t) r ) + 3 4 kp ( c4r 4 + c2r2 + 1 )3(φ (t))3− −kw ( c4r 4 + c2r2 + 1 ) φ (t) − ( rc3 − 16 (φ (t)) 2 r3 ( 2c42r4 + 4c2c4r2 + 3c22 ))( 12c4r2 + 2c2 ) φ (t) r − − ( c3 − 12 (φ (t)) 2 r2 ( 2c42r4 + 4c2c4r2 + 3c22 ) − 16 (φ (t)) 2 r3 ( 8c42r3 + 8c2c4r ))( 4c4r3 + 2c2r ) φ (t) r + + ks ( 12c4r2 + 2c2 ) φ (t) + ks ( 4c4r3 + 2c2r ) φ (t) r − (ξr + 1) (ηr + 1) ( c4r 4 + c2r2 + 1 ) d2 dt2 φ (t) , (48) 134 vol. 60 no. 2/2020 investigation of the dynamic behaviour of non-uniform thickness. . . we have [m] { d2φ̈s dt2 } + [k]{φs}− [v ] { φ3s } = {0} , (49) where m = (((−11088ξ − 12320) η − 12320ξ − 13860) c2 + (−13860ξ − 15840) η − 15840ξ − 18480) c4 55440 + + ((−6930ξ − 7920) η − 7920ξ − 9240) c22 55440 + ((−18480ξ − 22176) η − 22176ξ − 27720) c2 55440 + + ((−4620ξ − 5040) η − 5040ξ − 5544) c24 55440 + (−13860ξ − 18480) η 55440 − ξ 3 − 1 2 , (50) k = − (( −188ν5 − 3196 15 ) η3 + (−72ν − 504) η2 + ( −216ν7 − 2376 7 ) η + 3kw5 + 12c3 − 12ks − 64 ) 6 c24− − ( ((−398114 −873ν14 )η3+(−696−120ν)η2+(−24125 −252ν5 )η+ 3kw2 +20c3−20ks−96) 6 c2 + ( −324ν5 − 1836 5 ) η3 + (−144ν − 1008) η2 + (−72ν − 792) η − 192 + 2kw + 24c3 − 24ks ) c4− − (( −102ν5 − 102 5 ) η3 + (−36ν − 36) η2 + (−12ν − 12) η + 6c3 − 6ks + kw ) 6 c2 2− − ( (−33ν − 33) η3 + (−72ν − 72) η2 + (−36ν − 36) η + 3kw + 12c3 − 12ks ) 6 c2 − kw 2 , (51) v = − ( −kp4 − 40 7 ) c44 6 − (( −20 − 9kp8 ) c2 − 9kp 7 − 8 ) c34 6 − (( −28 − 27kp14 ) c22 + ( −20 − 9kp2 ) c2 − 27kp 10 ) c24 6 − − (( −18 − 3kp2 ) c32 + ( −20 − 27kp5 ) c22 − 27c2kp 4 − 3kp ) 6 c4 − ( −9kp20 − 4 ) 6 c42 − ( −9kp4 − 6 ) 6 c32+ + 3c22kp 4 + 3c2kp 4 + 3kp 8 , (52) 3.3. application of galerkin method of weighted residual to foundation free surface considering the vibration influence of foundation on the plate beyond the domain of the plate. same procedure from eq. (43) applies. the galerkin procedure eq. (47) in the nonlinear time differential equation now becomes∫ a 0 l(w,ϕ)wrdr− ∫ a2 a1 l′(w,ϕ)wrdr = 0 (53) where l(w,ϕ) is the foundation equation. 
we have [m] { d2φ̈s dt2 } + [k]{φs}− [v ] { φ3s } = {0} , (54) where m = c4 2 55440 ( −4620ξηa112 + (−5040ξ − 5040η) a111 − 5544a110 + 4620a210 ( 6 5 + ξηa2 2+( 12ξ 11 + 12η 11 ) a2 )) + + c4 55440   ( −11088ξηa110 + (−12320ξ − 12320η) a19 − 13860a18 + 11088a28 ( 5 4 + ξηa2 2 + ( 10ξ 9 + 10η 9 ) a2 )) c2 −13860ξηa18 + (−15840ξ − 15840η) a17 − 18480a16 + 13860a26 ( 4 3 + ξηa2 2 + ( 8ξ 7 + 8η 7 ) a2 )  + + c2 2 55440 ( −6930ξηa18 + (−7920ξ − 7920η) a17 − 9240a16 + 6930a26 (4 3 + ξηa22 + (8ξ 7 + 8η 7 ) a2 )) + + ( −18480ξηa16 + (−22176ξ − 22176η) a15 − 27720a14 + 18480a24 ( 3 2 + ξηa2 2 + ( 6 5 ξ + 6 5 η ) a2 )) c2 55440 − − 1 4 a1 4 ηξ + (−18480ξ − 18480η) a13 55440 − 1 2 a1 2 + 1 4 ( ξηa2 2 + 2 + (4 3 ξ + 4 3 η ) a2 ) a2 2 , (55) 135 s. salawu, g. sobamowo, o. sadiq acta polytechnica k = 1 24   ((432ν 5 + 2448 5 ) η3 − 12kw5 ) a1 10 + 64η3 ( ν + 173 ) a1 9 + ( (288ν + 2016) η2 + 48ks − 48c3 ) a1 8 + 864η(11+ν)a1 7 7 + 256a1 6 + (( −432ν5 − 2448 5 ) η3 + 12kw5 ) a2 10 − 64η3 ( ν + 173 ) a2 9 + ( (−288ν − 2016) η2 − 48ks + 48c3 ) a2 8 − 864η(11+ν)a2 7 7 − 256a2 6 − 12a8(a2kw−20ks) 5  c24+ + 1 24 ( (126ν + 630) η3 − 6kw ) a1 8 + 864η3a17 7 ( 37 9 + ν ) + ( (480ν + 2784) η2 + 80ks − 80c3 ) a1 6+ + 1008a15η 5 ( 67 7 + ν ) + · · · , (56) v = 1 24 ( − 160a214 7 + 160a114 7 + kpa118 −kpa218 + a18kp ) c4 4 + 1 24 ( 6a12kp + 6a112kp −6a212kp + 72a18 − 72a28 ) c32+ + 1 24 ( ( 80a112 − 80a212 − 92kpa2 16 + 92kpa1 16 + 92a 16kp ) c2 − 32a210 + 32a110 + 36a1 14kp 7 −36a2 14kp 7 + 36a14kp 7 ) c34+ + 1 24   ( 112a110 − 112a210 + 54a14kp 7 − 54a2 14kp 7 + 54a1 14kp 7 ) c2 2 + ( 18a12kp + 18a112kp − 18a212kp + 80a18 − 80a28 ) c2 + 54kp(a10+a1 10−a2 10) 5  c42 + 80a16 − 80a26 + · · · , (57) 3.4. the stress-deflection expression is [12] to determine the radial and circumferential stress for the circular plate, the following dimensionless expression may be used σrr = e (r,z) z 1 −ν2 ( d2w dr2 + ν r dw dr ) , σθθ = e (r,z) z 1 −ν2 ( 1 r dw dr + ν d2w dr2 ) , (58) where e is the young modulus of the circular plate, ν is the poisons ratio and z is the mid-plane of the plate. 4. results and discussion the analytical solution of the governing equation of motion of the circular plate under various boundary conditions with galerkin method of weighted residual is hereby presented. the material properties for the thin non-uniform thickness, non-homogenous wrought iron circular plate used are e = 200gpa material density rho = 7.70 g/m3 thickness of the plate h = 0.01 m and poisson’s ratio ν = 0.3 respectively. to validate the analytical solution of free vibration of non-homogenous and varying thickness circular plate resting on winkler and pasternak foundations using galerkin method of weighted residuals, application is made to the numeric data stated above given by [9]. galerkin method of weighted residual determined the natural frequency in dimensionless form. however, the accuracy of the analytical solutions obtained are compared with results as reported in the literature [25] and confirm in good harmony along the entire values under different boundary conditions and presented in table 1 and 2. since the dimensionless value of the natural frequency ω is obtained in the analysis, the results are valid for all thickness to radius ratio. also, the parametric studies of the controlling factors are presented in graphical form. the solution of the galerkin method of weighted residual is a determinant of order of the assumed polynomial chosen. 
In this study, a fifth-order polynomial is chosen for the fourth-order governing differential equation. The analytical solutions obtained are therefore limited to the first two natural frequencies, but this is good enough to predict the behaviour of the plate, and the results are observed to be similar to those reported in the literature [27]. Moreover, comparing the results obtained using the Galerkin method of weighted residuals to the results obtained in [9] using the DTM: the DTM results converged at ten iterations, while the present method predicts almost the same results with fewer steps of calculation using the assumed polynomial. It is also observed that, to obtain higher modes of the natural frequencies, the order of the polynomial assumed at the beginning of the analysis needs to be increased. The results shown in Tables 1 and 2 illustrate the fundamental natural frequencies obtained, which give a reasonable prediction of the circular plate behaviour with minimal iterations. Tables 1 and 2 also show comparisons of the results for the symmetric and asymmetric cases of the present study with the work reported in the literature. There is a good agreement between the present study and the results reported in [27]; hence, it can be concluded that the present procedure is very effective. Fig. 3 shows a comparison of the linear and nonlinear vibrations; it is observed that the discrepancy becomes significant as the vibration amplitude increases.

Table 1. Validation of the fundamental natural frequency (dimensionless $\omega$) for the symmetric condition.
  Simply supported: 4.93511 [9, 12, 27]; 4.95717 (present); % diff 2.206
  Clamped: 10.21582 [9, 12, 27]; 10.28571 (present); % diff 6.989
  Free: 9.00312 [9, 12, 27]; 9.03381 (present); % diff 3.069

Table 2. Validation of the fundamental natural frequency (dimensionless $\omega$) for the axisymmetric condition.
  Simply supported: 13.94 [9, 27, 29]; 13.9 (present); % diff 4
  Clamped: 21.26 [9, 27, 29]; 21.26 (present); % diff 0
  Free: 20.475 [9, 27, 29]; 20.556 (present); % diff 8.1

Figure 3. Comparison of the deflection time history for the linear and the nonlinear circular plate.

4.1. Investigation of the plate behaviour on the foundation

To further investigate the effect of the elastic foundation on the free vibration of the non-homogeneous, varying-thickness circular plate, the natural frequencies of the solutions obtained are plotted against the variation of the foundation parameters. The results are obtained by setting $m = 1$ and $m = 0$, respectively, as shown in Figs. 4 and 5. Figs. 4 and 5 show that, as the elastic foundation parameter (shear stiffness) of the elastic medium increases, the natural frequency of vibration of the uniform-thickness, homogeneous circular plate increases: the increased stiffness of the elastic medium makes the plate stiffer, so it vibrates at a higher natural frequency. It is a known characteristic of the plate to be affected by the properties of the elastic foundation. The same effect is also observed under the Pasternak foundation and when the plate rests on the combined Winkler and Pasternak foundations. Fig. 6 depicts the influence of varying the nonlinear foundation on the vibration of the circular plate: it is observed that increasing the combined foundation attenuates the deflection of the circular plate.
Fig. 7 illustrates the influence of the free surface of the foundation on the vibration characteristics of the circular plate. Conversely, it is observed that, as the stiffness of the free foundation increases, the nonlinear vibration frequency ratio increases.

Figure 4. Influence of the Pasternak foundation variation on the symmetric and asymmetric cases.
Figure 5. Influence of the Winkler foundation variation on the symmetric and asymmetric cases.
Figure 6. Influence of the elastic foundation variation on the deflection.
Figure 7. Influence of the foundation free surface on the vibration of the circular plate.
the more the stiffness of the plate, the more that natural frequency of the plate increase. this is an obvious result considering the classical theory of vibration. fig. 19 shows the result obtained while investigating the influence of plate thickness variation for the circular plate resting on nonlinear foundation. from the result, it is observed that increasing the thickness, deflection of the circular plate increases. as flexibility of the plate increases, the deflection increases. also, fig. 20 depicts the nonlinear frequency response of the circular when the plate thickness is increase with consideration to the free surface of the foundation. a decrease in nonlinear frequency ratio is observed as the thickness increases. 4.4. investigation of homogeneity parameter variation on the natural frequency the effect of homogeneity is determined and shown in fig. 21. homogeneity is a function of density variation. from the results obtained, it is shown that the natural frequency decreases with an increase in density. this implies that, as the material properties are being increased, the plates gained more mass. the more the plate mass increases the more the natural frequency decreases due to the increase in the mass of the plate, as a result, the natural frequency decrease. fig. 22 illustrates that, increasing the homogeneity of the circular plate leads to increase in the deflection of the circular plate meanwhile, from the result illustrated in fig. 23 increasing the density lowers the deflection of the circular plate in relation to the free surface of the foundation. 141 s. salawu, g. sobamowo, o. sadiq acta polytechnica figure 21. influence of density variation on symmetric and asymmetric case circular plate. figure 22. influence of density variation on deflection of circular plate for nonlinear foundation. figure 23. influence of density variation on deflection of circular plate for free surface of foundation. 5. conclusion in this study, free vibration of varying thickness and non-homogenous circular plate resting on winkler and pasternak foundations using galerkin of weighted residual method is investigated. from the parametric studies, it was established that (1.) the natural frequency of circular plate increases with increase in elastic winkler foundation parameter. (2.) the natural frequency of circular plate increases with an increase in elastic pasternak foundation parameter. (3.) extreme modal displacement occurs on the plate due to radial and circumferential stresses. (4.) the natural frequency of the circular plate increases with increase in the thickness of the circular plate (5.) the natural frequency of the circular plate decreases with increase in density of the circular plate (6.) the nonlinear vibration frequency ratio increases as the stiffness of free foundation increases. increasing nonlinear foundation stiffness decrease the deflection of the circular plate. (7.) deflection of the circular plate increases as the thickness increases for nonlinear model. conversely, the free foundation thickness increases the nonlinear frequency ratio decreases. (8.) increasing the homogeneity of the circular plate leads to increase in the deflection of the circular plate. the present study emphasizes the effect of elastic foundation thickness and density variation on the dynamic behaviour of a thin circular plate. 
also, the singularities issue of circular plate is handled with ease using galerkin of weighted residual, the ease of obtaining the eigenvalue without the difficulty of many iteration and complex mathematics is a huge benefit of the method. it is expected that the present study will contribute to the understanding of the study of the dynamic behavior of a circular plate under various parameters. 142 vol. 60 no. 2/2020 investigation of the dynamic behaviour of non-uniform thickness. . . 6. data availability the data used to support the findings of this study are available from the corresponding author upon request. 7. conflicts of interest the author declares that there are no conflicts of interest regarding the publication of this paper. acknowledgements the author expresses sincere appreciation to the university of lagos, nigeria and herzer research group, for providing material supports and a good environment for this work. list of symbols r radius of the plate c clamped edge plate e young’s modulus f free edge support s simply supported edge ω natural frequency d dr differential operator f dynamic deflection h plate thickness ρ mass density d modulus of elasticity references [1] k.-h. jeong, g.-m. lee, t.-w. kim. free vibration analysis of a circular plate partially in contact with a liquid. journal of sound and vibration 324(1):194 – 208, 2009. doi:10.1016/j.jsv.2009.01.061. [2] m. sari, e. butcher. free vibration analysis of rectangular and annular mindlin plates with undamaged and damaged boundaries by the spectral collocation method. jvc/journal of vibration and control 18(11):1722 – 1736, 2012. doi:10.1177/1077546311422242. [3] c. a. j. fletcher. an improved finite element formulation derived from the method of weighted residuals. computer methods in applied mechanics and engineering 15(2):207 – 222, 1978. doi:10.1016/0045-7825(78)90024-5. [4] b. afsharmanesh, a. ghaheri, f. taheri-behrooz. buckling and vibration of laminated composite circular plate on winkler-type foundation. steel and composite structures 17:1 – 19, 2014. doi:10.12989/scs.2014.17.1.001. [5] w. lee, j. chen, y. lee. free vibration analysis of circular plates with multiple circular holes using indirect biems. journal of sound and vibration 304(3):811 – 830, 2007. doi:10.1016/j.jsv.2007.03.026. [6] y. shan yun, c. temuer. application of the homotopy perturbation method for the large deflection problem of a circular plate. applied mathematical modelling 39(3):1308 – 1316, 2015. doi:10.1016/j.apm.2014.09.001. [7] y. shan yun, c. temuer. application of the homotopy perturbation method for the large deflection problem of a circular plate. applied mathematical modelling 39(3):1308 – 1316, 2015. doi:10.1016/j.apm.2014.09.001. [8] x. zhong, s. liao. analytic solutions of von karman plate under arbitrary uniform pressure (i): equations in differential form. studies in applied mathematics 138, 2016. doi:10.1111/sapm.12158. [9] h. s. yalcin, a. arikoglu, i. ozkol. free vibration analysis of circular plates by differential transformation method. applied mathematics and computation 212(2):377 – 386, 2009. doi:10.1016/j.amc.2009.02.032. [10] q. li, j. liu, h. xiao. a new approach for bending analysis of thin circular plates with large deflection. international journal of mechanical sciences 46(2):173 – 180, 2004. doi:10.1016/j.ijmecsci.2004.03.012. [11] z. liangchi, d. haojiang. the method of weighted residuals for transversely isotropic axisymmetric problems and its applications to engineering. 
acta mechanica sinica 3:261 – 267, 1987. doi:10.1007/bf02486772. [12] m. shariyat. differential transform vibration and modal stress analyses of circular plates made of two-directional functionally graded materials resting on elastic foundations. archive of applied mechanics 81:1289 – 1306, 2011. doi:10.1007/s00419-010-0484-x. [13] d. shi, h. zhang, q. wang, s. zha. free and forced vibration of the moderately thick laminated composite rectangular plate on various elastic winkler and pasternak foundations. shock and vibration 2017, 2017. [14] ömer civalek, m. h. acar. discrete singular convolution method for the analysis of mindlin plates on elastic foundations. international journal of pressure vessels and piping 84(9):527 – 535, 2007. doi:10.1016/j.ijpvp.2007.07.001. 143 http://dx.doi.org/10.1016/j.jsv.2009.01.061 http://dx.doi.org/10.1177/1077546311422242 http://dx.doi.org/10.1016/0045-7825(78)90024-5 http://dx.doi.org/10.12989/scs.2014.17.1.001 http://dx.doi.org/10.1016/j.jsv.2007.03.026 http://dx.doi.org/10.1016/j.apm.2014.09.001 http://dx.doi.org/10.1016/j.apm.2014.09.001 http://dx.doi.org/10.1111/sapm.12158 http://dx.doi.org/10.1016/j.amc.2009.02.032 http://dx.doi.org/10.1016/j.ijmecsci.2004.03.012 http://dx.doi.org/10.1007/bf02486772 http://dx.doi.org/10.1007/s00419-010-0484-x http://dx.doi.org/10.1016/j.ijpvp.2007.07.001 s. salawu, g. sobamowo, o. sadiq acta polytechnica [15] b. akgöz, o. civalek. nonlinear vibration analysis of laminated plates resting on nonlinear two-parameters elastic foundations. steel and composite structures 11:403 – 421, 2011. doi:10.12989/scs.2011.11.5.403. [16] m. bodaghi, a. saidi. stability analysis of functionally graded rectangular plates under nonlinearly varying in-plane loading resting on elastic foundation. archive of applied mechanics 81:765 – 780, 2011. doi:10.1007/s00419-010-0449-0. [17] o. civalek, c. demir. buckling and bending analyses of cantilever carbon nanotubes using the euler-bernoulli beam theory based on non-local continuum model. asian journal of civil engineering 12:651 – –661, 2011. [18] k. k. żur. quasi-green’s function approach to free vibration analysis of elastically supported functionally graded circular plates. composite structures 183:600 – 610, 2018. doi:10.1016/j.compstruct.2017.07.012. [19] k. k. żur. free vibration analysis of elastically supported functionally graded annular plates via quasi-green’s function method. composites part b: engineering 144:37 – 55, 2018. doi:10.1016/j.compositesb.2018.02.019. [20] k. k. żur, p. jankowski. multiparametric analytical solution for the eigenvalue problem of fgm porous circular plates. symmetry 11:429, 2019. [21] k. k. żur. free vibration analysis of discrete-continuous functionally graded circular plate via the neumann series method. applied mathematical modelling 73:166 – 189, 2019. doi:10.1016/j.apm.2019.02.047. [22] k. k. żur. green’s function approach to frequency analysis of thin circular plates. bulletin of the polish academy of sciences technical sciences 64(1), 2016. doi:10.1515/bpasts-2016-0020. [23] k. k. żur. green’s function in frequency analysis of circular thin plates of variable thickness 53:873 – 884, 2015. doi:10.15632/jtam-pl.53.4.873. [24] s. salawu, g. sobamowo. assessment of hybrid method on investigation of dynamic behaviour of isotropic rectangular plates resting on two-parameters foundation under a creative commons attribution-noncommercial 4.0 international license (cc by-nc 4 13:169 – 181, 2020. doi:10.17516/1999-494x-0213. [25] s. salawu, g. 
Sobamowo, O. Sadiq. Dynamic analysis of non-homogenous varying thickness rectangular plates resting on Pasternak and Winkler foundations. Engineering and Applied Science Letters 3:1–20, 2020. doi:10.30538/psrp-easl2020.0031.
[26] C. Vendhan, Y. Das. Application of Rayleigh-Ritz and Galerkin methods to non-linear vibration of plates. Journal of Sound and Vibration 39(2):147–157, 1975. doi:10.1016/s0022-460x(75)80214-8.
[27] T. Wu, G. Liu. Free vibration analysis of circular plates with variable thickness by the generalized differential quadrature rule. International Journal of Solids and Structures 38(44):7967–7980, 2001. doi:10.1016/s0020-7683(01)00077-4.
[28] D. Zhou, S. Lo, F. Au, Y. Cheung. Three-dimensional free vibration of thick circular plates on Pasternak foundation. Journal of Sound and Vibration 292(3):726–741, 2006. doi:10.1016/j.jsv.2005.08.028.
[29] A. W. Leissa. Vibration of plates. Tech. rep., National Aeronautics and Space Administration, Washington D.C., 1970.

Acta Polytechnica 56(3):236–244, 2016
doi:10.14311/ap.2016.56.0236
© Czech Technical University in Prague, 2016, available online at http://ojs.cvut.cz/ojs/index.php/ap

On the construction of partial difference schemes II: discrete variables and Schwarzian lattices

Decio Levi^a,∗, Miguel A. Rodríguez^b
^a Dipartimento di Matematica e Fisica, Università degli Studi Roma Tre and INFN Sezione di Roma Tre, Via della Vasca Navale 84, 00146 Roma, Italy
^b Departamento de Física Teórica II, Facultad de Físicas, Universidad Complutense, 28040 Madrid, Spain
∗ Corresponding author: decio.levi@roma3.infn.it

Abstract. In the process of constructing invariant difference schemes which approximate partial differential equations, we write down a procedure for discretizing a partial differential equation on an arbitrary lattice. An open problem is the meaning of a lattice which does not satisfy the Clairaut-Schwarz-Young theorem.
to analyze it we apply the procedure to a simple example, the potential burgers equation, with two different lattices: an orthogonal lattice, which is invariant under the symmetries of the equation and satisfies the commutativity of the partial difference operators, and an exponential lattice, which is not invariant and does not satisfy the clairaut–schwarz–young theorem. a discussion of the numerical results is presented, showing the different behavior of both schemes for two different exact solutions and their numerical approximations.

keywords: partial differential and difference equations; discretization; clairaut–schwarz–young theorem.

ams mathematics subject classification: 39a14, 35f05.

1. introduction

the construction of difference equations, written as invariants of the continuous group of symmetries of differential equations, is part of a project to apply symmetry group methods to the numerical solution of differential equations [2, 3, 6–9, 11–13, 15–17, 19–23]. this project has made significant advances since its introduction last century. in particular, the construction of invariant schemes has proven to be a very fruitful approach to numerical schemes for ordinary differential equations [2, 3], in cases where the usual approaches present serious problems of convergence and accuracy, for instance in the behavior of the solutions in the neighborhood of a singularity. here a deeper understanding of the mathematics involved, and of its relation with numerics, can provide results important for applications to problems in physics and mathematics.

the usual procedure in this framework is to compute the symmetry group of the differential equation and then compute the invariant lattice and the invariant difference equation with respect to that symmetry group. however, since differential and difference calculus differ substantially, special care has to be taken to ensure the consistency of the approach. for example, the clairaut–schwarz–young theorem on the equality of the cross derivatives, which is satisfied in the continuous case under some mild conditions on the functions, is not valid in the discrete case for a general lattice. recently it has been shown [14] that the discrete clairaut–schwarz–young theorem, i.e. the equality of the cross differences, imposes strong restrictions on the lattice. moreover, the construction of the discrete invariant scheme starting from the discrete invariants is not at all obvious, as it is usually obtained by finding a proper combination of the various discrete invariants through their continuous limits. the main idea guiding the construction of the whole scheme, that is, of the difference equations together with the equations defining the lattice (which in some cases are mixed), is that the continuous limit must yield the differential equation and trivial identities.

this article is a continuation of our work on the construction of partial difference schemes [14]. in the previous work we concentrated on the clairaut–schwarz–young theorem. here we introduce, by a one-to-one correspondence, a new set of discrete coordinates which describe the partial difference equation on the lattice. in terms of these coordinates we can immediately write down the discrete counterpart of any continuous invariant, so we can discretize in a straightforward way any partial differential equation described in terms of the invariants of a group of symmetries. a particular role in the construction of discrete invariant schemes is played by the lattice.
consequently the clairaut–schwarz–young theorem can play an important role in discriminating lattice schemes, i.e., the combination of the discrete equation and its lattice. in section 2 we study schemes for scalar partial differential equations and show the constraints on the group transformations due to the clairaut–schwarz–young theorem. using these results, in section 3 we construct in a standard way the invariant discrete potential burgers equation, and in section 4 we study it numerically for two different lattices, one invariant and schwarzian and one not, for two different exact solutions of the differential equation. some concluding remarks are presented in section 5.

2. schemes for partial difference equations

in the case of ordinary difference equations of order $K$ for one dependent variable $u_n$ and one independent variable $x_n$, a natural scheme is given by the points $\{x_{n+k-1}, u_{n+k-1},\ 1 \le k \le K+1\}$ for some fixed $n$. an alternative equivalent set of coordinates on the scheme is given by [14, 23]:

$\{x_n, u_n, p^{(1)}_{n+1}, p^{(2)}_{n+2}, p^{(3)}_{n+3}, p^{(4)}_{n+4}, \ldots, p^{(K-1)}_{n+K-1}, p^{(K)}_{n+K}, h_{n+1}, h_{n+2}, \ldots, h_{n+K}\}$ (1)

with

$h_{n+k} = x_{n+k} - x_{n+k-1}, \quad 1 \le k \le K, \qquad p^{(k)}_{n+k} = [D_x]^k u_n, \quad D_x = \frac{1}{h_{n+1}}[T_x - 1], \quad T_x u_n = u_{n+1}, \quad k \in \mathbb{Z}^+.$ (2)

in the continuous limit $h_{n+k} \to 0$ and $p^{(k)}_{n+k} \to d^k u(x)/dx^k$. if we transform the standard discrete prolongation [17]

$\mathrm{pr}^{(K)}\,\hat X = \sum_{k=n}^{n+K} \big( \xi_k(x_k,u_k)\,\partial_{x_k} + \phi_k(x_k,u_k)\,\partial_{u_k} \big)$ (3)

of the vector field

$\hat X = \xi_n(x_n,u_n)\,\partial_{x_n} + \phi_n(x_n,u_n)\,\partial_{u_n}$ (4)

to the new variables (1), we obtain

$\mathrm{pr}\,\hat X = \xi_n(x_n,u_n)\,\partial_{x_n} + \phi_n(x_n,u_n)\,\partial_{u_n} + \sum_{k=1}^{K} \kappa^{(k)}\,\partial_{h_{n+k}} + \sum_{k=1}^{K} \phi^{(k)}_{n+k}\,\partial_{p^{(k)}_{n+k}}.$ (5)

the general formulas for the coefficients $\kappa^{(k)}$ and $\phi^{(k)}$ are

$\kappa^{(k)} = \xi_{n+k} - \xi_{n+k-1}, \qquad \phi^{(k)}_{n+k} = D_x \phi^{(k-1)}_{n+k-1} - p^{(k)}_{n+k}\, D_x \xi_n, \quad k = 1, 2, \ldots, K.$ (6)

it is worthwhile to notice that both the higher order discrete derivatives and their corresponding prolongations are written in terms of $x_n$, $u_n$, $\xi_n(x_n,u_n)$, $\phi_n(x_n,u_n)$ and their difference consequences obtained by applying the operator $D_x$ given in (2). this way of constructing invariant ordinary difference equations is different from that considered in [13], but it is easily extendible to the case of partial difference equations.

for partial difference equations we consider here, for simplicity, only the case of one dependent variable and two independent variables, as in the example we will discuss in the following section. moreover, as we will deal with a nonlinear partial difference equation of second order, we limit ourselves to a scheme of six points $(n,m)$, $(n+1,m)$, $(n,m+1)$, $(n+2,m)$, $(n,m+2)$, $(n+1,m+1)$, the minimum number of points necessary to get all partial second derivatives as first order approximations. the variables $x$, $y$ and $u(x,y)$ in all points correspond to 18 data, 12 related to the independent variables and 6 to the dependent one. the extension to more variables and higher order equations requires just more points, but it is straightforward.
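as a side illustration (not part of the original paper), the one-dimensional operator in (2) is easy to exercise numerically; the sketch below builds $p^{(1)}$ and $p^{(2)} = [D_x]^2 u_n$ on a smoothly graded non-uniform lattice and compares them with the exact derivatives. all function and variable names are ours.

```python
import numpy as np

def forward_diff(x, u):
    """one application of the operator d_x from (2):
    (t_x - 1) u_n / h_{n+1}; the result lives on one point fewer than the input."""
    return (u[1:] - u[:-1]) / np.diff(x)

# smoothly graded non-uniform lattice (the continuous limit in (2) presumes
# the spacings shrink coherently, so the grading is kept smooth)
s = np.linspace(0.0, 1.0, 200)
x = s + 0.3 * np.sin(np.pi * s) / np.pi   # strictly monotone mapping of s
u = np.sin(3.0 * x)

p1 = forward_diff(x, u)        # p^(1) ~ u_x, first order accurate
p2 = forward_diff(x[:-1], p1)  # p^(2) = [d_x]^2 u ~ u_xx

print(np.max(np.abs(p1 - 3.0 * np.cos(3.0 * x[:-1]))))  # small, o(h)
print(np.max(np.abs(p2 + 9.0 * np.sin(3.0 * x[:-2]))))  # small, o(h)
```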
having 12 data for the independent variables we can construct from them 10 differences:

$x_{0,0}, \quad y_{0,0},$
$h^x_{0,0} = x_{1,0} - x_{0,0}, \quad h^y_{0,0} = y_{0,1} - y_{0,0}, \quad \sigma^x_{0,0} = x_{0,1} - x_{0,0}, \quad \sigma^y_{0,0} = y_{1,0} - y_{0,0},$
$h^x_{1,0} = x_{2,0} - x_{1,0}, \quad h^y_{0,1} = y_{0,2} - y_{0,1}, \quad \sigma^x_{0,1} = x_{0,2} - x_{0,1}, \quad \sigma^y_{1,0} = y_{2,0} - y_{1,0},$
$h^x_{0,1} = x_{1,1} - x_{0,1}, \quad h^y_{1,0} = y_{1,1} - y_{1,0},$ (7)

where for convenience of notation, here and in the following, whenever it may not create misunderstanding, we indicate just the distance of the indices from the values $n,m$. from the values of the dependent variable in the 6 points we can calculate the 6 quantities

$u_{0,0}, \quad D_x u_{0,0}, \quad D_y u_{0,0}, \quad [D_x]^2 u_{0,0}, \quad [D_y]^2 u_{0,0}, \quad D_x D_y u_{0,0},$ (8)

where the operators $D_x$ and $D_y$, introduced in [14], are given by

$D_x = \frac{1}{h^x_{0,0} h^y_{0,0} - \sigma^x_{0,0}\sigma^y_{0,0}} \big( h^y_{0,0}\Delta_n - \sigma^y_{0,0}\Delta_m \big), \qquad D_y = \frac{1}{h^x_{0,0} h^y_{0,0} - \sigma^x_{0,0}\sigma^y_{0,0}} \big( -\sigma^x_{0,0}\Delta_n + h^x_{0,0}\Delta_m \big)$ (9)

with $T_n f_{n,m} = f_{n+1,m}$, $\Delta_n = T_n - 1$, $T_m f_{n,m} = f_{n,m+1}$, $\Delta_m = T_m - 1$. these operators are the discrete counterpart of the partial derivatives in the $x,y$ axes directions, written in terms of the partial difference operators in the lattice directions, which, generically, do not coincide with the cartesian axes. see [14] for a more detailed description.

note that $D_y D_x u_{0,0}$ is not independent from the 6 quantities (8): it can be written in terms of (7) and (8). for a generic lattice we have

$D_y D_x u_{0,0} = \frac{ -D_x D_x u_{0,0}\,(h^x_{0,0} - h^x_{0,1})(h^x_{0,0} - \sigma^x_{0,0}) + D_y D_y u_{0,0}\,(h^y_{0,0} - h^y_{1,0})(h^y_{0,0} - \sigma^y_{0,0}) + D_x D_y u_{0,0}\,\big( h^x_{0,0} h^y_{1,0} + h^y_{0,0}\sigma^x_{0,0} - h^y_{1,0}\sigma^x_{0,0} - \sigma^x_{0,0}\sigma^y_{0,0} \big) }{ h^x_{0,1} h^y_{0,0} + h^x_{0,0}\sigma^y_{0,0} - h^x_{0,1}\sigma^y_{0,0} - \sigma^x_{0,0}\sigma^y_{0,0} }.$ (10)

formulas (7)–(10) can be simplified if we require the validity of the clairaut–schwarz–young theorem, i.e., $D_x D_y u_{0,0} = D_y D_x u_{0,0}$. this is true when the following constraint for the lattice holds [14]:

$\sigma^x_{n,m} = \sigma^x_{n+1,m} \equiv \sigma^x_m, \quad h^x_{n,m} = h^x_{n,m+1} \equiv h^x_n, \quad \sigma^y_{n,m} = \sigma^y_{n,m+1} \equiv \sigma^y_n, \quad h^y_{n,m} = h^y_{n+1,m} \equiv h^y_m.$ (11)

using the operators $D_x$ and $D_y$ given in (9) we can transform the standard discrete prolongation [17]

$\mathrm{pr}\,\hat X_{n,m} = \hat X_{n,m} + \hat X_{n+1,m} + \hat X_{n+2,m} + \hat X_{n,m+1} + \hat X_{n,m+2} + \hat X_{n+1,m+1}$ (12)

of the vector field

$\hat X_{n,m} = \xi_{n,m}\partial_{x_{n,m}} + \tau_{n,m}\partial_{y_{n,m}} + \phi_{n,m}\partial_{u_{n,m}}$ (13)

to the new set of independent variables (7), (8). we get

$\mathrm{pr}\,\hat X_{n,m} = \hat X_{n,m} + \sum_{(i,j)=0,1} \big[ \eta^{(x)}_{n+i,m+j}\partial_{h^x_{n+i,m+j}} + \chi^{(x)}_{n+i,m+j}\partial_{\sigma^x_{n+i,m+j}} + \eta^{(y)}_{n+i,m+j}\partial_{h^y_{n+i,m+j}} + \chi^{(y)}_{n+i,m+j}\partial_{\sigma^y_{n+i,m+j}} \big] + \phi^{(1,x)}_{n,m}\partial_{D_x u_{n,m}} + \phi^{(1,y)}_{n,m}\partial_{D_y u_{n,m}} + \phi^{(2,xx)}_{n,m}\partial_{[D_x]^2 u_{n,m}} + \phi^{(2,xy)}_{n,m}\partial_{D_y D_x u_{n,m}} + \phi^{(2,yy)}_{n,m}\partial_{[D_y]^2 u_{n,m}},$ (14)

where

$\eta^{(x)}_{n+i,m+j} = \xi_{n+1+i,m+j} - \xi_{n+i,m+j}, \qquad \eta^{(y)}_{n+i,m+j} = \tau_{n+i,m+1+j} - \tau_{n+i,m+j},$
$\chi^{(x)}_{n+i,m+j} = \xi_{n+i,m+1+j} - \xi_{n+i,m+j}, \qquad \chi^{(y)}_{n+i,m+j} = \tau_{n+1+i,m+j} - \tau_{n+i,m+j},$
$\phi^{(1,x)}_{n,m} = D_x\phi_{n,m} - D_x u_{n,m}\, D_x\xi_{n,m} - D_y u_{n,m}\, D_x\tau_{n,m},$
$\phi^{(1,y)}_{n,m} = D_y\phi_{n,m} - D_x u_{n,m}\, D_y\xi_{n,m} - D_y u_{n,m}\, D_y\tau_{n,m},$
$\phi^{(2,xx)}_{n,m} = D_x\phi^{(1,x)}_{n,m} - [D_x]^2 u_{n,m}\, D_x\xi_{n,m} - D_y D_x u_{n,m}\, D_x\tau_{n,m},$
$\phi^{(2,xy)}_{n,m} = D_x\phi^{(1,y)}_{n,m} - D_x D_y u_{n,m}\, D_x\xi_{n,m} - [D_y]^2 u_{n,m}\, D_x\tau_{n,m},$
$\phi^{(2,yy)}_{n,m} = D_y\phi^{(1,y)}_{n,m} - D_x D_y u_{n,m}\, D_y\xi_{n,m} - [D_y]^2 u_{n,m}\, D_y\tau_{n,m}.$ (15)

in the continuous limit, when $h^x_{n+i,m+j}$, $h^y_{n+i,m+j}$, $\sigma^x_{n+i,m+j}$ and $\sigma^y_{n+i,m+j}$ go to 0, the coefficients $\eta^{(x)}_{n+i,m+j}$, $\eta^{(y)}_{n+i,m+j}$, $\chi^{(x)}_{n+i,m+j}$ and $\chi^{(y)}_{n+i,m+j}$ also go to 0, while $\phi^{(1,x)}_{n,m}$, $\phi^{(1,y)}_{n,m}$, $\phi^{(2,xx)}_{n,m}$, $\phi^{(2,xy)}_{n,m}$ and $\phi^{(2,yy)}_{n,m}$ go to the corresponding continuous prolongations.
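before imposing the constraint (11) it may help to see the operators (9) and the failure of the cross-difference equality in action; the following minimal sketch (our own code, not from the paper) evaluates $D_x D_y u$ and $D_y D_x u$ at the base point of a six-point stencil, once on a generic lattice and once on a separable lattice that satisfies (11).

```python
import numpy as np

rng = np.random.default_rng(1)

def dx(f, x, y, i, j):
    """the operator d_x of (9) applied to the grid function f at point (i, j)."""
    hx = x[i+1, j] - x[i, j]; sx = x[i, j+1] - x[i, j]
    hy = y[i, j+1] - y[i, j]; sy = y[i+1, j] - y[i, j]
    return (hy * (f[i+1, j] - f[i, j]) - sy * (f[i, j+1] - f[i, j])) / (hx*hy - sx*sy)

def dy(f, x, y, i, j):
    """the operator d_y of (9)."""
    hx = x[i+1, j] - x[i, j]; sx = x[i, j+1] - x[i, j]
    hy = y[i, j+1] - y[i, j]; sy = y[i+1, j] - y[i, j]
    return (-sx * (f[i+1, j] - f[i, j]) + hx * (f[i, j+1] - f[i, j])) / (hx*hy - sx*sy)

def cross_differences(x, y, u):
    """return (d_x d_y u, d_y d_x u) at the base point of the six-point stencil."""
    g = np.full((3, 3), np.nan); h = np.full((3, 3), np.nan)
    for i, j in [(0, 0), (1, 0), (0, 1)]:   # the points where first differences are needed
        g[i, j] = dy(u, x, y, i, j)
        h[i, j] = dx(u, x, y, i, j)
    return dx(g, x, y, 0, 0), dy(h, x, y, 0, 0)

# a generic lattice: independent perturbations at every point, so (11) fails
n, m = np.meshgrid(np.arange(3.0), np.arange(3.0), indexing="ij")
x, y = n + 0.2 * rng.random((3, 3)), m + 0.2 * rng.random((3, 3))
u = x**2 * y + y**2
print(cross_differences(x, y, u))     # two visibly different numbers

# a separable lattice x = a_n + c_m, y = d_n + e_m satisfies (11)
a = np.arange(3.0) + 0.3 * rng.random(3); c = 0.2 * rng.random(3)
d = 0.2 * rng.random(3);                  e = np.arange(3.0) + 0.3 * rng.random(3)
xs, ys = a[:, None] + c[None, :], d[:, None] + e[None, :]
us = xs**2 * ys + ys**2
print(cross_differences(xs, ys, us))  # the two values now agree up to rounding
```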
applying the infinitesimal generator (14) onto (11) we get that both functions $\xi_{n,m}(x_{n,m},y_{n,m},u_{n,m})$ and $\tau_{n,m}(x_{n,m},y_{n,m},u_{n,m})$ must satisfy the discrete wave equations

$\xi_{n,m+1} - \xi_{n,m} - \xi_{n+1,m+1} + \xi_{n+1,m} = 0, \qquad \tau_{n,m+1} - \tau_{n,m} - \tau_{n+1,m+1} + \tau_{n+1,m} = 0.$ (16)

this is a constraint for the symmetry coefficients if the clairaut–schwarz–young theorem is to be satisfied, i.e., (16) has to be added to the determining equations if we want a lattice satisfying the clairaut–schwarz–young theorem. in this case, if, for example, the difference equation for $u_{n,m}$ involves second order shifts like $u_{n+2,m}$ or $u_{n,m+2}$, so that $u_{n,m}$, $u_{n,m+1}$ and $u_{n+1,m}$ are independent variables, then from (16) we get $\xi_{n,m}(x_{n,m},y_{n,m},u_{n,m}) = \xi_{n,m}(x_{n,m},y_{n,m})$. if the lattice equations in our scheme involve the points $x_{n+2,m}, y_{n+2,m}$ or $x_{n,m+2}, y_{n,m+2}$, so that $x_{n,m}$, $x_{n,m+1}$, $x_{n+1,m}$, $y_{n,m}$, $y_{n,m+1}$ and $y_{n+1,m}$ are independent variables, then $\xi_{n,m}(x_{n,m},y_{n,m},u_{n,m}) = \xi_{n,m}$. then, as in the case of the continuous wave equation, the general solution of (16) is given by $\xi_{n,m} = f^{(x)}_n + g^{(x)}_m$ and $\tau_{n,m} = f^{(y)}_n + g^{(y)}_m$, i.e., the sum of an arbitrary function of $n$ and one of $m$.

it is worthwhile, in view of the application to be carried out in the next section, to compute the lowest order discrete derivatives of monomials in $x$ and $y$:

$D_x x_{n,m} = 1, \quad D_x y_{n,m} = 0, \quad D_y x_{n,m} = 0, \quad D_y y_{n,m} = 1,$
$D_x x^2_{n,m} = 2x_{n,m} + \dfrac{h^y_{n,m}(h^x_{n,m})^2 - \sigma^y_{n,m}(\sigma^x_{n,m})^2}{h^y_{n,m}h^x_{n,m} - \sigma^y_{n,m}\sigma^x_{n,m}} = 2x_{n,m} + \Delta^x_{xx},$
$D_x x_{n,m}y_{n,m} = y_{n,m} + \dfrac{h^y_{n,m}\sigma^y_{n,m}(h^x_{n,m} - \sigma^x_{n,m})}{h^y_{n,m}h^x_{n,m} - \sigma^y_{n,m}\sigma^x_{n,m}} = y_{n,m} + \Delta^x_{xy},$
$D_x y^2_{n,m} = -\dfrac{h^y_{n,m}\sigma^y_{n,m}(h^y_{n,m} - \sigma^y_{n,m})}{h^y_{n,m}h^x_{n,m} - \sigma^y_{n,m}\sigma^x_{n,m}} = \Delta^x_{yy},$
$D_y x^2_{n,m} = -\dfrac{h^x_{n,m}\sigma^x_{n,m}(h^x_{n,m} - \sigma^x_{n,m})}{h^y_{n,m}h^x_{n,m} - \sigma^y_{n,m}\sigma^x_{n,m}} = \Delta^y_{xx},$
$D_y x_{n,m}y_{n,m} = x_{n,m} + \dfrac{h^x_{n,m}\sigma^x_{n,m}(h^y_{n,m} - \sigma^y_{n,m})}{h^y_{n,m}h^x_{n,m} - \sigma^y_{n,m}\sigma^x_{n,m}} = x_{n,m} + \Delta^y_{xy},$
$D_y y^2_{n,m} = 2y_{n,m} + \dfrac{h^x_{n,m}(h^y_{n,m})^2 - \sigma^x_{n,m}(\sigma^y_{n,m})^2}{h^y_{n,m}h^x_{n,m} - \sigma^y_{n,m}\sigma^x_{n,m}} = 2y_{n,m} + \Delta^y_{yy},$ (17)

where the quantities $\Delta^x_{xx}$, $\Delta^x_{xy}$, $\Delta^x_{yy}$, $\Delta^y_{xx}$, $\Delta^y_{xy}$ and $\Delta^y_{yy}$ go to zero in the continuous limit when $h$ and $\sigma$ go to zero.

3. example: the potential burgers equation

the burgers equation

$u_t = \nu u_{xx} + u u_x,$ (18)

a very well known partial differential equation, appears as a simplification of the navier–stokes equation and has been studied from many, if not all, points of view [4]. it was proposed as a model for a viscous fluid, with a viscosity parameter $\nu$. when the viscosity parameter $\nu$ is set equal to zero, the burgers equation degenerates into a quasilinear first order equation which is the prototype of a class of equations exhibiting nonlinear phenomena such as shock waves. in fact, the limit $\nu \to 0$ allows the study of these shock wave solutions as limits of the solutions of the viscous burgers equation. in particular, although the inviscid burgers equation has an infinite dimensional group of symmetries (being a first order equation), the limit of the symmetry group of the viscous burgers equation provides a subgroup of the whole group of symmetries of the inviscid burgers equation, which is a useful tool in the study of the equation and in particular of its discretization using invariant techniques. several invariant discretization approaches have been proposed to construct explicit numerical schemes for finding numerical solutions on invariant lattices [1, 10]. in some of these works an explicit comparison has been made, showing the higher accuracy and stability of these methods [5].
it is not our intention in this paper to present numerical results of this kind, but rather to prove the possibility of constructing such invariant discrete schemes in an easy way and to discuss, from the numerical point of view, the properties of a lattice satisfying the clairaut–schwarz–young theorem. for the control of the numerical calculations we will study the time evolution of the initial condition provided by exact solutions of the burgers equation. to simplify the presentation, and with no loss of generality, we will consider the potential burgers equation, as this is point transformable into the linear heat equation, for which many exact solutions are known.

let us construct, using the formulas introduced in the previous section, the discrete scheme which preserves the point symmetries of the potential burgers equation

$u_y - u_{xx} - u_x^2 = 0.$ (19)

the point symmetries of (19) are [18]

$\hat v_1 = \partial_x, \quad \hat v_2 = \partial_y, \quad \hat v_3 = \partial_u, \quad \hat v_4 = x\partial_x + 2y\partial_y, \quad \hat v_5 = 2y\partial_x - x\partial_u,$
$\hat v_6 = 4yx\partial_x + 4y^2\partial_y - (x^2 + 2y)\partial_u, \quad \hat v_\alpha = \alpha(x,y)\,e^{-u}\partial_u,$ (20)

where the function $\alpha(x,y)$ satisfies the heat equation

$\alpha_y = \alpha_{xx}.$ (21)

the infinitesimal generator $\hat v_\alpha$ is the one responsible for the linearizability of the potential burgers equation, as it provides its linearizing transformation

$\alpha = e^u.$ (22)

a function $f(x,y,u,u_x,u_y,u_{xx})$ is invariant under the infinitesimal generators $\hat v_1$, $\hat v_2$, $\hat v_3$, $\hat v_4$, $\hat v_5$ if it depends on

$I^{(1)} = \frac{u_y - u_x^2}{u_{xx}}.$ (23)

it is then easy to see that (23) is weakly invariant also under $\hat v_6$ and $\hat v_\alpha$, i.e.,

$\mathrm{pr}^{(2)}\hat v_6\, I^{(1)} = \frac{2}{u_{xx}}\big( I^{(1)} - 1 \big), \qquad \mathrm{pr}^{(2)}\hat v_\alpha\, I^{(1)} = -\frac{e^{-u}}{u_{xx}}\big( \alpha u_x^2 + 2\alpha_x u_x + \alpha_y \big)\big( I^{(1)} - 1 \big).$

the potential burgers equation (19) is then given by

$I^{(1)} = 1.$ (24)

taking into account the discrete prolongation (14) and the definition of the infinitesimal coefficients (15), we can construct the discrete prolongation of the vector fields (20). it is easy to show that the conditions (16) are satisfied for all symmetries (20) except for $\hat v_6$, for which

$\xi_{n,m+1} - \xi_{n,m} - \xi_{n+1,m+1} + \xi_{n+1,m} = -4\big[ h^x_{n,m}h^y_{n,m} + \sigma^x_{n,m}\sigma^y_{n,m} \big] \neq 0, \qquad \tau_{n,m+1} - \tau_{n,m} - \tau_{n+1,m+1} + \tau_{n+1,m} = -8\, h^x_{n,m}\sigma^x_{n,m} \neq 0.$

since we are presently interested in analyzing the role of the schwarz condition in the construction of lattices and its consequences in numerical computations, we will not consider the symmetry with generator $\hat v_6$ in the following; that is, we restrict ourselves to a subgroup of the whole symmetry group (note that we have also disregarded in our analysis the operator $\hat v_\alpha$, which, as we remarked above, is related to the linearization of the equation under study). taking into account (17) we have:

$\mathrm{pr}^d\,\hat v_1 = \partial_{x_{n,m}}, \qquad \mathrm{pr}^d\,\hat v_2 = \partial_{y_{n,m}}, \qquad \mathrm{pr}^d\,\hat v_3 = \partial_{u_{n,m}},$
$\mathrm{pr}^d\,\hat v_4 = x_{n,m}\partial_{x_{n,m}} + 2y_{n,m}\partial_{y_{n,m}} + h^x_{n,m}\partial_{h^x_{n,m}} + h^x_{n+1,m}\partial_{h^x_{n+1,m}} + h^x_{n,m+1}\partial_{h^x_{n,m+1}} + \sigma^x_{n,m}\partial_{\sigma^x_{n,m}} + \sigma^x_{n,m+1}\partial_{\sigma^x_{n,m+1}} + 2\big( h^y_{n,m}\partial_{h^y_{n,m}} + h^y_{n+1,m}\partial_{h^y_{n+1,m}} + h^y_{n,m+1}\partial_{h^y_{n,m+1}} + \sigma^y_{n,m}\partial_{\sigma^y_{n,m}} + \sigma^y_{n+1,m}\partial_{\sigma^y_{n+1,m}} \big) - D_x u_{n,m}\,\partial_{D_x u_{n,m}} - 2 D_y u_{n,m}\,\partial_{D_y u_{n,m}} - 2[D_x]^2 u_{n,m}\,\partial_{[D_x]^2 u_{n,m}},$
$\mathrm{pr}^d\,\hat v_5 = 2y_{n,m}\partial_{x_{n,m}} - x_{n,m}\partial_{u_{n,m}} + 2\sigma^y_{n,m}\partial_{h^x_{n,m}} + 2\sigma^y_{n+1,m}\partial_{h^x_{n+1,m}} + 2\big( h^y_{n+1,m} + \sigma^y_{n,m} - h^y_{n,m} \big)\partial_{h^x_{n,m+1}} + 2h^y_{n,m}\partial_{\sigma^x_{n,m}} + 2h^y_{n,m+1}\partial_{\sigma^x_{n,m+1}} - \partial_{D_x u_{n,m}} - 2 D_x u_{n,m}\,\partial_{D_y u_{n,m}}.$ (25)

the commutation table of this algebra appears in table 1 (it is, obviously, the same as in the continuous case). it is immediate to see that a discrete potential burgers equation preserving the lie algebra of (19) given by the generators $\hat v_1$, $\hat v_2$, $\hat v_3$, $\hat v_4$, $\hat v_5$ is

$I^{(1)} = \frac{[D_x]^2 u_{n,m}}{D_y u_{n,m} - (D_x u_{n,m})^2} = 1,$

i.e.,

$D_y u_{n,m} - [D_x]^2 u_{n,m} - (D_x u_{n,m})^2 = 0.$ (26)
the continuous limit of (26) is trivially given by (19) when $h^x$, $h^y$, $\sigma^x$ and $\sigma^y$ go to zero preserving the structure of the lattice. eq. (26) involves the 6 lattice points centered around $(n,m)$, i.e., $(n,m)$, $(n+1,m)$, $(n,m+1)$, $(n+2,m)$, $(n+1,m+1)$, $(n,m+2)$. it explicitly reads:

$\dfrac{-\sigma^x_m(u_{n+1,m} - u_{n,m}) + h^x_n(u_{n,m+1} - u_{n,m})}{h^x_n h^y_m - \sigma^x_m\sigma^y_n} - \Bigg[ h^y_m\Bigg( \dfrac{h^y_m(u_{n+2,m} - u_{n+1,m}) - \sigma^y_{n+1}(u_{n+1,m+1} - u_{n+1,m})}{h^x_{n+1}h^y_m - \sigma^x_m\sigma^y_{n+1}} - \dfrac{h^y_m(u_{n+1,m} - u_{n,m}) - \sigma^y_n(u_{n,m+1} - u_{n,m})}{h^x_n h^y_m - \sigma^x_m\sigma^y_n} \Bigg) - \sigma^y_n\Bigg( \dfrac{h^y_{m+1}(u_{n+1,m+1} - u_{n,m+1}) - \sigma^y_n(u_{n,m+2} - u_{n,m+1})}{h^x_n h^y_{m+1} - \sigma^x_{m+1}\sigma^y_n} - \dfrac{h^y_m(u_{n+1,m} - u_{n,m}) - \sigma^y_n(u_{n,m+1} - u_{n,m})}{h^x_n h^y_m - \sigma^x_m\sigma^y_n} \Bigg) \Bigg]\big( h^x_n h^y_m - \sigma^x_m\sigma^y_n \big)^{-1} - \dfrac{\big( h^y_m(u_{n+1,m} - u_{n,m}) - \sigma^y_n(u_{n,m+1} - u_{n,m}) \big)^2}{\big( h^x_n h^y_m - \sigma^x_m\sigma^y_n \big)^2} = 0.$ (27)

table 1: commutation table of the discrete invariance algebra.

         v̂1     v̂2      v̂3    v̂4     v̂5
 v̂1     0      0       0     v̂1    −v̂3
 v̂2     0      0       0     2v̂2   2v̂1
 v̂3     0      0       0     0      0
 v̂4    −v̂1   −2v̂2    0     0      v̂5
 v̂5     v̂3   −2v̂1    0    −v̂5    0

to complete the difference scheme we have to associate to it a lattice equation which preserves the symmetries (25), or part of them. the complete list of discrete invariants of (25), obtained as usual as solutions of the equations $\mathrm{pr}^d\,\hat v_i\, K = 0$, is:

$K_1 = \dfrac{h^y_{n+1,m}}{h^y_{n,m}}, \quad K_2 = \dfrac{h^y_{n,m+1}}{h^y_{n,m}}, \quad K_3 = \dfrac{\sigma^y_{n,m}}{h^y_{n,m}}, \quad K_4 = \dfrac{\sigma^y_{n+1,m}}{h^y_{n,m}},$
$K_5 = \dfrac{1}{(h^y_{n,m})^{3/2}}\big( h^x_{n,m}h^y_{n,m} - \sigma^x_{n,m}\sigma^y_{n,m} \big), \qquad K_6 = \dfrac{1}{(h^y_{n,m})^{3/2}}\big( h^y_{n,m+1}\sigma^x_{n,m} - h^y_{n,m}\sigma^x_{n,m+1} \big),$
$K_7 = \dfrac{1}{(h^y_{n,m})^{3/2}}\big( h^x_{n,m}(h^y_{n+1,m} - h^y_{n,m}) - \sigma^y_{n,m}(h^x_{n,m+1} - h^x_{n,m}) \big), \qquad K_8 = \dfrac{1}{(h^y_{n,m})^{3/2}}\big( h^x_{n,m}\sigma^y_{n+1,m} - h^x_{n+1,m}\sigma^y_{n,m} \big),$
$K_9 = \dfrac{1}{(h^y_{n,m})^{1/2}}\big( h^x_{n,m} + 2\sigma^y_{n,m}\, D_x u_{n,m} \big), \qquad K_{10} = \dfrac{D_y u_{n,m} - (D_x u_{n,m})^2}{[D_x]^2 u_{n,m}}.$ (28)

our first choice for the difference scheme is an orthogonal cartesian lattice given, in the coordinates $h_{n,m}$ and $\sigma_{n,m}$, by

$h^y_{n,m} = b, \quad \sigma^y_{n,m} = 0, \quad h^x_{n,m} = a, \quad \sigma^x_{n,m} = 0,$ (29)

where $a$ and $b$ are arbitrary constants which in the continuous limit go to zero. this lattice, which is clearly invariant under all the symmetries which satisfy the commutativity constraint, corresponds to the lie point symmetry infinitesimal generators

$\xi_{n,m} = \tau_{n,m} = 1.$ (30)

these generators comply with the constraints in (16), and the orthogonal lattice (29) satisfies the clairaut–schwarz–young theorem. to carry out a comparison between this invariant schwarzian lattice and other lattices which lack these properties, we will consider an exponential non-schwarzian lattice, as given in [14], by

$y_{n,m} = b\, m + b_0, \qquad x_{n,m} = (1+c)^m (a\, n + a_0),$ (31)

where $a$, $a_0$, $b$, $b_0$ and $c$ are arbitrary constants. $b$ and $a$ are the lattice spacings and $c$ is a dilation parameter which, when set equal to zero, reduces this lattice to an orthogonal lattice. this lattice corresponds to the lie point symmetry infinitesimal generators

$\xi_{n,m} = (1+c)^m(k_1 n + k_2) + k_0\, x_{n,m}, \qquad \tau_{n,m} = k_3,$ (32)

which clearly do not satisfy (16). the non-schwarzian property of this lattice can also be seen by considering the lattice differences

$h^y_{n,m} = b, \quad \sigma^y_{n,m} = 0, \quad h^x_{n,m} = (1+c)^m a, \quad \sigma^x_{n,m} = c(1+c)^m a\, n,$ (33)

and comparing them with (11). this non-schwarzian lattice is not invariant under the potential burgers symmetry group (20): $K_5$, $K_6$, $K_7$ and $K_9$ are not constants. in the following section we will provide a numerical test of the possible importance of the clairaut–schwarz–young theorem, by studying numerically the evolution provided by the symmetry preserving discretized potential burgers equation (27) on the two different lattices introduced above, i.e., the schwarzian and non-schwarzian lattices (29), (33).
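the two lattices are easy to generate; the following sketch (our own code and names, not from the paper) builds (29) and (31) and measures the violation of the constraint (11) on each.

```python
import numpy as np

def lattices(N=8, a=0.1, b=0.1, c=0.15, a0=0.0, b0=0.0):
    """return the orthogonal lattice (29) and the exponential lattice (31)."""
    n, m = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
    x_orth, y_orth = a * n, b * m                     # (29)
    x_exp = (1.0 + c)**m * (a * n + a0)               # (31)
    y_exp = b * m + b0
    return (x_orth, y_orth), (x_exp, y_exp)

def schwarz_violation(x, y):
    """maximum violation of (11): h^x must not depend on m, sigma^x not on n,
    h^y not on n, sigma^y not on m."""
    hx = x[1:, :] - x[:-1, :]; sx = x[:, 1:] - x[:, :-1]
    hy = y[:, 1:] - y[:, :-1]; sy = y[1:, :] - y[:-1, :]
    return max(np.ptp(hx, axis=1).max(),   # spread of h^x over m
               np.ptp(sx, axis=0).max(),   # spread of sigma^x over n
               np.ptp(hy, axis=0).max(),   # spread of h^y over n
               np.ptp(sy, axis=1).max())   # spread of sigma^y over m

orth, expo = lattices()
print(schwarz_violation(*orth))   # 0.0: (29) is schwarzian
print(schwarz_violation(*expo))   # > 0: (31) violates (11)
```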
4. numerical calculation results for the discrete potential burgers equation

in order to compare the precision and accuracy of the different lattices used to compute numerically the solution of the potential burgers equation, we construct two of its exact solutions (associated to different symmetries of the heat equation) and use them as initial conditions for the evolution of the burgers map (27). in fact, we will not do a direct comparison of the evolution of this map on the two lattices (29) and (33), but compare the solution of the map on each lattice with the exact solution of the continuous equation.

we introduce two invariant solutions of (21) and transform them into solutions of the potential burgers equation [14] using (22). starting from the traveling wave solution of the heat equation (invariant under a combination of $x$ and $y$ translations) we get a simple solution, bounded for $x \in \mathbb{R}^+$ for each $y$,

$f_1(x,y) = \log\big( 1 + e^{-(x-y)} \big).$ (34)

a second exact solution (the fundamental solution) is given by the galilei-invariant solution of the heat equation,

$f_2(x,y) = \log\Big( 1 + \frac{e^{-x^2/(4y)}}{\sqrt{y}} \Big).$ (35)

to compare the results given by the evolution of the map on a given lattice with the evolution given by the exact solution, we introduce a global estimator, the usual (relative) distance in the discrete analog of the $L^2$ space:

$\chi_{\mathrm{lattice}}(f) = \sqrt{ \dfrac{\sum_{n,m}\big( f^{\mathrm{lattice}}_{n,m} - f_{n,m} \big)^2}{\sum_{n,m} f_{n,m}^2} }$ (36)

where $f_{n,m}$ and $f^{\mathrm{lattice}}_{n,m}$ are the values of the exact solution and of the numerical one, respectively, computed at the points of the lattices (29), (33).

since our intention is to compare the lattices, we will not insist on improving the precision and accuracy of the solution by modifying in an optimal way the parameters involved in the computation. for the orthogonal lattice we will take (for $f_1$ and $f_2$) $a = b = 0.1$ in a square $d$ of 8 × 8 points in the $x,y$ plane. we will use the same number of points in all the cases we consider, in order to keep as far as possible the same round-off errors due to machine precision. increasing the number of points enlarges the numerical instabilities, which could be reduced by increasing the precision of the calculations at the cost of computation time. these instabilities are non-physical, and we decided to avoid them by reducing appropriately the number of points. the region covered by the lattice in the two cases (orthogonal and exponential) can be different for the same lattice spacing; see for example cases (1) and (2) in figure 1. in the exponential lattice we will consider two different situations: (i) the spacing is the same as in the orthogonal case ($a = b = 0.1$), and thus for the exponential lattice the region is deformed and enlarged with respect to the square $d$ considered in the orthogonal lattice; (ii) we modify $a$ in such a way that the 64 points of the lattice are inside the square $d$. let us notice that in this case part of $d$ is not covered by the lattice, and this may create problems, as we will see later. the exponential lattice has a parameter $c$ controlling the dilation of the $x$ variable. as we said above, when $c = 0$ the lattice is orthogonal. we have considered in these numerical calculations two cases, $c = 0.1$ and $c = 0.15$, to compare the different behavior of the exponential lattice as it approaches an orthogonal one.
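before turning to the parameter choices, here is a minimal sketch of how the estimator (36) and the exact solutions (34), (35) can be evaluated (our own code; the array standing in for the numerical solution is only a placeholder for the output of the scheme (27)):

```python
import numpy as np

def chi(f_lattice, f_exact):
    """relative discrete l2 distance (36) between numerical and exact values."""
    f_lattice = np.asarray(f_lattice, dtype=float)
    f_exact = np.asarray(f_exact, dtype=float)
    return np.sqrt(np.sum((f_lattice - f_exact) ** 2) / np.sum(f_exact ** 2))

def f1(x, y):
    """traveling wave solution (34) of the potential burgers equation."""
    return np.log1p(np.exp(-(x - y)))

def f2(x, y):
    """fundamental (galilei-invariant) solution (35); requires y > 0."""
    return np.log1p(np.exp(-x**2 / (4.0 * y)) / np.sqrt(y))

# example: chi between the exact solution and a perturbed copy standing in
# for the output of the discrete evolution on an 8 x 8 orthogonal lattice
x, y = np.meshgrid(0.1 * np.arange(8), 0.1 * np.arange(1, 9), indexing="ij")
exact = f1(x, y)
noisy = exact + 1e-2 * np.random.default_rng(0).standard_normal(exact.shape)
print(chi(noisy, exact))
```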
in these two cases, the parameter $a$ is taken as $a = 0.0375$ and $a = 0.0513$, respectively, when we keep the 64 points of the lattice inside $d$ (see figure 1 for a graphical description of the four lattices under study). the $\chi$ estimator is given in table 2 and leads to the following conclusions:

(1.) the orthogonal lattice (schwarzian lattice) provides better results than the exponential one (non-schwarzian lattice) in all cases except one.

(2.) the results for the exponential lattice with different values of $c$ show that the approximation is better when the lattice is closer to a schwarzian lattice (recall that when $c \to 0$ the exponential lattice becomes orthogonal), for the two solutions considered.

(3.) the value of the $\chi$ estimator for the traveling wave solution on the exponential lattice with $a = 0.0513$ and $c = 0.1$ is lower than the corresponding value for the orthogonal case. in this case (figure 1.3) the lattice is close to the orthogonal one, but there is a region in $d$ which is not covered by the exponential lattice. this is the region where the round-off errors made by the computer in calculating the points of the orthogonal lattice are greatest, and this is the reason for this unsatisfactory value. when $c = 0.15$ the lattice is far from the orthogonal one, and thus $\chi$ is greater than in the orthogonal case.

figure 1. (1) orthogonal lattice: a = 0.1, b = 0.1. (2) exponential lattice: a = 0.1, b = 0.1, c = 0.15. (3) exponential lattice: a = 0.0513, b = 0.1, c = 0.1. (4) exponential lattice: a = 0.0375, b = 0.1, c = 0.15. dimension: 8 × 8 points.

table 2. χ estimator values for the functions f1 and f2 given by (34) and (35), respectively. (1): a = 0.1, lattice (1) in figure 1. (2): a = 0.1, lattice (2) in figure 1. (3): a = 0.0513, lattice (3) in figure 1. (4): a = 0.0375, lattice (4) in figure 1. (5): a = 0.1, the exponential lattice is similar to lattice (2) in figure 1.

       χ_ort           χ_exp, c = 0.1              χ_exp, c = 0.15
 f1    (1) 0.01267     (5) 0.01437, (3) 0.01147    (2) 0.01651, (4) 0.01408
 f2    (1) 0.00249     (5) 0.00430, (3) 0.00642    (2) 0.00610, (4) 0.00913

5. conclusions

in this work we have shown that also in the case of partial difference equations we can introduce a set of variables which are in one-to-one correspondence with the grid points when we substitute them by the lattice differences and the derivatives of the dependent function on the lattice. this correspondence allows us to write down the invariance equations using only the knowledge of the continuous invariants.

the numerical calculations we carried out in section 4 show that the schwarzian property seems important in providing better numerical results. more examples should be considered, both to understand the stability of the models we construct in this way and for the analysis of the lattices in the schwarzian and non-schwarzian cases. there might be boundary value problems where a non-schwarzian lattice adapted to the geometry of the problem could be better than a schwarzian one.

acknowledgements

dl has been partly supported by the italian ministry of education and research, 2010 prin continuous and discrete nonlinear integrable evolutions: from water waves to symplectic maps, and by infn is-csn4 mathematical methods of nonlinear physics.
dl thanks the departamento de física teórica ii of the complutense university in madrid for its hospitality. mar was supported by the spanish mineco under project fis2015-63966. the authors would like to thank the referees for their useful comments.

references

[1] bihlo a and nave j-c 2014 convecting reference frames and invariant numerical models j. comput. phys. 272 656–663; doi:10.1016/j.jcp.2014.04.042
[2] bourlioux a, cyr-gagnon c and winternitz p 2006 difference schemes with point symmetries and their numerical tests j. phys. a: math. gen. 39 6877–6896; doi:10.1088/0305-4470/39/22/006
[3] bourlioux a, rebelo r and winternitz p 2008 symmetry preserving discretization of sl(2,r) invariant equations j. nonlinear math. phys. 15 (suppl. 3) 362–372
[4] burgers j m 1974 the nonlinear diffusion equation, asymptotic solutions and statistical problems, springer, dordrecht
[5] chhay m and hamdouni a 2010 a new construction for invariant numerical schemes using moving frames c. r. mecanique 338 97–101; doi:10.1016/j.crme.2010.01.001
[6] dorodnitsyn v a 1991 transformation groups in difference spaces j. soviet math. 55 1490–1517
[7] dorodnitsyn v a 2011 applications of lie groups to difference equations, crc press, boca raton
[8] dorodnitsyn v a and winternitz p 2000 lie point symmetry preserving discretizations for variable coefficient korteweg–de vries equations nonlinear dynamics 22 49–59
[9] kim p and olver p j 2004 geometric integration via multi-space regular and chaotic dynamics 9 213–226
[10] kim p 2008 invariantization of the crank–nicolson method for burgers' equation physica d 237 243–254; doi:10.1016/j.physd.2007.09.001
[11] levi d, olver p, thomova z and winternitz p (editors) 2011 symmetries and integrability of difference equations, cup, lms lecture series
[12] levi d, scimiterna c, thomova z and winternitz p 2012 contact transformations for difference schemes j. phys. a: math. theor. 45 022001; doi:10.1088/1751-8113/45/2/022001
[13] levi d, thomova z and winternitz p 2011 are there contact transformations for discrete equations? j. phys. a: math. theor. 44 265201; doi:10.1088/1751-8113/44/26/265201
[14] levi d and rodríguez m a 2013 on the construction of partial difference schemes i: the clairaut, schwarz, young theorem on the lattice j. phys. a: math. theor. 46 295203; doi:10.1088/1751-8113/46/29/295203
[15] levi d and winternitz p 1991 continuous symmetries of discrete equations phys. lett. a 152 335–338; doi:10.1016/0375-9601(91)90733-o
[16] levi d and winternitz p 1996 symmetries of discrete dynamical systems j. math. phys. 37 5551–5576; doi:10.1063/1.531722
[17] levi d and winternitz p 2006 continuous symmetries of difference equations j. phys. a: math. gen. 39 r1; doi:10.1088/0305-4470/39/2/r01
[18] olver p j 1993 applications of lie groups to differential equations, springer-verlag, new york
[19] rebelo r and valiquette f 2013 symmetry preserving numerical schemes for partial differential equations and their numerical tests j. difference eq. appl. 19 737–757; doi:10.1080/10236198.2012.685470
[20] rebelo r and valiquette f 2015 invariant discretization of partial differential equations admitting infinite-dimensional symmetry groups j. difference eq. appl. 21 285–318; doi:10.1080/10236198.2015.1007134
[21] rebelo r and winternitz p 2009 invariant difference schemes and their applications to sl(2,r) invariant differential equations j. phys. a: math. theor. 42 454016; doi:10.1088/1751-8113/42/45/454016
[22] winternitz p 2004 symmetries of discrete systems. discrete integrable systems, vol. 644, lecture notes in physics, ed b grammaticos, y kosmann-schwarzbach and t tamizhmani (berlin, springer verlag) pp 185–243; see also arxiv:nlin.si/0309058
[23] winternitz p 2011 symmetry preserving discretization of differential equations and lie point symmetries of differential-difference equations. symmetries and integrability of difference equations, lms lecture series, ed d levi, p j olver, z thomova and p winternitz

acta polytechnica vol. 42 no. 2/2002

a low-complexity wavelet based algorithm for inter-frame image prediction

s. usama, b. šimák

in this paper, a novel multi-resolution variable block size algorithm (mrvbs) is introduced. it is based on: (1) using the wavelet components of the seven sub-bands from two layers of the wavelet pyramid in the lowest resolution; (2) performing a block matching estimation within nine blocks only in each sub-band of the lower layer; (3) scaling the estimated motion vectors and using them as a new search center for the finest resolution. the motivation for using the multi-resolution approach is the inherent structure of the wavelet representation. a multi-resolution scheme significantly reduces the searching time and provides a smooth motion vector field. the approach presented in this paper provides an accurate motion estimate even in the presence of single and mixed noise. as a part of this framework, a comparison of the full search (fs) algorithm, the three-step search (tss) algorithm and the new algorithm (mrvbs) is presented. for a small addition in computational complexity over a simple tss algorithm, the new algorithm achieves good results in the presence of noise.

keywords: video compression, motion estimation, wavelet transform, multi-resolution.

1 introduction

video image compression plays an important role in the transmission and storage of digital video data. applications include multimedia transmission, teleconferencing, videophones, high-definition television (hdtv), cd-rom storage, etc. a large body of work in image/video processing has involved motion estimation [1, 6]. applications of motion estimation exist in image sequence filtering and restoration, video coding, target tracking, robot navigation, monitoring and surveillance, biomedical problems, and the human-computer interface. the most effective technique for motion estimation makes use of block matching algorithms (bma). the full search algorithm (fs) is the most obvious candidate for a search technique for finding the best possible match in the search area. kago et al [7] use a three-step motion vector search (tss) to compute displacements up to 6 pel/frame. this method, for w = 6 pel/frame, searches 25 positions to locate the best match. the three-step search (tss) algorithm is one of the best fast search algorithms, and provides a good estimation. to reduce the computational complexity, hierarchical and multi-resolution fast block matching is used. one family of fast block motion estimation algorithms relies on the idea of predicting the approximate large-scale motion vectors in the coarse-resolution video and refining the predicted motion vectors to find the final values. these are called hierarchical [2, 5] or multi-resolution [3, 4] methods. hierarchical methods use the same image size but different block sizes at each level. multi-resolution methods use different image resolutions, with a smaller image size at a coarser level. the wavelet transform has recently emerged as a promising technique for image processing applications, due to its flexibility in representing non-stationary image signals and its ability to adapt to human visual characteristics.
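for concreteness, the two-layer decomposition into seven sub-bands on which the method below relies can be sketched as follows (our own illustration using the pywavelets package; the paper does not prescribe a particular implementation or wavelet):

```python
import numpy as np
import pywt

def two_layer_pyramid(frame, wavelet="haar"):
    """decompose a frame into the seven sub-bands used by mrvbs:
    layer 2: ll2 plus (lh2, hl2, hh2); layer 1: (lh1, hl1, hh1)."""
    ll1, (lh1, hl1, hh1) = pywt.dwt2(frame, wavelet)
    ll2, (lh2, hl2, hh2) = pywt.dwt2(ll1, wavelet)
    return {"ll2": ll2, "lh2": lh2, "hl2": hl2, "hh2": hh2,
            "lh1": lh1, "hl1": hl1, "hh1": hh1}

# qcif luminance frame is 176 x 144 pixels
frame = np.random.default_rng(0).integers(0, 256, (144, 176)).astype(float)
bands = two_layer_pyramid(frame)
print({k: v.shape for k, v in bands.items()})   # layer-1 bands 72x88, layer-2 bands 36x44
```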
zhang and zafar [8] applied wavelet theory to real-time video compression and proposed multi-resolution motion estimation (mrme). this scheme exploits the cross correlation among all layers of the wavelet pyramid structure in order to reduce the computational complexity of the motion estimation process.

we present a novel multi-resolution variable block size algorithm (mrvbs) based on wavelet decomposition. the approach presented in this paper provides an accurate motion estimate even in the presence of noise. we utilize the wavelet components of the seven sub-bands from two layers of a wavelet pyramid in the lowest resolution. in each sub-band we perform the block matching estimation within nine blocks only. the simulation results are analyzed to assess the proposed algorithm with and without the influence of noise. noise in a sequence not only degrades the visual quality, but also hinders subsequent analysis and processing (e.g., compression, estimation and coding). the problem of removing noise from image sequences has attracted a number of researchers; however, the noise cannot be completely removed from the image sequences.

this paper is organized as follows. in section 2, the proposed algorithm based on wavelet decomposition is briefly described. section 3 presents simulation results of the new algorithm without the influence of noise. section 4 presents simulation results under the influence of single and mixed noise. a few concluding remarks are given in section 5.

2 the proposed algorithm

fig. 1 sets out the structure of the algorithm. the mrvbs algorithm is summarized as follows (a sketch of the nine-candidate central search used in the steps is given after step 2):

step 1: to begin the motion vector estimation process, the original image frame is decomposed into two layers using the two-dimensional discrete wavelet transform (dwt2). the motion vectors for the lowest low-pass band are estimated by a central search. the search is performed at the center and its eight neighboring blocks with a block size of 4 × 4.

step 2: these motion vectors are then used as a new center for the other three bands in the same layer (layer number 2). for these three bands, the search is also performed by the same method (central search) using the same block size.
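the nine-candidate central search of steps 1 and 2 can be sketched as follows; this is our own minimal illustration with mad matching, not code from the paper, and the one-pixel neighbour spacing is an assumption:

```python
import numpy as np

def mad(a, b):
    """mean absolute difference used as the matching criterion."""
    return np.mean(np.abs(a.astype(float) - b.astype(float)))

def central_search(cur, ref, top, left, size, center=(0, 0), step=1):
    """refine a motion vector for the block of `cur` anchored at (top, left):
    test `center` and its eight neighbours (spaced `step` pixels; the spacing
    is our assumption) in `ref`, and return the best offset (dy, dx)."""
    block = cur[top:top + size, left:left + size]
    best, best_cost = center, np.inf
    for dy in (-step, 0, step):
        for dx in (-step, 0, step):
            ty, tx = top + center[0] + dy, left + center[1] + dx
            if 0 <= ty <= ref.shape[0] - size and 0 <= tx <= ref.shape[1] - size:
                cost = mad(block, ref[ty:ty + size, tx:tx + size])
                if cost < best_cost:
                    best, best_cost = (center[0] + dy, center[1] + dx), cost
    return best
```

between layers, the resulting vectors would be doubled (the dyadic scaling of steps 3 and 4 below) before serving as the next search center.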
we used a block size of $p/2^j \times q/2^j$ for the $j$th layer in the wavelet pyramid, where $p$ and $q$ are the sizes of the block required at the highest resolution ($p = 16$ and $q = 16$).

step 3: the current motion vectors, estimated from layer 2, are scaled and used as a new center for the three highest frequency bands in layer 1. in layer 1, the search is performed using a block size of 8 × 8.

step 4: the estimated motion vectors are then scaled and used as a center for the final central search process. again, the final central search is performed at the center and its eight neighboring blocks. this process is performed for the original frame using a block size of 16 × 16.

the computational cost for mrvbs, without the wavelet complexity, is (36*p2 + 27*p1 + 9*p0), where p2, p1, and p0 are the block sizes used in layer 2, layer 1, and layer 0, respectively.

3 simulation results without influence of noise

experimental results using the proposed algorithm are reported in this section. the algorithm is applied to three well-known video sequences in the qcif format: carphone, foreman, and miss america. these video sequences contain three kinds of motion. the experimental results are evaluated using the luminance component of each sequence. the results are based on the peak signal-to-noise ratio (psnr) function, and use the mean absolute difference (mad) in performing block matching. the error terms are not used in the frame reconstruction. only forward prediction is implemented in the experiments. no threshold value is used in the search process. a performance comparison of mrvbs, tss, and fs in terms of psnr between the estimated frames and the original frames is carried out for these video sequences. the comparison is made over the first 30 frames of each sequence. the psnr comparisons show that mrvbs usually provides a performance similar to the tss and fs algorithms, especially in the case of slow motion with a stationary background. as an example, fig. 2 shows the performance comparison for the carphone sequence (this was the worst result).

fig. 1: motion estimation using wavelet decomposition (block diagram of the two-layer dwt2 decomposition of frames x(t) and x(t−1), with a central search process per sub-band and initial estimates passed between layers)

fig. 2: psnr comparisons of mrvbs, tss, and fs algorithms for the "carphone" sequence without influence of noise

4 simulation results under influence of noise

to demonstrate the performance of our algorithm, the average psnr (across all input frames) is plotted against input noise density and signal-to-noise ratio (snr). the average psnr, psnr_avg, is given as

$\mathrm{PSNR}_{avg} = \frac{1}{F}\sum_{i=1}^{F} \mathrm{PSNR}_i$ (1)

where psnr_i is the measured psnr for frame $i$, and $F$ is the total number of frames. we shall compare the mrvbs algorithm against the tss and fs algorithms. in addition, the psnr comparison among the three algorithms will be presented.

4.1 simulation results under the influence of gaussian noise

additive gaussian noise with different signal-to-noise ratios (snr) degraded the three video sequences.
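such a degradation is usually produced by scaling the noise variance to the desired snr; a minimal sketch of how we would reproduce it (our own code, the details are not specified in the paper):

```python
import numpy as np

def add_gaussian_noise(frame, snr_db, rng=np.random.default_rng()):
    """degrade a frame with additive gaussian noise at a prescribed snr (in db).
    the noise variance is set from the frame's mean power: snr = p_signal / p_noise."""
    f = frame.astype(float)
    p_signal = np.mean(f ** 2)
    p_noise = p_signal / (10.0 ** (snr_db / 10.0))
    noisy = f + rng.normal(0.0, np.sqrt(p_noise), f.shape)
    return np.clip(noisy, 0, 255)

# e.g. the "extremely noisy" case used in fig. 3: snr = 10 db
# noisy = add_gaussian_noise(frame, 10.0)
```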
we applied the new motion estimation algorithm (mrvbs) to these sequences. fig. 3 shows the performance comparison for the miss america sequence with a signal-to-noise ratio (snr) of 10 db. fs and tss are performed at layer 0 using a block size of 16 × 16. the psnr comparison shows that mrvbs usually performs better than the tss and fs algorithms. under normal operating conditions, e.g., input snr between 30 and 50 db, the performance of mrvbs is similar to the performance of the tss and fs algorithms. for extremely noisy sequences, e.g., for an snr of 10 db, the performance of mrvbs is as much as 2 db better than the other two algorithms.

in fig. 4, the average psnr, psnr_avg (across all input frames), is plotted against the input noise level for the "carphone" sequence. these results indicate that at a low level of noise the motion estimation techniques used have approximately the same performance. the performance of mrvbs is as much as 2 db better than the performance of fs and tss under a high level of gaussian noise. in addition, for the "foreman" sequence the performance comparison is similar to the results of the "carphone" sequence. for the "miss america" sequence, the performance of mrvbs is as much as 3 db better than the performance of fs and tss under a high level of gaussian noise.

4.2 simulation results under the influence of salt & pepper (impulse) noise

additive salt & pepper noise with different noise densities degraded the three video sequences.
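the impulse degradation can be sketched along the same lines (again our own code; `density` is the fraction of corrupted pixels):

```python
import numpy as np

def add_salt_pepper(frame, density, rng=np.random.default_rng()):
    """corrupt a fraction `density` of the pixels, half with 0 (pepper)
    and half with 255 (salt)."""
    noisy = frame.astype(float).copy()
    mask = rng.random(noisy.shape)
    noisy[mask < density / 2.0] = 0.0
    noisy[(mask >= density / 2.0) & (mask < density)] = 255.0
    return noisy

# e.g. the 40 % case of fig. 5:
# noisy = add_salt_pepper(frame, 0.40)
```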
5 conclusion we introduced the new multi-resolution variable block size (mrvbs) algorithm. this algorithm is based on a central search in each layer of the wavelet pyramid. three qcif format video sequences corrupted by single and mixed © czech technical university publishing house http://ctn.cvut.cz/ap/ 33 acta polytechnica vol. 42 no. 2/2002 16 20 24 28 32 36 10 20 30 40 noise density (%) mrvbs fs tss p s n r a v g [d b s ] fig. 6: psnr (average) vs. source for the “miss america” sequence 27 29 31 33 35 37 39 1 3 5 7 9 1 1 1 3 1 5 1 7 1 9 2 1 2 3 2 5 2 7 2 9 frame no . tss mrvbs fs p s n r [d b s ] fig. 7: psnr comparisons of the mrvbs, tss, and fs algorithms for the “miss america” sequence under the influence of mixed noise a) original frame no. 30 b) reconstructed using mrvbs d) reconstructed using tss fig. 8: “foreman” sequence: reconstructed frame no. 30 using the mrvbs, tss, and fs under the mixed noise c) fsreconstructed using noise used for performance evaluation. the results show that mrvbs is usually better than the fs and tss algorithms, especially with slow motion video sequences. the psnr comparisons show that, the best performance is in the case of the miss america sequence (slow motion with stationary background). experimentally, the proposed algorithm has been shown to significantly outperform the motion estimation for these three types of video sequences for several distinct noise types, including impulsive, gaussian, and mixed impulsive gaussian noise. from the experimental results, under the influence of mixed noise, the maximum improvement within the first 30-frame was about 6 dbs with the miss america sequence. we observe that the maximum improvement in case of panning and object translation in the foreman sequence is 2.5 dbs in comparison with the fs algorithm. this supports our claim that mrvbs can be effectively used with noisy sequences to get a better estimation. the simulation confirms that the proposed algorithm performs better than the fs and tss algorithms with the three types of motion used. these gains can be observed in terms both of the perceptual quality and of the psnr of the restored images. it should also be noted that since the mrvbs algorithm can contain a regular data flow through the entire search procedure, it is suitable for hardware implementation. references [1] sezan, m. i., langendijk, r. l. (ed.): motion analysis and image sequence processing. kluwer academic publishers, 1993. [2] dufaux, f., kunt, m.: multi-grid block matching motion estimation with an adaptive local mesh refinement. in proc. spie visual commun. and image processing ’92, vol. 1818, p. 97–109. [3] li, j., lin, x., wu, y.: multiresolution tree architecture with its application in video sequence coding: a new result. in proc. spie visual communications and image processing ’93, vol. 2094, p. 730–741. [4] uz, k. m., vetterli, m., legall, d.: interpolative multiresolution coding of advanced television with compatible sub-channels. ieee trans. circuits syst. video technol., march 1991, vol. 1, p. 86–99. [5] bierling, m.: displacement estimation by hierarchical block matching. in proc. spie visual communications and image processing ’88, vol. 1001, p. 942–951. [6] netravali, a. n., robbins, j. d.: motion compensated television coding: part i. the bell system technical journal, march 1979, vol. 58, p. 631–670. [7] kago, t., iinuma, k., hirano, a., iijima, y., ishiguro, t.: motion compensated interframe coding for video conferencing. in proc. nat.: telecommun. 
conf., new orleans, la, nov. 29–dec. 3, 1981, p. g5.3.1–5.3.5. [8] zhang, y. q., zafar, s.: motion-compensated wavelet transform coding for color video compression. ieee trans. circuits sys. video technol., september 1992, vol. 2, p. 285–296. doc. ing. boris šimák, csc. phone: +420 2 2435 2203 e-mail: simak@feld.cvut.cz dept. of telecommunications engineering czech technical university in prague faculty of electrical engineering technická 2 166 27 prague 6, czech republic ing. sayed usama, ph.d. phone: +420 2 2435 2088 e-mail: usama@feld.cvut.cz deparment of electrical engineering faculty of engineering assiut university assiut, egypt 34 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 42 no. 2/2002 ap04_4web.vp 1 introduction the discharge stream from a standard rushton turbine impeller exhibits special flow properties different from the characteristics of the velocity field in other parts of the volume of an agitated liquid in a cylindrical baffled vessel [e.g. 1–7] , e.g. prevailing two components of the mean velocity (radial and tangential), a high rate of turbulent energy dissipation and anisotropy of turbulence in this region. at the same time, the discharge stream plays an important role in mixing operations, above all in liquid-liquid and gas-liquid systems [e.g. 10, 11]. this paper deals with a study of the velocity field and flow of the angular momentum in a discharge stream from a standard rushton turbine impeller in a cylindrical baffled flat bottomed vessel under turbulent regime of flow of an agitated liquid. three approaches to a description of the discharge stream are used: 1. a theoretical approach, when the impeller discharge stream is modelled as an axially symmetrical tangential submerged jet. 2. an experimental approach, i.e. determination of the force interaction of the impeller discharge stream and radial baffles by means of the trailing target. 3. a numerical simulation of the turbulent liquid flow in the whole agitated system with emphasis on the impeller discharge stream. 2 theoretical let us consider a flat-bottomed cylindrical vessel with four baffles at its wall. in the vessel the standard rushton turbine impeller [12] is coaxially located and rotates under turbulent regime of flow of an agitated liquid. in the region of the radially-tangential impeller discharge stream (see fig. 1) we assume a constant angular momentum flow between the impeller and the baffle [1, 8] m q t w q d wb d th b t b av p t p av, , , , , ,( ) ( )� �� �2 2 . (1) here index “p” denotes a velocity field near the impeller and index “b” a velocity field near the baffle. looking at the relation between the radial component of the mean velocity averaged over the impeller discharge stream wr p av, , and the impeller pumping capacity q w dwp r p av� , , � (2) and the relation between this component and the tangential one (see fig. 2) w wt p av r p av, , , ,� tg �, (3) where � �� � arcsin ( )a d 2 , (4) we can rearrange eq. 1 into the form � �� � m q a d w b d th p, , arcsin ( ) � � � 2 2 2 tg . (5) after introduction of the flow rate number (dimensionless impeller pumping capacity) n q nd q p p � 3 , (6) where symbol n denotes the impeller speed, we have finally an equation in dimensionless form for the angular momentum flow between the rotating standard rushton impeller and the vessel wall © czech technical university publishing house http://ctn.cvut.cz/ap/ 39 acta polytechnica vol. 44 no. 
4/2004 study of the discharge stream from a standard rushton turbine impeller j. kratěna, i. fořt the discharge stream from a standard rushton turbine impeller exhibits special flow properties different from the characteristics of the velocity field in other parts of the volume of an agitated liquid in a cylindrical baffled vessel, e.g. two prevailing components of the mean velocity (radial and tangential), high rate of turbulent energy dissipation and anisotropy of turbulence in this region. at the same time, the discharge stream plays an important role in mixing operations, above all in liquid-liquid and gas-liquid systems. this paper deals with theoretical and experimental studies of the velocity field and flow of angular momentum in a discharge stream from a standard rushton turbine impeller in a cylindrical baffled flat bottomed vessel under turbulent regime of flow of an agitated liquid with emphasis on describing the ensemble averaged values over the whole interval of the tangential coordinate around the vessel perimeter. keywords: rushton turbine impeller, discharge stream, turbulent flow, velocity field. fig. 1: vertical view of the rushton impeller discharge stream � � m m n d n a d w d b d th b d th q p, , * , , arcsin( ) ( ) � � � � 2 5 2 2 2 tg . (7) quantity a denotes the radius of the virtual cylindrical source of the impeller discharge stream (see fig. 2), and for the chosen geometry of the agitated system (see fig. 3) it holds [1] 2 0 34a d � . . (8) 3 results of experiments experiments were carried out in a flat-bottomed cylindrical pilot plant mixing vessel with four baffles at its wall (see fig. 3) of diameter t � 0.3 filled with either water (� � 1 mpa�s) or with one of two water glycerol solutions of dynamic viscosity � � 3 mpa�s and � � 6 mpa�s, respectively. the impeller was a standard rushton disk impeller with six flat plate blades [12]. the range of frequency of revolution of the impeller was chosen in the interval n � 3.11� 5.83 s�1, so the regime of flow of the agitated liquid was always turbulent. in order to determine the axial (vertical) distribution of dynamic pressure pd affecting the baffle [9], one of the baffles was equipped with a trailing target (see fig. 3) of height ht and width b enabling it to be rotated parallel to the vessel axis with a small eccentricity and balanced by springs. eleven positions of the target along the height of the baffle were examined, above all in the region of the interference of the baffles and the impeller discharge stream. the angular displacement of target � is directly proportional to the peripheral component of force f affecting the balancing springs (see fig. 4). the flexibility of the springs was selected so that the maximum target displacement was reasonably small compared with the vessel dimensions (no more than 0.5 % of the vessel perimeter). a small photoelectronic device composed of two photodiodes scanned the angular displacement, and the output signal was treated, stored and analysed by the computer. 40 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 44 no. 4/2004 fig. 2: turbine impeller as an axially symmetrical tangential source fig. 
fig. 3: sketch of the flat-bottomed agitated pilot plant mixing vessel with four radial baffles at the wall and an axially located standard rushton impeller, and sketch of the measurement of the local peripheral force affecting the trailing target (H/T = 1, h/T = 0.33, 0.48, b/T = 0.1, h_t = 10 mm, b = 28 mm)

the average value of the dimensionless mean dynamic pressure affecting the part of the baffle of dimensionless height $h_d^*$, i.e., the height of the region of the impeller discharge stream interference with the baffle (see fig. 5), can be calculated from the relation

$\bar p^*_{d,av}(h^*_d) = \frac{1}{h^*_{b2} - h^*_{b1}} \int_{h^*_{b1}}^{h^*_{b2}} \bar p^*_{d,k,av}(h^*_t)\, \mathrm{d}h^*_t,$ (9)

where

$h^*_d = h^*_{b2} - h^*_{b1}.$ (10)

the total dimensionless component of the mean peripheral force affecting the baffle along its interference region with the impeller discharge stream is

$F^*_{d,av} = \frac{F_{d,av}}{\rho\, n^2 D^4} = \Big( \frac{T}{D} \Big)^2 b^*\, h^*_d\, \bar p^*_{d,av},$ (11)

where the dimensionless width of the baffle is

$b^* = b/T.$ (12)

finally, the portion of the dimensionless mean reaction moment of the baffles corresponding to the mutual interference of $n_B$ baffles and the impeller discharge stream is

$M^*_{b,d} = n_B\, F^*_{d,av}\, (r_b/D),$ (13)

where the radial coordinate of the centre of gravity of the trailing target [9] is

$r_b = (T/2) - (2b/3).$ (14)

table 1 presents both the theoretical and the experimental results for the impeller torque transferred to the radial baffles in the discharge stream of the rushton turbine impeller. it follows from this table that for both investigated impeller off-bottom clearances h/T the theoretical results fit the experimental results fairly well.

table 1: transfer of the impeller torque by radial baffles in the discharge flow of a rushton turbine impeller

 h/T    h_d*    M*_{b,d}    M*_{b,d,th}    N_{Qp} [1]
 0.33   0.33    0.464       0.472          0.80
 0.48   0.30    0.475       0.472          0.80

fig. 4: results of the mechanical calibration of the balancing springs

fig. 5: axial profile of the dimensionless peripheral component of the dynamic pressure affecting the radial baffle along its height (the shaded part characterizes the interference region of the impeller discharge stream and the radial baffle)
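the data reduction (9)–(14) is a simple numerical quadrature; a sketch of how we would implement it (names and layout ours; the measured profile $\bar p^*_{d,k,av}(h^*_t)$ comes from the target readings):

```python
import numpy as np

def baffle_moment(h_t, p_star, t_over_d=3.0, b_star=0.1, n_baffles=4):
    """evaluate (9)-(14) from a measured dimensionless pressure profile.
    h_t: dimensionless axial positions h_t* spanning <h_b1*, h_b2*>;
    p_star: measured profile p*_d,k,av at those positions; t_over_d = t/d."""
    h_t = np.asarray(h_t, float); p_star = np.asarray(p_star, float)
    h_d = h_t[-1] - h_t[0]                                                # (10)
    p_av = np.sum(0.5 * (p_star[1:] + p_star[:-1]) * np.diff(h_t)) / h_d  # (9)
    f_av = t_over_d**2 * b_star * h_d * p_av                              # (11), b* = b/t (12)
    rb_over_d = 0.5 * t_over_d - (2.0 / 3.0) * b_star * t_over_d          # (14) divided by d
    return n_baffles * f_av * rb_over_d                                   # (13)
```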
fig. 4: results of the mechanical calibration of the balancing springs (peripheral force f [n] and target angular displacement β [rad] versus the pc-indicated displacement)

fig. 5: axial profile of the dimensionless peripheral component of dynamic pressure affecting the radial baffle along its height (the shaded part characterizes the interference region of the impeller discharge stream and the radial baffle)

fig. 6: three-dimensional view of the flat-bottomed agitated system with standard rushton turbine impeller and radial baffles (t = 300 mm, h/t = 1, d/t = 1/3, h/t = 0.33, 0.48, b/t = 0.1, n = 6.7 s⁻¹)

fig. 7: side view (coordinates r–z) of the half of the investigated system with the chosen grid

4 numerical simulation of turbulent liquid flow in a system with a standard rushton impeller

reynolds averaged navier-stokes equations (rans), together with the multiple reference frame (mrf) approach and standard k-ε model predictions of steady state flow, have been used via the fluent 4.5 commercial software with the mixsim 1.7 module [13]. half of the cylindrical tank, with three radial baffles and three impeller blades, was modelled (see fig. 6). the total number of cells in the investigated volume (tangential × radial × axial coordinates, φ × r × z) was 134 × 42 × 81 ≈ 4.56×10⁵ cells. special attention was paid to the selection of the grid density. the highest density of the grid was located in the volume of the maximum level of the rate of turbulent energy dissipation ε, i.e. in the impeller discharge stream. on the other hand, the coarsest grid was located in the remaining volume of the agitated liquid. figs. 7 and 8 illustrate the selected distribution of the grid in both the φ–r and r–z planes. for the investigated agitated system the following boundary conditions were defined:
1. vessel wall, baffles and flat bottom – flow in the vicinity of a solid flat plate (laminar sublayer, buffer layer).
2. liquid surface and cylindrical vessel axis of symmetry – no penetration of liquid through the boundary (batch system).

fig. 8: top view (coordinates φ–r) of the half of the investigated system with the chosen grid

fig. 9: comparison of experimentally determined and numerically predicted axial profiles of the radial and tangential components of the mean velocity in the rushton turbine impeller discharge stream (num. prediction: r = 50.7 mm; exp. data of möckel [2]: r = 51.5 mm; t = 300 mm)

4.1 numerical simulation versus experimental data

the results of the numerical prediction of the velocity field in the rushton turbine discharge stream were compared with the experimental data of möckel [2] (see fig. 9). both axial (vertical) profiles are located right next to the rotating impeller and, in accordance with the experimental technique of möckel (cta), they are expressed as the ensemble averaged values over the whole circle, i.e. over the interval of tangential coordinate φ ∈ ⟨0°; 360°⟩. it follows from the two compared profiles that the results of the numerical simulation correspond fairly well with the experimental data, but a certain shift of the numerical prediction appears, probably caused by the asymmetrical position of the impeller along the axial coordinate z. figs. 10 and 11 describe the radial distribution of the tangential component of the mean velocity in the turbine impeller discharge flow for the two investigated off-bottom clearances h/t, calculated numerically. this gives a very good illustration of the concept of the turbine impeller discharge flow as a turbulent submerged jet. the asymmetrical shape of this jet at the impeller off-bottom clearance h/t = 0.33 corresponds to the above mentioned results illustrated in fig. 9. the boundaries of the jet in figs. 10 and 11 correspond to the value of the given component of the mean velocity amounting to 1 % of its maximum value on the investigated axial profile. the radial profile of the dimensionless volumetric flow rate q/nd³ in the impeller discharge stream plotted in fig. 12 confirms the above mentioned idea of the submerged jet, because the volumetric flow rate q increases with increasing radial distance (radial coordinate) from the body of the rotating impeller. moreover, this figure again exhibits good agreement between the data calculated from the numerical simulation of the flow in the investigated system and the available experimental data.
4.2 numerical simulation versus theoretical approach

the predicted axial profiles of the radial and tangential components of the mean velocity in the turbine impeller discharge stream were used in order to calculate the radial distribution of the angular momentum flow in the investigated region of the agitated system:

md,num(r) = ρ qd,num(r) w̄t,num,av(r) r , r ∈ ⟨d/2; t/2 − b⟩ , (15)

where the tangential component of the mean velocity averaged over the width of the submerged jet (z2 − z1) at radius r is

w̄t,num,av(r = const.) = [1/(z2 − z1)] ∫_{z1}^{z2} wt,num(z) dz , r = const. (16)

similarly, the local value of the flow rate in the turbine impeller discharge flow can be calculated from the relation

qd,num(r = const.) = 2πr ∫_{z1}^{z2} wr,num(z) dz , r = const. (17)

finally, eq. (15) can be expressed in dimensionless form

m*d,num(r) = md,num(r)/(ρ n² d⁵) = [qd,num(r)/(n d³)] · [w̄t,num,av(r) r/(n d²)] . (18)

figures 13 and 14 present the results of two compared methods for calculating the angular momentum flow in the investigated turbine impeller discharge flow: a theoretical method (solid line in each figure) and a numerical prediction (points). it is clearly shown that, although the value of the angular momentum flow in the impeller discharge flow oscillates slightly around its theoretical value, the numerically calculated values (m*d,num,av) and the theoretically predicted values (m*d,th) agree very well. moreover, it follows from table 1 that the values of this quantity determined experimentally differ from the above mentioned values within the range of experimental accuracy.

fig. 10: radial distribution of the tangential component of the mean velocity in the rushton turbine impeller discharge stream (numerical prediction of ensemble average values over the interval of tangential coordinate φ = ⟨0°; 360°⟩, h/t = 0.33)

fig. 11: radial distribution of the tangential component of the mean velocity in the rushton turbine impeller discharge stream (numerical prediction of ensemble average values over the interval of tangential coordinate φ = ⟨0°; 360°⟩, h/t = 0.48)

fig. 12: radial profile of the dimensionless volumetric flow rate q/nd³ in the rushton turbine impeller discharge stream (h/t = 0.48, d/t = 1/3, exp. data: stoots and calabrese [4])

fig. 13: radial profile of the angular momentum flow in the rushton turbine impeller discharge stream in dimensionless form (h/t = 0.33; m*d,th = 0.472, m*d,num,av = 0.461)

fig. 14: radial profile of the angular momentum flow in the rushton turbine impeller discharge stream in dimensionless form (h/t = 0.48; m*d,th = 0.472, m*d,num,av = 0.461)
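for completeness, a small sketch of the post-processing defined by eqs. (16)-(18): axial profiles of the radial and tangential velocity components are integrated across the jet at a given radius, and the dimensionless angular momentum flow is formed. the bell-shaped profiles below are made-up placeholders standing in for the cfd data, and the function name is ours.

```python
import numpy as np

# geometry and operating point (fig. 6): t = 0.3 m, d = t/3, n = 6.7 1/s
d, n = 0.1, 6.7

def m_d_star(r, z, w_t, w_r):
    """eqs. (16)-(18): dimensionless angular momentum flow at radius r
    from axial profiles w_t(z), w_r(z) sampled across the jet (z1..z2)."""
    dz = np.diff(z)
    trap = lambda f: np.sum((f[:-1] + f[1:]) / 2 * dz)  # trapezoid rule
    w_t_av = trap(w_t) / (z[-1] - z[0])                 # eq. (16)
    q_d = 2 * np.pi * r * trap(w_r)                     # eq. (17)
    return (q_d / (n * d**3)) * (w_t_av * r / (n * d**2))  # eq. (18)

# hypothetical bell-shaped jet profiles -- placeholders, not the cfd data
z = np.linspace(-0.01, 0.01, 81)           # jet width z2 - z1 = 20 mm
r = 0.75 * d                               # radial position, 2r/d = 1.5
w_r = 1.0 * np.exp(-(z / 0.006) ** 2)      # radial component [m/s]
w_t = 0.9 * np.exp(-(z / 0.006) ** 2)      # tangential component [m/s]
print(f"m*_d,num(2r/d = {2*r/d:.1f}) = {m_d_star(r, z, w_t, w_r):.3f}")
```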
5 conclusions

the results of the study confirm the idea of the impeller discharge flow of a rushton turbine as a submerged turbulent jet, where under the turbulent regime of flow the angular momentum flow exhibits a practically constant value, and this value fits fairly well with the results of the theoretical approach as well as with the results of experiments.

6 list of symbols

a radius of the virtual cylindrical source of the impeller discharge stream, [m]
b target width, [m]
b baffle width, [m]
d impeller diameter, [m]
f peripheral component of the force, [n]
h total liquid depth, [m]
hd height of the region of the impeller discharge stream interference with the baffle, [m]
ht height of the target above the bottom, [m]
h impeller off-bottom clearance, [m]
ht target height, [m]
m impeller torque, [n·m]
n impeller speed, [s⁻¹]
nqp flow rate number
nb number of baffles
qp impeller pumping capacity (impeller flow rate), [m³·s⁻¹]
p dynamic pressure, [pa]
rb radial coordinate of the centre of gravity of the trailing target, [m]
r radial coordinate, [m]
t vessel diameter, [m]
w width of impeller blade, [m]
w liquid local instantaneous velocity, [m·s⁻¹]
z axial (vertical) coordinate, [m]
α declination of liquid velocity in the impeller discharge stream from the radial direction, [°]
β target angular displacement, [°]
φ tangential coordinate, [°]
ρ density of agitated liquid, [kg·m⁻³]
μ dynamic viscosity of agitated liquid, [pa·s]

7 subscripts and superscripts

av average value
b related to the region of radial baffles
d related to the volume of the impeller discharge stream
num calculated numerically
p related to the impeller
r radial
t tangential
th theoretical value
z axial (vertical)
* dimensionless value
¯ (overbar) mean (time averaged) value

8 acknowledgment

this research has been supported by research project of the ministry of education of the czech republic no. j 04/98: 212200008.

references

[1] drbohlav, j., fořt, i., krátký, j.: “turbine impeller as a tangential cylindrical jet.” collect. czech. chem. commun., vol. 43 (1978), p. 696–712.
[2] möckel, h. o.: hydrodynamische untersuchungen in rührmaschinen. ph.d. thesis, ingenieurhochschule köthen, dresden (germany), 1978.
[3] yianneskis, m., popiolek, z., whitelaw, j. m.: “an experimental study of the steady and unsteady flow characteristics of stirred reactors.” j. fluid. mech., vol. 175 (1987), p. 537–555.
[4] stoots, c. m., calabrese, r. v.: “mean velocity relative to a rushton turbine blade.” aiche j., vol. 41 (1995), no. 1, p. 1–11.
[5] rutherford, k., mahmoudi, s. m. s., lee, k., yianneskis, m.: “the influence of rushton impeller blade and disk thickness on the mixing characteristics of stirred vessels.” trans. inst. chem. engrs., vol. 74 (1996) (part a), p. 369–378.
[6] schäfer, m., höfken, m., durst, f.: “detailed ldv measurements for visualization of the flow field within a stirred-tank reactor equipped with a rushton turbine.” trans. inst. chem. engrs., vol. 75 (1997) (part a), p. 729–736.
[7] nikiforaki, l., montante, g., lee, k. c., yianneskis, m.: “on the origin, frequency and magnitude of macro-instabilities of the flows in stirred vessels.” chem. eng. sci., vol. 58 (2003), p. 2937–2949.
[8] bittins, k., zehner, p.: “power and discharge numbers of radial-flow impellers. fluid dynamic interactions between impeller and baffles.” chem. eng. and proc., vol. 33 (1994), p. 295–301.
[9] kratěna, j., fořt, i., brůha, o., pavel, j.: “distribution of dynamic pressure along a radial baffle in an agitated system with standard rushton turbine impeller.” trans. inst. chem. engrs., vol. 79 (2001) (part a), p. 819–823.
[10] van't riet, k., smith, j. m.: “the behaviour of gas-liquid mixtures near rushton turbine blades.” chem. eng. sci., vol. 28 (1973), p. 1031–1037.
[11] wu, h., patterson, g. k., van doorn, m.: “distribution of turbulence energy dissipation rates in a rushton turbine stirred mixer.” exp. in fluids, vol. 8 (1989), p. 153–160.
[12] turbine disk impeller with flat plate blades, 1997, czech standard cvs 691021, mixing equipment (brno).
[13] fluent 4.5 user's guide, fluent inc., lebanon (nh, usa), 1997.

ing. jiří kratěna
e-mail: kratena@student.fsid.cvut.cz

doc. ing. ivan fořt, drsc.
phone: +420 224 352 713
fax: +420 224 310 292
e-mail: fort@fsid.cvut.cz

dept. of process engineering
czech technical university in prague
faculty of mechanical engineering
technická 4
166 07 praha 6, czech republic

acta polytechnica 60(3):225–234, 2020, doi:10.14311/ap.2020.60.0225

optimization of a small solar chimney

janaína oliveira castro silva, cristiana brasil maia∗

pontifical catholic university of minas gerais (puc minas), department of mechanical engineering, av. dom josé gaspar, 500, belo horizonte, 30535-901, brasil
∗ corresponding author: cristiana@pucminas.br

abstract. an optimization study to maximize the exergy efficiency of a small-scale solar chimney was carried out. the optimization variables include the collector diameter (dc), collector height (hc), tower height (ht), and tower diameter (dt). models from the literature were used to predict environmental and airflow parameters. exergy efficiency and solar chimney efficiency were determined, on an hourly basis, for a one-year period. the model was simulated using the ees software. two methods of optimization were used, the method of conjugate directions and the method of variable metric, both providing similar results. the results were compared to the results from an experimental prototype, and it was found that the energetic and exergetic efficiency were significantly improved. the analysis indicated that the height and diameter of the chimney and collector are the most important physical variables in the design of a solar chimney. for both methods, it was found that the maximum exergy efficiency was obtained with a collector height of 0.5 m, a collector diameter of 30 m, a tower diameter of 1 m, and a tower height of 17.8 – 18.8 m. the exergy efficiency was 44 %.

keywords: solar chimney, optimization, conjugate directions, variable metric.

1. introduction

in the last few decades, the global population and global consumption of energy have increased significantly. the rapid depletion of natural fuel resources, rising fossil fuel costs and environmental damages have intensified the search for renewable energy sources. solar energy arises as a non-pollutant, abundant and renewable source. a solar chimney (or solar updraft) power plant is a device that uses solar radiation to drive air for electricity generation. it consists mainly of a solar collector, a tower and turbine generators.
solar radiation passes through the transparent solar collector and reaches the ground, which works as an absorber. part of the heat is stored in the deeper layers of the ground, and part is released to the air under the solar collector. buoyancy acts as the driving force. the air flows toward the tower and drives the turbine. the main use of the solar chimney is power generation [1, 2], but it can also be used for food drying [3–5]. recently, some authors started to develop mathematical simulations to assess the viability of solar chimneys combined with desalination systems, to simultaneously generate power and produce freshwater [6–11], as reviewed by [12, 13]. geometric parameters play an important role in the performance of solar chimneys. several authors studied the influence of geometry on the performance of the device. for fixed values of tower height and collector diameter, the collector inlet opening, the collector outlet diameter and height, the tower inlet opening and the chimney divergence angles were varied in an attempt to improve the performance of small-scale solar chimney power plants using computational fluid dynamics (cfd) [14, 15]. an experimental study of the variation of the tower height, inlet collector height, and collector materials was performed by [16, 17] for a small solar chimney. the experimental prototype was used in the validation of a cfd analysis to increase the fluid velocity and the solar chimney efficiency, varying the tower height, the collector diameter and the collector entrance. a 3d numerical model incorporating a radiation model, a solar load model, and a real turbine was developed by [18], using the commercial code ansys fluent. the manzanares dimensions were used as a reference. the influence of the tower height, collector diameter, and area ratio of chimney entrance over chimney exit was studied by [19] using the ansys fluent cfd code. the tower height ranged from 100 m to 300 m and the collector diameter ranged from 5 m to 15 m. a cfd model based on a finite volume method was also used by [20] to evaluate the effect of geometric parameters on temperature, velocity, pressure distribution, efficiency and output power in a solar chimney. the tower height and collector radius ranged from 20 m to 500 m, using prescribed values for the heat flux. according to the literature, the tower height and the solar collector diameter are the parameters that affect the performance the most [21, 22]. numerical analyses were developed by several researchers to describe the airflow parameters inside the solar chimney [23, 24], and to describe the entropy generation inside the device [25]. the influence of daily solar radiation and ambient temperature on the distribution of temperature, velocity and pressure was evaluated by [26]. optimization techniques have been used to determine the best geometry for a given purpose, such as the highest efficiency, the highest power output, or the minimum expenditure. global analysis models and cfd were used to describe the airflow inside the solar chimney. in [27], an energy balance was developed, using meteorological data (solar radiation, ambient air temperature, and wind velocity) and the manzanares dimensions as a reference. subsequently, the authors used the developed model to maximize the total output power.
a multi-objective optimization was performed by [28, 29]. the authors used the collector diameter and the tower height and diameter as design parameters and prescribed values of solar radiation to maximize the power output and minimize the capital cost of components [28], and to minimize the expenditure and maximize total efficiency and power output [29], using a global analysis. a finite-difference approximation was used by [30] to solve the mathematical model of a solar chimney and to minimize the plant expenditure, using the manzanares dimensions as a reference. a 2d numerical simulation and a sensitivity analysis were developed by [31], using a finite volume method to maximize the power output of a solar chimney, also using manzanares as a reference. the maximization of the power output was also described by [32]. the authors developed a 1d asymptotic fluid dynamic model, with the differential equations solved using matlab. cfd analyses were developed by [33] and [34]. in both works, the reynolds averaged navier-stokes equations were solved using a constant heat flux as the boundary condition and the manzanares dimensions as a reference. the objective function of [33] was the maximization of the power density, while the objective function of [34] was the performance parameters of the airflow temperature, pressure and velocity profiles. to extract the maximum output from the wind, [35] performed an optimization analysis of small-scale wind turbine blades for a solar chimney power plant, under various wind velocity ranges. optimization techniques were also used for the solar chimney combined with other applications. a combined solar chimney for desalination and power generation was studied by [11]. the objective function was the ratio of the costs of the collector, tower and desalination, and the prices of electricity and freshwater production. a solar chimney power plant integrated into tehran's waste-to-energy plant was studied by [36], attempting to maximize the exergy efficiency and minimize the total cost rate. an optimization of a solar chimney power plant integrated with transparent photovoltaic cells and desalination was performed by [37]. the objective function was to maximize the utilization of solar energy, enhancing the overall plant efficiency. from the literature review, it can be seen that the majority of works use the manzanares prototype as a reference for the dimensions. it is well-known that large structures are required to generate power at competitive prices. the airflow generated in small structures can, however, be used to dry agricultural products. most works from the literature use yearly averaged values of temperature, solar radiation, and heat fluxes to obtain the optimum dimensions that maximize the efficiency or power output and/or minimize the total cost. in our work, we use a mathematical model to determine hourly solar radiation and ambient temperature, and use these results to predict the airflow parameters for a given geometry. the geometry is then optimized to achieve maximum exergy efficiency. the model is simulated for a small-scale solar chimney [38]. what makes our work original is that the optimization was performed for one year, with a time step of one hour.

2. mathematical model

2.1. system description

a typical solar chimney consists of a solar collector and a chimney (or tower). the geometry and schematics of the solar chimney are represented in fig. 1. dc and hc are the diameter and height of the collector, and dt and ht are the diameter and height of the tower, respectively.
a reference geometry was established, based on previous works from the authors ([3–5, 39]), in which a prototype was built in brazil. the tower of the small-scale pilot plant has a height and a diameter of 12.3 m and 1.0 m, respectively. the diameter and height of the solar collector are 25.0 m and 0.5 m, respectively. the dimensions were based on the dimensions of the manzanares prototype, at a scale of 1 : 50, adapted to have a minimum height from the ground.

2.2. problem statement

engineering equation solver (ees) was used to solve the model equations due to its reliable thermodynamic properties database and optimization techniques. the model comprises the modelling of the ambient conditions (ambient temperature and incident solar radiation) and the airflow parameters (mass flow rate, ground surface temperature, and outlet airflow temperature). the variables were determined on an hourly basis for the whole year. for each hour, steady-state conditions were assumed. the model is fully described in [38].

2.2.1. environmental analysis

the incident solar radiation and ambient temperature were predicted for belo horizonte city, brazil (latitude: 19°55'15" s, longitude: 43°56'16" w). incident solar radiation was predicted considering an isotropic sky and average values from the literature for the clearness index of belo horizonte, brazil [40], also used in [41]. the solar radiation absorbed by the ground, s, was considered to include beam radiation, isotropic diffuse radiation, and solar radiation diffusely reflected from the ground [42].

figure 1. main geometric parameters.

s = ib rb (τα)b + id (τα)d (1 + cos β)/2 + ρground i (τα)ground (1 − cos β)/2 (1)

i represents the total incident solar radiation, given as the sum of the beam (ib) and diffuse (id) components. the subscripts b, d, and ground refer to beam, diffuse, and ground-reflected radiation, respectively. ρground represents the ground reflectance. β is the slope of the solar collector and (τα) is the transmittance-absorptance product. rb is a geometric factor, accounting for the ratio of beam radiation on the tilted surface to that on a horizontal surface at a given time. a detailed model for these parameters can be found in [42]. ambient temperature was determined according to [43], in which the daily period is subdivided into five intervals. the model assumes a correlation between the ambient temperature and the incident solar radiation, requiring the daily maximum, minimum, sunrise, and sunset temperatures, given by [43].
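a minimal sketch of eq. (1) follows. the (τα) products and the ground reflectance below are illustrative placeholders, not the values used in the paper, and the function name is ours.

```python
import math

def absorbed_radiation(i_b, i_d, r_b, beta_deg, rho_ground=0.2,
                       ta_b=0.85, ta_d=0.80, ta_ground=0.80):
    """eq. (1): solar radiation absorbed by the ground under the collector.
    the (tau*alpha) products and ground reflectance are illustrative
    placeholders, not the values used in the paper."""
    beta = math.radians(beta_deg)
    i_total = i_b + i_d
    return (i_b * r_b * ta_b
            + i_d * ta_d * (1 + math.cos(beta)) / 2
            + rho_ground * i_total * ta_ground * (1 - math.cos(beta)) / 2)

# example: 600 w/m2 beam, 200 w/m2 diffuse, horizontal collector (beta = 0)
s = absorbed_radiation(i_b=600.0, i_d=200.0, r_b=1.0, beta_deg=0.0)
print(f"s = {s:.0f} w/m2")
```

note that for a horizontal collector (β = 0) the ground-reflected term of eq. (1) vanishes, as expected.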
2.2.2. airflow analysis

an energy balance between the heat fluxes in the collector was performed. part of the solar radiation absorbed by the ground (s) is transferred to the deeper ground layers. the remainder is transferred to the airflow by convection, to the collector by radiation, and to the external environment by radiation. the heat fluxes were modelled using literature correlations, described in [38]. the ground was assumed to be a semi-infinite solid, since it extends to infinity and the temperature variations occur only near the surface, as suggested by [44]. therefore, the initial temperature and the temperature at a point at a large distance from the surface are constant and taken as the annual average ambient temperature for the city of belo horizonte. koonsrisuk et al. [45] presented a model for the mass flow rate and outlet temperature of the flow as a function of the geometric parameters. for large solar chimneys, it is assumed that dt is significantly smaller than dc. however, for small solar chimneys, the cross-sectional area of the chimney cannot be neglected. therefore, the mass flow rate expression was modified to take the cross-sectional area into account, and it is represented by:

ṁ = { [ρ² β′ g q″conv,1 π³ ht ((dc/2)² − (dt/2)²)] / [8 cpao (4 fy ht/dt⁵ + fx/(64 rc hc³) + 1/dt⁴)] }^(1/3) (2)

β′ is the volumetric expansion coefficient for real gases. fx and fy represent the friction factors in the collector and the tower, respectively [46]. the temperature of the airflow leaving the tower is given by [45], assuming adiabatic conditions:

tao = [q″conv,1 π/(ṁ cp)] [(dc/2)² − (dt/2)²] + tai (3)

ṁ is the mass flow rate.

2.3. energy and exergy analysis

the energy balance is used to determine the net heat rate (q̇). the mechanical work rate ẇ is neglected, since there is no turbine in the system. vai and vao represent the inlet and outlet air velocity of the system, respectively. the specific enthalpies of the air at the inlet and outlet are hai and hao, respectively [47]:

q̇ − ẇ = Σ ṁao (hao + v²ao/2) − Σ ṁai (hai + v²ai/2) (4)

according to [48], the solar chimney efficiency, neglecting the turbine, can be given by:

η = [∫ ṁ cp (tao − tai) dt / (acol h0)] · [ht g/(cp tai)] (5)

where cp is the specific heat of air at constant pressure, acol is the area of the collector and h0 is the extraterrestrial solar energy on a horizontal surface. the exergy analysis was performed based on the mathematical model presented by [4]. the exergy balance can be represented by:

Σ (ėxheat + ėxmass,in) − Σ ėxmass,out = Σ ėxlost (6)

the first term represents the inlet exergy rate (ėxin), due to heat (ėxheat) and to the air entering the system (ėxmass,in). the second term is the outlet exergy rate (ėxout), due to the air leaving the system (ėxmass,out), and the last term represents the destroyed exergy rate (ėxlost) [49]. the exergy rates are presented in [38]. the exergy efficiency ε is defined as the ratio of exergy outflow to exergy inflow [50–52]:

ε = 1 − ėxlost/ėxin = ėxout/ėxin = ėxmass,out/(ėxheat + ėxmass,in) (7)
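to show how eqs. (2)-(3) connect geometry to the airflow, here is a minimal sketch for the reference geometry. it assumes the reconstruction of eq. (2) above is faithful; the friction factors, air properties and the convective flux q″conv,1 are illustrative placeholders, rc is taken as the collector radius dc/2, and a single cp is used where the paper distinguishes cp and cpao.

```python
import math

def airflow(q_conv, h_t, d_t, d_c, h_c, t_ai,
            rho=1.1, beta_p=3.4e-3, cp=1007.0, f_x=0.05, f_y=0.03):
    """eqs. (2)-(3): mass flow rate and outlet airflow temperature.
    friction factors f_x, f_y, air properties and q_conv are illustrative
    placeholders; r_c is taken as the collector radius d_c/2."""
    g = 9.81
    r_c = d_c / 2.0
    area = (d_c / 2.0) ** 2 - (d_t / 2.0) ** 2
    num = rho**2 * beta_p * g * q_conv * math.pi**3 / (8.0 * cp) * h_t * area
    den = 4.0 * f_y * h_t / d_t**5 + f_x / (64.0 * r_c * h_c**3) + 1.0 / d_t**4
    m_dot = (num / den) ** (1.0 / 3.0)                    # eq. (2)
    t_ao = q_conv * math.pi * area / (m_dot * cp) + t_ai  # eq. (3)
    return m_dot, t_ao

# reference geometry (section 2.1): d_c = 25 m, h_c = 0.5 m, d_t = 1 m, h_t = 12.3 m
m_dot, t_ao = airflow(q_conv=20.0, h_t=12.3, d_t=1.0, d_c=25.0,
                      h_c=0.5, t_ai=302.0)
print(f"m_dot = {m_dot:.2f} kg/s, t_ao = {t_ao:.1f} k")
```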
3. optimization

theoretical works about solar chimneys found in the literature usually deal only with the analysis based on the first law of thermodynamics. growing attention has been given to analyses using the second law of thermodynamics, which provide a more powerful tool for engineering assessments [53]. the exergy analysis allows for identifying exergy losses due to irreversibilities in the processes. the yearly average exergy efficiency was chosen as the objective function to be maximized. the environmental conditions and airflow parameters used to determine the exergy efficiency were simulated on an hourly basis for the 365 days of the year. the most important geometric parameters were considered as the optimization variables, and their ranges are:

10 < dc (m) < 100, 0.1 < hc (m) < 1.0, 0.8 < dt (m) < 1.8, 5.0 < ht (m) < 50

the constraints applied to the problem were defined after a previous study on the influence of the geometry on the airflow parameters and exergy efficiency of small solar chimneys [54]. the parametric analysis developed in [54] showed that the collector height does not significantly influence the airflow parameters, as long as it is kept at a minimum height from the ground. an increase of the collector diameter increases the average velocity and temperature, an increase of the tower diameter increases the mass flow rate, and an increase of the tower height increases the velocity and decreases the temperature. there is no single optimization method that can be effectively applied to all problems [55]. the choice of method in a particular case depends on the characteristics of the objective function, the constraints, and the nature and number of variables. therefore, the author recommends that more than one method is used in a comparative way. in this paper, the methods of conjugate directions and variable metric were used. both methods are similar, but the conjugate directions method uses first-order information, resulting in lower computational effort. the variable metric method has been shown to be a very powerful technique in non-linear optimization problems [56].

4. results and discussion

the results are divided into two sections. in the first section, optimization results are presented for the two optimization methods. the objective function was the maximization of the yearly average exergy efficiency. the optimization variables were the main geometric parameters of the solar chimney. in the second section, the airflow and performance parameters throughout the year for the optimized and reference geometries are presented.

4.1. optimization results

two optimization methods were used to evaluate the geometry that provides a higher exergy efficiency: variable metric and conjugate directions. the optimum dimensions found for both methods are presented in table 1. the results of mass flow rate, ground surface temperature, outlet airflow temperature, solar chimney efficiency, and exergy efficiency are presented for the optimum geometry for both models. the model was also simulated for the geometric parameters of the experimental prototype used as a reference, and these results are likewise presented in table 1. the optimum geometry is similar for both of the optimization models used. the tower height was slightly different between the models (a difference of approximately 5 %), causing small differences in the airflow parameters. the differences between the models' results can be attributed to the model structure. the optimum geometry presented a higher mass flow rate and higher temperatures, resulting in a significant increase in the yearly average efficiencies.

table 1: comparison of the results of the optimization methods.

variable                   constraints       reference geometry   variable metric   conjugate directions
collector diameter [m]     10 < dc < 100     25.0                 30.0              30.0
collector height [m]       0.1 < hc < 1.0    0.5                  0.5               0.5
tower diameter [m]         0.8 < dt < 1.8    1.0                  1.0               1.0
tower height [m]           5.0 < ht < 50     12.3                 17.9              18.8
η_annual                                     0.0040               0.1617            0.1638
ε_annual                                     0.1545               0.4416            0.4409
t_ai,annual [°c]                             28.80                29.22             29.21
t_ground,annual [°c]                         28.44                31.01             31.00
ṁ_annual [kg/s]                              1.33                 1.57              1.59
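the paper runs the optimization in ees with the conjugate directions and variable metric methods. as a rough analogue of that workflow, the sketch below drives scipy's bounded quasi-newton routine (l-bfgs-b, itself a variable metric method) over the same four variables and bounds. the objective here is a made-up smooth surrogate peaking near the optimum of table 1, standing in for the hourly exergy model; the function name and the surrogate shape are ours, not the paper's.

```python
import numpy as np
from scipy.optimize import minimize

def exergy_efficiency(x):
    """placeholder yearly average exergy efficiency epsilon(dc, hc, dt, ht).
    a smooth made-up surrogate with an interior maximum -- the real objective
    would wrap the hourly environmental/airflow model of section 2."""
    dc, hc, dt, ht = x
    return (0.44
            * np.exp(-((dc - 30.0) / 40.0) ** 2)
            * np.exp(-((hc - 0.5) / 0.6) ** 2)
            * np.exp(-((dt - 1.0) / 1.0) ** 2)
            * np.exp(-((ht - 18.0) / 25.0) ** 2))

bounds = [(10, 100), (0.1, 1.0), (0.8, 1.8), (5.0, 50.0)]  # dc, hc, dt, ht
x0 = np.array([25.0, 0.5, 1.0, 12.3])  # start from the reference geometry

# maximize epsilon by minimizing its negative under the box constraints
res = minimize(lambda x: -exergy_efficiency(x), x0,
               method="L-BFGS-B", bounds=bounds)
dc, hc, dt, ht = res.x
print(f"dc = {dc:.1f} m, hc = {hc:.2f} m, dt = {dt:.2f} m, ht = {ht:.1f} m, "
      f"epsilon = {-res.fun:.3f}")
```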
4.2. monthly average airflow parameters

the airflow and performance parameters were evaluated on a monthly basis for the optimized geometry, for the optimization methods used. the results are presented in figures 2 to 6, compared to the results for the reference geometry. buoyancy forces generate the airflow inside the solar chimney. the air under the solar collector is heated by the ground surface, which is heated by the incident solar radiation. therefore, solar radiation significantly affects the airflow parameters. the higher the levels of solar radiation, the higher the mass flow rate and the temperatures. the monthly average mass flow rate throughout the year is presented in figure 2. the general behaviour is the same as for the solar radiation: higher during the summer and lower during the winter. although it is expected that the mass flow rate varies during the day, the monthly average values varied by about 12 % throughout the year for a defined geometry. it can be seen that the results obtained for the two optimization methods were very close, slightly higher for the variable metric method. the mass flow rate obtained for the reference geometry was lower than for the optimized geometry. this behaviour was already expected, because when the collector diameter and the tower height increase, the mass flow rate also increases, as seen in the literature [17, 22, 57, 58].

figure 2. variation of the mass flow rate throughout the year.

the monthly average ground surface temperature and outlet temperature are shown in figures 3 and 4. the temperatures follow the same behaviour as the mass flow rate and solar radiation, with higher values in the summer and lower values in the winter. there are no significant differences between the optimized and reference geometries, since the geometric parameters have little influence on the ground surface temperature. the outlet airflow temperature is more affected by the geometry than the ground surface temperature. therefore, greater differences were found when comparing the results from the optimized geometry to the results from the reference geometry. since the optimized geometry presented a higher tower height and collector diameter, higher airflow temperatures were obtained. it is worth noting that, as expected, the ground surface temperatures are higher than the airflow temperatures.

figure 3. variation of the ground temperature throughout the year.

figure 4. variation of the air outlet temperature throughout the year.

the monthly average solar chimney efficiency is shown in figure 5. higher efficiencies were found during the winter. this behaviour can be explained by the definition of the solar chimney's efficiency. during the winter, the incident solar radiation and ambient temperature are lower, and the outlet temperature is higher. since the variations of solar radiation and ambient temperature are more significant than the variations of the outlet temperature, the solar chimney efficiency increases. similar results were found for both optimization models, since the geometry is similar.

figure 5. variation of the solar chimney efficiency throughout the year.

figure 6. variation of the exergy efficiency throughout the year.

the reference geometry presented a very low efficiency, ranging from 0.3 % to 0.5 %. the efficiency for the optimized geometry increased significantly, ranging from 12 % to 23 %. this difference can be attributed mainly to the geometry. the chimney height has the greatest influence on the solar chimney efficiency [48], and it was significantly increased during the optimization process. the monthly average exergy efficiency is presented in figure 6.
higher values were found for the optimized geometry, since the objective function used in this paper was its maximization. the exergy efficiency is defined as the ratio between the outlet and inlet exergy rates. during the summer, higher solar radiation levels are found, increasing the outlet temperatures. therefore, both the outlet and inlet exergy rates increase during the summer, and the monthly average exergy efficiency shows only small variations throughout the year. again, both optimization methods presented similar results.

5. conclusions

an unsteady theoretical model was used to determine the airflow and performance parameters of a small-scale solar chimney. results were obtained for a one-year simulation, with a time-step of one hour. the geometry of the solar chimney was optimized to ensure the maximum yearly average exergy efficiency. two optimization methods were used and the results were compared to those expected for a reference geometry, based on an experimental prototype. monthly average values of the parameters were also evaluated. the following conclusions can be drawn from the analysis.

• the mathematical model was able to appropriately predict the parameters. the mass airflow equation should be investigated further, since the results were lower than the experimental values;
• the solar chimney efficiency is higher in the winter and lower in the summer;
• the exergy efficiency does not present significant variations throughout the year;
• the optimum geometry showed a great influence on the airflow parameters and on the energetic and exergetic efficiencies. the mass flow rate presented an average increase of 20 % when compared to the reference geometry, and the outlet temperature was increased by about 1 °c;
• the energetic efficiency was significantly increased, mainly due to the increase of the chimney height;
• the optimization process was able to significantly increase the yearly average exergy efficiency when compared to the reference geometry;
• the optimization methods presented similar results, both for the geometry and for the airflow and performance parameters.

in future works, other optimization methods should be applied.

acknowledgements

the authors are thankful to cnpq, fapemig and puc minas.
this study was financed in part by the coordenação de aperfeiçoamento de pessoal de nível superior – brasil (capes) – finance code 001.

list of symbols

acol area of the collector
cp air specific heat
cpao air specific heat at the tower outlet
dc collector diameter
dt tower diameter
ėxheat exergy rate due to heat
ėxin inlet exergy rate
ėxlost destroyed exergy rate
ėxmass,in exergy rate due to the air entering the system
ėxmass,out exergy rate due to the air leaving the system
ėxout outlet exergy rate
fx friction factor in the collector
fy friction factor in the tower
g gravity acceleration
hai specific enthalpy of the airflow at the collector inlet
hao specific enthalpy of the airflow at the tower outlet
hc collector height
h0 extraterrestrial solar energy on a horizontal surface
ht tower height
i total solar radiation
ib beam component of the solar radiation
id diffuse component of the solar radiation
ṁ mass flow rate
q̇ heat transfer rate
q″conv,1 convective heat transfer rate between the ground and the airflow
rb geometric factor
s solar radiation absorbed by the ground
tai temperature of the airflow at the collector inlet
tao temperature of the airflow at the tower outlet
tground ground surface temperature
vai velocity of the airflow at the collector inlet
vao velocity of the airflow at the tower outlet
ẇ mechanical work rate
β slope of the solar collector
β′ volumetric expansion coefficient
ε exergy efficiency
η solar chimney efficiency
ρ air density
ρground ground reflectance
(τα) transmittance-absorptance product

references

[1] n. fathi, s. s. aleyasin, p. vorobieff. numerical–analytical assessment on manzanares prototype. applied thermal engineering 102:243–250, 2016. doi:10.1016/j.applthermaleng.2016.03.133.
[2] a. b. kasaeian, s. molana, k. rahmani, d. wen. a review on solar chimney systems. renewable and sustainable energy reviews 67:954–987, 2017. doi:10.1016/j.rser.2016.09.081.
[3] a. g. ferreira, c. b. maia, m. f. b. cortez, r. m. valle. technical feasibility assessment of a solar chimney for food drying. solar energy 82:198–205, 2008. doi:10.1016/j.solener.2007.08.002.
[4] c. maia, j. castro silva, l. cabezas-gómez, et al. energy and exergy analysis of the airflow inside a solar chimney. renewable and sustainable energy reviews 27:350–361, 2013. doi:10.1016/j.rser.2013.06.020.
[5] c. b. maia, a. g. ferreira, l. cabezas-gómez, et al. thermodynamic analysis of the drying process of bananas in a small-scale solar updraft tower in brazil. renewable energy 114:1005–1012, 2017. doi:10.1016/j.renene.2017.07.102.
[6] l. zuo, y. zheng, z. li, y. sha. solar chimneys integrated with sea water desalination. desalination 276:207–213, 2011. doi:10.1016/j.desal.2011.03.052.
[7] l. zuo, y. yuan, z. li, y. zheng. experimental research on solar chimneys integrated with seawater desalination under practical weather condition. desalination 298:22–33, 2012. doi:10.1016/j.desal.2012.05.001.
[8] t. ming, t. gong, r. kiesgen de richter, et al. freshwater generation from a solar chimney power plant. energy conversion and management 113:189–200, 2016. doi:10.1016/j.enconman.2016.01.064.
[9] t. ming, t. gong, r. kiesgen de richter, et al. numerical analysis of seawater desalination based on a solar chimney power plant. applied energy pp. 1258–1273, 2017. doi:10.1016/j.apenergy.2017.09.028.
[10] t. ming, t. gong, r. kiesgen de richter, et al. a moist air condensing device for sustainable energy production and water generation. energy conversion and management 138:638–650, 2017. doi:10.1016/j.enconman.2017.02.012.
[11] m. asayesh, a. kasaeian, a. ataei. optimization of a combined solar chimney for desalination and power generation. energy conversion and management 150:72–80, 2017. doi:10.1016/j.enconman.2017.08.006.
[12] c. b. maia, f. v. m. silva, v. l. c. oliveira, l. l. kazmerski. an overview of the use of solar chimneys for desalination. solar energy 183:83–95, 2019. doi:10.1016/j.solener.2019.03.007.
[13] l. zuo, l. ding, y. yuan, et al. research progress on integrated solar chimney system for freshwater production. global energy interconnection 2:214–223, 2019. doi:10.1016/j.gloei.2019.07.014.
[14] s. patel, d. prasad, m. r. ahmed. computational studies on the effect of geometric parameters on the performance of a solar chimney power plant. energy conversion and management 77:424–431, 2014. doi:10.1016/j.enconman.2013.09.056.
[15] e. cuce, h. şen, p. m. cuce. numerical performance modelling of solar chimney power plants: influence of chimney height for a pilot plant in manzanares, spain. sustainable energy technologies and assessments 39:100704, 2020. doi:10.1016/j.seta.2020.100704.
[16] m. ghalamchi, a. kasaeian, m. ghalamchi. experimental study of geometrical and climate effects on the performance of a small solar chimney. renewable and sustainable energy reviews 43:425–431, 2015. doi:10.1016/j.rser.2014.11.068.
[17] m. ghalamchi, a. kasaeian, m. ghalamchi, s. mirzahosseini. an experimental study on the thermal performance of a solar chimney with different dimensional parameters. renewable energy 91:477–483, 2016. doi:10.1016/j.renene.2016.01.091.
[18] p. guo, j.-y. li, y. wang, y. wang. numerical study on the performance of a solar chimney power plant. energy conversion and management 105:197–205, 2015. doi:10.1016/j.enconman.2015.07.072.
[19] s. hu, d. leung, j. chan. impact of the geometry of divergent chimneys on the power output of a solar chimney power plant. energy 120:1–11, 2017. doi:10.1016/j.energy.2016.12.098.
[20] d. toghraie, a. karami, m. afrand, a. karimipour. effects of geometric parameters on the performance of solar chimney power plants. energy 162:1052–1061, 2018. doi:10.1016/j.energy.2018.08.086.
[21] c. okoye, u. atikol. a parametric study on the feasibility of solar chimney power plants in north cyprus conditions. energy conversion and management 80:178–187, 2014. doi:10.1016/j.enconman.2014.01.009.
[22] c. maia, a. ferreira, r. valle, m. cortez. theoretical evaluation of the influence of geometric parameters and materials on the behavior of the airflow in a solar chimney. computers & fluids 38:625–636, 2009. doi:10.1016/j.compfluid.2008.06.005.
[23] t. tayebi, m. djezzar. numerical simulation of natural convection in a solar chimney. international journal of renewable energy research 2:712–717, 2012.
[24] t. tayebi, m. djezzar. numerical analysis of flows in a solar chimney power plant with a curved junction. international journal of energy science 3:280–286, 2013.
[25] t. tayebi. entropy generation analysis of convective airflow in a solar updraft tower power plant. heat transfer-asian research 48:3885–3901, 2019. doi:10.1002/htj.21573.
[26] t. tayebi, m. djezzar. effect of varying ambient temperature and solar radiation on the flow in a solar chimney collector. international journal of smart grid and clean energy 5:16–23, 2016. doi:10.12720/sgce.5.1.16-23.
[27] m. najmi, a. nazari, s. mansouri, g. zahedi. feasibility study on optimization of a typical solar chimney power plant. heat and mass transfer 48:475–485, 2012. doi:10.1007/s00231-011-0894-5.
[28] h. dehghani, a. mohammadi. optimum dimension of geometric parameters of solar chimney power plants: a multi-objective optimization approach. solar energy 105:603–612, 2014. doi:10.1016/j.solener.2014.04.006.
[29] e. gholamalizadeh, m. h. kim. thermo-economic triple-objective optimization of a solar chimney power plant using genetic algorithms. energy 70:204–211, 2014. doi:10.1016/j.energy.2014.03.115.
[30] e. gholamalizadeh, s. mansouri. a comprehensive approach to design and improve a solar chimney power plant: a special case – kerman project. applied energy 102:975–982, 2013. doi:10.1016/j.apenergy.2012.06.012.
[31] k. shirvan, s. mirzakhanlari, m. mamourian, n. abu-hamdeh. numerical investigation and sensitivity analysis of effective parameters to obtain potential maximum power output: a case study on zanjan prototype solar chimney power plant. energy conversion and management 136:350–360, 2017. doi:10.1016/j.enconman.2016.12.081.
[32] h. allwörden, i. gasser, m. kamboh. modelling, simulation and optimisation of general solar updraft towers. applied mathematical modelling 64:265–284, 2018. doi:10.1016/j.apm.2018.07.023.
[33] o. najm, s. shaaban. numerical investigation and optimization of the solar chimney collector performance and power density. energy conversion and management 168:150–161, 2018. doi:10.1016/j.enconman.2018.04.089.
[34] h. muhammed, s. atrooshi. modeling solar chimney for geometry optimization. renewable energy 138:212–223, 2019. doi:10.1016/j.renene.2019.01.068.
[35] r. balijepalli, v. p. chandramohan, k. kirankumar. optimized design and performance parameters for wind turbine blades of a solar updraft tower (sut) plant using theories of schmitz and aerodynamics forces. sustainable energy technologies and assessments 30:192–200, 2018. doi:10.1016/j.seta.2018.10.001.
[36] a. habibollahzade, e. houshfar, p. ahmadi, et al. exergoeconomic assessment and multi-objective optimization of a solar chimney integrated with waste-to-energy. solar energy 176:30–41, 2018. doi:10.1016/j.solener.2018.10.016.
[37] k. rahbar, a. riasi. performance enhancement and optimization of solar chimney power plant integrated with transparent photovoltaic cells and desalination method. sustainable cities and society 46:101441, 2019. doi:10.1016/j.scs.2019.101441.
[38] j. o. castro silva, t. s. fernandes, s. de m. hanriot, et al. a model to estimate ambient conditions and behavior of the airflow inside a solar chimney. in renewable energy in the service of mankind, vol. 2. 2016.
[39] c. maia, a. ferreira, r. valle, m. cortez. analysis of the airflow in a prototype of a solar chimney dryer. heat transfer engineering 30:393–399, 2009. doi:10.1080/01457630802414797.
[40] a. p. c. guimarães. estudo solarimétrico com base na definição de mês padrão e seqüência de radiação diária. ufmg, 1995.
[41] c. maia, a. ferreira, s. hanriot. evaluation of a tracking flat-plate solar collector in brazil. applied thermal engineering 73:953–962, 2014. doi:10.1016/j.applthermaleng.2014.08.052.
[42] j. a. duffie, w. a. beckman. solar engineering of thermal processes. john wiley & sons inc, 2013.
[43] meteotest. meteonorm global meteorological database. http://meteonorm.com/, 2012.
[44] c. b. maia. análise teórica e experimental de uma chaminé solar: avaliação termofluidodinâmica. universidade federal de minas gerais, 2005.
[45] a. koonsrisuk, s. lorente, a. bejan. constructal solar chimney configuration. international journal of heat and mass transfer 53:327–333, 2010. doi:10.1016/j.ijheatmasstransfer.2009.09.026.
[46] d. g. kröger, m. burger. experimental convection heat transfer coefficient on a horizontal surface exposed to the natural environment. in proceedings of the ises eurosun2004 international sonnenforum, vol. 1, pp. 422–430. 2004.
[47] i. dincer, m. a. rosen. exergy, environment and sustainable development. in exergy, pp. 36–59. 2007.
[48] s. nizetic, n. ninic, et al. analysis and feasibility of implementing solar chimney power plants in the mediterranean region. energy 33:1680–1690, 2008. doi:10.1016/j.energy.2008.05.012.
[49] a. hepbasli. a key review on exergy analysis and assessment of renewable energy resources for sustainable future. renewable and sustainable energy reviews 12:593–661, 2008. doi:10.1016/j.rser.2006.10.001.
[50] a. r. celma, f. cuadros. energy and exergy analyses of omw solar drying process. renewable energy 34(3):660–666, 2009. doi:10.1016/j.renene.2008.05.019.
[51] s. sevik, m. aktaş, e. dolgun, et al. performance analysis of solar and solar-infrared dryer of mint and apple slices using energy-exergy methodology. solar energy 180:537–549, 2019. doi:10.1016/j.solener.2019.01.049.
[52] n. aviara, l. onuoha, o. falola, j. igbeka. energy and exergy analyses of native cassava starch drying in a tray dryer. energy 73:809–817, 2014. doi:10.1016/j.energy.2014.06.087.
[53] m. aghbashlo, h. mobli, s. rafiee, a. madadlou. a review on exergy analysis of drying processes and systems. renewable and sustainable energy reviews 22:1–22, 2013. doi:10.1016/j.rser.2013.01.015.
[54] j. de o. castro silva. modelagem do escoamento de ar e otimização de uma chaminé solar. pontifícia universidade católica de minas gerais, 2014.
[55] j. r. shewchuk. an introduction to the conjugate gradient method without the agonizing pain. carnegie-mellon university, 1994.
[56] f. kowsary, k. pooladvand, a. pourshaghaghy. regularized variable metric method versus the conjugate gradient method in solution of radiative boundary design problem.
journal of quantitative spectroscopy and radiative transfer 108:277–294, 2007. doi:10.1016/j.jqsrt.2007.03.007.
[57] j. oliveira castro silva, s. hanriot, c. maia. parametric analysis of geometric configurations of a small-scale solar chimney. advanced materials research 1051:975–979, 2014. doi:10.4028/www.scientific.net/amr.1051.975.
[58] k. cherif, f. ferroudji, m. ouali. analytical modeling and optimization of a solar chimney power plant. international journal of engineering research in africa 25:78–88, 2016. doi:10.4028/www.scientific.net/jera.25.78.

acta polytechnica vol. 44 no. 4/2004

a numerical model to predict matric suction inside unsaturated soils

a. farouk, l. lamboj, j. kos

the objective of this research is to introduce a numerical simulation model to predict approximate values of the matric suction inside unsaturated soils that have low water contents. the proposed model can be used to predict the relationship between the water content and the matric suction of a studied soil to construct the soil-water characteristic curve. in addition, the model can be utilized to combine the predicted matric suction with the soil parameters obtained experimentally, which enables us to explain how matric suction can affect the behaviour of unsaturated soils, without the need to utilize advanced measuring devices or special testing techniques. the model has given good results, especially when studying coarse-grained soils.

keywords: unsaturated soil, matric suction, surface tension, soil-water characteristic curve.

1 introduction

the mechanical behaviour of unsaturated soils is greatly influenced by the degree of saturation and, consequently, by the matric suction. matric suction is a function of many soil properties, such as the grain size and the geometry of the pores constrained between the soil particles. in addition, matric suction depends on the pore fluid properties, such as the interfacial forces, density, and the degree of saturation. based on these relations, a simple numerical simulation model is introduced in this study to predict the relationship between the matric suction and the water content inside unsaturated, relatively dry samples (i.e., samples with low water contents). the suggested model basically makes use of the surface tension and capillary action phenomena of the water between the particles, in addition to the grain-size distribution curve of the soil under investigation. then, the relationship between the water content and the matric suction is predicted and the corresponding soil-water characteristic curve can be constructed. however, the values of the predicted suction are approximate, and thus they can be used as a simple, quick indicator of how much (i.e., in which range) the matric suction will be inside the investigated soil.

2 suction in unsaturated soils

the total suction, ψ, in an unsaturated soil is generally made up of two components, namely matric suction and osmotic suction. the sum of these two components is called the total suction.
matric suction is defined as the difference between the pore-air pressure and the pore-water pressure (i.e., matric suction = ua − uw), while osmotic suction, π, is a function of the amount of dissolved salts in the pore fluid. therefore, matric suction is attributed mainly to capillary actions in the soil structure, while osmotic suction is associated with physico-chemical interactions between soil minerals and pore water [13]. however, matric suction is of primary interest, because many engineering problems involving unsaturated soils are commonly the result of environmental changes, which primarily affect the matric suction of the soil [9]. in most cases, environmental changes primarily affect the matric suction component, while osmotic changes are generally less significant. in other words, a change in total suction is essentially equivalent to a change in matric suction (i.e., Δψ ≈ Δ(ua − uw)) [1].

in contemporary unsaturated soil mechanics theories, an element of soil is often considered as a simple three-phase system consisting of pore-air, pore-water and granular solid particles. matric suction in such a system arises from capillary actions attributed to interactions between air-water menisci (which are generated by the surface tension phenomenon) and soil particles.

3 role of surface tension

one of the most important properties that affect the matric suction is the surface tension. the phenomenon of surface tension results mainly from the intermolecular forces acting on molecules in the air-water interface, which is known as the contractile skin. a water molecule within the contractile skin experiences an unbalanced force towards the interior of the water. in order for the contractile skin to be in an equilibrium condition, a tensile pull is generated along it. the tensile pull is tangential to the contractile skin surface. therefore, surface tension causes the contractile skin to behave like a stretched elastic curved membrane. if the contractile skin has a double curvature (i.e., a three-dimensional membrane), the total excess pressure acting on the membrane can be calculated as the sum of the components obtained in two principal directions as follows:

Δu = ts (1/r1 + 1/r2) . (1)

equation (1) is known as the laplace equation of capillarity [1], where r1 and r2 are the radii of curvature of the membrane in two orthogonal principal planes, while ts denotes the surface tension, as illustrated in fig. 1.

fig. 1: surface tension in the contractile skin

in soils, the surface of the grains always tends to adsorb water more strongly than air, while the air is compressed if it is completely encompassed by water. that is why, in unsaturated soils, the air pressure is always higher than that of the water. thus, the contractile skin is assumed to be subjected to an air pressure, ua, greater than the water pressure, uw. the difference between these two pressures is the matric suction, (ua − uw), and consequently, the pressure difference that causes the contractile skin to curve according to eq. (1) can be formulated as:

(ua − uw) = ts (1/r1 + 1/r2) . (2)
in soils, the surface of the grains always tends to adsorb water more strongly than air, while the air is compressed if it is completely encompassed by water. that is why, in unsaturated soils, the air pressure is always higher than that of the water. the contractile skin is thus assumed to be subjected to an air pressure, u_a, greater than the water pressure, u_w. the difference between these two pressures is the matric suction, (u_a − u_w), and consequently the pressure difference that causes the contractile skin to curve according to eq. (1) can be formulated as:

$(u_a - u_w) = T_s \left( \frac{1}{R_1} + \frac{1}{R_2} \right)$. (2)

eq. (2) clearly shows that as the radius of curvature of the contractile skin decreases, the matric suction of a soil increases. this supports the fact that matric suction increases as the water content of the soil decreases. at the opposite extreme, when the pressure difference between the pore-air and the pore-water goes to zero, the radius of curvature, R, goes to infinity; thus a flat air-water interface exists when the matric suction goes to zero, which is the case of a fully saturated soil. however, the contractile skin may be completely concave, or a combination of concave and convex in orthogonal directions [3]; it is only necessary that the relative magnitudes of $R_1$ and $R_2$ balance equation (2). in fact, when the contractile skin spans between collections of soil particles, continuity requires, depending on the geometry, that the interface has curvature both concave and convex to the air phase [8]. this case obviously arises in unsaturated, relatively dry soils, where the water menisci are concentrated only between the soil particles and there is no continuity between these menisci. in such a case, the matric suction can be calculated by converting eq. (2) to the following equation, in which the two curvatures act in opposite senses:

$(u_a - u_w) = T_s \left( \frac{1}{R_1} - \frac{1}{R_2} \right)$. (3)

4 description of the numerical model

the numerical model introduced in the current study takes into consideration the effect of the capillary water on the behaviour of unsaturated soils. the soil is simulated by replacing its particles with a system of spheres, with water assumed to exist at the contact points between the spheres as capillary water. harr [6] stated that natural soil particles in the silt-size range, and generally in coarser ranges, are bulky and fairly equidimensional, so simulating the particles of coarse-grained soils by a system of spheres involves only a small approximation. unfortunately, proposing a constitutive simulation model that takes into consideration the exact, or even an approximate, shape of real clay particles is very complicated. the simulation model presented in the current study therefore focuses on predicting the relationship between the water content and the matric suction for coarse-grained soils. in the case of fine-grained soils, the spheres might be considered as packets of saturated clay particles (i.e., as aggregations of clay particles); the predicted values are then approximate, and can be used only to indicate the range in which the matric suction inside such soils will lie. the suggested model takes into consideration the effect of both the packing pattern of the spherical particles and the ratio of the voids restricted between them.
in addition, the matric suctions generated by the surface tension acting on the meniscus are calculated by taking into consideration the effect of the grain sizes, the water content, and the specific gravity of the particles. fig. 2a gives a 3d illustration of four spherical particles, the pore-water menisci accumulated at the points of contact, and the surface tension forces acting on the water menisci. the shape of the capillary water at the points of contact between the particles can be considered as a centre-pinched cylinder with two extracted segments of spheres (one from its top and the other from the bottom), as shown in fig. 2b.

(fig. 1: surface tension in the contractile skin. fig. 2: 3d representation of the numerical model: (a) forces acting on spherical particles due to surface tension, (b) water meniscus at one point of contact between two particles.)

4.1 numerical model geometry and calculations

a series of mathematical equations is adopted to introduce a simple constitutive relationship between matric suction and water content, and the effect of the amount of water distributed at the contact points between the spheres is investigated. in the proposed model, matric suction can be evaluated as a function of the water content accumulated between the soil particles. some studies have proposed similar models, but most of them have considered only equal-sized particles, e.g., [7] and [12]. therefore, the effect of considering non-equal-sized spheres in the calculation of the matric suction and the water content is investigated in this study. fig. 3 summarizes the possible theoretical packings of identical spheres (simple cubic, cubic-tetrahedral, tetragonal-spheroidal, pyramidal and tetrahedral), where n and e are the porosity and the void ratio of the packing, respectively, while n_c denotes the number of points of contact for each particle.

4.1.1 geometry of the spherical model

a simple representation of the proposed model can be introduced in two-dimensional space by considering the cross section of any two contacting identical spherical particles. the objective of studying the geometry of these two particles is to determine the main dimensions of the inter-particle water meniscus. once these dimensions are known, the volume of the water accumulated between any two particles can easily be calculated. the water content can then be evaluated as a function of the volume of the water and the volume of the particle, taking into consideration the number of contact points for each possible packing pattern shown in fig. 3. fig. 4a shows two identical spheres, each of radius r, placed vertically and holding capillary water (contact moisture) between them. for simplification, the contact angle, α, between the water meniscus and the surface of the spherical particles is assumed to be zero. the water meniscus can be considered as a combination of concave and convex in orthogonal directions; the radii of curvature that define its geometry in the two orthogonal directions are r_0 and r_1. the volume of water accumulated between any two particles can be calculated as a function of the water retention angle, θ: from a geometrical point of view, increasing θ increases the size of the water meniscus and simultaneously the volume of water.
the main condition for the calculation to continue is the discontinuity between the regimes of the water menisci around the soil particles. once the menisci begin to fuse with each other, the calculation cannot proceed, because the shape of the meniscus becomes irregular and very complicated to analyse. the menisci tend to fuse at an angle known as the critical water retention angle, θ_c [11]. from geometrical considerations, this angle is equal to 45° for the simple cubic packing, while for the rest of the packing patterns the fringes of the menisci begin to meet each other when θ_c is equal to 30°. according to eq. (3), the matric suction generated by the water meniscus between the spherical particles depicted in fig. 4a can be reformulated as:

$(u_a - u_w) = T_s \left( \frac{1}{r_0} - \frac{1}{r_1} \right)$. (4)

from the geometry of the two contacting particles shown in fig. 4a, the radii of curvature, r_0 and r_1, as well as the dimensions x and z, can be expressed using eqs. (5) through (8) as follows:

$r_0 = r(\sec\theta - 1)$, (5)
$r_1 = r(1 + \tan\theta - \sec\theta)$, (6)
$x = r \sin\theta$, (7)
$z = 2r(1 - \cos\theta)$. (8)

finally, it should be noted that "o_1" in fig. 4a represents the distance from the centroid of the segment (a-b-c) to the axis of revolution (y–y); this distance can be calculated using the formulas described by rektorys [10].

(fig. 3: theoretical packing of identical spheres: (a) simple cubic, (b) cubic-tetrahedral, (c) tetragonal-spheroidal, (d) pyramidal, (e) tetrahedral, with the corresponding void ratio e, porosity n [%] = [e/(1+e)]·100 and number of contact points n_c for each packing.)

4.1.2 calculations of water content

the water content of the studied model can be calculated as the ratio between half the weight of water accumulated at the points of contact of any spherical particle and the weight of the particle itself. the volume of the water meniscus, v_m, accumulated at one of the points of contact between two particles can be calculated according to the volumes depicted in fig. 4b as:

$v_m = vol(a) - vol(b) - 2\,vol(c)$. (9)

the dimensions described in eqs. (5) through (8) are now substituted into eq. (9) to formulate the volume of water accumulated at one point of contact between two particles. the resulting formula for the volume of the centre-pinched cylinder with the two extracted segments of spheres is a closed-form trigonometric function of the retention angle,

$v_m = \pi r^3 f(\theta)$, (10)

where f(θ) combines terms in sin θ, cos θ, tan θ and sec θ arising from eqs. (5)–(8). the percentage water content, w_c %, accumulated around one particle at all points of contact can then be expressed as:

$w_c = \frac{37.5\, n_c\, v_m}{\pi r^3 g_s}$, (11)

where n_c is the number of points of contact for each particle, which depends on the packing pattern of the spherical particles, and g_s is the specific gravity of the soil particles. in addition, the volumetric water content, θ_w, can be evaluated as:

$\theta_w = \frac{0.375\, n_c\, v_m}{\pi r^3 (1 + e)}$. (12)

in the simulation, the input data are the radius, r (m), the specific gravity, g_s, the surface tension of the air-water interface, t_s (mn/m), the water retention angle, θ (degrees), the number of points of contact between particles, n_c, and finally the void ratio, e. all these inputs can be measured using relatively simple laboratory measurements, and the implementation of the model is as simple as possible.
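to make the calculation concrete, the following sketch (my own illustration, not the authors' code) evaluates eqs. (4)–(8), (11) and (12) for pyramidal packing with the simulation-1 values t_s = 0.075 n/m, g_s = 2.65, n_c = 12 and e = 0.35. since the closed form of eq. (10) is not reproduced above, the meniscus volume is integrated numerically by revolving the area between the toroidal meniscus arc and the sphere surface about the contact axis:

```python
import numpy as np

def bridge_suction_and_water_content(r, theta_deg, ts=0.075, gs=2.65,
                                     nc=12, e=0.35, npts=4000):
    """matric suction [pa] and water contents for one retention angle theta."""
    th = np.radians(theta_deg)
    sec = 1.0 / np.cos(th)
    r0 = r * (sec - 1.0)                       # eq. (5): concave meniscus radius
    r1 = r * (1.0 + np.tan(th) - sec)          # eq. (6): convex (neck) radius
    suction = ts * (1.0 / r0 - 1.0 / r1)       # eq. (4)

    # meniscus volume: revolve the area between the toroidal arc and the
    # sphere surface about the contact axis (numeric stand-in for eq. (10))
    y = np.linspace(0.0, r * (1.0 - np.cos(th)), npts)  # half the gap z, eq. (8)
    x_men = r * np.tan(th) - np.sqrt(r0**2 - y**2)      # meniscus profile
    x_sph = np.sqrt(2.0 * r * y - y**2)                 # sphere profile
    vm = 2.0 * np.trapz(np.pi * (x_men**2 - x_sph**2), y)

    wc = 37.5 * nc * vm / (np.pi * r**3 * gs)               # eq. (11), percent
    theta_w = 0.375 * nc * vm / (np.pi * r**3 * (1.0 + e))  # eq. (12)
    return suction, wc, theta_w

# sweep theta towards the critical angle (30 deg for pyramidal packing)
for theta in (2, 5, 10, 20, 30):
    s, wc, _ = bridge_suction_and_water_content(r=1e-4, theta_deg=theta)
    print(f"theta={theta:2d} deg  suction={s/1e3:9.1f} kPa  wc={wc:7.3f} %")
```

as the retention angle shrinks (a drier soil), r_0 collapses towards zero and the computed suction grows without bound, which reproduces the qualitative behaviour discussed in the simulations below.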
at this point, it should be clarified that the foregoing equations can be applied directly in the case of soils with a uniform, homogeneous particle size. for soils composed of more than one particle size, the model can also be applied, along with the grain-size distribution curve, as described below.

(fig. 4: geometry of a spherical model for unsaturated soils: (a) main dimensions of water menisci between two particles, (b) volumes used to evaluate water content.)

4.1.3 calculations for non-equal-sized spherical particles

when dealing with a soil composed of non-equal-sized spherical particles, the current model combines the capillary action in the water menisci with the grain-size distribution curve. first, the grain-size distribution curve is divided into divisions of uniform soil particles. then, for each particle size, the individual relationship between water content and matric suction is built up by means of equations (5) through (12). once the whole grain-size distribution curve has been incrementally analysed, the individual relations between water content and matric suction are summed together using a superposition technique to obtain the soil-water characteristic curve (swcc) for the whole investigated soil (a sketch of this superposition is given after the simulations below). in the assemblage of soil particles, the voids created between larger particles are assumed to be filled with smaller particles, and the effect of the interaction among spheres of different sizes is ignored in this model. a similar technique was suggested by fredlund et al. [4, 5], who also used the grain-size distribution curve in their model.

5 numerical simulations

the numerical simulations presented below are devoted mainly to checking the reliability of the proposed model. some relationships between the matric suction and the water content were calculated numerically using this model, and the graphical representations of these relationships (i.e., the swcc) were then constructed. the first simulation studies the effect of the particle size on the predicted swcc. in the second simulation, the effect of the packing pattern of the spherical particles, and as a result the effect of the soil's porosity, on the swcc is investigated. the third simulation verifies the proposed model in constructing the swcc for a soil with non-equal-sized grain particles, making use of experimental and numerical results available in the literature.

simulation 1: this simulation investigates the effect of the grain size of a soil on the relation between the matric suction and the water content, using the proposed numerical model. the input data needed to calculate the water content and the corresponding matric suctions are: t_s = 0.075 n/m and g_s = 2.65, while the radius of the particle, r, ranges from 0.001 mm to 1.0 mm, which is the common range for sand and silt grains. fig. 5 shows the swcc calculated for five different grain sizes, assuming a pyramidal packing of the simulating spherical particles (thus n_c = 12 and e = 0.35).
as expected, at the same water content, the larger the spheres, the lower the predicted matric suction: the matric suctions calculated for the large particle diameters are of the order of kilopascals, whereas the suctions evaluated for the small particle diameters are of the order of megapascals. this behaviour is similar to that known for real soils. however, it is well known that matric suction can develop only in the presence of both water and air at their interface (i.e., at the contractile skin). that is why, in the case of a completely dry soil, where the water content is equal to zero, there is no contractile skin and thus no matric suction; once water flows into the pores, contractile skins form between the particles and suction develops.

simulation 2: the second simulation examined the effect of the packing type on the behaviour of the simulated soil-water characteristic curve (fig. 6). the denser packings give higher suctions, owing to the fact that the number of contacts between the particles in a dense soil is innumerably large; the number of water menisci accumulated at the points of contact is then also large, and consequently the matric suction for a dense soil is believed to be extremely high.

(fig. 6: effect of the packing type on the behaviour of the simulated soil-water characteristic curve.)

simulation 3: the main objective of this simulation is to verify the possibility of using the suggested model to predict the swcc of a soil that comprises non-equal-sized grain particles. this simulation makes use of the numerical and experimental results reported in 1997 by fredlund et al. [4], who predicted a swcc for sand using the grain-size distribution curve shown in fig. 7. the data drawn from this curve were input by fredlund et al. [4] into the computer program "soilvision", which is mainly based on the theoretical method suggested by fredlund and xing [2] to predict the swcc. it can be seen from fig. 7 that the studied sand has a relatively uniform particle size distribution (i.e., a low range of different grain sizes); such a relatively uniform distribution is considered the ideal case for examining the reliability of the proposed model. the simulation was carried out by dividing the grain-size distribution curve into small divisions of uniform soil particles, as illustrated in fig. 7 (grain-size distribution curve of sand showing the dividing method used for the current study). the swcc is estimated for each soil division, and the final swcc is then built up by summation of all the divisional soil-water characteristic curves. the results predicted by the current model, as well as both the laboratory-measured and the analytically predicted swcc reported by fredlund et al. [2], are all depicted in fig. 8 (a comparison between the swcc of sand using the current numerical model and other methods). as indicated in this figure, the swcc predicted for uniform sand using the current model compares well with both the measured and the predicted curves introduced by fredlund et al. [2]. some studies have proposed similar models, but most of them have considered only equal-sized particles, e.g., [7] and [12]. therefore, to illustrate the importance of considering different particle sizes in the simulation of a soil, an swcc was also drawn in fig. 8 taking into account only one particle size from the grain-size distribution curve of fig. 7; in this case, an average particle size of 0.35 mm was considered (i.e., r = 0.175 mm). it is clear from the figure that considering only one particle size under-predicts the swcc for the simulated sand.
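the dividing-and-summing procedure of simulation 3 can be sketched as follows (my own reading of the text: the paper does not spell out the weighting used in the superposition, so mass fractions from the grading curve are assumed, and the radii and fractions below are illustrative rather than the sand of fig. 7). the function reuses bridge_suction_and_water_content from the earlier sketch:

```python
import numpy as np

def composite_swcc(radii, mass_fractions, suction_grid,
                   thetas_deg=np.linspace(0.5, 30.0, 120)):
    """mass-weighted superposition of per-fraction swccs (assumed weighting)."""
    total_wc = np.zeros_like(suction_grid, dtype=float)
    for r, fm in zip(radii, mass_fractions):
        s = np.empty_like(thetas_deg)
        wc = np.empty_like(thetas_deg)
        for i, th in enumerate(thetas_deg):
            s[i], wc[i], _ = bridge_suction_and_water_content(r=r, theta_deg=th)
        order = np.argsort(s)          # suction falls as theta grows; sort it
        # water content of this fraction, interpolated onto the common grid
        total_wc += fm * np.interp(suction_grid, s[order], wc[order])
    return total_wc

# illustrative grading: three fractions of a fairly uniform sand (radii in m)
grid = np.logspace(3, 6, 60)           # 1 kPa ... 1 MPa
wc_curve = composite_swcc([2.5e-4, 1.75e-4, 1.0e-4], [0.2, 0.5, 0.3], grid)
```

plotting wc_curve against grid on a logarithmic suction axis gives the composite swcc; the one-size comparison of fig. 8 corresponds to calling composite_swcc with a single radius of 1.75e-4 m and a mass fraction of 1.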
6 conclusion

the proposed numerical model has given good results when studying coarse-grained soils. in general, the model can be used as a simple, quick indicator of the range in which the matric suction inside an investigated soil will lie. the model has proved to be effective for sand with a somewhat uniform grain-size distribution. it has also demonstrated that, when a soil is nearly dry, the remaining water in the voids may sustain very high negative pore pressure, and thus very high matric suction develops inside the soil. moreover, applying the proposed model with the effect of non-equal-sized grain particles taken into account has given more accurate results, when compared to the results available in the literature, than those obtained using only one particle size, since the latter case under-predicts the swcc for a simulated soil. the model still needs to be developed further to simulate the behaviour of unsaturated fine-grained soils more precisely.

notations
e - void ratio
g_s - specific gravity
n - porosity
n_c - number of points of contact between soil particles
t_s - surface tension of water
u_a - pore-air pressure
u_w - pore-water pressure
(u_a − u_w) - matric suction
α - contact angle between the water meniscus and the surface of the soil particle
θ - water retention angle
θ_c - critical angle of water retention
π - osmotic suction
ψ - total suction

references
[1] fredlund d. g., rahardjo h.: soil mechanics for unsaturated soils. john wiley and sons, inc., new york, ny 10158-0012, isbn 0-471-85008-x, 1993.
[2] fredlund d. g., xing a.: "equations for the soil-water characteristic curve." canadian geotechnical journal, vol. 31 (1994), p. 521–532.
[3] fredlund m. d., wilson g. w., fredlund d. g.: indirect procedures to determine unsaturated soil property functions. proceedings of the 50th canadian geotechnical conference, ottawa, ontario, canada, 1997.
[4] fredlund m. d., fredlund d. g., wilson g. w.: prediction of the soil-water characteristic curve from grain-size distribution and volume-mass properties. proceedings of the 3rd brazilian symposium on unsaturated soils, nonsat, rio de janeiro, brazil, vol. 1 (1997), p. 13–23.
[5] fredlund m. d., wilson g. w., fredlund d. g.: "use of the grain-size distribution for estimation of soil-water characteristic curve." canadian geotechnical journal, vol. 39 (2002), p. 1103–1117.
[6] harr m. e.: mechanics of particulate media: a probabilistic approach. mcgraw-hill international book co., new york, isbn 0-07-26698-6, 1977.
[7] likos w. j., lu n.: hysteresis of capillary cohesion in unsaturated soils. 15th asce engineering mechanics conference, columbia university, new york, june 2–5, 2002.
[8] murray e. j.: "an equation of state for unsaturated soils." canadian geotechnical journal, vol. 39 (2002), no. 1, p. 125–140.
[9] nishimura t., hirabayashi y., fredlund d. g., gan j. k.: "influence of stress history on the strength parameters of an unsaturated statically compacted soil." canadian geotechnical journal, vol. 36 (1999), p. 251–261.
[10] rektorys k.: survey of applicable mathematics. second revised edition, vol. 1, kluwer academic publishers (mathematics and its applications 280), isbn 0-7923-0680-5, 1994.
[11] shimada k., fuji h., nishimura s., nishiyama t.: change of shear strength due to surface tension and matric suction of pore water in unsaturated sandy soils.
proceedings of the 1st asian conference on unsaturated soils (unsat-asia 2000), singapore, isbn 90-5809-139-2, 2000, p. 147–152.
[12] tateyama k., fukui y.: understanding of some behaviour of unsaturated soil by theoretical study on capillary action. proceedings of the 1st asian conference on unsaturated soils (unsat-asia 2000), singapore, isbn 90-5809-139-2, 2000, p. 163–168.
[13] wan a. w. l., gray m. n., graham j.: on the relations of suction, moisture content, and soil structure in compacted clays. proceedings of the 1st international conference on unsaturated soils, paris, france, isbn 90 5410 584 4, vol. 1, 1995, p. 215–222.

ing. ahmed farouk ibrahim, phone: +420 224 354 555, e-mail: afarouk2000@hotmail.com
doc. ing. ladislav lamboj, csc., phone: +420 224 353 874, e-mail: lamboj@fsv.cvut.cz
ing. jan kos, csc., phone: +420 224 354 552, e-mail: jankos@fsv.cvut.cz
department of geotechnics, czech technical university in prague, faculty of civil engineering, thakurova 7, 166 29 praha 6, czech republic

artificial neural network approach for the identification of clove buds origin based on metabolites composition

rustam (a,*), agus yodi gunawan (a), made tri ari penia kresnowati (b)
acta polytechnica 60(5):440–447, 2020. doi:10.14311/ap.2020.60.0440
(a) institut teknologi bandung, faculty of mathematics and natural sciences, industrial and financial mathematics research group, jl. ganesha 10, 40132 bandung, indonesia
(b) institut teknologi bandung, faculty of industrial technology, food and biomass processing technology research group, jl. ganesha 10, 40132 bandung, indonesia
(*) corresponding author: rustam.math@s.itb.ac.id

abstract. this paper examines the use of an artificial neural network approach in identifying the origin of clove buds based on metabolites composition. generally, large data sets are critical for an accurate identification, and machine learning with large data sets leads to a precise identification based on origins. for clove buds, however, only small data sets are available, owing to the limited availability of metabolites composition data and the high cost of extraction. the results show that backpropagation and resilient propagation, with one and two hidden layers respectively, identify the clove buds origin accurately. backpropagation with one hidden layer offers 99.91 % and 99.47 % accuracy for training and testing data sets, respectively. resilient propagation with two hidden layers offers 99.96 % and 97.89 % accuracy for training and testing data sets, respectively.

keywords: artificial neural networks, backpropagation, resilient propagation, clove buds.

1. introduction

there is a variation in the flavour and aroma of different plantation commodities. for example, in indonesia, the clove buds from java have a prominent wooden aroma and a sour flavour, while those from bali have a sweet-spicy flavour [1]. arabica coffee from gayo has a lower acidity and a strong bitterness; in contrast, coffee from toraja has a medium browning, tobacco or caramel flavour, and is not too acidic or bitter. kintamani coffee from bali has a fruity flavour and acidity mixed with a fresh flavour, whereas coffee from flores has a variety of flavours ranging from chocolate, spicy, tobacco, strong, citrus, flowers and wood. coffee from java has a spicy aroma, while that from wamena has a fragrant aroma and no pulp [2].
the specific flavours and aromas are attributed to the composition of the commodities' metabolites; generally, specific metabolites are responsible for particular flavours and aromas. for this reason, it is vital to recognize the characteristics of each plantation commodity based on its metabolite composition. this study investigates the origin of clove buds. this helps to maintain the flavour of a product that uses clove buds as a mixture; moreover, the characteristics of food products can be predicted from the origin of the clove buds used, because of the differences in flavour and taste between regions [3].

metabolic profiling is a widely used approach for obtaining information related to the metabolites contained in a biological sample; it is a quantitative measurement of metabolites from biological samples [4, 5]. to give meaning to the metabolites data sets, the chemometrics technique was developed. this is a chemical sub-discipline that uses mathematics, statistics and formal logic to gain knowledge about chemical systems; it provides the maximum relevant information by analysing metabolites data sets from biological samples [6], and it is used for pattern recognition of metabolites data sets in complex chemical systems [3]. pattern recognition in biological samples identifies the specific metabolites, or biomarkers, that form a particular flavour and aroma.

artificial neural networks have been widely used in pattern recognition [7] and in other applications in various fields, as shown by several studies [8–13]. however, they have not been fully implemented for clove buds. the small data sets available limit the implementation of artificial neural networks for clove buds; this is attributed to the limited metabolite composition data and the high cost of extracting the metabolites. furthermore, some clove buds samples have zero metabolite concentration, which is caused by laboratory instruments that cannot detect metabolites whose values are very small. this study therefore implements artificial neural networks for pattern recognition in clove buds data sets, where each origin of clove buds has specific metabolites as biomarkers.

2. materials and methods

2.1. materials

this study uses the clove buds data sets obtained from kresnowati et al. [3], which examined clove buds from four origins in indonesia: java, bali, manado and toli-toli. each origin has three regions, so there are twelve regions in total. in the laboratory, eight experiments were carried out in each region, except for java with only six experiments. for each experiment, 47 types of metabolites were recorded. in matrix form, the data sets are 94 × 47, where the rows and columns represent the number of experiments and metabolites, respectively.

2.2. data preprocessing

in total, the clove buds data sets have a wide range, specifically between 10^−4 and 10. therefore, logarithmic transformations are used to obtain reliable numerical data. since some metabolites data have zero concentration (their concentrations lie below the detection threshold), the logarithmic transformation cannot be applied directly. the metabolite data with zero concentration are not removed, because they act as biological markers; instead, they are replaced with a value one order of magnitude smaller than the smallest concentration available. in this case, the zeros are replaced with 10^−5.
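as a minimal sketch of this preprocessing step (my own illustration: the matrix below is random placeholder data standing in for the real, confidential 94 × 47 measurements; only the 10^−5 floor is taken from the text):

```python
import numpy as np

# placeholder 94 x 47 metabolite matrix; the measurements from [3] are not
# public, so random non-negative values stand in for them
rng = np.random.default_rng(0)
raw = np.abs(rng.normal(size=(94, 47)))
raw[raw < 0.05] = 0.0          # mimic metabolites below the detection threshold

# zeros act as biomarkers and are kept, replaced by a value one order of
# magnitude below the smallest available concentration (1e-5 in the paper)
data = np.where(raw == 0.0, 1e-5, raw)

# logarithmic transformation to tame the 1e-4 ... 1e+1 range
log_data = np.log10(data)
```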
before implementing the artificial neural networks, one further preprocessing stage from [14] is added to normalize the values of the metabolites data. the normalization ensures that each datum has the same influence, or contribution, in determining its origin. the following normalization formula is used [15]:

$z_{kl} = \frac{x_{kl} - \bar{x}}{s}$, (1)

where $z_{kl}$ is the result of normalizing $x_{kl}$, $\bar{x}$ is the mean of the k-th experiment, and $s$ is

$s = \sqrt{\frac{\sum_{k=1}^{n} (x_{kl} - \bar{x})^2}{n - 1}}$. (2)

2.3. artificial neural network

artificial neural networks are a simplified representation of the human brain that simulates the learning process [16]. backpropagation and resilient propagation are learning algorithms widely used in artificial neural networks [17–27]. in this study, two different network architectures are used for each algorithm: the first architecture has two hidden layers and the second has one hidden layer.

2.3.1. backpropagation learning algorithm

the backpropagation learning algorithm is based on the repeated use of the chain rule to calculate the effect of each weight in the network on the error function $E$ [28]:

$\frac{\partial E}{\partial w_{ij}} = \frac{\partial E}{\partial o_i}\, \frac{\partial o_i}{\partial net_i}\, \frac{\partial net_i}{\partial w_{ij}}$, (3)

where $w_{ij}$ is the weight from the j-th neuron to the i-th neuron, $o_i$ is the output, and $net_i$ is the weighted sum of the inputs of neuron i. once the partial derivatives for each weight are known, the goal of minimizing the error function is pursued with gradient descent [28]:

$w_{ij}^{(t+1)} = w_{ij}^{(t)} - \epsilon\, \frac{\partial E}{\partial w_{ij}}^{(t)}$, (4)

where $t$ is the iteration and $0 < \epsilon < 1$ is the learning rate. as equation (4) shows, choosing a large learning rate (close to 1) allows oscillations, which can keep the error above the specified tolerance and lessen the identification accuracy. conversely, if the learning rate is too small (close to 0), many steps are needed for the error function $E$ to converge. to avoid both situations, the backpropagation learning algorithm is extended by a momentum parameter $0 < \alpha < 1$, which measures the effect of the previous step on the current one and accelerates the convergence of the error function [28]:

$\Delta w_{ij}^{(t)} = -\epsilon\, \frac{\partial E}{\partial w_{ij}}^{(t)} + \alpha\, \Delta w_{ij}^{(t-1)}$. (5)

to activate the neurons in the hidden and output layers, the sigmoid activation function is used. it has three properties essential for backpropagation and resilient propagation: it is bounded, monotonic and continuously differentiable. it converts the weighted sum of the inputs into an output signal for each neuron i [29]:

$o_i = f(I_i) = \frac{1}{1 + e^{-\sigma I_i}}$, (6)

where $I_i$ is the weighted sum of the inputs of the i-th neuron, $\sigma$ is the slope parameter of the sigmoid activation function, and $o_i$ is the output of the i-th neuron. the threshold used on the output layer for the sigmoid activation function is

$o_i = 1$ if $o_i \ge 0.5$, and $o_i = 0$ if $o_i < 0.5$. (7)

the weighted sum of the inputs is given by the following equation [29]:

$\sum_{i=1}^{n} w_{ij}\, o_i + w_{bj}\, o_b$, (8)

where the sum over i represents the input received from all neurons in the input layer and b is the bias neuron. the weight $w_{ij}$ is associated with the connection from the i-th neuron to the j-th neuron, while the weight $w_{bj}$ relates to the connection from the bias neuron to the j-th neuron. the weighted sums obtained in the hidden and output layers are activated by substituting the weighted sum from equation (8) as the exponent in equation (6).
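the update rules (3)–(8) translate directly into code. the sketch below (my own didactic illustration, not the authors' implementation) performs one momentum-accelerated gradient step for a 47-15-4 network, with ε = 0.9 and α = 0.1 as used in section 3:

```python
import numpy as np

rng = np.random.default_rng(1)
W1 = rng.normal(scale=0.1, size=(47, 15)); b1 = np.zeros(15)
W2 = rng.normal(scale=0.1, size=(15, 4));  b2 = np.zeros(4)
vW1 = np.zeros_like(W1); vb1 = np.zeros_like(b1)   # previous weight steps
vW2 = np.zeros_like(W2); vb2 = np.zeros_like(b2)

def sigmoid(x, sigma=1.0):
    return 1.0 / (1.0 + np.exp(-sigma * x))        # eq. (6)

def train_step(x, t, eps=0.9, alpha=0.1):
    """one epoch over a batch x (n x 47) with binary-coded targets t (n x 4)."""
    global vW1, vb1, vW2, vb2
    h = sigmoid(x @ W1 + b1)                       # hidden layer, eqs. (8)+(6)
    o = sigmoid(h @ W2 + b2)                       # output layer
    # error e = 0.5 * sum (t - o)^2; delta terms via the chain rule, eq. (3)
    d_o = (o - t) * o * (1.0 - o)
    d_h = (d_o @ W2.T) * h * (1.0 - h)
    gW2 = h.T @ d_o; gb2 = d_o.sum(axis=0)
    gW1 = x.T @ d_h; gb1 = d_h.sum(axis=0)
    # eq. (5): new step = -eps * gradient + alpha * previous step
    vW2 = -eps * gW2 + alpha * vW2; vb2 = -eps * gb2 + alpha * vb2
    vW1 = -eps * gW1 + alpha * vW1; vb1 = -eps * gb1 + alpha * vb1
    for p, v in ((W2, vW2), (b2, vb2), (W1, vW1), (b1, vb1)):
        p += v                                     # in-place weight update
    # at evaluation time, eq. (7) thresholds the outputs: (o >= 0.5)
    return np.mean((t - o) ** 2)
```

looping train_step until the error target of 10^−3 or 5000 epochs reproduces the training regime described in the results section.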
2.3.2. resilient propagation learning algorithm

riedmiller et al. [28] proposed the resilient propagation learning algorithm as a development of the backpropagation algorithm. the algorithm directly adapts the weights based on local gradient information. riedmiller et al. [28] introduced an update value $\Delta_{ij}$ for each weight, which determines the size of the weight update. this adaptive update value evolves during the learning process based on its local sight on the error function $E$, according to the following learning rule [28]:

$\Delta_{ij}^{(t)} = \eta^{+} \Delta_{ij}^{(t-1)}$ if $\frac{\partial E}{\partial w_{ij}}^{(t-1)} \cdot \frac{\partial E}{\partial w_{ij}}^{(t)} > 0$; $\ \Delta_{ij}^{(t)} = \eta^{-} \Delta_{ij}^{(t-1)}$ if $\frac{\partial E}{\partial w_{ij}}^{(t-1)} \cdot \frac{\partial E}{\partial w_{ij}}^{(t)} < 0$; $\ \Delta_{ij}^{(t)} = \Delta_{ij}^{(t-1)}$ otherwise, (9)

where $0 < \eta^{-} < 1 < \eta^{+}$, and $\eta^{-}$ and $\eta^{+}$ are the decrease and increase factors, respectively. according to this adaptation rule, every time the partial derivative of the corresponding weight $w_{ij}$ changes its sign, which indicates that the last update was too big and the algorithm has jumped over a local minimum, the update value $\Delta_{ij}$ is decreased by the factor $\eta^{-}$. if the derivative retains its sign, the update value is slightly increased to accelerate the convergence in shallow regions [28]. once the update value for each weight is adapted, the weight update itself follows the rule that if the derivative is positive, the weight is decreased by its update value, and if the derivative is negative, the update value is added:

$\Delta w_{ij}^{(t)} = -\Delta_{ij}^{(t)}$ if $\frac{\partial E}{\partial w_{ij}}^{(t)} > 0$; $\ \Delta w_{ij}^{(t)} = +\Delta_{ij}^{(t)}$ if $\frac{\partial E}{\partial w_{ij}}^{(t)} < 0$; $\ \Delta w_{ij}^{(t)} = 0$ otherwise, (10)

$w_{ij}^{(t+1)} = w_{ij}^{(t)} + \Delta w_{ij}^{(t)}$. (11)

however, if the partial derivative changes sign, meaning that the previous step was too large and the minimum was missed, the previous weight update is reverted:

$\Delta w_{ij}^{(t)} = -\Delta w_{ij}^{(t-1)}$, if $\frac{\partial E}{\partial w_{ij}}^{(t-1)} \cdot \frac{\partial E}{\partial w_{ij}}^{(t)} < 0$. (12)

due to this 'backtracking' weight step, the derivative should change its sign once again in the following step. to avoid a second adaptation of the update value in the succeeding step, no adaptation is applied there; in practice, this is done by setting $\frac{\partial E}{\partial w_{ij}}^{(t-1)} = 0$ in the $\Delta_{ij}$ adaptation rule. the update values and weights are changed every time the whole set of patterns has been presented to the network once (learning by epoch). the adaptation and learning process of resilient propagation is summarized below; the minimum (maximum) operator delivers the minimum (maximum) of two numbers, and the sign operator returns +1 if its argument is positive, −1 if it is negative, and 0 otherwise.

for each weight and bias:
  if ( ∂E/∂w_ij^(t−1) · ∂E/∂w_ij^(t) > 0 ) then
    Δ_ij^(t) = minimum( Δ_ij^(t−1) · η⁺, Δ_max )
    Δw_ij^(t) = −sign( ∂E/∂w_ij^(t) ) · Δ_ij^(t)
    w_ij^(t+1) = w_ij^(t) + Δw_ij^(t)
  else if ( ∂E/∂w_ij^(t−1) · ∂E/∂w_ij^(t) < 0 ) then
    Δ_ij^(t) = maximum( Δ_ij^(t−1) · η⁻, Δ_min )
    w_ij^(t+1) = w_ij^(t) − Δw_ij^(t−1)
    ∂E/∂w_ij^(t) = 0
  else
    Δw_ij^(t) = −sign( ∂E/∂w_ij^(t) ) · Δ_ij^(t)
    w_ij^(t+1) = w_ij^(t) + Δw_ij^(t)
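the pseudocode above can be wrapped into a small per-weight-matrix optimizer. the following sketch is my own translation of eqs. (9)–(12); the initial update value Δ0 = 0.1 is the conventional choice from [28] and is not stated in this paper:

```python
import numpy as np

class Rprop:
    """rprop update for one weight array (eqs. 9-12); illustrative sketch."""
    def __init__(self, shape, eta_minus=0.5, eta_plus=1.2,
                 delta_min=1e-6, delta_max=50.0, delta_init=0.1):
        self.delta = np.full(shape, delta_init)   # per-weight update values
        self.prev_grad = np.zeros(shape)
        self.prev_step = np.zeros(shape)
        self.em, self.ep = eta_minus, eta_plus
        self.dmin, self.dmax = delta_min, delta_max

    def step(self, w, grad):
        sign_change = self.prev_grad * grad
        grew = sign_change > 0
        shrank = sign_change < 0
        # eq. (9): adapt the update value from the sign of the gradient product
        self.delta[grew] = np.minimum(self.delta[grew] * self.ep, self.dmax)
        self.delta[shrank] = np.maximum(self.delta[shrank] * self.em, self.dmin)
        step = -np.sign(grad) * self.delta         # eq. (10)
        # eq. (12): backtrack where the gradient changed sign, and zero the
        # stored gradient so the next step does not adapt the value again
        step[shrank] = -self.prev_step[shrank]
        grad = grad.copy(); grad[shrank] = 0.0
        w += step                                  # eq. (11)
        self.prev_grad, self.prev_step = grad, step
        return w
```

in a full training loop, one Rprop instance would be kept per weight matrix and bias vector, and step() called once per epoch with the batch gradient, matching the learning-by-epoch scheme described above.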
3. results and discussions

in this study, the percentages of training and testing data sets are 80 % and 20 %, respectively. the metabolites data sets form a 94 × 47 matrix; out of the 94 rows, 75 were chosen randomly as training data sets, while the remainder were used as testing sets. the random selection of the training data sets was repeated 30 times, so for each network architecture there are 30 values of the identification accuracy percentage, the coefficient of determination and the mean squared error (mse); the average is taken as the representative of the 30 values. in each network architecture, a learning rate (ε) of 0.9, a momentum parameter (α) of 0.1 and a maximum of 5000 epochs are used, with an error target of 10^−3. each origin is represented by a binary code: specifically, the binary code for the java origin is 1000, bali 0100, manado 0010 and toli-toli 0001. the identification accuracy and the mse are calculated as shown in equations (13) and (14):

$\%\ \mathrm{accuracy} = \frac{a}{k} \cdot 100\ \%$, (13)

where $a$ is the number of origins identified correctly and $k$ is the total number. the mse is calculated by the following equation [29]:

$\mathrm{mse} = \frac{1}{m \cdot n} \sum_{p=1}^{m} \sum_{k=1}^{n} (t_{kp} - o_{kp})^2$, (14)

where $t_{kp}$ is the desired target, $o_{kp}$ the network output, and $p$ the variable corresponding to the number of origins. the agreement between the expected target and the network output was evaluated based on the coefficient of determination $r^2$, calculated using the following equation [21]:

$r^2 = 1 - \frac{\frac{1}{n} \sum_{k=1}^{n} (t_{kp} - o_{kp})^2}{\frac{1}{n-1} \sum_{k=1}^{n} (t_{kp} - \bar{t}_{kp})^2}$, (15)

where $\bar{t}_{kp}$ is the average desired target.

in this study, backpropagation and resilient propagation were used, each with two hidden layers and with one hidden layer. for one hidden layer, the number of neurons was determined using the formula proposed by shibata and ikeda in 2009 [30], namely $n_h = \sqrt{n_i \cdot n_o}$, where $n_h$, $n_i$ and $n_o$ represent the numbers of hidden, input and output neurons, respectively. based on this formula, $n_h = \sqrt{47 \cdot 4} = 13.71$, which was taken as 15 neurons in this study. experiments were conducted to evaluate whether one hidden layer with 15 neurons leads to a better identification accuracy than two hidden layers; for two hidden layers, the numbers of consecutive neurons were 3-5 (8 in total), 4-6 (10), 5-7 (12) and 6-8 (14), so the total number of hidden neurons never exceeded 15.

3.1. backpropagation (b-prop) with two hidden layers

in this section, the backpropagation learning algorithm with two hidden layers was used, with the number of neurons in the hidden layers varied but never exceeding 15. there were four variations of the network architecture: 47-3-5-4, 47-4-6-4, 47-5-7-4 and 47-6-8-4. the input layer consists of 47 neurons, based on the number of metabolites; the output layer consists of 4 neurons, according to the number of clove buds origins. table 1 shows that the network architecture 47-3-5-4 gives the highest identification accuracy and coefficient of determination for both the training and the testing data sets; likewise, this architecture gives the smallest mse for both. table 1 also shows that increasing the number of neurons in backpropagation with two hidden layers decreases the network performance. this is in line with shafi et al. (2006) [31], who stated that increasing the number of neurons in the hidden layer only heightens the complexity of the network and does not increase the accuracy of the pattern recognition.

table 1. backpropagation with two hidden layers.
network architecture | mse (training / testing) | accuracy % (training / testing) | r² (training / testing)
47-3-5-4 | 0.10346 / 0.11357 | 76.98 / 73.68 | 0.81 / 0.76
47-4-6-4 | 0.13084 / 0.13547 | 62.00 / 57.54 | 0.64 / 0.61
47-5-7-4 | 0.14889 / 0.15884 | 49.73 / 41.40 | 0.51 / 0.42
47-6-8-4 | 0.15388 / 0.15874 | 50.04 / 42.46 | 0.48 / 0.43

3.2. backpropagation (b-prop) with one hidden layer

the backpropagation learning algorithm with one hidden layer was implemented to compare its results with those of two hidden layers. the results obtained are shown in table 2.

table 2. backpropagation with one hidden layer.
network architecture | mse (training / testing) | accuracy % (training / testing) | r² (training / testing)
47-15-4 | 0.0721 / 0.0773 | 99.91 / 99.47 | 0.99 / 0.98
table 2 shows that the network architecture 47-15-4 identifies the clove buds origin effectively: the identification accuracy is 99.91 % and 99.47 % for the training and testing data sets, respectively. moreover, the mse value is also smaller than with two hidden layers. for the backpropagation algorithm, the results thus show that one hidden layer is better than two. this is in line with villiers and barnard [32], who stated that a network architecture with one hidden layer is, on average, better than one with two hidden layers; they concluded that networks with two hidden layers are more difficult to train, and established that this behaviour is caused by the local minimum problem, to which two-hidden-layer networks are more prone during training.

3.3. resilient propagation (r-prop) with two hidden layers

the resilient propagation learning algorithm contains several parameters, namely the upper and lower limits of the update values and the decrease and increase factors. in this study, the range of update values is limited by the upper limit $\Delta_{max} = 50$ and the lower limit $\Delta_{min} = 10^{-6}$, with the decrease and increase factors $\eta^{-} = 0.5$ and $\eta^{+} = 1.2$, respectively; the reasons for choosing these values are given in [28]. as in section 3.1, the resilient propagation learning algorithm is applied to network architectures with two hidden layers, with the number of neurons varied but not exceeding 15. there are four variations of the network architecture: 47-3-5-4, 47-4-6-4, 47-5-7-4 and 47-6-8-4. the results in table 3 show that the network architecture 47-5-7-4 gives the highest identification accuracy of the clove buds origin: 99.96 % and 97.89 % for the training and testing data sets, respectively.

table 3. resilient propagation with two hidden layers.
network architecture | mse (training / testing) | accuracy % (training / testing) | r² (training / testing)
47-3-5-4 | 0.07209 / 0.08111 | 99.73 / 97.37 | 0.98 / 0.95
47-4-6-4 | 0.07162 / 0.08316 | 99.69 / 96.49 | 0.99 / 0.94
47-5-7-4 | 0.07160 / 0.07961 | 99.96 / 97.89 | 0.99 / 0.96
47-6-8-4 | 0.07160 / 0.07978 | 99.78 / 97.72 | 0.99 / 0.96

3.4. resilient propagation (r-prop) with one hidden layer

in this section, the resilient propagation learning algorithm is implemented with one hidden layer. as in section 3.2, the number of neurons in the hidden layer is 15, giving the network architecture 47-15-4. table 4 shows that this architecture identifies the origin of clove buds with an identification accuracy of 99.86 % and 94.74 % on the training and testing data sets, respectively.

table 4. resilient propagation with one hidden layer.
network architecture | mse (training / testing) | accuracy % (training / testing) | r² (training / testing)
47-15-4 | 0.07158 / 0.07932 | 99.86 / 94.74 | 0.99 / 0.92

the resilient propagation algorithm thus provides identification results with a very high accuracy for both architectures, although the architecture with one hidden layer has a slightly lower accuracy. tables 3 and 4 show that the two-layered resilient propagation with fewer neurons performs better than the single layer with more neurons. this is in line with santra et al. [24], who established that the performance of two hidden layers with 8-10 (18) neurons is better than that of one hidden layer with 62 neurons.
the summaries of the best identification accuracies and determination coefficients are shown in figures 1, 2, 3 and 4, and for each network architecture the smallest mse on the training and testing data sets is shown in figures 5 and 6, respectively.

(figure 1: identification accuracy percentage of training data sets. figure 2: identification accuracy percentage of testing data sets. figure 3: determination coefficient of training data sets. figure 4: determination coefficient of testing data sets. figure 5: mse of training data sets. figure 6: mse of testing data sets.)

the results of the identification of the origins of clove buds have thus been obtained. in the small data set category, backpropagation with one hidden layer provides an accurate identification on both the training and the testing data sets; an accurate identification of the clove buds origins is also obtained using the resilient propagation algorithm with two hidden layers. the neural network models obtained in this paper can serve as a reference from a scientific perspective: for instance, they can be used in future studies to identify the origin of various plantation commodities with small metabolites data sets. at the moment, the most appropriate way of determining the origin of a plantation commodity is qualitative, relying on the services of a flavourist to evaluate the flavour and taste, because each commodity has a specific flavour and taste based on the origin of its region. furthermore, data sets for different origins of clove buds have not been reported in the literature, and thus no direct comparison can be presented in this paper.

4. conclusions

this paper demonstrated the potential and ability of a neural network approach with backpropagation and resilient propagation learning algorithms to identify the clove buds origin based on metabolites composition. the work was divided into two parts. the first was an identification of the clove buds origin using the backpropagation learning algorithm, for which two network architectures were constructed, one having a single hidden layer and the second one having two. the results showed that the use of one hidden layer identifies the clove buds origin accurately, specifically with 99.91 % and 99.47 % accuracy on the training and testing data sets, respectively. the second part involved the identification of the clove buds origin using the resilient propagation learning algorithm, again with two network architectures, one having a single hidden layer and the second one having two. the results showed that the use of two hidden layers gives an accurate clove buds origin identification, with 99.96 % and 97.89 % accuracy on the training and testing data sets, respectively.
from these results, it was concluded that for the identification of small metabolites data sets from a plantation commodity, the backpropagation algorithm with one hidden layer and the resilient propagation algorithm with two hidden layers should be used. this paper also confirmed the contribution of artificial neural networks to the pattern recognition of metabolites data sets obtained by the metabolic profiling technique.

acknowledgements
the authors express gratitude to the government of indonesia, especially the endowment fund for education (lpdp, lembaga pengelola dana pendidikan), for its funding.

references
[1] l. broto. derivatisasi minyak cengkeh, dalam cengkeh: sejarah, budidaya dan industri, chap. 23. indesco jakarta dan biologi uksw salatiga, 2014 (in indonesian).
[2] coffeeland 2010. jenis kopi arabika terbaik dari berbagai daerah di indonesia (in indonesian). https://coffeeland.co.id/product-category/kopi-supply/.
[3] m. t. a. p. kresnowati, r. purwadi, m. zunita, et al. metabolite profiling of four origins indonesian clove buds using multivariate analysis. report research collaboration pt hm sampoerna tbk and institut teknologi bandung (confidential report), 2018.
[4] j. kopka, a. fernie, w. weckwerth, et al. metabolite profiling in plant biology: platforms and destinations. genome biology 5(6):109, 2004. doi:10.1186/gb-2004-5-6-109.
[5] s. p. putri, e. fukusaki. mass spectrometry-based metabolomics: a practical guide. crc press, 2016. doi:10.1007/s13361-015-1246-3.
[6] d. l. massart, b. g. m. vandeginste, s. n. deming, et al. chemometrics: a textbook. elsevier amsterdam, 1988. doi:10.1002/cem.1180020409.
[7] c. t. leondes. image processing and pattern recognition. elsevier, 1998.
[8] b. samir, a. boukelif. new approach for online arabic manuscript recognition by deep belief network. acta polytechnica 58(5), 2018. doi:10.14311/ap.2018.58.0297.
[9] a. n. ponce, a. a. behar, a. o. hernández, v. r. sitar. neural networks for self-tuning control systems. acta polytechnica 44(1), 2004.
[10] m. chvalina. demand modelling in telecommunications: comparison of standard statistical methods and approaches based upon artificial intelligence methods including neural networks. acta polytechnica 49(2):48–52, 2009.
[11] i. bukovsky, m. kolovratnik. a neural network model for predicting nox at the melnik 1 coal-powder power plant. acta polytechnica 52(3):17–22, 2012.
[12] p. kutilek, s. viteckova. prediction of lower extremity movement by cyclograms. acta polytechnica 52(1), 2012.
[13] d. novák, d. lehkỳ. neural network based identification of material model parameters to capture experimental load-deflection curve. acta polytechnica 44(5-6), 2004.
[14] rustam, a. y. gunawan, m. t. a. p. kresnowati. the hard c-means algorithm for clustering indonesian plantation commodity based on metabolites composition. journal of physics: conference series, vol. 1315, p. 012085. iop publishing, 2019. doi:10.1088/1742-6596/1315/1/012085.
[15] t. beltramo, m. klocke, b. hitzmann. prediction of the biogas production using ga and aco input features selection method for ann model. information processing in agriculture, 2019. doi:10.1016/j.inpa.2019.01.002.
[16] l. fausett. fundamentals of neural networks: architectures, algorithms, and applications. prentice-hall, inc., 1994.
[17] i. aizenberg, c. moraga. multilayer feedforward neural network based on multi-valued neurons (mlmvn) and a backpropagation learning algorithm. soft computing 11(2):169–183, 2007. doi:10.1007/s00500-006-0075-5.
[18] e. m. johansson, f. u. dowla, d. m. goodman. backpropagation learning for multilayer feed-forward neural networks using the conjugate gradient method. international journal of neural systems 2(04):291–301, 1991. doi:10.1142/s0129065791000261.
[19] t. t. pleune, o. k. chopra. using artificial neural networks to predict the fatigue life of carbon and low-alloy steels. nuclear engineering and design 197(1-2):1–12, 2000. doi:10.1016/s0029-5493(99)00252-6.
[20] k. l. kaiser, s. p. niculescu, g. schüürmann. feed forward backpropagation neural networks and their use in predicting the acute toxicity of chemicals to the fathead minnow. water quality research journal 32(3):637–658, 1997. doi:10.2166/wqrj.1997.037.
[21] e. el tabach, l. adishirinli, n. gascoin, g. fau. prediction of transient chemistry effect during fuel pyrolysis on the pressure drop through porous material using artificial neural networks. journal of analytical and applied pyrolysis 115:143–148, 2015. doi:10.1016/j.jaap.2015.07.010.
[22] r. chayjan, m. esna-ashari. isosteric heat and entropy modeling of pistachio cultivars using neural network approach. journal of food processing and preservation 35(4):524–532, 2011. doi:10.1111/j.1745-4549.2010.00498.x.
[23] a. d. anastasiadis, g. d. magoulas, m. n. vrahatis. new globally convergent training scheme based on the resilient propagation algorithm. neurocomputing 64:253–270, 2005. doi:10.1016/j.neucom.2004.11.016.
[24] a. k. santra, n. chakraborty, s. sen. prediction of heat transfer due to presence of copper–water nanofluid using resilient-propagation neural network. international journal of thermal sciences 48(7):1311–1318, 2009. doi:10.1016/j.ijthermalsci.2008.11.009.
[25] l. m. patnaik, k. rajan. target detection through image processing and resilient propagation algorithms. neurocomputing 35(1-4):123–135, 2000. doi:10.1016/s0925-2312(00)00301-5.
[26] d. fisch, b. sick. training of radial basis function classifiers with resilient propagation and variational bayesian inference. in 2009 international joint conference on neural networks, pp. 838–847. ieee, 2009. doi:10.1109/ijcnn.2009.5178699.
[27] m. shiblee, b. chandra, p. k. kalra. learning of geometric mean neuron model using resilient propagation algorithm. expert systems with applications 37(12):7449–7455, 2010. doi:10.1016/j.eswa.2010.04.018.
[28] m. riedmiller, h. braun. a direct adaptive method for faster backpropagation learning: the rprop algorithm. in proceedings of the ieee international conference on neural networks, vol. 1993, pp. 586–591. san francisco, 1993. doi:10.1109/icnn.1993.298623.
[29] p. bhagat. pattern recognition in industry. elsevier, 2005. doi:10.1016/b978-0-08-044538-0.x5054-x.
[30] k. shibata, y. ikeda. effect of number of hidden neurons on learning in large-scale layered neural networks. in 2009 iccas-sice, pp. 5008–5013. ieee, 2009.
[31] i. shafi, j. ahmad, s. i. shah, f. m. kashif. impact of varying neurons and hidden layers in neural network architecture for a time frequency application. in 2006 ieee international multitopic conference, pp. 188–193. ieee, 2006. doi:10.1109/inmic.2006.358160.
[32] j. de villiers, e. barnard. backpropagation neural nets with one and two hidden layers. ieee transactions on neural networks 4(1):136–141, 1993. doi:10.1109/72.182704.

the danger of toxic substances

m. v. jokl

abstract: toxic (harmful) gases enter building interiors partly from outdoors (sulfur oxides, nitrogen oxides, carbon monoxide, ozone, smog and acid rains) and partly originate indoors, either as a result of human activity (carbon monoxide, tobacco smoke, nitrogen oxides, ozone, hydrocarbons) or by emanating from building materials (formaldehyde, volatile organic compounds). the human organism is most often exposed to cigarette smoke (nonsmokers are especially endangered, as cigarette smoke devastates the pulmonary and cardiovascular systems) and to smog entering from outdoors, paradoxically during sunny weather. preventing the production of toxics is the most effective measure, e.g., coaxing smokers out of "civilized" areas, using energy rationally (i.e., conserving energy), turning to pure fuels, and increasing energy production by non-combustion technologies. besides ventilation and air filtration, toxic gases can be removed to a remarkable extent by plants (which decay the substances into nontoxic gases) and by air ionization. review article.

keywords: toxic substances, hygienic problems of buildings, health problems within buildings, sick building syndrome.

gaseous toxic agents occurring in the interior of a building and affecting the total state of the human organism form a constituent of the microenvironment known as the toxic microclimate [3, 10]. odor agents in higher concentrations may also be harmful, but some toxic agents do not smell at all in any concentration (e.g., carbon monoxide).

1 sources of toxic gases

the components of the toxic microclimate are toxic gases, i.e., gaseous components of the atmosphere which produce pathological changes in an exposed subject. there are organic and inorganic gases. they either enter the interior from the outdoor atmosphere, or they occur within the building, produced by human activity or released from building materials [6, 7, 8].

1.1 outdoor air as a source of toxic substances

from the outdoor atmosphere, toxics enter the interior: carbon monoxide, sulfur dioxide and trioxide, oxides of nitrogen, ozone, some hydrocarbons and, especially in recent times, smog and acid rains (fig. 1). carbon monoxide (co) is produced mainly by gasoline engines and by boiler furnaces and stoves with imperfect (incomplete) combustion. its affinity to building materials is very low, so its concentration decreases very slowly in an unventilated room (only by diffusion). sulfur dioxide (so2) and sulfur trioxide (so3) are produced by burning fossil fuels containing sulfur; combined with water, acids are produced (e.g., concentrated sulphuric acid, h2so4). the concentration of sulfur dioxide decreases rapidly in rooms with lime plaster and usual furniture if there is ventilation by infiltration (air change through untight windows, doors and walls): just one hour after the windows are closed, no concentration can be measured, but after the windows are opened, the indoor sulfur dioxide concentration equals the outdoor concentration within a few minutes. in air-conditioned rooms, the indoor sulfur dioxide concentration is maintained at 30 to 40 % of the outdoor concentration. oxides of nitrogen (nox) are produced from atmospheric nitrogen by high burning temperatures; the main sources are
diesel engines, boiler houses in power stations and industrial plants, and the operation of gas appliances. these oxides can combine with various building materials within the interior, so their concentration decreases rapidly even if there is ventilation by infiltration only. complexes of these agents form so-called acid rains during rainy weather and so-called smog during sunny weather, by the following processes. oxides of nitrogen and oxides of sulfur, mixed with other natural chemicals (water vapor), form acids in the higher levels of the atmosphere and fall as acid rain (fig. 1: the main sources of air pollution outdoors, and smog and acid rain formation, with the production of each harmful agent as a percentage of total emissions). these rains, known to be dangerous for forests, also have a negative effect on building structures. smog (smoke + fog formed from oxides of nitrogen and other harmful agents, not from water vapor) is a result of air pollution: nitrogen dioxide (no2), mixed with carbon monoxide and water vapor, is decayed by intensive uv radiation into nitrogen monoxide (no) and very reactive atomic oxygen (o). atomic oxygen bonded with molecular oxygen (o2) forms ozone (o3). additional toxic gases are also produced by these photochemical processes (photochemical smog), e.g., peroxoacetyl compounds (pans), aldehydes and organic acids. these gases are produced from hydrocarbons, e.g., from methane, ethane, ethylene, propane or butane, and form about one third of photochemical smog; the remaining smog contains ozone and co. intensive uv radiation is a condition for smog formation: the smog concentration thus increases towards midday and is lower in the morning and later in the afternoon. high relative humidity disturbs the penetration of uv radiation, and thus also smog formation (it decreases almost to zero at a relative humidity of about 75 %). the indoor ozone concentration is usually about a half of the outdoor ozone concentration. lightning is the most common source of ozone. ozone combines with oxidizable organic materials in the interior.
thus, if there is no supply of air containing ozone, the ozone concentration decreases exponentially (i.e., very rapidly), dropping below 20 % of the original (starting) value within 30 minutes. the decrease is faster in rooms with rubber, plastic and textile surfaces than in interiors with metal and glass surfaces. no differences were found between apartments with gas and electric kitchens (from a theoretical point of view, a faster decay of ozone may be supposed in a room with gas appliances).

vehicles (smoke from vehicle engines, evaporation from fuel tanks) and the chimneys of furnaces with incomplete combustion are the main sources of hydrocarbons.

1.2 indoor sources of toxics

toxic substances in building interiors are produced by human activity and are released from building materials.

carbon monoxide, due to human activity, is the most common component of the toxic microclimate. the sources are usually internal combustion engines (up to 7 % co when idling), cigarette smoking, and various stoves and boilers. during complete combustion, smoke contains about 0.2 to 0.5 % co; during incomplete combustion, the percentage is substantially higher. city gas piping may also be a source of carbon monoxide. co concentrations were measured inside kitchens and living rooms during non-summer months: inside more than 6 % of the apartments it was over 10 ppm (12 000 µg/m3), and even up to 60 ppm (70 000 µg/m3) within locally heated rooms. these values are not toxic, but long-term exposure may cause an increase in accidents and may affect the overall state of the human organism. the main reasons for high carbon monoxide concentrations are an insufficient air supply (unsuitable location of inlets or blocked inlets), partially blocked injectors of gas burners, leaky combustion chambers, unsuitable fuel in stoves, and leaky chimneys in old buildings.

oxides of nitrogen, apart from carbon monoxide, are also emitted by gas appliances. the concentration of nitrogen dioxide may be higher by about 50 µg/m3 within kitchens and by about 20 to 25 µg/m3 in bedrooms than in the outdoor air (about 10 µg/m3). nitrogen dioxide concentrations were about 5 µg/m3 higher within buildings where propane was used than in houses with natural gas, as a result of different gas compositions. there was even a lower nitrogen dioxide concentration in electric kitchens (about 6 µg/m3) than outdoors, probably as a result of increased air infiltration into the kitchens. a gas cooker is therefore a worse source of nitrogen dioxide than exhaust fumes from cars coming from outdoors.

dangerous ozone concentrations may be reached during long-term operation of artificial sunlight (a solarium) inside a room: the ozone is produced by the impact of uv radiation on air molecules. photocopiers and laser printers are the most frequent sources of ozone nowadays: they produce about 100 µg/min (20–1350 µg/min) of ozone. these emissions can cause an ozone concentration of 200 ppb (400 µg/m3) in a badly ventilated room.

the main sources of hydrocarbons indoors, besides gas and oil appliances, are various detergents, insecticides, pesticides, herbicides, impregnants, preservatives, fungicides, etc. they are also produced by tobacco smoking.

formaldehyde, styrene, and various organic agent mixtures may be released from building materials.
building elements (wood fiber plates, chipboards, sawdust boards, insulation boards) made from granulated organic raw materials and joined with urea-formaldehyde and phenol-formaldehyde adhesives are very often sources of formaldehyde, as are some plastics, varnishes, enamels and lacquers. formaldehyde may also be released from finishing agents in textiles; it arises during incomplete fuel combustion, it is present in tobacco smoke, and it is an intermediate compound of the photochemical oxidation of hydrocarbons in the atmosphere. formaldehyde is also used in the cosmetics industry as an antihydroticum (antihydrating agent) and in medical care as a disinfectant (bactericide) and antimycoticum (fungicide) [2].

some other toxic substances may be released from plastics inside a room: styrene from polystyrene, and mixtures of organic gases from coatings (especially if heated). the decay products of various materials are listed in table 1.

table 1: decay products of various building materials. the materials considered are polyolefins, polystyrene, pvc (rigid and softened), methyl methacrylate, acrylonitrile, phenol resins, polyesters, epoxides, melamine resins, polyamides, polyurethanes, wool, and wool with silk. co, co2 and h2o arise from all of these materials; depending on the material, the further decay products include lower hydrocarbons, aldehydes, lower acids, higher flavoring compounds, mono- to tetra-styrene, soot, hcl, phosgene, cl2, chlorohydrocarbons, methacrylic acid ester, saturated esters, acrylonitrile, nh3, hcn, higher-boiling nitrogen compounds, formaldehyde, formic acid, phenol, amines, cyanates, isocyanates, methanol, acetic acid, h2s and ethylbenzene, each arising from one or a few specific materials.

teflon, used in pots and pans, can serve as an example. teflon decays when warmed to temperatures about 100–150 °c higher than the usual temperatures for baking and frying, and irritating and harmful gases are produced. for this reason, pots and pans with a teflon layer should not be heated when empty, without food. smoking during teflon processing is also not allowed, because the glowing end of a cigarette decays teflon dust, and inhaling the smoke that is produced causes an illness known as "polymer fever". the dangers of using teflon start at temperatures over about 260 °c (depending on the type of teflon).

formaldehyde, various hydrocarbons, toluene, benzene and other agents released from various materials in a building interior are referred to jointly as volatile organic compounds (or chemicals), vocs. vocs in higher concentrations are, of course, harmful [4, 5, 6].

the application of cyanoacrylate adhesives can also be dangerous. according to the finnish institute of occupational health, they cause eye irritation, asthma and skin diseases in 1.4 % of the population (predominantly women).

tobacco smoke must also be considered toxic, as proved by recent research. the human organism is also seriously threatened by smoke produced by other people, the so-called environmental tobacco smoke (ets).

2 impact of toxic gases on the human organism

toxic strain depends on the kind of toxic agent (noxa); each one usually has a specific impact on the human body (fig. 2). the most common noxa, carbon monoxide, affects subjects in two ways:

a) by an affinity to hemoglobin (200 to 300 times higher than that of oxygen) and by forming carboxyhemoglobin (cohb). the consequence is hypoxemia and hypoxia: the hemoglobin is not able to transfer oxygen, and even the remaining oxyhemoglobin is unable to deliver oxygen to the tissues.
b) by blocking respiration enzymes (cytochrome oxidases): the cells of the affected tissues are toxically damaged and the impact of hypoxia is increased. e.g., an eight-hour stay in air with a concentration of 90 mg/m3 has the same effect on the human organism as the loss of one liter of blood.

the consequences of poisoning of the human body by co are headache, loss of coordination, loss of concentration and even apathy, pains throughout the body sometimes changing into cramps, loss of consciousness and even death [9]. poisonings are classified as: light (short, several minutes of unconsciousness), medium heavy (1 to 4 hours of unconsciousness) and very heavy (over 8 hours of unconsciousness).

note: subjects suffering from ischemic heart disease may fall prey to angina pectoris as a result of even a small increase of co in the air, causing a cohb increase from 0.8 % (the physiological value) to three times that value.

sulfur dioxide and sulfur trioxide have an irritating effect at higher concentrations. foggy weather and an increased quantity of dust, soot and fly ash in the atmosphere make the situation worse, due to absorption of so2 and so3 by these particles: the result is increased irritation of the breathing passages. higher so2 concentrations produce damage (etching) to the breathing pathways. this is why smog disasters lead to increased morbidity and mortality. sulfur oxides at higher concentrations also damage vegetation and inorganic materials.

nitrogen oxides are irritant agents (especially no2) and probably decrease the immunity of the human body. bound to hemoglobin they produce methemoglobin, which is dangerous especially for babies (their livers are not able to remove it), though not as dangerous for them as water containing nitrates. nitrogen oxides play a decisive role in the photochemical reactions producing ozone and peroxoacetylnitrates (pans) which, besides their direct toxicity for the breathing organs, stimulate the inception of cancer. epidemiological studies in the uk prove that there is a higher rate of upper breathing pathway illnesses in pre-school children living in apartments with gas cookers than in those living in apartments with electric cookers. an increase by 30 µg/m3 (16 ppb) in the no2 concentration can, over a period of time, lead to a 20 % increase in respiration illness among children. gas cookers should therefore be fitted with exhaust hoods.

exhaust fumes from diesel engines produce, in addition to nox and other toxic agents, 3-nitrobenzanthrone, which is the most carcinogenic agent discovered so far. it produces more mutations than 1,6-dinitropyrene, which was previously thought to be the most dangerous. despite the low concentration of these agents in exhaust fumes, stricter limits are imposed on overloaded trucks at present, because the greater the overloading, the greater the exhaust fumes.
ozone irritates the eyes and the delicate membranes of the lungs, causing inflammations with many symptoms, e.g., pains in the lungs, coughing, irritation of the throat. people who go in for jogging get tired rapidly, and building workers show a tendency to suffer from many colds. sensitive people show the first impacts at an ozone concentration of 10 to 20 ppb (20 to 40 µg/m3) (the bad smell also has an impact). first symptoms, e.g., headaches and other difficulties, start to appear at about 50 ppb (100 µg/m3). itching of the mucosis, increased breathing difficulties, decreasing physical performance and increased tiredness start from 100 ppb (200 µg/m3). according to research results, intensive outdoor activities should be cancelled when average ozone concentrations over a period of three hours reach a level of 150 ppb (300 µg/m3). ozone is dangerous at effective bactericidal and deodorant concentrations, i.e., between 150 and 250 ppb (300 to 490 µg/m3). these concentrations can threaten the function of the lungs, especially in children: there is evidence of lung damage after 16 to 20 hours of exposure at a concentration of 120 ppb (240 µg/m3). ozone also decreases the resistance of the lungs to infection, and can stimulate an asthmatic attack. an ozone concentration exceeding 9 000 ppb (17 700 µg/m3) can cause an acute lung swelling followed by overall changes in the organism (bleeding, loss of weight, death).

ozone is also dangerous for plants, especially for conifers, while deciduous trees are more resistant. plants that are very sensitive to ozone: onions, oats, barley, tomatoes, alfalfa, clover, tobacco, beans, radishes, spinach, potatoes, wheat and corn. a decrease in yield and quality, and changes in appearance, flavor, durability and composition have been found in these plants. a decrease in yield of up to 60 % occurs in grapes and citrus fruits.

fig. 2: impact of a toxic agent on the human organism (air = exposing air; 1 nose, throat, lungs: inhaled ozone decreases the resistance of the lungs to infection, making asthma worse; 2 eyes: formaldehyde and peroxyacetylnitrate irritate the eyes; 3 brain: inhaling co can decrease motion coordination and the ability to concentrate, probably as a result of decreased oxygen delivery to the brain; 4 heart: co decreases oxygen delivery; a low oxygen concentration in the blood exacerbates angina pectoris, i.e., pains in the chest)

formaldehyde is a powerful irritant of the eye mucosis and the upper breathing pathways. it can be perceived at a concentration of about 400 µg/m3. concentrations of 200 to 300 µg/m3 are tolerable, 400 to 500 µg/m3 produces powerful coughing, and concentrations over 10 000 µg/m3 are bearable only for a few minutes. there is evidence of higher formaldehyde concentrations in new buildings. for this reason, different values are prescribed for new and old buildings in some regulations (e.g., the finnish standard on indoor climate and building ventilation prescribes an admissible formaldehyde concentration of 1150 µg/m3 for new buildings and 800 µg/m3 for existing buildings). part of the population is allergic to formaldehyde. this may be inborn or acquired through frequent contact with formaldehyde. two types of this allergy are known: dermatic and bronchial. formaldehyde dries the skin, decreases immunity and is thought to produce cancer. it affects the menstrual cycle, causes pregnancy disorders and lowers the weight of newborn children.
it is toxic at concentrations unbearable for humans, causing great irritation to the eyes and to the breathing passages.

tobacco smoke not only produces a bad smell (see the chapter on the odor microclimate). at higher, long-term concentrations it also threatens the human organism with carcinogenic components: 3,4-benzpyrene, hydrazine, vinyl chloride, polycyclic aromatic hydrocarbons, arsenic, nickel, chromium, tar and nicotine, nornicotine, myosmine and anatabine. carbon monoxide, creating carboxyhemoglobin in the blood, increases from an original value of 1 % up to 15 % in the blood of smokers. some irritating agents are also produced by smoking: formaldehyde, acetaldehyde, acrolein, acetone, nitrogen oxides and sulfane [9]. these toxic agents damage the mucosis of the upper breathing pathways, which loses the use of its cilia as a filter (rhythmic motions remove excessive mucus, trapped dust particles and other pollutants). this leads to multiplication of the cells producing mucus, and the typical smoker's cough appears at the same time (especially in the morning), replacing the function of the cilia (fig. 3, [11]). connective tissue in the lungs is multiplied at the same time, and the bronchi are deformed (scarred) and contracted. contracted locations are followed by dilations of the breathing pathways, and the effective area for gaseous exchange is decreased. the result is emphysema, which makes breathing difficult. the larger bronchi, chronically irritated, respond by decreasing their diameter and by mucosis swelling. chronic obstructive bronchitis develops; the patient begins to suffocate. the immunity of the organism decreases, thus enabling the formation of lung cancer. this is the most frequent type of tumor in males, and even in females in many countries. the cardiovascular system (heart and vessels) is also seriously threatened. smoking is the riskiest factor for ischemic disease of the heart (heart attack) and legs, and for arteriosclerosis. smokers suffer more frequently from stomach and duodenal ulcers, from cancer of the urinary bladder, and from diseases of the mouth, tongue, pharynx and throat. female smokers give birth to babies of lower weight, i.e., they are born prematurely [9].

it has been shown that the probability of a smoker dying after smoking 200 000 cigarettes is eleven times higher than that of a non-smoker. however, non-smokers can also be threatened in the same way if exposed to other people's cigarette smoke, because they are more sensitive to the smoke. a ten-year research project at harvard university (boston, u.s.a.; 32 thousand healthy non-smoking women aged 36 to 61 years) ascertained that non-smokers regularly staying in rooms filled with smoke double their risk of heart diseases. it has also been found that voluntary, active smoking affects the lungs more negatively than the cardiovascular system, but that passive, involuntary smoking, on the contrary, threatens the cardiovascular system more than the lungs.

fig. 3: the origin of smoker's morning cough (a mucosis with cilia of a non-smoker, b smoker's mucosis without cilia, c cilia of a non-smoker)

3 admissible limits of toxic gases

some examples of the admissible concentrations of certain toxic substances in the outdoor air are presented in table 2.
table 2: admissible concentrations of some toxic substances in outdoor air (environmental quality, the 3rd annual report of the council on environmental quality, washington, d.c. 20402, 1972)

air pollutant | sampling time | p.p.m. | admissible concentration [µg/m3]
carbon monoxide | 1 hour | 35 | 40 000
carbon monoxide | 8 hours | 9 | 10 000
sulfur dioxide | 3 hours | 0.5 | 1 300
sulfur dioxide | 24 hours | 0.1 | 260
sulfur dioxide | mean per year | 0.02 | 60
aerosol | 24 hours | – | 150
aerosol | geometric mean per year | – | 60
nitrogen dioxide | mean per year | 0.05 | 100
photochemical oxidative agents | 1 hour | 0.08 | 160

admissible limits of toxic gases for a building interior should be slightly higher than those prescribed for outdoors; however, they should be lower than those prescribed for workplaces. e.g., the austrian standard h 6000, part 3 (january 1989) prescribes that the admissible value for a building interior should be 10 % of the highest admissible value for workplaces.

according to the who (world health organization) (who regional publications, european series no. 23, 1987), guideline values for indoor pollution have been prescribed (see table 3); no guidance is given on outdoor concentrations that might be of indirect relevance.

table 3: guideline values for individual substances based on effects other than cancer or odour/annoyance (a) (guidelines for ventilation requirements in buildings. eur 14449 en. brussels-luxemburg, 1992)

substance | time-weighted average | averaging time | chapter
cadmium | 1–5 ng/m3 | 1 year (rural areas) | 19
cadmium | 10–20 ng/m3 | 1 year (urban areas) |
carbon disulfide | 100 µg/m3 | 24 hours | 7
carbon monoxide | 100 mg/m3 (b) | 15 minutes | 20
carbon monoxide | 60 mg/m3 (b) | 30 minutes |
carbon monoxide | 30 mg/m3 (b) | 1 hour |
carbon monoxide | 10 mg/m3 | 8 hours |
1,2-dichloroethane | 0.7 mg/m3 | 24 hours | 8
dichloromethane (methylene chloride) | 3 mg/m3 | 24 hours | 9
formaldehyde | 100 µg/m3 | 30 minutes | 10
hydrogen sulfide | 150 µg/m3 | 24 hours | 22
lead | 0.5–1.0 µg/m3 | 1 year | 23
manganese | 1 µg/m3 | 1 year (c) | 24
mercury | 1 µg/m3 (d) (indoor air) | 1 year | 25
nitrogen dioxide | 400 µg/m3 | 1 hour | 27
nitrogen dioxide | 150 µg/m3 | 24 hours |
ozone | 150–200 µg/m3 | 1 hour | 28
ozone | 100–120 µg/m3 | 8 hours |
styrene | 800 µg/m3 | 24 hours | 12
sulfur dioxide | 500 µg/m3 | 10 minutes | 30
sulfur dioxide | 350 µg/m3 | 1 hour |
sulfuric acid | – (e) | – | 30
tetrachloroethylene | 5 mg/m3 | 24 hours | 13
toluene | 8 mg/m3 | 24 hours | 14
trichloroethylene | 1 mg/m3 | 24 hours | 15
vanadium | 1 µg/m3 | 24 hours | 31

(a) information from this table should not be used without reference to the rationale given in the chapters indicated.
(b) exposure at these concentrations should be for no longer than the indicated times and should not be repeated within 8 hours.
(c) due to respiratory irritancy, it would be desirable to have a short-term guideline, but the present data base does not permit such estimations.
(d) the guideline value is given only for indoor pollution: no guidance is given on outdoor concentrations (via deposition and entry into the food chain) that might be of indirect relevance.

note: when air levels in the general environment are orders of magnitude lower than the guideline values, present exposures are unlikely to present a health concern. guideline values in those cases are directed only to specific release episodes or specific indoor pollution problems.

acceptable limits for outdoor ozone were introduced by the european union in 1994: 110 µg/m3 (8-hour average value, the so-called protection threshold; irritation of the eyes and breathing passages), and 180 µg/m3 (one-hour average value, the so-called public information threshold; the inhabitants must be informed, because children playing outdoors must be limited to max. one hour, loss of concentration can occur in adults, and adult sporting activities outdoors should also be limited). there is also a limit of 360 µg/m3 (also a one-hour average value, the so-called warning threshold; the inhabitants must be warned).

removal of harmful agents from various products also plays an important role in air pollution control. the quality level of products from this point of view is expressed by so-called eco-labelling, which started in germany in 1978 when the so-called blue angel mark was introduced (see fig. 4). for an example of prescribed values, see table 4.
the system of eco-labelling was introduced throughout the european union in 1992 by eu directive no. 880/1992, with the mark of the eu sign over a flower (fig. 5). carpeting, owing to its large area, usually causes the most difficult problems within an interior. for this reason, a special eco-label (fig. 6) was introduced for carpets by the german research institute for carpets (deutsches teppich-forschungsinstitut e.v.). the label is a guarantee that the labeled carpet is acceptable from the indoor air pollution point of view. the aim of eco-labelling is, first, to provide a guarantee that every labeled product is acceptable from the indoor air pollution point of view, and second, to assist the selection and purchase of environment-friendly products.

4 removal of toxic gases

harmful gases can be removed from a building interior by suitable changes in a) the source of toxic substances, b) the environment, c) human activities and human attitudes.

4.1 changes in the source of toxic substances

reducing or even removing the source of toxic substances is the most effective way of optimization. building materials that do not release toxic agents should be preferred. technologies with minimal sources of toxic substances should be used in industry. boilers, stoves and ranges should be cleaned regularly so that complete combustion is reached. indoor air quality can also sometimes be improved quite simply, for example by using soda saleratus or a mixture of one part flour and one part plaster (calcinated gypsum) instead of harmful insecticide aerosols to kill cockroaches.
improvements in the efficiency of combustion engines have led to a decrease in the toxic substances that they produce. because combustion with an excessive amount of air increases nox production, an additive called carbonex has been developed to reduce the production of smog, soot and nox. carbonex dissolves iron in organic liquids, e.g., oil or gasoline. carbonex is sprayed into the flame simultaneously with the coal or other carbon-based fuel. combustion is affected in two ways: a) particles are more rapidly oxidized and can thus burn better, b) the process can be accomplished at lower temperatures. additives applied in diesel engines reduce emissions by 43 %, and efficiency increases by 1.5 to 3 %. in oil burners, emissions decrease by 67 %, and in stone coal combustion nox emissions decrease by 25 %. however, the most effective changes in the source of toxic substances are and will be: more rational use of energy, the use of cleaner fuels, and the development of non-combustion technologies.

fig. 4: german eco-label, the so-called blue angel

table 4: acceptable limits for heating boilers up to 350 kw according to german regulations

| din 4702/1 | blue angel
boilers with atmospheric burners, natural gas: nox | 50 mg/kwh | 70 mg/kwh
boilers with atmospheric burners, natural gas: co | 100 mg/kwh | 60 mg/kwh
oil boilers: nox | 260 mg/kwh | 120 mg/kwh
oil boilers: co | 110 mg/kwh | 60 mg/kwh

fig. 5: european eco-label

fig. 6: eco-label for carpets

more rational use of energy seems to be the most effective, because conserved energy is the cheapest energy. the possibilities are great: e.g., the u.s.a. spent 10 % of its gross national product on fuels in 1986, as against only 4 % in japan at the same time. it has been proved in the u.s.a. that each investment of two cents in more economic energy technologies saves one kwh of electric current. motor cars provide an example: since 1973 the fuel efficiency of new cars has increased on average by 25 %, and is still growing (the latest citroen has a fuel consumption of 3.5 l of gasoline per 100 km, and the toyota avx has a consumption of 2.21 l of diesel per 100 km). the efficiency of oil boilers has increased from 0.65 to 0.86, and of natural gas boilers from 0.65 to 0.87–1.0; newly developed freezers and refrigerators in the u.s.a. consume only 15 % of the consumption of present models. the use of an 18 w fluorescent lamp instead of a 75 w electric bulb (with the same illumination output) saves 200 kg of coal in an electrical power plant over a period of ten years, etc. an indication of the future potential is the following fact: the economy of the u.s.a. increased by one third between 1973 and 1986, without an increase in energy consumption.

another possibility is to increase the use of natural gas, which is available in abundance. it produces 30 % less co2 during combustion than crude oil, and 40 % less than coal; emissions of nox and sox are also significantly reduced. sulphur emissions from coal burning can also be reduced by changing to so-called liquid coal, which is a mixture of water, powdered coal and special chemicals. it is produced by svenska fluidcarbon in malmö, and is 30 % cheaper than heating oil. nuclear power plants use a typical progressive technology without combustion (even if there are some problems with safety and with spent radioactive fuel), as do siliceous photovoltaic cells, which have come down in price from 44 usd/w in 1976 to 1.6–4.0 usd/w in 1988, and are still falling.
4.2 dealing with toxic substances in a building

toxic substances in a building can be dealt with in the following ways: a) stop the substance from spreading within the building, b) adequate air change, i.e., suitable ventilation, c) air filtration, d) decay of toxic substances into non-toxic substances, and e) removal of the substance by air ionization.

4.2.1 how to stop toxic substances from spreading inside a building

toxic substances can be stopped from spreading inside a building either by separating vertical shafts into several parts (we must be careful about the air streams produced by infiltration and by indoor heat sources), or by locating harmful sources on the highest floors. the former arrangement is important in multistorey buildings, especially where toxic substances can be spread by upward pressure. e.g., if the staircase is a single shaft, upward pressure works and the substances will be lifted upward and all over the building from a source in the lower part of the building.

4.2.2 adequate air change – ventilation

ventilation is preferred in most cases, owing to its simplicity and effectiveness. the quantity of outdoor air supplied into the interior depends on the admissible concentration, expressed by the equation

$$\dot V = \frac{\dot m}{\kappa_{\max} - \kappa_o} \qquad [\mathrm{m^3\,h^{-1}}] \qquad (1)$$

where
$\dot V$ = outdoor air volume rate necessary for maintaining the admissible concentration [m3·h−1],
$\dot m$ = quantity of the arising toxic agent [g·h−1],
$\kappa_{\max}$ = highest admissible concentration as prescribed by standards [g·m−3],
$\kappa_o$ = concentration of the toxic agent in outdoor air [g·m−3].

if a person is exposed to several toxic agents at the same time, the outdoor air volume rate should be estimated for each agent separately, and then a) the sum of all values is used if the impacts of all the agents on humans also sum, and b) the highest value is used if the effect of each agent on humans is independent. (a short computational sketch of equation (1) and of this combination rule is given at the end of section 4.2.4.)

4.2.3 filtration

filtration can be performed by activated coal or by charcoal, which do not absorb humidity (up to a relative air humidity of 75 %) and do not change the chemical composition of the air. the filtration efficiency depends on the contact period of the agent and the coal. the minimum coal layer thickness is 3 cm for an air velocity of 2.5 to 3.0 m/s and an efficiency of 80 %. filtration is used a) for outdoor air supplied into the building (only special rooms: the food industry, pharmaceutical plants), b) usually in recirculated air (in ventilation, heating, and air conditioning systems), and c) in extracted air (when there is a threat of outdoor air pollution).

4.2.4 the decay of toxic substances into non-toxic substances

according to research carried out at the university of stockholm, a grown tree of aesculus hippocastanum (the horse chestnut) is able to clean a volume of 20 000 m3 (e.g., 10 m high, 20 m wide, 100 m long) of motor-car exhaust smoke, without itself being affected by the smoke. according to nasa research, the substances entering the tree are consumed by the microorganisms on its roots and within the root space. not only can the quality of outdoor air be improved in this way, but the quality of fresh air supplied into the interior can also improve.
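as announced in section 4.2.2, the following is a minimal computational sketch of equation (1) and of the combination rule for several simultaneous agents. it is a direct transcription of the formulas above; the function names and the example numbers are illustrative only.

```python
# a minimal sketch of equation (1); function names and example values are illustrative.

def outdoor_air_rate(m_dot: float, kappa_max: float, kappa_o: float) -> float:
    """equation (1): outdoor air volume rate [m3/h] needed to keep the indoor
    concentration of one toxic agent at its admissible level.

    m_dot     -- quantity of the arising toxic agent [g/h]
    kappa_max -- highest admissible concentration [g/m3]
    kappa_o   -- concentration of the agent in outdoor air [g/m3]
    """
    if kappa_o >= kappa_max:
        raise ValueError("outdoor air alone already exceeds the admissible concentration")
    return m_dot / (kappa_max - kappa_o)

def combined_rate(rates: list[float], effects_sum: bool) -> float:
    """combination rule for several agents: sum the per-agent rates if their
    impacts on humans sum up (case a), otherwise take the largest rate (case b)."""
    return sum(rates) if effects_sum else max(rates)

# example: co produced at 2 g/h, admissible 0.01 g/m3, outdoor level 0.002 g/m3
print(outdoor_air_rate(2.0, 0.01, 0.002))  # -> 250.0 m3/h
```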
4.2.5 removal of toxics by intensive air ionization

formaldehyde, so2 and dioxin can be successfully removed by intensive air ionization. for this purpose, an ionizer (producing negative aeroions) with a fan (ensuring proper air streaming within the room) and an electrode connected to the positive pole of a direct current source are applied (fig. 7). the formaldehyde content in a room decreases by 80 % within 30 minutes of operation, and the so2 content by 73 %.

fig. 7: scheme of an arrangement with intensive air ionization for the removal of toxic substances from an interior (e positive electrode, sa source of toxics, p probe for toxic concentration estimation)

4.3 gas masks

a gas mask can also be used for protection against toxics, but it is suitable only in special, exceptional situations.

references
[1] eur 14449 en. guidelines for ventilation requirements in buildings. report no. 11, commission of the ec, luxemburg, 1992.
[2] holcátová, i., bencko, v.: health aspects of formaldehyde in the indoor environment. centr. eur. j. publ. health, 1997, vol. 5, no. 1, p. 38–41.
[3] jokl, m. v.: microenvironment: the theory and practice of indoor climate. springfield (illinois, u.s.a.): thomas publisher, 1989, p. 416.
[4] jokl, m. v.: evaluation of indoor air quality using the decibel concept. int. j. of environmental health research, 1997, vol. 7, no. 4, p. 289–306.
[5] jokl, m. v.: new units for indoor air quality: decicarbdiox and decitvoc. international journal of biometeorology, 1998, vol. 42, no. 2, p. 93–111.
[6] jokl, m. v.: evaluation of indoor air quality using the decibel concept based on carbon dioxide and tvoc. building and environment, 2000, vol. 35, no. 8, p. 677–697.
[7] jokl, m. v.: toxic gases within building interiors. part 1. in czech. interiér stavby, 2000, vol. 3, no. 11, p. 57–58.
[8] jokl, m. v.: toxic gases within building interiors. part 2. in czech. interiér stavby, 2000, vol. 3, no. 12, p. 48–49.
[9] kašák, v.: how to survive smog. in czech. maxdorf, edice medica – praktické rady lékaře sv. 6, praha, 1995, p. 61.
[10] symon, k., bencko, v. et al.: air pollution and health. in czech. praha: avicenum, 1988.
[11] the human body. bratislava: gemini ltd., 1992.

miloslav v. jokl, ph.d., sc.d., university professor
phone: +420 224 354 432, fax: +420 233 339 961
e-mail: miloslav.jokl@fsv.cvut.cz
czech technical university in prague, faculty of civil engineering
thákurova 7, 166 29 prague 6, czech republic

acta polytechnica 61(si):49–58, 2021. https://doi.org/10.14311/ap.2021.61.0049
© 2021 the author(s). licensed under a cc-by 4.0 licence. published by the czech technical university in prague.

modified equation for a class of explicit and implicit schemes solving one-dimensional advection problem

tomáš bodnár (a, b; corresponding author: tomas.bodnar@fs.cvut.cz), philippe fraunié (c), karel kozel (a)
(a) czech technical university in prague, faculty of mechanical engineering, karlovo náměstí 13, 121 35 prague, czech republic
(b) czech academy of sciences, institute of mathematics, žitná 25, 115 67 prague, czech republic
(c) université de toulon, mediterranean institute of oceanography mio, bp 20132, f-83957 la garde cedex, france

abstract. this paper presents the general modified equation for a family of finite-difference schemes solving the one-dimensional advection equation. the whole family of explicit and implicit schemes working at two time levels and having a three-point spatial support is considered. some of the classical schemes (upwind, lax-friedrichs, lax-wendroff) are discussed as examples, showing the possible implications of the modified equation for the properties of the considered numerical methods.
keywords: modified equation, finite difference, advection equation.

1. introduction

numerical solution of differential equations became a standard tool in many disciplines of theoretical science as well as in applied sciences and engineering. numerical and computational methods brought the possibility to solve non-trivial problems described by ordinary and partial differential equations for which it was impossible to obtain an analytical solution by standard mathematical methods. a wide range of numerical methods was developed over the years for specific problems. typically the physical problem is first described mathematically, so a mathematical model is created; then this mathematical model is solved numerically. the obtained numerical solution is an approximation to the solution of the mathematical model, which itself is just an approximation of the physical problem. we set aside the error introduced by the inaccuracies and approximations made when developing the mathematical model from the physical problem; here the focus is on the difference between the discrete numerical solution of the mathematical model and its exact (analytical) solution. the numerical solution strongly depends on the method used to obtain it, so the properties and quality of the numerical solution may differ from those of the solution of the original mathematical model.

the aim of this paper is to show that the numerical solution of certain problems (equations) is much closer to the solution of some modified equation, rather than to the solution of the original problem. many important properties of the numerical solution can then be easily seen (and expected a priori) from the behavior of the known solution of the modified problem. this rather general principle will be demonstrated on a finite-difference approximation of the solution of the one-dimensional advection equation.

the structure of the paper is as follows. first the advection, diffusion and dispersion equations are introduced and discussed. then several explicit schemes for the advection equation are presented and analyzed at the discrete level. the modified equation is first derived for the upwind scheme and then extended to a general class of explicit and implicit schemes. finally the modified equations are discussed from the point of view of numerical diffusion and dispersion.

2. model problems

in order to be able to explain the properties of modified equations and the underlying numerical schemes, three model problems are introduced. they describe the physical phenomena of advection, diffusion and dispersion. all these problems can be mathematically formulated using a linear evolutionary partial differential equation. in all cases the unknown function $u(x,t)$ is sought subject to the initial data $u(x,t=0)=\eta(x)$. for a sketch of an example of initial data see fig. 1. the initial value problem can thus be solved analytically using the fourier transform method. the fourier transform (in space) needed to obtain the analytical solutions is defined by

$$\hat u(\xi,t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} u(x,t)\, e^{-i\xi x}\, dx\,,$$

while the inverse is

$$u(x,t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} \hat u(\xi,t)\, e^{i\xi x}\, d\xi\,.$$

figure 1. initial condition $u(x,t_0)=\eta(x)$.

using this transformation, the analytical solutions of the model linear problems (discussed further) can be derived. the details can be found for example in the appendix of the book [1].
2.1. advection equation

the advection problem can be described by a first-order pde of the form

$$u_t + a\,u_x = 0\,.$$

in this paper we consider the advection velocity $a>0$, but for $a\le 0$ the solution can be obtained as well. this advection equation is supplemented by the initial data $u(x,t=0)=\eta(x)$. applying the fourier transform we can get the expression for the fourier image of the solution

$$\hat u(\xi,t) = e^{-i\xi a t}\,\hat\eta(\xi)\,.$$

using the inverse transform, the solution can be written in the form

$$u(x,t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} e^{-i\xi a t}\,\hat\eta(\xi)\, e^{i\xi x}\, d\xi = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} \hat\eta(\xi)\, e^{i\xi (x-at)}\, d\xi\,.$$

it's not difficult to see that the solution corresponds in fact to the initial data $\eta(x)$ shifted along the x-axis at velocity $a$, i.e.

$$u(x,t) = \eta(x-at)\,.$$

an illustration of such a situation can be seen in fig. 2. when the advection equation is solved numerically, the discrete solution differs from the exact one, depending on the numerical method used. the numerical solution often exhibits some non-physical oscillations (due to dispersion) or smeared solution gradients (due to diffusion). this is why the diffusion and dispersion equations are briefly presented hereafter.

figure 2. advection equation solution evolution (snapshots at $t=t_0$, $t_0+\Delta t$, $t_0+2\Delta t$).

figure 3. diffusion equation solution evolution (snapshots at $t=t_0$, $t_0+\Delta t$, $t_0+2\Delta t$).

2.2. diffusion equation

the diffusion equation contains the second spatial derivative, multiplied by a diffusion coefficient $b$:

$$u_t + b\,u_{xx} = 0\,.$$

this linear pde can also easily be solved analytically using the fourier transform to obtain

$$\hat u(\xi,t) = e^{-(i\xi)^2 b t}\,\hat\eta(\xi)\,,$$

and after the inverse transform the solution

$$u(x,t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} e^{\xi^2 b t}\,\hat\eta(\xi)\, e^{i\xi x}\, d\xi\,.$$

a straightforward comparison with the solution of the advection equation reveals that now the individual fourier modes found in the initial data $\eta(x)$ do not change their position. they only change their amplitude, by the factor $e^{\xi^2 b t}$, which means exponential decay for $b<0$ and growth for $b>0$. this is why only the decaying case with $b<0$ is usually physically acceptable, leading to an asymptotically stable evolution in time. it should also be noted that the decay depends on the square of the wave number $\xi$, and thus the rapidly oscillating modes decay much faster. an illustration of the diffusion process and the diffusion equation solution is shown in fig. 3.

2.3. dispersion equation

the linear dispersion equation can be written as

$$u_t + c\,u_{xxx} = 0\,,$$

where $c$ is the dispersion coefficient. the fourier image of the solution is

$$\hat u(\xi,t) = e^{-(i\xi)^3 c t}\,\hat\eta(\xi)\,,$$

from which follows the solution

$$u(x,t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} e^{i\xi^3 c t}\,\hat\eta(\xi)\, e^{i\xi x}\, d\xi\,,$$

which can be rearranged to

$$u(x,t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} \hat\eta(\xi)\, e^{i\xi (x + c\xi^2 t)}\, d\xi\,.$$

this form is very similar to the solution of the advection equation $u(x,t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} \hat\eta(\xi)\, e^{i\xi(x-at)}\, d\xi$. comparing the corresponding expressions in the exponential, it can be seen that during the dispersive evolution the fourier modes are also shifted along the x-axis, but each mode is shifted at a different velocity, depending on the wave number as $-c\xi^2$. the minus sign indicates that for $c>0$ the modes propagate against the sense of the x-axis. in general it means that the initial data are decomposed into individual modes, and each of them propagates at a different speed, proportional to $\xi^2$.
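the three amplification factors derived above can be illustrated directly in fourier space. the following short numpy sketch (an illustration only, assuming a periodic domain so that the discrete fourier transform replaces the continuous one used in the text; all names and parameter values are assumptions) evolves the same initial data by pure advection, diffusion and dispersion:

```python
# spectral illustration of the model problems: multiply the fourier image of the
# initial data by exp(-i ξ a t), exp(ξ² b t) (with b < 0) or exp(i ξ³ c t)
# and transform back.
import numpy as np

n, length = 256, 2.0 * np.pi
x = np.linspace(0.0, length, n, endpoint=False)
xi = 2.0 * np.pi * np.fft.fftfreq(n, d=length / n)       # discrete wave numbers ξ
eta_hat = np.fft.fft(np.exp(-10.0 * (x - np.pi) ** 2))   # smooth initial data η(x)

a, b, c, t = 1.0, -0.01, 0.01, 1.0
u_advection  = np.fft.ifft(np.exp(-1j * xi * a * t) * eta_hat).real      # shift by a·t
u_diffusion  = np.fft.ifft(np.exp(xi ** 2 * b * t) * eta_hat).real       # mode-wise decay
u_dispersion = np.fft.ifft(np.exp(1j * xi ** 3 * c * t) * eta_hat).real  # phase shift ~ ξ³
```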
2.4. advection-diffusion-dispersion equation

the model problems described above, involving advection, diffusion and dispersion, can be combined into advection-diffusion or advection-dispersion equations. the solutions of these combined equations can also be derived analytically; they share the properties and behavior of both of the original equations' solutions. these combinations will be important later in this paper while discussing the properties of the modified equations, where the diffusive and dispersive terms often appear due to the discretization of the advection equation. see the discussions in sections 4.1–5.

3. discrete analysis

the problem of the numerical solution of the advection equation is one of the classical topics in numerical mathematics. there exists a wide range of numerical schemes to discretize this problem; here we only show a few classical methods as an illustration. further we will consider advection in one spatial dimension,

$$u_t + a\,u_x = 0\,, \qquad a>0\,,$$

where $u(x,t)$ is the sought solution and $u_i^n \approx u(x_i,t_n)$ is its discrete numerical approximation at the point $x_i = i\Delta x$ and time instant $t_n = n\Delta t$, with $i$, $n$ being integer indices in the spatial and temporal coordinates respectively. the numerical approximation of the solution can be constructed, e.g., using finite-difference discretization, replacing the derivatives in the original equation by the corresponding divided-difference formulas developed from taylor expansions. the final formulas for a few classical schemes are shown in table 1. some details concerning the derivation of some of these schemes will be shown in the following sections; more details can be found, e.g., in the classical textbooks [2], [3] or [4].

figure 4. computational stencil (spatial points $i-1$, $i$, $i+1$; time levels $n$, $n+1$).

table 1. examples of some simple classical discretizations for the advection equation.

scheme | formula
up-wind [u] | $u_i^{n+1} = u_i^n - \frac{a\Delta t}{\Delta x}\,(u_i^n - u_{i-1}^n)$
down-wind [d] | $u_i^{n+1} = u_i^n - \frac{a\Delta t}{\Delta x}\,(u_{i+1}^n - u_i^n)$
central [c] | $u_i^{n+1} = u_i^n - \frac{a\Delta t}{2\Delta x}\,(u_{i+1}^n - u_{i-1}^n)$
lax-friedrichs [lf] | $u_i^{n+1} = \frac{1}{2}(u_{i+1}^n + u_{i-1}^n) - \frac{a\Delta t}{2\Delta x}\,(u_{i+1}^n - u_{i-1}^n)$
lax-wendroff [lw] | $u_i^{n+1} = u_i^n - \frac{a\Delta t}{2\Delta x}\,(u_{i+1}^n - u_{i-1}^n) + \frac{a^2\Delta t^2}{2\Delta x^2}\,(u_{i+1}^n - 2u_i^n + u_{i-1}^n)$

3.1. up-wind & down-wind decomposition

a family of simple explicit schemes working on a three-point spatial stencil and two time levels (see fig. 4) can be derived using forward and backward difference approximations of the spatial derivative. when a general (convex) combination of forward and backward differences is used, with weighting factor $\alpha$, the explicit finite-difference approximation takes the form

$$\frac{u_i^{n+1}-u_i^n}{\Delta t} + a\left[\alpha\,\frac{u_{i+1}^n-u_i^n}{\Delta x} + (1-\alpha)\,\frac{u_i^n-u_{i-1}^n}{\Delta x}\right] = 0\,.$$

this family of schemes can be formally written using the forward (down-wind) difference $\{d\}$ and the backward (up-wind) difference $\{u\}$ as

$$\frac{u_i^{n+1}-u_i^n}{\Delta t} + a\left[\alpha\{d\} + (1-\alpha)\{u\}\right] = 0\,.$$

it's evident that the classical upwind and downwind schemes can be recovered for the special choices $\alpha=0$ and $\alpha=1$ respectively. the simple central scheme can be obtained for the symmetric choice represented by $\alpha=1/2$. a little less apparent, but still easy to verify, is the representation of the lax-friedrichs and lax-wendroff schemes, which also belong to this family. the values of the upwind/downwind blending coefficient $\alpha$ for all these schemes are summarized in tab. 2.

table 2. coefficient of the up-wind/down-wind decomposition of classical schemes.

scheme | $\alpha$
[u] | $0$
[d] | $1$
[c] | $\frac{1}{2}$
[lf] | $\frac{1}{2} - \frac{\Delta x}{2a\Delta t}$
[lw] | $\frac{1}{2} - \frac{a\Delta t}{2\Delta x}$

3.2. central & upwind decomposition

in the same way as every explicit scheme with a three-point stencil can be written in the form of a convex combination of forward and backward differences, it can also be formally rewritten as a weighted sum of central and upwind difference approximations. the central approximation $\{c\}$ can simply be expressed as an average of the upwind and downwind (backward and forward) differences as

$$\{c\} = \frac{\{d\} + \{u\}}{2} \;\Longrightarrow\; \{d\} = 2\{c\} - \{u\}\,.$$
the family of explicit schemes can thus be rewritten using the central difference approximation $\{c\}$ and additional upwinding $\{u\}$, where the blending parameter now has the value $2\alpha$:

$$\frac{u_i^{n+1}-u_i^n}{\Delta t} + a\left[2\alpha\{c\} + (1-2\alpha)\{u\}\right] = 0\,.$$

3.3. central & viscous decomposition

by taking a divided difference of the forward and backward differences (approximations of $u_x$), the central approximation of the second derivative $u_{xx}$ can be constructed. this is often used in the approximation of diffusive (or viscous) terms containing second derivatives. this viscous-like term $\{v\}$ can be formally written as

$$\{v\} = \frac{\{d\} - \{u\}}{\Delta x} \;\Longrightarrow\; \{u\} = \{c\} - \frac{\Delta x}{2}\{v\}\,.$$

using this definition of $\{v\}$, the whole family of schemes can be written as a combination of a central approximation part and a viscous (diffusive) part:

$$\frac{u_i^{n+1}-u_i^n}{\Delta t} + a\left[\{c\} - (1-2\alpha)\,\frac{\Delta x}{2}\{v\}\right] = 0\,.$$

3.3.1. numerical viscosity

by a simple rearrangement of this formula, it's easy to see that all the schemes from the considered explicit family can be written in a form where only the central approximation of the first derivative is kept on the left, while the remaining viscous-like part is moved to the right-hand side:

$$\frac{u_i^{n+1}-u_i^n}{\Delta t} + a\{c\} = (1-2\alpha)\,\frac{a\Delta x}{2}\{v\}\,,$$

or shortly

$$\frac{u_i^{n+1}-u_i^n}{\Delta t} + a\{c\} = \mu\{v\}\,.$$

it's not difficult to see that this discrete equation corresponds to a finite-difference approximation of the advection-diffusion equation, rather than just to the original advection equation:

$$u_t + a u_x = 0 \;\longrightarrow\; u_t + a u_x = \mu\,u_{xx}\,.$$

the extra term on the right-hand side corresponds to the numerical diffusion (viscosity), where the coefficient $\mu$ depends on the method being used:

$$\mu = \underbrace{(1-2\alpha)}_{\epsilon}\,\frac{a\Delta x}{2}\,.$$

the values of this numerical viscosity coefficient for a few classical explicit methods are listed in table 3.

table 3. numerical viscosity coefficients for some classical schemes.

scheme | $\alpha$ | $\epsilon$ | $\mu$
[u] | $0$ | $1$ | $\frac{a\Delta x}{2}$
[d] | $1$ | $-1$ | $-\frac{a\Delta x}{2}$
[c] | $\frac{1}{2}$ | $0$ | $0$
[lf] | $\frac{1}{2} - \frac{\Delta x}{2a\Delta t}$ | $\frac{1}{\gamma}$ | $\frac{\Delta x^2}{2\Delta t}$
[lw] | $\frac{1}{2} - \frac{a\Delta t}{2\Delta x}$ | $\gamma$ | $\frac{a^2\Delta t}{2}$

in this context, each scheme within this explicit family can be considered as a simple central scheme with different amounts of added numerical viscosity (or upwinding, alternatively). when taking into account the definition of the non-dimensional courant-friedrichs-lewy parameter $\gamma = \frac{a\Delta t}{\Delta x}$, which is positive (and bounded by stability conditions), the non-dimensional viscosity coefficient $\epsilon$ can be defined.
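before turning to the properties summarized in table 4, the decomposition above can be made concrete in a few lines of code. the following sketch (illustrative names only; periodic boundaries are an assumption) advances the whole explicit family by blending the forward and backward differences with the weight $\alpha$ of table 2:

```python
# one explicit time step for the whole three-point family, written as the
# α-blend of forward {d} and backward {u} differences; the α values of table 2
# recover the classical schemes. periodic boundaries via np.roll.
import numpy as np

def step(u, gamma, alpha):
    """advance u_t + a u_x = 0 by one step; gamma = a*dt/dx is the cfl number."""
    d = np.roll(u, -1) - u      # forward (down-wind) difference, times dx
    b = u - np.roll(u, 1)       # backward (up-wind) difference, times dx
    return u - gamma * (alpha * d + (1.0 - alpha) * b)

def alpha_of(scheme, gamma):
    """table 2: up-wind/down-wind blending coefficients."""
    return {"u": 0.0, "d": 1.0, "c": 0.5,
            "lf": 0.5 - 1.0 / (2.0 * gamma),
            "lw": 0.5 - 0.5 * gamma}[scheme]

# advect a step profile with the up-wind scheme at γ = 0.8
x = np.linspace(0.0, 2.0, 200, endpoint=False)
u = np.where(x < 0.5, 2.0, 1.0)
for _ in range(100):
    u = step(u, gamma=0.8, alpha=alpha_of("u", 0.8))
```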
the value of the numerical viscosity coefficient determines the essential behavior and properties of the specific numerical method. just by changing the parameter $\mu$, the schemes can become more diffusive or dispersive, stable or unstable, and their order of accuracy will also change. these properties for some schemes are summarized in table 4.

table 4. properties and behavior of selected explicit numerical schemes.

scheme | $\epsilon$ | $\mu$ | accuracy | stability | behavior
[d] | $-1$ | $-\frac{a\Delta x}{2} < 0$ | $\mathcal O(\Delta t; \Delta x)$ | unstable | dispersive
[c] | $0$ | $0$ | $\mathcal O(\Delta t; \Delta x^2)$ | unstable | dispersive
[lw] | $\gamma < 1$ | $\frac{a^2\Delta t}{2}$ | $\mathcal O(\Delta t^2; \Delta x^2)$ | stable | dispersive
[u] | $1$ | $\frac{a\Delta x}{2}$ | $\mathcal O(\Delta t; \Delta x)$ | stable | diffusive
[lf] | $\frac{1}{\gamma} > 1$ | $\frac{\Delta x^2}{2\Delta t}$ | $\mathcal O(\Delta t; \Delta x)$ | stable | diffusive

the schemes in table 4 are sorted from top to bottom according to increasing numerical viscosity. it can be seen (as expected) that when the numerical viscosity coefficient is negative, the scheme is unstable. by increasing its value, the scheme becomes stable and can also improve its accuracy. however, by a further increase of the numerical viscosity, the behavior of the method becomes more diffusive and the formal order of accuracy drops down again.

in summary, the identification of the amount of numerical viscosity embedded in a numerical scheme is the key point in understanding its behavior. here it was done at the discrete level, by identifying the discrete diffusive term in the scheme. a similar, even more detailed analysis can be performed at the continuous level, leading to the so-called modified equation.

4. modified equation

the modified equation approach is well known and often used to assess the order of accuracy of finite-difference schemes. when doing the taylor expansions to develop the finite-difference approximations, it is possible to go beyond just finding the order of the leading term in the truncated taylor series. it's possible to find analytically the form of the leading term, add it to (keep it in) the original equation, and study the behavior of the modified equation that includes this extra term introduced by the discretization of the problem. the properties of the modified problem's solution will be close to the properties of the numerical solution of the original problem.

4.1. modified equation for the upwind scheme

when solving the advection equation $u_t + a u_x = 0$, $a > 0$, the up-wind scheme can be written as

$$\frac{u_i^{n+1}-u_i^n}{\Delta t} + a\,\frac{u_i^n - u_{i-1}^n}{\Delta x} = 0\,,$$

using the explicit euler discretization in time and the backward difference in space (i.e., upwind when the advection velocity $a>0$). considering a sufficiently smooth interpolant $u(x_i,t_n) = u_i^n$, the taylor expansions can be used to derive the corresponding approximation formulas:

$$\frac{u(x_i,t_{n+1}) - u(x_i,t_n)}{\Delta t} + a\,\frac{u(x_i,t_n) - u(x_{i-1},t_n)}{\Delta x} = 0\,.$$

the terms $u(x_i,t_{n+1})$ and $u(x_{i-1},t_n)$ appearing in this formula are then obtained from the (truncated) taylor series

$$u(x_i,t_{n+1}) = u(x_i,t_n) + \Delta t\,u_t(x_i,t_n) + \frac{\Delta t^2}{2}u_{tt}(x_i,t_n) + \frac{\Delta t^3}{6}u_{ttt}(x_i,t_n) + \mathcal O(\Delta t^4)\,,$$

$$u(x_{i-1},t_n) = u(x_i,t_n) - \Delta x\,u_x(x_i,t_n) + \frac{\Delta x^2}{2}u_{xx}(x_i,t_n) - \frac{\Delta x^3}{6}u_{xxx}(x_i,t_n) + \mathcal O(\Delta x^4)\,.$$

using these values, the corresponding difference approximations can be obtained for the temporal and spatial derivatives:

$$\frac{u(x_i,t_{n+1}) - u(x_i,t_n)}{\Delta t} = u_t(x_i,t_n) + \frac{\Delta t}{2}u_{tt}(x_i,t_n) + \frac{\Delta t^2}{6}u_{ttt}(x_i,t_n) + \mathcal O(\Delta t^3)\,,$$

$$\frac{u(x_i,t_n) - u(x_{i-1},t_n)}{\Delta x} = u_x(x_i,t_n) - \frac{\Delta x}{2}u_{xx}(x_i,t_n) + \frac{\Delta x^2}{6}u_{xxx}(x_i,t_n) + \mathcal O(\Delta x^3)\,.$$

when these difference approximations are put together as in the numerical scheme, the original advection equation appears on the right-hand side, together with some extra terms representing the leading-order terms in the remainder of the truncated taylor series:

$$\frac{u(x_i,t_{n+1}) - u(x_i,t_n)}{\Delta t} + a\,\frac{u(x_i,t_n) - u(x_{i-1},t_n)}{\Delta x} = u_t(x_i,t_n) + a\,u_x(x_i,t_n) + \ldots = 0\,.$$
it means that although the difference scheme approximates the advection equation, some extra (higher-order) terms appear on the right-hand side due to the discretization:

$$u_t(x_i,t_n) + a\,u_x(x_i,t_n) = -\frac{\Delta t}{2}u_{tt}(x_i,t_n) - \frac{\Delta t^2}{6}u_{ttt}(x_i,t_n) + \mathcal O(\Delta t^3) + \frac{a\Delta x}{2}u_{xx}(x_i,t_n) - \frac{a\Delta x^2}{6}u_{xxx}(x_i,t_n) + \mathcal O(\Delta x^3)\,.$$

partial derivatives with respect to time and also to space both appear on the right-hand side. the derivatives with respect to time can be converted to spatial derivatives using the original advection equation (or the corresponding taylor expansions of the scheme):

$$u_t + a u_x = 0 \;\Longrightarrow\; \frac{\partial\,\cdot}{\partial t} = -a\,\frac{\partial\,\cdot}{\partial x}\,, \qquad u_{tt} = a^2 u_{xx} \quad \& \quad u_{ttt} = -a^3 u_{xxx}\,.$$

this leads to

$$u_t(x_i,t_n) + a\,u_x(x_i,t_n) = -\frac{a^2\Delta t}{2}u_{xx}(x_i,t_n) + \frac{a^3\Delta t^2}{6}u_{xxx}(x_i,t_n) + \mathcal O(\Delta t^3) + \frac{a\Delta x}{2}u_{xx}(x_i,t_n) - \frac{a\Delta x^2}{6}u_{xxx}(x_i,t_n) + \mathcal O(\Delta x^3)\,.$$

when only the leading-order term is kept, the modified equation takes the form

$$u_t(x_i,t_n) + a\,u_x(x_i,t_n) = \left(\frac{a\Delta x}{2} - \frac{a^2\Delta t}{2}\right) u_{xx}(x_i,t_n) + \mathcal O(\Delta t^2; \Delta x^2)\,.$$

in the approximate version, the extra term containing the second (spatial) derivative $u_{xx}$ appears on the right-hand side:

$$u_t(x_i,t_n) + a\,u_x(x_i,t_n) \doteq \frac{a\Delta x}{2}(1-\gamma)\,u_{xx}(x_i,t_n)\,.$$

it means that while solving the advection equation by the up-wind scheme, we rather obtain the solution of the advection-diffusion equation, with the diffusion coefficient corresponding to the numerical viscosity $\mu$:

$$u_t + a u_x = 0 \;\longrightarrow\; u_t + a u_x = \underbrace{\frac{a\Delta x}{2}(1-\gamma)}_{\mu}\,u_{xx}\,.$$

in more detail, instead of a first-order approximation of the advection equation, we have obtained a second-order approximation of the advection-diffusion equation:

1st-order approximation of the original equation: $u_t + a u_x = 0 + \mathcal O(\Delta t; \Delta x)$;
2nd-order approximation of the modified equation: $u_t + a u_x = \frac{a\Delta x}{2}(1-\gamma)\,u_{xx} + \mathcal O(\Delta t^2; \Delta x^2)$.

so the numerical solution of the original (advection) problem is in fact much closer to the solution of the modified (advection-diffusion) problem, with all the consequences this may have for the solution behavior. from the form of the modified equation we can see the order of accuracy of the approximation of the original problem and the diffusive (or possibly dispersive) character of the modified equation, representing the behavior of the numerical solution. it's also possible to see, e.g., how the numerical viscosity coefficient $\mu$ scales with the time step $\Delta t$ and the spatial step $\Delta x$. from the requirement of positivity of the diffusion coefficient, $\mu > 0$, it's possible to estimate the stability of the underlying numerical scheme, i.e., the limitations on the cfl parameter $\gamma$ (evidently $\gamma < 1$ is required for the up-wind scheme).
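this prediction is easy to check numerically. the following sketch (an illustration, not taken from the paper; the periodic boundaries and all parameter values are assumptions) advects a gaussian with the up-wind scheme and compares its measured variance with the growth $\sigma_0^2 + 2\mu t$ expected for the advection-diffusion equation with $\mu = \frac{a\Delta x}{2}(1-\gamma)$:

```python
# numerical check of the up-wind modified equation: a gaussian evolved by the
# up-wind scheme should spread like a solution of u_t + a u_x = µ u_xx,
# i.e. its variance should grow from σ0² to σ0² + 2µt.
import numpy as np

a, gamma, dx = 1.0, 0.5, 0.01
dt = gamma * dx / a
mu = 0.5 * a * dx * (1.0 - gamma)            # predicted numerical viscosity

x = np.arange(0.0, 8.0, dx)
sigma0 = 0.1
u = np.exp(-((x - 1.0) ** 2) / (2.0 * sigma0 ** 2))

nsteps = 2000
for _ in range(nsteps):
    u = u - gamma * (u - np.roll(u, 1))      # one up-wind step (periodic)

t = nsteps * dt
mean = np.sum(u * x) / np.sum(u)
var = np.sum(u * (x - mean) ** 2) / np.sum(u)
print(var, sigma0 ** 2 + 2.0 * mu * t)       # the two values should be close
```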
4.2. general modified equation

the procedure of deriving the modified equation can be applied to all the explicit schemes discussed so far. in fact, it can be applied to a much larger family, including also implicit schemes. further we will work with a family of explicit and implicit schemes where the approximation of the spatial derivative is obtained as a linear combination of the forward and backward differences at the time levels $n$ and $n+1$. the whole family of such schemes can be written in the form

$$\frac{u_i^{n+1}-u_i^n}{\Delta t} + a\left[\alpha_1\left(\frac{u_{i+1}^n-u_i^n}{\Delta x}\right) + \alpha_2\left(\frac{u_i^n-u_{i-1}^n}{\Delta x}\right)\right] + a\left[\alpha_3\left(\frac{u_{i+1}^{n+1}-u_i^{n+1}}{\Delta x}\right) + \alpha_4\left(\frac{u_i^{n+1}-u_{i-1}^{n+1}}{\Delta x}\right)\right] = 0\,.$$

the computational stencil for such a family of schemes is shown in fig. 5, introducing also the weights $\alpha_1$–$\alpha_4$ used for blending the individual differences in the linear combination.

figure 5. computational stencil for the general implicit and explicit schemes (weights $\alpha_1$, $\alpha_2$ at time level $n$; $\alpha_3$, $\alpha_4$ at level $n+1$).

due to consistency, the condition $\alpha_1 + \alpha_2 + \alpha_3 + \alpha_4 = 1$ should be verified; moreover, obviously $|\alpha_3| + |\alpha_4| = 0$ for explicit schemes, while the opposite, $|\alpha_3| + |\alpha_4| \neq 0$, characterizes the implicit schemes. for simplicity, the coefficients of the explicit part are marked in blue, while the implicit part is green. the coefficients $\alpha_1$–$\alpha_4$ for some examples of classical explicit and implicit schemes are shown in tab. 5. the last two rows in the table correspond to the wendroff scheme, denoted by [w], and the crank-nicolson scheme, denoted by [cn]. these two schemes are examples of implicit schemes, using the difference approximations at both time levels $n$ and $n+1$.

table 5. coefficients of some explicit and implicit schemes with a three-point stencil.

scheme | $\alpha_1$ | $\alpha_2$ | $\alpha_3$ | $\alpha_4$
[u] | $0$ | $1$ | $0$ | $0$
[d] | $1$ | $0$ | $0$ | $0$
[c] | $\frac{1}{2}$ | $\frac{1}{2}$ | $0$ | $0$
[lf] | $\frac{1}{2} - \frac{\Delta x}{2a\Delta t}$ | $\frac{1}{2} + \frac{\Delta x}{2a\Delta t}$ | $0$ | $0$
[lw] | $\frac{1}{2} - \frac{a\Delta t}{2\Delta x}$ | $\frac{1}{2} + \frac{a\Delta t}{2\Delta x}$ | $0$ | $0$
[w] | $\frac{1}{2}$ | $0$ | $0$ | $\frac{1}{2}$
[cn] | $\frac{1}{4}$ | $\frac{1}{4}$ | $\frac{1}{4}$ | $\frac{1}{4}$

for each of these schemes it's possible to derive the modified equation, similarly as in the case of the up-wind scheme. these modified equations are listed in tab. 6.

table 6. modified equations for some explicit and implicit schemes.

scheme | modified equation
[u] | $u_t + a u_x = (1-\gamma)\,\frac{a\Delta x}{2}\,u_{xx}$
[d] | $u_t + a u_x = -(1+\gamma)\,\frac{a\Delta x}{2}\,u_{xx}$
[c] | $u_t + a u_x = -\gamma\,\frac{a\Delta x}{2}\,u_{xx}$
[lf] | $u_t + a u_x = \left(\frac{1}{\gamma}-\gamma\right)\frac{a\Delta x}{2}\,u_{xx}$
[lw] | $u_t + a u_x = -(1-\gamma^2)\,\frac{a\Delta x^2}{6}\,u_{xxx}$
[w] | $u_t + a u_x = -(2+3\gamma+\gamma^2)\,\frac{a\Delta x^2}{12}\,u_{xxx}$
[cn] | $u_t + a u_x = -(2+\gamma^2)\,\frac{a\Delta x^2}{12}\,u_{xxx}$

using the same procedure, a general modified equation (with terms up to third order) for the whole family of considered schemes can be obtained in the form

$$u_t + a u_x = \epsilon_2\,u_{xx} + \epsilon_3\,u_{xxx}\,.$$

this is formally an advection-diffusion-dispersion equation with the numerical diffusion coefficient $\epsilon_2$ and dispersion coefficient $\epsilon_3$. these coefficients can be derived analytically as functions of the blending weights $\alpha_1$–$\alpha_4$:

$$\epsilon_2 = -\frac{a\Delta x}{2}\left\{(\alpha_1-\alpha_2) + (\alpha_3-\alpha_4) + \gamma\left[(\alpha_1+\alpha_2)^2 - (\alpha_3+\alpha_4)^2\right]\right\}\,,$$

$$\epsilon_3 = -\frac{a\Delta x^2}{6}\left\{1 + 3\gamma\left[(\alpha_1^2-\alpha_2^2) - (\alpha_3^2-\alpha_4^2)\right] + 2\gamma^2\left[(\alpha_1+\alpha_2)^3 + (\alpha_3+\alpha_4)^3\right]\right\}\,.$$
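the two closed-form coefficients transcribe directly into code. the following sketch (illustrative, with assumed parameter values) evaluates $\epsilon_2$ and $\epsilon_3$ for arbitrary weights and reproduces, e.g., the crank-nicolson row of table 6:

```python
# direct transcription of the general ε2, ε3 formulas above; the [cn] weights
# (1/4, 1/4, 1/4, 1/4) should give ε2 = 0 and ε3 = -(2 + γ²) a Δx² / 12.
def eps23(a, dx, gamma, a1, a2, a3, a4):
    """numerical diffusion ε2 and dispersion ε3 for blending weights α1..α4
    (consistency requires a1 + a2 + a3 + a4 == 1)."""
    e2 = -0.5 * a * dx * ((a1 - a2) + (a3 - a4)
                          + gamma * ((a1 + a2) ** 2 - (a3 + a4) ** 2))
    e3 = -a * dx ** 2 / 6.0 * (1.0
                               + 3.0 * gamma * ((a1 ** 2 - a2 ** 2) - (a3 ** 2 - a4 ** 2))
                               + 2.0 * gamma ** 2 * ((a1 + a2) ** 3 + (a3 + a4) ** 3))
    return e2, e3

a, dx, gamma = 1.0, 0.1, 0.5
e2, e3 = eps23(a, dx, gamma, 0.25, 0.25, 0.25, 0.25)     # crank-nicolson [cn]
print(e2, e3, -(2.0 + gamma ** 2) * a * dx ** 2 / 12.0)  # e3 matches the last value
```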
The discrete numerical solution obtained using each scheme is marked by circles in the corresponding plots. While the sign of ε̃2 determines the stability of the numerical method, the size of ε̃2 is responsible for the amount of added numerical diffusion. The higher the coefficient, the more diffused (smeared) the solution will be. By comparing the modified equations of the Lax-Friedrichs and the up-wind schemes, it is clear that for a given CFL parameter γ the numerical viscosity coefficient ε̃2 = (1/γ − γ) of the [LF] scheme is always greater than the ε̃2 = (1−γ) of the [U] scheme. So the Lax-Friedrichs scheme will be more diffusive than the up-wind scheme. This is also well visible in the corresponding numerical solutions in Figs. 6 and 7.

Scheme   ε̃2          ε̃3                Behavior
[U]      1−γ          ⋯                 ε2 > 0, diffusive
[D]      −(1+γ)       ⋯                 ε2 < 0, unstable
[C]      −γ           ⋯                 ε2 < 0, unstable
[LF]     1/γ − γ      ⋯                 ε2 > 0, diffusive
[LW]     0            −(1−γ²)           ε3 ≠ 0, dispersive
[W]      0            −(2+3γ+γ²)/2      ε3 ≠ 0, dispersive
[CN]     0            −(2+γ²)/2         ε3 ≠ 0, dispersive

Table 7. Coefficients of numerical diffusion and dispersion for selected schemes.

Figure 7. Lax-Friedrichs [LF] scheme solution and modified equation u_t + a u_x = (1/γ − γ)(aΔx/2) u_xx.

5.2. Dispersive schemes
For some schemes the first-order diffusive term vanishes and the dominant role in the modified equation is played by the dispersive term ε̃3 (aΔx²/6) u_xxx. Due to this, such schemes are of second order and their behavior is dispersive. The numerical solution behaves much more like the solution of the advection-dispersion equation. It means that the advected initial data are decomposed into individual Fourier modes and each of them propagates at a different velocity, depending on the corresponding wave-number. This kind of behavior can be observed for the Lax-Wendroff and Wendroff schemes in Figs. 8 and 9, respectively. Again, for a given CFL parameter γ, the numerical dispersion coefficient ε̃3 = −(1−γ²) of the [LW] scheme is smaller (in absolute value) than the coefficient ε̃3 = −(2+3γ+γ²)/2 of the [W] scheme. This results in a less dispersive (less oscillatory) solution obtained using the Lax-Wendroff scheme.

Figure 8. Lax-Wendroff [LW] scheme solution and modified equation u_t + a u_x = −(1−γ²)(aΔx²/6) u_xxx.

Figure 9. Wendroff [W] scheme solution and modified equation u_t + a u_x = −(2+3γ+γ²)(aΔx²/12) u_xxx.

6. Limitations and extensions
In this paper the presentation was limited to the finite-difference approximation of linear advection, using schemes with a three-point stencil operating at two time levels. Most of these limiting assumptions can be removed or relaxed. A larger stencil and more time levels can be used to construct the scheme; this will only make the analysis of the scheme and the derivation of its modified equation more difficult (time consuming), however it can easily be done. The only important limitation is that the schemes should be linear in the sense that their coefficients can't depend on the solution. The non-linear case will be significantly more complicated, although some kind of local linearization (frozen coefficients) would probably give at least some basic information. Multidimensional schemes can be analyzed in a very similar way, at least on regular, non-distorted grids.
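Looking back at Section 4.2, the entries of Tabs. 6 and 7 can be reproduced mechanically from the general ε2, ε3 formulas. The following sketch does this symbolically; the scheme dictionary and the normalization against aΔx/2 and aΔx²/6 follow the paper, the helper itself is ours.

```python
import sympy as sp

g = sp.symbols('gamma', positive=True)   # CFL parameter gamma = a*dt/dx
half = sp.Rational(1, 2)

schemes = {   # (alpha1, alpha2, alpha3, alpha4) from Table 5
    '[U]':  (0, 1, 0, 0),
    '[D]':  (1, 0, 0, 0),
    '[C]':  (half, half, 0, 0),
    '[LF]': (half - 1/(2*g), half + 1/(2*g), 0, 0),
    '[LW]': (half - g/2, half + g/2, 0, 0),
    '[W]':  (half, 0, 0, half),
    '[CN]': (sp.Rational(1, 4),) * 4,
}

def coefficients(a1, a2, a3, a4):
    # eps2 = -(a dx/2) * e2t and eps3 = -(a dx^2/6) * e3t (Section 4.2),
    # so the dimensionless table entries are -e2t and -e3t.
    e2t = (a1 - a2) + (a3 - a4) + g*((a1 + a2)**2 - (a3 + a4)**2)
    e3t = 1 + 3*g*((a1**2 - a2**2) - (a3**2 - a4**2)) \
            + 2*g**2*((a1 + a2)**3 + (a3 + a4)**3)
    return sp.simplify(-e2t), sp.simplify(-e3t)

for name, alphas in schemes.items():
    e2, e3 = coefficients(*alphas)
    print(f'{name:5s} eps2_tilde = {e2}   eps3_tilde = {e3}')
```

For instance, the script returns ε̃2 = 1 − γ for [U], ε̃2 = 1/γ − γ for [LF], and ε̃3 = −(1 − γ²) for [LW], matching Tabs. 6 and 7; for the dispersive schemes the printed ε̃2 vanishes identically.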
For finite-volume schemes such an analysis can be performed on arbitrary grids, based on the expression and splitting of the numerical flux as a sum of the simple average central flux and a dissipative stabilizing part [5, 6]. Based on the extra dissipation, the properties of the scheme can be shown.

7. Remark on applications
Based on a detailed knowledge of the diffusive/dispersive behavior of numerical schemes, a suitable strategy can be chosen to improve their properties, namely the stability and resolution. The classical way is to modify the embedded numerical viscosity of the scheme. It can be done, for example, by the following approaches:

(1.) Modify the coefficients – as was shown in Section 3, each scheme can be written as a sum of the central (non-diffusive) part and an additional internal dissipative part (or, alternatively, a sum of down-wind and up-wind parts). The coefficient of internal numerical viscosity embedded in the scheme can be modified and adjusted by varying, e.g., the blending parameter α and balancing the amount of forward and backward differences in the approximation. For a modified Lax-Friedrichs scheme with reduced numerical viscosity see e.g. [7].

(2.) Build a composite scheme – two schemes are used, one with higher accuracy (typically dispersive) and one with lower accuracy (typically diffusive). During the time-stepping process, several steps are performed using the higher-order (but usually oscillatory) scheme, followed by a "smoothing" step performed using the lower-order diffusive scheme. The ratio of high/low order steps can be tuned to optimize the performance of the combined composite method. An example of this technique using a combination of the Lax-Wendroff and Lax-Friedrichs schemes can be found in [8].

(3.) Use artificial viscosity – the splitting of numerical methods into the central and viscous (diffusive) part offers the possibility to always use the (accurate but unstable) central scheme, followed by a separate stabilizing step applying an artificial viscosity term. In this way, the numerical viscosity is separated and no longer relies on the form embedded (hidden) in the numerical discretization. The separate, stand-alone numerical viscosity can be designed and tuned for the specific problem and best performance. Typically, second- and fourth-order damping terms are used, proportional to the second and fourth derivatives, ε2 u_xx and ε4 u_xxxx. More efficient non-linear numerical viscosities can be constructed easily by considering variable, solution-dependent numerical viscosity coefficients ε2(u), ε4(u). When these coefficients are kept small on the smooth solution and only locally increased where the solution has high gradients or is oscillatory, good resolution properties can be preserved while the stability and robustness of the method is improved. See e.g. [9] for an artificial viscosity example and application, or [10] for alternative filtering techniques.

8. Conclusions and remarks
The discrete analysis based on a decomposition of the scheme into a central and diffusive (or upwind) part was shown for a family of explicit schemes. A general modified equation was developed for a large family of explicit and implicit schemes. This led to a discussion concerning the diffusive and dispersive properties of numerical schemes. Most of the partial conclusions were already formulated in the above sections. Let's just point out again some key points.
• The numerical solution of the advection equation is "much closer" to the solution of the advection-diffusion or advection-dispersion equation, depending on which is the leading-order term in the discretization error.
• The behavior and quality of the numerical solution heavily depend on the coefficients of the modified equation. Their knowledge can help to assess a-priori the behavior of a numerical method.
• The detailed knowledge of the structure of the leading-order terms on the right-hand side of the modified equation can be used to construct "high(er) resolution" numerical methods.

Acknowledgements
T. Bodnár is grateful for the support provided by the European Regional Development Fund project "Center for Advanced Applied Science", No. CZ.02.1.01/0.0/0.0/16_019/0000778, and partly by the Czech Science Foundation under the grant No. P201-19-04243S.

References
[1] R. J. LeVeque. Finite difference methods for ordinary and partial differential equations: steady-state and time-dependent problems. SIAM, 2007.
[2] C. Hirsch. Numerical computation of internal and external flows, vol. 1, 2. John Wiley & Sons, 1988.
[3] C. A. J. Fletcher. Computational techniques for fluid dynamics, vol. 1–2 of Springer Series in Computational Physics. Springer-Verlag Berlin Heidelberg, 2nd edn., 1991.
[4] J. D. Anderson. Computational fluid dynamics: the basics with applications. McGraw-Hill, 1995.
[5] R. J. LeVeque. Numerical methods for conservation laws. Lectures in Mathematics. Birkhäuser Verlag, 1990.
[6] R. J. LeVeque. Finite volume methods for hyperbolic problems. Cambridge Texts in Applied Mathematics. Cambridge University Press, 2002.
[7] R. Dvořák, K. Kozel. Mathematical modeling in aerodynamics (in Czech). Vydavatelství ČVUT, 1996.
[8] R. Liska, B. Wendroff. Composite schemes for conservation laws. SIAM Journal on Numerical Analysis 35(6):2250–2271, 1998. doi:10.1137/S0036142996310976.
[9] T. Bodnár, L. Beneš, K. Kozel. Numerical simulation of flow over barriers in complex terrain. Il Nuovo Cimento C 31(5–6):619–632, 2008. doi:10.1393/ncc/i2008-10323-4.
[10] A. Sequeira, T. Bodnár. On the filtering of spurious oscillations in the numerical simulations of convection dominated problems. Vietnam Journal of Mathematics 47:851–864, 2019. doi:10.1007/s10013-019-00369-z.
Acta Polytechnica
DOI:10.14311/ap.2017.57.0038
Acta Polytechnica 57(1):38–48, 2017 © Czech Technical University in Prague, 2017
available online at http://ojs.cvut.cz/ojs/index.php/ap

Preliminary study on combustion and overall parameters of syngas fuel mixtures for spark ignition combustion engine

Rastislav Toman(a,b,*), Marián Polóni(a), Andrej Chríbik(a)

(a) Institute of Transport Technology and Design, Faculty of Mechanical Engineering, Slovak University of Technology in Bratislava, Námestie slobody 17, 812 31 Bratislava, Slovak Republic
(b) Department of Automotive, Combustion Engine and Railway Engineering, Faculty of Mechanical Engineering, Czech Technical University in Prague, Technická 4, 166 07 Prague 6, Czech Republic
(*) corresponding author: rastislav.toman@fs.cvut.cz

Abstract. This paper presents a numerical study on a group of alternative gaseous fuels – syngases – and their use in the spark-ignition internal combustion engine Lombardini LGW 702. These syngas fuel mixtures consist mainly of hydrogen and carbon monoxide, together with inert gases. An understanding of the impact of the syngas composition on the nature of the combustion process is essential for the improvement of the thermal efficiency of syngas-fuelled engines. The paper focuses on six different syngas mixtures, with natural gas as a reference. The introduction of the paper goes through some recent trends in the field of alternative gaseous fuels, followed by a discussion of the objectives of our work, together with the selection of mixtures. An important part of the paper is dedicated to the experimental and, above all, to the numerical methods. Two different simulation models are showcased: the single-cylinder 'closed-volume' combustion analysis model and the full-scale LGW 702 model, both prepared and tuned with the GT-Power software. Steady-state engine measurements are followed by the combustion analysis, which is undertaken to obtain the burn rate profiles. The burn rate profiles, in the form of the Vibe formula, are then inserted into the in-house developed empirical combustion model based on the Csallner-Woschni recalculation formulas. Its development is described here as well. The full-scale LGW 702 simulation model, together with this empirical combustion model, is used for the evaluation of the engine overall performance parameters when running on gaseous fuel mixtures. The analysis was done at engine full load and stoichiometric mixture conditions only.

Keywords: spark-ignition piston combustion engine; alternative gaseous fuel; syngas; combustion modelling.

1. Introduction
Syngas, or 'synthesis gas', is a summarizing name for a gaseous fuel mixture containing combustible components and a relatively high inert gas content. The main combustible components of these fuels are always hydrogen (H2) and carbon monoxide (CO).
Methane (CH4), carbon dioxide (CO2) or nitrogen (N2), and other minor hydrocarbon constituents (ethane, propane, butane, etc.) can then constitute the rest of the fuel mixture. The syngas is produced from various feedstocks, for example coal, biomass, organic waste, tar or natural gas, by a variety of production processes such as gasification or pyrolysis [1–4]. The exact composition of a syngas is then given by its production process and the feedstock used. The resulting syngas mixture is a solid alternative fuel in the field of power generation by cogeneration units.

The topic of alternative gaseous fuels for use in cogeneration units has been of growing interest in recent years. Extensive research has been done to make spark-ignition combustion engines fuelled by natural gas (NG) competitive with diesel engines, with a special focus on exhaust gas emissions. Experimental studies dealing with the effects of the exact NG composition, lean-burn conditions, different EGR levels, and the use of 3-way catalytic converters on the overall engine parameters and emissions were presented in [5–7]. More recent experimental studies focus also on the combustion characteristics and emission effects of different dilution gases on the NG engine under constant-conditions (load, speed and spark advance) stoichiometric operation [8]. Nevertheless, the general trend is to use the experiments for model validation purposes, and then to employ the model for a numerical investigation and optimization of the ICE running in the cogeneration unit [9]. Regarding the syngas mixtures, the impact of the syngas composition on the combustion characteristics in the ICE [10], or in more detail on the laminar and turbulent flame velocities [11], is prominent. Apart from these experimental and numerical studies, the authors in [12] presented a combined experimental, numerical, and theoretical study on the engine performance running on four gaseous mixtures: two different syngas mixtures in comparison to NG and hydrogen. After the measurements, a combustion analysis was undertaken, with crank-angle-resolved in-cylinder turbulence and flame propagation plotted onto a so-called 'Bradley diagram', to explain the variation of the engine performance parameters.

Parameter    Unit        NG      SG6     SG18    SG21    SG22    SG23    SG31
CH4          [vol%]      96.033  20      0       0       10      0       10
C2H6         [vol%]      2.036   0       0       0       0       0       0
C3H8         [vol%]      0.599   0       0       0       0       0       0
nC4H10       [vol%]      0.196   0       0       0       0       0       0
nC5H12       [vol%]      0.001   0       0       0       0       0       0
H2           [vol%]      0       20      10      20      20      30      25
CO           [vol%]      0       30      30      20      10      10      20
CO2          [vol%]      0.283   25      10      10      10      10      20
N2           [vol%]      0.794   5       50      50      50      50      25
LHV          [kJ/kg]     48796   12027   4043    4298    6762    4620    8385
A/F ratio    [kg/kg]     16.96   3.66    1.01    1.12    2.08    1.26    2.49
Molar mass   [kg/kmol]   16.74   24.42   27.01   24.41   23.22   21.81   23.52
Density      [kg/m3]     0.696   0.998   1.104   0.998   0.949   0.892   0.961
H/C          [1]         3.91    1.6     0.5     1.33    2.67    3       1.8
O/C          [1]         0.01    1.07    1.25    1.33    1       1.5     1.2

Table 2. Basic physical and chemical properties of the used fuel mixtures.

             Unit        Group 1   Group 2   Group 3
CH4          [vol%]      0–50      0–20      0–20
H2           [vol%]      10–50     10–30     5–45
CO           [vol%]      20–40     10–30     10–50
CO2          [vol%]      25        10        20
N2           [vol%]      5         50        25

Table 1. Range of syngas groups composed with various inert gas concentrations (in vol%).
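The molar masses in Table 2 follow directly from the volumetric compositions (volume fractions equal mole fractions for ideal gases). The following is a quick cross-check sketch; the density line assumes ideal-gas behaviour at 25 °C and 101.325 kPa, which is our own assumption since Table 2 does not state its reference state.

```python
# Molar masses of the pure species [kg/kmol]
M = {'CH4': 16.043, 'C2H6': 30.070, 'C3H8': 44.097, 'nC4H10': 58.123,
     'nC5H12': 72.150, 'H2': 2.016, 'CO': 28.010, 'CO2': 44.010, 'N2': 28.013}

mixtures = {   # vol% from Table 2
    'NG':   {'CH4': 96.033, 'C2H6': 2.036, 'C3H8': 0.599, 'nC4H10': 0.196,
             'nC5H12': 0.001, 'CO2': 0.283, 'N2': 0.794},
    'SG6':  {'CH4': 20, 'H2': 20, 'CO': 30, 'CO2': 25, 'N2': 5},
    'SG18': {'H2': 10, 'CO': 30, 'CO2': 10, 'N2': 50},
    'SG21': {'H2': 20, 'CO': 20, 'CO2': 10, 'N2': 50},
    'SG22': {'CH4': 10, 'H2': 20, 'CO': 10, 'CO2': 10, 'N2': 50},
    'SG23': {'H2': 30, 'CO': 10, 'CO2': 10, 'N2': 50},
    'SG31': {'CH4': 10, 'H2': 25, 'CO': 20, 'CO2': 20, 'N2': 25},
}

R, T, p = 8.314, 298.15, 101325.0   # assumed reference state: 25 degC, 1 atm
for name, comp in mixtures.items():
    m = sum(M[s] * f / 100.0 for s, f in comp.items())   # mixture kg/kmol
    rho = p * (m / 1000.0) / (R * T)                     # ideal-gas kg/m3
    print(f'{name}: M = {m:.2f} kg/kmol, rho = {rho:.3f} kg/m3')
```

Running the sketch reproduces the Table 2 molar masses (e.g. 24.42 kg/kmol for SG6 and 27.01 kg/kmol for SG18); the densities agree only approximately, since the reference state used in the paper is not given.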
Our own previous work in the facility of the Slovak University of Technology in Bratislava was mostly focused on the experimental research of NG-based fuels, such as NG mixtures with hydrogen [13] or CO2 [14] – H2NG and CO2NG. However, the literature on alternative gaseous fuels presented here shows many opportunities for simulation too. The principal opportunity is that a well-tuned numerical model allows for an extensive optimization of ICE parameters, for example ignition or valve timing, as was the case in [9]. A numerical model may also guide the experiments. Another possibility is to employ advanced predictive combustion models to model and study the ICE combustion process for different fuel mixtures [12]. Finally, the simulation tools also shorten the experimental development loops and subsequently reduce the time and costs of the research and of the real-life application of alternative fuels in cogeneration units. Recapitulating all the discussed benefits, we have decided to enhance our research activities with simulation tools.

There are two main objectives of this paper. The first objective is the preliminary analysis of the main combustion parameters for the different syngas mixtures, with a GT-Power combustion analysis model. This combustion analysis model processes the measured in-cylinder pressure data together with other important measured quantities. The second objective of this paper is to get an estimation, by simulation, of the steady-state performance parameters of the engine powered by different stoichiometric syngas mixtures under full load and different engine speeds (1200 to 2500 rpm sweep). To meet this second objective, a GT-Power full-scale simulation model of the Lombardini LGW 702 engine was prepared and tuned, with a proper in-house combustion model.

1.1. Selected fuel mixtures
In the early phases of our research on syngas mixtures, we determined thirty-five various syngas mixtures, based on the knowledge of real-life syngas produced industrially, with the aim to cover the widest range of possible mixture compositions. For a better classification of these compositions, the mixtures are sorted into three different groups by the inert gas concentration: with 30, 45, and 60 volume percent of the inert gas. The rest are combustible components (Table 1). Group 1 contains seventeen various syngas mixtures with a relatively small amount of inert gases. Group 2 contains only six mixtures, and Group 3 twelve different mixtures.

For the preliminary analysis, only six of the total number of thirty-five various syngas mixtures are presented in this paper: one syngas from Group 1 and one from Group 3 as a representative sample. From Group 1 it is syngas 6 (further labelled SG6), and from Group 3 the SG31. In this paper, the Group 2 syngas mixtures are of special interest, due to their very high inert gas content. Therefore, four of them are covered: the SG18, the SG21, the SG22, and the SG23. The basic physical and chemical properties of these six fuel mixtures are summarised in Table 2. In our experimental and simulation study of syngas mixtures, we use the NG distributed in the Slovak Republic (96 % of methane) as a reference fuel mixture, with the real composition used during the measurements.

Parameter               Value
Bore                    76 mm
Stroke                  77.6 mm
Cylinders               2; in line
Engine displacement     704 cm3
Compression ratio       12 : 1
Valves per cylinder     2
IVO                     340 °CA
IVC                     599 °CA
EVO                     147 °CA
EVC                     386 °CA

Table 3. Tested engine main dimensions.
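As a small consistency sketch for Table 3, the displacement follows from the bore, stroke and cylinder count; the mean piston speed at the 1500 rpm test speed used later is our own derived quantity, added for orientation.

```python
import math

bore, stroke, n_cyl = 0.076, 0.0776, 2          # Table 3 values [m], [-]
v_disp = n_cyl * math.pi / 4 * bore**2 * stroke # total displacement [m3]
print(f'displacement = {v_disp * 1e6:.0f} cm3') # -> 704 cm3, as in Table 3
print(f'mean piston speed @ 1500 rpm = {2 * stroke * 1500 / 60:.2f} m/s')
```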
A high inert gas content in syngas mixtures lowers the LHV, which should then lead to lower maximum pressures and temperatures, lower overall performance parameters, and a higher fuel consumption. Inert gases slow down the burning process, whereas hydrogen does the opposite. Thus, a different burn duration is expected compared to the NG. The best performance properties should be reached by the SG6, because of its high LHV compared to the other syngas mixtures.

2. Experimental methods
All the experimental data were acquired on a steady-state test bench in the facility of STU Bratislava, on a small, two-cylinder, water-cooled, naturally-aspirated, spark-ignition engine Lombardini LGW 702 [15]. This engine was rebuilt from the compression-ignition Lombardini variant LDW 702. Thus, it had to be fitted with an ignition system instead of fuel injectors, and with new lower-compression-ratio pistons. The combustion chamber has a simple geometry with a flat cylinder head and a centred piston bowl. The main geometrical parameters of the LGW 702 are listed in Table 3 (the cam timing zero reference point is the firing TDC). The fuel mixture is prepared in a diffuser-type mixer. Our experimental test bench is equipped with an acquisition system measuring the engine torque and speed, the overall pressures and temperatures, and the air and fuel flows, together with the exhaust gas composition. The ICE is then equipped with one spark-plug-integrated piezoelectric pressure transducer for the in-cylinder pressure measurement, together with a crank angle encoder. Currently, we measure the pressure with a constant frequency of 66 kHz and the crank angle encoder only determines the TDC. In future measurements, a full 'three-pressure-analysis' (TPA) capability will be introduced into the test bench, together with a more precise crank-angle-related pressure measurement. All fuel mixtures in this study were experimentally tested under full load and a constant engine speed of 1500 rpm. The spark timing of 27 °CA BTDC and the stoichiometric air-to-fuel ratio (λ = 1) were kept constant, the latter due to the use of the 3-way catalytic converter.

3. Numerical methods
3.1. Combustion analysis
The combustion of a fuel mixture in a combustion engine is best understood by studying the burn rate profile. A burn rate profile can be obtained from a measured in-cylinder pressure by a so-called 'reverse run' simulation method. In a 'reverse run' method, the in-cylinder pressure is an input for the analysis; the burn rate is then an output. In a normal simulation, the 'forward run', it is vice versa. These methods are also implemented in the GT-Power simulation tool. A great advantage of using GT-Power is that both forward and reverse runs use the same approach of two-zone combustion modelling, without any simplifying assumptions on the side of thermodynamics or chemistry [16]. Then, in GT-Power, there are two approaches of how to use the 'reverse run' cylinder analysis. The first one, the 'stand-alone burn rate calculation' (1CYL), uses only an in-cylinder measured pressure, together with a few basic cycle-averaged results (such as the volumetric efficiency), the engine cylinder geometry, the cylinder wall temperatures, a heat transfer model, and initial conditions. This model is not connected to any upstream or downstream 1D flow components; therefore, these initial conditions represent the trapped conditions at the IVC and need to be estimated from measurement. The second approach is the already mentioned TPA, which also requires measured intake and exhaust pressures.
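Whichever variant is used, the core of such a 'reverse run' is an energy balance over the closed part of the cycle. The sketch below shows the textbook single-zone apparent (net) heat-release formulation, dQ = γ/(γ−1) p dV + 1/(γ−1) V dp, not GT-Power's two-zone model; the bore, stroke and compression ratio are the Table 3 values, while the connecting-rod length is our own assumption, as the paper does not give it.

```python
import numpy as np

bore, stroke, cr = 0.076, 0.0776, 12.0   # Table 3 (two cylinders give 704 cm3)
conrod = 0.120                           # m, assumed; not given in the paper
v_disp = np.pi / 4 * bore**2 * stroke    # single-cylinder displacement
v_clear = v_disp / (cr - 1.0)            # clearance volume

def cylinder_volume(theta_deg):
    """Instantaneous volume from slider-crank kinematics, TDC at 0 deg."""
    th = np.radians(theta_deg)
    r = stroke / 2.0
    s = r * np.cos(th) + np.sqrt(conrod**2 - (r * np.sin(th))**2)
    return v_clear + np.pi / 4 * bore**2 * (conrod + r - s)

def heat_release(theta_deg, p, gamma=1.3):
    """Apparent heat-release rate dQ/dtheta [J/deg] from pressure p [Pa]."""
    v = cylinder_volume(theta_deg)
    dv = np.gradient(v, theta_deg)
    dp = np.gradient(p, theta_deg)
    return gamma / (gamma - 1.0) * p * dv + 1.0 / (gamma - 1.0) * v * dp

# usage idea: theta = np.arange(-60.0, 60.0, 0.5); q = heat_release(theta, p_meas)
```

Integrating and normalising q gives a burn-rate profile comparable, in principle, to the 1CYL output, although without the two-zone thermodynamics and heat-loss treatment of the GT-Power analysis.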
The authors in [17] show that the TPA approach is more accurate, since there is no need for parameter estimation: the trapped conditions at the IVC are directly simulated from the 1D flow components. Since the TPA is not available in our research facility, we use the simple 1CYL approach, and for the initialization of the closed-volume cycle we use the trapped quantities from a full-scale engine model. In general, an error in the calculation of the burn rate from the measured in-cylinder pressure is always present. There are potential errors in the in-cylinder pressure measurement, in the estimation of the initialization inputs, or also from other sub-models, such as the one for the in-cylinder heat transfer coefficient calculation. In every combustion analysis run, GT-Power reports some automatic 'consistency checks' that help to identify the source of the error. These 'consistency checks' are listed in [16], and we used them during our analysis to compensate for the lack of the TPA and for inaccuracies in our measurement. The most important of the 'consistency checks' is the 'fuel energy (LHV) multiplier' (LHV multiplier). When the LHV multiplier value differs from unity, it means that the input energy in the simulation system is different from the one needed to follow the measured in-cylinder pressure curve.

3.2. Combustion modelling
One of the objectives of this study is to estimate correctly the engine performance parameters under different engine speeds. But the output of the combustion analysis – a burn rate profile – cannot be universally used for different engine speeds, since the profile is deformed substantially with changed operating conditions [18]. Various possibilities of either empirical or phenomenological combustion models can be used inside GT-Power. The empirical models proposed in [18, 19] are based on an extensive measurement database. The authors used simple recalculation multipliers of the significant burn points, dependent on the engine speed, load, spark advance, λ, residual gas fraction, intake pressure, and temperature. Another option is GT-Power's predictive phenomenological turbulent combustion model EngCylCombSITurb (or SITurb), based on [20, 21]. Both the empirical and phenomenological options need a number of accurately measured engine operating points. The advantage of the empirical combustion models is that they are easy to implement, but their downside is that they work well only inside the tuned operating range. The SITurb model is even more demanding in terms of the tuning process and input data, requiring the swirl and tumble characteristics of the intake valves, together with the complete combustion chamber geometry. However, the SITurb can then be used also for a reasonable combustion prediction outside of the tuned operating range. This was shown in [22] for the gasoline fuel. The SITurb was likewise successfully used for syngas combustion modelling in [12]. In our study, we have decided to build an empirical combustion model (empirical model) by combining the empirical Vibe formula model EngCylCombSIWiebe [23],

x_b = 1 - \exp\left[-a\left(\frac{\alpha - \alpha_0}{\Delta\alpha}\right)^{m+1}\right] ,  (1)

with the recalculation formulas proposed by Csallner and Woschni [24]. The Csallner-Woschni formulas incorporate the effect of a different engine operating point (engine speed, load, λ, spark advance, and residual gas fraction) on the combustion characteristics.
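Equation (1) is straightforward to implement. In the sketch below, the efficiency parameter a = 6.908 (which gives x_b = 0.999 at α0 + Δα) and the read-out of the significant burn points are our own conventions, not values from the paper.

```python
import numpy as np

def vibe_mfb(theta, theta0, dtheta, m, a=6.908):
    """Mass fraction burned per eq. (1); a = 6.908 = -ln(0.001)."""
    xrel = np.clip((theta - theta0) / dtheta, 0.0, None)
    return 1.0 - np.exp(-a * xrel**(m + 1.0))

# usage with illustrative (made-up) parameters:
theta = np.linspace(-40.0, 60.0, 2001)                 # crank angle [deg]
xb = vibe_mfb(theta, theta0=-10.0, dtheta=50.0, m=2.0)
ca10, ca50, ca90 = (theta[np.searchsorted(xb, q)] for q in (0.1, 0.5, 0.9))
print(f'CA10 = {ca10:.1f}, CA50 = {ca50:.1f}, CA90 = {ca90:.1f} deg')
```

Shifting θ0 or rescaling Δθ is exactly the kind of adjustment the Csallner-Woschni recalculation applies when moving to a new operating point.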
The Vibe formula requires three important parameters to be set: the burn duration (10–90 % MFB), the Vibe shape exponent m in (1), and the CA50, i.e. the crank angle at 50 % of the mass of fuel burned (see Figure 1 for a comparison of three different gas mixtures at full load, λ = 1 and 1500 rpm). In our empirical combustion model, the Vibe formula parameters – the outputs from a single-operating-point combustion analysis – are automatically adjusted by the Csallner-Woschni formulas to obtain the values for a new steady-state operating point. Regarding the Vibe shape coefficient, we assume its value to be constant for a particular syngas over the whole operating range. This simplification is also justified and used by the author of [25], who states that a change in the shape coefficient does not affect the overall result. It is worth emphasising that the Csallner-Woschni recalculation formulas have not been verified for this preliminary analysis. However, they are necessary for our future simulation activities.

Figure 1. Vibe curves and significant burning points (CA10, CA50, CA90) for different fuel mixtures at constant spark advance and engine speed.

3.3. Full scale engine simulation
The Lombardini LGW 702 GT-Power full-scale (FULL) model includes all relevant engine cylinder and crank-train dimensions. The FULL model further includes the measured engine-speed-dependent friction losses of the whole LGW 702 assembly and the measured necessary intake/exhaust flow coefficients. The empirical combustion model from the previous section is included in the FULL model. The in-cylinder heat transfer coefficient calculation is carried out by the WoschniGT model [16], which closely emulates the classical Woschni correlation without swirl, as described in [26].

The base FULL model was tuned after the initial modelling. The tuning process was split into two steps. We focused the first step on the engine integral parameters: brake torque, BSFC, and volumetric efficiency. The FULL model results were adjusted to the set of NG full-load steady-state measurement data [27], which does not include the in-cylinder pressure curves. There was a small uncertainty about the exact NG composition (arising from a possible grid gas composition variation, which is controlled by a public regulation), but the spark timing and λ were known for each operating point. Other combustion model parameters had to be reasonably estimated. Some of the component flow and heat transfer characteristics were manually adjusted to get an overall agreement with the measured data (especially the volumetric efficiency). Our requirement was to reach error values lower than ±5 %. This requirement was fulfilled (Figure 2).

Minimized target function x_k     Weight factor α_k    Optimal solution
Average pressure difference       0.35                 0.179
Maximum pressure difference       0.15                 0.224
Average LHV multiplier error      0.35                 0.005
Maximum LHV multiplier error      0.15                 0.008

Table 4. Criterial function weight factors from equation (3) and optimal solution values.

Figure 2. Errors on brake torque, BSFC and volumetric efficiency prediction after the first-step calibration (simulation − measurement).
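For orientation, the classical Woschni correlation that the WoschniGT object emulates has, in Heywood's formulation, the form h = 3.26 B^(−0.2) p^0.8 T^(−0.55) w^0.8 (p in kPa, T in K, B in m, w in m/s). The sketch below uses the published textbook coefficients for the combustion/expansion period; GT-Power's internal variant differs in its details, so this is only an assumption-laden illustration, and all the numeric inputs in the usage line are made up.

```python
def woschni_htc(p_kpa, t_k, w_ms, bore_m=0.076):
    """In-cylinder heat transfer coefficient [W/m2K], classical Woschni form."""
    return 3.26 * bore_m**-0.2 * p_kpa**0.8 * t_k**-0.55 * w_ms**0.8

def char_velocity(sp_mean, p, p_motored, t_ref, p_ref, v_ref, v_disp):
    """Characteristic gas velocity during combustion [m/s]:
    w = C1*Sp + C2*(V_d*T_ref)/(p_ref*V_ref)*(p - p_mot), C1 = 2.28, C2 = 3.24e-3."""
    return 2.28 * sp_mean + 3.24e-3 * v_disp * t_ref / (p_ref * v_ref) * (p - p_motored)

# orientation only (placeholder mid-combustion state at 1500 rpm):
sp = 2 * 0.0776 * 1500 / 60                              # mean piston speed [m/s]
w = char_velocity(sp, 5e6, 2e6, 320.0, 1e5, 3.8e-4, 3.5e-4)
print(f'h = {woschni_htc(5000.0, 2000.0, w):.0f} W/m2K')
```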
The second FULL model tuning step was focused on the in-cylinder quantities, mainly on the agreement of the in-cylinder pressure curve with the measurement and on the already mentioned LHV multiplier. We used a set of seven full-load, λ = 1 operating points (engine speeds 1200–2200 rpm), with pure methane as a fuel. The methane properties are well known and were inserted into our model, together with the measured in-cylinder pressure. Then, since measurement errors and uncertainties are always present, we appointed three main possible sources:
• the effective compression ratio, which can differ from the geometrical compression ratio due to blow-by, a wrong geometry estimation, or material wear;
• the convection multiplier from the heat transfer model, which is generally not unity [16] and needs to be adjusted;
• the TDC error parameter, which introduces a possible inaccuracy in the measurement TDC determination.

These three tuning parameters can be tuned manually or, as in our case, coupled to external optimization software. We performed this multi-criterial optimization task using the genetic algorithm NSGA-II [28]. As the optimization criteria, the four target functions from Table 4 were minimized. The pressure difference is evaluated by a simple formula (2) that compares the measured and simulated pressure curves:

\Delta p = \frac{1}{\alpha_{end} - \alpha_{start}} \int_{\alpha_{start}}^{\alpha_{end}} |p_{meas} - p_{sim}| \, d\alpha .  (2)

Other quantities, for instance the mixture air excess λ, were assumed to be correctly determined and were therefore not included in the multi-criterial optimization.

The Pareto frontier result set was then processed by a criterial function (3). The fraction x_k / x_{k,max} represents the normalization, so that different functions x_k can be combined into a single equation:

f = \sum_{k=1}^{n} \alpha_k \frac{x_k}{x_{k,max}} .  (3)

Table 4 also shows the chosen weight factors α_k of the four target functions and the values of the 'optimal solution'. The 'optimal solution' determined the values of the three tuning parameters. Figure 3 displays the comparison of the measured and simulated pressure curves of two operating points (methane; full load; λ = 1; 1200 rpm and 2200 rpm) of the 'optimal solution'. In addition, some different weight factors were tested. However, if the ratio of the weight factors between the average and maximum errors is kept, the final 'optimal solution' is not affected by the weight factor change.

Figure 3. Comparison of measured and simulated pressure curves for the optimal solution (methane, full load, 1200 and 2200 rpm).

Figure 4. Schematic simulation procedure for a new syngas fuel mixture.

Verification function             Syngas set
Average pressure difference       0.308
Maximum pressure difference       0.594
Average LHV multiplier error      0.022
Maximum LHV multiplier error      0.033

Table 5. Verification parameters from the syngas combustion analysis.

3.4. Simulation procedure
The general simulation procedure for a new syngas mixture (or another gaseous mixture) is schematically represented in Figure 4. This procedure starts with a combustion analysis of the measured pressure data (p_cyl). A simulation loop between the 1CYL combustion analysis model and the FULL model has to be iterated until the combustion analysis results converge. This is because the 1CYL model needs the initialization parameters from the FULL model simulation, and these are affected by the combustion model parameters.
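Returning briefly to the calibration step, the metrics of eqs. (2) and (3) reduce to a few lines of code. In this sketch the trapezoidal quadrature is our own choice, and the default weights follow Table 4.

```python
import numpy as np

def pressure_difference(alpha_deg, p_meas, p_sim):
    """Eq. (2): mean |p_meas - p_sim| over the crank-angle window."""
    span = alpha_deg[-1] - alpha_deg[0]
    return np.trapz(np.abs(p_meas - p_sim), alpha_deg) / span

def criterial_function(x, x_max, weights=(0.35, 0.15, 0.35, 0.15)):
    """Eq. (3): f = sum_k alpha_k * x_k / x_k,max over the normalised targets."""
    return sum(w * xk / xm for w, xk, xm in zip(weights, x, x_max))
```

In an optimization loop, pressure_difference would be evaluated per operating point (giving the average and maximum entries of Table 4), and criterial_function would then rank the Pareto-front candidates.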
It normally takes two to three manual iterations to reach the same result of the combustion analysis. The FULL model overall results always have to be compared against the measurements, and also against the 1CYL simulated in-cylinder pressure and burn rate, to make sure everything is set up correctly. After that, the combustion analysis results are ready to be used as an input for the empirical combustion model. With the empirical combustion model extended by the data from the new syngas mixture, the simulation or optimization of new engine operating points on the FULL model can start.

We used this particular procedure in the presented project. The empirical combustion model was extended with the combustion analysis results and then used for the simulation of the whole engine speed sweep (1200 to 2500 rpm), at full load and λ = 1. We carried out the spark-timing optimization for each operating point of all simulated mixtures, in accordance with the aim of this project, which is to get an insight into the maximum possible overall engine parameters.

Figure 5. Burn duration 10–90 % and SI Vibe burn exponent comparison of the 6 SGs with the NG fuel for the LGW 702 engine single-cylinder combustion analysis.

4. Simulation results
4.1. Combustion analysis results
The syngas (and the NG reference) measurement set was measured right after the methane set, which was used for the model tuning. Therefore, the same TDC error correction and other parameters were also used in the syngas combustion analysis. Table 5 shows the verification functions from the 1CYL combustion analysis on the syngas set. The values in Table 5 are acceptable, although higher than those from Table 4 for the originally tuned methane set. This is probably caused by the relatively small set (seven operating points) that was used for the multi-criterial optimization, or by other uncertainties in the measurement.

The 1CYL combustion analysis determined the values of the Vibe shape exponent and the burn duration 10–90 % for each mixture (Figure 5). The values for the NG were checked against past research results and match them well. Nevertheless, the individual impact of the fuel mixture components and their interactions in different mixtures has to be studied separately by measurement and combustion analysis, which is not of interest in this paper. In the case of the analysis of the 6 different syngas compositions and natural gas, it is very hard to conclude the attribution of the individual mixture components, as these can cancel each other out. The author in [19] shows a similar trend: the addition of H2 speeds up the burning process and the inert gases N2 and CO2 have the opposite effect. The impact of CO2 is greater than the impact of N2 [8]. From the combustion analysis results (Figure 5), the effect of the H2/CH4 ratio, with the other mixture components constant, is visible for the SG22 and SG23: more H2 content over CH4 leads to a faster burn. The same can be said of H2/CO for the SG18 and SG21: again, more H2 content leads to a faster burn.

Figure 6. Optimized CA50 position comparison of the 6 SGs with the NG fuel for the LGW 702 engine.

Figure 7. Burn duration 10–90 % MFB comparison of the 6 SGs with the NG fuel for the LGW 702 engine.
4.2. Full model results
The final FULL model simulations of the fuel mixtures SG6, SG18, SG21, SG22, SG23, SG31, and the NG reference on the LGW 702 engine were carried out in a range of engine speeds from 1200 to 2500 rpm, at full load, λ = 1, and optimal CA50. The optimized CA50 position for each fuel mixture is shown in Figure 6. For higher engine speeds, the position of the combustion centre shifts slightly towards the TDC. The empirical combustion model automatically adjusts the burn duration for each mixture (Figure 7), prolonging the burn duration with growing engine speed. The values at 1500 rpm match the trends from Figure 5: the SG21 and SG23 show a faster burn in the whole rpm region; the SG18, SG22, and SG31 a slower burn; and the SG6 with the NG are balanced in the middle.

Figure 8 shows the comparison of the IMEP. As expected, the IMEP values for all the SGs are, in general, below the NG: for instance, for the SG6 the difference is approximately 1.2 bar, but for the Group 2 SG18, SG21, and SG23 with the lowest LHV it is around 3 bar, so more than 30 %. The obvious increase of the ISFC for the syngas mixtures compared to the NG is displayed in Figure 9. The values of the ISFC for the NG are below 200 g/kWh; for the SG22 it is around 285 g/kWh (+45 %); for the SG21 already 535 g/kWh (+180 %); and the most extreme case is the SG18, with a difference of more than +400 % compared to the NG.

Figure 8. IMEP comparison of the 6 SGs with the NG fuel for the LGW 702 engine.

Figure 9. ISFC comparison of the 6 SGs with the NG fuel for the LGW 702 engine.

Figure 10. Indicated efficiency comparison of the 6 SGs with the NG fuel for the LGW 702 engine.

Figure 11. Brake torque comparison of the 6 SGs with the NG fuel for the LGW 702 engine.

The figures for the IMEP and the ISFC (Figures 8 and 9) show the indicated values that are directly derived from the engine's p-V diagram. The IMEP and the ISFC are independent of the ICE operating losses (friction, etc.); therefore, they can be considered an appropriate relative measure of the engine's overall operation parameters. The indicated efficiency (indicated work per cycle over the input energy in fuel [16]) itself is shown in Figure 10. In general, the indicated efficiency differences between the mixtures are not significant. Figure 11, with the brake torque output, illustrates what torque values can be expected during experimental testing on the engine test bench. Here, the impact of the fuel LHV parameter on the brake torque output is visible: the low-LHV syngas mixtures also show a low brake torque output. Regarding the cycle data, there are three important values: the maximum cycle pressure (Figure 12), the maximum cycle temperature (Figure 13), and the maximum cycle pressure crank angle position (Figure 14).
These three figures also show the importance of the fuel LHV parameter: the low-LHV syngas mixtures show a low maximum cycle pressure and a lower maximum cycle temperature; with a higher fuel LHV it is vice versa. For the NOx production it is crucial that, for the SG fuels, the maximum cycle temperature values are lower than for the NG (Figure 13). Thus, a lower tendency to produce NOx is expected. Finally, the maximum cycle pressure position (Figure 14) shows the effect of the CA50 optimization and of the different burn duration of each fuel mixture: the values shift towards the TDC with growing engine speed, according to the optimized CA50 position.

Figure 12. Maximum cycle pressure comparison of the 6 SGs with the NG fuel for the LGW 702 engine.

Figure 13. Maximum cycle temperature comparison of the 6 SGs with the NG fuel for the LGW 702 engine.

Figure 14. Maximum cycle pressure position comparison of the 6 SGs with the NG fuel for the LGW 702 engine.

5. Conclusions
This paper presented a preliminary simulation study of the engine performance and combustion parameters when running on six different syngas mixtures, compared to natural gas. A great part of the paper was focused on the numerical methods used for this preliminary analysis. Two different models – a closed-volume combustion analysis 1CYL model and a FULL model – were prepared and used, with a special emphasis on accurate combustion modelling, which was done with an in-house empirical model based on the Vibe and Csallner-Woschni formulas. This methodology expands our previous experimental work and is in line with the current research trends on alternative fuels.

The combustion analysis of the measured data using the 1CYL model showed some dependencies of the burn duration and the burn profile shape on the mixture composition:
• hydrogen content in the fuel mixture affects the burn duration positively, speeding up the burning process; the inert gases act in the opposite way, slowing the process down;
• for the Group 2 mixtures, the effect of the H2/CH4 (the SG22 and SG23) and H2/CO (the SG18 and SG21) ratios on the burn duration is evident: more hydrogen means a faster burn;
• however, in order to get more accurate conclusions on the burn duration trade-offs or the Vibe shape exponent, it is necessary to carry out a more detailed study on the individual mixture components.

The results from the combustion analysis were subsequently used in the empirical combustion model for the simulation of the LGW 702 operation on various syngas mixture compositions. To make use of the simulation model capabilities, we carried out the CA50 position optimization for each and every engine operating point.
The obtained results from the simulation can be summarized in the following conclusions:
• all of the syngas mixtures show deteriorated overall performance and economy parameters compared to the NG, due to the high inert gas content and low LHV;
• the most favourable results are shown by the Group 1 syngas SG6, which also has the lowest inert gas content; the worst results are for Group 2, the syngases SG18, SG21, SG22 and SG23;
• regarding the IMEP values, the SG6 shows an 11 % drop relative to the NG and the SG23 approximately a 30 % drop relative to the NG values; the ISFC difference varies from +45 % for the SG22 to +400 % for the SG18, compared to the NG;
• a favourable result for all syngas mixtures is the decrease of the maximum cycle temperature, which leads to a lower NOx production.

All of these results are promising, but our future work should focus on the refinement of our simulation models, and especially on the verification of the empirical combustion model. Generally speaking, this study provides a framework for future studies of the performance parameters of alternative fuels. The methods presented here will be applied to the whole range of thirty-five syngas mixtures; the measurements of those are already underway.

List of symbols
a      Vibe duration coefficient [1]
m      Vibe shape coefficient [1]
p      pressure [bar]
α      crankshaft angle [deg]
λ      air excess [1]
CA50   50 % fuel burned [CA deg]
T      temperature [°C, K]
MFB    mass fraction burned [1]
IMEP   indicated mean effective pressure [bar]
ISFC   indicated specific fuel consumption [g/kWh]
f      criterial function [1]
α_k    weight factor [1]
x_k    minimized function [1]
LHV    lower heating value [kJ/kg]

Acronyms
1CYL   single-cylinder combustion analysis model
FULL   full-scale engine model
EVC    exhaust valve closes
EVO    exhaust valve opens
IVC    inlet valve closes
IVO    inlet valve opens
SI     spark ignition engine
TDC    top dead center
NG     natural gas
SG     syngas
CH4    methane
H2     molecular hydrogen
N2     molecular nitrogen
O2     molecular oxygen
CO     carbon monoxide
CO2    carbon dioxide
NOx    mono-nitrogen oxides (NO and NO2)

Acknowledgements
This work was supported by the Slovak Research and Development Agency under contract No. APVV-0015-12 and contract No. APVV-14-0399, and was also supported by the Scientific Grant Agency under contract No. VEGA 1/0017/14.

References
[1] Jílková, L., Ciahotný, K., Kusý, J., "Pyrolysis of brown coal using a catalyst based on W-Ni," Acta Polytechnica 55(5):319–323, 2015, doi:10.14311/ap.2015.55.0319
[2] Lisý, M., Baláš, M., Špiláček, M., Skála, Z., "Operating specifications of catalytic cleaning of gas from biomass gasification," Acta Polytechnica 55(6):401–406, 2015, doi:10.14311/ap.2015.55.040
[3] Liu, C.C., Shy, S.S., Chiu, C.W., Peng, M.W. et al., "Hydrogen/carbon monoxide syngas burning rates measurements in high-pressure quiescent and turbulent environment," International Journal of Hydrogen Energy 36(14):8595–8603, 2011, doi:10.1016/j.ijhydene.2011.04.087
[4] Herdin, G., Gruber, F., Klausner, J., Robitschko, R. et al., "Hydrogen and hydrogen mixtures as fuel in stationary gas engines," SAE Technical Paper 2007-01-0012, 2007, doi:10.4271/2007-01-0012
[5] Feist, M., Landau, M., and Harte, E., "The effect of fuel composition on performance and emissions of a variety of natural gas engines," SAE Int. J. Fuels Lubr.
3(2):100–117, 2010, doi:10.4271/2010-01-1476
[6] Saanum, I., Bysveen, M., Tunestål, P., and Johansson, B., "Lean burn versus stoichiometric operation with EGR and 3-way catalyst of an engine fueled with natural gas and hydrogen enriched natural gas," SAE Technical Paper 2007-01-0015, 2007, doi:10.4271/2007-01-0015
[7] Nellen, C. and Boulouchos, K., "Natural gas engines for cogeneration: highest efficiency and near-zero-emissions through turbocharging, EGR and 3-way catalytic converter," SAE Technical Paper 2000-01-2825, 2000, doi:10.4271/2000-01-2825
[8] Li, W., Liu, Z., Wang, Z., Li, C. et al., "Effect of CO2, N2, and Ar on combustion and exhaust emissions performance in a stoichiometric natural gas engine," SAE Technical Paper 2014-01-2693, 2014, doi:10.4271/2014-01-2693
[9] Neher, D., Kettner, M., Scholl, F., Klaissle, M. et al., "Numerical investigations of a naturally aspirated cogeneration engine operating with overexpanded cycle and optimised intake system," SAE Technical Paper 2014-32-0109, 2014, doi:10.4271/2014-32-0109
[10] Hagos, F. and Abd Aziz, A., "Mass fraction burn investigation of lean burn low BTU gasification gas in direct injection spark-ignition engine," SAE Technical Paper 2014-01-1336, 2014, doi:10.4271/2014-01-1336
[11] Okafor, E., Fukuda, Y., Nagano, Y., and Kitagawa, T., "Turbulent burning velocities of stoichiometric hydrogen-carbon monoxide-air flames at elevated pressures," SAE Technical Paper 2014-01-2701, 2014, doi:10.4271/2014-01-2701
[12] Orbaiz, P., Brear, M., Abbasi, P. and Dennis, P., "A comparative study of a spark ignition engine running on hydrogen, synthesis gas and natural gas," SAE Int. J. Engines 6(1), 2013, doi:10.4271/2013-01-0229
[13] Chríbik, A., Polóni, M., Ragan, B., Toman, R., "Some results of gas mixture influence on combustion course in spark ignition engine," Proceedings of the 18th International Conference "Transport Means 2014", October 23–24, 2014, Kaunas, ISSN 1822-296X.
[14] Chríbik, A., Polóni, M., Ragan, B., Toman, R., "The influence of biogas composition on the parameters of a combustion engine running in micro-cogeneration unit," ERIN, the 8th International Conference for Young Researchers and PhD Students, April 23–25, 2014, Blansko-Češkovice, Czech Republic.
[15] Polóni, M., Kálman, P., Lach, J., Smieško, Š., Lazar, L., Kunc, P., Jančošek, Ľ., "Micro-cogeneration unit with variable-speed generator," International Scientific Event "Power Engineering 2010", May 18–20, 2010, Tatranské Matliare, High Tatras, Slovak Republic. 9th International Scientific Conference: Energy-Ecology-Economy (EEE) 2010, ISBN 978-80-89402-23-6.
[16] GT-Power Engine Performance Application Manual. [Manual] Westmont: Gamma Technologies, Inc., 2015.
[17] Morel, T., Goerg, K.A., "Use of TPA (three-pressure analysis) to obtain burn rates and trapped residuals," GT-Power User Conference, http://www.gtisoft.com/publications/pub_contp.php
[18] Vávra, J. and Takáts, M., "Heat release regression model for gas fuelled SI engines," SAE Technical Paper 2004-01-1462, 2004, doi:10.4271/2004-01-1462
[19] Skarohlid, M., "Modeling of influence of biogas fuel composition on parameters of automotive engines," SAE Technical Paper 2010-01-0542, 2010, doi:10.4271/2010-01-0542
[20] Wahiduzzaman, S., Morel, T., and Sheard, S., "Comparison of measured and predicted combustion characteristics of a four-valve S.I. engine," SAE Technical Paper 930613, 1993, doi:10.4271/930613
[21] Morel, T., Keribar, R., "A model for predicting spatially and time resolved convective heat transfer in bowl-in-piston combustion chambers," SAE Technical Paper 850204, 1985, doi:10.4271/850204
[22] Mirzaeian, M., Millo, F., and Rolando, L., "Assessment of the predictive capabilities of a combustion model for a modern downsized turbocharged SI engine," SAE Technical Paper 2016-01-0557, 2016, doi:10.4271/2016-01-0557
[23] Vibe, I.I., "Semi-empirical expression for combustion rate in engines," in Proc. Conference on Piston Engines, USSR Academy of Sciences, Moscow, 1956
[24] Csallner, P., Woschni, G., "Zur Vorausberechnung des Brennverlaufes von Ottomotoren bei geänderten Betriebsbedingungen," MTZ No. 5, 1982
[25] Lacina, V., "Použití modelu pracovního oběhu při vývoji plynového motoru" (The use of a working-cycle model in gas engine development, in Czech), dissertation thesis, ČVUT Praha, 1988
engine," sae technical paper 930613, 1993, doi:10.4271/9306 [21] morel, t., keribar, r., "a model for predicting spatially and time resolved convective heat transfer in bowl-in-piston combustion chambers, " sae technical paper 850204, 1985, doi: 10.4271/850204 [22] mirzaeian, m., millo, f., and rolando, l., "assessment of the predictive capabilities of a combustion model for a modern downsized turbocharged si engine", sae technical paper 2016-01-0557, 2016, doi:10.4271/2016-01-0557 [23] vibe, i.i., “semi-empirical expression for combustion rate in engines,” in proc. conference on piston engines, ussr academy of sciences, moscow, 1956 [24] csallner, p., woschni, g., “zur vorausberechnung des brennverlaufes von ottomotoren bei geanderten betriebsbedingungen,” mtz no.5, 1982 [25] lacina, v., “použití modelu pracovního oběhu při vývoji plynového motoru.” the dissertation thesis, čvut praha 1988 47 http://dx.doi.org/ 10.14311/ap.2015.55.0319 http://dx.doi.org/ 10.14311/ap.2015.55.0319 http://dx.doi.org/ 10.14311/ap.2015.55.040 http://dx.doi.org/10.1016/j.ijhydene.2011.04.087 http://dx.doi.org/10.4271/2007-01-0012 http://dx.doi.org/10.4271/2010-01-1476 http://dx.doi.org/10.4271/2007-01-0015 http://dx.doi.org/10.4271/2000-01-2825 http://dx.doi.org/10.4271/2014-012693 http://dx.doi.org/10.4271/2014-32-0109 http://dx.doi.org/10.4271/2014-01-1336 http://dx.doi.org/10.4271/2014-01-2701 http://dx.doi.org/10.4271/2013-01-0229 http://www.gtisoft.com/publications/pub_contp.php http://www.gtisoft.com/publications/pub_contp.php http://dx.doi.org/10.4271/2004-01-1462 http://dx.doi.org/10.4271/2010-01-0542 http://dx.doi.org/10.4271/9306 http://dx.doi.org/ 10.4271/850204 http://dx.doi.org/10.4271/2016-01-0557 r. toman, m. polóni, a. chríbik acta polytechnica [26] woschni, g., “universally applicable equation for instantaneous heat transfer coefficient in the combustion engine,” sae paper 670931, sae trans., vol.76, 1967 [27] chríbik, a., “parametre spaľovacieho motora na zmes metánu a vodíka”, the diploma thesis, stu bratislava 2010 [28] deb, k.; pratap, a.; agarwal, s.; meyarivan, t. (2002). "a fast and elitist multiobjective genetic algorithm: nsga-ii". ieee transactions on evolutionary computation. 6(2): 182. doi:10.1109/4235.996017 48 http://dx.doi.org/10.1109/4235.996017 acta polytechnica 57(1):38–48, 2017 1 introduction 1.1 selected fuel mixtures 2 experimental methods 3 numerical methods 3.1 combustion analysis 3.2 combustion modelling 3.3 full scale engine simulation 3.4 simulation procedure 4 simulation results 4.1 combustion analysis results 4.2 full model results 5 conclusions list of symbols acknowledgements references acta polytechnica doi:10.14311/ap.2020.60.0235 acta polytechnica 60(3):235–242, 2020 © czech technical university in prague, 2020 available online at https://ojs.cvut.cz/ojs/index.php/ap the condensation of water vapour in a mixture containing a high concentration of non-condensable gas in a vertical tube jan havlík∗, tomáš dlouhý, jakub krempaský czech technical university in prague, faculty of mechanical engineering, department of energy engineering, technická 4, 16607 prague, czech republic ∗ corresponding author: jan.havlik@fs.cvut.cz abstract. this paper deals with the condensation of water vapour possessing a content of noncondensable gas in vertical tubes. the condensation of pure steam on a vertical surface is introduced by the nusselt condensation model. 
However, the condensation of water vapour in a mixture with a non-condensable gas differs from pure vapour condensation and is a much more complex process. The differences for the condensation of water vapour in a mixture containing a high concentration of non-condensable gas were theoretically analysed and evaluated. In order to investigate these effects, an experimental stand was built. Experiments were carried out for the case of pure steam condensation and for the condensation of a water vapour and non-condensable gas mixture, to evaluate the influence of the varying non-condensable gas content during the process. A non-condensable gas in a mixture with steam decreases the intensity of the condensation and the condensation heat transfer coefficient. A gradual reduction of the volume and partial pressure of the steam in the mixture causes a decrease in the condensation temperature of the steam and in the temperature difference between the steam and the cooling water. The increasing non-condensable gas concentration restrains the transport of steam towards the tube wall, and this has a significant effect on the decrease in the condensation rate.

Keywords: condensation, non-condensable gas, vertical tube condenser.

1. Introduction
In many cases in industry, water vapour is not present as a pure separate substance; it may be mixed with or polluted by other gases due to infiltration, chemical reactions or the presence of other impurities. These mixed gases influence the condensation process of water vapour and have to be taken into account in the design of condensing heat exchangers. Depending on the concentration of non-condensable gases, the total heat transferred in the exchanger is reduced [1]. The content of non-condensable gases in a mixture with water vapour may be high in some industrial applications (flue gas in energy applications, waste vapour from the process industry, etc.), and it thus significantly affects the intensity of the heat transfer. For applications where water vapour condenses in a mixture with a non-condensable gas (NCG), there are two commonly used forms of condensation: direct contact condensation [2] and surface condensation [3]. In the case of the condensation of waste vapour with a presence of mechanical impurities, a surface condenser operates with the lower amount of the outgoing condensate, which may be contaminated at the outlet of the condenser [4]. The basic configurations of surface condensers are horizontal tube condensers and vertical tube condensers [4].

Condensation on horizontal tube bundles of different configurations has a wide application in industry [5]. The heat transfer rate in the condensation processes is mainly affected by the external flow velocity and the presence of non-condensable impurities [6]. For the condensation of steam with non-condensable gases outside a horizontal tube, a decrease in the condensation heat transfer coefficient (HTC) begins even at low concentrations of the NCG, and the decrease rises significantly with the NCG concentration [7]. For air-steam condensation on a vertical tube, a decrease in the HTC below the value of 1000 W/m2K was observed at a concentration of more than 10 % of the NCG [8, 9]. For the condensation of water vapour in a mixture with an NCG, which may also contain small mechanical impurities, it is suitable to use condensers in a vertical tube-side configuration [4]. The solid particles, which can stick to the tube wall, are spontaneously carried away from the tubes by the condensate flowing out. Condensation in vertical tubes in the presence of an NCG flowing downward in such tubes was experimentally investigated in [10]. The condensate flows down the tubes in the form of an annular film of liquid, thereby maintaining a good contact with both the cooling surface and the remaining vapour [11].
this design of the condenser is very flexible and is suitable where a particularly low pressure drop is specified for the condensing fluid [4]. condensation in vertical tubes in the presence of an ncg flowing downward in such tubes was experimentally investigated in [10]. the condensate flows down the tubes in the form of an annular film of liquid, thereby maintaining a good contact with both the cooling surface and the remaining vapour [11]. the disadvantages are that the coolant, which is often more prone to fouling, is on the shell side, and that the use of finned tubes is precluded. a determination of the overall htc, which is necessary for the design of the condenser's heat transfer area, is well described in the literature for the case of pure steam condensation on a vertical surface by the nusselt condensation model [1, 3, 4]. however, the condensation of water vapour in a mixture with an ncg differs from pure vapour condensation and is a much more complex process [4, 12]. the aim of this article is to provide a theoretical analysis of the modifications of condensation in the presence of an ncg with a high concentration in a vertical tube, and to carry out an experimental investigation into the effect of the ncg on the condensation process.

2. condensation in vertical tubes

the basic heat-transfer model for surface condensation introduced by nusselt describes how a pure saturated vapour condenses on a vertical wall, forming a thin film of condensate that flows downward due to gravity [4]. the operating conditions of real condensers may differ from the assumptions adopted in the basic nusselt theory [13]. the following differences may occur in the condensation of a steam mixture with an inert gas in a vertical tube. during condensation inside vertical tubes, the flowing steam acts on the surface of the condensate film through a shear force; the film flow accelerates and the condensation htc increases slightly [1, 4]. at the bottom of high vertical walls or long tubes, the thickness of the film grows, and the laminar flow can change to a turbulent flow, increasing the htc [1, 4]. furthermore, a subcooling of the condensing mixture may occur due to a decrease in the partial pressure of steam and a decrease in the condensation temperature [14]. the ncg restrains the diffusion of the steam molecules through the gas towards the vapour-liquid interface, which results in a decrease in the partial pressure of the vapour and reduces the htc. since the total pressure of the mixture remains constant, the partial pressure of the inert gas increases with the decreasing partial water vapour pressure. the steam concentration decreases along the length of the tube, together with the corresponding steam partial pressure [4, 12]. the modifications of the nusselt condensation model were theoretically analysed for the condensation of an air-steam mixture in a vertical tube (see tab. 1). the flow of a low-pressure steam mixture with a low velocity is assumed. when steam condenses, its partial pressure decreases and the ncg concentration increases. the velocity decreases, as well as the volume flow of the mixture. the presence of non-condensable gases profoundly affects the condensation process, providing a great reduction in the condensation htc. the other effects have a lesser influence on the value of the condensation htc and, conversely, they increase the htc.
3. experimental setup

a schematic diagram of the experimental apparatus is shown in fig. 1. the condenser was designed as a vertical double pipe heat exchanger consisting of two concentric stainless tubes. the inner tube of the heat exchanger is 2 000 mm long, with an inner diameter of 23.7 mm (di) and a wall thickness of 1.6 mm. the outer tube is 1 500 mm long, with an inner diameter of 29.7 mm and a wall thickness of 2 mm. the material of the tubes is stainless steel 1.4301 (aisi 304). the annulus made from the concentric tubes is 1.6 mm wide. stainless pins are used as spacers at three circumferential positions to keep the annulus concentric. steam from a steam generator, with a regulated pressure close to 1 bar and a temperature from 100 to 130 °c, is mixed with pressurised air. the steam temperature is controlled so that the mixture is in a saturated or slightly superheated state after mixing the superheated steam with the cold air before it enters the condenser. the mixture of water vapour and air enters the condenser at the top and is directed vertically downward through a calming section before flowing over the inner vertical tube. the cooling water flows upwards in the annulus. the heat exchanger is in a counter-current configuration. the condensate flowing out of the pipe is collected in a tank and its production is determined by weighing. the excessive steam-air mixture at the heat exchanger outlet is released to the ambient. the positions of the temperature, pressure, weight and flow measurements are shown in fig. 1.

figure 1. experimental loop with a vertical tube condenser.

the experimental loop may operate in two modes: condensation of pure steam, or condensation of steam in a mixture with air. in the case of the pure steam condensation mode, the air compressor is disconnected.

4. evaluation procedure

the calculation of the htc is based on the heat balance of the condenser [2]. the total heat performance q is given by the equation

$$q = m_w \, c_w \, (t_{w,\mathrm{out}} - t_{w,\mathrm{in}}), \tag{1}$$

where $m_w$ is the cooling water flow, $c_w$ is the specific heat capacity of water, $t_{w,\mathrm{out}}$ is the outlet cooling water temperature, and $t_{w,\mathrm{in}}$ is the inlet cooling water temperature. the overall htc u is given by the equation

$$u = \frac{q}{a \, \Delta t_{\log}}, \tag{2}$$

where a is the heat transfer surface of the condenser tube and $\Delta t_{\log}$ is the logarithmic mean temperature difference. the overall htc for condensation in a vertical tube is determined by the heat transfer balance (see eq. 3). the right-hand side consists of a term corresponding to the tube-side condensation heat transfer coefficient $h_{cng}$ (condensation in the presence of ncg), a term corresponding to the heat conduction through the tube wall, and a term corresponding to the outside-tube heat transfer coefficient $h_w$ (cooling water side). the value of u is related to the outer surface of the tube.

table 1. the influence of operating conditions on the condensation process.

effect | occurrence condition | operating condition influence | influence on htc | reference
non-condensable gas | concentration more than approx. 0.1 % | high concentration | decrease, up to several times | [15, 16]
flow mode of the film | thick condensate film (film re > 30: waviness; film re > 400–1600: pure turbulent flow) | small film thickness | increase, up to 25 % in the waviness interval, more for turbulent flow | [12, 16]
sub-cooling of condensing mixture | wall temperature below saturation (change in released enthalpy at the wall) | temperature change due to concentration profile change | increase, a few % | [12]
shear stress of flow | vapour flowing at a high velocity | low velocities | increase, negligible under 5 m/s | [13]
$$u = \left[\frac{d_o}{d_i \, h_{cng}} + \frac{d_o}{2k}\ln\frac{d_o}{d_i} + \frac{1}{h_w}\right]^{-1}, \tag{3}$$

where $d_i$ is the inside diameter of the tube, k is the thermal conductivity of the tube, and $d_o$ is the outside diameter of the tube. the evaluation procedure consists of two steps: an experimental determination of the htc on the cooling water side, and subsequently an experimental determination of the htc of the condensation of water vapour in a mixture with the ncg.

4.1. determining the cooling water htc

from the point of view of reducing the inaccuracy of the determination of the htc from the experiment, it is advisable to achieve a low thermal resistance on the cooling water side, so that its effect on the overall htc is as small as possible. therefore, there is an effort to achieve a high htc of the cooling water in the annulus, which has been achieved by reducing the annulus spacing; this increases the flow velocity as well as the convective htc [3, 4]. moreover, the heat transfer can be influenced by the small width of the annulus. the proposed condenser was designed with an annulus width of 1.4 mm. an annulus with such a restricted width approaches the scale of microchannels, where the heat transfer rapidly increases [17, 18]. it is generally considered that a micro-channel is any channel with a hydraulic diameter in the range of a micrometer, i.e. less than 1 mm. thus, it can be assumed that the flow of cooling water in the annulus of the proposed condenser approaches this phenomenon, and that it can have an influence on the htc of the cooling side of the condenser. therefore, it is suitable to determine the htc experimentally for the tested heat exchanger [19]. experiments with pure steam condensation were performed to determine the cooling water htc. in the case of pure steam condensation, it is possible to calculate the condensation htc $h_c$ according to the nusselt model of pure steam condensation on a vertical wall [4, 12, 20]:

$$h_c = 0.943\left[\frac{g\,\rho_l(\rho_l-\rho_v)\,k_l^3\,h'_{fg}}{\mu_l\,(t_{sat}-t_s)\,l}\right]^{1/4}, \tag{4}$$

where $\rho_l$ is the density of the condensate, $\rho_v$ is the density of the water vapour, $h'_{fg}$ is the latent heat of condensation, $k_l$ is the thermal conductivity of the condensate, $\mu_l$ is the dynamic viscosity of the condensate, $(t_{sat}-t_s)$ is the difference between the saturation temperature and the wall temperature, and l is the wall length. according to [20], the inaccuracy given by using eq. 4 is up to 3 % for $c_l(t_{sat}-t_s)/h'_{fg} \le 0.1$ and $1 \le pr \le 100$. knowing the value of $h_c$, the cooling water htc $h_w$ on the annulus side for the operating range of the device (characterised by the reynolds number) was evaluated from the equation for the overall htc, as in the case of pure steam condensation:

$$u = \left[\frac{d_o}{d_i \, h_c} + \frac{d_o}{2k}\ln\frac{d_o}{d_i} + \frac{1}{h_w}\right]^{-1}. \tag{5}$$
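the two relations above translate directly into a short calculation. the following is a minimal sketch of this step, assuming rounded textbook property values for saturated water at about 100 °c and an illustrative measured overall htc; the function names and the input numbers are placeholders, not data from the paper.

```python
"""sketch of section 4.1: nusselt film condensation htc (eq. 4) and the
cooling water htc backed out of a measured overall htc (eq. 5)."""
import math

def nusselt_htc(rho_l, rho_v, k_l, h_fg, mu_l, dt_sat, length, g=9.81):
    """film condensation htc on a vertical surface, eq. (4)."""
    return 0.943 * (g * rho_l * (rho_l - rho_v) * k_l**3 * h_fg
                    / (mu_l * dt_sat * length)) ** 0.25

def cooling_water_htc(u, h_c, d_i, d_o, k_wall):
    """invert eq. (5) for h_w, the annulus-side htc."""
    r_cond = d_o / (d_i * h_c)                            # condensation term
    r_wall = d_o / (2.0 * k_wall) * math.log(d_o / d_i)   # wall conduction term
    return 1.0 / (1.0 / u - r_cond - r_wall)

# rounded properties of saturated water near 100 degC (assumed values)
h_c = nusselt_htc(rho_l=958.0, rho_v=0.6, k_l=0.68, h_fg=2.257e6,
                  mu_l=2.82e-4, dt_sat=10.0, length=1.5)
# tube geometry from section 3; the measured u is an illustrative number
h_w = cooling_water_htc(u=2500.0, h_c=h_c, d_i=0.0237,
                        d_o=0.0237 + 2 * 0.0016, k_wall=15.0)
print(f"h_c = {h_c:.0f} W/m2K, h_w = {h_w:.0f} W/m2K")
```

with these rounded properties the sketch returns $h_c \approx 5900$ w/m²k, which is consistent with the order of the nusselt values later listed in tab. 2.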
the htc on the cooling water side depends on the properties of the medium (the fluid velocity and temperature) [4, 20, 21]. therefore, it is suitable to use the nusselt number for evaluating $h_w$ under various operating conditions. the nusselt number is equal to the dimensionless temperature gradient at the surface, and it essentially provides a measure of the convective heat transfer. the nusselt number nu may be viewed as the ratio of the conduction resistance of a material to the convection resistance of the same medium [21]:

$$nu = \frac{h_w \, d_e}{k_w}, \tag{6}$$

where $k_w$ is the thermal conductivity and $d_e$ is the characteristic diameter, which is defined as

$$d_e = \frac{4 \cdot \text{flow area}}{\text{wetted perimeter}} = d_i - d_o \quad\text{for an annulus}. \tag{7}$$

in single-phase fluid flow heat transfer, the nusselt number for forced convection is generally of the form

$$nu = f(c, re, pr), \tag{8}$$

where c is a term given by the geometry characteristic, the reynolds number re is defined as

$$re = \frac{w_w \, d_e}{\mu_w}, \tag{9}$$

and the prandtl number pr is defined as

$$pr = \frac{c_w \, \mu_w}{k_w}, \tag{10}$$

where $w_w$, $\mu_w$ and $k_w$ are the velocity, the dynamic viscosity and the thermal conductivity of the cooling water, respectively. the dependence of nu on re is influenced by the geometry and the regime of the flow, and it is therefore generally difficult to describe. the term c can be considered a constant for a specific type of tested heat exchanger. to take into account changes in the cooling water temperature, a correction for the prandtl number has to be introduced, raising its value to the power of 0.33 in accordance with the standard practice for forced convection [3, 4]. the heat transfer on the cooling water side is described by the dependence (see fig. 2)

$$y = nu_{\mathrm{exp}}/pr_{\mathrm{exp}}^{0.33} = f(re_{\mathrm{exp}}). \tag{11}$$

for various conditions with re in the analysed range and the corresponding pr, it is possible to determine $h_w$ for the experimental condenser as

$$h_w = \frac{y \cdot pr^{0.33} \cdot k_w}{d_e}. \tag{12}$$

figure 2. heat transfer for the cooling water side.

4.2. determining the condensation htc

after the experimental determination of the cooling water htc $h_w$, the value of the condensation htc $h_{cng}$ for various concentrations of air in a mixture with steam is derived from eq. 3:

$$h_{cng} = \left[\frac{d_i}{d_o \, u} - \frac{d_i}{2k}\ln\frac{d_o}{d_i} - \frac{d_i}{d_o \, h_w}\right]^{-1}. \tag{13}$$

5. results

5.1. the cooling water htc

experiments with pure steam condensation were carried out to evaluate the cooling water htc. the experiments were made in the design operating range of the cooling water flow in the condenser, described by reynolds numbers from 1 000 to 5 000 and a cooling water temperature of around 20 °c, within the atmospheric parameters of the condensing steam. based on the dependence in fig. 2, it is possible to determine the value of the htc when the conditions on the cooling water side change.

5.2. the condensation htc

after the evaluation of the cooling water htc, experiments with various concentrations of air in a mixture with steam were carried out for the concentration range of 23 % to 69 % (see tab. 2). finally, the experimentally determined values of the condensation htc for water vapour in a mixture with air were compared with the value for pure steam condensation calculated according to the nusselt condensation model for the corresponding operating parameters. the inlet mixture velocity is calculated from the mass flows of air, $m_a$, and of vapour, $m_v$, as

$$u_g = \frac{m_a + m_v}{\rho_g \, \pi d_i^2/4}, \tag{14}$$

with the mixture density $\rho_g$ defined as

$$\rho_g = \frac{m_a + m_v}{\dfrac{m_a}{\rho_a} + \dfrac{m_v}{\rho_v}}, \tag{15}$$

where $\rho_a$ is the air density and $\rho_v$ is the vapour density.
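taken together, eqs. (1), (2) and (13)–(15) form a short evaluation chain. the sketch below strings them together for one operating point; all input values are illustrative placeholders in the style of tab. 2, not measured data, and the geometry follows section 3.

```python
"""sketch of the evaluation chain of section 4: heat balance (eq. 1),
counter-current lmtd and overall htc (eq. 2), condensation htc by
eq. (13), and the inlet mixture velocity by eqs. (14)-(15)."""
import math

D_I, D_O, K_WALL, LENGTH = 0.0237, 0.0269, 15.0, 1.5  # geometry [m], wall k [W/mK]

def overall_htc(m_w, c_w, t_w_in, t_w_out, t_mix_in, t_mix_out):
    """eqs. (1)-(2), with the area related to the outer tube surface."""
    q = m_w * c_w * (t_w_out - t_w_in)                  # heat performance
    dt1, dt2 = t_mix_in - t_w_out, t_mix_out - t_w_in   # terminal differences
    lmtd = (dt1 - dt2) / math.log(dt1 / dt2)
    return q / (math.pi * D_O * LENGTH * lmtd)

def condensation_htc(u, h_w):
    """eq. (13): subtract the wall and water-side resistances from 1/u."""
    r_wall = D_I / (2.0 * K_WALL) * math.log(D_O / D_I)
    return 1.0 / (D_I / (D_O * u) - r_wall - D_I / (D_O * h_w))

def mixture_velocity(m_a, m_v, rho_a, rho_v):
    """eqs. (14)-(15): inlet velocity of the air-steam mixture."""
    rho_g = (m_a + m_v) / (m_a / rho_a + m_v / rho_v)
    return (m_a + m_v) / (rho_g * math.pi * D_I ** 2 / 4.0)

u = overall_htc(m_w=0.10, c_w=4180.0, t_w_in=20.2, t_w_out=26.0,
                t_mix_in=82.1, t_mix_out=64.6)
print(f"u     = {u:7.1f} W/m2K")
print(f"h_cng = {condensation_htc(u, h_w=6087.0):7.1f} W/m2K")
print(f"u_g   = {mixture_velocity(2e-4, 7e-4, rho_a=1.0, rho_v=0.6):7.2f} m/s")
```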
the inlet and outlet temperatures of the mixture correspond to the partial pressure of the steam in the mixture. in measurements 1 and 2, the steam condensation rate at the inlet of the condenser was sufficiently high. this resulted in a higher temperature drop in the second part of the condenser. due to the low steam mass fraction at the end of the condenser, the condensing steam caused a higher temperature drop in the mixture compared to measurements 3 to 6. therefore, the outlet temperatures in measurements 1 and 2 are lower. comparing the thermal resistances in eq. 13, the value of the thermal resistance of the condensation term $1/h_{cng}$ is significantly higher than the value of the thermal resistance of the cooling water term $1/h_w$. thus, the overall htc is more sensitive to the condensation htc than to the cooling water htc. a deviation of 20 % in the determination of the $h_w$ value (part 4.1) corresponds to a change of up to 1 % in the resulting $h_{cng}$ value, and of 2.5 % for a deviation of 50 %. therefore, it can be said that a deviation in the determination of the value of $h_w$ has a minimal effect on the resulting $h_{cng}$ values.

table 2. experimental results.

measurement | 1 | 2 | 3 | 4 | 5 | 6
air mass concentration [%] | 23 | 26 | 41 | 52 | 60 | 69
inlet mixture velocity [m/s] | 2.8 | 2.5 | 3.7 | 5.1 | 5.2 | 4.3
inlet mixture temperature [°c] | 94.9 | 93.9 | 91.1 | 85.3 | 82.1 | 76.4
outlet mixture velocity [m/s] | 0.66 | 0.7 | 1.7 | 1.2 | 3.7 | 2.9
outlet mixture temperature [°c] | 43.6 | 40.6 | 74.6 | 71.1 | 64.6 | 51.6
cooling water temperature [°c] | 23.4 | 22.0 | 22.8 | 22.7 | 20.2 | 19.1
cooling water htc $h_w$ [w/m²k] | 5916 | 5987 | 5946 | 5703 | 6087 | 6525
air-water vapour mixture condensation htc $h_{cng}$ [w/m²k] | 314 | 271 | 214 | 175 | 168 | 125
corresponding calculated nusselt htc, pure steam, $h_c$ [w/m²k] | 5887 | 5881 | 5884 | 5996 | 5873 | 5746
reduction to [%] | 5.3 | 4.6 | 3.6 | 3.1 | 2.9 | 2.1

5.3. the effect of the operating conditions on the condensation process

the shear stress of the flow. the shear stress caused by the flowing gas-vapour mixture depends on the kinetic energy and the flow direction of the mixture. the mixture flow accelerates the condensate film, since it flows in the same direction as the condensate flow. this causes a reduction in the thickness of the film, which improves the heat transfer rate. in all cases, the inlet velocity of the mixture during the experiments was less than 5.2 m/s. according to the calculation based on [21], the calculated improvement of the htc was lower than 2 % for all the measurements. this is in a good agreement with the equation introduced in [22], where a theoretical and experimental analysis of the local htc during the condensation of water vapour in the presence of an ncg in a vertical tube condenser was conducted. that study focused on the effect of the shear stress and created a new empirical factor which incorporates the influence of the shear stress on the heat transfer. the analysis shows that the effect of the shear stress is higher with a decreasing tube diameter for the same reynolds number of the mixture. on the contrary, the effect of the shear stress decreases with an increasing concentration of the ncg in the mixture. in conclusion, an increase of the mixture velocity or a decrease of the tube diameter increases the effect of the shear stress, which has a positive effect on the heat transfer. however, the decrease in the heat transfer due to the presence of air has a far greater effect.
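the sensitivity quoted in section 5.2 can be reproduced numerically. the sketch below builds u from an assumed "true" pair $(h_{cng}, h_w)$ via eq. (3) and then re-extracts $h_{cng}$ via eq. (13) with a deliberately wrong $h_w$; the base values roughly follow measurement 5 of tab. 2, while the check itself is an illustration, not part of the paper.

```python
"""numerical check: because the condensation resistance 1/h_cng dominates
eq. (13), even a large deviation in h_w barely moves the extracted h_cng."""
import math

D_I, D_O, K_WALL = 0.0237, 0.0269, 15.0

def u_overall(h_cng, h_w):
    """eq. (3): overall htc related to the outer surface."""
    return 1.0 / (D_O / (D_I * h_cng)
                  + D_O / (2.0 * K_WALL) * math.log(D_O / D_I) + 1.0 / h_w)

def h_cng_from(u, h_w):
    """eq. (13): condensation htc extracted from u and h_w."""
    return 1.0 / (D_I / (D_O * u)
                  - D_I / (2.0 * K_WALL) * math.log(D_O / D_I)
                  - D_I / (D_O * h_w))

h_w0, h_cng0 = 6087.0, 168.0          # roughly measurement 5 of table 2
u = u_overall(h_cng0, h_w0)
for factor in (0.5, 0.8, 1.0, 1.2, 1.5):
    h = h_cng_from(u, factor * h_w0)
    print(f"h_w scaled by {factor:3.1f}: h_cng = {h:6.1f} W/m2K "
          f"({100.0 * (h / h_cng0 - 1.0):+5.2f} %)")
```

running the loop reproduces the stated orders of magnitude: a 20 % error in $h_w$ shifts $h_{cng}$ by well under 1 %, and a 50 % error by roughly 2.5 %.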
waviness on the film surface. waviness formations on the film surface can be theoretically characterised by a critical re. at certain values of re, waves form on the film surface and improve the htc of the film due to an increase in the heat transfer area. however, the exact value of $re_{crit}$ is often very difficult to determine, and this effect is not taken into account below the empirically evaluated value re = 30 [16]. the maximal calculated re of the film during the experiments was in all cases lower than 30. in the study [23], an empirical number for the nusselt formula was introduced which captures the effect of waviness formation on the htc. for the range 5 < re < 100 and for the presented experimental setup, this gives an enhancement of the htc in the range of 1.05–1.2. these values are in accordance with the empirical factor of 1.15 presented in [12]. as the air concentration in the mixture for the presented measurements increases, the condensation rate decreases for the same experimental parameters. this means that the amount of condensate on the tube surface is lower, which corresponds to a lower re of the film. hence, the effect of the waviness is quite low and it is not necessary to take it into account.

superheating and subcooling. it is well proven that subcooling of the condensate occurs very often during film condensation [12]. in the case when steam condenses in the presence of air, superheating of the mixture also occurs, since the temperature of the mixture changes according to the local saturation temperature of the steam. the temperature difference between the film at the wall and the saturation temperature is calculated and shown in tab. 3. as the concentration of air increases, the difference between the wall temperature of the film and the saturation temperature decreases, because the steam has a smaller partial pressure in the mixture and the saturation temperature is lower. during the experiments, the difference in the additional heat transferred due to the superheating and the subcooling was at maximum 4 % and 10 %, respectively. such values have a negligible effect on the overall htc.

temperature dependent variables. the film flowing on the cold surface does not have a constant thickness along the vertical tube. the thermodynamic properties along the width of the film differ due to a temperature difference in the film. the most important parameters which are influenced, and which are often important to include, are the density, the dynamic viscosity and the heat conductivity. as discussed in previous studies [24], the temperature dependent variables of the film have a very small effect on the htc during the steam condensation, especially when the saturation temperature and the wall temperature are similar. generally, the film temperature lies between the saturation temperature and the temperature of the wall, so the drew reference temperature or the mean temperature is often used. in [12, 25], the influence of temperature dependent material properties on the heat transfer in film condensation was analysed. it has been concluded that in the case when the difference between the wall temperature and the saturation temperature is less than 50 °c, the deviation from the nusselt theory is less than 3 %. in [26], it was shown that a temperature difference lower than 100 °c results in a change of the htc of just a few percent, with a maximal deviation of 5.1 % for 100 °c. during the experiments, the mean temperature difference had no higher value than 60 °c, so a deviation in the nusselt number in the range of a few percent is assumed.
table 3. calculated influence of the considered effects.

effect | values during experiment | influence on htc during experiments | influence with increased air concentration (constant re)
ncg | 23 – 62 mass % | decrease | –
flow mode of the film | film re less than 30 | 5 – 20 % | decrease
superheating | 15.1 – 53.3 °c | few % | –
subcooling | 42 – 57 °c (for average temperatures) | few % | –
shear stress of flow | 2.5 – 5.2 m/s | below 2 % | decrease
temperature dependent variables | $t_{sat} - t_w$ < 60 k (for average temperatures) | few % | –

5.4. an evaluation of the effects

an evaluation of the effects mentioned above, analysed for the measured conditions, is given in tab. 3. although these effects generally have a positive effect on the heat transfer, the negative influence of the ncg on the htc is so high that it overcomes all the other considered effects. thus, it can be stated that the significant decrease in the condensation htc is caused by the presence of the ncg.

5.5. effect of non-condensable gas

for the condensation of a water vapour mixture with a mass air concentration from 23 % to 69 %, a drop in the condensation htc to the level of 5.3 % to 2.1 % of the value for pure steam condensation was evaluated for the described condenser (see tab. 2). moreover, the heat transfer rate is reduced by a decrease in the mixture temperature as it passes through the tube (see tab. 2), more precisely, by the temperature difference between the condensing mixture and the cooling water. these results confirm the trend of a decreasing htc with an increasing concentration of ncg [9, 10, 27] and extend the investigated range to higher concentrations of ncg. for air-steam condensation in a vertical tube, a decrease in the htc below the value of 200 w/m²k was observed for concentrations of ncg above 50 %.

6. conclusion

the condensation of water vapour with a high content of a non-condensable gas in vertical tubes was experimentally investigated. the influence of the operating conditions on the condensation process was theoretically analysed. the presence of a non-condensable gas reduces the steam condensation temperature and reduces the htc. in order to evaluate the influence of the non-condensable gas, experiments on the condensation of water vapour with an air content were carried out in a vertical tube condenser. a significant decrease in the htc value was experimentally verified. for an air concentration in the steam mixture of 23 % to 69 %, the condensation htc decreases to a level of a few percent of the value for the condensation of pure steam. a non-condensable gas in a mixture with steam decreases the intensity of the condensation. a gradual reduction of the volume and partial pressure of steam in the mixture causes a decrease in the condensation temperature of the steam, and correspondingly in the temperature difference between the mixture and the wall film of the condensate. a growing non-condensable gas concentration restrains the transportation of steam to the wall, and this has a significant effect on the decrease of the condensation rate. the other effects on the condensation process (the flow mode of the film, the subcooling of the condensing mixture, the shear stress of the flow, the temperature dependent variables) have a lesser influence on the value of the condensation htc and, conversely, they increase the htc value.
the content of non-condensable gases in the steam restrains the condensation process, which results in a reduction in the htc.

acknowledgements

this work was supported by the ministry of education, youth and sports under op rde grant number cz.02.1.01/0.0/0.0/16_019/0000753 "research centre for low-carbon energy technologies".

references

[1] e. cao. heat transfer in process engineering: calculations and equipment design. mcgraw-hill, new york, 1st edn., 2009.
[2] w. li, j. wang, z. sun, et al. experimental investigation on thermal stratification induced by steam direct contact condensation with non-condensable gas. applied thermal engineering 154:628 – 636, 2019. doi:10.1016/j.applthermaleng.2019.03.138.
[3] y. a. cengel. heat transfer. mcgraw-hill, new york, 2nd edn., 2003.
[4] g. f. hewitt, g. l. shires, t. r. bott. process heat transfer. begell house, new york, 1st edn., 1994.
[5] p. kracík, j. pospíšil, l. šnajdárek. heat exchangers for condensation and evaporation applications operating in a low pressure atmosphere. acta polytechnica 52(3):48 – 53, 2012.
[6] k. b. minko, g. g. yankov, v. i. artemov, o. o. milman. a mathematical model of forced convection condensation of steam on smooth horizontal tubes and tube bundles in the presence of noncondensables. international journal of heat and mass transfer 140:41 – 50, 2019. doi:10.1016/j.ijheatmasstransfer.2019.05.099.
[7] j. lu, h. cao, j. m. li. condensation heat and mass transfer of steam with non-condensable gases outside a horizontal tube under free convection. international journal of heat and mass transfer 139:564 – 576, 2019. doi:10.1016/j.ijheatmasstransfer.2019.05.049.
[8] y.-g. lee, y.-j. jang, s. kim. analysis of air-steam condensation tests on a vertical tube of the passive containment cooling system under natural convection. annals of nuclear energy 131:460 – 474, 2019. doi:10.1016/j.anucene.2019.04.001.
[9] g. fan, p. tong, z. sun, y. chen. experimental study of pure steam and steam–air condensation over a vertical corrugated tube. progress in nuclear energy 109:239 – 249, 2018. doi:10.1016/j.pnucene.2018.08.020.
[10] s. z. kuhn, v. e. schrock, p. f. peterson. an investigation of condensation from steam-gas mixtures flowing downward inside a vertical tube. nuclear engineering and design 177(1):53 – 69, 1997. doi:10.2172/106998.
[11] f. toman, p. kracik, j. pospisil, m. spilacek. comparison of different concepts of condensation heat exchangers with vertically oriented pipes for effective heat and water regeneration. chemical engineering transactions 76:379 – 384, 2019. doi:10.3303/cet1976064.
[12] h. d. baehr, k. stephan. heat and mass-transfer. springer, berlin, 3rd edn., 2011. doi:10.1007/978-3-642-20021-2.
[13] j. havlík, t. dlouhý. condensation of water vapour in a vertical tube condenser. acta polytechnica 55(5):306 – 312, 2015. doi:10.14311/ap.2015.55.0306.
[14] t.-h. phan, s.-s. won, w.-g. park. numerical simulation of air–steam mixture condensation flows in a vertical tube. international journal of heat and mass transfer 127:568 – 578, 2018. doi:10.1016/j.ijheatmasstransfer.2018.08.043.
[15] w. j. minkowycz, e. m. sparrow. condensation heat transfer in the presence of noncondensables, interfacial resistance, superheating, variable properties, and diffusion. international journal of heat and mass transfer 9(10):1125 – 1144, 1966. doi:10.1016/0017-9310(66)90035-4.
[16] w. m. rohsenow, j. p. hartnett, y. i. cho.
handbook of heat transfer. mcgraw-hill, new york, 3rd edn., 1998.
[17] s. kakaç, y. yener, w. sun, a. t. okutucu. single-phase convective heat transfer in microchannels. in 14th international conference on thermal engineering and thermogrammetry (thermo). budapest, 2005.
[18] b. palm. heat transfer in microchannels. microscale thermophysical engineering 5(3):155 – 175, 2001. doi:10.1080/108939501753222850.
[19] j. havlík, t. dlouhý. experimental determination of the heat transfer coefficient in shell-and-tube condensers using the wilson plot method. epj web of conferences 143:02035, 2017. doi:10.1051/epjconf/201714302035.
[20] f. p. incropera. principles of heat and mass transfer. john wiley & sons, new jersey, 7th edn., 2012.
[21] vdi-gvc (ed.). vdi heat atlas. springer, berlin, 2010. doi:10.1007/978-3-540-77877-6.
[22] k.-y. lee, m. h. kim. effect of an interfacial shear stress on steam condensation in the presence of a noncondensable gas in a vertical tube. international journal of heat and mass transfer 51(21):5333 – 5343, 2008. doi:10.1016/j.ijheatmasstransfer.2008.03.017.
[23] s. s. kutateladze, i. i. gogonin. heat transfer in film condensation of slowly moving vapour. international journal of heat and mass transfer 22(12):1593 – 1599, 1979. doi:10.1016/0017-9310(79)90075-9.
[24] r. i. hirshburg, l. w. florschuetz. laminar wavy-film flow: part ii, condensation and evaporation. journal of heat transfer 104(3):459 – 464, 1982. doi:10.1115/1.3245115.
[25] k. d. voskresenskij. heat transfer in film condensation with temperature dependent properties of the condensate (russ.). izv. akad. nauk, pp. 1023 – 1028, 1948.
[26] d. shang, t. adamek. study on laminar film condensation of saturated steam on a vertical flat plate for consideration of various physical factors including variable thermophysical properties. wärme- und stoffübertragung 30:89 – 100, 1994. doi:10.1007/bf007150.
[27] condensation in a vertical tube bundle passive condenser – part 1: through flow condensation. international journal of heat and mass transfer 53(5):1146 – 1155, 2010. doi:10.1016/j.ijheatmasstransfer.2009.10.039.
acta polytechnica vol. 42 no. 4/2002

certain discrete element methods in problems of fracture mechanics

p. p. procházka, m. g. kugblenu

abstract: in this paper, two discrete element methods (dem) are discussed. the free hexagon element method is considered a powerful discrete element method, which is broadly used in mechanics of granular media. it substitutes the methods for solving continuum problems. the great disadvantage of classical dem, such as the particle flow code (where material properties are characterized by spring stiffnesses), is that they have to be fed with material properties provided from laboratory tests (young's modulus, poisson's ratio, etc.). the problem consists in the fact that the material properties of continuum methods (fem, bem) are not mutually consistent with dem. this is why we utilize the principal idea of dem, but cover the continuum by hexagonal elastic, or elastic-plastic, elements. in order to complete the study, another dem is discussed. the second method starts with the classical particle flow code (pfc, which uses dynamic equilibrium), but applies static equilibrium; it is called the static particle flow code (spfc). the numerical experience and a comparison of numerical results with experimental results from scaled models are discussed in a forthcoming paper by both authors.

keywords: discrete element methods, free hexagon element method, statical particle flow code.

1 introduction

the principal problem of classical numerical methods, such as finite element methods, boundary element methods, etc., consists in "too stiff" models, or in too complicated simulations of the real states when no a priori knowledge of the crack initiation is available. this is why discrete element methods have been introduced, to replace fracture mechanics problems by contact problems, which are in many respects more transparent and which lead us to the same results. moreover, it is not necessary to know the crack initiation in advance. in the early 1970s, cundall [2] and others [3] introduced discrete elements starting with dynamic equilibrium. first, brick-like elements were used (the professional computer program udec), and later circular elements in 2d and spherical elements in 3d (pfc – particle flow code; both computer systems issued by itasca) simulated the continuum behavior of structures. the application of such methods took place mainly in geotechnics, where soil is a typical grain material with the above-mentioned shape [11]. if the material parameters are well chosen, the mechanical behavior of the discrete elements is very close to reality. the problem consists of finding such material parameters. there have been many attempts to find these parameters, but still there is no satisfactory output from these studies. a promising approach may be to cover the domain defining the physical body by hexagonal elements, which are very close to disks and can cover the domain with a very small geometrical error. replacing the discrete elements by elastic, or elastic-plastic, hexagons with full contact of adjacent elements along their common boundaries yields a honeycomb-like shape of the elements, see, e.g., [15], covering the structure of the continuous medium. it is necessary to note that beams form the honeycomb boundaries in [15], and there is no material inside such particles. in our case, some kind of material fills the interior of the hexagonal elements. the relations inside the hexagonal particles are solved by a special form of the boundary element method [1].
free hexagons are used by onck and van der giessen in [12], for example, where a wide range of references on this topic can be found. in [12], the finite element method, e.g. [16], is used to create the stiffness matrices of the elements; namely, six finite elements are substructured into a hexagon. in applications to geotechnical problems, the disturbed state concept (dsc) established by desai [4, 5] can describe a wide spectrum of material states inside the elements, starting with elastic, continuing with elastic-plastic [6], and even damage states [10]. the use of eigenparameters for the plastic strain, or the relaxation stress [7, 8], completes the description of the possible and suitable nonlinear constitutive laws, which moreover can be "tuned" from "in situ" measurements, or from the results of scale modeling. geotechnical properties are defined on the boundaries of the elements. a typical formulation of the problem involving the generalized mohr-coulomb law combined with the exclusion of tensile zones is proposed in [14], where the technique using lagrangian multipliers leads to a mixed problem (both displacements and stresses – element boundary tractions – are iterated). in this paper, the penalty method is applied, and the element boundary tractions (formerly lagrangian multipliers) are substituted by spring stiffnesses (i.e., by penalty functions, or in our case by penalty parameters). the springs enable us to simulate the interfacial constraints, namely the exclusion of the tensile tractions and the application of the mohr-coulomb law. the mohr-coulomb law is used in two basic forms: for brittle or almost brittle materials, and for soft rocks or soil. several phenomena, e.g., gas extrusion in a coal seam, swelling, watering, and even prestress, can be modeled by eshelby forces [9]. this treatment seems to be much more promising than the eigenparameters introduced in each element of the zone of some disturbance occurrence, because only tractions (eshelby forces) along the boundaries of those zones have to be applied. the paper starts with the formulation of the free hexagon element method, and then the statical particle flow is described. applications in several fields of practical problems are discussed in a forthcoming paper by the authors [13].

2 free hexagonal element method

the discrete free hexagonal element method may be considered a discrete element method.
the great disadvantage of some classical dem, however, is the need to feed them with material properties provided from laboratory tests (this is the case of the statical particle flow code, formulated in the next section, as the discs are connected by springs, while laboratories provide completely different material parameters). this is overcome here by considering material characteristics which are similar to those of a continuum. the principal idea of classical dem is adopted, but the domain defining the structural continuum is in our case covered by hexagonal elements; other material properties, such as elastic-plastic or visco-elastic-plastic, can also be introduced. this step avoids the necessity of estimating the material properties of the springs, which is essential, e.g., for pfc. the free hexagonal element method fulfills a natural requirement due to the fact that the elastic properties are assigned to the particles, and the other geotechnical material parameters (angle of internal friction, shear strength or cohesion) to the contacts of the elements. since most particles are of the same shape, it is possible to apply very powerful iteration procedures, because the stiffness matrix can be stored in the internal memory of a computer. when dealing with crack problems, two principal methods are used: first, the means of fracture mechanics can be applied, or secondly, a contact problem can be formulated. the first case is generally not suitable, because the direction or path of propagation of the cracks needs to be known in advance. this paper uses the second possibility, which avoids the obstacle of an a priori unknown path of crack propagation by creating a mesh of free hexagonal elements. they are in mutual contact in the undeformed state, but can be disconnected when the contact conditions are violated. the computational model is described in the following paragraph, where the relations needed for the numerical computation are also introduced. the interface conditions are formulated in paragraph 2.2, where the lagrangian principle is based on the penalty method. the penalty parameters are spring stiffnesses; the springs connect the adjacent elements. the material characteristics of the springs can possess a large value to ensure the contact constraints. on the other hand, if, say, the tensile strength condition is violated, the spring parameters tend to zero, and naturally no energy contribution in the normal direction to the element boundary appears in the energy functional in this case. this process excludes the possibility of a multivalued solution, and the uniqueness of the solution of the trial problem is ensured. if we cut out the springs when a certain interfacial condition is violated, the problem turns singular and has no unique solution; then the way the particles move in some later stages of the destruction of the trial structure cannot be described. the hexagonal particles are studied under various contact (interfacial) conditions of the grain particles (elements).
in our paper, two contact conditions are considered:
• the generalized mohr-coulomb hypothesis, with the exclusion of non-admissible tensile stresses along the contact (a rock mass),
• a limit state of shear stresses and the exclusion of tensile tractions along the contact (a brittle coal seam).

the first case is generally connected with applications in geotechnics, composite materials, shotcrete, etc., and the second case is more appropriate for applications in underground bumps or rock bursts. a two-dimensional formulation and its solution have been prepared and are studied in this paper. the problem formulated in terms of hexagonal elements (which are not necessarily mutually connected during the loading process of the body, because of the nonlinearities arising due to the interfacial conditions) enables us to simulate how cracks propagate. the cracking of the medium can be described in such a way that the local damage may be derived. the local deterioration of the material is also shown in the pictures drawn for the particular examples. such a movement of elements and change of stresses probably cannot be obtained from continuous numerical methods.

2.1 computational model

let us now consider a single hexagonal element (described by a domain $\Omega$ with its boundary $\Gamma$). its connection with the adjacent elements is shown in fig. 1.

fig. 1: geometry of adjacent hexagonal elements

in each hexagonal element, pseudo-elastic material properties are taken into consideration, i.e., in each iteration step the element behaves linearly, but the material properties can change during the process of loading and unloading. this makes it possible to introduce only an elastic material stiffness matrix, which is homogeneous and isotropic, and we get the well-known integral equations that are valid along the boundary abscissas of the hexagons [1]:

$$c_{kl}(\xi)\,u_l(\xi) = \sum_{s=1}^{6}\left[\int_{\Gamma_s} u^*_{ik}(\xi,x)\,p_i(x)\,\mathrm{d}\Gamma(x) - \int_{\Gamma_s} p^*_{ik}(\xi,x)\,u_i(x)\,\mathrm{d}\Gamma(x)\right] + \int_{\Omega} u^*_{ik}(\xi,x)\,b_i(x)\,\mathrm{d}\Omega(x), \quad k = 1, 2, \tag{1}$$

where $b_i$ are the components of the volume weight vector, $\Gamma_s$ are the edges (abscissas) of the element boundary, and $\xi$ is the point of
assuming uniform distribution of the boundary quantities (displacements ui(x) and tractions pi(x), i 1, 2, and volume weight forces bi to be uniform in the domain �, and positioning the points of observation � successively at the points �s, which are the centers of the boundary abscissas of the hexagonal elements, a simplified version of (1) is written as: � � � � 1 2 1 2 1 u p u u p ss is k t i s ik i s ikd d� � � ��� �� * *, ,x x x x� � �� � � 6 1 2 1 2 1 6 � �� � � � � � � � b u k t i i s ik d * , , , ; , , ,x x� � � (2) where ui s and pi s are the values of the relevant quantities positioned at the �s, s 1, …, 6, i.e., � �u ui s i s� � and � �p pi s i s� � . moreover, the vector of influences of the volume weight forces on the boundary abscissas is � �bs � 1 2, , s 1, …, 6, and � � � � k s i s ik s d� ��� � b u k i * , , ,x x� � �1 2 1 2. for better and more convenient computation, the most important integrals are given in the appendix. in this way, the integrals in (2) may be calculated directly, without numerical integration. let us introduce vectors �s, �s, s = 1, …, 6, and also u and p as: � � � � � � � � s s s s s s� � � � � � � � � u u p p 1 2 1 2 1 2 3 4 5 6 , , u � � � � � � � , p � � � � � � 1 2 3 4 5 6 � � � � , b b b b b b b 1 2 3 4 5 6 � � � �� �k s i s ik s k s i s ik s s s d d� � ��� � � � u p p u i i * *, , ,x x x x� � � �1 2 1 2 � . using this notation, the relations on the elements (2) can be recorded as: au bp b� � (3) where a and b are (12 * 12) matrices, and their components are singular integrals over the boundary abscissas. matrix a is generally singular, while matrix b is regular. this fact enables us to rearrange equations (3) into the form: ku p v k b a v b b� � � �� �, ,1 1 (4) where the stiffness matrix k is different from that arising in applications of finite elements (here it is prevailingly non-symmetric), v is the vector of volume weight forces concentrated on the boundary abscissas (more precisely at the point �). in this way, the discretized problem becomes a problem similar to the fem. along the adjacent boundary abscissas it should hold (pi � are eshelbys’ forces): � � � �p p p pi i i i� � � � � � �� � , (5) where superscript plus means from the right and minus from the left (at most two particles can be in contact). now using the relations (4) and (5), we get twice as many unknowns as equations, because no connection between the elements has yet been introduced. equations (5) have to be accomplished by a constraint of the type � �k u u pi i i i� �� � . (6) the latter conditions are penalty-like conditions, since if ki is great enough, the distribution of displacements is continuous, and the displacement from the right is equal to the displacement from the left. these conditions can locally be violated, because of the contact conditions, which are discussed later in this text. introducing boundary conditions and assuming that ki remains great enough leads us to a stable system of equations delivering a unique solution. even in the case when local disturbances occur, the solution can be stable. it can happen that there are too many disturbances, e.g., dense occurrence of crack, and localized damage along a path (earth slope stability violation). then the solution is unstable, and there is a failure of the structure. this is also, for example, the case of a rock burst. discretization in the previous sense leads to a nonlinear system of algebraic equations, which are solved by an over-relaxation iterative procedure. 
this method is sufficient for study purposes. for a larger range of equations, the conjugate gradient method has been prepared. for the displacements inside the element domain $\Omega$ it holds:

$$u_k(\xi) = \sum_{s=1}^{6}\left[p_i^s\int_{\Gamma_s}u^*_{ik}(\xi,x)\,\mathrm{d}\Gamma(x) - u_i^s\int_{\Gamma_s}p^*_{ik}(\xi,x)\,\mathrm{d}\Gamma(x)\right] + b_i\int_{\Omega}u^*_{ik}(\xi,x)\,\mathrm{d}\Omega(x), \quad \xi\in\Omega, \tag{7}$$

where the element boundary displacements and tractions are known from the previous computation, providing the solution is stable. using the kinematic equations and hooke's law, the internal stresses can be calculated from (7). there is no danger of singularities, as the points x and $\xi$ never meet (the integration point x lies on the boundary $\Gamma$, while $\xi$ lies inside the domain $\Omega$).

2.2 formulation of the contact problem

recall that the displacements are described by a vector function $\mathbf u = (u_1, u_2)$ of the variable $\mathbf x = (x_1, x_2)$. the traction field on the particle boundaries is denoted either as $\mathbf p = (p_1, p_2)$ or, after projection to the normal and tangential directions, as $\mathbf p = (p_n, p_t)$. a similar notation is valid for the projections of the displacements, $\mathbf u = (u_n, u_t)$. assuming the "small deformation" theory, the essential contact conditions on the interface (no-penetration conditions) may be formulated as follows:

$$[u_n]^k \equiv u_n^{a,k} - u_n^{c,k} \ge 0 \quad \text{on } \Gamma_c^k, \tag{8}$$

where $\Gamma_c^k$, k = 1, …, n, are the boundaries between adjacent particles, $u_n^{c,k}$ is the normal displacement of the current element $\Omega^c$ and $u_n^{a,k}$ belongs to the adjacent element $\Omega^a$, both on the current common boundary $\Gamma_c^k$; k runs over the numbers of all the common sides of the particles, and n is the number of the common sides of the hexagons (each having exactly two adjacent particles inside the domain, and one or none on the external boundary). let $k_n^k$ be the spring stiffness in the normal direction and $k_t^k$ the spring stiffness in the tangential direction on the boundary between the particles with the common boundary $\Gamma_c^k$. then, in the elastic region, $p_n^k = k_n^k [u_n]^k$ and $p_t^k = k_t^k [u_t]^k$, where $[u_t]^k$ is the jump of the tangential displacement on the side k. for a brittle or almost brittle material, the set of admissible displacements is

$$\mathbf k = \left\{\mathbf u \in \mathbf v:\ p_n^k = k_n^k [u_n]^k \text{ if } p_n^k \le p_n^{+,k}, \text{ otherwise } p_n^k = 0;\quad p_t^k = k_t^k [u_t]^k \text{ if } |p_t^k| \le c^k \right\} \quad \text{on } \Gamma_c^k,\ k = 1,\dots,n,$$

where $p_n^{+,k}$ denotes the tensile strength, $c^k$ is the shear strength, and $\mathbf v$ is the set of displacements that fulfill the kinematical boundary conditions and condition (8). if $p_n^k \le 0$, then the set $\mathbf k$ is a cone of admissible displacements satisfying the essential boundary and contact conditions. this is valid for a brittle or almost brittle material. if the material exhibits elastoplastic behavior, then the bound on the shear traction in the cone $\mathbf k$ changes to the mohr-coulomb form

$$|p_t^k| \le c^k - \Theta(p_n^k)\,p_n^k\tan\varphi,$$

where $\varphi$ is the angle of internal friction, $p_n^k$ is the normal traction on the side k, and $\Theta$ is the generalized heaviside function, being equal to zero for a positive argument and equal to one otherwise. here the sign convention is important: a positive normal traction is tension. from the above defined sets we can deduce that $p_n^k([u_n]^k)$ and $p_t^k([u_t]^k)$ behave linearly between certain limits, which are given by the material nature of the body.
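the two interface laws can be written as one small function. the following sketch, with invented parameter values, evaluates the spring tractions with the tension cut-off and the mohr-coulomb cap described above; it is an illustration of the constraint set, not the authors' implementation.

```python
"""interface law of section 2.2: linear springs, tension cut-off,
mohr-coulomb cap on the shear traction. tension is positive."""

def interface_tractions(du_n, du_t, k_n=1e9, k_t=5e8,
                        p_tens=2e5, cohesion=1e5, tan_phi=0.6):
    """return (p_n, p_t) for the displacement jumps (du_n, du_t)."""
    p_n = k_n * du_n
    if p_n > p_tens:            # tensile strength exceeded -> debonding
        return 0.0, 0.0
    p_t = k_t * du_t
    # theta(p_n) = 0 for tension, 1 otherwise, i.e. only compression
    # (negative p_n) raises the admissible shear
    cap = cohesion - (p_n if p_n < 0.0 else 0.0) * tan_phi
    if abs(p_t) > cap:          # mohr-coulomb: shear traction is cut
        p_t = cap if p_t > 0.0 else -cap
    return p_n, p_t

print(interface_tractions(du_n=-1e-4, du_t=1e-3))  # compression: frictional cap
print(interface_tractions(du_n=5e-5, du_t=1e-3))   # tension below the limit
print(interface_tractions(du_n=5e-4, du_t=0.0))    # debonded contact
```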
the total energy j of the system reads:

$$j(\mathbf u) = \frac{1}{2}\int_{\Omega_0}\boldsymbol\varepsilon^{\mathrm T}\mathbf c\,\boldsymbol\varepsilon\;\mathrm{d}\Omega - \int_{\Omega_0}\mathbf b^{\mathrm T}\mathbf u\;\mathrm{d}\Omega + \frac{1}{2}\sum_{k=1}^{n}\int_{\Gamma_c^k}\left(k_n^k\,[u_n]^2 + k_t^k\,[u_t]^2\right)\mathrm{d}\Gamma, \tag{9}$$

$$[u_n] = u_n^+ - u_n^-, \qquad [u_t] = u_t^+ - u_t^-,$$

where $\boldsymbol\varepsilon$ is the strain tensor, $\mathbf c$ is the stiffness matrix of the particle, T denotes transposition, $\Omega_0$ is the union of the subdomains $\Omega$, i.e., of the hexagonal elements, and $\mathbf b$ is the volume weight vector. note that the spring stiffness $k_n^k$ plays the role of a penalty. recall that the problem can also be formulated in terms of lagrangian multipliers, which then leads to a mixed formulation. the latter case is more suitable for a small number of boundary variables; the problem discussed in this paper decreases the number of unknowns by introducing the penalty parameters.

3 statical pfc

this section deals with the idea of modeling a structure, e.g., an earth body, by using a statical version of the pfc. recall that the pfc is based on dynamic equilibrium; for the slow movements of structures which appear, for example, in most geotechnical problems, it seems to be better to employ statical equilibrium. in the statical version, the earth mass is modeled by balls in 3d, or by disks in 2d, in a similar way as in the dynamic version. the balls are connected by springs that relate the forces and the appropriate differences of displacements in the direction of the springs. the springs are considered either in the normal and tangential directions to the boundary of the particles, or in the directions of the coordinate axes x and y. in our case, statical equilibrium has to be fulfilled over each ball and at the contact points between the adjacent balls. the balls (disks) are considered rigid. introduce a coordinate system 0xyz in 3d; then each ball has six degrees of freedom (a displacement u in direction x, a displacement v in direction y, a displacement w in direction z, and three rotations with respect to the three axes x, y and z). in what follows, we restrict our considerations to 2d for simplicity; the generalization to 3d is straightforward. the movement of each disk is described by two displacements u, v and a rotation $\varphi$. the forces concentrated at one contact point of the adjacent balls obey the contact conditions that are typical for soil in our study. they determine the change of stages in the dsc model. the plastic behavior, providing rigid balls, is imposed only by the forces brought about by the spring stiffnesses and the eigenparameters, e.g., plastic strains (displacements), or relaxation stresses (forces). when introducing a spring (of the shape of a straight line, or more precisely of an abscissa) spanned between two points, with a stiffness k, the relation between the force f and the displacement u in the elastic one-dimensional case may generally be written as

$$f = k\,u + \lambda, \quad\text{or}\quad f = k\,(u - \mu), \tag{10}$$

where $\lambda$ is the eigenstress (eigenforce) and $\mu$ is the eigenstrain (eigendisplacement). another possibility is to drop the eigenparameters and impose the nonlinear conditions exceptionally on the springs. the eigenparameters enable a larger range of physical laws to be used in the material models; this is why they are mostly applied in the mechanical models.

3.1 relations between disks

let us take a set of disks describing a discontinuous medium positioned in a coordinate system 0xy, see fig. 2.

fig. 2: structure of the balls
at the contact points (nodal points), springs in the tangential and normal directions are introduced, with a stiffness $k_t$ in the tangential direction to the boundary of the disk and $k_n$ in the normal direction to the boundary at the interface nodal point connecting two adjacent disks. such a connection is described in more detail in fig. 3 for three disks in mutual contact. fig. 3 depicts the external (volume weight) forces $f^i$, i = 1, …, n (n is the number of all disks), the contact forces in the normal and tangential directions, $n^{ij}$ and $q^{ij}$, respectively, and the reactions at the supports (created in this case by a flat plate in 3d, or by a straight line in 2d) a, b, $h_a$, $h_b$. the first index denotes the number of the current disk and the second index stands for the number of the adjacent disk; this notation is kept in the following text.

fig. 3: forces between disks

the main objective is to formulate the equations of equilibrium of each disk i, i = 1, …, n, and from this equilibrium to determine the displacements $u^i$, $v^i$ of the center and the rotations $\varphi^i$ of each disk. the connection with the adjacent disks is created by the quantities with the indices i (the current disk) and j (the adjacent disk). in the sense of (10), the physical equations at each nodal point are formulated as:

$$\begin{pmatrix} n^{ij} \\ q^{ij} \end{pmatrix} = \begin{pmatrix} k_n^{ij} & 0 \\ 0 & k_t^{ij} \end{pmatrix}\begin{pmatrix} \delta_n^{ij} \\ \delta_t^{ij} \end{pmatrix} + \begin{pmatrix} \lambda_n^{ij} \\ \lambda_t^{ij} \end{pmatrix}, \tag{11}$$

where $\lambda_n^{ij}$ and $\lambda_t^{ij}$ are, respectively, the eigenforces in the normal and tangential directions. the indices i, j describe the numbers of the disks in mutual contact, and $\delta_n^{ij}$, $\delta_t^{ij}$ are the differences of the displacements in the normal and tangential directions, respectively, between disk i and disk j, i.e., $\delta_n^{ij} = u_n^{ij} - u_n^{ji}$, $\delta_t^{ij} = u_t^{ij} - u_t^{ji}$. let the nodal point ij under consideration be deviated from the x-direction by an angle $\alpha_{ij}$. then the transformation of the forces to the 0xy coordinate system is written as:

$$\begin{pmatrix} n_x^{ij} \\ n_y^{ij} \end{pmatrix} = \begin{pmatrix} \cos\alpha_{ij} & -\sin\alpha_{ij} \\ \sin\alpha_{ij} & \cos\alpha_{ij} \end{pmatrix}\begin{pmatrix} n^{ij} \\ q^{ij} \end{pmatrix} = \mathbf t^{ij}\begin{pmatrix} n^{ij} \\ q^{ij} \end{pmatrix}, \tag{12}$$

where $n_x^{ij}$, $n_y^{ij}$ are the forces in the x and y directions, $\mathbf t^{ij}$ is the transformation matrix, and T denotes transposition. recall that the matrix $\mathbf t^{ij}$ is orthogonal, which means that $(\mathbf t^{ij})^{-1} = (\mathbf t^{ij})^{\mathrm T}$. since the same equations hold for the displacements, the following force-displacement relation is valid:

$$\begin{pmatrix} n_x^{ij} \\ n_y^{ij} \end{pmatrix} = \mathbf t^{ij}\begin{pmatrix} k_n^{ij} & 0 \\ 0 & k_t^{ij} \end{pmatrix}(\mathbf t^{ij})^{\mathrm T}\begin{pmatrix} \delta_x^{ij} \\ \delta_y^{ij} \end{pmatrix} + \begin{pmatrix} \lambda_x^{ij} \\ \lambda_y^{ij} \end{pmatrix} = \begin{pmatrix} k_{xx}^{ij} & k_{xy}^{ij} \\ k_{yx}^{ij} & k_{yy}^{ij} \end{pmatrix}\begin{pmatrix} \delta_x^{ij} \\ \delta_y^{ij} \end{pmatrix} + \begin{pmatrix} \lambda_x^{ij} \\ \lambda_y^{ij} \end{pmatrix}, \tag{13}$$

where

$$k_{xx}^{ij} = k_n^{ij}\cos^2\alpha_{ij} + k_t^{ij}\sin^2\alpha_{ij}, \qquad k_{yy}^{ij} = k_t^{ij}\cos^2\alpha_{ij} + k_n^{ij}\sin^2\alpha_{ij},$$
$$k_{xy}^{ij} = k_{yx}^{ij} = \tfrac{1}{2}\left(k_n^{ij} - k_t^{ij}\right)\sin 2\alpha_{ij},$$
$$\lambda_x^{ij} = \lambda_n^{ij}\cos\alpha_{ij} - \lambda_t^{ij}\sin\alpha_{ij}, \qquad \lambda_y^{ij} = \lambda_n^{ij}\sin\alpha_{ij} + \lambda_t^{ij}\cos\alpha_{ij}, \tag{14}$$

and $\delta_x^{ij}$, $\delta_y^{ij}$ are the differences of the displacements in the x and y directions, respectively, between disk i and disk j, i.e., $\delta_x^{ij} = u_x^{ij} - u_x^{ji}$, $\delta_y^{ij} = u_y^{ij} - u_y^{ji}$. a typical disk, with the springs introduced at the nodes in the x and y directions and the induced forces in the normal and tangential directions, is illustrated in fig. 4.

fig. 4: starting model of one disk for further computation

if no rotations were considered, the above formulas would be valid without improvement, and the computation may start from (13). for each disk, two degrees of freedom in 2d (two independent displacements), or three dof in 3d (three independent displacements), would then be sought. in the case of admitted rotations of the disks, additional unknown angles $\varphi^i$, describing the rotations of the disks, have to be introduced.
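a compact sketch of the transformation (12)-(14), assuming numpy; the stiffness values and the contact angle are illustrative. it builds the global 2×2 contact stiffness and compares it with the closed-form entries of eq. (14).

```python
"""rotate the local normal/tangential contact stiffness into the 0xy frame."""
import numpy as np

def contact_stiffness_xy(k_n, k_t, alpha):
    """global 2x2 contact stiffness t @ diag(k_n, k_t) @ t.T (eq. 13)."""
    c, s = np.cos(alpha), np.sin(alpha)
    t = np.array([[c, -s], [s, c]])   # transformation matrix of eq. (12)
    return t @ np.diag([k_n, k_t]) @ t.T

k_n, k_t, alpha = 1.0e8, 4.0e7, np.radians(30.0)
k = contact_stiffness_xy(k_n, k_t, alpha)
k_xx = k_n * np.cos(alpha) ** 2 + k_t * np.sin(alpha) ** 2   # eq. (14)
k_xy = 0.5 * (k_n - k_t) * np.sin(2.0 * alpha)               # eq. (14)
print(k)
print(k_xx, k_xy)   # equal to k[0, 0] and k[0, 1]
```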
recall that three dof (two displacements $u^i$, $v^i$ and one angle of rotation $\varphi^i$) are to be sought in 2d. the situation is depicted in fig. 5. let us focus on one typical disk. its basic movements and their denotations are clear from fig. 6.

fig. 5: additional displacements at the point ij from rotation
fig. 6: three basic movements of a disk

with respect to the above-mentioned arguments, the three movements and their parts are described in this picture. the total displacement at any point on the boundary of the disk in the x direction, or in the y direction, consists of two kinds of displacements, related to the rotation (subscript rot) and to the translation (subscript tran), i.e., $u_x^i = u_{x,\mathrm{rot}}^i + u_{x,\mathrm{tran}}^i$, $u_y^i = u_{y,\mathrm{rot}}^i + u_{y,\mathrm{tran}}^i$. since the 2d case is under consideration, the unknown quantities in each disk i are $u_{x,\mathrm{tran}}^i$, $u_{y,\mathrm{tran}}^i$ and $\varphi^i$. it only remains to express the influence of the rotation at each node. from fig. 6 it obviously holds:

$$u_{x,\mathrm{rot}}^i = r_i\left[\cos(\alpha_{ij}+\varphi^i) - \cos\alpha_{ij}\right], \qquad u_{y,\mathrm{rot}}^i = r_i\left[\sin(\alpha_{ij}+\varphi^i) - \sin\alpha_{ij}\right], \tag{15}$$

where $\alpha_{ij}$ is given for each node and $r_i$ is the radius of disk i. the forces $n_x^{ij}$ and $n_y^{ij}$ acting between disk i and disk j are then given by (13).

3.2 governing equations

equation (13) can be rearranged in a way more suitable for algorithmization:

$$\begin{pmatrix} n_x^{ij} \\ n_y^{ij} \\ n_x^{ji} \\ n_y^{ji} \end{pmatrix} = \begin{pmatrix} k_{xx}^{ij} & k_{xy}^{ij} & -k_{xx}^{ij} & -k_{xy}^{ij} \\ k_{xy}^{ij} & k_{yy}^{ij} & -k_{xy}^{ij} & -k_{yy}^{ij} \\ -k_{xx}^{ji} & -k_{xy}^{ji} & k_{xx}^{ji} & k_{xy}^{ji} \\ -k_{xy}^{ji} & -k_{yy}^{ji} & k_{xy}^{ji} & k_{yy}^{ji} \end{pmatrix}\begin{pmatrix} u_x^i \\ u_y^i \\ u_x^j \\ u_y^j \end{pmatrix} + \begin{pmatrix} \lambda_x^{ij} \\ \lambda_y^{ij} \\ \lambda_x^{ji} \\ \lambda_y^{ji} \end{pmatrix}, \tag{16}$$

and from (15):

$$u_x^i = u_{x,\mathrm{tran}}^i + r_i\left[\cos(\alpha_{ij}+\varphi^i) - \cos\alpha_{ij}\right], \qquad u_y^i = u_{y,\mathrm{tran}}^i + r_i\left[\sin(\alpha_{ij}+\varphi^i) - \sin\alpha_{ij}\right],$$
$$u_x^j = u_{x,\mathrm{tran}}^j + r_j\left[\cos(\alpha_{ji}+\varphi^j) - \cos\alpha_{ji}\right], \qquad u_y^j = u_{y,\mathrm{tran}}^j + r_j\left[\sin(\alpha_{ji}+\varphi^j) - \sin\alpha_{ji}\right]. \tag{17}$$

the third unknown, $\varphi^i$, appears in the conditions of equilibrium in nonlinear terms, namely in the cosines and sines (recall that $\alpha_{ij}$ is the angle of deviation from the x-axis of the point ij on the boundary of disk i being in contact with disk j). in order to avoid a very complicated and unreliable nonlinear computation, the load (for example, the volume weight) will be divided into increments, and in each increment the small-displacement (or, more precisely, small-rotation) theory will be considered. assuming small enough increments, a small enough angle $\varphi$ also results, and (17) is substituted by:

$$u_x^i = u_{x,\mathrm{tran}}^i - r_i\,\varphi^i\sin\alpha_{ij}, \qquad u_y^i = u_{y,\mathrm{tran}}^i + r_i\,\varphi^i\cos\alpha_{ij},$$
$$u_x^j = u_{x,\mathrm{tran}}^j - r_j\,\varphi^j\sin\alpha_{ji}, \qquad u_y^j = u_{y,\mathrm{tran}}^j + r_j\,\varphi^j\cos\alpha_{ji}. \tag{17a}$$

in each increment, $u_{x,\mathrm{tran}}^i$, $u_{y,\mathrm{tran}}^i$ and $\varphi^i$ are parts of the total values, as are also the forces computed from these movements. another advantage of this incremental process is the possibility to test the contact conditions, as described in the following section. an additional condition is necessary to complete the equations for the three unknowns in each disk. this is the moment condition with respect, for example, to the center of the disk under study:

$$\sum_{j=1}^{m_i} q^{ij}\,r_i = 0, \tag{18}$$

where $m_i$ is the number of nodes on the boundary of disk i.
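a quick numerical check of the step from (15) to (17a), with an invented radius and angles: for rotation increments of about a degree or less, the linearized kinematics track the exact expressions closely, which is what justifies the incremental small-rotation scheme.

```python
"""compare the exact rotation kinematics (15) with the linearization (17a)."""
import math

def u_rot_exact(r, alpha, phi):
    """eq. (15): boundary displacement due to the disk rotation."""
    return (r * (math.cos(alpha + phi) - math.cos(alpha)),
            r * (math.sin(alpha + phi) - math.sin(alpha)))

def u_rot_linear(r, alpha, phi):
    """eq. (17a): small-rotation approximation."""
    return (-r * phi * math.sin(alpha), r * phi * math.cos(alpha))

r, alpha = 0.05, math.radians(40.0)
for phi_deg in (0.1, 1.0, 5.0):
    phi = math.radians(phi_deg)
    ex, lin = u_rot_exact(r, alpha, phi), u_rot_linear(r, alpha, phi)
    err = math.hypot(ex[0] - lin[0], ex[1] - lin[1])
    print(f"phi = {phi_deg:4.1f} deg: |exact - linear| = {err:.2e} m")
```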
since r^i is constant in each disk i, it may be dropped, and the three conditions in disk i are obtained as:

Σ_{j=1}^{m^i} n_x^{ij} = f_x^i, Σ_{j=1}^{m^i} n_y^{ij} = f_y^i, Σ_{j=1}^{m^i} (−n_x^{ij} sin α^{ij} + n_y^{ij} cos α^{ij}) = 0, (19)

where f_x^i and f_y^i are the volume weight forces. if only gravitation is considered, the denotation f^i = f_y^i and the assumption f_x^i = 0 can be used, as in fig. 3. after introducing the boundary conditions and eigenforces, equations (19) create an algebraic system for the unknown displacements u_{x,tran}^i, u_{y,tran}^i and rotations φ^i of the disks. solving this, the forces can be determined from (16). note that, when properly stated, system (19) has a unique elastic solution, provided that the incremental formulation (17a) is assumed, i.e., small displacement theory may be employed.

3.3 interfacial conditions

in geotechnics and mining engineering it is well known that the material behavior of the soil mass is not elastic, but exhibits either plastic behavior, or localized damage, or both. contrary to the case of free hexagons, plastic behavior has to be imposed only on the spring properties. for the sake of completeness, an example of such properties is discussed here. consider two balls that are connected by the above-described system of springs. for the sake of simplicity, we again concentrate our attention on the 2d case. using formula (11), the normal forces n^{ij} and tangential forces q^{ij} can be obtained for each linear state. vector transformation of the coordinates applied to (17) provides the displacements u_n^{ij} and u_t^{ij}. in a wide range of problems, the classical plastic laws for the deformation and stress state, e.g., the elasto-plastic law, are formulated as:

(n^{ij})² + 4(q^{ij})² ≤ k_0², i.e., (k_n^{ij} u_n^{ij})² + 4(k_t^{ij} u_t^{ij})² ≤ k_0², (20)

where k_0 is a given positive number. in the case of violating condition (20), a new k_t^{ij} may be determined. while the value of k_t^{ij} is restricted by (20), u_n^{ij} is mostly bounded by some value that cannot be exceeded. moreover, k_n^{ij} changes nonlinearly with u_n^{ij}. a parabolic rule is mostly applied:

k_n^{ij} = k_{n0}^{ij} (1 − u_n^{ij})^s, (21)

where s is an exponent to be determined from laboratory tests, as is the starting value k_{n0}^{ij}. the value of n^{ij} increases nonlinearly, and when the strength is reached, a crack is assumed at point ij and the spring is suddenly removed. in practical examples, a spring that is in tension is removed gradually, to stabilize the convergence and speed up the iteration process. on the other hand, at this point "penetration" of one disk into the adjacent disk is fully permitted. this is an impact of the nonlinear behavior (21) of the spring being compressed in the normal direction. damage occurs when the mohr-coulomb hypothesis is violated or the tensile strength is reached. in this case it means that

a) |q^{ij}| ≤ τ_b − n^{ij} tan ϕ h(−n^{ij}), where τ_b is the shear strength (cohesion), h is the heaviside function, and ϕ is the angle of internal friction,
b) n^{ij} ≤ n_+^{ij}, where n_+^{ij} is the tensile strength.

when condition a) is not fulfilled, a "cut" of q^{ij} is imposed according to the rule q^{ij} = τ_b − n^{ij} tan ϕ h(−n^{ij}). note that more complicated rules may be imposed; for example, both internal parameters, the angle of internal friction and the shear strength, may change with the value of q^{ij}. in the case of violation of condition b), a local disconnection (debond) occurs and the spring is again removed; this is not because of local cracking but because of disconnection of the disks that were originally in contact at point ij. a sketch of one pass of these contact update rules is given below.
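the following matlab fragment is our sketch of one pass of the update rules described above, under our own assumptions (all material values are arbitrary test numbers; compression is taken with a negative sign of n, so the heaviside function h(−n) is written as the logical test (n < 0)):

kn0 = 1e6; kt = 4e5; k0 = 1e3; se = 2;     % stiffnesses, yield constant, exponent s (arbitrary)
taub = 500; phif = deg2rad(30);            % shear strength and angle of internal friction
nplus = 200;                               % tensile strength
un = -4e-4; ut = 3e-4;                     % current relative displacements at point ij
kn = kn0*(1 - un)^se;                      % parabolic rule, eq. (21)
n = kn*un; q = kt*ut;                      % trial forces from eq. (11), eigenforces omitted
if n^2 + 4*q^2 > k0^2                      % elasto-plastic condition (20) violated:
    kt = sqrt(max(k0^2 - n^2, 0))/(2*abs(ut));  % determine a new kt so that (20) holds
    q = kt*ut;
end
qlim = taub - n*tan(phif)*(n < 0);         % mohr-coulomb limit, condition a)
if abs(q) > qlim
    q = sign(q)*qlim;                      % the "cut" of q
end
if n > nplus                               % tensile strength reached, condition b):
    kn = 0; kt = 0; n = 0; q = 0;          % local debond, the spring is removed
end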
eigenforces have not been discussed in this paper. they are additional design parameters for an optimal approximation of reality or of laboratory tests in such a numerical model. they may also simulate other phenomena, such as a change of temperature, gas emission, blasting at some local sources, etc.

conclusions

two new discrete element methods have been introduced in this paper. one of them is an extension of the well-known particle flow code. the solution of this method is very easy and fast. on the other hand, it bears the same disadvantages as pfc itself: the interpretation of the material properties currently obtained from laboratory tests is quite complicated, if possible at all. if the material properties are properly chosen, the results seem to be realistic, as shown in the forthcoming paper by the authors. a much more complex and suitable method for predicting the real behavior of the test material is the method of free hexagons, which involves both mechanical and geotechnical properties obtained from experiments. the hexagons can cover the entire domain describing the physical body, and along the local boundaries between elements the geotechnical properties can be imposed. the material behavior inside the elements is described by virtue of boundary elements, which are more appropriate than finite elements in this case. when the finite element method is used, the local tractions are polynomials of one order higher than the boundary displacements, while the boundary element method delivers polynomials of the same order along the boundary. the spring condition is then better fulfilled by the boundary elements.

acknowledgment

this research was supported by the grant agency of the czech republic – grant number 103/00/0530, and by the ministry of education of the czech republic, project msm:210000001,3.
appendix

[formulas (a.1) – (a.4): closed-form boundary integrals of log r and related kernels over a straight element a–b, used in the boundary element formulation; fig. 7: geometry for calculating boundary integrals]

prof. ing. rndr. petr p. procházka, drsc., phone: +420 224 354 480, fax: +420 224 310 775, e-mail: petrp@fsv.cvut.cz
ing. michael kugblenu
dept. of structural mechanics, ctu in prague, faculty of civil engineering, thákurova 7, 166 29 prague 6, czech republic

acta polytechnica 57(6):418–423, 2017, doi:10.14311/ap.2017.57.0418

new spectral statistics for ensembles of 2 × 2 real symmetric random matrices

sachin kumar (a,∗), zafar ahmed (b)
a theoretical physics section, bhabha atomic research centre, mumbai 400 085, india
b nuclear physics division, bhabha atomic research centre, mumbai 400 085, india
∗ corresponding author: sachinv@barc.gov.in

abstract. we investigate spacing statistics for ensembles of various real random matrices where the matrix elements have various probability distribution functions (pdfs: f(x)), including gaussian. for two modifications of 2 × 2 matrices with various pdfs, we derive the spacing distributions p(s) of adjacent energy eigenvalues. they show the linear level repulsion near s = 0 as αs, where α depends on the choice of the pdf. more interestingly, when f(x) = x e^(−x²) (f(0) = 0), we get cubic level repulsion near s = 0: p(s) ∼ s³ e^(−s²). we also derive the distribution of eigenvalues d(ε) for these matrices.

keywords: real symmetric matrices; wigner surmise.

1. introduction
due to the matrix mechanics of heisenberg and the method of linear combination of atomic orbitals (lcao), one can easily visualize the eigenspectrum of various systems with time-reversal symmetry as the result of diagonalization of a real symmetric matrix, where the matrix elements are calculated using the inter-particle interaction. most of the time this interaction is not known. for example, the energy levels of various nuclei are known experimentally, but the nuclear interaction is not really known. random matrix theory [1–5] originated by considering the level spacing statistics p(s) between eigenvalues of the real symmetric matrices

r1 = [a, b; b, c], r2 = [a + b, c; c, a − b] (1)

when the matrix elements a, b, c are random numbers with the gaussian probability distribution function (pdf) f(x) = (1/√(2π)) e^(−x²). the spacings of the eigenvalues are given as s1 ∼ √(4b² + (a − c)²) and s2 ∼ √(b² + c²), respectively. notice that s1 is a function of three parameters (a, b, c), whereas s2 is a function of just two (b, c). notwithstanding this disparity and the complexity of the multiple integral

p(s) = a ∫_{−∞}^{∞} ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(a, b, c) δ[s − s(a, b, c)] da db dc, (2)

the spacing distributions in the two cases (r1, r2) turned out to be the same: p(s) = s e^(−s²). when arranged to yield an average spacing of 1, the normalized spacing distribution is written as

p_w(s) = (πs/2) e^(−πs²/4), (3)

which is called the spacing distribution of the gaussian orthogonal ensemble (goe), due to the orthogonal symmetry of real symmetric matrices. moreover, p_w is well known as the wigner distribution function. wigner surmised [1–5] that the spacing distribution of adjacent eigenvalues of n gaussian random real symmetric n × n matrices will again be given by (3). next, wigner predicted that (3) would eventually represent the spacing statistics of neutron-nucleus scattering resonances and nuclear levels. notice that near zero p_w(s) is linear, approximately πs/2; this is called the linear level repulsion of adjacent eigenvalues. the rotational invariance and the invariance under time-reversal of a symmetric matrix lie behind the linear level repulsion. consequently, the spacings of nuclear levels of the same angular momentum j and parity π indeed display [1–5] the p(s) in (3). wigner's surmise is strange but true: each nucleus behaves like a matrix of large order. even much later, in recent years, investigations of spacing distributions of ensembles of 2 × 2 random matrices continue to be an attractive proposition, both for symmetric/hermitian [7, 8] and for non-hermitian matrices [9–14]. in these works [7, 8] one has taken gaussian distributions with zero mean and different variances for the various entries of the matrices and derived a variety of spacing distributions.

figure 1. p(s) for (a): r1 and (b): r2 matrices due to the uniform distribution of elements. the solid lines are due to (7) and (10), the dashed lines represent the wigner distribution (3). these two p(s) are distinctly different, but near s = 0 they are linear (αs), with α equal to 1.23 and 1.01, respectively.

similarly, under the gaussian pdf, novel expressions of p(s) have been derived for ensembles of several 2 × 2 pseudo-symmetric and pseudo-hermitian matrices representing parity-time-reversal symmetric systems [9–14].
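the sameness of the two gaussian cases is easy to reproduce numerically; the following matlab fragment (our illustration, not part of the paper) samples the spacings of r1 with gaussian entries and compares their histogram with the wigner distribution (3):

nmat = 1e5;
a = randn(nmat,1); b = randn(nmat,1); c = randn(nmat,1);  % gaussian matrix elements
s = sqrt(4*b.^2 + (a - c).^2);   % spacing of the eigenvalues of r1
s = s/mean(s);                   % normalize the average spacing to 1
histogram(s, 'normalization', 'pdf'); hold on
ss = linspace(0, 4, 200);
plot(ss, pi*ss/2.*exp(-pi*ss.^2/4))   % wigner distribution, eq. (3)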
here, we show that even two modifications of real symmetric 2 × 2 random matrices yield different p(s) under the same probability distribution function (pdf) f(x). similarly, one type of matrix under several non-gaussian pdfs yields distinct expressions for p(s). however, all the spacing distributions display the linear (αs) level repulsion near s = 0, wherein α depends on the type of pdf and the type of matrix (1) being used. the question of using a non-gaussian probability distribution to test wigner's second conjecture does not appear to have attracted much attention after [6]; however, it is generally believed that spectral distributions are insensitive to pdfs. the use of many non-gaussian distributions for finding the probability [15] of occurrence of real eigenvalues for the product of n 2 × 2 real random matrices is worth mentioning. the other interesting spectral statistic, denoted d(ε), is the distribution of eigenvalues of n × n real symmetric matrices when n is large. wigner proposed it to be [1–5]

d(ε) = (2/π) √(1 − ε²), ε = e/e∗, (4)

which is well known as wigner's semicircle law. in the case of large values of n, e∗ is the maximum eigenvalue of the matrix. due to the historic connection of 2 × 2 matrices to rmt, and especially due to the astonishing sameness of p(s) in the case of the goe for n = 2 and n >> 2, the question arising here is what the analytic form of d(ε) is in the case of n = 2. due to the numerical calculations of porter we know that, qualitatively, d(ε) makes an interesting transition from a bell shape to the semicircle as n increases. in the case of n = 2, we collect a large number (n) of matrices and find the mean of the positive eigenvalues to fix e∗ = ē, and obtain d(ε) both analytically and by finding the histograms numerically. the obtained d(ε) are once again unlike wigner's semicircle law (4) (for n = 2), and our analytic/semi-analytic results agree excellently with the numerically computed histograms. in § 2, we wish to present analytic or semi-analytic p(s) for four non-gaussian pdfs of the elements of two types of 2 × 2 matrices. these non-gaussian pdfs are: uniform (u), exponential (e: f(x) = e^(−|x|)), super-gaussian (sg: f(x) = e^(−x⁴)) and maxwellian (m: f(x) = x e^(−x²)). in § 3, we derive d(ε) for r1 and r2 and plot them in figure 4 together with their numerically computed histograms.

2. ensembles of 2 × 2 real-symmetric random matrices and their spacing distributions

in this section we find p(s) (2) for the two real symmetric matrices r1 and r2 for the four pdfs (u, e, sg, m). by finding the average spacing as s̄ = ∫₀^∞ s p(s) ds / ∫₀^∞ p(s) ds, we then find the normalized spacing distribution p(s), s = s/s̄. we thus conform to the invariance of the distributions in the two variables s and s: p(s) ds = p(s) ds.

2.1. uniform distribution: f(x) = 1 for 0 ≤ |x| ≤ λ, f(x) = 0 for |x| > λ

for the matrix r1, we have to evaluate

p(s) = a ∫_{−λ}^{λ} ∫_{−λ}^{λ} ∫_{−λ}^{λ} δ[s − √(4b² + (a − c)²)] da db dc. (5)

figure 2. the same as in figure 1, for the exponential pdf (e: e^(−|x|)), arising from the semi-analytic expressions (13) and (15). here the α values are 4.05 and 2.91, respectively.

without loss of generality we may choose λ = 1.
let us introduce the transformation from (a, b, c) to (u, v, w) as u = a − c, v = a + c, w = 2b; (5) becomes

p(s) =
a ∫₀^s du ∫₀^2 dw ∫_{u−2}^{2−u} δ(s − √(w² + u²)) dv, 0 ≤ s ≤ 2,
a ∫_{√(s²−4)}^{2} du ∫₀^2 dw ∫_{u−2}^{2−u} δ(s − √(w² + u²)) dv, 2 < s < 2√2,
0, s ≥ 2√2. (6)

we find that the integrals in (6) can be done, and p(s) turns out to be a piecewise continuous function:

p(s) =
a′ s(π − s)/4, 0 ≤ s ≤ 2,
a′ { (s/2) [sin⁻¹(2/s) − sin⁻¹(√(s² − 4)/s)] + (s/4) [√(s² − 4) − 2] }, 2 < s < 2√2,
0, s ≥ 2√2. (7)

for the matrix r2 (1),

p(s) = a ∫_{−λ}^{λ} ∫_{−λ}^{λ} ∫_{−λ}^{λ} δ[s − √(b² + c²)] da db dc. (8)

the a-integral is separable and yields a multiplicative constant. we convert the double integral in b and c into polar form with b = r cos θ, c = r sin θ:

p(s) =
a′ ∫₀^1 ∫₀^{π/4} r δ(s − r) dr dθ, 0 ≤ s ≤ 1,
a′ ∫₁^{√2} ∫_{cos⁻¹(1/r)}^{π/4} r δ(s − r) dr dθ, 1 < s < √2,
0, s ≥ √2. (9)

finally, for the real symmetric matrix r2 (1), when the elements are distributed uniformly over [−1, 1], from (9) we get the continuous three-piece spacing distribution function for s ∈ (0, ∞) (λ = 1):

p(s) =
a′ πs/2, 0 ≤ s ≤ 1,
2a′ s [π/4 − cos⁻¹(1/s)], 1 < s < √2,
0, s ≥ √2. (10)

in figure 1, we plot the p(s) arising from the analytic results (7) and (10), along with the histograms generated from n = 100000 2 × 2 real symmetric matrices of the types (a): r1 and (b): r2. near s = 0 they show linear repulsion, where α (the coefficient of linearity) is 1.23 and 1.09, respectively. notice the excellent agreement of the solid lines with the histograms; the dashed lines represent wigner's distribution (3).

2.2. exponential distribution: f(x) = e^(−|x|)

the multiple integral in p(s) for r1 under the exponential pdf can be written as

p(s) = a ∫_{−∞}^{∞} ∫_{−∞}^{∞} ∫_{−∞}^{∞} e^(−(|a|+|b|+|c|)) δ[s − √(4b² + (a − c)²)] da db dc. (11)

figure 3. the same as in figures 1 and 2, for the super-gaussian pdf (sg: e^(−x⁴)), arising from the semi-analytic expression (18) and the analytic one (19). here the α values are 1.30 and 0.91, respectively.

we transform p(s) into three-dimensional spherical polar coordinates using 2b = r cos θ, a = r sin θ cos φ, c = r sin θ sin φ:

p(s) = a ∫₀^∞ ∫₀^π ∫₀^{2π} e^(−r(|cos θ|/2 + sin θ (|cos φ| + |sin φ|))) δ[s − r g(θ, φ)] r² dr sin θ dθ dφ. (12)

crashing the delta function in the above, we get a (θ, φ) integral

p(s) = a′ ∫₀^{π/2} ∫₀^π e^(−s(|cos θ|/2 + sin θ (|cos φ| + |sin φ|))/g(θ,φ)) [s²/|g(θ, φ)|³] sin θ dθ dφ, g(θ, φ) = √(1 − sin²θ sin 2φ); (13)

due to the symmetry of the integrand, the domains of integration in (13) have been reduced. the p(s) of r2 for the exponential distribution can be written as

p(s) = a ∫_{−∞}^{∞} ∫_{−∞}^{∞} ∫_{−∞}^{∞} e^(−(|a|+|b|+|c|)) δ[s − √(b² + c²)] da db dc. (14)

here the a-integral is separable and gives 1. the remaining double integral can be converted to polar form:

p(s) = a′ s ∫₀^{π/2} e^(−s(sin θ + cos θ)) dθ. (15)

the integrals (13) and (15) are not further expressible in terms of known functions. the p(s) for these two cases are plotted in figure 2; they look similar though distinct. notice their linear behaviour near s = 0, like wigner's distribution (dashed line).

2.3. super-gaussian distribution: f(x) = e^(−x⁴)

for r1 (1), the p(s) integral (2) becomes

p(s) = a ∫_{−∞}^{∞} ∫_{−∞}^{∞} ∫_{−∞}^{∞} e^(−(a⁴+b⁴+c⁴)) δ[s − √(4b² + (a − c)²)] da db dc. (16)
we transform p(s) into three-dimensional spherical polar coordinates using 2b = r cos θ, a = r sin θ cos φ, c = r sin θ sin φ:

p(s) = a ∫₀^∞ ∫₀^π ∫₀^{2π} e^(−r⁴(cos⁴θ/16 + sin⁴θ (cos⁴φ + sin⁴φ))) δ[s − r g(θ, φ)] r² dr sin θ dθ dφ. (17)

crashing the delta function in the above, we get a (θ, φ) integral

p(s) = a′ ∫₀^{π/2} ∫₀^π e^(−s⁴(cos⁴θ/16 + sin⁴θ (cos⁴φ + sin⁴φ))/g⁴(θ,φ)) [s²/|g(θ, φ)|³] sin θ dθ dφ, g(θ, φ) = √(1 − sin²θ sin 2φ). (18)

for the matrix r2, the a-integral in p(s) is separable and gives a multiplying constant. the double integral in b, c is then changed to polar form, where by crashing the delta function we get

p(s) = a′ s ∫₀^{π/2} e^(−s⁴(cos⁴θ + sin⁴θ)) dθ = a′ s e^(−3s⁴/4) ∫₀^{2π} e^(−(s⁴ cos t)/4) dt = (a′ πs/2) e^(−3s⁴/4) i₀(s⁴/4); (19)

the p(s) corresponding to (18) and (19) are plotted in figure 3, showing linear level repulsion near s = 0.

2.4. maxwellian distribution: f(x) = x e^(−x²), x > 0

for the r1 type of real symmetric matrix p(s) is not simple; however, here we would like to show that when the pdf does not peak at x = 0 we get a highly nonlinear behaviour of p(s) near s = 0 for r2. to this end we convert the integral (2) to polar form and get

p(s) = a′ s³ e^(−s²), (20)

displaying nonlinear cubic behaviour ∼ s³ near s = 0. in rmt, the pdf of the matrix elements is usually taken as symmetric and peaking at x = 0, and one gets linear level repulsion near s = 0. but when we take the non-symmetric maxwellian distribution (f(0) = 0), we get cubic level repulsion near s = 0. therefore, it would be interesting to see whether the cubic level repulsion persists for n × n matrices (n large).

3. distribution of eigenvalues d(ε) of 2 × 2 gaussian random matrices

we collect the 2n eigenvalues of n 2 × 2 matrices, find the mean of the positive eigenvalues (ē), divide all eigenvalues by ē, and find the histograms of d(ε). for a large real symmetric matrix this distribution is well known as the semicircle law (4). the distribution of the eigenvalues e₁(a, b, c) and e₂(a, b, c) can be obtained analytically as

g(e) = a ∫_{−∞}^{∞} ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(a, b, c) [δ(e − e₁) + δ(e − e₂)] da db dc, ē = ∫₀^∞ e g(e) de / ∫₀^∞ g(e) de, ε = e/ē, d(ε) = g(εē) / ∫_{−∞}^{∞} g(εē) dε. (21)

once again f(x) is the pdf of the matrix elements. here e_{1,2} = (1/2)(a + c ± √((a − c)² + 4b²)) for r1 and e_{1,2} = a ± √(b² + c²) for r2. for r1, g(e) can be obtained from (21) by using the gaussian pdf, defining a + c = u, a − c = v and crashing the delta function w.r.t. u. next, we use the polar coordinates v = r cos θ and 2b = r sin θ to get

g(e) = a′ e^(−2e²) ∫₀^∞ ∫₀^{2π} cosh(2er) e^(−r²(7/8 + cos 2θ/8)) r dr dθ, (22)

which reduces to the one-dimensional integral

g(e) = a″ e^(−2e²) ∫₀^∞ r cosh(2er) e^(−7r²/8) i₀(r²/8) dr. (23)

for r2 with the gaussian pdf, we crash the delta function w.r.t. the variable a, use the polar coordinates b = r cos θ, c = r sin θ, and get the simple form

g(e) = e^(−e²) [2 + √(2π) e erf(e/√2) e^(e²/2)] / (4√π). (24)

this function is normalized to 1 for e ∈ (−∞, ∞); ē calculated on e ∈ (0, ∞) is (4 + π)/(4√π) = 1.0073, and consequently d(ε) = g(ε). see the d(ε) histograms in figure 4 for the eigenvalues of n = 8 × 10⁴ matrices: (a) r1 and (b) r2, where the matrix elements are gaussian random numbers with mean 0 and variance 1. in figure 4, d(ε) ((23) and (24)) matches the histograms well. usually, d(ε) is plotted by taking ε = e/e_m, where e_m is the maximum of the eigenvalues, and d(ε) is studied for −1 ≤ ε ≤ 1.
with regard to this, the x-axis could be scaled down to the domain [−1, 1] to see that the ensembles of 2 × 2 real matrices defy the semicircle law, which is observed for real symmetric matrices of large order. we also find that d(ε) for both r1 and r2 is sensitive to the pdf of the matrix elements.

figure 4. distribution of eigenvalues d(ε) for r1 (a) and r2 (b) in (1) under the gaussian pdf of matrix elements. the solid line (blue) is due to (23) and (24). the histograms are due to an ensemble of n = 8 × 10⁴ matrices.

4. conclusions

our analytic and semi-analytic results on p(s) for two modifications of 2 × 2 real symmetric matrices in (7), (10), (13), (15), (18) and (19), under various probability distribution functions, and the p(s) plotted in figures 1–3, are new and instructive. they all give the linear level repulsion αs near s = 0, but notably α is not fixed. the maxwellian pdf (f(0) = 0) of the matrix elements presents a striking result, wherein the level repulsion near s = 0 is cubic. it will be further interesting to investigate spectral distributions for n × n matrices with pdfs which are non-symmetric and vanish at x = 0. the distributions of eigenvalues for the two real matrices under the gaussian pdf, obtained in (23) and (24), are also new and instructive.

acknowledgements

s. k. wishes to thank dr. shashi c. l. srivastava, vecc, kolkata, for some clarifications on rmt.

references

[1] c. e. porter, statistical theories of spectra: fluctuations (academic, new york, 1965).
[2] m. l. mehta, random matrices, 3rd ed. (elsevier, amsterdam, 2004).
[3] a. bohr and b. mottelson, nuclear structure, vol. i (benjamin, reading, ma, 1975).
[4] f. haake, quantum signatures of chaos (springer, new york, 1992).
[5] e. p. wigner, ann. math. 67 (1958).
[6] n. rosenzweig, phys. rev. lett. 1, 24 (1958).
[7] p. c. huu-tai, n. a. smirnova, p. van isacker, j. phys. a: math. gen. 35, l199 (2002).
[8] m. v. berry and p. shukla, j. phys. a: math. theor. 42, 485102 (2009).
[9] s. grossmann and m. robnik, j. phys. a: math. theor. 40, 409 (2007).
[10] z. ahmed, phys. lett. a 308, 140 (2003).
[11] z. ahmed and s. r. jain, j. phys. a: math. gen. 36, 3349 (2003).
[12] z. ahmed and s. r. jain, phys. rev. e 67, 045106(r) (2003).
[13] e. m. graefe, s. mudute-ndumbe and m. taylor, j. phys. a: math. theor. 48, 38ft02 (2015).
[14] j. gong and q. wang, j. phys. a: math. theor. 45, 444014 (2012).
[15] s. hameed, k. jain and a. lakshminarayan, j. phys. a: math. theor. 48, 385204 (2015).

acta polytechnica 59(4):411–422, 2019, doi:10.14311/ap.2019.59.0411

heat flux jump method as a possibility of contactless measurement of local heat transfer coefficients

stanislav solnař∗, martin dostál
czech technical university in prague, faculty of mechanical engineering, department of process engineering, technická 4, 166 07 prague, czech republic
∗ corresponding author: stanislav.solnar@fs.cvut.cz

abstract.
this article deals with the derivation of the temperature response of a wall of finite thickness, heated by a heat flow, as a possibility to measure local values of the heat transfer coefficient with a very high spatial resolution. besides being an experimental measurement technique in its own right, this derivation can also be used to describe the gradual heating of the wall while the temperature oscillation (toirt) method is used for heat transfer measurements. the system of the partial differential equation with the boundary and initial conditions is solved by a transformation into laplace space, where the task is easier to solve. at the end of the article, the results of the analytical solution are compared with a numerical solution of the problem.

keywords: heaviside function, heat transfer coefficient, experimental method, heat jump method.

1. introduction

when designing heat transfer devices, the heat transfer surface is a critical parameter. the size of the heat transfer surface directly affects the amount of heat flow passing through these devices. the relationship between these variables is described by newton's law of cooling

q̇ = α s Δt. (1)

the temperature difference Δt is mostly given by the requirements (depending on the available steam or water temperature, the maximum overheating in the food industry, . . . ), and thus the only variable that can affect the size of the heat transfer surface s for the desired heat flow q̇ is the heat transfer coefficient α. the heat transfer coefficient represents the intensity of the transferred heat and depends on the properties of the liquid and, above all, on the relative velocities of the solid wall and the fluid. the heat transfer coefficient is most often measured by experiments or, at present, computed by numerical simulations, which, however, need an experimental (or theoretical, if possible) validation of the calculated data. the most commonly used method for measuring the heat transfer coefficient is the static method of measuring the wall surface temperature. a thin plate or foil is fixed in place, most often heated by an electric current, and is affected by the convective heat transfer we deliberately evoke. the heat transfer coefficients cause a temperature field to form on the thin plate, which can then be used to calculate the local values of the heat transfer coefficient. this method was used, for example, by katti and prabhu [1], who measured the heat transfer between a smooth plate and the air flow from a nozzle and used an infra-red camera to measure the local surface temperatures. kenneth et al. [2] used an adiabatic wall to measure the heat transfer between a smooth wall and a matrix of impinging jets. the smooth wall was coated with a thin layer of thermochromic liquid crystals (tlc, a thermally active colour that changes with temperature). the wall with the tlc layer was scanned by an rgb camera and the recorded colour at each point was recalculated to a temperature field; quite the same method, but another way of measuring the surface temperature field. beniaiche et al. [3] measured, with the help of the stationary method, the heat transfer in the trailing edge of modern gas turbines. the 30:1 scaled model was coated with a thin calibrated tlc layer, and an rgb camera scanned the temperature field while the cooling air was flowing; the pressure field was also studied.
the resulting heat transfer coefficient maps are used to optimize the trailing edges and prevent overheating of modern turbine blades. frequently used heat transfer measurement methods are the transient methods of surface temperature measurement. hippensteele and poinsatte [4] measured the heat transfer coefficient maps in a tapered channel for high reynolds numbers (2×10⁵ to 1.2×10⁶). the authors pre-warmed the measured channel and, after reaching the initial temperature throughout the model volume, began to cool the channel by an internal air stream and recorded the surface of the channel, coated with a calibrated tlc layer, by an rgb camera. another very often used method is the electrodiffusion method (edm), which measures the current between two electrodes. when measuring with this method, it is necessary to have two electrodes and an electrolytic liquid. if we bring a voltage to the electrodes, a heterogeneous reaction results in a change of equalization between the anode and the cathode, which induces a measurable current. the stronger the mass transfer of the ions, the higher the measured values of the current [5]. this method was used for measuring the local values of the heat transfer coefficient, for example, by petera et al. [6] in the measurement of the local heat transfer coefficients at the bottom of an agitated vessel on which a jet of water was falling; the impinging jet was formed by an agitator placed in a draft tube. another application can be found, for example, in cudak and karcz [7], where the heat transfer coefficients were measured on the wall of an agitated vessel without baffles, with the agitator placed in the axis or outside the axis of the vessel. karcz and strek [8] also measured, with this method, the influence of the baffles, their location and the length of the heat transfer on the wall of an agitated vessel. a relatively new and modern method for measuring the local heat transfer coefficient values is the temperature oscillation infra-red thermography (toirt) method. this method is based on measuring the surface temperature of a wall with an infra-red camera, while the wall is influenced by an oscillating heat flow from one side and by convective heat transfer on both sides. the key information of the method is the phase delay of the temperature signal behind the oscillating heat flux; this delay is in a direct relationship with the heat transfer coefficient. the authors wandelt and roetzel [9] derived this relation and measured the heat transfer on a wall with an air flow on both sides. by comparing the experimental results with the assumed values at the individual measurement points, the functionality of the method was confirmed. freund [10] subjected this method to a very thorough theoretical analysis and set certain practical limits (maximum and recommended wall thickness, optimal oscillating heat signal frequency, etc.) for the method to function. the thesis also presents a sensitivity analysis of the method to the input variables, a theoretical and experimental validation, and the problems associated with the method. besides the experiments themselves, he also dealt with the problem of lateral conduction in the wall that occurs during certain measurements, but the method does not take this into account.
the author tries to solve this problem by means of a back-up task and numerical finite-element modelling. freund also encountered the problem of the gradual heating of the entire system (he called it drift), which arises at the beginning of the measurement as a temperature transient response. this gradual warming of the measured wall must be removed from the data if we want to use all the measured data. on a theoretical level, we should wait an infinitely long time to guarantee the thermal equilibrium of the system, which is not entirely enjoyable from a practical point of view. by removing the drift from the data, it is possible to use data that do not meet the thermal equilibrium condition. this temperature trend caused a relatively large error in the values of the temperature phase delay (which are in relation to the resulting heat transfer coefficient), and freund removed it by fitting various functions and subtracting them, and claimed that this procedure did not have a significant effect on the method error. freund published several papers on the application of the toirt method, e.g. [11], measuring the local values of the heat transfer coefficient in plate heat exchanger plates, for which he prepared a numerical model in ansys fluent, or [12], measuring heat transfer in water spray cooling systems. in freund [10] there are also experiments such as heat transfer in fluid flow in a tube, heat transfer behind vortex generators in an air tunnel, or heat transfer between a plate and an impinging jet. we will solve our problem using an integral transformation that simplifies the solution of similar problems. the laplace transformation is an integral transformation named after its discoverer pierre-simon laplace [13]. a function of the real variable t is transformed into a function of the complex variable s, which can be transformed back. the laplace transformation l transforms a function f(t) of the real variable t, defined for all real numbers t ≥ 0, to the image f(s) defined as

l(f(t)) = ∫₀^∞ f(t) e^(−st) dt = f(s). (2)

we will use f for the images of functions in laplace space. the function f(t) must be locally integrable on the interval [0, ∞). in the literature it is then possible to find the images (laplace transformations) of many different functions. the laplace transform is also very often used to solve differential equations (both ordinary and partial differential equations). if we transform an ordinary derivative of a function f′(t), we get

l(df(t)/dt) = l(f′(t)) = s f(s) − f(0), (3)

an image of the function in the form of an algebraic expression. a system of such equations can then be solved very easily, and the solution of the equation is obtained by the inverse transformation. if we transform the partial derivative of a function f(x, t) with respect to time, we get the image

l(∂f(x, t)/∂t) = s f(x, s) − f(x, 0), (4)

which is again in the form of an algebraic expression.
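a short check of the rules (2) and (3) can be done in matlab (our illustration; it requires the symbolic math toolbox, and the test function f(t) = e^(−2t) is an arbitrary choice):

syms t s
f(t) = exp(-2*t);
fs = laplace(f, t, s);              % image by definition (2): 1/(s + 2)
lhs = laplace(diff(f, t), t, s);    % image of the derivative
rhs = s*fs - f(0);                  % rule (3)
simplify(lhs - rhs)                 % returns 0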
partial differential tasks are simplified in laplace’s space to ordinary differential tasks, which are much easier to solve, and we can get their solutions by the inverse transformation. numerical solvers of partially differential equations are nowadays included in commercial software, which allow a relatively simple solving of these tasks. for 1d tasks, matlab offers the pdepe function that solves the initial tasks of parabolic and elliptical equations in a one selected direction. after defining the task, boundary and initial conditions, and the number of time and dimension points, we get the result within a few seconds. a detailed description of the use of this function can be found at https://www.mathworks.com/help/matlab/ ref/pdepe.html. probably the most commonly used tool for solving differential problems is ansys fluent. it seems unnecessary to use such a powerfull tool to solve the banal problem of heat conduction in a homogeneous wall. 2. definition of the problem the idea of the new heat flux jump method is depicted on the fig. 1. it is a wall that has infinite dimensions in the direction of x and y, and in the direction z has a finite thickness of δ, which is assumed to be very small (in milimeters). this wall with known thermophysical properties (density ρ, specific heat capacity cp and thermal diffusivity λ) is heated from the left side by a uniform heat flux q̇. the heated wall also causes a flow in the region near to the wall, causing a natural convection to the left of the task expressed as the convective heat transfer coefficient α0. the temperature field in the wall is also affected by the convective heat transfer coefficient αδ on the right hand side that we are trying to determine. heat losses to the enviroment are not considered in this task. the main temperature change is caused by the heat flow, which is applied to the left side of the wall. this heat flow will have a jumping character and can be expressed using heaviside function h(t). q̇(t) = q̂ h(t), (6) where q̂ represents the amplitude of the heat flux. heat conduction in a homogeneous wall is descibed by the fourier equation. ∂t ∂t = a ( ∂2t ∂x2 + ∂2t ∂y2 + ∂2t ∂z2 ) . (7) where a represents a temperature diffusivity. assuming that the incident heat flow to the wall is uniform and that the heat transfer coefficient on the right hand side αδ does not change on the wall, it can be assumed that the task becomes a 1d problem (there will be no heat conduction in the direction of x and y), and therefore, 413 https://www.mathworks.com/help/matlab/ref/pdepe.html https://www.mathworks.com/help/matlab/ref/pdepe.html stanislav solnař, martin dostál acta polytechnica the first two partial derivatives on the right hand side can be ommited. if we do not calculate the absolute temperature values, but only the change from the initial value t0, we can formulate this task as follows: ∂(t −t0) ∂t = a ∂2(t −t0) ∂z2 . (8) the boundary condition for solving this task from the right hand side (at the point z = δ) is given only by the investigated heat transfer coefficient αδ and can be formulated as a condition of the third kind: −λ ∂(t −t0) ∂z ∣∣∣∣ z=δ = αδ (t −tf )|z=δ , (9) where the temperature tf is the ambient temperature. the boundary condition from the left hand side of the wall (at the point z = 0) is given by the heat transfer coefficient α0 and by the heat flux that is sent to the measured wall, it, again, is a condition of the third kind and is defined as: −λ ∂(t −t0) ∂z ∣∣∣∣ z=0 = q̂ h(t) −α0 (t −tf )|z=0 . 
since this is a non-stationary task, it is necessary to add an initial condition to these equations. we assume that the initial temperature t0 is identical to the ambient temperature t_f, i.e. the system was originally in a state of thermodynamic equilibrium. we can see from equations (8) – (10) that t0 (or t_f) is only an additive constant, and so equations (8) – (10) can be formulated for

t0 = t_f = 0 (11)

without loss of generality. this initial condition simplifies the equations as well as the whole solution.

3. problem solving

we solve this problem in the following subsections: the general solution, the application of the boundary conditions, and the particular solution.

3.1. general solution

to solve this system of a partial differential equation (8), two boundary conditions (9), (10) and the initial condition (11), we first transformed the system into a partially dimensionless form. as the first transformation, we converted the z coordinate and defined the dimensionless coordinate z′ as

z′ = z/δ, (12)

which is very often seen in the literature. another variable is the time t, which we transformed into the dimensionless time t′ as

t′ = a t/δ². (13)

by introducing the dimensionless variables into the fourier heat conduction equation (8), together with the zero initial condition, we obtain

∂t/∂t′ = ∂²t/∂z′². (14)

by the laplace transformation we can convert this equation into an ordinary differential equation in laplace space:

l(∂t/∂t′) = l(∂²t/∂z′²). (15)

using suitable transformations of the expressions found in the "dictionary", we create the image t of this equation:

s t(z′, s) − t(z′, 0) = d²t/dz′². (16)

the second expression on the left-hand side of this equation represents the temperature t at z′ at time 0. this temperature is already defined by the initial condition t0 = 0, and so the equation simplifies to

d²t/dz′² − s t(z′, s) = 0. (17)

this is a homogeneous second-order differential equation with constant coefficients that can be solved using its characteristic equation, which is in this case

λ_c² − s = 0, (18)

so the roots λ_c are equal to

λ_c = ± √s, (19)

and the general solution of this equation can therefore be found in the form of a sum of hyperbolic functions

t(z′, s) = c₁ sinh(√s z′) + c₂ cosh(√s z′), (20)

where c₁ and c₂ are constants that must be determined from the boundary conditions.

3.2. transformation and application of boundary conditions

both boundary conditions must be transformed into the same dimensionless variables as the main equation, with the initial condition t0 = t_f = 0 applied to them as well:

z′ = 1 → −λ ∂t/∂z′ |_{z′=1} = αδ δ t|_{z′=1} (21)

and for the left-hand side boundary condition

z′ = 0 → −λ ∂t/∂z′ |_{z′=0} = q̂ δ h(t) − α0 δ t|_{z′=0}. (22)

performing the laplace forward transformation, we get the images of both boundary conditions:

z′ = 1 → −dt(1, s)/dz′ = (αδ δ/λ) t(1, s) (23)

and

z′ = 0 → −dt(0, s)/dz′ = (1/s)(q̂ δ/λ) − (α0 δ/λ) t(0, s). (24)

because the boundary conditions are expressed by derivatives on their left-hand sides, it is necessary to calculate the derivative of the general solution:

dt(z′, s)/dz′ = c₁ √s cosh(√s z′) + c₂ √s sinh(√s z′). (25)

applying the coordinate z′ = 0 to the general solution we obtain

t(0, s) = c₁ sinh(0) + c₂ cosh(0) = c₂, (26)

where the hyperbolic functions are evaluated directly (sinh(0) = 0, cosh(0) = 1). applying this coordinate to the derivative of the general solution we obtain

dt(0, s)/dz′ = c₁ √s cosh(0) + c₂ √s sinh(0) = c₁ √s. (27)

applying the boundary condition (24), and introducing the biot numbers bi0 = α0 δ/λ and biδ = αδ δ/λ, we get

−(1/s)(q̂ δ/λ) + bi0 t(0, s) = c₁ √s. (28)

for the expression t(0, s) we can substitute the solution (26) from the general equation. by assigning t(0, s) = c₂, we obtain the first equation for the integration constants:

−(1/s)(q̂ δ/λ) + bi0 c₂ = c₁ √s. (29)

setting the coordinate z′ = 1 in the general solution we get

t(1, s) = c₁ sinh(√s) + c₂ cosh(√s). (30)
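the general solution (20) and its derivative (25) used above can be verified symbolically in matlab (our short illustration; it requires the symbolic math toolbox):

syms s zp c1 c2
tzs = c1*sinh(sqrt(s)*zp) + c2*cosh(sqrt(s)*zp);   % general solution, eq. (20)
simplify(diff(tzs, zp, 2) - s*tzs)                 % returns 0, i.e. eq. (17) is satisfied
simplify(diff(tzs, zp) ...
    - (c1*sqrt(s)*cosh(sqrt(s)*zp) + c2*sqrt(s)*sinh(sqrt(s)*zp)))  % returns 0, eq. (25)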
by applying this coordinate to the derivative of the general solution we obtain dt(0,s) dz′ = c1 √ s cosh(0)︸ ︷︷ ︸ =1 + c2 √ s sinh(0)︸ ︷︷ ︸ =0 (27) where the values of hyperbolic functions are again expressed. by applying the boundary condition we get − 1 s q̂ δ λ + bi0 t(0,s) = c1 √ s. (28) for expression t(0,s) we can add the solution (26) from the general equation. by assingning the t(0,s), we obtain the first equation for solving integration constants. − 1 s q̂ δ λ + bi0 c2 = c1 √ s (29) by setting the coordinate z′ = 1 to the general equation we get t(1,s) = c1 sinh( √ s) + c2 cosh( √ s) (30) 415 stanislav solnař, martin dostál acta polytechnica variable value unit q̇ 2 000 w m−2 α0 3 w m−2 k−1 αδ 500 w m−2 k−1 z′ 0 δ 0.001 m λ 15.0 w m−1 k−1 cp 500 j kg−1 k−1 ρ 7900 kg m−3 table 1. table of input values for the calculation. the heat transfer coefficient α0 corresponds to the free convection on the heated side of the wall (estimated value based on correlations), αδ is the representative value of the heat transfer coefficient we are trying to find out. thermophysical properties are taken for stainless steel en 1.4307 (see https://www.notzgroup.com/media/wysiwyg/pdf/jas/werkstoffe/1.4307_en.pdf. applying this coordinate to the derivatiove of the general solution we obtaion dt(1,s) dz′ = c1 √ s cosh( √ s) + c2 √ s sinh( √ s) (31) and applying the boundary condition − biδ t(1,s) = c1 √ s cosh( √ s) + c2 √ s sinh( √ s) (32) the expression t(1,s) can be again assigned from (30) and we get the second equation − biδ [ c1 sinh( √ s) + c2 cosh( √ s) ] = c1 √ s cosh( √ s) + c2 √ s sinh( √ s) (33) 3.3. particular solution this creates two equations of two unknowns c1 a c2. this system of two equations can be written into matrix a = [ √ s bi0√ s cosh( √ s) + biδ sinh( √ s) √ s sinh( √ s) + biδ cosh( √ s) ] [ c1 c2 ] = [ −1 s q̂ δ λ 0 ] (34) and solved, for example, with cramer’s rule for solving the system of equations. determinant of the matrix a is det a = √ s [√ s sinh( √ s) + biδ cosh( √ s) ] − bi0 [√ s cosh( √ s) + biδ sinh( √ s) ] (35) and the constants c1 and c2 can be calculated from the subdeterminants (using the cramer’s rule) c1 det a = − 1 s q̂ δ λ [√ s sinh( √ s) + biδ cosh( √ s) ] (36) and c2 det a = 1 s q̂ δ λ [√ s cosh( √ s) + biδ sinh( √ s) ] (37) therefore, finding a non-stationary temperature profile in the wall, as a particular solution to equations (8 – 11), can be expressed in laplace’s space as t(z′,s) = c1 det a det a sinh( √ sz′) + c2 det a det a cosh( √ sz′) (38) 4. inverse laplace transformation since the particular solution is not quite simple to be divided into partial fractions (i.e., a direct translation of the function t), we have chosen a numerical method for the inverse laplace transform. all the results presented here will be generated for the values specified in the tab. 1. authors abate and whitt [14] introduced a numerical solver for an inverse laplace transformation and its generalized form and also re-explained its functioning. we have used the talbot algorithm, which is based on the deformation of bromwich’s integral. to understand the functioning of talbot’s algorithm you can read their fantastic paper [14]. the talbot inversion algorithm is freely downloadable for matlab as a function talbot_inversion (f_s,t,m) and for the input function f_s, vector of computation times t (in our case the 416 https://www.notzgroup.com/media/wysiwyg/pdf/jas/werkstoffe/1.4307_en.pdf vol. 59 no. 
dimensionless values of time t′) and the number of iterations of the calculation m, returns the real function values in time f(t). the result of this operation applied to our particular solution t(z′, s) can be seen in fig. 2.

figure 2. the results of the analytical calculation given as the change of temperature t compared to the initial temperature t0. the graph is depicted for the values listed in tab. 1 and for various values of the heat transfer coefficient αδ (500, 1500, 3000 and 5000 w/m²k).

figure 3. calculated temperature changes for different values of the wall thickness δ (0.5, 1, 2, 5 and 10 mm). the graph is depicted for the values listed in tab. 1.

the analytically calculated change of temperature t (from the initial temperature t_f = t0 = 0; the real temperature of the experiment is only an additive constant) changes in both the time and the amplitude domain with the changing value of the heat transfer coefficient αδ. this means that, for the evaluation of the heat transfer coefficient, we could theoretically use two pieces of information from the measured data, namely the amplitude of the temperature difference and the time constant of the temperature-time relationship. the relationships between the analytically calculated temperatures and the thermophysical properties of the wall or its thickness are depicted in figs. 3, 4 and 5. as we can see, the final temperature, after a long enough time, is not affected by a change of any thermophysical property of the wall or of its thickness; a change of these parameters affects only the shape of the temperature-time dependence. thus, a perfect knowledge of the thermophysical properties of the measured wall and its thickness is necessary. the effect of increasing or decreasing the heat flux q̇ is only a change of the final temperature. a change of the thermal conductivity λ makes almost no change to the shape of the temperature-time dependence (the maximum error for a triple value of the thermal conductivity λ is 0.019 k). the effect of the heat transfer coefficient α0 in the range 0 – 50 w/m²k is so low that it can almost be neglected.

5. numerical solution of the problem

to confirm the correctness of the results, we chose a numerical simulation of this initial value problem. for the same conditions (α0, αδ, q̇, c_p, ρ, λ, δ and z′) we have prepared a simulation of the heat conduction in a homogeneous wall with a finite thickness δ and initial temperature t0, for 20 dimensional points, in matlab using the
discussion we outlined our derivation of the change of temperature in a homogeneous wall with a finite thickness of δ, which is influenced by heat flow rate q̇ on one side and convective heat transfer from both sides. based on the knowledge of this time-temperature curve of the wall and its comparison with the experimentally measured data (i.e. with infra red camera), it is then possible to predict the heat transfer coefficient αδ on the other side of the wall. a secondary application of this derivation is an analytical description of the gradual heating of the wall by an application of the temperature oscillation method for measuring the heat transfer coefficient. the analytical calculation shows the possibility of determining the heat transfer coefficient in the temperature domain (each of the heat transfer coefficients belongs to another value of the maximum temperature difference), as well as in the time domain (each of the heat transfer coefficient belongs to another time constant of the temperature progression). the question of experimentally determined values of the heat transfer coefficient will be our future research. this method is independent on the medium in which the heat transfer takes place, as well as the temperature oscillation (toirt) method. freund [10] described the problems associated with the gradual heating of the wall when applying this method. this exponential part of the temperature dependence made it impossible to correctly determine the phase shift (and hence the heat transfer coefficient) and had to be removed from the data (freund used a stepwise iterative subtraction of the data from the lines in each period). this calculation could remove this process and describe the exponential part of the temperature dependence better and more accurate. due to the fact that, at the beginning of the data processing, we will not know the resulting heat transfer coefficient, this process would be probably iterative. 418 vol. 59 no. 4/2019 derivation of a new contactless method αδ=500 w/m 2k αδ=1000 w/m 2k αδ=2000 w/m 2k numerical δ t ( o c ) 0 1 2 3 4 t (s) 0 20 40 60 80 100 figure 6. comparison of analytically and numerically calculated values of temperature changes in the heated wall for various values of heat transfer coefficient αδ. halogen lights ir camera heat flux heat transfer coefficient wall figure 7. schematic drawing of our idea of experimental measurement. the inverse transformation of the particle solution using the talbot algorithm lasts for 200 time steps and number of iterations m = 64 (a preset value) about 0.3 seconds. the results of the analytical calculation were compared to the 1d heat conduction numerical simulation using the matlab software and the pdepe function for 20 dimension points. the results show a very good match of calculated values (the maximum value of the difference is 0.062 ◦c) caused by the numerical implementation of the solved problem. 7. idea of experimental measurements our idea of an experimental measurement of local values of the heat transfer coefficient is depicted in fig. 7. a measured wall of metallic material (steel, stainless steel, copper, aluminium . . . ) is influenced by the heat transfer coefficient we produce (i.e. airflow impinging jet, air flow along the wall, impact fluid flow in the agitated vessel . . . ). heat flux is supplied through halogen floodlights, which have a very poor efficiency and much of the power is transformed into heat. 
these halogen lights will be powered by power source and controlled by a synchronized 2-channel signal. the first signal controls the halogen flood lights and the other triggers the ir camera at specified intervals. both signals can be generated by a function generator or through a pc card. the measured data could be then compared with the calculated analytical results (see fig. 8) and the optimization algorithm will be looking for the best match of analytical and experimental values that will lead to the evaluation of the heat transfer coefficient. 8. conclusions in this paper, we have presented the derivation of the temperature response of the finite thickness wall, which is affected by the heat flow and also by the convective heat transfer coefficient. after defining the problem, we introduced the equation, boundary and initial conditions that describe the problem. to facilitate the solution 419 stanislav solnař, martin dostál acta polytechnica q=2000 w/m2 experimental analytical αδ=150 w/m 2k analytical αδ=171 w/m 2k analytical αδ=190 w/m 2k analytical αδ=210 w/m 2k δ t ( o c ) 0 2,5 5 7,5 10 12,5 t (s) 0 20 40 60 80 figure 8. comparison of experimental and analytical results. the optimization algorithm leads to the heat transfer coefficient αδ = 171 w/m2k. of this system of equations, we used the laplace’s transformation to make the solution easier. an inverse transformation of the resulting equation is not easy, so we used a numerical method to perform the inverse laplace transform. the analytical results as well as the dependence of some thermophysical parameters of the wall were plotted. other wall parameters do not have a major impact on the solution. we verified the correctness of our derivation by a numerical simulation in matlab. our idea of an experimental measurement was presented and also representative measured values of the wall temperature were compared with the analytical solution of this problem. list of symbols a temperature diffusivity [m2 s−1] bi0 biot number in place z = 0 (= α0 δ/λ) [–] biδ biot number in place z = δ (= αδ δ/λ) [–] c1,c2 integration constants [–] cp specific heat capacity [j kg−1 k−1] f(t) function in real time t [–] f′(t) derivative of function [–] f(s) image of the function in the laplace space [–] h(t) heaviside function [–] q̇ heat flow rate [w] q̇ heat flux [w m−2] q̂ amplitude of heat flux [w m−2] s variable in laplace space [–] s surface [m2] t time [s] t′ dimensionless time (= at/δ2) [–] t temperature [k,◦c] t0 temperature at the time t = 0 [k,◦c] tf ambient temperature [k,◦c] ∆t temperature difference [k,◦c] x,y,z coordinations [m] z′ dimensionless coordination (= z/δ) [–] α heat transfer coefficient [w m−2 k−1] α0 heat transfer coefficient in place z = 0 [w m−2 k−1] αδ heat transfer coefficient in place z = δ [w m−2 k−1] δ thickness of the wall [m] λ thermal conductivity [w m−1 k−1] λ roots of equation [–] ρ density [kg m−3] 420 vol. 59 no. 4/2019 derivation of a new contactless method abbreviations edm electrodiffusion method rgb red-green-blue tlc thermochromic liquid crystals toirt temperature oscillation infra-red thermography acknowledgements authors acknowledge support from the grant agency of the czech technical university in prague, grant sgs18/129/ohk2/2t/12 and support from the eu operational programme research, development and education, and from the center of advanced aerospace technology (cz.02.1.01/0.0/0.0/16_019/0000826), faculty of mechanical engineering, czech technical university in prague. references [1] v. katti, s. v. 
9. appendix

the matlab code for calculating the analytical value of the change of the temperature in the wall with the talbot inversion (variable definition, function definition and talbot inversion). the operator joining the two terms of the determinant deta was illegible in the source listing and is assumed here to be a plus sign.

clear all; format compact; clc;

d = 0.001;              % wall thickness delta [m]
l = 15;                 % thermal conductivity lambda [w/mk]
cp = 500;               % specific heat capacity [j/kgk]
rho = 7900;             % density [kg/m^3]
zstar = 0;              % dimensionless coordinate z' = z/delta
q = 2000;               % heat flux [w/m^2]
a = l/(rho*cp);         % thermal diffusivity [m^2/s]
alfa0 = 3;              % heat transfer coefficient at z = 0 [w/m^2k]
alfad = 500;            % heat transfer coefficient at z = delta [w/m^2k]
bi0 = alfa0*d/l;        % biot number at z = 0
bid = alfad*d/l;        % biot number at z = delta

% image of the temperature response in the laplace space:
% funt(s) = c1/deta * sinh(z'*sqrt(s)) + c2/deta * cosh(z'*sqrt(s))
deta = @(s) sqrt(s).*(sqrt(s).*sinh(sqrt(s)) + bid.*cosh(sqrt(s))) ...
     + bi0.*(sqrt(s).*cosh(sqrt(s)) + bid.*sinh(sqrt(s)));   % joining sign assumed
c1 = @(s) (q.*d./(s.*l)).*(sqrt(s).*sinh(sqrt(s)) + bid.*cosh(sqrt(s)));
c2 = @(s) (q.*d./(s.*l)).*(sqrt(s).*cosh(sqrt(s)) + bid.*sinh(sqrt(s)));
funt = @(s) c1(s)./deta(s).*sinh(zstar*sqrt(s)) ...
          + c2(s)./deta(s).*cosh(zstar*sqrt(s));

t = linspace(0, 100, 200);                   % real time [s]
time = (a*t)/(d^2);                          % dimensionless time t' = a*t/delta^2
result = talbot_inversion(funt, time, 64);   % numerical inverse laplace transform

matlab code for the numerical calculation of the change of temperature in the wall with the help of the pdepe function (variable definition, spatial and time discretization, numerical calculation). the code is also supplemented with additional functions for writing the partial differential equation (pdefun), the initial condition (icfun) and the boundary conditions (bcfun). note that the source defines qave but uses the global q in bcfun; the variable is renamed to q here.

clear all; format compact; clc;
global a lambda alpha alphad q        % shared with pdefun/bcfun below

t = linspace(0, 100, 200);            % time discretization [s]
lambda = 15;                          % thermal conductivity [w/mk]
a = 15/(7900*500);                    % thermal diffusivity [m^2/s]
alpha = 3;                            % heat transfer coefficient at z = 0 [w/m^2k]
alphad = 500;                         % heat transfer coefficient at z = delta [w/m^2k]
q = 2000;                             % heat flux [w/m^2]
delta = 0.001;                        % wall thickness [m]
x = linspace(0, delta, 20);           % spatial discretization across the wall
m = 0;                                % slab (cartesian) geometry
[result] = pdepe(m, @pdefun, @icfun, @bcfun, x, t);

function [c, f, s] = pdefun(x, t, u, dudx)
    global a
    c = 1/a;        % coefficient of the time derivative
    f = dudx;       % diffusive flux
    s = 0;          % no volumetric source
end

function u = icfun(x)
    u = 0;          % initial condition: zero temperature difference
end

function [pl, ql, pr, qr] = bcfun(xl, ul, xr, ur, t)
    global a lambda alpha alphad q
    pl = alpha*ul;        % convective heat transfer at z = 0
    ql = lambda;
    pr = alphad*ur + q;   % convection and imposed heat flux at z = delta
    qr = lambda;
end

acta polytechnica 59(4):411–422, 2019

acta polytechnica 60(3):252–258, 2020, doi:10.14311/ap.2020.60.0252, © czech technical university in prague, 2020, available online at https://ojs.cvut.cz/ojs/index.php/ap

kinematic synthesis of central-lever steering mechanism for four wheel vehicles

santiranjan pramanik (a, b, ∗), sukrut shrikant thipse (c)
a symbiosis international (deemed university), faculty of engineering, department of mechanical engineering, lavale, mulshi, pune, maharashtra 412115, india
b college of military engineering, department of mechanical engineering, old mumbai pune hwy, dapodi, pimpri-chinchwad, maharashtra 411031, india
c the automotive research association of india, survey no. 102, paud road, rambaug colony, kothrud, pune, maharashtra 411038, india
∗ corresponding author: santiranjan_pramanik@rediffmail.com

abstract. a central lever steering mechanism has been synthesized to obtain five precision points for a four-wheel vehicle using the hooke and jeeves optimization method. this compound mechanism has been studied as two identical crossed four-bar mechanisms arranged in series. the optimization has been carried out for one crossed four-bar mechanism only, instead of the entire mechanism. the number of design parameters considered for the optimization is two. the inner wheel has been considered to rotate up to 52 degrees. the steering error, pressure angle and mechanical advantage of the proposed mechanism have been compared with those achieved by the ackermann steering mechanism. the proposed mechanism has less steering error, a more favourable pressure angle and an increased mechanical advantage. the method of compounding the mechanism is also applicable when the central lever is offset from the longitudinal axis of the vehicle.

keywords: ackermann steering, four-wheel vehicle, hooke and jeeves method, compound steering mechanism, unsymmetrical mechanism, steering error, crossed four-bar mechanism.

1. introduction

there are several types of steering mechanisms used for four-wheel vehicles: the symmetric four-bar mechanism, the centre-lever mechanism and the rack-and-pinion type mechanism. fahey and huston [1] considered an eight-bar mechanism for steering automobiles. they suggested this mechanism after modifying a leading ackermann steering mechanism. this eight-bar mechanism eliminated the divergent end behaviour [1] of the existing ackermann mechanism, and the maximum steering error was reduced to 0.03 degree. the progressive deviation in the steering error curve with the increasing rotation of the inner wheel nearing the end range is referred to here as divergent-end behaviour.
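since the hooke and jeeves method [5] recurs throughout the synthesis below, a minimal generic pattern-search implementation is sketched here for reference (an assumed textbook variant written by us, not the authors' code); it evaluates only the objective function, with no derivatives, matching the description above.

function [x, fx] = hooke_jeeves(f, x0, h, tol)
% minimal hooke-jeeves pattern search: exploratory moves around a base
% point followed by pattern moves; the step is halved on failure.
    x = x0(:); fx = f(x);
    while max(h) > tol
        [xe, fe] = explore(f, x, fx, h);
        if fe < fx
            % pattern move: extrapolate along the successful direction
            while true
                xp = xe + (xe - x);
                [xpe, fpe] = explore(f, xp, f(xp), h);
                if fpe < fe
                    x = xe; fx = fe; xe = xpe; fe = fpe;
                else
                    break;
                end
            end
            x = xe; fx = fe;
        else
            h = h/2;   % shrink the mesh
        end
    end
end

function [x, fx] = explore(f, x, fx, h)
% exploratory move: perturb each coordinate by +/- h(i), keep improvements
    for i = 1:numel(x)
        for s = [1 -1]
            xt = x; xt(i) = xt(i) + s*h(i);
            ft = f(xt);
            if ft < fx, x = xt; fx = ft; break; end
        end
    end
end

it can be paired, for example, with the objective function sketched after section 4 below: [p, err2] = hooke_jeeves(@steering_objective, [0.2363; 1.9374], [0.01; 0.05], 1e-9), where 0.2363 rad corresponds to the initial estimate of about 13.54 degrees.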
however, there were two small and disproportionate links in the eight-bar mechanism, and hence the mechanism was impractical. simionescu and smith [2] derived initial estimates for the design of centre-lever steering linkages. they considered both leading and trailing mechanisms, for which they produced parametric design charts. de-juan et al. [3] considered the kinematic synthesis of six-bar steering linkages for both leading and trailing configurations. they considered mixed-leading and mixed-trailing configurations as well. in their work, it was found that the steering error curve had five precision points for the leading configuration only. de-juan et al. [4] considered three types of leading steering mechanisms, optimized them considering the steering error and the transmission angles, and compared them. bajpai et al. [5] explained the optimization of a function of several variables by the hooke and jeeves method, a direct method in which it is not necessary to differentiate the function with respect to the variables. zhao et al. [6] proposed a geared five-bar steering mechanism with noncircular gear sectors capable of exactly satisfying ackermann's formula; however, manufacturing imperfections and backlash can produce a steering error. they suggested that this mechanism can be applied to light carriages. peterson and kornecki [7] considered the design of the steering mechanism of a wide power frame used to harvest tree fruits. the unit was collapsible to half the width for the purpose of transportation. the ratio of the wheel track to the wheelbase was 1.55 while harvesting fruits and 0.77 while being transported on the road. a crossed four-bar steering mechanism that included two spur gears has been considered in reference [8]. there, the hooke and jeeves method was used to minimize an objective function that comprised the steering error only. the present mechanism has been considered to consist of two such crossed four-bar mechanisms in order to eliminate the spur gears. the mechanism provides five precision points and is a trailing mechanism. the kinematic synthesis has been necessary for the crossed four-bar mechanism only, and there are two design parameters, thus simplifying the calculation. the aim of the present work is to carry out the kinematic synthesis of a centre lever steering mechanism in a simplified approach. the kinematic synthesis of the offset centre lever steering mechanism has been considered here for the first time. it was found that offset centre lever steering mechanisms can be designed easily if the central hinge is not required to lie on the longitudinal axis of the vehicle.

2. crossed four-bar steering mechanism

figure 1 shows a crossed four-bar steering mechanism. the vehicle has been turned towards the right. agfb is the crossed four-bar mechanism. the inner wheel rotates about the centre i and the outer wheel about the centre a. when the inner wheel rotates by the angle fbe, the outer wheel rotates by the angle gad. as the steering arm bf is rotated counter-clockwise, the steering arm ag is rotated clockwise. since these rotate in opposite directions, the use of spur gears becomes necessary.

3. condition of correct steering

the steering is assumed geometrically correct if the axes of rotation of the wheels intersect at the instantaneous centre of rotation of the vehicle relative to the ground.
figure 2 shows that when the inner wheel rotates by an angle γ, the outer wheel rotates by an angle α. the steering position is correct, as all four wheels rotate about a common centre o. from this geometry the following equation can be written:
$$\cot\alpha - \cot\gamma = \frac{t}{w}, \qquad (1)$$
where t is the front wheel track and w is the wheelbase.

4. optimization method

the mechanism shown in figure 3 has two parameters: the angle kag (β) and the steering arm length ag (r). the inner wheel has been rotated by the angle φ (angle fbe), and then the rotation of the outer wheel (angle gad) has been found. the correct rotation of the outer wheel has been found using equation (1). then the steering error has been calculated as follows. in the initial straight-ahead position the two angles kag and fbh are equal, and the two steering arms ag and fb are equal. the length of the coupler has been found as
$$GF = \sqrt{(d - 2r\sin\beta)^2 + (2r\cos\beta)^2}. \qquad (2)$$
$$\angle BAE = \tan^{-1}\left(\frac{BEV}{AEH}\right), \qquad (3)$$
where
$$BEV = r\sin\left(\frac{\pi}{2} - \beta + \varphi\right) \qquad (4)$$
and
$$AEH = d - r\cos\left(\frac{\pi}{2} - \beta + \varphi\right). \qquad (5)$$
$$\angle DAE = \cos^{-1}\left\{\frac{r^2 + AE^2 - GF^2}{2\,r\,AE}\right\}, \qquad (6)$$
where
$$AE = \sqrt{BEV^2 + AEH^2}. \qquad (7)$$
$$\angle DAB = \angle DAE - \angle BAE. \qquad (8)$$
the rotation of the outer wheel is given by
$$\angle GAD = \frac{\pi}{2} - \beta - \angle DAB. \qquad (9)$$
the correct angle of rotation of the outer wheel is given by
$$\angle GAD_{correct} = \cot^{-1}\left(\cot\varphi + \frac{t}{w}\right). \qquad (10)$$
the steering error is given by
$$error = \angle GAD - \angle GAD_{correct}. \qquad (11)$$
the objective function is given by
$$obj\,fun = \sum error^2, \qquad (12)$$
where the sum runs over φ from zero to 52 degrees. the design variables are β and r. the hooke and jeeves optimization method [5] has been applied to minimize the objective function (12). we use, as the initial estimate, r = 1.9374 and β = 13.5407°, obtained by an analytical method in another study [9]. a vehicle with a track to wheelbase ratio of 0.326 has been considered, for which the final estimate has been made; using the concept of compound mechanisms, the track to wheelbase ratio has been taken as 0.163. by the optimization, we found the length of the steering arm to be 1.933 units and the inclination angle β to be 13.521 degrees, where the distance ab is 10 units. we have shown the steering error curve in figure 4. the maximum steering error of the proposed compound mechanism has been reduced to 0.044 degree through the optimization. the steering error of the proposed mechanism is less compared to that of the ackermann steering mechanism of the benchmark vehicle [10]. the pressure angle at any joint is defined as the angle between the output link and the direction of the force applied by the input link. the pressure angles at the joints d and e are found as follows:
$$\angle ADE = \cos^{-1}\left\{\frac{r^2 + GF^2 - AE^2}{2\,r\,GF}\right\}. \qquad (13)$$
the pressure angle at the joint d is given by
$$\mu_1 = \angle ADE - \frac{\pi}{2}. \qquad (14)$$

figure 1. crossed four-bar steering mechanism including gears.
figure 2. geometry of correct steering.
figure 3. geometry of a crossed four-bar mechanism.
figure 4. comparison of steering error.

now the length db is given by
$$DB = \sqrt{\{d - r\cos(\angle DAB)\}^2 + \{r\sin(\angle DAB)\}^2}. \qquad (15)$$
the angle bed is given by
$$\angle BED = \cos^{-1}\left\{\frac{r^2 + DE^2 - DB^2}{2\,r\,DE}\right\}. \qquad (16)$$
the pressure angle at the joint e is given by
$$\mu_2 = \frac{\pi}{2} - \angle BED. \qquad (17)$$
the pressure angle of the proposed steering mechanism has been compared with that of the conventional ackermann steering of the benchmark vehicle [10].
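to make the synthesis reproducible, the following matlab sketch reconstructs the objective (12) from equations (2)-(11), with d = 10 units and t/w = 0.163 as in the text; it is our reconstruction, not the authors' code, and pairs with the hooke_jeeves sketch given earlier.

function obj = steering_objective(p)
% objective (12) built from equations (2)-(11);
% design vector p = [beta in rad; arm length r in units of ab/10].
    beta = p(1); r = p(2);
    d = 10;                          % distance ab [units]
    tw = 0.163;                      % track to wheelbase ratio
    GF = sqrt((d - 2*r*sin(beta))^2 + (2*r*cos(beta))^2);      % eq (2)
    obj = 0;
    for phi = deg2rad(1:52)          % inner wheel rotation, limits of eq (12)
        BEV = r*sin(pi/2 - beta + phi);                        % eq (4)
        AEH = d - r*cos(pi/2 - beta + phi);                    % eq (5)
        angBAE = atan2(BEV, AEH);                              % eq (3)
        AE = sqrt(BEV^2 + AEH^2);                              % eq (7)
        angDAE = acos((r^2 + AE^2 - GF^2)/(2*r*AE));           % eq (6)
        angDAB = angDAE - angBAE;                              % eq (8)
        angGAD = pi/2 - beta - angDAB;                         % eq (9)
        angGADc = acot(cot(phi) + tw);                         % eq (10)
        obj = obj + (angGAD - angGADc)^2;                      % eqs (11)-(12)
    end
end

as a sanity check, for beta = 13.521 degrees and r = 1.933 the outer wheel rotation from eq. (9) vanishes as phi tends to zero, as it should for the straight-ahead position.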
as figure 5 shows, the pressure angle of the proposed compound steering mechanism is less than that of the ackermann steering mechanism. the mechanical advantage of a mechanism is defined as the ratio of the torque produced at the output link to the torque provided at the input link. it has been calculated [9] from the product cos µ1 × cos µ2. the mechanical advantages of the two mechanisms are compared in figure 6, and it has been found that the proposed mechanism has a higher mechanical advantage.

5. compound six member mechanism

a compound six-member mechanism has then been considered, comprising two mirrored crossed four-bar mechanisms. the compound mechanism has a track to wheelbase ratio of 0.326, which is twice the ratio of 0.163. a track to wheelbase ratio of 0.163 is not feasible for a real vehicle, but using the ratio 0.163 for each half of the compound mechanism is suitable, since the compound mechanism then corresponds to a vehicle with the ratio 0.326, that of the benchmark vehicle, the 12 m bus by ashok leyland [10]. the front wheel track of this vehicle is 2020 mm, which is shown as 20 units in figure 7; hence one unit is 101 mm in this figure. the wheelbase of the vehicle is 6200 mm. the length of the arms ce, bd, bf and ag is 1.933 units, which equals 195.2 mm. the length of the couplers gf and de is 994 mm. the crossed four-bar chain is such that two such chains can be connected in series, as shown in figure 7. here, the central lever bfd can be moved backwards in order to increase the space for installing the engine and the gearbox. the paper under reference [8] uses spur gears to bring the mechanism behind the front axle, and the use of gears may produce more steering error. so the mechanism suggested in the present paper has two benefits: the first is the availability of more space, and the second is the elimination of the spur gears. figure 8 shows the modification of the mechanism to increase the available space, and demonstrates the usefulness of the crossed four-bar chain. in this figure, the central hinge joint is shifted backwards by the distance og, which is 520 mm, for example. now the distances gf and ag are 1136 mm. the arms ab, gc, gd and ef are all 219.6 mm in length. the length of the couplers de and bc is 1118 mm each. the length of the link cd is 286.8 mm. in case there is a constraint such that the hinge point g cannot be placed on the longitudinal axis of the vehicle, this can be accommodated by modifying the mechanism as shown in figure 9. the kinematic synthesis of such an offset centre lever mechanism is difficult by the presently available methods, but the concept of compounding the mechanism is very useful for devising an offset centre lever steering mechanism. figure 9 shows that the central hinge g has been located by the distances x and y. this shift is not made intentionally, but is required when there is a constraint that the central hinge cannot be placed on the longitudinal axis. here, the distances x and y have been chosen arbitrarily as an example. the links ab and gc are equal in size. also, the links ef and gd are equal in length. for a non-zero value of the distance x, the links gc and gd are unequal.

figure 5. comparison of pressure angles.
figure 6. comparison of mechanical advantages.
figure 7. compound six bar mechanism.
figure 8.
modification of the mechanism to increase space.
figure 9. offset center lever steering mechanism.

in this figure, for example, x is 200 mm and y is 500 mm. the distances ao and of are each equal to half of the front track of 2020 mm, i.e. 1010 mm. using the pythagorean theorem, we find the distances ag and gf to be 1309.2 mm and 951.9 mm, respectively. the arms ef and gd are 184 mm in length, while the arms ab and gc are each 253.1 mm in length. the length of the coupler bc is ag × 994/1010 = 1288.5 mm. the length of the coupler de is gf × 994/1010 = 936.8 mm. the length of the link cd is calculated as follows. the angle cgd is given by
$$\angle CGD = \tan^{-1}\left(\frac{y}{OA + x}\right) + \tan^{-1}\left(\frac{y}{OF - x}\right) + 2\beta = \tan^{-1}\left(\frac{500}{1010 + 200}\right) + \tan^{-1}\left(\frac{500}{1010 - 200}\right) + 2\times 13.521^{\circ} = 22.452^{\circ} + 31.686^{\circ} + 27.042^{\circ} = 81.18^{\circ}. \qquad (18)$$
from the triangle cgd the following equation can be written:
$$CD^2 = GC^2 + GD^2 - 2\,GC\,GD\cos(\angle CGD) = 253.1^2 + 184^2 - 2\times 253.1\times 184\times\cos(81.18^{\circ}) = 83\,634.26. \qquad (19)$$
therefore, the length of the link cd is 289.2 mm. the steering error curve for the mechanisms shown in figures 7 to 9 should be the same, as all of these are obtained by compounding two crossed four-bar mechanisms of similar proportions. the steering error curve is shown in figure 10; owing to the compounding effect, the steering error is reduced compared to that of a single crossed four-bar mechanism that employs two equal spur or helical gears.

figure 10. comparison of steering error.

6. conclusions

it has been shown that two crossed four-bar chains can be added in series to obtain a centre lever steering mechanism. the kinematic synthesis of this centre lever steering mechanism has been found to be easy in the present work. the mechanism can be suitably modified to provide more space for the placement of the engine, the gearbox and other parts of the vehicle. this mechanism is more accurate than the conventional ackermann steering mechanism and also than the crossed four-bar steering mechanism. it has a low pressure angle and a high mechanical advantage compared to the ackermann steering mechanism. also, it can accommodate a constraint like the requirement of placing the central hinge at some offset from the longitudinal axis of the vehicle. in this case, the two crossed four-bar chains are not identical, as one is smaller in size and the other is bigger because of the offset, and the steering arms are unequal in length; but both have been designed for an equal value of the wheel track to wheelbase ratio of 0.163. in the case of the wide track vehicle mentioned in reference [7], the steering mechanism would have to be made by adding four crossed four-bar chains, each having an imaginary track to wheelbase ratio of one-fourth that of the vehicle expanded for harvesting fruits. when this vehicle has to be transported on the road, it can be made collapsible by using only two crossed four-bar chains in series and discarding the other two.

references
[1] s. o. fahey, d. r. huston. a novel automotive steering linkage. journal of mechanical design 119(4):481–484, 1997. doi:10.1115/1.2826393.
[2] p. a. simionescu, m. r. smith. initial estimates in the design of central-lever steering linkages. journal of mechanical design 124(4):646–651, 2002. doi:10.1115/1.1505853.
[3] a. de-juan, r. sancibrian, f. viadero. optimal synthesis of function generation in steering linkages. international journal of automotive technology 13(7):1033–1046, 2012. doi:10.1007/s12239-012-0106-4.
[4] a. de-juan, r. sancibrian, f. viadero. optimal synthesis of steering mechanism including transmission angles. in proceedings of eucomes, pp. 177–183. 2008.
[5] a. c. bajpai, l. r. mustoe, d. walker. advanced engineering mathematics. john wiley and sons, london, 1990.
[6] j. s. zhao, x. liu, z. j. feng, j. s. dai. design of an ackermann type steering mechanism. journal of mechanical engineering science 227(11):2549–2562, 2013. doi:10.1177/0954406213475980.
[7] d. l. peterson, t. s. kornecki. steering mechanism for wide-track vehicles. applied engineering in agriculture 2(1):16–17, 1986. doi:10.13031/2013.26699.
[8] s. pramanik, s. s. thipse. kinematic synthesis of a crossed four-bar mechanism for automotive steering. international journal of vehicle structures and systems 9(3):169–171, 2017. doi:10.4273/ijvss.9.3.07.
[9] s. pramanik. design and development of innovative crossed four-bar steering mechanism for automobiles. ph.d. thesis, symbiosis international university, pune, india, 2019.
[10] wikipedia. ashok leyland 12m. https://en.wikipedia.org/wiki/ashok_leyland_12m, 2019. accessed: april 2020.

acta polytechnica 60(3):252–258, 2020

acta polytechnica 61(si):155–162, 2021, doi:10.14311/ap.2021.61.0155, © 2021 the author(s). licensed under a cc-by 4.0 licence. published by the czech technical university in prague.

numerical solution of fluid-structure interaction problems with consideration of contacts

petr sváček
czech technical university in prague, faculty of mechanical engineering, department of technical mathematics, karlovo nám. 13, 121 35 praha 2, czech republic
correspondence: petr.svacek@fs.cvut.cz

abstract. this paper is interested in the mathematical modelling of the voice production process. the main attention is paid to the possible closure of the glottis, which is included in the model with the concept of a fictitious porous media and with the use of the hertz impact force. the time dependent computational domain is treated with the aid of the arbitrary lagrangian-eulerian method, and the fluid motion is described by the incompressible navier-stokes equations coupled to structural dynamics. in order to overcome the instability caused by the dominating convection due to high reynolds numbers, stabilization procedures are applied and numerically analyzed for a simplified problem. the possible distortion of the computational mesh is considered. numerical results are shown.

keywords: aeroelasticity, finite element method, incompressible fluid.

1. introduction

the voice production mechanism is a complex process consisting of a fluid-structure-acoustic interaction problem, where the coupling between the fluid flow, the viscoelastic tissue deformation and the acoustics is crucial, see [1]. the so-called phonation onset (flutter instability), for a certain airflow rate and a certain prephonatory position, leads to the vocal folds oscillation. an important aspect of the phenomenon is the glottis closure (the glottis is the narrowest part between the vibrating vocal folds).
the considered problem can be mathematically described as a problem of fluid-structure interaction with the involvement of the (periodical) contact problem of the vocal folds. in order to include the interactions of the fluid flow with the solid body deformation and the contact problem, a simplified model problem is considered. this model is similar to the simplified two-mass model of the vocal folds of [2], see also the aeroelastic model in [3]. in this paper the mathematical model is introduced and the numerical approximation of the problem is described, where a residual based stabilization is used for the incompressible flow model. the simplified lumped vocal fold model is considered, where the contact is treated with the aid of the hertz impact forces. in the flow model, the contact is considered with the combination of a suitable modification of the inlet boundary condition and the concept of a fictitious porous media flow. attention is paid to the details of the finite element approximations with the aid of the ale conservative method. the solution of the system of equations is discussed, with attention on the projection method and on the discrete projection method. numerical tests are presented.

2. mathematical model

in this section a simplified model fluid-structure problem is considered. the domain occupied by the fluid ωt is shown in figure 1. the fluid flow is coupled with the elastic structure deformation of a simplified two degrees of freedom model of a vocal fold.

2.1. flow problem

first, the air flow is modelled by the system of the navier-stokes equations (cf. [4]) written in the ale conservative form (cf. [5])
$$\frac{1}{J_A}\frac{D^A (J_A \rho u)}{D t} + \rho\,\nabla\cdot\big((u - w_d)\otimes u\big) = \operatorname{div}\tau^f, \qquad \nabla\cdot u = 0, \qquad (1)$$
where u = (v1, v2) is the fluid velocity vector, ρ is the constant fluid density, τ^f = (τ^f_{ij}) is the fluid stress tensor given by τ^f = −pI + 2µD, the symmetric gradient tensor D = (d_{ij}) is given by $D(u) = \frac{1}{2}(\nabla u + \nabla^T u)$, p denotes the pressure and µ > 0 is the constant fluid viscosity. further, $\mathcal{A}_t$ denotes the arbitrary lagrangian-eulerian mapping, which maps, at any time t ∈ [0, T], the reference configuration ω_ref = ω_0 onto the current configuration ω_t; J_A denotes the jacobian of this mapping; w_d denotes the domain velocity; and $\frac{D^A u}{D t}$ is the ale derivative, i.e. the derivative with respect to the reference configuration ω_ref. for the system (1), initial and boundary conditions are prescribed. the boundary conditions are prescribed on the boundary ∂ω^f_t of the computational domain, formed by the mutually disjoint parts ∂ω^f_t = γ_i ∪ γ_s ∪ γ_o ∪ γ_wt, where γ_i denotes the inlet, γ_o the outlet, γ_s the axis of symmetry and γ_wt denotes either the fixed or the deformable wall.

figure 1. the computational domain ωt with specification of the boundary parts.
figure 2. two degrees of freedom model (with masses m1, m2, m3) in a displaced position (displacements w1 and w2); the acting aerodynamic forces f1 and f2 are shown.

the standard boundary conditions are prescribed at γ_t = γ_wt ∪ γ_wf, formed of the fixed wall γ_wf (where w_d = 0) and the moving wall γ_wt, i.e. the surface of the vocal fold model, and at the axis of symmetry γ_s:
$$\text{a) } u = w_d \text{ on } \Gamma_{Wt}, \qquad \text{b) } u_2 = 0,\; -\tau^f_{12} = 0 \text{ on } \Gamma_S, \qquad (2)$$
as γ_s is chosen to be located at the line y = const (in the computations we choose y = 0).
furthermore, at the inlet γ_i and at the outlet part γ_o the boundary conditions are formally prescribed as
$$\text{c) } \frac{\rho}{2}(u\cdot n)^- u - n\cdot\tau^f = \frac{1}{\varepsilon}(u - u_I) \text{ on } \Gamma_I, \qquad \text{d) } \frac{\rho}{2}(u\cdot n)^- u - n\cdot\tau^f = p_{ref}\, n \text{ on } \Gamma_O, \qquad (3)$$
where n denotes the unit outward normal vector to ∂ω^f_t, u_I is a prescribed inlet velocity, p_ref is a reference pressure value (p_ref = 0 in what follows), ε > 0 is a penalization parameter and α^- denotes the negative part of a real number α. here, the boundary condition (3c) weakly imposes the dirichlet boundary condition u = u_I with the aid of the penalization parameter ε.

2.2. structure model

the motion of the vocal fold is modelled as a motion of a rigid body governed by the displacements w1(t) and w2(t) of the two masses m1 and m2, respectively (see fig. 2). the equation of motion (see [3] for details) reads
$$M\ddot{w} + B\dot{w} + K w = -F, \qquad (4)$$
where M is the mass matrix of the system, K is the diagonal stiffness matrix of the system characterized by the spring constants k1, k2, and B = ε1 M + ε2 K is the matrix of the proportional structural damping, ε1, ε2 being the constants of the proportional damping. the mass and stiffness matrices are given by
$$M = \begin{pmatrix} m_1 + \frac{m_3}{4} & \frac{m_3}{4} \\ \frac{m_3}{4} & m_2 + \frac{m_3}{4} \end{pmatrix}, \qquad K = \begin{pmatrix} k_1 & 0 \\ 0 & k_2 \end{pmatrix}, \qquad (5)$$
where m1, m2, m3 are the masses shown in fig. 2. the components of F = (f1, f2)^T are the aerodynamical forces (downward positive) acting at the masses m1 and m2, see fig. 2.

2.3. coupling conditions

the aerodynamical forces f1, f2 are computed with the aid of the aerodynamical lift force l(t) and the aerodynamical torsional moment m(t) acting on the surface of the structure γ_wt. the aerodynamical lift force and the aerodynamical torsional moment are evaluated with the aid of the mean (kinematic) pressure p and the mean flow velocity u = (u1, u2) as the integrals over the surface of the airfoil
$$L = -l\int_{\Gamma_{Wt}} \tau^f_{2j}\, n_j \,\mathrm{d}s, \qquad M = l\int_{\Gamma_{Wt}} \tau^f_{ij}\, n_j\, r^{ort}_i \,\mathrm{d}s, \qquad (6)$$
where l denotes the depth of the profile section, and the vector r^{ort} has the components r^{ort}_1 = −(x_2 − x^{ea}_2) and r^{ort}_2 = x_1 − x^{ea}_1, with (x^{ea}_1, x^{ea}_2) being the position of the structure elastic axis. the displacement of the structure surface γ_wt is determined in terms of w1, w2, and it is used as the boundary condition for the displacement that maps any point of the computational domain ω^ref_0 onto the domain ω_t. moreover, the domain velocity at γ_wt is determined using w1(t) and w2(t) obtained from the solution of the ordinary differential equations (4).

2.4. treatment of the contact problem

in order to allow the closure of the fluid computational domain, the idea of a fictitious porous media flow is employed. it means that during the closing phase of the vocal folds, a part of the structure domain (denoted by ω^p_t) is assumed to be a domain of porous media flow, see fig. 3. the flow in the domain ω^p_t is then assumed to be governed by the modified equations
$$\frac{\rho}{J_A}\frac{D^A(J_A u)}{D t} + \rho\,\nabla\cdot\big((u - w_d)\otimes u\big) + \alpha u = \operatorname{div}\tau^f, \qquad (7)$$
where the coefficient α corresponds to the artificial porosity of the medium, see [6]. the coefficient in the porous media flow is usually chosen as α = µ/P, where µ is the dynamical viscosity of the fluid and P is an artificial permeability coefficient, see [6]. in practical computations, equation (7) is solved in the whole computational domain ω_t, with α being zero at all points outside of the porous media domain ω^p_t and being a positive constant (in this paper α was chosen as 10⁷) in the interior of ω^p_t.

figure 3. the detail of the porous media flow domain ω^p_t.
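as an illustration of the structural part (4)-(5) above, the following minimal matlab sketch (ours, with placeholder parameter values, not those of the paper) assembles the matrices and integrates the two degrees of freedom system with ode45 for a prescribed forcing.

% minimal sketch of the structural model (4)-(5); all values are placeholders.
m1 = 1e-4; m2 = 1e-4; m3 = 5e-5;          % masses [kg]
k1 = 50; k2 = 50;                          % spring constants [n/m]
eps1 = 1e-2; eps2 = 1e-5;                  % proportional damping constants

M = [m1 + m3/4, m3/4; m3/4, m2 + m3/4];    % eq (5)
K = diag([k1, k2]);
B = eps1*M + eps2*K;                       % proportional damping

F = @(t) [1e-3*sin(2*pi*100*t); 0];        % placeholder aerodynamic forcing

% first-order form y = [w; wdot]; eq (4): M*wddot + B*wdot + K*w = -F
rhs = @(t, y) [y(3:4); M \ (-F(t) - B*y(3:4) - K*y(1:2))];
[t, y] = ode45(rhs, [0 0.1], zeros(4,1));
plot(t, y(:,1:2));                         % displacements w1(t), w2(t)

in the coupled problem, F would be replaced at each time step by the aerodynamic forces obtained from (6), augmented by the hertz impact force described below.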
the porous media domain is determined by the following procedure, illustrated in figure 3. first, the actual gap g(t) at the time t is computed as the distance of γ_wt from the axis of symmetry γ_s and compared to a prescribed minimal gap threshold g_min. if g(t) ≥ g_min, then the glottis is open and no porous media is used (i.e. ω^p_t is empty and α ≡ 0). for g(t) < g_min, the y-displacement of the points of the surface γ_wt is modified in order not to violate the condition g(t) ≥ g_min, i.e. these points are vertically displaced by such a shift −∆w that makes the actual gap equal to g(t) = g_min. for the structural model, the hertz impact forces are used as specified in [3]. in this case, the right hand side F of eq. (4) is modified by the addition of the hertz impact force F_H, which is distributed to both components of F depending on the position x_max of the impact. the model of the hertz impact forces is given as
$$F_H = k_H\,\delta^{3/2}\big(1 + b_H\dot{\delta}\big), \qquad k_H \approx \frac{4}{3}\,\frac{E}{1-\mu_H^2}\,\sqrt{R},$$
where δ stands for the penetration of the vocal fold through the contact plane, b_H is a damping factor (here set to zero), R is the radius of the osculating circle (i.e. the inverse of the curvature), E is young's modulus of elasticity of the vocal fold and µ_H is its poisson's ratio; for the values see [3].

2.5. mesh deformation

first, due to the motion of the structure, the transformation of the computational domain (and of the mesh as well) needs to be constructed at every time instant t_k. here it is represented by the construction of the ale mapping $\mathcal{A}_t$ at every time instant t = t_k. the ale mapping $\mathcal{A}_t$ is sought in the form of displacements d = d(ξ, t) of the points ξ of the original configuration, i.e. $\mathcal{A}_t(\xi) = \xi + d(\xi, t)$. the boundary condition for d is then known for every point of the boundary ∂ω: at ∂ω \ γ_wt the displacement d = 0, and at γ_wt it is determined by the already known (or estimated) position of the structure surface γ_wt in terms of the displacements w1 and w2. in order to determine d in ω, a boundary value problem is solved, see e.g. [7].

3. numerical approximation

in this section, let us consider the computational domain ω_t and the ale mapping $\mathcal{A}_t$ to be already known, and $\mathcal{A}_t$ to be sufficiently smooth at every time instant t ∈ I = (0, T). a similar assumption is made about the domain velocity w_d.

3.1. weak formulation

in order to introduce the function spaces for the velocity and the pressure, we start with defining the spaces at the reference configuration. we shall seek the velocity-pressure pair U = (u, p) at any time instant t in the space W_t = V_t × Q_t, where Q_t = L²(ω_t) and
$$V_t = \{\varphi \in H^1(\Omega_t) : \varphi\cdot n = 0 \text{ at } \Gamma_S\}.$$
further, the weak formulation is derived using the standard approach, i.e. we multiply equations (1) by test functions z and q, integrate over ω_t and use the green theorem. however, in the finite element context such test functions are usually time independent, whereas here the test functions are defined over the time dependent domain ω_t and naturally must be chosen as time dependent. to this end, we first define the test functions on the reference domain ω^ref_0, i.e. we denote
$$X^{ref} = \{\varphi \in H^1(\Omega^{ref}_0) : \varphi = 0 \text{ at } \Gamma^{Wref}_0,\; \varphi\cdot n = 0 \text{ at } \Gamma_S\}. \qquad (8)$$
next, these (reference) test functions are defined on the current configuration ω_t with the aid of the ale mapping $\mathcal{A}_t$, i.e. the test function z = z(x, t) can be defined using a reference test function z^ref ∈ X^ref by
$$z(x,t) = z^{ref}\big(\mathcal{A}_{t_{n+1}}(\mathcal{A}^{-1}_t(x))\big) \quad \forall x \in \Omega_t,\; t \in (0,T). \qquad (9)$$
we shall denote the space of such test functions by $\mathcal{X}$, i.e.
$$\mathcal{X} = \{z : z \text{ satisfies (9) for some } z^{ref} \in X^{ref}\}. \qquad (10)$$
the space $\mathcal{X}$ is, for any t, a subspace of V_t, and the test functions from $\mathcal{X}$ are time independent when transformed onto the reference domain. this means that their ale derivative equals zero, i.e. for z ∈ $\mathcal{X}$ we have $\frac{D^A z}{D t} = 0$. in what follows, the symbol (·, ·)_M denotes the scalar product in $L^2(M)$ or $\mathbf{L}^2(M)$. the weak form is then derived in the standard way: we take a test function V = (z, q) ∈ $\mathcal{X}$ × Q_t, multiply the first equation (1) by the test function z and the second by q, sum together, integrate over ω_t and use green's theorem for the viscous terms and the pressure gradient. thus we arrive at the weak form: find U = (u, p) ∈ W_t such that for any t ∈ (0, T), u satisfies the boundary condition (2a) and
$$\frac{d}{dt}(\rho u, z)_{\Omega_t} + c(u; U, V) + b(u; U, V) + B(U, V) + B^*(U, V) + D(U, V) = L_\Gamma(V) \qquad (11)$$
holds for any V = (z, q) ∈ $\mathcal{X}$ × Q_t. the forms in eq. (11) are defined for any U = (u, p) ∈ W_t and V = (z, q) ∈ $\mathcal{X}$ × Q_t as follows:
$$B(U,V) = (\nabla\cdot u, q)_{\Omega_t}, \qquad B^*(U,V) = -(\nabla\cdot z, p)_{\Omega_t}, \qquad D(U,V) = \int_{\Omega_t} 2\mu\, d_{ij}(u)\, d_{ij}(z)\,\mathrm{d}x,$$
the boundary forms b, L_Γ are given as
$$b(u; U, V) = \frac{1}{\varepsilon}(u, z)_{\Gamma_I} + \Big(\frac{\rho}{2}(u\cdot n)^+ u,\, z\Big)_{\Gamma_I \cup \Gamma_O}, \qquad L_\Gamma(V) = \frac{1}{\varepsilon}(u_I, z)_{\Gamma_I} - \int_{\Gamma_O} p_{ref}\,(n\cdot z)\,\mathrm{d}s,$$
and the convective term is given by the skew-symmetric trilinear form c (here we abbreviate w = u − w_d)
$$c(u; U, V) = \Big(\frac{\rho}{2}(w\cdot\nabla)u,\, z\Big)_{\Omega_t} - \Big(\frac{\rho}{2}(w\cdot\nabla)z,\, u\Big)_{\Omega_t} - \Big(\frac{\rho}{2}(\nabla\cdot w_d)\,u,\, z\Big)_{\Omega_t}.$$
the time derivative term arises from the identity
$$\int_{\Omega_t} \frac{1}{J_A}\frac{D^A(J_A u)}{D t}\cdot z\,\mathrm{d}x = \frac{d}{dt}(u, z)_{\Omega_t}, \qquad (12)$$
which follows from (see [8])
$$\frac{D^A J_A}{D t} = J_A\,(\operatorname{div} w_d). \qquad (13)$$
in what follows, an equivalent integral form of equation (13) shall be used in the form
$$\frac{d}{dt}\Big(\int_{V_t}\mathrm{d}x\Big) = \int_{\partial V_t} w_d\cdot n\,\mathrm{d}s \qquad (14)$$
for any volume V_t whose motion is fully determined by the ale mapping, i.e. $V_t = \mathcal{A}_t(V^{ref}_0)$.

3.2. time discretization

for the purpose of the time discretization, an equidistant partition t_j = j∆t of the time interval I with a constant time step ∆t > 0 is considered, and we denote the approximations of the velocity and the pressure by u^j ≈ u(·, t_j) and p^j ≈ p(·, t_j) for j = 0, 1, …, with u^j ∈ V_{t_j} and p^j ∈ Q_{t_j}. similarly, J^A_j and w^j_d denote the jacobian of the ale mapping and the domain velocity at the time instant t_j. in what follows, we shall describe the discretization at a fixed time instant t_{n+1}. the time derivative on the right hand side of equation (12) is approximated at t = t_{n+1} formally by the second order backward difference formula
$$\frac{1}{J_A}\frac{D^A(J_A u)}{D t}\Big|_{t_{n+1}} \approx \frac{1}{J^A_{n+1}}\,\frac{3 J^A_{n+1} u^{n+1} - 4 J^A_n u^n + J^A_{n-1} u^{n-1}}{2\Delta t}, \qquad (15)$$
or, more precisely, the time derivative term from (11) is approximated by
$$\frac{d}{dt}\Big(\int_{\Omega_t} \rho u\cdot z\,\mathrm{d}x\Big)\Big|_{t_{n+1}} \approx m(U, V) - l_M(V) \qquad (16)$$
for U = (u, p) := (u^{n+1}, p^{n+1}), V = (z, q), with
$$m(U,V) = \frac{3}{2\Delta t}\big(u^{n+1}, z\big)_{\Omega_{t_{n+1}}}, \qquad (17)$$
$$l_M(V) = \frac{2}{\Delta t}\big(u^n, z\big)_{\Omega_{t_n}} - \frac{1}{2\Delta t}\big(u^{n-1}, z\big)_{\Omega_{t_{n-1}}}, \qquad (18)$$
where the test function z ∈ $\mathcal{X}$ is a time dependent function defined by (9) with a given steady reference function. thus, the time discretized weak formulation reads: find U = (u, p) := (u^{n+1}, p^{n+1}) such that u satisfies the boundary condition (2a) and
$$a(u; U, V) = L(V) \qquad (19)$$
holds for any V = (z, q) ∈ $\mathcal{X}$ × Q_{t_{n+1}}, where
$$a(u; U, V) = m(U,V) + c(u; U,V) + b(u; U,V) + B(U,V) + B^*(U,V) + D(U,V), \qquad (20)$$
and L(V) = l_M(V) + L_Γ(V).

3.3. finite element approximations

the spaces V_{t_{n+1}} and $\mathcal{X}$ are approximated using the finite element subspaces V_h and $\mathcal{X}_h$ constructed over an admissible triangulation $\mathcal{T}_\Delta$ of the domain ω, respectively. similarly, the pressure space Q_{t_{n+1}} is approximated by its finite element subspace Q_h constructed again over $\mathcal{T}_\Delta$.
here, the taylor-hood finite elements are used, i.e. the spaces of continuous piecewise quadratic velocities and continuous piecewise linear pressures: the velocities are sought in
$$V_h = \{\varphi \in C(\overline{\Omega}) : \varphi|_K \in P^2(K)\ \forall K \in \mathcal{T}_\Delta\} \cap V,$$
the space of the test functions is given by
$$\mathcal{X}_h = \{\varphi \in C(\overline{\Omega}) : \varphi|_K \in P^2(K)\ \forall K \in \mathcal{T}_\Delta\} \cap \mathcal{X},$$
and the trial/test pressure space is given as
$$Q_h = \{\varphi \in C(\overline{\Omega}) : \varphi|_K \in P^1(K)\ \forall K \in \mathcal{T}_\Delta\}.$$
the finite element approximations of u and p are then sought in the finite element spaces W_h = $\mathcal{X}_h$ × Q_h constructed over an admissible triangulation of the computational domain ω^f_t: find an approximate solution U_h = (u, p) ∈ W_h such that eq. (19) holds for any test function V_h = (z, q) ∈ $\mathcal{X}_h$ × Q_h. furthermore, this formulation is stabilized using the supg/pspg stabilization terms together with the div-div stabilization terms, given as
$$S(u; U, V) = \sum_{K\in\mathcal{T}_\Delta} \delta_K\Big(\rho\frac{3u}{2\Delta t} - \mu\Delta u + \rho(w\cdot\nabla)u + \nabla p,\ \phi(u; V)\Big)_K,$$
$$F(u; V) = \sum_{K\in\mathcal{T}_\Delta} \delta_K\Big(\rho\frac{4\tilde{u}^n - \tilde{u}^{n-1}}{2\Delta t},\ \phi(u; V)\Big)_K,$$
$$P(U, V) = \sum_{K\in\mathcal{T}_\Delta} \tau_K\big(\nabla\cdot u,\ \nabla\cdot z\big)_K,$$
where the modified test function is given by φ(u; V) = ((u − w_d)·∇)z + ∇q, and δ_K, τ_K are suitably chosen stabilization parameters. the stabilized discrete formulation then reads: find U = (u, p) ∈ W_h such that
$$a_S(u; U, V) = L_S(u; V) \qquad (21)$$
holds for any test function V = (z, q) ∈ $\mathcal{X}_h$ × Q_h, where a_S(u; U, V) = a(u; U, V) + P(U, V) + S(u; U, V), and L_S(u; V) = L(V) + F(u; V).

3.4. linearization

in order to solve the nonlinear problem (21), a sequence of linearized problems is solved until it converges to a sufficient precision. we start with an approximation U⁰, and for k = 0, 1, … solve the linearized problems: find U =: U^{k+1} such that
$$a_S(u^k; U, V) = L_S(u^k; V) \qquad (22)$$
holds for any V = (z, q) ∈ $\mathcal{X}_h$ × Q_h. using the finite element base functions, the discretization leads to a system of linear equations in the form of a saddle point problem, i.e.
$$\begin{pmatrix} A & B^* \\ B^T & Q \end{pmatrix}\begin{pmatrix}\alpha_u \\ \alpha_p\end{pmatrix} = \begin{pmatrix} f \\ g \end{pmatrix}, \qquad (23)$$
where the matrix A = M + D + C + A_b + A_s consists of the mass matrix M, the diffusion matrix D, the convection matrix C, the boundary terms matrix A_b and the stabilization matrix part A_s, arising from the forms m, D, c, b and S, respectively. the matrix B* is the discrete gradient operator; it corresponds to the form B* including the stabilization terms. B^T corresponds to the discrete divergence operator realized by the form B plus stabilization. the matrix Q follows fully from the stabilization S; in particular, for the case with no stabilization, Q = 0. the solution of such a problem is difficult, see e.g. [9]. this is caused by the presence of the continuity equation, which can be treated at the continuous level by the approach based on the projection method. on the other hand, this can be understood as an approximation of the solution of the system (23), or, more precisely, as a preconditioned iterative solution of this system.

4. solution of the discrete problem

4.1. projection method

the projection method is based on the helmholtz-hodge decomposition of any vector field v, i.e. v = v_div + ∇ψ, where v_div is the divergence free field, see [10]. in this section, for the sake of simplicity, we shall restrict ourselves to the case of a fixed computational domain ω and to the case of a first order discretization in time.
this means that here we shall discuss the solution of the navier-stokes equations in the form
$$\rho\frac{\partial u}{\partial t} + \rho(u\cdot\nabla)u + \nabla p - \mu\Delta u = 0, \qquad \nabla\cdot u = 0,$$
where the time derivative term is replaced by the backward difference formula
$$\rho\frac{\partial u}{\partial t} \approx \rho\,\frac{u^{n+1} - u^n}{\Delta t}.$$
the problem can then be solved using the segregated approach, see [11]. this can be formally written as:
i. solve the momentum equations for ũ,
$$\frac{\tilde{u} - u^n}{\Delta t} + (\tilde{u}\cdot\nabla)\tilde{u} - \nu\Delta\tilde{u} = f, \qquad (24)$$
ii. project ũ onto the space of (discrete) divergence free functions, i.e. solve the pressure equation
$$\nabla\cdot(\nabla p) = \nabla\cdot\Big(\frac{\tilde{u}}{\Delta t}\Big), \qquad (25)$$
iii. update the velocity field by adding the pressure gradient:
$$\frac{u^{n+1} - \tilde{u}}{\Delta t} + \nabla p = 0. \qquad (26)$$
let us mention that the steps i. and ii. need to be equipped with suitable boundary conditions. due to the splitting of the coupled equations, the boundary conditions as presented in (2)-(3) require a modification. particularly, the boundary conditions (3) contain the pressure, which is not available in the step i, but can be used from the previous time step. the pressure equation needs to be equipped with a neumann type boundary condition, where a compatibility condition needs to be satisfied for the existence of a solution, which is unique up to a constant. nevertheless, due to the splitting of these two steps, a splitting error arises; for an overview see e.g. [12]. in the considered problem, the main difficulty is in the realization of the non-standard boundary conditions (2)-(3). to this end, we shall consider another possibility, based on a preconditioned solution of the discrete problem (23).

4.2. discrete projection methods and preconditioning

let us consider the system of linear equations written in the form $\mathbb{M}U = b$, where the matrix $\mathbb{M}$ is given as
$$\mathbb{M} = \begin{pmatrix} A & B \\ B^T & 0 \end{pmatrix}, \qquad b = \begin{pmatrix} f \\ g \end{pmatrix}. \qquad (27)$$
this is the well studied case, where the matrix A is (possibly) non-symmetric positive definite and the matrix B has full rank due to the satisfied babuška-brezzi inf-sup condition. the matrix $\mathbb{M}$ can be factorized as
$$\mathbb{M} = \begin{pmatrix} I & 0 \\ B^T A^{-1} & I \end{pmatrix}\begin{pmatrix} A & 0 \\ 0 & S \end{pmatrix}\begin{pmatrix} I & A^{-1}B \\ 0 & I \end{pmatrix}, \qquad (28)$$
where S is the pressure schur complement, given as
$$S = -B^T A^{-1} B. \qquad (29)$$
the application of the inverse of the matrix $\mathbb{M}$ given by (28) requires the ability to compute the inverse of the matrix A and the inverse of the schur complement matrix S. in order to solve the problem (27), three sub-steps can be followed, similarly as in section 4.1:
i. solve the momentum equations for the intermediate velocity field and update the right hand side for the pressure by subtracting the "divergence" of the intermediate velocity field, i.e. ũ := A⁻¹f, g̃ := g − B^T ũ.
ii. solve the pressure equation S p = g̃.
iii. update the velocity field by adding a pressure gradient component to the momentum equations: u = ũ − A⁻¹(B p).
in this process, the solution of the systems with the matrix A and with the matrix S needs to be realized. here, the matrix A is a mass matrix perturbed with convection-diffusion and the stabilization terms, and the solution of the system with the matrix A can be realized effectively. however, the solution of the system with the schur complement matrix S must be realized iteratively. this is why the above approach is used only as a preconditioner for the gmres method, where the matrix S is replaced by a suitable approximation, see e.g. [13].

figure 4. detail of the mesh for the almost closed glottal part.

in the considered discretization, the matrix of the system is given by
$$\mathbb{M} = \begin{pmatrix} A & B^T \\ -B & D \end{pmatrix}, \qquad b = \begin{pmatrix} f \\ g \end{pmatrix}, \qquad (30)$$
where the matrix A is (possibly) non-symmetric positive definite, the matrix B has full rank and the matrix D is positive semidefinite.
in this case, the matrix $\mathbb{M}$ can be factorized as
$$\mathbb{M} = \begin{pmatrix} I & 0 \\ -B A^{-1} & I \end{pmatrix}\begin{pmatrix} A & 0 \\ 0 & S_1 \end{pmatrix}\begin{pmatrix} I & A^{-1}B^T \\ 0 & I \end{pmatrix}, \qquad (31)$$
where S₁ is the pressure schur complement, given as S₁ = S + B A⁻¹ B^T.

5. numerical results

first, the oseen problem is considered,
$$-\nu\Delta u + b\cdot\nabla u + \nabla p + \alpha u = f, \qquad \nabla\cdot u = 0,$$
in the computational domain ω = (0, 1)². the problem is equipped with the dirichlet boundary condition u = b prescribed at ∂ω. here, α = 0 is used, b = (sin(πx), −πy cos(πx)), and the right hand side f is chosen in such a way that u = (sin(πx), −πy cos(πx)) is the solution of the oseen problem. the computations were performed for different values of the viscosity coefficient ν. first, the convergence of the galerkin finite element approximations to the exact solution u = b, p(x, y) = sin(πx) cos(πy), is investigated for ν = 0.05 (here, a relatively high viscosity was chosen in order to obtain stable galerkin approximations even on coarser meshes). for the approximation of the flow problem, the taylor-hood finite elements were used. the errors in the h1 norm are shown in table 1, where h_max denotes the length of the maximal edge in the triangulation. these results are compared to the results of the stabilized formulation of the same problem, which shows that the used residual based stabilization does not pollute the solution, see table 2. the convergence orders in both cases agree well with the theoretical estimate. for the stabilized method, such convergence rates are well preserved for the values ν = 10⁻³, …, 10⁻⁶, with a slow-down observed only for coarse grid configurations.

hmax       h1(u)        h1(v)        h1(p)
0.333174   0.148971     0.278814     0.824015
0.166358   0.0294769    0.0389521    0.306831
0.0881204  0.00751673   0.00970303   0.155735
0.0449673  0.00178417   0.00225841   0.0781441
0.0230627  0.000444434  0.00055994   0.038375
0.0118955  0.000107972  0.000139096  0.0192135
order      2.14         2.17         1.07
table 1. convergence of the galerkin fe method to the solution of the oseen problem, with the convergence order estimate.

hmax       h1(u)        h1(v)        h1(p)
0.333174   0.148971     0.278814     0.824015
0.158053   0.0385103    0.0479786    0.332202
0.0877183  0.00912475   0.0113478    0.162359
0.0451748  0.00239609   0.00279304   0.0821471
0.0235271  0.000593443  0.000695859  0.0402822
0.0119951  0.000148245  0.000174687  0.0201917
order      2.06         2.05         1.03
table 2. convergence of the stabilized fe method to the solution of the oseen problem, with the convergence order estimate.

hmax       h1(u)        h1(v)        h1(p)
0.333174   0.0895357    0.149882     0.764274
0.158053   0.0211092    0.033792     0.322219
0.0877183  0.00602738   0.00931535   0.160369
0.0451748  0.00158538   0.00240926   0.0810965
0.0235271  0.000394708  0.000603108  0.0397932
order      1.94         2            1.16
table 3. convergence of the stabilized fe method to the solution of the navier-stokes problem, with the convergence order estimate.

figure 5. the aeroelastic response in terms of w1 (in mm, top) and w2 (in mm, bottom) of the structure for the flow velocity u∞ = 0.65 m/s; phonation onset.
figure 6.
the aeroelastic response in terms of w1 (in mm, top) and w2 (in mm, bottom) of the structure for the flow velocity u∞ = 0.65 m/s; phonation phase with almost periodical oscillations.
figure 7. the aeroelastic response of the structure in terms of w1 (in mm, top) and w2 (in mm, bottom) for the flow velocity u∞ = 0.70 m/s; phonation onset.
figure 8.
the bell system technical journal 51:1233–1268, 1972. [3] j. horáček, p. šidlof, j. g. švec. numerical simulation of self-oscillations of human vocal folds with hertz model of impact forces. journal of fluids and structures 20(6):853–869, 2005. [4] m. feistauer. mathematical methods in fluid dynamics. longman scientific & technical, harlow, 1993. [5] t. nomura, t. j. r. hughes. an arbitrary lagrangianeulerian finite element method for interaction of fluid and a rigid body. computer methods in applied mechanics and engineering 95:115–138, 1992. [6] e. burman, m. a. fernández, s. frei. a nitsche-based formulation for fluid-structure interactions with contact 2018. arxiv:1808.08758. [7] m. feistauer, j. horáček, p. sváček. numerical simulation of vibrations of an airfoil induced by turbulent flow. communications in computational physics 17(1):146–188, 2015. [8] m. feistauer, m. křížek, v. sobotíková. an analysis of finite element variational crimes for a nonlinear elliptic problem of a nonmonotone type. east-west j numer math 1:267–285, 1993. [9] h. elman, d. silvester. fast nonsymmetric iterations and preconditioning for navier-stokes equations. tech. rep. no. 263, manchester, england, 1995. [10] a. chorin. numerical solution of the navier-stokes equations. math comput 22:745–762, 1968. [11] a. j. chorin. a numerical method for solving incompressible viscous flow problems. journal of computational physics 2(1):12–26, 1967. [12] a. quarteroni, a. valli. numerical approximation of partial differential equations. springer, berlin, 1994. [13] a. wathen, d. silvester. fast iterative solution of stabilised stokes systems, part ii: using general block preconditioners. siam journal on numerical analysis 31:1352–1367, 1994. [14] p. sváček, j. horáček. numerical simulation of glottal flow in interaction with self oscillating vocal folds: comparison of finite element approximation with a simplified model. communications in computational physics 12(3):789–806, 2012. doi:10.4208/cicp.011010.280611s. 162 http://dx.doi.org/10.4208/cicp.011010.280611s acta polytechnica 61(si):155–162, 2021 1 introduction 2 mathematical model 2.1 flow problem 2.2 structure model 2.3 coupling conditions 2.4 treatment of the contact problem 2.5 mesh deformation 3 numerical approximation 3.1 weak formulation 3.2 time discretization 3.3 finite element approximations 3.4 linearization 4 solution of the discrete problem 4.1 projection method 4.2 discrete projection methods and preconditioning 5 numerical results 6 conclusion acknowledgements references acta polytechnica doi:10.14311/ap.2018.58.0297 acta polytechnica 58(5):297–307, 2018 © czech technical university in prague, 2018 available online at http://ojs.cvut.cz/ojs/index.php/ap new approach for online arabic manuscript recongnition by deep belief network benbakreti samir∗, aoued boukelif university of djillali liabes, department of electronic, sidi bel abbes 22000, algeria ∗ corresponding author: benbakretisamir@gmail.com abstract. in this paper, we present a neural approach for an unconstrained arabic manuscript recognition using the online writing signal rather than images. first, we build the database which contains 2800 characters and 4800 words collected from 20 different handwritings. thereafter, we will perform the pretreatment, feature extraction and classification phases, respectively. the use of a classical neural network methods has been beneficial for the character recognition, but revealed some limitations for the recognition rate of arabic words. 
To remedy this, we used deep learning through the Deep Belief Network (DBN), which resulted in a 97.08 % recognition success rate for Arabic words.

Keywords: Arabic manuscript; online recognition; neural networks; MLP; TDNN; RBF; deep learning; DBN.

1. Introduction

Nowadays, the cursive writing recognition problem represents a major challenge for both handwritten and typed forms. The complexity of this problem comes from the style, the variability of the patterns and the tilt in the case of unconstrained handwriting; this is even worse for some languages that present more complicated morphological characteristics. Arabic handwriting, which is cursive by nature, presents a big variability and is consequently very difficult to recognize. The recent electronic revolution has allowed the emergence of new typing devices capable of creating high-quality online documents, such as exam papers, notes, filled forms, online reports, etc. This progress extends the application field of online handwriting recognition, which was previously restricted to terminals of small size (PDA, smartphone) that support character recognition only. The online documents represent a new source of information that has few applications in the field of pattern recognition. They are represented by signals which may be assimilated to sampled trajectories captured from the handwriting instrument as a temporal sequence of points (x(t), y(t)), represented in an orthonormal coordinate system.

The Arabic script is semi-cursive in both forms, printed and handwritten. A word may be composed of one or several pseudo-words. These pseudo-words are ligated horizontally or vertically, which makes the segmentation task very complicated. The shape of a character differs according to its position in the word. Some characters include diacritical points, which lie above or below. Others have the same body or shape, and only the diacritical point makes it possible to differentiate between two characters. Also, in handwritten Arabic text, variations in the horizontal bands appear, according to the calligraphy of the characters contained in the pseudo-word [1]. In the case of the manuscript, this complexity is more important, because other problems may arise, such as: the intra- and inter-writer variability, the overlap of pseudo-words, the fusion of diacritical points, the writing conditions or the writer's state.

Figure 1. Online Arabic handwriting: (a) the "bah" character in the middle of a word; (b) the "tah" character at the end of a word; (c) the word "wahran" composed of 4 pseudo-words.

In this paper, the use of the online aspect decreases some processing difficulties, because the diacritical points are considered as a signal which complements the character body; hence, the fusion of diacritical points is not a real problem. However, the segmentation into pseudo-words or characters is not performed, because the word is taken in its entirety; therefore, the overlaps are ignored. Nevertheless, these features increase the intra- and inter-writer variability. Among the approaches presented in the literature for handwriting recognition, neural networks demonstrate a great discriminating power and a very good capacity to construct class separators in multidimensional spaces. That being said, models based on hidden Markov models (HMM) [2] use a parametric approach to model sequences of observations.
These sequences are generated by stochastic processes (e.g., manuscript handwriting), which are more visible in the word recognition process. The HMM has a strong expressive power to model the distribution of observations for any class. For the character case, neural networks are much better adapted, because it is the global shape that dominates. These networks have the advantage of being compatible with the two-dimensional pictorial nature of the writing.

1.1. Related work

Handwriting recognition is a complicated task, due to the variability of handwriting styles. Indeed, the shape of a handwritten character differs not only from one writer to another, but also for a single writer, according to: its position in the word, the neighbouring letters, the writer's health (motor disabilities, Parkinson's disease, ...) and the psychological status (depression, sadness, ...). In addition, it is also possible to have the same trace for two different meanings depending on the context, which represents another source of ambiguity. The aforementioned properties make handwriting recognition a challenging and complicated pattern recognition problem.

Online handwriting recognition is not a recent subject; it goes back around 30 years. However, the emergence of more sophisticated devices, like smartphones and tablets, has stimulated more research works in this field. The online recognition problem is characterized by three fundamental properties: the temporal chaining of the line, the dynamics of the trace (speed, acceleration, pressure on the pen) and the trace skeleton (ignoring the thickness of the line). In [3], Liu et al. presented the state of the art of online Chinese handwriting recognition. In [4], Kessentini et al. developed a unified approach for Arabic and Latin alphabetic script recognition. However, few works were interested in online Arabic writing recognition. In fact, its cursive nature triggers numerous problems in an automatic recognition system. These problems are linked to the diversity and the complexity of the handwriting signal. The unavailability of online character databases also represents another problem for this field of research. Consequently, most of the works focused on the offline aspect, such as [5], where Margner et al. present a development strategy for offline Arabic manuscript recognition systems. More recent works have been presented in [6], which proposes a deep architecture for Arabic manuscript recognition. Nevertheless, several works treat the online case, like Mezghani et al., who used a Bayesian classification [7] and a Kohonen network [8] for online Arabic character recognition. In [9], Biadsy used the hidden Markov model for Arabic manuscript recognition. More recently, S. Benbakreti and A. Boukelif proposed a TDNN ("time delay neural network") based recognition system for online isolated Arabic characters [10].

In this work, we present the design of an Arabic handwriting recognition system; we propose a classification approach based on neural networks. Figure 2 illustrates the architecture of the proposed system.

Figure 2. Architecture of the recognition system: acquisition and storage interface → pretreatment → feature extraction (an optional step, according to the NN choice) → classifier (NN) → character or word identified.
2. Processing and features extraction

The preprocessing stage of the recognition system described above is used for the character size normalization and for sampling the trace into a fixed number of equidistant points. The next stage extracts features from the previously sampled trace, which prepares the input for the neural network classifier. After the training phase, the class of the character is supplied at the output layer. This procedure is respected by almost all the neural networks. However, deep neural networks do not require any feature extraction phase; the task is done implicitly in the network.

The task of character recognition is difficult, because of the big variability inherent to the handwriting style and the variation of the character position in the word (see Figure 3). For most of the Arabic letters, similarities are noticed between the forms at the beginning/middle on the one hand and at the end/isolated on the other. The presence of a ligature with a previous/next letter does not significantly modify the form of the letter (no more than in the case of Latin cursive handwriting). Arabic ligatures occur where two letters are written one upon the other. Furthermore, Arabic writing is rich in diacritic signs (or secondary signs) such as vowels, points (dots), chaddah, maddah, hamzah, etc. In our work, we define a diacritical mark as a secondary component of a letter, which may complete it or even modify the whole sense of the word; vowel signs, such as el damah, el kasrah, el fathah and el soukoun, are not considered.

Figure 3. Different forms of the Arabic character according to its position in the word.

Figure 4. Result of preprocessing: (a) character noun before preprocessing; (b) character noun after preprocessing.

These specificities of the Arabic language make the recognition task more complicated, hence the necessity of a preprocessing including the following steps (a minimal code sketch of these two steps is given below, after the list):

• A spatial sampling that preserves the useful information of the signal and excludes the redundant information resulting from the repetition of points. Indeed, the duration of the character formation as well as its shape can vary significantly from one writer to another and from one occurrence to another. The spatial sampling transforms a handwriting signal into a sequence of equally spaced points with coordinates [x(n), y(n)] (n is the point index) along the length of the trace, for a fixed number of points. The procedure is:
  - definition of the line between pen-up and pen-down; N: number of sampling points; reservation of the memory space for the vectors x, y, pen-up/down;
  - calculation of the total length of the signal: for a single stroke, length = length + distance between points; for several strokes, length = length + length between strokes;
  - calculation of the sampling distance dist_samp = length/(N − 1) and determination of N − 1 points by interpolation.

• Character normalization and centring: the character is recentred to (x0, y0) and then normalized to the maximal size of the character, to obtain a representation invariant with respect to translations and spatial distortions.

A result sample of these two preprocessing steps is illustrated in Figure 4.
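The following is a minimal sketch of the two preprocessing steps above (spatial resampling to N equidistant points and size normalization/centring), assuming a single-stroke trace stored as numpy arrays. The function names and the choice of N = 64 are illustrative assumptions, not values taken from the paper.

    import numpy as np

    def resample(x, y, n=64):
        """Resample a pen trace to n points equally spaced along its arc length."""
        d = np.hypot(np.diff(x), np.diff(y))       # segment lengths
        s = np.concatenate(([0.0], np.cumsum(d)))  # arc length at each point
        t = np.linspace(0.0, s[-1], n)             # n equidistant arc positions
        return np.interp(t, s, x), np.interp(t, s, y)

    def normalize(x, y):
        """Recentre to the bounding-box centre (x0, y0), scale by the larger extent."""
        x0 = (x.min() + x.max()) / 2.0
        y0 = (y.min() + y.max()) / 2.0
        delta = max(x.max() - x.min(), y.max() - y.min())
        return (x - x0) / delta, (y - y0) / delta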
Once the preprocessing step is finished, in order to facilitate the classifier task, geometric information, such as the direction of the movement and the curvature of the trajectory, is extracted from this sequence of points. We then obtain a vector sequence of 7 characteristics:
• the x coordinates;
• the y coordinates;
• the direction cosines of the movement direction (cos θ and sin θ);
• the direction cosines of the trajectory curvature (cos φ and sin φ);
• the pen up/down flag.

This step is summarized in Algorithm 1.

Algorithm 1 (feature extraction)
  xmax ← x(0); xmin ← x(0); ymax ← y(0); ymin ← y(0)
  {Calculation of the centre of the character [x0, y0]}
  for n ← 0 to N − 1 do
    if x(n) > xmax then xmax ← x(n)
    if x(n) < xmin then xmin ← x(n)
    if y(n) > ymax then ymax ← y(n)
    if y(n) < ymin then ymin ← y(n)
  end for
  x0 ← (xmin + xmax)/2; y0 ← (ymax + ymin)/2
  {Calculation of the delta scaling factor}
  deltax ← xmax − xmin; deltay ← ymax − ymin
  delta ← max(deltax, deltay)
  {New coordinates X[n][1], X[n][2] and pen flag X[n][7], scanning all points}
  X[n][1] ← (x[n] − x0)/delta
  X[n][2] ← (y[n] − y0)/delta
  X[n][7] ← penupdown[n]
  {Direction X[i][3], X[i][4]: direction cosines, scanning all points; ds is the chord length}
  dx ← x[i+1] − x[i−1]; dy ← y[i+1] − y[i−1]; ds ← sqrt(dx·dx + dy·dy)
  if ds = 0 then X[i][3] ← 0; X[i][4] ← 0
  else X[i][3] ← dx/ds; X[i][4] ← dy/ds
  {Special points: the initial point uses its successor, the last point its predecessor}
  {Curvature X[i][5], X[i][6]: cosine and sine of the turning angle}
  X[i][5] ← X[i+1][3]·X[i−1][3] + X[i+1][4]·X[i−1][4]
  X[i][6] ← X[i+1][4]·X[i−1][3] − X[i+1][3]·X[i−1][4]
  {Special points: the initial point uses its successor, the last point its predecessor}

Once the feature extraction is realized, we can proceed to the classification.

3. Neural network classification

We have used a set of neural classifiers, which gave satisfactory results; the proposed neural architectures are as follows.

3.1. Multi-layer perceptron (MLP)

We have used the MLP with error back-propagation that has one hidden layer only [11, 12]. The seven features extracted from the Arabic characters (characters in their isolated forms, at the beginning, middle and end), generated by the previous module, constitute the inputs of the network. The hidden layer includes 80 neurons. The classes to be discriminated are the 28 characters of the Arabic alphabet (or 48 words), hence the choice of 28 (48) neurons for the output layer. The unipolar sigmoid was used as the neuron activation function. Figure 5 shows how the MLP processes the letter mim.

Figure 5. MLP type connexion — complete field of view.

However, the training of the lowest layers (those closest to the input layer) is less efficient in a deep MLP [13, 14]; it seems that the update of the parameters becomes less and less relevant as we propagate into the lowest layers. Because of that, we use one hidden layer for our MLP. (A toy sketch of such a classifier follows.)
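Purely as an illustration: a single-hidden-layer MLP of the kind described (80 logistic hidden units, 28 output classes) can be sketched with scikit-learn as below. The input dimensionality (7 features × 64 resampled points, flattened) is carried over from the earlier sketch and is our assumption, not a value fixed by the paper.

    from sklearn.neural_network import MLPClassifier

    # X: array of shape (n_samples, 7 * 64), flattened feature sequences.
    # y: integer class labels 0..27 (or 0..47 for the word task).
    clf = MLPClassifier(hidden_layer_sizes=(80,),
                        activation='logistic',  # unipolar sigmoid
                        max_iter=500)
    # clf.fit(X_train, y_train)
    # predictions = clf.predict(X_test)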
3.2. Time delay neural network (TDNN)

The used TDNN consists of three layers: input, hidden and output. Furthermore, every layer possesses two directions: a feature direction and a temporal direction. The TDNN is distinguished from a multilayer perceptron by the fact that it takes a notion of time into account. Consequently, it processes all the neurons of the input layer at the same time (see Figure 6), before making a temporal scan; this is the notion of the temporal window. The TDNN has three basic principles: the temporal window, shared weights and delays:

• The temporal window: the basic idea is that every neuron of the l + 1 layer is only connected to one subset of neurons from the l layer. After an experimental study, we have fixed the size of this window to seven neurons.

• The shared weights: this notion reduces the number of network parameters, which positively influences the generalization capacity of the network. A window for a given feature has the same weights along the temporal direction. It also allows progressively extracting the differences when scanning the signal.

• The delays: in addition to the previous two constraints, we introduce delays between two successive windows for a given layer. The first delay is three neurons and the second one is four.

For the shared-weights constraint, the same neuron is duplicated in the time direction (the same duplicated weight matrix) to detect the presence or the absence of the same feature at various points in the signal. By using several neurons in every temporal position, the neural network detects the different features: the outputs of the different neurons of one layer produce a new feature vector for the next layer, and so on. Thus, the temporal component of the original signal is progressively eliminated and transformed into features by the upper layers. To compensate for this loss of information, we increase the number of neurons in the feature direction. (In modern terms, this is a 1-D convolution over the time axis; a sketch follows.)

Figure 6. Convolution network connexion — restricted field of view.
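The temporal window, shared weights and delays described above correspond to a strided 1-D convolution. Purely as an illustration (the channel sizes and the PyTorch framing are our assumptions, not the paper's implementation), a TDNN with a window of seven frames and delays of three and four could be sketched as:

    import torch
    import torch.nn as nn

    class TDNN(nn.Module):
        def __init__(self, n_features=7, n_classes=28):
            super().__init__()
            # Window of 7 time steps, shared weights, delay (stride) of 3, then 4.
            self.layer1 = nn.Conv1d(n_features, 16, kernel_size=7, stride=3)
            self.layer2 = nn.Conv1d(16, 32, kernel_size=7, stride=4)
            self.out = nn.Linear(32, n_classes)

        def forward(self, x):                   # x: (batch, 7, T)
            h = torch.sigmoid(self.layer1(x))
            h = torch.sigmoid(self.layer2(h))
            return self.out(h.mean(dim=2))      # pool over the remaining time steps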
Figure 7. The RBF network architecture: input layer → centres → output layer.

3.3. Radial basis function network (RBF)

The RBF uses radial functions rather than sigmoids (as in the MLP) to build a local decision function centred on a subset of the inputs [15]. The summation of all local functions represents the global decision function [16], which helps with the local minima problem. The used RBF network [17, 18] consists of 3 layers (Figure 7). Every hidden node applies a kernel function to the input data, then the output layer performs a weighted sum of these kernel functions. Every node is characterized by two parameters, the width and the centre of the radial function. If the input data of a node are close to the centre (estimated using the Euclidian distance), the output value will be high. In this work, we have used a Gaussian kernel function. The complete configuration of the network is achieved after determining the centre and the width associated with each node, as well as the weights of the connections between the hidden layer and the output layer, which contains the 28 Arabic characters or the 48 words representing the cities of Algeria. We have experimented with several variants of RBF networks; a brief description of each of them is given in the following paragraphs.

3.3.1. Radial basis function networks exact (RBFNE)

The design of an RBFNE can be done using a function that takes as input the matrix of input vectors P, the target vectors T and the spread factor of the radial basis layer; it then returns a network with weight and bias values such that the output is exactly T when the input is P. The function creates as many radial basis neurons as there are input vectors in P, so we have a radial basis neuron layer in which every neuron acts as a detector for a different input vector. The only parameter that the RBFNE has is the spread of the radial basis functions of the first layer. In this work, the value of the spread factor is 0.3.

Figure 8. The GRNN network architecture: input layer → radial basis layer → special linear layer → output layer.

3.3.2. Radial basis function (RBF)

The second type iteratively creates one radial basis neuron at a time. In every iteration, the input vector which reduces the network error the most is used to create a radial basis neuron. The error of the new network is then checked; if it is sufficiently small, the network creation is ended, otherwise a further neuron is added. This procedure is repeated until the targeted mean squared error (10^-6) is reached, or the maximum number of neurons (300) is reached. The number of neurons added between every evaluation was fixed, after several experiments, to 25.

3.3.3. Generalized regression neural networks (GRNN)

This variant is another type of NN, proposed by Specht [19]. This regression-based network estimates the expected mean value of the output variable using a Bayesian technique; any continuous function can be approximated by a linear combination of Gaussian functions. The objective is to make a regression, i.e., to build a good approximation of a function which is known only at a limited number of experimental points; this network can also be used to solve the classification problem. For each input, the first layer calculates the distances between the input vector and the weight vectors, and then produces a vector, which is multiplied by the bias. The GRNN diagram is described in Figure 8. The other GRNN-specific layer is called a special linear layer and consists of two subparts: the numerator part performs a summation of the products of the training output data and the activation function, while the denominator part is the summation of all activation functions. This layer feeds both the numerator and the denominator to the next output layer.

3.3.4. Probabilistic neural networks (PNN)

The probabilistic neural network, introduced by Donald Specht in 1988, is a feedforward network with three layers used for the classification of data [20]. Contrary to the other neural networks, which are based on the backpropagation method, the PNN is based on the Bayes decision strategy and on probability density estimation. The PNN uses spherical Gaussian radial basis functions centred at each training vector. The membership probability of a vector in a given class is expressed as follows [21, 22]:

  f_i(x) = (1 / ((2π)^(p/2) σ^p m_i)) Σ_{j=1}^{m_i} exp(−(x − x_ij)^T (x − x_ij) / (2σ^2)),   (1)

where i indexes the classes (28 or 48 in our case), j indexes the forms to be recognized, m_i is the number of training vectors of the i-th class, x is a test vector, x_ij is the j-th training vector of the i-th class, p is the dimension of the vector x, σ is the standard deviation (smoothing parameter), and f_i(x) is the summation of the spherical multivariate Gaussian functions, centred on each training vector x_ij, used for the evaluation of the probability density function of the i-th class. The classification decisions are taken according to the Bayes decision rule [22]:

  d(x) = c_i if f_i(x) > f_k(x) for all k ≠ i,   (2)

where c_i is the i-th class. (A toy implementation of this density estimate and decision rule is sketched below.)
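A minimal numpy sketch of equations (1)-(2). The per-class grouping of the training vectors into a list of arrays, and the spread value, are our own illustrative choices, not the paper's data layout:

    import numpy as np

    def pnn_classify(x, class_vectors, sigma=0.3):
        """class_vectors: list with one (m_i, p) array per class.
        Returns the index of the class maximizing f_i(x), Eqs. (1)-(2)."""
        p = x.shape[0]
        scores = []
        for xi in class_vectors:                   # xi: (m_i, p)
            d2 = np.sum((xi - x) ** 2, axis=1)     # (x - x_ij)^T (x - x_ij)
            norm = (2 * np.pi) ** (p / 2) * sigma ** p * xi.shape[0]
            scores.append(np.exp(-d2 / (2 * sigma ** 2)).sum() / norm)
        return int(np.argmax(scores))              # Bayes rule, Eq. (2)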
In operation, when an input is presented:
• the first layer calculates the distances between the input vector and all the training vectors, and produces a vector whose elements indicate how close this input vector is to every training vector;
• the second layer adds these contributions for every class of inputs to produce a probability vector at the network output.
Finally, a transfer function at the output of the second layer selects the maximum of these probabilities, and assigns "1" to the corresponding class and "0" to the others.

3.4. Deep learning

The deep learning models are built in the same way as the previously described multilayer perceptron, except that the middle layers are more numerous. But the main difference is the learning method, which distinguishes them from a classic MLP and which has given them a renewed interest since 2006. In fact, the limits and the drawbacks [23] of the classic architectures mentioned so far in this paper contributed to this interest. For these reasons, Bengio et al. [11] proposed a greedy learning algorithm based on a stacking of autoassociators, which builds the hidden layers one after the other. In this paper, we have used the deep network model proposed by Hinton et al. [12] in 2006, which is based on stacked restricted Boltzmann machines (RBMs), to construct the so-called deep belief networks. The topology and the learning method used in this model are presented below.

Figure 9. RBM: v is the visible layer, h is the hidden layer and W is the connection weight.

3.4.1. Restricted Boltzmann machine (RBM) and contrastive divergence learning (CD)

An RBM is an undirected graph with two layers. The first layer is considered visible while the second is hidden. Each node represents a neuron: the neuron is active when its state is equal to 1 and inactive when it is 0. The visible layer is connected to the hidden layer by weighted edges (Figure 9). The visible layer represents the observed data and the hidden layer represents the unknown elements associated with the data; its size (fixed arbitrarily) allows the RBM to model more or less complex distributions. Let us suppose that a bias unit, always active, is present in the visible layer and in the hidden layer. W is the weight matrix, where w_ij represents the weight of the link between the units v_j and h_i. The RBM energy is given by:

  Energy(v, h) = −h^T W v.   (3)

Each unit corresponds to a hypothesis to which we assign a binary value (1 or 0). The connections represent constraints on these hypotheses: if w_ij is negative, i and j should not be activated at the same time to reduce the RBM energy; but if the weight w_ij is positive, then the simultaneous activation of both units reduces the energy. To summarize, the energy is a function of the configuration, which depends on the constraints related to the weights. This energy function associates to an RBM of weights W a probability on the space of (v, h) configurations:

  p_W(v, h) = e^(−Energy(v,h)) / Z,   (4)

where Z is the partition function, the sum over all possible joint configurations:

  Z = Σ_{v,h} e^(−Energy(v,h)).   (5)

The activation probabilities of the visible (or hidden) units are independent from each other. Let sgm(t) be the sigmoid function:

  sgm(t) = 1 / (1 + e^(−t)).   (6)

The conditional probabilities for an RBM are given by:

  ∀i ≤ r:  p_W(h_i = 1 | v) = sgm(Σ_j w_ij v_j),   (7)

  ∀j ≤ q:  p_W(v_j = 1 | h) = sgm(Σ_i w_ij h_i).   (8)

(A short numpy rendering of Eqs. (6)-(8) follows.)
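For concreteness, equations (6)-(8) in numpy form; W, v and h are illustrative placeholders for an RBM with r hidden and q visible units (bias terms omitted for brevity):

    import numpy as np

    def sgm(t):
        return 1.0 / (1.0 + np.exp(-t))

    def p_h_given_v(W, v):      # Eq. (7): W is (r, q), v is (q,)
        return sgm(W @ v)       # vector of p(h_i = 1 | v)

    def p_v_given_h(W, h):      # Eq. (8): h is (r,)
        return sgm(W.T @ h)     # vector of p(v_j = 1 | h)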
An RBM models the probability of an input v with the joint distribution p_W(v, h). We use the Gibbs sampling technique to make random draws from the model and generate configurations (v, h); these serve as reliable examples of inputs v. This technique uses the Markov chain Monte Carlo method, which consists of repeated random draws from p_W(h | v) and p_W(v | h), such that the Markov chain is guaranteed to converge to the p_W distribution. In summary, this distribution is modelled by the RBM independently of the inputs, and we can draw from it with the Gibbs sampling method.

For the training part, we define p_TR as the probability distribution corresponding to the random draw of a sample v from a training database TR, followed by the random draw of a hidden representation h associated with v. p_TR is the objective distribution of the training; it is fixed from a basis of samples. Having defined p_W and p_TR, the training step minimizes the Kullback-Leibler divergence between these two distributions, in such a way that the probability of drawing a sample v from the examples approaches the probability of generating it from the trained RBM. It turns out that the Kullback-Leibler divergence minimization has a very high computational cost, so we prefer to minimize an approximated criterion: the contrastive divergence (CD). This technique was developed by Geoffrey Hinton in 2001; the pioneering paper, written in 2002 [24], demonstrated an improvement of handwritten digit recognition results (on the MNIST dataset) by using a CD algorithm to train an RBM.

3.4.2. Deep belief network (DBN)

Restricted Boltzmann machines stacked in a generative way represent a particular type of deep neural network, which is the reason why we detailed the RBM structure previously. The topology of such a network is presented in Figure 10. The training of the stacked RBMs is made layer by layer, in a greedy way: a first RBM is trained on the dataset of samples to minimize the contrastive divergence (CD); then, each of the following RBMs is trained on the hidden representations of the previous RBM.

Figure 10. A deep network topology: stacked RBMs (v = input, then hidden layers h1, h2 trained as RBMs R1, R2).

The training process includes two phases:

(1.) Unsupervised pre-training: the DBN distinguishes itself from the other neural networks by its layer-by-layer way of learning; the idea is to train every layer as an RBM. The training begins from the lowest hidden layers (close to the visible input vector) and progresses towards the output vector, so a DBN learns to probabilistically reconstruct its inputs. The layers then act as feature detectors. The DBN training algorithm for each RBM layer is as follows:
Step 1: Build a multilayer network from the restricted Boltzmann machines (RBM), then train the layers by the greedy layer-wise algorithm. Naturally, the input vector is the visible layer v.
Step 2: Encode x with the RBM to produce a new sample v′, by using the Gibbs sampling method.
Step 3: Repeat the second step with v ← v′ for each two successive layers, until the two top layers of the network are reached.

(2.) Supervised training: a supervised algorithm, like the back-propagation algorithm of the MLP, is used to search for the optimal parameters of the DBN network. Nevertheless, we use the weights W and the hidden biases b to initialize this MLP. (A compact code sketch of one CD update and of the greedy stacking follows.)
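A compact sketch of one CD-1 update for a single binary RBM, and of the greedy stacking, is given below. The learning rate, layer sizes and the bias-free simplification are illustrative assumptions, not the configuration used in the experiments reported later.

    import numpy as np
    rng = np.random.default_rng(0)

    def sgm(t):
        return 1.0 / (1.0 + np.exp(-t))

    def cd1_step(W, v0, lr=0.1):
        """One contrastive-divergence (CD-1) update of the weight matrix W."""
        ph0 = sgm(W @ v0)                          # p(h=1|v0), Eq. (7)
        h0 = (rng.random(ph0.shape) < ph0) * 1.0   # sample hidden states
        v1 = sgm(W.T @ h0)                         # reconstruction, Eq. (8)
        ph1 = sgm(W @ v1)
        return W + lr * (np.outer(ph0, v0) - np.outer(ph1, v1))

    def train_stack(data, sizes, epochs=5):
        """Greedy stacking: each RBM is trained on the codes of the previous one."""
        weights, x = [], data                      # x: (n_samples, n_visible)
        for n_hidden in sizes:
            W = 0.01 * rng.standard_normal((n_hidden, x.shape[1]))
            for _ in range(epochs):
                for v in x:
                    W = cd1_step(W, v)
            weights.append(W)
            x = sgm(x @ W.T)                       # hidden codes feed the next RBM
        return weights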
In [13], several experiments were performed on different networks, trained for simple classification tasks (handwritten digits, geometrical forms). The conclusions of this reference paper are that training deep architectures was unsuccessful until the recent advent of algorithms based on an unsupervised pre-training. The concept of deep learning does not mean that the number of layers is high; it is rather the way of learning, a different (layer-by-layer) method, that has been applied in our work with the DBN network. We deduce from these works that, if the network contains enough neurons, the classification performances are better with an unsupervised training; even on classical (not deep) networks, the performances are improved. In our current work, we suggest the use of all the neural architectures described previously for the online Arabic handwriting recognition without constraints (letters and words). We also observe how the DBN manages to discriminate between the data without having any prior information and without performing the feature extraction. In this classifier, the complexity resides in the choice of the hyper-parameters of the CD algorithm; this heuristic choice requires many experiments to find the optimal architecture giving the best classification rates.

4. Description of the database

The objective of the architectures described previously is the classification and the recognition of Arabic characters/words coming from our database ("noundatabase"), built from an online acquisition by means of an acquisition tablet. This choice is motivated by the fact that most of the existing databases are offline, i.e., containing images; in this work, we wanted to handle the online writing signal rather than the images. The Arabic alphabet consists of 28 letters of variable shapes depending on their position in the word. Our database contains 2800 Arabic characters written by 20 different writers; each writer inserts the alphabet 5 times (once for every position in the word: in its isolated form, at the beginning, middle and end, and one more time at the writer's choice). The constructed database also includes 4800 words, inputted by the same 20 writers. The words represent the 48 Algerian provinces (wilayas); each writer performs five insertions of the 48 wilayas. The handwriting can be influenced by several parameters, such as the age, the sex and the state of the writer (each writer possesses their own writing style). To preserve the variability of the writing signals, we took care to make the data acquisition (the sample database) with 20 people of different sexes and age categories. The database is summarized in Table 1.

  Database   Number of classes   Writers   Samples (training/test)
  Letters    28                  20        1400/1400
  Words      48                  20        2400/2400

Table 1. Description of the database.

5. Experiments and results

In this section, we shall only consider the final results of our Arabic handwritten character recognition system, i.e., with all the parameter adjustments made in the different neural approaches. We also divided the database into two parts: 50 % of the samples formed the training set (1400 characters and 2400 words) and the remaining 50 % were used for the test (1400 characters and 2400 words).
5.1. Neural approach (classical methods)

Experiment 1: in the first experiment, we used the classic architectures described previously for the online Arabic handwritten character recognition, regardless of the position in the word, i.e., in its isolated form, at the beginning, middle and end. The best configurations of our system gave the results illustrated in Table 2.

  NN     % correct   % error   Classification time (s)
  MLP    97.73       2.27      48.21
  TDNN   49.46       50.54     2358.00
  RBF    86.09       13.91     23.09
  RBFE   75.39       24.61     8.68
  GRNN   86.36       13.64     8.14
  PNN    84.92       15.08     8.73

Table 2. Arabic characters: recognition rates and classification time according to the neural network.

Discussion: we notice that the used classic architectures give more or less satisfactory results; that said, the best classification results for the Arabic characters were obtained by using the multi-layer perceptron (MLP), with a 97.73 % success rate.

Experiment 2: in the second experiment, we use the same architectures for the online Arabic word recognition, the words being the 48 wilayas of Algeria. The best configurations of our system gave the results illustrated in Table 3.

  NN     % correct   % error   Classification time (s)
  MLP    85.31       14.69     27.01
  TDNN   46.76       53.24     32017.00
  RBF    81.85       18.15     37.57
  RBFE   62.58       37.42     16.38
  GRNN   78.44       21.56     16.74
  PNN    78.44       21.56     17.77

Table 3. Arabic words: recognition rates and classification time according to the neural network.

Discussion: we notice that the word recognition results have fallen considerably in comparison with Table 2. This is due to the increase of the number of classes (which rises from 28 to 48). It can also be explained by the complexity of the data samples: indeed, a word can consist of several pseudo-words, which can contain one or several characters, and the variability of the writing signal is richer in the case of word writing. In addition, the reduction of the recognition rate can be explained by the main limitation of the classic architectures: if the architecture is too deep, the optimization of the parameters very often leads to a non-optimal local minimum. Increasing the number of layers did not improve the results (sometimes, it degraded them). For this reason, we had previously highlighted that the results mentioned are those of the ideal architecture (with the best parameters for each neural network, including the number of layers). As explained in [13, 14], the error surface is not convex and becomes more irregular as the network gets deeper. Despite this, the MLP still gives the best result, 85.31 %, with an acceptable classification time of 27.01 seconds. Another observation is that the TDNN takes a long time for the classification, because of its complex architecture, which adds an additional dimension to the network, i.e., a temporal dimension in this particular case.

5.2. Deep neural approach

Experiment 3: to improve these results, we use a deep neural architecture of the Deep Belief Network (DBN) type, which represents a stacking of restricted Boltzmann machines. It should be noted that the recognition system architecture is slightly modified, because we have eliminated the "feature extraction" block (see Figure 2). This is due to the abstraction ability of the deep learning network: in fact, it allows automatically learning the different representations of the data, by abstraction level, from the raw data.
Phase 1: pre-training (unsupervised training, without backpropagation): in this phase, the algorithm utilizes a concatenation of the RBM layers to learn the distribution of the input data x (handwriting signal) without taking the y labels into consideration. Each RBM layer (the first two layers in this case) has a more abstract representation than the previous one. Hinton et al. [12] proved that a greedy learning method for unsupervised training is effective. This method is also called contrastive divergence (CD). This phase, sometimes called pre-training, can be considered as an efficient initialisation of the DBN.

Phase 2: fine-tuning (supervised phase, with back-propagation): the role of the second part of the DBN is to convert the abstract representation of x into the y labels that are used in the supervised learning (the last RBM layer with 2000 units). The back-propagation algorithm is used to readjust the network parameters, so that, finally, the globally optimal network can be obtained. The output layer has the number of classes (48 in this case). The training phase is considered as accomplished once the error between the output and the desired one is stabilized. After that, we proceed to test other samples from the database (see Table 1). This phase, which is called fine-tuning, is more time consuming than the previous one. The parameters of the used DBN network for each layer (the pre-training and fine-tuning phases) are mentioned in Table 4.

  Parameter                   Value
  Number of hidden layers     2
  Units of 1st hidden layer   100
  Units of 2nd hidden layer   100
  Units of 3rd hidden layer   2000
  Batch size                  100
  Number of RBM epochs        50

Table 4. DBN parameters.

The above described DBN network architecture gave the results shown in Table 5.

  NN    % error before BP   % error after BP   % correct   Classification time (s)
  DBN   33.75               2.92               97.08       252.21

Table 5. Arabic words: recognition rates and classification time for the DBN network.

Discussion: we can see that the results of the word classification are significantly improved with the use of the DBN. Indeed, in the previous table, the best classification rate was 85.31 % with the MLP; it reaches 97.08 % with the DBN network with the stacked RBMs, giving a gain of 11.77 %. We also note that the second step, the supervised part (after the back-propagation), called "fine-tuning", is very important in terms of the error rate reduction (2.92 % error rate) and turned out to be much more efficient than the pre-training step (33.75 % error rate), because we do not start from an initial random solution. Figure 11 represents a summary of the classification rates obtained for each used neural network.

Figure 11. Comparison between the deep neural approach (DBN) and the classic neural approach for the Arabic word recognition: classification rate (%) for MLP, TDNN, RBF, RBFE, GRNN, PNN and DBN.

6. Conclusion

In this paper, we have proposed a neural approach, which aims to develop a solution based on neural networks for the recognition of dynamically acquired online Arabic manuscripts. With the aim of performing the online recognition, we built our own database, containing 2800 characters and 4800 words. We divided our work into two parts. The first presents a classic neural approach using these different neural networks: MLP, TDNN, RBFNE, RBF, GRNN and PNN. Our experiments on the character recognition gave interesting results, such as a 97.73 % success rate for the MLP and 86.36 % for the GRNN.
Nevertheless, the application of the same architectures to the word recognition significantly deteriorates the results, which did not exceed 85.31 % for the MLP and 81.85 % for the RBF. This can be explained by the complexity of the input signal and the local minima problem. The second approach uses deep learning, involving the DBN, which represents a stacking of restricted Boltzmann machines and which is characterized by a layer-by-layer learning method. This network significantly improves the performance for the word database and gives a classification rate of 97.08 %. In fact, the DBN has shown its ability to exert an excellent performance for classification tasks and dimension reduction. We conclude on the superiority of the deep learning network compared to the classic neural networks. We also note that the various parameters used in all the previous neural architectures have allowed us to obtain the best classification rates.

References
[1] N. Ben Amara, A. Belaïd. Modélisation pseudo bidimensionnelle pour la reconnaissance de chaînes de caractères arabes imprimés. In Colloque International Francophone sur l'Écrit et le Document CIFED'98, pp. 131–141. Québec, Canada, 1998.
[2] L. R. Rabiner, B. H. Juang. An introduction to hidden Markov models. IEEE ASSP Magazine, pp. 4–16, 1986.
[3] C.-L. Liu, S. Jaeger, M. Nakagawa. Online recognition of Chinese characters: the state-of-the-art. IEEE Transactions on Pattern Analysis and Machine Intelligence 26(2):198–213, 2004. doi:10.1109/tpami.2004.1262182.
[4] Y. Kessentini, T. Paquet, A. Ben Hamadou. A multi-lingual recognition system for Arabic and Latin handwriting. In Proceedings of the 2009 10th International Conference on Document Analysis and Recognition, pp. 1196–1200. July 26-29, 2009. doi:10.1109/icdar.2009.55.
[5] V. Margner, H. El Abed. Databases and competitions: strategies to improve Arabic recognition systems. In Proceedings of the 2006 Conference on Arabic and Chinese Handwriting Recognition, pp. 82–103. College Park, MD, USA, September 27-28, 2006.
[6] M. Elleuch, N. Tagougui, M. Kherallah. Optimization of DBN using regularization methods applied for recognizing Arabic handwritten script. In International Conference on Computational Science, ICCS. 12-14 June 2017. doi:10.1016/j.procs.2017.05.070.
[7] N. Mezghani, A. Mitiche, M. Cheriet. Bayes classification of online Arabic characters by Gibbs modeling of class conditional densities. IEEE Transactions on Pattern Analysis and Machine Intelligence 30(7):1121–1131, 2008. doi:10.1109/tpami.2007.70753.
[8] N. Mezghani, M. Cheriet, A. Mitiche. Combination of pruned Kohonen maps for on-line Arabic characters recognition. In Proceedings of the Seventh International Conference on Document Analysis and Recognition, p. 900. August 03-06, 2003. doi:10.1109/icdar.2003.1227790.
[9] F. Biadsy, J. El-Sana, N. Habash. Online Arabic handwriting recognition using hidden Markov models, 2006.
[10] S. Benbakreti, A. Boukelif. Features extraction and on-line recognition of isolated Arabic characters, pp. 481–500. 2018. doi:10.1007/978-3-319-67056-0_23.
[11] Y. Bengio, P. Lamblin, V. Popovici, H. Larochelle.
Greedy layer-wise training of deep networks. In Advances in Neural Information Processing Systems 19, eds. B. Schölkopf, J. Platt, T. Hoffman (Vancouver: MIT Press), pp. 153–160, 2007.
[12] G. Hinton, S. Osindero, Y.-W. Teh. A fast learning algorithm for deep belief nets. Neural Computation 18(7):1527–1554, 2006.
[13] D. Erhan, P.-A. Manzagol, Y. Bengio, et al. The difficulty of training deep architectures and the effect of unsupervised pre-training. In International Conference on Artificial Intelligence and Statistics, pp. 153–160. 2009.
[14] I. Guyon, P. Albrecht, Y. Le Cun, et al. Design of a neural network character recognizer for a touch terminal. Pattern Recognition 24(2):105–119, 1991. doi:10.1016/0031-3203(91)90081-f.
[15] P. Burrascano. Learning vector quantization for the probabilistic neural network. IEEE Transactions on Neural Networks 2(4):458–461, 1991. doi:10.1109/72.88165.
[16] D. K. Kim, D. H. Kim, S. E. Chang, S. A. Chang. Modified probabilistic neural network considering heterogeneous probabilistic density functions in the design of breakwater. KSCE Journal of Civil Engineering 11(2):65–71, 2007. doi:10.1007/bf02823849.
[17] M. J. D. Powell. Radial basis functions for multivariable interpolation: a review. In Algorithms for Approximation, pp. 143–167. Clarendon Press, Oxford, 1987.
[18] J. Park, I. W. Sandberg. Universal approximation using radial basis function networks. Neural Computation 3(2):246–257, 1991. doi:10.1162/neco.1991.3.2.246.
[19] D. F. Specht. A generalized regression neural network. IEEE Transactions on Neural Networks 2(6):568–576, 1991. doi:10.1109/72.97934.
[20] D. F. Specht. Probabilistic neural networks. Neural Networks 3(1):109–118, 1990. doi:10.1016/0893-6080(90)90049-q.
[21] M. Berthold, J. Diamond. Constructive training of probabilistic neural networks. Neurocomputing 19(1):167–183, 1998. doi:10.1016/s0925-2312(97)00063-5.
[22] W. Tomasz, J. Jacek, M. Jacek. Probabilistic neural network for direction estimation. In Proceedings of the Third Conference on Neural Networks and Their Applications and Summer School on Neural Networks Applications to Signal Processing, Kule, 14 X-18 X 97, pp. 173–178. Eds. R. Tadeusiewicz, L. Rutkowski, J. Chojcan. Częstochowa: Pol. Neural Netw. Soc., 1997.
[23] Y. Bengio, Y. LeCun. Scaling learning algorithms towards AI. In Large-Scale Kernel Machines. MIT Press, 2007.
[24] G. E. Hinton. Training products of experts by minimizing contrastive divergence. Neural Computation 14(8):1771–1800, 2002. doi:10.1162/089976602760128018.
Acta Polytechnica 57(4):295–303, 2017, doi:10.14311/ap.2017.57.0295
© Czech Technical University in Prague, 2017

Characterization of selected properties of composites of waste paper with untreated bamboo stem fibre and rice husk

Taofeek A. Yusuf*, Lawson E. Dan-Laaka, Peter O. Ogwuche
Mechanical Engineering Department, University of Agriculture Makurdi, Nigeria
* Corresponding author: yusuf.taofeek@uam.edu.ng

Abstract. Composite technology is an excellent approach to utilizing natural fibres and agricultural wastes, which constitute an environmental nuisance. Efforts are being made to characterize the properties of composites produced from different sources of these wastes and fibres to facilitate the choice and selection for different applications. In this study, selected properties of composite samples, produced from waste paper in an equal mix-ratio with rice husk and bamboo stem fibres (BSF) separately, without a chemical pre-treatment, using cassava starch as a binder, were characterized. Composites from rice husk are better in terms of their higher compressive strength (71–202 N/mm²), their lower water absorption (at a rate of 1.97–5.19 and 1.09–3.02 %/min) and their lower thickness swelling (at a rate of 0.74–1.23 and 0.52–0.70 %/min) at 30 min and 1 h immersion times respectively, while those from the BSF are superior for their lower density (0.321–0.358 g/cm³) and specific weight (3.15–3.51 kN/m³). The material composition (percentage fibre volume fraction) appears to have no significant effect on the impact strength (26.0–26.4 kJ/m²) as well as on the other selected properties of the composites (p > 0.05). However, all the samples have properties that meet the requirements for composites, except that the water absorption and thickness swelling are relatively high. The composites have a considerably low density, which makes them suitable for light weight applications. Their compressive and impact strengths make them appear specifically relevant for the production of construction blocks and industrial helmets respectively. Meanwhile, the properties are liable to modification by a chemical pre-treatment of the fibres.

Keywords: biocomposite properties; waste paper; bamboo stem fibre; rice husk.

1. Introduction
Preferences of one engineering material over the other are fundamentally on account of the desired properties. Not all materials have excellent properties suitable for all applications; at the same time, no material is entirely unsuitable in any application. Characterization of the properties of different materials is a subject of great importance in material choice and selection for production engineers. Composite materials produced from agricultural wastes and natural fibres have a special merit because they eliminate environmental wastes. They have been said to have advantages such as availability, light weight, specific strength and good modulus properties [1–3]. Biocomposites have been described as composites that contain at least one natural or plant fibre. They are finding relevance in biomedical applications, such as tissue engineering, orthopaedics, etc. [4]. Various other uses of agro-based biocomposites have been highlighted [5, 6], while various researchers have emphasized the use of biomass for their production [5]. Hence, due to their general nature as wastes, paper, rice husk and bamboo stem fibres may be considered as good sources for the production of such composites.

Organic products and paper constitute 63 percent of the global waste composition, in a ratio of 2.7:1 respectively [7]. Waste paper constitutes the bulk of the wastes turned out daily from academic institution, office, home and industrial activities. Its potential is technologically underutilized in Nigeria [8]. Rice and bamboo are natural fibres in the plant class, which has been generally categorized into six groups, namely bast (flax, hemp, jute, kenaf and ramie), leaf (abaca, pineapple and sisal), seed or fruit (coir, cotton and kapok), straw or stalk (corn, rice and wheat), grass (bagasse and bamboo), and wood (softwood and hardwood) [9]. Some other authors listed them as seven, categorizing seed and fruit as separate classes [10], while [4] also categorized them as six but listed maize under straw/stalk and included corn under the grass class.

The chemical composition of plant fibres includes wax, the major components being lignocellulose (cellulose, hemicellulose and lignin). Cellulose is described as the stiffest and strongest part of the fibre; lignin is the phenolic compound, which is resistant to microbial degradation and acts as a binder that links the celluloses to retain or support the plant structure; the wax content is said to influence the wettability of the composite matrix as well as the interfacial fibre-matrix adhesion. These basic components determine the physical properties of the fibres and the physico-mechanical properties of the biocomposites, and can be altered or disturbed by a physical/chemical treatment [9]. For instance, cellulose makes plant fibres hydrophilic [10], and an acid treatment reduces their water absorption and thickness swelling [1]. Bast and leaf fibres have enjoyed greater attention than the others because of their generally large cellulose content. The seed/fruit fibres come next, due to the fact that cotton and coir, in this class, have the largest amounts of cellulose and lignin among the plant fibres respectively.
Deep study of the information provided by [9] revealed that rice has the largest amount of hemicellulose (33.0 %) and also a cellulose content (41.0–57.0 %) higher than that of coir (36.0–43.0 %) and kenaf (31.0–39.0 %) in the seed and bast fibre categories respectively. More so, the wax contents of rice and rice husk are the highest in all the categories of plant fibres. Rice husks from rice processing factories are conspicuously mounting in public places and along roadsides across major cities like Makurdi in Nigeria. They are simply qualified as abundant in world production estimations [2]. This is expected to increase, as the current government is diversifying the economy with a special support for agriculture, especially local rice farming and processing. The husk is said to be formed from hard materials, including silica and lignin, and could be used as a filler in construction, an insulation material, a fertilizer, or a fuel. Its chemical composition is cellulose (35–45 %), hemicellulose (19–25 %), lignin (20 %) and wax (14–17 %) [9].

However, bamboo is one of the most commonly found trees within the local community. It is categorized with bagasse, in a ratio of 1:2.5, under the grass class of plant fibres, which has the largest world production, of about 80 percent. Its chemical composition is cellulose (26–43 %), lignin (21–31 %), hemicellulose (30 %) and wax (0 %). Although bamboo has no wax and has lower values than rice husk in terms of cellulose, it has higher values in both lignin and hemicellulose [9]. However, it is one of the smallest contributors among the natural fibres being used in biocomposites: bamboo stem fibre is ranked next to wood, with world productions of 99.1 % (wood) and 0.56 % (bamboo) in 2004 [6]. Meanwhile, other natural fibres, such as flax and hemp, which are commonly used [11, 12], have comparatively smaller productions of 0.05 and 0.01 % in this ranking respectively. This may be due to some disadvantages identified with bamboo, including a high moisture content and difficulties in extracting fine and straight fibres. The high lignin content is said to account for the highly brittle nature of bamboo and the difficulty of obtaining its fibres in a uniform length. Meanwhile, it has advantages such as a low density, a low cost, a high mechanical strength, stiffness, etc. [10]. The use of bamboo fibre in reinforced composite materials has been strongly recommended [13].

Composite technology is an excellent approach to utilize natural fibres and agricultural wastes, which constitute an environmental nuisance. The use of these composites as packaging materials, automobile body parts, fibreboards, etc., has reduced the volume of plastic in such products, resulting in an improved biodegradability as well as reduced costs and disposal threats.
It is essential to characterize the properties of composites produced from different material sources to discern suitable blends or compositions for specific applications. A chemical pre-treatment has been recommended for natural fibres used in composites to enhance the removal of the hemicellulose constituents, which are hydrophilic and often responsible for the high thickness swelling and water absorptivity of the resulting composites [1]. Meanwhile, binders are important components in a composite matrix. The replacement of petroleum-dependent binders, such as urea formaldehyde (UF), with natural adhesives, like starch and natural rubber, is being recommended [8]. According to the same authors, cassava starch possesses a good ductility, good bindability, self-curing properties and hygroscopic resistance in their incorporated composite. In this study, selected properties of two classes of plant fibres, separately bonded with waste paper using cassava starch to formulate composites without any chemical pre-treatment, are characterized to ascertain their natural properties.

2. Materials and methods

2.1. Raw materials and apparatus/equipment

The description of the materials and their locations are presented in Table 1, while the apparatus and equipment used are shown in Table 2.

  S/N   Materials            Locations
  1     Rice husk            New Rice Mill, Wurukum, Makurdi
  2     Bamboo stem fibre    Pila Village, Makurdi Local Government
  3     Waste (bond) paper   University of Agriculture, Makurdi (UAM)
  4     Cassava starch       North Bank Market, Makurdi
  5     Water                Water Board, University of Agriculture, Makurdi

Table 1. Raw materials.

  S/N   Equipment                                 Model                 Manufacturer
  1     Sensitive electronic weighing balance     Scout Pro 6000g       Ahans Scale Corp., Okuas, U.S.A.
  2     Measuring jug                             –                     –
  3     Wooden mould                              locally constructed   –
  4     Impact testing machine (Izod)             6705U/D/T85337        LS102DE, England
  5     California Bearing Ratio (CBR) machine    40E/E/E5              ELE Limited, U.S.A.

Table 2. Apparatus and equipment.

2.2. Methods of preparation

2.2.1. Waste paper, fibres and cassava starch

The waste paper collected was soaked in water for 3 days. It was later pounded and ground into a fine pulp and placed under the sun for drying, as shown in Figure 1. Both the rice husk and the small bamboo stem branches collected were sun dried to reduce the moisture content. The bamboo stem fibres (BSF), shown in Figure 2, were ground using a milling machine and sun dried again. Both were initially sieved using a sieve pan of 1500 microns and later with 850 microns to ensure a finer particle size. The final residues trapped inside the smaller sieve were used for the composites. The starch binder was prepared by mixing 20 g of cassava starch with 100 cm³ of cold water in a vessel at a room temperature. The solution mixture was well stirred before 400 cm³ of hot water was poured into it, with a continuous stirring, to form a lump slurry starch.

Figure 1. Paper pulp in the sun.

Figure 2. Bamboo stem fibre before grinding.
2.2.2. composite samples
a wooden mould of a size of 100 × 50 × 8 mm, shown in figure 3, was prepared with polythene as a facing material to prevent a leakage as well as to enhance the surface smoothness of the composite samples. with a constant mass (20 g) of cassava starch, different weight ratios of the waste paper, rice husk and bsf were measured and mixed manually to achieve the required composite blends shown in table 3. an equal volume of water was added to each blend to form a mouldable matrix. the blend was transferred into the mould and compressed to take shape. the cast was allowed to set properly, after which it was removed and kept in the sun for several days to ensure dryness.

figure 3. wooden mould.

rice husk/waste paper composites:
s/n | samples | rice husk (wt%) | waste paper (wt%) | binder (wt%)
1 | rh10wp70s20 | 10 | 70 | 20
2 | rh20wp60s20 | 20 | 60 | 20
3 | rh30wp50s20 | 30 | 50 | 20
4 | rh40wp40s20 | 40 | 40 | 20
5 | rh50wp30s20 | 50 | 30 | 20
6 | rh60wp20s20 | 60 | 20 | 20
7 | rh70wp10s20 | 70 | 10 | 20

bamboo stem fibre/waste paper composites:
s/n | samples | bsf (wt%) | waste paper (wt%) | binder (wt%)
1 | bf10wp70s20 | 10 | 70 | 20
2 | bf20wp60s20 | 20 | 60 | 20
3 | bf30wp50s20 | 30 | 50 | 20
4 | bf40wp40s20 | 40 | 40 | 20
5 | bf50wp30s20 | 50 | 30 | 20
6 | bf60wp20s20 | 60 | 20 | 20
7 | bf70wp10s20 | 70 | 10 | 20
table 3. composition of the composite matrix.
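as an aside, the wt% scheme of table 3 translates directly into batch masses once a total batch mass is chosen. the following is a minimal sketch of that conversion; the 250 g total and the code itself are illustrative assumptions, not values or procedures taken from the study.

```python
# illustrative sketch (not from the paper): converting the wt% blends of
# table 3 into batch masses for an assumed total dry mass per sample.

blends = {
    "rh10wp70s20": (10, 70, 20),  # (fibre wt%, waste paper wt%, binder wt%)
    "rh40wp40s20": (40, 40, 20),
    "rh70wp10s20": (70, 10, 20),
}

total_mass_g = 250.0  # assumed batch size, not a figure from the study

for name, fractions in blends.items():
    fibre_g, paper_g, binder_g = (total_mass_g * pct / 100.0 for pct in fractions)
    print(f"{name}: fibre {fibre_g:.0f} g, paper {paper_g:.0f} g, binder {binder_g:.0f} g")
```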
2.3. properties testing methods

2.3.1. density and specific weight
the mass m of each of the samples was determined using the sensitive electronic weighing balance, while the volume V was predetermined from the dimensions of the mould. consequently, the density and the specific weight of the samples were obtained as

$\rho = \frac{m}{V}$, (1)

$\gamma = \rho g$, (2)

where $g = 9.81\,\mathrm{m/s^2}$.

2.3.2. compression and impact strength
the impact strength of the composite samples was determined in the mechanical laboratory of the university of agriculture, makurdi, nigeria, using the avery-denison impact testing (it) machine shown in figure 4. the compression test to determine the compressive strength of the samples was carried out in the civil engineering laboratory of plateau state polytechnic, barkin ladi, jos, nigeria, in accordance with the specifications of astm d-1037 (1978), en 310 (1993) and en 319 (1993), using the california bearing ratio (cbr) compression testing (ct) machine with a 1500 kn capacity shown in figure 5. each sample was placed on the machine plate and loaded at 5 bars per second until it was crushed. the compressive strength $T_C$ was determined as calculated by [14],

$T_C = \frac{W}{bt}$, (3)

where W (n) is the failure load and b and t (mm) are the breadth and the thickness of the sample, respectively.

figure 4. avery-denison it machine.
figure 5. cbr ct machine (1500 kn capacity).
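for the record, the unit bookkeeping behind equations (1)–(3), together with the conversion that links the joule readings of table 5 to the kj/m2 values quoted later in section 3.2, can be sketched as follows; the mass and load inputs are assumed, illustrative numbers, not measurements from the study.

```python
# a minimal sketch (assumed inputs) of the property calculations of section 2.3:
# density (eq. 1), specific weight (eq. 2), compressive strength (eq. 3) and the
# conversion linking the joule readings of table 5 to the kj/m^2 values of section 3.2.

g = 9.81  # gravitational acceleration, m/s^2

# density and specific weight for a moulded 100 x 50 x 8 mm sample
mass_g = 16.5                                # assumed measured mass, g
volume_cm3 = 10.0 * 5.0 * 0.8                # mould volume = 40 cm^3
density_g_cm3 = mass_g / volume_cm3          # ~0.41 g/cm^3
specific_weight_kn_m3 = density_g_cm3 * 1000 * g / 1000  # g/cm^3 -> kg/m^3, n -> kn

# compressive strength, t_c = w / (b t), with the 100 x 50 mm face loaded
failure_load_n = 355_000.0                   # assumed failure load w, n
t_c_mpa = failure_load_n / (100.0 * 50.0)    # n/mm^2 = mpa -> 71 mpa

# impact energy (j) over the 5000 mm^2 sample area gives the impact strength in kj/m^2
impact_energy_j = 130.0                      # a table 5 reading
impact_strength_kj_m2 = impact_energy_j / (5000e-6 * 1000)  # -> 26.0 kj/m^2

print(density_g_cm3, specific_weight_kn_m3, t_c_mpa, impact_strength_kj_m2)
```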
2.3.3. water absorption and thickness swelling
two samples of each composite were immersed in water in a flat-bottom container and removed after separate intervals of 30 min and 1 h. the mass of each sample before ($m_1$) and after the immersion ($m_2$) was recorded, and the water absorptive rate was calculated as

$\mathrm{war} = \frac{m_2 - m_1}{m_1 t} \cdot 100\,\%$, (4)

where t is the duration of the immersion in minutes. the experiment was performed at an average room temperature of 33 °c. the thickness swelling (ts) tests were performed according to the astm d-1037. all samples had the same thickness of 8 mm before the immersion; the thickness T (mm) of each sample after the immersion was recorded, and the thickness swelling rate was calculated as

$\mathrm{tsr} = \frac{T - 8\,\mathrm{mm}}{8\,\mathrm{mm} \cdot t} \cdot 100\,\%$. (5)
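equations (4) and (5) are simple rate expressions; the short sketch below evaluates them for one measured row of tables 6 and 7 (sample rh10wp70s20), so the printed values can be checked against those tables. the function names are ours, not the paper's.

```python
# a short sketch of the water absorption and thickness swelling rates of
# equations (4) and (5); the inputs reproduce one row of tables 6 and 7.

def water_absorptive_rate(m1_g, m2_g, t_min):
    """percent mass gain per minute of immersion (eq. 4)."""
    return (m2_g - m1_g) / (m1_g * t_min) * 100.0

def thickness_swelling_rate(final_thickness_mm, t_min, initial_thickness_mm=8.0):
    """percent thickness gain per minute of immersion (eq. 5)."""
    return (final_thickness_mm - initial_thickness_mm) / (initial_thickness_mm * t_min) * 100.0

# rh10wp70s20 after 30 min of immersion: 12.6 g -> 32.23 g, 8 mm -> 10.36 mm
print(water_absorptive_rate(12.6, 32.23, 30))  # ~5.19 %/min, as in table 6
print(thickness_swelling_rate(10.36, 30))      # ~0.98 %/min, as in table 7
```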
2.4. statistical analysis
the ibm spss statistical software tool (version 21) was used to examine the relationships and differences among the properties, as well as between each of the properties and the percentage fibre volume fraction (pfvf, i.e., 10–70 %) of the rice husk and the bsf contained in the samples.

3. results and discussions

3.1. density and specific weight
the results on the density and specific weight of the samples are presented in table 4.

rice husk/waste paper composites:
s/n | samples | density (g/cm3) | specific weight (kn/m3)
1 | rh10wp70s20 | 0.325 | 3.19
2 | rh20wp60s20 | 0.375 | 3.68
3 | rh30wp50s20 | 0.425 | 4.17
4 | rh40wp40s20 | 0.413 | 4.05
5 | rh50wp30s20 | 0.475 | 4.66
6 | rh60wp20s20 | 0.500 | 4.91
7 | rh70wp10s20 | 0.513 | 5.03

bamboo stem fibre/waste paper composites:
s/n | samples | density (g/cm3) | specific weight (kn/m3)
1 | bf10wp70s20 | 0.321 | 3.15
2 | bf20wp60s20 | 0.325 | 3.19
3 | bf30wp50s20 | 0.325 | 3.19
4 | bf40wp40s20 | 0.343 | 3.36
5 | bf50wp30s20 | 0.345 | 3.38
6 | bf60wp20s20 | 0.350 | 3.43
7 | bf70wp10s20 | 0.358 | 3.51
table 4. density and specific weight of the samples.

the table shows that the density and specific weight of the samples increase with an increased quantity of rice husk and bsf, while an increased quantity of waste paper in either composite leads to a decrease in these properties, implying that the waste paper has a much lower density and specific weight than these materials. it also reveals that rice husk is heavier than the bsf, the densities ranging between 0.325–0.513 and 0.321–0.358 g/cm3 and the specific weights between 3.19–5.03 and 3.15–3.51 kn/m3, respectively. these values are lower than those of the light-weight blocks (0.356–0.713 g/cm3) whose production is predicted from composites of saw dust, paper and lime [7], and they compare well to the 0.213–0.580 g/cm3 obtained for fonio (acha) husk/gum arabic-resin bound composites [15]. they are even lower than the lowest density of 0.6 g/cm3 recorded for natural plant fibres, than the 0.6–1.1 g/cm3 reported specifically for bamboo fibres [9], and than the densities of all the natural fibres listed by [11].
they are also lower than that of a typical cement board (1.86 g/cm3), as well as the 1.408–1.603 g/cm3 found for the composites developed for a similar application from paper, natural fibres (rice husk and rice), silica, cement, polyvinyl acetate and polyol [16]. a low density or weight is one of the most important advantages of composites from natural fibres; consequently, the composites from this study are acceptable and suitable for light-weight applications.

3.2. compression and impact strength
the outcomes of the compression and impact tests on the samples are presented in table 5.

rice husk/waste paper composites:
s/n | samples | compression strength (mpa) | impact strength (j)
1 | rh10wp70s20 | 202 | 130
2 | rh20wp60s20 | 108.5 | 132
3 | rh30wp50s20 | 128 | 131
4 | rh40wp40s20 | 71 | 132
5 | rh50wp30s20 | 79 | 132
6 | rh60wp20s20 | 89.5 | 132
7 | rh70wp10s20 | 72.5 | 132

bamboo stem fibre/waste paper composites:
s/n | samples | compression strength (mpa) | impact strength (j)
1 | bf10wp70s20 | 166 | 132
2 | bf20wp60s20 | 95 | 132
3 | bf30wp50s20 | 93 | 132
4 | bf40wp40s20 | 80 | 131
5 | bf50wp30s20 | 77 | 131
6 | bf60wp20s20 | 77 | 131
7 | bf70wp10s20 | 74 | 131
table 5. compression and impact strengths of the samples. the area of the samples is 5000 mm2.

the table shows that the composites of rice husk and the bsf have compression strengths of 71–202 and 74–166 mpa (i.e., n/mm2), respectively. the compressive strength of the composites is higher than the 50–80 mpa observed for two different species of bamboo strips in a previous research [10]. it is also higher than the 0.057–0.397 n/mm2 found for the particleboards produced from composites of fonio husk and gum arabic at the lowest to highest percentages of the resin adhesive used as a binder, and higher than the minimum compressive strength (2.5 n/mm2) acceptable for construction blocks [15]. the values are also higher than that of a typical cement board (26.9 n/mm2), as well as the 17.5–22.1 n/mm2 measured for the composites developed for a similar application from paper, natural fibres (rice husk and rice), silica, cement, polyvinyl acetate (pva) and polyol [16]. on average, they compare well to the optimum compressive strength of 90 mpa reported for an epoxy resin filled with rice husk [17].
meanwhile, the composition of the composites appears to have no significant effect on the impact strength, the composites of rice husk and bamboo stem fibres having impact strengths of 26.0–26.4 and 26.2–26.4 kj/m2, respectively. this result compares well with polycarbonate (20.0–30.0 kj/m2) and acrylonitrile butadiene styrene (10.0–29.0 kj/m2), which are used for the production of industrial helmet shells, and with the coir/epoxy resin composite (21.80–26.43 kj/m2) proposed for a similar application [18].

3.3. water absorptivity and thickness swelling
the results of the water absorption and thickness swelling tests are presented in tables 6 and 7.

rice husk/waste paper composites:
samples | initial weight (g) | final weight @30 (g) | final weight @60 (g) | absorptive rate @30 (%/min) | absorptive rate @60 (%/min)
rh10wp70s20 | 12.6 | 32.23 | 35.44 | 5.19 | 3.02
rh20wp60s20 | 15.5 | 35.64 | 36.62 | 4.33 | 2.27
rh30wp50s20 | 16.5 | 35.42 | 37.78 | 3.82 | 2.15
rh40wp40s20 | 16.83 | 27.1 | 27.82 | 2.03 | 1.09
rh50wp30s20 | 19.17 | 31.17 | 33.19 | 2.09 | 1.22
rh60wp20s20 | 19.67 | 31.67 | 33.69 | 2.03 | 1.19
rh70wp10s20 | 20.33 | 32.37 | 33.92 | 1.97 | 1.11

bamboo stem fibre/waste paper composites:
samples | initial weight (g) | final weight @30 (g) | final weight @60 (g) | absorptive rate @30 (%/min) | absorptive rate @60 (%/min)
bf10wp70s20 | 12.86 | 44.14 | 44.87 | 8.11 | 4.15
bf20wp60s20 | 13 | 41 | 43.62 | 7.18 | 3.93
bf30wp50s20 | 13.33 | 39.67 | 43.56 | 6.59 | 3.78
bf40wp40s20 | 13.67 | 36.33 | 39.2 | 5.53 | 3.11
bf50wp30s20 | 13.83 | 35.17 | 38.37 | 5.14 | 2.96
bf60wp20s20 | 14 | 34 | 38.2 | 4.76 | 2.88
bf70wp10s20 | 14.34 | 31.66 | 36.88 | 4.03 | 2.62
table 6. water absorption of the samples. legend: @30 – at 30 min; @60 – at 60 min.

rice husk/waste paper composites:
samples | final thickness @30 (mm) | final thickness @60 (mm) | thickness swelling rate @30 (%/min) | thickness swelling rate @60 (%/min)
rh10wp70s20 | 10.36 | 11 | 0.98 | 0.63
rh20wp60s20 | 10.96 | 11.15 | 1.23 | 0.66
rh30wp50s20 | 10.37 | 11.2 | 0.99 | 0.67
rh40wp40s20 | 10.52 | 10.84 | 1.05 | 0.59
rh50wp30s20 | 10.1 | 10.5 | 0.88 | 0.52
rh60wp20s20 | 9.82 | 10.6 | 0.75 | 0.54
rh70wp10s20 | 9.77 | 11.38 | 0.74 | 0.7

bamboo stem fibre/waste paper composites:
samples | final thickness @30 (mm) | final thickness @60 (mm) | thickness swelling rate @30 (%/min) | thickness swelling rate @60 (%/min)
bf10wp70s20 | 11.98 | 14.15 | 1.66 | 1.28
bf20wp60s20 | 11.95 | 13.05 | 1.65 | 1.05
bf30wp50s20 | 11.38 | 12.05 | 1.41 | 0.84
bf40wp40s20 | 10.49 | 11.5 | 1.04 | 0.73
bf50wp30s20 | 10.18 | 11.28 | 0.91 | 0.68
bf60wp20s20 | 10.13 | 10.38 | 0.89 | 0.5
bf70wp10s20 | 10.01 | 10.08 | 0.84 | 0.43
table 7. thickness swelling of the samples. the initial thickness was 8 mm. legend: @30 – after 30 min; @60 – after 60 min.

table 6 shows that rice husk has a lower affinity for water (i.e., a lower water absorption capacity) than the bsf. the composites from rice husk have water absorptive rates (war) of 1.97–5.19 and 1.09–3.02 %/min, while those from the bsf have 4.03–8.11 and 2.62–4.15 %/min at 30 min and 1 h of immersion, respectively. table 7 shows that the rice husk is also better than the bsf in terms of its resistance to thickness swelling: the composites from rice husk have thickness swelling rates (tsr) of 0.74–1.23 and 0.52–0.70 %/min, while those from the bsf have 0.84–1.66 and 0.43–1.28 %/min at 30 min and 1 h of immersion, respectively. a contrast may not be appropriate between these values and the 81.2 % of the dry weight measured for the water absorbed by a bamboo fibre soaked in water for 144 h (an absorptive rate of 0.009 %/min) [10], or the values for some selected natural fibres [11], due to the extended and unknown lengths of immersion, respectively. however, a comparatively similar outcome may be expected, since this result indicates that the water absorption of the composites decreases with a prolonged time of the immersion in water. this may be partly due to the presence of waste paper, which creates a further interfacial space within the matrix of the composites. it may also be partly due to the lack of a pre-treatment of the fibres, which follows from the aim of characterizing the properties of the composites in their natural composition. meanwhile, [8] found that untreated composites of rattan particulate reinforced paper pulp, using starch as a binder, are better than alkali-treated samples in terms of water absorption. a chemical treatment has been reported as capable of removing the hemicellulose and lignin content of the plant fibre, resulting in a reduction of the water absorption and thickness swelling of the biocomposites [1, 19].
this result agrees with this earlier finding, as the bsf contains no wax and has higher amounts of lignin and hemicelluloses than rice husk. it therefore implies that the hydrophilic nature of plant fibres is due to hemicelluloses more than to cellulose, or that biocomposites are less hydrophilic with an increased wax content and reduced hemicellulose and lignin contents of the plant fibres from which they are made. hemicellulose and lignin have been described as amorphous and hygroscopic thermoplastic substances, which are affected by environmental conditions, such as humidity and temperature [20]. since rice husk is higher in cellulose content and yet shows the lower water absorption and thickness swelling, it may be concluded that the relationship between cellulose and these composite properties is inverse, as is that of the wax. this agrees with the observation that an increase in cellulose improves the mechanical properties of the fibres [13] and suggests that the same may be true for wax. cellulose has been described as the main reinforcing element, and it is not affected by alkalis and dilute acids [20]. logically, this analysis suggests that cellulose and wax act like binding agents, while the lignin and hemicelluloses act like pore cavities within the matrix of the fibres. this also agrees with the information in table 5, in that the composites with rice husk have a higher compression strength compared to the bsf; the former is expected to have a greater resistance to compression, since its matrix is composed of more binding structures than pore cavities.

3.4. result of statistical analysis
the outcomes of the pearson correlation and t-test are shown in tables 8 and 9.

paired factors | n | correlation | sig. (2-tailed) | t | df | sig. (2-tailed)
pfvf & density | 14 | −0.34 | 0.234 | 6.328 | 13 | 0
pfvf & specific weight | 14 | −0.342 | 0.231 | 3.124 | 13 | 0.008
pfvf & compression strength | 14 | −0.522 | 0.055 | −8.415 | 13 | 0
pfvf & impact strength | 14 | −0.071 | 0.81 | −108.435 | 13 | 0
pfvf & water absorptive rate | 28 | 0.284 | 0.143 | 5.284 | 27 | 0
pfvf & thickness swelling rate | 28 | 0.005 | 0.981 | 8.501 | 27 | 0
table 8. percentage fibre volume fraction (pfvf) and composite properties.

sig. (2-tailed); properties | sw | cs | is | war | tsr | time
density (d) | 0 | 0.196 | 0.167 | (−)0.000 | (−)0.016 | np
specific weight (sw) | – | (−)0.199 | 0.165 | (−)0.000 | (−)0.017 | np
compression strength (cs) | ns | – | (−)0.152 | 0.114 | 0.213 | np
impact strength (is) | ns | ns | – | (−)0.648 | 0.231 | np
water absorptive rate (war) | hs | ns | ns | – | 0 | (−)0.003
thickness swelling rate (tsr) | hs | ns | ns | hs | – | (−)0.001
table 9. correlation between composite properties. legend: ns – not significant, hs – highly significant, np – not possible.

table 8 shows that the percentage volume fraction of the fibres in the composite matrix has no significant effect on the selected properties (p > 0.05), but that there is a highly significant difference in each of these properties for a varying percentage fibre volume fraction (pfvf) in the samples (p < 0.05). this implies that these properties are significantly different for each sample, but that this difference is not related to their compositions. this result confirms that not only does the sample composition have no significant effect on the impact strength (as suspected earlier from table 5), but that the same also applies to every other selected property. table 9 shows that the compression and impact strengths have no significant relationship either with each other or with any of the other selected properties (p > 0.05). meanwhile, the density has a highly significant effect on the specific weight, water absorption rate (war) and thickness swelling rate (tsr) of the samples (p < 0.05). however, its relationship with the war and tsr is negative, implying that as the density of the composite samples reduces, both properties increase, and vice versa; the specific weight has a similar relationship with both properties. the war has a highly significant positive relationship with the tsr, and vice versa (p < 0.05), which suggests that an increase in one implies an increase in the other. meanwhile, both the war and the tsr have highly significant negative relationships with time, such that the longer the duration of the immersion, the lower both properties (as observed in tables 6 and 7).
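the spss outputs condensed in tables 8 and 9 (pearson correlations plus paired-sample t-tests) can be reproduced with any statistics package. the sketch below, assuming scipy is available, pairs the pfvf with a single property column for brevity; the paper pools both composite families (n = 14 or 28), so this fragment will not reproduce the tabulated values exactly.

```python
# a minimal sketch of the paired analysis behind tables 8 and 9, assuming scipy.
# only the rice-husk density column is shown; the paper's analysis pairs the
# pfvf with every property and pools both composite families.

from scipy.stats import pearsonr, ttest_rel

pfvf = [10, 20, 30, 40, 50, 60, 70]                          # fibre fraction, %
density = [0.325, 0.375, 0.425, 0.413, 0.475, 0.500, 0.513]  # table 4, rh column

r, p_corr = pearsonr(pfvf, density)   # strength of the linear relationship
t, p_diff = ttest_rel(pfvf, density)  # paired-sample difference test

print(f"pearson r = {r:.3f} (p = {p_corr:.3f})")
print(f"paired t = {t:.3f} (p = {p_diff:.3f})")
```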
this result gives a more logical justification to the remark made earlier, that the war of 0.43–1.28 %/min observed in this study for the bsf (table 6) and the estimated corresponding value of 0.009 %/min observed by [10], after 1 h and 144 h of immersion respectively, may be equivalent and corroborative.

4. conclusion
waste paper, rice husk and bamboo stem fibre, which constitute wastes, have been used to produce composites, and selected properties of the composites produced have been characterized. the composites from rice husk are better in terms of their higher compressive strength and lower water absorption and thickness swelling, while the bamboo stem fibre is superior for its lower density and specific weight. the water absorption and thickness swelling of the composites decrease with an increasing immersion time in water for each of the samples. waste paper has a lower density and specific weight and a higher compression strength than both materials, such that the higher the quantity of waste paper in the composition, the better the compression strength and the lower the density and specific weight of the composites. the material composition (percentage fibre volume fraction) appears to have no significant effect on the impact strength, or on any other selected property of the composites (p > 0.05). however, all the samples have properties that meet the requirements for composites, except that the water absorption and thickness swelling are relatively high. the composites have a considerably low density, which makes them suitable for light-weight applications, and their compressive and impact strengths make them appear specifically relevant for the production of construction blocks and industrial helmets, respectively. the properties are liable to a modification with a chemical pre-treatment of the fibres.

references
[1] ibrahim, z., ahmad, m., aziz, a., ramli, r., jamaludin, m., muhammed, s., alias, a.: dimensional stability properties of medium density fibreboard (mdf) from treated oil palm (elaeis guineensis) empty fruit bunches (efb) fibres. open journal of composite materials, 6, 2016, p. 91–99. doi:10.4236/ojcm.2016.64009
[2] asim, m., abdan, k., jawaid, m., nasir, m., dashtizadeh, z., ishak, m. r., hoque, m. e.: a review on pineapple leaves fibre and its composites. international journal of polymer science, hindawi publishing corporation, 2015, p. 1–16. doi:10.1155/2015/950567
[3] tudu, p.: processing and characterization of natural fiber reinforced polymer composites. bachelor of technology degree project, mechanical engineering, national institute of technology rourkela, 2009, p. 52.
[4] namvar, f., jawaid, m., tahir, p. m., mohamad, r., azizi, s., khodavandi, a., rahman, h. s., nayeri, m. d.: potential use of plant fibres and their composites for biomedical applications. bioresources, 9(3), 2014, p. 5688–5706. doi:10.15376/biores.9.3
[5] dungani, r., karina, m., sulaeman, s. a., hermawan, d., hadiyane, a.: agricultural waste fibers towards sustainability and advanced utilization: a review. asian j. plant sci., 15(1–2), 2016, p. 42–55. doi:10.3923/ajps.2016.42.55
[6] suddell, b. c., rosemaund, a.: industrial fibres: recent and current developments. proceedings of the symposium on natural fibres, p. 71–82.
[7] okeyinka, o. m., oloke, d. a., khatib, j. m.: a review on recycled use of solid wastes in building materials. world academy of science, engineering and technology, international journal of civil, environmental, structural, construction and architectural engineering, 9(12), 2015, p. 1570–1579. scholar.waset.org/1999.3/10003128
[8] oluwole, o. i., avwerosuoghene, o. m.: effects of cassava starch and natural rubber as binders on the flexural and water absorption properties of recycled paper pulp based composites. international journal of engineering and technology innovations, 2(4), 2015, p. 7–12.
[9] ramamoorthy, s. k., skrifvars, m., perssona, a.: review of natural fibers used in biocomposites: plant, animal and regenerated cellulose fibers. polymer reviews, 55, 2015, p. 107–162. doi:10.1080/15583724.2014.971124
[10] zakikhani, p., zahari, r., sultan, m. t. h., majid, d. l.: extraction and preparation of bamboo fibre-reinforced composites. materials and design, 63, 2014, p. 820–828. doi:10.1016/j.matdes.2014.06.058
[11] chandramohan, d., marimuthu, k.: a review on natural fibers. ijrras, 8(2), 2011, p. 194–206.
[12] panthapulakkal, s., sain, m.: studies on the water absorption properties of short hemp–glass fiber hybrid polypropylene composites. journal of composite materials, 41(15), 2007, p. 1871–188. doi:10.1177/0021998307069900
[13] zakikhani, p., zahari, r., sultan, m. t. h., majid, d. l.: bamboo fibre extraction and its reinforced polymer composite material. world academy of science, engineering and technology, international journal of chemical, molecular, nuclear, materials and metallurgical engineering, 8(4), 2014, p. 315–318.
[14] neville, a. m.: properties of concrete. american concrete institute, farmington hills, 4(2), 2003, p. 503–507.
[15] ndububa, e. e., nwobodo, d. c., okeh, i. m.: mechanical strength of particleboard produced from fonio husk with gum arabic resin adhesive as binder. int. journal of engineering research and applications, 5(4), 2015, p. 29–33.
[16] shawia, n. b., jabber, m. a., mamouri, a. f.: mechanical and physical properties of natural fiber cement board for building partitions. physical sciences research international, 2(3), 2014, p. 49–53.
[17] ibrahim, w. m. a., kuek, s. y.: compressive strength of rice husk filled resin. advanced materials research, 264–265, 2011, p. 576–579. doi:10.4028/www.scientific.net/amr.264-265.576
[18] obele, c., ishidi, e.: mechanical properties of coir fiber reinforced epoxy resin composites for helmet shell. industrial engineering letters, 5(7), 2015, p. 67–74.
[19] abdul khalil, h. p. s., bhat, i. u. h., jawaid, m., zaidon, a., hermawan, d., hadi, y. s.: bamboo fibre reinforced biocomposites: a review. materials and design, 42, 2012, p. 353–368. doi:10.1016/j.matdes.2012.06.015
[20] ni, y.: natural fibre reinforced cement composites. ph.d. thesis, mechanical engineering, victoria university of technology, australia, 1995, p. 1–220.

acta polytechnica 57(4):295–303, 2017

acta polytechnica 58(6):355–364, 2018, doi:10.14311/ap.2018.58.0355, © czech technical university in prague, 2018, available online at http://ojs.cvut.cz/ojs/index.php/ap

ultra-high-performance fibre-reinforced concrete under high-velocity projectile impact. part ii. applicability of prediction models

sebastjan kravanja (a), radoslav sovják (b, ∗)
(a) faculty of civil and geodetic engineering, university of ljubljana, jamova cesta 2, ljubljana 1000, slovenia
(b) faculty of civil engineering, czech technical university in prague, thákurova 7, 166 29 prague 6, czech republic
∗ corresponding author: sovjak@fsv.cvut.cz

abstract. semi-infinite targets of ultra-high-performance fibre-reinforced concrete with various fibre volume fractions were subjected to a high-velocity projectile impact using in-service bullets. in this study, a variety of empirical and semi-analytical models for the prediction of the depth of penetration and mass ejection were evaluated with respect to the experimental results. models for the depth of penetration and spalling mass ejection were revisited and applied both with deformable and non-deformable projectile parameters. the applicability of the prediction models was assessed through a statistical comparison of the values from the models with the experimental results. the evaluation of the applicability was made through the newly proposed measure of a relative prediction accuracy for model selection and model estimation, which was verified with established statistical accuracy evaluations, such as the accuracy ratio, logarithmic standard deviation and correlation coefficient. the best fit to the experimental readings was provided by the newer semi-analytical models, which incorporate additional concrete parameters beside the compressive strength, while the majority of the older models failed to provide a sufficient accuracy.

keywords: projectile impact; uhpfrc; prediction models; depth of penetration; mass ejection.

1. introduction
a set of results for ultra-high-performance fibre-reinforced concrete (uhpfrc) with various fibre volume fractions under a high-velocity projectile impact was gathered for both rigid (non-deformable) and soft (deformable) projectiles [1]. the uhpfrc was chosen due to its exceptional mechanical properties and impact resistance. in the framework of this study, uhpfrcs with unconfined compressive strengths over 110 mpa were reinforced with discrete steel fibres in five different volumetric fractions ranging from 0.125 % to 2.0 %, while additional control specimens without a fibre inclusion (i.e., 0 %, plain uhpc) were tested as well. the rigid projectile was provided by a bullet with a full-metal jacket and a mild-steel core (fmj-msc or msc), while the soft projectile was provided by a bullet with a full-metal jacket and a soft-lead core (fmj-slc or slc). both projectiles were chosen because they represent the world's most widespread intermediate cartridge, used in the military assault rifle ak-47. for the semi-infinite targets investigated in this study, loaded with a high-velocity projectile impact, a number of empirical and semi-analytical prediction models for predicting the penetration depth and mass ejection were tested and evaluated through a comparison to the experimental results. predicting the effect of a projectile impact on a cementitious composite is a very complex problem, and although empirical formulas do exist, the most accurate methods are semi-analytical models and numerical simulations. empirical formulas are based on coefficient calibrations and on a fitting of the empirical constants and do not contain any physical substance, while semi-analytical models are developed on the basis of a physical concept and then calibrated to experimental results. a large number of predictive models were used, and their applicability to the case of this study was evaluated and discussed.

2. theoretical background
throughout modern history, numerous empirical and semi-analytical prediction models have been developed and tested for calculating the prediction of the penetration depth due to a projectile impact on solid materials. the purpose here was to verify whether any of the existing penetration models fit the results gathered from the experimental part of this study. to this end, in total, 42 penetration depth prediction models, which were publicly accessible, were evaluated, and their results were compared with the results of the experimental analysis and between each other. the accuracy of the prediction models was evaluated with the use of the value forecast model accuracy assessment $\sum \ln^2 q$ proposed by tofallis [2], which was proven to be less biased than other known statistical assessments and can be calculated as

$\sum \ln^2 q = \sum_{i=1}^{n} \ln^2 q_i$,

where $q_i$ is named the accuracy ratio and is calculated as

$q_i = \frac{f_t}{a_t}$,

where $f_t$ is the predicted (forecast) value and $a_t$ is the actual value of the compared quantity. in this case, the predicted value represented the result from the prediction models (whether it was the penetration depth or the mass ejection) and the actual value represented the result from the experimental investigation. the model and experimental results were gathered for each of the six different fibre volumetric ratios (including the plain uhpc mixture), so the summation index i of the squared logarithms of the accuracy ratios assumes values from 1 to 6. a higher accuracy of a prediction model results in a lower value of the accuracy assessment; therefore, the most accurate models have the lowest accuracy estimation values. the efficiency of the newly proposed logarithmic assessment by tofallis was verified by using the much more established logarithmic standard deviation (lsd), which is calculated as

$\mathrm{lsd} = \sqrt{\frac{\sum \left(\frac{s^2}{2} - \ln q_i\right)^2}{n-1}}$,

where $s^2$ is the sample variance of the accuracy ratio $q_i$. in this assessment, a lower value means a lower error and a higher accuracy of the prediction model.
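for illustration, the three measures are straightforward to compute from paired model/experiment values. the sketch below uses hypothetical numbers, not data from this study, and follows the lsd expression exactly as written above (with s² taken, as stated, as the sample variance of qi).

```python
# a minimal sketch (hypothetical values, not data from this study) of the
# accuracy measures of section 2: the accuracy ratio q_i = f_t / a_t, the
# tofallis measure sum(ln^2 q) and the logarithmic standard deviation (lsd).

import math

predicted = [21.0, 24.5, 26.0, 28.0, 30.5, 33.0]  # model values, one per fibre ratio
measured  = [20.0, 25.0, 27.5, 29.0, 30.0, 34.0]  # experimental counterparts

q = [f / a for f, a in zip(predicted, measured)]  # accuracy ratios q_i
sum_ln2_q = sum(math.log(x) ** 2 for x in q)      # lower value = more accurate model

n = len(q)
mean_q = sum(q) / n
s2 = sum((x - mean_q) ** 2 for x in q) / (n - 1)  # sample variance of q_i
lsd = math.sqrt(sum((s2 / 2 - math.log(x)) ** 2 for x in q) / (n - 1))

print(f"sum ln^2 q = {sum_ln2_q:.4f}, lsd = {lsd:.4f}")
```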
the rigid projectile was provided by a bullet with a full-metal jacket and mild-steel core (fmj-msc or msc) while the soft projectile was provided by a bullet with a full-metal jacket and soft-lead core (fmj-slc or slc). both projectiles were chosen because they represent a world’s widespread intermediate cartridge for the military assault rifle ak-47. for semi-infinite targets investigated in this study, loaded with high-velocity projectile impact, a number of empirical and semi-analytical prediction models for predicting the penetration depth and mass ejection were tested and evaluated through comparison to the experimental results. predicting the effect of the projectile impact on cementitious composite is a very complex problem and although empirical formulas do exist, most accurate methods are semi-analytical models and numerical simulations. empirical formulas are based on the coefficients calibration and fitting of the empirical constants and do not contain any physical substance, while semi-analytical models are developed on the basis of a physical concept and then calibrated to experimental results. a large number of predictive models were used and its applicability for the case of this study was evaluated and discussed. 2. theoretical background throughout modern history, numerous empirical and semi-analytical prediction models have been developed and tested for the means of calculating the prediction of the penetration depth due to projectile impact on solid materials. the purpose was to verify if any of the existent penetration models fit the results gathered from the experimental part of this study. by this means, in total 42 penetration depth prediction models, which were publicly accessible, were evaluated and their results were compared with the results of the experimental analysis and between each other. the accuracy of the prediction models was evaluated with the use of value forecast model accuracy assessment∑ ln2 q proposed by tofallis [2], for which it was proven to be less biased than other known statistical 355 http://dx.doi.org/10.14311/ap.2018.58.0355 http://ojs.cvut.cz/ojs/index.php/ap sebastjan kravanja, radoslav sovják acta polytechnica assessments and can be calculated using ∑ ln2 q = n∑ i=1 ln2 qi, where qi is named accuracy ratio and is calculated as qi = ft at , where ft is predicted (forecast) value and at is the actual value of the compared quantity. in this case, the predicted value represented the result from prediction models (whether it was penetration depth or mass ejection) and the actual value represented the result from the experimental investigation. the model and experimental results were gathered for each of the six different fibre volumetric ratios (including the plain uhpc mixture), so the summation index i of the square of logarithmic values of accuracy ratios is assuming values from 1 to 6. higher accuracy of the prediction models results in a lower value from the accuracy assessment equation; therefore, the most accurate models had the lowest accuracy estimation values. the efficiency of the newly proposed logarithmic assessment by tofallis was verified by using much more established logarithmic standard deviation (lsd), which is calculated by the equation lsd = √∑ (s2/2 − ln2 qi) n − 1 , where s2 is the sample variance of the accuracy ratio qi. in this assessment, the lower value means lower error and higher accuracy of the prediction model. 
in addition, the statistical quantity of the correlation coefficient was calculated for each model with respect to the experimental results in order to evaluate the relative relation between these values [3]. in the vast majority of the cases of penetration models, the unconfined compressive strength of a concrete target is a major parameter on the resistance side of the model. the major supposition in the development of prediction models was that the penetration depth and crater volume are in inverse correlation to the unconfined compressive strength of the concrete. however, kennedy [4] proposed, according to the one-dimensional theory of wave propagation, that significant reflected tensile wave also appears in the target with finite geometries. the tensile strength affects the spalling and scabbing part, and therefore cannot be neglected. this was appropriately incorporated in the newly proposed semi-analytical model by hwang et al. [5]. 2.1. parameters. it is important to mention that the majority of the empirical and semi-analytical models with empirical factors were developed through the curve fitting and regression analysis on the basis of experimental results, therefore establishing a strict application range symbol parameter unit impact parameters x predicted penetration depth m vi projectile impact velocity m/s d diameter of the projectile m m mass of the projectile kg crh calibre radius head – n∗ nose shape factor – ρp density of the projectile core material kg/m3 ep elastic modulus of the projectile core material pa yp yield strength of the projectile core material pa concrete target parameters f′c unconfined compressive strength pa ρc density kg/m3 ft direct tensile strength pa p fibre volumetric fraction % sa maximum diameter of coarse aggregate m h target thickness m table 1. parameters used in prediction models. of validity [6]. the application range is a set of intervals of parameters, for which the models have been tested, proven or for which the empirical factors were calculated. in the majority of models, these parameters are mass, diameter and impact velocity of the projectile and unconfined compressive strength of the concrete target. in penetration and mass ejection prediction models, a various set of parameters can be used. the parameters are divided into two parts: impact parameters, which are considering the projectile physical properties as well as impact velocity or impact kinetic energy, and target parameters, which are considering physical properties of the concrete target (table 1). 3. the depth of penetration for non-deformable projectiles five of the most known models and newest models, which were used in the study, are briefly described in this chapter. the original equation of the national defence research committee (ndrc) was presented in 1946 and was based on the physical model of the impact process [7]. in this model, the force on the projectile is assumed to increase linearly to a penetration depth of x = 2d and further on remains at a constant value. later, it has been shown that this model does not give an appropriate description of the penetration process, however, it was further modified to calculate total penetration depth with respect to the boundary condition of x = 2d. the penetration 356 vol. 58 no. 6/2018 ultra-high-performance fibre-reinforced concrete. part ii. model consisted of impact parameters of projectile velocity, diameter and mass, and additional empirical factors of material dependence and the nose shape factor. 
the latter two were later determined and a first modified ndrc formula was developed through the definition of the impact function g. in 1978, kar [7] revised the ndrc formula to consider the type of the projectile material using a regression analysis. the type of the material was incorporated through the use of the ratio between young’s elastic modulus of the deformable projectile material and the referenced elastic modulus of steel, and the modified impact function was proposed [8]. in 2012, almusallam [9] proposed and further modified the ndrc equation to incorporate the effect of the fibre reinforcement in the concrete matrix. an exponential term was introduced to the impact function g while the general form of the ndrc equation remained unaltered. in 2015, the authors proposed the equation for calculating fibre empirical constants, based on fibre’s geometrical and mechanical properties [9, 10]. 4. hwang et al. prediction model a new model for prediction of the penetration depth was presented by hwang et al. [5]. the whole model is based on the principle of the energy conservation law, in which the authors are considering two main energies, which appear at the impact process: kinetic energy of the projectile ek and resistant energy of the concrete target er. the latter is further divided into three resistant energies, which appear in the target of finite geometries: spalling es, tunnelling et and scabbing ec resistant energy. we have: ek = m 2 (v2i − v 2 r ), er = es + et + ec. with respect to the energy conservation law, the kinetic energy ek on the influence (impact) side of the problem is the same as the mobilized resistant energy er of the concrete target. in the continuation, a brief summary of the energy model calculation is presented. the original model is entirely the work of the authors of this model [5] and is presented just with the intention of explaining its basic assumptions. 4.1. spalling resistant energy the spalling energy is dissipated due to the reflected impact force (reflected tension wave) on the proximal face of the target. the dissipation of the energy emerges as an ejection of a concrete part, which is idealized with the truncated cone. the resistance force fs of the concrete cone is defined as fs = ftd ( tsbs tan θs + πd2 4 ) kskbs, where ftd is the concrete tensile strength increased by the strain rate according to the fib model code 2010 [11]; ts is the allowable spalling depth; θs is the average failure cone angle; bs is the average perimeter of the concrete cone; ks is the size effect factor and kbs is the stress concentration factor. the proposed formula for the average perimeter of the concrete cone bs with a diameter of d + 2ts tan θs was corrected from bs = π(d + ts tan θs) to bs = π(d + 2ts tan θs) by the reason of the inferred error in the text, however, it was used only in the modified version of the model. allowable spalling depth is the maximal depth of the ejected concrete cone. the model is proposing an estimation of the allowable spalling depth using four empirical factors, which take into consideration concrete target’s thickness, steel fibre volumetric ratio, concrete’s density and maximum size of the coarse aggregate. the average tangent value of an idealized cone angle was proposed by the authors of the model, which was experimentally acquired through larger series of experimental data. for an ogive nose projectile, it was proposed to be 1.55, which corresponds to a cone failure angle of 57.17°. 
the expression of the resistance force fs is multiplying the designed concrete tensile strength with truncated cone surface area (without bottom surface area on the face of the specimen). it is then multiplied by the size effect factor, which is determined by the concrete target’s thickness and stress concentration factor, which is an empirical factor proposed to a value 1.25 by authors. the non-conventional term for the cone surface area was not in a good agreement with experimental results of the crater cone area and it gave underestimated results and was, therefore, replaced with the conventional geometrical equation for a lateral cone surface area al: al = π(r1 + r2) √ (r1 − r2)2 + t2s , where r1 = d + 2ts tan θs 2 , r2 = d 2 are the radii of the bottom and top circle surfaces, respectively, and ts is the originally proposed allowable spalling depth. the resistance force fs was then calculated by the modified expression fs = ( al + πd2 4 ) ftdkskbs. the spalling resistant energy could then be determined by the equation es = fs vsc asp , where vsc is the volume of an idealized concrete cone and asp is the projected area of the idealized concrete cone on the proximal face of the target. both of these estimated quantities could also be compared with measured values. 357 sebastjan kravanja, radoslav sovják acta polytechnica 4.2. tunnelling resistant energy after the spalling region, the projectile continues its way through the concrete material by a tight tunnelling penetration. the projectile velocity is decreased by the bond resistance between the projectile and concrete. the authors of the model suggest the following expression for the bond resistance ft = πdttψτd, where tt is the allowable tunnelling depth, ψ is the nose shape factor (0.7 for ogive nosed) and τd is the bond strength, increased by a strain effect depending on the strain rate affecting the compressive strength according to the fib model code 2010 [11]. the tunnelling resistant energy could then be calculated with the equation et = ft ρpap m. 4.3. scabbing resistant energy the scabbing failure mode is, according to the authors, similar to the spalling failure mode, and therefore uses the same method for the resistant energy calculation. an allowable scabbing depth tc is assumed to be equal to the allowable spalling depth, while the average tangent value of an idealized failure cone angle on the scabbing part is proposed to be 2.0, regardless of the projectile nose shape. in addition, also in a part of the scabbing resistance energy calculation, the lateral area of the cone was replaced with the equation, which was presented in the spalling resistant energy section. the energy model with modified expressions for the lateral area of the concrete cone in both spalling and scabbing resistance energy section and corrected equation for the crater perimeter was labelled as mod. hwang et al. 4.4. penetration depth calculation the total penetration depth (x) was calculated by an equation that is derived on the basis of the aforementioned assumptions [5] for targets with semi-infinite geometries, where perforation does not occur, as x = mv2i 2er,max (h − ts). both the models, one with exact original formulas (hwang et al.) and one with two modified expressions for the spalling and scabbing crater perimeters and failure cone lateral area (mod. hwang et al.) were compared with the data from the experimental study. 5. 
the depth of penetration for deformable projectiles an analytical algebraic formula for the penetration prediction of a deformable projectile impact into a deformable target material was proposed by rubin and yarin [12]. it is divided into two stages of penetration: first, a deformable penetration stage and second, a rigid penetration stage. in the first stage of the penetration p i, the projectile’s head deforms into a mushroom-like shape and its tail remains rigid – a projectile is eroding with its length decreasing while the penetration velocity is relatively constant. it is assumed by the authors that after the first stage, the remaining mushroom-like head and rigid tail continue to penetrate the target as a rigid body – the second stage p ii begins. the total penetration x is calculated by summarizing the penetration depths from both stages x = p i + p ii. the first stage penetration depth p i can be calculated with the use of the following formulas: p i = u vi − u ler, ler = (l0 − l0) ( 1 − exp −ρp(vi − u)2 2yp ) , where u is the decreased velocity of the projectile due to the pressure at the target/projectile interface, calculated with the use of the projectile and target densities and assumed as constant, ler is the length of the portion of the rigid tail of the projectile that has been eroded away, l0 is the initial projectile length and l0 is the length of the mushroom-like section. the model is derived for the penetration of the cylindrical rod with an initial length l0. the analytical formula only considers cylinder-shaped blunt-nosed projectile. in this study, the head of the projectile was not blunt. however, since the resulting values of this analytical model are strongly influenced by the projectile length, an assumption of the equivalent length of the idealized cylindrical projectile was made and used in the equation. the equivalent length of the lead core was calculated by calculating the actual total volume of the core (by the cylinder and truncated cone) and then dividing it by the actual area of a cylindrical part of the deformable projectile core with a diameter of 6.32 mm, yielding an equivalent length l0 = 13.81 mm, which was further used in this model. although the resulting values of the rubin and yarin analytical model are strongly influenced by the projectile length, the effect of the nose shape for a deformable projectile should also be noted. walker and anderson [13] reported that the conical nosed projectiles did not perform as well as the blunt-nosed projectiles when the target was sufficiently hard to cause a significant projectile erosion. it can be derived that the deformable ogive-nosed projectile used in this study, which undergone a significant erosion, possibly induces a smaller damage than a blunt-nosed projectile. results from the analytical model should be, therefore, on the safe side. again, it is important to mention that an equivalent length was established with an assumption that the major part of the cratering damage is governed by the length of the projectile. 358 vol. 58 no. 6/2018 ultra-high-performance fibre-reinforced concrete. part ii. figure 1. comparison between experimental and representative prediction model results for dop for non-deformable fmj-msc projectile impact on uhpfrc and linear regression line for experimental data. 6. spalling mass ejection for the mass ejection prediction, two models were found accessible in public literature and assessed with corresponding input data. 
the model of sierakowski [14] is based on the physical principle of impulse. it is calculated by integrating resultant force with respect to time; however, in the case of an object of a constant mass (rigid projectile), the impulse can be expressed as a difference in momentum. the major supposition in this empirical model is that the volume of the impact crater is in inverse correlation with the square root of the unconfined compressive strength of the concrete. a prediction for the ejected mass from the front and the rear faces of fibre reinforced concrete slabs subjected to impact loads was proposed by almusallam [15]. the mass defragmented from the front face was modelled by a combination of a crater and a tunnel in a manner similar to the one in the prediction model by hwang et al., however, the shape of the front face crater at the proximal face was assumed to be elliptical and the shape of the crater at the penetration depth was assumed to be circular, whereas the transition was considered to be elliptical. the authors also proposed prediction equations for this quantity [16], which yielded a suitable usability of the prediction model in terms of designing of structures. the prediction model for equivalent crater diameter is incorporating the fibre effect directly through the use of the reinforcing index, whereas the prediction model for total mass ejection is incorporating this effect through the use of a penetration depth by almusallam et al. modified ndrc, equivalent crater diameter and additional tunnelling depth. fmj-msc fmj-slc modulus of elasticity 210 44.3 ep (gpa) density ρp (kg/m3) 7850 10735 yield strength yp (mpa) 552 73.3 table 2. assessed input parameters regarding projectile’s core material properties for consideration of deformability in prediction models. 7. results for depth of penetration in total, 42 prediction models were tested for the prediction of the penetration depth. the projectile core diameter was used in the calculation in both the msc and slc cases. in the prediction models for the deformable projectile penetration, the projectile deformability is incorporated through the use of ratios of the modulus of elasticity, density or yield strength of the projectile’s core and steel (table 2). 7.1. non-deformable fmj-msc projectile the results gathered through a calculation of prediction models of rigid projectile penetration gave values with a large dispersion. for better clarity, the prediction models, which take into account the fibre incorporation and/or high strain rate effect, were plotted (figure 1). additionally, original modified ndrc and ace equations were added for their common and 359 sebastjan kravanja, radoslav sovják acta polytechnica acc. model ref. ∑ ln2 q qi lsd ρfa scale 1 mod. hwang et al.* [5] 0.012 0.98 0.049 0.841 2 haldar-hamieh [8, 17] 0.057 1.1 0.105 0.82 3 mod. hughes* [8, 18] 0.078 1.03 0.124 0.983 4 haldar & miller [7, 19] 0.147 0.86 0.172 0.82 5 ace [4, 8] 0.156 0.85 0.177 0.816 6 almusallam et al. [15] 0.471 1.32 0.304 0.966 7 irs [8, 20] 0.571 1.36 0.337 0.815 8 whiffen [8, 21] 0.674 0.72 0.367 0.815 9 ukaea [8, 22] 0.683 1.4 0.368 0.803 10 amman & whitney [4, 8] 0.692 0.71 0.372 0.803 11 mod. almusallam et al. [9] 0.708 1.41 0.375 0.82 12 mod. ndrc [4, 8, 23] 0.712 1.41 0.376 0.803 13 young/sandia [24] 0.815 0.69 0.404 0.978 14 umist [8, 25, 26] 1.083 1.53 0.464 0.826 15 hwang et al. 
[5] 1.4 1.62 0.528 0.902 16 young [7] 1.584 1.67 0.56 0.978 17 berezan [27] 1.844 1.74 0.601 -0.342 18 brl [4, 8, 28] 1.989 1.78 0.629 0.821 19 hughes (flexural) [8, 18] 2.076 1.7 0.507 0.951 20 bergman [7, 29] 2.083 1.8 0.644 0.811 21 conwep [30, 31] 2.32 1.86 0.679 0.803 22 newton [32] 2.539 1.92 0.709 0.988 23 zaidi et al. [33] 2.559 1.92 0.713 0.82 24 british formula [7, 34] 2.573 1.89 0.59 0.847 25 tolch & bushkovitch [7, 35] 3.461 2.14 0.823 -0.342 26 mod. petry i (k = 2.26 · 10−4) [4, 8, 36, 37] 4.006 2.26 0.885 -0.344 27 tbaa [7, 38] 4.782 2.44 0.975 0.827 28 mod. petry ii (kp = 0.01) [4, 8, 36, 37] 5.065 2.51 0.995 -0.344 29 hughes (tension) [8, 18] 5.794 2.67 1.069 0.699 30 forrestal et al. [7, 39, 40] 5.884 2.69 1.08 0.796 31 mod. forrestal et al. (teland) [41] 6.264 2.78 1.115 0.809 32 li & chen [30, 42] 6.267 2.78 1.114 0.796 33 mod. forrestal et al. (frew) [27] 6.726 2.88 1.154 0.785 34 mod. petry i (k = 3.39 · 10−4) [4, 8] 8.986 3.4 1.318 -0.344 35 wen & yang [43] 17.66 0.18 1.88 -0.036 36 adeli & amin – cubic [7, 28] 102.9 64.3 154.8 0.789 37 adeli & amin – quadratic [7, 28] – -19.8 – 0.789 38 criepi [8, 44] – 0 – 0.795 *modified by authors of this study table 3. accuracy scale according to logarithmic accuracy assessment with accuracy ratio qi, logarithmic standard deviation lsd and correlation coefficient values ρfa for fmj-msc impact. established use in the penetration prediction in the history and good agreement with experimental results according to other researchers. modified hughes equation was added as a representative model since it takes into account tensile strength instead of the unconfined compressive strength. however, all of the tested models were assembled in an accuracy scale table according to their logarithmic accuracy value from the most accurate to the least accurate, resulting in 36 displayed models (table 3). the accuracy ratio, lsd and correlation coefficient values were displayed as well as the control quantities. the modified version of the semi-analytical model, proposed by hwang et al., was evaluated as a most accurate model in this case. however, the original version of this model turned out to be on the safe side and still displayed sufficient accuracy and is, therefore, preferred to use in engineering practice. the original modified ndrc equation overestimated the results, while the first proposed modification of this equation 360 vol. 58 no. 6/2018 ultra-high-performance fibre-reinforced concrete. part ii. figure 2. comparison between experimental and representative prediction model results for dop for fmj-slc projectile impact on uhpfrc and linear regression line for experimental data. by almusallam et al. adjusted the results with the consideration of fibre volumetric fraction. the modified hughes equation, where the high strain-rate effect on the tensile strength from the fib model code 2010 [11] was used, turned out to be relatively accurate, however, it underestimated results in the cases of 1 % and 2 % fibre volumetric content. 7.2. deformable fmj-slc projectile the only models, which take into account projectile deformability, were displayed in this section. the model by rubin & yarin was tested with corresponding material parameters and effective length. in addition, the hydrodynamic limit value [12] was compared with prediction models (figure 2). 
only 10 models were appropriate for the deformable projectile penetration prediction; the other 32 models were developed for rigid projectile penetration and were therefore labelled as irrelevant (table 4). the newest and most analytical model, by rubin & yarin, turned out to be the most accurate, while the hydrodynamic limit provides sufficiently accurate results for this kind of projectile. the model by hwang et al. and its modified version provided a good estimate of the experimental results. the older models turned out to be less accurate, while the bernard model from 1977 provided relatively accurate results on the safe side. applying the rubin & yarin model to the uhpfrc targets yielded a very good agreement with the experimental data; however, it is important to note that the rubin & yarin model was originally developed for an eroding projectile penetration into metallic targets. the formula is limited to the case of long-rod penetration, where both the projectile and the target experience significant plastic flow. in our case, plastic flow can be expected for the higher volumetric contents of fibres in the uhpfrc mixture. consistently, the prediction is slightly better for the higher volumetric contents of fibres (1 % and 2 %) in the uhpfrc mixture than for the lower fibre volumetric fractions. due to the aforementioned reasons, the use of this model for future predictions on uhpfrc targets should be cautious.

| acc. scale | model | ref. | ∑ ln² q | qᵢ | lsd | ρfa |
| 1 | rubin & yarin | [12, 41] | 0.225 | 1.21 | 0.211 | 0.158 |
| 2 | hydrodynamic limit | [41] | 0.913 | 1.48 | 0.426 | -0.079 |
| 3 | bernard (1977) | [27] | 1.022 | 1.51 | 0.448 | 0.266 |
| 4 | healey & weissman | [8, 45] | 1.261 | 0.63 | 0.503 | 0.213 |
| 5 | mod. hwang et al.* | [5] | 1.525 | 1.65 | 0.545 | -0.438 |
| 6 | bernard (1978) | [27] | 1.660 | 1.69 | 0.570 | 0.266 |
| 7 | kar | [8, 45, 46] | 2.051 | 0.56 | 0.641 | 0.314 |
| 8 | bernard & creighton | [27] | 2.931 | 2.01 | 0.755 | 0.268 |
| 9 | hwang et al. | [5] | 3.973 | 2.26 | 0.880 | -0.183 |
| 10 | newton | [32] | 13.76 | 4.55 | 1.644 | -0.078 |

*modified by the authors of this study

table 4. accuracy scale according to the logarithmic accuracy estimation, with accuracy ratio qᵢ, logarithmic standard deviation lsd and correlation coefficient values ρfa, for the fmj-slc impact.

8. results for spalling mass ejection

the prediction models' values were compared with the experimental data, which were transformed from crater volume values to mass ejection based on the concrete bulk density for each fibre volumetric ratio (figure 3). here it must be mentioned that, for the evaluation of the model by abbas et al., the whole projectile diameter was used, since it gave a much more realistic description of the actual mass ejection than the core diameter.

figure 3. comparison between the experimental and the prediction models' values for the mass ejection for fmj-msc.

8.1. non-deformable fmj-msc projectile

it is evident that the newly proposed semi-analytical model by abbas et al. provides a much better correlation with the actual mass ejection than the older empirical relation by sierakowski (figure 3). the latter is correct in the case of the 0.125 % fibre volume fraction; however, this is probably just a coincidental occurrence, since the relation was developed for concretes with compressive strengths around 30 mpa. furthermore, it can be seen that the sierakowski relation does not follow the decrease of the mass ejection values with an increasing fibre volumetric content; the model by abbas et al., however, approximates the experimental values with a sufficient accuracy.
the logarithmic accuracy assessment of the latter reached a value of 0.230, while the correlation coefficient was high: 0.90.

8.2. deformable fmj-slc projectile

in this case, the term for the mass ejection of the cylindrical tunnel in the model by abbas et al. was not calculated, since it did not appear in the experimental work. the results are more similar than in the rigid projectile case, since the sierakowski relation was derived on the supposition of a constant mass, which is only true for the rigid projectile, while the model by abbas et al. was developed on the basis of the modified ndrc equation, which was not corrected for the use of deformable projectile parameters (figure 4). the logarithmic accuracy assessment of the latter was 0.133, while the correlation coefficient was 0.74.

figure 4. comparison between the experimental and the prediction models' values for the mass ejection for fmj-slc.

9. conclusions

a number of prediction models have been applied in the framework of this study and compared to the experimental readings. it can be deduced that the newly proposed and more developed semi-analytical prediction models provide a better fit to the experimental data than the older models. the most accurate model for the non-deformable projectile depth of penetration was the model by hwang et al. with the modified expressions for the cone perimeter and lateral area, while for the deformable projectile the first place was taken by the model by rubin & yarin. furthermore, the newer model by abbas et al. shows a very good agreement with the experimental data for the mass ejection prediction.

acknowledgements

this work was supported by the ministry of interior of the czech republic [project no. vi20172020061]. the authors also acknowledge the assistance of the technical staff at the experimental centre, faculty of civil engineering, czech technical university in prague, and of the students who participated in the project.

references

[1] kravanja s, sovják r. ultra-high-performance fibre-reinforced concrete under high-velocity projectile impact. part i. experiments. acta polytech 2018;58:232–9. doi:10.14311/ap.2018.58.0232
[2] tofallis c. a better measure of relative prediction accuracy for model selection and model estimation. j oper res soc 2015;66:1352–62. doi:10.1057/jors.2014.103
[3] turk g. verjetnostni račun in statistika. delovna različica učbenika 2008: vi, 264 pp.
[4] kennedy rp. a review of procedures for the analysis and design of concrete structures to resist missile impact effects. nucl eng des 1976;37:183–203. doi:10.1016/0029-5493(76)90015-7
[5] hwang h-j, kim s, kang t. prediction of hard projectile penetration on concrete targets. 2016 struct. congr., jeju island, korea: 2016, p. 1–7.
[6] yankelevsky dz. resistance of a concrete target to penetration of a rigid projectile revisited. int j impact eng 2017;106:30–43. doi:10.1016/j.ijimpeng.2017.02.021
[7] teland ja. a review of empirical equations for missile impact effects on concrete. rep ffi/rapport 97/05856, 1997:37.
[8] li qm, reid sr, wen hm, telford ar. local impact effects of hard missiles on concrete targets. int j impact eng 2005;32:224–84. doi:10.1016/j.ijimpeng.2005.04.005
[9] almusallam th, abadel aa, al-salloum ya, siddiqui na, abbas h. effectiveness of hybrid-fibers in improving the impact resistance of rc slabs. int j impact eng 2015;81:61–73. doi:10.1016/j.ijimpeng.2015.03.010
[10] song ps, hwang s.
mechanical properties of high-strength steel fiber-reinforced concrete. constr build mater 2004;18:669–73. doi:10.1016/j.conbuildmat.2004.04.027
[11] fédération internationale du béton. fib model code for concrete structures 2010. 2013.
[12] rubin mb, yarin al. a generalized formula for the penetration depth of a deformable projectile. int j impact eng 2002;27:387–98. doi:10.1016/s0734-743x(01)00061-6
[13] walker jd, anderson ce. the influence of initial nose shape in eroding penetration. int j impact eng 1994;15:139–48. doi:10.1016/s0734-743x(05)80027-2
[14] clifton jr. penetration resistance of concrete: a review. washington dc: 1982.
[15] almusallam th, siddiqui na, iqbal ra, abbas h. response of hybrid-fiber reinforced concrete slabs to hard projectile impact. int j impact eng 2013;58:17–30. doi:10.1016/j.ijimpeng.2013.02.005
[16] abbas h, almusallam t, al-salloum y, siddiqui n. prediction of ejected mass from hybrid-fiber reinforced concrete slabs subjected to impact loads. procedia eng, vol. 173, 2017, p. 77–84. doi:10.1016/j.proeng.2016.12.035
[17] haldar a, hamieh ha. local effect of solid missiles on concrete structures. j struct eng 1984;110:948–60. doi:10.1061/(asce)0733-9445(1984)110:5(948)
[18] hughes g. hard missile impact on reinforced concrete. nucl eng des 1984;77:23–35. doi:10.1016/0029-5493(84)90058-x
[19] haldar a, miller fj. penetration depth in concrete for nondeformable missiles. nucl eng des 1982;71:79–88. doi:10.1016/0029-5493(82)90171-6
[20] bangash myh. impact and explosion: analysis and design. crc press; 1993.
[21] bulson ps. explosive loading of engineering structures: a history of research and a review of recent developments. e & fn spon; 1997.
[22] barr p. guidelines for the design and assessment of concrete structures subjected to impact. 1987.
[23] kennedy rp. effects of an aircraft crash into a concrete reactor containment building. los angeles, calif.: holmes & narver, inc.; 1966.
[24] young cw. penetration equations. albuquerque, nm, and livermore, ca (united states): sandia national laboratories; 1997. doi:10.2172/562498
[25] wen hm, xian yx. a unified approach for concrete impact. int j impact eng 2015;77:84–96.
[26] umist report me/am/02.01/te/g/018507/z. predicting penetration, cone cracking, scabbing and perforation of reinforced concrete targets struck by flat-faced projectiles. 2001.
[27] ben-dor g, dubinsky a, elperin t. high-speed penetration dynamics: engineering models and methods. world scientific; 2013. doi:10.1142/8651
[28] adeli h, amin am. local effects of impactors on concrete structures. nucl eng des 1985;88:301–17. doi:10.1016/0029-5493(85)90165-7
[29] bergman sga. inträngning av pansarbrytande projektiler och bomber i armerad betong. fortf rapport b2, stockholm: 1950.
[30] wang s, le htn, poh lh, feng h, zhang mh.
resistance of high-performance fiber-reinforced cement composites against high-velocity projectile impact. int j impact eng 2016;95:89–104. doi:10.1016/j.ijimpeng.2016.04.013
[31] conwep (conventional weapons effects program). us army engineer waterways experiment station. vicksburg: 1992.
[32] houghton k, clark a, simms h, kuzemczak j. p3_10 extinction event. phys spec top 2012;11.
[33] zaidi ama, bux q, rahman ia, ismail my. development of empirical prediction formula for penetration of ogive nose hard missile into concrete targets. am j appl sci 2010;7:711–6. doi:10.3844/ajassp.2010.711.716
[34] vretblad b. penetration of projectiles in concrete according to forth 1. proc. from work. weapon penetration into hard targets, nor. def. res. establ., 1988.
[35] tolch na, bushkovitch av. penetration and crater volume in various kinds of rocks as dependent on caliber, mass, and striking velocity of projectile. ballistic research laboratory, aberdeen proving ground, md: 1947.
[36] samuely fj, hamann cw. civil protection. the architectural press; 1939.
[37] amirikian a. design of protective structures. report nt-3726, bureau of yards and docks, department of the navy: 1950.
[38] fraser rdg. penetration of projectiles into concrete and soil. n.d.
[39] forrestal mj, altman bs, cargile jd, hanchak sj. an empirical equation for penetration depth of ogive-nose projectiles into concrete targets. int j impact eng 1994;15:395–405. doi:10.1016/0734-743x(94)80024-4
[40] forrestal mj, frew dj, hanchak sj, brar ns. penetration of grout and concrete targets with ogive-nose steel projectiles. int j impact eng 1996;18:465–76. doi:10.1016/0734-743x(95)00048-f
[41] sjol h, teland j, kaldheim o. penetration into concrete: analysis of small scale experiments with 12 mm projectiles. 2002.
[42] li qm, chen xw. dimensionless formulae for penetration depth of concrete target impacted by a non-deformable projectile. int j impact eng 2003;28:93–116. doi:10.1016/s0734-743x(02)00037-4
[43] wen hm, yang y. a note on the deep penetration of projectiles into concrete. int j impact eng 2014;66:1–4. doi:10.1016/j.ijimpeng.2013.11.008
[44] kojima i. an experimental study on local behavior of reinforced concrete slabs to missile impact. nucl eng des 1991;130:121–32. doi:10.1016/0029-5493(91)90121-w
[45] bangash myh. concrete and concrete structures: numerical modelling and applications. elsevier applied science; 1989.
[46] kar a. local effects of tornado-generated missiles. j struct div 1978;104:809–16.
micromechanics based inelastic and damage modeling of composites

p. p. procházka

abstract: micromechanics based models are considered for application to viscoelasticity and damage in metal matrix composites. the method proposes a continuation and development of dvořák's transformation field analysis, considering piecewise uniform eigenstrains in each material phase. standard applications of the method to a two-phase model are considered in this study, i.e., only one sub-volume per phase is considered. a continuous model is used, employing transformation field analysis with softening in order to prevent the tensile stress from overstepping the tensile strength. at the same time, shear cracking occurs in the tangential direction of the possible crack. this is considered in the principal shear stresses, and they produce disconnections in the displacements. in this case, discontinuous models are more promising. because discrete models, which could describe the situation more realistically, have not been worked out in detail, we retain a continuous model and substitute the slip caused by overstepping the damage law by introducing eigenparameters from tfa. the various aspects of the proposed methods are systematically checked by comparing with finite element unit cell analyses, made through periodic homogenization assumptions, for sic/ti unidirectional lay-ups.

keywords: elastic-viscoelastic material, anisotropic material, constitutive behavior, inhomogeneous material, microcracking.

1 introduction

there are many different methods and tools that can be used to deliver the macroscopic constitutive response of heterogeneous materials from a local description of the microstructure behavior. here we are concerned with non-linear behavior caused by inelasticity of the constituents or by the initiation and growth of damage. in developing the homogenization procedures for non-linear materials we have to define both the homogenization step itself (from local variables to overall variables) and the often more complicated localization step (from overall controlled quantities to the corresponding local quantities). as practical tools in the homogenization framework we have mainly three categories of methods involving the linearly elastic behavior of the composite aggregate:

- first, we consider exact solutions and theorems that deliver variational bounds on the overall constitutive parameters. at that level, we can mention the voigt and reuss bounds, the hashin-shtrikman bounds in the linear context, and more recent works (ponte castañeda, 1991; ponte castañeda and willis, 1995; ponte castañeda and suquet, 1998), which give pertinent results for inelastic behavior.
- second, good estimates of the overall mechanical properties are provided by techniques based on generalizations of the eshelby method, proposed in 1957. these techniques are particularly useful for situations with random microstructures (multiconnected phases, randomly distributed shapes, sizes and orientations of lay-ups, polycrystalline aggregates, etc.), see (šejnoha m., zeman j., šejnoha j., 2000), (zeman j., šejnoha j., šejnoha m., 2000), (šejnoha m., šejnoha j., zeman j., 2002), (šejnoha j., šejnoha m., 2001), (procházka p., šejnoha j., 2000), (procházka p., šejnoha j., 2003), (šejnoha j., šejnoha m., 2000), (šejnoha j., šejnoha m., 2003).
- third, numerical techniques, most often based on the assumption of microstructure periodicity. in these cases, which may be quite particular, periodic homogenization techniques (suquet, 1998) are able to describe the local stress (and strain) fields and their evolutions very correctly. they also deliver the overall stress-strain behavior of the considered representative volume element of the material. use is made of finite element methods or fast fourier transform solutions. these numerical methods are limited to quasi-periodic situations. moreover, their cost is very high and their direct use in a true structural analysis is at present limited to very special cases.

nonlinear problems of localization and homogenization are of great importance today. not only do classical composites suffer from deterioration of the material due to hereditary problems (aging, viscoelasticity); composite materials prepared in a special way can, on the other hand, improve the properties of other materials, and the resulting effect can be much better than before. in this case, nonlinear and time dependent behavior has to be taken into account.

the present paper develops constitutive equations for inelasticity and damage of heterogeneous materials that benefit from some specificities of a special boundary element method. on the one hand, we need to obtain better approximations of the local stress and strain fields than are provided by eshelby based approaches, especially when considering damage and failure conditions. on the other hand, we want to simplify the numerical techniques of overall homogenization in order to obtain a treatable system of equations that can recover the status of a constitutive equation. in this paper we restrict ourselves to the application and improvement of a method proposed initially by laws (laws, 1973; dvorak, 1992). the most elaborated version, called transformation field analysis (tfa) (dvorak and benveniste, 1992), incorporates in the same framework thermo-elasticity, plasticity and viscoelasticity (dvorak, 1992), or even piezoelectric-elasticity couplings (benveniste, 1993). it can be used either with a low number of sub-volumes (or subphases), typically one subvolume per actual phase, then recovering the context of eshelby type approaches, or with a larger number of sub-volumes, corresponding to some simplifications (or averaging) of the finite element based methods. we note the main lines of the tfa method, and some of its properties for two-phase exploitations (dvorak and benveniste, 1992). since we know the relative "stiffness" of the method when applied to a two-phase system (uniform fields on each phase), we propose and exploit a correction method that takes advantage of a "tangent formulation" under asymptotic conditions (section 3). introducing damage effects through a continuum damage mechanics (cdm) formulation within each sub-volume, the models will be based on numerical results obtained via the overall strain over the external boundary of the representative volume element (rve) and the localization process.
the boundary element method is used as a very powerful tool for problems of this kind. the method has proved to be appropriate particularly for solving the contact problem of debonding of fibers from the matrix, and also for the "softening" of the matrix due to damage, cracking, and other continuum phenomena.

2 transformation field analysis

2.1 local and overall constitutive equations

we begin with the expression for the local elastic constitutive equations, assuming a uniform elastic stiffness $L_r$ over each sub-volume $V_r$. the local stresses $\sigma_r(x)$ and strains $\varepsilon_r(x)$ are related by:

$$\sigma_r(x) = L_r\,\varepsilon_r(x) + \lambda_r(x), \qquad \varepsilon_r(x) = L_r^{-1}\,\sigma_r(x) + \mu_r(x). \tag{1}$$

the prescribed distribution of local eigenstrains is denoted $\mu_r(x)$, and $\lambda_r(x) = -L_r\,\mu_r(x)$ is the corresponding eigenstress field. these eigenstrains can be thermal strains, plastic strains, or transformation strains. on the macroscale, the overall (uniform) strain $e$ and stress $\sigma$ are related in the same way by:

$$\sigma = L\,e + \lambda, \qquad e = L^{-1}\sigma + \mu, \tag{2}$$

where $L$ is the overall elastic stiffness matrix and $\lambda$ and $\mu$ are the overall eigenstresses and eigenstrains. in the case of pure elasticity, the eigenstrains and eigenstresses may be given by a change of temperature, swelling, watering, or prestressing, for example. they are not unknowns, but are given in advance. then the following procedures are not of interest to us, as the expressions for the internal states in the composite body can be calculated in a much simpler way. in pure elasticity, concentration factors $A$ and $B$ are sought, such that:

$$\varepsilon_r(x) = A_r(x)\,e, \qquad \sigma_r(x) = B_r(x)\,\sigma. \tag{3}$$

it is immediately clear that the relationship between the two concentration factors follows from (1), (2) and (3) as:

$$L_r\,A_r(x) = B_r(x)\,L. \tag{4}$$

note that, according to dvorak and benveniste (1992), it holds that:

$$L = \sum_r c_r\,L_r\,A_r. \tag{5}$$

in this way the overall stiffness can be calculated, knowing the stiffnesses, the strain concentration factors and the volume fractions $c_r = V_r / V$ of both phases. similarly, the eigenparameters can be determined from the stress and strain concentration factors and the volume fractions as:

$$\lambda = \sum_r c_r\,A_r^{\mathrm T}\,\lambda_r, \qquad \mu = \sum_r c_r\,B_r^{\mathrm T}\,\mu_r. \tag{6}$$

note that the lippmann-schwinger equation is used in this text and will be derived in a slightly different way than usual. the lippmann-schwinger equation relates the strains, the strains of a comparative medium and the eigenstrain (eigenstress) field. some particular implementations of this equation enable us to derive very important results, using the theory of generalized functions (distributions). the lippmann-schwinger equation has the form:

$$\varepsilon(x) = \varepsilon^0 - \int_\Omega \Gamma(x - x')\left[(L(x') - L^0)\,\varepsilon(x') + \lambda(x')\right]\mathrm{d}x', \tag{7}$$

in which $\varepsilon^0$ denotes the strain field (in our case considered to be uniform) that would exist in a comparable homogeneous medium $L^0$ under the same boundary conditions. the kernel $\Gamma$ is defined by:

$$\Gamma_{ijkl}(x - x') = \tfrac{1}{2}\left[g_{ik,jl}(x - x') + g_{jk,il}(x - x')\right], \tag{8}$$

where $g$ is the green function of the homogeneous medium $L^0$, obeying:

$$L^0_{ijkl}\,g_{kp,lj}(x - x') + \delta_{ip}\,\delta(x - x') = 0, \tag{9}$$

where $\delta_{ip}$ is the kronecker symbol and $\delta(x - x')$ is the dirac distribution.

2.2 localization and homogenization

in this section the eigenstrain and eigenstress are dealt with and involved in the computation. no body forces are present. let us consider the coordinate system $0y_1y_2$ in 2d (for the sake of simplicity we restrict ourselves to 2d, while the generalization to 3d is straightforward), and a bounded domain $\Omega$ (the unit cell) composed of the fiber and matrix parts, $\Omega = \Omega_f \cup \Omega_m$, $\Omega_f \cap \Omega_m = \emptyset$, see fig. 1.
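before turning to the unit cell of fig. 1, the phase-wise relations (1)-(6) can be made concrete with a minimal two-phase sketch. the stiffness values are those of the example in section 7; the concentration factors are mere placeholders here (in the paper they come from the bem solution), chosen only so that they combine correctly with the volume fractions. a sketch under these assumptions, not the paper's implementation:

```python
import numpy as np

def isotropic_stiffness(e, nu):
    """plane-strain stiffness matrix l for young modulus e and poisson ratio nu."""
    c = e / ((1.0 + nu) * (1.0 - 2.0 * nu))
    return c * np.array([[1.0 - nu, nu, 0.0],
                         [nu, 1.0 - nu, 0.0],
                         [0.0, 0.0, (1.0 - 2.0 * nu) / 2.0]])

l_f = isotropic_stiffness(210.0, 0.16)   # fiber (values of section 7)
l_m = isotropic_stiffness(17.0, 0.30)    # matrix
c_f, c_m = 0.6, 0.4                      # volume fractions c_r = v_r / v

a_f = np.eye(3)                          # placeholder strain concentration factor
a_m = (np.eye(3) - c_f * a_f) / c_m      # so that c_f a_f + c_m a_m = i holds

# overall stiffness, eq. (5): l = sum_r c_r l_r a_r
l = c_f * l_f @ a_f + c_m * l_m @ a_m

# overall eigenstress, eq. (6), with the phase eigenstress lambda_r = -l_r mu_r
mu_f, mu_m = np.zeros(3), np.array([1e-3, 1e-3, 0.0])   # e.g. matrix swelling
lam = c_f * a_f.T @ (-l_f @ mu_f) + c_m * a_m.T @ (-l_m @ mu_m)

e = np.array([1e-3, 0.0, 0.0])           # overall strain
sigma = l @ e + lam                      # overall stress, eq. (2)
print(l.round(2), sigma)
```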
fig. 1: the unit cell – geometry and denotation

we define the average quantities $\langle\,\cdot\,\rangle$ by:

$$\langle a \rangle = \frac{1}{\operatorname{meas}\Omega}\int_\Omega a(y)\,\mathrm{d}y, \tag{10}$$

where $\operatorname{meas}\Omega$ stands for the volume (area) of $\Omega$. the homogenization can start by introducing the overall (average) strain $e$, or the stress tensor $\sigma$, or periodic conditions can be prescribed. the latter conditions are the most useful in applications. there are plenty of other boundary conditions, which are less important than those we have mentioned. from (10) we get, for example:

$$e = \langle\varepsilon\rangle = \frac{1}{\operatorname{meas}\Omega}\int_\Omega \varepsilon(y)\,\mathrm{d}y, \tag{11}$$

$$\sigma = \langle\sigma\rangle = \frac{1}{\operatorname{meas}\Omega}\int_\Omega \sigma(y)\,\mathrm{d}y. \tag{12}$$

without loss of generality, we focus only on a given $e$. then we have to solve the problem:

$$\operatorname{div}\sigma(y) = 0 \ \text{in}\ \Omega, \qquad u(y) = e\,y \ \text{on}\ \partial\Omega. \tag{13}$$

the real displacement $u$ and the real strain $\varepsilon$ may be written in the form of the sum of $e$ and the fluctuating terms $u'$ and $\varepsilon'$ as:

$$u(y) = e\,y + u'(y), \qquad \varepsilon(y) = e + \varepsilon'(y). \tag{14}$$

in the case of elastic behavior of both the fiber and the matrix and of the fiber-matrix interface, it holds that $\langle\varepsilon'\rangle = 0$.

the procedure for solving for $\varepsilon$ is split into two steps. first, let $u^0$, $\varepsilon^0$ and $\sigma^0$ be the known displacement, strain and stress fields, respectively, defined on a comparative medium $L^0$. the linear hooke law relates the stresses and strains:

$$\sigma^0 = L^0\varepsilon^0 \ \text{in}\ \Omega, \qquad u^0 = e\,y \ \text{on}\ \partial\Omega. \tag{15}$$

the matrix $L^0$ is not yet fixed. an obvious option is to put $\varepsilon^0 = e$, and in the sense of the above definitions $L^0$ can be expected to be the sought average stiffness. it will be seen that this assumption is not so easy. in the second step, a geometrically identical body is considered, which is heterogeneous, anisotropic, and may exhibit nonlinear behavior. the displacements $u$, the strains $\varepsilon$ and the stresses $\sigma$ are unknown, and the generalized hooke law including the eigenstresses $\lambda$, or eigenstrains $\mu$, holds:

$$\sigma = L\varepsilon + \lambda \ \text{in}\ \Omega, \qquad u = e\,y \ \text{on}\ \partial\Omega, \qquad \lambda = -L\mu. \tag{16}$$

similarly to [19], we define the symmetric polarization tensor by:

$$\tau = \sigma - L^0\varepsilon. \tag{17}$$

we also define

$$u' = u - u^0 \ \text{in}\ \Omega, \qquad u' = 0 \ \text{on}\ \partial\Omega, \tag{18}$$

$$\varepsilon' = \varepsilon - \varepsilon^0, \qquad \sigma' = \sigma - \sigma^0. \tag{19}$$

our aim is to obtain the relations between the strains, or stresses, and the eigenstrains, or eigenstresses. since both $\sigma$ and $\sigma^0$ are statically admissible, the following equations have to be satisfied in the sense of distributions:

$$\operatorname{div}\sigma' = 0 \ \text{in}\ \Omega, \tag{20}$$

$$\sigma' = [L]\,\varepsilon + L^0\varepsilon' + \lambda \ \text{in}\ \Omega, \tag{21}$$

$$u' = 0 \ \text{on}\ \partial\Omega, \tag{22}$$

where

$$[L] = L(y) - L^0. \tag{23}$$

subtracting (17) from (16) yields

$$\sigma = L^0\varepsilon + \tau. \tag{24}$$

3 numerical derivation of quantities

the solution of problems involving the linear or nonlinear behavior of composite bodies is mostly formulated in terms of integral equations. consequently, a natural way to solve these problems is to describe the behavior of such bodies by the boundary integral equation method. the integral equation equivalent to (21) to (24) can be expressed as:

$$u'(\xi) = \int_{\partial\Omega} u^*(\xi; y)\,p'(y)\,\mathrm{d}\Gamma(y) - \int_{\partial\Omega} p^*(\xi; y)\,u'(y)\,\mathrm{d}\Gamma(y) + \int_\Omega \varepsilon^*(\xi; y)\,\tau(y)\,\mathrm{d}\Omega(y), \quad \xi\in\Omega, \tag{25}$$

where the starred quantities are the given kernels. due to the validity of the boundary condition (22), $u' = 0$ on $\partial\Omega$ and the second integral on the right-hand side of (25) disappears. therefore, we get:

$$u'(\xi) = \int_{\partial\Omega} u^*(\xi; y)\,p'(y)\,\mathrm{d}\Gamma(y) + \int_\Omega \varepsilon^*(\xi; y)\,\tau(y)\,\mathrm{d}\Omega(y). \tag{26}$$

differentiating the latter equation successively with respect to the coordinates of $\xi$, we arrive at the expression

$$\varepsilon'(\xi) = \int_{\partial\Omega} h^*(\xi; y)\,p'(y)\,\mathrm{d}\Gamma(y) + \int_\Omega \theta^*(\xi; y)\,\tau(y)\,\mathrm{d}\Omega(y) + \Lambda\,\tau(\xi), \quad \xi\in\Omega. \tag{27}$$

the convected term $\Lambda\,\tau(\xi)$ arises at the internal point of $\Omega$ from the interchange of integration and differentiation when deriving (27) from (26). note that this unpleasant term may be avoided by introducing eshelby's trick, which is famous in the theory of composite materials. levin, in his work on thermal bounds, uses a similar trick leading to the omission of the convected term.
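the starred kernels in (25)-(27) are the fundamental solutions of the comparative medium. as a minimal illustration, the sketch below evaluates the standard 2d (plane-strain) kelvin displacement kernel, a usual choice for an isotropic $L^0$; this particular kernel is an assumption for illustration, since the paper does not spell the kernels out.

```python
import math

def kelvin_u(xi, y, mu, nu):
    """u*_ij(xi; y) = [(3 - 4 nu) ln(1/r) d_ij + r,i r,j] / (8 pi mu (1 - nu)),
    the 2d plane-strain kelvin fundamental solution; r is the distance and
    r,i its derivatives with respect to the field point coordinates."""
    r1, r2 = y[0] - xi[0], y[1] - xi[1]
    r = math.hypot(r1, r2)
    dr = (r1 / r, r2 / r)
    c = 1.0 / (8.0 * math.pi * mu * (1.0 - nu))
    u = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(2):
        for j in range(2):
            u[i][j] = c * ((3.0 - 4.0 * nu) * math.log(1.0 / r) * (i == j)
                           + dr[i] * dr[j])
    return u

# source point at the origin, field point inside the unit cell
print(kelvin_u((0.0, 0.0), (0.5, 0.25), mu=6.5, nu=0.3))
```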
4 numerical interpretation of tfa

now our goal is to derive the strain-eigenstress relation of the form

$$\varepsilon = A\,e + G\,\lambda, \tag{28}$$

where $A$ and $G$ are the influence function tensors ($A$ is mostly referred to as the mechanical concentration function tensor). note that, once computed, these matrices do not change their values during the incremental process for the nonlinear solution of plasticity and damage. after discretization of the boundary and of the domain $\Omega$ into $m$ internal cells, we get the discretized form of the integral equations:

$$U\,p + S\,\varepsilon + \bar S\,\lambda = 0, \tag{29}$$

$$\varepsilon = e + H\,p + \Theta\,\varepsilon + \bar\Theta\,\lambda, \tag{30}$$

where $U$ is a square matrix ($2n \times 2n$) and $2n$ is the number of degrees of freedom on the boundary in 2d (when using, e.g., a linear approximation of both tractions and displacements, $n$ is the number of nodal points), $p$ is the vector ($2n$) of the discretized unknown tractions at the nodal points on $\partial\Omega$, $S$ and $\bar S$ are the matrices ($2n \times 3m$) of the influences of the strains and eigenstrains in the discretized domain (in 2d three components of the strain tensor are independent), $H$ is a ($3m \times 2n$) matrix, and, finally, $\Theta$ and $\bar\Theta$ are square matrices ($3m \times 3m$). since the system is well posed, the regular matrix $U$ may be inverted. elimination of $p$ from both of the latter equations provides

$$W\,\varepsilon = e + T\,\lambda, \tag{31}$$

where

$$W = I - \Theta + H\,U^{-1}S, \qquad T = \bar\Theta - H\,U^{-1}\bar S. \tag{32}$$

obviously, $W$ is a regular ($3m \times 3m$) matrix, since for a given component of $\lambda$ a unique response $\varepsilon$ may be expected. the sought influence function tensor $G$ is equal to $W^{-1}T$, while $W^{-1}$ is the mechanical concentration function tensor $A$.

now we turn to the concentration factors for the phases. there is still a certain freedom in selecting the matrix $L^0$. let us put $\lambda = 0$ and then successively $L^0 = L_f$ and $L^0 = L_m$ in (23). then obviously

$$[L] = L(y) - L_f \quad \text{for } L^0 = L_f \tag{33}$$

and

$$[L] = L(y) - L_m \quad \text{for } L^0 = L_m. \tag{34}$$

in the same way (21) changes, and the influence on the integrals (25) to (27) appears as:

$$u'(\xi) = \int_{\partial\Omega} u^*(\xi; y)\,p(y)\,\mathrm{d}\Gamma(y) - \int_{\partial\Omega} p^*(\xi; y)\,u(y)\,\mathrm{d}\Gamma(y) + \int_{\Omega_m} \varepsilon^*(\xi; y)\,(L(y) - L_f)\,\varepsilon(y)\,\mathrm{d}\Omega(y), \quad \xi\in\Omega, \tag{35}$$

$$\varepsilon(\xi) = e + \int_{\partial\Omega} h^*(\xi; y)\,p(y)\,\mathrm{d}\Gamma(y) + \int_{\Omega_m} \theta^*(\xi; y)\,(L(y) - L_f)\,\varepsilon(y)\,\mathrm{d}\Omega(y) + \Lambda\,(L(\xi) - L_f)\,\varepsilon(\xi), \tag{36}$$

$$\varepsilon(\xi) = e + \int_{\partial\Omega} h^*(\xi; y)\,p(y)\,\mathrm{d}\Gamma(y) + \int_{\Omega_f} \theta^*(\xi; y)\,(L(y) - L_m)\,\varepsilon(y)\,\mathrm{d}\Omega(y) + \Lambda\,(L(\xi) - L_m)\,\varepsilon(\xi). \tag{37}$$

in (36) and (37) it is important that the integration is carried out over only one phase, and the second one does not explicitly enter the formulas. this fact can principally speed up the computation, particularly if the fiber volume ratio is very small (fiber reinforced concrete) or very large (classical composites). because of the well-known identity

$$c_f\,A_f + c_m\,A_m = I, \tag{38}$$

one concentration factor can be calculated from either (36) or (37), and the remaining one follows from (38). note that for an unlimited domain $\Omega$ the lippmann-schwinger equation is fully fulfilled. the last relation makes clear why we concentrated our attention on the splitting of the integrals into an integration over the fiber and an integration over the matrix.
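as a side check, the elimination (29)-(32) and the decomposition (28) can be exercised independently of any actual bem discretization. a minimal sketch with random placeholder matrices, assuming only the dimensions quoted above; it illustrates the linear algebra, not real influence coefficients:

```python
import numpy as np

rng = np.random.default_rng(0)
n2, m3 = 8, 12                 # 2n boundary unknowns, 3m cell strain components
u = rng.normal(size=(n2, n2)) + 4.0 * np.eye(n2)   # regular by construction
s = rng.normal(size=(n2, m3)); s_bar = rng.normal(size=(n2, m3))
h = rng.normal(size=(m3, n2))
th = 0.1 * rng.normal(size=(m3, m3)); th_bar = 0.1 * rng.normal(size=(m3, m3))

e = rng.normal(size=m3)        # overall strain term entering (30)
lam = rng.normal(size=m3)      # cell eigenstresses

# (32): w = i - theta + h u^-1 s,  t = theta_bar - h u^-1 s_bar
hui = h @ np.linalg.inv(u)
w = np.eye(m3) - th + hui @ s
t = th_bar - hui @ s_bar

eps = np.linalg.solve(w, e + t @ lam)            # (31): w eps = e + t lam
a, g = np.linalg.inv(w), np.linalg.solve(w, t)   # eq. (28): eps = a e + g lam
assert np.allclose(eps, a @ e + g @ lam)
print(eps[:3])
```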
because of (38) we can simply calculate the integrals either over the matrix or over the fiber exclusively. then, using (38), we get the second concentration factor needed. hence, in the case of elastic behavior, the homogenization is straightforward:

$$\sigma = \langle\sigma\rangle = c_f\,\langle\sigma\rangle_f + c_m\,\langle\sigma\rangle_m = \left(c_f\,L_f\,A_f + c_m\,L_m\,A_m\right) e, \tag{39}$$

and the elastic stiffness matrix appears to be:

$$L^* = c_f\,L_f\,A_f + c_m\,L_m\,A_m. \tag{40}$$

it is worth noting that the stress concentration factors can be derived in the same way:

$$\sigma_r = B_r\,\sigma, \quad r = f, m. \tag{41}$$

note that in the general case relation (41) is calculated from the given $\sigma$. this leads to similar integral equations, but the solution is not the same. since the external forces are equilibrated, the rigid motion of the unit cell is disregarded, and the solution is then unique. here we use relation (41) and the possibility of inverting $L^*$. including the eigenparameters in the calculation, we can derive from (35) to (37) the relations

$$\varepsilon_r = A_r\,e + \sum_k D_{rk}\,\mu_k, \qquad \sigma_r = B_r\,\sigma + \sum_k F_{rk}\,\lambda_k, \quad r = f, m. \tag{42}$$

the relations (42) are the starting points for the theories established by dvorak. he assumes that the concentration functions are estimated by approximate formulas following mori-tanaka or the self-consistent method. the calculations presented in this paper are considered to be very accurate and fast.

5 viscoelasticity

in linear viscoelasticity it is always possible to write the stress-strain relation in a form similar to that of elasticity, with the terms of the $L^*$ matrix now representing suitable differential or integral operators rather than elastic constants. thus, in an isotropic continuum, a pair of operators corresponding to an appropriate pair of elastic constants will appear, while for an anisotropic material up to 9 separate operators may be necessary. typically, the viscoelastic part of the strain may thus be described by

$$\varepsilon_c = L^{-1}\sigma, \tag{43}$$

where each term of the viscoelastic matrix $L^{-1}$ may take the form of kelvin chains. this, as is well known, can be interpreted as a response of the form:

$$\varepsilon_n(t) = \left(a_n + b_n\,\mathrm{e}^{-s_n t}\right)\sigma_s. \tag{44}$$

the increments of each such term in a time interval may now be found from the above expression, from the knowledge of the current value of the appropriate stress component $\sigma_s$ and the current value of $\varepsilon_n$. thus it becomes necessary to store only a finite number of such terms $\varepsilon_n$ at their current values to represent the full history effect. the viscoelastic strains can be treated as eigenstrains, the value $a_n$ representing the spring and the value $b_n$ representing the dashpot.
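a minimal sketch of the incremental update implied by (44): under a stress held constant over a step, each stored term relaxes exponentially toward its spring asymptote, so only the current values of the terms need to be stored. the parameter values are illustrative assumptions, not the paper's data.

```python
import math

def kelvin_chain_step(eps_n, sigma, a, s, dt):
    """advance each kelvin-chain term (44) by one time step dt under a
    stress sigma held constant over the step; a_n is the spring compliance,
    s_n the inverse retardation time of the dashpot."""
    new = []
    for e_cur, a_n, s_n in zip(eps_n, a, s):
        e_inf = a_n * sigma                           # asymptotic (spring) strain
        new.append(e_inf + (e_cur - e_inf) * math.exp(-s_n * dt))
    return new

a = [2.0e-5, 1.0e-5]       # spring compliances a_n (illustrative)
s = [0.5, 0.05]            # inverse retardation times s_n (illustrative)
eps_n = [0.0, 0.0]
for step in range(10):
    eps_n = kelvin_chain_step(eps_n, sigma=100.0, a=a, s=s, dt=1.0)
print(sum(eps_n))          # viscoelastic strain, treated as an eigenstrain
```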
6 damage properties

damage effects are split into two parts. the first part is connected with the exclusion of a tensile stress in the principal direction which oversteps some prescribed value (the tensile strength $\sigma^+$). note that if the tensile strength is different from zero, the mathematical formulations lead to problems, the correct solution of which is not yet fully proved. nevertheless, the mechanical results are reasonable, and for practical applications they are treated as correct, at least from the mechanical standpoint. if the principal stress is too low, the stiffness is weaker. the tests are carried out in each cell of the discretization into boundary elements and internal cells. the second criterion of damage is a violation of the mohr-coulomb hypothesis in the principal shear direction. because the outcome of such a violation is shear cracks (disconnections of the displacements), the shear modulus is recalculated for the new stage.

7 example

since the procedure leads to linear relations at each stage, for the given (step by step increasing) values of the components of either the stress tensor (elastic and relaxation part) or the strain tensor (elastic and viscoelastic part) as "unit impulses", we do not need to recompute the influence tensors. now the procedure fully described in [15] can be used. note that in [15] the values of the concentration tensors, which are the most important quantities for the numerical computation, are computed by very approximate methods (mori-tanaka, etc.).

a quarter of a unit cell is considered, with a fiber volume ratio equal to 0.6, according to fig. 2. we used the following elastic material properties of the phases: young's modulus of the fiber $E_f = 210$ mpa, poisson's ratio $\nu_f = 0.16$; for the matrix $E_m = 17$ mpa and $\nu_m = 0.3$. for a fiber volume ratio of 0.6, the radius of the fiber is $r = 0.714$. the homogenized elastic matrix $L^*$ in this case possessed the following values:

$$L^* = \begin{pmatrix} 182 & 62 & 0.05 \\ 62 & 182 & 0.034 \\ 0.75 & \cdot & 11.98 \end{pmatrix}.$$

from the above matrix we can conclude that the responses to the normal unit strains are computed with high accuracy (compare the symmetry), while the results for the shearing strains are less accurate, but still very precise. the angle of internal friction was $\phi = 25°$ and the cohesion $c = 270$ kpa. the results are depicted in fig. 3. in the upper and lower parts of the illustration the damage is obvious (debonding occurred), and on both sides viscoelastic behavior prevails.

the computation was run on a pentium iv pc, 2.6 ghz, in fortran. a program for generating the meshes of the internal cells and also of the boundary nodes had been prepared, as is clearly shown in fig. 2. according to the wish of the user, the meshing can be improved. the time needed for computing even a large system of equations (150×150), which can be stored in memory without the use of a hard disc or an extended/expanded memory, was negligible at each step. our illustration does not reach such dimensions of computation. it is also not necessary, in such problems, to increase the precision of the meshing, which would reduce the efficiency. the iteration at each step of loading was also very fast. it is worth noting that similar computations were carried out using fem, but a finer meshing had to be imposed to get results comparable with the bem procedure presented here. the comparison was tested in such a way that the sum of the concentration factors would be the unit tensor.

fig. 2: geometry of the boundary elements and internal cells

fig. 3: distribution of $\sigma_{11}$ and $\varepsilon_{11}$
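the two cell-wise damage checks of section 6 can be sketched as follows, using the example's parameters $c = 270$ kpa and $\phi = 25°$; the tensile strength value and the stress state are assumed for illustration (tension is taken positive, and the mohr-coulomb condition is written in its standard principal-stress form).

```python
import math

def principal_2d(sxx, syy, sxy):
    """principal stresses of a plane stress state, s1 >= s2."""
    cen = 0.5 * (sxx + syy)
    rad = math.hypot(0.5 * (sxx - syy), sxy)
    return cen + rad, cen - rad

def damage_flags(sxx, syy, sxy, sigma_plus=2.7e6, c=270.0e3, phi_deg=25.0):
    """(i) tensile cut-off: major principal stress vs. tensile strength;
    (ii) mohr-coulomb: tau_m <= c cos(phi) - sigma_m sin(phi),
    with tau_m = (s1 - s2)/2 and sigma_m = (s1 + s2)/2."""
    s1, s2 = principal_2d(sxx, syy, sxy)
    phi = math.radians(phi_deg)
    tensile_crack = s1 > sigma_plus
    shear_crack = 0.5 * (s1 - s2) > c * math.cos(phi) - 0.5 * (s1 + s2) * math.sin(phi)
    return tensile_crack, shear_crack

# stresses in pa: mild tension in one direction, compression in the other
print(damage_flags(1.0e6, -3.0e6, 0.8e6))
```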
8 conclusions

this paper has presented the fundamental idea of a numerical procedure leading to the overall viscoelastic and damage behavior of the matrix in a composite aggregate on a unit cell. based on transformation field analysis, it is possible to obtain the strain and stress influence matrices that relate the strains to the eigenstrains and eigenstresses. since this problem leads to integral equations, the most suitable numerical tool appears to be the bem. the obvious advantage of this procedure is found in the a priori computed influence matrices (concentration tensors). these may be stored in the computer and, hence, the iteration process for solving the nonlinear material behavior of the structures is very efficient. moreover, attention may be focused on the integrals over either the matrix or the fiber. a very important property of the above procedure is the linearity of the problem at each stage. the accuracy of the overall stiffness (compliance) matrix does not depend on the size of the step, providing there is no "unwanted" unloading in any internal cell at the current stage. when this is the case, the time is slightly extended, as the standard iterative process has to be carried out. this was not the case in our computations. although it is not our intention to discuss cases of eigenstrain fields, it is appropriate to mention the connection between the present formulas and those obtained in studies of the thermoelastic response of composite materials subjected to a uniform change of temperature.

9 acknowledgment

this paper was supported by gačr, grant no. 103/04/1178, and partially by research project msm 210000001.

references

[1] ponte castañeda p.: "the effective mechanical properties of nonlinear isotropic composites." j. mech. phys. solids, vol. 39 (1991), p. 5–71.
[2] ponte castañeda p., suquet p.: "nonlinear composites." adv. appl. mech., vol. 34 (1998), p. 171–302.
[3] ponte castañeda p., willis j.: "the effect of spatial distribution on the effective behavior of composite materials and cracked media." j. mech. phys. solids, vol. 43 (1995), no. 12, p. 1919–1951.
[4] šejnoha m., zeman j., šejnoha j.: "evaluation of effective thermoelastic properties of random fibrous composites." engineering modelling, vol. 13 (2000), p. 61–68.
[5] zeman j., šejnoha j., šejnoha m.: "effective properties of statistically homogeneous random composites." advances in composite materials and structures vii. southampton: wit press, 2000, p. 143–152.
[6] šejnoha m., šejnoha j., zeman j.: "inelastic response of random fibrous composites." contemporary research in theoretical and applied mechanics. blacksburg, va: virginia polytechnic institute and state university, 2002, p. 214.
[7] šejnoha j., šejnoha m.: "multi-scale modeling of composites with randomly distributed phases." theoretical and applied mechanics. sofia: bulgarian academy of sciences demetera ltd., 2001, p. 141–152.
[8] procházka p., šejnoha j.: "the bem formulation of statistically distributed fibers in composites." betech xxii, cambridge: wit press, 2000, p. 339–350.
[9] procházka p., šejnoha j.: "a bem formulation for homogenization of composites with randomly distributed fibers." engineering analysis with boundary elements, vol. 71 (2003), p. 137–144.
[10] laws n.: "on the thermostatics of composite materials." j. mech. phys. solids, vol. 21 (1973), p. 9–17.
[11] dvorak g.: "transformation field analysis of inelastic composite materials." proc. royal soc. london a, vol. 437 (1992), p. 311–327.
[12] dvorak g., benveniste y.: "on transformation strains and uniform fields in multiphase elastic media." proc. royal soc. london a, vol. 437 (1992), p. 291–310.
[13] benveniste y.: "universal relations in piezoelectric composites with eigenstress and polarization fields." parts i and ii. j. applied mechanics, vol. 60 (1993), p. 265–275.
[14] procházka p., šejnoha j.: "eigenparameters optimization of strain energy function." ctu reports, prague: ctu, vol. 3 (1999), p. 129–138.
[15] procházka p., šejnoha j.: "extended hashin-shtrikman variational principles." to appear in applications of mathematics (2004).
[16] šejnoha m., šejnoha j.: "multi-scale modeling of heterogeneous materials." slovak journal of civil engineering, vol. 8 (2000), p. 33–44.
[17] wierer m., šejnoha j., šejnoha m.: "evaluation of effective properties of woven composite tubes." inženýrská mechanika 2002. brno: institute of mechanics of solids, faculty of mechanical engineering, brno university of technology, 2002, p. 325–326.
[18] válek m., procházka p., šejnoha j.: "on a contact problem in composite materials." computational methods in contact mechanics v. southampton: wit press, 2001, p. 165–174.
[19] šejnoha m., šejnoha j., procházka p.: "modelling of interfacial separation of mmc composites." journal of theoretical and applied mechanics, vol. 4 (1999), p. 23–32.
[20] krejčí t., šejnoha m., šejnoha j.: "modeling of time dependent properties of heterogeneous materials." engineering mechanics 2000. prague: academy of sciences of the czech republic, institute of theoretical and applied mechanics, 2000, vol. i, p. 175–180.

prof. ing. rndr. petr pavel procházka, drsc.
phone: +420 224 354 480
e-mail: prochazk@fsv.cvut.cz
department of structural mechanics
czech technical university in prague
faculty of civil engineering
thákurova 7
166 29 prague, czech republic

1 introduction

the bre's cardington laboratory is a unique facility for the advancement of the understanding of whole-building performance, see [1]. this facility is located at cardington, bedfordshire, uk, and consists of a former airship hangar with dimensions 48 m × 65 m × 250 m. the cardington laboratory comprises three experimental buildings: a six storey timber structure, a seven storey concrete structure, and an eight storey steel structure. the steel test structure was built in 1993. it is a steel framed structure using composite concrete slabs supported by steel decking in composite action with the steel beams. it has eight storeys (33 m) and is five bays wide (5 × 9 m = 45 m) by three bays deep (6 + 9 + 6 = 21 m) in plan. the structure was built as non-sway, with a central lift shaft and two end staircases providing the necessary resistance to lateral wind loads. the main steel frame was designed for gravity loads, with the connections consisting of flexible end plates for the beam-to-column connections and fin plates for the beam-to-beam connections, designed to transmit vertical shear loads. the building simulates a real commercial office in the bedford area, and all the elements were verified according to the british standards and checked for compliance with the provisions of the structural eurocodes. the building was designed for a dead load of 3.65 kn m⁻² and an imposed load of 3.5 kn m⁻². the floor construction consists of a steel deck and a light-weight in-situ concrete composite floor, incorporating an anti-crack mesh of 142 mm² m⁻¹ in both directions, see [2]. the floor slab has an overall depth of 130 mm and the steel decking has a trough depth of 60 mm. seven large-scale fire tests at various positions within the experimental building were conducted, see [3], and there is still a place for two more tests. the main aim of these compartment fire tests was to assess the behaviour of structural elements with real restraint in a natural fire. the structural integrity fire test (large test no. 7) was carried out in a centrally located compartment of the building, enclosing a plan area of 11 m by 7 m on the 4th floor [4].
the preparatory works took four months. the fire compartment was bounded by walls made of three layers of plasterboard (15 mm + 12.5 mm + 15 mm) with a thermal conductivity of (0.19–0.24) w m⁻¹ k⁻¹. in the external wall the plasterboard was fixed to a 0.9 m high brick wall. an opening 1.27 m in height and 8.7 m in length simulated an open window to ventilate the compartment and to allow for observation of the behaviour of the elements. the ventilation condition was chosen to result in a fire of the required severity in terms of maximum temperature and overall duration. the steel structure exposed to fire consists of two secondary beams (section 305×165×40ub, steel s275, measured fy = 303 mpa; fu = 469 mpa), an edge beam (section 356×171×51ub), primary beams (section 336×171×51ub, steel s350, measured fy = 396 mpa; fu = 544 mpa) and columns, with an internal section 305×305×198uc and an external section 305×305×137uc, steel s350. the joints were a cruciform arrangement of a single column with three or four beams connected to the column flange and web by header plate connections, steel s275. the beam-to-beam connections were created by fin plates, steel s275. the composite behaviour was achieved by a concrete slab (lightweight concrete lw 35/38; measured experimentally by a schmidt hammer, 39.4 mpa) cast over the beams on shear studs (∅ 19 mm, 95 mm long; fu = 350 mpa). the geometry and material properties of the measured section are summarized in table a1, see [2, 5]. the mechanical load was simulated using sandbags, of 1100 kg each, applied over an area of 18 m by 10.5 m on the 5th floor. the sandbags represent the mechanical loadings: 100 % of the permanent actions, 100 % of the variable permanent actions and 56 % of the live actions. the mechanical load was designed to reach the collapse of the floor, based on analytical and fe simulations. wooden cribs with a 14 % moisture content provided the fire load of 40 kg/m² of the floor area, see fig. 1a,b. the columns, the external joints and the connected beams (to about 1.0 m from the joints) were fire protected to prevent a global structural instability. the protection material used was 20 mm of cafco300 vermiculite-cement spray, based on vermiculite and gypsum, see table a2 and fig. 1c,d. it was applied as a single package factory controlled premix, with a thermal conductivity of 0.078 w m⁻¹ k⁻¹.

temperature of steel columns under natural fire

f. wald, p. studecká, l. kroupa

current fire design models for the time-temperature development within structural elements, as well as for the structural behaviour, are based on isolated member tests subjected to standard fire regimes, which serve as a reference heating but do not model a natural fire. only tests on a real structure under a natural fire can evaluate future models of the temperature development in a fire compartment, of the transfer of heat into the structure and of the overall structural behaviour under fire. to study the overall structural behaviour, a research project was conducted on an eight storey steel frame building at the cardington building research establishment laboratory on january 16, 2003. a fire compartment 11×7 m was prepared on the fourth floor. a fire load of 40 kg/m² was applied, with 100 % of the permanent mechanical load and 65 % of the imposed load. the paper summarises the experimental programme and shows the temperature development of the gas in the fire compartment and of the fire protected columns bearing the unprotected floors.

keywords: steel structures, fire design, fire test, compartment temperature, protected steel, natural fire.
the paper describes the temperature development in the fire protected columns. the temperatures predicted analytically and by a 2d fem simulation are compared to the measured values.

fig. 1: a) fire load in the compartment; b) fire load around column d2; c) protection of the internal column d2 (after the test); d) protection of the external column (after the test)

fig. 2: a) location of the thermocouples in the compartment, 300 mm below the ceiling; b) thermocouples on the beam and column ends around the connection

2 instrumentation

the instrumentation used included thermocouples, strain gauges and displacement transducers. a total of 133 thermocouples monitored the temperature of the connections and beams within the compartment, the temperature distribution through the slab and the atmospheric temperature within the compartment, see fig. 2a. an additional 14 thermocouples measured the temperature of the protected columns, see fig. 5. two different types of gauge were used, high temperature and ambient temperature, to measure the strain in the elements. in the exposed and unprotected elements (the fin plate and the minor axis end plate) nine high temperature strain gauges were used. in the protected columns and on the slab a total of 47 ambient strain gauges were installed. 25 vertical displacement transducers were attached along the 5th floor to measure the deformation of the concrete slab. an additional 12 transducers were used to measure the horizontal movement of the columns and the slab. ten video cameras and two thermo-imaging cameras recorded the fire and smoke development, the deformations and the temperature distribution, see [5].

3 fire development

the quantity of the thermal load and the dimensions of the opening in the facade wall were designed to achieve a representative fire in an office building. the openings allowed the fire to develop without a flashover, managed by the combustible timber sticks, see [4]. the temperature grew to reach the plateau of the time-temperature curve in about 18 minutes, with a peak at 54 min, after which cooling began, see fig. 3. the maximum recorded compartment temperature near the wall (2250 mm from d2) was 1107.8 °c after 54 minutes. the predicted value was 1078 °c in 53 min, see [5]. during the heating the temperature was distributed regularly, see fig. 4. the measured differences of the gas temperature decreased during the cooling from 200 °c to 20 °c in 120 min. the measured maximum gas temperatures are summarised in tab. b1 in the time intervals. the average gas temperature is calculated from all sixteen thermocouples.

fig. 3: comparison of the prediction of the gas temperature with the measured temperatures; for the values see table b1

fig. 4: isotherms in the compartment with thermocouples 300 mm under the ceiling; the input data are summarised in table b1

4 column temperatures

the temperatures in the columns in the fire compartment were measured at the middle of the compartment's height, 500 mm above the floor, and 500 mm below the ceiling, at both flanges and at the web, and in the connections, see figs. 2b and 5, table b2. the columns were fire protected except for the joint area, where the primary and secondary beams were connected. a selection of the temperatures recorded at columns d1 and d2 is presented in figs. 6 and 7, where they are compared to the gas temperature, the beam mid-span temperature, the beam end temperature as well as the column end temperature. the fire created a homogeneous gas temperature, and both columns were heated almost equally. the maximum reported temperature in the insulated part of the middle column was 426.0 °c, which occurred after 106 minutes of fire. the values reached at the middle height of the column and in the upper part of the column were similar. the gradient of the temperatures along the column changed in the course of time. the differences of the measured temperatures over the cross sections were insignificant, see table b2.

fig. 5: location of the thermocouples on the columns/beams, on the flanges 20 mm from the edge, on the web in the centreline

fig. 6: comparison of the measured temperature along the column length to the gas and connected beam temperature (column d2) and the external column temperature (d1)

fig. 7: comparison of the measured temperature along the column length with the gas and connected beam temperature (column e2) and the external column temperature (e1)
the accurate and simple step-by-step calculation procedure is based on the principle that the heat entering the steel over the exposed surface area in a small time step $\Delta t$ (taken as 30 seconds at maximum) is equal to the heat required to raise the temperature of the steel by $\Delta\theta_{a,t}$ (at time $t$), assuming that the steel section is a lumped mass at a uniform temperature, so that

$$\dot q''\,F\,\Delta t = \rho_a\,c_{a,t}\,V\,\Delta\theta_{a,t}, \tag{1}$$

where $\rho_a$ is the unit mass of steel, $c_{a,t}$ is the temperature dependent specific heat of steel, $V$ is the volume of the member per unit length, $F$ is the exposed surface area per unit length, and $\dot q''$ is the heat transfer at the surface, given by

$$\dot q'' = h_c\left(\theta_{g,t} - \theta_{a,t}\right) + \sigma\,\varepsilon\left[\left(\theta_{g,t}\right)^4 - \left(\theta_{a,t}\right)^4\right], \tag{2}$$

where $h_c$ is the convective heat transfer coefficient, $\sigma$ is the stefan-boltzmann constant ($56.7 \times 10^{-12}$ kw m⁻² k⁻⁴), $\varepsilon$ is the resultant emissivity, and $\theta_{g,t}$ is the ambient gas temperature at time $t$. equations (1) and (2) may be rearranged to give

$$\Delta\theta_{a,t} = \frac{A_m / V}{\rho_a\,c_{a,t}}\left[h_c\left(\theta_{g,t} - \theta_{a,t}\right) + \sigma\,\varepsilon\left(\left(\theta_{g,t}\right)^4 - \left(\theta_{a,t}\right)^4\right)\right]\Delta t, \tag{3}$$

where $A_m / V$ is the section factor for unprotected steel members and $A_m$ is the surface area of the member per unit length. the convective heat transfer coefficient is recommended to have a value of 25 w m⁻² k⁻¹. the iterative procedure for protected steelwork is similar to that for unprotected steel. the equation does not require heat transfer coefficients, because it is assumed that the external surface of the insulation is at the same temperature as the fire gases. it is also assumed that the internal surface of the insulation is at the same temperature as the steel. the equation is

$$\Delta\theta_{a,t} = \frac{\lambda_p\,A_p / V}{d_p\,\rho_a\,c_{a,t}}\,\frac{\theta_{g,t} - \theta_{a,t}}{1 + \phi/2}\,\Delta t, \qquad \phi = \frac{c_{p,t}\,\rho_p}{c_{a,t}\,\rho_a}\,d_p\,\frac{A_p}{V}, \tag{4}$$

where $A_p / V$ is the section factor for steel members insulated by the fire protection material, $d_p$ is the thickness of the fire protection material, $c_{p,t}$ is the temperature independent specific heat of the fire protection material, $\lambda_p$ is the thermal conductivity of the fire protection system, and $\rho_p$ is the unit mass of the fire protection material. the eccs, see [6], suggested ignoring the heat capacity of the insulation if it is less than half of that of the steel section, such that $\rho_a\,c_{a,t}\,A > 2\,\rho_p\,c_{p,t}\,A_i$, where $A_i$ is the cross-section area of the insulating material and $A$ is the cross-section area of the steel. this prediction is used in en 1993-1-2: 2003, par. 4.2.5.2 [7], taking the constant 3 instead of the constant 2 in the heavy insulation term, to allow the calculation for the temperature gradient across the insulation material, in the form

$$\Delta\theta_{a,t} = \frac{\lambda_p\,A_p / V}{d_p\,\rho_a\,c_{a,t}}\,\frac{\theta_{g,t} - \theta_{a,t}}{1 + \phi/3}\,\Delta t - \left(\mathrm e^{\phi/10} - 1\right)\Delta\theta_{g,t}, \quad \text{but } \Delta\theta_{a,t} \ge 0 \text{ if } \Delta\theta_{g,t} > 0, \tag{5}$$

where $\Delta\theta_{g,t}$ is the increase of the ambient gas temperature during the time interval $\Delta t$ and $A_p$ is the appropriate area of the fire protection material per unit length of the member, which should generally be taken as the area of its inner surface; for a hollow encasement with a clearance around the steel member, the same value as for a hollow encasement without a clearance may be adopted. for the prediction we used a fire protection material thickness $d_p = 0.018$ m, a unit mass $\rho_p = 310$ kg m⁻³, a specific heat $c_p = 1200$ j kg⁻¹ k⁻¹, and a thermal conductivity $\lambda_p = 0.078$ w m⁻¹ k⁻¹; a minimal numerical sketch of this incremental calculation is given below. fig. 9 compares the predicted and measured temperatures. the internal column was exposed from four sides, the external column from two sides only.
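the incremental relation (5) is easily coded. the sketch below drives it with the nominal standard temperature-time curve of en 1991-1-2 [10], whereas the real prediction above is driven by the measured gas temperatures; the section factor $A_p/V$ and the constant specific heat of steel are illustrative assumptions, and the protection data are those quoted above.

```python
import math

def gas_nominal(t_min):
    """nominal standard fire curve: theta_g = 20 + 345 log10(8 t + 1) [deg c]."""
    return 20.0 + 345.0 * math.log10(8.0 * t_min + 1.0)

def protected_member(ap_v, d_p=0.018, lam_p=0.078, rho_p=310.0, c_p=1200.0,
                     rho_a=7850.0, c_a=600.0, dt=5.0, t_end=7200.0):
    """step eq. (5) for an insulated steel member; ap_v is an assumed section
    factor [1/m], and c_a is taken constant here for brevity (the code
    actually uses a temperature dependent specific heat of steel)."""
    theta_a, t = 20.0, 0.0
    while t < t_end:
        phi = (c_p * rho_p / (c_a * rho_a)) * d_p * ap_v
        theta_g0 = gas_nominal(t / 60.0)
        theta_g1 = gas_nominal((t + dt) / 60.0)
        d_theta = (lam_p * ap_v / (d_p * c_a * rho_a)
                   * (theta_g0 - theta_a) / (1.0 + phi / 3.0) * dt
                   - (math.exp(phi / 10.0) - 1.0) * (theta_g1 - theta_g0))
        if theta_g1 > theta_g0 and d_theta < 0.0:
            d_theta = 0.0              # delta(theta_a) >= 0 while the gas heats
        theta_a += d_theta
        t += dt
    return theta_a

print(round(protected_member(ap_v=110.0), 1))   # steel temperature after 2 h
```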
the prediction is based on the measured gas temperature in thermocouple g525, on the calculated parametric temperature, see [8] and [9], and also on the nominal temperature, see [10].

fig. 8: measured temperatures along the column length, column d2 (thermocouples c409, c414, c415, c417, c418), and at a beam end

the results of modelling the heat transfer into the columns by a 2-dimensional fe code are shown in fig. 10. the super-tempcalc code [11], taking into account the above listed parameters, was used for the prediction. in the code, the differential equation for 2-dimensional heat flow is derived from the conservation of energy, based on the fact that the total inflow per unit time equals the total outflow per unit time. the constitutive relation invoked is fourier's law of heat conduction, which describes the heat flow within a material. the spatial and time domains are discretized by the weighted residual approach. the boundary conditions implemented include convective and radiative heat flow and heat exchange within enclosures. 3-node triangular finite elements were used. the thermal properties of the materials are described as temperature dependent. the temperature distribution within the protected column during heating, after 30 min. of fire, and during cooling, after 120 min. of fire, is presented in fig. 11. a temperature difference of only 40 °c was reached in the section. the comparison of the analytical and numerical results confirms the good quality of the presented analytical model.

fig. 10: comparison of the column temperatures calculated by the tempcalc code (2d fem predictions from the experimental gas temperature g525, from the parametric curve and from the nominal curve of en 1991-1-2) with the measured temperatures, thermocouple c410, internal column d2

fig. 9: comparison of the column temperatures calculated by en 1993-1-2 (from the experimental gas temperature g525, from the parametric gas temperature and from the nominal curve of en 1991-1-2) with the measured temperatures, thermocouples c410 (internal column d2) and c405 (external column d1)
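the 2-dimensional heat-flow idea behind such codes can be illustrated by a few lines of an explicit finite-difference update. this is a deliberately simplified sketch: super-tempcalc itself uses 3-node triangular finite elements, temperature dependent properties and convective/radiative boundary conditions, none of which are reproduced here; the grid, the material values and the fixed boundary temperature are our assumptions.

```python
import numpy as np

# illustrative uniform grid over a square cross-section
nx, ny, dx = 60, 60, 0.005          # 0.3 m x 0.3 m section, 5 mm cells (assumed)
lam, rho, c = 45.0, 7850.0, 600.0   # typical steel conductivity, density, specific heat
alpha = lam / (rho * c)             # thermal diffusivity, m2/s
dt = 0.2 * dx**2 / alpha            # time step under the explicit stability limit

T = np.full((ny, nx), 20.0)         # initial uniform temperature, °C

def step(T, T_boundary):
    """one explicit conservation-of-energy update: the change of heat stored in
    a cell equals the net fourier heat flow from its four neighbours."""
    Tn = T.copy()
    Tn[0, :] = Tn[-1, :] = Tn[:, 0] = Tn[:, -1] = T_boundary   # exposed edges (simplified)
    lap = (np.roll(Tn, 1, 0) + np.roll(Tn, -1, 0) +
           np.roll(Tn, 1, 1) + np.roll(Tn, -1, 1) - 4.0 * Tn) / dx**2
    out = Tn + alpha * dt * lap
    out[0, :] = out[-1, :] = out[:, 0] = out[:, -1] = T_boundary
    return out

for _ in range(int(30 * 60 / dt)):  # 30 minutes of heating at a fixed 800 °C edge
    T = step(T, 800.0)
print(f"temperature spread in the section after 30 min: {T.max() - T.min():.1f} K")
```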
5 conclusions

the collapse of the structure or of its parts was not reached during the experiment for a fire load of 40 kg m⁻², which represents the fire load in a typical office building, together with a mechanical load greater than in standard approved cases. the structure showed good structural integrity. the test results supported the concept of unprotected beams and protected columns as a viable system for composite floors. the connections do not need to be fire protected from the point of view of their resistance, see fig. 11b, where only moderate local buckling is visible.

the test at cardington on january 16, 2003 documents that the incremental analytical models in pren 1993-1-2: 2003 [7] allow the column temperature to be predicted from the gas temperature during the heating phase with good accuracy, see figs. 3, 9 and 10. from the nature of the heat transferred from the connected unprotected beams it is clear that a 3d solution is needed to describe the transfer of heat into the protected columns under the unprotected floors; an approximation based on 2d calculations is acceptable for design up to 60 minutes of fire only. an accurate analytical prediction of the temperature of the structure during its cooling will enable the application of the fire protective material to be optimized, protecting the compressed members of the structure only.

6 acknowledgment

the authors would like to thank all nineteen members of the project team working on this large scale experiment at the cardington bre laboratory from october 2002 till january 2003. special thanks go to mr. tom lennon, mr. nick petty, and mr. martin beneš for the careful measurements of the data presented above. the project was supported by grant fp5 hpri cv 5535 of the european community. this paper was prepared as a part of project 103/04/2100 of the czech grant agency.

references

[1] wang y. c.: steel and composite structures, behaviour and design for fire safety. spon press, london 2002, isbn 0-415-24436-6.
[2] bravery p. n. r.: cardington large building test facility, construction details for the first building. building research establishment, internal paper, watford 1993, p. 158.
[3] moore d. b.: steel fire tests on a building frame. building research establishment, paper no. pd220/95, watford 1995, p. 13.
[4] lennon t.: cardington fire tests: survey of damage to the eight storey building. building research establishment, paper no. 127/97, watford 1997, p. 56.
[5] wald f., santiago a., chladná m., lennon t., burgess i., beneš m.: tensile membrane action and robustness of structural steel joints under natural fire. internal report, part 1 – project of measurements; part 2 – prediction; part 3 – measured data; part 4 – behaviour, bre, watford, 2002–2003.
[6] buchanan a. h.: structural design for fire safety. john wiley & sons, chichester 2003, isbn 0-471-89060-x.
[7] eurocode 3: design of steel structures, pren 1993-1-2: 2004, part 1.2: structural fire design, final draft, cen, brussels 2003.
[8] wald f., silva s., moore d. b., lennon t., chladná m., santiago a., beneš m.: experiment with structure under natural fire. the structural engineer, in press.
[9] wald f., chladná m., santiago a., lennon t.: temperatures of structure during cardington structural integrity fire test. in: proceedings of icscs'04, seoul 2004, in press.
[10] eurocode 1: actions on structures, pren 1991-1-2: 2004, part 1.2: general actions – actions on structures exposed to fire, cen, brussels 2002.
[11] anderberg y.: super-tempcalc, a commercial and user-friendly computer program with automatic fe-generation for temperature analysis of structures exposed to heat. fire safety design ab, lund 1991.

prof. ing. františek wald, csc.
phone: +420 224 354 757, +420 233 334 766
e-mail: wald@fsv.cvut.cz
department of steel structures

ing. petra studecká
phone: +420 224 353 877
e-mail: studecká@fsv.cvut.cz
ing. lukáš kroupa
phone: +420 224 354 624
e-mail: lukaskroupa@fsv.cvut.cz

czech technical university in prague
faculty of civil engineering
thákurova 7
166 29 praha 6, czech republic

fig. 11: a) temperature distribution within the protected column (uc 305×305×137, cafco300 fire protection, 18 mm) during heating (after 30 min., isotherms from 100 °c to 800 °c) and during cooling (after 120 min., approx. 300 °c), calculated by the tempcalc code; b) local buckling of the column flange e2

annex a – measured geometrical properties of columns

table a1: geometry of the columns (mm), [5]:

profile         column   h_l    h_r    b_upp  b_low  t_w   t_f,upp,l  t_f,upp,r  t_f,low,l  t_f,low,r
uc 305×305×137  nominal  320.5  –      309.2  –      13.8  21.7       –          –          –
                d2       318.1  316.6  308.2  309.6  –     21.2       21.9       21.2       21.6
                e1       318.1  317.6  307.2  309.8  –     21.4       21.6       21.2       21.7
                e2       320.2  318.6  309.2  309.6  –     21.4       21.9       21.1       21.5
uc 305×305×198  nominal  339.9  –      314.5  –      19.1  31.4       –          –          –
                d1       336.0  336.8  312.3  312.5  –     32.0       30.8       31.4       31.1

symbols, see fig. 11: h_l, h_r height of the column section on the left and on the right side; b width of the column section; upp, low upper and lower measured value; t_w thickness of the column web; t_f thickness of the column flange

table a2: thickness of the thermal protection of the columns d_p (mm), [5], fig. 11:

column  above floor  west (outer) flange  east (inner) flange  web          average of     average on
        (mm)         max.   min.          max.   min.          max.  min.   cross section  column
e1      1000         18     15            18     22            18    20     19             18
        2000         18     19            15     20            20    15     18
e2      1000         13     19            23     16            19    21     19             19
        2000         22     14            18     20            26    20     20
d1      1000         23     13            18     20            18    16     18             18
        2000         22     15            19     18            22    13     18
        3000         12     18            15     21            16    –      16
        4000         19     13            18     21            22    –      19
d2      1000         22     26            24     –             32    –      26             25
        2000         26     19            20     –             30    –      24

annex b – measured temperatures

table b1: maximum gas temperatures (°c) in time intervals, thermocouples 300 mm under the ceiling; for the thermocouple numbers see fig. 2 [5]:

time interval, min.  c521    c522    c523    c524   c525    c526    c527    c528   aver.
10–15                356.4   321.0   349.5   370.4  399.0   422.8   386.0   358.2  373
25–30                687.6   660.1   698.3   762.6  806.8   838.0   827.6   782.4  805
40–45                810.5   777.3   834.8   851.1  935.0   971.6   964.5   885.9  966
0–180                1015.3  1016.1  1007.3  990.5  1107.8  1096.3  1063.1  979.8  1074
75–80                769.6   796.2   730.5   697.2  762.6   754.5   735.0   662.2  761
90–95                567.1   579.7   576.9   528.7  560.3   535.0   555.1   475.1  555

table b2: steel temperatures (°c); for the thermocouple numbers see figs. 2 and 5 [5]:

column        d1      d2      d2      d2      d2      d2-d1   d2      e2      e2      e2
thermocouple  c401    c408    c409    c410    c411    c415    c418    c428    c430    c433
time, min.
0             17.6    17.6    16.9    16.1    16.3    22.6    18.4    15.9    15.9    15.9
15            18.0    23.2    22.2    23.4    23.2    106.2   26.6    19.8    20.8    21.2
30            31.0    85.0    82.9    82.5    89.4    521.2   84.8    73.3    74.6    76.2
45            61.3    106.9   106.8   109.9   107.4   726.0   141.7   107.3   107.6   109.0
60            88.7    191.5   205.5   215.1   208.8   976.0   266.6   174.3   197.8   202.7
90            95.2    408.8   401.3   385.8   421.7   704.6   489.4   377.8   380.3   385.8
106           92.5    426.0   421.8   415.9   434.9   522.2   511.2   400.0   402.3   416.6
124           84.6    413.1   410.8   402.5   408.7   365.6   495.5   392.2   390.9   408.1
160           73.2    367.4   367.3   355.0   354.1   215.2   411.5   352.1   345.7   358.7
maximal       95.3    426.0   421.8   415.9   436.3   985.8   511.2   410.4   402.3   416.6
position      3/4 sw  3/4 sw  3/4 se  1/2 nw  1/2 n   **bf    *nw     3/4 sw  1/2 sw  1/4 sw

the thermocouples were located 20 mm from the column/beam edge; * 200 mm below the secondary beam; ** on the lower flange of the beam, 200 mm from the column face.

common mathematical model of fatigue characteristics
z. maléř, s. slavík, t. marczi, m. růžička

this paper presents a new common mathematical model which is able to describe fatigue characteristics in the whole necessary range by one equation only: $\log N = A(R) - B(R)\,\log s_a$, where $A(R) = aR^2 + bR + c$ and $B(R) = dR^2 + eR + f$. this model was verified by five sets of fatigue data taken from the literature and by three additional original fatigue sets of our own. the fatigue data usually described the region of $N = 10^4$ to $3 \times 10^6$ and stress ratios of $R = -2$ to 0.5. in all these cases the proposed model described the fatigue results with small scatter. in studying this model, the following knowledge was obtained:
– the parameter "stress ratio $R$" was a good physical characteristic
– the proposed model provided a good description of the eight collections of fatigue test results by one equation only
– the scatter of the results through the whole scope is only a little greater than that round the individual s/n curve
– using this model while testing may reduce the number of test samples and shorten the test time
– as the proposed model represents a common form of the s/n curve, it may be used for processing uniform objective fatigue life results, which may enable mutual comparison of fatigue characteristics.

keywords: fatigue characteristics, mathematical model, stress ratio $R$.

1 introduction

it is a great advantage if the designer has reliable fatigue characteristics of the main carrying parts while designing a new aircraft. the first complete characteristics of wings were made in internationally known laboratories. nowadays, major producers create the characteristics of their own specific, commonly used parts. this paper offers a new mathematical model that enables fatigue test results to be processed while the tests are still running, and in this way fatigue characteristics can be established very reliably and economically. for illustration, three examples of the use of this model are shown.

2 the present state of fatigue characteristics expression

the amplitude $s_a$ and the mean stress $s_m$ of the load cycle are decisive and, at the same time, geometrically transparent factors influencing the life of structure elements. the influence of these quantities on the life, expressed by the number of cycles until failure $N$, is often presented graphically as a system of s/n curves [1] (see fig. 1) or in the form of the haigh diagram [2] (see fig. 4). when computers were introduced, the graphical representation of fatigue characteristics was no longer satisfactory, and mathematical ways of expression were sought. an expression used to describe the fatigue characteristics of wing and horizontal tail surfaces [1, 3] was one of the first models:

$\log N = A(s_m) - B(s_m)\,\log(s_a - 0.835)$ (1)

where $A(s_m)$ and $B(s_m)$ represent the positions (intercept on the log $N$ axis) and the slopes of the left branches of the s/n curves. note: this paper will not deal with the right side of the s/n curves, i.e., the fatigue limits. the parameters $A(s_m)$ and $B(s_m)$ are functionally dependent on the mean stress value (see fig. 1), and they are described in four segments by means of polynomials of the first and third degree. this way of expression is accurate, and it enables a computer to be used. the width of the segments and the mathematical functions are chosen to replace the results of the fatigue tests as accurately as possible. however, the selection of the segment widths and suitable polynomials leads to the loss of the general character. the system of polynomials accurately describes the geometrical form of the results, but this description has no physical sense. the system does not describe the physical relations among $s_a$, $s_m$ and $N$.
after supplementing the existing results with another group of results, or when processing another set of test results, it is necessary to choose new segment widths and a new system of polynomials to achieve an accurate presentation of the test results. the segment widths and polynomials are chosen by way of trial. for these reasons this model is not generally valid, nor can it be used in fatigue test design or during fatigue tests to control them. it is necessary to find a new mathematical model. the requirements for the new mathematical model can be expressed as follows:
1. the model should have the basic form of the common model of the s/n curve, $\log N = A - B\,\log s_a$.
2. the functions $A$ and $B$ should always be the same for any tested structure or element within the whole study range, and only their parameters may vary.
3. changes of the equation parameters must depend upon the test results only. if the test results are homogeneous and there is no deflection from the physical limits, the equation parameters cannot change considerably when the number of test samples is increased.
4. the functions $A$ and $B$ should be valid within the high-cycle area, i.e., for $N > 10^4$.

3 design of the new mathematical model

when designing the model, we start [4] from the following knowledge. it was found, see e.g. [5], that when describing the influence of the mean stress (or, better said, of the load cycle position) on crack growth, the stress ratio $R$ is a more suitable parameter than the mean stress $s_m$. for this reason the parameter $R = s_{min}/s_{max}$ was used instead of $s_m$. the relation among $s_a$, $R$ and $N$ is described in the shape

$y = \log N = A(R) - B(R)\,\log s_a$ . (3a)

this function describes the system/family of s/n curve branches that correspond to different values of $R = const.$ in the area of high-cycle fatigue, see fig. 2. the position and the slope parameters of the s/n curve branches, in the coordinate system log $s_a$ – log $N$, depend on the value of $R$.
the dependencies $A(R) = f(R)$ and $B(R) = f(R)$ are expressed by means of the following functions:

$A(R) = aR^2 + bR + c$ , (3b)

$B(R) = dR^2 + eR + f$ . (3c)

fig. 1: s/n curves of complete wings and tail planes

the relations for calculating the parameters a, b, c, …, f are determined through the method of the least squares of the deviations $d_i$ between the results $N_i$ and the model function $y_i$:

$Q = \sum_{i=1}^{n} d_i^2 = \sum_{i=1}^{n} (\log N_i - y_i)^2 = \text{minimum}$ , (4)

$\dfrac{\partial Q}{\partial a} = 0, \quad \dfrac{\partial Q}{\partial b} = 0, \quad \dfrac{\partial Q}{\partial c} = 0, \quad \ldots, \quad \dfrac{\partial Q}{\partial f} = 0.$ (5)

as a result we obtain a system of six normal equations. these enable the six parameters a, b, c, …, f of the model function $s_a$–$R$–$N$ to be established. the input values for calculating the model function parameters are n triples of values ($s_{a,i}$, $R_i$, $N_i$), where $N_i$ is the number of cycles to failure achieved in the test.

note: to determine the parameters a, b, …, f it is necessary to create a computer program. the calculation of the parameters a, b, …, f is then easy, and so it is not difficult to process the fatigue test results during testing, and thus to control these tests while they are still running; this involves altering the test plan on the basis of the obtained results, and thus reducing the number of test samples and shortening the time of testing. at the beginning of testing it is necessary to carry out fatigue tests at a minimum of three stress ratios $R = const.$, while at each $R$ it is necessary to perform tests at a minimum of three $s_a$ levels. the range of $R$ ($R_{min}$ to $R_{max}$) is established according to the usual needs. the $s_a$ levels are assessed so that $N$ will lie between $10^4$ and $10^6$. on this basis, the $s_a$–$R$–$N$ diagram can be supplemented by further tests only in the regions with an information shortage. (a numerical sketch of this fitting procedure is given below, after example 4.1.)

to describe the scatter of the test results round the model function (3), the following expression is used:

$s_f = \sqrt{\dfrac{1}{n-6}\sum_{i=1}^{n} (\log N_i - y_i)^2}$ (6)

where n is the number of test results described by the n triples ($s_a$, $R$, $N$), and $y_i$ is the theoretical number of load cycles given by the model function (3). the $y_i$ value corresponds to a probability of failure of p = 50 %. in tests on al-alloy elements, in which the s/n curve has two straight-line sections of different slopes, each part of the curve must be solved separately.

4 examples of the judgement of the validity and accuracy of the sa–r–n model

the suggested $s_a$–$R$–$N$ model (3) was judged on five sets of fatigue results taken from the literature, and was used for processing our own three sets of test results. the validity and accuracy were judged by:
– comparing the original and the $s_a$–$R$–$N$ diagrams,
– comparing the s/n curve parameters ($A_{Ri}$, $B_{Ri}$) with the model functions $A(R)$, $B(R)$,
– the value of the standard deviation $s_f$ (6).

this paper describes the following three results.

4.1 example 1

fig. 1 shows the s/n curves of complete wings and tail planes. the test results were taken as points read from the original curves. the dashed lines are the original s/n curves according to the data sheets – fatigue e.02.01 [1], and the solid lines are calculated by the new $s_a$–$R$–$N$ model, see equation (3). as we can see, the differences are negligible. the equation is

$\log N = A(R) - B(R)\,\log s_a$

where
$A(R) = -1.726325\,R^2 + 1.643550\,R + 13.560228$,
$B(R) = -0.827044\,R^2 + 0.296186\,R + 4.987437$.

the differences between the original and the model curves are expressed by the value of the standard deviation $s_f = 0.086$.
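since eqs. (3a)–(3c) are linear in the six parameters a, …, f, the normal equations of eq. (5) reduce to a single linear least-squares problem. the sketch below is our illustration, not the authors' original program, and the function names are ours; the last lines evaluate the fitted model with the example-1 coefficients quoted above (the stress units are those of the source data in [1]).

```python
import numpy as np

def fit_sa_r_n(s_a, R, N):
    """least-squares fit of log N = A(R) - B(R) log s_a with
    A(R) = a R^2 + b R + c and B(R) = d R^2 + e R + f (eqs. 3a-3c);
    equivalent to solving the six normal equations of eq. (5)."""
    s_a, R, N = map(np.asarray, (s_a, R, N))
    ls = np.log10(s_a)
    # columns of the design matrix multiply a, b, c, d, e, f respectively
    X = np.column_stack([R**2, R, np.ones_like(R), -R**2 * ls, -R * ls, -ls])
    y = np.log10(N)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    s_f = np.sqrt(np.sum((y - X @ coef) ** 2) / (len(y) - 6))   # scatter, eq. (6)
    return coef, s_f

def log_N(coef, R, s_a):
    """evaluate the fitted model (3a) for a given stress ratio and amplitude."""
    a, b, c, d, e, f = coef
    return (a * R**2 + b * R + c) - (d * R**2 + e * R + f) * np.log10(s_a)

# evaluation with the example-1 coefficients quoted in section 4.1
coef1 = [-1.726325, 1.643550, 13.560228, -0.827044, 0.296186, 4.987437]
print(log_N(coef1, R=-1.0, s_a=100.0))
```

at each of the three or more stress ratios, the three or more amplitude levels recommended above give at least nine triples, comfortably more than the six unknowns.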
4.2 example 2

fig. 4 compares the original constant life curves (thin curves) of the mustang wings with the curves established by the new $s_a$–$R$–$N$ model (3) (thick curves) [2]. the model curves were calculated in the range of $N = 10^4$ to $10^7$ cycles. the test results were taken as points read from the original $N = const.$ curves for $R = -1; -0.8; -0.6; \ldots; 0.6$.

$\log N = A(R) - B(R)\,\log s_a$

where
$A(R) = 0.542293\,R^2 + 1.125954\,R + 11.914455$,
$B(R) = -0.853812\,R^2 + 1.948959\,R + 5.884825$.

the $s_f$ value represents the differences/scatter between the new model and the original curves. the value of the standard deviation is $s_f = 0.107$.

4.3 example 3

fig. 3 shows our original test results for the test specimen that represents a critical point of a steel attachment. the specimens are made of czech poldi l-rol.7 steel heat treated to $R_{m,min} = 1080$ mpa, which is similar to u.s. 4130 alloy steel. the specimens were axially loaded at $R = -1; -0.5; 0.23; 0.42$. the equation describing the $s_a$–$R$–$N$ model is

$\log N = A(R) - B(R)\,\log s_a$

where
$A(R) = 4.930716\,R^2 + 3.479890\,R + 13.980960$,
$B(R) = -2.289047\,R^2 + 1.950201\,R + 3.876255$.

the standard deviation of the test results round the $s_a$–$R$–$N$ model is $s_f = 0.167$.

fig. 2: the scheme and the equation of the new mathematical model describing the elementary fatigue characteristics

fig. 3: the original test results for the test specimen that represents a steel attachment critical point

fig. 4: the comparison of the original constant life curves (thin curves) of the mustang wings and the curves established by the new $s_a$–$R$–$N$ model

5 partial conclusion

this paper presents three of the eight test groups on which the new $s_a$–$R$–$N$ model has been tested. in all five other checked fatigue sets the new model describes the test results with similar precision, and in the whole test range by one equation (3) only.

6 conclusion

the following conclusions can be drawn from a comparison of the original characteristics and the characteristics obtained when using the mathematical model (3):
1. the designed model was able to describe, with sufficient accuracy and reliability, the results for five different structures and elements, pre-processed in some way (graphically or mathematically), within the range of $N = 10^4$ through $10^6$ to $10^7$ and within the range of $R = -1$ through 0.6.
2. when processing the original results of the fatigue tests, the s/n curves determined through the model function (3) were nearly identical with the partial s/n curves calculated for different values of $R = const.$ the scatter of the test results calculated from the whole set was not much higher than that around the individual s/n curves.
3. the parameter $R$ (together with $s_a$) proved to be a suitable physical parameter for describing the load cycle.
further knowledge follows from this:
4. the $s_a$–$R$–$N$ model can be considered as a suitable, generally valid mathematical description of fatigue characteristics within the high-cycle fatigue area ($N > 10^4$).
5. for this reason, model (3) enables economical design of fatigue tests, because during the tests not just one s/n curve but the whole wide region is described, while the number of test samples is not much greater.
6. as the designed model is only a generalized form of a commonly used s/n curve, it provides a basis for uniform, comparable, objective and accurate processing of fatigue test results.
7. instead of $s_a$, $s_{max}$ can also be used.

7 acknowledgment

this paper is presented thanks to the aerospace research center at the institute of aerospace engineering at the technical university in brno and at the czech technical university in prague, czech republic.

references

[1] engineering science data unit (data sheets – fatigue e.02.01): endurance for complete wing and tail structures. royal aeronautical society, 1958.
[2] johnstone w. w., payne a. o.: aircraft structural fatigue research in australia. in: fatigue in aircraft structures, proceedings of the international conference (ed. a. m. freudenthal), columbia university, 1956, fig. on page 436.
[3] hangartner r.: correlation of fatigue data for aluminium aircraft wing and tail structures. national aeronautical establishment, ottawa 1974, aeronautical report lr-582.
[4] maléř z.: basis for calculation estimate of safe life of the primary structure of a small airplane. doctoral thesis, military academy, brno (czech republic), 1990.
[5] hudson c. m.: effect of stress ratio on fatigue crack growth. report, nasa research center, bethlehem, 1967.

ing. zdeněk maléř, csc.
private adviser in fatigue life
u trojáku 4650
760 05 zlín, czech republic

doc. ing. svatomír slavík, csc.
phone: +420 224 357 227
e-mail: svatomir.slavik@fs.cvut.cz

ing. tomáš marczi
phone: +420 224 357 433
e-mail: tomas.marczi@fs.cvut.cz

department of automotive and aerospace engineering
czech technical university in prague
faculty of mechanical engineering
karlovo náměstí 13
121 35 praha 2, czech republic

doc. ing. milan růžička, csc.
phone: +420 224 352 512
e-mail: milan.ruzicka@fs.cvut.cz

department of mechanics
czech technical university in prague
faculty of mechanical engineering
technická 4
166 07 praha 6, czech republic

technical teachers and technical teacher education – research results
d. dobrovská, p. andres

chartered engineers who are new teachers of technical subjects at various educational institutions receive technical teacher education in the accredited bachelor programme at the czech technical university in prague. this paper presents the results of a recent survey in which engineers expressed their opinions on technical teacher education.

keywords: teachers of technical subjects, technical teacher education, motivation, non-technical subjects, e-learning.

1 introduction

graduates from technical universities start their careers in a wide spectrum of professions and in various positions, including teaching technical subjects at high schools. new teachers are required to complete a teaching qualification. technical teacher education has been organized at the ctu in prague since 1966. in 1996 this education was made into a bachelor study programme, and in 1997 it gained accreditation from the international society for engineering education (igip). in 2001, accreditation was given by the ministry of education for the day-time programme and for the distance programme with intensive sessions. engineers who have graduated in technical teacher education can extend their pedagogical qualifications in a doctoral study programme at the university in hradec králové (this programme is organized in co-operation with ctu), or can apply to igip to become a european engineer – educator. in the 2001/2002 school year, the department of engineering education of the masaryk institute of advanced studies launched a study of the attitudes of teachers qualified in engineering education on the following issues:
– why did they apply for technical teacher education?
– why did they choose technical teacher education at ctu in prague?
– where did they find information about this study programme?
– had they received any previous education in non-technical subjects?
– did they feel a personal deficiency in knowledge of non-technical subjects?
– which non-technical subjects do they prefer?
– should teacher education be obligatory for secondary school and university teachers of technical subjects?
– how many semesters should technical teacher education last?
– what should be the extent, form, organization and content of technical teacher education?

2 research results

the questionnaire was administered to 107 teachers with a technical background, 67 male and 39 female, born between 1945 and 1977, with 0–24 years of teaching experience. the teachers had mostly graduated from a faculty of mechanical engineering, electrical engineering or civil engineering. the first item in the questionnaire concerned the motivation to study. twenty-two respondents answered that they had chosen technical teacher education because they were interested in it, for 29 respondents it was a matter of obligation, for 69 it was an opportunity to gain a further qualification, and 6 had other reasons.

fig. 1: why did you apply for technical teacher education?
answer                   frequency
interest                 22
matter of obligation     29
further qualification    69
other reasons            6

the choice of technical teacher education at ctu was motivated by the following reasons: previous technical graduation from the same university (73), no tuition fee (30), the opportunity to gain a bachelor degree (16), the study programme has already built its own prestige (13), the university is not far from my home town (10), while 14 respondents gave other reasons (recommended by the school management or school head, interest in the teaching profession, an opportunity to find a new job, extending their knowledge in non-technical subjects).

fig. 2: why did you choose technical teacher education at ctu?
answer                                            frequency
graduate of ctu (or of a similar technical university)  73
no tuition fee                                    30
bachelor degree                                   16
has already built its own prestige                13
not far from home town                            10
other reasons                                     14

in most cases, teachers had received information about technical teacher education from their school colleagues (46 respondents) or from colleagues at ctu (27 answers).
the following question focused on the non-technical subjects that the respondents preferred. the answers were: psychology (57), philosophy (19), sociology (17), educational science – pedagogy (14), and teaching methodology (11). thirteen other subjects named by the respondents received less than 10 preferences. the 5 subjects named above were also generally mentioned in leading positions. in the next item, respondents expressed their attitudes towards obligatory technical teacher education for teachers of technical subjects at secondary schools. fifty-seven expressed the opinion that teacher education should be obligatory, as technical university studies do not include pedagogical and psychological preparation. five respondents wrote that technical university studies do not provide a teaching qualification. on the other hand, 12 engineers held the opinion that quality of teaching was an inborn talent. ten other respondents expressed the view that teaching competences can be gained on-the-job. twenty-six teachers had no opinion, and 3 gave no answer. the next question focused on the desirable amount of technical teacher education. one semester was preferred by 4 respondents, 2 semesters by 44 respondents, 3 semesters 4 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 42 no. 4/2002 answer frequency school colleagues 46 at ctu 27 in “učitelské noviny” 17 internet 14 elsewhere 11 fig. 3: where did you find information about the study programme? answer frequency university studies 48 secondary school studies 29 independent study 9 short-term training 7 none 39 fig. 4: have you ever studied non-technical subjects? answer frequency yes 59 from time to time 4 hardly ever 4 no 36 no reply 4 fig. 5: have you ever perceived the lack of information in non-technical subjects as a personal weakness? answer frequency psychology 57 philosophy 19 sociology 17 educational science – pedagogy 14 teaching methodology 11 communication skills 9 history and technology history 9 foreign language 7 biology 4 educational theory – special pedagogy 2 art history 2 literature 2 political science 2 interpersonal communication 2 geography 1 law 1 ecology 1 theology 1 fig. 6: if you could choose to study some non-technical subjects, which would you prefer? answer frequency technical university studies do not include pedagogical and psychological preparation 57 technical university studies do not provide a teaching qualification 5 a good teacher has an inborn talent for teaching 12 teaching competence is a matter of practical experience 10 no opinion 26 no reply 3 fig. 7: do you think technical teacher education ought to be obligatory for teachers of engineering? by 15 respondents and 4 semesters by 35 respondents. five respondents favoured more than 4 semesters. time arrangement is also an important factor in the organisation of technical teacher education. seventy-seven teachers are satisfied with the present arrangements (lectures and seminars are concentrated in one day per week). conversely, 10 respondents prefer weekly blocks of lectures and seminars. fourteen respondents would like to have just lectures, and 6 respondents just seminars. two respondents preferred other arrangement, and 2 gave no answer. the next item dealt with distance learning and e-learning: “would distance learning be an adequate form of technical teacher education?” the respondents answered “no” (68), “not quite” (3), “maybe” (2), “under certain conditions” (6), and ”yes” (23). 
five respondents gave no answer. should technical teacher education be obligatory for technical university teachers? sixty respondents answered "yes", 6 answered "no", 32 had no opinion, and 9 gave another modification (optional study, obligatory study for new teachers, etc.).

fig. 8: how many semesters should technical teacher education take?
answer       frequency
1 semester   4
2 semesters  44
3 semesters  15
4 semesters  35
longer       5
no reply     4

fig. 9: what is the most convenient time arrangement for the study programme?
answer                                   frequency
lectures and seminars one day per week   77
weekly blocks of lectures and seminars   10
only lectures                            14
only seminars                            6
other arrangement                        2
no reply                                 2

fig. 10: do you think distance learning would be an adequate form of technical teacher education (for instance by mail, with e-contact between learners and teachers)?
answer                    frequency
no                        68
not quite                 3
maybe                     2
under certain conditions  6
yes                       23
no reply                  5

fig. 11: do you think technical teacher education should be obligatory for technical university teachers?
answer         frequency
yes            60
no             6
no opinion     32
other opinion  9

fig. 12: attitudes towards the content of technical teacher education; for each subject: the present number of hours, then the frequency of deviations from the present number of hours (less / −15 / −10 / −5 / 0 / +5 / +10 / +15 / more / no reply):

educational science – pedagogy: 60 h; 22 / 1 / 2 / 0 / 62 / 0 / 4 / 0 / 2 / 14
psychology: 60 h; 3 / 0 / 1 / 0 / 76 / 0 / 4 / 0 / 9 / 14
teaching methodology: 60 h; 7 / 0 / 2 / 0 / 60 / 0 / 4 / 1 / 15 / 18
sociology: 20 h; 4 / 0 / 19 / 5 / 56 / 3 / 2 / 0 / 3 / 15
philosophy: 20 h; 5 / 0 / 18 / 8 / 50 / 1 / 6 / 0 / 6 / 13
communication skills: 20 h; 0 / 0 / 5 / 3 / 52 / 7 / 15 / 0 / 12 / 13
biology: 20 h; 3 / 0 / 10 / 4 / 61 / 2 / 9 / 0 / 5 / 13
logic: 20 h; 3 / 0 / 10 / 4 / 64 / 0 / 2 / 0 / 2 / 22
teaching practice: 20 h; 1 / 0 / 7 / 1 / 62 / 1 / 6 / 0 / 8 / 21
informatics: 16 h; 4 / 0 / 20 / 2 / 45 / 13 / 1 / 0 / 2 / 20
school management: 16 h; 2 / 2 / 12 / 2 / 44 / 14 / 1 / 3 / 0 / 27
educational technology: 16 h; 0 / 2 / 9 / 3 / 45 / 21 / 0 / 3 / 0 / 24
technology assessment: 12 h; x / 14 / 16 / 20 / 34 / 2 / 2 / 0 / 0 / 19
bachelor diploma seminar: 12 h; x / 0 / 3 / 7 / 44 / 6 / 21 / 0 / 2 / 24
economics of education: 12 h; x / 2 / 2 / 15 / 45 / 4 / 8 / 0 / 1 / 30
twenty-two answers falling into the category of internal motivation are an encouraging result, as the questionnaire was anonymous and the respondents were under no pressure to write “socially acceptable” answers. in the second question, where motivation to study technical teacher education was analyzed, “pragmatic answers” were expected in most cases. the respondents prefer to study at ctu rather than at other educational institutions. there are other reasons, of course, such as the absence of tuition. some respondents were influenced by recommendations by their colleagues or by the management of their school, and also by graduates in technical teacher education at ctu. some respondents received the information on internet (ticked by 14 respondents). the next part of the questionnaire dealt with previous and present experience of engineers in the area of non-technical subjects. the answers showed that more than one third of the respondents had never attended any course in non-technical subjects, and almost one third had studied such subjects only at secondary school. this unfavorable situation may improve slightly in the near future, as most faculties of ctu now require students to take a certain number of credits in non-technical subjects. in their answers to the next item, respondents expressed their personal feelings on whether their lack of knowledge of non-technical subjects is a weakness. while more than one half of the respondents feel it to be a personal weakness, more than one third do not. few try to improve the situation by independent study. some respondents with low non-technical knowledge do not perceive this as a personal weakness. such an attitude is generally formulated as “… i am a technician and there is no need to study non-technical subjects… ”. the question surveying the respondents’ favoured non-technical subjects was in an open form, i.e., they were free to name particular non-technical sciences. they generally named subjects from the technical teacher programme (psychology, philosophy, sociology, educational science – pedagogy, and teaching methodology). these subjects attracted most interest. several other subjects were also mentioned (history and history of technology, a foreign language). with a certain caution, we can conclude that our respondents tend to agree with the content of technical teacher education at ctu. the next question analyzed the view of respondents on whether technical teacher education should be obligatory for teachers of technical subjects at secondary schools. almost two thirds of the respondents consider that technical teacher education is a necessity for the teaching profession. statements such as “… a high quality technical and professional background is sufficient, or ‘natural talent’, they are adult students after all… ” are found in the answers of 22 respondents. unfortunately, such an attitude can also be heard even among university teachers. the statement “i have no opinion” (given by approx. 25 % of the respondents) is disappointing, given that the respondents are graduates of a technical teacher education programme. qualitative analysis of the answers shows some ambivalences and contradictions. most students prefer either a two-semester or a four-semester study programme. however, the answers range from one to four and more semesters. a strong consensus was reached on the issue of the time arrangements for the study programme. 
most respondents are satisfied with the present timetable (one day per week), combining lectures and seminars. fourteen respondents consider that tuition based just on traditional lectures would not be best. recently, members of the academic staff have discussed the possibilities and limitations of e-learning as a form of study. can e-learning be used in technical teacher education? most of the respondents believe it is not suitable at all, or that it can be used only to a limited extent. one reason for these negative attitudes may be the mature age of some respondents and their relation to computer technology. the main reason, however, is the character of teacher education, in which personal contact with the academic staff and other colleagues is more important than in other fields. technical teacher education for university teachers is not yet obligatory. the respondents (graduates of technical universities) mostly consider that some form of complementary pedagogical education is also necessary for university teaching staff. opinions about the form and extent of such a study programme differed. the respondents' views on the subjects that should be studied were of key importance in the survey. respondents who were recent graduates of technical teacher education mostly chose the subjects that are at present included in the programme. none of the subjects were rejected. some subjects are more "popular" or are considered "useful" (psychology, communication skills, the bachelor diploma seminar, teaching methodology). logic is perceived as a "difficult" subject, probably because it is intellectually demanding.

4 conclusions

– to gain a "further qualification" or a "higher income" are the leading motives for technical teachers to study teacher education, while "interest" in pedagogical study is less common
– most technical teachers have no previous experience with non-technical subjects
– psychology is the most popular "human science" chosen by technical teachers
– most of the respondents agree that technical teacher education should be obligatory; some have no opinion about the problem
– a timetable involving one day weekly is acceptable for most respondents, but they are not in consensus about the desirable length of the programme (ranging from 1 to 4 semesters)
– most technical teachers consider that e-learning is not quite suitable for technical teacher education
– most also believe that technical university teachers should take some complementary teacher education courses
– some subjects studied in technical teacher education are more popular or are felt to be more useful; others are less popular or considered less useful. none of the subjects in our present programme was totally rejected.

phdr. dana dobrovská, csc.
phone/fax: +420 224 359 138
mobile: +420 603 342 339
e-mail: dobrovd@muvs.cvut.cz

ing. pavel andres
phone: +420 224 359 134
e-mail: andres@muvs.cvut.cz

masaryk institute of advanced studies of the ctu in prague
horská 3
128 00 prague 2, czech republic

verification of the wind response of a stack structure
d. makovička, j. král, d. makovička

this paper deals with the verification analysis of the wind response of a power plant stack structure. over a period of two weeks, the actual history of the dynamic response of the structure and the direction and intensity of the actual wind load were measured, recorded and processed with the use of a computer. the resulting data was used to verify the design stage data of the structure, together with the natural frequencies and modes assumed by the design and with the dominant effect of other sources on the site. in conclusion, the standard requirements are compared with the actual results of the measurements and their expansion to the design load.

keywords: stack structure, wind load, dynamic response, experimental verification.

1 introduction

the stack structure of a desulphurization absorber
(fig. 1), consisting of a steel skeletal tower structure adjoining the absorber vessel, terminates in a grp (glass fibre reinforced plastic) extension. the purpose of the analysis is to assess the response of the steel tower structure and its grp extension with reference to the design loads. the work is based on a theoretical as well as an experimental analysis of the structure. the experimental verification of the dynamic response was based on measurements of the vibrations of both the steel parts and the grp parts of the structure excited by the effects of wind and of other sources generated by the machines in the power plant area, and by a test pulse generated by a hammer impact on the wall of the grp extension. the vibration measurements proceeded repeatedly in an extended form (an increased number of monitored points) with reference to the instantaneous excitation values, and on a long-term basis on selected sites for about 14 days, with 15-minute repetition, to determine the long-term development of the response to wind effects. the measured vibration histories in terms of accelerations, velocities and relative deformations, together with the measurements of the wind velocity and direction, were evaluated by a computer and compared with the response values assumed in the design, with the natural frequencies of the grp extension according to the designer's computation, and with the frequencies of the dominant vibration sources – major machines in the environs of the stack.

2 description of the structure

the principal part of the structure (fig. 1) is a vertical grp stack structure mounted above the absorber vessel. the stack is supported and partly surrounded by a supporting steel lattice structure, on which the horizontal flue is also suspended at the point of its inlet into the absorber vessel. the steel structure is anchored in four pile footings. the grp stack tube between the levels of +29.680 and +120.220 was assembled from 14 segments of 7000 mm inside diameter and 6610 mm assembly length. the wall thickness varies between 2.0 + 9.0 mm and 2.0 + 23.6 mm, where the 2.0 mm layer forms a chemical barrier protecting the structure against the weather. the grp tube is constrained at the +30.935 level in the vertical and horizontal directions by the steel structure. horizontal displacements are also prevented at three further levels (+51.190, +71.647 and +92.116), where the tube is strutted horizontally against the steel structure by means of buffers on the circumference of the tube, enabling, however, vertical displacements. the supporting steel structure between +0.600 and +92.300 was erected as a lattice tower comprising mostly tubular sections.
fig. 1: schematic drawing of the stack structure

fig. 2: comparison of the 10-minute mean wind velocities with the maximum gusts of wind during the measurements

fig. 3: autospectra of acceleration, transmission of vibrations from the footings to the steel tower structure at the +31 level: a) horizontally in the east-west direction, b) vertically

it consists of four columns mutually connected by horizontal beams on nine levels and stiffened by diagonal diaphragms. below the +22.500 level the steel tower plan is of 20.0 × 20.0 m dimensions, while above this level its dimensions decrease linearly to the top of the tower at the +92.300 level, where the dimensions are 8.6 × 8.6 m. up to the +30.700 level the steel tower passes through the steel skeleton of the hall housing the absorber technology. the steel part of the structure is founded on four concrete footings at the +0.600 level. every footing is mounted on a bored pile of 1220 mm diameter and 8300 mm length. at their feet the piles are constrained in sound granite bedrock. to prevent the piles from being pulled out of the bedrock (to intercept the tensile forces), the pile heads are anchored by prestressed anchors.

3 load

with reference to the limit states of its safety, the dominant load applied to the tower structure is the wind load. the measurements of the wind velocity and direction were made on the level of the platform, approximately at the +53 level. the measuring set projected horizontally approx. 2 m from the tower structure to ensure that the measured values were minimally influenced by the bypassing air flow. the objective was to record the response to various instantaneous wind velocities (fig. 2) and directions. another load applied was the test force pulse. to define the magnitude of the test pulse load applied to the structure, a test hammer was used. the test hammer was provided with a force sensor in its punch, enabling measurements of the pulse force at the moment of its application to the surface of the grp structure across a rubber pad. the response of the structure to the hammer pulse was measured within the 0–1000 hz frequency band. the purpose of these measurements was to make an accurate determination of the spectrum of the natural frequencies of the structure. the test hammer was situated on the western side of the tower at approximately the +48 m level. finally, the response measurements were also used to determine the effects of other sources excited by the power plant machinery in the proximity of the stack structure.

4 description of the response measurements and their evaluation

the sites for the response monitoring were selected both on the footings on the pile heads and on the steel tower structure and its grp extension, in order to provide the data necessary for determining the phase shift between the movements of the two parts of the structure (steel and grp). the selection of the measuring sites was also adjusted to ensure accessibility from the galleries, fixed ladders and their landings, so that we could monitor the three-dimensional movement of the structure with reference to the load magnitude and character. the stack response consisted, consequently, of the response to technical seismicity (the effect of machines in the power-plant area), which was practically of a stationary character, and of the response to wind effects, which was mostly of a non-stationary character (gusty wind).
when the test pulses were applied during low wind velocities, their effects were superposed on the machinery and wind effects. the history of the response records to the above loads was described by the basic statistical characteristics (mean and effective value, maximum and minimum). as it was impossible to measure the state without a load, the records were described by the effective value and the maximum double amplitude (maximum – minimum). the integration of the filtered acceleration records yielded the velocity records and the dynamic displacements, also described in the above-mentioned manner. the response records were analysed with reference to frequency; their autospectra (fig. 3) show both the frequencies of the forced vibrations due to technical seismicity and the natural frequencies of the structure. further, the coherence functions between the response records on various sites and the corresponding transfer functions were evaluated, from which the vibration modes of the structure at the lowest natural frequencies were derived (fig. 5). for the purpose of the extrapolation of the measured response to the design loads, the measured wind velocities for the individual records were converted to the basic wind pressure

$w = v^2 / 1600$ ,

where $v$ is the wind velocity [m/s] and $w$ is the wind pressure [kn/m²]. the wind velocity and direction measurements were made on a selected site parallel with the response measurements on the other sites. in simplified terms, it was assumed that the distribution of the measured wind pressure at all other points (both vertically and around the horizontal circumference of the grp extension and the steel tower structure) corresponds with the standard requirements. with regard to the stack height and the height of the buildings on its windward side, it was assumed that the wind velocity variation with height corresponds to ground category ii (eurocode, open landscape). the velocity at a height of some 53 m above ground level is about 1.32 times as high as that at a height of 10 m above ground level. the wind direction during the measurements was easterly to northeasterly. the mean wind velocity can be considered, in simplified terms, as the average wind velocity in 10-minute intervals. the maximum measured ten-minute mean wind velocities attained 6–7 m/s (fig. 2). the differences between the average value of the whole interval and the instantaneous (peak) velocity can be considered as the effect of a gust of wind. the coefficient of wind gusts in the design load was considered as g = 1.8. the measured values are obvious from the comparison of the curves in fig. 2. the reference wind velocity (the 10-minute mean wind velocity 10 m above ground level with an annual occurrence probability of 0.02 for region 2) was considered with the design value of $v_{ref,0} = 26.7$ m/s. this velocity is higher than that required by the national application document of the respective eurocode (csn p env 1991-2-4), $v_{ref,0} = 26$ m/s. the measured velocities of 6–7 m/s correspond to a mean wind velocity 10 m above ground level (category ii) of (6 to 7)/1.32 = 4.5 to 5.3 m/s. the response of the structure under the design load, consequently, can be estimated from the measured response values by means of the ratio of the squares of the reference velocity and the measured wind velocity 10 m above ground level, i.e.
$\left(\dfrac{\text{design reference velocity}}{\text{measured velocity converted to 10 m height}}\right)^2 = \dfrac{26.7^2}{(4.5 \text{ to } 5.3)^2} \approx 25 \text{ to } 35$

times the measured response values (a short numerical check of this factor follows table 1 below). the design load used in the structural design, or the response computed from it, expresses the equivalent effect of the mean time-variable random load component. a direct measurement of this response would require the measurement of the state without a load, i.e., the response at a wind velocity of 0 m/s, and the response to a defined stationary load. there was no state without a load during the measurements. for this reason it was necessary to base the measurements on the instantaneous wind velocity, at which the response of the structure was measured in terms of acceleration within the frequency band of 0–100 hz. the double integration of the centered acceleration from selected records yielded the history of the displacement excited by the effects of wind and technical seismicity. the displacement record had a significant quasistatic component corresponding to the changing instantaneous wind velocity fluctuating about the mean wind velocity.

fig. 4: effective acceleration at the top of the grp extension (+121) plotted against the mean wind velocity [m/s], in the wind direction – lengthwise (solid line) and transversal (dotted line)

table 1: measured accelerations [mm/s²] in the 0–40 hz frequency interval, in two perpendicular horizontal directions (the first sensor approximately in the east/west wind direction, the other in the transverse direction):

position of pickups:          grp extension +121     steel tower +91       steel tower +51
                              lengthw.   transv.     lengthw.  transv.     lengthw.  transv.
effective value of the dynamic response component
mean value                    55.0       49.4        16.9      11.1        7.4       8.0
maximum                       76.9       78.3        21.5      22.6        10.5      12.4
minimum                       43.2       38.2        12.7      7.0         5.3       5.7
measured maximal effective value (quasistatic + dynamic components)
mean value                    254.1      217.8       73.1      48.4        33.4      38.2
maximum                       398.0      367.5       93.8      84.4        46.1      60.6
minimum                       175.0      168.6       53.0      31.0        22.5      23.9
measured minimal effective value (quasistatic + dynamic components)
mean value                    −256.1     −222.9      −72.3     −47.9       −34.8     −37.8
maximum                       −185.6     −167.1      −52.6     −29.0       −20.4     −26.3
minimum                       −328.4     −404.9      −93.9     −76.8       −55.3     −65.8
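the extrapolation factor quoted above is plain arithmetic; the following few lines (our check, not part of the original evaluation) reproduce the 25 to 35 range and the corresponding basic wind pressures from w = v²/1600:

```python
v_ref = 26.7                      # design reference velocity at 10 m [m/s]
for v10 in (4.5, 5.3):            # measured velocities converted to 10 m height [m/s]
    factor = (v_ref / v10) ** 2   # ratio of squares used for the extrapolation
    w = v10 ** 2 / 1600.0         # basic wind pressure w = v^2/1600 [kN/m2]
    print(f"v10 = {v10} m/s: factor = {factor:.0f}, w = {w:.4f} kN/m2")
print(f"design basic pressure: {v_ref ** 2 / 1600.0:.3f} kN/m2")
```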
the records taken on the +121 level show the dominant frequency peaks (fig. 4) at approx. 1.06 hz, 2.13–2.19 hz (the stack stiffness is not identical in both horizontal directions due to the service ladders – a fact expressed by the interval of measured frequencies), 2.63–2.94 hz, 3.69–3.82 hz, 4.75 hz, 5.56 hz and 6.19 hz. the annexed fig. 5 shows the lowest vibration modes for the east-west direction. the evaluated natural vibration frequencies correspond to the computed natural vibration modes. the lowest basic flexural vibration mode of the structure at 1.06 hz can be observed in all vibration records, also on the lower vertical levels of the structure. the frequency of 3.4 hz obviously corresponds with the vertical natural vibration mode, as compared with the computed natural vertical vibration frequency of 2.8 hz. the shift of the measured natural frequency with respect to the computed one enables us to conclude that the erected structure is somewhat stiffer than considered in the design.

6 frequency spectra due to effects from other sources

the frequency spectra computed from the measured response histories show further dominant frequency peaks (apart from the natural mode frequencies) – see fig. 3 and fig. 4. these dominant peaks correspond with the effects of other sources – e.g., technical seismicity propagating into the structure from its environs through the foundation soil. as a rule, the dominant seismic effects are generated by the operation of major machines or mechanisms in the power plant area. let us compare the dominant higher frequency peaks with the revolution frequencies and the higher harmonic frequencies of the machines in the near environs. the basic revolution frequency of the turbo-generator sets is 50 hz and their higher harmonic frequencies are 100 hz, 150 hz, etc., incl. the half-harmonic frequency of 25 hz. the basic revolution frequency of the cooling tower pumps is 8.18 hz, that of the compressors is 94.3 hz in one station and 153.3 hz in the other station, and that of the drives of both compressors is 25 hz. it is known that under higher loads the compressor revolutions fluctuate about their rated values and, consequently, the measured values differ or may differ in the order of 1 hz. a comparison of the measured frequencies of the records with the revolution frequencies of the machines in the technical seismicity sources shows that the stack excitation by technical seismicity is significantly influenced particularly by the cooling tower pumps at the level of 8.2 hz and its multiples and by the compressors with frequencies in the proximity of 100 hz, 150 hz and their multiples. the influence of the turbo-generator sets, due to their higher-quality balancing and obviously better maintenance, participates in the wet stack response at a lower rate than the cooling tower pumps and compressors.
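the matching of measured peaks to machine harmonics described above can be automated; the following is a hypothetical helper (not part of the original evaluation) using the revolution frequencies quoted in the text and a ~2 hz slack to reflect the revolution fluctuation:

```python
# a hypothetical helper: match a measured spectral peak to the nearest
# harmonic of a known technical-seismicity source.

SOURCES = {                       # basic revolution frequencies [Hz] from the text
    "turbo-generator": 50.0,
    "cooling tower pump": 8.18,
    "compressor A": 94.3,
    "compressor B": 153.3,
    "compressor drive": 25.0,
}

def nearest_harmonic(peak_hz, tol_hz=2.0):
    """return (source, harmonic order, harmonic frequency) closest to peak."""
    best = None
    for name, f0 in SOURCES.items():
        k = max(1, round(peak_hz / f0))           # nearest harmonic order
        f_harm = k * f0
        if best is None or abs(f_harm - peak_hz) < abs(best[2] - peak_hz):
            best = (name, k, f_harm)
    return best if abs(best[2] - peak_hz) <= tol_hz else None

for peak in (8.2, 16.5, 25.0, 96.0):              # peaks quoted in the text
    print(peak, "->", nearest_harmonic(peak))
```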
table 2: vibration displacements [mm] in the 0–40 hz frequency band, for both horizontal directions, extended to the displacements corresponding to the design wind velocity and the dominant flexural frequency of 1.06 hz

                                  grp extension +121       steel tower +91          steel tower +51
  value                           lengthwise  transversal  lengthwise  transversal  lengthwise  transversal
  effective value of the dynamic response component
    mean value                       50.1        45.0         15.4        10.1          6.8         7.3
    maximum                          47.6        48.5         13.3        14.0          6.5         7.7
    minimum                          56.8        50.3         16.7         9.2          6.9         7.5
  measured effective value of the total displacement (quasistatic + dynamic components)
    mean value                      239.1       206.1         67.7        44.8         31.9        35.5
    maximum                         399.2       427.8        101.5        89.3         52.2        64.9
    minimum                         148.7       128.0         48.4        26.0         19.1        21.9

the evaluation monitored particularly the increase of the signal generated by technical seismicity at the level of approx. +31 of the steel structure with reference to the vibrations of the footings of the tower columns (fig. 3). this increase may be 20 to 30 fold, depending on the frequency of the dominant peak. the comparison shows that the effects of technical seismicity are comparable with the wind effects for current wind velocities of about 5–7 m/s. if we were to compare the seismic effects with the design wind effects, we would find the effect of technical seismicity lower than the effect of the design wind load. wind effects usually generate large displacement amplitudes of the structure at low frequencies, while technical seismicity generates lower to small displacements, but at higher and very high frequencies, which may be critical for some structural details (such as the joints of structural members, the wide range of measuring probes in the structure, chimney stack warning lights, etc.).

7 transmission of effects between steel structure and footings

fig. 3a compares the acceleration spectra in the approximately east-west horizontal direction, i.e., in the wind direction, for the sites at +31 and at the footings. it is obvious that the steel structure magnifies the effects at the frequencies propagating into the structure through the foundation soil from ambient vibration sources, such as technical seismicity. a significant increase manifests itself at low frequencies from approx. 1 hz to 8 hz, i.e., in the region of the natural frequencies of the steel structure. at higher frequencies the steel structure has higher damping, yet the transmission of exterior dynamic effects (due to technical seismicity) is significant in the environs of the frequencies of 16.5 hz, 25 hz, 33 hz, 45–50 hz and 96 hz.

fig. 5: normalized measured natural horizontal vibration modes in the wind direction: a) the first three lowest modes, b) the fourth and fifth higher modes

fig. 3b compares the acceleration spectra in the vertical direction for the two above mentioned sites. during vertical vibrations, the transmission from the footings to the structure is dominant, both at the low frequencies in the environs of 1–3 hz and at the frequencies of 16.5 hz, 25 hz, 33 hz, 42 hz, 47–50 hz, 58 hz, 74 hz and 96 hz. these frequency components with significant response peaks at the footing sites are mostly due to technical seismicity, analogous with the horizontal vibrations.
8 conclusions

using the example of a composite structure, this paper analyses the influence of wind and technical seismicity effects on the dynamic response and compares the significance of these two load types for the safety and reliability of the structure. the comparison has revealed that the dominant effect on the structure with reference to its safety (maximum displacements, extreme stress state in selected cross sections, etc.) is exercised by the design wind load. the effects of technical seismicity and other sources from the power plant area are comparable with the dynamic wind load within the interval of usual wind velocities. however, technical seismicity may become dominant for the reliability of the structure in the case of vibrations of selected parts, such as joints, measuring probes installed in the structure for technological purposes, etc. finally, we have qualified these effects, which are sometimes underestimated in the design stage. the measurements of the response of the structure during "stronger" wind (5–7 m/s) have revealed the obvious advantage of extrapolating this relatively strong wind response to the design load. a comparison of the measured response with its assumed design value makes it possible to determine the reserves of the structure in its actual behaviour and the influence of a wide range of imperfections arising from the co-operation of the structure as a whole: piles – steel tower – grp extension.

acknowledgement

this research was supported by fortum engineering, ltd., pardubice, and formed part of gačr research project no. 103/03/0082 "nonlinear response of structures under extraordinary loads and man induced actions" and gačr project no. 103/03/1395 "reliability of local pressure measurements on models in a boundary layer wind tunnel", for which the authors would like to thank the company and the agencies.

references

[1] čsn p env 1991-1: basis of design and actions on structures. part 1: basis of design (in czech). czech standard institute, prague, 1996.
[2] čsn p env 1991-2-4: basis of design and actions on structures. part 2-4: actions on structures – wind actions (in czech). czech standard institute, prague, 1997.
[3] model code for steel chimneys. cicind, switzerland, 1988.
[4] commentaries for the model code for steel chimneys. cicind, switzerland, 1989.
[5] tichý, m. et al.: load of building structures (in czech). tp 45, prague: sntl, 1987.

doc. ing. daniel makovička, drsc., phone: +420 224 353 856, fax: +420 224 353 511, e-mail: makovic@klok.cvut.cz
ing. jaromír král, csc., phone: +420 224 353 544, e-mail: jkral@klok.cvut.cz
klokner institute, czech technical university in prague, šolínova 7, 166 08 prague 6, czech republic
ing. daniel makovička, static and dynamic consulting, šultysova 167, 284 01 kutná hora, czech republic

1 introduction

modern footbridges are usually slender, light-weight structures, frequently of unusual structural systems, e.g. of the stressed ribbon, suspended or cable-stayed types. if such footbridges are designed for static loads only, they may be susceptible to vertical as well as to horizontal vibrations; hence a dynamic design is often necessary. rhythmical human body motion, e.g. walking, running or jumping, can cause heavy vibrations of structures. there have been several accidents in dance halls, grandstands and on footbridges caused by marching, dancing or applauding people.
in recent years there have been examples of footbridges that have proved to be unacceptably lively for pedestrians, the latest case being the millennium bridge in london. a cable-stayed footbridge with a prestressed concrete deck was designed over the main road in ústí nad labem in north bohemia by súdop praha [1]. in plan, the structure consists of a y form and is suspended on two pylons. an artistic view of the structure is shown in fig. 1. the spans of the main bridge are 26.1 m + 44.7 m + 17.1 m, the curved pavement ramp is 24.8 m – see fig. 2. the height of the i pylon is 14 m, while the height of the h pylon is 17 m.

the effect of pedestrian traffic on the dynamic behavior of footbridges
m. studničková

the dynamic response of a footbridge depends namely on the natural frequencies of the structure in vertical, in horizontal and in torsion. if any of the frequencies in vertical is in the range 1.0 hz to 3.0 hz, the dynamic response from moving people can be significant. in this case it is necessary to calculate the vibrations taking into account both the serviceability and the ultimate limit states. the same problem arises when any of the frequencies in horizontal (transversal) or in torsion is in the range 0.5 hz to 1.5 hz. such frequencies are found namely in footbridges with larger spans or in cable-stayed and suspension footbridges. a unique cable-stayed footbridge with a prestressed concrete deck was dynamically analyzed and the dynamic response to simulated pedestrian loading was calculated. the calculated effects were compared with the pedestrian comfort criteria for the serviceability limit states. these criteria are defined in terms of the maximum acceptable acceleration of the bridge deck.

keywords: vibrations, footbridge, dynamic actions due to pedestrians, acceptance criteria, response, serviceability.

fig. 1: artistic view of the footbridge
fig. 2: section of the footbridge

2 model of the structure

the computational model of a footbridge for dynamic analysis usually consists of truss, beam and 2d elements. the correct results of the eigenvalue analysis are strongly dependent on the boundary conditions; their implementation into the calculation must be carefully considered. the computational model of the bridge is shown in fig. 3.

fig. 3: computational model of the structure

3 dynamic analysis

the dynamic analysis consists of a computational eigenvalue analysis and of the analysis of the response to the time-dependent loads caused by pedestrians. the damping of the analysed footbridge was considered by rayleigh's damping proportional to mass and stiffness, with the values of the coefficients corresponding to the logarithmic damping decrement $\delta = 0.02$. the results of the eigenvalue analysis – the five lowest frequencies – are summarised in table 1, and the corresponding mode shapes are shown in figs. 4–7.

table 1: natural frequencies of the footbridge

  mode   frequency fi [hz]   shape identifier
  1      0.967               superstructure horizontal longitudinal
  2      1.485               girder vertical bending
  3      1.985               girder combined vertical bending + horizontal transverse
  4      2.303               girder vertical bending + pylon h horizontal
  5      2.808               girder vertical bending + pylon h horizontal

fig. 4: 1st mode of the bridge, f1 = 0.967 hz
fig. 5: 2nd mode of the bridge, f2 = 1.485 hz
fig. 6: 3rd mode of the bridge, f3 = 1.985 hz
fig. 7: 4th mode of the bridge, f4 = 2.303 hz

4 acceptance criteria

the measured or calculated vibration amplitudes of a footbridge must be compared with the acceptance criteria. the acceptance criteria are frequency-dependent and in general they are given in units of acceleration. in the case of vertical vibrations, maximum acceleration amplitudes of 0.5 m·s⁻² to 1.0 m·s⁻², i.e. 5 % to 10 % of the gravity g, may be accepted – see fig. 8. some countries have given considerable attention to the specification of tolerable vibration levels on bridges and footbridges, and the ascertained criteria have been incorporated into the design standards.
fig. 9 shows the criteria of acceptability of vertical vibrations above 1 hz given in the standards of the united kingdom (bs 5400, 1978), of canada (obdc, 1983) and in the international iso standards (iso 2631). experience has shown that most users tolerate even slightly higher values than those pertaining to the hatched part of fig. 9. people are much more sensitive to horizontal vibrations when walking or running than to vertical vibrations; therefore, an acceptance value of 1 % to 2 % in horizontal is recommended. the available sources provide very little data; nevertheless, an approximate criterion can be proposed – see fig. 10.

fig. 8: criterion for vertical accelerations
fig. 9: acceptable level of vertical acceleration

5 dynamic forces induced by pedestrians

moving people excite a footbridge in vertical, in horizontal (longitudinally or transversally) and in torsion. the response of a footbridge depends namely on the pacing frequency (walking, running, jumping), the time function of the vertical and horizontal dynamic action, the number of persons involved, and the dynamic characteristics of the footbridge. the action force $F_p(t)$ due to a single person can be expressed with sufficient accuracy as the sum of the static force (the weight of the person) and the first three harmonic components of the excitation force [3]:

$$F_p(t) = G + G_1 \sin(2\pi f_p t) + G_2 \sin(4\pi f_p t - \varphi_2) + G_3 \sin(6\pi f_p t - \varphi_3), \qquad (1)$$

where
G … weight of a person (usually $G = 800$ n),
$G_1$ … load amplitude of the first harmonic component,
$G_2$ … load amplitude of the second harmonic component,
$G_3$ … load amplitude of the third harmonic component,
$f_p$ … pace frequency,
$\varphi_2$ … phase shift between the first and the second harmonic component,
$\varphi_3$ … phase shift between the first and the third harmonic component.

the phase shifts $\varphi_i$ can be introduced approximately with the values $\varphi_2 = \varphi_3 = \pi/2$. for standard walking, the frequency of 2 hz is most frequent. the results of a number of measurements by a number of authors are presented in [3], with the conclusion that the typical pace frequency of ordinary people is subject to a gaussian distribution with a mean value of $f_p = 2$ hz and a standard deviation of $\sigma_f = 0.15$ hz. during running at the double, the frequency fluctuates within the limits of 2.4 hz and 2.7 hz; during sprinting it can be as high as 5 hz. however, pace frequencies higher than 3.5 hz are rare on footbridges. vandals try to tune the structure to one (most frequently the lowest) natural frequency within the limits of 0.5 hz and 4.5 hz. such cases do not involve merely excitation by footsteps, but also various methods of periodic force excitation with the objective of making the footbridge vibrate with the greatest possible intensity.
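the force model of eq. (1) is easy to evaluate numerically; the following minimal python sketch does so, with the caveat that the amplitudes G1–G3 below are illustrative placeholders (commonly quoted values for walking near 2 hz), not values taken from this paper:

```python
# a minimal sketch evaluating the single-person action force of eq. (1).
import math

def pedestrian_force(t, fp=2.0, G=800.0,
                     G1=0.4 * 800, G2=0.1 * 800, G3=0.1 * 800,  # assumed amplitudes
                     phi2=math.pi / 2, phi3=math.pi / 2):
    """F_p(t) = G + G1 sin(2πfp t) + G2 sin(4πfp t − φ2) + G3 sin(6πfp t − φ3)."""
    return (G
            + G1 * math.sin(2 * math.pi * fp * t)
            + G2 * math.sin(4 * math.pi * fp * t - phi2)
            + G3 * math.sin(6 * math.pi * fp * t - phi3))

# peak force over one pacing period (0.5 s at fp = 2 Hz):
print(max(pedestrian_force(i / 400) for i in range(200)))
```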
in practical cases the dynamic forces due to moving people can be simplified, and it is considered that only the resonant part of the dynamic action excites the bridge (e.g. [4]). in this case the concentrated dynamic action of a group of pedestrians can be expressed in the form

for vertical vibrations
$$F_{pv}(t) = 280\, k_v(f_v) \sin(2\pi f_v t) \;[\text{n}], \qquad (2)$$

for horizontal vibrations
$$F_{ph}(t) = 70\, k_h(f_h) \sin(2\pi f_h t) \;[\text{n}], \qquad (3)$$

where $f_v$ is the bending natural frequency of the bridge in vertical closest to 2.0 hz, $f_h$ is the bending natural frequency of the bridge in horizontal closest to 1.0 hz, and $k_v(f_v)$, $k_h(f_h)$ are the magnifying factors given in fig. 11. the forces (2) and (3) are applied at the location of the maximum displacement of the natural mode.

for long and wide footbridges the dynamic load model of a continuous stream of pedestrians is used. this model consists of a uniformly distributed pulsating load acting in the vertical or (separately) in the horizontal direction. this load should be applied on the relevant areas of the footbridge deck (e.g. span by span or on the half-wavelength of the mode of vibration under consideration), for the verification of the specified comfort criteria – see figs. 8–10 – as well as for an assessment of the inertia effects, in order to obtain the most unfavorable effect. the load model of a continuous stream of pedestrians can be expressed in the form

for vertical vibrations
$$q_{sv} = 12.6\, k_v(f_v) \sin(2\pi f_v t) \;[\text{n}\cdot\text{m}^{-2}], \qquad (4)$$

for horizontal vibrations
$$q_{sh} = 12.6\, k_h(f_h) \sin(2\pi f_h t) \;[\text{n}\cdot\text{m}^{-2}], \qquad (5)$$

where all the symbols are as already defined.

6 dynamic effects of pedestrian loading

the lowest natural frequency in vertical bending is $f = 1.48$ hz, with the greatest amplitudes on the curved ramp. this frequency is lower than 1.6 hz and the dynamic response to pedestrians does not need to be checked. the third natural frequency of the bridge, $f = 1.98$ hz, belongs to the combined natural mode (vertical + horizontal) and its value is the same as the most frequent pace frequency of 2 hz. the dynamic force of a group of pedestrians given by (2) was applied in the forced vibration calculation. the maximum calculated value of the acceleration response is $a_{\max,v} = 0.01$ m·s⁻². according to figs. 8 and 9, the limit value for the acceleration is about 0.7 m·s⁻². the calculated value of the acceleration is very small and the pedestrians do not threaten the footbridge. the footbridge is very weighty, and walking people are not able to bring it into vibration. even the continuous stream of pedestrians modeled by (4) does not give rise to larger vibrations.

fig. 10: proposal of a criterion for horizontal transverse vibrations

7 conclusion

modern footbridges have low damping and relatively small stiffness and are susceptible to vibrations generated by pedestrians. footbridges with low damping (logarithmic damping decrement $\delta = 0.03$) and a natural frequency in vertical bending within the limits of 1.6 hz and 2.4 hz or 3.3 hz and 4.5 hz (the regions of the first and second harmonic of the pace frequency) usually react to pedestrian traffic with a significant response. the calculation of the natural vibrations (natural frequencies and modes) must be performed carefully, using approved software. the correct results of the dynamic analysis are strongly dependent on the boundary conditions. if the designer has little experience, a specialist's advice is recommended.
the response of a footbridge to pedestrian actions should be computed in the design stage. if the computed response is higher than provided by the criteria – figs. 8, 9, 10 – it is advisable to change the structural system, to increase the damping and stiffness, or to give consideration to the application of a vibration absorber. the definitive vibration absorber tuning is then based on the dynamic footbridge characteristics ascertained by measurements of the behavior of the erected structure. this paper shows that a detailed dynamic analysis in the design stage enables the susceptibility of a footbridge to be determined. the structural system can be modified, or the material can be changed, thus ensuring that the design of the footbridge will comply with the criteria of the serviceability and the ultimate limit states without additional measures.

8 acknowledgment

this research has been supported by grant no. 103/03/0082 "nonlinear response of structures under accidental actions and pedestrian dynamic actions" of the czech grant agency, and by the ministry of education and youth research project j04/98/210000029 "risk engineering and reliability of technical systems".

references

[1] súdop praha: project so a 252, the footbridge over žižkova street. prague, june 2002.
[2] studničková, m.: "the assessment of footbridge vibrations according to eurocodes" (in czech). stavební obzor 4 (2001), pp. 296–299.
[3] bachmann, h., amman, w.: "vibrations in structures induced by man and machines". iabse structural engineering document 3e, 1987.
[4] pren 1991-2: actions on structures – part 2: "traffic loads on bridges". cen, july 2001.

dr. marie studničková, phone: +420 224 353 503, fax: +420 224 353 511, e-mail: studnic@klok.cvut.cz
czech technical university in prague, klokner institute, šolínova 7, 166 08 prague 6, czech republic

fig. 11: factors a) kv(fv), b) kh(fh)
1 introduction

if a rotating body is heated, its internal energy increases and the concurrent temperature increase is followed by thermal expansion. this results in an increase in the moment of inertia of the body. when no external torque affects the body, then, according to the principle of conservation of angular momentum, the angular momentum is constant [1]. since the angular momentum of the body is proportional to the moment of inertia and to the angular velocity [1], the increase in the moment of inertia is followed by a reduction in the angular velocity. the final effect is a decrease in the kinetic energy of the rotating body, which is proportional to the moment of inertia and to the square of the angular velocity. (an example of a converse effect is the increase of the angular velocity of a "black hole" during its gravitational collapse.) according to the energy conservation law, the decay of the kinetic energy must convert into another form of energy; only thermal energy and the deformation energy originating from the centrifugal forces come into consideration. however, as proved by the following mathematical calculation, the deformation energy also decreases with thermal expansion. this means that the deficit of the kinetic energy and of the deformation energy converts into an increase in the internal energy of the body. in sum, if a rotating body is heated, the increase in its internal energy is higher than if the body is at rest. the "additional" increase in the internal energy of a rotating body appears to the detriment of its kinetic and deformation energy. in the following study the case of heating or cooling a rotating body is treated mathematically, and the trend of the quantities characterising the effect is demonstrated for concrete parameters.

2 assumptions

it is supposed that the body, rotating with the angular velocity $\omega_0$ (the subscript "0" always means the initial value of a variable), is a cylinder with radius $r_0$, mass $m$ and height $h_0$, which is small in comparison with the radius $r_0$. the axis of rotation and the geometric axis of the cylinder are identical. the density of the cylinder material is $\rho_0$, young's modulus $E$, poisson's number $\nu$, the specific heat capacity $c$, and the coefficient of thermal expansion $\alpha$ (fig. 1). the quantities $E$, $c$, $\alpha$ are constant in the considered range of temperatures (approximately 300–1000 k) and the tension stress in the material does not exceed the proportional limit. no external forces affect the body and no uncontrolled heat transfer occurs between the cylinder and the environment (fig. 2). for the parameters considered, the elastic deformation of the body originating from the centrifugal forces is very small in comparison with the thermal expansion, and therefore its dependence on the angular velocity is neglected.

3 mathematical treatment

it is supposed that a quantity of heat $\Delta Q$ is transferred into or out of the rotating body (for instance by radiation, see fig. 2). a mathematical description of the consecutive behaviour of the body follows.

heating and cooling anomaly of a rotating body
o. brůha, t. brůha
this paper deals with an effect which appears when heating or cooling a rotating body. no external forces acting on the body are supposed. due to thermal expansion, the moment of inertia of the body varies together with the temperature changes. in agreement with the principle of conservation of angular momentum [1], the angular momentum is constant. this results in changes of the angular velocity and subsequently in changes of the kinetic energy. the stress energy also varies together with the changes of the thermal dimensions. to satisfy the principle of energy conservation we have to suppose that the changes in the kinetic and stress energy are compensated by the changes in the internal energy, which is correlated with the temperature changes of the body. this means that the rules for the heating or cooling process of a rotating body are not the same as those for a body at rest. this idea, applied to a cylinder rotating around its geometric axis under specific parameters, has been treated mathematically. as a result, the difference between the final temperature of the rotating cylinder and the temperature of the cylinder at rest has been found.

keywords: angular momentum, energy conservation, rotating body, thermal expansion.

fig. 1: a rotating cylinder and its parameters ($\omega$, $c$, $E$, $\nu$, $\rho$, $\alpha$, $r$, $h$)

basic relations

moment of inertia of the cylinder:
$$J = \tfrac{1}{2} m r^2. \qquad (1)$$

rotational kinetic energy of the cylinder:
$$E_k = \tfrac{1}{2} J \omega^2. \qquad (2)$$

angular momentum (if the external torque is zero) [1], [2]:
$$b = J\omega = \text{const.} \qquad (3)$$

the linear coefficient of thermal expansion is defined generally as
$$\alpha = \frac{1}{x_0}\,\frac{\mathrm{d}x}{\mathrm{d}t}, \qquad (4)$$
where $x_0$ is the initial length and $\mathrm{d}t$ is the temperature change. for better mathematical access another definition was used, namely
$$\alpha = \frac{1}{x}\,\frac{\mathrm{d}x}{\mathrm{d}t}, \qquad (4a)$$
where $x$ is variable. relations (4) and (4a) result in practically the same values for the parameters used. (for $\Delta t = 700$ k and $\alpha = 16 \times 10^{-6}$ k$^{-1}$, the difference in $\Delta x$ calculated according to (4a) or according to (4) is only $0.0089\,\Delta x$.)

using relations (1) and (2) we can express the rotational kinetic energy of the cylinder (see fig. 1):
$$E_k = \tfrac{1}{4}\,\pi \rho h r^4 \omega^2. \qquad (5)$$

for $r(t)$ and $h(t)$ as functions of temperature we obtain from definition (4a):
$$r(t) = r_0\, e^{\alpha \Delta t}, \qquad (6)$$
$$h(t) = h_0\, e^{\alpha \Delta t}, \qquad (7)$$
where $\Delta t = t - t_0$. the density $\rho(t)$ and the angular velocity $\omega(t)$ as functions of temperature can be expressed in a similar way (see appendix, items 1 and 2):
$$\rho(t) = \rho_0\, e^{-3\alpha \Delta t}, \qquad (8)$$
$$\omega(t) = \omega_0\, e^{-2\alpha \Delta t}. \qquad (9)$$

substituting relations (6), (7), (8) and (9) in (5) we obtain an expression for the rotational kinetic energy as a function of temperature:
$$E_k(t) = \tfrac{1}{4}\,\pi \rho_0 h_0 r_0^4 \omega_0^2\, e^{-2\alpha \Delta t} = E_{k0}\, e^{-2\alpha \Delta t}. \qquad (10)$$

the deformation energy should also be taken into consideration. the total deformation energy corresponding to the centrifugal forces in the rotating cylinder is (see appendix, item 3)
$$E_\sigma = \frac{\pi \rho^2 \omega^4 r^6 h}{96\,E}\left(7 - 6\nu - \nu^2\right). \qquad (11)$$

substituting relations (6), (7), (8) and (9) in (11) we obtain the deformation energy as a function of temperature:
$$E_\sigma(t) = \frac{\pi \rho_0^2 \omega_0^4 r_0^6 h_0}{96\,E}\left(7 - 6\nu - \nu^2\right) e^{-7\alpha \Delta t} = E_{\sigma 0}\, e^{-7\alpha \Delta t}. \qquad (12)$$

the total mechanical energy is
$$E_\Sigma(t) = E_k(t) + E_\sigma(t). \qquad (13)$$
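the scaling relations (6)–(10) can be checked directly in a few lines of python; this minimal sketch uses the steel-cylinder parameters listed later in the text and is only an illustration of the relations, not part of the original calculation:

```python
# a minimal sketch of relations (6)-(10): thermal scaling of radius, height,
# density, angular velocity and kinetic energy with exp(±α·Δt).
import math

ALPHA = 16e-6            # coefficient of thermal expansion [1/K]

def scaled_state(r0, h0, rho0, omega0, dT):
    s = math.exp(ALPHA * dT)
    r, h = r0 * s, h0 * s                  # (6), (7)
    rho = rho0 * s ** -3                   # (8)
    omega = omega0 * s ** -2               # (9): angular momentum conserved
    Ek = 0.25 * math.pi * rho * h * r ** 4 * omega ** 2   # (5)
    return r, h, rho, omega, Ek

r0, h0, rho0, w0 = 1.0, 0.1, 7800.0, 2 * math.pi * 100
Ek0 = 0.25 * math.pi * rho0 * h0 * r0 ** 4 * w0 ** 2
_, _, _, _, Ek = scaled_state(r0, h0, rho0, w0, dT=700.0)
print(Ek / Ek0)          # ≈ exp(−2·α·700) ≈ 0.978, i.e. eq. (10)
```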
fig. 2: an illustration of an isolated rotating body (the cylinder in a vacuum; the heat $\Delta Q$ transferred by radiation; $\Delta t = t - t_0$)

using relations (10) and (12), the change in the total mechanical energy as a function of temperature can be expressed as
$$\Delta E_\Sigma(\Delta t) = \Delta E_k + \Delta E_\sigma = E_{k0}\left(e^{-2\alpha \Delta t} - 1\right) + E_{\sigma 0}\left(e^{-7\alpha \Delta t} - 1\right). \qquad (14)$$

now we suppose that a quantity of heat $\Delta Q$ is added to (or taken away from) the rotating cylinder, and we will try to express the corresponding change in its temperature $\Delta t$ according to equation (14). after the thermal field in the cylinder reaches the steady state, the angular velocity of the cylinder is again constant; there are no more reasons for changes of it. this means that the cylinder is in thermodynamic equilibrium both at the beginning of the process and at the end [4]. consequently, these two states must satisfy the principle of conservation of energy. taking into account that not only the mechanical energy but also the change in the internal energy $\Delta U$ has to be considered, the law of conservation of energy can be written as
$$\Delta Q = \Delta U + \Delta E_\Sigma, \qquad (15)$$
and
$$\Delta U = \Delta Q - \Delta E_\Sigma. \qquad (16)$$

for a body at rest, it is generally supposed that
$$\Delta U_0 = \Delta Q. \qquad (17)$$

this means that the change in the internal energy, originating from the transferred heat $\Delta Q$, is not the same for a rotating body and for the same body at rest. there exists the difference
$$\Delta(\Delta U) = \Delta U - \Delta U_0 = \left(\Delta Q - \Delta E_\Sigma\right) - \Delta Q = -\Delta E_\Sigma. \qquad (18)$$

it is well known that a change in the internal energy is correlated with a temperature change of the body [1]. this means that the change in temperature corresponding to $\Delta Q$ also differs between a rotating body and a body at rest. let us accept the simplified presumption that the internal energy of a rigid body is entirely a function of the kinetic energy of the chaotic motion of all particles, and that an increase in it corresponds to an increase in the body temperature. then we obtain a simplified expression for the difference $\Delta^* t$:
$$\Delta^* t = \Delta t - \Delta t_s = \frac{\Delta(\Delta U)}{m c} \;[1], \qquad (19)$$
where $\Delta t$ is the change in temperature of the rotating body, $\Delta t_s$ is the change in temperature of the still body, $m$ is the mass and $c$ is the specific heat capacity of the body. equation (19) combined with (14) and (18) yields
$$\Delta^* t = \frac{E_{k0}\left(1 - e^{-2\alpha \Delta t}\right) + E_{\sigma 0}\left(1 - e^{-7\alpha \Delta t}\right)}{m c}. \qquad (20)$$

according to (19), $\Delta t = \Delta t_s + \Delta^* t$, and then (20) reads
$$\Delta^* t = \frac{E_{k0}\left(1 - e^{-2\alpha (\Delta t_s + \Delta^* t)}\right) + E_{\sigma 0}\left(1 - e^{-7\alpha (\Delta t_s + \Delta^* t)}\right)}{m c}, \qquad (21)$$
where $\Delta t_s$ is
$$\Delta t_s = \frac{\Delta Q}{m c} \;[1]. \qquad (22)$$

expression (21) is a transcendent equation for the unknown quantity $\Delta^* t$ and the given quantity $\Delta t_s$. this equation was solved with the help of the maple mathematical program, and the real solution
$$\Delta^* t = f(\Delta t_s) \qquad (23)$$
was found as the real root of equation (21) for each given $\Delta t_s$.

the solution (23) is illustrated in fig. 3 for the following parameters of the rotating cylinder.

fig. 3: curves $|\Delta^* t|$ versus $|\Delta t_s|$ as a solution of equation (23), for heating and for cooling of the body

material – steel:
$r_0 = 1$ m, $\omega_0 = 2\pi \cdot 100$ rad·s$^{-1}$, $h_0 = 0.1$ m, $\rho_0 = 7800$ kg·m$^{-3}$ [2], $E = 2 \times 10^{11}$ pa, $\alpha = 16 \times 10^{-6}$ k$^{-1}$ [2], $c = 483$ j·kg$^{-1}$·k$^{-1}$ [2], $\nu = 0.29$ [3].
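the maple solution can be reproduced with any standard root finder; the following minimal python sketch (scipy assumed, bracket width chosen ad hoc) solves eq. (21) for the parameters above:

```python
# a minimal sketch reproducing the numerical solution of the transcendent
# equation (21) for dt* as a function of dt_s.
import math
from scipy.optimize import brentq

ALPHA, C = 16e-6, 483.0                    # [1/K], [J/(kg K)]
R0, H0, RHO0, E, NU = 1.0, 0.1, 7800.0, 2e11, 0.29
OMEGA0 = 2 * math.pi * 100                 # [rad/s]

M = RHO0 * math.pi * R0**2 * H0            # cylinder mass [kg]
EK0 = 0.25 * math.pi * RHO0 * H0 * R0**4 * OMEGA0**2                      # eq. (10)
ES0 = math.pi * RHO0**2 * OMEGA0**4 * R0**6 * H0 * (7 - 6*NU - NU**2) / (96 * E)  # eq. (11)

def residual(dt_star, dt_s):
    """left minus right side of eq. (21)."""
    dt = dt_s + dt_star
    rhs = (EK0 * (1 - math.exp(-2 * ALPHA * dt))
           + ES0 * (1 - math.exp(-7 * ALPHA * dt))) / (M * C)
    return dt_star - rhs

for dt_s in (700.0, -700.0):               # heating / cooling by 700 K
    root = brentq(residual, -50.0, 50.0, args=(dt_s,))
    print(f"dt_s = {dt_s:+.0f} K  ->  dt* = {root:+.2f} K")
# gives roughly +4.6 K for heating and -4.7 K for cooling, in line with fig. 3
```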
conclusions

fig. 3 shows the curves of $\Delta^* t$ versus $\Delta t_s$ (in absolute values) as a solution of equation (23), both for heating and for cooling the cylinder. for $\Delta t_{s,\max} = 700$ k, and for the parameters supposed, the value of $\Delta^* t$ is $\Delta^* t(700\,\text{k}) = 4.62$ k for heating and $\Delta^* t(700\,\text{k}) = 4.72$ k for cooling. the results show that the effect is more readable for cooling than for the heating process. this corresponds to the fact that for the heating process the temperature difference $\Delta^* t$ has the finite asymptotic value
$$\Delta^* t(\Delta t \to \infty) = \frac{E_{k0} + E_{\sigma 0}}{m c}$$
(follows from (21)). (this relation describes the theoretical extreme case when the whole initial kinetic and deformation energy converts into the internal energy and the cylinder "comes to a standstill".) in the case of cooling, the mathematical expression for $\Delta^* t$ has no limit.

in the study it is supposed that the coefficient of thermal expansion of the cylinder material is positive. if the coefficient of thermal expansion were negative, the heating of a rotating cylinder would result in an increase in the kinetic energy on account of the heat transferred. as materials with a negative coefficient of thermal expansion are very rare (for instance quenched irreversible fe-ni invar alloys), this case is not considered in detail in the study.

appendix

1. relation (8): from its definition, the density is given by
$$\rho = \frac{m}{V}, \qquad (a1)$$
and, according to fig. 1,
$$\rho = \frac{m}{\pi r^2 h}, \qquad (a2)$$
so that
$$\rho_0 = \frac{m}{\pi r_0^2 h_0}, \qquad \rho(t) = \frac{m}{\pi r(t)^2 h(t)}, \qquad (a3)$$
and from relations (6), (7) it follows that
$$\rho(t) = \frac{m}{\pi r_0^2 e^{2\alpha\Delta t}\, h_0 e^{\alpha\Delta t}} = \rho_0\, e^{-3\alpha\Delta t}. \qquad (a4)$$

2. relation (9): according to relation (3),
$$J\omega = \text{const.}, \qquad (a5)$$
or
$$J_0 \omega_0 = J(t)\,\omega(t), \qquad (a6)$$
$$\omega(t) = \omega_0\,\frac{J_0}{J(t)}, \qquad (a7)$$
and from relations (1), (6) it follows that
$$\omega(t) = \omega_0\,\frac{\tfrac{1}{2} m r_0^2}{\tfrac{1}{2} m r_0^2 e^{2\alpha\Delta t}} = \omega_0\, e^{-2\alpha\Delta t}. \qquad (a8)$$

3. relation (11): note: since the height of the cylinder is small in comparison with its radius, the axial stress is neglected. the deformation energy $E_\sigma$ per volume unit corresponding to a tension stress $\sigma$ is
$$\frac{E_\sigma}{V} = \frac{\sigma^2}{2E} \;[2], \qquad (a9)$$
i.e.
$$E_\sigma = \frac{\sigma^2}{2E}\,V, \qquad (a10)$$
and in elementary form
$$\mathrm{d}E_\sigma = \frac{\sigma^2}{2E}\,\mathrm{d}V. \qquad (a11)$$

in the case that the height of the rotating cylinder is small in comparison with its radius, the radial and tangential stresses $\sigma_r$ and $\sigma_t$, originating from the centrifugal force, can be expressed as (see fig. 4)
$$\sigma_r(r_x) = \frac{3 + \nu}{8}\,\rho\,\omega^2\left(r^2 - r_x^2\right) \qquad (a12)$$
and
$$\sigma_t(r_x) = \frac{\rho\,\omega^2}{8}\left[(3 + \nu)\,r^2 - (1 + 3\nu)\,r_x^2\right] \;[3], \qquad (a13)$$
where $r_x$ is the radius at which the stress acts. the elementary deformation energy corresponding to the radial and tangential stress is
$$\mathrm{d}E_\sigma = \frac{\sigma_r^2(r_x) + \sigma_t^2(r_x) - 2\nu\,\sigma_r(r_x)\,\sigma_t(r_x)}{2E}\,\mathrm{d}V. \qquad (a14)$$

as shown in fig. 4,
$$\mathrm{d}V = \mathrm{d}l\,\mathrm{d}s = 2\pi r_x h\,\mathrm{d}r_x. \qquad (a15)$$

substituting relation (a15) in (a14) we obtain for the elementary deformation energy
$$\mathrm{d}E_\sigma = \frac{\sigma_r^2(r_x) + \sigma_t^2(r_x) - 2\nu\,\sigma_r(r_x)\,\sigma_t(r_x)}{2E}\,2\pi r_x h\,\mathrm{d}r_x. \qquad (a16)$$

substituting relations (a12), (a13) in (a16) and integrating, we obtain the integral deformation energy accumulated in the rotating cylinder:
$$E_\sigma = \int_0^r \mathrm{d}E_\sigma = \frac{\pi \rho^2 \omega^4 r^6 h}{96\,E}\left(7 - 6\nu - \nu^2\right). \qquad (a17)$$

references
[1] tipler, p. a.: physics. new york: worth publishers inc., 1982, pp. 80, 295–297, 495–496, 498–500.
[2] horák, z., krupka, f., šindelář, v.: technická fyzika. praha: sntl, 1960, pp. 187–188, 310, 1411.
[3] michalec, j. et al.: pružnost a pevnost ii. praha: vydavatelství čvut, 2001, pp. 78–82.
[4] landau, l. d., lifschitz, e. m.: lehrbuch der theoretischen physik v: statistische physik. berlin: akademie-verlag, 1966, kap. 2, § 10, pp. 38–40.

doc. ing. oldřich brůha, csc., phone: +420 602 458 314, fax: +420 251 561 115, e-mail: oldrich.bruha@volny.cz
department of physics, faculty of mechanical engineering, czech technical university in prague, technická 4, 166 07 praha 6, czech republic
ing. tomáš brůha, phone: +420 777 895 762, e-mail: enex@volny.cz
enex s.r.o., na míčánce 13, 160 00 praha 6, czech republic

fig. 4: centrifugal and stress forces in a rotating cylinder

acta polytechnica 59(6):554–559, 2019, doi:10.14311/ap.2019.59.0554, available online at https://ojs.cvut.cz/ojs/index.php/ap

applicability verification of autoform software for fem simulation of mechanical fixation of hemmed joints

lukáš chrášťanský (a, b, ∗), jan šanovec (a), yatna yuwana martawirya (c), michal valeš (a, b)
a: czech technical university in prague, faculty of mechanical engineering, department of manufacturing technology, technická 4, 166 07 prague 6, czech republic
b: škoda auto a.s., tř. václava klementa 869, 293 01 mladá boleslav, czech republic
c: itb bandung institut of technology, faculty of mechanical and aerospace engineering, labtek ii, 2nd floor, jl. e-itb / jl. ganesha 10, 40132 bandung, indonesia
∗ corresponding author: lukas.chrastansky@fs.cvut.cz

abstract. this paper deals with the problematics of the fixation of hemmed joints, especially with one specific mechanical method of ensuring the dimensional stability of a hemmed joint of car-body parts, such as doors, the bonnet and the trunk lid, during the production process in the automotive industry. it also evaluates simulations and verifies the possibility of using the autoform software for this problematics. in the paper, the possibility of using the law of similarity in the analysis of the compressibility of small parts in autoform is described.

keywords: hemming, fem simulation, autoform, car-body, automotive industry, the law of similarity.

1. introduction

car-body production is a difficult, complex task where many manufacturing technologies are used. hemming is a specific joining technology which is commonly used during car-body production in the automotive industry. one of the greatest challenges in the development of the car-body occurs during and after the hemming stage in the welding shop, where all the car-body parts are assembled before painting. car-body panel parts, such as doors, the trunk lid and the bonnet, are usually made from two main parts – an inner metal sheet and an outer metal sheet – which are joined by hemming. when joining two or more metal sheets together, the greatest challenge is to ensure the dimensional stability of the sheet metal parts, which can move against each other. this negative effect can occur during other stages of production, especially during installation, when the parts may not be correctly aligned.
due to the problems mentioned above, it is always necessary to use some method of fixation of the hemmed joint to provide the dimensional stability. this article deals with a specific fixation method of hemmed joints and with the verification of the applicability of the autoform software. this software is commonly used for fem simulations during the preparation stages of car-body parts production.

2. fixation of hemmed joints

the fixation of hemmed joints helps to stabilize the relative positions of the inner metal plate and the outer metal plate during the manufacturing process until the parts go through the paint shop, where the main stabilizing function is taken over by the cured adhesive in the hemmed joint, applied after the cataphoretic painting. the fixation method further serves to maintain the desired dimensions of the final part during the hardening of the adhesive, which may temporarily cause different stresses on the hemmed joint due to the heat. figure 1 is a representation of the possible movement of the inner sheet metal part against the outer part; the fixation is used to prevent such movement. [1, 2]

unfortunately, this specific problematic is not described much in technical articles. the current methods of fixation are often developed by automotive companies that keep their know-how through patents and patent applications. each of the different methods has its specific advantages and disadvantages; there are also differences in the applicability of the various methods, depending on the size, complexity and quality of the particular component. because of all these reasons, no single method can be determined as a universally suitable one. the most common methods can be divided into two main groups: methods based on a mechanical solution (some form of a mechanical lock) and methods which use some form of heat (brazing / resistance spot welding / adhesive curing). [3–8]

2.1. mechanical fixation of hemmed joint

this specific solution is intended to prevent the movement of the inner metal sheet against the outer metal sheet in both directions on the basis of a mechanical lock (see figure 2). by using a special punch and die, small grooves are stamped on the inner metal plate around the whole circumference of the component. on the flange of the outer panel part, special small protrusions are stamped. these protrusions, in combination with the grooves, form the mechanical lock after the hemming process by means of plastic deformation, the protrusion copying the shape of one of the grooves. this creates a strong joint with mechanical locking, which leads to the fixation of the mutual position of the joined parts during the hemming and the subsequent production process. [2, 9]

figure 1. possible movement of the inner part in hemmed joint.
figure 2. model of mechanical fixation.

3. description of experiment

this solution of the mechanical fixation is designed for use on steel metal sheets which are used for a car-body [10]. the specific material specifications are shown below in table 1 and table 2. different types of materials are used due to the different material requirements on real parts in practice. the chosen materials are typical for the inner and outer parts of a car-body in the automotive industry. the thickness of the sheet for both materials is 0.65 mm.

table 1: mechanical properties of the inner part material.

  material   re [mpa]   rm [mpa]   a80 [%]   r90 [–]   n90 [–]
  dx54d      120–220    260–350    36        1.60      0.18

table 2: mechanical properties of the outer part material.

  material   rp0.2 [mpa]   bh2 [mpa]   rm [mpa]   a80 [%]   r90 [–]   n90 [–]
  hx180bd    180–240       35          290–360    34        1.50      0.16
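as a side note on how such material parameters enter a forming simulation, the following hypothetical sketch evaluates a power-law (hollomon) hardening curve using the n90 value of dx54d from table 1; the strength coefficient K below is an assumed placeholder, not a value from the paper:

```python
# a hypothetical illustration: hollomon hardening sigma = K * eps^n with the
# strain-hardening exponent n90 of dx54d (table 1) and an assumed K.

N_90 = 0.18          # strain-hardening exponent of dx54d (table 1)
K_MPA = 500.0        # assumed strength coefficient [MPa] (placeholder)

def flow_stress(eps):
    """flow stress [MPa] for a true plastic strain eps."""
    return K_MPA * eps ** N_90

for eps in (0.02, 0.10, 0.20):
    print(f"eps = {eps:.2f}: sigma = {flow_stress(eps):.0f} MPa")
```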
to verify the applicability of the autoform software for a simulation of this problematics, it is necessary to prepare a simulation model with several operations (fig. 3). the first operation deals with the preparation of the inner part for hemming: it includes the stamping of the fixation element (grooving) on the inner part with the use of special tools, and it is necessary to export the simulation result, as it is used for the hemming. the preparation of the outer part includes two operations – flanging and stamping. the flanging consists in bending the sheet edge to an angle approximately equal to 90°. after the flanging, the second step follows – the stamping of the protrusions on the flange of the outer part. the table hemming operation is prepared in the same simulation task, so it is necessary to import the inner part and position it on the prepared outer part.

figure 3. process scheme.

as the last step before the simulation, it is necessary to set up the calculation method in the autoform software; the specific calculation method can influence the accuracy of the whole result. in this case, for the first operation (stamping on the inner part), 5 strategies were set up – ce (concept evaluation), ce+, fv (final validation), ud1 and ud2 (user-defined). all of the simulation strategy settings are shown in table 3.

table 3: simulation strategy settings.

  strategy                            ce      ce+     fv      ud1     ud2
  tolerances and settings
    stitching distance [mm]           0.5     0.5     0.5     0.5     0.5
    meshing tolerance [mm]            0.10    0.05    0.05    0.05    0.05
    max side length [mm]              50.0    30.0    10.0    10.0    10.0
  calculation settings
    radius penetration [mm]           0.22    0.22    0.22    0.15    0.09
    max element angle [°]             30      30      30      22.5    22.5
    max refinement level [–]          7       7       6       10      12
    max element size [mm]             40.0    20.0    10.0    2.0     1.0
    min element size [mm]             0.31    0.16    0.16    0.00    0.00
    number of initial elements [–]    8       29      114     2830    11317
    data size [mb]                    42.134  42.139  45.027  49.894  70.804

on the basis of the comparison of the simulation results, where tools with the original dimensions were used, it is clear that the autoform r6 software has difficulties resulting in an inaccurate calculation of the numerical simulation for the ce and ce+ strategies. according to figure 4, the quality of the output is not optimal, especially due to the specific settings: the mesh elements are too large for the ce and ce+ strategies, the tool penetrates into the sheet, and the sheet is not formed correctly.

figure 4. simulation result from ce+ strategy.

the rest of the simulations (fv, ud1 and ud2) give much better graphical results, where the grooving profile is nicely formed. however, with regard to the fld, it can be concluded that continuing the simulations with the original dimensions will not yield the required results. if the current simulation result were considered as the final one, it would lead to the conclusion that it is not possible to produce this grooving profile by the stamping process, due to the defects detected in the forming limit diagram (fld) – see figure 5; if these results were taken as relevant, this solution would not be used in practice. the autoform software is mainly used for simulations of large car-body parts in the automotive industry; a verification of such data can be done, for example, by the circular grid method [5]. small and sharp geometries are difficult to simulate with the use of the autoform software, as the used simulation model and material model cannot give relevant results. therefore, another method must be used to design the methodology of the numerical simulation of the grooving stamping in order to get relevant results. a possible solution for the trouble mentioned above is applying one of the fundamental laws of plastic deformation, namely the "law of similarity". [2, 11, 12]

4. simulation optimization

as mentioned above, a possible solution is the use of the law of similarity. for its application, it is required to follow the definition of the law; the whole definition can be found in the study texts [2, 12]. in this case, the same conditions are used to keep the mechanical and physical similarity. for the geometrical similarity, it is necessary to modify the input parameters of the numerical simulation; here, the tool dimensions are increased in the ratio of 10 : 1 in comparison with the original dimensions, including the thickness of the used metal sheet. the simulation strategy is based on the fv (final validation) settings (see table 3).
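the geometric side of this scaling can be illustrated with a short hypothetical sketch; the groove width and punch radius below are assumed values added only to show that the dimensionless ratios driving the forming result are preserved when every linear dimension is scaled by the same factor:

```python
# a hypothetical helper illustrating the geometric part of the law of
# similarity as used here: all linear dimensions (tool, blank, thickness)
# are scaled by one factor, so dimensionless ratios are preserved.

SCALE = 10.0   # the 10 : 1 ratio used in the paper

def scale_geometry(dims_mm, k=SCALE):
    """scale every linear dimension [mm] by the factor k."""
    return {name: value * k for name, value in dims_mm.items()}

original = {"sheet_thickness": 0.65, "groove_width": 1.2, "punch_radius": 0.3}
scaled = scale_geometry(original)            # groove_width etc. are assumptions
ratio0 = original["groove_width"] / original["sheet_thickness"]
ratio1 = scaled["groove_width"] / scaled["sheet_thickness"]
print(scaled, ratio0 == ratio1)              # the dimensionless ratio is unchanged
```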
for geometrical similarity, it is necessary to modify the input parameters of the numerical simulation, in this case, the tool dimensions are increased in comparison to the original dimensions in the ratio of 10 : 1, including the thickness of the used metal sheet. simulation strategy is based on fv (final validation) settings (see table 3). 555 l. chrášťanský, j. šanovec, y. y. martawirya, m. valeš acta polytechnica figure 2. model of mechanical fixation. figure 3. process scheme. material re [mpa] rm [mpa] a80 [%] r90 [–] n90 [–] dx54d 120–220 260–350 36 1.60 0.18 table 1. mechanical properties of the inner part material. material rp 0.2 [mpa] bh2 [mpa] rm [mpa] a80 [%] r90 [–] n90 [–] hx180bd 180–240 35 290–360 34 1.50 0.16 table 2. mechanical properties of the outer part material. strategy ce ce+ fv ud1 ud2 tolerances and settings stitching distance [mm] 0.5 0.5 0.5 0. 0.5 meshing tolerance [mm] 0.10 0.05 0.05 0.05 0.05 max side length [mm] 50.0 30.0 10.0 10.0 10.0 calculation settings radius penetration [mm] 0.22 0.22 0.22 0.15 0.09 max element angle [°] 30° 30° 30° 22.5° 22.5° max refinement level [–] 7 7 6 10 12 max element size [mm] 40.0 20.0 10.0 2.0 1.0 min element size [mm] 0.31 0.16 0.16 0.00 0.00 number of initial elements [–] 8 29 114 2 830 11 317 data size [mb] 42.134 42.139 45.027 49.894 70.804 table 3. simulation strategy settings. figure 4. simulation result from ce+ strategy. 556 vol. 59 no. 6/2019 applicability verification of autoform software for fem simulation. . . figure 5. simulation results from autoform r6 for ud2 strategy (a – width of profile in mm; b – isometric view; c – fld) [2]. figure 6. simulation results from autoform r6 after applying the law of similarity (a – width of profile in mm; b – isometric view; c – fld) [2]. figure 7. experimental sample of grooving. 557 l. chrášťanský, j. šanovec, y. y. martawirya, m. valeš acta polytechnica figure 8. assembly of the inner part and outer part with fixation elements prepared for hemming. figure 9. cross-section view of hemming simulation result after applying the law of similarity (material thickness is 6.5 mm). the results from the fem simulation show that it is possible to use the law in experimental practice. after comparing the result of the simulation (see figure 6a and 6b), where the ratio of dimensions 10 : 1 to original dimensions was used, with the real experimental sample (see figure 7), it is obvious that the results from the fem simulation are correct. the correct results mean an unbroken output from the simulation with the smooth shape of grooving without a material failure. in the fld, the excess thinning is located, however, on the real experimental sample, the excess thinning is not located and the result is acceptable, so the application of the law is used for the experiment of complete hemming stage (see below). [2] as it was mentioned in the description of the experiment, the whole simulation process follows the proposed operation steps. the whole simulation of the fixation of the hemmed joint is made for one fixation element on the experimental sample after the application of the law of similarity. the assembly of the flanged outer sheet with a stamped protrusion and inner sheet with the stamped grooving profile is shown in figure 8. this assembly is prepared for the last operation – hemming. 5. discussion results from the simulation can be compared to the real experimental sample (figure 10). 
figure 10. cross-sectional view of the real hemmed sample with original dimensions (material thickness is 0.65 mm).

the samples were made from real metal sheets of the same material as that used for the simulations. from the obtained results it follows that the law of similarity can be applied for a complete simulation of the hemming with a mechanical fixation. before the application of the law, several simulations were made to obtain a comparison of the accuracy of the autoform software. when the correct mesh and calculation settings were set up, it was clear that the used simulation software has difficulties giving proper results. the fld in figure 5c represents the course of the stamping of the grooving profile with the original dimensions, while the diagram in figure 6c shows the course of the grooving profile stamping after applying the law of similarity, where the original tool and blank dimensions are increased by the ratio of 10 : 1. this confirms the theory behind the application of the law of similarity: the simulation software, despite a precise setting, cannot give usable results for the first simulation, i.e. before applying the law of similarity. the simulation of the original size of the grooves was calculated without any numerical problems; however, in practice this solution would be unacceptable due to the predicted material failure. the simulation software used is primarily designed for large parts, and in this experiment it was found to be unable to characterize the material behaviour in detail during the stamping of small and sharp dimensions. it follows from the above conclusions that, when solving the problems of stamping and drawing of sharp and small geometries, it is always appropriate to perform a simulation of the corresponding process not only for the original dimensions, but it is advisable to check the process by a simulation after the application of the law of similarity and then compare both results. if both fld trends match and at least one of the simulations finishes with acceptable results, that is, without a failure of the material, it can be said that the geometry is manufacturable.

6. conclusion

as is known, numerical simulations are widely used in the process of the design for manufacturing of car-body parts and also in the design of stamping tools. although the fem simulations of hemming have been increasing in the last years, this field is not much explored yet. in this respect, the simulation software is designed only for the basic and most commonly used operations; unfortunately, the fem software does not cover the solving of complex tasks for hemming. one of the main objectives of this article is to verify the applicability of the autoform r6 software for the fixation of hemmed joints. based on the results, it is possible to say that the current version of the software can be used, but only with a verification after the use of the law of similarity. for more detailed results and a final comparison, it is necessary to verify the same functionality with the newer versions of the autoform software – r7 and r8. due to the continuous development in the field of fem simulations, it can be expected that the newer versions will give more accurate results for this specific problematics.

acknowledgements

the research was supported by sgs16/217/ohk2/3t/12
"sustainable research and development in the field of manufacturing technology".

references

[1] chrášťanský, l., hálek, j., pačák, t., tatíček, f.: methodology for mechanical joining of sheets for designing the hemmed joints. in: 25th anniversary international conference on metallurgy and materials (metal 2016), pp. 345–349. tanger, ltd., ostrava, 2016.
[2] chrášťanský, l., šanovec, j., martawirya, y. y., valeš, m.: the law of similarity and its application for numerical simulation of sharp geometries stamping. in: 27th anniversary international conference on metallurgy and materials (metal 2018), pp. 378–383. tanger, ltd., ostrava, 2018.
[3] hasegawa, e., takeishi, nakamura: us 2008/0230588 a1, electromagnetic hemming machine and method for joining sheet metal layers. patent, honda motor co., ltd., tokyo, 2010.
[4] mcclure, j. l., ramalingam, s. c., lezotte, j. p.: us6927370b2, hemming working method and panel assembly manufacturing method. patent, daimlerchrysler corp., 2005.
[5] vanimisetti, s. k., raghavan, r., carsley, j. e.: us8341992b2, roller hemming with in-situ adhesive curing. patent, gm global technology operations llc, 2013.
[6] carsley, j. e., carlson, b. e., schroth, j. g., sigler, d. r.: us20120204412a1, method of joining by roller hemming and solid-state welding and system for same. patent, gm global technology operations llc, 2012.
[7] rintelmann, j., hans, b.: de102013013001 b3, a method of manufacturing a folding and glueing connection between sheet metal components and component assembly. patent, audi ag, 2013.
[8] polon, m., shelby, mich.: us5237734, method of interlocking hemmed together panels. patent, general motors corporation, detroit, 1993.
[9] gowling, p., korda, r., hradec, m., hálek, j.: wo/2017/140332, sheet metal part assembly. patent application, audi ag and škoda auto, 2017.
[10] čada, r.: formability of deep-drawing steel sheets. in: sarton, l. a. j. l., zeedijk, h. b. (eds.), proceedings of the 5th european conference on advanced materials and processes and applications (euromat 97): materials, functionality & design, volume 4: characterization and production/design, pp. 463–466. netherlands society for materials science, maastricht, 1997.
[11] makinouchi, a., teodosiu, c., nakagawa, t.: advance in fem simulation and its related technologies in sheet metal forming. cirp annals 47(2):641–649, 1998.
[12] čada, r.: technologie i: studijní opora. všb – technical university of ostrava, 2007.

acta polytechnica 60(1):65–72, 2020, doi:10.14311/ap.2020.60.0065, available online at https://ojs.cvut.cz/ojs/index.php/ap

energy gain from error-correcting coding in channels with grouping errors

alexandr kuznetsov (a, b, ∗), oleg oleshko (a), kateryna kuznetsova (a)
a: v. n. karazin kharkiv national university, svobody sq. 6, kharkiv, 61022, ukraine
b: jsc "institute of information technologies", bakulin st. 12, kharkiv, 61166, ukraine
∗ corresponding author: kuznetsov@karazin.ua

abstract. this article explores a mathematical model of a data transmission channel with the grouping of errors. we propose a method for estimating the energy gain from coding and the energy efficiency of binary codes in channels with grouped errors.
the proposed method uses a simplified bennet and froelich model and allows conducting research on the energy gain from coding for a wide class of data channels, without restricting the distribution of error-burst lengths. the reliability of the obtained results is confirmed by their agreement, in the simplified variant, with known results of the theory of error-correcting coding.

keywords: modelling of data transmission channels, grouping errors, coding theory, energy gain from coding, channels with grouped errors.

1. introduction

the usage of redundant codes to increase the reliability of the transmitted information requires the designer to take into account various factors, including the nature of the error distribution in the communication channel [1–9]. a detailed research of the statistical properties of error sequences in real channels has shown that the errors are dependent and have a tendency to group in bursts [1–3]. most of the time, the information is transferred via communication channels without any distortions. however, at any point of time, error condensations, so-called error bursts, can occur, inside of which the error probability is significantly higher than the average error probability calculated over a considerable transmission time. in such conditions, protection methods that are optimal under the independent-error hypothesis are absolutely ineffective in real communication channels [4–9]. it is necessary to use a scientific and methodological apparatus to estimate the error-correcting code efficiency correctly. it allows to describe the error behaviour in the communication channel and to develop practical recommendations on using error-correcting codes. one of the main criteria of the error-correcting code efficiency is the energy gain from coding (egc), which refers to the reduction of the minimum required ratio of signal energy to spectral power density of noise that the application of an error-correcting code allows while ensuring the given probability of erroneous reception of symbols [10–12]. today, the egc is calculated in the known manner for channels with independent errors. the current direction of research is to develop methods of estimating the energy gain from coding in channels with grouping errors.

2. model of channel with independent errors

let's consider the process of a code word decoding in conformity with the binary symmetrical channel model with an independent error distribution. let's suppose that transmission errors occur independently with a probability $p_0$. if $t$ is the number of errors corrected by an $(n,k,d)$-block code, $t = \lfloor (d-1)/2 \rfloor$, then the probability of erroneous decoding is calculated as
\[
P_{ed}(n) = 1 - \sum_{i=0}^{t} C_n^i\, p_0^i (1-p_0)^{n-i} - \sum_{i=t+1}^{n} u(i)\, p_0^i (1-p_0)^{n-i}, \tag{1}
\]
where $C_n^i$ is a binomial coefficient and $u(i)$ is the number of error vectors of weight $i$ fixed by the code. if the code corrects all errors within the packing radius of the code and does not fix other errors, then
\[
P_{ed}(n) = P(>t,n) = \sum_{i=t+1}^{n} C_n^i\, p_0^i (1-p_0)^{n-i} = 1 - \sum_{i=0}^{t} C_n^i\, p_0^i (1-p_0)^{n-i}. \tag{2}
\]
to recalculate the decoding error probability per character, $P_{ed}$, we will use the expression [10–12]
\[
P_{ed} = \frac{d}{n}\, P_{ed}(n).
\]
after the substitution in (2), we obtain
\[
P_{ed} = \frac{d}{n} \sum_{i=t+1}^{n} C_n^i\, p_0^i (1-p_0)^{n-i}. \tag{3}
\]
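a minimal numeric sketch of eqs. (2)–(3), our illustration of the formulas rather than code from the paper (math.comb requires python 3.8+):

```python
# erroneous-decoding probability of an (n, k, d)-block code in a binary
# symmetric channel with independent errors, eqs. (2)-(3).
from math import comb

def ped_block(n, d, p0):
    """eq. (2): erroneous decoding of the whole block."""
    t = (d - 1) // 2
    return 1.0 - sum(comb(n, i) * p0**i * (1 - p0)**(n - i) for i in range(t + 1))

def ped_symbol(n, d, p0):
    """eq. (3): erroneous-decoding probability recalculated per symbol."""
    return d / n * ped_block(n, d, p0)

# example: the (15, 7, 5) bch code (table 1 below) at p0 = 1e-2
print(ped_symbol(15, 5, 1e-2))
```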
to calculate the egc, we will consider the dependence of the probability $p_0$ on the ratio of the signal energy to the spectral power density of noise [13]:
\[
p_0 = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{-\sqrt{E(1-B_s)/N_0}} \exp\!\left(-\frac{y^2}{2}\right) \mathrm{d}y = V\!\left(\sqrt{\frac{E(1-B_s)}{N_0}}\right), \tag{4}
\]
where $E$ is the signal energy, $N_0$ is the spectral power density of white noise, $B_s$ is the coefficient of mutual correlation between signals, and $V(x)$ is the error integral. thus, in the case of using a binary phase-shift keyed (psk) signal with a 180° phase manipulation, the coefficient of correlation is $B_s = -1$. the probability $p_0$ for such signals is determined by the expression
\[
p_0 = 0.5 \left( 1 - \Phi\!\left(\sqrt{\frac{2E}{N_0}}\right) \right) = 1 - \Phi'\!\left(\sqrt{\frac{2E}{N_0}}\right),
\]
where $\Phi$ and $\Phi'$ are tabulated functions that represent the probability integrals:
\[
\Phi(x) = \frac{2}{\sqrt{2\pi}} \int_0^x \exp\!\left(-\frac{y^2}{2}\right) \mathrm{d}y = 1 - \frac{2}{\sqrt{2\pi}} \int_{-\infty}^{-x} \exp\!\left(-\frac{y^2}{2}\right) \mathrm{d}y,
\]
\[
\Phi'(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} \exp\!\left(-\frac{y^2}{2}\right) \mathrm{d}y = 1 - \frac{1}{\sqrt{2\pi}} \int_{x}^{\infty} \exp\!\left(-\frac{y^2}{2}\right) \mathrm{d}y.
\]
the usage of $(n,k,d)$-block codes detecting and correcting errors leads to increasing the redundancy of the transmitted data. if we fix the message energy transmitted in the channel, then the energy per symbol is reduced proportionally to the introduced redundancy. to calculate the error probability per symbol at the output of the decoder according to the expression (3), considering the introduced redundancy, we decrease the ratio of the signal energy to the spectral power density of noise in the expression (4) by the factor $R = k/n$. the final expression for the error probability per symbol, using the error-correcting $(n,k,d)$-block code, is as follows:
\[
P_{ed} = \frac{d}{n} \sum_{i=t+1}^{n} C_n^i \left[ V\!\left(\sqrt{\frac{kE(1-B_s)}{nN_0}}\right) \right]^i \left[ 1 - V\!\left(\sqrt{\frac{kE(1-B_s)}{nN_0}}\right) \right]^{n-i}. \tag{5}
\]
for binary psk signals, we rewrite the last expression as follows:
\[
P_{ed} = \frac{d}{n} \sum_{i=t+1}^{n} C_n^i \left( 1 - \Phi'\!\left(\sqrt{\frac{2kE}{nN_0}}\right) \right)^i \left( \Phi'\!\left(\sqrt{\frac{2kE}{nN_0}}\right) \right)^{n-i}. \tag{6}
\]
let's fix the required error probability per character $p_d$ and calculate the required ratio $\gamma_1 = E/N_0$ at $p_0 = p_d$ according to the expression (4), and the ratio $\gamma_2 = E/N_0$ at $P_{ed} = p_d$ according to the expression (5). the difference between $\gamma_1$ and $\gamma_2$ gives the needed estimation of the egc. if the egc is positive (i.e., coding reduces the required $E/N_0$), then the usage of the error-correcting code leads to a gain; on the contrary, it is inappropriate to use the chosen $(n,k,d)$-block code with a negative egc. the values $\gamma_1$, $\gamma_2$ and the egc are usually given on a logarithmic scale.
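the egc procedure just described can be sketched numerically. in the sketch below, ped_symbol() is reused from the previous block; the bisection bounds and the sign convention (positive egc means coding saves energy) are our choices, not prescribed by the paper.

```python
from math import erfc, sqrt, log10

def p0_bpsk(ebn0_db):
    """bit-error probability of binary psk: p0 = Q(sqrt(2 E/N0))."""
    return 0.5 * erfc(sqrt(10 ** (ebn0_db / 10)))

def required_ebn0(pd, perr, lo=-5.0, hi=30.0, iters=60):
    """bisection for the smallest E/N0 [db] with perr(E/N0) <= pd;
    perr must be monotonically decreasing in E/N0."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        lo, hi = (lo, mid) if perr(mid) <= pd else (mid, hi)
    return hi

def egc_db(n, k, d, pd):
    """egc = (uncoded requirement) - (coded requirement), in db."""
    gamma_uncoded = required_ebn0(pd, p0_bpsk)
    redundancy_db = 10 * log10(k / n)   # energy per symbol drops by r = k/n
    gamma_coded = required_ebn0(
        pd, lambda x: ped_symbol(n, d, p0_bpsk(x + redundancy_db)))
    return gamma_uncoded - gamma_coded
```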
3. model of channel with grouping errors

a convenient tool to describe data transmission channels is bennet and froelich's mathematical model, which has no restrictions as to the distribution of error-burst lengths [1–9]. the main features of the bennet and froelich model [4] are:
• the constant $p_n$-probability that an error packet will start from a certain position;
• the independence of error packet occurrences;
• the independence of the $p_l$-probability of an error packet of $l$-length from the lengths of other error packets;
• the independence of errors within the packet;
• the constant $p_\varepsilon$-probability of an error inside the packet;
• no errors outside the packets;
• the possibility of contiguity and mutual overlapping of the error packets.
to specify the model, it is enough to determine the $p_n$, $p_\varepsilon$ probabilities and the $p_n(l,n)$ distribution. at the same time, only solid bursts with $p_\varepsilon = 1$ are experimentally singled out. in [6–8], a simplified bennet and froelich model is proposed:
• an error can only arise within an error burst, with the constant probability $p_\varepsilon = 1$ (continuous packets);
• contiguity and mutual overlap of solid packets do not exist;
• the constant $p_n$-probability is the probability that a solid error burst of any length will start from a certain position;
• $p(l)$ is the probability of the occurrence of a continuous packet of $l$-length;
• $p_n(l)$ is the probability that a continuous error burst of $l$-length will start from a certain position, $p_n(l) = p_n \cdot p(l)$.
to specify the simplified bennet and froelich model, it is sufficient to specify the $p_n$-probability and the $p(l)$-distribution. the $p_n$ probability value and the $p(l)$-distribution can be obtained experimentally on a large enough sample size [1, 2, 4]. according to the bennet and froelich model, the distribution $p_{ner}(m)$ of the probabilities of occurrence of error-free $m$-length intervals between adjacent continuous error bursts has the following geometrical form:
\[
p_{ner}(m) = p_n (1 - p_n)^{m-1}. \tag{7}
\]
if the $p(l)$-distribution can also be presented as a geometrical law,
\[
p(l) = (1-g)\, g^{l-1}, \tag{8}
\]
then the average error burst length $l_{av}$, the average length of the error-free interval $m_{av}$, the error probability per bit $p_0$ and the error burst probability $p_n$ are associated by the relations
\[
p_n\, m_{av} = 1, \qquad p_0 = p_n\, l_{av}, \qquad l_{av}(1-g) = 1. \tag{9}
\]
to define the considered model, it is sufficient to specify only two parameters, for example, $p_0$ and $l_{av}$. figure 1 shows the $p(l)$ dependencies for the cases: 1) $l_{av} = 2$; 2) $l_{av} = 4$; 3) $l_{av} = 8$; 4) $l_{av} = 16$; 5) $l_{av} = 32$; 6) $l_{av} = 64$.

figure 1. the dependencies of the probability of solid packet occurrence.

the analysis of the dependencies presented in figure 1 shows that even with a small average error burst length, there is a high probability of a continuous packet occurrence. indeed, with the average error burst length $l_{av} = 8$ bits, the probability of a solid packet of 20 errors is $\approx 10^{-2}$. with the same average error burst length, the probability of a solid packet of 40 errors is $\approx 10^{-3}$. using the considered simplified model, it is possible to evaluate the efficiency of error-correcting coding in channels with grouping errors. let us fix the average error burst length $l_{av}$ and the error probability per bit $p_0$. then we obtain:
\[
p_n = \frac{p_0}{l_{av}}, \qquad g = 1 - \frac{1}{l_{av}}, \qquad p(l) = \frac{1}{l_{av}} \left(1 - \frac{1}{l_{av}}\right)^{l-1}, \qquad p_n(l) = \frac{p_0}{l_{av}^2} \left(1 - \frac{1}{l_{av}}\right)^{l-1}. \tag{10}
\]
as an example, figure 2 shows the dependencies $p_n(l)$ for different values of $l_{av}$: a) $l_{av} = 2$; b) $l_{av} = 4$; c) $l_{av} = 8$; d) $l_{av} = 16$; e) $l_{av} = 32$; f) $l_{av} = 64$. the values $p_n(l)$ were calculated for the case of receiving a binary phase-shift keyed signal and correspond to the following values: 1) $p_0 = p_n(1)$; 2) $p_n(2)$; 3) $p_n(4)$; 4) $p_n(8)$; 5) $p_n(16)$; 6) $p_n(32)$; 7) $p_n(64)$.

figure 2. the dependencies $p_n(l)$ for different values of $l_{av}$.

the analysis of the dependencies presented in figure 2 shows that, with an increasing average error burst length $l_{av}$, there is only a slight (one to two orders of magnitude) decrease in the probability of the occurrence of continuous packets of small length ($2 \le l \le 4$ bits). at the same time, there is a significant increase in the probability of the occurrence of solid packets of great length ($32 \le l \le 64$ bits).
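eq. (10) is easy to tabulate; the following small sketch (our illustration) reproduces the burst-length comparison worked out in the next paragraph:

```python
# burst-length probabilities of the simplified model, eq. (8)/(10)
def p_l(l, lav):
    """probability of a continuous burst of length l."""
    return (1 / lav) * (1 - 1 / lav) ** (l - 1)

def pn_l(l, lav, p0):
    """probability that a burst of length l starts at a given position."""
    return (p0 / lav**2) * (1 - 1 / lav) ** (l - 1)

for lav in (2, 8, 64):
    print(lav, p_l(64, lav))
```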
so, with $l_{av} = 2$, the probability of the occurrence of a continuous error burst of the length $l = 64$ bits is
\[
p(64) = \frac{1}{2}\left(1 - \frac{1}{2}\right)^{64-1} = \left(\frac{1}{2}\right)^{64} \approx 5.42 \cdot 10^{-20},
\]
and the probability that a continuous error burst of the length $l = 64$ bits starts from the current position is
\[
p_n(64) \approx p_0 \cdot 5.42 \cdot 10^{-20}.
\]
with $l_{av} = 64$, the corresponding probabilities are:
\[
p(64) = \frac{1}{64}\left(1 - \frac{1}{64}\right)^{64-1} \approx 5.70 \cdot 10^{-3}, \qquad p_n(64) \approx p_0 \cdot 5.70 \cdot 10^{-3}.
\]
as can be seen, the probability of the occurrence of a long solid packet increased by seventeen orders of magnitude. thus, as the analysis shows, with the increase in the average length of error packets, a redistribution of the probabilities of packet occurrence takes place: the probability of short packets decreases and the probability of longer packets increases.

consider the event consisting in the erroneous decoding of a linear $(n,k,d)$-block code used in channels with grouping errors. if, for a block of $n$ symbols, the code corrects all errors of weight $t$ and less, does not correct other errors, and the errors that occurred in accordance with the considered model are grouped in packets of $l$ symbols, then erroneous decoding occurs when the errors form $\xi$ packets such that $\xi l > t$. let's consider a simplified bennet and froelich model with disjoint bursts not adjacent to any other error bursts. in this case, the probability of the occurrence of $\xi$ error bursts of an $l$-length on a block of $n$ symbols is determined by the number of combinations of such packets (with no overlapping parts) on the length of $n$ symbols. the probability of the occurrence of one error burst of an $l$-length on the length of $n$ symbols is computed as
\[
p_1(l,n) = (n-l+1)\, p_n(l) \left(1 - p_n(l)\right)^{n-l}.
\]
on a block length of $n$ symbols, no more than $\lambda = \lfloor (n+1)/(l+1) \rfloor$ error bursts of an $l$-length can appear. the number of combinations of $\xi$ packets on the length of $n$ symbols is defined as a value of the binomial coefficient
\[
C^{\xi}_{\lambda + n + 1 - \xi(l+1) - \lambda + \xi} = C^{\xi}_{n - \xi l + 1}. \tag{11}
\]
then, the expression for the probability of the occurrence of $\xi$ error bursts of $l$-length on the block of $n$ symbols is written as
\[
p_{\xi}(l,n) = C^{\xi}_{n-\xi l+1}\, p_n(l)^{\xi} \left(1 - p_n(l)\right)^{n-\xi l}. \tag{12}
\]
the probability of the occurrence of packets of $l$ errors on the block length of $n$ symbols with $\xi l \le t$ is determined by the expression
\[
p_{l < \xi l \le t}(l,n) = \sum_{\substack{\xi = 1 \\ l < \xi l \le t}}^{\lambda} C^{\xi}_{n-\xi l+1}\, p_n(l)^{\xi} \left(1 - p_n(l)\right)^{n-\xi l}. \tag{13}
\]
the probability of erroneous decoding is defined as
\[
P_{ed} = 1 - (1-p_n)^n - \sum_{l=1}^{n} p_{l < \xi l \le t}(l,n),
\]
where $(1-p_n)^n$ is the probability that no error bursts happen on the block of $n$ symbols. then, taking into account (13), we get
\[
P_{ed} = 1 - (1-p_n)^n - \sum_{l=1}^{n} \sum_{\substack{\xi = 1 \\ l < \xi l \le t}}^{\lambda} C^{\xi}_{n-\xi l+1}\, p_n(l)^{\xi} \left(1 - p_n(l)\right)^{n-\xi l}. \tag{14}
\]
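eq. (14) can be evaluated directly. the sketch below is our reading of the formula; in particular, we interpret the summation condition in (13)–(14) as $\xi l \le t$ (the correctable patterns), and reuse pn_l() from the earlier block.

```python
# erroneous decoding under the simplified model with disjoint,
# non-adjacent error bursts, eq. (14).
from math import comb

def ped_grouped(n, d, p0, lav):
    t = (d - 1) // 2
    pn = p0 / lav
    total = 1.0 - (1.0 - pn) ** n           # start from "some burst occurs"
    for l in range(1, n + 1):
        pnl = pn_l(l, lav, p0)               # eq. (10)
        lam = (n + 1) // (l + 1)             # λ = floor((n+1)/(l+1))
        for xi in range(1, lam + 1):
            if xi * l <= t:                  # correctable burst patterns
                total -= (comb(n - xi * l + 1, xi)
                          * pnl**xi * (1 - pnl) ** (n - xi * l))
    return total
```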
when compared to the model with independent errors, the disadvantage of the model with non-adjacent error bursts is its irreducibility, even in the case of the fixed $l = 1$. indeed, let us suppose that $l_{av} = 1$; then $g = 0$, $p(1) = 1$, $p(>1) = 0$, $p_n = p_0 = p_n(1)$, with only single errors, which are not adjacent to each other, appearing. then, the expressions (12)–(13) will be rewritten as
\[
p_{\xi}(l,n) = C^{\xi}_{n-\xi+1}\, p_0^{\xi} (1-p_0)^{n-\xi}, \qquad
p_{1 \le \xi \le t}(l,n) = \sum_{\xi=1}^{t} C^{\xi}_{n-\xi+1}\, p_0^{\xi} (1-p_0)^{n-\xi},
\]
and the expression for the probability of erroneous decoding will be written as
\[
P_{ed} = 1 - (1-p_0)^n - \sum_{\xi=1}^{t} C^{\xi}_{n-\xi+1}\, p_0^{\xi} (1-p_0)^{n-\xi} = 1 - \sum_{\xi=0}^{t} C^{\xi}_{n-\xi+1}\, p_0^{\xi} (1-p_0)^{n-\xi}, \tag{15}
\]
which does not correspond to the expressions (1)–(2) for the model with independent errors. let's analyse the reasons for this disparity. the expression (15) conforms to the probability of an erroneous decoding caused by single errors that cannot adhere to each other. in general, the model of a binary symmetrical channel without memory, described by the expressions (1)–(2), allows adjoining single errors, which causes the discrepancy between the corresponding formulas.

let's now consider a simplified bennet and froelich model with disjoint error bursts and their possible adjacency to each other. in this case, on the block length of $n$ symbols, there can be no more than $\lambda' = \lfloor n/l \rfloor$ error bursts of an $l$-length. the number of combinations of $\xi$ packets on the length of $n$ symbols is defined by the binomial coefficient
\[
C^{\xi}_{\lambda' + n - \xi l - \lambda' + \xi} = C^{\xi}_{n - \xi l + \xi}. \tag{16}
\]
then the expression for the probability of the occurrence of $\xi$ error bursts of $l$-length on the block of $n$ symbols is determined by the expression
\[
p_{\xi}(l,n) = C^{\xi}_{n-\xi l+\xi}\, p_n(l)^{\xi} \left(1 - p_n(l)\right)^{n-\xi l}. \tag{17}
\]
for any occurrence of $\xi$ packets of $l$ errors on the block of $n$ symbols with $\xi l \le t$, the probability $p_{l<\xi l \le t}(l,n)$ is computed by the expression
\[
p_{l < \xi l \le t}(l,n) = \sum_{\substack{\xi = 1 \\ l < \xi l \le t}}^{\lambda'} C^{\xi}_{n-\xi l+\xi}\, p_n(l)^{\xi} \left(1 - p_n(l)\right)^{n-\xi l}. \tag{18}
\]
the probability of erroneous decoding is computed as
\[
P_{ed} = 1 - (1-p_n)^n - \sum_{l=1}^{n} p_{l<\xi l \le t}(l,n),
\]
and, taking into account (18), we get
\[
P_{ed} = 1 - (1-p_n)^n - \sum_{l=1}^{n} \sum_{\substack{\xi = 1 \\ l < \xi l \le t}}^{\lambda'} C^{\xi}_{n-\xi l+\xi}\, p_n(l)^{\xi} \left(1 - p_n(l)\right)^{n-\xi l}. \tag{19}
\]
unlike the model with non-adjacent error bursts, this model is reducible to the model with independent errors for the fixed $l = 1$. indeed, let's suggest that $l_{av} = 1$, $g = 0$, $p(1) = 1$, $p(>1) = 0$, $p_n = p_0 = p_n(1)$, and only single errors occur, now possibly adjacent to each other. then, the expressions (17)–(18) will be rewritten in the following form:
\[
p_{\xi}(l,n) = C^{\xi}_{n}\, p_0^{\xi} (1-p_0)^{n-\xi}, \qquad
p_{1 \le \xi \le t}(l,n) = \sum_{\xi=1}^{t} C^{\xi}_{n}\, p_0^{\xi} (1-p_0)^{n-\xi},
\]
and the expression for the probability of erroneous decoding will be transformed to
\[
P_{ed} = 1 - (1-p_0)^n - \sum_{\xi=1}^{t} C^{\xi}_{n}\, p_0^{\xi} (1-p_0)^{n-\xi} = 1 - \sum_{\xi=0}^{t} C^{\xi}_{n}\, p_0^{\xi} (1-p_0)^{n-\xi},
\]
which, with $i = \xi$, fully complies with the expressions (1)–(2) for the model with independent errors. to calculate the egc of an $(n,k,d)$-block code in a channel with grouping errors, it is necessary to fix the required error probability per character $p_d$ and calculate the corresponding value of $E/N_0$ by the expressions (14) and/or (19) (taking into account the introduced redundancy and the multiplier $d/n$). the difference of $\gamma_2$ and $\gamma_1$ gives the required estimation of the egc:
\[
\text{egc} = \gamma_2 - \gamma_1,
\]
where $\gamma_2$ is the ratio $E/N_0$ minimally required to achieve the desired probability of erroneous reception of symbols $p_d$ for a fixed ensemble of signals (without coding), and $\gamma_1$ is the ratio $E/N_0$ minimally required to achieve $p_d$ for a fixed ensemble of signals when using the $(n,k,d)$-block code. it should be noted that the considered mathematical model and the methodology of evaluating the energy gain are not limited by the distribution of error-burst lengths, which allows exploring the egc for a wide class of data channels.
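the reduction just derived is easy to verify numerically. the sketch below mirrors the adjacency-allowed combinatorics of eqs. (16)–(19), with the same reading of the summation condition as before, and reuses comb, pn_l() and ped_block() from the earlier blocks.

```python
def ped_grouped_adjacent(n, d, p0, lav):
    """eq. (19): disjoint bursts, adjacency allowed (λ' = floor(n/l))."""
    t = (d - 1) // 2
    pn = p0 / lav
    total = 1.0 - (1.0 - pn) ** n
    for l in range(1, n + 1):
        pnl = pn_l(l, lav, p0)
        for xi in range(1, n // l + 1):
            if xi * l <= t:
                total -= (comb(n - xi * l + xi, xi)
                          * pnl**xi * (1 - pnl) ** (n - xi * l))
    return total

n, d, p0 = 15, 5, 1e-2
# with lav = 1 the grouped-error formula must coincide with eq. (2)
print(ped_grouped_adjacent(n, d, p0, lav=1), ped_block(n, d, p0))
```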
table 1. primitive binary bch codes.

gf(2^m)  | roots of the generating polynomial g(x)                                                                 | (n, k, d)
gf(2^4)  | α^1, α^2, α^4, α^8, α^3, α^6, α^9, α^12                                                                  | (15, 7, 5)
gf(2^5)  | α^1, α^2, α^4, α^8, α^16, α^3, α^6, α^12, α^24, α^17, α^5, α^10, α^20, α^9, α^18                          | (31, 16, 7)
gf(2^6)  | α^1, α^2, α^4, α^8, α^16, α^32, α^3, α^6, α^12, α^24, α^48, α^33, α^5, α^10, α^20, α^40, α^17, α^34, α^7, α^14, α^28, α^56, α^49, α^35, α^9, α^18, α^36 | (63, 36, 11)
gf(2^7)  | α^1, α^2, α^4, α^8, α^16, α^32, α^64, α^3, α^6, α^12, α^24, α^48, α^96, α^65, α^5, α^10, α^20, α^40, α^80, α^33, α^66, α^7, α^14, α^28, α^56, α^112, α^97, α^67, α^9, α^18, α^36, α^72, α^17, α^34, α^68, α^11, α^22, α^44, α^88, α^49, α^98, α^69, α^13, α^26, α^52, α^104, α^81, α^35, α^70, α^15, α^30, α^60, α^120, α^113, α^99, α^71, α^19, α^38, α^76, α^25, α^50, α^100, α^73 | (127, 64, 21)

4. egc-evaluation of bch codes

consider the binary bose-chaudhuri-hocquenghem (bch) codes for an assessment of their egc in channels with independent and grouping errors. to evaluate the egc, we will use the techniques developed earlier. the theory and methods of constructing bch codes are best described in the monographs [10–12], in which it is shown that bch codes yield the highest gain at $r \approx 1/2$. let us fix a finite field $gf(2^m)$, $m = 4, 5, 6, 7$, and primitive binary bch codes with $r \approx 1/2$. in table 1, the corresponding code parameters and the roots of the generating polynomial $g(x)$ are presented in the form of the degrees of the primitive element of the field. figure 3 shows the dependencies of the probability of erroneous reception of symbols for the cases:
• 1) the optimum reception of binary psk signals (without coding);
• 2) using the (15, 7, 5) bch code;
• 3) using the (31, 16, 7) bch code;
• 4) using the (63, 36, 11) bch code;
• 5) using the (127, 64, 21) bch code.

figure 3. the dependencies of the probability of erroneous reception of a symbol.

in figure 3a), the presented dependencies correspond to the model considered earlier with an independent occurrence of errors, which is equivalent to the case $l_{av} = 1$. in the rest of the figures, the presented dependencies correspond to the probability of erroneous reception of channel symbols described by the simplified bennet and froelich model containing disjoint error packets with possible contiguity to each other, for the cases: b) $l_{av} = 1.00001$; c) $l_{av} = 1.0001$; d) $l_{av} = 1.001$. as follows from these dependencies, even a small clustering of errors leads to a sharp decrease of the egc. indeed, for the above considered model with independent errors at $p_d = 10^{-5}$, the egc of the binary (127, 64, 21) bch code is equal to $\approx 3.2$ db (see figure 3a). but when $l_{av} = 1.0001$, the egc is sharply reduced and, for the same value of $p_d$, is approximately equal to 0.9 db (see figure 3b). with an increasing average length of a packet of continuous errors, the egc continues to decrease and becomes negative (see figure 3d). practically, this means that the use of strictly random error-correcting coding in channels with a strong clustering of errors is inefficient and leads to an energy loss.
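a small driver (our illustration, chaining the hypothetical helpers sketched in the previous sections) evaluates the independent-error egc of the table 1 codes at $p_d = 10^{-5}$, which can be compared against the $\approx 3.2$ db quoted above for the (127, 64, 21) code:

```python
# egc of the table-1 bch codes under independent errors, using the
# egc_db() sketch from section 2.
bch_codes = [(15, 7, 5), (31, 16, 7), (63, 36, 11), (127, 64, 21)]
for n, k, d in bch_codes:
    print((n, k, d), round(egc_db(n, k, d, pd=1e-5), 2), "db")
```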
5. conclusions

as the result of the research, we have developed a scientific and methodological apparatus for describing the error behaviour in discrete channels. for the first time in the open literature, we have developed a methodology of estimating the egc of binary codes in channels with grouping errors. the proposed method uses a simplified bennet and froelich model and allows conducting research of the egc for a wide class of data channels with different laws of distribution of error-burst lengths. the reliability of the obtained results is confirmed by their agreement, in the simplified variant, with known results of the theory of error-correcting coding. so, the expression (19) estimating the probability of erroneous decoding in channels with grouping errors reduces, with $l_{av} = 1$, to the known expression (2) for the computation of the probability of erroneous decoding in channels with an independent error distribution. using the developed methodology, we have conducted the estimation of the egc for binary bch codes, which has shown a significant decrease of the efficiency of error-correcting coding in channels with grouping errors. even with a negligible error grouping, the egc of binary bch codes is sharply reduced, and with a further increase of the average length of an error burst, the egc reaches negative values. a remark should be added, stating that many binary bch codes are optimal for correcting burst errors when they satisfy the reiger bound, i.e., whenever $2b = n - k$, where $b$ denotes the maximum guaranteed correctable burst length and the bch code has parameters $(n,k,d)$. however, the redundancy introduced with this encoding still requires significant energy costs (for the transmission of each redundant bit). for channels with a strong clustering of errors, the egc may be small or negative, i.e., coding may lead to an energy loss. thus, even codes that are optimal for correcting error bursts may not compensate for the energy costs of transmitting redundant symbols. obviously, in this case, it is better to use protocols with error detection and automatic repeat request (arq). a promising direction for further research is the development of methods for estimating the egc of non-binary codes in channels with grouping errors and a direct estimation of the egc for such codes (e.g., for non-binary bch codes, reed-solomon codes, algebraic geometric codes). these results may also be useful in other important practical applications: cryptography, authentication, the theory of complex signals, etc. [14–16].

references

[1] e. n. gilbert. capacity of a burst-noise channel. bell system technical journal 39(5):1253–1265, 1960. doi:10.1002/j.1538-7305.1960.tb03959.x.
[2] a. fontaine, r. gallager. error statistics and coding for binary transmission over telephone circuits. proceedings of the ire 49(6):1059–1065, 1961. doi:10.1109/jrproc.1961.287890.
[3] m. zorzi, r. rao, l. milstein. error statistics in data transmission over fading channels. ieee transactions on communications 46(11):1468–1477, 1998. doi:10.1109/26.729391.
[4] w. bennett, f. froehlich. some results on the effectiveness of error-control procedures in digital data transmission. ire transactions on communications systems 9(1):58–65, 1961. doi:10.1109/tcom.1961.1097651.
[5] e. bloch, o. popov, w. y. turin. models of error sources in channels for digital information transmission. moscow: svyaz, 1971.
[6] w. turin. error source models. in performance analysis and modeling of digital transmission systems, pp. 9–59. springer us, 2004. doi:10.1007/978-1-4419-9070-9_2.
[7] o. melentiev, v. yatsukov, e. minina. the estimation technique of parameters of discrete channel with grouping errors. in 2003 siberian russian workshop on electron devices and materials, proceedings, 4th annual (ieee cat. no. 03ex664). novosibirsk student branch of ieee, 2003. doi:10.1109/sredm.2003.1224208.
[8] t. kløve, v. i. korzhik. channel models. in error detecting codes, pp. 1–16.
springer us, 1995. doi:10.1007/978-1-4615-2309-3_1.
[9] j. poikonen, j. paavola. error models for the transport stream packet channel in the dvb-h link layer. in 2006 ieee international conference on communications, vol. 4, pp. 1861–1866. 2006. doi:10.1109/icc.2006.254991.
[10] g. c. clark, j. b. cain. error-correction coding for digital communications. springer us, 1981. doi:10.1007/978-1-4899-2174-1.
[11] i. s. reed, x. chen. error-control coding for data networks. springer us, 1999. doi:10.1007/978-1-4615-5005-1.
[12] f. macwilliams, n. sloane. the theory of error-correcting codes. elsevier, 1977. doi:10.1016/s0924-6509(08)x7030-8.
[13] b. sklar. digital communications: fundamentals and applications. prentice hall communications engineering and emerging technologies. pearson education, 2016.
[14] a. kuznetsov, m. lutsenko, n. kiian, et al. code-based key encapsulation mechanisms for post-quantum standardization. in 2018 ieee 9th international conference on dependable systems, services and technologies (dessert), pp. 276–281. 2018. doi:10.1109/dessert.2018.8409144.
[15] a. kuznetsov, a. pushkar'ov, n. kiyan, t. kuznetsova. code-based electronic digital signature. in 2018 ieee 9th international conference on dependable systems, services and technologies (dessert), pp. 331–336. 2018. doi:10.1109/dessert.2018.8409154.
[16] a. kuznetsov, i. svatovskij, n. kiyan, a. pushkar'ov. code-based public-key cryptosystems for the post-quantum period. in 2017 4th international scientific-practical conference problems of infocommunications. science and technology (pic s&t), pp. 125–130. 2017. doi:10.1109/infocommst.2017.8246365.

acta polytechnica vol. 42 no. 2/2002

solution of the burgers equation in the time domain

m. bednařík, p. koníček, m. červenka

this paper deals with a theoretical description of the propagation of finite-amplitude acoustic waves. a theory based on the homogeneous burgers equation of the second order of accuracy is presented here. this equation takes into account both nonlinear effects and dissipation. the method for solving this equation, using the well-known cole-hopf transformation, is presented. two methods for the numerical solution of this equation in the time domain are presented. the first is based on the simple simpson method, which is suitable for smaller goldberg numbers. the second uses the more advanced saddle point method, which is appropriate for large goldberg numbers.

keywords: burgers equation, cole-hopf transformation, saddle point method, nonlinearity.

introduction

when solving a wide range of problems related to nonlinear acoustics, we may describe nonlinear sound waves in fluids by using the burgers model equation. this equation is named after johannes martinus burgers, who published an equation of this form in his paper concerning turbulent phenomena modelling in 1948 (see [1]). however, this type of equation used in continuum mechanics was first presented in a paper published in a meteorology journal by h. bateman in 1915 (see [2]). thanks to the fact that this journal was not read by experts in continuum mechanics, the equation has become known as the burgers equation. the possibilities of using this equation in nonlinear acoustics were probably discovered by j. cole in 1949. subsequently, e. hopf (in 1950) and j. cole (in 1951) published the transformation independently in their papers (see [3, 4]). the transformation enables us to find the general analytic solution of the burgers equation, as a result of which this equation plays a major role in nonlinear acoustics.
the burgers equation can be used for modelling both travelling and standing nonlinear plane waves [5–8]. the equation is the simplest model equation that can describe the second-order nonlinear effects connected with the propagation of high-amplitude (finite-amplitude) plane waves and, in addition, the dissipative effects in real fluids [9–12]. there are several approximate solutions of the burgers equation (see [13]). these solutions are always fixed to the areas before and after the shock formation. for the area where the shock wave is forming, no approximate solution has yet been found. it is therefore necessary to solve the burgers equation numerically in this area (see [14, 15]). numerical solutions themselves have difficulties with stability and accuracy. a numerical solution of the burgers equation is discussed in this paper.

burgers equation

a complete system of equations which describes the propagation of waves in thermo-viscous fluids consists of the navier-stokes equation, the equation of continuity, the heat-transfer equation and the thermodynamic state equations. unfortunately, no general solution of this system has yet been found, and if we were to consider all these equations in their full form, many problems would arise even with a numerical solution (the large number of operations required, instability, etc.). for these reasons, it is advisable to approximate this system by an appropriate model equation in a given approximation, e.g., the quasi-linear homogeneous burgers equation (see [13, 16]).

homogeneous burgers equation:
\[
\frac{\partial v}{\partial x} - \frac{\beta}{c_0^2}\, v \frac{\partial v}{\partial \tau} - \frac{b}{2 c_0^3 \rho_0} \frac{\partial^2 v}{\partial \tau^2} = 0, \tag{1}
\]
here $v$ is the acoustic velocity, $\tau$ is the retarded time, $\tau = t - x/c_0$, where $t$ is time, $x$ is the distance from the sound source, $c_0$ is the velocity of sound propagation in the linear approximation, $b$ is the dissipation coefficient, $\rho_0$ is the density of the medium, and $\beta$ is the nonlinearity coefficient. for the calculations that follow, it is advisable to rewrite the equation in the following dimensionless (normed) form:
\[
\frac{\partial V}{\partial s} - V \frac{\partial V}{\partial y} - \frac{1}{G_0} \frac{\partial^2 V}{\partial y^2} = 0, \tag{2}
\]
where
\[
y = \omega \tau, \qquad s = \frac{\beta v_m \omega}{c_0^2}\, x, \qquad V = \frac{v}{v_m}, \qquad G_0 = \frac{2 \beta \rho_0 c_0 v_m}{b \omega},
\]
$G_0$ represents the goldberg number, $v_m$ is the velocity amplitude of the sound source and $\omega$ is the angular frequency of the sound source. the burgers equation (1) is a parabolic equation which takes a hyperbolic form in the case of an ideal fluid. the homogeneous burgers equation (1) describes nonlinear plane travelling waves.

cole-hopf transformation:
\[
V = \frac{2}{G_0} \frac{1}{U} \frac{\partial U}{\partial y}. \tag{3}
\]
using the transformation (3), we can transform the burgers equation (2) to the linear diffusion equation [17]:
\[
\frac{\partial U}{\partial s} = \frac{1}{G_0} \frac{\partial^2 U}{\partial y^2}. \tag{4}
\]
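as a consistency check of the reconstructed formulas (our sketch, not part of the original text), substituting the transformation (3) into the dimensionless equation (2) gives
\[
\frac{\partial V}{\partial s} - V \frac{\partial V}{\partial y} - \frac{1}{G_0}\frac{\partial^2 V}{\partial y^2}
= \frac{2}{G_0}\,\frac{\partial}{\partial y}\!\left[\frac{1}{U}\left(\frac{\partial U}{\partial s} - \frac{1}{G_0}\frac{\partial^2 U}{\partial y^2}\right)\right],
\]
where the cross terms cancel precisely because the factor $2/G_0$ in (3) is twice the diffusivity $1/G_0$ in (4); hence (2) is satisfied exactly when $U$ solves the diffusion equation (4).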
solution of the burgers equation

the solution of the diffusion equation (4) is given by the poisson integral; thus, the solution of the burgers equation can be written as the fraction of two integrals with infinite limits:
\[
V(y,s) = \frac{\displaystyle \int_{-\infty}^{\infty} \frac{y-z}{s} \exp\!\left[-\frac{G_0 (y-z)^2}{4s} + \frac{G_0}{2} \int_0^z V(0,y')\, \mathrm{d}y'\right] \mathrm{d}z}{\displaystyle \int_{-\infty}^{\infty} \exp\!\left[-\frac{G_0 (y-z)^2}{4s} + \frac{G_0}{2} \int_0^z V(0,y')\, \mathrm{d}y'\right] \mathrm{d}z}, \tag{5}
\]
where $V(0,y)$ is the given boundary condition. in most cases, the integrals in equation (5) cannot be solved analytically.

numerical integration

it is necessary to integrate eq. (5) only over the intervals $[q_{i1}, q_{i2}]$ where the value of the integrand exceeds a chosen parameter $\varepsilon$. in general, it is necessary to determine these intervals numerically. then we can express the approximate solution of the burgers equation in the following form:
\[
V(y,s) \approx \frac{\displaystyle \sum_{i=1}^{n_z} \int_{q_{i1}}^{q_{i2}} \frac{y-z}{s} \exp\!\left[-\frac{G_0 (y-z)^2}{4s} + \frac{G_0}{2} \int_0^z V(0,y')\, \mathrm{d}y'\right] \mathrm{d}z}{\displaystyle \sum_{i=1}^{n_z} \int_{q_{i1}}^{q_{i2}} \exp\!\left[-\frac{G_0 (y-z)^2}{4s} + \frac{G_0}{2} \int_0^z V(0,y')\, \mathrm{d}y'\right] \mathrm{d}z}. \tag{6}
\]
in the special cases $G_0 \gg 1$ or, roughly, $s < 10$, the integration limits can be determined analytically, using the following speculation: the integrand contains an exponential function which falls to zero very quickly for values of $z$ drawing apart from the value of $y$. let us choose a small parameter $0 < \varepsilon \ll 1$,
\[
\varepsilon = \exp\!\left[-\frac{(y-z)^2 G_0}{4s}\right]. \tag{7}
\]
then we are able to calculate the limits from this equation as
\[
q_{1,2} = y \mp 2\sqrt{-\frac{s}{G_0} \ln \varepsilon}. \tag{8}
\]
the integration itself can be performed by means of the simpson method, which is based on approximating the integrated function in the segments by parts of a parabola:
\[
\int_{q_1}^{q_2} f(z)\, \mathrm{d}z \approx \frac{q_2 - q_1}{3n} \left[ f(z_0) + 4 f(z_1) + 2 f(z_2) + \dots + 2 f(z_{n-2}) + 4 f(z_{n-1}) + f(z_n) \right], \tag{9}
\]
where $n = 2k$, $k \in \mathbb{N}$, is the number of equidistant intervals and $z_i = q_1 + i (q_2 - q_1)/n$. however, there is a problem with floating-point arithmetic, especially for large values of the number $G_0$. when calculating the integral, we add terms according to (9), and after a short time (especially after integrating one of the maxima) the partial result is much larger than the next added terms. thus these terms are not taken into account, and the result is not accurate. if an accuracy of 18 valid digits is used, we can calculate for values of $G_0$ up to approximately 600. if it were necessary to use this method for larger values of $G_0$, we would have to use suitable software that allows calculations of high precision, e.g., the free library gnu mp.
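a minimal numeric sketch of eqs. (5)–(9), assuming the harmonic boundary condition $V(0,y) = \sin y$ (our assumption), for which $\int_0^z V(0,y')\,\mathrm{d}y' = 1 - \cos z$. rescaling the exponent by its maximum is our addition: it cancels in the ratio (5) and tames the overflow discussed above.

```python
import numpy as np

def simpson(fvals, h):
    """composite simpson rule, eq. (9); len(fvals) must be odd."""
    return h / 3 * (fvals[0] + fvals[-1]
                    + 4 * fvals[1:-1:2].sum() + 2 * fvals[2:-2:2].sum())

def v_simpson(y, s, g0, eps=1e-12, n=2000):
    half = 2 * np.sqrt(-s / g0 * np.log(eps))     # limits, eqs. (7)-(8)
    z = np.linspace(y - half, y + half, n + 1)    # n even -> odd point count
    expo = -g0 * (y - z) ** 2 / (4 * s) + 0.5 * g0 * (1 - np.cos(z))
    w = np.exp(expo - expo.max())                 # common factor cancels in (5)
    h = z[1] - z[0]
    return simpson((y - z) / s * w, h) / simpson(w, h)

print(v_simpson(y=1.0, s=3.0, g0=50.0))
```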
it is obvious from the shape of these three-dimensional graphs that the courses of the integrands have significant extrema (maxima). consequently we can restrict the integration limits to their close neighbourhood. further, we can see that when the goldberg number increases, the values of the integrand extrema also increase and diminish to zero from these ex72 acta polytechnica vol. 42 no. 2/2002 fig. 1: space evolution of integrand, g0 � 10, s � 1, 10, 50 trema more rapidly (the course of the integrand is narrower). these pictures also show that, on the contrary, in the case of increasing dimensionless distance s near to the extreme, the integrand falls more slowly and the course of integrand spreads (the extrema are not so significant). consequently we need to take into account two maxima for certain (appointed) values of dimensionless time to do the integration (the maxima superimpose in the direction of the z axis). let us now analyse the relation (5) for g0 1� and constant s and y. in this case the largest contributions of both integrands come from the extrema of the function f ( z; s, y). from the condition � � � � � � � � � � f z s y z z y z s v y y z ; , ,� � � � � � � � � � 2 0 2 0 0d (12) we obtain � � y z s v z � � �0 0, . (13) let us mark the solutions of equation (13) z0 and select only those which represent maxima of the integrands � � � � � � � � � � � � � � �f z s y z s v z z 0 2 00 1 0 0 ; , , . (14) now the contribution of the neighbourhood of this extreme can be expressed if we expand the function f ( z; s, y) into the taylor series at the point z � z0 and we consider only the first three terms (the quadratic order). for the function f ( z; s, y) we shall omit writing parameters s and y; � ��f z and � ���f z will have the meaning of the first and second derivation of this function with respect to variable z. � � � � � �� � � �� � � � � �� � f z f z f z z z f z z z f z f z z z � , � � � � � �� � � � � �� � 0 0 0 0 0 2 0 0 0 2 1 2 1 2 (15) because at the stationary point there is always � �� �f z0 0. now we substitute this expansion into the integral � �exp �� � �� �� � g f z z0 2 d (16) which we rearrange by means of the well-known laplace-gauss integral e da x 2� �� � � � 2 0z a a , (17) to the form � � � � 4 1 20 0 0 0 g f z g f z �� � � � �� exp . (18) now let us suppose that there exists only one maximum, z0, that fulfils condition (13), and let us substitute this representation of integral (16) into the numerator of relation (5). we can see that this term is multiplied by the function f(z) here. the value of this function does not vary significantly in the neighbourhood of the extreme. then we can treat this function as a constant with value f(z0) (this assumption is satisfied because these extrema are narrow in comparison with the behaviour of the function). on the basis of the foregoing reasoning we can write the numerator of term (5) as � � � � y z s g f z z y z s g f z g � � � � �� � � � �� � �� � exp � � exp 0 0 0 0 0 2 4 1 d � � 2 0f z � � �� (19) and the denominator can be written as � � � � � �exp � exp�� � �� � �� � � � �� �� � g f z z g f z g f z0 0 0 0 0 2 4 1 2 d . (20) then we have v y z s �� � 0 , (21) where z0 is given by expressions (13) and (14). © czech technical university publishing house http://ctn.cvut.cz/ap/ 73 acta polytechnica vol. 42 no. 2/2002 fig. 2: space evolution of integrand, g0 � 50, s � 1, 10, 50 fig. 3: space evolution of integrand, g0 � 500, s � 1, 10, 50 however, relation (21) can give wrong solutions in certain cases. 
examples of the solution of the burgers equation

figure 4a) shows waveforms with the harmonic boundary condition when $G_0 = 1500$. the waveforms are calculated using the saddle point method at dimensionless distances which were chosen in order to show the development of a nonlinear wave. figure 4b) shows waveforms for the identical boundary condition at the same distances as in figure 4a), but for $G_0 = 50$. direct integration was used in this case. we can see that the dissipation effect is stronger in this case, and therefore the width of the shock front is more significant. figure 5 compares the time domain solution (saddle point method) with the frequency domain solution of the burgers equation (see [18]).

fig. 4: waveforms at distances $s = 0, 1, 4, 10, 20$ for a) $G_0 = 1500$, b) $G_0 = 50$
fig. 5: comparison between the time domain solution and the frequency domain solution of the burgers equation, $G_0 = 1000$, $s = 1.1$ and $s = 3$; $m$ represents the number of harmonics

conclusion

this paper analyses a method for solving the burgers equation in the time domain by means of the cole-hopf transformation. however, this method poses problems connected with the numerical solution of the poisson integral for the case $G_0 \gg 1$. the first way to solve these problems is based on the application of an appropriate software instrument (e.g., the free library gnu mp), which enables calculation with accuracy to an arbitrary number of valid digits. the next way involves applying the saddle-point method, which becomes more accurate the larger the goldberg number is. the solution in the time domain enables us to find the proper parameters of the numerical frequency correction that is often used for solving the burgers equation in the frequency domain. when we compare the results of the solutions of the burgers equation obtained in the time domain and in the frequency domain, it is evident that they are almost identical when the number of used harmonics in the frequency domain exceeds 250, but the calculation is noticeably longer.

acknowledgment

the research was supported by ctu research program 16:cez:j04/98:212300016 and by gacr grant 202/01/1372.

references

[1] burgers, j. m.: a mathematical model illustrating the theory of turbulence. advances in applied mechanics, 1948, vol. 1, p. 171–199.
[2] bateman, h.: some recent researches on the motions of fluids. monthly weather review, 1915, vol. 43, p. 163–170.
[3] hopf, e.: the partial differential equation $u_t + u u_x = \mu u_{xx}$. comm. pure and appl. math., 1950, vol. 3, p. 201–230.
[4] cole, j. d.: on a quasi-linear parabolic equation occurring in aerodynamics. q. appl. math., 1951, vol. 9, p. 225–236.
[5] koníček, p., bednařík, m., červenka, m.: finite-amplitude acoustic waves in a liquid-filled tube. hydroacoustics (editors: e. kozaczka, g. grelowska), gdynia: naval academy, 2000, p. 85–88.
[6] bednařík, m.: description of plane finite-amplitude traveling waves in air-filled tubes. acustica, 1996, vol. 82, p. 196.
[7] bednařík, m.: description of high-intensity sound fields in fluids. proceedings of the 31st conference on acoustics, prague, 1994, p. 134–138.
[8] ochmann, m.: nonlinear resonant oscillations in closed tubes – an application of the averaging method. j. acoust. soc. am., 1985, vol. 77, p. 61–66.
[9] bednařík, m.: description of nonlinear waves in gas-filled tubes by the burgers and the kzk equations with a fractional derivative. proceedings of the 16th ica, ica, seattle, 1998, p. 2307–2308.
[10] blackstock, d. t.: connection between the fay and fubini solutions for plane sound waves of finite amplitude. j. acoust. soc. am., 1966, vol. 39, p. 1019–1026.
[11] blackstock, d. t.: thermoviscous attenuation of plane, periodic, finite-amplitude sound waves. j. acoust. soc. am., 1964, vol. 36, p. 534–542.
[12] webster, d. a., blackstock, d. t.: finite-amplitude saturation of plane sound waves in air. j. acoust. soc. am., 1977, vol. 62, p. 518–523.
[13] hamilton, m. f., blackstock, d. t.: nonlinear acoustics. usa: academic press, 1998, p. 117–139.
[14] bednařík, m.: nonlinear propagation of waves in hard-walled ducts. proceedings of the ctu workshop, prague: ctu, 1995, p. 91–92.
[15] mitome, h.: an exact solution for finite-amplitude plane sound waves in a dissipative fluid. j. acoust. soc. am., 1989, vol. 86, no. 6, p. 2334–2338.
[16] makarov, s., ochmann, m.: nonlinear and thermoviscous phenomena in acoustics 2. acustica, 1997, vol. 83, p. 197–222.
[17] whitham, g. b.: linear and nonlinear waves. new york: john wiley, 1974, p. 152–157.
[18] trivett, d. h., van buren, a. l.: propagation of plane, cylindrical, and spherical finite amplitude waves. j. acoust. soc. am., 1981, vol. 69, no. 4, p. 943–949.

dr. ing. michal bednařík, e-mail: bednarik@fel.cvut.cz
dr. mgr. petr koníček, e-mail: konicek@fel.cvut.cz
ing. milan červenka, e-mail: cervenm3@fel.cvut.cz
department of physics, czech technical university in prague, faculty of electrical engineering, technická 2, 166 27 praha 6, czech republic

acta polytechnica 61(2):307–312, 2021, doi:10.14311/ap.2021.61.0307
parallel machine scheduling with monte carlo tree search

anita agárdi∗, károly nehéz

university of miskolc, faculty of mechanical engineering and informatics, institute of information science, miskolc-egyetemváros, 3515, hungary
∗ corresponding author: agardianita@iit.uni-miskolc.hu

abstract. in this article, a specific production scheduling problem (psp), the parallel machine scheduling problem (pmsp) with job and machine sequence setup times, due dates and maintenance times, is presented. after the introduction and literature review, the mathematical model of the problem is given, followed by a description of the monte carlo tree search and simulated annealing, and of our representation technique and its evaluation. the problem is solved with a specific monte carlo tree search (mcts) algorithm, which uses a neighbourhood search method (2-opt). finally, the efficiency of our iterative monte carlo tree search (imcts) algorithm is tested on randomly generated data sets, with results showing that the algorithm is suitable for solving production scheduling problems.

keywords: parallel-machine scheduling, monte carlo tree search.

1. introduction

cost-efficient production is one of the main goals of manufacturing companies, because production efficiency means higher profits for the company. cost-efficient production means that as many jobs as possible are processed on time at the lowest cost. in this article, a specific production scheduling problem, the parallel machine scheduling problem with job and machine sequence setup times, due dates and maintenance times, is presented. in this specific problem, m jobs must be distributed among n machines. setup time means the transition time between jobs. the due date means the day by which the job has to be done. maintenance time means a time window when the production stops because there is maintenance. in our problem, the objective function is the minimization of the setup times and the maximization of the number of created jobs. production scheduling problems are np-hard; therefore, it is necessary to apply some heuristics to solve them. with heuristics, of course, there is little chance of finding a global optimum, but we can get a good local optimum within a reasonable amount of time. in this article, a specific monte carlo tree search (mcts) algorithm is applied to the problem. the paper is organized as follows: after the literature review, the mathematical model of our specific production scheduling problem is described. in section 4, the monte carlo tree search is detailed. in section 5, the representation and evaluation of the parallel machine scheduling problem with job and machine sequence setup times, due dates and maintenance times are presented, and our iterative monte carlo tree search algorithm is also detailed. after that, the test results are discussed. in section 7, the conclusion and remarks are presented.

2. literature review

the production scheduling problem (psp) [1, 2] is a common problem in manufacturing systems. the problem lies in scheduling n jobs between m machines. over time, many types of psp have evolved.
in the following paragraphs, some types of production scheduling problems are presented based on the literature. in the case of the parallel machine scheduling problem (pmsp) [3], m jobs must be distributed among n machines. the fuzzy parallel machine problem [4] has some fuzzy parameters (setup time, capacity constraint, time windows, etc.). a machine eligibility restriction [5] states whether a machine is capable of processing a job (operation). in the case of production scheduling, the machines can be identical [6] or uniform [7]. a job can have a due date [8], a release time [9] and a time window [10]. the due date means the day by which the job has to be done. the release date means that the job must not be started before this predetermined time. the time window is the combination of the release time and the due date. jobs with a priority level [11] should be created as soon as possible. the job processing time can be not only fixed but also variable [12]. the pmsp [9] can have several objective functions, for example, the minimization of the setup times [13], the minimization of the total earliness and tardiness time [14], or an objective function which takes into account environmental impacts (for example, the green scheduling problem) [7]. the constraints and components can also be used together; for example, in article [15] the authors also use release and due dates, family set-ups, etc.

3. parallel machine scheduling problem with job and machine sequence dependent setup times, due dates and maintenance times

in this specific problem, m jobs must be distributed among n machines. setup time means the transition time between jobs. the due date means the day by which the job has to be done. maintenance time means a time window when the production stops because there is maintenance. in our problem, the objective function is the minimization of the setup times and the maximization of the number of created jobs. in the following, a simple example and its solution are presented; after that, the mathematical model of our problem is detailed. table 1 illustrates the components of the machines, table 2 presents the components of the jobs, while table 3 shows the setup times of machine 1.

table 1. the components of the machines.
machine    | maintenance time
machine 1  | [1, 4], [7, 8]
machine 2  | [4, 5]
machine 3  | [1, 2]
machine 4  | [2, 4]

table 2. the components of the jobs.
job    | duration | due date
job 1  | 1        | 1
job 2  | 1        | 9
job 3  | 4        | 6
job 4  | 1        | 2
job 5  | 3        | 9
job 6  | 1        | 4
job 7  | 4        | 10

table 3. the setup times for machine 1.
     | r0 | j1 | j2 | j3 | j4 | j5 | j6 | j7
r0   | 0  | 0  | 1  | 2  | 2  | 3  | 1  | 1
j1   | 1  | 0  | 1  | 2  | 2  | 3  | 1  | 1
j2   | 2  | 3  | 0  | 4  | 1  | 2  | 1  | 1
j3   | 1  | 2  | 1  | 0  | 1  | 2  | 1  | 2
j4   | 1  | 2  | 1  | 2  | 0  | 1  | 2  | 1
j5   | 1  | 2  | 3  | 1  | 2  | 0  | 1  | 1
j6   | 0  | 1  | 2  | 1  | 3  | 1  | 0  | 4
j7   | 1  | 0  | 1  | 2  | 1  | 2  | 1  | 0

figure 1 presents the solution of the parallel machine scheduling problem with job and machine sequence setup times, due dates and maintenance times. as shown on the gantt chart, 4 machines create 7 jobs. job 1 and job 2 belong to machine 1. machine 2 has job 3. machine 3 has two jobs: job 4 and job 5. machine 4 also has two jobs: job 6 and job 7.

figure 1. the gantt chart of the production scheduling.
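for later reference, the example instance of tables 1–3 can be written down as plain data. the encoding below is a sketch of one possible layout (0 stands for the machine start state r0); setup times for machines 2–4 are not tabulated in the paper, so only machine 1 is filled in.

```python
jobs = {  # job id -> (duration, due date), table 2
    1: (1, 1), 2: (1, 9), 3: (4, 6), 4: (1, 2), 5: (3, 9), 6: (1, 4), 7: (4, 10),
}
maintenance = {  # machine id -> list of [e, l] windows, table 1
    1: [(1, 4), (7, 8)], 2: [(4, 5)], 3: [(1, 2)], 4: [(2, 4)],
}
setup = {  # (machine, prev, next) -> setup time; prev/next 0 means r0, table 3
    (1, p, q): row[q]
    for p, row in enumerate([
        [0, 0, 1, 2, 2, 3, 1, 1],   # from r0
        [1, 0, 1, 2, 2, 3, 1, 1],   # from j1
        [2, 3, 0, 4, 1, 2, 1, 1],   # from j2
        [1, 2, 1, 0, 1, 2, 1, 2],   # from j3
        [1, 2, 1, 2, 0, 1, 2, 1],   # from j4
        [1, 2, 3, 1, 2, 0, 1, 1],   # from j5
        [0, 1, 2, 1, 3, 1, 0, 4],   # from j6
        [1, 0, 1, 2, 1, 2, 1, 0],   # from j7
    ])
    for q in range(8)
}
```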
3.1. the mathematical model

assume that our problem contains n jobs and m machines. each job has m·(n + 1) setup times, because the setup time is job- and machine-dependent. each job has a processing time and a due date. each machine has any number of maintenance times. each maintenance time means a time window, i.e., [e, l]. one machine can only process one job at a time. the notations and their definitions are given in table 4.

table 4. the notations and their definitions.
notation              | definition
m                     | number of machines
n                     | number of jobs
pt_j                  | processing time of job j
st_{i,j0,j1}          | setup time of job j1 on machine i after job j0
st_{i,i,j1}           | setup time of job j1 on machine i (first job)
mt_i                  | maintenance times of machine i, where mt_{i,k} = [e, l]
dd_j                  | due date of job j
y_j                   | ending time of job j
x_{i,j0,j1} ∈ {0, 1}  | decision variable

the objective function is the minimization of the setup times:
\[
z = \min (z_1 + z_2), \tag{1}
\]
where $z_1$ means the job-job setup time,
\[
z_1 = \sum_{i=1}^{m} \sum_{j_0=1}^{n} \sum_{j_1=1}^{n} st_{i,j_0,j_1} \cdot x_{i,j_0,j_1}, \tag{2}
\]
and $z_2$ means the machine-job setup time,
\[
z_2 = \sum_{i=1}^{m} \sum_{j_1=1}^{n} st_{i,i,j_1} \cdot x_{i,i,j_1}. \tag{3}
\]
constraint 1: each job can be assigned to one machine:
\[
\sum_{i=1}^{m} \sum_{j_0=1}^{n} x_{i,j_0,j_1} + \sum_{i=1}^{m} x_{i,i,j_1} = 1 \qquad \forall j_1 = 1 \dots n. \tag{4}
\]
constraint 2: production continuity:
\[
\sum_{i=1}^{m} \sum_{j_0=1}^{n} x_{i,j_0,j_1} = \sum_{i=1}^{m} \sum_{j_0=1}^{n} x_{i,j_1,j_0} \qquad \forall j_1 = 1 \dots n. \tag{5}
\]
constraint 3: due date constraint:
\[
y_j \le dd_j \qquad \forall j = 1 \dots n. \tag{6}
\]
constraint 4: maintenance times: for every maintenance window $[e, l] \in mt_{i,k}$,
\[
y_{j_0} - pt_{j_0} \le e, \qquad y_{j_0} \le l, \tag{7}
\]
\[
x_{i,j_1,j_0} = 1, \qquad j_1 = 1 \dots n, \quad \forall j_0 = 1 \dots n \tag{8}
\]
occurs in at least one case.

4. monte carlo tree search

the monte carlo tree search (mcts) algorithm builds a tree structure to solve a specific problem. during the algorithm, the search tree grows gradually and asymmetrically. the algorithm starts at the root node and repeatedly picks a random child as a successor until it reaches a leaf node. the mcts guides the search towards better candidate successors. it uses a balance between exploration and exploitation. exploration means that the algorithm needs randomness to explore the search space. exploitation means that the mcts takes advantage of the best option we know. the algorithm repeats the following four phases [16]:
(1.) selection: the algorithm starts from the root node and recursively chooses the best child based on the ucb formula until a leaf node is reached. this phase can be seen in figure 2.
(2.) expansion: when a leaf is reached, the algorithm expands further, unless there is no possible leaf. this phase can be seen in figure 3.
(3.) simulation: run a playout from the leaf. this phase can be seen in figure 4.
(4.) backpropagation: update the evaluation (fitness) values from the result. this phase can be seen in figure 5.
the iteration of these four steps is called a simulation. the ucb formula provides a balance between exploration and exploitation:
\[
\mathrm{UCB}_j = \bar{x}_j + C \sqrt{\frac{\log_{10} N}{N_j}}, \tag{9}
\]
where $\bar{x}_j$ is the average win-rate (or fitness) of the $j$-th child, $C$ is the ucb constant, most of the time $C = 2$, $N_j$ is the visit count of the child $j$, and $N$ is the visit count of the parent.

figure 2. monte carlo tree search: selection
figure 3. monte carlo tree search: expansion
figure 4. monte carlo tree search: simulation
figure 5. monte carlo tree search: backpropagation
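eq. (9) is simple to turn into a selection routine. the sketch below is our illustration: the handling of unvisited children and the definition of the parent visit count are our assumptions, while the base-10 logarithm and $C = 2$ follow the text (many mcts variants use the natural logarithm instead).

```python
from math import log10, sqrt

def best_child(children):
    """children: list of dicts with 'fitness_sum' and 'visits'; the parent
    visit count n is taken as the sum of the child visits in this sketch."""
    n = sum(ch["visits"] for ch in children)
    def ucb(ch):
        if ch["visits"] == 0:
            return float("inf")          # always try unvisited children first
        xbar = ch["fitness_sum"] / ch["visits"]
        return xbar + 2 * sqrt(log10(n) / ch["visits"])  # eq. (9)
    return max(children, key=ucb)
```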
algorithm 1 presents the base monte carlo tree search technique. the tree policy means selecting or creating a leaf node from the nodes already contained within the search tree (i.e., the selection and expansion process). the default policy means a simulation (i.e., playing out the domain from a given non-terminal state to estimate the value).

algorithm 1: monte carlo tree search [17]
1: procedure monte carlo tree search
2:   create root node v0 with state s0
3:   while termination criterion is not met do
4:     vl ← treepolicy(v0)
5:     δ ← defaultpolicy(s(vl))
6:     backup(vl, δ)
7:   return bestchild(v0)

algorithm 2: iterative monte carlo tree search
1: procedure iterative monte carlo tree search
2:   while termination criterion is not met do
3:     create root node v0 with state s0
4:     while termination criterion is not met do
5:       vl ← treepolicy(v0)
6:       v ← defaultpolicy(s(vl))
7:       backup(v, δ)
8:       while termination criterion is not met do
9:         apply 2-opt on v
10:        backup(v, δ)
11:    v ← bestchild(v0)
12:    while termination criterion is not met do
13:      apply 2-opt on v
14:  return v

$v_0$ indicates the root node, and its state is $s_0$. $v_l$ is the last node reached during the tree policy; its corresponding state is indicated with $s_l$. $\delta$ means the gain of the terminal state reached with a random simulation from the state $s_l$.

5. algorithm for the parallel machine scheduling problem with job and machine sequence setup times, due dates and maintenance times

representation of our problem. our problem is represented with a permutation (figure 6). in our approach, the permutation means a given sequence of job indices. each element of the permutation is assigned to the first machine; if the constraints of any element (any job) of the permutation are not satisfied, then the remaining items (jobs) are scheduled to the next machine (a code sketch of this decoding and of the 2-opt move is given after the test results below).

figure 6. representation of our problem
figure 7. before 2-opt
figure 8. after 2-opt

edge swapping (2-opt). edge swapping is the swapping of elements between two randomly selected positions of the permutation.

iterative monte carlo tree search (imcts). our algorithm (algorithm 2) is a modification of the monte carlo tree search. we start from a random state (an initial permutation). switching from one state to another means an exchange (2-opt) operation; so by a state, we mean a particular permutation (solution). our mcts is an iterative algorithm, which means that the tree is re-built iteratively. after a tree is built, the 2-opt operation is also performed on the resulting solution.

6. test results

the test results are described in this section. our data set was artificially generated. a total of 7 different data sets were used. the first five data sets have 17 jobs and 5 machines, and the last two have 25 jobs and 10 machines. the best, worst and average results of the runs are illustrated in the tables, where p. means the problem, f. means the fitness value, t. means the time in seconds and res. means the results.

table 5. the test results for imcts in the case of 17 jobs and 7 machines.
     | best res.     | average res.  | worst res.
p.   | f.   | t. (s) | f.    | t. (s) | f.   | t. (s)
1    | 124  | 16     | 135   | 17.6   | 142  | 19
2    | 100  | 20     | 112.4 | 21.6   | 118  | 23
3    | 127  | 18     | 134   | 19     | 147  | 20
4    | 114  | 20     | 122.6 | 21     | 137  | 23
5    | 96   | 20     | 109.8 | 20.4   | 118  | 22

table 6. the test results for imcts in the case of 25 jobs and 10 machines.
     | best res.     | average res.  | worst res.
p.   | f.   | t. (s) | f.    | t. (s) | f.   | t. (s)
1    | 247  | 90     | 288.6 | 93.4   | 312  | 99
2    | 242  | 88     | 262.8 | 92.2   | 314  | 96

based on table 5, our imcts algorithm proved to be efficient for smaller problems. the situation is similar for the larger data set, as illustrated in table 6. in summary, based on the test results, the imcts algorithm proved to be efficient on our entire data set.
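as a companion to the representation described in section 5, the following sketch decodes a permutation into a machine assignment and implements the 2-opt move. the greedy "spill to the next machine" rule is our reading of the text, and the data structures (jobs, maintenance, setup) come from the sketch after table 3; setup times for machines other than machine 1 default to 0 there.

```python
import random

def overlaps_maintenance(machine, start, end):
    """True when the interval [start, end] intersects a maintenance window."""
    return any(start < l and e < end for e, l in maintenance.get(machine, []))

def decode(perm, n_machines=4):
    """assign jobs to machine 1, pushing constraint-violating jobs onward;
    returns (total setup time, number of scheduled jobs)."""
    remaining, setup_sum, scheduled = list(perm), 0, 0
    for machine in range(1, n_machines + 1):
        t, prev, unplaced = 0, 0, []
        for j in remaining:
            st = setup.get((machine, prev, j), 0)
            start, end = t + st, t + st + jobs[j][0]
            if end > jobs[j][1] or overlaps_maintenance(machine, start, end):
                unplaced.append(j)          # violates (6)/(7): spill over
            else:
                t, prev = end, j
                setup_sum, scheduled = setup_sum + st, scheduled + 1
        remaining = unplaced
    return setup_sum, scheduled

def two_opt(perm):
    """edge swapping: exchange two randomly selected elements."""
    a, b = random.sample(range(len(perm)), 2)
    new = list(perm)
    new[a], new[b] = new[b], new[a]
    return new

print(decode([1, 2, 3, 4, 5, 6, 7]))
```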
6. test results

the test results are described in this section. our data set was artificially generated; a total of 7 different data sets were used. the first five data sets have 17 jobs and 5 machines, and the last two have 25 jobs and 10 machines. the best, average and worst results of the runs are given in the tables, where p. means the problem, f. the fitness value, t. the time in seconds and res. the results.

| p. | best f. | best t. (s) | average f. | average t. (s) | worst f. | worst t. (s) |
|---|---|---|---|---|---|---|
| 1 | 124 | 16 | 135 | 17.6 | 142 | 19 |
| 2 | 100 | 20 | 112.4 | 21.6 | 118 | 23 |
| 3 | 127 | 18 | 134 | 19 | 147 | 20 |
| 4 | 114 | 20 | 122.6 | 21 | 137 | 23 |
| 5 | 96 | 20 | 109.8 | 20.4 | 118 | 22 |

table 5. the test results for imcts in the case of 17 jobs and 7 machines

| p. | best f. | best t. (s) | average f. | average t. (s) | worst f. | worst t. (s) |
|---|---|---|---|---|---|---|
| 1 | 247 | 90 | 288.6 | 93.4 | 312 | 99 |
| 2 | 242 | 88 | 262.8 | 92.2 | 314 | 96 |

table 6. the test results for imcts in the case of 25 jobs and 10 machines

based on table 5, our imcts algorithm proved to be efficient for the smaller problems. the situation is similar for the larger data set, as illustrated in table 6. in summary, based on the test results, the imcts algorithm proved to be efficient on our entire data set.

7. conclusion

in this article, a specific production scheduling task, the parallel machine scheduling problem with job and machine sequence setup times, due dates and maintenance times, was introduced. after the introduction, the literature on production scheduling was presented. in section 3, the mathematical model of the parallel machine scheduling problem with job and machine sequence setup times, due dates and maintenance times was detailed. after that, the monte carlo tree search algorithm was detailed. in section 5, our algorithm, the problem representation and the evaluation were presented. in section 6, the efficiency of the iterative monte carlo tree search was tested with artificially generated datasets. based on the test results, the efficiency of our representation and evaluation was demonstrated. in this article, we presented an example of using the mcts for scheduling. the authors believe that the imcts method they developed can be effective for additional scheduling tasks. our further research direction is the development and testing of other hybrid algorithms in the field of optimization.

references

[1] j. y. leung. handbook of scheduling: algorithms, models, and performance analysis. crc press, 2004.
[2] m. pinedo. scheduling: theory, algorithms, and systems. 5th ed. cham, 2016.
[3] e. vallada, r. ruiz. a genetic algorithm for the unrelated parallel machine scheduling problem with sequence dependent setup times. european journal of operational research 211(3):612–622, 2011. https://doi.org/10.1016/j.ejor.2011.01.011.
[4] o. a. arık, m. d. toksarı. multi-objective fuzzy parallel machine scheduling problems under fuzzy job deterioration and learning effects. international journal of production research 56(7):2488–2505, 2018. https://doi.org/10.1080/00207543.2017.1388932.
[5] g. bektur, t. saraç. a mathematical model and heuristic algorithms for an unrelated parallel machine scheduling problem with sequence-dependent setup times, machine eligibility restrictions and a common server. computers & operations research 103:46–63, 2019.
[6] d. biskup, j. herrmann, j. n. gupta. scheduling identical parallel machines to minimize total tardiness. international journal of production economics 115(1):134–142, 2008. https://doi.org/10.1016/j.ijpe.2008.04.011.
[7] h. safarzadeh, s. t. a. niaki. bi-objective green scheduling in uniform parallel machine environments. journal of cleaner production 217:559–572, 2019. https://doi.org/10.1016/j.jclepro.2019.01.166.
[8] m. ji, w. zhang, l. liao, et al. multitasking parallel-machine scheduling with machine-dependent slack due-window assignment. international journal of production research 57(6):1667–1684, 2019.
[9] g. centeno, r. l. armacost. minimizing makespan on parallel machines with release time and machine eligibility restrictions. international journal of production research 42(6):1243–1256, 2004. https://doi.org/10.1080/00207540310001631584.
[10] l. nie. parallel machine scheduling problem of flexible maintenance activities with fuzzy random time windows. in proceedings of the tenth international conference on management science and engineering management, pp. 1027–1040. springer, 2017.
[11] s. ghanbari, m. othman. a priority based job scheduling algorithm in cloud computing. procedia engineering 50(0):778–785, 2012.
[12] a. agnetis, j.-c. billaut, s. gawiejnowicz, et al. scheduling problems with variable job processing times. in multiagent scheduling, pp. 217–260. springer, 2014.
[13] k.-c. ying, z.-j. lee, s.-w. lin. makespan minimization for scheduling unrelated parallel machines with setup times. journal of intelligent manufacturing 23(5):1795–1803, 2012. https://doi.org/10.1007/s10845-010-0483-3.
[14] z. wu, m. x. weng. multiagent scheduling method with earliness and tardiness objectives in flexible job shops. ieee transactions on systems, man, and cybernetics, part b (cybernetics) 35(2):293–301, 2005. https://doi.org/10.1109/tsmcb.2004.842412.
[15] m. ciavotta, c. meloni, m. pranzo. speeding up a rollout algorithm for complex parallel machine scheduling. international journal of production research 54(16):4993–5009, 2016.
[16] t.-y. wu, i.-c. wu, c.-c. liang. multi-objective flexible job shop scheduling problem based on monte-carlo tree search. in 2013 conference on technologies and applications of artificial intelligence, pp. 73–78. ieee, 2013. https://doi.org/10.1109/taai.2013.27.
[17] c. b. browne, e. powley, d. whitehouse, et al. a survey of monte carlo tree search methods. ieee transactions on computational intelligence and ai in games 4(1):1–43, 2012. https://doi.org/10.1109/tciaig.2012.2186810.

acta polytechnica 62(4):427–437, 2022. https://doi.org/10.14311/ap.2022.62.0427. © 2022 the author(s). licensed under a cc-by 4.0 licence. published by the czech technical university in prague.

a model of isotope transport in the unsaturated zone, case study

josef chudoba∗, jiřina královcová, jiří landa, jiří maryška, jakub říha

technical university of liberec, faculty of mechatronics, informatics and interdisciplinary studies, studentská 2, liberec, czech republic
∗ corresponding author: josef.chudoba@tul.cz

abstract. this work deals with a groundwater flow and solute transport model in the near-surface (predominantly unsaturated) zone. the model is implemented so that it allows simulations of contamination transport from a source located in a geological environment of a rock massif. the groundwater flow model is based on richards' equation. evaporation is computed using the hamon model. the transport model is able to simulate advection, diffusion, sorption and radioactive decay. besides the basic model concept, the article also discusses potential cases that could lead to non-physical solutions. on three selected examples, which include, for example, rapidly changing boundary conditions, the article shows the solvability of such cases with the proposed model without unwanted effects, such as negative concentrations or oscillations of the solution, that do not correspond to the inputs.

keywords: richards' equation, unsaturated zone, groundwater flow in unsaturated zone, solute transport.
1. introduction

in the czech republic, it is planned to dispose of the spent nuclear fuel within a deep repository in hard crystalline rock. it is a concept similar to those adopted in sweden (www.skb.se) and finland (www.possiva.fi) [1–3], with one of the main differences being the near-zero terrain gradient over their coastal deep repository sites. in the czech republic, the repository candidate sites have an average altitude of about 500 m with locally steep gradients, which means that it is necessary to focus closely on both saturated and unsaturated flow and solute transport.

the biosphere model is a part of the safety analyses for ensuring the safety of a planned deep repository of spent nuclear fuel and highly active wastes. for the purpose of computing a possible radionuclide impact on the biosphere, it is necessary to know their distribution (concentration or activity) in its vicinity, i.e. in the groundwater (near its level), where there is a possibility of direct extraction via pumping, and also in the unsaturated zone above the groundwater level. a prediction of the radionuclide concentration distribution in groundwater is done by transport simulations along with groundwater flow simulations (both mainly in the saturated zone). such a model usually does not provide detailed enough concentrations in the zone near the surface, where the solute transport has an entirely different character (due to the low saturation). this problem is not new. it is being studied both in the context of deep repositories and of potential environmental contamination. existing sw tools (pandora [4], resrad [5], hydrus [6]) used in this field do allow for a transport simulation in the unsaturated zone, but they show numerical instabilities in the case of a dynamically changing flow boundary condition. this article uses case studies to show how to deal with such instabilities.

with a biosphere module in mind, it is necessary to evaluate the near-surface radionuclide distribution using a more detailed model of the unsaturated zone (based on a known concentration distribution in the saturated zone and on the expected precipitation amounts). for such simulations, there is a variety of existing tools; an overview of them, along with their characteristic features, may be found in steefel, appelo and arora [7]. the alternative is to implement one of the well-known concepts for transport in unsaturated and partially saturated environments, which is the case of the work presented in this article. the aim was to create a software tool for biosphere simulations that uses flow123d [8, 9], verified against other codes in [10], for transport simulations in the saturated zone (geosphere). flow123d provides the pressure distribution, velocity vectors and radionuclide concentrations in both the mobile and immobile phases of the saturated environment of a rock massif. for the calculation of the radionuclide concentration time-space distribution in the unsaturated zone of a defined model subdomain, we use precipitation amount data. so far, in the phase of model implementation and testing, we used data from the meteorological station praha klementinum, which are (for daily precipitation amounts) available for the past ca 200 years. the implementation includes the hamon evapotranspiration model [11]. the unsaturated zone flow simulation is based on richards' equation and the transport simulation is based on the advection-diffusion equation.
for the approximation of both flow and transport, the numerical scheme shown in [6] and [12] is used (as is the case of, for example, hydrus [6]).

this article deals with some aspects of the unsaturated flow and transport model implemented as an extension of a 3d hydrogeological and transport model of a site with a deep repository as a source of radionuclides that have the potential to migrate from the repository to the biosphere. the unsaturated zone model outputs serve as an input for the radionuclide transport computations in the biosphere. the much-needed mutual interconnection of all the modules, along with the necessary customization of the unsaturated zone model input and output form, is the main reason for our own implementation instead of using an existing sw such as hydrus. this also means that there is no need for an installation or execution of any third-party sw tools. we managed to ensure 1) the same data structure for transport simulations in both the saturated and unsaturated zones, which allows us to avoid a difficult interconnection of inputs and outputs of various simulation sws, and 2) a support for the sensitivity analysis of the whole path from a source to the biosphere. the downside is that we had to deal with problems linked to numerics, such as solution oscillations or negative concentrations. besides the model itself, the article also shows three case studies to present the model's usability, including a sensitivity analysis performed in accordance with [13].

2. model concept

the unsaturated zone model was implemented with the aim to simulate the radionuclide transport from the geosphere to the biosphere. it assumes a known time-space distribution of the radionuclide concentration in the part of the geosphere that is close to the groundwater level. this concentration distribution is acquired from the transport model of a site with a deep repository, which outputs concentration values in given times and discrete points of a simulation domain. in the case of the flow123d simulation tool, time-dependent element-wise constant values of concentration are available. it is possible to identify the near-surface zones of the model domain for which the more detailed unsaturated zone simulations will be performed, with the aim of acquiring a higher resolution of the concentration distribution near the surface.

while in the saturated part of the geosphere the flow has a predominantly horizontal character, in the unsaturated zone the flow of water and dissolved contaminants is mainly vertical. this is why the unsaturated zone model was limited to one dimension in the direction of the z axis (with the direction of rising altitude). for each selected near-surface element, a separate unsaturated zone simulation is performed with the following basic inputs (grouped in the sketch after this list):

• time evolution of the contamination concentration in water – acquired from an output of the 3d transport model; the values are used as a transport boundary condition for the 1d model,
• vertical profile of the soil composition in the unsaturated zone (alternatively, of the van genuchten parameters),
• depth where the saturation is equal to one, i.e. the depth of the groundwater level,
• time evolution of precipitation amounts; used as a flow boundary condition,
• time evolution of evaporation values (implemented via the hamon model [11]),
• time evolution and location of sources in the model domain.
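purely for illustration, the per-element inputs listed above could be grouped into a single record; this is our sketch, not the authors' data format:

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class UnsaturatedColumnInput:
    """illustrative grouping of the per-element inputs listed above;
    all field names are ours (hypothetical), not the authors' format."""
    boundary_concentration: List[Tuple[float, float]]   # (time, c) pairs from the 3d transport model
    van_genuchten_profile: List[dict]                   # soil / van genuchten parameters per depth interval
    water_table_depth: float                            # depth [m] where saturation reaches 1
    precipitation: List[Tuple[float, float]]            # (time, flux) top flow boundary condition
    evaporation: List[Tuple[float, float]]              # (time, sink), e.g. from the hamon model
    sources: List[dict] = field(default_factory=list)   # time/location of additional sources or sinks
```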
the definition of the initial conditions requires the lower part of the model domain to be saturated and, for the sake of numerical stability, requires the saturated part to be at least a few meters thick. the initial flow conditions in the model domain as a whole are usually derived from the groundwater level (hydraulic/piezometric head), which could be determined by a direct measurement or an expert estimation. the flow boundary condition on the top of the model domain is given by the volume of the water entering the profile (could be time dependent) and on the bottom by the hydraulic head (could also be time dependent). sources could be defined to cover the root withdrawal. for the transport simulation, the top boundary condition is usually zero concentration (when we assume the source of contamination to be located deep in the geosphere). the bottom boundary condition is defined based on the concentration values acquired from a simulation of the saturated zone. this boundary condition interconnects the saturated and the unsaturated model. the transport model includes sorption, diffusion and radioactive decay.

the 1d model domain is discretized with a given step. the values of the pressure head, water content and flux in each discretization node are the outputs of the flow simulation. the values of the concentrations in each discretization node are the outputs of the transport simulation.

3. mathematical and physical model

3.1. model of flow

in the saturated zone, all pores are filled with water and the pressure head h ≥ 0. the groundwater flow could be both horizontal and vertical. in the unsaturated zone, the pores could be filled with water or with air, the pressure head h < 0, and the flow is predominantly vertical. that is why we limited the model to one dimension. the flow in the unsaturated zone is described by richards' equation [14]:

$$\frac{\partial \theta(h)}{\partial t} = \frac{\partial}{\partial z}\left(K(\theta, h)\left(\frac{\partial h}{\partial z} + 1\right)\right) - S, \qquad (1)$$

where $h$ [m] is the pressure head, $\theta$ [1] is the water content, $t$ [s] is the time, $z$ [m] is the vertical axis, $S$ [s$^{-1}$] are the sources and $K$ [m s$^{-1}$] is the unsaturated hydraulic conductivity. in equation (1), the principal variable is $h$. the variable $\theta(h)$ and the parameter $K(h)$ depend nonlinearly on the current value of the pressure head $h$. according to van genuchten [15], this dependence could be described by the following equations:

$$\theta(h) = \begin{cases} \theta_r + \dfrac{\theta_s - \theta_r}{\left[1 + |\alpha h|^n\right]^m}, & h < 0 \\ \theta_s, & h \ge 0 \end{cases} \qquad (2)$$

$$K(h) = K_s \cdot S_e^{0.5} \cdot \left[1 - \left(1 - S_e^{1/m}\right)^m\right]^2, \qquad (3)$$

$$S_e = \frac{\theta(h) - \theta_r}{\theta_s - \theta_r}, \qquad m = 1 - \frac{1}{n},$$

where $\theta_r$ [1] is the residual water content, $\theta_s$ [1] is the saturated water content, $\alpha$ [m$^{-1}$] is the inverse of the air-entry value, $n$ [1] is an empirical shape-defining parameter and $K_s$ [m s$^{-1}$] is the saturated hydraulic conductivity. the van genuchten parameters $\alpha$ and $n$ depend on the soil composition: the fractions of clay, silt and sand and the bulk density of the soil. there is no analytical formula linking the soil composition to the van genuchten parameters; their values could be estimated, for example, by using the rosetta sw [16], which is based on a quasi-empirical model.

for a particular simulation, it is necessary to also input (aside from the soil parameters) the boundary and initial conditions. on each model boundary (there are only two, since the model domain is 1d), it is possible to input either the flux or the pressure head. the boundary conditions could be changed in the course of the simulation, both their type and their value.
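equations (2) and (3) transcribe directly into code; the following is a minimal sketch (numpy assumed), using for illustration the sandy-loam parameters that appear later in the case studies:

```python
import numpy as np

def theta(h, theta_r, theta_s, alpha, n):
    """water content theta(h), eq. (2); h in metres (negative above the water table)."""
    m = 1.0 - 1.0 / n
    h = np.asarray(h, dtype=float)
    se = (1.0 + np.abs(alpha * h) ** n) ** (-m)
    return np.where(h >= 0.0, theta_s, theta_r + (theta_s - theta_r) * se)

def conductivity(h, theta_r, theta_s, alpha, n, ks):
    """unsaturated hydraulic conductivity k(h), eq. (3)."""
    m = 1.0 - 1.0 / n
    th = theta(h, theta_r, theta_s, alpha, n)
    se = np.clip((th - theta_r) / (theta_s - theta_r), 1e-12, 1.0)
    return ks * se ** 0.5 * (1.0 - (1.0 - se ** (1.0 / m)) ** m) ** 2

# example: conductivity 1 m above the water table for the sandy loam used in
# cases 2 and 3 (theta_r = 0.065, theta_s = 0.41, alpha = 7.5 1/m, n = 1.89,
# ks = 1.228e-5 m/s)
k_1m = conductivity(-1.0, 0.065, 0.41, 7.5, 1.89, 1.228e-5)
```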
in each time step, it is necessary for the pressure head to be specified on at least one boundary. the initial conditions (in the form of pressure head values) are derived from the groundwater level in the model domain. it is possible to define sources $S$ [s$^{-1}$] which represent water withdrawal through roots or a well. the model domain is divided into a mesh of $n - 1$ line elements with $n$ nodes. richards' equation (1) has no analytical solution; in the model, it is solved using the picard numerical scheme [17]. the solution is in the form of the pressure head in each computational node and each simulation time step.

3.2. model of transport

in the unsaturated zone, the contamination may generally exist dissolved in three phases – liquid, solid and gaseous. in our concept, we do not account for a contamination transfer to the gaseous phase. the reasoning behind this is the long simulation time in comparison to the rock residence time of isotopes in the gaseous form. the concentration of dissolved contaminants/isotopes is very low, which is why the model uses only the linear sorption isotherm. the transport model also includes diffusion and radioactive decay. in each simulation time, we also assume an equilibrium between the concentrations in the solid and liquid phases, given by the distribution coefficient. based on these assumptions, the dependence of the concentration on time and space could be described by the advection-diffusion equation (4) [6]:

$$\frac{\partial \theta c_k}{\partial t} + \frac{\partial \rho s_k}{\partial t} = \frac{\partial}{\partial z}\left(\theta D_k^w \frac{\partial c_k}{\partial z}\right) - \frac{\partial q c_k}{\partial z} - \mu_{w,k}\theta c_k - \mu_{s,k}\rho s_k + \sum_{\substack{m=1 \\ m \neq k}}^{n} \mu_{w,m}\theta c_m + \sum_{\substack{m=1 \\ m \neq k}}^{n} \mu_{s,m}\rho s_m + \gamma_{w,k}\theta + \gamma_{s,k}\rho - r_k, \qquad (4)$$

where $\theta$ [1] is the water content, $\rho$ [kg m$^{-3}$] is the rock density, $z$ [m] is the vertical axis, $c_k$ [kg m$^{-3}$] is the concentration of isotope $k$ in the liquid phase, $s_k$ [1] is the concentration of isotope $k$ in the solid phase, $D_k^w$ [m$^2$ s$^{-1}$] is the diffusion coefficient of isotope $k$, $q$ [m s$^{-1}$] is the flux, $\mu_{w,k}$ and $\mu_{s,k}$ [s$^{-1}$] are the radioactive decay rates of isotope $k$ in the liquid and solid phases, $\gamma_{w,k}$ [kg m$^{-3}$ s$^{-1}$] is the zero-order reaction in the liquid phase, $\gamma_{s,k}$ [s$^{-1}$] is the zero-order reaction in the solid phase, and $r_k$ [kg m$^{-3}$ s$^{-1}$] are the sources. the relationship between the concentration in the solid phase $s_k$ and in the liquid phase $c_k$ is given by equation (5):

$$s_k = K_{d,k} \cdot c_k, \qquad (5)$$

where $K_{d,k}$ [m$^3$ kg$^{-1}$] is the distribution coefficient of linear sorption for isotope $k$. the boundary conditions on both ends of the model domain are given in the form of concentrations in the liquid phase; the model allows the boundary conditions to change over time. the initial conditions are given as concentrations in the liquid phase in the individual computational nodes; the initial concentrations in the solid phase are then computed using equation (5). the differential equation (4) is numerically solved using the finite element method. the computation uses the same discretisation as the model of flow. the solution is in the form of the concentration of each contaminant in each node and at each simulation time.
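the authors solve eq. (4) with the finite element method; purely to make the roles of sorption and decay concrete, the following is a minimal explicit finite-difference sketch of a reduced, single-isotope form of eq. (4) (advection, diffusion, linear sorption via eq. (5) folded into a retardation factor, and first-order decay). the scheme and all names are ours, not the paper's implementation:

```python
import numpy as np

def transport_step(c, theta, q, dz, dt, d_w, rho, kd, lam):
    """one explicit step of a reduced form of eq. (4) on a uniform 1d grid.
    the retardation factor r = 1 + rho*kd/theta folds the solid phase
    (eq. (5)) into the liquid-phase balance; lam is the decay constant."""
    r = 1.0 + rho * kd / theta
    adv = -q * np.gradient(c, dz)                       # -d(q c)/dz for (near-)uniform q
    dif = d_w * np.gradient(theta * np.gradient(c, dz), dz)
    dcdt = (adv + dif) / (theta * r) - lam * c
    return c + dt * dcdt
```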
4. model instabilities

while simulating the flow and transport using the model described above, some unstable states may arise; they needed to be dealt with during the implementation. potential instabilities must be eliminated by an appropriate choice of parameter values. this section provides an overview of the known potential unstable states, along with the method that was used to deal with them in the model implementation.

the evaporation from the model surface may be defined by an outflux boundary condition. this may cause a non-physical state where the withdrawn water does not exist in the model domain. this would lead to results limitedly approaching $h \to -\infty$ and $\theta \to \theta_r$, which would require the simulation time step $\Delta t \to 0$. a possible solution is to define the evaporation as a source (water is withdrawn from a defined depth). this approach allows for the simulation of 1) water withdrawal through roots, 2) evaporation and 3) near-surface water circulation. another possible solution is to define the model domain extent so that its bottom part remains saturated throughout the simulation.

if the defined initial conditions imply a non-zero flux in the model domain, it results in a rapid stabilization in the first few simulation time steps, which may lead to significant computed groundwater fluxes. this may lead to a non-physical solution and a necessity to shorten the simulation time step. to control the time step, the courant number, defined as

$$Cr = \frac{|q \cdot \Delta t|}{\theta \cdot \Delta z} \; [1],$$

may be used. the solution is stable when $Cr \le 1$. based on the knowledge of $\max_i Cr_i$ (here, the lower index $i$ stands for the element of the discretisation), the time step may be adjusted accordingly.

when changing the flow boundary conditions during the course of the simulation, problems similar to the previous case may arise. those problems may be generally solved by lowering the simulation time step and making the changes in the boundary conditions gradual, if possible. because the flux may be discontinuous in time, it is necessary to estimate the courant number and adjust the time step accordingly. unstable and non-physical solutions may generally be caused by a too long flow-simulation time step; the problem may be fixed by an estimation of the time step using the courant number.

when simulating the transport, there may be oscillations in the solution and negative concentrations in cases where the advective flux dominates over the diffusion (along with a suboptimal time and space discretisation). the maximum transport-simulation time step may be set based on the péclet number

$$Pe = \frac{q \Delta x}{\theta D} \; [1],$$

where $q$ [m s$^{-1}$] is the flux, $\Delta x$ [m] is the element size, $\theta$ [1] is the water content and $D$ [m$^2$ s$^{-1}$] is the diffusion coefficient. numerical oscillations are negligible if $Pe < 5$. the denominator in the equation for the péclet number includes the diffusion coefficient, which consists of the hydrodynamical dispersion and the molecular diffusion. in the case of a 1d simulation, this means

$$D_w = \frac{D_L |q|}{\theta} + D_w^{mol} \tau_w, \qquad \tau_w = \frac{\theta^{7/3}}{\theta_s^2},$$

where $D_L$ [m] is the longitudinal dispersivity, $D_w^{mol}$ [m$^2$ s$^{-1}$] is the coefficient of molecular diffusion and $\tau_w$ is the tortuosity. the péclet number may be lowered by including both the dispersion and the molecular diffusion in the simulation. oscillations of the solution and/or negative concentrations may also be caused by a changing concentration in the transport boundary conditions; such problems may be solved by lowering the transport simulation time step.

an unstable solution may occur in the case of a too small model domain with respect to the pressure head value given as the flow boundary condition. if the bottom boundary condition is a pressure head which is not $h \gg 0$, then, if the precipitation amounts are high, the saturated/unsaturated zone interface moves up but the boundary condition remains the same. this results in a significant pressure head gradient in the nodes close to the bottom boundary and, consequently, unrealistically high fluxes. such problems may be solved by extending the model domain so that it captures a greater part of the saturated zone.
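the two checks above are easy to automate; a minimal sketch (our helper, assuming a representative flux value per element) that halves the time step until the courant condition holds and reports the grid péclet number:

```python
def stable_dt(q, theta, dz, dt, d_l, d_mol, theta_s):
    """shrink dt until cr <= 1 holds, and report the grid peclet number
    (the text requires pe < 5); q [m/s], dz [m], dt [s]."""
    tau = theta ** (7.0 / 3.0) / theta_s ** 2       # tortuosity tau_w
    d_w = d_l * abs(q) / theta + d_mol * tau        # dispersion + molecular diffusion
    pe = abs(q) * dz / (theta * d_w)                # grid peclet number
    cr = abs(q) * dt / (theta * dz)                 # courant number
    while cr > 1.0:
        dt *= 0.5
        cr = abs(q) * dt / (theta * dz)
    return dt, cr, pe
```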
5. case studies

this part of the article shows three case studies. the first one is a synthetic task that includes a simulation of flow and transport with regularly changing boundary conditions. the second one examines the sensitivity of the output on the precipitation amount. and finally, the third one shows the expected radionuclide migration in the case of a realistic precipitation boundary condition (data from praha klementinum, available since 1804). this case also includes the evaporation computed using the hamon model, which is based on temperature measurements, latitude and date (the last two quantities are used for the computation of the daylight length). the bottom boundary conditions as well as the initial conditions were, in all three cases, derived from a saturated transport simulation with a contamination source located 500 m below the surface. this regional saturated model is not discussed here; the parameters derived from it are shown where relevant. the case studies are evaluated based on the (depth dependent) concentration evolution.

5.1. case 1 – periodic precipitation evolution

the first case simulates the flow and solute transport from a 10 m depth towards the surface. the model domain is divided into 100 computational elements (101 nodes). the z axis is in the direction of rising altitude (the surface is at z = 10 m). the simulation period is 5 000 days with a simulation time step of 1 day. the soil properties are constant throughout the model domain. the soil is 40 % silt, 15 % clay and 45 % sand; its dry density is 1 500 kg m$^{-3}$. this composition corresponds to the following parameters: $K_s = 1.912 \cdot 10^{-6}$ m s$^{-1}$, $n = 1.469$, $\theta_r = 0.0492$, $\theta_s = 0.3687$, $\alpha = 1.355$ m$^{-1}$.

figure 1. pressure head time evolution at 5 m depth (left) and 1 m depth (right).
figure 2. concentration time evolution at 4 m depth (left) and 1 m depth (right).

the flux representing the precipitation (after the subtraction of the evaporation) is used as a flow boundary condition on the top boundary. the time evolution of the boundary condition has a sinus shape with a 365-day periodicity. its values for the individual simulation times were (for the expected influx of 365 mm y$^{-1}$) calculated as

$$q_{top}(t) = 365 - 438 \cdot \sin\left(\frac{2\pi t}{365}\right) \; [\text{mm y}^{-1}].$$
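as an illustration, the top flux boundary condition above can be tabulated for each simulation day; a minimal sketch (numpy assumed, names ours):

```python
import numpy as np

def top_flux_mm_per_year(t_days):
    """case-1 top boundary condition: a 365 mm/y mean influx modulated by a
    365-day sinus (the formula above); negative values mean a net outflow."""
    return 365.0 - 438.0 * np.sin(2.0 * np.pi * np.asarray(t_days) / 365.0)

# daily values over the 5 000-day simulation period
flux = top_flux_mm_per_year(np.arange(5000.0))
```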
a constant pressure head of 5 m is specified on the bottom boundary throughout the simulation. this corresponds to a groundwater level 5 m below the surface, which is also how the initial conditions were defined. the flow simulation result is the time-space distribution of the pressure head and saturation. for example, at the 5 m depth (where the groundwater level was situated at the simulation start), the pressure head oscillates between 0.009 and 0.061 m (see figure 1, left). at one meter below the surface (see figure 1, right), we can see an initial rise, followed by periodic oscillations between −2.61 and −1.17 m (a difference of ca 1.44 m).

based on the flow simulation results, the transport of two isotopes (denoted isotope 1 and isotope 2) was simulated. isotope 1 has a half-life of 10 000 days; its product is the stable isotope 2. the following parameters were used for the simulation: coefficient of molecular diffusion $1 \cdot 10^{-3}$ m$^2$ day$^{-1}$ (i.e. $1.15 \cdot 10^{-8}$ m$^2$ s$^{-1}$), longitudinal dispersivity 0.1 m, distribution coefficient 0 m$^3$ kg$^{-1}$ for isotope 1 and 0.0001 m$^3$ kg$^{-1}$ [18] for isotope 2. the initial concentration for both isotopes is $1 \cdot 10^{-9}$ kg m$^{-3}$ below the groundwater level (depths of 5–10 m) and zero above it. the bottom transport boundary condition is a constant concentration of $1 \cdot 10^{-9}$ kg m$^{-3}$. the model also assumes a horizontal flow in its saturated part, which means that the concentration there is kept at a constant value of $1 \cdot 10^{-9}$ kg m$^{-3}$ (via a prescription of non-zero sources in the computational nodes) throughout the simulation.

figure 2 shows examples of the results at two depths: 4 m and 1 m. it is clear that after 2 000 days, the concentration evolutions adopt the periodicity of the precipitation evolution. near the surface, the concentration is significantly lowered by the infiltration of clean water. in a realistic environment, the measure of dilution would be predetermined by many factors, such as the soil type, the unsaturated zone thickness, the precipitation and evaporation amounts, roots, and wells.

| isotope | half-life [y] | diffusion coefficient [m$^2$ s$^{-1}$] | distribution coefficient [m$^3$ kg$^{-1}$] |
|---|---|---|---|
| 14c | 5 700 | 1e-9 | 0.0005 |
| 36cl | 301 000 | 1e-9 | 0.0 |
| 41ca | 102 000 | 1e-9 | 0.0001 |
| 59ni | 76 000 | 1e-9 | 0.01 |
| 79se | 356 000 | 1e-9 | 0.0005 |
| 107pd | 6 500 000 | 1e-9 | 0.0001 |
| 126sn | 230 000 | 1e-9 | 0.0 |
| 129i | 16 100 000 | 1e-9 | 0.0 |
| 135cs | 2 300 000 | 1e-9 | 0.01 |
| 238u | 4 468 000 000 | 1e-9 | 0.1 |

table 1. values of the half-life, molecular diffusion coefficient and linear sorption distribution coefficient of the selected isotopes used in the simulation.

5.2. case 2 – radionuclide transport with a realistic transport boundary condition, sensitivity on the infiltration amount

this case deals with the transport of 10 real isotopes (14c, 36cl, 41ca, 59ni, 79se, 107pd, 126sn, 129i, 135cs and 238u) which are expected to migrate from a deep repository. the isotope parameters used in the simulation are shown in table 1. the regional saturated model transport simulation period was 500 000 years. as a consequence of the given radionuclide release scenario and transport parameters, the following isotopes had a high enough concentration near the surface at the end of the simulation: 36cl, 41ca, 79se, 129i. the concentrations of the other isotopes were less than 1e-15 g m$^{-3}$. for the four isotopes mentioned above, the unsaturated transport simulation was performed with the aim to determine the sensitivity of the concentration distribution on the infiltration amount.

the model domain is 10 m thick and discretized with a 0.1 m step. the simulation period was 500 000 y with a 0.1 y time step. the soil is sandy loam with the following parameters: $\theta_r = 0.065$, $\theta_s = 0.41$, $\alpha = 7.5$ m$^{-1}$, $n = 1.89$, $K_s = 1.228 \cdot 10^{-5}$ m s$^{-1}$. the saturated part of the model domain is 8 m thick (groundwater level 2 m below the surface). this corresponds to a bottom flow boundary condition of a constant 8 m pressure head. the simulations were carried out with various top flow boundary conditions that correspond to annual infiltrations of 0, 10, 20, 30, 40, 50, 60 and 80 mm y$^{-1}$. the transport initial condition was zero concentration throughout the domain. the bottom transport boundary condition had the form of a concentration time evolution based on the regional saturated model results (see figure 3). these concentrations are also kept in the saturated part of the model domain throughout the simulation.

table 2 shows the calculated values of the 36cl, 41ca, 126sn and 129i concentrations at the simulation times of 50 000, 100 000 and 200 000 y and in the depths of 0.3, 0.5 and 1.0 m, as well as in the saturated zone. the results shown are for the infiltration amounts of 0, 20 and 50 mm y$^{-1}$. there is a clear significant influence of the infiltration amount.
for each isotope, simulation time and given infiltration amount, the ratio between the unsaturated zone (at a selected level) and the saturated zone concentration is about the same. the measure of dilution in the unsaturated zone is influenced mainly by the infiltration amount; the influence of the simulation time and isotope is negligible. figure 4 shows the dependence of the 129i concentration on the depth and infiltration amount at a simulation time of 50 000 y. figure 5 shows the evolution of the 129i concentration at selected depths.

36cl:

| infiltration [mm y−1] | time [y] | 0.3 m | 0.5 m | 1.0 m | saturated zone |
|---|---|---|---|---|---|
| 0 | 50 000 | 1.5e-06 | 1.6e-06 | 1.6e-06 | 1.6e-06 |
| 0 | 100 000 | 8.3e-07 | 8.4e-07 | 8.6e-07 | 8.8e-07 |
| 0 | 200 000 | 1.6e-07 | 1.6e-07 | 1.6e-07 | 1.7e-07 |
| 20 | 50 000 | 1.1e-12 | 8.2e-12 | 1.1e-09 | 1.6e-06 |
| 20 | 100 000 | 6.1e-13 | 4.5e-12 | 6.1e-10 | 8.8e-07 |
| 20 | 200 000 | 1.2e-13 | 8.4e-13 | 1.2e-10 | 1.7e-07 |
| 50 | 50 000 | 2.0e-13 | 1.6e-12 | 2.8e-10 | 1.6e-06 |
| 50 | 100 000 | 1.1e-13 | 8.4e-13 | 1.5e-10 | 8.8e-07 |
| 50 | 200 000 | 2.0e-14 | 1.6e-13 | 2.8e-11 | 1.7e-07 |

41ca (the values are identical for the 50 000, 100 000 and 200 000 y simulation times):

| infiltration [mm y−1] | 0.3 m | 0.5 m | 1.0 m | saturated zone |
|---|---|---|---|---|
| 0 | 8.5e-19 | 8.6e-19 | 9.3e-19 | 1.0e-18 |
| 20 | 7.0e-25 | 5.1e-24 | 7.0e-22 | 1.0e-18 |
| 50 | 1.2e-25 | 9.5e-25 | 1.7e-22 | 1.0e-18 |

126sn:

| infiltration [mm y−1] | time [y] | 0.3 m | 0.5 m | 1.0 m | saturated zone |
|---|---|---|---|---|---|
| 0 | 50 000 | 9.3e-19 | 9.3e-19 | 9.7e-19 | 1.0e-18 |
| 0 | 100 000 | 6.9e-18 | 7.1e-18 | 7.5e-18 | 7.9e-18 |
| 0 | 200 000 | 1.4e-14 | 1.4e-14 | 1.4e-14 | 1.5e-14 |
| 20 | 50 000 | 7.0e-25 | 5.1e-24 | 7.0e-22 | 1.0e-18 |
| 20 | 100 000 | 5.5e-24 | 4.0e-23 | 5.5e-21 | 7.9e-18 |
| 20 | 200 000 | 1.0e-20 | 7.5e-20 | 1.0e-14 | 1.5e-14 |
| 50 | 50 000 | 1.2e-25 | 9.5e-25 | 1.7e-22 | 1.0e-18 |
| 50 | 100 000 | 9.5e-25 | 7.5e-24 | 1.3e-21 | 7.9e-18 |
| 50 | 200 000 | 1.8e-21 | 1.4e-20 | 2.5e-18 | 1.5e-14 |

129i:

| infiltration [mm y−1] | time [y] | 0.3 m | 0.5 m | 1.0 m | saturated zone |
|---|---|---|---|---|---|
| 0 | 50 000 | 1.6e-05 | 1.6e-05 | 1.6e-05 | 1.5e-05 |
| 0 | 100 000 | 8.1e-06 | 8.1e-06 | 8.1e-06 | 8.1e-06 |
| 0 | 200 000 | 1.3e-06 | 1.3e-06 | 1.3e-06 | 1.3e-06 |
| 20 | 50 000 | 1.1e-11 | 7.8e-11 | 1.1e-08 | 1.5e-05 |
| 20 | 100 000 | 5.6e-12 | 4.1e-11 | 5.6e-09 | 8.1e-06 |
| 20 | 200 000 | 9.1e-13 | 6.6e-12 | 9.1e-10 | 1.3e-06 |
| 50 | 50 000 | 1.9e-12 | 1.5e-11 | 2.6e-09 | 1.5e-05 |
| 50 | 100 000 | 9.7e-13 | 7.7e-12 | 1.4e-09 | 8.1e-06 |
| 50 | 200 000 | 1.6e-13 | 1.2e-12 | 2.2e-10 | 1.3e-06 |

table 2. concentrations of 36cl, 41ca, 126sn, 129i in selected simulation times, for various precipitation infiltration amounts, in selected depths.

figure 3. concentration evolutions at the geosphere/biosphere interface used as a boundary condition in the unsaturated zone transport simulations.
figure 4. dependence of the 129i concentration on depth and infiltration amount at the simulation time 50 000 y.
figure 5. evolution of the 129i concentration at selected depths.

5.3. case 3 – precipitation and evaporation based on meteorological data

the model domain is 10 m thick and discretized with a 0.1 m step. the soil is sandy loam with the following parameters: $\theta_r = 0.065$, $\theta_s = 0.41$, $\alpha = 7.5$ m$^{-1}$, $n = 1.89$, $K_s = 1.228 \cdot 10^{-5}$ m s$^{-1}$. the saturated part of the model domain is 7 m thick (groundwater level 3 m below the surface). this corresponds to a bottom flow boundary condition of a constant 7 m pressure head. the flow boundary conditions on the top of the model domain are derived from the meteorological measurements from the past 200 years [19]. the influx to the model domain is based on the daily precipitation data. the evaporation is estimated based on the mean temperatures using the hamon model [11] and included in the simulation as a negative source uniformly distributed in the top 1 m of the model domain. the transport simulation assumes that contamination with a concentration of 1 µg m$^{-3}$ appears (as a consequence of the flow) in the previously clean saturated portion of the model domain during the first simulation time step.
this concentration remains constant throughout the simulation. the simulated tracer is 129i; it is non-sorbing and its decay is negligible (its half-life is much longer than the simulation period). the simulation period is dictated by the length of the available meteorological data, i.e. 78 965 days (may 1st 1804 to december 31st 2018). the simulation time step was 0.01 days.

figure 6 shows the time evolutions of the selected quantities. they capture the first five years of the simulation (the time axis has a mm.yy format). from the results, we can see an initial rapid change to an equilibrium state followed by seasonal oscillations around a mean value. for example, in the case of the concentrations at the 2 m depth (figure 6c), the oscillation magnitude is about 10 % of the mean, at the 0.5 m depth it is 18 % and at the surface level (figure 6d) it is up to 200 %. this significant near-surface concentration uptick happens during the summer months, when the evaporation rate is higher, which, combined with lower precipitation amounts, leads to higher fluxes from the groundwater level towards the surface. the significantly low values in the summer season are caused by rainstorms. colder seasons with higher precipitation amounts (usually from october to march) are linked with lower concentrations and oscillations. the concentrations drop by 14–15 orders in the 3 m thick unsaturated zone; from $1 \cdot 10^{-9}$ in the saturated zone to around (depending on time) $6 \cdot 10^{-24}$ kg m$^{-3}$ near the surface.

figure 6. time evolution of (from top to bottom) a) daily precipitation, b) evaporation, c) concentration 2 m below the terrain and d) concentration at the terrain level.

during the simulation of this case study, there were rapid changes of the flow boundary condition (precipitation). this has led to instabilities and non-physical solutions. the stability was increased by shortening the simulation time step to the final value of 0.01 days. another way to increase the stability would be to increase the portion of the simulation domain in which the evaporation and root withdrawal take place; a 1 m thickness was used. a third way (not used in this study) would be to average the precipitation/evaporation amounts over a longer time period.
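the evaporation in this case is estimated with the hamon model [11] from the mean temperature and the daylight length. the sketch below follows a widely used formulation of that model; the exact coefficients and the daylength formula are our assumptions, as the paper does not list them:

```python
import math

def daylight_hours(lat_deg, day_of_year):
    """approximate daylight length from latitude and date (standard
    solar-declination formula; an assumption, not taken from the paper)."""
    lat = math.radians(lat_deg)
    decl = 0.4093 * math.sin(2.0 * math.pi * (284 + day_of_year) / 365.0)
    cos_omega = max(-1.0, min(1.0, -math.tan(lat) * math.tan(decl)))
    return 24.0 * math.acos(cos_omega) / math.pi

def hamon_pet_mm_per_day(t_mean_c, daylight_h):
    """potential evapotranspiration after hamon (1961) [11], in mm/day:
    pet = 0.1651 * ld * rho_sat, with ld the daylight length in 12-hour
    units and rho_sat the saturated vapour density at the daily mean
    temperature t [deg c]."""
    e_sat = 6.108 * math.exp(17.26939 * t_mean_c / (t_mean_c + 237.3))  # sat. vapour pressure [hpa]
    rho_sat = 216.7 * e_sat / (t_mean_c + 273.3)                        # sat. vapour density [g/m^3]
    return 0.1651 * (daylight_h / 12.0) * rho_sat

# e.g. a july day in prague (latitude ~50 deg, day 200, mean 20 deg c)
pet = hamon_pet_mm_per_day(20.0, daylight_hours(50.0, 200))
```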
6. conclusion

for a simulation of the flow and transport in an unsaturated zone, several concepts and their implementations (both commercial and non-commercial) exist. despite that (for the reasons stated in the introduction), we have decided to implement our own tool simulating the flow using richards' equation and the transport (advection, diffusion, sorption and radioactive decay) using the advection-diffusion equation. the result is a sw that allows for a simulation of these processes in the unsaturated zone or, to be exact, in a simulation domain that consists of both the saturated and unsaturated zones. this sw was designed and implemented as a part of a more complex system that tracks the radionuclides released from a deep repository through the saturated zone, the unsaturated zone and the biosphere all the way to an individual (in the form of a calculated effective dose). we have created a tool for a safety assessment of contamination sources, such as deep repositories of spent nuclear fuel and highly active wastes in a geosphere. our implementation allowed us to have a precise control over the data exchange between the individual models covering portions of the transport path. however, it forced us to deal with potentially unstable states and to verify the model so that we rule out any significant conceptual or implementation errors.

three case studies were described in this article. they were designed to show the behaviour and results of the model in cases with varying boundary conditions and sources. such cases are very challenging for the existing sw tools used in the field, and our ability to solve them is one of the main achievements of the sw concept (and implementation) shown in this article. simulations with daily changes in precipitation and evaporation proved challenging because they can cause instabilities manifesting as negative concentrations or oscillations (not corresponding to the periodicity in the inputs). the presented cases prove the solvability of such situations by the presented model implementation, as well as its usability within the system for a safety assessment of the risks linked to a biosphere contamination by radionuclides released from a deep repository. the system could also be used for cases when the contamination source is in the atmosphere and contaminates the environment through fallout and rain water.

acknowledgements

this article was prepared with a support from the technology agency of the czech republic within the epsilon program (www.tacr.cz).

references

[1] p. o. johansson. description of surface hydrology and near-surface hydrogeology at forsmark. site descriptive modelling, sdm-site forsmark. tech. rep. skb r-08-08, swedish nuclear fuel and waste management co, stockholm, sweden, 2008. https://www.skb.com/publication/1853247/r-08-08.pdf.
[2] k. werner, m. sassner, e. johansson. hydrology and near-surface hydrogeology at forsmark – synthesis for the sr-psu project. sr-psu biosphere. tech. rep. skb r-13-19, swedish nuclear fuel and waste management co, stockholm, sweden, 2013. https://www.skb.com/publication/2477948/r-13-19.pdf.
[3] e. jutebring sterte, e. johansson, y. sjöberg, et al. groundwater-surface water interactions across scales in a boreal landscape investigated using a numerical modelling approach. journal of hydrology 560:184–201, 2018. https://doi.org/10.1016/j.jhydrol.2018.03.011.
[4] p. a. ekström. pandora – a simulation tool for safety assessments. technical description and user's guide. tech. rep. skb r-11-01, swedish nuclear fuel and waste management co, stockholm, sweden, 2010. https://www.skb.com/publication/2188873/r-11-01.pdf.
[5] c. yu, e. gnanapragasam, j.-j. cheng, et al. user's manual for resrad-offsite code version 4. united states nuclear regulatory commission, argonne national laboratory, usa, 2020. https://resrad.evs.anl.gov/docs/resrad-offsite_usersmanual.pdf.
[6] j. šimůnek, m. šejna, h. saito, et al. the hydrus-1d software package for simulating the movement of water, heat, and multiple solutes in variably saturated media, version 4.17, hydrus software series 3. department of environmental sciences, university of california riverside, riverside, california, usa, 2013. https://www.pc-progress.com/downloads/pgm_hydrus1d/hydrus1d-4.17.pdf.
[7] c. i. steefel, c. a. j. appelo, b. arora, et al. reactive transport codes for subsurface environmental simulation. computational geosciences 19(3):445–478, 2015. https://doi.org/10.1007/s10596-014-9443-x.
[8] j. březina, m. hokr. mixed-hybrid formulation of multidimensional fracture flow. in i. dimov, s. dimova, n. kolkovska (eds.), numerical methods and applications. nma 2010. lecture notes in computer science, vol. 6046. springer, berlin, heidelberg, 2011. https://doi.org/10.1007/978-3-642-18466-6_14.
[9] j. březina, j. stebel, d. flanderka, et al. flow123d, version 2.2.1. user guide and input reference. technical university of liberec, faculty of mechatronics, informatics and interdisciplinary studies, liberec, 2018. https://flow.nti.tul.cz/packages/2.2.1_release/flow123d_2.2.1_doc.pdf.
[10] m. hokr, h. shao, w. p. gardner, et al. real-case benchmark for flow and tracer transport in the fractured rock. environmental earth sciences 75(18):1273, 2016. https://doi.org/10.1007/s12665-016-6061-z.
[11] w. r. hamon. estimating potential evapotranspiration. journal of the hydraulics division 87(3):107–120, 1961. https://doi.org/10.1061/jyceaj.0000599.
[12] t. vogel, k. huang, r. zhang, m. t. van genuchten. the hydrus code for simulating one-dimensional water flow, solute transport, and heat movement in variably-saturated media, version 5.0. research report no. 140. u.s. salinity laboratory, usda, ars, riverside, ca, 1996.
[13] k. zhang, y.-s. wu, j. e. houseworth. sensitivity analysis of hydrological parameters in modeling flow and transport in the unsaturated zone of yucca mountain, nevada, usa. hydrogeology journal 14(8):1599–1619, 2006. https://doi.org/10.1007/s10040-006-0055-y.
[14] l. a. richards. capillary conduction of liquids through porous mediums.
physics 1(5):318–333, 1931. https://doi.org/10.1063/1.1745010.
[15] m. t. van genuchten. a closed-form equation for predicting the hydraulic conductivity of unsaturated soils. soil science society of america journal 44(5):892–898, 1980. https://doi.org/10.2136/sssaj1980.03615995004400050002x.
[16] m. g. schaap, f. j. leij, m. t. van genuchten. rosetta: a computer program for estimating soil hydraulic parameters with hierarchical pedotransfer functions. journal of hydrology 251(3-4):163–176, 2001. https://doi.org/10.1016/s0022-1694(01)00466-8.
[17] m. a. celia, e. t. bouloutas, r. l. zarba. a general mass-conservative numerical solution for the unsaturated flow equation. water resources research 26(7):1483–1496, 1990. https://doi.org/10.1029/wr026i007p01483.
[18] e. juranová, e. hanslík, s. dulanská, et al. sorption of anthropogenic radionuclides onto river sediments and suspended solids: dependence on sediment composition. journal of radioanalytical and nuclear chemistry 324(3):983–991, 2020. https://doi.org/10.1007/s10967-020-07174-w.
[19] historical data: weather: prague clementinum. chmi portal. [2020-11-02]. http://portal.chmi.cz/historicka-data/pocasi/praha-klementinum?l=en#.

acta polytechnica 58(4):259–263, 2018. doi:10.14311/ap.2018.58.0259. © czech technical university in prague, 2018. available online at http://ojs.cvut.cz/ojs/index.php/ap.

numerical simulations of static strength of novel dental implants

luboš řehounek∗, aleš jíra, františek denk

czech technical university in prague, faculty of civil engineering, thákurova 7, 166 29 prague, czech republic
∗ corresponding author: lubos.rehounek@fsv.cvut.cz

abstract. it is important to know the mechanical and micromechanical characteristics of conventional dental implants in order to design and develop novel dental implants and surface treatments that ensure a good biomechanical stability and almost fully replace the biological tissue. a successful integration of the implant depends on its chemical, mechanical and physical properties and the quality of the bone tissue. considering these facts, we developed two novel variants of dental implant stems with an anti-rotational geometrical solution. on these variants, we have then performed numerical analyses of static strength. the outcomes of the simulations vary depending on the type of the implant and the load mode.

keywords: dental implant; numerical; static strength; osseointegration; fem.

1. introduction

every successfully implanted prosthetic has to be accepted by the body of the patient.
this is a process we call osseointegration – the full functional and structural connection between the living tissue and an implant [1]. many implants fail to achieve osseointegration because of an early failure [2–4]. the extent of osseointegration depends on many factors, such as the chemical, mechanical and physical properties of the implant, the bone-implant bonding, cytotoxicity or the implant geometry. this article focuses on the implant geometry and on simulations of the static strength of two novel dental implants that were patented in the czech republic [5, 6]. these novel dental implants are shown in fig. 1.

the first implant is the 0001 "four leaf clover" variant. it has a system of stabilizing ribs that resembles a four leaf clover in its cross-section. these ribs connect into a half-spherical root ending and serve to increase the torsional stability of the implant. the second variant is the 0002 "ribbed" variant. its cylindrical shape transforms into a conical part, which is equipped with a system of vertical stabilizing ribs. there is a small hollow chamber situated at the intersection of the beams that serves as a free space for the human bone to grow into. the vertical ribs, coupled together with the transversally-oriented chamber, provide a good vertical and torsional stability. the vertical ribs run parallel to each other and form a 90° angle. the lower intraosseous part of the implant is comprised of a tapering conical surface with the stabilizing ribs. this part is especially important in a replacement of front teeth, as there is usually not enough space to introduce a cylindrical implant.

figure 1. two novel dental implant types, the "four leaf clover" variant (left) and the "ribbed" variant (right). the four leaf clover implant variant has a half-spherical ending and four vertical ribs running through the whole body of the implant. the ribbed implant has four vertical ribs situated in the conical intraosseous part of the stem as well as a small hollow chamber. both implants belong to the "push-in" category of dental implants.

both implants offer a greater torsional stability and a much greater bonding surface for osseointegration than conventional implants, which is beneficial in the case of front teeth or in applications where there is generally not enough space (e.g., in between two roots of adjacent teeth). another two variants of the 0002 implant have been developed, both using a half-spherical ending: one of them has a parallel system of vertical ribs and the second has a system of slant ribs (fig. 2).

figure 2. other versions of the ribbed dental implant. one with a system of slant stabilizing ribs (left) and one with a system of parallel stabilizing ribs (right). these implant variants were not considered in the numerical analysis at this point in time.

2. numerical analyses

the main goal of the presented numerical analyses is the evaluation of the intraosseous parts of dental implants in regard to their ability to withstand a mechanical load. for these analyses, we chose two novel patented dental implant types (type 0001, the "four leaf clover" variant, and type 0002, the "ribbed" variant). for the sake of comparison, we also analysed one conventionally manufactured dental implant, type prospon zv14112-3200-00008. all analysed implants are from the "push-in" category of dental implants.
the analyses focus on evaluating the static strength of the implants. the simulation and evaluation were performed according to the čsn en iso 14801 standard – "dentistry – implants – dynamic loading test for endosseous dental implants" [7]. the goal of this analysis was to determine the stress and strain response of the novel dental implants and compare them with a real approved implant (prospon zv14112-3200-00008), which conforms to the given standard. the analyses will provide the determination of the critical locations where stresses concentrate, along with viable design changes that could potentially benefit the stress distributions. the main observed attribute of the analysed intraosseous parts of the novel dental implant stems is their distribution of stress and strain. the environment used for the simulations was ansys apdl.

3. creating fem models

3.1. dental implants geometry

the geometrical models of the dental implants 0001, 0002 (novel implants) and of the implant zv14112-3200-00008 (reference implant) were created using the ansys apdl environment. the foundation for these geometries was a precise project documentation provided in pdf and dxf formats by prospon spol. s r.o. the outer surfaces of the implants are created in detail, whereas the inner parts are created without a great attention to detail (e.g., without the screw-thread).

3.2. mesh

we used solid187 quadratic tetrahedrons generated in the "smart size" mode as the mesh elements. every single element has 10 nodes situated at its vertices and in the middle of every edge. the numerical analysis assumes a linear elastic behaviour of all materials. the structure of all materials was defined as homogeneous and isotropic. individual material properties were specified for each material, but are left out of this paper for the sake of brevity.

3.3. numerical simulation and loads

the loads described in this subsection are specified in the čsn en iso 14801 standard [7]. the test set comprises the dental implant stem, a pillar with a loading head, a connecting screw m 2.2 with an inner hexagon and a device for the sample anchoring (fig. 3).

figure 3. a schematic of the testing assembly according to the used standard. 1 – loading head, 2 – nominal surface level of bone tissue, 3 – connecting part, 4 – hemispherical loading part, 5 – dental implant stem, 6 – fixation device.

the ansys apdl simulation represented testing the anchored specimen with a static load. the stem was equipped with a proper superstructure. the anchoring device is represented by a hollow cylindrical body of an outer diameter of 8.0 mm and a height of 18.0 mm. the outer surface of the cylindrical part has a defined boundary condition that eliminates its displacement in all directions (ux = uy = uz = 0). the superstructure is set on the upper hexagon of the specimen. it is represented by a full cylinder with a diameter of 3.8 mm or 4.0 mm and a height of 4.0 mm. the superstructure's surface is defined by the shape of the implant. the interconnection of these bodies is attained by using rigid constraints generated by volumetric "vglue" operations.

figure 4. equivalent von mises stress distribution (mpa) for the 0001 implant variant. whole body and 2 different load modes.
figure 5. equivalent von mises stress distribution (mpa) for the 0002 implant variant. whole body and 2 different load modes.
the load is applied by a single static force located at the centrepoint of the upper base of the cylindrical superstructure. the load magnitude was set as f = 350 n. this force is applied with a 30° deviation. this magnitude of force was chosen because it corresponds with the static failure of the implant during loading in the mode of controlled deformation. the whole assembly was investigated in two load modes – in the plane of the centreline of the ribs and in the vertical plane running in between them. the outcomes of the analysis are shown in tab. 1.

4. results and discussion

in the following figures, we show the stress distributions of the different dental implants. we chose to use the von mises stress, as it is the most widely used criterion for analysing ductile materials, such as metals.

4.1. the 0001 variant results

the analysis was carried out in two different planes in regard to the orientation of the system of stabilizing ribs and the applied load. it was found that the stem loaded in the plane of the ribs exhibits a better mechanical response. the extreme values of stress are approximately 12 % lower, while the displacement values are almost the same for both load modes. from this fact, we can deduce that a dental implant stem with more than four ribs would have a superior mechanical response. concentrations of stresses occur mainly at places of sudden changes of curvature, at geometrical inhomogeneities and at the area of the transition of the implant into the fixation device (fig. 4). this shortcoming can be accounted for by rounding all sharp edges in the geometry by a radius of at least 0.5 mm. the comparison showed that the extreme values of stress are approximately 35 % lower than those of the prospon zv14112-3200-00008 reference implant.

the tested implant variant 0001 is able to withstand load values up to 350 n without any plasticity in the body of the stem. the maximum value of stress is 801.656 mpa, which is approximately 95 % of the yield strength (fk = 850.000 mpa). further attention will be directed towards determining the ultimate bearing capacity with an elastoplastic model. also, a numerical model of the abutment and the connecting screw will be created to realistically simulate the connection to the stem.

figure 6. equivalent von mises stress distribution (mpa) for the zv14112-3200-00008 implant variant. whole body and 2 different load modes.
figure 7. three trios of models of individual dental implants. left – the "four leaf clover" variant, middle – the "ribbed" variant, right – the reference implant. images in trios from left to right – whole 3d model, longitudinal section, fem mesh.
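as an illustration of the checks behind the percentages quoted in this section, here is a sketch of the 30°-inclined load split and of the yield-utilization ratio; the decomposition is our reading of the iso 14801 setup, and the stress values are taken from tab. 1:

```python
import math

f = 350.0                       # load magnitude [n] per the test setup
angle = math.radians(30.0)      # inclination of the force
f_axial = f * math.cos(angle)   # component along the implant axis (~303 n)
f_lateral = f * math.sin(angle) # bending component (175 n)

def utilization(sigma_eqv_max, f_k=850.0):
    """ratio of the peak equivalent von mises stress to the yield strength f_k [mpa]."""
    return sigma_eqv_max / f_k

# values from tab. 1: variant 0001 stays elastic, variant 0002 does not
print(round(utilization(801.656), 3))  # 0.943 -> ~95 % of yield, below f_k
print(round(utilization(869.518), 3))  # 1.023 -> ~102 % of yield, local plasticity expected
```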
figure 6. equivalent von mises stress distribution (mpa) for the zv14112-3200-00008 implant variant. whole body and 2 different load modes.
figure 7. three trios of models of individual dental implants. left – the "four leaf clover" variant, middle – the "ribbed" variant, right – the reference implant. images in trios from left to right – whole 3d model, longitudinal section, fem mesh.

4.2. the 0002 variant results
the analysis was carried out in two different load modes with regard to the placement of the stabilizing ribs and the applied force. similarly to implant 0001, the stem of the implant variant 0002 exhibits lower values of stress (approximately 11 % lower) when tested in the plane of the stabilizing ribs. it is, therefore, assumed that the stress distribution will be more favourable for the direction of the force in the plane of the stabilizing ribs. the analysed stem is not thin, but rather bulky. despite this fact, the analysis showed that greater values of stress occur at the upper part of the stem in comparison to the variant 0001 (8 % greater values of maximum stress). the displacement values are approximately 8 % lower. when compared to the prospon zv14112-3200-00008 reference implant, the implant 0002 has lower extreme values of stress (approximately by 29 %). concentrations of stresses in the intraosseous parts of the stem occur at the root areas of the stem, at the stabilizing ribs and at the location of their anchoring into the cylindrical part. the stress concentrations also occur at the walls of the transversal chambers (fig. 5). this is a fact that could potentially negatively affect osseointegration and bone ingrowth. the next development will be directed towards an optimization of the shape and length of the stabilizing ribs, rounding the sharp edges with a radius of at least 0.5 mm and adjusting the root with regard to an optimal distribution of vertical forces into the tissue of a cancellous bone. the tested implant variant 0002 is not able to withstand a load of 350 n without any plasticity. plasticity is expected to be present at the area of anchoring of the stabilizing ribs into the cylindrical part of the intraosseous stem. the maximum stress in this area reaches values of 869.518 mpa, which is approximately 102 % of the yield strength (fk = 850.000 mpa) and 91 % of the ultimate strength (fp = 950.000 mpa). the next development in verifying the mechanical behaviour of this implant will be directed towards determining the values of the ultimate bearing capacity with an assumption of elastoplastic behaviour in the body of the implant stem. as for the implant variant 0001, a numerical model of the abutment and the connecting screw will be created to realistically simulate the connection to the stem.

figure 8. all manufactured implant variants. variant with a conical shape and ribs (left), variant with a system of slant ribs (middle) and variant with a parallel system of ribs (right). all implants have been manufactured from the ti6al4v eli alloy and have a diameter of 3.8 mm and a length of 8 mm (the most conventionally used size).

table 1. an overview of the results of the numerical analysis for all tested dental implants (the two rows per implant correspond to the two load modes).

tested implant                        | uimp [µm] | uc [µm] | min σeqv [mpa] | max σeqv [mpa]
implant 0001                          | 104       | 215     | 10.724         | 801.656
                                      | 104       | 213     | 10.964         | 703.792
implant 0002                          | 95        | 190     | 9.862          | 869.518
                                      | 95        | 188     | 9.121          | 777.285
reference implant zv14112-3200-00008  | 180       | 361     | 22.474         | 1217.100
                                      | 180       | 357     | 22.249         | 1166.960
(uimp – deformation of the implant head, uc – deformation of the superstructure head, σeqv – equivalent von mises stress)

5. conclusions
the outcomes of the numerical analysis provide an insight into the stress distribution of two novel dental implants. another, conventionally manufactured reference dental implant was tested to compare the obtained results. the main investigated properties are the stress distribution, the strain distribution and the locations of their concentration. all values mentioned in the subsections of this section are listed in tab. 1. it was found that the implant variant 0001 has a better overall mechanical response. although the differences between variants 0001 and 0002 are small, it was proven to be the better variant of the two. by modifying the geometry of the implant, we were able to reduce the maximum values of stress by up to 35 % compared to the reference conventional implant.

acknowledgements
the financial support by the technology agency of the czech republic (tačr project no. ta03010886) is gratefully acknowledged.

references
[1] t. albrektsson, p. i. branemark, h. a. hansson, j. linderstrom.
osseointegrated titanium implants. requirements for ensuring a long-lasting, direct bone-to-implant anchorage in man. acta orthop scand 52(2):155–70, 1981.
[2] b. friberg, t. jemt, u. lekholm. early failures in 4,641 consecutively placed brånemark dental implants: a study from stage 1 surgery to the connection of completed prostheses. the international journal of oral & maxillofacial implants 6(2):142–6, 1991.
[3] r. osman, m. swain. a critical review of dental implant materials with an emphasis on titanium versus zirconia. materials 8(3):932–958, 2015. doi:10.3390/ma8030932.
[4] c. palma-carrió, l. maestre-ferrín, d. peñarrocha-oltra, et al. risk factors associated with early failure of dental implants. a literature review. medicina oral, patologia oral y cirugia bucal 16(4):2–5, 2011. doi:10.4317/medoral.16.e514.
[5] f. denk, f. denk, a. jíra. a dental implant shaft, czech republic, 306457, 2017.
[6] f. denk, f. denk, a. jíra. a dental implant shaft, czech republic, 306456, 2017.
[7] čsn en iso 14801: dentistry – implants – dynamic loading test for endosseous dental implants.

1 introduction
the pumping capacity of a pitched blade impeller (pbt) is defined [1] as the amount of liquid leaving the rotor region of the impeller, i.e., the cylindrical volume circumscribed by the rotating impeller blades, per unit time. this quantity is an important process characteristic of the pbt and plays an important role when calculating the blending or homogenization time of miscible liquids in mixing [2, 3], in the design of continuous-flow stirred reactors [4] and in calculating the process characteristics of solid-liquid suspensions [5], i.e. the impeller frequency for just off bottom suspension. the pumping capacity of a pbt can be measured by the indirect "flow follower" (indicating particle) method [1] and calculated from the measured mean time of liquid primary circulation, or calculated [3, 5, 6] from the known radial profile of the axial component of the mean velocity in the impeller discharge stream leaving the impeller rotor region by means of integration over the circular cross section of the impeller rotor region. the pumping capacity of the pbt, $Q_p$, can be expressed in dimensionless form as the impeller flow rate number [1, 2]

$$ N_{Qp} = \frac{Q_p}{n d^3} , \qquad (1) $$

where $n$ is the frequency of the impeller revolution and $d$ is its diameter. the quantity $N_{Qp}$ does not depend on the reynolds number of the impeller when this quantity exceeds ten thousand [1, 3, 6]. for the impeller power input $P$ the power number has been introduced,

$$ Po = \frac{P}{\rho n^3 d^5} , \qquad (2) $$

where $\rho$ is the density of the agitated liquid. this quantity is also independent of the impeller reynolds number when it exceeds ten thousand. a combination of the dimensionless quantities $N_{Qp}$ and $Po$ gives the so-called hydraulic efficiency of the impeller [2, 7], defined as

$$ E_p = \frac{N_{Qp}^3}{Po} \left( \frac{d}{T} \right)^4 , \qquad (3) $$

where $T$ is the diameter of the vessel. the higher the quantity $E_p$, the greater the ability to convert impeller energy consumption into its pumping effect.
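to make eqs. (1)–(3) concrete, the short sketch below evaluates them for a sample operating point. it is an illustration added to this text; the sample values ($N_{Qp}$ = 0.56, $Po$ = 1.27, $d/T$ = 0.36) are the measured data for the three-blade 45° impeller reported later in table 1.

```python
# illustrative sketch of eqs. (1)-(3); not part of the original paper.
def flow_rate_number(q_p, n, d):
    """eq. (1): n_qp = q_p / (n * d**3)."""
    return q_p / (n * d**3)

def power_number(p, rho, n, d):
    """eq. (2): po = p / (rho * n**3 * d**5)."""
    return p / (rho * n**3 * d**5)

def hydraulic_efficiency(n_qp, po, d, t):
    """eq. (3): e_p = n_qp**3 / po * (d/t)**4."""
    return n_qp**3 / po * (d / t)**4

# three-blade 45° pbt, d = 0.36 * t with t = 0.400 m (values from the paper)
e_p = hydraulic_efficiency(n_qp=0.56, po=1.27, d=0.144, t=0.400)
print(f"e_p = {e_p:.5f}")   # ~0.00232, matching table 1
```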
this study deals with an analysis of the pumping and energetic efficiency of various pitched blade impellers under two main geometrical conditions: conditions convenient for solid-liquid suspension operations and conditions convenient for blending miscible liquids. the pbt pumping capacity will be calculated from the radial profile of the axial component of the mean velocity in the impeller discharge stream leaving the impeller rotor region. the velocity profile will be determined by laser doppler anemometry.

2 experimental
experiments were carried out in a pilot plant cylindrical vessel with a dished bottom [9] – see fig. 1.

study of pumping capacity of pitched blade impellers
i. fořt, t. jirout, r. sperling, s. jambere, f. rieger
a study was made of the pumping capacity of pitched blade impellers in a cylindrical pilot plant vessel with four standard radial baffles at the wall under a turbulent regime of flow. the pumping capacity was calculated from the radial profile of the axial flow, under the assumption of axial symmetry of the discharge flow. the mean velocity was measured using laser doppler anemometry in a transparent vessel of diameter t = 400 mm, provided with a standard dished bottom. three and six blade pitched blade impellers (the pitch angle varied within the interval α ∈ ⟨24°; 45°⟩) of impeller / vessel diameter ratio d/t = 0.36, as well as a three blade pitched blade impeller with folded blades of the same diameter, were tested. the calculated results were compared with the results of experiments mentioned in the literature, above all in cylindrical vessels with a flat bottom. both arrangements of the agitated system were described by the impeller energetic efficiency, i.e., a criterion including in dimensionless form both the impeller energy consumption (impeller power input) and the impeller pumping effect (impeller pumping capacity). it follows from the results obtained with various geometrical configurations that the energetic efficiency of pitched blade impellers is significantly lower for configurations suitable for mixing solid-liquid suspensions (low impeller off bottom clearances) than for blending miscible liquids in mixing (higher impeller off bottom clearances).
keywords: pitched blade impeller, impeller pumping capacity, turbulent flow, laser doppler anemometer, impeller energetic efficiency.

fig. 1: layout of test rig with dished bottom (d/t = 0.36, c/d = 0.5, b/t = 0.1, R/t = 1, r/t = 0.1)

the vessel had a diameter t = 400 mm and was equipped with four equally spaced radial baffles mounted at the wall. it was filled with water at a temperature of 20 °c to a height of h = t. the frequency of revolution of the impeller was measured by means of a photoelectric cell with an accuracy of ±1 rev/min. figs. 2 and 3 show geometrical sketches of the pitched blade impellers used in this study. two types of impellers were investigated: simple pbts (see fig. 2) and a pbt with folded blades (see fig. 3). both types of impellers had a relative diameter d/t = 0.36 and a relative off bottom clearance c/d = 0.5. the off bottom clearance was measured from the centre of the dished bottom to the lower edge of the impeller using a ruler, with a precision of ±1 mm. the error in measuring the blade angle of the pbts can be considered as ±0.5°. all the pbts rotated in such a way that they pumped liquid downwards towards the bottom.
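a small added check (not in the paper) confirms that the tested speeds put the system well into the turbulent regime required for eqs. (1)–(3) to be speed-independent; the water properties are assumed here, and the speeds are those quoted later in the caption of fig. 5.

```python
# illustrative sketch: impeller reynolds number re_m = rho * n * d**2 / mu
# for the tested speeds; water properties at ~20 °c are assumed.
rho, mu = 998.0, 1.0e-3      # [kg/m^3], [pa s] (assumed)
d = 0.36 * 0.400             # impeller diameter d = (d/t) * t [m]
for n_rpm in (162, 210, 261, 310):
    n = n_rpm / 60.0         # revolutions per second
    re_m = rho * n * d**2 / mu
    print(f"n = {n_rpm:3d} 1/min -> re_m = {re_m:,.0f}")   # all well above 1e4
```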
the mean velocity field in the impeller discharge flow just below the impeller rotor region was measured with a laser doppler anemometer (lda). a dantec 55x two component modular series lda and its associated bsa data processor, connected with a pc, was used for the experiments. the lda was operated in a forward scatter mode (see fig. 4). the laser (5 w ar-ion, manufactured by spectra physics, usa) and optics were mounted on a bench with a two-dimensional traversing mechanism. to identify the flow reversals correctly, a frequency shift was given to one of the beams by means of a bragg cell with electronic downmixing. two components of the local velocity were measured simultaneously, with a positioning accuracy of ±0.1 mm. the sample size was set at 20,000 items for each velocity measurement, and the mean (time-averaged) value from all the samples was calculated.

fig. 2: sketch of pitched blade impellers with three or six blades – czech standard cvs 691020 (a/ nb = 3: α = 24°, 35°, 45°, b/ nb = 6: α = 45°, h/d = 0.2)
fig. 3: sketch of a pitched blade impeller with folded blades – czech standard cvs 691010 (nb = 3, s/d = 1.5, α = 67°, β = 25°, γ = 48°, h/d = 0.2)
fig. 4: layout of a laser doppler anemometer with forward scatter mode

3 results and discussion
the impeller pumping capacity $Q_p$ was calculated from the experimentally determined radial profiles of the axial component of the mean velocity in the impeller discharge stream leaving the impeller rotor region, $\bar{w}_{ax} = \bar{w}_{ax}(r)$. the local value of the mean velocity corresponds to the ensemble average value over the circle of radius $r$ determined by lda. assuming axial symmetry of the impeller discharge stream, the impeller pumping capacity can be calculated from the equation

$$ Q_p = 2 \pi \int_0^{d/2} \bar{w}_{ax}(r) \, r \, \mathrm{d}r . \qquad (4) $$

fig. 5 depicts the measured radial profiles of the axial components of the mean velocity in the impeller discharge flow related to the impeller tip speed,

$$ W_{ax} = \frac{\bar{w}_{ax}}{\pi d n} , \qquad (5) $$

at various impeller frequencies of revolution. this figure provides quite a good illustration of the independence of the dimensionless quantity $W_{ax}$ from the frequency of revolution of the impeller, corresponding to the fully turbulent regime of the agitated liquid. the dimensionless radial coordinate 2r/t defines the velocity profile in the axial discharge flow [1]. in the vicinity of the impeller hub (2r/t ∈ ⟨0; 0.1⟩) the liquid velocity amounts to a zero value, then it increases in the region of the rankine forced vortex and, finally, it decreases in the region of the trailing vortices behind the impeller blades [1]. table 1 consists of the above mentioned results of calculations for all the impellers tested (the value of the criterion $N_{Qp}$ is the arithmetic mean value from the values of $Q_p$ (or $N_{Qp}$) calculated according to formulas (1) and (4) from the experimental velocity data) and further values of the power number $Po$ calculated from eq. (2). the impeller power input was also determined experimentally by means of a strain gauge torquemeter mounted on the impeller shaft. finally, the value of the impeller hydraulic efficiency was calculated from eq. (3) and was included in table 1. table 2 consists of the dimensionless pumping and energetic characteristics of pbts in a baffled system with a flat bottom, found in the literature under conditions of higher off bottom impeller clearance than those set up in our study.
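the following sketch, added for this text, shows how eq. (4) can be evaluated numerically from a sampled radial profile. the profile used here is synthetic (zero at the hub, a peak in the forced-vortex region, decaying towards the blade tip); it merely stands in for the measured lda data.

```python
# illustrative sketch of eq. (4): q_p = 2*pi * integral_0^{d/2} w_ax(r) * r dr,
# evaluated with the trapezoidal rule on a sampled profile.
import math

def pumping_capacity(r, w_ax):
    """trapezoidal evaluation of eq. (4); r in m, w_ax in m/s."""
    q = 0.0
    for i in range(len(r) - 1):
        f0 = w_ax[i] * r[i]
        f1 = w_ax[i + 1] * r[i + 1]
        q += 0.5 * (f0 + f1) * (r[i + 1] - r[i])
    return 2.0 * math.pi * q

d = 0.144                                  # impeller diameter [m]
r = [i * (d / 2) / 20 for i in range(21)]  # 21 radial sampling points
w = [0.5 * math.sin(math.pi * ri / (d / 2)) for ri in r]  # synthetic profile
print(f"q_p = {pumping_capacity(r, w):.5f} m^3/s")
```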
table 1: dimensionless pumping and energetic characteristics of pitched blade impellers in a baffled system with a dished bottom (h/t = 1, d/t = 0.36, c/d = 0.5)

nb | α [°] | n_qp | po   | e_p
 3 |  24   | 0.41 | 0.37 | 0.00312
 3 |  35   | 0.51 | 0.79 | 0.00282
 3 |  45   | 0.56 | 1.27 | 0.00232
 6 |  45   | 0.65 | 1.81 | 0.00255

table 2: dimensionless pumping and energetic characteristics of pitched blade impellers in a baffled system with a flat bottom

i. results of wu and co-workers [5] (h/t = 1, d/t = 0.41, c/d = 0.815)

nb | α [°] | n_qp | po   | e_p
 4 |  20   | 0.43 | 0.27 | 0.00832
 4 |  25   | 0.53 | 0.37 | 0.01137
 4 |  30   | 0.58 | 0.56 | 0.00951
 4 |  35   | 0.65 | 0.73 | 0.01063
 4 |  40   | 0.72 | 0.97 | 0.01087
 4 |  45   | 0.76 | 1.22 | 0.01016
 2 |  30   | 0.49 | 0.45 | 0.00739
 3 |  30   | 0.54 | 0.53 | 0.00840
 5 |  30   | 0.60 | 0.69 | 0.00847
 6 |  30   | 0.61 | 0.72 | 0.00891

ii. results of fořt and medek [7] (d/t = 0.33, c/d = 0.1)

nb  | α [°] or s/d     | e_p     | note
2–6 | α = 25–60°       | 0.00742 |
2–6 | s/d = 1.0 – 1.25 | 0.00864 | impeller with folded blades (see fig. 3)

fig. 5: radial distribution of the axial component of the dimensionless mean velocity in a liquid leaving the rotor region of a pitched blade impeller – nb = 3, α = 24° (points: n = 162, 210, 261 and 310 min⁻¹)

the flow rate criterion $N_{Qp}$ and the power number $Po$ were correlated in the literature on the basis of many experiments carried out for pitched blade impellers in a baffled flat bottomed cylindrical pilot plant agitated system (see fig. 6) under a turbulent regime of the agitated liquid. medek [8] published the correlation

$$ Po = 1.507 \, n_b^{0.701} \, (c/d)^{0.165} \, (T/d)^{0.365} \, (H/T)^{0.140} \, \sin^{2.077}\alpha , \qquad (6) $$

while medek and fořt [1] published

$$ N_{Qp} = 0.745 \, n_b^{0.233} \, (c/d)^{0.254} \, (T/d)^{0.023} \, (H/T)^{0.251} \, \sin^{0.468}\alpha . \qquad (7) $$

the intervals of validity of the two correlations are as follows: $n_b \in \langle 2; 8 \rangle$, $c/d \in \langle 0.2; 1.0 \rangle$, $T/d \in \langle 2.45; 5.93 \rangle$, $H/T \in \langle 0.55; 1.0 \rangle$, $\alpha \in \langle 15°; 60° \rangle$, 4 baffles, $b/T = 0.1$, $Re_m \geq 1.0 \cdot 10^4$. taking the data of kresta and wood [6] and comparing it with correlation (7), we can write the relation

$$ N_{Qp} \sim (c/d)^{0.244} , \quad d/T = 1/3, \; H/T = 1, \; c/d \in \langle 0.165; 0.875 \rangle . \qquad (8) $$

the exponent at the geometrical simplex c/d mentioned in eq. (8), following from the data of kresta and wood, corresponds fairly well to the exponent at the same simplex in eq. (7). similarly, we can compare the relation [5]

$$ N_{Qp} \sim n_b^{0.215} , \quad d/T = 0.41, \; H/T = 1, \; c/d = 0.813, \; n_b \in \langle 2; 6 \rangle \qquad (9) $$

with the corresponding relation within eq. (7) and, again, the two exponents for the equivalent quantity ($n_b$) agree fairly well. a combination of eqs. (6) and (7) according to the definition of the hydraulic efficiency of the impellers (eq. (3)) gives, for the squared configuration of the agitated system (h/t = 1),

$$ E_p = 0.274 \, n_b^{-0.008} \, (c/d)^{0.597} \, (d/T)^{3.566} \, \sin^{-0.673}\alpha . \qquad (10) $$

the exponent for the number of impeller blades in eq. (10) can be neglected with respect to its statistical significance. table 3 shows the values of the impeller hydraulic efficiency $E_p$ under various geometrical conditions, calculated from eq. (10). comparing the values of the impeller hydraulic efficiency in tables 1–3, we can consider as important the influence of the impeller off bottom clearance c/d and, also, probably, the shape of the bottom.
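before continuing the discussion, here is a short sketch, added for this text, that evaluates correlation (10) for the geometries of table 3 (given below). because the published exponents are rounded, the computed values reproduce the tabulated ones only approximately.

```python
# illustrative sketch: hydraulic efficiency per correlation (10), for h/t = 1.
import math

def e_p(n_b, c_over_d, d_over_t, alpha_deg):
    s = math.sin(math.radians(alpha_deg))
    return 0.274 * n_b**-0.008 * c_over_d**0.597 * d_over_t**3.566 * s**-0.673

for c_d in (0.5, 1.0):
    for n_b, alpha in ((3, 24), (3, 35), (3, 45), (6, 45)):
        val = e_p(n_b, c_d, 0.36, alpha)
        print(f"c/d = {c_d}, nb = {n_b}, alpha = {alpha:2d}° -> e_p = {val:.5f}")
```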
the curved shape of the bottom and the shorter distance between the impeller and the bottom, both important geometric features suitable for solid-liquid suspension during mixing, reduce the ability of the impeller to convert its power input into its pumping efficiency. on the other hand, when there is a longer distance between the impeller and the bottom, i.e., under conditions suitable for the blending of miscible liquids during mixing [3], the hydraulic efficiency of the pitched blade impeller exhibits fairly high values. a pitched blade impeller with folded blades (see fig. 3) corresponds quite well to its original design purpose [7], i.e., to replace the shape of the complex surface of the marine propeller by the simple and well defined shape of the folded blade of a pitched blade impeller, while its hydraulic efficiency is the same as the value of this quantity for a marine propeller.

4 conclusions
the pumping capacity of pitched blade impellers depends significantly, under a turbulent regime of flow, on the geometry of the agitated system, i.e., on the shape of the bottom, the impeller off bottom clearance, and the impeller / vessel diameter ratio. the impeller hydraulic efficiency exhibits higher values for an impeller off bottom clearance equal to the impeller diameter than for half of this distance, when interference between the bottom and the impeller takes place. this phenomenon is more apparent when the dished bottom of the cylindrical vessel is introduced.

fig. 6: layout of a cylindrical agitated flat bottomed system (h/t = 1, d/t = 1/3, c/d = 0.5 or 1, b/t = 0.1)

table 3: hydraulic efficiency of a pitched blade impeller calculated from data correlation (10) in a system with a flat bottom (h/t = 1, d/t = 0.36)

c/d | nb | α [°] | e_p
0.5 | 3  | 24    | 0.00768
0.5 | 3  | 35    | 0.00689
0.5 | 3  | 45    | 0.00598
0.5 | 6  | 45    | 0.00598
1.0 | 3  | 24    | 0.01200
1.0 | 3  | 35    | 0.01042
1.0 | 3  | 45    | 0.009045
1.0 | 6  | 45    | 0.009045

list of symbols
b – baffle width, m
c – off bottom impeller clearance, m
d – impeller diameter, m
e_p – impeller hydraulic efficiency
H – height of liquid from bottom of vessel, m
h – width of impeller blade, m
n_qp – flow rate number
n – impeller frequency of revolution, s⁻¹
n_b – number of impeller blades
p – impeller power input, w
po – power number
q_p – impeller pumping capacity, m³ s⁻¹
re_m – reynolds number
R – radius of dished bottom, m
r – radius of round corners of dished bottom, m
r – radius, m
s – pitch, m
T – vessel diameter, m
W_ax – dimensionless axial component of the liquid mean velocity
w_ax – axial component of the liquid mean velocity, m s⁻¹
α – pitch angle of blade, °
β – pitch angle of blade, °
γ – angle, °
ρ – density of agitated liquid, kg m⁻³

acknowledgment
this research was supported by research project of the ministry of education of the czech republic no. jo4/98: 212200008.

references
[1] medek, j., fořt, i.: pumping effect of impellers with flat inclined blades. collect. czech. chem. commun., vol. 44, 1979, p. 3077–3089.
[2] nienow, a. w.: on impeller circulation and mixing effectiveness in the turbulent flow regime. chem. eng. sci., 1997, vol. 22, no. 15, p. 2557–2565.
[3] patwardhan, a. w., joshi, j. b.: relation between flow pattern and blending in stirred tanks. ind. eng. chem. res., 1999, vol. 38, p. 3131–3143.
[4] mavros, p., xuereb, c., fořt, i., bertrand, j.: investigation of flow patterns in continuous-flow stirred vessel by laser doppler velocimetry. can. j. chem. eng., 2002, in press.
[5] wu, j., zhu, y., pullum, l.: the effect of impeller pumping and fluid rheology on solid suspension in a stirred vessel. can. j. eng., 2001, vol. 79, p. 177–186.
[6] kresta, s., wood, p.: the flow field produced by a 45° pitched blade turbine: changes in the circulation pattern due to off bottom clearance. can. j. chem. eng., 1993, vol. 71, p. 42–53.
[7] fořt, i., medek, j.: hydraulic and energetic efficiency of impellers with inclined blades. proceedings of the 6th european conference on mixing, pavia (italy), may 1988, p. 51–56.
[8] medek, j.: power characteristics of agitators with flat inclined blades. int. eng. chem., 1980, vol. 20, no. 4, p. 664–672.
[9] deeply dished bottoms for pressure vessels. czech standard čsn 425815, prague 1980.

doc. ing. ivan fořt, drsc., phone: +420 224 352 713, fax: +420 224 310 292, e-mail: fort@fsid.cvut.cz; ing. tomáš jirout; prof. ing. františek rieger, drsc. – dept. of process engineering, czech technical university in prague, faculty of mechanical engineering, technická 4, 166 07 praha 6, czech republic. prof. dr.-ing. reinhard sperling, e-mail: reinhard.sperling@lbv.hs-anhalt.de; ing. solomon jambere – dept. of chemical engineering, anhalt university of applied sciences, hochschule anhalt (fh), bernburger str. 52–57, 063 66 koethen, germany.

1 introduction
these days there is no generally accepted calculation for the design of a thin-rimmed, asymmetrically loaded hub with gearing which would take into account both the impact of the notch of the root of the tooth and the impact of the notch of the keyway. a designer therefore prefers a thicker hub wall in the critical places, which means overdimensioning. growing competition and increasing costs demand measures to optimise constructional parts, and there are still reserves in the hub with the keyway. a thin-rimmed hub with gearing is used in many driving machine sets as a pinion in a transmission, in couplers and as a part of braking mechanisms. the shaft-key-hub system is used very often because of its simple assembly and disassembly. during the testing of toothed wheels it was discovered that crack initiation frequently occurs in the area of the keyway (fig. 1). the cause lies in the notch stresses produced in the root of the tooth and in the keyway.

2 dimension assignation of the hub with gearing
geometrical dimensions of the shaft-key-hub system are standardized in din 6885 [2] and din 6892 [3].
fig. 2 shows the designations valid according to these standards as well as the fundamental geometrical dimensions for a description of the gearing, which are standardized according to din 3990 [1]. an important parameter for the analysis is the minimum wall thickness sk. it is defined as the shortest distance between the fillet of the keyway r2 and the dedendum circle df of the gearing, calculated by the equation [4]

$$ s_k = \frac{d_f}{2} - \sqrt{\left(\frac{d_w}{2} + t_2 - r_2\right)^2 + \left(\frac{b}{2} - r_2\right)^2} - r_2 . \qquad (1) $$

for a description of the position of the loaded root of the tooth towards the keyway of the hub, an angle φ was defined. the angle φ = 0° means that the tooth z4 lies symmetrically over the keyway of the hub. for each of the investigations a form-a key (rounded ends) was used. either the key and the hub were flush, or the load bearing length of the key corresponded to the hub width. the first case occurs often in practice; the second is important for a comparison of the results of the 2d-fe-analysis with experimentally acquired results.

3 design of the fe-model, boundary conditions
geometric dimensions used in the numerical calculations are introduced in table 1. for a complete description of all aspects of the investigated parameters, a 3d-fe-model of the shaft-key-hub system should be designed. since an evaluation of the 3d-model is difficult and time demanding, most of the calculations were carried out with a 2d-fe-model (plane stress). a sufficient agreement of the results of the 2d- and 3d-models will be shown on one geometrical variant.

numerical simulation of stresses in thin-rimmed spur gears with keyway
b. brůžek, e. leidich
this paper contains an investigation of the influence of the key on the stress distribution in a thin-rimmed spur gear. a stress analysis was carried out by means of the finite element method (fem). the 2d-fe analysis has helped to find the influence of turning the gearing towards the keyway on the stress in the loaded root of the tooth and in the keyway. 2d and 3d numerical analysis has been used to find the mutual influence of every single notch (root of tooth and keyway), the influence of the thickness of the hub, the length of the key and the form of loading. verification has been carried out through an experimental method.
keywords: gear, keyway, rim thickness.

fig. 1: pinion with shaft and key (f – load; the course of a crack is indicated)

for an exact description of the investigated variant it is necessary to model all components which are in mutual contact (shaft, key and hub). to reduce the computational time, only eight teeth along the circumference in the region of the keyway and three teeth lying opposite were modelled (the latter as "non-attenuated" teeth, i.e. teeth that aren't influenced by the keyway). the other teeth have no influence on the investigated problem. on the other hand, it would be insufficient to model only a loaded tooth, since the adjacent teeth influence the stiffness of the researched zone.
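to connect eq. (1) with the calculation variants listed in table 1 below, here is a short added sketch evaluating the reconstructed formula. the dedendum diameters are derived assuming a standard full-depth profile (df = d − 2.5·m), and the hub keyway depth t2 = 3.3 mm of a din 6885 12 × 8 key is assumed; both are assumptions of this illustration, not data from the paper. the resulting ratios sk/m come out close to the tabulated values of roughly 1.27·m, 1.16·m and 1.26·m, with small deviations attributable to rounded standard dimensions.

```python
# illustrative sketch: minimum wall thickness per eq. (1); t2 and d_f are
# assumptions of this example (see the lead-in above).
import math

def s_k(d_f, d_w, t2, b, r2):
    y = d_w / 2 + t2 - r2          # radial position of the fillet centre
    x = b / 2 - r2                 # tangential position of the fillet centre
    return d_f / 2 - math.hypot(x, y) - r2

d_w, b, r2, t2 = 40.0, 12.0, 0.25, 3.3   # [mm]; t2 assumed per din 6885
for m, z in ((2.0, 29), (2.5, 24), (4.0, 17)):
    d_f = m * z - 2.5 * m                # assumed: d_f = d - 2.5 * m
    sk = s_k(d_f, d_w, t2, b, r2)
    print(f"m = {m} mm: s_k = {sk:.2f} mm = {sk / m:.2f} * m")
```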
fig. 2: geometric dimensions of the shaft-key-hub system (left) and the fundamental geometrical dimensions of the gearing (right):
dw – diameter of the shaft, h – height of the key, b – width of the key, t1 – depth of the keyway in the shaft, t1tr – load bearing depth of the keyway in the shaft, t2 – depth of the keyway in the hub, t2tr – load bearing depth of the keyway in the hub, s1 – chamfer or radius of the keyway in the shaft, s2 – chamfer or radius of the keyway in the hub, r – chamfer or radius of the key, r2 – fillet of the keyway, d – pitch circle, df – dedendum circle, da – addendum circle, db – base circle, ρ – radius of the root of the tooth, sk – minimum thickness of the hub, φ – position of tooth z4 relative to the middle of the keyway

table 1: geometric dimensions

calculation variant                      | 1        | 2        | 3
module m                                 | 2 mm     | 2.5 mm   | 4 mm
number of teeth z                        | 29       | 24       | 17
pitch circle d                           | 58.00 mm | 60.00 mm | 68.00 mm
minimum thickness of the hub sk (eq. 1)  | 1.27 × m | 1.16 × m | 1.26 × m
tooth root radius of the reference profile ρfp | 0.75 mm | 0.94 mm | 1.50 mm
common to all variants: diameter of the shaft dw = 40 mm, width of the hub bn = 40 mm, cross section of the key din 6885 – a – 12 × 8 mm × mm, fillet of the keyway r2 = 0.25 mm

the 2d-model and the boundary conditions are displayed in fig. 3.

4 2d-fe-model
at first, the most loaded geometry was detected. the maximum stress was calculated in the root of the tooth and in the keyway for various geometrical variants. next, the possibility of a mutual interaction of the two opposite notches (root of tooth and keyway) was investigated. fig. 4 (left) shows the influence of the angle φ on the maximum loading in the root of the tooth (σ1max), always for the loaded tooth and the module 2 mm, i.e. the influence of the position of tooth z4 relative to the symmetry plane (middle of the keyway). resulting from the same calculations, the corresponding stress in the keyway is displayed (fig. 4 – right). it is evident from the course of the stress in the root of the tooth that the maximum stress occurs at the angle φ = 2.5° and for loading of the tooth z3. on further turning and consecutive loading of the tooth z4, the stress in the root of the tooth decreases again and goes even under the basic stress [1]. the angle φ has only a small influence upon the place of the maximum stress in the root of the tooth. however, the situation is different for the loading of the keyway: during the sequential loading of the teeth with the normal force, two maximums of stress arise at different places, in dependence on the angle φ. fig. 4 also shows that the stresses in the root of the tooth are always higher. from this it follows that the root of the tooth is always the critical location, and not the keyway. it is necessary to confirm this assumption by experiment; further calculations will therefore focus mainly on the root of the tooth. it is possible to explain the stress distribution both in the root of the tooth and in the keyway from the deformation of a hub with eight teeth in the region of the keyway under a rotating normal force load. the deformation is displayed in fig. 5 for the loading of the third, fourth, fifth and eighth tooth.

fig. 3: 2d-fe-model of the shaft-key-hub system with boundary conditions (ux = uy = 0 for the shaft); fn – normal force (example for tooth z3)
fig. 4: change of the maximum first principal stress in the loaded root of the tooth (left) and in the keyway (right) for a rotating loaded tooth and various angles φ, for the hub with module 2 mm; σf0 – characteristic value of stress

it is obvious that the maximum stress in the root of the tooth is always at the loaded tooth. however, the situation in the region of the keyway is quite different: the location of the maximum stress varies. the angle between the bottom and the side of the keyway is smaller than 90° for the loading of the teeth z3 and z4, and the maximum stress lies on the bottom of the keyway. for the loading of the tooth z5, the angle between the bottom and the side of the keyway is bigger than 90° and the maximum stress is in a fillet of the keyway. for the loading of the tooth z8, the angle between the bottom and the side of the keyway stays just about constant and there is no significant maximum stress in the region of the keyway.

fig. 5: deformation of the hub with teeth in the region of the keyway. a) loaded tooth z3: δl < 90°, δr < 90° (compression); b) loaded tooth z4: δl < 90°, δr < 90° (compression); c) loaded tooth z5: δl > 90°, δr > 90° (extension); d) loaded tooth z8: δl ≈ 90°, δr ≈ 90°

5 3d-fe-model
the 3d-calculations were carried out only for the critical case, i.e. for φ = 2.5° with loading of tooth z3. to save computational time, the number of elements in the unloaded zones was further reduced in comparison with the 2d-model. the basic boundary conditions for the 3d-model are shown in fig. 6. for the double-sided torque distribution, both ends of the shaft are steadily supported. for the one-sided torque distribution, the left end of the shaft is steadily supported and the right end of the shaft is supported only in the radial direction.

fig. 6: boundary conditions for the 3d-model (double-sided: ur = uφ = uz = 0 at both shaft ends; one-sided: ur = uφ = uz = 0 at the left end, ur = 0 at the right end); φ = 2.5°, module 2 mm; fn – normal force (tooth z3)

6 double-sided torque distribution
fig. 7 serves to compare the 2d and 3d calculations. in the picture it is evident that the calculated stresses for the 2d and 3d-model are almost identical. therefore, it is possible to use the 2d-model for symmetric boundary conditions (double-sided torque distribution). the first principal stress was evaluated in the loaded area of the root of the tooth, that means between the tooth z2 and the tooth z3, and in the keyway. fig. 8 shows the evaluated areas and the stress distribution for the double-sided torque distribution.

7 one-sided torque distribution
in fig. 9, an analogous evaluation of the stress in the loaded root of the tooth and in the keyway is displayed for the one-sided torque distribution. it is possible to identify only a small influence of the asymmetrical loading. the result is that the difference between the double-sided and the one-sided distribution of the torque is negligible. this presumption is confirmed by the experimental analysis.
fig. 7: comparison of the results of the 2d and 3d-model for module 2 mm, φ = 2.5° and load on tooth z3. left: stress distribution in the loaded tooth radius; right: stress distribution in the keyway; σf0 – characteristic value of stress
fig. 8: stress distribution in the loaded root of the tooth (left) and in the keyway (right) for the double-sided torque distribution; φ = 2.5°; module 2 mm; normal force on tooth z3; ξ, η: see fig. 7

8 length of the key
in practice, the variant with the hub and key flush often occurs. the boundary conditions and force application are the same as for an overhanging key, as displayed in fig. 6, and the evaluation is analogous as well. the first principal stress was evaluated in the loaded root of the tooth and in the keyway; the stress distributions are shown in fig. 10. the influence of the length of the key is obvious from the stress course in the keyway: the stresses in the area of the keyway decrease at both ends of the hub, however, they are greater than for the overhanging key due to the shorter supporting length of the key. on the other hand, the stress distribution in the root of the tooth (fig. 10 – left) is similar to that of the overhanging key (fig. 8 – left). since the first principal stress in the root of the tooth is greater than in the keyway, the length of the key has no influence on the crack initiation for the hub with module 2 mm; the crack initiation is always in the root of the tooth. this assumption was also successfully verified by the experimental method.

fig. 9: stress distribution in the loaded root of the tooth (left) and in the keyway (right) for the one-sided torque distribution; φ = 2.5°; module 2 mm; normal force on tooth z3; ξ, η: see fig. 7
fig. 10: stress distribution in the loaded root of the tooth (left) and in the keyway (right) for the double-sided torque distribution; φ = 2.5°; module 2 mm; normal force on tooth z3; hub and key flushed

9 numerically investigated influences
the influence of the keyway on the stress increase in the root of the tooth is mainly due to the small thickness of the hub sk (see fig. 2). the stress distribution was evaluated in the root of the tooth and in the keyway for various thicknesses of the hub sk, always for the critical geometry of the hub with module 2 mm. fig. 11 shows exemplary maximum values of the first principal stress in the root of the tooth and in the keyway depending on the dimensionless thickness of the hub. from the picture it is evident that, above a certain hub thickness, the keyway has no influence on the stress in the gearing.
for the assessment of the strength behaviour of thin-rimmed spur gears with a keyway it is important to consider the interaction of both notches (root of the tooth and fillet of the keyway), which lie close together. from the calculations in [5] it follows that for various fillets of the keyway the stress distribution in the root of the tooth stays identical; likewise, for various radii of the tooth root the stress distribution in the keyway stays similar. that means that, for the minimum thicknesses of the hub used here (see table 1), the two notches do not influence each other. beyond the calculations for the hub with module 2 mm, the modules 2.5 mm and 4 mm were also considered. for each module, the influence of the angle φ on the maximum loading in the root of the tooth and in the keyway was investigated. fig. 12 represents the maximum values of the first principal stress depending on the module, always for the critical angle φ, in the loaded root of the tooth and in the keyway for a constant minimum thickness of the hub sk. from the picture it is evident that up to a certain module the maximum first principal stress in the root of the tooth is bigger than in the keyway, while for greater modules the maximum loaded place moves towards the keyway. depending on the stress gradient in both notches, the initiation of the crack moves accordingly from the root of the tooth towards the keyway.

10 conclusion
in this paper the stress distribution in thin-rimmed spur gears with a keyway is investigated. the results show that not only the hub thickness between the root of the tooth and the keyway has a big influence on the stress increase, but also the position of the gearing towards the keyway. on the other hand, the form of the loading (one-sided and double-sided torque distribution) and the length of the key (hub and key flushed, and overhanging key) have only a small influence. from the calculations it is also evident that for a small module the maximum first principal stress is bigger in the root of the tooth than in the keyway, while for greater modules the maximum loaded place moves towards the keyway. verification of the numerical calculations has been carried out through the experimental method.

references
[1] din 3990, tragfähigkeitsberechnung von stirnrädern. beuth verlag, 1987.
[2] din 6885, mitnehmerverbindungen ohne anzug; passfedern, nuten, hohe form; abmessungen und anwendung. beuth verlag, 1968.
[3] din 6892, mitnehmerverbindungen ohne anzug – passfedern berechnung und gestaltung. beuth, 1998.
[4] floer, m.: beanspruchungsanalyse an torsionsbelasteten passfedernabe. aachen: shaker, 2000.
[5] leidich, e., brůžek, b.: verzahnte dünnwandige naben mit passfederverbindung. aif abschlussbericht, 2002.

dipl. ing. bohumil brůžek, phone: +493 715 314 572, fax: +493 715 314 560, e-mail: bohumil.bruzek@mb.tu-chemnitz.de; prof. dr. ing. erhard leidich, phone: +493 715 314 660, fax: +493 715 314 560, e-mail: erhard.leidich@mb.tu-chemnitz.de – department of engineering design, chemnitz university of technology, reichenhainer straße 70, 09126 chemnitz, germany

fig. 11: maximum first principal stress in the loaded root of the tooth and in the keyway depending on the dimensionless thickness of the hub with module 2 mm; σf0 – characteristic value of stress
fig. 12: maximum first principal stress in the loaded root of the tooth and in the keyway depending on the module, always for the critical position of the keyway; σf0 – characteristic value of stress

model-based security analysis of fpga designs through reinforcement learning
michael vetter
university of west bohemia, faculty of applied sciences, department of computer science and engineering, technická 8, 301 00 plzeň, czech republic
correspondence: michaelvetter@protonmail.com
acta polytechnica 59(5):518–526, 2019. doi:10.14311/ap.2019.59.0518

abstract. finding potential security weaknesses in any complex it system is an important and often challenging task best started in the early stages of the development process. we present a method that transforms this task for fpga designs into a reinforcement learning (rl) problem. this paper introduces a method to generate a markov decision process based rl model from a formal, high-level system description (formulated in the domain-specific language) of the system under review and different, quantified assumptions about the system's security. probabilistic transitions and the reward function can be used to model the varying resilience of different elements against attacks and the capabilities of an attacker. this information is then used to determine a plausible data exfiltration strategy. an example with multiple scenarios illustrates the workflow. a discussion of supplementary techniques like hierarchical learning and deep neural networks concludes this paper.
keywords: fpga, it security, model-driven design, reinforcement learning, machine learning.

1. introduction
securing any (non-trivial) computer system against malicious actors is a challenging task. the defender has to identify and protect all possible paths of entry, while an attacker has to find only one feasible way to breach the system's security properties. as a system matures, new threats can arise and the strength of defence mechanisms may erode. current threat modelling approaches are often informal, provide little automation and have a tendency to neglect the defence-in-depth aspect of a design by focusing on the attack surface. attack trees [1] are a common method to model an attacker's options, but their creation is often tedious and they provide little information about the most efficient way to breach a system. we address this problem for fpga [2] based designs through a combination of a text-based domain specific language (dsl) for the design description and a quantified assessment of the security properties. this description is automatically transformed into a processable markov decision process (mdp). an agent is trained on this mdp to exfiltrate all data stored within an fpga design using the most efficient sequence of predefined actions. this task is performed by well-established reinforcement learning (rl) [3] [4] algorithms. the result of this analysis is one or more attack sequences that provide the user with insights about an attacker's strategy under the described circumstances. we illustrate the feasibility of this approach through a series of experiments, assess its constraints and conclude with a discussion about our model's limits and possible methods to lift them.

2. related work
the approach described in this work operates on the intersection between it security, model-driven development (mdd) and machine learning.
model-driven development [5] is a common technique to master the complexity of modern systems. general purpose languages like sysml [6] and aadl [7] flatten the learning curve, advance best practices and prevent vendor lock-in but, to the author's knowledge, no standardized support for a security analysis exists in either of these languages. proposals like [8] for aadl have been made but not widely adopted. threat modelling [9] is often performed informally, using tools as simple as paper, whiteboards or general purpose diagram software. dedicated threat modelling tools, including the microsoft threat modelling tool [10] and owasp threat dragon [11], provide a suitable structure and user interface but little automation. machine learning has seen a boom in recent years, mainly powered by a trifecta of big data, parallel data processing power provided by gpus (graphics processing units) and new ml architectures that utilize both. [12] discusses the application of machine learning for malware detection from a practitioner's perspective. [13] presents both a comprehensive review of the scientific literature and a number of research papers, with applications ranging from the detection of malware to automatically generated penetration test plans. machine learning has also been used to generate malicious inputs with a complexity beyond conventional fuzzing [14] methods. [15] presents a generative adversarial network (gan) [16] based attack against a fingerprint scanner. in [17], a similar approach is used against the computer vision system of an electric vehicle. in [18], the authors use markov decision processes to craft stealthy attack sequences against a cyber-physical system. mdps can also be used for defensive purposes, e.g. to detect attacks [19]. the cyber grand challenge 2016 [20] demonstrated that modern computer programs are, in principle, able to detect, exploit and patch previously unknown vulnerabilities in other cyber systems without a human intervention. it is generally assumed that most research in this field is done by military contractors and intelligence agencies and is therefore secret [21]. advances in other, more visible, domains like games can provide us with valuable insights about the capabilities and limits of current machine learning systems. milestones for this progress include chess with deep blue [22], jeopardy with watson [23], pacman [24] and diverse atari 2600 arcade games [25], the board game go [26], as well as the real-time strategy game starcraft ii [27] and quake iii [28]. unresolved, but intensely researched, problems like autonomous driving indicate the limits of the current technology when the problem space becomes too large. advances in other fields of statistics have provided us with methods to determine causal effects [29][30] in the absence of randomized experiments and to explore counterfactual scenarios. progress in probabilistic programming [31] [32] has eased the usage of bayesian methods, e.g. to incorporate knowledge about a system as informed priors. we can conclude that the advancement of machine learning in recent years can and has been used to improve both attacks and defences of existing cyber-physical systems. the machine learning tools available today must be tailored to the application at hand and used with a careful knowledge of their limits.
to the author's knowledge, no system exists that uses reinforcement learning to determine a system's vulnerabilities in the analysis and design phase.

3. the descriptive model of the fpga design
it is common wisdom that security problems are best solved at the early stages of the development process, ideally before any hardware is built or code written. this approach requires a suitable model of the system to be created. the fpgasecml metamodel (figure 1) defines several components that can be instantiated and parameterized for this purpose. fpgamodules represent the computational parts of the design. they come in two different manifestations: processing blocks store and process data, while io blocks provide an additional connection to the outside world. fpgamodules utilize slots to perform their tasks. slots represent disjunctive sets of fpga primitives like configurable logic blocks and embedded ram. each slot can be used by only one fpgamodule at a time, and each fpgamodule must use the same predefined slot every time. communication networks represent the communication infrastructure between the slots. they are directed graphs with the slots as nodes and the possible data flow between them represented by the edges. each change in the slot utilization (partial runtime reconfiguration) is represented by a reconfiguration event. each reconfiguration event has a set of fpgamodules that replace the fpgamodules in their respective slots. the only mandatory event, evinit, marks the initial configuration of the fpga and must provide one fpgamodule for each slot.

4. the mdp-representation of the fpga design
our model-to-model generator translates each valid fpgasecml description into a computable markov decision process model. this section presents a short introduction to mdps in general and the simplifications made to create a suitable representation of the problem at hand. we further discuss the different components (state, action, reward and terminal states) of our model (figure 2) and its implementation.

4.1. scope and constraints of the model
we assume that the attacker has only one goal: the exfiltration of all data stored within the fpga. we further assume that the attacker has full access to all peripheral devices, including those containing the fpga's configuration. each action in the mdp-model represents a whole class of attacks; many (low-level) operations might be necessary to execute each of these actions in a real-world case. the model does not support side effects of attacks, like a damaged configuration. we further grant the attacker total knowledge of the system's state, and this state depends solely on the actions of the attacker. methods to address these constraints are presented later.

4.2. generic components of markov decision processes
each state of a markov decision process (mdp) contains all the information necessary to determine the next state as well as the agent's reward after an action is taken. each mdp has one start state and at least one terminal state. the agent can choose from a finite set of actions to change the state of the system. executing an action results in the transition from the state s to a state s′. all actions are atomic and require the same amount of time. the probability pa(s,s′) specifies the likelihood that an action a in the state s leads to a transition into the state s′. the reward ra(s,s′) informs the agent which transitions are desirable and which are not. the agent receives its reward immediately after the transition into s′. rewards can also be negative (punishments) or zero. γ (gamma) is a discount factor for future rewards (we use the traditional value of γ = 0.7).
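for reference, the tabular q-learning rule used in the example of section 5 updates its state-action value estimates after each observed transition as follows (α being the learning rate, one of the hyperparameters a scenario can vary, and γ = 0.7 as stated above):

$$ Q(s,a) \leftarrow Q(s,a) + \alpha \left[ r + \gamma \max_{a'} Q(s',a') - Q(s,a) \right] $$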
rewards can also be negative (punishments) or zero. γ or gamma is a discount factor for future rewards (we use the traditional value of γ = 0.7). 519 michael vetter acta polytechnica figure 1. fpga primitives (bottom layer) and their abstract representation in fpgasecml (top layer). figure 2. transformation flow from the fpgasecml to the burlap based reinforcement learning model. 4.3. the domain-specific markov decision process this section describes how the fpgasecml metamodel is translated into the required mdp components. 4.3.1. state the state contains the current configuration of the fpga, which fpgamodule is under the control of the attacker and which slot contains data already exfiltrated from another fpgamodule (chain exfiltration). the state further encodes which fpgamodules storage has been tampered with and which data has already been leaked into the outside world. all the information are stored in integer or boolean variables. the fpga starts in its initial configuration (as defined by evinit) and all fpgamodules untampered with. terminal states are, by default, all states where the agent has exfiltrated the data of each fpgamodule into the outside world. the code generator does not validate whether the agent can reach a terminal state. 4.3.2. actions the agent can invoke one of the (pre-defined) actions to get and extend control over the system. possible actions include the reconfiguration of the system, tampering with a distinct fpgamodule (inside the fpga or in its storage element) and the extraction of data from a tampered-with fpgamodule. the attacker can further transfer these extracted data to other slots or the outside world (if the data is in an ioblock that the attacker has already tampered with.) the mdp moves into the new state if the action is successful and remains in the current state ( s′ = s) otherwise. 4.3.3. reward the agent receives a positive reward for reaching a terminal state and either none or a negative reward 520 vol. 59 no. 5/2019 security analysis of fpgas designs through reinforcement learning (punishment) for every other step. this penalty encourages the agent to use shorter sequences of low-cost actions and constraints the search space. it can be the same for all actions but it is also possible to penalize the usage of distinct actions. 4.4. reinforcement learning scenarios multiple experiments with different parameters must be performed for a meaningful analysis. each scenario is a combination of mdp parameters like the reward, the success likelihood and costs of actions as well as a hyperparameter like the learning rate. fpgasecml allows the definition of scenarios that allow the independent definition and evaluation of these parameters. all scenarios share a set of common parameter like γ and the exploration-exploitation tradeoff � set. the example in the next section illustrates the application of scenarios. 4.5. implementation our proof of concept implementation transforms, as mentioned earlier, any valid fpgasecml-model into java code. this generated code is meant to be linked against the brown-umbc reinforcement learning and planning (burlap) library [33], a java framework that provides a generic reinforcement learning functionality. where the algorithmic flexibility provided by burlap is not required, a reimplementation in e.g., c++,rust or go should lead to a better performance. 
the memory consumption could be reduced through an optimized state encoding, at the expense of readability and extensibility; the implementation used here is optimized for a simple code generation process.

5. example
an fpga design consists of three slots with two fpgamodules each (figure 3). fpgamodule a is the only io block. the run ends when the data of all six modules (a-f) has been exfiltrated. the q-learning algorithm is used to find the best sequence of actions. the heap memory of the java virtual machine (jvm) had to be extended to up to 6 gb to accommodate the qtable as well as the telemetric data. each scenario is executed once for each epsilon (0.1, 0.05 and 0.25 in our example). each run consists of 3 million episodes and, in each episode, the agent starts at the initial state and ends in a terminal state. histograms (truncated to the right) are used to compare the agent's learning performance for the different epsilons and scenarios. the fpgasecml-model, as well as the generated java code and reports for this example, can be found in the accompanying files.

5.1. scenario 1: baseline
scenario 1 establishes a baseline of the model through a deterministic mdp (the success rate of every action is 1) and a fixed reward/punishment assignment. the agent gets a reward of 200,000 for reaching a terminal state and a punishment of -1,000 for every other action. all three runs return a minimal solution with 24 actions; the histogram (figure 5) of all three runs shows that the epsilon of 0.1 has the most episodes with the minimal number of steps, followed by 0.05 (a low exploration rate requires more steps to find a solution) and 0.25 (too much exploration hinders the efficient exploitation of the available data). the learning process starts with episodes of well over 2,000 steps (figure 4) and provides better results as the qtable gets filled. for the epsilon of 0.1, the first minimal solution is found after just 303,323 episodes, indicating that the agent has either found the best possible solution or is stuck in a local minimum with little chance of breaking out. it is also notable that all three solutions make extensive use of the tamper-bitstream action. but what if this action is unlikely to succeed or prohibitively expensive?

5.2. scenario 2: tamper-resistant storage prevents most attacks
the second scenario assumes that only one in a thousand attacks against the tamper-resistant bitstream storage is successful (relative to all other attacks, as success factors are normalized). the three runs return a minimal sequence of 25 actions each, and all three of them avoid the tamper-bitstream actions completely. the histogram (figure 5) shows a smaller spread than that of the first scenario, and the median of the episodes for each run has shrunk from (29, 39, 44) to (28, 26, 35).

5.3. scenario 3: faulty encryption eases attacks
we assume that a weaker tamper protection of the bitstream storage and a weakness in the bitstream encryption mechanism increase the success rate of tamper-bitstream actions. the minimal sequences returned require 27 steps for the epsilon of 0.1, 26 for 0.05 and 30 for 0.25, a significantly worse performance than in the previous scenarios. the histogram of all three runs also skews much more to the right than those of the preceding scenarios. worse, the three attack sequences each have 4 of the 6 bitstreams tampered with. the success rate of one in a thousand was, apparently, low enough to discourage the agent from using these actions, while the "one in five" chance is too weak to achieve a similar effect. the low success rate is, however, strong enough to prolong the learning process, as the median number of steps per episode rises from the previous (28, 26, 35) to (154, 155, 152). the learning progress (figure 7) is much noisier than it was in the baseline scenario (figure 4).

figure 3. schematic of the fpgasecml to burlap example.
figure 4. number of actions necessary to reach a terminal state for each episode in the baseline scenario (the line at the bottom represents the running average).
figure 5. histogram of multiple runs with different epsilons for scenario 1 (top) and 2 (bottom).
figure 6. histogram of multiple runs with different epsilons for scenario 3 (top) and 4 (bottom).
figure 7. training progress for scenario 3, faulty encryption eases attacks.

5.4. scenario 4: physical attacks against storage devices are expensive
not all attacks require the same amount of resources: some take longer than others, a few require costly equipment and highly specialized threat agents. reducing the success rate of an attack makes it implicitly more expensive (as, on average, more attacks must be performed before succeeding), but with several million trials per run there is a good chance that at least one sequence of highly improbable actions will succeed at least once. a real attacker might not want to take this risk.
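a one-line way to see this "implicit cost" argument (an added illustration with assumed numbers): with a success probability p per attempt, the expected number of attempts until success is 1/p, so an action with a nominal cost c carries an expected cost of c/p.

```python
# illustrative sketch: expected cost of an action under a per-attempt
# success probability p (geometric distribution, mean 1/p); numbers assumed.
def expected_cost(cost_per_attempt: float, p_success: float) -> float:
    return cost_per_attempt / p_success

for p in (1.0, 0.2, 0.001):     # cf. the success rates of scenarios 1-3
    print(f"p = {p:<5} -> expected cost = {expected_cost(1_000, p):,.0f}")
```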
the success rate of one in a thousand was, apparently, low enough to discourage the agent from using these actions, while the "one in five" chance is too weak to achieve a similar effect. the low success rate is, however, strong enough to prolong the learning process, as the median number of steps per episode rises from the previous (28, 26, 35) to (154, 155, 152). the learning progress (figure 7) is much noisier than it was in the baseline scenario (figure 4).

5.4. scenario 4: physical attacks against storage devices are expensive

not all attacks require the same amount of resources: some take longer than others, a few require costly equipment and highly specialized threat agents. reducing the success rate of an attack makes it implicitly more expensive (as, on average, more attacks must be performed before succeeding), but with several million trials per run, there is a good chance that at least one sequence of highly improbable actions will succeed at least once. a real attacker might not want to take this risk. we convert our mdp back into a deterministic mdp and assign the tamper-bitstream action an additional cost weight of 100,000 (this cost is subtracted from the default reward of -1,000). the minimal sequences for the epsilons 0.1 and 0.25 are 24 actions, but the low exploration rate of 0.05 has led to a new minimum of 23. it is also notable that this minimum is not reached until the 2,543,660th episode and that the corresponding agent tampers with two bitstream storages, while the other two runs use this expensive operation only once. imposing even higher costs (bitstream tampering is extremely expensive) on the tamper-bitstream operation had no positive effect, with all three runs returning a sequence of length 24.

figure 3. schematic of the fpgasecml to burlap example.

figure 4. number of actions necessary to reach a terminal state for each episode in the baseline scenario (the line at the bottom represents the running average).

figure 5. histograms of multiple runs with different epsilons for scenarios 1 (top) and 2 (bottom).

figure 6. histograms of multiple runs with different epsilons for scenarios 3 (top) and 4 (bottom).

figure 7. training progress for scenario 3 (faulty encryption eases attacks).

5.5. results

we can conclude that our approach found a reasonable solution to this problem within a feasible amount of computation. the noisy progress of the rl algorithm makes it impossible to determine whether a local or a global minimum has been found. scenarios can be used to assess the impact of high costs and low success rates and to find better solutions by constraining the size of the search space. a plausibility check of the results (here performed on the generated sequences) is mandatory, as for all machine learning methods. the number of training episodes required for this small example indicates challenges for intricate designs and for models with a richer state and action representation. the memory consumption of the program, presumably driven by the expanding qtable, supports this assumption. finding the appropriate cost/reward structure remains a challenge, and any analysis should include multiple scenarios with different cost and success rate values.
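the analysis therefore amounts to a sweep over a small grid of scenario parameters and epsilons; a hypothetical sketch of that grid (success rates and the cost weight taken from the scenarios above; the helper calls are merely indicated):

```python
# hypothetical sketch of the scenario mechanism from section 4.4: each scenario
# bundles mdp parameters and is evaluated once per epsilon.
scenarios = {
    "baseline":          {"p_tamper_bitstream": 1.0,   "cost_tamper_bitstream": 0},
    "tamper_resistant":  {"p_tamper_bitstream": 0.001, "cost_tamper_bitstream": 0},
    "faulty_encryption": {"p_tamper_bitstream": 0.2,   "cost_tamper_bitstream": 0},
    "expensive_attack":  {"p_tamper_bitstream": 1.0,   "cost_tamper_bitstream": 100_000},
}
epsilons = (0.1, 0.05, 0.25)

for name, params in scenarios.items():
    for eps in epsilons:
        # build_env(**params) and q_learning(env, epsilon=eps) would be the
        # hypothetical helpers from the earlier sketches; here we only
        # enumerate the experiment grid.
        print(name, eps, params)
```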
6. limits and restrictions of the mdp-based approach and possible solutions

the mdp-based model presented here relies on certain constraints that could be lifted by supplementing the proposed model. hierarchical learning [34] can be used to combine this high-level model with the low-level activities needed to execute the strategy. it is also plausible that the attacker does not know the state of the system but has to guess it from a limited set of presumably noisy indicators (like physical side channels [35, 36]). the reinforcement learning equivalent of this setting is a partially observable mdp (pomdp [37]; burlap provides limited support for pomdps). transforming the mdp into a stochastic game allows the integration of multiple actors (e.g. an active defence mechanism).

navigating the vast search space remains the main problem of reinforcement learning. limiting the content of the state and the number of actions eases this problem but restricts the expressiveness of the model. constraining the search space through transition probabilities and costs is another method, but it increases the number of experiments. a function approximation of the qtable could provide an avenue towards an improved rl-based weakness analysis. one candidate for this approximation is a deep neural network, whose general feasibility has been demonstrated by the deepq [38] algorithm. this more recent technique replaces the qtable with a neural network that approximates the value of a given state-action combination. suitable neural networks mitigate the state explosion problem and should be able to detect patterns. they may, therefore, be used to extract and abstract information from a feature-rich state and action representation in the same way convolutional networks extract information, e.g., from a picture [16]. policy shaping [39] or reward learning [40] provide other avenues to explore. a sufficiently competent and autonomous reinforcement learning system could be pitted against real-life systems (preferably in a laboratory environment) for further refinement without human interference (similar to alphazero [41]). these proposals require at least a rudimentary implementation of the design, and the insight gained from attacking this system could be transferred to other systems still in the design and analysis phase.

finding the best parameters for the rewards and success rates represents an additional challenge. different domain experts may have different assessments of the threat landscape or varying confidence in their assumptions. running a large number of scenarios with different parameters also increases the chance of finding an efficient and robust solution. in later stages of the development cycle, data from penetration tests and from security incidents can be used to verify the validity of the assumptions made. to simplify the creation of scenarios, the point values for the cost and success rate could be replaced with distributions (similar to hierarchical models in probabilistic programming) from which the actual values for each run are drawn.

7. conclusion

reinforcement learning can be used to identify potential weaknesses in an fpga design and the steps a reasonable attacker may take to exploit them. we presented a method to generate a markov decision process based reinforcement learning model from a formal, high-level system description (formulated in the domain-specific language fpgasecml). the automatic model-to-model translation reduces the developer's workload, decreases the risk of errors and helps to keep both models in sync.
probabilistic transitions and the reward function can be used to model the varying resilience of different elements against attacks and the capabilities of an attacker. supplementary techniques like hierarchical learning, partially observable mdps, and stochastic games can be used to extend the scope of the model.

references

[1] b. schneier. academic: attack trees. schneier on security. https://www.schneier.com/academic/archives/1999/12/attack_trees.html.
[2] s. m. trimberger, j. j. moore. fpga security: motivations, features, and applications. proceedings of the ieee 102(8):1248-1265, 2014. doi:10.1109/jproc.2014.2331672.
[3] c. isbell, m. littman. reinforcement learning. udacity. https://classroom.udacity.com/courses/ud600.
[4] f. akhtar. practical reinforcement learning. packt publishing limited, 2017.
[5] m. brambilla, j. cabot, m. wimmer. model-driven software engineering in practice. synthesis lectures on software engineering. morgan & claypool publishers, second edition, 2017.
[6] l. delligatti. sysml distilled: a brief guide to the systems modeling language. addison-wesley professional, 2014.
[7] j. delange. aadl in practice. reblechon development co, 2017.
[8] r. ellison, a. householder, j. hudak, r. kazman, c. woody. extending aadl for security design assurance of cyber-physical systems. https://resources.sei.cmu.edu/asset_files/technicalreport/2015_005_001_449522.pdf.
[9] f. swiderski, w. snyder. threat modeling. microsoft press, redmond, wash., 2004.
[10] jegeib. getting started with the microsoft threat modeling tool. azure. https://docs.microsoft.com/en-us/azure/security/azure-security-threat-modeling-tool-getting-started.
[11] threat dragon. https://threatdragon.org/login.
[12] j. saxe, h. sanders. malware data science: attack detection and attribution. no starch press incorporated, san francisco, ca, 2018.
[13] s. parkinson, a. crampton, r. hill. guide to vulnerability analysis for computer networks and systems: an artificial intelligence approach. computer communications and networks. springer, 2018.
[14] m. sutton, a. greene, p. amini. fuzzing: brute force vulnerability discovery. addison-wesley, upper saddle river, n.j., 2007.
[15] p. bontrager, a. roy, j. togelius, et al. deepmasterprints: generating masterprints for dictionary attacks via latent variable evolution. https://arxiv.org/pdf/1705.07386.pdf.
[16] i. goodfellow, y. bengio, a. courville. deep learning. mit press, cambridge, massachusetts and london, england, 2016.
[17] tencent keen security lab. experimental security research of tesla autopilot. https://keenlab.tencent.com/en/whitepapers/experimental_security_research_of_tesla_autopilot.pdf.
[18] s. lakshminarayana, t. z. teng, d. k. y. yau, r. tan. optimal attack against cyber-physical control systems with reactive attack mitigation. proceedings of the eighth international conference on future energy systems, 2017. doi:10.1145/3077839.3077852.
[19] h. koduvely. anomaly detection through reinforcement learning. http://blog.zighra.com/anomaly-detection-and-reinforcement-learning.
[20] cyber grand challenge (cgc). https://www.darpa.mil/program/cyber-grand-challenge.
[21] mayhem, the tech behind the darpa grand challenge winner, now used by the pentagon. cyberscoop. https://www.cyberscoop.com/mayhem-darpa-cyber-grand-challenge-dod-voltron/.
[22] r. chandrasekaran. deep blue defeats kasparov in game 2. washingtonpost.com. https://www.washingtonpost.com/wp-srv/tech/analysis/kasparov/kasparov.htm.
[23] n. jones. quiz-playing computer system could revolutionize research. https://www.nature.com/news/2011/110215/full/news.2011.95.html.
[24] h. van seijen, m. fatemi, j. romoff, et al. hybrid reward architecture for reinforcement learning. corr abs/1706.04208, 2017.
[25] v. mnih, k. kavukcuoglu, d. silver, et al. playing atari with deep reinforcement learning.
[26] alphago: using machine learning to master the ancient game of go. https://blog.google/topics/machine-learning/alphago-machine-learning-game-go/.
[27] o. vinyals, i. babuschkin, j. chung, et al. alphastar: mastering the real-time strategy game starcraft ii. https://deepmind.com/blog/alphastar-mastering-real-time-strategy-game-starcraft-ii/.
[28] capture the flag: the emergence of complex cooperative agents. deepmind. https://deepmind.com/blog/capture-the-flag-science/.
[29] j. pearl. causal inference in statistics: an overview. statistics surveys 3(0):96-146, 2009. doi:10.1214/09-ss057.
[30] j. pearl, d. mackenzie. the book of why: the new science of cause and effect. first edition, 2018.
[31] c. davidson-pilon. bayesian methods for hackers: probabilistic programming and bayesian methods. addison-wesley data and analytics series. addison-wesley, new york, 2016.
[32] o. martin. bayesian analysis with python: introduction to statistical modeling and probabilistic programming using pymc3 and arviz. packt publishing ltd, birmingham, 2nd edition, 2018.
[33] burlap. http://burlap.cs.brown.edu/.
[34] r. s. sutton, d. precup, s. singh. between mdps and semi-mdps: a framework for temporal abstraction in reinforcement learning. artificial intelligence 112(1-2):181-211, 1999. doi:10.1016/s0004-3702(99)00052-1.
[35] s. mangard, e. oswald, t. popp. power analysis attacks: revealing the secrets of smart cards. springer science+business media, llc, boston, ma, 2007. doi:10.1007/978-0-387-38162-6.
[36] meltdown and spectre. https://meltdownattack.com/.
[37] planning and acting in partially observable stochastic domains. https://www.sciencedirect.com/science/article/pii/s000437029800023x.
[38] v. mnih, k. kavukcuoglu, d. silver, et al. human-level control through deep reinforcement learning. nature 518(7540):529-533, 2015. doi:10.1038/nature14236.
[39] s. griffith, k. subramanian, j. scholz, et al. policy shaping: integrating human feedback with reinforcement learning. in advances in neural information processing systems, 2013.
[40] a. gonfalonieri. inverse reinforcement learning. towards data science. https://towardsdatascience.com/inverse-reinforcement-learning-6453b7cdc90d.
[41] alphazero: shedding new light on the grand games of chess, shogi and go. deepmind. https://deepmind.com/blog/alphazero-shedding-new-light-grand-games-chess-shogi-and-go/.

non-thermal plasma ozone generation

s. pekárek

abstract: this paper reviews ozone properties, ozone applications and the mechanism of ozone production in non-thermal plasma. an analysis is made of the influence of a reduced electric field and discharge space temperature on ozone production. the phenomenon of discharge poisoning is also explained. finally, a modern ozone production system based on dielectric barrier electrical discharge is described.

keywords: ozone, non-thermal plasma, electrical discharge.

1 introduction

in 1841 christian schönbein identified the
characteristic odor of ozone in the vicinity of an electrical discharge and named it ozone after the greek word ozein, meaning "to smell". ozone was identified as triatomic oxygen in 1867. ozone is a toxic gas. its odor is detectable by most people at a level of 0.003 ppm and becomes intolerable at concentrations above 0.15 ppm. overexposure to ozone causes problems of the human respiratory system, watery eyes, coughing, a heavy chest and a stuffy nose. ozone is an unstable gas, which begins to decompose instantly. its half-life ranges from a few seconds to several hours; however, the half-life decreases significantly in the presence of moisture, increased temperature and some metals (copper, silver, iron). ozone decomposes to oxygen after being used, so no harmful byproducts result. ozone is a practically transparent gas with very strong absorption in the region of a wavelength of 245 nm. it is this property of ozone that protects life on earth from the dangerous ultraviolet radiation emitted by the sun.

ozone is a powerful disinfecting and oxidizing agent, and for this reason it is used in a wide range of applications such as the treatment of municipal and waste water, food processing, fire restoration, restoration of buildings and other objects after floods, etc. when ozone comes into contact with organic compounds or bacteria, the extra atom of oxygen destroys the contaminant by oxidation. thus ozone will neutralize virtually all organic odors, specifically those containing carbon as their base element. this includes all bacteria as well as smoke, decay and cooking odors. however, due to its oxidizing nature, ozone attacks and degrades natural latex rubber; thus, for example, a laminate silica gel should be applied to car and freezer door moldings.

ozone is usually generated in one of three ways:
• electrochemical generation: an electric current passes through a liquid electrolyte and produces a mixture of gases containing ozone.
• generation of ozone by ultraviolet rays (this process takes place in the upper layers of the atmosphere).
• generation of ozone in non-thermal plasma produced by electrical discharges.

most of the ozone for practical applications is produced in non-thermal plasma generated by electrical discharges.

2 non-thermal plasma ozone generation

plasma is a quasi-neutral mixture of neutral and charged particles, having collective properties. it can be classified from several standpoints. one of these standpoints deals with the equilibrium between the temperatures of ions and electrons. when the temperatures of electrons and ions are the same, we speak of thermal plasma. on the other hand, if the temperature of the electrons is higher than the temperature of the ions, we speak of non-thermal plasma. the key advantage of non-thermal plasma is the "directed" energy consumption. in this case, the energy delivered to the discharge is predominantly used for generating highly energetic electrons, and little energy is lost in heating the volume of the gas. consequently, technologies based on non-thermal plasma can be highly effective in promoting oxidation, enhancing molecular dissociation or producing free radicals to stimulate plasmachemical reactions, which can be used for ozone generation or for other ecological applications such as the decomposition of pollutants in air streams.

there are two main methods for producing non-thermal plasma:
• acceleration of electrons by an electric field.
• injection of electrons externally.
the first group of methods comprises different types of electrical discharges, and the second method involves the usage of an electron beam. since the processes of ozone generation are carried out at atmospheric pressure, we shall focus on atmospheric pressure electrical discharges only. the main types of electrical discharge that can be used for the above-mentioned purpose are dielectric barrier discharge and corona discharge.

dielectric barrier discharge is a general term for a discharge that uses a dielectric material as a plasma stabilizer. the main types are:
• silent discharge.
• surface discharge.

the main feature of silent discharge is the dielectric layer, which covers at least one of the electrodes, sometimes both. typically, materials with a high dielectric constant (ceramics or glass) are used. the most widely used silent discharge electrode configurations at the present time are the parallel plate type or the wire-tube type. a common parallel plate electrode configuration for silent discharge is shown in fig. 1. surface discharge uses different types of design. the most frequent one is the so-called packed bed discharge, which is shown in fig. 2. in this case, spherical pellets of dielectric material fill the discharge volume between two electrodes. if the voltage over the electrodes is increased, a weak glow appears at the surface of the pellets. the pellets enhance the electric field in the contact regions among the adjacent pellets, and this leads to microbreakdowns in the gas. the materials often used are batio3, pbtio3 and others. a high voltage (about 15 kv) of frequency ranging from 50 hz to about 10 khz is applied to generate the discharge.

corona discharge is a relatively low power electrical discharge that takes place at or near atmospheric pressure. the corona is generated by a strong electric field, and at least one of the electrodes is a thin wire, a needle, or has a sharp edge. the strong curvature is responsible for the high electric field, which is necessary for the ionization of neutral gas atoms. the other electrode is most frequently a plate or a cylinder. two ways are used to enhance the corona discharge power, and therefore to increase its current-voltage range while preventing the discharge transition to a spark:
• in the first case a pulse voltage is applied to the electrodes [1].
• the second way of discharge stabilization involves the usage of a gas flow. in this case we talk about gas flow stabilized discharge or about atmospheric pressure glow discharge [2].

there are several types of electrode configurations of gas flow stabilized discharge. one of these involves the multi-needle to plate electrode configuration with the flow of a gas medium around the needle electrodes [3].
the disadvantage of this arrangement, which is shown in the top part of fig. 3, is that to stabilize the discharge the velocity of the gas should be about 50 to 100 m/s. to relax the conditions applied to the gas velocity we have developed a new version of the discharge with a hollow needles to plate electrode configuration [4]. in this case the gas is supplied through the needles only (see the bottom part of fig. 3). the advantage of this arrangement is that all the gas medium passes through the discharge region and is therefore affected by the plasmachemical processes. there are other types of electrical discharges (microwave discharge, capillary discharge, glidarc, etc.) which can be used for ozone generation; however, their application for these purposes is not so frequent.

the advantage of electrical discharges as a source of non-thermal plasma for ozone generation is that the plasma parameters can be relatively easily controlled by changing several variables, for example:
• the operating voltage of the discharge affects the magnitude of the electric field, and hence the energy of the charged particles in the plasma.
• the power controls the number of ionizations per unit volume per second, and hence is an approximate control variable for the plasma density. it is also a possible control variable for the energy of the plasma constituents.
• the gas pressure controls the electron collision frequency and the mean free paths of all plasma constituents.
• the type of gas controls the ionization potential, and thus the energy required to produce an electron-ion pair in the plasma.
• the geometry of the electrodes can affect the energy input by altering the electric field, through changing the geometry of the anode-cathode configuration.
• finally, the cathode characteristics, such as the secondary emission coefficient or the capability of thermionic emission, can also affect the discharge characteristics.

fig. 1: silent discharge electrode configuration.
fig. 2: packed bed discharge electrode configuration.
fig. 3: different modifications of the gas flow stabilized discharge.

3 mechanism of ozone generation

there exist several models of ozone generation mechanisms, which differ mainly by the number of reactions involved [e.g. 5, 6, 7]. generally speaking, the mechanism of ozone generation by an electrical discharge in air is rather complicated, due to the presence of nitrogen in air. this presence of nitrogen causes ozone destruction processes because of the existence of nitrogen oxides. the role of the electrons produced in an electrical discharge is to excite and dissociate the oxygen and nitrogen molecules.
the initial step in the formation of o3 and nox is therefore the electron impact dissociation of molecular o2 and n2:

e + o2 → o + o + e (1)
e + n2 → n + n + e (2)

atomic nitrogen reacts with o2 and o3 to form no by the reactions

n + o2 → no + o (3)
n + o3 → no + o2 (4)

no and no2 form a cycle for ozone destruction by the reactions

o + no2 → no + o2 (5)
no + o3 → no2 + o2 (6)

no2 is consumed by the following reactions

no2 + o3 → no3 + o2 (7)
no2 + no3 → n2o5 (8)

and regenerated by the reactions

no3 + no → no2 + no2 (9)
n2o5 → no2 + no3 (10)
o + n2o5 → no2 + no2 + o2 (11)

the main ozone formation reaction, dominant at high pressures, is

o + o2 + m → o3 + m (12)

an undesirable reaction, which determines the upper limit of the o3 concentration, is

o + o + m → o2 + m (13)

in the case of a discharge in air, m represents molecular oxygen or nitrogen. there are certainly other reactions that destroy ozone. one example of a catalytic cycle destroying ozone is

no2 + o3 → no3 + o2 (14)
o + no3 → o2 + no2

this model is very simple, because it does not include reactions involving excited species, other molecules and other ozone self-destruction reactions. the reaction rates of reactions (1)-(14) depend on various parameters, for example on the electric parameters of the discharge, the temperature, etc.

3.1 effect of electric field strength

the dissociation rate coefficients of o2 and n2 by the electron impact reactions (1) and (2) depend on the energy distribution of the electrons in the discharge. these coefficients are usually treated as functions of the reduced electric field, which is defined as the ratio of the electric field strength per unit gas density (e/n). a unit of the reduced electric field that is frequently used in the physics of electrical discharges is 1 townsend (abbreviation 1 td), defined as 1 td = 10⁻¹⁷ v·cm². following [5], the optimum reduced electric field for ozone formation from air is about 200 td. the mean energy of electrons as a function of the reduced electric field [8], obtained for streamer corona induced plasma in air at a temperature of 300 k, is shown in fig. 4. as can be seen, this mean energy increases with the applied electric field, and for e/n of about 200 td it reaches 5 ev.

fig. 4: mean energy of electrons as a function of the reduced electric field.

3.2 effect of temperature

increasing gas temperature substantially reduces the ozone generation processes. the gas temperature tg, which appears in the rate coefficients of the reactions, is assumed to be the time and space averaged temperature in the discharge space. we have studied the dependence of the rotational temperature in the discharge space as a function of the discharge current [4]. it was found that as the discharge current increases, the temperature in the discharge space also increases. for this reason, the quantities presented in fig. 5 are given as a function of the discharge current. the main ozone formation reaction, dominant at atmospheric pressure, is reaction (12), for which the dependence of the reaction rate on the gas temperature tg is given by the following expression:

k12 = 2.5 × 10⁻³⁵ exp(970/tg) [cm⁶·s⁻¹]

it is seen that the reaction rate of ozone generation decreases with increasing gas temperature. thus, if the temperature in the discharge space is increased from 300 k to 400 k, the reaction rate given by the coefficient k12 decreases 2.2 times. unlike ozone, nox formation is favored by heat. thus, for example, the ozone destruction reaction (6) is significantly enhanced by an increase in the gas temperature:

k6 = 1.5 × 10⁻¹² exp(−1300/tg) [cm³·s⁻¹]
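a quick numerical check of the two rate coefficients just quoted, as reconstructed above (a sketch; the prefactors cancel in the temperature ratios):

```python
# temperature dependence of the two rate coefficients quoted above
# (as reconstructed): ozone formation k12 falls, ozone destruction k6 rises.
from math import exp

def k12(tg):  # o + o2 + m -> o3 + m, [cm^6 s^-1]
    return 2.5e-35 * exp(970.0 / tg)

def k6(tg):   # no + o3 -> no2 + o2, [cm^3 s^-1]
    return 1.5e-12 * exp(-1300.0 / tg)

print(k12(300) / k12(400))  # ~2.2x slower ozone formation at 400 k
print(k6(400) / k6(300))    # ~3x faster ozone destruction at 400 k
```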
if the temperature is increased from 300 k to 400 k, the reaction rate given by the coefficient k6 increases roughly 3 times. the dependence of the reaction rates of ozone and nitrogen oxide generation on temperature causes an unpleasant effect, known as discharge poisoning: the production of nitrogen oxides prevails over the production of ozone, so that finally no ozone is produced (see fig. 5). this figure was obtained for a single hollow needle to plate discharge, with an airflow rate through the needle of 5 slm. on the left and right vertical axes we can see the concentration of nitrogen oxides and ozone, respectively. the horizontal axis corresponds to the discharge current. increasing the discharge current causes, for the same airflow rate, an increase of the temperature in the discharge volume.

fig. 5: ozone and nitrogen oxides concentration as a function of current for a discharge with the needle negative (q = 5 slm).

4 ozone generation system

from the previous analysis it is clear that for efficient ozone production the electric discharge space must be kept both at a high pressure (atmospheric or even higher) and at a low temperature. the first ozone production installation based on the electrical discharge for the purpose of water treatment went into operation in nice in 1907. a typical present-day ozone generating system consists of two main components: an air preparation or oxygen production unit, and an ozone non-thermal plasma production unit. in commercial ozone generating systems, air is more frequently used as a feed gas than oxygen (oxygen must be produced from air, which causes additional costs). air preparation is critical for efficient ozone generation. we studied the difference between ozone generation from air supplied by a compressor and ozone generation under the same conditions from synthetic air. it was found [9] that the ozone production yield (grams of ozone produced per kwh) is higher for synthetic air. the reason is the presence of moisture and particulate matter in the air supplied by a compressor, which has a detrimental effect on ozone generation. for the supply of air, either pressure fed systems or atmospheric pressure dryers are used. pressure fed systems, i.e. systems with compressors, are more suitable than atmospheric pressure dryers, because they are able to provide (with filters, etc.) the necessary air quality in various climatic conditions. the air preparation unit also comprises rotameters, filters, regulating valves, etc.

an example of an ozone production unit based on dielectric barrier discharge is shown in fig. 6. the electrode arrangement makes use of a borosilicate glass tube 1-2 m in length and 20 to 50 mm in diameter. this tube is mounted inside a stainless steel tube to form a narrow annular discharge space. a conductive coating (for example a thin aluminium film) forms the high voltage electrode on the inside of the glass tube. the inner part of the stainless steel tube serves as the second electrode. since the efficiency of ozone generation decreases strongly with rising temperature, modern ozonizers have narrow discharge gaps ranging from 1 to 3 mm. the wall of the steel electrode is cooled by water. large ozone generating units use several hundred tubes.

fig. 6: electrode arrangement of a dielectric barrier discharge for ozone production.
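for a sense of scale, the active volume of one such tube can be estimated from the quoted geometry (a sketch with illustrative mid-range values; thin-gap approximation):

```python
# rough estimate of the annular discharge volume of one ozonizer tube
# (illustrative mid-range values from the text; thin-gap approximation).
from math import pi

d_glass = 0.035   # glass tube diameter [m], text gives 20-50 mm
gap     = 0.002   # discharge gap [m], text gives 1-3 mm
length  = 1.5     # tube length [m], text gives 1-2 m

volume = pi * d_glass * gap * length   # thin annulus: circumference * gap * length
print(volume * 1000)                   # ~0.33 litres of discharge volume per tube
```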
5 conclusions

apart from small-scale production of ozone for medical purposes and for the preparation of ultra-fine water, ozone is mainly prepared by non-thermal plasma technologies based on electrical discharges. modern industrial ozone production systems are capable of producing several hundred kilograms of ozone per hour at a power consumption of several mw. research in this field is oriented toward higher attainable ozone concentrations and lower energy consumption, i.e., toward higher ozone yields.

acknowledgement

this work was supported by research program no. j04/98: 212300016 "pollution control and monitoring of the environment" of the czech technical university in prague (sponsored by the ministry of education, youth and sports of the czech republic).

references

[1] šimek, m., člupek, m.: efficiency of ozone production by pulsed positive corona discharge in synthetic air. j. phys. d: appl. phys., 2002, vol. 35, p. 1-5.
[2] goosens, o. et al.: the dc glow discharge at atmospheric pressure. ieee trans. on plasma science, 2002, vol. 30, no. 1, p. 176-177.
[3] napartovitch, a. p., akishev, y. s.: dc glow discharge with fast gas flow for flue gas processing. in: non-thermal plasma techniques for pollution control, nato asi series, springer-verlag, 1992.
[4] pekárek, s. et al.: hollow needle to plate electrical discharge at atmospheric pressure. plasma sources sci. technol., 1999, vol. 8, p. 513-518.
[5] kogelschatz, u., eliasson, b.: ozone generation from oxygen and air: discharge physics and reaction mechanism. ozone science and engineering, 1988, vol. 10, p. 367-377.
[6] kitayama, j., kuzumoto, m.: analysis of ozone generation from air in silent discharge. j. phys. d: appl. phys., 1999, vol. 32, p. 3032-3040.
[7] mason, n. j., skalny, j. d.: experimental investigations and modelling studies of ozone producing corona discharges. czech. j. physics, 2002, vol. 52, p. 85-94.
[8] mcadams, r.: prospects of non-thermal atmospheric plasma for pollution abatement. j. phys. d: appl. phys., 2001, vol. 34, p. 2810-2821.
[9] šimek, m., člupek, m., pekárek, s.: efficiency of ozone production in non-thermal plasma electrical discharges in synthetic air - comparative study. 16th int. symp. on plasma chemistry, taormina, symp. proc. vol. ii, 2003.

prof. ing. stanislav pekárek, csc., phone: +420 224 352 333, fax: +420 224 352 331, e-mail: pekarek@feld.cvut.cz, department of physics, ctu in prague, faculty of electrical engineering, technická 2, 166 27 prague 6, czech republic

on the effects of an installed propeller slipstream on wing aerodynamic characteristics

f. m. catalano

abstract: this work presents an experimental study of the effect of an installed propeller slipstream on a wing boundary layer. the main objective was to analyse, through wind tunnel experiments, the effect of the propeller slipstream on wing boundary layer characteristics such as: laminar flow extension and transition, laminar separation bubbles and reattachment, and turbulent separation. two propeller/wing configurations were studied: pusher and tractor. experimental work was performed using two different models: a two-dimensional wing with a central cylindrical nacelle for the tractor configuration, and a simple two-dimensional wing with a downstream propeller for the pusher tests. the relative position between the propeller and the wing could be changed in the pusher model, and a total of 7 positions were analysed. for the tractor tests the relative propeller/wing position was fixed, but three different propellers (two-, three- and four-bladed) were tested. measurements included pressure distribution, hot wire anemometry and boundary layer characterisation by flow visualisation. the results showed that the pusher propeller inflow affects the wing characteristics by changing the lift and drag, and also delays boundary layer transition and separation. these effects are highly dependent on the relative position of the wing/propeller. on the other hand, the tractor propeller slipstream induces transition, and its effect is dependent on the number of blades.

keywords: wing/propeller interference, propeller slipstream, boundary layer.

1 introduction

during the 1980s a large number of works [1-4] on udf (unducted fans) or propfans brought attention back to the use of advanced propellers in transport aviation. all these works pointed to the potential benefits in fuel efficiency and t/o thrust of the new propellers.
despite the fact that attention on propfans has decreased, there is still great interest in the use of propellers in general aviation and commuter [2, 4, 8] aircraft, as well as in rpvs and unmanned aircraft [5]. for these classes of aeroplane the distance between the wing and the propeller can be close enough to induce quite large effects on the wing surfaces, especially when the propellers are operating at high thrust, as in take-off and climb. at take-off the aircraft speed is close to the stall velocity, and the whole process from rotation to climb-out involves a large range of incidence with the propeller operating all the time at maximum thrust. therefore, the effect of the propeller inflow on the wing in this situation can be of considerable magnitude [6, 9]. since the 1930s a large number of investigations have been performed on the effects of the slipstream on a wing and/or other components of the aircraft, including tractor and pusher configurations [13]. almost half of these works have been on steady loads or steady-state wing/propeller interaction. optimisation analyses performed by kroo [10] and miranda [11] have demonstrated that propeller/wing interaction, for the case of the tractor configuration, can result in a significant wing drag reduction. recent work [5, 6] also demonstrated that laminar flow can be increased when pusher propellers are installed in convenient positions behind a wing, resulting in less friction drag. concerning aircraft drag reduction, we have to take into account the effect of the propeller slipstream (here the propeller inflow is also considered as slipstream) on the wing boundary layer characteristics. tractor and pusher propellers affect the boundary layer of a wing in completely different ways. the tractor propeller acts in an unsteady fashion, due to the propeller wake and tip vortex crossing the wing surfaces. such an effect can promote transition [12] or induce an alternation between laminar and turbulent states. on the other hand, a pusher propeller only affects the flow angularities on the wing surfaces, and for some positions it can alleviate the adverse pressure gradient and so prevent separation and/or increase laminar flow. this paper describes two experimental approaches to the analysis of the problem of wing/propeller interference. the first set of experiments was designed to analyse the effect of three different tractor propellers on the wing boundary layer. it was decided to use propellers with two, three and four blades in order to investigate the effect of the frequency of the propeller wake and tip vortex crossing the wing. the second method concentrated on testing the effect of a high thrust pusher propeller, driven by a hydraulic motor, on a two-dimensional wing over a wide range of incidence, with the propeller positioned at several positions behind the wing. measurements included pressure distributions for the pusher case only, flow visualisation for both cases, and hot wire measurements for the tractor case.

2 experimental set up

pusher set-up: a wortmann fx63-137 profile wing with a chord of 0.34 m was used for the tests. the wing carried 82 pressure tappings around the centre-line chord. a 0.52 m diameter three-blade propeller driven by a 20 hp hydraulic motor was used. for the pressure measurements an 8 ft × 4 ft open-return low speed wind tunnel was used, with the wing positioned vertically in the working section (fig. 1).
the propeller was mounted on a separate pylon, which could be moved in order to set the propeller/wing positions. the wing could be moved vertically through the working section in order to measure the spanwise effect of the propeller on the surface pressure distribution. the force and moment measurements were made in an 8 ft × 6 ft closed-circuit low speed wind tunnel using a similar arrangement for the propeller to that described above. the wing was attached vertically to a six-component balance and spanned the tunnel except for a 3 mm gap at one end, so that force measurements could be made (fig. 2). flow visualisation was carried out using both the sublimation and the oil technique. the seven wing/propeller positions (table 1) were tested through the incidence range of -4 to +20 degrees, with and without a trip wire. the wing surface pressure distribution was measured at 10 spanwise positions. the reynolds number was set at 0.45 million, and the propeller was run at a thrust coefficient of ct = 0.15 with an advance ratio of j = 0.33. these propeller characteristics were chosen in order to simulate a high power condition such as take-off and climb.

tractor set-up: the same wortmann fx63-137 profile was used to construct two wings with a chord of 0.28 m. these two wings were attached to a nacelle of cylindrical shape (figs. 3 and 4). inside the nacelle there was a shaft and two ball bearings, with the propeller and a pulley attached to each end. the propeller was driven by a 5 hp electric motor through a 1:2 pulley/belt system. a frequency inverter controlled the motor/propeller speed. the model span was 1 m and the nacelle diameter 0.09 m. the wing tips were positioned close to the wind tunnel wall in order to keep the tip vortex at a minimum.
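the operating points quoted here and for the tractor tests below follow the standard propeller similarity parameters; a short sketch of the definitions (the 20 m/s tunnel speed is inferred from the tractor values j = 0.43, 7,000 rpm and d = 0.40 m quoted below, and is not stated explicitly in the paper):

```python
# standard propeller similarity parameters behind the quoted operating points.
def advance_ratio(v, n, d):
    """j = v / (n * d), with v in m/s, n in rev/s and d in m."""
    return v / (n * d)

def thrust_coefficient(thrust, rho, n, d):
    """ct = t / (rho * n**2 * d**4), with thrust in n and rho in kg/m^3."""
    return thrust / (rho * n**2 * d**4)

print(advance_ratio(20.0, 7000 / 60, 0.40))  # ~0.43, the three/four-blade tractor case
```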
fig. 4 shows the tractor propeller model mounted inside the wind tunnel working section. the wind tunnel is of the open-circuit type with a 1 m × 1 m working section. the three- and four-blade propellers had the same diameter of 0.40 m, and the two-blade propeller 0.36 m. all experimental tests performed with the tractor model were conducted at a reynolds number of 350,000, with the wing without any transition trip. the propeller speed was 7,000 rpm, resulting in an average advance ratio of j = 0.43 for the three- and four-bladed propellers, and j = 0.48 for the two-bladed propeller. the visualisation technique used for transition localisation was sublimation, which consisted of spraying naphthalene diluted in a volatile solvent onto the wing surfaces. also, oil flow visualisation was used for determining wing surface characteristics such as laminar bubble separation, turbulent separation, etc. hot wire anemometry was carried out at the 40 % chord position on the upper surface of the wing at 30 spanwise positions. the hot wire was kept 1 mm from the wing surface in order to ensure that the measurements were taken inside the boundary layer for the whole probe traverse. a constant temperature hot wire anemometer with a traverse gear was used in the experiments.

fig. 1: pressure measurements, pusher case.
fig. 2: force and moment test set-up for the pusher case.
fig. 3: tractor propeller set-up.
fig. 4: tractor propeller model.

table 1: propeller positions for the pusher case

position                     pos 01   pos 02   pos 03   pos 04   pos 05   pos 06   pos 07
distance from te             0.5 c    0.5 c    0.5 c    0.85 c   0.85 c   0.85 c   0.5 c
above (+) / below (-) chord  0.0      0.23 c   0.46 c   0.46 c   0.23 c   0.0      0.19 c (-)

3 results

pusher propeller model: the increase in suction on the upper surface of the wing due to the propeller is clearly shown in the pressure distributions of fig. 5, which resulted in a gain in cl, as shown in the cl-α curve of fig. 6. the effect of the propeller is larger at the working incidence angles (-4 to 6 degrees) for propeller positions above the wing's chord line and close to the trailing edge, due to the increase in effective incidence and camber induced by the propeller inflow. a direct consequence of this increase of suction on the upper surface of the wing is an increase of pressure drag, as shown in the cl-cd curve of fig. 7. at high incidence angles, part of this gain in cl is due to a delay in turbulent separation, as demonstrated by the downstream movement of the separation point s in the α = 12.5° curve of fig. 5. flow visualisation using a smoke stream also showed the effect of the propeller on separation and upwash angle, as can be seen in fig. 8. because the boundary layer transition, in this case, is free from any trip, and also due to the low reynolds number of the experiment, the effect of the propeller on changing the local flow incidence affects the transition front. this effect can be seen in figs. 9 and 10 by the movement of the transition front (determined by sublimation). the maximum effect on the transition front occurs at the centre of the wing, and it also acts on the laminar separation bubble, as can be seen in fig. 5 by the change in the position of the laminar separation point (l') and the reattachment point (r). this effect decreases at incidence angles greater than 8 degrees and may even promote transition, as the effect of the propeller inflow at the leading edge is an increase of upwash. this effect for two incidence angles can be seen in fig. 11, which also shows that for high incidence angles, near the leading edge, the flow incidence induced by the propeller can move the transition front forward. this phenomenon is especially intense for propeller positions above the chord line, but even so it is much less intense than the backward movement of the turbulent separation front due to the propeller inflow. therefore, the gain in friction drag due to the increase of laminar flow found at low incidence angles is compensated by the increase of pressure drag; on the other hand, for high incidence angles the decrease in pressure drag due to the delay of turbulent separation is the main benefit. these results can be seen in fig. 12, which also shows that for propeller position 03 there is a decrease in drag over the working range of incidence angles. this happens not only due to the extended laminar flow, but mainly due to a forward shift of the resultant force, which will thus produce a small thrust force.

fig. 5: pressure distribution at the centre line of the pusher model wing (α = 12.5°; t marks transition, s separation).
fig. 6: cl-α for the pusher model wing.
fig. 7: drag polar for the pusher model.
fig. 8: effect of the pusher propeller on separation, α = 14°.
fig. 9: transition and separation points at the centre line for the pusher model at position 01.
fig. 10: transition and separation points at the centre line for the pusher model, position 03.

tractor propeller model: due to working-section restrictions, the range of incidence angles was limited from 0° to 8°. in this phase of the tests, only flow visualisation and hot wire measurements were carried out. the first series of tests was to analyse the transition front using the sublimation technique. naphthalene was sprayed only on the upper surface. the results are plotted in fig. 13 for the three propellers; the measurement was taken at a spanwise station corresponding to 75 % of the propeller radius. the results showed that inside the slipstream the transition front was brought close to the leading edge. it was found that there is no measurable difference between the effects of the three propellers, at least with the sublimation technique. also, it was not possible to observe whether there was any difference between the left and right wing flow due to the propeller wake swirl. fig. 14 shows a sketch of the transition front on the wing upper surface. flow visualisation using the oil flow technique was more elucidating, because it showed the flow pattern of the wing better. fig. 14 shows the whole left wing at 4° with the different oil flow patterns. it can be seen that the laminar separation bubble was washed out inside the slipstream, and that the effect of the slipstream extends further than the propeller radius due to the viscous mixing between the slipstream and the external flow. hot wire measurements were effective in finding the effect of the blade wake crossing frequency. fig. 15 shows the time history of the velocity inside the boundary layer for the two-bladed propeller. the periodic effect of the blade wake crossing the boundary layer can be seen.
figs. 16 and 17 show the time history of velocity for the three- and four-bladed propellers. it can be observed from these figures that, when the frequency of the wake passing over the wing increases, the injection of turbulence into the boundary layer from the blade wakes is much more intense. the turbulence intensity for the three propellers is plotted in fig. 18.

fig. 11: effect on spanwise transition due to the pusher propeller at position 03 (transition front x/c versus spanwise station y/c for α = 4° and α = 12°; l' = laminar separation, r = reattachment).
fig. 12: variation of cd with the pusher propeller on, compared with the propeller-off case.
fig. 13: transition front for the tractor model, from flow visualisation.
fig. 14: location of the transition front determined by flow visualisation.

4 conclusions

the effect of a pusher and a tractor propeller on the flow over a straight wing was investigated by wind tunnel tests. a total of 7 different configurations of the pusher model were investigated, and three different propellers were used for the tractor model. the propeller induced flow over the wing surfaces, thus increasing lift and pressure drag, and delaying turbulent separation. for the pusher propeller the effect was more intense on the rear of the wing, but it can also extend to the front by changing the upwash angle. the propeller effects are very dependent on the relative propeller/wing position. over the working range of incidence angles, pusher propeller positions above the wing gave the best results. the propeller inflow can also delay transition by preserving laminar flow on a smooth wing at low reynolds number, due to the alleviation of the adverse pressure gradient at the rear of the wing. for the tractor propeller it was found that the slipstream passing over the wing promotes transition, moving its position to near the leading edge. if a laminar flow wing or a low reynolds number profile is used inside the slipstream, laminar flow can decrease by 80 % relative to a clean wing with no propeller flow. also, if a multi-blade propeller is in use, it can destroy the intermittent shift from laminar to turbulent flow encountered when a two-blade propeller wake passes over a laminar wing, as pointed out by howard et al. [12]. pusher propeller wing-body configurations are still attractive when compared with the tractor configuration, particularly concerning wing flow and cabin noise.

references

[1] henne, p. a., dahlin, j. a., peavey, c. c., gerren, d. s.: "configuration design studies and wind tunnel tests of an energy efficient transport with a high aspect ratio supercritical wing". nasa cr-3224, may 1982.
fig. 15: time history of the velocity fluctuations inside the tractor model boundary layer for the two-bladed propeller.
fig. 16: time history of the velocity fluctuations inside the tractor model boundary layer for the three-bladed propeller.
fig. 17: time history of velocity fluctuations for the four-bladed propeller.
fig. 18: spanwise turbulence intensity (local turbulence intensity, rms/u in %, for the two-, three- and four-bladed tractor propellers).

[2] goldsmith, i. m.: "a study to define the research and technology requirements for advanced turbo/propfan transport aircraft". nasa cr-166138, 1981.
[3] dunham, d. m., gentry, g. l. jr., gregory, m. s., applin, z. t., quinto, p. f.: "low speed aerodynamic characteristics of a twin-engine general aviation configuration with aft-fuselage-mounted pusher propellers". nasa tp-2763, 1987.
[4] coe, p. l. jr., turner, s. g., owens, d. b.: "low speed wind tunnel investigation of the flight dynamics characteristics of an advanced turboprop business/commuter aircraft configuration". nasa tp-2982, 1990.
[5] catalano, f. m., stollery, j. l.: "the effect of a high thrust pusher propeller on the aerodynamic characteristics of a wing at low reynolds". icas 94-6.1.3, anaheim (california, usa), september 1994.
[6] catalano, f. m., maunsell, m. g.: "experimental and numerical analysis of the effect of a pusher propeller on a wing and body". 35th aiaa aerospace sciences meeting and exhibit, reno (nv, usa), january 6-10, 1997.
[7] maunsell, m. g.: "a study of propeller/wing/body interference for a low speed twin-engined pusher configuration". icas-90-5.4.3, 1990.
[8] johnson, j. l., white, e. r.: "exploratory low speed wind tunnel investigation of an advanced commuter configuration including an over-the-wing propeller design". aiaa-83-2531, 1983.
[9] witkowski, d. p., johnston, r. t., sullivan, j. p.: "propeller/wing interaction". 27th aerospace sciences meeting, reno (nv, usa), january 9-12, 1989.
[10] kroo, i.: "propeller-wing interaction for minimum induced loss". aiaa 84-2470, 1984.
[11] miranda, l. r., brennan, j. e.: "aerodynamic effects of wing tip mounted propellers and turbines". aiaa 86-1802, 1986.
[12] howard, r. m., milley, s. j., holmes, b. j.: "an investigation of the propeller slipstream on a laminar wing boundary layer". sae paper no. 850859, 1985.
[13] thompson, j. s., smelt, r., davison, b., smith, f.: "comparison of pusher and tractor propellers mounted on a wing". reports and memoranda r&m no. 2516, june 1940.

fernando m. catalano, m.sc., ph.d., mraes, aircraft laboratory, university of sao paulo, brazil, ave do trabalhador saocarlense 400, cep 13566-590, sao carlos, sp

electrical and thermodynamic properties of a collagen solution

jaromir stancl (a, *), jan skocilas (a), ales landfeld (b), rudolf zitny (a), milan houska (b)

a: czech technical university in prague, faculty of mechanical engineering, department of process engineering, technicka 4, prague, czech republic
b: food research institute prague, radiova 7, prague, czech republic
*: corresponding author: jaromir.stancl@fs.cvut.cz

acta polytechnica 57(3):229-234, 2017. doi:10.14311/ap.2017.57.0229

abstract.
this paper focuses on measurements of the electrical properties, the specific heat capacity and the thermal conductivity of a collagen solution (7.19 % mass fraction of native bovine collagen in water). the results of our experiments show that the specific electrical conductivity of the collagen solution is strongly dependent on temperature. the transition region of collagen to gelatin has been observed from the measured temperature dependence of the specific electrical conductivity, and has been confirmed by specific heat capacity measurements by differential scanning calorimetry.
keywords: electrical conductivity; thermal conductivity; specific heat capacity; collagen matter; gelatin; differential scanning calorimetry.

1. introduction
collagen is the most abundant protein in animals [1] and is the main component of the extracellular matrix of animal connective tissue [2, 3]. collagen-based solutions are used mainly in the food industry for extruding sausage casings [4], or for producing vascular grafts in medical applications [5]. the dependence of the specific electrical conductivity on temperature has been studied for a wide range of foods, but to the best of our knowledge, there is only very limited information on natural collagenous materials. the temperature dependence of the specific electrical conductivity (sec) of various kinds of meat (pork, beef, lamb, turkey and chicken meat), measured by an electrical conductivity probe in the temperature range 5–85 °c, has been presented by zell et al. [6]. shirsat et al. [7] studied the electrical conductivity of a pork leg and a pork shoulder, and the dependence of the electrical conductivity of pork meat on the fat content in the meat: with increasing fat content, the electrical conductivity of the meat decreases. a strong dependence of the electrical conductivity on temperature, structure and ion content has also been confirmed for apricot and peach purees [8], for pomegranate [9] and for various other kinds of fresh fruits and vegetables [10]. measurements of the electrical conductivity of a material may be a suitable alternative to the differential scanning calorimetry (dsc) method for identifying structural changes in food materials, e.g. the gelation of a potato or corn starch suspension [11]; the measurement results showed that the specific electrical conductivity of the starch increases linearly with temperature, outside the section where gelation of the starch suspension occurs. the aim of the presented paper was to study the relationship between the electrical and thermodynamic properties (the dependence of the specific electrical conductivity and the specific heat capacity on temperature) of a collagen solution. the second aim was the measurement of the thermal conductivity, to provide more information about the thermal properties of a collagen solution.

2. materials and methods
2.1. tested material and sample preparation
2.1.1. tested material
the tested material (a collagen solution obtained from a local producer) was a water solution of a bovine collagen (type i) extracted from mechanically pretreated bovine skins with no other chemical additives. the material is a viscoelastic paste that looks like "silly putty". the mass fraction of the dry matter collagen in the tested collagen solution was 7.19 %. the material was well homogenized by mixing and was not thermally treated.
laboratory tests of the tested material were carried out using size exclusion chromatography with uv detection; three characteristic fractions were identified (12 % light fraction, 3 kda; 15 % middle fraction, 550 kda; and 19 % longest fraction, 780 kda), see [12]. the material was stored in a refrigerator, packed in a plastic bag.

2.1.2. sample preparation
samples for the specific electrical conductivity measurements were prepared from the homogenized collagen solution. smaller pieces of the collagen solution were taken out randomly from the collagen solution mass (approx. 30 kg of total mass of collagen solution). five identical cylindrical samples were prepared with the dimensions: diameter d = 85 mm, thickness h = 15 mm. the prepared samples were packed in a plastic bag to prevent desiccation and stored in a refrigerator between measurements. samples for the specific heat capacity measurements were prepared from the same collagen solution. five capsules were prepared for the dsc method of the specific heat capacity measurement. the mass of each tested sample (in a 40 µl capsule) was within the range of 15–25 mg. to perform thermal conductivity measurements, a sample with the dimensions of 150 × 100 × 20 mm was prepared from the mass of the tested collagen material.

2.2. experimental setup and procedure
2.2.1. measurements of specific electrical conductivity
a new apparatus was assembled for measuring the specific electrical conductivity of the tested material. a scheme of the experimental apparatus is shown in figure 1.

figure 1: scheme of the experimental setup.

the apparatus consists of an ohmic cell (a closed plastic mould with a cylindrical gap, where the top and the bottom of the cylindrical gap are formed by planar stainless steel electrodes). the distance between the electrodes was h = 15 mm, and the diameter of the cylindrical gap was d = 85 mm. the electrodes were connected to the power supply. the power supply was assembled from an electrovoice q1212 power audio amplifier (electrovoice, germany) and a sine wave generator (wavetek, usa), where it was possible to set the frequency of the electric current. during the experiment, the tested sample was heated by direct ohmic heating. an lmg 95 electronic power meter (zes zimmer, germany) was used for measuring the voltage u and the electric current i. the temperature of the tested material was measured by a t-type thermocouple, which was placed at the geometrical centre of the tested sample. both electrodes were cleaned using sand paper and were cooled down before the next measurement. a cylindrical sample of the tested material was placed between the two planar stainless steel electrodes, fed by a voltage u = 10 v. the specific electrical conductivity (sec) was continuously evaluated during the ohmic heating of the sample at an approximately constant voltage u, recorded together with the electric current i by the power meter. at the same time, the continuous increase in temperature at the centre of the sample was recorded by the t-type thermocouple. the evaluation of the effective sec was based on

κ = c i/u (1)

with the cell constant c = h/s (where h is the distance between the electrodes and s is the contact area), as follows from the assumption that the intensity of the electric field (= u/h) inside the heated sample is uniform.
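as a concrete illustration of equation (1), here is a minimal python sketch that evaluates the effective sec from logged samples; the cell geometry is taken from the text, while the logged triples and the variable names are hypothetical stand-ins for the power-meter and thermocouple records.

```python
import math

# cell geometry quoted in the text: electrode distance h = 15 mm,
# cylindrical gap diameter d = 85 mm
h = 0.015                     # m
d = 0.085                     # m
s = math.pi * d ** 2 / 4      # contact area, m^2
c = h / s                     # cell constant, 1/m (~2.64, cf. the calibration below)

def sec(u_volts, i_amps):
    """effective specific electrical conductivity, eq. (1): kappa = c * i / u."""
    return c * i_amps / u_volts

# hypothetical logged triples (voltage, current, sample-centre temperature);
# real values would come from the lmg 95 power meter and the thermocouple
log = [(10.0, 0.94, 10.2), (10.0, 1.13, 20.5), (10.0, 1.32, 30.1)]
for u, i, t in log:
    print(f"t = {t:5.1f} degC   kappa = {sec(u, i):.4f} S/m")
```

note that the purely geometric cell constant computed here, c ≈ 2.64 m−1, is the value the calibration discussion below compares against.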
five repeated measurements were performed; a new fresh collagen solution sample was used for each of the measurements. the sec of the material can be influenced by the frequency of the electric field. for high electrolyte concentrations, polarization effects have a negative influence on the electrical conductivity measurements; for low electrolyte concentrations, there is an impact of the capacitances. these negative impacts can be reduced by the right choice of the frequency of the electric field used for the sec measurements: the selection of a high frequency of the electric field reduces the influence of the polarization in highly conductive materials, while the selection of low frequencies reduces the impact of the capacitances in low conductive materials [13]. our measurement cell was calibrated using a 0.01 mol l−1 solution of kcl with a sec of 0.1413 s m−1 (at a temperature of 25 °c). the frequency of the electric field of 2.5 khz was found to be optimal for the tested material, according to the calibration results: at this frequency, the calibration showed the lowest difference between the cell constant identified by the calibration (c = 2.63 m−1) and the value calculated from the geometry of the cell under the assumption of a uniform intensity of the electric field in the tested sample (c = 2.64 m−1).

2.2.2. measurements of the specific heat capacity and thermal conductivity
measurements of the specific heat capacity of the sample were carried out using a perkin-elmer diamond dsc differential scanning calorimeter with a heating rate of 10 °c/min. five repeated measurements were performed; a new fresh collagen solution sample was used for each of the measurements. the thermal conductivity of the tested material was assessed by a kemtherm qtm-d3 (kyoto electronics, japan) thermal conductivity meter working on the hot wire (transient) principle. the experiment was repeated ten times with the same collagen sample under the condition of a constant ambient and sample temperature of 13 °c. the probe was calibrated using reference plates with a known thermal conductivity before the experiment.

3. results of experiments and discussion
3.1. specific electrical conductivity
the sec of the tested collagen material was evaluated from the measured time courses of the temperature in the geometric centre of the sample and the volt-ampere characteristics, in the temperature range from 10–50 °c, for a constant frequency of the electric field (frequency of the sine wave) of 2.5 khz and an approximately constant voltage of 10 v. the dependence of the evaluated sec of the tested material on temperature is shown in figure 2. the standard deviation of the measured sec is 0.008 s m−1.

figure 2: the dependence of the specific electrical conductivity (triangles) and the specific heat capacity (dots) on temperature (average from repeated experiments). model predictions are represented by solid lines.

figure 2 shows that the sec of the tested material depends strongly on temperature. the dependence of the sec κ (in s m−1) of the tested collagen material on temperature in the temperature range from 10–50 °c can be described by the model in (2). the parameters of the model for the sec, a, b, c, d and e in the following equation, were obtained by a nonlinear regression of the experimental data:

κ = a + b t + c/(1 + d (t − e)²). (2)
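here is a sketch of how the parameters of model (2), and the t-ratios reported in table 1 below, could be identified by nonlinear regression; the synthetic (t, κ) pairs generated from the published parameters are an assumption standing in for the measured data, and scipy's curve_fit is one possible tool, not necessarily the software used by the authors.

```python
import numpy as np
from scipy.optimize import curve_fit

def model(t, a, b, c, d, e):
    # eq. (2): kappa = a + b*t + c / (1 + d*(t - e)**2)
    return a + b * t + c / (1.0 + d * (t - e) ** 2)

# synthetic (t, kappa) pairs standing in for the measured course, generated
# here from the published parameters plus noise of the reported magnitude
rng = np.random.default_rng(1)
t = np.linspace(10, 50, 200)
kappa = model(t, 0.2001, 0.0049, -0.0228, 0.07735, 40.0) + rng.normal(0, 0.008, t.size)

p0 = [0.2, 0.005, -0.02, 0.1, 40.0]      # starting guess near the expected values
popt, pcov = curve_fit(model, t, kappa, p0=p0)
perr = np.sqrt(np.diag(pcov))            # standard errors of the parameters
for name, val, err in zip("abcde", popt, perr):
    print(f"{name} = {val:9.5f} +/- {err:.5f}   t-ratio = {val / err:.1f}")
```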
the received correlation coefficient r² = 0.999 confirms that the applied model represents the experimental data very well in the temperature range from 10–50 °c. the identified model parameters for the sec, together with a statistical analysis, are summarized in table 1.

parameter   unit          value     +/−       t-ratio   conclusion
a           s m−1         0.2001    0.0034    127.4     significant
b           s m−1 °c−1    0.0049    0.0004    65.9      significant
c           s m−1         −0.0228   0.0044    −13.8     significant
d           °c−2          0.07735   0.0275    3.4       significant
e           °c            40        0.5926    145.8     significant
r² = 0.999

table 1: the identified model parameters for the sec and the results of the statistical analysis for a 95 % confidence interval.

for all parameters of the sec model, the statistical analysis confirms that temperature is a statistically significant parameter. figure 2 also shows a slight decrease in the linear trend of the sec in the temperature range 32–43 °c, where natural collagen denatures and changes its structure into gelatin. a peak in the temperature dependence of the specific heat capacity of the tested collagen was observed by the dsc measurements in this temperature range.

3.2. specific heat capacity and the melting region of the collagen material
the specific heat capacity of the tested collagenous material was measured by differential scanning calorimetry (dsc). the temperature dependence of the specific heat capacity is shown in figure 2. the specific heat capacity of the tested sample increases slightly with rising temperature. the same model as for the sec (2) was used for predicting the dependence of the specific heat capacity cp (in j kg−1 °c−1) on temperature for the tested natural collagen solution in the temperature range from 10–50 °c; this empirical model is usually used for predicting the heat capacity of pork and beef fat [14]. table 2 contains the results for the model parameters a, b, c, d and e (t is the temperature in °c) for the specific heat capacity of the tested collagen solution, received by the nonlinear regression of the experimental data by (2), together with the statistical analysis.

parameter   unit            value     +/−       t-ratio   conclusion
a           j kg−1 °c−1     3665.1    6.7885    1062.7    significant
b           j kg−1 °c−2     2.0179    0.1105    35.9      significant
c           j kg−1 °c−1     634.57    14.9065   83.8      significant
d           °c−2            0.14712   0.0120    25.9      significant
e           °c              37.614    0.0612    1210.3    significant
r² = 0.975

table 2: the identified model parameters for the specific heat capacity and the results of the statistical analysis for a 95 % confidence interval.

the received correlation coefficient r² = 0.975 confirms that the used model represents the experimental data relatively well in the temperature range from 10–50 °c. for all model parameters of the specific heat capacity, the statistical analysis confirms that temperature is a statistically significant parameter.
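to make the slope-change criterion concrete, here is a hedged sketch that flags the temperature band where the local slope of κ(t) departs from the fitted linear baseline; the tolerance and the baseline windows are assumptions, and on the symmetric model curve used in the demo it recovers a band centred on e = 40 °c rather than the exact experimental values reported below in table 3.

```python
import numpy as np

def transition_region(t, kappa, tol=0.25):
    """flag the band where the local slope of kappa(t) departs from the
    linear baseline by more than `tol` (relative); the baseline windows
    and the tolerance are assumptions."""
    dk = np.gradient(kappa, t)                    # local slope dkappa/dt
    ends = (t < 25) | (t > 47)                    # clearly linear parts
    b = np.polyfit(t[ends], kappa[ends], 1)[0]    # baseline slope
    inside = np.abs(dk - b) > tol * abs(b)
    return t[inside].min(), t[inside].max()

# demo on the model curve built from the parameters of table 1
t = np.linspace(10, 50, 400)
kappa = 0.2001 + 0.0049 * t - 0.0228 / (1 + 0.07735 * (t - 40.0) ** 2)
print(transition_region(t, kappa))   # band around e = 40 degC
```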
a peak in the specific heat capacity trend was observed during the experiment. the peak corresponds to the temperature region in which a natural collagen sample denatures and changes its form into gelatin. table 3 presents the identified temperatures of the probable start and end of the gelatin transition process, from the observed peak by the dsc measurements and by the sec measurements (from the change in the slope).

method            tstart (°c)   s      tend (°c)   s
dsc               33.7          0.39   43.4        1.15
sec measurement   32.8          0.79   42.5        0.95

table 3: a comparison of the identified temperature region of the collagen transition to gelatin from the dsc measurements and from the sec measurements. tstart corresponds to the start of the peak observed from the dsc measurements and to the sudden slope change from the sec measurements; tend corresponds to the end of the peak observed from the dsc measurements and to the slope change from the sec measurements. the presented temperature values are the averages from repeated experiments; s indicates the standard deviation of the temperatures.

the observed temperatures of the region where the collagen changes into gelatin are almost identical. our results show that the denaturation temperature region can also be assessed by electrical conductivity measurements (from the slope change in the course of the specific electrical conductivity), but it may not always be entirely conclusive when compared with the dsc measurements. similar conclusions for the starch gelation process are presented in [11].

3.3. thermal conductivity of collagen material
thermal conductivity measurements of the tested collagen solution were carried out to complete the information on the basic thermo-physical properties of the tested material. the measurements were performed for a constant temperature of the material of 13 °c. the observed average value of the thermal conductivity of the tested collagen solution from an experiment repeated ten times was λ = 0.642 w m−1 °c−1, with a standard deviation of s = 0.073 w m−1 °c−1. it is apparent that the thermal conductivity is greatly influenced by the high water content in the tested material.

4. conclusions
our research has focused on an experimental investigation of the temperature dependence of the sec and the specific heat capacity of a collagenous material (a solution of bovine skin collagen and water) in the temperature range from 10–50 °c. the thermal conductivity of the tested collagen solution was measured to complete the knowledge of the physical properties of this material. a strong temperature dependence of the sec of the tested collagen matter was observed. the sec increases practically linearly with the increasing temperature of the tested sample, which is consistent with the debye-hückel theory of ionic solutions [15]. only in the temperature region of 32–43 °c does the slope of the linear course of the electrical conductivity change, due to the irreversible transition into gelatin. this region was also confirmed by the dsc measurement, where a peak in the temperature dependence of the specific heat capacity was observed. a model has been developed for predicting the temperature-dependent sec and the specific heat capacity, taking into account the transition of natural collagen into gelatin, and its parameters have been identified. the slope change in the linear course of the sec is clearly visible, but it may not always be entirely conclusive according to the dsc measurements.
however, it can tell us that something has happened in the material. the thermal conductivity is greatly influenced by the high water content in the tested material. the identified basic information about the physical properties of the tested collagen solution is summarized in table 4.

property                         model parameters and values    validity / note
spec. electrical conductivity    a = 0.2001 s m−1               t range 10–50 °c
                                 b = 0.0049 s m−1 °c−1          r² = 0.999
                                 c = −0.0228 s m−1
                                 d = 0.07735 °c−2
                                 e = 40 °c
spec. heat capacity              a = 3665.1 j kg−1 °c−1         t range 10–50 °c
                                 b = 2.0179 j kg−1 °c−2         r² = 0.975
                                 c = 634.57 j kg−1 °c−1
                                 d = 0.14712 °c−2
                                 e = 37.614 °c
thermal conductivity             λ = 0.642 w m−1 °c−1           t = 13 °c
                                 s = 0.073 w m−1 °c−1
t range of transition            t = 33.7–43.4 °c               assessed by dsc measurement
into gelatine

table 4: summarized parameters of the model for predicting κ and cp (2) and the physical properties of the tested collagen solution.

acknowledgements
this research was supported by the research project ga cr no. 14-23482s (thermal, electrical and rheological properties of collagen matter).

list of symbols
a, b, c, d, e   parameters of the model for κ and cp
c     constant of the conductivity probe [m−1]
cp    specific heat capacity [j kg−1 °c−1]
d     diameter of the electrodes [m]
f     frequency of the electric field [hz]
h     distance between electrodes [m]
i     electric current [a]
s     contact area [m2]
s     standard deviation
t     temperature [°c]
u     voltage [v]
κ     specific electrical conductivity (sec) [s m−1]
λ     thermal conductivity [w m−1 °c−1]
subscripts: start – start of irreversible changes in the material; end – end of irreversible changes in the material

references
[1] i. j. haug, k. i. draget, o. smidsrod. physical and rheological properties of fish gelatin compared to mammalian gelatin. food hydrocolloids 18(2):203–213, 2004. doi:10.1016/s0268-005x(03)00065-1.
[2] m. d. shoulders, r. t. raines. collagen structure and stability. annual review of biochemistry 78:929–958, 2009. doi:10.1146/annurev.biochem.77.032207.120833.
[3] c. zeltz, j. orgel, d. gullberg. molecular composition and function of integrin-based collagen glues: introducing colinbris. biochimica et biophysica acta 1840(8):2533–2548, 2014. doi:10.1016/j.bbagen.2013.12.022.
[4] j. a. deiber, m. b. peirotti, m. l. ottone. rheological characterization of edible films made from collagen colloidal particle suspensions. food hydrocolloids 25(5):1382–1392, 2011. doi:10.1016/j.foodhyd.2011.01.002.
[5] v. a. kumar, j. m. caves, c. a. haller, et al. acellular vascular grafts generated from collagen and elastin analogs. acta biomaterialia 9(9):8067–8074, 2013. doi:10.1016/j.actbio.2013.05.024.
[6] m. zell, j. g. lyng, d. a. cronin, d. j. morgan. ohmic heating of meats: electrical conductivities of whole meats and processed meat ingredients. meat science 83(3):563–570, 2009. doi:10.1016/j.meatsci.2009.07.005.
[7] n. shirsat, j. g. lyng, n. p. brunton, b. mckenna. ohmic processing: electrical conductivities of pork cuts. meat science 67(3):507–514, 2004. doi:10.1016/j.meatsci.2003.12.003.
[8] f. icier, c. ilicali. temperature dependent electrical conductivities of fruit purees during ohmic heating. food research international 38:1135–1142, 2005. doi:10.1016/j.foodres.2005.04.003.
[9] h. darvishi, m. h. khostaghaza, g. najafi. ohmic heating of pomegranate juice: electrical conductivity and ph change. journal of the saudi society of agricultural sciences 12:101–108, 2013. doi:10.1016/j.jssas.2012.08.003.
[10] s. sarang, s. k. sastry, l. knipe. electrical conductivity of fruits and meats during ohmic heating. journal of food engineering 87(3):351–356, 2008. doi:10.1016/j.jfoodeng.2007.12.012.
[11] f.-d. li, l.-t. li, z. li, e. tatsumi. determination of starch gelatinization temperature by ohmic heating. journal of food engineering 62:113–120, 2004. doi:10.1016/s0260-8774(03)00199-7.
[12] r. žitný, a. landfeld, j. skočilas, et al. hydraulic characteristic of collagen. czech journal of food science 33(5):479–485, 2015. doi:10.17221/62/2015-cjfs.
[13] mettler-toledo ag, switzerland.
a guide to conductivity measurement: theory and practice of conductivity applications, 2013.
[14] v. p. latyshev, t. m. ozerova. specific heat and enthalpy of molten beef and pork fat (in russian). kholoilnaja tekhnika 5:37–39, 1976.
[15] p. debye. polar molecules. new york, usa, dover publications, inc., 1929.

acta polytechnica 58(6):395–401, 2018, doi:10.14311/ap.2018.58.0395

lead-time, inventory, and safety stock calculation in job-shop manufacturing
miguel afonso sellitto
universidade do vale do rio dos sinos, production and systems engineering graduate program, av. unisinos 950, são leopoldo rs 93022-000, brazil
correspondence: sellitto@unisinos.br

abstract. the purpose of this article is to present a method for calculating the lead-time, the inventory, and the safety stock or buffer in job shop manufacturing, which are essentially stochastic variables. the research method is quantitative modelling. the theoretical foundation of the method relies on techniques belonging to the wlc (workload control) body of knowledge on manufacturing management. the study includes an application of the method in a manufacturing plant of the furniture industry, whose operation strategy requires high dependability. the company operates in a supply chain and must have high reliability in deliveries. adequate safety stocks, lead-times, and inventory levels provide protection against a lack of reliability in the deliveries. the inventory should remain within a certain range, being as small as possible to maintain low lead-times, but not so small that it could provoke a starvation, which configures an optimization problem. the study offers guidelines for a complete application in industries. further research shall include the influence of the variability of the lot size on the stochastic variables.
keywords: lead-time; inventory; safety stock; dependability; workload control; queuing theory.

1. introduction
workload control (wlc) is a production planning and control technique suitable for high-variety job shop manufacturing [1], focused mainly on make-to-order (mto) production [2]. the wlc integrates two control mechanisms, the input control and the output control.
the input control regulates the inflow of workload into the manufacturing system by priority rules; the output control regulates the outflow of orders by adjustments in the production capacity. the extant literature considers load-based order release as the main input control technique, and adding or removing equipment and workforce as the main output control technique [1]. the dependent variable is the lead-time, the time between the arrival and the completion of an order, controlled by the work-in-process inventory, the orders waiting for service. small inventories decrease the time an order waits for service, decreasing the lead-time. this study focuses on the southern brazilian furniture industry. the main strategic objective of the industry is high dependability, which includes high speed (short due dates), high reliability (accuracy in quantity and quality), and on-time deliveries (no tardiness). in the furniture industry, manufacturing operates mostly in job shop plants based on mto production. the industry produces furniture items and mattresses, and covers activities that range from the extraction and reception of raw materials from suppliers to the delivery, installation, and technical assistance of goods. a typical manufacturing company designs and produces a high variety of items with wood, steel, plastic, and specific materials. the industry must comply with various and rigorous technical specifications, whose application may require, at the same time, the use of advanced technologies and artisanal skills. innovation depends mainly on the development of new materials and new design techniques, in close cooperation with suppliers [3]. the main elements of a furniture design are plates, fasteners, paints and resins, upholstery, and packaging. some products, like rectilinear items, support a high automation level, while others, like solid wood items, require artisan skills [4]. although the wlc originally focused on highly balanced production lines, further results indicate that it can also be effective in unbalanced situations, like those found in the furniture industry: not only bottleneck but also non-bottleneck work centres receive close attention [2]. on the other hand, the wlc performance might strongly deteriorate if the protective capacity of critical resources is not sufficiently high [5]. although most reported implementations are in large manufacturing plants, the wlc is particularly suitable for small and medium-sized enterprises [6], such as those found in the furniture industry. in the furniture industry, a large part of the production comes from small and medium size plants, organized in job-shop mto production, which makes the wlc a useful tool in the manufacturing control function [7]. the purpose of this article is to present a method for calculating the lead-time, the inventory, and the safety stock in a job shop mto manufacturing. the research method is quantitative modelling. the theoretical foundation of the method relies on proper contributions and on some issues extracted from the wlc body of knowledge. the research object is a manufacturing plant of the furniture industry of southern brazil. the company aims at high dependability and must rely on adjusted levels of safety stocks, lead-times, and inventories to ensure on-time deliveries.
the decision on inventory levels in manufacturing stems from risk-hedging policies due to unreliable suppliers, seasonal production, short-term deliveries, or low-consumption items. the calculation of the inventory level creates an optimization problem: if a given level of inventories ensures the satisfaction of some specific customers, it also entails financial costs to maintain the cash flow [16–18]. the remainder of the article contains the review, methodology, results, discussion, and conclusions.

2. theoretical background
the wlc, or load-oriented control, aims at controlling the manufacturing workload by controlling the inventory level [8]. over time, the wlc balances the workload releases to and from the manufacturing system: as the inventory does not grow above a given level, the lead-time remains low; as the inventory does not fall below a given level, the efficiency remains high [9]. as mto manufacturing works with low-quantity and high-variety production, the control requires stochastic approaches [10]. the model employs five stochastic variables: lead-time (lt), work-in-process (wip), arrival rate (ri), throughput (p), and safety stock (ss). the lt (measured in days or hours) varies according to the time waiting for service and the processing time; the order lt considers only the time to completion, while the part lt considers also the size of the orders. the wip (measured in parts, tons, or other units) is the queue in the manufacturing system and represents the instantaneous quantity of materials waiting for service on the shop floor. ri is the rate of arrival of orders, measured in pieces or tons per day or per hour. p is the rate of order or part delivery, measured in the same unit as ri. imbalances between ri and p modify the queue length and, consequently, the wip. finally, ss is the minimum level of wip that, under a given confidence level, prevents the starvation produced by instantaneous differences between ri and p [8, 12, 13]. the calculation model is valid for an isolated work centre as well as for a complete production line [8], and it looks as follows:

lt_i = tpeu_i − tpe_i, (1)
lt_m = (1/n) Σ_{i=1}^{n} lt_i, (2)
ltσm = √( Σ_{i=1}^{n} (lt_m − lt_i)² / (n − 1) ), (3)
lt_mw = Σ_{i=1}^{n} lt_i q_i / Σ_{i=1}^{n} q_i, (4)
ltσw = √( Σ_{i=1}^{n} (lt_mw − lt_i)² q_i / Σ_{i=1}^{n} q_i ), (5)
ri_m = Σ_{i=1}^{n} q_i / (max_i tpe_i − min_i tpe_i), (6)
wip_m = ri_m lt_mw, (7)
p_m = Σ_{i=1}^{n} q_i / (max_i tpeu_i − min_i tpeu_i), (8)
ss_m = p_m Δt_max, (9)

in which: lt_i is the lead-time (lt) of order i in the current work centre; tpe_i is the completion time of order i in the previous work centre (time of release); tpeu_i is the completion time of order i in the current work centre; n is the number of orders; lt_m is the mean order lt; ltσm is the standard deviation of the order lt; lt_mw is the mean part lt; q_i is the amount of value added by order i; Σ q_i is the total amount dispatched by all orders; ltσw is the standard deviation of the part lt; ri_m is the mean arrival rate of orders or parts; wip_m is the mean wip; p_m is the mean throughput; and Δt_max is the maximum time elapsed between the arrivals of two successive orders.

the throughput diagram (td) derives from queuing theory and represents the material flow through the manufacturing system. the td is a graphical method to verify the analytical calculation of the lt and the wip [7, 8]. on the shop floor, the td also helps with the control of the queue, keeping the lt short and stable over time [2, 12]. figure 1 illustrates the td.

figure 1: the throughput diagram.
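a minimal python sketch of equations (1)–(9) for a single work centre follows; the tuple layout, the date handling and the day-based units are assumptions, not the author's original implementation.

```python
from datetime import date

def wlc_stats(orders):
    """orders: list of (q_i, release_date, completion_date) tuples for one
    work centre; implements eqs. (1)-(9) with day-based units."""
    lt = [(done - rel).days for _, rel, done in orders]            # eq. (1)
    q = [qi for qi, _, _ in orders]
    n = len(orders)
    lt_m = sum(lt) / n                                             # eq. (2)
    lt_sm = (sum((lt_m - t) ** 2 for t in lt) / (n - 1)) ** 0.5    # eq. (3)
    lt_mw = sum(t * qi for t, qi in zip(lt, q)) / sum(q)           # eq. (4)
    lt_sw = (sum((lt_mw - t) ** 2 * qi
                 for t, qi in zip(lt, q)) / sum(q)) ** 0.5         # eq. (5)
    rels = sorted(rel for _, rel, _ in orders)
    dones = [done for _, _, done in orders]
    ri_m = sum(q) / (rels[-1] - rels[0]).days                      # eq. (6)
    p_m = sum(q) / (max(dones) - min(dones)).days                  # eq. (8)
    wip_m = ri_m * lt_mw                                           # eq. (7)
    dt_max = max((b - a).days for a, b in zip(rels, rels[1:]))     # largest gap
    ss_m = p_m * dt_max                                            # eq. (9)
    return dict(lt_m=lt_m, lt_sm=lt_sm, lt_mw=lt_mw, lt_sw=lt_sw,
                ri_m=ri_m, p_m=p_m, wip_m=wip_m, ss_m=ss_m)

# usage with three orders taken from table 1 below (orders 1, 5 and 9)
orders = [(20, date(2018, 1, 2), date(2018, 1, 4)),
          (70, date(2018, 1, 3), date(2018, 1, 21)),
          (30, date(2018, 1, 7), date(2018, 1, 8))]
print(wlc_stats(orders))
```

fed with the full hundred-order data set of table 1, the same function reproduces the values reported in table 2.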
3. methodology and results
the research procedures are: (1) data collection in a company of the furniture industry that manufactures desk-chair sets for offices (one set contains one or two chairs and one desk); (2) application of the calculation model; (3) analysis of the results regarding strategic priorities. the company operates an mto job shop manufacturing system that processes orders with sizes in multiples of ten; eventual surpluses route to distribution centres for fast delivery. the company's manufacturing information system provided the data from the first hundred orders served in 2018. the company uses a manufacturing execution system (mes) to control the orders. the information retrieved from the mes is the size of the order, the release date, and the completion date. tables 1 and 2 show the raw data and the results of the application, respectively.

order #   qi (sets)   tpe_i   tpeu_i   lt_i   q_i·lt_i   (lt_mw−lt_i)²   (lt_mw−lt_i)²·q_i
1         20          02/01   04/01    2      40         88.6            1772.6
2         40          02/01   17/01    15     600        12.9            514.3
3         50          02/01   03/01    1      50         108.5           5423.0
4         30          02/01   15/01    13     390        2.5             75.4
5         70          03/01   21/01    18     1260       43.4            3035.9
6         30          03/01   22/01    19     570        57.5            1726.2
7         30          04/01   24/01    20     600        73.7            2211.4
8         40          04/01   14/01    10     400        2.0             80.0
9         30          07/01   08/01    1      30         108.5           3253.8
10        80          07/01   18/01    11     880        0.2             13.7
11        30          08/01   17/01    9      270        5.8             174.9
12        30          09/01   15/01    6      180        29.3            879.5
13        30          10/01   17/01    7      210        19.5            584.6
14        60          10/01   17/01    7      420        19.5            1169.2
15        40          11/01   21/01    10     400        2.0             80.0
16        30          11/01   30/01    19     570        57.5            1726.2
17        60          14/01   02/02    19     1140       57.5            3452.4
18        40          15/01   01/02    17     680        31.2            1247.9
19        90          16/01   27/01    11     990        0.2             15.5
20        30          16/01   06/02    21     630        91.9            2756.5
21        80          21/01   04/02    14     1120       6.7             534.8
22        20          22/01   27/01    5      100        41.1            822.9
23        20          24/01   28/01    4      80         55.0            1099.5
24        20          25/01   28/01    3      60         70.8            1416.1
25        20          25/01   22/02    28     560        275.1           5501.6
26        50          28/01   13/02    16     800        21.0            1051.4
27        20          30/01   31/01    1      20         108.5           2169.2
28        90          01/02   08/02    7      630        19.5            1753.9
29        40          04/02   14/02    10     400        2.0             80.0
30        30          04/02   06/02    2      60         88.6            2658.9
31        20          04/02   07/02    3      60         70.8            1416.1
32        30          04/02   20/02    16     480        21.0            630.8
33        20          04/02   22/02    18     360        43.4            867.4
34        40          05/02   13/02    8      320        11.7            466.3
35        40          05/02   14/02    9      360        5.8             233.2
36        40          05/02   16/02    11     440        0.2             6.9
37        30          05/02   06/02    1      30         108.5           3253.8
38        50          05/02   19/02    14     700        6.7             334.3
39        20          05/02   22/02    17     340        31.2            624.0
40        70          05/02   25/02    20     1400       73.7            5159.8
41        30          06/02   18/02    12     360        0.3             10.3
42        50          06/02   20/02    14     700        6.7             334.3
43        30          06/02   22/02    16     480        21.0            630.8
44        20          07/02   14/02    7      140        19.5            389.7
45        20          07/02   14/02    7      140        19.5            389.7
46        60          07/02   19/02    12     720        0.3             20.6
47        20          07/02   19/02    12     240        0.3             6.9
48        50          07/02   25/02    18     900        43.4            2168.5
49        60          07/02   25/02    18     1080       43.4            2602.2
50        80          07/02   21/02    14     1120       6.7             534.8
51        20          08/02   25/02    17     340        31.2            624.0
52        20          08/02   26/02    18     360        43.4            867.4
53        20          13/02   15/02    2      40         88.6            1772.6
54        40          13/02   26/02    13     520        2.5             100.6
55        30          13/02   27/02    14     420        6.7             200.6
56        40          13/02   25/02    12     480        0.3             13.7
57        40          13/02   21/02    8      320        11.7            466.3
58        60          18/02   26/02    8      480        11.7            699.5
59        40          18/02   26/02    8      320        11.7            466.3
60        20          18/02   25/02    7      140        19.5            389.7
61        40          18/02   26/02    8      320        11.7            466.3
62        20          18/02   26/02    8      160        11.7            233.2
63        30          19/02   26/02    7      210        19.5            584.6
64        20          20/02   05/03    13     260        2.5             50.3
65        20          21/02   27/02    6      120        29.3            586.3
66        20          21/02   11/03    18     360        43.4            867.4
67        50          21/02   11/03    18     900        43.4            2168.5
68        90          21/02   12/03    19     1710       57.5            5178.7
69        50          22/02   04/03    10     500        2.0             100.0
70        40          22/02   05/03    11     440        0.2             6.9
71        50          25/02   28/02    3      150        70.8            3540.1
72        20          27/02   12/03    13     260        2.5             50.3
73        20          28/02   06/03    6      120        29.3            586.3
74        20          28/02   06/03    6      120        29.3            586.3
75        80          01/03   19/03    18     1440       43.4            3469.6
76        20          01/03   07/03    6      120        29.3            586.3
77        30          04/03   06/03    2      60         88.6            2658.9
78        20          04/03   22/03    18     360        43.4            867.4
79        60          05/03   22/03    17     1020       31.2            1871.9
80        20          06/03   12/03    6      120        29.3            586.3
81        50          06/03   11/03    5      250        41.1            2057.3
82        50          06/03   20/03    14     700        6.7             334.3
83        30          07/03   13/03    6      180        29.3            879.5
84        40          07/03   14/03    7      280        19.5            779.5
85        20          07/03   11/03    4      80         55.0            1099.5
86        40          11/03   13/03    2      80         88.6            3545.3
87        50          11/03   13/03    2      100        88.6            4431.6
88        20          11/03   21/03    10     200        2.0             40.0
89        20          11/03   22/03    11     220        0.2             3.4
90        30          12/03   26/03    14     420        6.7             200.6
91        30          12/03   27/03    15     450        12.9            385.7
92        20          12/03   27/03    15     300        12.9            257.1
93        30          13/03   28/03    15     450        12.9            385.7
94        30          13/03   27/03    14     420        6.7             200.6
95        50          15/03   19/03    4      200        55.0            2748.7
96        20          15/03   27/03    12     240        0.3             6.9
97        30          18/03   26/03    8      240        11.7            349.8
98        30          18/03   01/04    14     420        6.7             200.6
99        40          18/03   27/03    9      360        5.8             233.2
100       30          20/03   03/04    14     420        6.7             200.6

table 1: raw data and the model application.

time interval (days)   proportion (pn)   accumulated
0                      58 %              58 %
1                      25 %              83 %
2                      6 %               89 %
3                      7 %               96 %
4                      1 %               97 %
5                      3 %               100 %

figure 2: distribution of the time intervals between the arrivals of orders.

variable   result   unit       equation
lt_m       10.9     days       (2)
ltσm       5.8      days       (3)
lt_mw      11.4     days       (4)
ltσw       5.6      days       (5)
ri_m       48.6     sets/day   (6)
wip_m      554.4    sets       (7)
p_m        41.6     sets/day   (8)
ss_m       207.8    sets       (9)

table 2: results of the application.

4. discussion
the mean order lt and the mean part lt are close (10.9 and 11.4 days, respectively), but not equal, due to the variability in the order size (µ = 37.4, σ = 18.4 sets). to cope with this uncertainty, the model creates the bivariate stochastic variable lt_i q_i, without a direct physical meaning, which represents the use of the manufacturing system by an order. the rationale that explains lt_i q_i is: the smaller the time to completion or the order size, the smaller the use of the system; the larger the time to completion or the order size, the larger the use of the system. the sum Σ lt_i q_i is a proxy variable representing the total use of the system in the period. the division of Σ lt_i q_i by the total number of parts produces a proxy variable that represents the expected time to completion of a single part. the mean arrival rate ri_m is 48.6 sets per day. the mean throughput p_m is 41.6 sets per day. since p_m is lower than ri_m, the manufacturing system accumulated inventory in the period.
the mean inventory wip_m is 554.4 sets of parts, which is more than two times the required buffer (ss_m = 207.8 sets) and ensures an excessive protection against starvation. a reduction of the buffer would result in a much lower inventory cost, as usually required by manufacturing strategies. in the period, the maximum interval between two arrivals is five days, occurring in three out of the 100 orders. assuming a normal distribution for the inventory, a wip_m of 207.8 sets provides a 50 % protection for the three orders with inter-arrival times of five days. with full protection for the orders with inter-arrival times lower than five days, the upper limit for the safety level sl, the probability that the manufacturing system will not be stopped by starvation, is sl = 1 − 0.5 · 3/100 = 98.5 %, which is too high for the manufacturing purposes. the sl is expected to decrease when the inventory uncertainty is taken into account. figure 2 shows the data and the distribution of the inter-arrival times; the data follow a negative exponential distribution.

another usual requirement of a manufacturing strategy is full attendance of the order due dates, which requires managing the lead-time. the maximum order lead-time is 28 days; therefore, promised dates of 28 or more days ensure full compliance with the due dates. on the one hand, a promised date of 28 days satisfactorily protects the majority of orders; on the other hand, it could jeopardize a fast-delivery sales policy. a compromise solution, a trade-off between protection and speed, is needed. considering lt_m = 10.9 and ltσm = 5.8 days, the upper limit of a 95 % confidence interval for the order lead-time is lt_ul = 10.9 + 1.96 · 5.8 = 22.3 days. a promised date of 23 days would protect 95 % of the sales. if the sales policy requires deadlines of no more than 20 days, the safety level would be sl = 1 − n(20; 10.9, 5.8) = 94.2 %, which usually suffices for a competitive manufacturing strategy.

the next stage is to use the td to graphically verify the analytic calculation of lt_m and wip_m. figure 3 shows the inflow and outflow td and the regression models: the accumulation of arrivals and completions of orders in the period, the regression models, and the respective r². the figure shows two models for the outflow, one with the intercept that maximizes the fitting, the other with the intercept equal to zero. the reason is that the negative intercept that maximizes r² has no physical meaning and harms the analysis, distorting the angular coefficient. the first regression model serves to verify the calculation of lt_m and wip_m; the second serves to verify p_m. as both r² are close to 1, the models can satisfactorily replace the raw data. regarding ri and p, the slopes are close to the analytical outcomes (ri_m = 48.8 and 48.6; p_m = 41.6 and 41.04), which reinforces the validity of the model. in the long term, the manufacturing control releases on average 48.8 parts per day to a production line that dispatches on average 41.6 parts per day, producing an increasing instantaneous wip and a wip_m larger than the minimum needed to avoid starvation, the ss_m.
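the graphical verification described above can also be scripted; the sketch below synthesises cumulative inflow and outflow records around the regression lines reported in figure 3 and recovers lt_m and wip_m as the horizontal and vertical distances between the fitted lines (the synthetic records are an assumption standing in for the mes data).

```python
import numpy as np

days = np.arange(9, 78)        # observation window in days (an assumption)
# cumulative arrived/completed sets would come from the mes records; here
# they are synthesised around the regression lines reported in figure 3
rng = np.random.default_rng(0)
inflow = 48.888 * days + 50.107 + rng.normal(0, 40, days.size)
outflow = 46.633 * days - 381.82 + rng.normal(0, 40, days.size)

bi, ai = np.polyfit(days, inflow, 1)    # inflow slope and intercept
bo, ao = np.polyfit(days, outflow, 1)   # outflow slope and intercept

y_c, x_c = 1870.0, 45.5                 # central points used in the text
lt_m = (y_c - ao) / bo - (y_c - ai) / bi     # horizontal distance -> mean lead-time
wip_m = (bi * x_c + ai) - (bo * x_c + ao)    # vertical distance   -> mean inventory
print(f"lt_m ~ {lt_m:.1f} days, wip_m ~ {wip_m:.0f} sets")
```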
figure 3: the td and the regression models (inflow: y = 48.888x + 50.107, r² = 0.9847; outflow: y = 46.633x − 381.82, r² = 0.9816; zero-intercept outflow: y = 41.037x, r² = 0.9524).

to reduce the excess, the manufacturing control should slow the order release, increase the processing capacity, or both, until wip_m meets ss_m. regarding lt and wip, the central points of the regression models are (x, y) = (41.5, 1870), corresponding to the centres of the time and inventory ranges. the horizontal distance between the inflow and the outflow model at y = 1870 represents lt_m, and the vertical distance at x = 45.5 represents wip_m. after some algebraic manipulation of the equations of the two regression models, it is possible to calculate the horizontal and the vertical distance between the central points of the lines:

lt_m = (1870 + 381.8)/46.6 − (1870 − 50.1)/48.8 = 11.02 days

corresponds to the horizontal distance, and

wip_m = (48.8 · 45.5 + 50.1) − (46.6 · 45.5 − 381.8) = 532 sets

corresponds to the vertical distance between the central points of the lines. again, the graphical values (11.02 and 532) are close to the analytical outcomes of the model (11.4 and 554.4). as the variables are stochastic, the approximations suffice for the analysis. finally, the td also allows considering the inventory uncertainty. the orders arrived in 77 days, but only those inside the dashed area in figure 3 are useful: the analysis discards the initial orders, as they would require a negative outflow accumulation to calculate the instantaneous inventory, which has no physical meaning. the mean and the standard deviation of the 67 valid instantaneous inventory values are 548.4 and 202.1 sets, respectively. assuming a fixed value p_m = 41.04, wip_m = 548.4 and wipσ = 202.1, and using pn, the proportion of orders whose inter-arrival time is n = 1, …, 5 days, sl = 1 − Σ_{n=1}^{5} pn n(41.04n; 548, 202) = 99.3 %, which is very high and confirms the excess of inventory. if the management reduces wip_m to 207 sets, the analytical ss_m, the new value of the standard deviation is wipσ = 202.1 √(207/542.8) = 124.15, and sl = 1 − Σ_{n=1}^{5} pn n(41.04n; 207, 124) = 93.2 %, which is still acceptable in a competitive manufacturing strategy. as expected, the uncertainty reduced the sl, whose upper limit was previously estimated at 98.5 %.
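the two safety levels quoted above follow from the inter-arrival proportions of figure 2 and a normal model for the instantaneous inventory; a minimal sketch (python's statistics.NormalDist is an assumed tool, not part of the original study):

```python
from statistics import NormalDist

# inter-arrival proportions p_n of figure 2 (n >= 1) and the zero-intercept
# throughput slope used in the text
p = {1: 0.25, 2: 0.06, 3: 0.07, 4: 0.01, 5: 0.03}
p_m = 41.04   # sets/day

def safety_level(wip_mean, wip_std):
    """sl = 1 - sum_n p_n * P(wip < p_m * n), with wip ~ N(wip_mean, wip_std)."""
    dist = NormalDist(wip_mean, wip_std)
    return 1.0 - sum(pn * dist.cdf(p_m * n) for n, pn in p.items())

print(f"{safety_level(548.4, 202.1):.3f}")   # ~0.995; the text reports 99.3 %
wip_std_reduced = 202.1 * (207 / 542.8) ** 0.5
print(f"{safety_level(207.0, wip_std_reduced):.3f}")   # ~0.932, as in the text
```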
5. conclusion
the purpose of this article was to present a method for calculating the lead-time, the inventory, and the safety stock in job shop mto manufacturing. the article included an application of the model in a job shop mto manufacturing plant of the furniture industry. besides the calculation of the variables, the application demonstrated that, due to unbalanced flows of material (arrival and completion of orders), the manufacturing produced excessive wip, which possibly impairs the delivery deadlines and increases the operational cost. as the manufacturing strategy requires reliability in deliveries, the management should focus on reducing wip_m to ensure dependability and support the competition in the industry. the article presented the initial development of the study. further research shall focus on the influence of the uncertainty of the order size and of the inventory on the lead-time; the research method is simulation, which is widely used in manufacturing control studies [14]. further research shall also include fuzzy set theory to manage the intrinsic uncertainty in inventory management; a systematic review identified major achievements in fuzzy inventory management [15]. further research should also focus on other companies in the same industry and in other industries, such as the footwear and electronics industries, which also rely on mto job shop manufacturing systems.

references
[1] thürer m., stevenson m., land m. on the integration of input and output control: workload control order release. int j prod econ 2016; 174:43–53. doi:10.1016/j.ijpe.2016.01.005
[2] fernandes n., land m., carmo-silva s. workload control in unbalanced job shops. int j prod res 2014; 52:679–690. doi:10.1080/00207543.2013.827808
[3] sellitto m., luchese j. systemic cooperative actions among competitors: the case of a furniture cluster in brazil. journal of industry, competition and trade (accepted, online first). doi:10.1007/s10842-018-0272-9
[4] sellitto m., luchese j., bauer j., saueressig g., viegas c. ecodesign practices in a furniture industrial cluster of southern brazil: from incipient practices to improvement. journal of environmental assessment policy and management 2017; 19:1750001. doi:10.1142/s1464333217500016
[5] henrich p., land m., gaalman g. grouping machines for effective workload control. int j prod econ 2006; 104:125–142. doi:10.1016/j.ijpe.2004.11.006
[6] stevenson m., huang y., hendry l., soepenberg e. the theory and practice of workload control: a research agenda and implementation strategy. int j prod econ 2011; 131:689–700. doi:10.1016/j.ijpe.2011.02.018
[7] hendry l., land m., stevenson m., gaalman g. investigating implementation issues for workload control (wlc): a comparative case study analysis. int j prod econ 2008; 112:452–469. doi:10.1016/j.ijpe.2007.05.012
[8] perona m., saccani n., bonetti s., bacchetti a. manufacturing lead time shortening and stabilisation by means of workload control: an action research and a new method. prod plan control 2016; 27:660–670. doi:10.1080/09537287.2016.1166283
[9] wiendahl h., glässner j., petermann d. application of load-oriented manufacturing control in industry. prod plan control 1992; 3:118–129. doi:10.1080/09537289208919381
[10] haskose a., kingsman b., worthington d. performance analysis of make-to-order manufacturing systems under different workload control regimes. int j prod econ 2004; 90:169–186. doi:10.1016/s0925-5273(03)00052-5
[11] bechte w.
theory and practice of load-oriented manufacturing control. int j prod res 1988; 26:375–395. doi:10.1080/00207548808947871
[12] land m., stevenson m., thürer m. integrating load-based order release and priority dispatching. int j prod res 2014; 52:1059–1073. doi:10.1080/00207543.2013.836614
[13] thürer m., silva c., stevenson m., land m. improving the applicability of workload control (wlc): the influence of sequence-dependent set-up times on workload controlled job shops. int j prod res 2012; 50:6419–6430. doi:10.1080/00207543.2011.648275
[14] thürer m., stevenson m. workload control in job shops with re-entrant flows: an assessment by simulation. int j prod res 2016; 54:5136–5150. doi:10.1080/00207543.2016.1156182
[15] shekarian e., kazemi n., abdul-rashid s., olugu e. fuzzy inventory models: a comprehensive review. appl soft comput 2017; 55:588–621. doi:10.1016/j.asoc.2017.01.013
[16] michalski g. value maximizing corporate current assets and cash management in relation to risk sensitivity: polish firms case. econ comput econ cyb 2014; 48:259–276. doi:10.2139/ssrn.2442862
[17] michalski g. full operating cycle influence on food and beverages processing firms characteristics. agric econ 2016; 62:71–77. doi:10.17221/72/2015-agricecon
[18] michalski g. risk pressure and inventories levels. influence of risk sensitivity on working capital levels. econ comput econ cyb 2016; 50:189–196.

acta polytechnica vol. 42 no. 2/2002

a mathematical approach for evaluation of surface topography parameters
a. k. haghi
the probability characteristics of surface topography parameters described by the composition of the deterministic component and the homogeneous random normal field were analysed. formulae for the calculation of the mathematical expectation of the ras parameter and the evaluation of its variance are given.
keywords: mathematical approach, surface topography, deterministic component.

1 introduction
nominally flat surfaces are widely used in practice. they can be mathematically described by the composition of the deterministic and random components of irregularities in a given cartesian coordinate system:

h(x, y) = s(x, y) + ξ(x, y), (1)

where s(x, y) is the deterministic function of the surface (x, y) coordinates and ξ(x, y) is a homogeneous random normal field. the parameters of surface irregularities measured over the whole surface (the topographic parameters) characterize the functional properties of the surface more positively than the profile parameters of h. surface deviations are functions of the two coordinates (x, y), and therefore a profile evaluation gives incomplete information about the surface.

2 surface topography parameter measurement
for surface topography parameter measurement it is necessary to determine the actual value of the parameter and to know the accuracy of the measurements. in this case the analogue mean value is taken as the actual value of the parameter, and the measurement error is determined by its systematic and random components.
the series of the surface topographic parameters can be represented as an averaging operator of a generalized transformation g{h(x, y)} of the surface coordinates over a given rectangular area l1 × l2 of the surface, with sides l1 and l2 [1]:

p_s = (1/(l1 l2)) ∫₀^{l1} ∫₀^{l2} g{h(x, y)} dx dy. (2)

since there is a random component on the measured surface, the measured topographic parameter is a random value, which is characterized by the mathematical expectation e(p_s) and the variance d(p_s). therefore, one of the problems in measuring the topographic parameter is the determination of its probability characteristics, i.e., the mathematical expectation e(p_s) and the variance d(p_s). it is known that the mathematical expectation of the parameters given by equation (2) can be derived by integration of the mathematical expectation e(g) of the transformation g{h(x, y)} in equation (2) over the x and y variables. as an example of the application of measuring methods of the topographic parameters, the parameter p_s = ras is used; this is the arithmetic mean deviation of the surface coordinates from the mean plane:

e(ras) = (1/(l1 l2)) ∫₀^{l1} ∫₀^{l2} e{|h(x, y)|} dx dy, (3)

where |h(x, y)| is the absolute value of the h(x, y) surface coordinate. this expression can also be evaluated for the h(x, y) surface:

e{|h(x, y)|} = σ √(2/π) exp(−s²(x, y)/(2σ²)) + s(x, y) Φ(s(x, y)/σ), (4)

where Φ(z) = (2/√(2π)) ∫₀^{z} exp(−t²/2) dt is the laplace function. the generalized transformation g{h(x, y)} is a random field which has the correlation function defined as [3]

k_g(x1, x2, y1, y2) = ∫∫ g(h1) g(h2) f(h1, h2) dh1 dh2 − e{g(h1)} e{g(h2)}, (5)

where h1 = h(x1, y1) and h2 = h(x2, y2) are the coordinates of the surface at the points (x1, y1) and (x2, y2), and the distribution density f(h1, h2) expands into a series in terms of hermite polynomials. the correlation function (5) can be represented as [3]

k_g(x1, x2, y1, y2) = Σ_{n=1}^{∞} (c_n(x1, y1) c_n(x2, y2)/n!) ρⁿ(τ1, τ2), (6)

c_n(x1, y1) = (1/(σ√(2π))) ∫_{−∞}^{+∞} g(h) h_n((h − s)/σ) exp(−(h − s)²/(2σ²)) dh, (7)

where τ1 = x2 − x1, τ2 = y2 − y1, σ is the r.m.s. deviation of the random component ξ(x, y), ρ(τ1, τ2) is the correlation coefficient of the random component ξ(x, y), h = h(x, y) and s = s(x, y). for the generalized transformation g{h(x, y)} = |h(x, y)|, which determines the ras parameter, the coefficients c_n in equation (7) take the following form after the transformation:

c₁(x1, y1) = σ Φ(s(x1, y1)/σ),
c_n(x1, y1) = σ √(2/π) h_{n−2}(s(x1, y1)/σ) exp(−s²(x1, y1)/(2σ²)), n = 2, 3, …, (8)

where h_n(z) are hermite polynomials.
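formula (4) is easy to check numerically: with the laplace function expressed as Φ(z) = erf(z/√2), the closed form agrees with a monte-carlo estimate (a sketch; the parameter values and the sample size are arbitrary).

```python
import math, random

def mean_abs(s, sigma):
    """eq. (4): E|h| for h ~ N(s, sigma^2); Phi(z) = erf(z / sqrt(2))."""
    return (sigma * math.sqrt(2.0 / math.pi) * math.exp(-s * s / (2 * sigma * sigma))
            + s * math.erf(s / (sigma * math.sqrt(2.0))))

# monte-carlo cross-check with arbitrary values s = 1.5, sigma = 1.0
random.seed(0)
s, sigma = 1.5, 1.0
sample = [abs(random.gauss(s, sigma)) for _ in range(200_000)]
print(mean_abs(s, sigma), sum(sample) / len(sample))   # both ~1.559
```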
the variance of the parameter p_s is determined by integration of the correlation function k_g(x1, x2, y1, y2) of the generalized transformation g{h(x, y)} over the variables x1, x2, y1, y2:

d(p_s) = (1/(l1² l2²)) ∫₀^{l1} ∫₀^{l1} ∫₀^{l2} ∫₀^{l2} k_g(x1, x2, y1, y2) dx1 dx2 dy1 dy2. (9)

calculation of the integral in equation (9) involves considerable difficulties; therefore, in the general case, formula (9) is not suitable for calculations. however, the cases s(x, y) ≫ σ and s(x, y) ≪ σ offer simplified evaluations of the variance. thus, for s(x, y) ≫ σ, equation (8) gives

c₁ ≈ σ, c_n ≈ 0, n = 2, 3, …, (10)

and the correlation function of the transformation g{h(x, y)} from equation (6) is approximately

k_g(τ1, τ2) ≈ σ² ρ(τ1, τ2). (11)

for s(x, y) ≪ σ, equation (8) gives c₁ ≈ 0, c_n ≈ σ √(2/π) h_{n−2}(0), and the correlation function of the transformation g{h(x, y)} from equation (6) is

k_g(τ1, τ2) ≈ Σ_{n=2}^{∞} (c_n²/n!) ρⁿ(τ1, τ2). (12)

in both cases the correlation function k_g(τ1, τ2) is a function of only the two variables τ1 and τ2. thus, for the above-mentioned approximations, d(p_s) can be calculated using the formula

d(p_s) = (4/(l1 l2)) ∫₀^{l1} ∫₀^{l2} (1 − τ1/l1)(1 − τ2/l2) k_g(τ1, τ2) dτ1 dτ2. (13)

the following notation is now used:

s_kg = (1/k_g(0, 0)) ∫₀^{∞} ∫₀^{∞} k_g(τ1, τ2) dτ1 dτ2. (14)

with l1 l2 ≫ s_kg, equation (13) may be written approximately as

d(p_s) ≈ 4 k_g(0, 0) s_kg/(l1 l2), (15)

where k_g(0, 0) is the variance of g{h(x, y)}. it is not possible to carry out analogue measurements of the topographic parameter p_s over the whole surface; therefore the analogue-discrete and discrete methods are the only ones that can be used to measure the topographic parameters. with analogue-discrete measurements the averaging operator is

p̂_s = (1/(n1 l2)) Σ_{i=1}^{n1} ∫₀^{l2} g{h(iΔ1, y)} dy, (16)

where Δ1 is the sampling interval between the profiles and n1 is the number of profiles. the evaluation of the parameter in equation (2) is in the general case therefore biased, which poses the task of determining the probability characteristics of the estimate: for its mathematical expectation and variance it is necessary to determine the shift of the value of the evaluation (16) with respect to that of equation (2). we regard the probability characteristics of the evaluation (16) for the particular transformation

g{h(iΔ1, y)} = |h(iΔ1, y)|; (17)

r̂as represents the analogue-discrete evaluation of the topographic parameter ras. the mathematical expectation of the transformation (17) is determined by the expression in equation (4) with s(x, y) = s(iΔ1, y). the mathematical expectation of the r̂as parameter is then

e(r̂as) = (1/(n1 l2)) Σ_{i=1}^{n1} ∫₀^{l2} e{|h(iΔ1, y)|} dy. (18)

the integrals in equation (18) cannot be evaluated in closed form in general, but if the deterministic component s(iΔ1, y) is linearized with a small error in the y argument on a finite number of intervals of length Δ2, it is possible to obtain an exact expression for the evaluation e(r̂as) of the topographic parameter ras.
For this, we expand the deterministic component $s(i\Delta_1, y)$ in its Taylor series in the $y$ argument up to the linear terms at the points $j\Delta_2$, $j = 1, 2, \ldots, N$:

$$s(i\Delta_1, y) \approx s(i\Delta_1, j\Delta_2) + \frac{\partial s(i\Delta_1, j\Delta_2)}{\partial y}\,(y - j\Delta_2). \quad (19)$$

By substituting equation (19) into equation (18) and taking into account equation (4), we obtain

$$E(\hat R_{as}) = \frac{1}{N_1 L_2}\sum_{i=1}^{N_1}\sum_{j=1}^{N}\int_{(j-1)\Delta_2}^{j\Delta_2}\Biggl[\sigma\sqrt{\frac{2}{\pi}}\exp\Bigl(-\frac{(\alpha_{ij}+\beta_{ij}y)^2}{2\sigma^2}\Bigr) + (\alpha_{ij}+\beta_{ij}y)\,\Phi\Bigl(\frac{\alpha_{ij}+\beta_{ij}y}{\sigma}\Bigr)\Biggr]\mathrm{d}y, \quad (20)$$

where

$$\alpha_{ij} = s(i\Delta_1, j\Delta_2) - \frac{\partial s(i\Delta_1, j\Delta_2)}{\partial y}\,j\Delta_2, \qquad \beta_{ij} = \frac{\partial s(i\Delta_1, j\Delta_2)}{\partial y}. \quad (21)$$

The integrals in both components of expression (20) are tabulated [4]. After the transformation we obtain a closed-form expression (22) for $E(\hat R_{as})$, composed of the functions $\Phi(z)$ and $\exp(-z^2)$ evaluated at the endpoint values $a_{ij} = \alpha_{ij} + \beta_{ij}\,j\Delta_2$ and $b_{ij} = \alpha_{ij} + \beta_{ij}(j-1)\Delta_2$ of the linearized deterministic component, scaled by $1/(\sqrt{2}\sigma)$; the standard calculation program for the $\Phi(z)$ and $\exp(-z^2)$ functions can then be used to calculate $E(\hat R_{as})$ on a computer. Formula (22) shows that the mathematical expectation of the parameter $\hat R_{as}$ depends on the derivative $\partial s/\partial y$ of the deterministic component, on the deterministic component $s(i\Delta_1, y)$ itself, and on the random component through $\sigma^2$.

With a definite relationship between the deterministic and the random components, an approximate evaluation of $E(\hat R_{as})$ can be obtained. Thus, for $s(x,y) \gg \sigma$, linearizing equation (4) gives

$$E(\hat R_{as}) \approx \frac{1}{N_1 L_2}\sum_{i=1}^{N_1}\int_0^{L_2} s(i\Delta_1, y)\,\mathrm{d}y. \quad (23)$$

For $s(x,y) \ll \sigma$, linearizing equation (4) gives

$$E\bigl(|h(x,y)|\bigr) \approx \sigma\sqrt{\frac{2}{\pi}}\Bigl(1 + \frac{s^2(x,y)}{2\sigma^2}\Bigr), \quad (24)$$

and then

$$E(\hat R_{as}) \approx \sigma\sqrt{\frac{2}{\pi}}\Bigl(1 + \frac{1}{2\sigma^2 L_1 L_2}\int_0^{L_1}\!\!\int_0^{L_2} s^2(x,y)\,\mathrm{d}x\,\mathrm{d}y\Bigr). \quad (25)$$

To analyse the dependence of $E(\hat R_{as})$ on $s/\sigma$, and to determine what relationship between $s$ and $\sigma$ is needed in order to make formulae (23) and (25) applicable to the determination of the mathematical expectation $E(\hat R_{as})$, calculations were carried out using formulae (22), (23) and (25) on mathematical models of the composite surface, together with the experimental calculation of the parameter

$$\hat R_{as} = \frac{1}{N_1 N_2}\sum_{i=1}^{N_1}\sum_{j=1}^{N_2}|h_{ij}|, \quad (26)$$

where $h_{ij}$ is the deviation of the $h(x,y)$ surface from the mean plane at the discrete points. The random component was modelled with the help of a random number generator; the deterministic component was designated by formula (19).

Fig. 1: Representation of surface roughness measurement (area 5.06 mm × 5.06 mm)
Fig. 2: Representation of surface roughness measurement (area 2.9 mm × 2.5 mm)
Figs. 3 and 4: A 3D topographical image
Fig. 5: Representation of a 3D roughness test on a large scale
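The comparison described above is easy to reproduce qualitatively. The sketch below is my construction, not the paper's actual model: it builds a composite surface h = s + ξ on a grid, evaluates the discrete estimate (26), and sets it against the asymptotic formulae (23) and (25) in the two regimes:

```python
import numpy as np

rng = np.random.default_rng(2)
N1, N2 = 200, 200
x = np.linspace(0, 1, N1)[:, None]
y = np.linspace(0, 1, N2)[None, :]

def r_as_hat(amplitude, sigma):
    # composite surface: deterministic waviness s(x,y) plus white normal noise
    s = amplitude * np.sin(2*np.pi*3*x) * np.sin(2*np.pi*3*y)
    h = s + sigma * rng.standard_normal((N1, N2))
    est = np.abs(h).mean()                        # equation (26)
    f23 = np.abs(s).mean()                        # equation (23), regime s >> sigma
    f25 = sigma*np.sqrt(2/np.pi)*(1 + (s**2).mean()/(2*sigma**2))  # equation (25)
    return est, f23, f25

print(r_as_hat(amplitude=0.05, sigma=1.0))  # s << sigma: estimate matches formula (25)
print(r_as_hat(amplitude=10.0, sigma=1.0))  # s >> sigma: estimate matches formula (23)
```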
3 Discussion

• If the deterministic component is a piecewise linear function and the random component is a homogeneous normal field, then the mathematical expectation of the parameter $\hat R_{as}$ increases with increasing $s/\sigma$.
• If $s/\sigma \le 1$, the influence of the deterministic component can be ignored when $E(\hat R_{as})$ is calculated, and the error is less than 10 %.
• When $s/\sigma \le 3$, formula (25) can be used to calculate $E(\hat R_{as})$ with an error of less than 10 %.
• If $s/\sigma \ge 4$, $E(\hat R_{as})$ can be evaluated with an error of less than 10 % from the deterministic component alone (formula (23)).
• Formula (22) holds for the whole range of $s/\sigma$.
• The experimental calculations using formula (26) show full agreement with the theoretical expressions.
• Optical profilometers based on these principles, shown in Figures 1–6, can be used to measure the statistical parameters of rough surfaces. The contrast is found to be related to the surface roughness when the coherence length of the light is comparable to it in magnitude.

4 Conclusion

The theoretical dependence of the mathematical expectation of the parameter $\hat R_{as}$ has been determined for the case when the deterministic component $s(x,y)$ can be linearized on separate sections. The dependence of the evaluation bias of the parameter $\hat R_{as}$ on the relationship between the random and deterministic components has also been determined.

References
[1] Lukyan, V.: Measurement of the Surface Topography. ICTE, 2000, p. 103–154.
[2] Lewis, J.: Fundamentals of Microgeometry. RPI, 2001, p. 32–65.
[3] Zahedi, M.: Mesure sans contact de la topographie d'une surface. IUT de Belfort, 2000, p. 43–78.
[4] Galat, T.: Optical surface roughness determination. Applied Optics, 2000.

Dr. A. K. Haghi
e-mail: haghi@kadous.gu.ac.ir
Guilan University, P.O. Box 3756, Rasht, Iran

Fig. 6: Microtopograph of a rough surface

Acta Polytechnica 57(6):373–378, 2017, doi:10.14311/ap.2017.57.0373

Functional realizations of Lie algebras as Noether point symmetries of systems

Rutwig Campoamor-Stursberg

Instituto de Matemática Interdisciplinar-UCM, Plaza de Ciencias 3, E-28040 Madrid, Spain
correspondence: rutwig@ucm.es

Abstract. Functional realizations of Lie algebras are applied to the problem of determining Lie and Noether point symmetries of Lagrangian systems in N dimensions, particularly in the plane. This encompasses both the case of symmetry-preserving perturbations of a given system, as well as the generic analysis of the structure of (regular) Lagrangians that admit a symmetry algebra belonging to a specific isomorphy class.

Keywords: Lagrangian; Lie point symmetry; Noether symmetry; perturbation.

1. Introduction

The study of the precise relation between constants of the motion of a dynamical system and the symmetry properties of the corresponding equations of the motion (alternatively, of an associated Lagrangian) goes back to the seminal paper of E. Noether [1], a key result that, combined with the Lie group theoretical approach to differential equations, has become an essential tool in many branches of applied mathematics and mechanics (see, e.g., [2, 3] and references therein). Usually, the Noether symmetry analysis is carried out mainly for conservative systems, due to the requirements of the Hamiltonian formalism and the corresponding quantization of the systems [4]. However, there is no restriction, at least from the Lie-algebraic point of view, to treating both conservative and dissipative systems simultaneously.
This can be done by introducing appropriate functional realizations of Lie algebras, the symmetry generators of which depend on the coordinates of the extended configuration space, and by considering the constraints imposed by this time dependence. This extended approach allows us to cover physically relevant systems, such as time-modulated oscillators, and has further applications in the context of perturbation theory, such as the study of some geometric properties of the orbits of a given system [5].

Within the context of inverse problems in dynamics [5, 6], in this work we consider an inverse approach to dynamics based on functional realizations of Lie algebras that are required to satisfy the Noether symmetry condition for a (regular) Lagrangian system. This allows two approaches: either starting from a given system and analyzing perturbations that preserve certain of the symmetries, or determining families of systems invariant with respect to these realizations. For the purpose of illustration, the plane case is considered, although the ansatz can also be formulated in arbitrary dimensions.

2. Lie and Noether point symmetries of systems

Let $L(t, q, \dot q)$ be a regular Lagrangian in $N$ dimensions and let

$$\frac{\mathrm{d}}{\mathrm{d}t}\Bigl(\frac{\partial L}{\partial \dot q^i}\Bigr) - \frac{\partial L}{\partial q^i} = 0, \quad 1 \le i \le N, \quad (1)$$

be the corresponding equations of the motion. By regularity, we can always rewrite the equations of motion in normal form, i.e.,

$$\ddot q^i = \omega^i(t, \dot q^j, q^j) = g^{ij}\Bigl(\frac{\partial L}{\partial q^j} - \frac{\partial^2 L}{\partial t\,\partial\dot q^j} - \frac{\partial^2 L}{\partial\dot q^j\,\partial q^k}\dot q^k\Bigr), \quad (2)$$

for $1 \le i \le N$, with $g^{ij}$ denoting the inverse Hessian matrix of $L$. As is well known, the system (2) can be reformulated in equivalent form as the first-order partial differential equation

$$A f = \Bigl(\frac{\partial}{\partial t} + \dot q^i\frac{\partial}{\partial q^i} + \omega^i\frac{\partial}{\partial\dot q^i}\Bigr)f = 0. \quad (3)$$

We call a vector field $X = \xi(t,q)\frac{\partial}{\partial t} + \eta^j(t,q)\frac{\partial}{\partial q^j} \in \mathfrak{X}(\mathbb{R}^{N+1})$ a Lie point symmetry of the equations (2) if its first prolongation $\dot X = X + \dot\eta^j(t,q,\dot q)\frac{\partial}{\partial\dot q^j}$, with $\dot\eta^j = -\frac{\mathrm{d}\xi}{\mathrm{d}t}\dot q^j + \frac{\mathrm{d}\eta^j}{\mathrm{d}t}$, satisfies the commutator

$$[\dot X, A] = -\frac{\mathrm{d}\xi}{\mathrm{d}t}A. \quad (4)$$

Lie point symmetries span a finite-dimensional Lie algebra $\mathcal{L}_{PS}$, called the Lie (point) symmetry algebra of the system, which can be seen to be always a subalgebra of the simple algebra $\mathfrak{sl}(N+2,\mathbb{R})$ corresponding to the free system $\ddot q = 0$ [7, 8]. Whenever the system arises from a Lagrangian, a constant of the motion of (2) is defined as a function $F(t,q,\dot q)$ satisfying the condition

$$\frac{\mathrm{d}F}{\mathrm{d}t} = \frac{\partial F}{\partial t} + \dot q^i\frac{\partial F}{\partial q^i} + \ddot q^i\frac{\partial F}{\partial\dot q^i} = A(F) = 0. \quad (5)$$

In the following, we will mainly consider Lie point and Noether point symmetries and the conservation laws associated to them. In this context, a vector field

$$X = \xi(t,q)\frac{\partial}{\partial t} + \eta^j(t,q)\frac{\partial}{\partial q^j} \quad (6)$$

is called a Noether point symmetry of (2) if it satisfies the constraint

$$\dot X(L) + A(\xi)\,L - A(V) = 0 \quad (7)$$

for some function $V(t,q)$ that does not depend on the velocities, and which we usually refer to as the gauge term of the symmetry generator. Expanding the symmetry condition (7) provides the following partial differential equation:

$$\xi(t,q)\frac{\partial L}{\partial t} + \eta^j(t,q)\frac{\partial L}{\partial q^j} + \dot\eta^j(t,q,\dot q)\frac{\partial L}{\partial\dot q^j} + \frac{\mathrm{d}\xi}{\mathrm{d}t}L(t,q,\dot q) - \frac{\partial V}{\partial t} - \dot q^j\frac{\partial V}{\partial q^j} = 0.$$

The first Noether theorem (see e.g. [1]) states that to any symmetry (6) there corresponds a constant of the motion determined by the rule

$$J = \xi\Bigl(\dot q^k\frac{\partial L}{\partial\dot q^k} - L\Bigr) - \eta^k\frac{\partial L}{\partial\dot q^k} + V(t,q). \quad (8)$$

It follows in particular that a constant of the motion $J$ is an invariant of $\dot X$, i.e., the condition $\dot X(J) = 0$ is satisfied.
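Conditions (7) and (8) are mechanical to check with a computer algebra system. The following sympy sketch is an illustration of mine, not taken from the paper: it verifies them for the 1D free particle L = q̇²/2 with the candidate symmetry X = t²∂t + tq∂q and gauge term V = q²/2, and confirms that the resulting J of (8) is conserved along q̈ = 0:

```python
import sympy as sp

t, q, qd = sp.symbols('t q qdot')
L = qd**2 / 2                       # free particle Lagrangian
xi, eta, V = t**2, t*q, q**2 / 2    # candidate Noether symmetry and gauge term

# total time derivative along the flow of qddot = 0
D = lambda F: sp.diff(F, t) + qd*sp.diff(F, q)        # + 0 * dF/dqdot
eta_dot = -D(xi)*qd + D(eta)                          # prolongation coefficient

# Noether condition (7): Xdot(L) + (dxi/dt) L - dV/dt = 0
cond = (xi*sp.diff(L, t) + eta*sp.diff(L, q) + eta_dot*sp.diff(L, qd)
        + D(xi)*L - D(V))
print(sp.simplify(cond))            # -> 0

# first integral (8) and its conservation on qddot = 0
J = xi*(qd*sp.diff(L, qd) - L) - eta*sp.diff(L, qd) + V
print(sp.simplify(J), sp.simplify(D(J)))   # -> (q - t*qdot)**2/2 and 0
```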
2.1. Perturbations that preserve symmetry subalgebras

Given a system described by a Lagrangian $L(t,q,\dot q)$ having an $r$-dimensional Lie algebra $\mathcal{L}_{NS}$ of Noether point symmetries, and fixing a certain subalgebra $\mathcal{L}_0 < \mathcal{L}_{NS}$ of dimension $r_0 < r$, we can ask whether a perturbed Lagrangian $\hat L = L + \varepsilon S(t,q,\dot q)$ exists such that the symmetry generators $X_j$ ($1 \le j \le r_0$) of $\mathcal{L}_0$ are still Noether point symmetries of $\hat L$. Clearly, the equations of the motion (1) of $\hat L$ are given by

$$\frac{\mathrm{d}}{\mathrm{d}t}\Bigl(\frac{\partial L}{\partial\dot q^i}\Bigr) - \frac{\partial L}{\partial q^i} + \varepsilon\Bigl[\frac{\mathrm{d}}{\mathrm{d}t}\Bigl(\frac{\partial S}{\partial\dot q^i}\Bigr) - \frac{\partial S}{\partial q^i}\Bigr] = 0, \quad (9)$$

so that the symmetry condition for the subalgebra $\mathcal{L}_0$ leads, for each generator $X_k$ ($1 \le k \le r_0$), to the differential equation

$$\dot X_k(\hat L) + A(\xi_k)\hat L - A(V_k) = \dot X_k(L) + A(\xi_k)L - A(V_k) + \varepsilon\Bigl[\dot X_k(S) + \frac{\mathrm{d}\xi_k}{\mathrm{d}t}S\Bigr] = 0. \quad (10)$$

Now the vector fields $X_k$ are Noether symmetries of the Lagrangian $L$, and fixing the gauge terms $V_k(t,q)$, equation (10) simplifies to a PDE involving merely the perturbation term $S(t,q,\dot q)$:

$$\dot X_k(S) + \frac{\mathrm{d}\xi_k}{\mathrm{d}t}S = \xi_k(t,q)\frac{\partial S}{\partial t} + \eta^j_k(t,q)\frac{\partial S}{\partial q^j} + \dot\eta^j_k(t,q,\dot q)\frac{\partial S}{\partial\dot q^j} + \frac{\mathrm{d}\xi_k}{\mathrm{d}t}S(t,q,\dot q) = 0, \quad (11)$$

where $1 \le k \le r_0$. In the case of velocity-independent perturbation terms, a straightforward verification shows that a Noether symmetry is preserved only if $\frac{\partial\xi_k}{\partial q^j} = 0$ holds for all $1 \le k \le r_0$ and $1 \le j \le N$. This allows us to restrict the perturbation analysis to subalgebras $\mathcal{L}_0$ whose generators have components of the type $\xi_k = \varphi_k(t)$. This agrees with the usually observed pattern of Lie point symmetries of nonlinear second-order systems of differential equations of the type $\ddot q = \omega(t,q)$ [7, 8].
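As a concrete instance of condition (11) (my example, not one worked out in the paper), take the 1D free particle with the scaling symmetry X = 2t∂t + q∂q, a Noether symmetry of L = q̇²/2 with zero gauge term, and the velocity-independent perturbation S = 1/q². A sympy check shows that S satisfies (11), so the perturbed Lagrangian L + εS retains X as a Noether point symmetry:

```python
import sympy as sp

t, q, qd = sp.symbols('t q qdot')
xi, eta = 2*t, q                  # scaling symmetry X = 2t d/dt + q d/dq
S = 1/q**2                        # velocity-independent perturbation term

D = lambda F: sp.diff(F, t) + qd*sp.diff(F, q)
eta_dot = -D(xi)*qd + D(eta)

# condition (11): Xdot(S) + (dxi/dt) S = 0
cond = (xi*sp.diff(S, t) + eta*sp.diff(S, q) + eta_dot*sp.diff(S, qd)
        + D(xi)*S)
print(sp.simplify(cond))          # -> 0: L + eps/q**2 keeps X as a Noether symmetry
```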
3. Functional realizations of sl(2,R) as Noether symmetry algebra

As has already been pointed out in many contexts, the non-compact Lie algebra $\mathfrak{sl}(2,\mathbb{R})$ plays a relevant role within the group-theoretic analysis of differential equations, in particular concerning the (super)integrability of plane systems and the linearization analysis [9–11]. This fact suggests considering this Lie algebra more closely in the context of inverse problems, as done in [12]. For these reasons, in the following we restrict our analysis to $\mathfrak{sl}(2,\mathbb{R})$.

Prior to analyzing a functional realization of $\mathfrak{sl}(2,\mathbb{R})$, we make some observations on the generic structure of Noether point symmetries. Consider to this extent a Lagrangian adopting the kinetic form

$$T = \tfrac{1}{2}a_{ij}(q)\,\dot q^i\dot q^j, \quad (12)$$

such that $a_{11}(q)a_{22}(q) - a_{12}(q)^2 \ne 0$ holds.

Lemma 1. Let $X = \xi(t,q)\frac{\partial}{\partial t} + \eta^j(t,q)\frac{\partial}{\partial q^j}$ be a Noether point symmetry of a Lagrangian (12). Then the components have the generic structure

$$\xi(t,q) = c_1t^2 + c_2t + c_3, \qquad \eta^j(t,q) = \eta^j_1(q)\,t + \eta^j_2(q). \quad (13)$$

Moreover, the gauge term does not explicitly depend on time, i.e., $V = V(q)$.

Proof. Developing the symmetry condition (7) and keeping only the independent term, as well as the terms with the highest power in $\dot q$, we obtain $\frac{\partial V}{\partial t} = 0$ and the conditions

$$\frac{\partial\xi}{\partial q^1}a_{11}(q) = 0, \quad \frac{\partial\xi}{\partial q^2}a_{22}(q) = 0, \quad \frac{\partial\xi}{\partial q^2}a_{12}(q) + \frac{1}{2}\frac{\partial\xi}{\partial q^1}a_{22}(q) = 0, \quad \frac{\partial\xi}{\partial q^1}a_{12}(q) + \frac{1}{2}\frac{\partial\xi}{\partial q^2}a_{11}(q) = 0.$$

As $a_{11}(q)a_{22}(q) - a_{12}(q)^2 \ne 0$, a short computation shows that the condition $\frac{\partial\xi}{\partial q^1} = \frac{\partial\xi}{\partial q^2} = 0$ must necessarily be satisfied. Introducing this into the symmetry condition, the following relations are obtained for the terms linear in $\dot q$:

$$\frac{\partial\eta^2}{\partial t}a_{12}(q) + \frac{\partial\eta^1}{\partial t}a_{11}(q) - \frac{\partial V}{\partial q^1} = 0, \qquad \frac{\partial\eta^1}{\partial t}a_{12}(q) + \frac{\partial\eta^2}{\partial t}a_{22}(q) - \frac{\partial V}{\partial q^2} = 0. \quad (14)$$

Multiplying the first equation by $a_{12}$, the second by $-a_{11}$, and adding them leads to the expression

$$\frac{\partial\eta^2}{\partial t}\bigl(a_{12}^2 - a_{11}a_{22}\bigr) - a_{12}\frac{\partial V}{\partial q^1} + a_{11}\frac{\partial V}{\partial q^2} = 0, \quad (15)$$

from which we conclude that $\eta^2$ is at most linear in $t$, hence it admits a decomposition $\eta^2(t,q) = \eta^2_1(q)\,t + \eta^2_2(q)$. For $\eta^1(t,q)$ the assertion is obtained similarly. Finally, for the terms quadratic in $\dot q$ in the symmetry condition (7) we have expressions of the type

$$-\frac{1}{2}a_{ij}(q)\frac{\mathrm{d}\xi}{\mathrm{d}t} + t\,\psi_1(q) + \psi_2(q) = 0, \quad (16)$$

showing that $\xi(t)$ is at most quadratic in $t$, from which the assertion follows. We observe that the generalization of the latter result to $N$ dimensions is straightforward.

Now let $f(q)$ and $g(q)$ be arbitrary non-vanishing functions and consider the Lie algebra generated by the following vector fields:

$$X_1 = t^2\frac{\partial}{\partial t} + tf(q)\frac{\partial}{\partial q^1} + tg(q)\frac{\partial}{\partial q^2}, \qquad X_2 = \frac{1}{2}\frac{\mathrm{d}}{\mathrm{d}t}X_1 = t\frac{\partial}{\partial t} + \frac{1}{2}f(q)\frac{\partial}{\partial q^1} + \frac{1}{2}g(q)\frac{\partial}{\partial q^2}, \qquad X_3 = \frac{1}{2}\frac{\mathrm{d}^2}{\mathrm{d}t^2}X_1 = \frac{\partial}{\partial t}. \quad (17)$$

It follows at once that $[X_1, X_2] = -X_1$, $[X_1, X_3] = -2X_2$ and $[X_2, X_3] = -X_3$, showing that the Lie algebra is isomorphic to $\mathfrak{sl}(2,\mathbb{R})$ for any choice of $f$ and $g$. We observe that the structure of this $\mathfrak{sl}(2,\mathbb{R})$ Lie algebra generalizes naturally the one studied in [8, 12].
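The commutation relations can be confirmed symbolically for arbitrary f and g. A small sympy sketch (mine), with vector fields encoded as coefficient lists over (∂t, ∂q1, ∂q2):

```python
import sympy as sp

t, q1, q2 = sp.symbols('t q1 q2')
f, g = sp.Function('f')(q1, q2), sp.Function('g')(q1, q2)

X1 = [t**2, t*f, t*g]
X2 = [t, f/2, g/2]
X3 = [sp.Integer(1), sp.Integer(0), sp.Integer(0)]
vars_ = (t, q1, q2)

def bracket(X, Y):
    # Lie bracket of vector fields: [X, Y]^k = X(Y^k) - Y(X^k)
    return [sp.simplify(sum(X[i]*sp.diff(Y[k], vars_[i])
                            - Y[i]*sp.diff(X[k], vars_[i]) for i in range(3)))
            for k in range(3)]

print(bracket(X1, X2))   # -> [-t**2, -t*f, -t*g]  =  -X1
print(bracket(X1, X3))   # -> [-2*t, -f, -g]       =  -2*X2
print(bracket(X2, X3))   # -> [-1, 0, 0]           =  -X3
```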
In these conditions, it can be asked which is the most general (kinetic) Lagrangian admitting this Lie algebra as an algebra of Noether point symmetries. It suffices to impose the invariance with respect to $X_1$ and $X_3$ in order to ensure that the system is $\mathfrak{sl}(2,\mathbb{R})$-invariant. The symmetry condition (7) applied to $X_3$ is trivially satisfied for the gauge term $V(t,q) = 0$. Analyzing now the invariance with respect to $X_1$, and inspecting first the terms linear in $\dot q$, leads to the constraints

$$f(q)a_{11}(q) + g(q)a_{12}(q) - \frac{\partial V}{\partial q^1} = 0, \qquad g(q)a_{22}(q) + f(q)a_{12}(q) - \frac{\partial V}{\partial q^2} = 0$$

for the gauge term $V(q)$. As $f(q)g(q) \ne 0$, this allows us to set

$$f^2(q)a_{11}(q) - g^2(q)a_{22}(q) - f(q)\frac{\partial V}{\partial q^1} + g(q)\frac{\partial V}{\partial q^2} = 0.$$

In order to completely satisfy the symmetry condition, the functions $a_{ij}(q)$ must be solutions of the following system of PDEs:

$$f\frac{\partial a_{11}}{\partial q^1} + g\frac{\partial a_{11}}{\partial q^2} + 2a_{11}\Bigl(\frac{\partial f}{\partial q^1} - 1\Bigr) + 2a_{12}\frac{\partial g}{\partial q^1} = 0,$$
$$f\frac{\partial a_{22}}{\partial q^1} + g\frac{\partial a_{22}}{\partial q^2} + 2a_{22}\Bigl(\frac{\partial g}{\partial q^2} - 1\Bigr) + 2a_{12}\frac{\partial f}{\partial q^2} = 0,$$
$$f\frac{\partial a_{12}}{\partial q^1} + g\frac{\partial a_{12}}{\partial q^2} + a_{12}\Bigl(\frac{\partial f}{\partial q^1} + \frac{\partial g}{\partial q^2} - 2\Bigr) + a_{11}\frac{\partial f}{\partial q^2} + a_{22}\frac{\partial g}{\partial q^1} = 0. \quad (*)$$

There are in principle two different ways to analyze these systems in relation to a Lagrangian of type (12): either we fix the latter and search for functions $f(q)$ and $g(q)$ such that (17) is a Lie algebra of Noether symmetries, or we fix the components of the symmetry generators and try to determine the most general kinetic term (12) invariant under the algebra. The same problem can be formulated allowing a potential term. We shall exhibit examples of the two approaches leading to nontrivial solutions of the equations.

3.1. Separable kinetic Lagrangians

Let us illustrate the preceding situation first for the case of the separable Lagrangians

$$T = \tfrac{1}{2}\bigl(q_1^k\,\dot q_1^2 + q_2^l\,\dot q_2^2\bigr),$$

where $k, l \ne -2$ are constants. This case contains in particular that of the free Euclidean Lagrangian, which is well known to allow an $\mathfrak{sl}(2,\mathbb{R})$-subalgebra of Noether symmetries [4]. The symmetry condition (∗) thus requires solving for the unknown functions $f(q)$ and $g(q)$, as well as for the gauge term $V(q)$ of the symmetry generator $X_1$. The resulting equations are

$$l\,g(q) + 2q_2\Bigl(\frac{\partial g}{\partial q_2} - 1\Bigr) = 0, \qquad q_2^l\,g(q) - \frac{\partial V}{\partial q_2} = 0, \qquad k\,f(q) + 2q_1\Bigl(\frac{\partial f}{\partial q_1} - 1\Bigr) = 0, \qquad q_1^k\,f(q) - \frac{\partial V}{\partial q_1} = 0.$$

As these are first-order PDEs, they can be solved with standard methods (see e.g. [13]), and the solution can be expressed as

$$f(q) = \frac{2q_1}{k+2} + a_1q_1^{-k/2}, \qquad g(q) = \frac{2q_2}{l+2} + a_2q_2^{-l/2},$$
$$\frac{V(q)}{2} = \frac{q_1^{k+2}}{(k+2)^2} + \frac{q_2^{l+2}}{(l+2)^2} + \frac{a_1q_1^{(k+2)/2}}{k+2} + \frac{a_2q_2^{(l+2)/2}}{l+2}$$

for the component functions and the gauge term, respectively. Clearly, as the system related to $T$ is linearizable, it admits five additional Noether symmetries, all of them possessing a zero term in $\frac{\partial}{\partial t}$ [11].

Looking now for a potential $U(t,q)$ that preserves the symmetry, it follows at once from the $X_3$-invariance that $\frac{\partial U}{\partial t} = 0$, and thus the perturbed system is conservative. The invariance under $X_1$ implies that $U$ must satisfy the first-order differential equation (see (11))

$$f(q)\frac{\partial U}{\partial q_1} + g(q)\frac{\partial U}{\partial q_2} + 2U(q) = 0, \quad (18)$$

with the functions obtained above. A routine but tedious computation leads to the general solution

$$U(q) = \frac{\Psi(u)}{q_1^k\bigl(2q_1 + a_2(k+2)q_1^{-k/2}\bigr)^2}, \quad (19)$$

where

$$u = \frac{2^{2l^2/(2l+4)^2}\,q_2^{(1+l)/2} + a_1(l+2)}{4q_1^{(k+1)/2} + 2a_2(k+2)}. \quad (20)$$

We skip the proof that the perturbed system $T + \varepsilon U$ possesses exactly a symmetry algebra of Noether point symmetries isomorphic to $\mathfrak{sl}(2,\mathbb{R})$. We observe that, considering $k$ and $l$ as parameters, this approach also allows us to relate different dynamical systems with $(k,l) \ne (k',l')$ that have an isomorphic symmetry algebra, the corresponding symmetry generators being related by the parameterization of the functions $f(q)$ and $g(q)$.

3.2. Systems with fixed symmetry

Let us now consider the second possibility, namely, fixing the functions $f(q)$ and $g(q)$ in (17). To this extent, consider for simplicity $f(q) = q_1^n$, $g(q) = q_2^n$, where $n \ne 0$. In this case, the symmetry condition for $X_1$ leads to the system

$$\bigl(nq_i^{n-1} + nq_j^{n-1} - 2\bigr)a_{ij}(q) + q_1^n\frac{\partial a_{ij}}{\partial q_1} + q_2^n\frac{\partial a_{ij}}{\partial q_2} = 0, \qquad q_1^na_{11} + q_2^na_{12} - \frac{\partial V}{\partial q_1} = 0, \qquad q_1^na_{12} + q_2^na_{22} - \frac{\partial V}{\partial q_2} = 0$$

for $(ij) = (11), (12), (22)$. It is not too difficult to see that if $a_{11}(q) = a_{22}(q) \ne 0$ holds, then the preceding system possesses a nontrivial solution only for the values $n = 0, 1$. In order to find manageable solutions for arbitrary $n$, we thus assume that $a_{11}(q) = a_{22}(q) = 0$. Then the system can be reduced, and for $n \ne 1$ admits the solution

$$a_{12}(q) = (q_1q_2)^{-n}\exp\Bigl(\frac{-(q_1^{1-n} + q_2^{1-n})}{n-1}\Bigr), \qquad V(q) = \exp\Bigl(\frac{-(q_1^{1-n} + q_2^{1-n})}{n-1}\Bigr).$$

For $n = 1$ we merely get $a_{12}(q) = 1$ (the free pseudo-Euclidean Lagrangian) and the gauge term $V(q) = q_1q_2$. Further, considering a potential $U(q)$ requires solving the additional PDE

$$q_1^n\frac{\partial U}{\partial q_1} + q_2^n\frac{\partial U}{\partial q_2} + 2U(q) = 0, \quad (21)$$

which can easily be seen to provide the solution

$$U(q) = \exp\Bigl(\frac{2q_1^{1-n}}{n-1}\Bigr)\,\Psi\Bigl(\frac{q_1^{n-1} - q_2^{n-1}}{(q_1q_2)^{n-1}}\Bigr). \quad (22)$$

If we admit generalized potentials depending on $\dot q$, the integrability condition (11) can be separated with respect to the variable $t$, because of the $X_3$-invariance. Skipping the details, it follows from a routine computation that the most general $U$ preserving the subalgebra $\mathfrak{sl}(2,\mathbb{R})$ in the preceding realization is given by

$$U(q,\dot q) = \exp\Bigl(\frac{2q_1^{1-n}}{n-1}\Bigr)\,\Psi\Bigl(\frac{q_1^{n-1} - q_2^{n-1}}{(q_1q_2)^{n-1}},\,u_1,\,u_2\Bigr),$$

where the auxiliary variables are defined as

$$u_i = \dot q_i\,q_i^{1-n}\exp\Bigl(-\frac{2q_1^{1-n}}{n-1}\Bigr), \quad i = 1, 2. \quad (23)$$

For generic choices of the function $\Psi$, the system determined by $T - \varepsilon U$ always possesses a Noether point symmetry algebra isomorphic to $\mathfrak{sl}(2,\mathbb{R})$.
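Both quoted solutions are straightforward to verify. The sympy fragment below (mine) substitutes f(q1) = 2q1/(k+2) + a1 q1^(−k/2) into the equation k f + 2q1(∂f/∂q1 − 1) = 0 of section 3.1, and checks the potential (22) of section 3.2 against the PDE (21), with Ψ taken as the identity function for the check:

```python
import sympy as sp

q1, q2, k, n, a1 = sp.symbols('q1 q2 k n a1', positive=True)

# component function for the separable case, section 3.1
f = 2*q1/(k + 2) + a1*q1**(-k/2)
eq = k*f + 2*q1*(sp.diff(f, q1) - 1)
print(sp.simplify(eq))                 # -> 0

# potential (22) for the fixed-symmetry case, section 3.2 (Psi = identity)
U = sp.exp(2*q1**(1-n)/(n-1)) * (q1**(n-1) - q2**(n-1)) / (q1*q2)**(n-1)
pde = q1**n*sp.diff(U, q1) + q2**n*sp.diff(U, q2) + 2*U
print(sp.simplify(pde))                # -> 0
```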
4. Time-dependent Lagrangians

As follows from Lemma 1, for conservative systems a Noether point symmetry depends at most quadratically on time. If we drop the conservative character of the system, a much wider class of possibilities arises, and (explicitly time-dependent) constants of the motion can still be guaranteed whenever an appropriate subalgebra of Noether symmetries is chosen [8].

Let $\theta(t)$ and $\rho(t)$ be two arbitrary functions, and let $\{u(t), v(t)\}$ be an independent set of solutions of the second-order ODE

$$\ddot z(t) + \rho(t)\dot z(t) + \theta(t)z(t) = 0. \quad (24)$$

In these conditions, the function $\xi(t) = c_1u(t)^2 + c_2u(t)v(t) + c_3v(t)^2$ determines the general solution of the third-order ODE

$$\dddot\xi + 3\rho\,\ddot\xi + \bigl(\dot\rho + 2\rho^2 + 4\theta\bigr)\dot\xi + \bigl(4\rho\theta + 2\dot\theta\bigr)\xi = 0. \quad (25)$$

Let us consider vector fields of the generic shape

$$X = \xi(t)\frac{\partial}{\partial t} + \frac{1}{2}\dot\xi(t)\,q^i\frac{\partial}{\partial q^i}, \quad (26)$$

and define $X_i$ ($1 \le i \le 3$) as the vector field associated to the constants $c_j = \delta^j_i$. Computing the brackets, taking into account the constraint (25), we get the relations

$$[X_1, X_2] = W(u,v)\,X_1, \qquad [X_1, X_3] = 2W(u,v)\,X_2, \qquad [X_2, X_3] = W(u,v)\,X_3, \quad (27)$$

where $W(u,v) = u\dot v - \dot uv$ denotes the Wronskian of $\{u(t), v(t)\}$. Now, if $W(u,v)$ reduces to a constant $\lambda$ (a condition ensured whenever we set $\rho(t) = 0$ in equation (24); see e.g. [14], p. 512), it is straightforward to verify that the vector fields $X_1$, $X_2$ and $X_3$ span a Lie algebra isomorphic to $\mathfrak{sl}(2,\mathbb{R})$. For convenience, we further define the function

$$V(q) = \frac{1}{4}\ddot\xi(t)\bigl(q_1^2 + q_2^2\bigr),$$

which shall serve as the generic gauge term for the symmetry condition (11). Analyzing the symmetry condition (7), the vector fields $X_i$ are the symmetry generators of a Noether point symmetry of a Lagrangian $L(t,q,\dot q)$ whenever the first-order PDE

$$\frac{1}{2}\frac{\partial L}{\partial\dot q_1}\bigl(\ddot\xi q_1 - \dot\xi\dot q_1\bigr) + \frac{1}{2}\frac{\partial L}{\partial\dot q_2}\bigl(\ddot\xi q_2 - \dot\xi\dot q_2\bigr) + \xi\frac{\partial L}{\partial t} + \frac{\dot\xi}{2}\Bigl(q_1\frac{\partial L}{\partial q_1} + q_2\frac{\partial L}{\partial q_2}\Bigr) - \frac{\dddot\xi}{4}\bigl(q_1^2 + q_2^2\bigr) - \frac{1}{2}\ddot\xi\dot q_1q_1 - \frac{1}{2}\ddot\xi\dot q_2q_2 + \dot\xi L = 0 \quad (28)$$

is satisfied. The general solution of the latter is given by

$$L(t,q,\dot q) = \frac{\xi\ddot\xi - \dot\xi^2}{4\xi^2}\bigl(q_1^2 + q_2^2\bigr) + \frac{\dot\xi\,(q_1\dot q_1 + q_2\dot q_2)}{2\xi} + \frac{1}{\xi}\,\Psi\Bigl(\frac{q_1}{\sqrt\xi},\,\frac{q_2}{\sqrt\xi},\,\frac{\dot\xi q_1 - 2\dot q_1\xi}{\sqrt\xi},\,\frac{\dot\xi q_2 - 2\dot q_2\xi}{\sqrt\xi}\Bigr),$$

with $\Psi$ an arbitrary function of its arguments. We observe that the previous class of Lagrangians contains, in particular, a family giving rise to oscillatory systems with a time-dependent frequency:

$$L = \frac{1}{2}\bigl(\dot q_1^2 + \dot q_2^2\bigr) + \frac{2\ddot\xi(t)\xi(t) - 1}{\xi(t)^2}\bigl(q_1^2 + q_2^2\bigr). \quad (29)$$

It should be remarked that the realization of type (26) exhibits the most general form that the term in $\frac{\partial}{\partial t}$ can have, as follows at once from the following property:

Lemma 2. Let $X = \xi(t,q)\frac{\partial}{\partial t} + \eta^j(t,q)\frac{\partial}{\partial q^j}$ be a Noether point symmetry of a regular Lagrangian $L = a_{ij}(t,q)\dot q^i\dot q^j - U(t,q)$. Then the condition $\frac{\partial\xi}{\partial q} = 0$ always holds.

The proof is completely analogous to that of Lemma 1, and follows immediately from the inspection of the terms in the symmetry condition (7) having the highest power in the velocities $\dot q$, together with the regularity of the Lagrangian.
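Equation (25) can be verified symbolically: with ξ = uv and the second derivatives of u and v eliminated by means of (24), the left-hand side of (25) vanishes identically. A sympy sketch (mine):

```python
import sympy as sp

t = sp.symbols('t')
u, v, rho, theta = (sp.Function(n)(t) for n in ('u', 'v', 'rho', 'theta'))

u2 = -rho*sp.diff(u, t) - theta*u     # (24): u'' in terms of u, u'
v2 = -rho*sp.diff(v, t) - theta*v
rules = {
    sp.diff(u, t, 3): sp.diff(u2, t), sp.diff(v, t, 3): sp.diff(v2, t),
    sp.diff(u, t, 2): u2,             sp.diff(v, t, 2): v2,
}

def reduce(expr):
    # eliminate 3rd, then residual 2nd derivatives of u, v; two passes suffice
    return sp.expand(expr.subs(rules).subs(rules))

xi = u*v                              # u**2 and v**2 work the same way
lhs = (sp.diff(xi, t, 3) + 3*rho*sp.diff(xi, t, 2)
       + (sp.diff(rho, t) + 2*rho**2 + 4*theta)*sp.diff(xi, t)
       + (4*rho*theta + 2*sp.diff(theta, t))*xi)
print(sp.simplify(reduce(lhs)))       # -> 0
```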
5. Conclusions

In this work we have illustrated, using functional realizations of Lie algebras based on the simple Lie algebra $\mathfrak{sl}(2,\mathbb{R})$, different possibilities to formulate a kind of inverse problem in dynamics, imposing that the generators appear as Noether point symmetries of Lagrangian dynamical systems. This allows either to consider symmetry-preserving perturbations of a given system, as developed in [12], or to derive the most general Lagrangian invariant under the functional realization of the Lie algebra. The cases of conservative and dissipative systems can be treated simultaneously, considering realizations that depend explicitly on time-dependent functions.

Albeit the examples have been restricted to the plane for simplicity, there is no obstruction to formulating the problem in arbitrary dimension. However, in order to ensure that the system is integrable [4], it is convenient to consider a realization of a symmetry algebra $\mathfrak{sl}(2,\mathbb{R}) \subset \mathfrak{g}$, so that formula (8) provides a sufficient number of independent constants of the motion. In this situation, it should be taken into account that this approach by means of Noether point symmetries in the $N$-dimensional (conservative) case is somewhat restricted, as the corresponding preserved symmetry algebras are subalgebras of the Noether point symmetry algebra of the free system defined by the kinetic term $T$ of the Lagrangian. For the Euclidean case, this corresponds to subalgebras of the Schrödinger algebra $\mathcal{S}(N)$, and hence any semisimple Lie algebra considered in this frame must be taken as a subalgebra of $\mathfrak{sl}(2,\mathbb{R}) \oplus \mathfrak{so}(N)$ (see e.g. [12, 15]). However, for non-Euclidean geometries the situation may differ. As an illustrative example, consider the free Lagrangian $L_0 = \frac{1}{q_N^2}\bigl(\dot q_1^2 + \cdots + \dot q_N^2\bigr)$ in the upper-half space $\mathcal{U} = \{q \in \mathbb{R}^N \mid q_N > 0\}$ endowed with the Poincaré metric

$$\mathrm{d}s^2 = \frac{1}{(q_N)^2}\bigl(\mathrm{d}q_1 \otimes \mathrm{d}q_1 + \cdots + \mathrm{d}q_N \otimes \mathrm{d}q_N\bigr). \quad (30)$$

It is well known that it admits the conformal group $SO(1,N)$ as isometry group [16], i.e., the corresponding symmetry generators are Killing vectors. It is easy to show that for any $N \ge 2$, the algebra $\mathcal{L}_{PS}$ of Lie point symmetries of the dynamical system associated to $L_0$ is isomorphic to the direct sum $\mathfrak{so}(1,N) \oplus \mathfrak{r}_2$, with $\mathfrak{r}_2$ the 2-dimensional affine Lie algebra. Only the Lie point symmetry $Y = t\frac{\partial}{\partial t}$ fails to satisfy the condition (7), as it leads to the PDE

$$-\frac{1}{2q_N^2}\sum_{k=1}^{N}\Bigl(\dot q_k^2 + \dot q_k\frac{\partial V}{\partial q_k}\Bigr) - \frac{\partial V}{\partial t} = 0, \quad (31)$$

which has no solution for a gauge term $V(t,q)$; thus the Noether point symmetries are given by the Lie algebra $\mathcal{L}_{NS} \simeq \mathfrak{so}(1,N) \oplus \mathfrak{r}$. As a consequence, for any $N \ge 2$, the algebra $\mathcal{L}_{NS}$ of Noether point symmetries of a second-order system

$$\ddot q_k = \frac{2\dot q_k\dot q_N}{q_N} - \frac{q_N^2}{2}\frac{\partial U}{\partial q_k}, \quad 1 \le k \le N-1, \qquad \ddot q_N = \frac{\dot q_N^2 - \dot q_1^2 - \cdots - \dot q_{N-1}^2}{q_N} - \frac{q_N^2}{2}\frac{\partial U}{\partial q_N}, \quad (32)$$

with Lagrangian $L = L_0 - U(t,q)$, corresponds to a subalgebra of $\mathfrak{so}(1,N) \oplus \mathfrak{r}$.
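The normal form (32) follows from the Euler–Lagrange equations of L0 − U. For N = 2 this is a short computation in sympy (my check, not part of the paper):

```python
import sympy as sp

t = sp.symbols('t')
q1, q2 = sp.Function('q1')(t), sp.Function('q2')(t)
U = sp.Function('U')(q1, q2)
L = (sp.diff(q1, t)**2 + sp.diff(q2, t)**2) / q2**2 - U   # L0 - U on the half-plane

eqs = [sp.diff(sp.diff(L, sp.diff(q, t)), t) - sp.diff(L, q) for q in (q1, q2)]
sol = sp.solve(eqs, [sp.diff(q1, t, 2), sp.diff(q2, t, 2)])
print(sp.simplify(sol[sp.diff(q1, t, 2)]))  # -> 2*q1'*q2'/q2 - (q2**2/2)*dU/dq1
print(sp.simplify(sol[sp.diff(q2, t, 2)]))  # -> (q2'**2 - q1'**2)/q2 - (q2**2/2)*dU/dq2
```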
These two cases indicate that a detailed study of the symmetry algebras of free Lagrangians corresponding to nonequivalent metrics in $N \ge 2$ dimensions would give rise to a hierarchy of Lie algebras that allows one to systematize the symmetry analysis of perturbed systems. For some important types of differential equations, this approach has already provided interesting results (see [15, 17] and references therein). Further work along these lines is currently in progress.

Finally, as follows from the symmetry condition (7), a Noether point symmetry of a regular Lagrangian $L$ necessarily possesses the generic form $X = \xi(t)\frac{\partial}{\partial t} + \eta^j(t,q)\frac{\partial}{\partial q^j}$. This suggests studying specifically realizations of Lie algebras of this type, in order to characterize those isomorphism classes of Lie algebras that appear as Noether symmetries of a system but do not correspond to an isometry generator of the associated kinetic Lagrangian. Some developments in this direction have been proposed in [18], in connection with various geometric properties. A similar approach can be found in [15] and some previous work, where symmetries of certain types of non-autonomous systems defined in Riemann spaces have been studied, providing new insights into the geometrical interpretation of dynamical quantities. An interesting problem worth analyzing in the context of the symmetry analysis of dynamical systems is the possibility of combining symmetry groups with certain geometric properties of the orbits determined by the solutions of a system [5], a question that has still not exhausted the possibilities of the group-theoretical approach.

Acknowledgements

The author expresses his gratitude to Prof. M. Znojil for the invitation to the AAMP XIV conference. During the preparation of this work, the author was financially supported by the research project MTM2016-79422-P of the AEI/FEDER (EU).

References
[1] E. Noether. Invariante Variationsprobleme. Nachr Ges Wiss Göttingen, Math-Phys Kl 1918:235–257, 1918.
[2] L. V. Ovsyannikov. Group Analysis of Differential Equations. Academic, New York, 1982.
[3] L. Dresner. Application of Lie's Theory of Ordinary and Partial Differential Equations. IOP Publishing, Bristol, 1999.
[4] A. M. Perelomov. Integrable Systems of Classical Mechanics and Lie Algebras. Birkhäuser Verlag, Basel, 1990.
[5] A. S. Galiullin. Inverse Problems of Dynamics. Mir Publishers, Moscow, 1984.
[6] S. E. Jones, B. G. Vujanovic. On the inverse Lagrangian problem. Acta Mech 73:245–251, 1988. doi:10.1007/bf01177044.
[7] V. M. Gorringe, P. G. L. Leach. Lie point symmetries for systems of second-order linear differential equations. Quaest Math 11:95–117, 1988. doi:10.1080/16073606.1988.9631946.
[8] R. Campoamor-Stursberg. On certain types of point symmetries of systems of second-order ordinary differential equations. Comm Nonlinear Sci Num Simulat 19:2602–2614, 2014. doi:10.1016/j.cnsns.2014.01.006.
[9] P. G. L. Leach. Equivalence classes of second-order ordinary differential equations with only a three-dimensional Lie algebra of point symmetries and linearisation. J Math Anal Appl 284:31–48, 2003. doi:10.1016/s0022-247x(03)00147-1.
[10] A. Ballesteros, et al. N-dimensional sl(2)-coalgebra spaces with non-constant curvature. Phys Letters B 652:376–383, 2007. doi:10.1016/j.physletb.2007.07.012.
[11] G. Gubbioti, M. C. Nucci. Are all classical superintegrable systems in two-dimensional space linearizable? J Math Phys 58:012902, 2017. doi:10.1063/1.4974264.
[12] R. Campoamor-Stursberg. Perturbations of Lagrangian systems based on the preservation of subalgebras of Noether symmetries. Acta Mech 227:1941–1956, 2016. doi:10.1007/s0070.
[13] E. Kamke. Differentialgleichungen. Lösungsmethoden und Lösungen. Band II. Akademische Verlagsgesellschaft, Leipzig, 1962.
[14] E. Kamke. Differentialgleichungen. Lösungsmethoden und Lösungen. Band I. Akademische Verlagsgesellschaft, Leipzig, 1962.
[15] L. Karpathopoulos, A. Paliathanasis, M. Tsamparlis. Lie and Noether point symmetries for a class of nonautonomous dynamical systems. J Math Phys 58:082901, 2017. doi:10.1063/1.4998715.
[16] Y. Fuji, K. Yamagishi. Killing spinors on spheres and hyperbolic manifolds. J Math Phys 27:979–981, 1986. doi:10.1063/1.527118.
[17] M. Tsamparlis, A. Paliathanasis, A. Qadir. Noether symmetries and isometries of the minimal surface Lagrangian under constant volume in a Riemannian space. Int J Geom Methods Mod Phys 12:1550033, 2015.
[18] R. Campoamor-Stursberg. An alternative approach to systems of second-order ordinary differential equations with maximal symmetry. Realizations of sl(n + 2, R) by special functions. Comm Nonlinear Sci Num Simulat 37:200–211, 2016. doi:10.1016/j.cnsns.2016.01.015.
Acta Polytechnica Vol. 43 No. 1/2003

A short note on non-isothermal diffusion models

T. Ficker

Asymptotic behaviour of the DIAL and DRAL non-isothermal models, derived previously for the diffusion of water vapour through a porous building structure, is studied under the assumption that the initially non-isothermal structure becomes purely isothermal.

Keywords: isothermal and non-isothermal diffusion, diffusion models, condensation in building structures.

1 Introduction

In a previous research communication [1], the DIAL and DRAL non-isothermal diffusion models were derived and compared with the standard isothermal models commonly used in building thermal technology. In a further related communication [2] we applied these models to the Glaser condensation scheme. The Glaser scheme enables, among others, an assessment of condensate for a one-year period. Naturally, throughout the year the structure is mostly exposed to non-isothermal states but, especially in the summer season, it may be subjected to a purely isothermal state. The DIAL and DRAL models provide the basic relations solely for non-isothermal conditions, i.e., they contain the different temperatures $T_1$, $T_2$ belonging to the opposite sides of the structure. When approaching the isothermal state ($T_2 \to T_1$), the relations give the indeterminate expression 0/0 and, at first sight, it is not clear how these relations should be applied to the isothermal state ($T_1 = T_2 = T$). This short communication is aimed at deriving asymptotic DIAL and DRAL relations holding for the isothermal state of a building structure possessing the diffusion resistance factor $\mu$.

2 Asymptotic DIAL relations

For the DIAL model the generalised diffusion resistance $r^*_{\mathrm{eff}}$ and diffusion 'conductivity' $D^*_{\mathrm{eff}}$ read [1] as expressions (1) whose temperature dependence enters through the differences $T_1^{2-n} - T_2^{2-n}$ and $T_1^{1-n} - T_2^{1-n}$, together with the constants $p_a = 98\,066.5$ Pa, $k = 8.9718\cdot10^{-10}$ m² s⁻¹ K⁻ⁿ and $n = 1.81$. Both differences vanish in the isothermal limit, so that the expressions take the form 0/0 for $T_2 \to T_1$. To determine the relations describing the isothermal state ($T_1 = T_2 = T$) it is necessary to use l'Hospital's rule,

$$\lim_{T_2\to T_1}\frac{T_1^{2-n} - T_2^{2-n}}{T_1^{1-n} - T_2^{1-n}} = \lim_{T_2\to T_1}\frac{(2-n)\,T_2^{1-n}}{(1-n)\,T_2^{-n}} = \frac{2-n}{1-n}\,T, \quad (2)$$

which leads to the isothermal result

$$D^*_{\mathrm{dif}} = k^*\,T^{\,n}, \qquad k^* = k\,p_a, \quad (3)$$

with the corresponding generalised resistance $r^*_{\mathrm{dif}}$.

3 Asymptotic DRAL relations

Repeating the same procedure as in the foregoing section, we can rewrite the original DRAL non-isothermal relations [1], which contain the indeterminate ratio $(T_1 - T_2)/(T_1^{1-n} - T_2^{1-n})$ (4), using l'Hospital's rule,

$$\lim_{T_2\to T_1}\frac{T_1 - T_2}{T_1^{1-n} - T_2^{1-n}} = \frac{T^{\,n}}{1-n}, \quad (5)$$

into the purely isothermal relations ($T_1 = T_2 = T$)

$$D_{\mathrm{eff}} = \frac{D}{\mu}, \qquad D = k\,p_a\,T^{\,n}, \quad (6)$$

together with the corresponding resistance $r_{\mathrm{eff}}$. Relation (6) exactly corresponds to the isothermal result [1], which confirms the consistency of the developed models.
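The l'Hospital limits are easy to confirm numerically. The snippet below is my check, not part of the note; it evaluates the DRAL ratio for temperatures approaching each other, with n = 1.81 (the exponent used by the models) and T in kelvin:

```python
n, T = 1.81, 293.15
for dT in (10.0, 1.0, 0.1, 0.01):
    T1, T2 = T + dT/2, T - dT/2
    ratio = (T1 - T2) / (T1**(1-n) - T2**(1-n))
    print(dT, ratio, T**n / (1-n))   # the ratio tends to T**n/(1-n), eq. (5)
```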
4 Conclusion

The derived asymptotic relations (3) and (6) represent a necessary complement to the DIAL and DRAL non-isothermal models when they are faced with the task of estimating the water condensate inside building structures exposed to purely isothermal conditions. Such problems may appear in building thermal technology when calculating the one-year balance of condensate inside building envelopes.

References
[1] Ficker, T., Podešvová, Z.: Models for non-isothermal steady-state diffusion in porous building materials. Acta Polytechnica (accepted for publication).
[2] Ficker, T., Podešvová, Z.: Modified Glaser's condensation model. Acta Polytechnica (accepted for publication).

Assoc. Prof. RNDr. Tomáš Ficker, DrSc.
phone: +420 541 147 661
e-mail: fyfic@fce.vutbr.cz
Department of Physics, Faculty of Civil Engineering, University of Technology
Žižkova 17, 662 37 Brno, Czech Republic

Acta Polytechnica Vol. 44 No. 4/2004

Programming a logical control method by a parallel process

P. Jiroušek

This paper deals with the development of the problem-oriented language PRIMAS for use in program control. It is based on virtual parallelism of the controlling program to make its hierarchical structure transparent. The author has also worked on the compiler and the control process simulator. This system enables verification of the control algorithm when there is no controlled machine and no control system. The PRIMAS language, compiler and simulator were developed and applied to real tasks in the course of the work on the author's PhD dissertation.

Keywords: higher program language, logic control, modelling and simulating.

1 Introduction

In developing any kind of industrial control system, the kind of software support used is as important as the hardware. In addition to the standard methods, solved by the compiler of linear schemes and various representations of combining equations, there is a wide range of tasks where the support of a higher program language is necessary. There are languages like C, Pascal and perhaps BASIC. The problem is their inability to represent combinatorial relations. A sufficient solution is to use parallelism for programming logical control in a higher language. PRIMAS offers not only the usual language structure but also parallel processes, timing functions, easier object access and simulation instruments.

2 A linear scheme or a programming language?

As an example, let us take a situation where our equipment is performing one operation from its list. The control program is composed of operations from this list. Running these operations depends on a combination of logical conditions. When a condition is fulfilled, the running operation should stop and a different one should start. There are usually some other parallel operations running on their own algorithms. In addition, the equipment has an alphanumerical display with a keyboard to communicate with the operator. It can also communicate with its major system.

If we want to use the linear scheme compiler or some other form of combinatorial equations compiler, difficulties may occur. We will have trouble forming sequences and cycles, and there will probably be a lack of procedures. The dialog with the operator and the internal communications, taking place concurrently with the running processes, can be performed only on the basis of commercially available producers' modules. Transparency and program service will also be difficult. There exist higher-level logical control program systems such as STEP 5 and GRAFCET; however, there are situations that these program instruments cannot deal with.

In C language the programmer writes functions like shiftleft(), drill(), etc., and puts them into a sequence. This sequence is closed into the cycle for (i = 0; i < 5; i++) and creates a readable algorithm for action(). Problems emerge if we want to bring this action to an end abruptly whenever a condition X appears. Into shiftleft(), drill() and the other actions we have to write new ifs (…). Unstructured jumps need to be solved, and program transparency suffers. Another problem is the parallel dialog with the operator and the communications. Here the interrupt function can be helpful.
However, the program can become too complicated and totally lacking in transparency.

3 Parallel tasks

The task of running the system control, the dialog and the communication, which are quite independent actions, at the same time can be solved naturally by the ability to create parallel processes. Parallel processes are also good for creating the actual control process and even serve as a tuning tool.

4 Decomposing the task

In usual multitasking, the tasks are not allowed to influence each other, and they communicate through exporting and importing files. In control processes there is typically close cooperation among the actions. If we substitute persons for actions, the whole program becomes more like a company: they have the same interests, work with the same information, and know what their colleagues do. This naive comparison proves to be surprisingly adequate when asking: "How many actions are there, what will they perform, and how do they influence each other?" We can create a "manager", a "communicator", a "service man", a "security technician", and a mass of "workers". The relations between actions are solved by the main program structure, as are some particular problems.

Let us take our example with an action conditioned by condition X. Instead of testing this condition in each procedure, we just launch a guarding parallel process. This will continually test the condition, and our action no longer needs to. When the condition is fulfilled, the guard stops the action and launches a different one, or even the same one with changed parameters. This condition can be, for example, a time limit: our action sets the time limit and the guard watches for its overrun. This type of parallel control has the advantage that the actions "see into each other's plate". In order to optimize the decomposition, sensitivity and experience are needed.

5 Modeling and simulating

Parallel processes are an excellent tool for simulating and tuning control tasks. A control process is an interaction between the controller and the controlled. In simple terms, a model consists of the controlling processes and a controlled utility; we then leave them to interact with each other. With suitable visualization of the internal processes it is possible to observe the behavior of the whole system. The use of cross-development software enables tuning when there is no controlling processor and no controlled utility. In practical terms this means that the more time is spent on development, the less time is needed when the program is brought to life: the program just works as soon as it is switched on.
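The guard pattern described in section 4 can be illustrated outside PRIMAS as well. The Python sketch below is my own illustration, not PRIMAS code: generators act as cooperatively scheduled processes, so switches happen only at explicit yield points (a mechanism PRIMAS itself realizes through compiler-placed switching points, as discussed in the next section), the shared variable can never be read half-updated, and a guard process watches condition X and stops the worker:

```python
from collections import deque

shared = {"x": False, "ticks": 0}     # memory shared by all processes

def worker():
    while True:
        shared["ticks"] += 1          # safe: no switch can occur mid-update
        yield                         # explicit switch point (cooperative)

def guard(victim, scheduler):
    while True:
        if shared["x"]:               # condition X fulfilled:
            scheduler.discard(victim) # ...stop the guarded action
            print("guard stopped worker at", shared["ticks"])
            return
        yield

class Scheduler:
    def __init__(self):
        self.ready = deque()
    def run(self, proc):
        self.ready.append(proc)
    def discard(self, proc):
        if proc in self.ready:
            self.ready.remove(proc)
    def loop(self, steps):
        for _ in range(steps):        # round-robin switching
            if not self.ready:
                return
            proc = self.ready.popleft()
            try:
                next(proc)
                self.ready.append(proc)
            except StopIteration:
                pass
            if shared["ticks"] >= 5:  # simulate the condition appearing
                shared["x"] = True

sched = Scheduler()
w = worker()
sched.run(w)
sched.run(guard(w, sched))
sched.loop(20)                        # prints: guard stopped worker at 5
```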
6 Problems when implementing parallelism

Parallelism offers many advantages. What remains is to find suitable development software to enable the implementation of parallelism. Let us focus on control systems which do not support UNIX, QNX or some other multitask operating system. These are mostly smaller and cheaper systems, mainly eight-bit, with processors such as the 8051, Z80, i80186, etc. Most producers of development software, if they do offer parallelism, enable it to run only thanks to the real-time core of their system. This core drives the switching and also the mutual communication, which separates the language from the parallelism: there are libraries for accessing the core functions, and the compiler remains unaffected by parallelism.

Sharing information in the system in this way is not problem-free. Passing information through the core by sending messages proves to be a very safe and clear method, but it is not really necessary in this case; if such a message is a single integer, sending it becomes too cumbersome. Such a variable can instead be shared between processes inside the memory. This can be done only if collisions between the producer and the receiver of the information are eliminated: the recipient cannot touch the variable at the same time as the producer is changing it. The core control has procedures to deal with this. One way to eliminate concurrent work with one variable is to apply cooperative multitasking in a suitable manner: the processor is not forced away from a process, but the process can decide whether or not surrendering it is appropriate, so the information producer holds on to the processor until the work with the data is finished. There remains the problem of how to control process switching. If the application program has to take care of this, it is not only a burden to bear but also a possible source of mistakes of all kinds, and such processes can cause long delays.

7 Parallelism in PRIMAS

To deal with the problems discussed above, the PRIMAS program system was developed. Its core consists of a higher language oriented to logical control tasks, based on the ideas of parallel programming. When looking at the source code, a slight resemblance to Pascal is recognizable: there are procedures, functions, integer and real variables, the classical if-then-else program structure, etc. In addition, bit objects have been made accessible, the program works with virtual timers, and there are also language structures which facilitate work with these particles.

In PRIMAS, parallelism is supported by the ability to define more than one process. Switching between them is ensured by the cooperative multitasking method. The decision on switching, and its exact placing, is made by the compiler according to a precisely set strategy. The programmer can also influence this placing himself, but this has been needed only minimally. Individual processes share the global variables, timers, inputs and outputs, and they can call the same procedures if these are re-entrant. Processes can launch and stop each other, and they are accessible to each other through their individual identifiers. The processes can be addressed indirectly, so they can become parameters in procedures and functions.

8 Tuning and simulating in PRIMAS

The PRIMAS program system consists of a compiler for the given control system and a universal tuning program made for MS-DOS. When simulating, one or more external files are attached to define the object placement and the way in which the objects are shown on the PC screen. The tuning program then interprets the translated processes and projects all changes onto the screen. Here we can observe how a water level is rising, how an object runs along its trajectory, or how an end switch is working. The process can then be influenced with the mouse or keyboard: when the mouse is clicked on an object showing a binary value, an immediate change follows; in other cases, a dialog line shows up and a new value is inputted from the keyboard.

9 Implementing PRIMAS

While it was being developed, PRIMAS was implemented for the i8048, i8051, Z80, Z181 and i8086 processors. The processor units of the PROMOS system by Elsaco and Arem Pro are based on these platforms. Many library functions have been prepared for PROMOS to support its specific hardware, and PRIMAS is offered as a device for developing programs for this hardware.

Example

%——————————————————————————————————————————————%
% The following program has no practical meaning. It only   %
% demonstrates commonly used elements of PRIMAS.             %
%——————————————————————————————————————————————%

processes: main maincontrol errorstatus oiling <* simulator *> ;
the tuning program then interprets the translated processes, and projects all changes onto the screen. here we can observe how the water level is rising, how the object runs along its trajectory, or how the end switch is working. then this process can be influenced with the use of the mouse or keyboard. when the mouse is clicked on an object showing a binary value, an immediate change will follow. in other cases, the dialog line shows up and a new value will be inputted from the keyboard. 7 implementing primas while it was being developed primas was implemented for i8048, i8051, z80, z181 and i8086 processors. the processor units of promos system by elsaco and arem pro are based on these platforms. many library functions have been prepared for promos, to support specific hardware. primas is offered as a device for developing programs for this hardware. 28 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 44 no. 4/2004 example %——————————————————————————————————————————————————————————————% % the following program has no practical means. it only % % demonstrates commonly used elements of primas. % %——————————————————————————————————————————————————————————————% processes: main maincontrol errorstatus oiling <* simulator *> ; © czech technical university publishing house http://ctn.cvut.cz/ap/ 29 acta polytechnica vol. 44 no. 4/2004 constants: hex[2c] name portadr 5 name cyclecount real 4.2e6 name coefficient enum status0 status1 status2; byte inputs: with address portadr name inport with address 32 name serialline with address 33 name datasl with filter hex[2d] name fport ; byte outputs: with address 1024 name outport1 ; timers: interval 100 time delaytimer alarmclock ; include “\inc\library.prm“ int: counter array[22] array[2] name element2 with address hex[f000] name videoram with address 100 name monitorstatus ; real: ra rp[20] ; bit outputs: outport1.0 name pump outport1.7 name errorsignal monitorstatus.5 name bufferfull outport1.1# name drain ; bit inputs: inport.2 name button inport.3 name loadfull inport.4# name oilpressureok serialline.0 name dataready @(123).5 name bit5onadr123 array[3].10 name improtantbit ; int function: charfromserialline(i:mask) int: i p[10]; real: r x[20]; waitfor dataready = and(datasl,mask) ; procedure: getmessage(i:length,l:signal) int: i ; i := 0 repeat array[i] := charfromserialline( hex[7f] ) i := i+1 until [i>length] bufferfull := signal ; log function: dumpenable = button * loadfull ; process oiling int: i ; real: a ; begin waitfor oilpressureok# timer := 50 set( pump ) waitfor oilpressureok+[timer=0] 9 conclusion on the basis of analyzing the process of creating a control program, the primas program system offers two means of development. first, the language for formalizing the control tasks of a given class. second, the simulation tool. the primas program language as a language for describing the algorithm for logical automatic machines offers an innovative and untraditional means. logical controlling has until now been commanded by methods deriving from finite automatic description. an example is proved by be the grafcet system, created in 1997 by the french association for economic and technical cybernetics. grafcet became an international standard for a sequence description. there are still research teams working on the grafcet system and implementing this method on to control systems. 
10 Conclusion

On the basis of analyzing the process of creating a control program, the PRIMAS program system offers two means of development: first, the language for formalizing the control tasks of a given class; second, the simulation tool.

The PRIMAS program language, as a language for describing algorithms for logical automatic machines, offers an innovative and untraditional means. Logical control has until now been commanded by methods deriving from finite automata descriptions. An example is provided by the GRAFCET system, created in 1977 by the French association for economic and technical cybernetics. GRAFCET became an international standard for sequence description, and there are still research teams working on the GRAFCET system and implementing this method on control systems. Although this system is a significant step forward compared to state diagrams, it still suffers from all the shortcomings discussed above.

PRIMAS was introduced to specialists on GRAFCET and similar systems for modeling sequence systems from the Polytechnique National de Grenoble during the Higher Education in Control Engineering seminar organized by the Faculty of Electrical Engineering at CTU. They declared that no other system similar to PRIMAS was known to them; they thought the reason might lie in "cultural" differences between logical control and computer science. They also emphasized the importance of solving the problems caused by expressing utility diversity.
[2] jiroušek p.: “programování úloh logického řízení paralelními procesy.” automa 3–4, 1996, (in czech). [3] alla h., david r.: du grasfcet aux réseaux de petri. hermes, 1989. ing. pavel jiroušek, csc. phone: +420 221 912 387 e-mail: jirousek@troja.fjfi.cz department of solid state engineering czech technical university in prague faculty of nuclear and physical engineering trojanova 13 120 00 praha 2, czech republic © czech technical university publishing house http://ctn.cvut.cz/ap/ 31 acta polytechnica vol. 44 no. 4/2004 acta polytechnica doi:10.14311/ap.2016.56.0173 acta polytechnica 56(3):173–179, 2016 © czech technical university in prague, 2016 available online at http://ojs.cvut.cz/ojs/index.php/ap covariant integral quantizations and their applications to quantum cosmology jean pierre gazeau astroparticle and cosmology, univ paris diderot, sorbonne paris cité, paris 75205, france correspondence: gazeau@apc.univ-paris7.fr abstract. we present a general formalism for giving a measure space paired with a separable hilbert space a quantum version based on a normalized positive operator-valued measure. the latter are built from families of density operators labeled by points of the measure space. we especially focus on group representation and probabilistic aspects of these constructions. simple phase space examples illustrate the procedure: plane (weyl-heisenberg symmetry), half-plane (affine symmetry). interesting applications to quantum cosmology (“smooth bouncing”) for friedmann-robertson-walker metric are presented and those for bianchi i and ix models are mentioned. keywords: integral quantization; covariance; povm; affine group; weyl-heisenberg group; coherent states; frw model; smooth bouncing. 1. introduction group theory is one of the favorite domains of j. patera and p. winternitz. this contribution to the celebration of the jiri and pavel 80th birthday emphasizes the role of group representation theory in a certain type of quantization procedures. these methods [1–9] are named (covariant) integral quantizations and offer a large scope of possibilities for mapping a classical mathematical model of a physical system to a quantum model. they are based on normalized positive operator-valued measures (povm) and present at least two attractive advantages: (1.) by employing all resources of integration and distributions, they are most appropriate when we have to deal with some geometric singularities. (2.) they afford a semi-classical phase space portrait of quantum states and quantum dynamics together with a probabilistic interpretation. moreover, their recent applications to quantum cosmology [10–15] has yielded interesting results as: • consistent quantum dynamics of isotropic, anisotropic non-oscillatory and anisotropic models of universe; • singularity resolution; • unitary dynamics without boundary conditions; • consistent semi-classical description of involved quantum dynamics. the aim of the present article is to give an overview of the methods, their main characteristics, and their promising applications to early cosmology. in section 2 we give a rapid survey of canonical quantization with the problems raised by this procedure, and we define a few basic requirements that any quantization procedure should fullfill. in section 3 we define povm-based integral quantizations of functions defined on a measure space, their subsequent semi-classical counterparts (i.e., lower symbols) and discuss their possibilities when they are compared with other methods. 
Covariant integral quantizations are then considered when the measure space is a homogeneous space for some group, and we give the example of the half-plane viewed as the affine group of the real line. Section 4 is devoted to another example, the integral quantization of functions on the plane viewed as a coset of the Weyl-Heisenberg group. In Section 5 we illustrate the method with recent applications to quantum cosmology, where the initial singularity is regularized by using the affine integral quantization. Finally, in Section 6 we sketch a more philosophical point of view on the approaches described in this contribution. A large part of the content has already been published, as exemplified by the above citations; on the other hand, the original side of the paper lies in the synthetic presentation of the method, together with some new ideas like the general (and promising) construction sketched in Section 3.2.

2. What is quantization?

2.1. Canonical quantization

The basic procedure of quantization of a classical mechanical model is well known and is named "canonical" (Heisenberg, Born, Jordan, Dirac, Weyl, etc.). Starting from a phase space or symplectic manifold, e.g. $\mathbb{R}^2$, the phase space for the motion of a particle on the line, it consists in the maps

$$\mathbb{R}^2 \ni (q,p), \quad \{q,p\} = 1 \ \mapsto \ \text{self-adjoint } (Q,P), \quad (1)$$

with the canonical commutation rule (CCR) $[Q,P] = \mathrm{i}\hbar I$, and

$$f(q,p) \ \mapsto \ f(Q,P) \ \mapsto \ (\mathrm{Sym}\,f)(Q,P). \quad (2)$$

Beyond the ordering problem [16–18] raised by the second step (symmetrisation), one should keep in mind that $[Q,P] = \mathrm{i}\hbar I$ holds true with self-adjoint $Q$, $P$ only if both have continuous spectrum $(-\infty, +\infty)$, and there is uniqueness of the solution, up to unitary equivalence (von Neumann). But then what about a singular $f$, e.g., the angle $\arctan(p/q)$? What about barriers or other impassable boundaries? The motion on a circle? In a bounded interval? On the half-line? Despite their elementary aspects, these examples leave open many questions on both the mathematical and the physical level, irrespective of the manifold quantization procedures, like path integral quantization (Feynman, thesis, 1942), geometric quantization with Weyl (1927), Groenewold (1946), Kirillov (1961), Souriau (1966), Kostant (1970), deformation quantization with Moyal (1947), Bayen, Flato, Fronsdal, Lichnerowicz, Sternheimer (1978), Fedosov (1985), Kontsevich (2003), and coherent state or anti-Wick or Toeplitz quantization with Klauder (1961), Berezin (1974).

Of course, the canonical procedure is universally accepted in view of its numerous experimental validations, one of the most famous and simplest going back to the early period of quantum mechanics with the quantitative prediction of the isotopic effect in the vibrational spectra of diatomic molecules [19]. These data validated the canonical quantization, contrary to the Bohr-Sommerfeld ansatz (which predicts no isotopic effect). Nevertheless this does not prove that another method of quantization would fail to yield the same prediction [3]. Moreover, as already mentioned above, canonical quantization is difficult, if not impossible, in many circumstances. As a matter of fact, the canonical or the Weyl-Wigner integral quantization maps $f(q)$ to $f(Q)$ (resp. $f(p)$ to $f(P)$), and so is unable to cure any kind of classical singularity. Clearly, quantization procedures may be intractable when different phase space geometries and/or topologies are considered. Self-adjointness of the basic operators is then not guaranteed anymore.
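The impossibility of realizing the CCR with finite matrices (Q and P must have fully continuous spectra) has a one-line numerical illustration, mine, with ħ = 1: in any matrix truncation the commutator cannot equal iI, because its trace would have to vanish.

```python
import numpy as np

N = 6
a = np.diag(np.sqrt(np.arange(1, N)), k=1)    # truncated lowering operator
Q = (a + a.T) / np.sqrt(2)
P = (a - a.T) / (1j * np.sqrt(2))
C = Q @ P - P @ Q
print(np.round(C.diagonal(), 10))  # 1j on all entries except the last: 1j*(1-N)
print(np.trace(C))                 # -> 0: no finite matrix pair satisfies the CCR
```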
2.2. quantization: minimal requirements

in our viewpoint, any quantization of a set $X$ (e.g., a phase space or something else) and of functions on it should meet four basic requirements:
(1.) linearity — it is a linear map
\[ \mathcal{Q} : \mathcal{C}(X) \to \mathcal{A}(\mathcal{H}), \qquad \mathcal{Q}(f) \equiv A_f, \quad (3) \]
where: • $\mathcal{C}(X)$ is a vector space of complex-valued functions $f(x)$ on the set $X$, i.e., a “classical” mathematical model; • $\mathcal{A}(\mathcal{H})$ is a vector space of linear operators (ignoring domain limitations) in some complex hilbert space $\mathcal{H}$, i.e., a “quantum” mathematical model.
(2.) unity — the map (3) is such that the function $f = 1$ is mapped to the identity operator $I$ on $\mathcal{H}$.
(3.) reality — a real $f$ is mapped to a self-adjoint operator $A_f$ in $\mathcal{H}$ or, at least, a symmetric operator.
(4.) covariance — if $X$ is endowed with symmetry, the latter should be preserved in a “certain” sense.
of course, further requirements are necessarily added, depending on the mathematical structures equipping $X$ and $\mathcal{C}(X)$ (e.g., measure, topology, manifold, closure under algebraic operations, time evolution or dynamics, etc.). moreover, a physical interpretation should be advanced about the measurement of spectra of classical $f \in \mathcal{C}(X)$ or quantum $A_f \in \mathcal{A}(\mathcal{H})$, to which the status of observables is given. finally, an unambiguous classical limit of the quantum physical quantities should exist, the limit operation being associated to a change of scale.

3. integral quantization(s)

3.1. integral quantization: general setting and povm

let $(X,\nu)$ be a measure space. suppose that there exists an $X$-labeled family of bounded operators on a hilbert space $\mathcal{H}$ resolving the identity $I$:
\[ X \ni x \mapsto M(x) \in \mathcal{L}(\mathcal{H}), \qquad \int_X M(x)\, d\nu(x) = I, \quad (4) \]
where the equality holds in a weak sense. the case we are specially interested in occurs when the $M(x)$'s are positive semi-definite and of unit trace, i.e., when
\[ M(x) \equiv \rho(x) \ \text{(density operator)}. \quad (5) \]
then, if $X$ is a space with a suitable topology, the map
\[ \mathcal{B}(X) \ni \Delta \mapsto \int_\Delta \rho(x)\, d\nu(x) \quad (6) \]
may define a normalized positive operator-valued measure (povm) on the $\sigma$-algebra $\mathcal{B}(X)$ of borel sets. the quantization of complex-valued functions $f(x)$ on $X$ is the linear map
\[ f \mapsto A_f = \int_X M(x)\, f(x)\, d\nu(x), \quad (7) \]
understood as the sesquilinear form
\[ B_f(\psi_1, \psi_2) = \int_X \langle \psi_1 | M(x) | \psi_2 \rangle\, f(x)\, d\nu(x), \quad (8) \]
defined on a dense subspace of $\mathcal{H}$. if $f$ is real and at least semi-bounded, and if the $M(x)$'s are positive operators, then the friedrichs extension of $B_f$ univocally defines a self-adjoint operator. if $f$ is not semi-bounded, there is no natural choice of a self-adjoint operator associated with $B_f$; we then need more information on $\mathcal{H}$ to solve this subtlety.
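a minimal toy illustration of the scheme (4)–(7), not drawn from the paper: take $X = \{1, \dots, n\}$ with the counting measure and $\mathcal{H} = \mathbb{C}^n$ with orthonormal basis $\{|e_k\rangle\}$. then
\[ M(k) = |e_k\rangle\langle e_k| \ \Rightarrow\ \sum_{k=1}^{n} M(k) = I, \qquad A_f = \sum_{k=1}^{n} f(k)\, |e_k\rangle\langle e_k| = \mathrm{diag}\big(f(1), \dots, f(n)\big), \]
so the quantization map sends a function on $X$ to the corresponding diagonal operator: $f = 1$ gives $I$ and a real $f$ gives a self-adjoint $A_f$, in agreement with the requirements of section 2.2.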
covariance implies symmetry. symmetry suggests a mathematical group $G$. a symmetry structure on $X$ presupposes that some group $G$ acts on $X$. our hypothesis here is that $X$ is a homogeneous space for $G$, which means that $X \sim G/H$ for $H$ a subgroup of $G$. we now examine two possible cases.

3.1.1. $X = G$: quantizing the group

let $G$ be a lie group with left invariant haar measure $d\mu(g)$, and let $g \mapsto U(g)$ be a unitary irreducible representation (uir) of $G$ in a hilbert space $\mathcal{H}$. let $M$ be a bounded operator on $\mathcal{H}$. suppose that the operator
\[ R := \int_G M(g)\, d\mu(g), \qquad M(g) := U(g)\, M\, U(g)^\dagger, \quad (9) \]
is defined in a weak sense. from the left invariance of $d\mu(g)$, the operator $R$ commutes with all operators $U(g)$, $g \in G$, and so, from schur's lemma, $R = c_M I$ with
\[ c_M = \int_G \mathrm{tr}\big(\rho_0\, M(g)\big)\, d\mu(g), \quad (10) \]
where the unit trace positive operator $\rho_0$ is chosen in order to make the integral convergent. if $c_M$ is finite and not zero, then we get the resolution of the identity:
\[ \int_G M(g)\, d\nu(g) = I, \qquad d\nu(g) := d\mu(g)/c_M. \quad (11) \]
for this reason, the operator $M$ to be $U$-transported will be designated the (admissible) fiducial operator. the method becomes more apparent when the representation $U$ is square-integrable in the following sense (this is the case, for example, for the affine group of the real line, see below). suppose that there exists a density operator $\rho$ which is admissible in the sense that
\[ c_\rho = \int_G d\mu(g)\, \mathrm{tr}\big(\rho\, \rho(g)\big) < \infty, \quad (12) \]
with $\rho(g) = U(g)\rho U^\dagger(g)$. then the resolution of the identity (11) holds with $M(g)$ and $c_M = c_\rho$. this allows a covariant integral quantization of complex-valued functions on the group:
\[ f \mapsto A_f = \int_G \rho(g)\, f(g)\, d\nu(g), \qquad d\nu(g) := \frac{d\mu(g)}{c_\rho}, \quad (13) \]
\[ U(g)\, A_f\, U^\dagger(g) = A_{U(g)f} \ \text{(covariance)}, \quad (14) \]
where
\[ \big(U(g)f\big)(g') := f(g^{-1}g') \quad (15) \]
is the regular representation if $f \in L^2(G, d\mu(g))$. this leads to a non-commutative version of the original manifold structure for $G$.

example: affine cs integral quantization. in this example the group $G$ is the affine group of the real line, also viewed as the phase space for the motion on the half-line, i.e., the half-plane $\{(q,p) \in \mathbb{R}_+^* \times \mathbb{R}\}$. the group law is defined as
\[ (q,p)(q_0,p_0) = \Big( qq_0, \ \frac{p_0}{q} + p \Big), \quad (16) \]
and the left invariant measure is the symplectic $dq\, dp$. this group has two uirs, which are both square integrable. we choose the following one:
\[ L^2(\mathbb{R}_+^*, dx) \ni \psi(x) \mapsto \big(U(q,p)\psi\big)(x) = \frac{e^{ipx}}{\sqrt{q}}\, \psi\Big(\frac{x}{q}\Big). \quad (17) \]
as a density operator we choose the rank-one projector $\rho = |\psi\rangle\langle\psi|$, where the fiducial vector, i.e., the “wavelet” $\psi$, is in $L^2(\mathbb{R}_+, dx) \cap L^2(\mathbb{R}_+, dx/x)$, and its affine transport produces the overcomplete family of affine coherent states (acs)
\[ |q,p\rangle = U(q,p)|\psi\rangle. \quad (18) \]
the acs integral quantization reads as
\[ f(q,p) \mapsto A_f = \int_{\mathbb{R}_+ \times \mathbb{R}} \frac{dq\, dp}{c_\rho}\, f(q,p)\, |q,p\rangle\langle q,p| \quad (19) \]
with $c_\rho = \int_0^\infty |\psi(x)|^2\, \frac{dx}{x}$. more details and new developments are given in the recent [20]. applications of this method to the resolution of the singularity problem on the quantum level for a series of cosmological models (frw, bianchi i, bianchi ix), for which the volume–expansion pair of variables forms the above half-plane, are given in the recent papers [10–15] and will be summarised in section 5.
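as a quick check of the admissibility condition in the affine example — with an illustrative fiducial vector of our own choosing, not one used in the cited papers — take $\psi(x) = 2x\, e^{-x}$ on $L^2(\mathbb{R}_+^*, dx)$:
\[ \int_0^\infty |\psi(x)|^2\, dx = \int_0^\infty 4x^2 e^{-2x}\, dx = 1, \qquad c_\rho = \int_0^\infty |\psi(x)|^2\, \frac{dx}{x} = \int_0^\infty 4x\, e^{-2x}\, dx = 1 < \infty, \]
so this $\psi$ is normalized and admissible, and the acs quantization (19) holds with $c_\rho = 1$.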
3.1.2. $X = G/H$: quantizing a non-trivial group coset

in the absence of square-integrability over $G$ (the trivial illustration is the weyl-heisenberg group, which underlies in particular the canonical quantization, see section 4), there exists a definition of square-integrable representations with respect to a left coset manifold $X = G/H$, with $H$ a closed subgroup of $G$ (it is the center in the weyl-heisenberg case), equipped with a quasi-invariant measure $\nu$ [2]. for a global borel section $\sigma : X \to G$ of the group, let $\nu_\sigma$ be the unique quasi-invariant measure defined by
\[ d\nu_\sigma(x) = \lambda\big(\sigma(x), x\big)\, d\nu(x), \quad (20) \]
where $\lambda(g,x)\, d\nu(x) = d\nu(g^{-1}x)$ for $g \in G$, with $\lambda(g,x)$ obeying the cocycle condition $\lambda(g_1 g_2, x) = \lambda(g_1, x)\, \lambda(g_2, g_1^{-1}x)$. let $U$ be a uir which is square integrable mod($H$) with an admissible $\rho$, i.e., $c_\rho := \int_X \mathrm{tr}\big(\rho\, \rho_\sigma(x)\big)\, d\nu_\sigma(x) < \infty$, with $\rho_\sigma(x) = U(\sigma(x))\, \rho\, U(\sigma(x))^\dagger$. then we have the resolution of the identity and the resulting quantization
\[ f \mapsto A_f = \frac{1}{c_\rho} \int_X f(x)\, \rho_\sigma(x)\, d\nu_\sigma(x). \quad (21) \]
covariance holds here too in a certain sense (see [2, chapter 11]). besides the weyl-heisenberg group, another example concerns the motion on the circle, for which $G$ is the group of euclidean displacements in the plane, i.e., the semi-direct product $\mathbb{R}^2 \rtimes \mathrm{so}(2)$, and the subgroup $H$ is isomorphic to $\mathbb{R}$ [22]. other examples involve the relativity groups, galileo, poincaré [2], 1+1 anti de sitter (unit disk and su(1,1)) [21], and 1+1 and 3+1 de sitter [23].

3.2. an example of construction of the original operator $M$ or $\rho$

let $U$ be a uir of $G$ and $\varpi(x)$ a function on the coset $X = G/H$. suppose that it allows us to define a bounded operator $M^\varpi_\sigma$ on $\mathcal{H}$ through the operator-valued integral
\[ M^\varpi_\sigma = \int_X \varpi(x)\, U(\sigma(x))\, d\nu_\sigma(x). \quad (22) \]
then, under appropriate conditions on the “weight” function $\varpi(\sigma(x))$, such that $U$ is a uir which is square integrable mod($H$) and $M^\varpi_\sigma$ is admissible in the above sense, the family of transported operators $M^\varpi_\sigma(x) := U(\sigma(x))\, M^\varpi_\sigma\, U^\dagger(\sigma(x))$ resolves the identity.

3.3. semi-classical portraits

some quantization features, e.g., spectral properties of $A_f$, may be derived, or at least well grasped, from functional properties of the lower (lieb) or covariant (berezin) symbol (it generalizes the husimi function or the wigner function)
\[ A_f \mapsto \check{f}(x) := \mathrm{tr}\big(M(x)\, A_f\big). \quad (23) \]
when $M = \rho$ (density operator), this new function is the local average of the original $f$ with respect to the probability distribution $\mathrm{tr}(\rho(x)\rho(x'))$:
\[ f(x) \mapsto \check{f}(x) = \int_X f(x')\, \mathrm{tr}\big(\rho(x)\rho(x')\big)\, d\nu(x'). \quad (24) \]
the bargmann-segal-like map $f \mapsto \check{f}$, which generalizes the berezin or heat kernel transform when $X = G$ is a lie group, is in general a regularization of the original, possibly extremely singular, $f$, and the classical limit itself means that, given one or more scale parameter(s) $\epsilon_{(i)}$ and a distance $d(f, \check{f})$, we have
\[ d(f, \check{f}) \to 0 \ \text{as} \ \epsilon_{(i)} \to 0. \quad (25) \]

3.4. comments

beyond the freedom (think of the analogy with signal analysis, where different techniques are complementary) allowed by integral quantization, the advantages of the method with regard to other quantization procedures in use are of four types. (i) the minimal amount of constraints imposed on the classical objects to be quantized. (ii) once a choice of (positive) operator-valued measure has been made, which must be consistent with experiment, there is no ambiguity in the issue, contrary to other methods in use (think in particular of the ordering problem): to one classical model corresponds one and only one quantum model. of course, different choices of (p)ovm are requested to be physically equivalent, e.g., leading to the same experimental predictions, if they are aimed at describing one specific system. (iii) the method produces in essence a regularizing effect, with the exception of certain choices, like the weyl-wigner (i.e., canonical) integral quantization. (iv) the method, through povm choices, offers the possibility to keep a fully probabilistic content. as a matter of fact, the weyl-wigner integral quantization does not rest on a povm.

4. weyl-heisenberg covariant integral quantization(s)

the weyl-heisenberg group is defined as $G_{\rm wh} = \{(s,z) \mid s \in \mathbb{R},\ z \in \mathbb{C}\}$ with the multiplication law
\[ (s,z)(s',z') = \big( s + s' + \mathrm{Im}(z\bar{z}'),\ z + z' \big). \quad (26) \]
let $\mathcal{H}$ be a separable (complex) hilbert space with orthonormal basis $e_0, e_1, \dots, e_n \equiv |e_n\rangle, \dots$,
and the corresponding lowering and raising operators defined in the usual way:
\[ a|e_n\rangle = \sqrt{n}\,|e_{n-1}\rangle, \quad a|e_0\rangle = 0, \qquad a^\dagger|e_n\rangle = \sqrt{n+1}\,|e_{n+1}\rangle, \qquad [a, a^\dagger] = I. \]
it is well known that the weyl-heisenberg group has a unique non-trivial uir $U$, up to unitary equivalence, each element in this equivalence class corresponding to a non-zero real number like the planck constant. consider the center $C = \{(s, 0) \mid s \in \mathbb{R}\}$ of $G_{\rm wh}$. then the set $X$ is the coset $X = G_{\rm wh}/C \sim \mathbb{C}$, with measure $d^2z/\pi$. choosing the trivial section $\mathbb{C} \ni z \mapsto \sigma(z) = (0,z)$, to each $z$ corresponds the (unitary) displacement ($\sim$ weyl) operator
\[ D(z) := U(\sigma(z)) = e^{z a^\dagger - \bar{z} a}, \qquad D(-z) = (D(z))^{-1} = D(z)^\dagger. \quad (27) \]
the noncommutativity of quantum mechanics is encoded in the relation
\[ D(z)D(z') = e^{\frac{1}{2}(z\bar{z}' - \bar{z}z')}\, D(z + z') = e^{z\bar{z}' - \bar{z}z'}\, D(z')D(z), \quad (28) \]
i.e., $z \mapsto D(z)$ is a projective representation of the abelian group $\mathbb{C}$. the standard (i.e., schrödinger-klauder-glauber-sudarshan) cs are defined as
\[ |z\rangle = D(z)|e_0\rangle. \quad (29) \]
we now adopt the construction sketched in section 3.2. let $\varpi(z)$ be a function on the complex plane obeying $\varpi(0) = 1$. suppose that it allows us to define a fiducial operator $M^\varpi$ on $\mathcal{H}$ through the operator-valued integral
\[ M^\varpi = \int_{\mathbb{C}} \varpi(z)\, D(z)\, \frac{d^2 z}{\pi}. \quad (30) \]
then the family of displaced $M^\varpi(z) := D(z)\, M^\varpi\, D(z)^\dagger$ under the unitary action $D(z)$ resolves the identity,
\[ \int_{\mathbb{C}} M^\varpi(z)\, \frac{d^2 z}{\pi} = I. \quad (31) \]
this is a direct consequence of $D(z)D(z')D(z)^\dagger = e^{z\bar{z}' - \bar{z}z'} D(z')$, of $\int_{\mathbb{C}} e^{z\bar{\xi} - \bar{z}\xi}\, \frac{d^2\xi}{\pi} = \pi\, \delta^2(z)$, and of $\varpi(0) = 1$ with $D(0) = I$. the resulting quantization map is given by
\[ f \mapsto A^\varpi_f = \int_{\mathbb{C}} M^\varpi(z)\, f(z)\, \frac{d^2 z}{\pi} = \int_{\mathbb{C}} \varpi(z)\, D(z)\, \overline{\mathcal{F}}_s[f](z)\, \frac{d^2 z}{\pi}, \quad (32) \]
where the symplectic fourier transform $\mathcal{F}_s$ and its space reverse $\overline{\mathcal{F}}_s$ are involved:
\[ \mathcal{F}_s[f](z) = \int_{\mathbb{C}} e^{z\bar{\xi} - \bar{z}\xi}\, f(\xi)\, \frac{d^2 \xi}{\pi}, \qquad \overline{\mathcal{F}}_s[f](z) = \mathcal{F}_s[f](-z); \]
both are unipotent: $\mathcal{F}_s[\mathcal{F}_s[f]] = f$ and $\overline{\mathcal{F}}_s[\overline{\mathcal{F}}_s[f]] = f$. we have the following covariance properties.
• translation covariance:
\[ A^\varpi_{f(z - z_0)} = D(z_0)\, A^\varpi_{f(z)}\, D(z_0)^\dagger. \quad (33) \]
• parity covariance:
\[ A^\varpi_{f(-z)} = P\, A^\varpi_{f(z)}\, P \ \ \forall f \iff \varpi(z) = \varpi(-z) \ \forall z, \quad (34) \]
where $P = \sum_{n=0}^\infty (-1)^n |e_n\rangle\langle e_n|$ is the parity operator.
• complex conjugation covariance:
\[ A^\varpi_{\overline{f(z)}} = \big(A^\varpi_{f(z)}\big)^\dagger \ \ \forall f \iff \varpi(-z) = \overline{\varpi(z)} \ \forall z. \quad (35) \]
• for rotational covariance we define the unitary representation $\theta \mapsto U_T(\theta)$ of $\mathrm{s}^1$ on the hilbert space $\mathcal{H}$ as the diagonal operator $U_T(\theta)|e_n\rangle = e^{i(n+\nu)\theta}|e_n\rangle$, where $\nu$ is an arbitrary real. from the matrix elements of $D(z)$ one easily proves the rotational covariance property
\[ U_T(\theta)\, D(z)\, U_T(\theta)^\dagger = D(e^{i\theta} z), \quad (36) \]
and its immediate consequence on the nature of $M^\varpi$ and the covariance of $A^\varpi_f$: the equality
\[ U_T(\theta)\, A^\varpi_f\, U_T(-\theta) = A^\varpi_{T(\theta)f}, \quad (37) \]
with $T(\theta)f(z) := f(e^{-i\theta}z)$, holds true if and only if $\varpi(e^{i\theta}z) = \varpi(z)$ for all $z, \theta$, i.e., if and only if $M^\varpi$ is diagonal.
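as a standard illustration of how the choice of $\varpi$ (equivalently of $M^\varpi$) affects the outcome — this is the textbook computation, not a result specific to this paper — quantize $f(z) = |z|^2$ with the cs choice $M^\varpi(z) = |z\rangle\langle z|$ (i.e., the transported $\rho = |e_0\rangle\langle e_0|$) and compare with the weyl (symmetric ordering) result:
\[ A^{\rm cs}_{|z|^2} = \int_{\mathbb{C}} |z|^2\, |z\rangle\langle z|\, \frac{d^2 z}{\pi} = a^\dagger a + I, \qquad A^{\rm weyl}_{|z|^2} = \tfrac{1}{2}\big(a^\dagger a + a a^\dagger\big) = a^\dagger a + \tfrac{1}{2} I . \]
both are acceptable quantizations of the same classical $|z|^2$; they differ only by a constant shift of the ground level, which illustrates that distinct povm choices can be physically equivalent up to recalibration.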
5. applications of acs quantization in quantum cosmology

let us consider the friedmann-robertson-walker model filled with a barotropic fluid with equation of state $p = w\rho$. the line element is given by
\[ ds^2 = -N(t)^2 dt^2 + a(t)^2\, \delta_{ij}\, \omega^i \omega^j, \quad (38) \]
where $a(t)$ is the scale factor and the $\omega^i$ are right-invariant dual vectors. resolving the hamiltonian constraint leads to a model of a singular universe analogous to a particle moving on the half-line $(0,\infty)$. indeed, canonical coordinates $(q,p)$ can be introduced in terms of the volume $V \sim a^3$ and the expansion rate $\dot{V}/V \sim \dot{a}/a$ through the relations $V = q^{2/(1-w)}$ and $\mathcal{K} = \tfrac{3}{8}(1-w)\, p\, q^{-(1+w)/(1-w)}$. clearly, $q > 0$ and $p \in \mathbb{R}$. then the reduced hamiltonian reads as
\[ \{q,p\} = 1, \qquad H(q,p) = \alpha(w)\, p^2 + 6\tilde{k}\, q^{\beta(w)}, \quad q > 0, \]
with $\tilde{k} = \big(\int d\omega\big)^{2/3} k$, $\alpha(w) = 3(1-w)^2/32$ and $\beta(w) = 2(3w+1)/\big(3(1-w)\big)$. here $k = 0, -1$ or $1$ (in a suitable unit of inverse area) depending on whether the universe is flat, open or closed. assuming a closed universe with radiation content, $w = 1/3$ and $k = +1$, the classical isotropy singularity is cured on the quantum level by using affine cs quantization [10]. indeed, the latter regularizes the kinetic term by adding a repulsive centrifugal potential which prevents the system from reaching the value $q = 0$. precisely, with the choice of a fiducial vector $\psi$,
\[ H = \frac{1}{24}\, p^2 + 6 q^2 \ \mapsto\ A_H = \frac{1}{24}\, P^2 + \frac{K(\psi)}{24}\, \frac{1}{Q^2} + 6 M(\psi)\, Q^2, \qquad P \equiv -i \frac{d}{dx}, \quad (Q\phi)(x) \equiv x\phi(x). \quad (39) \]
the constants $K(\psi)$ and $M(\psi)$ are $> 0$ for all $\psi$, and their appearance is due to the affine cs quantization. moreover, for a $\psi$ such that $K(\psi) \geq 3/4$, the quantum hamiltonian $A_H$ is essentially self-adjoint, giving a unique unitary evolution: there is no need for boundary conditions. the semi-classical dynamics is ruled by the acs mean value of the quantum hamiltonian:
\[ \langle q,p | A_H | q,p \rangle = c(\psi) \int_{\mathbb{R}_+ \times \mathbb{R}} dq'\, dp'\, \big| \langle q',p' | q,p \rangle \big|^2\, H(q',p'), \quad (40) \]
with a displacement of the equilibrium point of the potential at $q_{\rm eq}^4 = \frac{1}{144}\, \frac{K}{M}$. the constant $c(\psi)$ is $> 0$ for all $\psi$. in figure 1, contour plots of phase space trajectories for the classical hamiltonian $H(q,p)$ (left) and the semi-classical hamiltonian (right) defined by (40) are compared.

figure 1: contour plots of the classical and semiclassical hamiltonians for the closed friedmann universe. the singularity $q = 0$ on the left is replaced with a bounce on the right.

as the result of affine quantization, and restoring arbitrariness on $w$ and $k$, we obtain two corrections to the friedmann equation, which reads in its semiclassical version as
\[ \Big( \frac{\dot{a}}{a} \Big)^2 + \mathrm{A}\, \frac{c^2 a_P^2 (1-w)^2}{V^2} + \mathrm{B}\, \frac{k c^2}{a^2} = \frac{8\pi G}{3 c^2}\, \rho, \quad (41) \]
where $\mathrm{A}$ and $\mathrm{B}$ are positive factors depending on $\psi$ and can be adjusted at will in consistency with (so far very hypothetical!) observations. the first correction is the repulsive potential, which depends on the volume, and this excludes non-compact universes from quantum modelling. we notice that as the singularity is approached, $a \to 0$, this potential grows faster ($\sim a^{-6}$) than the density of the fluid ($\sim a^{-3(1+w)}$), and therefore at some point the contraction must come to a halt. second, the curvature becomes dressed by the factor $\mathrm{B}$. this effect could in principle be observed far away from the quantum phase. however, we observe the intrinsic curvature neither in the geometry nor in the dynamics of space. nevertheless, for a convenient choice of $\psi$, this factor is $\approx 1$. also note that the form of the repulsive potential does not depend on the state of the fluid filling the universe: the origin of the singularity avoidance is quantum geometrical. the applications of the same quantization methods to more involved cosmological models like bianchi i and bianchi ix are described in [11–15]. for other approaches to quantum cosmology based on affine symmetry see [24–26].
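a rough worked consequence of (41) — our own estimate, with the curvature term neglected: at the bounce $\dot{a} = 0$, so the repulsive term must balance the fluid density. with $V = V_0 (a/a_0)^3$ and $\rho = \rho_0 (a_0/a)^{3(1+w)}$,
\[ \mathrm{A}\, \frac{c^2 a_P^2 (1-w)^2}{V_0^2}\, \Big(\frac{a_0}{a_b}\Big)^{6} = \frac{8\pi G}{3c^2}\, \rho_0 \Big(\frac{a_0}{a_b}\Big)^{3(1+w)} \ \Longrightarrow\ \Big(\frac{a_b}{a_0}\Big)^{3(1-w)} = \frac{3\, \mathrm{A}\, c^4 a_P^2 (1-w)^2}{8\pi G\, \rho_0\, V_0^2} . \]
since $3(1-w) > 0$ for $w < 1$, a finite positive bounce scale factor $a_b$ always exists: the contraction halts before $a = 0$, which is the “smooth bounce” of figure 1.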
6. conclusion

to conclude with an allusion to more “philosophical” perspectives, we think that integral quantization is just a part of a world of mathematical models for one phenomenon. indeed, the physical laws are expressed in terms of combinations of mathematical symbols, and these combinations make sense for people who have learnt these elements of mathematical language. this language is in constant development and enhancement, since the set of phenomena which are accessible to our understanding is constantly broadening. these combinations take place within a mathematical model. a model is usually scale dependent. it depends on a ratio of physical (i.e., measurable) quantities, like lengths, times, sizes, impulsions, actions, energies, etc. changing scale for a model amounts to “quantizing” or “de-quantizing” toward (semi-)classicality. one changes perspective as one would change glasses to examine one's environment.

acknowledgements

the author is indebted to jiri patera and pavel winternitz for inviting him on many occasions to share their enthusiasm in the search for symmetry. he is also indebted to h. bergeron, e. czuchry, and p. małkiewicz for having developed with them a large part of the material exposed in the present contribution.

references
[1] j.-p. gazeau: coherent states in quantum physics. wiley-vch, berlin, 2009.
[2] s. t. ali, j.-p. antoine, j.-p. gazeau: coherent states, wavelets and their generalizations. 2nd edition, theoretical and mathematical physics, springer, new york, 2013, especially chapter 11.
[3] h. bergeron, j.-p. gazeau, a. youssef: are the weyl and coherent state descriptions physically equivalent? phys. lett. a 377, 598 (2013); arxiv:1102.3556.
[4] h. bergeron, j.-p. gazeau: integral quantizations with two basic examples. annals of physics (ny) 344, 43 (2014); arxiv:1308.2348.
[5] h. bergeron, e. m. f. curado, j.-p. gazeau, l. m. c. s. rodrigues: quantizations from (p)ovm's. j. phys.: conf. ser. (2014); arxiv:1310.3304.
[6] m. baldiotti, r. fresneda, j.-p. gazeau: three examples of covariant integral quantization. proceedings of science (2014).
[7] j.-p. gazeau, b. heller: positive-operator valued measure (povm) quantization. axioms (special issue on quantum statistical inference) 4, 1 (2015); http://www.mdpi.com/2075-1680/4/1/1.
[8] m. baldiotti, r. fresneda, j.-p. gazeau: about dirac & dirac constraint quantizations. invited comment, phys. scr. 90, 074039 (2015).
[9] r. fresneda, j.-p. gazeau: integral quantizations with povm and some applications. in: proceedings of the 30th international colloquium on group theoretical methods, eds. j. van der jeugt et al., journal of physics: conference series 597, 012037 (2015).
[10] h. bergeron, a. dapor, j.-p. gazeau, p. małkiewicz: smooth big bounce from affine quantization. phys. rev. d 89, 083522 (2014); arxiv:1305.0653.
[11] h. bergeron, a. dapor, j.-p. gazeau, p. małkiewicz: smooth bounce in affine quantization of bianchi i. phys. rev. d 91, 124002 (2015); arxiv:1501.07718.
[12] h. bergeron, e. czuchry, j.-p. gazeau, p. małkiewicz, w. piechocki: smooth quantum dynamics of the mixmaster universe. phys. rev. d 92, 061302(r) (2015); arxiv:1501.02174.
[13] h. bergeron, e. czuchry, j.-p. gazeau, p. małkiewicz, w. piechocki: singularity avoidance in quantum mixmaster universe. phys. rev. d 92, 124018 (2015); arxiv:1501.07871.
[14] h. bergeron, e. czuchry, j.-p. gazeau, p. małkiewicz: inflationary aspects of quantum mixmaster universe. submitted; arxiv:1511.05790.
[15] h. bergeron, e. czuchry, j.-p. gazeau, p. małkiewicz: vibronic framework for quantum mixmaster universe. phys. rev. d 93, 064080 (2016); arxiv:1512.00304.
[16] m. born, p. jordan: on quantum mechanics. zs. f. phys. 34, 858 (1925).
[17] b. s. agarwal, e. wolf: calculus for functions of noncommuting operators and general phase-space methods in quantum mechanics. phys. rev. d 2, 2161; 2187; 2206 (1970).
[18] f. a. berezin, m. a. shubin: the schrödinger equation. kluwer academic publishers, dordrecht, 1991.
[19] g. herzberg: molecular spectra and molecular structure: spectra of diatomic molecules. 2nd edition, krieger, malabar, fl, 1989.
[20] j.-p. gazeau, r. murenzi: covariant affine integral quantization(s). accepted in j. math. phys. (2016); arxiv:1512.08274.
[21] j.-p. gazeau, m. del olmo: unit disk & su(1,1) integral quantizations. in preparation.
[22] r. fresneda, j.-p. gazeau, d. noguera: covariant integral quantization of the motion on the circle. in preparation.
[23] j.-p. gazeau, a. izadi, a. rabei, m. takook: covariant integral quantization of the motion on 1+1 de sitter space-time. in preparation.
[24] m. fanuel, s. zonetti: affine quantization and the initial cosmological singularity. eur. phys. lett. 101, 10001 (2013).
[25] j. r. klauder: an affinity for affine quantum gravity. proc. steklov institute of mathematics 272, 169–176 (2011); gr-qc/1003.261.
[26] j. r. klauder: enhanced quantization: particles, fields & gravity. world scientific, singapore, 2015.

acta polytechnica vol. 43 no. 3/2003

a general approach to study the reliability of complex systems

g. m. repici, a. sorniotti

in recent years new complex systems have been developed in the automotive field to increase safety and comfort. these systems integrate hardware and software to guarantee the best results in vehicle handling and make products competitive on the market. however, the increase in technical details and the utilization and integration of these complicated systems require a high level of dynamic control system reliability. in order to improve this fundamental characteristic, methods can be extracted from methods used in the aeronautical field to deal with reliability, and these can be integrated into one simplified method for application in the automotive field. firstly, as a case study, we decided to analyse vdc (the vehicle dynamics control system) by defining a possible approach to reliability techniques. a vdc fault tree analysis represents the first step in this activity: fta enables us to recognize the critical components in all possible working conditions of a car, including cranking, during 'key-on'–'key-off' phases, which is particularly critical for the electrical on-board system (because of voltage reduction). by associating fa (functional analysis) and fta results with a good ffa (functional failure analysis), it is possible to define the best architecture for the general system to achieve the aim of a high-reliability structure. the paper will show some preliminary results from the application of this methodology, taken from various typical handling conditions from well established test procedures for vehicles.
ke",worik: safety, systems reliability, fault tree anallsis (fta), functional analysis (fa), handling, uhicle dynamic control (vdc) 1 introduction automobiles and land vehicles in general have seen a dramatic increase in complexity in recent years. today's automobile presents a higher than evel and increasing, number of value-added features, many of which are controlled by the vehicle's electrical and electronic (e/e) system. in fact, a vehicle today has approximately twice as many e/e functions as one producedjust l0years ago. this rend requires electrical system designs that pmvide both increased functionality and increased reliability. this inflation effect has been caused mainly by two factors: the hrst is rising demands from the consumer. this has not only manifested itself through the desire for better performance or comfort, but also stems from increased awareness of safety related issues and more protection for the occupants of the vehicle. the second factor has been the development of various electronic techniques and equipment. this technology has pushed the limits imposed by the on-board systems and, specifically, has allowed the implementation of many functions controlled by hardware and software systems on board. it is common practice when buying a car nowadays to find under the hood and scattered throughout the vehicle kilometres and kilometres of cables and wires, multiple control boxes and an equally high number of sensors picking up a very wide range of physical parameters. on top of this, all the electronic systems on board a car are interfaced in some way or another with themechanical and hydraulic systems. an example of typical functions now under the control or assistance of electronics is the control function. this aflects the ride and the handling performance, above all else. when the driver makes a sudden manoeuvre, control is critical. it is just as essential in bad weather or on rough roads, especially on unpredictable road surfaces. even under normal conditions, on straightroads and turns, or during braking and acceleration, control determines the ride and handling per34 formance. often, the level of control depends on the skill of the driver. the ride and handling technologies emerging in the industry help offer significantly more control for every driver in every situation, regardless of skill []. howeve4 all this comes with a price tag. l1lis is not only in terms of the final price for the use4 but also in terms of increased design complexity that places heavier loads on the design engineers and extends the time to prototype and testing. this is a key point that has been taken as a key driver in our methodology. later we will uncover how integrating the different analyses in an intelligent way can provide a way to develop preliminary estimates early in the development. in this context the complex system identified has to be intended as the ensemble of subsystems in a vehicle integrating different and advanced functions. to give a brief idea, the list of various state-of-the-art technologies applied might include : higherand multiple-voltage power generation and storage, networked communications (multiplexing), fiber-optic communications, multi-drop wiring, networked controllers with distributed computing, standard interfaces, and mechatronics (electronics integrated into switches, connectors, sensors, and actuators). fig. 1 shows a generic vehicle encompassing a set of advanced circuitries and components. 
turning to the aerospace field, we can see how avionics [5] has been a relevant part of the development of an airplane since the late 1940s. it has since developed into a variety of lesser streams, covering the most various functions on an airplane: communications, navigation, control, etc. it is still at a level of complexity much higher than that of a car, but several systems are comparable in terms of functions and criticalities. since some of the electronics mounted on a vehicle oversee safety, the development of some specific aspects can be derived from analogous activities in the aerospace industry. in particular we would like to point out here how evolution in avionics design has shifted with hardware miniaturization and the concomitant architectural integration strategies [2].

fig. 1: modern vehicle circuitry schematics (courtesy of delphi automotive systems); the labelled components include a lateral accelerometer and a yaw rate sensor.

in its basic form, an aircraft avionics system can be viewed as a large number of interconnected computers. up to the 1980s there would be about ten computers, going up to around 30 for bigger airplanes. the capabilities developed over recent years have allowed a switch from these so-called federated architectures to a system integration approach; basically, functions can be mapped onto hardware as integrated computer nodes. the baseline for an avionics architecture can be represented as in fig. 2. this can be taken as a starting point for an analysis that will lead us to transition development methodologies aimed at taking the higher reliability levels achieved in aerospace into the automotive field. aside from the specific architecture, which is not under discussion in this work, we have recognized how some typical methodologies have been applied in a slightly different way in the aerospace field than in the automotive field. in particular, well known analyses like fa (functional analysis), fault tree analysis (fta) and failure modes and effects analysis (fmea) [9] have all been used extensively and have undergone improvements [6], [7]. most significantly, however, they are all inserted in a well structured methodology that allows results and trade information to be gathered from the very beginning, so that the overall results can be evaluated. a major advantage of this approach is the integration of all the analyses, both horizontally and vertically, over the different levels of definition from equipment up to system level. it is not necessary to recollect and reframe the results, since the analyses are all interlaced among themselves and evolve from common standpoints. at this point it is thus clear how this operational way can be exported to the automotive sector with great advantage in terms of development and overall reliability.

fig. 2: typical bus configuration (digital data bus) for elementary on-board avionics [3]
this will essentially be aimed at reaching a consistent level of reliability and safety. these are features intended for a generic system encompassing electro-mechanical components and containing consistent software functions. our purpose has been to deielop a true methodology. though not as cornplex as others jvailable in the literature [4], it will have the great advanrage of being exfemely lean and widely applicable. one of the main fea_ turcs is the possibility, which we intend to make use of, to apply it from the very beginning of the design, accompanying all the phases of the developmenr of the project. mo.eoverl our intention has been to define in an objective way, the requirements necessary to develop all the relited subsystems as well as the embedded softrvare. all of the above will rake into account the targets established at the beginning of the project. for the time being the main developmental and test benchmarks come from the automotive industry. that is to that say some requirements have been extracted directly fiom the set of initial requirements and quality levels of the automotive industry. the purpose is to obtain significant reliability data well before the resrs are launched, and in this way to identify the critical issues and act to corect them. 36 fig. 3 shows the overall logic driving the study. the framework within which we have moved is given by the basic idea of integrating from the very beginning all the analyses and activ_ ities carried out, in parallel as much as possible, for both hardware and software [10]. it is essential that the developers of the software are fully aware of what is taking place on the hardware side, and vice versa. running all the ictivities in parallel allows us to evaluate the impact on the diflerent functions early in the project. as we can see from the figure, several cross checks are carried out during the developirent of the design. these are not intended aj formal gates, but rather as check points for assessing the coherenidevelopment and exchange of information. the main drivers arc the reliability and safety requirements. they are pursued all along,-and.each analysis is aimed at implemenring, verifiing then checking compliance with the target values. the anayl sis, as shown in fig. 3, cascades from functional down to the control loop, through fault trees, and then back to reassess the progress, and integrate the results from downstream. the naro plans, green and red for sw and hw, always move in tight parallel, maximizing the interchange. 3 specific problem application we now introduce the system we have chosen as a case study for the application of our methodology. speaking in general terms, we can call vehicle dynamics control sysiem (\rdc) ageneric system aiming at increasing the level of safety during the operations of a common vehicll. the main func_ ttt---i ri j ..t ; acta polytechnicavol 43 no. 3/2003 tion is to control the dynamic behaviour of the vehicle, intervening especially whenever the vehicle is approaching, the limits of its usage envelope. as a first approach we see the action as being carried out by acting on the brakes and simultaneously controlling the torque produced by the engine. typically, a \rdc includes functions related to the control of the braking actions (ebd), functions avoiding locking of brakes during the braking action (abs), the traction control system [ics), and a function controiling the release of torque in acceleration (asr), and others. 
each of the functions listed above encases several aspects and is carried out by processing various quantities. as an example we can point out here that an abs function has to control the various degrees of longitudinal variation of attrition according to the different motion conditions. while making a turn, both longitudinal and lateral forres act on the vehicle, and also an additional function is called in, cornering brake control (cbc), which takes into account the different load exprcssed by the internal and external wheels on the ground. the vdc is a very complex system. lt is normally made up of a number of electro-hydraulic-mechanical components' typically we have up to twelve two-lvay valves coupled to the limiting components, pumps, and actuators. in order to better analyse this part of the system, a dedicated action has been devoted to modeling all the hydraulic components. this modeling is essential in order to advance the knowledge of the system and so move on from an a priori logic to a rcsponsive system which acts according to the real vehicle-ground interaction and the conditions encountetcd while operating. once again we would like to underline the importance of integrating the knowledge related to the software running the system with the physical definition of the system itself, especia\ the electro-hydraulic portion of it. from our point ofview it is useless to investigate the physical functionalities of the system without a substantial verification of the data transmissions logics. hence the use of several commercial software packages for testing the data buses and data interchange. assuming now that we want to develop a system similar to vdc, the flrst step is to think out the overall structure of the system. after all the main functions are identified and a thorough description has been made the next step is to start doing the prcliminary design. therc are several ways of doing this; to maximizing the implementation of reliability and safety featules fiom the vert beginning [8], a potential development scheme is shorvn in fig. 4. logical scheme the starting point is represented by a functional analysis: a ta{get function is defined, then a detailed representation of a breakdown of all the sub-functions. in this phase experts from different fields (mechanics, electronics, electrical, etc.) work together to evaluate every single function necessary to comply with the required target. when all functions are clearly identified, is possible to analyse the components implementing that function from both the hardware and software point of view. in practical terms the theoretical structure derived from the functional analysis becomes a physical structure inwhich we can see every single element making up the general system. at this point, a fault tiee analysis can be applied to the obtained scheme and verified with a control loop if the requirernents are satisfied in the case of failure of various components. the control loop is basically a series of logical steps taken by the system engineer aimed at assessing the consequentiality of all the functions and the full satisfaction through the dedicated hardware. since the procedure has not yet been formalized, check lists are being prcpared in a generalized rvay and will be tested as more analyses are carried out on differ.ent subsystems. it is important to underline that it is also possible to veri$ through the control loop whether all fundamental functions have been correctly identified during the functional analysis step. 
at the end of this process we have a general system architecture, from the point of view of theoretical requirements and from the point of view of both physical hardware and software elements. as the project design unfolds, thorough adherence to the scheme assures that the safety and reliability allocations can be controlled in real time. in particular, the fault tree analysis results (see fig. 5) can be used to check the failure rate target of each component, and so it is relatively easy to evaluate the trade-offs and part substitutions needed to raise the overall system reliability.

fig. 4: development scheme for a vdc-like system (logical scheme, physical scheme, analysis of the components realising the required functions)
fig. 5: vdc fault tree analysis (the branches shown include can communication failures — bus messages not correct, bus transmitter not present, invalid data on the can bus — configuration faults, and sensor calibration faults for the yaw, lateral and steering angle sensors)
fig. 6: reliability data building and management philosophy (corrective actions and analysis requests exchanged with suppliers)

reliability data is nowadays widely used in every engineering field. mil-hdbk-217 and rac are two of the most widely used documents containing collections of failure rates and other data for various components. several specific databases have also been built throughout the years to support design choices in the field of aerospace. in the automotive field these databases are not yet fully developed to the same extent. fig. 6 shows a typical basic policy to enhance this situation. as an ancillary activity to this study, the procedure shown has been partially implemented. in particular, the work has been focused on increasing the data yield from subcontractors and suppliers. as far as we have seen till now, the data collection and management activities are carried out on two different planes, in two not completely compatible ways. it happens that the data relevant to the overall production is not necessarily the data that is sensitive to the equipment manufacturer. this causes biased collection, and subsequently biased data transmission. due to all of the above, it is sometimes very hard to correctly evaluate the overall reliability of a complex system. in the course of the study an alternative strategy, coping with the lack of data, has been evaluated. starting from the reliability data available, a corrective factor pm (proceeding mutation) is generated through a series of analyses aimed at coupling the aeronautical components (i.e., sensors) and the automotive components. through the correct application of this factor to aeronautical data, estimated values can be obtained for automotive elements and an approximate failure rate of these values can be calculated. while the appropriate databases are being expanded and refined, a temporary collection of all the pm factors devised will be used and updated, constantly crossing the values with the results obtained from experimental tests.
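for example (with purely invented numbers, only to make the use of the pm factor concrete): if an aeronautical-grade yaw-rate sensor has a failure rate $\lambda_{\rm aero} = 2 \times 10^{-6}\ \mathrm{h}^{-1}$ and the coupling analyses yield a proceeding mutation factor $\mathrm{pm} = 5$ for the automotive environment (vibration, temperature, production grade), the estimated automotive failure rate would be
\[ \lambda_{\rm auto} \approx \mathrm{pm} \cdot \lambda_{\rm aero} = 5 \times 2 \times 10^{-6}\ \mathrm{h}^{-1} = 10^{-5}\ \mathrm{h}^{-1}, \]
a value to be cross-checked against experimental tests as the dedicated databases grow.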
hence we will start out from the formel describe the main objectives that our system architecture has to comply with, and then move on to the latter to analyse the different components, their i'eatures and the potential failure modes. in doing this the two methodologies come together for whole and also work as a reciprocal verification. the second main point we stated is that all too often the analyses from the sofnvare and hardware components are carried out separately. in thisway the data gathered from the two sides, even though formally correct and complete, are not structurally integrated, and so information regarding the interactions is lost. to repair this fault a method [4] has been developed, called hierarchically performed hazard origin and propagation studies or hip-hops. these techniques are founded on the principle that all the existing methodologies function well, but need a higher degree of integration to suitably fit the most modern complex systems. the work evolves through integrating of several analyses, with the main purpose of maximizing the automation of the procedures through the development of appropriate tools and soffivare. 5 fault injection techniques in order to achieve a more complete reliability analysis, it is deemed useful to analyse system reactions to hardware and software failures. the technique explained in this work, through fault injections, has proven to be cost effective and capable of providing valuable results. using software tools like amesim or.matlab simulink, it is possible to develop software models to obtain a very close simulation of real events without the use of prototypes; in particula4 it is possible to sirnulate the mathematical logic (simulink) and physical elements (amesim) of a generic electro-mechanical system. analysing the mathematical equation and the logic which control the phenomena (iig.7 shows the traction control logic), it is possible to simulate a failure in the virtual model, using results from fta and fa to isolate the most critical componenrs. in this way we can study the behaviour of the system in critical conditions and evaluate whether the general response is sufficient to guarantee the minimum safety value. in addition to this, using simulation techniques starting fiom hardwarc and software integrated fta and fa analysis a large number of results can be obtained in a short period of time, and the correctnessof the project design can be evaluated before the construction of the physical system (i.e., the first prototypes). fig. 8 shows the simulink implemented scheme for fault injection. as an example in order to better understand the process, we can take a failure in the data transmission system. after injecting a generic or specific error in the transmission protocol, or the hardware implementing it, we evaluate the consequences, comparing the actual output with vehicle state information b;tke torques control compensator brake torques fig. 7: traction control system operational logic fig. 8: simulink scheme for fault injection techniques 39 acta polytechnica vol. 43 no. 312003 the set of expected values. the consequences, both over a short period and over a long period are then evaluated. a wrong sigrial can be either a lack of information or a set of inconsistent bits. a preliminary result of this analysis is the establishment of the so-called safe operational time. this represents the minimum elapsed time during which the system, thanks to its robust design and reliability, can sdll operare within the safety limits. 
later the analysis provides data about the period of latency and the reboot of the sysrem. 6 conclusion and recommendations our work has presented the preiiminary results and the overall methodological approach to the problem of designing reliability and safety in automotive systems since the very beginning of the developmenr. none of the component analyses used are brand new or innovative in themselves. that was not the purpose of the work. nevertheless, the overall approach has been viewed as innovarive and valuable in the automotive world, where interconnections between the diflerent analyses and also between software and hardware, functional and physical analysis, are still parrially lacking, at least in comparison with the aerospace environment. we have also seen how other researchers, all ofthem producing outstanding and valuable results, have travelled this road. what is different in this structured method is the leaner approach, aiming at applying just the minimum analysis required at the right time in the developmenr, avoiding massive efforts for preliminary design. we have also highlighted how ro ler the software and hardware sides talk rogether right from the beginning in order to ensure that the development of the functions is correctly transformed into code and the hardware implementation evolves at the same pace. the fault injection technique has also proven its effectiveness in supporting the assessment of the system performance in the earlier design phases. the key point is to accurarely select the events to generate and support the simulations adequately, especially for those failures not €asily reproducible on track. the main advantage of applying this methodology is that is avoid common pitfalls and mistakes, especially in rhe earliest phases of the design, without overburdening the system with cumbersome procedures. acknowledgments the authors wish to grarefully thank prof. paolo maggiore for his support and invaluable help provided throughout the research work, and for reviewing this paper. he is a well known expert in the field of aerospace systems reliabiliry anal_ ysis and design. references lll delphi automorive systems ridz and handling systems usa, delphi, 2000. t2l newport,j. r.:avionics systems design. crc press, 1994. t3l henderson, m. f.: aircrafi instrurnmts and adunics. jeppesen sanderson training products inc., 1g93. t4l papadopoulos, y., mcdermid, j., sasse, r., heiner, g.: arnlysis and synthesis of the behaaiour of comptex programmable electrunic systems in conditioru of failure. reliability engineering and systems safety ?1, 2001, p.229-2a7 . l5l helfrick, a.: prhciples of avionits.2nd edition, avionics communication inc., 2002. t6l chiesa, s.: aflidabilita, sirurepa e manutenziow nel progetto dei sistemi. torino, clut, 1988. l7l galetto, f.: aflrdabilita. vol. i teoria e mctoili d:i calcolo. torino cleup editore. lg8l. t8l society of automotive engineers, arp-4761: aerospace recommendcd practice: gu'idclines and methods for conducting safety assessment process on ciaij airbonu systems and. equiprnent. l2th edition, sae, usa, 1996. l9l palady, p.: failure modcs ard efect ana,lysis. pt publications, usa, 1995. i l0] crow, k.: vahrc analysis anl. functinr analysis system techntque. usa, drm associates. ing. gianfrancesco maria repici phone: +39 0l i 564 6858 fax: *39 0l i 564 6899 e-mail : gianfrancesco.repici@polito.it department of aeronautical and space engineering ing. 
ing. aldo sorniotti, phone: +39 011 564 6915, fax: +39 011 564 6999, e-mail: aldo.sorniotti@polito.it, department of mechanical engineering, politecnico di torino, corso duca degli abruzzi, 24, 10129 torino, italy

acta polytechnica vol. 45 no. 2/2005

implementation of a microcode-controlled state machine and simulator in avr microcontrollers (micoss)

s. korbel, v. jáneš

this paper describes the design of a microcode-controlled state machine and its software implementation in atmel avr microcontrollers. in particular, atmega103 and atmega128 microcontrollers are used. this design is closely related to the software implementation of a simulator in avr microcontrollers. this simulator communicates with the designed state machine and presents a complete design environment for microcode development and debugging. these two devices can be interconnected by a flat cable and linked to a computer through a serial or usb interface. both devices share the control software that allows us to create and edit microprograms and to control the whole state machine. it is possible to start, cancel or step through the execution of the microprograms. the operator can also observe the current state of the state machine. the second part of the control software enables the operator to create and compile simulating programs. the control software communicates with both devices using commands. all the results of this communication are well arranged in dialog boxes and windows.

keywords: state machine, microcontroller, microprogramming, software implementation of a simulator.

fig. 1: architecture of avr microcontrollers (block diagram: 4k×16 program memory, instruction register and decoder, program counter, 32×8 general purpose registers, alu, status and tests, 256×8 data sram, 256×8 eeprom, timers t/c0–t/c2, watchdog timer, serial uart, spi unit, interrupt unit, analog comparator, 32 i/o lines and an 8-bit data bus with direct and indirect addressing)

1 introduction to atmel microcontrollers

the structure of avr microcontrollers was designed so as to comply with high-level language compilers, namely the widely used c language. such an optimized core unit having the harvard architecture bears the main characteristics of microprocessors with a reduced instruction set (risc). the basic architecture of the avr microcontrollers is depicted in fig. 1. the whole family of avr microcontrollers can be divided into 3 subgroups: the at90s, attiny and atmega families. the at90s series was developed first, and its name suggests that it represents a continuation of the at89c series. the situation with the other two series is different, since the manufacturer had more experience with the previous series
they are suitable for certain control applications and tutoring. when writing a microprogram we can test the controlling procedures of some of the operations and subsequently use the acquired knowledge in optimizing the microprogram. the correct function of the microprogram and the subsequent optimization is strongly dependent on the development and tuning devices. thus there is great emphasis on the design environment in which the microprograms are written. it is appropriate to use a microcode-controlled state machine structure for developing microprograms (see fig. 2). the whole structure of a state machine has been implemented into avr microcontrollers by atmel company [3]. the simulation of the behavior of the designed microprogram is carried out on the model of a control system, which is also implemented in the avr microcontroller. this model is so universal that it can be used for simulation of an arbitrary system. the design environment micoss is composed of two parts: microcode-controlled state machine [1] and simulator of the controlled unit [2]. these devices can be interconnected by a flat cable and linked to a computer through a serial rs-232 or usb interface. the application software is common to both devices. it allows us to crate and to edit microprograms and to control the whole state machine, i.e., to start, cancel or step the running of the microprograms and to monitor the current state of the state machine. the second part of the simulator software enables an operator to create and compile simulating programs © czech technical university publishing house http://ctn.cvut.cz/ap/ 39 czech technical university in prague acta polytechnica vol. 45 no. 2/2005 fig. 2: structure of a microcode-controlled state machine fig: 3: design environment scheme and to control the simulator. a scheme of the design environment is given in fig. 3. 3 microcode-controlled state machine the realized microcode-controlled state machine was developed according to the scheme in fig. 1, so that its behavior and function would resemble as much as possible the state machine of an amd2909 chip. for the implementation of the whole core of the state machine, the avr microcontroller was chosen and designated as atmega103 [4]. this microcontroller has input/output interfaces by means of which it is possible to communicate with outside parts. a great advantage is the large number of these interfaces, so that they can be used as direct inputs and outputs without any additional multiplexing. interfaces a, b and c are outputs and interfaces d, e and f are inputs. the lowest two bits from interface e and the lowest two bits from interface f are reserved for communication. this implies that the micro operational command is 24 bits wide, without the necessary output multiplexing. there are 16 bits connected directly into mpx input and 4 bits into maprom memory. the total input is thus 20 bits. this is then reflected in the number of 40 pin connectors on the printed circuit board (20 pins are grounded). the address of microinstruction was chosen to be 15 bits and owing to this, it is possible to develop microprograms up to maximum size 32k (32769 microinstructions). the above considerations imply that it is possible to determine the size of the individual parts of the microinstruction, whose structure is depicted in fig. 1. because 16-bit input was chosen, the condition code (cc) is 5 bits sized. 
the reason for 5 bits and not 4 bits is as follows: because the input as well as the output from the backward micro cycles counter (bmc) is input into multiplexer mpx, and because constants 1 and 0 are also input here, the total numbers of input bits to input multiplexer mpx are thus 19. from this value it follows that to choose just one input bit from the whole set of 19, the above mentioned 4 bits are not sufficient. the width of the jump type ts was taken from amd company, to be precise from chip 29811, i.e. 4 bits. in keeping with the width of the address of microinstruction, the address of the jump adrsk is 15 bits. the resultant structure of the microinstruction, including the proposed widths of the individual items, is given in fig. 4. what remains to be determined is the width of the backward counter of the micro cycles zpc, the depth of the stack zas and the size of the mapping memory maprom. the backward counter zpc is set according to the address of the jump adrsk from the microinstruction, thus its width will also be 15 bits. the stack memory (stm) at slices amd was 4 items deep. if the microinstruction address space were extended up to 32k, the depth of 4 items space would not be sufficient. therefore the depth of the stack memory was enlarged up to 16 items. another reason for this enlargement is the fact that, with such a large space, it is possible to create more complex microprograms; consequently the demand for stack memory rises as well. the width of mapping memory maprom is the same as the address of microinstruction, i.e., 15 bits. in order to address this memory 4 input bits will be used; therefore 16 items result from it. the proposed state machine properties are summarized in the following table. 3.1 implementation of a microcode-controlled state machine the whole design of a microcode-controlled state machine was developed in the avr studio version 3.53 [5]. the avr studio is a professional development tool for development and debugging of c applications that will be running in avr microcontrollers. the implementation of the microcode-controlled state machine itself is depicted in fig. 1. the individual blocks of the scheme are described as variables and independent functions that perform the required task. all these functions are periodically called in an infinite loop until the running is interrupted. if there is no data on the data link to pc, the application runs autonomously according to the input conditions, and it generates outputs signals. if this is not the case (individual inquiries are being sent from pc), the application performs the requested tasks. a more detailed description of the implementation and specification of the individual blocks according to fig. 2 is given in [1]. 4 controlled system simulator as stated above, a part of the design environment is a simulator of a controlled system. the simulator is a completely universal structure, which enables us to simulate an arbitrary system. to achieve this, there are also external or data inputs and outputs available besides control inputs and outputs. control and data inputs are chosen by means of multiplexers; conversely, the control and data outputs use latch registers. a total of 16 data inputs and outputs are implemented. 40 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 45 no. 2/2005 czech technical university in prague fig. 
3.1 implementation of a microcode-controlled state machine
the whole design of the microcode-controlled state machine was developed in avr studio version 3.53 [5]. avr studio is a professional tool for the development and debugging of c applications that run in avr microcontrollers. the implementation of the microcode-controlled state machine itself is depicted in fig. 1. the individual blocks of the scheme are described as variables and independent functions that perform the required tasks. all these functions are periodically called in an infinite loop until the running is interrupted. if there is no data on the data link to the pc, the application runs autonomously according to the input conditions and generates output signals. if this is not the case (individual inquiries are being sent from the pc), the application performs the requested tasks. a more detailed description of the implementation and specification of the individual blocks according to fig. 2 is given in [1].

4 controlled system simulator
as stated above, a part of the design environment is a simulator of a controlled system. the simulator is a completely universal structure, which enables us to simulate an arbitrary system. to achieve this, external (data) inputs and outputs are available besides the control inputs and outputs. control and data inputs are selected by means of multiplexers; conversely, the control and data outputs use latch registers. a total of 16 data inputs and outputs are implemented. the data inputs can be used to simulate input variables or to connect an external matrix keyboard. the data outputs can be used, for example, for connecting a seven-segment led display. the data inputs and outputs can be used either together or separately, selected by means of a jumper. the general architecture of the device is depicted in fig. 5.

fig. 5: architecture of the simulator

for the implementation of the simulator, the avr microcontroller atmega128 [6] was chosen. this microcontroller is placed on a printed circuit board together with the communication interfaces, data inputs and outputs, display elements and mode jumpers. the atmega128 microcontroller also has enough interfaces; moreover, it has one extra communication interface g, which is used in the design to control the input multiplexer and the output latches. two bits of this interface drive indicator leds, which inform about the running and the state of the simulator. the remaining interfaces are used for communication between the simulator and the state machine in the following way: interfaces a, b and c are used for inputs to the simulator, and interfaces d, e and f are used for outputs. the two lowest bits of interface f are reserved for communication with the computer. a more detailed description of the simulator can be found in [2].

4.1 implementation of the simulator of the controlled system
the whole design was, as in the previous case, performed in the design environment of avr studio version 3.53 [5]. the implementation itself was carried out in the following way: the basic communication between an application running in the microcontroller and a computer is accomplished by sending individual commands. each command has a specific code which determines its meaning. for this reason, each command is assigned one function in the application. these functions are periodically called in an infinite loop according to the corresponding received code. the list of selected codes is given in table 2; a detailed command description is given in [2].

table 2: example of simulator commands
number | command | type | description
3 | maprom setting | system | sets simulator output bits 17–20
20 | set 1–8 according to input | executive | sets output bits 1–8 in accordance with input bits 1–8
63 | set b according to out (1–8) | executive | sets register b according to output bits 1–8
74 | give back output state (9–16) | information | the simulator sends the state of output bits 9–16
83 | wait for bits inverting | blocking | the simulator waits until the chosen input bits are set to the inverse state
166 | give back registers state | executive | the simulator sends the actual contents of registers d, e and f to the pc
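section 4.1 above states that each command code is bound to one function that is called from an infinite loop. a minimal sketch of such a dispatcher is shown below; the command codes are taken from table 2 (only a few of them are shown), but the function names and the uart helpers are hypothetical placeholders, not the actual firmware api.

  #include <stdint.h>

  /* placeholder i/o helpers -- the real firmware routines are not published;
     these stubs only make the sketch self-contained */
  static int uart_data_ready(void)    { return 0; }
  static uint8_t uart_read_byte(void) { return 0; }

  static void cmd_set_maprom(void)        { /* sets simulator output bits 17-20 */ }
  static void cmd_set_outputs_1_8(void)   { /* sets output bits 1-8 from input bits 1-8 */ }
  static void cmd_send_output_state(void) { /* sends the state of output bits 9-16 */ }
  static void cmd_send_registers(void)    { /* sends the contents of registers d, e, f */ }

  int main(void) {
      for (;;) {                        /* infinite loop, as described in section 4.1 */
          if (!uart_data_ready())
              continue;
          switch (uart_read_byte()) {   /* command codes taken from table 2 */
          case 3:   cmd_set_maprom();        break;
          case 20:  cmd_set_outputs_1_8();   break;
          case 74:  cmd_send_output_state(); break;
          case 166: cmd_send_registers();    break;
          default:                           break; /* unknown codes are ignored here */
          }
      }
  }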
5 communication software
the proposed software is common to both the microcode-controlled state machine [1] and the controlled system simulator [2]. the software application was written in delphi 6.0 to ensure transferability of the code between the individual platforms. the program may be executed under windows 95, 98, 2000 and xp. the design of the software for the state machine is maximally accommodated to the software for the simulator; it is composed of two-level program units. the units of the lower level ensure communication with the device, and the high-level units ensure communication with the user.

5.1 microcode-controlled state machine
the proposed architecture of the communication software is depicted in fig. 6.

fig. 6: software architecture

5.1.1 communication with the device (lower level)
communication with the device is accomplished by a serial uart interface on the part of the microcontroller and a serial rs-232 or usb interface on the part of the computer. the whole communication is accomplished by sending individual commands to the application in the microcontroller, which performs the required function. there are two different modes. the first is autonomous running of the state machine. in this mode the running depends only on the inputs (state information), and the outputs (controlling information) correspond to the stored microprogram. in this way the state machine can be used, for example, to control a processing line. the state machine then works quite autonomously and is independent of further commands from the computer. selection of the second mode enables the user to influence the running of the whole machine directly. of course, it is possible to influence the running of the machine even in the previous mode; the state machine can be switched between these two modes. this choice is accomplished by changing logical '0' to logical '1' on the rts line of the rs-232 interface. it can be said that this line serves as a switch for the functioning of the whole machine, switching between the two modes. if logical '1' is set on the rts line, the serial uart interface of the avr microcontroller plays an important role. through the uart, individual controlling bytes are sent to the microcontroller. each transferred byte represents either a command (a selected command) or transferred data. after logical '1' has been set on the rts line, a further important choice follows: the microcontroller waits to see which byte is accepted by the uart. the byte 0x10 specifies the choice of the microprogram memory and work with its contents; the byte 0x20 signifies the choice of the maprom memory, again with the handling of its contents; and the byte 0x30 signifies monitoring the state of the machine, changes of the contents of individual registers, setting a breakpoint and the stepping mode. after this choice, the transfer of data to the microcontroller follows. low-level communication between a computer and a microcontroller is handled by the procedure described above. to illustrate the mutual communication, see fig. 7, which deals with the transfer of the contents of the microprogram memory.

fig. 7: low-level communication between devices
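the selection bytes 0x10, 0x20 and 0x30 described above can be handled by a simple top-level switch once logical '1' appears on the rts line. the sketch below only illustrates this decision structure; the handler names and helper routines are invented for the sketch and are not the published firmware api.

  #include <stdint.h>

  /* placeholder helpers -- invented names, only to make the sketch compile */
  static int rts_is_high(void)        { return 0; }  /* state of the rs-232 rts line */
  static uint8_t uart_read_byte(void) { return 0; }

  static void run_autonomously(void)           { /* mode 1: microprogram drives the outputs */ }
  static void handle_microprogram_memory(void) { /* 0x10: transfer of microprogram contents */ }
  static void handle_maprom_memory(void)       { /* 0x20: transfer of maprom contents */ }
  static void handle_monitoring(void)          { /* 0x30: registers, breakpoints, stepping */ }

  void communication_step(void) {
      if (!rts_is_high()) {            /* logical '0' on rts: autonomous mode */
          run_autonomously();
          return;
      }
      switch (uart_read_byte()) {      /* selection byte, as described in 5.1.1 */
      case 0x10: handle_microprogram_memory(); break;
      case 0x20: handle_maprom_memory();       break;
      case 0x30: handle_monitoring();          break;
      default:                                 break; /* other bytes carry plain data */
      }
  }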
5.1.2 user interface (higher level)
the user interface enables users to use the following functions:
• testing of the connected device,
• editing of the maprom memory and the microprogram memory,
• saving and reopening of both memories,
• monitoring the state of the machine: microprogram stepping and watching the register states, the microinstruction address (ra), the top of the stack and the backward counter dc,
• microprogram running with possible stopping and breakpoint address definition,
• changing the values of the microinstruction address (ra), the top of the stack and the backward counter dc.
the whole application can be operated using a keyboard and a mouse (like any application running in windows), and there are also shortcuts for the majority of operations. after starting the application atas.exe (the main application file), the basic application window is displayed (see fig. 8). there is a main menu in the upper part. the menu contains the following items: project, edit, mode, run, window and help. below the menu there are many buttons for quick choices: create new project, open existing project, save current project, buttons for text editing (cut, copy, paste), state machine and simulator. a detailed description of the menu and buttons can be found in [2].

fig. 8: basic application window

after pushing button a in the main window, or after clicking on the item state machine in the mode menu, a new, so-called state machine window is opened (fig. 9). this window is completely independent of the main window. in its upper part there is a main menu with the following items: file, microprogram, run, device and help. the specification of the other buttons listed in the main menu can be found in [1]. as indicated in fig. 9, within this window it is possible to input the individual microinstructions and to edit the contents of the maprom. all this can be done without the device being connected. at the moment when you need to test the microprogram, you must take the following steps:
• connect the device to the computer,
• choose the communication interface,
• choose the communication speed.
then you must press the "connect device" button. at this moment, the application tests the device. if everything is correct, three other buttons controlling the operation of the device appear: the "run", "step" and "stop" buttons. fig. 10 describes the situation after the device is connected. if the connection fails, an error message is displayed. a detailed description of communication with the state machine can be found in [1].

5.2 simulator of a controlled system
the simulator window contains a text editor for writing simulating programs (see fig. 11). these programs are written in a special simulating language, the "s-language". in terms of control, the structure of the simulator is simple, and it copies its hardware structure. all simulating language commands are related to specific basic elements (control and data registers, the stack, and so on). the commands are divided into system, executive, blocking and information commands. a detailed description is given in [2]. simulating programs can be saved and reloaded by commands from the file menu, or directly by pressing the corresponding icons on the control panel. the saved programs have the sim extension and are stored as plain text. therefore they can also be written simply in any text editor, saved with the sim extension and then opened in the simulator program window. an important choice is represented by the "compile" button. this button starts the compilation into an internal form. during the compilation, the correctness of the program from the semantic and syntactic points of view is checked. the syntactic analyzer repeatedly calls the lexical analyzer, which returns the appropriate lexical elements. if there are lexical or syntactic errors, the compilation is stopped and the error must be corrected. it is necessary to connect the simulator device to start the simulating program. when the "connect device" button is pressed, the program starts communication with the device.
after confirmation of the connection by the device, the items for starting the simulation programs are made accessible. it is possible to choose between continuous running and stepping of the simulation program. the current program line is highlighted on the screen. a special mode is stepping with a delay. this choice is suitable if it is necessary to debug the application and to follow the processes running in the simulator. the simulation or stepping program can be started from both the menu and the icons (tool panel). the "stop" button stops the simulation. after starting or stepping the simulation, a new display window appears (see fig. 12). the display window substitutes the simulator display and other signaling elements. it clearly displays the stack register contents and the logical values of the data inputs and outputs. individual command codes can also be sent directly to the simulator, irrespective of the executed program. a detailed description of communication with the simulator can be found in [2].

fig. 9: state machine – disconnected device
fig. 10: state machine window – connected device
fig. 11: window of the simulator
fig. 12: display window

5.3 intercommunication of the devices
as indicated above, the microcode-controlled state machine device is connected to the controlled system simulator device by flat cables. the input signal cable has 40 pins, and the output signal cable, which carries the micro operation sign moz (the command part, cp), has 50 pins. the reason for these cable sizes is that each signal wire also needs a ground wire: the signal input is 20 bits and there are 20 ground wires, giving a 40-wire flat cable. mutual synchronization of the two devices has not been explicitly solved. it is thus necessary to implement this synchronization directly in the running application which is being tested on the devices. a possible solution is to reserve one signal bit in each direction and to control the running of the whole state machine according to the state of this signal wire. the second choice for mutual synchronization of the devices, made possible by the common connection of both devices to a single computer, is to control the running of the state machine on the basis of a control byte input from the pc. this would define the exact moment at which the state machine starts operation. in addition to the positive aspects of this synchronization, there are also some negative ones; an example is the delay resulting from waiting for the input of the starting byte. synchronization could also be achieved by creating a clock cycle on the device of the microprogrammable state machine. "clocks" generated in this way would enable us to control the running of the whole state machine, and at the same time the "clocks" could be brought out to the top of the output connector. this would ensure control of the simulator from a single time base. such a generator could use, for instance, a counter with a preset initial value: when the counter counts down to zero, an interrupt occurs and a clock pulse is generated. it is obvious that the value set in the counter would have to correspond to the longest executed part of the program. however, if speed is emphasized, this method would cause a delay in the running of the machine.

6 example
let us assume we have a reservoir with two pumps and an outflow in our house. the reservoir supplies the whole house with drinking water. there are three sensors in the reservoir. if the water level is between sensors h1 and h2, both pumps are inactive. if the level sinks below sensor h1, pump no. 1 begins to fill the reservoir; when the level rises above sensor h2, pump no. 1 is stopped. similarly, if the level sinks below sensor h0, pump no. 2 also begins to fill the reservoir; if the level rises above sensor h2, both pumps are stopped. buttons no. 1 and no. 2 simulate the demand for water. the whole reservoir with the pumps and buttons is depicted in fig. 13.

fig. 13: a reservoir with pumps and sensors
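before looking at the microprogram, it may help to restate the control rules above in plain c. this is only a restatement of the prose for clarity: the sensor and pump names are ad hoc, and the code is not what runs on the devices themselves.

  #include <stdbool.h>

  /* water-level control rules from section 6, restated in c;
     h0, h1, h2 are the three level sensors (true = water above the sensor) */
  typedef struct { bool pump1, pump2; } pumps_t;

  pumps_t control(bool h0, bool h1, bool h2, pumps_t prev) {
      pumps_t next = prev;            /* pumps keep running between sensors */
      if (!h1) next.pump1 = true;     /* level below h1: pump no. 1 fills   */
      if (!h0) next.pump2 = true;     /* level below h0: pump no. 2 helps   */
      if (h2) {                       /* level above h2: both pumps stop    */
          next.pump1 = false;
          next.pump2 = false;
      }
      return next;
  }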
the microprogram and the model of the controlled system are shown in fig. 14 and fig. 15. the simulator program with a detailed description is shown in table 3. information on the amount of water is stored in register a and is shown on the pc monitor.

fig. 14: microprogram of the reservoir
address | jt | cc | jmpadr | cp
000 | cjp | 4 | 000 | 0x000000
001 | cjp | 5 | 000 | 0x000001
002 | cjp | 5 | 001 | 0x000001
003 | cjp | 5 | 000 | 0x000003
004 | jp | 0 | 003 | 0x000003

7 conclusions
the complete design environment for microcode development and debugging is a demonstration of system implementation in avr microcontrollers from the atmel company. this design environment is suitable both for the construction of a simulator of the controlled object and for the design of the microprogram of a state machine. the final behavior of the designed state machine can be tested. all the components are freely available on the market, and thus this environment can be mass-produced. the universality of the environment offers the possibility of further experimentation in programming and in the design of various structures with avr microcontrollers.
fig. 15: the state machine model

table 3: the simulator program of the water reservoir
;program "the water reservoir"
define m1 1        ;pump no. 1
define m2 2        ;pump no. 2
define hl_top 10   ;sensor h2
define hl_bottom 1 ;sensor h0
define hl_middle 5 ;sensor h1
hw_reset
lcd_clr
disp_txt 1 'the simulation of the reservoir'
init_all 0 0       ;initialization
init_a 10          ;the level of water
loop: if_lt_a hl_top t2      ;if the water level is below sensor h2, jump to t2
set_high 11100000b           ;set all sensors to 1
disp_txt 21 'the reservoir is full !'
go_to test1
t2: if_lt_a hl_middle t3     ;if the water level is below sensor h1, jump to t3
set_high 01100000b           ;the water level is between sensors h1 and h2
clr_high 10000000b           ;set sensor h2 to 0
go_to test1
t3: if_lt_a hl_bottom t4     ;if the water level is below sensor h0, jump to t4
set_high 00100000b           ;the water level is between sensors h0 and h1
clr_high 11000000b           ;set sensors h1 and h2 to 0
go_to test1
t4: clr_high 0ffh            ;the water level is below sensor h0
init_a 0
disp_txt 21 'the reservoir is empty !'
test1: if_clr_in m1 test2    ;if pump no. 1 = 0, jump
inc_a                        ;the water level is raised
test2: if_clr_in m2 testtl   ;if pump no. 2 = 0, jump
inc_a                        ;the water level is raised
testtl: if_clr_tl1 tlac2     ;test of pressing button no. 1
dec_a                        ;the water level is reduced
tlac2: if_clr_tl2 loop       ;test of pressing button no. 2
dec_a                        ;button no. 2 is pressed
dec_a                        ;fast draining
go_to loop
end

references
[1] korbel, s.: design and implementation of a microcode-controlled state machine. diploma thesis, fee ctu prague, 2003 (in czech).
[2] klíma, d.: controlled system simulator. diploma thesis, fee ctu prague, 2003 (in czech).
[3] atmel: http://www.atmel.com/
[4] atmega103: http://www.atmel.com/dyn/products/devices.asp?family_id=607#760
[5] avr studio: http://www.atmel.com/dyn/products/tools_card.asp?tool_id=2725
[6] atmega128: http://www.atmel.com/dyn/products/product_card.asp?part_id=2018

ing. stanislav korbel, e-mail: korbels@fel.cvut.cz
doc. ing. vlastimil jáneš, csc., e-mail: janes@fel.cvut.cz
department of computer science and engineering, czech technical university, karlovo nám. 13, 121 35 prague 2, czech republic

acta polytechnica 60(3):268–278, 2020, doi:10.14311/ap.2020.60.0268

on the second light flash emitted from a spark-generated bubble oscillating in water

karel vokurka
technical university of liberec, physics department, studentská 2, 461 17 liberec, czech republic
correspondence: karel.vokurka@tul.cz

abstract. the light emitted from spark-generated bubbles oscillating in water is studied experimentally. attention is paid to the emission of light from bubbles in the final stages of their first contraction and in the early stages of their following expansion. in some experiments, two close flashes of light were observed. the first light flash has already been studied in earlier works. in the present work, attention is paid to the second light flash. the relations between the first and second flashes of light and the size of the bubbles are studied and discussed in detail. it is assumed that these two light flashes are caused by two different processes taking place in the bubbles.
the possible nature of these two processes is briefly discussed.
keywords: spark-generated bubbles, bubble oscillations, light emission.

1. introduction
the physical processes taking place in bubbles oscillating in liquids are very complex. although much effort has been devoted to clarifying these physical processes, many issues in this field are still not well understood. for example, a great deal of work has been devoted to clarifying the processes responsible for cavitation erosion (see, e.g., reference [1]). however, the specific mechanism responsible for cavitation erosion has not yet been satisfactorily identified. the emission of light from oscillating bubbles, which is studied in this work, represents another unexplained problem. other studies of bubble oscillations have focused on improving contrast enhancement in medical ultrasonic imaging [2–6], and even in this case, a better understanding of the physical processes running in bubbles may be useful. oscillating bubbles are generated in laboratory experiments by many techniques, such as focusing a laser beam into a liquid [7–13], a spark discharge in a liquid [14–19], irradiating a liquid with intense ultrasonic waves (acoustic cavitation) [20], or a liquid flow (hydrodynamic cavitation) [21]. all these techniques are also used in studies of light emission from bubbles [8–13, 15–18, 20, 21]. over the last three decades, the emission of light from a single bubble oscillating in acoustic resonators has also been intensively studied (see, e.g., a recent review [22], which summarises results from 163 works). an advantage of the method based on acoustic resonators is that relatively small experimental set-ups can be used. however, a serious disadvantage of this technique is that only small bubbles are generated, having maximum radii of less than 100 µm. these small bubbles oscillate very quickly, and therefore all the physical processes that take place in them also run very fast, which makes their study difficult. measuring light flashes from these small sources at relatively large distances requires averaging, during which many important features are lost. and measuring the ultrasonic waves radiated by these bubbles is also a difficult task, because the spectral components of these waves range up to several hundred mhz. in the present work, the emission of light from large spark-generated bubbles freely oscillating in water far from boundaries is studied. as mentioned in earlier works [23–25], large spark-generated bubbles have many advantages that will also be exploited in this study. during the study of the light flashes radiated from these bubbles in the final stages of their first contraction and the early stages of the following expansion, we occasionally observed that the first flash of light was accompanied by a slightly delayed second flash of light. whereas the first flashes have been studied in detail in [24, 25], in this work we concentrate on the second flashes. in section 3, the time distance between the two flashes, the maximum values of these flashes, and the position of the two flashes relative to the position of the pressure pulse (and thus also with respect to the instant when the bubble is contracted to its first minimum volume) will be studied in detail. multiple secondary light flashes have also been observed by ohl [9], sukovich et al. [12], supponen et al. [13] and moran and sweider [26].
however, in these works, the secondary light pulses have not been studied in detail, and no concurrently emitted pressure pulses have been used to analyse the observed events in the time domain.

2. experimental setup and nomenclature
the data analysed and discussed in this work are a subset of the data already presented in the works [24, 25]. this means that the data were obtained using the same experimental setup. therefore, only a brief description of the instruments, the measuring procedure, and the nomenclature will be given here. further details can be found in [24, 25]. the freely oscillating bubbles were generated by spark discharges in a large laboratory water tank having dimensions of 4 m (width), 6 m (length), and 5.5 m (depth). the sparker used in the experiments consisted of two thin tungsten electrodes having a diameter of 1 mm and a length of 50 mm. the electrode tips were facing each other and were separated by a narrow gap. due to electrode burning, the length of the gap gradually increased in subsequent experiments from about 0.2 mm to 3 mm. the tungsten electrodes were mounted in conical brass holders and were connected by cables to a capacitor bank, whose capacitance could be varied in 10 steps from 16 µf to 160 µf. the capacitors were charged from a high-voltage source of 4 kv, and an air-gap switch was used to trigger the discharge. after closing the air-gap switch, at a time t0, the liquid breakdown occurs and the discharge channel starts growing explosively. this explosive growth is accompanied by an intensive light (optical) emission and pressure (acoustic) wave radiation from the bubble. the explosively growing, almost spherical bubble attains its first maximum volume at time t1 and has a radius RM1. then the bubble starts contracting. at time tc1, the bubble contracts to its first minimum volume. although very little is currently known about the shape of the contracted bubble, it will be assumed that it is a sphere having a radius Rm1. then the bubble starts expanding, and at time t2 it attains its second maximum volume and has a radius RM2. further bubble oscillations follow, but these are beyond the scope of this work. in the following, the interval (t0, t1) will be referred to as the initial growth phase, the interval (t1, tc1) as the first contraction phase, and the interval (tc1, t2) as the first expansion phase of the bubble. prior to the measurements reported here, a limited number of high-speed camera films were taken, with framing rates ranging from 2800 to 3000 frames/s. these records were used to check the shape of the generated bubbles. besides this, the photographs yielded useful visual information about the bubble content. examples of images of spark-generated bubbles can be seen in earlier works [23, 27]. both the spark discharge and the subsequent bubble oscillations are accompanied by an intensive emission of light and pressure waves. a relatively simple arrangement was used to record the optical waves. a fibre optic cable was fixed at the same depth in water as the sparker. the input surface of this cable was pointing perpendicularly to the electrodes and was positioned at a distance r = 0.2 m from the sparker gap. a photodiode was positioned at the other end of the fibre optic cable. the output voltage u(t) from the photodiode was amplified, digitized and stored in a computer.
the record of the optical radiation u(t) can be divided into two pulses: first, the pulse u0(t), which was emitted during the interval (t0, t1), and second, the pulse u1(t), which was emitted during the interval (t1, t2). in this work, only the pulses u1(t) will be considered, and the instant at which the pulse u1(t) attains its maximum value um1 will be denoted tu1. the pressure waves p(t) were recorded using a broadband hydrophone, which was positioned at the same depth as the sparker at a distance rh = 0.2 m from the gap of the sparker. the hydrophone output voltage was digitized and stored in a computer. like the optical wave, the pressure wave p(t) can be divided into two pulses: first, the pressure pulse p0(t), which was radiated during the interval (t0, t1), and second, the pressure pulse p1(t), which was radiated during the interval (t1, t2). only the pulse p1(t) will be considered in this work. the instant at which the pressure pulse p1(t) attains its peak value pp1 will be denoted tp1. the sparker was submerged in water at a depth of h = 2.5 m (i.e., at a hydrostatic pressure p∞), far away from the tank walls. the generated bubbles can be described by two parameters: first, the bubble size RM1, and second, the bubble oscillation intensity pzp1 (this parameter is defined as pzp1 = (pp1·rh)/(p∞·RM1)). both RM1 and pzp1 were determined in each experiment from the respective pressure record using an iterative procedure described in [23]. the sizes RM1 of the bubbles studied in this work ranged from 21 mm to 56.5 mm, and the bubble oscillation intensities pzp1 ranged from 92.4 to 152.8. the pressure wave propagates from the bubble wall to the hydrophone at the speed of sound in water. therefore, the times t0, t1 and t2 in the pressure record are delayed by about 135 µs after the times t0, t1, and t2 in the optical record. however, as shown in [25], the instant of the liquid breakdown t0 can be determined in both records u(t) and p(t) with a precision of 0.1 µs. the pressure record can thus be shifted along the time axis so that the times t0 in both records are identical. examples of whole records u(t) and p(t) were presented in [25]. in this work, only small portions of the pulses u1(t) and p1(t), extracted from the records in the vicinity of tu1 and tp1, will be displayed and discussed. and even though it has not yet been verified experimentally, in the following discussion it will be assumed that the peak pressure in the pulse p1(t) is radiated at the same instant the bubble is contracted to its first minimum volume; in other words, it is assumed that in the shifted pressure record tp1 = tc1. in earlier studies of the light emission in the interval (t1, t2) from spark-generated bubbles, a total of 98 experiments were quantitatively evaluated [24, 25]. in the prevailing part of these experiments, a single light flash was observed. an example of a typical single pulse u1(t) is shown in figure 1. in [24], the single optical pulses were analysed and characterized by three parameters: the maximum voltage um1 in the pulse, the time tu1 of the occurrence of the maximum voltage, and the pulse width ∆ at half the maximum voltage (that is, the pulse width at um1/2). all these parameters are displayed in figure 1.
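the three pulse parameters um1, tu1 and ∆ defined above can be read off a sampled record in a few lines. the sketch below (plain c, uniform sampling assumed) is only meant to fix the definitions; it is not the authors' processing code, and the half-maximum crossings are located only to the nearest sample.

  #include <stddef.h>

  /* extract um1 (peak value), tu1 (peak time) and the width delta at um1/2
     from a uniformly sampled optical record u[0..n-1] with time step dt */
  void pulse_params(const double *u, size_t n, double dt,
                    double *um1, double *tu1, double *delta) {
      size_t ip = 0;
      for (size_t i = 1; i < n; i++)       /* locate the maximum */
          if (u[i] > u[ip]) ip = i;
      *um1 = u[ip];
      *tu1 = ip * dt;

      double half = *um1 / 2.0;
      size_t lo = ip, hi = ip;
      while (lo > 0 && u[lo] > half) lo--;      /* crossing before the peak */
      while (hi < n - 1 && u[hi] > half) hi++;  /* crossing after the peak  */
      *delta = (hi - lo) * dt;   /* pulse width at half maximum (crude, to
                                    the nearest sample; could be refined by
                                    linear interpolation at the crossings) */
  }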
in [25], it was further shown that the light flashes u1(t) are not radiated from the bubble synchronously with the pressure pulses p1(t); the light flashes are radiated either a bit earlier or a bit later than the pressure pulses. the difference between the times tu1 and tp1 was denoted δ1 and was defined as δ1 = tu1 − tp1. in some experiments (exactly 22 of the 98 experiments), beside the first optical pulse u1(t), a second optical pulse u2(t), slightly delayed after the first pulse, was also observed. an example of a record where both the first pulse u1(t) and the second pulse u2(t) can be seen is given in figure 2. the situation when two pulses are present in the record can be characterized by six parameters: the maximum voltages um1 and um2 in the pulses, the times tu1 and tu2 of the occurrence of these maxima, the distance d12 between the times tu1 and tu2 (this parameter is defined as d12 = tu2 − tu1), and the pulse width ∆ introduced earlier for a single pulse u1(t). it is evident that the pulse width ∆ now describes two pulses, which are more or less merged together; however, there is currently no way to separate the two pulses from each other. in defining the mutual position of the second light pulse u2(t) and the pressure pulse p1(t) on the time axis, we proceed in the same way as in the case of the time difference δ1: the difference between the times tu2 and tp1 will be denoted δ2 and is defined as δ2 = tu2 − tp1. the parameters describing the position of the two optical pulses u1(t) and u2(t) and the acoustic pulse p1(t) on the time axis are shown in figure 3.

3. results
in the experiments analysed here, the pulses p1(t), u1(t) and u2(t) were radiated from almost spherical bubbles [24, 25]. the sizes of these bubbles are described by the first maximum radius RM1, and the bubble oscillation intensities are described by the non-dimensional peak pressure in the first acoustic pulse pzp1. thus, there are eight parameters available for the analysis: six parameters describing the pulses u1(t) and u2(t) and their time position with respect to the pulse p1(t), and two parameters describing the bubble itself. using the data from the 22 experiments where second light pulses were observed, a correlation analysis was done between these eight parameters, that is, between d12, δ1, δ2, um1, um2, ∆, RM1, and pzp1. in most of these analyses, it was noticed that the correlation between the selected parameters is very weak. such a weak correlation can be seen, for example, in cases where the bubble oscillation intensity pzp1 was entered as one of the two parameters in the analysis. this weak correlation of the optical radiation with the intensity of the bubble oscillation was already observed in references [24, 25] and was used as a proof of the assertion concerning the relatively autonomous behaviour of the plasma in the bubble interior. some other weak correlations have also been observed. as these weak correlations currently do not provide any new information, they will not be considered any further, and only the dependences of the selected parameters will henceforth be discussed. these are the variations of d12, δ2, um2, and um2/um1 with RM1, of d12 with ∆, and of δ1 with d12. these variations are shown in figures 4–9.
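all the regression lines quoted below are straight-line least-squares fits of one parameter against another over the 22 experiments. for reference, a minimal sketch of such a fit is given here; this is the generic textbook formula, not the authors' analysis script.

  #include <stddef.h>

  /* ordinary least-squares fit y = a*x + b over n points */
  void fit_line(const double *x, const double *y, size_t n,
                double *a, double *b) {
      double sx = 0, sy = 0, sxx = 0, sxy = 0;
      for (size_t i = 0; i < n; i++) {
          sx  += x[i];        sy  += y[i];
          sxx += x[i] * x[i]; sxy += x[i] * y[i];
      }
      *a = (n * sxy - sx * sy) / (n * sxx - sx * sx);
      *b = (sy - *a * sx) / n;
  }
  /* e.g. with x = rm1 [mm] and y = d12 [us], a fit of this form gives the
     regression line <d12> = 0.27*rm1 - 4.55 reported for figure 4 below */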
in figure 4, the variation of the time distance d12 between the two optical pulses u2(t) and u1(t) with the bubble size RM1 is shown. the regression line for the mean value of d12 in dependence on RM1 is ⟨d12⟩ = 0.27·RM1 − 4.55 [µs, mm]. it can be seen that d12 is only very weakly correlated with RM1 and that the dispersion of d12 increases with RM1. the variation of d12 with RM1 agrees with the correlation between d12 and ∆ shown later in figure 8 (d12 grows with ∆) and with the correlation between ∆ and RM1 shown in figure 9 in [24] (∆ grows with the bubble size as ∼ RM1^3.3). however, the quantities d12 and RM1 are only weakly correlated. this is in contrast to the moderate correlation of d12 with ∆, shown in figure 8, and of ∆ with RM1, shown in figure 9 in [24]. in figure 5, the variation of the time difference δ2 between the radiation of the second light pulse u2(t) and the pressure pulse p1(t) with the bubble size RM1 is shown. the regression line for the mean value of δ2 in dependence on RM1 is ⟨δ2⟩ = 0.097·RM1 − 1.12 [µs, mm]. it can be seen that δ2 is only weakly correlated with RM1 and that the dispersion of δ2 increases with RM1. the time difference δ2 grows with the bubble size RM1. this is also in agreement with the variation of the distance d12 with the pulse width ∆ given in figure 8 (the distance d12 is part of ∆ and equals d12 = δ2 − δ1), and it is in agreement with the previous results concerning the variation of the pulse width ∆ with RM1 (figure 9 in [24]). it can be seen in figure 5 that δ2 was positive in all the experiments reported here, which means that the second light flashes were always (with the exception of a single experiment, in which δ2 = 0) radiated some µs after the time tc1, when the bubbles were contracted to their minimum volumes. however, as can be seen in figure 7 in [25], the time difference δ1 between the radiation of the first optical pulse u1(t) and the pressure pulse p1(t) was negative in the prevailing number of experiments. this means that the first light flashes u1(t) are usually radiated a few µs before the bubbles have contracted to their minimum volumes at tc1, in contrast to the second light flashes u2(t), which were radiated some µs after the bubbles have contracted to their minimum volumes. in figure 6, the variation of the maximum voltage um2 in the optical pulse u2(t) with the bubble size RM1 is shown. the regression quadratic polynomial for the mean value of um2 in dependence on RM1 is ⟨um2⟩ = 2.3×10^−4·RM1^2 − 6.2×10^−3·RM1 + 3.9×10^−2 [mv, mm].

figure 1. detailed view of the pulse u1(t) at the output of the optical detector. the spark-generated bubble has a size of RM1 = 49 mm and oscillates with an intensity of pzp1 = 142.1. the time axis origin is set at tu1, and only a small portion of the pulse u1(t) near tu1 is shown. the width of this pulse is ∆ = 9.4 µs, and the difference between the times tu1 and tp1 is δ1 = −2.6 µs.

figure 2. detailed view of the pulses u1(t) and u2(t) at the output of the optical detector. the spark-generated bubble has a size of RM1 = 53.6 mm and oscillates with an intensity of pzp1 = 98.0. the time axis origin is set at tu1, and only a small portion of the two pulses near tu1 is shown. the width of the two partially merged pulses is ∆ = 83.6 µs, and the difference between the times tu1 and tp1 is δ1 = −11 µs.
figure 3. example of the optical and pressure waves in the vicinity of the times tu1 and tp1 (to display both waves at comparable sizes, the wave u(t) is shown in arbitrary units). the bubble size is RM1 = 56.5 mm, and the intensity of the bubble oscillation is pzp1 = 127.3. the time differences in the occurrence of the maxima in both optical pulses with respect to the pressure pulse are δ1 = −7.0 µs and δ2 = 4.6 µs. the time axis origin is set at tp1, and only small portions near tu1, tp1 and tu2 are shown from the optical record u(t) and the acoustic record p(t).

figure 4. variation of the time distance d12 between the first and second optical pulses with the bubble size RM1.

figure 5. variation of the time difference δ2 between the occurrence of the second optical pulse and the acoustic pulse with the bubble size RM1.

it can be seen that um2 is weakly correlated with RM1 and grows with the bubble size as ∼ RM1^2. in figure 7, the variation of the ratio um2/um1 with the bubble size RM1 is shown. the regression line for the mean value of the ratio um2/um1 in dependence on RM1 is ⟨um2/um1⟩ = −0.004·RM1 + 0.87 [-, mm]. it can be seen that the ratio um2/um1 is correlated with RM1 only very weakly and that, in most of the experiments, um1 > um2. the ratio um2/um1 is almost independent of the bubble size RM1. this is in agreement with the fact that um2 grows with RM1 as ∼ RM1^2 (figure 6) and um1 grows with RM1 as ∼ RM1^2.5 [24]. the variation of the time distance d12 between the optical pulses u2(t) and u1(t) with the pulse width ∆ is shown in figure 8. the regression line for the mean value of d12 in dependence on ∆ is ⟨d12⟩ = 0.15·∆ + 1.62 [µs, µs]. it can be seen that d12 is moderately correlated with ∆. the moderate correlation between d12 and ∆ is what could be expected, namely that the distance d12 is larger for broader pulses (larger ∆) and smaller for narrower pulses (smaller ∆). the variation of the time difference δ1 between the first light pulse u1(t) and the pressure pulse p1(t) with the time distance d12 between the optical pulses u2(t) and u1(t) is shown in figure 9. the regression line for the mean value of δ1 in dependence on d12 is ⟨δ1⟩ = −0.79·d12 + 1.65 [µs, µs]. it can be seen that δ1 is moderately correlated with d12. as can also be observed, the bubbles with a larger time distance d12 between the optical pulses u2(t) and u1(t) radiate the first optical pulse earlier before the bubble is contracted to its minimum volume at tc1. this finding can be compared with figure 7 in [25], where the variation of δ1 with RM1 is given and where it is shown that for larger bubbles, δ1 is larger (in absolute value) too. and as shown in figure 9 in [24], for larger bubbles, the widths ∆ are also larger. finally, as shown in figure 8, the distance d12 grows with the pulse width ∆; hence it can be expected that δ1, in absolute value, will grow with d12 as well, a fact that is confirmed in figure 9. in conclusion, it may be said that the variations of the parameters discussed above are in agreement with the results set forth in references [24, 25].
unfortunately, at the current state of knowledge of the processes taking place in oscillating bubbles, no deeper physical explanation of the observed correlations is possible. the processes that may be responsible for the observed phenomena are briefly discussed in the following section.

4. discussion
although the light emission from oscillating bubbles has been intensively studied in many laboratories for several decades (see, e.g., works [8–13, 15–18, 20, 21, 26], and the recent review by borisenok [22]), only very limited quantitative experimental data are available at present, and therefore an understanding of the physical or chemical processes taking place in oscillating bubbles is currently very difficult. in the review paper [22], many theories trying to explain the light emission from oscillating bubbles are mentioned, but none of the theoretical models can explain the experimental data presented in this work and in references [24, 25, 27, 28]. for example, as can be seen in figures 2, 3 and 5, the shape of the pulses u1(t) and u2(t) and their timing with respect to the bubble wall motion at first sight exclude the "hot spot" theory preferred by most researchers [22]. and both the shape of the pulse p1(t) (a single pulse) and its position relative to the pulses u1(t) and u2(t) exclude the explanation of the two light flashes by the bubble splitting into two parts in the final stages of the contraction and the early stages of the following expansion.

figure 6. variation of the maximum voltage um2 in the second optical pulse with the bubble size RM1.

figure 7. variation of the ratio um2/um1 of the maximum voltages in the first and second optical pulses with the bubble size RM1.

and although the values of the parameters d12, δ1, δ2, um1, um2, and ∆ may vary in different experiments (see figures 4–9), the shapes of the pulses u1(t) and u2(t) were always similar to the shapes shown in figures 2 and 3. this also contradicts the possible explanation that the bubble was split into several parts, because such a splitting can be expected to be random. from the optical records on which both pulses u1(t) and u2(t) are simultaneously present, it is evident that there are two physical or chemical processes taking place in a bubble. the first process is responsible for the emission of the optical pulse u1(t); the second process is responsible for the emission of the optical pulse u2(t). based on the experimental data published in references [24, 25, 27, 28], where the long-lasting and very autonomously behaving plasma in bubbles was described, the author of this work came to the conclusion that in the interior of spark-generated bubbles, during their first oscillation, plasmoids are present (and not the usual plasma) and that these plasmoids are responsible for the emission of the light pulses u1(t) [25] (the presence of these plasmoids in the bubble interior is clearly seen, for example, in the images presented in figure 2 in [27]). as already mentioned in [25], similar nonstandard plasma generated by electric discharges in water has been studied, e.g., by egorov et al. [29].
these authors talk about autonomous glowing plasma (or about plasmoids) and refer to the works of shevkunov [30, 31], where the process of interaction between h2o molecules and h+ and oh− ions in air containing water vapour is modelled to explain the phenomenon of long-lasting and glowing plasma.

figure 8. variation of the time distance d12 between the first and second optical pulses with the optical pulse width ∆. bubble sizes: 'o' RM1 > 50 mm, 'x' 50 mm ≥ RM1 > 40 mm, '+' 40 mm ≥ RM1 > 30 mm, '*' 30 mm ≥ RM1 > 20 mm, '.' 20 mm ≥ RM1.

figure 9. variation of the time difference δ1 in the occurrence of the first optical pulse and the pressure pulse with the time distance d12 between the first and second optical pulses. bubble sizes: 'o' RM1 > 50 mm, 'x' 50 mm ≥ RM1 > 40 mm, '+' 40 mm ≥ RM1 > 30 mm, '*' 30 mm ≥ RM1 > 20 mm, '.' 20 mm ≥ RM1.
thus, the amount of substances entering into the chemical reaction will be proportional to the bubble size rm 1. however, nothing else can be deduced from the available experimental data at the present time. and no other quantitative data are available in the literature [9, 12, 13, 26]. after a careful observation of the shapes and widths of the light pulses u2(t) and after closer examination of the data given in figure 7, we came to a conclusion that under certain circumstances, such as those occurring in laser-generated bubbles and in bubbles that are oscillating in acoustic resonators, the emission of light caused by the second process can be significantly increased compared to the emission caused by the first process. in that case, the value of um 2 may be much higher than the value of um 1. if this assumption proves to be correct, then the light pulses observed in works [9, 12, 13, 26] are identical with the light pulses u2(t) observed in the present study. finally, it may be useful to compare different types of oscillating bubbles from the point of view of light emission. as mentioned in the introduction, beside the spark-generated bubbles studied here, other types of bubbles were also intensively used to investigate light emission. most extensive data on light pulses were obtained when studying laser-generated bubbles [8–10, 12, 13], and in the case of bubbles oscillating in acoustic resonators [22]. when comparing the experimental data on light emission from sparkgenerated bubbles with data measured with other types of bubbles, certain similarities and certain differences can be observed. the similarities can be seen, for example, in the values of the maximum surface temperatures of the emitting plasma. for different types of bubbles, these temperatures range from 4300 k to 8700 k [10, 18, 27, 28] (only bubbles oscillating in water under ordinary laboratory conditions are compared here). authors of works [9, 12, 13] also mention the great scatter of the optical pulse maximum values, of the pulse widths, and of the pulse shapes. in the case of laser-generated bubbles [9, 12, 13], the multiple peaks in light flashes were also observed. and moran and sweider [26], who were studying bubbles in an acoustic resonator, also reported the occurrence of the first light pulse followed by a small second pulse, which they called “afterpulse”. finally, to close the discussion of the similarities, let us mention that even in the article of baghdassarian et al. [8], the light emission during the whole first bubble oscillation to1 can be seen in the published figure 1. however, there are also differences between the light flashes emitted from spark-generated bubbles and other types of bubbles. these differences can be seen, for example, in the shape of the light pulses and in the variation of the light pulse widths ∆ with the bubble size rm 1. the shapes of the light pulses observed in the case of laser-generated bubbles and bubbles oscillating in acoustic resonators [8–10, 12, 26] are “gaussian” and pulse widths increase almost linearly with the bubble size rm 1 [8–10]. unlike these observations, the shape of the light pulses u1(t) observed in our experiments is not “gaussian” (see, e.g. figures 1, 2, and 3) and pulse widths increase with the bubble size as ∼ r3,3m 1 [24]. in some experiments, the number of observed pulses was also greater than two. for example, ohl [9] observed up to three local maxima before the main maximum and denoted these local maxima as precursors. 
as an explanation for these local maxima, he suggested that “the precursors originate from different hot spots; either a strongly 276 vol. 60 no. 3/2020 on the second light flash emitted from a spark-generated bubble. . . inhomogeneous bubble interior, or a splitting of the bubble into parts”. sukovich et al. [12] also observed that “many events were shown to have multiple peaks in the emission curve for a single event”. and as an explanation of this, they said that “this likely suggests nonuniformities in either pressure or bubble distribution in the collapse region or that the conditions requisite for emissions are probabilistic in nature and so may occur at any point in space or time in the region so long as conditions are above some threshold value”. finally, supponen et al. [13] reported that “the number of peaks in the photodetector signals varies between one and four, suggesting multiple events yielding light emission”. the last-mentioned authors did not suggest any closer explanation for the origin of the events. unfortunately, due to the lack of suitable experimental data, the causes of the differences between the spark-generated and laser-generated bubbles cannot be explained in greater detail at present. 5. conclusion in this work, the second light flashes emitted from the spark-generated bubbles in the final stages of the first bubble contraction and early stages of the following expansion were studied in detail. to obtain the necessary time information, optical waves u(t) and acoustic waves p(t) had to be simultaneously recorded. the large size of the generated bubbles also proved advantageous. to explain the existence of the observed two light flashes, two independent processes taking place in bubbles are assumed. the first process is responsible for the light emission during the whole first bubble oscillation, and it is believed that the nature of this process is similar to the process running in plasmoids. the second process, which is responsible for the second light flashes, is assumed to be a physical (or chemical) reaction of the plasmoid components. experimental investigation of these processes taking place in bubbles under very high pressures and temperatures in a very limited space and lasting an extremely short time will require the development of new experimental techniques. also, the very low reproducibility of spark-generated bubbles must be overcome. and the new technique should avoid averaging, unfortunately so common in studies of light emission from bubbles. acknowledgements this work was partly supported by the ministry of education of the czech republic as a research project msm 245 100 304. the experimental part of this work was carried out during the author’s stay at the underwater acoustics laboratory of the italian acoustics institute, cnr, rome, italy. the author wishes to thank dr. silvano buogo of the cnr-insean marine technology research institute, rome, italy, for his very valuable help in preparing the experiments. references [1] g. l. chahine, a. gnanaskandan, a. mansouri, et al. interaction of a cavitation bubble with a polymeric coating–scaling fluid and material dynamics. international journal of multiphase flow 112:155 – 169, 2019. doi:10.1016/j.ijmultiphaseflow.2018.12.014. [2] p. a. dayton, j. s. allen, k. w. ferrara. the magnitude of radiation force on ultrasound contrast agents. the journal of the acoustical society of america 112(5):2183–2192, 2002. doi:10.1121/1.1509428. [3] v. sboros. response of contrast agents to ultrasound. 
advanced drug delivery reviews 60:1117–1136, 2008. doi:10.1016/j.addr.2008.03.011.
[4] e. p. stride, c. c. coussios. cavitation and contrast: the use of bubbles in ultrasound imaging and therapy. proceedings of the institution of mechanical engineers, part h: journal of engineering in medicine 224(2):171–191, 2010. doi:10.1243/09544119jeim622.
[5] d. thomas, m. butler, n. pelekasis, et al. the acoustic signature of decaying resonant phospholipid microbubbles. physics in medicine and biology 58:589–599, 2013. doi:10.1088/0031-9155/58/3/589.
[6] t. segers, n. de jong, m. versluis. uniform scattering and attenuation of acoustically sorted ultrasound contrast agents: modeling and experiments. the journal of the acoustical society of america 140:2506–2517, 2016. doi:10.1121/1.4964270.
[7] b. ward, d. emmony. interferometric studies of the pressures developed in a liquid during infrared-laser-induced cavitation-bubble oscillation. infrared physics 32:489–515, 1991. doi:10.1016/0020-0891(91)90138-6.
[8] o. baghdassarian, b. tabbert, g. a. williams. luminescence characteristics of laser-induced bubbles in water. physical review letters 83:2437–2440, 1999. doi:10.1103/physrevlett.83.2437.
[9] c.-d. ohl. probing luminescence from nonspherical bubble collapse. physics of fluids 14:2700–2708, 2002. doi:10.1063/1.1489682.
[10] e. brujan, d. hecht, f. lee, g. williams. properties of luminescence from laser-created bubbles in pressurized water. physical review e, statistical, nonlinear, and soft matter physics 72:066310, 2005. doi:10.1103/physreve.72.066310.
[11] c. frez, g. diebold. laser generation of gas bubbles: photoacoustic and photothermal effects recorded in transient grating experiments. the journal of chemical physics 129:184506, 2008. doi:10.1063/1.3003068.
[12] j. r. sukovich, a. sampathkumar, p. a. anderson, et al. temporally and spatially resolved imaging of laser-nucleated bubble cloud sonoluminescence. physical review e, statistical, nonlinear, and soft matter physics 85:056605, 2012. doi:10.1103/physreve.85.056605.
[13] o. supponen, d. obreschkow, p. kobel, m. farhat. luminescence from cavitation bubbles deformed in uniform pressure gradients. physical review e, statistical, nonlinear, and soft matter physics 96:033114, 2017. doi:10.1103/physreve.96.033114.
[14] y. sun, i. v. timoshkin, m. j. given, et al. acoustic impulses generated by air-bubble stimulated underwater spark discharges. ieee transactions on dielectrics and electrical insulation 25(5):1915–1923, 2018. doi:10.1109/tdei.2018.007293.
[15] p. golubnichy, y. krutov, e. nikitin, d. reshetnyak. conditions accompanying formation of long-living luminous objects from dissipating plasma of electric discharge in water. voprosy atomnoy nauki i tekhniki pp. 143–146, 2008.
[16] y. huang, h. yan, b. wang, et al. the electroacoustic transition process of pulsed corona discharge in conductive water.
journal of physics d: applied physics 47:255204, 2014. doi:10.1088/0022-3727/47/25/255204.
[17] y. huang, l. zhang, j. chen, et al. experimental observation of the luminescence flash at the collapse phase of a bubble produced by pulsed discharge in water. applied physics letters 107:4104, 2015. doi:10.1063/1.4935206.
[18] l. zhang, x. zhu, h. yan, et al. luminescence flash and temperature determination of the bubble generated by underwater pulsed discharge. applied physics letters 110:034101, 2017. doi:10.1063/1.4974452.
[19] a. hajizadeh aghdam, b. khoo, v. farhangmehr, m. t. shervani-tabar. experimental study on the dynamics of an oscillating bubble in a vertical rigid tube. experimental thermal and fluid science 60:299–307, 2015. doi:10.1016/j.expthermflusci.2014.09.017.
[20] i. ko, h.-y. kwak. measurement of pulse width from a bubble cloud under multibubble sonoluminescence conditions. journal of the physical society of japan 79:124401, 2010. doi:10.1143/jpsj.79.124401.
[21] t. g. leighton, m. farhat, j. e. field, f. avellan. cavitation luminescence from flow over a hydrofoil in a cavitation tunnel. journal of fluid mechanics 480:43–60, 2003. doi:10.1017/s0022112003003732.
[22] v. borisenok. sonoluminescence: experiments and models (review). acoustical physics 61:308–332, 2015. doi:10.1134/s1063771015030057.
[23] s. buogo, k. vokurka. intensity of oscillation of spark-generated bubbles. journal of sound and vibration 329:4266–4278, 2010. doi:10.1016/j.jsv.2010.04.030.
[24] k. vokurka, s. buogo. experimental study of light emitted by spark-generated bubbles in water. the european physical journal applied physics 81:11101, 2018. doi:10.1051/epjap/2017170332.
[25] k. vokurka. the time difference in emission of light and pressure pulses from oscillating bubbles. acta polytechnica 58:323, 2018. doi:10.14311/ap.2018.58.0323.
[26] m. j. moran, d. sweider. measurements of sonoluminescence temporal pulse shape. physical review letters 80:4987–4990, 1998. doi:10.1103/physrevlett.80.4987.
[27] k. vokurka. experimental determination of temperatures in spark-generated bubbles oscillating in water. acta polytechnica 57:149–158, 2017. doi:10.14311/ap.2017.57.0149.
[28] k. vokurka, j. plocek. experimental study of the thermal behavior of spark generated bubbles in water. experimental thermal and fluid science 51:84–93, 2013. doi:10.1016/j.expthermflusci.2013.07.004.
[29] a. egorov, s. stepanov. properties of short-lived ball lightning produced in the laboratory. technical physics 53:688–692, 2008. doi:10.1134/s1063784208060029.
[30] s. shevkunov. cluster mechanism of the energy accumulation in a ball electric discharge. doklady physics 46:467–472, 2001. doi:10.1134/1.1390398.
[31] s. shevkunov. scattering of centimeter radiowaves in a gas ionized by radioactive radiation: cluster plasma formation. journal of experimental and theoretical physics 92:420–440, 2001. doi:10.1134/1.1364740.
acta polytechnica 56(1):18–26, 2016. doi:10.14311/app.2016.56.0018

ad hoc teamwork behaviors for influencing a flock
katie genter∗, peter stone
department of computer science, the university of texas at austin, austin, tx, usa 78712
∗ corresponding author: katie@cs.utexas.edu

abstract. ad hoc teamwork refers to the challenge of designing agents that can influence the behavior of a team, without prior coordination with its teammates. this paper considers influencing a flock of simple robotic agents to adopt a desired behavior within the context of ad hoc teamwork. specifically, we examine how the ad hoc agents should behave in order to orient a flock towards a target heading as quickly as possible when given knowledge of, but no direct control over, the behavior of the flock. we introduce three algorithms which the ad hoc agents can use to influence the flock, and we examine the relative importance of coordinating the ad hoc agents versus planning farther ahead when given fixed computational resources. we present detailed experimental results for each of these algorithms, concluding that in this setting, inter-agent coordination and deeper lookahead planning are no more beneficial than short-term lookahead planning.

keywords: ad hoc teamwork; agent cooperation; coordination; flocking.

1. introduction
consider a flock of migrating birds that is flying directly towards a dangerous area, such as an airport or a wind farm. it will be better for both the flock and the humans if the path of the migratory birds is altered slightly, such that the flock can avoid the dangerous area but still reach its migratory point at approximately the same time. the above scenario is a motivating example for our work on orienting a flock using ad hoc teamwork. we assume that each bird in the flock dynamically adjusts its heading based on that of its immediate neighbors. we assume further that we control one or more ad hoc agents — perhaps in the form of robotic birds or ultralight aircraft (www.operationmigration.org) — that are perceived by the rest of the flock as one of their own.

flocking is an emergent behavior found in different species in nature, including flocks of birds, schools of fish, and swarms of insects. in each of these cases, the animals follow a simple local behavior rule that results in a group behavior that appears well organized and stable. research on flocking behavior has appeared in various disciplines such as physics [1], graphics [2], biology [3, 4], and distributed control theory [5–7].
in each of these disciplines, the research has focused mainly on characterizing the emergent behavior. in this paper, we consider the problem of leading a team of flocking agents in an ad hoc teamwork setting. an ad hoc teamwork setting is one in which a teammate — which we call an influencing agent — must determine how to best achieve a team goal given a set of possibly suboptimal teammates. in this work, we are given a team of flocking agents following a known, well-defined rule characterizing their flocking behavior, and we wish to examine how the influencing agents should behave. specifically, the main question addressed in this paper is: how should influencing agents behave so as to orient the rest of the flock towards a target heading as quickly as possible?

the remainder of this paper is organized as follows. section 2 introduces our problem and the necessary terminology for this paper. the main contribution of this paper is the 1-step lookahead algorithm for orienting a flock to travel in a particular direction. this algorithm is presented in section 3, while variations of this algorithm are presented in sections 4 and 5. we present the results of running experiments using these algorithms in the mason simulator [8] in section 6. section 7 situates this research in the literature, and section 8 concludes.

2. problem definition
in this work we use a simplified version of reynolds' boid algorithm for flocking [2]. we assume that each agent calculates its orientation for the next time step to be the average heading of its neighbors. throughout this paper, an agent's neighbors are the agents located within some set radius of the agent. in order to calculate its orientation for the next time step, each agent computes the vector sum of the velocity vectors of each of its neighbors and adopts a scaled version of the resulting vector as its new orientation. an agent is not considered to be a neighbor of itself, so an agent's current heading is not considered when calculating its orientation for the next time step. figure 1 shows an example of how an agent's new velocity vector is calculated. at each time step, each agent moves one step in the direction of its current vector and then calculates its new heading based on those of its neighbors, keeping a constant speed.

figure 1. an example of how an agent's new velocity vector would be calculated. in this example, the black dot represents the agent in question, the solid arrows represent the velocity vectors of the agent's neighbors, and the dotted circle represents the area of the agent's neighborhood. the agent's new velocity vector is calculated as shown at the bottom of the figure — in this calculation, the three vectors are first summed and then scaled to maintain constant speed.

over time, agents behaving as described above will gather into one or more groups, and these groups will each travel in some direction. however, in this work we add a small number of influencing agents to the flock. these influencing agents attempt to influence the flock to travel in a pre-defined direction — throughout this paper we refer to this desired direction as θ∗. note that the challenge of designing influencing agent behaviors in a dynamic flocking system is difficult because the action space is continuous.
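the update rule just described is compact enough to state in code. the following python sketch is our own illustration, not code from the paper: the function and variable names are ours, and the toroidal wrap-around of the mason domain (introduced in section 2.1 below) is ignored here for brevity.

import numpy as np

def new_velocity(i, positions, velocities, radius, speed):
    # next-step velocity of agent i under the simplified reynolds rule used
    # in this paper: sum the velocity vectors of all neighbors within the
    # given radius (the agent itself is excluded) and rescale the sum to
    # the constant agent speed.
    dists = np.linalg.norm(positions - positions[i], axis=1)
    mask = dists <= radius
    mask[i] = False                      # an agent is not its own neighbor
    if not mask.any():
        return velocities[i]             # no neighbors: keep current heading
    summed = velocities[mask].sum(axis=0)
    norm = np.linalg.norm(summed)
    if norm == 0.0:
        return velocities[i]
    return speed * summed / norm         # scaled to maintain constant speed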
to cope with this continuous action space, in our work we make the simplifying assumption of only considering a limited number (numangles) of discrete angle choices for each influencing agent.

2.1. simulation environment
we situate our research on flocking using ad hoc teamwork within the mason simulator, a concrete simulation environment [8]. a picture of the flockers domain is shown in figure 2. each agent points and moves in the direction of its current velocity vector. the mason flockers domain is toroidal, so agents that move off one edge of our domain reappear on the opposite edge, moving in the same direction. we conclude that the flock has converged to θ∗ when every agent (that is not an influencing agent) is facing within 0.1 radians of θ∗. other stopping criteria, such as when 90 % of the agents are facing within 0.1 radians of θ∗, could also have been utilized.

figure 2. pictures of (a) the start of a trial and (b) the end of a trial in the mason flockers simulation environment. the grey agents are influencing agents, while the black agents are other members of the flock.

3. 1-step lookahead behavior
in this section we present algorithm 1, a 1-step lookahead algorithm for determining the individual behavior of each influencing agent. this algorithm considers all of the influences on the neighbors of the influencing agent at a particular point in time, such that the influencing agent can determine the best orientation to adopt based on this information. the 1-step lookahead algorithm is a greedy, myopic approach for determining the best individual behavior for each influencing agent, where 'best' is defined as the behavior that will exert the most influence on the next time step. note that if the algorithm only considered the current orientations of the neighbors (instead of the influences on these neighbors) when determining the next orientation for the influencing agent to adopt, it would only be estimating the state of each neighbor, and hence the resulting orientation adopted by the influencing agent would not be 'best'.

the variables used throughout algorithm 1 are defined in table 1.

table 1. variables used in algorithm 1.
bestdiff: the smallest difference found so far between the average orientation vectors of neighofia and θ∗
bestorient: the vector representing the orientation adopted by the influencing agent to obtain bestdiff
neighofia: the neighbors of the influencing agent
norient: the predicted next-step orientation vector of neighbor n of the influencing agent if the influencing agent adopts iaorient
norients: a set of the predicted next-step orientation vectors of all of the neighbors of the influencing agent, assuming the influencing agent adopts iaorient

algorithm 1 bestorient = 1steplookahead(neighofia)
1: bestorient ← (0, 0)
2: bestdiff ← ∞
3: for each influencing agent orientation vector iaorient do
4:   norients ← ∅
5:   for n ∈ neighofia do
6:     norient ← (0, 0)
7:     for n′ ∈ n.neighbors do
8:       if n′ is an influencing agent then
9:         norient ← norient + iaorient
10:      else
11:        norient ← norient + n′.vel
12:    norient ← norient / |n.neighbors|
13:    norients ← {norient} ∪ norients
14:  diff ← avg diff between vectors norients and θ∗
15:  if diff < bestdiff then
16:    bestdiff ← diff
17:    bestorient ← iaorient
18: return bestorient

two functions are used in algorithm 1: neighbor.vel returns the velocity vector of neighbor, and neighbor.neighbors returns a set containing the neighbors of neighbor.
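for readers who prefer runnable code over pseudocode, algorithm 1 can be rendered in python roughly as follows. this is a minimal sketch under our own assumptions: agents are objects with .neighbors, .vel and .is_influencing attributes, and the 'avg diff' of line 14 is taken to be the mean angle between each predicted orientation and θ∗ (the paper leaves the exact distance measure implicit). since neighborhoods are symmetric, every agent considered here has at least one neighbor.

import numpy as np

def one_step_lookahead(ia_neighbors, target, num_angles=50):
    # target is the unit vector pointing along theta*
    best_orient, best_diff = np.zeros(2), np.inf
    for a in 2 * np.pi * np.arange(num_angles) / num_angles:
        ia_orient = np.array([np.cos(a), np.sin(a)])
        diffs = []
        for n in ia_neighbors:
            n_orient = np.zeros(2)
            for m in n.neighbors:
                # every influencing agent is assumed to adopt ia_orient (line 9)
                n_orient = n_orient + (ia_orient if m.is_influencing else m.vel)
            n_orient = n_orient / len(n.neighbors)
            # angular deviation of the predicted heading from theta*
            c = np.dot(n_orient, target) / (np.linalg.norm(n_orient) + 1e-12)
            diffs.append(np.arccos(np.clip(c, -1.0, 1.0)))
        diff = np.mean(diffs)
        if diff < best_diff:
            best_diff, best_orient = diff, ia_orient
    return best_orient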
note that algorithm 1 is called on each influencing agent at each time step, and that the neighbors of the influencing agent at that time step are provided as a parameter to the algorithm. the output from the algorithm is the orientation that, if adopted by this influencing agent, is predicted to influence its neighbors to face closer to θ∗ than any of the other numangles discrete influencing orientations considered.

conceptually, algorithm 1 is concerned with how the neighbors of the influencing agent are influenced if the influencing agent adopts a particular orientation at this time step. algorithm 1 considers each of the numangles discrete influencing agent orientation vectors. for each orientation vector, the algorithm considers how each of the neighbors of the influencing agent will be influenced if the influencing agent adopts that orientation vector (lines 3–13). hence, algorithm 1 considers all of the neighbors of each neighbor of the influencing agent (lines 7–11) — if the neighbor of the neighbor of the influencing agent is an influencing agent, the algorithm assumes that it has the same orientation as the influencing agent (even though, in fact, each influencing agent orients itself based on a different set of neighbors; line 9). on the other hand, if it is not an influencing agent, the algorithm calculates its orientation vector based on its current velocity (line 11). using this information, the algorithm calculates how each neighbor of the influencing agent will be influenced by averaging the orientation vectors of each neighbor's neighbors (lines 12–13). the algorithm then picks the influencing agent orientation vector that results in the least difference between θ∗ and the neighbors' predicted orientation vectors (lines 14–18).

if there are numagents agents in the flock, the worst-case complexity of algorithm 1 is calculated as follows. line 3 executes numangles times, line 5 executes at most numagents times, and line 7 executes at most numagents times. hence, the complexity of algorithm 1 is o(numangles · numagents²). results regarding how algorithm 1 performs in terms of the number of time steps needed for the flock to converge to θ∗ can be found in section 6.

4. 2-step lookahead behavior
whereas the 1-step lookahead behavior presented in the previous section optimizes each influencing agent's orientation to best influence its neighbors on the next step, it fails to consider more long-term effects. hence, in this section we present a 2-step lookahead behavior in algorithm 2. this 2-step lookahead behavior considers influences on the neighbors of the neighbors of the influencing agent, such that the influencing agent can make a more informed decision when determining the best orientation to adopt. the variables used in algorithm 2 that were not used in algorithm 1 are defined in table 2.

table 2. variables used in algorithm 2 that were not used in algorithm 1.
n′orient: the predicted next-step orientation vector of a neighbor n′ of a neighbor of the influencing agent if the influencing agent adopts iaorient
norient2: the predicted '2 steps in the future' orientation vector of neighbor n of the influencing agent if the influencing agent adopts iaorient on the first time step and iaorient2 on the second time step
norients2: a set containing the predicted '2 steps in the future' orientation vectors of all of the neighbors of the influencing agent, assuming the influencing agent adopts iaorient on the first time step and iaorient2 on the second time step
like algorithm 1, algorithm 2 is called on each influencing agent at each time step, takes in the neighbors of the influencing agent at that time step, and returns the orientation that, if adopted by this influencing agent, will influence the flock to face closer to θ∗ than any of the other numangles influencing orientations considered.

conceptually, algorithm 2 is concerned with (1) how the neighbors of each neighbor of the influencing agent are influenced if the influencing agent adopts a particular orientation at this time step (lines 5–13 in algorithm 2), and (2) how the neighbors of the neighbors of each neighbor of the influencing agent are influenced if the influencing agent adopts a particular orientation at this time step (lines 19–25 in algorithm 2), since they will influence the neighbors of each neighbor of the influencing agent on the next time step (lines 16–31 in algorithm 2).

algorithm 2 bestorient = 2steplookahead(neighofia)
1: bestorient ← (0, 0)
2: bestdiff ← ∞
3: for each influencing agent orientation iaorient do
4:   norients ← ∅
5:   for n ∈ neighofia do
6:     norient ← (0, 0)
7:     for n′ ∈ n.neighbors do
8:       if n′ is an influencing agent then
9:         norient ← norient + iaorient
10:      else
11:        norient ← norient + n′.vel
12:    norient ← norient / |n.neighbors|
13:    norients ← {norient} ∪ norients
14:  for each influencing agent orientation iaorient2 do
15:    norients2 ← ∅
16:    for n ∈ neighofia do
17:      norient2 ← (0, 0)
18:      for n′ ∈ n.neighbors do
19:        n′orient ← (0, 0)
20:        for n″ ∈ n′.neighbors do
21:          if n″ is an influencing agent then
22:            n′orient ← n′orient + iaorient
23:          else
24:            n′orient ← n′orient + n″.vel
25:        n′orient ← n′orient / |n′.neighbors|
26:        if n′ is an influencing agent then
27:          norient2 ← norient2 + iaorient2
28:        else
29:          norient2 ← norient2 + n′orient
30:      norient2 ← norient2 / |n.neighbors|
31:      norients2 ← {norient2} ∪ norients2
32:    diff ← the avg diff between vectors norients and θ∗ and between vectors norients2 and θ∗
33:    if diff < bestdiff then
34:      bestdiff ← diff
35:      bestorient ← iaorient
36: return bestorient

algorithm 2 starts by considering each of the numangles discrete influencing agent orientation vectors, and by considering how each of the neighbors of the influencing agent will be influenced if the influencing agent adopts that particular orientation vector. for each neighbor of the influencing agent, this requires considering all of its neighbors and calculating how each neighbor of the influencing agent will be influenced on the first time step (lines 5–13). next, algorithm 2 considers the effect of the influencing agent adopting each of the numangles influencing agent orientation vectors on a second time step (lines 14–31). as before, this requires considering all of the neighbors of each neighbor of the influencing agent, and calculating how each neighbor of the influencing agent will be influenced (lines 18–31). however, in order to do this the algorithm must first consider how the neighbors of the neighbors of the influencing agent were influenced by their neighbors on the first time step (lines 20–25). finally, algorithm 2 selects the first-step influencing agent orientation vector that results in the least difference between θ∗ and the neighbors' orientation vectors after both the first and second time steps (lines 32–36).

in algorithm 2 we make the simplifying assumption that agents do not change neighborhoods within the horizon of our planning. due to the fact that movements are relatively small with respect to each agent's neighborhood size, the effects of this simplification are negligible for the relatively small number of future steps that the 2-step lookahead behavior considers.
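a compact python sketch of algorithm 2 follows, under the same conventions as the sketch of algorithm 1 above (our own illustration; agent objects with .neighbors, .vel and .is_influencing, angular deviation as the distance measure) and with the fixed-neighborhood simplification just described.

import numpy as np

def two_step_lookahead(ia_neighbors, target, num_angles=50):
    cands = [np.array([np.cos(a), np.sin(a)])
             for a in 2 * np.pi * np.arange(num_angles) / num_angles]

    def ang(v):  # angular deviation of vector v from the target direction
        c = np.dot(v, target) / (np.linalg.norm(v) + 1e-12)
        return np.arccos(np.clip(c, -1.0, 1.0))

    def predicted(agent, ia_orient):
        # one-step prediction: influencing agents adopt ia_orient (lines 20-24)
        s = np.zeros(2)
        for m in agent.neighbors:
            s = s + (ia_orient if m.is_influencing else m.vel)
        return s / len(agent.neighbors)

    best, best_diff = np.zeros(2), np.inf
    for o1 in cands:
        step1 = [predicted(n, o1) for n in ia_neighbors]          # lines 5-13
        for o2 in cands:                                          # lines 14-31
            step2 = []
            for n in ia_neighbors:
                acc = np.zeros(2)
                for m in n.neighbors:
                    acc = acc + (o2 if m.is_influencing else predicted(m, o1))
                step2.append(acc / len(n.neighbors))
            diff = np.mean([ang(v) for v in step1 + step2])       # line 32
            if diff < best_diff:
                best_diff, best = diff, o1                        # line 35
    return best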
the complexity of algorithm 2 can be calculated as follows. line 3 executes numangles times, line 14 executes at most numangles times, line 16 executes at most numagents times, line 18 executes at most numagents times, and line 20 executes at most numagents times. hence, the complexity of algorithm 2 is o(numangles² · numagents³).

5. coordinated behavior
the influencing agent behaviors presented in sections 3 and 4 were for individual influencing agents, where each influencing agent calculated its behavior independently of any other influencing agents. in this section, we consider whether influencing agents can exert more influence on the flock by working in a coordinated fashion. in particular, coordination is potentially useful in cases where a flocking agent is in the neighborhoods of multiple influencing agents.

ideally, all of the influencing agents would coordinate their behaviors to influence the flock to reach θ∗ as quickly as possible. however, in this work this is infeasible due to the computational complexity of such a calculation. instead, we pair influencing agents that share some neighbors. these pairs then work in a coordinated fashion to influence their neighbors to orient towards θ∗. we opted to use pairs for simplicity and for computational considerations, but our approach could also be applied to larger groups of influencing agents that share neighbors.

we select the influencing agents to pair by first finding all pairs of influencing agents with one or more neighbors in common. then we do a brute-force search and find every possible disjoint combination of these pairs. for each such combination, we calculate the sum of the number of shared neighbors across all the pairs and select the combination with the greatest sum of shared neighbors. this combination of chosen pairs is called the selectedpairs. note that selectedpairs is recalculated at each time step; a sketch of this selection step is given below.
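the pair-selection step can be written in python as follows (our own illustration, not the authors' code; agents are assumed to be hashable objects with a .neighbors list). the search is exponential in the number of candidate pairs, which stays small for the 20 influencing agents used later in the experiments.

from itertools import combinations

def select_pairs(influencing_agents):
    # candidate pairs: influencing agents with one or more shared neighbors
    candidates = []
    for a, b in combinations(influencing_agents, 2):
        shared = len(set(a.neighbors) & set(b.neighbors))
        if shared > 0:
            candidates.append((shared, a, b))

    best_score, best_combo = 0, []

    def search(idx, used, combo, score):
        # enumerate every disjoint combination of candidate pairs and keep
        # the one with the greatest total number of shared neighbors
        nonlocal best_score, best_combo
        if score > best_score:
            best_score, best_combo = score, list(combo)
        for i in range(idx, len(candidates)):
            shared, a, b = candidates[i]
            if a in used or b in used:
                continue
            combo.append((a, b))
            search(i + 1, used | {a, b}, combo, score + shared)
            combo.pop()

    search(0, set(), [], 0)
    return best_combo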
the behavior of each influencing agent depends on whether it is part of a pair in selectedpairs or not. if it is part of a pair, it follows algorithm 3 and coordinates with a partner influencing agent. if it is not part of a pair, it follows algorithm 1 and performs a 1-step lookahead search for the best individual behavior.

only one new variable and one new function are used in algorithm 3 that are not used in algorithm 1 or algorithm 2. the variable is norientsp, which is a set used to hold the predicted next-step orientation vectors of all the neighbors of the influencing agent's partner, assuming the influencing agent adopts iaorient and the influencing agent's partner adopts iaorientp. the function is neighbors.get(x), which returns the xth element in the set neighbors.

algorithm 3 is called on influencing agents that are part of a pair in selectedpairs at each time step. algorithm 3 takes in the neighbors of the influencing agent and the neighbors of the partner of the influencing agent, and returns the orientation that, if adopted by the influencing agent, is guaranteed to influence the flock to face closer to θ∗ than any of the other numangles influencing agent orientations considered, for both the influencing agent and its partner.

algorithm 3 bestorient = 1stepcoordinated(neighofia, neighofp)
1: bestorient ← (0, 0)
2: bestdiff ← ∞
3: for each influencing agent orientation iaorient do
4:   for each influencing agent orientation iaorientp do
5:     norients ← ∅
6:     for n ∈ neighofia do
7:       norient ← (0, 0)
8:       for n′ ∈ n.neighbors do
9:         if n′ is the influencing agent then
10:          norient ← norient + iaorient
11:        else if n′ is the influencing agent's partner then
12:          norient ← norient + iaorientp
13:        else
14:          norient ← norient + n′.vel
15:      norient ← norient / |n.neighbors|
16:      norients ← {norient} ∪ norients
17:    norientsp ← ∅
18:    for n ∈ neighofp do
19:      norient ← (0, 0)
20:      for n′ ∈ n.neighbors do
21:        if n′ is the influencing agent then
22:          norient ← norient + iaorient
23:        else if n′ is the influencing agent's partner then
24:          norient ← norient + iaorientp
25:        else
26:          norient ← norient + n′.vel
27:      norient ← norient / |n.neighbors|
28:      if n ∉ neighofia then
29:        norientsp ← {norient} ∪ norientsp
30:    diff ← the avg diff between vectors norients and θ∗ and between vectors norientsp and θ∗
31:    if diff < bestdiff then
32:      bestdiff ← diff
33:      bestorient ← iaorient
34: return bestorient

conceptually, algorithm 3 considers each of the numangles influencing agent orientations for the influencing agent and for the influencing agent's partner, and performs two 1-step lookahead searches. the main difference between algorithm 1 and algorithm 3 is that the coordinated algorithm takes into account that another influencing agent is also influencing all of the agents that are both in the influencing agent's neighborhood and in the influencing agent's partner's neighborhood. hence, the influencing agent may choose to behave in a way that influences the other agents in its neighborhood closer to θ∗ while relying on its partner to more strongly influence the agents that exist in both of the paired influencing agents' neighborhoods towards θ∗.

specifically, algorithm 3 executes as follows. for each potential influencing agent orientation, the algorithm considers how each of the neighbors of the influencing agent will be influenced if the influencing agent adopts that orientation (lines 6–16). then algorithm 3 considers how each of the neighbors of the influencing agent's partner will be influenced if the influencing agent's partner adopts each potential influencing agent partner orientation (lines 18–29). finally, the algorithm selects the influencing agent orientation that results in the least difference between θ∗ and the predicted orientations of the neighbors of both the influencing agent and the influencing agent's partner (lines 30–34). note that agents that are neighbors of both the influencing agent and its partner are only counted once (lines 28–29).

the complexity of algorithm 3 can be calculated as follows. line 3 executes numangles times, line 4 executes numangles times, line 6 executes at most numagents times, line 8 executes at most numagents times, line 18 executes at most numagents times, and line 20 executes at most numagents times. hence, the complexity of algorithm 3 is o(numangles² · numagents²). results for how algorithm 3, as well as algorithms 1 and 2, performed in our experiments can be found in the next section.

6. experiments
in this section we describe our experiments testing the three influencing agent behaviors presented in sections 3, 4, and 5 against the baseline methods described in this section. our original hypothesis was that algorithms 1, 2, and 3 would all perform significantly better than the baseline methods. we also believed that algorithms 2 and 3 would perform better than algorithm 1.

6.1. baseline ad hoc agent behaviors
in this subsection we describe two behaviors which we use as comparison baselines for the lookahead and coordinated influencing agent behaviors presented in sections 3, 4 and 5.
6.1.1. face desired orientation behavior
when following this behavior, the influencing agents always orient towards θ∗. note that under this behavior the influencing agents do not consider their neighbors or anything about their environment when determining how to behave. this behavior is modeled after work by jadbabaie, lin, and morse [6]. they show that a flock with a controllable agent will eventually converge to the controllable agent's heading. hence, the face desired orientation influencing agent behavior is essentially the behavior described in their work, except that in our experiments we include multiple controllable agents facing θ∗.

6.1.2. offset momentum behavior
under this behavior, each influencing agent calculates the vector sum v of the velocity vectors of its neighbors and then adopts an orientation along the vector v′ such that the vector sum of v and v′ points towards θ∗. see figure 3 for an example calculation. in figure 3, the velocity vectors of each neighbor are summed in the first line of calculations. in the second line of calculations, the vector sum of the influencing agent's orientation and the result of the first line must equal θ∗, which in this example is pointing directly south. from the equation on the second line of calculations, the new influencing agent orientation vector can be found by vector subtraction. this vector is displayed and then scaled to maintain constant velocity on the third line of calculations.

figure 3. an example of how the offset momentum influencing agent behavior works. the influencing agent is the black dot, the circle represents the influencing agent's neighborhood, and the three arrows inside the circle represent the influencing agent's neighbors.

this influencing agent behavior was inspired by our previous work [9]. in that work, we showed how to optimally orient a stationary agent to a desired orientation using a set of stationary influencing agents. in particular, we presented an algorithm which the influencing agents could utilize to orient the agent to the desired orientation in the least number of steps possible. our offset momentum influencing agent behavior implements this algorithm. however, this algorithm assumes that the agent is only influenced by influencing agents within its neighborhood. hence, it is not optimal in our experimental setting, because each agent being influenced by an influencing agent is usually also being influenced by other agents.
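a minimal python sketch of the offset momentum computation is given below (our own illustration). the magnitude of the desired resultant along θ∗ is left implicit in figure 3, so it is exposed here as a parameter alpha, defaulting to the magnitude of the neighbors' vector sum; the result is rescaled to the constant agent speed in any case.

import numpy as np

def offset_momentum_orientation(neighbor_velocities, target_unit, speed, alpha=None):
    # v is the vector sum of the neighbors' velocities; v_prime is chosen so
    # that v + v_prime points along theta*, then rescaled to the agent speed
    v = np.sum(neighbor_velocities, axis=0)
    if alpha is None:
        alpha = np.linalg.norm(v)        # free choice of resultant magnitude
    v_prime = alpha * target_unit - v
    n = np.linalg.norm(v_prime)
    if n == 0.0:
        return speed * target_unit
    return speed * v_prime / n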
6.2. experimental setup
we utilize the mason simulator [8] for our experiments in this paper. the mason simulator was introduced in section 2.1, but in this section we present the details of the environment that are important for completely understanding our experimental setup. we use the default simulator setting of 150 units for the height and width of our experimental domain. likewise, we use the default setting in which each agent moves 0.7 units during each time step. the number of agents in our simulation (numagents) is 200, meaning that there are 200 agents in our flock. 10 % of the flock, or 20 agents, are influencing agents. the neighborhood for each agent is 20 units in diameter. numagents and the neighborhood size were both default values for mason. we chose for 10 % of the flock to be influencing agents as a tradeoff between providing enough influencing agents to influence the flock and keeping the influencing agents few enough to require intelligent behavior in order to influence the flock effectively.

we only consider numangles discrete angle choices for each influencing agent. in all of our experiments, numangles is 50, meaning that the unit circle is equally divided into 50 segments beginning at 0 radians, and each of these orientations is considered as a possible orientation for each influencing agent. numangles = 50 was chosen after some experimentation using the 1-step lookahead algorithm, in which numangles = 20 resulted in a higher average number of steps for the flock to converge to θ∗, while numangles = 100 and numangles = 150 did not require significantly fewer steps for convergence than numangles = 50.

in all of our experiments, we run 50 trials for each experimental setting. we use the same 50 random seeds to determine the starting positions and orientations of both the flocking agents and the influencing agents for each set of experiments, for the purpose of variance reduction.

6.3. experimental results
table 3 shows the number of time steps needed for the flock to converge to θ∗ for the two baseline algorithms, the 1-step lookahead algorithm presented in algorithm 1, the 2-step lookahead algorithm presented in algorithm 2, and the coordinated algorithm presented in algorithm 3, using the experimental setup described in section 2.1.

table 3. the number of time steps required for the flock to converge to θ∗ using the experimental setup described in section 2.1. we show the 95 % confidence intervals.
algorithm                   time steps
face desired orientation    34.82 ± 3.85
offset momentum             36.70 ± 4.63
1-step lookahead            26.02 ± 3.10
2-step lookahead            25.94 ± 3.16
coordinated                 25.76 ± 3.15

the results shown in table 3 clearly show that the 1-step lookahead behavior, the 2-step lookahead behavior, and the coordinated behavior all perform significantly better than the two baseline methods. however, these results did not show the 2-step lookahead behavior and the coordinated behavior performing significantly better than the 1-step lookahead behavior, as we had expected.
hence, we present additional experimental results below, in which we alter the percentage of the flock that are influencing agents and the number of agents in the flock (numagents) one by one, to further investigate the dynamics of this domain.

6.3.1. altering the composition of the flock
now we consider the effect of decreasing the percentage of influencing agents in the flock to 5 % as well as increasing the percentage of influencing agents in the flock to 20 %. in both cases, the remainder of the experimental setup is as described in section 2.1. altering the percentage of influencing agents in the flock clearly alters the number of agents we can control, which affects the amount of influence we can exert over the flock. hence, as can be seen in figure 4, flocks with higher percentages of influencing agents will, on average, converge to θ∗ in a smaller number of time steps than flocks with lower percentages of influencing agents.

figure 4. results from experiments using the experimental setup described in section 2.1, except that we varied the percentage of influencing agents in the flock. the values are averaged over 50 trials and the error bars represent the 95 % confidence interval.

6.3.2. altering the size of the flock
in this section we evaluate the effect of changing the size of the flock while keeping the rest of the experimental setup as presented in section 2.1. changing the flock size will alter the number of influencing agents, but not the ratio of influencing agents to non-influencing agents. we expected that increasing the flock size would lead to the coordinated behavior performing better comparatively, as, with a larger flock, more agents are likely to be in multiple influencing agents' neighborhoods at any given time. however, the coordinated behavior did not perform significantly differently than the lookahead behaviors, and actually performed slightly worse in the experiment with a larger flock size. the results of our experiments in altering the flock size can be seen in figure 5. the difference between the 1-step lookahead behavior, the 2-step lookahead behavior, and the coordinated behavior versus the baseline behaviors was not significant in the experiment utilizing a smaller flock. this may have been caused by the agents being more sparse in the environment, and hence having less of an effect on each other.

figure 5. results from experiments using the experimental setup described in section 2.1, except that we varied the number of agents in the flock. the values are averaged over 50 trials and the error bars represent the 95 % confidence interval.

6.4. discussion
our hypothesis was that algorithms 1, 2, and 3 would all perform significantly better than the baseline methods. this was indeed the case in all of our experiments except when the flock size was decreased from 200 agents to 100 agents. apparently, having 100 agents in a 150 by 150 unit environment resulted in the agents being too distributed for our lookahead and coordinated behaviors to be effective.

our original research question, which was to determine how influencing agents should behave so as to orient the rest of the flock towards a target heading as quickly as possible, was partially answered by this work. although it is possible that better algorithms could be designed, given the algorithms and the experimental setting presented in this paper, we found that it is best for influencing agents to perform the 1-step lookahead behavior presented in algorithm 1.
this behavior is more computationally efficient than the other two algorithms presented, and it performed significantly better than the baseline methods in most cases.

in many cases, the coordinated behavior and the 1-step lookahead behavior led the flock to converge to θ∗ in the same number of time steps. this is because the behaviors were identical when no agents were in the neighborhoods of two paired influencing agents at the same time. additionally, even when a pair of influencing agents shared one or more neighbors, these influencing agents often behaved similarly, and hence did not exert significantly different types of influence. there are, of course, cases in which each of the lookahead and coordinated behaviors performs noticeably better than the others. for example, when the flock size is decreased to 100, the 2-step lookahead only takes 44 time steps to converge to θ∗ when a particular random seed (93) is used in the simulator, but the 1-step lookahead takes 67 steps and the coordinated approach takes 61 steps.

7. related work
although there has been a significant amount of work in the field of multiagent teamwork, there has been relatively little work towards getting agents to collaborate with teammates that cannot be explicitly controlled. most prior multiagent teamwork research requires explicit coordination protocols or communication protocols (e.g. sharedplans, steam, and gpgp) [10–12]. however, in our work we do not assume that any protocol is known by all agents.

han, li and guo studied how one agent can influence the direction in which an entire flock of agents is moving [5]. similarly to our work, in their work each agent follows a simple control rule based on its neighbors. however, unlike our work, they only consider one influencing agent with unlimited, non-constant velocity. this allows their influencing agent to move to any position in the environment within one time step, which we believe is unrealistic.

as we mention in section 2, reynolds introduced the original flocking model [2]. however, his work focused on creating graphical models that looked and behaved like real flocks, and hence he did not address adding controllable agents to the flock, as we do. vicsek et al. considered just the alignment aspect (also called flock centering) of reynolds' model [1]. hence, like in our work, they use a model where all of the particles move at a constant velocity and adopt the average direction of the particles in their neighborhood. however, like reynolds' work, they were only concerned with simulating flock behavior and not with adding controllable agents to the flock. jadbabaie, lin, and morse build on vicsek et al.'s work [6]. they use a simpler direction update than vicsek et al., and they show that a flock with a controllable agent will eventually converge to the controllable agent's heading. like us, they show that a controllable agent can be used to influence the behavior of the other agents in a flock. however, they are only concerned with getting the flock to converge eventually, whereas we attempt to do so as quickly as possible. su, wang, and lin also present work that is concerned with using a controllable agent to make the flock converge eventually [7].
8. conclusion
in this work, we have set out to determine how influencing agents should behave in order to orient a flock towards a target heading as quickly as possible. our work is situated in a limited ad hoc teamwork domain, so although we have knowledge of the behavior of the flock, we are only able to influence the flock indirectly via the behavior of the influencing agents within it. this paper introduces three algorithms that the influencing agents can use to influence the flock — a greedy lookahead behavior, a deeper lookahead behavior, and a coordinated behavior. we ran extensive experiments using these algorithms in a simulated flocking domain, where we observed that in such a setting, a greedy lookahead behavior is an effective behavior for the influencing agents to adopt.

although we have begun to consider coordinated algorithms in this work, there is room for more extensive coordination as well as different types of coordination. additionally, as this work focused on a limited version of reynolds' flocking model, a promising direction for future work is to extend the algorithms presented in this work to reynolds' complete flocking model. finally, it would be interesting to empirically consider the effect of influencing agent placement.

acknowledgements
this work has been carried out in the learning agents research group (larg) at the artificial intelligence laboratory, the university of texas at austin. larg research is supported in part by grants from the national science foundation (cns-1330072, cns-1305287), onr (21c184-01), afrl (fa8750-14-1-0070), afosr (fa9550-14-1-0087), and from yujin robot.

references
[1] t. vicsek, a. czirok, e. ben-jacob, et al. novel type of phase transition in a system of self-driven particles. physical review letters 75(6), 1995. doi:10.1103/physrevlett.75.1226.
[2] c. w. reynolds. flocks, herds and schools: a distributed behavioral model. siggraph 21:25–34, 1987. doi:10.1145/37401.37406.
[3] w. bialek, a. cavagna, i. giardina, et al. statistical mechanics for natural flocks of birds. proceedings of the national academy of sciences 109(11), 2012. doi:10.1073/pnas.1118633109.
[4] c. k. hemelrijk, h. hildenbrandt. some causes of the variable shape of flocks of birds. plos one 6(8), 2011. doi:10.1371/journal.pone.0022479.
[5] j. han, m. li, l. guo. soft control on collective behavior of a group of autonomous agents by a shill agent. journal of systems science and complexity 19(1):54–62, 2006. doi:10.1007/s11424-006-0054-z.
[6] a. jadbabaie, j. lin, a. morse. coordination of groups of mobile autonomous agents using nearest neighbor rules. ieee transactions on automatic control 48(6):988–1001, 2003. doi:10.1109/tac.2003.812781.
[7] h. su, x. wang, z. lin. flocking of multi-agents with a virtual leader. ieee transactions on automatic control 54(2):293–307, 2009. doi:10.1109/tac.2008.2010897.
[8] s. luke, c. cioffi-revilla, l. panait, et al. mason: a multi-agent simulation environment. simulation: transactions of the society for modeling and simulation international 81(7):517–527, 2005. doi:10.1177/0037549705058073.
[9] k. genter, n. agmon, p. stone. ad hoc teamwork for leading a flock. in proceedings of the 12th international conference on autonomous agents and multiagent systems (aamas 2013). 2013.
[10] b. j. grosz, s. kraus. collaborative plans for complex group action. artificial intelligence journal 86(2):269–357, 1996. doi:10.1016/0004-3702(95)00103-4.
[11] m. tambe. towards flexible teamwork.
journal of artificial intelligence research 7:83–124, 1997.
[12] k. s. decker, v. r. lesser. readings in agents. chap. designing a family of coordination algorithms, pp. 450–457. morgan kaufmann publishers inc., san francisco, ca, usa, 1998.

acta polytechnica vol. 44 no. 5–6/2004

mohr-coulomb failure condition and the direct shear test revisited
p. řeřicha

an alternative critical plane orientation is proposed in the mohr-coulomb failure criterion for soils with an extreme property. parameter identification from the direct shear test is extended to include the lateral normal stress.
keywords: soil strength, shear failure, direct shear test.

1 introduction
the coulomb failure condition is defined by the equation
f = \tau + \sigma \tan\varphi - c = 0,   (1)
where \tau and \sigma are the shear and normal traction components respectively on the critical plane in the material, c is the apparent cohesion and \varphi is the angle of shearing resistance (internal friction). the usual sign convention is used for the normal stress \sigma: compression is negative. in the classical mohr-coulomb formulation, the critical plane normal is inclined by the angle \alpha = \pi/4 - \varphi/2 from the \sigma_1 direction towards the \sigma_3 direction. ordered principal stresses \sigma_1 \ge \sigma_2 \ge \sigma_3 are assumed. this orientation of the plane follows from the postulated condition that the mohr circle in the \sigma_1-\sigma_3 plane touches the envelope (1), as shown in fig. 1. stresses \sigma_{cx}, \sigma_{cz} and \tau_c are implied in the coordinate frame associated with the critical plane.

fig. 1: inclination \alpha of the critical plane in the classical mohr-coulomb yield condition

the mohr-coulomb condition is natural, but the assumed orientation of the critical plane in fact lacks a rigorous substantiation. other orientations could be assumed. a rational modification of the mohr-coulomb condition can be obtained when the critical plane orientation is not a priori restrained. instead, it can be determined so that f attains its maximum on the critical plane. the resulting criterion should be more severe than the classical one.

2 mohr-coulomb criterion based on an extreme property
direct notation is used in the development, and a general triaxial stress is assumed for full generality. the stress tensor \boldsymbol{\sigma} is assumed to have principal stresses \sigma_i with direction vectors \mathbf{n}_i. the unknown critical plane normal is denoted \mathbf{n}. the normal and tangential traction components on the plane are
\sigma = \mathbf{n} \cdot \boldsymbol{\sigma}\mathbf{n}, \qquad \tau = \sqrt{|\boldsymbol{\sigma}\mathbf{n}|^2 - \sigma^2}.   (2)
the extreme of f is sought when \mathbf{n} is subject to variation with the subsidiary condition \mathbf{n} \cdot \mathbf{n} = 1. a lagrange multiplier \lambda is introduced and the extended criterion f^* = f - \lambda(\mathbf{n} \cdot \mathbf{n} - 1) is differentiated with respect to \mathbf{n} to yield
\frac{1}{2\tau}\bigl(2\boldsymbol{\sigma}^2\mathbf{n} - 4(\mathbf{n} \cdot \boldsymbol{\sigma}\mathbf{n})\,\boldsymbol{\sigma}\mathbf{n}\bigr) + \tan\varphi\,\boldsymbol{\sigma}\mathbf{n} - \lambda\mathbf{n} = 0.   (3)
the equation is contractively multiplied by \mathbf{n} and the resulting scalar equation is used to eliminate \lambda from eq. (3). assuming that \tau \ne 0, the equation
\boldsymbol{\sigma}^2\mathbf{n} - (2\,\mathbf{n} \cdot \boldsymbol{\sigma}\mathbf{n} - \tau\tan\varphi)\,\boldsymbol{\sigma}\mathbf{n} - \bigl[\tau^2 - (\mathbf{n} \cdot \boldsymbol{\sigma}\mathbf{n})^2 + (\mathbf{n} \cdot \boldsymbol{\sigma}\mathbf{n})\,\tau\tan\varphi\bigr]\mathbf{n} = 0   (4)
is obtained for the unknown \mathbf{n}. the equation can be rewritten in a comprehensive form when the tensor \tilde{\boldsymbol{\sigma}} is introduced:
\tilde{\boldsymbol{\sigma}} = \boldsymbol{\sigma}^2 - (2\,\mathbf{n} \cdot \boldsymbol{\sigma}\mathbf{n} - \tau\tan\varphi)\,\boldsymbol{\sigma},   (5)
\bigl(\tilde{\boldsymbol{\sigma}} - (\mathbf{n} \cdot \tilde{\boldsymbol{\sigma}}\mathbf{n})\,\mathbf{1}\bigr)\,\mathbf{n} = 0.
(6)
the eigenvectors of \tilde{\boldsymbol{\sigma}} deliver extremes of f. the eigenvectors of \boldsymbol{\sigma} and \tilde{\boldsymbol{\sigma}} are the same, however, so these extremes are minima (\tau = 0) of f. in order to find the other extremes, all variables are decomposed in terms of the eigenvalues \sigma_i and principal vectors \mathbf{n}_i of \boldsymbol{\sigma}:
\boldsymbol{\sigma} = \sum_i \sigma_i\,\mathbf{n}_i \otimes \mathbf{n}_i, \quad \mathbf{n} = \sum_i n_i\,\mathbf{n}_i, \quad \boldsymbol{\sigma}\mathbf{n} = \sum_i \sigma_i n_i\,\mathbf{n}_i, \quad \boldsymbol{\sigma}^2\mathbf{n} = \sum_i \sigma_i^2 n_i\,\mathbf{n}_i,   (7)
s = \mathbf{n} \cdot \boldsymbol{\sigma}\mathbf{n} = \sum_i \sigma_i n_i^2, \qquad \tau^2 = \sum_i \sigma_i^2 n_i^2 - s^2,   (8)
\tilde{\boldsymbol{\sigma}}\,\mathbf{n} = \sum_i r_i n_i\,\mathbf{n}_i, \qquad r_i = \sigma_i\bigl(\sigma_i - (2s - \tau\tan\varphi)\bigr).   (9)
equation (6) becomes
\Bigl(r_i - \sum_j r_j n_j^2\Bigr) n_i = 0, \qquad i = 1, \dots, 3.   (10)
six relevant solutions can best be presented in terms of the cyclic permutations of the indices i, j, k:
n_{i,j} = \pm\sqrt{\frac{1}{2}\Bigl(1 \pm \frac{\tan\varphi}{\sqrt{4 + \tan^2\varphi}}\Bigr)}, \qquad n_k = 0.   (11)
it is apparent that the critical plane normal always lies in the plane of two principal stress directions, in the same plane as the classical mohr-coulomb normal. back substitutions then yield
\sigma = \frac{\sigma_i + \sigma_j}{2} + \frac{(\sigma_i - \sigma_j)\tan\varphi}{2\sqrt{4 + \tan^2\varphi}}, \qquad \tau = \frac{\sigma_i - \sigma_j}{\sqrt{4 + \tan^2\varphi}},   (12)
and the modified coulomb condition on the critical plane of maximum f:
f = \frac{(\sigma_i - \sigma_j)(2 + \tan^2\varphi)}{2\sqrt{4 + \tan^2\varphi}} + \frac{1}{2}(\sigma_i + \sigma_j)\tan\varphi - c = 0.   (13)
it is interesting to compare the equation with the original mohr-coulomb condition
\frac{1}{2}(\sigma_i - \sigma_j)\sqrt{1 + \tan^2\varphi} + \frac{1}{2}(\sigma_i + \sigma_j)\tan\varphi - c = 0,   (14)
and with the coulomb condition (1) applied on the plane of the maximum shear stress
\frac{1}{2}\bigl[(\sigma_i - \sigma_j) + (\sigma_i + \sigma_j)\tan\varphi\bigr] - c = 0.   (15)
the latter condition represents the third option for the critical plane orientation. for plane stress conditions \sigma_2 = 0, the graphic representation of all three yield locuses is given in fig. 2. the modified yield locus is the most severe, as expected.

fig. 2: the modified mohr-coulomb (a), the original mohr-coulomb (m) and the maximum shear plane condition (s) for \tan\varphi = 0.5 (\varphi = 26.6°)

intersections of the modified (a) and classical (m) yield locuses with the rendulic plane \sigma_1 = \sigma_2 are shown in fig. 3. the functions f_a and f_m are important for parameter calibration of the model by the triaxial test. the modified coulomb (a) condition intersection with the rendulic plane is
f_a(x) = \pm\frac{2\sqrt{2}\,(\sqrt{3}\,c - x\tan\varphi)\sqrt{4 + \tan^2\varphi}}{3(2 + \tan^2\varphi) \mp \tan\varphi\sqrt{4 + \tan^2\varphi}},   (16)
whereas for the original mohr-coulomb
f_m(x) = \pm\frac{2\sqrt{2}\,(\sqrt{3}\,c - x\tan\varphi)}{3\sqrt{1 + \tan^2\varphi} \mp \tan\varphi},   (17)
with x the coordinate along the hydrostatic axis. positive signs pertain to the lower branches of the yield locus intersections with the rendulic plane.

fig. 3: the modified coulomb (a) and the original mohr-coulomb (m) for \tan\varphi = 0.5 (\varphi = 26.6°), c = 1 in the rendulic plane \sigma_1 = \sigma_2. the \sigma_3 axis and the projection of the \sigma_1 and \sigma_2 axes are also shown.

the three options for the critical plane orientation distinguish three slightly different material models of the mohr-coulomb type. the practical value of these modifications can be assessed in connection with the solutions of actual problems. the problem tackled below is the parameter identification in the direct shear test.

3 evaluation of the direct shear test
most applications of constitutive equations include a) the parameter calibration and b) the solution of the actual task, analytically or numerically. let us assume first that the triaxial test is used in the first step. the tests provide points in the rendulic plane, and the parameters c and \tan\varphi are selected to best fit the points.
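returning to the extremal property of section 2, the critical plane orientation can also be explored numerically. the following python sketch is our own check, not part of the original development: it scans plane normals in the \sigma_i-\sigma_j plane for a given stress state and reports the orientation and value of the maximum of f, which can be set against the closed-form solutions above.

import numpy as np

def max_f_plane(sig, c, tan_phi, n_grid=2000):
    # brute-force scan over unit normals in the sigma_i - sigma_j plane;
    # theta is measured from the sigma_i axis. sig is (sigma_i, sigma_j),
    # with compression negative as in the text.
    si, sj = sig
    best_theta, best_f = 0.0, -np.inf
    for theta in np.linspace(0.0, np.pi / 2, n_grid):
        n = np.array([np.cos(theta), np.sin(theta)])
        t = np.array([si * n[0], sj * n[1]])   # traction vector sigma.n
        sigma_n = float(n @ t)                 # normal traction component
        tau = np.sqrt(max(float(t @ t) - sigma_n ** 2, 0.0))
        f = tau + sigma_n * tan_phi - c        # eq. (1)
        if f > best_f:
            best_f, best_theta = f, theta
    return best_theta, best_f

# example: sigma_i = -1, sigma_j = -3, c = 1, tan(phi) = 0.5
print(max_f_plane((-1.0, -3.0), c=1.0, tan_phi=0.5))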
other procedures are available for identifying the parameters, using, for instance, the modified and alternate mohr-coulomb diagrams as recommended in [1] and [4]. different parameter values are obtained for the three versions of the yield locus. the calibrated locus, however, remains nearly the same for all versions. application of the three versions in the solution of any actual problem thus does not make any difference in the results, in spite of the difference in the parameter values.

differences might occur when the direct shear apparatus is used in the first step, see fig. 4. the failure plane orientation is imposed by the test arrangement in this case. strictly speaking, there is no homogeneous stress in the specimen, and from this point of view the test is not suitable for direct parameter calibration. nevertheless, in a layer adjacent to the failure plane, approximately homogeneous stress conditions can be assumed. it is assumed forthwith that the normal stress \sigma_z and the corresponding limit shear stress \tau are determined in the direct shear test in the failure plane, see fig. 4. the corresponding techniques are specified, e.g., in [2] or [3].

the point is that the failure plane in this test does not coincide with the critical plane in the mohr-coulomb failure condition for any of the three versions considered here. consequently, the line obtained by fitting the (\sigma_z, \tau) points from the test is not the mohr-coulomb failure condition. the confining normal stress \sigma_x in the slip direction is unknown. the coulomb condition (1) does not depend on the latter stress. the mohr-coulomb condition with both its modifications, eqs. (13)-(15), however, depends on both principal stresses in the problem plane and therefore depends on \sigma_z and \sigma_x. consequently, the direct shear test cannot directly determine c and \varphi, since \sigma_x is not known. the arrangement of the shear test admits the approximate assumption of proportionality between the confining and active stresses, \sigma_x = \kappa\sigma_z, with a constant parameter \kappa.

the direct shear test can now be simulated with the three versions of the failure criterion. assuming stress components \sigma_z, \sigma_x = \kappa\sigma_z and \tau in the layer adjacent to the slip plane at failure, the standard expressions for the principal stresses are substituted in the respective failure criterion, and explicit formulas for \tau are derived. these formulas represent the correct failure limits. the respective limits read, for the modified mohr-coulomb:
\tau = \left[\frac{(4 + \tan^2\varphi)\bigl(c - \frac{1}{2}(1 + \kappa)\sigma_z\tan\varphi\bigr)^2}{(2 + \tan^2\varphi)^2} - \frac{1}{4}(1 - \kappa)^2\sigma_z^2\right]^{1/2},   (18)
for the classical mohr-coulomb:
\tau = \left[\frac{\bigl(c - \frac{1}{2}(1 + \kappa)\sigma_z\tan\varphi\bigr)^2}{1 + \tan^2\varphi} - \frac{1}{4}(1 - \kappa)^2\sigma_z^2\right]^{1/2},   (19)
and for the coulomb condition on the maximum shear plane:
\tau = \left[\bigl(c - \frac{1}{2}(1 + \kappa)\sigma_z\tan\varphi\bigr)^2 - \frac{1}{4}(1 - \kappa)^2\sigma_z^2\right]^{1/2}.   (20)
each criterion can be perceived as a batch of curves \tau(\sigma_z) with parameter \kappa. standard evaluation of this fictitious test would deliver a straight line from the point \sigma_z = 2 on the horizontal axis to the point \tau = 1 on the vertical axis – the conventional mohr-coulomb envelope (for the plotted example with c = 1 and \tan\varphi = 0.5).
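as a worked illustration of eqs. (18)-(20), the limit curves can be evaluated directly from the underlying criteria rather than from the closed-form expressions. the python sketch below is our own, with the criterion coefficients taken from eqs. (13)-(15) as given above: it inserts the principal stresses of the assumed stress state into the chosen criterion and solves for \tau.

import numpy as np

def tau_limit(sigma_z, kappa, c, tan_phi, version="m"):
    # version: 'a' modified, 'm' classical, 's' maximum shear plane.
    # criterion: 0.5*(s1 - s3)*K + 0.5*(s1 + s3)*tan_phi = c, with
    # s1 + s3 = (1 + kappa)*sigma_z and s1 - s3 = 2*R, where
    # R^2 = ((1 - kappa)*sigma_z/2)**2 + tau**2
    K = {"a": (2 + tan_phi**2) / np.sqrt(4 + tan_phi**2),
         "m": np.sqrt(1 + tan_phi**2),
         "s": 1.0}[version]
    R = (c - 0.5 * (1 + kappa) * sigma_z * tan_phi) / K
    tau_sq = R**2 - ((1 - kappa) * sigma_z / 2.0) ** 2
    return np.sqrt(tau_sq) if tau_sq >= 0 else np.nan

# one point of the kappa = 1 curve with c = 1, tan(phi) = 0.5:
print(tau_limit(-2.0, kappa=1.0, c=1.0, tan_phi=0.5, version="m"))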
it is apparently wrong to use, in the mohr-coulomb material models, the parameter values obtained in the direct shear test by the standard evaluation. instead, the three parameters c, \varphi and \kappa should be determined to best fit the measured data. low values \kappa < 0.3 are obviously not realistic. the other extreme, \kappa = 1, is closest to the conventional mohr-coulomb envelope that would be obtained by the standard test evaluation with the same material. however, not even this extreme curve coincides with the conventional envelope, except for the maximum shear orientation of the critical plane in fig. 7. it is worth noting that the introduction of the parameter \kappa allows for curved locuses, which are often observed in practice [1], and that the parameter \kappa, a side product of the parameter fitting, can be used to determine the elastic properties of the soil.

fig. 4: stress components at yield in the direct shear test
fig. 5: mohr-coulomb for \tan\varphi = 0.5 (\varphi = 26.6°), c = 1 and several values of the parameter \kappa
fig. 6: modified mohr-coulomb for \tan\varphi = 0.5 (\varphi = 26.6°), c = 1 and several values of the parameter \kappa
fig. 7: coulomb for the maximum shear plane, \tan\varphi = 0.5 (\varphi = 26.6°), c = 1 and several values of the parameter \kappa

references
[1] us army corps of engineers. slope stability. us army corps of engineers documents, pub. no. em1110-2-1902, appendix d, 2003.
[2] us army corps of engineers. slope stability, appendix ix, drained (s) direct shear test. us army corps of engineers documents, pub. no. em1110-2-1906, 1970.
[3] american society for testing and materials. standard test method for direct shear test of soils under consolidated drained conditions. astm d3080-98, annual book of astm standards, astm, west conshohocken, pa, usa, 1998.
[4] fredlund d. g., vanapalli s. k.: shear stress in unsaturated soils. chapter 2.7 in handbook of agronomy, soil science society of america, 2002.

prof. ing. petr řeřicha, drsc.
phone: +420 224 354 478
e-mail: petr.rericha@fsv.cvut.cz, rer@cml.fsv.cvut.cz
department of structural mechanics
czech technical university in prague
faculty of civil engineering
thákurova 7
166 29, praha 6, czech republic
the pumping capacity of the pbt, q_p, can be expressed in dimensionless form as the impeller flow rate number [5, 8]

N_{Qp} = \frac{Q_p}{n d^{3}}, \qquad (1)

where n is the frequency of the impeller revolution and d is its diameter. the quantity n_qp does not depend on the impeller reynolds number when this quantity exceeds five thousand [7, 8, 9]. for impeller power input p the power number has been introduced,

Po = \frac{P}{\rho n^{3} d^{5}}, \qquad (2)

where ρ is the density of the agitated liquid. this quantity is also independent of the impeller reynolds number when it exceeds ten thousand. a combination of the dimensionless quantities n_qp and po gives the so called hydraulic efficiency of the impeller, defined either as [6]

E = \frac{N_{Qp}^{3}}{Po}, \qquad (3)

or as [8]

E_p = \frac{N_{Qp}^{3}}{Po}\left(\frac{d}{t}\right)^{4}. \qquad (4)

the former definition is suitable for agitated systems with a draught tube [1], and the latter for systems without this internal. quantity t in eqs. (3) and (4) denotes the diameter of the vessel. the higher the quantity e or e_p, the greater the ability to convert the impeller energy consumption into its pumping effect; i.e., it is possible to study the influence of various geometries of the mixing system on the quantity e or e_p [8]. this study analyses the pumping and energetic efficiency of various pitched blade impellers in a tall vessel with a draught tube, a suitable geometry for industrial applications where high homogeneity of solid-liquid suspension and temperature distribution are desirable. the pbt pumping capacity will be calculated from the radial profile of the axial component of the mean velocity in the draught tube. the velocity will be determined by the laser doppler anemometer and the impeller power input by means of a strain gauge torquemeter.
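a minimal sketch of these dimensionless groups, following eqs. (1)–(4) as reconstructed above; the function names and the sample operating point are illustrative, not values from the paper:

def flow_rate_number(q_p, n, d):
    """impeller flow rate number, eq. (1): n_qp = q_p / (n d^3)."""
    return q_p / (n * d**3)

def power_number(p, rho, n, d):
    """impeller power number, eq. (2): po = p / (rho n^3 d^5)."""
    return p / (rho * n**3 * d**5)

def hydraulic_efficiency(n_qp, po, d=None, t=None):
    """eq. (3): e = n_qp^3 / po; with d and t given, eq. (4): e_p = e * (d/t)^4."""
    e = n_qp**3 / po
    if d is not None and t is not None:
        e *= (d / t)**4
    return e

# illustrative operating point only: a d = 140 mm impeller at 500 rpm in water
n, d, t, rho = 500.0 / 60.0, 0.14, 0.40, 998.0
n_qp = flow_rate_number(q_p=0.004, n=n, d=d)     # hypothetical q_p [m^3/s]
po = power_number(p=8.0, rho=rho, n=n, d=d)      # hypothetical p [w]
print(round(n_qp, 3), round(po, 3), round(hydraulic_efficiency(n_qp, po), 4))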
2 experimental
the experiments were carried out in a pilot plant cylindrical vessel (diameter t = 400 mm) with a dished bottom [10]. two series of experiments were made:
i. optimisation of the slot ht between the lower edge of the draught tube and the dished bottom (see fig. 1). here, a six blade impeller with pitch angle α = 45° (see fig. 2) pumping liquid upwards towards the liquid surface was used. the vertical (axial) distances of the lower edge of the draught tube above the bottom were set within the interval ht = 2 mm to 40 mm.
ii. determination of the dependence of the impeller pumping capacity and the impeller power input on the number of impeller blades nb = 2, 3, 4, 5, 6 at two levels of pitch angle, α = 35° and 45°, for the optimum height ht of the lower edge of the draught tube above the bottom.
all the pbts rotated in such a way that they pumped the liquid upwards towards the surface of the liquid. each draught tube was equipped with four equally spaced radial baffles, with a width equal to one tenth of the tube diameter, mounted at its wall above the impeller. water at a temperature of 20 °c was chosen as the working liquid. the frequency of revolution of the impeller, n, was measured by means of a photoelectric cell with an accuracy of ±1 rev/min. the slot ht between the lower edge of the draught tube and the bottom was measured from the corresponding point vertically above the bottom, using a ruler with a precision of ±1 mm. the error in measuring the blade angle of the pbts can be considered as ±0.5°. the mean velocity field in the draught tube was measured in the inlet area below the impeller in various cross sections. for the 1st series of experiments the following vertical axial coordinates of the cross sections were chosen: 1, 15, 45, 90 and 130 mm below the lower edges of the impeller, and for the 2nd series the following axial coordinates: 30, 60, 90 and 120 mm below the lower edges of the impeller.
fig. 1: geometry of the pilot plant tall vessel with a draught tube (the 1st series of experiments); 1 – vessel, 2 – draught tube, 3 – baffle, 4 – impeller (h = 600 mm, t = 400 mm, l = 540 mm, c = 465 mm, dt = 80 mm, d = 70 mm, l = 70 mm, b = 7 mm, r = 400 mm, r = 40 mm)

fig. 2: sketch of pitched blade impellers with three or six blades – czech standard cvs 691020: a) nb = 3: α = 35°, 45°, b) nb = 6: α = 35°, 45°; h/d = 0.2

fig. 3: geometry of the pilot plant tall vessel with a draught tube (the 2nd series of experiments); 1 – vessel, 2 – draught tube, 3 – baffle, 4 – impeller (h = 470 mm, t = 400 mm, l = 350 mm, c = 400 mm, dt = 160 mm, d = 140 mm, l = 45 mm, b = 15 mm, r = 400 mm, r = 40 mm)

the mean velocity was determined from the measurements made by means of a laser doppler anemometer (lda). a dantec 55x two component modular series lda and its associated bsa data processor, connected with a pc, was used for the experiments. the lda was operated in a forward scatter mode (see fig. 4). the laser (5 w ar ion, manufactured by spectra physics, usa) and optics were mounted on a bench with a two-dimensional traversing mechanism. to identify the flow reversals correctly, a frequency shift was given to one of the beams by means of a bragg cell with electronic downmixing. two components of the local velocity were measured simultaneously, with a positioning accuracy of ±0.1 mm. the sample size was set at 20,000 items for each velocity measurement, and the mean time (averaged) value from all the samples was calculated. the impeller power input was also determined experimentally (with a relative accuracy of ±10 %) by means of a strain gauge torquemeter mounted on the impeller shaft. the impeller pumping capacity q_p was calculated from the experimentally determined radial profiles of the mean velocity over the cross sectional area of the draught tube below the impeller (in the inlet zone), w_ax = w_ax(r). the local value of the mean velocity corresponds to the average value of the ensemble over the circle of radius r determined by lda, assuming axial symmetry of the flow in the draught tube, and the impeller pumping capacity can be calculated from the equation

Q_p = 2\pi \int_{0}^{d_t/2} w_{ax}(r)\, r\, \mathrm{d}r. \qquad (5)

fig. 5 depicts the measured radial profiles of the axial component of the mean velocity in the inlet zone of the draught tube in the 1st series of experiments. it follows from the experiments that vorticity appears in the vicinity of the impeller owing to the rotation of the impeller blades, i.e. the velocity field is significantly distorted. therefore, for the calculation of the impeller pumping capacity, the third and the fourth cross sectional areas below the lower edges of the impeller blades (vertical (axial) distances 130 mm and 90 mm) were chosen. all the experiments were carried out at three levels of impeller revolution frequency under a turbulent regime of flow of the agitated liquid.
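a minimal numerical sketch of eq. (5), integrating a radial velocity profile by the trapezoidal rule; the profile below is hypothetical, for demonstration only (the measured profiles are those of fig. 5):

import numpy as np

# hypothetical radial profile w_ax(r) in the d_t = 80 mm draught tube of the 1st series
r = np.linspace(0.0, 0.04, 9)           # radii up to d_t/2 [m]
w_ax = 0.5 * (1.0 - (r / 0.04)**2)      # assumed velocities [m/s], not measured data

f = w_ax * r                             # integrand of eq. (5)
q_p = 2.0 * np.pi * np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(r))   # trapezoidal rule
print(f"q_p = {q_p:.6f} m^3/s")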
the main part of the first series of experiments consisted of optimisation of the slot ht between the lower edge of the draught tube and the dished bottom (see fig. 6). at a constant value of the diameter of the draught tube, dt, the vertical (axial) distance ht was changed. it is clear from the figure that the value of the flow rate number is practically constant when the relative height of the slot ht/dt > 0.25. the lowest value of this inequality corresponds to the conditions when the cross sectional area of the vertical upflow in the inlet of the draught tube is the same as the cross sectional area of the horizontal flow in the slot,

\frac{\pi}{4} d_t^{2} = \pi d_t h_t, \qquad (6)

so that

h_t = 0.25\, d_t. \qquad (7)

fig. 4: layout of a laser doppler anemometer with forward scatter mode

fig. 5: radial distribution of the axial component of the mean velocity (w_ax/(n d) vs. 2r_1/d_t) in the inlet zone of the tube below the pbt (six blade pbt, α = 45°) at 90 mm and 130 mm below the agitator; n = 500 min⁻¹, n = 600 min⁻¹, n = 700 min⁻¹

fig. 6: dependence of the flow rate number on the ratio ht/dt (first series of experiments), points – experimental values

similarly, the power number is independent of the ratio ht/dt when

h_t/d_t \ge 0.1, \qquad (8)

as shown in fig. 7. moreover, it is expected that the power number is independent of the impeller reynolds number under a turbulent regime of flow of the agitated liquid. above the limit expressed by eq. (7) the average axial component of the mean velocity over the slot depends on the height of the slot,

w_b = \frac{Q_p}{\pi d_t h_t}, \quad Q_p = \mathrm{const.}, \; h_t \ge 0.25\, d_t. \qquad (9)

fig. 8 illustrates in dimensionless form relation (9), i.e. the hyperbolic relation between the dimensionless velocity and the dimensionless ratio ht/dt. it is worth noting that for the given geometry of the investigated system the theoretical hyperbolic dependence can be expressed in terms of fig. 8 in the form

\frac{w_b}{n d} = \frac{0.119}{h_t/d_t}, \qquad (10)

which fits fairly well to the experimental curve when h_t/d_t ≥ 0.125, when the flow resistance owing to the sudden change of the cross sectional area can be neglected.
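a minimal sketch of relations (9) and (10) as reconstructed above; the coefficient 0.119 is the value read from the text, and the sample ratios are illustrative:

import math

def slot_velocity(q_p, d_t, h_t):
    """mean velocity over the slot, eq. (9): w_b = q_p / (pi d_t h_t)."""
    return q_p / (math.pi * d_t * h_t)

def wb_over_nd(ht_over_dt, coeff=0.119):
    """dimensionless hyperbola, eq. (10); consistent with eq. (9), since
    w_b/(n d) = n_qp * (d/d_t)**2 / (pi * h_t/d_t) for constant q_p = n_qp * n * d**3."""
    return coeff / ht_over_dt

for ratio in (0.1, 0.125, 0.25, 0.45):   # a few slot heights along the curve of fig. 8
    print(ratio, round(wb_over_nd(ratio), 3))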
the results of the second series of experiments, carried out at the optimum arrangement in the vicinity of the dished bottom determined above (eq. 7), cover the dependence of the flow rate number and the power number on the number of blades of the pbt at two levels of its pitch angle. the geometry of the investigated system (see fig. 3), although with a different ratio dt/t, corresponds to the geometric similarity of the main characteristics

d/d_t = 0.875, \quad h_t/d_t = 0.25, \qquad (11)

considered for the system tested in the first series of experiments (see fig. 1). each geometry was investigated at five levels of frequency of revolution of the impeller, and the shown values of n_qp and po are the average data from all individual experiment results, calculated with an average error of ±10 %. all the results correspond to the fully turbulent regime of flow of an agitated liquid, when the impeller reynolds number exceeds ten thousand [5, 11]. figs. 9 and 10 illustrate the dependence of the pbt flow rate number n_qp and the pbt power number po, respectively, on the pbt number of blades nb.

fig. 7: dependence of the power number on the ratio ht/dt (first series of experiments), points – experimental values
fig. 8: dependence of the dimensionless average axial component of the mean velocity (w_b/(n d)) over the slot between the lower edge of the draught tube and the dished bottom on the ratio ht/dt (first series of experiments); theoretical curve, points – experimental values
fig. 9: dependence of the pbt flow rate number on the number of blades of the impeller and the blade pitch angle (α = 35°, 45°)
fig. 10: dependence of the pbt power number on the number of blades of the impeller and the blade pitch angle (α = 35°, 45°)

while the results of the power consumption of the impeller correspond fairly well with the results of experiments made in a standard mixing system (down pumping pbt in a vessel with four baffles at the wall and impeller off bottom clearance c/d = 0.5) [11], the pumping capacity of the pbt in the investigated system is significantly lower than the equivalent quantity in the above described standard system and, moreover, the quantity q_p for the pitch angle α = 35°, except for the number of impeller blades nb = 2, exceeds this quantity when α = 45°. on the other hand, taking into account the hydraulic efficiency of the pbt (see eq. 3), this quantity is practically independent of the number of impeller blades (see fig. 11). this independence is in accordance with the results of the experiments made in the above described standard mixing system, but the absolute values are significantly lower than those calculated for the standard system. this difference can be explained by the longer average liquid circulation loops in the system with a draught tube than in the system with baffles only, i.e. a higher portion of the mechanical energy necessary for circulation is consumed in the former system. however, this disadvantage is significantly balanced by the suitability of the investigated system with a draught tube for suspensions of solid particles in a liquid [3]. here the height of the sediment of solid particles can exceed the height of the vertical slot between the lower edge of the draught tube and the dished bottom; however, at a relatively low frequency of revolution the off bottom suspension effect successfully occurs. so, even after an accidental shutdown of an industrial mixing unit containing the solid phase, e.g. a crystallizer, it is possible to resuspend the sedimented solid particles in the system.

4 conclusions
the flow rate of the agitated liquid in a cylindrical draught tube with an up flow pbt does not depend on the height of the slot between the draught tube and the dished bottom when the cross sectional area in the slot is the same as or greater than the cross sectional area of the draught tube. both the pbt flow rate number and the power number increase with an increasing number of impeller blades for the tested impeller pitch angles α = 35° and 45°.
the hydraulic efficiency of both tested pbts (α = 35° and 45°) does not depend on the number of impeller blades; on the contrary, in an agitated system without a draught tube (with baffles at the wall only) the value of quantity e when α = 35° exceeds the hydraulic efficiency of the pbt with pitch angle α = 45°.

fig. 11: dependence of the pbt hydraulic efficiency on the number of impeller blades and the blade pitch angle, α = 35°, α = 45°

5 list of symbols
b baffle width [m]
c off bottom impeller clearance [m]
d impeller diameter [m]
d_t draught tube diameter [m]
e impeller hydraulic efficiency (eq. 3)
e_p impeller hydraulic efficiency (eq. 4)
h height of liquid from bottom of vessel [m]
h_t height of the slot between the lower edge of the draught tube and the dished bottom [m]
h width of impeller blade [m]
l draught tube length [m]
l baffle length [m]
n_qp impeller flow rate number
n frequency of impeller revolution [s⁻¹]
n_b number of impeller blades
p impeller power input [w]
po impeller power number
q_p impeller pumping capacity [m³·s⁻¹]
r radius of dished bottom [m]
r radius of round corners of dished bottom [m]
r_1 radius [m]
t vessel diameter [m]
w_ax axial component of the liquid mean velocity [m·s⁻¹]
w_b mean velocity over the slot between the lower edge of the draught tube and the dished bottom [m·s⁻¹]
α pitch angle of blade [°]
ρ density of agitated liquid [kg·m⁻³]

6 acknowledgment
this research was supported by the research project of the ministry of education of the czech republic j04/98:212200008.

references
[1] hošťálek j., fořt i.: “description of vortex turbulent flow of mixed liquid.” collect. czech. chem. commun., vol. 50 (1985), p. 930–945.
[2] sýsová m., fořt i., kudrna v.: “analytical description of the solid phase particle distribution in a mechanically agitated system.” collect. czech. chem. commun., vol. 53 (1988), p. 1198–1215.
[3] brož j., rieger f.: “mixing of suspensions in tall vessels with a draught tube.” proceedings on cd rom of the 30th conference of the slovak society of chemical engineers, tatranské matliare (slovakia), may 2003, p. 1–8.
[4] wu j., zhu y., pullum l.: “the effect of impeller pumping and fluid rheology on solids suspension in a stirred vessel.” can. j. chem. eng., vol. 79 (2001), p. 177–186.
[5] medek j., fořt i.: “pumping effect of impellers with flat inclined blades.” collect. czech. chem. commun., vol. 44 (1979), p. 3077–3089.
[6] fořt i., medek j.: “hydraulic and energetic efficiency of impellers with inclined blades.” proceedings of the 6th european conference on mixing, pavia (italy), may 1988, p. 51–56.
[7] kresta s., wood p.: “the flow field produced by a 45° pitched blade turbine: changes in the circulation pattern due to off bottom clearance.” can. j. chem. eng., vol. 71 (1993), p. 42–53.
[8] nienow a. w.: “on impeller circulation pattern due to off bottom clearance.” chem. eng. sci., vol. 22 (1997), no. 15, p. 2557–2565.
[9] fořt i., jirout t., sperling r., jambere s., rieger f.: “study of pumping capacity of pitched blade impellers.” acta polytechnica, vol. 42 (2002), no. 2, p. 68–72.
[10] deeply dished bottoms for pressure vessels. czech standard čsn 425815, prague 1980.
[11] medek j.: “power characteristics of agitators with flat inclined blades.” int. chem. eng., vol. 20 (1980), no. 4, p. 664–772.
ing. jiří brož
phone: +420 224 352 714
fax: +420 224 310 292
e-mail: brozj@student.fsid.cvut.cz

doc. ing. ivan fořt, drsc.
phone: +420 224 352 713
e-mail: fort@fsid.cvut.cz

prof. ing. františek rieger, drsc.
department of process engineering
czech technical university in prague
faculty of mechanical engineering
technická 4
166 07 praha 6, czech republic

prof. dr.-ing. reinhard sperling
e-mail: reinhard.sperling@lbv.hs-anhalt.de

ing. solomon jembere
ing. martin heiser
department of chemical engineering
anhalt university of applied sciences
hochschule anhalt (fh)
bernburger str. 52-57
063 66 koethen, germany

study of synthetic dye removal using fenton/tio2, fenton/uv, and fenton/tio2/uv methods and the application to jumputan fabric wastewater
tuty emilia agustina (a, ∗), dedi teguh (b), yourdan wijaya (a), febrian mermaliandi (a), ahmad bustomi (a), jantan manalaoon (a), gita theodora (a), tessa rebecca (a)
a) universitas sriwijaya, engineering faculty, chemical engineering department, palembang prabumulih km 32, 30622, indralaya, south sumatera, indonesia
b) universitas sriwijaya, engineering faculty, graduate student of the chemical engineering master program, srijaya negara bukit besar, 30139, palembang, south sumatera, indonesia
∗ corresponding author: tuty_agustina@unsri.ac.id
acta polytechnica 59(6):527–535, 2019. doi:10.14311/ap.2019.59.0527. available online at https://ojs.cvut.cz/ojs/index.php/ap

abstract. synthetic dyes were commonly used in textile industries such as the jumputan fabric industries in south sumatera. most of these industries were categorized as home industries without a wastewater treatment plant, so the wastewater is released directly into the water body. in general, the wastewater contains synthetic dyes, which are harmful to the environment and the human body. therefore, the wastewater needs to be treated before its release into the environment. reactive red 2 (rr2) is one of the important synthetic dyes usually applied for colouring textile materials such as jumputan fabric. the rr2 was used as a pollutant model in this research. the objective of the study is to compare the removal of rr2 by using the fenton/tio2, fenton/uv, and fenton/tio2/uv methods. furthermore, the optimum conditions obtained were applied for the treatment of wastewater from the jumputan fabric industry. as a conclusion, the highest rr2 degradation of 100 % was reached by using the fenton/tio2/uv method after 5 minutes of reaction. it was discovered that the optimum conditions were found when using a [fe2+]/[h2o2] molar ratio of 1:80, a ph of 3, and a tio2 concentration of 0.4 % (w/v). the application of these conditions to the jumputan wastewater treatment leads to a chemical oxygen demand (cod) removal of 94 % within 120 minutes of reaction.

keywords: reactive red 2, fenton reagent, fenton/tio2, fenton/uv, fenton/tio2/uv, wastewater.

1. introduction
generally, the textile industry produces a large amount of wastewater every year. besides the suspended solids, this wastewater has a high organic content characterized by high chemical oxygen demand (cod) and biochemical oxygen demand (bod) values. the content of organic pollutants in the wastewater is mostly derived from the use of chemicals in the process of colouring, dyeing, and washing textile materials. this is marked by the colour of the wastewater being discharged into the sewer.
the coloured wastewater, besides being aesthetically displeasing, is also harmful to the environment. the presence of large amounts of colour can block the penetration of sunlight into a body of water, leading to a disruption of the balance of water ecosystems. therefore, the textile industry wastewater must be treated properly. the main source of pollution in textile industry wastewater is the presence of high concentrations of dyes. the textile industry generally uses synthetic dyes, such as procion, erionyl, and auramine, in the colouring process. in textile colouring procedures such as batik, the synthetic colouring agents used include indigosol, naphtol and rapid [1]. the reasons behind the use of synthetic dyes rather than natural ones are the lower price, greater durability, easy obtainability and a wider variety of colour choices. these synthetic dyes are categorized as toxic and hazardous substances, whose presence in the water is not desirable because of the carcinogenic effects they cause. in addition, the colour changes have also reduced the water quality, along with an increasing turbidity due to synthetic-dye pollutants. for this reason, there is a need for a wastewater treatment technology that is simple and easy to apply in the textile industry, especially in small and medium production plants, which are common in palembang.

several technologies have been developed to overcome wastewater problems that involve synthetic dyes, such as adsorption with activated carbon [2], coagulation-flocculation [3], photocatalysis including semiconductors [4], and a combination of physical and biological treatment [5]. physical processing using adsorbents, such as zeolite and activated carbon, only removes pollutants from liquid media into solid media, but does not destroy the pollutants themselves. chemical processing involves the use of chemicals in large quantities and produces sludge, which must be separated at the end of the process, thus requiring further processing. biological processing has been proven to reduce cod in large quantities, but requires a long time, so the treatment of wastewater with this method generally requires a large land area.

an alternative method to handle textile industrial wastewater is using advanced oxidation processes (aops). aops are defined as oxidation processes involving the generation of hydroxyl radicals (•oh) as powerful oxidizing agents, which can destroy the wastewater pollutants and transform them to less toxic or even non-toxic products, thereby providing an ultimate solution for the wastewater treatment [6]. these processes offer a technology which can disintegrate the pollutants into less harmful components, and finally, to carbon dioxide and water. therefore, the fenton oxidation, one of the aops methods, was applied in this study to examine the reactive red 2 synthetic dye as a model pollutant. the advantage of the aop process with fenton's reaction is that it has a short reaction time compared to other aop processes. hydrogen peroxide reagents are used only in small amounts, and can degrade organic components that are difficult to decompose [7]. hydrogen peroxide is also environmentally friendly and easy to handle. the fenton process is very promising among other aops, because this system provides a high reaction yield, is able to produce hydroxyl radicals at a low cost, and is easy to carry out [8–12].
in addition, fe (iron) is widely available and nontoxic. the use of the fenton reagent does not require special equipment, so the main advantage of the fenton reagent method is the simplicity of the process itself [13]. the fenton reagent is a solution consisting of hydrogen peroxide (h2o2) and an iron catalyst (fe2+), which has a high oxidation ability in oxidizing pollutants or wastewater. the fenton reaction involves hydroxyl radicals produced from the oxidation reaction between h2o2 and fe2+. the use of the fenton reagent involves complex reactions beginning with the h2o2 decomposition reaction with the assistance of the fe2+ catalyst. the mechanism of the reaction starts with fe2+ initiating the reaction and catalysing the decomposition reaction of h2o2 to produce hydroxyl radicals (•oh) according to the reaction equation:

fe2+ + h2o2 −→ fe3+ + oh− + •oh (1)

the hydroxyl radical (•oh) is able to break down almost all organic compounds; it reacts with dissolved components and initiates successive reactions through a series of oxidation processes so that the component is degraded. although complex reactions are involved, generally the reactions that occur with fenton reagents are as follows [14]:

•oh + h2o2 −→ h2o + ho2• (2)
fe3+ + ho2• −→ fe2+ + h+ + o2 (3)
fe2+ + ho2• −→ fe3+ + ho2− (4)
fe2+ + •oh −→ fe3+ + oh− (5)

if illuminated with light of an appropriate wavelength (180-400 nm) (i.e., ultraviolet and some visible light), fe3+ can catalyse the formation of hydroxyl radicals:

fe3+ + h2o + hν −→ fe2+ + h+ + •oh (6)

reaction (6) is known as the photo-fenton reaction (fenton/uv) and is followed by reaction (1). the •oh production is determined by the presence of light with an appropriate wavelength and of hydrogen peroxide. in theory, by combining reactions (1) and (6), two moles of •oh will be produced per mole of hydrogen peroxide used [15]. a photocatalyst is a catalyst that works when absorbing light of a specific wavelength. photocatalysts are generally semiconductors that have a full valence band and an empty conduction band, such as tio2. the tio2 is a catalyst that is often used in the photocatalysis process because of its superiority. if the semiconductor is subjected to a certain wavelength of light, the electrons will be excited from the valence band to the conduction band to produce a hole in the valence band [6]. this process occurs in the early stages of a photocatalytic reaction. among many types of semiconductors, tio2 has become the preferred choice, especially in the form of anatase crystals as photocatalysts. commercially, tio2 powders are easy to obtain and have advantages, such as a high photocatalytic activity, non-toxicity, resistance to corrosion, and non-solubility in water. the tio2 also has a high light-absorption ability, which is characterized by the corresponding band gap energy (eg) value, which is 3.2 ev for the anatase structure. when light with energy hν equal to or greater than the band gap energy (eg) is used, the electrons in the valence band have enough energy to be excited to the conduction band, leaving a positive hole (h+vb) in the valence band:

tio2 + hν −→ e−cb + h+vb (7)

holes in the valence band can react with h2o to produce hydroxyl radicals:

h+vb + h2o −→ •oh + h+ (8)

this reaction is one type of advanced oxidation technique and is the beginning of the next photocatalytic reaction [6]. if there are other oxidizing agents, such as hydrogen peroxide or ozone, additional hydroxyl radicals can be produced under uv irradiation.
for example, hydrogen peroxide split in the presence of uv light produces two hydroxyl radicals:

h2o2 + hν −→ 2 •oh (9)

hydroxyl radicals, which are the defining characteristic of aops, have a high oxidation potential, so they can reduce cod levels in wastewater. the oxidation method with the fenton reagent has been applied for processing various kinds of industrial wastewater containing toxic organic compounds, such as from the olive oil processing industry [16], the palm oil processing industry [17], and pesticides [18]. in this study, the fenton-based aops method is used to degrade reactive red 2 (rr 2) synthetic dyes, and it will be compared with the use of fenton reagents and their combinations in processing rr 2 synthetic dyes. rr 2 is used as a model for synthetic dye wastewater because this dye is very commonly used in palembang, its price is relatively cheap, and it can dissolve in water without heating. traditional fabrics in palembang always use a red touch in their main products. the purpose of this study is to treat rr 2 synthetic dye wastewater using the fenton, fenton/tio2, fenton/uv, and fenton/tio2/uv methods, and also to study the parameters that affect the cod and colour reduction. furthermore, the best operating conditions are applied to treat the wastewater from a jumputan fabric home industry in palembang.

2. experimental
2.1. material
in this study, reactive red 2 (rr 2) synthetic dye was obtained from the dye supplier fajar kimia in jakarta. the titanium dioxide (tio2) catalysts were purchased from sigma aldrich, whereas sulphuric acid (h2so4), sodium hydroxide (naoh), sodium thiosulfate (na2s2o3), and the fenton reagents in the form of hydrogen peroxide (h2o2, 30 % w/v) and ferrous sulphate (feso4·7h2o) were obtained from merck. to adjust the ph, 0.1 m h2so4 and 0.1 m naoh were used. the uv source is a 15 watt uv lamp with a wavelength of 253.7 nm (uv-c).

2.2. procedures
the treatment of rr 2 is carried out in a batch reactor equipped with uv lamps and mechanical stirrers. synthetic dye wastewater is made by dissolving a certain amount of rr 2 in distilled water. the rr 2 initial concentration varied from 150-300 ppm. in this study, the fenton reagent was made from h2o2 and feso4·7h2o with a molar ratio of 20:1-80:1, using 4 mmol/l of feso4·7h2o. first, the cod and absorbance of the rr 2 solution with a concentration of 150 ppm were measured. then, the uv reactor was filled with the solution and set to a stirring speed of 500 rpm. this was followed by the addition of feso4·7h2o and setting the ph to 3 by adding 0.1 m h2so4 or 0.1 m naoh solution, before introducing h2o2.

2.2.1. fenton/tio2 process
in the fenton/tio2 process, the addition of 0.05 % (w/v) tio2 was carried out after adding the fenton reagent into the reactor. the experiments were repeated with a varying rr 2 concentration, molar ratio of [h2o2]/[fe2+], and tio2 concentration. the concentration of the tio2 catalyst varied from 0.05 to 0.4 % (w/v). in this process, the reaction time starts when tio2 is added.

2.2.2. fenton/uv process
the concentration of h2o2 varied with a fixed concentration of feso4·7h2o to make the molar ratio of [h2o2]/[fe2+] 20:1 − 80:1. the experiments were repeated with varying rr 2 concentrations and molar ratios of [h2o2]/[fe2+]. in the fenton/uv process, the reaction time starts when the uv lamp is turned on.

2.2.3. fenton/uv/tio2 process
in the fenton/uv/tio2 process, the concentration of the tio2 catalyst varied from 0.05 to 0.4 % (w/v). the addition of tio2 was carried out after adding the fenton reagent into the uv reactor. the experiments were repeated with varying rr 2 concentrations, molar ratios of [h2o2]/[fe2+], and tio2 concentrations. in this process, the reaction time starts when tio2 is added and the uv lamp is turned on. samples of the solution were taken from the fenton/tio2 process, the fenton/uv process, and the fenton/uv/tio2 process every 5 minutes, to analyse the cod value and the absorbance. after taking samples from each process, 0.1 ml of 1 n na2s2o3 was immediately added to each sample solution to stop the reaction [19].
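a back-of-the-envelope sketch of the reagent dosing implied by these procedures (4 mmol/l of fe2+ and a 1:80 [fe2+]/[h2o2] molar ratio, dosed from the 30 % w/v h2o2 stock); the batch volume and the function name are illustrative, and the molar masses are standard values, not data from the paper:

M_H2O2 = 34.01           # molar mass of h2o2 [g/mol]
M_FESO4_7H2O = 278.01    # molar mass of feso4·7h2o [g/mol]

def fenton_doses(volume_l, fe_mmol_per_l=4.0, h2o2_to_fe=80.0, h2o2_stock_g_per_l=300.0):
    """mass of feso4·7h2o [g] and volume of 30 % w/v h2o2 stock [ml] for one batch."""
    fe_mol = fe_mmol_per_l * 1e-3 * volume_l
    h2o2_mol = h2o2_to_fe * fe_mol
    feso4_g = fe_mol * M_FESO4_7H2O
    h2o2_stock_ml = h2o2_mol * M_H2O2 / h2o2_stock_g_per_l * 1000.0
    return feso4_g, h2o2_stock_ml

feso4_g, h2o2_ml = fenton_doses(volume_l=1.0)    # hypothetical 1-litre batch at 1:80
print(f"feso4·7h2o: {feso4_g:.2f} g, h2o2 stock: {h2o2_ml:.1f} ml")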
3. analyses
the ph measurement is done by using a hanna ph meter, the colour analysis through absorbance measurements using a uv-visible genesys 20 spectrophotometer, while the cod value was determined by the titrimetric method (astm d-1252). the percentage of the rr 2 colour degradation is calculated based on the equation

% colour degradation = (a0 − at)/a0 · 100 %, (10)

with a0 being the colour absorbance at t = 0, and at being the colour absorbance at time t. the percentage of the cod degradation is calculated by the following equation:

% cod removal = (cod0 − codt)/cod0 · 100 %, (11)

with cod0 being the cod at t = 0, and codt being the cod at time t.
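a minimal helper implementing eqs. (10) and (11); the sample absorbance and cod values are hypothetical, for demonstration only:

def percent_removal(value_at_t0, value_at_t):
    """eqs. (10) and (11): percentage of colour or cod removed at time t."""
    return (value_at_t0 - value_at_t) / value_at_t0 * 100.0

print(percent_removal(0.842, 0.010))   # hypothetical absorbances a0, a_t
print(percent_removal(512.0, 30.0))    # hypothetical cod values [mg/l]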
4. results and discussion
the parameters analysed in this study are colour and cod. the cod value represents the amount of total oxygen needed to decompose organic and chemical compounds that are chemically dissolved in a wastewater. in textile industry wastewater, the presence of pollutants in the form of organic compounds is caused by the use of chemical dyes in the production process, such as colouring and dyeing, resulting in the wastewater containing various chemicals [20]. therefore, the cod value can be used as a parameter to determine changes in the pollutant content and quality of the wastewater for the processing method used.

fenton's oxidation, as indicated in the reaction equation (1), is a process that is very dependent on ph, because ph plays an important role in the •oh production mechanism in fenton's reaction. hydroxyl radicals can be formed efficiently under acidic conditions. at high ph, ferrous ions (fe2+) are unstable and will easily form ferric ions (fe3+), which tend to produce colloidal hydrosol complexes. this complex inhibits the hydroxyl radical production; in this form, iron ions catalyse the decomposition of hydrogen peroxide to oxygen and water without producing hydroxyl radicals. at ph 2-5, on the other hand, the solubility of fe2+ is still high enough so that the hydroxyl radical can be produced [21]. in studies of the treatment of azo and reactive synthetic dyes, the highest colour removal and cod removal were obtained at ph 3 [22–24]. therefore, in this study, ph 3 was used in each experiment.

in a previous study, the fenton reagent was used to treat rr 2 with a concentration of 150 ppm. as a result, the photo-fenton process was superior to the fenton process. in the fenton process, during the 20 minutes of reaction, only 69 % of the colour reduction was obtained using [fe2+]/[h2o2] molar ratios of 1:20 − 1:80, while a colour degradation of 99.9 % was achieved within 10 minutes of reaction by using the photo-fenton process with the [fe2+]/[h2o2] molar ratio of 1:80 [22]. the fenton process alone is thus considered ineffective; therefore, the use of fenton reagents needs to be combined with other advanced oxidation methods, namely uv light, the tio2 catalyst, or the uv/tio2 photocatalyst. the mechanism of the reaction proposed for the rr 2 degradation by using the fenton reagent is as follows:

c19h10cl2n6na2o7s2 + •oh −→ c19h10cl2n6na2o7s2(−oh)+ + oxidized intermediates + co2 (12)

4.1. effect of fenton's reagent molar ratio
the fenton reagent is a combination of chemicals that uses hydrogen peroxide and an iron catalyst. h2o2 and fe2+ ions not only react to form hydroxyl radicals through the reaction equation (1), but both can also consume hydroxyl radicals through the reaction equations (2) and (5). the ratio of [fe2+]/[h2o2] will affect the rate of production and the consumption of hydroxyl radicals [15]. therefore, it is very important to use the right [fe2+]/[h2o2] ratio in the treatment. the effect of the molar ratio of the fenton reagent on the rr 2 and cod degradation is shown in figures 1a and 1b. it can be seen from figure 1 that the efficiency of colour degradation and cod removal is still increasing when the ratio of [fe2+]/[h2o2] is increased to 1:80. a further addition of h2o2 could possibly enhance the removal of cod. in the figure, it can be seen that the higher the fenton reagent molar ratio used in this study, the higher the colour and cod degradation obtained. the highest rr 2 and cod degradation was achieved with the highest molar ratio of 1:80, with percentages of colour and cod degradation of 97.5 % and 59.4 %, respectively.

the importance of the ratio of [h2o2]/[fe2+] in the treatment of carpet-colouring wastewater using fenton reagents was investigated by [24]. in their study, it was reported that the cod removal increased with an increase in the ratio of [h2o2]/[fe2+] between 95 and 290, while a ratio above 290 caused a reduction in the cod removal. the highest cod removal was achieved at a ratio between 95 and 290 (g/g), which is equivalent to a molar ratio of 153-470 [24]. other research regarding the decolourization and mineralization of some commercial reactive dyes using fenton reagents by both homogeneous and heterogeneous processes, and by the homogeneous fenton process with uv light (fenton/uv), has shown that the homogeneous fenton/uv process is the most effective process. in this process, a high level of mineralization (78-84 %) and decolourization (95-100 %) has been achieved. optimal operating conditions for an efficient colour degradation in all investigated dyes (100 ppm concentration) were obtained when using a [fe2+]/[h2o2] molar ratio of 0.5 mmol/l / 20 mmol/l, or 1:40 [23].

in the previous study, the use of the fenton reagent alone (molar ratio of 1:80) in processing the rr 2 with a concentration of 150 ppm resulted in a 69 % colour degradation, while the use of uv lamps and the fenton reagent with the same molar ratio is able to achieve a higher colour degradation of 97.5 %, even though it is used at higher rr 2 concentrations (300 ppm). thus, the fenton/uv method provides better results than the method using only fenton. this is due to more hydroxyl radicals, initiated by the presence of uv light according to equation (9), being produced.
the results of this study are in agreement with the research on the degradation of the reactive blue 19 dye (rb 19) using aops. the study reported that the use of fenton/uv (photo-fenton) resulted in a greater degradation of rb 19 compared to the use of uv, h2o2, uv/h2o2, and the fenton reagent only. the uv/h2o2 process is effective enough to achieve a 91 % colour degradation, the downside being the required time of 3 h. the fenton process gives > 98 % colour degradation in a few minutes, but the decrease in the dissolved organic carbon content (doc) is only 36.8 %. the most effective process is photo-fenton, where a colour degradation of 99.4 % and a decrease in dissolved organic carbon of 94.5 % were reached [25]. thus, the achieved difference in the colour degradation and the appropriate fenton reagent molar ratio is strongly influenced by the chemical structure of the dye, the process used, and the type of pollutant treated.

figure 1. effect of the fenton reagent molar ratio on (a) rr 2 colour degradation, (b) cod removal by using the fenton/uv method (rr 2 concentration of 300 ppm, ph 3, reaction time 5 min).

figure 2. effect of the tio2 catalyst concentration on (a) rr 2 colour degradation and (b) cod removal by the fenton/tio2 and fenton/tio2/uv methods (fenton reagent molar ratio of 1:80, rr 2 concentration of 150 ppm, ph 3, reaction time of 5 min).

4.2. effect of tio2 catalyst concentration
the catalyst concentration is very important, especially in its use together with the presence of photon sources (as the photocatalyst), such as uv light or sunlight, because an excessive amount of catalyst can reduce the effectiveness of the oxidation itself, where the catalyst can block the uv light from degrading the pollutants [26]. the effect of the catalyst concentration on the colour degradation and cod removal can be seen in figures 2a and 2b. the higher the catalyst concentration used in this study, the greater the percentage of colour degradation and cod removal achieved. this is in line with the research conducted by [27], who studied the effect of the tio2 amount, ph, and temperature on the decolourization of c.i. reactive red 2 using a system that involves ultraviolet-ultrasound-tio2. their results indicate that the rate of decolourization rises constantly with the increase in the amount of tio2 catalyst used in the range of 0.5−2 g/l. several studies have also shown that the rate of the decolourization rises with the increase in the amount of the catalyst used [4, 28–31]. the increase in the photocatalyst activity with the increase in the amount of catalyst used is an indication of a heterogeneous catalyst regime, because the fraction of light absorbed by the catalyst
the presence of uv light can initiate the fenton photo reaction (equation 6) and photocatalytic reaction (equation 8), which produces hydroxyl radicals. in addition, the presence of hydrogen peroxide as an oxidizing agent can also produce additional hydroxyl radicals under uv irradiation based on equation (9). 4.3. comparative method of fenton/tio2, fenton/uv, and fenton/tio2/uv the treatment of the rr 2 with a concentration of 150 ppm using the fenton/tio2, fenton/uv, and fenton/tio2/uv methods produced a percentage of color degradation at 85.8 %, 98.5 %, and 98.8 % colour degradation, respectively, as shown in figure 3a. it can be concluded that the fenton/tio2/uv method is a method that is superior to the fenton/tio2 and fenton/uv methods. from figures 3a and 3b, it can be seen that the use of uv becomes very important, because a very large increase in the percentage of the degradation compared to using the fenton reagent only is obtained when using uv lamps compared to when adding tio2 catalysts. this can be seen by comparing the degradation percentage of the rr 2 and the removal of the cod achieved in the treatment with fenton/tio2 and fenton/uv methods. treatment of the rr 2 with fenton alone resulted in a degradation percentage of 69 % within 20 minutes. while the use of fenton/tio2 and fenton/uv methods resulted in the rr 2 degradation percentages of 85.8 % and 98.5 % within 5 minutes of reaction, respectively, within 5 minutes of reaction. figure 3b shows a comparison of the three methods in reducing the cod value in the rr 2 with a concentration of 150 ppm. the cod removal of 81.2 %, 88 %, and 89.5 %, respectively, were achieved by using the fenton/tio2, fenton/uv, and fenton/tio2/uv methods. the same as in coloras in the colour degradation, the fenton/tio2/uv method is a method that is superior to the fenton/tio2 and fenton/uv methods in reducing the cod. this is due to the use of fenton/tio2/uv, the more hydroxyl radicals will produced, so thatresulting in more pollutants can be degraded which results in aand a decrease in the cod value. in the treatment of the black effluent and blue effluent from the textile industry, it was reported that the use of the photo-fenton method with the help of uv lamps proved to provide the highest colour and cod degradation compared to the fenton and hydrogen peroxide methods only. in for the black effluent, colour and cod degradation reached 39 % and 84 %, respectively, while in for the blue effluent, colour and cod degradation were 56 % and 66 %, respectively [31]. in this study, the highest percentage of degradation was obtained by using the fentonphotocatalytic (fenton/tio2/uv) method, because of the increasing number of radical hydroxyls, caused by the presence of uv light and photocatalytic process (tio2/uv).in this study, the highest percentage of degradation was obtained by using the fentonphotocatalytic (fenton/tio2/uv) method, wherein the addition of radical hydroxyl besides being assisted by the presence of uv light, also came from the photocatalytic process (tio2/uv) involved in it. 4.4. treatment of jumputan fabric wastewater to determine the effectiveness of the treatment method that has been studied, the best operating conditions that have been obtained in the treatment of the rr 2, namely the fenton reagent molar ratio of 1 : 80, 0.4 % tio2 catalyst concentration, and ph of 3, are then applied to treat the jumputan fabric wastewater coming from the centre of the textile home industry in palembang. 
the percentage of the cod removal of the jumputan fabric wastewater using the fenton/tio2, fenton/uv, and fenton/tio2/uv methods is shown in figure 4. the highest cod removal of 94 % was obtained in the treatment of the jumputan fabric wastewater by using the fenton/tio2/uv method. this is because, in this method, the hydroxyl radical production is doubled, based on the reaction equations (1) and (6). in this case, there is a cycle between the +2 and +3 oxidation numbers of the fe element. the •oh production is determined by the presence of light with an appropriate wavelength and of hydrogen peroxide. theoretically, by combining equations (1) and (6), two moles of •oh can be produced per mole of hydrogen peroxide consumed.

shahwar et al., in their research on the effluent treatment of the textile industry, have compared the photo-fenton method, using both uv light and sunlight, with other advanced oxidation processes (h2o2, fenton, and ozonation) to see how the efficiency of the colour and cod removal changes. it is reported that the photo-fenton method assisted by sunlight is the most energy and cost effective treatment process (100 to 150 times lower cost than the uv/fenton and ozone methods) among all the advanced oxidation processes studied in their study [31]. in the study, it was reported that the colour and cod degradation were 61 % and 85 %, respectively, for the blue effluent, and 52 % and 88 %, respectively, for the black effluent, in 18 h of irradiation.

figure 3. (a) rr 2 colour degradation and (b) cod removal by the fenton/tio2, fenton/uv, and fenton/tio2/uv methods (fenton reagent molar ratio of 1:80, rr 2 concentration of 150 ppm, tio2 concentration of 0.4 %, ph 3, reaction time of 5 min).

figure 4. cod removal from the jumputan fabric wastewater using the fenton/tio2, fenton/uv, and fenton/tio2/uv methods (fenton reagent molar ratio of 1:80, ph of 3, 0.4 % tio2 catalyst concentration, reaction time of 120 min).

thus, there is an opportunity for the jumputan fabric wastewater treatment using the fenton method with the aid of sunlight; although it will take a longer time, it will require less energy. in this case, the uv light source is replaced by sunlight, which is available abundantly throughout the year in indonesia. the fenton/tio2/uv method with sunlight is one of the promising choices in treating textile industrial wastewater, considering that the tio2 catalyst can be activated with the support of sunlight.

5. conclusion
in this study, the reactive red 2 (rr 2) synthetic dye was treated by using fenton-based advanced oxidation processes, namely fenton/tio2, fenton/uv, and fenton/tio2/uv. the effects of the molar ratio of the fenton reagent, the rr 2 initial concentration, and the tio2 catalyst concentration on the colour degradation and cod removal were studied, and the best operating condition was applied to the jumputan fabric wastewater. from the results of the study, it can be concluded that the fenton/tio2/uv method is superior to the other methods, with an achieved colour degradation and cod removal of 98.8 % and 94.1 %, respectively, using the fenton reagent molar ratio of 1:80, a ph of 3, and a tio2 catalyst concentration of 0.4 % (w/v), in a reaction time of 5 minutes. this condition was applied to the jumputan fabric wastewater, where a cod removal of 94.1 % was obtained in a reaction time of 120 minutes.
acknowledgements
the authors would like to express appreciation and gratitude to the colleagues, students, and all parties involved in this research. we also express our gratitude for the support of the facilities, technicians, and analysts of the waste treatment technology laboratory of the chemical engineering department of universitas sriwijaya.

references
[1] f. jannah, a. rezagama, f. arianto. pengolahan zat warna turunan azo dengan metode fenton (fe2+ + h2o2) dan ozonasi (o3). jurnal teknik lingkungan 6(3):1–11, 2017.
[2] t. e. agustina. pengolahan air limbah pewarna sintetik dengan metode adsorpsi menggunakan karbon aktif. jurnal rekayasa sriwijaya 20(1):36–42, 2011.
[3] a. f. rusydi, d. suherman, n. sumawijaya. pengolahan air limbah tekstil melalui proses koagulasi – flokulasi dengan menggunakan lempung sebagai penyumbang partikel tersuspensi (studi kasus: banaran, sukoharjo dan lawean, kerto suro, jawa tengah). arena tekstil 31(2):105–114, 2016. doi:10.31266/at.v31i2.1671.
[4] t. e. agustina, r. komala, m. faiza. application of tio2 nano particles photocatalyst to degrade synthetic dye wastewater under solar irradiation. contemporary engineering sciences 8(34):1625–1636, 2015. doi:10.12988/ces.2015.511305.
[5] h. suprihatin. kandungan organik limbah cair industri batik jetis sidoarjo dan alternatif pengolahannya. jurnal kajian lingkungan 2(2):130–138, 2014.
[6] y. deng, r. zhao. advanced oxidation processes (aops) in wastewater treatment. current pollution reports 1(3):167–176, 2015. doi:10.1007/s40726-015-0015-z.
[7] a. d. bokare, w. choi. review of iron-free fenton-like systems for activating h2o2 in advanced oxidation processes. journal of hazardous materials 275:121–135, 2014. doi:10.1016/j.jhazmat.2014.04.054.
[8] v. pawar, s. gawande. an overview of the fenton process for industrial wastewater. iosr journal of mechanical and civil engineering pp. 127–136, 2015.
[9] s. krishnan, h. rawindran, c. m. sinnathambi, j. w. lim. comparison of various advanced oxidation processes used in remediation of industrial wastewater laden with recalcitrant pollutants. iop conference series: materials science and engineering 206(1):012089, 2017. doi:10.1088/1757-899x/206/1/012089.
[10] a. morone, p. mulay, s. p. kamble. removal of pharmaceutical and personal care products from wastewater using advanced materials. chapter 8 in m. n. v. prasad, m. vithanage, a. kapley (eds.), pharmaceuticals and personal care products: waste management and treatment technology, pp. 173–212. butterworth-heinemann, 2019. doi:10.1016/b978-0-12-816189-0.00008-1.
[11] h. bhuta. advanced treatment technology and strategy for water and wastewater management. chapter 4 in v. v. ranade, v. m. bhandari (eds.), industrial wastewater treatment, recycling and reuse, pp. 193–213. butterworth-heinemann, oxford, 2014. doi:10.1016/b978-0-08-099968-5.00004-0.
[12] r. ameta, a. k. chohadia, a. jain, p. b. punjabi. fenton and photo-fenton processes. chapter 3 in s. c. ameta, r. ameta (eds.), advanced oxidation processes for waste water treatment, pp. 49–87. academic press, 2018. doi:10.1016/b978-0-12-810499-6.00003-6.
[13] t. e. agustina. aops application on dyes removal, pp. 353–372. springer netherlands, dordrecht, 2013. doi:10.1007/978-94-007-4942-9_12.
[14] v. vatanpour, n. daneshvar, m. rasoulifard. electro-fenton degradation of synthetic dye mixture: influence of intermediates. j. environ. eng. manage. 19(5):277–282, 2009.
[15] s. parsons.
advanced oxidation processes for water and wastewater treatment. iwa publishing, london, uk, 2004.
[16] a. alver, e. baştürk, a. kılıç, m. karataş. use of advance oxidation process to improve the biodegradability of olive oil mill effluents. process safety and environmental protection 98:319–324, 2015. doi:10.1016/j.psep.2015.09.002.
[17] r. yulia, h. meilina, a. adisalamun, d. darmadi. aplikasi metode advance oxidation process (aop) fenton pada pengolahan limbah cair pabrik kelapa sawit. jurnal rekayasa kimia & lingkungan 11:1–9, 2016. doi:10.23955/rkl.v11i1.4098.
[18] m. gar alalm, a. tawfik. fenton and solar photo-fenton oxidation of industrial wastewater containing pesticides. in 17th international water technology conference, vol. 2, 2013.
[19] x.-r. xu, h.-b. li, w.-h. wang, j.-d. gu. degradation of dyes in aqueous solutions by the fenton process. chemosphere 57(7):595–600, 2004. doi:10.1016/j.chemosphere.2004.07.030.
[20] m. tichonovas, e. krugly, v. racys, et al. degradation of various textile dyes as wastewater pollutants under dielectric barrier discharge plasma treatment. chemical engineering journal 229:9–19, 2013. doi:10.1016/j.cej.2013.05.095.
[21] t. e. agustina, h. m. ang. decolorization and mineralization of c.i. reactive blue 4 and c.i. reactive red 2 by fenton oxidation process. international journal of chemical and environmental engineering 3(3):141–148, 2012.
[22] t. e. agustina, a. wijaya, f. mermaliandi. degradation of reactive red 2 by fenton and photo-fenton oxidation processes. arpn journal of engineering and applied sciences 11(8):5227–5231, 2016.
[23] s. papić, d. vujević, n. koprivanac, d. sinko. decolourization and mineralization of commercial reactive dyes by using homogeneous and heterogeneous fenton and uv/fenton processes. journal of hazardous materials 164(2-3):1137–1145, 2009. doi:10.1016/j.jhazmat.2008.09.008.
[24] i. gulkaya, g. a. surucu, f. b. dilek. importance of h2o2/fe2+ ratio in fenton’s treatment of a carpet dyeing wastewater. journal of hazardous materials 136(3):763–769, 2006. doi:10.1016/j.jhazmat.2006.01.006.
[25] j. r. guimarães, m. g. maniero, r. nogueira de araújo. a comparative study on the degradation of rb19 dye in an aqueous medium by advanced oxidation processes. journal of environmental management 110:33–39, 2012. doi:10.1016/j.jenvman.2012.05.020.
[26] t. agustina, h. ang, v. pareek. treatment of winery wastewater using a photocatalytic/photolytic reactor. chemical engineering journal 135(1):151–156, 2008. doi:10.1016/j.cej.2007.07.063.
[27] c.-h. wu, c.-h. yu. effects of tio2 dosage, ph and temperature on decolorization of c.i. reactive red 2 in a uv/us/tio2 system. journal of hazardous materials 169(1):1179–1183, 2009. doi:10.1016/j.jhazmat.2009.04.064.
reliability of human subject – artificial system interactions
m. novák, j. faber, z. votruba, v. přenosil, t. tichý, p. svoboda, v. tatarinov

abstract
the main problems related to the reliability of the interaction between a human subject and an artificial system (namely of a transportation character) are discussed. the paper consists of three major parts: the first is devoted to the theoretical background of the problem from the viewpoints of both the theory of system reliability and neurology/psychology. the second part presents a discussion of the relevant methodologies for the classification and prediction of reliability decline; the methodology based on eeg pattern analysis is chosen as the appropriate one for the presented task, and the key phenomenon of "micro-sleep" is discussed in detail. the last part presents some of the latest experimental results in the context of the presented knowledge. proposals for future studies are presented at the end of the article. special interest should be devoted to the analysis and in-time prediction of fatal attention decreases and to the design and construction of a respective on-board applicable warning system.

keywords: man-system interaction, reliability, attention, vigilance, classification, prediction, eeg analysis, micro-sleep, neuroinformatics.

1 introduction
none of the many artificial systems that human society deals with on a daily basis can operate entirely independently – all of them still have to be controlled, or at least supervised, by man. we will not study here the problem of whether or when (maybe sometime in the distant future) artificial systems can be developed that will be fully independent of human beings. as far as we can see, this seems not to be probable. all the artificial systems that we need for our life, now and in the foreseeable future, will require interaction with a human operator (see e.g. [18]). the quality and reliability of such an interaction is therefore of extreme importance for all of us, irrespective of the level of intelligence of the particular artificial system.

2 an overview of the current state of the art
very long experience of human beings dealing with artificial systems leads to the regrettable observation that the human factor is very often the weakest point in the interaction. the reason for this is quite natural and easy to understand: unlike artificial systems, humans cannot operate indefinitely without a break – human beings need to relax, rest and sleep.
decreasing human vigilance and attention while operating or using a given system has always been and still is the most frequent cause of system failures and accidents. another source of system failures lies in the possibility that the human operator (or user) of a particular artificial system may react too late, and that his/her decision and reaction may be incorrect. human behavior is not fully deterministic; it varies from subject to subject. all these factors combine, with the result that the reliability of the human subject – artificial system interaction is limited, above all from the human side. the price that we all pay for insufficiently reliable human – artificial system interaction is tremendous. moreover, this price increases with time and also – surprisingly – with the level of sophistication of the artificial system. we urgently need to counteract this human unreliability. various approaches can be taken in an attempt to minimize the hazard of system operation failures. however, the following are the main ways of improving system reliability (see [4]):
• by making the system as robust as possible, preferring the use of highly reliable (and therefore massive and expensive) components for its construction.
• by duplicating and often even multiplying the most important parts of the system, or even duplicating or multiplying whole systems, which may operate as "hot" reserves. like the first approach, this can also require very great effort and expense.
• a more sophisticated approach tries to modify the system structure in a way that minimizes the sensitivity of its system functions to parameter changes.
• in very sophisticated systems the predictive diagnostic approach can be applied, which enables us to analyze how much and in what direction the values of some system parameters can deviate from their nominal values without affecting the properties of the system as a whole, and to predict with what probability and when these limits of acceptable system behavior will be broken.
these four main approaches are in practice usually combined. however, as concerns the reliability of interactions between human subjects and artificial systems, only the last approach seems truly able to minimize the frequency of system operation failures caused by human error and failure
(see [23]). of course, the first approach, which favors selecting well-qualified operators and then training and testing them, is also very important. this paper on problems related to the technical aspects of the reliability of interactions between a human subject – either in the role of an operator (driver, pilot, dispatcher), supervisor or user – and an artificial, namely technical (and above all transportation) system therefore concentrates on creating a practically applicable tool for diminishing the losses incurred when such interaction reliability is diminished. the problem of interaction reliability between artificial systems and human beings (operators, users), though extremely important, has not yet been solved satisfactorily, even though most complex and complicated systems are now controlled by computers or with the aid of computers (see [1, 2, 3, 5, 6, 7, 10, 11, 12, 16, 17, 22, 23]). a human operator, who has to interact with a powerful, complicated and often efficient artificial system (e.g. a transport system, aircraft, express train, large truck, extensive power station, security or defense systems, etc.), needs to react rapidly and correctly to very variable situations over a long period of time. the high load on the brain and nervous system of the operator results in an inevitable decrease in his/her vigilance, and in loss of attention. the complex demands on a human operator interacting with a complicated and powerful artificial system are moreover often combined with the influence of some internal and external factors (such as long length of service, psychic isolation in the course of service, the current mental and physical state of the operator, climatic conditions, the quality of the environment in the given cockpit or control room, the monotony of the scene or image that the operator has to observe, etc.). the psychic load of the system operator and the presence of pre-service or in-service stress factors can play a very important role in the reliability of a human subject – artificial system interaction. we also have to take into account various other factors that are typical for humans, namely:
• the extremely high variability of human subjects, especially as concerns intelligence, education, level of operation training and skill, ability to concentrate, tendency to irrational and panicky reactions to unexpected situations, general disposition, aggressiveness, etc.
• all human subjects are individuals; none react identically.
• human subjects dealing with artificial systems differ greatly in age, race and sex.
• unlike artificial systems, the reactions of human subjects cannot be exactly repeated – a human subject learns from experience and modifies his/her behavior to use a minimum of physical and psychic energy for controlling or applying the system.
all these factors seriously complicate the problem – we cannot use standard methods of control and system engineering.
our methodology needs to change from the standard psychologically-oriented approaches used for decades in system operator education and training to more modern approaches, based on better knowledge of system reliability theory, mathematical analysis, artificial neural networks, numerical methods, etc. this makes our problem much more complex – we need to combine knowledge from very different areas of science.

3 theoretical backgrounds of interaction reliability
artificial systems can interact with human beings on the basis of:
• human control of system operation,
• human supervision of system operation,
• human use of system operation,
• human society interaction with system operation.
all these four fundamental forms are important, and in practice they are very often combined. for the first three forms, the reliability and safety of system operation depends directly on the reliability and safety of these interactions. in all these cases, interaction failures can lead to fatal situations, or at least to serious economic losses. in the case of interaction between an artificial system and various members of human society forming its environment, interaction reliability and predictability can also be of very great importance. this is especially true in situations when the artificial system suddenly changes its behavior, and when it interacts with a large and heterogeneous part of human society. in order to estimate this environmental reaction, we need a deep understanding not only of the individual behavior of a human subject exposed to interaction with the varying properties of a particular artificial system, but also of any social factors that may exist or may be activated in the relevant part of human society. such studies are evidently of great importance for general safety, but they are very difficult and time-consuming and require access to large special databases storing the results of many measurements of human subject interaction reliability markers. the micro-sleep base to be discussed below is an example of such a database. as mentioned above, the unsatisfactory reliability of almost all artificial systems used by man throughout history is a major problem. this is caused not by the low reliability and short lifetime of the artificial systems themselves, but by various errors by the human operators who deal with such systems. recently, thanks to technological progress, the reliability of artificial systems has improved greatly. consequently the probability of technical faults in well-designed and well-manufactured artificial systems is now usually very low (though unfortunately still not zero). however, the probability of failure resulting from misuse and faults in system operator activities has increased rapidly. the main reasons for this unacceptable situation are as follows:
• the increasing complexity of the systems,
• the increasing demands on the operator's ability,
• the increasing demands on his/her level of continuous, long-term attention, and
• the increasing demands on the speed of his/her reactions.
the losses resulting from artificial system operation faults are proportional to their power, significance and value. in the case of many modern transportation systems (large planes, fast trains, large ships, trucks), large power plants, major financial systems, security and defense systems, and also major medical care systems, the losses resulting from a malfunction can be catastrophic.
therefore, alongside the continuing interest in minimizing the probability of technical failures in any artificial system (at an acceptable cost), considerable interest has also been shown in recent years in the reliability of system operator activity. studies demonstrate that human error accounts for a substantial part of the losses incurred by artificial system malfunctions. the demands on a human operator of an artificial system can be categorized as follows:
a) demands on attention level and attention span,
b) demands on the speed of the operator's reactions,
c) demands on the correctness of the operator's decisions.
let us suppose that these three above-mentioned kinds of requirements can be expressed as the real values xi of some significant parameters xi, i = 1, 2, 3, which express the level of attention, the speed of reaction and the correctness of the operator's decision. the acceptable values of the vector x = {x1, x2, x3} fill out in the space x some region ra, called the region of acceptability. the concept of regions of acceptability is well known from the theory of tolerances of system parameters (see [4]). in the theoretical case that the parameters xi are independent, the region ra has a rectangular shape. however, in practice there are certain correlations between x1, x2 and x3; ra therefore has a complicated shape. the analysis of ra is a very important but usually rather laborious problem, for which many methods have been developed. though the set of these methods is still not closed, for the purpose of this paper let us suppose that we know how to analyze the region of acceptability of the interaction of operators with a certain system. let us also suppose that we are dealing only with so-called well operated systems, i.e., with operator – system interactions that work well, at least initially. the following theorem can therefore be formulated: each real interaction between a human operator and an artificial system, when exposed to the influence of a set of independent variables p (time is usually – in dynamic systems – one of them), can be represented by some position of the vector x in the parameter space, moving along a certain interaction life-curve λ(t). for t → ∞, the life-curve λ(t) at least once breaks the boundaries of ra. this is illustrated in fig. 1. in accordance with the general theory of system reliability, we can state that the reliability of some operator – system interaction is represented as the probability h that for a certain time interval (or some interval of another independent variable influencing the system under consideration) the respective interaction life-curve λ(t) will remain inside the region of acceptability ra. similarly, the safety of the operator – system interaction is to be considered as the probability that it will be resistant to any disturbing influences.
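the reliability h introduced above can be estimated numerically. the following sketch is our own minimal illustration, not code from the paper: it assumes a box-shaped region of acceptability and a random-walk model of the life-curve, and counts the fraction of simulated service runs that never break the boundaries.

    import numpy as np

    def interaction_reliability(steps=500, trials=2000, drift=-0.0005, noise=0.01, seed=1):
        # monte-carlo estimate of h: fraction of simulated life-curves that stay
        # inside a box-shaped region of acceptability for the whole service time
        rng = np.random.default_rng(seed)
        lo = np.array([0.4, 0.4, 0.4])   # assumed lower bounds of ra for x1, x2, x3
        hi = np.array([1.0, 1.0, 1.0])   # assumed upper bounds of ra
        ok = 0
        for _ in range(trials):
            x = np.array([0.9, 0.9, 0.9])                       # start in full vigilance
            inside = True
            for _ in range(steps):
                x = x + drift + noise * rng.standard_normal(3)  # slow decay plus fluctuation
                if np.any(x < lo) or np.any(x > hi):            # life-curve breaks ra
                    inside = False
                    break
            ok += inside
        return ok / trials

    print(interaction_reliability())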
a straightforward correlation exists especially between attention level and speed of reaction. operators demonstrating a high level of attention usually also possess very fast reactions. on the other hand, cases can occur when a fast, almost impulsive reaction is not accompanied by a very high level of the operator's concentration and attention. some people can react fast even when their attention is dispersed over very different objects (they have very fast but unreliable reflexes). in addition, a high level of attention leads in most cases to a very high probability of correct decisions; conversely, if somebody is not concentrating enough, there is a rather low probability that his/her decision will be correct. on the other hand, in the case of very fast reactions accompanied by a very low level of the human operator's attention, the probability of an incorrect decision can increase significantly. this is typical for a so-called surprise reaction, which can sometimes change into a panicky reaction. a drop in the attention level of a given human operator can be caused by various external or internal reasons; some of these have a general character, while the intensity of others depends significantly on the operator's individuality. general conditions causing a decrease in attention include:
• extreme length of service without a break,
• physical and mental exhaustion,
• a monotonous scene, which the operator has to observe for a long time,
• extreme temperature in which the operator has to serve (too high or too low),
• extreme humidity in which the operator has to serve (too high or too low),
• extreme air pressure,
• air smell, dust density, etc.
situations leading the operator to concentrate on problems other than his/her main task can likewise cause attention to drop. if the task is also monotonous, this can lead to a micro-sleep. of course, in practice, the shape of ra and λ(t) is much more complicated than in the schematic fig. 1. for example, the situation occasionally occurs that the operator's attention improves in the course of his/her interaction with the system even if he/she was already in the stage of micro-sleep. in such a case, the respective interaction life-curve λ(t) returns inside ra, and after some time the process of attention decrease starts again. such episodes, schematically expressed in fig. 2 for two repetitions, can be repeated several times. in the fortunate case the operator reaches the end of his/her service without an accident; in an unfortunate case, an accident occurs. the number of possible repetitions of such breaking of the ra boundaries in micro-sleep episodes depends on many factors, including the individuality of the particular operator, his actual mental and physical state, the environmental and technical conditions of system operation and control, etc. though this is a very important problem, present-day knowledge is far from adequate, and much more research needs to be done in this area. as in many cases when the regions of acceptability of technical system operation are investigated, here, too, the boundaries of the regions of acceptable interaction are very often fuzzy.

fig. 1: an idealized run of the interaction life-curve of an operator – system interaction inside and outside the respective region of acceptability

4 human vigilance and attention classification and prediction
before starting a discussion of the problems of human subject attention decrease and micro-sleep, we will attempt to define this curious state of the human organism. the available literature offers varied descriptions of this phenomenon. it is usually characterized as a state of an organism in which the eyes are closed and vigilance approaches zero. however, micro-sleep can also be understood as a state of the organism when its vigilance decreases below a certain limit, without reference to the activity of the visual tract. there are also several other conceptions of micro-sleep that lie between these two limits.
when dealing with man-system interaction reliability, a decrease of vigilance seems to be much more significant than the closing of the eyes, but a break in the input of visual information longer than a certain limit (e.g., about 1 second for car drivers) can also be dangerous. the latter definition has been considered for the purposes of [24, 25] to be a reasonable basis for the concept of micro-sleep, and will be used in the further discussion. first, however, we have to state what we mean by vigilance and what we mean by attention:
vigilance … the state of the organism in which all its mental functions can be realized and when all receptor signals are accepted and well processed.
attention … the form of vigilance in which the dominant part of the mental functions is concentrated on external objects (focused attention is considered as concentration on a certain object).

fig. 2: repeated breaking of the ra boundaries by λ(t) in two temporary episodes and one final micro-sleep episode
fig. 3: decrease of attention in the course of a human subject's activity

in a certain sense, attention can be considered as a special case of vigilance. in the state of vigilance, the human operator can react to all received signals. in the state of attention, she/he is usually concentrating on a certain group of received signals, which are dominant for the function she/he has to perform. in the state of attention, the sensitivity of the operator to other non-dominant (secondary) signals can be lowered. in the further discussion of the problems of micro-sleep, we shall deal with the term "attention" only, because this state seems to be more characteristic for human operators who are able to interact well with artificial systems and control them properly. let us suppose that the level of a human operator's attention can be measured by the use of some real figure of merit lat, expressed by real numbers. a discussion of some ways to express lat and to measure it will be presented later. as is schematically shown in fig. 3, the level of attention lat decreases with the time over which the mental activity of the human subject in the course of his/her interaction with the artificial system is observed. here we can distinguish the following four basic stages:
a) vigilance (full attention), in which the subject is completely competent to control (or use) the system under consideration,
b) relaxation, in which the subject is still competent to deal with the system under consideration, but his/her attention subsequently decreases. this stage can last for a considerable period of time.
c) somnolence, in which the competence of the subject to interact with the system under consideration becomes restricted. this stage can also last for a considerable period of time, but unlike the previous stage, there is now a real danger that the subject will make control faults.
d) hypnagogium, in which the subject falls into a micro-sleep, at first with open eyes, but with very limited ability to control the system under consideration. he/she later falls into a micro-sleep with closed eyes, in which the competence of the subject to control the system is almost zero (some very skilled drivers – namely professional truck drivers – are able in this stage to hold the moving vehicle in a straight line, but they cannot react adequately to any curvature or barrier on the road).
in general, micro-sleep can be defined as follows: micro-sleep is a state of the human organism in which the mental vigilance and attention of the human operator controlling an artificial system decrease below a certain limit. this stage is very dangerous in both its phases. fortunate subjects awake from this stage after a short time (especially from the phase of micro-sleep with closed eyes), usually quite rapidly. in the course of such an awakening the level of the subject's attention increases rapidly, sometimes for a short period to a higher level than before the micro-sleep episode. however, in this short period of time there can be a higher probability of incorrect responses to stimuli, and there is also a possibility of so-called panic reactions.

fig. 4: control functions in a human-system interaction during micro-sleep

micro-sleep episodes can be of various lengths. for each kind of human subject interaction with an artificial system there exist some limits of the maximum acceptable decrease of attention and of the length of micro-sleep, beyond which the given micro-sleep must be considered as dangerous. this discussion should be supplemented by at least the following considerations:
a) the minimum acceptable level lat,min of the mental attention lat of the human organism depends significantly on the demands of a certain application of a human operator – artificial system interaction.
b) micro-sleep episodes can be classified according to their general length in time and according to the depth of the decrease in the lat level (depth of micro-sleep).
the stages belonging to the class of micro-sleep with open eyes are usually a precursor of micro-sleep with closed eyes. in such a state of the organism a certain level of vigilance still exists, but the subject's attention is considerably lowered and his/her reaction time rt is significantly prolonged. the probability of a correct and rapid decision on how to react to stimuli can decrease significantly. this kind of micro-sleep is very dangerous. this stage can last for a long period of time (and may take the form of a light or shallow micro-sleep) in which a given system operator is still able to control the respective system, though often only partially and insufficiently reliably. after some time the light micro-sleep usually develops into a real micro-sleep with closed eyes. this second class of micro-sleep usually shows some similarities to the regular rem phase of real night sleep, but it lasts for a much shorter time. an operator falling into a micro-sleep with closed eyes cannot respond to any change of the system parameters that he/she is responsible for controlling. of course, there are exceptions. as was mentioned above, some professional truck drivers can fall into a special form of micro-sleep with closed eyes in which they are for some time still able to keep a moving truck in the right position on a straight road. some very skilled drivers say that this can last up to several hundreds of seconds. the human control activity is here restricted to a small set of basic control functions, which are produced automatically without any higher feedback from the input receptors (see fig. 4). we can suppose that a driver in the stage of micro-sleep with open eyes can react to some input signals of an acoustic and visual character, though much more slowly and with a greater probability of a wrong reaction.
when he falls into the light form of micro-sleep with closed eyes, he may be able to react to some basic input signals of a mechanical character, such as vibrations, acceleration, deceleration and position (stability). in a very deep micro-sleep with closed eyes he/she cannot react at all. micro-sleep with open eyes can last a considerable time, and the operator's attention in such a situation is still near the limit of the acceptable level. an operator sleeping in such a form of micro-sleep has changed (lowered) other significant parameters (markers) of his attention, and he/she is therefore scarcely able to be in control of any artificial system. this can be a dangerous situation. the human operator's attention is represented in fig. 3 by a scalar figure of merit lat. in maximum simplification, this can be considered as the inverse of the reaction time rt. when the human subject, while operating an artificial system, falls into the stages of relaxation, somnolence and hypnagogium, his/her reaction times lengthen significantly. while in full vigilance the typical values of rt are about 200 ms, in the stage of somnolence rt values in the range from 400 ms to 500 ms appear, and in the stage of hypnagogium the rt values can exceed 800 ms. if the load on the human subject is further prolonged, he/she usually falls into a real micro-sleep with closed eyes and his/her rt increases up to the time of awaking (this may be several minutes). rt values below 200 ms in full vigilance are found only in exceptional cases. the level of danger caused by such prolongation of rt varies according to the kind of human interaction with an artificial system. typically, for car drivers, values of rt above 400 ms represent a distance of about 15 m at a speed of 100 km/h, over which the vehicle runs without any specific control (braking, turning etc.) corresponding to the stimulus (signal) received by the driver's sensors (to this distance we must, of course, add the distance due to technical reasons, e.g., braking time). nevertheless, representing lat by rt seems to be quite easy and physically transparent. however, for a more accurate representation, a higher number nat of parameters xi, i = 1, …, nat, has to be taken into consideration. these parameters are called micro-sleep markers. an exact analysis of them can be quite laborious. for simplicity, in this paper we will deal with a reduced set of micro-sleep markers. our investigations at present are aimed at diagnosing attention decrease and the related reduction in the speed of reaction to an unexpected situation. such attention degradation can also lead to real micro-sleep. both these critical types of states of the brain of an operator can be extremely dangerous and can result not only in huge material and financial damages, but also in loss of human life.
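for illustration only, the reaction-time figures quoted above can be expressed as a trivial staging rule and a blind-travel distance. this sketch is our own; the thresholds follow the approximate values in the text, and the function names are hypothetical.

    def rt_stage(rt_ms):
        # map a measured reaction time (ms) to an approximate attention stage,
        # using the thresholds quoted in the text (about 200 / 400-500 / 800 ms)
        if rt_ms < 400:
            return "vigilance / relaxation"
        if rt_ms < 800:
            return "somnolence"
        return "hypnagogium or micro-sleep"

    def blind_distance_m(rt_ms, speed_kmh=100.0):
        # distance travelled with no stimulus-specific control during the rt
        return speed_kmh / 3.6 * rt_ms / 1000.0

    print(rt_stage(520), round(blind_distance_m(520), 1), "m")   # somnolence, ~14.4 m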
5 methodology of the study
in order to detect a decrease in attention, we need to select a set of significant parameters which can be used for identifying attention decrease and the onset of micro-sleep. such parameters include:
• electro-magnetic activity of the brain,
• frequency of breath,
• frequency of heart beats,
• eye movements,
• skin resistance,
• facial grimaces, etc.
all these parameters have their specific advantages and also disadvantages. we have chosen eeg activity as the dominant significant parameter, because it is probably the only parameter from which almost immediate and reliable information about brain function can be extracted (similar information can of course be obtained from magnetic measurements of brain activity, but this would be technically much more difficult). the analysis of eeg signals is a field of interest for many researchers specializing not only in neurology, but also in electrical control, signal engineering and mathematics (see [8, 9, 12, 14, 15]). eeg measurements can be made quite well in laboratory conditions, and we have achieved very satisfactory results. our preliminary results with eeg measurements in a moving vehicle are also quite promising. there is a real hope that in the not too distant future such eeg measurements can be performed in practical applications in moving vehicles or planes. after investigating the information from many eeg time-series recorded on several dozen human subjects (probands) in our laboratory, we can analyze the process of the decrease in operator vigilance and attention degradation almost immediately, and with a quite acceptable level of reliability. as the basis for this information mining we have used some relations that can be found between the components of an eeg signal. here we must of course take into account that the time-series representing the sampled eeg signal are in principle of a quasi-periodic and quasi-stationary character. therefore, the classical concept of a spectrum is not valid for such time-series. the standard methods for spectral analysis, based on fourier decomposition into a sum of periodical functions, do not give accurate, representative and replicable results. nevertheless, they continue to be widely used for this purpose, though the results are of limited use. to avoid any misunderstanding, we will denote such results below as pseudo-spectra. nevertheless, careful analyses based on the long-time experience of a skilled human expert enable much useful information and knowledge to be mined from the sets of data obtained by such pseudo-spectral analyses. notice: one serious drawback here is caused by the fact that in many neurological laboratories equipped with commercial eeg analyzers no information is made available about the kind of fast fourier transform used in the particular measuring set, because the manufacturer treats this as his intellectual property. the results of such analyses, obtained in various laboratories, are therefore not fully compatible. to avoid this source of uncertainty, we have stimulated the development of a fast pseudo-spectral analysis based on gabor filtration, for which the respective polynomial filtration function of the 50th order was designed. however, there is good reason to hope that other non-spectral approaches to quasi-periodical and quasi-stationary time-series will lead to the development of a more accurate and sharper analytical tool than can be achieved by pseudo-spectral methods.
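as a concrete illustration of what such a pseudo-spectral analysis produces, the following sketch computes short-time band powers of a sampled signal with a windowed fft. it is a deliberately simplified stand-in, not the gabor filtration mentioned above; the band limits, window length and sampling rate are assumptions chosen only for the demonstration.

    import numpy as np

    def band_powers(x, fs, win_s=2.0):
        # short-time "pseudo-spectrum": power per eeg band in consecutive windows
        bands = {"delta": (0.5, 4.0), "alpha": (8.0, 13.0)}   # assumed band limits [hz]
        n = int(win_s * fs)
        freqs = np.fft.rfftfreq(n, 1.0 / fs)
        out = []
        for start in range(0, len(x) - n + 1, n):
            seg = x[start:start + n] * np.hanning(n)          # taper to limit leakage
            spec = np.abs(np.fft.rfft(seg)) ** 2              # one-sided power spectrum
            out.append({name: float(spec[(freqs >= lo) & (freqs < hi)].sum())
                        for name, (lo, hi) in bands.items()})
        return out

    fs = 250                                                  # assumed sampling rate [hz]
    t = np.arange(0, 10, 1.0 / fs)
    x = np.sin(2 * np.pi * 10 * t) + 0.5 * np.random.randn(t.size)  # alpha-like toy signal
    print(band_powers(x, fs)[0])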
6 current experimental results
an example of a promising approach based on chaos theory is shown in fig. 5 (see [20]). here we see the time trajectories of the representative state vector in 3-dimensional state-space, obtained by the delay-time embedding method using the equation

s(i) = [x(i), x(i + l), …, x(i + (m − 1)l)],  s = [s_1, s_2, …, s_(n−m)],   (1)

where l is the time lag, s the state vector, m the state dimension, x the data vector, and n the number of data samples. evidently, the form of such trajectories for different levels of proband attention varies significantly. however, in order to generalize such observations into a hypothesis that can later on be verified and used as a basis for constructing some micro-sleep warning device, we need to have access to many more measurements.

fig. 5: an example of the "chaotic" state-space trajectory of the proband in the stage of full vigilance (a), somnolence (b) and hypnagogium (c). in all cases 3-second samples are shown.
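a minimal sketch of the delay-time embedding of eq. (1) follows; the sample signal and the parameter values m and l are arbitrary illustrative choices of our own.

    import numpy as np

    def delay_embed(x, m, l):
        # state vectors s(i) = [x(i), x(i+l), ..., x(i+(m-1)l)] of eq. (1)
        rows = len(x) - (m - 1) * l
        return np.array([x[i:i + (m - 1) * l + 1:l] for i in range(rows)])

    x = np.sin(np.linspace(0.0, 20.0 * np.pi, 2000))   # toy quasi-periodic signal
    s = delay_embed(x, m=3, l=10)                      # 3-d state space, lag of 10 samples
    print(s.shape)                                     # (1980, 3)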
nevertheless, on the basis of a comparison of the analyses already made of the above-mentioned pseudo-spectral measurements with real-time measurements of the particular proband's reaction time rt and of the correctness of the subject's response to a stimulus, we have found dependences such as those shown schematically in fig. 6. this relation should be taken as an example only, although it is the result of careful measurement, because of the very great individuality of the structure and distribution of eeg signals.

fig. 6: example of a typical relation between two pseudo-spectral components and the reaction time rt

we now have available a set of complete measurements performed on about 30 human subjects. this seems to be enough to formulate the hypothesis that it will be possible to estimate the actual and expected attention level of a given human subject interacting with an artificial system on the basis of knowledge of the pseudo-spectral components and a prediction of their values over a satisfactory prediction horizon. on the basis of such a hypothesis, we attempted to develop a set of micro-sleep classifiers. here the application of artificial neural networks (especially lvq and rbf networks) gave quite interesting results (see e.g. [19]), an example of which is shown in the tables presented in fig. 7. the reliability of such a classification was in this case in the region between 69 % and 89 %.

fig. 7: example of results obtained by various neural network based micro-sleep classifiers. proband "freedom", classifier lvq1, 20 neurons, learning: 100 steps, 2 classes. classification efficiency for the parameter settings dm and hm:
dm = 1: 0.7456 (hm = 12), 0.7265 (hm = 15), 0.6882 (hm = 18), 0.7794 (hm = 21)
dm = 2: 0.8676 (hm = 12), 0.8529 (hm = 15), 0.8397 (hm = 18), 0.8926 (hm = 21)
dm = 3: 0.8824 (hm = 12), 0.8750 (hm = 15), 0.8382 (hm = 18), 0.8397 (hm = 21)
dm = 4: 0.8574 (hm = 12), 0.8706 (hm = 15), 0.8500 (hm = 18), 0.8574 (hm = 21)
example of classification results for one sample of data (freedom, frequency band 3–12 hz; key: 1 = vigilance, 2 = somnolence):
obtained: 2 2 2 2 2 1 1 2 1 2 1 2 1 / correct: 2 2 2 2 2 1 1 2 1 2 1 2 1
obtained: 2 2 2 1 1 1 2 2 2 2 1 1 1 / correct: 1 2 2 1 1 1 2 2 2 1 1 1 1
obtained: 2 2 2 1 2 1 2 2 1 1 2 1 2 / correct: 2 2 2 1 2 1 1 2 1 1 1 1 2
obtained: 2 2 2 1 2 1 1 2 2 2 1 2 1 / correct: 1 2 1 1 1 1 1 2 2 1 1 2 1
obtained: 2 2 2 1 1 2 1 1 2 2 2 2 1 2 2 2 / correct: 2 2 1 1 1 2 1 1 2 2 1 2 1 2 2 2

the level of correspondence between the obtained and correct results for other probands and for the other types of neural network classifiers that we used was slightly better (about 90 %). we can therefore say that this approach seems to be an acceptable starting point for further research in this direction.

7 conclusions
the presently available base of experimental results is too small for the results to be generalized and to verify not only the basic hypothesis, but also the usefulness of these approaches to micro-sleep classification. many more measurements on several hundreds or even thousands of subjects will be necessary. of course, this is beyond the capacity of a single laboratory. an attempt has therefore been made to stimulate cooperation in this research from other laboratories, institutes, universities and possibly companies in the czech republic and abroad. we hope that this research, representing a significant part of the activity of the newly created czech national node for neuroinformatics, can be included in the long-term program of the global neuroinformatic net of the oecd global science forum. in the framework of such a project, a so-called micro-sleep base will be developed and results will be collected. compatible measurements will be made of the eeg activities of many people, supplemented by the necessary auxiliary measurements such as in-time reaction time, correctness of the on-stimulus response, etc. the data will be stored, processed, mined and distributed to researchers. this seems to be the best way to verify the hypothesis and test the usefulness of our approach. our work up to now indicates that, though the dependences between the components of eeg pseudo-spectra and reaction time are very individual and differ for each subject, they do not change within the lifetime of an individual subject (except in the event of some serious illnesses or injuries). this seems to be very significant, because it allows us to put forward the idea that in due course an individually applicable tool for classifying and predicting the attention level of an operator of a specific (namely transportation) system can be developed. this will warn him/her (and also a supervisor) of the imminent possibility of falling into dangerous stages of interaction with the artificial system, and will improve the potential to prevent accidents.

references
[1] nilsson, t., nelson, t. m., carlson, d.: development of fatigue symptoms during simulated driving. accid anal prev., july 1997, vol. 29, no. 4, p. 479–488.
[2] marottoli, r. a., richardson, e. d., stowe, m. h., miller, e. g., brass, l. m., cooney, l. m. jr, tinetti, m. e.: development of a test battery to identify older drivers at risk for self-reported adverse driving events. j. am. geriatr. soc., may 1998, vol. 46, no. 5, p. 562–568.
[3] arnold, p. k., hartley, l. r., corry, a., hochstadt, d., penna, f., feyer, a. m.: hours of work, and perceptions of fatigue among truck drivers. accid anal prev., july 1997, vol. 29, no. 4, p. 471–477.
[4] novák, m.: theory of system tolerances. (czech) praha: academia, 1987.
[5] samel, a., wegmann, h. m., vejvoda, m.: aircrew fatigue in long-haul operations. accid anal prev., july 1997, vol. 29, no. 4, p. 439–452.
[6] samel, a., wegmann, h. h., vejvoda, m., wittiber, k.: stress and fatigue in long distance 2-man cockpit crew. (german) wien med. wochenschr., 1996, vol. 146, no. 13–14, p. 272–276.
[7] taylor, j. g., alavi, f. n.: a global competitive network for attention. neural network world, 1993, vol. 3, no. 5, p. 477–502.
[8] tzyy-ping, j., makeig, s., stensmo, m., sejnowski, t. j.: estimating alertness from the eeg power spectrum. ieee transactions on biomedical engineering, january 1997, vol. 44, no. 1, p. 60–69.
[9] jansen, b. h., bourne, j. r., ward, j. w.: autoregressive estimation of short segment spectra for computerized eeg signals. ieee transactions on biomedical engineering, september 1981, vol. bme-28, no. 9, p. 630–638.
[10] makeig, s., elliot, f., inlow, m., kobus, d.: lapses in alertness: brain-evoked responses to task-irrelevant auditory probes. san diego (ca): naval health research center, 1992, tech. rep., p. 90–93.
[11] fruhstorfer, h., bergstrom, r.: human vigilance and auditory evoked responses. electroencephalogr. clin. neurophysiol., 1969, vol. 27, no. 4, p. 346–355.
[12] makeig, s., inlow, m.: lapses in alertness: coherence of fluctuations in performance and eeg spectrum. electroencephalogr. clin. neurophysiol., 1993, vol. 86, p. 23–35.
[13] matoušek, m., petersén, i.: a method for assessing alertness fluctuations from eeg spectra. electroencephalogr. clin. neurophysiol., 1983, vol. 55, no. 1, p. 108–113.
[14] belyavin, a., wright, n.: changes in electrical activity of the brain with vigilance. electroencephalogr. clin. neurophysiol., 1987, vol. 66, no. 2, p. 137–144.
[15] venturini, r., lytton, w., sejnowski, t.: neural network analysis of event related potentials and electroencephalogram. in: advances in neural information processing systems, eds: j. e. moody, s. j. hanson, r. p. lippmann, san mateo (ca): morgan kaufman, 1992, vol. 4, p. 651–658.
[16] makeig, s., elliott, f., postal, m.: first demonstration of an alertness monitoring/management system. san diego (ca): naval health research center, 1993, tech. rep. no. 93–36.
[17] makeig, s., jung, t.: alertness is a principal component of variance in the eeg spectrum. neurorep., 1995, vol. 7, no. 1, p. 213–216.
[18] cohen, j.: human robots in myth and science. south brunswick and new york: a. s. barnes and company, 1966.
[19] tatarinov, v.: klasifikace bdělosti operátora. (czech, classification of operator vigilance) prague: ctu in prague, faculty of transportation sciences, laboratory of system reliability, 2001, res. rep. no. lss 107/01.
[20] svoboda, p.: metody analýzy eeg aktivity. (czech, methods of eeg activity analysis) diploma thesis, prague: ctu in prague, faculty of electrical engineering, 2001.
[21] svoboda, p.: alternative methods of eeg signal analysis. neural network world, 2002, vol. 12, no. 3, p. 255–260.
[22] faber, j., šrutová, l., pilařová, m.: eeg spectrum as information carrier. sborník lék., 1999, vol. 100, p. 191–204.
[23] novák, m., faber, j., přenosil, v., valach, i.: thalamo-cortical oscillations based prediction of micro-sleeps. neural network world, 1999, vol. 9, no. 6, p. 527–546.
[24] novák, m., votruba, z.: reliability of man-system interaction and theory of neural networks operations. siena (italy): nato workshop limitations and future trends in neural computing, october 22–24, 2001.
[25] novák, m.: artificial systems operation – problems in safety and reliability. rethymno (crete, greece): multiconference cscc, mcp, mcme, july 9, 2001.
prof. dr. mirko novák, corresponding author
phone: +420 224 359 548, +420 602 242 870
fax: +420 224 359 545
e-mail: mirko@fd.cvut.cz
prof. mudr. josef faber
prof. zdeněk votruba
dr. tomáš tichý
ing. petr svoboda
ing. vladimír tatarinov
department of control engineering and telematics
laboratory of system reliability
czech technical university in prague
faculty of transportation sciences
konviktská 20
110 00 prague 1, czech republic

col. prof. václav přenosil
military academy, brno
kounicova 65
612 00 brno, czech republic

new phase shifting algorithms insensitive to linear phase shift errors
j. novák

abstract
this article describes and analyses multistep algorithms for evaluating the wave field phase in interferometric measurements using the phase shifting technique. new phase shifting algorithms are proposed, with a constant but arbitrary phase shift between the captured frames of the intensity of the interference field. the phase evaluation process then does not depend on linear phase shift errors. a big advantage of the described algorithms is their ability to determine the phase shift value at every point of the detector plane. a detailed analysis of these algorithms with respect to the main factors that affect interferometric measurements is then carried out. the dependency of these algorithms on the phase shift values is also studied, and several phase calculation algorithms are proposed and compared with respect to the resulting phase errors.

keywords: noncontact deformation measurement, phase calculation algorithms, error analysis.

1 introduction
the problems of techniques for the automatic evaluation of interferometric measurements are much studied nowadays, because of the rapid development of digital interferometric techniques and their applications in industrial practice [1–10]. depending on the specific character of the measurement problem to be solved, various phase evaluation methods can be used [11–19]. some of these methods use only one intensity frame, i.e. an interferogram, to determine the phase values, while other methods need to record several interferograms to calculate the phase unambiguously. one of the most popular and most accurate methods for phase evaluation is the phase shifting technique [2, 4, 16–19]. the phase shifting technique was first applied for the analysis of interferometric measurements in the field of classical two-beam interferometry, and nowadays it is the most widely used technique for evaluating interference fields in many areas of science and engineering. the method is based on evaluating phase values from several phase modulated measurements of the intensity of the interference field. it is necessary to carry out at least three phase shifted intensity measurements in order to make an unambiguous and very accurate determination of the phase at every point of the detector plane. the phase shifting technique offers fully automatic calculation of the phase difference between two coherent wave fields that interfere. there exist many phase shifting algorithms for phase calculation that differ in the number of phase steps, in the phase shift values between captured intensity frames, and in their sensitivity to factors that affect interferometric measurements [2, 16, 18, 20, 21]. the following text describes newly proposed algorithms that are insensitive to phase shift miscalibration, and a complex error analysis of these phase calculation algorithms is performed.

2 phase shifting technique for analysis of interferometric measurements
the general principle of most interferometric measurements is as follows. two light beams (reference and object) interfere after an interaction of the object beam with the measured object, i.e. the beam is transmitted or reflected by the object. the distribution of the intensity of the interference field is then detected, e.g. using a photographic plate, ccd camera, etc. the phase difference between the reference and the object beam, which is related to the measured quantity, e.g. displacement, can be determined using various phase calculation techniques. the phase shifting technique is based on an evaluation of the phase of the interference signal using phase modulation of this interference signal.
the intensity distribution i of the detected phase modulated interference signal at some point (x, y) of the detector plane is given by [2, 4, 22]

i_i(x, y) = a(x, y) + b(x, y) cos[Δφ(x, y) + α_i],  i = 1, …, n,  n ≥ 3,   (1)

where a is the function of the background intensity, b is the function of the amplitude modulation, Δφ is the phase difference of the wave fields that interfere, and α_i is the phase shift of the i-th intensity measurement. to determine the phase values Δφ it is necessary to solve the nonlinear equations (1) for every point (x, y) of the detected interference pattern. if the phase shift values α_i are properly chosen in advance, there remain only three unknowns a, b and Δφ that must be determined from the recorded intensity values [2, 4, 11]. we must carry out at least three (n ≥ 3) intensity measurements with known phase shift values α_i to calculate the phase Δφ.

3 new phase shifting algorithms
we will now show several new multistep phase shifting algorithms that are insensitive to linear phase shift errors. we will consider a constant but unknown phase shift value α between the recorded images of the intensity of the observed interference field. then it is necessary to capture at least four interferograms, because the interference equation (1) consists of four unknowns a, b, α and Δφ. from the above-mentioned assumption we can derive phase calculation algorithms that are insensitive to linear phase shift errors. the very well known carré algorithm is a phase shifting algorithm of this type [8, 11, 23]. for the intensity at every point of the recorded interferogram we can write

i_k = a + b cos[Δφ + (2k − n − 1)α/2],  k = 1, …, n,   (2)

where α is the constant phase shift between the captured intensity frames, a is the mean intensity, and b is the modulation of the interference signal.
then according to the sign of the expressions we can determine the wrapped phase values � ��� � �w � , . a big advantage of this type of phase calculation algorithm with unknown value of the phase shift is that we can calculate the phase shift at all points (x,y) of the detector in order to control the distribution of the phase shift over the interferogram. these phase evaluation algorithms are not 52 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 42 no. 4/2002 agorithm expressions for calculation of phase values �� a1 � �� � � � tan �� � � � � � i i i i i i i i i i i i 1 4 2 3 1 4 2 3 2 3 1 4 3 3 a a2 � � � � � � tan �� � 4 2 2 4 2 1 5 2 3 1 5 i i i i i i i a a3 � � � � � � tan �� � � � � � i i i i i i i i i i i i 1 5 2 4 2 4 3 2 1 5 2 4 3 2 2 2 4 2 2 2 2 a a4 � � � � � �� � � � � � tan �� � � � � 4 2 5 2 3 4 1 6 2 3 4 1 6 i i i i i i i i i i a5 � � � � � � � � � � � � tan �� � � � 2 2 2 5 3 4 1 6 2 5 3 4 1 6 3 4 1i i i i i i i i i i i i i i i � � � � � � � i i i i i i i 6 3 4 1 6 2 52 a6 � � � � � � � � � �� � � � tan �� � � i i i i i i i i i i i i i 3 5 4 2 6 2 6 2 3 5 1 7 2 2 62 4 a7 � � � � � �� � � � tan �� � i i i i i i i i i 3 5 3 5 1 7 4 2 6 3 2 table 1 algorithm expressions � sin�� expressions � cos�� a1 � �i i2 3 � � � �� �i i i i2 3 1 4� � a2 � �i i2 4 � �2 3 1 5i i i a3 � �i i i i1 5 2 42 2 � � �i i i2 4 32� a4 � � � �� �i i i i3 4 1 6 � � � � �� �i i i i3 4 1 6� � a5 � � � �� �i i i i3 4 1 6 � �� �i i i i i i3 4 1 6 2 52� � � � a6 � � � �� �i i i i3 5 1 7 � � �� �i i i2 6 42� a7 � � � �� �3 3 5 1 7i i i i � �� �2 4 2 6i i i � table 2 affected by miscalibration of the phase shift device and they can be used if the phase shift is nonuniformly distributed over the area of the detector. 4 analysis of phase shifting algorithms the overall accuracy of interferometric measuring techniques is determined by systematic and random errors during the measurement process. there are many potential factors that may influence the measurement accuracy. for analysis of the influence of the most important factors that have a negative effect on the accuracy of interferometric measurement techniques, a mathematical model has been proposed that enables the accuracy and stability of phase shifting algorithms to be analysed with respect to chosen parameters of the influencing factors [18]. this relationship has been studied because the stability and accuracy of the described phase evaluation algorithms with unknown phase shifts also depends on the phase shift value itself. performing the computer analysis, the optimal phase shift values were obtained for all algorithms. the given phase shifting algorithms have the lowest phase errors for these values. the accuracy and sensitivity of the phase shifting algorithms was evaluated on the basis of the mean phase error � �� �� . the mean phase error is given by � � � � � � � � � � � � � k k m m 1 , (3) where �(��)k is k-th simulation of the phase error �(��) for given measurement conditions, and m is the number of simulation cycles. the modelled function � �� �� that is shown in fig. 1 expresses the obtained accuracy for arbitrary phase values �� and the chosen phase shift value �. the optimal phase shift value �opt is found, where the phase error function � �� �� reaches its minimum. these values were obtained from the simulated relationship using a polynomial approximation near the minima of these functions. the resulting phase shift values �opt are shown in table 3. fig. 
1 shows clearly that algorithms a1, a2, a3 and a7 have a relatively wide range of phase shift values, where the phase error is approximately the same as the optimal value. therefore these phase-shifting algorithms are not too sensitive to a change in the optimal phase shift value. with calculated optimal phase shift values for every algorithm a complex error analysis was carried out. the influence of errors of the phase shifting device and the detector were simulated using an appropriate model of these errors [18]. figure 2 shows the relationship between the calculated phase error �(��) and the phase values �� from the range (��, �), which enables a comparison of the accuracy and stability of phase calculation using a particular phase-shifting algorithm. © czech technical university publishing house http://ctn.cvut.cz/ap/ 53 acta polytechnica vol. 42 no. 4/2002 [rad] [r a d ] fig. 1: optimal phase shift values algorithm number of steps n phase shift �opt [rad] phase shift �opt [ ° ] a1 4 0.61 � 110 a2 5 0.50 � 90 a3 5 0.67 � 121 a4 6 0.38 � 69 a5 6 0.54 � 97 a6 7 0.34 � 61 a7 7 0.50 � 90 table 3 further, a study was made of the properties of the phase shifting algorithms with respect to the change of parameters simulating the real nonlinearities of the phase shifting device and detector. with respect to the insensitivity of the phase measuring algorithms to linear phase shift errors, we studied only the dependency of the phase errors on the second order errors of the devices. the dependency of the phase error �(��) of the phase evaluation on the nonlinear behaviour of the phase shifting device is shown in figure 3 for all compared algorithms. figure 4 demonstrates the dependency of the phase error on the nonlinear response of the detector of the intensity. 54 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 42 no. 4/2002 [rad] [r a d ] fig. 2: relationship between phase error �(��) and phase values �� fig. 3: effect of nonlinear phase shift errors finally, the computing time for phase shifting algorithms a1–a7 was determined. the relative computing time was calculated as the ratio between the computing time for a given algorithm and the minimum computing time of all the compared algorithms (see table 4). on the basis of the error analysis we can conclude that the most accurate algorithms are the five-step algorithm a2, the six-step algorithm a4, and the seven-step algorithm a7. however, the six-step algorithm a5 is the least accurate and its computing time is the longest. the computing time depends on the number of mathematical operations. it can be seen that the computing time does not depend directly on the increasing number of steps n. for example, the computing time of algorithms a2, a4 and a7 is practically the same. algorithms a2 and a7 seem to be particularly accurate, having stable properties for a wide range of phase step values. 5 conclusion this paper deals with algorithms for phase calculation in interferometric measurement methods using the phase shifting technique. it describes new multistep phase shifting algorithms with constant but unknown phase steps between the intensity frames. these phase evaluation algorithms are not sensitive to phase shift miscalibration and enable calculation of the actual phase shift value at any point of the detector plane during the measurement process. 
with these algorithms, the phase shifting device can be calibrated, and the measurement can be carried out even if the phase shift is not uniformly distributed over the detector area. phase calculation algorithms with unknown phase shifts can be applied to any measurement that uses the phase shifting technique for the measurement evaluation. the accuracy and stability of all described algorithms with respect to the initial phase shift values were studied, and optimal phase shift values for the particular algorithms were determined in order to ensure the lowest possible phase errors of the phase calculation algorithms. on the basis of the proposed model, an analysis was performed of the influence of the most important parameters that have a negative effect on the overall accuracy and stability of phase calculation algorithms, and these algorithms were compared.

[fig. 4: effect of nonlinear response of detector]

table 4:
algorithm | relative computing time [%]
a1 | 108
a2 | 100
a3 | 140
a4 | 101
a5 | 146
a6 | 130
a7 | 102

acknowledgement
this work was supported by grant no. 103/02/0357 of the grant agency of the czech republic.

references
[1] cloud, g.: optical methods of engineering analysis. cambridge: cambridge univ. press, 1998.
[2] malacara, d.: optical shop testing. new york: john wiley & sons, 1992.
[3] rastogi, p. k.: digital speckle pattern interferometry and related techniques. new york: john wiley & sons, 2001.
[4] kreis, t.: holographic interferometry: principles and methods. berlin: akademie verlag, 1996.
[5] rastogi, p. k.: handbook of optical metrology. boston: artech house publishing, 1997.
[6] jacquot, p., fournier, j. m. (ed.): interferometry in speckle light: theory and applications. berlin: springer verlag, 2000.
[7] osten, w., jüptner, w. (ed.): fringe 2001: the 4th international workshop on automatic processing of fringe patterns. paris: elsevier, 2001.
[8] osten, w., jüptner, w., kujawinska, m. (ed.): optical measurement systems for industrial inspection ii. spie proceedings, vol. 4398, washington: spie, 2001.
[9] kobayashi, a. s. (ed.): handbook of experimental mechanics. new jersey: prentice hall, 1987.
[10] rastogi, p. k., inaudi, d.: trends in optical non-destructive testing and inspection. amsterdam: elsevier, 2000.
[11] malacara, d., servin, m., malacara, z.: interferogram analysis for optical testing. new york: marcel dekker, 1998.
[12] yatagai, t.: automatic fringe analysis using digital image processing techniques. optical engineering, vol. 21, 1982, p. 432.
[13] novak, j.: fringe tracing technique in the process of optical testing. physical and material engineering 2002, prague: ctu, 2002.
[14] takeda, m., ina, h., kobayashi, s.: fourier-transform method of fringe-pattern analysis for computer-based topography and interferometry. j. opt. soc. am., vol. 72, no. 1, 1982, p. 156.
[15] kujawinska, m., wójciak, m.: high accuracy fourier transform fringe pattern analysis. optics and lasers in engineering, vol. 14, 1991, p. 325–339.
[16] creath, k.: phase-measurement interferometry techniques. progress in optics, vol. xxvi, amsterdam: elsevier science, 1988.
[17] robinson, d. w., reid, g. t.: interferogram analysis: digital fringe pattern measurement techniques. bristol: institute of physics publishing, 1993.
[18] novak, j.: computer simulation of phase evaluation process with phase shifting technique. physical and material engineering, prague: ctu, 2002, p. 87–88.
[19] miks, a., novak, j.: application of multi-step algorithms for deformation measurement. spie proceedings, vol. 4398, washington: spie, 2001, p. 280–288.
[20] novak, j.: error analysis of three-frame algorithms for evaluation of deformations. interferometry of speckle light: theory and applications, berlin: springer verlag, 2000, p. 439–444.
[21] novak, j.: error analysis for deformation measurement with electro-optic holography. fine mechanics and optics, vol. 6, 2000, p. 166.
[22] miks, a.: applied optics 10 (in czech). prague: ctu publishing house, 2000.
[23] carré, p.: installation et utilisation du comparateur photoélectrique et interférentiel du bureau international des poids et mesures. metrologia, vol. 2, 1966, p. 13–23.
[24] ghiglia, d. c., pritt, m. d.: two-dimensional phase unwrapping: theory, algorithms and software. new york: john wiley & sons, 1998.
[25] novak, j.: methods for 2-d phase unwrapping. in matlab 2001 proceedings, prague: všcht publishing house, 2001.
[26] gutmann, b., weber, h.: phase unwrapping with the branch-cut method: clustering of discontinuity sources and reverse simulated annealing. applied optics, vol. 38, no. 26, 1999, p. 5577.

ing. jiří novák, ph.d.
phone: +420 224 354 435
fax: +420 233 333 226
e-mail: novakji@fsv.cvut.cz
department of physics
czech technical university in prague
faculty of civil engineering
thákurova 7
166 29 prague 6, czech republic

a coding and on-line transmitting system
v. zagursky, i. zarumba, a. riekstinsh
a distributed data acquisition system is proposed. it provides parallel and simultaneous coding and on-line transmission of signals. this system has a higher accuracy of measurements and a higher performance in sampling and transmission than known analogs.

1 introduction
processes may be of different nature: for instance, we may be interested in brain biopotentials, heart work or ic features. to obtain a full picture, it is necessary to process tens and hundreds of thousands of measurements, so a special data acquisition system is needed for analog-to-digital conversion (adc). our system is a distributed system in which the transmitter and the receiver are separated by up to one hundred metres. such systems are described, for example, in [1], [2], [3]. in particular, the proposed system seems to have the following advantages over the system described in [1]:
• it can simultaneously process 64 channels, which increases the precision of the measurements from the point of view of discretization, and the number of channels may be increased;
• its performance is superior thanks to the parallel operation;
• it has a simpler construction than the system in [1], which makes it cheaper and more fault-tolerant; biological research has a high demand for tolerance;
• it has high accuracy and noise immunity.

2 system description
the system consists of a transmitter, a receiver, and a communication line. the transmitter (see fig. 1) has 64 identical voltage-to-time-interval converters. the control logic simultaneously enters the control pulse sample into all 64 input channels, which fixes the values obtained from the sensors. after the first channel is selected by the analog multiplexer, a dual slope conversion process takes place and the comparator output gives the time interval t1. the dual slope converter is therefore a voltage-to-time (v/t) converter. the v/t converter offers some advantages. the conversion accuracy is independent of the clock period and of the integrating capacitance; the theoretical accuracy depends only on the absolute value of the reference and on the clock stability. even changes in other components, such as the comparator input offset voltage, have no effect, provided that they are not changed during conversion.
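the dual-slope principle can be modelled in a few lines. the sketch below is a minimal illustration under assumed component values (the reference voltage, integration time and clock rate are not specified in the paper); only the 16384-count limit of the receiver counter comes from the text.

```python
# minimal model of a dual-slope voltage-to-time conversion.
# v_ref, t_int and f_clk are assumed, illustrative values.

def dual_slope_t1(v_in, v_ref=2.5, t_int=100e-6):
    """run-up for a fixed t_int at slope v_in/rc, run-down at slope
    v_ref/rc; the run-down time t1 is proportional to v_in."""
    return t_int * v_in / v_ref          # the rc constant cancels out

def quantize(t1, f_clk=80e6, n_max=16384):
    """count clock pulses during t1 (the receiver-side counter is
    limited to 16384 pulses, i.e. 14 bits, see below)."""
    return min(int(t1 * f_clk), n_max - 1)

for v in (0.1, 1.0, 2.4):
    t1 = dual_slope_t1(v)
    print(f"v_in = {v:4.2f} v -> t1 = {t1*1e6:6.2f} us -> code = {quantize(t1)}")
```

the point of the model is the independence of t1 from the integrator's rc product: only the ratio v_in/v_ref and the fixed run-up time enter the result, which is exactly the robustness argument made above.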
the digital circuits substitute the time interval t1 by two short pulses, called start and stop, which define the starting and finishing moments of the time interval t1. the time interval t1 is delayed by 40 ns in order to obtain a blank between the 64 time intervals passing sequentially through the cable. the serial set of start and stop signals is fed to the control circuits of the laser, which allows the transmission of very short infra-red light pulses of 20 ns. the format of the transmitted information is shown in fig. 2. the system usually uses a cable of up to 200 m in length, which is possible owing to the low noise level.

[fig. 1: scheme diagram of the transmitter]

the timing diagram for the simultaneous mode is shown in fig. 2. the simultaneous mode is selected by connecting the sample/hold pin to the logic high signal sample. this mode should be used when all 64 channels are to be sampled at the same instant in time. it keeps the sample-and-hold amplifiers on the sampled channels in the hold mode until the signal, called channel select, comes from the control unit (see fig. 1). this incoming pulse initiates the dual-slope conversion that results in a time interval. the time interval is then divided into two short pulses with a pulse length of about 20 ns; these pulses mark the beginning and the end of each time interval. the prioritizing logic internal to the control unit (see fig. 1) is used to ensure that the channels are transmitted (see the transmission format in fig. 2) in a predetermined sequence. synchronizing with the channel select signal (the rising edge of channel select) results in the time interval of channel 1 being transmitted before channel 2, and channel 63 before channel 64. the incoming light pulses (fig. 3), start and stop, are converted to a ttl compatible signal by a fast p-i-n photodiode, an integrated transimpedance amplifier and a fast comparator. the pulse synchronization device works as follows. the multiphase generator incessantly forms rectangular signals on its own pins. these signals are shifted relative to each other in phase by the value t0/n, which determines the synchronization accuracy. first, at the moment of synchronization, the incoming start signals (logic high) pass to the d-trigger inputs. these triggers store the signal levels of the outputs of the multiphase generator. the start signal reaches the first input of the and gate. it also reaches the second input of the gate, but with a constant delay. the delay value equals 15 ns (the settling time of the d-triggers into a stable state). as a result, when the output signal coincides with any output signal of the multiphase generator, the transmission of the truncated pulse will stop. the advantage of using this device is that we increase the data accuracy and conversion speed.

[fig. 2: timing diagram of the sampling period]
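the serial format can be illustrated with a small scheduling sketch that turns 64 conversion results into an ordered start/stop pulse train; the 40 ns inter-channel blank is taken from the text, while the conversion times themselves are hypothetical.

```python
# sketch of the serial start/stop pulse train for 64 channels.
# the t1 values are hypothetical; the 40 ns blank between channels
# is taken from the description above.

BLANK = 40e-9   # s, delay inserted between consecutive channels

def pulse_train(t1_list):
    """return (time, kind, channel) events: one start and one stop
    pulse per channel, transmitted in channel order 1..64."""
    events, t = [], 0.0
    for ch, t1 in enumerate(t1_list, start=1):
        events.append((t, "start", ch))
        events.append((t + t1, "stop", ch))
        t += t1 + BLANK                 # blank keeps intervals apart
    return events

t1s = [(ch % 8 + 1) * 2e-6 for ch in range(64)]   # hypothetical t1 values
for time, kind, ch in pulse_train(t1s)[:6]:
    print(f"{time*1e6:9.3f} us  {kind:5s} ch{ch:02d}")
```

because each channel's information is carried purely by the spacing of its start and stop pulses, the receiver needs no amplitude reference, which is consistent with the low-noise claim made for the optical link.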
[fig. 3: block-scheme of the receiver]

in the receiver (fig. 3), after the start pulse is obtained, pulses from the clock oscillator begin passing to the counter. the entrance of these counting pulses stops when the stop signal is given. as a result, the counter holds the number of pulses entered into the summing input, which is limited by the given maximum number of clock pulses – 16384. the 14-bit information held in the counter is proportional to the time interval; in other words, this is the output code. the parallel data is obtained and rewritten to the parallel register. the register shown in fig. 3 makes a direct connection to as many microprocessor buses as possible. the conversion results, like any i/o, may be interfaced to microprocessors by various methods. these include direct memory access (dma), isolated or accumulator i/o, and memory-mapped i/o. we chose dma – the fastest of these methods, since the conversions occur automatically and the data updates into memory are transparent to the processor. the parallel code in the register is read by the dma board (metra byte pdma-32) and stored in the memory of a personal computer (ibm compatible).

3 hardware
the proposed set of microschemes is one possible construction variant for developing the system, which will be made in asic technology based on low power logic technology. in general, the transmitter block consists of 16 similar sections with 4 channels in each. a small block in the receiver is the pulse synchronizer, which passes only a full period of pulses. let us describe some circuits in greater detail. the ad684 is a monolithic quad sample-and-hold amplifier (sha); it is ideal for a high performance, multichannel data acquisition system. each sha channel includes an internal hold capacitor. the ad684 is well suited for high speed 12- and 14-bit adcs. the adg201a device comprises four independently selectable switches; these switches also feature a high switching speed. the max-934 has four independent channel comparators. the max-934 micropower, low voltage comparators, together with an on-board 2% accurate reference, feature the lowest power consumption available. the mxd1000 silicon delay line provides 5 ns to 500 ns delays; the circuits are designed to reproduce leading and trailing edges with equal precision. the max-306 is a precision, monolithic, cmos analog multiplexer. it is a single-ended 1-from-16 device. by joining four max-306 in one block we can construct a 1-from-64 multiplexer. this is needed for saving the number of channels in the transmission phase.
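the receiver side can be summarized in a few lines: the counter value accumulated between start and stop is the 14-bit output code, which the pc software can map back to the input voltage. the scaling constants below (clock rate, full-scale voltage) are assumptions for illustration; only the 16384-count limit comes from the text.

```python
# decoding a received start/stop interval into a 14-bit code and
# back to a voltage estimate. f_clk and v_fs are assumed values;
# the 16384-pulse limit is from the description above.

F_CLK = 80e6          # hz, hypothetical counter clock
N_MAX = 16384         # 2**14 counts, from the text
V_FS  = 2.5           # v, hypothetical full-scale input

def count_pulses(t_start, t_stop):
    """number of clock pulses gated between start and stop."""
    return min(int((t_stop - t_start) * F_CLK), N_MAX - 1)

def code_to_voltage(code):
    """linear map of the 14-bit code back to the input voltage."""
    return V_FS * code / (N_MAX - 1)

code = count_pulses(0.0, 120e-6)
print(f"code = {code}, reconstructed v_in ~ {code_to_voltage(code):.3f} v")
```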
4 results
the system can be used to evaluate the clinical features of body surface map measurements, blood pressure, ecg, etc. the most significant feature of the device is the possibility to increase the number of channels (up to 128) without any basic restructuring. in [1], an increasing number of channels leads to a proportional increase of the acquisition time. in the proposed system, this time can be saved in practice, regardless of the number of channels. in [1], the transmission time depends on the conversion time for each of the multiplexed channels, but in the proposed system it is determined only by the transmission time of the information that has been converted. the full acquisition cycle for all 64 channels is 194.52 µs, which is five times less than the cycle in [1].

5 conclusion
a parallel 64-channel distributed measurement system which may be used in multichannel bioelectric measurements is presented. the use of a 14-bit high speed digital transmission format provided data acquisition with a very low level of noise. the proposed system has the useful feature that, by using different kinds of sensors and software, it is possible to obtain some important human parameters. for instance, it may be possible to diagnose the vibration characteristics of a ventricle wall caused by a dysfunction of the heart muscle. with the use of 64 sensors we can obtain a full picture covering a certain area, due to the parallel collection. the selected pc-based approach and other approaches allow the greatest flexibility in developing software procedures tailored for specific applications. it has been concluded that the data acquisition system based on the proposed conversion method and kind of transfer may be useful for research on human health. the experimental method proposed here may be applied in clinical use as a new evaluation tool that will enable data to be collected by 64 channels simultaneously per 194.52 µs with 14-bit accuracy.

references
[1] alexander, c.: "patient isolation in multichannel bioelectric recordings by digital transmission through a single optical fiber." ieee transactions on biomedical engineering, vol. 40, no. 3, march 1993.
[2] lahti, j., riiha, a., lappetelanen, m.: "a programmable dsp asic for heart rate measurement application". proceedings of the 21st european solid-state circuits conference, esscirc '95, lille, france, september 19–21, 1995, p. 158–161.
[3] franchi, d., palagi, g., bedini, r.: "a pc-based generator of surface ecg potentials for computer electrocardiograph testing." computers and biomedical research, vol. 27, (1994), no. 1, p. 68–80.

v. zagursky
phone: +371 755 8448
fax: +371 755 5337
e-mail: zagursky@edi.lv, aigars@egle.cs.rtu.lv
i. zarumba
institute of electronics and computer science of latvian university
riga city council dep. of road control
a. riekstinsh
institute of automation and computer engineering of riga technical university
14 dzerbenes str.
riga, lv-1006, latvia

acta polytechnica 61(3):476–488, 2021
https://doi.org/10.14311/ap.2021.61.0476
© 2021 the author(s). licensed under a cc-by 4.0 licence. published by the czech technical university in prague.

the impact of the air temperature on measuring the zenith angle during the year in the ground layer of the atmosphere for the needs of engineering surveying
tomáš suk∗, martin štroner
czech technical university in prague, faculty of civil engineering, department of special geodesy, thákurova 7, 160 00 prague, czech republic
∗ corresponding author: tomas.suk@fsv.cvut.cz

abstract. this paper presents the results of a more than year-long experiment dealing with temperature measurements used to calculate the theoretical effect of the atmosphere on the measured zenith angle in engineering surveying. the measurements were performed to determine the accurate and specific temperatures (temperature gradients) which can be recorded in different seasons in the low level of the atmosphere (up to 2 m above the ground, where most engineering surveying measurements take place) for the geographical area of central europe, specifically the czech republic. a numerical model was then applied to the resulting determined temperature gradients to calculate the path of a beam passing through an inhomogeneous atmosphere. from these values, the apparent vertical shifts caused by refraction in a given environment and time were finally determined.
keywords: refraction, vertical temperature gradient, zenith angle, beam path, vertical shifts.

1. introduction
due to technical progress, geodetic instruments are constantly evolving and improving. for total stations, a standard deviation of the rangefinder of 1 mm (per 100 m) and an accuracy of angles in units of tenths of a milligon are no exception (as shown, for example, in [1–4]). thanks to this, it is theoretically possible to carry out measurements with an accuracy of millimetres or, in special applications, even tenths of a millimetre. however, the condition for achieving these accuracies is not only the quality of the instrument used or, in the case of adjustment, the number of redundant measurements, but also the suppression or elimination of errors caused by the surrounding environment. one of the environmental influences, which has been studied in geodesy for a long time, is atmospheric refraction – a set of atmospheric influences that negatively affects the measurement results. in geodesy, according to [5], refraction can be divided into two basic groups according to the position of the observed target. on the one hand, astronomical refraction examines the influence of the atmosphere in the observation of celestial objects; it usually works with the assumption of a sight line close to the vertical. geodetic refraction, on the other hand, describes the influences acting on ground targets (in common geodetic practice, on the earth's surface). geodetic refraction can then be divided according to the direction in which it affects a measurement; so we speak about vertical or horizontal refraction. in the following text, we will deal with the vertical refraction in a specific application of engineering surveying (es) in the low level of the atmosphere (up to 2 m above the ground) affecting the zenith angle. we will not pay further attention to other types here. attempts to suppress or eliminate the influence of refraction date back a very long time, and many theories and experiments have been made on this topic over the past century. some older works, focused on trigonometric measurements in mountain areas taking into account the effect of refraction, were carried out in the 60s and 70s by mr. hradílek (for example, [6] and [7]). this research is still relevant, as evidenced, for example, by publication [8]. another interesting article describes the effect of refraction on geodetic measurements in another medium – water [9]. in principle, this effort can be divided into several basic variants. the first variant is, of course, the use of direct geodetic measurements to detect refraction. this is most often a counter-measurement of zenith angles or a calculation of k, as a variable, within a given trigonometric network [10]. an interesting and innovative way of measuring opposite zenith angles is described, for example, in [11]. another way is to apply atmospheric models to the measured data. one of the most famous is the temperature equation of kukkamaki [12], which was used for precise levelling; there are several parameter adjustments for different areas and uses [13]. currently, it is very common to determine the computer model by the reverse method, where theoretical relationships (polynomial and other functions) between the measured quantities defining the state of the atmosphere and the measured geodetic data are derived from the measured data [14, 15].
in this case, we encounter the problem that the derived relationships are often only theoretical and cannot be applied, in general, across seasons and parts of the world. another approach is to determine the current turbulent parameters using the well-known monin–obukhov similarity theory (described in [16] and [17]). to calculate the refractive index, it is important to determine the turbulent parameters c and l, for which it is possible to use scintillometry and/or the image dancing method [18–20]. the monin–obukhov method has its limits in the assumption of a uniform stratification of the atmosphere (although there are modifications [18]) and especially in the use of the coefficient k or of a temperature gradient describing the whole path of the beam, or the whole set of measurements. at present, this method is already considered insufficient. the list of possibilities is completed by the technologically very complex dispersion method (described in [19, 20]), which is based on the transmission of a pair of beams of different wavelengths (most often ir and blue), which bend slightly differently as they pass through the atmosphere. it is, therefore, possible to measure the difference between the points (angles) of impact. from this, it is possible to derive the refractive coefficient or the effect of the refraction as such. this method is difficult due to the necessity of measuring the very small difference between the points (angles) of impact of the individual colours with a sufficient accuracy [21]. the path used in this article differs from the methods described. by monitoring the physical properties of the atmosphere (specifically its temperature and temperature gradient), the current state is described, for which a mathematical simulation of the beam's passage through an inhomogeneous medium is performed (a measurement simulation) using the numerical model. the basis of this calculation is the knowledge of the refractive index of air and its gradient. the refractive index depends mainly on temperature and pressure. at short distances (as in es), the pressure practically does not change (it does not affect the gradient). the main influence is the temperature gradient, which is determined by a system of temperature sensors placed on a vertical structure designed for experimental purposes. in the past, this approach has already been attempted, but with significantly worse technical equipment (e.g. sirůčková [22] and others) and especially with the effort to introduce corrections to actual measurements. in our case, the aim is to know the properties and laws of the phenomenon in the ground layer of the atmosphere and to make basic recommendations for practical measurements. for the purpose of estimating the real state of the atmosphere, an extensive series of all-day experiments was measured for over a year with an interval of about 15 to 20 days. over the data set from each day, several simulations were done (described in more detail below). the result of the simulation is a theoretical estimate of the effect of the atmospheric refraction on the zenith angle (observed point height) for a given situation.
2. equipment and experiment
the first part of this chapter provides a basic overview of the apparatus and equipment used to carry out the temperature (temperature gradient) measurement, followed by a simple analysis of the accuracy of the whole system. the second subchapter describes the experiment and the basic output measurements. in the last part, the used numerical model and the main partial calculations are explained and formulated.

2.1. equipment
for the implementation of the temperature measurement experiments, a measuring apparatus was designed and assembled, consisting of a set of resistance temperature detectors (rtds), a data logger and a shielding and supporting structure.

2.1.1. resistance temperature detector
the determination of temperature by rtds uses the known physical principle describing the dependence of the change in the resistance of a conductor on the change in its temperature. in this case, the opposite view is used: a change of the resistance is observed and the change in temperature is derived from it. tr097c sensors manufactured by sensit s.r.o. were used for most of the measurements (for some measurements, the tg8-40 sensor, which is a similar type with the same accuracy, was also used). the tr097c is equipped with a calibrated platinum sensor (pt1000 / 3850, class a). the detector technically consists of a lacquer-encapsulated sensor to increase the physical resistance, a supply cable in a protective insulating silicone sleeve and an insulated connector to the data logger. the disadvantage of this sensor is a lower physical resistance and a lower resistance to moisture and other weather conditions. the permitted deviation ∆ts for class a platinum sensors can be determined as

∆ts = (0.15 + 0.002 · |t|), (1)

where |t| is the temperature in absolute value. in the range of the measured values (approx. −20 to 40 °c), the permitted deviation ∆ts is approx. 0.15 to 0.25 °c [23, 24].

2.1.2. datalogger
the data logger s0141 (fig. 1, manufactured by comet system s.r.o.) enables the temperature calculation (the transition from the resistance to the temperature measurement) and the subsequent registration. the logger is equipped with four inputs for sensors, a smaller black and white display and an input for connecting a communication cable to a pc (in the form of usb, com or wifi). the control and settings are changed via the computer program comet vision provided by the manufacturer.

[figure 1: datalogger s0141. figure 2: construction – layout]

the data logger has a defined measuring temperature range from −90 °c to 260 °c for the pt1000 sensor used. for the temperatures measured during the experiment, the margin of the permitted deviation ∆tl equals approximately 0.2 °c [24].

2.1.3. shielding and supporting structure
for the purposes of shading out sunlight and preventing heating, the sensors were installed in a wooden structure, which also serves as the supporting element of the apparatus. at the same time, the construction prevents the thermal radiation of other (more thermally conductive) objects. for this purpose, a simple l-shaped cover system (see fig. 2) has been designed with regularly drilled holes (to allow air to pass through), which are tilted downwards.

2.1.4. analysis of the accuracy of temperature measurement of the whole system
from the above permitted deviations ∆tl and ∆ts, the standard deviations of the logger σl and of the sensor σs can be calculated using the law of standard deviation transmission:

σs = ∆ts/2 ≈ 0.10 °c to 0.13 °c, σl = ∆tl/2 = 0.10 °c, (2)
where ∆tl and ∆ts are the permitted deviations in the temperature range for the logger and the sensor. subsequently, the accuracy of the determined temperature σt can be derived as

σt = √(σs² + σl²) = 0.15 °c. (3)

the resulting accuracy of the temperature determination is thus σt = 0.14 to 0.16 °c in the given temperature range. in further analyses, we will consider σt = 0.15 °c, which also corresponds to long-term observations. since the standard deviation of the temperature determination can be considered the same for every sensor, we can calculate the standard deviation of the temperature difference σ∆t according to

σ∆t = σt · √2 = 0.21 °c, (4)

with the corresponding permitted deviation ∆mt = 0.42 °c. for the temperature gradient, the standard deviation σ∇t is determined as

σ∇t = σ∆t / dh, (5)

where dh is the vertical distance between the sensors. after substituting into the equation, we get the results given in tab. 1.

table 1: accuracy of temperature gradients
gradient vertical range ∆t | (0.5 − 1.0 m) | (1.0 − 1.5 m) | (1.5 − 1.9 m)
σ∇t [°c/m] | 0.42 | 0.42 | 0.53
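the error propagation of eqs. (1)–(5) can be checked in a few lines; the sketch below only restates the relations above, so all the constants are those from the text.

```python
import math

DT_LOGGER = 0.2          # permitted deviation of the logger [deg c]

def sigma_t(t):
    """standard deviation of a single reading, eqs. (1)-(3)."""
    dt_sensor = 0.15 + 0.002 * abs(t)            # eq. (1), class a pt1000
    return math.hypot(dt_sensor / 2.0, DT_LOGGER / 2.0)   # eqs. (2)-(3)

# adopted value from the text (corresponds to long-term observation)
s_t = 0.15
s_dt = s_t * math.sqrt(2.0)                      # eq. (4) -> 0.21 deg c

for dh in (0.5, 0.5, 0.4):                       # sensor spacings of tab. 1
    print(f"dh = {dh:.1f} m -> sigma_grad = {s_dt/dh:.2f} deg c/m")

print(f"sigma_t at 40 deg c: {sigma_t(40.0):.2f} deg c")
```

running it reproduces the 0.42 and 0.53 °c/m values of tab. 1.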
it is necessary to repeat here that the data describe a specific sunny day with the measuring equipment placed on a sunlit lawn during the whole day. other days, such as when the apparatus was in a shade, the graphs would look relatively different in the illustration. unfortunately, due to the scale, they will not be included here. 2.2.2. determination of temperature gradient due to the large volume of data, the vertical temperature gradient ∇t was calculated only for specific times of each day. these times are 00:00, 4:00, 8:00, 12:00, 16:00 and 20:00. in order to make the calculation of the gradient more reliable from the point of view of defining the season, a regression function (2nd order polynomial) was calculated from the measurements in the period −30 to +30 min and a specific functional value of this polynomial was used in the following calculation. the following fig. 5 and fig. 6 show gradients for selected times during each day. the dotted line shows the gradient for the lower section (0.5 to 1.0 m), the dashed line for the middle section (1.0 to 1.5 m) and the solid line for the upper section (1.5 to 1.9 m). it is clear that the resulting gradients can take on high values exceeding 4.5 °c/m. the most significant gradients are in a sunny place during the measurement (1.4.2020 to 20.5.2020). since the temperature gradient used is obtained by a regression function from a relatively large number of measurements, its accuracy can be expected to be higher than that of the calculated in the accuracy analyses (chapter 2.1.4). however, the standard deviation of the average cannot be used due to the influence of systematic errors burdening the measured set. 2.3. calculations fig. 7 shows, in a simplified manner, the relationships between significant parameters defining the vertical refraction and its influence on the measured values. between points a and b, the slope distance sd and the zenith angle za are marked. the points are located on the earth’s surface (represented by a sphere of radius r and centre s) at height ha (respectively hb). the distance of the points along the circle sha at the height ha defines the central angle ϕ. furthermore, the apparent point b′ shifted to from the actual position by the vertical error ∆hb is marked. the error is caused by the angular deviation of the beam from the straight trajectory refractive angle β/2 (if the refractive curve is not circular, then this angle is general ρa), or more precisely the tangent ta to the substitute circular refractive curve (purple) at point a. the refractive curve is defined by centre d, radius of curvature rk and centre angle β, which is also referred to as the angle of complete refraction. the zenith angle z′a describes the value that we would actually measure [5]. 2.4. differential equation of the passage of a wave-path above the measured data, specifically above the obtained gradients, a simulation of the beam passage by the depwp method (differential equation of the passage of a wave-path through inhomogeneous atmosphere) was performed. it is a physical relationship describing the change of the direction of the beam dependent on the refractive index and the change (gradient) of the refractive index of the given environment (derived in [25]). the whole relation according to [26] and [27] can be written as d2r dt2 = n(r) ·∇n(r) = f(r), (6) 479 tomáš suk, martin štroner acta polytechnica figure 3. development of temperature during the day 7.5.2020. figure 4. 
2.3. calculations
fig. 7 shows, in a simplified manner, the relationships between the significant parameters defining the vertical refraction and its influence on the measured values. between points a and b, the slope distance sd and the zenith angle za are marked. the points are located on the earth's surface (represented by a sphere of radius r and centre s) at heights ha and hb, respectively. the distance of the points along the circle sha at the height ha defines the central angle ϕ. furthermore, the apparent point b′, shifted from the actual position by the vertical error ∆hb, is marked. the error is caused by the angular deviation of the beam from the straight trajectory – the refractive angle β/2 (if the refractive curve is not circular, then this angle is, in general, ρa), or more precisely by the tangent ta to the substitute circular refractive curve at point a. the refractive curve is defined by the centre d, the radius of curvature rk and the central angle β, which is also referred to as the angle of complete refraction. the zenith angle z′a describes the value that we would actually measure [5].

[figure 7: relationships between measured and determined quantities]

2.4. differential equation of the passage of a wave-path
over the measured data, specifically over the obtained gradients, a simulation of the beam passage by the depwp method (differential equation of the passage of a wave-path through an inhomogeneous atmosphere) was performed. it is a physical relationship describing the change of the direction of the beam dependent on the refractive index and on the change (gradient) of the refractive index of the given environment (derived in [25]). the whole relation, according to [26] and [27], can be written as

d²r/dt² = n(r) · ∇n(r) = f(r), (6)

r = (x, y, z)ᵀ, n(r) = n(x, y, z), ∇n(r) = (∂n(r)/∂x, ∂n(r)/∂y, ∂n(r)/∂z)ᵀ, (7)

where the vector r defines a specific point on the beam path and is expressed in coordinates, n is the refractive index of the medium and ∇n is the gradient of the refractive index in the directions of the coordinate axes. the calculation is performed numerically, and a designation can be introduced for a better understanding:

u = dr/dt, du/dt = d²r/dt², (8)

where u is the direction vector and dt is a differential part of the path [27]. we can convert this second order differential equation (8) to two differential equations of the first order and then solve them simultaneously:

du/dt = f(r), dr/dt = u. (9)

the initial conditions are the position r0 (total station) and the direction u0 (to the measured point – the zenith angle). using a sufficiently small step dt → ∆t (in our case ∆t = 0.1 mm), we can calculate the numerical passage through the atmosphere. the resulting equations for the numerical pass are as follows:

dri+1 = ui · ∆t, ri+1 = ri + dri+1, (10)

dui+1 = f(ri+1) · ∆t, ui+1 = ui + dui+1. (11)
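eqs. (10)–(11) are a simple explicit euler scheme, and a minimal sketch follows. the refractive-index field used here is a hypothetical stand-in (a constant vertical gradient of n); the paper builds f(r) from the measured temperature gradients and the barrel-sears formula of the next section, and uses a finer 0.1 mm step.

```python
import numpy as np

def trace_ray(r0, u0, f, step=1e-3, length=50.0):
    """explicit euler integration of eqs. (9)-(11); step and length
    are in metres (the paper uses a 0.1 mm step)."""
    r, u = np.array(r0, float), np.array(u0, float)
    for _ in range(int(length / step)):
        r = r + u * step                 # eq. (10)
        u = u + f(r) * step              # eq. (11)
    return r

def f_demo(r, dndz=-2e-7):
    """hypothetical field n(z) = 1.00027 + dndz*z, so f = n * grad n."""
    n = 1.00027 + dndz * r[2]
    return n * np.array([0.0, 0.0, dndz])

z0 = np.deg2rad(90.0)                    # horizontal sight
u0 = [np.sin(z0), 0.0, np.cos(z0)]       # unit direction vector
r_end = trace_ray([0.0, 0.0, 1.6], u0, f_demo)
print("height deviation over 50 m: %.3f mm" % ((r_end[2] - 1.6) * 1e3))
```

for the constant-gradient demo field, the bending closes to the analytic value ½·(dn/dz)·s², a fraction of a millimetre at 50 m, which is the order of magnitude reported in the results below.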
by this procedure, the resulting refractive angle ρ1, and the resulting vertical displacement of the target ∆h was determined according to ρ1 = zn −z0, ∆h ≈ ρ1 · s sin(z0) . (17) 3. results the very determination of real temperatures (see example fig. 3) prevailing during the year in the microclimate of the ground level of the atmosphere can be considered as a result of the experiment itself. this research has not been carried out in our territory so far, and in previous refraction solutions, estimates or data determined elsewhere were used. of course, it is not possible to use the specified temperatures as generally applicable, as they are influenced by the whole spectrum of parameters, such as position, influence of surrounding objects, material of the surface, content of dust particles. nevertheless, the data are important for obtaining a qualified estimate of possible temperatures in our conditions. equally important is the determination of the temperature gradients shown in fig. 4. 3.1. vertical shifts using the depwp simulation described in chapters 2.4 to 2.4.2, the values of the vertical shifts (difference between the apparent and the actual target) for each beam path can be obtained. for better clarity, the results are again displayed for selected times of day (00:00, 4:00, 8:00, 12:00, 16:00 and 20:00) in the form of graphs (see fig. 8), which describe the vertical shifts (corrections) for z = 99.2; 100 a 101.8 gon. this range of the zenith angle was chosen due to the use of the entire height of the measuring structure (up to 2 m). it is clear from the graphs that the direct exposure of the terrain greatly affects the vertical change (the largest ones were calculated for the days when the structure was placed in a sunny position). furthermore, it is worth recalling that the calculated deviations in some days and times are essential for many geodetic applications and it is necessary to consider them or change the principle or the time of the measurement accordingly. the highest deviations are usually achieved in the afternoon, when, in the summer months, the error exceeding 2 mm is not uncommon. on the contrary, the lowest error is observed early in the morning and later at night, when it usually does not exceed 1 mm. 4. real beam trajectory for a better idea of the realization of the actual trajectory, graphs in fig. 9 to fig. 11 show the beam trajectory on selected days. the straight line is shown in black and the differences in the trajectory from the straight line at a particular point are multiplied by 1 000 for better clarity (so they are in mm). the graphs show all three variants of zenith angles at once and the individual paths according to the time of day are shown by a coloured dashed line chosen according to tab. 2. note that in fig. 9, the beam trajectory is flat and very close to a straight line. for z = 99.2 gon, the curve is convex towards the surface, while for the others, it is inverted. this is due to a negative gradient in the higher part of the measured range. the graph fig. 10 shows a state where the apparatus has been placed in a sunny place. the gradients are very high and vertical deviations are as well. importantly, for z = 99.2 and 100 gon, the curve is always concave, while for z = 101.8 gon, it is convex during the day (8:00 to 16:00) and concave at night. this is due to the large thermal radiation of the surface. in the last measurement shown (see fig. 11), from the end of october, the curves are relatively flat, but the state is very variable. 
for z = 99.2 gon, it is concave in the morning, almost straight at noon and convex in the afternoon. with a horizontal view of z = 100 gon, the curve is always concave and for z = 102.8 gon, the curve is concave at night (0:00 and 20:00) and convex during the day due to the thermal radiation of the surface. 5. discussion the main benefit of this experiment and article should be a warning about refraction, or rather that the solution of its influence is no longer just a question of academia and the construction of large point fields. on the contrary, the current direction of accurate geodesy directly requires a practical investigation of the impact of refraction and the active minimization of its influence on geodetic data. after all, as the results published here shows, it is still true that for applications, where accuracy in the order of centimetres is required, refraction is not critical and will not need to be addressed. however, in areas where an 483 tomáš suk, martin štroner acta polytechnica figure 8. vertical shifts for z = 99.2; 100.0 a 101.8 gon. 484 vol. 61 no. 3/2021 the impact of the air temperature on measuring the zenith angle. . . trajectory straight 0:00 4:00 8:00 12:00 16:00 20:00 color table 2. colors of individual paths. figure 9. the real trajectory of the beam of the day 15.1.2020. figure 10. the real trajectory of the beam of the day 7.5.2020. 485 tomáš suk, martin štroner acta polytechnica figure 11. the real trajectory of the beam of the day 26.10.2019. accuracy of more than one cm is required, it will be necessary to consider the possible impact of refraction and its possible suppression. if we focus on specialized geodetic work in construction, engineering or control measurements, then the accuracy is in units of millimetres and refraction can easily become a significant systematic component of measurement errors the results also clearly show that the atmosphere is very variable and cannot simply be described by a single universal number, as the gaussian refractive index k = 0.1306 is sometimes incorrectly presented. this number describes the atmosphere in a particular area, which is very far from the application of engineering surveying. t. křemen deals, in detail, with the issue of the gaussian coefficient in article [10]. the theoretical refractive index can be calculated retrospectively from the determined vertical displacements. the measured data show that the coefficient usually takes values up to +3 and seldomly takes negative values. for measurements in a sunny place, it even reaches values exceeding +20, while for measurements in a shade, the maximum values are up to +5. in the winter months, the coefficient does not deviate much from the value of 0. there are several reasons for the very significant diversity compared to the so-called average coefficients determined from the network alignment. a specific application can be considered as the most important one. when measuring large networks (such as the network in the kingdom of hanover from 1821 to 1825 measured by k. f. gauss), where the survey station is elevated (for a better observation of network points often miles away) and usually a large part of the sight is high above the ground (tens of meters). the use of measuring towers was no exception. the temperature at such a height behaves completely differently from the temperature determined by the measurement presented here at a height of up to 2 m above the ground. 6. 
conclusion the task of the whole experiment was to measure real temperatures that may occur in our territory and to critically evaluate what effect refraction may have during the year. the experiment was aimed at engineering surveying, which is usually characterized by a measurement at shorter distances (up to hundreds of metres) and higher accuracy (most often in the order of mm). it is clear from the results that there are seasons and times of day that are not suitable for a measurement, as the required accuracy may be exceeded by the effect of the refraction alone. likewise, the well-known rule was confirmed that during noon to afternoon, the effect of refraction reaches its highest values, and conversely, the best conditions for accurate measurements can be expected at night or very early in the morning. it has a very adverse effect on the measurement results over sunny terrain. a measurement close to the ground on a sunny day is not suitable due to the extreme temperature gradient. however, the steepness of the sight also negatively affects the resulting data. the theoretical ideal would be to measure in a layer where the temperature radiated by the surface and the temperature of the upper atmosphere are equal. this is of course not easy to determine, but the results suggest the height from 1.5 to 1.7 meters above the earth’s surface. a certain danger arising from the observation is the fact that during the night, the atmosphere is not homogeneous, but it is significantly calmer, and the stratification of the atmosphere stabilizes more. 486 vol. 61 no. 3/2021 the impact of the air temperature on measuring the zenith angle. . . however, this can result in a systematic effect on the measured data that cannot be recognized during the measurement. the only way to detect such an error would be to re-measure, ideally, on another day (another night) when the atmosphere stratifies differently. however, this is often not possible in practice. although this experiment appears to be relatively robust from a measurement point of view, it is not possible to perceive it as a list of corrections to measurements in a given part of the year and at a given time. the benefit should be a concrete idea of the conditions that can be expected in our conditions for measurements near the terrain. nor can it be considered true that the temperature in the atmosphere is systematically stratified, and that the measurement of the temperature at the survey station describes the temperature over the entire range of the beam path. in the field of research into the influence of temperature (and refraction in general) on geodetic measurements, several measurements will still need to be made to better understand it. for example, it is not certain at what height above the ground the temperature gradient decreases and the atmosphere becomes more homogenized. for this reason, the calculations were focused only on the height range of the measuring structure. acknowledgements this research was funded by the grant agency of ctu in prague grant number sgs20/052/ohk1/1t/11 “optimization of acquisition and processing of 3d data for purpose of engineering surveying, geodesy in underground spaces and 3d scanning”. references [1] j. braun, m. štroner. geodetic measurement of longitudinal displacements of the railway bridge. [2] j. braun, h. fladrova, k. prager. comparison of different measurement methods of crane runway. 
in advances and trends in geodesy, cartography and geoinformatics ii: proceedings of the 11th international scientific and professional conference on geodesy, cartography and geoinformatics, pp. 3 – 9. 2020. https://doi.org/10.1201/9780429327025-2. [3] j. bures, d. bartonek, m. kalina, o. svabensky. security geodetic monitoring of structures. in 7th international conference on cartography and gis, pp. 874 – 879. bulgarian cartographic association, bulgaria, 2018. [4] j. bureš, l. bárta, o. švábenský. contributions to international conferences on engineering surveying, chap. influence of external conditions in monitoring of building structures, pp. 223 – 235. springer, 2021. https://doi.org/10.1007/978-3-030-51953-7_19. [5] m. hauf. technický průvodce: geodezie. sntl nakladatelství technické literatury, 1982. [6] l. hradilek. refraction in trigonometric and three-dimensional terrestrial networks. the canadian surveyor 26(1):59 – 70, 1972. https://doi.org/10.1139/tcs-1972-0006. [7] l. hradilek. trigonometric levelling and spatial triangulation in mountain regions. bulletin géodésique 87(1):33 – 52, 1968. https://doi.org/10.1007/bf02530312. [8] r. urban, o. michal. development deflection of prestressed concrete bridge. in 15th international multidisciplinary scientific geoconference sgem 2015, pp. 203 – 210. stef92 technology ltd., sofia, bulgaria, 2015. https://doi.org/10.5593/sgem2015/b22/s9.025. [9] š. rákay, k. bartoš, k. pukanská. the influence of refraction on determination of position of objects under water using total station. in advances and trends in geodesy, cartography and geoinformatics, pp. 95 – 100. 2018. https://doi.org/10.1201/9780429505645-16. [10] t. křemen. refrakční koeficient a gaussova hodnota k = 0,1306. geodetický a kartografický obzor 64(8):161 – 169, 2018. [11] c. hirt, s. guillaume, a. wisbar, et al. monitoring of the refraction coefficient in the lower atmosphere using a controlled setup of simultaneous reciprocal vertical angle measurements. journal of geophysical research: atmospheres 115(d21), 2010. [12] t. j. kukkamaki. ober die nivellitische refraktion. publication of the finnish geodetic institute, helsinki, 1938. [13] s. r. holdahl. removal of refraction errors in geodetic leveling. in symposium international astronomical union, vol. 89, pp. 305 – 319. 1979. [14] p. v. angus-leppan. use of meteorological measurements for computing refractional effects a review. in symposium international astronomical union, vol. 89, pp. 165 – 178. 1979. https://doi.org/10.1017/s0074180900065979. [15] d. gaifillia, v. pagounis, m. tsakiri, v. zacharis. empirical modelling of refraction error in trigonometric heighting using meteorological parameters. journal of geosciences and geomatics 4(1):8 – 14, 2016. [16] a. s. monin, a. m. obukhov. basic laws of turbulent mixing in the atmosphere near the ground. academiia nauk sssr, geofizicheskii institut 24:163 – 187, 1954. [17] f. k. brunner. systematic and random atmospheric refraction effects in geodetic levelling. in proc. of second international symposium on problems related to the redefinition of north american vertical geodetic networks, pp. 691 – 703. 1980. [18] a. reiterer. modeling atmospheric refraction influences by optical turbulences using an image-assisted total station. zfv zeitschrift für geodäsie, geoinformation und landmanagement 137(3):156 – 165, 2012. [19] b. böckem, p. flach, a. weiss, m. hennes. 
refraction influence analysis and investigations on automated elimination of refraction effects on geodetic measurements. in proc. imeko 2000, 2000.
[20] h. ingensand. concepts and solutions to overcome the refraction problem in terrestrial precision measurement. geodezija ir kartografija 34(2):61 – 65, 2008. https://doi.org/10.3846/1392-1541.2008.34.61-65.
[21] s. kyle, s. robson, l. macdonald, m. shortis. compensating for the effects of refraction in photogrammetric metrology. in proceedings of the 14th international workshop on accelerator alignment, 2016.
[22] h. sirůčková. experimental levelling at the interface of optical environments. acta polytechnica 56(2):138 – 146, 2016. https://doi.org/10.14311/ap.2016.56.0138.
[23] m. frk, z. rozsívalová. overview, accuracy and sensitivity of temperature sensors in practice. elektrorevue 14(4):55-1 – 55-8, 2012.
[24] návod na použití a kalibrační protokol. tech. rep., sensit, rožnov pod radhoštěm, 2019.
[25] j. a. kravcov, j. i. orlov. geometričeskaja optika neodnorodnych sred. moskva: nauka, 1980.
[26] a. mikš, j. pospíšil. počítačová simulace vlivu atmosféry na geodetická měření. stavební obzor: odborný měsíčník 7:220 – 225, 1998.
[27] m. štroner, j. pospíšil, t. křemen, v. smítka. geodetická měření při požární zkoušce na experimentálním objektu v mokrsku. české vysoké učení technické v praze, 2008.
[28] m. štroner. metody výpočtu indexu lomu vzduchu. jemná mechanika a optika 7-8:224 – 228, 2000.
[29] f. dvořáček. interpretation and evaluation of procedures for calculating the group refractive index of air by ciddor and hill. in 18th international multidisciplinary scientific geoconferences sgem 2018 – informatics, geoinformatics and remote sensing, vol. 18, pp. 837 – 844.

solution of nonlinear coupled heat and moisture transport using finite element method
t. krejčí

this paper deals with a numerical solution of coupled heat and moisture transfer using the finite element method. the mathematical model consists of the balance equations of mass, energy and linear momentum and of the appropriate constitutive equations. the chosen macroscopic field variables are temperature, capillary pressures, gas pressure and displacement. in contrast with pure mechanical problems, there are several difficulties which require special attention. systems of algebraic equations arising from coupled problems are generally nonlinear, and the matrices of such systems are nonsymmetric and indefinite. the first experiences with solving complicated coupled problems are mentioned in this paper.

keywords: coupled hydro-thermo-mechanical analysis, concrete, non-linear system of equations.

1 introduction
assessment of the durability and serviceability of nuclear power plants is a topical problem in many countries. the most critical part of a power plant is the reactor containment, which is made from concrete with or without a steel lining. there are several requirements on the reactor containment: basically, it must protect the reactor from external effects, and also the external environment from possible accidents. mechanical analysis is used for assessing the limit state, and transport analysis is used for describing the leakage of pollutants to the external environment.
possible accidents can lead to a great pressure inside the reactor containment, which can cause damage to the concrete and therefore has an impact on the transport processes of the pollutants. therefore, a coupled hydro-thermo-mechanical analysis is required for a correct assessment of the reactor containment properties. concrete is a heterogeneous and porous material, which leads to relatively complicated material models. the aim of the present study is to show in a condensed form the theoretical basis of the most widely used mathematical models describing the coupled heat and moisture transport in deforming porous media, and to provide a set of governing equations together with the finite element method. the theory discussed below is based on the porous media theories given in [1].

2 mass and heat transfer in deforming porous media – a review of theory
2.1 constitutive relations
moisture in materials can be present as moist air, water and ice, or in some intermediate state as an adsorbed phase on the pore walls. since it is in general not possible to distinguish the different aggregate states, the water content w is defined as the ratio of the total moisture weight to the dry weight of the material (kg/kg). the degree of saturation sw is a function of the capillary pressure pc and the temperature t, which is determined experimentally,

sw = sw(pc, t). (1)

the capillary pressure pc is defined as

pc = pg − pw, (2)

where pw > 0 is the pressure of the liquid phase (water). the pressure of the moist air, pg > 0, in the pore system is usually considered as the pressure in a perfect mixture of two ideal gases – dry air, pga, and water vapor, pgw, i.e.,

pg = pga + pgw = (ρ^ga/ma + ρ^gw/mw) t r = (ρ^g/mg) t r. (3)

in this relation, ρ^ga, ρ^gw and ρ^g stand for the respective intrinsic phase densities, t is the absolute temperature, and r is the universal gas constant. identity (3), defining the molar mass of the moist air, mg, in terms of the molar masses of the individual constituents, is known as dalton's law. the greater the capillary pressure, the smaller is the capillary radius. it is shown thermodynamically that the capillary pressure can be expressed unambiguously by the relative humidity rh using the kelvin-laplace law,

rh = p^gw/p^gws = exp(−pc mw / (ρ^w r t)). (4)

the water vapor saturation pressure, p^gws, is a function of the temperature only, and can be expressed by the clausius-clapeyron equation

p^gws(t) = p^gws(t0) exp(−(mw Δhvap/r) (1/t − 1/t0)), (5)

where t0 is a reference temperature and Δhvap is the specific enthalpy of evaporation. "materials having heat capacities" is the term deliberately used to emphasize the similarity to the description of moisture retention. it is simply expressed as

h = h(t), (6)

where h is the mass specific enthalpy (j·kg⁻¹) and t is the temperature (k). it is not common to write the enthalpy in an absolute way as is done here. instead, changes of enthalpy are described in a differential way, which leads to the definition of the specific heat capacity as the slope of the h–t curve, i.e.

cp = (∂h/∂t)|p=const. (7)
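before continuing with the heat capacity, the moisture relations (3)–(5) can be illustrated numerically. the sketch below is a minimal evaluation of the kelvin-laplace and clausius-clapeyron formulas; the reference saturation pressure, the water density and the enthalpy of evaporation are standard water properties assumed for illustration, not values given in the text.

```python
import math

R  = 8.314       # j/(mol k), universal gas constant
MW = 0.018015    # kg/mol, molar mass of water
RHO_W = 998.0    # kg/m3, density of liquid water (assumed)
DH_VAP = 2.45e6  # j/kg, specific enthalpy of evaporation (assumed)
T0, PGWS0 = 293.15, 2339.0   # reference state ~20 deg c (assumed)

def p_gws(t):
    """water vapor saturation pressure, clausius-clapeyron, eq. (5)."""
    return PGWS0 * math.exp(-(MW * DH_VAP / R) * (1.0/t - 1.0/T0))

def rel_humidity(p_c, t):
    """kelvin-laplace law, eq. (4): rh from the capillary pressure."""
    return math.exp(-p_c * MW / (RHO_W * R * t))

p_c, t = 10e6, 298.15            # 10 mpa capillary pressure, 25 deg c
rh = rel_humidity(p_c, t)
print(f"rh = {rh:.3f}, p_gw = {rh * p_gws(t):.0f} pa")
```

even a 10 mpa capillary pressure only lowers the relative humidity to about 0.93, which is why fine-pored materials such as concrete stay nearly saturated with vapour over a wide moisture range.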
the heat capacity varies insignificantly with temperature. it is customary, however, to correct this term for the presence of the fluid phases and to introduce the effective heat capacity as

$$(\rho c_p)_{\text{eff}} = \rho_s c_{ps} + \rho_w c_{pw} + \rho_g c_{pg}. \quad (8)$$

2.2 transfer equations

the mass averaged relative velocities, $\mathbf{v}^\alpha - \mathbf{v}^s$, are expressed by the generalized form of darcy's law [1]

$$n s_\alpha \left(\mathbf{v}^\alpha - \mathbf{v}^s\right) = \frac{k_{r\alpha} \mathbf{k}_{sat}}{\mu_\alpha}\left(-\operatorname{grad} p^\alpha + \rho^\alpha \mathbf{g}\right), \quad (9)$$

where $\alpha = w$ for the liquid phase and $\alpha = g$ for the gaseous phase. the dimensionless relative permeabilities $k_{r\alpha} \in \langle 0, 1 \rangle$ are functions of the degree of saturation,

$$k_{r\alpha} = k_{r\alpha}(s_w). \quad (10)$$

in equation (9), $\mathbf{k}_{sat}$ (m²) is the square (3×3) intrinsic permeability matrix and $\mu_\alpha$ is the dynamic viscosity (kg·m⁻¹·s⁻¹). the intrinsic mass densities $\rho^\alpha$ are related to the volume averaged mass densities $\rho_\alpha$ through the relation

$$\rho_\alpha = n\, s_\alpha\, \rho^\alpha. \quad (11)$$

the relative permeability $k_{rw}$ goes to zero when the water saturation $s_w$ approaches $s_{irr}$, which is the limiting value of $s_w$ as the suction stress approaches infinity [2]. the diffusive-dispersive mass flux (kg·m⁻²·s⁻¹) of the water vapor (gw) in the gas (g) is the second driving mechanism. it is governed by fick's law

$$\mathbf{j}_g^{gw} = n s_g \rho^{gw}\left(\mathbf{v}^{gw} - \mathbf{v}^g\right) = -\rho^g \mathbf{D}_g \operatorname{grad}\left(\frac{\rho^{gw}}{\rho^g}\right), \quad (12)$$

where $\mathbf{D}_g$ (m²·s⁻¹) is the effective dispersion tensor. it can be shown [1] that

$$\mathbf{j}_g^{gw} = -\rho^g \frac{M_a M_w}{M_g^2} \mathbf{D}_g \operatorname{grad}\left(\frac{p^{gw}}{p^g}\right) = \rho^g \frac{M_a M_w}{M_g^2} \mathbf{D}_g \operatorname{grad}\left(\frac{p^{ga}}{p^g}\right) = -\mathbf{j}_g^{ga}, \quad (13)$$

where $\mathbf{j}_g^{ga}$ is the diffusive-dispersive mass flux of the dry air in the gas. conduction of heat in the normal sense comprises radiation as well as convective heat transfer on a microscopic level. the generalized version of fourier's law is used to describe the conduction heat transfer

$$\mathbf{q} = -\boldsymbol{\lambda}_{\text{eff}} \operatorname{grad} T, \quad (14)$$

where $\mathbf{q}$ is the heat flux (W·m⁻²) and $\boldsymbol{\lambda}_{\text{eff}}$ is the effective thermal conductivity matrix (W·m⁻¹·K⁻¹). the thermal conductivity increases with increasing temperature due to the non-linear behavior of the microscopic radiation, which depends on the difference of temperatures raised to the 4th power. the presence of water also increases the thermal conductivity. a suitable formula reflecting this effect can be found in [1].

3 deformation of a solid skeleton

3.1 concept of effective stress

the stresses in the grains, $\boldsymbol{\sigma}^s$, can be expressed using a standard averaging technique in terms of the stresses in the liquid phase, $\boldsymbol{\sigma}^w$, the stresses in the gas, $\boldsymbol{\sigma}^g$, and the effective stresses between the grains, $\boldsymbol{\sigma}^{ef}$. the equivalence conditions for the internal stresses and for the total stress $\boldsymbol{\sigma}$ lead to the expression [3]

$$\boldsymbol{\sigma} = (1-n)\boldsymbol{\sigma}^s + n s_w \boldsymbol{\sigma}^w + n s_g \boldsymbol{\sigma}^g. \quad (15)$$

the assumption that the shear stress in fluids is negligible converts the latter equation into the form

$$\boldsymbol{\sigma} = \boldsymbol{\sigma}^{ef} - p^s \mathbf{m}, \quad (16)$$

where

$$\boldsymbol{\sigma} = \{\sigma_x, \sigma_y, \sigma_z, \tau_{yz}, \tau_{zx}, \tau_{xy}\}^T, \qquad \mathbf{m} = \{1, 1, 1, 0, 0, 0\}^T \quad (17)$$

and

$$p^s = s_w p^w + s_g p^g. \quad (18)$$
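a minimal numpy sketch of eqs. (16)–(18) follows; the stress and pressure values used here are illustrative placeholders, not data from the paper.

```python
import numpy as np

m = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])    # eq. (17)

def total_stress(sigma_ef, s_w, p_w, p_g):
    """eqs. (16) and (18): total stress in voigt notation from the
    effective stress and the averaged solid-phase pore pressure p^s."""
    p_s = s_w * p_w + (1.0 - s_w) * p_g           # eq. (18), s_g = 1 - s_w
    return sigma_ef - p_s * m                      # eq. (16)

# illustrative values in Pa (order: x, y, z, yz, zx, xy)
sigma_ef = np.array([-2.0e6, -2.0e6, -3.0e6, 0.0, 0.0, 1.0e5])
print(total_stress(sigma_ef, s_w=0.5, p_w=9.0e4, p_g=1.0e5))
```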
deformation of a porous skeleton associated with the grain rearrangement can be expressed using the constitutive equation written in the rate form

$$\dot{\boldsymbol{\sigma}}^{ef} = \mathbf{D}_{sk}\left(\dot{\boldsymbol{\varepsilon}} - \dot{\boldsymbol{\varepsilon}}_0\right). \quad (19)$$

the dots denote differentiation with respect to time, $\mathbf{D}_{sk} = \mathbf{D}_{sk}(\dot{\boldsymbol{\varepsilon}}, \boldsymbol{\sigma}^{ef}, T)$ is the tangential matrix of the porous skeleton and $\dot{\boldsymbol{\varepsilon}}_0$ represents the strains that are not directly associated with stress changes (e.g., temperature effects, shrinkage, swelling, creep). it also comprises the strains of the bulk material due to changes of the pore pressure,

$$\dot{\boldsymbol{\varepsilon}}_p = -\mathbf{m}\,\frac{\dot{p}^s}{3 K_s}, \quad (20)$$

where $K_s$ is the bulk modulus of the solid material (matrix). when admitting only this effect and combining eqs. (16), (19) and (20), we get

$$\dot{\boldsymbol{\sigma}} = \mathbf{D}_{sk}\dot{\boldsymbol{\varepsilon}} + \left(\frac{\mathbf{D}_{sk}\mathbf{m}}{3K_s} - \mathbf{m}\right)\dot{p}^s = \mathbf{D}_{sk}\dot{\boldsymbol{\varepsilon}} - \alpha\,\mathbf{m}\,\dot{p}^s, \quad (21)$$

where

$$\alpha = \frac{1}{3}\,\mathbf{m}^T\left(\mathbf{I} - \frac{\mathbf{D}_{sk}}{3K_s}\right)\mathbf{m} = 1 - \frac{K_{sk}}{K_s}, \quad (22)$$

and $K_{sk} = \mathbf{m}^T\mathbf{D}_{sk}\mathbf{m}/9$ is the bulk modulus of the porous skeleton. for a material without any pores, $K_{sk} = K_s$. for cohesive soils, $K_{sk} \ll K_s$ and $\alpha \approx 1$. the above formulas are also applicable to the long-term deformation of rocks, for which $\alpha \approx 0.5$, and this fact strongly affects equation (21) [4].

changes of the effective stress along with temperature and pore pressure changes produce a change in the solid density $\dot{\rho}^s$. to derive the respective material relation for this quantity, we start from the mass conservation equation for the solid phase. in the second step we introduce the constitutive relationship for the mean effective stress expressed in terms of quantities describing the deformation of the porous skeleton. after some manipulations we arrive at the searched formula

$$(1-n)\,\frac{\dot{\rho}^s}{\rho^s} = (\alpha - n)\left(\frac{\dot{p}^s}{K_s} - \beta_s \dot{T}\right) + (1-\alpha)\operatorname{div}\mathbf{v}^s, \quad (23)$$

where $\beta_s$ is the thermal expansion coefficient of the solid phase. a similar approach applied to the mass conservation equation of the liquid phase leads to the following constitutive equation

$$\frac{\dot{\rho}^w}{\rho^w} = \frac{\dot{p}^w}{K_w} - \beta_w \dot{T}, \quad (24)$$

where $K_w$ is the bulk modulus of water and $\beta_w$ is the thermal expansion coefficient of this phase.

3.2 set of governing equations

the complete set of equations describing the coupled moisture and heat transport in deforming porous media comprises the linear momentum balance (equilibrium) equation formulated for a multi-phase body, the energy balance equation and the continuity equations for the liquid water and the gas.

continuity equation for the dry air:

$$\frac{\partial}{\partial t}\left[(1-s_w)\, n\, \rho^{ga}\right] + (1-s_w)\rho^{ga}\operatorname{div}\dot{\mathbf{u}} + \operatorname{div}\left[\rho^{ga}\frac{k_{rg}\mathbf{k}_{sat}}{\mu_g}\left(-\operatorname{grad}p^g + \rho^g\mathbf{g}\right)\right] + \operatorname{div}\left[\rho^g\frac{M_a M_w}{M_g^2}\mathbf{D}_g\operatorname{grad}\frac{p^{gw}}{p^g}\right] = 0, \quad (25)$$

where $\dot{\mathbf{u}}$ ($\dot{\mathbf{u}} = \mathbf{v}^s$) is the velocity of the solid.

continuity equation for the water species:

$$\frac{\partial}{\partial t}\left[(1-s_w)\, n\, \rho^{gw}\right] + (1-s_w)\rho^{gw}\operatorname{div}\dot{\mathbf{u}} + \operatorname{div}\left[\rho^{gw}\frac{k_{rg}\mathbf{k}_{sat}}{\mu_g}\left(-\operatorname{grad}p^g + \rho^g\mathbf{g}\right)\right] - \operatorname{div}\left[\rho^g\frac{M_a M_w}{M_g^2}\mathbf{D}_g\operatorname{grad}\frac{p^{gw}}{p^g}\right] = -\frac{\partial}{\partial t}\left(s_w\, n\, \rho^w\right) - s_w\rho^w\operatorname{div}\dot{\mathbf{u}} - \operatorname{div}\left[\rho^w\frac{k_{rw}\mathbf{k}_{sat}}{\mu_w}\left(-\operatorname{grad}(p^g - p^c) + \rho^w\mathbf{g}\right)\right]. \quad (26)$$

energy balance equation:

$$(\rho c_p)_{\text{eff}}\frac{\partial T}{\partial t} + \left\{\rho^w c_{pw}\frac{k_{rw}\mathbf{k}_{sat}}{\mu_w}\left[-\operatorname{grad}(p^g - p^c) + \rho^w\mathbf{g}\right] + \rho^g c_{pg}\frac{k_{rg}\mathbf{k}_{sat}}{\mu_g}\left(-\operatorname{grad}p^g + \rho^g\mathbf{g}\right)\right\}\cdot\operatorname{grad}T - \operatorname{div}\left(\boldsymbol{\lambda}_{\text{eff}}\operatorname{grad}T\right) = -\Delta h_{vap}\left\{\frac{\partial}{\partial t}\left(s_w\, n\, \rho^w\right) + s_w\rho^w\operatorname{div}\dot{\mathbf{u}} + \operatorname{div}\left[\rho^w\frac{k_{rw}\mathbf{k}_{sat}}{\mu_w}\left(-\operatorname{grad}(p^g - p^c) + \rho^w\mathbf{g}\right)\right]\right\}. \quad (27)$$
the equilibrium equation (the linear momentum balance equation) must still be introduced to complete the set of governing equations,

$$\operatorname{div}\left[\boldsymbol{\sigma}^{ef} - \mathbf{m}\left(p^g - s_w p^c\right)\right] + \rho\,\mathbf{g} = 0, \quad (28)$$

with the density of the multi-phase medium defined as

$$\rho = (1-n)\rho^s + n s_w \rho^w + n s_g \rho^g = \rho_s + \rho_w + \rho_g. \quad (29)$$

initial and boundary conditions. the initial conditions specify the full fields of gas pressure, capillary or water pressure, temperature, displacements and velocities:

$$p^g = p^g_0, \quad p^c = p^c_0, \quad T = T_0, \quad \mathbf{u} = \mathbf{u}_0 \quad \text{and} \quad \dot{\mathbf{u}} = \dot{\mathbf{u}}_0 \quad \text{at } t = 0. \quad (30)$$

the boundary conditions can be imposed values on $\Gamma^1$ or fluxes on $\Gamma^2$, where the boundary is $\Gamma = \Gamma^1 \cup \Gamma^2$:

$$p^g = \bar{p}^g \text{ on } \Gamma^1_g, \quad p^c = \bar{p}^c \text{ on } \Gamma^1_c, \quad T = \bar{T} \text{ on } \Gamma^1_T \quad \text{and} \quad \mathbf{u} = \bar{\mathbf{u}} \text{ on } \Gamma^1_u. \quad (31)$$

the volume averaged flux boundary conditions for the water species and dry air conservation equations and for the energy equation, to be imposed at the interface between the porous medium and the surrounding fluid, are as follows:

$$\left(\rho^{ga}\mathbf{v}^g + \mathbf{j}_g^{ga}\right)\cdot\mathbf{n} = q^{ga} \quad \text{on } \Gamma^2_g,$$
$$\left(\rho^{gw}\mathbf{v}^g + \rho^w\mathbf{v}^w + \mathbf{j}_g^{gw}\right)\cdot\mathbf{n} = \beta_c\left(\rho^{gw} - \rho^{gw}_\infty\right) + q^{gw} + q^w \quad \text{on } \Gamma^2_c,$$
$$\left(\rho^w\mathbf{v}^w \Delta h_{vap} - \boldsymbol{\lambda}_{\text{eff}}\operatorname{grad}T\right)\cdot\mathbf{n} = \alpha_c\left(T - T_\infty\right) + q^T \quad \text{on } \Gamma^2_T, \quad (32)$$

where $\mathbf{n}$ is the unit normal vector of the surface of the porous medium, $\rho^{gw}_\infty$ and $T_\infty$ are the mass concentration of water vapor and the temperature in the undisturbed gas phase far away from the interface, and $q^{ga}$, $q^{gw}$, $q^w$ and $q^T$ are the imposed air flux, the imposed vapor flux, the imposed liquid flux and the imposed heat flux, respectively. the traction boundary conditions for the displacement field are

$$\boldsymbol{\sigma}\cdot\mathbf{n} = \bar{\mathbf{t}} \quad \text{on } \Gamma^2_u, \quad (33)$$

where $\bar{\mathbf{t}}$ is the imposed traction.

4 discretization of governing equations

a weak formulation of the governing equations (25) to (28) is obtained by applying galerkin's method of weighted residuals. for the numerical solution, the governing equations are discretized in space by means of the finite element method, yielding a non-symmetric and non-linear system of partial differential equations:

$$\mathbf{K}_{\alpha u}\mathbf{u} + \mathbf{K}_{\alpha c}\mathbf{p}_c + \mathbf{K}_{\alpha g}\mathbf{p}_g + \mathbf{K}_{\alpha T}\mathbf{T} + \mathbf{C}_{\alpha u}\dot{\mathbf{u}} + \mathbf{C}_{\alpha c}\dot{\mathbf{p}}_c + \mathbf{C}_{\alpha g}\dot{\mathbf{p}}_g + \mathbf{C}_{\alpha T}\dot{\mathbf{T}} = \mathbf{f}_\alpha, \qquad \alpha \in \{u, c, g, T\}. \quad (34)$$

eqs. (34) can be rewritten in the concise form

$$\mathbf{K}(\mathbf{x})\mathbf{x} + \mathbf{C}(\mathbf{x})\dot{\mathbf{x}} = \mathbf{f}(\mathbf{x}), \quad (35)$$

where $\mathbf{x}^T = \{\mathbf{p}_g, \mathbf{p}_c, \mathbf{T}, \mathbf{u}\}$, $\mathbf{C}(\mathbf{x})$ is "the general capacity matrix", $\mathbf{K}(\mathbf{x})$ is "the general conductivity matrix" and they are obtained together with $\mathbf{f}(\mathbf{x})$ by assembling the sub-matrices indicated in eqs. (34). the dot denotes the time derivative.

5 method of solution

coupled mechanical and transport processes after discretization by the finite element method are described by a system of ordinary differential equations which can be written in the form

$$\mathbf{K}\mathbf{r} + \mathbf{C}\frac{d\mathbf{r}}{dt} = \mathbf{f}, \quad (36)$$

where $\mathbf{K}$ is the stiffness-conductivity matrix, $\mathbf{C}$ is the capacity matrix, $\mathbf{r}$ is the vector of nodal values and $\mathbf{f}$ is the vector of prescribed forces and fluxes. the numerical solution of the system of ordinary differential eqs. (36) is based on expressions for the unknown values collected in the vector $\mathbf{r}$ at time $n+1$,

$$\mathbf{r}_{n+1} = \mathbf{r}_n + \Delta t\,\mathbf{v}_{n+\alpha}, \quad (37)$$

where the vector $\mathbf{v}_{n+\alpha}$ has the form

$$\mathbf{v}_{n+\alpha} = (1-\alpha)\mathbf{v}_n + \alpha\,\mathbf{v}_{n+1}. \quad (38)$$

the vector $\mathbf{v}$ contains the time derivatives of the unknown variables (time derivatives of the vector $\mathbf{r}$).
eq. (36) is expressed at time $n+1$ and, with the help of the previously defined vectors, we can find

$$\left(\mathbf{C} + \alpha\Delta t\,\mathbf{K}\right)\mathbf{v}_{n+1} = \mathbf{f}_{n+1} - \mathbf{K}\left(\mathbf{r}_n + (1-\alpha)\Delta t\,\mathbf{v}_n\right). \quad (39)$$

this formulation is suitable because an explicit or implicit computational scheme can be set by the parameter $\alpha$. the advantage of the explicit algorithm is based on the possible efficient solution of the system of equations, because the parameter $\alpha$ can be equal to zero and the capacity matrix $\mathbf{C}$ can be diagonal. therefore the solution of the system is extremely easy. the disadvantage of such a method is its conditional stability. this means that the time step must satisfy the stability condition, which usually leads to a very short time step. the scheme is summarized in table 1.

table 1: algorithm of the time integration scheme
initial vectors (defined by initial conditions): $\mathbf{r}_0$, $\mathbf{v}_0$
do until $i \le n$ ($n$ is the number of time steps):
  predictor: $\tilde{\mathbf{r}}_{i+1} = \mathbf{r}_i + (1-\alpha)\Delta t\,\mathbf{v}_i$
  right hand side vector: $\mathbf{y}_{i+1} = \mathbf{f}_{i+1} - \mathbf{K}\tilde{\mathbf{r}}_{i+1}$
  matrix of the system: $\mathbf{A} = \mathbf{C} + \alpha\Delta t\,\mathbf{K}$
  solution of the system: $\mathbf{v}_{i+1} = \mathbf{A}^{-1}\mathbf{y}_{i+1}$
  new approximation: $\mathbf{r}_{i+1} = \tilde{\mathbf{r}}_{i+1} + \alpha\Delta t\,\mathbf{v}_{i+1}$

the previously described algorithm is valid for linear problems, and one system of linear algebraic equations must be solved in every time step. the situation is more complicated for nonlinear problems, where a nonlinear system of algebraic equations must be solved in every time step. the high complexity of the problems leads to the application of the newton-raphson method as the most popular method for such cases. there are several ways to apply and implement the solver of nonlinear algebraic equations. we prefer equilibrium of forces and fluxes (computed and prescribed) in the nodes. this strategy is based on the equation

$$\mathbf{f}^{int} - \mathbf{f}^{ext} = \mathbf{0}, \quad (40)$$

where the vectors $\mathbf{f}^{int}$ and $\mathbf{f}^{ext}$ contain internal values and prescribed values. due to the nonlinear feature of the material laws used in the analysis, eq. (40) is not valid after the computation of new values from the equations summarized in table 1. there are nonequilibriated forces and fluxes which must be suppressed. when new values in the nodes are computed, the strains and gradients of the approximated functions can be established. this is done with the help of the matrices $\mathbf{B}_s$ and $\mathbf{B}_g$, where the particular partial derivatives are collected. concise expressions of the strains and gradients are written as

$$\boldsymbol{\varepsilon} = \mathbf{B}_s\mathbf{r}, \qquad \mathbf{g} = \mathbf{B}_g\mathbf{r}. \quad (41)$$

new stresses and fluxes are obtained from the constitutive relations

$$\boldsymbol{\sigma} = \mathbf{D}_\sigma\boldsymbol{\varepsilon}, \qquad \mathbf{q} = \mathbf{D}_q\mathbf{g}, \quad (42)$$

where $\mathbf{D}_\sigma$ is the stiffness matrix of the material and $\mathbf{D}_q$ is the matrix of conductivity coefficients. the real nodal forces and the discrete fluxes on an element are computed from the relations

$$\mathbf{f}^{int}_e = \int_{\Omega_e}\mathbf{B}_s^T\boldsymbol{\sigma}\,d\Omega, \qquad \mathbf{q}^{int}_e = \int_{\Omega_e}\mathbf{B}_g^T\mathbf{q}\,d\Omega. \quad (43)$$

these are compared with the prescribed nodal forces and discrete fluxes, and their differences create the vector of residuals $\mathbf{r}_{res}$. increments of the vector $\mathbf{v}$ are computed from the equation

$$\left(\mathbf{C} + \alpha\Delta t\,\mathbf{K}\right)\Delta\mathbf{v} = \mathbf{r}_{res}. \quad (44)$$

if the matrices $\mathbf{C}$ and $\mathbf{K}$ are updated after every computation of a new increment from eq. (44), the full newton-raphson method is used. if the matrices are updated only once after every time step, the modified newton-raphson method is used.
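as a minimal python sketch of the linear time-stepping loop of table 1, assuming already assembled constant matrices (a hypothetical small system, not the coupled problem of this paper):

```python
import numpy as np

def integrate_linear(K, C, f, r0, v0, dt, alpha, n_steps):
    """generalized trapezoidal scheme of table 1 for K r + C dr/dt = f(t).

    K, C : (ndof, ndof) system matrices
    f    : callable returning the load vector at a given time
    alpha: 0 -> explicit (conditionally stable), 1 -> fully implicit
    """
    r, v = r0.copy(), v0.copy()
    A = C + alpha * dt * K                       # matrix of the system
    for i in range(n_steps):
        r_pred = r + (1.0 - alpha) * dt * v      # predictor
        y = f((i + 1) * dt) - K @ r_pred         # right hand side vector
        v = np.linalg.solve(A, y)                # solution of the system
        r = r_pred + alpha * dt * v              # new approximation
    return r, v
```

for a nonlinear problem, the loop body would additionally iterate eq. (44) until the residual of eq. (40) is suppressed.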
6 numerical example

a simple benchmark test is used to show the differences between a linear and a nonlinear solution. we have a concrete sealed specimen under high temperature conditions, fig. 1. for this example, concrete is considered as a non-deforming medium (without a displacement field). the material model for ordinary concrete presented by baroghel-bouny et al. [8], extended to high temperatures, is used. the results are compared after 1, 2 and 4 hours.

fig. 1: problem description. finite elements: 50 quadrilaterals with linear approximation functions. initial conditions: $T_0$ = 298.15 K, $p_{c0}$ = 9.0·10⁷ Pa (50 % RH), $p_{g0}$ = 101 325 Pa. boundary conditions: $T$ – heat flux $q_T$ = 1 K·min⁻¹ (up to 475 K), $\alpha_c$ = 20 W·m⁻²·K⁻¹ (both sides); $p_c$ – cauchy's, side 1: RH = 10 %, $\beta_c$ = 0.02 m·s⁻¹, side 2: RH = 5 %, $\beta_c$ = 0.01 m·s⁻¹; $p_g$ – dirichlet's, $p_g$ = 101 325 Pa on both sides.

fig. 2: temperature profile, T (K) versus x (cm)
fig. 3: water vapour pressure profile versus x (cm)
fig. 4: capillary pressure profile versus x (cm)
fig. 5: degree of saturation profile versus x (cm)

7 conclusion

hydro-thermo-mechanical problems are extremely complicated due to many nonlinearities, and therefore only numerical methods are used for two- and three-dimensional problems. a suitable method is the finite element method, where each node has 6 degrees of freedom in three-dimensional problems (3 displacements, temperature and 2 pore pressures). the system of linear algebraic equations (after discretization and linearization) contains many unknowns, and an appropriate solver must be used. sequential computer codes have severe difficulties with memory and cpu time in connection with such systems. therefore parallel computers are more often used in complicated analyses. probably the most powerful tool for solving large systems of equations with the help of parallel computers is the family of domain decomposition methods. transport processes lead to nonsymmetric indefinite systems. this means that the usual methods, such as $LDL^T$ decomposition, do not work for such problems, and a more general $LU$ decomposition must be used. it seems to us that there are problems where pivoting is necessary.

acknowledgment

financial support for this work was provided by gačr, contract no. 103/03/d145, and by the eu "maecenas" project.

references

[1] lewis r. w., schrefler b. a.: the finite element method in static and dynamic deformation and consolidation of porous media. john wiley & sons, chichester–toronto, 1998 (492 p.).
[2] fatt i., klikoff w. a.: effect of fractional wettability on multiphase flow through porous media. note no. 2043, aime trans., 216, 246, 1959.
[3] bittnar z., šejnoha j.: numerical methods in structural mechanics. asce press and thomas telford, ny, london, 1996 (422 p.).
[4] zienkiewicz o. c.: basic formulas of static and dynamic behaviour of soils and other porous media. institute of numerical methods in engineering, university college of swansea, 1983.
[5] krejčí t., nový t., sehnoutek l., šejnoha j.: structure – subsoil interaction in view of transport processes in porous media. ctu reports, vol. 1 (2001), no. 5.
[6] wang x., gawin d., schrefler b. a.: a parallel algorithm for thermo-hydro-mechanical analysis of deforming porous media. computational mechanics 19, springer-verlag, 1996, p. 94–104.
[7] kruis j., krejčí t., bittnar z.: numerical solution of coupled problems. in proceedings of the ninth international conference on civil and structural engineering computing, b. h. v. topping (editor), civil-comp press, stirling, united kingdom, paper 24, 2003.
[8] baroghel-bouny v., mainguy m., lassabatere t., coussy o.: characterization and identification of equilibrium and transfer moisture properties for ordinary and high-performance cementitious materials. cem. and conc. res., vol. 29 (1999), p. 1225–1238.

ing. tomáš krejčí, ph.d., phone: +420 224 354 309, e-mail: krejci@cml.fsv.cvut.cz, department of structural mechanics, czech technical university in prague, faculty of civil engineering, thákurova 7, 166 29 prague 6, czech republic

acta polytechnica 56(2):147–155, 2016, doi:10.14311/ap.2016.56.0147, © czech technical university in prague, 2016

elasto-kinematic model of suspension with flexible supporting elements

tomáš vrána a,∗, josef bradáč b, jan kovanda c

a department of vehicles and ground transport, faculty of engineering, culs prague, kamýcká 129, 165 21 praha – suchdol, czech republic
b department of automotive technology, škoda auto university, na karmeli 1457, 293 60 mladá boleslav, czech republic
c department of security technologies and engineering, faculty of transportation sciences, ctu in prague, konviktská 20, 110 00 praha 1, czech republic
∗ corresponding author: vrana.ds@seznam.cz

abstract. this paper analyzes the impact of the flexibility of individual supporting elements of an independent suspension on its elasto-kinematic characteristics. the toe and camber angle are the geometric parameters of the suspension, whose waveforms and their changes under the action of vertical, longitudinal and transverse forces affect the stability of the vehicle. to study these dependencies, a computational multibody system (mbs) model of the axle suspension is created in the hyperworks system. finite-element-method (fem) models reflecting the flexibility of the main supporting elements are implemented. these are the subframe, the longitudinal arms, the transverse arms and the knuckle. the flexible models are developed using component mode synthesis (cms) by craig-bampton. the model further comprises force elements, such as helical springs, shock absorbers with a wheel stop and the anti-roll bar. rubber-metal bushings are modeled flexibly, using nonlinear deformation characteristics. the simulation results are validated by experimental measurements of the geometric parameters of a real suspension.

keywords: independent suspension, mbs model, flexible model of supporting elements, wheel toe, wheel camber, hyperworks.

1. introduction

nowadays, vehicles are equipped with powerful power units and achieve very high speeds. at the same time, the number of vehicles is significantly increasing, which causes a high traffic density on the roads and thus increases the number of accidents and the traffic safety problem.
it is important to study not only passive safety (reducing the consequences of road traffic accidents), but mainly active safety, which aims to prevent traffic accidents. this technical discipline is closely related to vehicle dynamics. a mathematical description of vehicle behavior can be found in publications [1, 2], which generally show the importance of the topic. specifically, publication [3] deals with the creation of an mbs computational model of the whole vehicle using a non-linear fem model of the tire, which is then used for simulations and research of vehicle behavior on rough road surfaces. driving characteristics and vehicle behavior are affected by many aspects. one of the main aspects is the elasto-kinematic characteristic of the suspension during different loading modes. high attention should be paid not only to the investigation of the vehicle behavior, but also to the field of suspension elasto-kinematics. it is important to deal with the study and preparation of computational models of suspensions and to increase their accuracy and efficiency. the term elasto-kinematic characteristics of the suspension is defined as the change of the geometrical parameters of the suspension (toe angle, camber angle) due to the action of the wheel load in the vertical, longitudinal and transverse directions of the vehicle. for the study of the kinematic characteristics of the suspension mechanism, without considering flexibility, it is most effective to use methods of transformation matrices [4]. the most commonly used method is the disconnected loop method, or the method of removing the body, which leads to a simpler mathematical solution thanks to the lower number of equations. a kinematic analysis of the mechanism in the case of the mcpherson, wishbone and five-link suspensions is shown in [5–7]. the influence of the positioning of the kinematic points on the camber and toe angle of an independent multi-link suspension is shown in [8]. the computational model is created in the hyperworks system and includes a flexible model of the longitudinal arm. dependencies of the geometrical parameters on the vertical movement of the wheels are provided for the movement of the kinematic points in the vertical and transverse direction by the value of ±1 and ±2 mm. a sensitivity analysis of the elasto-kinematic and dynamic properties on the deformation characteristics of the bushings and supporting elements for the independent mcpherson suspension is shown in [9]. the mbs computational model, taking into account the flexibility of the rubber-metal bushings, was created in the msc.adams/car system. the flexibility of the supporting elements was not taken into account in this model. another work [10] proposes and explores a verification method to validate the results of simulations of the elasto-kinematic characteristics of the front mcpherson axle suspension. in [11], there is presented the concept of a new mechanism of independent axle suspension, which was created during the optimization of the shape of the contact surface between the tire and the road. the suspension kinematics is described by equations for a closed loop mechanism, solved using the determinant of the sylvester matrix. mbs models that take into account the flexibility of the supporting elements of the suspension are not used in the automotive industry because of the intensity and complexity of their preparation; thus the knowledge of how their flexibility affects the elasto-kinematics of the suspension is missing.
the aim of this paper is to show and describe the effect of the flexibility of the supporting components of an independent suspension on its elasto-kinematic properties in defined loading modes. to achieve this goal, several variants of the computational model of an independent rear suspension were used, reflecting the deformation characteristics of the rubber-metal bushings and the flexibility of suspension parts such as the subframe, the longitudinal and transverse arms and the knuckle. the simulations clearly show the influence of the flexibility of the individual supporting elements on the elasto-kinematic behaviour of the suspension, which is described by the toe angle δ and the camber angle γ at a defined load. the results, compared with the model containing absolutely rigid supporting elements, are validated by experimental measurements. the presented computational model was created in several modules of the altair hyperworks system [12].

2. mbs model in hyperworks

a multi body system (mbs) model is a mechanically coupled system, which consists mostly of absolutely rigid bodies, but may also contain fem models that can have flexible behaviour. all the elements are connected by linkages, which may be kinematic or elasto-kinematic with a flexible description. according to the kinematic structure, it is possible to determine the number of degrees of freedom dof of the system using the equation

$$\text{dof} = 6(n_u - 1) - 5(r_o + s_l) - 3 s_p, \quad (1)$$

where $n_u$ is the number of elements of the suspension mechanism including the frame, $r_o$ is the number of rotational kinematic pairs (kp), $s_l$ is the number of translational kp and $s_p$ is the number of spherical kp. the mechanical system of elements is described by the dependent coordinates $q_i$, whose number $n > \text{dof}$, arranged into a vector of dependent coordinates $\mathbf{q}$ according to (2),

$$\mathbf{q} = [q_1, q_2, \ldots, q_i, \ldots, q_n]^T. \quad (2)$$

the mathematical solver motionsolve of the hyperworks system assembles the equations of motion for the mechanical system, created in the mbs model in the pre-processor motionview, with the help of the lagrange equations [13, 14] of mixed type written in the matrix form (3),

$$\frac{d}{dt}\left(\frac{\partial E}{\partial \dot{\mathbf{q}}}\right) - \frac{\partial E}{\partial \mathbf{q}} = \mathbf{Q} + \frac{\partial \mathbf{f}^T}{\partial \mathbf{q}}\boldsymbol{\lambda}, \quad (3)$$

where $E$ is the kinetic energy of the mechanical system, $\mathbf{Q} = [Q_1, Q_2, \ldots, Q_i, \ldots, Q_n]^T$ is the vector of generalized forces, $\mathbf{f} = [f_1, f_2, \ldots, f_k, \ldots, f_r]^T$ is the vector of holonomic binding conditions and $\boldsymbol{\lambda} = [\lambda_1, \lambda_2, \ldots, \lambda_k, \ldots, \lambda_r]^T$ is the vector of lagrange multipliers. the number of lagrange equations in the matrix form (3) is then dof + r, where $r = n - \text{dof}$. the lagrange equations (3) must be supplemented by coupling conditions that can be written in the matrix equation (4),

$$\mathbf{f}(\mathbf{q}) = [f_1(\mathbf{q}), f_2(\mathbf{q}), \ldots, f_k(\mathbf{q}), \ldots, f_r(\mathbf{q})]^T = \mathbf{0}. \quad (4)$$

these coupling conditions, in the total number of r, represent a geometric definition of the kinematic pairs connecting the body system [14]. the system of differential-algebraic equations formed by the lagrange equations (3) and the coupling conditions (4) is solved in the mathematical tool motionsolve by numerical mathematics. the unknown vector of dependent coordinates $\mathbf{q}$ in the individual iteration steps is found by a dae integrator, which is designated in hyperworks as dstiff [15].

3. mbs computational model of independent suspension

the examined mbs model of an independent suspension was created in the pre-processor motionview of the hyperworks system. the complex computational model was created by inserting flexible fem models into the mbs model, see figure 1.

figure 1. computational mbs model with implemented fem flexible models of supporting elements.

3.1. creation of mbs model

a special type of suspension was chosen for the model. it is a kinematically determined mechanism according to the diagram in figure 2. this means that for the kinematic functionality (dof = 1) a certain defined flexibility of the longitudinal arms 2, which have to deform in the longitudinal direction during the vertical motion of the wheel (wheel centre – point wc), is required. when the calculation model contains the description of an absolutely rigid longitudinal arm 2, this arm must be connected to the knuckle 6 via a rotary coupling to fulfil dof = 1.
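as a minimal illustration of the grübler-type count of eq. (1), the following python function evaluates the dof of a mechanism; the pair counts in the example call are hypothetical and serve only to show the arithmetic, they are not the counts of the suspension studied here.

```python
def dof(n_bodies, n_rot, n_sli, n_sph):
    """grübler/kutzbach count of eq. (1): 6(nu - 1) - 5(ro + sl) - 3sp."""
    return 6 * (n_bodies - 1) - 5 * (n_rot + n_sli) - 3 * n_sph

# hypothetical example: 7 bodies (incl. frame), 3 rotational,
# 0 sliding and 6 spherical pairs -> 6*6 - 5*3 - 3*6 = 3
print(dof(7, 3, 0, 6))
```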
it is kinematical determined mechanism according to the diagram in figure 2. this means that for the kinematic functionality (dof = 1) the certain defined flexibility of longitudinal arms 2, which have to deform in the longitudinal direction during the vertical motion of the wheel (wheel centre – point wc), is required. when the calculation model contains the description of absolutely rigid longitudinal arm 2, this arm must be connected to 148 vol. 56 no. 2/2016 model of suspention with flexible supporting elements figure 1. computational mbs model with implemented fem flexible models of supporting elements. figure 2. computational mbs model with implemented fem flexible models of supporting elements. the knuckle 6 via a rotary coupling to fulfil dof = 1. elast-flex la model was created as the basic model with flexible longitudinal arms (la), which have to be joined to the knuckle via the fix links to correspond to the real suspension. other flexible bodies were implemented into the models, such as the upper transverse arm (uta), the lower transverse arm (lta), the help rod (hr), knuckle (kn) and the subframe (sub). thus the complex elasto-kinematic model elast+flex all, containing the description of all the supporting elements and other flexible bodies, was created. for example, it is the model elast+flex la+flex sub (flexible rubber-metal bushings, flexible longitudinal arms and subframe) or elast+flex la+flex lta (flexible bushings, longitudinal arms and lower transverse arms). the model elast+rigid (flexible bushings, supporting elements absolutely rigid) was also created. it is used for the comparison with models comprising flexible bodies. the position of suspension mechanism was defined in motionview by cartesian coordinates of kinematic points (table 1). these data were obtained from measurements on the real vehicle. because of the symmetry of the model around the x axis in cgs only the left side of the model is displayed. the mbs model includes also elastic elements (helical spring with linear stiffness of 30 n/mm and torsion stabilizer, shock absorber and bumper) that affect the kinematic type kp coordinate in gcspoint x [mm] y [mm] z [mm] a1 ball 2098 −603 36 a2t fix 2402 −595 27 a2b fix 2402 −595 −33 b1 ball 2480 −365 −21 b2 ball 2502 −678 −40 c1 ball 2534 −380 143 c2 ball 2540 −685 130 d1 ball 2805 −105 −10 d2 ball 2790 −685 −45 nkp fix 2410 −482 72 nkz fix 2860 −485 95 table 1. type and position of kp global coordinate system (gcs). rubber-metal bushings properties and influence the elasto-kinematic behaviour of suspension. deformation properties of rubber-metal bushings are described in the same way for all variants of the model and thus by means of a force-deformation dependence. all characteristics for each bushing were experimentally measured for six load cases. figure 3 shows the composition and method of measuring the load to determine the radial deformation characteristics dx = f(fx) a1-bushing. figure 3. measurement of deformation characteristics dx = f(fx) of rubber-metal bushing. 3.2. flexible models of suspension supporting elements flexible models of the supporting elements of the suspension are generated by component mode synthesis (cms), which is processed in the tool flexprep in motionview. cms transfers the body from node to modal representation that characterizes its flexible behaviour using natural frequencies and natural shapes. 
craig-bampton method [12, 16] was chosen as the cms method that uses the linear combination of modal shapes φ and vector of modal coordinates s and approximates the vector of linear displacement dn of fem network according to (5) dn = φ ·s. (5) 149 t. vrána, j. bradáč, j. kovanda acta polytechnica during the creation of flexible models of supporting elements the 3d cad models in the module hypermesh with precise geometry were created. they were subsequently discretized in hypermesh module by finite element method. figure 4 shows the fem model of help rod (hr). figure 4. fem model of help rod (hr). the network was created on the base of combination of triangular and quadrangular elements. surface network with pshell elements and assigned thickness was used for arms and the subframe. for discretization of knuckle volume, the network elements type psolid were used. the size of elements was chosen to be 5 mm. the number of elements and nodes of the network and their thicknesses for analyzed flexible elements are shown in table 2. the total number of elements for all models of supporting elements is 136 103. element no. of nodes no. of elements thickness [–] [–] [mm] sub 24 238 24 309 3.5 la 744 698 3 lta 6 300 6 118 2.5 uta 2 526 2 498 3.5 hr 1 211 1 103 3.0 kn 11 692 45 480 table 2. number of elements and nodes for flexible supporting elements. knuckle material is cast iron with modulus of elasticity e = 1.76 × 105 mpa and the poisson number µ = 0.275. for subframe and arms it is steel with e = 2.10 × 105 mpa and µ = 0.3. the first three calculated natural shapes for the help rod (hr) are shown in figure 5. the value of the first natural frequency corresponding to the first shape is 390 hz, the second natural frequency is 683 hz and the third is 923 hz. the natural frequency of other supporting elements is shown in figure 6. the lowest natural frequency is for subframe (sub), while the highest value is for knuckle (kn). created models with flexible elements were implemented to the mbs model. output file of fem models of supporting elements exported from hypermesh has an extension .fem and enters the tool flexprep together with the definition of rbe2 spider. the output is then the final flexible model in h3d format. figure 5. first three calculated natural shapes for the help rod (hr), a) undeformed state, b) first modal shape (torsion), c) second modal shape (bend around x axis), d) third modal shape (bend around z axis). figure 6. comparison of first three natural frequencies [hz] analyzed supporting elements. 3.3. creation of interface nodes before inserting flexible fem models into mbs model, the reference nodes, so-called rbe2-spiders, were created. these help entities join the mbs kinematic model (table 1) with the corresponding nodes of fem simulation models of flexible supporting elements. figure 7. rbe2 spiders for help rod. rbe2-spiders for the help rod (hr) are shown in figure 7. it is the point rbe2-b1, which joins the kinematic point b1 (the node no. 6 777) with 104 nodes of the inner arm housing at subframe. rbe2b2 joins the kinematic point b2 (node 6 778) with 104 nodes of the knuckle housing. in the knuckle, there is also created inverse rbe2-b2, which joins the point b2 (node no. 31 061) with 134 nodes of knuckle hole 150 vol. 56 no. 2/2016 model of suspention with flexible supporting elements for housing the help rod (hr). other rbe2-spiders of other flexible bodies are created in a similar manner in hypermesh module. the model of suspension thus contains 34 rbe2-spiders. 4. 
4. simulation computations

the elasto-kinematic properties of the computational models were simulated using the mathematical tool motionsolve. for the calculations, the time interval t = 〈0; 80〉 s and the time step ∆t = 0.05 s were set. the simulation model of the wheel suspension was gradually loaded by three loading modes according to figure 8.

figure 8. definition of the loading states for the independent suspension.

in the first mode, the wheel support is loaded by a vertical force fv, which causes a vertical movement of the wheel wz = 〈−105; 105〉 mm. the edge values ±105 mm represent the upper and lower stops during the vertical movement of the wheel. in the two other modes, the wheel support is loaded by a side force that varies in the interval fl = 〈−10 000; 10 000〉 n and by a longitudinal braking force in the range fb = 〈0; 10 000〉 n. furthermore, a static radius of the tire of 286 mm, a radial stiffness of 225 n/mm, a vehicle mass of 1487 kg and a wheelbase of 2550 mm were set up.

5. validation of mbs model

the validation of the computational model for the calculation of the elasto-kinematic characteristics, taking into account the flexibility of the supporting elements, was carried out at two levels. first, the moments of inertia calculated from the models of the deformable bodies in hyperworks were compared with the measured values of the moments of inertia of the real suspension elements. in the second step, an experimental measurement of the geometrical parameters of the wheel suspension was prepared for the validation of the simulation results.

5.1. validation of inertia moments of suspension supporting elements

the moments of inertia of the supporting elements calculated from the geometrical models in hypermesh have been verified by experimental measurements on a so-called torsion hanger [9]. this measuring device (figure 9) consists of a lightweight circular plate having a diameter d = 0.8 m, hinged at points a, b, c using thin cords of length l = 3.5 m. the suspension points are located at a pitch circle of radius r = 0.375 m.

figure 9. measurement of the inertia moment izz of the subframe using the torsion hanger.

the aim is to find the moments of inertia about axes that pass through the center of gravity and are parallel to the axes of the gcs. using this definition, the moments of inertia are calculated in hypermesh and transferred to motionview. the measured body is mounted on the plate with its center of gravity g placed above the center of the plate s, so that the axis of rotation of the plate (in figure 9 marked as o) is identical with the axis about which the inertia is being found. when we are looking for the moment of inertia izz, the subframe axis coincides with the suspension axis o. after the motion starts, the hanger oscillates torsionally and the time period t is measured with the help of a stop watch. from the equality of the kinetic and potential energy, the final relation (6) is found. this relation is used to calculate the moment of inertia $J_2$ (kg m²) of the particular supporting element,

$$J_2 = \frac{(m_1 + m_2)\, g\, r^2\, T^2}{4\pi^2 l} - J_1, \quad (6)$$

where $m_1$ (kg) is the weight of the plate of the torsion hanger, $m_2$ (kg) is the weight of the measured body, g = 9.8066 m s⁻² is the acceleration of gravity, r (m) is the pitch circle radius, T (s) is the period time, $J_1$ (kg m²) is the inertia moment of the plate of the hanger and l (m) is the cord length. the calculated values of the inertia moments of the subframe (sub) in hypermesh compared with the real measurement are shown in table 3. the highest difference was found for the inertia moment iyy, of 7.9 %.

table 3. moments of inertia of the subframe.

subframe    | ixx [kg m²] | iyy [kg m²] | izz [kg m²]
computation | 1.186 | 0.239 | 1.386
measurement | 1.221 | 0.258 | 1.432
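eq. (6) is simple enough to be evaluated directly; a minimal python sketch follows, where the plate inertia, body weight and measured period are hypothetical numbers chosen only to show the usage (the cord length and pitch radius match the hanger described above).

```python
import math

def inertia_from_period(m1, m2, r, length, period, j1, g=9.8066):
    """eq. (6): moment of inertia of a body on a three-cord torsion
    hanger, computed from the measured oscillation period."""
    return (m1 + m2) * g * r**2 * period**2 / (4.0 * math.pi**2 * length) - j1

# hypothetical: plate 2 kg with j1 = 0.16 kg m^2, body 25 kg,
# r = 0.375 m, cords 3.5 m, measured period 1.9 s
print(round(inertia_from_period(2.0, 25.0, 0.375, 3.5, 1.9, 0.16), 3))
```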
experimental measurement to validate dependencies of the elasto-kinematic parameters of the suspension obtained from simulations, 151 t. vrána, j. bradáč, j. kovanda acta polytechnica subframe ixx [kg m2] iyy [kg m2] izz [kg m2] computation 1.186 0.239 1.386 measurement 1.221 0.258 1.432 table 3. 3 moments of inertia of subframe. the experimental measurements on the vehicle using a test bench beissbarth 1995 + vas5080 were carried out. this device is designed to measure the geometrical parameters of suspension. the parameters, like the toe angle δ and camber angle γ, depending on the vertical movement of the wheel dz, were measured. the vehicle is established on the base plate of measuring stations. wheel suspension stands on sliding supports (335 × 280 mm), allowing the wheels side-shift (change of axles track). changing ∆dz enables to improve measurement accuracy. measuring heads were installed on the rims, joined via linkage and established horizontally. each head has two cdd cameras which transmit infrared light beam to measure geometric parameters of the wheel. measurement configuration is shown in figure 10. figure 10. configuration of the experimental measurement on the left side of the vehicle at beissbarth testing device. the measured vehicle is encumbered and unencumbered in order to move with the center of the wheel by a step of ∆dz = 10 mm with the interval dz = 〈−105, 105〉mm, that is limited by the upper and lower stop. measured values of δ = f(dz ) and γ = f(dz ) are used for the comparison with the simulation results. 6. results and discussion simulation results for individual variants of mbs computational model are geometrical parameters of axle suspension under the action of force load, as it is shown in figure 8. further outputs are the graphical maps of the distribution of stresses and deformations of the examined supporting elements (figure 1) which can be used during design, construction and optimization. 6.1. effect of supporting elements flexibility during wheel vertical motion calculated dependencies of toe angle δ = f(dz ) and the camber angle γ = f(dz ) for vertical movement of the wheel dz for the individual variants of computational model are shown in figure 11 and figure 12. figure 11. dependence of δ = f(dz ) for suspension with flexible supporting elements. figure 12. dependence of γ = f(dz ) for suspension with flexible supporting elements. they represent the main characteristics of elastokinematic properties of the suspension. toe shape resembles an inverted letter s with the linear region in an interval around dz = 0 mm. from calculated values, it is evident that the flexibility of supporting elements in mbs models (elast+flex) strongly affects the value and shape of the toe angle δ = f(dz ) and vary from the model with rigid elements (elast+rigid). each element influences the course in another way. the 152 vol. 56 no. 2/2016 model of suspention with flexible supporting elements basic model elast+flex la with flexible longitudinal arms differs most in the lower stop for dz = −70 mm, up to 139 % compared to the model elast-rigid. flexibility of the arm uta and hr affects the toe angle δ very slightly and shows the same behaviour as the model elast+flex la. flexibility of knuckle (kn) and lta arm moves at the limit dz = 0, the toe angle to δ = 0.04° and δ = 0.18° towards lower values. subframe flexibility causes slope of the toe in linear segment. 
the tangent is 0.71 compared to the value of 0.48, which was found for the model elast+rigid with absolutely rigid supporting elements. complex flexible model elast+flex all shows the movement of toe 0.12 deg towards lower values and the steepest linear section of the directive 1.21. different courses of the toe angle for models in figure 11 are caused by the flexibility of the supporting elements. the analysis shows that each suspension element affects the course δ = f(dz ) differently. the camber angle, on the other hand, does not differ for individual variants. calculated values coincide very well with experimental measurements. 6.2. effect of supporting elements flexibility during action of side force elasto-kinematic properties of suspension with flexible supporting elements during action of side force fl show parameters δ = f(fl) in figure 13 and camber γ = f(fl) in figure 14. figure 13. dependence of δ = f(fl) for suspension with flexible supporting elements. toe angle δ, with decreasing longitudinal force fl, decreases linearly up to the border fl ∼ −6 850 n, then it starts progression due to the nonlinear characteristics of bushings. the model taking into account arm flexibility lta and model elast flex all differ from the model elast+rigid. parameter fl = 0 is shifted by δ = 0.18° and δ = 0.12° towards the lower values. the different values of the toe as well as for δ = f(dz ) are caused by the flexibility of particular suspension elements. the camber angle γ after a few sharp breaks in the value of fl = −6 480 n linearly decreases (negative values indicate negative camber). as it turns out, the biggest changes in camber are made by flexibility of the subframe and the knuckle. during the action of force fl = 4 800 n, formed by the rigid model of elast+rigid, the camber angle is γ = −1.29°. for the flexible model elast+flex all it is γ = −0.87°. figure 14. dependence of δ = f(fl) for suspension with flexible supporting elements. figure 15 shows the comparison of calculated variations with flexible bodies, in terms of changes in the geometric parameters of the linear section relative to the load change in ∆fl = 1 kn. the greatest variaton of camber can be seen during the effect of the lateral force on the elast+flex all model ∆γ = 0.257°/1 kn which is a difference of 70.2 %, when compared to the rigid model. the highest change in the toe angle ∆δ = 0.119°/1 kn occurs in the model taking into account the compliance of subframe. figure 15. change of toe angle δ and camber γ during action of fl force. 6.3. effect of supporting elements flexibility during action of longitudinal force calculated dependences of δ = f(fb) in figure 16 and γ = f(fb) in figure 17 show the elasto-kinematic characteristics of the suspension under the action of 153 t. vrána, j. bradáč, j. kovanda acta polytechnica the longitudinal braking force fb. toe angle δ increases to approximately fb < 1 225 n, after this break, it continues into linearly decreases. models with flexible arms la, uta, hr and knuckle (kn) have the same characteristics like the model with rigid components. lta model and fully flexible model elast+flex all show for fb = 4 800 n the toe angle δ = −0.149° and δ = −0.292° compared to the rigid model with δ = 0.111°. taking into account the flexibility of supporting elements thus generates considerable deviation from the rigid model. figure 16. dependence of δ = f(fb) for models with flexible supporting elements. 
the camber angle γ drops within the boundary fb = 1 255 n, and for fb > 1 255 n it increases linearly for all the studied variants except elast+flex all. on the contrary, if we look at the model that takes into account the compliance of all the elements, γ is gradually decreasing linearly in the interval fb ∈ (1 225; 10 000〉 n. the consideration of the knuckle flexibility in the model elast+flex la+flex kn causes different characteristics compared to the models with flexible arms and the elast+rigid model. the highest variation during the action of the force fb = 4 800 n compared to the rigid model (γ = −2.06°) was found for the model with the subframe flexibility (γ = −2.13°). the different values of the toe angle and the camber angle for fb = 0 n are caused by the static vertical load, which is determined by the weight of the vehicle. the elasto-kinematic change of the geometric parameters during the action of the longitudinal force fb is shown in figure 18. during this load, the highest change in the toe, ∆δ = −0.128°/1 kn, was found for the model elast+flex all, which is characterized by the lowest change of the camber angle, ∆γ = 0.003°/1 kn. the highest change in the camber, ∆γ = 0.025°/1 kn, was found for the model elast+flex la+flex sub, which contains the flexible model of the subframe.

figure 17. dependence of γ = f(fb) for the models with flexible supporting elements.

figure 18. changes of the toe angle δ and the camber angle γ during the action of the longitudinal force fb.

7. conclusion

this paper deals with the construction of an mbs computational model of an independent suspension, in which the flexibility of the rubber-metal bushings, but also the flexibility of the supporting elements (subframe, knuckle, longitudinal and transverse arms), is implemented. the model is then used to analyse the impact of the flexibility of these elements on the elasto-kinematic characteristics. the flexible fem models, discretized to 136 103 elements, were created by component mode synthesis by the craig-bampton method and linked to the mbs model using rbe2-spiders. the simulations show the impact of the flexibility of the supporting elements on the elasto-kinematic properties. it was shown that the implementation of the flexibility of the individual components strongly influences the waveforms of the geometrical parameters, and the final results significantly differ from the rigid model. the experimental measurement corresponds well with the model elast+flex all, which includes the flexible models of all the supporting elements. this model shows the highest changes of the geometric parameters during the load. significant changes can also be observed in the case of the model elast+flex la+flex sub reflecting the subframe flexibility. during the calculations of the elasto-kinematic characteristics of a suspension, it is very useful to take into account the flexibility of the supporting elements. thus it is possible to achieve the required accuracy of the results.

references

[1] m. mitschke, h. wallentowitz. dynamik der kraftfahrzeuge. springer fachmedien wiesbaden, 2014. doi:10.1007/978-3-658-05068-9.
[2] k. popp, w. schiehlen. ground vehicle dynamics. springer berlin heidelberg, 2010. doi:10.1007/978-3-540-68553-1.
[3] c. mousseau, t. laursen, m. lidberg, r. taylor. vehicle dynamics simulations with coupled multibody and finite element models. finite elements in analysis and design 31(4):295–315, 1999. doi:10.1016/s0168-874x(98)00070-5.
[4] v. stejskal, m. valášek. kinematics and dynamics of machinery. marcel dekker, new york, 1996.
[5] d. schramm, m. hiller, r. bardini.
modellbildung und simulation der dynamik von kraftfahrzeugen. springer berlin heidelberg, 2010. doi:10.1007/978-3-540-89315-8.
[6] j. knapczyk, m. maniowski. elastokinematic modeling and study of five-rod suspension with subframe. mechanism and machine theory 41(9):1031–1047, 2006. doi:10.1016/j.mechmachtheory.2005.11.003.
[7] j. knapczyk, s. dzierzek. displacement and force analysis of five-rod suspension with flexible joints. journal of mechanical design 117(4):532–538, 1995. doi:10.1115/1.2826715.
[8] t. vrána, j. bradáč, j. kovanda. study of wheel geometrical parameters for single-axle suspension by using elasto-kinematic model. in v. adámek, m. zajíček, a. jonášová (eds.), computational mechanics 2015: 31st conference with international participation, špičák, czech republic, november 9–11, 2015, pp. 127–128. 2015.
[9] p. vokál. kinematika a dynamika zavěšení přední nápravy typu mcpherson s uvažováním poddajností. ph.d. thesis, čvut v praze, 1996.
[10] j. sajdl. elastokinematický model přední nápravy a metody jeho verifikace. ph.d. thesis, čvut v praze, 2009.
[11] l. heuze, p. ray, g. gogu, et al. design studies for a new suspension mechanism. in proceedings of the institution of mechanical engineers, vol. 217, pp. 529–535, 2003. doi:10.1243/095440703322114915.
[12] m. goelke. practical aspects of multi-body simulation with hyperworks. altair university, michigan, 2015. available on: http://www.altairuniversity.com/14225-practical-aspects-of-multi-body-simulation-with-hyperworks-free-book/.
[13] j.-c. samin, p. fisette. symbolic modeling of multibody systems. springer netherlands, 2003. doi:10.1007/978-94-017-0287-4.
[14] v. stejskal, j. brousil, s. stejskal. mechanika iii. čvut v praze, 1993.
[15] motionsolve reference guide 11.0. altair engineering, michigan, 2011. available on: http://www.altairhyperworks.com/clientcenterhwtutorialdownload.aspx.
[16] j.-c. samin, p. fisette. multibody dynamics: computational methods and applications. springer netherlands, 2013. doi:10.1007/978-94-007-5404-1.
acta polytechnica 61(1):199–218, 2021, doi:10.14311/ap.2021.61.0199, © 2021 the author(s), licensed under a cc-by 4.0 licence, published by the czech technical university in prague

new method evaluation of detail material and heat flows for single string cement clinker plant

prihadi setyo darmanto a,∗, i made astina a, alfian kusuma wardhana a, alfi amalia b, arief syahlan b

a institut teknologi bandung, faculty of mechanical and aerospace engineering, jalan ganesha 10, 40132 bandung, indonesia
b indonesia cement and concrete institute, jalan ciangsana raya, 16968 bogor, indonesia
∗ corresponding author: prihadisetyo@gmail.com

abstract. the material flow in each main equipment of a cement clinker plant, which is very useful for controlling the process, is impossible to measure during operation due to the very high temperatures. this paper intends to overcome the difficulties associated with the measurement of these material flow values. this study presents a new method of calculating the material flow (gas and solid) in each main equipment of a single string conventional suspension preheater type of cement clinker plant. using the proposed method, the mass flow rates at the clinker cooler, kiln, suspension preheater (sp) and even each cyclone separator can be calculated with a heat conservation error of less than 1 %. with the application of the least squares method for solving the overdetermined system of mass and heat conservation equations obtained for the cyclones of the sp, the flows of gas and solid materials entering and exiting each cyclone, which cannot be measured directly in the operating plant, can be approximated. based on the operation temperature data of the gas and solid flows monitored in the control room of an indonesian cement plant as a case study, the mass flow rates of gas and solid entering and exiting, as well as the separation efficiency of each cyclone, can be calculated.
the results show that the separation efficiencies of cyclones 1, 2, 3 and 4 are 95 %, 91.89 %, 84.09 % and 79.51 %, respectively. finally, this study will be very useful by providing data that are impossible to gather by a direct measurement in an operating plant, due to the very high process temperature constraint, for operational control needs, new equipment design, process simulation using computational fluid dynamics (cfd) software and even the modification of existing equipment. the proposed method can be applied to all types of modern cement clinker plant configurations, either with or without a calciner, including double strings.

keywords: mass & heat conservations, suspension preheater, cyclone, separation efficiency.

1. introduction

due to the fact that the processes in a cement manufacturing plant are highly exothermic, it is necessary to conduct a precise heat consumption analysis in an effort to optimize heat conservation and efficiency [1]. understanding the actual heat consumption and its conservation efforts have been main issues in the cement industries [2]. therefore, a heat audit at a cement plant, conducted to identify opportunities for decreasing the heat consumption, increasing the productivity and improving the production, has been carried out [3]. the heat consumption assessment of portland cement production in a thailand cement plant was also reported, and found 3.29 gj per ton of cement [4]. based on the audit results, there are several methods to improve the heat conservation in cement plants: improving the quality of kiln feed in the kiln, modifying the preheater cyclone, repairing the control system, utilizing waste heat and improving the combustion process at the kiln [5]. the realization of the utilization of the waste heat discharged into the environment through the preheater and cooler into 5.26 mw of electrical power in a cement plant in ethiopia has been reported [6]. previous studies using heat and exergy analyses in kilns [7], the preheater and calciner [8], as well as a thermodynamic analysis of the processes in a raw mill [9] and a heat transfer analysis of the preheater hot gas [10], found that there are many opportunities left unexplored for heat conservation. one of the notable efforts in heat conservation is using renewable materials as an alternative to fossil fuel, such as industrial waste [11], biomass (e.g. rice husk and bamboo) [12, 13] and tyres [14]. no less important is the use of alternative materials as a clinker substitution to reduce the production cost and maintain environmental sustainability (e.g. steelmaking slag [15], fly ash [16], pozzolanic material [17], limestone and micro-silica [18]). modification of processes and equipment was done in an effort to increase the heat efficiency. a top cyclone preheater modification was conducted to increase its separation efficiency and to reduce the return dust, which can contaminate the fine coal in the coal mill, as explored in [19, 20]. a modification of the process using co-fuel combustion and the modelling of the calcination process in the calciner were explored in [21, 22]. a modification
these studies used computational fluid dynamics (cfd) to understand the mechanism of the equipment modifications. cfd simulations require precise and accurate data, such as the mass flow rates of the gas and material entering the equipment [26, 27], but it is rarely mentioned where these data come from. as simulation results, the heat consumption, life cycle inventory and waste generated by a cement manufacturing plant can be evaluated profoundly [28, 29]. however, obtaining precise input data from a currently active cement manufacturing plant remains a prevailing issue. the issues are due to limited measurement instruments and methods and to the very high temperature conditions, which sometimes make the measurements impossible while the factory is running. the input data for the simulations can be obtained from the direct measurements that are possible in a cement plant. these data consist of operational parameters: feed rate, clinker production, fuel mass rate, air consumption rate for transporting fuel, clinker cooling air rate and equipment surface temperatures. meanwhile, the material flows between two interconnected devices cannot be measured directly when the plant is in operation. examples are the flows between the cyclones in the suspension preheater (sp) during the preheating of the materials, between a cyclone and the calciner during the calcination process, and between the kiln entrance and the clinker cooler. these mass conservation calculations, such as the input and output material flow rates of each cyclone stage and of the calciner, are necessary for the design and modification of cyclones and calciners. the difficulties in achieving direct measurements in cement manufacturing equipment are still critical due to technical limitations such as the high temperature, the dusty environment and others [30]. furthermore, the separation efficiency of each cyclone is closely related to the mass and heat flow rates in the whole sp. for example, to obtain cyclone separation efficiency values, the design values are normally used in the calculation of the mass and heat conservations [31]. however, this certainly does not match the condition of the real equipment when the factory is running. thus, a parametrization based on real operating values is necessary to obtain a realistic approach for the preheater system design modification. single string preheaters are still used in cement manufacturing plants. this technology consumes much thermal heat while having a low production capacity. in order to increase the fuel efficiency and the production, a plant modification using a newly designed calciner shall be conducted [32]. thus, the aim of this research is to propose a new method for estimating the cyclone separation efficiencies by utilizing detailed mass and heat conservation equations for each cyclone in the preheater system, without making direct mass flow rate measurements on the related equipment. to simplify the modelling and calculation, the mass and heat conservation equations of the cyclone preheaters are formed into a matrix and solved using the least squares method. the system of equations in this study is overdetermined and can only be solved by an approximation method. from the estimated cyclone separation efficiencies, the mass flow rates in each component of the cyclone system can be approximated.
the literature survey indicates that studies evaluating the detailed material and heat flows of the cyclone separators in the sp are limited in number and scope. generally, they are limited to the kiln [33–35], the clinker cooler [36] or all the main equipment of a cement plant as a whole [37]. an additional objective of this study is therefore to contribute detailed data required for equipment design, modification, control of the plant operation and detailed heat audits.

2. materials and method

2.1. materials

the sp configuration of a cement plant depends on the plant manufacturer. according to the classification proposed by schmidt [38], the configuration of the studied dry process cement plant is a conventional suspension preheater (sp) type, as shown in fig. 1, while a schematic diagram of the whole cement plant main equipment is presented in fig. 2. the sp consists of four cyclone separators and their riser ducts installed in series. fig. 1 shows the raw meal being inserted into the sp through the riser duct of the top cyclone, while the material coming out of the preheater is transported towards the kiln for further processing to become clinker. the fuel used is coal, introduced into the kiln through the main burner. table 1 shows the chemical composition of the kiln feed and coal normally used in an indonesian cement plant, including in this study.

kiln feed composition | % of mass | coal composition | % of mass
cao | 43.29 | h2 | 4.44
mgo | 1.43 | c | 73.40
sio2 | 13.82 | n2 | 1.30
al2o3 | 3.37 | o2 | 11.10
fe2o3 | 1.77 | s | 0.50
h2o | 0.32 | h2o | 4.70
loi (co2, na2o, k2o & so3) | 36.00 | ash + dust | 13.00

table 1. chemical composition of kiln feed and coal.

figure 1. suspension preheater (sp).
figure 2. material flow in the cement clinker plant's main equipment.

2.2. mass and heat conservation equations

from fig. 2, with the assumptions that the plant condition is steady, that the chemical composition of the kiln feed and fuel (coal) is as presented in table 1 and that the ash contained in the fuel forms clinker, the material flow conservation of the plant can be written as eq. (1). the production of clinker, $m_{cli}$, can be evaluated by eqs. (2) to (8).

$m_{kf} + m_{coal} + m_{tr-air} + m_{cool-air} = m_{cli} + m_{h2o-kf} + m_{unsep\_kf\_1} + m_{gas-kf}$ (1)

$m_{cli} = m_{sep-kfc} + m_{ash}$ (2)

$m_{sep-kfc} = (1 - loi_{kf}) \cdot m_{sep-kf}$ (3)

$m_{gas-kf} = (loi_{kf} - h2o_{kf}) \cdot m_{sep-kf}$ (4)

$m_{sep-kf} = \eta_{c1.a\&1.b} \cdot m_{kf}$ (5)

$m_{unsep\_kf\_1} = (1 - \eta_{c1.a\&1.b}) \cdot m_{kf}$ (6)

$m_{ash} = ash_{coal} \cdot m_{coal}$ (7)

$m_{h2o-kf} = h2o_{kf} \cdot m_{sep-kf}$ (8)

if complete combustion is assumed, the mass flow rate of the hot gas ($m_{hg}$) is equal to that of the flue gas resulting from the coal combustion process. the value of $m_{hg}$ can be estimated by eq. (9):

$m_{hg} = (1 - ash_{coal}) \cdot m_{coal} + m_{tr-air} + m_{cool-air}$ (9)

based on fig. 2, the heat conservation equation of the plant can be written as eq. (10):

$\sum en_{in} = \sum en_{out}$ (10)

where $\sum en_{in}$ is the sum of the heat flows entering the plant equipment, while $\sum en_{out}$ is the sum of the heat flows exiting the plant. the heat flows entering the plant consist of:

(1.) heat flow of the kiln feed entering the sp (eq. 11):
$en_{kf} = m_{kf} \cdot h_{kf}(t_{kf})$ (11)

(2.) heat flow of the coal entering the kiln (eq. 12):
$en_{coal} = m_{coal} \cdot h_{coal}(t_{coal})$ (12)

(3.) heat flow of the air entering the plant through the clinker cooler and the main burner (eq. 13):
$en_{air} = (m_{tr-air} + m_{cool-air}) \cdot h_{air}(t_{air})$ (13)
(4.) heat flow resulting from the coal combustion process (eq. 14):
$en_{coal-comb} = m_{coal} \cdot nhv_{coal}$ (14)

meanwhile, the heat flows exiting the plant's main equipment, with the assumption that the temperature of the dust and gas exiting the sp is equal to $t_{hg1}$, consist of:

(1.) heat flow of the clinker product at temperature $t_{cli}$ (eq. 15):
$en_{cli} = m_{cli} \cdot h_{cli}(t_{cli})$ (15)

(2.) heat flow of the coal combustion gas exiting from the top cyclone of the sp (eq. 16):
$en_{hg} = m_{hg} \cdot h_{hg}(t_{hg1})$ (16)

(3.) heat flow of the vapour created by the evaporation of the water content of the kiln feed, exiting the sp (eq. 17):
$en_{vapor-kf} = m_{h2o-kf} \cdot h_{vapor}(t_{hg1})$ (17)

it is noted that the mass of vapour created by the coal combustion process is included in the mass flow of the flue gas.

(4.) heat flow of the unseparated kiln feed exiting from the top cyclone of the sp (eq. 18):
$en_{unsep-kf-1} = m_{unsep-kf-1} \cdot h_{kf}(t_{hg1})$ (18)

(5.) heat flow of the kiln feed gas resulting from the calcination process, mostly co2, exiting the sp (eq. 19):
$en_{gas-kf} = m_{gas-kf} \cdot h_{co2}(t_{hg1})$ (19)

(6.) heat of the clinker formation, whose value can be calculated by bogue's equation multiplied by the rate of clinker production [39] (eq. 20):
$en_{form} = m_{cli}\{7.646(caco_3)_{kf} + 6.48(mgco_3)_{kf} + 4.11(al_2o_3)_{kf} - 5.176(sio_2)_{kf} - 0.59(fe_2o_3)_{kf}\}$ (20)

where $(caco_3)_{kf}$, $(mgco_3)_{kf}$, $(al_2o_3)_{kf}$, $(sio_2)_{kf}$ and $(fe_2o_3)_{kf}$ are the mass-based percentages of each substance contained in the kiln feed. it should be noted that, for each kg of clinker produced, the heat of clinker formation consists of the calcination heat ($en_{calc}$), which is endothermic, and the heat of clinkerization or sintering ($en_{clink}$), which is exothermic. hence, the heat of clinker formation is the difference between the calcination and the sintering heat. calcination is the decomposition reaction of caco3 forming cao and co2, while sintering is the reaction process forming the oxides contained in the clinker. sintering occurs in the temperature range of 1250 to 1450 °c, thus it only occurs in the kiln. the heat of the calcination reaction, which can occur in the sp and in the kiln, is considered constant at 425 kcal/kg of caco3 [40]. by determining the calcination heat per kg of caco3, the calcination heat per kg of clinker produced can be calculated. finally, we can summarize the calculation in eq. (21):
$en_{clink} = en_{calc} - en_{form}$ (21)

(7.) evaporation heat of the water content of the kiln feed and coal, exiting the sp (eq. 22):
$en_{evap} = (m_{h2o-kf} + m_{h2o-coal}) \cdot h_{fg}$ (22)

where $h_{fg}$ is the enthalpy of the water evaporation process.

(8.) heat loss by radiation and convection ($q_{loss}$) through the overall surface area $a_{tot}$ of the main equipment, which can be calculated by the proposed formula [41] (eq. 23):
$q_{loss} = a_{tot}\left(4 \cdot 10^{-8}(t_{surf}^4 - t_{amb}^4) + 80.33\left(\frac{t_{surf} - t_{amb}}{2}\right)^{-0.724}(t_{surf} - t_{amb})^{1.333}\right)$ (23)

the enthalpy of a substance whose temperature is known from a measurement can be calculated using eq. (24):
$h(t) = \int_{t_1}^{t} c_p\,dt = a\,t + b\,t^2 \cdot 10^{-6} + c\,t^3 \cdot 10^{-9}$ (24)

where $c_p$ is the specific heat of the substance and $t_1$ is the reference temperature. the values of the constants a, b and c for a limited set of substances and gases are shown in table 2.

material | a | b | c
raw meal | 0.206 | 101 | -37
clinker | 0.186 | 54 | 0
coal | 0.262 | 390 | 0
air | 0.237 | 23 | 0
co2 | 0.196 | 118 | -43
o2 | 0.218 | 30 | 0
vapor (h2o) | 0.443 | 39 | 28

table 2. values of a, b and c of eq. (24) [41].
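the two correlations above reduce to a few lines of code. the sketch below is a minimal encoding of eqs. (23) and (24) with the table 2 constants; the function and dictionary names are ours, the leading a·t term of eq. (24) is a reconstruction of the garbled printed form (consistent with a being of the order of a specific heat in kcal/(kg·°c)), and the units expected by the radiative term of eq. (23) should be verified against the source correlation [41] before use:

```python
# constants a, b, c of eq. (24) from table 2, giving h(t) in kcal/kg
TABLE2 = {
    "raw meal": (0.206, 101, -37),
    "clinker": (0.186, 54, 0),
    "coal": (0.262, 390, 0),
    "air": (0.237, 23, 0),
    "co2": (0.196, 118, -43),
    "o2": (0.218, 30, 0),
    "vapor": (0.443, 39, 28),
}

def enthalpy(substance, t):
    """eq. (24): h(t) = a*t + b*t^2*1e-6 + c*t^3*1e-9, t in deg c above the reference t1."""
    a, b, c = TABLE2[substance]
    return a * t + b * t**2 * 1e-6 + c * t**3 * 1e-9

def q_loss(a_tot, t_surf, t_amb):
    """eq. (23): radiation + convection loss through a surface of area a_tot."""
    dt = t_surf - t_amb
    return a_tot * (4e-8 * (t_surf**4 - t_amb**4)
                    + 80.33 * (dt / 2.0) ** (-0.724) * dt ** 1.333)
```

as a plausibility check, enthalpy("air", 960) gives about 249 kcal/kg, i.e. an average cp of roughly 0.26 kcal/(kg·°c) for air heated to the secondary-air temperature quoted later in section 3.3.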
2.3. methodology

in general, a cement plant is equipped with measuring devices and instrumentation to monitor operating parameters such as temperatures and draft pressures. a weighing feeder is applied for measuring the mass flow rates of the kiln feed and coal. an oxygen content monitoring system is installed at the kiln and sp gas outlets. obstruction meters for measuring the air flows are generally available in the control system of the plant. in addition, the flow of dust returning from the unseparated kiln feed leaving the sp is also measured. this allows an estimation of the separation efficiency of the top cyclone using equation (6). the first step of the study is the evaluation of the mass conservation of the whole plant based on the measurement results of these instrumentation devices, using equations (1) to (9). the second step is to solve the mass and heat flows in each main equipment in more detail (e.g. clinker cooler, kiln, sp and each cyclone separator). additional mass and heat conservation equations are required for each equipment. these equations are interdependent and can be formed into linear algebraic equations and solved simultaneously. however, especially for the cyclones, because the number of equations is greater than the number of parameters to be searched for, an approximate solution is obtained using the least squares method, with the minimum sum of squared residual errors as the criterion.

2.4. mass and heat conservation equations of the planetary clinker cooler

a schematic diagram of the mass flows entering and leaving the planetary clinker cooler is presented in fig. 3. the mass conservation equation can be written as eq. (25):

$m_{cli\_k} + m_{cool-air} = m_{cli} + m_{clid\_to\_k} + m_{cool\_air\_k}$ (25)

figure 3. material flows in the planetary cooler.

where $m_{cli}$, $m_{cli\_k}$, $m_{cool-air}$, $m_{cool\_air\_k}$ and $m_{clid\_to\_k}$ are the mass flow rates of the clinker product, the clinker entering the cooler, the cooling air entering the cooler, the cooling air entering the kiln (combustion air) and the clinker dust returning to the kiln, respectively. with the assumption that there is no leakage in this cooler, the mass flow rate of the cooling air is equal to the combustion air minus the transporting air. the rate of the clinker mass leaving the kiln and entering the cooler must be equal to the sum of the clinker production and the clinker dust returning to the kiln. this clinker dust is carried by the combustion air from the cooler back to the kiln. the percentage of clinker dust returning, relative to the clinker production, is assumed in the calculation of the mass conservation in the cooler. the heat conservation calculation error in the cooler and kiln is kept below 1 %. with the limits determined in this way and with the measurement data from the plant, the calculation of the mass and heat conservations in the cooler can be carried out. the planetary cooler heat conservation equation is written in eq. (26):

$en_{cli\_k} + en_{cool-air} = en_{cli} + en_{clid\_to\_k} + en_{cool\_air\_k} + q_{loss}$ (26)

the heat flows entering the planetary cooler consist of:

(1.) heat flow of the clinker from the kiln at a temperature $t_{cli\_k}$ (eq. 27):
$en_{cli\_k} = m_{cli\_k} \cdot h_{cli}(t_{cli\_k})$ (27)

(2.) heat of the cooling air (eq. 28):
$en_{cool\_air} = m_{cool\_air} \cdot h_{air}(t_{air})$ (28)

meanwhile, the heat flows leaving the cooler, assuming that the temperature of the dust returning to the kiln and of the gas exiting the cooler is equal to $t_{hg\_c}$, consist of:
(1.) heat flow of the clinker product $en_{cli}$ at a temperature $t_{cli}$, as given in eq. (15).

(2.) heat of the combustion air entering the kiln, $en_{sec\_air\_c}$, at temperature $t_{hg\_c}$ (eq. 29):
$en_{sec\_air\_c} = m_{cool\_air\_k} \cdot h_{hg\_c}(t_{hg\_c})$ (29)

(3.) heat flow of the clinker dust returning to the kiln, $en_{clid\_k}$, at temperature $t_{hg\_c}$ (eq. 30):
$en_{clid\_k} = m_{clid\_to\_k} \cdot h_{cli}(t_{hg\_c})$ (30)

the heat loss by radiation and convection, $q_{loss\_c}$, through the surface area of the clinker cooler ($a_{cooler}$) can be calculated by eq. (23). in eqs. (25) and (26), the mass rate of the clinker dust returning to the kiln ($m_{clid\_to\_k}$) and the combustion air temperature cannot be measured directly at the plant. however, the mass and heat conservation in the kiln can be evaluated with a minimum admissible difference between the heat entering and exiting the kiln. using this condition, eqs. (27) to (30) can be solved, completing the heat conservation of the cooler.

2.5. mass and heat conservation equations of the kiln

for the kiln system, the mass flows into and out of the kiln are described in fig. 4. the mass conservation equation is written in eq. (31):

$m_{sep-kf-4} + m_{clid\_to\_k} + m_{coal} + m_{comb\_air} = m_{cli\_k} + m_{clid\_to\_sp} + m_{hg\_to\_sp}$ (31)

figure 4. material flows in the kiln.

where
$m_{sep-kf-4}$ = mass flow rate of kiln feed from the sp,
$m_{coal}$ = mass flow rate of coal,
$m_{comb\_air}$ = mass flow rate of combustion air (cooling air $m_{cool-air}$ and transporting air $m_{tr-air}$),
$m_{clid\_to\_sp}$ = mass flow rate of clinker dust entering the sp,
$m_{hg\_to\_sp}$ = mass flow rate of hot gas entering the sp.

the hot gas entering the sp consists of the flue gas, as given in eq. (9), and of co2 and other gases from the calcination and burning processes. the portion of the calcined kiln feed can be evaluated using equation (4) multiplied by the percentage of the kiln feed calcined in the kiln (%calc_k). thus, the value of $m_{hg\_to\_sp}$ can be written as eq. (32):

$m_{hg\_to\_sp} = m_{hg} + (\%calc\_k \cdot m_{gas-kf})$ (32)

the other notations were already mentioned in the mass conservation of the cooler. the heat conservation equation of the kiln is as follows (eq. 33):

$en_{kf\_to\_k} + en_{coal} + en_{coal-comb} + en_{sec\_air\_c} + en_{tr\_air} + en_{clid\_k} = en_{cli\_k} + en_{clid\_to\_sp} + en_{form\_k} + en_{hg\_to\_sp} + q_{loss\_k}$ (33)

the heat flows entering the kiln consist of:

(1.) heat flow of the kiln feed from the sp at a temperature $t_{in\_k}$ (eq. 34):
$en_{kf\_to\_k} = m_{sep-kf-4} \cdot h_{kf}(t_{in\_k})$ (34)

(2.) heat of the coal entering the kiln, as defined in equation (12).

(3.) heat from the coal combustion process (eq. 14).

(4.) heat flow of the cooling and transporting air (eq. 13).

(5.) heat of the clinker dust returning from the cooler, as written in eq. (30).

the heat flows leaving the kiln consist of:

(1.) heat flow of the clinker product to the cooler, $en_{cli\_k}$ (eq. 27).

(2.) heat flow of the clinker dust returning to the sp, $en_{clid\_to\_sp}$, whose temperature is assumed equal to the temperature of the hot gas leaving the kiln, $t_{in\_k}$ (eq. 35):
$en_{clid\_to\_sp} = m_{clid\_to\_sp} \cdot h_{cli}(t_{in\_k})$ (35)

(3.) net heat of the calcination and clinkerization processes in the kiln, $en_{form\_k}$, which can be assumed equal to the sum of the calcination heat and the clinkerization heat. hence, the value of $en_{form\_k}$ can be written as follows (eq. 36):
$en_{form\_k} = en_{calc\_k} + en_{clink}$ (36)

where $en_{calc\_k} = \%calc\_k \cdot en_{calc}$. here, %calc_k can be estimated indirectly when the evaluation of the heat conservation in the kiln is carried out. the value of %calc_k also depends on the percentage of the kiln feed that was calcined in the sp (%calc_sp).
therefore, %calc_k = 1 − %calc_sp. in this study, we assume that the calcination process in the sp occurs in cyclone 3 (%calc_sp_c3) and in the lowest cyclone (%calc_sp_c4) and their riser ducts; thus %calc_sp = %calc_sp_c3 + %calc_sp_c4.

(4.) the temperature of the hot gas flowing from the kiln to the sp is $t_{in\_k}$. the total mass of this hot gas is the sum of the mass flow rate of the combustion gas $m_{hg}$ (eq. 9) and of the co2 produced by the calcination of the kiln feed in the kiln, equal to $\%calc\_k \cdot m_{gas-kf}$. therefore, the heat flow of the hot gas can be calculated as follows (eq. 37):
$en_{hg\_to\_sp} = m_{hg} \cdot h_{hg}(t_{in\_k}) + (\%calc\_k \cdot m_{gas-kf}) \cdot h_{co2}(t_{in\_k})$ (37)

the heat loss by radiation and convection, $q_{loss\_k}$, through the surface area of the kiln ($a_{kiln}$) can be calculated by equation (23).

2.6. detailed mass and heat conservation equations of the sp system and each cyclone separator

the flows of solid materials and gas in the sp are presented in fig. 1, and the related mass conservation equation can be written as follows (eq. 38):

$m_{kf} + m_{clid\_to\_sp} + m_{hg\_to\_sp} = m_{sep-kf-4} + m_{h2o-kf} + m_{unsep\_kf\_1} + m_{hg} + m_{gas-kf}$ (38)

it should be noted that the values of $m_{sep-kf-4}$ and $m_{clid\_to\_sp}$ are known from solving the kiln mass conservation equation. the heat flows into the sp consist of:

(1.) heat of the kiln feed entering the sp, $en_{kf}$.

(2.) heat of the clinker dust returning from the kiln, $en_{clid\_to\_sp}$, represented by eq. (35).

(3.) heat flow of the hot gas from the kiln, presented by eq. (37), consisting of the heat of the combustion gas $m_{hg}$ and of the loi gas produced by the calcination of the kiln feed.

the heat flows leaving the sp consist of:

(1.) heat flow of the kiln feed not separated by the top cyclones (eq. 18).

(2.) heat flow of the coal combustion hot gas exiting from the top cyclone, as formulated by eq. (16).

(3.) heat of the evaporated water content of the kiln feed, as written by eq. (17).

(4.) heat of the gas produced by the calcination and burning of the kiln feed, mainly co2, approximated by eq. (19).

(5.) heat flow of the kiln feed entering the kiln at a temperature $t_{in\_k}$, as presented in equation (34).

(6.) heat of the clinker formation in the sp, $en_{form\_sp}$, which can be assumed equal to the heat of clinker formation $en_{form}$ (eq. 20) minus the formation heat absorbed in the kiln, $en_{form\_k}$ (eq. 39):
$en_{form\_sp} = en_{form} - en_{form\_k}$ (39)

it should be noted that the formation heat in the sp only consists of the calcination heat. its value can be estimated based on the percentage of the kiln feed calcined in cyclones 3 and 4. we can write this term as $en_{form\_sp} = (\%calc\_sp\_c3 + \%calc\_sp\_c4) \cdot en_{calc}$.

(7.) heat loss by radiation and convection, $q_{loss\_sp}$, through the sp's surface area ($a_{sp}$), which can be calculated by eq. (23).

finally, the heat conservation equation for the sp is written in eq. (40):

$en_{kf} + en_{clid\_to\_sp} + en_{hg\_to\_sp} = en_{unsep-kf-1} + en_{hg} + en_{kf\_to\_k} + en_{form\_sp} + en_{gas-kf} + en_{vapor-kf} + q_{loss\_sp}$ (40)

for an understanding of the mass and heat flows at each cyclone separator, which is very useful for the plant operation and design as well as for equipment modification, the formulation will be derived one by one. the top cyclone stage, 1a and 1b, consists of 2 cyclones of a smaller diameter than the lower cyclones. the reason for this design is to achieve a higher separation efficiency. the schematic of these top cyclones is presented in
fig. 5a, while the lower cyclones are depicted in fig. 5b.

figure 5. mass flows of the top (left) and other (right) cyclones.

for the top cyclones, assuming no flow leakage, the incoming mass flows consist of:

(1.) the kiln feed, $m_{kf}$, at temperature $t_{kf}$.

(2.) the flow of unseparated kiln feed from cyclone 2, $m_{unsep\_kf\_2}$, at temperature $t_{hg2}$.

(3.) the gas flow from the cyclone below at a temperature $t_{hg2}$, consisting of combustion gas $(m_{hg})_{in}$ and kiln feed gas (co2 and others) $(m_{gas-kf})_{in}$.

the outgoing mass flows consist of:

(1.) the separated kiln feed to cyclone 2, $m_{sep-kf-1}$.

(2.) the unseparated kiln feed at a temperature $t_{kf1}$, $m_{unsep\_kf\_1}$.

(3.) the combustion gas, kiln feed gas and water vapour ($m_{h2o-kf}$) at temperature $t_{kf1}$.

the mass conservation equation for the top cyclone can be written as eq. (41):

$m_{kf} + m_{unsep\_kf\_2} + (m_{hg})_{in} + (m_{gas-kf})_{in} = m_{sep-kf-1} + m_{h2o-kf} + m_{unsep\_kf\_1} + (m_{hg})_{out} + (m_{gas-kf})_{out}$ (41)

in a typical sp operation, the temperature of the gas that heats the kiln feed is slightly higher than the temperature of the kiln feed exiting the cyclone ($t_{kf1}$). these temperatures are measured and can be read in the control room. the rates of the kiln feed and of the unseparated dust filtered by the baghouse can also be measured. it is assumed that the dust from cyclone 2 is filtered back by the top cyclone; thus, the material that has been filtered by the top cyclone is part of the circulating material in the sp. furthermore, the ratio of $(m_{kf} - m_{unsep\_kf\_1})$ to $m_{kf}$ can be used to calculate its separation efficiency, $\eta_{c1.a\&1.b}$. in this case, $(m_{gas-kf})_{in} = (m_{gas-kf})_{out} = m_{gas-kf}$ and $(m_{hg})_{in} = (m_{hg})_{out} = m_{hg}$, and this also holds for the equations derived below. based on these operating conditions, the top cyclone separation efficiency can be calculated by eq. (5), and its heat conservation can be written as follows (eq. 42):

$en_{kf} + m_{unsep\_kf\_2} \cdot h_{kf}(t_{hg2}) + m_{hg} \cdot h_{hg}(t_{hg2}) + m_{gas-kf} \cdot h_{co2}(t_{hg2}) = en_{hg} + en_{gas-kf} + en_{vapor-kf} + en_{unsep-kf-1} + m_{sep-kf-1} \cdot h_{kf}(t_{kf1}) + q_{loss\_c1}$ (42)

where $q_{loss\_c1}$ is the heat loss by radiation and convection through the top cyclone surface area ($a_{c1}$), which can be calculated by eq. (23). using the same notation for the material and gas flows as denoted in fig. 5 (right), the mass conservation equation of cyclone 2 can be written as eq. (43):

$m_{sep-kf-1} + m_{unsep\_kf\_3} + (m_{hg})_{in} + (m_{gas-kf})_{in} = m_{sep-kf-2} + m_{unsep\_kf\_2} + (m_{hg})_{out} + (m_{gas-kf})_{out}$ (43)

while its heat conservation equation is (eq. 44):

$m_{sep-kf-1} \cdot h_{kf}(t_{kf1}) + m_{unsep\_kf\_3} \cdot h_{kf}(t_{hg3}) + m_{hg} \cdot h_{hg}(t_{hg3}) + m_{gas-kf} \cdot h_{co2}(t_{hg3}) = m_{sep-kf-2} \cdot h_{kf}(t_{kf2}) + m_{unsep\_kf\_2} \cdot h_{kf}(t_{hg2}) + m_{hg} \cdot h_{hg}(t_{hg2}) + m_{gas-kf} \cdot h_{co2}(t_{hg2}) + q_{loss\_c2}$ (44)

for cyclone 3, the temperatures of the exiting gas and solid materials are $t_{hg3}$ and $t_{kf3}$, respectively, while the temperature of the gas entering this cyclone from the lowest cyclone is $t_{hg4}$. the calcination reaction starts at a temperature of around 600 °c [42–44], and the kiln feed temperature in cyclone 3 ($t_{kf3}$) is higher than 600 °c; hence, a part of the kiln feed is calcined there. by assuming that the percentage of kiln feed calcined in cyclone 3 is %calc_sp_c3, the co2 formed by this reaction and the associated reaction heat must be taken into account in the mass and heat conservation equations.
hence, the mass conservation equation for cyclone 3 can be written as follows (eq. 45):

$m_{sep-kf-2} + m_{unsep\_kf\_4} + (m_{hg})_{in} + (m_{gas-kf})_{in} = m_{sep-kf-3} + m_{unsep\_kf\_3} + (m_{hg})_{out} + (m_{gas-kf})_{out}$ (45)

where

$(m_{gas-kf})_{out} - (m_{gas-kf})_{in} = \%calc\_sp\_c3 \cdot m_{gas-kf} = (m_{sep-kf-2} + m_{unsep\_kf\_4}) - (m_{sep-kf-3} + m_{unsep\_kf\_3})$ (46)

and its heat conservation equation is:

$m_{sep-kf-2} \cdot h_{kf}(t_{kf2}) + m_{unsep\_kf\_4} \cdot h_{kf}(t_{hg4}) + m_{hg} \cdot h_{hg}(t_{hg4}) + m_{gas-kf} \cdot h_{co2}(t_{hg4}) = m_{sep-kf-3} \cdot h_{kf}(t_{kf3}) + m_{unsep\_kf\_3} \cdot h_{kf}(t_{hg3}) + en_{form\_sp\_c3} + m_{hg} \cdot h_{hg}(t_{hg3}) + m_{gas-kf} \cdot h_{co2}(t_{hg3}) + q_{loss\_c3}$ (47)

it should be noted that the calcination of the kiln feed separated in cyclone 3 ($m_{sep-kf-3}$) continues in the lowest cyclone due to the higher temperature of the hot gas coming from the kiln. therefore, a part of the kiln feed mass is transformed into co2 in this lowest cyclone. the mass conservation equation of the lowest cyclone is finally written as (eq. 48):

$m_{sep-kf-3} + (m_{hg})_{in} + m_{clid\_to\_sp} + (\%calc\_k \cdot m_{gas-kf}) = m_{sep-kf-4} + m_{unsep\_kf\_4} + (m_{hg})_{out} + (m_{gas-kf})_{out}$ (48)

the gas produced by the calcination process in the lowest cyclone is equal to (eq. 49):

$(m_{gas-kf})_{out} - (\%calc\_k \cdot m_{gas-kf}) = \%calc\_sp\_c4 \cdot m_{gas-kf} = (m_{sep-kf-3} + m_{clid\_to\_sp}) - (m_{sep-kf-4} + m_{unsep\_kf\_4})$ (49)

the related heat conservation equation is given as equation (50):

$m_{sep-kf-3} \cdot h_{kf}(t_{kf3}) + en_{clid\_to\_sp} + en_{hg\_to\_sp} = m_{unsep\_kf\_4} \cdot h_{kf}(t_{hg4}) + m_{hg} \cdot h_{hg}(t_{hg4}) + m_{gas-kf} \cdot h_{co2}(t_{hg4}) + en_{kf\_to\_k} + en_{form\_sp\_c4} + q_{loss\_c4}$ (50)

the mass and heat flows in cyclones 2, 3 and 4 (lowest cyclone) cannot be calculated sequentially. this is due to the unknown masses of the kiln feed separated by cyclones 2 and 3 and of the unseparated kiln feed from cyclones 3 and 4. these four unknown parameters are interdependent and must be solved for simultaneously. to evaluate all four parameters simultaneously, equations (43) to (45), (47), (48) and (50) are used and written in matrix form (eq. 51):

$[a] \times [x] = [b]$ (51)

the matrices [a], [x] and [b] are given by equations (52), (53) and (54), respectively. the 6 × 4 matrix [a] contains the multiplier constants of the unknown variables, while the 6 × 1 matrix [b] contains the known constant terms of equation (51).

$[a] = \begin{bmatrix} 1 & 0 & -1 & 0 \\ h_{kf}(t_{kf2}) & 0 & -h_{kf}(t_{hg3}) & 0 \\ -1 & 1 & 1 & -1 \\ -h_{kf}(t_{kf2}) & h_{kf}(t_{kf3}) & h_{kf}(t_{hg3}) & -h_{kf}(t_{hg4}) \\ 0 & -1 & 0 & 1 \\ 0 & -h_{kf}(t_{kf3}) & 0 & h_{kf}(t_{hg4}) \end{bmatrix}$ (52)

$[x] = \begin{bmatrix} m_{sep-kf-2} \\ m_{sep-kf-3} \\ m_{unsep\_kf\_3} \\ m_{unsep\_kf\_4} \end{bmatrix}$ (53)

$[b] = \begin{bmatrix} m_{sep-kf-1} - m_{unsep\_kf\_2} \\ m_{sep-kf-1} \cdot h_{kf}(t_{kf1}) + m_{hg} \cdot \{h_{hg}(t_{hg3}) - h_{hg}(t_{hg2})\} + m_{gas-kf} \cdot \{h_{co2}(t_{hg3}) - h_{co2}(t_{hg2})\} - m_{unsep\_kf\_2} \cdot h_{kf}(t_{hg2}) - q_{loss\_c2} \\ 0 \\ m_{hg} \cdot \{h_{hg}(t_{hg4}) - h_{hg}(t_{hg3})\} + m_{gas-kf} \cdot \{h_{co2}(t_{hg4}) - h_{co2}(t_{hg3})\} - en_{form\_sp\_c3} - q_{loss\_c3} \\ m_{clid\_to\_sp} - m_{sep-kf-4} - \{(1 - \%calc\_k) \cdot m_{gas-kf}\} \\ en_{clid\_to\_sp} + en_{hg\_to\_sp} - m_{hg} \cdot h_{hg}(t_{hg4}) - m_{gas-kf} \cdot h_{co2}(t_{hg4}) - en_{kf\_to\_k} - en_{form\_sp\_c4} - q_{loss\_c4} \end{bmatrix}$ (54)

the 4 × 1 matrix [x] contains the variables to be solved for. from equation (51), it can be seen that the number of unknown variables is smaller than the number of equations (an overdetermined system). a system that has more equations than variables usually does not have an exact solution and cannot be solved directly [45]. the mass and heat conservation equations in matrix form, as written in eqs.
(52) to (54), are stand-alone equations, so the ranks of the matrices [a] and [a|b] cannot be reduced from one another, because no equation of the system is a combination of the others. thus, the overdetermined system of equations is, in general, inconsistent. however, a solution can still be found using the least squares method, i.e. by finding the values that fulfil the equations with the minimum sum of squared residual errors [46]. the method multiplies the matrix [a] by its own transpose $[a]^T$, so that the matrix $[a]^T[a]$ has the same number of rows and columns. the product $[a]^T[b]$ correspondingly has as many rows as $[a]^T$, as shown in eq. (55). the variable values in the matrix [x] can be found by multiplying both sides of the equation by the inverse matrix $[a^T a]^{-1}$, as described in eq. (56), which can be solved using mathematical software.

$[a]^T[a] \times [x] = [a]^T[b]$ (55)

$[a^T a]^{-1}[a^T a] \times [x] = [a^T a]^{-1}[a]^T[b]$ (56)

from equation (56), the values $x \approx x_{ls}$ are called the unique least squares approximate solution, defined by eq. (57):

$x_{ls} = [a^T a]^{-1}[a^T b]$ (57)

knowing the approximate values of x, the approximate separation efficiencies of cyclones 2, 3 and 4 can be calculated using the following eqs. (58)–(60) [47, 48]:

$\eta_{c2} = \dfrac{m_{sep-kf-2}}{m_{sep-kf-1} + m_{unsep\_kf\_3}}$ (58)

$\eta_{c3} = \dfrac{m_{sep-kf-3}}{m_{sep-kf-2} + m_{unsep\_kf\_4} - (\%calc\_sp\_c3 \cdot m_{gas-kf})}$ (59)

$\eta_{c4} = \dfrac{m_{sep-kf-4}}{m_{sep-kf-3} + m_{clid\_to\_sp} - (\%calc\_sp\_c4 \cdot m_{gas-kf})}$ (60)

the approximate values of the cyclone separation efficiencies can be substituted back into the mass conservation equation of each cyclone to recalculate the value of x so that it satisfies the mass conservation law. the final values of x can then be used to evaluate the heat flows in each cyclone with a minimum error, acceptable from an engineering point of view (i.e. an error < 1 %). the flowchart of the calculation steps is shown in figure 6: after inputting the kiln feed and fuel chemical compositions and heating value and the operational parameters (feeding, temperatures of gas, materials and surfaces, clinker and fuel flows, primary air flow, return dust flow, cooling air), the top cyclone separation efficiency and the percentages of return dust from the cooler to the kiln and from the kiln to the sp are first estimated; the plant mass and heat balances and the clinker production are calculated using eqs. (1)–(9); the sp, kiln and clinker cooler balances using eqs. (10)–(40); the detailed mass balance of each cyclone by the least squares method (eqs. 41–57); and the separation efficiency of each cyclone by eqs. (58)–(60), which is substituted back into the detailed cyclone balances (eqs. 41–50). the loop is repeated, with a slight tuning of the gas and/or material temperatures of the cyclones, until δm_cyclone = 0 and δe_cyclone < 1 %, and the best approximation results are reported.

figure 6. flowchart of the calculation steps.

all the above-mentioned equations and calculation methods have been adopted as the basis for developing a substantial engineering equation solver (i.e., in the form of executable programs), which was used to calculate the detailed material and heat flows and conservations.
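as an illustration of eqs. (51)–(60), the sketch below assembles the 6 × 4 matrix [a] of eq. (52) and solves the overdetermined system; numpy's lstsq returns the same least squares minimiser as eq. (57), but computes it more stably than forming the normal equations explicitly. all variable names are ours, hkf is the kiln feed enthalpy function of eq. (24), and the right-hand-side vector b is assumed to have been precomputed from eq. (54) and the measured temperatures:

```python
import numpy as np

def solve_cyclone_flows(hkf, t_kf2, t_kf3, t_hg3, t_hg4, b):
    """least squares solution of [a][x] = [b] (eqs. 51-57).

    x = [m_sep_kf_2, m_sep_kf_3, m_unsep_kf_3, m_unsep_kf_4]
    """
    a = np.array([
        [1.0,          0.0,          -1.0,          0.0],
        [hkf(t_kf2),   0.0,          -hkf(t_hg3),   0.0],
        [-1.0,         1.0,           1.0,         -1.0],
        [-hkf(t_kf2),  hkf(t_kf3),    hkf(t_hg3),  -hkf(t_hg4)],
        [0.0,         -1.0,           0.0,          1.0],
        [0.0,         -hkf(t_kf3),    0.0,          hkf(t_hg4)],
    ])
    # eq. (57): x_ls = (a^T a)^-1 a^T b; lstsq yields the same minimiser
    x_ls, *_ = np.linalg.lstsq(a, np.asarray(b, dtype=float), rcond=None)
    return x_ls

def separation_efficiencies(x, m_sep_kf_1, m_sep_kf_4, m_clid_to_sp,
                            m_gas_kf, calc_c3, calc_c4):
    """eqs. (58)-(60) applied to the least squares estimates."""
    m2, m3, u3, u4 = x
    eta_c2 = m2 / (m_sep_kf_1 + u3)
    eta_c3 = m3 / (m2 + u4 - calc_c3 * m_gas_kf)
    eta_c4 = m_sep_kf_4 / (m3 + m_clid_to_sp - calc_c4 * m_gas_kf)
    return eta_c2, eta_c3, eta_c4
```

in the iteration of figure 6, these estimates would be substituted back into eqs. (41)–(50) and the temperatures tuned slightly until the mass balances close and the heat balance error drops below 1 %.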
3. results and discussion

3.1. operational data parameters

a cement plant in indonesia, tonasa 2, was used as the object of this study. daily average data at a normal operating capacity were used for the analysis. these data were taken from the control room and from direct field measurements. the input data are classified into several groups (i.e. kiln feed, coal, sp, kiln and cooler data). the coal consumption is 271 tons per day (tpd), or 3.137 kg/s, with a nett heating value of 6,100 kcal/kg of coal. the rate of the transporting air is 2.08 m3/s. the rate of the inlet kiln feed is 3,195 tpd. the rate of the return dust is 158.8 tpd (1.85 kg/s). the remaining oxygen percentages at the kiln and top cyclone outlets are 2.5 % and 2.7 %, respectively, and can be used to estimate the kiln excess air and the sp false air [41]. the ambient temperature is 33 °c. based on these measurement data, the production of clinker, flue gas and kiln feed gas can be evaluated using eqs. (2) to (9). meanwhile, the data for the kiln, planetary cooler and sp are based on measurement results and on general assumptions of normal cement plant operation, and are given in tables 3 and 4. another parameter that cannot be measured directly in the plant is the mass flow rate of the recirculated dust from the cooler to the kiln and from the kiln to the sp; in this study, these values are assumed to be 15 % and 17 % of the clinker production rate, respectively.

parameters (unit) | kiln | cooler
average surface temperature (°c) | 330.7 | 189.2
cylinder diameter (m) | 4.5 | 2.3
length of cylinder (m) | 75 | 16
number of cylinders | 1 | 11
clinker product temperature (°c) | | 190
entering air temperature (°c) | | 33
estimated return dust from cooler (%) | | 15

table 3. input data for the kiln and planetary cooler.

parameters (unit) | cyclone 1 | cyclone 2 | cyclone 3 | cyclone 4
average surface temperature (°c) | 164.1 | 194.2 | 198.8 | 226
total surface area (m2) | 450.7 | 376.9 | 376.5 | 455.3
hot gas inlet temperature (°c) | 560 | 694 | 867 | 1190
hot gas exit temperature (°c) | 360 | 560 | 694 | 867
separated kiln feed temperature (°c) | 345 | 545 | 680 | 837
estimated return dust from kiln (%) | | | | 17
estimated kiln feed calcined exiting sp (%) | | | | 25

table 4. input data for each cyclone of the sp.

3.2. plant mass and heat conservations

using the available input data, the mass and heat conservations of the case study plant can be evaluated using eqs. (10) to (23). the results of the mass and heat conservation calculation are presented per unit mass of clinker product (i.e., per kg of clinker produced). with a kiln feed mass fed to the plant of 3,195 tpd and a measured top cyclone separation efficiency of around 95 %, the clinker production is approximately 1,978 tpd (22.9 kg/s). for the whole clinker plant, the overall mass and heat conservations per unit mass of clinker produced are given in table 5. generally, the heat consumption of a cement plant with a higher capacity is lower. from the results, it can clearly be seen that the heat of formation per kg of clinker is around 425 kcal. the heat of clinker formation generally consists of the calcination heat of caco3 and of the clinkerization heat forming the oxides contained in the clinker. the calcination process is endothermic, while the clinkerization process is exothermic. because both processes occur in a high temperature range (600 °c – 1,450 °c), additional heat is needed to maintain the temperature of the kiln and suspension preheater (sp). this is the reason why the plant heat consumption is far above its clinker formation heat. this excess heat is finally carried away by the flue gas, co2 and other gases leaving the sp. this hot gas, with a low oxygen content, is used as a heat source for drying coal and raw materials. in modern cement factories, the flue gas is even used for electricity generation. in addition, excess heat is also carried out by the clinker product. the clinker heat is recovered in the cooler to increase the combustion air temperature and reduce the coal requirement. in an effort to reduce the production cost and support environmental sustainability, part of the fuel used is low rank coal and alternative fuels. however, by shifting to lower rank coal with a high water content and to alternative fuels, the specific heat consumption could increase to ±850 kcal/kg of produced clinker [49].
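the clinker production figure quoted above can be reproduced directly by chaining eqs. (2), (3), (5) and (7) on the operating data of section 3.1. a minimal check, with our own variable names:

```python
m_kf = 3195.0        # kiln feed, tpd (section 3.1)
eta_c1 = 0.95        # measured top cyclone separation efficiency
loi_kf = 0.36        # loss on ignition of the kiln feed (table 1)
m_coal = 271.0       # coal consumption, tpd (section 3.1)
ash_coal = 0.13      # ash + dust fraction of the coal (table 1)

m_sep_kf = eta_c1 * m_kf               # eq. (5): 3035.25 tpd
m_sep_kfc = (1 - loi_kf) * m_sep_kf    # eq. (3): 1942.56 tpd
m_ash = ash_coal * m_coal              # eq. (7): 35.23 tpd
m_cli = m_sep_kfc + m_ash              # eq. (2): about 1978 tpd
```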
this is the reason why the plant heat consumption is far above its clinker formation heat. this excess heat is finally carried away by the flue gas, co2 and other gasses out of the sp. this hot gas, with a low oxygen content, is used as a heat source for a drying of coal and raw materials. in modern cement factories, the flue gas is even used as for electricity generation. in addition, the excess heat is also carried out by clinker products. the clinker heat is recovered in the cooler to increase the combustion air temperature and reduce the coal requirements. in effort to reduce the production cost and supporting environmental sustainability, part of the fuel used is low rank coal and alternative fuels. however, by shifting to lower rank coal with a high water content and alternative fuels, the specific heat consumption could increase to ±850 kcal/kg of the produced clinker [49]. 3.3. cooler and kiln mass and heat conservations the calculation results of the mass and heat conservation at the cooler and kiln, using eqs. (24) to (36), are given in tables 6 and 7, respectively. the absolute value of the incoming and outgoing difference heat of the cooler is less than 1 %. the temperature of the combustion air entering the kiln is around 960 °c and its recovery heat efficiency is 74.55 %. this combustion air temperature is in the range of the value measured by alfi et.al [49]. meanwhile, the percentage of the heat recovery in the cooler is relatively high when compared to the grate cooler, which is widely used in large capacity factories. this is caused by the conditions on the planetary cooler with the entirety of the cooling air being used as a combustion air. whereas, on the grate cooler, not all of the cooling air is used as combustion air. the consequence is that the clinker temperature coming out of the cooler is generally higher on the planetary cooler than in the case of the grate cooler. the heat loss from the cooler is large enough that it can be further utilized (i.e., for drying coal in tonasa 2 plant [50]). in table 7, the mass and heat conservation per kg of clinker produced in the kiln can be seen. the difference between the rate of heat entering and exiting the kiln is insignificant, with an approximate value of 0.46 %. this value is obtained by assuming that 17 % of clinker from 211 p. s. darmanto, i m. astina, a. k. wardhana et al. acta polytechnica parameters inlet outletmass (kg) heat (kcal) mass (kg) heat(kcal) kiln feed in 1.6155 20.541 coal to main burner 0.1370 837.102 total combustion air 1.5850 11.442 clinker production 1.0000 37.289 combustion gas including excess air 1.7042 154.620 kiln feed gas 0.5528 46.744 return dust 0.0805 6.887 clinker heat formation 425.162 convection & radiation loss 193.285 water evaporation process 6.741 total of mass & heat flow 3.3375 869.085 3.3375 870.728 table 5. plant materials and heat flows per kg of clinker produced. parameters inlet outletmass (kg) heat (kcal) mass (kg) heat(kcal) clinker from kiln 1.1500 428.488 cooling air 1.3239 9.440 clinker product 1.0000 37.289 return clinker dust to kiln 0.1500 34.201 combustion air 1.3239 328.863 convection & radiation loss 37.375 total of mass & heat flow 2.4739 437.928 2.4739 437.728 table 6. materials and heat flows of cooler per kg of clinker produced. 
inlet flows | mass (kg) | heat (kcal)
kiln feed from sp | 1.5636 | 346.300
coal to main burner | 0.1370 | 837.102
combustion air | 1.5189 | 330.393
return clinker dust from cooler | 0.1500 | 34.201
total of mass & heat flow in | 3.3695 | 1547.996
outlet flows | mass (kg) | heat (kcal)
produced clinker | 1.1500 | 428.488
recirculating clinker dust to sp | 0.1700 | 50.630
flue gas to sp | 1.6381 | 537.000
kiln feed gas | 0.4114 | 134.900
net of calcination and clinkerization heat | | 293.310
evaporating of coal moisture content | | 3.846
convection & radiation loss | | 92.730
total of mass & heat flow out | 3.3695 | 1540.904

table 7. material and heat flows of the kiln per kg of clinker produced.

inlet flows | mass (kg) | heat (kcal)
kiln feed | 1.6155 | 20.442
recirculating clinker dust from kiln | 0.1700 | 50.630
flue gas from kiln | 1.7042 | 541.660
kiln feed gas from kiln | 0.4114 | 134.900
total of mass & heat flow in | 3.9011 | 747.632
outlet flows | mass (kg) | heat (kcal)
kiln feed to the kiln | 1.5636 | 346.300
combustion gas including excess air | 1.7042 | 154.620
kiln feed gas out | 0.5528 | 46.744
return dust flow | 0.0805 | 6.887
heat of calcination process | | 131.800
evaporating of kiln feed water content | | 2.894
convection & radiation loss | | 63.180
total of mass & heat flow out | 3.9011 | 752.425

table 8. material and heat flows of the sp per kg of produced clinker.

by limiting the difference between the heat flows into and out of the kiln to ≤ 1 %, we can estimate the percentage of the kiln feed calcined in the sp and in the kiln. the results of this heat conservation calculation show that about 25 % of the kiln feed is calcined in the sp, while the remaining 75 % is calcined in the kiln. in addition, using equations (20) and (21), we can evaluate that the heat released during the sintering process in the kiln is about 102.2 kcal/kg of clinker produced. we can also see that the heat carried by the clinker to the cooler, and by the gas from the fuel combustion and the calcination of the kiln feed out of the kiln to the sp, is very large. in the cooler, the heat carried by the clinker is mostly recovered by the combustion air; in the sp, it is used for heating and for the initial calcination of the kiln feed. moreover, it also appears that the heat loss to the environment (i.e. 92.73 kcal/kg of produced clinker) is significant (i.e., greater than in the cooler). the value obtained is still in the range of the results of radwan (2012) (50 – 140 kcal/kg of produced clinker) [5]. in general, the specific heat loss decreases with an increasing plant capacity. however, there are still opportunities to recover this heat loss for the purpose of increasing the plant efficiency.

3.4. sp and detailed per-cyclone mass and heat conservations

the focus of this research is to estimate the separation efficiencies of the cyclones in the sp with the smallest possible error in the difference between the heat flows in and out. this certainly cannot be obtained by a direct measurement in the field. hence, by using equations (55) to (60) with the data from the gas and material temperature measurements at the inlet and outlet of each cyclone, the approximate values of the separation efficiencies can be obtained. the results of calculating the mass and heat conservations in the sp are given in table 8. from the results, we can conclude that the difference between the outgoing and incoming heat rates is insignificant (0.64 %). the percentage of calcined kiln feed in the sp is around 25 %, with around 8 % in cyclone 3 and the remaining 17 % in the lowest cyclone.
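likewise, the quoted 0.64 % heat balance error of the sp follows directly from the totals of table 8:

```python
en_in, en_out = 747.632, 752.425   # sp heat totals, kcal/kg cli (table 8)
error = abs(en_out - en_in) / en_in
print(f"{error:.2%}")  # 0.64%
```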
using equations (52) to (54) and the solution methods described in equations (55) to (57), the mass and heat conservations in each cyclone can be obtained with a difference between the heat in and out of less than 1 %. the mass and heat conservations in cyclones 1 and 2 and in cyclones 3 and 4 are given in tables 9 and 10, respectively. from the results of these calculations, using equations (58) to (60), the estimated separation efficiency is 91.89 % for cyclone 2, 84.09 % for cyclone 3 and 79.51 % for the lowest cyclone. the separation efficiency of the top cyclone was evaluated by eq. (6). the results of the mass and heat conservation calculations for each cyclone are highly useful for designing equipment, controlling the process and modifying equipment (e.g., estimating the rate of the kiln feed flow to a calciner). some studies that simulate the process in a calciner [22, 51] require data on the entering kiln feed flow, which generally comes out of cyclone 3. unfortunately, these studies did not explain the origin of these data, which of course cannot be measured directly while the plant is operating. with the results of the calculations as shown in table 9, the process simulation of what occurs within the calciner is expected to be closer to the real conditions, because the data can be obtained in a more accurate way. the calculated separation efficiencies of these cyclones depend on the measured temperature data; for example, in cyclone 3, a variation of ±2 °c in the temperature of the material coming out of cyclone 2, or of the gas entering cyclone 3, practically does not affect the value of its separation efficiency. the lower separation efficiency at lower cyclone positions is a common design feature of cement plants. the top cyclone is designed with a high efficiency since its separation function is more dominant.

parameters | cyclone 1 mass (kg) | cyclone 1 heat (kcal) | cyclone 2 mass (kg) | cyclone 2 heat (kcal)
kiln feed | 1.6155 | 20.670 | |
kiln feed from upper cyclone | | | 1.6966 | 138.396
kiln feed dust from lower cyclone | 0.1665 | 23.406 | 0.3572 | 64.028
flue gas from lower cyclone | 1.7042 | 247.467 | 1.7042 | 311.981
kiln feed gas from lower cyclone | 0.5479 | 76.000 | 0.5479 | 97.440
total of mass & heat flow in | 4.0341 | 367.543 | 4.3059 | 611.845
separated kiln feed to the lower cyclone | 1.6966 | 138.396 | 1.8873 | 257.199
unseparated dust to upper cyclone/duct | 0.0805 | 6.887 | 0.1665 | 23.406
kiln feed gas | 0.5528 | 47.648 | 0.5479 | 76.000
flue gas including excess air | 1.7042 | 153.543 | 1.7042 | 247.467
kiln feed moisture evaporating process | | 2.894 | |
convection & radiation heat loss | | 20.550 | | 11.630
total of mass & heat flow out | 4.0341 | 367.543 | 4.3059 | 615.702

table 9. material and heat flows per kg of produced clinker of cyclones 1 and 2.

parameters | cyclone 3 mass (kg) | cyclone 3 heat (kcal) | cyclone 4 mass (kg) | cyclone 4 heat (kcal)
kiln feed from upper cyclone | 1.8873 | 257.199 | 1.8875 | 330.592
kiln feed dust from lower cyclone | 0.4029 | 92.839 | 0.1700 | 50.630
flue gas from lower cyclone | 1.7042 | 397.520 | |
kiln feed gas from lower cyclone | 0.5024 | 115.400 | |
flue gas from kiln | | | 1.7042 | 541.526
kiln feed gas from kiln | | | 0.4114 | 134.900
total of mass & heat flow in | 4.4968 | 862.958 | 4.1731 | 1057.648
separated kiln feed to the lower cyclone | 1.8875 | 330.592 | 1.5636 | 346.306
unseparated dust to upper cyclone/duct | 0.3572 | 64.028 | 0.4029 | 92.839
kiln feed gas | 0.5479 | 97.440 | 0.5024 | 115.400
flue gas including excess air | 1.7042 | 311.981 | 1.7042 | 397.520
heat of calcination process | | 43.950 | | 87.890
convection & radiation heat loss | | 12.330 | | 18.670
total of mass & heat flow out | 4.4968 | 860.321 | 4.1731 | 1058.625

table 10. material and heat flows per kg of produced clinker of cyclones 3 and 4.

in addition to controlling the process, the high efficiency of the top cyclone is also intended to reduce the amount of pollutant emissions released into the environment, whereas, for the 3 lower cyclones, the heat transfer function is more important. from tables 8 and 9, it can be seen that the heat carried by the gas coming out of the top cyclone is significantly high (more than 200 kcal/kg of clinker produced). this heat can be recovered for drying fine coal, limestone and clay, as reported by alfi et al. [52].

4. conclusions

equations and a new approach to precisely calculate the mass and heat conservations in a single string cement plant were presented in this paper. using the proposed new method, the detailed mass and heat conservation in each main equipment (e.g., cooler, kiln, sp and cyclones) can be evaluated based on the common data available in the control room and on measurements of the surface temperature of the main equipment. as a case study, operational data of the tonasa 2 cement plant were used for the detailed mass and heat conservation evaluation. the conclusions that can be drawn from the results of this study are as follows:

(1.) the advantage of this proposed method over the conventional method of calculating the mass and energy flows of a clinker plant is that it can estimate the mass and energy flows in each cyclone, so that the separation efficiency of each cyclone, which is impossible to measure, can be evaluated. this cannot be done with conventional methods, which only evaluate the sp as a whole. the estimated separation efficiency of each cyclone can be obtained with less than 1 % error in the heat conservation.

(2.) the proposed method can also be used to estimate several parameters that cannot be obtained by a direct measurement in a running plant (e.g. the combustion air temperature at the exit of the cooler, and the clinkerization heat).

(3.) the results obtained are expected to be used as additional data for controlling operations, designing new equipment and modifying the processes and dimensions of existing equipment.

(4.) the proposed calculation method can be applied to all types of modern cement clinker plant configurations, either with or without a calciner, including double strings. it is also expected to contribute to design process improvements, the operation simulation of a calciner and the control of the daily operation of a clinker plant.
list of symbols

a  surface area [m2]
ash  mass percentage of ash in coal [%]
en  heat flow rate [kcal/s, kw]
m  mass flow rate [kg/s]
nhv  nett heating value [kcal/kg, kj/kg]
η  material separation efficiency of the cyclone [%]
loi  loss on ignition [%]
h2o  mass flow rate of vapour [kg/s]
h_x(t_x)  enthalpy of substance x at temperature t_x [kcal/kg]
h_fg  enthalpy of evaporation [kcal/kg]
h_kf  enthalpy of kiln feed [kcal/kg]
(caco3)_kf  mass percentage of caco3 [%]
(mgco3)_kf  mass percentage of mgco3 [%]
(sio2)_kf  mass percentage of sio2 [%]
(al2o3)_kf  mass percentage of al2o3 [%]
(fe2o3)_kf  mass percentage of fe2o3 [%]
t  temperature [°c]
q_loss  radiation and convection heat loss [kcal/s, kw]
%calc_k  percentage of kiln feed calcined in the kiln [%]
%calc_sp  percentage of kiln feed calcined in the sp [%]
δe_cyclone  error of the energy conservation law (difference between the heat in-flow and heat out-flow of the cyclone)
δm_cyclone  error of the mass conservation law (difference between the mass in and mass out of the cyclone)

subscripts
amb  ambient
ash  ash in coal
c  clinker cooler
calc  calcination process
omoleye. thermodynamic analysis of raw mill in cement industry using aspen plus simulator. in iop conference series: materials science and engineering, vol. 413, p. 012048. 2018. doi:10.1088/1757-899x/413/1/012048. [9] a. kolip, a. f. savas. energy and exergy analyses of a parallel flow, four-stage cyclone precalciner type cement plant. international journal of physical sciences 5(7):1147 – 1163, 2010. doi:10.5897/ijps.9000219. [10] s. jonnalagadda, s. reddy. heat transfer analysis of recuperative air preheater. international journal of engineering research and management 4(1):105 – 110, 2017. 216 http://dx.doi.org/10.4186/ej.2017.21.7.15 http://dx.doi.org/10.15282/ijame.12.2015.14.0249 http://dx.doi.org/10.1088/1757-899x/413/1/012048 http://dx.doi.org/10.5897/ijps.9000219 vol. 61 no. 1/2021 new method evaluation of detail material and heat flows. . . [11] l. k. nørskov. combustion of solid alternative fuels in the cement kiln burner. ph.d. thesis, technical university of denmark, 2012. [12] h. mikulčić, e. von berg, m. vujanović, n. duić. numerical study of co-firing pulverized coal and biomass inside a cement calciner. waste management & research 32(7):661 – 669, 2014. doi:10.1177/0734242x14538309. [13] c. y. h. chao, p. c. w. kwong, j. h. wang, et al. co-firing coal with rice husk and bamboo and the impact on particulate matters and associated polycyclic aromatic hydrocarbon emissions. bioresource technology 99(1):83 – 93, 2008. doi:10.1016/j.biortech.2006.11.051. [14] verein deutscher zementwerke. activity report: process technology of cement manufacture. utilisation of used tyres in cement works. tech. rep., verein deutscher zementwerke, 2003 2005. [15] s. kourounis, s. tsivilis, p. e. tsakiridis, et al. properties and hydration of blended cements with steelmaking slag. cement and concrete research 37(6):815 – 822, 2007. doi:10.1016/j.cemconres.2007.03.008. [16] m. varma, p. gadling. additive to cement, a pozzolanic material-fly ash. international journal of engineering research 3(5):558 – 564, 2016. doi:10.17950/ijer/v5i3/010. [17] d. a. y. ghassan k. al-chaar, l. a. kallemeyn. the use of natural pozzolan in concrete as an additive or substitute for cement. erdc/cerl tr-11-46. tech. rep., federal university of technology, minna, niger. [18] a. allahverdi, s. salem. studies on main properties of ternary blended cement with limestone powder and microsilica. iranian journal of chemical engineering 4(1):3 – 13, 2007. [19] d. paa, p. darmanto. studi numerik pengaruh laju umpan kiln terhadap rugi tekanan dan efisiensi pemisahan top siklon suatu pabrik semen. in proceeding seminar nasional tahunan teknik mesin xiii (snttm xiii), pp. 625 – 630. 2014. [20] p. s. darmanto, i. m. astina, a. syahlan. design and implementation of deduster for improving fine coal quality in a cement plant. in the international conference on fluid and thermal heat conversion (ftec). tongyeong, south korea, 2009. [21] h. mikulčić, e. von berg, m. vujanović, n. duić. numerical study of co-firing pulverized coal and biomass inside a cement calciner. waste management & research 32(7):661 – 669, 2014. doi:10.1177/0734242x14538309. [22] a. c. kahawalage, m. c. melaaen, l.-a. tokheim. numerical modeling of the calcination process in a cement kiln system. in linköping electronic conference proceedings, vol. 138, pp. 83 – 89. reykjavik, iceland, 2017. doi:10.3384/ecp1713883. [23] g. borsuk, j. wydrych, b. dobrowolski. modification of the inlet to the tertiary air duct in the cement kiln installation. 
chemical and process engineering 37(4):517 – 527, 2016. doi:10.1515/cpe-2016-0042. [24] claus bauer. modernization and production increase with cement kilns. humbolt report. tech. rep., khd humboldt wedag ag, 2000. [25] x. d’hubert. latest burner profiles. global cement magazine 03, 2017. [26] n. gopani, a. bhargava. design of high efficiency cyclone for tiny cement industry. international journal of environmental science and development 2(5):350 – 254, 2011. doi:10.7763/ijesd.2011. v2.150. [27] h. mikulčić, e. von berg, m. vujanović, et al. numerical analysis of cement calciner fuel efficiency and pollutant emissions. clean technologies and environmental policy 15(3):489 – 499, 2013. doi:10.1007/s10098-013-0607-5. [28] a. t. gebremariam. cfd modelling and experimental testing of thermal calcination of kaolinite rich clay particles an effort towards green concrete. ph.d. thesis, aalborg university, denmark, 2015. [29] y. sonavane, e. specht. numerical analysis of the heat transfer in the wall of rotary kiln using finite element method ansys. in seventh international conference on cfd in the minerals and process industrie csiro, pp. 1 – 5. melbourne, australia, 2009. [30] e. copertaro, a. a. estupinan donoso, b. peters. a discrete-continuous method for predicting thermochemical phenomena in a cement kiln and supporting indirect monitoring. engineering journal 22(6):165 – 183, 2018. doi:10.4186/ej.2018.22.6.165. [31] z. dulaimi, h. hameed, a. ali, m. alfahham. investigation the effect of calcinations degree and rotary kiln gases bypass opining in the preheating system for dry cement industries. international journal of latest trends in engineering and technology 10(3):119 – 127, 2018. doi:10.21172/1.103.21. [32] f. l. smidth. preheater calciner systems. www.flsmidth-prod-cdn.azureedge.net/-/media/brochures/brochures-products/pyro/2000-2017/ preheater-calciner-systems.pdf?rev=3519484b-6b80-47ee-811a-b69f889f353f, 2011. [33] a. atmaca, r. yumrutaş. analysis of the parameters affecting energy consumption of a rotary kiln in cement industry. applied thermal engineering 66(1-2):435 – 444, 2014. doi:10.1016/j.applthermaleng.2014.02.038. [34] s. b. nithyananth, h. rahul. thermal heat audit of kiln system in a cement plant. international journal of modern engineering research 5(12):73 – 79, 2015. 217 http://dx.doi.org/10.1177/0734242x14538309 http://dx.doi.org/10.1016/j.biortech.2006.11.051 http://dx.doi.org/10.1016/j.cemconres.2007.03.008 http://dx.doi.org/10.17950/ijer/v5i3/010 http://dx.doi.org/10.1177/0734242x14538309 http://dx.doi.org/10.3384/ecp1713883 http://dx.doi.org/10.1515/cpe-2016-0042 http://dx.doi.org/10.7763/ijesd.2011. v2.150 http://dx.doi.org/10.1007/s10098-013-0607-5 http://dx.doi.org/10.4186/ej.2018.22.6.165 http://dx.doi.org/10.21172/1.103.21 www.flsmidth-prod-cdn.azureedge.net/-/media/brochures/brochures-products/pyro/2000-2017/preheater-calciner-systems.pdf?rev=3519484b-6b80-47ee-811a-b69f889f353f www.flsmidth-prod-cdn.azureedge.net/-/media/brochures/brochures-products/pyro/2000-2017/preheater-calciner-systems.pdf?rev=3519484b-6b80-47ee-811a-b69f889f353f http://dx.doi.org/10.1016/j.applthermaleng.2014.02.038 p. s. darmanto, i m. astina, a. k. wardhana et al. acta polytechnica [35] p. prasanth, g. sudhakar. analysis of heat loss in kiln in cement industry a review. in international conference on explorations and innovations in engineering & technology (iceiet –2016). [36] d. touil, h. belabed, c. frances, s. belaadi. 
heat exchange modeling of a grate clinker cooler and entropy production analysis. international journal of heat and technology 23(1):61 – 68, 2005. [37] l. farag. energy and exergy analyses of egyptian cement kiln plant with complete kiln gas diversion through by pass. international journal of advances in applied sciences 1(1):35 – 44, 2012. doi:10.11591/ijaas.v1i1.757. [38] f. l. smidth. dry process kiln. fl smidth inc, usa, 2004. [39] w. h. duda. cement data book i: international process in the cement industry. french & european pubs, 3rd edn., 1985. [40] d. k. fidaros, c. a. baxevanou, c. d. dritselis, n. s. vlachos. numerical modelling of flow and transport processes in a calciner for cement production. powder technology 171(2):81 – 95, 2007. doi:10.1016/j.powtec.2006.09.011. [41] f. l. smidth. plant services devision. heat conservations. international cement production seminar, lecture 5.13a. tech. rep., fl smidth inc, 1990. [42] t. chatterjee. burnability and clinkerization of cement raw mixes. in advances in cement technology, pp. 69 – 113. pergamon press, 1983. doi:10.1016/b978-0-08-028670-9.50009-0. [43] p. a. aisop, h. chen, h. tseng. the cement plant operations handbook. tradeship publications ltd, surrey, uk, 5th edn., 2007. [44] holcim group support ltd. cement manufacturing services. reference guide for process performance engineers. thermal process and materials technology, edition 3.0. tech. rep., holcim group support ltd., 2006. [45] a. howard, r. chris. elementary linear algebra. john wiley and sons inc., united states, 9th edn., 2005. [46] i. markovsky. low rank approximation algorithms, implementation, applications. springer-verlag london, 2012. doi:10.1007/978-1-4471-2227-2. [47] j. chen, m. shi. analysis on cyclone collection efficiencies at high temperatures. china particuology 1(1):20 – 26, 2003. doi:10.1016/s1672-2515(07)60095-5. [48] a.-n. huang, n. maeda, d. shibata, et al. influence of a laminarizer at the inlet on the classification performance of a cyclone separator. separation and purification technology 174:408 – 416, 2017. doi:10.1016/j.seppur.2016.09.053. [49] a. amalia, a. syahlan, p. s. darmanto. heat auditing of gresik and tonasa plants. internal project report of indonesian cement and concrete institute. tech. rep., indonesian cement and concrete institute, 2006. [50] a. amalia, a. syahlan, p. s. darmanto. design and implementation of hot gas system for raw coal drying in tonasa 3 plant. internal project report of indonesian cement and concrete institute. tech. rep., indonesian cement and concrete institute, 2006. [51] h. mikulčić, e. von berg, m. vujanović, et al. cfd analysis of a cement calciner for a cleaner cement production. chemical engineering transactions 29:1513 – 1518, 2012. doi:10.3303/cet1229253. [52] a. amalia, a. syahlan, p. s. darmanto. modification of hot gas utilization for drying lime stone and clay at tonasa 2 and 3 plants. final project report of indonesian cement and concrete institute. tech. rep., indonesian cement and concrete institute, 2007. 
acta polytechnica doi:10.14311/ap.2019.59.0322 acta polytechnica 59(4):322–351, 2019 © czech technical university in prague, 2019 available online at https://ojs.cvut.cz/ojs/index.php/ap

a comparative study of data-driven modeling methods for soft-sensing in underground coal gasification

ján kačur∗, milan durdán, marek laciak, patrik flegner

technical university of košice, faculty berg, institute of control and informatization of production processes, němcovej 3, 040 01 košice, slovak republic

∗ corresponding author: jan.kacur@tuke.sk

abstract. underground coal gasification (ucg) is a technological process that converts solid coal into gas directly underground, using injected gasification agents. many of the ucg process variables can be measured with common measuring devices, but some variables cannot be measured so easily, e.g., the temperature deep underground. to support a predictive control, it is also necessary to know the future impact of the individual control variables on the syngas calorific value. this paper examines the possibility of utilizing neural networks, multivariate adaptive regression splines and support vector regression to estimate ucg process data, i.e., the syngas calorific value and the underground temperature. it was found that, during the training with the ucg data, the svr with a gaussian kernel achieved the best results, while, during the prediction, the best result was obtained by the piecewise-cubic type of the mars model. the analysis was performed on data obtained during an experimental ucg with an ex-situ reactor.

keywords: underground coal gasification, syngas calorific value, underground temperature, time series prediction, machine learning, soft-sensing.

1. introduction

1.1. understanding ucg technology

underground coal gasification (ucg) represents an in-situ controlled combustion of coal in which valuable gases (i.e., syngas) are produced. the ucg is an alternative to traditional coal mining methods: it allows coal to be extracted from deep seams, seams affected by tectonic disturbances, low-grade seams, or seams with a thin stratum profile. various coal types can be gasified, e.g., lignite or bituminous coal. compared with traditional coal mining, the ucg offers low surface damage, a low solid waste discharge and lower emissions of so2 and nox to the air. for an industrial gasification, at least two boreholes have to be drilled (i.e., inlet and outlet). the inlet borehole serves as a supply well for the gasification agents (i.e., air, oxygen, and steam), and the outlet borehole as the exhaust for the produced syngas.
inlet and outlet boreholes are usually linked by various methods in order to create a gasification channel [1]. the main chemical processes that occur during the ucg are drying, pyrolysis, combustion, and gasification of solid hydrocarbons. for the ucg to perform well, the combustion reactions must produce sufficient energy to heat the reactants, to overcome the heat losses from the georeactor and to sustain the rate of the endothermic gasification reactions [2]. the ucg is performed as an autothermic process: the heat in the coalbed is generated by the injection of oxygen through the injection well and by the combustion reactions with carbon. the ucg essentially creates a spatially and thermally decomposed reaction zone in the coalbed, which overlaps the regions of coal oxidation, coal reduction, and coal pyrolysis. the incoming air causes the coal to burn; the exothermic process releases heat and consumes oxygen. when the coal is heated and co is produced, the boudouard reaction (i.e., co2 + c ⇒ 2co) is one of the most important chemical reactions. raw gas from the ucg consists predominantly of h2, co, co2, ch4, higher hydrocarbons, tar, impurities and small quantities of sox, nox, and h2s [3]. in terms of the calorific value, gases such as co, h2 and ch4 are the valuable ones, but higher hydrocarbons also contribute to the calorific value. the syngas can be used for generating electricity, producing synthetic natural gas or various chemical products.

1.2. measurement and monitoring in ucg

the efficiency of the coal-to-gas transformation depends on the ucg monitoring and control and on various coal seam parameters. the main reasons for the ucg monitoring are operating the technology more efficiently, increasing the quality of the produced gas, reducing the costs and meeting regulatory requirements. monitoring also informs about the effects of control decisions, injection rates, syngas composition, temperatures, pressures, cavity size, fractures, and about when to stop the gasification. in the ucg, various process variables can be monitored, and these variables can be used for a data-driven modeling of the process behaviour. in terms of the process control, it is necessary to monitor the volume flows and pressures (i.e., overpressure) of the injected oxidizers (i.e., air, oxygen, water vapor). on the outlet, the volume flow of the produced syngas and the regulated underpressure can be monitored. of course, it is necessary to monitor the concentrations of the syngas components that affect the calorific value (e.g., co, co2, ch4 and h2). the volume flow, pressure, and composition of the injected gasification agents can substantially affect the composition of the produced syngas [2]. the measurement of the temperature inside the oxidizing and reducing zones is the most problematic. there are two methods of indirect measurement of the underground temperature: from the current syngas composition [4], or based on the rules of heat transfer [5, 6]. recently, methods of monitoring the underground temperature by measuring carbon isotopes and by measuring emissions of radon to the surface have appeared [7]. figure 1 presents the basic scheme of the process variables measurement.

figure 1. scheme of measurement and control in ucg.
1.3. modeling and prediction in ucg

in the last years, there has been an increased demand for an online and accurate measurement of process variables that cannot be measured by conventional methods, i.e., variables in aggressive (e.g., high-temperature) or physically inaccessible (e.g., underground) environments. in the ucg, modelling and prediction methods similarly need to be applied in order to determine the process parameters deep underground. these variables are decisive for increasing the efficiency and the quality of the production. for this reason, different predictors and models, which can calculate the desired process variables from other observations, are developed and applied. these models often serve as support systems for the control of the technological process. predictive modeling usually uses statistics to predict the future behaviour of the process and is often associated with machine learning. the most popular are the methods of regression analysis, where the output is a regression model; almost every regression model can serve for the prediction. some early regression analyses of the ucg were previously performed in [8]. the time series prediction is a challenging research area with broad application prospects, and soft-sensing methods for the data estimation and prediction are widely used in the industry. various approaches to modelling and data prediction have been explored, but there is only scarce evidence of ucg models oriented towards process control and soft-sensing. soft sensors based on a data-driven predictive modelling are very useful in the industry, especially in operations where important process variables cannot be measured directly by a conventional hardware. soft sensors use various models that enable a real-time estimation of process variables without a hardware sensor. they can provide less expensive and quicker process data than slow and costly hardware devices; however, soft sensors can also be run in parallel with the hardware measuring devices [9]. well-known software algorithms that can be seen as soft sensors include kalman filters. more recent implementations of soft sensors in the ucg use neural networks (nn) or fuzzy computing. unfortunately, there is only scarce evidence of using machine learning methods for a prediction of the underground temperature, syngas calorific value or syngas composition in the ucg. for example, ji and shi [10] have used a hybrid radial basis function (rbf) nn as a learning scheme for the temperature prediction of a texaco gasifier; in order to increase the performance of the nn, the number of hidden neurons was determined by a fuzzy c-means algorithm and a particle swarm optimization algorithm. recently, uppal et al. [11–13] have proposed a control-oriented one-dimensional packed bed model of the ucg for an estimation of the syngas composition. this model works in connection with a sliding mode controller to maintain the desired syngas calorific value. learning schemes for the coal gasification to support the process control can also be found in [14]. a multiple neural network (mnn) for the syngas composition prediction with a dynamic principal component analysis was proposed in [15]. other researchers, e.g., guo et al. [16], have modeled the coal gasification with a hybrid nn: a model of the coal gasification was developed, incorporating a first-principles model with an nn parameter estimator.
the hybrid nn was trained with experimental data for two coals and gave a good performance in the process modeling. other effective methods have also been applied to the gasification. liu et al. [17] have proposed a data-driven modeling for fixed-bed intermittent gasification processes inside ugi gasifiers, using an enhanced lazy learning combined with a relevance vector machine. the authors have used the bayesian learning framework for modeling the gasifier’s temperature, and the effectiveness of this approach has been verified by a series of experiments based on data collected in practical fields. similarly, for the same problem of the data-driven modeling of the ugi gasification process, a variable structure of a genetic bp nn was used in [17]. ugi refers to a gasification process named after the ugi company; the ugi gasifier is an atmospheric fixed-bed, solid-state slag coal gasification equipment. the prediction of the syngas composition based on a thermodynamic model can be found in [18, 19]. in the past, a one-dimensional time-dependent numerical computational model of the ucg in a packed bed has also been investigated, with a verification on laboratory measurements [20]. the model, based on nonlinear partial differential equations, was capable of estimating the syngas composition and the temperature distribution. a novel dynamic soft-sensing method based on an impulse response template for the shell coal gasification process was developed in [21]; the proposed model can predict the syngas composition during the coal gasification. the application and comparison of the efficiency of various learning methods, i.e., nns, adaptive regression splines (mars) or support vector machines (svms), in the ucg data prediction has not yet been the subject of an extensive study, but similar applications in steel-making processes and biomass gasification are registered (e.g., [22, 23]). the purpose of the ucg monitoring is to provide a better understanding of how the syngas is produced; for this reason, it is necessary to know what temperature is reached in the underground oxidizing zone. this work examines potential learning methods that can be implemented in the proposal of a soft sensor for the data prediction in the ucg. an underground geo-reactor is different from other industrial plants because the coal seam was created by nature, and it is not possible to see what is in the underground while the ucg is in progress. for the ucg, new research and technologies aim to make the measurements of process variables faster and non-destructive, which would allow having a smart, non-intrusive quality sensor at hand. in this paper, a data set from an experimental ucg was used in order to train the data-driven models. in the data-driven (i.e., black-box) modelling, input and output data are used to create a statistical model. in order to find a prediction apparatus for the ucg data, the machine learning approach has been examined. one of the interesting advantages of machine learning is that a system, randomly initialized and trained on some data sets, will eventually learn good feature representations for a given task.
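the black-box soft-sensing workflow described here (train a statistical model on recorded input/output data, then query it in real time) can be summarized by a minimal sketch. the wrapper below is purely illustrative, assuming any regressor with a scikit-learn-style fit/predict interface; the class and method names are not from the original work.

```python
import numpy as np

class SoftSensor:
    """estimates a hard-to-measure process variable from measurable ones."""

    def __init__(self, model):
        # any regressor exposing fit(x, y) and predict(x)
        self.model = model

    def train(self, x_train, y_train):
        # x_train: observations (e.g., syngas concentrations),
        # y_train: target recorded during an instrumented experiment
        self.model.fit(x_train, y_train)
        return self

    def estimate(self, x_online):
        # called in real time, in parallel with (or instead of) a hardware sensor
        return self.model.predict(np.atleast_2d(x_online))
```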
in the following sections, three learning methods, the back-propagation nn (bpnn), mars and support vector regression (svr), are examined in order to support the soft-sensing in the ucg. the predictive methods are evaluated using statistical approaches and by calculating a performance index. the methods were applied to the data obtained from an experimental trial of the ucg, and the results from the three methods are compared with each other to determine which method is the most suitable for the ucg.

2. analysis of selected modeling methods

2.1. multilayer feed-forward neural networks

the inherent non-linear structure of the nn is well suited for solving many real-world problems. in recent years, several models of nns have been designed and optimized to solve specific problems. nn models have an excellent ability to learn from experience and are also suitable non-parametric methods that do not require many limiting assumptions. a multilayer feed-forward neural network is most commonly used as a universal means for classification and prediction. it consists of sensoric units, so-called input nodes, that form an input layer, one or more hidden layers with computing nodes, and an output layer, also with computing nodes. the signal passes through the network forwards across the individual layers. in a multi-layer feed-forward nn, all neurons of the previous layer are linked to each neuron of the following layer; however, there are no interconnections between the neurons at the level of the same layer, nor direct interconnections of the input layer neurons with the neurons that are two or more layers further. in this paper, the back-propagation algorithm was used for the nn modeling of the ucg. this simple gradient algorithm was proposed by [24, 25]. there are more approaches to explaining the principle of nns and the back-propagation method, e.g., using the projection pursuit regression (ppr) [26]. in this paper, a graph-oriented approach, with an extensive description in [27], has been used. the input and output scheme considered for the nn for the ucg data prediction is shown in section 4.1 (see figure 7). formally, the nn is defined as the oriented graph g = (v, e), where v = {v1, v2, ..., vn} is the set of vertices and e = {e1, e2, ..., em} the set of edges; the graph g contains n nodes (neurons) and m edges (connections). the set of neurons v is distributed into the disjunctive subsets v = vi ∪ vh ∪ vo, where vi contains ni input neurons, which are adjacent only to outgoing edges; vh contains nh hidden neurons, which are adjacent to outgoing as well as incoming edges; and vo contains no output neurons, which are adjacent only to incoming edges. for an acyclic nn, the neurons can be arranged into layers, where l1 = vi is the input layer (i.e., it contains only input neurons), l2, l3, ..., lt−1 are the hidden layers and lt is the output layer. the nn determined by the acyclic graph is usually chosen so that the neurons from two adjacent layers are joined together by all possible connections. the neurons and connections are rated by real numbers: each neuron v_i is rated by a threshold ϑ_i and an activity x_i; similarly, each connection (v_j, v_i) is rated by a weighting coefficient (or simply, a weight) w_ij.
the activities of the hidden and output neurons are determined by the following equations [27]:

x_i = t(ξ_i),   (1)

ξ_i = Σ_{j ∈ γ⁻¹(i)} w_ij x_j + ϑ_i,   (2)

where the summation runs through the neurons which are predecessors of the neuron v_i, and the variable ξ_i is called the potential of the neuron v_i. for defining the oriented graph g, we use the mapping γ that assigns to each vertex v ∈ v a subset γ(v) ⊂ v containing those neurons that are endpoints of the connections going out from the vertex v [28]; the neurons of the subset γ(v) are called the descendants of the vertex v in the graph g. the “inverse” mapping γ⁻¹ assigns to each vertex v ∈ v a subset γ⁻¹(v) ⊂ v composed of the predecessors of the vertex v in the graph g. the neuronal activities form a vector x = (x1, x2, ..., xn). this vector can be formally decomposed into three subsets containing the input, hidden, and output activities:

x = x_i ⊕ x_h ⊕ x_o.   (3)

the hidden activities are not explicitly mentioned; they only play the role of intermediate results. in general, to calculate the activities from the layer l_i (where i > 1), it is only necessary to know the activities from the lower layers l1, l2, ..., l_{i−1}. in this recursive manner, the activities of all neurons can be gradually calculated; the activities of the output neurons are calculated last. for this reason, the name “feed-forward nns” is used for nns represented by an acyclic graph. the adaptation of the nn is based on searching for such threshold and weighting coefficients which, for a given pair of an input vector of activities x_i and a desired output vector of activities x̂_o, i.e., x_i/x̂_o, and the calculated output vector x_o, minimize the difference between the output activities x_o and x̂_o. the vector x̂_o represents the desired measured (i.e., experimental) data. the aim of the adaptation process is to find such thresholds and weighting coefficients that minimize the objective function e. for more pairs of input and output vectors

x_i^(1)/x̂_o^(1), x_i^(2)/x̂_o^(2), ..., x_i^(r)/x̂_o^(r),   (4)

which represent the training set, the objective function has the following form:

e = Σ_{i=1}^{r} e_i = Σ_{i=1}^{r} ½ (x_o^(i) − x̂_o^(i))²,   (5)

where x_o^(i) is the output vector of the nn as a response to the input vector x_i^(i), and x̂_o^(i) is the desired output vector assigned to the input x_i^(i). this minimization of the non-linear objective function can be performed by many optimization methods known in numerical mathematics. the most effective are the so-called gradient methods, based on the use of a gradient of the objective function for the iterative construction of an optimal solution. when the objective function contains more than one pair of input-output vectors x_i/x̂_o (see equation (5)), the overall gradient is simply determined as the sum of the gradients over all pairs x_i/x̂_o of the training set (4):

grad e = Σ_{i=1}^{r} grad e^(i),   (6)

where the objective function e^(i) is defined for the i-th pair x_i/x̂_o of the training set. formally, the adapted nn is described by the coefficients determined as

(w, ϑ) = argmin_{(w, ϑ)} e(w, ϑ).   (7)

the settings of the applied bpnn and the evaluation of its performance on the ucg data prediction are discussed in section 4.1.
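the forward pass of equations (1)-(2) and one gradient step on the objective (5) can be sketched in a few lines of numpy. this is a minimal illustration for a single hidden layer with sigmoid activations, anticipating the momentum update (26) used later in section 4.1; the shapes, names and hyperparameter values are illustrative, not the authors' implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, w1, th1, w2, th2):
    # eq (2): potential = weighted sum of predecessor activities + threshold,
    # eq (1): activity = transfer function of the potential
    h = sigmoid(w1 @ x + th1)      # hidden activities
    y = w2 @ h + th2               # linear output neuron
    return h, y

def objective(y, y_hat):
    # eq (5) for a single pattern: e = 1/2 (x_o - x_o_hat)^2
    return 0.5 * np.sum((y - y_hat) ** 2)

def momentum_step(w, grad, dw_prev, lam=0.01, mu=0.7):
    # gradient descent with momentum, cf. eq (26):
    # w(k+1) = w(k) - lambda * dE/dw + mu * dw(k)
    dw = -lam * grad + mu * dw_prev
    return w + dw, dw
```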
2.2. multivariate adaptive regression splines

multivariate adaptive regression splines (mars) is a regression method that was developed by [29]; many works discussing the mars method have been published [26, 30–34]. it is a non-parametric regression technique that looks like an extension of linear models. this technique automatically models non-linearities and interactions between variables. it is also more flexible than linear models, is suitable for processing large data series, and can serve for a quick prediction of time series. mars is similar to recursive partitioning: the input data are divided into discontinuous regions of varying size, and a local model is created for each region. the size of each region is set by mars as required; the regions are smaller where the relationship between the input and the output is more complex. mars, like the recursive partitioning technique, performs an automatic selection of variables, so the model includes the important (useful) variables and excludes the non-essential ones (as opposed to an nn). the mars model is adapted based on the input training data, and a cross-validation is used to validate the resulting model. the resulting model may not only be stored in a pc but is also portable, and the impact of each predictor is easy to see (the model is easier to understand by humans). in order to create the mars model, the training data vectors, i.e., the inputs (observations) and outputs (targets), are needed. the training data are split into several splines on an equivalent interval basis [29]. the data are, in each spline, split into many subgroups, and several knots are created, which can be placed between different input variables or different intervals of the same input variable to separate the subgroups [31]. in mars, the regression function, called a basis function (bf), is approximated by smoothing splines for a general representation of the data in each subgroup. between any two knots, the model can characterize the data either globally or by using a linear regression. the bf is unique between any two knots and is shifted to another bf at each knot [29, 35]; the two bfs in two adjacent domains of the data intersect at the knot to make the model outputs continuous. mars creates a curved regression line to fit the data from subgroup to subgroup and from one spline to another. to avoid over-fitting and over-regressing, the shortest distance between two neighboring knots is predetermined, which prevents having too few data in a subgroup. in the mars method, the goal is to find the dependency of a variable y_i on one or more independent variables x_i. the following regression sample is considered:

d = {x_i, y_i}_{i=1}^{n} = {x_i^1, ..., x_i^p, y_i}_{i=1}^{n},   (8)

where x_i ∈ ℝ^p is the i-th vector of the independent variables, y_i (i = 1, ..., n) is the dependent variable, p is the number of independent variables, and n is the number of data points. the relationship between y_i and x_i (i = 1, ..., n) can be represented as:

y_i = f(x_i^1, x_i^2, ..., x_i^p) + ε = f(x_i) + ε,   (9)

where f is an unknown function and ε is an error (ε ∼ n(0, σ²)). the single-valued deterministic function f captures the joint predictive relationship of y_i on (x_i^1, x_i^2, ..., x_i^p). the additive stochastic component ε, whose expected value is zero, usually reflects the dependence of y_i on values other than (x_i^1, x_i^2, ..., x_i^p) that are neither controlled nor observed. in the one-dimensional case, splines are expressed in terms of the piecewise linear basis functions (x − t)+ and (t − x)+ with a knot at t; the “+” means the positive part. these are truncated linear functions, defined for x ∈ ℝ:
(x − t)+ = { x − t if x > t; 0 otherwise },   (t − x)+ = { t − x if x < t; 0 otherwise }.   (10)

each of the functions (x − t)+ and (t − x)+ is piecewise linear, with a knot at the value t; they are called linear splines, and the two functions together are named a reflected pair. in the multidimensional case, the idea is to form reflected pairs for each input component x^j of the vector x = (x^1, ..., x^j, ..., x^p)ᵀ, with knots at each observed value x_i^j of that input (i = 1, 2, ..., n; j = 1, 2, ..., p). thus, the set of constructed basis functions can be represented in the form:

c = { (x^j − t)+, (t − x^j)+ | t ∈ {x_1^j, x_2^j, ..., x_n^j}, j ∈ {1, 2, ..., p} }.   (11)

if all input data are different, then there are 2np basis functions in the set, each of them depending on only one variable x^j. for example, b(x) = (x^j − t)+ is regarded as a function over the entire input space ℝ^p. the basis functions used for the approximation are as follows:

b_m(x) = Π_{k=1}^{k_m} [ s_{k,m} · (x_{v(k,m)} − t_{km}) ]_+,   (12)

where k_m is the total number of truncated linear functions in the m-th basis function (i.e., it is the number of “splits” that gave rise to b_m), x_{v(k,m)} is the component of the vector x related to the k-th truncated linear function in the m-th basis function, t_{km} is the corresponding knot, and s_{k,m} ∈ {±1}. k_m is the user-defined degree of the interaction term, and s_{k,m} represents the direction of the univariate term, which can be positive or negative. the model-building strategy is like a forward stepwise linear regression, but instead of using the original inputs, it is allowed to use functions from the set c and their products. therefore, the mars model can be expressed by the following equation [26]:

y = f̂(x) + ε = c_0 + Σ_{m=1}^{M} c_m b_m(x) + ε,   (13)

where y is the output variable, x is the vector of input variables, M is the number of basis functions in the model (i.e., the number of spline functions), c_0 is the coefficient of the constant basis function b_0, and the sum runs over the basis functions b_m produced by the algorithm that implements the stepwise forward part of the mars strategy, incorporating the modification to recursive partitioning. the coefficients c_m are estimated by minimizing the residual sum of squares (i.e., by a standard linear regression). b_m(x) is the m-th function in c, or a product of two or more such functions. the most important thing in this model is the choice of the basis functions. in the beginning, the model contains the single function b_0(x) = 1, and all functions from the set c are possible candidates for an inclusion in the model. as in the linear regression, after setting b_m, the coefficients c_m can be found by the method of least squares. another subroutine of mars performs the backward deletion strategy, where each iteration deletes one unnecessary (i.e., redundant) basis function. the inner loop of the algorithm selects one function to be deleted: the function whose removal either improves the fit the most or degrades it the least. however, the constant basis function b_0(x) = 1 is never removed. the settings of mars, its resulting forms, and the evaluation of its performance are discussed in section 4.2.
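a toy numpy illustration of the building blocks (10)-(13): reflected pairs of hinge functions at fixed knots are assembled into a design matrix, and the coefficients c_m are obtained by least squares. the knots and data here are synthetic, and the forward/backward search of mars is intentionally omitted.

```python
import numpy as np

def reflected_pair(x, t):
    # eq (10): the truncated linear functions (x - t)+ and (t - x)+
    return np.maximum(x - t, 0.0), np.maximum(t - x, 0.0)

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, 200)
y = np.sin(x) + 0.1 * rng.standard_normal(200)

# design matrix: constant basis b_0 = 1 plus reflected pairs at a few knots
columns = [np.ones_like(x)]
for t in (2.5, 5.0, 7.5):
    left, right = reflected_pair(x, t)
    columns.extend([left, right])
b = np.column_stack(columns)

# coefficients c_m by least squares, as in eq (13)
c, *_ = np.linalg.lstsq(b, y, rcond=None)
y_hat = b @ c
```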
2.3. support vector regression

back-propagation nns are capable of representing general non-linear functions, but their disadvantage is an often very difficult training: practically, there is always a risk of getting stuck in a local minimum of the error function, and, in addition, the learning is highly complicated by the search for a high number of weights in a multidimensional space. an alternative and relatively new approach are the so-called support vector machines (svms). the svms are used for time series prediction and classification tasks. these methods belong to the field of the so-called kernel machines and exploit the benefits provided by effective algorithms for finding a linear boundary, while being able to represent highly complex non-linear functions. kernel function methods try to find an optimal linear separator; in an svm algorithm, the optimal linear separator is searched for using quadratic programming. in support vector regression (svr), the data x ∈ X are mapped into a high-dimensional feature space F via a nonlinear mapping φ, and a linear regression is performed in this space [36, 37]. the input data (i.e., observations) are represented by the vectors x_1, ..., x_l, where l denotes the size of the sample. taking into account a regression with one target variable y, the observations on the examined process can be written as a sequence of pairs (x_1, y_1), ..., (x_i, y_i), ..., (x_l, y_l), x_i ∈ ℝ^n, y_i ∈ ℝ. the vector x_i represents one pattern of input observations, x_i = (x_i1, x_i2, ..., x_in); in the case of observing the process variables during the ucg, this vector may contain the measured data from the database. a linear regression in the high-dimensional (feature) space corresponds to a non-linear regression in the low-dimensional input space ℝ^n, and the whole problem of the svr can be rewritten in terms of dot products in the low-dimensional input space [38]:

f(x) = Σ_{i=1}^{l} (α_i − α_i*)(φ(x_i) · φ(x)) + b = Σ_{i=1}^{l} (α_i − α_i*) k(x_i, x) + b.   (14)

given two points x_i, x_j ∈ X, the function that returns the inner product between their images in the space F is known as the kernel function; in equation (14), the kernel function k(x_i, x_j) = (φ(x_i) · φ(x_j)) is introduced. the parameters α_i, α_i* are the solutions of the quadratic programming problem [37]; they have an intuitive interpretation as forces pushing and pulling the estimate f(x_i) towards the measurements y_i. the parameter b is a threshold. the common kernels are summarized in table 1:

gaussian (rbf) kernel: k(x_i, x_j) = exp(−γ ||x_i − x_j||²)
linear kernel: k(x_i, x_j) = x_iᵀ x_j
polynomial kernel: k(x_i, x_j) = (γ (x_iᵀ x_j + 1))^d
sigmoid kernel: k(x_i, x_j) = tanh(γ x_iᵀ x_j + d)

table 1. overview of common kernels used by svr (γ is a kernel parameter controlling the sensitivity of the kernel function, and d is an integer).

in this paper, the linear epsilon-insensitive svm (ε-svm) regression has been used. for this special cost function, the lagrange multipliers α_i, α_i* are often sparse, i.e., they result in non-zero values after the optimization only if they are on or outside the boundary, which means that they fulfill the karush-kuhn-tucker (kkt) conditions. the ε-insensitive cost function is given by

c(f(x) − y) = { |f(x) − y| − ε for |f(x) − y| ≥ ε; 0 otherwise }.   (15)

in the ε-svm regression, the set of training data includes the predictor variables and the observed response values. the goal is to find a function f(x) that deviates from y by a value no higher than ε for each training point x, and, at the same time, is as flat as possible.
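the ε-insensitive cost of equation (15) is simple to state in code; the vectorized function below is a sketch for illustration, not part of any svm library.

```python
import numpy as np

def eps_insensitive_cost(f_x, y, eps=0.1):
    # eq (15): zero inside the epsilon tube, linear outside it
    return np.maximum(np.abs(f_x - y) - eps, 0.0)

# example: residuals of 0.05 and 0.30 with eps = 0.1 cost 0.0 and 0.2
print(eps_insensitive_cost(np.array([1.05, 1.30]), np.array([1.0, 1.0])))
```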
in the svr, the kernel matrix k = (k(x_i, x_j))_{i,j=1}^{l} (x_i, x_j ∈ X) is introduced. it is a symmetric positive definite matrix of the inner products between all pairs of points {x_i}_{i=1}^{l}; each element represents the inner product of the predictors transformed by φ. however, it is not necessary to know φ, because the kernel function can generate the kernel matrix directly. using this approach, the non-linear svr finds the optimal function f(x) in the transformed predictor space. the prediction of new values is based on a function that depends only on the support vectors:

f(x) = Σ_{i=1}^{l} (α_i − α_i*) k(x_i, x) + b,   (16)

where α_i and α_i* are non-negative lagrange multipliers for each observation x_i; the threshold b can be determined from the lagrange multipliers. the lagrange coefficients can be found by the minimization of the following function [39]:

l(α) = ½ Σ_{i=1}^{l} Σ_{j=1}^{l} (α_i − α_i*)(α_j − α_j*) k(x_i, x_j) + ε Σ_{i=1}^{l} (α_i + α_i*) − Σ_{i=1}^{l} y_i (α_i* − α_i),   (17)

subject to the constraints

Σ_{i=1}^{l} (α_i − α_i*) = 0;  ∀i: 0 ≤ α_i ≤ c;  ∀i: 0 ≤ α_i* ≤ c.   (18)

the kkt complementarity conditions are

∀i: α_i (ε + ξ_i − y_i + f(x_i)) = 0;
∀i: α_i* (ε + ξ_i* + y_i − f(x_i)) = 0;
∀i: ξ_i (c − α_i) = 0;
∀i: ξ_i* (c − α_i*) = 0,   (19)

where the slack variables ξ and ξ* for each point ensure that the regression errors up to the values of ξ and ξ* meet the desired conditions. the kkt complementarity conditions are the optimization constraints required to obtain the optimum. these conditions indicate that all observations strictly inside the epsilon tube have the lagrange multipliers α_i = 0 and α_i* = 0; the observations with non-zero lagrange multipliers are called support vectors. the constant c is the box constraint, a positive numeric value that controls the penalty imposed on observations that lie outside the epsilon margin (ε) and helps to prevent overfitting, i.e., it acts as a regularization [40]. the minimization problem can be solved by common quadratic programming techniques, e.g., the chunking and working set method, the sequential minimal optimization (smo), or the iterative single data algorithm (isda). for the modeling of the ucg process, the epsilon-insensitive svm (ε-svm) regression has been used, and the smo algorithm has been used to solve the optimization problem.
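as a hedged usage example, the ε-svr with a gaussian (rbf) kernel can be exercised with scikit-learn, whose libsvm backend solves the dual with an smo-type algorithm; the data are synthetic and the hyperparameter values (c, ε, γ) are placeholders, not the settings used in this study.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(1)
x = rng.normal(size=(300, 5))                 # e.g., five gas concentrations
y = x[:, 0] - 2.0 * x[:, 1] + 0.1 * rng.normal(size=300)

# scaling matters for rbf kernels, hence the pipeline
model = make_pipeline(StandardScaler(),
                      SVR(kernel="rbf", C=10.0, epsilon=0.1, gamma="scale"))
model.fit(x, y)
y_hat = model.predict(x)
```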
3. experimental ucg in ex-situ reactor

for the purpose of verifying the ucg, laboratory equipment has been created. its base is an experimental coal gasifier, i.e., an ex-situ reactor or so-called syngas generator (see figure 2), and a set of devices for measurement and control. the ex-situ reactor was constructed so that the bedding of coal with overburden and under-burden layers can simulate a real coal seam. several experiments, as trials of a real ucg, were performed with the ex-situ reactor. this laboratory gasification equipment was well described in [41, 42]; similar trials of the ucg on a laboratory ex-situ reactor can be found in [43, 44]. various gasification agents (i.e., oxidizers), ways of bedding the coal and ways of monitoring the ucg process were tested there. the gasification in the ex-situ reactor is based on the control of the flow of the inlet gasification agents (i.e., air and oxygen) and of the pressure on the outlet. lignite from a slovak mine, suitable for the ucg, was gasified; the composition of the coal that was gasified and the factors that affect the ucg can be found in [41].

figure 2. experimental coal gasifier (ex-situ reactor).

the influence of various gasification agents (i.e., their flows and pressures) on the syngas quality was discussed in [2]. the bedding of the coal in the ex-situ reactor was made on the basis of the rules of the similarity theory; the goal was to obtain a similarity with the real coal seam. blocks of coal merged into one coal unit were used when preparing the physical model of the coalbed. in order to make the physical model airtight, the layers of over-burden and under-burden contained sand mixed with water glass. in addition, the reactor was tilted at 10°, in order to get as close as possible to the real coal seam; the coal used in the experiment was extracted from an underground coalbed with the same inclination. this coalbed (i.e., in the mine cigel, slovakia, overburden bed) has the potential to be mined in the future by the ucg. air, as the primary oxidant, was blown into a pressure vessel by two compressors. the pressure of the air injected into the ex-situ reactor was adjusted by a reducing valve. the air flow was controlled by a servo valve and measured by a differential pressure sensor with a centric orifice installed in the pipeline; similarly, the flow of the produced syngas was measured, but with a segment orifice. the oxygen flow and pressure were controlled by two reducing valves. pressure cylinders were used as the source of the technical oxygen. the technical oxygen was injected as an auxiliary oxidant into the mixing chamber, where it was mixed with air, and the mixture was then injected into the ex-situ reactor. the pressures of the oxidants were measured by a set of pressure transducers. k-type thermocouples were used to measure the temperatures in the coal model.
the measured process variables (i.e., flows of gasification agents, pressures, temperatures, syngas composition and calorific value) were recorded in a database which may later be processed as a set of time series. the pc with cpu intel®coretm i54300u (2.9 ghz) and 8 gb ram was used for all calculations. 4. results and discussion the flowchart for the proposed soft-sensing in the ucg is shown in figure 4. this paper focuses on the evaluation of the potential predictive methods that could be used in soft-sensing. machine learning models are generalized to data similar to those on which they were trained. although static models, which are timeindependent, i.e., they work on a single data set, have been used, their application to the dynamic process should be improved by the continual updating of the training set with the online data. the development of practical on-line prediction soft sensors consists of two stages: training and on-line prediction. a data set from the experimental ucg was used in order to train a data-driven modeling algorithm. three prediction methods analysed in the previous section have been applied in order to predict the underground temperature in the oxidizing zone of ex-situ reactor and syngas calorific value. the data obtained during the experimental ucg with the laboratory equipment have been used in the analyses. as an underground temperature in the oxidizing zone, the highest temperature along the gasification channel during the experiment is considered. observations and target data measured during a one well-running experiment were used in this paper for the analysis. when the learning method is applied, it is convenient to divide observed data into a training set atrain and test set atest (i.e., validation set). when choosing a training and test sets, it is recommended to ensure that the data used for the model testing covers a significant range of variations that are supposed to be encountered during the use of the soft sensor. given that there is no exact rule in the literature how to divide the data into a training and test set for specific learning methods (there are some different recommendations and instruction for experimentation), the models were tested on data from the 10 % and 20 % of the experiment. in general, a higher performance of selected methods was obtained with more data for training (i.e., the ratio between the training set and the test set was 90:10). to compare the performance of three different methods, this paper presents only results when 10 % of the experiment was used for the test. the used test set consisted of 10 % of the data from the end of the experiment. the whole experiment lasted for 70 hours. in simulations, there were 4201 patters in total and 3781 patterns were used for training. overview of all regarded observations and targets shows table 2. the pressure on the outlet is the relative pressure measured on the output pipe from the gasifier. this pressure can also be negative when the power of the exhaust fan is increased. the behaviour of the measured data from the ucg experiment is shown in figure 5 and figure 6. due to the fact that the highest temperature along the gasification channel is considered to be the temperature in the oxidizing zone weakly correlates with the operating variables (i.e., flows of gasification agents and pressure) and it has some inertia, the decision to ensure its estimation from the composition of syngas measured on the outlet was made. 
an overview of all the regarded observations and targets is shown in table 2. the pressure on the outlet is the relative pressure measured on the output pipe from the gasifier; this pressure can also be negative when the power of the exhaust fan is increased. the behaviour of the measured data from the ucg experiment is shown in figure 5 and figure 6. the highest temperature along the gasification channel, which is considered to be the temperature in the oxidizing zone, correlates only weakly with the operating variables (i.e., the flows of the gasification agents and the pressure) and has some inertia; therefore, the decision was made to estimate it from the composition of the syngas measured on the outlet. since the composition of the syngas depends on the temperature in the oxidizing zone, there is an inverse way to determine the temperature that corresponds to the measured concentrations. this decision was also supported by the existence of a large number of uncertainties that occur in the ucg. the propagation of the temperature in the underground is not uniform, i.e., there are different temperatures in the coal, along the gasification channel, and in the under-burden and overburden; in addition, there is a continual shift of the combustion front. due to the changing conditions in the underground gasifier (e.g., groundwater, cracks, fractures, gas leaks, the shift of the combustion front and the surface subsidence), the ucg has to be controlled under conditions of uncertainty.
figure 3. scheme of gasification equipment with one ex-situ reactor (modified after [41]).

figure 4. the principle of soft-sensing in ucg.

figure 5. time series of measured syngas composition divided into training and measured data set.

figure 6. time series of measured control variables divided into training and measured data set.

target (output process variable y) | observations (input process variables x)
calorific value of syngas (mj/nm³) | x1: air flow (nm³/h); x2: oxygen flow (nm³/h); x3: pressure on outlet (i.e., underpressure/overpressure) (pa)
underground temperature (°c) | x1: concentration of co in syngas (%); x2: concentration of co2 in syngas (%); x3: concentration of h2 in syngas (%); x4: concentration of ch4 in syngas (%); x5: concentration of o2 in syngas (%)

table 2. observations and targets used in modeling.

in such conditions, the process of measuring the process variables, the identification and, finally, the automated control is more complicated. these uncertainties can be partly reduced by a more detailed geological survey; but even this does not guarantee the elimination of such uncertainties, as evidenced by the long-term experience with the traditional coal mining technology. the predictive methods were evaluated using statistical approaches and by calculating the performance index. the following indicators were used to compare the performance of the individual prediction methods. the variable y represents the measured target and ŷ its prediction; ȳ is the average of the target values y_i, and the average of the predicted values ŷ_i (i = 1, ..., n) is denoted analogously. n represents the number of patterns in the training or testing set.

• coefficient of correlation (r_yŷ): this coefficient expresses the strength of the linear relationship between two variables.
it determines the degree of dependence of two variables and acquires a value from the interval (−1, 1). its definition is based on the consideration of the sums of deviations of the individual values of the two correlated characters from their averages. several equations are used to calculate the correlation coefficient, but the following one is used in this work:

r_yŷ = Σ_{i=1}^{n} (y_i − ȳ)(ŷ_i − m_ŷ) / ( √(Σ_{i=1}^{n} (y_i − ȳ)²) · √(Σ_{i=1}^{n} (ŷ_i − m_ŷ)²) ),   (20)

where m_ŷ denotes the average of the predictions ŷ_i. if r_yŷ = 1, the dependence is completely direct; if r_yŷ = −1, the correlation is completely indirect; if r_yŷ = 0, the variables are independent. more precisely: r_yŷ < 0.3 low tightness; 0.3 ≤ r_yŷ < 0.5 slight tightness; 0.5 ≤ r_yŷ < 0.7 significant tightness; 0.7 ≤ r_yŷ < 0.9 high tightness; 0.9 ≤ r_yŷ very high tightness.

• coefficient of determination (r²_yŷ): it expresses the degree of the causal dependence of two variables and is a statistic that gives some information about the goodness of the fit of a model. the correlation coefficient is the square root of the determination coefficient. the degrees of tightness depending on the coefficient of determination are as follows: r²_yŷ < 0.1 low tightness; 0.1 ≤ r²_yŷ < 0.25 slight tightness; 0.25 ≤ r²_yŷ < 0.50 significant tightness; 0.5 ≤ r²_yŷ < 0.80 high tightness; 0.8 ≤ r²_yŷ very high tightness. r²_yŷ = 1 indicates that the model perfectly fits the measured target data.

r²_yŷ = 1 − [ (1/n) Σ_{i=1}^{n} (y_i − ŷ_i)² ] / [ (1/n) Σ_{i=1}^{n} (y_i − ȳ)² ].   (21)

• relative root mean squared error (rrmse): this error can be calculated by dividing the root mean squared error (rmse) by the average of the actual values y_i. the rmse is the square root of the mean squared error (mse):

rmse = √mse = √( (1/n) Σ_{i=1}^{n} (y_i − ŷ_i)² ).   (22)

the mse is a useful statistical measure for assessing the accuracy of the prediction. the rrmse can be calculated by the following equation [34, 47]:

rrmse = rmse / ( (1/n) Σ_{i=1}^{n} y_i ) × 100 = √( (1/n) Σ_{i=1}^{n} (y_i − ŷ_i)² ) / ( (1/n) Σ_{i=1}^{n} y_i ) × 100 (%).   (23)

• mean absolute percentage error (mape): this statistical indicator expresses the percentage prediction error. it can be calculated as follows:

mape = (1/n) Σ_{i=1}^{n} ( |y_i − ŷ_i| / |y_i| ) × 100 (%).   (24)

this error has certain disadvantages: at zero values of y_i, a division by zero can occur, and the mape produces undefined values.
it is based on the gradual minimizing weights so that the error e (5) is reduced. this error is minimized iteratively in different epochs so that the required accuracy ε was achieved. after learning, it is desirable that the output from the nn network is equal to the required output or to get it as close as possible, considering all input patterns from the set of atrain. the ability of the nn to determine the output for inputs outside the atrain set is called the generalization. this is also the main role of the nn in the prediction. weights are modified using the set atrain and, using atest, the generalization error is detected. five input variables i.e., concentration of o2 (x1), co2 (x2), co (x3), h2 (x4) and ch4 (x5) (measured in vol. %) have been regarded for the initialization of input neurons and the prediction of the underground temperature in the ucg. two stationary analysers that measured concentrations of only these five gases have been used during the ucg experiment. these concentrations have been considered to be the most significant. in the prediction of the syngas calorific value, three input process-relevant variables were used i.e., injected air flow (x1), flow of supplementary oxygen (nm3/h) (x2) and pressure on outlet (pa) (x3). these variables are adjustable by the automatized control system. the general scheme of the nn considered from the ucg data prediction is shown in figure 7. the methods of batch gradient descent with momentum and gradient descent with variable learning rate have been applied. in this approach, weights and biases are updated according to the gradient descent momentum and an adaptive learning rate. it is the most widely used way to realize this minimization of error (5)) within gradient optimization methods, in which weighting and threshold factors are recurrently updated according to following equations [27]: w (k+1) ij = w (k) ij −λ ∂e ∂wij + µ∆w(k)ij ϑ (k+1) j = ϑ (k) j −λ ∂e ∂ϑj + µ∆ϑ(k)j (26) where parameter λ > 0 represents the learning rate and must be small enough to ensure the monotone convergence of the optimization algorithm and, at the same time, large enough to provide a sufficiently high convergence rate. calculating the partial derivations ∂e ∂wij and ∂e ∂ϑj for the entire nn, running recurrently from the highest to the lowest layer, i.e., against the direction of dissemination of information in the nn, runs from the lowest to the highest layer. initial values of the threshold and weighting coefficients ϑ(0)j and w (0) ij are randomly generated from a small center-to-zero interval e.g., from an open interval (-1, 1). the last member µ in (26) represents the socalled moment member that is determined by the difference of the coefficients from the last two iterations, ∆w(k)ij = w (k) ij − w (k−1) ij and ∆ϑ (k) j = ϑ (k) j − ϑ (k−1) j . the momentum is important for the “skip” of the local minima in the initial optimization phase. the value of the parameter µ is usually chosen from the interval 0.5 ≤ µ ≤ 0.7. the adaptive learning rate tries to maintain a stable learning and a largest size of the learning step. the mean squared error (mse) was used as the function for the error calculating during the training. the number of hidden layers and neurons is usually determined on the basis of experimentation, where an nn model is selected for which etest is minimal. however, a small number of hidden layers in the nn may not well model non-linearities in the data. it is, therefore, necessary to look for an optimal number of hidden neurons. 
two variants of the nn, i.e., with one and with two hidden layers, were used; the numbers of neurons were estimated in a previous experimentation. it has also been tried to set the number of neurons in the hidden layer to 2m + 1, where m is the number of input neurons. in all variants that were tried, only one neuron was used in the output layer. the momentum constant was set to 0.9. within the results, the goal was also to show what impact the number of hidden neurons has on the quality of the prediction. the results of the training and testing are shown in table 3.

table 3. results of simulations with nns where 10 % of the experiment was used to test. each row lists, for training and for testing, the values ryy, r2yy, rrmse (%), pi, mape (%), mse and rmse; the time (s) refers to the training.

temperature:
layers 2, neurons 5000:15, inputs co, co2 — training: 0.2707, 0.0733, 8.0062, 6.3008, 6.4108, 6484.2537, 80.5249, time 44.8312; testing: 0.1985, 0.0394, 8.6818, 7.2440, 7.2707, 6938.2153, 83.2960
layers 2, neurons 5000:15, inputs co, co2, h2, ch4, o2 — training: 0.0567, 0.0032, 11.7869, 11.1542, 9.1223, 14054.2843, 118.5508, time 50.8077; testing: 0.0611, 0.0037, 18.2237, 17.1749, 10.8640, 30570.4682, 174.8441
layers 2, neurons 800:8, inputs co, co2 — training: 0.2043, 0.0417, 8.0649, 6.6967, 6.6452, 6579.6878, 81.1153, time 6.8560; testing: 0.6755, 0.4563, 8.5075, 5.0775, 7.4526, 6662.3759, 81.6234
layers 2, neurons 800:8, inputs co, co2, h2, ch4, o2 — training: 0.1820, 0.0331, 8.8428, 7.4810, 6.9526, 7910.2629, 88.9397, time 7.1921; testing: 0.0810, 0.0066, 9.9893, 9.2407, 6.8042, 9185.3462, 95.8402
layers 1, neurons 5000, inputs co, co2 — training: 0.0587, 0.0034, 69.5635, 65.7078, 53.1500, 489523.6382, 699.6597, time 38.4061; testing: 0.1021, 0.0104, 95.0738, 105.8881, 73.4974, 832048.0252, 912.1667
layers 1, neurons 5000, inputs co, co2, h2, ch4, o2 — training: 0.1865, 0.0348, 42.8462, 36.1105, 31.0099, 185709.9328, 430.9408, time 43.1599; testing: 0.1611, 0.0260, 70.0982, 83.5614, 48.0393, 452313.8233, 672.5428
layers 1, neurons 800, inputs co, co2 — training: 0.0428, 0.0018, 23.6400, 22.6700, 17.8908, 56533.4172, 237.7676, time 1.7451; testing: 0.1623, 0.0263, 48.7514, 41.9446, 35.9242, 218776.5470, 467.7356
layers 1, neurons 800, inputs co, co2, h2, ch4, o2 — training: 0.0778, 0.0061, 16.0174, 14.8608, 11.9061, 25953.3424, 161.1004, time 1.8475; testing: 0.6381, 0.4072, 28.3447, 17.3032, 22.0818, 73955.5871, 271.9478
layers 1, neurons 5, inputs co, co2 — training: 0.1495, 0.0223, 8.1083, 7.0538, 6.7466, 6650.7120, 81.5519, time 0.6702; testing: 0.3373, 0.1138, 8.4245, 6.2996, 7.1668, 6533.0962, 80.8276
layers 1, neurons 11, inputs co, co2, h2, ch4, o2 — training: 0.4210, 0.1772, 7.4552, 5.2466, 6.1409, 5622.4662, 74.9831, time 0.7461; testing: 0.6787, 0.4606, 7.4301, 4.4261, 5.8129, 5081.7732, 71.2866

calorific value:
layers 2, neurons 5000:15, inputs air, o2 — training: 0.5588, 0.3122, 33.3750, 21.4108, 37.6195, 9.6441, 3.1055, time 55.2753; testing: 0.0366, 0.0013, 30.4626, 29.3861, 30.6660, 12.4353, 3.5264
layers 2, neurons 5000:15, inputs air, o2, outlet pressure — training: 0.1311, 0.0172, 119.9483, 106.0431, 126.1949, 124.5676, 11.1610, time 73.1432; testing: 0.0270, 0.0007, 86.6737, 84.3988, 70.8208, 100.6693, 10.0334
layers 2, neurons 800:8, inputs air, o2 — training: 0.7383, 0.5451, 22.6638, 13.0378, 28.3859, 4.4472, 2.1088, time 7.1023; testing: 0.0999, 0.0100, 24.8010, 22.5486, 22.6788, 8.2425, 2.8710
layers 2, neurons 800:8, inputs air, o2, outlet pressure — training: 0.7392, 0.5464, 22.6569, 13.0271, 28.0108, 4.4444, 2.1082, time 7.5802; testing: 0.6906, 0.4769, 21.5027, 12.7190, 19.5456, 4.5877, 2.1419
layers 1, neurons 5000, inputs air, o2 — training: 0.0398, 0.0016, 159.1270, 153.0306, 183.3015, 219.2321, 14.8065, time 44.2146; testing: 0.1474, 0.0217, 173.3974, 151.1207, 148.0262, 402.9098, 20.0726
layers 1, neurons 5000, inputs air, o2, outlet pressure — training: 0.2339, 0.0547, 133.3327, 108.0600, 135.6404, 153.9182, 12.4064, time 44.6760; testing: 0.4976, 0.2476, 139.3664, 93.0571, 113.2722, 260.2789, 16.1332
layers 1, neurons 800, inputs air, o2 — training: 0.4106, 0.1686, 55.1188, 39.0761, 61.2308, 26.3036, 5.1287, time 7.7348; testing: 0.1819, 0.0331, 87.2729, 73.8429, 75.1442, 102.0661, 10.1028
layers 1, neurons 800, inputs air, o2, outlet pressure — training: 0.4590, 0.2107, 51.0458, 34.9874, 51.7013, 22.5599, 4.7497, time 7.9496; testing: 0.6693, 0.4479, 57.1718, 34.2496, 47.0393, 43.8013, 6.6183
layers 1, neurons 5, inputs air, o2 — training: 0.6714, 0.4507, 24.9076, 14.9025, 32.9435, 5.3713, 2.3176, time 0.9649; testing: 0.0401, 0.0016, 19.8082, 19.0445, 22.9766, 5.2579, 2.2930
layers 1, neurons 7, inputs air, o2, outlet pressure — training: 0.7281, 0.5301, 23.0707, 13.3503, 28.6974, 4.6083, 2.1467, time 1.0306; testing: 0.7187, 0.5166, 15.8219, 9.2055, 16.8804, 3.3546, 1.8316

figure 7. proposal of the neural network considered for the ucg data prediction.

when training and testing the prediction of the underground temperature, it can be seen that the lowest values of the statistical indicators (i.e., rrmse, mse, mape and rmse) were obtained in the case of one hidden layer with 11 neurons (i.e., 2m + 1, according to a general recommendation, where m is the number of input neurons). in this case, the rrmse and the performance index had the lowest values (pi = 4.42 for testing and pi = 5.24 for training). the value of the coefficient of determination was the highest in this case (r2yy = 0.46 for testing and r2yy = 0.17 for training). this result was obtained when five input observations were used for training the nn model and for the testing of the prediction. this means that this variant predicts the target the best. the second interesting result was obtained in the case of an nn with two hidden layers and the structure of 800:8 neurons; it was the case when two observation inputs were used. the worst results in the prediction of the underground temperature were achieved in the case of one hidden layer with 5000 neurons.

when training and testing the nn for the calorific value, it can be seen that the lowest values of the statistical indicators were also obtained in the case of one hidden layer with 7 neurons (i.e., 2m + 1). in this case, the rrmse and the performance index were the lowest (e.g., pi = 9.20 for testing). this result was obtained when three input variables were used for training the nn model and for the testing of the prediction. the value of the coefficient of determination was the highest in this case (i.e., r2yy = 0.51 for testing). similarly to the temperature prediction, the second interesting result was obtained in the case of an nn with two hidden layers and the structure of 800:8 neurons; this is the case where three observation inputs were used. the worst results in the prediction of the calorific value were achieved in the case of one hidden layer with 5000 neurons (see table 3). it can be stated that the use of the nn model for the temperature prediction has achieved better results in terms of the performance index than in the case of the calorific value.
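the architecture scan described above can be sketched, for instance, with scikit-learn's mlpregressor; this is not the authors' original code, and the data arguments are placeholders for the ucg measurements.

```python
# an illustrative scan over hidden-layer sizes (including the 2m + 1 rule);
# hyperparameters mirror the setup described in the text (sgd with momentum
# 0.9 and an adaptive learning rate), other defaults are our own choices.
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

def scan_architectures(X_train, y_train, X_test, y_test):
    m = X_train.shape[1]                    # number of input neurons
    candidates = [(5,), (2 * m + 1,), (800,), (800, 8), (5000, 15)]
    results = {}
    for hidden in candidates:
        model = make_pipeline(
            StandardScaler(),               # inputs standardized before training
            MLPRegressor(hidden_layer_sizes=hidden, solver="sgd",
                         momentum=0.9, learning_rate="adaptive",
                         max_iter=2000, random_state=0))
        model.fit(X_train, y_train)
        results[hidden] = model.score(X_test, y_test)   # R^2 on untrained data
    return results
```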
the best prediction of the calorific value and of the underground temperature by the nn, where 10 % of the experiment was used for the test, is shown in figure 8 and figure 9. the black vertical line in the figures divides the prediction into the training and the testing part.

figure 8. measured and predicted calorific value of syngas by nn, where three inputs were used in the test.
figure 9. measured and predicted underground temperature by nn, where five inputs were used in the test.

4.2. prediction by the mars
the algorithm of a regression model creation with mars runs in two phases, as was analysed in detail in section 2.2. in the forward phase, the algorithm begins with a model that has only an intercept term. then, in a cycle, reflected pairs of bfs are added so that the training error is reduced as much as possible. this is done until, for example, the maximum number of bfs is reached. in the backward phase, the model is simplified by deleting, one at a time, the least important bfs, i.e., those whose removal increases the training error the least. in this way, several “best” models of different sizes are obtained. at the end of this phase, only one model, with the lowest gcv, is selected from these best models (excluding models larger than the maximal number of final bfs).

several different variants of simulations with mars have been performed. the maximum number of bfs included in the model in the forward phase has been changed experimentally. the initial number of bfs in the forward phase was determined according to the formula min(200, max(20, 2d)) + 1, where d represents the number of input variables; the initial number of bfs was thus set to 21 in all simulations. in the modelling, we have considered a maximal interactivity between the input variables, without self-interactions. since the data were not smooth, both the piecewise-cubic and the piecewise-linear types of modelling have been analysed in order to assess the prediction performance. by default, all mars models are created as piecewise-linear and are transformed into piecewise-cubic models after the backward phase.

the best, or optimal, number of maximal bfs in the final mars model was estimated by the gcv criterion and by a 10-fold cross-validation. in each cross-validation iteration, a new mars model is created and reduced using the gcv on the in-fold (training) data. in addition, the mse criterion on the out-of-fold (test) data (mseoof) is calculated for each model in the reducing phase. figure 10 shows a comparison of the behaviour of the gcv and the mseoof criterion calculated for each new model after the 10-fold iteration; in this simulation, the model for predicting the syngas calorific value was considered. the figure shows two vertical dashed lines at the minima of the two solid lines; it also shows the number of optimal bfs estimated by the gcv (cyan) and by the cross-validation (magenta). ideally, these two lines would coincide. similarly, figure 11 shows the determination of the optimal number of basis functions for the mars model of the underground temperature prediction. the figures show simulations of the piecewise-linear type of mars models with all inputs considered for a given target variable. the results from the training of the model and from the testing of the prediction are shown in table 4.

figure 10. example of the estimation of the “best” number of bfs of the calorific value mars model with three inputs by gcv and 10-fold cross-validation (i.e., mseoof).
figure 11. example of the estimation of the “best” number of bfs in the mars model of the underground temperature with five inputs by gcv and 10-fold cross-validation (i.e., mseoof).
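the gcv criterion used in the reducing phase can be sketched as follows; the effective-parameter formula c(m) = m + penalty·(m − 1)/2 follows a common mars convention (penalty = 3 when interactions are allowed) and may differ in detail from the software the authors used.

```python
# a hedged sketch of the generalized cross-validation (gcv) criterion
# used to rank candidate mars models of different sizes.
import numpy as np

def gcv(y, y_hat, n_bfs, penalty=3.0):
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    n = y.size
    mse = np.mean((y - y_hat) ** 2)
    c_m = n_bfs + penalty * (n_bfs - 1) / 2.0   # effective number of parameters
    return mse / (1.0 - c_m / n) ** 2           # valid while c_m < n

# among the candidate models kept after the backward phase, the one
# with the smallest gcv value is selected as the final mars model.
```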
the original mars approximation method uses a cubic function to smooth the truncated piecewise-linear functions. the cubic function has the following general form:

c(x | s = +1, t−, t, t+) =
  0                                  if x ≤ t−,
  α+ (x − t−)² + β+ (x − t−)³        if t− < x < t+,
  x − t                              if x ≥ t+,   (27)

where

α+ = (2t+ − 3t + t−) / (t+ − t−)²,  β+ = (−t+ + 2t − t−) / (t+ − t−)³,   (28)

and

c(x | s = −1, t−, t, t+) =
  t − x                              if x ≤ t−,
  α− (x − t+)² + β− (x − t+)³        if t− < x < t+,
  0                                  if x ≥ t+,   (29)

where

α− = (−t+ + 3t − 2t−) / (t− − t+)²,  β− = (t+ − 2t + t−) / (t− − t+)³.   (30)

here, t represents a univariate knot, which is selected for each of the factor variables x.

the piecewise-linear type of mars model fits the training data better but, in the prediction on untrained data, better results are obtained with the piecewise-cubic type of the model (see table 4). equation (31) represents the resulting piecewise-cubic type of mars model for the prediction of the underground temperature; the basis functions in the equation are calculated according to table 5. this mars model takes into account five inputs, i.e., the concentrations of the measured gases: x1 co, x2 co2, x3 h2, x4 ch4, x5 o2. similarly, equation (32) represents the piecewise-cubic type of mars model for the prediction of the calorific value of the syngas; the basis functions in the equation are calculated according to table 6. the mars model, in this case, has three inputs: x1 (air flow), x2 (oxygen flow) and x3 (controlled pressure on the outlet). in these models, the best results for the prediction were obtained in terms of all statistical indicators. the lowest performance index, pi = 4.01, was obtained in the testing of the prediction of the underground temperature on untrained data with the piecewise-cubic type of the mars model. when testing the prediction of the calorific value on untrained data with the piecewise-cubic type of the mars model, the performance index pi = 12.1382 was obtained.

temperature (°c) = 855.36 − 2108.8·bf1 + 8.7366·bf2 + 14.266·bf3 + 20.014·bf4 − 13.465·bf5 − 18.233·bf6 + 1.2743·bf7 − 3.764·bf8 − 1.3594·bf9 − 6.7334·bf10 + 1.1376·bf11 + 0.61503·bf12 + 0.79404·bf13 − 1.766·bf14 + 0.20451·bf15 − 1.8976·bf16 − 5.8421·bf17 + 0.72146·bf18 + 0.86513·bf19   (31)

calorific value (mj/nm3) = 13.528 − 0.0044217·bf1 − 0.13132·bf2 − 0.14469·bf3 − 1.1482·bf4 + 0.015322·bf5 − 0.0034767·bf6 − 0.019788·bf7 + 0.11264·bf8 + 0.55397·bf9 − 0.21132·bf10 − 0.84369·bf11 + 0.032498·bf12 − 0.16806·bf13 + 0.12228·bf14 + 0.084482·bf15 + 0.00096036·bf16 + 0.00075854·bf17   (32)

due to the fact that the piecewise-linear type of the model gives better results during the training in terms of all indicators, these variants of the model are also presented; these are the models with all considered inputs for the prediction of the given target. the piecewise-linear type of mars model uses the max(0, x − t) function, where t is the knot. the max() function represents the positive part of (x − t), which can be formally expressed as follows:

max(0, x − t) = x − t if x ≥ t, and 0 otherwise.   (33)
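the following python transcription of (27)–(30) and (33) shows how a single basis function is evaluated; it is written directly from the formulas above and is a sketch, not the authors' implementation.

```python
def hinge(x, t):
    """piecewise-linear basis max(0, x - t) of equation (33)."""
    return x - t if x >= t else 0.0

def cubic(x, s, tm, t, tp):
    """truncated cubic c(x | s, t-, t, t+) of equations (27)-(30);
    tm and tp denote the side knots t- and t+."""
    if s == +1:
        if x <= tm:
            return 0.0
        if x >= tp:
            return x - t
        a = (2 * tp - 3 * t + tm) / (tp - tm) ** 2      # alpha+ of (28)
        b = (-tp + 2 * t - tm) / (tp - tm) ** 3         # beta+ of (28)
        return a * (x - tm) ** 2 + b * (x - tm) ** 3
    else:  # s == -1
        if x <= tm:
            return t - x
        if x >= tp:
            return 0.0
        a = (-tp + 3 * t - 2 * tm) / (tm - tp) ** 2     # alpha- of (30)
        b = (tp - 2 * t + tm) / (tm - tp) ** 3          # beta- of (30)
        return a * (x - tp) ** 2 + b * (x - tp) ** 3
```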
table 4. results of simulations with mars models where 10 % of the experiment was used to test. each row lists, for training and for testing, the values ryy, r2yy, rrmse (%), pi, mape (%), mse and rmse; the time (s) refers to the training.

temperature:
piecewise-cubic, 16 bfs, inputs co, co2 — training: 0.4707, 0.2216, 7.2342, 4.9189, 5.7432, 5294.1481, 72.7609, time 10.5619; testing: 0.2312, 0.0534, 10.5865, 8.5986, 8.8779, 10316.4417, 101.5699
piecewise-cubic, 20 bfs, inputs co, co2, h2, ch4, o2 — training: 0.5859, 0.3433, 6.6445, 4.1897, 5.2014, 4466.1544, 66.8293, time 17.8538; testing: 0.6031, 0.3637, 6.4275, 4.0094, 5.1623, 3802.8747, 61.6675
piecewise-linear, 16 bfs, inputs co, co2 — training: 0.4985, 0.2485, 7.1080, 4.7434, 5.6262, 5110.9554, 71.4909, time 12.1525; testing: 0.3282, 0.1077, 10.6824, 8.0427, 9.1629, 10504.3010, 102.4905
piecewise-linear, 20 bfs, inputs co, co2, h2, ch4, o2 — training: 0.6666, 0.4444, 6.1119, 3.6673, 4.7285, 3778.9200, 61.4729, time 17.8482; testing: 0.5788, 0.3350, 8.9462, 5.6665, 7.5283, 7367.2269, 85.8326

calorific value:
piecewise-cubic, 15 bfs, inputs air, o2 — training: 0.7863, 0.6183, 20.7582, 11.6208, 24.0946, 3.7307, 1.9315, time 8.0520; testing: 0.0419, 0.0018, 83.9179, 80.5431, 32.5398, 94.3695, 9.7144
piecewise-cubic, 18 bfs, inputs air, o2, outlet pressure — training: 0.8661, 0.7501, 16.7964, 9.0010, 17.3057, 2.4426, 1.5629, time 13.5414; testing: 0.4386, 0.1924, 17.4620, 12.1382, 17.8493, 4.0861, 2.0214
piecewise-linear, 16 bfs, inputs air, o2 — training: 0.7991, 0.6386, 20.1965, 11.2256, 23.0084, 3.5316, 1.8792, time 9.1635; testing: 0.0419, 0.0018, 80.7365, 77.4897, 30.3193, 87.3500, 9.3461
piecewise-linear, 17 bfs, inputs air, o2, outlet pressure — training: 0.8730, 0.7621, 16.3866, 8.7489, 16.5883, 2.3249, 1.5247, time 13.4410; testing: 0.4238, 0.1796, 17.8945, 12.5681, 18.3341, 4.2910, 2.0715

table 5. basis functions of the piecewise-cubic type of mars model of the underground temperature (five inputs):
bf1 = c(x4 | +1, 20.571, 20.579, 20.607)
bf2 = c(x4 | −1, 20.571, 20.579, 20.607)
bf3 = c(x2 | +1, 7.967, 11.636, 19.073)
bf4 = c(x2 | −1, 7.967, 11.636, 19.073)
bf5 = c(x1 | −1, 3.6356, 6.3881, 7.6506)
bf6 = c(x3 | +1, 15.747, 26.344, 30.018)
bf7 = c(x3 | −1, 15.747, 26.344, 30.018)
bf8 = bf3 × c(x4 | +1, 11.785, 20.563, 20.571)
bf9 = bf3 × c(x4 | −1, 11.785, 20.563, 20.571)
bf10 = c(x1 | +1, 3.6356, 6.3881, 7.6506) × c(x2 | +1, 19.073, 26.51, 28.665)
bf11 = c(x1 | +1, 3.6356, 6.3881, 7.6506) × c(x2 | −1, 19.073, 26.51, 28.665)
bf12 = bf8 × c(x1 | +1, 9.2894, 9.6658, 14.265)
bf13 = bf8 × c(x1 | −1, 9.2894, 9.6658, 14.265)
bf14 = bf9 × c(x1 | +1, 7.6506, 8.9131, 9.2894)
bf15 = bf9 × c(x1 | −1, 7.6506, 8.9131, 9.2894)
bf16 = bf5 × c(x3 | +1, 2.5749, 5.1498, 15.747)
bf17 = bf5 × c(x3 | −1, 2.5749, 5.1498, 15.747)
bf18 = bf14 × c(x5 | +1, 2.1322, 3.9609, 12.268)
bf19 = bf14 × c(x5 | −1, 2.1322, 3.9609, 12.268)
table 6. basis functions of the piecewise-cubic type of mars model of the syngas calorific value (three inputs):
bf1 = c(x1 | +1, 5.866, 11.732, 17.503) × c(x3 | +1, 8.357, 16.442, 32.428)
bf2 = c(x1 | +1, 5.866, 11.732, 17.503) × c(x3 | −1, 8.357, 16.442, 32.428)
bf3 = c(x2 | +1, 7.763, 9.83, 25.869)
bf4 = c(x2 | −1, 7.763, 9.83, 25.869)
bf5 = bf4 × c(x3 | +1, 32.428, 48.414, 74.478)
bf6 = bf4 × c(x3 | −1, 32.428, 48.414, 74.478)
bf7 = c(x1 | +1, 5.866, 11.732, 17.503) × c(x2 | +1, 5.114, 5.696, 7.763)
bf8 = c(x1 | +1, 5.866, 11.732, 17.503) × c(x2 | −1, 5.114, 5.696, 7.763)
bf9 = c(x1 | −1, 17.503, 23.274, 31.139)
bf10 = c(x1 | −1, 5.866, 11.732, 17.503) × c(x2 | +1, 0.239, 0.478, 2.505)
bf11 = c(x1 | −1, 5.866, 11.732, 17.503) × c(x2 | −1, 0.239, 0.478, 2.505)
bf12 = c(x1 | +1, 17.503, 23.274, 31.139) × c(x2 | +1, 2.505, 4.532, 5.114)
bf13 = c(x1 | +1, 17.503, 23.274, 31.139) × c(x2 | −1, 2.505, 4.532, 5.114)
bf14 = bf11 × c(x3 | +1, −0.155, 0.272, 8.357)
bf15 = bf11 × c(x3 | −1, −0.155, 0.272, 8.357)
bf16 = bf5 × c(x1 | +1, 31.139, 39.004, 48.437)
bf17 = bf5 × c(x1 | −1, 31.139, 39.004, 48.437)

temperature (°c) = 851.31 + 3036.4·bf1 + 11.023·bf2 + 21.043·bf3 + 7.8072·bf4 − 7.1445·bf5 − 28.58·bf6 + 1.264·bf7 − 483.77·bf8 − 2.2149·bf9 − 6.5355·bf10 + 0.25511·bf11 + 33.098·bf12 + 57.057·bf13 − 1.0868·bf14 + 0.2452·bf15 − 4.2728·bf16 − 8.3842·bf17 + 0.56517·bf18 + 0.74386·bf19   (34)

calorific value (mj/nm3) = 10.969 − 0.003224·bf1 − 0.13121·bf2 − 0.63223·bf3 − 0.51773·bf4 + 0.008499·bf5 − 0.0075648·bf6 + 0.04656·bf7 + 0.050259·bf8 + 0.43584·bf9 − 0.13449·bf10 − 1.2256·bf11 − 0.043978·bf12 − 0.08991·bf13 + 0.22099·bf14 + 1.4463·bf15 + 0.00078717·bf16 + 0.0011077·bf17   (35)

equation (34) represents the piecewise-linear type of mars model for the underground temperature prediction; the corresponding basis functions are shown in table 7. the performance index obtained during the training of this mars model was pi = 3.66, and the coefficient of determination was the highest in this case (r2yy = 0.44). equation (35) represents the piecewise-linear type of mars model for the syngas calorific value prediction; the corresponding basis functions are shown in table 8. the performance index obtained during the training of this mars model was pi = 8.74, and the coefficient of determination was, again, the highest in this case (r2yy = 0.76).

table 7. basis functions of the piecewise-linear type of mars model of the underground temperature (five inputs):
bf1 = max(0, x4 − 20.579)
bf2 = max(0, 20.579 − x4)
bf3 = max(0, x2 − 11.636)
bf4 = max(0, 11.636 − x2)
bf5 = max(0, 6.3881 − x1)
bf6 = max(0, x3 − 26.344)
bf7 = max(0, 26.344 − x3)
bf8 = bf3 × max(0, x4 − 20.563)
bf9 = bf3 × max(0, 20.563 − x4)
bf10 = max(0, x1 − 6.3881) × max(0, x2 − 26.51)
bf11 = max(0, x1 − 6.3881) × max(0, 26.51 − x2)
bf12 = bf8 × max(0, x1 − 9.6658)
bf13 = bf8 × max(0, 9.6658 − x1)
bf14 = bf9 × max(0, x1 − 8.9131)
bf15 = bf9 × max(0, 8.9131 − x1)
bf16 = bf5 × max(0, x3 − 5.1498)
bf17 = bf5 × max(0, 5.1498 − x3)
bf18 = bf14 × max(0, x5 − 3.9609)
bf19 = bf14 × max(0, 3.9609 − x5)

table 8. basis functions of the piecewise-linear type of mars model of the syngas calorific value (three inputs):
bf1 = max(0, x1 − 11.732) × max(0, x3 − 16.442)
bf2 = max(0, x1 − 11.732) × max(0, 16.442 − x3)
bf3 = max(0, x2 − 9.83)
bf4 = max(0, 9.83 − x2)
bf5 = bf4 × max(0, x3 − 48.414)
bf6 = bf4 × max(0, 48.414 − x3)
bf7 = max(0, x1 − 11.732) × max(0, x2 − 5.696)
bf8 = max(0, x1 − 11.732) × max(0, 5.696 − x2)
bf9 = max(0, 23.274 − x1)
bf10 = max(0, 11.732 − x1) × max(0, x2 − 0.478)
bf11 = max(0, 11.732 − x1) × max(0, 0.478 − x2)
bf12 = max(0, x1 − 23.274) × max(0, x2 − 4.532)
bf13 = max(0, x1 − 23.274) × max(0, 4.532 − x2)
bf14 = bf11 × max(0, x3 − 0.272)
bf15 = bf11 × max(0, 0.272 − x3)
bf16 = bf5 × max(0, x1 − 39.004)
bf17 = bf5 × max(0, 39.004 − x1)

for the comparison with the other methods, this paper presents the behaviour of the prediction only with the piecewise-cubic type of mars models, because the best results on untrained data have been achieved with them. the best prediction of the calorific value and of the underground temperature by the piecewise-cubic type of the mars model, where 10 % of the experiment was used for the test of the prediction, is shown in figure 12 and figure 13. the black vertical line divides the prediction into the training and the testing part. it can be said that better predictions with the mars model were achieved in the case of the underground temperature. the experimental results demonstrate that the piecewise-cubic type of mars model is better than the piecewise-linear one, both in the temperature and in the calorific value prediction.

figure 12. measured and predicted calorific value of syngas by the piecewise-cubic type of mars model, where 10 % of the experiment and three inputs were used for the test.
figure 13. measured and predicted underground temperature by the piecewise-cubic type of mars model, where 10 % of the experiment and five inputs were used for the test.
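to show how such a model is evaluated in practice, here is a short sketch that computes the first terms of the piecewise-linear temperature model (34) from the basis functions of table 7; only a few bfs are spelled out, the rest follow the same pattern.

```python
def pos(u):
    """positive part max(0, u), cf. equation (33)."""
    return max(0.0, u)

def temperature_pl_partial(x1, x2, x3, x4, x5):
    """first terms of model (34); bf definitions taken from table 7."""
    bf1 = pos(x4 - 20.579)
    bf2 = pos(20.579 - x4)
    bf3 = pos(x2 - 11.636)
    bf4 = pos(11.636 - x2)
    bf5 = pos(6.3881 - x1)
    bf8 = bf3 * pos(x4 - 20.563)        # product bfs reuse earlier bfs
    # ... bf6, bf7 and bf9-bf19 are built analogously from table 7
    # and added with their coefficients from (34) ...
    return (851.31 + 3036.4 * bf1 + 11.023 * bf2 + 21.043 * bf3
            + 7.8072 * bf4 - 7.1445 * bf5 - 483.77 * bf8)
```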
4.3. prediction by the support vector regression
the svr model has been trained on the predictor data similarly as in the previous method. the predictor data were mapped using three kernel functions, and the smo method was used for the objective-function minimization. the training data table was used where one row of the table represented one observation and the individual columns were the predictors x; the table contains one additional column for the response variable y. the standardized predictor matrix has been used for the training; the standardization was performed using the corresponding weighted means and weighted standard deviations of the predictors. the predictors are less sensitive to the scale on which they are measured when the standardization is used. similarly, a table for the test of the model on untrained data was prepared. as kernel functions, the linear, gaussian and polynomial kernels were used (see table 1). the final values of α were stored in the memory of the computer.

table 9 shows the results of applying the various types of kernel functions in the svr, where 10 % of the ucg experiment was used for the test. the table shows that the best results were obtained with the gaussian kernel function when all input variables were regarded. this result was obtained for the temperature prediction and also for the syngas calorific value prediction. it is possible to see that the temperature model with five input variables gives the best tightness with the real measured data (see the parameter r2yy = 0.82 for training and r2yy = 0.47 for the test). also, the performance index calculated for the training and testing has reached the lowest value in this case (see the parameter pi = 1.80 for training and pi = 4.67 for the test). the best performance of the prediction is also indicated by the other statistical parameters.
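a minimal sketch of this setup with scikit-learn's epsilon-svr follows; the authors' own implementation may differ, and the kernel settings here are our own defaults.

```python
# one epsilon-svr per kernel, trained on standardized predictors;
# X_train and y_train are placeholders for the ucg data table.
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

def fit_svr_variants(X_train, y_train):
    kernels = {"linear": SVR(kernel="linear"),
               "gaussian": SVR(kernel="rbf"),          # gaussian (rbf) kernel
               "polynomial": SVR(kernel="poly", degree=3)}
    return {name: make_pipeline(StandardScaler(), svr).fit(X_train, y_train)
            for name, svr in kernels.items()}
```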
similarly, the best result for the svr model of the syngas calorific value was also obtained when three inputs and the gaussian kernel were used. it is possible to see that the calorific value model with three inputs gives the best tightness with the real measured data (see the parameter r2yy = 0.83 for training and r2yy = 0.35 for the test). also, the performance index calculated during the training and testing has reached the lowest value in this case (see the parameter pi = 7.20 for training and pi = 8.21 for the test). the best quality of the prediction is also indicated by the other statistical parameters. the worst results for the prediction on untrained data were obtained using the polynomial kernel, both in the case of the temperature model and of the calorific value model. figure 14 and figure 15 show the best prediction of the calorific value and of the underground temperature by the svr on untrained data, where 10 % of the experiment was used for the test. the black vertical line in the figures divides the prediction into the training and the testing phase. even with this method, the results were better for the temperature prediction.

table 9. results of simulations with svr where 10 % of the experiment was used to test. each row lists, for training and for testing, the values ryy, r2yy, rrmse (%), pi, mape (%), mse and rmse; the time (s) refers to the training.

temperature:
linear kernel, inputs co, co2 — training: 0.2794, 0.0781, 8.2694, 6.4635, 6.8399, 6917.6854, 83.1726, time 0.4813; testing: 0.3650, 0.1332, 8.4538, 6.1933, 7.2103, 6578.6165, 81.1087
linear kernel, inputs co, co2, h2, ch4, o2 — training: 0.2945, 0.0867, 7.8480, 6.0625, 6.4595, 6230.6605, 78.9345, time 0.4821; testing: 0.4754, 0.2260, 8.4061, 5.6975, 6.8881, 6504.5925, 80.6511
gaussian kernel, inputs co, co2 — training: 0.5888, 0.3466, 6.6362, 4.1770, 4.8036, 4455.0438, 66.7461, time 0.6605; testing: 0.2053, 0.0421, 12.4967, 10.3683, 10.4411, 14375.3449, 119.8972
gaussian kernel, inputs co, co2, h2, ch4, o2 — training: 0.9096, 0.8274, 3.4414, 1.8022, 1.9718, 1198.0793, 34.6133, time 0.6688; testing: 0.6913, 0.4779, 7.9051, 4.6740, 6.9661, 5752.3317, 75.8441
polynomial kernel, inputs co, co2 — training: 0.2518, 0.0634, 8.0087, 6.3977, 6.2560, 6488.3474, 80.5503, time 2.1646; testing: 0.1704, 0.0290, 12.7474, 10.8914, 10.2197, 14957.7685, 122.3020
polynomial kernel, inputs co, co2, h2, ch4, o2 — training: 0.5531, 0.3059, 6.9284, 4.4609, 4.8213, 4855.9229, 69.6845, time 3.4230; testing: 0.0772, 0.0060, 16.7335, 15.5342, 14.1215, 25774.9745, 160.5459

calorific value:
linear kernel, inputs air, o2 — training: 0.6927, 0.4798, 24.5832, 14.5231, 31.2777, 5.2323, 2.2874, time 1.5294; testing: 0.3222, 0.1038, 19.2421, 14.5531, 20.0508, 4.9617, 2.2275
linear kernel, inputs air, o2, outlet pressure — training: 0.7061, 0.4986, 24.2518, 14.2145, 31.6478, 5.0922, 2.2566, time 1.5539; testing: 0.3850, 0.1482, 16.1073, 11.6297, 16.1572, 3.4767, 1.8646
gaussian kernel, inputs air, o2 — training: 0.7868, 0.6190, 21.0848, 11.8006, 22.3990, 3.8491, 1.9619, time 0.5178; testing: 0.4171, 0.1740, 18.9023, 13.3387, 19.4816, 4.7880, 2.1881
gaussian kernel, inputs air, o2, outlet pressure — training: 0.9130, 0.8335, 13.7896, 7.2085, 9.9955, 1.6463, 1.2831, time 0.5446; testing: 0.5997, 0.3596, 13.1471, 8.2184, 11.3808, 2.3162, 1.5219
polynomial kernel, inputs air, o2 — training: 0.7042, 0.4959, 25.9054, 15.2012, 32.7633, 5.8103, 2.4105, time 109.8509; testing: 0.3785, 0.1433, 20.8256, 15.1075, 23.5170, 5.8119, 2.4108
polynomial kernel, inputs air, o2, outlet pressure — training: 0.6500, 0.4225, 35.7762, 21.6826, 44.4540, 11.0817, 3.3289, time 121.1258; testing: 0.2066, 0.0427, 28.6155, 23.7158, 32.3065, 7.9859, 3.3126

figure 14. measured and predicted calorific value of syngas by svr with the gaussian kernel function, where 10 % of the experiment and three inputs were used for the test.
figure 15. measured and predicted underground temperature by svr with the gaussian kernel function, where 10 % of the experiment and five inputs were used for the test.

4.4. overall results
a ranking of the results from the three evaluated methods is presented in table 10 and table 11. these tables show the comparison of the best results when two variants of observations (i.e., input variables) of each predicted target were used. in the training phase, the model was verified on the training data in order to predict the target variable. the results from the training phase of each method are shown in table 10. it is possible to see that the svr model with the gaussian kernel fits the measured target data the best in the case of modelling the temperature with five observations. the svr model also achieved a better performance when using two input variables. the other interesting results were obtained with the piecewise-linear type of the mars model, both in the case of two and of five observations. however, the mars models have consumed more time than the bpnn and the svr. when fitting the calorific value, the svr with the gaussian kernel also reached the best performance; it was the case with three input variables.

table 10. overall results from the training phase. each row lists ryy, r2yy, rrmse (%), pi, mape (%), mse, rmse and time (s).

temperature:
bpnn, layers: 2, neurons (l1:l2): 800:8, inputs co, co2 — 0.2043, 0.0417, 8.0649, 6.6967, 6.6452, 6579.6878, 81.1153, time 6.8560
bpnn, layer: 1, neurons (l1): 11, inputs co, co2, h2, ch4, o2 — 0.4210, 0.1772, 7.4552, 5.2466, 6.1409, 5622.4662, 74.9831, time 0.7461
mars, piecewise-linear, 16 bfs, inputs co, co2 — 0.4985, 0.2485, 7.1080, 4.7434, 5.6262, 5110.9554, 71.4909, time 12.1525
mars, piecewise-linear, 20 bfs, inputs co, co2, h2, ch4, o2 — 0.6666, 0.4444, 6.1119, 3.6673, 4.7285, 3778.9200, 61.4729, time 17.8482
svr, gaussian kernel, inputs co, co2 — 0.5888, 0.3466, 6.6362, 4.1770, 4.8036, 4455.0438, 66.7461, time 0.6605
svr, gaussian kernel, inputs co, co2, h2, ch4, o2 — 0.9096, 0.8274, 3.4414, 1.8022, 1.9718, 1198.0793, 34.6133, time 0.6688

calorific value:
bpnn, layers: 2, neurons (l1:l2): 800:8, inputs air, o2 — 0.7383, 0.5451, 22.6638, 13.0378, 28.3859, 4.4472, 2.1088, time 7.1023
bpnn, layers: 2, neurons (l1:l2): 800:8, inputs air, o2, outlet pressure — 0.7392, 0.5464, 22.6569, 13.0271, 28.0108, 4.4444, 2.1082, time 7.5802
mars, piecewise-linear, 15 bfs, inputs air, o2 — 0.7991, 0.6386, 20.1965, 11.2256, 23.0084, 3.5316, 1.8792, time 9.1635
mars, piecewise-linear, 17 bfs, inputs air, o2, outlet pressure — 0.8730, 0.7621, 16.3866, 8.7489, 16.5883, 2.3249, 1.5247, time 13.4410
svr, gaussian kernel, inputs air, o2 — 0.7868, 0.6190, 21.0848, 11.8006, 22.3990, 3.8491, 1.9619, time 0.5178
svr, gaussian kernel, inputs air, o2, outlet pressure — 0.9130, 0.8335, 13.7896, 7.2085, 9.9955, 1.6463, 1.2831, time 0.5446
in the case of the model fitting with two observations, slightly better results for the training were achieved when the piecewise-linear type of the mars model was used; the mars model also had the highest time consumption during the training in this case. fitting the model of the calorific value has reached a worse performance on average than in the case of the temperature. the higher performance index in the training phase is due to the higher variability of the inputs and the low correlation between the inputs and the target. the results of the fitting with the bpnn take the third place because of the worst performance index (pi), both in the case of the calorific value and of the underground temperature. in general, to improve the performance index in the training phase, it is suggested to use more input variables.

the results from the test phase are shown in table 11.

table 11. overall results from the testing phase. each row lists ryy, r2yy, rrmse (%), pi, mape (%), mse and rmse.

temperature:
bpnn, layers: 2, neurons (l1:l2): 800:8, inputs co, co2 — 0.6755, 0.4563, 8.5075, 5.0775, 7.4526, 6662.3759, 81.6234
bpnn, layer: 1, neurons (l1): 11, inputs co, co2, h2, ch4, o2 — 0.6787, 0.4606, 7.4301, 4.4261, 5.8129, 5081.7732, 71.2866
mars, piecewise-linear, 16 bfs, inputs co, co2 — 0.3282, 0.1077, 10.6824, 8.0427, 9.1629, 10504.3010, 102.4905
mars, piecewise-cubic, 20 bfs, inputs co, co2, h2, ch4, o2 — 0.6031, 0.3637, 6.4275, 4.0094, 5.1623, 3802.8747, 61.6675
svr, linear kernel, inputs co, co2 — 0.3650, 0.1332, 8.4538, 6.1933, 7.2103, 6578.6165, 81.1087
svr, gaussian kernel, inputs co, co2, h2, ch4, o2 — 0.6913, 0.4779, 7.9051, 4.6740, 6.9661, 5752.3317, 75.8441

calorific value:
bpnn, layer: 1, neurons (l1): 5, inputs air, o2 — 0.0401, 0.0016, 19.8082, 19.0445, 22.9766, 5.2579, 2.2930
bpnn, layer: 1, neurons (l1): 7, inputs air, o2, outlet pressure — 0.7187, 0.5166, 15.8219, 9.2055, 16.8804, 3.3546, 1.8316
mars, piecewise-linear, 15 bfs, inputs air, o2 — 0.0419, 0.0018, 80.7365, 77.4897, 30.3193, 87.3500, 9.3461
mars, piecewise-cubic, 18 bfs, inputs air, o2, outlet pressure — 0.4386, 0.1924, 17.4620, 12.1382, 17.8493, 4.0861, 2.0214
svr, gaussian kernel, inputs air, o2 — 0.4171, 0.1740, 18.9023, 13.3387, 19.4816, 4.7880, 2.1881
svr, gaussian kernel, inputs air, o2, outlet pressure — 0.5997, 0.3596, 13.1471, 8.2184, 11.3808, 2.3162, 1.5219

these results are not as consistent as in the training phase, especially for the temperature prediction, where the best results are scattered over the individual methods. in the testing phase, the model was verified on untrained data in order to predict the target variable. the time consumption was not evaluated in the testing phase. for the temperature prediction with five input variables, the best result in terms of the performance index was obtained with the piecewise-cubic type of the mars model with 20 bfs. the worst result in terms of the performance index was obtained by the svr with the gaussian kernel and five input variables. when the svr with five inputs was used, the predicted temperature value correlated the best with the measured one.
with two input variables, the best results were obtained by the bpnn and the worst by the piecewise-linear type of the mars model. for the calorific value prediction with three input variables, the results with the lowest performance index were obtained with the svr and the gaussian kernel. the results obtained with the bpnn take the second place; when the bpnn with three inputs was used, the predicted calorific value correlated the best with the measured one. the worst results in terms of the performance index were obtained by the piecewise-cubic type of the mars model with 18 bfs and three inputs. with two input variables, the best results were obtained with the svr and the gaussian kernel and the worst in the case of the piecewise-linear type of the mars model. the prediction of the calorific value has reached a higher performance index on average than the prediction of the underground temperature.

5. summary and conclusions
in this paper, three approaches were examined in order to find the best prediction method for the ucg data soft-sensing. a comparison of methods suitable for predicting the ucg data has not been published yet. in the ucg process, it is complicated to predict some process variables because it is not possible to observe the state of a process that runs in an inaccessible environment. the goal was to find a valid data-driven learning method that allows estimating the underground temperature or the syngas calorific value from other measurable process variables. predicting these variables will make it possible to control the ucg process more efficiently. in this paper, only a small number of measurable input variables from one ucg experiment were used to get a comparison of the learning methods. all methods that have been applied considered only one output variable.

the resulting mars model can be stored in a pc and is even portable as an analytic equation, and the impact of each predictor can clearly be seen (i.e., the model is easier for humans to understand). in mars, the prediction is based on a simple and quick calculation of the mars model formula. in svr, each variable is multiplied by the corresponding element of each support vector, which can be a slow process if there are many variables and a large number of support vectors. the individual svr models have been obtained by using the ε-svr. applying the kernel trick in the svr allows modelling expert knowledge of the ucg process. the svr is defined as a convex optimization problem where there are no local minima and, therefore, effective optimization methods such as smo can be used. in the case of nns, the training is often complicated because there is always a risk of a deadlock at a local minimum of the error function. the learning of nns is also highly complicated by the search for a high number of weights in a multidimensional space. in mars and svr, it is necessary to package the program code that provides the prediction together with the optimized weights or support vectors.

it can be said that all three methods have achieved satisfactory results in terms of the underground temperature and syngas calorific value prediction. regarding the training, the svr with the gaussian kernel was the winner; this model matched the measured data the best, both in the case of the temperature and of the calorific value. regarding the prediction, the best result was obtained by the piecewise-cubic type of the mars model. in these cases, the better results were achieved when all considered input variables of the target variable were used.
the results show that a higher number of input variables increases the predictive performance. the obtained results can be applied to the model predictive control of the ucg process.

acknowledgements
this work was supported by the ec research programme of the research fund for coal and steel (grant no. rfcr-ct-2013-00002), by the slovak grant agency for science under grant vega 1/0273/17, and by the slovak research and development agency under the contract no. apvv-14-0892.

references
[1] g. ökten, v. didari. underground gasification of coal. in: kural, o. (ed.) coal. technical report, istanbul technical university, istanbul, turkey, p. 371-378, 1994.
[2] j. kačur, m. durdán, m. laciak, p. flegner. impact analysis of the oxidant in the process of underground coal gasification. measurement 51:147–155, 2014. doi:10.1016/j.measurement.2014.01.036.
[3] m. sury, m. white, j. kirton, et al. review of environmental issues of underground coal gasification, technical report coal r272 dti/pub urn 04/1880. technical report, ws atkins consultants ltd., department of trade and industry, 2010.
[4] j. kačur, m. durdán, g. bogdanovská. monitoring and measurement of the process variable in ucg. in sgem 2016: 16th international multidisciplinary scientific geoconference, sofia, bulgaria: stef92 technology, pp. 295–302. 2016. doi:10.5593/sgem2016/b21/s07.038.
[5] m. durdán, k. kostúr. modeling of temperatures by using the algorithm of queue burning movement in the ucg process. acta montanistica slovaca 20(3):181–191, 2015.
[6] k. kostúr. mathematical modeling temperature’s fields in overburden during underground coal gasification. in iccc 2014: proceedings of the 2014 15th international carpathian control conference (iccc), velke karlovice, may 28-30, pp. 248–253. danvers: ieee, 2014. doi:10.1109/carpathiancc.2014.6843606.
[7] m. koenen, f. bergen, p. david. isotope measurements as a proxy for optimising future hydrogen production in underground coal gasification, news in depth, 2015.
[8] m. benková, m. durdán. statistical analyzes of the underground coal gasification process realized in the laboratory conditions. in sgem 2016: 16th international multidisciplinary scientific geoconference, sofia, bulgaria: stef92 technology, pp. 405–412. 2016. doi:10.5593/sgem2016/b21/s07.052.
[9] l. fortuna, s. graziani, a. rizzo, g. m. xibilia. soft sensors for monitoring and control of industrial processes. springer london, 2007. doi:10.1007/978-1-84628-480-9.
[10] t. ji, h. shi. soft sensor modeling for temperature measurement of texaco gasifier based on an improved rbf neural network. in 2006 ieee international conference on information acquisition, pp. 1147–1151. ieee, 2006. doi:10.1109/icia.2006.305907.
[11] a. a. uppal, a. i. bhatti, e. aamir, et al. control oriented modeling and optimization of one dimensional packed bed model of underground coal gasification. journal of process control 24:269–277, 2014. doi:10.1016/j.jprocont.2013.12.001.
[12] a. a. uppal, a. i. bhatti, e. aamir, et al. optimization and control of one dimensional packed bed model of underground coal gasification. journal of process control 35:11–20, 2015. doi:10.1016/j.jprocont.2015.08.002.
[13] a. a. uppal, y. m. alsmadi, v. i. utkin, et al. sliding mode control of underground coal gasification energy conversion process. ieee transactions on control systems technology 26(2):587–598, 2018. doi:10.1109/tcst.2017.2692718.
[14] q. wei, d. liu.
adaptive dynamic programming for optimal tracking control of unknown nonlinear systems with application to coal gasification. ieee transactions on automation science and engineering 11(4):1020–1036, 2014. doi:10.1109/tase.2013.2284545.
[15] r. guo, g. x. cheng, y. wang. texaco coal gasification quality prediction by neural estimator based on dynamic pca. in proceedings of the 2006 ieee international conference on mechatronics and automation, pp. 1298–1302. 2006. doi:10.1109/icma.2006.257660.
[16] b. guo, y. shen, f. zhao. modelling coal gasification with a hybrid neural network. fuel 76(12):1159–1164, 1997. doi:10.1016/s0016-2361(97)00122-1.
[17] s. liu, z. hou, c. yin. data-driven modeling for fixed-bed intermittent gasification processes by enhanced lazy learning incorporated with relevance vector machine. in 11th ieee international conference on control & automation (icca), pp. 1019–1024. ieee, 2014. doi:10.1109/icca.2014.6871060.
[18] m. laciak, j. kačur, k. kostúr. the verification of thermodynamic model for ucg process. in iccc 2016: 17th international carpathian control conference, pp. 424–428. 2016. doi:10.1109/carpathiancc.2016.7501135.
[19] m. laciak, d. ráškayová. the using of thermodynamic model for the optimal setting of input parameters in the ucg process. in iccc 2016: 17th international carpathian control conference, pp. 418–423. 2016. doi:10.1109/carpathiancc.2016.7501134.
[20] a. m. winslow. numerical model of coal gasification in a packed bed. symposium (international) on combustion 16(1):503–513, 1977. doi:10.1016/s0082-0784(77)80347-0.
[21] p. ji, x. gao, d. huang, y. yang. prediction of syngas compositions in shell coal gasification process via dynamic soft-sensing method. in proceedings of the 10th ieee international conference on control and automation (icca), pp. 244–249. 2013. doi:10.1109/icca.2013.6565140.
[22] i. h. al-qinani. multivariate adaptive regression splines (mars) heuristic model: application of heavy metal prediction. international journal of modern trends in engineering & research 3(8):223–229, 2016. doi:10.21884/ijmter.2016.3027.7nuqv.
[23] a. aryafar, r. gholami, r. rooki, f. d. ardejani. heavy metal pollution assessment using support vector machine in the shur river, sarcheshmeh copper mine, iran. environmental earth sciences 67(4):1191–1199, 2012. doi:10.1007/s12665-012-1565-7.
[24] d. e. rumelhart, g. e. hinton, r. j. williams. learning internal representations by error propagation. in: d. e. rumelhart, j. l. mcclelland, and pdp research group. parallel distributed processing: explorations in the microstructure of cognition. vol 1: foundations, 1987.
[25] g. sampson, d. e. rumelhart, j. l. mcclelland, t. p. r. group. parallel distributed processing: explorations in the microstructures of cognition. language 63(4):871, 1987. doi:10.2307/415721.
[26] t. hastie, r. tibshirani, j. friedman. the elements of statistical learning: data mining, inference, and prediction, second edition. springer new york, 2009. doi:10.1007/b94608.
[27] v. kvasnička, ľ. beňušková, i. farkaš, j. pospíchal, et al. úvod do teórie neurónových sietí. iris, bratislava, 1997.
[28] j. sedláček. úvod do teorie grafů. academia, praha, 1981.
[29] j. h. friedman. multivariate adaptive regression splines. the annals of statistics 19(1):1–67, 1991. doi:10.1214/aos/1176347963.
[30] p. sephton. forecasting recessions: can we do better on mars?, 2001.
[31] m. chugh, s. s. thumsi, v. keshri. a comparative study between least square support vector machine (lssvm) and multivariate adaptive regression spline (mars) methods for the measurement of load storing capacity of driven piles in cohesion less soil. international journal of structural and civil engineering research, 2015. doi:10.18178/ijscer.4.2.189-194.
[32] v. r. tselykh. multivariate adaptive regression splines. machine learning and data analysis 1(3):272–278, 2012. doi:10.21469/22233792.
[33] p. samui, d. p. kothari. a multivariate adaptive regression spline approach for prediction of maximum shear modulus and minimum damping ratio. engineering journal 16(5):69–78, 2012. doi:10.4186/ej.2012.16.5.69.
[34] w. zhang, a. t. c. goh. multivariate adaptive regression splines and neural network models for prediction of pile drivability. geoscience frontiers 7(1):45–52, 2016. doi:10.1016/j.gsf.2014.10.003.
[35] a. abraham, d. steinberg. mars: still an alien planet in soft computing? in international conference on computational science iccs (proceedings), part ii, vol. 2, pp. 235–244. springer berlin heidelberg, 2001. doi:10.1007/3-540-45718-6_27.
[36] b. e. boser, i. m. guyon, v. n. vapnik. a training algorithm for optimal margin classifiers. in proceedings of the 5th annual acm workshop on computational learning theory colt’92, pittsburgh, pa, pp. 144–152. acm press, 1992. doi:10.1145/130385.130401.
[37] v. n. vapnik. constructing learning algorithms. in the nature of statistical learning theory, pp. 119–166. springer verlag, new york, 1995. doi:10.1007/978-1-4757-2440-0_6.
[38] k. r. müller, a. j. smola, g. rätsch, et al. predicting time series with support vector machines. in lecture notes in computer science, pp. 999–1004. springer berlin heidelberg, 1997. doi:10.1007/bfb0020283.
[39] j. kačur, m. laciak, m. durdán, p. flegner. utilization of machine learning method in prediction of ucg data.
in iccc 2017: 18th international carpathian control conference, pp. 1–6. ieee, 2017. doi:10.1109/carpathiancc.2017.7970411.
[40] mathworks. understanding support vector machine regression. in: statistics and machine learning toolbox user’s guide (r2018ab). regression.html, 2018.
[41] j. kačur, k. kostúr. approaches to the gas control in ucg. acta polytechnica 57(3), 2017. doi:10.14311/ap.2017.57.0182.
[42] m. laciak, k. kostúr, m. durdán, et al. the analysis of the underground coal gasification in experimental equipment. energy 114:332–343, 2016. doi:10.1016/j.energy.2016.08.004.
[43] r. l. i. dobbs, w. b. krantz. combustion front propagation in underground coal gasification, final report, work performed under grant no. de-fg22-86pc90512. technical report, university of colorado, boulder, department of chemical engineering, 1990. doi:10.2172/6035494.
[44] k. stańczyk, a. smoliński, k. kapusta, et al. dynamic experimental simulation of hydrogen oriented underground gasification of lignite. fuel 89(11):3307–3314, 2010. doi:10.1016/j.fuel.2010.03.004.
[45] k. kostúr, j. kačur. developing of optimal control system for ucg. in proceedings of the 13th international carpathian control conference (iccc), pp. 347–352. ieee, 2012. doi:10.1109/carpathiancc.2012.6228666.
[46] k. kostúr, j. kačur. development of control and monitoring system of ucg by promotic. in 2011 12th international carpathian control conference (iccc), pp. 215–219. ieee, 2011. doi:10.1109/carpathiancc.2011.5945850.
[47] a. h. gandomi, d. a. roke. intelligent formulation of structural engineering systems. in seventh mit conference on computational fluid and solid mechanics — focus: multiphysics and multiscale. cambridge, usa, 2013.
[48] a. h. gandomi, d. a. roke. assessment of artificial neural network and genetic programming as predictive tools. advances in engineering software 88:63–72, 2015. doi:10.1016/j.advengsoft.2015.05.007.
acta polytechnica 60(3):214–224, 2020. doi:10.14311/ap.2020.60.0214. © czech technical university in prague, 2020. available online at https://ojs.cvut.cz/ojs/index.php/ap

beta cantor series expansion and admissible sequences

jonathan caalim (a), shiela demegillo (a, b, *)
(a) university of the philippines diliman, institute of mathematics, c.p. garcia, 1101 quezon city, philippines
(b) adamson university, mathematics and physics department, san marcelino st., 1000 manila, philippines
(*) corresponding author: ssdemegillo@upd.edu.ph

abstract. we introduce a numeration system, called the beta cantor series expansion, that generalizes the classical positive and negative beta expansions by allowing non-integer bases in the q-cantor series expansion. in particular, we show that for a fixed γ ∈ r and a sequence b of real number bases, every element of the interval [γ,γ + 1) has a beta cantor series expansion with respect to b, where the digits are integers in some alphabet a(b). we give a criterion for determining whether an integer sequence is admissible when b satisfies some condition. we provide a description of the reference strings, namely the expansions of γ and γ + 1, used in the admissibility criterion.

keywords: beta expansion, q-cantor series expansion, numeration system, admissibility.

1. introduction
the subject of representations of real numbers is an extensively studied research field. in the seminal work [1], renyi introduced the now well-known concept of beta expansions. beta expansions are representations of real numbers using an arbitrary positive real base β > 1, obtained via the beta transformation tβ : [0, 1) −→ [0, 1) given by tβ(x) = βx − ⌊βx⌋. the iterates of tβ induce a numeration system on [0, 1) wherein the expansion of an element x ∈ [0, 1) is given by the sequence d(β; x) = (d1,d2, . . . ) with di = ⌊β tβ^{i−1}(x)⌋. thus, the digits di belong to the alphabet a = {0, 1, . . . ,⌊β⌋} if β /∈ n, or a = {0, 1, . . . ,β − 1} if β ∈ n. parry, in [2], considered the admissibility problem of determining the integer sequences over the alphabet a that appear as the beta expansion of a real number in the domain [0, 1).
parry provided a necessary and sufficient condition (formulated in terms of the beta expansion of 1) for a sequence of integers to be beta admissible. in the subsequent paper [3], parry extended the definition of the beta transformation to t : [0, 1) −→ [0, 1), where t(x) = βx + α − ⌊βx + α⌋ with β > 1 and 0 ≤ α < 1, and he also tackled the admissibility problem in this setting.

an important generalization of the beta expansion is a positional numeration system that uses negative bases. as remarked by frougny and lai in [4], it appears that grünwald was the first to introduce this idea in [5]. here, we present a general formulation considered by ito and sadahiro in [6]. let 1 < β ∈ r and define lβ := −β/(β + 1) and rβ := 1/(β + 1). the negative beta transformation is the map t−β : [lβ,rβ) −→ [lβ,rβ) given by t−β(x) = −βx − ⌊−βx − lβ⌋. the map t−β also induces an expansion on the domain [lβ,rβ), where the digits are given by ⌊−β t−β^{i}(x) − lβ⌋. an admissibility criterion was also given in [6, theorem 10]. (in [7], liao and steiner introduced the self-map t̂ : (0, 1] −→ (0, 1] given by t̂(x) = −βx + ⌊βx⌋ + 1. this transformation is conjugate to the one defined by ito and sadahiro and, as such, the results for the negative beta expansion can be restated using the map t̂.) as with the positive beta transformations, dombek et al., in [8], introduced a parameter α to generalize the negative beta transformation defined by ito and sadahiro. they considered the map t : [α,α + 1) −→ [α,α + 1) given by t(x) = −βx − ⌊−βx − α⌋, where β > 1 and α ∈ (−1, 0]. (see also [9, 10] for other transformations inducing an expansion in a negative base.)

the motivation of the current study originates from a certain class of rotational beta expansions in dimension two (see [11, 12]). rotational beta expansions generalize the notion of beta expansions in higher dimensions. let z = [0, 1) × [0, 1) and 1 < β ∈ r. define the four-fold rotational beta transformation t : z −→ z by

t(x, y) = (−βy − ⌊−βy⌋, βx − ⌊βx⌋).

it is easy to see that if we wish to keep track of the itinerary of a point z ∈ z under t, then we need to apply, alternatingly, the functions f1(x) = −βx − ⌊−βx⌋ and f2(x) = βx − ⌊βx⌋ to an element x ∈ [0, 1). this series of applications of the maps f1 and f2 yields a numeration system on [0, 1) in the two bases −β and β, as discussed in section 2 below. this numeration system is akin to the q-cantor series expansion [13]. given a sequence q = (qn)n≥1 of integers qn ≥ 2, the q-cantor expansion of a real number x is the unique expansion of the form

x = e0 + ∑_{n≥1} en / (q1 q2 · · · qn),

where e0 = ⌊x⌋ and en ∈ {0, 1, . . . ,qn − 1} for all n ≥ 1, such that en ≠ qn − 1 for infinitely many n. we call the numeration system considered in this paper the beta cantor series expansion, as it marries the notions of beta expansion and q-cantor series expansion. as mentioned in section 2, the beta expansion of parry and the negative beta expansion of ito and sadahiro are examples of beta cantor series expansions. it is the hope of the authors that the beta cantor series expansion provides a unified formulation for the positive and negative beta numeration systems to further highlight their similarities. after all, the positive and negative beta expansions share many similar properties (see e.g. [9, 14, 15]). our goal is to extend the work of parry on admissibility to beta cantor series expansions.
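for illustration, the q-cantor digits en above can be computed by a greedy procedure; the following sketch (with names of our own choosing) does this with floating-point arithmetic.

```python
# digits e_n of the q-cantor series expansion of x for a base
# sequence q = (q_1, q_2, ...); a sketch, not from the paper.
from math import floor

def q_cantor_digits(x, q):
    e0 = floor(x)
    frac = x - e0
    digits = [e0]
    for qn in q:
        frac *= qn                 # shift: e_n = floor(q_n * remainder)
        en = min(floor(frac), qn - 1)   # guard against rounding overshoot
        digits.append(en)
        frac -= en
    return digits

# example: x = 1/2 + 3/(2*3*4) = 0.625 with q = (2, 3, 4, 5, 6)
print(q_cantor_digits(0.625, [2, 3, 4, 5, 6]))  # -> [0, 1, 0, 3, 0, 0]
```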
in section 2, we define the transformations that induce the beta cantor series expansion. in section 3, we provide a discussion on the relationship between two different definitions of the expansion of γ + 1 (similar to the expansion of 1 in [2] and the expansion of rβ in [6]). in section 4, we tackle the problem of finding a necessary and sufficient condition for a sequence to be admissible with respect to the beta cantor series expansion.

2. b-expansion maps
fix γ ∈ r and let b = (β1,β2, . . . ), where βi ∈ r for all i ∈ n. for j ∈ n, we define fj : [γ,γ + 1) −→ [γ,γ + 1) by fj(x) = βj x − ⌊βj x − γ⌋. for m ∈ n, consider the transformation t^m = t^m_b = t^m_{b,γ} on [γ,γ + 1) given by t^m(x) = fm(· · · f3(f2(f1(x))) · · · ). hence, t^m(x) = βm t^{m−1}(x) − am(x), where am(x) = ⌊βm t^{m−1}(x) − γ⌋. for β = βm, we also define uβ := min{⌊βγ − γ⌋, ⌊β(γ + 1) − γ⌋}, vβ := max{⌊βγ − γ⌋, ⌊β(γ + 1) − γ⌋} and

a(β) := [uβ,vβ) ∩ z if β > 0 and β + γ(β − 1) ∈ z; a(β) := [uβ,vβ] ∩ z otherwise.

then am(x) ∈ a(βm). define a(b) := ∏_{m=1}^∞ a(βm), which is the set of all sequences (d1,d2, . . . ) where dm ∈ a(βm). for ease of notation, we define b[i,j] := ∏_{m=i}^{j} βm. when i = 1, we write b[j] instead of b[1,j], with the convention that b[0] := 1. observe that b[m + i] = b[m] b[m + 1,m + i]. the transformations t^m induce a numeration system on the interval [γ,γ + 1) over the alphabet a(b) if lim_{m→∞} |b[m]| = ∞.

proposition 2.1. let b = (β1,β2, . . . ) ∈ r^n and x ∈ [γ,γ + 1). if lim_{m→∞} |b[m]| = ∞, then

x = ∑_{i=1}^∞ ai(x)/b[i].

proof. for simplicity, let aj = aj(x). note that t^{j−1}(x) = (t^j(x) + aj)/βj. hence,

x = a1/β1 + t(x)/β1 = a1/β1 + a2/(β1β2) + t^2(x)/(β1β2).

in general,

x = ∑_{i=1}^m ai/b[i] + t^m(x)/b[m].

this implies that, as m → ∞,

|x − ∑_{i=1}^m ai/b[i]| = |t^m(x)/b[m]| ≤ max{|γ|, |γ + 1|}/|b[m]| → 0.

we write x = ∑_{i=1}^∞ ai/b[i] as (a1,a2, . . . )b. we call the sequence d(b; x) := (a1,a2, . . . ) the b-expansion of x. let 1 < β ∈ r. note that if γ = 0 and b = (β̄), then the b-expansion of x coincides with the classical β-expansion. (here, v̄ stands for the periodic repetition of a word v; thus (β̄) is the constant sequence (β,β, . . . ).) when γ = −β/(β + 1) and b is the constant sequence (−β,−β, . . . ), the b-expansion coincides with the (−β)-expansion. if b is periodic with period block (β1,β2, . . . ,βn) for some n ∈ n, we also call the b-expansion the {β1, . . . ,βn}-expansion of x; in this case, we may write d(b; x) as d(β1, . . . ,βn; x).

we may extend the definition of t^j to γ + 1, as has been done in [2] and [6]. for all j ∈ n, define t^j(γ + 1) := βj t^{j−1}(γ + 1) − ⌊βj t^{j−1}(γ + 1) − γ⌋. as in proposition 2.1, we have γ + 1 = ∑_{i=1}^∞ ci/b[i], where ci := ⌊βi t^{i−1}(γ + 1) − γ⌋. we also write d(b; γ + 1) = (c1,c2, . . . ). note that c1 ∈ [uβ1,vβ1] ∩ z and, for j > 1, cj ∈ a(βj) since t^{j−1}(γ + 1) < γ + 1.
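the digits am(x) can be generated numerically by iterating the maps fj; the following is a small python sketch of this procedure (the function name, the periodic cycling of a finite base list and the floor tolerance are our own choices, not part of the paper).

```python
# b-expansion digits a_i(x) of a point x in [gamma, gamma + 1);
# a sketch with naive floating-point arithmetic.
from math import floor, sqrt

def b_expansion(x, bases, gamma, n_digits=12, tol=1e-12):
    t = x
    digits = []
    for i in range(n_digits):
        beta = bases[i % len(bases)]   # the base list is cycled periodically
        a = floor(beta * t - gamma + tol)   # tolerant floor of the digit
        digits.append(a)
        t = beta * t - a               # t stays in [gamma, gamma + 1)
    return digits
```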
in this graph, the vertices are subintervals of [γ,γ + 1) that form its partition and there is a directed edge (dashed, if τ = α; and solid, otherwise) from vertex v1 to vertex v2 labelled d if and only if v2 ⊂ fτ (v1) and the corresponding digit is d. now, let γ = 1/α. we have fα(x) = { αx if x ∈ [1/α, 1) αx− 1 if x ∈ [1,α) and fβ(x) =   −αx + 2 if x ∈ [1/α, 3/α− 1] −αx + 3 if x ∈ (3/α− 1, 4/α− 1] −αx + 4 if x ∈ (4/α− 1,α) . figure 3 gives a graph corresponding to γ = 1/α where j := (1/α, 4 − 2α), k := (4 − 2α, 3/α − 1), l := (3/α−1, 1), m := (1, 2α−2), p := (2α−2, 3−α), q := (3 −α, 4/α− 1) and r := (4/α− 1,α). 3. expansion of γ + 1 the expansion of γ + 1 defined in the previous section proves to be insufficient for our purposes and hence, the definition needs to be modified. in this section, we present another definition of the expansion of γ + 1 analogous to those defined in [2] and [6, lemma 6]. hereafter, we assume b = (β1,β2, . . . ) ∈ rn with limm→∞ |b[m]| = ∞. definition 1. we define d∗(b; γ + 1) = (c∗1,c∗2, . . . ) as the limit lim x→(γ+1)− d(b; x). that is, for any n ∈ n, there exists an �n > 0 such that for all x ∈ (γ + 1 − �n,γ + 1) and for all i < n, the i-th digit of d(b; x) is c∗i where �n+1 < �n and �n → 0 as n →∞. example. let β be a quadratic pisot number. then β satisfies the minimal polynomial x2 − bx− c where b ∈ n and 1 ≤ c ≤ b; or b ∈ n−{1, 2} and 2−b ≤ c ≤ −1. let γ = 0. we compute for d∗(β,−β; γ + 1) = d∗(β,−β; 1). let � > 0 be arbitrarily small. case 1. let 1 ≤ c ≤ b. then b < β < b + 1. we have t (1 − �) = β (1 − �) −bβ (1 − �)c = β − �− b t 2 (1 − �) = −β (β − b− �) −b−β (β − b− �)c = � t 3 (1 − �) = β�−bβ�c = � t 4 (1 − �) = −β�−b−β�c = −� + 1. hence, d∗(β,−β; γ + 1) = (b,−c, 0,−1). it can be shown that d(β,−β; γ + 1) = (b,−c, 0). 216 vol. 60 no. 3/2020 beta cantor series expansion and admissible sequences (0, 1 α2 ) ( 1 α2 , 1 α ) ( 1 α , 1) 0 1α2 1 α 0 0 0 1 1 −1 −1 −1 −2 −2 −1 1 0 0 0 −1 figure 2. tα,β : [0, 1) −→ [0, 1) with α = −β = (1 + √ 5)/2 j k l m p q r 0 0 0 01 1 1 1 1 1 1 −2 −2 −2 − 3 −3 −3−3 −3 −3 −3 −4 −4 figure 3. tα,β : [1/α,α) −→ [1/α,α) with α = −β = (1 + √ 5)/2 case 2. let 2−b ≤ c ≤−1. then b−1 < β < b. so t (1 − �) = β (1 − �) −bβ (1 − �)c = β − �− b + 1 t 2 (1 − �) = −β (β − b + 1 − �) −b−β (β − b + 1 − �)c = −β + b + � t 3 (1 − �) = β(−β + b + �) −bβ(−β + b + �)c = � t 4 (1 − �) = −β�−b−β�c = −� + 1. hence, d∗(β,−β; γ + 1) = (b− 1,−b− c,−c,−1). also, we have d(β,−β; γ + 1) = (b− 1,−b− c,−c, 0). example. let b = (α,β) where α ∈ z < 0 and 0 > β ∈ r. suppose α(γ + 1) − γ ∈ z and t 2n(γ + 1) = fn(β) − bfn(β) −γc and t 2n+1(γ + 1) = gn(β) − bgn(β) −γc for some polynomials fn and gn of degree n in z[x] where fn(β) − γ /∈ z and gn(β) − γ /∈ z for all n ∈ n (e.g. β may be taken to be transcendental over q and γ ∈ q). let d(α,β; γ + 1) = (c1,c2, . . . ) and d∗(α,β; γ + 1) = (c∗1,c∗2, . . . ). then, for small � > 0, c∗1 = bα(γ + 1) −γ + �c = α(γ + 1) −γ = c1. since fn(β)−γ and gn(β)−γ are not integers, we can show that tn(γ + 1−�) = tn(γ + 1) + (−1)n+1�. moreover, c∗2n = bfn(β) −γ − �c = bfn(β) −γc = c2n and c∗2n+1 = bgn(β) −γ + �c = bgn(β) −γc = c2n+1. hence, d(α,β; γ + 1) = d∗(α,β; γ + 1). from the examples above, we see that d(b; γ + 1) may or may not be equal to d∗(b; γ + 1). in what follows, we characterize the b-expansions such that d(b; γ + 1) = d∗(b; γ + 1). let sgn denote the signum function. define ib := {n ∈ n∪{0} | sgn(b[n + 1]) > 0} and cb := {γ ∈ r | βn+1tn(γ + 1) −γ /∈ z for n ∈ ib}. proposition 3.1. 
if γ ∈ cb, then d(b; γ + 1) = d∗(b; γ + 1). proof. we show by induction on n ∈ n∪{0} that, for arbitrarily small constant, � > 0, tn(γ + 1 − �) = tn(γ + 1) −b[n]�. (?) the case where n = 0 is clear. suppose (?) holds for some n ∈ n∪{0}. then tn+1(γ + 1 − �) = βn+1tn(γ + 1 − �) −bβn+1tn(γ + 1 − �) −γc = βn+1tn(γ + 1) −b[n + 1]� −bβn+1tn(γ + 1) −b[n + 1]�−γc . since γ ∈ cb and � is arbitrarily small, then bβn+1tn(γ + 1) −b[n + 1]�−γc = bβn+1tn(γ + 1) −γc. therefore, tn+1(γ + 1 − �) = βn+1tn(γ + 1) −b[n + 1]� −bβn+1tn(γ + 1) −γc = tn+1(γ + 1) −b[n + 1]�. thus, we get d(b; γ + 1) = d∗(b; γ + 1). proposition 3.2. if d(b; γ + 1) = d∗(b; γ + 1), then γ ∈cb. proof. suppose d(b; γ + 1) = (c1,c2, . . . ) and d∗(b; γ + 1) = (c∗1,c∗2, . . . ) are equal. we show, by induction on n ∈ n∪{0}, that tn(γ + 1 − �) = tn(γ + 1) −b[n]�. (?) 217 jonathan caalim, shiela demegillo acta polytechnica the base case n = 0 is clear. suppose (?) holds for some n ∈ n∪{0}. then tn+1(γ + 1 − �) = βn+1(tn(γ + 1) −b[n]�) − c∗n+1 = βn+1tn(γ + 1) −b[n + 1]�− cn+1 = tn+1(γ + 1) −b[n + 1]�. thus, for all n ∈ n∪{0} and � > 0 sufficiently small, c∗n+1 = bβn+1t n(γ + 1) −b[n + 1]�−γc = bβn+1tn(γ + 1) −γc = cn+1. if n ∈ ib, then βn+1tn(γ + 1) − γ /∈ z. thus, γ ∈ cb. combining prop. 3.1 and prop. 3.2, we have the following theorem. theorem 3.3. γ ∈ cb if and only if d(b; γ + 1) = d∗(b; γ + 1). theorem 3.3 and [2, theorem 3] imply corollary 3.3.1 while theorem 3.3 and [6, lemma 6] imply corollary 3.3.2. corollary 3.3.1. let 1 < β ∈ r. let t : [0, 1) −→ [0, 1) be the beta transformation given by t(x) = βx−bβxc. then the following are equivalent: (1.) d(b; 1) = d∗(b; 1); (2.) βtj(1) /∈ z for all j ∈ n∪{0}; (3.) d(b; 1) is infinite. corollary 3.3.2. let 1 < β ∈ r. let t−β be the negative beta transformation on [lβ,rβ) given by t−β(x) = −βx−b−βx− lβc. then the following are equivalent: (1.) d(b; rβ) = d∗(b; rβ); (2.) −βt 2j+1−β (rβ) − lβ /∈ z for all j ∈ n∪{0}; (3.) d(b; rβ) is not purely periodic of odd period. next, we determine the relation between d∗(b; γ+1) and d(b; γ + 1) when they are not equal (i.e., γ /∈cb). define the propositional statement e(b; k) to mean βk+1t k(γ + 1) −γ ∈ z and sgn(b[k + 1]) > 0. suppose e(b; k) holds and k is minimal with such property. then tk+1(γ+1) = βk+1tk(γ+1)−bβk+1tk(γ+1)−γc = γ. thus, if d(b; γ + 1) = (c1,c2, . . . ), then d(b; γ + 1) = (c1,c2, . . . ,ck+1) ◦d(σk+1(b); γ), where ◦ denotes the usual word concatenation and σj (j ∈ n) is the shift operator in rn given by σj(r1,r2, . . . ) = (rj+1,rj+2, . . . ). moreover, from the proof of proposition 3.1, we see that tk+1(γ + 1 − �) = βk+1tk(γ + 1) −b[k + 1]� −bβk+1tk(γ + 1) −b[k + 1]�−γc = γ + 1 − �. therefore, d∗(b; γ+1) = (c1,c2, . . . ,ck+1−1)◦d∗(σk+1(b); γ+1). from the computation above, we see that the process of determining d∗(b; γ + 1) depends on the other sequences d∗(σi(b); γ + 1), i ∈ n. to illustrate this process, we present the two-base expansion case where we set α := β1 > 0 and β := β2 > 0. we easily compute ib to be n∪{0}. suppose e(b; k) is satisfied and k is minimal. on the one hand, suppose k is odd. then d(α,β; γ + 1) = (c1,c2, . . . ,ck+1) ◦d(α,β; γ) and d∗(α,β; γ + 1) = (c1,c2, . . . ,ck+1−1)◦d∗(α,β; γ + 1). this implies that d∗(α,β; γ + 1) = (c1,c2, . . . ,ck+1 − 1). on the other hand, suppose k is even. then d(α,β; γ + 1) = (c1,c2, . . . ,ck+1) ◦d(β,α; γ) and d∗(α,β; γ + 1) = (c1,c2, . . . ,ck+1−1)◦d∗(β,α; γ + 1). note that iσ(b) = n∪{0}. suppose that there is no m ∈ n such that e(σ(b); m) holds. 
then d∗(β,α; γ + 1) = d(β,α; γ + 1) and so, d∗(α,β; γ + 1) = (c1,c2, . . . ,ck+1 −1)◦d(β,α; γ + 1). let d(β,α; γ + 1) = (q1,q2, . . . ). if there exists m ∈ n∪{0} such that e(σ(b); m) holds and m is minimal and odd, then d∗(β,α; γ + 1) = (q1,q2, . . . ,qm+1 − 1). therefore, d∗(α,β; γ + 1) = (c1,c2, . . . ,ck+1 − 1) ◦ (q1,q2, . . . ,qm+1 − 1). finally, if m is even, we have d∗(β,α; γ+1) = (q1,q2, . . . ,qm+1−1)◦d∗(α,β; γ+1). 218 vol. 60 no. 3/2020 beta cantor series expansion and admissible sequences hence, d∗(α,β; γ + 1) = (c1,c2, . . . ,ck+1 − 1) ◦ (q1,q2, . . . ,qm+1 − 1) ◦d∗(α,β; γ + 1) = (c1,c2, . . . ,ck+1 − 1,q1,q2, . . . ,qm+1 − 1). to sum up, we have the following proposition. proposition 3.4. let b = (α,β) where α,β ∈ r > 0 and αβ > 1. let d(α,β; γ + 1) = (c1,c2, . . . ) and d(β,α; γ + 1) = (q1,q2, . . . ). then d∗(α,β; γ + 1) can only assume one of the following forms: (1.) (c1,c2, . . . ,c2k − 1) (2.) (c1,c2, . . . ,c2k+1 − 1,q1,q2, . . . ) (3.) (c1,c2, . . . ,c2k+1 − 1,q1,q2, . . . ,q2m − 1) (4.) (c1,c2, . . . ,c2k+1 − 1,q1,q2, . . . ,q2m+1 − 1) (5.) (c1,c2, . . . ) examples. we now give examples to illustrate prop. 3.4 (1–5) by providing values of α > 0 and β > 0 with αβ > 1 and γ = 0 for each case. let r,s ∈ n. (1.) let α = r/s /∈ z and β = s. d(α,β; γ + 1) = (br/sc,r −sbr/sc, 0) d∗(α,β; γ + 1) = (br/sc,r −sbr/sc− 1) (2.) let α = r and β be transcendental over q. d(α,β; γ + 1) = (r, 0) d(β,α; γ + 1) = d∗(β,α; γ + 1) = (q1,q2, . . . ) d∗(α,β; γ + 1) = (r − 1,q1,q2, . . . ) (3.) let α = (1 + √ 5)/2 and β = α2. d(α,β; γ + 1) = (1, 1, 1, 0) d(β,α; γ + 1) = (2, 1, 0) d∗(α,β; γ + 1) = (1, 1, 0, 2, 0) (4.) let β = α + 1 where α is the (smallest) pisot number which satisfies α3 −α− 1 = 0. d(α,β; γ + 1) = (1, 0, 1, 0) d(β,α; γ + 1) = (2, 0, 1, 0) d∗(α,β; γ + 1) = (1, 0, 0, 2, 0, 0) (5.) let α be transcendental over q and β = r. then d(α,β; γ + 1) = d∗(α,β; γ + 1). to end this section, we recover the classical results for beta and negative beta expansions. let 1 < β ∈ r. for the positive beta expansion, we see that ib = n∪ {0} and cb = {γ ∈ r | βtn(γ + 1)−γ /∈ z for all n ∈ ib}. suppose 0 /∈ cb. then there exists a minimal k ∈ n∪{0} such that e(b; k) holds. we have d(b; 1) = (c1, . . . ,ck+1) ◦d(b; 0) = (c1, . . . ,ck+1, 0) and d∗(b; 1) = (c1, . . . ,ck+1 − 1). in other words, d∗(b; 1) ={ d(b; 1) if d(b; 1) is infinite (c1,c2, . . . ,cn − 1) if d(b; 1) = (c1, . . . ,cn, 0). for the negative beta expansion, we have ib = 2n − 1 and cb = {γ ∈ r | −βt 2n−1(γ + 1) − γ /∈ z for all n ∈ n}. suppose lβ /∈ cb. then there exists minimal k ∈ n such that e(b; 2k − 1) holds. let d(b; lβ) = (a1,a2, . . . ). it is easy to see that d(b; rβ) = (0,a1,a2, . . . ). from e(b; 2k − 1), it follows that d(b; rβ) = (c1, . . . ,c2k,a1,a2, . . . ). this means that c1 = 0; ci = ai−1 for all i = 2, . . . , 2k; and ai = a2k−1+i for all i ∈ n. therefore, d(b; lβ) = (a1, . . . ,a2k−1) and d∗(b; rβ) = (c1, . . . ,c2k − 1) ◦d∗(b; rβ) = (c1, . . . ,c2k − 1) = (0,a1, . . . ,a2k−1 − 1). this is equivalent to d∗(b; rβ) =   (0,a1,a2, . . . ,a2n−1 − 1) if d(b; lβ) = (a1,a2, . . . ,a2n−1) d(b; rβ) otherwise. 4. admissible sequences throughout this section, we let b = (β1,β2, . . . ) ∈ rn with limm→∞ |b[m]| = ∞. a b-representation of a real number x ∈ [γ,γ + 1) is an expansion of the form x = ∞∑ i=1 di b[i] with (d1,d2, . . . ) ∈ a(b). note that the condition lim |b[m]| = ∞ does not guarantee that any sequence (d1,d2, . . . 
) in a(b) is a b-representation of a real number x since the series ∑∞ i=1 di/b[i] may not converge. if the sum converges, we adopt the notation (d1,d2, . . . )b = ∑∞ i=1 di/b[i]. now, the b-expansion of x is a particular brepresentation of x. deciding whether a sequence (d1,d2, . . . ) in a(b) is the b-expansion of an element of [γ,γ + 1), thus, entails showing that the series converges. definition 2. an integer sequence (d1,d2, . . . ) ∈ a(b) is b-admissible if there is an x ∈ [γ,γ + 1) such that d(b; x) = (d1,d2, . . . ). 219 jonathan caalim, shiela demegillo acta polytechnica the admissibility of sequences with respect to the b-expansion map is related to the admissibility of sequences for a special class of rotational beta expansion map. let z = [0, 1) × [0, 1) and 1 < β ∈ r. define the map t : z −→z by t ((x,y)) = (−βy −b−βyc, βx−bβxc). let t be the b-expansion map on [0, 1) with b = (−β,β). it follows that for all n ∈ n, we have t 2n−1(x,y) = ( t 2n−1b (y), t 2n−1 σ(b) (x) ) and t 2n(x,y) = ( t 2nσ(b)(x), t 2n b (y) ) . so, if d(b; y) = (a1,a2, . . . ) and d(σ(b); x) = (b1,b2, . . . ), then the expansion of (x,y) with respect to t is ((a1,b1), (b2,a2), (a3,b3), . . . ) . proposition 4.1. let b = (−β,β) with β > 1. then (a1,a2, . . . ) ∈ a(b) is b-admissible and (b1,b2, . . . ) ∈a(σ(b)) is σ(b)-admissible if and only if ((a1,b1), (b2,a2), (a3,b3), . . . ) is admissible with respect to t . in this section, our goal is to provide an admissibility criterion for sequences in a(b). we first mention few results. lemma 4.2. let x ∈ [γ,γ + 1) such that d(b; x) = (a1,a2, . . . ). for n ∈ n, tn(x) = b[n]x− n∑ i=1 aib[n] b[i] . proof. we prove this lemma by induction. let x ∈ [γ,γ + 1). then t (x) = b[1]x−a1. suppose that for some k ∈ n, tk(x) = b[k]x− k∑ i=1 aib[k]/b[i]. thus, tk+1(x) = βk+1tk(x) −ak+1 = b[k + 1]x− k∑ i=1 aib[k + 1] b[i] −ak+1 b[k + 1] b[k + 1] = b[k + 1]x− k+1∑ i=1 aib[k + 1] b[i] . in the following lemma, we give certain conditions for a b-representation (d1,d2, . . . ) to be a b-expansion. note that the convergence of the sum (d1,d2, . . . )b implies the convergence of (dk+1,dk+2, . . . )σk (b) for all k ∈ n∪{0}. lemma 4.3. let (d1,d2, . . . ) be a b-representation of x ∈ [γ,γ + 1). if (dk+1,dk+2, . . . )σk (b) ∈ [γ,γ + 1) for all k ∈ n∪{0}, then d(b; x) = (d1,d2, . . . ). proof. by induction on n ∈ n, we prove that dn =⌊ βnt n−1(x) −γ ⌋ and tn(x) = (dn+1,dn+2, . . . )σn(b). note that β1x − d1 = (d2,d3, . . . )σ(b) ∈ [γ,γ + 1). hence, d1 = ⌊ β1t 0(x) −γ ⌋ and t(x) = (d2,d3, . . . )σ(b). suppose the claim holds for n ≤ k where k ∈ n. then βk+1t k(x) −dk+1 = βk+1 ( dk+1 βk+1 + dk+2 βk+1βk+2 + . . . ) −dk+1 = dk+2 βk+2 + dk+3 βk+2βk+3 + . . . = (dk+2,dk+3, . . . )σk+1(b) ∈ [γ,γ + 1). hence, dk+1 = ⌊ βk+1t k+1(x) −γ ⌋ and tk+1(x) = (dk+2,dk+3, . . . )σk+1(b). corollary 4.3.1. let x ∈ [γ,γ + 1) such that d(b; x) = (a1,a2, . . . ). then d(σn(b); tn(x)) = (an+1,an+2, . . . ). proof. by lemma 4.2, tn(x) = b[n] ( x− n∑ i=1 ai b[i] ) = b[n] ∑ i≥n+1 ai b[i] = b[n] ∑ i≥1 an+i b[n + i] = ∑ i≥1 an+i b[n + 1,n + i] . thus, (an+1,an+2, . . . ) is a σn(b)-representation of tn(x). for all k ∈ n, we have σk(an+1,an+2, . . . )σk (σn(b)) = σn+k(a1,a2, . . . )σn+k (b) = tn+k(x) ∈ [γ,γ + 1). the conclusion then follows from lemma 4.3. remark. proposition 2.1, lemma 4.2, and corollary 4.3.1 also hold when x = γ + 1. from lemma 4.3 and corollary 4.3.1, we obtain the following proposition, which gives an admissibility criterion for a sequence (d1,d2, . . . ) ∈a(b) in terms of σk(d1,d2, . . . 
)σk (b). proposition 4.4. a sequence (d1,d2, . . . ) ∈ a(b) is b-admissible if and only if σk(d1,d2, . . . )σk (b) ∈ [γ,γ + 1) for all k ∈ n∪{0}. now, we provide another admissibility criterion – this time, in terms of the shifts of a sequence (x1,x2, . . . ) ∈ a(b). to this end, we need to introduce an order ≺b on a(b). 220 vol. 60 no. 3/2020 beta cantor series expansion and admissible sequences definition 3. let (a1,a2, . . . ) and (b1,b2, . . . ) be in a(b). we say (a1,a2, . . . ) ≺b (b1,b2, . . . ) if and only if there exists k ∈ n such that bi = ai for all i = 1, 2, . . . ,k − 1 and bk 6= ak where (bk −ak) sgn(b[k]) ≥ 1. if (a1,a2, . . . ) ≺b (b1,b2, . . . ) or (a1,a2, . . . ) = (b1,b2, . . . ), we write (a1,a2, . . . ) �b (b1,b2, . . . ). remark. for the classical positive and negative beta expansions, the order ≺b coincides with the orders defined in [2] and [6], respectively. the following proposition tells us that the monotonicity of points in [γ,γ + 1) is carried over to the ordering of words with respect to ≺b. proposition 4.5. let x,y ∈ [γ,γ + 1). then d(b; x) ≺b d(b; y) if and only if x < y. proof. let d(b; x) = (x1,x2, . . . ) and d(b; y) = (y1,y2, . . . ). let k ∈ n be the least integer such that xk 6= yk. suppose d(b; x) ≺b d(b; y). then y −x = yk −xk b[k] + ∑ i≥k+1 yi −xi b[i] . we have∑ i≥k+1 yi −xi b[i] = ∑ i≥1 yk+i −xk+i b[k + i] = 1 b[k] ∑ i≥1 yk+i −xk+i b[k + 1,k + i] = tk(y) −tk(x) b[k] = (tk(y) −tk(x)) sgn(b[k]) |b[k]| > −1 |b[k]| . thus, y −x > (yk −xk) sgn(b[k]) − 1 |b[k]| ≥ 0. for the reverse implication, suppose 0 < y −x = yk −xk + tk(y) −tk(x) b[k] . note that −1 < tk(y) − tk(x) < 1. when sgn(b[k]) > 0, then yk − xk + 1 > 0. this implies that yk − xk ≥ 0 since both yk and xk are integers. but since yk 6= xk, then yk −xk ≥ 1. however, when sgn(b[k]) < 0, then 0 > yk−xk−1. thus, yk−xk ≤ 0. but since yk 6= xk, then yk −xk ≤−1. in both cases, (yk −xk) sgn(b[k]) ≥ 1. proposition 4.5, together with corollary 4.3.1, implies the following result. corollary 4.5.1. if (d1,d2, . . . ) ∈ a(b) is badmissible, then, for all n ∈ n, d(σn(b); γ) �σn(b) (dn+1,dn+2, . . . ). analogous to proposition 2.1 and lemma 4.3, we provide a relation between d∗(b; γ + 1) and γ + 1. proposition 4.6. if d∗(b; γ + 1) = (c∗1,c∗2, . . . ), then γ + 1 = (c∗1,c∗2, . . . )b and (c∗k+1,c ∗ k+2, . . . )σk (b) ∈ [γ,γ + 1] for all k ∈ n. proof. suppose d∗(b; γ + 1) = (c∗1,c∗2, . . . ). then there exist a sequence {�n} converging to 0 and a sequence {yn} such that yn ∈ (γ + 1 − �n,γ + 1) and d(b; yn) = (c∗1, . . . ,c∗n,yn,1,yn,2, . . . ). thus, yn = n∑ i=1 c∗i b[i] + ∑ i≥1 yn,i b[n + i] . since∑ i≥1 yn,i b[n + i] = 1 b[n] ∑ i≥1 yn,i σn(b)[i] ∈ 1 b[n] [γ,γ + 1), then lim n→∞ ∑ i≥1 yn,i b[n + i] = 0. hence, γ + 1 = lim n→∞ yn = lim n→∞   n∑ i=1 c∗i b[i] + ∑ i≥1 yn,i b[i]   = ∑ i≥1 c∗i b[i] . now, for j,k ∈ n, let us consider d(b; yk+j) = (c∗1,c ∗ 2, . . . ,c ∗ k+j,yk+j,1,yk+j,2, . . . ). by corollary 4.3.1, d(σk(b); tk(yk+j)) = (c∗k+1, . . . ,c ∗ k+j,yk+j,1,yk+j,2, . . . ). hence, (c∗k+1, . . . ,c ∗ k+j,yk+j,1,yk+j,2, . . . )σk (b) ∈ [γ,γ + 1). that is, γ ≤ wj := j∑ i=1 c∗k+i σk(b)[i] + ∑ i≥1 yk+j,i σk(b)[j + i] < γ + 1. since {wj} tends to (c∗k+1,c ∗ k+2, . . . )σk (b), then γ ≤ (c∗k+1,c ∗ k+2, . . . ) ≤ γ + 1. proposition 4.7. if x ∈ [γ,γ + 1), then d(b; x) ≺b d∗(b; γ + 1). 221 jonathan caalim, shiela demegillo acta polytechnica proof. let d∗(b; γ + 1) = (c∗1,c∗2, . . . ). then there exist a sequence �n tending to zero and yn ∈ (γ + 1 − �n,γ + 1) such that d(b; yn) = (c∗1, . . . ,c∗n,yn,1,yn,2, . . 
. ) with c∗n+1 6= yn,1, so that d(b; yn) 6= d∗(b; γ + 1). suppose d(b; yn) �b d∗(b; γ + 1). there exists yn+m ∈ (yn,γ + 1) where m ≥ 1 such that d(b; yn+m) = (c∗1, . . . ,c ∗ n+m,yn+m,1,yn+m,2, . . . ). since d(b; yn) �b d∗(b,γ + 1), then (yn,1 − c∗n+1) sgn(b[n + 1]) ≥ 1, implying that d(b; yn) �b d(b; yn+m). by proposition 4.5, yn > yn+m which is a contradiction since yn+m ∈ (yn,γ + 1). hence, if x < yn, then d(b; x) ≺b d(b; yn) ≺b d∗(b; γ+1). definition 4. a sequence (d1,d2, . . . ) ∈a(b) satisfies the lexicographic restriction if, for all k ∈ n∪{0}, d(σk(b); γ) �σk (b) σk(d1,d2, . . . ) ≺σk (b) d ∗(σk(b); γ + 1). combining corollary 4.5.1 and proposition 4.7 yields the following proposition. proposition 4.8. let x ∈ [γ,γ + 1). then d(b; x) satisfies the lexicographic restriction. we now show that the converse of prop. 4.8 holds under some condition. to proceed, consider a sequence z = (z1,z2, . . . ) ∈a(b). for i ∈ n, we define z(i,j) = { (zi,zi+1, . . . ,zi+j) if j ∈ n∪{0} (zi,zi+1, . . . ) if j = ∞ and set z(i,j)σi−1(b) = j∑ k=0 zi+k b[i, i + k] , provided that the sum converges if j = ∞. for n ∈ n∪ {0}, let u(n) = d(σn(b); γ) and v(n) = d∗(σn(b); γ + 1). lemma 4.9. let w = (w1,w2, . . . ) ∈ a(b). if w satisfies the lexicographic restriction, then there are infinitely many n such that at least one of the two holds: (1.) b[n] > 0 and (w(1,n− 1) ◦u(n))b ≥ γ; (2.) b[n] < 0 and (w(1,n− 1) ◦v(n))b ≥ γ. proof. suppose w 6= d(b; γ). for the base of the induction, we set n = 0 and define w(1,−1) as the empty word. then (w(1,−1) ◦u(0))b = γ. likewise (w(1,−1)◦v(0))b = γ + 1 ≥ γ. now, let m ∈ n∪{0}. case 1 suppose b[m] > 0 and (w(1,m − 1) ◦ u(m))b ≥ γ hold. by the lexicographic restriction, u(m) = d(σm(b); γ) ≺σm(b) σm(w) = w(m + 1,∞). thus, there exists a least positive integer l such that wm+i = u(m)(i, 0) for all i < l and (wm+l −u(m)(l, 0)) sgn(σm(b)[l]) ≥ 1. since b[m] > 0, then sgn(σm(b)[l]) = sgn(b[m + l]). case 1.1 suppose b[m + l] > 0 so that (wm+l − u(m)(l, 0)) ≥ 1. we have (w(1,m + l− 1) ◦u(m+l))b − (w(1,m− 1) ◦u(m))b = wm+l −u(m)(l, 0) b[m + l] + (u(m+l))σm+l(b) − (u(m)(l + 1,∞))σm+l(b) b[m + l] . now, wm+l −u(m)(l, 0) b[m + l] ≥ 1 b[m + l] . meanwhile, (u(m+l))σm+l(b) − (u(m)(l + 1,∞))σm+l(b) = γ −t lσm(b)(γ). hence, (u(m+l))σm+l(b) − (u(m)(l + 1,∞))σm+l(b) b[m + l] is greater than −1/b[m + l] and less than or equal to 0. therefore, (w(1,m+l−1)◦u(m+l))b−(w(1,m−1)◦u(m))b ≥ 0, implying that (w(1,m+l−1)◦u(m+l))b > (w(1,m−1)◦u(m))b ≥ γ. case 1.2 suppose b[m + l] < 0. then (w(1,m + l− 1) ◦v(m+l))b − (w(1,m− 1) ◦u(m))b = wm+l −u(m)(l, 0) b[m + l] + (v(m+l))σm+l(b) − (u(m)(l + 1,∞))σm+l(b) b[m + l] . since b[m + l] < 0, we have wm+l −u(m)(l, 0) b[m + l] ≥ −1 b[m + l] . moreover, (v(m+l))σm+l(b) − (u(m)(l + 1,∞))σm+l(b) = γ + 1 −t l σm(b)(γ). it follows that (v(m+l))σm+l(b) − (u(m)(l + 1,∞))σm+l(b) b[m + l] 222 vol. 60 no. 3/2020 beta cantor series expansion and admissible sequences is less than 0 but greater than or equal to 1/b[m + l]. therefore, (w(1,m+l−1)◦v(m+l))b−(w(1,m−1)◦u(m))b ≥ 0, and consequently, (w(1,m+l−1)◦v(m+l))b > (w(1,m−1)◦u(m))b ≥ γ. case 2 suppose b[m] < 0 and (w(1,m − 1) ◦ v(m))b ≥ γ hold. by the lexicographic restriction, σm(w) = w(m+1,∞) ≺σm(b) v(m) = d∗(σm(b); γ+1). thus, there exists a least positive integer l such that wm+i = v(m)(i, 0) for all i < l and (v(m)(l, 0) −wm+l) sgn(σm(b)[l]) ≥ 1. since b[m] < 0, we have sgn(σm(b)[l]) = −sgn(b[m + l]). as before, we have two subcases: sgn(b[m + l]) > 0 and sgn(b[m + l]) < 0. the proofs are similar. 
analogous to lemma 4.9, we have the following result. lemma 4.10. let w ∈a(b). if w satisfies the lexicographic restriction, then there are infinitely many n such that at least one of the two holds: (1.) b[n] < 0 and (w(1,n− 1) ◦u(n))b ≤ γ + 1; (2.) b[n] > 0 and (w(1,n− 1) ◦v(n))b ≤ γ + 1. we now apply lemmas 4.9 and 4.10 to prove the next proposition. proposition 4.11. let w = (w1,w2, . . . ) ∈ a(b) such that the sum σk(w)σk (b) converges for all k ∈ n ∪{0}. if w satisfies the lexicographic restriction, then σk(w)σk (b) ∈ [γ,γ + 1). proof. we show that γ ≤ wb ≤ γ + 1. let en(w) := σn(w)σn(b), which by assumption converges. then, for all n ∈ n, wb = n∑ k=1 wk b[k] + en(w) b[n] = w(1,n− 1)b + en(w) b[n] . thus, as n tends to ∞, the quotient en(w)/b[n] tends to 0. for sufficiently large n, (w(1,n− 1) ◦u(n))b ≥ γ or (w(1,n− 1) ◦v(n))b ≥ γ by lemma 4.9. so, wb − (w(1,n− 1) ◦ t(n))b = en(w) b[n] − (t(n))σn(b) b[n] = en(w) b[n] − c b[n] −→ 0 where (t(n),c) is either (u(n),γ) or (v(n),γ+1). therefore, wb ≥ γ. in general, observe that as w satisfies the lexicographic restriction with respect to b, then σm(w) also satisfies the lexicographic restriction with respect to σm(b). consequently, lemmas 4.9 and 4.10 apply. in other words, letting σm(w), σm(b), (u(m))σm(b) and (v(m))σm(b) take the role of w, b, γ and γ + 1, respectively, in lemma 4.9, we obtain the conclusion that σm(w)σm(b) ≥ γ. likewise, we have σm(w)σm(b) ≤ γ + 1 for all m ∈ n ∪{0} by lemma 4.10. the only thing we are left to do is to show that σm(w)σm(b) 6= γ + 1. let z = (z1,z2, . . . ) denote the sequence d∗(σm−1(b); γ + 1). let s be the least positive integer such that wm+i−1 = zi for 1 ≤ i < s and (zs −wm+s−1) sgn(σm−1(b)[s]) ≥ 1. note that there exists y ∈ [γ,γ + 1) such that d(σm−1(b); y ) = (z1, . . . ,zs,ys+1,ys+2 . . . ). then, |y (s + 1,∞)σm +s−1(b) −w(m + s,∞)σm +s−1(b)| ≤ (γ + 1) −γ = 1. therefore, y −w(m,∞)σm −1(b) = (zs −wm+s−1) + y (s + 1,∞)σm +s−1(b) σm−1(b)[s] − w(m + s,∞)σm +s−1(b) σm−1(b)[s] ≥ 1 + sgn(σm−1(b)[s])(y (s + 1,∞)σm +s−1(b) |σm−1(b)[s]| − w(m + s,∞)σm +s−1(b)) |σm−1(b)[s]| ≥ 1 − 1 |σm−1(b)[s]| = 0. since γ + 1 > y ≥ w(m,∞)σm −1(b), then w(m,∞)σm −1(b) 6= γ + 1. in the previous proposition, an important part of the proof is the assumption that the sequence w = (w1,w2, . . . ) ∈a(b) has the property that the series σk(w)σk (b) converges for all k ∈ n ∪{0}. it is clear that if the base b = (β1,β2, . . . ) is eventually periodic, then this property holds for w. we can say more. first, note that the digits are bounded by uβi and vβi (see section 2), which, in turn, satisfy max(|uβi|, |vβi|) ≤ (|βi| + 1)(|γ| + 1). now, let us consider the following. for the base b, let |b| be the sequence (|β1|, |β2|, . . . ). suppose that (|β1| + 1, |β2| + 1, . . . )|b| = ∞∑ n=1 |βn| + 1 |b[n]| < ∞. (??) 223 jonathan caalim, shiela demegillo acta polytechnica then for every sequence w ∈ a(b), the sum wb is convergent. indeed,∣∣∣∣∣ ∞∑ n=1 wn b[n] ∣∣∣∣∣ ≤ ∞∑ n=1 ∣∣∣∣ wnb[n] ∣∣∣∣ ≤ ∞∑ n=1 (|βn| + 1)(|γ| + 1) |b[n]| ≤ (|γ| + 1) ∞∑ n=1 |βn| + 1 |b[n]| < ∞. note that if b is eventually periodic, then (??) holds. however, if b = (b1,b2, . . . ) with bn = (n+ 1)/n, then (??) does not hold. we now state the main result of this article, which provides a sufficient and necessary condition for admissibility of integer sequence in a(b) with respect to the beta cantor series expansion for a base sequence b satisfying (??). it would be interesting to know how the result can be extended beyond property (??). theorem 4.12. 
let b ∈ rn such that limm→∞ |b[m]| = ∞ and (??) holds. let (d1,d2, . . . ) ∈ a(b). then (d1,d2, . . . ) is badmissible if and only if (d1,d2, . . . ) satisfies the lexicographic restriction. acknowledgements the authors would like to thank the anonymous reviewer for valuable remarks that improved the quality of the paper. j. caalim is grateful to the university of the philippines for the financial support through its phd incentive award under the office of the vice chancellor for research and development. s. demegillo is grateful to the department of science and technology of the philippine government for the financial support under the dost-asthrdp scholarship grant. references [1] a. rényi. representations for real numbers and their ergodic properties. acta math acad sci hungar 8:477–493, 1957. doi:10.1007/bf02020331. [2] w. parry. on the β-expansions of real numbers. acta math acad sci hungar 11:401–426, 1960. doi:10.1007/bf02020954. [3] w. parry. representations for real numbers. acta math acad sci hungar 15:95–105, 1964. doi:10.1007/bf01897025. [4] c. frougny, a. lai. on negative bases. development in language theory pp. 252–263, 2009. doi:10.1007/978-3-642-02737-6_20. [5] v. grünwald. intorno all’aritmetica dei sistemi numerici a base negativa con particolare riguardo al sistema numerico a base negativo-decimale per lo studio delle sue analogie coll’aritmetica ordinaria (decimale). giornale di matematiche di battaglini pp. 203–221, 1885. [6] s. ito, t. sadahiro. beta-expansions with negative bases. integers 9:239–259, 2009. doi:10.1515/integ.2009.023. [7] l. liao, w. steiner. dynamical properties of the negative beta transformation. ergod th & dynam sys 32(5):1673–1690, 2012. doi:10.1017/s0143385711000514. [8] d. dombek, z. masáková, e. pelantová. number representation using the generalized (−β)-transformation. theoretical computer science 412(48):6653–6665, 2011. doi:10.1016/j.tcs.2011.08.028. [9] c. kalle. isomorphisms between positive and negative (β)-transformations. ergod th & dynam sys 34(1):153–170, 2014. doi:10.1017/etds.2012.127. [10] k. dajani, c. kalle. transformations generating negative (−β)-expansions. integers 11(b):1–18, 2011. [11] s. akiyama, j. caalim. rotational beta expansion: ergodicity and soficness. j math soc japan 69(1):397–415, 2017. doi:10.2969/jmsj/06910397. [12] s. akiyama, j. caalim. invariant measure of rotational beta expansion and tarski’s plank problem. discrete comput geom 57(2):357–370, 2017. doi:10.1007/s00454-016-9849-4. [13] g. cantor. üeber die einfachen zahlensysteme. zeitschrift für math und physik 14:121–128, 1869. [14] z. masáková, e. pelantová. ito-sadahiro numbers vs. parry numbers. acta polytechnica 51(4):59–64, 2011. [15] k. dajani, s. ramawadh. symbolic dynamics of (−β)-expansions. journal of integer sequences 15:12.2.6/1 – 12.2.6/21, 2012. 224 http://dx.doi.org/10.1007/bf02020331 http://dx.doi.org/10.1007/bf02020954 http://dx.doi.org/10.1007/bf01897025 http://dx.doi.org/10.1007/978-3-642-02737-6_20 http://dx.doi.org/10.1515/integ.2009.023 http://dx.doi.org/10.1017/s0143385711000514 http://dx.doi.org/10.1016/j.tcs.2011.08.028 http://dx.doi.org/10.1017/etds.2012.127 http://dx.doi.org/10.2969/jmsj/06910397 http://dx.doi.org/10.1007/s00454-016-9849-4 acta polytechnica 60(3):214–224, 2020 1 introduction 2 b-expansion maps 3 expansion of +1 4 admissible sequences acknowledgements references acta polytechnica https://doi.org/10.14311/ap.2021.61.0089 acta polytechnica 61(si):89–98, 2021 © 2021 the author(s). 
licensed under a cc-by 4.0 licence published by the czech technical university in prague modeling of flows through a channel by the navier–stokes variational inequalities stanislav kračmara, jiří neustupab,a,∗ a czech technical university, faculty of mechanical engineering, department of technical mathematics, karlovo nám. 13, 121 35 praha, czech republic b czech academy of sciences, institute of mathematics, žitná 25, 115 67 praha, czech republic ∗ corresponding author: neustupa@math.cas.cz abstract. we deal with a mathematical model of a flow of an incompressible newtonian fluid through a channel with an artificial boundary condition on the outflow. we explain how several artificial boundary conditions formally follow from appropriate variational formulations and the way one expresses the dynamic stress tensor. as the boundary condition of the “do nothing”-type, that is predominantly considered to be the most appropriate from the physical point of view, does not enable one to derive an energy inequality, we explain how this problem can be overcome by using variational inequalities. we derive a priori estimates, which are the core of the proofs, and present theorems on the existence of solutions in the unsteady and steady cases. keywords: variational inequality, navier-stokes equation, “do nothing” outflow boundary condition. 1. introduction 1.1. the considered initial–boundary value problem we denote by ω a lipschitzian domain in r3, which represents a channel. an incompressible newtonian fluid is supposed to flow into the channel through the part γ1 of the boundary ∂ω and to flow essentially out of the channel through the part γ2 of ∂ω. (see fig. 1.) by “essentially” we mean that we do not exclude possible backward flows on γ2. a fixed wall of the channel is denoted by γ0. the flow is described by the equations of motion ∂tv + v ·∇v− div sd + ∇p = f, (1) div v = 0, (2) 6 � � � � � ��3 � � � � �� q q q q qqs q q q q q qq ���: ���: ���: ω γ0 γ1 γ2 x1 x2 x3 fig. 1 the channel. where v denotes the velocity, p is the pressure, sd is the dynamic stress tensor and f represents an external body force. for simplicity, we assume that the density of the fluid is equal to one. we use the homogeneous dirichlet boundary condition v = 0 on γ0 × (0,t), (3) where (0,t ) is a time interval. the velocity on γ1 can be naturally assumed to be known, which yields the inhomogeneous dirichlet boundary condition v = v∗ on γ1 × (0,t). (4) on the other hand, since the velocity profile on γ2 cannot be predicted in advance, it is logical to apply some “artificial” boundary condition. there appear various artificial boundary conditions in the literature, see e.g. [1–8]. boundary conditions, that follow automatically from an appropriate weak formulation of the considered problem if one a priori assumes a sufficient regularity of asolution, are usually called the “do nothing” conditions. (see e.g. [1, 6, 9] for more details.) an example, and probably the most often used artificial boundary condition is −pn + ν ∂v ∂n = g on γ2 × (0,t), (5) where n denotes the outer normal vector field on ∂ω, ν is the coefficient of viscosity and g is a given function. the non–steady problem also contains the initial condition v = v0 in ω ×{0}. (6) 1.2. on some previous related existential results the existential theory for the system (1)–(6) is based on appropriate a priori estimates. 
as the boundary condition (5) admits a possible reverse flow on γ2, which may bring to ω an arbitrarily large amount of kinetic energy from the outside, the usual energy inequality does not hold. this does not matter if the given data of the problem are in an appropriate sense 89 https://doi.org/10.14311/ap.2021.61.0089 https://creativecommons.org/licenses/by/4.0/ https://www.cvut.cz/en stanislav kračmar, jiří neustupa acta polytechnica “sufficiently small” or if the number t is “sufficiently small” in the non–steady case. (see [1, 9, 10].) the existence of a weak solution of the problem (1)–(6) on an arbitrarily large time interval for “large” data (which is well known for the navier–stokes equations with the dirichlet or navier boundary conditions on ∂ω), is an open problem. a similar problem arises if one studies a flow through a 2d turbine cascade, see [3, 4]. some authors use boundary conditions on γ2, modified in such a way that it enables one to estimate the kinetic energy of the fluid flowing to ω through γ2. examples of such modifications can be found in [2, 5, 7, 11, 12]. another modification, used in connection with the heat transfer, can be found in [8]. a different approach has been suggested in papers [13–15]. there the authors have considered the stationary and non-stationary problems and impose an additional condition on γ2, that enables one to estimate the kinetic energy of a possible reverse flow and obtain an a priori estimate of the energy type. however, the new additional condition implies that the solution cannot lie in the whole sobolev space w 1,2(ω), but only in a certain closed convex subset of this space. since one does not know in advance whether the solution of the problem (1)–(5) falls into this convex set, one must consider a variational inequality instead of the momentum equation. note that artificial boundary conditions on a part of the boundary are also being used, if one approximates a problem in an exterior domain d by a problem in a bounded domain d ∩ br(0) (for “large” r) and prescribes an artificial boundary condition on ∂br(0). (see e.g. [16] and [17].) 2. several boundary conditions of the “do nothing” type 2.1. three equivalent forms of the dynamic stress tensor in equation (1) since the difficulties, caused by the artificial boundary conditions on γ2, are of the same nature in stationary and non-stationary problems, here we consider, for simplicity, only the stationary problem. the dynamic stress tensor sd, in an incompressible newtonian fluid, equals 2νd, where d is the rate of the deformation tensor. (it coincides with the symmetrized gradient of velocity.) the term div sd, which appears in equation (1), can be expressed by any of these formulas: a) div sd = ν∆v, b) div sd = ν div [ ∇v + (∇v)t ] , c) div sd = −ν curl2v.   (7) 2.2. variational formulations of the initial–boundary value problem a variational formulation of the system (1), (2) with the boundary conditions (3), (4) formally follows from the classical formulation if one multiplies equation (1) by a “smooth” test function φ, such that div φ = 0, and integrates in ω. as v should satisfy the dirichlet boundary conditions (3) and (4) on γ0 and γ1, respectively, it is logical to assume that φ = 0 on γ0 ∪γ1. on the other hand, one imposes no boundary condition on φ on γ2. 
applying the integration by parts, using the forms a) – c) of the dynamic stress tensor, we successively obtain the equations a) ∫ ω [ v ·∇v ·φ + ν∇v : ∇φ ] dx = ∫ γ2 [−pn + ν∇v ·n] ·φ ds + ∫ ω f ·φ dx, b) ∫ ω [ v ·∇v ·φ + ν(∇v + (∇v)t ) : ∇φ dx = ∫ γ2 [ −pn + ν ( ∇v + (∇v)t ) ·n ] ·φ ds + ∫ ω f ·φ dx, c) ∫ ω [ v ·∇v ·φ + ν curl v · curl φ ] dx = ∫ γ2 [−pn−ν curl v×n] ·φ ds + ∫ ω f ·φ dx. however, the integrals on γ2 cannot be involved by the weak formulation because the integrands are generally not integrable if one stays in the usual level of weak solutions. this is why it is logical to neglect these integrals or to replace them just by ∫ γ2 g·φ ds, where g is an arbitrarily given function on γ2. then the variational forms of the system (1), (2) are a) ∫ ω [ v ·∇v ·φ + ν∇v : ∇φ ] dx = ∫ γ2 g ·φ ds + ∫ ω f ·φ dx, b) ∫ ω [ v ·∇v ·φ + ν ( ∇v + (∇v)t ) : ∇φ ] dx = ∫ γ2 g ·φ ds + ∫ ω f ·φ dx, c) ∫ ω [ v ·∇v ·φ + ν curl v · curl φ ] dx = ∫ γ2 g ·φ ds + ∫ ω f ·φ dx.   (8) the equations should satisfy all functions φ with the aforementioned properties and v should also satisfy the boundary conditions (3) and (4). if a weak solution v exists and is sufficiently smooth then, by a reverse integration by parts, one can prove that there exists an appropriate associated pressure p and show that v and p satisfy the boundary conditions a) −pn + ν∇v ·n = g, b) −pn + ν [∇v + (∇v)t ] ·n = g, c) −pn−ν curl v×n = g,   (9) 90 vol. 61 special issue/2021 modeling of flows through a channel respectively, on γ2. it is well known that the pressure p in equation (1) is not unique, because p + c, where c is an arbitrary additional constant (or a function of time), also satisfies the same equation. however, the same consideration is not possible in the boundary conditions a) – c) in (9). here, it only follows from the variational formulation that , one can choose only one pressure p from all associated pressures, that satisfies the boundary condition. 2.3. the momentum equation with the bernoulli pressure none of the conditions a) – c) in (9) excludes a possible reverse flow on γ2 that could hypothetically bring an arbitrarily large amount of the kinetic energy back to ω. one usually derives an a priori energy inequality so that the momentum equation (1) is multiplied by v and integrated over ω. then the flow of the kinetic energy through γ2 comes from the integral of (v·∇v)·v . applying the integration by parts, we can express this integral as follows:∫ ω (v ·∇v) ·v dx = ∫ ∂ω (v ·n) 12 |v| 2 ds = ∫ γ1 (v∗ ·n) 12 |v ∗|2 ds + ∫ γ2 (v ·n) 12 |v| 2 ds. (10) the last integral can hypothetically take an arbitrarily large value if v ·n < 0 on the part of γ2, i.e. in the case of a reverse flow. the situation is different if the nonlinear term in equation (1) is considered in the form curl v ×v + ∇12 |v| 2 and one involves ∇12 |v| 2 and p to the so called bernoulli pressure q := p + 12 |v| 2. then, instead of (9), one obtains from the integral equations (8) the boundary conditions a) −qn + ν∇v ·n = g, b) −qn + ν [∇v + (∇v)t ] ·n = g, c) −qn−ν curl v×n = g.   (11) now, the nonlinear term in equation (1) is just curl v×v (instead of v ·∇v), which yields the term∫ ω curl v × v · φ dx in the variational formulation. if one formally multiplies equation (1) by the velocity v then the nonlinear term vanishes, because (curlv × v) · v = 0. 
this has the following consequences: 1) the nonlinear term curl v×v generates no backward inflow of kinetic energy to ω through the surface γ2, 2) the usual energy–type inequality can be derived, 3) the existence of a weak solution on an arbitrarily long time interval can be proven by similar methods, as if one considers the homogeneous or inhomogeneous dirichlet boundary condition on the whole boundary ∂ω (see e.g. [18]). 2.4. which artificial boundary condition is the best? there arises a natural question: which of the formulated boundary conditions a) – c) in (9) and a) – c) in (11) on γ2 is most appropriate? the advantage of all the conditions in (11) is that, in contrast to the conditions from (9), they enable one to prove the existence of a weak solution. on the other hand, the condition a) from (9) is fulfilled (with g = 0) by the poiseuille flow in a circular pipe. this is mainly why this condition is usually considered as the best from a physical point of view. on the other hand, in any of the formulated boundary conditions, one can always calculate an appropriate function g so that the poiseuille flow satisfies the considered condition with this concrete function g. thus, the suitability of the chosen boundary condition probably depends only on a particular situation. moreover, in our opinion, it would be very useful to perform numerical calculations with various boundary conditions so that one could compare the results among themselves and also with physical measurements. 3. the navier–stokes inequality – the non-steady case in this section, we deal with the navier–stokes problem (1)–(4) in ω with the boundary condition (5) on γ2. due to the reasons, explained in sections 1 and 2, we study the problem in the form of a variational inequality. 3.1. notation (i) we use the usual notation of the norms in the lebesgue spaces: ‖ .‖r is the norm in lr(ω) or in lr(ω) (the space of vector functions) or in lr(ω)3×3 (the space of tensorial functions). by analogy, ‖ .‖k,r denotes the norm in the sobolev space wk,r(ω) or wk,r(ω) or wk,r(ω)3×3. if the norm is related to another set than ω then we denote it e.g. by ‖ .‖r; γ2 , etc. (ii) we assume that v∗ is a given function on γ1 × (0,t), such that 12 ∫ γ1 [v∗ · (−n)] |v∗|2 ds (the inflow of the kinetic energy to ω through γ1) is bounded, as a function of t, for t ∈ (0,t). (iii) furthermore, we assume that v∗ can be extended to ω×(0,t ) so that the extended function (which is denoted by v∗ext) satisfies the boundary condition (3) on γ0 × (0,t) and a) v∗ext ∈ l∞ ( 0,t; w 1,2(ω) ) and ∂tv∗ext ∈ l2 ( 0,t; w−1,2(ω) ) , b) v∗ext is divergence–free. (here, we denote by w−1,2(ω) the dual space to w 1,2(ω). the duality pairing between w−1,2(ω) and w 1,2(ω) is denoted by 〈 . , .〉.) due to [19, theorem i.3.1] function v∗ext belongs to c0 ( [0,t]; l2(ω) ) . (iv) we denote by v the linear space of all divergence– free vector functions φ ∈ w 1,2(ω), such that φ = 0 on γ0∪γ1. then v∗ext(t)+v (for a.a. t ∈ (0,t)) 91 stanislav kračmar, jiří neustupa acta polytechnica is the set of all functions φ from w 1,2(ω), such that div φ = 0, φ = 0 on γ0 and φ = v∗ext(t) on γ1. (v) let �1 > 0 and ζ ∈ l∞(0,t ) satisfy the inequality∥∥(v∗ext(t) ·n)− |v∗ext(t)|2∥∥1; γ2 + �1 < ζ(t) (12) for a.a. t ∈ (0,t). (the subscript “−” denotes the negative part.) such number �1 and function ζ exist, because the trace of v∗ on γ2×(0,t) is in l∞ ( 0,t; w 1/2,2(γ2) ) and this space is continuously imbedded to l∞ ( 0,t ; l4(γ2) ) . 
we denote by kt the set of all functions φ ∈ v∗ext(t) + v such that∥∥(φ ·n)− |φ|2‖1; γ2 ≤ ζ(t) for a.a. t ∈ (0,t), (13) by kct the convex hull of kt and define k c t to be the closure of kct. set k c t is the so called closed convex hull of kt. (see [20] for more properties of the convex hull and the closed convex hull. note that kct can also be defined as the intersection of all closed convex sets in v∗ext(t) + v , containing kt.) we assume that the number �1, and the functions v∗ext and ζ are fixed throughout the whole paper. using the presence of �1 > 0 in inequality (12), one can also show that there exists �2 > 0 such that kct contains the �2–neighborhood of v∗ext(t), independently of t. (vi) denote by w (0,t) the space { w ∈ l2(0,t; w 1,2(ω)); ∂tw ∈ l2(0,t ; w−1,2(ω)) } with the norm |||w||| := (∫ t 0 ‖w‖21,2 dt+ ∫ t 0 ‖∂tw‖2−1,2 dt )1/2 . using [19, theorem i.3.1], one can show that w (0,t) ⊂ c0 ( [0,t]; l2(ω) ) . (vii) put k c(0,t) := { w ∈ w (0,t); w(t) ∈ kct for a.a. t ∈ (0,t) } . 3.2. a formal derivation of the variational inequality suppose that v, p is a sufficiently smooth solution of the problem (1)–(6) and w is a sufficiently smooth function from [0,t] such that w(t) ∈ kct for a.a. t ∈ [0,t]. using the form a) of the divergence of the dynamic stress tensor in (7), multiplying equation (1) by the difference w−v, integrating in ω×(0,t), applying the integration by parts and using the equality w −v = 0 on γ0 ∪ γ1 × (0,t) and the boundary condition (5), we get∫ t 0 ∫ ω [∂tv + v ·∇v] · (w−v) dx dt + ∫ t 0 ∫ ω ν∇v : ∇(w−v) dx dt = ∫ t 0 ∫ ω f · (w−v) dx dt + ∫ t 0 ∫ γ2 g · (w−v) ds dt. (14) the term, which contains the derivative ∂tv, satisfies∫ t 0 ∫ ω ∂tv · (w−v) dx dt = ∫ t 0 ∫ ω ∂t(v−w) · (w−v) dx dt + ∫ t 0 ∫ ω ∂tw · (w−v) dx dt = 1 2 ‖w(0) −v(0)‖22 − 1 2 ‖w(t) −v(t)‖22 + ∫ t 0 ∫ ω ∂tw · (w−v) dx dt ≤ 1 2 ‖w(0) −v0‖22 + ∫ t 0 ∫ ω ∂tw · (w−v) dx dt. (15) let w ∈ k c(0,t) further on. since∫ ω ∂tw · (w−v) dx = 〈∂tw,w−v〉,∫ ω f · (w−v) dx = 〈f,w−v〉, (15) and (14) yield∫ t 0 〈 ∂tw,w−v 〉 dt + ∫ t 0 ∫ ω v ·∇v · (w−v) dx dt + ∫ t 0 ∫ ω ν∇v : ∇(w−v) dx dt ≥ ∫ t 0 〈f,w−v〉 dt + ∫ t 0 ∫ γ2 g · (w−v) ds dt − 1 2 ‖w(0) −v0‖22. (16) since (16) is an inequality, we have the possibility to choose another condition and to impose it on the solution v: we require that the inclusion v(t) ∈ kct holds for a.a. t ∈ (0,t). 3.3. definition of the initial–boundary value problem (p) let v∗ext be the extension of function v∗ with the properties (i) and (ii) from paragraph 3.2. let v0 ∈ l2(ω), div v0 = 0 in ω in the sense of distributions, v0 ·n = 0 on γ0 and v0 ·n = v∗(0) ·n on γ1 in the sense of traces. let f ∈ l2(0,t ; w−1,2(ω)) and g ∈ l2(0,t; l4/3(γ2)). one looks for v ∈ l∞(0,t ; l2(ω))∩l2(0,t ; w 1,2(ω)) such that v(t) ∈ kct for a.a. t ∈ (0,t) and v satisfies inequality (16) for all w ∈ k c(0,t). it is well known that for v0 ∈ l2(ω), such that div v0 ∈ l2(ω) (which it definitely satisfies if v0 is 92 vol. 61 special issue/2021 modeling of flows through a channel divergence–free in the sense of distributions), the scalar product v · n makes sense on ∂ω, as an element of w−1/2,2(∂ω). (this follows e.g. from [21, theorem iii.2.2].) thus, the conditions v0 ·n = 0 on γ0 and v0 ·n = v∗(0) ·n are assumed to hold on γ0 and γ1 as equalities in w−1/2,2(γ0) and w−1/2,2(γ1), respectively. the next theorem provides the information on the existence of a solution of a problem (p) on an arbitrarily long time interval (0,t). theorem 1. let v0, v∗ext, f and g be the functions with the aforementioned properties. 
then problem (p) is solvable. the solution can be expressed in the form v = v∗ext + u, where u ∈ l∞(0,t; l 2(ω)) ∩ l2(0,t; v ) satisfies the inequality ‖u(t)‖22 + ν ∫ t 0 ‖∇u(s)‖22 ds ≤ ‖u(0)‖22 − ∫ t∗ 0 ∫ γ1 (v∗ ·n) |v∗|2 ds dt + c1 ∫ t 0 ‖u(s)‖22 ds + ∫ t 0 [ c2 ‖f(s)‖2−1,2 + c3 ‖v∗ext(s)‖ 2 1,2 + c4 ‖∂tv ∗ ext(s)‖ 2 −1,2 + c5 ‖g(s)‖24/3; γ2 ] ds (17) for all t in(0,t), where all the constants c1–c5 are independent of v0, v∗, v∗ext, f, g and u. note that there is a minus sign in front of the first integral on the right hand side, because n is the outer normal vector and therefore −12 ∫ t∗ 0 ∫ γ1 (v∗ ·n) |v∗|2 ds dt represents the inflow of the kinetic energy to ω through γ1 in the time interval (0, t∗). an analogous theorem, with a different convex set kct, has been proven in [15]. 3.4. the principle of the proof and a priori estimates the complete way theorem 1 can be proven consists of these main steps: 1) construction of appropriate approximations vn (for n ∈ n) of solution v, 2) derivation of a series of estimates of the approximations, 3) derivation of various types of convergence of a subsequence of {vn} in various spaces, 4) verification that the limit is the solution v. among others, one also needs the strong convergence in l2(0,t; w 1,2(ω)), which follows from an estimate of a fractional derivative with respect to t of vn and from the lions–aubin lemma, see e.g. [22]. since the complete estimates of the approximations are laborious, technically complicated and necessarily influenced by the technique, used just for the construction of the approximations, we show below on an a priori level how one can directly obtain from the variational inequality the estimates of u in l∞(0,t ; l2(ω)) and in l2(0,t ; w 1,2(ω)). the advantage of a priori estimates is that they enable one to abstract from the whole machinery, which is necessary in the proof of existence of the approximations. on the other hand, we assume, just inside the procedure, that u is smooth. (this formal assumption is naturally satisfied on the level of approximations.) thus, let t∗ ∈ (0,t), α ∈ (0, 1) and δ > 0 be so small that t∗+δ < t . define function η of one variable t by the formulas η(t) =   α for 0 < t ≤ t∗, α + (1 −α) δ (t− t∗) for t∗ < t < t∗ + δ, 1 for t∗ + δ ≤ t < t. (function η is continuous on (0,t), constant on (0, t∗] and on [t∗ + δ,t) and linear on [t∗, t∗ + δ].) solution v can be expressed in the form v = v∗ext + u, where u ∈ v . put w := v∗ext + ηu = (1 −η)v∗ext + ηv. (as set k c(0,t) is convex and 0 < η ≤ 1, w belongs to k c(0,t).) then w−v = (η − 1)u (which equals 0 on the interval [t∗ + δ,t)). 
substituting this to the first term in (16), we obtain ∫ t 0 〈 ∂tw,w−v 〉 dt = ∫ t∗ 0 〈 ∂t(v∗ext + αu), (α− 1)u 〉 dt + ∫ t∗+δ t∗ 〈 ∂t(v∗ext + ηu), (η − 1)u 〉 dt = (α− 1) ∫ t∗ 0 〈 ∂tv ∗ ext, u 〉 dt + α(α− 1) ∫ t∗ 0 〈 ∂tu, u 〉 dt + ∫ t∗+δ t∗ 〈 ∂tv ∗ ext, (η − 1)u 〉 dt + ∫ t∗+δ t∗ 〈 ∂t(ηu), ηu 〉 dt− ∫ t∗+δ t∗ 〈 η̇u, u 〉 dt − ∫ t∗+δ t∗ 〈 η ∂tu, u 〉 dt = ∫ t∗+δ 0 (η − 1) 〈 ∂tv ∗ ext, u 〉 dt + α(α− 1) 2 ( ‖u(t∗)‖22 −‖u(0)‖ 2 2 ) + 1 2 ( ‖η(t∗ + δ)u(t∗ + δ)‖22 −‖η(t ∗)u(t∗)‖22 ) − 1 −α δ ∫ t∗+δ t∗ ‖u‖22 dt − 1 2 ∫ t∗+δ t∗ η d dt ‖u‖22 dt = ∫ t∗+δ 0 (η − 1) 〈 ∂tv ∗ ext, u 〉 dt + α(α− 1) 2 ( ‖u(t∗)‖22 −‖u(0)‖ 2 2 ) + 1 2 ( ‖η(t∗ + δ)u(t∗ + δ)‖22 −‖η(t ∗)u(t∗)‖22 ) 93 stanislav kračmar, jiří neustupa acta polytechnica − 1 −α δ ∫ t∗+δ t∗ ‖u‖22 dt − 1 2 ( η(t∗ + δ)‖u(t∗ + δ)‖22 −η(t ∗)‖u(t∗)‖22 ) + 1 2 ∫ t∗+δ t∗ η̇‖u‖22 dt = ∫ t∗+δ 0 (η − 1) 〈 ∂tv ∗ ext, u 〉 dt + α(α− 1) 2 ( ‖u(t∗)‖22 −‖u(0)‖ 2 2 ) + 1 2 ( ‖η(t∗ + δ)u(t∗ + δ)‖22 −‖η(t ∗)u(t∗)‖22 ) − 1 −α δ ∫ t∗+δ t∗ ‖u‖22 dt − 1 2 ( η(t∗ + δ)‖u(t∗ + δ)‖22 −η(t ∗)‖u(t∗)‖22 ) + 1 −α 2δ ∫ t∗+δ t∗ ‖u‖22 dt. considering δ → 0+, we get∫ t 0 〈 ∂tw,w−v 〉 dt = (α− 1) ∫ t∗ 0 〈∂tv∗ext, u〉 dt + α2 − 1 2 ‖u(t∗)‖22 − α(α− 1) 2 ‖u(0‖22. substituting this to (16), using v = v∗ext + u and w = v∗ext + ηu in all other terms in (16), considering δ → 0+, dividing the whole inequality by α−1 (which is negative), and considering finally α → 0+, we obtain∫ t∗ 0 〈∂tv∗ext,u〉 dt + 1 2 ‖u(t∗)‖22 + ∫ t∗ 0 ∫ ω (v∗ext + u) ·∇(v ∗ ext + u) ·u dx dt + ∫ t∗ 0 ∫ ω ν∇(v∗ext + u) : ∇u dx ≤ ∫ t∗ 0 〈f,u〉 dt + ∫ t∗ 0 ∫ γ2 g ·u ds dt + 1 2 ‖u(0)‖22, 1 2 ‖u(t∗)‖22 + ∫ t∗ 0 ν‖∇u‖22 dt + ∫ t∗ 0 ∫ ω (v∗ext + u) ·∇(v ∗ ext + u) · (v ∗ ext + u) dx dt + ∫ t∗ 0 〈∂tv∗ext,u〉 dt ≤ ∫ t∗ 0 ∫ ω (v∗ext + u) ·∇(v ∗ ext + u) ·v ∗ ext dx dt + ∫ t∗ 0 ∫ ω ν∇v∗ext : ∇u dx dt + ∫ t∗ 0 〈f,u〉 dt + ∫ t∗ 0 ∫ γ2 g ·u ds dt + 1 2 ‖u(0)‖22, 1 2 ‖u(t∗)‖22 + ∫ t∗ 0 ν‖∇u‖22 dt + 1 2 ∫ t∗ 0 ∫ γ1 (v∗ ·n) |v∗|2 ds dt + 1 2 ∫ t∗ 0 ∫ γ2 [(v∗ext + u) ·n] |v ∗ ext + u| 2 ds dt + ∫ t∗ 0 〈∂tv∗ext,u〉 dt ≤ ∫ t∗ 0 ∫ ω ( v∗ext ·∇v ∗ ext ·v ∗ ext + v ∗ ext ·∇u ·v ∗ ext + u ·∇v∗ext ·v ∗ ext + u ·∇u ·v ∗ ext ) dx dt + ∫ t∗ 0 ∫ ω ν∇v∗ext : ∇u dx dt + ∫ t∗ 0 〈f,u〉 dt + ∫ t∗ 0 ∫ γ2 g ·u ds dt + 1 2 ‖u(0)‖22. since ∣∣∣∣ ∫ t∗ 0 ∫ ω u ·∇u ·v∗ext dx dt ∣∣∣∣ ≤ ∫ t∗ 0 ‖u‖3 ‖∇u‖2 ‖v∗ext‖6 dt ≤ c ∫ t∗ 0 ‖u‖3 ‖∇u‖2 ‖v∗ext‖1,2 dt ≤ c ∫ t∗ 0 ‖u‖3 ‖∇u‖2 dt ≤ c ∫ t∗ 0 ‖u‖1/22 ‖u‖ 1/2 6 ‖∇u‖2 dt ≤ c ∫ t∗ 0 ‖u‖1/22 ‖∇u‖ 3/2 2 dt ≤ ∫ t∗ 0 ( ξ‖∇u‖22 + c(ξ)‖u‖ 2) dt, where c is a generic constant, we get 1 2 ‖u(t∗)‖22 + (ν − ξ) ∫ t∗ 0 ‖∇u‖22 dt + 1 2 ∫ t∗ 0 ∫ γ1 (v∗ ·n) |v∗|2 ds dt + ∫ t 0 〈∂tv∗ext,u〉 dt ≤ 1 2 ∫ t∗ 0 ∫ γ2 [(v∗ext + u) ·n]− |v ∗ ext + u| 2 ds dt + ∫ t∗ 0 ∫ ω ( v∗ext ·∇v ∗ ext ·v ∗ ext + v ∗ ext ·∇u ·v ∗ ext + u ·∇v∗ext ·v ∗ ext ) dx dt + c(ξ) ∫ t∗ 0 ‖u‖2 dt + ∫ t∗ 0 ∫ ω ν∇v∗ext : ∇u dx dt + ∫ t∗ 0 〈f,u〉 dt + ∫ t 0 ∫ γ2 g ·u ds dt + 1 2 ‖u(0)‖22. (18) (note that ξ > 0 can be chosen arbitrarily small.) the first integral on the right hand side satisfies the 94 vol. 61 special issue/2021 modeling of flows through a channel inequality∫ γ2 [(v∗ext + u) ·n]− |v ∗ ext + u| 2 ds ≤ (∫ γ2 [(v∗ext + u) ·n] 3 − ds )1 3 · (∫ γ2 |v∗ext + u| 3 ds )2 3 . (19) since v∗ext + u ∈ k c t, there exists a sequence {uk} in kct, such that uk → u (for k → ∞) in the norm of w 1,2(ω). then we also have(∫ γ2 [(v∗ext + u) ·n] 3 − ds )1 3 = lim k→∞ (∫ γ2 [(v∗ext + uk) ·n] 3 − ds )1 3 . to each function uk, there exist finite families {θki}nki=1 and {uki}nki=1 in [0, 1] and kt, respectively, such that nk∑ i=1 θki = 1 and uk = nk∑ i=1 θkiuki. 
then, applying minkowski’s inequality, we get(∫ γ2 [(v∗ext + uk) ·n] 3 − ds )1 3 = (∫ γ2 [nk∑ i=1 θki(v∗ext + uki) ·n ]3 − ds )1 3 ≤ (∫ γ2 nk∑ i=1 [θki(v∗ext + uki) ·n] 3 − ds )1 3 ≤ nk∑ i=1 (∫ γ2 [θki(v∗ext + uki) ·n] 3 − ds )1 3 = nk∑ i=1 θki (∫ γ2 [(v∗ext + uki) ·n] 3 − ds )1 3 ≤ nk∑ i=1 θkiζ = ζ. hence (∫ γ2 [(v∗ext + u) ·n] 3 − ds )1 3 ≤ ζ, (20) too. note that this is a crucial part, where we use the fact that v∗ext +u lies in k c t. the estimates, following from this information, are not available if one deals with the navier–stokes equation instead of the navier– stokes variational inequality (16). as there exists a continuous operator of traces from the sobolev– slobodetski space w 5/6,2(ω) to l3(γ2), which can be deduced e.g. by means of [23], we have(∫ γ2 [(v∗ext + u) ·n] 3 − ds )1 3 (∫ γ2 |v∗ext + u| 3 ds )2 3 ≤ ζ‖v∗ext + u‖ 2 3; γ2 ≤ c‖v∗ext + u‖ 2 5/6,2 ≤ cζ‖v∗ext + u‖ 1 3 2 ‖v ∗ ext + u‖ 5 3 1,2 ≤ cζ‖v∗ext + u‖ 1 3 2 ( ‖v∗ext‖ 5 3 1,2 + ‖∇u‖ 5 3 2 ) ≤ ξ‖∇u‖22 + c(ξ) ζ 6 ‖v∗ext + u‖ 2 2 + ‖v ∗ ext‖ 2 1,2. recall that ζ ∈ l∞(0,t). substituting to (18), and using also the estimates∫ t∗ 0 ∫ ω ν∇v∗ext : ∇u dx dt ≤ ∫ t∗ 0 ξ‖∇u‖22 dt + c(ξ) ν 2 ∫ t∗ 0 ‖∇v∗ext‖ 2 2 dt = ∫ t∗ 0 ξ‖∇u‖22 dt + c(ξ,v ∗ ext),∣∣∣∣ ∫ t∗ 0 〈∂tv∗ext,u〉 dt ∣∣∣∣ ≤ ∫ t∗ 0 ‖∂tv∗ext‖−1,2 ‖u‖1,2 dt ≤ c ∫ t∗ 0 ‖∂tv∗ext‖−1,2 ‖∇u‖2 dt ≤ ∫ t∗ 0 ξ‖∇u‖22 dt + c(ξ) ∫ t∗ 0 ‖∂tv∗ext‖ 2 −1,2 dt = ∫ t∗ 0 ξ‖∇u‖22 dt + c(ξ,v ∗ ext),∫ t∗ 0 ∫ ω v∗ext ·∇u ·v ∗ ext dx dt ≤ ∫ t∗ 0 ‖∇u‖2 ‖v∗ext‖ 2 4 dt ≤ ∫ t∗ 0 ξ‖∇u‖22 dt + c(ξ) ∫ t∗ 0 ‖v∗ext‖ 4 4 dt = ∫ t∗ 0 ξ‖∇u‖22 dt + c(ξ,v ∗ ext),∫ t∗ 0 ∫ ω ( v∗ext ·∇v ∗ ext ·v ∗ ext + u ·∇v ∗ ext ·v ∗ ext ) dx dt ≤ c(v∗ext) + ∫ t∗ 0 ∫ ω ‖u‖4 ‖∇v∗ext‖2 ‖v ∗ ext‖4 dt ≤ ∫ t∗ 0 ξ‖∇u‖22 dt + c(ξ,v ∗ ext),∫ t∗ 0 〈f,u〉 dt ≤ ∫ t∗ 0 ‖f‖−1,2 ‖u‖1,2 dt ≤ ∫ t∗ 0 ξ‖∇u‖22 dt + c(ξ,f),∫ t∗ 0 ∫ γ2 g ·u ds dt ≤ ∫ t∗ 0 ‖g‖4/3; γ2 ‖u‖4; γ2 dt ≤ ∫ t∗ 0 ‖g‖4/3; γ2 ‖u‖1,2 dt ≤ c ∫ t∗ 0 ‖g‖4/3; γ2 ‖∇u‖2 dt ≤ ∫ t∗ 0 ξ‖∇u‖22 + c(ξ,g), 95 stanislav kračmar, jiří neustupa acta polytechnica where c is independent of t∗, we obtain 1 2 ‖u(t∗)‖22 + (ν − 9ξ) ∫ t∗ 0 ‖∇u‖22 dt + 1 2 ∫ t∗ 0 ∫ γ1 (v∗ ·n) |v∗|2 ds dt ≤ c(ξ) ∫ t∗ 0 ‖u‖2 dt + c(ξ,v∗ext,f,g)+ + c(ξ) ∫ t∗ 0 ζ6(t) ( ‖v∗ext + u‖ 2 2 + ‖v ∗ ext‖ 2 1,2 ) dt + 1 2 ‖u(0)‖22. (21) choose ξ so small that ξ < 118ν. evaluating precisely the right hand side (which concerns especially c(ξ,v∗ext,f,g)), we can rewrite the inequality in the form (17). omitting at first the second term on the left hand side (i.e. the integral of ‖∇u‖22) and applying the generalized gronwall inequality, we obtain the estimate of u in l∞(0,t ; l2(ω)) in terms of the norms of ζ(t), v∗ext, f and g in appropriate spaces, which are all finite. then, omitting the first term on the left hand side in (21) and considering t∗ → t−, we obtain the estimate of the norm of u in l2(0,t; w 1,2(ω)). 3.5. remark by analogy with [15], one can show that if v is a solution of a problem (p) then there exists an associated pressure p as a distribution in ω × (0,t). the pair (v,p) satisfies the equations (1), (2) in the sense of distributions in ω × (0,t). if, moreover, ∂tv ∈ l1(0,t; w−1,2(ω)) and one prescribes p ∈ l1(0,t) then the pressure p can be chosen so that∫ ω p(t) dx = p(t) for a.a. t ∈ (0,t). suppose now that the solution v has these a posteriori properties: v ∈ l2(0,t; w 2,2(ω)), ∂tv, v · ∇v, f ∈ l2(0,t; l2(ω)) and there exists �3 > 0 such that all φ ∈ v(t) + v , whose distance from v(t) in the w 1,2–norm is less than �3, belong to kct for a.a. t ∈ (0,t). 
(the last condition means that v(t) lies “uniformly” in the interior of kct.) then one can prove that the distribution p is regular and can be represented by a function from l2(0,t ; w 1,2(ω)). moreover, one can also find a function ϑ ∈ l2(0,t) so that ν ∂v ∂n − (p + ϑ)n = g (22) holds a.e. in γ2 ×(0,t ). this shows that the concrete pressure, obtained from the variational inequality and satisfying the outflow boundary condition on γ2 × (0,t), is unique in the sense that it cannot be changed by adding an arbitrary constant (or a function of t). 4. the navier–stokes inequality – the steady case in both the non-steady and steady cases, the solution’s proof of existence relies on the construction of appropriate approximations, the estimations of the approximations that in some sense copy a priori estimates, the deduction of various types of convergence of a sequence (or a subsequence) of approximations to some limit function, and the demonstration that the limit is a solution whose existence one wants to prove. as we have already mentioned in subsections 1.2 and 3.4, the crucial part is the derivation of a priori estimates. in order to obtain appropriate estimates, in the non–steady case, one can apply gronwall–type inequalities in order to obtain a uniform (in time) estimate of the l2–norm of the solution and the estimate of ∫t 0 ‖∇u‖ 2 2 dt (see subsection 3.4). in the steady case, the estimates substantially depend on the properties of the extended function v∗ext, introduced in subsection 3.1. moreover, as follows from estimate (25), we are able to prove the existence of the steady solution only if ζ (which is now just a positive number) is “sufficiently small” in comparison to ν. (recall the ζ estimates possible reverse flows on the outflow part γ2 of the boundary, see (13).) the extended function v∗ext should now be naturally time–independent, and should be constructed so that the integral ∫ ω u ·∇u ·v ∗ ext dx is “sufficiently small” in comparison with ‖∇u‖22 for all u ∈ v . the reasons are the same as in the case of the steady navier–stokes problem with inhomogeneous dirichlet–type boundary condition on the whole boundary of ω, see e.g. [21, chapter ix] for the detailed explanation. it follows from the paper [14] that this condition of “sufficient smallness” of the aforementioned integral is in fact not an obstacle. concretely, it is shown in [14] that if v∗ satisfies the condition (?) v∗ can be extended from γ1 onto the whole boundary ∂ω so that the extended function belongs to w 1/2,2(∂ω) is equal to zero on γ0 and its flux through ∂ω is zero, then the extension v∗ext can be constructed so that given δ > 0, v∗ext ∈ w 1,2(ω), v∗ext is divergence–free and∫ ω u1 ·∇u2 ·v∗ext dx ≤ δ‖∇u1‖2 ‖∇u2‖2 (23) for all u1, u2 ∈ v . this is the analogue of the so called leray–hopf inequality, see [21]. let us show how the a priori estimate looks. obviously, in the steady case, ζ is just a number and set kct is independent of t. hence we further on use the notation kc instead of kct. the “steady state version” of inequality (16) is∫ ω v ·∇v · (w−v) dx + ∫ ω ν∇v ·∇(w−v) dx ≥ 〈f,w−v〉 + ∫ γ2 g · (w−v) ds. (24) the solution v lies in kc and the inequality is required to be satisfied for all w ∈ kc. writing v in the form v∗ext + u, where u ∈ v , using inequality (24) with 96 vol. 
$$\nu\,\|\nabla\mathbf{u}\|_2^2+\int_\Omega(\mathbf{v}^*_{\mathrm{ext}}+\mathbf{u})\cdot\nabla(\mathbf{v}^*_{\mathrm{ext}}+\mathbf{u})\cdot(\mathbf{v}^*_{\mathrm{ext}}+\mathbf{u})\;\mathrm{d}\mathbf{x}
\le\int_\Omega\bigl[\nu\,\nabla\mathbf{v}^*_{\mathrm{ext}}:\nabla\mathbf{u}+(\mathbf{v}^*_{\mathrm{ext}}+\mathbf{u})\cdot\nabla(\mathbf{v}^*_{\mathrm{ext}}+\mathbf{u})\cdot\mathbf{v}^*_{\mathrm{ext}}\bigr]\,\mathrm{d}\mathbf{x}+\langle\mathbf{f},\mathbf{u}\rangle+\int_{\Gamma_2}\mathbf{g}\cdot\mathbf{u}\;\mathrm{d}S,$$
$$\nu\,\|\nabla\mathbf{u}\|_2^2+\frac12\int_{\Gamma_1}(\mathbf{v}^*\cdot\mathbf{n})\,|\mathbf{v}^*|^2\,\mathrm{d}S+\frac12\int_{\Gamma_2}\bigl[(\mathbf{v}^*+\mathbf{u})\cdot\mathbf{n}\bigr]\,|\mathbf{v}^*+\mathbf{u}|^2\,\mathrm{d}S
\le\int_\Omega\bigl[\nu\,\nabla\mathbf{v}^*_{\mathrm{ext}}:\nabla\mathbf{u}+\mathbf{v}^*_{\mathrm{ext}}\cdot\nabla\mathbf{v}^*_{\mathrm{ext}}\cdot\mathbf{v}^*_{\mathrm{ext}}+\mathbf{v}^*_{\mathrm{ext}}\cdot\nabla\mathbf{u}\cdot\mathbf{v}^*_{\mathrm{ext}}+\mathbf{u}\cdot\nabla\mathbf{v}^*_{\mathrm{ext}}\cdot\mathbf{v}^*_{\mathrm{ext}}+\mathbf{u}\cdot\nabla\mathbf{u}\cdot\mathbf{v}^*_{\mathrm{ext}}\bigr]\,\mathrm{d}\mathbf{x}+\langle\mathbf{f},\mathbf{u}\rangle+\int_{\Gamma_2}\mathbf{g}\cdot\mathbf{u}\;\mathrm{d}S,$$
$$\nu\,\|\nabla\mathbf{u}\|_2^2
\le-\frac12\int_{\Gamma_1}(\mathbf{v}^*\cdot\mathbf{n})\,|\mathbf{v}^*|^2\,\mathrm{d}S+\frac12\int_{\Gamma_2}\bigl[(\mathbf{v}^*+\mathbf{u})\cdot\mathbf{n}\bigr]_-\,|\mathbf{v}^*+\mathbf{u}|^2\,\mathrm{d}S
+\int_\Omega\bigl[\nu\,\nabla\mathbf{v}^*_{\mathrm{ext}}:\nabla\mathbf{u}+\mathbf{v}^*_{\mathrm{ext}}\cdot\nabla\mathbf{v}^*_{\mathrm{ext}}\cdot\mathbf{v}^*_{\mathrm{ext}}+\mathbf{v}^*_{\mathrm{ext}}\cdot\nabla\mathbf{u}\cdot\mathbf{v}^*_{\mathrm{ext}}+\mathbf{u}\cdot\nabla\mathbf{v}^*_{\mathrm{ext}}\cdot\mathbf{v}^*_{\mathrm{ext}}\bigr]\,\mathrm{d}\mathbf{x}
+\delta\,\|\nabla\mathbf{u}\|_2^2+\langle\mathbf{f},\mathbf{u}\rangle+\int_{\Gamma_2}\mathbf{g}\cdot\mathbf{u}\;\mathrm{d}S$$
$$\le-\frac12\int_{\Gamma_1}(\mathbf{v}^*\cdot\mathbf{n})\,|\mathbf{v}^*|^2\,\mathrm{d}S+\frac{\zeta}{2}\,\|\mathbf{v}^*+\mathbf{u}\|^2_{3;\,\Gamma_2}+\int_\Omega[\,\cdots\,]\,\mathrm{d}\mathbf{x}+\delta\,\|\nabla\mathbf{u}\|_2^2+\langle\mathbf{f},\mathbf{u}\rangle+\int_{\Gamma_2}\mathbf{g}\cdot\mathbf{u}\;\mathrm{d}S$$
$$\le-\frac12\int_{\Gamma_1}(\mathbf{v}^*\cdot\mathbf{n})\,|\mathbf{v}^*|^2\,\mathrm{d}S+\frac{\zeta}{2}\,c_6\,\|\mathbf{v}^*_{\mathrm{ext}}+\mathbf{u}\|^2_{1,2}+\int_\Omega[\,\cdots\,]\,\mathrm{d}\mathbf{x}+\delta\,\|\nabla\mathbf{u}\|_2^2+\langle\mathbf{f},\mathbf{u}\rangle+\int_{\Gamma_2}\mathbf{g}\cdot\mathbf{u}\;\mathrm{d}S,$$
where the bracket [⋯] denotes the same four terms as in the preceding line and c₆ = c₆(Ω). Writing only the terms with second powers of u, which are decisive for the estimates, and absorbing all other terms into a generic constant c, we obtain
$$\nu\,\|\nabla\mathbf{u}\|_2^2\le\delta\,\|\nabla\mathbf{u}\|_2^2+c_7\,\zeta\,\|\nabla\mathbf{u}\|_2^2+c,\qquad(25)$$
where c₇ = c₇(Ω). As δ > 0 can be chosen arbitrarily small, these inequalities yield an a priori estimate of ‖∇u‖₂ in terms of v*, f and g, provided that ζ > 0 is so small that c₇ζ < ν. Obviously, in this case one also obtains an a priori estimate of ‖v‖₁,₂ ≡ ‖v*_ext + u‖₁,₂.
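For completeness, the absorption argument behind the last statement can be written out explicitly; nothing beyond (25) itself is used here.

```latex
% Move the gradient terms of (25) to the left-hand side:
\[
  \bigl(\nu - \delta - c_7\,\zeta\bigr)\,\|\nabla\mathbf{u}\|_2^2 \;\le\; c .
\]
% If c_7\,\zeta < \nu, one may fix, e.g., \delta := (\nu - c_7\zeta)/2 > 0, so that
\[
  \|\nabla\mathbf{u}\|_2^2 \;\le\; \frac{2\,c}{\nu - c_7\,\zeta}\,,
\]
% an a priori bound on \|\nabla u\|_2, and hence on \|v\|_{1,2}.
```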
Under the aforementioned condition on ζ, one can prove the existence of a weak solution v ∈ K_c of the variational inequality (24) by applying the procedure sketched at the beginning of this section. (See also [14] for the construction of appropriate approximations and the detailed derivation of the estimates on the level of approximations; note, however, that the convex set used in paper [14] differs from the set K_c used here.) Thus, we can formulate the theorem:

Theorem 2. Let functions v* ∈ W^{1/2,2}(Γ₁) (satisfying condition (∗)), f ∈ W^{−1,2}(Ω) and g ∈ L^{4/3}(Γ₂) be given. Let the number ζ be so small that c₇ζ < ν. Then there exists v ∈ K_c such that the variational inequality (24) is satisfied for all w ∈ K_c.

Recall that ζ is used in the definition of the convex set K_c, see (12) and (13). The smaller ζ is, the smaller K_c is, and the narrower the space left for possible reverse flows on Γ₂.

5. Conclusion
The paper provides a mathematical model of flows through a channel with an artificial boundary condition (5) on the outflow. Both the unsteady and the steady case are considered. The core of the model is the variational inequalities (16) (in the unsteady case) and (24) (in the steady case). Solutions are sought in appropriate closed convex subsets of the relevant function spaces, defined by means of restrictions imposed on possible reverse flows on the outflow. The restricting conditions bound the kinetic energy brought back into Ω through Γ₂ by the reverse flows. Consequently, they enable one to derive energy-type a priori estimates. Then, applying a relatively standard technique (based e.g. on the construction of appropriate approximations or on a fixed point theorem), one can arrive at the existence of solutions. This confirms the soundness of the used model and the associated variational inequalities, in contrast to models based purely on equations, where the existence of weak or strong solutions is generally an open problem. Apart from the discussion of various boundary conditions of the "do nothing" type (see Paragraphs 2.2 and 2.3) and some a posteriori properties of solutions (Paragraph 3.5), we present a detailed description of a priori estimates of solutions. These estimates clarify, on the formal level, how the information that the solutions belong to L^∞(0,T; L²(Ω)) ∩ L²(0,T; W^{1,2}(Ω)) (in the unsteady case) or to W^{1,2}(Ω) (in the steady case) follows directly from the variational inequalities used, regardless of other technicalities connected e.g. with possible approximations. Analogous estimates have been obtained in a completely different and much more technical way (i.e. at first on the level of approximations and then by an appropriate limit transition) in papers [14] and [15]. It must be noted, however, that while the convex set corresponding to our K_c^t is defined in a rather artificial way in [14] and [15], our K_c^t has a good physical sense. Naturally, the change of the set K_c^t requires a new technique in the derivation of the approximations. We do not present any numerical justification of our model. Nevertheless, corresponding numerical experiments, also involving a comparison between the various artificial boundary conditions on the outflow suggested in Paragraphs 2.2 and 2.3, would be very desirable and interesting.

Acknowledgements
This work was supported by the European Regional Development Fund, project "Center for Advanced Applied Science" No. CZ.02.1.01/0.0/0.0/16_019/0000778.

References
[1] M. Beneš, P. Kučera. Solutions of the Navier–Stokes equations with various types of boundary conditions. Archiv der Mathematik 98:487–497, 2012. doi:10.1007/s00013-012-0387-x.
[2] C.-H. Bruneau, P. Fabrie. New efficient boundary conditions for incompressible Navier–Stokes equations: a well-posedness result. Mathematical Modelling and Numerical Analysis 30(7):815–840, 1996. doi:10.1051/m2an/1996300708151.
[3] M. Feistauer, T. Neustupa. On some aspects of analysis of incompressible flow through cascades of profiles. Operator Theory, Advances and Applications 147:257–276, 2004.
[4] M. Feistauer, T. Neustupa. On non-stationary viscous incompressible flow through a cascade of profiles. Mathematical Methods in the Applied Sciences 29(16):1907–1941, 2006. doi:10.1002/mma.755.
[5] M. Feistauer, T. Neustupa. On the existence of a weak solution of viscous incompressible flow past a cascade of profiles with an arbitrarily large inflow. Journal of Mathematical Fluid Mechanics 15(15):701–715, 2013. doi:10.1007/s00021-013-0135-4.
[6] J. G. Heywood, R. Rannacher, S. Turek. Artificial boundaries and flux and pressure conditions for the incompressible Navier–Stokes equations. International Journal for Numerical Methods in Fluids 22(5):325–352, 1996. doi:10.1002/(SICI)1097-0363(19960315)22:5<325::AID-FLD307>3.0.CO;2-Y.
[7] T. Neustupa. A steady flow through a plane cascade of profiles with an arbitrarily large inflow – the mathematical model, existence of a weak solution. Applied Mathematics and Computation 272:687–691, 2016. doi:10.1016/j.amc.2015.05.066.
[8] T. Neustupa. The weak solvability of the steady problem modelling the flow of a viscous incompressible heat-conductive fluid through the profile cascade.
International Journal of Numerical Methods for Heat & Fluid Flow 27(7):1451–1466, 2017. doi:10.1108/hff-03-2016-0104.
[9] P. Kučera, Z. Skalák. Local solutions to the Navier–Stokes equations with mixed boundary conditions. Acta Applicandae Mathematicae 54:275–288, 1998. doi:10.1023/a:1006185601807.
[10] P. Kučera. Basic properties of the non-steady Navier–Stokes equations with mixed boundary conditions in a bounded domain. Ann. Univ. Ferrara 55:289–308, 2009.
[11] M. Braack, P. B. Mucha. Directional do-nothing condition for the Navier–Stokes equations. Journal of Computational Mathematics 32(5):507–521, 2014. doi:10.4208/jcm.1405-m4347.
[12] M. Lanzendörfer, J. Stebel. On pressure boundary conditions for steady flows of incompressible fluids with pressure and shear rate dependent viscosities. Applications of Mathematics 56(3):265–285, 2011. doi:10.1007/s10492-011-0016-1.
[13] S. Kračmar, J. Neustupa. Modelling of flows of a viscous incompressible fluid through a channel by means of variational inequalities. ZAMM 74(6):637–639, 1994.
[14] S. Kračmar, J. Neustupa. A weak solvability of a steady variational inequality of the Navier–Stokes type with mixed boundary conditions. Nonlinear Analysis: Theory, Methods & Applications 47(6):4169–4180, 2001. Proceedings of the Third World Congress of Nonlinear Analysts. doi:10.1016/s0362-546x(01)00534-x.
[15] S. Kračmar, J. Neustupa. Modeling of the unsteady flow through a channel with an artificial outflow condition by the Navier–Stokes variational inequality. Mathematische Nachrichten 291(11–12):1801–1814, 2018. doi:10.1002/mana.201700228.
[16] P. Deuring, S. Kračmar. Artificial boundary conditions for the Oseen system in 3D exterior domains. Analysis 20:65–90, 2012.
[17] P. Deuring, S. Kračmar. Exterior stationary Navier–Stokes flows in 3D with non-zero velocity at infinity: approximation by flows in bounded domains. Mathematische Nachrichten 269–270:86–115, 2004. doi:10.1002/mana.200310167.
[18] R. Temam. Navier–Stokes Equations. North-Holland, Amsterdam, 1977.
[19] J. L. Lions, E. Magenes. Nonhomogeneous Boundary Value Problems and Applications I. Springer-Verlag, New York, 1972. doi:10.1007/978-3-642-65161-8.
[20] I. Ekeland, R. Temam. Convex Analysis and Variational Problems. North-Holland Publishing Company, Amsterdam–Oxford–New York, 1976. doi:10.1137/1.9781611971088.bm.
[21] G. P. Galdi. An Introduction to the Mathematical Theory of the Navier–Stokes Equations, Steady-State Problems. Springer-Verlag, 2nd edn., 2011.
[22] J. L. Lions. Quelques méthodes de résolution des problèmes aux limites non linéaires. Dunod, Gauthier-Villars, Paris, 1969.
[23] J. Marschall. The trace of Sobolev–Slobodeckij spaces on Lipschitz domains. Manuscripta Mathematica 58:47–65, 1987. doi:10.1007/bf01169082.
Acta Polytechnica 58(2):144–154, 2018, doi:10.14311/ap.2018.58.0144
© Czech Technical University in Prague, 2018, available online at http://ojs.cvut.cz/ojs/index.php/ap

Application of the temperature oscillation method in heat transfer measurements at the wall of an agitated vessel

S. Solnař*, M. Dostál, K. Petera, T. Jirout
Czech Technical University in Prague, Faculty of Mechanical Engineering, Department of Process Engineering, Technická 4, Prague, Czech Republic
* Corresponding author: stanislav.solnar@fs.cvut.cz

Abstract. The temperature oscillation infra-red thermography (TOIRT) was used to measure convective heat transfer coefficients at the inner vertical wall of an agitated and baffled vessel. Two impellers were used: an axial six-blade impeller with pitched blades and the Rushton turbine. The TOIRT method is an indirect method based on measuring the phase shift between an oscillating heat flux applied to one side of the heat transfer surface and the wall temperature response monitored by a contactless infra-red camera on the same side. On the basis of this phase shift, the TOIRT method can indirectly evaluate the heat transfer coefficient on the other side of the heat transfer surface. Two forms of experimental results are presented in this paper. The first one describes graphical dependencies of the local Nusselt number on the dimensionless distance from the vessel bottom for various Reynolds numbers. The second form describes the mean Nusselt number along the wall as a function of the Reynolds number.

Keywords: TOIRT, temperature oscillation method, agitated vessel, wall, heat transfer coefficient.

1. Introduction
One of the most common operations in the chemical, food and pharmaceutical industries is the mixing of liquid materials. This operation frequently requires cooling or heating the agitated liquid.
The required size of the transfer surface depends mainly on the intensity of the heat transfer between the agitated liquid and the heat transfer surface. The heat transfer rate Q̇ depends on the size of the heat transfer surface S, the heat transfer coefficient α, and the mean temperature difference ∆T:
$$\dot{Q}=\alpha\,S\,\Delta T.\qquad(1)$$
Therefore, the knowledge of the heat transfer coefficient α at the heat transfer surface (for example, at the vessel bottom, vessel walls, baffles, inserts, etc.) is very important for an optimum design of the mixing equipment. Experimental methods based on measuring various thermal quantities (heat fluxes, heat flow rates, temperatures) are the ones most frequently used for determining values of heat transfer coefficients. Besides that, methods based on an analogy between various transport phenomena (heat and mass transport, for example) can also be used. Nowadays, numerical methods implemented in many CFD packages solving the momentum and heat transport equations are also frequently used to determine heat transfer coefficients in laminar as well as turbulent flow regimes. In principle, the methods based on measuring thermal quantities use (1). With the liquid and wall temperatures along with the heat flow rate transferred between the wall and the liquid determined, the heat transfer coefficient can be expressed from this equation (assuming we know the size of the heat transfer surface). The heat flow rate can be measured by various techniques. One possibility is using thin-film heat flux sensors, which measure the small temperature differences arising from the heat flow rate through a thin film element made of a material with known properties. Transient methods determine the heat flow rate, for example, on the basis of a monitored temperature profile of the liquid accumulating the supplied heat; see [1, 2] for a description of this approach. In general, any experimental method in which some relationship between the heat transfer coefficient and directly measured physical quantities can be derived may be used to determine heat transfer coefficients. Wandelt and Roetzel [8] found an analytical solution for the case where a plate (representing the heat transfer surface) is heated from one side by an oscillating (sine) heat flux. This solution depends on the properties of the plate and on the heat transfer coefficients on both sides. By measuring the time dependency of the plate surface temperature, it is then possible, on the basis of their mathematical model, to indirectly evaluate the heat transfer coefficient on the other side of the plate. Wandelt and Roetzel's temperature oscillation infra-red thermography method (TOIRT) has been extensively studied by Freund [9] in his dissertation thesis. Freund studied the theoretical background of this method and focused on some well-known cases like a pipe flow or a jet impinging on a plate perpendicularly. Freund and Kabelac [10] presented another study based on the TOIRT method, in which heat transfer coefficients in a plate heat exchanger were measured.
|            | C     | m    | n    | l_char | Note                                    | Ref. |
| Nu_0       | 0.726 | 0.53 | –    | d      | h/d = 1, Pr = 0.71                      | [3]  |
| Nu (mean)  | 0.424 | 0.57 | –    | d      | h/d = 1, Pr = 0.71, 0 ≤ r/d ≤ 1         | [3]  |
| Nu (mean)  | 0.150 | 0.67 | –    | d      | h/d = 1, Pr = 0.71, 0 ≤ r/d ≤ 2         | [3]  |
| Nu_0       | 1.2   | 0.5  | 1/3  | d      | h/d = 1                                 | [4]  |
| Nu_6BD     | 0.81  | 0.68 | 0.33 | d      | 10⁵ < Re < 7 × 10⁵                      | [5]  |
| Nu_6BD     | 1.15  | 0.65 | 0.33 | d      | 30 < Re < 4 × 10⁴                       | [6]  |
| Nu_6BD     | 0.76  | 2/3  | 1/3  | d      | 4000 < Re < 28 × 10⁴                    | [6]  |
| Nu_6BD     | 0.73  | 0.67 | 0.33 | d      | 10⁴ < Re < 5 × 10⁴                      | [7]  |
| Nu_6PBT45  | 0.43  | 0.67 | 0.33 | d      | 10⁴ < Re < 5 × 10⁴                      | [7]  |

Table 1. Parameters C, m, and n in correlation (4) for various cases in the literature. l_char represents the characteristic length in equation (2). Subscript 0 refers to the stagnation point of the impinging jet; 6BD refers to the six-blade disc impeller (Rushton turbine), and 6PBT45 to the six-pitched-blade turbine impeller (pitch angle 45°) in the common geometric configuration, i.e. h/D = 1, d/D = 1/3.

Freund et al. [11] also used the TOIRT method in a study of heat transfer in spray cooling systems. Impinging jets are widely used in industrial applications for local cooling and heating. Many papers dealing with heat transfer in various geometries can be found in the literature, and impinging jets represent frequent reference cases for testing the results of heat transfer measurements or numerical simulations. In studies of impinging jets, most authors focus on cases with a nozzle-to-plate spacing (related to the nozzle diameter) greater than 1; only a few authors deal with relative distances h/d less than 1 or around 1. Lytle and Webb [3] measured local values of the heat transfer coefficient for 0.1 < h/d < 6 and Reynolds numbers within the range 3600 < Re < 27600. They presented the radial dependency of the local Nusselt number, correlations for the Nusselt number at the stagnation point, and mean Nusselt numbers averaged over the heat transfer surface. Katti and Prabhu [4] presented results for the radial dependency of the local Nusselt number based on experimental data for 0.5 < h/d < 8 and 12000 < Re < 28000. They presented expressions describing the local Nusselt number in the stagnation, transition, and wall-jet regions, and they also proposed a correlation for the stagnation-point Nusselt number. Persoons et al. [12] studied local heat transfer in a pulsating impinging jet for nozzle-to-surface spacings 1 ≤ h/d ≤ 6, Reynolds numbers 6000 ≤ Re ≤ 14000, and pulsation frequencies 0 ≤ f ≤ 55 Hz. Many works have also focused on numerical modelling of impinging jets, for example Draksler et al. [13] or Jensen and Walther [14]. Zuckerman and Lior [15] published an extensive review paper on impinging jets summarizing empirical correlations and numerical simulation techniques. Another very frequent field of study is heat transfer in agitated vessels. Many papers can be found about heat transfer in the whole vessel (Wichterle [16], Mohan et al. [17], Akse et al. [5], Karcz [7]), but only a few deal with local heat transfer coefficients at the bottom of the agitated vessel or at the vessel wall (Karcz and Cudak [18], Engeskaug et al. [19], Mohan et al. [17] and Akse et al. [5]). Mohan et al. [17] presented a review of heat transfer in agitated vessels. This work focused on heat transfer in single-phase and two-phase (gas–liquid) agitated systems equipped with either a jacket or heat transfer coils. They also presented axial dependencies of local heat transfer coefficients in an air–water agitated system. The heat transfer in a jacketed vessel agitated by the standard Rushton turbine was described by Akse et al. [5].
Edwards and Wilkinson [6] also presented a review of heat transfer in agitated vessels with various types of agitators and heat transfer surfaces. Typical correlation parameters (see equation (4)) for the average Nusselt number in a vessel agitated by a six-blade disc impeller (Rushton turbine) are summarized in Table 1. Typical values of the parameters for the average Nusselt number at the wall of a vessel agitated by a six-blade impeller with a pitch angle of 45°, published by Karcz [7], are also given in this table. Local convective heat transfer coefficients are usually evaluated by experiments or numerical models. The two most frequently used experimental methods are the thermochromic liquid crystal method (TLC), see Vejrazka and Marty [20] for an example, and the electro-diffusion method (EDD) (Karcz and Cudak [18]). Usually, the measured values of heat transfer coefficients are expressed in terms of correlations for the dimensionless Nusselt number
$$\mathrm{Nu}=\frac{\alpha\,d}{\lambda},\qquad(2)$$
where d represents a characteristic length; it can be the length of a plate, a pipe diameter, an impeller diameter, etc. In engineering practice, mean values of the Nusselt number averaged over the heat transfer surface are more frequently used,
$$\overline{\mathrm{Nu}}=\frac{1}{S}\int_S\mathrm{Nu}\,\mathrm{d}S.\qquad(3)$$
The dependency of local or mean Nusselt numbers is often expressed in the following form,
$$\mathrm{Nu}=C\,\mathrm{Re}^m\,\mathrm{Pr}^n,\qquad(4)$$
where C, m and n are parameters determined from the experiments. Table 1 illustrates a comparison of typical values of these parameters in correlation (4) for some of the cases referenced above. The Reynolds number Re in this correlation is defined as
$$\mathrm{Re}=\frac{u\,d\,\rho}{\mu}\qquad(5)$$
or, for mixing systems, with nd substituted for the velocity u,
$$\mathrm{Re}_M=\frac{n\,d^2\,\rho}{\mu}.\qquad(6)$$
The Prandtl number is usually defined as
$$\mathrm{Pr}=\frac{\nu}{a}=\frac{\mu\,c_p}{\lambda}.\qquad(7)$$
The transport and thermophysical properties in the correlations for the Nusselt number and in the corresponding dimensionless numbers (see the equations above) are related to the fluid (liquid).
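As a quick illustration of how correlations of the form (4) are used in practice, the following Python sketch evaluates the mean Nusselt number and the corresponding heat transfer coefficient from the definition (2); the function name and the choice of the Rushton-turbine parameters of Karcz [7] from Table 1 are ours, for illustration only.

```python
def nusselt(re, pr, c, m, n):
    """Mean Nusselt number from the power-law correlation (4): Nu = C Re^m Pr^n."""
    return c * re**m * pr**n

# Rushton turbine (6BD) parameters from Table 1, Karcz [7]: C = 0.73, m = 0.67, n = 0.33
re_m = 3.0e4           # mixing Reynolds number, eq. (6)
pr = 6.92              # Prandtl number of water at ~20.5 degC (value quoted in the text)
nu = nusselt(re_m, pr, c=0.73, m=0.67, n=0.33)

# Heat transfer coefficient back-calculated from the Nusselt number definition (2)
d, lam = 0.1, 0.605    # characteristic length [m] and water conductivity [W/m/K]
alpha = nu * lam / d
print(f"Nu = {nu:.0f}, alpha = {alpha:.0f} W/m^2/K")
```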
2. Theoretical background of TOIRT
The TOIRT method by Wandelt and Roetzel [8] is based on measuring the wall temperature with an IR camera (see Figure 1). This temperature depends on two main factors: the heat transfer coefficients on both sides of the wall of thickness δ, and the modulated heat flux q. The heat flux is modulated by a sine function, q(t) = q̂ sin(ωt), and is applied to one side of the wall. The temperature field monitored on the same side by the IR camera then shows a sinusoidal dependency. According to the TOIRT method, the phase shift between the sine-modulated heat flux and the wall temperature response can be used to indirectly evaluate the heat transfer coefficient α₀ on the other side of the wall. The time and spatial dependency of the temperature T in a homogeneous wall (with thermal diffusivity a_w) is described by Fourier's equation
$$\frac{\partial T}{\partial t}=a_w\left(\frac{\partial^2 T}{\partial x^2}+\frac{\partial^2 T}{\partial y^2}+\frac{\partial^2 T}{\partial z^2}\right).\qquad(8)$$
By neglecting the lateral heat conduction in the wall (the first two partial derivatives on the right-hand side of the previous equation), it is possible to find an analytical solution for the case of the periodically oscillating heat flux.

Figure 1. The temperature oscillation method for heat transfer measurement by Wandelt and Roetzel [8]: a wall of thickness δ, with the heat transfer coefficient α₀ on one side and αδ on the other, heated by the modulated heat flux q and monitored by the IR camera.

The boundary conditions of the third kind can be written as
$$\lambda_w\,\frac{\partial T}{\partial z}\bigg|_{z=0}=\alpha_0\,T|_{z=0},\qquad(9)$$
$$\lambda_w\,\frac{\partial T}{\partial z}\bigg|_{z=\delta}=\hat{q}\,\sin(\omega t)-\alpha_\delta\,T|_{z=\delta},\qquad(10)$$
where α₀ is the heat transfer coefficient we are looking for, αδ is the heat transfer coefficient on the other side of the wall, ω is the angular frequency of the oscillating heat flux and q̂ is its amplitude. The authors [8] used the Laplace transformation to solve this system of equations and found a solution in the following form
$$T(z,t)=A(z)\,\sin\bigl(\omega t-\varphi(z)\bigr),\qquad(11)$$
where the phase shift φ(z) at the surface with the periodically oscillating heat flux can be described as
$$\tan\varphi|_{z=\delta}=\frac{C_1+2\xi\psi_0 C_2+2\xi^2\psi_0^2 C_3}{2\xi\psi_0(1{+}R)\,C_0+2\xi^2\psi_0^2(1{+}2R)\,C_1+4\xi^3\psi_0^3 R\,C_2+C_3}.\qquad(12)$$
The dimensionless parameters R, ψ₀, ξ and C₀,₁,₂,₃ in the previous equation are defined as
$$R=\frac{\alpha_\delta}{\alpha_0},\qquad\psi_0=\frac{\alpha_0\,a_w}{\delta\,\lambda_w\,\omega},\qquad\xi=\delta\sqrt{\frac{\omega}{2\,a_w}},\qquad(13)$$
$$\begin{aligned}C_0&=\cosh^2\xi\,\cos^2\xi+\sinh^2\xi\,\sin^2\xi,\\ C_1&=\cosh\xi\,\sinh\xi+\cos\xi\,\sin\xi,\\ C_2&=\cosh^2\xi\,\sin^2\xi+\sinh^2\xi\,\cos^2\xi,\\ C_3&=\cosh\xi\,\sinh\xi-\cos\xi\,\sin\xi.\end{aligned}\qquad(14)$$
The big advantage of equation (12) when determining the heat transfer coefficient α₀ is that it does not depend on the amplitude of the heat flux. However, the size of the amplitude influences the accuracy of the measured surface temperature, which then has an indirect impact on the accuracy of the key quantity, the phase shift between the oscillating heat flux and the measured temperature response. Therefore, the correct and accurate evaluation of the heat transfer coefficient from the experimental data depends mainly on the accuracy of this phase shift. Figure 2 (top) illustrates a real temperature response measured by the IR camera at one point of the heat transfer surface. It obviously does not oscillate around a constant average value, as boundary conditions (9) and (10) would imply, but slowly increases with time. This increase is caused by the accumulation of heat in the measured volume, because the oscillating heat flux does not represent a heat source and sink with zero average value. The "slow component" of the temperature increase must be removed, so that the measured data can be transformed into the form depicted in the middle of Figure 2, i.e. oscillations around a zero average value. This transformation can be accomplished by many algorithms. The simplest one is subtracting a function describing the "slow component" (approximated by a piece-wise linear function, for example) so that the resulting average value is zero. Freund [9] mentioned that removing the exponential part of the time dependency should have no significant impact on the results. On the basis of the transformed measured data (see Figure 2), it is then possible to find the phase shift φ. We used two methods in our case (their results are compared to detect possible discrepancies). The slower method uses classic non-linear regression to find the parameters of a model function describing the transformed experimental data. Equation (11) represents a suitable model function for approximating ∆T. The parameters A and φ can then be determined with the least-squares method based on minimizing the function
$$\sum_k\bigl(\Delta T(t_k)-\Delta T_k\bigr)^2,\qquad(15)$$
where k is the index representing the sum over the experimental data ∆T_k at times t_k (see Figure 2). In the practical implementation of this algorithm, we used the function nlinfit in MATLAB®.
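A minimal Python equivalent of this regression step might look as follows; this is a sketch using scipy.optimize.curve_fit in place of MATLAB's nlinfit, with a synthetic detrended signal standing in for the measured data, and with the angular frequency ω assumed known from the generator setting.

```python
import numpy as np
from scipy.optimize import curve_fit

omega = 2 * np.pi / 10.0           # angular frequency of the 10 s modulation period
t = np.arange(0.0, 100.0, 0.1)     # 10 Hz sampling, as in the experiments

# Synthetic detrended response with noise, standing in for the signal in Fig. 2 (middle)
rng = np.random.default_rng(0)
dT = 0.4 * np.sin(omega * t - 0.6) + 0.05 * rng.standard_normal(t.size)

def model(time, amp, phi):
    """Model function (11) evaluated at the irradiated surface z = delta."""
    return amp * np.sin(omega * time - phi)

(amp, phi), _ = curve_fit(model, t, dT, p0=[dT.std() * np.sqrt(2.0), 0.0])
print(f"amplitude A = {amp:.3f} K, phase shift phi = {phi:.3f} rad")
```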
The second method for evaluating the phase shift is based on the discrete Fourier transformation (DFT) [21], which seems to be very suitable with respect to the periodic behaviour of the signal. The coefficients X_n of the Fourier series describing the original signal f can be determined as
$$X_n=\frac{1}{N}\sum_{k=0}^{N-1}f(k)\,e^{-2\pi i k n/N}.\qquad(16)$$
It is sufficient to analyse only the first harmonic component; in our case this means that we can evaluate only the first coefficient in (16) and express the value of the phase shift as
$$\tan\varphi=\frac{\operatorname{Im}(X_1)}{\operatorname{Re}(X_1)}.\qquad(17)$$

Figure 2. Illustration of processing the temperature time responses at the heat transfer surface measured by the IR camera. (Top) Original measured temperature response. (Middle) Transformed temperature response oscillating around zero value, i.e. with the accumulation (exponential) component removed. (Bottom) Comparison of the normalized values of the generated heat flux q* with the first harmonic of the preprocessed temperature response ∆T* in order to find the phase shift φ.

The DFT requires N² operations for the transformation of the whole spectrum, where N is the number of discrete points; since only the first coefficient is needed here, the evaluation is substantially faster than the non-linear regression mentioned above.
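A sketch of this DFT route in Python follows, assuming the record length is an integer number of modulation periods so that the modulation frequency falls exactly on a DFT bin (the bin index and variable names are ours). Note that converting the extracted angle into the phase shift φ of (11) still requires referencing it to the phase of the heat-flux signal, extracted in the same way.

```python
import numpy as np

fs, f_mod, n_per = 10.0, 0.1, 10          # sampling rate [Hz], modulation frequency [Hz], periods
t = np.arange(0.0, n_per / f_mod, 1.0 / fs)
dT = 0.4 * np.sin(2 * np.pi * f_mod * t - 0.6)   # detrended signal, as in Fig. 2 (middle)

X = np.fft.fft(dT) / t.size               # DFT coefficients, normalized as in (16)
k = int(round(f_mod * t.size / fs))       # bin of the first harmonic (equals n_per here)
phi = np.arctan2(X[k].imag, X[k].real)    # phase angle of the first harmonic, cf. (17)
print(f"phase angle of the first harmonic: {phi:.3f} rad")
```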
3. Application of the method
The temperature oscillation method requires generating the sine-modulated heat flux and monitoring the temperature fields at precise time intervals. To satisfy these requirements, we used the two-channel function generator BK Precision 4052. The first channel generates a sine wave in the voltage range of 3–10 V. This signal is connected to a 1.6 kW power supply and halogen lamps. We used 500 W linear halogen lamps; because of their poor efficiency, most of the electric power is transformed into heat flux. The second channel of the BK 4052 triggers the thermoIMAGER TIM 160 IR camera with a spatial resolution of 160 × 120 pixels. To get better results and a constant emissivity ε, the target surfaces were painted with a black matte spray colour with an emissivity of ε = 0.96. The experimental equipment and wiring schematic are depicted in Figure 3.

Figure 3. Wiring schematic and experimental equipment for the temperature oscillation method: function generator, power supply, PC, halogen light and IR camera.

The phase shift between the heat flux and the wall temperature response must be measured precisely. Therefore, it is important to synchronize the heat flux and the IR camera trigger signal. The dynamic characteristic of the whole system, i.e. the voltage supply and halogen lamps, has to be taken into account, otherwise the experimental results would be affected by a systematic error. We used two methods to determine the dynamic characteristic. The first one measures the temperature response on one side of the heat transfer surface (wall), which is insulated on the other side (zero-heat method). The phase shift is then determined from the shift between the signal controlling the heat flux and the temperature response signal. The second method measures the transition characteristic of the voltage supply and halogen lamp system on the basis of a step change of the voltage supply control input and the corresponding heat flux measurement (step-voltage method). Assuming the system is of the first order, the phase shift can be determined as the time constant of the measured response. The dynamic characteristic of the system, i.e. the time constant, depends only on the properties of the system and not on the period of the oscillating signal. Nevertheless, several periods were tested (2, 5, 10 and 20 s, with corresponding frequencies 0.5, 0.2, 0.1 and 0.05 Hz) and the same results were obtained. The comparison of both methods is given in Table 2.

| Method             | Phase shift (°) | Time delay (ms) |
| Zero-heat method   | 8.271           | 229             |
| Step-voltage method| 8.377           | 232             |

Table 2. Time-frame synchronization results; the phase-shift results are for a sine wave with a period of 10 s (frequency 0.1 Hz).

3.1. Validation of the method: pipe flow
The TOIRT experimental method was validated on two cases: a flow of water in a pipe, and a jet of water impinging on a plate perpendicularly.

Figure 4. Pipe-flow experiment with a copper pipe 26×1 mm.

The mean temperature of the liquid (water) was maintained within the range of 16 ± 1.5 °C during the pipe-flow experiment. The corresponding liquid properties at this temperature are: density 999 kg m⁻³, thermal conductivity 0.598 W m⁻¹ K⁻¹, dynamic viscosity 1.1 mPa s, and Prandtl number 7.36. The temperature of the flowing liquid was measured at the inlet and outlet of the measuring section. The maximum temperature increase of the outer pipe surface was 5 °C during the experiments. Neglecting the thermal resistance of the wall and the heat accumulation, we could assume the same liquid temperature at the inner pipe surface. The dynamic viscosity at this temperature is 0.98 mPa s. Taking into account the temperature dependency of the viscosity as described by the Sieder–Tate correction, its influence would be (0.98/1.1)^0.14 = 0.98, that is, 2 %. This influence was neglected. Water was pumped from a water tank by a centrifugal pump and flowed through a copper pipe (density 8941 kg m⁻³, thermal conductivity 392 W m⁻¹ K⁻¹, thermal diffusivity 1.136 × 10⁻⁴ m² s⁻¹) with an inner diameter of d = 24 mm. The measurement section (100 mm long, approx. 4d) started at a distance of approximately l = 80d from the inlet; therefore, we assumed a fully developed velocity profile there. The volumetric flow rate was measured by an induction flow meter (Krohne IFC010D) and was used to calculate the mean velocity u in the pipe and consequently the Reynolds number according to (5). The experimental setup is shown in Figure 4. Given the distance of the IR camera from the surface, the size of the measured point was 0.75 mm/px. Heat transfer coefficients were measured in the range of Reynolds numbers from 10⁴ to 5 × 10⁴ and with various parameters: the period of the sine wave 10 or 20 s, the scan frequency of the IR camera 10 or 20 Hz, and the power modulation between 30 and 100 %. The heat transfer coefficients were evaluated at the inner side of the pipe wall and the results were described by the dependency of the Nusselt number on the Reynolds number,
$$\mathrm{Nu}=C\,\mathrm{Re}^m\,\mathrm{Pr}^{0.4},\qquad(18)$$
where Pr^0.4 is the Prandtl number with the exponent commonly used for heating. Our evaluated dependency (Re from 10⁴ to 5 × 10⁴) was
$$\mathrm{Nu}=0.029\,\mathrm{Re}^{0.8}\,\mathrm{Pr}^{0.4},\qquad(19)$$
which is very close to the well-known Dittus–Boelter correlation
$$\mathrm{Nu}=0.023\,\mathrm{Re}^{0.8}\,\mathrm{Pr}^{0.4},\qquad(20)$$
valid for Re > 10⁴ and Pr from 0.6 to 160.
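The closeness of (19) to (20) is easy to check numerically; the short sketch below simply evaluates both correlations over the measured Re range (our own illustrative script, not part of the original evaluation).

```python
pr = 7.36                                   # water at ~16 degC, from the text
for re in (1e4, 3e4, 5e4):
    nu_toirt = 0.029 * re**0.8 * pr**0.4    # evaluated dependency (19)
    nu_db = 0.023 * re**0.8 * pr**0.4       # Dittus-Boelter correlation (20)
    print(f"Re = {re:6.0f}: Nu(19) = {nu_toirt:6.1f}, Nu(20) = {nu_db:6.1f}")
```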
3.2. Validation of the method: impinging jet flow
Heat transfer in an impinging jet was investigated at a plane stainless-steel plate (density 7800 kg m⁻³, thermal conductivity 14.6 W m⁻¹ K⁻¹ and thermal diffusivity 3.736 × 10⁻⁶ m² s⁻¹) with a diameter of 392 mm. The stainless plate formed the flat bottom of a cylindrical unbaffled vessel and was impinged by a liquid stream coming from a pipe with an inner diameter of 22 mm, see Figure 5. The jet outlet nozzle was located at h/d = 1 above the vessel bottom.

Figure 5. Impinging jet experiment.

The centrifugal pump, controlled by an electric frequency inverter, formed a closed loop with the water in the vessel; hence the water level was constant during the experiment. The volumetric flow was measured by an induction flow meter (Krohne Optiflux 5300) and was used to calculate the mean velocity in the pipe and the Reynolds number according to (5). The mean temperature of the liquid was maintained within a range of 20.5 ± 1.5 °C during the experiment. The corresponding liquid properties are: density 998.1 kg m⁻³, thermal conductivity 0.605 W m⁻¹ K⁻¹, dynamic viscosity 0.99 mPa s, and Prandtl number 6.92. Given the distance of the IR camera from the bottom (see Figure 5), the size of the measured point was 3.3 mm/px. Figure 6 compares the experimental results [22] with the literature data [4, 12] for two different Reynolds numbers.

Figure 6. Comparison of the measured experimental data for the impinging-jet validation case, h/d = 1, Solnař [22], with literature data by Persoons et al. [12] (Re = 10000 vs. 10375) and Katti and Prabhu [4] (Re = 20000 vs. 19695).

4. Experimental setup
Schematic drawings of the agitated vessel (stainless steel, inner diameter D = 300 mm) with two types of impellers (axial and radial) and four baffles (at 90°) are shown in Figure 7. The axial down-pumping impeller (6PBT45) with six pitched blades (pitch angle 45°) and a diameter of d = 100 mm (ratio D/d = 3) was used. The radial impeller was a Rushton turbine with the same diameter as the axial-flow impeller. The thickness of the vessel wall was 1.03 mm and it was coated with a black matte colour on the outer side of the vessel in the area between the baffles (the measurement area). The thermophysical properties of the vessel wall were: thermal conductivity 14.6 W m⁻¹ K⁻¹, density 7800 kg m⁻³ and specific heat capacity 501 J kg⁻¹ K⁻¹. The water level in the vessel was H = D. The height of the impellers h above the bottom and the water level H were adjusted with an accuracy of 1 mm. The IR camera monitored the temperature field along the height of the vessel wall with a spatial resolution of about 2.5 mm/px. The sensitivity of the IR camera was 0.08 K. The temperature of the liquid in the vessel was measured by a Pt1000 temperature sensor. It was within a range of 22.0 ± 0.7 °C, and the corresponding properties of water are: density 997.8 kg m⁻³, thermal conductivity 0.605 W m⁻¹ K⁻¹, dynamic viscosity 0.99 mPa s, and Prandtl number 6.92.

Figure 7. Schematic drawings of our experimental vessel with four baffles and an impeller. (Left) Axial-flow impeller 6PBT45, (right) Rushton turbine.

Both impellers were situated at h/d = 1 above the vessel bottom. Radial baffles of a width b = 0.1D were used in both cases.
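To relate the Re ranges quoted below to actual impeller speeds, one can invert the mixing Reynolds number (6); a small Python sketch with the water properties given above (the helper function is our own, for illustration):

```python
def speed_from_re(re_m, d=0.1, rho=997.8, mu=0.99e-3):
    """Rotation speed n [1/s] from the mixing Reynolds number (6): Re_M = n d^2 rho / mu."""
    return re_m * mu / (d**2 * rho)

for re_m in (9e3, 5e4, 9e4):
    n = speed_from_re(re_m)
    print(f"Re_M = {re_m:8.0f} -> n = {n:5.2f} 1/s = {60 * n:5.0f} rpm")
```

For Re_M = 5 × 10⁴ this gives roughly 5 s⁻¹ (about 300 rpm), consistent with the rotation speed used in the discussion in Section 6.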
The Reynolds number in a mixing system is defined by (6) with the impeller diameter d as the characteristic dimension. The Reynolds number in the case of the 6PBT45 was within the range 9 × 10³–9 × 10⁴, and in the case of the Rushton turbine the range was 9 × 10³–6 × 10⁴. The following parameters were used in each experiment: the period of the sine wave 10 or 20 s, the number of periods 5 or 10, and the sampling frequency of the IR camera always set to 10 Hz.

5. Experimental results
Local values of the Nusselt number can be calculated according to (2), and the mean Nusselt number dependency can be expressed by the commonly used correlation (18). Two forms of experimental data are presented in this paper. The first one describes graphical dependencies of the local Nusselt number on the dimensionless height z/H for various Reynolds numbers (Re from 10⁴ to 5 × 10⁴). The second form describes the mean Nusselt number along the wall as a function of the Reynolds number.

Figure 8. Nusselt number profile on the wall of the vessel with baffles, Rushton turbine, Re from 10⁴ to 5 × 10⁴.

Figure 9. Nusselt number profile on the wall of the vessel with baffles, 6PBT45 down-pumping axial-flow impeller, Re from 10⁴ to 5 × 10⁴.

The Nusselt number profiles for various Reynolds numbers and the Rushton turbine are shown in Figure 8. There is a significant peak of the Nusselt number at the dimensionless height z/H = 0.33, which corresponds to the impeller location. Above z/H > 0.4, the measured data show an exponential dependency. Figure 9 illustrates the Nusselt number profiles for various Reynolds numbers and the 6PBT45 axial-flow impeller. The Nusselt numbers are slightly larger in the lower part of the vessel, from z/H = 0 to z/H = 0.5, but there is no visible peak in the Nusselt number dependency as in the case of the Rushton turbine. The axial-flow impeller was located at z/H = 0.33. The local values of the Nusselt number in Figures 8 and 9 were evaluated as average values over the width of the monitored surface at a specific dimensionless distance z/H from the bottom. Mean values of the Nusselt number along the vessel wall and their dependencies on the Reynolds number are displayed in Figure 10. The dotted lines represent 95 % confidence bands, which are no more than ±15 % wide in our case.

Figure 10. Mean values of the Nusselt numbers along the wall of the agitated vessel and the non-linear regression lines based on the model function (18). Dotted lines represent 95 % confidence bands of the fitted functions; the width of these bands is no more than ±15 %.

Figure 11. Comparison of the heat transfer intensity (coefficients) at the wall of the agitated vessel: 6PBT45 axial-flow impeller (left), Rushton turbine (right); Re = 3 × 10⁴ for both cases.
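The conversion behind Figures 8–11, from a measured α map to local Nusselt numbers (2) and their width-wise averages, can be sketched as follows; the α array here is synthetic, with a shape matching the 120×15 px wall region described in the next paragraph.

```python
import numpy as np

d, lam, pr = 0.1, 0.605, 6.92           # impeller diameter [m], water conductivity, Pr
# Stand-in alpha map [W/m^2/K]; in practice this comes from the TOIRT evaluation
alpha = np.random.default_rng(1).uniform(800.0, 2200.0, size=(120, 15))

nu_local = alpha * d / lam              # local Nusselt numbers, eq. (2)
profile = nu_local.mean(axis=1) / pr**0.4   # width-averaged Nu/Pr^0.4 profile vs. z/H
z_over_h = (np.arange(120) + 0.5) / 120     # dimensionless height of each pixel row
print(profile.shape, z_over_h[:3])
```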
The evaluated parameters of the correlation (18) are summarized in Table 3 for both cases, the Rushton turbine and the 6PBT45 axial-flow impeller.

|                 | C           | m             |
| Rushton turbine | 8.75 ± 33 % | 0.336 ± 9.5 % |
| 6PBT45          | 5.98 ± 34 % | 0.328 ± 9.6 % |

Table 3. Evaluated parameters of (18) for mean values of the Nusselt number.

Contours of the heat transfer coefficients at the wall are depicted in Figure 11. The part of the tank wall monitored by the IR camera is represented here by 120×15 pixels, i.e. 300×37.5 mm according to the camera resolution of 2.5 mm/px. This means that the wall temperature was monitored along the whole height of the liquid in the vessel (H = 300 mm).

Figure 12. Comparison of the experimental results for the Rushton turbine with Karcz and Cudak [18].

Figure 13. Illustration of flow patterns and regions where similarities between the flow along a flat plate and the flow along the wall of the agitated vessel with an axial-flow (left) and a radial-flow (right) impeller can be identified.

6. Discussion
Figure 12 shows a good agreement of our experimental results for the Rushton turbine with those of Karcz and Cudak [18], measured by the EDD experimental method. A bigger difference (approx. 50 %) is visible within the range z/H = 0.25–0.5, where our distribution of experimental data is clearly smoother. Vlček et al. [23] studied velocity fields in a vessel agitated by the 6PBT45 impeller using LDA experiments and LES-based simulations in ANSYS CFD software.
conclusion the temperature oscillation infra-red thermography method was used for measuring local heat transfer coefficients at the wall of an agitated vessel with four baffles. the rushton turbine and a 6pbt45 axial-flow impeller located at dimensionless distance z/h = 0.33 from the vessel bottom were used in the experiments. we have validated the toirt method on the pipe flow and impinging jet experiments. the experimental results of the heat transfer in the agitated vessel are presented as graphical dependencies of the nusselt number with respect to a dimensionless coordinate z/h. in the case of the rushton turbine, a significant peak of the nusselt number at the vessel wall can be observed around z/h = 0.33 (the impeller location). no such peak can be identified in the case of the axial-flow impeller 6pbt45. parameters of the commonly used correlation for the mean nusselt number were evaluated in both cases, the rushton turbine and 6pbt45 axial-flow impeller. similarities between the heat transfer at the wall of the vessel agitated by the 6pbt45 impeller and the flow along a flat plate can be identified in some regions of the wall, but they cannot be recognized for the entire height of the vessel wall or the rushton turbine impeller. the disadvantages of the toirt method consist in the necessity of an accurate measuring of the wall temperature response. the temperature variations are relatively small, therefore, relatively expensive ir cameras with good sensitivity are required, otherwise, the key parameter of this method, the phase shift, is determined inaccurately and the resulting value of the heat transfer coefficient is inaccurate as well. another problem lies in light reflections from the surface irradiated by the halogen lamps, which could easily distort the image scanned by the ir camera. nevertheless, we think that the advantages of the toirt method prevails and they can be briefly summarized as (1) contactless, (2) relatively fast for a heat transfer measurement, (3) with an option to easily change the spatial resolution. list of symbols a thermal diffusivity (of liquid) [m2 s−1] aw thermal diffusivity of wall [m2 s−1] a amplitude [k] b baffle width [m] c geometric constant [–] c0,1,2,3 dimensionless parameters [–] cp specific heat capacity [j kg−1 k−1] d characteristic length, impeller or pipe diameter [m] d vessel diameter [m] h height of impeller of tube above bottom [m] h height of water level [m] lchar characteristic length [m] m,n exponents [–] n rotation speed [s−1] n number of experimental points [–] nu nusselt number [–] nu mean nusselt number, see (3) [–] pr prandtl number [–] q heat flux [w m−2] q̂ amplitude of heat flux [w m−2] q̇ heat flow rate [w] r dimensionless parameter [–] re reynolds number [–] rem mixing reynolds number [–] rel reynolds number for flat plate [–] s surface [m2] t time [s] t temperature [◦c, k] 152 vol. 58 no. 
2/2018 application of the temperature oscillation method ∆t temperature difference [◦c, k] u mean velocity [m s−1] uax axial part of velocity [m s−1] u∗ax dimensionless velocity [–] x1,xn coefficients in fourier’s series x,y,z coordinate [m] α heat transfer coefficient [w m−2 k−1] α0 heat transfer coefficient at the coordinate z = 0 [w m−2 k−1] αδ heat transfer coefficient at the coordinate z = δ [w m−2 k−1] δ wall thickness [m] ε emissivity [–] λ thermal conductivity of liquid [w m−1 k−1] λw thermal conductivity of wall [w m−1 k−1] µ dynamic viscosity [pa s] ν kinematic viscosity [m2 s−1] ρ density [kg m−3] ϕ phase-shift [◦, rad] ψ0,ξ dimensionless parameters [–] ω angular frequency [s−1] abbreviations 6bd six-blade disc impeller (rushton turbine) 6pbt45 six-blade impeller with a pitched angle of 45° cfd computational fluid dynamics dft discrete fourier transform edd electro-diffusion method ir infra red les large eddy simulation lda laser doppler anemometry tlc thermochromic liquid crystals toirt temperature oscillation infra-red thermography px pixel acknowledgements this work is supported by a research project of the grant agency of czech republic no. 14-18955s. references [1] m. dostál, k. petera, f. rieger. measurement of heat transfer coefficients in an agitated vessel with tube baffles. acta polytechnica 50:46–57, 2010. [2] m. dostál, m. věříšová, k. petera, et al. analysis of heat transfer in a vessel with helical pipe coil and multistage impeller. canadian journal of chemical engineering 92(12):2115–2121, 2014. doi:10.1002/cjce.22033. [3] d. lytle, b. w. webb. air jet impingement heat transfer at low nozzle-plate spacing. international journal of heat and mass transfer 37(12):1687–1697, 1994. doi:10.1016/0017-9310(94)90059-0. [4] v. katti, s. prabhu. experimental study and theoretical analysis of local heat transfer distributions between smooth flat surface and impinging air jet from a circular straight pipe nozzle. international journal of heat and mass transfer 51(17):4480–4495, 2008. doi:10.1016/j.ijheatmasstransfer.2007.12.024. [5] h. akse, w. j. beek, f. c. a. a. van berkel, j. de graauw. the local heat transfer at the wall of a large vessel agitated by turbine impellers. chemical engineering science 22:135 – 146, 1967. doi:10.1016/0009-2509(67)80006-x. [6] m. f. edwards, w. l. wilkinson. heat transfer in agitated vessels. part i – newtonian fluids. the chemical engineer 264:310 – 319, 1972. [7] j. karcz. studies of heat transfer process in agitated vessels. trends in chemical engineering 8:161 – 182, 2003. [8] m. wandelt, w. roetzel. lockin thermography as a measurement technique in heat transfer. quantitative infrared thermography 96 pp. 189–194, 1997. [9] s. freund. local heat transfer coefficients measured with temperature oscillation ir thermography. ph.d. thesis, universitat der budeswehr hamburg, 2008. [10] s. freund, s. kabelac. investigation of local heat transfer coefficients in plate heat exchanger with temperature oscillation ir thermography and cfd. international journal of heat and mass transfer 53:3764–3781, 2010. doi:10.1016/j.ijheatmasstransfer.2010.04.027. [11] s. freund, a. g. pautsch, t. a. shedd, s. kabelac. local heat transfer coefficients in spray cooling systems measured with temperature oscillation ir thermography. international journal of heat and mass transfer 50:1953–1962, 2007. doi:10.1016/j.ijheatmasstransfer.2006.09.028. [12] t. persoons, k. balgazin, k. brown, d. b. murray. 
scaling of convective heat transfer enhancement due to flow pulsation in an axisymmetric impinging jet. international journal of heat and mass transfer 135, 2013. doi:10.1115/1.4024620. [13] m. draksler, b. končar, l. cizelj, b. niceno. large eddy simulation of multiple impinging jets in hexagonal configuration flow dynamics and heat transfer characteristics. international journal of heat and mass transfer 109:16–27, 2017. doi:10.1016/j.ijheatmasstransfer.2017.01.080. [14] m. v. jensen, j. h. walther. numerical analysis of jet impingement heat transfer at high jet reynolds number at large temperature difference. heat transfer engineering 34(10):801–809, 2013. doi:10.1080/01457632.2012.746153. [15] n. zuckerman, n. lior. jet impingement heat transfer: physics, correlations, and numerical modelling. advances in heat transfer 39:565–631, 2006. doi:10.1016/s0065-2717(06)39006-5. [16] k. wichterle. heat transfer in agitated vessels. chemical engineering science 49(9):1480–1483, 1994. doi:10.1016/0009-2509(94)85075-5. [17] p. mohan, a. n. emery, t. al-hassan. review heat transfer to newtonian fluids in mechanically agitated vessels. experimental thermal and fluid science 5:861 – 883, 1992. doi:10.1016/0894-1777(92)90130-w. [18] j. karcz, m. cudak. studies of local heat transfer at vicinity of a wall region of an agitated vessel. in proc. 34th international conference of ssche 2007, pp. 029–1 – 029–10. 2007. 153 http://dx.doi.org/10.1002/cjce.22033 http://dx.doi.org/10.1016/0017-9310(94)90059-0 http://dx.doi.org/10.1016/j.ijheatmasstransfer.2007.12.024 http://dx.doi.org/10.1016/0009-2509(67)80006-x http://dx.doi.org/10.1016/j.ijheatmasstransfer.2010.04.027 http://dx.doi.org/10.1016/j.ijheatmasstransfer.2006.09.028 http://dx.doi.org/10.1115/1.4024620 http://dx.doi.org/10.1016/j.ijheatmasstransfer.2017.01.080 http://dx.doi.org/10.1080/01457632.2012.746153 http://dx.doi.org/10.1016/s0065-2717(06)39006-5 http://dx.doi.org/10.1016/0009-2509(94)85075-5 http://dx.doi.org/10.1016/0894-1777(92)90130-w s. solnař, m. dostál, k. petera, t. jirout acta polytechnica [19] r. engeskaug, e. thorbjørnsen, h. f. svendsen. wall heat transfer in stirred tank reactors. industrial & engineering chemistry research 44:4949 – 4958, 2005. doi:10.1021/ie049178a. [20] j. vejrazka, p. marty. an alternative technique for the interpretation of temperature measurements using thermochromic liquid crystals. heat transfer engineering 28(2):154–162, 2007. doi:10.1080/01457630601023641. [21] d. sundararajan. the discrete fourier transform: theory, algorithms and applications. world scientific publishing co. inc., 2001. [22] s. solnař, k. petera, m. dostál, t. jirout. heat transfer measurements with toirt method. epj web of conferences 143:02113, 2017. doi:10.1051/epjconf/201714302113. [23] p. vlček, b. kysela, t. jirout, i. fořt. large eddy simulation of a pitched blade impeller mixed vessel comparsion with lda measurement. chemical engineering research and design 108:42–48, 2016. doi:dx.doi.org/10.1016/j.cherd.2016.02.020. [24] p. stephan, h. martin, s. kabelac, et al. (eds.). vdi heat atlas. springer, 2010. [25] k. petera, m. dostál, m. věříšova, t. jirout. heat transfer at the bottom of a cylindrical vessel impinged by a swirling flow from an impeller in a draft tube. chemical and biochemical engineering quarterly 31:343 – 352, 2017. doi:10.15255/cabeq.2016.1057. 
Acta Polytechnica 60(1):81–87, 2020, doi:10.14311/ap.2020.60.0081
© Czech Technical University in Prague, 2020, available online at https://ojs.cvut.cz/ojs/index.php/ap

Eigenvalues evaluation of generally damped elastic disc brake model loaded with non-conservative friction forces

Juraj Úradníček^a,*, Miloš Musil^a, Michal Bachratý^b
a Slovak University of Technology, Faculty of Mechanical Engineering, Institute of Applied Mechanics and Mechatronics, Námestie slobody 17, 82131 Bratislava, Slovak Republic
b Slovak University of Technology, Faculty of Mechanical Engineering, Institute of Manufacturing Systems, Environmental Technology and Quality Management, Námestie slobody 17, 82131 Bratislava, Slovak Republic
* Corresponding author: juraj.uradnicek@stuba.sk

Abstract. This paper deals with the evaluation of eigenvalues of a linear damped elastic two-degrees-of-freedom system under a non-conservative loading. As a physical interpretation of the proposed mathematical model, a simplified disc brake model is considered. A spectral analysis is performed to predict an eigenvalue bifurcation, known as the Krein collision, leading to double eigenvalues, one of them having a positive real part causing a vibration instability of the mechanical system. This defective behaviour of the eigenvalues is studied with respect to the magnitude of the non-conservative Coulomb friction force, through the variation of the friction coefficient. The influence of proportional versus general damping on the system stability is further analysed. A generalized non-symmetric eigenvalue problem calculation is employed for the spectral analyses, while a modal decomposition is performed to obtain the time-domain response of the system. The analyses are compared with an experiment.

Keywords: eigenvalue bifurcation, Krein collision, non-conservative force, modal decomposition, brake squeal.

1. Introduction
Brake systems represent one of the most important safety and performance components of vehicles such as cars, trains, airplanes or industrial machines. Reliability, braking power and fluent operation are important properties of brake systems. However, during braking, the components of a disc brake tend to vibrate, and noise and harshness such as brake squeal can occur. It is generally accepted that brake squeal is caused by friction-induced vibration via a rotating disc [1]. In 1952, Ziegler [2], studying a double mathematical pendulum exposed to a non-conservative force, found that the stability limits of the system were reduced by the application of minor damping to the system. That damping reduces the stability of the system contradicts the classical theory of structural stability, in which damping plays a purely stabilizing role. Thus, unstable behaviour of mechanical systems can also be caused by a dissipation-induced instability, which was, in the case of brake systems, pointed out in the work of Hoffman and Gaul in 2003 [3].
in a braking system, the non-conservative friction force plays a major role. for this reason, the disc brake system is considered a system with a potential occurrence of the damping destabilization paradox. several studies focus on analyses of the behaviour of such systems using simple analytical minimal models [4, 5], complex numerical finite element (fe) models [1] and experiments on real physical systems [6]. analytical and fe studies are mainly based on the complex eigenvalue analysis [7] to predict the mode coupling. these analyses subsequently allow an optimization of brake systems to avoid a potential brake squeal occurrence. in the process of disc brake development, the complex eigenvalue analysis of a finite element model is usually used to perform the modal optimization of the system to reduce any potential system instability [8]. in this process, no damping or simple proportional damping is usually considered. according to the above-mentioned destabilization paradox, damping can possibly cause a reduction of the stability limits. the significance of this effect should be closely examined for disc brake applications. in this paper, the damping in the system and its influence on the system stability is studied on a simple two-degree-of-freedom (dof) analytical model of a disc brake. the observations are experimentally verified by measurements on an experimental pad-on-disc system.
2. physical problem description
in the following study, a simple two-degree-of-freedom elastic system is considered (figure 1). the system consists of spring/damping elements and a mass moving in the horizontal and vertical direction. the spring and damping elements in a skew direction couple both degrees of freedom so that the local bifurcation of complex conjugate eigenvalues (mode coupling) can occur [9].
figure 1. two-degrees-of-freedom mechanical model including non-conservative friction force in the system equilibrium position.
the mass element represents the mass of the friction material (the part of the braking pad which is in contact with the disc), while the spring/damping elements represent the stiffness and damping of the friction material in the particular directions. the moving belt acts as a generator of the non-conservative friction force, whose magnitude depends linearly on the state variables and the coefficient of friction, considering the coulomb friction law. to avoid the nonlinear behaviour of the friction force related to changing its sign, the following assumption is taken into account: the solution is evaluated around the equilibrium point. this corresponds to the situation when a pre-stress is applied to the mass in the negative x2 direction and the solution is then obtained around the resulting equilibrium point. under this assumption, the overall friction force can exhibit only positive values. in this paper, the evaluation of the eigenvalues with respect to the friction coefficient is studied to investigate the limits of stability of the generally damped simple minimal brake model.
3. mathematical problem description
the mechanical system in figure 1 is described by a set of second order differential equations in matrix form

$$\mathbf{M}\ddot{\mathbf{x}} + \mathbf{C}\dot{\mathbf{x}} + \mathbf{K}\mathbf{x} = \mathbf{F} \qquad (1)$$

$$\mathbf{M}=\begin{bmatrix} m & 0\\ 0 & m\end{bmatrix},\quad \mathbf{C}=\begin{bmatrix} c_2+\tfrac12 c_3 & \tfrac12 c_3\\ \tfrac12 c_3 & c_1+\tfrac12 c_3\end{bmatrix},\quad \mathbf{K}=\begin{bmatrix} k_2+\tfrac12 k_3 & \tfrac12 k_3\\ \tfrac12 k_3 & k_1+\tfrac12 k_3\end{bmatrix},\quad \mathbf{F}=\begin{bmatrix} F_t\\ 0\end{bmatrix},\quad \mathbf{x}=\begin{bmatrix} x_1\\ x_2\end{bmatrix},$$

where the coordinates x1, x2 represent the displacement of the mass around the equilibrium position, M, C, K are the coefficient matrices and F_t represents the dynamical part of the overall friction force F_tc = F_o + F_t. the constant friction force F_o = µF_p results from the pre-stress normal force F_p, and the inequality F_p > |F_t|/µ has to be fulfilled so that the slider cannot detach from the moving belt. if the coulomb friction model is considered and the slider in the model cannot detach from the moving belt, the friction force F_t depends linearly on the normal force acting between the slider and the moving belt:

$$F_t = \mu\,(k_1 x_2 + c_1\dot{x}_2) \qquad (2)$$

where µ represents the coulomb friction coefficient. substituting (2) into (1), the system is reorganized to the form

$$\begin{cases}\mathbf{M}\ddot{\mathbf{x}} + \mathbf{C}'(\mu)\dot{\mathbf{x}} + \mathbf{K}'(\mu)\mathbf{x} = \mathbf{0}\\ \mathbf{M}\dot{\mathbf{x}} - \mathbf{M}\dot{\mathbf{x}} = \mathbf{0}\end{cases} \qquad (3)$$

$$\mathbf{K}'=\begin{bmatrix} k_2+\tfrac12 k_3 & \tfrac12 k_3-\mu k_1\\ \tfrac12 k_3 & k_1+\tfrac12 k_3\end{bmatrix},\quad \mathbf{C}'=\begin{bmatrix} c_2+\tfrac12 c_3 & \tfrac12 c_3-\mu c_1\\ \tfrac12 c_3 & c_1+\tfrac12 c_3\end{bmatrix}.$$

adding the identical equation in (3), the system is transformed into the state-space representation with the state variable vector y,

$$\mathbf{A}(\mu)\dot{\mathbf{y}} + \mathbf{B}(\mu)\mathbf{y} = \mathbf{0} \qquad (4)$$

$$\mathbf{A}=\begin{bmatrix}\mathbf{C}' & \mathbf{M}\\ \mathbf{M} & \mathbf{0}\end{bmatrix},\quad \mathbf{B}=\begin{bmatrix}\mathbf{K}' & \mathbf{0}\\ \mathbf{0} & -\mathbf{M}\end{bmatrix},\quad \mathbf{y}=\begin{bmatrix}\mathbf{x}\\ \dot{\mathbf{x}}\end{bmatrix}.$$

it can be seen that the matrices A and B are non-symmetric and depend smoothly on the scalar parameter µ.
4. system eigenvalues evaluation
equation (4) can be rewritten in the form of a generalized non-symmetric eigenvalue problem,

$$(\mathbf{B} + \lambda_i\mathbf{A})\,\mathbf{v}_{Ri} = \mathbf{0} \qquad (5)$$

where λi is the i-th generalized eigenvalue and v_Ri is the corresponding i-th right complex eigenvector. the imaginary part of the eigenvalue is the damped natural frequency ω_d = ω_0 √(1 − ζ²), where ω_0 represents the undamped natural frequency, and the real part is −ζω_0. the damping ratio ζ is calculated from the geometrical relation cos θ = ζ, where θ is the angle to a pole measured from the negative real axis. in the undamped case, the local bifurcation of the complex conjugate eigenvalues occurs at the point µ = 0.5 (figure 2) and the damped natural frequencies couple (figure 3). at this point, the system loses its stability. in the case of the proportional damping, the system loses its stability slightly above the point µ = 0.5 and the damped natural frequencies cross each other but do not couple. in the case of the general non-proportional damping, the bifurcation point is unclear and the system loses its stability below µ = 0.5.
figure 2. 1st and 2nd modes' damping ratio evolution with the friction coefficient, considering the proportionally damped, non-proportionally damped and undamped system.
figure 3. damped natural frequencies' evolution of the 1st and 2nd modes with the friction coefficient, considering the proportionally damped, non-proportionally damped and undamped system.
figure 4. the transient response of the non-proportionally damped system to the initial conditions (ic) y(0) = [1, 0, 0, 0] for a) µ = 0 (λ1 = −15.6 ± 5000i; λ2 = −21.9 ± 7071i), b) µ = 0.47 (λ1 = −37.5 ± 5886i; λ2 = 0 ± 6352i), c) µ = 0.48 (λ1 = −44 ± 5943i; λ2 = 6.6 ± 6300i).
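in practice, the eigenvalue analysis of eq. (5) reduces to a numerical sweep over the friction coefficient. the following is a minimal python/scipy sketch of such a sweep; the values of m, c_i, k_i are invented placeholders (the text does not list the parameters behind figures 2–4), so only the qualitative behaviour, not the printed threshold, is meaningful.

```python
import numpy as np
from scipy.linalg import eig

# placeholder parameters, assumptions for illustration only
m = 1.0
k1, k2, k3 = 4.0e7, 2.0e7, 2.0e7
c1, c2, c3 = 40.0, 20.0, 10.0

def state_matrices(mu):
    """Build A(mu) and B(mu) of Eq. (4) from M, C'(mu), K'(mu) of Eq. (3)."""
    M = m * np.eye(2)
    C = np.array([[c2 + c3/2, c3/2 - mu*c1],
                  [c3/2,      c1 + c3/2   ]])
    K = np.array([[k2 + k3/2, k3/2 - mu*k1],
                  [k3/2,      k1 + k3/2   ]])
    A = np.block([[C, M], [M, np.zeros((2, 2))]])
    B = np.block([[K, np.zeros((2, 2))], [np.zeros((2, 2)), -M]])
    return A, B

for mu in (0.0, 0.3, 0.5, 0.6):
    A, B = state_matrices(mu)
    lam = eig(-B, A, right=False)   # (B + lam*A) v = 0  <=>  -B v = lam A v
    print(f"mu = {mu:4.2f}: max Re(lambda) = {lam.real.max():10.3e}"
          f" -> {'unstable' if lam.real.max() > 0 else 'stable'}")
```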
5. modal decomposition
the homogeneous response of a linear state-determined system can be expressed in terms of the state-transition matrix Φ(t) as follows

$$\mathbf{y}_h(t) = \mathbf{\Phi}(t)\,\mathbf{y}(0), \qquad (6)$$

where the state-transition matrix Φ(t) may be expressed in terms of the eigenvalues/eigenvectors as

$$\mathbf{\Phi}(t) = \mathbf{V}_R\, e^{\mathbf{\Lambda}t}\, \mathbf{V}_R^{-1}, \qquad (7)$$

defined by the modal matrix consisting of the right eigenvectors in columns, where n is the number of columns of the modal matrix,

$$\mathbf{V}_R = \begin{bmatrix}\mathbf{v}_{R1} & \mathbf{v}_{R2} & \cdots & \mathbf{v}_{Rn}\end{bmatrix} \qquad (8)$$

and the square matrix e^{Λt} has the terms e^{λ_i t} on the leading diagonal. defining

$$\mathbf{R}_R = \mathbf{V}_R^{-1} = \begin{bmatrix}\mathbf{r}_{R1}\\ \mathbf{r}_{R2}\\ \vdots\\ \mathbf{r}_{Rn}\end{bmatrix}, \qquad (9)$$

the homogeneous response of the system can be expressed as a weighted superposition of the system modes e^{λ_i t} v_Ri, where the initial condition y(0) affects the strength of the excitation of each mode,

$$\mathbf{y}_h(t) = (\mathbf{V}_R e^{\mathbf{\Lambda}t})(\mathbf{R}_R\mathbf{y}(0)) = \sum_{i=1}^{n} e^{\lambda_i t}\,\mathbf{v}_{Ri}\,(\mathbf{r}_{Ri}\mathbf{y}(0)) = \sum_{i=1}^{n} c_i\, e^{\lambda_i t}\,\mathbf{v}_{Ri}, \qquad (10)$$

where the scalar coefficient c_i = r_Ri y(0). the y_h1 coordinate in figure 4 represents the homogeneous response of the mass displacement x1 in figure 1. if friction is absent from the system, both modes are attenuated over time (figure 4a) with the rate given by their damping ratios ζi derived from the corresponding eigenvalues λi. the damped eigenfrequencies ω_d1 = 5000 rad/s and ω_d2 = 7071 rad/s can be read from the single side amplitude spectrum (ssas) and must correspond to the given eigenvalues. if the friction coefficient reaches the stability threshold, the negative real part of the corresponding eigenvalue changes its sign. the mode with a negative real part of λi is attenuated over time, while the mode with a positive real part becomes unstable and is increasing or maintaining (if the real part is 0) its amplitude (figure 4b). it can be seen that the mode with the lower frequency is attenuated while the second one with the higher frequency becomes unstable. the damped eigenfrequencies ω_d1,2 become closer to each other with an increasing coefficient µ. the last plot (figure 4c) shows the response for the friction coefficient set above the stability threshold value. from the eigenvalues λi, it can be seen that the stable mode is attenuated faster (compared to case b) due to the lower real part value. oppositely, the unstable mode increases its amplitude faster over time due to the higher value of the positive real part. the ssas shows that the unstable mode fully dominates the stable one over the analysed time interval.
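a minimal numerical sketch of the modal superposition (6)–(10), reusing the state_matrices helper (and hence the assumed placeholder parameters) from the previous sketch:

```python
import numpy as np
from scipy.linalg import eig, inv

def homogeneous_response(A, B, y0, t):
    """Transient response of A y' + B y = 0 via Eqs. (6)-(10):
    y(t) = V_R exp(Lambda t) V_R^{-1} y(0)."""
    lam, VR = eig(-B, A)          # eigenvalues and right eigenvectors of (5)
    c = inv(VR) @ y0              # modal weights c_i = r_Ri y(0), Eq. (10)
    # columns c_i v_Ri multiplied by e^{lam_i t} and summed over the modes
    return (VR * c) @ np.exp(np.outer(lam, t))

A, B = state_matrices(0.48)       # friction coefficient above the threshold
t = np.linspace(0.0, 0.05, 5000)
y = homogeneous_response(A, B, np.array([1.0, 0, 0, 0], dtype=complex), t)
x1 = y[0].real                    # the y_h1 coordinate plotted in figure 4
```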
6. experimental results and discussion
to investigate the existence of the described effects in a real application, a simple experimental device has been set up. the device consists of a simplified pad-on-disc system where a friction material is pressed onto a rotating disc. the elastic properties of the pad are achieved through a thin plate attached between the friction material and a piston of a linear motor which generates a normal force. the dynamical instability and self-excited vibrations of the system lead to a brake squeal. these effects can be measured both using accelerometer sensors attached to the flexible structure and a microphone, which directly measures the sound pressure as a side effect of the extensive vibration of the system.
figure 5. experimental pad-on-disc set up.
the time evolution of the sound pressure (figure 6) carries information about the intensity of the vibration and also about its spectral content. the spectral content is observed through a single side amplitude spectrum obtained by the fourier transformation of the sound pressure time signal (figure 7). in a real physical environment, more physical aspects are involved in the process of dynamic destabilization. it can be demonstrated that nonlinear effects in the friction contact, such as stick-slip [10] or contact surface detaching [11], lead to limit cycles of vibrations. these effects bring perturbations to the system. if the system is perturbed, both stable and unstable eigenmodes should be excited over time. this behaviour can be seen in the experimentally observed data from the real physical system. the experimental data have been studied over a time interval of 0.06 s. the variation of the vibration amplitude can be seen in figure 6. the variation of the amplitude shows that one or more modes contained in the response are changing their vibration propensity. this can be due to the combination of nonlinear effects, such as the growth of the amplitudes of unstable modes into limit cycles along with the attenuation of the repeatedly excited stable modes. strong nonlinear effects can be concluded from the poly-harmonic response of the signal in figure 7, consisting of the fundamental frequency of 5700 hz and its higher harmonics.
figure 6. sound pressure evolution over the time interval of 0.06 s.
figure 7. single side amplitude spectrum of the sound pressure over the studied time interval of the length 0.06 s.
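the single side amplitude spectrum used in figure 7 can be computed along the following lines; the sampling rate and the synthetic squeal-like test signal are assumptions for illustration only:

```python
import numpy as np

def single_side_amplitude_spectrum(p, fs):
    """Single side amplitude spectrum (SSAS) of a real signal p sampled at fs [Hz]."""
    n = len(p)
    P = np.fft.rfft(p) / n          # normalized one-sided FFT
    ssas = 2.0 * np.abs(P)          # fold the negative frequencies
    ssas[0] /= 2.0                  # DC bin is not doubled
    if n % 2 == 0:
        ssas[-1] /= 2.0             # Nyquist bin is not doubled either
    return np.fft.rfftfreq(n, d=1.0/fs), ssas

fs = 96_000                          # assumed sampling rate
t = np.arange(0.0, 0.06, 1.0/fs)     # the 0.06 s window studied in the text
p = np.sin(2*np.pi*5700*t) + 0.3*np.sin(2*np.pi*2*5700*t)  # 5700 Hz + harmonic
f, a = single_side_amplitude_spectrum(p, fs)
print(f"dominant component: {f[np.argmax(a)]:.0f} Hz")
```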
this discussion leads to the question how much important role a general damping plays in a brake squeal. the significance of these effects in brake systems will be further investigated in a next research. acknowledgements the research in this paper was supported by the grant agency vega 1/0227/19 of ministry of education, science and sport of the slovak republic and the author also appreciates the financial support provided by slovak research and development agency for the project apvv-0857-12. references [1] r. a. abu-bakar, h. ouyang. recent studies of car disc brake squeal. in new research on acoustics, chap. 4. 2008 nova science publishers, inc., 2008. [2] h. ziegler. die stabilitätskriterien der elastomechanik. ing-arch springer verlag 20:49–56, 1952. [3] n. hoffman, l. gaul. effects of damping on mode-coupling instability in friction induced oscillations. zamm journal of applied mathematics and mechanics 83(8):524–534, 2003. doi:10.1002/zamm.200310022. [4] n. hoffman, l. fischer, et al. a minimal model for studying properties of the mode-coupling type instability in friction induced oscillations. mechanics research communications 29(4):197–205, 2002. doi:10.1016/s0093-6413(02)00254-9. [5] u. wagner, d. hochlenert, p. hagedorn. minimal models for disk brake squeal. journal of sound and vibration 302(3):527–539, 2007. doi:10.1016/j.jsv.2006.11.023. [6] u. wagner, t. jearsiripongkul, et al. brake squeal: modelling and experiments. vdi-bericht 1749:96–105, 2003. [7] p. liu, h. zheng, et al. analysis of disc brake squeal using the complex eigenvalue method. applied acoustics 68(6):603–615, 2007. doi:10.1016/j.apacoust.2006.03.012. [8] j. úradníček, m. musil, p. kraus. predicting the self-excited vibrations in automotive brake systems. noise and vibration in practice : peer-reviewed scientific proceedings 23(1):101–106, 2018. [9] o. n. kirillov. nonconservative stability problems of modern physics. first printing. de gruyter, berlin, 2013. doi:10.1515/9783110270433. [10] j. úradníček, m. musil, p. kraus. investigation of frictional stick-slick effect in disk brake nvh. journal of mechanical engineering 67(1):93–100, 2017. doi:10.1515/scjme-2017-0010. [11] j. brunetti, f. massi, w. d‘ambrogio, et al. a new instability index for unstable mode selection in squeal prediction by complex eigenvalue analysis. journal of sound and vibration 377(1):106–122, 2016. doi:10.1016/j.jsv.2016.05.002. [12] p. peciar, o. macho, e. eckert, et al. modification of powder material by compaction processing. acta polytechnica 57 (2):116–124, 2017. issn 1805-2363, doi:10.14311/ap.2017.57.0116. 87 http://dx.doi.org/10.1002/zamm.200310022 http://dx.doi.org/10.1016/s0093-6413(02)00254-9 http://dx.doi.org/10.1016/j.jsv.2006.11.023 http://dx.doi.org/10.1016/j.apacoust.2006.03.012 http://dx.doi.org/10.1515/9783110270433 http://dx.doi.org/10.1515/scjme-2017-0010 http://dx.doi.org/10.1016/j.jsv.2016.05.002 http://dx.doi.org/10.14311/ap.2017.57.0116 acta polytechnica 60(1):81–87, 2020 1 introduction 2 physical problem description 3 mathematical problem description 4 system eigenvalues evaluation 5 modal decomposition 6 experimental results and discussion 7 conclusions acknowledgements references ap04-bittnar1.vp 1 introduction in solid mechanics, unlike fluid mechanics, it is still not widely recognized that knowledge of the size effect, or scaling, is the means to obtain analytical predictions of quasibrittle failures in general, even if the size effect need not be calculated. 
for the actual size of interest, a direct analytical solution is hard, next to impossible. however, by scaling the structure down to vanishing size, or up to infinite size, one gets a ductile, or brittle, response, either of which is much easier to solve. knowing these asymptotic solutions, an approximate failure prediction for the middle range of practical interest can then be obtained by asymptotic matching – "interpolation" between the opposite infinities. it is for this reason that the size effect is the key problem for all quasibrittle failures. the purpose of this paper is to present a brief summary of the advances in six problems of size effect recently studied at northwestern university.
2 variation of cohesive softening law in boundary layer
by now it has been well established that the total fracture energy g_F of a heterogeneous material such as concrete, defined as the area under the cohesive softening curve, is not constant but varies during crack propagation across the ligament. the variation of g_F at the beginning of fracture growth, which is described by the r-curve, is only an apparent phenomenon which is perfectly consistent with the cohesive crack model (with a fixed softening law) and can be calculated from it. however, the variation during propagation through the boundary layer at the end of the ligament is not consistent with the cohesive crack model and implies that the softening curve of this model is not an invariant property. the fact that the fracture energy representing the area under the softening curve should decrease to zero at the end of the ligament was pointed out in a paper by bažant (1996), motivated by the experiments of hu and wittmann (1991 and 1992a), and was explained by a decrease of the fracture process zone (fpz) size, as illustrated on the left of fig. 1a, reproduced from the 1996 paper of zdeněk bažant. an experimental verification and detailed justification of this property was provided in the works of hu (1997, 1998), hu and wittmann (1992b, 2000), duan, hu and wittmann (2002, 2003), and karihaloo, abdalla, and imjai (2003). as mentioned by bažant (1996) as well as hu and wittmann, the consequence of these experimental observations is that

$$\bar G_F = \frac{1}{D-a_0}\int_0^\infty P\,\mathrm{d}u = \frac{1}{D-a_0}\int_{a_0}^{D}\gamma(x)\,\mathrm{d}x, \qquad (1)$$

where Ḡ_F = average fracture energy in the ligament (fig. 1b), D = specimen size (fig. 1a), a_0 = notch depth, P = load, u = load-point deflection, γ(x) = local fracture energy as a function of the coordinate x along the ligament (fig. 1a left), G_F = ∫₀^∞ σ dw = value of γ(x) at points x remote from the boundary (= area under the complete σ(w) diagram, fig. 1c), σ = cohesive (crack-bridging) stress, and w = crack opening = separation of the crack faces. is this behavior compatible with the cohesive crack model? to check it, consider a decreasing fpz attached to the boundary at the end of the ligament, fig. 1a. extending to this situation rice's (1968) approach, which effectively launched the use of the cohesive (or fictitious) crack model, we calculate the j-integral along a path touching the crack faces as shown in fig. 2a,b:

$$J=\oint_\Gamma\left(U\,\mathrm{d}y - t_i\,\frac{\partial u_i}{\partial x}\,\mathrm{d}\Gamma\right) = 2\int_{x_{\mathrm{tip}}}^{D}\sigma(x)\,\frac{\partial v(x)}{\partial x}\,\mathrm{d}x = \int_{0}^{w_{\mathrm{end}}}\sigma(w)\,\mathrm{d}w \qquad (2)$$
in which Γ = path length; x ≡ x1, y ≡ x2 are the cartesian coordinates; u_i = displacements; t_i = tractions acting on the path from the outside; U = strain energy density; v = w/2; x = D is the end point of the ligament; and w_end is the opening at the end of the ligament (fig. 2b). from these equations, we see that the instantaneous flux of energy, J, into the shrinking fpz attached to the end of the ligament (fig. 1a) represents the area below the line σ = σ_end in the softening diagram, cross-hatched in fig. 2i. it is, however, a matter of choice with which coordinate x in the fpz this flux J should be associated. if we associate J with the front, the tail, or the middle of the fpz (fig. 2b), we get widely different plots of γ = j_end, j_tail, j_middle, as shown in fig. 2c, d, e, respectively (the first one terminating with a dirac delta function). this ambiguity means that the boundary layer effect experimentally documented by hu and wittmann cannot be represented by the standard cohesive crack model with a fixed stress-separation diagram. can the cohesive crack model be adapted for this purpose? it follows from eq. (2) and fig. 2 (h, i) (and has been computationally verified) that wittmann et al.'s (1990) and elices et al.'s (1992) data can be matched (fig. 2j) if the slope of the tail segment of the bilinear stress-separation diagram for concrete is assumed to decrease (fig. 1c) in proportion to the diminishing distance r = D − x (fig. 2b) from the end of the ligament. after such an adaptation, the cohesive (or fictitious) crack model has a general applicability, including the boundary layer.
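the apparent size effect on the average fracture energy expressed by eq. (1) is easy to reproduce numerically. the sketch below assumes, purely for illustration, a local fracture energy γ(x) that drops linearly to zero over a boundary layer of fixed width at the back face; the values of g_F and the layer width are invented numbers:

```python
import numpy as np

def mean_fracture_energy(D, a0, GF=100.0, Lb=0.05):
    """Average of gamma(x) over the ligament, Eq. (1); GF in N/m, sizes in m.
    gamma(x) = GF away from the boundary, falling linearly to 0 at x = D."""
    x = np.linspace(a0, D, 2001)
    gamma = GF * np.clip((D - x) / Lb, 0.0, 1.0)
    # trapezoidal integration of gamma over the ligament
    integral = np.sum(0.5 * (gamma[1:] + gamma[:-1]) * np.diff(x))
    return integral / (D - a0)

# geometrically scaled specimens with a0/D = 0.3: the boundary layer occupies
# a decreasing fraction of the ligament, so the average tends to GF from below
for D in (0.1, 0.3, 1.0, 3.0):
    print(f"D = {D:4.1f} m: GF_bar = {mean_fracture_energy(D, 0.3*D):6.1f} N/m")
```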
fig. 1: (a) variation of local fracture energy γ(x) across the ligament, decreasing in the boundary layer (reproduced from bažant 1996); (b) average fracture energy Ḡ_F; (c) required modification of cohesive (or fictitious) crack model
fig. 2: (a, b) j-integral path; (c, d, e) ambiguity in j-integral variation; (f, g, h, i) fracture energies corresponding to j-integral; (j) test data by wittmann et al. (1990) and elices et al. (1992) fitted by modified cohesive crack model with variable tail
however, the consequence is that the total fracture energy g_F (area under the complete stress-separation curve) is not constant. noting that the larger the structure, the smaller is the length fraction of the boundary layer, one must conclude that the diminishing tail slope in fig. 1c automatically implies a certain size effect on the apparent g_F, as given by eq. (1). it further follows from fig. 2 (h, i), and has been computationally verified, that the initial tangent of the stress-separation diagram, the area under which represents the initial fracture energy g_f (bažant 2002a, b, bažant, yu and zi 2002), can be considered as fixed; in other words, g_f, unlike g_F, is a material constant. aside from the fact that the maximum loads of specimens and structures are generally controlled by g_f, not g_F, this suggests that the standard fracture test that should be introduced is one which yields not g_F but g_f (the size effect method, as well as the method of guinea, planas and elices, 1994a, b, serve this purpose, while the work-of-fracture method does not). this conclusion is not surprising in the light of abundant experimental data revealing that g_F is statistically much more variable than g_f (bažant and becq-giraudon 2002, bažant, yu and zi 2002). jirásek (2003) showed that hu and wittmann's data can be matched by a nonlocal continuum damage model in which the characteristic softening curve is kept fixed. consequently, the nonlocal model is a more general, and thus more fundamental, characterization of fracture than the cohesive (or fictitious) crack model. this finding should be taken into account in fracture testing. it appears that g_F would better be defined by the area under the softening curve of the nonlocal model, multiplied by the characteristic length of the material corresponding to the effective width of the fpz, which equals the minimum possible spacing of parallel cracks (bažant 1985, bažant and jirásek 2002), to be distinguished from l_0 = E g_f / f_t'^2.
3 universal size effect law
as is now well known, the size effect for crack initiation from a smooth surface (a_0 = 0) is very different from the size effect for large notches or large stress-free (fatigued) cracks at maximum load (a_0/D not too small). as far as the mean nominal strength of the structure, σ_N, is concerned, the latter is always energetic (i.e., purely deterministic), while the former is purely energetic only for small enough sizes and becomes statistical for large enough sizes. it is of interest to find a universal size effect law that includes both of these size effects and spans the transition between them. a formula for this purpose was proposed in bažant and li (1996) (also bažant and chen 1997, bažant 2002b). however, that formula was not smooth and did not include the statistical (weibull) part for crack initiation failures.
fig. 3: (a, b) improved universal size effect law (usel), whose formulas for σ_N are given in the figure (left: without, right: with weibull statistics), and (c, d, e) profiles obtained from usel, compared to duan and hu's (2003) approximation
it is of interest to find a universal size effect law that includes both of these size effects and spans the transition between them. a formula for this purpose was proposed in bažant and li (1996) (also bažant and chen 1997, bažant 2002b). however, that formula was not smooth and did not include the statistical (weibull) part for crack initiation failures. © czech technical university publishing house http://ctn.cvut.cz/ap/ 9 acta polytechnica vol. 44 no. 5 – 6/2004 lo g � n log d 0 0.4 0.85 -5 0 0 -3 -6 lo g n log d 0 0.4 0.85 -5 0 0 -3 -6 lo g n /f t' log d � 0 0.4 0.85 -5 0 0 -3 -6 lo g � n /f t' log d � 0 0.4 0.85 -5 0 0 -3 -6 a) b) log d lo g n 0 -0.8 0-1 -1 -2 � = 0.01 � = 0.06 � = 0.3 usel duan & hu log d lo g � � n 0 -1 -2 0-2 -2 0 0.02 0.06 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 universal size effect law (usel) � c) log � lo g n 0.1 1 1 0.2 usel duan & hu �d/a 0.9 1.8 3.0 e)d) fig. 3: (a, b) improved universal size effect law usel (left–without, right –with weibull statistics), and (c, d, e) profiles obtained from usel, compared to duan and hu’s (2003) approximation � � n f f f ke g g c g d r c g e l d g � � � � � � � � � � �� � � 0 0 1 2 2 0 0 1 4 2 ( )( 0 0 1 0 0 1 2 d g c e g g c g d l lf r n f f s � � � �� � � �� � � � � � � � � �) � s a d r n m f k de r c g e l d g d g� � � � � � � �� � � �� � ( ) ( )(0 2 22 0 0 04 � 0 cf ) � � � � � � � � � a better formula has now been obtained; see figs. 3, the left (fig. 3a) without, and the right (fig. 3b) with, the statistical (weibull) part, in which � �e effective young modulus; � �ft local tensile strength of material; g0, �g 0, ��g 0 are values of dimensionless energy release function g(�) and its derivatives at �0 0� a d ; l e g ff t0 2� � � � irwin’s characteristic length (corresponding to the initial fracture energy); g(�) � k2(�), k(�) is the dimensionless stress intensity factor; m � weibull modulus of concrete (about 24), n � number of dimensions for scaling; r, k � empirical positive constants; and cf � constant (the ratio c lf 0 depends on the softening curve shape, and c lf � 0 2 for triangular softening). the formulas in figs. 3a, b were derived by asymptotic matching of 6 cases: the small-size and large-size asymptotic behaviors (first two terms of expansion for each), of the large-notch and vanishing-notch behaviors, and of the energetic and statistical parts of size effect. 4 can fracture energy be measured on one-size specimens with different notch lenghts? the fact that specimens of different sizes are needed for the size effect method of measuring gf is considered by some as a disadvantage. for this reason, bažant and kazemi (1990), bažant and li (1996) and tang et al. (1996) generalized the size effect method to dissimilar specimens, the dissimilarity being caused by the use of different notch lengths a0 in specimens of one size. if the random scatter of test data were small (coefficient of variation cov < 4 %), this approach would work. however, for the typical scatter of maximum loads of concrete specimens (cov � 8%), the range of brittleness numbers attainable by variation of notch length in a specimen of any geometry (about 1:3, bažant and li 1996) does not suffice to get a sharp trend in size effect regression of test data, and thus prevents determining gf accurately. recently, this problem was considered independently by duan and hu (2003). 
they proposed the semi-empirical formula:

$$\bar\sigma_n = \sigma_0\left(1+\frac{a}{a_\infty^*}\right)^{-1/2}, \qquad (6)$$

where σ_0 = f_t' for small 3pb specimens; a*_∞ is a certain constant; and σ̄_n represents the maximum tensile stress in the ligament based on a linear stress distribution over the ligament, σ̄_n = σ_n / A(α_0). this alternative formula, intended for specimens of the same size when the notch length a_0 is varied, in effect attempts to replace the profile of the universal size effect law (fig. 3) at constant size D, scaled by the ratio σ̄_n/σ_n = 1/A(α_0), where A(α_0) depends on the specimen geometry; A(α_0) = (1 − α_0)² for notched three-point bend beams. however, the curve of the proposed formula has, for a_0 → 0, a size-independent limit approached, in the log a_0 scale, with a horizontal asymptote, while the correct curve, amply justified by tests of the modulus of rupture (flexural strength) of unnotched beams (bažant and li 1995, bažant 1998, 2001, 2002b, bažant and novák 2000a, b, c), terminates with a steep slope for a_0 → 0 and has a size-dependent limit, as seen in the aforementioned profiles in fig. 3a, b, and better in fig. 3c. fig. 3d, e shows the profiles in D and in α, and it is seen that they cannot be matched well by duan and hu's approximation converted from σ̄_n to σ_n. the test data on the dependence of σ_n on a_0, which duan and hu fitted by their formula, should better be fitted with the size effect law (bažant and kazemi 1990, bažant and planas 1998, and bažant 1997, 2002b):

$$\bar\sigma_n = \frac{\sigma_N}{A(\alpha_0)}, \qquad \sigma_N = \sqrt{\frac{E\,G_f}{g(\alpha_0)\,D + g'(\alpha_0)\,c_f}}, \qquad (7)$$

in which D is constant and α_0 = a_0/D is varied (and the function g(α_0) is available from handbooks). however, very short or zero notches (a_0 < 0.15 D) must be excluded, which means that the value of the strength f_t' cannot be used with duan and hu's approach. to use it, it is necessary to adopt either the approach of guinea et al. (1994a, b) or the zero-brittleness method (bažant, yu and zi 2002).
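a sketch of the one-size regression discussed above: peak stresses are generated from eq. (7) for several notch depths, perturbed by 8 % random scatter, and fitted back for g_f and c_f. the g(α) used here is a crude placeholder, not a handbook expression, and all numerical values are assumptions; with realistic scatter the fitted g_f is typically poorly constrained, which is the point made in the text.

```python
import numpy as np
from scipy.optimize import curve_fit

def g(alpha):                        # placeholder for the handbook g(alpha)
    return 16.0 * alpha**2 / (1.0 - alpha)**2

def dg(alpha, h=1e-6):               # numerical derivative g'(alpha)
    return (g(alpha + h) - g(alpha - h)) / (2.0*h)

D, E = 0.1, 30e9                     # assumed specimen size [m] and modulus [Pa]

def sigma_N(alpha0, Gf, cf):         # Eq. (7)
    return np.sqrt(E*Gf / (g(alpha0)*D + dg(alpha0)*cf))

rng = np.random.default_rng(1)
alpha0 = np.array([0.15, 0.25, 0.35, 0.45, 0.55])
sig = sigma_N(alpha0, 40.0, 0.02) * (1.0 + 0.08*rng.standard_normal(5))
(Gf_fit, cf_fit), _ = curve_fit(sigma_N, alpha0, sig, p0=(50.0, 0.01),
                                bounds=([1.0, 1e-4], [1000.0, 0.5]))
print(f"fitted G_f = {Gf_fit:.1f} N/m, c_f = {1e3*cf_fit:.1f} mm")
```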
the circular boundary was loaded by normal and tangential surface tractions equal to stresses �rr and �r� taken from williams’ symmetric (mode i) solution; p � load parameter representing the resultant of these tractions; and � n p bd� � nominal stress (b � 1). first, ligament d was considered to be so large that the length of the fpz, lc, was less than 0.01 d. in that case, the angular distribution of stresses along each circle with r d� 01. ought to match williams’ functions frr, fr� , f�� . indeed, the numerical results could not be visually distinguished from these functions. the logarithmic plots of the calculated stress versus r for any fixed � (and any ) were straight lines, and 10 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 44 no. 5 – 6/2004 their slopes agreed with the exponent � � 1 required by williams solution, see fig. 4 (right). thus the correctness of the cohesive finite element simulation was confirmed. from eqs. (b), (f) and (g) of williams’ lefm solution in fig. 4, � � � �� �� n r d f � �( ) ( ) ( , ) 1 0 . (8) according to the equivalent lefm approximation of cohesive fracture, ��� for r cf� (the middle of fpz), should be approximately equal to material tensile strength �ft . this condition yields, � � n t rf d h� � �( ) ( ) 1 (9) h c f r f� � � � � � � �� ( , ) ( ) 0 (10) valid for d l� 250 0; �( ) is the � value for angle . to check this equation, geometrically similar scaled circular bodies of different ligament dimensions d (fig. 4) were analyzed by finite elements for various angles using the same linear softening stress-separation diagram of cohesive crack. the © czech technical university publishing house http://ctn.cvut.cz/ap/ 11 acta polytechnica vol. 44 no. 5 – 6/2004 fig. 4: (a) angle-notched circular specimen considered for analysis, (b) numerically computed singularity exponent compared to williams' elastic solution, (c) computed variation of nominal strength with notch angle, (d) analytical size effect law for angular notches compared to cohesive crack model � � (a) � � � � � � � rr rrar f ar f f � � � � �� � � 1 1 1 ( , ) ( ) ( , ) ( , ) � � (b) � � � � � � �� � 00 1 1 1 � � � � � ar f ar f ( , ) ( ) ( , ) � � (c) � � � � � � � r rar f ar f 0 1 1 � � � � � � ( , ) ( , ) (d) mode i: f ( , ) cos( ) ( ) sin( )( ( � � � � � � � � � � � � � � � 1 1 1 1) sin( )( cos( ) � � � � � � � � 1 1 (e) eq. for � � � � � � �: sin ( sin (2 2� � � � (f) a d p d n n� � �� � �1 ( ) � �( ) ( ) ( , ) sin ( , ) cos ( g f frr r � � � � �� � � � � � � d0 numerically obtained values of log �n for various fixed d cf are plotted in fig. 4c as a function of angle . we see that this size effect curve matches perfectly the curve of eq. (9) and (10) for d cf � 500, confirming that the equivalent lefm approximation obtained for r cf� is good enough. a general approximate formula for the size effect of notches of any angle, applicable to any size d, may be written, as proposed by bažant (2003), as follows: � � � n h h d d � � � � � � � � 0 0 0 1 1 ( ) (11) where h h0 � value � 0, and d0 is given in terms of g(�), and is the same as for a crack ( � 0). eq. (11), which is of course valid only for large enough notches penetrating through the boundary layer of concrete, has been derived by asymptotic matching of the following asymptotic conditions: 1) for 0, the classical size effect law � �n d d� � � 0 0 1 21( ) must be recovered; 2) for d l0 0� , there must be no size effect; 3) for d l0 � �, eqs. 
(11) and (9) must coincide; 4) for �� 2 (flat surface), the formula must give no size effect when d � �. in reality, there is of course a size effect in the last case, but it requires a further generalization of eq. (11) (which will be presented separately). therefore, eqs. (11) and (9) can be applied only when the notch is deeper than the boundary layer, which is at least one aggregate size. complete generality will require amalgamating eq. (11) with the universal size effect law in fig. 3. the plot of log � n versus log d for �� 3 according to eq. (11) is compared to the finite element results for notched circular bodies with cohesive cracks in fig. 4d. the agreement is seen to be excellent. 6 new derivation of size effect law from asymptotic dimensional analysis from buckingham’s (1914) –theorem of dimensional analysis, two size effects can be easily proven: (1) if the failure depends only on material strength �f t (dimension n/m 2), then there is no size effect, i.e., the nominal strength of geometrically similar specimens does not depend on their size, and (2) if the failure depends only on material facture energy gf (dimension n/m), then of course the size effect is � n d� �1 2 (e.g. bažant 1993). nothing more can be deduced. knowing that irwin’s characteristic length l e g ff t0 2� � has the physical meaning of fracture process zone length, one can deduce more. when d l0 0� , the body is much smaller than the fracture process zone, and so gf cannot matter. it follows that case (1) corresponds to the small-size asymptotic limit, i.e., a horizontal asymptote in the plot of log �n versus log d. when d l0 � �, the fracture process zone is a point compared d, and so there is a stress singularity, which means that the local material strength cannot matter. it follows that case (2) corresponds to the large-size asymptotic limit, i.e. an inclined asymptote of slope �1/2 in the same plot. hence, for the intermediate sizes, the size effect must be a gradual transition from the horizontal to the inclined asymptote. to deduce the form of this transition, one needs to further take into account the known asymptotic properties of the cohesive crack model (bažant 2001, 2002b). they may be satisfied as follows (bažant 2003). from the governing parameters of the failure problem, �n, d, �f t , gf and e, we may form, according to buckingham’s �-theorem, two and only two independent dimensionless parameters, which we choose as �1 2 � � � n fe g , �2 2 2 � � � n tf , (12) (note that, for size effect, the ratios of the structural dimensions characterizing the geometry may be ignored because they remain constant when the structure is scaled up or down). the equation governing failure may, therefore, be assumed in the form: f (�1, �2) � 0. if function f is assumed to be smooth, we can approximate it by taylor series truncated after the linear terms: � � �f f f f� � �* 1 1 2 2 , (13) where: f fi i� � �� (i � 1, 2), �1 2� �� n fe g and �2 2 2� �� n tf (bažant 2003). substituting these values into eq. (13) and solving for �n, one gets an equation of the form of bažant’s classical size effect law (1984): � �n d d� � � 0 0 1 21( ) . (14) this function satisfies the first two asymptotic terms of both the large-size and the small-size asymptotic expansions of the cohesive crack model (bažant 2001, 2002). if we chose for �1 and �2 any other dimensionless monomial, these asymptotic conditions could not be satisfied. this proves that (14) is the proper asymptotic matching formula. 
6 new derivation of size effect law from asymptotic dimensional analysis
from buckingham's (1914) π-theorem of dimensional analysis, two size effects can be easily proven: (1) if the failure depends only on the material strength f_t' (dimension n/m²), then there is no size effect, i.e., the nominal strength of geometrically similar specimens does not depend on their size; and (2) if the failure depends only on the material fracture energy G_f (dimension n/m), then of course the size effect is σ_N ∝ D^{−1/2} (e.g. bažant 1993). nothing more can be deduced. knowing that irwin's characteristic length l_0 = E G_f / f_t'^2 has the physical meaning of the fracture process zone length, one can deduce more. when D/l_0 → 0, the body is much smaller than the fracture process zone, and so G_f cannot matter. it follows that case (1) corresponds to the small-size asymptotic limit, i.e., a horizontal asymptote in the plot of log σ_N versus log D. when D/l_0 → ∞, the fracture process zone is a point compared to D, and so there is a stress singularity, which means that the local material strength cannot matter. it follows that case (2) corresponds to the large-size asymptotic limit, i.e. an inclined asymptote of slope −1/2 in the same plot. hence, for the intermediate sizes, the size effect must be a gradual transition from the horizontal to the inclined asymptote. to deduce the form of this transition, one needs to further take into account the known asymptotic properties of the cohesive crack model (bažant 2001, 2002b). they may be satisfied as follows (bažant 2003). from the governing parameters of the failure problem, σ_N, D, f_t', G_f and E, we may form, according to buckingham's π-theorem, two and only two independent dimensionless parameters, which we choose as

$$\Pi_1 = \frac{\sigma_N^2\,D}{E\,G_f}, \qquad \Pi_2 = \frac{\sigma_N^2}{f_t'^2}, \qquad (12)$$

(note that, for the size effect, the ratios of the structural dimensions characterizing the geometry may be ignored because they remain constant when the structure is scaled up or down). the equation governing failure may, therefore, be assumed in the form f(Π_1, Π_2) = 0. if the function f is assumed to be smooth, we can approximate it by a taylor series truncated after the linear terms:

$$f \approx f^* + f_1\Pi_1 + f_2\Pi_2 = 0, \qquad (13)$$

where f_i = ∂f/∂Π_i (i = 1, 2) (bažant 2003). substituting the expressions (12) into eq. (13) and solving for σ_N, one gets an equation of the form of bažant's classical size effect law (1984):

$$\sigma_N = \sigma_0\left(1+\frac{D}{D_0}\right)^{-1/2}. \qquad (14)$$

this function satisfies the first two asymptotic terms of both the large-size and the small-size asymptotic expansions of the cohesive crack model (bažant 2001, 2002b). if we chose for Π_1 and Π_2 any other dimensionless monomial, these asymptotic conditions could not be satisfied. this proves that (14) is the proper asymptotic matching formula. for a cohesive crack with its fpz attached to either a notch or a stress-free (fatigued) crack, the first two terms of the small-size asymptotic series expansion in terms of powers of D, as well as the first two terms of the large-size asymptotic series expansion in terms of powers of 1/D (bažant 2001), were previously shown to be matched by the asymptotic expansions of this law. if different dimensionless variables Π_1 and Π_2 were chosen, different size effect laws would result by using the same logical procedure. however, these laws either would not match the asymptotic properties of the cohesive crack model, or would lead to more complex size effect formulas differing from (14) only by third- and higher-order terms of the asymptotic expansions (bažant 2003). thus, if the dimensional analysis is combined with the known asymptotic requirements, the resulting size effect law is unique (except for complex formulas with higher than second-order deviations). it may be noted that a size effect formula recently promulgated by karihaloo (1999) and karihaloo et al. (2003), σ_N = σ_0 (1 + D_0/D)^{1/2}, is not an asymptotic matching formula because it matches only the first two terms of the large-size asymptotic expansion in terms of powers of 1/D, but not the terms of the small-size expansion in powers of D. the foregoing argument is not valid for failure at crack initiation, because the energy release rate at crack initiation vanishes. the size effect is here governed by the derivatives of the energy release rate, for which G_f is immaterial.
7 design formula with size effect for shear capacity of rc beams
the time is now ripe for adding the size effect to all the design specifications dealing with brittle failures of concrete structures (shear and torsion of r.c. beams with or without stirrups, slab punching, column failure, bar embedment length, splices, bearing strength, plain concrete flexure, etc.). in aci standard 318, only the new formula for anchor pullout includes the size effect (of lefm type). by analysis of the latest aci 445 database with 398 data points (reineck et al. 2003), representing an update of the 1984 and 1986 northwestern university databases (with 296 data points), two improved formulas for shear failure of reinforced concrete beams without stirrups, having the simplicity desired by aci, have recently been developed by asymptotic matching of first or second order (bažant and yu 2003); see fig. 5 (v_c = σ_N = mean shear stress in the cross section at failure, d = beam depth up to the reinforcement centroid, f_c' = standard compression strength of concrete). the formula in fig. 5a is more accurate (2nd order match), the other is simpler (1st order match). they are shown in fig. 5, where they are also compared to the database (the solid line is the mean formula, the dashed line is a formula scaled down to achieve additional safety, as practiced in aci). the coefficients of variation of the vertical deviations from the data points, ω, are shown in fig. 5.
fig. 5: improved formulas for shear failure of reinforced concrete beams without stirrups: (a) 2nd order match; (b) 1st order match (circle area represents the assigned weights in fitting)
in the derivation of these formulas, the following three principles were adhered to: 1) only theoretically justified formulas must be used in data fitting, because the size effect (which is of main interest for d ranging from 1 m to 10 m) requires an enormous extrapolation of the aci 445 database (in which 86 % of the data pertain to d < 0.6 m, 99 % to d < 1.1 m and 100 % to d < 1.89 m); 2) the validity of the formula must be assessed by comparing it only to (nearly) geometrically scaled beams of a broad enough size range (only 11 such test series exist); 3) the entire database must be used only for the final calibration of the chosen formula (and not for choosing the best formula, because data for many different concretes and geometries are mixed in the database, and only 2 % of the data have a non-negligible size range).
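to make the practical role of the size effect concrete, the sketch below evaluates a shear-capacity formula of the general type shown in fig. 5, v_c = C √f_c' (1 + d/d_0)^{−1/2}, against a size-independent allowable stress of the aci type (2√f_c' in psi units). C and d_0 here are round assumed numbers, not the calibrated constants of bažant and yu (2003):

```python
import numpy as np

C, d0 = 3.8, 20.0                  # assumed constants; d0 in inches
fc = 4000.0                        # psi

v_sel = lambda d: C * np.sqrt(fc) * (1.0 + d/d0)**-0.5   # size effect law type
v_aci = 2.0 * np.sqrt(fc)                                # size-independent value

for d in (10, 20, 40, 80, 160):    # beam depths d in inches
    print(f"d = {d:4d} in: v_c = {v_sel(d):5.0f} psi  vs  constant {v_aci:.0f} psi")
# for small beams the constant value is conservative; for depths of the order
# of meters it overestimates the mean capacity, which is the safety concern
```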
8 closing remarks
although the size effect in fracture of concrete structures has been studied for over a quarter of a century, there are still significant issues to be resolved. among them, the introduction of the size effect into the specifications of concrete design codes is of the greatest practical importance. the present paper is a mere summary of six recent investigations at northwestern university, which will be fully presented in forthcoming papers.
acknowledgement
thanks for partial financial support are due to the u.s. national science foundation (grant cms-0301145 to northwestern university) and to the infrastructure technology institute of northwestern university.
references
[1] bažant z. p.: "size effect in blunt fracture: concrete, rock, metal." j. eng. mech. asce, vol. 110 (1984), p. 518–535.
[2] bažant z. p.: "mechanics of fracture and progressive cracking in concrete structures." chapter 1 in fracture mechanics of concrete: structural application and
[11] bažant z. p., becq-giraudon e.: “statistical prediction of fracture parameters of concrete and implications for choice of testing standard.” cement and concrete research, vol. 32 (2002), no. 4, p. 529–556. [12] bažant z. p., chen, e.-p.: “scaling of structural failure.” applied mechanics reviews, asme, vol. 50 (1997), no. 10, p. 593–627. [13] bažant z. p., jirásek m.: “nonlocal integral formulations of plasticity and damage: survey of progress.” j. of eng. mechanics, asce, vol. 128 (2002), no. 11, p. 1119–1149. [14] bažant z. p., kazem, m. t.: “determination of fracture energy, process zone length and brittleness number from size effect, with application to rock and concrete.” international journal of fracture, vol. 44 (1990), p. 111–131. [15] bažant z. p., li z.: “modulus of rupture: size effect due to fracture initiation in boundary layer.” j. of struct. eng. asce, vol. 121 (1995), no. 4, p. 739–746. [16] bažant z. p., li z.: “zero-brittleness size-effect method for one-size fracture test of concrete.” j. of eng. mechanics, asce, vol. 122 (1996), no. 5, p. 458–468. [17] bažant z. p., novák d.: “probabilistic nonlocal theory for quasibrittle fracture initiation and size effect. i. theory.” j. of eng. mech., asce, vol. 126 (2000a), no. 2, p. 166–174. [18] bažant z. p., novák d.: “probabilistic nonlocal theory for quasibrittle fracture initiation and size effect. ii. application.” j. of eng. mech., asce, vol. 126 (2000b), no. 2, p. 175–185. [19] bažant z. p., novák d.: “energetic-statistical size effect in quasibrittle failure at crack initiation.” aci mat. j., vol. 97 (2000c), no. 3, p. 381–392. [20] bažant z. p., planas j.: “fracture and size effect in concrete and other quasibrittle materials.” crc press, (1998), boca raton and london. [21] bažant z. p., yu q.: “designing against size effect on shear strength of reinforced concrete beams without stirrups.” str. eng. report, (2003), no. 03-6/a446d, northwestern university. [22] bažant z. p., yu q.: “size effect in concrete specimens and structures: new problems and progress.” in fracture mechanics of concrete structures, (proc., framcos-5, 5th int. conf. on fracture mechanics of concrete and concrete structures), vol. 1 (2004), v. c. li, k. y. leung, k. j. willam., s. l. billington., eds. ia-framcos, (2004), p.153–162. [23] bažant z. p., yu q., zi g.: “choice of standard fracture test for concrete and its statistical evaluation.” international journal of fracture, vol. 118 (2002), p. 303–337. [24] buckingham e.: “on physically similar systems: illustration of the use of dimensional equations.” phys. rev., vol. 4 (1914), p. 345–376. [25] carpinteri a.: “stress-singularity and generalized fracture toughness at the vertex of re-entrant corners.” engineering fracture mechanics, vol. 26 (1987), p. 143–155. [26] duan k., hu x. z.: “scaling of quasi-brittle fracture: boundary effect.” submitted to engineering fracture mechanics (2003). [27] duan k., hu x. z., wittmann f. h.: “explanation of size effect in concrete fracture using non-uniform energy distribution.” materials and structures, vol. 35 (2002), p. 326–331. [28] duan k., hu x. z., wittmann f. h.: “boundary effect on concrete fracture and non-constant fracture energy distribution.” eng. fracture mechanics, vol 70 (2003), p. 2257–2268. [29] dunn m. l., suwito w., cunningham, s. j.: “stress intensities at notch singularities.” engineering fracture mechanics, vol. 57 (1997), no. 4, p. 417–430. [30] dunn m. l., suwito w., cunningham s. 
j.: "fracture initiation at sharp notches: correlation using critical stress intensities." int. j. solids structures, vol. 34 (1997), no. 29, p. 3873–3883.
[31] elices m., guinea g. v., planas j.: "measurement of the fracture energy using three-point bend tests: part 3 – influence of cutting the p–δ tail." materials and structures, vol. 25 (1992), p. 327.
[32] guinea g. v., planas j., elices m.: "correlation between the softening and the size effect curves." size effect in concrete, mihashi h., okamura h., bažant z. p., eds., e&fn spon, london, 1994a, p. 233–244.
[33] guinea g. v., planas j., elices m.: "a general bilinear fitting for the softening curve of concrete." materials and structures, vol. 27 (1994b), p. 99.
[34] hu x. z.: "toughness measurements from crack close to free edge." international journal of fracture, vol. 86 (1997), no. 4, p. l63–l68.
[35] hu x. z.: "size effects in toughness induced by crack close to free edge." fracture mechanics of concrete structures, mihashi h., rokugo k., eds., proc. framcos-3, japan, 1998, p. 2011–2020.
[36] hu x. z., wittmann f. h.: "an analytical method to determine the bridging stress transferred within the fracture process zone: i. general theory." cement and concrete research, vol. 21 (1991), p. 1118–1128.
[37] hu x. z., wittmann f. h.: "an analytical method to determine the bridging stress transferred within the fracture process zone: ii. application to mortar." cement and concrete research, vol. 22 (1992a), p. 559–570.
[38] hu x. z., wittmann f. h.: "fracture energy and fracture process zone." materials and structures, vol. 25 (1992b), p. 319–326.
[39] hu x. z., wittmann f. h.: "size effect on toughness induced by crack close to free surface." engineering fracture mechanics, vol. 65 (2000), p. 209–221.
[40] jirásek m.: keynote lecture presented at euro-c, st. johann in pongau, austria (2003).
[41] karihaloo b. l.: "size effect in shallow and deep notched quasi-brittle structures." international journal of fracture, vol. 95 (1999), p. 379–390.
[42] karihaloo b. l., abdalla h. m., imjai t.: "a simple method for determining the true specific fracture energy of concrete." magazine of concrete research, vol. 55 (2003), no. 5, p. 471–481.
[43] karihaloo b. l., abdalla h. m., xiao q. z.: "size effect in concrete beams." engineering fracture mechanics, vol. 70 (2003), p. 979–993.
[44] reineck k., kuchma d. a., kim k. s., marx s.: "shear database for reinforced concrete members without shear reinforcement." aci structural journal, vol. 100 (2003), no. 2, p. 240–249; with discussions by bažant and yu in aci materials j., vol. 101, no. 1 (jan.–feb. 2004).
[45] rice j. r.: "mathematical analysis in the mechanics of fracture." fracture – an advanced treatise, vol. 2, liebowitz h., ed., academic press, new york, 1968, p. 191–308.
[46] tang t., bažant z. p., yang s., zollinger d.: "variable-notch one-size test method for fracture energy and process zone length." eng. fracture mechanics, vol. 55 (1996), no. 3, p. 383–404.
[47] williams m. l.: "stress singularities resulting from various boundary conditions in angular corners of plates in extension." journal of applied mechanics, vol. 74 (1952), p. 526–528.
[48] wittmann f. h., mihashi h., nomura n.: "size effect on fracture energy of concrete." eng. fracture mechanics, vol. 35 (1990), p. 107–115.
zdeněk p. bažant, qiang yu
mccormick school of engineering and applied science, cee, 2145 sheridan road, northwestern university, evanston, illinois 60208, u.s.a.

concurrent design of railway vehicles by simulation model reuse
m. valášek, p. steinbauer, j. kolář, j. dvořák
this paper describes a concurrent design approach to railway vehicle design. current railway vehicles use many different concepts that are combined into the final design concept. the design support for such systems is based on reusing components from previous design cases. the key part of the railway vehicle design concept is its simulation model. therefore the support is based on the support for reuse of previous simulation models. the simulation models of different railway component concepts are stored using the methodology from the eu clockwork project. the new concept usually combines stored components.
keywords: concurrent engineering, design, reuse, railway vehicles, simulation model.
1 introduction
engineering design of various products can be divided into two main groups. the first one is characterized as configurational design, and can be fully described by a design network. using such a design network, effective computer support systems can be developed. such a computer support system, called colin, was developed for automotive design, together with further extensions, described in [1]. although the railway vehicle design process is mostly a configurational one, it is not possible to establish a sequence or network of repeatable steps and operations which fully describe previous design cases. usually, previous designs or parts of them are utilized. current advances in computer hardware and software technologies allow designers to base vehicle design on a simulation model of such a vehicle. thus, a significant part of the design process is the creation and verification of the dynamic simulation model of the vehicle, and experiments with this model. further, previous simulation models and parts of them are often reused. knowledge support for such a design process is provided by the methodology and tools developed within the eu clockwork project (creating learning organisations with contextualised knowledge-rich work artifacts). a design methodology based on multidisciplinary virtual modelling and subsequent simulation is essential for any present-day engineering design [3], and specifically for the design of complex mechatronic machines [4]. in addition, current engineering design is multidisciplinary and therefore based on team work. the team work often uses the potential and benefits of geographic distribution and internet communication. engineering design draws heavily on previous experience, and therefore it is essential to build and maintain archives of previous cases. due to the nature of the described methodology, simulation models are an essential part of this. engineering design is a knowledge-intensive activity, but analysing the resulting documentation of such activities shows that most of this knowledge is tacit, and is not stored for future reuse. the clockwork methodology and tools provide a means for storing, retrieving, reusing, sharing and communicating such knowledge. this paper deals with the implementation of the clockwork methodology and tools which support the integrated design of railway vehicles.
2 clockwork methodology
2.1 knowledge life cycle of virtual modelling and simulation
the development of a virtual model and then its simulation typically goes through a number of tasks ([2]). these tasks are illustrated in fig. 1. the light arrows represent mappings between the different elements, and the black arrows represent transformation processes between the tasks. the development of a simulation, as shown in fig. 1, involves five steps. the first step is an analysis of the real world, since when we start to develop a simulation there exists some real world object. this real world object is investigated within a certain experimental frame concentrated on the relevant behaviour in order to answer some question, the objective of the object investigation. it forms the real world system (either actual or hypothesised).
from this real world system we want to create a model to use for answering a modelling question, such as how the real-world system would respond to particular stimuli. the real world task has three elements: the real world system, the question (modelling question), and finally the solution.
• the real world system (either actual or hypothesised) is the system of which we want to create a model to use for answering the modelling question.
• the question (modelling question) is a modelling objective that is based within the real world and is to be answered using simulation.
• the solution is the interpretation of the interpreted output of the simulation task. this will be left blank until there is a set of results.
the second step is the conceptual task, where the conceptual model is developed. the transformation from the real world into the conceptual world consists of creating a hierarchical break-down of the real world system being investigated (e.g., a hierarchical description of real product components and their environment). this scheme should be accompanied by a description of the function (physical interaction), as it forms the basis of the causal and functional explanation. as soon as this description is completed, the conceptualisation is completed. certainly, various assumptions are raised about the granularity of the modelling being provided. it is the way of “thinking about” and representing the real world system in a conceptual manner. a crucial design decision is the determination of what factors influence the system behaviour and what system behaviours are to be incorporated into this task. the result of the conceptualisation is a conceptual model investigated within the modelling environment in order to answer the modelling objective.
• the conceptual model is an abstracted view of the real world system.
• the modelling objective is to conceptualise the question and form objectives for the simulation task (i.e. what behaviour are we interested in? and how will we know that we have the correct answer from the results?).
• the modelling environment is the conceptualisation of the inputs of the real world system, providing a simulation environment to work within in the simulation task. typically the inputs are specified here.
validation is the process of determining whether the conceptual model is an adequately accurate representation of the real world task. the next task is physical modelling. the third step is the transformation from the conceptual world into the physical world.
this consists of replacing each element on the conceptual level by a corresponding element on the physical level. the elements of the physical model are the so-called ideal modelling elements (e.g. rigid body, ideally flexible body, ideal gas, ideal incompressible fluid, ideal capacitor, etc.). again, as soon as this replacement is completed, the physical modelling is completed. this is where most of the modelling assumptions are raised, and where many modelling modifications are made. this is the way of “modelling it” and representing the real world system in a physical modelling manner. two typical modifications, for example, are: either an element is decomposed into more elements (a small subsystem), e.g. a body cannot be treated as a rigid body and is modelled as a flexible body, i.e. as a subsystem; or several elements are replaced by one element, as the detailed treatment of their interactions is neglected. the modelling task results in the physical model, its input and the investigated output.
• the physical model is an accurate physical view of the conceptual model.
• the model input is the modelling conceptualisation of the inputs of the real world system, providing a physical modelling of the real world stimuli to the investigated system, i.e. a detailed physical description of the modelling environment from the conceptual task.
• the model output is the modelling conceptualisation of the objectives for the simulation task (i.e. how the relevant behaviour is measured and evaluated).
verification is the process of determining whether the physical model is an adequately accurate representation of the conceptual model. the results of the modelling task are implemented in the form of a computer-executable set of instructions known as the simulation model. this is the transformation from the physical world into the simulation world. it consists of replacing each element on the physical modelling level by the corresponding element on the simulation level. usually this replacement can be straightforward, as simulation packages attempt to directly provide the ideal objects of physical models. however, some modifications are always necessary. the simulation task consists of two stages. the first stage deals with the implementation of the physical model using a particular simulation environment (simulation language and simulation software). the second stage deals with the simulation experiment, i.e. the proper usage of the developed simulation model for the solution of the real world problem. testing is the process of determining whether the implemented version of the model is an accurate representation of the physical model.
fig. 1: clockwork knowledge framework
there are many relations between the models from different worlds. the main relations are connected with the described transformations.
1: clockwork knowledge framework described transformations. they represent the relation “consists of ” from the real world to the conceptual world. the relation “is modelled as” from the conceptual world to the modelling world and the relation “is implemented as” from the modelling world to the simulation world, and finally the relation “is simulated by” and “is answered by” from real world to the simulation world. the development of objects of these relations is accompanied by many iterations (fig. 2). 2.2 knowledge support for virtual modelling and simulation the above described process of creating virtual models and simulation experiments with them is accompanied by supporting knowledge that is vital for knowledge sharing and reuse within the design team. it is believed that the supporting knowledge for the modelling and simulation process comes in two forms, informal and formal knowledge. informal knowledge is typically concerned with relationships and knowledge about models using natural language. formal knowledge is also concerned with the establishment and representation of knowledge about the model, but this knowledge is typically more structured, using knowledge modelling technology. both of these types of knowledge will be associated with each world object, as shown in fig. 3. based on the methodology of the enrich project [3], the modelling and simulation objects are working documents of the engineering activity, and informal knowledge can be captured by annotations and the formal knowledge by semantic indexing, using concepts from related ontologies (fig. 4). each world model is associated with informal and formal knowledge on the level of model components as well on the level of complete models. in adition, each transition between worlds is also associated with informal and formal knowledge. informal knowledge about world object components and a complete world object describes the function, behaviour and any other useful information about the component or object in question. formal knowledge represents the most valuable informal knowledge in terms of corresponding ontologies (fig. 4). informal knowledge of world transitions describes the structural changes of the model with related assumptions raised during particular a transformation, and formal knowledge of these transitions describes the relationships among the modelling processes. the particular relationships between world models and their components can also be generalized into more abstract knowledge about the development of virtual models and their simulation (fig. 5) the above described analysis has shown that the process of modelling can be described on the level of each world/object by three descriptors: 1. a conceptual/physical/simulation model depending on the world (cwo/mwo/swo). 2. an annotation text (informal knowledge). 3. semantic indices (formal knowledge). similarly, each component of the model can be described by the same triplet. capture, retrieval and search of this knowledge is supported by several tools. it is considered that models on various levels of the framework (from various worlds) consist of components © czech technical university publishing house http://ctn.cvut.cz/ap/ 11 acta polytechnica vol. 43 no. 6/2003 fig. 2: iterative process of simulation model development fig. 3: supporting knowledge for the modelling and simulation process fig. 4: ontology hierarchy which have a specific meaning and a lot of knowledge is related to the particular model component, as shown in the fig. 6 (e.g. 
access to and distinguishing of these components is however dependent on the modelling tool. within the clockwork system, specific tools have been developed for various modelling environments (matlab/simulink, simpack). another tool for knowledge management on the level of whole models and worlds (the model is a work document) is then proposed. it supports the storing, retrieval and search of annotations on the level of whole models and transitions between worlds, the semantic indexing of all objects, and hierarchical relations between objects of different worlds (fig. 10). informal knowledge can be in the form of a textual description, an image or an arbitrary file. such knowledge can be associated with all parts of a world object (modelling question-objective, assumptions and environment).
fig. 5: clockwork models, ontologies and their relationships
fig. 6: annotations of simulation model elements
fig. 7: summary of different railway vehicles to be covered by concurrent design
3 railway vehicle concurrent design objectives
within the concurrent design of railway vehicles, a complete system of different types of railway vehicles will be covered (fig. 7).
4 implementation of the clockwork methodology for integrated design of railway vehicles
the clockwork methodology and tools have been used as the basis of the system for integrated design of railway vehicles. the system is based on the creation of a simulation model library of various previous designs, using the clockwork tools and approach. thus, these design cases have been developed and verified not only as simulation models, but also as other mutually related objects in the different “worlds” according to fig. 1. furthermore, the objects are enriched by formal and informal knowledge, as shown in fig. 3. when completed, the system will support integrated design of railway vehicles by enabling effective reuse of previous design cases or parts of them. creating a new simulation model will be much faster than starting from scratch.
4.1 interpretations of clockwork worlds
the four world framework (fig. 1) described previously needs to be interpreted for the purposes of the wheel-rail design cases library. the common modelling objective-question is quite general: create a simulation model of a given railway vehicle to enable evaluation of vehicle properties according to code uic518 (which includes passenger riding comfort etc.). this question is narrowed according to the level of model complexity. real world objects are considered as high level objects, which provide a general specification of the vehicle model being developed (geared/not geared, commuter vehicles, high speed vehicles). conceptual world objects related to a particular real world object contain a detailed specification of the modelled vehicle (i.e., number of axes, type of springs, dampers, wheel shape etc.). there are certainly many conceptual world objects related to a particular real world object, as there are many structural variants and decisions. each conceptual world object is expanded into at least one model world object. at this level modelling decisions are made: a decision on how to model each component or piece of a conceptual world object, e.g. which bodies are considered rigid, which flexible, linear/nonlinear characteristics of springs, etc. the simulation world object below it contains an implementation of a model world object in a specific modelling language/package (e.g. simpack, simulink, adams, etc.); an illustrative instantiation of this chain is sketched below.
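as an illustration of section 4.1, the chain from a real world object down to a simpack implementation can be written out with the classes sketched in section 2.2 above; the concrete names (uic518, simpack, vehicle 711) come from the text, while the structure of the records is our own:

# reuses WorldObject and WorldTransition from the sketch in section 2.2
real = WorldObject(world="real", model="commuter railway vehicle 711",
                   annotation="evaluate vehicle properties according to code uic518")
conceptual = WorldObject(world="conceptual", model="bogie variant of vehicle 711",
                         annotation="number of axes, spring and damper types, wheel shape")
physical = WorldObject(world="modelling", model="multibody model of vehicle 711",
                       annotation="car body rigid, nonlinear spring characteristics")
simulation = WorldObject(world="simulation", model="simpack model of vehicle 711")

chain = [
    WorldTransition("consists of", real, conceptual),
    WorldTransition("is modelled as", conceptual, physical,
                    assumptions="detailed wheel-rail contact interactions simplified"),
    WorldTransition("is implemented as", physical, simulation),
]

a new design case can then be assembled by retrieving stored objects at any level of this chain and replacing only the components that differ.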
4.2 ontology adaptation
the clockwork ontologies (fig. 4) can mostly be used with only small extensions. the domain and component ontologies are however problem specific, and their railway extensions are being developed. to illustrate some of the issues discussed above, fig. 8 and fig. 9 show an example of a simulation model of railway vehicle 711, created using the described methodology. fig. 8 shows a simpack workspace with an opened simulation model; this workspace enables annotations of simulation model parts. fig. 9 shows the related record in the ckmt tool.
fig. 8: simpack workplace with model of vehicle 711
fig. 9: record of simulation model of vehicle 711 in ckmt – real world object
fig. 10: example of a model world object for vehicle 711
5 conclusions
an approach toward support for integrated design of railway vehicles, a specific and important class of mechanical products, has been developed. there is no design procedure based on a sequence or network of related steps for this product class. instead, previous design cases are used to learn lessons for new designs. in particular, simulation models of previous design cases are reused. using the clockwork methodology, means are developed for knowledge capture and reuse. this work is based on the technology and methodology developed within the eu clockwork project, adapted for integrated design of railway vehicles. this approach is currently being verified by building a series of commuter railway vehicle simulation models.
acknowledgment
the authors appreciate the kind support received from eu ist 12566 project clockwork and from msmt project ln.
references
[1] macek, j., valášek, m.: “concept of computer aided internal combustion design at configuration level.” in: proc. of xxxiii. int. conf. of departments of ice 2001, va brno, 2001, p. 321–333.
[2] steinbauer, p., valášek, m., zdráhal, z., mulholland, p., šika, z.: “knowledge support of virtual modelling and simulation.” in: proc. of nato asi virtual nonlinear multibody systems, prague, 2002, p. ii/241–246.
[3] zdráhal, z., valášek, m.: “modeling tasks in engineering design.” in: cybernetics and systems, vol. 27, 1996, p. 105–118.
[4] valášek, m.: “mechatronic system design methodology – initial principles based on case studies.” in: j. adolfsson, j. karlsen (eds.), mechatronics 98, pergamon, amsterdam, 1998, p. 501–506.
prof. ing. michael valášek, drsc., phone: +420 224 357 361, e-mail: michael.valasek@fs.cvut.cz
ing. pavel steinbauer, phone: +420 224 357 568, e-mail: pavel.steinbauer@fs.cvut.cz
department of mechanics, czech technical university in prague, faculty of mechanical engineering, karlovo nám. 13, 121 35 prague 2, czech republic
jiří kolář, e-mail: jiri.kolar@fs.cvut.cz
jiří dvořák
department of vehicles and aerospace, czech technical university in prague, faculty of mechanical engineering, technická 4, 166 07 prague 6, czech republic
nondestructive damage detection based on modal analysis
t. plachý, m. polák
three studies of damage identification and localization based on methods using experimentally estimated modal characteristics are presented. the results of an experimental investigation of simple structural elements (three rc beams and three rc slabs) obtained in the laboratory are compared with the results obtained on a real structure (a composite bridge – a concrete deck supported by steel girders) in situ.
keywords: modal analysis, natural mode, natural frequency, damage detection, mac, comac, service life, fatigue.
1 introduction
in practice, various kinds of damage to reinforced concrete structures can be found, such as mechanical damage due to cracking, corrosion of the reinforcement and deterioration of the concrete due to chemical actions from the environment. undetected damage can lead to the failure of structural members or failure of the structure as a whole. therefore, monitoring the degree of deterioration and detecting damage of a structure at the earliest possible stage is very important. in many cases cracks are hidden, or damage is located inside the structure, and is only visible through changes in the overall properties. current damage detection methods require that the vicinity of the damage is known a priori, and that the portion of the structure being inspected is readily accessible. the need for methods that can be applied to complex structures has led to the development of methods that examine changes in the modal characteristics of a structure. methods and procedures that use the results of an experimental modal analysis for estimating the degree of degradation of a structure can be verified on simple structural elements or complex structures (e.g. bridges [1], [3]), when we know their damage state.
2 investigated structural elements
tests of structural elements of two types (fig. 1) were carried out in the laboratories of the faculty of civil engineering, ctu in prague. the influence of increasing damage to reinforced concrete beams and slabs on their modal characteristics was monitored. three reinforced concrete beams with dimensions 4.5 m × 0.3 m × 0.2 m were prepared for the purposes of the test. each beam was placed on a cast steel bearing to achieve a good agreement with theoretical boundary conditions. it was a simply supported beam with a span length of 4.0 m and cantilevered ends of 0.25 m on both sides. the deterioration of the beams was produced by static loading and dynamic fatigue loading. the beams were tested in four-point bending to get a constant bending moment in the mid-section of the beam. static loading was imposed in four steps (load by self-weight, load effect equal to the theoretical limit of crack initiation, load to the first real cracking in the lower part of the beam, load to half of the ultimate moment). then we continued with a dynamic fatigue load, induced by a harmonic force. the amplitude of the dynamic load was chosen to achieve a stress range in the main reinforcement Δσ = 220 mpa, which would bring the service life of the beam to an end after 500 000 cycles. in reality, the end of the service life of beam no. 1 came after 260 000 cycles (52 % of the theoretical lifetime). so for beams no. 2 and 3 the loading was divided into steps of about 65 000 cycles. the end of the service life of beam no. 2 came after 220 000 cycles (44 % of the theoretical lifetime), and for beam no. 3 after 255 000 cycles (51 % of the theoretical lifetime).
fig. 1: view of the reinforced concrete beam and slab
before the test and after each load step the dynamic response of the beam was measured with a separate test arrangement. for excitation of the beam during the experimental modal analysis, an ese 11 076 electrodynamic shaker was used. the shaker produced a random driving force over the frequency range of 5 to 200 hz. the excitation force was measured indirectly by measuring the acceleration of the excitation mass; a b12/200 hbm acceleration transducer was placed on the excitation weight. the response of the elements to forcing by the exciter was measured by ten b12/200 hbm inductive acceleration transducers in a chosen net of points (5 points in each of 19 cross sections), on the upper face in the vertical direction and on the left side of the beam in the horizontal direction. the shaker was placed centrally with respect to the longitudinal axis of the beam, at two fifths of the beam span. the position of the point of excitation was designed to be able to excite the basic bending modes of the natural vibration of the beam. values of the frequency response function hrs(if) were obtained as an average from ten measurements. the window length of the time signal processing was 16 s, and the frequency range of the window was set to 200 hz.
the change in the modal characteristics was monitored and confronted with the damage state of the beams. the modal characteristics of the beams, which were measured after each loading step, were compared with each other. the changes in the natural frequencies Δf(j) were also computed. for the comparison of the natural modes, the modal assurance coefficient mac(j,j), the coordinate modal assurance criterion comac(p), changes in the mode surface curvature camosuc(j),x, changes in the modal flexibility matrix Δ[F] and the second derivative of the changes in the diagonal members of the modal flexibility matrix Δ[F]″ were used; a brief numerical sketch of the two assurance criteria is given below.
fig. 2: comparison of the dynamic behavior of beam no. 3 between the last but one and the last load step – functions (1-comac(x)) and camosuc(3),x
fig. 3: comparison of the dynamic behavior of beam no. 3 between the last but one and the last load step – change in the modal flexibility matrix Δ[F] and the second derivative of the change in the modal flexibility matrix Δ[F]″
after the evaluation of the first test results of the beams, the second tests, of the slabs, were designed. four reinforced concrete slabs with dimensions 3.2 m × 1.0 m × 0.1 m were made for the purposes of the test.
fig. 4: comparison of the dynamic behavior of the slab between the first and the last load step – functions (1-comac(x)) and camosuc(1),x
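the assurance criteria used throughout this paper have compact definitions; a minimal numpy sketch (our own notation; mode shapes are vectors over the measurement points, paired between the two compared states):

import numpy as np

def mac(phi_i, phi_j):
    """modal assurance criterion between two real mode-shape vectors."""
    num = (phi_i @ phi_j) ** 2
    return float(num / ((phi_i @ phi_i) * (phi_j @ phi_j)))

def comac(a, b):
    """coordinate modal assurance criterion per measurement point p;
    a, b are (n_points x n_modes) arrays of paired modes from two states."""
    num = np.sum(np.abs(a * b), axis=1) ** 2
    den = np.sum(a ** 2, axis=1) * np.sum(b ** 2, axis=1)
    return num / den

# values close to 1 mean well-correlated modes (mac) or undamaged points (comac);
# the damage indicator plotted in fig. 2 and fig. 4 is 1 - comac(x).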
the slabs were simply supported on two opposite sides with a span of 3.0 m and cantilevered ends of 0.1 m. the slabs were also loaded by static and dynamic fatigue loading. static loading was imposed in the same four steps as for the beams, but the steps for dynamic loading were different. in contrast to the beams, the end of the fatigue life of the first slab came after 1 000 000 cycles (200 % of the theoretical lifetime). therefore, the fatigue load was applied in four steps (load to half of the theoretical lifetime – 250 000 cycles, load to the theoretical lifetime – 500 000 cycles, load to 750 000 cycles, and load to the end of the real lifetime). before the test and after each load step the dynamic response of the slab was measured with a separate test arrangement, as for the beams. the response of the slabs was measured only in the vertical direction in a chosen net of points (11 points in each of 21 cross sections) on the upper face. the change of the modal characteristics was monitored and confronted with the damage state of the slabs, using the same coefficients as for the beams.
for localizing the cracks induced by the load increase during the experimental and theoretical studies on structural elements, camosuc(j),x (fig. 2 and 4), the change of the modal flexibility matrix and especially the second derivative of the change of the diagonal members of the modal flexibility matrix (fig. 3 and 5) seem to be most appropriate.
fig. 5: comparison of the dynamic behavior of the slab between the first and the last load step – change in the modal flexibility matrix Δ[F] and the second derivative of the change of the modal flexibility matrix Δ[F]″
to acquire reliable data for appraising the monitored structure based on comac(p), the change of the modal flexibility matrix and the second derivative of the change of its diagonal members, it is very important to consider carefully the character and number of the natural modes that are used in the computation. to determine camosuc(j),x it is most suitable to use the basic natural mode (the first vertical bending mode of the natural vibration), while for higher natural modes camosuc(j),x does not give as good results as for the first natural mode. for a reliable analysis it is important to obtain reference data about the dynamic properties of the investigated structure in the undamaged virgin state, optimally before it enters into operation; a sketch of these localization quantities follows below.
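the localization quantities just discussed can likewise be sketched. the curvature entering camosuc is approximated here by a central difference, and the modal flexibility matrix is assembled from mass-normalized modes, which is an assumption on our part since the normalization used in the experiments is not restated here:

import numpy as np

def mode_curvature(phi, dx):
    """second spatial derivative of a mode shape (central differences);
    camosuc is the change of this curvature between two damage states."""
    kappa = np.zeros_like(phi)
    kappa[1:-1] = (phi[2:] - 2 * phi[1:-1] + phi[:-2]) / dx ** 2
    return kappa

def modal_flexibility(phis, freqs_hz):
    """modal flexibility matrix from mass-normalized modes (columns of phis):
    F = sum_j phi_j phi_j^T / omega_j^2, so the lowest modes dominate."""
    omegas = 2.0 * np.pi * np.asarray(freqs_hz)
    return (phis / omegas ** 2) @ phis.T

the damage indicators used above are then the change Δ[F] between the two states and the second derivative of its diagonal members along the span, again obtainable with mode_curvature applied to the diagonal.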
3 the bridge under investigation
the bridge under investigation crosses highway d5 near the village of vráž in the czech republic. the bearing structure of this composite bridge consists of four main steel i-girders and a reinforced concrete slab. it is a three-span continuous bridge with spans of 11.7 m + 35.1 m + 11.0 m. in march 2001 the bridge was damaged by a crash: a heavy vehicle (an excavator) being ferried along the d5 crashed into its two main girders. the consequences of this crash were a permanent buckle of the main girder (15 cm at the point of impact) and damage to the connection between the main girder and the crossbeam (fig. 6). an experimental modal analysis was carried out twice on this bridge – for the damaged state and for the state after reconstruction of the bridge. to vibrate the bridge (fig. 7), a tiravib 5140 electrodynamic exciter was used. the driving force was of the white noise type in the range 0–20 hz, and it was measured by s-35 lukas force transducers. the response of the bridge was measured only in the vertical direction in a chosen net of points (280 points – 28 cross sections with 10 points in each cross section) on the upper face of the bridge by b12/200 hbm accelerometers. the data acquisition was performed by the 2550b spectral dynamics vibration control system with a sun control computer. the star spectral dynamics program was used for the evaluation of the modal characteristics.
fig. 6: view of the damaged main girder of the bridge
fig. 7: the tiravib 5140 electrodynamic exciter and its placement on the bridge

rows: state after reconstruction – the 2nd stage; columns: damaged state – the 1st stage (number of the natural mode of vibration)
     |   1   |   2   |   3   |   4   |   5   |   6   |   7   |   8   |   9   |  10   |  11
  1  | 0.937 | 0.205 | 0.038 | 0.018 | 0.000 | 0.010 | 0.000 | 0.004 | 0.011 | 0.003 | 0.004
  2  | 0.005 | 0.674 | 0.045 | 0.054 | 0.003 | 0.001 | 0.000 | 0.007 | 0.002 | 0.001 | 0.020
  3  | 0.084 | 0.022 | 0.618 | 0.345 | 0.027 | 0.063 | 0.001 | 0.020 | 0.087 | 0.012 | 0.000
  4  | 0.001 | 0.026 | 0.378 | 0.338 | 0.017 | 0.024 | 0.001 | 0.014 | 0.013 | 0.003 | 0.002
  5  | 0.008 | 0.001 | 0.259 | 0.110 | 0.241 | 0.050 | 0.001 | 0.061 | 0.049 | 0.031 | 0.001
  6  | 0.000 | 0.001 | 0.091 | 0.050 | 0.928 | 0.062 | 0.000 | 0.019 | 0.017 | 0.002 | 0.003
  7  | 0.004 | 0.001 | 0.000 | 0.019 | 0.014 | 0.121 | 0.099 | 0.113 | 0.005 | 0.000 | 0.008
  8  | 0.000 | 0.002 | 0.000 | 0.003 | 0.001 | 0.015 | 0.576 | 0.001 | 0.004 | 0.002 | 0.000
  9  | 0.009 | 0.000 | 0.014 | 0.013 | 0.001 | 0.330 | 0.194 | 0.017 | 0.032 | 0.001 | 0.009
 10  | 0.004 | 0.001 | 0.042 | 0.113 | 0.031 | 0.039 | 0.254 | 0.772 | 0.181 | 0.006 | 0.173
 11  | 0.010 | 0.001 | 0.023 | 0.005 | 0.004 | 0.011 | 0.014 | 0.135 | 0.272 | 0.006 | 0.011
 12  | 0.023 | 0.000 | 0.047 | 0.000 | 0.000 | 0.261 | 0.002 | 0.033 | 0.022 | 0.326 | 0.035
table 1: the bridge across highway d5 near vráž – comparison of natural modes from the first and from the second stage of the experimental investigation of the bridge using mac

eleven natural frequencies, modal shapes and damping frequencies were evaluated after the first stage (the damaged state) of the experimental bridge monitoring, and twelve natural frequencies, modal shapes and damping frequencies were evaluated after the second stage (the state after reconstruction). just from a visual comparison of the characters of the natural modes evaluated in the damaged and repaired states (fig. 10–13) and the measured frequency response functions (fig. 8), it is clear that the changes resulting from the reconstruction of the damaged girder are substantial. for this reason, the natural modes of the two verified states of the bridge were compared with each other using the mac(i,j) assurance criterion (tab. 1, fig. 9). in table 1, the values of mac(i,j) for the modes that have very similar characters are highlighted; a visual comparison of these modes is shown in fig. 10–13. values of mac(i,j) that show less correlating modes are highlighted by underlined bold numbers. natural modes no. 5, 7 and 11 for the state after reconstruction, and natural modes no. 9 and 11 for the damaged state, correlate less strongly with all of the modes of the other state.
fig. 8: comparison of the frequency response functions measured at points 112 (the point above the middle of the damaged part of the main girder) and 118 (the point in the same cross section on the exterior undamaged main girder) in the damaged state and in the state after reconstruction
fig. 9: the bridge across highway d5 near vráž – comparison of natural modes from the 1st and from the 2nd stage of the experimental investigation of the bridge, using mac

the changed sizes of the natural frequencies Δf(j) and damping frequencies Δfb(j) caused by the reconstruction of the damaged girder are listed in table 2. the compared pairs of f(j) and fb(j) were allocated to each other based on the natural mode correlation specified by mac(i,j) (table 1). the comparison was done separately for the natural modes with a similar character of vibration, for the natural modes with lesser correlation, and for the least correlating modes.

damaged state (june 2001): no. (j) | f(j) [hz] | fb(j) [hz] || state after reconstruction (october 2001): no. (i) | f(j) [hz] | fb(j) [hz] || characteristics of change: Δf(j) [%] | Δfb(j) [%] | mac
natural modes with a similar character of vibration:
(1) | 3.26 | 0.093 || (1) | 3.38 | 0.051 || 3.7 | −45.3 | 0.937
(2) | 3.41 | 0.135 || (2) | 3.65 | 0.114 || 7.0 | −15.3 | 0.674
(3) | 8.16 | 0.161 || (3) | 8.54 | 0.147 || 4.7 | −8.5 | 0.618
(5) | 10.21 | 0.421 || (6) | 10.86 | 0.316 || 6.4 | −24.9 | 0.928
(7) | 13.74 | 0.368 || (8) | 14.18 | 0.402 || 3.2 | 9.4 | 0.576
(8) | 14.72 | 0.458 || (10) | 15.89 | 0.461 || 7.9 | 0.7 | 0.772
natural modes with lesser correlation:
(3) | 8.16 | 0.161 || (4) | 8.95 | 0.199 || 9.7 | 24.0 | 0.378
(4) | 8.42 | 0.171 || (3) | 8.54 | 0.147 || 1.4 | −13.8 | 0.345
(4) | 8.42 | 0.171 || (4) | 8.95 | 0.199 || 6.3 | 16.8 | 0.338
(6) | 12.01 | 0.475 || (9) | 15.17 | 1.060 || 26.3 | 123.4 | 0.330
(10) | 17.59 | 1.280 || (12) | 19.25 | 0.329 || 9.4 | −74.3 | 0.326
less correlating modes:
(3) | 8.16 | 0.161 || (5) | 9.58 | 0.741 || 17.4 | 361.1 | 0.259
(6) | 12.01 | 0.475 || (7) | 11.39 | 0.400 || −5.2 | −96.7 | 0.121
(9) | 15.58 | 0.768 || (11) | 16.58 | 0.578 || 6.4 | −96.3 | 0.272
(11) | 19.14 | 0.217 || (10) | 15.89 | 0.461 || −17.0 | −97.6 | 0.173
table 2: the bridge across highway d5 near vráž – the change of natural frequencies Δf(j) and damping frequencies Δfb(j) caused by the reconstruction of the main girder

not only mac(i,j) was used for the comparison of the natural modes: comac(p) (fig. 14), the change of the curvature of the natural modes camosuc(j),p (fig. 15–18), the change of the modal flexibility matrix Δ[F] and the second derivative of the change of the diagonal members of the modal flexibility matrix Δ[F]″ (fig. 19 and 20) were used as well.
fig. 10: visual comparison of the real parts of the natural modes for the state after reconstruction (the first natural mode) and for the damaged state (the first natural mode) – mac(1,1) = 0.937
fig. 11: visual comparison of the real parts of the natural modes for the state after reconstruction (the second natural mode) and for the damaged state (the second natural mode) – mac(2,2) = 0.674
fig. 12: visual comparison of the real parts of the natural modes for the state after reconstruction (the third natural mode) and for the damaged state (the third natural mode) – mac(3,3) = 0.618 (above – comparison in the section of the damaged girder, below – comparison in the section of the undamaged girder)
fig. 13: visual comparison of the real parts of the natural modes for the state after reconstruction (the sixth natural mode) and for the damaged state (the fifth natural mode) – mac(6,5) = 0.928
fig. 14: the change in the dynamic behavior of the bridge (damaged state – state after reconstruction) described by the function (1-comac), using natural modes no. 1, 2, 3, 5, 7, 8 (damaged state) and 1, 2, 3, 6, 8, 10 (state after reconstruction); the max. value shown in the figure is 0.887 (i.e. comac(128) = 0.113) at point 128
fig. 15: the change in the dynamic behavior of the bridge (damaged state – state after reconstruction) described by the change of the curvature of the first natural mode camosuc(1) calculated in the longitudinal direction of the bridge
fig. 16: the change in the dynamic behavior of the bridge (damaged state – state after reconstruction) described by the change of the curvature of the first natural mode camosuc(1) calculated in the transversal direction of the bridge
fig. 17: the change in the dynamic behavior of the bridge (damaged state – state after reconstruction) described by the change of the curvature of the third natural mode camosuc(3) calculated in the longitudinal direction of the bridge
fig. 18: the change in the dynamic behavior of the bridge (damaged state – state after reconstruction) described by the change of the curvature of the third natural mode camosuc(3) calculated in the transversal direction of the bridge
fig. 19: the change in the dynamic behavior of the bridge (damaged state – state after reconstruction) described by the second derivative of the diagonal members of the modal flexibility matrix Δ[F]″ computed in the longitudinal direction of the bridge, using natural modes no. 3, 5, 7, 8 (damaged state) and 3, 6, 8, 10 (state after reconstruction)
fig. 20: the change in the dynamic behavior of the bridge (damaged state – state after reconstruction) described by the second derivative of the diagonal members of the modal flexibility matrix Δ[F]″ computed in the transversal direction of the bridge, using natural modes no. 3, 5, 7, 8 (damaged state) and 3, 6, 8, 10 (state after reconstruction)
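the change characteristics in table 2 are plain relative differences once the modes are paired by mac; a sketch using the first paired mode from the table:

def relative_change(before, after):
    """percentage change used for both df(j) and dfb(j) in table 2."""
    return 100.0 * (after - before) / before

# first paired mode: f 3.26 -> 3.38 hz, fb 0.093 -> 0.051 hz
print(round(relative_change(3.26, 3.38), 1))    # 3.7 (% change of natural frequency)
print(round(relative_change(0.093, 0.051), 1))  # -45.2 here; table 2 lists -45.3 from unrounded data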
4 conclusion
the evaluated results show that the damage to the main girder and its reconstruction significantly influenced the dynamic behavior of the investigated bridge across highway d5 near vráž. the ascertained changes of the modal characteristics are significant, and confirm the improvement of the structural state of the bridge after the reconstruction of the main girder. the change of the mode shapes was so great that the coefficient mac had to be used to find corresponding modes for comparison. for damage detection and localization on this bridge, the use of natural frequency changes, changes of the mode surface curvature camosuc(j),x, changes of the modal flexibility matrix Δ[F] and the second derivative of the changes of the diagonal members of the modal flexibility matrix Δ[F]″ proved to be appropriate. unlike the results obtained on simple structural elements, it seems that for damage localization on complex structures using camosuc(j),x not only the first natural mode but also the higher modes can be used. in particular, it is suitable to use a combination of the values of camosuc(j),x computed in the longitudinal and transversal directions of the bridge. this investigation of methods for damage detection and localization was supported by msmt of the czech republic as a part of project cez: j04/98: 210000030.
references
[1] brincker r., andersen p., cantieni r.: “identification and level i damage detection of the z24 highway bridge.” experimental techniques, vol. 25 (2001), no. 6, p. 51–57.
[2] farrar c. r., jauregui d. a.: “comparative study of damage identification algorithms applied to a bridge: i experiment.” smart materials and structures, vol. 7 (1998), p. 704–719.
[3] feltrin g., motavalli m.: “vibration-based damage detection on a highway bridge.” proc. of the 1st int. conf. on bridge maintenance, safety and management, 2002.
[4] frýba l., pirner m., urushadze s.: “localization of damages in concrete structures.” proc. of the int. conf. computational methods and experimental measurements x, 2001.
[5] plachý t.: “dynamic study of a reinforced concrete beam damaged by cracks.” phd thesis, ctu in prague, fce, prague, 2003.
[6] plachý t., polák m.: “influence of damage of a reinforced concrete beam on change of its behaviour.” proc. of the 5th int. conf. on structural dynamics eurodyn 2002, munich, 2002, p. 1451–1456.
[9] toksoy t., aktan a. e.: “bridge-condition assessment by modal flexibility.” experimental mechanics, vol. 34 (1994), no. 3, p. 271–278.
ing. tomáš plachý, phd., phone: +420 224 354 482, e-mail: plachy@fsv.cvut.cz
doc. ing. michal polák, csc., phone: +420 224 354 476, e-mail: polak@fsv.cvut.cz
czech technical university in prague, faculty of civil engineering, thákurova 7, 166 29 praha 6, czech republic

jatropha oil with iron nanoparticles application in drilling processes
veikko shalimba (a, *), vít sopko (b)
a) jan evangelista purkyně university in ústí nad labem, faculty of mechanical engineering, pasteurova 3334/7, 400 01 ústí nad labem, czech republic
b) czech technical university in prague, faculty of nuclear sciences and physical engineering, department of physical electronics, břehová 7, 115 19 prague 1, czech republic
*) corresponding author: veikko12@yahoo.com
abstract. the performance of heat transfer fluids has a substantial influence on the size, weight and cost of heat transfer systems; therefore, a high-performance heat transfer fluid is very important in many industries. over the last decades, nanofluids have been developed. according to many researchers and publications on nanofluids, it is evident that nanofluids have a high thermal conductivity.
the aim of this experimental study was to investigate the change of the workpiece temperature during drilling using jatropha oil with iron nanoparticles and water with iron nanoparticles as lubricating and cooling fluids. the experiments were carried out with nanofluid samples of different nanoparticle volume ratios: samples j n 1, j n 5 and j n 10 of iron nanoparticles in the base jatropha oil with a nanoparticle volume fraction of 1 %, 5 % and 10 % respectively, and samples w n 1, w n 5 and w n 10 of iron nanoparticles in the base water with a nanoparticle volume fraction of 1 %, 5 % and 10 % respectively.
keywords: jatropha oil, nanofluids, nanoparticles, temperature, drilling.
1. introduction
cooling is a technical challenge that is faced by many high technology industries today due to the rising demands of modern technologies. conventional heat transfer fluids have a low thermal conductivity that limits their applications in high technology industries; therefore, there is an urgent need for more effective heat transfer fluids in terms of performance and heat storage capacity. many ideas have been implemented and many research activities have been carried out attempting to improve the thermal properties of heat transfer fluids. the thermal conductivity of heat transfer fluids can be enhanced by the use of metallic or non-metallic nanoparticles in a base fluid [1–4]. the resulting mixture is commonly known as a nanofluid. this concept has attracted the attention of many researchers worldwide due to the exciting thermal properties of nanofluids and their wide application in numerous fields, such as microelectronics, transportation, manufacturing, medicine, mining and energy. a nanofluid is a fluid containing nanometer-sized particles; the nanoparticles used in a base fluid are normally made from metals, oxides, carbides or carbon nanotubes [5, 6]. common conventional base fluids are water, ethylene glycol or oil. the key question still lingers concerning the best nanoparticle and base fluid pairing. according to many publications, a nanofluid is considered to be the next-generation heat transfer fluid, as it possesses good thermal properties (a high thermal conductivity, thermal diffusivity and convective heat transfer coefficients) compared to those of a conventional fluid like water or oil [5, 7]. in this study, the application of jatropha oil with iron nanoparticles in a drilling process was investigated and compared to that of water-based iron nanoparticle nanofluids with similar volume concentrations of nanoparticles. drilling is a cutting process that uses a drill bit to cut a hole of circular cross-section in solid materials. both a rotating force and a downward pushing force are needed for drilling. however, drill bits generate a lot of heat in the process of metal cutting. the excessive heat generation during the drilling process causes dullness and reduces the drill bit's effective life span. a cutting fluid is sometimes used to cool down the tool and the workpiece. besides cooling, cutting fluids also aid the cutting process by minimizing the cutting forces, thus saving energy, and by lubricating the interface, the contact between the tool's cutting edge and the workpiece. the lubrication also helps to improve the surface finish and prevents the chips from being welded onto the tool. most metalworking and machining processes can benefit from the use of a cutting fluid, depending on the workpiece material. one of the main properties that are sought after in a good cutting fluid is the ability to keep the workpiece at a stable temperature. the aim of this experimental investigation was to determine the change of the workpiece temperature during drilling using jatropha oil with iron nanoparticles and water with iron nanoparticles as lubricating and cooling fluids, in order to ascertain the cooling properties of both cooling fluid types.
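for orientation, the conductivity enhancement discussed in this introduction is often estimated with the classical maxwell mixture model; this is a textbook relation quoted as background, not a formula used by the authors:

def maxwell_k_eff(k_f, k_p, phi):
    """maxwell effective thermal conductivity of a dilute suspension:
    k_f, k_p = conductivities of base fluid and particles [w/m.k],
    phi = particle volume fraction."""
    num = k_p + 2.0 * k_f + 2.0 * phi * (k_p - k_f)
    den = k_p + 2.0 * k_f - phi * (k_p - k_f)
    return k_f * num / den

# e.g. water (k ~ 0.6 w/m.k) with 5 % iron particles (k ~ 80 w/m.k):
# maxwell_k_eff(0.6, 80.0, 0.05) -> roughly 1.15x the base-fluid value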
2. samples preparation and characterizations
in this experimental study, nanofluid samples were prepared employing the two-step technique, in which the nanoparticles are produced as a dry powder and the nanoscale powder is afterwards dispersed into the base fluids. according to nanofluids researchers, one of the major problems encountered in the preparation of nanofluids is the formation of clusters of particles, which leads to a poor dispersion; therefore, a high shear mixer was used in order to break up agglomerates and give more uniform dispersions. however, no dispersants or surfactants were used to stabilize the nanofluid samples. six nanofluid samples with different volume ratios were selected, as shown in table 1: samples j n 1, j n 5 and j n 10 of iron nanoparticles in the base jatropha oil with a volume fraction of 1 %, 5 % and 10 % respectively, and samples w n 1, w n 5 and w n 10 of iron nanoparticles in the base water with a volume fraction of 1 %, 5 % and 10 % respectively. samples w n 1, w n 5 and w n 10 were selected mainly for comparison purposes. moreover, another two samples without iron nanoparticles were selected for comparison purposes, i.e. an oil-water emulsion and undiluted jatropha oil, e100 and j 100 respectively. the oil-water emulsion used in this work was purchased in the czech republic. the blend ratios were selected according to ratios used in previous publications, and iron nanoparticles were selected due to the ease of suspension and their availability on the market. jatropha oil is a new material that is being researched for new applications. the water with iron nanoparticles samples were chosen because, in general, water is the heat transfer medium for heat exchangers, owing to its availability and good thermo-physical properties. iron nanoparticles with a diameter of 25 nm and a spherical shape were supplied by nano iron, s.r.o, czech republic. the jatropha oil used in this work is produced by the department of mechanical engineering at the university of applied science and technology in namibia. the samples of nanoparticles in the base jatropha oil as well as in the base water were prepared at the laboratory of nano iron s.r.o. according to safety regulation ec 1907/2006, the nanofluid samples used in this experiment were classified as not dangerous materials under the act on chemical substances and preparations. the nanofluid samples were stored in a cool place at a temperature between 1 and 5 °c, in order to avoid excess pressure build-up in the sample bottles. the nanofluid samples were kept in 500 ml tightly sealed plastic bottles. jatropha oil was selected as the primary base fluid to study potential applications of this oil, which is produced in large amounts in southern african development community (sadc) member states, including namibia. water was selected as the other base fluid for comparison purposes, since water is widely used in heat transfer applications.
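for the two-step preparation described above, the powder mass needed for a target volume fraction follows from m_p = phi * V * rho_p; a sketch with an assumed bulk density of iron (about 7870 kg/m3; the paper does not state the value used):

def particle_mass_for_fraction(phi, batch_volume_l, rho_p=7870.0):
    """mass of nanoparticle powder (kg) to disperse into a batch so that
    the particles occupy the volume fraction phi of the mixture volume."""
    v_total = batch_volume_l / 1000.0   # litres -> m^3
    return phi * v_total * rho_p

# e.g. sample j n 5 in a 0.5 l bottle: particle_mass_for_fraction(0.05, 0.5) -> ~0.197 kg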
however, the water used as the base fluid in these experiments was distilled in order to avoid the presence of ions, which might affect the thermo-physical properties and reduce the lifespan of the samples. iron nanoparticles were employed as the dispersed phase since they are easily produced as nanoparticles and are readily available on the market. in addition, the iron nanoparticles were used due to their high thermal conductivity.
3. experimental method
the experimental results of the drilling tests for all measurement samples are depicted in tables 2–9 and figures 3–10. figures 3–10 show the change in the temperature during drilling in specimens submerged in the fluid samples e100, j 100, j n 1, j n 5, j n 10, w n 1, w n 5 and w n 10. the graphs present both the increase and the decline in the temperature during the drilling and after the drilling was completed. figure 11 compares the results of samples e100, j 100, j n 1, j n 5, j n 10, w n 1, w n 5 and w n 10. during the experimental tests, the temperature for all samples increased gradually, reaching a peak and then decreasing slowly after the drilling depth of 10 mm was completed. the samples of jatropha oil with iron nanoparticles, j n 1, j n 5 and j n 10, had a shorter drilling time and a lower peak temperature than the sample with the oil-water emulsion e100 and the samples of water with nanoparticles, as shown in figure 11; this indicates good lubrication, anti-wear properties and cooling performance. the shortest drilling time and the lowest peak temperature were recorded in the experimental investigation of sample j n 10 (a drilling time of 29 seconds and a peak temperature of 33.3 °c, as shown in figure 7). the samples of water with iron nanoparticles had longer drilling times and higher peak temperatures than the sample with the oil-water emulsion e100. the longest drilling time and the highest peak temperature were recorded in the experimental investigation of sample w n 10 (a drilling time of 95 seconds and a peak temperature of 42.4 °c, as shown in figure 10). no wear was observed on the main cutting edges and the dead centre of the hss drill bits for any fluid sample.
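the peak values quoted above follow directly from the tabulated series; a sketch using the j n 10 and w n 10 series transcribed from tables 6 and 9 below:

import numpy as np

series = {  # time step 10 s; average temperatures in deg c, from tables 6 and 9
    "j n 10": [23.64, 24.57, 28.23, 33.36, 32.88],
    "w n 10": [23.55, 25.44, 28.33, 31.61, 33.94, 36.26, 38.00,
               40.21, 41.69, 42.20, 40.75],
}

for name, temps in series.items():
    t = np.asarray(temps)
    k = int(np.argmax(t))
    print(f"{name}: peak {t[k]:.1f} deg c at {10 * k} s")
# -> j n 10 peaks at 33.4 deg c (30 s); w n 10 at 42.2 deg c (90 s)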
sample | fluid volumetric fraction [%] | nanoparticles volumetric fraction [%]
e100 | 100 | –
j 100 | 100 | –
j n 1 | 99 | 1
j n 5 | 95 | 5
j n 10 | 90 | 10
w n 1 | 99 | 1
w n 5 | 95 | 5
w n 10 | 90 | 10
table 1. sample characterization.
figure 1. workpieces.
figure 2. experimental setup for measurement of temperature in drilling process.
time [s] | 0 | 10 | 20 | 30 | 40 | 50 | 60 | 70 | 80
average temperature [°c] | 24.08 | 25.41 | 28.35 | 30.18 | 31.41 | 32.26 | 33.37 | 34.18 | 32.87
standard deviation [°c] | 0 | 0.070 | 0.089 | 0.074 | 0.086 | 0.087 | 0.086 | 0.046 | 0.073
table 2. change in temperature vs. time e100.
figure 3. change in temperature vs. time e100.
time [s] | 0 | 10 | 20 | 30 | 40 | 50 | 60 | 70 | 80 | 90
average temperature [°c] | 24.62 | 24.06 | 25.15 | 26.97 | 28.88 | 30.57 | 31.81 | 33.60 | 35.60 | 35.94
standard deviation [°c] | 0 | 0.038 | 0.002 | 0.007 | 0.020 | 0.031 | 0.040 | 0.007 | 0.022 | 0.065
table 3. change in temperature vs. time j100.
figure 4. change in temperature vs. time j100.
time [s] | 0 | 10 | 20 | 30 | 40 | 50 | 60 | 70 | 80
average temperature [°c] | 23.60 | 24.07 | 25.00 | 26.56 | 28.49 | 30.48 | 31.81 | 33.10 | 32.87
standard deviation [°c] | 0 | 0.037 | 0.006 | 0.059 | 0.063 | 0.041 | 0.040 | 0.048 | 0.008
table 4. change in temperature vs. time jn1.
figure 5. change in temperature vs. time jn1.
time [s] | 0 | 10 | 20 | 30 | 40 | 50
average temperature [°c] | 23.67 | 24.94 | 28.46 | 32.45 | 33.55 | 33.28
standard deviation [°c] | 0 | 0.050 | 0.069 | 0.064 | 0.078 | 0.088
table 5. change in temperature vs. time jn5.
figure 6. change in temperature vs. time jn5.
time [s] | 0 | 10 | 20 | 30 | 40
average temperature [°c] | 23.64 | 24.57 | 28.23 | 33.36 | 32.88
standard deviation [°c] | 0 | 0.023 | 0.031 | 0.042 | 0.012
table 6. change in temperature vs. time jn10.
figure 7. change in temperature vs. time jn10.
time [s] | 0 | 10 | 20 | 30 | 40 | 50 | 60 | 70 | 80 | 90 | 100
average temperature [°c] | 23.58 | 24.78 | 26.75 | 29.02 | 31.59 | 33.52 | 34.47 | 35.32 | 36.73 | 37.73 | 37.82
standard deviation [°c] | 0 | 0.047 | 0.410 | 0.040 | 0.500 | 0.044 | 0.049 | 0.430 | 0.049 | 0.049 | 0.053
table 7. change in temperature vs. time wn1.
figure 8. change in temperature vs. time wn1.
time [s] | 0 | 10 | 20 | 30 | 40 | 50 | 60 | 70 | 80 | 90 | 100
average temperature [°c] | 23.60 | 29.34 | 29.93 | 30.93 | 32.05 | 33.51 | 34.68 | 36.15 | 37.83 | 39.22 | 39.25
standard deviation [°c] | 0 | 0.059 | 0.059 | 0.061 | 0.068 | 0.069 | 0.068 | 0.084 | 0.074 | 0.071 | 0.057
table 8. change in temperature vs. time wn5.
figure 9. change in temperature vs. time wn5.
time [s] | 0 | 10 | 20 | 30 | 40 | 50 | 60 | 70 | 80 | 90 | 100
average temperature [°c] | 23.55 | 25.44 | 28.33 | 31.61 | 33.94 | 36.26 | 38.00 | 40.21 | 41.69 | 42.20 | 40.75
standard deviation [°c] | 0 | 0.056 | 0.053 | 0.043 | 0.046 | 0.046 | 0.058 | 0.065 | 0.065 | 0.047 | 0.055
table 9. change in temperature vs. time wn10.
figure 10. change in temperature vs. time wn10.
figure 11. change in temperature vs. time – comparison.
4. conclusion
the experimental study was conducted to investigate the application of jatropha oil with iron nanoparticles in drilling processes. according to the presented results, the following conclusions are drawn:
• the addition of nanoparticles to jatropha oil improves the lubrication and cooling performance.
• during the drilling experimental tests, the temperature for all samples increased gradually, reaching a peak and then decreasing slowly after a drilling depth of 10 mm was reached.
• no wear was observed on the main cutting edges and the dead centre of the hss drill bits for any fluid sample.
references
[1] s. choi, j. eastman: enhancing thermal conductivity of fluids with nanoparticles. in: proceedings of the asme mechanical engineering congress and exposition, pp. 99–104, usa, 1995.
[2] k. das, s. choi, t. praddep (eds.): nanofluids science and technology. john wiley & son, new jersey, 2007.
[3] e. timofeeva: nanofluids for heat transfer – potential and engineering strategies. in: two phase flow, phase change and numerical modeling, pp. 435–450, 2011. doi:10.5772/22158.
[4] a. wakif, z. boulahia, r. sehaqui: numerical study of the onset of convection in a newtonian nanofluid layer with spatially uniform and non uniform internal heating. journal of nanofluids 6:136–148, 2017. doi:10.1166/jon.2017.1293.
[5] s. ozerinc: heat transfer enhancement with nanofluids. master's thesis, school of natural and applied science, middle east technical university, 2010.
[6] w. minkowycz, e. sparrow, j. abraham (eds.): nanoparticle heat transfer and fluid flow, vol. 4. taylor & francis group, new york, 2013.
[7] m. pastoriza gallego, l. lugo, j. l. legido soto, m. piñeiro: thermal conductivity and viscosity measurements of ethylene glycol-based al2o3 nanofluids. nanoscale research letters 6:221, 2011. doi:10.1186/1556-276x-6-221.
temperature effects in acoustic resonators
m. červenka, m. bednařík, p. koníček
this paper deals with problems of nonlinear standing waves in axisymetrically shaped acoustic resonators where a mean temperature is distributed along the axis.
keywords: nonlinear standing wave, acoustic resonator, temperature changeover.
1 introduction
the mean temperature distribution in resonators with high amplitudes of the acoustic field is caused by thermodynamic state changes due to the mean pressure and density changeover, and in addition by the conversion of acoustic energy into heat due to viscous losses. the mean temperature distribution needs to be known in some appliances, such as thermoacoustic engines. the one-dimensional model equation for nonlinear standing waves of the 2nd order including a viscous boundary layer is modified to describe the acoustic standing waves in a resonator with a longitudinal distribution of mean temperature. it is assumed that the mean temperature changes are small.
2 model equations
let us consider a one-dimensional acoustic field in an axisymetrically shaped gas-filled resonator driven by means of an external force. the model equation describing the acoustic field is derived from the basic equations of fluid mechanics. the one-dimensional form of these equations with terms up to the 2nd order is presented here: the navier-stokes equation, see [1],
$$\rho_0\frac{\partial v}{\partial t}+\rho'\frac{\partial v}{\partial t}+\rho_0 v\frac{\partial v}{\partial x}=\left(\rho_0+\rho'\right)a-\frac{\partial p'}{\partial x}+\left(\zeta+\frac{4}{3}\mu\right)\frac{\partial}{\partial x}\left[\frac{1}{r^2}\frac{\partial}{\partial x}\left(r^2 v\right)\right]\qquad(1)$$
the continuity equation taking into account the boundary layer, see [2, 3],
$$\frac{\partial\rho'}{\partial t}+\frac{\rho_0}{r^2}\frac{\partial}{\partial x}\left(r^2 v\right)+\frac{1}{r^2}\frac{\partial}{\partial x}\left(r^2\rho' v\right)=\frac{2\rho_0\sqrt{\nu_0}}{r}\left(1+\frac{\gamma-1}{\sqrt{\mathrm{Pr}}}\right)\frac{\partial^{-1/2}}{\partial t^{-1/2}}\frac{\partial v}{\partial x}\qquad(2)$$
where the fractional derivative represents an integrodifferential operator
$$\frac{\partial^{-1/2}f(t)}{\partial t^{-1/2}}=\frac{1}{\sqrt{\pi}}\int_{-\infty}^{t}\frac{f(t')}{\sqrt{t-t'}}\,\mathrm{d}t'\qquad(3)$$
and the thermodynamic state equation taking into account heat conductivity and mean temperature change, see [4],
$$p'=c^2\rho'+\frac{\gamma-1}{2\rho_0}c_0^2\rho'^2-\kappa\left(\frac{1}{c_V}-\frac{1}{c_p}\right)\frac{1}{r^2}\frac{\partial}{\partial x}\left(r^2 v\right)\qquad(4)$$
where ρ', p', v are the acoustic density, pressure and velocity, p₀, ρ₀, θ₀ are the equilibrium state pressure, density and temperature, θ is the mean temperature, θ' = θ − θ₀, a = a(t) is the driving acceleration, x is the spatial coordinate along the resonant cavity, t is time, r = r(x) is the radius of the resonator, γ = c_p/c_V is the ratio of specific heats at constant pressure and volume, κ is the coefficient of thermal conduction, ν₀ is the kinematic viscosity, ζ, μ are the coefficients of bulk and shear viscosity, Pr is the prandtl number, c₀ is the small-signal sound speed at the equilibrium temperature θ₀ and c is the small-signal sound speed at the changed mean temperature, defined as
$$c^2=c_0^2\left(1+\frac{\theta'}{\theta_0}\right).$$
after linearization, from equations (1), (2) and (4) we obtain the following relations with terms of the 1st order
$$p'=-\rho_0\left(\frac{\partial\phi}{\partial t}-ax\right),\qquad(5)$$
$$\rho'=-\frac{\rho_0}{c^2}\left(\frac{\partial\phi}{\partial t}-ax\right),\qquad(6)$$
$$\frac{1}{r^2}\frac{\partial}{\partial x}\left(r^2\frac{\partial\phi}{\partial x}\right)=\frac{1}{c^2}\left(\frac{\partial^2\phi}{\partial t^2}-x\frac{\mathrm{d}a}{\mathrm{d}t}\right),\qquad(7)$$
where φ is the velocity potential, v = ∂φ/∂x. for deriving the model equation, the following method is used: the linearized equations, eqs. (5), (6), (7), are substituted into terms of the 2nd order, and so the resulting error is of the 3rd order. it is assumed that the mean temperature changeover is small and that its time and spatial derivatives are of the 2nd order.
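the half-order operator of eq. (3) is convenient to discretize by integrating the kernel exactly over each time step (piecewise-constant f); a sketch of that quadrature, with names of our own choosing and f assumed zero for t < 0:

import numpy as np

def half_order_integral(f, dt):
    """d^{-1/2} f / dt^{-1/2} on a uniform grid, assuming f = 0 for t < 0.
    uses the exact integral of 1/sqrt(t_n - t') over each step."""
    n = len(f)
    out = np.zeros(n)
    for i in range(1, n):
        k = np.arange(i)                                  # contributing past steps
        w = 2.0 * (np.sqrt(i - k) - np.sqrt(i - k - 1))   # kernel integral per step / sqrt(dt)
        out[i] = np.sqrt(dt) / np.sqrt(np.pi) * np.dot(w, f[k])
    return out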
(6), introducing the velocity potential and integrating with respect to the spatial coordinate x, we obtain the relation

[eq. (8): 2nd-order relation for the velocity potential; illegible in this copy]

we can eliminate the pressure p' from eq. (8) using eq. (4). thus we get

[eq. (9): illegible in this copy]

after differentiating eq. (9) with respect to time, we eliminate the time derivative ∂ρ'/∂t using eq. (2), thereby obtaining a model equation describing the nonlinear standing waves in an acoustic resonator with a spatial distribution of mean temperature. the derived model equation has the form

[eq. (10): the model equation; illegible in this copy]

where

$$b = \frac{1}{\rho_0}\left[\frac{4}{3}\mu + \eta + \kappa\left(\frac{1}{c_v} - \frac{1}{c_p}\right)\right]$$

is the diffusivity of sound. this model equation is written in the coordinates moving together with the resonator body; consequently, the boundary conditions have the form ∂φ/∂x = v = 0 for x = 0 and x = l, where l is the resonator cavity length. the acoustic pressure and density can be calculated from the numerical solution of eq. (10); the applicable equations are derived from relations (8) and (9):

[eqs. (11) and (12): expressions for p' and ρ'; illegible in this copy]

a one-dimensional model equation describing the temperature distribution is derived from the energy equation for an ideal gas, see [5],

[eq. (13): the energy equation with viscous dissipation and heat conduction terms; illegible in this copy]

where δij is the kronecker delta, the indices i, j, ℓ run from 1 to 3 and vi represents the three components of the acoustic velocity vector. a one-dimensional form of eq.
(13) is obtained using the relation for the one-dimensional divergence in an axisymetrically shaped waveguide, see [1],

[eq. (14): illegible in this copy]

where y is the spatial coordinate normal to the resonator wall. the last term of eq. (14) describes the heat flow through the boundary layer to the resonator wall. for the temperature in the resonator cavity, we can write θ(x, y, t) = θb(x, y, t) + θm(x, t), where θb is the temperature in the boundary layer and θm is the mainstream temperature. it is easy to show, see [3], that

[eqs. (15) and (16): the boundary-layer temperature expansion and its derivative at the wall; illegible in this copy]

integrating eq. (14) with respect to the resonator cross-section with the help of the divergence theorem yields

[eq. (17): illegible in this copy]

where σ is the resonator cross-section and ∂σ is its boundary. after substituting relation (16) into eq. (17) and calculating the integrals (the acoustic quantities are assumed to be constant with respect to the integration domain), we obtain the model equation for the temperature distribution

[eq. (18): illegible in this copy]

eq. (10) is solved numerically in the frequency domain, and the driving acceleration a is assumed to be periodic. owing to the numerical instability of eq. (18) when solved in the frequency domain, the mean temperature change is estimated from the thermodynamic state equation (it is assumed that all the heat generated in the resonator cavity due to viscous losses is conducted out through the resonator cavity walls) as

$$\overline{\theta'} = \frac{\gamma-1}{\gamma}\,\frac{\theta_0}{p_0}\,\overline{p'}. \tag{19}$$

3 numerical results and conclusions

fig. 1 shows the spatial distribution of the mean temperature in the cylindrical, conical and bulb resonators due to the medium thermodynamic state change in an intensive sound field. the temperature was estimated using eq. (19). the resonator driving acceleration is a = 3000 m·s−2, the driving frequency agrees with the 1st natural frequency of the resonant cavities, the length of the resonators is l = 0.17 m, and the radii of the resonators are:
- cylindrical resonator: r(x) = 0.01 m,
- conical resonator: r(x) = 0.0056 + 0.268x m,
- bulb resonator: a profile built from the constants 0.003 m and 0.005 m with sin³ and exponential terms (the full expression is illegible in this copy).

fig. 1: mean temperature distribution in a cylindrical, conical and bulb resonator.

it can be seen that the mean temperature maxima are found near the pressure antinodes. fig. 2 compares the acoustic velocity spectrum distribution in a conical resonator where the influence of the temperature distribution is taken into account (dashed line) and where it is not taken into account (solid line). fig. 3 compares the frequency characteristics for the same resonator. the numerical results show a slight resonant frequency shift and a waveform changeover if these mean temperature changes in the medium are taken into account.
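the half-order operator of eq. (3) is the only non-standard ingredient when the model is discretised in time. the following sketch is our own illustration (not the authors' frequency-domain solver): it assumes f(t) = 0 for t < 0 and integrates the kernel exactly over each step of a piecewise-constant f.

```python
import numpy as np

def half_order_operator(f, dt):
    """(1/sqrt(pi)) * integral_0^{t_n} f(t')/sqrt(t_n - t') dt' for n = 1..len(f),
    with f piecewise constant on each step and f(t) = 0 for t < 0 (quiescent past)."""
    t = np.arange(len(f) + 1) * dt
    out = np.zeros(len(f))
    for n in range(1, len(f) + 1):
        # exact integral of the kernel 1/sqrt(t_n - t') over each step [t_k, t_{k+1}]
        w = 2.0 * (np.sqrt(t[n] - t[:n]) - np.sqrt(t[n] - t[1:n + 1]))
        out[n - 1] = (w @ f[:n]) / np.sqrt(np.pi)
    return out

# check against the closed form for f = 1: the operator gives 2*sqrt(t/pi)
dt, n = 1.0e-3, 2000
num = half_order_operator(np.ones(n), dt)
t = np.arange(1, n + 1) * dt
print(np.max(np.abs(num - 2.0 * np.sqrt(t / np.pi))))  # ~ machine precision
```

because the kernel is integrated exactly, the check with a constant input reproduces the analytical result to rounding error, which makes the routine a convenient building block for testing a time-domain discretisation of eq. (2).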
fig. 4 shows an example of a thermoacoustic engine. the upper figure shows the mean temperature distribution, and the bottom figure compares the acoustic velocity spectrum if the mean temperature distribution is taken into account (dashed line) and if the mean temperature distribution is not taken into account (solid line).

fig. 2: distribution of the acoustic velocity spectrum in a conical resonator. dashed line: temperature distribution is taken into account; solid line: temperature distribution is not taken into account.

fig. 3: frequency characteristics of a conical resonator. dashed line: temperature distribution is taken into account; solid line: temperature distribution is not taken into account.

fig. 4: mean temperature distribution in a cylindrical resonator due to external heating and cooling (upper figure); comparison of acoustic velocity spectra in the resonator if the mean temperature distribution is taken into account (dashed line) and if it is not taken into account (solid line).

references
[1] ilinskii, y. a., lipkens, b., lucas, t. s., van doren, t. w., zabolotskaya, e. a.: nonlinear standing waves in an acoustical resonator. j. acoust. soc. am., vol. 104, 1998, p. 2664–2674.
[2] bednařík, m., červenka, m.: nonlinear waves in resonators. proc. of the 15th isna, goettingen (germany), 1999.
[3] chester, w.: resonant oscillations in closed tubes. j. fluid mech., vol. 18, 1964, p. 44–64.
[4] makarov, s., ochmann, m.: nonlinear and thermoviscous phenomena in acoustics, part i. acustica – acta acustica, vol. 82, 1996, p. 579–605.
[5] blackstock, d. t.: fundamentals of physical acoustics. new york: john wiley & sons, inc., 2000, p. 77–84.

ing. milan červenka, phone: +420 224 353 975, e-mail: cervenm3@feld.cvut.cz
dr. ing. michal bednařík, phone: +420 224 352 308, e-mail: bednarik@feld.cvut.cz
dr. mgr. petr koníček, phone: +420 224 352 329, e-mail: konicek@feld.cvut.cz
department of physics, czech technical university in prague, faculty of electrical engineering, technická 2, 166 27 prague, czech republic

acta polytechnica 61(4):570–578, 2021. https://doi.org/10.14311/ap.2021.61.0570
© 2021 the author(s). licensed under a cc-by 4.0 licence. published by the czech technical university in prague.

utilization of agricultural waste adsorbent for the removal of lead ions from aqueous solutions

adeyinka sikiru yusuff a,∗, lekan taofeek popoola a, victor anochie b

a afe babalola university, college of engineering, department of chemical & petroleum engineering, afe babalola way, ado-ekiti, ekiti state, nigeria
b baze university, faculty of engineering, department of petroleum and gas engineering, abuja, nigeria
∗ corresponding author: yusuffas@abuad.edu.ng

abstract. this work investigated the potentiality of using chemically modified onion skin waste (cmosw) as an adsorbent for the removal of lead ions (pb2+) from an aqueous solution. the material properties were characterized using techniques such as brunauer-emmett-teller surface area analysis, scanning electron microscopy (sem) and fourier transform infrared (ftir) spectroscopy. the effects of adsorbent dosage, contact time, ph and initial pb2+ concentration on the removal efficiency were investigated by experimental tests.
the experimental data were analysed by the langmuir and freundlich isotherms, while kinetic data obtained at different concentrations were analysed using a pseudo-first-order and pseudo-second-order models. a distinct adsorption of pb2+ was revealed by the sem results. from the ftir analysis, the experimental result was corresponded to the peak changes of the spectra obtained before and after the adsorption of pb2+. the maximum removal efficiency of pb2+ by the cmosw was 97.3 ± 0.01 % at an optimum cmosw dosage of 1.4 g/l, contact time of 120 min and solution ph of 6.0. experimental data obtained fitted well with the freundlich isotherm model. the kinetics of the pb2+ adsorption by cmosw appeared to be better described by the pseudo-second-order model, suggesting the chemisorption mechanism dominance. keywords: lead ions, onion skin, adsorption, isotherm, kinetics. 1. introduction the increased level of environmental contamination as a result of industrial development by compounds of organic and inorganic sources has become one of the major environmental concerns of many industries [1]. most water contaminants, such as hydrocarbon, phenolic, dyes, solvent and heavy metals, are soluble chemical substances [2]. these compounds end up in water bodies, causing water and soil pollutions, and thereby constitute threats to plants, animals and human health [1]. however, the presence of toxic heavy metals in industrial effluents gives a cause for environmental concern [3]. heavy metals are generally toxic, highly soluble in water and can easily find their way into the soil and flowing streams, thereby causing damage to the environment and human health. lead is known as multifunctional metal, which is a needed element to manufacture pipe, paints, bullets and also one of the essential metals used in the pewter industry [4]. however, a long term exposure to lead ions (pb2+) can cause mental illness, infertility in women and damage to vital human internal organs [5]. thus, there is a need to adopt a technology suitable for getting rid of toxic heavy metals and other dissolved contaminants from effluents prior to their discharge into flowing streams. the nigerian ministry of environment recommends 0.01 mg/l as the maximum permitted level of pb2+ in an industrial effluent before its release into surface waters [6] while it has a maximum allowable limit (sets by world health organization, who) of 0.003 mg/l in drinking water [7]. however, many wastewater treatment methods have been employed in removing toxic heavy metals from the aqueous environment. these methods include electrocoagulation, reverse osmosis, chemical precipitation, electrochemical reduction, ion-exchange membrane and adsorption [8, 9]. among these treatment techniques, adsorption seems to be cost-effective, easy and simple to operate [10]. another advantage of adsorption over other techniques is its capability to treat wastewater pollutants at low concentrations [11]. some of the porous materials that have been synthesized as adsorbents for the removal of toxic heavy metals include activated carbon, molecular sieve and zeolite. these materials are often used to remove toxic metal ions from contaminated water due to their better surface properties and adsorption capacities [12]. however, they are expensive and challenging to be regenerated and reused. these drawbacks necessitate the need to explore cheap, reusable and biodegradable adsorbents for the removal of toxic metal ions from aqueous solutions. 
adsorbents could be obtained from different sources, such as naturally occurring materials, agricultural biomass and waste materials [13–15]. in order to minimize the wastewater treatment costs and avoid the accumulation of solid waste in the environment, agricultural wastes have been suggested and studied, such as pineapple stem [16], pineapple fruit peel [17], waste tea [18], walnut shell [19], peanut shell [20] and seed pod [21]. moreover, onion skins are abundantly available and can be a good source of electrostatically active metal oxides [22]. according to the literature research, a modification of onion skin waste by acid activation is currently being used, and a series of technical challenges are being experienced [23–25]. a promising use of onion skin requires a modification that does not severely damage its structure. metal oxides, such as aluminium oxide (al2o3) and silicon oxide (sio2), could be used to modify the structure of the adsorbent due to their better textural properties and mechanical and thermal stabilities [26, 27]. the application of al2o3-modified onion skin waste for the removal of heavy metals from an aqueous solution is still not well explored. the aim of the current work was to investigate the potentiality of utilizing onion skin waste as an adsorbent for the removal of lead ions from an aqueous solution. the effect of the operational parameters on the uptake efficiency, including initial concentration, ph, adsorbent dosage and contact time, was investigated. the equilibrium adsorption isotherm and kinetics were also studied.

2. methodology
2.1. materials
onion skin waste was collected from oja-oba, ado-ekiti, nigeria. lead (ii) nitrate (pb(no3)2), hydrochloric acid (hcl), sodium hydroxide (naoh) and aluminium nitrate (al(no3)3) were all bought from nizochem chemical enterprise, akure, nigeria. pb(no3)2 was used as the adsorbate. a 1000 mg/l lead salt solution was prepared by dissolving 1.60 g of pb(no3)2 in 1 l of deionized water, and solutions with different concentrations were prepared from this standard solution.

2.2. preparation and characterization of adsorbent
the collected onion skin waste (osw) was thoroughly washed with clean water to get rid of dirt particles and dried at 110 °c for 24 h. then, the dried osw was ground and finally sieved through a 220 µm mesh to remove larger parts. in order to prepare the chemically modified onion skin waste (cmosw) adsorbent, the osw powder was mixed with al(no3)3 in a ratio of 1:2, and 150 ml of distilled water was added to the resulting solid mixture. thereafter, the solution was agitated on a hot plate at 75 °c for 12 h and nh4oh solution was subsequently added dropwise to obtain a basic solution and achieve a complete precipitation. the precipitate was washed several times with deionized water, then dried at 110 °c, and finally calcined at 600 °c for 2.5 h. a morphological analysis was conducted on the cmosw sample to evaluate its surface morphology before and after the adsorption of lead ions using a scanning electron microscope (sem, jeol-jsm 7650f). a fourier transform infrared (ftir) spectrometer (ftir-1s shimadzu, japan) was used to determine the functional groups present on the prepared adsorbent. the spectra were recorded within the range of 4000–400 cm−1.
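the stock solution of section 2.1 can be checked with a few lines of arithmetic. the sketch below is our own illustration; the molar masses are standard values, and the 50 ml working volume is the one used in the batch tests of section 2.3.

```python
# molar masses (g/mol), standard values
M_PB = 207.2    # lead
M_SALT = 331.2  # pb(no3)2

mass_salt = 1.60  # g of pb(no3)2 dissolved in 1 l (section 2.1)
stock = mass_salt * (M_PB / M_SALT) * 1000.0  # mg/l of pb2+
print("stock pb2+ concentration: %.0f mg/l" % stock)  # ~1001 mg/l, i.e. the nominal 1000 mg/l

# dilution of the stock to the working concentrations (c1*v1 = c2*v2, 50 ml flasks)
for target in (10, 50, 100, 200):  # mg/l
    v1 = target / stock * 50.0     # ml of stock made up to 50 ml with deionized water
    print("%3d mg/l -> %.2f ml of stock" % (target, v1))
```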
the textural characteristics of the cmosw samples including specific surface area, total pore volume and pore size distribution were obtained through n2 adsorption-desorption data at -196 °c using a belsorp max surface area and porosity analyser (bel, japan). the studied samples were degassed before the analysis at 250 °c for 3 h to remove the adsorbed gases that filled up their pores, and their specific surface areas were computed using the brunauer-emmett-teller (bet) model. the total pore volume and pore size were determined by the barrett-joyner-halenda (bjh) method and studied near the saturation pressure of n2 (p/p0 = 0.99). 2.3. adsorption experiments a required volume of lead salt contaminated water (50 ml) and the needed amount of the prepared adsorbent were charged into 250 ml conical flasks. the mixtures were agitated in a temperature-controlled water bath shaker at 28 ± 2 °c and 200 rpm until an equilibrium was attained. the adsorption of pb2+ ions onto cmosw was carried out under these operating conditions: initial pb2+ concentration (10-200 mg/l), solution ph (2-10), cmosw dosage (0.2-1.6 g/l) and contact time (30-180 min). after the batch adsorption experiment was completed, the adsorbent was removed from the suspension using a centrifuge and the residual pb2+ concentration was measured by an atomic absorption spectrophotometer (model vgp 210, us). the quantity of pb2+ removed at the equilibrium, qe (mg/g) and the removal percentage, y (%) were determined as follows: qe = (co − ce) × v m (1) y = (co − ce) co × 100 % (2) 2.4. adsorption isotherm two-parameter isotherm model (langmuir and freundlich) was employed in evaluating the experimental data for the pb2+ adsorption by cmosw. the nonlinear isotherm models are as given by eqs. 3 and 4. qe = qmaxbce (1 + bce) (langmuir model) (3) qe = kf c 1/ne (freundlich model) (4) 571 a. s. yusuff, l. t. popoola, v. anochie acta polytechnica (a). (b). figure 1. sem micrographs of cmosw sample (a) prior to and (b) after pb2+ ions adsorption. separation factor (rl), a dimensionless parameter, was applied in determining the nature of pb2+ adsorption onto the cmosw adsorbent. the dimensionless parameter signifies whether the adsorption process is favourable (0 < rl < 1), non-favourable (rl > 1), linear (rl = 1) or irreversible (rl = 0). rl = 1 (1 + bco ) (5) 2.5. adsorption kinetics adsorption kinetics describes the rate at which the amount of adsorbate removed varies with time. the kinetic data for the adsorption of the pb2+ onto chemically modified onion skin were analysed using pseudofirst-order and pseudo-second-order kinetic equations. log (qe − qt) = log qe − ( k1 2.303 ) t (6) (pseudo-first-order) t qt = 1 k2q2e + ( 1 qe ) t (7) (pseudo-second-order) 3. results and discussion 3.1. characterization of cmosw adsorbent the morphological structure of the prepared cmosw before and after the pb2+ adsorption was determined using the sem analysis, and the micrographs are shown in figure 1. figrue 1a revealed that the prepared adsorbent had an almost regular surface containing a long but tiny cavity and pores of different sizes, suggesting a reasonable possibility of a rapid adsorption of the pb2+ ions. however, upon adsorption of the cation, that large cavity, earlier observed on the fresh cmosw sample, was filled as a result of the pb2+ adsorption. 
the ftir spectrum of the cmosw adsorbent before the adsorption of pb2+ is displayed in figure 2(a), with several bands at 3496.21 cm−1 (o-h stretching vibration), 2968.87 cm−1 and 2980.24 cm−1 (c-h asymmetric and symmetric stretching), 1654.06 cm−1 (c=o deformation), 1380.71 cm−1 (ch3 deformation), 1086.23 cm−1 (-c-nh3, primary aliphatic amine), 652.45 cm−1 (c-o-h twist, broad) and 458.34 cm−1 (c-n-c bending modes). the detection of all these functional groups was an indication that the cmosw adsorbent was complex. however, upon adsorption of the metal ions (figure 2(b)), some peaks shifted, and new bands at 3452.26 cm−1 (o-h stretching vibration), 1698.17 cm−1 (c=o deformation) and 400.57 cm−1 (-c-n-c bending modes) were also formed.

figure 2. ftir spectra of cmosw sample (a) prior to and (b) after pb2+ ions adsorption.

| sample | specific area (m²/g) | pore volume (cm³/g) | pore diameter (å) |
| fresh cmosw | 26.5 | 0.17 | 317.9 |
| cmosw loaded with pb2+ | 2.8 | 0.01 | 201.7 |

table 1. textural characteristics of cmosw before and after adsorption of lead ions.

table 1 presents the textural properties of the prepared cmosw adsorbent before and after the loading of lead ions. the results showed that the fresh adsorbent possessed a large surface area and pore size distribution, which was an indication that the studied sample had several sorption sites on its surface. the bet surface area of the prepared composite adsorbent was 26.5 m²/g, as seen in table 1, which was above the surface areas of the acid-modified onion skin (10.62 m²/g) [23] and garlic waste (5.62 m²/g) [28]. however, there was a significant reduction in the textural properties (surface area, total pore volume and average pore diameter) of the cmosw sample after the adsorption of pb2+. these observations were attributed to the agglomeration or overlapping of the sorption sites available for the adsorbates, as corroborated by the sem result.

3.2. influence of adsorption process parameters
3.2.1. influence of initial lead ions concentration
the influence of the initial pb2+ concentration (10–200 mg/l) on the adsorption process was investigated by treating the lead salt contaminated solution at room temperature and ph of 6.0 for 120 min using 1.0 g/l of the cmosw sample. figure 3 revealed that the removal percentage decreased and the equilibrium adsorption capacity increased as the lead ions concentration increased. this observation was due to the fixed number of adsorption sites available for the increasing adsorbate concentration. it could be deduced that, at a higher concentration of pb2+ ions, the sorption sites on the cmosw surface for the electrostatic attraction were occupied by more contaminants, thus leading to an increase in the equilibrium uptake capacity of the cmosw adsorbent and a decrease in the removal percentage of the adsorbate [29]. similar observations were also reported for an adsorptive removal of heavy metals by moringa stenopetala bark powder [30], prawn shells [31] and paper mill sludge activated carbon [10].

figure 3. influence of initial concentration on pb2+ adsorption onto cmosw.

3.2.2. influence of contact time
figure 4 shows the effect of the contact time on the removal percentage and equilibrium adsorption capacity at a fixed pb2+ concentration of 50 mg/l, ph of 6 and a cmosw dosage of 1.0 g/l. the figure revealed that both the removal percentage and equilibrium adsorption capacity speedily increased with an increase in the contact time and then decreased until an equilibrium was attained. generally, a prolonged contact time favours the adsorption process by enhancing the uptake of the adsorbate and also increasing the equilibrium sorption capacity of the adsorbent [16]. as seen in the figure, after the contact time reached 120 min, both the removal percentage and the sorption capacity plots became nearly flat, which indicated that the adsorption equilibrium had been reached. this observation agrees with the work reported for a pb2+ removal from aqueous solutions by inactive biomass [5], moringa stenopetala bark [30] and dead anaerobic biomass [32].

figure 4. influence of contact time on pb2+ adsorption onto cmosw.

3.2.3. influence of adsorbent dosage
to investigate the influence of the cmosw dosage on the adsorption of lead ions, we used different quantities
application of two-parameter isotherm to pb2+ adsorption onto cmosw. the monolayer adsorption capacity of cmosw was compared with other reported adsorbents for the adsorption of pb2+ from an aqueous solution at different optimum operational parameters, as shown in table 3. the table showed that the cmosw adsorbent had a strong affinity for the pb2+ removal from an aqueous solution by exhibiting a high maximum uptake capacity of 7.16 mg/g. this value was obtained under the optimum process condition of 50 mg/l initial concentration (co), 120 min contact time (t), 1.40 g/l 574 vol. 61 no. 4/2021 utilization of agricultural waste adsorbent for . . . isotherm value unit langmuir qmax 7.16 mg·g−1 b 0.09 l·mg−1 r2 0.9847 – rl 0.05 – average relative error 0.128 – freundlich kf 1.19 mg·g−1(l·mg−1)1/n n 2.50 – r2 0.9881 – average relative error 0.050 – table 2. values of two-parameter isotherm constants. adsorbent source qm experimental condition reference (mg/g) co (mg/l) t (min) d (g/l) ph t (°c) apricot stone 21.38 50 – 1.0 6.0 20 [37] hazelnut husk 13.05 200 – 12.0 5.7 18 [35] sand 21.78 50 30 2.0 2.0 65 [38] groundnut shell 3.428 75 90 8.0 6.0 – [39] soya bean 0.55 – 60 3.0 4.0 ± 0.26 37 [40] moringa stenopetala bark 35.71 10 120 1.5 5.0 40 [30] cmosw 7.16 50 120 1.4 6.0 28 ± 2 present study table 3. comparison of the adsorption capacities of different adsorbents for pb2+ removal. adsorbent dosage (d), ph of 6.0, and 28 ± 2 °c temperature (t). these findings indicated that cmosw was an efficient adsorbent for lead ions removal from water/wastewater, mostly when compared with those adsorbents derived from groundnut shell [39] and soya bean [40]. 3.4. adsorption kinetics the plot of log(qe − qt) against t (pseudo-first-order) and t/qt against t (pseudo-second-order), from which the rate constants (k1 and k2) and predicted quantity of pb2+ adsorbed at the equilibrium (qe) were determined, are shown in figure 8. the estimated kinetic parameters and r2 values are given in table 4. the results revealed that the predicted qe did not correlate with the observed qe in the case of pseudofirst-order kinetic. besides, its r2 value (0.9773) was lower than that of the pseudo-second-order kinetic model. therefore, the pseudo-second-order kinetic model was applied to evaluate the obtained adsorption data. the result obtained showed that the predicted qe was found to be in agreement with the experimental qe with the value of the coefficient of determination being 0.9996, which rendered it suitable for describing the adsorption kinetics of pb2+ onto cmosw. figure 8. pseudo-first-order and pseudo-second-order kinetics for pb2+ removal by cmosw. 3.5. adsorption mechanism the lead ions covered the surface and diffused into the pores of the cmosw adsorbent by capillary force, which was confirmed by the results of the textural properties (surface area and pore size) analysis (table 1). thus, the cmosw adsorbent with better 575 a. s. yusuff, l. t. popoola, v. anochie acta polytechnica kinetic model parameter values unit pseudo-first-order qe (exp) 1.443 mg·g−1 qe (cal) 1.337 mg·g−1 k1 0.016 min−1 r2 0.9773 – pseudo-second-order qe (cal) 1.462 mg·g−1 k2 0.0086 g·mg−1·min r2 0.9996 – table 4. kinetic models and their parameters for pb2+ adsorption onto cmosw. textural characteristics would adsorb more lead ions. the capillary force within the mesopore facilitated the diffusion of the metal ions, which enhanced the overall adsorption capacity. 
additionally, the electrostatic interaction, which occurred between the lead ions and negative functional groups (o-h and c=o) on the adsorbent surface, also aided the adsorption. therefore, the adsorptive removal of lead ions by cmosw involved a synergy between the capillary forces and electrostatic interactions [41]. 4. conclusion and future recommendations this study has revealed that an adsorbent prepared from onion skin was efficient for the removal of pb2+ from an aqueous solution. the freundlich isotherm model provided a better fit of the equilibrium adsorption data, indicating a multilayer adsorbate-adsorbent system with the dominance of the chemisorption. the maximum removal efficiency of 97.3 ± 0.01 % was achieved at an optimum adsorbent dosage of 1.4 g/l, a contact time of 120 min and ph of 6.0. the pseudosecond-order model proved to best describe the kinetic data. the change in the adsorbent structure and peak changes of the spectra after adsorption, as revealed by sem and ftir analyses, respectively, suggested a distinct adsorption of the pb2+ onto cmosw. the removal of pb2+ by cmosw via an adsorption process experiment led to encouraging results, and we authors wish to achieve the same breakthrough in an adsorption column mode under the conditions applicable to the treatment of industrial wastewater. the current investigation also revealed that cmosw has a potential as an effective adsorbent for the toxic heavy metals and dye removal. list of symbols co initial concentration [mg/l] ce equilibrium concentrations [mg/l] v solution volume [l] m mass of adsorbent use [g] qe amount of metal ion adsorbed at equilibrium [mg/g] qmax maximum adsorption capacity [mg/g] b langmuir equilibrium constant kf adsorption capacity of the adsorbent [mg/g(l/mg)1/n] n freundlich exponent qt amount of metal ion adsorbed at time t [mg/g] k1 rate constant for pseudo-first order model [min−1] k2 rate constant for pseudo-second order model [g/mg min] acknowledgements the authors would like to thank the department of chemical and petroleum engineering, afe babalola university, ado-ekiti, nigeria for allowing us to use the laboratory facilities. references [1] c. n. owabor, i. o. oboh. kinetic study and artificial neural network modeling of the adsorption of naphthalene on grafted clay. journal of engineering research 17(3):41–51, 2012. [2] a. s. yusuff, i. i. olateju, o. a. adesina. tio2/anthill clay as a heterogeneous catalyst for solar photocatalytic degradation of textile wastewater: catalyst characterization and optimization studies. materialia 8:100484, 2019. https://doi.org/10.1016/j.mtla.2019.100484. [3] m. rafatullah, o. sulaiman, r. hashimi, a. ahmad. adsorption of copper (ii), chromium (iii), nickel (ii) and lead (ii) ions from aqueous solutions by meranti sawdust. journal of hazardous materials 170(2-3):969–977, 2009. https://doi.org/10.1016/j.jhazmat.2009.05.066. [4] m. n. m. ibrahim, w. s. w. ngah, m. s. norliyana, et al. a novel agricultural waste adsorbent for the removal of lead (ii) ion from aqueous solutions. journal of hazardous materials 182(1-3):377–385, 2010. https://doi.org/10.1016/j.jhazmat.2010.06.044. [5] m. j. mohammed-ridha, a. s. ahmed, n. n. raoof. investigation of the thermodynamic, kinetic and equilibrium parameters of batch biosorption of pb (ii), cu (ii), and ni (ii) from aqueous phase using low cost biosorbent. al-nahrain journal for engineering sciences 20(1):298–310, 2017. 
576 https://doi.org/10.1016/j.mtla.2019.100484 https://doi.org/10.1016/j.jhazmat.2009.05.066 https://doi.org/10.1016/j.jhazmat.2010.06.044 vol. 61 no. 4/2021 utilization of agricultural waste adsorbent for . . . [6] s. kavitha, r. selvakumar, k. swaminathan. polyvinyl pyrrolidone k25 modified fungal biomass as biosorbent for as(v) removal from aqueous solution. separation science and technology 43(15):3902–3919, 2008. https://doi.org/10.1080/01496390802222590. [7] m. a. khan, a. alqadami, m. otero, et al. heteroatom-doped magnetic hydrochar to remove post-transition and transition metal from water: synthesis, characterization and adsorption studies. chemosphere 218:1089–1099, 2018. https: //doi.org/10.1016/j.chemosphere.2018.11.210. [8] s. vasudevan, j. lakshmi. electrochemical removal of boron from water: adsorption and thermodynamic studies. the canadian journal of chemical engineering 90:1017–1026, 2012. [9] r. bushra, m. shahadat, m. a. khan, et al. optimization of polyamiline supported ti (iv) arsenophosphate composite cation exchanger based ion-selective membrane electrode for the determination of lead. industrial and engineering chemistry research 53(50):19387–19391, 2014. https://doi.org/10.1021/ie5034655. [10] f. gorzin, m. m. b. r. abadi. adsorption of cr (vi) from aqueous solution by adsorbent prepared from paper mill sludge: kinetics and thermodynamic studies. adsorption science and technology pp. 1–21, 2017. [11] a. a. oyekanmi, a. z. a. latiff, z. daud, et al. adsorption of cadmium and lead from palm oil mill effluent using bone-composite: optimization and isotherm studies. international journal of environmental analytical chemistry 99(8):707–725, 2019. https://doi.org/10.1080/03067319.2019.1607318. [12] a. s. yusuff, i. i. olateju, s. e. ekanem. equilibrium, kinetic and thermodynamic studies of the adsorption of heavy metals from aqueous solution by thermally treated quail eggshell. journal of environmental science and technology 10(5):246–257, 2017. [13] n. m. alandis, w. mekhamer, i. o. aldage, et al. adsorptive applications of montmorillonite clay for the removal of ag (i) and cu (ii) from aqueous medium. journal of chemistry p. 7129014, 2019. https://doi.org/10.1155/2019/7129014. [14] n. abdel-ghani, m. hefny, a. el-chaghaby. removal of metal ions from synthetic wastewater by adsorption onto eucalyptus camaldulensis tree leaves. journal of chile chemical society 53(3):585–589, 2008. https://doi.org/10.4067/s0717-97072008000300007. [15] s. sugashini, k. m. m. s. begum. preparation of activated carbon from carbonized rice husk by ozone activation for cr (vi) removal. new carbon materials 30(3):252–261, 2015. https://doi.org/10.1016/s1872-5805(15)60190-1. [16] b. h. hameed, r. r. krishni, s. a. sata. a novel agricultural waste adsorbent for the removal of cationic dye from aqueous solution. journal of hazardous materials 162(1):305–311, 2009. https://doi.org/10.1016/j.jhazmat.2008.05.036. [17] a. ahmad, a. khatoon, s. h. mohd-setapar, et al. chemically oxidized pineapple fruit peel for the biosorption of heavy metals from aqueous solutions. desalination and water treatment 57(14):6432–6442, 2016. https://doi.org/10.1080/19443994.2015.1005150. [18] w. cherdchoo, s. nithettham, j. charoenpanich. removal of cr(vi) from synthetic wastewater by adsorption onto coffee ground and mixed waste tea. chemosphere 221:758–767, 2019. https: //doi.org/10.1016/j.chemosphere.2019.01.100. [19] m. ghasemi, a. ghoreyshi, h. younesi. 
synthesis of a high characteristics activated carbon from walnut shell for the removal of cr(vi) and fe(ii) from aqueous solution: single and binary solutes adsorption. iranian journal of chemical engineering 12(4):28–51, 2015. [20] z. al-othman, r. ali, m. naushad. hexavalent chromium removal from aqueous medium by activated carbon prepared from peanut shell: adsorption kinetics, equilibrium and thermodynamic studies. chemical engineering journal 184:238–247, 2012. https://doi.org/10.1016/j.cej.2012.01.048. [21] a. s. yusuff. adsorption of hexavalent chromium from aqueous solution by leucaena leucocephala seed pod activated carbon: equilibrium, kinetic and thermodynamic studies. arab journal of basic and applied sciences 26(1):89–102, 2019. https://doi.org/10.1080/25765299.2019.1567656. [22] e. o. adeaga. adsorption of hexavalent chromium from aqueous solution by hcl treated onion skin. journal of pure and applied science 2(1):7–13, 2017. [23] s. e. agarry, o. o. ogunleye, o. a. ajani. biosorptive removal of cadmium (ii) ions from aqueous solution by chemically modified onion skin: batch equilibrium, kinetic and thermodynamic studies. chemical engineering communications 202(5):655–673, 2015. https://doi.org/10.1080/00986445.2013.863187. [24] b. w. waweru. efficiency and sorption capacity of unmodified and modified onion skins (allium cepa) to adsorb selected heavy metals from water. kenyatta university, nairobi, kenya 60:1–84, 2015. [25] e. f. olasehinde, a. v. adegunloye, m. a. adebayo. cadmium (ii) adsorption from aqueous solution using onion skins. water conservation science and engineering 4:175–185, 2019. https://doi.org/10.1007/s41101-019-00077-2. [26] f. p. l. hariani, f. riyanti, w. sepriani. desorption of re-adsorption of procion red mx-5b dye on alumina-activated carbon composite. indonesian journal of chemistry 18(2):222–228, 2018. https://doi.org/10.22146/ijc.23927. [27] y. h. taufiq-yap, n. f. abdullah, m. basri. biodiesel production via transesterification of palm oil using naoh/al2o3 catalysts. sains malaysiana 40(6):587–594, 2011. [28] m. d. adetoye, s. o. adeojo, b. f. ajiboshin. utilization of garlic waste as adsorbent for heavy metal removal from aqueous solution. journal of pure and applied science 6(4):6–13, 2018. [29] a. s. yusuff, l. t. popoola, e. o. babatunde. adsorption of cadmium ion from aqueous solutions by copper-based metal organic framework: equilibrium modeling and kinetic studies. applied water science 9:106, 2019. 577 https://doi.org/10.1080/01496390802222590 https://doi.org/10.1016/j.chemosphere.2018.11.210 https://doi.org/10.1016/j.chemosphere.2018.11.210 https://doi.org/10.1021/ie5034655 https://doi.org/10.1080/03067319.2019.1607318 https://doi.org/10.1155/2019/7129014 https://doi.org/10.4067/s0717-97072008000300007 https://doi.org/10.1016/s1872-5805(15)60190-1 https://doi.org/10.1016/j.jhazmat.2008.05.036 https://doi.org/10.1080/19443994.2015.1005150 https://doi.org/10.1016/j.chemosphere.2019.01.100 https://doi.org/10.1016/j.chemosphere.2019.01.100 https://doi.org/10.1016/j.cej.2012.01.048 https://doi.org/10.1080/25765299.2019.1567656 https://doi.org/10.1080/00986445.2013.863187 https://doi.org/10.1007/s41101-019-00077-2 https://doi.org/10.22146/ijc.23927 a. s. yusuff, l. t. popoola, v. anochie acta polytechnica [30] t. g. kebede, s. dube, a. a. mengistie, et al. moringa stenopetala bark: a novel green adsorbent for the removal of metal ions from industrial effluents. physics and chemistry of the earth 107:45–57, 2018. 
https://doi.org/10.1016/j.pce.2018.08.002. [31] j. guo, y. song, x. ji, et al. preparation and characterization of nanoporous activated carbon derived from prawn shell and its application for removal of heavy metal ions. materials 12(2):241–247, 2019. https://doi.org/10.3390/ma12020241. [32] a. sulaymon, s. ebrahim, m. ridha. equilibrium, kinetic, and thermodynamic biosorption of pb(ii), cu(ii) and cd(ii) ions by dead anaerobic biomass from synthetic wastewater. environmental science and pollution research 20(1):175–187, 2013. [33] s. f. lim, a. y. w. lee. kinetic study on removal of heavy metal ions from aqueous solution by using soil. environmental science and pollution research 22:10144–10158, 2015. https://doi.org/10.1007/s11356-015-4203-6. [34] a. a. alghamdi, a. al-odayni, w. s. saeed, et al. efficient adsorption of lead (ii) from aqueous phase solutions using polypyrrole-based activated carbon. materials 12(12):1–16, 2020. https://doi.org/10.3390/ma12122020. [35] m. imamoghu, o. tekir. removal of copper (ii) and lead (ii) from aqueous solutions by adsorption on activated carbon from a new precursor hazelnut husks. desalination 228(1-3):108–113, 2008. https://doi.org/10.1016/j.desal.2007.08.011. [36] m. s. el-geundi, m. m. nassar, t. e. farrag, m. h. ahmed. removal of an insecticide (methomyl) from aqueous solutions using natural clay. alexandria engineering journal 51(1):11–18, 2012. https://doi.org/10.1016/j.aej.2012.07.002. [37] l. mouni, d. merabet, a. bouzaza, l. belkhiri. adsorption of pb (ii) from aqueous solutions using activated carbon developed from apricot stone. desalination 276(1-3):148–153, 2011. https://doi.org/10.1016/j.desal.2011.03.038. [38] m. mohapatra, s. khatum, s. anand. adsorption behaviour of pb(ii), cd(ii) and zn(ii) on nalco plant sand. indian journal of chemical technology 16:291–300, 2009. [39] j. bayuo, k. b. pelig-ba, m. a. abukari. optimization of adsorption of parameters for effective removal of lead (ii) from aqueous solution. physical chemistry: an indian journal 14(1):1–25, 2019. [40] n. gaur, a. kukreja, m. yadav, a. tiwari. adsorption removal of lead and arsenic from aqueous solution using soya bean as a novel biosorbent: equilibrium isotherm and thermal stability studies. applied water science 8:98, 2018. https://doi.org/10.1007/s13201-018-0743-5. [41] y. h. zhao, j. t. geng, j. c. cai, y. f. cai. adsorption performance of basic fuchin on alkali-activated diatomite. adsorption science and technology 38(5-6):151–167, 2020. https://doi.org/10.1177/0263617420922084. 
acta polytechnica 59(4):352–358, 2019. doi:10.14311/ap.2019.59.0352
© czech technical university in prague, 2019. available online at https://ojs.cvut.cz/ojs/index.php/ap

the characteristics of al-si coating on steel 22mnb5 depending on the heat treatment

marie kolaříková a,∗, rostislav chotěborský b, monika hromasová c, miloslav linda c

a czech technical university in prague, faculty of mechanical engineering, department of manufacturing technology, technická 4, 166 07 prague 6, czech republic
b czech university of life science prague, faculty of engineering, department of material science and manufacturing technology, kamýcká 129, 165 00 prague 6, czech republic
c czech university of life science prague, faculty of engineering, department of electrical engineering and automation, kamýcká 129, 165 00 prague 6, czech republic
∗ corresponding author: marie.kolarikova@fs.cvut.cz

abstract. the coating on the 22mnb5 steel is intended to protect it against oxidation during the forming process. this steel is hot-pressed. a preheating before the pressing and the subsequent hardening in the tool affect the properties of the alsi coating. this study summarizes the results of investigating the effect of the heat treatment parameters on the formation of intermetallics in the alsi coating. the chemical analysis of the coating was performed by the edx and ebsd methods, and the mechanical properties were determined by the hysitron ti 950 triboindenter™ system. the result of this study is that, due to a diffusion during the heat treatment, the brittle coating was transformed into a tougher phase.

keywords: al-si coating, intermetallic phase, heat treatment, steel 22mnb5.

1. introduction
nowadays, high-strength martensitic hot-formed steels are used for the body stiffening. their use for bodywork (especially for safety components) is on the rise worldwide. for example, in the octavia iii model, the proportion of high-strength hot-formed steels in 2012 was 26.1 %. today, there is no car manufacturer who does not use sheet metal with a surface treatment when making a bodywork. the primary function of the coatings is to prevent corrosion and thus increase the service life of the bodywork. furthermore, a coating improves the surface morphology for a better adhesion of the lubricant required for the forming operations. an alsi-based coating on 22mnb5 steel is designed to protect it against oxidation during the forming process.
a preheating of the steel before the pressing and the subsequent hardening in the tool affect the properties of the alsi coating. typically, the al-si coating has a nominal thickness of 30-50 µm. the basic composition is 90 % al + 10 % si, with si enriched locally. the interlayer at the interface with the steel substrate formed by the heat treatment (forming process) has a thickness of 5 to 10 µm, according to the manufacturer. its basic constituents are the feal and fe2al5 phases. while the al-si coating has a melting point of 650-700 °c, feal and fe2al5 have a higher melting point of about 1150-1350 °c [1–3]. in the delivered state, the al-si coating on the ferritic-pearlitic steel is comprised of a compact al layer with si precipitates. upon raising the temperature to an austenitizing temperature during the forming process, fe begins to diffuse into the al-si layer. a ternary al-si-fe alloy is formed, which gives the coating its characteristics. in his work, grauer et al. [4] accentuate that, at the aluminium melting temperature (660 °c), the iron diffusion always occurs in the alsi coating, even if the coating is heated very slowly. windmann [5, 6] describes the dependence of the morphology of the alsi coating on the heat treatment parameters, and, in this context, the research by füssel et al. [7] shows that different layer thicknesses significantly change the weldability of the base material [8–10]. schmidová et al. [11] examined heat treated alsi coatings in the temperature range of 850-880 °c with a holding time of 5-10 min. this confirmed the creation of a diffusion layer formed during the heat treatment. her study shows that the heterogeneous iron enrichment in the coating also increases with the temperature and the holding time. a layer with an increasing al/si ratio is formed just below the surface of the coating. the formation of a continuous secondary intermetallic interlayer in a coating with predominantly al has a particular effect on weldability, especially in connection with the increasing porosity (kirkendall's pores) and oxide content on the surface. the higher amount of pores leads to their accidental collapse, which affects the flow and causes an instability of the whole process. according to the study, the maximum thickness of the diffusion layer at the steel-coating interface is 13 µm. a higher thickness of the diffusion layer leads to a random occurrence of an instability in the welding process. cheng et al. [12] and springer et al. [13] studied the phase formation at the interface of soft steel and an alsi coating (with a silicon concentration of 5 and 10 %) at temperatures of 650-700 °c. they observed the formation of a multilayer phase with the sequence al5fe2-al13fe4-al8fe2si, which was formed directly on the steel-coating interface. in addition, they proved the formation of small precipitates of the al2fe3si3 type within the al5fe2 phase. they claim that silicon has inhibitory effects on the intermetallic growth at the steel-coating interface, that it suppresses the al13fe4 phase formation and that it promotes the growth of al8fe2si. their results are consistent with windmann's work and confirm the work of yun et al. [14]. eggeler et al. [15] proved that silicon occupies free positions in the al5fe2 lattice and thus prevents the growth of this phase. in his next work, windmann et al.
[5] concluded that the iron diffusion from the steel to the alsi coating prevails during the first two minutes of the heating (around 900 °c). after two minutes, when aluminium diffuses into the steel, the al content in the coating decreases and the al-rich intermetallic phases (al13fe4 and al5fe2) are converted to the fe-rich (alfe) type. the aluminium-to-steel diffusion supports the formation of an alpha-fe layer at the interface between the coating and the steel. this layer increases its thickness with the austenitizing temperature and with the holding time at temperature. köster et al. [16] as well as kobayashi et al. [17] found that, in particular, the al-rich alxfey intermetallic phases (types al13fe4 and al5fe2), which are formed in the alsi coating, have a low fracture toughness (1 mpa·m1/2), which is attributed to their high hardness of 900-1150 hv0.05. according to krasnowski et al. [18], the hardness of the al5fe2 phase is 851 hv1. the alfe and alfe3 phases have a lower hardness of 300 to 650 hv0.05 (kubošová et al. [19]) and a higher fracture toughness of up to 26 mpa·m1/2. with respect to the stoichiometry of these intermetallics, the alxfey phases with a higher fe content are tougher and softer, so they can be stabilized by various diffusion processes. kim et al. [20] found that the brittle intermetallic phases of the al13fe4 and al5fe2 types reduce the weldability. according to shiota et al. [21], the al2fe3 phase has a hardness of 882 hv1, a fracture toughness of 1.6 mpa·m1/2 and a modulus of elasticity of 245 gpa. our aim is to describe the phase changes in the al-si coating of 22mnb5 so that our results can be used for a future research of the weldability of this coated steel. the aim of the research is to help clarify what happens in the coating during the heat treatment in terms of the phase formation and mechanical properties, and to determine how this influences the weldability and the stability of the welding process. the result will be the determination of the thermal processing parameter limits so that the changes in the coating affect the process stability as little as possible.

2. materials and methods
a high-strength boron steel 22mnb5 was chosen as the material for the experiment. 1.2 mm thick sheets were supplied with an alsi coating. the chemical composition of the 22mnb5 steel is shown in table 1, and the mechanical properties are in table 2.

| c | si | mn | p | s | al | b | cr | mo | ti |
| 0.26 | 0.30 | 1.14 | 0.0075 | < 0.150 | 0.04 | 0.0027 | 0.16 | 0.01 | 0.032 |

table 1. chemical composition of steel 22mnb5.

| | rp0,2 [mpa] | rm [mpa] | a80 min. [%] |
| before heat treatment (min) | 200-400 | 570 | 15 |
| after heat treatment | 950-1250 | 1300-1650 | 4.5 |

table 2. mechanical properties of steel 22mnb5.

these materials are supplied from the mills in a cold- or hot-rolled condition with a ferritic-pearlitic structure and eliminated carbides. in this state, the yield strength rp0.2 is 350-550 mpa, the strength rm is 500-700 mpa and the ductility a80 is greater than 15 %. for the chemical analysis of all coatings, the mira3 gmu scanning electron microscope and the eds and ebsd analysers from oxford instruments were used. from the measured data, an elemental map was compiled. two line profiles were marked on each map and spot analyses were performed at the selected locations to determine the percentage representation of the elements. at the same time, another measurement was performed by electron backscatter diffraction (ebsd). the hysitron ti 950 triboindenter™ nanoindentation system was used to analyse the mechanical properties of the coatings, such as the reduced elastic modulus and the indentation hardness (hit, according to iso 14577-1). unlike the martens hardness (hm), hit is calculated from the contact area (h = f/acont), with acont taken as the surface projection (or tip cut) at the maximum contact depth. the xpm (ultra-fast nanoindentation) mode with a maximum load force pmax = 3000 µn was used for the mechanical analysis, corresponding to contact depths of hc ∼ 60-100 nm. the 8×30 indent matrix was laid out with a 3 µm spacing between the individual indents. the load function had two segments: a 1 s loading segment to pmax = 3000 µn and a 1 s unloading segment (tip extraction from the sample). the measurement method, including the evaluation, is described in [22].

3. experiment
samples were heat treated with various parameters in the ranges: time t = 5 to 15 min, temperature t = 850 to 950 °c. the heat treatment was performed in an induction furnace. the set temperature was verified by a thermocouple and recorded by the almemo station. all samples were analysed for the mechanical properties. the measured values of the reduced modulus and the indentation hardness were composed into matrices for a better visualization of the results. a chemical analysis followed. this was done at the site of the previous analysis of the mechanical properties in order to relate the results to one another. the results of both analyses are shown in figures 1 to 8. for clarity, only 4 samples were selected for the comparison: sample 1 without heat treatment, the low-temperature sample 2 (882 °c) with a short holding time (5.8 min), sample 3 with mean values of the heat treatment parameters (t = 905 °c, t = 7.4 min), and sample 4 with high values (t = 907 °c, t = 12.9 min).

figure 1. phase identification (left), reduced modulus (center) and indentation hardness (right) of sample 1 before heat treatment.
figure 2. phase identification for samples with heat treatment parameters: t = 5.8 min and t = 882 °c (sample 2, left), t = 7.4 min and t = 905 °c (sample 3, center) and t = 12.9 min and t = 907 °c (sample 4, right).
figure 3. indentation hardness for samples with heat treatment parameters: t = 5.8 min and t = 882 °c (left), t = 7.4 min and t = 905 °c (center) and t = 12.9 min and t = 907 °c (right).
figure 4. reduced modulus for samples with heat treatment parameters: t = 5.8 min and t = 882 °c (left), t = 7.4 min and t = 905 °c (center) and t = 12.9 min and t = 907 °c (right).
figure 5. chemical compound in line for sample 1 without heat treatment.
figure 6. chemical compound in line for sample 2 (t = 5.8 min, t = 882 °c).
figure 7. chemical compound in line for sample 3 (t = 7.4 min, t = 905 °c).
figure 8. chemical compound in line for sample 4 (t = 12.9 min, t = 907 °c).

4. results and discussion
figure 1 (left) shows that, before the austenitization, the alsi coating consists of pure aluminium, the al2fe3si3 phase and the al8fe2si precipitate on the steel-coating interface. the iron diffusion into the partially molten alsi coating is very rapid during the first 2 minutes of the austenitisation.
figure 2 shows that, due to a massive diffusion, the al8fe2si precipitate was transformed to al5fe2 + al2fe3si3 according to equation (1); this is in agreement with the works published by schmidova [13] and also windmann [6, 7].

al → al + al8fe2si → al5fe2 + al2fe3si3 → al5fe2 + alfe (1)

after a complete transformation of the coating into the intermetallic phases, there is an increased diffusion of al towards the steel. due to the opposite directions of the diffusion of al and fe and their different velocities, cavities (kirkendall pores) start to form on the coating-steel interface after 6 minutes, as can be seen in figure 2 for samples 3 and 4.

figure 1 (center) shows the reduced modulus er. the steel has er = 210−250 gpa, while the coating has er = 100 gpa (roughly a half). after the coating transformation, the reduced modulus of the coating and of the steel is at the same value, er = 210−250 gpa (see figure 4).

on the indentation hardness graphs (see figure 3), the influence of the type and volume proportion of the phases can be seen. the original tough coating transformed into brittle al5fe2 + al2fe3si3 + feal after the first 2 minutes of the austenitisation. with increasing temperature and time, the volume of feal increased. in all of the hardness matrices in figure 3, there is an increase of the hardness in the whole volume of the coating on the left. this is probably due to the measurement method used. the xpm is a method with very fast tip crossings between individual indents. after such a transition from the end of the matrix (right) to the beginning of the next row of the matrix (left), the measuring tip can oscillate, and the measured values may be up to 2 gpa higher. when comparing the values with the right half of the matrix, we can clearly see the increase in the hardness (orange and red).

in figures 5 to 8, there are graphs from the linear sem analysis. the graphs show how rapidly the thickness of the diffusion layer (αfe) increases, from 0 up to 10 micrometers during the first 6 minutes. after 6 minutes, its growth slows down and the feal layer begins to grow considerably. it can also be observed that the volume of the feal phase throughout the coating increases steeply, which corresponds to the results above.

5. conclusion

the results of the research show that a massive diffusion occurs during the austenitization, which causes the transformation of the original alsi coating into intermetallic phases of the al5fe2 + al2fe3si3 type. after the transformation, the phase spacing does not affect the reduced modulus. with increasing temperature and austenitizing time, the volume of the feal + αfe phases in the coating increases. this work should be followed by welding tests. knowing the results of the diffusion processes, subsequently related to the quality of the weld joint, will make it possible to determine and limit the heating conditions (thermal processing parameters) before the forming process.

acknowledgements

this research was supported by the project sgs16/217/ohk2/3t/12.

references

[1] m. kolaříková, p. nachtnebl. properties of interface between manganese-boron steel 22mnb5 and coating al-si. in metal 2016 conference proceedings, pp. 735–740. 2016.
[2] volkswagen aktiengesellschaft. tl 4225. alloyed quenched and tempered steel for press quenching uncoated or precoated: material requirements for semi-finished products and components, 2012. internal document.
[3] m. kolaříková, l. kolařík, t. pilvousek, j. petr.
mechanical properties of al-si galvanic coating and its influence on resistance weldability of 22mnb5 steel. defect and diffusion forum 368:82–85, 2016. doi:10.4028/www.scientific.net/ddf.368.82.
[4] s. grauer, e. j. f. r. caron, n. l. chester, et al. investigation of melting in the al–si coating of a boron steel sheet by differential scanning calorimetry. journal of materials processing technology 216:89–94, 2015. doi:10.1016/j.jmatprotec.2014.09.001.
[5] a. röttger, m. windmann, w. theisen. phase formation at the interface between a boron alloyed steel substrate and an al-rich coating. surface and coatings technology 226:130–139, 2013. doi:10.1016/j.surfcoat.2013.03.045.
[6] m. windmann, a. röttger, w. theisen. formation of intermetallic phases in al-coated hot-stamped 22mnb5 sheets in terms of coating thickness and si content. surface and coatings technology 246:17–25, 2014. doi:10.1016/j.surfcoat.2014.02.056.
[7] u. füssel, v. wesling, a. voigt, e.-c. klages. visualisierung der temperaturentwicklung in der schweißzone einschließlich der schweißelektroden über den gesamten zeitlichen verlauf eines punktschweißprozesses. schweissen und schneiden 4:634–642, 2012.
[8] d. richard, m. fafard, r. lacroix, et al. carbon to cast iron electrical contact resistance constitutive model for finite element analysis. journal of materials processing technology 132:119–131, 2003. doi:10.1016/s0924-0136(02)00430-2.
[9] q. song, w. zhang, n. bay. an experimental study determines the electrical contact resistance in resistance welding. welding journal 84:73–76, 2005.
[10] p. rogeon, p. carre, j. costa, et al. characterization of electrical contact conditions in spot welding assemblies. journal of materials processing technology 195:117–124, 2008. doi:10.1016/j.jmatprotec.2007.04.127.
[11] e. schmidová, p. hanus. weldability of al-si coated high strength martensitic steel. periodica polytechnica transportation engineering 41:127–132, 2013. doi:10.3311/pptr.7113.
[12] w.-j. cheng, c.-j. wang. microstructural evolution of intermetallic layer in hot-dipped aluminide mild steel with silicon addition. surface and coatings technology 205:4726–4731, 2011. doi:10.1016/j.surfcoat.2011.04.061.
[13] h. springer, a. kostka, e. j. payton, et al. on the formation and growth of intermetallic phases during interdiffusion between low-carbon steel and aluminum alloys. acta materialia 59(4):1586–1600, 2011. doi:10.1016/j.actamat.2010.11.023.
[14] j.-g. yun, j.-h. lee, s.-y. kwak, c.-y. kang. study on the formation of reaction phase to si addition in boron steel hot-dipped in al–7ni alloy. coatings 7:186, 2017. doi:10.3390/coatings7110186.
[15] g. eggeler, w. auer, h. kaesche. on the influence of silicon on the growth of the alloy layer during hot dip aluminizing. journal of materials science 21(9):3348–3350, 1986. doi:10.1007/bf00553379.
[16] u. köster, w. liu, h. liebertz, m. michel. mechanical properties of quasicrystalline and crystalline phases in al-cu-fe alloys. journal of non-crystalline solids 153-154:446–452, 1993. doi:10.1016/0022-3093(93)90393-c.
[17] s. kobayashi, t. yakou. control of intermetallic compound layers at interface between steel and aluminum by diffusion-treatment. materials science and engineering: a 338:44–53, 2002. doi:10.1016/s0921-5093(02)00053-9.
[18] m. krasnowski, s. gierlotka, t. kulik. nanocrystalline al5fe2 intermetallic and al5fe2–al composites manufactured by high-pressure consolidation of milled powders. journal of alloys and compounds 656:82–87, 2015. doi:10.1016/j.jallcom.2015.09.224.
[19] a. kubošová, m. karlík, p. haušild, j. prahl. fracture behaviour of fe3al and feal type iron aluminides. materials science forum 567-568:349–352, 2008. doi:10.4028/www.scientific.net/msf.567-568.349.
[20] c. kim, m. j. kang, y.-d. park. laser welding of al-si coated hot stamping steel. procedia engineering 10:2226–2231, 2011. doi:10.1016/j.proeng.2011.04.368.
[21] y. shiota, h. muta, k. yamamoto, et al. a new semiconductor al2fe3si3 with complex crystal structure. intermetallics 89:51–56, 2017. doi:10.1016/j.intermet.2017.05.019.
[22] k. rokosz, t. hryniewicz, j. lukeš, j. sepitka. nanoindentation studies and modeling of surface layers on austenitic stainless steels by extreme electrochemical treatments: nanoindentation and modeling of sl on stainless steels by ep1000. surface and interface analysis 47:643–647, 2015. doi:10.1002/sia.5758.

finite element modelling of mechanical phenomena connected to the technological process of continuous casting of steel

j. heger
a finite element method algorithm is presented that enables numerical simulation of the real phenomena that take place during the industrial process of continuous casting of steel. the algorithm takes into account all known kinds of nonlinearities: material nonlinearity connected to the nonlinear temperature dependence of material properties, large deformations from the process of material forming, and contacts between the slab and the rollers of the strand. the results obtained describe the sensitivity of the product to crack initiation, not only during the process of continuous casting itself but also in the finished and cooled slab.

keywords: continuous casting, steel slab, numerical simulation, finite element method, nonlinearity, large deformations, contact, crack initiation.

notation
c [j kg−1 k−1] specific heat
t [k] temperature
x, y, z [m] coordinates
q [w m−3] heat source
λ [w m−1 k−1] thermal conductivity
ρ [kg m−3] density
τ [s] time

1 introduction

the transition from the conventional pouring of steel into metal moulds to the progressive technology of continuous casting is progressing throughout the world. in the first stage of continuous casting, the liquid metal flows from a ladle through a reservoir (tundish) into a water-cooled copper mould. its goal is to remove such an amount of heat from the melt that a thick enough layer of solidified metal is created on its surface (primary cooling zone) to contain the liquid metal. the slab sinks down gradually and, after exiting the mould, is taken off by the roller conveyor and sprayed by a system of water and air-water nozzles (secondary cooling zone). natural convection and radiation of the surrounding air (tertiary cooling zone) cool it further. the endless product is divided into pieces of the desired length using the torch cut-off technique after the complete solidification of its core. the received slabs are freely cooled under natural air conditions. the scheme of a radial slab caster with its main parts and the terminology used can be seen in fig. 1.

fig. 1: main parts of a radial slab caster

one of the problems in the progressive technology of continuous casting is the formation of cracks in slabs. the main cause of cracks is the development of stresses, mechanical and thermal, during slab solidification and cooling. cracks form when this stress exceeds the local strength of the material, i.e., when the corresponding strain exceeds the local material ductility. the operating parameters, like the pouring temperature, the casting rate, the length and taper of the mould, and the cooling intensity in the mould and in the zones of secondary and tertiary cooling, influence the tendency of the slab to create cracks. they affect the tendency to crack forming in two ways:
- the direct impact influencing the distribution of stresses and strains during the continuous casting process,
- the indirect impact influencing the change of the metallographic structure.

the aim of the work was to design and tune an algorithm that would simulate numerically the substantial phenomena that take place during the process of continuous casting. the main consideration was the direct impact of outer impulses on the evolution of the stress-strain distribution and on the sensitivity of the slab to the formation of surface and subsurface defects.

2 physically mathematical analysis of the process of the continuous casting of steel

2.1 coupled thermo-structural transient problem

continuous casting of steel is, from the physical point of view, a coupled thermo-structural problem. it is described by partial differential equations containing thermal and mechanical variables. theoretically, these two groups of variables are interdependent; the temperature distribution influences the mechanical response of the continuum and vice versa. however, with respect to the relatively small rate of deformation in the process of continuous casting, it is possible to assume, on the technical level of distinguishability, that the mechanical phenomena will not significantly influence the temperature distribution during the process of continuous casting. assuming this, it is possible to decouple the thermal and mechanical processes. the thermal processes are considered to be of primary importance and to influence the mechanical processes in the solidifying and cooled half-finished product. the thermal impact can show itself in two ways:
1) by the thermal expansion of the material,
2) by the varying mechanical properties of the material as a result of their temperature dependence.

2.2 numerical simulation of thermal phenomena in the solidifying slab

in conventional gravitation pouring of castings, the casting is cooled by the mould and the ambient atmosphere.
from the point of view of thermo-kinetics, it is a case of three-dimensional transient heat and mass transfer in the system casting-mould-surroundings. the problem of casting solidification and cooling in conventional gravitation pouring can be described by the fourier differential equation

$$\frac{\partial}{\partial x}\left(\lambda_x \frac{\partial t}{\partial x}\right) + \frac{\partial}{\partial y}\left(\lambda_y \frac{\partial t}{\partial y}\right) + \frac{\partial}{\partial z}\left(\lambda_z \frac{\partial t}{\partial z}\right) + q = \rho c \frac{\partial t}{\partial \tau}. \quad (1)$$

corresponding initial conditions for the whole region and boundary conditions for all surfaces during the whole process must be known for the problem to be defined in the correct way. the mathematical description of the solidification and cooling of the slab during continuous casting is a slightly more complicated problem. it is a transport problem with the transfer of heat and mass that can be described by the fourier-kirchhof equation

$$\rho c \left(\frac{\partial t}{\partial \tau} + w_z \frac{\partial t}{\partial z}\right) = \frac{\partial}{\partial x}\left(\lambda_x \frac{\partial t}{\partial x}\right) + \frac{\partial}{\partial y}\left(\lambda_y \frac{\partial t}{\partial y}\right) + \frac{\partial}{\partial z}\left(\lambda_z \frac{\partial t}{\partial z}\right) + q, \quad (2)$$

assuming, for simplicity, that the motion of the slab is in the direction of the z coordinate axis at velocity wz.

2.3 motion of the continuously cast slab

the relative motion of the slab with respect to the mould and the caster during the process of continuous casting may be included in the numerical simulation of thermal phenomena in two ways:
1) by fixing the boundary conditions and moving the slab – the transport problem solution described by the fourier-kirchhof equation,
2) by fixing the slab and moving the boundary conditions – the problem solution is described by the fourier equation.
both these approaches produce identical results. the solution using the fourier-kirchhof equation requires an unusual algorithm, but it is a fluent and less demanding way of manipulating the input data, using boundary conditions that do not change with time or space. on the other hand, the solution using the fourier equation does not require a special algorithm, but entering boundary conditions moving in time means a huge amount of data, which can cause considerable technical difficulties in running the program. the first approach has been chosen for its simpler input data manipulation.

2.4 finite difference method application for the numerical simulation of thermal phenomena during continuous casting of steel

for the simulation of thermal phenomena in the continuous casting of steel, a specialised program was created on the principle of the finite difference method [1]. this program solves the fourier-kirchhof equation for transient thermal problems of continuously cast steel slabs with a width of 800 to 1600 mm and a thickness of 120 to 250 mm. the program works with the explicit version of the finite difference method using a smooth thermal dependence of enthalpy. it is equipped with an original mesh generator as well as with a graphical post-processor. there is a user-friendly way of changing all the input data: caster parameters, thermo-physical parameters of the cast material, heat transfer coefficients, slab dimensions as a whole, and also dimensions of the meshed elementary volumes. with respect to the need for a subsequent stress analysis, to be performed by the finite element method following the thermal analysis, the thermal program is equipped with a finite element mesh generator that corresponds to the finite difference method mesh.
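as an illustration of the explicit finite-difference idea behind such a thermal program (a toy sketch only; the production code, its enthalpy formulation and its boundary conditions are far richer), a one-dimensional explicit step for equation (1) with constant properties might look like this:

```python
import numpy as np

# toy explicit fdm step for the 1d form of eq. (1) with constant properties;
# all values below are illustrative, not the parameters of the program above
lam, rho, c = 30.0, 7500.0, 650.0   # w/(m.k), kg/m3, j/(kg.k) - assumed
dx, n = 0.01, 26                    # grid step [m], number of nodes
dt = 0.4 * rho * c * dx**2 / lam    # time step below the explicit stability limit

t = np.full(n, 1500.0)              # initial temperature field [degrees c]
t[0] = t[-1] = 100.0                # crude fixed-temperature boundaries (mould wall)

def explicit_step(t: np.ndarray) -> np.ndarray:
    """one explicit step: t_new = t + dt*lam/(rho*c) * d2t/dx2."""
    t_new = t.copy()
    t_new[1:-1] = t[1:-1] + dt * lam / (rho * c) * (t[2:] - 2*t[1:-1] + t[:-2]) / dx**2
    return t_new

for _ in range(200):
    t = explicit_step(t)
print(t.round(1))
```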
the thermal program also produces the transient thermal result output, which is one of the inputs for the subsequent structural analysis. all the transferred data are created in the format used by the finite element method program ansys.

2.5 numerical simulation of mechanical phenomena during continuous casting of steel

the mechanical phenomena acting on the solidifying and cooled slab during the process of continuous casting of steel are simulated [2] using the finite element program system ansys. in this thermally affected elasto-plastic stress-strain analysis, all the material properties are strongly temperature dependent. the analysis takes into account all the impacts that the slab experiences during the process of continuous casting. the originally vertical position of the slab in the mould at the beginning of the process is changed, during its progress through the forming rollers of the caster, to the horizontal position. the product is exposed not only to the thermal influence varying in time, but also to transport and forming at the contact with the individual rollers. in the radial part of the caster there is forming by the bending influence of the rollers, and in the following straight part there is straightening of the product bent in the previous part.

the described simulation of mechanical phenomena during continuous casting can be used to predict the sensitivity of the product to the forming of surface and inner defects. using the parameter called the normalised stress, defined as the ratio of the local maximum principal stress and the local yield stress, it is possible to quantify the sensitivity of the product to defect forming during the whole process of continuous casting. it describes this sensitivity from the pouring of the melt, through the solidification and cooling of the product, till the equilibration of the product temperature with the ambient temperature. the results corresponding to this last described state predict the distribution of residual stresses in the product after the process of continuous casting is finished.

the advantage of the numerical simulation of continuous casting is that, after the algorithm is designed and tuned, it predicts the sensitivity of the product to the forming of inner and surface defects at various thermal and mechanical conditions of the cast material. these conditions comprise, for example, varying cast profiles, pouring temperature, casting rate, shape of the caster, varying roller distribution and adjustment, and the influence of changing the cooling intensity.

2.6 finite element algorithm for numerical simulation of mechanical phenomena proceeding during the process of continuous casting of steel

taking into account the high requirements for computer memory, available disc space and computation time, the problem of numerical simulation of mechanical phenomena during the process of continuous casting of steel was in the first stage treated as a two-dimensional problem. for a slab with a rectangular cross section of 1530 mm × 250 mm, a longitudinal plane section was chosen to represent the solution in the plane of symmetry with the plane strain assumption. the cross section was meshed using quadrilateral finite elements. the program system ansys version 5.7 was used for the finite element solution. besides the slab itself, the finite element model also comprises 128 rollers of six diameters (fig. 2).
the distribution and adjustment of these rollers create the shape of the caster, and in this way they also affect the final shape of the product.

fig. 2: roller distribution at the strand of continuous casting

in the finite element contact analysis, these rollers are meshed as the target elements target169. for the slab meshing, visco-plastic elements visco106 have been used. the material properties of the manufactured slab are strongly temperature dependent; their thermal stress-strain dependence curves can be seen in fig. 3. the applied regular quadrilateral mesh is designed with 23 elements along the slab height to enable a detailed description of the stress and strain gradients. the basic element dimensions are 10.870 mm × 50.192 mm; the direction of the longer dimension corresponds to the direction of the slab longitudinal axis. the resulting finite element mesh contains 21827 elements with 231 × 84 nodes. the nodes on the external surface of the product are connected to the target elements by means of 1898 contact elements conta172. the connection of target and contact elements enables the simulation of real contact conditions during the process of continuous casting.

the simulation algorithm of the real process of continuous casting is based on the following principle. the melted steel reservoir is modelled by a vertical strip of elements of material at the melting temperature. the remaining part of the manufactured product is of material with the temperature distribution found in the previous numerical analysis based on the finite difference method. however, this part of the product is used in the model only as a dummy part helping to accommodate contact nonlinearities in the fully developed contact conditions; results received from this region cannot be used for the evaluation of the real stress-strain circumstances in the manufactured product. each element and each node of the vertical region that is filled by the melted metal at the beginning of the process has to experience its own unique thermal-stress-strain transient history during the passage of the whole product through the caster created by the adjusted pairs of rollers. the algorithm enables the simulation of the real passage of the manufactured product through the rolling mill of the caster, during which the plastic strain in every individual location of the product is accumulated.

for the numerical simulation of mechanical phenomena proceeding during the process of continuous casting, it is not enough to mesh the problem in space; it must also be meshed in time. for the numerical solution, a time step was applied that corresponded to the transport of the product by the length of one element (50.192 mm). the algorithm is designed so that during the simulation the whole originally melted product, in a total length of 24.594 m, passes through the pairs of rollers. a fictitious time difference of 1 second is added to the global time at each finished step of the analysis. altogether, 482 time steps were employed for the continuous casting simulation.
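the normalised stress introduced in section 2.5 can be evaluated from the per-step results of such a marching scheme. a hedged sketch of that post-processing follows; the arrays, the random stand-in fields and the toy yield curve are placeholders, not the ansys output format or the material data of the study:

```python
import numpy as np

# track, over all 482 transport steps, the worst normalised stress
# (local maximum principal stress / local temperature-dependent yield stress)
n_steps, n_points = 482, 21827

def yield_stress(temp_c: np.ndarray) -> np.ndarray:
    """toy temperature-dependent yield stress [mpa], decaying towards solidus."""
    return np.clip(235.0 * (1.0 - temp_c / 1500.0), 5.0, None)

rng = np.random.default_rng(0)
worst = np.zeros(n_points)
for step in range(n_steps):
    sigma1 = rng.uniform(0.0, 50.0, n_points)      # stand-in for max principal stress [mpa]
    temp = rng.uniform(200.0, 1450.0, n_points)    # stand-in for nodal temperatures [c]
    worst = np.maximum(worst, sigma1 / yield_stress(temp))

print("largest normalised stress over the pass:", worst.max().round(2))
```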
potentially critical locations can be revealed by a graphical evaluation of the results for the whole product. an example of the many evaluated results – the distribution of stress components along the outer surface of the product – is shown in fig. 4. waves corresponding to the straightening of the previously bent product can be seen on the second half of the curves representing the axial (sy) and transversal (sz) stress components.

fig. 3: stress-strain curves
fig. 4: stress distribution along the product outer surface
fig. 5: residual stress distribution

2.7 residual stresses

the endless product of continuous casting is torch cut-off into slabs of the prescribed length, and their successive gradual cooling in the environment of the production hall follows. it is a long-time process of natural convection taking several days. after the temperature equilibration, the slab adopts the ambient temperature. the distribution of residual stresses through the thickness of the final slab, the presence of which is a result of plastic strains during the technological process of continuous casting, can be seen in fig. 5. while the axial surface stresses (sy) are compressive, the transverse stresses (sz) have a tensile character.

3 conception of using results of numerical simulation

it is obvious that the higher the values of the normalised stresses that are achieved, the higher the danger of crack initiation that can be expected. however, it is generally difficult to determine the ultimate value of normalised stresses that is dangerous for the formation of defects. it is known that tensile stresses have the main damaging effect on crack initiation. this is the basic reason why the normalised maximum principal stress is used to evaluate the sensitivity of the product to crack initiation. since the locations of defect occurrence are known from field experience, the limiting value of normalised stresses can be determined by calibration of the numerical simulation results. the main idea of this calibration is that the numerical simulation of continuous casting for the given operation conditions is performed, and the limit and kind of normalised stresses at which defects occur under the same conditions is determined according to the location of the defects. using the statistical evaluation of more simulated processes, more general material data will be received. they will enable the safety of operational conditions to be evaluated for changed or newly designed parameters of the continuous casting process. by eliminating sets of parameters that lead to crack formation, it will be possible to derive sets of parameters under which cracks will not occur. all the variables influencing the process of continuous casting may be considered in this process.

4 conclusion

numerical simulation of the continuous casting process contributes to the understanding of the influence of operational conditions and material properties on the development of stresses and strains during the steel manufacturing process. identifying operational parameters that will not lead to crack forming in the product will raise the productivity of the continuous steel casting process. material properties and operating conditions can be chosen that will lower the slab sensitivity to crack initiation and hot tearing.

5 acknowledgment

this analysis was conducted using a program devised within the framework of the ga cr projects (no. 106/01/1464, 106/01/1164, 106/01/0379, 106/01/0382), of the cost-apomat-oc526.10, eureka no. 2716 coop, kontakt (no.
2001/015 and no. 23-2003-04) and cez 323001/2201.

references

[1] kavicka f. et al: two numerical models of a continuously cast steel slab. ivth european conference on continuous casting, birmingham (uk), october 2002.
[2] heger j.: finite element simulation of continuous casting with respect to the stress-strain distribution. research report, university of pittsburgh, usa, november 2001.

dr. jaromir heger
phone: 0116 284 5556, fax: 0116 284 5468
e-mail: jaromir.heger@power.alstom.com
alstom power technology centre, mechanical integrity
cambridge road, whetstone, le8 6lh leicester, england, u.k.

scalable normal basis arithmetic unit for elliptic curve cryptography

j. schmidt, m. novotný

the design of a scalable arithmetic unit for operations over elements of gf(2m) represented in normal basis is presented. the unit is applicable in public-key cryptography. it comprises a pipelined massey-omura multiplier and a shifter. we equipped the multiplier with additional data paths to enable an easy implementation of both multiplication and inversion in a single arithmetic unit. we discuss the optimum design of the shifter with respect to the inversion algorithm and multiplier performance. the functionality of the multiplier/inverter has been tested by simulation and implemented in a xilinx virtex fpga. we present implementation data for various digit widths, which exhibit a time minimum for digit width d = 15.

keywords: finite fields, normal base, multiplication, inversion, arithmetic unit.

1 introduction

contemporary cryptographic schemata are frequently based on the discrete logarithm problem (dlp): find an integer k such that

$$Q = kP = \underbrace{P + P + \cdots + P}_{k\ \text{times}} \quad (1)$$

for given group elements p and q. in elliptic curve cryptography (ecc), p and q are points on a chosen elliptic curve over a finite field. we focus on curves over gf(2m), where point coordinates are expressed as m-bit vectors. the dlp in such a group is exponentially hard in comparison with the dlp in a multiplicative group over a finite field. this means that a 173-bit key provides approximately the same security level as 1024-bit rsa [7]. this fact is very important in applications such as chip cards, where the size of the hardware and the energy consumption are crucial. in algorithms such as the elliptic curve digital signature algorithm (ecdsa), k is an m-bit integer, p is a chosen point and q is computed using eq. 1. this requires addition, multiplication and inversion over gf(2m).

1.1 finite field operations

the implementation of these operations is determined by the representation of the field elements in gf(2m) [2]. in this work we focus on the normal basis representation. addition over elements of gf(2m) is implemented as a bit-wise xor. squaring is realized by rotation (cyclic shift) one bit to the right; because it is so simple (one clock cycle), it is regarded as a special case. multiplication is based on matrix multiplication over gf(2m); in hardware, a special unit (multiplier) is necessary. the best-known algorithm for inversion in a normal basis is the algorithm of itoh, teechai and tsujii (itt) [3], based on multiplication and squaring.
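a minimal software illustration of these normal-basis conventions (elements modelled as m-bit integers; this is only a model of the representation, not the hardware unit described later, and the field-specific multiplication matrix is deliberately left out):

```python
M = 173  # example field degree

def nb_add(a: int, b: int) -> int:
    """addition in gf(2^m), normal basis: bit-wise xor."""
    return a ^ b

def nb_rot_right(a: int, k: int) -> int:
    """cyclic rotation of an m-bit vector by k bits to the right."""
    k %= M
    return ((a >> k) | (a << (M - k))) & ((1 << M) - 1)

def nb_square(a: int) -> int:
    """squaring in a normal basis is a one-bit rotation."""
    return nb_rot_right(a, 1)

def nb_power_2k(a: int, k: int) -> int:
    """a**(2**k) is a rotation by k bits - the operation the shifter speeds up."""
    return nb_rot_right(a, k)
```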
1.2 scalability

we understand scalability as the ability to change scale or to be available in various versions. the term 'scale' can be paraphrased as 'the important dimension' and hence is a matter of viewpoint. the basic dimension of cryptography is the measure of security, which translates to key length. a unit is scalable if it can serve for computations with varying key length, provided that the internal memory is sufficient [5]. we suggest calling this kind of scalability scalability in precision. in the world of parallel processing and vlsi design, the important dimension is the number of processors or other hardware units. an algorithm scales well if we can obtain more performance by assigning more resources (processors, chip area). we shall speak about scalability in performance. as this paper is focused on this kind of scalability only, we will use just the term 'scalability'.

we need scalability for two reasons. the first (and more important) reason is practical: hardware implementations are needed for a variety of contexts, from smart-card devices to high-throughput servers. the second reason is a fair comparison of design versions, where scaling the units to match in one dimension is preferable to artificial quality factors.

to scale a unit means to employ parallelism at a certain level of abstraction. we can scale at the algorithmic level by selecting a different algorithm, at the register-transfer level, e.g., by changing the data path width, or at the gate level by factoring combinational circuits, or even at the physical level. this work aims at the seam between the algorithmic and register-transfer levels.

units composed of sub-units are harder to scale. we might be lucky and find a unified scaling parameter, e.g., data path width. in the general case, however, each sub-unit may have a different scaling parameter, and the sub-units interact. firstly, if there is a common clock (as preferred in practice), it must suit the slowest sub-unit. secondly, to achieve the best global area-performance ratio, local area-performance tradeoffs are not independent.

the elliptic curve operations sketched above offer very limited parallelism and therefore cannot be a major source of scalability. scalability must be sought in the finite field operations, and therefore the most interesting area for scalability in an elliptic curve processor is the finite field unit.

1.3 metrics

the metrics used reflect the abstraction level of the parallelism employed. when the data path units are simple and their implementations are obvious, we can get at a lower level. in this work, we use the classical metrics of area (which is connected to power consumption) and time. when a unit is scaled, its area a and time t vary in opposite directions. the time t depends on the number of clock cycles t spent in a given calculation and on the critical path length τ in hardware. to compare differently scaled units, we use the quality measure

$$Q = AT = A\,t\,\tau.$$

2 previous work

massey and omura proposed a multiplier [6] that employs the regularity of the equations for all bits of a result. from the equation for one bit of the result (e.g. c0), the equations for the other bits can be derived by rotating the bits of the arguments a and b [2]. in this multiplier, one bit of the result is computed completely in one clock cycle; then the registers holding the arguments a and b are rotated right by one bit between cycles. the computation of the m bits of the result takes m clock cycles, and hence this multiplier is also called bit-serial.
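a behavioural sketch of this bit-serial scheme (a software model only, not the hardware; f0, the boolean function producing bit c0, is field-specific and is therefore passed in as a parameter rather than invented here):

```python
def rot_right(x: int, k: int, m: int) -> int:
    """cyclic right rotation of an m-bit vector."""
    k %= m
    return ((x >> k) | (x << (m - k))) & ((1 << m) - 1)

def massey_omura_bit_serial(a: int, b: int, m: int, f0) -> int:
    """bit-serial massey-omura model: one result bit per 'clock cycle'.

    f0(a, b) must return result bit c0 for the chosen normal basis; every
    other bit ci is obtained with the same logic after rotating both
    operand registers between cycles.
    """
    c = 0
    for i in range(m):
        c |= f0(a, b) << i      # cycle i: the single output-bit circuit fires
        a = rot_right(a, 1, m)  # both operand registers rotate right...
        b = rot_right(b, 1, m)  # ...by one bit between cycles
    return c
```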
agnew, mullin, onyszchuk and vanstone introduced a modification of the massey-omura multiplier [1] (in this paper, we call it the amov multiplier). they divided the equation for each bit ci into m products pi,j:

$$c_i = p_{i,0} \oplus p_{i,1} \oplus p_{i,2} \oplus \cdots \oplus p_{i,m-1}. \quad (2)$$

in the first clock cycle, the product pi,i+0 of bit ci is evaluated for all i ∈ ⟨0, m − 1⟩. in the next cycle, the product pi,i+1 (all subscripts are reduced mod m) of bit ci is evaluated for all i ∈ ⟨0, m − 1⟩ and added to the intermediate result, and so on. all bits of the result are evaluated successively in parallel; the computation is pipelined.

the block diagram of the amov multiplier is in fig. 1. the multiplication is performed as follows: in the first step, both operands a and b are loaded from inputs in1 and in2 to registers a and b, respectively. then, in each of the following m clock cycles, both the a and b registers are rotated right (this is represented by the blocks ror 1) and the result c in register c is evaluated successively by the block comb. logic, which implements the products pi,j from eq. 2. after m clock cycles, the result c = a·b is available at output out. all registers and data paths are m bits wide. the amount of hardware is the same as for the non-pipelined massey-omura multiplier, but the critical path is short and constant (it does not depend on m), and so the maximum achievable frequency is higher. this multiplier is widely used.

the computation of an inverse element (inversion) by the itt algorithm [3] is usually controlled by a microprogram [4]. when implementing the itt inversion using a classical amov multiplier, additional registers and data transfers outside the multiplier are necessary. in this work we present a modification of the amov multiplier which allows an efficient implementation of both the multiplication and the itt inversion algorithms. in comparison with the microprogrammed inversion, no additional registers or data transfers outside the multiplier are necessary. we also introduce several improvements to this multiplication/inversion unit which lead to increased performance and a better performance/area ratio.

fig. 1: amov multiplier
fig. 2: modified amov multiplier

3 structure of the unit

the data path of our arithmetic unit is an extension of the amov multiplier. by adding one more input to the multiplexer preceding register a and by redirecting some data paths (see the bold lines in fig. 2), we can simply implement both multiplication and itt inversion in the unit and thus save additional registers and data transfers outside of the multiplier. the modified amov multiplier has a dedicated control unit based on a finite state machine, two counters count_inv and count_k, and the shift register m. it implements the commands load_op, multiply and invert.

3.1 multiplication

multiplication is performed as follows: in the first two steps, both operands a and b are successively loaded from input in to registers a and b. then, in m clock cycles, the multiplication is performed as in the standard amov multiplier. after its completion, the result c = a·b is loaded from register c to register a and is available at output out.

3.2 inversion

our unit computes the itt inversion [2], [3] by algorithm 1.

algorithm 1: an implementation of the itt inversion in the modified amov multiplier
1. m_r…m_0 ← m − 1;
2. a ← in;
3. for count_inv in r − 1 down to 0 do
   3.1 b ← a;
   3.2 for count_k in (m_2r−1 … m_r) down to 1 do
       3.2.1 b ← b ror 1;
   3.3 c ← b · a;
   3.3a a ← c;
   3.3b m ← m shl 1;
   3.4 if m_r = '1' then
       3.4.1 b ← a;
       3.4.2 b ← b ror 1; a ← in;
       3.4.3 c ← b · a;
       3.4.4 a ← c;
4. a ← a ror 1;

it comprises ⌊log2(m − 1)⌋ + w(m − 1) − 1 multiplications, where w(m − 1) is the number of non-zero bits in the binary representation of m − 1. furthermore, it needs (m − 1) − w(m − 1) squarings. the total number of clock cycles spent on one inversion is

$$c_{inv} = \left(\lfloor \log_2(m-1) \rfloor + w(m-1) - 1\right) c_{mul} + \left((m-1) - w(m-1)\right) c_{sqr} + const,$$

where cmul is the number of clock cycles of one multiplication (cmul = m) and csqr is the number of clock cycles of one squaring (csqr = 1). note that the inverted value must be available at input in during the whole process of inversion. the rotation capability of register b is used for the computation of the squarings in steps 3.2.1 and 3.4.2.
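a software analogue of the itoh-teechai-tsujii chain behind algorithm 1 (a functional model, not the hardware schedule; mul and power2k are abstract parameters standing for the multiplier and the rotation capability):

```python
def itt_inverse(a, m, mul, power2k):
    """compute a**(2**m - 2) = a**-1 in gf(2^m) by the itt addition chain.

    mul(x, y)      -- field multiplication (the amov multiplier's job)
    power2k(x, k)  -- x**(2**k), i.e. a k-bit rotation in a normal basis
    """
    bits = bin(m - 1)[2:]          # b_r ... b_0 with b_r = 1
    t, k = a, 1                    # invariant: t = a**(2**k - 1)
    for bit in bits[1:]:
        t = mul(power2k(t, k), t)  # t = a**(2**(2k) - 1): one multiplication
        k *= 2
        if bit == '1':             # one extra squaring + multiplication per set bit
            t = mul(power2k(t, 1), a)
            k += 1
    return power2k(t, 1)           # final squaring yields a**(2**m - 2)
```

the loop performs exactly ⌊log2(m − 1)⌋ + w(m − 1) − 1 multiplications, matching the count quoted above.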
4 scaling the multiplication

the basic ecc operation (eq. 1) is performed by successive point additions and point doublings. each of these operations needs 1 inversion, 2 multiplications and 1 squaring [2]. the number of clock cycles necessary for one point addition or doubling is then

$$c_{padd} = \left(\lfloor \log_2(m-1) \rfloor + w(m-1) + 1\right) c_{mul} + \left((m-1) - w(m-1) + 1\right) c_{sqr} + const. \quad (3)$$

we can reduce the number of clock cycles cpadd in two ways: by reducing the number of clock cycles cmul of the multiplications, and by reducing the number of clock cycles csqr of the iterative squarings.

both the massey-omura and the amov multipliers need m clock cycles for computing all m bits of the result. some authors also call them bit-serial, because they compute one bit of the result in one clock cycle. there is a digit-serial variant of the massey-omura multiplier (some authors call it sliced or parallel). in this multiplier, d bits (also called a digit) are evaluated in one clock cycle. in the case of a digit-serial amov multiplier, d products pi,j are evaluated in one clock cycle. all m bits of the result are then evaluated in cmul = ⌈m/d⌉ cycles. since more products are evaluated in one clock cycle, more combinational logic is necessary: the size of the block comb. logic in fig. 2 is proportional to d + 1, while the size of the other blocks remains constant. as the combinational logic becomes more complex, the length of the critical path grows proportionally to log d. since one multiplication needs m/d clock cycles, the total time necessary for one multiplication is O((m/d) log d), and the total time of one inversion (or point addition on an elliptic curve) is

$$T_{padd} = O\!\left(\left(\log m \cdot \frac{m}{d} + m\right) \log d\right). \quad (4)$$

5 scaling the iterative squarings

another way to improve the performance of the itt inversion (and consequently the point addition) is to reduce the number of clock cycles necessary for the iterative squarings in step 3.2.1 of algorithm 1. adding one or more blocks performing "long distance" rotations can reduce the number of clock cycles required for all iterative squarings (fig. 3). we shall refer to the hardware realizing these rotations as the shifter.

fig. 3: "long distance" rotations form a shifter that saves the clock cycles necessary for squarings in the itt inversion

let m be the degree of the finite field we work in, gf(2m). let m − 1 = (br br−1 … b1 b0)2 be the binary representation of m − 1, such that the most significant bit br = 1. the rotations required by the itoh-teechai-tsujii algorithm are ki, i = 1 … r, where the binary representation of ki is br … bi. each of these shifts is performed exactly once in an inversion operation.

let a, t, and τ be the area, the total number of clock cycles spent in rotations, and the critical path of the shifter. let a0, t0, and τ0 be the area, the total number of clock cycles spent in multiplications, and the critical path of the rest of the arithmetic unit. the quality measure of the entire unit, and hence our optimization criterion, is

$$Q = (A + A_0)(T + T_0)\,\max(\tau, \tau_0). \quad (5)$$

this equation also shows the two dependencies between the shifter and the multiplier. firstly, the ratio of the shifter's area and time must be 'the right one': for each a0, t0 and τ0, the area and time of the shifter shall be adjusted to achieve the minimum q of the entire unit. secondly, the shifter may slow down the clock of the arithmetic unit. in this case, not only is the time tτ longer, but the multiplication time is also t0τ instead of t0τ0. as the multiplier dominates, this penalty may become unacceptable. the multiplier is scaled by manipulating the digit size, which affects a0, t0 and τ0, while changing the number of rotations in the shifter varies a, t, and τ. thus the optimization problem becomes multi-parametric. because the multiplier logic dominates in both area and time, we solve it in a pareto-optimal way: we modify the shifter to achieve the optimum total q for a given multiplier digit width.

5.1 decomposition in time and space

to implement a set of rotations, one might use:
- a multiplexer structure, such as the barrel shifter, spending a single clock cycle for each rotation, or
- hardware providing the rotation by 1 only, requiring k clock cycles to rotate by k.
these options can be seen as decompositions in space and in time, respectively. in a more general approach, we construct hardware providing a limited set of rotations, and we use it in multiple clock cycles to realize the given rotations. the solution of our problem can be decomposed as follows:
- find rotations sj ∈ Z, j = 1 … n, and factors tij ∈ Z, i = 1 … r, j = 1 … n, such that

$$\sum_{j=1}^{n} s_j\, t_{ij} \equiv k_i \pmod m \quad (6)$$

(the time domain problem);
- implement the rotations sj under the optimality criterion in eq. 5 (the space domain problem).
for such a sequential decomposition to work, the first step must estimate the quality of the result of the second step. in our case, we need an estimation of the area, clock cycles, and critical path of a circuit realizing a given set of rotations. thanks to the special nature of the rotations ki, we can make reasonable assumptions about them. the decomposition of the problem is summarized in fig. 4.

5.2 space domain problem

let nmux be the number of two-input multiplexers in a circuit implementing n rotations. for a given n, nmux has the following bounds:
- nmux is O(n). any set of rotations can be implemented using an n-input multiplexer, which in turn is a tree of n 2-input multiplexers.
- nmux is Ω(log n). this is the minimum number of bits specifying one number out of n.
note that in both cases, the critical path length is logarithmic.
a circuit intermediate between these two extremes can be represented as a network of 2-input multiplexers. we have proven that:
- unless there are distinct indices a, b, c, d such that sa + sb = sc + sd, the circuit optimal in area and critical path is an n-input multiplexer;
- the original set of rotations ki does not have the above property.
these facts lead us to the tentative assumption that the solution of the space domain problem can be approximated as an n-input multiplexer. the overall quality measure is then

$$Q = \left(a_{mux}(n) + A_0\right)\left(\sum_{i=1}^{r}\sum_{j=1}^{n} t_{ij} + T_0\right)\max\left(\tau_{mux}(n), \tau_0\right).$$

the area amux(n) and the critical path τmux(n) of an n-input multiplexer can be expressed in terms of primitive gates, if such a measure is used for the rest of the unit. alternatively, these values can be measured in terms of the implementation technology (transistors, programmable blocks) and as such obtained from the synthesis tools used.

5.3 time domain problem

due to the modularity of eq. 6, the values of all sj and tij can be restricted to (0, m) without loss of generality. this still represents, however, a large solution space. to reduce it, a further decomposition is used:
1. choose a set of rotations sj.
2. obtain a set of factors tij giving the optimum quality q.
because no reliable estimate can be made in step 1, both steps are performed iteratively. in other words, we use step 2 as the evaluation function for the local search in step 1.

5.4 optimal factors by dynamic programming

in step 2 above, n is fixed, and so are the parameters of the multiplexer implementing the sj. furthermore, note that the equations for different values of i in eq. 6 are independent; that is, we have r primitive problems of the following form.

fig. 4: decomposition of the problem (step 1: choose si by a genetic algorithm; step 2: obtain tij by dynamic programming)

for a given c ∈ Z, c < m, and h ∈ Z, find tj ∈ Z, j = 1 … h, such that

$$\sum_{j=1}^{h} s_j\, t_j \equiv c \pmod m$$

while minimizing q = Σj tj. let f(h, c) be a function giving the minimum q for the above problem. let f(0, 0) = 0 and f(0, c) = ∞ for c > 0. then f(h, c) is, for 0 < h ≤ r,

$$f(h, c) = \min_d \left[\, f\!\left(h - 1,\ (c - d\,s_h) \bmod m\right) + d \,\right],$$

where d ranges between 0 and the order of sh in Zm, that is, 0 ≤ d < m / gcd(sh, m).

the values of f(h, c) are computed for increasing h and are stored in a two-dimensional array with r columns and m rows. after that, the element (r, ki) contains the contribution of the i-th primitive problem to the optimization criterion, for 1 ≤ i ≤ r. eventually, all the values tij, and therefore the solution of step 2, can be reconstructed from the array. this means that only one pass of the dynamic programming procedure outlined above is required to solve the entire step 2. as r is O(log m), the complexity of this algorithm is O(m2 log m).
2/2005 d freq (mhz) #slices #slices per 1 bit point addition (clocks) (�s) 1 122.489 451 2.51 2582 21.08 2 102.533 544 3.02 1412 13.77 3 101.502 632 3.51 1022 10.07 4 100.492 724 4.02 827 8.23 5 97.069 815 4.53 710 7.31 6 94.153 903 5.02 632 6.71 9 64.008 1174 6.52 502 7.84 10 61.054 1265 7.03 476 7.80 12 54.174 1480 8.22 437 8.07 15 58.042 1798 9.99 398 6.86 18 44.607 2026 11.26 372 8.34 20 40.955 2248 12.49 359 8.77 table 1: implementation of a modified multiplier/inverter in xilinx virtex300 � any realized rotation si in an optimum solution was identical to some given rotation ki, although even slightly sub-optimum solutions did not have this property. � the left side of equation 6 was less than m in all optimum and some sub-optimum solutions. � when the above observation was exploited to simplify the evaluation procedure, the search space became disconnected, and more time was needed to achieve equivalent results. � neither brute force nor the described algorithm gave any optimum or sub-optimum solution violating the tentative assumption in subsection 5.2. � with a population size of 100, the algorithm required circa 3000 generations to converge at m � 160, rising to 4000 at m � 250. � infeasible individuals were rare. � the running time was below 20 minutes on an office-grade pc. table 2 illustrates the influence of the multiplier size on the shifter. the results were obtained for m � 163, where the required set of rotations is {1, 2, 5, 10, 20, 40, 81}. the area of the multiplier was amux(n) � 1.5 n and the critical path of the unit was outside the shifter. the effect of adding one rotation block (i.e. k � 2) is illustrated in fig. 5. the expansion of multiplexers did not influence the clock period, as they did not lie on the critical path. the new design is systematically faster. for d � 6, the speedup is over 20 %, while the area increased by 10 %. recall that the speedup is based on minimization of the last term in eq. 7. in our case, this mechanism caused the second local minimum on the original curve to prevail, where the optimum digit width is d � 15 for m � 180. in this case the actual speedup is 37 %. 7 conclusions a pipelined version of the massey-omura multiplier modified for easy implementation of itt inversion algorithm has been presented. the performance of this multiplier/inverter can be improved by employing digit-serialization and by speeding up the iterative squarings. the multiplier/inverter has been implemented in xilinx virtex 300. without speeding up the iterative squarings, the shortest computation time has been obtained for digit width d � 6. the use of “long distance” rotation blocks further speeded up the design and benefited higher digit widths. references [1] agnew, g. b., mullin, r. c., onyszchuk, i. m., vanstone, s. a.: “an implementation for a fast public-key cryptosystem,” journal of cryptology, vol. 3 (1999), p. 63–79. [2] ieee 1363. standard for public-key cryptography, ieee 2000. [3] itoh, t., teechai, o., tsujii, s.: “a fast algorithm for computing multiplicative inverses in gf (2t) using normal bases,” j. society for electronic communications (japan), vol. 44 (1986), p. 31–36. [4] leong p. h. w., leung, k. h.: “a microcoded elliptic curve processor using fpga technology,” ieee transactions on vlsi systems, vol. 10, no. 5, oct. 2002, p. 550–559. [5] savas, e., koc, c. k.: “architectures for unified field inversion with applications“. in: elliptic curve cryptography. 
the 9th ieee international conference on electronics, circuits and systems – icecs 2002, dubrovnik, croatia, september 15–18, 2002, vol. 3, p. 1155–1158. [6] massey, j., omura, j.: “computational method and apparatus for finite field arithmetic,” u.s. patent number 4,587,627, 1986. [7] blake, i., seroussi, g., smart, n.: “elliptic curves in cryptography”, chapter 1. cambridge university press, cambridge (uk), 1999. [8] wall, m.: “galib, a c�� library of genetic algorithm components” [online] available from http://sourceforge.net/projects/galib/ ing. jan schmidt, ph.d. phone: +420 224 357 473 e-mail: schmidt@fel.cvut.cz ing. martin novotný phone: +420 224 357 261 e-mail: novotnym@fel.cvut.cz department of computer science and engineering czech technical university in prague faculty of eletrical engineering karlovo nám. 13 121 35 praha 2, czech republic 60 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 45 no. 2/2005 czech technical university in prague multiplier shifter digit width a0 t0 a t rotations si – 0 0 976 10 1, 5, 20, 81 1 489 1956 244 159 1 6 2934 326 488 24 1, 10 table 2: shifters adjusted to different multipliers time of point addition m = 180 0 5 10 15 20 25 0 5 10 15 20 25 digit width d u s k=1 fig. 5: time of point addition for different digit widths. one rotation block (k � 1) and two rotation blocks (k � 2) used ap05_1.vp 1 introduction shell structures are a very broad topic. shells differ in their shape (cylindrical, spherical, parabolic, etc.), in the way in which their walls are stiffened (laterally, longitudinally, with orthogonal stiffeners), by type of load action, by type of material used (concrete, steel), etc. this great variability and range of shell performance presents many practical difficulties in their design. in this paper we deal just with one type of steel cylindrical shell loaded by wind. large non-elastic deformations lead to buckling or plastic failure of steel cylindrical shells. this type of failure differs from that in the case, for example of slab components, where bending is predominant and the behaviour is easy to predict. it is common knowledge that thin shell structures transfer their loading by means of the membrane tensional and compression forces that act in the walls of the shell. also it is known that shells have very high efficiency under symmetrical loading and support. transfer of asymmetrical loading and local load is not desirable. for a symmetrical load, a simple structural geometry and simple boundary conditions (e.g., cylindrical shells with axis symmetrical loading), an analysis of the shell is not too difficult. but when at least one of these factors is missing, analysis becomes more complicated and the results are often unexpected. in real life, shell structures are used mainly as chimneys, tanks, pipelines and silos. an analysis can be made with the use of simplifying methods if all the above mentioned conditions are fulfilled. more sophisticated methods for analysing, shell structures are necessary if the conditions are more complex. the level of analysis rises from the simplest calculations in linear structural analysis (la) to stability calculations of ideal structures (without imperfections) and also geometrically and materially non-linear calculations with structural imperfections (gmnia). all these methods are mentioned in the new european standard for steel shells [27]. 
another important type of shell analysis studies the shapes of oscillations of the structure, together with an analysis of the structure for the basic dynamic loading. most of the types of analysis of the numerical models mentioned above are more or less accessible in recent practice, but during the preliminary design of structures, i.e., when only the basic dimensions of the structure need to be established, complex computer analysis is inapplicable for time and financial reasons. for this reason, in the doctoral thesis of the first author of this paper [7], the theme of cylindrical shells loaded by wind was thoroughly investigated under the supervision of the second author. the main aim of the theoretical investigations was to obtain limits for the stiffening of a cylindrical shell (by ring stiffeners only) such that the strut approach based on beam theory for the calculation of a chimney shell would be realistic enough, i.e., such that it would be realistic to neglect semi-bending [15]. the main goal of this work is to determine the distance of the stiffeners and to determine their minimum stiffness.

2 parametric studies
in the first stage of the work, the performance of the chimney shell was investigated in a parametric study on numerical models solved by fem. the computational study focused on determining the limit distance of the stiffeners. two models were used in the calculation: a linear analysis of the structure (la) and a geometrically non-linear analysis of the structure (gna); the gna model was based on the newton–raphson method. the comparative calculation made use of classical linear buckling analysis. the second part of the work contains parametric studies for determining the optimum stiffness of the ring stiffeners, making use of the optimum distance of the stiffeners determined in the first step of the work. for an investigation of the interaction between the diameter of the cylindrical shell, its thickness, the distance of the stiffeners and their optimum stiffness (taking into account wind load only), two independent studies evaluated by regression analysis were carried out. all calculations were done using the esa-prima win computer program, produced by scia with solvers developed by the fem consulting company [13]. the software is suitable for static analysis (la, gna, etc.) and also for dynamic and stability analysis of structures.
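for orientation, the following is a minimal sketch of the newton–raphson iteration on which a geometrically non-linear analysis of this kind is typically built; the one-degree-of-freedom residual used here is a purely illustrative assumption, not the actual esa-prima win solver.

```python
def newton_raphson(residual, tangent, u0, tol=1e-10, max_iter=50):
    """solve residual(u) = 0 for the displacement u by newton-raphson."""
    u = u0
    for _ in range(max_iter):
        r = residual(u)
        if abs(r) < tol:
            return u
        u -= r / tangent(u)   # tangent(u) plays the role of the tangent stiffness
    raise RuntimeError("newton-raphson did not converge")

# illustrative softening spring: internal force k*u - c*u**3 balancing a load f
k, c, f = 1.0e3, 4.0, 500.0
u = newton_raphson(lambda u: k*u - c*u**3 - f,
                   lambda u: k - 3*c*u**2, u0=0.0)
print(f"equilibrium displacement: {u:.6f}")
```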
in the numerical models, the stiffeners were simulated by curved beam elements. the numerical models of the shells were made of quadrilateral elements formed from four triangular plane sub-elements with one common node – the centre of the quadrilateral element [14]. the centre is defined as the intersection point of two straight lines connecting the mid-side points of the two opposite sides of the element. this definition allows the quadrilateral element to take, if needed, a singular (in fact triangular) form with two nodes in one point, without numerical difficulties. the triangular sub-element with 3×6 = 18 nodal deformation parameters therefore forms the basis of the whole calculation. the physical sense of the parameters is the same as in the case of 1d elements (u, v, w, φx, φy, φz), which ensures nodal compatibility between 1d and 2d elements. in a quadrilateral element, the deformation parameters of its centre (u, v, w, φx, φy, φz) are common to all four triangular sub-elements, and are eliminated in advance and consistently by static condensation. in its final form, a quadrilateral element possesses only 4×6 = 24 natural displacement parameters at its vertex nodes, and its stiffness matrix kl is of the order (24, 24). the size of the elements was chosen between 10 and 75 mm according to the diameter and length of the numerical model; in the introduction to the first parametric study, we found that it was necessary to use at least 100 elements along the perimeter and along the length of the model, as the size of the elements strongly influenced the precision of the parametric studies. the wind load was taken according to the european standard [28], with the reference wind speed 24 m/s and open terrain. the shell was assumed to be made from common s235 low-carbon steel.
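the elimination of the centre-node parameters described above is a standard static condensation (a schur complement of the element stiffness matrix); a minimal numpy sketch on a randomly generated symmetric matrix is given below. the partition sizes 24 and 6 follow the text; everything else is illustrative.

```python
import numpy as np

def condense(K, n_keep):
    """statically condense a symmetric stiffness matrix K, keeping the
    first n_keep dofs and eliminating the rest (no loads on eliminated dofs)."""
    Kbb = K[:n_keep, :n_keep]          # vertex-node block
    Kbc = K[:n_keep, n_keep:]          # coupling block
    Kcc = K[n_keep:, n_keep:]          # centre-node block
    return Kbb - Kbc @ np.linalg.solve(Kcc, Kbc.T)

rng = np.random.default_rng(0)
A = rng.standard_normal((30, 30))
K = A @ A.T + 30 * np.eye(30)          # symmetric positive definite, 24 + 6 dofs
Kl = condense(K, 24)
print(Kl.shape)                        # (24, 24), as in the text
```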
3 determining the limit distance of the stiffeners
the numerical model of the shell used in the parametric study is shown in fig. 1. for the parametric study, four basic sets of calculations were used according to the diameter of the shell; the calculated diameters were 400 mm, 800 mm, 1600 mm and 2400 mm, and the varying parameters were the length and the thickness of the shell. the boundary conditions simulated an absolutely rigid ring stiffener of the cylindrical shell, so that the whole numerical model represented a segment of the shell between two rigid stiffeners. in the linear analysis of the structure (la) and in the geometrically non-linear analysis (gna) of the structure, the following parameters were checked:
• deformation uy, the deformation of the shell measured in the wind direction. the maximum value is always situated in the middle of the length of the structure, on the side exposed to the wind. the minimum value is in the middle of the length of the structure at the periphery angle of 135°, measured from the side exposed to the wind.
• deformation ux, the deformation of the shell measured perpendicularly to the wind direction. the maximum value is situated in the middle of the length of the structure at the periphery angle of 75°.
• circumferential stress σθ,max, the tensile stress in the circumferential direction. the maximum circumferential stress of the shell model was found in the middle of the length of the structure at the periphery angle of 135°; the minimum circumferential stress σθ,min was found in the middle of the length of the structure at the periphery angle of 90°.
• meridian stress σx,min, the compression stress of the shell. the minimum meridian stress of the shell model is situated in the middle of the length of the structure on the side exposed to the wind.

fig. 1: numerical model of the shell used for determining the limit distance of the stiffeners
fig. 2: calculated deformations and stresses of the shell in the middle of the length; the values are obtained for diameter 400 mm, thickness 0.5 mm and length 2.0 m, and for loading by the wind downstream of axis y

an example of the calculated parameters of the shell model in the middle of the shell length is shown in fig. 2. in the classical linear analysis of buckling, characterised by bifurcation, the smallest eigenvalue (defined as the critical multiple r_cr of the given loading, according to [27]) is found when stability is lost. for comparison, simple beam models composed of 1d elements were examined at the same time. these beam models had the same boundary conditions as those used in the shell models, and the loading of the beam model was obtained by integrating the loading around the shell. the beam models were used mainly for splitting off the beam behaviour of the shell (sometimes called semi-bending behaviour). in all beam models, a linear analysis of the structure (la) was made, because a geometrically non-linear analysis (gna) was confirmed to be almost irrelevant for these models. in the beam models, the deformation of the beam and the maximum tension stress were at their highest compared with the similar parameters of the shell structures mentioned above. the dependences between the parameters of the shell (thickness and distance of the stiffeners) and the parameters described above (deformations, stresses, and critical buckling resistance) were investigated. the main parameter of the "shell" behaviour was defined as the deformation uy obtained after subtracting the deformation determined by the beam model and by the gna model, respectively. in accordance with practical experience with steel shells, the limit level of deformation was chosen as d/6000, where d is the diameter of the shell; for comparison, values corresponding to d/10000 were also investigated. at the same time, the circumferential stress was also checked; it was found to be very low, with values not higher than 3 mpa. in the next step, we investigated the influence of the distance of the ring stiffeners on the thickness of the shell. linear regression [20] was chosen as the optimum method for this. the relation between the distances of the stiffeners and the thickness of the shell is shown in fig. 3, and the parameters "a" and "b" of fig. 3 are plotted for different diameters of the shell in fig. 4. the relation between the distance of the stiffeners l and the thickness t is as follows:

$l = a\,t + b,$  (1)

where a, b are the parameters of the linear regression. for the allowable (limit) deformation of the shell d/6000, formula (1) can be rewritten as formula (2):

$l = (3.57\times10^{-9}d^3 - 1.82\times10^{-5}d^2 + 5.21\times10^{-2}d + 2.74\times10^{-1})\,t - 1.4\times10^{-4}d^2 + 1.7237\,d + 1348.53,$  (2)

and for d/10000 as formula (3):

$l = (9.29\times10^{-8}d^3 - 5.48\times10^{-4}d^2 + 9.12\times10^{-1}d + 6.23\times10^{1})\,t - 1.4\times10^{-4}d^2 + 2.1326\,d + 1505.3.$  (3)

in both formulas it is necessary to specify the diameter d and the thickness t in millimetres; the distance l is then also in mm.

fig. 3: dependence of the maximum distance of the stiffeners on the thickness of the shell for diameters 400 mm, 800 mm, 1600 mm and 2400 mm, for the limit deformation of the shell d/6000
fig. 4: dependence of parameters "a" and "b" of the linear regression on the diameter of the shell for the limit deformations of the shell d/6000 and d/10000

we can summarize this section as follows: if the distance between the stiffeners is smaller than l according to (2) or (3), we may apply the beam model for the shell with good results.
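the fitting step behind formula (1) is an ordinary least-squares line; a minimal sketch is given below. the (t, l) samples are hypothetical stand-ins for the values read off the fem parametric study, not the published data.

```python
import numpy as np

# hypothetical (thickness, limit distance) pairs for one shell diameter,
# standing in for the values obtained from the fem parametric study
t = np.array([2.0, 4.0, 8.0, 12.0, 16.0, 20.0])            # shell thickness [mm]
l = np.array([2600., 3300., 4750., 6100., 7600., 9000.])   # limit distance [mm]

a, b = np.polyfit(t, l, deg=1)   # least-squares fit of l = a*t + b
print(f"a = {a:.1f} mm/mm, b = {b:.1f} mm")
```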
4 assessment of optimum stiffeners
the numerical model of the shell used in the parametric study is shown in fig. 5 and fig. 6. for the study, we chose four basic sets of calculations according to the diameter of the shell (400, 800, 1600 and 2400 mm). the varying parameters were the thickness and the length of the shell and the stiffness of the ring stiffener in relation to the cross-section area of the shell. the boundary conditions in the numerical model simulated the absolutely stiff base of the chimney on the lower end of the shell. the complete numerical model consisted of three shell segments separated by rigid stiffeners; there were two internal stiffeners and one terminal stiffener on the free end of the structure, see fig. 6. this parametric study investigated the semi-bending component of the behaviour of the cylindrical shell. la and classical linear buckling analysis were performed for all models; for some models, gna was also performed, in order to verify the original model more thoroughly. in the linear analysis, the following parameters were investigated:
• deformation uy of the stiffeners, i.e. the deformation of every stiffener in the wind direction. the maximum deformation of the stiffeners is always situated on the side exposed to the wind, at the periphery angle of 0°; the minimum deformation is situated at the periphery angle of 180°.
• the maximum values of the bending moment mz, shear force vy, and normal force n in every stiffener.
• deformation uy of the shell in the wind direction. the maximum deformation of the shell is situated in the middle of the length of the structure on the side exposed to the wind. the observed points are marked a, b and c in fig. 6.

an example of the calculated parameters of the ring stiffener is shown in fig. 7. at the same time, simple beam models for separating the beam and the semi-bending behaviour of the shell were also investigated. fig. 8 and fig. 9 compare the beam deformation and the semi-bending deformation along the whole length of the structure for one numerical model and a certain stiffness of the stiffener.

fig. 5: deformed numerical model of the shell for diameter 2400 mm and thickness 1.5 mm, with the ring stiffeners modelled with the help of the shell elements, with isozones of the deformation in the wind direction
fig. 6: numerical model of the shell that was used for determining the optimum distance of the stiffeners. the stiffeners and the other investigated points are identified in the figure
fig. 7: parameters of ring stiffener no. 2 for a cylindrical shell of diameter 1600 mm, thickness 3.0 mm and distance between stiffeners 4.0 m. stiffeners of profile 300/24 (second moment 5.4×10⁷ mm⁴) and loading by wind downstream of axis y are investigated
fig. 8: comparison of the lengthwise deformation of the shell for one numerical model and a given stiffness of the stiffener on the side exposed to the wind
fig. 9: comparison of the lengthwise semi-bending deformation of the shell for one numerical model and a given stiffness of the stiffener on the side exposed to the wind

the results are as follows: a higher stiffness of the internal stiffeners causes higher internal forces in these stiffeners, but an entirely opposite dependence was observed for the end stiffener. it seems that the function of the two types of stiffeners is completely different. for each set of calculations, we obtained the relation between the parameters of the shell (thickness, length, and stiffness of the stiffeners) and the most realistic numerical model. the biggest semi-bending deformation uy of a stiffener, measured in the wind direction, was chosen as the master parameter of the shell for determining the optimum stiffener. the deformations were separated for the beam and the shell model, and all intermediate stiffeners were analysed separately from the end stiffeners. during the investigation of the semi-bending deformations of the stiffeners for varying shell thickness and second moment of the stiffeners, we found that if the distance of the stiffeners corresponds to (2), the behaviour of the different stiffeners is almost identical. therefore, in every set of calculations we defined a uniform dependence between the biggest semi-deformation and the second moment of the stiffeners for a given type of stiffener (intermediate, end). with these dependences, we calculated the optimum stiffness and distance of the stiffeners.
the value d/6000 was used as the optimum level of the semi-bending deformations of the stiffener, the same value as in the first part of the paper. the next step of the study was to search for the dependence of the optimum second moment of the stiffeners on the diameter of the shell, and for the relationship between the internal forces in the stiffener and the diameter of the shell. by the least squares method, polynomial regressions of the second and third degree were chosen as the best approximation. the relations between the second moment of the stiffeners and the diameter of the shell, and between the internal forces in the stiffeners and the diameter, are shown in figs. 10 to 13.

fig. 10: dependence of the second moment of the ring stiffeners on the diameter of the shell
fig. 11: dependence of the maximum bending moment of the ring stiffeners on the diameter of the shell
fig. 12: dependence of the maximum shear force of the ring stiffeners on the diameter of the shell
fig. 13: dependence of the maximum normal force of the ring stiffeners on the diameter of the shell

the analytical formulas are as follows; it is important to emphasize that these formulas are valid only for distances according to (2). the optimum second moment of the intermediate stiffeners is given by the formula

$i = 3.03\,d^2 - 3350\,d + 872700.$  (4)

the optimum second moment of the end stiffener is given by the formula

$i = 1.06\,d^2 - 935.6\,d + 178200.$  (5)

in both formulas (4) and (5), the diameter of the shell d must be in millimetres; the second moment of the stiffeners is then in mm⁴. in the intermediate stiffeners, the internal forces are (maximum values):

$m = -1.244\times10^{-10}d^3 + 1.171\times10^{-6}d^2 - 9.079\times10^{-4}d + 2.097\times10^{-1},$  (6)

$v = -5.315\times10^{-10}d^3 + 2.541\times10^{-6}d^2 - 9.710\times10^{-4}d + 2.563\times10^{-1},$  (7)

$n = -3.974\times10^{-9}d^3 + 1.562\times10^{-5}d^2 - 1.217\times10^{-2}d + 3.005.$  (8)

in the end stiffener, the internal forces are:

$m = 8.302\times10^{-11}d^3 - 1.194\times10^{-8}d^2 + 4.259\times10^{-5}d - 1.254\times10^{-2},$  (9)

$v = 5.252\times10^{-11}d^3 - 9.791\times10^{-8}d^2 + 3.283\times10^{-4}d - 7.084\times10^{-2},$  (10)

$n = -8.306\times10^{-11}d^3 + 4.020\times10^{-7}d^2 - 1.441\times10^{-4}d + 2.940\times10^{-3}.$  (11)

in formulas (6) to (11), the diameter of the shell d must be in millimetres; the internal forces in the stiffeners are then in kn·m (bending moments) and in kn (normal and shear forces). from all the formulas given above, it is obvious that the second moment of the stiffeners and the maximum internal forces in the stiffeners depend only on the diameter of the shell; the effect of the shell thickness was important only for the limit distance of the stiffeners.
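as a worked example of formulas (4) and (6) to (8) - with the polynomial coefficients as reconstructed above, so the printed numbers should be taken as indicative - the following snippet evaluates the optimum second moment and the design forces of an intermediate stiffener for one shell diameter.

```python
import numpy as np

# polynomial coefficients of eqs. (4) and (6)-(8), highest power first,
# as reconstructed above; d in mm, i in mm^4, m in kN*m, v and n in kN
i_int = np.poly1d([3.03, -3350.0, 872700.0])
m_int = np.poly1d([-1.244e-10, 1.171e-6, -9.079e-4, 2.097e-1])
v_int = np.poly1d([-5.315e-10, 2.541e-6, -9.710e-4, 2.563e-1])
n_int = np.poly1d([-3.974e-9, 1.562e-5, -1.217e-2, 3.005])

d = 1600.0  # shell diameter [mm]
print(f"i = {i_int(d):.3e} mm^4")
print(f"m = {m_int(d):.3f} kN*m, v = {v_int(d):.3f} kN, n = {n_int(d):.3f} kN")
```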
5 conclusion
this paper has investigated the effect of the distance and the stiffness of ring stiffeners on the behaviour of a cylindrical steel shell loaded by wind. the aim of the work was to determine these characteristics of the shell in a way that would enable such structures to be calculated by beam numerical models, and that would make the analyses easier for the everyday design of structures of this type, for example steel chimneys. all results are presented as analytic formulas that are applicable in practice.

acknowledgment
this work was carried out at the department of steel structures at the czech technical university in prague. the research was supported by ministry of education project 6840770001. this financial support is gratefully acknowledged.

references
[1] blacker, m. j.: "buckling of steel silos and wind action." proceedings of silo – conference, university of karlsruhe, 1988, p. 318–330.
[2] brendel, b., ramm, e., fischer, d. f., rammerstorfer, f. g.: "linear and non-linear stability analysis of a thin cylindrical shell under wind loads." journal struct. mech., (1981), p. 91–113.
[3] brown, c. j., nielsen, j.: silos. fundamentals of theory, behaviour and design. london and new york, e & fn spon, 1998.
[4] derler, p.: "load-carrying behaviour of cylindrical shells under wind load." phd thesis, technical university of graz, 1993 (in german).
[5] esslinger, m., poblotzki, g.: "buckling under wind pressure." der stahlbau, vol. 61 (1992), no. 1, p. 21–26 (in german).
[6] koloušek, v., pirner, m., fischer, o., náprstek, j.: wind effect on civil engineering structures. academia, prague and elsevier, london, 1983.
[7] lemák, d.: "vliv obvodových výztuh na chování válcové skořepiny." disertace, čvut praha, 2003.
[8] lemák, d., studnička, j.: "vliv obvodových výztuh na působení ocelové válcové skořepiny." stavební obzor, vol. 13 (2004), no. 4, p. 112–117.
[9] lemák, d., studnička, j.: "behaviour of steel cylindrical shells." proceedings international conference vsu 2004, sofia, may 2004.
[10] flügge, w.: stresses in shells. berlin, springer, 1973.
[11] greiner, r.: "buckling of cylindrical shells with stepped wall-thickness under wind load." der stahlbau, vol. 50 (1981), no. 6, p. 176–179 (in german).
[12] johns, d. j.: "wind induced static instability of cylindrical shells." j. wind eng. and ind. aerodynamics, vol. 13 (1983), p. 261–270.
[13] kolář, v., němec, i., kanický, v.: fem principy a praxe metody konečných prvků. computer press, 1997.
[14] kolář, v., němec, i.: "finite element analysis of structures." united nations development program, economic com. for europe, workshop on cad techniques, prague – geneva, vol. i, june 1984, 248 p.
[15] křupka, v., schneider, p.: konstrukce aparátů. brno, pc-dir, 1998.
[16] křupka, v.: výpočet válcových tenkostěnných kovových nádob a potrubí. praha, sntl, 1967.
[17] kundurpi, p. s., samevedam, g., johns, d. j.: "stability of cantilever shells under wind loads." proc. asce, 101, em5, 1975, p. 517–530.
[18] megson, t., harrop, j., miller, m.: "the stability of large diameter and thin-walled steel tanks subjected to wind loading." proc. of international colloquium, university of ghent, 1987, p. 529–538.
[19] rammerstorfer, f. g., auli, w., fischer, f.: "uplifting and stability of wind-loaded vertical cylindrical shells." engineering computations, vol. 2 (1985), p. 170–180.
[20] rektorys, k.: přehled užité matematiky ii. prometheus, praha, 2000.
[21] resinger, f., greiner, r.: "buckling of wind loaded cylindrical shells – application to unstiffened and ring-stiffened tanks." proc. state of the art colloquium, university of stuttgart, germany, 1982, p. 6–7.
[22] schweizerhof, k., ramm, e.: "stability of cylindrical shells under wind loading with particular reference to follower load effect." proc. joint us-australian workshop on loading, analysis and stability of thin-shell bins, tanks and silos, university of sydney, 1985.
[23] singer, j., arbocz, j., weller, t.: buckling experiments: experimental methods in buckling of thin-walled structures. new york, wiley, 2002.
[24] studnička, j.: navrhování tenkostěnných za studena tvarovaných profilů. praha, academia, 1994.
[25] uematsu, y., uchiyama, k.: "deflection and buckling behaviour of thin, circular cylindrical shells under wind loads." j. wind eng. and ind. aerodynamics, vol. 18 (1985), no. 3, p. 245–262.
[26] zienkiewicz, o. c.: the finite element method in engineering science. mcgraw-hill, london, 1979, 3rd ed., 787 p., chapters 18–19 (nonlinear problems).
[27] pren 1993-1-6, "design of steel structures," part 1–6: general rules – supplementary rules for shells, stage 34, cen, brussels, 2004.
[28] en 1991-2-4, "basis of design and actions on structures," part 2–4: actions on structures – wind actions, cen, brussels, 2003.
[29] pren 1993-1-1, "design of steel structures," part 1–1: general rules and rules for buildings, stage 49, cen, brussels, 2003.

ing. daniel lemák, ph.d., phone: +420 585 700 701, fax: +420 585 700 707, e-mail: lemak@statika.iol.cz, statika olomouc, s.r.o., balbínova 11, 779 00 olomouc, czech republic
prof. ing. jiří studnička, drsc., phone: +420 224 354 761, fax: +420 233 337 466, e-mail: studnicka@fsv.cvut.cz, dept. of steel structures, faculty of civil engineering, czech technical university in prague, thákurova 7, 166 29 praha 6, czech republic

deflections and frequency analysis in the milling of thin-walled parts with variable low stiffness
serhii kononenko, sergey dobrotvorskiy, yevheniia basova∗, magomediemin gasanov, ludmila dobrovolska
national technical university "kharkiv polytechnic institute", kyrpychova str. 2, kharkiv, ukraine
∗ corresponding author: e.v.basova.khpi@gmail.com
acta polytechnica 59(3):283–291, 2019, doi:10.14311/ap.2019.59.0283

abstract. the disadvantage of the geometry of thin-walled parts, in terms of processing, is their low ability to resist static and dynamic loads.
this is caused by the elastic deformation of elements with a low stiffness. modelling approaches for the evaluation of deflections during machining are presented. mathematical models of deflections, cutting forces and harmonic response are proposed. the processes of material removal and the deflection of a thin-walled sample at the critical points are modelled. a frequency analysis was performed, consisting of a modal analysis of the natural frequencies and a harmonic response analysis. as a result, a graph of the deflection amplitude versus the frequency of the driven harmonic oscillations is generated and the obtained values are analysed; the resonance frequency and the maximum amplitude of oscillations for the operating parameters are determined.

keywords: thin-walled parts, undesirable deflections, end milling, modal analysis, harmonic response analysis.

1. introduction
in several branches of industry, such as mechanical engineering, aircraft manufacturing, drive production, etc., engineering products that belong to the class of thin-walled parts with variable stiffness are particularly prevalent. the processing of such parts on metal-cutting machines generally requires either the application of specially developed devices that prevent the parts from being deformed by the cutting and fixing forces, or the use of new machining techniques. the creation and use of special devices is associated with additional costs and, as a result, with an increase in the cost of production. the transition to contemporary machining methods such as high-speed milling reduces the amount of special technological equipment; however, this machining technique needs to be studied as well, and the positive impact of the reduced cutting forces on the thin-walled element should be confirmed. the aim of the research is to develop technological solutions that reduce the undesirable deviations of the geometry of thin-walled elements of parts in high-speed end milling.

2. literature review
parts such as turbine blades, impellers and the like, which have thin-walled elements in their geometry, are indispensable in the automobile, aerospace and other industries [1–4]; such critical parts of mechanisms are used in a variety of drive units [5]. the formation of the surfaces of thin-walled parts with variable low stiffness requires the consideration of a number of factors that could prevent the achievement of the technological requirements for the product [6, 7]. such factors are: deviations from a given shape that increase during processing, vibrations, thermal deformations, and errors caused by the tools, the tool-path accuracy, the technological equipment [8, 9] and the fixtures [10, 11]. in modern industry and research, methods for recording and analysing the parameters of the cutting process for thin-walled parts are being actively developed. one of the methods, based on a prediction of the cutting parameters, is discrete-time modelling of dynamic milling systems; the discrete-time model is general, and it can be used simultaneously for predicting the stability as well as the time response of the milling system [12]. an experiment-based study provides a systematic measurement of the process response parameters, namely the cutting force and the surface roughness [13]; dynamic stiffness measuring methods are presented in [14].
in addition, special attention is paid to system identification, signal and modal analysis, and vibration absorption [15–18], using frame structures as a test rig for numerical identification techniques. other research presents methods based on a variable correction designed to compensate for the deflection of the non-rigid construction of blades [19] and for the technological and other factors affecting the accuracy characteristics of blade processing. the correction values depend on the contact area of the tool with the workpiece. the converted parametric view of the machining program makes it possible for the operator to enter the compensating values into the machine manually; in the process, the operator checks the correctness of the processing, and in the case of incorrect values of the correction, the program has to be changed and the processing starts again [20]. a study based on a simulation of the distortion due to the machining of thin-walled components states that the distortion of the components is strongly related to the residual stress state induced by manufacturing processes like heat treatment, forming or machining [21]. as a result of the review, it was decided to focus on the forces in the process of the material removal, on fe end-milling modelling and on oscillation analysis, in order to determine the degree of the undesirable deflections. for a basic estimation and further validation, a simplified model of the part is considered as a cantilever beam [22, 23], which corresponds to an earlier study [24], where it was clarified that the maximum deviation occurs at the free end point of the sample.

3. research methodology
the technological solution for determining the values of the undesirable deflections consists, in this paper, of a preliminary calculation of the forces; a comparison of the obtained values with the automated finite element calculation; the insertion of the directional milling force into the deflection model; and obtaining the deflection amplitude by considering the milling tool as a driven harmonic oscillator.

3.1. deflection model
during the processing of the thin-walled element by the end mill, undesirable deviations occur in the direction of the force of the cutter pressure on the surface. an additional complexity of calculating the deflections is the variable distribution of stiffness in each section, caused by the geometry of a thin-walled element similar to an impeller blade. the considered model, with a larger thickness at the base and tapering towards the end, is presented in fig. 1. the model is shear-undeformable and represents the static force interaction; it is used for a further rough validation of the deflection values. the basis for the equation of the deflection model is castigliano's second theorem, which states that the displacement ∆ of the point of application of a generalized force is equal to the partial derivative of the complementary strain energy u_c with respect to this generalized force f:

$\Delta = \dfrac{\partial u_c}{\partial f}.$  (1)

the complementary strain energy of bending is

$u_c = \int_0^l \dfrac{m^2}{2\,e\,i}\,\mathrm{d}x,$  (2)

where young's modulus e is a measure of the material stiffness [25]. the bending moment m and the moment of inertia i are functions of x:

$m = f\,x,$  (3)

figure 1. model of a sample under the directional force.

$i = \dfrac{b\,(y_0 + \alpha x)^3}{12},$  (4)

where b is the beam width and the thickness-increase measure α is represented as (y₁ − y₀)/l.
combining these expressions into the complementary strain energy of bending (2):

$u_c = \dfrac{6 f^2}{e b}\int_0^l \dfrac{x^2}{(y_0+\alpha x)^3}\,\mathrm{d}x.$  (5)

combining the expressions into castigliano's equation (1), the deflection is

$\Delta = \dfrac{\partial u_c}{\partial f} = \dfrac{12 f}{e b}\int_0^l \dfrac{x^2}{(y_0+\alpha x)^3}\,\mathrm{d}x.$  (6)

as a result of the calculations, with young's modulus for aluminium e = 69 gpa, y₁ = 9.75 mm, y₀ = 4.75 mm, b = 40 mm, l = 70 mm and f = 184 n obtained from section 3.2, the deflection is ∆ = 0.1642 mm.
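equation (6) is easy to check numerically; the short script below, in the spirit of the scipy treatment cited in [23], reproduces the stated deflection of about 0.164 mm (all values as given above, in mm and n units, so e = 69 000 n/mm²).

```python
from scipy.integrate import quad

e, b, l = 69000.0, 40.0, 70.0      # young's modulus [n/mm^2], width [mm], length [mm]
y1, y0, f = 9.75, 4.75, 184.0      # thicknesses [mm] and transverse force [n]
alpha = (y1 - y0) / l              # thickness-increase measure

integral, _ = quad(lambda x: x**2 / (y0 + alpha * x)**3, 0.0, l)
deflection = 12.0 * f / (e * b) * integral
print(f"deflection = {deflection:.4f} mm")   # ~0.164 mm, matching eq. (6)
```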
3.2. directional cutting force model
to determine the forces acting in the cross-section of the sample, as the factors most influencing the undesirable deflections, it is necessary to define the components of the cutting forces, fig. 2. the scheme of the cutting forces during milling depends on the machining method and on the type of the cutter. for the processing of thin-walled parts, such as blade profiles, end mills are used. the finishing climb milling of an aluminium-alloy sample is presented here; climb milling provides a surface with a lower roughness and a higher accuracy. the tooth cuts into the material at point a, fig. 2, causing a number of forces directed towards the surface. in the case of thin-walled elements that are not deprived of their degree of freedom on the back side, however, the deflection of the element by the cutter leads to processing errors.

figure 2. components of cutting forces in the end milling.

in accordance with the reference book [26], the total tangential force fz and the radial force fy have a resultant fyz, which can be decomposed into two forces: the lengthwise force fh and the transverse force fv. in the processing model, the lengthwise force fh is directed along the line of the material removal; the force fv is directed perpendicularly to fh, in the direction of the supposed deflection of the blade sample. to determine the transverse force fv, which is responsible for the occurrence of the deviations, the total tangential force fz is needed:

$f_z = \dfrac{10\,c_p\,t^x\,s_z^y\,b^n\,z}{d^q\,n^w}\,k_{mp},$  (7)

where c_p is a coefficient that takes into account the physical and mechanical properties of the material (a tabulated value); x, y, w, q are exponents that depend on the type of machining, the material of the part and the material of the cutting tool; t is the depth of cut, mm; s_z the feed, mm/rev; b the cutting width, mm; z the number of cutter teeth; d the diameter of the mill, mm; n the rotational speed, min⁻¹; and k_mp the general correction factor, which takes into account the quality of the processed material [26]. as a result of the calculations, with a cutting depth t = 0.25 mm, cutting width b = 3 mm, diameter of the contact area of the end spiral three-blade mill d = 7.5 mm, and an aluminium alloy as the material of the part, the value of the tangential force is fz = 263.2 n. the forces fv and fz are in the ratio [26] fv : fz = 0.7–0.9; based on this ratio, the calculated value of the transverse force is fv = 184 n. the values obtained are necessary for a further comparison with the finite element analysis.

3.3. harmonic response model
in classical mechanics, a harmonic oscillator is a system that, when displaced from its equilibrium position, experiences a restoring force f proportional to the displacement x:

$f = -k x - b v,$  (8)

where k is a positive real number characteristic of the spring, x is the amount of the displacement, b is a constant that depends on the properties of the environment and the dimensions of the object, and v is the velocity of the object. driven harmonic oscillators are damped oscillators further affected by an externally applied force f(t). let us assume a driving force f = f₀ cos(ω_ext t) [27]; then the total force is

$f = f_0\cos(\omega_{ext} t) - k x - b v.$  (9)

the equation of motion, f = m·a, becomes

$m\dfrac{\mathrm{d}^2x}{\mathrm{d}t^2} = f_0\cos(\omega_{ext} t) - k x - b\dfrac{\mathrm{d}x}{\mathrm{d}t}.$  (10)

after a steady state has been reached, the position varies as a function of time as

$x(t) = a\cos(\omega t + \varphi),$  (11)

where ω = ω_ext is the angular frequency of the driving force. the amplitude of the oscillation is

$a = \dfrac{f_0}{m\sqrt{(\omega_0^2-\omega^2)^2 + (b\,\omega/m)^2}},$  (12)

where ω₀² = k/m, ω₀ being the natural frequency of the undamped oscillator. when the frequency of the driving force is close to the natural frequency and the drag force is small, the denominator in the above expression becomes very small and the amplitude becomes very large. this increase in amplitude is called resonance; ω = ω₀ is the resonance frequency. due to the geometrical complexity of the sample, the calculation is performed for a slab of a similar cross-sectional area and stiffness. the size of the slab is 70 mm long, 40 mm wide and 8.25 mm thick, and its stiffness is k = 1.14·10⁶; according to section 3.1, the stiffness of the model is k = f/∆ = 1.12·10⁶. to show the response of the slab to the force in a graph, duhamel's integral is used:

$u(f,t) = \dfrac{1}{m\,\omega_0}\int_0^t f_\tau\, e^{-h(t-\tau)}\sin\big(\omega(t-\tau)\big)\,\mathrm{d}\tau.$  (13)

the graph in fig. 3 displays the dependence of the oscillation amplitude on time for the force functions at resonance, f_r(t), and under normal conditions, f_n(t). it is assumed that, under normal conditions, the frequency f_n = 35 hz corresponds to the oscillations generated during conventional machining at 750 rpm by the three-blade end mill. based on the calculation results, the amplitude of the oscillation is a_n = 0.2247 mm and the resonance amplitude is a_r = 4.023 mm.

figure 3. the amplitudes of oscillations at resonance u(f_r, t) and under normal conditions u(f_n, t).
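a small sketch of the steady-state amplitude (12) is given below; the mass and the damping constant are illustrative placeholders (they are not stated in the text), so the printed numbers only illustrate the resonance behaviour, not the values a_n and a_r above.

```python
import numpy as np

def amplitude(freq_hz, f0, m, k, b):
    """steady-state amplitude of the driven damped oscillator, eq. (12)."""
    w = 2.0 * np.pi * freq_hz
    w0_sq = k / m
    return f0 / (m * np.sqrt((w0_sq - w**2)**2 + (b * w / m)**2))

# stiffness taken from the slab model above (n/m); mass and damping
# are assumed for the sake of the example
k, m, b, f0 = 1.14e6, 0.15, 2.0, 184.0
for f in (35.0, np.sqrt(k / m) / (2 * np.pi)):   # driving vs. resonance frequency
    print(f"f = {f:7.1f} hz, amplitude = {amplitude(f, f0, m, k, b)*1e3:.3f} mm")
```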
4. results
the results are based on a finite element analysis; the values obtained from the manually calculated models above and from the automated models are compared.

figure 4. model of analysis and forces probes.

4.1. end milling analysis
cad models of a helical conical three-blade end mill and of a thin-walled sample element with a variable distribution of stiffness were designed and imported into the cae program. based on the model defined in section 3.1, the length l of the sample is 70 mm and the width b is 40 mm. the thickest and the thinnest regions of the middle section of the part, y₁ and y₀, are 9.75 mm and 4.75 mm respectively; the thickness at the top edges (point 1, point 3, fig. 5) is 3.75 mm. the material, the contact surfaces of the sample and the cutter, the boundary surfaces and a mesh for the finite element calculation are set. the estimated allowance for the finishing process is in the range of up to 500 µm; the density of the mesh of the sample is therefore much smaller than that, and equals 0.5 mm. the initial position of the milling cutter and the sample is calculated at the cad design step. particular attention should be paid to the position of the sample and the tool, both relative to each other and relative to the coordinate system. in the cae environment, both elements are tied to the global coordinate system; a cylindrical coordinate system is additionally assigned to the milling cutter. having specified the parameters above, the dynamic parameters are set: the axial rotation of the milling cutter and the axial movement of the part along the machining length. one of the main parameters of interest in this article is the force acting on the thin-walled element in the transverse direction. the simulated force fv.fea can be obtained by taking stress probes from the back of the sample along the cutting area, as can be seen in fig. 4; it is necessary for the comparison with the calculated force fv.calc = 184 n. analysing the obtained values, it can clearly be seen that the finite element analysis gives approximate values of the forces. however, it is necessary to take into account the difference in the values depending on the location of the probes, despite the constant cutting parameters; this may be caused by inaccuracies in the context of this type of analysis. the error of the applied method is 12 % on average.

figure 5. set of critical points.

considering six points on the surface, fig. 5, five of them are critical: they are at the edges of the sample and far from the fixation site. consequently, at these points the maximum deviation from the initial state is presumably observed. point 5 is the test point where the lowest deflection is expected. the location of the points approximately matches the lines of the end milling. by loading the critical points by forces in sequence, it is possible to estimate the magnitude of the deviations in each region. the direction of the forces is set along the y axis of the local coordinate systems; the y axis is perpendicular to the sample surface. the values of the deviations picked at different levels are different; this is caused by the variable distribution of the stiffness of the sample [28, 29]. the values obtained by the analysis allow a clear evaluation of the maximum deflection values of the sample in the different regions, fig. 6. the results of the analysis are listed in tab. 1.

table 1. deflections estimation.
critical point | deflection value, mm
1 | 0.17984
2 | 0.15186
3 | 0.17984
4 | 0.06180
5 | 0.05319
6 | 0.06180
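tab. 1 gives 0.15186 mm at point 2, while the analytical model of section 3.1 gave 0.1642 mm at the same level; the relative difference between the two estimates can be computed directly:

```python
delta_castigliano = 0.1642   # analytical deflection at point 2 [mm], section 3.1
delta_fe = 0.15186           # fe deflection at point 2 [mm], tab. 1

rel_diff = abs(delta_castigliano - delta_fe) / delta_castigliano
print(f"relative difference: {rel_diff:.1%}")   # about 7.5 %
```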
the response of a system to a dynamic loading can be among the following [30]: (1.) the intensity of the response can converge; (2.) the system can constantly oscillate; (3.) the intensity of the system can diverge. in all cases, the 3-rd type of the response should be avoided since the system could collapse. if the excitation load has a frequency which becomes close to the natural frequency of the system, the oscillations increase, and if the frequencies coincide, a resonance phenomenon occurs with subsequent negative consequences. therefore, it is critical to be able to calculate the natural frequency of the system. for this purpose, the finite-element analysis of frequencies and oscillations is presented. this analysis calculates the natural frequency based on the geometry and material of the sample, and allows to determine the resonant frequency. the modal analysis settings are: max modes to find 6 (each of them has a tendency to oscillate in different directions), fixed support the base surface of the model. the result is shown in fig. 7, the output listed in tab. 2. the first mode 1728.4 hz is the most significant, since at this frequency, the sample is going to oscillate in the same direction as during the lengthwise processing. 4.3. harmonic response analysis having a modal analysis, the harmonic response analysis can be performed. any sustained cyclic load produces a sustained cyclic response (a harmonic response) in a structural system. the harmonic response analysis gives the ability to predict the sustained dynamic behaviour of structures, thus enabling to verify whether or not the designs will successfully overcome the resonance, fatigue, and other harmful effects of forced vibrations [31]. harmonic response analysis settings are: previously defined modal analysis, fixed support base surface of the model, analyzed frequency is in the range from 0 to 2000 hz, input nodal force load at the critical 287 s. kononenko, s. dobrotvorskiy, ye. basova et al. acta polytechnica figure 7. oscillations under different natural frequencies. mode frequency, hz 1 1728.4 2 5137.1 3 5839.5 4 7603.1 5 13097 6 17771 table 2. natural frequency analysis output. figure 8. frequency response location. point (thin end of the sample) 184 n. the frequency response is taken from the critical point 3 location fig. 8. the graphs of oscillation amplitude and phase angle are presented on the figure 9; values for each frequency and corresponding amplitude, phase angle are presented in the tabl. 3. the amplitude is the maximum value of the oscillation of the sample under the specified load and corresponding frequency. the phase angle is a measure of the time by which the load lags (or leads) a frame of reference. the range from 0 to 2000 hz is significant to analyse as the natural frequency lies in this range 1728.4hz. if the frequency of the cyclic load matches with the natural frequency, the oscillations increase and a resonance occurs. in confirmation, from the data received fig. 9, tab. 3, a large increase of the amplitude of the sample oscillations 2.7934 mm at the frequency 1720 hz 288 vol. 59 no. 3/2019 deflections and frequency analysis in the milling. . . figure 9. oscillation amplitude of the deflection of the sample and phase angle. frequency, hz amplitude, mm phase angle, deg. 
table 3. shortened list of frequencies and the corresponding amplitudes and phase angles.
frequency, hz | amplitude, mm | phase angle, deg
40 | 0.12408 | -0.0503
120 | 0.12458 | -0.1516
250 | 0.12649 | -0.32113
500 | 0.13455 | -0.68722
750 | 0.15072 | -1.1669
1000 | 0.18183 | -1.9077
1250 | 0.24956 | -3.3521
1500 | 0.4692 | -7.8414
1720 | 2.7934 | -76.045
1875 | 0.61746 | -165.95
2000 | 0.32309 | -171.92

this shows that the system starts to resonate with the frequency of the input loading. the expected rotation of the milling tool for a conventional machining is near 750 rpm, which corresponds to the frequency of 12.5 hz. the milling tool has three blades, so the expected frequency of the contact of the blades with the surface is tripled; the frequency of interest is thus near 37.5 hz, which is far below the natural frequency, so the milling can be safely performed. finally, according to table 3, the maximum oscillation amplitude of the sample at 40 hz is 0.12408 mm. in the same way, during high-speed machining at 15 000 rpm the frequency of interest is 750 hz, and the deflection is 0.15072 mm respectively. as a result of the calculations in section 3.3, the oscillation amplitude for the conventional machining is 0.2247 mm, whereas according to the fe analysis it is 0.12408 mm for the conventional machining and 0.15072 mm for the high-speed one. the difference is expected since, unlike the simplified model, the fe model has a thick base and thus a higher natural frequency, so it is more stable against oscillations. the difference in the results shows the disadvantage of the simplified model calculation of section 3.3 in contrast with the finite element simulation: the complexity of the manual mathematical description of the geometry imposes restrictions on the accuracy of the calculations for the specific model.
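the frequency bookkeeping used above is simple enough to script; the sketch below computes the tooth-passing frequency for a given spindle speed and flute count and compares it with the first natural frequency from tab. 2 (reporting the ratio as a percentage is an illustrative choice, not a threshold taken from the text).

```python
def tooth_passing_frequency(rpm, teeth):
    """frequency of tooth-surface contacts for a milling tool [hz]."""
    return rpm / 60.0 * teeth

f_natural = 1728.4  # first natural frequency of the sample [hz], tab. 2

for rpm in (750, 15000):
    f = tooth_passing_frequency(rpm, teeth=3)
    print(f"{rpm:>6} rpm -> {f:6.1f} hz ({f / f_natural:.0%} of the first mode)")
```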
energy and aerodynamic examination of slightly backward leaning impeller blading of small centrifugal compressors. periodica polytechnica transportation engineering 43:199–205, 2015. doi:10.3311/pptr.8093. [4] f. song, y. ni, z. tan. optimization design, modeling and dynamic analysis for composite wind turbine blade. procedia engineering 16:369–375, 2011. doi:10.1016/j.proeng.2011.08.1097. [5] m. amine chelabi, m. kamel hamidou, m. hamel. effects of cone angle and inlet blade angle on mixed inflow turbine performances. periodica polytechnica mechanical engineering 61:225–233, 2017. doi:10.3311/ppme.9890. [6] m. pompa. computer aided process planning for high-speed milling of thin-walled parts: strategy-based support. ph.d. thesis, university of twente, the netherlands, 2010. doi:10.3990/1.9789036530408. [7] v. ivanov. process-oriented approach to fixture design. in design, simulation, manufacturing: the innovation exchange, pp. 42–50. springer, 2018. doi:10.1007/978-3-319-93587-4_5. [8] s. dobrotvorskiy, y. basova, m. ivanova, et al. forecasting of the productivity of parts machining by high-speed milling with the method of half-overlap. diagnostyka 19:37–42, 2018. doi:10.29354/diag/93136. [9] a. permyakov, s. dobrotvorskiy, l. dobrovolska, et al. computer modeling application for predicting of the passing of the high-speed milling machining hardened steel. in design, simulation, manufacturing: the innovation exchange, pp. 135–145. springer, 2018. doi:10.1007/978-3-319-93587-4_15. [10] v. karpus, v. ivanov. choice of the optimal configuration of modular reusable fixtures. russian engineering research 32:213–219, 2012. doi:10.3103/s1068798x12030124. [11] v. ivanov, i. dehtiarov, y. denysenko, et al. experimental diagnostic research of fixture. diagnostyka 19:3–9, 2018. doi:10.29354/diag/92293. [12] c. eksioglu, z. kilic, y. altintas. discrete-time prediction of chatter stability, cutting forces, and surface location errors in flexible milling systems. journal of manufacturing science and engineering 134:061006, 2012. doi:10.1115/1.4007622. [13] g. bolar, a. das, s. n. joshi. measurement and analysis of cutting force and product surface quality during end-milling of thin-wall components. measurement 121:190–204, 2018. doi:10.1016/j.measurement.2018.02.015. [14] t. umezu, d. kono, a. matsubara. evaluation of on-machine measuring method for dynamic stiffness of thin-walled workpieces. procedia cirp 77:34–37, 2018. doi:10.1016/j.procir.2018.08.204. [15] a. procházka, j. uhlíř, p. w. j. rayner, n. g. kingsbury (eds.). signal analysis and prediction. applied and numerical harmonic analysis, chap. system identification, pp. 163–173. birkhäuser, boston, ma, 1998. [16] c. pappalardo, d. guida. development of a new inertial-based vibration absorber for the active vibration control of flexible structures. engineering letters 26:372–385, 2018. accessed: 02 february 2019. [17] c. pappalardo, d. guida. system identification and experimental modal analysis of a frame structure. engineering letters 26:56–68, 2018. [18] j.-n. juang, m. q. phan. identification and control of mechanical systems. cambridge university press, 2001. doi:10.1017/cbo9780511547119. [19] y. altintas, o. tuysuz, m. habibi, z. li. virtual compensation of deflection errors in ball end milling of flexible blades. cirp annals 67(1):365–368, 2018. doi:10.1016/j.cirp.2018.03.001. [20] v. f. mozgovoy, k. b. balushok, i. i. kotov, b. m. k. strategies for processing blades on cnc machining centers with variable 3d correction. 
aerospace engineering and technology 7:22–28, 2013. [21] v. schulze, p. arrazola, f. zanger, j. osterried. simulation of distortion due to machining of thin-walled components. procedia cirp 8:45–50, 2013. doi:10.1016/j.procir.2013.06.063. [22] dr. drang. an application of castigliano’s second theorem with octave. https://leancrew.com/all-this/2009/10/ an-application-ofcastilianos-second-theorem-with-octave/, 2009. accessed: 01 june 2018. 290 http://dx.doi.org/10.3311/pptr.8093 http://dx.doi.org/10.1016/j.proeng.2011.08.1097 http://dx.doi.org/10.3311/ppme.9890 http://dx.doi.org/10.3990/1.9789036530408 http://dx.doi.org/10.1007/978-3-319-93587-4_5 http://dx.doi.org/10.29354/diag/93136 http://dx.doi.org/10.1007/978-3-319-93587-4_15 http://dx.doi.org/10.3103/s1068798x12030124 http://dx.doi.org/10.29354/diag/92293 http://dx.doi.org/10.1115/1.4007622 http://dx.doi.org/10.1016/j.measurement.2018.02.015 http://dx.doi.org/10.1016/j.procir.2018.08.204 http://dx.doi.org/10.1017/cbo9780511547119 http://dx.doi.org/10.1016/j.cirp.2018.03.001 http://dx.doi.org/10.1016/j.procir.2013.06.063 https://leancrew.com/all-this/2009/10/an-application-of-castilianos-second-theorem-with-octave/ https://leancrew.com/all-this/2009/10/an-application-of-castilianos-second-theorem-with-octave/ https://leancrew.com/all-this/2009/10/an-application-of-castilianos-second-theorem-with-octave/ vol. 59 no. 3/2019 deflections and frequency analysis in the milling. . . [23] dr. drang. revisiting castigliano with scipy. www.leancrew.com/all-this/2013/01/ revisiting-castigliano-with-scipy/, 2013. accessed: 02 june 2018. [24] s. dobrotvorskiy, y. basova, s. kononenko. improvment of technology of milling parts with uneven stiffness. collection of scientific works: open information and computer integrated technologies 72:105–111, 2016. [25] k. grote, e. k. antonsson (eds.). springer handbook of mechanical engineering. springer-verlag berlin heidelberg, 2009. [26] a. kosilova, r. meshcheryakova (eds.). reference book of a technologist-machine engineer. springer-verlag berlin heidelberg, moscow, 4th edn., 1985. [27] the university of tennessee, department of physics and astronomy. damped and driven oscillations . http://labman.phys.utk.edu/phys221core/modules/ m11/damped_and_driven_oscillations.html. accessed: 02 october 2018. [28] sonnerlind, h. modeling linear elastic materials how difficult can it be? https://comsol.com/blogs/ modeling-linear-elastic-materialshow-difficult-can-it-be/. accessed: 03 june 2018. [29] datta, s. computing stiffness of linear elastic structures: part 1. https://comsol.com/blogs/computingstiffness-linear-elastic-structures-part-1/. accessed: 07 june 2018. [30] cyprien. what is fea modal analysis? https://thewikihow.com/video_-_fdvg1-9yi. accessed: 27 september 2018. [31] chapter 4: harmonic response analysis. www.ansys. stuba.sk/html/guide_55/g-str/gstr4.htm. accessed: 03 october 2018. 
291 www.leancrew.com/all-this/2013/01/revisiting-castigliano-with-scipy/ www.leancrew.com/all-this/2013/01/revisiting-castigliano-with-scipy/ http://labman.phys.utk.edu/phys221core/modules/m11/damped_and_driven_oscillations.html http://labman.phys.utk.edu/phys221core/modules/m11/damped_and_driven_oscillations.html https://comsol.com/blogs/modeling-linear-elastic-materials-how-difficult-can-it-be/ https://comsol.com/blogs/modeling-linear-elastic-materials-how-difficult-can-it-be/ https://comsol.com/blogs/modeling-linear-elastic-materials-how-difficult-can-it-be/ https://comsol.com/blogs/computing-stiffness-linear-elastic-structures-part-1/ https://comsol.com/blogs/computing-stiffness-linear-elastic-structures-part-1/ https://thewikihow.com/video_-_fdvg1-9yi www.ansys.stuba.sk/html/guide_55/g-str/gstr4.htm www.ansys.stuba.sk/html/guide_55/g-str/gstr4.htm acta polytechnica 59(3):283–291, 2019 1 introduction 2 literature review 3 research methodology 3.1 deflection model 3.2 directional cutting force model 3.3 harmonic response model 4 results 4.1 end milling analysis 4.2 modal analysis 4.3 harmonic response analysis 5 discussion 6 conclusion references acta polytechnica doi:10.14311/ap.2017.57.0467 acta polytechnica 57(6):467–469, 2017 © czech technical university in prague, 2017 available online at http://ojs.cvut.cz/ojs/index.php/ap further generalisations of the kummer-schwarz equation: algebraic and singularity properties r. sinuvasana, c, ∗, k. krishnakumara, k. m. tamizhmania, p. g. l. leachb a department of mathematics, pondicherry university, kalapet, india 605 014 b school of mathematics, statistics and computer science, university of kwazulu-natal, private bag x54001, durban 4000, republic of south africa and institute for systems science, department of mathematics, durban university of technology, pob 1334, durban 4000, republic of south africa c department of mathematics, sastra deemed university, thanjavur, india 613 401 ∗ corresponding author: rsinuvasan@gmail.com abstract. the kummer–schwarz equation, 2y′y′′′ − 3(y′′)2 = 0, has a generalisation, (n − 1)y(n−2)y(n) − ny(n−1) 2 = 0, which shares many properties with the parent form (see sinuvasan r, tamizhmani k m & leach p g l, algebraic and singularity properties of a class of generalisations of the kummer–schwarz equation differ, equ dyn syst (2016), doi:10.1007/s12591-016-0327-5) in terms of symmetry and singularity. all equations of the class are integrable in closed form. here we introduce a new class, (n+q−2)y(n−2)y(n) −(n+q−1)y(n−1) 2 = 0, which has different integrability and singularity properties. keywords: kummer-schwarz; symmetries; singularities; integrability. 1. introduction in [12] we delineated the properties of the class of equations (n−1)y(n−2)y(n)−n(y(n−1))2 = 0, n = 2, 3, . . . (1) in terms of symmetry, singularity and integrability. the class (1) differed from the class with general numerical coefficients in that the number of lie point symmetries was one greater for general values of n (with n = 2 being an exceptional case). in the case of the kummer–schwarz equation [4] it was two greater, being the six comprising the direct sum of two sl(2,r) subalgebras. here we consider the generalisation (n + q−2)y(n−2)y(n) − (n + q−1)y(n−1) 2 = 0, n = 2, 3, . . . , (2) where q is a real number, subsequently to be more precisely defined. we examine (2) in terms of its singularity properties, symmetry properties and general integrability. 
we commence with the singularity properties of (2) and see that a successful satisfaction of the requirements imposes a constraint on the permissible values of q for the solution to be analytic. we turn to the symmetry properties of (2) and find some unexpected results. in terms of integrability there are no restrictions upon the value of q in a formal sense although in practice there may be restrictions. 2. singularity analysis we examine the sequence of equations introduced above in terms of singularity analysis. we follow the general method as outlined in [10, 14] with the modification for negative nongeneric resonances introduced by andriopoulos et al. [1]. theorem. the exponent of the leading-order term and the resonances of the nth member of the sequence of equations, (n + q−2)y(n−2)y(n) − (n + q−1)(y(n−1))2 = 0, n ∈ n, n > 1, (3) are p = −q and s = −1, 0,q, 1 + q, . . . ,n− 3 + q. proof. we substitute y = αχp, where χ = x − x0 and x0 is the location of the putative singularity. we remove a common factor p2(p−1)2 · · ·(p−n+ 3)2(p− n + 2). the values of p removed are all positive and so of no relevance to the singularity analysis. the remaining terms are (n + q−2)(p−n + 1) − (n + q−1)(p−n + 2) which, when put equal to zero, give the singularity to be p = −q. we write y = αχ−q + µχ−q+s and substitute into (3). we remove the common factors χ−2 and obtain (−q)(−q−1) · · ·(−q−n+ 3)(−q +s)(−q +s−1) · · · · · ·(−q + s−n + 3) this immediately gives the resonances, s = q,q + 1, . . . ,q + n− 3, which are all positive. the remaining terms are (n + q−2)(−q−n + 2)(−q−n + 1) + (n + q−2)(−q + s−n + 2)(−q + s−n + 1) − 2(n + q−1)(−q−n + 2)(−q + s−n + 2). 467 http://dx.doi.org/10.14311/ap.2017.57.0467 http://ojs.cvut.cz/ojs/index.php/ap r. sinuvasan, k. krishnakumar, k. m. tamizhmani, p. g. l. leach acta polytechnica when this equated to zero, we obtain the two additional resonances s = −1, 0. apart from the generic resonance of −1 all of the resonances are nonnegative numbers which contain the positive number q. when the resulting expansion make sense, the laurent expansion is a right painlevé series [2]. we illustrate the method with the fifth-order equation, (q + 3)y′′′y(5) − (q + 4)(y(4))2 = 0. (4) to determine the leading-order behaviour we set y = αχp, where χ = x−x0 and x0 is the location of the putative singularity. we obtain (q + 3)α2p2(p−1)2(p−2)2(p−3)(p−4)χ2p−8 − (q + 4)α2p2(p−1)2(p−2)2(p−3)2χ2p−8 which is zero if (q + 3)(p− 4) = (q + 4)(p− 3), i.e., p = −q. note that the coefficient of the leading-order term is arbitrary. to establish the terms at which the remaining constants of integration occur in the laurent expansion we make the substitution y = αχ−q + mχ−q+s. the various values s may take are determined by those values of s which make the coefficient of m so that m is arbitrary. the coefficient of m is a fifth-order polynomial the roots of which are s = −1, 0,q, 1 + q, 2 + q. 3. solution of the general equation equation (2), up to a multiplicative constant, may be written as d2y(n−2)(x)− 1 n+q−2 dx2 = 0 which is readily integrated to give the general solution of (2) as y = (−1)n (ax + b)−q an−2(q + n−3)(n−3) + n−3∑ i=0 cix i, (5) where the notation (q + n−3)(n−3) denotes the rising pochhammer symbol and means q(q+1) · · ·(q+n−3). we note that the solution of (2) exists no matter the value of q. naturally the utility of the solution depends upon the value of q because we are dealing with values in the complex plane. 4. 
symmetry properties the symmetry properties of (2) are rather complex and to give an indication of the complexity we quote the results for a small sample of equations. we commence with n = 2 for which we obtain the symmetries, { ∂ ∂x ,y ∂ ∂y ,x ∂ ∂x ,y 1 q +1 ∂ ∂y ,xy 1 q +1 ∂ ∂y ,qy−1/q ∂ ∂x , xy ∂ ∂y − x2 q ∂ ∂x ,y1− 1 q ∂ ∂y − 1 q ( xy−1/q ) ∂ ∂x } , with the algebra sl(3,r). the transformation to the archetypal second-order equation (up to a multiplicative factor) is y(x) → w(x)−q. the transformation follows from inspection of the structures of the symmetries above. in the case of the third-order equation, i.e., n = 3, we obtain four possible algebraic structures. these are { ∂ ∂y , ∂ ∂x ,y ∂ ∂y ,x ∂ ∂x } ,{ ∂ ∂y , ∂ ∂x ,y ∂ ∂y ,x ∂ ∂x ,y2 ∂ ∂y ,x2 ∂ ∂x } ,{ ∂ ∂y , ∂ ∂x ,y ∂ ∂y ,x ∂ ∂y ,x ∂ ∂x ,x2 ∂ ∂y ,x2 ∂ ∂x + 2xy ∂ ∂y } ,{ ∂ ∂y , ∂ ∂x ,y ∂ ∂y ,y ∂ ∂x ,x ∂ ∂x ,y2 ∂ ∂x , 2xy ∂ ∂x + y2 ∂ ∂y } corresponding to general q, q = 1, q = −2 and q = −12 . clearly the latter two cases did not come within the purview of the singularity analysis discussed above. for general q the algebra is a2 ⊕ a2 which can also be written as 2a2. (we make use of the mubarakzyanov classification scheme [5–8] (see also [9, 11, 13]) throughout this paper.) in the case of q = 1 we have the well-known kummer-schwarz equation with the algebra 2sl(2,r) or 2a3,8. for the values of q −2 and −12 the algebra is the same and is the maximal algebra for a third-order equation. for q = −2 the representation is the standard representation for y′′′ = 0. the representation for q = −12 has not been reported before. however, the latter result follows from an interchange of x and y. the fourth-order equation is the prototype for subsequent equations and we find the symmetries{ ∂ ∂y ,x ∂ ∂y ,y ∂ ∂y , ∂ ∂x ,x ∂ ∂x } ,{ ∂ ∂y ,x ∂ ∂y ,y ∂ ∂y , ∂ ∂x ,x ∂ ∂x + 1 2 y ∂ ∂y ,x2 ∂ ∂x + xy ∂ ∂y } ,{ ∂ ∂y ,x ∂ ∂y ,x2 ∂ ∂y ,x3 ∂ ∂y ,y ∂ ∂y , ∂ ∂x , x ∂ ∂x + 3 2 y ∂ ∂y ,x2 ∂ ∂x + 3xy ∂ ∂y } . the five-dimensional algebra is a2 ⊕s a3,3. the latter subalgebra is also known as d⊕s t2, i.e., dilations and translations in the plane. the six-dimensional algebra is a3,3 ⊕s sl(2,r). we note that these symmetries are the same as those of y′′ = 0 apart from the two noncartan symmetries typical of a linear second-order equation [3]. 468 vol. 57 no. 6/2017 further generalisations of the kummer-schwarz equation the eight-dimensional algebra is the maximal algebra for a forth-order scalar equation. this occurs for q = −3. then (2) is simply y(4) = 0 up to a multiplier. unlike in the case of the third-order equation there is no existence of an n + 3-dimensional algebra. this pattern of behaviour persists mutatis mutandis for higher-order equations. the general nth-order equation has three possible numbers of symmetries, namely n + 1, n + 2 and n + 4. the algebras are as delineated above. we note that the exceptional property of the third-order equation, the possession of a double sl(2,r) algebras, is indeed exceptional. 5. conclusion in [12] we considered a family of ordinary differential equations (n− 1)y(n−2)y(n) −n(y(n−1))2 = 0 as a natural generalisation of the well-known kummerschwarz equation 2y′y′′′ − 3(y′′)2 = 0. we reported the symmetries, singularity properties and solutions for the members of the family. these properties were remarkably robust throughout the whole family except for the kummer-schwarz equation itself which had the additional property of six lie point symmetries that makes it exceptional in the class of third-order equations. 
in this paper we have considered a variation on the kummer-schwarz equation and its natural generalisation to higher-order equations by the inclusion of a parameter q to give the nonlinear family (n + q−2)y(n−2)y(n) − (n + q−1)(y(n−1))2 = 0. there is no a priori constraint upon the value of q. apart from some special values noted in §3 the value of q does not influence the symmetry properties, i.e., in general the number of symmetries is the same independently of the value of q. however, when one considers the singularity analysis, q must necessarily be positive to enable the existence of a singularity, indeed a positive integer for the usual analysis to apply. the value of q affects only the leading-order term. independently of the value of q the resonances take the values −1 and 0. thereafter the value of q enters into the values of the resonances. acknowledgements r sinuvasan thanks the university grants commission for its support. pgll thanks professor km tamizhmani and the department of mathematics, university of pondicherry, for the provision of facilities whilst this work was undertaken. pgll also thanks the university of kwazulu-natal and the national research foundation of the republic of south africa for their continued support. any views expressed in this paper are not necessarily those of the two institutions. references [1] andriopoulos, k., leach, p. g. l.: an interpretation of the presence of both positive and negative nongeneric resonances in the singularity analysis. physics letters a, 359, 2006, p. 199-203. (doi.org/10.1016/j.physleta 2006.06.026) [2] feix, m. r., geronimi, c., leach, p. g. l., lemmer, r. l., bouquet, s.: on the singularity analysis of ordinary differential equations invariant under time translation and rescaling. journal of physics a: mathematical and general, 30(21), 1997, p. 7437-7461. (doi.org/10.1088/0305-4470/30/21/017) [3] hsu, l., kamran, n.: classification of second-order ordinary differential equations admitting lie groups of fibre-preserving point symmetries. proceedings of the london mathematical society, 58, 1989, p. 387-416. (doi.org/10.1112/plms/s3-58.2.387) [4] kummer, e. e.: de generali quadam æquatione differentiali tertii ordinis. journal der reine und angewandte mathematik, 100, 1887, p. 1-9. (reprinted from the programm des evangelischen königl und stadtgymnasiums in liegnitz for the year 1834) (doi.org/10.1515/crll.1887.100.1) [5] morozov, v. v., classification of six-dimensional nilpotent lie algebras. izvestia vysshikh uchebn zavendenĭı matematika, 5, 1958, p. 161-171. [6] mubarakzyanov, g. m.: on solvable lie algebras. izvestia vysshikh uchebn zavendenĭı matematika, 32, p. 114-123. [7] mubarakzyanov, g. m.: classification of real structures of five-dimensional lie algebras. izvestia vysshikh uchebn zavendenĭı matematika, 34, 1963, p. 99-106. [8] mubarakzyanov, g. m.: classification of solvable six-dimensional lie algebras with one nilpotent base element. izvestia vysshikh uchebn zavendenĭı matematika, 35, 1963, p. 104-116. [9] patera, j., sharp, r. t., winternitz, p., zassenhaus, h.: invariants of real low dimension lie algebras. journal of mathematical physics, 17(6), 1976, 986-994. (doi.org/10.1063/1.522992) [10] ramani, a., grammaticos, b., bountis, t.: the painlevé property and singularity analysis of integrable and nonintegrable systems. physics reports, 180, 1989, p. 159-245. (doi.org/10.1016/0370-1573(89)90024-0) [11] shabanskaya, a., thompson, g.: six-dimensional lie algebras with a five-dimensional nilradical. 
journal of lie theory, 23(2), 2013, p. 313-355. [12] sinuvasan, r., tamizhmani, k. m., leach, p. g. l.: algebraic and singularity properties of a class of generalisations of the kummer-schwarz equation. differential equations and dynamical systems, 2016, doi:10.1007/s12591-016-0327-5. [13] ŝnobl, l., winternitz, p.: classification and identification of lie algebras 33. american mathematical society. providence, r.i. 2014. [14] tabor, m.: chaos and integrability in nonlinear dynamics. new york: john wiley, 1989. (isbn:978-0-471-82728-3) 469 acta polytechnica 57(6):467–469, 2017 1 introduction 2 singularity analysis 3 solution of the general equation 4 symmetry properties 5 conclusion acknowledgements references acta polytechnica doi:10.14311/ap.2016.56.0193 acta polytechnica 56(3):193–201, 2016 © czech technical university in prague, 2016 available online at http://ojs.cvut.cz/ojs/index.php/ap on partial differential and difference equations with symmetries depending on arbitrary functions giorgio gubbiotti, decio levi∗, christian scimiterna dipartimento di matematica e fisica, università degli studi roma tre, e sezione infn di roma tre, via della vasca navale 84, 00146 roma (italy) ∗ corresponding author: decio.levi@roma3.infn.it abstract. in this note we present some ideas on when lie symmetries, both point and generalized, can depend on arbitrary functions. we show a few examples, both in partial differential and partial difference equations where this happens. moreover we show that the infinitesimal generators of generalized symmetries depending on arbitrary functions, both for continuous and discrete equations, effectively play the role of master symmetries. keywords: partial differential equation; partial difference equation; lie symmetries. 1. introduction in a seminal work of two century ago medolaghi [27], following sophus lie results on ordinary differential equations [24, 25], proposed the following work program for partial differential equations (pde’s): (1.) determine all different kinds of infinite groups of point transformations in 3 variables. (2.) for each of the obtained groups determine the invariant second order equations. in the framework of this research medolaghi got, among other equations, the liouville equation uxt(x,t) = eu(x,t). (1.1) the symmetry algebra of the liouville equation (1.1) is given by the vector fields x(f(x)) = f(x)∂x −fx(x)∂u, y (g(t)) = g(t)∂t −gt(t)∂u, (1.2) where f(x) and g(t) are arbitrary smooth functions of their argument and by fx(x) and gt(t) we mean their first derivatives. the commutation relations of the vector fields (1.2) are[ x(f),x(f̃) ] = x(ff̃x − f̃fx), [ y (g),y (g̃) ] = y (gg̃t − g̃gt), [ x(f),y (g) ] = 0. (1.3) the algebra (1.2), (1.3) is isomorphic to the direct sum of two virasoro algebras. we can find also s-integrable equations [5], i.e., equations integrable by the spectral transform, which have an infinite dimensional group of point symmetries. classical examples are the (2 + 1)-dimensional kadomtsevpetviashvili equation [7, 8, 29], the (2 + 1)-dimensional davey-stewartson equation [6], the (2 + 1)-dimensional toda [22], the three wave interaction in three dimensions [26]. in the example 5.7 of [28] one finds the point and generalized symmetries for the nonlinear first order wave equation ut = uux, u = u(x,t). (1.4) in evolutionary form they are given by the characteristic q = uxf ( x + tu,u,t + 1 ux , uxx u3x ) , (1.5) where the function f is an arbitrary function of its arguments. 
in particular the last term corresponds to generalized symmetries as it depends of the second derivative of the field. for the symmetries of hydrodynamictype partial differential equations see [11]. results by shabat and zhiber [37] show that also the liouville equation, being darboux integrable, has such kind of symmetries: q = (dx + ux)f(w,wx, . . . ,wk1x) + (dt + ut)g(w̄,w̄t, . . . , w̄k2t), (1.6) 193 http://dx.doi.org/10.14311/ap.2016.56.0193 http://ojs.cvut.cz/ojs/index.php/ap g. gubbiotti, d. levi, c. scimiterna acta polytechnica where the operators dx + ux and dt + ut are the laplace operators, k1 and k2 are two positive integer numbers, f and g are two arbitrary functions of their arguments, dx and dt stands for the total derivative with respect to the x and t variables and w = uxx − 12u 2 x and w̄ = utt − 1 2u 2 t are the lowest order integrals of the liouville equation in the x and t direction. here and in the following by wnx we mean the n times derivative of the function w(x,t) with respect to x. so the nonlinear wave equation (1.4) as well as the liouville equation (1.1) admit generalized symmetries depending on arbitrary functions. then medolaghi program could be extended to the case of generalized symmetries. we here present some preliminary ideas on a possible solution of this program. in the following section we will present some results on the construction of pde’s presenting generalized symmetries depending on arbitrary functions and relate them to darboux integrable equations. then in section 3 we present the counterpart of the previous results in the discrete case. at the end we present few conclusive remarks and conjectures. 2. factorizable differential operators, darboux integrable equations and symmetries depending on arbitrary functions. the result (1.5) can be in principle generalized to any differential equation of the first order as the equation for the symmetries, be them point or generalized ones, can be solved on the characteristics as it was in the case of the hopf equation (1.4). let consider, with no loss of generality, the example of a general first order autonomous equation in two independent variables, ut = f(u,ux), u = u(x,t), (2.1) where f is an arbitrary function of its arguments. the symmetries are given by their characteristics q(x,t,u, ux, . . . ,ukx) and their determining equations are given by dtq− [∂f ∂u q + ∂f ∂ux dxq ]∣∣∣∣ ut=f = 0. (2.2) equation (2.2) is a first order differential equation for q which can be solved on the characteristics and whose solutions provide k + 2 symmetry variables. then, as in the case of (1.5), the symmetries of (2.1) are given by arbitrary functions of the k + 2 symmetry variables. so any first order pde, linear or nonlinear, will have generically point and generalized symmetries depending on arbitrary functions of the symmetry variables. we can easily show an interesting consequence of the existence of generalized symmetries depending on arbitrary functions. for the sake of concreteness we consider the symmetries of (1.4). let us consider the subcase of (1.5) when qf = ux [ f1(x + tu) + f2 ( t + 1 ux ) + f3 (uxx u3x )] , (2.3) i.e., the hopf equation (1.4) has a symmetry generator of the form: x̂f = qf∂u (2.4) with the functions fi, i = 1, 2, 3 analytic in their argument. 
when f2 and f3 are zero then (2.4) is the infinitesimal generator of point symmetries depending on an arbitrary function f1; when f1 and f3 are zero then (2.4) is the infinitesimal generator of contact symmetries depending on an arbitrary function f2; when f1 and f2 are zero then (2.4) is the infinitesimal generator of generalized symmetries depending on an arbitrary function f3. if we take two of such generators, x̂f and x̂g, where f and g are two different functions of the same argument, what can we say of their commutator? it has been proved by bäcklund [17] that as soon as the characteristic q depends on derivatives of order higher than the first one, the symmetry group is infinite. the last symmetry group which can be finite is the one of the contact symmetries, if no arbitrary function is present. as was shown in the case of the liouville equation the presence of a symmetry generator of point symmetries depending on an arbitrary function provide an infinite dimensional lie algebra of point symmetries. let us carry out the the commutation of x̂f and x̂g, for f and g equal to f1 we get:[ x̂f1,x̂g1 ] = x̂f1g′1−f′1g1, (2.5) equal to f2 we get: [ x̂f2,x̂g2 ] = 0, (2.6) 194 vol. 56 no. 3/2016 on pde and p∆e with symmetries depending on arbitrary functions and equal to f3 we get: [ x̂f3,x̂g3 ] = [f′′3 g ′ 3 −f ′ 3g ′′ 3 ] (uxxxux − 3u2xx)2 u9x ∂u. (2.7) the algebra (2.5) turn out to be a virasoro algebra of lie point symmetries. the infinitesimal generators of the contact symmetries (2.6) commute but the infinitesimal generator depending on an arbitrary function of the generalized symmetry take the form of a master symmetry as the commutator of two such symmetries provide a symmetry of higher order (2.7). the existence of arbitrary functions of generalized symmetries is not limited to the case of first order differential equations. it can easily extended to higher order pde’s when the differential operator which define the equation is factorizable. few authors [18, 34] showed through the laplace cascade method that factorizable linear pde’s are darboux integrable if the cascade terminates. we can show that a factorizable pde admits as a subclass of symmetries the symmetries of its first order pde. as an example let us consider the case of second order factorizable pde’s. let us consider the second order autonomous partial differential equation for one dependent variable u in the two independent variables x and t e1 = utt + [ g(u,ux) −f(u,ux)ux ] uxt −f(u,ux)uxg(u,ux)uxx −f(u,ux)u [ ut −g(u,ux)ux ] = 0, (2.8) where f and g are two arbitrary functions of their arguments. defining the first order autonomous pde for u = u(x,t), e0 = ut −f(u,ux) = 0, (2.9) (2.8) is factorizable as: e1 = [ ∂t + g(u,ux)∂x ] e0 = 0. (2.10) the symmetries of e0 are given by the infinitesimal generator x̂ = q0(x,t,u,ux,ut,uxx, . . . )∂u (2.11) and the determining equation is written as dtq0 −f(u,ux)uq0 −f(u,ux)uxdxq0 ∣∣∣ e0=0 = 0, (2.12) where dtq0 and dxq0 are the coefficient of the prolongation of x̂ with respect to ut and ux. the determining equation for the symmetries of e1 = 0 of infinitesimal generator ŷ = q1(x,t,u,ux,ut,uxx, . . . )∂u (2.13) is given by pr ŷ e1 ∣∣∣ e1=0 = 0. (2.14) the explicit expression of (2.14) is long as it involves the application of the second prolongation of ŷ . so we do not write it down here. by direct calculation it is easy to show that (2.14) is satisfied by the solution of (2.12). 
this shows that the symmetries of the second order autonomous factorizable pde e1 = 0 contain those of e0 = 0. so e1 = 0 can have arbitrary function dependent generalized symmetries as e0 = 0 does. this proof can be extended to pde’s of any order and of any number of variables. the relation of this factorization to laplace cascade method and darboux integrability is still to be understood. 3. discrete equations with generalized symmetries depending on an arbitrary function. partial difference equations (p∆e’s) can have symmetries depending on arbitrary functions. the presence of arbitrary functions of point symmetries has been shown to appear, as in the continuous case [3, 11, 17, 31], when the p∆e is linearizable [21]. here we show on some examples that one can find p∆e’s which have arbitrary functions of generalized symmetries. when they appear the system is usually linearizable but a complete theory in this case is absent. we have some results on darboux integrable discrete equations [2, 9, 10, 32, 33, 35] which correspond to p∆e’s which have two distinct conserved quantities in the two different directions of the discrete plane. the complete classification of darboux integrable equations on the lattice is absent and only 195 g. gubbiotti, d. levi, c. scimiterna acta polytechnica some simple classes are worked out in the references mentioned above. darboux integrable equations turn out to be linearizable but the other way around may not be true. here in the following we presents three linearizable p∆e’s [13, 14] which have arbitrary functions of generalized symmetries. then, in a subsection, we present the case of the completely discrete liouville equations. the first two equations are quad graph equations which do not possess the tetrahedron property. they turn out to be linearazable and have generalized symmetries of any order [14]. then we consider an equation of the boll classification, th (ε) 1 , which is non autonomous, linearizable and has three-point generalized symmetries [4, 13]. the three equations are: (1.) the first equation belongs to the classification of compatible equations around the cube with no tetrahedron property presented by hietarinta in [14]: vm+1,nvm,n+1 + vm,nvm+1,n+1 = 0. (3.1) the three-point symmetries in the m-direction of (3.1) are given by the characteristic qm,n = vm,nf ( (−1)n vm+1,n vm,n , (−1)n vm,n vm−1,n ) . (3.2) as (3.1) is symmetric in the exchange of n and m the generalized symmetry (3.2) is valid also in the direction n with the role of n and m interchanged. we can easily find two conserved quantities w1 = (−1)n vm+1,n vm,n , w2 = (−1)m vm,n+1 vm,n (3.3) such that (t2 − 1)w1 = 0, t1hm,n = hm+1,n, (t1 − 1)w2 = 0, t2hm,n = hm,n+1. (3.4) then (3.1) is darboux integrable equation and (3.2) is just the three-point subcase of a general characteristic in the m direction qm,n = vm,nf ( (−1)nt−n1 vm+1,n vm,n , . . . , (−1)ntm1 vm+1,n vm,n ) , (3.5) with n and m two positive arbitrary integers such that the number of points involved in the symmetry is given by n + m + 2. a similar characteristic exists also in the n direction with the role of n and m interchanged. equation (3.1) is linearizable in three ways: • by the point transformation v0,0 .= ex0,0 , so that x0,0 = log v0,0 (without loss of generality, log can be always taken to stand for the principal value of the complex logarithm), is transformed into the linear equation x0,0 −x1,0 −x0,1 + x1,1 = iπ. 
• by the hopf–cole transformation v1,0/v0,0 .= w0,0 (v0,0 6= 0) into the ordinary difference equation w0,1 + w0,0 = 0. • by the hopf–cole transformation v0,1/v0,0 .= t0,0 (v0,0 6= 0) into the ordinary difference equation t1,0 + t0,0 = 0. (2.) a second equation which belongs to the classification of equations compatible around the cube with no tetrahedron property presented by hietarinta in [14] is: vm,n + vm+1,n + vm,n+1 + vm+1,n+1 + vm+1,nvm,n+1vm+1,n+1 + vm,n [ vm+1,nvm,n+1 + vm+1,nvm+1,n+1 + vm,n+1vm+1,n+1 ] = 0, (3.6) whose three-point symmetries in the m-direction are given by qm,n = (−1)n(1 −v2m,n)k (( (1 −vm,n)(1 −vm−1,n) (1 + vm,n)(1 + vm−1,n) )(−1)n , ( (1 −vm,n)(1 −vm+1,n) (1 + vm,n)(1 + vm+1,n) )(−1)n) . (3.7) equation (3.6) admits a fourfold discrete symmetry given by v0,0 → v (ε) 0,0 .= (1 + ε)v0,0 + 1 −ε (1 −ε)v0,0 + 1 + ε , (3.8) 196 vol. 56 no. 3/2016 on pde and p∆e with symmetries depending on arbitrary functions where ε is one of the four quartic roots of unity, ε = (±i,±1). equation (3.6) is symmetric in the exchange of n and m and consequently the generalized symmetry (3.7) is valid also in the direction n interchanging the role of n and m. it is not contained in the lists of darboux integrable discrete equations [9] and [32], however we can prove that it is darboux integrable as it admits the following integrals which satisfy (3.4): w1 = ( (1 −vm,n)(1 −vm+1,n) (1 + vm,n)(1 + vm+1,n) )(−1)n , w2 = ( (1 −vm,n)(1 −vm,n+1) (1 + vm,n)(1 + vm,n+1) )(−1)m . (3.9) the symmetry characteristic (3.7) is nothing but the reduction to three-point of a general case depending on n + m + 2 points with n and m arbitrary positive integers, qm,n = (−1)n(1 −v2m,n)k ( t−n1 ( (1 −vm,n)(1 −vm−1,n) (1 + vm,n)(1 + vm−1,n) )(−1)n , . . . ,tm1 ( (1 −vm,n)(1 −vm−1,n) (1 + vm,n)(1 + vm−1,n) )(−1)n) . (3.10) a similar characteristic exists also in the n direction with the role of n and m interchanged. equation (3.6) is linearizable by the point transformation: vm,n .= 1 + exm,n 1 −exm,n (3.11) into the linear equation xm,n + xm+1,n + xm,n+1 + xm+1,n+1 = 2izπ, (3.12) where z = −1, 0, 1, 2. the indeterminacy of the inhomogeneous term of (3.12) takes into account the fourfold discrete symmetry (3.8) of (3.6). it is worth while to notice that inverting (3.11) we get x0,0 = log v0,0−1 v0,0+1 , where without loss of generality, the function log can always be taken as the principal value of the complex logarithm [30]. other two linearizations, similar to those showed for (3.1), are also present here. (3.) th (ε) 1 is: (xm,n −xm+1,n)(xm,n+1 −xm+1,n+1) −ε2α2 ( f (+)n xm,n+1xm+1,n+1 + f (−) n xm,nxm+1,n ) −α2 = 0. (3.13) where f (±)n = 1±(−1) n 2 and ε and α2 are arbitrary constants. equation (3.13) is darboux integrable, as we can find two integrals in the m and n directions which satisfy (3.4). the m-integral being given by w2 .= a2f (+)n s + b2f (−) n (t2 + 1)u, s .= xm,n+1 −xm,n−1 1 + ε2xm,n+1xm,n−1 , u .= xm,n −xm,n−1, (3.14) where a2 and b2 are two arbitrary constants. the n direction integral is different, as th (ε) 1 is not symmetric in the exchange of n and m, and is given by w1 .= f (+)n α2 v + f (−)n t, t .= xm,n −xm−1,n 1 + ε2xm−1,nxm,n , v .= xm,n −xm−1,n. (3.15) equations (3.14) and (3.15) are effectively four integrals as ai and bi, i = 1, 2 are independent constants. 
in [12] we constructed its three-point generalized symmetries along the direction m: qm,n = f (+)n ( α2(v2 + ε2α22) (v̄ −v)(v̄ + v) bm ( α2 v̄ ) − α2(v̄2 + ε2α22) (v̄ −v)(v̄ + v) bm−1 ( α2 v ) + ( xm,n − (v̄ 2+ε2α22)v (v̄−v)(v̄+v) ) ω + γn ) + f (−)n ( t̄2t2 (t̄− t)(t̄ + t) (bm(t̄) −bm−1(t)) − t̄2t (t̄− t)(t̄ + t) ω + δn ) (1 + ε2x2m,n), (3.16) where bm(y), γn and δn are generic functions of their arguments, ω is an arbitrary constant and v̄ = t1v and t̄ = t1t. let us note that any free function and free parameter may eventually depend on α2 and ε. it is possible to demonstrate that, as long as ε 6= 0, no n−independent reduction of the above symmetry exists. if in (3.16) the functions bm−1 appearing in it will depend on the variables t p 1 w1, . . . ,t p+n 1 w1 197 g. gubbiotti, d. levi, c. scimiterna acta polytechnica and consequently bm on t p+1 1 w1, . . . ,t p+1+n 1 w1, one obtains the generalized symmetries for (3.13) in the direction m at all order. the three-point generalized symmetry along the direction n is different as the equation is not symmetric. it is: qm,n = f (+)n (bn(s) + κn) + f (−) n (1 + ε 2x2m,n)(cn(ū + u) + λn), (3.17) where bn(y) and cn(y) are arbitrary functions of their argument and of the lattice variable n, ū = t2u and κn and λn are arbitrary functions of the lattice variable n. due to the complexity of the three-point generalized symmetries we are not able, in this case, to evince the laplace operators and thus write the generalized symmetries depending on an arbitrary function for any number of points. as in the case of pde’s presented in the previous section also here the generalized symmetry depending on an arbitrary function play the role of a master symmetry. let us consider , as an example, the commutation of two generalized symmetries of (3.1) of infinitesimal generators x̂fm = vmfm∂vm and x̂gm = vmgm∂vm characterized by the arbitrary functions fm = f ( (−1)n vm+1 vm ) and gm = g ( (−1)n vm+1 vm ) corresponding to (3.5) with n = m = 0. we have: [ x̂fm,x̂gm ] = (−1)nvm+1hm∂vm. (3.18) where hm = [∆fmg′m − ∆gmf ′ m]. (3.19) in (3.19), f ′m means the derivative of fm with respect to its argument and the operator ∆ is such that ∆fm = fm+1 − fm. the function hm depends on more points than the functions fm and gm and so the generator x̂fm plays the role of a master symmetry. 3.1. the discrete algebraic liouville equations among the many different completely discrete liouville equations which go in the continuous limit in the algebraic liouville equation vvxy −vxvy = v3, (3.20) obtained from (1.1) by setting v = eu, let’s mention the following four: • the tzitzeica-liouville equation, given in [1]: hm,nhm+1,n+1(hm+1,n − 1)(hm,n+1 − 1) − (hm,n − 1)(hm+1,n+1 − 1) = 0; • the potential hirota-liouville equation, given in [15, 33]: vm,nvm+1,n+1 −vm+1,nvm,n+1 + 1 = 0; • the hirota-liouville equation, given in [16, 33]: um,num+1,n+1 − (um+1,n − 1)(um,n+1 − 1) = 0; • the adler-startsev liouville equation, given in [2]: tm,ntm+1,n+1 ( 1 + 1 tm+1,n )( 1 + 1 tm,n+1 ) − 1 = 0. (3.21) these p∆e’s can be transformed one into the other by the following transformations: um,n = hm,n hm,n − 1 , hm,n 6= 1, um,n = vm+1,nvm,n+1, um,n = − 1 tm,n . (3.22) the generalized symmetries of (3.21) along the direction m are given for all p ∈z and n ∈n0 by dtm,n dε = −(1 + tm,n)(t1 − 1) ( 1 + tm,n(1 + tm−1,n) tm−1,n )−1 tm,nf(t p 1 w1, . . . ,t p+n 1 w1), (3.23) where ε is the group parameter, f(x,y, . . . 
,z) is an arbitrary function of its arguments and w1 is the darboux integral of the liouville equation (3.21) along the direction n, given in [2] by w1 = ( 1 + tm,n(1 + tm−1,n) tm−1,n )( 1 + tm,n tm+1,n(1 + tm,n) ) , 198 vol. 56 no. 3/2016 on pde and p∆e with symmetries depending on arbitrary functions satisfying (3.4). equation (3.23) is the equivalent of formula (1.6) introduced by zhiber and shabat for the continuous liouville equation. choosing p = −1 and n = 1 in (3.23) we obtain the most general generalized symmetry depending at most on the five points tm−2,n, tm−1,n, tm,n, tm+1,n and tm+2,n dtm,n dε = −(1 + tm,n)(t1 − 1) ( 1 + tm,n(1 + tm−1,n) tm−1,n )−1 tm,ng(t−11 w1,w1). (3.24) if ∂xg(x,y) 6= 0 and ∂yg(x,y) 6= 0, then we get a five-point symmetry; if ∂xg(x,y) = 0 and ∂yg(x,y) 6= 0, then we get a four-point symmetry depending on the asymmetric set tm−1,n, tm,n, tm+1,n and tm+2,n; if ∂xg(x,y) 6= 0 and ∂yg(x,y) = 0, then we get a four-point symmetry depending on the asymmetric set tm−2,n, tm−1,n, tm,n and tm+1,n; finally if g = 1 we get the three-point symmetry dtm,n dε = −(1 + tm,n)(t1 − 1) ( 1 + tm,n(1 + tm−1,n) tm−1,n )−1 tm,n, (3.25) which doesn’t depend on any arbitrary function. as it is evident from (3.23) there is no way that we can get a lie point symmetry in agreement with the results presented in [19, 20]. the lowest possible symmetry is a generalized symmetry depending on three points (3.25). equation (3.21) is symmetric in the exchange of the m and n indices. so the symmetries in the n directions are trivially given by (3.23) with t1 substituted by t2 and the shifts in m substituted by shifts in n. 4. conclusive remarks in this note we presented a set of results partially distributed in a series of articles of different authors concerning the presence of symmetries depending on arbitrary functions both in the continuous and in the discrete setting. few of the words associated to these systems are darboux integrable systems, linearizable systems, factorizable differential operators but not all necessarily proved to be connected to the presence of symmetries depending on arbitrary functions. darboux integrable systems in two independent variables are systems, studied primarily by g. darboux, characterized by the presence of at least a couple of conserved quantities in the two independent variables. as far as it is known in the literature darboux integrable systems are linearizable and the primary example is the liouville equation (1.1) which has also arbitrary function dependent symmetries, both point and generalized [38]. here we showed that generically first order pde’s have symmetries depending on arbitrary functions as the symmetry determining equations are characterized by just first order pde’s which can be solved on the characteristics in term of arbitrary functions of the symmetry variables. these same symmetries can be found in pde’s of higher order when the differential operator is factorized in the product of lower order ones which, at the bottom, is just a first order one. tsarev in [34] correlate darboux integrable systems with factorizable differential operators. the situation is less clear in the discrete setting. the classification of darboux integrable systems is not complete for p∆e’s even on the square graph. few results [2, 35] are known on factorizable difference operators in the framework of darboux integrable systems and symmetries. 
nothing has been done up to now to characterize partial difference equations whose symmetries are written in term of arbitrary functions of symmetry variables. moreover, let us notice that the generalized symmetries we obtain for darboux integrable equations as (3.5), (3.10) and (3.23) do not define, in general, s-integrable differential difference equations. indeed these symmetries do not always satisfy the necessary condition for the s-integrability, that the highest order shift in the lattice variable is the opposite of the lowest one [23, 36]. as a result of this note we can conjecture the following theorem: theorem 1. necessary and sufficient conditions for a pde and a p∆e to have symmetries, possibly point but surely generalized, depending on arbitrary functions is that the system be darboux integrable. some of the results presented here are new and never presented anywhere. among them let us mention the structure of the lie algebra of the symmetries of the hopf equation, the fact that both for pde’s and p∆e’s generalized symmetries characterized by arbitrary functions provide master symmetries. moreover the calculations of the generalized symmetries, the darboux integrals and the linearizing transformation for the two nonlinear quad graph equations introduced by hietarinta are new here as well as the generalized symmetries for the completely discrete liouville equation and the darboux integrals for th (ε) 1 . acknowledgements cs and dl have been partly supported by the italian ministry of education and research, 2010 prin continuous and discrete nonlinear integrable evolutions: from water waves to symplectic maps. gg and dl are supported by infn is-csn4 mathematical methods of nonlinear physics. 199 g. gubbiotti, d. levi, c. scimiterna acta polytechnica references [1] v. e. adler on a discrete analog of the tzitzeica equation, preprint arxiv:1103.5139v1, march 26th, 2011. [2] v. e. adler and s. ya. startsev, discrete analogues of the liouville equation, theor. math. phys., 121 (1999) 1484–1495, doi:10.1007/bf02557219 [3] g. bluman and s. kumey, symmetries and differential equations, springer–verlag, new york, 1989, doi:10.1007/978-1-4757-4307-4 [4] r. boll, classification and lagrangian structure od 3d consistent quad-equations, ph. d. dissertation (2012). [5] f. calogero and w. eckhaus, nonlinear evolution equations, rescalings, model pdes and their integrability. i & ii. inv. probl. 3 (1987) 229–262, doi:10.1088/0266-5611/3/2/008 & 4 (1988) 11–33, doi:10.1088/0266-5611/4/1/005 [6] b. champagne and p. winternitz, on the infinite-dimensional symmetry group of the davey-stewartson equantions, j. math. phys, 29 (1988) 1–8, doi:10.1063/1.528173 [7] d. david, n. kamran, d. levi and p. winternitz, subalgebras of loop algebras and symmetries of the kadomtsev-petviashvili equation, phys. rev. lett. 55 (1985) 2111–2113, doi:10.1103/physrevlett.55.2111 [8] d. david, n. kamran, d. levi and p. winternitz, symmetry reduction for the kadomtsev–petviashvili equation using a loop algebra, j. math. phys. 27 (1986) 1225–1237, doi:10.1063/1.527129 [9] r. n. garifullin and r. i. yamilov, generalized symmetry classification of discrete equations of a class depending on twelve parameters, j. phys. a: math. theor. 45 ( 2012) 345205 (23 pp.), doi:10.1088/1751-8113/45/34/345205 [10] r. n. garifullin, r. i. yamilov, examples of darboux integrable discrete equations possessing first integrals of an arbitrarily high minimal order, ufimsk. mat. zh., 4 (2012), 174–180. [11] a. m. grundland, m. b. sheftel and p. 
winternitz, invariant solutions of hydrodynamic-type equations, j. phys. a: math. gen. 33 (2000) 8193-8215, doi:10.1088/0305-4470/33/46/304 [12] g. gubbiotti, c. scimiterna and d. levi, linearizability and fake lax pair for a consistent around the cube nonlinear non-autonomous quad-graph equation, arxiv:1510.01527, submitted to theor. math. phys. . [13] g. gubbiotti, c. scimiterna and d. levi, the non autonomous ydkn equation and generalized symmetries of boll equations, arxiv:1510.07175, submitted to j. phys. a. [14] j. hietarinta, searching for cac-maps, j. nonlinear math. phys. 12 suppl. 2 (2005) 223–230, doi:10.2991/jnmp.2005.12.s2.16 [15] r. hirota, nonlinear partial difference equations. v. nonlinear equations reducible to linear equations, j. phys. soc. japan 46 (1979) 312-319, doi:10.1143/jpsj.46.312 [16] r. hirota, discrete two-dimensional toda molecule equation, j. phys. soc. japan 56 (1987) 4285-4288, doi:10.1143/jpsj.56.4285 [17] n. h. ibragimov, transformation groups applied to mathematical physics, nauka, moscow, 1983 (english translation by reidel, dordrecht, 1985), doi:10.1007/978-94-009-5243-0 [18] m. juráš and i. m. anderson, generalized laplace invariants and the method of darboux, duke math. j. 89 (1997) 351–375, doi:10.1215/s0012-7094-97-08916-x [19] d. levi, l. martina and p. winternitz, lie-point symmetries of the discrete liouville equation, j. phys. a: math. theor. 48 (2015) 025204 (18 pp.), doi:10.1088/1751-8113/48/2/025204 arxiv:1407.4043 [20] d. levi, l. martina and p. winternitz, structure preserving discretizations of the liouville equation and their numerical tests, sigma 11 (2015) 080 (20 pp.), doi:10.3842/sigma.2015.080 arxiv:1504.01953v2 [21] d. levi and c. scimiterna, linearization through symmetries for discrete equations, j. phys. a: math. theor. 46 (2013) 325204 (18 pp), doi:10.1088/1751-8113/46/32/325204 [22] d. levi and p. winternitz, continuous symmetries of discrete equations, phys. lett. a 152 (1991) 335–338, doi:10.1016/0375-9601(91)90733-o [23] d. levi and r. i. yamilov, conditions for the existence of higher symmetries of evolutionary equations on the lattice, j. math. phys. 38 (1997) 6648–6674, doi:10.1063/1.532230 [24] s. lie, über gruppen von transformationen, gottinger nachrichten 1874 (1874) 529–542. [25] s. lie, articles published mostly in the norwegian archive and reproduced in the essential part in: allgemeine theorie der partiellen differentialgleichungen erster ordnung math. ann. 9 (1875) 245–296, doi:10.1007/bf01443377; theorie der transformationsgruppen i, math. ann. 16 (1880), 441–528, doi:10.1007/bf01446218; allgemeine untersuchungen über differentialgleichungen, die eine continuirliche, endliche gruppe gestatten, math. ann. 25 (1885) 71–151, doi:10.1007/bf01446421; classification und integration von gewöhnlichen differentialgleichungen zwischen xy, die eine gruppe von transformationen gestatten, math. ann. 32 (1888) 213–281, doi:10.1007/bf01444068 [26] l. martina and p. winternitz, analysis and applications of the symmetry group of the multidimensional three wave resonant interaction problem, ann. 
physics 196 (1989) 231–277, doi:10.1016/0003-4916(89)90178-4 200 http://arxiv.org/abs/1103.5139v1 http://dx.doi.org/10.1007/bf02557219 http://dx.doi.org/10.1007/978-1-4757-4307-4 http://dx.doi.org/10.1088/0266-5611/3/2/008 http://dx.doi.org/10.1088/0266-5611/4/1/005 http://dx.doi.org/10.1063/1.528173 http://dx.doi.org/10.1103/physrevlett.55.2111 http://dx.doi.org/10.1063/1.527129 http://dx.doi.org/10.1088/1751-8113/45/34/345205 http://dx.doi.org/10.1088/0305-4470/33/46/304 http://arxiv.org/abs/1510.01527 http://arxiv.org/abs/1510.07175 http://dx.doi.org/10.2991/jnmp.2005.12.s2.16 http://dx.doi.org/10.1143/jpsj.46.312 http://dx.doi.org/10.1143/jpsj.56.4285 http://dx.doi.org/10.1007/978-94-009-5243-0 http://dx.doi.org/10.1215/s0012-7094-97-08916-x http://dx.doi.org/10.1088/1751-8113/48/2/025204 http://arxiv.org/abs/1407.4043 http://dx.doi.org/10.3842/sigma.2015.080 http://arxiv.org/abs/1504.01953v2 http://dx.doi.org/10.1088/1751-8113/46/32/325204 http://dx.doi.org/10.1016/0375-9601(91)90733-o http://dx.doi.org/10.1063/1.532230 http://dx.doi.org/10.1007/bf01443377 http://dx.doi.org/10.1007/bf01446218 http://dx.doi.org/10.1007/bf01446421 http://dx.doi.org/10.1007/bf01444068 http://dx.doi.org/10.1016/0003-4916(89)90178-4 vol. 56 no. 3/2016 on pde and p∆e with symmetries depending on arbitrary functions [27] p. medolaghi, classificazione delle equazioni alle derivate parziali del secondo ordine, che ammettono un gruppo infinito di trasformazioni puntuali, ann. mat. pura appl. 1 (1898) 229–263, doi:10.1007/bf02419192 [28] p. j. olver, applications of lie groups to differeial equations, (second edition), springer, new york, 1993, doi:10.1007/978-1-4612-4350-2 [29] f. schwarz, symmetries of the two-dimensional korteweg-devries equation, j. phys. soc. jpn. 51 (1982) 2387–2389, doi:10.1143/jpsj.51.2387 [30] c. scimiterna and d. levi, classification of discrete equations linearizable by point transformation on a square lattice, front. math. china 8 (2013) 1067–1076, doi:10.1007/s11464-013-0280-3 [31] m. b. sheftel, symmetry group analysis and invariant solutions of hydrodynamic-type systems, int. j. math. math. sci. 2004 (2004), 487–534, doi:10.1155/s0161171204206147 [32] s. ya. startsev, darboux integrable discrete equations possessing an autonomous first-order integral, j. phys. a: math. theor. 47 (2014) 105204 (16pp), doi:10.1088/1751-8113/47/10/105204 [33] s. ya. startsev, non-point invertible transformations and integrability of partial difference equations, sigma 10 (2014) 066 (13 pp.), doi:10.3842/sigma.2014.066 [34] s. p. tsarev, factoring linear partial differential operators and the darboux method for integrating nonlinear partial differential equations, teoret. mat. fiz. 122 (2000), 144–160; english transl., theoret. math. phys. 122 (2000), 121–133, doi:10.1007/bf02551175 [35] v. l. vereshchagin, darboux-integrable discrete systems, theor. math. phys., 156 (2008) 1142–1153, doi:10.1007/s11232-008-0084-x [36] r. i. yamilov, symmetries as integrability criteria for differential difference equations, j. phys. a: math. gen. 39 (2006) r541–r623, doi:10.1088/0305-4470/39/45/r01 [37] a. v. zhiber and a. b. shabat, klein–gordon equations with a nontrivial group, dokl. akad. nauk sssr 247 (1979), 1103–1107; english transl., soviet phys. dokl. 24 (1979), 607–609. [38] a. v. zhiber and v. v. sokolov, exactly integrable hyperbolic equations of liouville type, uspekhi mat. nauk 56 (2001) 63–106; english transl., russian math. 
surveys 56 (2001) 61–101, doi:10.1070/rm2001v056n01abeh000357 201 http://dx.doi.org/10.1007/bf02419192 http://dx.doi.org/10.1007/978-1-4612-4350-2 http://dx.doi.org/10.1143/jpsj.51.2387 http://dx.doi.org/10.1007/s11464-013-0280-3 http://dx.doi.org/10.1155/s0161171204206147 http://dx.doi.org/10.1088/1751-8113/47/10/105204 http://dx.doi.org/10.3842/sigma.2014.066 http://dx.doi.org/10.1007/bf02551175 http://dx.doi.org/10.1007/s11232-008-0084-x http://dx.doi.org/10.1088/0305-4470/39/45/r01 http://dx.doi.org/10.1070/rm2001v056n01abeh000357 acta polytechnica 56(3):193–201, 2016 1 introduction 2 factorizable differential operators, darboux integrable equations and symmetries depending on arbitrary functions. 3 discrete equations with generalized symmetries depending on an arbitrary function. 3.1 the discrete algebraic liouville equations 4 conclusive remarks acknowledgements references ap03_5.vp 1 introduction internal model control (imc) is a well-known control design method in the field of control engineering, see [1, 2]. the strategy of imc design is based on knowledge of the system model that is finally involved in the control loop. since the structure of the resulting control algorithm arises from the structure of the system model, the imc controller may acquire a quite complicated form. that is why the obtained imc controller is often substituted by a classical pid controller, which approximates its features. the motivation for using the pid controller as the final control algorithm is given by the convention (pid algorithm is available in most programmable controllers). thus, in this approach, the resulting imc controller is utilised only in designing the parameters of the pid controller. on the one hand, this approach provides easier implementation of the control. however, on the other hand, the dynamics of the final control loop with a pid controller loses some of the merits of imc design as the consequence of simplifying the controller. nevertheless, thanks to progress in the hardware and software equipment of programmable controllers, it is not beyond the scope of a reasonable effort to implement (to program) the algorithm resulting from the imc design on a programmable controller. for this purpose, the task is to search for a model of the lowest possible order to obtain an easily applicable control algorithm. however, use of the classical approach in modelling (based on describing the system dynamics by linear differential equations) often does not allow us to make a satisfactory approximation of the system dynamics using low order models. particularly if the system involves time lags, distributed parameters or transport phenomena (note that such phenomena can be encountered, e.g., in heat transfer, chemical and biological processes) the order of the model resulting from the classical approach is as a rule high. therefore, in such applications, it is advantageous to use the anisochronic modelling approach based on involving time delays in the linear model. in this paper, we first introduce the first order anisochronic model able to fit the dynamics of a broad class of systems. then, analysing the closed loop spectrum, we investigate the features of the closed loop dynamics with the imc controller derived from the anisochronic plant model. 
2 first order anisochronic model in many industrial applications, the following first order model with the input delay can be encountered � � � � � � � � g s y s u s k s ts � � � � exp � 1 (1) where k is static gain coefficient, t is time constant and � is input time delay, see, e.g., [3]. in fact, according to [4], see also [5], the model is adequate for approximating most industrial processes with well-damped dynamics. on the other hand, since there is only one parameter in the denominator of (1), supposedly, higher order system dynamics (damped as well as oscillatory) cannot be satisfactorily approximated by this model. in order to further extend the applicability of the first order anisochronic model, let another delay � be introduced into the model. the anisochronic model then acquires the following form � � � � � � � � � � g s y s u s k s ts s � � � � � exp exp � � (2) see [6]. an analogous but second order model has been used in [7]. the enhanced universality of model (2) is due to the possibility to assign its dominant pole couple arbitrarily in the 54 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 43 no. 5/2003 anisochronic internal model control design t. vyhlídal, p. zítek the features of internal model control (imc) design based on the first order anisochronic model are investigated in this paper. the structure of the anisochronic model is chosen in order to fit both the dominant pole and the dominant zero of the system dynamics being approximated. thanks to its fairly plain structure, the model is suitable for use in imc design. however, use of the anisochronic model in imc design may result in so-called neutral dynamics of the closed loop. this phenomenon is studied in this paper via analysing the spectra of the closed loop system. keywords: internal model control, time delay system, dynamics analysis, system spectrum. -3 -2.8 -2.6 -2.4 -2.2 -2 -1.8 -1.6 -1.4 -1.2 -1 -0.8 -0.6 -0.4 -0.2 0 0.2 0.4 -0.4 -0.2 0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2 re(s ) im (s ) 00.10.20.3 0.4 0.5 0.6 0.7 0.8 1 0.9 0.37 0.35 0.38 0.45 1.25 2.5 5 10 2.57 2.21 1.03 0.51 1.76 2.32 1.37 0.24 exp(-1) �/2 � � |s 1 | 2.719.3 � fig. 1: the trajectory of the dominant couple of roots of the characteristic function � � � �m s s� � �1 exp � with respect to �, � � �� , s1 2, � �� �j (the time constant t is considered here as a time-scale unit) left half of the complex plane. the sposition of the dominant pole of (2) with respect to the values of t and � can be estimated from fig. 1. the crucial values of the ratio � t are following: � �� t � �exp 1 for which the model has a double real pole and � �t � 2 for which the dominant couple lays on the imaginary axis, which is the last value for which model (2) is stable. for � �� t � �exp 1 the poles of the dominant couple are real, while for � �� t �exp 1 they are complex conjugate. for more details about the distribution of the poles of (2), see [6]. the dynamics of the introduced first order anisochronic models (1) and (2) are determined by the system poles only. involving a dynamic term into the numerator of the model transfer function, i.e., introducing the zeros into the model dynamics, further extends the applicability of the first order anisochronic model. 
one possible way of involving a zero-effect in the system dynamics consists in using the model � � � �� � � � � � g s k ls s s ts s � � � � � � exp exp exp � � � , (3) where the ratio � l determines the distribution of the roots of � �ls s� � �exp � 0 (the positions of the zeros of (3)) in the same way as the ratio � t determines the distribution of the poles. using (3) is effective only in case of the requirement to involve the zeros that are located on the left half of the complex plane. model (3) cannot be used to approximate the dynamics with a positive real zero because the equation � �ls s� � �exp � 0 does not have positive real roots for any � l, provided that l 0 0, � . if the zero is positive real and single given by �1 l, it can be added to the model dynamics simply by using � �ls 1 instead of � �ls s� �exp � (using � �� � �ls sexp � does not bring about considerable merits because its dominant root is positive real for any � 0). a more difficult task is to involve dominant complex zeros with positive real parts. theoretically, it is possible to use the term � �ls s� �exp � , but as � � � �re im . 0 5 (where m is a dominant zero of the system) the ratio � l becomes very large, which is not convenient from the numerical point of view. another problem arising from the use of model (3) is that the degree of the numerator is equal to the degree of the denominator, i.e., there is a direct input-output link in the model. to avoid such an inconvenient model structure, instead of the first order anisochronic model (3), the following second order model may be used � � � �� � � � � � � �� � g s k ls s s t ts s � � � � � � � exp exp exp � � �1 1 (4) with the additional mode with time constant t1, see [7]. an alternative way of involving the zeros into the first order anisochronic model that does not have the drawback of equal degrees of the numerator and the denominator consists in using the following model � � � �� � � � � � g s k a s s ts s � � � � � � 1 exp exp exp � � � (5) in which instead of the quasipolynomial, the exponential polynomial is used in the numerator, see [8]. the zeros of system (5) are the roots of the following equation � � � �n s a s� � � �1 0exp � (6) considering variable s as a complex variable, i.e., s � �� �j , the complex roots of (6) are the solutions of the equations � �� � � � � �re exp cosn s a� � � �1 0�� �� (7) � �� � � � � �im exp sinn s a� � ��� �� 0 (8) which result from splitting equation (6) into real and imaginary parts. separating the exponential term from (7) � � � � exp cos � ��� �� 1 a (9) and substituting (9) into (8), the following expression results � �tan �� � 0 (10) since (10) is satisfied for � � �� k and the right-hand side of (9) has to be positive to obtain real �, the roots s � �� �j of (6), i.e., the zeros of (5) are given by � � � � 1 1 ln a (11) � � � � � � � � � � � � � 2 0 2 1 0 k a k a if if , k 0, 1, 2, … . (12) thus, by means of parameters a and �, we can assign the horizontal chain of the roots arbitrarily in the complex plane. prescribing the real parts of the roots � yields � �a � exp �� (13) and parameter � results from � � � � 2 p , (14) where �p prescribes the spacing of the imaginary parts of the roots. if a is chosen positive, equation (6) has one real root. the closest complex root (of the horizontal chain) to the real one has an imaginary part equal to �p. if a is chosen negative, equation (6) does not have a real root. 
in this case, the roots of the chain closest to the real axis have imaginary parts equal to $\pm\omega_p/2$. to sum up, if a > 0, the roots are given as $s_1=\beta$ and $s_{2k,2k+1}=\beta\pm\mathrm{j}k\omega_p$, k = 1, 2, …, and if a < 0, the roots are given as $s_{2k+1,2k+2}=\beta\pm\mathrm{j}(2k+1)\omega_p/2$, k = 0, 1, … . thus, by means of the exponential polynomial (6), we can assign either one real dominant zero or a pair of complex conjugate dominant zeros.

3 imc design based on a universal first order anisochronic model

in [7, 9, 6], the features of internal model control (imc) design based on low order models with time delays are studied. the scheme of imc is shown in fig. 2, where $R^*(s)$ is the controller, P(s) denotes the dynamics of the plant being controlled, and G(s) is the model of the plant.

fig. 2: internal model control (imc) scheme

using the universal first order anisochronic model (5), and provided that the system being approximated does not have positive zeros, the transfer function of the controller is given by

$R^*(s) = G_i^{-1}(s)\,F(s) = \frac{Ts+\exp(-\vartheta s)}{k\,(1-a\exp(-\eta s))\,(T_f s+\exp(-\vartheta_f s))}$,  (15)

where $G_i(s)$ is the invertible part of model (5) (the only uninvertible part of model (5) is the term corresponding to the input delay, which is separated) and F(s) is a first order anisochronic filter with F(0) = 1. the transfer function of the inner control loop (which is the controller transfer function if the classical control loop is considered) is given by

$R(s) = \frac{R^*(s)}{1-R^*(s)\,G(s)} = \frac{Ts+\exp(-\vartheta s)}{k\,(1-a\exp(-\eta s))\,(T_f s+\exp(-\vartheta_f s)-\exp(-\tau s))}$,  (16)

see its block diagram in fig. 3. if G(s) = P(s), the closed loop dynamics are given by the first order anisochronic model

$G_{wy}(s) = \frac{\exp(-\tau s)}{T_f s+\exp(-\vartheta_f s)}$  (17)

and the dynamics of the closed loop can be chosen by the ratio of the parameters $T_f$ and $\vartheta_f$. however, the model approximates only a part of the dynamics as a rule. therefore, let us study the closed loop dynamics for the case G(s) ≠ P(s). let the filter be F(s) = 1/f(s), with $f(s)=T_f s+\exp(-\vartheta_f s)$, the model $G(s)=k\,n(s)\exp(-\tau s)/m(s)$, with $n(s)=1-a\exp(-\eta s)$ and $m(s)=Ts+\exp(-\vartheta s)$, and the true plant model P(s) = q(s)/s(s). then the controller transfer function is given by

$R(s) = \frac{m(s)}{k\,n(s)\,(f(s)-\exp(-\tau s))}$  (18)

and the transfer function of the closed loop is the following:

$G_{wy}(s) = \frac{m(s)\,q(s)}{k\,n(s)\,s(s)\,(f(s)-\exp(-\tau s))+m(s)\,q(s)}.$  (19)

thanks to n(s), there are obviously delayed terms of the highest derivative of y(t) in the model of the closed loop. thus the closed loop is of neutral dynamics, [10]. in general, this fact rather restricts the class of models that may be used in the described anisochronic imc design, see [13]. this is due to the fact that the asymptotic features of the spectrum of the poles of (19) are determined by the roots of the exponential polynomial n(s). the problem is that the spectrum of n(s) may be very sensitive to changes in the delays involved in n(s), see [8]. however, if n(s) involves only one delay, as in (6), the distribution of the root spectrum of n(s), and also the distribution of the pole spectrum of (19), are rather insensitive to small changes in the delay.
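formulas (11)–(14) suggest a direct numerical check: choosing a target real part β and a spacing ω_p, the parameters a and η follow from (13) and (14), and the exponential polynomial (6) should then vanish along the whole chain. a minimal sketch, with hypothetical design values for β and ω_p:

```python
import cmath, math

# assign a horizontal chain of zeros with real part beta and spacing omega_p
beta, omega_p = -0.05, 2 * math.pi / 10.0   # hypothetical design values
eta = 2 * math.pi / omega_p                 # eq. (14)
a = math.exp(beta * eta)                    # eq. (13); a > 0 -> one real zero

n = lambda s: 1 - a * cmath.exp(-eta * s)   # exponential polynomial (6)

# the chain s = beta + j*k*omega_p (k = 0, 1, 2, ...) should null n(s)
for k in range(4):
    s = complex(beta, k * omega_p)
    print(k, abs(n(s)))                     # ~ 0 up to round-off
```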
the basic features of such imc design are shown in the following example.

4 application example, spectrum based analysis of the dynamics

in order to demonstrate the outstanding approximation features of a first order anisochronic model, consider a plant originally described by the tenth order model with the transfer function

$P(s) = \frac{20s+1}{(2s+1)^{10}}$  (20)

with the multiple pole $\beta_{1-10} = -0.5$ and the single dominant zero −0.05. let us find the parameters of model (5) that approximate model (20) in the low frequency range. assessing τ = 8 and η = 10, which implies a = 0.607 (according to (13), with β = −0.05) and $k = (1-a)^{-1} = 2.54$, the dead time of the system and the rising part of the response are approximated quite well, see fig. 4. the remaining two parameters of model (5), i.e., T = 1.31 and ϑ = 5.5, have been assessed using the least squares method to approximate two points of the frequency response of (20).

fig. 3: scheme of the controller (16)
fig. 4: step responses of system (20) (dashed) and of its approximation (5) (solid)

since model (5) is supposed to approximate model (20) in the low frequency range, the approximated points of the frequency response have been chosen as those with $\varphi_1 = \arg P(\mathrm{j}\omega_1) = -\pi/2$ and $\varphi_2 = \arg P(\mathrm{j}\omega_2) = -\pi$. as can be seen in fig. 5, the frequency response of model (5) approximates the frequency response of (20) very well (even in the third quadrant of the frequency response). the approximation of the system step response is also very good, considering that the anisochronic model (5) is of the first order.

let us use the given parameters of model (5) in controller (16) and investigate the dynamics of the closed loop. the closed loop system is of the 11th order, with the transfer function given by (19). choosing $T_f = 10$ and $\vartheta_f = 7$, the closed loop dynamics are supposed to be given by the dominant couple of poles $\tilde\lambda_{1,2} = -0.081 \pm 0.156\mathrm{j}$ (the dominant pole couple of the ideal closed loop (17), with relative damping ξ = 0.51, see fig. 1). thus, let us investigate the pole distribution of the closed loop consisting of the system with dynamics (20) and the imc controller based on approximation model (5). since the closed loop characteristic function is a quasipolynomial, a special numerical algorithm known as a mapping based rootfinder, see [11, 12, 13], is to be used to locate the characteristic function roots. the characteristic function of the closed loop (19) is given by

$m_{wy}(s) = k\,(1-a\exp(-\eta s))\,(T_f s+\exp(-\vartheta_f s)-\exp(-\tau s))\,(2s+1)^{10} + (Ts+\exp(-\vartheta s))\,(20s+1).$  (21)

the first step of the mapping based rootfinder consists in substituting $s=\beta+\mathrm{j}\omega$ and splitting the characteristic function into real and imaginary parts, i.e., $r(\beta,\omega)=\mathrm{re}[m_{wy}(\beta+\mathrm{j}\omega)]$ and $i(\beta,\omega)=\mathrm{im}[m_{wy}(\beta+\mathrm{j}\omega)]$. then the implicit functions $r(\beta,\omega)=0$ and $i(\beta,\omega)=0$ are mapped in the s-plane, and their intersection points are located, determining the positions of the roots of (21). the result of the mapping based rootfinding technique can be seen in fig. 6, where the decisive part of the root spectrum of (21) is shown. as shown in fig. 6, the poles $\lambda_1=-0.05$, $\lambda_{2,3}=-0.081\pm0.176\mathrm{j}$ and $\lambda_{4,5}=-0.126\pm0.089\mathrm{j}$ are the closest poles to the s-plane origin and are thus likely to be the dominant poles of the system.
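the mapping step described above can be sketched in a few lines. the grid bounds, the resolution and the crude cell-based intersection test below are assumptions of this illustration, not the algorithm of [11, 12, 13]; the sketch merely mimics the idea of mapping r(β, ω) = 0 and i(β, ω) = 0 and locating their crossings.

```python
import numpy as np

# parameter values follow the example above
k, a, eta, tau, T, theta = 2.54, 0.607, 10.0, 8.0, 1.31, 5.5
Tf, thf = 10.0, 7.0

def m_wy(s):
    """closed loop quasipolynomial (21)."""
    n = 1 - a * np.exp(-eta * s)
    f = Tf * s + np.exp(-thf * s) - np.exp(-tau * s)
    return k * n * f * (2 * s + 1) ** 10 + (T * s + np.exp(-theta * s)) * (20 * s + 1)

# evaluate re(m_wy) and im(m_wy) on a grid in the s-plane
be = np.linspace(-1.0, 0.2, 601)
om = np.linspace(0.0, 2.5, 601)
B, W = np.meshgrid(be, om)
M = m_wy(B + 1j * W)

# a root candidate sits where both parts change sign inside one grid cell
hits = (np.abs(np.diff(np.sign(M.real), axis=0))[:, :-1] > 0) & \
       (np.abs(np.diff(np.sign(M.imag), axis=1))[:-1, :] > 0)
for i, j in zip(*np.nonzero(hits)):
    print(round(B[i, j], 3), round(W[i, j], 3))   # crude root estimates
```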
the poles of the couple $\lambda_{2,3}$ are quite close to the prescribed poles $\tilde\lambda_{1,2}$. however, from the distribution of the other dominant poles, it is not obvious that the poles $\lambda_{2,3}$ determine the dynamics of the closed loop ($\lambda_1$ is even closer to the origin of the s-plane). since the closed loop is a neutral system, we also analyse the essential spectrum of the system, which is here given by the solutions of the equation n(s) = 0, see [10]. the spectra of the poles (black circles), the zeros (empty circles) and the essential spectrum (asterisks) corresponding to closed loop system (19) with the chosen $T_f = 10$ and $\vartheta_f = 7$ are shown in fig. 7. as can be seen, pole $\lambda_1$ is likely to be compensated by the real zero. the couple of poles $\lambda_{4,5}$ is also quite close to a couple of zeros, and these poles are partly compensated as well. consequently, the dominant mode of the set-point response is really given by the couple of poles $\lambda_{2,3}$ (see fig. 7; the poles of the prescribed anisochronic dynamics are marked by squares). the dominant role of the couple $\lambda_{2,3}$ in the set-point response dynamics of the closed loop is confirmed by the responses of the real closed loop system (19) and the ideal closed loop system (17) shown in fig. 8: the real set-point response (solid) is very close to the ideal one (dashed). the characteristic feature of the class of neutral systems, i.e., that some of the poles converge to the eigenvalues of the essential spectrum, can be seen in fig. 7 and in the enlarged region in fig. 9.

the consequence of using a numerator of form (6) to approximate the system's dominant zero is that the closed loop system has infinitely many poles with real parts close to the value given by (11). therefore, controller (16) cannot be used to control systems with zeros in the right half of the complex plane. on the other hand, if the dominant zero is negative and not too close to the imaginary axis, the neutral character of the closed loop system does not bring about any features that are risky to the dynamics. provided that closed loop system (19) (with a negative dominant zero) does not have any unstable poles close to the s-plane origin, it does not have any unstable poles at all. this is given by the fact that the chain of poles converging to the spectrum of the difference equation tends to approach the eigenvalues of the essential spectrum as the magnitudes of the poles in the chain increase.

fig. 5: frequency responses of system (20) (dashed) and of its approximation (5) (solid)
fig. 6: poles of the closed loop system with plant model (20) and the controller based on model (5); $\mathrm{re}[m_{wy}(s)]=0$ solid, $\mathrm{im}[m_{wy}(s)]=0$ dashed, $m_{wy}(s)$ given by (21)

5 conclusions

the prime objective of the paper is to break the conventional concept of a pid controller by including delayors in its structure, and in this way to reduce the order of the models needed for controller design. in fact, the resulting imc controllers lie somewhere between the analog and discrete principles of operation, with the specific feature that the time shifts (sample-delays) are not integer multiples of a sampling time interval.
it should be noted that the delays in the controller model do not serve only to compensate some specific delays (e.g., transport delays) in the plant. the purpose of the delayors is to fit dynamic effects of various origins, including the distributed parameters of the plant. the first order example model (5) used in the paper actually represents those plants classified as "higher-order" in the usual sense of the term; controlling these plants by a conventional pid controller is doubtful. the use of delays in the modelling results in a quite plain structure of the final control algorithm, which can easily be implemented on a programmable controller, where simultaneous application of integrators and delayors is quite possible. as has also been shown, implementation of the imc controller based on the anisochronic model can result in the so-called neutral character of the closed loop dynamics, which may introduce risky features into the closed loop system dynamics. however, if the controller of structure (16) is used (provided that the dominant zero being approximated is located in the left half of the complex plane), the neutral character of the closed loop does not bring any substantially negative features to the dynamics.

acknowledgement
this research has been supported by the ministry of education of the czech republic under project ln00b096.

fig. 7: spectra of the closed loop system for $T_f = 10$ and $\vartheta_f = 7$, detail of fig. 9
fig. 8: comparison of the ideal and real set-point responses, ideal dashed, real solid
fig. 9: spectra of the closed loop system for $T_f = 10$ and $\vartheta_f = 7$. spectra of closed loop system (19): black circles – poles of (19), empty circles – zeros of (19), asterisks – roots of n(s), squares – poles of system (17) (ideal closed loop)

references
[1] morari, m., zafiriou, e.: robust process control. prentice-hall, 1989.
[2] skogestad, s., postlethwaite, i.: multivariable feedback control. john wiley and sons, 1989.
[3] goodwin, g. c., graebe, s. f., salgado, m. e.: control system design. new jersey: prentice-hall, inc., 2001.
[4] hang, c. c., lee, t. h., ho, w. k.: applied adaptive control. isa, nc, 1993.
[5] wang, q. g., lee, t. h., tan, k. g.: finite spectrum assignment for time-delay systems. london: springer-verlag, 1999.
[6] vyhlídal, t., zítek, p.: control system design based on a universal first order model with time delays. proceedings of international conference on advanced engineering design, glasgow: university of glasgow, 2001, and acta polytechnica, vol. 41, 2001, no. 4–5, p. 49–53.
[7] zítek, p.: time delay control system design using functional state models. ctu reports, no. 1, ctu prague, 1998.
[8] avelar, c. e., hale, j. k.: on the zeros of exponential polynomials. mathematical analysis and application, vol. 73, 1980, p. 434–452.
[9] zítek, p., hlava, j.: anisochronic internal model control of time-delay systems. control engineering practice, vol. 9, 2001, no. 5, p. 501–516.
[10] hale, j. k., verduyn lunel, s. m.: introduction to functional differential equations. vol. 99 of applied mathematical sciences, springer verlag new york inc., 1993.
[11] vyhlídal, t., zítek, p.: analysis of time delay systems using quasipolynomial mapping based rootfinder implemented in matlab. proc. of matlab 2002, prague, 2002, p. 606–616.
[12] vyhlídal, t., zítek, p.: quasipolynomial mapping based rootfinder for analysis of time delay systems. in proc. of ifac workshop on time delay systems 2003, rocquencourt (france).
[13] vyhlídal, t.: analysis and synthesis of time delay system spectrum. ph.d. thesis, faculty of mechanical engineering, czech technical university in prague, 2003.

ing. tomáš vyhlídal, ph.d., e-mail: vyhlidal@fsid.cvut.cz
prof. ing. pavel zítek, drsc., phone: +420 224 352 564, fax: +420 233 336 414, e-mail: zitek@fsid.cvut.cz
centre for applied cybernetics, institute of instrumentation and control engineering, czech technical university in prague, faculty of mechanical engineering, technická 4, 166 07 praha 6, czech republic

effect of imperata cylindrica reinforcement form on the tensile and impact properties of its composites with recycled low density polyethylene
olusola femi olusunmade*, abba emmanuel bulus, terwase kelvin kashin
department of mechanical engineering, university of agriculture, makurdi, p. m. b. 2373, makurdi, nigeria
* corresponding author: olusunmadeolusola@yahoo.com
acta polytechnica 58(5):292–296, 2018, doi:10.14311/ap.2018.58.0292

abstract. composites of recycled low-density polyethylene obtained from waste water-sachets and imperata cylindrica were produced with particulate and long-fibre unidirectional mat reinforcements. a comparison was made of the tensile and impact properties resulting from the use of the different reinforcement forms at a 10 wt% ratio in the matrix. the results obtained from the tests carried out revealed that the tensile strength, tensile modulus, elongation at break and impact strength of the composite with the long-fibre mat reinforcement were better than those of the composite with the particulate reinforcement. the better performance observed for the long-fibre mat reinforcement could be attributed to the retention of the toughness and stiffness of the imperata cylindrica stem in this form of reinforcement, which is lost after the stem strands are pulverized into particles. imperata cylindrica stem, as a natural fibre reinforcement for polymeric material, is therefore recommended in the long-fibre mat form. the combination of these otherwise challenging resources in composite materials development will add economic value to them and help to reduce the environmental menace they present.

keywords: tensile properties; impact strength; imperata cylindrica (ic); water-sachets; composites; particulate; long-fibre mat.

1. introduction

effective resource utilization, as well as concern for the environment, is among the reasons many researchers and industries are adopting renewable natural fibres as replacements for synthetic fibres in polymer reinforcement for the development of composites [1–3]. composites produced with natural fibre reinforcement of polymeric material have shown potential for many engineering applications [4] due to their excellent properties (mechanical, physical, electrical, etc.). one natural fibrous plant is imperata cylindrica, which is a perennial with basal leaves 3–100 cm long and 2–20 mm wide [5]. it is an aggressive weed that is difficult to control due to its short growth cycle. it is abundant, yet unsuitable for grazing animals, and lacks good commercial value [6].
when fully mature, its overall nutrient content declines, and its sharp pointed seeds and tangled awns may injure animals and humans [7]. it also acts as a collateral host for pathogens that affect the yield of some food crops [8]. however, imperata cylindrica is stiff and tough [5, 9]. these properties make it promising as a fibre reinforcement for polymers, particularly recycled water-sachets, to produce thermoplastic composites. this will give economic importance to the fibre and reduce the environmental challenges posed by plastic wastes to human and aquatic lives.

however, several factors affect the properties of composite materials; they include: the fibre ratio in the matrix [2], the production technique, chemical modification, fibre orientation or direction, and fibre type and reinforcement form [10]. there are many types and forms of reinforcement, such as fibre, powder and bulk. compared to the other fibre forms, the powder form has the smallest volume. according to el-shekeil et al. [11], the mechanical properties of kenaf fibre reinforced polyurethane composites were influenced by the size of the kenaf fibre: different fibre sizes showed a significant influence on the tensile and flexural properties and on the impact strength. agarwal et al. [12] reported that a substantial improvement in mechanical properties can be achieved through the addition of fillers (either short fibre or particulate) into a nitrile rubber; the addition of short fibres, however, has been found to be more effective. vijay and singha [13] reported that with a particle reinforcement the compressive strength increases to a greater extent than with short and long fibre reinforcements; it was also reported that a composite with a particle reinforcement has a higher load-bearing capacity and a lower wear rate than those with short and long fibre reinforcements. this study, therefore, made a comparative examination of the tensile and impact properties of polymer composites produced with imperata cylindrica (ic) in the particulate form and in the long-fibre mat form at a 10 wt% ratio in the matrix.

figure 1. ic mat.

2. materials and method

the part of the imperata cylindrica used for this study is the stem; the stems were obtained from pilla village in the makurdi area of benue state. the growth environment of the imperata cylindrica is tropical, with an average temperature of 27 °c and a relative humidity of 82 %. the water-sachets, made from low density polyethylene (ldpe), were handpicked from the environment within the makurdi area of benue state.

2.1. processing of the materials

the imperata cylindrica stems were harvested in february, during the dry season, which is one of the two major seasons in the tropics. subsequently, the finer strands of the stems were handpicked and arranged into a unidirectional mat (see figure 1). the average diameter across the length of the strands selected to form the mat was 3 mm, measured with a digital micrometre screw gauge. tiny threads were used to hold the ic strands together at three positions along the breadth of the mat, to ease its transfer into the mould without scattering. the dimension of the woven ic mat is 285 × 200 × 3 mm. some of the ic strands were also ground into smaller particles. the waste water-sachets were thoroughly washed and pulverized at goshen plastics industry, makurdi (see figure 2).
these pulverized waste water-sachets will be referred to as recycled low density polyethylene (rldpe) henceforth.

figure 2. pulverized waste water-sachets.

2.2. preparation and characterization of the composites

the equipment used and the methods adopted for producing and characterizing the composites were as described by olusunmade et al. [3]. the weights of the ic (particulate, mat) and the rldpe were measured using an electronic weighing balance, such that the weight ratio of the mat in the matrix was 1:9. the thickness of the mat (which is the diameter of an individual strand of the ic stem) and the expected thickness of the test specimens were responsible for the weight ratio used in this study. both reinforcement forms of the imperata cylindrica were used to produce the sheets. figures 3–5 show one of the composite sheets produced and the test specimens. the test specimens for the tensile test have a dumb-bell shape with a gauge length of 30 mm, a grip width of 15 mm and a thickness of 5 mm. the dimensions of the impact test specimens are 100 × 10 × 5 mm. three specimens were used for each of the tests. the temperature and relative humidity of the test environment are 22 °c and 50 %.

figure 3. finished composite sheet.
figure 4. tensile test specimens.

3. results and discussion

3.1. tensile properties

3.1.1. tensile strength of the composite

table 1 shows the average tensile strength of the composites produced with the long-fibre mat and particulate ic compared to the rldpe. there was an increase of 57.27 % in the average tensile strength of the rldpe/ic mat composite when compared to the rldpe, and an increase of 81.23 % when compared to the rldpe/ic particulate composite. the higher tensile strength observed for the rldpe/ic mat composite, compared to the neat rldpe and the rldpe/ic particulate composite, was due to the transfer of stress to the ic long-fibre. a physical examination showed that the ic stem offers great resistance to a force pulling it along its length, due to its stiff nature, thereby increasing the tensile load bearing capacity of the composite material before fracture.

3.1.2. tensile modulus of the composite

table 1 also shows the average tensile modulus of the composites produced with the long-fibre mat and particulate ic compared to the rldpe. there was an increase of 327.50 % in the average tensile modulus of the rldpe/ic mat composite when compared to the rldpe, and an increase of 278.05 % when compared to the rldpe/ic particulate composite. the increase in the modulus can be attributed to the decreased deformability of the rigid interface between the ic mat/particulate and the matrix material, which causes a reduced strain. the enhancement in the tensile modulus is also due to the fibres themselves, which have a higher stiffness than the polymer [2, 3, 14]. the ic stem is more rigid in its original form than when pulverized into particulate; hence, there is a more rigid polymer/ic interface in the long-fibre mat form, resulting in the higher average tensile modulus recorded for the rldpe/ic mat composite over the rldpe/ic particulate composite.

3.1.3. percentage elongation at break of the composite

it was observed in table 1 that there was a decrease of 35.24 % in the average percentage elongation of the rldpe/ic mat composite when compared to the rldpe, and an increase of 28.74 % when compared to the rldpe/ic particulate composite.
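the percentage changes quoted in sections 3.1.1–3.1.3 can be re-derived from the mean values in table 1. a minimal sketch; recomputing from the rounded table means reproduces the quoted figures to within rounding (e.g. 57.4 % versus the quoted 57.27 % for the tensile strength):

```python
# mean values from table 1: (rldpe, rldpe/ic mat, rldpe/ic particulate)
props = {
    "tensile strength (mpa)": (10.86, 17.09, 9.43),
    "tensile modulus (mpa)": (116.44, 497.78, 131.67),
    "elongation at break (%)": (54.93, 35.57, 27.63),
}
for name, (neat, mat, part) in props.items():
    print(name,
          f"mat vs rldpe: {100 * (mat - neat) / neat:+.2f} %",
          f"mat vs particulate: {100 * (mat - part) / part:+.2f} %")
```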
with the incorporation of the ic mat/particulate in the polymer, the elasticity of the composite is suppressed. the reduction is attributed to the decreased deformability of the rigid interface between the ic mat/particulate and the matrix material [2, 3]. the decrease in elongation at break is due to the destruction of the structural integrity of the polymer by the fibres and to the rigid structure of the fibres [15]. the higher average percentage elongation recorded for the rldpe/ic mat composite over the rldpe/ic particulate composite was a result of the mat's long fibres being able to stretch a little further under the tensile forces before fracture, due to their length, unlike the particles, which separate easily under a load in the matrix. the ic stem is thus slightly more elastic in the long-fibre mat form than in the particulate form.

figure 5. impact test specimens.

3.2. impact strength of the composite

figure 6 illustrates the average impact strength of the composites produced with the long-fibre mat and particulate ic compared to the rldpe. there was an increase of 44.14 % in the average impact strength of the rldpe/ic mat composite when compared to the rldpe, and an increase of 151.74 % when compared to the rldpe/ic particulate composite. the ic stem has a foam-like inner layer, which makes it tough and able to absorb more impact energy before fracture; this is responsible for the higher impact strength obtained for the rldpe/ic mat composite. in the particulate form, this structure is destroyed, leading to a reduction in the toughness and hence the lower impact strength of the rldpe/ic particulate composite.

table 1. tensile properties of rldpe, rldpe/ic mat and rldpe/ic particulate.

                           rldpe            rldpe/ic mat     rldpe/ic particulate
tensile strength (mpa)     10.86 ± 0.86     17.09 ± 1.50     9.43 ± 0.62
tensile modulus (mpa)      116.44 ± 14.86   497.78 ± 33.34   131.67 ± 15.46
elongation at break (%)    54.93 ± 16.57    35.57 ± 8.43     27.63 ± 2.3

figure 6. comparison of impact strength of rldpe, rldpe/ic mat and rldpe/ic particulate.

4. conclusion

in this study, composites of recycled low-density polyethylene obtained from waste water-sachets and imperata cylindrica were produced with particulate and long-fibre unidirectional mat reinforcements. a comparison was made of the tensile and impact properties resulting from the use of the different reinforcement forms at a 10 wt% ratio in the matrix. the results from the tests carried out revealed that the tensile strength, tensile modulus, elongation at break and impact strength of the composite with the long-fibre mat reinforcement were better than those of the composite with the particulate reinforcement. the better performance observed for the long-fibre mat reinforcement could be attributed to the retention of the toughness and stiffness of the imperata cylindrica stem in this form of reinforcement, which is lost after the stem strands are pulverized into particles. imperata cylindrica stem, as a natural fibre reinforcement for polymeric material, is therefore recommended in the long-fibre mat form.

references
[1] suddell, b. c., evans, w. j.: natural fiber composites in automotive applications. in: mohanty, a. k., misra, m., drzal, l. t. (eds.), natural fibers, biopolymers and biocomposites, crc press, new york, 2005.
[2] olusunmade, o. f., adetan, d. a., ogunnigbo, c. o.: a study on the mechanical properties of oil palm mesocarp fibre-reinforced thermoplastic. journal of composites, vol. 2016, article id 3137243, 7 pages, 2016. doi:10.1155/2016/3137243
[3] olusunmade, o. f., zechariah, s., yusuf, t. a.: characterization of recycled linear density polyethylene/imperata cylindrica particulate composites. acta polytechnica, vol. 58, no. 3, czech technical university in prague, 2018. doi:10.14311/ap.2018.58.0195
[4] celluwood: technologies and products of natural fibre composites. cip-eip-eco-innovation-2008, id: eco/10/277331, 2008.
[5] kew science: imperata cylindrica (l.) p. beauv. royal botanic gardens, 2018, http://powo.science.kew.org/taxon/urn:lsid:ipni.org:names:30138371-2, accessed on 15th october, 2018.
[6] cabi: invasive species compendium: imperata cylindrica. the centre for agriculture and bioscience international, 2018, https://www.cabi.org/isc/datasheet/28580, accessed on 15th october, 2018.
[7] angzzas, s. m. k., aripin, a. m., ishak, n., hairom, n. h. h., fauzi, n. a., razali, n. f., zainulabidin, m. h.: potential of cogon grass (imperata cylindrica) as an alternative fibre in paper-based industry. journal of engineering and applied sciences, vol. 11, no. 4, 2016.
[8] soromessa, t.: heteropogon contortus (l.) p.beauv. ex roem. & schult. record from protabase. brink, m. & achigan-dako, e. g. (editors). prota (plant resources of tropical africa / ressources végétales de l'afrique tropicale), wageningen, netherlands, 2011.
[9] cook, b. g., pengelly, b. c., brown, s. d., donnelly, j. l., eagles, d. a., franco, m. a., hanson, j., mullen, b. f., partridge, i. j., peters, m., schultze-kraft, r.: tropical forages: an interactive selection tool. csiro, dpi&f (qld), ciat and ilri, brisbane, australia, 2005.
[10] tezara, c., siregar, j. p., lim, h. y., fauzi, f. a., yazdi, m. h., moey, l. k., lim, j. w.: factors that affect the mechanical properties of kenaf fiber reinforced polymer: a review. journal of mechanical engineering and sciences, vol. 10, no. 2, 2016, pp. 2159–2175.
[11] el-shekeil, y. a., salit, m. s., abdan, k., zainudin, e. s.: development of a new kenaf bast fibre-reinforced thermoplastic polyurethane composite. bioresources, vol. 6, no. 4, 2011, pp. 4662–4672.
[12] agarwal, k., setua, d. k., mathur, g. n.: short fibre and particulate-reinforced rubber composites. defence science journal, vol. 52, no. 3, july 2002, pp. 337–346.
[13] vijay, k. t., singha, a. s.: physico-chemical and mechanical characterization of natural fibre reinforced polymer composites. iranian polymer journal, vol. 19, no. 1, 2010, pp. 3–16.
[14] then, y. y., ibrahim, n. a., zainuddin, n., ariffin, h., wan yunus, w. m. z.: oil palm mesocarp fiber as new lignocellulosic material for fabrication of polymer/fiber biocomposites. international journal of polymer science, vol. 2013, article id 797452, 2013, 7 pages.
[15] liu, l., yu, j., cheng, l., qu, w.: mechanical properties of poly (butylene succinate) (pbs) biocomposites reinforced with surface modified jute fibre. composites part a: applied science and manufacturing, vol. 40, no. 5, 2009, pp. 669–674.

1 introduction

users expect modern geared motors and reducers to have high reduction ratios and small overall dimensions and total weight, as well as a closed, rigid one-part housing containing the toothed elements mounted on rolling bearings, as shown in [1]. the above requirements can be satisfied for a fine diametral pitch of gears made of high-quality toughening or carburizing alloy steel. the transmission ratio of the speed reducer is a function of the number of teeth, in particular in the pinion. the assumed high ratio forces a small number of teeth in the pinion. the fine diametral pitch and the small number of teeth cause the operating pitch diameter of the gear wheel to often be comparable to the output shaft diameter of the applied electric motor. this results in serious difficulties connected with the mounting of the gear. this paper deals with the strength analysis of one specific version of the gear wheel-shaft connection; the tapered self-locking frictional joint is considered. such a connection is preferred in application to lot production of geared motors manufactured in various series of types. the strength analysis of the joint is based on the relation between the torque and the statistical load intensity of the gear transmission. several geometric, strength and engineering dimensionless parameters are introduced to simplify the calculations and to generalize the approach. the procedure requires an initial selection of the permissible range of the parameters. stress-strain analysis with respect to combined bending and torsion of the circular shaft, the condition of a right contact pressure distribution in the frictional joint, and fatigue strength investigations of the shaft lead to the relations between the fundamental dimension of the joint and the other parameters. the final acceptable engineering solution may then be suggested and verified. the results of a numerical example illustrate the influence of the considered parameters of the gear wheel and the joint on its functional dimensions.

2 engineering solution of a gear wheel-shaft joint

the connection between the gear wheel and the shaft is considered in the case when the operating pitch diameter d of the gear is comparable to the mounting diameter of the joint. the connection is realized by means of a tapered self-locking frictional joint. the geometry of the joint is presented in fig. 1 for a cylindrical helical gear. the gear wheel unit consists of the pinion and the sleeve of external diameter $d_s$ with an internal conical hole of taper c. the output shaft of nominal diameter of the electric motor is executed with the same taper. the length of contact of the coupled elements is l, and the relation between the cone angle γ and the taper c is γ = arctg(c/2).
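for orientation, the cone angle corresponding to a given taper follows directly from the relation above. a minimal sketch, over tapers of the order used later in the paper:

```python
import math

# relation between taper c and cone half-angle gamma: gamma = arctan(c / 2)
for c in (1 / 5, 1 / 20, 1 / 100):          # typical tapers quoted in sect. 4
    gamma = math.degrees(math.atan(c / 2))
    print(f"taper 1:{round(1 / c)} -> gamma = {gamma:.3f} deg")
```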
the bolted joint with a slotted nut as a locking device between the pinion and the shaft is applied to produce the effective axial force on the subassembly in the frictional connection. the transitory zone between the pinion and the sleeve (details b and c in fig. 1) must be carefully designed. both the production technology requirements and the operational requirements must be satisfied, and these parts should also be optimally designed with respect to fatigue strength. the main purpose is to obtain dimension e as small as possible, because this leads to a lower fundamental dimension L of the connection, and consequently the bending stress produced by the teeth forces is lower.

fig. 1: geometry of the gear wheel-shaft joint

analysis of gear wheel-shaft joint characterized by comparable pitch diameter and mounting diameter
j. ryś, h. sanecki, a. trojnacki
this paper presents the design procedure for a gear wheel-shaft direct frictional joint. the small difference between the operating pitch diameter of the gear and the mounting diameter of the frictional joint is the key feature of the connection. the contact surface of the frictional joint must be placed outside the bottom land of the gear, and the geometry of the joint is limited to the specific type of solutions. the strength analysis is based on the relation between the torque and the statistical load intensity of the gear transmission. several dimensionless parameters are introduced to simplify the calculations. stress-strain verifying analysis with respect to combined loading, the condition of appropriate load-carrying capacity of the frictional joint and the fatigue strength of the shaft are applied to obtain the relations between the dimensions of the joint and the other parameters. the final engineering solution may then be suggested. the approach is illustrated by a numerical example. the proposed procedure can be useful in design projects for small, high-powered modern reducers and new-generation geared motors, in particular when manufactured in various series of types.
keywords: geared motor, gear wheel, frictional joint, strength analysis, fatigue.

3 strength analysis of frictional joint

3.1 combined strength of the shaft with respect to bending and torsion

the analysis is based on the relation between the rated torque T and the statistical load intensity $Q_u$ of the gear transmission, introduced by müller [2],

$T = \frac{b\,d^{2}}{2}\,\frac{u}{u+1}\,Q_u$,  (1)

and on the strength condition for the circular solid shaft of diameter d with respect to combined repeated and reversed bending and repeated one-direction torsion,

$M_{be} = S\,s_{bn}^{r}$,  (2)

where b is the width of the gear face, d is the operating pitch diameter, $u = z_2/z_1$ denotes the ratio of the first stage, S is the section modulus of the cross-section of the shaft, and $s_{bn}^{r}$ stands for the endurance strength in repeated and reversed bending. the equivalent bending moment in eq. (2) can be expressed as $M_{be} = \sqrt{M_b^{2} + (\chi M_t)^{2}}$, where $\chi = s_{bn}^{r}/s_{tn}^{o}$ and $s_{tn}^{o}$ is the endurance strength in repeated one-direction torsion. the distance from the midpoint c of the face to the cross-section a–a at the first bearing of the electric motor (where the moment $M_b$ reaches its maximum) is L.

the moments $M_b$ and $M_t$ are produced by the force between the teeth, the components of which are

$F_t = \frac{2T}{d}$,  $F_r = F_t\,\frac{\mathrm{tg}\,\alpha}{\cos\beta}$,  $F_a = F_t\,\mathrm{tg}\,\beta$,  (3)

and they may be expressed as functions of the statistical load intensity $Q_u$ as

$M_b = \frac{\lambda\,d^{3}}{2}\,\frac{u}{u+1}\,Q_u\left(2\zeta\sqrt{1+\frac{\mathrm{tg}^{2}\alpha}{\cos^{2}\beta}} \mp \mathrm{tg}\,\beta\right)$,  $M_t = \frac{\lambda\,d^{3}}{2}\,\frac{u}{u+1}\,Q_u$,  (4)

where α and β stand for the pressure angle and the helix angle, respectively, and several dimensionless parameters are introduced as follows: λ = b/d, δ = D/d, ζ = L/d. the negative sign in eq. (4) must be taken if the directions of rotation and the helix angle are designed in such a way that the component force $F_a$ between the teeth has the same sense as the total force $F^*$ in the frictional joint (as depicted in fig. 1 by the solid line) and causes an increase of the loading in the joint during service. the positive sign should be applied in the opposite case. combining eqs. (2) and (4) enables the fundamental dimensionless parameter ζ of the frictional joint to be determined from

$\left(2\zeta\sqrt{1+\frac{\mathrm{tg}^{2}\alpha}{\cos^{2}\beta}} \mp \mathrm{tg}\,\beta\right)^{2} + \chi^{2} - \left(\frac{\pi\,s_{bn}^{r}}{16\,\lambda\,Q_u}\,\frac{u+1}{u}\right)^{2} = 0$  (5)

in terms of the other parameters of the gear transmission (u, λ, α, β), the strength properties of the material of the shaft ($s_{bn}^{r}$, χ), and the statistical load intensity $Q_u$ of the gear transmission.

3.2 load-carrying capacity of the frictional joint

under the assumption that in the tapered self-locking frictional joint under study the distribution of the contact pressure is constant (p = const), and additionally taking the coefficient of friction μ to be the same over the total surface of contact (μ = const), the torque T expressed by eq. (1) may be carried by the joint only if the contact pressure is sufficiently high.
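as an illustration of eqs. (1) and (3), the sketch below evaluates the rated torque and the tooth-force components. the input values are hypothetical mid-range data (close to the mean values adopted later in sect. 5), not results from the paper:

```python
import math

# hypothetical mid-range data: face width b [m], pitch diameter d [m],
# stage ratio u, statistical load intensity Qu [Pa]
b, d, u, Qu = 0.7 * 0.028, 0.028, 7, 3.5e6
alpha, beta = math.radians(20), math.radians(25)   # pressure and helix angle

T = 0.5 * b * d**2 * u / (u + 1) * Qu        # eq. (1), rated torque [N m]
Ft = 2 * T / d                               # tangential force, eq. (3)
Fr = Ft * math.tan(alpha) / math.cos(beta)   # radial force
Fa = Ft * math.tan(beta)                     # axial force
print(f"T = {T:.2f} N m, Ft = {Ft:.0f} N, Fr = {Fr:.0f} N, Fa = {Fa:.0f} N")
```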
the moments mb and mt are produced by the force between the teeth, the components of which are f t d f f t d f f t d t r t a t � � � � � � � 2 2 2 , cos cos , tg tg tg tg � � � � � � (3) and may be expressed as functions of the statistical load intensity q u as m d u u q m t b u t � � � � � � � � � � � � � � � � 2 2 2 32 2 1 tg � sin cos , 3 3 2 1 d u u q u � , (4) where � and � stand for the pressure angle and helix angle, respectively, and several dimensionless parameters are introduced as follows: � � b d , � � d d , � l d. the negative sign in eq. (41) must be taken if the directions of rotation and the helix angle are designed in such a way that the component force fa between the teeth has the same sense as the total force f* in the frictional joint (as depicted in fig. 1 by the solid line) and causes an increase of loading in the joint during service. the positive sign should be applied in the opposite case. combining eqs. (2) and (4) enables the fundamental dimensionless parameter of the frictional joint to be determined tg tg tg � � � � � � cos cos � � � � � � � � � � � � � � � 2 21 � � � � � � � �� � � � � � � � � � � � � � � � � 2 2 32 1 1 tg2 s u u q bn r u � � � � 2 0 (5) in terms of other parameters of gear transmission (u, �, �, �, �), strength properties of the material of the shaft (s bn r , �), and statistical load intensity q u of gear transmission. 3.2 load-carrying capacity of the frictional joint under the assumption that in the tapered self-locking frictional joint under study the distribution of the contact pressure is constant (p � const) and additionally taking the coefficient of friction the same over the total surface of contact ( � const), the torque t expressed by eq. (1) may be carried by the joint if the contact pressure is © czech technical university publishing house http://ctn.cvut.cz/ap/ 27 acta polytechnica vol. 43 no. 5/2003 fig. 1: geometry of the gear wheel-shaft joint � �� � p u u q u� � � � 6 1 1 2 1 3 3 � � � � � � sin tg , (6) where � � l d is the dimensionless length of the frictional joint. the above condition is satisfied for the longitudinal force acting in the joint � � � � � � f d u u q u� � � � � � � 3 2 1 1 2 1 1 2 1 3 2 3 2� � � � � � � � sin cos tg tg . (7) the contact pressure in the joint is caused by the force � � � �f f fb a* � � � 2 (fig. 2), where f(b) is produced in the bolted joint on the subassembly, and � ��fa 2 stands for the portion of the resultant force � � � �f f fa a a� �� � 1 2 that is transmitted to the surface of the frictional joint generating its additional loading or unloading. the strength conditions of the bolted joint must be satisfied with respect to tension for the minor diameter dm(b) of the bolt and with respect to the contact pressure at the thread surface along portion h, which lead to the equations, respectively � �� � � � � �� � d f f s h p d d p f f m b b a t b b b b a ( ) ( ) ( ) ( ) ( ) ( ) ,� � � 4 4 1 2 1 2 1 � � � � � � , (8) where p is the pitch of the bolt thread, d(b) stands for the major bolt diameter, d1 is the minor nut diameter (in the sleeve), and s t b( ) and p b( ) denote the tensile strength and permissible contact pressure of the bolt, respectively. an analysis of the frictional joint under initial loading f(b) and additional loading fa is applied to determine the components �fa ( )1 and �fa ( )2 . the force fa facing as shown in fig. 
the force $F_a$ oriented as shown in fig. 3 unloads the portion of the pinion of length b/2, whose spring rate is $k_{p1}$, and the corresponding portion of the bolt, whose spring rate is $k_b$ (fig. 3). at the same time, the remaining part of the pinion, with spring rate $k_{p2}$, is additionally loaded, as well as a part of the sleeve, whose spring rate is denoted by $k_s$ (the lengths of these portions are combinations of b/2 and the dimensions $e_1$, $e_x$, $e_3$ indicated in fig. 1). by applying well-known calculations, the components of the force $F_a$ can be found:

$\Delta F_a^{(1)} = \frac{k_{be}}{k_{be}+k_{se}}\,F_a$,  $\Delta F_a^{(2)} = \frac{k_{se}}{k_{be}+k_{se}}\,F_a$,  (9)

where $k_{be}$ denotes the equivalent spring rate of $k_{p1}$ and $k_b$, and $k_{se}$ is the equivalent spring rate of $k_{p2}$ and $k_s$.

the frictional joint is simultaneously loaded with the bending moment $M_{bp}$, which disturbs the contact pressure p, initially assumed to be uniform (fig. 4). the assembly pressure and the pressure produced by the bending moment lead to a non-uniform distribution of the resultant contact pressure in the radial direction. the mean value of the resultant contact pressure, and hence the effective friction torque carried by the joint, remain practically the same with the contribution of $M_{bp}$. nevertheless, the bending moment $M_{bp}$ must be limited to prevent the minimal pressure from falling below the acceptable value $p_{min}$.

fig. 2: diagram of the forces acting at the surface of the frictional joint
fig. 3: definition of the spring rates of the considered portions of the frictional joint
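a minimal sketch of the force split (9), assuming unit quantities and the spring-rate ratio k_se ≈ 26 k_be reported later in the numerical example:

```python
def split_axial_force(Fa, k_be, k_se):
    """share of the tooth axial force taken by the bolt and by the joint
    surface, eq. (9); k_be, k_se are the equivalent spring rates."""
    dF1 = k_be / (k_be + k_se) * Fa   # added to the bolt
    dF2 = k_se / (k_be + k_se) * Fa   # added to the frictional joint
    return dF1, dF2

dF1, dF2 = split_axial_force(Fa=1.0, k_be=1.0, k_se=26.0)
print(f"bolt share ~ {dF1:.3f} Fa, joint share ~ {dF2:.3f} Fa")  # ~0.037 / 0.963
```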
the maximum value � �p c pmax* � �2� of the contact pressure distribution changed by the bending moment mbp must be limited to permissible compressive loading pc for both members of the frictional joint. 3.3 fatigue strength of the shaft the frictional joint must be evaluated with respect to fatigue strength. the usual procedure is to estimate the fatigue factor of safety fs. the members of the joint are subjected to combined loading: bending and torsion, in which case the effective factor of safety is described by niezgodzinski et al [3] as fs fs fs fs fs b t b t � �2 2 , (16) where the component fatigue factors of safety in reversed bending and one–direction torsion are, respectively � � � � fs s fs s s s b bn r b a t tn r m tn r tn o t a � � � � � � � �� � � �� � , . 2 1 (17) © czech technical university publishing house http://ctn.cvut.cz/ap/ 29 acta polytechnica vol. 43 no. 5/2003 fig. 4: distribution of resultant contact pressure in the frictional joint loaded with the bending moment in eqs. (17),� ��� b and� ��� t denote the products of partial fatigue factors, calculated for the considered part of the shaft in pure bending separately, and in pure torsion separately, and sbn r , stn r , stn o stand for the endurance stresses in reversed bending, reversed torsion, and one-direction torsion, respectively. the fatigue calculations of the shaft were carried out for a cross-section with two diameters db and d joined by a fillet of radius r1 (detail a in fig. 1), located at a distance l l nf � � 3 from the midpoint of the face. this cross-section was recognized as the weakest plane with respect to stress concentration. in the solid shaft of diameter d subjected to repeated and reversed bending with the moment mbf and also subjected to repeated one-direction torsion with torque mt, the amplitude of the bending stress is � �a bfm d� 32 3, and the amplitude of the shear stress equals the mean shear stress � � �a m tm d� � 8 3. the above relations may be combined with (4) and (17), and introduced into (16) after substituting f fl d� . assuming that fs fsw� , where fsw denotes the working fatigue factor of safety introduced for the considered plane of the shaft, equation (16) can be rearranged and written in the form tg tg tg � � � � � � cos cos � � � � � � � � � � � � � � � 2 21 f � � � � � f b tn r tn o t bn rs s s � �� � � � � � � � � � � � � �� �� 2 1 16 2 1 2 2 2 tg2 s u u q s tn r u bn � � � � � � �� � � � � � � �� � � 2 2 2 32 1 1� � � � � r b wfs�� � � � � � � � 2 0. (18) following the approach presented by niezgodzinski at al in [3], fsw may be estimated by applying the so called required fatigue factor of safety defined as fs x x x xr � 1 2 3 4 , (19) where x1 is the factor for the reliability of the assumptions, x2 is the factor for the importance of a machine part, x3 is the factor of material homogeneity, and x4 stands for the factor of preservation of dimensions. 4 acceptable range of dimensionless parameters all resulting equations in the per are derived in dimensionless quantities related to diameter d. in some cases, however, the geometry of the frictional joint had to be specified and d � 28 [mm] was assumed. as shown by krukowski et al in [4], frictional joints are usually designed assuming � 0.10 � 0.20 as for polished clean parts, and c � 0.25. conical self–locking connections are executed with the taper between 1:5 � 1:100 and with the length l � (1 � 2)d, i.e., � � 1.0 � 2.0. 
the dimensionless diameter of the joint was assumed to be δ = 0.8–1.2, and the outer diameter of the sleeve $d_s \approx 2d$. the detailed calculations were carried out for a cylindrical helical pinion with the usual pressure angle α = 20° and the helix angle β = 25°. on the basis of literature recommendations, the dimensionless width of the face was assumed to be λ = 0.5–0.9, and the ratio of the first stage of the transmission u = 5–9. the statistical load intensity $Q_u$ depends on the strength properties of the material applied for the pinion and on its heat treatment. the gears of high-powered gear units for general engineering are usually made of carburizing or nitriding alloy steel, e.g., 18hgt, or of toughening alloy steel, e.g., 45hn, for which $Q_u$ = 2–5 mpa. three grades of steel for the shaft were taken into account: the toughening quality carbon steels 35 and 45, and the toughening alloy steel 45hn. some strength properties of these materials necessary for the calculations were introduced after ciszewski et al. [5] and łysakowski [6].

the specific values of several dimensions must be introduced in the fatigue calculations of the shaft. the well-known relations adopted in shafting (dąbrowski [7]) for the diameter reduction ratio, $d_b/d \le 1.2$, and for the fillet radius, $r_1 \approx 0.25(d_b - d)$, should be satisfied (detail a in fig. 1).

table 1: results of fatigue calculations

quantity                        steel 35   steel 45   45hn
r_m min [mpa]                      580        660      1030
r_e min [mpa]                      365        410       835
s_bn^r [mpa]                       255        310       475
s_tn^r [mpa]                       152        183       285
s_tn^o [mpa]                       300        365       520
s_bn^r (permissible) [mpa]          64         78       119
s_tn^o (permissible) [mpa]          75         95       130
p_c [mpa]                           87         98       120
rho_m [mm]                        0.62       0.57      0.36
rho [mm]                          1.12       1.07      0.86
rho/d                           0.0400     0.0382    0.0307
eta                               0.83       0.86      0.95
(alpha_k)_b                       1.81       1.82      1.93
(alpha_k)_t                       1.30       1.32      1.36
(beta_k)_b                      1.6723     1.7052    1.8835
(beta_k)_t                      1.2490     1.2752    1.3420
gamma_p                           1.07       1.08      1.13
(beta)_b                        1.7894     1.8416    2.1284
(beta)_t                        1.3364     1.3772    1.5165
(eps)_b                           1.24       1.26      1.28
(eps)_t                           1.16       1.18      1.20
(beta*eps)_b                    2.2189     2.3204    2.7244
(beta*eps)_t                    1.5502     1.6251    1.8198

in the flange-type electric motor skg112m (made by tamel, poland), the end of the output shaft of diameter d = 28 mm
the fundamental parameter � of the frictional joint may be expressed as a sum � � � � � � � �� � � � � � � �0 5 1 3 1 2 3. x . (20) the extreme values of each component in eq. (20) were estimated with respect to the production and operational requirements: �1 0 2 0 4� �. . , �x � �0 35 0 7. . , �3 0 05 0 07� �. . , �1 0 05 0 07� �. . , �2 0 4 12� �. . and �3 0 34� . (as for the electric motor skg112m). the acceptable range of parameter � was obtained employing eq. (20), and its extreme values are �min .� 2 59, �max .� 4 82, and the mean value �m � 3 71. . the solution of the problem must then satisfy the inequality � � �min max� � . 5 numerical example parameter � was chosen as the fundamental quantity of the joint, and it was subjected to investigation in the design process. the explicit influence of other parameters on the parameter � excludes the standard optimization procedure with imposed appropriate production technology, operational and strength constraints. the set of resulting conditions (5), (14) and (18) was rearranged and uniformly presented as � �� � �� f u c q u, , , , , , ,material properties , introducing the substitutions � �� � � �p � � �2 and � � �f � � 3. the influence of each single variable parameter on the total dimensionless length l of the joint was examined individually, while the other parameters were set as the mean values in the acceptable ranges (suggested in sect. 4), i.e.: u � 7, � 0 7. , �10. , � �15. , c �1 20: (morse taper), � � 015. , q u � 3 5. [mpa] and steel 45 applied for the shaft. a special comment should be made on the relation between the component forces � ��fa and �fa ( )2 . the detailed calculations carried out for the mean values of the parameters indicate that the equivalent spring rate kse is much greater than the equivalent spring rate k k kbe se be� 26. the portion of the resultant force fa between the teeth transmitted to the bolt is then � ��f fa a 0 035. , and the portion transmitted to the surface of the frictional joint is � �f fa a 2 0 965 . . © czech technical university publishing house http://ctn.cvut.cz/ap/ 31 acta polytechnica vol. 43 no. 5/2003 transmission ratio u 0 1 2 3 4 5 6 7 8 5 6 7 8 9 (a) (5) (14) (18) p a ra m e te r � dimensionless width of face 0 1 2 3 4 5 6 7 8 0.5 0.6 0.7 0.8 0.9 p a ra m e te r � (b) (5) (14) (18) dimensionless diameter 0 1 2 3 4 5 6 7 8 0.8 0.9 1.0 1.1 1.2 p a ra m e te r � (c) (5) (14) (18) dimensionless length � of joint 0 1 2 3 4 5 6 7 8 1.00 1.25 1.50 1.75 2.00 p a ra m e te r � (d) (5) (14) (18) fig. 5: parameter � versus: (a) – transmission ratio u, (b) – width of face , (c) – operating pitch diameter , and (d) – length � of the joint. other parameters are fixed. the well-founded assumption can be introduced that the frictional joint is additionally loaded with the entire force � �f f f fa b a* ( )� � , while the force f b( )* in the bolt is the same during service ( )( )* ( )f fb b� . the results of the numerical calculations are presented in figs. 5 and 6, where the fine lines correspond to the same senses of the forces fa and f*, and the bold lines are applied in the opposite case. the curve labels indicate the appropriate condition. it appears that parameter � depends most strongly on the width b of the gear face (fig. 5b), on the operating pitch diameter d (fig. 5c) and on the length l of the frictional joint (fig. 5d). it is clear, because the variation of these parameters results in significant changes of the bending moment loading the shaft and the frictional joint. 
the influence of the transmission ratio u (fig. 5a), taper c (fig. 6e) and friction coefficient � (fig. 6f ) is relatively small. the relation between the parameter � and the strength properties rm of the shaft, depicted in fig. 6g, leads to the conclusion that high-quality toughened steel should be used for a shaft, if the gears in the transmission are made of high-strength materials of statistical load intensity q u > 2 [mpa]. it should be noted, however, that none of conditions (5), (14) and (18) is of primary importance; for different ranges of analyzed parameters different conditions impose the largest reduction on the total length l of the joint. the final calculations were carried out for the mean values of all parameters and for the shaft made of toughened steel 45. regarding the estimation suggested in sect. 4 the working factor of safety was assumed fs .w �16, which seems to be well based from the engineering point of view. the inequalities (5), (14) and (18) calculated for q u � 3 5. [mpa] and the same senses of forces fa and f* give � � 3 38. , 3.96 and 3.96, respectively (fig. 6h). for different senses of the forces the appropriate values decrease and the result is � � 3 22. , 3.15 and 3.80. all above values are greater than �min .� 2 59, and on the whole they are close to the mean value of the parameter �m � 3 71. . the minimum value �min .� 315 should be finally accepted for the assumed data and applied in the further dimensioning of the joint. in particular dimensions e and n must be selected with great care. 6 conclusions in the presented procedure for analysis and design of the frictional joint initial data connected with the strength and geometry of the first stage of transmission is required. moreover, several parameters of the joint have to be introduced as well as the production requirements. the strength, load-carrying capacity and fatigue of the joint are then verified employing equations (5), (14) and (18) derived in the paper. the final values of the parameters may be corrected under the assumption that the resulting length of the gear unit must satisfy appropriate conditions. the numerical calculations carried out for an exemplary set of data demonstrate that the suggested approach is of 32 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 43 no. 5/2003 0 1 2 3 4 5 6 7 8 0.01 0.10 1.00 taper logc of joint p a ra m e te r � (e) (5) (14) (18) 0 1 2 3 4 5 6 7 8 0.100 0.125 0.150 0.175 0.200 friction coefficient � of joint p a ra m e te r � (f) (5) (14) (18) 0 1 2 3 4 5 6 7 8 550 675 800 925 1050 ultimate tensile strength r m of shaft p a ra m e te r � (g) (5) (14) (18) 0 1 2 3 4 5 6 7 8 2.0 2.5 3.0 3.5 4.0 4.5 5.0 statistical load intensity q u p a ra m e te r � � min (5) (14) (18) (h) fig. 6: parameter � versus: (e) – taper logc of the joint, (f) – friction coefficient � of the joint, (g) – ultimate tensile strength rm of the shaft, and (h) – statistical load intensity q u. other parameters are fixed. practical meaning and may be useful in the design process of small, compact reducers and geared motors. references [1] catalogue flender gmbh&co. kg., k 20 d/en 10.90: gear units. [2] müller, l.: przekładnie zębate. obliczenia wytrzymałościowe. warszawa: wnt, 1972. [3] niezgodziński, m., niezgodziński, t.: obliczenia zmęczeniowe elementów maszyn. warszawa: pwn, 1973. [4] krukowski, a., tutaj, j.: połączenia odkształceniowe. warszawa: pwn, 1987. [5] ciszewski, a., radomski, t.: materiały konstrukcyjne w budowie maszyn. 
prof. ing. jan ryś, drsc., phone: +48 126 489 879, fax: +48 126 484 531, e-mail: szymon@mech.pk.edu.pl
ing. henryk sanecki, ph.d., phone: +48 126 283 385, e-mail: hsa@mech.pk.edu.pl
ing. andrzej trojnacki, ph.d., phone: +48 126 283 306, e-mail: atroj@mech.pk.edu.pl
department of mechanical engineering, cracow university of technology, ul. warszawska 24, 31-155 kraków, poland

acta polytechnica doi:10.14311/ap.2017.57.0032 acta polytechnica 57(1):32–37, 2017, available online at http://ojs.cvut.cz/ojs/index.php/ap

tearing modes growth rate amplification due to finite current relaxation

f. e. m. silveira, centro de ciências naturais e humanas, universidade federal do abc, rua santa adélia, 166, bairro bangu, cep 09210-170, santo andré, sp, brazil; correspondence: francisco.silveira@ufabc.edu.br

abstract. in this work, we explore the influence of perturbative wavelengths, shorter than those usually considered, on the growth rate γ of the tearing modes. thus, we adopt an extended form of ohm's law, which includes a finite relaxation time for the current density, due to inertial effects of charged species. in the long wavelength limit, we observe the standard γ of the tearing modes. however, in the short wavelength limit, we show that γ no longer depends on the fluid resistivity. actually, we find that γ now scales with the electron number density ne as γ ∼ ne^{−3/2}. therefore, through a suitable combination of both limiting results, we show that the standard γ can be substantially amplified, even by moderate shortenings of perturbative wavelengths. further developments of our theory may contribute to the explanation of the fast magnetic reconnection of field lines, as observed in astrophysical plasmas.

keywords: tearing modes; current relaxation; magnetic islands; magnetic reconnection.

1. introduction

the most violent instabilities in magnetically confined plasmas are the so-called ideal hydromagnetic instabilities, which are driven by current and pressure gradients [1–3]. as a matter of fact, the condition of a vanishingly small resistivity imposes a constraint on the allowed perturbed motions in the fluid. actually, the electric field in a frame moving with an ideal plasma has to vanish. thus, the magnetic flux through any surface moving with that fluid has to remain constant. therefore, the perturbed magnetic field lines cannot slip with respect to the perturbed flow lines in an ideal plasma. according to faraday's law, the magnetic field remains frozen inside the ideal fluid, or obeys the frozen-in law. the frozen-in law and ideal instabilities are well-known subjects in plasma physics. they are discussed in many textbooks, from the introductory to the advanced levels [4–8]. the ideal instabilities are not only violent in the sense that they can destroy a given equilibrium configuration, which, in this case, is determined by the balance of the pressure gradient with the lorentz force; their growth rate in time can typically reach very large values as well.
this feature has led to a great effort to achieve control of the relevant physical parameters which characterize actual confinement configurations, such that those structures can remain stable to ideal modes [9–13]. it is true that a finite electric resistivity provokes a decrease in the current gradient, which drives the ideal modes. however, it should be noted that dissipative effects can relax constraints in the ideal fluid as well. thus, states with a lower potential energy can become accessible to the system and new instabilities can emerge. in particular, a finite plasma resistivity relaxes the frozen-in condition. therefore, magnetic field lines can break up into thin filaments. these structures were coined magnetic islands [14–18].

magnetic islands play a central role in the physics of magnetic confinement configurations. they usually grow on a time scale much longer than the alfvén time scale and attain a saturated size when the linear free energy, which was available to drive the change in the topology of the magnetic field lines, vanishes. if the saturated size of the magnetic islands is comparable with the radius of the plasma column in a tokamak, then the heat flow across the field lines is essentially replaced by the much faster flow along the lines. when the saturated islands are sufficiently large to touch each other or the material limiter inside the vacuum chamber, a disruption of the plasma column can occur and the chamber can become subjected to strong mechanical stresses [19–23].

in astrophysical plasmas, magnetic islands play an equally relevant role. magnetic energy can be converted into kinetic energy, thermal energy, and particle acceleration due to the modification of the magnetic topology in highly conducting plasmas. this phenomenon was dubbed magnetic reconnection. quite interestingly, magnetic reconnection is observed to occur much faster than theoretically predicted in most astrophysical processes. for instance, solar flares occur several orders of magnitude faster than predicted, even when kinetic, stochastic, and turbulence effects are included in the topological dynamics [24–29]. this issue suggests the possibility that there is still enough room to explore the most basic mechanisms which underlie the formation of magnetic islands.

the resistive instability that is responsible for forming magnetic islands is known as the tearing instability. the linear theory of the tearing modes was originally discussed by furth, killeen, and rosenbluth [30], and further developed by many authors [31–35]. a linear perturbation is effected on a given static state of equilibrium of an infinite ideal plasma, which contains a thin plane slab with a small resistivity η. then, on the assumption of the usual ohm's law, which establishes that the electric field in a frame moving with the fluid is equal to the product of the plasma resistivity with the current density, one finds that the growth rate γ of the tearing modes scales with η as γ ∼ η^{3/5}. this is the central result of the standard theory of the tearing modes. in this work, we investigate the influence of perturbative wavelengths, shorter than those usually considered, on the growth rate of the tearing modes.
thus, we adopt an extended form of ohm's law, which includes a finite relaxation time for the current density, due to inertial effects of charged species. in the long wavelength limit, we observe the standard growth rate of the tearing modes. however, in the short wavelength limit, we can see that the growth rate no longer depends on the fluid resistivity. actually, we find that γ now scales with the electron number density ne as γ ∼ ne^{−3/2}. therefore, through a suitable combination of both limiting results, we show that the standard growth rate of the tearing modes can be substantially amplified, even by moderate shortenings of perturbative wavelengths. further developments of our theory may contribute to the explanation of the fast magnetic reconnection of field lines, as observed in astrophysical plasmas.

2. basic equations and linear analysis

let us start by considering an infinite plasma, in the presence of a magnetic field B⃗, flowing with a velocity v⃗. by adopting cartesian coordinates (x, y, z), we assume that

B⃗ = ẑ B∥0 − ẑ × ∇ψ,  v⃗ = −ẑ × ∇φ, (1)

where the flux functions ψ and φ depend only on x and y (of course, they are allowed to depend on the time t as well), and B∥0 is a constant. while writing (1), it should be noted that the conditions on the absence of magnetic monopoles, ∇·B⃗ = 0, and incompressibility, ∇·v⃗ = 0, are automatically satisfied.

in this work, we aim to explore the influence of perturbative wavelengths, shorter than those usually considered, on the growth rate of tearing modes. therefore, we adopt an extended form of ohm's law (a still more general formula should include ion and electron pressure gradients, as well as the hall effect [7, 36–40]; however, those terms are not relevant for our purposes),

E⃗ + v⃗ × B⃗ = η (1 + τc ∂/∂t) J⃗, (2)

where E⃗, J⃗, and η are the electric field, current density, and electric resistivity, respectively. for singly ionized, approximately neutral, resistive plasmas, τc = me (ne e² η)^{−1}, with me, ne, and e standing for the electron mass, number density and charge, respectively. as a matter of fact, τc must be interpreted as the finite relaxation time for the current density, due to inertial effects of charged species. actually, if the electromagnetic field is suddenly removed from the presence of the fluid, then (2) shows that J⃗(t) = J⃗(0) e^{−t/τc}. this means that the initial current J⃗(0) damps off in the fluid, in a time interval of the order of the scale τc. at sufficiently long wavelengths characterizing the fields, inertial effects are negligible, τc → 0, and the initial current damps off instantaneously. in this case, (2) recovers the more usual form of ohm's law, E⃗ + v⃗ × B⃗ = η J⃗.

equations (1) imply v⃗ × B⃗ = −ẑ (v⃗·∇ψ) − (ẑ × v⃗) B∥0. now, by combining the first of (1) with the faraday and ampère laws (the displacement current is neglected in the ampère–maxwell law: the hydromagnetic approximation), we obtain

E⃗ = −∇χ − ẑ ∂ψ/∂t,  J⃗ = −ẑ ∇²ψ/µ0, (3)

where χ and µ0 are the electric potential and vacuum magnetic permeability (a diamagnetic plasma is assumed), respectively. by substituting the expanded v⃗ × B⃗ and (3) in (2), the z-component of the latter provides an expression for the time evolution of the magnetic flux,

ẑ·∇χ + (∂/∂t + v⃗·∇) ψ = (η/µ0)(1 + τc ∂/∂t) ∇²ψ. (4)
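to get a feeling for the magnitude of the relaxation time τc = me/(ne e² η) introduced above, one can evaluate it numerically; the density and resistivity below are assumptions chosen only to exercise the formula, not values from the paper:

```python
# Physical constants (SI)
m_e = 9.109e-31     # electron mass [kg]
e = 1.602e-19       # elementary charge [C]

def tau_c(n_e, eta):
    """Current-density relaxation time tau_c = m_e / (n_e e^2 eta)."""
    return m_e / (n_e * e**2 * eta)

# Illustrative (assumed) values: a dense laboratory plasma with a small
# but finite resistivity.
n_e = 1.0e19        # electron number density [m^-3]
eta = 1.0e-7        # resistivity [Ohm m]

tc = tau_c(n_e, eta)
print(f"tau_c = {tc:.3e} s")

# tau_c grows as n_e decreases, which is one route through which the
# short-wavelength scaling gamma ~ n_e^{-3/2} enters below.
print(f"halving n_e multiplies tau_c by {tau_c(n_e / 2, eta) / tc:.1f}")
```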
we seek an expression for the time evolution of the flow flux. therefore, we consider euler's equation (an inviscid fluid is assumed),

(∂/∂t + v⃗·∇) v⃗ = (−∇p + J⃗ × B⃗)/ρ0, (5)

where p and ρ0 are the hydrodynamic pressure and the (constant and uniform) mass density, respectively. by taking the curl of the relevant terms in (5) (of course, ∇×∇p = 0), we obtain

∇ × [(∂/∂t + v⃗·∇) v⃗] = −ẑ (∂/∂t + v⃗·∇) ∇²φ,
∇ × (J⃗ × B⃗) = −[∇(∇²ψ) × ∇ψ]/µ0, (6)

where, again, we use (1) and ampère's law. finally, by combining the z-components of (5) and (6), we arrive at the sought expression,

(∂/∂t + v⃗·∇) ∇²φ = ẑ·[∇(∇²ψ) × ∇ψ]/(µ0 ρ0). (7)

equations (4) and (7) are the basic equations to be explored in this work.

let us assume that the plasma is ideal throughout the whole space, except in a thin slab, whose electric resistivity η is a very small, although finite, number. the plasma slab extends infinitely parallel to the yz-plane, and from x = −a/2 to x = +a/2, with a > 0. we also assume that the plasma is in a static state of equilibrium, characterized by ψ = Ψ(x) and φ = 0. then, the first of (1) shows that the equilibrium magnetic field is B⃗ = ẑ B∥0 + ŷ B⊥, where B⊥ = −Ψ′, with the prime standing for the derivative with respect to x. since the current density is expected to become vanishingly small far away from the plasma slab, the second of (3) shows that |Ψ′| = B⊥0, a constant, for |x| ≫ a. thus, we define the alfvén speed va = B⊥0 (µ0 ρ0)^{−1/2} and time τa = a/va scales. hence, we introduce the normalization t → τa t, ∇ → a^{−1} ∇, ψ → a B⊥0 ψ, φ → a va φ, and χ → a va B⊥0 χ.

given the above considerations, we obtain the dimensionless version of (4) and (7),

ẑ·∇χ + (∂/∂t + v⃗·∇) ψ = (τa/τd)(1 + (τc/τa) ∂/∂t) ∇²ψ,
(∂/∂t + v⃗·∇) ∇²φ = ẑ·∇(∇²ψ) × ∇ψ, (8)

respectively, where we have included the resistive diffusion time scale τd = a² µ0 η^{−1}. equations (8) show that the equilibrium condition for the static state can be read as ẑ·∇χ = (τa/τd) Ψ″.

about the static state of equilibrium, we assume that the magnetic and flow fluxes are slightly perturbed in the form ψ(x, y; t) = Ψ(x) + ψ(x, y) e^{γt} and φ(x, y; t) = φ(x, y) e^{γt}, respectively, where γ is a (dimensionless) time rate. by "slightly perturbed", we mean that |φ| ∼ |ψ| ≪ |Ψ| (recall that the equilibrium φ = 0). then, by retaining only terms proportional to ψ and φ, and to their derivatives, (8) lead to the linear perturbed equations (of course, any possible perturbation on χ is also assumed to depend only on x, y, and t)

γψ + Ψ′ ∂φ/∂y = (τa/τd)(1 + γ τc/τa) ∇²ψ,
γ∇²φ = (Ψ‴ − Ψ′∇²) ∂ψ/∂y, (9)

respectively. since the equilibrium magnetic flux depends only on the x-coordinate, the perturbative functions can be fourier decomposed in the y-coordinate. a careful inspection of (9) reveals that ψ and φ exhibit opposite parities with respect to y. thus, we choose to fourier decompose the perturbative functions in the form ψ(x, y) = ψ(x) cos(ky) and φ(x, y) = φ(x) sin(ky), where k is a (dimensionless) wavenumber. with this choice, (9) become

γψ + kΨ′φ = (τa/τd)(1 + γ τc/τa)(ψ″ − k²ψ),
γ(φ″ − k²φ) = −k [Ψ‴ψ − Ψ′(ψ″ − k²ψ)], (10)

respectively.

2.1. solutions in the ideal and resistive regions

in the ideal region, η → 0 and, according to the first of (10), φ is given by

φ = [γ/(kΨ′)] [(δe/a)² (ψ″ − k²ψ) − ψ], (11)

where δe² = me (ne e² µ0)^{−1}, with δe standing for the electron skin depth.
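the smallness of the expansion parameter τa/τd, and the size of the electron skin depth δe just introduced, can be quantified for illustrative slab parameters; all numbers below are assumptions chosen only to show typical orders of magnitude:

```python
import math

mu0 = 4e-7 * math.pi            # vacuum permeability [H/m]
m_e, e = 9.109e-31, 1.602e-19   # electron mass [kg], charge [C]

# Assumed, illustrative slab parameters (not values from the paper)
a = 0.01          # slab thickness scale [m]
B_perp0 = 0.1     # asymptotic perpendicular field [T]
rho0 = 1.67e-8    # mass density [kg/m^3], ~1e19 protons per m^3
n_e = 1.0e19      # electron number density [m^-3]
eta = 1.0e-7      # resistivity [Ohm m]

v_A = B_perp0 / math.sqrt(mu0 * rho0)          # Alfven speed
tau_A = a / v_A                                # Alfven time
tau_D = a**2 * mu0 / eta                       # resistive diffusion time
delta_e = math.sqrt(m_e / (n_e * e**2 * mu0))  # electron skin depth

print(f"tau_A/tau_D = {tau_A / tau_D:.2e}")  # small parameter, here ~1e-5
print(f"delta_e/a   = {delta_e / a:.2e}")    # enters the short-wavelength limit
```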
since γ is expected to become vanishingly small in the limit η → 0 (this is true for tearing modes but not for kink modes, which can become unstable even in the ideal limit [33]), by substituting (11) in the second of (10), we see that ψ satisfies the differential equation

ψ″ = (k² + Ψ‴/Ψ′) ψ, (12)

in the limit γ → 0. except for much simplified equilibrium models, (12) must be treated numerically. however, the asymptotic behavior of its solution in the limit x → 0 can be easily found by making use of the frobenius method. to see this, we first expand the equilibrium magnetic field in a taylor series about the origin, by noting that Ψ′ = 0 at x = 0. therefore, (12) can be approximated to

ψ″ = (κ/x) ψ, (13)

where the number κ satisfies the relation

κ = Ψ‴(0)/Ψ″(0). (14)

the general solution of (13) reads as [41]

ψ(x) = ψ̂ û(κx) + ψ̄ ū(κx), (15)

where ψ̂ and ψ̄ are constants, and

û(κx) = κx + (1/2)(κx)² + (1/12)(κx)³ + …,
ū(κx) = 1 − κx − (5/4)(κx)² − … + û ln|κx| (16)

provide the regular and irregular parts, respectively, of ψ in the limit x → 0. as it appears, about the origin, the dominant behavior of ψ is given by

ψ(x) = ψ̄ (1 + κx ln|κx|). (17)

equation (17) shows that ψ is actually a constant, while its first and higher order derivatives diverge in the limit x → 0. such asymptotic behavior is the source of the so-called constant-ψ approximation in the analytical theory of tearing modes. given the functional form (17) of the asymptotic solution of (12) about the origin, (13) shows that the logarithmic derivative of ψ exhibits a jump,

∆′ = (1/ψ̄) ∫ from 0− to 0+ of ψ″ dx, (18)

across the resistive layer.

the asymptotic solution (17) of (12) in the ideal region suggests that the perturbative function ψ is, to the lowest order, also a constant inside the resistive layer. however, the varying parts of ψ and φ inside the resistive layer are not yet known. actually, the characteristic length scale of the variation of the fields is not the same for ideal and resistive phenomena. when resistivity is neglected, the magnetic field is frozen inside the plasma, and the perturbed fluid motions can cause substantial distortions in the field lines. therefore, significant gradients of perturbed fields can ensue in the resistive layer. this is the motivation for making use of the boundary layer technique in the analytical theory of tearing modes [42].

in order to apply the boundary layer technique to solve (10) inside the resistive layer, first we need to identify a small scaling parameter among the relevant physical quantities. for fully ionized plasmas, it is well known that η ∼ Te^{−3/2}, with Te standing for the electron temperature [7]. for typically high values of the electron temperature, the upper limit of the ratio of the alfvén to the diffusion time scales is quite a small number, τa/τd < 10^{−5} [6]. hence, we introduce the scaling

γ = (τa/τd)^q ω,  τc/τa = (τa/τd)^{−q} θ,  x = (τa/τd)^r ξ,  Ψ′ = −(τa/τd)^r ξ,
ψ(x) = ψ̄ + (τa/τd)^r ψ̃(ξ),  φ(x) = (τa/τd)^s (γ/k) ϕ(ξ), (19)

where we see that ωθ = γ τc/τa, since inertial effects should actually be important at very short length scales; Ψ′ vanishes linearly with ξ inside the resistive layer; ψ̄ and ψ̃ provide the constant and varying parts, respectively, of ψ inside the resistive layer; and γ/k is included in the definition of ϕ to simplify further calculations.
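the dominant behaviour (17) can be checked symbolically: with ψ = ψ̄(1 + κx ln|κx|), the residual of the layer equation (13) diverges only logarithmically, i.e., it is subdominant to the κ/x terms that it balances. a minimal sympy sketch (with ψ̄ normalized to 1):

```python
import sympy as sp

x, kappa = sp.symbols('x kappa', positive=True)

# Dominant asymptotic form (17), with psi_bar normalized to 1
psi = 1 + kappa * x * sp.log(kappa * x)

# Residual of the layer equation (13): psi'' - (kappa/x) * psi
residual = sp.diff(psi, x, 2) - (kappa / x) * psi
print(sp.simplify(residual))          # -> -kappa**2*log(kappa*x)

# The residual diverges only logarithmically, while the balanced terms
# diverge as 1/x, so (17) is indeed the leading behaviour as x -> 0+:
print(sp.limit(residual / (kappa / x), x, 0, dir='+'))   # -> 0
```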
given the considerations above, we obtain the scaled version of (10),

ψ̄ + (τa/τd)^r ψ̃ − (τa/τd)^{r+s} ξϕ
  = [(1 + ωθ)/ω] (τa/τd)^{1−q−r} { d²ψ̃/dξ² − (τa/τd)^r k² [ψ̄ + (τa/τd)^r ψ̃] },
d²ϕ/dξ² − (τa/τd)^{2r} k²ϕ
  = −(τa/τd)^{−2q+2r−s} (k/ω)² ξ { d²ψ̃/dξ² − (τa/τd)^r k² [ψ̄ + (τa/τd)^r ψ̃] }, (20)

respectively. the powers q, r, and s must be chosen by requiring that the terms proportional to ξ and to the second derivative of ψ̃ become of the same order as ψ̄ (the constant-ψ approximation). this is achieved by choosing r + s = 0, 1 − q − r = 0, and −2q + 2r − s = 0. then, q = 3/5, r = 2/5, and s = −2/5. thus, (20) become

ψ̄ + (τa/τd)^{2/5} ψ̃ − ξϕ = [(1 + ωθ)/ω] { d²ψ̃/dξ² − (τa/τd)^{2/5} k² [ψ̄ + (τa/τd)^{2/5} ψ̃] },
d²ϕ/dξ² − (τa/τd)^{4/5} k²ϕ = −(k/ω)² ξ { d²ψ̃/dξ² − (τa/τd)^{2/5} k² [ψ̄ + (τa/τd)^{2/5} ψ̃] }, (21)

respectively. by retaining only terms proportional to the small scaling parameter τa/τd, (21) approach

ψ̄ − ξϕ = [(1 + ωθ)/ω] d²ψ̃/dξ²,  d²ϕ/dξ² = −(k/ω)² ξ d²ψ̃/dξ², (22)

respectively. by combining (22), we obtain

d²ϕ/dξ² = −[k²/((1 + ωθ)ω)] ξ (ψ̄ − ξϕ). (23)

in order to solve (23), it proves useful to introduce the transformations

ξ = [(1 + ωθ)ω/k²]^{1/4} ζ,  ϕ(ξ) = −[k²/((1 + ωθ)ω)]^{1/4} ψ̄ f(ζ). (24)

by substituting (24) in (23), we obtain

d²f/dζ² = ζ(1 + ζf). (25)

the solution of (25) can be written in terms of the integral representation [43, 44]

f(ζ) = −ζ ∫ from 0 to 1/2 of (1 − 4β²)^{−1/4} e^{−ζ²β} dβ, (26)

which implies the asymptotic behavior

f(ζ) = −1/ζ − 2/ζ⁵ − … (27)

in the limit |ζ| ≫ 1.

2.2. asymptotic matching and growth rate

in order to asymptotically match the solutions of (10) in the ideal region and inside the resistive layer, it is sufficient to identify (18) with the jump in the logarithmic derivative of the scaled ψ across the resistive layer, to the lowest order. thus, given the scaling (19), we identify

∆′ = (1/ψ̄) ∫ from −w/2 to +w/2 of (d²ψ̃/dξ²) dξ, (28)

where w > 0 is the (scaled, dimensionless) width of the resistive layer. by combining the second of (22) with (28), we obtain

∆′ = −(ω/k)² (1/ψ̄) ∫ from −w/2 to +w/2 of (d²ϕ/dξ²) dξ/ξ. (29)

by substituting (24) in (29), we obtain

∆′ = [ω^{5/4}/(k^{1/2} (1 + ωθ)^{3/4})] ∫ from −∞ to +∞ of (d²f/dζ²) dζ/ζ, (30)

where we have extended the limits of the integral to ±∞, because the integrand ∼ −ζ^{−4} inside the resistive layer, according to the asymptotic behavior of f in the limit |ζ| ≫ 1, as given by (27). actually, in accordance with (26), the integral in (30) can be written as

∫ from −∞ to +∞ of (d²f/dζ²) dζ/ζ = 2 ∫ from −∞ to +∞ dζ ∫ from 0 to 1/2 of [(3β − 2ζ²β²)/(1 − 4β²)^{1/4}] e^{−ζ²β} dβ. (31)

since the integral (31) can be expressed in terms of gamma functions [41], (30) finally yields the sought identification,

∆′ = [2π Γ(3/4)/(k^{1/2} Γ(1/4))] ω^{5/4}/(1 + ωθ)^{3/4}. (32)

on the assumption that ∆′ is known, (32) implies

ω^{5/4}/(1 + ωθ)^{3/4} = [∆′ Γ(1/4)/(2π Γ(3/4))] k^{1/2}. (33)

therefore, according to the scaling (19), (33) can be written as

γ^{5/4}/(1 + γ τc/τa)^{3/4} = [∆′ Γ(1/4)/(2π Γ(3/4))] k^{1/2} τa^{3/4} τd^{−3/4}, (34)

in terms of dimensionless quantities. by plugging back the actual physical quantities in (34), we finally read

γ^{5/4}/(1 + γτc)^{3/4} = [∆′a Γ(1/4)/(2π Γ(3/4))] (ka)^{1/2} τa^{−1/2} τd^{−3/4}. (35)

equation (35) shows that if ∆′ > 0, then γ > 0, and the aforementioned static state of equilibrium becomes unstable to the linear perturbation. in particular, for sufficiently long perturbative wavelengths, inertial effects due to charged species are negligible, γτc ≪ 1, and (35) simplifies to

γ = [∆′a Γ(1/4)/(2π Γ(3/4))]^{4/5} (ka)^{2/5} τa^{−2/5} τd^{−3/5}. (36)
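the boundary-layer exponents used in this derivation follow from a small linear system; solving it mechanically confirms q = 3/5, r = 2/5 and s = −2/5, and hence the familiar η^{3/5} scaling discussed next:

```python
import sympy as sp

q, r, s = sp.symbols('q r s')

# Constant-psi balance conditions for the boundary-layer scaling (19)-(20):
eqs = [
    sp.Eq(r + s, 0),            # xi*phi term of the same order as psi_bar
    sp.Eq(1 - q - r, 0),        # second derivative of psi_tilde of the same order
    sp.Eq(-2*q + 2*r - s, 0),   # right-hand side of the vorticity equation
]

print(sp.solve(eqs, [q, r, s]))   # {q: 3/5, r: 2/5, s: -2/5}
```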
equation (36) shows the standard result of the analytical theory of tearing modes, which establishes that the growth rate γ scales with the plasma resistivity η as γ ∼ η^{3/5} [30].

3. growth rate amplification

now we get to the main result of this work. for sufficiently short wavelengths, inertial effects can become important. then, in the limit γτc ≫ 1, from (35), we find that

γ = [∆′a Γ(1/4)/(2π Γ(3/4))]² (ka) τa^{−1} (δe/a)³. (37)

equation (37) shows that inertial effects can provoke quite a significant change in the scaling of the growth rate with the relevant plasma parameters. as a matter of fact, we see that γ no longer depends on the plasma resistivity. actually, it now scales with the electron number density as γ ∼ ne^{−3/2}.

beyond the above mentioned qualitative result, can we quantify the change of the growth rate due to a change in the perturbative wavelength? to answer this question, first we observe that the product ∆′a, in the general equation (35), is a function of the product ka [45–49]. perhaps the most illustrative example of this issue is provided by the so-called harris model, which assumes the profile [50]

J⃗ = ẑ [B⊥0/(aµ0)] sech²(x/a) (38)

for the equilibrium current density. by substituting (38) in the second of (3), one can calculate the equilibrium magnetic flux. next, by substituting the latter in (12), one can compute the perturbative magnetic flux. finally, (18) yields the well-known result

∆′a = 2 (1/(ka) − ka). (39)

given the above considerations, let us assume that two plane resistive slabs are formed in an infinite ideal plasma. the two slabs have the same electric resistivity and are subjected to the same equilibrium magnetic field. the only difference between them is their thicknesses. thus, we can explore the situation for which the product ka is the same for both slabs. this means that the thicker slab can accommodate a longer perturbative wavelength, which we call λ, and the thinner slab can accommodate a shorter perturbative wavelength, which we call λ̄. therefore, from (35), we find that

(γ̄τc)^{1/2}/(γτc)^{5/4} = (λ̄/λ)^{−2}, (40)

where the growth rates γ and γ̄ are assumed to satisfy the conditions γτc ≪ 1 and γ̄τc ≫ 1, respectively. to see the consequences of result (40), suppose that γτc ∼ 10^{−2} for the unstable mode with the longer perturbative wavelength λ. hence, if the latter is moderately shortened, for instance, to λ̄ ∼ 10^{−2} λ, then (40) shows that γ̄τc ∼ 10³. this means that the standard growth rate γ of the tearing mode is amplified to γ̄ ∼ 10⁵ γ, quite a significant amplification due to inertial effects of charged species in the plasma.
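the quoted amplification can be reproduced directly from (40); the sketch below also makes it easy to explore the sensitivity to the wavelength ratio (the inputs are the illustrative numbers used in the text):

```python
# Relation (40): (G*tau_c)^(1/2) / (g*tau_c)^(5/4) = (lam_short/lam_long)^(-2),
# with g*tau_c << 1 for the long mode and G*tau_c >> 1 for the short mode.

def amplification(gamma_tau_c, wavelength_ratio):
    """Return (Gamma*tau_c, Gamma/gamma) for a given long-mode gamma*tau_c
    and ratio lam_short/lam_long < 1."""
    rhs = wavelength_ratio ** (-2)
    Gamma_tau_c = (rhs * gamma_tau_c ** 1.25) ** 2
    return Gamma_tau_c, Gamma_tau_c / gamma_tau_c

Gtc, boost = amplification(gamma_tau_c=1e-2, wavelength_ratio=1e-2)
print(f"Gamma*tau_c = {Gtc:.1e}")   # ~1e3, consistent with the text
print(f"Gamma/gamma = {boost:.1e}") # ~1e5: five orders of magnitude
```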
4. conclusion

in this work, we have explored the influence of perturbative wavelengths, shorter than those usually considered, on the growth rate γ of the tearing modes. thus, we have adopted an extended form of ohm's law, which includes a finite relaxation time for the current density, due to inertial effects of charged species. in the long wavelength limit, we have observed the standard γ of the tearing modes. however, in the short wavelength limit, we have shown that γ no longer depends on the fluid resistivity. actually, we have found that γ now scales with the electron number density ne as γ ∼ ne^{−3/2}. therefore, through a suitable combination of both limiting results, we have shown that the standard γ can be substantially amplified, even by moderate shortenings of perturbative wavelengths. further developments of our theory may contribute to the explanation of the fast magnetic reconnection of field lines, as observed in astrophysical plasmas.

references
[1] i. b. bernstein, e. a. frieman, m. d. kruskal, r. m. kulsrud. proc r soc a244:17, 1958.
[2] w. a. newcomb. ann phys 10:232, 1960.
[3] j. m. greene, j. l. johnson. phys fluids 5:510, 1962.
[4] r. j. goldston, p. h. rutherford. introduction to plasma physics. institute of physics, bristol, 2000.
[5] k. nishikawa, m. wakatani. plasma physics: basic theory with fusion applications. third edition. springer-verlag, berlin, 2000.
[6] k. miyamoto. plasma physics and controlled nuclear fusion. springer-verlag, berlin, 2005.
[7] l. spitzer. physics of fully ionized gases. second edition. dover publications, new york, 2006.
[8] j. p. freidberg. ideal mhd. cambridge university press, cambridge, 2014.
[9] v. d. shafranov. sov phys tech phys 15:175, 1970.
[10] d. f. düchs, h. p. furth, p. h. rutherford. nucl fusion 12:341, 1972.
[11] j. g. cordey, f. a. haas. nucl fusion 16:605, 1976.
[12] c. c. grimes, g. adams. phys rev lett 36:145, 1976.
[13] m. j. forrest, p. g. carolan, n. j. peacock. nature 271:718, 1978.
[14] p. h. rutherford. phys fluids 16:1903, 1973.
[15] t. h. stix. phys rev lett 36:521, 1976.
[16] b. v. waddell, b. carreras, h. r. hicks, j. a. holmes. phys fluids 22:896, 1979.
[17] b. a. carreras, p. w. gaffney, h. r. hicks, j. d. callan. phys fluids 25:1231, 1982.
[18] s. j. zweben, r. w. gould. nucl fusion 25:171, 1985.
[19] r. w. white, d. a. monticello, m. n. rosenbluth, b. v. waddell. phys fluids 20:800, 1977.
[20] m. f. turner, j. wesson. nucl fusion 22:1069, 1982.
[21] a. i. smolyakov. sov j plasma phys 15:667, 1989.
[22] a. j. wootton, b. a. carreras, h. matsumoto, k. mcguire, w. a. peebles, ch. p. ritz, p. w. terry, s. j. zweben. phys fluids b2:2879, 1990.
[23] b. kadomtsev. nucl fusion 31:1301, 1991.
[24] e. n. parker. j geophys res 62:509, 1957.
[25] d. biskamp. phys fluids 29:1520, 1986.
[26] p. goldreich, s. sridhar. astrophys j 438:763, 1995.
[27] a. lazarian, e. vishniac. astrophys j 517:700, 1999.
[28] g. kowal, a. lazarian, e. vishniac, k. otmianowska-mazur. astrophys j 700:63, 2009.
[29] g. kowal, a. lazarian, e. vishniac, k. otmianowska-mazur. nonlinear proc geoph 19:297, 2012.
[30] h. p. furth, j. killeen, m. n. rosenbluth. phys fluids 6:459, 1963.
[31] b. coppi, j. m. greene, j. l. johnson. nucl fusion 6:101, 1966.
[32] a. h. glasser, j. m. greene, j. l. johnson. phys fluids 18:875, 1975.
[33] b. coppi, r. m. o. galvão, r. pellat, m. n. rosenbluth, p. h. rutherford. sov j plasma phys 2:533, 1976.
[34] a. m. m. todd, m. s. chance, j. m. greene, r. c. grimm, j. l. johnson, j. manickam. phys rev lett 38:826, 1977.
[35] j. p. mondt, j. weiland. j plasma phys 34:143, 1985.
[36] f. e. m. silveira. j phys: conf ser 370:012005, 2012.
[37] f. e. m. silveira. j plasma phys 79:45, 2013.
[38] f. e. m. silveira, r. m. o. galvão. phys plasmas 20:082126, 2013.
[39] f. e. m. silveira, h. i. orlandi. phys plasmas 23:042111, 2016.
[40] f. e. m. silveira. plasma phys tech 3:155, 2016.
[41] h. jeffreys, b. s. jeffreys. methods of mathematical physics. third edition. cambridge university press, cambridge, 1999.
[42] c. m. bender, s. a. orszag. advanced mathematical methods for scientists and engineers. vol. 1: asymptotic methods and perturbation theory. springer-verlag, new york, 1999.
[43] g. l. johnston. j math phys 19:635, 1978.
[44] r. m. o. galvão. physica 122c:289, 1983.
[45] b. bertotti. ann phys 25:271, 1963.
[46] m. g. haines. nucl fusion 17:811, 1977.
[47] j. n. leboeuf, t. tajima, j. m. dawson. phys fluids 25:784, 1982.
[48] y. ono, m. yamada, t. akao, t. tajima, r. matsumoto. phys rev lett 76:3328, 1996.
[49] b. v. somov, t. kosugi. astrophys j 485:859, 1997.
[50] e. g. harris. nuovo cimento 23:115, 1962.

numerical calculation of electric fields in housing spaces due to electromagnetic radiation from antennas for mobile communication

h.-p. geromiller, a. farschtschi

the influence of electromagnetic radiation from mobile antennas on humans is under discussion in various groups of scientists. this paper deals with the impact of electromagnetic radiation in housing spaces. the space is assumed to be bordered by 5 walls of ferroconcrete and a door-window combination on the 6th side, the latter being electromagnetically transparent. the transparent side of the housing is exposed to an electromagnetic wave. as the source of radiation is considered to be far away from the housing, the radiation is regarded as a plane wave. due to the high signal frequency and the ferroconcrete walls, 5 sides of the housing space are considered to be perfect conductors. the electric field inside the housing is calculated numerically by the method of finite differences for different angles of incidence of the radiated electromagnetic wave. the maximum value of the calculated electric field is outlined in a diagram.

keywords: numerical calculation, finite difference, electric fields, mobile communication.

1 introduction

the aim of this paper is to calculate the electric fields caused by the impact of electromagnetic waves inside a housing space. the calculation is performed numerically with the finite-difference time-domain method (fdtd). the method of finite differences is based on replacing infinitesimally small time and space steps by finite small time and space steps. the code applied for this investigation approximates the time and space dependences of the magnetic (h) and electric (e) fields by central differences. eqs. 1a–c show the discretisation along the dimensions (x, y, z) of the cartesian grid, eq. 2 shows the discretisation of time:

∂f^n(x, y, z)/∂x ≈ [f^n(x + Δx/2, y, z) − f^n(x − Δx/2, y, z)]/Δx, (1a)
∂f^n(x, y, z)/∂y ≈ [f^n(x, y + Δy/2, z) − f^n(x, y − Δy/2, z)]/Δy, (1b)
∂f^n(x, y, z)/∂z ≈ [f^n(x, y, z + Δz/2) − f^n(x, y, z − Δz/2)]/Δz, (1c)
∂f(x, y, z)/∂t ≈ [f^{n+1/2}(x, y, z) − f^{n−1/2}(x, y, z)]/Δt, (2)

f: e, h, respectively; n: time step; x, y, z: location in the cartesian dimensions.

fig. 1: simulation of wave propagation and housing space dimensions

as only electromagnetic waves are considered, maxwell's equations are approximated by omitting the current density. fdtd requires the computational domain to be limited by boundary conditions. two boundary conditions are used here, perfectly electric conductors (pec) and perfectly matched layers (pml). pecs are implemented by forcing the tangential component of the electric field along the boundaries to be zero (e_tangential = 0). in the pml technique, an artificial layer of absorbing material is placed around the outer boundary of the computational domain.
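the central differences (1)–(2) lead to the standard staggered leapfrog (yee) update. a minimal one-dimensional sketch in python, a free-space toy version with an assumed grid, frequency and source position, not the authors' three-dimensional code:

```python
import numpy as np

# Minimal 1D FDTD (Yee) scheme in vacuum: E_z and H_y staggered in space
# and time, as implied by the central differences (1)-(2).
c0 = 299_792_458.0
nx, nt = 400, 900
dx = 0.01                        # spatial step [m]
dt = 0.5 * dx / c0               # time step satisfying the Courant limit

eps0 = 8.854e-12
mu0 = 4e-7 * np.pi

Ez = np.zeros(nx)
Hy = np.zeros(nx - 1)

for n in range(nt):
    # H update (shifted by half a time step against E)
    Hy += dt / (mu0 * dx) * (Ez[1:] - Ez[:-1])
    # E update in the interior; Ez[0] = Ez[-1] = 0 act as PEC walls,
    # so this closed, lossless toy domain accumulates energy over time
    Ez[1:-1] += dt / (eps0 * dx) * (Hy[1:] - Hy[:-1])
    # Soft sinusoidal source (1 GHz, ~30 cells per wavelength), loosely
    # mimicking the source-plane excitation described in the paper
    Ez[nx // 4] += np.sin(2 * np.pi * 1e9 * n * dt)

print(f"max |E_z| on the grid: {np.abs(Ez).max():.2f}")  # illustrative only
```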
the goal is to ensure that an electromagnetic wave incident into the pml region at an arbitrary angle is absorbed without reflection. the pml region is realized by implementing a new degree of freedom into the formulas of the fdtd code, which is done by splitting the field components [1].

fig. 1 shows the dimensions of the housing space under consideration from a top and side view. the housing space under consideration is restricted to a volume of v = 4 m × 2.8 m × 2.5 m. five boundaries are considered to be walls of ferroconcrete and therefore good reflectors for the high-frequency signals used for mobile communication. hence these walls are simulated with pecs. the 6th boundary is considered to be a window/door combination and therefore electromagnetically transparent. in order to simulate the incidence of electromagnetic waves from radiating antennas far away from the housing, the window/door boundary is simulated as a source plane. for the calculation, the direction of propagation needs to be taken into account, so the electric field vector of the incident wave is simulated on the source plane by decomposition into components. the simulation is performed by setting the components of the electric field strength einc(z) and einc(y) on the source plane according to the electric field strength einc of the incident electromagnetic wave on the window/door combination. the time dependence is taken into account by setting the values of the components of the electric field strength on the source plane sinusoidally. the setting of the magnetic component was omitted, as the magnetic and electric fields are related by the impedance of free space. an impedance-matched simulation of the free space along the window/door combination (source plane) is assured by implementing pmls. the pml structure numerically absorbs the energy of the electromagnetic wave traveling from the interior of the housing space towards the environment.

2 results

as shown in fig. 1a, the numerical calculation was performed for an electromagnetic wave with magnetic and electric field vectors hinc and einc and direction of propagation k related to the dimensions of the defined cartesian system. the investigation of the electric field strength inside the housing space was based on different angles of incidence within the range of 5° ≤ α ≤ 85°, in steps of αstep = 5°. for each angle of incidence, the calculation was performed until a steady state of the electric field inside the housing space could be observed. data analysis was restricted to the last time interval in the steady state. this last time interval was divided into 10 time points with equal time spacing. fixing the angle of incidence of the propagating electromagnetic wave, the maximum absolute value of the electric field strength emax was detected within the housing space and within the chosen time points of the steady state. in addition, emax was referred to the amplitude of the incident electric field strength einc. fig. 2 shows emax/einc over α. as may be seen, the maximum electric field strength depends strongly on the angle of the incident electromagnetic wave, and the maximum value of the electric field strength inside the housing exceeds the value of the electric field strength of the incident wave. the maximum may be observed for α = 20°, with a ratio emax/einc ≈ 2.5. this may be explained by reflections and superposition on the perfectly conducting walls of the housing space, particularly in corners and edges where superposition of reflected electric fields from several walls may occur.
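the observed ratio of 2.5 exceeds the value expected for a single perfectly conducting wall: for normal incidence, the superposition of the incident and reflected wave gives at most a doubling of the field amplitude. a quick numerical check of that planar reference case (a hedged analytical toy, not the 3d cavity computation):

```python
import numpy as np

# Standing wave in front of a single PEC wall at x = 0 (normal incidence):
# E(x, t) = E_inc [sin(w t - k x) - sin(w t + k x)] = -2 E_inc cos(w t) sin(k x)
E_inc = 1.0
k = 2 * np.pi           # wavenumber (wavelength normalized to 1)
w = 2 * np.pi           # angular frequency (period normalized to 1)

x = np.linspace(0.0, 2.0, 2001)
t = np.linspace(0.0, 1.0, 401)
X, T = np.meshgrid(x, t)
E = E_inc * (np.sin(w * T - k * X) - np.sin(w * T + k * X))

print(f"max |E| / E_inc = {np.abs(E).max():.3f}")   # -> 2.0 for a single wall
```

ratios above 2, such as the 2.5 found here, therefore require the multi-wall superposition in the corners and edges of the closed housing.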
taking into account that the density of electromagnetic energy depends quadratically on the electric field strength, it may be argued that the energy density may in the worst case be about six times higher (2.5² ≈ 6.3) than the energy density of the incident electromagnetic wave on the source plane.

3 conclusion

from fig. 2 it may be concluded that the effects of electromagnetic radiation from antennas for mobile communication should not only be judged by their electric field strength in free space or at boundaries between free space and housings. as electromagnetic waves with high frequencies may have negative effects on humans, attention should be paid to legal limits for electromagnetic radiation from radio transmitters for mobile communication in the vicinity of housings. legal limits referring to free-space propagation of electromagnetic waves should be regarded with care, since unfavourable conditions inside housings may subject humans to electric field strengths exceeding the allowed limits. in this context it should be taken into account that the effects of electromagnetic radiation on humans are quadratically dependent on the electric field strength, as these effects are mainly related to the energy density of electromagnetic waves, and therefore the negative impacts on humans increase disproportionately with the electric field strength.

fig. 2: dependency of the maximum electric field strength inside a housing space, referred to the incident electric field strength (emax/einc), on the angle of incidence (α)

references
[1] berenger j. p.: "a perfectly matched layer for the absorption of electromagnetic waves". journal of computational physics, vol. 114 (1994), p. 185–200.
[2] sadiku m.: numerical techniques in electromagnetics. crc-press, 2001, second edition, isbn 0-8493-13953, p. 121–186.
[3] simonyi k.: theoretische elektrotechnik. deutscher verlag der wissenschaften, 10. auflage, isbn 3-335-00375-60.

dr.-ing. h.-p. geromiller, phone: +49 371 531 3354, fax: +49 371 531 3417, email: hans-peter.geromiller@e-technik.tu-chemnitz.de
prof. dr.-ing. habil. a. farschtschi
technical university of chemnitz, chair of fundamentals of electromagnetics, 09111 chemnitz, germany

acta polytechnica doi:10.14311/ap.2017.57.0078 acta polytechnica 57(2):78–88, 2017, available online at http://ojs.cvut.cz/ojs/index.php/ap

the impact of selected processes and technological parameters on the geometry of the weld pool when welding in shield gas atmosphere

josef bradáč∗, jaromír moravec
department of engineering technology, faculty of engineering, technical university of liberec, studentská 1402/2, 461 17, czech republic
∗ corresponding author: josef.bradac@tul.cz

abstract. this paper is focused on welding with a consumable electrode in a gas shield atmosphere, and its main aim is to show the influence of selected process and technological parameters on the geometry of the weld pool from the theoretical and experimental points of view. for this purpose, parametric areas defined by the change of the welding current and welding rate were determined. apart from the influence of these parametric areas, the influence of other technological input variables, including the wire diameter and preheating temperature, was also studied.
the experimentally obtained geometric data of the weld pool can be used for technological welding procedures (wps) and especially for simulation calculations, to obtain a more accurate numerical model of the heat source. this makes it possible to get accurate simulation results and to better understand the impact of the other variables that influence the welding process. all of this helps to optimize the welding process for various applications.

keywords: weld pool geometry; process and technological parameters; gmaw; numerical simulations.

1. introduction

to understand any process, it is first necessary to obtain a mathematical description of it. the more detailed and precise the description, the better the possibility to determine its behaviour and predict its subsequent course. when predicting the response of a particular process to various external impulses, it is necessary to establish a mathematical sequence of conditions which, on the basis of specified input variables, determines the accuracy of the result. the accuracy of simulation-based calculations depends mainly on the amount and accuracy of the input variables: the higher the quantity and quality of the input variables, the greater the probability of obtaining results that reflect the real state. the welding process can be characterized as a mathematical problem with many variables, but with only one unknown, which is the shape of the weld pool. it is possible, based on experience, to predict the shape with a certain degree of probability, but the result must then be confirmed experimentally. the biggest pitfall in predicting the weld pool shape is the great interdependence of the variables that affect the final geometry of the weld.

the principle of fusion welding is intense local heating, the associated melting of the base material and the subsequent crystallization of the weld pool. the shape and dimensions of the weld pool are given by the shape of the temperature gradient and the intensity of the heat transfer to the surrounding material, depending on its thickness and material properties. the knowledge of the shape of the weld pool is very important both in terms of design and technology. the experiments and their results presented in this paper were realized primarily to achieve more accurate results of simulation calculations, because, when defining the respective model of the heat source, it is necessary to consider not only the influence of process parameters, but also the influence of technological parameters and physical-chemical processes. the main problems are the description of thermal fields based on the shape of the temperature gradient, the heat transfer efficiency, the type of the protective atmosphere, the preheating, reheating and interpass temperatures, but also the chemical composition, the effect of surface active components and the heat transfer type in the weld pool (conduction, convection, radiation).

although it is not possible to treat fusion welding technology in its full complexity, there are approaches allowing to solve certain parts of this process. besides classic calculations of the temperature, stress and strain fields, the effect of surface active elements, such as sulphur and oxygen, and their influence on the melting depth or on the flow direction in the welding bath can be simulated [1–3]. processes related to the heat transfer in the weld pool [4], or the material melting efficiency, are simulated as well [5, 6].
it is also possible to describe the influence of the metal transfer type on the geometry of the weld pool, or the effect of the droplet size on the heat diffusion in the weld pool [7]. other studies deal with a more accurate prediction of the chemical composition of the welded metal. these include the determination of the loss of alloying elements by their burn-out [8, 9], based on the langmuir equation for the pressure gradient, which controls the mass transfer. it is also possible to simulate the intensity of the solubility of the gas during arc welding, calculated according to sievert's law [10]. it is also possible to simulate the phase transformations of steels using the johnson–mehl–avrami–kolmogorov equation [11], or the microstructure of c-mn and hsla steels using the scheil additive rule [12]. research is also focused on modelling the grain growth in the haz using the monte carlo method [13], and on modelling the properties of the inclusions during precipitation and subsequent diffusion processes.

2. the effect of process parameters on the weld pool geometry

arc fusion welding is characterized by a great number of variables which influence the final geometry of the weld pool. these input parameters can be divided into three basic groups — the process input parameters, the technological input parameters and the physical-chemical parameters. the largest group of input parameters is represented by the process parameters. these parameters are set up directly before welding and can be corrected during welding. they are highly dependent on the design of the power source and its static and dynamic characteristics. the most important process parameters include the welding current (current density), the welding voltage and the welding speed. increasing the welding speed while proportionally increasing the source intensity increases the portion of heat utilized for melting the base material. simultaneously, the temperature gradient is increased and the isotherms move closer to the heat source in all directions, mostly before the source. the maximal temperature in the melting zone increases, see figure 1 [14]. the increasing temperature also increases the melting efficiency, irrespective of the heat source type.

figure 1. thermal conditions in the weld pool for different welding speeds [14].

on the basis of the main process parameters, the most important variable used during welding can be determined – the heat input. the specific heat input qv determines the heat input per weld length unit and is given by the existence of the weld pool and the overall heat balance of the weld with respect to the input and output heat, the transmission losses and the arc protection. the values of the specific heat input make it possible to predict the range of the thermal effects, stress and strain fields after welding. the determined value also specifies the limits for the welding of individual materials. the specific heat input can be calculated by the formula

qv = η0 U I / vs, (1)

where η0 is the heat transfer efficiency of the process, U the welding voltage, I the welding current and vs the welding speed.

the process parameters are not the sole parameters influencing the shape and dimensions of the weld pool. there are also other parameters, including primarily the wire feed speed, the current density, the polarity at the electrode, the wire extension, the torch slope, the shielding gas flow, the shape and dimensions of the weld surface, the metal transfer, and the welding position.
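for orientation, eq. (1) can be evaluated for a typical gmaw working point; the efficiency and the parameter values below are assumptions for illustration only (they lie inside the parametric area used later in the experiments):

```python
def heat_input_kj_per_mm(eta0, voltage_v, current_a, speed_m_per_min):
    """Specific heat input q_v = eta0 * U * I / v_s, returned in kJ/mm."""
    v_s_mm_per_s = speed_m_per_min * 1000.0 / 60.0
    return eta0 * voltage_v * current_a / v_s_mm_per_s / 1000.0

# Assumed working point: eta0 ~ 0.8 is a commonly used GMAW efficiency.
q_v = heat_input_kj_per_mm(eta0=0.8, voltage_v=25.0, current_a=200.0,
                           speed_m_per_min=0.4)
print(f"q_v = {q_v:.2f} kJ/mm")   # ~0.6 kJ/mm
```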
3. the effect of technological parameters

the technological input parameters are the second largest group of input variables. these parameters are typically determined by the base material, the structural design of the welded joint and the operating conditions of the final product. the most commonly used technological input parameters include the wire type and diameter (consumable electrode), the type and amount of shielding gas, the shape and geometry of the welded surfaces (weld type), the preheating temperature, reheating, the interpass temperature or the subsequent heat treatment of the weldment. the technological input parameters greatly affect the values and set-up of the process parameters. this means that they considerably influence the resulting geometry of the weld pool.

3.1. the effect of wire diameter

generally, a larger wire diameter requires a higher welding current to melt. using a smaller wire diameter for the same value of the welding current increases the current density, and the melting coefficient is greater. small diameter wires enable a larger number of droplets to be transmitted in the arc, both for the short and for the long electric arc. this has a positive effect in the case of multilayer and root welds, because, due to the increased number of drops, a greater surface smoothness can be achieved; a higher current density can then be applied. the selection of the wire diameter is based on the material thickness, the joint type and the welding position. in engineering practice, in the case of the gmaw welding method, the following wire diameters are used: 0.6, 0.8, 1.0, 1.2, 1.4 and 1.6 mm. the selection of the suitable wire diameter depends, besides the already mentioned rules, on the final product and its dimensions. the level of the work load and the surrounding conditions are also important. when welding steel structures, the most frequent wire diameters are 0.8, 1.0 and 1.2 mm. an increasing wire diameter decreases the current density: when using the same value of the welding current and changing the wire diameter from 0.8 to 1.0 mm, the resulting decrease of the current density is 36 %, and when changing the diameter further to 1.2 mm, the current density drops by another 20 % of the original value [15], as the check after this section shows.
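the percentage drops quoted in sect. 3.1 follow from simple geometry, since the mean current density scales with the inverse square of the wire diameter; a short sketch reproduces them (both drops expressed relative to the 0.8 mm wire):

```python
import math

def current_density(current_a, diameter_mm):
    """Mean current density in A/mm^2 for a solid wire."""
    area = math.pi * diameter_mm**2 / 4.0
    return current_a / area

I = 200.0                                # same welding current for all wires
j = {d: current_density(I, d) for d in (0.8, 1.0, 1.2)}

drop_10 = 1.0 - j[1.0] / j[0.8]          # 0.8 mm -> 1.0 mm
drop_12 = (j[1.0] - j[1.2]) / j[0.8]     # further drop, relative to 0.8 mm
print(f"0.8 -> 1.0 mm: -{drop_10:.0%}")                               # -36 %
print(f"1.0 -> 1.2 mm: another -{drop_12:.0%} of the 0.8 mm value")   # ~ -20 %
```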
3.2. the effect of preheating and interpass temperature

by using preheating during welding, it is possible to improve the quality of the welded joint. its application leads to a reduction of the cooling rate, and thereby to a reduction or elimination of the formation of selected structural phases, or to a reduction of the level of internal stresses. the preheating temperature depends primarily on the chemical composition of the welded material and on the initial and final values of the selected transformation temperature. it must be high enough to achieve the desired structural phase; also, it should not be too high, to prevent excessive grain coarsening in the haz. the preheating temperature is determined either computationally (low-alloyed steels), or experimentally in the case of middle- and high-alloyed steels. the experimental tests are based either on the metallographic evaluation and hardness measurement in the haz (possibly a test of impact strength), or on cracking tests, i.e., tests like tekken, cts, implant tests, etc. [16]. the main disadvantage of the cracking tests is that they are highly time-consuming and usually involve high costs. the currently used computational algorithms are based on real experiments, but they are designed primarily for standard or low-alloyed steels. the computational algorithms, like the computations according to seferian and ito–bessyo, are commonly used [16]. for determining the preheating temperature for each group of steels, the en 1011 standard, which provides recommendations on the welding of metallic materials and describes methods for determining the preheating temperature for each specific steel group, can be used.

figure 2. mechanical properties vs. interpass temperature [17].

during multilayer welding, it is also important to monitor the interpass temperature. if this temperature is not monitored, there is a high probability that the mechanical properties of the welded material are affected, and heterogeneity of the material properties across the welded metal can also be expected. using a high interpass temperature for carbon ferritic steels results in a decrease in the yield and ultimate strength and could cause cracking, see figure 2.

4. experiments

the main aim of the experimental part was to determine the effect of the change in the wire diameter and preheating temperature on the geometry of the weld pool for the fillet weld type and gmaw welding. the experiments were carried out for the parametric area corresponding to the range of the welding current from 140 to 280 a and the welding speed in the range 0.2–0.7 m/min. the samples were welded in a synergic mode. the obtained geometric data were then used in numerical simulations of welding to optimize a double ellipsoidal model of the heat source. the double ellipsoidal heat source model enables simulations with a very good agreement with reality, especially when simulating the electric arc welding processes.

cold-rolled sheets of thickness s = 5 mm were used as the base material. the sheet material was made of the low carbon steel s355j2. the experimentally determined chemical composition and mechanical properties of this material are shown in tables 1 and 2. the results were calculated as an average of three individual measurements.

table 1. chemical composition of steel s355j2 [wt. %]:
  c      si     mn    cr     p      cu     al     w      nb     n
  0.113  0.014  1.42  0.046  0.026  0.057  0.063  0.017  0.024  0.014

table 2. mechanical properties of steel s355j2:
  rp0.2 = 567.58 mpa, rm = 633.28 mpa, ag = 11.95 %, a50 = 48.59 %

figure 3 shows the assembly of the fillet weld which was used for the experiments. the weld was assembled from sheets of the following dimensions: 150×100×5 mm. the sheets were milled to achieve a constant contact surface with a zero weld gap. stitching was used before the welding process to guarantee the prescribed geometry and perpendicularity of the sheets. the samples were then welded by the gmaw welding method, using the power source migatronic bdh 550 puls syn interconnected with a linear machine with adjustable speed. the vertical position pa and a special jig were used, as shown in figure 4. the welding jig makes it possible to achieve the desired geometric position of the welded sheets and the right definition of the heat transfer coefficient to the environment. the jig can be adjusted for welding fillet and butt welds in different positions. figure 4 shows a detail of the welded sheets, including a schematic indication of the position of the torch with respect to the welding sample.

figure 3. welded parts geometry with marked cut sections.
figure 4. welding position and welding equipment.
figure 5. example of the data record in the weldmonitor program.
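the ito–bessyo approach mentioned above starts from a cold-cracking composition parameter, which can be evaluated directly from the values of table 1. the formula below is the commonly published pcm expression (the nickel, molybdenum, vanadium and boron contents, absent from table 1, are assumed to be zero):

```python
def p_cm(c, si=0.0, mn=0.0, cu=0.0, cr=0.0, ni=0.0, mo=0.0, v=0.0, b=0.0):
    """Ito-Bessyo cold-cracking composition parameter P_cm [wt. %]."""
    return (c + si / 30.0 + (mn + cu + cr) / 20.0
            + ni / 60.0 + mo / 15.0 + v / 10.0 + 5.0 * b)

# Composition of steel S355J2 from table 1 (wt. %)
pcm = p_cm(c=0.113, si=0.014, mn=1.42, cu=0.057, cr=0.046)
print(f"P_cm = {pcm:.3f} %")   # ~0.19 %, indicating low cold-cracking sensitivity
```

the low value is consistent with the statement in sect. 4.2 below that the thin s355j2 sheets do not require preheating.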
the shield gas m21 according to iso 14175 (82 % ar, 18 % co2) was used for the experiments. during welding, the basic process parameters were monitored by the measuring device and the weldmonitor 3.5 software. weldmonitor is designed for detailed monitoring and documentation of the welding process with a high recording frequency, which exceeds 20 khz. the weldmonitor system consists of hardware and software equipment. the weldmonitor hardware, which comprises sensors monitoring various parameters of the welding process, can be connected to any welding equipment. the whole system is interconnected with the host computer and makes it possible to monitor automated welding processes and the final weld quality. in the basic mode, the welding current and voltage can be monitored. the system enables real-time measurement of the actual and the effective values with high-precision results. in addition to these variables, it is possible to monitor the welding speed, the wire feeding rate, the gas flow rate, the specific heat input, the wire consumption, and the method of the metal transfer [18]. by using weldmonitor, it is possible to display the results both in text and in graphical form. figure 5 shows an example of a record of the basic welding parameters, including a detail which shows the welding current and voltage in the case of a short-circuit metal transfer. all records of the measured values can be archived and possibly presented in a graphical form or in the form of data files.

all welds prepared with different process parameters were subjected to a dimensional analysis. the geometric evaluation was based on a cross-section analysis of the weld using a scratch pattern. sixteen different geometric parameters were measured, see figure 6. the most important variable is the size of the fillet weld a, which is a basic calculation parameter in the design of steel structures. furthermore, the parameters w and vmax are also important: they define the width of the weld and the maximum penetration depth. these values can also be used in strength calculations of welded joints, but primarily in defining the mathematical description of the double ellipsoidal model of the heat source. the other parameters were determined for the classification of the weld into groups in accordance with the en iso 5817 standard.

figure 6. schematic view of the basic geometrical parameters of a fillet weld.
figure 7. schematic view of the individual weld surfaces: pc — total weld surface, p∆ + pn — filler and reinforcement surface, p∆ — filler material surface.

the total weld surface pc, the filler material surface p∆ and the reinforcement surface pn were measured (figure 7). the reason was to obtain more information about the mixing intensity of the base and filler materials. for defining the double ellipsoidal model, it is important to know the length of each of the ellipsoids, represented by the lc and ld parameters. figure 8 shows these geometrical parameters. in total, 21 parameters were measured for each weld. all parameters were determined using the nis-elements ar software with an accuracy of two decimal places.

4.1. experiments with different wire diameters

the wire ok autrod 12.51 with diameters of 1.0 and 1.2 mm was used for these experiments. both wires were tested at three values of the welding current (140, 200 and 280 a), with the welding speed ranging from 0.2 to 0.7 m/min. how the welding speed influences the size and the shape of the weld pool in the case of a welding current of 280 a and a wire diameter of 1.2 mm is shown in figure 9.
for each value of the welding current, six testing samples were made for both wire diameters. the welding took place in the synergic mode, and therefore the different current densities, given by the different filler wire diameters, could be considered. the wire feeding speed is lower when using a wire of a larger diameter. the value of the heat input per length unit ranged from 0.33 to 1.2 kj/mm for the wire of a 1.2 mm diameter. in the case of the wire of a 1.0 mm diameter, the heat input values were approximately 12–21 % higher.

figure 8. geometrical parameters of the weld pool.

the first monitored geometrical parameter was the size of the fillet weld a. due to the higher current density in the case of the wire diameter of 1.0 mm, the parametric variables increase when compared to the diameter of 1.2 mm. it is possible to say that the wire diameter of 1.0 mm enables a melting rate higher by about 10 %. this explains why it is possible to achieve larger fillet weld sizes a when using the smaller wire diameter, see figure 10. from the obtained curves, it can be seen that the deviation of 10 % in the fillet weld size was reached only for the current of 200 a; for the other current values, the deviations are always smaller. similar trends and deviations were observed for the parameter weld width w.

slightly different is the situation for the penetration depth vmax, see figure 11. for low values of the welding current, there is almost no effect of the wire diameter change. only with an increasing value of the welding current does this difference begin to be important: at the current of 200 a, the difference in vmax is already 13 %, and at a current of 280 a, even 18 %. this progression of the increasing penetration depth can be explained by a change in the flow direction and velocity in the weld pool. the temperature gradient is steeper, which results in achieving a deeper penetration. the same progression can be seen in the case of the total welded surface pc, which shows the same effect of the welding current. the total welded surface is determined mainly by the weld width w and the penetration depth vmax, which are thus the most important variables. in figure 12, it is possible to see the effect of the wire diameter already at low welding currents; with the growing current, this effect on the overall size of the weld further increases.

figure 9. welded samples at different welding speeds: (a) 0.3 m/min, (b) 0.4 m/min, (c) 0.5 m/min.
figure 10. fillet weld size in dependence on the welding speed for selected welding currents and wire diameters.

4.2. experiments with different preheating temperatures

for the material s355j2, there is no need to use preheating in the case of smaller material thicknesses. for numerical simulation, however, the preheating effect and its influence on the weld pool geometry are important. in addition, the thermal conductivity of the unalloyed and low-alloyed steels which require preheating is comparable to that of the steel s355j2, and the results obtained with this cheaper material can thus be applied to these steels. the wire ok autrod 12.51 with a diameter of 1.2 mm was used for the experiments. the preheating temperatures were chosen as 100, 200 and 300 °c, with welding speeds of 0.2–0.5 m/min and a welding current of 140 a. higher welding speeds were not used because of the small amount of melted filler material. the same geometric parameters were chosen as for the dimensional analysis with different wire diameters.
when evaluating the parameters a and w, it can be stated that the preheating has no significant effect on the size of the fillet weld, because the differences in the sizes of both parameters were minimal. this can be explained by the constant wire feed speed during each experiment, since both variables are formed by the filler material. deviations were observed only for the parameter vmax, because the increasing preheating temperature increases the maximum penetration depth. this is valid for the whole range of welding speeds, see figure 13. it is possible to observe that the curves have very similar slopes and, at the temperature of 300 °c, the penetration depth is about 25 % higher than for the weld without the preheating.

figure 11. penetration depth as dependent on the welding speed for selected welding currents and wire diameters.

figure 12. total welded surface dependence on the welding speed for selected welding currents and wire diameters.

figure 13. penetration depth with dependence on the welding speed for selected preheating temperatures.

5. utilization of the weld pool geometry for optimization of simulation calculations
the main task of simulation calculations is that the outcome reflects the real state of the simulated action as much as possible. the accuracy of the simulation calculations depends on the quality of the input data. it is not possible for each numerical analysis to realize the experimental determination of the weld geometry. therefore, it is important to predict the geometrical parameters for the defined material and welding method. then, it is possible to prepare a simulation calculation for a different situation without individual experiments. as already mentioned, in the case of arc welding, the most frequently used model of the heat source is the double ellipsoidal model, see figure 14. the double ellipsoidal model of the heat source is described by two equations:

$$q(x,y,\xi) = \frac{6\sqrt{3}\,f_1 q}{0.5w\, v_{\max}\, l_c\, \pi\sqrt{\pi}}\; e^{-kx^2/(0.5w)^2}\; e^{-ly^2/v_{\max}^2}\; e^{-m\xi^2/l_c^2}, \quad (2)$$

$$q(x,y,\xi) = \frac{6\sqrt{3}\,f_2 q}{0.5w\, v_{\max}\, l_d\, \pi\sqrt{\pi}}\; e^{-kx^2/(0.5w)^2}\; e^{-ly^2/v_{\max}^2}\; e^{-m\xi^2/l_d^2}, \quad (3)$$

which give the heat flux density in the material q [w/m3] separately for each of the ellipsoids. in the equations, q [w] is the total source power given by the voltage and current, 0.5w [m] is the parameter of the molten area, which, for a fillet weld, defines half of the weld pool width (figure 6), vmax [m] defines the penetration depth, lc and ld define the geometry of the weld pool in the welding direction (figure 8), and k, l, m [–] are the coefficients for modifying the shape of the heat source. how much of the heat produced by the heat source goes into the front and the rear ellipsoid is decided by the parameters f1 and f2. the sum of f1 and f2 is 100 %. the heat power is usually divided with 60 % in the front ellipsoid (f1) and 40 % in the rear ellipsoid (f2).

figure 14. description of the double ellipsoidal model of the heat source [19].

figure 15. temperature field during welding and detail of the melting area in the xy plane.

figure 15 shows the profile of the melting area, which was achieved thanks to the measured geometrical parameters of the weld pool and corresponds to the real conditions. if the corresponding values of the temperature fields are obtained, it is then possible to calculate the stress and distortion.
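to make eqs. (2) and (3) concrete, the following sketch evaluates the volumetric heat flux density of the double ellipsoidal source; it is an illustration, not the authors' implementation. the assignment of f1 to the front ellipsoid (length lc) for ξ ≥ 0, the goldak-like defaults k = l = m = 3 and all sample values are assumptions.

```python
# a minimal sketch of eqs. (2) and (3): volumetric heat flux density of the
# double ellipsoidal (goldak-type) heat source. parameter names follow the
# paper; the shape coefficients k, l, m and the sample values are illustrative.
import math

def double_ellipsoid_q(x, y, xi, Q, w, vmax, lc, ld,
                       f1=0.6, f2=0.4, k=3.0, l=3.0, m=3.0):
    """heat flux density q [w/m3]; xi is the coordinate along the welding
    direction: front ellipsoid (length lc) for xi >= 0, rear (ld) otherwise."""
    f, length = (f1, lc) if xi >= 0.0 else (f2, ld)
    a = 0.5 * w  # half of the weld pool width
    pref = 6.0 * math.sqrt(3.0) * f * Q / (a * vmax * length
                                           * math.pi * math.sqrt(math.pi))
    return pref * math.exp(-k * x**2 / a**2
                           - l * y**2 / vmax**2
                           - m * xi**2 / length**2)

# illustrative call: source power Q = 0.8 * u * i, weld pool 8 mm wide and
# 3 mm deep, ellipsoid lengths 4 and 8 mm (all hypothetical values)
print(double_ellipsoid_q(0.0, 0.0, 0.0, Q=0.8 * 25 * 140,
                         w=8e-3, vmax=3e-3, lc=4e-3, ld=8e-3))
```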
the stress and distortion calculation cannot be performed without the heat load of the system, which is why optimizing the temperature fields and determining the correct parameters of the weld pool is so important. without these values, it is not possible to calculate the stress state and the distortion.

6. conclusion
welding is a complex process with many variables, resulting in a certain shape of the weld pool and the temperature gradient. the paper analysed the effect of the various process parameters on the geometry of the weld pool during the arc welding in the shield gas atmosphere. first, the influence of the processing parameters, focusing on the technological parameters, was theoretically described. the particular links between individual parameters were determined. for selected technological parameters, the defined range of the process parameters a, w, vmax and pc was determined. this data can be used both for the calculation of the operating stress of steel structures and for numerical simulations. the results, however, only reflect the system base material (unalloyed or low-alloyed steel) — shielding gas m21 — defined wire diameter (1.0 or 1.2 mm). this makes it possible to do numerical computations with a sufficient accuracy without a geometric verification based on experiments. it is then sufficient to verify only the final simulation variant, which leads to significant economical and time savings. currently, a special software, which will be able to predict the geometric variables based on analytical equations, is being prepared. the experiments described in this paper are used, among others, for the right set-up of this software. the procedure of planning and preparing experiments and the methodology to assess their results can be considered as the basis for a further and easier description and modification of the heat source model for the arc welding in the shield gas atmosphere.

references
[1] c. r. heiple, j. r. roper. mechanism of minor element effect on gta fusion zone geometry. supplement to the welding journal, pp. 92–102, 1982.
[2] w. pitscheneder, t. debroy, k. mundra, r. ebner. role of sulfur and processing variables on the temporal evolution of weld pool geometry during multikilowatt laser beam welding of steel. welding journal 75(3), pp. 71–80, 1996.
[3] a. farzadi, s. serajzadeh, a. h. kokabi. int j adv manuf technol (2008) 38:258. doi:10.1007/s00170-007-1106-9.
[4] c. limmaneevichitr, s. kou. experiments to simulate effect of marangoni convection on weld pool shape. welding journal 79(8), pp. 126–135, 2000.
[5] l. quintino, o. liskevich, l. vilarinho, et al. int j adv manuf technol (2013) 68:2833. doi:10.1007/s00170-013-4862-8.
[6] s. wang, r. nates, t. pasang, et al. modelling of gas tungsten arc welding pool under marangoni convection. universal journal of mechanical engineering 3(5), pp. 185–201, 2015. doi:10.13189/ujme.2015.030504.
[7] j. i. achebo. complex behavior of forces influencing molten weld metal flow based on static force balance theory. physics procedia, 2012, pp. 317–324. doi:10.1016/j.phpro.2012.03.090.
[8] p. a. a. khan, t. debroy, s. a. david. laser beam welding of high-manganese stainless steels: examination of alloying element loss and microstructural changes. welding journal 67 (1988), pp. 1–7.
[9] a. de, t. debroy. probing unknown welding parameters from convective heat transfer calculation and multivariate optimization. journal of physics d: applied physics 37, pp. 140–150, 2004. doi:10.1088/0022-3727/37/1/023.
[10] t. a. palmer, t. debroy. numerical modeling of enhanced nitrogen dissolution during gas tungsten arc welding. metallurgical and materials transactions b 31 (2000), pp. 1371–1385. doi:10.1007/s11663-000-0023-1.
[11] p. sahoo, m. m. collur, t. debroy. metall. trans. b 19b (1988), pp. 967–972.
[12] m. takahashi, h. k. d. h. bhadeshia. a model for the microstructure of some advanced bainitic steels. materials transactions, jim 32 (1991), pp. 689–696. doi:10.2320/matertrans1989.32.689.
[13] d. y. li, j. a. szpunar. a monte-carlo simulation of the electrodeposition process. jem 22 (1993):653. doi:10.1007/bf02666412.
[14] welding metallurgy. american welding society, 4th ed., 1994.
[15] r. kopriva. technology of mig/mag welding. 1st ed. ostrava: zeross, 1993. 194 p. isbn 80-85771-004-4.
[16] v. pilous. materials and their behaviour during welding. zeross, ostrava, 2001, 292 p.
[17] v. ochodek. the effect of temperature welding mode on weld mechanical properties. technical university of ostrava, 2011.
[18] o. ambroz. welding technology. 1st ed. ostrava: zeross, 2001. isbn 80-85771-81-0.
[19] j. moravec, j. sobotka, j. bradac. numerical simulations utilization for welding hardly weldable materials based on iron aluminide. liberec, 2010. 98 p. isbn 978-80-7372-682-9.

acta polytechnica 57(2):78–88, 2017

acta polytechnica doi:10.14311/ap.2019.59.0211
acta polytechnica 59(3):211–223, 2019 © czech technical university in prague, 2019

general model of radiative and convective heat transfer in buildings: part i: algebraic model of radiative heat transfer

tomáš ficker
brno university of technology, faculty of civil engineering, department of physics, veveří 95, 602 00 brno, czech republic
correspondence: ficker.t@fce.vutbr.cz

abstract. radiative heat transfer is the most effective mechanism of energy transport inside buildings. one of the methods capable of computing the radiative heat transport is based on a system of algebraic equations. the algebraic method was initially developed by mechanical engineers for a wide range of thermal engineering problems. the first part of the present serial paper describes the basic features of the algebraic model and illustrates its applicability in the field of building physics. the computations of radiative heat transfer both in building enclosures and also in open building envelopes are discussed and their differences explained. the present paper serves as a preparation stage for the development of a more general model evaluating heat losses of buildings. the general model comprises both the radiative and convective heat transfers and is presented in the second part of this serial contribution.
keywords: radiative heat transfer, view factor, radiosity, room envelope, radiative heat in interiors, heat loss.

1. introduction
heat radiation represents a dominant transfer mechanism of heat energy inside buildings. estimating this transfer is, therefore, a useful indicator of the effectiveness of heating systems, especially those based on radiant panels. so far, several algebraic methods for determining the radiative heat transfer have been published. probably the pioneering work in the field of algebraic models can be ascribed to hottel [1], who introduced the concept of the total-view factor fij. soon afterwards, hottel and sarofim [2] improved the method by introducing the total-exchange area sisj. their model is based on the so-called radiosity heat flux and, thus, it is often referred to as a radiosity method. besides the radiosity method, there are some other modifications of the algebraic approach. gebhart [3, 4] introduced a method utilizing the so-called absorption factor. among other methods dealing with radiative heat transfer in enclosures, it is possible to mention the methods by sparrow [5], sparrow and cess [6], oppenheim [7], eckert and drake [8], love [9], wiebelt [10], and siegel and howell [11]. a comprehensive analysis of the methods that were introduced by hottel and sarofim [1, 2] and gebhart [3] was published by clark and korybalski [12]. these two authors showed that the methods of hottel, sarofim and gebhart, although written in different forms, were mathematically equivalent. liesen and pedersen [13] published an overview of many radiant exchange models that range from the exact models using uniform radiosity networks and exact view factors to mean radiant temperature and area-weighted view factors. these authors compared the radiant exchange models to each other for a simple zone with varying aspect ratios. since that time, many various applications of algebraic methods have been published in the field of mechanical engineering, but in the field of building thermal technology, the algebraic methods have not actually been used. the reason may be related to the fact that a complete solution of the combined radiative-convective transport leads to a system of transcendent non-linear equations whose solutions seem to be problematic in some cases. recent studies of radiative heat transport in the inner spaces of buildings (i.e., in enclosures) avoid algebraic methods and prefer differential transport equations [14–18]. the main interest is focused on floor heating systems [14–17], but radiant panel systems situated on walls or ceilings are also investigated [18–21]. coefficients of heat transfers at interiors or exteriors along with heat losses are often measured or calculated as well [20, 22–25]. as has been mentioned above, the radiative heat transfer is the most effective mechanism of energy transport as compared to the free convection or conduction. this fact has been verified many times both theoretically and experimentally. for example, rahimi and sabernaeemi [21] have experimentally investigated these transports between the radiant ceiling surface and other internal surfaces of a room and have found that more than 90 % of the heat is transferred by radiation.

figure 1. heat exchange by radiation between two small surface elements ds1 and ds2.

as is well-known, the radiative heat is formed by electromagnetic waves containing many different wavelengths.
when the room under investigation contains glazed windows, it is necessary to consider the capability of glass to transmit these waves. glass is transparent for wavelengths between 0 and 4 µm but, above this range, it behaves as an opaque matter. the wavelengths of solar radiation near the surface of the earth assume values less than 2.5 µm and thus they may pass through the windows into interiors. temperatures of inner furnishings usually do not exceed 30 °c, i.e. about 300 k, and, according to wien's displacement law, the wavelength of the maximum radiation of such furnishings is

$$\lambda_{max} = \frac{2898\ \mu\text{m K}}{300\ \text{K}} = 9.66\ \mu\text{m}.$$

this value is sufficiently above the critical transmittance range, which makes the glazed window an opaque barrier for the inner radiation, similarly as a wall [14]. however, like with walls, heat is transferred by convection and radiation in the vicinity of the internal and external sides of windows, but inside the glass panes, solely by conduction. when the windows are double or triple glazed, the cavities between the glass panes filled with an inert gas represent a narrow space where the heat is transferred by convection and radiation, and often the conduction may participate as well. since the heat radiation within opaque enclosures represents a dominant heat transfer between inner surfaces, it is desirable to have a reliable computational tool for its quantification. the algebraic method is certainly such a tool. its theoretical formalism is presented in the following section.

2. basics of radiosity method
the first step in performing the radiosity method consists in forming the matrix of view factors. these factors facilitate redistributing the radiative energy among the surfaces of the enclosure under study.

2.1. view factors
a view factor f reduces the total energy irradiated by a surface to that part of the energy that reaches the neighbouring surface (figure 1). view factors are dimensionless, assume only positive values and are restricted to the interval f ∈ 〈0, 1〉. they are defined as follows [26–28]:

$$f_{ij} = \frac{1}{s_i} \int_{(s_i)} \left[ \int_{(s_j)} \frac{\cos\varphi_i \cos\varphi_j}{\pi r^2}\, \mathrm{d}s_j \right] \mathrm{d}s_i \quad (1)$$

$$f_{ji} = \frac{1}{s_j} \int_{(s_j)} \left[ \int_{(s_i)} \frac{\cos\varphi_j \cos\varphi_i}{\pi r^2}\, \mathrm{d}s_i \right] \mathrm{d}s_j \quad (2)$$

from equations (1) and (2), the symmetry relation between fij and fji can be immediately deduced:

$$s_i \cdot f_{ij} = s_j \cdot f_{ji} \quad (3)$$

this symmetry rule holds quite generally regardless of the types of surfaces and their geometrical positions. there is another important property of view factors. if a radiant surface is incapable of irradiating itself, its view factor is zero:

$$f_{ii} = 0 \quad (4)$$

for example, the perfect planes or the external surfaces of spheres belong to this class of surfaces. it should be highlighted that zero rule (4) does not hold generally, but it is restricted to special surfaces.

figure 2. scheme of surface energy exchange.

the third property concerns the closed envelopes that consist of different surfaces numbered as 1, 2, 3, ..., n:

$$\sum_{j=1}^{n} f_{ij} = 1 \quad (5)$$

relation (5) may be called the summation rule. it should be stressed that its validity is restricted solely to the closed envelopes. these may be, for example, the inner spaces of rooms. the view factors can be ordered into a matrix whose elements have to fulfil the basic rules (3)–(5):

$$\begin{pmatrix} f_{11} & f_{12} & \dots & f_{1n} \\ f_{21} & f_{22} & \dots & f_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ f_{n1} & f_{n2} & \dots & f_{nn} \end{pmatrix} \quad (6)$$
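the three rules (3)–(5) translate directly into a numerical check of a candidate view-factor matrix; a minimal sketch (not part of the paper) is shown below. note that, for the simple room used later in section 3.1, rule (3) with s1·f12 = 36 m2 forces the second-row entries to be exactly 1/3.

```python
# a minimal sketch (not from the paper): checking the view-factor rules
# (3)-(5) for a given matrix F and surface areas S.
import numpy as np

def check_view_factors(F: np.ndarray, S: np.ndarray, closed: bool = True,
                       tol: float = 1e-6) -> None:
    # symmetry rule (3): s_i * f_ij == s_j * f_ji
    A = S[:, None] * F                       # matrix of s_i * f_ij
    assert np.allclose(A, A.T, atol=tol), "symmetry rule (3) violated"
    # summation rule (5) holds for closed envelopes only
    if closed:
        assert np.allclose(F.sum(axis=1), 1.0, atol=tol), "rule (5) violated"

# the three-surface room of section 3.1 (floor, side walls, ceiling):
S = np.array([72.0, 108.0, 72.0])
F = np.array([[0.0, 0.5, 0.5],
              [1/3, 1/3, 1/3],
              [0.5, 0.5, 0.0]])
check_view_factors(F, S)
```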
2.2. algebraic equations of radiative heat transfer
the radiosity method relies on diffuse and grey surfaces. these surfaces emit or reflect radiation that is direction- and wavelength-independent. so the term 'grey' has little to do with the real colours of surfaces. the two assumptions (diffuse and grey) are reasonable for most engineering applications. the emissivities ε of grey diffuse lambert's surfaces assume values lying between zero and one. the radiative heat transfer between such surfaces is based on two laws, namely the stefan-boltzmann law and the kirchhoff law [26–28]. the grey diffuse solid surfaces fulfil two conditions:

(i) the transmittance is zero, τ = 0.
(ii) the reflectance ρ and the absorbance α (or emissivity ε) are independent of wavelengths and their sum equals one, i.e. ρ + ε = 1.

in figure 2, there is a scheme of the energy exchange occurring on the surface of a radiant body. the symbol h represents the total radiative heat flux (w/m2) coming from all neighbouring surfaces whereas the symbol w represents radiosity, which is the total heat flux emitted from the surface (w/m2). radiosity is the sum of the stefan-boltzmann radiation (εe_b = εσt⁴, σ = 5.67 · 10⁻⁸ w/(m²k⁴)) and the reflected heat flux (ρh), i.e.

$$w = \varepsilon e_b + \rho h \quad (7)$$

$$h = \frac{w - \varepsilon e_b}{\rho} \quad (8)$$

the resulting (net) radiative heat flux q related to the investigated surface is determined as the difference between the emitted (w) and the coming (h) fluxes:

$$q = w - h = \varepsilon e_b - \varepsilon h = \begin{cases} > 0 \implies \text{surface emits energy} \\ < 0 \implies \text{surface absorbs energy} \end{cases} \quad (9)$$

by replacing h in eq. (9) by the fraction from eq. (8), we obtain:

$$q = w - \frac{w - \varepsilon e_b}{\rho} = \frac{(\rho - 1)w + \varepsilon e_b}{\rho} = \frac{-\varepsilon w + \varepsilon e_b}{\rho} = \frac{\varepsilon}{\rho}(e_b - w) \quad (10)$$

thus, the density heat flux qi of the i-th surface is given as follows

$$q_i = \frac{\varepsilon_i}{\rho_i}(e_{bi} - w_i) \quad \text{(w/m}^2\text{)} \quad (11)$$

from this equation, it is clear that the density heat flux qi can be positive or negative. the positive value indicates that the surface emits the heat energy whereas the negative value means that the surface absorbs the energy. to determine the density heat flux qi from eq. (11), it is necessary to know the radiosity wi. the equations for radiosities are specified in the following paragraphs.

2.3. radiosity
the i-th surface of area si is supplied by the energy hi coming from all the n surfaces (including the i-th surface itself if it is curved):

$$s_i h_i = \sum_{j=1}^{n} s_j f_{ji} w_j \quad \text{(w)} \quad (12)$$

taking into account the rule of symmetry s_j · f_{ji} = s_i · f_{ij}, the following equations may be obtained

$$s_i h_i = \sum_{j=1}^{n} s_i f_{ij} w_j \quad \text{(w)} \quad (13)$$

$$h_i = \sum_{j=1}^{n} f_{ij} w_j \quad \text{(w/m}^2\text{)} \quad (14)$$

by considering equations (7) and (14), the system of n linear algebraic equations emerges:

$$w_i = \varepsilon_i e_{bi} + \rho_i \sum_{j=1}^{n} f_{ij} w_j, \quad i,j = 1, 2, 3, \dots, n \quad \text{(w/m}^2\text{)} \quad (15)$$

the solution of this system offers n values of radiosities $\{w_i\}_{i=1}^{n}$ by means of which the total radiative heat flux qi for each of the n surfaces may be calculated according to eq. (11). if the heat flux qi is determined, it is easy to calculate the net heat flow φi corresponding to the i-th surface:

$$\phi_i = s_i q_i \quad \text{(w)} \quad (16)$$

if this quantity is positive, the surface emits energy, but a negative quantity determines the absorption of the energy. equations (11), (15), and (16) are the basic algebraic relations that specify a radiative heat transfer between grey surfaces not only within the closed systems of surfaces, i.e. enclosures, but also in the open systems of surfaces.
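in matrix form, eq. (15) is the linear system (i − diag(ρ)·f)·w = ε·e_b; a minimal sketch of the whole computational chain (15) → (11) → (16) could read as follows (an illustration, not the paper's code).

```python
# a minimal sketch of eqs. (11), (15) and (16): solve the radiosity system,
# then compute the net heat fluxes and heat flows.
import numpy as np

def radiosity(F, eps, Eb, S):
    """returns radiosities w (w/m2), fluxes q (w/m2) and flows phi (w)."""
    rho = 1.0 - eps                        # grey opaque surfaces: rho + eps = 1
    n = len(eps)
    W = np.linalg.solve(np.eye(n) - rho[:, None] * F, eps * Eb)   # eq. (15)
    q = eps / rho * (Eb - W)               # eq. (11); use eq. (17) if rho == 0
    return W, q, S * q                     # eq. (16)
```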
2.4. heat flux of absolutely black surface
in the previous section 2.3, the basic relation for the heat flux (11) associated with grey surfaces has been introduced. applying eq. (11) to absolutely black surfaces (τ = 0, ρ = 0, ε = 1) leads to an uncertain expression 0/0 and thus it is necessary to derive another expression that does not suffer from such a drawback. from eqs. (9) and (14), it follows

$$q_i = w_i - \sum_{j=1}^{n} f_{ij} w_j \quad (17)$$

which is a convenient alternative for calculating the heat fluxes associated with both the black and grey surfaces. relation (17) does not lead to any uncertain expression but, in comparison with (11), it requires more numerical work and, for this reason, expression (11) is a preferable choice when dealing with grey surfaces.

2.5. energy exchange between couples of surfaces
the heat exchange between two surfaces i and j, i.e. φi↔j, will be calculated as a difference between those portions of heat that are absorbed by these surfaces

$$\phi_{i\leftrightarrow j} = \phi_{i\leftarrow j} - \phi_{i\rightarrow j} \quad (18)$$

the absorbed heat φi←j will be calculated under the condition requiring that the surface j may emit energy while the other surfaces emit nothing, i.e. they are ascribed zero temperatures. the absorbed heat φi←j may be determined by means of relations (11) and (16) in which ebj = 0:

$$\phi_{i\leftarrow j} = \frac{s_i \varepsilon_i}{\rho_i}\big(-w_i^{(j)}\big), \quad i \neq j \quad (19)$$

where $w_i^{(j)}$ is the radiosity obtained when the surface j irradiates energy while the others emit nothing because they are ascribed zero temperatures. similarly, φi→j will be calculated as the difference between the absorbed portions of heat when the surface i irradiates energy while the others do not emit any energy, i.e. they have zero temperatures:

$$\phi_{i\rightarrow j} = \frac{s_j \varepsilon_j}{\rho_j}\big(-w_j^{(i)}\big), \quad i \neq j \quad (20)$$

the expressions for φi←i (= φi→i) assume a bit different forms. their derivation can be started from the general definition (9):

$$q = w - h = \varepsilon e_b + \rho h - h = \varepsilon e_b + (\rho - 1)h = \varepsilon e_b - \varepsilon h \implies -\varepsilon h = q - \varepsilon e_b \quad (21)$$

the term −εh is actually the energy (w/m2) absorbed by the surface under the investigation. to obtain the power in watts, this term has to be multiplied by the area s of the surface:

$$-s\varepsilon h = sq - s\varepsilon e_b = \frac{s\varepsilon}{\rho}(e_b - w) - s\varepsilon e_b = \frac{s\varepsilon}{\rho}\big[e_b - w - \rho e_b\big] = \frac{s\varepsilon}{\rho}\big[(1-\rho)e_b - w\big] = \frac{s\varepsilon}{\rho}\big[\varepsilon e_b - w\big] \quad (22)$$

as seen from eqs. (22), the expressions φi←i (= φi→i) assume the following form

$$\phi_{i\leftarrow i} = \phi_{i\rightarrow i} = -s_i\varepsilon_i h_i = \frac{s_i\varepsilon_i}{\rho_i}(\varepsilon_i e_{bi} - w_i) \quad (23)$$

combining eqs. (19), (20) and (23), more general expressions emerge:

$$\phi_{i\leftarrow j} = \frac{s_i\varepsilon_i}{\rho_i}\big(\varepsilon_i e_{bi}\delta_{ij} - w_i^{(j)}\big) \quad (24)$$

$$\phi_{i\rightarrow j} = \frac{s_j\varepsilon_j}{\rho_j}\big(\varepsilon_j e_{bj}\delta_{ij} - w_j^{(i)}\big) \quad (25)$$

where δij is the kronecker symbol (δij = 1 for i = j, δij = 0 for i ≠ j). a general expression for the energy exchange between the couples of surfaces may be obtained by using eqs. (18), (24) and (25):

$$\phi_{i\leftrightarrow j} = \phi_{i\leftarrow j} - \phi_{i\rightarrow j} = \begin{cases} 0 & \text{for } i = j \\ \dfrac{s_j\varepsilon_j}{\rho_j} w_j^{(i)} - \dfrac{s_i\varepsilon_i}{\rho_i} w_i^{(j)} & \text{for } i \neq j \end{cases} \quad \text{(w)} \quad (26)$$

hottel and sarofim [2] published a different expression for φi↔j:

$$\dot q_{j\leftrightarrow i} = \frac{s_i\varepsilon_i}{\rho_i}\Big(\frac{w_i^{(j)}}{e_{bj}} - \delta_{ij}\varepsilon_j\Big)(e_{bj} - e_{bi}) = \frac{s_j\varepsilon_j}{\rho_j}\Big(\frac{w_j^{(i)}}{e_{bi}} - \delta_{ij}\varepsilon_i\Big)(e_{bj} - e_{bi}) \quad (27)$$

in fact, eqs. (26) and (27) are only two equivalent alternatives since they both yield the same numerical results. however, expression (26) seems to be more instructive.
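computationally, the special radiosities w(j) needed in eqs. (24)–(26) are obtained by re-solving system (15) with all emissive powers except the j-th set to zero, i.e. by solving one linear system with n right-hand sides; a minimal numpy sketch (not the paper's code) follows.

```python
# a minimal sketch of eqs. (24)-(26): the energy-exchange matrix phi[i, j]
# built from the "special" radiosities w^(j) obtained by letting only
# surface j radiate (all other blackbody emissive powers set to zero).
import numpy as np

def exchange_matrix(F, eps, Eb, S):
    rho = 1.0 - eps
    n = len(eps)
    A = np.eye(n) - rho[:, None] * F
    # column j of Wspec holds w^(j): only surface j keeps its emissive power
    Wspec = np.linalg.solve(A, np.diag(eps * Eb))
    c = S * eps / rho
    absorbed = c[:, None] * (np.diag(eps * Eb) - Wspec)  # phi_{i<-j}, eq. (24)
    return absorbed - absorbed.T                         # eqs. (18)/(26)
```

with the data of tab. 1 and tab. 2 below, this routine reproduces the exchange matrix (35) of section 3.1.1 (e.g. φ1↔2 ≈ +997.5 w).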
as seen from eq. (26), the expression φi↔j represents a quasi-symmetric matrix (φi↔j = −φj↔i). each row of the matrix describes the energy exchange between a particular surface and the remaining neighbouring surfaces. for example, the i-th row of the matrix contains the energies that are exchanged between the i-th surface and all the remaining ones. the matrix φi↔j of the energy exchange assumes the following form

$$\begin{pmatrix} \phi_{1\leftrightarrow 1} & \phi_{1\leftrightarrow 2} & \dots & \phi_{1\leftrightarrow n} \\ \phi_{2\leftrightarrow 1} & \phi_{2\leftrightarrow 2} & \dots & \phi_{2\leftrightarrow n} \\ \vdots & \vdots & \ddots & \vdots \\ \phi_{n\leftrightarrow 1} & \phi_{n\leftrightarrow 2} & \dots & \phi_{n\leftrightarrow n} \end{pmatrix} \quad (28)$$

3. application of radiosity method
in this section, the algebraic radiosity method is applied to three particular cases. the first application presented in sub-section 3.1 illustrates the functionality of the method within an enclosure, which is represented by a simple room. the second application described in sub-section 3.2 is focused on an open system created paradoxically as a closed room envelope in which some parts of the envelope completely transmit heat radiation (quasi-enclosure). the third application in sub-section 3.3 concerns a real open system in which some constructional parts of the room envelope are completely missing. in open systems, the radiative heat energy is not conserved as in the closed systems, but a large portion of the heat disappears in the open space. the radiosity method is capable of determining the escaped energy independently of whether the system of surfaces is closed with some transmitting parts or has some completely missing parts.

3.1. application to enclosure
to illustrate the functionality of the radiosity method, a simple room with a heated floor has been chosen as shown in fig. 3.

figure 3. a simple room with three surfaces: heated floor no. 1, side walls no. 2, and ceiling no. 3.

parameter | surface 1 | surface 2 | surface 3
ε | 0.95 | 0.80 | 0.75
ρ | 0.05 | 0.20 | 0.25
t (k) | 300 | 295 | 290
s (m2) | 72 | 108 | 72
εe_b (w/m2) | 436.3065 | 343.5272 | 300.7712
table 1. input data for the simple room shown in fig. 3.

the matrix of view factors fij, determined by means of the graphs published in the technical literature [26–28] and the rules specified by eqs. (3), (4), and (5), reads

$$\begin{pmatrix} 0 & 0.5 & 0.5 \\ 1/3 & 1/3 & 1/3 \\ 0.5 & 0.5 & 0 \end{pmatrix} \quad (29)$$

radiosities (eq. (15)):

w1 = 436.3065 + 0.05 (0.5 w2 + 0.5 w3)
w2 = 343.5272 + 0.2 · (1/3)(w1 + w2 + w3)
w3 = 300.7712 + 0.25 (0.5 w1 + 0.5 w2)   (30)

w1 = 457.352710, w2 = 430.140549, w3 = 411.707857 w/m2   (31)

heat flows (eqs. (11) and (16)):

φ1 = s1 q1 = +2 622.85272 w
φ2 = s2 q2 = −316.029168 w
φ3 = s3 q3 = −2 306.791512 w   (32)

$$\sum_{i=1}^{3} \phi_i = 0.032040 \approx 0\ \text{w} \quad (33)$$

the power φ1 = +2 622.85272 w represents the energy flow from the heated floor. the floor has the highest temperature and thus emits the heat power from its surface into the room. the walls and the ceiling have lower temperatures as compared to the floor and, for this reason, they assume the heat powers φ2 = −316.029168 w and φ3 = −2 306.791512 w. these energies are absorbed into the volumes of the walls and the ceiling. in enclosures, the emitted energy is redistributed among the cooler surfaces and thus no portion of the radiative energy can be lost ($\sum_{i=1}^{n}\phi_i = 0$). in our case, the sum of the heat flows shows a tiny deviation from zero ($\sum_{i=1}^{n}\phi_i = 0.032$), but this is caused by rounding errors during the computations and by inaccurately reading the values of the view factors from their published graphs. the zero sum of radiative heat energies is a general property of closed systems and follows from the first and second laws of thermodynamics. however, since the algebraic radiosity model is not a 'product of nature' but a product of human creativity, it is desirable to verify whether this artificial model satisfies the laws of thermodynamics.
for this reason, the property $\sum_{(i)} \phi_i = 0$ (which may be termed the compensation theorem) will be verified mathematically. the proof is based on three properties related to the view factors and specified by definition (1), symmetry property (3), and summation property (5):

$$\sum_{i=1}^{n} \phi_i = \sum_{i=1}^{n} s_i\Big(w_i - \sum_{j=1}^{n} f_{ij}w_j\Big) = \sum_{i=1}^{n} s_iw_i - \sum_{i=1}^{n}\sum_{j=1}^{n} s_if_{ij}w_j = \sum_{i=1}^{n} s_iw_i - \sum_{j=1}^{n}\sum_{i=1}^{n} s_jf_{ji}w_j = \sum_{j=1}^{n} (s_jw_j)\Big[1 - \sum_{i=1}^{n} f_{ji}\Big] \begin{cases} = 0 & \text{(for closed envelopes only)} \\ \neq 0 & \text{(for open envelopes only)} \end{cases} \quad (34)$$

the expression $\big[1 - \sum_{i=1}^{n} f_{ji}\big]$ in eq. (34) is zero only if $\sum_{i=1}^{n} f_{ji} = 1$, which holds solely for enclosures (closed envelopes), as follows from summation rule (5). however, in open envelopes, summation rule (5) does not hold, i.e. $\sum_{i=1}^{n} f_{ji} \neq 1$, and thus the sum of the radiative heat flows $\sum_{i=1}^{n} \phi_i$ assumes non-zero values, as shown in eq. (34).

surface 1 radiates, the others not | w1(1) = 438.677644 | w2(1) = 35.565381 | w3(1) = 59.280378
surface 2 radiates, the others not | w1(2) = 10.501990 | w2(2) = 372.237215 | w3(2) = 47.842401
surface 3 radiates, the others not | w1(3) = 8.173076 | w2(3) = 22.337953 | w3(3) = 304.585078
table 2. special radiosities $w_i^{(j)}$ (w/m2) for computing the matrix φi↔j.

at first sight, it might seem that the values of the radiosities, heat fluxes and heat flows shown in various places of the present paper include a too large number of figures after the decimal points. this is because the computations are performed in the regime of double precision, using input data containing more figures after the decimal points, in order to suppress rounding errors and meet the requirement of the compensation theorem ($\sum_{(i)} \phi_i = 0$) as accurately as possible. the tested room shown in fig. 3 has been chosen as a very simple room possessing only three surfaces with different temperatures and emissivities. in reality, rooms may have a much larger number of surfaces with different geometries, temperatures and emissivities (windows, doors, furnishings, wooden or artificial decorations, carpets, textiles, etc.). the possible geometric complexity of rooms concerns solely the matrix of view factors. although there are many tables and formulae for determining view factors in the literature, e.g. in refs. [26–28], many general cases are missing. since the analytical derivation of view factors in these cases may be difficult, the numerical computation of the double integrals (eq. (1)) seems to be the only way to overcome this problem. an interesting method for the numerical evaluation of view factors has been presented only recently [29]. however, for common geometries of internal rooms, there is a sufficient number of formulae, graphs or tables to easily determine the corresponding view factors. as soon as the matrix of view factors is formed, the other computational steps related to radiosities, heat fluxes and heat flows are a matter of routine numerical operations, for which a large number of various surfaces with different temperatures or emissivities does not represent a larger problem. in addition, it is clear that surfaces quite small compared to the area of the room envelope have small influences on the results and thus neglecting some of them will not introduce an essential inaccuracy.
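the whole example of section 3.1 can be condensed into a few lines; the sketch below (an illustration, not the authors' program) reproduces the printed radiosities (31) and heat flows (32) when the row-2 view factors take the exact value 1/3 required by rules (3) and (5), and with an exact linear solve the residual (33) drops to machine zero.

```python
# a minimal sketch reproducing section 3.1 with numpy, using the input data
# of tab. 1 and the view-factor matrix (29) with exact 1/3 entries in row 2.
import numpy as np

S   = np.array([72.0, 108.0, 72.0])
eps = np.array([0.95, 0.80, 0.75])
rho = 1.0 - eps
epsEb = np.array([436.3065, 343.5272, 300.7712])   # tab. 1
F = np.array([[0.0, 0.5, 0.5],
              [1/3, 1/3, 1/3],
              [0.5, 0.5, 0.0]])

W = np.linalg.solve(np.eye(3) - rho[:, None] * F, epsEb)   # eq. (15)
q = eps / rho * (epsEb / eps - W)                          # eq. (11)
phi = S * q                                                # eq. (16)

print(W)          # ~ [457.35, 430.14, 411.71] w/m2, cf. (31)
print(phi)        # ~ [+2622.9, -316.0, -2306.8] w, cf. (32)
print(phi.sum())  # ~ 0, the compensation theorem of eq. (34)
```

the same few lines can be re-run with the data of tab. 3 or with the two-surface arrangement of section 3.3 to obtain the open-system results discussed below.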
finally, as far as the heat losses of the room under investigation are concerned, some parts of the absorbed energies φ2 = −316.029168 w and φ3 = −2 306.791512 w may be propagated by conduction through the solid constructions and, at the external sides, they may be transferred by convection and radiation into the exterior. but the real amount of the heat loss depends on the quality of the thermal insulation, the temperature difference between the interior and the exterior, the external coefficients of heat transfer and the external emissivities. the complete computations of heat losses that include the inner convective and radiative transfers along with the external convective and radiative transfers are accomplished in the second serial paper [30], which thematically continues the present paper.

3.1.1. matrix of heat exchange
in the following paragraphs, a detailed analysis of the radiative heat exchange between the couples of surfaces of the investigated simple room (fig. 3) is presented. although such an analysis is not required for determining heat losses, it could be useful for understanding the mechanism of the energy exchange between surfaces. for this purpose, the matrix of energy exchange φi↔j will be computed according to relation (26). this relation requires the special radiosities $w_i^{(j)}$ determined when some surfaces (i-surfaces) have zero temperatures and only one surface (j-surface) has its own correct non-zero temperature. these radiosities have been computed and their values are gathered in tab. 2.

the matrix φi↔j of the energy exchange (eq. (26)):

$$\begin{pmatrix} 0 & +997.522272 & +1\,623.793680 \\ -997.522272 & 0 & +683.962920 \\ -1\,623.793680 & -683.962920 & 0 \end{pmatrix} \implies \begin{pmatrix} \textstyle\sum_1 = +2\,621.315952 \approx \phi_1 \\ \textstyle\sum_2 = -313.559352 \approx \phi_2 \\ \textstyle\sum_3 = -2\,307.756600 \approx \phi_3 \end{pmatrix} \text{w} \quad (35)$$

$$\textstyle\sum_1 + \sum_2 + \sum_3 = 0 \text{ (precisely)}$$

as seen from matrix (35), the sum of the numbers in each row approximately equals the net heat power φ computed previously (see eqs. (32)).

parameter | floor no. 1 | side walls no. 2 | ceiling no. 3
ε | 0.95 | 1 | 0.75
ρ | 0.05 | 0 | 0.25
t (k) | 300 | 0 | 290
s (m2) | 72 | 108 | 72
εe_b (w/m2) | 436.3065 | 0 | 300.7712
table 3. input data for the quasi-enclosure of the simple room.

from the first row of matrix (35), it is obvious that surface no. 1 (floor) has emitted ∼998 w towards surface no. 2 (side walls) and ∼1 624 w towards surface no. 3 (ceiling). both these transfers result in ∼2 621 w of the net heat power emitted by the floor (no. 1). from the second row of matrix (35), it follows that surface no. 2 (side walls) has absorbed ∼998 w coming from the floor (no. 1) but, simultaneously, surface no. 2 has sent ∼684 w towards the ceiling (no. 3). by summing these two transfers, the net heat power absorbed by the side walls (no. 2) has amounted to ∼314 w. similarly, the third row of matrix (35) lets us know that the ceiling (no. 3) has absorbed ∼1 624 w coming from the floor (no. 1) and also ∼684 w that has been sent from the walls (no. 2). thus, the net absorbed power of the ceiling (no. 3) is ∼2 308 w. the radiative heat energies redistributed within the room enclosure and specified by matrix (35) fulfil precisely the compensation theorem in accordance with the first and the second laws of thermodynamics; namely, each transfer of energy is directed from a warmer surface to a cooler one and all the transferred radiative energies are conserved inside the enclosure without any possibility to escape, i.e. their sum is precisely zero ($\sum_{i=1}^{3}\phi_i = 0$).
since the majority of basic monographs, e.g. [26–28], do not discuss the behaviour of the radiosity method within open room envelopes, the following two sections 3.2 and 3.3 are devoted to this problem. section 3.2 illustrates the behaviour of the radiosity method within the quasi-enclosure whereas section 3.3 explores the properties of this method within a real open envelope. we would like to mention that these two open structures do not serve as prototypes of window openings, since the glazed windows under the acting of the long-wave room thermal radiation can be treated as non-transparent structures undertaking only surface absorption without surface transparency, just as with solid walls. this fact has been explained in the introduction. sections 3.2 and 3.3 that discuss open systems, along with section 3.1 that explores closed systems, provide a complete treatise concerning the functionality of the radiosity method applied to a variety of surface arrangements.

3.2. application to quasienclosure
the quasi-enclosure is realized like the simple room shown in fig. 3 whose four side walls no. 2 are completely transparent for heat radiation. these side walls are considered as ideally black bodies with zero temperatures, i.e. ε = 1, ρ = 0, t2 = 0. in fact, such an enclosure behaves as an open system in which the heat escapes through the ideally absorptive side walls. the input data are summarized in tab. 3. the matrix of view factors fij is the same as in section 3.1:

$$\begin{pmatrix} 0 & 0.5 & 0.5 \\ 1/3 & 1/3 & 1/3 \\ 0.5 & 0.5 & 0 \end{pmatrix} \quad (36)$$

radiosities (eq. (15)):

w1 = 436.3065 + 0.05 (0.5 w2 + 0.5 w3)
w2 = 0
w3 = 300.7712 + 0.25 (0.5 w1 + 0.5 w2)   (37)

w1 = 445.2170, w2 = 0, w3 = 356.4233 (w/m2)   (38)

heat flows (eqs. (11), (16) and (17)):

φ1 = s1 q1 = 72 · (1/0.05) · (436.3065 − 0.95 · 445.21708) = +19 224 w (see eq. (11))
φ2 = s2 q2 = 108 · [0 − (w1 + w3)/3] = −28 859 w (see eq. (17))
φ3 = s3 q3 = 72 · (1/0.25) · (300.7712 − 0.75 · 356.4233) = +9 635 w (see eq. (11))   (39)

surface 1 radiates, the others not | w1(1) = 437.67423 | w2(1) = 0 | w3(1) = 54.70928
surface 2 "radiates", the others not | w1(2) = 0 | w2(2) = 0 | w3(2) = 0
surface 3 radiates, the others not | w1(3) = 7.54285 | w2(3) = 0 | w3(3) = 301.71410
table 4. special radiosities $w_i^{(j)}$ (w/m2) for computing the matrix φi↔j of the quasi-enclosure.
(26)):  01↔1 +17 725.81↔2 +1 498.61↔3−17 725.82↔1 02↔2 −11 133.22↔3 −1 498.63↔1 +11 133.23↔2 03↔3   =⇒   ∑ 1 = +19 223.4 = φ1∑ 2 = −28 859.0 = φ2∑ 3 = +9 634.6 = φ3   w (41) from the first row of matrix (41) it is obvious that floor (no. 1) has sent 17 725.8 w to the transmittable side walls (no. 2) and this energy escapes into the open space. in addition, the floor (no. 1) has also sent 1 498.6 w to the ceiling (no. 3). the second row of matrix (41) summarizes the total energy coming from the floor (17 725.8 w) and the ceiling (11 133.2 w). these energies are absorbed by the transmittable side walls (no. 2) and represent the total heat losses going into the open space (28 859.0 w). the third row of matrix (41) describes the energy exchanges between the ceiling (no. 3) and the remaining surfaces. the ceiling has absorbed 1 498.6 w from the floor (no. 1) but it has emitted 11 133.2 w towards the transmittable side walls (no. 2) and this energy represents the heat loss escaping into the open space. the net power of the ceiling is 9 634.6 w. by comparing heat exchanges within the quasi-open enclosure (matrix (41)) and the regular enclosure (matrix (35)), it is obvious that the ‘quasi-open’ envelopes of rooms enormously increase heat losses. this is due to the energy that escapes through the transparent (open) parts of envelopes. in the next section, the case of the quasi-open enclosure will be replaced by the real open system and the problem will be recalculated. this will enable us to compare results of the radiosity method applied to differently arranged surfaces. 3.3. application to regularly open envelope let us consider the arrangement shown in fig. 3, in which the side walls have been completely removed. the corresponding input data can be found in tab. 1. the remaining surfaces are associated with the floor (no. 1) and the ceiling (no. 2). the ceiling has been renumbered. this arrangement corresponds to the quasienclosure in which the side walls have really been removed. the matrix of view factors of this two-dimensional problem may be easily derived from matrix (36): ( 0 0.5 0.5 0 ) (42) radiosities (eq. 15): w1 = 436.3065 + 0.05 · (0.5w2) w2 = 300.7712 + 0.25 · (0.5w1) (43) 220 vol. 59 no. 3/2019 model of radiative and convective heat transfer in buildings: part i surface 1 radiates other not w (1)i (w/m 2) w (1)1 = 437.67423 w (1) 2 = 54.70928 surface 2 radiates other not w (2)i (w/m 2) w (2)1 = 7.54285 w (2) 2 = 301.71406 table 5. special radiosities w (j)i for computing the matrix φi↔j of the regularly open envelope. w1 = 445.2170, w2 = 0, w3 = 356.4233 (w/m2) (44) heat flows (eqs. (11), (16) and (17)): φ1 = s1q1 = 72 · 1 0.05 · (436.3065 − 0.95 · 445.21708) = +19 224.4 w (see eq. (11)) φ2 = s2q2 = 72 · 1 0.25 · (300.7712 − 0.75 · 356.4233) = +9 634.6 w (see eq. (11)) (45) 3∑ i=1 φi = +28 859 w (46) as seen from results (45) and (46), the floor and the ceiling emit energies (they provide only positive values), i.e. no portion of energy is absorbed by these surfaces (no negative values are present). this means that the emitted energies (28 859 w) represent heat losses directed into the open space. this conclusion is in a full agreement with the foregoing computations performed within the quasi-open enclosure presented in section 3.2. the total heat flow (46) of the regularly open system cannot be zero as in the case of the enclosure (40) or (33) since a large portion of energy escapes into the open space. 
let us now investigate the matrix of the energy exchange associated with the two-dimensional radiosity method (see tab. 5). the matrix φi↔j of the energy exchange (eq. (26)):

$$\begin{pmatrix} 0 & +1\,498.6 \\ -1\,498.6 & 0 \end{pmatrix} \implies \begin{pmatrix} \textstyle\sum_1 = +1\,498.6 \\ \textstyle\sum_2 = -1\,498.6 \end{pmatrix} \text{w} \quad (47)$$

matrix (47) specifies the energy transferred between the floor and the ceiling. the energy of 1 498.6 w has been emitted by the floor (no. 1) and the same energy has been absorbed by the ceiling (no. 2). this is in agreement with the computations performed within the quasi-enclosure in sub-section 3.2. this energy exchange obeys the principle of the second law of thermodynamics, which declares that, during natural processes, the heat always flows from the warmer body to the cooler one. the elements φ1↔2 and φ2↔1 of matrix (47) are in agreement with the elements φ1↔3 and φ3↔1 of matrix (41). however, the two-dimensional modification of the radiosity model is not capable of providing such detailed information on the energy exchanges between surfaces as in the case of the three-dimensional radiosity method. although numerically more demanding, the three-dimensional radiosity model used in the previous section is more informative.

4. conclusions
the discussed computational model for the estimation of the radiative heat transfer is applicable both to the open and closed systems of surfaces. in the first computational step, the matrix of view factors is formed on the basis of the graphs or formulae published in the technical literature and by means of the three auxiliary rules termed the symmetry rule (eq. (3)), the zero rule (eq. (4)) and the summation rule (eq. (5)). the radiosities are then determined from the system of linear algebraic equations (eqs. (15)). the radiosities enable to compute the heat fluxes qi (eqs. (11)/(17)) and the heat flows φi (eq. (16)) associated with the particular surfaces. in addition, the radiative heat portions that are exchanged between the couples of surfaces may be identified as the elements of the heat exchange matrix φi↔j. the radiative portions of heat that are transferred in the open and closed systems of surfaces differ as to the estimation of energy losses. in the open system, a large portion of the heat is emitted irreversibly into the open space whereas, in the closed system, some portion of the heat is reflected back into the interior, which ensures a more economical performance of enclosures. in addition, the stationary thermal state in the interior of the enclosure is characterized by the equilibrium between the emitted and absorbed portions of heat, i.e. the sum of the heat flows is zero ($\sum_{(i)}\phi_i = 0$). this is in agreement with the compensation theorem and the basic laws of thermodynamics. on the contrary, the total heat flow of the regularly open system cannot be zero as in the case of the closed system since a large portion of the energy escapes into the open space. the open system of surfaces may be investigated either as the quasi-enclosure or as the regularly open envelope. although the radiosity method applied to both these arrangements provides equivalent results, the concept of the quasi-enclosure seems to be more informative. the present paper has summarized the algebraic computational model based on radiosity. different properties and behaviours of that model when applied to various systems of room envelopes have been analysed and discussed.
all this should serve as a preparation stage for the development of a more general model for the estimation of heat losses of the inner spaces of buildings, in which the heat transfer is realized by radiation and convection. the general model of the combined radiative and convective heat transfer is formulated and applied in the separate serial paper [30], thematically related to the present contribution. the performed analysis of the algebraic radiosity model enables to draw several summarizing conclusions:

(1.) the algebraic radiosity model is capable of a correct functioning not only in closed systems but also in open enclosures. the analysis of its properties in open enclosures is usually missing in the monographs related to this model.
(2.) the validity of the compensation theorem related to the overall radiative heat flow in enclosures has been proven on a rigorous mathematical basis (eq. (34)).
(3.) a new alternative formula for the heat exchange between the couples of surfaces has been derived (eq. (26)).
(4.) so far, the algebraic radiosity model has been preferably used for thermal applications in the field of mechanical engineering where it was originally formed. the present paper has illustrated that this model may also be useful in the field of thermal building technology and building physics since it is capable of straightforwardly computing the radiative heat energy produced by heating systems.
(5.) although the presented application of the radiosity method has been aimed at heated large-area floors, the method may also be applicable to the radiant heat energy emitted by small-area radiant panels often used in housing dwellings. such applications are in progress.

references
[1] h. c. hottel. radiant-heat transmission. in w. h. mcadams (ed.), heat transmission. mcgraw-hill book co., new york, 1954.
[2] h. c. hottel, a. f. sarofim. radiative transfer. mcgraw-hill book co., new york, 1967.
[3] b. gebhart. heat transfer, chap. 5. mcgraw-hill book co., new york, 1971.
[4] b. gebhart. surface temperature calculations in radiant surroundings of arbitrary complexity—for gray, diffuse radiation. international journal of heat and mass transfer 3(4):341–346, 1961. doi:10.1016/0017-9310(61)90048-5.
[5] e. m. sparrow. on the calculation of radiant interchange between surfaces. in w. ibele (ed.), modern developments in heat transfer, pp. 181–212. academic press, 1963. doi:10.1016/b978-0-12-395635-4.50010-3.
[6] e. m. sparrow, r. d. cess. radiative heat transfer. brooks/cole publishing co., new york, 1966.
[7] a. k. oppenheim. radiation analysis by the network method. in transactions asme 78, pp. 725–735. 1956.
[8] e. r. g. eckert, r. m. drake jr. heat and mass transfer. mcgraw-hill book co., new york, 1959.
[9] t. j. love. radiation heat transfer. charles e. merrill publishing co., new york, 1968.
[10] j. a. wiebelt. engineering radiation heat transfer. holt, rinehart and winston, new york, 1966.
[11] r. siegel, j. r. howell. thermal radiation heat transfer. mcgraw-hill book co., new york, 1972.
[12] j. a. clark, m. e. korybalski. algebraic methods for the calculation of radiation exchange in an enclosure. wärme- und stoffübertragung 7, 1974. doi:10.1007/bf01438318.
[13] r. j. liesen, c. o. pedersen. an evaluation of inside surface heat balance models for cooling load calculations. in ashrae trans. symposia, vol. 103, pp. 485–502. 1997.
[14] h. khorasanizadeh, g. a. sheikhzadeh, a. a. azemati, b. shirkavand hadavand. numerical study of air flow and heat transfer in a two-dimensional enclosure with floor heating. energy and buildings 78:98–104, 2014. doi:10.1016/j.enbuild.2014.04.007.
[15] x. zheng, y. han, h. zhang, et al. numerical study on impact of non-heating surface temperature on the heat output of radiant floor heating system. energy and buildings 155:198–206, 2017. doi:10.1016/j.enbuild.2017.09.022.
[16] l. zhang, x.-h. liu, y. jiang. simplified calculation for cooling/heating capacity, surface temperature distribution of radiant floor. energy and buildings 55:397–404, 2012. doi:10.1016/j.enbuild.2012.08.026.
[17] d. wang, c. wu, y. liu, et al. experimental study on the thermal performance of an enhanced-convection overhead radiant floor heating system. energy and buildings 135:233–243, 2017. doi:10.1016/j.enbuild.2016.11.017.
[18] j. du, m. chan, d. pan, s. deng. a numerical study on the effects of design/operating parameters of the radiant panel in a radiation-based task air conditioning system on indoor thermal comfort and energy saving for a sleeping environment. energy and buildings 151:250–262, 2017. doi:10.1016/j.enbuild.2017.06.052.
[19] h. karabay, m. arıcı, m. sandık. a numerical investigation of fluid flow and heat transfer inside a room for floor heating and wall heating systems. energy and buildings 67:471–478, 2013. doi:10.1016/j.enbuild.2013.08.037.
[20] a. koca, g. çetin. experimental investigation on the heat transfer coefficients of radiant heating systems: wall, ceiling and wall-ceiling integration. energy and buildings 148:311–326, 2017. doi:10.1016/j.enbuild.2017.05.027.
[21] m. rahimi, a. sabernaeemi. experimental study of radiation and free convection in an enclosure with a radiant ceiling heating system. energy and buildings 42(11):2077–2082, 2010. doi:10.1016/j.enbuild.2010.06.017.
[22] a. koca, z. gemici, y. topacoglu, et al. experimental investigation of heat transfer coefficients between hydronic radiant heated wall and room. energy and buildings 82:211–221, 2014. doi:10.1016/j.enbuild.2014.06.045.
[23] a. bahadori, s. zendehboudi, k. hooman. estimation of characteristic temperature ratios for panel radiator and high temperature radiant strip systems to calculate heat loss within a room space. energy and buildings 55:508–514, 2012. doi:10.1016/j.enbuild.2012.08.020.
[24] o. acikgoz, a. çebi, a. s. dalkilic, et al. a novel ann-based approach to estimate heat transfer coefficients in radiant wall heating systems. energy and buildings 144:401–415, 2017. doi:10.1016/j.enbuild.2017.03.043.
[25] l. evangelisti, c. guattari, p. gori, f. bianchi. heat transfer study of external convective and radiative coefficients for building applications. energy and buildings 151:429–438, 2017. doi:10.1016/j.enbuild.2017.07.004.
[26] w. a. gray, r. müller (eds.). engineering calculations in radiative heat transfer. pergamon, 1974. doi:10.1016/b978-0-08-017787-8.50007-8.
[27] f. p. incropera, d. p. dewitt. fundamentals of heat transfer. john wiley & sons, new york, 1981.
[28] r. siegel, j. r. howell. thermal radiation heat transfer. mcgraw-hill, new york, 1972.
[29] s. kramer, r. gritzki, a. perschk, et al. numerical simulation of radiative heat transfer in indoor environments on programmable graphics hardware. international journal of thermal sciences 96:345–354, 2015. doi:10.1016/j.ijthermalsci.2015.02.008.
[30] t. ficker. general model of radiative and convective heat transfer in buildings: part ii: convective and radiative heat losses. acta polytechnica 59(3):224–237, 2019. doi:10.14311/ap.2019.59.0224.

acta polytechnica 59(3):211–223, 2019

acta polytechnica doi:10.14311/ap.2018.58.0026
acta polytechnica 58(1):26–36, 2018 © czech technical university in prague, 2018

thermochemical calculations using service-oriented architecture in the web service form

pavel horovčák, ján terpák*
institute of control and informatization of production processes, technical university of košice, boženy němcovej 3, 042 00 košice, slovak republic
* corresponding author: jan.terpak@tuke.sk

abstract. the subject of this article is the service-oriented architecture utilization in the design and implementation of a web service that is intended to perform selected thermochemical calculations for chemical reactions. the computing functions allow the chemical reaction calculations, such as the molar heat capacity, enthalpy, entropy and gibbs free energy. in the next part, there is a description of each function, the method of the service calling in the client application and the structure specification of the outputs and error states of the service. in addition to the computing functions, the web service also has a group of three information functions that characterize the purpose of the web service and its parameters, provide, in a tabular form, a list of all web service functions and a list of all error states of the web service. the final section describes the presentation web service application with a demonstration of the specific calculations, the possibilities of using the service, and a further solution treatment.

keywords: thermochemical calculation; chemical reaction; service-oriented architecture; web service.

1. introduction
the basic tasks of chemical and process engineering mainly include an analysis of existing processes, design of new processes, construction of technological equipment, optimizing the use of material and energy flows, monitoring, indirect measurement of process quantities and process control itself [1–5].
thermochemical calculations of chemical reactions provide a basis for the quantification of processes in the chemical industry, the processing of raw materials, energy and so on. this is particularly the calculation of the molar heat capacity, heat of reaction, entropy, determining the direction of a chemical reaction based on the calculation of gibbs free energy, the determination of the equilibrium constant, etc. [6, 7]. for the implementation of thermochemical calculations, there is currently a large number of software tools created using different information technologies. these technologies, and the program tools based on them, begin at simple spreadsheets and desktop applications, continue with network applications using html and various server scripts and end at a service-oriented architecture [8] (soa) through the web services [9] or enterprise service bus (esb) [10], which is optionally also applied in the form of cloud computing [11] (cc). the concept of cloud computing is currently one of the most popular marketing buzzwords in the it industry [11]. the term cloud, with all its derivatives, has been adopted as a metaphor for internet-based services. in work [12], several definitions and characteristics of the soa have been given. the starting point is the basic definition, according to which the current soa is an architecture that supports service orientation when using a web service [13]. other definitions [14, 15], specifications and characteristics [16] indicate, in more detail, the requirements, characteristics, forms, targets, platforms, interaction and applications of the soa. the relationship between the ws, soa, and cc can be shown in the form of a venn diagram [8] (fig. 1). in this diagram, the ws is encapsulating the cc because the cc uses the ws for the purposes of connection. however, the ws are and can be used also outside the cc. such a ws use can be a part of the soa, or may not, because the soa does not need to use only the ws for the connection purposes.

figure 1. web services and cloud computing [8].

a view of the complexity of the solved problems and the participation of numerous teams, whose members often operate at different workplaces, leads to the following requirements for the functionality of software tools: a unified management of the thermochemical data and calculations, the network user access to unified data and calculations, independence of the hardware and software platforms on the user's end, and the modularity of calculations in terms of their use and dissemination. the article is a contribution in the field of thermochemical calculations using web services and provides the possibility of its use in different areas.

2. overview of the current state
the basis of thermochemical calculations of a chemical reaction are mainly the thermochemical data of the pure substances present in the chemical reaction. the source of thermochemical data are primarily book publications such as [17, 18], wherein the thermochemical data are shown in the form of table values, depending on the temperature. in the case of an automated calculation, the form of functions is more appropriate than table values. from the publications, various forms of the functions are known [18, 19] and, due to the number of substances and the form of the functional dependence, the best source appears to be the one known as nasa polynomials [19].
other sources of thermochemical data are various electronic sources [20], which range from simple desktop applications to network [21] applications. the outcome of these applications is usually a table, which can be saved to a text file. the analysis of the current situation also shows a number of software tools that support and implement thermochemical calculations. the best known software resources include especially factsage [22], gwb [23], hsc chemistry [24], melts [25], mtdata [26], metsim [27] and so on. we can also include web portals, such as ctserver [28], webqc [29] or test [30], in the list of the program tools. most software tools are commercial and some are also available for free, but with a considerably reduced substance database. one of the few providers of a ws in the field of thermodynamic calculations is ctserver web services [28], which provides a total of four different services through wsdl (web services description language) files. the services allow the user to obtain the thermodynamic properties of minerals and their reactions using the internally consistent database of berman [31], to calculate the thermodynamic properties of other phases and solutions in the phase library of ctserver, to perform calculations of the solubility model based on papale [32], and to carry out calculations of the fe-ti oxygen system by ghiorso and evans [33]. to make the data relating to the generation of free energies of reactions and minerals available, it was decided, at geoscience australia [34], to also create a web service, which is not further specified (in the form of a wsdl). in the work [35], the authors write about a group of chemically oriented (chemoinformatics) web services (also without a wsdl) formed at indiana university. the subject of the database infrastructure of a ws for thermochemical data [19, 36] and its application for the distributed computation of chemical equilibrium is being investigated at san diego university in works [37, 38]. the data access over the web for different calculations (chemistry, biology, etc.) in a documented ws is the subject of work [39]. cdk (chemistry development kit) [40] can be called a clear forerunner of solutions providing web access to chemical data through a ws; it addresses the client side using java applets and, in addition to the web access, also allows a mutual communication of the server part with the r system. the ws for calculations of molecular similarities using the cdk are specified in paper [41] (with broken links to the ws specification). the use of the soa through a ws, for example, for calculating the reactive effectiveness of atom-diatom systems [42], is very rare. the web services are used there for the allocation of computing capacity. quantum reactive scattering studies of atom-diatom systems have nowadays become routine computing applications when one needs to either confirm the associated potential energy surface or to accurately estimate the effectiveness of the reactive system [43]. in terms of functionality, there are software resources offering options from simple outputs to more complex tools realizing thermochemical calculations. the simpler outputs include, in particular, molar mass calculations of various substances, a view of the chemical formula, and the balance of a chemical reaction, providing primary thermochemical data, and the like.
more complex instruments provide, according to the system parameters (open, closed, chemical reaction, etc.), the generation of state quantities (temperature, enthalpy, entropy, and the like) and the calculation of the thermochemical properties of substances at a given temperature and pressure. they enable the acquisition of the number and chemical formulas of the components in the current phase or mixture, and so on. in terms of the technologies used, the individual software tools are developed using html, php, java, soa in the ws form, and the like. existing resources can also be divided from the point of view of the application implementation, i.e., whether it is a desktop or a network application. within the analysis of the existing software resources, the form of communication with the user also needs to be evaluated. in the case of simple closed systems, it is mainly of a visual form consisting of values and graphs. open systems offer possibilities ranging from data fields to various file formats. the xml structure can be considered the most appropriate output format, mainly in terms of the creation of modular software tools and their linkage to the source data. based on the analysis of the existing software resources, and in view of the requirements for program tools providing thermochemical data and calculations, the most suitable form seems to be the use of a service-oriented architecture through a ws. web services, among other web resources providing data and services such as calculations, stand out mainly due to their compatibility, allowing users to use and combine different wss, with the benefit that it does not matter with which programming resources they were created, nor on what software platform the service is implemented.

3. thermochemical calculations

the selected thermochemical calculations of a chemical reaction realised in the form of a web service and described in this article are based on the general chemical reaction in the form

\[
0=\sum_{j=1}^{m}\nu_{j}N_{j},\tag{1}
\]

where \(\nu_j\) is the stoichiometric coefficient of substance \(N_j\) and \(m\) is the number of substances in the reaction. the value of \(\nu_j\) is positive in the case of a product and negative in the case of a reactant. the molar heat capacity of the general chemical reaction is given by

\[
\Delta c_{p}^{\circ}(T)=R\sum_{i=1}^{7}\Delta a_{i}T^{i-3},\tag{2}
\]

where \(R\) is the universal gas constant (8.314 J K\(^{-1}\) mol\(^{-1}\)), \(\Delta a_i\) is the coefficient for a given chemical reaction based on the coefficients of the nasa polynomials for pure substances [12, 19] and \(T\) is the temperature (K). the reference state of the substances is generally taken to be the thermodynamically stable state at the temperature \(T_0 = 298.15\) K. the coefficient \(\Delta a_i\) is calculated as follows

\[
\Delta a_{i}=\sum_{j=1}^{m}\nu_{j}a_{i,j},\tag{3}
\]

where \(a_{i,j}\) is the \(i\)-th coefficient of the nasa polynomial for the \(j\)-th substance of the chemical reaction (1). the calculation of the reaction enthalpy for a given chemical reaction is based on the integral of the molar heat capacity

\[
\Delta_{r}H^{\circ}(T)=\Delta_{r}H^{\circ}(T_{0})+\int_{T_{0}}^{T}\Delta c_{p}^{\circ}(T)\,\mathrm{d}T,\tag{4}
\]

where \(\Delta_r H^{\circ}(T_0)\) is the reaction enthalpy for the given chemical reaction at the temperature \(T_0\), calculated according to the formula

\[
\Delta_{r}H^{\circ}(T_{0})=\sum_{j=1}^{m}\nu_{j}H_{j}^{\circ}(T_{0}),\tag{5}
\]

where \(H_j^{\circ}(T_0)\) is the enthalpy of the \(j\)-th pure substance at the temperature \(T_0\).
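to make equations (2)–(5) concrete, the following is a minimal php sketch (php being the implementation language of the service described later), not the service's actual internal code. it assumes that the reaction coefficients Δa_i of equation (3), passed as an array $da indexed 1 to 7, and the reference reaction enthalpy Δ_rH°(T_0) of equation (5), passed as $dh0 in J/mol, have already been assembled:

```php
<?php
// molar heat capacity change of a reaction, eq. (2):
// delta_cp(T) = R * sum_{i=1..7} da_i * T^(i-3)
function delta_cp($da, $t)
{
    $r = 8.314; // universal gas constant [j k^-1 mol^-1]
    $s = 0.0;
    for ($i = 1; $i <= 7; $i++) {
        $s += $da[$i] * pow($t, $i - 3);
    }
    return $r * $s;
}

// reaction enthalpy, eq. (4): the integral of delta_cp is evaluated
// analytically term by term (T^-2 integrates to -1/T, T^-1 to ln T,
// T^k to T^(k+1)/(k+1))
function delta_h($da, $dh0, $t, $t0 = 298.15)
{
    $r = 8.314;
    $s = $da[1] * (1.0 / $t0 - 1.0 / $t)  // i = 1 term
       + $da[2] * log($t / $t0);          // i = 2 term
    for ($i = 3; $i <= 7; $i++) {
        $s += $da[$i] * (pow($t, $i - 2) - pow($t0, $i - 2)) / ($i - 2);
    }
    return $dh0 + $r * $s;
}
```

the analytic term-by-term integration mirrors the structure of the van't hoff integration in equation (12) below, where the same i = 1, 2 special cases appear.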
the calculation of the entropy change for a given chemical reaction is based on the integral of the ratio of the change in the molar heat capacity and the temperature

\[
\Delta_{r}S^{\circ}(T)=\Delta_{r}S^{\circ}(T_{0})+\int_{T_{0}}^{T}\frac{\Delta c_{p}^{\circ}(T)}{T}\,\mathrm{d}T,\tag{6}
\]

where \(\Delta_r S^{\circ}(T_0)\) is the entropy change for the given chemical reaction at the temperature \(T_0\), calculated according to the formula

\[
\Delta_{r}S^{\circ}(T_{0})=\sum_{j=1}^{m}\nu_{j}S_{j}^{\circ}(T_{0}),\tag{7}
\]

where \(S_j^{\circ}(T_0)\) is the entropy of the \(j\)-th pure substance at the temperature \(T_0\). gibbs free energy is calculated on the basis of the changes in the enthalpy (4) and entropy (6) using the equation

\[
\Delta_{r}G^{\circ}(T)=\Delta_{r}H^{\circ}(T)-T\Delta_{r}S^{\circ}(T).\tag{8}
\]

based on the equation

\[
\Delta_{r}G^{\circ}(T)=-RT\ln K,\tag{9}
\]

the value of the logarithm of the equilibrium constant is equal to

\[
\ln K=-\frac{\Delta_{r}G^{\circ}(T)}{RT}.\tag{10}
\]

based on the equation (4) for calculating the enthalpy and the van't hoff reaction isobar

\[
\left(\frac{\partial\ln K}{\partial T}\right)_{p}=\frac{\Delta_{r}H^{\circ}(T)}{RT^{2}},\tag{11}
\]

we obtain, after integration, the relation for the logarithm of the equilibrium constant

\[
\ln K(T)=\ln K(T_{0})-\frac{\Delta_{r}H^{\circ}(0)}{R}\left(\frac{1}{T}-\frac{1}{T_{0}}\right)
+\frac{\Delta a_{1}}{2}\left(\frac{1}{T^{2}}-\frac{1}{T_{0}^{2}}\right)
+\Delta a_{2}\left(\frac{1}{T_{0}}-\frac{1+\ln\frac{T}{T_{0}}}{T}\right)
+\Delta a_{3}\ln\frac{T}{T_{0}}
+\sum_{i=4}^{7}\frac{\Delta a_{i}\left(T^{i-3}-T_{0}^{i-3}\right)}{(i-2)(i-3)},\tag{12}
\]

where

\[
\Delta_{r}H^{\circ}(0)=\Delta_{r}H^{\circ}(T_{0})-R\left(-\frac{\Delta a_{1}}{T_{0}}+\sum_{i=3}^{7}\frac{\Delta a_{i}T_{0}^{i-2}}{i-2}\right).\tag{13}
\]

the equilibrium constant for the general form of a chemical reaction (1) can be expressed using activities [7]. if we set the activities of the condensed components of the chemical reaction equal to one, and the gaseous components are expressed by means of partial pressures, the equilibrium constant takes the form

\[
K=\prod_{j=1}^{m}\left(\frac{p_{j}^{\circ}}{p^{\circ}}\right)^{\nu_{j}},\tag{14}
\]

where \(p^{\circ}\) is the normal pressure (101 325 Pa) and \(p_j^{\circ}\) are the partial pressures. based on the known value of the equilibrium constant and the specified total pressure, we can calculate the equilibrium partial pressure, or the equilibrium composition, of the gas component. from the point of view of the number of gas components on the side of the products or reactants, and of the numbers of moles, different specific cases can occur. in the following, we consider only the cases where the chemical reaction contains:

• one gaseous product,
• one gaseous reactant,
• one gaseous product and one gaseous reactant.

in the first case, where one gaseous product \(N_j\) with the number of moles \(\nu_j\) occurs in the chemical reaction, we can express the equilibrium constant as follows

\[
K=\left(\frac{p_{j}^{\circ}}{p^{\circ}}\right)^{\nu_{j}},\quad\text{resp.}\quad p_{j}^{\circ}=p^{\circ}\sqrt[\nu_{j}]{K}.\tag{15}
\]

in the second case, where one gaseous reactant \(N_i\) with the number of moles \(\nu_i\) occurs in the chemical reaction, it is essentially the first case with the chemical reaction taking place in the opposite direction. the equilibrium constant has the form

\[
K=\left(\frac{p_{i}^{\circ}}{p^{\circ}}\right)^{\nu_{i}},\quad\text{resp.}\quad p_{i}^{\circ}=\frac{p^{\circ}}{\sqrt[\nu_{i}]{K}}.\tag{16}
\]

in the third case, where one gaseous product \(N_j\) with the number of moles \(\nu_j\) and one gaseous reactant \(N_i\) with the number of moles \(\nu_i\) occur in the chemical reaction, and the sum of the partial pressures is equal to the total pressure (\(p_c = p_i^{\circ} + p_j^{\circ}\)), it is necessary to solve the following equation

\[
c_{3}(p_{j}^{\circ})^{3}+c_{2}(p_{j}^{\circ})^{2}+c_{1}p_{j}^{\circ}+c_{0}=0,\tag{17}
\]

where the coefficients \(c_i\) are given in tab. 1.

ν_i    ν_j    c_3    c_2              c_1                    c_0
n      n      0      0                1 + K^{1/n}            −K^{1/n} p_c
n      2n     0      1                K^{1/n} p°             −K^{1/n} p° p_c
n      3n     1      0                K^{1/n} (p°)²          −K^{1/n} (p°)² p_c
2n     3n     1      −K^{1/n} p°      2 K^{1/n} p° p_c       −K^{1/n} p° (p_c)²

table 1. the coefficients of equation (17) for the given numbers of moles of the reactant (ν_i) and the product (ν_j).

the result of the equation (17) is the partial pressure of the gaseous product \(p_j^{\circ}\), or the mole fraction \(x_j = p_j^{\circ}/p_c\), or the mole percentage.
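equation (17) always has a physically meaningful root p_j° in the interval (0, p_c), so a simple way to find it numerically is bisection. the following php sketch is a hypothetical helper, not part of the described ws; it takes the coefficients c_3..c_0 assembled according to tab. 1 (the example uses the first row, with illustrative values of n, K and p_c):

```php
<?php
// solve c3*p^3 + c2*p^2 + c1*p + c0 = 0 for the partial pressure p in (0, pc)
// by bisection; the physical root is bracketed by 0 and the total pressure pc
function equilibrium_partial_pressure($c3, $c2, $c1, $c0, $pc)
{
    $f = function ($p) use ($c3, $c2, $c1, $c0) {
        return (($c3 * $p + $c2) * $p + $c1) * $p + $c0; // horner evaluation
    };
    $lo = 0.0;
    $hi = $pc;
    for ($k = 0; $k < 100; $k++) { // 100 halvings: well below float precision
        $mid = 0.5 * ($lo + $hi);
        if ($f($lo) * $f($mid) <= 0.0) {
            $hi = $mid;
        } else {
            $lo = $mid;
        }
    }
    return 0.5 * ($lo + $hi);
}

// first row of tab. 1 (nu_i = nu_j = n): c1 = 1 + K^(1/n), c0 = -K^(1/n)*pc
$n = 1.0; $k = 2.5; $pc = 101325.0;
$kr = pow($k, 1.0 / $n);
$pj = equilibrium_partial_pressure(0.0, 0.0, 1.0 + $kr, -$kr * $pc, $pc);
echo "p_j = $pj pa, mole fraction x_j = " . ($pj / $pc) . "\n";
```

bisection is chosen here over a closed-form cubic solution because the sign change between 0 and p_c guarantees convergence for every row of tab. 1, including the degenerate linear and quadratic cases where c_3 or c_2 vanish.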
the calculations represent a generalization of the calculations for gaseous components, while generally only solutions for specific cases are found in the literature [6]. finding the direction of a chemical reaction, or gibbs free energy for known initial partial pressures, is based on the following equation

\[
\Delta_{r}G(T)=\Delta_{r}G^{\circ}(T)+RT\ln\left((p^{\circ})^{-\Delta\nu}\prod_{j=1}^{m}(p_{j}^{\circ})^{\nu_{j}}\right),\tag{18}
\]

where the calculation of \(\Delta_r G^{\circ}(T)\) is according to (9), and the equilibrium constant is determined by the van't hoff reaction isobar (12). the value of \(\Delta\nu\) represents the change in the total quantity of moles of the substances.

4. web service design

the purpose of the web service named thermochemdc (thermo chemical data calculations) is to perform selected thermochemical calculations of a chemical reaction. the ws provides two groups of calculations. the first group includes the calculation of the change of the molar heat capacity (2), the calculation of the heat of reaction (4), the calculation of the entropy change (6), the calculation of gibbs free energy (8) and the calculation of the logarithm of the equilibrium constant (10). these calculations can be made for a specified temperature or a temperature interval. the second group includes the calculation of the logarithm of the equilibrium constant (12), the calculation of the equilibrium composition (14) and the calculation of gibbs free energy (18) using the van't hoff reaction isobar (11). the above-mentioned calculations can be performed for a given temperature, pressure or partial pressure, or their intervals. other ws functions include the balance control of a chemical reaction and the decomposition of both sides of a chemical reaction into its individual components. in addition to the computing functions, the ws also has a group of three information functions that are designed as multi-lingual. these functions characterize the purpose of ws thermochemdc and its parameters, and provide, in a tabular form, a list of all functions of the ws as well as a list of all error states of ws thermochemdc. in its activities, the ws uses additional web services that have been proposed and implemented at our department. ws thermpropdc [12] provides the thermochemical properties of chemical substances (1154 substances and their 1817 phases) for the purposes of the individual calculations. ws chemform [44] is used to convert the standard notation of a chemical formula stored in the ascii form to the html format. the web service errorservice returns, for a specified ws (given by its numeric identifier) and a specified error code, the meaning of the error state in the chosen language.
ws thermochemdc directly uses the ws thermpropdc in all calculations. it is based on the use of nasa resources (nasa polynomials) [12, 19, 36], which also include other published dependences [6, 17, 45, 46] and are supplemented with the cas [47] parameter and with one or more names of the substances in two languages.

# | meaning of function | function parameters
1. enthalpy | reaction, language, temperature [k]
2. enthalpy for the temperature interval | reaction, language, temperature1 [k], temperature2 [k], step [k]
3. gibbs energy | reaction, language, temperature [k]
4. gibbs energy for the temperature interval | reaction, language, temperature1 [k], temperature2 [k], step [k]
5. entropy | reaction, language, temperature [k]
6. entropy for the temperature interval | reaction, language, temperature1 [k], temperature2 [k], step [k]
7. logarithm of the equilibrium constant | reaction, language, temperature [k]
8. logarithm of the equilibrium constant for the temperature interval | reaction, language, temperature1 [k], temperature2 [k], step [k]
9. molar heat capacity | reaction, language, temperature [k]
10. molar heat capacity for the temperature interval | reaction, language, temperature1 [k], temperature2 [k], step [k]
11. control of the chemical reaction balance | reaction, language
12. decomposition of the chemical reaction | reaction, language
13. logarithm of the equilibrium constant (van't hoff) | reaction, language, temperature [k]
14. logarithm of the equilibrium constant for the temperature interval (van't hoff) | reaction, language, temperature1 [k], temperature2 [k], step [k]
15. equilibrium composition | reaction, language, temperature [k], pressure [pa]
16. equilibrium composition for the pressure interval | reaction, language, temperature [k], pressure1 [pa], pressure2 [pa], step [pa]
17. gibbs energy (van't hoff) | reaction, language, temperature [k], pressure [pa], partial pressure [pa]
18. gibbs energy for the pressure interval (van't hoff) | reaction, language, temperature [k], pressure [pa], partial pressure1 [pa], partial pressure2 [pa], step [pa]

table 2. list of web service thermochemdc functions.

in some cases, the value 000000 is used for the cas identifier, which indicates that no cas value could be found for the given substance at any website. the cas identifier is optionally stored in the database infrastructure of the ws as a link to the appropriate website [48–54]. similarly, the formula of a substance can be saved either in the standard ascii form or in the html format (with upper/lower indices), which is ensured by the ws chemform [44]. the general administration of the database infrastructure of the ws thermpropdc is provided by a standalone application admin_tp. this application allows a gradual creation of all tables in the service's operation and preparation databases, populating the tables with the nasa and cas data and with the names from the relevant files, transferring complete records to the operation database, optionally completing the cas identifier and the names of substances manually, and/or performing a backup of the database into a file. ws thermochemdc is implemented in a php development environment (php version 5 or above is required) and uses mysql version 5 as the database system. the current version of the ws, with a class named thermochemdc, includes a total of 52 methods, five of which are service interfaces and are also specified in the file thermochemdc.wsdl.
the other methods are internal and intended for the different calculations or their control, the creation of xml structures, auxiliary, declaratory and folding tasks, the communication with the database system, or are complementary.

5. the methods and functions of the web service

the ws thermochemdc provides users with a total of five methods through the wsdl file. the first two methods are computational (getthermochemcalc and getparcprescalc), the other three methods are informational (gethelp, getwsfunc and geterror). each calculation method can operate in several modes and thus perform various functions. the calculation methods have from three to six parameters. the first three parameters are required: the first parameter specifies the particular function of the ws, the second parameter contains a chemical reaction and the third parameter defines the language of the outputs and error messages of the ws. the other parameters are related to the desired function of the calculation method according to tab. 2. the method getthermochemcalc is used to solve the functions 1–12, the method getparcprescalc to solve the functions 13–18. the method gethelp provides the user of the ws with a brief description of the service: its purpose, the wsdl file, the list of methods, their purpose and parameters, a description of their outputs and the corresponding xml structures, and the other wss used within the service itself. the method getwsfunc provides a list of all ws functions together with the parameters of these functions in a tabular form (tab. 2): the first column is the function code; the second column gives a more detailed description of the relevant service function and a list of its parameters. the return value of the geterror method is a list of all error states of the thermochemdc ws, which is also implemented in a tabular form (tab. 3): the first column is the error code; the second column gives a more detailed description of the service's error state. all three informational methods have one parameter $lang that defines the selected language of their output.

# | meaning of error
1. one of the reactants is not in the database
2. one of the products is not in the database
3. the specified temperature is outside the permitted interval (left side: reactants)
4. syntax error in the reaction writing
5. incorrect reaction: reactants and products do not contain the same chemical elements
6. calling a non-existent function of the web service (method getthermochemcalc)
7. incorrect parameters (1 or more) of the web service call (method getthermochemcalc)
8. the used web service is off-line
9. connection error to the database (db server): access denied for user
10. error when selecting the error meaning from the db table [ws_errors]
11. incorrect reaction balance: reactants and products do not contain the same numbers of chemical elements
12. the reaction does not contain any gaseous compound
13. numbers of gaseous components of the reaction greater than 1 are not yet solved
14. calling a non-existent function of the web service (method getparcprescalc)
15. incorrect parameters (1 or more) of the web service call (method getparcprescalc)
16. the first parameter of the cubic parabola, respectively kp1, is equal to zero
17. error when selecting the function meaning from the db table [ws_functions]
18. error when selecting the function parameters from the db table [ws_funcparams]
19. the specified temperature is outside the permitted interval (right side: products)

table 3. list of web service thermochemdc error states.
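ahead of the detailed call examples in the next section, a minimal sketch of calling the three informational methods may be useful. it assumes the same soapclient instantiation as shown below; the only documented parameter is the language code $lang, and the value '0' follows the calculation examples of the next section:

```php
<?php
// informational methods of ws thermochemdc; each takes only the language code
$c = new soapclient("http://omega.tuke.sk/wsdl/thermochemdc.wsdl");
$lang = '0';                  // language of the output
print $c->gethelp($lang);     // brief description of the service
print $c->getwsfunc($lang);   // list of all ws functions (tab. 2)
print $c->geterror($lang);    // list of all ws error states (tab. 3)
```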
6. web service call

ws thermochemdc is available to clients through a wsdl file, which is located at http://omega.tuke.sk/wsdl/thermochemdc.wsdl. this file specifies all the attributes required for the ws call. the ws can be used either in a separate client application or in another ws. calling the ws consists of a sequence of two steps. the first step is instantiating an object that represents the client part of the ws; the wsdl file of the service is used to create the instance. the second step is the call of the appropriate method, i.e., of the ws function with the corresponding parameters; the object instance created in the previous step is used for the call. the ws call is supported by all current development environments (whether in a client application or in another ws). these are primarily web development environments (e.g., php, java, asp, ajax), but they also include desktop development environments (e.g., matlab). the above-mentioned sequence of the ws call steps then depends on the development environment chosen for the implementation of the ws call. for an illustration of calling the ws, we describe the procedure in the php development environment. step one, creating an object instance, has the form:

$wsdl = "http://omega.tuke.sk/wsdl/";
$urlwsdl = $wsdl."thermochemdc.wsdl";
$c = new soapclient($urlwsdl);

where $urlwsdl indicates the location of the wsdl file of the ws (its url) and $c is an instance of an object used to interact with the ws. the second step is to call the desired method and/or ws function through the instance $c; after entering specific parameters, for example to calculate the enthalpy ($r = 1), it may take the form:

$r = 1; $t = '400'; $l = '0';
$e = 'caco3(cr)=cao(cr)+co2';
$xml = $c->getthermochemcalc($r,$e,$l,$t);
print $xml;

because the ws returns the xml structure ($xml) as its output value, the easiest way of presenting it is a direct visual display of this structure. the variables $r, $e, $l, $t are the calling parameters of the selected ws function, where $r is the selection of the function (the ws mode), $e contains a chemical equation, $l is the language and $t is the temperature. the appearance of the presented results on the screen is adjusted using a css file, in which a corresponding way of rendering (such as an appropriate colouring of the individual items) is prescribed for each specific element of the output xml structure of the web service. those items of the xml structure that should not, for any reason, be displayed can be marked as 'display: none'. the simplicity of creating a client application is illustrated by a sample of an enthalpy calculation (mode 1) for the reaction 'caco3(cr)=cao(cr)+co2' at a given temperature (400 k). the script has the form:

<?php
// sample web service client: enthalpy calculation (mode 1)
$wsdl = "http://omega.tuke.sk/wsdl/";
$urlwsdl = $wsdl."thermochemdc.wsdl";
$c = new soapclient($urlwsdl);
$r = 1; $l = '0'; $t = '400';
$eq = 'caco3(cr)=cao(cr)+co2';
print "sample of web service client<br>";
print "enthalpy calculation for reaction $eq at temperature $t k<br>";
print htmlentities($c->getthermochemcalc($r,$eq,$l,$t))."<br>";
$sp = '     '; // five non-breaking spaces used as column padding
print "error   t [k]$sp enthalpy$sp$sp$sp unit<br>";
"; print $c->getthermochemcalc($r,$e,$l,$t); ?> and it is also available on http://omega.tuke.sk/ tc/client_tc.php. the output of this sample is shown in fig. 2. the line of the php code containing the function htmlentities(...) only serves to display the xml structure (for this article), which is the ws response to the submitted request (usually does not need to be placed in the script). to visualize the result of this example, a simple css file is composed (http://omega.tuke.sk/tc/styly/ client_tc.css). the creation of a client in a php environment requires the version 5 and above. in the current distribution of linux, this version is already included. 7. output and error states of web service the ws output (the return value of the method getthermochemcalc() or getparcprescalc()) is one of the six xml structures that may take the form as shown on http://omega.tuke.sk/tc/docs/ struktury_ws0.pdf. five structures are the result of one of the functions of the ws, one structure contains an error message. successfulness of the ws processing indicates the element , which, in case of an error, has a value 1 and in other cases acquires a value 0. element is a part of each output xml structure. the first structure in the illustration is the most frequent structure containing the numerical result and calculation unit of the selected ws function at the specified temperature (excluding the calculation of logarithm of the equilibrium constant, which is dimensionless, where as a unit the value “1” is listed). the second one is an error structure that contains the code and description of the error state in the selected language and the values of all six specified parameters of the ws call. after that follows a structure that is returned by the function to control the balance of components in the reaction (containing the individual elements of the reaction and their quantity that is the same on both 32 http://omega.tuke.sk/tc/client_tc.php http://omega.tuke.sk/tc/client_tc.php http://omega.tuke.sk/tc/styly/client_tc.css http://omega.tuke.sk/tc/styly/client_tc.css http://omega.tuke.sk/tc/docs/struktury_ws0.pdf http://omega.tuke.sk/tc/docs/struktury_ws0.pdf vol. 58 no. 1/2018 thermochemical calculations using soa in the web service form sides of the reaction), then follows a structure that is returned by the function of the decomposition reaction (the moles number of all components on both sides of the reaction, the element indicates the componentâăźs affiliation to the condensed phase). another structure that is the output of the function for calculating the equilibrium composition for a given temperature and pressure (the 15-th and 16-th function). the last structure returns the function to the calculation of gibbs free energy for a given temperature, pressure and partial pressure by van’t hoff reaction isobar (function 17-th and 18-th). error states are the result of the ws input parameters checking or execution of individual modes of the ws. a centralized way to control the input parameters of the ws required a suitable solution of error states handling on the service server side and their sending to the client side. the indication of an error state to the client is realized by an error xml structure. tab. 3 illustrates a list of various error states. 8. 
8. presentation and utilization of the web service

for the purposes of a presentation of the ws functions, the client application thermochemdc_app (http://omega.tuke.sk/tc) was assembled. it allows the user to enter a chemical reaction; to calculate the enthalpy (heat of reaction), the heat capacity, the entropy, gibbs free energy and the logarithm of the equilibrium constant of the reaction, with the execution of the chosen calculation for a given temperature or temperature interval; to calculate the logarithm of the equilibrium constant, the equilibrium composition and gibbs free energy by the van't hoff reaction isobar, with the execution of the chosen calculation for a given temperature or partial pressure or their intervals; to obtain information on using the ws and on ws thermochemdc; to list the functions and the error states of ws thermochemdc; and to display information about the thermochemdc_app client application. the menu items "calculation", "about web services", "ws functions" and "ws errors" directly use the ws thermochemdc; the other menu items are used only within the application thermochemdc_app. the thermochemdc_app application displays the results of the calculations in a tabular form and, for an interval of temperatures or pressures, also in the form of a graph, which is produced with the use of the object-oriented php library jpgraph [55]. the thermochemdc_app application is compiled in an html5 environment using css3 cascading style sheets, with a multilingual option (specifically, three languages: en, sk, cz). the thermochemdc_app client application displays the formulas in the correct (html) format using the ws chemform [44].

figure 3. screenshot of the thermochemdc_app application

the use of the thermochemdc ws, or of the thermochemdc_app application, can be illustrated (fig. 3) on the carbonate decomposition, which is a relatively common process in the different technologies covering the area of raw materials processing, such as, e.g., ore sintering, coking, the blast furnace process, the processing of limestone, and the like. therefore, based on the chemical reaction

caco3(cr) → cao(cr) + co2,   (19)

we implement the thermochemical calculations: the enthalpy for temperatures from 400 to 1200 k with increments of 100 k, and the equilibrium composition for pressures from 101325 to 1013250 pa with increments of 10000 pa at the temperatures of 1000 k and 1100 k. after running the web application thermochemdc_app and selecting "calculation", it is necessary to enter the chemical reaction in the form (19) and to submit it by pressing "confirm". subsequently, a calculation is selected, for example "enthalpy for the temperature interval", and the selection is confirmed by pressing the "send" button (fig. 3). then the boundaries of the temperature interval, together with its increment, are filled in and sent by pressing the "send" button, after which the table and the graph are shown (fig. 4). other thermochemical calculations are realized similarly. the limestone decomposition (19) is an endothermic reaction, and the gaseous substance (co2) occurs only on the product side; therefore, the equilibrium composition of co2 decreases with an increasing pressure. moreover, at the same pressure, the value of the co2 equilibrium composition at 1000 k (fig. 5) is lower than the one at the temperature of 1100 k (fig. 6). the assembled ws thus provides a wide range of applications through its individual functions.
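the same enthalpy-interval calculation can also be requested directly from the ws, without the thermochemdc_app front end. a sketch, assuming that the interval variant of getthermochemcalc simply appends temperature1, temperature2 and the step in the order listed in tab. 2:

```php
<?php
$c = new soapclient("http://omega.tuke.sk/wsdl/thermochemdc.wsdl");
// function 2 of tab. 2: enthalpy for the temperature interval
// parameters: reaction, language, temperature1 [k], temperature2 [k], step [k]
$xml = $c->getthermochemcalc(2, 'caco3(cr)=cao(cr)+co2', '0', '400', '1200', '100');
print $xml; // xml structure with one value per temperature step
```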
the web service can be used directly as a part of a client application in a variety of network and desktop environments (e.g., php, ajax, asp, matlab, java, perl, octave), but it can also be used as a part of another ws.

figure 4. enthalpy as a function of the temperature
figure 5. the equilibrium composition of the co2 depending on the pressure at 1000 k
figure 6. the equilibrium composition of the co2 depending on the pressure at 1100 k

for an illustration, we mention some of the possible fields of application: process modelling in which heating, cooling, phase changes or chemical reactions are present, other calculations in which the thermochemical properties of the substances are necessary, and thermochemical calculations for chemical reactions. furthermore, the ws provides an easy access to the data and supports distributed computing. in the case of an update of the data in the nasa polynomials, an immediate update of the service's database infrastructure is realized through the application admin_tp. the ws can also be used as an effective tool for teaching students in the educational process as well as in their individual work.

9. conclusion

the paper presents a design of the web service thermochemdc for the implementation of selected thermochemical calculations for chemical reactions. these calculations include the calculation of the change in the molar heat capacity, the heat of reaction, the entropy change, the gibbs free energy change, and the logarithm of the equilibrium constant for a given temperature or temperature interval. next, there are the calculation of the logarithm of the equilibrium constant, the calculation of the equilibrium composition and the calculation of gibbs free energy by the van't hoff reaction isobar, which may be implemented for a given temperature, pressure or partial pressure, or their intervals. the mentioned calculations are carried out by the particular functions of the web service. the output of the web service is the xml structures corresponding to these functions, or an error structure that may arise either due to incorrect input parameters of the web service call or during the calculation itself. for the purposes of a presentation of the web service, a multilingual stand-alone application was created that includes all of the above calculations, a more detailed description of the web service, the lists of the functions and error states of the web service, and the characteristics of the other web services used. the output of this application is a result in a tabular form and, in the case of an interval of temperatures or pressures, also in a graphical form. the database infrastructure of the service contains a total of 1154 substances and their 1817 phases, and the database can be updated at any time using the admin application admin_tp. a benefit of the service is the support of distributed computing in a variety of customer environments, the interoperability support within the service-oriented architecture, and the possibility of composing the service into other services, simulation models and applications within the cloud. in a future solution, we propose the separation of the list of functions and its connection (and administration) via a single web service. we also consider making some partial supporting calculations independent and realizing them in the form of a web service, thus making them available to other services, simulation models or calculations.
similarly, in a future solution, we also plan to add some other calculation options (more reactions at the same time) into the presentation web service application.

acknowledgements

this work was partially supported by the slovak research and development agency under the contracts no. apvv-0482-11, apvv-14-0892 and apvv sk-pl-2015-0038, and by the project vega 1/0273/17.

references

[1] m. žecová, j. terpák. heat conduction modeling by using fractional-order derivatives. applied mathematics and computation 257:365–373, 2015. doi:10.1016/j.amc.2014.12.136.
[2] j. kačur, m. durdán, m. laciak, p. flegner. impact analysis of the oxidant in the process of underground coal gasification. measurement 51(1):147–155, 2014. doi:10.1016/j.measurement.2014.01.036.
[3] k. kostúr, m. laciak, m. durdán, et al. low-calorific gasification of underground coal with a higher humidity. measurement 63:69–80, 2015. doi:10.1016/j.measurement.2014.12.016.
[4] j. terpák, l. dorčák, v. maduda. combustion process modelling and control. acta montanistica slovaca 12(3):238–242, 2007. http://actamont.tuke.sk/pdf/2007/n3/11terpak.pdf.
[5] m. laciak, p. flegner, p. horovčák, et al. system of indirect measurement temperature of melt with adaptation module. in proceedings of the 16th international carpathian control conference, pp. 277–281. ieee, 2015. doi:10.1109/carpathiancc.2015.7145088.
[6] p. e. liley. 2000 solved problems in mechanical engineering thermodynamics. mcgraw-hill, 1989. isbn 0-07-037863-0.
[7] e. b. smith. chemical thermodynamics. imperial college press, 2014. isbn 978-1-78326-336-3.
[8] d. k. barry. web services, service-oriented architectures, and cloud computing. 2000. http://www.service-architecture.com/articles/cloud-computing/web_services_and_cloud_computing.html.
[9] w3c working group: web services architecture. 2004. http://www.w3.org/tr/ws-arch/.
[10] d. a. chappell. enterprise service bus. 2006. http://archive.visualstudiomagazine.com/books/chapters/0596006756.pdf.
[11] p. mell, t. grance. the nist definition of cloud computing. recommendations of the national institute of standards and technology, special publication 800-145, 2011. http://faculty.winthrop.edu/domanm/csci411/handouts/nist.pdf.
[12] p. horovčák, j. terpák. termochemické vlastnosti substancií formou webovej služby [thermochemical properties of substances on web]. chemické listy 107(2):136–145, 2013. http://www.chemicke-listy.cz/docs/full/2013_02_136-145.pdf.
[13] t. erl. soa: principles of service design. prentice hall, 2007. isbn 978-0132344821.
[14] d. k. barry. service-oriented architecture (soa) definition. barry & associates, 2000. http://www.service-architecture.com/articles/web-services/service-oriented_architecture_soa_definition.html.
[15] searchsoa.com: definition service-oriented architecture (soa). 2003. http://searchsoa.techtarget.com/definition/service-oriented-architecture.
[16] t. erl. what is soa? an introduction to service-oriented computing, its goals and benefits. 2016. http://www.whatissoa.com/p16.php.
[17] i. barin. thermochemical data of pure substances. weinheim, 1993. isbn 3-527-28531-8.
[18] m. w. chase. nist-janaf thermochemical tables. journal of physical and chemical reference data, monograph 9, 2000. http://kinetics.nist.gov/janaf/.
[19] b. j. mcbride, m. j. zehe, s. gordon. nasa glenn coefficients for calculating thermodynamic properties of individual species. nasa tp-2002-211556, 2002.
http://www.archive.org/details/nasa_techdoc_20020085330.
[20] department of chemistry, university of oxford: chemistry and related information on the internet: chemistry link collections. 2008. http://www.chem.ox.ac.uk/cheminfo/internet.html.
[21] m. j. zehe. chemical equilibrium with applications. nasa glenn research center, 2006. http://www.grc.nasa.gov/www/ceaweb/ceathermobuild.htm.
[22] c. v. bale, e. beliste. fact-web suite of interactive programs. 2016. http://www.factsage.com/.
[23] gwb: the geochemist's workbench. 2016. https://www.gwb.com/.
[24] hsc-chemistry-9: software for process simulation, reactions equations, heat and material balance, equilibrium calculation, electrochemical cell equilibriums, eh-ph diagrams (pourbaix diagrams). 2016. http://www.hsc-chemistry.net/.
[25] m. s. ghiorso, r. o. sack. melts: software for thermodynamic modeling of phase equilibria in magmatic systems. 2015. http://melts.ofm-research.org/.
[26] r. h. davies, a. t. dinsdale, j. a. gisby, et al. mtdata: thermodynamic and phase equilibrium software from the national physical laboratory. 2007. http://www.npl.co.uk/upload/pdf/mtdata_calphad_paper.pdf.
[27] metsim: the premier steady-state & dynamic process simulator. 2016. http://www.metsim.com/.
[28] ctserver: computational thermodynamics (ct) server. 2016. http://ctserver.ofm-research.org/.
[29] chemical-portal: chemistry online education. 2016. http://www.webqc.org/.
[30] s. bhattacharjee. test: the expert system for thermodynamics. web portal for thermodynamic property evaluation and thermal systems analysis, 2016. http://test.sdsu.edu/testhome/index.html.
[31] r. c. berman. internally-consistent thermodynamic data for minerals in the system na2o-k2o-cao-mgo-feo-fe2o3-al2o3-sio2-tio2-h2o-co2. journal of petrology 29(2):445–522, 1988. doi:10.1093/petrology/29.2.445.
[32] p. papale. modeling of the solubility of a two-component h2o + co2 fluid in silicate liquids. american mineralogist 84:477–492, 1999. doi:10.2138/am-1999-0402.
[33] m. s. ghiorso, b. w. evans. thermodynamics of rhombohedral oxide solid solutions and a revision of the fe-ti two-oxide geothermometer and oxygen-barometer. american journal of science 308:957–1039, 2008. doi:10.2475/09.2008.01.
[34] solid earth and environment grid: seegrid > analyticalgeoscience web > thermodynamics. 2010. https://www.seegrid.csiro.au/wiki/analyticalgeoscience/thermodynamics.
[35] x. dong, k. e. gilbert, r. guha, et al. web service infrastructure for chemoinformatics. journal of chemical information and modeling 47(4):1303–1307, 2007. doi:10.1021/ci6004349.
[36] b. j. mcbride, s. gordon, m. a. reno. coefficients for calculating thermodynamic and transport properties of individual species. nasa tm-4513, 1993. http://ntrs.nasa.gov/archive/nasa/casi.ntrs.nasa.gov/19940013151.pdf.
[37] c. p. paolini, s. bhattacharjee. a web service infrastructure for thermochemical data. journal of chemical information and modeling 48(7):1511–1523, 2008. doi:10.1021/ci700457p.
[38] s. bhattacharjee, c. p. paolini, m. patterson. a web service infrastructure and its application for distributed chemical equilibrium computation. journal of computational science education 3(1):19–27, 2012. doi:10.22369/issn.2153-4136/3/1/3.
[39] k. i. schuchardt, b. i. didier, t. elsethagen, et al. basis set exchange: a community database for computational sciences.
journal of chemical information and modeling 47(4):1045–1052, 2007. doi:10.1021/ci600510j.
[40] c. steinbeck, s. kuhn, m. floris, e. willighagen. recent developments of the chemistry development kit (cdk): an open-source java library for chemo- and bioinformatics. 2005. https://www.researchgate.net/profile/christoph_steinbeck/publication/6987061_recent_developments_of_the_chemistry_development_kit_%28cdk%29_-_an_open-source_java_library_for_chemo-_and_bioinformatics/links/551d48d20cf2000f8f938c85.pdf.
[41] r. guha, m. t. howard, g. r. hutchison, et al. the blue obelisk: interoperability in chemical informatics. journal of chemical information and modeling 46(3):991–998, 2006. doi:10.1021/ci050400b.
[42] c. manuali, a. lagana. grif: a new collaborative framework for a web service approach to grid empowered calculations. future generation computer systems 27(3):315–318, 2010. doi:10.1016/j.future.2010.08.006.
[43] e. garcia, c. sanchez, a. saracibar, et al. a detailed comparison of centrifugal sudden and j shift estimates of the reactive properties of the n + n2 reaction. physical chemistry chemical physics 48(11):11456–11462, 2009. doi:10.1039/b915409d.
[44] p. horovčák, d. dugáček, p. cirbes. interpretácia chemických vzorcov pomocou webovej služby [interpretation of chemical formulae using web service]. chemické listy 104(11):1029–1033, 2010. http://chemicke-listy.cz/common/content-issue_11-volume_104-year_2010.html.
[45] e. a. brizuela. computing mixture temperature from enthalpy using nasa polynomials. 2000. http://www.industrial.combustion.ifrf.net/paper_download.html?paperid=19.
[46] a. burcat, b. ruscic. third millennium ideal gas and condensed phase thermochemical database for combustion with updates from active thermochemical tables. technion-israel institute of technology, tae 960, 2005. http://www.ipd.anl.gov/anlpubs/2005/07/53802.pdf.
[47] american chemical society: cas registry and cas registry numbers. 2016. http://www.cas.org/expertise/cascontent/registry/regsys.html.
[48] nist chemistry webbook: nist standard reference database number 69. 2016. http://webbook.nist.gov/chemistry/.
[49] chemical book. 2016. http://www.chemicalbook.com/search_en.aspx?keyword=1066-33-7.
[50] molbase: chemical compounds data including chemical structures, properties, synthetic routes, msds and nmr spectra. shanghai molbase technology co., ltd, 2016. http://www.molbase.com/.
[51] eurochem: professional chemistry guide. 2016. http://www.eurochem.cz/index.php.
[52] environmental, als: analytical services. 2016. http://www.caslab.com/.
[53] sigma-aldrich: a part of merck. 2016. http://www.sigmaaldrich.com/united-kingdom.html.
[54] landolt-börnstein substance / property index. 2009. http://lb.chemie.uni-hamburg.de/search/index.php.
[55] jpgraph: most powerful php-driven charts. asial corporation, 2016. http://jpgraph.net/.
acta polytechnica

doi:10.14311/ap.2021.61.0230
acta polytechnica 61(1):230–241, 2021
© 2021 the author(s).
licensed under a cc-by 4.0 licence. published by the czech technical university in prague.

on the three-dimensional pauli equation in noncommutative phase-space

ilyas haouam

université frères mentouri, laboratoire de physique mathématique et de physique subatomique (lpmps), constantine 25000, algeria
correspondence: ilyashaouam@live.fr

abstract. in this paper, we obtained the three-dimensional pauli equation for a spin-1/2 particle in the presence of an electromagnetic field in a noncommutative phase-space, as well as the corresponding deformed continuity equation, where the cases of a constant and a non-constant magnetic field are considered. due to the absence of the current magnetization term in the deformed continuity equation, as expected, we had to extract it from the noncommutative pauli equation itself without modifying the continuity equation. it is shown that the non-constant magnetic field lifts the order of the noncommutativity parameter in both the pauli equation and the corresponding continuity equation. however, we successfully examined the effect of the noncommutativity on the current density and the magnetization current. by using a classical treatment, we derived the semi-classical noncommutative partition function of the three-dimensional pauli system of the one-particle and n-particle systems. then, we employed it for calculating the corresponding helmholtz free energy, followed by the magnetization and the magnetic susceptibility of electrons in both commutative and noncommutative phase-spaces. note that the phase-space noncommutativity is introduced into the problems in question using both the three-dimensional bopp-shift transformation and the moyal-weyl product.

keywords: 3-d noncommutative phase-space, pauli equation, deformed continuity equation, current magnetization, semi-classical partition function, magnetic susceptibility.

1. introduction

it is well known that the dirac equation is the relativistic wave equation that describes the motion of spin-1/2 fermions, and that the pauli equation, which is a topic of great interest in physics, is the non-relativistic wave equation describing them [1–4]. it is relevant to the explanation of many experimental results, and its probability current density was changed to include an additional spin-dependent term recognized as the spin current [5–7]. the pauli equation is shown in [8–12] to be the non-relativistic limit of the dirac equation. historically, pauli (1927) presented his famous spin matrices [13] to adjust the non-relativistic schrödinger equation so as to account for the goudsmit-uhlenbeck hypothesis (1925) [14, 15]. for this purpose, he applied an ansatz, adding to the ordinary non-relativistic hamiltonian in the presence of an electromagnetic field a phenomenological term: the interaction energy of the magnetic field with the electron's magnetic moment associated with its intrinsic spin angular momentum. describing this spin angular momentum through the spin matrices requires replacing the complex scalar wave function with a two-component spinor wave function in the wave equation. since then, the study of the pauli equation has received considerable attention. in 1928, when dirac presented his relativistic free wave equation together with the minimal coupling replacement to include electromagnetic interactions [16], he showed that his equation contained a term involving the electron magnetic moment interacting with a magnetic field, which was the same as the one inserted by hand in pauli's equation.
after that, it became common to regard the electron spin as a relativistic phenomenon; yet the corresponding spin-1/2 term can be inserted into the spin-0 non-relativistic schrödinger equation, and it will be discussed in this article how this is possible. however, motivated by attempts to understand the string theory and to describe quantum gravitation using noncommutative geometry, and by trying to draw a considerable attention to their phenomenological implications, we focus on studying the problem of a non-relativistic spin-1/2 particle in the presence of an electromagnetic field within a 3-dimensional noncommutative phase-space. as a mathematical theory, noncommutative geometry is by now well established, although, at first, its progress was narrowly restricted to some branches of physics such as quantum mechanics. recently, however, noncommutative geometry has become a topic of great interest [17–23]. it has been finding applications in many sectors of physics and has rapidly become involved in them, continuing to promote fruitful ideas and the search for a better understanding; for example, in quantum gravity [24], the standard model of fundamental interactions [25], as well as in the string theory [26]; and its implication in hopf algebras [27] gives the connes–kreimer hopf algebras [28–30], etc. there are many papers devoted to the study of such various aspects, especially in quantum field theory [31–33] and quantum mechanics [34–36].

this paper is organized as follows. in section 2, we present an analytic review of noncommutative geometry, in particular both the three-dimensional bopp-shift transformation and the moyal-weyl product. in section 3, we investigate the three-dimensional pauli equation in the presence of an electromagnetic field and the corresponding continuity equation; furthermore, we derive the current magnetization term in the deformed continuity equation. section 4 is devoted to calculating the semi-classical noncommutative partition function of the pauli system for the one-particle and n-particle systems. consequently, we obtain the corresponding magnetization and the magnetic susceptibility through the helmholtz free energy, all in both commutative and noncommutative phase-spaces and within a classical limit. we then conclude with some remarks.

2. review of noncommutative algebra

firstly, we present the most essential formulas of noncommutative algebra [36]. it is well known that, at very small scales such as the string scale, the position coordinates do not commute with each other, and neither do the momenta. let us consider, in a d-dimensional noncommutative phase-space, the operators of coordinates \(x_i^{nc}\) and momenta \(p_i^{nc}\). the noncommutative formulation of quantum mechanics corresponds to the following heisenberg-like commutation relations

\[
\left[x_{\mu}^{nc},x_{\nu}^{nc}\right]=i\theta_{\mu\nu},\qquad
\left[p_{\mu}^{nc},p_{\nu}^{nc}\right]=i\eta_{\mu\nu},\qquad
\left[x_{\mu}^{nc},p_{\nu}^{nc}\right]=i\tilde{\hbar}\delta_{\mu\nu},\qquad(\mu,\nu=1,\ldots,d),\tag{1}
\]

where the effective (deformed) planck constant is given by

\[
\tilde{\hbar}=\alpha\beta\hbar+\frac{\mathrm{Tr}[\theta\eta]}{4\alpha\beta\hbar},\tag{2}
\]

where \(\frac{\mathrm{Tr}[\theta\eta]}{4\alpha\beta\hbar}\ll1\) is the condition of consistency in quantum mechanics. \(\theta_{\mu\nu}\), \(\eta_{\mu\nu}\) are constant antisymmetric \(d\times d\) matrices and \(\delta_{\mu\nu}\) is the identity matrix.
it is shown that \(x_i^{nc}\) and \(p_i^{nc}\) can be represented in terms of the coordinates \(x_i\) and momenta \(p_j\) of usual quantum mechanics through the so-called generalized bopp-shift as follows [34]

\[
x_{\mu}^{nc}=\alpha x_{\mu}-\frac{1}{2\alpha\hbar}\theta_{\mu\nu}p_{\nu},\qquad
p_{\mu}^{nc}=\beta p_{\mu}+\frac{1}{2\beta\hbar}\eta_{\mu\nu}x_{\nu},\tag{3}
\]

with \(\alpha=1-\frac{\theta\eta}{8\hbar^{2}}\) and \(\beta=\frac{1}{\alpha}\) being scaling constants. to the first order of \(\theta\) and \(\eta\), we take \(\alpha=\beta=1\) in the calculations, so the equations (3), (2) become

\[
x_{\mu}^{nc}=x_{\mu}-\frac{1}{2\hbar}\theta_{\mu\nu}p_{\nu},\qquad
p_{\mu}^{nc}=p_{\mu}+\frac{1}{2\hbar}\eta_{\mu\nu}x_{\nu},\qquad
\tilde{\hbar}=\hbar+\frac{\mathrm{Tr}[\theta\eta]}{4\hbar}.\tag{4}
\]

if the system in which we study the effects of the noncommutativity is three-dimensional, we limit ourselves to the following noncommutative algebra

\[
\left[x_{j}^{nc},x_{k}^{nc}\right]=\frac{i}{2}\epsilon_{jkl}\theta_{l},\qquad
\left[p_{j}^{nc},p_{k}^{nc}\right]=\frac{i}{2}\epsilon_{jkl}\eta_{l},\qquad
\left[x_{j}^{nc},p_{k}^{nc}\right]=i\left(\hbar+\frac{\theta\eta}{4\hbar}\right)\delta_{jk},\qquad(j,k,l=1,2,3),\tag{5}
\]

where \(\theta_{l}=(0,0,\theta)\) and \(\eta_{l}=(0,0,\eta)\) are the real-valued noncommutative parameters, with the dimensions of length\(^2\) and momentum\(^2\), respectively; they are assumed to be extremely small. \(\epsilon_{jkl}\) is the levi-civita permutation tensor. therefore, we have

\[
x_{i}^{nc}=x_{i}-\frac{1}{4\hbar}\epsilon_{ijk}\theta_{k}p_{j}:\quad
\begin{cases}
x^{nc}=x-\frac{\theta}{4\hbar}p_{y}\\
y^{nc}=y+\frac{\theta}{4\hbar}p_{x}\\
z^{nc}=z
\end{cases},\qquad
p_{i}^{nc}=p_{i}+\frac{1}{4\hbar}\epsilon_{ijk}\eta_{k}x_{j}:\quad
\begin{cases}
p_{x}^{nc}=p_{x}+\frac{\eta}{4\hbar}y\\
p_{y}^{nc}=p_{y}-\frac{\eta}{4\hbar}x\\
p_{z}^{nc}=p_{z}
\end{cases}.\tag{6}
\]

in noncommutative quantum mechanics, it is quite possible to replace the usual product with the moyal-weyl (\(\star\)) product; the quantum mechanical system then simply becomes a noncommutative quantum mechanical system. let \(H(x,p)\) be the hamiltonian operator of the usual quantum system; then the corresponding schrödinger equation in noncommutative quantum mechanics is typically written as

\[
H(x,p)\star\psi(x,p)=E\psi(x,p).\tag{7}
\]

the definition of the moyal-weyl product between two arbitrary functions \(f(x,p)\) and \(g(x,p)\) in phase-space is given by [37]

\[
(f\star g)(x,p)=\exp\left[\frac{i}{2}\theta_{ab}\partial_{x_{a}}\partial_{x_{b}}+\frac{i}{2}\eta_{ab}\partial_{p_{a}}\partial_{p_{b}}\right]f(x_{a},p_{a})\,g(x_{b},p_{b})
=f(x,p)g(x,p)
+\sum_{n=1}\frac{1}{n!}\left(\frac{i}{2}\right)^{n}\theta^{a_{1}b_{1}}\cdots\theta^{a_{n}b_{n}}\,\partial_{a_{1}}^{x}\cdots\partial_{a_{n}}^{x}f(x,p)\,\partial_{b_{1}}^{x}\cdots\partial_{b_{n}}^{x}g(x,p)
+\sum_{n=1}\frac{1}{n!}\left(\frac{i}{2}\right)^{n}\eta^{a_{1}b_{1}}\cdots\eta^{a_{n}b_{n}}\,\partial_{a_{1}}^{p}\cdots\partial_{a_{n}}^{p}f(x,p)\,\partial_{b_{1}}^{p}\cdots\partial_{b_{n}}^{p}g(x,p),\tag{8}
\]

with \(f(x,p)\) and \(g(x,p)\) assumed to be infinitely differentiable. if we consider the case of a noncommutative space, the definition of the moyal-weyl product reduces to [38]

\[
(f\star g)(x)=\exp\left[\frac{i}{2}\theta_{ab}\partial_{x_{a}}\partial_{x_{b}}\right]f(x_{a})\,g(x_{b})
=f(x)g(x)+\sum_{n=1}\frac{1}{n!}\left(\frac{i}{2}\right)^{n}\theta^{a_{1}b_{1}}\cdots\theta^{a_{n}b_{n}}\,\partial_{a_{1}}\cdots\partial_{a_{n}}f(x)\,\partial_{b_{1}}\cdots\partial_{b_{n}}g(x).\tag{9}
\]

due to the nature of the \(\star\)-product, the noncommutative field theories for low-energy fields (\(E^{2}\ll1/\theta\)) are, at the classical level, completely reduced to their commutative versions. however, this is just the classical result, and quantum corrections always reveal the effects of \(\theta\), even at low energies. in a noncommutative phase-space, the \(\star\)-product can be replaced by a bopp shift, i.e., the \(\star\)-product can be changed into the ordinary product by replacing \(H(x,p)\) with \(H(x^{nc},p^{nc})\). thus, the corresponding noncommutative schrödinger equation can be written as

\[
H(x,p)\star\psi(x,p)=H\left(x_{i}-\frac{1}{2\hbar}\theta_{ij}p_{j},\;p_{\mu}+\frac{1}{2\hbar}\eta_{\mu\nu}x_{\nu}\right)\psi=E\psi.\tag{10}
\]

note that the \(\theta\) and \(\eta\) terms can always be treated as perturbations in quantum mechanics. if \(\theta=\eta=0\), the noncommutative algebra reduces to the ordinary commutative one.
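as a quick consistency check (a short verification added here, not part of the original derivation), the three-dimensional shift (6) indeed reproduces the first commutator of the algebra (5). using the ordinary relations \([x,p_x]=[y,p_y]=i\hbar\),

\[
\left[x^{nc},y^{nc}\right]=\left[x-\frac{\theta}{4\hbar}p_{y},\;y+\frac{\theta}{4\hbar}p_{x}\right]
=\frac{\theta}{4\hbar}\left[x,p_{x}\right]-\frac{\theta}{4\hbar}\left[p_{y},y\right]
=\frac{i\theta}{4}+\frac{i\theta}{4}=\frac{i\theta}{2},
\]

in agreement with \([x_{1}^{nc},x_{2}^{nc}]=\frac{i}{2}\epsilon_{12l}\theta_{l}=\frac{i\theta}{2}\) from (5); the momentum commutator works analogously with \(\eta\).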
3. Pauli equation in noncommutative phase-space

3.1. Formulation of the noncommutative Pauli equation

The Pauli equation is the formulation of the Schrödinger equation for spin-1/2 particles, formulated by W. Pauli in 1927. It takes into account the interaction of the particle's spin with an electromagnetic field; in other words, it is the non-relativistic limit of the Dirac equation. Furthermore, the Pauli equation can be extracted from other relativistic higher-spin equations, such as the DKP equation for a particle interacting with an electromagnetic field [37].

The non-relativistic Schrödinger equation that describes an electron in interaction with an electromagnetic potential \((A_0, \vec A)\) (\(\hat{\vec p}\) is replaced with \(\hat{\vec\pi} = \hat{\vec p} - \frac{e}{c}\vec A\) and \(\hat E\) with \(\hat\epsilon = i\hbar\frac{\partial}{\partial t} - e\phi\)) is

\[ \frac{1}{2m}\Big(\hat{\vec p} - \frac{e}{c}\vec A(r)\Big)^2\psi(r,t) + e\phi(r)\psi(r,t) = i\hbar\frac{\partial}{\partial t}\psi(r,t), \tag{11} \]

where \(\hat{\vec p} = -i\hbar\vec\nabla\) is the momentum operator, \(m\) and \(e\) are the mass and charge of the electron, and \(c\) is the speed of light. \(\psi(r,t)\) is the Schrödinger scalar wave function. The appearance of the real-valued electromagnetic Coulomb and vector potentials, \(\phi(\vec r,t)\) and \(\vec A(\vec r,t)\), is a consequence of using the gauge-invariant minimal coupling assumption to describe the interaction with the external magnetic and electric fields, defined by

\[ \vec E = -\vec\nabla\phi - \frac{1}{c}\frac{\partial\vec A}{\partial t}, \qquad \vec B = \vec\nabla\times\vec A. \tag{12} \]

However, the electron gains potential energy when its spin interacts with the magnetic field; therefore, the Pauli equation of an electron with spin is given by [1, 8]

\[ \frac{1}{2m}\big(\vec\sigma\cdot\hat{\vec\pi}\big)^2\psi(r,t) + e\phi\,\psi(r,t) = \frac{1}{2m}\Big(\hat{\vec p} - \frac{e}{c}\vec A\Big)^2\psi(r,t) + e\phi\,\psi(r,t) + \mu_B\vec\sigma\cdot\vec B\,\psi(r,t) = i\hbar\frac{\partial}{\partial t}\psi(r,t), \tag{13} \]

where \(\psi(r,t) = (\psi_1\;\psi_2)^T\) is the spinor wave function, which replaces the scalar wave function, \(\mu_B = \frac{|e|\hbar}{2mc} = 9.27\times10^{-24}\,\mathrm{J\,T^{-1}}\) is the Bohr magneton, \(\vec B\) is the applied magnetic field vector, and \(\mu_B\vec\sigma\) represents the magnetic moment. The \(\vec\sigma\) are the three Pauli matrices (\(\mathrm{tr}\,\vec\sigma = 0\)), which obey the following algebra:

\[ [\sigma_i,\sigma_j] = 2i\epsilon_{ijk}\sigma_k, \tag{14} \]

\[ \sigma_i\sigma_j = \delta_{ij}I + i\sum_k\epsilon_{ijk}\sigma_k, \tag{15} \]

\[ \big(\vec\sigma\cdot\hat{\vec a}\big)\big(\vec\sigma\cdot\hat{\vec b}\big) = \hat{\vec a}\cdot\hat{\vec b} + i\vec\sigma\cdot\big(\hat{\vec a}\times\hat{\vec b}\big), \tag{16} \]

where \(\hat{\vec a}\), \(\hat{\vec b}\) are any two vector operators that commute with \(\vec\sigma\). It must be emphasized that the third term of equation (13) is the Zeeman term, which is generated automatically by using feature (16), with the correct g-factor of g = 2 as contained in the Bohr magneton, rather than being introduced by hand as a phenomenological term, as is usually done.

The Pauli equation in a noncommutative phase-space is

\[ H(x^{nc},p^{nc})\,\psi(x^{nc},t) = H(x,p^{nc})\star\psi(x,t) = e^{\frac{i}{2}\theta_{ab}\partial_{x_a}\partial_{x_b}}H(x_a,p^{nc})\,\psi(x_b,t) = i\hbar\frac{\partial}{\partial t}\psi(x,t). \tag{17} \]

Here we achieve the noncommutativity in space using the Moyal \(\star\)-product, and the noncommutativity in phase through the Bopp shift. Using equation (9), we have

\[ H(x^{nc},p^{nc})\,\psi(x^{nc},t) = \Big\{H(x,p^{nc}) + \frac{i}{2}\theta_{ab}\,\partial_a H(x,p^{nc})\,\partial_b + \sum_{n=2}\frac{1}{n!}\Big(\frac{i}{2}\Big)^n\theta^{a_1b_1}\cdots\theta^{a_nb_n}\,\partial_{a_1}\cdots\partial_{a_n}H(x,p^{nc})\,\partial_{b_1}\cdots\partial_{b_n}\Big\}\psi. \tag{18} \]

We consider the case of a constant real magnetic field \(\vec B = (0,0,B) = B\vec e_3\) oriented along the (Oz) axis, which is often referred to as the Landau system. We have the following symmetric gauge:

\[ \vec A = \frac{\vec B\times\vec r}{2} = \frac{B}{2}(-y, x, 0), \qquad \text{with } A_0(x) = e\phi = 0. \tag{19} \]

Therefore, the derivatives in equation (18) shut down approximately at first order in \(\theta\), and the noncommutative Pauli equation in the presence of a uniform magnetic field can be written as follows:

\[ H(x,p^{nc})\star\psi(x) = \Big\{\frac{1}{2m}\Big(\vec p^{\,nc} - \frac{e}{c}\vec A(x)\Big)^2 + \mu_B\vec\sigma\cdot\vec B + \frac{ie}{4mc}\theta_{ab}\,\partial_a\Big(\frac{e}{c}\vec A^2 - 2\vec p^{\,nc}\cdot\vec A\Big)\partial_b\Big\}\psi(x) + O(\theta^2), \tag{20} \]

with \([\vec p^{\,nc}, \vec A] = 0\).
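As a quick sanity check of two ingredients used above, the following sympy sketch (illustrative, not from the paper) verifies the Pauli-matrix identity (16) for commuting vector components, and that the symmetric gauge (19) indeed gives \(\vec B = \vec\nabla\times\vec A = (0,0,B)\) with \(\vec\nabla\cdot\vec A = 0\).

```python
import sympy as sp

# Pauli matrices
s1 = sp.Matrix([[0, 1], [1, 0]])
s2 = sp.Matrix([[0, -sp.I], [sp.I, 0]])
s3 = sp.Matrix([[1, 0], [0, -1]])
sigma = [s1, s2, s3]

# Identity (16) for commuting (c-number) components a_i, b_i.
a = sp.symbols('a1:4'); b = sp.symbols('b1:4')
lhs = sum((ai*si for ai, si in zip(a, sigma)), sp.zeros(2, 2)) \
    * sum((bi*si for bi, si in zip(b, sigma)), sp.zeros(2, 2))
dot = sum(ai*bi for ai, bi in zip(a, b))
cross = sp.Matrix(a).cross(sp.Matrix(b))
rhs = dot*sp.eye(2) + sp.I*sum((ci*si for ci, si in zip(cross, sigma)),
                               sp.zeros(2, 2))
print(sp.simplify(lhs - rhs))   # zero matrix -> identity (16) holds

# Symmetric gauge of Eq. (19): A = (B x r)/2 = (B/2)(-y, x, 0).
x, y, z, B = sp.symbols('x y z B')
A = sp.Matrix([-B*y/2, B*x/2, 0])
r = (x, y, z)
curl = sp.Matrix([sp.diff(A[2], y) - sp.diff(A[1], z),
                  sp.diff(A[0], z) - sp.diff(A[2], x),
                  sp.diff(A[1], x) - sp.diff(A[0], y)])
div = sum(sp.diff(A[i], r[i]) for i in range(3))
print(curl.T, div)              # (0, 0, B) and 0
```

Note that identity (16) picks up ordering corrections when \(\hat{\vec a}\) and \(\hat{\vec b}\) are non-commuting operators; this is precisely how the Zeeman term of Eq. (13) arises from \((\vec\sigma\cdot\hat{\vec\pi})^2\).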
We now make use of the Bopp-shift transformation (4) in the momentum operator to obtain

\[ H(x^{nc},p^{nc})\,\psi(x^{nc},t) = \Big\{\frac{1}{2m}\Big(p_i + \frac{1}{2\hbar}\eta_{ij}x_j - \frac{e}{c}A_i\Big)^2 + \mu_B\vec\sigma\cdot\vec B - \frac{ie}{4mc}\theta_{ab}\,\partial_a\Big(2\Big(p_i + \frac{1}{2\hbar}\eta_{ij}x_j\Big)A_i - \frac{e}{c}\vec A^2\Big)\partial_b\Big\}\psi(x,t) = i\hbar\frac{\partial}{\partial t}\psi(x,t). \tag{21} \]

We rewrite the above equation in a more compact form:

\[ H(x,p^{nc})\star\psi(x,t) = \Big\{\frac{1}{2m}\Big(\vec p - \frac{e}{c}\vec A\Big)^2 - \frac{1}{2m}\big(\vec x\times\vec p\big)\cdot\vec\eta - \frac{1}{2m}\frac{e}{c\hbar}\big(\vec x\times\vec A(x)\big)\cdot\vec\eta + \frac{1}{8m\hbar^2}\eta_{ij}\eta_{\alpha\beta}x_jx_\beta + \mu_B\vec\sigma\cdot\vec B + \frac{e}{4\hbar mc}\Big(\vec\nabla\Big(2\vec p\cdot\vec A - \frac{1}{2\hbar}\big(\vec x\times\vec A(x)\big)\cdot\vec\eta - \frac{e}{c}\vec A^2\Big)\times\vec p\Big)\cdot\vec\theta\Big\}\psi(x,t). \tag{22} \]

We restrict ourselves to first order in the parameter \(\eta\) only; the reason behind this restriction is balance with the noncommutativity in space considered in the case of a constant magnetic field. Thus we now have

\[ H(x,p^{nc})\star\psi(x,t) = \Big\{\frac{1}{2m}\Big(\vec p - \frac{e}{c}\vec A\Big)^2 - \frac{1}{2m}\vec L\cdot\vec\eta - \frac{e}{2mc\hbar}\big(\vec x\times\vec A(x)\big)\cdot\vec\eta + \mu_B\vec\sigma\cdot\vec B + \frac{e}{4mc\hbar}\Big(\vec\nabla\Big(2\vec p\cdot\vec A(x) - \frac{1}{2\hbar}\big(\vec x\times\vec A(x)\big)\cdot\vec\eta - \frac{e}{c}\vec A^2\Big)\times\vec p\Big)\cdot\vec\theta\Big\}\psi(x,t) = i\hbar\frac{\partial}{\partial t}\psi(x,t). \tag{23} \]

The existence of a Pauli equation to all orders of the \(\theta\) parameter is explicitly tied to the magnetic field. In the case of a non-constant magnetic field, we introduce a function depending on x in the Landau gauge as \(A_2 = xBf(x)\), which gives us a non-constant magnetic field. The magnetic field can easily be calculated using the second equation of (12) as follows [33]:

\[ \vec B(x) = Bf(x)\,\vec e_3. \tag{24} \]

If we specify \(f(x)\), we obtain different classes of the non-constant magnetic field; if we take \(f(x) = 1\), we recover a constant magnetic field. Having equation (23) at hand, we calculate the probability density and the current density.

3.2. Deformed continuity equation

In the following, we calculate the current density that results from the Pauli equation (23), which describes a system of two coupled differential equations for \(\psi_1\) and \(\psi_2\). Putting

\[ Q_\eta = Q_\eta^* = \big(\vec x\times\vec A(x)\big)\cdot\vec\eta, \qquad Q_\theta = \Big(\vec\nabla\Big(2\vec p\cdot\vec A(x) - \frac{1}{2\hbar}Q_\eta - \frac{e}{c}\vec A^2(x)\Big)\times\vec p\Big)\cdot\vec\theta = \big(\vec\nabla V(x)\times\vec p\big)\cdot\vec\theta, \tag{25} \]

the noncommutative Pauli equation in the presence of a uniform magnetic field simply reads

\[ \Big\{\frac{1}{2m}\Big(-\hbar^2\vec\nabla^2 + \frac{ie\hbar}{c}\big(\vec\nabla\cdot\vec A + \vec A\cdot\vec\nabla\big) + \frac{e^2}{c^2}\vec A^2\Big) - \frac{\vec L\cdot\vec\eta}{2m} - \frac{eQ_\eta}{2mc\hbar} + \mu_B\vec\sigma\cdot\vec B + \frac{eQ_\theta}{4mc\hbar}\Big\}\psi = i\hbar\frac{\partial}{\partial t}\psi. \tag{26} \]

Knowing that \(\vec\sigma\) and \(\vec L\) are Hermitian, that the magnetic field is real, and that \(Q_\theta^*\) is the adjoint of \(Q_\theta\), the adjoint equation of (26) reads

\[ \frac{1}{2m}\Big\{-\hbar^2\vec\nabla^2\psi^\dagger - \frac{ie\hbar}{c}\big(\vec\nabla\cdot\vec A + \vec A\cdot\vec\nabla\big)\psi^\dagger + \frac{e^2}{c^2}\vec A^2\psi^\dagger\Big\} - \frac{\vec L\cdot\vec\eta}{2m}\psi^\dagger - \frac{eQ_\eta}{2mc\hbar}\psi^\dagger + \mu_B\vec\sigma\cdot\vec B\,\psi^\dagger + \frac{e}{4mc\hbar}\psi^\dagger Q_\theta^* = -i\hbar\frac{\partial\psi^\dagger}{\partial t}. \tag{27} \]

Here \(*\) and \(\dagger\) stand for the complex conjugation of the potentials and operators, and for the adjoint of the wave functions, respectively. To find the continuity equation, we multiply equation (26) from the left by \(\psi^\dagger\) and equation (27) from the right by \(\psi\); the subtraction of these equations yields

\[ -\frac{\hbar^2}{2m}\Big\{\psi^\dagger\vec\nabla^2\psi - \big(\vec\nabla^2\psi^\dagger\big)\psi\Big\} + \frac{ie\hbar}{2mc}\Big\{\psi^\dagger\big(\vec\nabla\cdot\vec A + \vec A\cdot\vec\nabla\big)\psi + \big[\big(\vec\nabla\cdot\vec A + \vec A\cdot\vec\nabla\big)\psi^\dagger\big]\psi\Big\} + \frac{e}{4mc\hbar}\big(\psi^\dagger Q_\theta\psi - \psi^\dagger Q_\theta^*\psi\big) = i\hbar\Big(\psi^\dagger\frac{\partial}{\partial t}\psi + \psi\frac{\partial}{\partial t}\psi^\dagger\Big). \tag{28} \]

After some minor simplifications, we have

\[ -\frac{\hbar}{2m}\,\mathrm{div}\Big\{\psi^\dagger\vec\nabla\psi - \psi\vec\nabla\psi^\dagger\Big\} + \frac{ie}{mc}\,\mathrm{div}\big\{\vec A\,\psi^\dagger\psi\big\} + \frac{e}{4mc\hbar^2}\big(\psi^\dagger Q_\theta\psi - \psi^\dagger Q_\theta^*\psi\big) = i\frac{\partial}{\partial t}\psi^\dagger\psi. \tag{29} \]

This will be recognized as the deformed continuity equation. The obtained equation (29) contains a new quantity, the deformation, arising from the effect of the phase-space noncommutativity on the Pauli equation.
The third term on the left-hand side, which is the deformation quantity, can be simplified as follows:

\[ \frac{ie}{4mc\hbar^2}\big(\psi^\dagger Q_\theta\psi - \psi^\dagger Q_\theta^*\psi\big) = \frac{ie}{4mc\hbar^2}\Big(\psi^\dagger\big(V(x)\star\psi\big) - \big(\psi^\dagger\star V(x)\big)\psi\Big). \tag{30} \]

Using the property \((\vec a\times\vec b)\cdot\vec c = \vec a\cdot(\vec b\times\vec c) = \vec b\cdot(\vec c\times\vec a)\), and paying attention to the order (\(\psi^\dagger\) is the first and \(\psi\) the second factor), we have

\[ \frac{ie}{4mc\hbar^2}\big(\psi^\dagger Q_\theta\psi - \psi^\dagger Q_\theta^*\psi\big) = \frac{e}{8mc\hbar^2}\,\mathrm{div}\,V(x)\Big(\vec\theta\times\vec\nabla\big(\psi^\dagger\psi\big)\Big) = \mathrm{div}\,\vec\xi^{\,nc}. \tag{31} \]

Using the following identity also gives the same equation as above [6]:

\[ \upsilon^\dagger\big(\vec\pi\tau\big) - \big(\vec\pi\upsilon\big)^\dagger\tau = -i\hbar\vec\nabla\big(\upsilon^\dagger\tau\big), \tag{32} \]

where \(\upsilon\), \(\tau\) are arbitrary two-component spinors. Note that \(\vec A\) does not appear on the right-hand side of the identity, and that this identity is related to the fact that \(\vec\pi\) is Hermitian. It is evident that the noncommutativity affects the current density, and the deformation quantity appears as a correction to it. The deformed current density satisfies current conservation, which means that the continuity equation is conserved in the noncommutative phase-space. Equation (29) may be contracted as

\[ \frac{\partial\rho}{\partial t} + \vec\nabla\cdot\vec J^{\,nc} = 0, \tag{33} \]

where

\[ \rho = \psi^\dagger\psi = |\psi|^2 \tag{34} \]

is the probability density and

\[ \vec J^{\,nc} = \vec J + \vec\xi^{\,nc} = \frac{-i\hbar}{2m}\Big\{\psi^\dagger\vec\nabla\psi - \psi\vec\nabla\psi^\dagger\Big\} - \frac{e}{mc}\big\{\vec A\,\psi^\dagger\psi\big\} + \vec\xi^{\,nc} \tag{35} \]

is the deformed current density of the electrons. The deformation quantity is

\[ \vec\xi^{\,nc} = \frac{e}{8mc\hbar^2}\,V(x)\Big(\vec\theta\times\vec\nabla\big(\psi^\dagger\psi\big)\Big) = \frac{e}{8mc\hbar^2}\Big(2\vec p\cdot\vec A - \frac{1}{2\hbar}Q_\eta - \frac{e}{c}\vec A^2\Big)\Big(\vec\theta\times\vec\nabla\big(\psi^\dagger\psi\big)\Big). \tag{36} \]

Furthermore, the deformed continuity equation to all orders of \(\theta\) is proportional to the magnetic field \(\vec B\). In fact, one can explicitly calculate the conserved current to all orders of \(\theta\) in the case of a non-constant magnetic field; using equation (24), we have

\[ \frac{\partial\rho}{\partial t} + \vec\nabla\cdot\vec J + \frac{ie}{4mc\hbar^2}\Big\{\psi^\dagger\big(V(x)\star\psi\big) - \big(\psi^\dagger\star V(x)\big)\psi\Big\} = 0. \tag{37} \]

We calculate the nth-order term in the general deformed continuity equation (37) as follows:

\[ \psi^\dagger\big(V(x)\star\psi\big) - \big(\psi^\dagger\star V(x)\big)\psi\Big|_{n\mathrm{th}} = \frac{1}{n!}\Big(\frac{i}{2}\Big)^n\theta^{a_1b_1}\cdots\theta^{a_nb_n}\Big(\psi^\dagger\,\partial_{a_1}\cdots\partial_{a_n}V(x)\,\partial_{b_1}\cdots\partial_{b_n}\psi - \partial_{a_1}\cdots\partial_{a_n}\psi^\dagger\big(\partial_{b_1}\cdots\partial_{b_n}V(x)\big)\psi + (-1)^n\,\mathrm{c.c.}\Big). \tag{38} \]

We note the absence of the magnetization current term in equation (35), as in the commutative case, where this was pointed out by the authors of [1, 4, 8, 13, 16, 27]; at first, they attempted to cover this deficiency by explaining how to derive this additional term from the non-relativistic limit of the relativistic Dirac probability current density. Then Nowakowski and others [6] provided a superb explanation of how to extract this term through the non-relativistic Pauli equation itself. It is known that, in a commutative background, the magnetization current \(\vec J_m\) from the probability current of the Pauli equation is proportional to \(\vec\nabla\times(\psi^\dagger\vec\sigma\psi)\). The existence of such an additional term is important, and it should be discussed when talking about the probability current of spin-1/2 particles. In the following, we derive the magnetization current in a noncommutative background without changing the continuity equation, and examine whether such an additional term is affected by the noncommutativity or not.

3.3. Derivation of the magnetization current

At first, it must be clarified that Nowakowski and others (2011) in [4, 6] derived the non-relativistic current density for a spin-1/2 particle using the minimally coupled Pauli equation. In contrast, Wilkes, J. M.
(2020), in [39], derived the non-relativistic current density for a free spin-1/2 particle directly from the free Pauli equation. However, we show here that the current density can be derived from the minimally coupled Pauli equation in a noncommutative phase-space. Starting with the noncommutative minimally coupled Pauli equation written in the form

\[ H^{nc}_{\mathrm{Pauli}}\psi = \frac{1}{2m}\big(\vec\sigma\cdot\hat{\vec\pi}^{nc}\big)^2\psi = i\hbar\frac{\partial}{\partial t}\psi, \tag{39} \]

we multiply the above equation from the left by \(\psi^\dagger\), and the adjoint equation of (39) from the right by \(\psi\); the subtraction of these equations yields the following continuity equation:

\[ \frac{1}{2m}\Big\{\Big(\big(\vec\sigma\cdot\hat{\vec\pi}^{nc}\big)^2\psi\Big)^\dagger\psi - \psi^\dagger\big(\vec\sigma\cdot\hat{\vec\pi}^{nc}\big)^2\psi\Big\} = i\hbar\Big(\psi^\dagger\frac{\partial\psi}{\partial t} + \psi\frac{\partial\psi^\dagger}{\partial t}\Big). \tag{40} \]

Noting that the noncommutativity of \(\pi^{nc}\) leads us to express the two terms as follows:

\[ \frac{i}{2m\hbar}\sum_{i,j}\Big\{\big(\hat\pi^{nc}_i\hat\pi^{nc}_j\psi\big)^\dagger\sigma_j\sigma_i\psi - \psi^\dagger\sigma_i\sigma_j\big(\hat\pi^{nc}_i\hat\pi^{nc}_j\psi\big)\Big\} = \frac{\partial\rho}{\partial t}, \tag{41} \]

while with only \(p_i\) we would have no reason for preferring \(p_ip_j\psi\) over \(p_jp_i\psi\). It is easy to verify that the identity (32) remains valid for \(\vec\pi^{nc}\), owing to the fact that \(\vec\pi^{nc}\) is Hermitian. Therefore, through identity (32), we have

\[ -\frac{1}{2m}\sum_{i,j}\nabla_i\Big\{\big(\hat\pi^{nc}_j\psi\big)^\dagger\sigma_j\sigma_i\psi + \psi^\dagger\sigma_i\sigma_j\big(\hat\pi^{nc}_j\psi\big)\Big\} + \frac{i}{2m\hbar}\sum_{i,j}\Big\{\big(\hat\pi^{nc}_j\psi\big)^\dagger\sigma_j\sigma_i\big(\hat\pi^{nc}_i\psi\big) - \big(\hat\pi^{nc}_i\psi\big)^\dagger\sigma_i\sigma_j\big(\hat\pi^{nc}_j\psi\big)\Big\} = \frac{\partial\rho}{\partial t}, \tag{42} \]

and then

\[ -\frac{1}{2m}\sum_{i,j}\nabla_i\Big\{\big(\hat\pi^{nc}_j\psi\big)^\dagger\sigma_j\sigma_i\psi + \psi^\dagger\sigma_i\sigma_j\big(\hat\pi^{nc}_j\psi\big)\Big\} = \frac{\partial\rho}{\partial t}, \tag{43} \]

knowing that the second sum in equation (42) gives zero upon swapping i and j in one of the sums. The probability current vector from the above continuity equation is then

\[ J_i = \frac{1}{2m}\sum_j\Big\{\big(\hat\pi^{nc}_j\psi\big)^\dagger\sigma_j\sigma_i\psi + \psi^\dagger\sigma_i\sigma_j\big(\hat\pi^{nc}_j\psi\big)\Big\}. \tag{44} \]

Using property (15), equation (44) becomes

\[ J_i = \frac{1}{2m}\Big\{\big(\hat\pi^{nc}_i\psi\big)^\dagger\psi + \psi^\dagger\big(\hat\pi^{nc}_i\psi\big) + i\sum_{j,k}\Big[\epsilon_{jik}\big(\hat\pi^{nc}_j\psi\big)^\dagger\sigma_k\psi + \epsilon_{ijk}\psi^\dagger\sigma_k\big(\hat\pi^{nc}_j\psi\big)\Big]\Big\}, \tag{45} \]

with \(\epsilon_{jik} = -\epsilon_{ijk}\). Using identity (32) once more, we find (similarly to the investigation in [6] for the commutative phase-space)

\[ J_i = \frac{1}{2m}\Big[\big(\hat p^{nc}_i\psi\big)^\dagger\psi - \frac{e}{c}\big(A^{nc}_i\psi\big)^\dagger\psi + \psi^\dagger\hat p^{nc}_i\psi - \frac{e}{c}\psi^\dagger A^{nc}_i\psi\Big] + \frac{\hbar}{2m}\sum_{j,k}\epsilon_{ijk}\nabla_j\big(\psi^\dagger\sigma_k\psi\big). \tag{46} \]

On the right-hand side of the above equation, the first term is interpreted as the noncommutative current vector \(\vec J^{\,nc}\) given by equation (36), and the second term is the requested additional term, namely the magnetization current \(\vec J_m\), where

\[ J_{m,i} = \frac{\hbar}{2m}\Big(\vec\nabla\times\big(\psi^\dagger\vec\sigma\psi\big)\Big)_i. \tag{47} \]

Furthermore, \(\vec J_m\) can also be shown to be part of the conserved Noether current [40], resulting from the invariance of the Pauli Lagrangian under the global phase transformation U(1). What can be concluded here is that the magnetization current is not affected by the noncommutativity, perhaps because the spin operator is not affected by the noncommutativity. This is in contrast to what was found above for the current density, which showed a strong influence of the noncommutativity.

4. Noncommutative semi-classical partition function

In this part of our work, we investigate the magnetization and the magnetic susceptibility of our Pauli system using the partition function in a noncommutative phase-space. We concentrate first on the calculation of the semi-classical partition function. The studied system is semi-classical: it is not completely classical, but contains a quantum interaction concerning the spin. Therefore, the noncommutative partition function is separable into two independent parts as follows:

\[ Z^{nc} = Z^{nc}_{\mathrm{clas}}\,Z_{\mathrm{ncl}}, \tag{48} \]

where \(Z_{\mathrm{ncl}}\) is the non-classical part of the partition function.
To study our noncommutative classical partition function, we assume that the passage between noncommutative classical mechanics and noncommutative quantum mechanics can be realized through the following generalized Dirac quantization condition [41–43]:

\[ \{f,g\} = \frac{1}{i\hbar}[f,g], \tag{49} \]

where f, g stand for the operators associated with the classical observables and \(\{\,,\,\}\) stands for the Poisson bracket. Using the condition above, we obtain from eq. (5)

\[ \{x^{nc}_j, x^{nc}_k\} = \frac{1}{2}\epsilon_{jkl}\theta_l, \qquad \{p^{nc}_j, p^{nc}_k\} = \frac{1}{2}\epsilon_{jkl}\eta_l, \qquad \{x^{nc}_j, p^{nc}_k\} = \delta_{jk} + \frac{1}{4\hbar^2}\theta_{jl}\eta_{kl} \quad (j,k,l = 1,2,3). \tag{50} \]

It is worth mentioning that, in the classical limit, \(\theta\eta/4\hbar^2 \ll 1\) (check ref. [42]), thus \(\{x^{nc}_j, p^{nc}_k\} = \delta_{jk}\). Now, based on the proposal that the noncommutative observables \(F^{nc}\) corresponding to the commutative ones \(F(x,p)\) can be defined by [44, 45]

\[ F^{nc} = F(x^{nc},p^{nc}), \tag{51} \]

and for non-interacting particles, the classical partition function in the canonical ensemble in a noncommutative phase-space is given by the following formula [41, 42]:

\[ Z^{nc}_{\mathrm{clas}} = \frac{1}{N!\,\big(2\pi\tilde\hbar\big)^{3N}}\int e^{-\beta H^{nc}_{\mathrm{clas}}(x,p)}\,d^{3N}x^{nc}\,d^{3N}p^{nc}, \tag{52} \]

which is written for N particles. \(1/N!\) is the Gibbs correction factor, which accounts for indistinguishability, meaning that there are N! ways of arranging N particles at N sites. \(\tilde\hbar \sim \Delta x^{nc}\Delta p^{nc}\), and \(1/\tilde\hbar^3\) is the factor that makes the volume of the noncommutative phase-space dimensionless. \(\beta\) is defined as \(1/k_BT\), where \(k_B = 1.38\times10^{-23}\,\mathrm{J\,K^{-1}}\) is the Boltzmann constant. The Helmholtz free energy is

\[ F = -\frac{1}{\beta}\ln Z, \tag{53} \]

from which we may derive the magnetization as

\[ \langle M\rangle = -\frac{\partial F}{\partial B}. \tag{54} \]

For a single particle, the noncommutative classical partition function is

\[ Z^{nc}_{\mathrm{clas},1} = \frac{1}{\tilde h^3}\int e^{-\beta H^{nc}_{\mathrm{clas}}(x,p)}\,d^{3}x^{nc}\,d^{3}p^{nc}, \tag{55} \]

where \(d^3\) is a shorthand notation serving as a reminder that x and p are vectors in a three-dimensional phase-space. The relation between equations (52) and (55) is given by

\[ Z^{nc}_{\mathrm{clas}} = \frac{\big(Z^{nc}_{\mathrm{clas},1}\big)^N}{N!}. \tag{56} \]

Using equation (6), we have

\[ d^3x^{nc}\,d^3p^{nc} = \Big(1 - \frac{\theta\eta}{8\hbar^2}\Big)\,d^3x\,d^3p. \tag{57} \]

Furthermore, using the uncertainty principle and according to the third equation of (4), we deduce

\[ \tilde h^3 = h^3\Big(1 + \frac{3\theta\eta}{4\hbar^2}\Big) + O\big(\theta^2\eta^2\big). \tag{58} \]

This is unlike other works, such as [41], where the researchers used a different formula for the Planck constant, \(\tilde h_{11} = \tilde h_{22} \neq \tilde h_{33}\), which led to a different formula for \(\tilde h^3\). For an electron with spin in interaction with an electromagnetic potential, once the magnetic field \(\vec B\) is in the z-direction, and by equation (19), bearing in mind that \([\vec p^{\,nc},\vec A^{nc}] = 0\), the noncommutative Pauli Hamiltonian from equation (23) takes, for the sake of simplicity, the form

\[ H_{\mathrm{Pauli}}(x^{nc},p^{nc}) = \frac{1}{2m}\Big\{\big(\vec p^{\,nc}\big)^2 - 2\frac{e}{c}\vec p^{\,nc}\cdot\vec A^{nc} + \Big(\frac{e}{c}\Big)^2\big(\vec A^{nc}\big)^2\Big\} + \mu_B\hat\sigma_zB. \tag{59} \]

We split the noncommutative Pauli Hamiltonian as \(H^{nc}_{\mathrm{Pauli}} = H^{nc}_{\mathrm{cla}} + H_{\mathrm{ncl},\sigma}\), with \(H_{\mathrm{ncl},\sigma} = \mu_B\hat\sigma_zB\). It is easy to verify that

\[ \big(\vec p^{\,nc}\big)^2 = (p^{nc}_x)^2 + (p^{nc}_y)^2 + (p^{nc}_z)^2 = p_x^2 + p_y^2 + p_z^2 - \frac{\eta}{2\hbar}L_z + \frac{\eta^2}{16\hbar^2}\big(x^2 + y^2\big), \tag{60} \]

\[ \vec p^{\,nc}\cdot\vec A = p^{nc}_xA^{nc}_x + p^{nc}_yA^{nc}_y = \frac{B}{2}\Big\{-\frac{\theta}{4\hbar}\big(p_x^2 + p_y^2\big) - \frac{\eta}{4\hbar}\big(y^2 + x^2\big) + \Big(1 + \frac{\theta\eta}{16\hbar^2}\Big)L_z\Big\}, \tag{61} \]

\[ \big(\vec A^{nc}\big)^2 = (A^{nc}_x)^2 + (A^{nc}_y)^2 = \frac{B^2}{4}\Big\{x^2 + y^2 - \frac{\theta}{2\hbar}L_z + \frac{\theta^2}{16\hbar^2}\big(p_x^2 + p_y^2\big)\Big\}. \tag{62} \]
Using the three equations above, our noncommutative classical Hamiltonian becomes

\[ H^{nc}_{\mathrm{cla}} = \frac{1}{2\tilde m}\big(p_x^2 + p_y^2\big) + \frac{1}{2m}p_z^2 - \tilde\omega L_z + \frac{1}{2}\tilde m\tilde\omega^2\big(x^2 + y^2\big), \tag{63} \]

where \(L_z = p_yx - p_xy = (\vec x\times\vec p)_z\), and

\[ \tilde m = \frac{m}{\big(1 + \frac{eB\theta}{8c\hbar}\big)^2}, \qquad \tilde\omega = \frac{c\eta + 2e\hbar B}{4c\hbar\,\tilde m\,\big(1 + \frac{eB\theta}{8c\hbar}\big)}, \qquad \frac{1}{2}\tilde m\tilde\omega^2 = \frac{1}{2m}\Big(\frac{\eta eB}{4c\hbar} + \frac{\eta^2}{16\hbar^2} + \frac{e^2B^2}{4c^2}\Big). \tag{64} \]

Now, following the definition given in equation (55), we express the single-particle noncommutative classical partition function as

\[ Z^{nc}_{\mathrm{clas},1} = \frac{1}{\tilde h^3}\int e^{-\beta\big[\frac{1}{2\tilde m}(p_x^2+p_y^2) + \frac{1}{2m}p_z^2 - \tilde\omega L_z + \frac{1}{2}\tilde m\tilde\omega^2(x^2+y^2)\big]}\,d^3x^{nc}\,d^3p^{nc}. \tag{65} \]

It should be noted that we want to factorize our Hamiltonian into momentum and position terms. This is not always possible when there are matrices (or operators) in the exponent; however, within the classical limit it is possible. Otherwise, to separate the operators in the exponent, we would use the Baker-Campbell-Hausdorff (BCH) formula, given by (first few terms)

\[ e^{\hat A + \hat B} = e^{\hat A}\,e^{\hat B}\,e^{-\frac{1}{2}[\hat A,\hat B]}\,e^{\frac{1}{6}\big(2[\hat A,[\hat A,\hat B]] + [\hat B,[\hat A,\hat B]]\big)}\cdots \tag{66} \]

We can now factor the exponent:

\[ Z^{nc}_{\mathrm{clas},1} = \frac{1}{\tilde h^3}\int e^{-\beta\big[\frac{1}{2\tilde m}(p_x^2+p_y^2) + \frac{1}{2m}p_z^2\big]}\,e^{-\beta\big[\frac{1}{2}\tilde m\tilde\omega^2(x^2+y^2)\big]}\,e^{\beta\tilde\omega L_z}\,d^3p^{nc}\,d^3x^{nc}. \tag{67} \]

We expand the exponentials containing \(\tilde\omega\), and by considering terms up to second order in \(\tilde\omega\) we obtain

\[ Z^{nc}_{\mathrm{clas},1} = \frac{1}{\tilde h^3}\int e^{-\frac{\beta}{2}\big[\frac{p_x^2+p_y^2}{\tilde m} + \frac{p_z^2}{m}\big]}\Big(1 + \beta\tilde\omega L_z + \frac{1}{2}\beta^2\tilde\omega^2L_z^2\Big)\Big(1 - \beta\tilde\omega^2\frac{\tilde m}{2}\big(x^2+y^2\big)\Big)\,d^3p^{nc}\,d^3x^{nc}, \tag{68} \]

and therefore we have the appropriate expression for \(Z^{nc}_{\mathrm{clas},1}\):

\[ Z^{nc}_{\mathrm{clas},1} = \frac{1 - \frac{7\theta\eta}{8\hbar^2}}{h^3}\int e^{-\frac{\beta}{2}\big[\frac{p_x^2+p_y^2}{\tilde m} + \frac{p_z^2}{m}\big]}\,d^3p\,d^3x + \frac{\big(1 - \frac{7\theta\eta}{8\hbar^2}\big)\beta\tilde\omega}{h^3}\int e^{-\frac{\beta}{2}\big[\frac{p_x^2+p_y^2}{\tilde m} + \frac{p_z^2}{m}\big]}L_z\,d^3p\,d^3x + \frac{\big(1 - \frac{7\theta\eta}{8\hbar^2}\big)\beta^2\tilde\omega^2}{2h^3}\int e^{-\frac{\beta}{2}\big[\frac{p_x^2+p_y^2}{\tilde m} + \frac{p_z^2}{m}\big]}L_z^2\,d^3p\,d^3x - \frac{\big(1 - \frac{7\theta\eta}{8\hbar^2}\big)\beta\tilde\omega^2\tilde m}{2h^3}\int e^{-\frac{\beta}{2}\big[\frac{p_x^2+p_y^2}{\tilde m} + \frac{p_z^2}{m}\big]}\big(x^2+y^2\big)\,d^3p\,d^3x. \tag{69} \]

On the right-hand side of the above equation, it is easy to check that the second integral goes to zero and that the third and last integrals cancel each other, and thus we obtain

\[ Z^{nc}_{\mathrm{clas},1} = \frac{1 - \frac{7\theta\eta}{8\hbar^2}}{h^3}\int e^{-\frac{\beta}{2}\big[\frac{p_x^2+p_y^2}{\tilde m} + \frac{p_z^2}{m}\big]}\,d^3p\,d^3x. \tag{70} \]

Using the Gaussian integral \(\int e^{-ax^2}dx = \sqrt{\pi/a}\), we have

\[ Z^{nc}_{\mathrm{clas},1} = \frac{1 - \frac{7\theta\eta}{8\hbar^2}}{h^3}\int d^3x\int e^{-\frac{\beta}{2}\big[\frac{p_x^2+p_y^2}{\tilde m} + \frac{p_z^2}{m}\big]}\,d^3p = \frac{V}{\lambda^3}\,\frac{1 - \frac{7\theta\eta}{8\hbar^2}}{\big(1 + \frac{eB\theta}{8c\hbar}\big)^2}, \tag{71} \]

where \(V\) and \(\lambda = h\,(2\pi mk_BT)^{-1/2}\) are the volume and the thermal de Broglie wavelength, respectively. The non-classical partition function, using \(H_{\mathrm{ncl},\sigma}\), is

\[ Z_{\mathrm{ncl}} = Z^N_{\mathrm{ncl},1} = \Big(\sum_{\sigma_z = \pm1}e^{\beta\mu_B\sigma_zB}\Big)^N = 2^N\cosh^N(\beta\mu_BB). \tag{72} \]

Finally, the Pauli partition function for a system of N particles in a three-dimensional noncommutative phase-space is

\[ Z^{nc} = \frac{(2V)^N}{\lambda^{3N}N!}\,\frac{\big(1 - \frac{7\theta\eta}{8\hbar^2}\big)^N\cosh^N(\beta\mu_BB)}{\big(1 + \frac{eB\theta}{8c\hbar}\big)^{2N}}. \tag{73} \]

In the limit of vanishing noncommutativity, i.e. \(\theta\to0\), \(\eta\to0\), the above expression for \(Z^{nc}\) tends to the result in the usual commutative phase-space,

\[ Z = \frac{(2V)^N}{\lambda^{3N}N!}\cosh^N(\beta\mu_BB). \tag{74} \]

Using formulas (53) and (54), we find the magnetization in the noncommutative and commutative phase-spaces. Thus

\[ F^{nc} = -\frac{N}{\beta}\ln\frac{2V}{\lambda^3}\,\frac{\big(1 - \frac{7\theta\eta}{8\hbar^2}\big)\cosh(\beta\mu_BB)}{\big(1 + \frac{eB\theta}{8c\hbar}\big)^2} + \frac{1}{\beta}\ln N!, \tag{75} \]

where we can use \(\ln N! = N\ln N - N\) (Stirling's formula) to simplify further. The noncommutative magnetization is

\[ \langle M^{nc}\rangle = -\frac{\partial F^{nc}}{\partial B} = 2\frac{N}{\beta}\,\frac{e\theta}{8c\hbar + eB\theta} + N\mu_B\tanh(\beta\mu_BB). \tag{76} \]

The commutative magnetization is

\[ \langle M\rangle = -\frac{\partial F}{\partial B} = N\mu_B\tanh(\beta\mu_BB), \tag{77} \]

and it is obvious that \(\langle M^{nc}\rangle|_{\theta=0} = \langle M\rangle\).
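As a small numerical illustration of the relation \(\langle M\rangle = -\partial F/\partial B\) used above, the following sketch (ours, with assumed parameter values marked in the comments) takes the spin part \(Z_\sigma = 2\cosh(\beta\mu_BB)\) of Eq. (72), builds \(F = -(N/\beta)\ln Z_\sigma\), and checks by central finite differences that the commutative result (77), \(\langle M\rangle = N\mu_B\tanh(\beta\mu_BB)\), is recovered.

```python
import math

mu_B = 9.27e-24   # Bohr magneton [J/T], as quoted below Eq. (13)
k_B  = 1.38e-23   # Boltzmann constant [J/K], as quoted below Eq. (52)
N    = 1.0e20     # particle number (arbitrary illustrative value)
T    = 300.0      # temperature [K] (assumed)
beta = 1.0 / (k_B * T)

def free_energy_spin(B):
    """Spin part of the free energy, F = -(N/beta)*ln(2*cosh(beta*mu_B*B)),
    combining Eqs. (53) and (72)."""
    return -(N / beta) * math.log(2.0 * math.cosh(beta * mu_B * B))

def magnetization_fd(B, h=1e-4):
    """<M> = -dF/dB by central finite differences, Eq. (54)."""
    return -(free_energy_spin(B + h) - free_energy_spin(B - h)) / (2.0 * h)

B = 1.0  # field strength [T] (assumed)
closed_form = N * mu_B * math.tanh(beta * mu_B * B)
print(magnetization_fd(B), closed_form)   # the two values should agree
```

The same finite-difference check can be applied to the full noncommutative free energy (75) to probe the \(\theta\)-dependent correction in Eq. (76), once values for \(\theta\), \(\eta\), \(V\) and \(\lambda\) are chosen.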
We may derive the magnetic susceptibility of the electrons, \(\chi = \frac{1}{V}\frac{\partial\langle M\rangle}{\partial B}\), in a noncommutative phase-space using the magnetization (76):

\[ \chi^{nc} = -2\frac{N}{V\beta}\,\frac{(e\theta)^2}{(8c\hbar + eB\theta)^2} + \frac{N}{V}\beta\mu_B^2\big(1 - \tanh^2(\beta\mu_BB)\big), \tag{78} \]

where the commutative magnetic susceptibility \(\chi = \chi^{nc}(\theta=0)\) is

\[ \chi = \frac{N}{V}\beta\mu_B^2\big(1 - \tanh^2(\beta\mu_BB)\big). \tag{79} \]

Finally, we conclude with the following special cases. Let us first consider B = 0; then we have

\[ \langle M\rangle = 0; \qquad \langle M^{nc}\rangle = 2\frac{N}{\beta}\frac{e\theta}{8c\hbar}; \qquad \chi^{nc} = -2\frac{N}{V\beta}\frac{(e\theta)^2}{(8c\hbar)^2} + \frac{N}{V}\beta\mu_B^2. \tag{80} \]

For \(B\to\infty\) and T = const, \(\tanh(\beta\mu_BB) = 1\), and we obtain

\[ \langle M^{nc}\rangle = \langle M\rangle = N\mu_B; \qquad \chi^{nc} = \chi \sim 0. \tag{81} \]

Likewise, when \(T\to\infty\), \(\beta\to0\) (with B constant), \(\tanh(\beta\mu_BB) = 0\), and we obtain

\[ \langle M^{nc}\rangle\to\infty,\quad \langle M\rangle = 0; \qquad \chi^{nc}\to\infty,\quad \chi = 0. \tag{82} \]

Armed with the partition function Z, we can compute other important thermal quantities, such as the average energy \(U = -\frac{\partial}{\partial\beta}\ln Z\), the entropy \(S = \ln Z - \beta\frac{\partial}{\partial\beta}\ln Z\) and the specific heat \(C = \beta^2\frac{\partial^2}{\partial\beta^2}\ln Z\), in both the noncommutative and commutative phase-space cases.

5. Conclusion

In this work, we have studied the three-dimensional Pauli equation and the corresponding continuity equation for a spin-1/2 particle in the presence of an electromagnetic field in a noncommutative phase-space, considering both constant and non-constant magnetic fields. It is shown that the non-constant magnetic field lifts the order of the noncommutativity parameter in both the Pauli equation and the corresponding continuity equation. Given the known absence of the magnetization current term in the continuity equation, even in the noncommutative phase-space, as confirmed by our calculations, we extracted the magnetization current term from the Pauli equation itself without modifying the continuity equation. Furthermore, we found that the current density is conserved, which means that the deformed continuity equation is conserved. Using the classical treatment (within the classical limit), the magnetization and the magnetic susceptibility were explicitly determined in both commutative and noncommutative phase-spaces through a semi-classical partition function of the Pauli system for one-particle and N-particle systems in three dimensions. Furthermore, to see the behaviour of these deformed quantities, we examined some special cases in the commutative and noncommutative phase-spaces. Finally, we can say that we successfully examined the influence of the noncommutativity on the problems in question, where the noncommutativity was introduced using both the three-dimensional Bopp-shift transformation and the Moyal-Weyl product; the noncommutative corrections to the non-relativistic Pauli equation and the continuity equation are valid up to all orders in the noncommutative parameter. The limits of our results are in good agreement with those obtained by other authors, as discussed in the literature.

Acknowledgements

The author would like to thank Dr. Mojtaba Najafizadeh for his valuable discussion on the classical partition function.

References
[1] W. Greiner. Quantum Mechanics: An Introduction. Springer, Berlin, 4th edn., 2001.
[2] E. Ikenberry. Quantum Mechanics for Mathematicians and Physicists. Oxford, New York, 1st edn., 1962.
[3] A. Galindo, C. Sanchez del Rio. Intrinsic magnetic moment as a nonrelativistic phenomenon. American Journal of Physics 29(9):582–584, 1961. doi:10.1119/1.1937856.
[4] M. Nowakowski. The quantum mechanical current of the Pauli equation. American Journal of Physics 67(10):916–919, 1999. doi:10.1119/1.19149.
[5] G. W.
Parker. Spin current density and the hyperfine interaction in hydrogen. American Journal of Physics 52(1):36–39, 1984. doi:10.1119/1.13846.
[6] M. S. Shikakhwa, S. Turgut, N. K. Pak. Derivation of the magnetization current from the non-relativistic Pauli equation: A comment on "The quantum mechanical current of the Pauli equation" by Marek Nowakowski. American Journal of Physics 79(11):1177–1179, 2011. doi:10.1119/1.3630931.
[7] W. B. Hodge, S. V. Migirditch, W. C. Kerr. Electron spin and probability current density in quantum mechanics. American Journal of Physics 82(7):681–690, 2014. doi:10.1119/1.4868094.
[8] J. J. Sakurai. Advanced Quantum Mechanics. Reading, Mass.: Addison-Wesley Pub. Co., 1967.
[9] I. Haouam, L. Chetouani. The Foldy-Wouthuysen transformation of the Dirac equation in noncommutative phase-space. Journal of Modern Physics 9(11):2021–2034, 2018. doi:10.4236/jmp.2018.911127.
[10] I. Haouam. The phase-space noncommutativity effect on the large and small wavefunction components approach at Dirac equation. Open Access Library Journal 5:e4108, 2018. doi:10.4236/oalib.1104108.
[11] J. D. Bjorken, S. Drell. Relativistic Quantum Mechanics. McGraw-Hill, New York, 1964.
[12] L. L. Foldy, S. A. Wouthuysen. On the Dirac theory of spin 1/2 particles and its non-relativistic limit. Physical Review 78(1):29–36, 1950. doi:10.1103/physrev.78.29.
[13] W. Pauli. Zur Quantenmechanik des magnetischen Elektrons. Zeitschrift für Physik 43:601–623, 1927.
[14] S. Goudsmit, G. E. Uhlenbeck. Opmerking over de spectra van waterstof en helium. Physica 5:266–270, 1925.
[15] G. E. Uhlenbeck, S. Goudsmit. Ersetzung der Hypothese vom unmechanischen Zwang durch eine Forderung bezüglich des inneren Verhaltens jedes einzelnen Elektrons. Die Naturwissenschaften 13(47).
[16] P. A. M. Dirac, R. H. Fowler. The quantum theory of the electron. Proceedings of the Royal Society of London Series A, Containing Papers of a Mathematical and Physical Character 117(778):610–624, 1928. doi:10.1098/rspa.1928.0023.
[17] A. Connes. Non-commutative differential geometry. Publications Mathématiques de l'Institut des Hautes Études Scientifiques 62(1):41–144, 1985. doi:10.1007/bf02698807.
[18] S. L. Woronowicz. Twisted SU(2) group. An example of a non-commutative differential calculus. Publications of the Research Institute for Mathematical Sciences 23(1):117–181, 1987. doi:10.2977/prims/1195176848.
[19] A. Connes, M. R. Douglas, A. Schwarz. Noncommutative geometry and matrix theory. Journal of High Energy Physics 1998(JHEP02):003, 1998. doi:10.1088/1126-6708/1998/02/003.
[20] M. M. Sheikh-Jabbari. C, P, and T invariance of noncommutative gauge theories. Physical Review Letters 84(23):5265–5268, 2000. doi:10.1103/physrevlett.84.5265.
[21] E. Di Grezia, G. Esposito, A. Funel, et al. Spacetime noncommutativity and antisymmetric tensor dynamics in the early universe. Physical Review D 68(10):105012, 2003. doi:10.1103/physrevd.68.105012.
[22] O. Bertolami, J. G. Rosa, C. M. L. de Aragão, et al. Noncommutative gravitational quantum well. Physical Review D 72(2):025010, 2005. doi:10.1103/physrevd.72.025010.
[23] A. Das, H. Falomir, J. Gamboa, F. Méndez. Non-commutative supersymmetric quantum mechanics. Physics Letters B 670(4-5):407–415, 2009. doi:10.1016/j.physletb.2008.11.011.
[24] J. M. Gracia-Bondia. Notes on "quantum gravity" and noncommutative geometry. In New Paths Towards Quantum Gravity, pp. 3–58. Springer, 2010. doi:10.1007/978-3-642-11897-5_1.
[25] P. Martinetti.
Beyond the standard model with noncommutative geometry, strolling towards quantum gravity. Vol. 634, p. 012001. IOP Publishing, 2015. doi:10.1088/1742-6596/634/1/012001.
[26] N. Seiberg, E. Witten. String theory and noncommutative geometry. Journal of High Energy Physics 1999(JHEP09), 1999. doi:10.1088/1126-6708/1999/09/032.
[27] C. Kassel. Quantum Groups. Graduate Texts in Mathematics 155. Springer-Verlag, New York, 1995.
[28] A. Connes, D. Kreimer. Renormalization in quantum field theory and the Riemann-Hilbert problem I: The Hopf algebra structure of graphs and the main theorem. Communications in Mathematical Physics 210(1):249–273, 2000. doi:10.1007/s002200050779.
[29] A. Connes, D. Kreimer. Renormalization in quantum field theory and the Riemann-Hilbert problem II: The β-function, diffeomorphisms and the renormalization group. Communications in Mathematical Physics 216(1):215–241, 2001. doi:10.1007/pl00005547.
[30] A. Tanasa, F. Vignes-Tourneret. Hopf algebra of non-commutative field theory. Journal of Noncommutative Geometry 2(1), 2008.
[31] S. M. Carroll, J. A. Harvey, V. A. Kostelecký, et al. Noncommutative field theory and Lorentz violation. Physical Review Letters 87(14):141601, 2001. doi:10.1103/physrevlett.87.141601.
[32] R. J. Szabo. Quantum field theory on noncommutative spaces. Physics Reports 378(4):207–299, 2003. doi:10.1016/s0370-1573(03)00059-0.
[33] I. Haouam. On the Fisk-Tait equation for spin-3/2 fermions interacting with an external magnetic field in noncommutative space-time. Journal of Physical Studies 24(1):1801, 2020. doi:10.30970/jps.24.1801.
[34] K. Li, J. Wang, C. Chen. Representation of noncommutative phase space. Modern Physics Letters A 20(28):2165–2174, 2005. doi:10.1142/s0217732305017421.
[35] I. Haouam. Analytical solution of (2+1) dimensional Dirac equation in time-dependent noncommutative phase-space. Acta Polytechnica 60(2):111–121, 2020. doi:10.14311/ap.2020.60.0111.
[36] I. Haouam. On the noncommutative geometry in quantum mechanics. Journal of Physical Studies 24(2):2002, 2020. doi:10.30970/jps.24.2002.
[37] I. Haouam. The non-relativistic limit of the DKP equation in non-commutative phase-space. Symmetry 11(2):223, 2019. doi:10.3390/sym11020223.
[38] I. Haouam. Continuity equation in presence of a non-local potential in non-commutative phase-space. Open Journal of Microphysics 9(3):15–28, 2019. doi:10.4236/ojm.2019.93003.
[39] J. M. Wilkes. The Pauli and Lévy-Leblond equations, and the spin current density. European Journal of Physics 41(3):035402, 2020. doi:10.1088/1361-6404/ab7495.
[40] M. E. Peskin, D. V. Schroeder.
An Introduction to Quantum Field Theory. Reading, Mass.: Addison-Wesley Pub. Co., New York, 1995.
[41] M. Najafizadeh, M. Saadat. Thermodynamics of classical systems on noncommutative phase space. Chinese Journal of Physics 51(1):94–102, 2013. doi:10.6122/cjp.51.94.
[42] W. Gao-Feng, L. Chao-Yun, L. Zheng-Wen, et al. Classical mechanics in non-commutative phase space. Chinese Physics C 32(5):338, 2008. doi:10.1088/1674-1137/32/5/002.
[43] A. E. F. Djemai, H. Smail. On quantum mechanics on noncommutative quantum phase space. Communications in Theoretical Physics 41(6):837, 2004. doi:10.1088/0253-6102/41/6/837.
[44] M. Chaichian, M. M. Sheikh-Jabbari, A. Tureanu. Hydrogen atom spectrum and the Lamb shift in noncommutative QED. Physical Review Letters 86(13):2716–2719, 2001.
[45] S. Biswas. Bohr-van Leeuwen theorem in non-commutative space. Physics Letters A 381(44):3723–3725, 2017. doi:10.1016/j.physleta.2017.10.003.

Optimization of flapping airfoils for maximum thrust and propulsive efficiency

I. H. Tuncer, M. Kaya

A numerical optimization algorithm based on the steepest descent along the variation of the optimization function is implemented for maximizing the thrust and/or propulsive efficiency of a single flapping airfoil. Unsteady, low speed laminar flows are computed using a Navier-Stokes solver on moving overset grids. The flapping motion of the airfoil is described by a combined sinusoidal plunge and pitching motion. Optimization parameters are taken to be the amplitudes of the plunge and pitching motions, and the phase shift between them. Computations are performed in parallel in a workstation cluster. The numerical simulations show that high thrust values may be obtained at the expense of reduced efficiency. For high efficiency in thrust generation, the induced angle of attack of the airfoil is reduced and large scale vortex formations at the leading edge are prevented.

Keywords: optimization, flapping airfoils, unsteady aerodynamics, moving overset grids.

1 Introduction

Due to the agile flight of birds and insects, flapping wing propulsion has already been recognized to be more efficient than conventional propellers for very small scale vehicles with wing spans of 15 cm or less, the so-called micro-air vehicles (MAVs). Since the primary mission for MAVs is surveillance, they are desired to have good maneuverability and sustained flights with flight speeds of 30 to 60 kph. A current interest in the research and development community is to find the most energy efficient airfoil adaptation and wing motion technologies capable of providing the required aerodynamic performance for MAV flight.

Recent experimental and computational studies investigated the propulsive characteristics of flapping airfoils, and shed some light on the relationship among the produced thrust, the amplitude and frequency of the flapping oscillations, and the flow speed. Water tunnel flow visualization experiments on flapping airfoils conducted by Lai and Platzer [1] and Jones et al. [2] provide a considerable amount of information on the wake characteristics of thrust producing flapping airfoils. In their experiments, Anderson et al. [3] observed that the phase angle between pitch and plunge oscillations plays a significant role in maximizing the propulsive efficiency. Navier-Stokes computations have been performed by Tuncer et al. [4, 5, 6] and by Isogai et al. [7, 8] to explore the effect of flow separation on the thrust and propulsive efficiency of a single flapping airfoil in combined pitch and plunge oscillations.
The experimental and numerical studies by Jones et al. [9, 10, 11] and Platzer and Jones [12] on flapping-wing propellers point at the gap between numerical results and the actual flight conditions in high frequency motions, and at the limitation or enhancement of the performance of flapping airfoils by the onset of dynamic stall. Jones and Platzer [11] recently demonstrated a radio-controlled micro air vehicle propelled by flapping wings in a biplane configuration (Fig. 1). The computational and experimental findings show that the thrust generation and propulsive efficiency of flapping airfoils are closely connected to the flapping motion and flow parameters, such as the unsteady flapping velocity, the frequency and amplitude of the pitch and plunge motions, the phase shift between them, and the air speed. It is apparent that, to maximize the thrust and/or propulsive efficiency of a flapping airfoil, an optimization of all the above parameters is needed.

In an earlier study [13], a gradient based numerical optimization method was applied to a flapping airfoil to maximize its thrust. The preliminary results with a limited number of optimization variables compared well with the parametric studies performed earlier. In this study the optimization method developed earlier is extended to accommodate an objective function which is a linear combination of the thrust and the propulsive efficiency of a flapping airfoil (Fig. 2). The optimization variables are taken to be the pitch and plunge amplitudes, h₀ and α₀, and the phase shift between the pitch and plunge motions, φ. The flowfield around the flapping airfoil is discretized using overset grids. The unsteady flow solutions required to evaluate the gradients of the objective function by perturbation of the optimization variables are computed in parallel in a computer cluster.

Fig. 1: MAV with flapping wings [11]
Fig. 2: Flapping motion of an airfoil in combined plunge and pitch

2 Numerical method

The unsteady viscous flowfield around a flapping airfoil is computed by solving the Navier-Stokes equations on moving overset grids. The flow variables at the intergrid boundaries are interpolated from the donor grid. Computations on each subgrid are performed in parallel. PVM message passing library routines are used in the parallel solution algorithm [14]. The computed results are analyzed in terms of average thrust coefficient and propulsive efficiency values, and instantaneous particle traces.

The computational domain is discretized with overset grids: a C-type grid around the airfoil is overset onto a Cartesian background grid (Fig. 3). The flapping motion of the airfoil is imposed by moving the airfoil and the grid around it over the background grid. The flapping motion of the airfoil in combined plunge, h, and pitch, α, is specified by

\[ h = h_0\cos(\omega t), \qquad \alpha = \alpha_0\cos(\omega t + \phi), \]

where the angular frequency ω is given in terms of the reduced frequency, \(k = \omega c/U_\infty\). The pitching motion is about the mid-chord location.

2.1 Intergrid boundary conditions

At the intergrid boundaries formed by the overset grids, the conservative flow variables are interpolated in each timestep of the unsteady solution. Intergrid boundary points are first localized in a triangular stencil in the donor grid by a directional search algorithm. The localization process provides the interpolation weights used to interpolate the flow variables within the triangular stencil [14].
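A short sketch (our illustration, with assumed starting values, not the authors' code) of the flapping kinematics prescribed above follows; it also evaluates the induced (effective) angle of attack, \(\alpha_{\mathrm{eff}} = \alpha - \arctan(\dot h/U_\infty)\), which is discussed later when the optimized motions are compared. The sign convention for \(\alpha_{\mathrm{eff}}\) depends on the plunge-positive direction and is one common choice here.

```python
import numpy as np

c, U_inf, k = 1.0, 1.0, 1.0          # chord, freestream speed, reduced frequency
omega = k * U_inf / c                # angular frequency from k = omega*c/U_inf
# Case 1 starting values (see Table 1 below): h0 = 0.5c, alpha0 = 5 deg, phi = 30 deg
h0, alpha0, phi = 0.5, np.radians(5.0), np.radians(30.0)

t = np.linspace(0.0, 2.0 * np.pi / omega, 200)   # one flapping period
h     = h0 * np.cos(omega * t)                   # plunge position
hdot  = -h0 * omega * np.sin(omega * t)          # plunge velocity
alpha = alpha0 * np.cos(omega * t + phi)         # pitch angle [rad]

# Effective angle of attack induced by the plunge velocity (assumed convention).
alpha_eff = alpha - np.arctan2(hdot, U_inf)

print(np.degrees(alpha_eff).max(), np.degrees(alpha_eff).min())
```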
2.2 Optimization

The optimization process is based on following the direction of the steepest ascent of an objective function, O. The direction of the steepest ascent is given by the gradient vector of the objective function:

\[ \vec\nabla O = \frac{\partial O}{\partial v_1}\hat e_1 + \frac{\partial O}{\partial v_2}\hat e_2 + \cdots + \frac{\partial O}{\partial v_n}\hat e_n, \]

where the \(v_n\) are the optimization variables of the objective function. The objective function is taken as a linear combination of the average thrust, \(\bar C_t\), and the propulsive efficiency, η, over a flapping period; β = 0 sets the objective function to a normalized thrust coefficient:

\[ O(\bar C_t, \eta) = (1-\beta)\frac{\bar C_t}{\bar C_{t_d}} + \beta\frac{\eta}{\eta_d}, \qquad \bar C_t = \frac{1}{T}\int_t^{t+T}C_t\,\mathrm{d}t, \qquad \eta = \frac{\bar C_t\,U_\infty}{\dfrac{1}{T}\displaystyle\int_t^{t+T}\Big(\oint_s p\,\vec V\cdot\mathrm{d}\vec A\Big)\mathrm{d}t}, \]

where the denominator in the efficiency expression accounts for the average work required to maintain the flapping motion, and λ denotes the optimization step size. The components of the gradient vector are evaluated numerically by computing the objective function for a small perturbation of the optimization variables, one at a time. It should be noted that the evaluation of these vector components requires an unsteady flow computation over a few periods of the flapping motion, until a periodic behavior is reached. Once the unit vector in the ascent direction is evaluated by

\[ \vec d = \frac{\vec\nabla O}{|\vec\nabla O|}, \]

the step size λ in \(\Delta\vec v = \lambda\vec d\) is to be determined. Reference [15] suggests that the value of λ should be based on the Hessian of the objective function, which involves the second derivatives of the objective function with respect to the optimization variables. The exact computation of the Hessian is expensive, and the cost is proportional to the number of optimization parameters. In this work an approximation is made [13], and the step size is evaluated as follows:

\[ \lambda = -\frac{\vec\nabla O\cdot\vec d}{\vec\nabla\big(\vec\nabla O\cdot\vec d\big)\cdot\vec d}, \qquad \text{where } \vec\nabla\big(\vec\nabla O\cdot\vec d\big) = \frac{\partial\big(\vec\nabla O\cdot\vec d\big)}{\partial v_1}\hat e_1 + \cdots + \frac{\partial\big(\vec\nabla O\cdot\vec d\big)}{\partial v_n}\hat e_n, \]

and \(\vec\nabla(\vec\nabla O\cdot\vec d)\cdot\vec d\) is then evaluated numerically.
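A minimal sketch of the optimization loop just described is given below; a cheap analytic surrogate stands in for the expensive unsteady flow solution (in the actual method, each objective evaluation is a Navier-Stokes run over a few flapping periods). Gradients come from one-sided finite differences, and a crude step-halving rule substitutes for the curvature-based step size λ of the paper; all function names and numbers here are our illustrative choices.

```python
import numpy as np

def objective(v):
    """Toy surrogate O(h0, alpha0, phi); NOT the flow-solver objective."""
    h0, a0, phi = v
    return -(h0 - 1.6)**2 - 0.004*(a0 - 23.5)**2 - 0.001*(phi - 103.4)**2

def gradient(obj, v, dv=1e-3):
    """Forward-difference gradient: one extra 'flow solution' per variable."""
    g = np.zeros_like(v)
    base = obj(v)
    for i in range(v.size):
        vp = v.copy()
        vp[i] += dv
        g[i] = (obj(vp) - base) / dv
    return g

v = np.array([0.5, 5.0, 30.0])   # case 1 starting values (h0, alpha0, phi)
lam = 2.0                        # initial step size (assumed)
for step in range(100):
    g = gradient(objective, v)
    n = np.linalg.norm(g)
    if n < 1e-8 or lam < 1e-6:
        break
    trial = v + lam * g / n      # move along the unit ascent direction d
    if objective(trial) > objective(v):
        v = trial
    else:
        lam *= 0.5               # crude substitute for the paper's
                                 # directional-second-derivative step size
print(v, objective(v))
```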
2.3 Parallel computation

A coarse parallel algorithm based on domain decomposition is implemented in a master-worker paradigm [16]. The overset grid system is first decomposed into its subgrids, and the solution on each subgrid is assigned to a separate processor in the computer cluster. In addition, the background grid may also be partitioned to improve the static load balancing. Intergrid boundary conditions are exchanged among subgrid processes. PVM (version 3.4.4) library routines are used for inter-process communication. In the optimization process, the unsteady flow computations with perturbed optimization variables, which are required to determine the gradient vector, are also evaluated in parallel. Computations are performed in a cluster of computers with dual Pentium-III processors and the Linux operating system.

Fig. 3: Overset grid system

3 Results and discussion

In this study the flapping motion of a NACA 0012 airfoil in combined plunge and pitch is investigated. The reduced frequency of the oscillatory motion is fixed at k = 1. The optimization variables are the plunge and pitch amplitudes (h₀, α₀) and the phase shift between the plunge and pitch motions (φ). All flows are assumed to be laminar, and are computed at Re = 1×10⁴ and M = 0.1. Table 1 summarizes the optimization cases studied and the initial values of the optimization variables. The parallel computations with 8 processors take about 20–30 hours of wall clock time for a typical optimization case.

Table 1: Optimization cases and starting conditions

Case | β   | h₀  | α₀ [deg] | φ [deg]
1    | 0.0 | 0.5 |  5       | 30
2    | 0.5 | 0.5 |  5       | 30
3    | 1.0 | 0.5 |  5       | 30
4    | 0.0 | 0.5 | 25       | 60
5    | 0.0 | 1.0 |  5       | 60
6    | 0.0 | 1.0 | 25       | 90
7    | 1.0 | 0.5 | 25       | 60
8    | 1.0 | 1.0 |  5       | 60
9    | 1.0 | 1.0 | 25       | 90

In case 1, where β = 0, the average thrust coefficient is maximized. The instantaneous variation of the unsteady drag (negative thrust) coefficient along a few optimization steps is shown in Fig. 4. As the optimization variables are incremented along the optimization steps, unsteady computations are carried out for a few periods of the flapping motion until a periodic behavior is obtained. The variation of the average thrust coefficient and the propulsive efficiency with respect to the optimization variables is given in Fig. 5. As shown, as the optimization variables are incremented along the gradient of the objective function, the average thrust coefficient increases gradually, and a maximum value of 1.41 is reached at h₀ = 1.60, α₀ = 23.5 deg and φ = 103.4 deg. The corresponding propulsive efficiency is 28.3 %. Optimization steps for cases 2 and 3 are shown in Figs. 6 and 7, respectively. In case 2, where β = 0.5, the average thrust and the propulsive efficiency have equal weights in the objective function. As a result, the efficiency is improved at the expense of average thrust.

Fig. 4: C_d history along optimization steps, case 1
Fig. 5: Maximization of thrust coefficient (β = 0), case 1
Fig. 6: Maximization of propulsive efficiency and thrust coefficient (β = 0.5), case 2
Fig. 7: Maximization of propulsive efficiency (β = 1.0), case 3
Fig. 8: Instantaneous particle traces at the instant of maximum thrust along the optimization steps for cases 1 and 2
Fig. 9: Instantaneous particle traces along a period of the optimized flapping motion for case 1
Fig. 10: Maximization of thrust coefficient (β = 0) with various starting conditions, cases 1, 4–6
Fig. 11: Maximization of propulsive efficiency (β = 1) with various starting conditions, cases 3, 7–9
It is observed that the higher efficiency is achieved at a lower plunge amplitude and a higher pitch amplitude. The phase shift slightly drops to 97.8 deg. In case 3, the propulsive efficiency is maximized at low pitch and plunge amplitudes, with a correspondingly very low thrust coefficient. It is apparent that the propulsive efficiency and the thrust production of flapping airfoils are inversely proportional.

The unsteady flow fields along the optimization steps are investigated with particle traces. The particles are emitted along a straight line in the vicinity of the leading edge of the airfoil, and convected in the flow field with the local velocity. The line from which the particles are emitted follows the leading edge of the airfoil to capture the leading edge vortex formations in more detail. In Fig. 8, the instantaneous particle traces at the instant of maximum thrust (minimum drag) in a flapping period are given along the optimization steps of cases 1 and 3. It is observed that in case 1 the leading edge vortex formation is promoted along the optimization steps. The maximum instantaneous thrust occurs at about the mean amplitude location as the leading edge vortex forms, just before the suction field at the leading edge collapses as the leading edge vortex grows stronger. In case 3, on the other hand, the leading edge vortex formation is prevented along the optimization steps, which incidentally maximizes the propulsive efficiency; the unsteady flow becomes more streamlined with the motion of the airfoil. Fig. 9 shows the optimized flow field for maximum thrust in case 1. The flowfield is observed to be highly vortical, with strong leading edge vortices during the upstroke and the downstroke. The flow field is periodic, and antisymmetric along the upstroke and the downstroke.

Next, the optimization space is searched for other possible local maxima of the objective function for cases 1 and 3. This is implemented by initiating the optimization process from various initial conditions, as given in Table 1. All the results of the optimization cases are given in Table 2. The initial conditions and the optimized states at the end of the optimization processes are shown in Figs. 10 and 11 for β = 0 and β = 1, respectively. Fig. 10 reveals that all the optimization cases for β = 0 converge to about the same value of the objective function, which is the thrust coefficient, and of the optimization variables. This suggests that the global maximum of the objective function may have been found. On the other hand, the optimization processes for β = 1 provide different optimum states for h₀ and α₀, and about the same φ values. It appears that a high flapping efficiency may be achieved for a range of h₀ and α₀ values, such that α₀ increases as h₀ does. The optimum flapping motions for cases 1–3 and 9 are shown in Fig. 12. It is clearly observed that the plunge amplitude plays a significant role in thrust generation.

Fig. 12: Optimized flapping motions

Table 2: Optimization results

Case | h₀   | α₀ [deg] | φ [deg] | C̄t   | η [%]
1    | 1.60 | 23.5     | 103.4   | 1.41 | 28.3
2    | 1.36 | 29.6     |  97.8   | 1.08 | 44.1
3    | 0.45 | 15.4     |  82.4   | 0.08 | 58.5
4    | 1.73 | 23.8     | 100.7   | 1.44 | 25.4
5    | 1.52 | 26.9     |  87.2   | 1.27 | 33.4
6    | 1.55 | 28.6     |  94.9   | 1.45 | 35.9
7    | 0.57 | 21.0     |  86.7   | 0.13 | 63.8
8    | 0.60 | 22.8     |  86.1   | 0.13 | 64.8
9    | 0.83 | 35.6     |  86.5   | 0.18 | 67.5
It also appears that, in order to improve the efficiency, the plunge amplitude is to be reduced and the pitch amplitude is to be increased. In addition, the phase shift between the plunge and pitch motions, which is about 90 deg for all the cases, reduces the effective angle of attack at the mid-plunge location, where the plunge velocity is maximum. The variations of the effective angle of attack the airfoil sees along a flapping period are given in Fig. 13 for cases 1–3. The period starts at 0 deg, which corresponds to the h = −h₀ position of the airfoil. In agreement with the previous observations, for higher thrust production, as in cases 1 and 2, a flapping airfoil stays at large effective angles of attack for a large fraction of the flapping period. For an efficient flapping, as in case 3, the effective angle of attack at the mid-plunge location (ωt = 90, 270 deg, h = 0) is set to about 0 deg, whereas in cases 1 and 2 the maximum effective angle of attack occurs around the mid-plunge locations.

Fig. 13: Effective angle of attack

4 Concluding remarks

A gradient based numerical optimization is successfully applied to the thrust generation and propulsive efficiency of an airfoil flapping in combined plunge and pitch. The optimization of thrust generation and propulsive efficiency together is achieved with a weighted and normalized objective function. The parallel implementation of the optimization algorithm is shown to be quite robust. Thrust generation of a flapping airfoil is maximized at large plunge amplitudes, as large leading edge vortices form and shed into the wake; the airfoil then stays at a large effective angle of attack during most of the flapping period. On the other hand, the propulsive efficiency of flapping airfoils may be increased by reducing the plunge amplitude and the effective angle of attack, and consequently by preventing the formation of leading edge vortices. Further research is in progress to apply the present optimization method to the thrust generation of flapping airfoils in a biplane configuration.

Acknowledgment

The partial support provided by NATO Research and Technology Organization under the project TX01-02 is acknowledged.

References
[1] Lai, J. C. S., Platzer, M. F.: The jet characteristics of a plunging airfoil. 36th AIAA Aerospace Sciences Meeting and Exhibit, Reno, Nevada, January 1998.
[2] Jones, K. D., Dohring, C. M., Platzer, M. F.: "An experimental and computational investigation of the Knoller-Betz effect." AIAA Journal, Vol. 36 (1998), No. 7, p. 1240–1246.
[3] Anderson, J. M., Streitlien, K., Barrett, D. S., Triantafyllou, M. S.: "Oscillating foils of high propulsive efficiency." Journal of Fluid Mechanics, Vol. 360 (1998), p. 41–72.
[4] Tuncer, I. H., Platzer, M. F.: "Thrust generation due to airfoil flapping." AIAA Journal, Vol. 34 (1995), No. 2, p. 324–331.
[5] Tuncer, I. H., Lai, J., Ortiz, M. A., Platzer, M. F.: Unsteady aerodynamics of stationary/flapping airfoil combination in tandem. AIAA Paper 97-0659, 1997.
[6] Tuncer, I. H., Platzer, M. F.: "Computational study of flapping airfoil aerodynamics." AIAA Journal of Aircraft, Vol. 35 (2000), No. 4, p. 554–560.
[7] Isogai, K., Shinmoto, Y., Watanabe, Y.: "Effects of dynamic stall on propulsive efficiency and thrust of a flapping airfoil." AIAA Journal, Vol. 37 (2000), No. 10, p. 1145–1151.
[8] Isogai, K., Shinmoto, Y.: Study on aerodynamic mechanism of hovering insects. AIAA Paper No. 2001-2470, June 2001.
[9] Jones, K. D., Duggan, S. J., Platzer, M. F.: Flapping-wing propulsion for a micro air vehicle. AIAA Paper No.
2001-0126, 39th AIAA Aerospace Sciences Meeting, Reno, Nevada, January 2001.
[10] Jones, K. D., Castro, B. M., Mahmoud, O., Pollard, S. J., Platzer, M. F., Neef, M. F., Hummel, D.: A collaborative numerical and experimental investigation of flapping-wing propulsion. AIAA Paper No. 2002-0706, January 2002.
[11] Jones, K. D., Platzer, M. F.: Experimental investigation of the aerodynamic characteristics of flapping-wing micro air vehicles. AIAA Paper No. 2003-0418, January 2003.
[12] Platzer, M. F., Jones, K. D.: The unsteady aerodynamics of flapping-foil propellers. 9th International Symposium on Unsteady Aerodynamics, Aeroacoustics and Aeroelasticity of Turbomachines, École Centrale de Lyon, Lyon (France), September 4–8, 2000.
[13] Tuncer, I. H., Kaya, M.: Optimization of flapping airfoils for maximum thrust. AIAA Paper No. 2003-0420, January 2003.
[14] Tuncer, I. H.: "A 2-D unsteady Navier-Stokes solution method with moving overset grids." AIAA Journal, Vol. 35 (1997), No. 3, p. 471–476.
[15] Kuruvila, G., Ta'asan, S., Salas, M. D.: Airfoil optimization by the one-shot method. AGARD Report No. 803, 1994.
[16] Tuncer, I. H.: Parallel computation of multi-passage cascade flows with overset grids. Selected Papers from Parallel Computational Fluid Dynamics Workshop, Gulcat, U., Emerson, D. R. (eds.), Istanbul Technical University, Istanbul, Turkey, 1999, p. 81–89.

Assoc. Prof. Ismail H. Tuncer
e-mail: tuncer@ae.metu.edu.tr
Mustafa Kaya, Graduate Research Assistant
Department of Aerospace Engineering
Middle East Technical University
06531 Ankara, Turkey

Acta Polytechnica 59(1):88–97, 2019, doi:10.14311/ap.2019.59.0088

Optimization of biodiesel production from waste frying oil over alumina supported chicken eggshell catalyst using experimental design tool

Adeyinka S. Yusuff*, Lekan T. Popoola
Department of Chemical and Petroleum Engineering, College of Engineering, Afe Babalola University, Ado-Ekiti, Nigeria
* Corresponding author: yusuffas@abuad.edu.ng

Abstract. An optimization of the biodiesel production from waste frying oil via heterogeneous transesterification was studied. The present study also aims to investigate the catalytic behaviour of the alumina supported eggshell (ASE) catalyst for the synthesis of biodiesel. The synthesized ASE catalyst was investigated at various mixing ratios of alumina to eggshell, and exhibited the best activity for the reaction when the eggshell and alumina were mixed via incipient wetness impregnation in a 2 : 1 proportion on a mass basis and calcined at 900 °C for 4 h. The as-synthesized catalyst was characterized by basicity, BET, SEM, EDX, and FTIR. The 2^k factorial experimental design was employed for the optimization of the process variables, which include the catalyst loading, reaction time, methanol/oil molar ratio and reaction temperature, and their effects on the biodiesel yield were studied. The optimization results showed that the reaction time has the highest percentage contribution of 40.139 %, while the catalyst loading contributes the least to the biodiesel production, as low as 1.233 %.
The analysis of variance (ANOVA) revealed a high correlation coefficient (R² = 0.9492), and the interaction between the reaction time and reaction temperature contributes significantly to the biodiesel production process, with a percentage contribution of 14.001 %, compared to the other interaction terms. A biodiesel yield of 77.56 % was obtained under the optimized factor combination of 4.0 wt.% catalyst loading, 120 min reaction time, 12 : 1 methanol/oil molar ratio and a reaction temperature of 65 °C. The reusability study showed that the ASE catalyst could be reused for up to four cycles, and the biodiesel produced under optimum conditions conformed to the ASTM standard.

Keywords: biodiesel; catalyst; characterization; eggshell; waste frying oil.

1. Introduction

Biodiesel is a biogenic, renewable, non-toxic and ester-derived oxygenated fuel. It is commonly produced from animal fat, edible (soybean, palm, coconut, corn) oil, non-edible (cotton seed, Jatropha curcas, algae, sunflower) oil and waste vegetable oil. According to Tan et al. [14], biodiesel can conveniently be used to power a diesel engine without making any adjustment to the engine. Besides, it is energy efficient, environmentally friendly and reduces greenhouse gas emissions [15]. However, the exorbitant cost of biodiesel has been a major reason why its production has not been largely commercialized. Two ways in which this problem could be addressed, according to Fabbri et al. [2] and Taufiq-Yap et al. [15], are the biodiesel synthesis from waste vegetable or non-edible oil, and the use of solid catalysts derived from readily available waste or naturally occurring materials instead of a homogeneous catalyst or an enzyme.

Generally, biodiesel, a form of mono-alkyl ester, is commonly produced via transesterification of oil with an alcohol (methanol, ethanol, propanol or butanol) in the presence of a catalyst; glycerol is also formed as a by-product. Transesterification is an organic reaction in which an ester is transformed into another through an interchange of the special functional group called the alkoxy moiety [12]. Typically, methanol is used as the co-reactant for the conversion of triglyceride to fatty acid methyl ester (biodiesel), because it is relatively cheap, readily available and easier to separate than glycerol from the product mixture [13]. The reaction is illustrated by the overall equation given in Figure 1.

Figure 1. Transesterification of glyceride with methanol.

The current biodiesel production process involves the use of homogeneous catalysts, such as NaOH, HCl, KOH and H₂SO₄. However, there are some problems associated with the homogeneously catalysed transesterification process, namely the generation of wastewater and the non-reusability of liquid catalysts. According to Taufiq-Yap et al. [15], the use of heterogeneous catalysts in the transesterification reaction could reduce the costs of the biodiesel production. This is because they are much easier to separate from the product mixture, are reusable, and do not generate wastewater. A heterogeneous catalyst can be classified as a solid base or a solid acid catalyst. Solid base catalysts are often used for transesterification reactions; they exhibit a higher activity and proceed faster to equilibrium when compared to acid heterogeneous catalysts.
besides, solid base catalysts can easily be synthesized from waste and naturally occurring materials. nevertheless, a heterogeneous catalyst is well known for its mass transfer limitation, because it forms three phases with methanol and oil, thus lowering the reaction rate [7, 15]. this diffusion limitation can, however, be overcome by using catalyst supports or promoters, which enhance the textural properties and basic strength of the catalyst. besides, they firmly anchor the catalyst's active ingredient, thus minimizing the degree of leaching [4]. several catalyst supports, such as alumina (al2o3), silica (sio2), zirconia (zro2) and titania (tio2), have been used in biodiesel production processes [3, 5, 16]. among the aforementioned supports, alumina has been widely used for anchoring catalyst active ingredients due to its high thermal stability, better mechanical properties and better textural characteristics [15]. taufiq-yap et al. [15] synthesized a naoh/al2o3 catalyst and used it for the production of biodiesel from palm oil; a maximum biodiesel yield of 99 % was obtained under optimum reaction conditions. umdu et al. [16], in their research work, used al2o3/cao as a heterogeneous catalyst in converting the lipid of microalgae to biodiesel; the biodiesel yield from the process was 80 %. meanwhile, none of the researchers discussed above had done detailed work to check the performance of alumina when loaded on cao derived from agricultural waste.

the use of a low-quality feedstock, such as waste frying oil (wfo), to synthesize biodiesel will reduce the production cost. therefore, the current study is focused on the use of alumina supported eggshell as a heterogeneous base catalyst for the transesterification of wfo with methanol, because of the easy availability of both reactants. also, the 2-level factorial design of experiments was applied to the optimization of the biodiesel production process. the variables (factors) considered were the reaction temperature, reaction time, catalyst loading and methanol/wfo molar ratio.

2. materials and method

2.1. materials

in this current study, the wfo and waste chicken eggshells were collected from the cafeteria of afe babalola university, ado-ekiti, nigeria. the oil was first heated in an oven at 150 °c for 3 h to reduce the moisture content and later filtered using a 100 µm sieve mesh to remove bits of food residues. the acid value and free fatty acid (ffa) content of the oil were determined to be 3.847 mg koh/g and 1.924 wt.%, respectively. since the ffa content of the wfo is less than 3.0 wt.%, the single-step transesterification process is appropriate to convert the oil to biodiesel [14]. methanol (ch3oh, 99.5 % synthesis grade) was used as a co-reactant for the transesterification, and gamma alumina (anhydrous al2o3) was used as a catalyst support. these compounds were procured from topjay chemical enterprise, ado-ekiti, nigeria and employed as received, without subjecting them to a further purification.

2.2. preparation of the catalyst

the waste chicken eggshells were initially soaked and thoroughly washed with clean water to remove all attached dirt. the cleaned eggshells were thereafter heated in an oven at 125 °c until the water was completely dried up. the dried eggshell was later crushed into a powder by a mechanical grinder, and the powder obtained was passed through a 0.3 mm sieve mesh to obtain a fine powder with a particle size smaller than 0.3 mm. it was then kept in a sealed plastic container.
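as a rough consistency check on the figures just quoted, the acid value and the ffa content are related by the usual "as oleic acid" convention; a minimal python sketch (the conversion factor and molar masses are textbook values, not data from the paper):

```python
# consistency check: acid value (mg KOH/g) vs. free fatty acid content (wt.%)
# using the common "as oleic acid" convention; molar masses are textbook values
M_KOH = 56.1     # g/mol, potassium hydroxide
M_OLEIC = 282.5  # g/mol, oleic acid taken as the representative FFA

def ffa_from_acid_value(av_mg_koh_per_g: float) -> float:
    """convert acid value to FFA content in wt.%, assuming all FFA is oleic."""
    return av_mg_koh_per_g * M_OLEIC / (10.0 * M_KOH)

print(ffa_from_acid_value(3.847))  # ~1.94 wt.%, close to the reported 1.924 wt.%
```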
the procedure used to prepare the supported catalyst followed our previous work [19]. three different samples of the alumina supported eggshell catalyst were prepared by weighing and mixing the prepared eggshell powder and alumina in 1 : 1, 2 : 1 and 1 : 2 mass ratios of eggshell to alumina. the resulting mixtures were dispersed in 50 ml of distilled water to form suspensions and stirred continuously until the mixtures were homogeneous. the obtained slurries were then heated in an oven at 110 °c to remove the water. thereafter, the three dried samples were calcined in a muffle furnace at 900 °c for 4 h. the calcined catalysts were kept in a desiccator containing silica particles, in order to prevent atmospheric moisture and carbon dioxide from coming into contact with the catalysts.

2.3. characterization of the prepared catalysts

the textural features of the prepared catalyst samples were determined by the brunauer-emmett-teller (bet) technique using a micrometrics analyser (quantachrome instrument, nova station a, version 11.03, usa), based on the principle of adsorption/desorption of nitrogen at 77 k and 60/60 s (ads/des) equilibrium time. the basicity of the as-synthesized catalysts was determined by a colorimetric titration method reported by abdoulmoumine [1]. a scanning electron microscope-energy dispersive x-ray (sem-edx) analyser (jeol-jsm 7600f) was used to determine simultaneously the surface morphology and elemental composition of the prepared catalysts, while a fourier transform infrared (ftir) spectrophotometer (ir affinity 1s, shimadzu, japan) was employed to determine the surface functional groups on the as-synthesized catalysts.

table 2. textural properties and basicity of the prepared alumina supported eggshell catalysts at the different mixing ratios.

eggshell/al2o3 ratio       bet area (m^2/g)   total pore volume (cm^3/g)   basicity (mmol/g cat)
1 : 1 eggshell on al2o3    49.6               0.117                        1.34
2 : 1 eggshell on al2o3    78.2               0.341                        1.88
1 : 2 eggshell on al2o3    60.9               0.281                        0.58

2.4. the 2^k factorial experimental design

a factorial analysis is one of the numerous features of the design of experiments used in studying the influence of an individual variable and its interactions with other variables; it economizes the experimental resources by reducing the number of runs [1]. in this study, four independent process variables were of interest – the reaction temperature (T), reaction time (t), catalyst loading (C) and methanol/wfo molar ratio (M) – with the biodiesel yield as the response. the response was determined via the transesterification process, with the aim of identifying the optimum reaction condition that would provide a maximum biodiesel yield. a total of sixteen (16) experimental runs were conducted according to the 2^k factorial design with the four process variables (2^4 = 16 points). table 1 presents the studied range of each variable in actual and coded form.

table 1. studied range of each variable in actual and coded form.

variable                        low (−1)   high (+1)
reaction temperature (T)        50 °c      65 °c
reaction time (t)               1 h        2 h
catalyst loading (C)            2 wt.%     4 wt.%
methanol/wfo molar ratio (M)    6 : 1      12 : 1
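to make the design concrete, a minimal python sketch (not from the paper) that enumerates the sixteen coded factor combinations of the 2^4 design; the factor order (C, t, M, T) follows table 1, and the enumeration order need not match the run numbering used later in table 4:

```python
from itertools import product

# coded levels of the 2^4 factorial design: -1 = low, +1 = high (see table 1)
factors = ["C (catalyst loading)", "t (reaction time)",
           "M (methanol/oil ratio)", "T (reaction temperature)"]

runs = list(product((-1, +1), repeat=4))  # 2^4 = 16 combinations
for i, levels in enumerate(runs, start=1):
    coded = ", ".join(f"{name.split()[0]}={lvl:+d}"
                      for name, lvl in zip(factors, levels))
    print(f"run {i:2d}: {coded}")
```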
2.5. transesterification reaction study

the transesterification of the wfo to biodiesel using the calcined alumina supported eggshell catalyst was conducted according to the experimental design values (table 1), and the response measured was the yield of biodiesel. all experiments were performed in a 250 ml glass reactor placed on a temperature-controlled heating mantle. fifty grams (50 g) of the wfo were heated up to the desired temperature, after which the required amounts of methanol and catalyst were added to the heated oil. the reaction mixture was stirred at a speed of 350 rpm. after the reaction was completed, the catalyst was separated from the product mixture (biodiesel and glycerol) by cloth filtration. the yield of biodiesel collected from the product mixture was then evaluated as [6]

biodiesel yield, Y = (weight of biodiesel / weight of wfo used) · 100 %. (1)

2.6. biodiesel product analysis

the synthesized biodiesel was characterized for specific gravity, kinematic viscosity, cloud point, pour point and flash point, and the results were compared with the astm d6751 standard. the quality of the wfo biodiesel was further ascertained by an ftir spectrophotometer (ir affinity 1s, shimadzu, japan), with which the functional groups contained in it were determined. moreover, a gas chromatography-mass spectrometry (hewlett packard 6890s, palo alto, usa) analysis was conducted on the biodiesel product to determine the type of the formed methyl esters.

2.7. catalyst reusability study

the reusability of the supported catalyst was studied in order to check its stability after reuse. the used catalyst was collected from the product mixture after the completion of the transesterification reaction, washed several times with methanol to remove the oil attached to the catalyst particles, and heated at 60 °c in an oven until it dried out. thereafter, the dried catalyst sample was recalcined in a muffle furnace at 700 °c for 2 h. the recalcined catalyst was weighed and reused for subsequent reactions at the same operating conditions. after the fourth cycle, the reaction was discontinued because the quantity of the recovered catalyst was significantly reduced.

3. results and discussion

3.1. characterization of the catalyst

the textural characteristics and basicity of the three prepared catalyst samples were determined. according to the results depicted in table 2, the catalyst sample with the eggshell/al2o3 mass ratio of 2 : 1 exhibited the best textural properties and the strongest basic strength. this is attributed to the fact that this catalyst sample contains a larger proportion of eggshell, which decomposed into cao and co2 after the calcination [14], and to the amphoteric nature of al2o3. according to olutoye and hameed [8], the oxygen atoms contained in cao and al2o3 signify lewis base sites, while the metal ions represent lewis acid sites. thus, the high basicity possessed by this sample implies that the lewis base sites are active centres for the transesterification reaction; this is corroborated by the edx analysis. in addition, the higher values of the surface area (78.2 m^2/g) and total pore volume (0.341 cm^3/g) recorded for this same catalyst sample indicate that the catalyst external surface is dominated by active sites and that it can eliminate the mass transfer limitation, leading to a faster reaction process [14]. having identified the prepared catalyst with the 2 : 1 ratio of eggshell loading onto al2o3 as the most active sample amongst the synthesized catalysts, it was further characterized and studied in detail under various operating reaction conditions.

figure 2. sem images of (left) raw and (right) calcined ase catalyst.
the sem analysis was conducted in order to examine the external morphology of the raw and calcined ase catalysts at a very large magnification. the sem images of the catalysts are presented in figure 2. figure 2(left) shows that the raw catalyst possessed an undefined structure with various tiny pores on its surface; this observation might be attributed to the method of preparation adopted. however, upon calcination, several large pores of different sizes were clearly seen on the catalyst's surface, as shown in figure 2(right). this observation can be attributed to the decomposition of the caco3 contained in eggshell into cao and co2. the sem result of the calcined ase catalyst also shows that the elevated calcination temperature facilitated the removal of adsorbed gases, organic matter and moisture, and a rearrangement of surface and bulk atoms, thus leading to pore creation and the exposure of basic sites on the catalyst's surface [12].

the elemental composition analysis of the ase catalyst (table 3) showed that it contained calcium, oxygen, carbon, aluminium and magnesium.

table 3. elemental analysis (wt.%) for the raw and calcined ase catalysts.

element     raw    calcined
calcium     48.8   65.9
carbon      26.0   4.6
aluminium   10.8   10.6
magnesium   4.2    2.7
oxygen      10.4   16.1

the composition of calcium and oxygen in the raw catalyst was 48.8 wt.% and 10.4 wt.%, respectively, but increased to 65.9 wt.% and 16.1 wt.%, respectively, after the calcination, while the composition of all the other elements was reduced after the thermal treatment. the increase in the calcium content after the calcination was due to the decomposition of the caco3 in eggshell. a similar observation was reported by tan et al. [14] in the transesterification of a waste cooking oil using calcined ostrich and chicken eggshells as the catalysts.

the ftir spectrum of the raw ase catalyst is shown in figure 3(a), with various absorption bands at 3436.56 cm−1 (n–h stretch), 2874.03 cm−1 (ch antisymmetric stretch), 2515.26 cm−1 (o–h stretch), 2360.95 cm−1 (n–h stretch), 1797.72 cm−1 (c=o antisymmetric stretch), 1421.58 cm−1 (in-plane oh bend), 875.71 cm−1 (ch out-of-plane deformation) and 455.22 cm−1 (c–n–c bend). these surface functional groups are an indication that the ase catalyst is complex in nature. however, upon calcination, some peaks were shifted or vanished, and new bands at 3643.65 cm−1 (o–h stretch), 1626.05 cm−1 (c=o stretch), 1413.87 cm−1 (c–n stretch), 1022.31 cm−1 (p–o–c antisymmetric stretch), 783.13 cm−1 (ch out-of-plane deformation) and 559.38 cm−1 (c–c=o bend) were detected, as shown in figure 3(b).

figure 3. ftir spectra of (a) raw, (b) calcined and (c) reused ase catalysts (transmittance versus wavenumber).

table 4. factorial design for the transesterification process variables.

run   C (wt.%)   t (min)   M        T (°c)   yield, experimental (%)   yield, predicted (%)
1     2          60        6 : 1    50       45.46                     42.33
2     4          60        6 : 1    50       27.66                     31.30
3     2          120       6 : 1    50       57.28                     60.41
4     4          120       6 : 1    50       59.09                     55.44
5     2          60        12 : 1   50       31.32                     34.39
6     4          60        12 : 1   50       38.13                     34.54
7     2          120       12 : 1   50       68.45                     65.38
8     4          120       12 : 1   50       68.00                     71.59
9     2          60        6 : 1    65       47.19                     44.03
10    4          60        6 : 1    65       41.23                     43.87
11    2          120       6 : 1    65       38.48                     41.64
12    4          120       6 : 1    65       50.18                     47.54
13    2          60        12 : 1   65       48.36                     51.58
14    4          60        12 : 1   65       61.32                     62.62
15    2          120       12 : 1   65       65.32                     62.10
16    4          120       12 : 1   65       76.50                     79.20
in the case of the reused catalyst (figure 3(c)), many of the absorption bands displayed by the calcined catalyst disappeared after the use, and new bands (3203.87 cm−1 (nh2 symmetric stretch), 2893.32 cm−1 (ch antisymmetric stretch) and 2278.01 cm−1 (n=c=o antisymmetric stretch)) were detected. the presence of these functional groups contributes to the good activity of the catalyst [10].

3.2. statistical analysis of experimental data

the synthesis of biodiesel from the wfo in the presence of the ase catalyst was carried out according to the 2^k factorial experimental design, as shown in table 4. sixteen (16) experimental runs were conducted at the different levels of the independent variables considered. run 16, conducted at 4.0 wt.% catalyst loading, 120 min reaction time, 12 : 1 methanol/oil molar ratio and 65 °c reaction temperature, gave the highest biodiesel yield, 76.50 %, while the lowest biodiesel yield was obtained in run 2, conducted at 4.0 wt.% catalyst loading, 60 min reaction time, 6 : 1 methanol/oil molar ratio and 50 °c reaction temperature. these observations imply that, at the maximum reaction time, methanol/oil molar ratio and reaction temperature, the yield of biodiesel was favoured irrespective of the quantity of catalyst consumed during the reaction [17]. this is affirmed by run 15, which was carried out at 2.0 wt.% catalyst loading, 120 min reaction time, 12 : 1 methanol/oil molar ratio and 65 °c reaction temperature, and gave a 65.32 % biodiesel yield.

the regression model, which correlates the dependent and independent variables in terms of the coded factors, is

Y = 51.75 + 1.52C + 8.66t + 5.93M + 2.32T + 1.51Ct + 2.80CM + 2.72CT + 3.23tM − 5.12tT + 3.88MT, (2)

where C, t, M and T are the catalyst loading, reaction time, methanol/oil molar ratio and reaction temperature, respectively; these are the main effects, while Ct, CM, CT, tM, tT and MT represent the interaction effects.

3.2.1. analysis of variance for biodiesel yield

the model adequacy was tested using an analysis of variance (anova), and table 5 presents the results of the anova analysis for the 2^k factorial design. the model f-value of 9.35, with a probability value (pr > f) of 0.0118, confirms the adequacy of the model. in addition, two linear terms (t and M) and two interaction terms (tT and MT) are significant model terms, because their p-values (pr > f) are less than 0.0500. according to the r^2 value (0.9492), the obtained model accounts for 94.92 % of the total variation in the experimental biodiesel yield.

table 5. anova analysis for the 2^k factorial design (r^2 = 0.9492; adj-r^2 = 0.8477).

source       sum of squares   degrees of freedom   mean square   f-value   p-value (pr > f)
model        2992.54          10                   299.25        9.35      0.0118
C            36.75            1                    36.75         1.15      0.3329
t            1201.14          1                    1201.14       37.53     0.0017
M            562.05           1                    562.05        17.56     0.0086
T            86.44            1                    86.44         2.70      0.1612
Ct           36.69            1                    36.69         1.15      0.3332
CM           125.16           1                    125.16        3.91      0.1049
CT           118.32           1                    118.32        3.70      0.1125
tM           166.73           1                    166.73        5.21      0.0713
tT           418.92           1                    418.92        13.09     0.0153
MT           240.33           1                    240.33        7.51      0.0408
residual     160.02           5                    32.00         –         –
cor. total   3152.56          15                   –             –         –
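equation (2) can be checked directly against table 4; below is a minimal python sketch (not part of the paper) with the coded levels in the order (C, t, M, T), the experimental yields transcribed from table 4, and the model of equation (2):

```python
# table 4 transcribed: coded levels (C, t, M, T) -> experimental yield (%)
runs = [
    ((-1, -1, -1, -1), 45.46), ((+1, -1, -1, -1), 27.66),
    ((-1, +1, -1, -1), 57.28), ((+1, +1, -1, -1), 59.09),
    ((-1, -1, +1, -1), 31.32), ((+1, -1, +1, -1), 38.13),
    ((-1, +1, +1, -1), 68.45), ((+1, +1, +1, -1), 68.00),
    ((-1, -1, -1, +1), 47.19), ((+1, -1, -1, +1), 41.23),
    ((-1, +1, -1, +1), 38.48), ((+1, +1, -1, +1), 50.18),
    ((-1, -1, +1, +1), 48.36), ((+1, -1, +1, +1), 61.32),
    ((-1, +1, +1, +1), 65.32), ((+1, +1, +1, +1), 76.50),
]

def predicted_yield(C, t, M, T):
    """equation (2) in coded factors (-1 = low level, +1 = high level)."""
    return (51.75 + 1.52*C + 8.66*t + 5.93*M + 2.32*T
            + 1.51*C*t + 2.80*C*M + 2.72*C*T
            + 3.23*t*M - 5.12*t*T + 3.88*M*T)

for i, (levels, y_exp) in enumerate(runs, start=1):
    print(f"run {i:2d}: experimental {y_exp:5.2f} %, "
          f"predicted {predicted_yield(*levels):5.2f} %")
# run 16 gives 79.20 %, reproducing the 'predicted' column of table 4
```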
figure 5. plot of the biodiesel yield (Y) versus the reaction time (t) at reaction temperatures of 50 °c and 65 °c, 2 wt.% catalyst loading and 6 : 1 methanol/oil molar ratio.

3.2.2. analysis of main effects

the contribution of each of the factors studied to the biodiesel production from the wfo is displayed in figure 4.

figure 4. pareto graphic analysis – percentage effect of each factor: t 40.139, M 18.784, tT 14.001, MT 8.028, tM 5.573, CM 4.182, CT 3.951, T 2.887, C 1.233, Ct 1.222.

it was revealed that the reaction time contributes significantly to the yield of biodiesel, as much as 40.139 %; increasing the reaction time from 60 min to 120 min strongly affects the biodiesel yield. the influence of the reaction time on the biodiesel production process is widely reported [8, 14]. yee and lee [17] reported that, for the transesterification to occur and proceed to completion, a longer reaction time and excess methanol are necessary. since the reaction temperatures considered in this study do not exceed the boiling point of methanol, a longer reaction period favours the biodiesel yield; this is corroborated by the biodiesel yield obtained in run 16, which was conducted for 120 min (table 4).

the methanol/oil molar ratio is the second factor that favours the biodiesel yield, with a percentage contribution of 18.784 %, as shown in figure 4. two different methanol/oil molar ratios (6 : 1 and 12 : 1) were considered in this study, and the higher molar ratio was found to have a positive effect on the biodiesel yield, as confirmed in most of the runs, such as runs 7, 8, 13, 14, 15 and 16. meanwhile, the use of excess methanol inhibited the separation of biodiesel from the product mixture, thus reducing the biodiesel yield in some runs, like runs 5 and 6. this observation is similar to the one reported by paintsil [11], who investigated the optimization of the biodiesel production from canola oil.

furthermore, the third most influential main factor is the reaction temperature, which contributes 2.887 % to the biodiesel production process, as shown in figure 4. the reason could be that a heterogeneously catalysed transesterification reaction often requires either a relatively high reaction temperature or a long reaction time in order to achieve a greater biodiesel yield [17]; since the yield of biodiesel was already favoured by the long reaction period, as is the case in this study, the reaction temperature did not have a significant effect on the biodiesel production.

figure 6. plot of the biodiesel yield (Y) versus the methanol/oil molar ratio (M) at reaction temperatures of 50 °c and 65 °c, 2 wt.% catalyst loading and 60 min reaction time.

the catalyst loading contributes the least to the biodiesel production process, as low as 1.233 %, according to the pareto graphic analysis shown in figure 4. in this study, the catalytic reaction was conducted at two different catalyst loadings (2.0 wt.% and 4.0 wt.%). at the high catalyst loading of 4 wt.%, a higher biodiesel yield was achieved, because enough active sites were available for the reaction, enhancing the intimacy between the catalyst and the reactants [14]. meanwhile, at the low catalyst loading, there was a decrease in the biodiesel yield, because the catalyst active sites were insufficient to promote the reaction to completion [20].
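the percentage contributions plotted in figure 4 can be recovered from table 5 as each term's sum of squares divided by the model sum of squares – an inference consistent with the published numbers rather than a formula stated in the paper; a sketch:

```python
# sums of squares from table 5; contribution % = SS_term / SS_model * 100
ss = {"C": 36.75, "t": 1201.14, "M": 562.05, "T": 86.44,
      "Ct": 36.69, "CM": 125.16, "CT": 118.32,
      "tM": 166.73, "tT": 418.92, "MT": 240.33}
ss_model = 2992.54  # model sum of squares from table 5

for term, value in sorted(ss.items(), key=lambda kv: -kv[1]):
    print(f"{term:>2}: {100 * value / ss_model:6.3f} %")
# t: 40.138, M: 18.782, tT: 13.999, MT: 8.031, ...
# agreeing with figure 4 to within rounding
```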
3.2.3. analysis of interaction effects

the anova results revealed that the interactions between the process variables have a significant effect on the biodiesel yield. in this study, six interaction terms were displayed by the model (table 5). among these interactions, only MT and tT contributed significantly to the biodiesel production process, with percentage contributions of 8.028 % and 14.001 %, respectively.

figure 5 depicts the combined effects of the reaction time (t) and reaction temperature (T) on the biodiesel yield (Y), while keeping the catalyst loading (C) and methanol/oil molar ratio (M) at 2 wt.% and 6 : 1, respectively. the plot reveals that, at a 50 °c reaction temperature, the maximum biodiesel yield is achieved within 120 min rather than 60 min. however, when the reaction was conducted at 65 °c, the biodiesel yield obtained in 60 min was greater than the one obtained in 120 min. the former observation can be explained by the fact that, at the lower temperature and longer reaction time, the methanol did not evaporate away – the reaction temperature studied is less than the boiling point of methanol, which is around 65 °c – and, as a result, the methanol was sufficient to drive the reaction forward.

the combined effect of the methanol/oil molar ratio and the reaction temperature on the yield of biodiesel is depicted in figure 6. when the reaction was conducted at a temperature of 65 °c, a catalyst loading of 2 wt.% and a 60 min reaction period, an increase in the methanol/oil molar ratio resulted in an increase in the biodiesel yield. however, when the reaction was conducted at 50 °c for 60 min using the 2 wt.% ase catalyst dose, an increase in the methanol/oil ratio was found to reduce the yield of biodiesel. the reduction in the biodiesel yield at the higher methanol/oil ratio is probably due to the dissolution of glycerol in methanol, which subsequently shifts the reaction backward [14].

3.3. numerical optimization of process variables

having determined the optimum process variables for the production of biodiesel from the wfo using the ase catalyst to be 4 wt.%, 120 min, 12 : 1 and 65 °c for the catalyst loading, reaction time, methanol/oil molar ratio and reaction temperature, respectively, a further experimental run was conducted at these conditions, and the experimental biodiesel yield was 77.56 %. the predicted biodiesel yield, calculated from the empirical model developed by the design expert software, was 79.20 %, which is slightly greater than the actual value, by 1.64 percentage points. the result indicates that a maximum biodiesel yield can be achieved when the transesterification of the wfo over the ase catalyst is conducted at the maximum values of the variables studied.

3.4. catalyst reusability study

the biodiesel yields for the four reaction cycles conducted under the optimum conditions were 58.22 %, 42.71 %, 29.88 % and 18.92 %, respectively. the reduction in the biodiesel yield is probably due to the saturation of the catalyst active sites by oil molecules. in addition, active sites were lost during the catalyst regeneration process, and as a result, the biodiesel yield decreased upon reuse. this observation is similar to the one reported by olutoye et al. [9].

3.5. analysis of synthesized biodiesel
3.5.1. physicochemical properties of the synthesized biodiesel

the properties of the prepared wfo-based biodiesel, such as the specific gravity, kinematic viscosity, cloud point, pour point and flash point, were determined and compared with the astm d6751 standard, as shown in table 6.

table 6. physico-chemical properties of the wfo-based biodiesel compared to the astm d6751 standards.

property               value          astm d6751
specific gravity       0.879          0.86–0.90
kinematic viscosity    3.58 mm^2/s    1.9–6.0 mm^2/s
cloud point            2 °c           −3 to 15 °c
pour point             −4.8 °c        −5 to 10 °c
flash point            145 °c         ≥ 130 °c

3.5.2. ftir analysis

the ftir spectra of the wfo and its biodiesel are depicted in figure 7. the spectra of the two substances are very similar because of the high degree of similarity between triglyceride and methyl ester [15], and some of the peaks shifted after the conversion process. the absorption bands at 2990–2850 cm−1 are assigned to c–h antisymmetric and symmetric stretching, while the sharp peak at 1743.71 cm−1 is assigned to the c=o stretching mode of esters. the bands at 1155.20 cm−1 and 1026.16 cm−1 both correspond to the c–o stretching of esters. the absorption band at 1437.71 cm−1 can be attributed to the ch3 antisymmetric deformation. all these functional groups found in the prepared biodiesel confirm the presence of methyl esters.

figure 7. ftir spectra of (a) wfo and (b) synthesized biodiesel.

3.5.3. gc-ms analysis

the chromatogram of the wfo biodiesel is depicted in figure 8, which confirms the presence of methyl esters, and table 7 summarizes the chromatograms. both figure 8 and table 7 display the composition of the biodiesel from the transesterification of the wfo, which is mainly methyl palmitate, methyl oleate, methyl linoleate and methyl palmitoleate.

figure 8. result of the gc-ms analysis on the biodiesel synthesized under the optimized factor combination of 4.0 wt.% catalyst loading, 120 min reaction time, 12 : 1 methanol/oil molar ratio and reaction temperature of 65 °c.

table 7. results of the gc-ms analysis on the biodiesel produced under optimum conditions.

compound              retention time (min)   concentration (wt.%)
methyl palmitate      38.36                  33.33
methyl oleate         46.87                  41.01
methyl linoleate      42.01                  10.14
methyl palmitoleate   47.81                  4.24
others                –                      7.17

4. conclusions

a non-toxic and reusable supported catalyst was formulated by an incipient wetness impregnation of alumina with eggshell. the prepared ase catalyst, with a bet surface area of 78.2 m^2/g and a basicity of 1.88 mmol/g cat, exhibited the best performance in the transesterification of the wfo with methanol to biodiesel. an investigation of the effects of the transesterification process variables on the biodiesel yield revealed that the reaction time contributed the most to the biodiesel production process, as much as 40.139 %, while the catalyst loading contributed the least, as low as 1.233 %. the interaction between the reaction time and the reaction temperature had the most significant effect on the biodiesel yield, compared to the other interaction terms. the maximum biodiesel yield was obtained under the optimized factor combination of 4.0 wt.% catalyst loading, 120 min reaction time, 12 : 1 methanol/oil molar ratio and reaction temperature of 65 °c. a reduction in the catalyst performance after four reaction cycles was noticed, indicating a loss of active sites.

references

[1] abdulkareem, a.s., uthman, h., afolabi, a.s., & awenebe, o.l. (2011). extraction and optimization of oil from moringa oleifera seed as an alternative feedstock for the production of biodiesel. in: sustainable growth and applications in renewable energy sources, majid nayeripour (ed.), isbn 978-953-307-408-5, intech.

[2] fabbri, d., bevoni, v., notari, m., & rivetti, f. (2007).
properties of a potential biofuel obtained from soybean oil by transmethylation with dimethyl carbonate. fuel, 86(5/6), 690–697. doi:10.1016/j.fuel.2006.09.003.

[3] furuta, s., matsuhashi, h., & arata, k. (2004). biodiesel fuel production with solid super acid catalysis in fixed bed reactor under atmospheric pressure. catalysis communications, 5(12), 712–723. doi:10.1016/j.catcom.2004.09.001.

[4] granados, m.l., poves, m.d.z., alonso, d.m., mariscal, r., galisteo, f.c., moreno-tost, r., santamaria, j., & fierro, j.l.g. (2009). biodiesel from sunflower oil by using activated calcium oxide. applied catalysis b: environmental, 73(3–4), 317–326. doi:10.1016/j.apcatb.2006.12.017.

[5] jitputti, j., kitiyanan, b., bunyakiat, k., rangsunvigit, p., & jakul, p.j. (2006). transesterification of crude palm kernel oil and crude coconut oil by different solid catalysts. chemical engineering journal, 116(1), 61–66. doi:10.1016/j.cej.2005.09.025.

[6] leung, d.y.c., & guo, y. (2006). transesterification of neat and used frying oil: optimization for biodiesel production. fuel processing technology, 87, 883–890. doi:10.1016/j.fuproc.2006.06.003.

[7] mbaraka, i.k., & shanks, b.h. (2006). conversion of oils and fats using advanced mesoporous heterogeneous catalysts. journal of the american oil chemists' society, 83, 79–91. doi:10.1007/s11746-006-1179-x.

[8] olutoye, m.a., & hameed, b.h. (2013). a highly active clay-based catalyst for the synthesis of fatty acid methyl ester from waste cooking palm oil. applied catalysis a: general, 450, 57–62. doi:10.1016/j.apcata.2012.09.049.

[9] olutoye, m.a., wong, s.w., chin, l.h., asif, m., & hameed, b.h. (2016). synthesis of fatty acid methyl esters via transesterification of waste cooking oil by methanol with a barium-modified montmorillonite k10 catalyst. renewable energy, 86, 392–398. doi:10.1016/j.renene.2015.08.016.

[10] olutoye, m.a., adeniyi, o.d., & yusuff, a.s. (2016). synthesis of biodiesel from palm kernel oil using mixed clay-eggshell heterogeneous catalysts. iranica journal of energy and environment, 7(3), 308–314. doi:10.5829/idosi.ijee.2016.07.03.14.

[11] paintsil, a. (2013). optimization of the transesterification stage of biodiesel production using statistical methods. msc. thesis, school of graduate and postgraduate studies, university of western ontario, canada.

[12] refaat, a.a. (2011). biodiesel production using solid metal oxide catalyst. international journal of environmental science and technology, 8(1), 203–221. doi:10.1007/bf03326210.

[13] shuit, s.h., yit, t.o., keat, t.l., bhatia, s., & soon, h.t. (2012). membrane technology as a promising alternative in biodiesel production: a review. biotechnology advances, 30(6), 1364–1380. doi:10.1016/j.biotechadv.2012.02.009.

[14] tan, y.h., abdullah, m.o., hipolito, c.n., & taufiq-yap, y.h. (2015). waste ostrich and chicken-eggshells as heterogeneous base catalyst for biodiesel production from used cooking oil: catalyst characterization and biodiesel yield performance. applied energy, 2, 1–13. doi:10.1016/j.apenergy.2015.09.023.

[15] taufiq-yap, y.h., abdullah, n.f., & basri, m. (2011). biodiesel production via transesterification of palm oil using naoh/al2o3 catalysts.
sains malaysiana, 40(6), 587–594.

[16] umdu, e.s., tuncer, m., & seker, e. (2009). transesterification of nannochloropsis oculata microalga's lipid to biodiesel on al2o3 supported cao and mgo catalysts. bioresource technology, 100, 2828–2837. doi:10.1016/j.biortech.2008.12.027.

[17] yee, k.f., & lee, k.t. (2008). palm oil as feedstock for biodiesel production via heterogeneous transesterification: optimization study. international conference on environment (icenv), 1–5.

[18] yusuff, a.s., adeniyi, o.d., olutoye, m.a., & akpan, u.g. (2017). a review on application of heterogeneous catalyst in the production of biodiesel from vegetable oils. journal of applied science and process engineering, 4(2), 142–157.

[19] yusuff, a.s., adeniyi, o.d., olutoye, m.a., & akpan, u.g. (2018). development and characterization of a composite anthill-eggshell catalyst for biodiesel production from waste frying oil. international journal of technology, 1, 110–119. doi:10.14716/ijtech.v9i1.1166.

[20] yusuff, a.s., adeniyi, o.d., azeez, o.s., olutoye, m.a., & akpan, u.g. (2019). synthesis and characterization of anthill-eggshell-ni-co mixed oxides composite catalyst for biodiesel production from waste frying oil. biofuels, bioproducts & biorefining, 13, 37–47. doi:10.1002/bbb.1914.

acta polytechnica vol. 42 no. 2/2002

sampling by fluidics and microfluidics

v. tesař

selecting one from several available fluid samples is a procedure often performed especially in chemical engineering. it is usually done by an array of valves sequentially opened and closed. not generally known is an advantageous alternative: fluidic sampling units without moving parts. in the absence of complete pipe closure, cross-contamination between samples cannot be ruled out. this is eliminated by arranging for small protective flows that clear the cavities and remove any contaminated fluid. although this complicates the overall circuit layout, fluidic sampling units with these "guard" flows were successfully built and tested. recent interest in microchemistry leads to additional problems due to very low operating reynolds numbers. this necessitated the design of microfluidic sampling units based on new operating principles.

keywords: sampling, fluidics, microfluidics, fluidic valves, flow switching, fluid samples.

1 introduction

analysis of the chemical composition of fluids is an important operation in chemical engineering, often performed continuously, especially in monitoring the operation of chemical reactors. requirements of accuracy and reproducibility make composition analysers expensive. when monitoring several reactors operating in parallel – with the fluid composition varying slowly, as is often the case – it may be a good idea to use a single analyser sequentially evaluating samples taken from several locations. this requires placing a selector (or multiplexer) sampling unit in front of the analyser. this unit operates in discrete time steps: one sample is selected and delivered to the analyser at each time step. there are many possible variants. the fluid flows not selected at a particular instant may be halted (turning-down action) or dumped to a vent outlet (diverting action). the individual time steps need not be of the same duration.
the sequence of the samples at the output may be simply periodic, or varied according to some programme. there are, nevertheless, features common to all sampling unit designs:

1) the unit has several input terminals and a single output (an exception being aggregate designs with, e.g., several outputs serving two or more analysers).
2) the unit performs a spatio/temporal conversion (fig. 1).
3) the operation is controlled by an external signal (though the signal generator may be integrated into the unit – even to the degree of sharing some of its components).
4) an essential requirement is to eliminate any possibility of cross-contamination between the samples.

2 sampling reaction products from parallel chemical reactors

conventional sampling units use mechanical valves – a single special multiposition scanning valve, or an array of common two-position valves. the schematic fig. 2 shows a typical case. a modern trend in chemistry is to perform chemical processes in microreactors [3]. the required total production rate is obtained by "numbering up" – operating a large number of microreactors – instead of scaling up. one advantage of this approach is the better controllability of the process, due to the substantially increased surface-to-mass ratio in small reactors. to use this advantage, the control system needs knowledge about the process, and the composition of the reaction products is one of the principal items of information required. the arrangement according to fig. 2 – often with hundreds of sampled reactors (rather than only four, as shown here for simplicity) – has become increasingly common.

fig. 1: the task of the selector sampling unit: converting spatial separation between fluid samples at the input into temporal separation at the output.

fig. 2: schematic diagram of a typical application of the selector sampling unit for testing reaction products at the outputs of chemical reactors operated in parallel and sharing a single analyser.

fig. 3 presents a standard version of the selector sampling unit, using two-position mechanical valves. there are two essential constituent parts: the array of valves and the flow junction circuit. the latter is admittedly trivial in the case of the mechanical closure of the flowpath in fig. 3, where all valve outlets may be simply connected together.
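as an illustration of the discrete-time, one-sample-per-step operation described in the introduction, a purely conceptual python sketch of a periodic (round-robin) selector schedule – not something from the paper:

```python
from itertools import cycle

def round_robin_sampler(inputs, steps, dwell_s=1.0):
    """yield (time, selected input) pairs for a periodic selector schedule;
    all inputs not selected at a given step are diverted to the vent."""
    t = 0.0
    for source in cycle(inputs):
        if steps <= 0:
            return
        yield t, source
        t += dwell_s
        steps -= 1

for time_s, reactor in round_robin_sampler(["r1", "r2", "r3", "r4"], steps=8):
    print(f"t = {time_s:4.1f} s -> analyser reads {reactor}")
```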
3 fluidics: no-moving-part valves

chemical reactors commonly operate at high temperature and/or with aggressive fluids. this makes the conditions for the valves rather demanding. the small size of microreactors, calling for correspondingly small valves, makes the demands even more formidable. although microvalves with moving parts have been demonstrated, they are difficult to manufacture, expensive, and prone to breakage. there are many advantages offered by an alternative: purely fluidic valves [2]. these are easy to manufacture by methods used in micro-engineering, may be made from heat- and corrosion-resistant materials, and have an almost unlimited operating life, even under extreme conditions. there are many variants – fig. 4 presents a typical example of a valve operating at large reynolds numbers. this is a diverter-type valve. the sample flow is accelerated in the main nozzle and directed as a jet towards the two opposite collectors (receivers). one of these is connected to the analyser output; the jet is deflected into it in the "open" state of the valve. the other collector is connected with the vent, into which the sample flow is dumped in the "closed" state. the valve in fig. 4 is symmetric, so the roles of the two collectors are mutually interchangeable. using the coanda effect saves the power required for jet deflection: just short switching pulses suffice for control – to move the jet from one attachment wall to the opposite one. a valve is basically just a shaped constant-depth cavity, easily made by standard microengineering manufacturing techniques, such as etching. there is no expensive assembly of moving components, nothing to break or seize. it may well operate without maintenance at extremely high temperatures, and it is easily made even from refractory materials with high corrosion and abrasion resistance.

on the other hand, the design of a sampling unit with fluidic valves is quite exacting, much more than with mechanical valves, and there are several inherent problems. the control signal is usually carried by a control fluid, which in some valve designs may mix with the sample fluid. even the use of an inert control fluid, not detected by the analyser, does not avoid the problems caused by this mixing: the sample is diluted, demanding a higher analyser sensitivity. the fact that the sample flow is not turned down but diverted into the vent means the unit is complicated by the presence of vent channels. the greatest problem is caused by the complex hydrodynamic properties of fluidic valves when compared with their mechanical counterparts. when the solenoid valves in fig. 3 are closed, no flow can pass through them, whatever the conditions are either upstream or downstream. this is not so simple in the case of a fluidic valve. when nominally in its "closed" state – i.e., diverting the sample into the vent – the valve may either overspill some sample fluid into the analyser or, on the other hand, may return fluid back from the analyser due to the jet pumping effect. which of these two effects takes place depends upon the pressure levels in the other parts of the fluidic system. when designing a fluidic sampling unit, the pressure levels in the outlets must be carefully adjusted, taking into account the valve loading characteristics.

fig. 3: schematic diagram of the sampling unit from fig. 2 with traditional control of the sample flows by solenoid valves. in this case the junction circuit may be simply a common connection of all pipes, as mechanically closed valves permit no flow in their "closed" state.
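the control behaviour of the two valve families discussed in this section can be caricatured in a few lines of python (a conceptual sketch, not from the paper): a bistable valve latches on a short control pulse, while a monostable valve follows the control level:

```python
class BistableValve:
    """coanda-effect diverter: a short pulse on either control port
    latches the jet to the corresponding attachment wall."""
    def __init__(self):
        self.state = "open"  # jet attached towards the analyser collector

    def pulse(self, port: str) -> None:
        """a short control pulse is enough; the coanda effect holds the jet."""
        self.state = "closed" if port == "vent_side" else "open"

class MonostableValve:
    """single attachment wall: 'open' by default, held 'closed' only
    while the control flow keeps acting."""
    def state(self, control_flow_on: bool) -> str:
        return "closed" if control_flow_on else "open"

b, m = BistableValve(), MonostableValve()
b.pulse("vent_side")
print(b.state, m.state(False))  # closed open
```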
fig. 4: an example of a typical no-moving-part fluidic valve. this is a jet-type bistable diverter, its bistability achieved by the coanda-effect [7] attachment of the jet (issuing from the supply nozzle) to one of the two attachment walls. black triangles in the schematic representation: nozzles. white triangles: collectors with their diffusers.

4 auxiliary small flows: cavity cleaning flows and "guard" flows

overspilling into the analyser terminal in the "closed" state is unacceptable: the overspilled fluid would contaminate the tested sample. on the other hand, the jet pumping effect (due to the entrainment into the jet) in the "open" state leads to the generation of a reverse flow in the "off" outlet, which may be useful. note in fig. 3 that the channels and all other "dead volumes" downstream from the closed valves remain full of fluids different from the sample we wish to deliver to the analyser. of course, this fluid is stagnant, but it is impossible to ensure absolutely that it does not mix with the tested sample by diffusive and even eddy convection processes. as shown in fig. 6, small reverse flows back into the "closed" valves due to the jet pumping effect remove the contaminating fluid from these volumes – and thus assure a high sample purity. this capability to generate cleaning reverse flows is another important advantage of purely fluidic sampling units. of course, reverse flows mean the loss of a part of the sample flow; to keep the loss acceptable, these cleaning flows must be very small.

the consequences of the auxiliary protective flows for the fluidic circuit design are shown here in examples using another, monostable version of fluidic valves, as shown in figs. 7 and 8. monostability is achieved by removing one of the attachment walls or placing it sufficiently far away from the nozzle exit to render the coanda effect on this side ineffective. the advantage is simpler control, as there is just a single control channel per valve (note the two control channels per valve needed in fig. 4). the disadvantage is the permanent control flow needed to keep the jet deflected (instead of just the short switching pulse needed by a bistable valve). however, with the small absolute values of the flow rates in microchemistry, the control fluid consumption is usually not a very important consideration.

fig. 7 shows a monostable valve in the "open" state, without control flow. the main jet always attaches to the remaining single attachment wall and is led into the analyser – without contamination or dilution by the control fluid. fig. 8 shows the same valve with the control flow applied, i.e., switched into the "closed" state. the control fluid flow pushes the sample fluid jet away from the attachment wall and diverts it into the vent outlet. contamination or dilution by the control fluid in this state is acceptable (it does not get to the analyser).

fig. 5: schematic diagram of the sampling unit from fig. 3 with the solenoid valves replaced by fluidic diverters. apart from the additional complexity of the control and vent channels, a more complex flow junction circuit is often required – in this example using symmetric jet pumps (fig. 9).

fig. 6: schematic representation of flows in the flow junction circuit. to eliminate sample cross-contamination, it is advisable to generate small reverse protective flows that clean the previous sample fluids from the downstream cavities.
fig. 7: an example of a monostable fluidic jet-deflection diverter valve, shown here in its "open" state. there is only one attachment wall – the other wall on the opposite side is placed far from the main nozzle exit. as before, the black triangles in the schematic representation are nozzles, the white triangles are collectors.

the desirable backward flow through the main collector – leading to the cleaning action according to fig. 6 – is due to the entrainment of outside fluid into the jet, similarly as in jet pumps. the cavities through which the sample fluid passes in the fluidic valves are not completely mutually separated – they communicate through the common vent outlets. all but one of the sample flows, and all the control flows coming from the "closed" valves, meet there and freely mix. this "wild" mixture must be prevented from coming into contact with the sample passing through the only "open" valve before it reaches the analyser. the sample purity requirements call for another small auxiliary protective flow to be generated. this "guard flow" is obtained by sacrificing a small amount of the sample flow and directing it into the vent – as shown in fig. 7. generating both the "guard flow" and the "cleaning reverse flow" in the "off" collectors, in the two states of the same valve, is not easy – due to the mutually opposite requirements. it is possible, but it calls for a careful adjustment of the conditions.

5 flow junction circuits: flow enhancing and flow inhibiting elements

the task of getting the desired orientations of the small auxiliary flows in the "off" collectors of the valves may be made easier, and less dependent on precise adjustments, by placing additional fluidic elements into the junctions of the valve exits. this results in special flow junction circuits (cf. fig. 5 and fig. 6). the fluidic elements there are also no-moving-part devices (sharing the advantages of easy manufacture and extreme reliability under adverse conditions). their task is to generate a specified mutual interaction of the flows that meet in them. the effects depend upon dynamic effects in the flowing fluids, which again limits the use of these elements to higher reynolds numbers, of the order of at least 10^2.

the two basic cases of such fluidic flow interaction elements are shown in fig. 9 and fig. 10. in spite of their opposite roles, they are physically quite similar: both consist of two nozzles and one collector, and differ only in the magnitude of the angle at which the nozzle exits meet. if the angle is small, as in fig. 9, the jet generated in one nozzle induces a flow in the other inlet by the jet pumping action. in fact, the element in fig. 9 is a jet pump, differing from common jet pumps only in being symmetric, as the roles of the two inlets must be mutually interchangeable – each can serve as either a driven or a driving inlet. the element is connected to the vents of two adjacent diverter valves to generate flows directed away from the valves. this is useful for promoting or even generating "guard" flows. the symmetry of the layout slightly compromises the achievable jet pumping effect – but this is not very important, since the required generated "guard" flows must be small anyway.

the very opposite effect is achieved in the other element, shown in fig. 10. it has a large angle α, close to 180°. the flow admitted into one of the nozzles tends to generate a flow of the opposite sign in the other inlet. this element is usefully connected to the output terminals of two adjacent diverter valves.

fig. 8: the fluidic monostable diverter valve from fig. 7 shown in its "closed" state, with the desirable flow reversal in the main collector (which then temporarily operates as a nozzle).

fig. 9: simple fluidic flow junction element – a symmetric jet pump – in which one of the incoming flows enhances the magnitude of the flow in the other inlet.
the flow admitted into one of the nozzles tends to generate a flow of opposite sign in the other inlet. this element is usefully connected to the output terminals of two adjacent diverter valves. the sample flow from the valve in its “open” state 44 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 42 no. 2/2002 fig. 8: the fluidic monostable diverter valve from fig. 7 shown in its “closed” state, with the desirable flow reversal in the main collector (which then temporarily operates as a nozzle). second flow enhancement fig. 9: simple fluidic flow junction element – a symmetric jet pump – in which one of the incoming flows enhances the magnitude of the flow in the other inlet. generates the desirable backward flow through the output collector of the other valve, which is in the “closed” state. of course, specific requirements of pressure and flow balancing in the junction circuit may call for devices in between, with intermediate values of the angle. 6 complete sampling unit a simple, two-valve case of a unit built from the elements described above is shown in the circuit diagram fig. 11. it is switched between only two sample flows. both flow interaction elements as discussed above are used here in the valve outlets. the flow reversing element (fig.10) is connected to the output terminals, its outlet leading to the analyser. the flow enhancing element (fig. 9) is connected to the vent outlets of the valves. such complete junction circuits – with both kinds of flow interaction elements – are, however, rare. at least one of the desirable small protective flows (and sometimes both) may be generated just by proper adjustment of the pressure levels. fig. 12 extends the above example by presenting a sampling unit designed to sample four channels. there are two valve pairs from fig. 11, each having its two flow interaction elements in the flow junctions at the primary level. the outlets from each pair are then connected to interaction elements of the secondary level junctions. a similar tree-like connection principle at higher levels is applied to build sampling units for a larger number of sample flows. the tree-like interconnections need not be completely symmetric (in the case of six sample flows the primary level junctions of the third switched valve pair are connected directly to the tertiary level interaction elements). © czech technical university publishing house http://ctn.cvut.cz/ap/ 45 acta polytechnica vol. 42 no. 2/2002 fig. 10: another simple fluidic flow junction element – differing from the one in fig. 9 only in the mutual inclination angle � of the two nozzles. in this case one of the incoming flows suppresses or even reverses the flow in the other inlet. fig. 11: schematic diagram of the basic circuit of the fluidic sampling unit – with only two inlets and two monostable valves, one of them “open” while the other is “closed”. fig. 12: circuit diagram of a fluidic sampling unit with four inlets. the complexity of the circuits with the various directions of the flows (indicated by arrows) is increased, but the basic principle is a simple repetition of the connections from fig. 11. fig. 13: example of a fluidic sampling unit with four inlets, etched integrally in a single plate. the disadvantage of this layout is the inevitable crossing of the valve outlet channels. manufacturing by etching all the valves and interconnecting channels in a single plate is thus impossible. in fig. 
in fig. 13, showing the appearance of the corresponding sampling unit from fig. 12, the flow-enhancing junction elements are not visible, being located at another level, in a different plate stacked on top of the plate shown here.

7 low re microfluidic sampling units

the continuing tendency towards decreasing device sizes has already led to microdevices of micrometer size, not visible to the naked eye. typical present-day microreactors are larger, but some of them have been built in submillimetre dimensions. typical applications are on the one hand in analytical chemistry – the "lab-on-chip" performing hundreds to thousands of parallel analyses, such as dna "fingerprinting", which may bring to reality the goal of an objective diagnosis of diseases (another business behind the recent efforts is paternity testing). another promising field is synthetic chemistry, currently aimed particularly at producing liquid fuels (and processing them for fuel cells). at the submillimetre scale, the associated microfluidic valves required by microreactors – for reactant flow control as well as for sampling units – need to operate at a low reynolds number re, the product of the flow velocity and a characteristic dimension (usually the main nozzle width) divided by the kinematic viscosity. low re sampling units are required to handle high-viscosity fluid samples, such as viscous biological liquids or hot gas. the flow rates, and hence the velocities, are quite often small, because the total flow through a reactor is limited by the requirement of the reactor residence time – and the samples taken from it represent a loss missing from the production, so that there is every reason to keep them at a minimum.

the reynolds number may be interpreted as the ratio of the inertial forces to the viscous forces acting on the fluid. if it is low, the dynamic effects even in an accelerated fluid downstream from a nozzle are weak; the flow is dominated by viscous damping. these inertial effects, however, are the very phenomena upon which the operation of the fluidic valves and other elements described above is based. below about re ≈ 40, a flow issuing from a nozzle forms no jet at all: the fluid simply spreads equally in all available directions. in this subdynamic regime [4], the operation of jet-type fluidic elements ceases to be possible. in fact, the coanda-effect attachment to a wall is absent already below about re ≈ 800. this means that, with the present tendency towards smaller and smaller sizes, fluidic sampling units as described above may increasingly often be found to be out of the question in microfluidic applications [4]. new operating principles are needed. indeed, some recent microfluidic devices rely on unusual driving effects – such as electro-osmotic forces. it is, however, possible to remain within the domain of classical hydromechanics by using pressure-driven valves. instead of relying on the acceleration in a nozzle, the sample flow is forced through the valve by a constant pressure difference between the vent v and the output terminal y. generating and maintaining this pressure difference may require the addition of a pressure regulator. this is usually no problem in the context of mems (micro electromechanical systems), of which microfluidic sampling units form a part – often made by etching directly on a silicon chip together with the electronic control circuits, which are thus readily available.
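for orientation, the driving pressure difference Δpyv needed to force a given sample flow through an etched channel can be estimated from the laminar (hagen-poiseuille type) relation for a shallow rectangular channel, q ≈ Δp·w·h^3/(12·μ·l), valid for h ≪ w; all numbers in the sketch below are illustrative assumptions, not values from the paper:

```python
# laminar flow in a shallow rectangular channel (h << w): Q = dp*w*h^3/(12*mu*L)
# all numbers below are illustrative assumptions, not data from the paper
mu = 1.8e-5  # Pa*s, dynamic viscosity of a gas sample
w, h, L = 1.0e-3, 0.1e-3, 20e-3  # channel width, depth and length [m]
Q = 5e-8     # m^3/s, assumed sample flow rate

dp = 12 * mu * L * Q / (w * h**3)
print(f"required driving pressure difference: {dp:.0f} Pa")  # ~216 Pa here
```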
at low re there is no point in using flow interaction elements (figs. 9 and 10): jet pumping fails without turbulence, and flow collision ceases to be interesting without inertial effects. the terminals of all the valves of a low-re unit – both the vent terminals v and the output terminals y – are usually simply mutually connected. this means the driving pressure difference Δpyv acts simultaneously in all the valves of the unit. the driving pressure must be carefully adjusted so as to obtain, in the "open" state, not only the required sample flow into the output terminal y, but at the same time also a small but desirable spillover "guard flow" into the vent terminal v. the problem then arises in the "closed" state. first, it is more difficult to generate the desirable cleaning back flow, since it must overcome the pressure difference Δpyv acting in the opposing direction. at the same time, there is no hope of fulfilling this more difficult task by the jet pumping effect generated by the sample flow, which is too slow.

8 jet pumping by a powerful control flow

one solution, covered by british patent applications [1], [5], was found by the author and is described in ref. [4]. the idea is that if the sample flow is too weak, another available flow must be used – and the only flow that may be sufficiently powerful under the circumstances is the acting control flow, applied in the "closed" state. this idea had to overcome a psychological obstacle: designers of fluidic valves quite naturally assume that the control flow is many times smaller than the controlled flow – this is why such valves are usually called "fluidic amplifiers". the ratio of the supply flow rate to the control flow rate, usually made as large as possible, is called the "flow gain". the new idea had to accept disproportionately large control flows and hence extremely low – in fact fractional – values of the gain. as shown in figs. 14 and 15, the new microfluidic valves incorporate a classical, large-re fluidic element: a jet pump. this generates the desired cleaning return flow in the "closed" state (fig. 14).

fig. 14: schematic representation of a fluidic valve [5] generating the desirable reverse cleaning flow in the "closed" state by the jet pumping effect of a powerful control jet. for this purpose, the valve contains what is basically an integral jet pump.
an efficient layout of the jet pump part – e.g., according to fig. 19 – improves the achievable cleaning backflow in the “closed” state, but because of the ensuing complexity of the sample path it also means a higher pressure loss Δpyv in the “open” state, to be maintained by the pressure regulator – and to be overcome by stronger “closed” state jet pumping. in the example shown in fig. 16, no attempt is made to have a really efficient jet pump and there is only a very short jet-pump collector (cf. fig. 14). the idea is to make the flow path through the “open” valve shorter and easier, generating a lower pressure loss Δpyv. this means a lower pressure difference to be overcome by jet pumping, which therefore can afford to be less effective. another feature typical for low-re microfluidic devices is the missing cross-sectional area contraction in the supply (“main”) nozzle, as there is no point in trying to accelerate the sample flow there. the operation of a sampling unit using these valves may be followed in the circuit diagram of the basic valve pair shown in fig. 17. to reflect the degenerated character of the jet-pump collectors in these valves, the collector symbols in this schematic diagram are replaced by just the symbol of a short channel. there are no flow interaction fluidic elements in the junctions of the valve exits. as a consequence, maintaining the driving pressure difference Δpyv is simplified – it has to be kept just between the final exit terminals of the complete sampling unit circuit. this would, of course, also apply if there were more valve pairs in a sampling unit handling a larger number of sample flows.

second example of a microfluidic valve and sampling unit

in the other example of a microfluidic valve, shown in fig. 18, the idea of a limited jet pumping efficiency traded off for a smaller driving pressure difference Δpyv was taken even further – to the degree of the jet pump being hardly recognisable. the sample flow path in the “open” state is here only slightly curved.

fig. 15: the fluidic valve from fig. 14 shown schematically in its “open” state. the flow to the analyser is driven by the applied constant pressure difference Δpyv, adjusted so as to obtain the proper magnitude of the “guard flow”.

fig. 16: an example of a microfluidic valve [5] for low-re microfluidic sampling units. the desired reverse flow in the “closed” state is obtained by the jet pumping effect of the control flow (cf. fig. 14). the jet pump is very rudimentary. the actual size is at least an order of magnitude smaller than for the fluidic valves in fig. 8.

fig. 17: schematic circuit diagram of the basic valve pair unit (cf. fig. 11) which forms the basis of microfluidic sampling units [1] – here with the valves from fig. 14. the constant driving pressure difference Δpyv is applied between the exit terminals.

the rudimentary jet pump suffices since the cleaning back flow is to be very small. this is due to the large number of valves in a realistic sampling unit, all of them simultaneously in their “closed” state – while only one valve is “open”. if the back flow were 1 % of the sample flow, then with 80 valves generating the jet pumping back flow the total loss would be 80 % of the sample. an important feature of the design is the “nose” (fig. 18) that seals off the output channel from the vent area and prevents the generated low pressure from being equalised by an unwanted return flow from the vent space. the computed total pressure field in fig.
20 (three-dimensional fluent solution using a two-dimensional turbulence model with the rng low-re-turbulence modification) documents that even the reduced “jet pump” generates sufficient jet pumping. note in fig. 18 or fig. 20 that there is no such “nose” on the opposite, supply nozzle side. there is, on the contrary, a setback “step”. its task is to reduce the effect of the powerful control flow on the sample flow. even though we talk here about a “closed” state, it is undesirable to limit or otherwise influence the sample flow, as this would lead to unwanted changes in the reactor operation. the valve example from fig. 18 was tested by the author in a sampling unit [6] built for a research facility with a relatively small number – 16 – of test reactors for the high-throughput testing of catalysts for the fischer-tropsch reaction (hydrogenation of carbon monoxide to ethanol). since this is a high-temperature reaction (400 °c), and the sampling unit is located in the vicinity of the reactors, it was manufactured in stainless steel and its control is external. the valves are switched by essentially cold nitrogen control flows. fig. 22 shows the complete sampling unit made by etching in a 0.25 mm thin stainless steel foil, together with the flow-equalising restrictors. to achieve a higher aspect ratio of the cavities, two-sided etching was used. this would separate the internal part of the foil into 16 mutually movable parts. structural integrity is obtained by braces (seen in fig. 21) located downstream from the valves. the braces are etched only from one side, so that there is a space above them allowing the sample flow into the central cavity, which is connected to the analyser.

fig. 18: another example of a microfluidic valve based upon the principles from fig. 14, with an even more rudimentary jet pump part. the advantage is its less tortuous flow path in the “open” state, leading to a smaller driving pressure difference Δpyv.

fig. 19: another example of a microfluidic valve based on the same principle as in fig. 14 – this time with a full jet pump: note the long mixing tube and the even longer diffuser.

fig. 20: computed dynamic pressure field in the valve from fig. 18 in the “closed” state. decreasing pressure values are coded by the increasing darkness of the grey scale. note that even the rudimentary “jet pump” is capable of generating quite a low pressure (very dark grey colour) in the output collector, provided the control jet is really powerful.

fig. 21: a detail of a microfluidic sampling unit [1], [5] for high temperature applications, with valves according to fig. 18 arranged into a radial array. the unit is made by etching in a stainless steel foil.

the valves are not extremely small – the supply nozzle channel width is 0.34 mm – but they operate at a very low re. the sample fluid is “syngas” – fuel synthesis gas, a mixture of hydrogen and carbon monoxide. it reaches the valve at a pressure of 0.5 mpa (a substantial decrease from the 4 mpa in the reactors). despite the cooling of the valve plate by the nitrogen control flows, the kinematic viscosity of the sample gas (also due to its large content of h2) in the supply nozzle is as high as 40 × 10⁻⁶ m²/s. the nozzle exit bulk velocity is only about 3.8 m/s – despite the exceptionally favourable circumstance of the reactors serving only for tests, so that the full reaction product flow is available for composition analysis. as a result, the design had to cope with a very low supply nozzle reynolds number, about re ≈ 32. the “guard flow” magnitude is adjusted by the choice of the driving pressure, kept by an external pressure regulator, so that 6 % of the sample flow in the “open” state is spilled over into the vent. only a 3 % return flow in the “closed” state is chosen (which still means sacrificing in total more than one half of the available sample). in spite of this small desired return flow, the necessary control flow rate needed to generate the jet-pumping effect is about 40 times the sample mass flow rate supplied into the valve (in standard fluidics this would be an absurdly small flow gain of 0.025). with the lower viscosity of the control nitrogen gas and a narrower, 0.27 mm wide, control nozzle to get a higher control jet velocity (around 55 m/s), the reynolds number of the control jet is around re ≈ 1000, just high enough to get at least some vortex entrainment effect.
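the operating point just described can be cross-checked with a few lines of arithmetic. the following minimal python sketch reproduces the reynolds numbers, the sample-loss budget and the flow gain; the nitrogen kinematic viscosity (taken here as roughly 15 × 10⁻⁶ m²/s) and the variable names are assumptions of the illustration, while all the other values are those stated in the text:

def reynolds(velocity_m_s, width_m, kinematic_viscosity_m2_s):
    """re = bulk velocity * nozzle width / kinematic viscosity."""
    return velocity_m_s * width_m / kinematic_viscosity_m2_s

# supply ("main") nozzle: hot syngas sample, 0.34 mm wide nozzle
re_supply = reynolds(3.8, 0.34e-3, 40e-6)    # ~32, as stated in the text

# control nozzle: cold nitrogen, 0.27 mm wide; nu ~ 15e-6 m2/s is assumed
re_control = reynolds(55.0, 0.27e-3, 15e-6)  # ~1000, as stated in the text

# sample-loss budget of the 16-valve unit: one valve "open" with a 6 %
# guard spillover, the other 15 valves "closed" with a 3 % return flow each
loss = 0.06 + 15 * 0.03                      # ~0.51, i.e. over one half

# flow gain of the jet-pumping valve: control flow ~40x the sample flow
flow_gain = 1.0 / 40.0                       # 0.025

print(f"re_supply ~ {re_supply:.0f}, re_control ~ {re_control:.0f}")
print(f"sample lost to vents ~ {loss:.0%}, flow gain = {flow_gain}")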
9 conclusions

this paper presents essential information about a little known way – perhaps more difficult to design, but bringing interesting advantages – to connect an analyser sequentially to one fluid source at a time from a number of available sample sources. as long as the source sample flow can achieve a reynolds number of at least re ≈ 200 in the nozzle of the fluidic valve (and preferably more), it is possible to utilise what is essentially an already existing technology of fluidic jet-type valves and flow interaction elements. the complicating factor is the requirement to generate small protective flows to keep a perfect sample purity. this requirement becomes a real problem in present day microfluidics, characterised by the tendency to operate at very low re values. a solution based on the idea of an exceptionally powerful control jet was demonstrated to work successfully in a sampling unit built for operating with high-temperature gas.

acknowledgements

the development of the microfluidic sampling unit for hot synthetic gas was supported by iac (institute of applied catalysis, london, u.k.) as a part of the high throughput catalyst testing project managed by prof. r. w. k. allen. the author has gained much useful information from discussions with y. y. low and particularly with dr. j. r. tippetts, who is the originator of the earliest ideas on fluidic sampling.

references

[1] tesař, v., tippetts, j. r., allen, r. w. k.: fluid multiplexer. british patent application gb 0019767.9, april 2000.
[2] tesař, v.: valvole fluidiche senza parti mobili. [in italian] oleodinamica – pneumatica, rivista delle applicazioni fluidodinamiche e controllo dei sistemi. 1998, vol. 39, no. 3, p. 216. issn 1122-5017.
[3] ehrfeld, w.: microreaction technology: industrial prospects. berlin: springer, 2000, isbn 3-540-66964-7.
[4] tesař, v.: microfluidic valves for flow control at low reynolds numbers. journal of visualisation. tokyo: 2001, vol. 4, no. 1, p. 51–60. issn 1343-8875.
[5] tesař, v., tippetts, j. r., allen, r. w. k.: fluidic valve. british patent application gb 0003969, march 2000.
[6] tesař, v., allen, r. w. k., tippetts, j. r., low, y. y., adams, c.: high throughput catalyst testing: a novel multichannel microreactor with microfluidic flow control system. 8th nice (network for industrial catalysis in europe) workshop on fast analytical screening of catalyst and fast catalyst testing. espoo (finland): september 2000.
[7] tesař, v., randa, z.: large attachment-wall divergence angles in fluidic bistable coanda-effect diverters. proc.
of the colloquium “dynamika tekutin '93”, inst. of thermomechanics, as cr, praha, october 1993.

prof. ing. václav tesař, csc.
phone: +44 (0)114 222 7551
fax: +44 (0)114 222 7501
e-mail: v.tesar@sheffield.ac.uk
department of chemical and process engineering
the university of sheffield
mappin street, sheffield s1 3jd, united kingdom

fig. 22: photograph of the author’s microfluidic sampling unit [1], [4] with 16 pressure driven valves and their upstream restrictors (the spiral-shaped channels). the unit is made by through (two-sided) etching in a 0.25 mm thin stainless steel foil.

acta polytechnica, doi:10.14311/ap.2020.60.0369, acta polytechnica 60(5):369–389, 2020, © czech technical university in prague, 2020, available online at https://ojs.cvut.cz/ojs/index.php/ap

development of numerical models for the prediction of temperature and surface roughness during the machining operation of titanium alloy (ti6al4v)

ilesanmi daniyan a,∗, isaac tlhabadira b, khumbulani mpofu a, adefemi adeodu c

a tshwane university of technology, department of industrial engineering, staatsartillerie road, private bag x680, pretoria 0001, south africa
b tshwane university of technology, department of mechanical & automation engineering, staatsartillerie road, private bag x680, pretoria 0001, south africa
c university of south africa, department of mechanical and industrial engineering, 1724 florida park, johannesburg, south africa
∗ corresponding author: afolabiilesanmi@yahoo.com

abstract. temperature and surface roughness are important factors, which determine the degree of machinability and the performance of both the cutting tool and the work piece material. in this study, numerical models obtained from the response surface methodology (rsm) and artificial neural network (ann) techniques were used for predicting the magnitude of the temperature and surface roughness during the machining operation of titanium alloy (ti6al4v). the design of the numerical experiment was carried out using the response surface methodology (rsm) for the combination of the process parameters, while an artificial neural network (ann) with 3 inputs, 10 sigmoid hidden neurons and 3 linear output neurons was employed for the prediction of the values of the temperature. the ann was iteratively trained using the levenberg-marquardt backpropagation algorithm. the physical experiments were carried out using a dmu 80 monoblock deckel maho 5-axis cnc milling machine with a maximum spindle speed of 18 000 rpm. a carbide cutting insert (rckt1204mo-pm s40t) was used for the machining operation. a professional infrared video thermometer with an lcd display and camera function (mt 696), with an infrared temperature range of −50 to 1000 °c, was employed for the temperature measurement, while the surface roughness of the work pieces was measured using the mitutoyo sj-201 surface roughness machine. the results obtained indicate that there is a high degree of agreement between the values of the temperature and surface roughness measured in the physical experiments and the predicted values obtained using the ann and rsm. this signifies that the developed rsm and ann models are highly suitable for predictive purposes. this work can find application in the production and manufacturing industries, especially for the control, optimization and monitoring of the process parameters.

keywords: ann, algorithm, rsm, surface roughness, temperature.
1. introduction

titanium alloy (ti-6al-4v) is characterized by excellent mechanical properties, such as a high tensile strength, high stiffness, good formability and excellent corrosion resistance, in addition to its outstanding strength-to-weight ratio. it finds an extensive range of applications in different industries, such as biomedical, aerospace, automotive, marine, railway etc. [1, 2]. its ease of formability via extrusion often makes it a preferred choice for the development of complex and intricate profiles, and its outstanding strength-to-weight ratio often promotes energy and environmental sustainability. however, its low values of thermal conductivity and young's modulus often result in a poor surface finish and dimensional inaccuracies during machining operations [3]. since the degree of the surface finish often influences the product's quality, integrity and performance, the optimization of the process parameters is key during the machining operation of the titanium alloy. titanium alloys are classified into three groups, namely alpha (α), alpha-beta (α − β) and beta (β). commercially pure titanium and its alloys are mainly used in cryogenic applications, while the α − β alloys (mostly ti-6al-4v) are extensively used in the aerospace industries for the development of aircraft components, such as airframes and engine components [4–6]. the β alloys find use in high strength applications due to their good forgeability [7–9]. temperature is an important parameter, which determines changes in the mechanical behaviour, microstructure, surface finish and performance of the work piece as well as of the cutting tool during machining operations. cutting operations under a controlled temperature can enhance the cutting operation with the development of products with relieved residual stresses, a good surface finish and good mechanical properties. on the contrary, cutting operations under an uncontrolled temperature can decrease the useful life of the cutting tool, decrease the overall process sustainability and promote a poor surface finish of the final product. it can also bring about a significant reduction in the yield and ultimate tensile strength of the work piece material, thereby subsequently resulting in an increase in the strain fracture, crack growth and fatigue behaviour of both the cutting tool and the work piece [10, 11]. the cutting temperature increases as the major part of the energy required for the cutting operation is converted into heat at the primary shear zone, where the main cutting action takes place, at the secondary deformation zone of the chip-tool interface due to friction, as well as at the work piece-tool interface [12, 13]. the heat distribution across the work piece, the tool and the chips formed, based on their configurations and their thermal conductivities, results in temperature changes. the understanding of the temperature variation during a cutting operation will assist in the design and selection of the right cutting tool, as well as in the machinability analysis of the cutting process. it will also assist in the selection of the appropriate cutting fluid and the determination of the energy requirements of the process. the energy requirements of the process influence the sustainability of the process in terms of its cost effectiveness and environmental friendliness.
since the temperature is a critical factor during a machining operation, the amount of heat conducted by the tool and the work piece should be controlled to prevent tool wear, an excessive surface roughness of the work piece material, dimensional inaccuracies, oxidation and corrosion. some of the control actions to keep the cutting temperature within the permissible limit involve the use of coolants, the real time monitoring of the cutting temperature via the use of infrared video thermometers, and the optimization of cutting parameters, such as the depth of cut, cutting force, cutting speed, feed rate etc. the control actions are necessary because research has proven that a lack of an effective temperature monitoring and control can significantly contribute to temperature variations during the machining operation [14–16]. an increase in the temperature distribution during the cutting action often increases the power consumption of the cutting process, with the risk of increasing the dimensional inaccuracy and surface roughness of the work piece. it could also trigger a plastic deformation of the cutting edge of the tool, which can result in a tool wear and fracture. several methods, such as analytical and numerical methods, including the taguchi method, the response surface methodology, the artificial neural network (ann) etc., have been proposed for the prediction of the temperature and other process parameters during cutting operations [17–21]. this is because the aforementioned approaches have proven to be suitable for the modelling and prediction of important process parameters, and are also capable of reducing the number of expensive physical experimentation runs. the only challenge is that a wrong selection of models may cause variations in the analysis of the temperature distribution when the analytical or numerical results are compared with the results from the physical experimentations. over the years, the ann has been used extensively as a modelling technique to study the relationship between the input and output variables in order to make predictions. the tool finds applications in the modelling and optimization of simple as well as complex linear and non-linear systems with multi-dimensional relationships [22–25]. another important parameter considered in this study, besides the temperature, is the surface roughness of the finished work piece. the surface roughness index is used to indicate the level of surface irregularities or finish during the machining operations. it is a parameter, which determines the quality of the final product and its ability to meet the design specifications and its functional requirements [26]. the correlation between the temperature and the surface roughness is that when the cutting temperature exceeds the optimum, the surface roughness of the work piece will increase. in addition, when the cutting temperature falls below the threshold under cutting conditions, there is a likelihood for the surface roughness of the work piece to increase as well. hence the need to keep the temperature values within the optimum range. both the rsm and the ann tools have been deployed for the design of experiments (doe) as well as for an intelligent computation and prediction of the process parameters during machining operations, in order to keep the process parameters within the optimum range and enhance the overall efficiency and sustainability of the cutting process.
for instance, kant and sangwan [27] performed a predictive modelling and optimization of the machining parameters in order to minimize the surface roughness of titanium alloy using the ann and a genetic algorithm (ga). kovac et al. [28] applied fuzzy logic and regression analysis for the modelling of the surface roughness in a face milling operation, while campatelli et al. [29] employed the response surface methodology (rsm) for the optimization of the power consumption during machining operations. in addition, djavanroodi et al. [30] used the ann for the modelling of an equal channel angular pressing process, while fathallah et al. [31] carried out the mathematical modelling and optimization of the surface quality and productivity during the turning process of aisi 12l14 free-cutting steel. from these works, the modelling and optimization techniques present applicable solutions for the correlation and determination of the optimum process conditions during machining operations. furthermore, the developed models and techniques indicate a good agreement between the predicted values and the output target values, which indicates that the techniques are capable of determining the optimum range of the process parameters. the aim of this work is to demonstrate the application of the response surface methodology (rsm) and the artificial neural network (ann) for the correlation and prediction of the temperature and surface roughness during the milling operation of titanium alloy (ti6al4v). in view of this, a comparative analysis between the results obtained from both approaches (rsm and ann) was carried out in order to determine their suitability for a predictive purpose.

2. materials and methods

the chemical composition as well as the mechanical and thermal properties of the titanium alloy (ti-6al-4v) employed in this study are presented in tables 1 and 2, respectively. equation 1 expresses the average surface roughness, which gives an indication of the height variation of the surface of the material:

$$r_a = \frac{1}{l} \int_0^l |z(x)| \, dx \qquad (1)$$

where z is the profile ordinate of the roughness profile and l is the evaluation length. the temperature $\theta_i$ at the chip-tool interface is expressed as equation 2:

$$\theta_i = k_1 e_c \sqrt{\frac{v_c a_1}{\lambda c_v}} \qquad (2)$$

where k1 is a constant based on the cutting tool and work piece material, ec is the specific cutting energy (joule), vc is the average cutting velocity (mm/sec), a1 is the thickness of the uncut chip (mm), λ is the thermal conductivity (w/m·k), and cv is the volume specific heat (j/k/m³). since the tool life is a function of the cutting temperature, taylor's equation for the estimation of the tool life is expressed as equation 3:

$$v t^n = c \qquad (3)$$

where v is the cutting speed (mm/min), t is the tool life expectancy (mins), and n and c are the exponent and constant depending on the cutting tool, work piece material and process parameters. the tool life can also be determined from equation 4:

$$v t^n f^a d^b = c \qquad (4)$$

where f and d are the feed and the depth of cut, respectively (mm), while a and b are exponents, which depend on the cutting tool, work piece material and process parameters.
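as a worked illustration of equations 1–4, the following minimal python sketch evaluates them numerically; the synthetic profile and the taylor constants used below are placeholder values chosen for the illustration, not values fitted in this study:

import numpy as np

def average_roughness(z):
    """eq. (1), discretised: ra as the mean of |z| over a uniformly sampled profile."""
    return float(np.mean(np.abs(z)))

def chip_tool_temperature(k1, ec, vc, a1, lam, cv):
    """eq. (2): theta_i = k1 * ec * sqrt(vc * a1 / (lambda * cv))."""
    return k1 * ec * np.sqrt(vc * a1 / (lam * cv))

def taylor_tool_life(v, c, n):
    """eq. (3) rearranged for the tool life: t = (c / v)**(1/n)."""
    return (c / v) ** (1.0 / n)

def extended_taylor_tool_life(v, f, d, c, n, a, b):
    """eq. (4) rearranged: t = (c / (v * f**a * d**b))**(1/n)."""
    return (c / (v * f**a * d**b)) ** (1.0 / n)

# a synthetic roughness profile (micrometres), just to exercise eq. (1);
# the mean of |0.8 sin| is 0.8 * 2 / pi ~ 0.51 um
z = 0.8 * np.sin(np.linspace(0.0, 20.0 * np.pi, 2000))
print(f"ra ~ {average_roughness(z):.3f} um")

# hypothetical taylor constants c = 400, n = 0.25 at v = 200 m/min -> t = 16 min
print(f"tool life ~ {taylor_tool_life(200.0, 400.0, 0.25):.1f} min")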
the physical experiments were carried out using a dmu 80 monoblock deckel maho 5-axis cnc milling machine with a maximum spindle speed of 18 000 rpm. a carbide cutting insert (rckt1204mo-pm s40t) was used for the machining operation. a solid rectangular work piece of ti6al4v was screwed to the stationary dynamometer (kistler 9257a with an 8-channel summation of the type 5001a multichannel amplifier) and mounted directly on the machine table. the milling operations were performed at different cutting parameters of feed rate, spindle speed, cutting speed and depth of cut. the cutting force data were collected with the aid of a data acquisition system (das) connected to a computer. a professional infrared video thermometer with an lcd display and camera function (mt 696), with an infrared temperature range of −50 to 1000 °c, a response time of less than 300 ms, a resolution of 0.1 °c up to over 1 000 °c and an ir basic accuracy of ±1.0 of the reading, was employed for the temperature measurement in real time. the experimental set up and the connections of the kistler dynamometer to the 5-axis cnc milling machine are shown in figure 1. the sandvik carbide insert (rckt1204mo-pm s40t) employed for the cutting and the work piece (ti6al4v) are shown in fig. 2. the specifications of the cutting tool are presented in table 3. the surface roughness of the samples of the work piece was measured using the mitutoyo sj-201 surface roughness machine (fig. 3).

2.1. the response surface methodology

the feasible combination of the process parameters was done using the response surface methodology (rsm). the choice of the rsm was based on its ability to iteratively study the cross-effects of the process parameters [33–35]. the numerical experimentation comprises a four-factor experimental design, which varied at different levels in the following ranges: feed rate (250 − 350 mm/rev), spindle speed (1000 − 3000 rpm), cutting speed (100 − 300 m/min) and depth of cut (0.3 − 0.9 mm). the feasible combinations of these process parameters produced 41 experimental trials, whose responses (surface roughness and temperature) were determined via the physical experimentations. the statistical analysis of both the numerical and physical experimentations was used to obtain two mathematical models for correlating and predicting the magnitude of the surface roughness and the temperature, respectively, as a function of the independent process parameters.

2.2. the ann approach

an artificial neural network (ann) with sigmoid hidden neurons and linear output neurons fits the prediction problem, given consistent data and enough neurons in its hidden layer (figure 4). the network was trained with the levenberg-marquardt backpropagation algorithm in a matlab 2018b environment. the choice of the algorithm was based on the fact that it is highly efficient for correlative and training purposes and typically requires more memory, but less time [36]. the training automatically stops when the generalization stops improving, as indicated by an increase in the mean square error of the validation samples.

element | al | fe | o | ti | v
percent weight (wt.%) | 6 | 0.25 | 0.2 | 90 | 4

table 1. chemical composition of titanium alloy (ti-6al-4v) [32].

mechanical properties:
1. density | 4 500 kg/m³
2. brinell hardness | 334
3. yield strength | 880 mpa
4. ultimate tensile strength | 950 mpa
5. bulk modulus | 150 gpa
6. modulus of elasticity | 113.8 gpa
7. poisson's ratio | 0.342
8. shear modulus | 44 gpa
9. shear strength | 550 mpa

thermal properties:
1. specific heat capacity | 0.5263 j/g·°c
2. thermal conductivity | 6.7 w/m·k
3. melting point | 1 660 °c
4. coefficient of thermal expansion | 8.70 · 10⁻⁶ k⁻¹

table 2. mechanical and thermal properties of titanium alloy (ti-6al-4v) [32].
figure 1. experimental set up.

figure 2. the carbide cutting tool and the work piece (ti6al4v).

symbol | parameter | value
td | inscribed circle diameter (mm) | 3.987
dc | depth of cut (mm) | 1.760
α | nose radius (mm) | 6.000
t | insert thickness (mm) | 4.750
ϕ | lead angle (deg.) | 0°

table 3. the cutting tool geometry.

figure 3. mitutoyo sj-201 surface roughness machine.

the architecture of the neural network comprises an input layer with three neurons, a hidden layer with 10 sigmoid neurons and an output layer with 3 linear neurons. the total number of input vectors is the same as the number of the experimental trials carried out (41), and the same holds for the outputs. an ann performs better with a large data set; however, the data set employed in this study is limited to 41 samples because of cost considerations, since this is a preliminary analysis. for a detailed analysis, it is necessary to increase the number of samples and data sets in order to enhance the performance of the ann. the trainlm network training function, which updates the weights and biases according to the levenberg-marquardt optimization, was employed. this is because it is often the fastest backpropagation algorithm in the neural network toolbox, and it is highly suitable for supervised learning, although it requires more memory than other algorithms [36]. from the physical experimentations, the feed rate, the spindle speed and the cutting velocity were used as the input parameters, while the depth of cut, the surface roughness and the temperature served as the output targets. the grouping of the inputs and the output targets is as follows:

inputs = p = [300 300 350 300 250 250 300 300 350 250 250 300 325 350 300 300 350 275 350 325 350 325 325 325 325 325 350 300 350 325 300 350 300 325 300 325 300 300 325 300 375;
250 100 300 100 200 200 300 300 300 200 250 300 200 250 300 200 300 100 300 200 100 200 100 200 200 200 200 200 300 300 300 100 200 100 100 100 100 200 100 200 300;
0.6 0.3 0.3 0.6 0.6 0.9 0.6 0.9 0.6 0.3 0.3 0.6 0.3 0.3 0.3 0.9 0.6 0.9 0.6 0.6 0.6 0.6 0.6 0.9 0.3 0.3 0.6 0.6 0.9 0.9 0.3 0.3 0.3 0.9 0.6 0.9 0.3 0.6 0.9 0.6 0.3];

targets = t = [3000 1000 3000 1000 2000 2000 3000 1000 3000 2000 1000 2000 2000 1000 3000 1000 2000 1000 2000 3000 2000 2000 2000 2000 2000 1000 3000 3000 3000 2000 2000 3000 3000 3000 1000 1000 2000 1000 1000 2000 3000;
0.79 0.91 0.86 0.98 0.75 0.78 0.83 0.88 0.84 0.70 0.79 0.88 0.84 0.52 0.60 0.96 0.97 0.98 0.99 0.58 0.56 0.74 0.77 0.86 0.85 0.79 0.88 0.46 0.66 0.80 0.87 0.76 0.95 0.79 0.98 0.78 0.80 0.87 0.58 0.77 0.54;
176 160 85 165 124 140 170 156 145 157 95 93 87 88 60 100 120 180 165 158 177 186 168 175 145 123 127 180 87 78 98 96 92 102 105 125 136 140 155 160 156];

the input to a neuron in the hidden layer (hn) is formed from the weighted sum of the inputs and a bias, expressed as equation 5:

$$h_n = \sum (w i) + b \qquad (5)$$

where w is the weight of the input parameters, i is the number of input parameters and b is the bias. for the three input data parameters employed in this study (the feed rate, the spindle speed and the cutting velocity), the number of neurons in the hidden layer was obtained as ten, and the corresponding number of output-layer neurons and expected output parameters was three (the depth of cut, the surface roughness and the temperature), as shown in figure 4.

figure 4. the architecture of the neural network.
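to make the 3-10-3 architecture of figure 4 concrete, the following is a minimal numpy sketch of a single forward pass through such a network; the weights below are random placeholders (in the study they are fitted with trainlm), and real inputs would first be normalised:

import numpy as np

rng = np.random.default_rng(0)

# weights and biases of the 3-10-3 network; placeholders for illustration
w_hidden = rng.normal(size=(10, 3))
b_hidden = rng.normal(size=10)
w_out = rng.normal(size=(3, 10))
b_out = rng.normal(size=3)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(x):
    # each hidden neuron forms the weighted sum of the inputs plus a bias
    # (cf. eq. 5) and squashes it with a sigmoid; the output layer is linear
    h = sigmoid(w_hidden @ x + b_hidden)
    return w_out @ h + b_out

# one input vector (feed rate, spindle speed, cutting velocity), scaled to
# order one as would be done before training; the values are illustrative
x = np.array([0.5, -0.5, 0.0])
print(forward(x))  # three outputs: depth of cut, surface roughness, temperature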
the trained neural network then produced two mathematical models for correlating the network outputs while predicting the magnitudes of the surface roughness and the temperature.

3. results and discussion

3.1. results from the response surface methodology

the statistical analysis of the developed model for predicting the surface roughness as a function of the independent process parameters (feed rate, spindle speed, cutting velocity and depth of cut), as well as its analysis of variance (anova), are presented in tables 4 and 5, respectively. the “p-value prob > f” was less than 0.050 for the overall model term (0.0299 < 0.05). this indicates that the overall model term is statistically significant. the statistical significance of the overall model term implies that a certain model term is a critical factor (specifically the model term bd) that could influence the magnitude of the measured response (surface roughness).

statistical parameters | sum of squares | df | mean square | f value | p-value prob > f | remarks
model | 0.23 | 5 | 0.046 | 3.02 | 0.0299 | significant
b-spindle speed | 0.034 | 1 | 0.034 | 2.21 | 0.502 |
c-cutting velocity | 4.817e-3 | 1 | 4.817e-3 | 0.32 | 0.5796 |
d-depth of cut | 0.027 | 1 | 0.027 | 1.75 | 0.1989 |
bc | 0.053 | 1 | 0.053 | 3.46 | 0.0751 |
bd | 0.11 | 1 | 0.11 | 7.35 | 0.0122 |
residual | 0.37 | 24 | 0.015 | | |
lack of fit | 0.26 | 19 | 0.014 | 0.63 | 0.7891 | not significant
pure error | 0.11 | 5 | 0.022 | | |
corr total | 0.60 | 29 | | | |

table 4. the statistical analysis of the developed model (surface roughness).

parameter | value | remarks
r-squared | 0.9675 | significant
adj r-squared | 0.9879 | significant
predicted r-squared | 0.9609 | significant
adeq. precision | 6.8300 | significant

table 5. the analysis of variance (anova) for the developed model (surface roughness).

in this instance, the factors (statistical parameters) are: b (spindle speed), c (cutting velocity), d (depth of cut), bc (the cross effect of the spindle speed and the cutting velocity) and bd (the cross effect of the spindle speed and the depth of cut). the significant model term was bd (0.0122 < 0.05). the “lack of fit” value of 0.63 implies that the lack of fit is not statistically significant relative to the pure error: there is a 78.91 % chance that a “lack of fit f-value” this large could occur due to noise. the non-significance of the “lack of fit” value implies that the model is good for a predictive purpose. the values of the adjusted r-squared (0.9879) and the predicted r-squared (0.9609) were in a reasonable agreement with the r-squared (0.9675), and all were close to 1, thus indicating that the model is suitable for correlative and predictive purposes. in addition, the adeq. precision, which is a measure of the signal to noise ratio, has a value greater than 4, which is desirable. this implies that the model is suitable for navigating the design space. the results obtained from both the numerical and physical experimentations were statistically analysed using the rsm to obtain a predictive model, which correlates the surface roughness as a function of the significant independent process parameters (equation 6):

$$\text{surface roughness} = 0.77 - 0.038\,b - 0.014\,c + 0.033\,d - 0.058\,bc + 0.084\,bd \qquad (6)$$

where b is the spindle speed (rpm), c is the cutting velocity (m/min) and d is the depth of cut (mm). equation 6 is a first order 2fi model equation and it was found to be adequate for the predictive and correlative purposes relating to the surface roughness of the samples. the non-significance of the “lack of fit” value implies that the model equation is good for a predictive purpose. if the “lack of fit” were significant, other model equations, such as quadratic or cubic model equations, could be considered.
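equation 6 can be evaluated directly once the factors are expressed in coded units. the following minimal python sketch assumes the usual rsm coding of each factor onto −1 to +1 over the ranges of section 2.1; this coding convention is an assumption of the illustration, not stated explicitly in the text:

def surface_roughness_coded(b, c, d):
    """eq. (6): b = spindle speed, c = cutting velocity, d = depth of cut,
    assumed here to be coded rsm factors (-1 at the low end of the range,
    +1 at the high end)."""
    return 0.77 - 0.038*b - 0.014*c + 0.033*d - 0.058*b*c + 0.084*b*d

def code(value, low, high):
    """map a natural factor value onto the coded -1..+1 interval."""
    return 2.0 * (value - low) / (high - low) - 1.0

# at the centre of the design (2000 rpm, 200 m/min, 0.6 mm) all coded
# factors vanish, so the prediction reduces to the intercept, 0.77 um
ra = surface_roughness_coded(code(2000, 1000, 3000),
                             code(200, 100, 300),
                             code(0.6, 0.3, 0.9))
print(f"predicted ra ~ {ra:.2f} um")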
figure 5 is the normal plot of the residuals for the developed model for the surface roughness. this plot shows the degree to which the data set is normally distributed. the closeness of the data to the average (diagonal) line indicates that the residuals are approximately linear (normally distributed), although with an inherent randomness left over within the error portion. the departures of the data points from the average line were marginal and found to be within the permissible range of ±10 % in relation to the average line. the approximately linear pattern obtained is an indication of a normally distributed data set and of the development of an accurate model, which can be used for predictive and correlative purposes. in addition, the data set does not assume a non-linear pattern and there was no outlier (a data point that falls significantly outside the average line) in the plot, thus indicating that the residual terms are normally distributed.

figure 5. the normal plot of residuals (surface roughness).

the statistical analysis of the developed model for predicting the cutting temperature as a function of the independent process parameters (feed rate, spindle speed, cutting velocity and depth of cut), as well as its analysis of variance (anova), are presented in tables 6 and 7, respectively. the model “f-value” of 2.11 implies that the model is statistically significant; there is only a 9.88 % chance that a model “f-value” this large could occur due to noise. in addition, the value of the “p-value prob > f” of the overall model term was 0.0001. the fact that the value of the “p-value prob > f” was less than 0.050 indicates that the model terms are statistically significant. the statistical significance of the overall model term means that a certain model term is a critical factor (specifically the model term ad) that could influence the magnitude of the measured response (temperature).

statistical parameters | sum of squares | df | mean square | f value | p-value prob > f | remarks
model | 11882.92 | 5 | 2376.58 | 2.11 | 0.0001 | significant
a-feed rate | 198.38 | 1 | 198.38 | 0.18 | 0.6783 |
b-spindle speed | 18.38 | 1 | 18.38 | 0.016 | 0.8994 |
d-depth of cut | 10001.04 | 1 | 10001.04 | 0.89 | 0.3550 |
ab | 3052.56 | 1 | 3052.56 | 2.71 | 0.1126 |
ad | 7612.56 | 1 | 7612.56 | 6.77 | 0.0157 |
residual | 27004.95 | 24 | 1125.21 | | |
lack of fit | 18574.95 | 19 | 977.64 | 0.58 | 0.8226 | not significant
pure error | 8430.00 | 5 | 1686.00 | | |
corr total | 38887.87 | 29 | | | |

table 6. the statistical analysis of the developed model (temperature).
the values of the adjusted r-square (0.9008) and predicted r square (0.9100) are in a reasonable agreement with the r squared (0.9105), and were all close to 1, thus, indicating that the model is suitable for correlative predictive purposes. in addition, the adeq. precision is a measure of the signal to noise ratio and a value parameter value remarks r-squared 0.9008 significant adj r squared 0.9100 significant predicted r-squared 0.9105 significant adeq. precision 5.494 significant table 7. the analysis of variance (anova) the developed model. greater than 4 is usually desirable hence a value of 5.494 obtained for the adeq. precision implies that the model is suitable for navigating the design space. the results obtained from both the numerical and physical experimentations were statistically analysed using the rsm to obtain a predictive model, which correlates the temperature as a function of the significant independent process parameters, namely the feed rate and depth of cut (equation 7). temperature = +128.73 + 2.87a− 0.88b− 375 i. daniyan, i. tlhabadira, k. mpofu, a. adeodu acta polytechnica figure 6. the normal plot of residuals (temperature). − 6.46d − 13.81ab + 21.81ad (7) where: a is the feed rate (mm) b is the spindle speed (rpm) and d is the depth of cut (mm). equation 7 is a first order 2fi model equation and was found to be adequate for the predictive and correlative purposes relating to the machining temperature. the non-significance of the “lack of fit” value implies that the model equation is good for a predictive purpose. if the “lack of fit” is significant, other model equations, such as the quadratic or a cubic model equations, can be considered. figure 6 shows the normal plot of the residuals for the developed model of temperature prediction. similar to figure 5, the closeness of the data to the average (diagonal) line indicates that the residuals are approximately linear (normally distributed) although with an inherent randomness left over within the error portion. the departure of data points from the average line were marginal and found to be between the permissible range of ±10 % in relation to the average line. the approximately linear pattern obtained is an indication of a normally distributed data set and the development of an accurate model, which can be used for predictive and correlative purposes. in addition, the data set does not assume a non-linear pattern and there was no outliner (a data point that significantly falls outside the average line) in the plot, thus, indicating that the residual terms are normally distributed. the depth of cut, which measures the distance penetrated by the cutter in the work piece is a function of the rate of material removal while the cutting speed is a measure of the relative velocity between the cutting tool and the work piece during the cutting operation. the spindle speed is the rotational frequency of the machine’s spindle, which measures the number of revolutions of the spindle per minute while the feed rate measures the velocity at which the cutter is fed past the work piece per revolution [37]. figures 7 and 8 are the contour and 3d plots, which show the effect of the spindle speed and cutting velocity on the surface roughness of the material. an increase in the magnitude of the cutting velocity was observed to result in an increase in the magnitude of the surface roughness. 
this can be due to the fact that an increase in the relative velocity between the cutting tool and the work piece can result in an increase in the temperature distribution in the tool and the work piece material due to an increase in frictional activities, thus causing an increase in the magnitude of the surface roughness. this challenge can be mitigated via the application of an effective cooling strategy to reduce the frictional activities and the heat generation at the cutting interfaces. with the application of a cutting fluid, machining at a high speed can be beneficial in the sense that it will promote the time and cost effectiveness of the machining operation and also increase the rate of material removal, while bringing the product to the required surface finish condition.

figure 7. the contour plot of the spindle speed and cutting velocity.
figure 8. the interactive 3d plot of the spindle speed and cutting velocity.

figures 9 and 10 are the contour and the 3d interactive plots, which show the effect of the spindle speed and the depth of cut on the surface roughness of the work piece, respectively. the relationship between the spindle speed and the depth of cut was found, from the plots, to be inversely proportional. an increase in the depth of cut was observed to produce an increase in the magnitude of the surface roughness and vice versa. on the contrary, the magnitude of the surface roughness was observed to decrease with an increase in the spindle speed and vice versa. this may be due to the fact that a decrease in the depth of cut may result in an increase in the surface area to volume ratio of the work piece, thereby promoting frictional activities at the tool-work piece interface and, subsequently, an increase in the surface roughness of the material [26]. furthermore, each revolution of the machine's spindle represents a smaller circumferential distance; therefore, an increase in the spindle speed may cause the circumferential distance to decrease. as the tool approaches the work piece for the material removal, the spindle speed may cause the engaged work piece surface to decrease, thereby decreasing the frictional activities and the possibility of the development of a built-up edge at the tool-work piece interface. hence, this phenomenon promotes a significant reduction in the magnitude of the surface roughness. this agrees significantly with the findings of kumar et al. [38].

figure 9. the contour plot of the spindle speed and depth of cut.
figure 10. the interactive 3d plot of the spindle speed and depth of cut.

figures 11 and 12 are the contour and the 3d interactive plots, which show the effect of the feed rate and the spindle speed on the cutting temperature during the machining operation. the relationship between the feed rate and the spindle speed was found to be directly proportional. an increase in the magnitude of the feed rate was observed to result in an increase in the magnitude of the temperature and vice versa. in addition, an increase in the magnitude of the spindle speed was also observed to result in an increase in the magnitude of the cutting temperature and vice versa. as the velocity at which the cutting tool is fed past the work piece (the feed rate) increases, the energy requirement of the cutting process may increase, with an increase in the magnitude of the cutting temperature. this is in accordance with the law of energy conservation, as a part of the energy input is converted to heat energy, which subsequently promotes an increase in the cutting temperature. furthermore, as the magnitude of the spindle speed increases, the energy requirement of the process also increases, with a tendency for a temperature build-up if an effective cooling strategy is not put in place.

figure 11. the contour plot of the feed rate and spindle speed.
figure 12. the interactive 3d plot of the feed rate and spindle speed.
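the shape of the surfaces in figures 11 and 12 follows directly from the fitted model itself. a minimal matplotlib sketch, again under the assumed −1 to +1 coding of the factors, plots eq. (7) over the feed rate/spindle speed plane with the depth of cut held at its centre value:

import numpy as np
import matplotlib.pyplot as plt

a = np.linspace(-1, 1, 101)   # coded feed rate (250-350 mm/rev)
b = np.linspace(-1, 1, 101)   # coded spindle speed (1000-3000 rpm)
aa, bb = np.meshgrid(a, b)

# eq. (7) with the depth of cut held at its centre value (d = 0)
temp = 128.73 + 2.87*aa - 0.88*bb - 13.81*aa*bb
print(temp[50, 50])           # 128.73 degrees c at the centre of the design

cs = plt.contourf(aa, bb, temp, levels=20)
plt.colorbar(cs, label="predicted temperature [deg c]")
plt.xlabel("feed rate (coded)")
plt.ylabel("spindle speed (coded)")
plt.title("cross effect of feed rate and spindle speed (cf. fig. 11)")
plt.show()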
figures 13 and 14 are the contour and the 3d interactive plots, which show the effect of the feed rate and the depth of cut on the cutting temperature during the machining operation. the relationship between the depth of cut and the feed rate, in this case, was found to be directly proportional. an increase in the magnitude of the depth of cut and of the feed rate was found to result in a significant reduction in the magnitude of the cutting temperature and vice versa. this may be attributed to the fact that an increase in the depth of cut may result in a decrease in the surface area of the work piece, which promotes the quick heat dissipating capacity of the applied coolants at the tool-work piece interface due to the reduced surface. the effectiveness of the coolants within a small area of the work piece may account for a significant reduction in the magnitude of the cutting temperature at the shear zone. the findings indicate that the process parameters and machining conditions are key parameters, which influence the rate of machinability, the degree of surface finish as well as the process economics and sustainability, hence the need for an effective process design and control [39, 40].

figure 13. the contour plot of the feed rate and depth of cut.
figure 14. the interactive 3d plot of the feed rate and depth of cut.

3.2. results from the artificial neural network (ann)

figure 15 presents the performance-training plot, which consists of the training and the iteration lines. the plot indicates the iteration at which the performance goal (target) was met. the convergence of both the training and the iteration lines at the 98th iteration indicates that the developed network is adequate for the predictive purpose, as indicated by the training line, which cuts through the vertical line, and by the negligible value of the mean square error (10⁻⁴). the figure shows the plots of the three major steps of the modelling process using the neural network: the training, validation and testing plots. the validation plot is used to estimate the adequacy of the network for the predictive purpose while tuning the weights and biases during the neural network development. the training plot measures the performance and accuracy of the developed network during the prediction. the epoch indicates the number of iterations carried out before the network was suitable for correlative and predictive purposes. the similarity in the pattern of the validation and the test plots indicates that there is no overfitting of the data set, which could affect the suitability of the network for a predictive purpose. the best validation performance was obtained after the 92nd iteration, after which six more iterations were carried out before the training automatically stopped. hence, the minimum number of iterations that produced the best validation performance was 92.

figure 15. performance training graph.
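the stopping behaviour just described (best network at iteration 92, training halted at iteration 98) is the usual validation-based early stopping. a minimal python sketch of this logic is given below; the limit of six consecutive validation failures is inferred from the 92 to 98 behaviour reported here and matches the default of the matlab neural network toolbox:

def train_with_validation_stop(epochs, val_error, max_fail=6):
    """remember the epoch with the lowest validation error and stop after
    max_fail consecutive epochs without an improvement."""
    best_epoch, best_err, fails = 0, float("inf"), 0
    for epoch in range(1, epochs + 1):
        err = val_error(epoch)        # validation mse after this epoch
        if err < best_err:
            best_epoch, best_err, fails = epoch, err, 0
        else:
            fails += 1
            if fails >= max_fail:
                break
    return best_epoch, epoch

# toy validation curve: improves until epoch 92, then starts to overfit
best, last = train_with_validation_stop(200, lambda e: abs(e - 92) * 1e-4 + 1e-4)
print(best, last)                     # prints: 92 98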
table 8 presents the results obtained from the feasible combinations of the process parameters using the response surface methodology (rsm), as well as the values of the surface roughness and the actual temperature obtained via the physical experiments. these serve as the input and output parameters of the developed neural network. figure 16 is a plot of the gradient, the training gain mu and the validation checks after the network has been adequately trained. the gradient was 4.2011 · 10⁻⁴, while the training gain mu was 1.00 · 10⁻⁷ at the 98th iteration. the training stopped at the 98th iteration due to the increase in the error as the data sets began to overfit. thus, the best validation performance, which indicates neither a significant error nor a validation check failure, was observed at the 92nd iteration. beyond the 98th iteration, the validation checks carried out begin to fail, as indicated by the figure. the small values of the gradient (4.2011 · 10⁻⁴) and the training gain (1.00 · 10⁻⁷) indicate that the difference between the output and the target is negligible. the error first reduces up to the 98th epoch of training, but may start to increase after the 98th iteration due to the network overfitting the training data.

figure 16. the plot of the gradient, training gain mu and validation check.

experimental trial | feed rate (mm/rev) | spindle speed (rpm) | cutting velocity (m/min) | depth of cut (mm) | average surface roughness (µm) | actual temp. (°c)
1. | 300 | 3000 | 250 | 0.60 | 0.79 | 176
2. | 300 | 1000 | 100 | 0.30 | 0.91 | 160
3. | 350 | 3000 | 300 | 0.30 | 0.86 | 85
4. | 300 | 1000 | 100 | 0.60 | 0.98 | 165
5. | 250 | 2000 | 200 | 0.60 | 0.75 | 124
6. | 250 | 2000 | 200 | 0.90 | 0.78 | 140
7. | 300 | 3000 | 300 | 0.60 | 0.83 | 170
8. | 300 | 1000 | 300 | 0.90 | 0.88 | 156
9. | 350 | 3000 | 300 | 0.60 | 0.84 | 145
10. | 250 | 2000 | 200 | 0.30 | 0.70 | 157
11. | 250 | 1000 | 250 | 0.30 | 0.79 | 95
12. | 300 | 2000 | 300 | 0.60 | 0.88 | 93
13. | 325 | 2000 | 200 | 0.30 | 0.84 | 87
14. | 350 | 1000 | 300 | 0.30 | 0.52 | 88
15. | 300 | 3000 | 100 | 0.30 | 0.60 | 60
16. | 350 | 1000 | 300 | 0.90 | 0.96 | 100
17. | 275 | 2000 | 200 | 0.60 | 0.67 | 120
18. | 350 | 1000 | 100 | 0.90 | 0.98 | 180
19. | 325 | 2000 | 200 | 0.60 | 0.99 | 165
20. | 350 | 3000 | 100 | 0.60 | 0.58 | 158
21. | 325 | 2000 | 200 | 0.60 | 0.56 | 177
22. | 325 | 2000 | 200 | 0.60 | 0.74 | 186
23. | 325 | 2000 | 200 | 0.60 | 0.77 | 168
24. | 325 | 2000 | 200 | 0.90 | 0.86 | 175
25. | 325 | 1000 | 200 | 0.30 | 0.85 | 145
26. | 350 | 3000 | 300 | 0.30 | 0.79 | 123
27. | 300 | 3000 | 300 | 0.60 | 0.88 | 127
28. | 350 | 3000 | 300 | 0.60 | 0.46 | 180
29. | 325 | 2000 | 100 | 0.90 | 0.66 | 87
30. | 300 | 3000 | 200 | 0.90 | 0.80 | 78
31. | 350 | 3000 | 100 | 0.30 | 0.87 | 98
32. | 300 | 3000 | 100 | 0.30 | 0.76 | 96
33. | 350 | 1000 | 100 | 0.30 | 0.95 | 92
34. | 325 | 1000 | 100 | 0.90 | 0.79 | 102
35. | 300 | 2000 | 200 | 0.60 | 0.98 | 105
36. | 325 | 1000 | 100 | 0.90 | 0.78 | 125
37. | 300 | 2000 | 200 | 0.30 | 0.80 | 136
38. | 300 | 1000 | 300 | 0.60 | 0.87 | 140
39. | 325 | 1000 | 300 | 0.90 | 0.58 | 155
40. | 300 | 2000 | 200 | 0.30 | 0.77 | 160
41. | 375 | 3000 | 300 | 0.60 | 0.54 | 156

table 8. the results obtained from both the physical and numerical experiments.
figure 17 shows the statistical validation of the developed network. the errors, obtained by finding the difference between the targets and the outputs from the network, are represented by a histogram. as shown by the figure, the values of the error were insignificant. the bins are the vertical bars on the plot, and each bar represents the number of samples from the data set. in this case, the total number of the bins was 20. the value of the error, which is the difference between the target and the output of the network, ranges from a minimum value of −4.73 · 10⁻³ to a maximum value of 2.224 · 10⁻². this range was divided into 20 smaller bins, and the bin width was calculated as follows:

$$\text{error} = \frac{2.224 \cdot 10^{-2} - (-4.73 \cdot 10^{-3})}{20} = 1.3485 \cdot 10^{-3}$$

the width of each bin therefore corresponds to 1.3485 · 10⁻³. furthermore, the error at the left hand side of the plot was −4.70 · 10⁻⁴ when the vertical height of the bin for the data validation was 35. this implies that 35 samples from the validation data set have errors that fall within this range. the error range is, therefore, calculated as follows:

$$\text{error range} = \left( \frac{-4.70 \cdot 10^{-4} - 1.3485 \cdot 10^{-3}}{2};\ \frac{-4.70 \cdot 10^{-4} + 1.3485 \cdot 10^{-3}}{2} \right) = \left( -9.1075 \cdot 10^{-4};\ 4.3925 \cdot 10^{-4} \right)$$

this range of error corresponds to the bin at −4.70 · 10⁻⁴ at the left hand side of the graph. similarly, the errors for the other bins can be analysed likewise. the range of the error is very small and negligible, thus indicating a high degree of agreement between the targets and the network outputs. this also signifies a high probability that the predictions from the network will be accurate.

figure 17. the error histogram plot.
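the histogram arithmetic above is easy to reproduce. a minimal python sketch, using only the extreme error values, the bin count and the reported bin position quoted in the text:

# error-histogram arithmetic: only e_min, e_max, the bin count and the
# bin position at -4.70e-4 are taken from the text
e_min, e_max, bins = -4.73e-3, 2.224e-2, 20

bin_width = (e_max - e_min) / bins
print(f"bin width = {bin_width:.4e}")       # 1.3485e-03, as stated

# the error range of the bin reported at -4.70e-4, following the
# formula given in the text
centre = -4.70e-4
lo = (centre - bin_width) / 2
hi = (centre + bin_width) / 2
print(f"error range ~ ({lo:.4e}; {hi:.4e})")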
as shown in figure 18, the developed network was validated by creating regression plots to show the relationship between the outputs of the network and the targets. in other words, the regression plots show the degree of agreement between the values obtained from the physical and numerical experimentations. the mean squared error (mse) is the average squared difference between the outputs and the targets. lower values signify a small and negligible error; hence, the lower the value of the mse, the better the developed network, and vice versa. zero means no error. the regression coefficient r is a measure of the correlation between the outputs and the targets. an r-value of 1 means a close relationship, while zero indicates a random relationship. the correlation coefficient r of the first regression plot, for the training, was 0.95119, which indicates that there is a near exact linear relationship between the outputs and the targets; hence, the agreement between the values of the physical and numerical experimentations can be declared highly significant. the second regression plot is the validation plot, which is used to examine the suitability of the developed model for predictive purposes. the correlation coefficient r obtained was 0.81089. this is close to 1, which also indicates that the network is capable of performing the predictive function accurately, with minimal deviations from the target. the reduction in the value of the correlation coefficient stems from the fact that the size of the data sample was relatively small. the larger the size of the data sets, the more efficient the predictive ability of the network, and vice versa. the closer the value of the correlation coefficient r is to 1, the better the network, which indicates a minimal difference between the target and the network output. the value of the correlation coefficient r can be made closer to 1 by increasing the size of the data set while adjusting the weights and biases of the network architecture. the third regression plot is the test plot, which is a direct function of the performance of the network during the prediction. this has a correlation coefficient of 0.95119. the fourth regression plot represents the overall performance of the network architecture. the correlation coefficient obtained for this was 0.92373. this is relatively close to 1, which indicates the suitability of the developed model for predicting the temperature and surface roughness during the milling operation of titanium alloy (ti6al4v).

figure 18. the regression plots.

the model equations for the prediction of the magnitude of the surface roughness as well as of the temperature from the ann models are expressed as equations 8 and 9, respectively:

$$y = 0.38\,t + 0.49 \qquad (8)$$

$$y = 0.26\,t + 99 \qquad (9)$$

where y is the linear fit of the network output and t is the target variable. table 9 compares the actual values of the temperature and surface roughness from the physical experimentations with the predicted values obtained with the aid of the ann. the high degree of agreement between the values of the temperature and surface roughness from the physical experimentations and the predicted values using the ann indicates that the developed network is highly suitable for the predictive purpose. figures 19 and 20 show the actual and predicted values of the temperature and surface roughness for the milling operation of ti6al4v, respectively, using the rsm and the ann. the similarity in the data sets and in the pattern of the plots indicates the closeness of the predicted values to the actual values. this also indicates the accuracy and the effectiveness of the rsm and ann for correlative and predictive purposes. however, the ann demonstrated a better predictive ability than the rsm, as observed by the magnitude of its predictions being closer to the actual values for both the temperature and the surface roughness, as opposed to the rsm; the errors generated by both approaches were, though, negligible and found to be within the permissible limit (table 9).

experimental trial | actual ra (µm) | actual temp. (°c) | predicted ra, rsm (µm) | predicted ra, ann (µm) | predicted temp., rsm (°c) | predicted temp., ann (°c)
1. | 0.79 | 176 | 0.7867 | 0.7901 | 177.345 | 176.011
2. | 0.91 | 160 | 0.8765 | 0.9101 | 161.654 | 160.505
3. | 0.86 | 85 | 0.9786 | 0.8599 | 84.235 | 85.045
4. | 0.98 | 165 | 0.9999 | 0.9802 | 164.234 | 165.234
5. | 0.75 | 124 | 0.7768 | 0.7501 | 126.786 | 123.999
6. | 0.78 | 140 | 0.7576 | 0.7802 | 142.348 | 139.999
7. | 0.83 | 170 | 0.8478 | 0.8300 | 168.345 | 170.001
8. | 0.88 | 156 | 0.8945 | 0.8800 | 157.097 | 156.201
9. | 0.84 | 145 | 0.8678 | 0.8465 | 145.906 | 145.332
10. | 0.70 | 157 | 0.7234 | 0.7000 | 154.55 | 157.001
11. | 0.79 | 95 | 0.7809 | 0.7930 | 96.546 | 94.999
12. | 0.88 | 93 | 0.8903 | 0.8801 | 93.997 | 92.999
13. | 0.84 | 87 | 0.8568 | 0.8400 | 86.435 | 86.999
14. | 0.52 | 88 | 0.5457 | 0.5200 | 88.094 | 88.001
15. | 0.60 | 60 | 0.6216 | 0.6002 | 60.987 | 60.002
16. | 0.96 | 100 | 0.9778 | 0.9600 | 100.778 | 100.002
17. | 0.67 | 120 | 0.6670 | 0.6701 | 121.983 | 120.001
18. | 0.98 | 180 | 0.9902 | 0.9980 | 180.995 | 179.001
19. | 0.99 | 165 | 0.9763 | 0.9900 | 165.679 | 164.999
20. | 0.58 | 158 | 0.5567 | 0.5867 | 156.547 | 157.999
21. | 0.56 | 177 | 0.5347 | 0.5600 | 177.985 | 176.999
22. | 0.74 | 186 | 0.7560 | 0.7400 | 185.856 | 186.001
23. | 0.77 | 168 | 0.7479 | 0.7765 | 169.753 | 168.002
24 | 0.86 | 175 | 0.8906 | 0.8679 | 176.779 | 175.002
25 | 0.85 | 145 | 0.8237 | 0.8599 | 145.908 | 145.001
26 | 0.79 | 123 | 0.7779 | 0.7923 | 122.764 | 122.999
27 | 0.88 | 127 | 0.8673 | 0.8600 | 125.667 | 126.999
28 | 0.46 | 180 | 0.4237 | 0.4500 | 181.458 | 179.999
29 | 0.66 | 87 | 0.6679 | 0.6607 | 87.975 | 86.999
30 | 0.80 | 78 | 0.8963 | 0.8325 | 78.875 | 78.001
31 | 0.87 | 98 | 0.8346 | 0.8678 | 96.775 | 98.001
32 | 0.76 | 96 | 0.78429 | 0.7700 | 95.789 | 96.002
33 | 0.95 | 92 | 0.9974 | 0.9450 | 93.446 | 92.002
34 | 0.79 | 102 | 0.7642 | 0.7900 | 102.998 | 101.999
35 | 0.98 | 105 | 0.9458 | 0.9804 | 106.765 | 105.999
36 | 0.78 | 125 | 0.7648 | 0.7867 | 123.457 | 125.999
37 | 0.80 | 136 | 0.7998 | 0.8002 | 137.896 | 136.222
38 | 0.87 | 140 | 0.7998 | 0.87001 | 142.556 | 140.001
39 | 0.58 | 155 | 0.5679 | 0.58001 | 156.798 | 155.001
40 | 0.77 | 160 | 0.7942 | 0.77001 | 163.564 | 160.011
41 | 0.54 | 156 | 0.5679 | 0.54001 | 156.789 | 156.011
table 9. the results obtained from both the physical and numerical experiments.
figure 19. actual and predicted values for temperature.
figure 20. actual and predicted values for surface roughness.
4. conclusion the prediction of the temperature and surface roughness during the milling operation of ti6al4v was successfully carried out using the rsm and ann. the developed rsm and ann models were highly efficient for predictive purposes, as their correlation coefficients r were close to 1. this indicates that there is a linear relationship between the outputs and targets, thus indicating the significance of the overall model terms and an agreement between the values of the temperature and surface roughness obtained from the physical and numerical experimentations. the rsm demonstrated a good predictive ability, with the predicted values found to be within the range of the actual values. the anova of the models developed using the rsm was found to be statistically significant, which implies the suitability of the models for correlative and predictive purposes. however, the predicted values from the ann were observed to be closer to the actual experimental values, for both the temperature and the surface roughness, than the predicted values from the rsm models. nonetheless, the errors generated by both approaches were negligible and were found to be within the permissible limit. this work can find application in the production and manufacturing industries, especially for the process control, optimization and monitoring of the cutting temperature and surface roughness, in order to keep them within the permissible limit. further work can consider the use of a genetic algorithm or other analytical models, as well as comparative analyses among the identified approaches. references [1] m. b. mhamdi, m. boujelbene, e. bayraktar, a. zghal. surface integrity of titanium alloy ti-6al-4v in ball end milling. physics procedia 25:355 – 362, 2012. doi:10.1016/j.phpro.2012.03.096. [2] a. koohestani, j. mo, s. yang. stability prediction of titanium milling with data driven reconstruction of phase-space. machining science and technology 18(1):78 – 98, 2014. doi:10.1080/10910344.2014.863638. [3] p. j. arrazola, a. garay, l. m. iriarte, et al. machinability of titanium alloys (ti6al4v and ti555.3). journal of materials processing technology 209(5):2223 – 2230, 2009. doi:10.1016/j.jmatprotec.2008.06.020. [4] s. e. haghighi, h. b. lu, g. y. jian, et al. effect of α” martensite on the microstructure and mechanical properties of beta-type ti–fe–ta alloys. materials & design 76:47 – 54, 2015.
doi:10.1016/j.matdes.2015.03.028. [5] s. ehtemam-haghighi, y. liu, g. cao, l.-c. zhang. influence of nb on the β → α” martensitic phase transformation and properties of the newly designed ti–fe–nb alloys. materials science and engineering: c 60:503 – 510, 2016. doi:10.1016/j.msec.2015.11.072. [6] n. taniguchi, s. fujibayashi, m. takemoto, et al. effect of pore size on bone ingrowth into porous titanium implants fabricated by additive manufacturing: an in vivo experiment. materials science and engineering: c 59:690 – 701, 2016. doi:10.1016/j.msec.2015.10.069. [7] g. quintana, j. ciurana. chatter in machining processes: a review. international journal of machine tools and manufacture 51(5):363 – 376, 2011. doi:10.1016/j.ijmachtools.2011.01.001. [8] c. bandapalli, k. k. singh, b. m. sutaria, d. v. bhatt. experimental investigation of top burr formation in high-speed micro-end milling of titanium alloy. machining science and technology 22(6):989 – 1011, 2018. doi:10.1080/10910344.2018.1449213. [9] m. benedetti, m. cazzolli, v. fontanari, m. leoni. fatigue limit of ti6al4v alloy produced by selective laser sintering. procedia structural integrity 2:3158 – 3167, 2016. doi:10.1016/j.prostr.2016.06.394. [10] u. s. dixit, s. n. joshi, j. p. davim. incorporation of material behavior in modeling of metal forming and machining processes: a review. materials & design 32(7):3655 – 3670, 2011. doi:10.1016/j.matdes.2011.03.049. [11] f. kara, k. aslantaş, a. çiçek. prediction of cutting temperature in orthogonal machining of aisi 316l using artificial neural network. applied soft computing 38:64 – 74, 2016. doi:10.1016/j.asoc.2015.09.034. [12] c. h. lauro, l. c. brandão, s. l. ribeiro filho. monitoring the temperature of the milling process using infrared camera. scientific research and essays 8(23):1112 – 1120, 2013. doi:10.5897/sre12.579. [13] g. le coz, m. marinescu, a. devillez, et al. measuring temperature of rotating cutting tools: application to mql drilling and dry milling of aerospace alloys. applied thermal engineering 36:434 – 441, 2012. doi:10.1016/j.applthermaleng.2011.10.060. [14] j. chen, k. chandrashekhara, c. mahimkar, et al. void closure prediction in cold rolling using finite element analysis and neural network. journal of materials processing technology 211(2):245 – 255, 2011. doi:10.1016/j.jmatprotec.2010.09.016. [15] a. molinari, r. cheriguene, h. miguelez. numerical and analytical modeling of orthogonal cutting: the link between local variables and global contact characteristics. international journal of mechanical sciences 53(3):183 – 206, 2011. doi:10.1016/j.ijmecsci.2010.12.007. [16] l. m. maiyar, r. ramanujam, k. venkatesan, j. jerald. optimization of machining parameters for end milling of inconel 718 super alloy using taguchi based grey relational analysis. procedia engineering 64:1276 – 1282, 2013. doi:10.1016/j.proeng.2013.09.208. [17] a. m. ravi, s. m. murigendrappa, p. g. mukunda. experimental investigation on thermally enhanced machining of high-chrome white cast iron and to study its machinability characteristics using taguchi method and artificial neural network. the international journal of advanced manufacturing technology 72:1439 – 1454, 2014. doi:10.1007/s00170-014-5752-4. [18] t. d. b. kannan, g. r. kannan, b. s. kumar, n. baskar. application of artificial neural network modeling for machining parameters optimization in drilling operation. procedia materials science 5:2242 – 2249, 2014. doi:10.1016/j.mspro.2014.07.433. 
[19] u. natarajan, p. r. periyanan, s. h. yang. multiple-response optimization for micro-endmilling process using response surface methodology. international journal of advanced manufacturing technology 56(1):177 – 185, 2011. doi:10.1007/s00170-011-3156-2. [20] i. a. daniyan, i. tlhabadira, s. n. phokobye, et al. modelling and optimization of the cutting forces during ti6al4v milling process using the response surface methodology and dynamometer. mm science journal 2019:3353 – 3363, 2019. doi:10.17973/mmsj.2019_11_2019093. [21] r. saidi, b. ben fathallah, t. mabrouki, et al. modeling and optimization of the turning parameters of cobalt alloy (stellite 6) based on rsm and desirability function. the international journal of advanced manufacturing technology 100:2945 – 2968, 2019. doi:10.1007/s00170-018-2816-x. [22] y. xu, q. zhang, w. zhang, p. zhang. optimization of injection molding process parameters to improve the mechanical performance of polymer product against impact. the international journal of advanced manufacturing technology 76:2199 – 2208, 2014. doi:10.1007/s00170-014-6434-y. [23] s. kashyap, d. datta. process parameter optimization of plastic injection molding: a review. international journal of plastics technology 19:1 – 18, 2015. doi:10.1007/s12588-015-9115-2. [24] m. mahdavi jafari, s. soroushian, g. r. khayati. hardness optimization for al6061-mwcnt nanocomposite prepared by mechanical alloying using artificial neural networks and genetic algorithm. journal of ultrafine grained and nanostructured materials 50(1):23 – 32, 2017. doi:10.7508/jufgnsm.2017.01.04. [25] s. altarazi, m. ammouri, a. hijazi. artificial neural network modeling to evaluate polyvinylchloride composites’ properties. computational materials science 153:1 – 9, 2018. doi:10.1016/j.commatsci.2018.06.003. [26] i. tlhabadira, i. a. daniyan, r. machaka, et al. modelling and optimization of surface roughness during aisi p20 milling process using taguchi method. the international journal of advanced manufacturing technology 102:3707 – 3718, 2019. doi:10.1007/s00170-019-03452-4. [27] g. kant, k. s. sangwan. predictive modelling and optimization of machining parameters to minimize surface roughness using artificial neural network coupled with genetic algorithm. procedia cirp 31:453 – 458, 2015. doi:10.1016/j.procir.2015.03.043. [28] p. kovac, d. rodic, v. pucovsky, et al. application of fuzzy logic and regression analysis for modeling surface roughness in face milling. journal of intelligent manufacturing 24:755, 2012. doi:10.1007/s10845-012-0623-z. [29] g. campatelli, l. lorenzini, a.
scippa. optimization of process parameters using a response surface method for minimizing power consumption in the milling of carbon steel. journal of cleaner production 66:309 – 316, 2014. doi:10.1016/j.jclepro.2013.10.025. [30] f. djavanroodi, b. omranpour, m. sedighi. artificial neural network modeling of ecap process. materials and manufacturing processes 28(3):276 – 281, 2013. doi:10.1080/10426914.2012.667889. [31] b. ben fathallah, r. saidi, c. dakhli, et al. mathematical modelling and optimization of surface quality and productivity in turning process of aisi 12l14 free-cutting steel. international journal of industrial engineering computations 10(4):557 – 576, 2019. doi:10.5267/j.ijiec.2019.3.001. [32] u.s. titanium industry inc. titanium alloys ti6al4v grade 5. azom. https://www.azom.com/article.aspx?articleid=1547, 2017. accessed: 2 july 2019. [33] i. a. daniyan, i. tlhabadira, o. o. daramola, et al. measurement and optimization of cutting forces during m200 ts milling process using the response surface methodology and dynamometer. procedia cirp 88:288 – 293, 2020. doi:10.1016/j.procir.2020.05.050. [34] i. daniyan, f. fameso, f. ale, et al. modelling, simulation and experimental validation of the milling operation of titanium alloy (ti6al4v). the international journal of advanced manufacturing technology 109(7):1853 – 1866, 2020. doi:10.1007/s00170-020-05714-y. [35] v. aggarwal, s. s. khangura, r. garg. parametric modeling and optimization for wire electrical discharge machining of inconel 718 using response surface methodology. the international journal of advanced manufacturing technology 79:31 – 47, 2015. doi:10.1007/s00170-015-6797-8. [36] m. gayatri vineela, a. dave, p. kiran chaganti. artificial neural network based prediction of tensile strength of hybrid composites. materials today: proceedings 5(9, part 3):19908 – 19915, 2018. doi:10.1016/j.matpr.2018.06.356. [37] i. a. daniyan, i. tlhabadira, s. phokobye, et al. modelling and optimization of the cutting parameters for the milling operation of titanium alloy (ti6al4v). in ieee 11th international conference on mechanical and intelligent manufacturing technologies (icmimt 2020), pp. 68 – 73. 2020. doi:10.1109/icmimt49010.2020.9041193. [38] n. s. kumar, a. shetty, a. shetty, et al. effect of spindle speed and feed rate on surface roughness of carbon steels in cnc turning. procedia engineering 38:691 – 697, 2012. doi:10.1016/j.proeng.2012.06.087. [39] i. a. daniyan, i. tlhabadira, o. o. daramola, k. mpofu. design and optimization of machining parameters for effective aisi p20 removal rate during milling operation. procedia cirp 84:861 – 867, 2019. doi:10.1016/j.procir.2019.04.301. [40] s. n. phokobye, i. a. daniyan, i. tlhabadira, et al. model design and optimization of carbide milling cutter for milling operation of m200 tool steel. procedia cirp 84:954 – 959, 2019. doi:10.1016/j.procir.2019.04.300.
acta polytechnica doi:10.14311/ap.2017.57.0282
acta polytechnica 57(4):282–294, 2017
© czech technical university in prague, 2017
available online at http://ojs.cvut.cz/ojs/index.php/ap

possibility of monitoring of pahs distribution in the vertical profile at the background meteorological station křešín

gabriela strnadová a,b,∗, vlastimil hanuš c, david kahoun b, petr semerák a, jan tříška b,c
a czech technical university in prague, department of physics, faculty of civil engineering, thákurova 7, 166 29 praha 6 – dejvice, czech republic
b university of south bohemia in české budějovice, institute of chemistry and biochemistry, faculty of science, branišovská 1760, 37005 české budějovice, czech republic
c global change research institute, v.v.i., czech academy of sciences, bělidla 4a, 603 00 brno, czech republic
∗ corresponding author: gstrnadova@prf.jcu.cz

abstract. the contaminants present in the atmosphere have a substantial impact on public health. among these contaminants, the most important are polycyclic aromatic hydrocarbons (pahs). this paper is focused on the possibility of a continuous pahs monitoring and the description of their vertical distribution using filters which serve to purify the air before the determination of the greenhouse gases (co2, ch4, n2o, co, ozone) and hg in the air, continuously sampled at 8, 50 and 230 m at the atmospheric station křešín near pacov, and on the elaboration of a simple and economical method of extraction of these filters. the station serves as a point for monitoring the occurrence and remote transport of greenhouse gases and selected atmospheric pollutants and for the measurements of basic meteorological characteristics. on december 17, 2014, a sampling of 16 priority pahs started, and it lasted until december 9, 2015. samples were taken approximately once a month. the maximum concentration of σpahs was 15.905 ng/m3, measured at the height of 8 meters in the period of 11. 2. 2015–11. 3. 2015; in this period, the concentration of benzo[a]pyrene exceeded the immission limit by more than 50 %.
by the sampling, the hypothesis about a decreasing concentration of pahs with an increasing height was confirmed, especially the decrease of the heavier pahs. the sampling has shown that it is highly desirable to use the meteorological tower for the sampling of pahs using ptfe filters, either by including the active sampler itself, or by using the pre-filters of the tropospheric ozone and gaseous elemental mercury analysers. keywords: vertical profile; polycyclic aromatic hydrocarbons (pahs); air quality; extraction procedure; hplc; meteorological station křešín. 1. introduction air pollution is a current problem that causes not only climate change, but also major global health problems. the international community and a large number of states all over the world try to regulate the production, use and release of pollutants while providing as much information as possible on the health risks associated with these substances. among the substances that severely damage the environment and human society is a group of substances called polycyclic aromatic hydrocarbons (pahs). the occurrence of these substances is typical both in the environment and in the food chain. this group of substances is characterized as persistent in the environment, carcinogenic and endangering a healthy foetal development [19]. it is a large group of substances, occurring mainly in complex mixtures, whose molecules consist of two or more condensed rings. a representative of these compounds, benzo[a]pyrene, is a confirmed carcinogen [19, 45]. pahs are substances of a lipophilic nature, abundantly distributed into the body, usually bound in fat tissues (liver, kidney) and, in smaller amounts, in the adrenal glands, spleen and ovaries, where they accumulate. signs and symptoms of toxic effects may occur many years after the exposure, or in subsequent generations. pahs enter the body not only through the respiratory tract, but also through the skin and the gastrointestinal tract [25, 36]. a number of studies regarding urban pollution and its connection to the traffic load have been published [2, 5, 7, 12, 20–23, 30, 39]. atmospheric pollution was also measured indirectly by moss bags or soil [4, 38], and it indicates a greater incidence of pahs in urban air, typically concentrated near urban centres, with concentrations that correlate with transport or with the combustion of fuels. lower concentrations are found in the outdoor air in remote areas [9], or are due to long-distance transport [1, 10, 11]. unfortunately, high concentrations of pahs were also detected in residential buildings, e.g., in dormitories and offices [13, 15]. the main problem of the contamination of the environment by pahs is their mobility, which varies with the molecular weight. higher molecular weight pahs are relatively immobile due to their extremely low volatility and solubility. therefore, they are primarily adsorbed onto dust particles, which then fall with the precipitation into the soil and water. the more mobile pahs include many compounds that differ in the number and position of the aromatic rings and in the position and type of substituents. such pahs can be found in remote locations of the earth, far from their origin [34, 45]. the sorption of pahs occurs on solid particulate matter (spm) (dust, ash, soot, etc.) or on aerosol droplets [19].
the retention in the atmosphere is, of course, dependent on "favourable" climatic conditions. on the spm surface, the pahs are subjected to chemical reactions; they react with other pollutants, such as ozone, nitrogen oxides and sulphur dioxide, to form oxygen-containing compounds (hydroxy and keto pahs), nitro- and dinitro-pahs or sulfonic acids. in the air, pahs are exposed to photooxidation, due to the presence of ozone, oh radicals and nox. environmental monitoring has become an important indicator of the pollution by individual harmful substances and of the quality of life as a whole. this information can then be linked to regulatory and national legislation. 1.1. legislation data sources according to the valid legislation, the czech republic is obliged to monitor the state of the air throughout the whole state territory, to monitor emissions and to regulate the discharge of pollutants into the air. immission limits are set out in the act on air protection [40, as amended] and in the decree on the method of assessment and evaluation of the pollution level and the scope of information provided to the public about the level of pollution in smog situations [41, as amended]. for benzo[a]pyrene, this value is shown in table 1.

pollutant | averaging time | immission limit (ng/m3)
benzo[a]pyrene | calendar year | 1
table 1. immission limit for health protection — total content in pm10 particles.

this activity serves not only as a legislative instrument for the control of individual polluters; a network of monitoring stations operated by the czech hydrometeorological institute together with private entities has also been built up [31, 43]. the us-epa and iarc selected the entire group of 16 major polycyclic aromatic hydrocarbons for a targeted and long-term monitoring, namely (chemical name and acronym): naphthalene (na), acenaphthylene (acl), acenaphthene (acn), fluorene (fl), phenanthrene (phe), anthracene (ant), fluoranthene (flu), pyrene (py), benzo[a]anthracene (baa), chrysene (chr), benzo[b]fluoranthene (bbf), benzo[k]fluoranthene (bkf), benzo[a]pyrene (bap), indeno[1,2,3-cd]pyrene (ip), benzo[g,h,i]perylene (bghip) and dibenzo[a,h]anthracene (dbaha). 1.2. monitoring of pahs in vertical profile the monitoring of these 16 pahs in the free air is the topic of many publications, but the vertical concentration profile of pahs has not been studied very often. the following studies deal with measurements on the roofs of buildings and garages and also on various floors of residential buildings on busy roads: rooftops on an expressway, a highway, a dockside warehouse, an industrial harbour, and in central nagoya [29]; a building in an urban area [6]; a building in an urban area with a heavy traffic [24]; a building in an express highway area [18]; industry and residential rooftops [14]; a building in an urban area [32]; a building in an urban area [37]. the overview of the main literature sources regarding measurements of pahs and their vertical distribution is given in table 2. some publications deal with the determination of the vertical profile of pahs in urban or industrial agglomerations [28, 35, 39]. studies focused on the vertical monitoring of pahs at higher heights in the countryside are rather limited, e.g., roofs of buildings at a 3 m height on farms [3]; a campus area close to a lake, 2.5 m [33]; a rural area at a 4 m height [26].
a few publications are dedicated to determining the pahs vertical profile at different heights of towers: an urban area of the city of toronto, the cn tower [8]; urban towers [35, 39]; a scaffolding tower in a forest canopy [16]; the milad tower of tehran, an urban area [28]. the authors did not find any publication that would deal with the vertical profile of pahs at a background station with the pahs measured continuously over a one-month sampling time.

article | location | season | site (note) | σpahs (ng/m3)
[39]; 16 pahs | tianjin/china | dec/jan | urban, tower | 265/409 (20–32 m), 333/537 (36–44 m), 239/359 (60–65 m)
[35]; 16 pahs | peking/china | jan/mar | urban, tower | 440.9/48.8 (8–16 m), */45.5 (20–32 m), 414.6/* (36–44 m), 396.6/52.7 (60–65 m), 460.1/21.2 (100–158 m), 174.0/30.7 (200–240 m), 62.2/29.0 (300–330 m)
[29]; 23 pahs | nagoya/japan | average for four seasons | express highway | 2.3–30.1
[29]; 23 pahs | nagoya/japan | average for four seasons | dockside warehouse | 1.1–22.4
[29]; 23 pahs | nagoya/japan | average for four seasons | central nagoya | 1.0–14.7
[29]; 23 pahs | nagoya/japan | average for four seasons | industrial harbour | 1.6–22.1
[29]; 23 pahs | nagoya/japan | average for four seasons | suburban residential | 3.1–17.5
[6]; 13 pahs | liwan/china | nov–dec | urban, building | 32.5–153.7
[6]; 13 pahs | wushan/china | oct–nov | urban, building | 70.8–93.6
[6]; 13 pahs | xingken/china | oct | suburban, field | 39.2
[33] | campo grande/brazil | jul–nov | campus close to a lake | 8.9–62.5
[26]; 23 pahs | north central part/india | winter/summer | rural | 258.2/175.4
[26]; 23 pahs | north central part/india | rainy | rural | 52.4
[24]; 18 pahs | guangzhou/china | nov | urban, heavy traffic, building | 8340.8
[18] | singapore | | express highway, building | 2.7–6.8, 4.5–6.7, 0.1–4.7
[16] | southern ontario/canada | spring | forest, scaffolding tower | 0.2–0.4 (1.5 m), 0.1–0.5, 0.1–0.6, 0.1–0.4
[28] | tehran/iran | year | urban, milad tower | 16.2, 19.4, 12.8, 14.2
[32] | bangkok/thailand | feb | urban, building | 0.4, 0.5, 0.2
[37] | mumbai/india | winter | urban, building | 39.0–42.0, 21.0–25.0
[14] | xiamen/china | summer/autumn | industry, rooftop | 5.2/37.0
[14] | xiamen/china | summer/autumn | residential, rooftop | 1.7/26.6
[3] | mid part/korea | feb/may | urban | 109/22.9
[3] | mid part/korea | feb/may | urban | 410/105
[3] | mid part/korea | feb/may | rooftop, rural, farm | 119/74.1
[3] | mid part/korea | feb/may | rooftop, rural, farm | 195/91.8
table 2. the overview of the literature sources regarding measurements of pahs and their vertical distribution. unit expressed in ng/m3 for σpahs; the sampling heights in the sources range from 3–4 m up to 300–330 m; the star * signifies that the sampling failed.

1.3. air monitoring on the european continent — observatories the air monitoring in europe was backed up with a unique programme of actris (aerosol, clouds, and trace gases research infrastructure network), which offers a comprehensive programme of measurements of the vertical distribution of aerosols, their properties, and the amounts of trace gases. these observations have, at some stations, a long history. out of the 18 observation stations, 16 stations are deployed in europe, one station is in madagascar and one in tenerife. only 4 stations (italy, southern sweden, cyprus and the czech republic) are dedicated to the monitoring of pm1, pm2.5 and pm10 dust particles. the finnish station is the only one dedicated to a continual determination
of pahs from the atmosphere [44]. tropospheric ozone is measured at 5 stations (italy, france, finland, the czech republic and spain), and gaseous elemental mercury is measured only in finland and in the czech republic. 1.4. the goal of the study the goal of the study was the evaluation and verification of the possibility of monitoring the pahs distribution from the pre-filters of the mercury and ozone analysers available on the křešín meteorological tower, and the elaboration of a simple and economical method of extraction of these filters. this study also focused on a proposal of the filter replacement periods and a suggestion of the sampling heights and filter porosities needed to obtain high-quality data, so that a continual long-term monitoring of the vertical distribution of pollutants in the atmosphere can be continued in the following years. 2. materials and methods 2.1. materials this study was conducted in the neighbourhood of the košetice observatory (n 49° 35’, e 15° 05’, elevation 534 m). very close to the existing meteorological observatory in košetice, a high atmospheric tower (height 240 meters) for scientific purposes was built in 2012 at the křešín locality, fulfilling the function of the czech national monitoring point for the monitoring and long-term measurements of the greenhouse gases content in the air and for the monitoring of the air cleanliness in higher atmospheric layers. as can be seen in the picture (figure 1), the site is located in a relatively flat landscape. figure 1. czech national monitoring point křešín — a steel tower of 240 metres height. the tower is owned by the global change research institute, v.v.i., czech academy of sciences. the operator travels to the top by a rack electric elevator through the centre of the tower. the tower is anchored by five steel ropes and three massive reinforced concrete foundations [42, 43]. on the tower, there is, among other items, a continuous determination of ozone and mercury. prior to these analysers, 47 mm ptfe filters, with a porosity of 5 µm
for the ozone analyser and of 0.2 µm for the mercury analyser, both located in a ptfe holder (figure 2), were placed in the line in order to prevent particulate matter from penetrating the analysers. figure 2. placement of the ptfe filter in the continuous analyser of tropospheric ozone. in front of these filters, there is a polyamide net against insects and coarse particulate matter. the air is not subjected to drying. 2.2. methods after the collection, the filters were packed in an aluminium foil, then sealed into a plastic foil and dispatched to the laboratory for the analysis. if the filters were not analysed immediately, they were stored in a freezer at −18 °c; it was found experimentally that this temperature is sufficient to avoid a degradation of the pahs. all laboratory glassware was washed three times with redistilled water, then 3 times with a 1 : 1 (v/v) solvent mixture of acetone : hexane, and dried at a room temperature. after the use, the glassware was washed again: first, tap water was used, followed by distilled water, then redistilled water and the solvent mixture as described above. when determining the reproducibility, the glass was washed immediately after the use. the laboratory glass was stored in a locked cabinet upside down, or with the throat covered with an aluminium foil. the samples of filters were always analysed in series, in order to comply with the same extraction conditions. in the laboratory, the collecting filter was carefully removed from the aluminium foil with tweezers and inserted into an erlenmeyer flask, to which 10 ml of a mixture of extraction agents, formed by acetone : hexane (1 : 1, v/v), was added. each sample was spiked with 100 µl of 9,10-diphenylanthracene (as an internal standard) with a concentration of 100 µg/l. the sample was placed in an ultrasonic bath and extracted for 10 minutes. after this time, the extraction solvent was transferred into a pear-shaped flask. the extraction was repeated with 10 ml of the extraction reagent for 10 minutes, and this second part was added as well.
then, 500 µl of a keeper (a mixture of isopropanol : diethylene glycol in a ratio of 4 : 1, v/v) was added, and the extract was evaporated in a rotary vacuum evaporator nearly to dryness. the rest of the keeper containing the analytes in the pear-shaped flask was diluted with 1 ml of the mobile phase (acetonitrile : double-distilled water in a ratio of 1 : 1, v/v) and transferred into an eppendorf vial. the sample was cleaned and cleared of solid impurities by a centrifugation at 15000 rpm and 20 °c for 10 minutes. the supernatant (approximately 800 µl) was carefully removed from the centrifuge tube by a syringe and transferred into a dark vial for the analysis. the analysis of the extracts was performed according to the described method (supelco application no. 138), with a small modification, by hplc-dad-fld, with both detectors connected in series. the following solvents were used for the sample preparation and for the analysis: acetone (for organic ultra resi-analyzed), j.t.baker; hexane (for organic ultra resi-analyzed), j.t.baker; isopropyl alcohol (suitable for liquid chromatography and uv-spectrophotometry, chrom ar hplc), macron; acetonitrile (chrom ar hplc super gradient), macron; diethylene glycol (puriss p.a.), sigma-aldrich; and pah–mix-9, dr. ehrenstorfer gmbh (x 20950009al, lot: 30325al), 100 ng/µl in acetonitrile. a gradient elution was used in the analysis. the method had the following conditions: a dionex ultimate liquid chromatograph from thermo scientific (usa) was used for the analysis; two detectors were used in series, the dad 3000 rs spectrophotometer for the acenaphthylene analysis and the fluorescence detector fld 3000 rs to determine the other 15 pahs. the chromatographic column was a waters pah c18 column (250 × 3 mm; 5 µm), eluted with a gradient. eluent a: water/acetonitrile 1 : 1 (v/v); eluent b: 100 % acetonitrile; flow rate: 1 ml/min; column temperature: 35 °c; sample injection: 90 µl. the method provided a very good separation, and the measured results were validated. the calibration curve was routinely analysed and validated in the range of 0.1–10 µg/l (7 concentration levels, each in 3 independent repeats). 3. results and discussion for the determination of pahs, there are, in the literature, many methods for a qualitative and quantitative analysis and many techniques for the pre-cleaning of the samples from individual matrices. the question of which method to choose is probably the most important one. the vast majority of publications propose an extraction procedure using the soxhlet extraction [3, 16, 28, 32, 35] or, alternatively, an extraction in an ultrasonic bath [24, 27, 30, 31, 33].
as the extraction agents, a variety of solvents was used, mainly dichloromethane, methanol, petroleum ether and others. the chromatographic analysis of the pahs was largely based on gc/ms [3, 6, 16, 18, 24, 28], gc/msms [32] and gc/lrms [27]. an example of an application of liquid chromatography for the analysis of pahs is the publication [31]: for the analysis of 10 heavier pahs, they used an fld detector, a pursuit 3 pah column (100 × 4.6 mm), an acetonitrile : water (60 : 40) mobile phase and a flow rate of 0.5 ml/min. our work was focused on the monitoring of the pahs at the křešín background station and their analysis in the vertical profile in december 2014–december 2015, while the 16 us epa pahs were analysed in a monthly sampling interval. we have chosen the ultrasonic extraction using acetone : hexane (1 : 1, v/v) as the extraction solvent and the liquid chromatography using the fld detector for the analysis. the values presented in the following tables are the sums of the pahs for the individual periods found on dust particles on the filters with porosities of 5 µm and 0.2 µm, which means that fractions of particles of 5 µm and greater were trapped on the filters of the ozone analyser (table 3) at different heights, and fractions of 0.2 µm and higher were trapped on the filters of the mercury analyser (table 4) on the ground.

date of collection | 8 m | 50 m | 230 m
14. 1. 2015 | 6.683 | 4.859 | 4.125
11. 3. 2015 | 15.905 | 11.867 | 9.624
8. 4. 2015 | 5.372 | 3.881 | 3.369
6. 5. 2015 | 1.533 | 1.033 | 0.741
27. 5. 2015 | 0.894 | 0.609 | 0.449
17. 6. 2015 | 0.415 | 0.340 | 0.368
15. 7. 2015 | 0.238 | 0.179 | 0.208
9. 9. 2015 | 0.458 | 0.399 | 0.416
7. 10. 2015 | 1.555 | 1.200 | 0.821
11. 11. 2015 | 4.223 | 2.680 | 0.654
9. 12. 2015 | 5.231 | 3.720 | 1.940
table 3. the sum of pahs at the different heights (sampling points) trapped on the filters in front of the ozone analyser (ng/m3).

date of collection | ground
3. 12. 2014 | 7.367
17. 3. 2015 | 8.022
8. 4. 2015 | 5.702
6. 5. 2015 | 2.222
3. 6. 2015 | 0.799
1. 7. 2015 | 0.347
29. 7. 2015 | 0.366
9. 9. 2015 | 0.463
7. 10. 2015 | 1.882
31. 10. 2015 | 4.030
2. 12. 2015 | 4.111
table 4. the sum of pahs at the ground (sampling point) trapped on the filters in front of the mercury analyser (ng/m3).

the total concentration of the pahs adsorbed on dust particles and trapped on the filters from the ozone analyser shows, in the period of 11. 2. 2015–11. 3. 2015, the highest concentration at all three heights (8, 50
and 230 m): 15.905, 11.867 and 9.624 ng/m3, with the concentration decreasing towards a higher sampling height. figure 3. concentration of individual pahs (16 pahs according to u.s. epa) at different heights trapped on the filters from the ozone analyser. sampling period 17. 12. 2014–9. 12. 2015. unit expressed in ng/m3. the lowest total concentration of the pahs, 0.179 ng/m3, is then found at the height of 50 m in the period of 17. 6. 2015–15. 7. 2015, with a value of 0.208 ng/m3 at the 230 m height and a value of 0.238 ng/m3 at the 8 m height. when we compare the concentration levels from different heights, we can see a trend of a decreasing concentration of the pahs with an increasing height: concentrations measured at different levels from 8 to 320 m on a tower [35]; at 4 and 13 m [37]; at 24 and 36 m [24]; for a comparison, see table 2. a steeply decreasing tendency with the height at 30, 90, 150, 210, 270 and 360 meters on the toronto cn tower is also reported [8], where urban areas are pointed out as the source of the pollution. in the scientific study [39], the highest concentration is at the 40 m height in the winter period (measured heights of 20, 40 and 60 m) in an urban area. the authors of [17] monitored the gradient dependence of the concentration of 8 pahs on different floors of buildings in new york, with the highest concentrations of nonvolatile and semivolatile pahs found on the 3rd–5th floors, compared to the pahs monitored on the 0–2nd and 6th–32nd floors. pongpiachan [32] found the highest concentration at 158 m, compared to 38 and 328 m, in winter in the urban atmosphere of the city of bangkok. a measurement on a 45 m high scaffolding tower in a leafy forest in canada, where the sampling points were at 1.5, 16.7, 29.1 and 44.4 m, was performed [16], and a decrease of the lower pahs in the gas phase with the height was noted, namely for phenanthrene, anthracene and pyrene. in the context of the decreasing concentration with the height, several publications also mentioned a concurrent drop in the summer, as opposed to the winter season [3, 14, 26, 35]. the authors of [26] compared not only the winter and summer seasons, but also the rainy season, in which the pahs concentrations were the lowest. most of these publications focus only on a short sampling time, because of the higher financial cost of an active sampling. an overview of the measurements for the four individual seasons at a 20 m height is reported in [29]; their sampling is limited to only 3 days, but covers 4 seasons. the vertical profile during october 2011–march 2012 was monitored in [28]; however, the results are presented as the sum of the individual pahs for all the periods, and the contribution to the 200 and 300 m concentrations was explained as a long-distance transport of pollutants. a uniform layering of pahs in the atmosphere was supposed in [35], but, at the same time, the authors claim that this assumption works with high uncertainties only, as it is based on an estimation of the atmospheric concentrations and their movement in the atmosphere.
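the height trend discussed above can be read directly from table 3. the following is a minimal sketch, assuming python with no external dependencies and with the σpahs values re-typed from table 3, which marks for each sampling date whether the concentration decreases monotonically from 8 m to 230 m (the summer dates show the flattening mentioned above):

# σpahs (ng/m3) at 8, 50 and 230 m per collection date, re-typed from table 3
table3 = {
    "14. 1. 2015":  (6.683, 4.859, 4.125),
    "11. 3. 2015":  (15.905, 11.867, 9.624),
    "8. 4. 2015":   (5.372, 3.881, 3.369),
    "6. 5. 2015":   (1.533, 1.033, 0.741),
    "27. 5. 2015":  (0.894, 0.609, 0.449),
    "17. 6. 2015":  (0.415, 0.340, 0.368),
    "15. 7. 2015":  (0.238, 0.179, 0.208),
    "9. 9. 2015":   (0.458, 0.399, 0.416),
    "7. 10. 2015":  (1.555, 1.200, 0.821),
    "11. 11. 2015": (4.223, 2.680, 0.654),
    "9. 12. 2015":  (5.231, 3.720, 1.940),
}
for date, (c8, c50, c230) in table3.items():
    decreasing = c8 > c50 > c230   # strict monotone decrease with height
    print(f"{date}: 8 m {c8:.3f}, 50 m {c50:.3f}, 230 m {c230:.3f}, "
          f"decreasing with height: {decreasing}")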
the concentration of the individual polycyclic aromatic hydrocarbons, starting from the hydrocarbon with a m.w. of 202 g/mol (fluoranthene), shows a trend of a decreasing concentration with the height (figure 3). the pahs with 4 to 6 aromatic rings have the highest concentrations at all heights in the period of 11. 2. 2015–11. 3. 2015, with the concentration decreasing with an increasing height, and the lowest concentrations in the period of 17. 6. 2015–15. 7. 2015 at 230 m. figure 4. concentrations of 2-, 3-, 4-, 5- and 6-ring pahs measured on the tower at different heights, on the filters from the ozone analysers: 2-ring pahs include naphthalene; 3-ring pahs include acenaphthylene, acenaphthene, fluorene, phenanthrene and anthracene; 4-ring pahs include fluoranthene, pyrene, benzo[a]anthracene and chrysene; 5-ring pahs include benzo[b]fluoranthene, benzo[k]fluoranthene, benzo[a]pyrene and dibenzo[a,h]anthracene; 6-ring pahs include indeno[1,2,3-cd]pyrene and benzo[g,h,i]perylene. sampling period 17. 12. 2014–9. 12. 2015. unit expressed in ng/m3. figure 4 shows, again, the decrease of the concentrations with the height. regarding the concentrations of the lower molecular weight pahs (2 and 3 rings), there is no such steep difference in the concentration decrease in comparison with the higher molecular weight pahs (the result may be distorted by acenaphthylene, whose concentration was below the detection limit during the year). the pahs with 4 aromatic rings reached the highest concentrations in all the periods at all the measured heights. in figure 3, it is possible to see that there is also a certain stable composition of pahs throughout the sampling period, i.e., the meteorological station really measures a relatively stable background concentration of pahs. in figure 3, we can also see that the pahs create 4 dominant pairs of hydrocarbons. their percent representation at the height of 230 m is shown in figure 5. figure 5. an example of the layering of the four individual pairs in % at the height of 230 m. sampling period 17. 12. 2014–9. 12. 2015.
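the percentages behind figure 5 can be recomputed from table 5 below. the following is a minimal sketch, assuming the four dominant pairs are those singled out in the conclusion (fluoranthene + pyrene, chrysene + benzo[b]fluoranthene, benzo[a]pyrene + benzo[k]fluoranthene, benzo[g,h,i]perylene + indeno[1,2,3-cd]pyrene), with the values re-typed from the 11. 3. 2015 row of table 5 at 230 m:

# individual pah concentrations (ng/m3) from the 11. 3. 2015 row at 230 m
row = {"flu": 1.454, "py": 1.441, "chr": 0.915, "bbf": 1.173,
       "bkf": 0.478, "bap": 0.853, "bghip": 0.859, "ip": 0.814}
sigma_pahs = 9.624  # σpahs of the same row

pairs = [("flu", "py"), ("chr", "bbf"), ("bap", "bkf"), ("bghip", "ip")]
for a, b in pairs:
    pair_sum = row[a] + row[b]
    share = 100.0 * pair_sum / sigma_pahs
    print(f"{a} + {b}: {pair_sum:.3f} ng/m3 = {share:.1f} % of σpahs")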
table 5 from the ozone analyser represents the time development of the concentration level of the individual pahs, depending on the different sampling heights during the monitored period (the highest and lowest concentrations are marked with colour). it is clear from the table 5 that the highest concentrations occurred at each sampling height in the period of 11. 2. 2015–11. 3. 2015, the maxima being at 8 m sampling height and the lowest concentrations occurred in the period of 17. 6. 2015–15. 7. 2015, with the lowest concentration being at 230 m sampling height. during our research study, we also focused on benzo[a]pyrene (bap), which is, according to the iarc, group 1 carcinogen. this most carcinogenic and mutagenic hydrocarbon with 5 rings was found present in the air [3] in 2 places in an urban area with the concentration of 0.61 and 0.73 ng/m3 in winter and in 2 places on a farm on the rooftop (3 m) with the concentration of 0.17 and 0.53 ng/m3 in the same season. nagoya metropolis was measured and authors [29] found the concentration of bap inside and in peripheral area 0.06–1.45 ng/m3 and 0.05–3.75 ng/m3, both in 20 m above the ground. the large difference in the bap concentrations in the winter and summer season is described in the article [33]. the authors measured the bap at the height of 2.5 m in campus near brazilian city campo grande and the concentration of the bap was ranging from 8.9-62.5 ng/m3. in spite of the fact that meteorological station křešín near pacov is a background station, the mea288 re tr ac te d vol. 57 no. 4/2017 possibility of monitoring of pahs distribution 1 1 f ig . 6 : t h e s u m o f h e a v ie r p a h s c o n c e n tr a ti o n ( 4 -6 ri n g ) fo r e a c h h e ig h t. t h e f o ll o w in g p a h s a re i n c lu d e d : fl u o ra n th e n e , p y re n e , b e n zo [a ]a n th ra c e n e , c h ry se n e , b e n zo [b ]f lu o ra n th e n e , b e n zo [k ]f lu o ra n th e n e , b e n zo [a ]p y re n e , in d e n o [1 ,2 ,3 -c d ]p y re n e , b e n zo [g ,h ,i ]p e ry le n e a n d d ib e n zo [a ,h ]a n th ra c e n e , fi lt e rs f ro m o zo n e a n a ly se rs . u n it e x p re ss e d i n n g /m 3 t h e h y p o th e si s th a t th e p ro p o rt io n o f h e a v ie r p a h s w it h t h e s a m p li n g h e ig h t is d e cr e a si n g w a s co n fi rm e d . t h is d if fe re n ce is m in im a l in th e su m m e r, w h ic h is re la te d to th e g e n e ra lly lo w e r co n ce n tr a ti o n s o f a ll p a h s (t h e s u m o f p a h s w a s ca lc u la te d s ta rt in g f ro m m .w . o f 2 0 2 g /m o l a n d h ig h e r) ( f ig . 6 ). t a b le 5 : c o n c e n tr a ti o n o f in d iv id u a l p a h s in d if fe re n t p e ri o d s a n d d if fe re n t h e ig h ts , fi lt e rs f ro m o zo n e a n a ly se rs ( n g /m 3 ). l o q t h e l im it o f q u a n ti fi c a ti o n f o r a c l 0 .5 p g /m 3 , fo r p h e 0 .1 p g /m 3 . t a b le 5 f ro m t h e o zo n e a n a ly se r re p re se n ts t h e t im e d e v e lo p m e n t o f th e c o n ce n tr a ti o n l e v e l o f th e in d iv id u a l p a h s, d e p e n d in g o n t h e d if fe re n t sa m p li n g h e ig h ts d u ri n g t h e m o n it o re d p e ri o d ( th e h ig h e st a n d l o w e st c o n ce n tr a ti o n s a re m a rk e d w it h c o lo u r) . 
it i s cl e a r fr o m t h e t a b le 5 t h a t th e h ig h e st co n ce n tr a ti o n s o cc u rr e d a t e a ch s a m p li n g h e ig h t in t h e p e ri o d o f 1 1 .2 .2 0 1 5 -1 1 .3 .2 0 1 5 , th e m a xi m a d a te o f co lle ct io n σ (n g/ m 3 ) n a a c l a c n fl p h e a n t fl u p y b a a c h r b b f b kf b a p d b a h a b gh ip ip h e ig h t 1 4 .1 .2 0 1 5 6 .6 8 3 0 .0 1 6 < l o q 0 .0 0 7 0 .0 5 2 0 .5 7 0 0 .0 4 7 0 .8 6 2 0 .8 5 9 0 .4 4 8 0 .7 4 2 0 .8 2 0 0 .3 6 8 0 .5 7 6 0 .0 5 1 0 .6 7 2 0 .5 9 4 8 m 1 1 .3 .2 0 1 5 1 5 .9 0 5 0 .0 2 8 < l o q 0 .0 1 2 0 .0 6 6 1 .1 8 7 0 .0 2 9 2 .0 2 4 2 .0 5 3 1 .0 3 0 1 .5 8 6 2 .0 7 3 0 .8 7 5 1 .5 0 7 0 .1 2 7 1 .7 2 2 1 .5 8 5 8 m 8 .4 .2 0 1 5 5 .3 7 2 0 .0 1 2 < l o q 0 .0 0 8 0 .0 2 6 0 .4 1 4 0 .0 2 2 0 .7 2 4 0 .7 2 8 0 .3 4 0 0 .5 2 9 0 .6 8 7 0 .2 9 8 0 .5 0 8 0 .0 4 4 0 .5 3 5 0 .4 9 6 8 m 6 .5 .2 0 1 5 1 .5 3 3 0 .0 0 5 < l o q 0 .0 0 2 0 .0 0 3 0 .0 9 4 0 .0 0 4 0 .1 4 2 0 .1 5 5 0 .0 6 6 0 .1 1 3 0 .2 3 5 0 .1 0 1 0 .1 5 0 0 .0 1 7 0 .2 3 8 0 .2 0 8 8 m 2 7 .5 .2 0 1 5 0 .8 9 4 0 .0 0 2 < l o q 0 .0 0 1 0 .0 0 1 0 .0 5 2 0 .0 0 1 0 .1 0 9 0 .1 1 9 0 .0 4 7 0 .0 7 3 0 .1 2 5 0 .0 5 1 0 .0 9 1 0 .0 0 9 0 .1 1 9 0 .0 9 3 8 m 1 7 .6 .2 0 1 5 0 .4 1 5 0 .0 1 7 < l o q < l o q 0 .0 0 1 0 .0 3 1 0 .0 0 2 0 .0 6 0 0 .0 4 7 0 .0 1 7 0 .0 3 4 0 .0 6 5 0 .0 1 9 0 .0 3 4 0 .0 0 4 0 .0 4 5 0 .0 3 9 8 m 1 5 .7 .2 0 1 5 0 .2 3 8 0 .0 0 3 0 .0 8 6 < l o q < l o q 0 .0 1 1 0 .0 0 0 0 .0 2 2 0 .0 1 8 0 .0 0 6 0 .0 1 4 0 .0 2 1 0 .0 0 9 0 .0 1 0 0 .0 0 1 0 .0 1 8 0 .0 2 0 8 m 9 .9 .2 0 1 5 0 .4 5 8 0 .0 0 1 0 .0 5 7 0 .0 0 1 0 .0 0 2 0 .0 2 6 0 .0 0 1 0 .0 6 0 0 .0 6 0 0 .0 1 6 0 .0 3 9 0 .0 6 5 0 .0 2 0 0 .0 2 9 0 .0 0 4 0 .0 3 6 0 .0 3 9 8 m 7 .1 0 .2 0 1 5 1 .5 5 5 0 .0 0 9 0 .0 5 1 0 .0 0 1 0 .0 0 4 0 .0 8 6 0 .0 0 4 0 .1 5 5 0 .1 5 0 0 .0 7 3 0 .1 1 9 0 .2 0 9 0 .1 0 1 0 .1 2 5 0 .0 1 6 0 .2 1 6 0 .2 3 5 8 m 1 1 .1 1 .2 0 1 5 4 .2 2 3 0 .0 1 3 0 .1 7 3 0 .0 0 1 0 .0 0 6 0 .1 2 3 0 .0 0 8 0 .4 0 2 0 .4 0 2 0 .2 0 2 0 .3 6 8 0 .6 2 3 0 .3 1 4 0 .3 5 0 0 .0 4 6 0 .5 4 7 0 .6 4 4 8 m 9 .1 2 .2 0 1 5 5 .2 3 1 < l o q 0 .0 5 5 0 .0 0 7 0 .0 1 2 0 .2 4 7 0 .0 2 3 0 .5 5 5 0 .6 0 0 0 .3 6 7 0 .5 4 1 0 .6 1 9 0 .3 6 6 0 .4 6 3 0 .0 4 5 0 .6 7 9 0 .6 4 9 8 m 1 4 .1 .2 0 1 5 4 .8 5 9 0 .0 0 9 < l o q 0 .0 0 5 0 .0 3 6 0 .4 5 7 0 .0 3 5 0 .6 4 6 0 .6 3 4 0 .3 0 7 0 .4 7 8 0 .6 4 3 0 .2 7 9 0 .4 0 9 0 .0 4 3 0 .4 5 1 0 .4 2 8 5 0 m 1 1 .3 .2 0 1 5 1 1 .8 6 7 0 .0 3 6 < l o q 0 .0 1 0 0 .0 5 1 0 .8 2 1 0 .0 2 2 1 .5 8 7 1 .5 5 2 0 .7 3 5 1 .1 5 3 1 .6 1 5 0 .6 6 9 1 .1 1 3 0 .1 0 1 1 .2 5 1 1 .1 5 3 5 0 m 8 .4 .2 0 1 5 3 .8 8 1 0 .0 1 0 < l o q 0 .0 0 5 0 .0 2 0 0 .3 2 0 0 .0 1 3 0 .5 5 5 0 .5 2 0 0 .2 2 4 0 .3 4 7 0 .5 0 2 0 .2 1 3 0 .3 4 8 0 .0 3 8 0 .3 9 6 0 .3 7 0 5 0 m 6 .5 .2 0 1 5 1 .0 3 3 0 .0 0 4 < l o q 0 .0 0 1 0 .0 0 3 0 .0 7 2 0 .0 0 1 0 .1 1 2 0 .1 2 6 0 .0 3 7 0 .0 7 0 0 .1 7 0 0 .0 6 4 0 .0 8 2 0 .0 1 3 0 .1 4 7 0 .1 3 1 5 0 m 2 7 .5 .2 0 1 5 0 .6 0 9 0 .0 0 1 < l o q 0 .0 0 1 0 .0 0 1 0 .0 3 4 0 .0 0 1 0 .0 7 9 0 .0 7 3 0 .0 3 2 0 .0 5 3 0 .0 8 7 0 .0 3 6 0 .0 6 0 0 .0 0 7 0 .0 7 7 0 .0 6 9 5 0 m 1 7 .6 .2 0 1 5 0 .3 4 0 0 .0 1 2 < l o q < l o q < l o q 0 .0 2 4 0 .0 0 1 0 .0 5 1 0 .0 3 9 0 .0 1 3 0 .0 2 8 0 .0 5 4 0 .0 1 6 0 .0 2 9 0 .0 0 3 0 .0 3 6 0 .0 3 3 5 0 m 1 5 .7 .2 0 1 5 0 .1 7 9 0 .0 0 2 0 .0 6 4 < l o q < l o q 0 .0 1 1 0 .0 0 0 0 .0 1 8 0 .0 1 6 0 .0 0 4 0 .0 1 0 0 .0 1 4 0 .0 0 6 0 .0 0 7 0 .0 0 1 0 .0 1 2 0 .0 1 3 5 0 m 9 .9 .2 0 1 5 0 .3 9 9 0 .0 0 3 0 .0 5 7 0 .0 0 1 0 .0 0 2 0 .0 2 3 0 .0 0 1 0 .0 5 3 0 .0 4 4 
0 .0 1 4 0 .0 3 2 0 .0 5 6 0 .0 1 9 0 .0 2 5 0 .0 0 4 0 .0 3 1 0 .0 3 4 5 0 m 7 .1 0 .2 0 1 5 1 .2 0 0 0 .0 0 6 0 .0 6 7 0 .0 0 1 0 .0 0 3 0 .0 8 7 0 .0 0 4 0 .1 2 9 0 .1 2 1 0 .0 5 7 0 .0 9 2 0 .1 4 6 0 .0 7 4 0 .0 9 4 0 .0 1 1 0 .1 4 6 0 .1 6 4 5 0 m 1 1 .1 1 .2 0 1 5 2 .6 8 0 0 .0 0 6 0 .0 8 2 0 .0 0 6 0 .0 0 5 0 .1 5 4 0 .0 0 8 0 .3 5 0 0 .3 1 7 0 .1 3 4 0 .2 4 3 0 .3 3 0 0 .1 6 6 0 .2 1 4 0 .0 2 5 0 .3 0 6 0 .3 3 2 5 0 m 9 .1 2 .2 0 1 5 3 .7 2 0 < l o q 0 .0 4 5 0 .0 0 7 0 .0 0 8 0 .1 5 6 0 .0 1 7 0 .4 1 7 0 .4 6 2 0 .2 5 0 0 .3 7 0 0 .4 2 9 0 .2 5 6 0 .3 4 2 0 .0 3 1 0 .4 4 5 0 .4 8 5 5 0 m 1 4 .1 .2 0 1 5 4 .1 2 5 0 .0 0 6 < l o q 0 .0 0 4 0 .0 2 3 0 .4 3 1 0 .0 3 5 0 .6 3 1 0 .6 0 6 0 .2 6 6 0 .3 8 5 0 .4 7 0 0 .2 0 5 0 .3 3 8 0 .0 3 2 0 .3 5 4 0 .3 4 0 2 3 0 m 1 1 .3 .2 0 1 5 9 .6 2 4 0 .0 1 6 < l o q 0 .0 0 6 0 .0 5 4 0 .8 6 5 0 .0 2 5 1 .4 5 4 1 .4 4 1 0 .6 0 0 0 .9 1 5 1 .1 7 3 0 .4 7 8 0 .8 5 3 0 .0 7 1 0 .8 5 9 0 .8 1 4 2 3 0 m 8 .4 .2 0 1 5 3 .3 6 9 0 .0 1 0 < l o q 0 .0 0 4 0 .0 2 4 0 .3 4 4 0 .0 1 0 0 .5 4 6 0 .5 0 7 0 .1 9 4 0 .3 0 4 0 .3 8 8 0 .1 6 3 0 .2 9 0 0 .0 2 5 0 .2 8 3 0 .2 7 8 2 3 0 m 6 .5 .2 0 1 5 0 .7 4 1 0 .0 0 3 < l o q < l o q 0 .0 0 2 0 .0 5 7 0 .0 0 0 0 .0 9 8 0 .1 0 9 0 .0 3 7 0 .0 6 8 0 .1 0 8 0 .0 4 3 0 .0 5 7 0 .0 0 8 0 .0 7 9 0 .0 7 2 2 3 0 m 2 7 .5 .2 0 1 5 0 .4 4 9 0 .0 0 1 < l o q 0 .0 0 1 0 .0 0 1 0 .0 2 6 0 .0 0 0 0 .0 6 8 0 .0 6 4 0 .0 2 6 0 .0 4 6 0 .0 6 2 0 .0 2 4 0 .0 4 0 0 .0 0 4 0 .0 4 5 0 .0 4 1 2 3 0 m 1 7 .6 .2 0 1 5 0 .3 6 8 0 .0 1 0 < l o q < l o q < l o q 0 .0 2 4 0 .0 0 2 0 .0 5 5 0 .0 4 4 0 .0 1 4 0 .0 3 1 0 .0 5 9 0 .0 1 8 0 .0 3 3 0 .0 0 4 0 .0 4 0 0 .0 3 6 2 3 0 m 1 5 .7 .2 0 1 5 0 .2 0 8 0 .0 0 2 0 .1 2 0 < l o q < l o q 0 .0 0 6 < l o q 0 .0 1 7 0 .0 1 4 0 .0 0 3 0 .0 1 0 0 .0 1 2 0 .0 0 5 0 .0 0 6 0 .0 0 1 0 .0 0 5 0 .0 0 6 2 3 0 m 9 .9 .2 0 1 5 0 .4 1 6 0 .0 0 5 0 .1 2 1 0 .0 0 2 0 .0 0 3 0 .0 2 1 0 .0 0 1 0 .0 4 8 0 .0 4 0 0 .0 1 2 0 .0 2 9 0 .0 4 7 0 .0 1 5 0 .0 2 2 0 .0 0 3 0 .0 2 4 0 .0 2 5 2 3 0 m 7 .1 0 .2 0 1 5 0 .8 2 1 0 .0 0 7 0 .0 7 7 0 .0 0 1 0 .0 0 3 0 .0 4 7 0 .0 0 3 0 .1 0 4 0 .0 9 8 0 .0 4 0 0 .0 7 2 0 .0 9 6 0 .0 4 6 0 .0 5 3 0 .0 0 6 0 .0 7 6 0 .0 9 1 2 3 0 m 1 1 .1 1 .2 0 1 5 0 .6 5 4 0 .0 1 3 0 .0 6 4 0 .0 0 1 < l o q < l o q 0 .0 0 2 0 .0 9 9 0 .0 8 7 0 .0 3 2 0 .0 6 5 0 .0 7 9 0 .0 3 7 0 .0 4 8 0 .0 0 7 0 .0 5 8 0 .0 6 3 2 3 0 m 9 .1 2 .2 0 1 5 1 .9 4 0 < l o q 0 .0 8 7 < l o q 0 .0 0 3 0 .1 7 3 0 .0 1 0 0 .2 4 7 0 .2 5 3 0 .1 1 1 0 .1 6 2 0 .2 0 6 0 .1 0 8 0 .1 5 5 0 .0 1 4 0 .2 1 0 0 .2 0 1 2 3 0 m th e h ig h e st v a lu e th e lo w e st v a lu e table 5. concentration of individual pahs in different periods and different heights, filters from ozone analysers (ng/m3). loq the limit of quantification for acl 0.5pg/m3, for phe 0.1pg/m3. 289 re tr ac te d g. strnadová, v. hanuš, d. kahoun et al. acta polytechnica 10 fig. 4: concentrations of 2-, 3-, 4-, 5and 6-ring pahs measured on the tower at different heights, on the filters from ozone analysers: 2-ring pahs include naphthalene; 3-ring pahs include acenaphthylene, acenaphthene, fluorene, phenanthrene and anthracene; 4-ring pahs include fluoranthene, pyrene, benzo[a]anthracene and chrysene; 5-ring pahs include benzo[b]fluoranthene, benzo[k]fluoranthene, benzo[a]pyrene and dibenzo[ah]anthracene; 6-ring pahs include indeno[123-cd]pyrene and benzo[ghi]perylene. sampling period 17.12.2014 9.12.2015. unit expressed in ng/m3. 
it is possible to see in the figure 3 that there is also a certain stable composition of pahs throughout the sampling period, i.e., the meteorological station really measures the relatively stable background concentration of pahs. in figure 3 we also see that pahs create 4 dominants pairs of hydrocarbons. their percent representation for a height of 230 m is shown in figure 5. fig. 5: an example of the layering of four individual pairs in % at a height of 230m. sampling period 17.12.2014 9.12.2015. figure 5. an example of the layering of four individual pairs in % at a height of 230m. sampling period 17. 12. 2014–9. 12. 2015. sured concentration of the bap exceeded the limit value (see table 1) in two cases (figure 7), however, the proportion of heavier pahs was significantly higher than the bap. the highest concentrations of bap we have measured were in the winter season (1.51 ng/m3 for 8 m height) and 1.11 ng/m3 for 50 m height and for the same period. in 230 m height, the bap concentration of 0.85 ng/m3 was under the immission limit. 3.1. filters from mercury analyser as supplementary measurements, we present the values of the concentrations in the collection filters from mercury analyser at the base of the tower. the samples were taken at the sampling period different than in the case of the ozone analyser filters and filters also had a different porosity of 0.2 µm, so these values cannot be compared adequately, however, again, higher concentrations of the pahs in the winter months than in the summer months are clearly visible (fig 8). 3.2. conclusion measurements of pahs concentrations from december 17, 2014 to december 9, 2015 at křešín near pacov meteorological background station have produced unique results. unique, because in the literature, a continuous monthly sampling at different heights at the background station has not been described yet. the concentration of the pahs measured at the křešín near pacov meteorological tower verified the hypothesis of the concentration decrease of the individual hydrocarbons with the increasing height and the dependence of this concentration on the season. the total concentration of the pahs in this period was 15.905, 11.867 and 9.624 ng/m3 for heights of 8, 50 and 230 m. the samples taken continuously for one month are characterizing the concentration of the individual pahs observed during the year, especially the decrease in the concentration of high molecular weight pahs along with increasing height. by the measurements, a higher proportion of heavier pahs was found, for which no immission limits are set, e.g., concentration of the pairs of fluoranthene and pyrene in march 2015 (sum) 4.077, 3.139 and 2.895 ng/m3, chrysene and benzo[b]fluoranthene 3.659, 2.768 and 2.088 ng/m3, benzo[ghi]perylene and indeno[1,2,3cd]pyrene 3.307, 2.404 and 1.673 ng/m3 compared to a pair of benzo[a]pyrene and benzo[k]fluoranthene 2.382, 1.782 and 1.331 ng/m3 at the height of 8, 50 and 230 m. higher concentrations of benzo[a]pyrene of 1.507 and 1.113 ng/m3, which exceeded the allowed immission limit value, were measured only in the winter of 2015 at two heights of 8 and 50 m, which could be caused by the heating season. due to the interesting results from 2015, the authors will continue to carry out further measurements not only on ptfe filters on already existing equipment, but also on passive analysers located directly on the meteorological tower (puf and xad sorbents). 
another goal is to conduct a pilot single air sampling from one specific height (230 m) using the portable active skc leland legacy sampler, which has its own filtration equipment (ptfe or quartz filters). measurement of pahs will be complemented by weather data, such as wind direction, wind rotation, humidity, temperature and air pressure. 290 re tr ac te d vol. 57 no. 4/2017 possibility of monitoring of pahs distribution 11 fig. 6: the sum of heavier pahs concentration (4-6ring) for each height. the following pahs are included: fluoranthene, pyrene, benzo[a]anthracene, chrysene, benzo[b]fluoranthene, benzo[k]fluoranthene, benzo[a]pyrene, indeno[1,2,3-cd]pyrene, benzo[g,h,i]perylene and dibenzo[a,h]anthracene, filters from ozone analysers. unit expressed in ng/m3 the hypothesis that the proportion of heavier pahs with the sampling height is decreasing was confirmed. this difference is minimal in the summer, which is related to the generally lower concentrations of all pahs (the sum of pahs was calculated starting from m.w. of 202 g/mol and higher) (fig. 6). table 5: concentration of individual pahs in different periods and different heights, filters from ozone analysers (ng/m3). loq the limit of quantification for acl 0.5pg/m3, for phe 0.1pg/m3. table 5 from the ozone analyser represents the time development of the concentration level of the individual pahs, depending on the different sampling heights during the monitored period (the highest and lowest concentrations are marked with colour). it is clear from the table 5 that the highest concentrations occurred at each sampling height in the period of 11.2.2015-11.3.2015, the maxima date of collection σ (ng/m 3 ) na acl acn fl phe ant flu py baa chr bbf bkf bap dbaha bghip ip height 14.1.2015 6.683 0.016 < loq 0.007 0.052 0.570 0.047 0.862 0.859 0.448 0.742 0.820 0.368 0.576 0.051 0.672 0.594 8m 11.3.2015 15.905 0.028 < loq 0.012 0.066 1.187 0.029 2.024 2.053 1.030 1.586 2.073 0.875 1.507 0.127 1.722 1.585 8m 8.4.2015 5.372 0.012 < loq 0.008 0.026 0.414 0.022 0.724 0.728 0.340 0.529 0.687 0.298 0.508 0.044 0.535 0.496 8m 6.5.2015 1.533 0.005 < loq 0.002 0.003 0.094 0.004 0.142 0.155 0.066 0.113 0.235 0.101 0.150 0.017 0.238 0.208 8m 27.5.2015 0.894 0.002 < loq 0.001 0.001 0.052 0.001 0.109 0.119 0.047 0.073 0.125 0.051 0.091 0.009 0.119 0.093 8m 17.6.2015 0.415 0.017 < loq < loq 0.001 0.031 0.002 0.060 0.047 0.017 0.034 0.065 0.019 0.034 0.004 0.045 0.039 8m 15.7.2015 0.238 0.003 0.086 < loq < loq 0.011 0.000 0.022 0.018 0.006 0.014 0.021 0.009 0.010 0.001 0.018 0.020 8m 9.9.2015 0.458 0.001 0.057 0.001 0.002 0.026 0.001 0.060 0.060 0.016 0.039 0.065 0.020 0.029 0.004 0.036 0.039 8m 7.10.2015 1.555 0.009 0.051 0.001 0.004 0.086 0.004 0.155 0.150 0.073 0.119 0.209 0.101 0.125 0.016 0.216 0.235 8m 11.11.2015 4.223 0.013 0.173 0.001 0.006 0.123 0.008 0.402 0.402 0.202 0.368 0.623 0.314 0.350 0.046 0.547 0.644 8m 9.12.2015 5.231 < loq 0.055 0.007 0.012 0.247 0.023 0.555 0.600 0.367 0.541 0.619 0.366 0.463 0.045 0.679 0.649 8m 14.1.2015 4.859 0.009 < loq 0.005 0.036 0.457 0.035 0.646 0.634 0.307 0.478 0.643 0.279 0.409 0.043 0.451 0.428 50m 11.3.2015 11.867 0.036 < loq 0.010 0.051 0.821 0.022 1.587 1.552 0.735 1.153 1.615 0.669 1.113 0.101 1.251 1.153 50m 8.4.2015 3.881 0.010 < loq 0.005 0.020 0.320 0.013 0.555 0.520 0.224 0.347 0.502 0.213 0.348 0.038 0.396 0.370 50m 6.5.2015 1.033 0.004 < loq 0.001 0.003 0.072 0.001 0.112 0.126 0.037 0.070 0.170 0.064 0.082 0.013 0.147 0.131 50m 27.5.2015 0.609 0.001 < loq 0.001 0.001 0.034 0.001 0.079 0.073 
0.032 0.053 0.087 0.036 0.060 0.007 0.077 0.069 50m 17.6.2015 0.340 0.012 < loq < loq < loq 0.024 0.001 0.051 0.039 0.013 0.028 0.054 0.016 0.029 0.003 0.036 0.033 50m 15.7.2015 0.179 0.002 0.064 < loq < loq 0.011 0.000 0.018 0.016 0.004 0.010 0.014 0.006 0.007 0.001 0.012 0.013 50m 9.9.2015 0.399 0.003 0.057 0.001 0.002 0.023 0.001 0.053 0.044 0.014 0.032 0.056 0.019 0.025 0.004 0.031 0.034 50m 7.10.2015 1.200 0.006 0.067 0.001 0.003 0.087 0.004 0.129 0.121 0.057 0.092 0.146 0.074 0.094 0.011 0.146 0.164 50m 11.11.2015 2.680 0.006 0.082 0.006 0.005 0.154 0.008 0.350 0.317 0.134 0.243 0.330 0.166 0.214 0.025 0.306 0.332 50m 9.12.2015 3.720 < loq 0.045 0.007 0.008 0.156 0.017 0.417 0.462 0.250 0.370 0.429 0.256 0.342 0.031 0.445 0.485 50m 14.1.2015 4.125 0.006 < loq 0.004 0.023 0.431 0.035 0.631 0.606 0.266 0.385 0.470 0.205 0.338 0.032 0.354 0.340 230m 11.3.2015 9.624 0.016 < loq 0.006 0.054 0.865 0.025 1.454 1.441 0.600 0.915 1.173 0.478 0.853 0.071 0.859 0.814 230m 8.4.2015 3.369 0.010 < loq 0.004 0.024 0.344 0.010 0.546 0.507 0.194 0.304 0.388 0.163 0.290 0.025 0.283 0.278 230m 6.5.2015 0.741 0.003 < loq < loq 0.002 0.057 0.000 0.098 0.109 0.037 0.068 0.108 0.043 0.057 0.008 0.079 0.072 230m 27.5.2015 0.449 0.001 < loq 0.001 0.001 0.026 0.000 0.068 0.064 0.026 0.046 0.062 0.024 0.040 0.004 0.045 0.041 230m 17.6.2015 0.368 0.010 < loq < loq < loq 0.024 0.002 0.055 0.044 0.014 0.031 0.059 0.018 0.033 0.004 0.040 0.036 230m 15.7.2015 0.208 0.002 0.120 < loq < loq 0.006 < loq 0.017 0.014 0.003 0.010 0.012 0.005 0.006 0.001 0.005 0.006 230m 9.9.2015 0.416 0.005 0.121 0.002 0.003 0.021 0.001 0.048 0.040 0.012 0.029 0.047 0.015 0.022 0.003 0.024 0.025 230m 7.10.2015 0.821 0.007 0.077 0.001 0.003 0.047 0.003 0.104 0.098 0.040 0.072 0.096 0.046 0.053 0.006 0.076 0.091 230m 11.11.2015 0.654 0.013 0.064 0.001 < loq < loq 0.002 0.099 0.087 0.032 0.065 0.079 0.037 0.048 0.007 0.058 0.063 230m 9.12.2015 1.940 < loq 0.087 < loq 0.003 0.173 0.010 0.247 0.253 0.111 0.162 0.206 0.108 0.155 0.014 0.210 0.201 230m the highest value the lowest value figure 6. the sum of heavier pahs concentration (4-6ring) for each height. the following pahs are included: fluoranthene, pyrene, benzo[a]anthracene, chrysene, benzo[b]fluoranthene, benzo[k]fluoranthene, benzo[a]pyrene, indeno[1,2,3-cd]pyrene, benzo[g,h,i]perylene and dibenzo[a,h]anthracene, filters from ozone analysers. unit expressed in ng/m3 12 being at 8 m sampling height and the lowest concentrations occurred in the period of 17.6.201515.7.2015, with the lowest concentration being at 230 m sampling height. we focused during our research study also on benzo[a]pyrene (bap) which is according iarc group 1 carcinogen. this most carcinogenic and mutagenic hydrocarbon with 5 rings was found and presented in the air by bae et al. (2002) in 2 places in urban area in the concentration of 0.61 and 0.73 ng/m3 in winter and in 2 places in farm on the rooftop (3 m) in the concentration of 0.17 and 0.53 ng/m3 in the same season. ohura et al. (2016) measured the concentration of bap inside and in peripheral area of nagoya metropolis and found 0.06-1.45 ng/m3 and 0.05-3.75 ng/m3, both in 20 m above the ground. the large difference in bap concentrations in the winter and summer season is described in the article (re-poppi and santiago-silva, 2005). the authors measured bap at a height of 2.5 m in campus near brazilian city campo grande and the concentration of bap was ranging from 8.9-62.5 ng/m3. fig. 
7: concentration of benzo[a]pyrene in different periods and different heights, filters from ozone analysers. unit expressed in ng/m3 in spite of the fact that meteorological station křešín near pacov is background station the measured concentration of bap exceeded the limit value (see table 1) in two cases (fig. 7), however the proportion of heavier pahs was significantly higher than bap. the highest concentrations of bap we have measured were in the winter season (1.51 ng/m3 for 8 m height) and 1.11 ng/m3 for 50 m height and for the same period. in 230 m height bap concentration of 0.85 ng/m3 was under the immission limit. figure 7. concentration of benzo[a]pyrene in different periods and different heights, filters from ozone analysers. unit expressed in ng/m3 13 filters from mercury analyser as supplementary measurements we present the values of the concentrations in the collection filters from mercury analyser at the base of the tower. the samples were taken at the sampling period other than in the case of ozone analyser filters and filters had also a different porosity of 0.2 μm, so these values cannot be compared adequately, however, again, higher concentrations of pahs in the winter months than in the summer months are clearly visible (fig 8). fig. 8: sum of pahs collection at the base of the tower, sampling period 31.12.2014 – 2.12.2015. concentration of individual pahs on the ground, filters from mercury analyser (ng/m3) conclusion measurements of pahs concentrations from december 17, 2014 to december 9, 2015 at křešín near pacov meteorological background station have produced unique results. unique because in the literature there has been not described continuous monthly sampling at different heights at the background station yet. the concentration of pahs measured at the křešín near pacov meteorological tower verified the hypothesis of the concentration decrease of the individual hydrocarbons together with the increasing height and the dependence of this concentration on the season. the total concentration of pahs in this period was 15.905, 11.867 and 9.624 ng/m3 for heights of 8, 50 and 230 m. samples taken continuously for one month are characterizing well the concentration of the individual pahs observed during the year, especially the decrease in the concentration of high molecular weight pahs along with increasing height. by the measurements a higher proportion of heavier pahs was found for which no immission limits are set, e.g. concentration of the pairs of fluoranthene and pyrene in march 2015 (sum) 4.077, 3.139 and 2.895 ng/m3, chrysene and benzo[b]fluoranthene 3.659, 2.768 and 2.088 ng/m3, benzo[ghi]perylene and indeno[1,2,3-cd]pyrene 3.307, 2.404 and 1.673 ng/m3 compared to a pair of benzo[a]pyrene and benzo[k]fluoranthene 2.382, 1.782 and 1.331 ng/m3 at the height of 8, 50 and 230 m. higher concentrations of benzo[a]pyrene of 1.507 and 1.113 ng/m3, which exceeded the allowed immission limit value, were measured only in the winter of 2015 at two heights of 8 and 50 m, which could be caused by the heating season. figure 8. sum of pahs collection at the base of the tower, sampling period 31. 12. 2014–2. 12. 2015. concentration of individual pahs on the ground, filters from mercury analyser (ng/m3) 291 re tr ac te d g. strnadová, v. hanuš, d. kahoun et al. 
acta polytechnica acknowledgements the article was supported by the czech technical university grant sgs16/197/ohk1/3t/11 and by the ministry of education, youth and sports of cr within the national sustainability program i (npu i), grant number lo1415 and the research was performed in the frame of the contract between global change research institute as cr and faculty of natural sciences, university of south bohemia in české budějovice. we also thank very much to colleagues from global change research institute as cr for collecting filters in the tower of křešín near pacov, mainly gabriela vítková. references [1] aamot, eli, eiliv steinnes a rudolf schmid. polycyclic aromatic hydrocarbons in norwegian forest soils: impact of long range atmospheric transport. environmental pollution. 1996, 92(3), 275-280. doi:101016/0269-7491(95)00114-x [2] anastasopoulos, angelos t., amanda j. wheeler, deniz karman a ryan h. kulka. intraurban concentrations, spatial variability and correlation of ambient polycyclic aromatic hydrocarbons (pah) and pm2.5. atmospheric environment. 2012, 59, 272-283. doi:10.1016/j.atmosenv.2012.05.004 [3] bae, soo ya, seung muk yi a yong pyo kim. temporal and spatial variations of the particle size distribution of pahs and their dry deposition fluxes in korea. atmospheric environment. 2002, 36(35), 5491-5500. doi:101016/s1352-2310(02)00666-0 [4] curtosi, antonio, emilien pelletier, cristian l. vodopivez a walter p. mac cormack. polycyclic aromatic hydrocarbons in soil and surface marine sediment near jubany station (antarctica). role of permafrost as a low-permeability barrier. science of the total environment. 2007, 383(1-3), 193-204. doi:10.1016/j.scitotenv.2007.04025 [5] čupr, pavel, jana klánová, tomáš bartoš, zuzana flegrová, jiří kohoutek a ivan holoubek. passive air sampler as a tool for longterm air pollution monitoring: part 2. air genotoxic potency screening assessment. environmental pollution. 2006, 144(2), 406-413. doi:10.1016/j.envpol.2005.12045 [6] duan, jingchun, xinhui bi, jihua tan, guoying sheng a jiamo fu. the differences of the size distribution of polycyclic aromatic hydrocarbons (pahs) between urban and rural sites of guangzhou, china. atmospheric research. 2005, 78(3-4), 190-203. doi:10.1016/j.atmosres.2005.04001 [7] dunbar, jana c, chen-i lin, isaura vergucht, jeffery wong a john l durant. estimating the contributions of mobile sources of pah to urban air using real-time pah monitoring. science of the total environment. 2001, 279(1-3), 1-19. doi:101016/s0048-9697(01)00686-6 [8] farrar, n. j., t. harner, m. shoeib, a. sweetman a k. c. jones. field deployment of thin film passive air samplers for persistent organic pollutants: a study in the urban atmospheric boundary layer. environmental science & technology. 2005, 39(1), 42-48. doi:101021/es048907a [9] gouin, todd, david wilkinson, stephen hummel, ben meyer a andrew culley. polycyclic aromatic hydrocarbons in air and snow from fairbanks, alaska. atmospheric pollution research. 2010, 2010(1), 9-15. doi:10.5094/apr.2010002 [10] halsall, c.j., a.j. sweetman, l.a. barrie a k.c. jones. modelling the behaviour of pahs during atmospheric transport from the uk to the arctic. atmospheric environment. 2001, 35(2), 255-267. doi:10.1016/s1352-2310(00)00195-3 [11] halsall, crispin j., peter j. coleman, brian j. davis, victoria. burnett, keith s. waterhouse, peter. harding-jones a kevin c. jones. polycyclic aromatic hydrocarbons in u.k. urban air. environmental science & technology. 1994, 28(13), 2380-2386. 
doi:10.1021/es00062a024 [12] harner, tom, michael bartkow, ivan holoubek, et al. passive air sampling for persistent organic pollutants: introductory remarks to the special issue. environmental pollution. 2006, 144(2), 361-364. doi:10.1016/j.envpol.2005.12.044 [13] hassanvand, mohammad sadegh, kazem naddafi, sasan faridi, et al. characterization of pahs and metals in indoor/outdoor pm10/pm2.5/pm1 in a retirement home and a school dormitory. science of the total environment. 2015, 527-528, 100-110. doi:10.1016/j.scitotenv.2015.05.001 [14] hong, huasheng, hongling yin, xinhong wang a cuixing ye. seasonal variation of pm10-bound pahs in the atmosphere of xiamen, china. atmospheric research. 2007, 85(3-4), 429-441. doi:10.1016/j.atmosres.2007.03.004 [15] chen, ying, xinghua li, tianle zhu, yingjie han a dong lv. pm 2.5 -bound pahs in three indoor and one outdoor air in beijing: concentration, source and health risk assessment. science of the total environment. 2017, 586, 255-264. doi:10.1016/j.scitotenv.2017.01.214 [16] choi, s.-d., r. m. staebler, h. li, y. su, b. gevao, t. harner a f. wania. depletion of gaseous polycyclic aromatic hydrocarbons by a forest canopy. atmospheric chemistry and physics. 2008, 8(14), 4105-4113. doi:10.5194/acp-8-4105-2008 [17] jung, kyung hwa, kerlly bernabe, kathleen moors, beizhan yan, steven n. chillrud, robin whyatt a david camann. effects of floor level and building type on residential levels of outdoor and indoor polycyclic aromatic hydrocarbons, black carbon, and particulate matter in new york city. atmosphere. 2011, 2(2), 96-109. doi:10.3390/atmos2020096 [18] kalaiarasan, m., r. balasubramanian, k.w.d. cheong a k.w. tham. particulate-bound polycyclic aromatic hydrocarbons in naturally ventilated multi-storey residential buildings of singapore: vertical distribution and potential health risks. building and environment. 2009, 44(2), 418-425. doi:10.1016/j.buildenv.2008.04.003 [19] kim, ki-hyun, shamin ara jahan, ehsanul kabir a richard j.c. brown. a review of airborne polycyclic aromatic hydrocarbons (pahs) and their human health effects. environment international. 2013, 60, 71-80. doi:10.1016/j.envint.2013.07.019 292 re tr ac te d http://dx.doi.org/10 http://dx.doi.org/10.1016/j.atmosenv.2012.05.004 http://dx.doi.org/10 http://dx.doi.org/10.1016/j.scitotenv.2007.04 http://dx.doi.org/10.1016/j.envpol.2005.12 http://dx.doi.org/10.1016/j.atmosres.2005.04 http://dx.doi.org/10 http://dx.doi.org/10 http://dx.doi.org/10.5094/apr.2010 http://dx.doi.org/10.1016/s1352-2310(00)00195-3 http://dx.doi.org/10.1021/es00062a024 http://dx.doi.org/10.1016/j.envpol.2005.12.044 http://dx.doi.org/10.1016/j.scitotenv.2015.05.001 http://dx.doi.org/10.1016/j.atmosres.2007.03.004 http://dx.doi.org/10.1016/j.scitotenv.2017.01.214 http://dx.doi.org/10.5194/acp-8-4105-2008 http://dx.doi.org/10.3390/atmos2020096 http://dx.doi.org/10.1016/j.buildenv.2008.04.003 http://dx.doi.org/10.1016/j.envint.2013.07.019 vol. 57 no. 4/2017 possibility of monitoring of pahs distribution [20] klánová, jana, jiří kohoutek, lenka hamplová, petra urbanová a ivan holoubek. passive air sampler as a tool for long-term air pollution monitoring: part 1. performance assessment for seasonal and spatial variations. environmental pollution. 2006, 144(2), 393-405. doi:10.1016/j.envpol.2005.12.048 [21] levy, ji, ea houseman, jd spengler, p loh a l. ryan. fine particulate matter and polycyclic aromatic hydrocarbon concentration patterns in roxbury, massachusetts: a community-based gis analysis. 
environ health perspect. 2001, 109(4), 341-7. [22] levy, ji., dh. bennett, sj. melly a jd. spengler. influence of traffic patterns on particulate matter and polycyclic aromatic hydrocarbon concentrations in roxbury, massachusetts. j expo anal environ epidemiol. 2003, 13(5), 364-71. doi:10,1038/sj.jea.7500289 [23] levy, jonathan i, thomas dumyahn a john d spengler. particulate matter and polycyclic aromatic hydrocarbon concentrations in indoor and outdoor microenvironments in boston, massachusetts. journal of exposure analysis and environmental epidemiology. 2002, 2002(12), 104–114. doi:10.1038/sj/jea/7500203 [24] li, chunlei, jiamo fu, guoying sheng, xinhui bi, yongmei hao, xinming wang a bixian mai. vertical distribution of pahs in the indoor and outdoor pm2.5 in guangzhou, china. building and environment. 2005, 40(3), 329-341. doi:10.1016/j.buildenv.2004.05.015 [25] linhart, igor. toxikologie: interakce škodlivých látek s živými organismy, jejich mechanismy, projevy a důsledky. 1st edition. praha: vysoká škola chemicko-technologická, 2012. isbn 9788070808061 [26] masih, jamson, raj singhvi, ajay taneja, krishan kumar a harison masih. gaseous/particulate bound polycyclic aromatic hydrocarbons (pahs), seasonal variation in north central part of rural india. sustainable cities and society. 2012, 3, 30-36. doi:10.1016/j.scs.2012.01.001 [27] menichini, e., n. iacovella, f. monfredini a l. turrio-baldassarri. relationships between indoor and outdoor air pollution by carcinogenic pahs and pcbs. atmospheric environment. 2007, 41(40), 9518-9529. doi:10.1016/j.atmosenv.2007.08.041 [28] moeinaddini, mazaher, abbas esmaili sari, alireza riyahi bakhtiari, andrew yiu-chung chan, seyed mohammad taghavi, darryl hawker a des connell. source apportionment of pahs and n-alkanes in respirable particles in tehran, iran by wind sector and vertical profile. environmental science and pollution research. 2014, 21(12), 7757-7772. doi:10.1007/s11356-014-2694-1 [29] ohura, takeshi, yuta kamiya a fumikazu ikemori. local and seasonal variations in concentrations of chlorinated polycyclic aromatic hydrocarbons associated with particles in a japanese megacity. journal of hazardous materials. 2016, 312, 254-261. doi:10.1016/j.jhazmat.2016.03.072 [30] omar, nasr yousef m.j, m.radzi bin abas, kamal aziz ketuly a norhayati mohd tahir. concentrations of pahs in atmospheric particles (pm-10) and roadside soil particles collected in kuala lumpur, malaysia. atmospheric environment. 2002, 36(2), 247-254. doi:10.1016/s1352-2310(01)00425-3 [31] pehnec, gordana, ivana jakovljević, anica šišović, ivan bešlić a vladimira vaðić. influence of ozone and meteorological parameters on levels of polycyclic aromatic hydrocarbons in the air. atmospheric environment. 2016, 131, 263-268. doi:10.1016/j.atmosenv.2016.02.009 [32] pongpiachan, siwatt. vertical distribution and potential risk of particulate polycyclic aromatic hydrocarbons in high buildings of bangkok, thailand. asian pacific journal of cancer prevention. 2013, 14(3), 1865-1877. doi:10.7314/apjcp.2013.14.3.1865 [33] re-poppi, n a m santiago-silva. polycyclic aromatic hydrocarbons and other selected organic compounds in ambient air of campo grande city, brazil. atmospheric environment. 2005, 39(16), 2839-2850. doi:10.1016/j.atmosenv.2004.10.006 [34] straif, kurt, aaron cohen a jonathan samet, ed. air pollution and cancer: (iarc scientific publications; 161). 150 cours albert thomas, 69372 lyon cedex 08, france: international agency for research on cancer, 2013. 
isbn 978-92-832-2166-1. [35] tao, shu, yi wang, shiming wu, et al. vertical distribution of polycyclic aromatic hydrocarbons in atmospheric boundary layer of beijing in winter. atmospheric environment. 2007, 41(40), 9594-9602. doi:10.1016/j.atmosenv.2007.08.026 [36] van vliet, patricia, mirjam knape, jeroen de hartog, nicole janssen, hendrik harssema a bert brunekreef. motor vehicle exhaust and chronic respiratory symptoms in children living near freeways. environmental research. 1997, 74(2), 122-132. doi:10.1006/enrs.1997.3757 [37] venkataraman, chandra, salimol thomas a pramod kulkarni. size distributions of polycyclic aromatic hydrocarbons—gas/particle partitioning to urban aerosols. journal of aerosol science. 1999, 30(6), 759-770. doi:10.1016/s0021-8502(98)00761-7 [38] viskari, e.-l., r. rekilä, s. roy, o. lehto, j. ruuskanen a l. kärenlampi. airborne pollutants along a roadside: assessment using snow analyses and moss bags. environmental pollution. 1997, 97(1-2), 153-160. doi:10.1016/s0269-7491(97)00061-4 [39] wu, s.p., s. tao a w.x. liu. particle size distributions of polycyclic aromatic hydrocarbons in rural and urban atmosphere of tianjin, china. chemosphere. 2006, 62(3), 357-367. doi:10.1016/j.chemosphere.2005.04.101 [40] act no. 201/2012 sb.: zákon o ochraně ovzduší. 2012, příloha 1: imisní limity, 69/2012. [41] act no. 330/2012 sb.: vyhláška o způsobu posuzování a vyhodnocení úrovně znečištění, rozsahu informování veřejnosti o úrovni znečištění a při smogových situacích. [42] česká meteorologická společnost: 20 let monitoringu kvality přírodního prostředí na observatoři košetice. čmes, 2016, http://www.cmes.cz/cs/node/204 [2017-05-31]. 293 re tr ac te d http://dx.doi.org/10.1016/j.envpol.2005.12.048 http://dx.doi.org/10,1038/sj.jea.7500289 http://dx.doi.org/10.1038/sj/jea/7500203 http://dx.doi.org/10.1016/j.buildenv.2004.05.015 http://dx.doi.org/10.1016/j.scs.2012.01.001 http://dx.doi.org/10.1016/j.atmosenv.2007.08.041 http://dx.doi.org/10.1007/s11356-014-2694-1 http://dx.doi.org/10.1016/j.jhazmat.2016.03.072 http://dx.doi.org/10.1016/s1352-2310(01)00425-3 http://dx.doi.org/10.1016/j.atmosenv.2016.02.009 http://dx.doi.org/10.7314/apjcp.2013.14.3.1865 http://dx.doi.org/10.1016/j.atmosenv.2004.10.006 http://dx.doi.org/10.1016/j.atmosenv.2007.08.026 http://dx.doi.org/10.1006/enrs.1997.3757 http://dx.doi.org/10.1016/s0021-8502(98)00761-7 http://dx.doi.org/10.1016/s0269-7491(97)00061-4 http://dx.doi.org/10.1016/j.chemosphere.2005.04.101 http://www.cmes.cz/cs/node/204 g. strnadová, v. hanuš, d. kahoun et al. acta polytechnica [43] český hydrometeorologický ústav. (czech hydrometeorological institute) čhmú. 2015, http://portal.chmi.cz/ [2017-05-19]. [44] european research infrastructure for the observation of aerosol, clouds, and trace gases: actris-2 ia in h2020. actris-2. 2015, http://actris2.nilu.no/ dataservices/observationalfacilities/ accesstoobservationalfacilities.aspx [2017-05-31]. [45] iarc [international agency for research on cancer] monographs on the evaluation of the carcinogenic risk of chemicals to man. lyon: international agency for research on cancer, 1976. [46] icos atmosphere thematic centre: křešín near pacov observatory – panel board. icos, 2016, https://icos-atc.lsce.ipsl.fr/kre [2017-05-31]. [47] united states environmental protection agency: us-epa. us-epa, 2017, https://www.epa.gov/ [2017-05-31]. [48] world meteorological organization: global atmosphere watch. wmo, 2016, http://www.wmo.int/ pages/prog/arep/gaw/gaw_home_en.html [2017-05-31]. 
294 re tr ac te d http://portal.chmi.cz/ http://actris2.nilu.no/dataservices/observationalfacilities/accesstoobservationalfacilities.aspx http://actris2.nilu.no/dataservices/observationalfacilities/accesstoobservationalfacilities.aspx http://actris2.nilu.no/dataservices/observationalfacilities/accesstoobservationalfacilities.aspx https://icos-atc.lsce.ipsl.fr/kre https://www.epa.gov/ http://www.wmo.int/pages/prog/arep/gaw/gaw_home_en.html http://www.wmo.int/pages/prog/arep/gaw/gaw_home_en.html acta polytechnica 57(4):282–294, 2017 1 introduction 1.1 legislation data sources 1.2 monitoring of pahs in vertical profile 1.3 air monitoring on the european continent — observatories 1.4 the goal of the study 2 materials and methods 2.1 materials 2.2 methods 3 results and discussion 3.1 filters from mercury analyser 3.2 conclusion acknowledgements references ap02_3.vp 1 introduction there is increasing economic pressure for both newly-built and existing facilities to be kept in-service for a longer period of time. this requirement is enhanced by environmental awareness. to cope with such tasks, mathematical models can be made for the performances of a building or structure, including the process of ageing (deteriorating). generally two classes of problems may be distinguished in this context: (i) prediction of service-life for structures during the design process. this becomes a necessity in the context of the new “performance-based design” rules, which are a trend in forthcoming engineering work – see e.g. [1]; (ii) assessment of existing structures, i.e., estimating residual service life. to estimate future safety and performance one has to consider not only the real material, loading and technological characteristics but also possible deterioration (e.g., by corrosion), physical damage (environmental loadings, impacts), maintenance and possible repair works. in principal, these effects and their governing inputs are highly uncertain, so the adoption of a probabilistic approach is unavoidable. concrete structures – with mild steel or prestressed reinforcement – undergo a process of ageing due to environmental and loading agents (chemical and mechanical/physical processes) during their service. experimental investigation is a basic procedure; however, in many situations – especially concerning more complex structural members or structures it is not feasible. the solution is to model such situations analytically. the aim of this paper is to describe some existing models of various kinds of material deterioration and to discuss their probabilistic form. service life first let us briefly define the service life (or structural life-time) of a structure. according to [2] service life in general is … the period of time after installation during which all the conditions of the structure (structural part) meet or exceed the performance requirements … . it should be stressed that the basic requirement is structural safety. much research has been devoted to methods and applications for service life – see, e.g., [3], [4]. frequently the limit state concept is utilized, comprising a function for the influence of the load s and a function for the load bearing capacity r (or the structural resistance function). then the following inequality should be satisfied: � � � �z r s r x x x s y y yn m� � � � �1 2 1 2, , , , , ,� � 0 (1) where: z – safety margin (or limit state function) xi – basic variable (i � 1, …, n) of the loading function yj – basic variable of the resistance function ( j � 1, …, m). 
note that the same variable may be included in both the sets x and y (e.g., young’s modulus in the case of a redundant structure). for the limit state of an ageing structure, time t is also included among the basic variables. due to the uncertainties of the parameters involved a non-deterministic analysis – i.e., a probabilistic analysis – should be adopted. then xi and yj are either considered to be random variables or may be represented as random fields (the actual value of the parameter is subjected to scatter). a probabilistic representation of service life is depicted in fig. 1 (extracted from [5]). alternatively, the service life of reinforced concrete structures may be assessed by distinguishing two periods. first, the initiation period, during which the gradual depassivation of the reinforcing bars (due to co2 or other effects) is in progress. second, the propagation period, during which corrosion of the reinforcement takes place, and consequently, the failure probability is increased for both the load and serviceability limit states. this is illustrated in fig. 2. the structural life prediction can be reached by modelling such effects. 8 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 42 no. 3/2002 modelling of deterioration effects on concrete structures b. teplý in order to predict the service life of concrete structures models for deterioration effects are needed. this paper has the form of a survey, listing and describing such analytical models, namely carbonation of concrete, ingress of chlorides, corrosion of reinforcing steel and prestressing tendons. the probabilistic approach is applied. keywords: concrete structures, service life, reinforcement, tendons, carbonation, corrosion, chloride ingress. fig. 1: probabilistic representation of structural service life as mentioned above the parameters in the model are in principle stochastic in nature; the actual value of the parameter is subjected to scatter. this implies that the probability can be described by, e.g., probability density functions for all parameters. therefore it is necessary to model the uncertainties and add them to the mathematical performance model or the probabilistic service life design. 2 analytical models carbonation of concrete carbonation is a chemical process in concrete driven by ambient co2 penetrating from the surface and decreasing ph up to a value approximately equal to 9. when the carbonation depth xc equals the cover c the steel is depassivated and corrosion may start (see also the initiation time explained above). � papadakis et al the model of time-dependent carbonation depth xc of opc concrete developed by papadakis et al [6] is an analytical model based on the mass conservation of co2, ca(oh)2 and csh (hydrated calcium silicate) in any control volume of the concrete mass. the simplified carbonation depth formula for opc concrete is expressed here as: � �x w c w c rh w c c c c � �� � � � � � � � � � � 0 35 0 3 1 1000 1 1 . . � � � f 000 22 4 44 10 2 6 c a c c tc a co � � � � � � � . , (2) where xc is the carbonation depth (mm) for time t (years), �c, �a are the mass density (kg/m 3) of cement and aggregates, respectively, w/c, a/c are the water/cement and aggregate/cement ratio, respectively, rh is the ambient relative humidity, cco2 is the co2 content in the atmosphere (mg/m3). 
aggregate unit content a is calculated using the equation: a s g g� � �4 8 8 16 (3) where s, g 4 8� , g 8 16� are the unit content of sand, gravel size 4–8 mm and gravel size 8–16 mm, respectively. then, the mass density of the aggregates can be expressed as: � � � � a s g g a s g g � � � � � 4 8 8 16 4 8 8 16 , (4) where � s, � g4 8� , � g8 16� are the mass densities of sand, gravel size 4–8 mm and gravel size 8–16 mm, respectively. in the original formulation the model does not provide satisfactory results for high values of rh. this has been overcome by implementing the step-wise linear relationship f(rh) extracted from the experiments reported by matoušek [7], see also [8]. a total of 11 variables are involved in this model, and in some cases not all appropriate pieces of information are available. therefore, a simpler model may be useful: � bob and affana [9] x c k d f tc c � 150 , (5) where xc is the average depth of carbonation (mm), fc is the concrete compressive strength (mpa), t is the time of co2 (years), c is the coefficient of cement type, k introduces the influence of humidity (environmental conditions), and d is the coefficient for co2 content. both the models were probabilised and compared – see [10]. there are some other carbonation models – see a review [11]. chloride ingress chloride ions diffuse through the protective concrete cover. after reaching the critical threshold of chloride concentration, the protective passive film around the reinforcement is dissoluted and the corrosion of the reinforcement may be initiated. the model of chloride diffusion-adsorption in concrete by papadakis et al. [12] can be used. this model allows us to predict the chloride concentration in the solid and liquid (aq) phases of concrete as a function of the initial concentration [ c l � (aq)]i, of the concentration [cl � (aq)]0 (mol/m 3) on the nearest concrete surface and the distance x (mm) from it, and that of time t (years). the assumption of the formation of a moving “chloride front”, where the concentration of cl �(aq) decreases to zero, gives the following expression for the distance xcl (mm) from the surface: � �� � � �� � x d t e cl cl sat cl aq cl solid � � � � � �1000 2 31536 100 7 , . (6) � �� �cl solid sat � denotes the saturation concentration of cl � in the solid (mol/m3) and de,cl � is the effective diffusivity (m 2/s): d w c w c a c w c w c e c c c a c c , . . cl � � � � � � � �015 1 1 0 85 1 � � � � � �� � � � � � � 3 2 dcl , h o (7) dcl �, h o2 denotes the diffusion coefficient of cl � in an “infinite solution”. © czech technical university publishing house http://ctn.cvut.cz/ap/ 9 acta polytechnica vol. 42 no. 3/2002 fig. 2: two periods of structural service life if [cl � (aq)]cr indicates the diffusion coefficient of cl � in the aqueous phase (mol/m3) required for the depassivation of steel bars (a critical value), and cc is the concrete cover (mm), then time tcr (years) to depassivation is given by: � �� � � � � �� � t c d cr c e � � � � � � � cl solid cl aq cl sat cl 1000 31536 10 2 1 2 7 0 . , � �� � � �� � � � � � � � � � aq cl aq cr 0 2 (8) again, other models do exist – let us mention (without making a detailed description), e.g., the model based on fick's second law and utilized in [13]. simultaneous effect of cl � and co2 simultaneous contamination of carbon dioxide and ingress of chloride ions is rather common in reality, but unfortunately there are few works devoted to this problem. in this context the research work [14] has to be mentioned. 
this experimental study investigates the effect of fly ash addition on the corrosion process in reinforced concrete exposed to carbon dioxide and chloride. it has been shown that in a strongly aggressive chloride environment (a sea coast or locations with frequent use of de-icing salts) the addition of fly ash is beneficial. conversely, in locations with a high concentration of co2 fly ash may accelerate the corrosion process. reference mortar samples without fly ash were also studied, and it has been proved that the carbonation process affects the chloride profile. to the author’s best knowledge, no analytical model exists for predicting the possible synergetic effect of carbon dioxide and ingress of chlorides. steel corrosion – mild steel reinforcement corrosion is in progress during the propagation period and its rate is governed by the availability of water and oxygen on the steel surface. due to corrosion, the effective area of steel decreases and rust products grow, causing at a certain stage longitudinal cracking and later spalling of concrete, or delamination. � andrade et al note that both the uniform (or general) type and the pitting (localized) type of corrosion should be modelled. recently, the model used by, e.g., [15], [16] seems to be sufficient for the prediction of uniform corrosion. the formula for the time related rebar diameter decrease reads � � � � � �d t d t t d i t t t t t d i i i i corr i i i i corr� � � � � � 0 0116 0 0116. .� � � �0 0 0116t t d ii i corr� � � � � � . � (9) where di is the initial bar diameter (mm), ti is the time to initiation (years) and icorr is the current density (normally expressed in �a /cm2). parameter � expresses the type of corrosion. for homogeneous corrosion � equals 2, however, when localised corrosion (pitting) occurs, � may reach values from 4 up to 8 (see [16]). in this latter case the representation of the corrosion is not accurate enough. for localised corrosion, the following models can also be taken into consideration. the studies by gonzalez et al [17] show that the maximum rate of corrosion penetration in the case of pitting corrosion is 4–8 times that of general corrosion. the depth of the pit at time t can be estimated by the equation � � � �p t i r t tcorr corr i� �0 0116. (10) with icorr taken as 3 �a/cm 2, a value indicative of a high rate of corrosion, and rcorr � 6. the corrosion is assumed to be confined to discrete pits. � val and melchers advanced modelling of localised corrosion was presented by val and melchers [18]. the residual section at the pits can be predicted by simplification in a hemispherical form. the net cross sectional area of a corroded rebar, ar, at time t, is calculated as � � � � � � � � a t d a a p t d a a d p t d p t d r i i i i i � � � � � � � � � � � � � � � � 2 1 2 1 2 4 2 2 2 2 0� � (11) where � � � � a d a d p t d a p t a i p i i 1 1 2 2 1 2 2 1 2 2 2 1 2 � � � � � � � � � � � � � � � � � � � , � � p i p t d 2� � � � � � � � (12) � � � � a p t p t d a d p i p i � � � � � � � � � � � � 2 1 2 2 2 1 2 , arcsin , arcsi� � � � n a p t p 2 � � � � � (13) and p is the radius of the pit. it should be noted that pitting corrosion is highly localized on individual reinforcement bars. it is unlikely that many bars could be affected; hence pitting corrosion will not significantly influence the structural capacity of a cross-section. stress corrosion cracking (scc) the tension of a prestressing wire may induce the presence of cracks and a special type of corrosion – stress corrosion cracking (scc). 
the brittle fracture and fatigue limit state should be considered in this case. fracture mechanics is used to analyze such effects. according to [19] the scc effect can be modelled in the following way: let be the effective tension in a wire, and a be the crack depth caused by corrosion. the tension at the depth of the crack is calculated as � � a p s k a� (14) p is the force in the wire (in n), s is the cross sectional area of the wire and k(a) is the stress intensity factor. its value can be calculated with the aid of the equation of linear fracture mechanics 10 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 42 no. 3/2002 k s a c a d a b ij i j ji j � � � � � � � � � � �� � �� 1 3 0 1 4 , (15) where b is the width of the crack (in m), d is diameter of wire and cij are given in [19] – table 4. when the value of k(a) reaches the critical value kiscc (fracture toughness) the wire breaks suddenly. fig. 3 represents the dependability of time, crack depth and fracture toughness (a0, ac are the initial, resp. critical crack depths). note: in the cases of precast prestressed members or well-grouted tendons, reanchoring should be considered at a certain distance from the point of failure. this distance may be evaluated according to building code requirements for reinforced concrete (aci 318-89), r 12.8. the referred minimum value is 6 inches. pitting and scc corrosion types develop lesser expansion activity than the general case, and therefore the chance of visual detection (cracks on the concrete surface) is often negligible. 3 assessment of structures – random nature � statistical analysis the basic parameters of the carbonation (or other degradation effects) computational models described above are very uncertain. their variability may be significant and stochastic modelling is desirable. therefore, the input variables should be considered to be random variables described by theoretical probability density functions. then the predicted carbonation depth xc(x) or the time to depassivation tcr(x) are the functions of random variables x � x1, x2, …, xm. the level of the probabilisation is given by the decision which input variables will be considered as random and which as deterministic. sensitivity analysis may serve as a tool for such decision-making. the goal of statistical analysis is to estimate the basic statistical parameters of these degradation effects, e.g., their mean values and variances. this can be done by monte carlo simulation, i.e., by repetitive calculations of the carbonation/chloride attack formula. a special type of numerical simulation, latin hypercube sampling (lhs), makes it possible to use a small number of simulations. for an application, see, e.g., [4]. � probability analysis concrete and reinforcing steel materials deteriorate in time, and consequently, e.g., the ultimate bending moment or other resistance variables decrease. the goal is to quantify the influence of structural deterioration on the reliability of the structure in question. the failure probability is defined as: � �� �p p g tf � �x, 0 (16) and can be evaluated at several time points ti (i � 1, 2, …, n). g (.) is the appropriate limit function. the increase of failure probability in time can be estimated, e.g., by advanced simulation techniques, such as importance sampling. � sensitivity analysis sensitivity analysis is usually an additional output of reliability analysis. there are many approaches to sensitivity analysis – see, e.g., [20]. 
the goal of this analysis is to answer the question “how is the variability of the safety margin influenced by the individual random variables”. and since the variability of the safety margin directly influences the failure probability, this sensitivity measure can be understood as a measure of the influence on the theoretical failure probability. an application of sensitivity analysis of a deteriorating structure may be found, e.g., in [4]. � structural service life a definition of service life is depicted in fig. 1 (as the intersection of the loading action curve and the structural resistance function – both random!). � bayesian updating the results of statistical simulation are based on statistical data, which may be more or less accurate, obtained by measurement, taken from literature or based on an engineering judgment. this data should be available from design documentation, while some may be gained by engineering judgment or from specialized databases (if available!). a computational model for the prediction of, e.g., carbonation depth cannot be considered as the most accurate source, considering all factors involved. such results are, therefore, “a prior information” related to the available data and the theoretical model. the prior prediction for both short-term and long-term effects is obtained by this simulation. © czech technical university publishing house http://ctn.cvut.cz/ap/ 11 acta polytechnica vol. 42 no. 3/2002 fig. 3: fracture toughness vs. time and crack depth bayesian statistical prediction combined with lhs is described in [21]; application to improved (a posterior) prediction of the carbonation time profile is shown in [8]. the objective was to use the measurement to improve or update the long-term values of carbonation depth. note that if the data obtained by measurement is too remote from the prior results, then bayesian updating fails [21]. acknowledgements this study has been supported by grant agency of czech republic project no. 103/02/1030. references [1] smith, i.: increasing knowledge of structural performance. structural engineering international, vol. 11, no. 3, 2001, p. 191–195. [2] iso 15686-2: buildings and construction assets – service life planning. international standard. [3] sarja, a.: towards practical durability design of concrete structures. in: durability of building materials and components vol. 2. london: e&f spon, 1999, p. 1237–1247. [4] teplý, b. et al: deterioration of reinforced concrete: probabilistic and sensitivity analyses. acta polytechnica, 1999, vol. 39, no. 2, p. 7–23. [5] siemes, t., rostam, s.: durable safety and serviceability – a performance based design format. delft, report of iabse colloquium 1996, p. 41–50. [6] papadakis, v. g., fardis, m. n., vayenas, c. g.: effect of composition, environmental factors and cement-lime mortar coating on concrete carbonation. materials and structures, 1992, vol. 25, no. 149, p. 293–304. [7] matoušek, m.: effects of some environmental factors on structures. ph.d. thesis, technical university of brno, czech republic (in czech), 1977. [8] novák, d., keršner, z., teplý, b.: prediction of structure deterioration based on the bayesian updating. natural-draught cooling towers, rotterdam, balkema 1996, p. 417–421. [9] bob, c., afana, e.: on-site assessment of concrete carbonation. proceedings of the international conference “failure of concrete structures, štrbské pleso, slovak republic, 1993, p. 84–87. 
[10] keršner, z., teplý, b., novák, d.: uncertainty in service life prediction based on carbonation of concrete. durability of building materials and components 7, london: e & fn spon 1996, vol. 1, p. 13–20. [11] parrot, l. j.: a review of carbonation in reinforced concrete. british cement association report c/1,0987, london 1987. [12] papadakis et al: mathematical modelling of chloride effect on concrete durability and protection measures. concrete in the service of mankind, london: e & f. spon, 1996, p. 165–174. [13] karimi, a. r., ramachandran, r.: probabilistic estimation of corrosion in bridges due to chlorination. rotterdam: balkema 2000, p. 681–688. [14] montemor, m. f. et al: corrosion behaviour of rebars in fly ash mortar exposed to carbon dioxide and chlorides. cement & concrete composites, 2002, vol. 24, p. 45–53. [15] andrade, c., sarria, j., alonso, c.: corrosion rate field monitoring of post – tensioned tendons in contact with chlorides. durability of building materials and components 7, london: e & fn spon 1996, vol. 2, p. 959–967. [16] rodriguez, j., ortega, l. m., casal, j., diez, j. m.: corrosion of reinforcement and service life of concrete structures. ibid, vol. 1, p. 117–126. [17] gonzales, j. a. et al: comparison of rates of general corrosion and maximum pitting penetration on concrete embedded steel reinforcement. cement and concrete research, vol. 25, no. 2, 1995, p. 257–264. [18] val, d., melchers, r. e.: reliability analysis of deteriorating reinforced concrete frame structures. structural safety and reliability, rotterdam: balkema 1998, p. 105–112. [19] izquierdo, d., andrade, c., tanner, p.: reliability analysis of corrosion in post-tensional tendons: case study. proceedings of the international conference “safety, risk, reliability – trend in engineering”, malta 2001, p. 1001–1006. [20] novák, d., teplý, b., shiraishi, n.: sensitivity analysis of structures: a review. international conference civil comp '93, edinburgh, scotland, 1993, p. 201–207. [21] bažant, z. p., chern, j. c.: bayesian statistical prediction of concrete creep and shrinkage. aci journal, vol. 81, no. 6, 1984, p. 319–330. prof. ing. břetislav teplý, csc. tel.: +420 541 147 642 fax: +420 541 147 667 e-mail: teply.b@fce.vutbr.cz brno university of technology faculty of civil engineering žižkova 17 662 37 brno 12 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 42 no. 3/2002 acta polytechnica doi:10.14311/ap.2017.57.0404 acta polytechnica 57(6):404–411, 2017 © czech technical university in prague, 2017 available online at http://ojs.cvut.cz/ojs/index.php/ap the analysis of images in n -point gravitational lens by methods of algebraic geometry albert t. kotvytskiya, b, ∗, semen d. bronzab, volodymyr yu. shablenkoa a karazin kharkov national university, svobody square 4, kharkiv, 61022, ukraine b ukrainian state university of railway transport, feierbakh square 7, 61050, kharkiv, ukraine ∗ corresponding author: kotvytskiy@gmail.com abstract. this paper is devoted to the study of images in n-point gravitational lenses by methods of algebraic geometry. in the beginning, we carefully define images in algebraic terms. based on the definition, we show that in this model of gravitational lenses (for a point source), the dimensions of the images can be only 0 and 1. we reduce it to the fundamental problem of classical algebraic geometry the study of solutions of a polynomial system of equations. further, we use well-known concepts and theorems. 
we adapt known or prove new assertions. sometimes, these statements have a fairly general form and can be applied to other problems of algebraic geometry. in this paper, the criterion for irreducibility of polynomials in several variables over the field of complex numbers is effectively used. in this paper, an algebraic version of the bezout theorem and some other statements are formulated and proved. we have applied the theorems proved by us to study the imaging of dimensions 1 and 0. keywords: gravitational lense, images, algebaric geometry, resultant. 1. introduction in modern astrophysics, gravitational lensing has been transformed from an effect that confirms the general theory of relativity to the research tool. gravitational lensing is used to study both stellar systems and planets in them, and galaxies and systems of galaxies. even the cosmological parameters of the entire metagalaxy are investigated. from this point of view, it seems rather strange that until now a complete analytical description has been performed only for the simplest lenses axially symmetric lenses (see for example [1] or straight infinite cosmic strings [2]. to analyze fairly simple 2-point gravitational lenses, only approximate or numerical methods are used [3, 4]. in this paper, the authors continue the analytic study of n-point gravitational lenses by methods of algebraic geometry [5–8]. in physics, the concept of “image in a gravitational lens” is understood intuitively and is usually not determined. however, the absence of a definition can lead to ambiguous understanding of the concept and a different interpretation of some results, for example, the theorem on the odd number of images [9, 10]. on the other hand, the terminology developed in algebraic geometry makes it possible to pinpoint the concept of an image in a gravitational lens. on this basis it is possible to formulate a number of statements. 2. the physical formulation of the problem from an algebraic point of view for the model of a plane gravitational lens, we can write the equation that connects the coordinates of the source (the radius vector ~y) and the image coordinates (radius vector ~x), see [9, 10] ~y = ~x−~α, (1) where ~α is the total angle of deflection of the light beam in the plane of the lens. in the case of an n-point gravitational lens, the deflection angle is determined by the following expression: ~α = n∑ i=1 mi ~x−~li∣∣~x−~li∣∣2 , (2) where mi are dimensionless masses whose position in the plane of the lens is determined by the radius vectors ~li. we have that holds ∑n i=1 mi = 1. equation (1) with allowance for (2) in coordinate form has the form  ( x1 − n∑ i=1 mi x1 −ai (x1 −ai)2 + (x2 − bi)2 ) −y1 = 0, ( x2 − n∑ i=1 mi x2 − bi (x1 −ai)2 + (x2 − bi)2 ) −y2 = 0, (3) where ai and bi are the coordinates of the radius-vector ~li i.e. ~li = (ai,bi) . from an algebraic point of view, system (3) is a system of two rational equations (over a field of real numbers) from two unknowns, which are given in cartesian coordinates on the r2 plane. the system (3) will be considered, just above the field of complex numbers c, while we denote it by (3a). the set of solutions of system (3) is obviously the set of real solutions of system (3a). we note that all the coefficients of the equations of system (3a) are real. 404 http://dx.doi.org/10.14311/ap.2017.57.0404 http://ojs.cvut.cz/ojs/index.php/ap vol. 57 no. 
6/2017 the analysis of images in n-point gravitational lens in terms of algebraic geometry, the image of a source in an n-point gravitational lens can be defined as follows: definition. an image of a point source in an n-point gravitational lens will be called the real solution of system (3a) without regard to multiplicity. the set of images is the set of different real solutions of this system. 3. reduction of the problem to the fundamental problem of classical algebraic geometry the main problem of classical algebraic geometry is the problem of studying systems of polynomial equations. let us investigate the set of solutions of system (3a). to do this, we transform the equations of the system to a polynomial form  f1 = (x1 −y1) n∏ i=1 hi − n∑ j=1 mj (x1 −aj ) n∏ i=1,i6=j hi = 0, f2 = (x2 −y2) n∏ i=1 hi − n∑ i=1 mi(x2 −bi) n∏ i=1,i6=j hi = 0, (4) where hi = (x1 −ai)2 + (x2 − bi)2, i = 1, 2, . . . ,n. the polynomial form of the equations in system (4) is necessary for its investigation by methods of algebraic geometry. we shall consider the equations of the system (4) over the field c of complex numbers in the affine coordinate system c2. the system (4) is not equivalent to the system (3a), but it follows from it. the set of solutions of the system (3a) can be obtained from the set of solutions of the system (4). for this, by removing solutions from it in which the system (3a) is not defined. these solutions are pairs of numbers that are the coordinates of the point masses. indeed, we directly verify that the points with coordinates (ai,bj ), i = 1, ...,n, is a solution of the system (4), but the system (3a), in these points is not defined. let f1 and f2 be the left-hand sides of the first and second equations of system (3a), m(f1,f2) be the solution set of system (3a), v (f1,f2) be the solution set of system (4), and re v (f1,f2) ⊂ v (f1,f2) the subset of its real solutions, then we have m(f1,f2) = re v (f1,f2)/{∪(ai,bi)}. (5) from the theorem on the structure of the set of solutions of a system of polynomial equations, see [11] it follows that the set v (f1,f2) can be represented in the form v (f1,f2) = ( v 0(f1,f2) ) ∪ ( v 1(f1,f2) ) , (6) where v 1(f1,f2) is the set of solutions depending on a single parameter, and v 0(f1,f2) is the discrete set of solutions of system (3a). the set v 0(f1,f2) is obviously discrete and, moreover, finite. the sets have dimension dim v 0(f1,f2) = 0 and dim v 1(f1,f2) = 1. 4. study of the set v 1(f1, f2) (extended solutions) a number of theorems, which allow us to determine if the set v 1(f1,f2) is empty, see, for example, [12, 13]. in [5], we give an algorithm that allows us to describe this set analytically, if it is not empty. if the set v 1(f1,f2) is not empty, then the equations of system (3) are said to have a common component. the equation of the common component can be obtained from the analytical description of the set v 1(f1,f2). 4.1. 1-point lens (schwarzschild lens) we apply the theorems presented in the appendix for constructing the set v 1(f1,f2), in the case of a single-point gravitational lens. let the 1-point lens have coordinates a1 = 0,b1 = 0. let l: r2y → r 2 x be the transformation from the plane of the source to the plane of the lens, determined by the system of equations  y1 = x1 − x1 x21 + x22 , y2 = x2 − x2 x21 + x22 . (7) equations of the system are defined for all points such that x21 +x22 6= 0, that is, except for the origin of the point o(0, 0). at the origin, the inverse mapping is not defined. 
but if we transform system (7) to the polynomial form
$$\begin{cases} (x_1^2+x_2^2)(x_1-y_1) - x_1 = 0,\\ (x_1^2+x_2^2)(x_2-y_2) - x_2 = 0, \end{cases} \qquad (8)$$
then the inverse transformation $l^{-1}: r^2_x \to r^2_y$ is completely determined by the system of equations
$$\begin{cases} x_1^3 + x_1x_2^2 - x_1^2y_1 - x_2^2y_1 - x_1 = 0,\\ x_1^2x_2 + x_2^3 - x_1^2y_2 - x_2^2y_2 - x_2 = 0. \end{cases} \qquad (9)$$
we calculate the resultant $r_1$ with respect to the variable $x_1$, for which we write the equations in lexicographic form:
$$\begin{cases} x_1^3 - y_1x_1^2 + (x_2^2-1)x_1 - x_2^2y_1 = 0,\\ (x_2-y_2)x_1^2 + x_2^3 - x_2^2y_2 - x_2 = 0. \end{cases} \qquad (10)$$
the resultant with respect to $x_1$ has the form
$$r_1 = \begin{vmatrix} 1 & r_{12} & r_{13} & r_{14} & 0\\ 0 & 1 & r_{12} & r_{13} & r_{14}\\ r_{21} & 0 & r_{23} & 0 & 0\\ 0 & r_{21} & 0 & r_{23} & 0\\ 0 & 0 & r_{21} & 0 & r_{23} \end{vmatrix}, \qquad (11)$$
where $r_{12} = -y_1$, $r_{13} = x_2^2-1$, $r_{14} = -x_2^2y_1$, $r_{21} = x_2-y_2$, $r_{23} = x_2^3 - x_2^2y_2 - x_2$. we have
$$r_1 = -x_2^3y_1^2 + x_2^2y_1^2y_2 + x_2y_1^2 - x_2^3y_2^2 + x_2^2y_2^3. \qquad (12)$$
in order for the equations of system (7) to have a common component, we need $r_1 \equiv 0$. applying theorem 5a and decomposing $r_1$ into indecomposable components, we have
$$x_2\left(-(y_1^2+y_2^2)x_2^2 + y_2(y_1^2+y_2^2)x_2 + y_1^2\right) \equiv 0. \qquad (13)$$
the identity splits into two:
$$x_2 \equiv 0, \qquad -(y_1^2+y_2^2)x_2^2 + y_2(y_1^2+y_2^2)x_2 + y_1^2 \equiv 0. \qquad (14)$$
each of these is considered as a polynomial in the variable $x_2$. a polynomial is identically equal to zero if and only if all its coefficients are equal to zero; from this we obtain the system of equations
$$\begin{cases} y_1^2+y_2^2 = 0,\\ y_1^2 = 0. \end{cases} \qquad (15)$$
hence $y_1 = 0$, $y_2 = 0$. substituting into (7), we have
$$\begin{cases} x_1 - \dfrac{x_1}{x_1^2+x_2^2} = 0,\\[1mm] x_2 - \dfrac{x_2}{x_1^2+x_2^2} = 0 \end{cases} \;\Rightarrow\; \begin{cases} x_1\left(1 - \dfrac{1}{x_1^2+x_2^2}\right) = 0,\\[1mm] x_2\left(1 - \dfrac{1}{x_1^2+x_2^2}\right) = 0. \end{cases} \qquad (16)$$
system (16) decomposes into three systems and one equation:
$$\begin{cases} x_1 = 0,\\ x_2 = 0, \end{cases} \qquad (17a) \qquad\qquad \begin{cases} x_1 = 0,\\ 1 - \dfrac{1}{x_1^2+x_2^2} = 0, \end{cases} \qquad (17b) \qquad\qquad \begin{cases} 1 - \dfrac{1}{x_1^2+x_2^2} = 0,\\ x_2 = 0, \end{cases} \qquad (17c)$$
$$1 - \dfrac{1}{x_1^2+x_2^2} = 0. \qquad (18)$$
system (17a) has the solution $x_1 = 0$, $x_2 = 0$, but this is not a solution of system (7), since the equations of the system are not defined at the point $o(0,0)$. system (17b) has two solutions, $x_1 = 0$, $x_2 = \pm1$. system (17c) has two solutions, $x_1 = \pm1$, $x_2 = 0$. equation (18) transforms into
$$x_1^2 + x_2^2 - 1 = 0. \qquad (19)$$
equation (19) is the equation of the unit circle in the $x$-plane centered at the point $o(0,0)$. the solutions of systems (17b) and (17c) satisfy equation (19); the corresponding solutions of system (8) are the coordinates of the points of the unit circle centered at $o(0,0)$. equation (19) is the equation of the common component, hence the set
$$v^1(f_1,f_2) = \{(x_1,x_2)\,|\,x_1^2+x_2^2-1 = 0\}. \qquad (20)$$
in the same way we compute the resultant $r_2$; by virtue of the symmetry of the variables, we obtain the same solution.
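the elimination above can be reproduced with a computer algebra system. the sketch below recomputes the resultant of (10) with sympy and factors it, which should reproduce (12)-(13) up to sign and ordering of the terms:

```python
import sympy as sp

x1, x2, y1, y2 = sp.symbols('x1 x2 y1 y2')
f1 = x1**3 - y1*x1**2 + (x2**2 - 1)*x1 - x2**2*y1   # first equation of (10)
f2 = (x2 - y2)*x1**2 + x2**3 - x2**2*y2 - x2        # second equation of (10)

r1 = sp.resultant(f1, f2, x1)   # eliminates x1, cf. (11)-(12)
print(sp.factor(r1))            # the factor x2 splits off, as in (13)
```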
4.2. 2-point lens

we study a two-point gravitational lens with equal masses $m_1 = m_2 = \frac12$. the masses lie on the abscissa at a distance $a$ from the origin of coordinates. in this case, system (3) takes the form
$$\begin{cases} y_1 = x_1 - \dfrac12\dfrac{x_1-a}{(x_1-a)^2+x_2^2} - \dfrac12\dfrac{x_1+a}{(x_1+a)^2+x_2^2},\\[2mm] y_2 = x_2 - \dfrac12\dfrac{x_2}{(x_1-a)^2+x_2^2} - \dfrac12\dfrac{x_2}{(x_1+a)^2+x_2^2}. \end{cases} \qquad (21)$$
we transform the equations of system (21) into polynomial form and represent the obtained polynomials $f_1$ and $f_2$ in lexicographic form with increasing powers of the variable $x_1$:
$$\begin{cases} f_1 = -y_1(a^2+x_2^2)^2 + \left(a^2 - x_2^2 + (a^2+x_2^2)^2\right)x_1 + 2y_1(a^2-x_2^2)x_1^2 + (2x_2^2-1-2a^2)x_1^3 - y_1x_1^4 + x_1^5,\\ f_2 = \left(-a^2x_2 - x_2^3 + (a^2+x_2^2)^2(x_2-y_2)\right) - \left(x_2 + 2(a^2-x_2^2)(x_2-y_2)\right)x_1^2 + (x_2-y_2)x_1^4. \end{cases} \qquad (22)$$
we eliminate the variable $x_1$ from the system using the resultant $r_1 = r(f_1,f_2)$. the sylvester matrix $s_1 = s(f_1,f_2)$ has order (with respect to $x_1$) $\deg f_1 + \deg f_2 = 9$. since $r_1 = \det s_1$, we have
$$r_1 = 4a^4x_2^2(a^2+x_2^2)\big(-a^2y_2^3 + (a^2y_2^2 - y_2^2 - 4a^4y_2^2 - 4a^2y_2^4)x_2 \\ + (-4a^2y_2 + 4a^4y_2 - 4a^6y_2 - y_1^2y_2^2 - 4a^2y_1^2y_2 + 8a^4y_1^2y_2 - 4a^2y_1^2y_2 - 5y_2^3 + 4a^2y_2^3 - 8a^4y_2^3 - 4a^2y_2^5)x_2^2 \\ + (-4a^4 + 4a^6 + y_1^2 + 4a^2y_1^2 - 8a^4y_1^2 + 4a^2y_1^4 + y_2^2 - 12a^2y_2^2 + 8a^4y_2^2 - 8y_1^2y_2^2 - 8y_2^4 + 4a^2y_2^4)x_2^3 \\ - 4(a^4y_2 - a^2y_2 - y_1^2y_2 - 2a^2y_1^2y_2 + y_1^4y_2 - y_2^3 + 2a^2y_2^3 + 2y_1^2y_2^3 + y_2^5)x_2^4 \\ + 4(a^4 - 2a^2y_1^2 + y_1^4 + 2a^2y_2^2 + 2y_1^2y_2^2 + y_2^4)x_2^5\big). \qquad (23)$$
in order for the equations of the system to have a common component, it is sufficient that $r_1 \equiv 0$. the equation decomposes into three simple factors and one non-trivial factor. from the trivial equations $a^4 \equiv 0$, $x_2^2 \equiv 0$ and $a^2+x_2^2 \equiv 0$ it follows that their solutions either reduce to the 1-point lens or are inconsistent. the non-trivial equation is obtained by equating the last factor in (23) to zero:
$$-a^2y_2^3 + (a^2y_2^2 - y_2^2 - 4a^4y_2^2 - 4a^2y_2^4)x_2 + \dots + 4(a^4 - 2a^2y_1^2 + y_1^4 + 2a^2y_2^2 + 2y_1^2y_2^2 + y_2^4)x_2^5 = 0. \qquad (24)$$
equating all its coefficients, as a polynomial in $x_2$, to zero, we obtain the system of equations
$$\begin{cases} -a^2y_2^3 = 0,\\ a^2y_2^2 - y_2^2 - 4a^4y_2^2 - 4a^2y_2^4 = 0,\\ -4a^2y_2 + 4a^4y_2 - 4a^6y_2 - y_1^2y_2^2 - 4a^2y_1^2y_2 + 8a^4y_1^2y_2 - 4a^2y_1^2y_2 - 5y_2^3 + 4a^2y_2^3 - 8a^4y_2^3 - 4a^2y_2^5 = 0,\\ -4a^4 + 4a^6 + y_1^2 + 4a^2y_1^2 - 8a^4y_1^2 + 4a^2y_1^4 + y_2^2 - 12a^2y_2^2 + 8a^4y_2^2 - 8y_1^2y_2^2 - 8y_2^4 + 4a^2y_2^4 = 0,\\ a^4y_2 - a^2y_2 - y_1^2y_2 - 2a^2y_1^2y_2 + y_1^4y_2 - y_2^3 + 2a^2y_2^3 + 2y_1^2y_2^3 + y_2^5 = 0,\\ a^4 - 2a^2y_1^2 + y_1^4 + 2a^2y_2^2 + 2y_1^2y_2^2 + y_2^4 = 0. \end{cases} \qquad (25)$$
whence
$$\begin{cases} a = 0,\\ y_2 = 0,\\ -y_1^2y_2^2 - 5y_2^3 = 0,\\ -y_1^2y_2 + y_1^4y_2 - y_2^3 + 2y_1^2y_2^3 + y_2^5 = 0,\\ y_1^4 + 2y_1^2y_2^2 + y_2^4 = 0. \end{cases} \qquad (26)$$
this system has the single solution $a = 0$, $y_1 = 0$, $y_2 = 0$; hence this solution reduces the 2-point gravitational lens to a 1-point one. similarly, we calculate the resultant $r_2$:
$$r_2 = 4a^4(a-x_1)x_1^2(a+x_1)\big(-a^2y_1^3 + (1 + a^2 + 4a^4 - 4a^2y_1^2 - 4a^2y_2^2)y_1^2x_1 \\ + (-4a^2 + 4a^4 - 4a^6 + 5y_1^2 + 4a^2y_1^2 + 8a^4y_1^2 - 4a^2y_1^4 + y_2^2 - 4a^2y_2^2 - 8a^4y_2^2 - 8a^2y_1^2y_2^2 - 4a^2y_2^4)y_1x_1^2 \\ + \big(4a^4 + 4a^6 + (-1 - 12a^2 - 8a^4 + 8y_1^2 + 4a^2y_1^2)y_1^2 + (-1 + 4a^2 + 8a^4 + 8y_1^2 + 8a^2y_1^2 + 4a^2y_2^2)y_2^2\big)x_1^3 \\ + (4a^2y_1^2 + a^4y_1 - y_1^3 - 2a^2y_1^3 + y_1^5 - y_1y_2^2 + 2a^2y_1y_2^2 + 2y_1^3y_2^2 + y_1y_2^4)x_1^4 \\ - 4(a^4 - 2a^2y_1^2 + y_1^4 + 2a^2y_2^2 + 2y_1^2y_2^2 + y_2^4)x_1^5\big). \qquad (27)$$
we conclude that the solution of system (21) reduces the 2-point gravitational lens to a 1-point one. whence:
• for the 1-point gravitational lens we have $v^1(f_1,f_2) = \{(x_1,x_2)\,|\,x_1^2+x_2^2-1 = 0\}$;
• for the 2-point gravitational lens we have $v^1(f_1,f_2) = \emptyset$.
based on the studies carried out above, one can prove that there are no extended images for n-point gravitational lenses with n > 1, i.e. $v^1(f_1,f_2) = \emptyset$. the set $m(f_1,f_2)$ can be represented in the form
$$m(f_1,f_2) = m^0(f_1,f_2) \cup m^1(f_1,f_2), \qquad (28)$$
where $m^0(f_1,f_2) = \mathrm{re}\, v^0(f_1,f_2) \setminus \{\cup(a_i,b_i)\}$ and $m^1(f_1,f_2) = \mathrm{re}\, v^1(f_1,f_2) \setminus \{\cup(a_i,b_i)\}$. it is known that the set $m^1(f_1,f_2)$ for a point source in a 1-point lens is not empty (see, for example, [9, 10, 14]); it coincides with $v^1(f_1,f_2)$, see [5], and is the einstein ring.
but for a point source in a symmetric 2-point lens we proved [5] that the set $m^1(f_1,f_2)$ is empty, and we put forward the hypothesis that for an n-point lens this set is empty for every n > 1.

5. the study of the set $v^0(f_1,f_2)$ (point solutions)

to study the set of solutions $v^0(f_1,f_2)$ of system (3) we use the bezout theorem, see, for example, [11–13, 15]. in most monographs the authors formulate the bezout theorem in geometric terms, see, for example, [11, 12, 15]; one of these formulations is quoted in the appendix. in [13], bezout's theorem is formulated in algebraic terms, but for equations given in affine coordinates; this theorem is also quoted in the appendix. for our purposes, we formulate the theorem in algebraic terms, but for functions given in homogeneous coordinates.

theorem 1 (bezout). let $g_1(x_0:x_1:x_2)$ and $g_2(x_0:x_1:x_2)$ be homogeneous polynomials, $\deg g_1 = n$, $\deg g_2 = m$, and let the resultant $r_1(g_1,g_2)$ with respect to the variable $x_1$ be not identically equal to zero. then the resultant $r_1(g_1,g_2)$ is a homogeneous polynomial in the variables $x_0$ and $x_2$, and $\deg r_1(g_1,g_2) = n\cdot m$.

proof. the resultant $r_1(g_1,g_2)$ is a polynomial in the variables $x_0$ and $x_2$; we denote it by $f$ and write $f = r_1(g_1,g_2)$. let us prove that the polynomial $f = f(x_0,x_2)$ is homogeneous, of degree $\deg f = n\cdot m$. indeed, we have
$$f(tx_0, tx_2) = r_1\big(g_1(tx_0:x_1:tx_2),\, g_2(tx_0:x_1:tx_2)\big). \qquad (29)$$
according to theorem 1a (see the appendix), we have
$$r_1\big(g_1(tx_0:x_1:tx_2),\, g_2(tx_0:x_1:tx_2)\big) = \det s\big(g_1(tx_0:x_1:tx_2),\, g_2(tx_0:x_1:tx_2)\big), \qquad (30)$$
where the right-hand side of (30) is the determinant of the sylvester matrix; the order of this determinant is $n+m$. the elements of the sylvester matrix are the coefficients of the lexicographic representations of the homogeneous polynomials $g_1$ and $g_2$: we have $g_1 = \sum_{i=0}^{n} a_i x_1^{n-i}$ and $g_2 = \sum_{j=0}^{m} b_j x_1^{m-j}$, where the coefficients $a_i = a_i(x_0:x_2)$ and $b_j = b_j(x_0:x_2)$ are homogeneous polynomials of degrees $\deg a_i = i$ and $\deg b_j = j$, that is, $a_i(tx_0:tx_2) = t^i a_i(x_0:x_2)$ and $b_j(tx_0:tx_2) = t^j b_j(x_0:x_2)$.

we multiply every row of the determinant of the sylvester matrix by the parameter $t$ in a suitable power, chosen so that all elements of each column carry the same power of $t$: we multiply the $i$-th row by $t^i$ for $i = 1, 2, \dots, m$, and the $(m+j)$-th row by $t^j$ for $j = 1, 2, \dots, n$. we then take the factor $t^s$ out of the $s$-th column, $s = 1, 2, \dots, m+n$. denoting the total power of $t$ by $s$, we have
$$s = \sum_{s=1}^{m+n} s - \sum_{i=1}^{n} i - \sum_{i=1}^{m} i = \frac{(n+m)(n+m+1)}{2} - \frac{n(n+1)}{2} - \frac{m(m+1)}{2} = nm. \qquad (31)$$
in this way,
$$\det s\big(g_1(tx_0:x_1:tx_2),\, g_2(tx_0:x_1:tx_2)\big) = t^{nm}\det s\big(g_1(x_0:x_1:x_2),\, g_2(x_0:x_1:x_2)\big), \qquad (32)$$
where the determinant on the right-hand side no longer depends on the parameter $t$. substituting (32) into (30), and further into (29), we have
$$f(tx_0, tx_2) = t^{nm}f(x_0,x_2). \qquad (33)$$
consequently, the resultant is a homogeneous polynomial of degree $nm$.

theorem 1 admits a generalization: we have proved an analogous assertion for systems of equations in several variables, see [7, 16].
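theorem 1 is easy to test on a concrete pair of homogeneous polynomials; in the sympy sketch below the degrees 2 and 3 and the particular polynomials are arbitrary illustrative choices:

```python
import sympy as sp

x0, x1, x2, t = sp.symbols('x0 x1 x2 t')
g1 = x1**2 + x0*x2 + x2**2          # homogeneous, degree n = 2
g2 = x1**3 + x0*x1*x2 + x2**3       # homogeneous, degree m = 3

f = sp.resultant(g1, g2, x1)        # a polynomial in x0 and x2
# homogeneity of degree n*m = 6, as theorem 1 states:
print(sp.expand(f.subs({x0: t*x0, x2: t*x2}) - t**6*f) == 0)   # True
```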
we now transform system (3) and apply bezout's theorem to its study. in the equations of system (3) we pass to homogeneous coordinates $(x_0:x_1:x_2)$, substituting
$$x_1 \to \frac{x_1}{x_0}, \qquad x_2 \to \frac{x_2}{x_0}. \qquad (34)$$
after reducing the equations of the system to polynomial form, we have
$$\begin{cases} x_0^{2n+1} f_1\left(\dfrac{x_1}{x_0}, \dfrac{x_2}{x_0}, y_1\right) = \varphi_1(x_0:x_1:x_2) = 0,\\[2mm] x_0^{2n+1} f_2\left(\dfrac{x_1}{x_0}, \dfrac{x_2}{x_0}, y_2\right) = \varphi_2(x_0:x_1:x_2) = 0. \end{cases} \qquad (35)$$
the coordinates $x_0, x_1, x_2$ are obviously projective coordinates: the substitution (34) defines a surjective mapping $\Im: c^2 \to cp^2$. a triple of complex numbers $(x_0:x_1:x_2)$ defines a point $p \in cp^2$ of the projective plane, and the triple $(\lambda x_0 : \lambda x_1 : \lambda x_2)$ specifies the same point if $\lambda \neq 0$. therefore we have the following result.

theorem 2. the system of polynomial equations
$$\begin{cases} \varphi_1(x_0:x_1:x_2) = 0,\\ \varphi_2(x_0:x_1:x_2) = 0 \end{cases} \qquad (36)$$
has in the projective plane $cp^2$, counting multiplicity, exactly $m\cdot n$ solutions, where $m = \deg\varphi_1$ and $n = \deg\varphi_2$, provided that $\gcd(\varphi_1,\varphi_2)$ belongs to the coefficient field $c$.

the functions $\varphi_1 = \varphi_1(x_0:x_1:x_2)$ and $\varphi_2 = \varphi_2(x_0:x_1:x_2)$ are homogeneous functions of degree $2n+1$.

if at least one of the coordinates of a point $p$ is equal to zero, we say that the point is irregular; otherwise the point is called regular. a straight line consisting of irregular points is called an irregular line. the projective plane $cp^2$ has three irregular straight lines, given by the equations
$$x_0 = 0, \qquad x_1 = 0, \qquad x_2 = 0. \qquad (37)$$
the set of points of $cp^2$ one of whose coordinates equals a fixed number $h \neq 0$ is called an affine map (chart) of $cp^2$ and is denoted by $a^2(h)$. the complement $cp^2 \setminus a^2(h)$ consists of a one-dimensional complex projective subspace, called the infinitely distant line of the affine map, see, for example, [11, 15]. the infinitely distant line of any affine map $a^2(h)$ is evidently irregular. in particular, if we put $x_0 = 1$, then the set of points of $cp^2$ with coordinates $(1:x_1:x_2)$ is the affine map $a^2(1)$, and the infinitely distant line of this map is given by the equation $x_0 = 0$.

consider the situation of general position, i.e. the source is not on a caustic; in this case the jacobian of the system of lens equations is not equal to zero.

theorem 3. in a situation of general position (the jacobian of the system of lens equations is not equal to zero), the number of point images in an n-point gravitational lens has parity opposite to the parity of the number n.

in the proof of theorem 3 we use the following lemma.

lemma 1. the number of irregular solutions of system (36) on the line $x_0 = 0$ is $2n$.

proof. applying the substitution (34) to system (4) and clearing denominators, we reduce the system to the form
$$\begin{cases} (x_1-x_0y_1)\prod\limits_{i=1}^{n} h_i - x_0^2\sum\limits_{j=1}^{n} m_j(x_1-x_0a_j)\prod\limits_{i=1,\,i\neq j}^{n} h_i = 0,\\ (x_2-x_0y_2)\prod\limits_{i=1}^{n} h_i - x_0^2\sum\limits_{j=1}^{n} m_j(x_2-x_0b_j)\prod\limits_{i=1,\,i\neq j}^{n} h_i = 0, \end{cases} \qquad (38)$$
where $h_i = (x_1-x_0a_i)^2 + (x_2-x_0b_i)^2$. let $x_0 = 0$. we have
$$\begin{cases} x_1\prod\limits_{i=1}^{n}(x_1^2+x_2^2) = 0,\\ x_2\prod\limits_{i=1}^{n}(x_1^2+x_2^2) = 0 \end{cases} \Rightarrow \begin{cases} x_1(x_1^2+x_2^2)^n = 0,\\ x_2(x_1^2+x_2^2)^n = 0 \end{cases} \Rightarrow (x_1^2+x_2^2)^n = 0 \Rightarrow x_1 = \pm ix_2 \Rightarrow \begin{cases} x_1 = c,\\ x_2 = \pm ic. \end{cases} \qquad (39)$$
finally, we have two n-fold solutions: $p_1 = (0:c:ic)$ and $p_2 = (0:c:-ic)$.

proof of theorem 3. for the degrees of the polynomials of systems (4) and (36) we have $\deg f_1 = \deg f_2 = \deg\varphi_1 = \deg\varphi_2 = 2n+1$. by bezout's theorem, the system of equations (36) has $(2n+1)^2$ solutions, which include an even number $2q$ of complex conjugate solutions and $p = 2n$ irregular solutions. therefore, the number of real solutions of system (36) is
$$\mathrm{card}\big(\mathrm{re}\, v^0(f_1,f_2)\big) = (2n+1)^2 - 2q - p = (2n+1)^2 - 2q - 2n = 4n^2 + 2n + 1 - 2q. \qquad (40)$$
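the parity statement of theorem 3 can be illustrated for n = 1: the sketch below counts the real affine solutions of the schwarzschild system, excluding the point-mass position; the source position is an arbitrary illustrative choice:

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
y1v, y2v = sp.Rational(1, 5), 0        # a hypothetical source position
f1 = (x1 - y1v)*(x1**2 + x2**2) - x1   # system (4) for n = 1, mass at origin
f2 = (x2 - y2v)*(x1**2 + x2**2) - x2

sols = sp.solve([f1, f2], [x1, x2], dict=True)
# keep real solutions and discard the point-mass position (0, 0)
images = [s for s in sols
          if all(v.is_real for v in s.values())
          and not (s[x1] == 0 and s[x2] == 0)]
print(len(images))   # 2: an even number of images, opposite parity to n = 1
```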
from the fact that the restriction of the inverse mapping $\Im^{-1}: cp^2 \to c^2$ to the affine map $a^2(1)$ is a bijection, given by the equations $x_0 = 1$, $x_1 = x_1$, $x_2 = x_2$, we have
$$\mathrm{card}\big(m^0(f_1,f_2)\big) = \mathrm{card}\big(\mathrm{re}\, v^0(f_1,f_2)\big) - n = 4n^2 + n + 1 - 2q. \qquad (41)$$
in a situation of general position the point source is not on a caustic, therefore all elements of the set $\mathrm{re}\, v(f_1,f_2)$ are distinct. in this case, each point of the set $\mathrm{re}\, v(f_1,f_2)$ is, by definition, an image. it follows from (41) that the parity of the number of images is opposite to the parity of the number n.

theorem 3 does not contradict the theorem on the oddness of the number of images in transparent lenses [9, 10].

example 1. for a 1-point lens, the number of images is 2, see [9, 10, 14].

example 2. for a 2-point lens, the number of images is 3 or 5, see [17].

6. conclusions

applying methods of algebraic geometry, we constructed an algorithm that separates images of dimensions 1 and 0. in the present paper, for images of dimension 1, it is proved that for a point source there exists a unique image of dimension 1, the einstein ring; the einstein ring occurs only in a single-point lens; in the other lenses (n > 1) a point source has no images of dimension 1. for images of dimension 0, it is proved that in any n-point gravitational lens there is a finite number of images, and the parity of the number of images is always opposite to the parity of the number n. the assertion about the number of images of dimension 0 was proved by us earlier, see [18]; there we used a geometric method of algebraic geometry, the newton diagram, whereas in the present paper all the assertions are proved algebraically. this opens the possibility of using, for n-point gravitational lenses, not only approximate or numerical methods but also computer algebra systems.

a. appendix

let $f(x,y)$ be a function of two variables that is n times continuously differentiable at the point $(x_0,y_0)$. then taylor's formula holds:
$$f(x,y) = f(x_0,y_0) + \sum_{k=1}^{n} f^{(k)}(x-x_0,\, y-y_0) + r_n(x,y), \qquad (42)$$
where
$$f^{(k)}(x-x_0,\, y-y_0) = \sum_{i=0}^{k} c_k^i\, \frac{\partial^k f(x_0,y_0)}{\partial x^{k-i}\partial y^i}(x-x_0)^{k-i}(y-y_0)^i, \qquad (43)$$
and $r_n(x,y)$ is the remainder term. if $f(x,y)$ is a polynomial and $m = \deg f(x,y)$, then $r_n(x,y) = 0$ for all $n \geq m$.

definition 1a. we say that a pair of numbers $(x_0,y_0)$ is an s-multiple solution of the equation $f(x,y) = 0$ if:
• $f(x_0,y_0) = 0$;
• $f^{(i)}(x-x_0,\, y-y_0) \equiv 0$, $i = 1, 2, \dots, s-1$;
• $f^{(s)}(x-x_0,\, y-y_0) \neq 0$, $s \leq n$, in some neighborhood of the point $(x_0,y_0)$.
we write this fact as $\mathrm{mult}\, f(x_0,y_0) = s$. for example, the point $(0,0)$ is an s-multiple solution of the equation $f(x,y) = 0$ if $f(0,0) = 0$; $f^{(i)}(x,y) \equiv 0$ for $i = 1, 2, \dots, s-1$; and $f^{(s)}(x,y) \neq 0$, $s \leq n$, in some neighborhood of the point $(0,0)$. such a solution is called an s-multiple zero solution.

definition 2a. let the pair of numbers $(x_0,y_0)$ be a solution of the system of equations
$$\begin{cases} f(x,y) = 0,\\ g(x,y) = 0 \end{cases} \qquad (44)$$
and let $q = \min\big(\mathrm{mult}\, f(x_0,y_0),\, \mathrm{mult}\, g(x_0,y_0)\big)$. the solution $(x_0,y_0)$ is called a q-multiple solution of the system of equations (44), and we write $q = \mathrm{mult}(f,g)(x_0,y_0)$. the concept of a multiple solution of a system of equations can obviously be extended to systems with an arbitrary number of equations in several variables.
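definition 1a translates directly into a small procedure; the sketch below computes the multiplicity of the zero solution by scanning the partial derivatives (the test polynomial is an illustrative choice):

```python
import sympy as sp

x, y = sp.symbols('x y')
f = (x**2 + y**2)**2   # illustrative polynomial with a 4-fold zero at (0, 0)

def mult(f, x0=0, y0=0):
    """smallest order s with a non-vanishing s-th differential at (x0, y0),
    following definition 1a (assumes f(x0, y0) = 0)."""
    s = 0
    while True:
        s += 1
        if any(sp.diff(f, x, i, y, s - i).subs({x: x0, y: y0}) != 0
               for i in range(s + 1)):
            return s

print(mult(f))   # 4
```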
the resultant of polynomials is one of the basic concepts of classical algebraic geometry. in the modern literature [12, 19, 20], the resultant of polynomials is usually defined as follows.

definition 3a. let $k$ be an arbitrary field, and $f(x)$, $g(x)$ polynomials in $k[x]$. the resultant $r(f,g)$ of the polynomials $f(x)$ and $g(x)$ is the element of the field $k$ defined by the formula
$$r(f,g) = a_0^m b_0^n \prod_{i=1}^{n}\prod_{j=1}^{m}(\alpha_i - \beta_j), \qquad (45)$$
where $\alpha_i$, $\beta_j$ are the roots of the polynomials $f(x) = \sum_{i=0}^{n} a_i x^{n-i}$ and $g(x) = \sum_{j=0}^{m} b_j x^{m-j}$, respectively, with leading coefficients $a_0 \neq 0$, $b_0 \neq 0$.

if the roots of the polynomials $f(x)$ and $g(x)$ are known, one can use formula (45) to calculate their resultant. if we know only the coefficients of these polynomials, we can calculate the resultant by means of the sylvester matrix. the sylvester matrix is a block matrix of two blocks, each block being a banded matrix.

definition 4a. the sylvester matrix of the polynomials $f(x) = \sum_{i=0}^{n} a_i x^{n-i}$ and $g(x) = \sum_{j=0}^{m} b_j x^{m-j}$ is the square matrix $s = s(f,g)$ of order $n+m$ with the elements $s_{ij}$ defined by the formula
$$s_{ij} = \begin{cases} a_{j-i}, & 0 \leq j-i \leq n,\quad i = 1,\dots,m,\quad j = 1,\dots,n+m,\\ b_{j-i+m}, & 0 \leq j-i+m \leq m,\quad i = m+1,\dots,m+n,\quad j = 1,\dots,n+m,\\ 0 & \text{for other } i, j, \end{cases} \qquad (46)$$
i.e.
$$s(f,g) = \begin{pmatrix} a_0 & a_1 & \cdots & a_n & 0 & \cdots & 0\\ 0 & a_0 & a_1 & \cdots & a_n & & 0\\ \vdots & & \ddots & & & \ddots & \vdots\\ 0 & \cdots & 0 & a_0 & a_1 & \cdots & a_n\\ b_0 & b_1 & \cdots & b_m & 0 & \cdots & 0\\ 0 & b_0 & b_1 & \cdots & b_m & & 0\\ \vdots & & \ddots & & & \ddots & \vdots\\ 0 & \cdots & 0 & b_0 & b_1 & \cdots & b_m \end{pmatrix}, \qquad (47)$$
with $m$ rows of the coefficients of $f$ and $n$ rows of the coefficients of $g$. the resultant $r(f,g)$ and the sylvester matrix $s(f,g)$ are connected by the following theorem.

theorem 1a. the resultant $r(f,g)$ of the polynomials $f$ and $g$ is equal to the determinant of the sylvester matrix of these polynomials, i.e. $r(f,g) = \det s(f,g)$. for the proof see, e.g., [19, 20].

theorem 2a. the polynomials $f$ and $g$ have a common root if and only if
$$r(f,g) = 0. \qquad (48)$$
for the proof see, e.g., [13].

theorem 3a (bezout). the number of intersection points of the plane curves $\varphi_1$ and $\varphi_2$ (counted taking into account multiplicity) is equal to $nm$, where $m = \deg\varphi_1$ and $n = \deg\varphi_2$, if the curves:
• do not have common components;
• are defined over an algebraically closed field;
• are considered on the projective plane.
for the proof see, e.g., [12].

theorem 4a. let $f(x_1,x_2) = \sum_{i,j=0}^{i+j\leq n} f_{ij}x_1^ix_2^j$ and $g(x_1,x_2) = \sum_{i,j=0}^{i+j\leq m} g_{ij}x_1^ix_2^j$ be polynomials whose coefficients are such that $f_{n0} \neq 0$, $f_{0n} \neq 0$, $g_{m0} \neq 0$, $g_{0m} \neq 0$. then
$$\deg r_1\big(f(x_1,x_2),\, g(x_1,x_2)\big) = \deg f(x_1,x_2)\cdot\deg g(x_1,x_2) = nm. \qquad (49)$$
for the proof see, e.g., [12].

theorem 5a. the polynomials $f_1(x_1,x_2)$ and $f_2(x_1,x_2)$ have a non-trivial common component if and only if $r_1(f_1,f_2) \equiv 0$ or $r_2(f_1,f_2) \equiv 0$. for the proof see, e.g., [12].

definition 5a. a formal sum $g = g(x_1,x_2)$ of the form $g = \sum_{i,j=0}^{i+j\leq n} g_{ij}x_1^ix_2^j$ is called a polynomial n-form of the variables $x_1, x_2$ over the field $k$; that is, $g$ is a polynomial of degree n in the variables $x_1, x_2$ with indeterminate coefficients $g_{ij}$ in the field $k$. the expression "the function will be sought in the form of an n-form" is usually understood as the procedure of determining the undetermined coefficients of a given n-form.
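definitions 3a-4a and theorems 1a-2a can be checked on a small example; in the sketch below the sylvester matrix is written out by hand for two quadratics that share the root -1, so both the determinant and the resultant vanish:

```python
import sympy as sp

x = sp.symbols('x')
f = 2*x**2 + 3*x + 1    # roots -1, -1/2
g = x**2 - 1            # roots  1, -1

# sylvester matrix per definition 4a with n = m = 2 (two rows per polynomial)
a = [2, 3, 1]
b = [1, 0, -1]
s = sp.Matrix([[a[0], a[1], a[2], 0],
               [0,    a[0], a[1], a[2]],
               [b[0], b[1], b[2], 0],
               [0,    b[0], b[1], b[2]]])

print(s.det(), sp.resultant(f, g, x))   # 0 0: equal, as theorem 1a states
# the value is zero because f and g share the root -1 (theorem 2a)
```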
theorem 6a (criterion of non-decomposability). let $f$ be a polynomial in the variables $x_1, x_2$ over the field of complex numbers $c$, let $n = \deg f$, and let $g$ be an n-form of the variables $x_1$ and $x_2$ over $c$. let $r_1(f,g)$ and $r_2(f,g)$ be the resultants with respect to the variables $x_1$ and $x_2$, respectively. the polynomial $f$ is not decomposable if and only if the systems of equations
$$\frac{\partial^i}{\partial x_1^i}\, r_1\big(f(x_1,x_2),\, g(x_1,x_2)\big) = 0, \qquad \frac{\partial^i}{\partial x_2^i}\, r_2\big(f(x_1,x_2),\, g(x_1,x_2)\big) = 0, \qquad (50)$$
where $i = 1, \dots, m$, $m = \deg r_1(f,g)$, have only zero solutions. the systems (50) are considered with respect to the undetermined coefficients $g_{ij}$ of the n-form $g$ as the unknown variables. for the proof see, e.g., [7]. the theorem admits a generalization to the case of systems of equations in several variables, see [7, 16]. the criterion was formulated and proved by the authors earlier [7].

references

[1] e. y. bannikova, a. t. kotvytskiy. three einstein rings: explicit solution and numerical simulation. mnras 445(4):4435–4442, 2014. doi:10.1093/mnras/stu2068.
[2] a. t. kotvytskiy. gravitational lensing by straight cosmic strings. tmp 184(1):160–174, 2015. doi:10.1007/s11232-015-0315-x.
[3] a. cassan. an alternative parameterisation for binary-lens caustic-crossing events. astronomy and astrophysics 491(2):587–595, 2008. doi:10.1051/0004-6361:200809795.
[4] p. schneider, a. weiss. the two-point-mass lens: detailed investigation of a special asymmetric gravitational lens. astronomy and astrophysics 164(2):237–259, 1986.
[5] a. t. kotvytskiy, s. d. bronza, k. d. nerushenko, v. y. shablenko. matematychnii zmist kiltsia einshteina ta umovy yogo vynyknennja. doslidzhennya uzahal'nenyh umov [in ukrainian]. in zbirnyk naukovyx prac' vi-i mizhrehional'noi naukovo-praktychnoi konferencii "astronomiya i s'ogodennya", pp. 198–213. vinnytsia, 2017.
[6] a. t. kotvytskiy, s. d. bronza, s. r. vovk. estimating the number of images of n-point gravitational lenses by algebraic geometry methods. visnyk khnu, serija "fizyka" 24:55–59, 2016.
[7] s. d. bronza, a. t. kotvytskiy. mathematical bases of the theory of n-point gravitational lenses. part 1. elements of algebraic geometry. visnyk khnu, seriya "fizyka" 26:6–32, 2017.
[8] a. t. kotvytskiy, s. d. bronza. mathematical bases of the theory of n-point lenses. odessa astronomical publications 29:31–33, 2016. doi:10.18524/1810-4215.2016.29.84958.
[9] a. f. zakharov. gravitacionnye linzy i mikrolinzy [in russian]. janus-k, moscow, 1997.
[10] p. schneider, j. ehlers, e. falco. gravitational lenses. second printing. springer-verlag, berlin heidelberg, 1999.
[11] i. v. arzhantcev. bazisy grebnera i sistemy algebraicheskih uravnenij [in russian]. mcnmo, moscow, 2003.
[12] r. j. walker. algebraic curves. springer-verlag, new york, 1978.
[13] e. kalinina, a. j. uteshev. teoria iskluchenij [in russian]. nii chimii spbgu, saint petersburg, 2002.
[14] p. v. bliokh, a. a. minakov. gravitational lenses [in russian]. naukova dumka, kiev, 1989.
[15] m. reid. undergraduate algebraic geometry. math inst., university of warwick, 2013.
[16] s. d. bronza. kriteriy neprivodimosti mnogochlenov ot dvuh peremennyih nad polem kompleksnih chisel [in russian]. in zbirnyk naukovyx prac', pp. 114–115. ukrduzt, 2016.
[17] s. h. rhie. infimum microlensing amplification of the maximum number of images of n-point lens systems. the astrophysical journal 484(1):63–69, 1997. doi:10.1086/304336.
[18] a. t. kotvytskiy, s. d. bronza, v. y. shablenko. correlation of the number of images of an n-point gravitational lens and the number of solutions of its system. in astronomy and beyond: astrophysics, cosmology, cosmomicrophysics, astroparticle physics, radioastronomy and astrobiology, p. 12. odessa international astronomical gamow conference-school, 2017.
[19] b. l. van der waerden. modern algebra i, ii. frederic ungar publishing co., new york, 1950.
[20] s. lang. algebra. revised third edition. columbia university, new york, 1965.
acta polytechnica 56(3):245–253, 2016

two-dimensional hybrids with mixed boundary value problems

marzena szajewska, agnieszka tereszkiewicz∗

institute of mathematics, university of bialystok, 1m ciolkowskiego, pl-15-245 bialystok, poland
∗ corresponding author: a.tereszkiewicz@uwb.edu.pl

abstract. boundary value problems are considered on a simplex f in the real euclidean space r2. the recent discovery of new families of special functions, orthogonal on f, makes it possible to consider not only the dirichlet or neumann boundary value problems on f, but also the mixed boundary value problem, which is a mixture of the dirichlet and neumann types: on some parts of the boundary of f a dirichlet condition is fulfilled, and on the others a neumann condition.

keywords: hybrid functions, dirichlet boundary value problem, neumann boundary value problem, mixed boundary value problem.

1. introduction

the boundary value problems considered in this paper occur in the real euclidean space r2 on a finite region f ⊂ r2 that is half of a square or half of an equilateral triangle. the main idea of this paper is to study the solutions of the helmholtz equation with mixed boundary value problems. a surprising variety of recently emerged suitable new families of special functions makes the realization of this idea relatively simple and straightforward in any dimension. in addition to the classical boundary value problems of dirichlet and neumann type, the new functions, called 'hybrids' [6, 10], display properties at the boundary of f that are of dirichlet type on some parts of the boundary and of neumann type on the remaining ones.

the boundary value conditions play an important role in describing physical phenomena. they are used, inter alia, in the theory of elasticity, electrostatics and fluid mechanics [2, 4, 16].

in section 2 we introduce some facts about the weyl groups c2 and g2. in section 3 we give the exact formulas for the four families of special functions for each of the groups c2 and g2. the branching rules used to separate variables in section 4 are described in detail, for example, in the papers [9, 11, 14]. in section 5 three types of boundary value problems are considered for the four families of special functions described in section 3. although for the case a1 × a1 there are no hybrid functions, the mixed boundary value problem still occurs; we present this case in detail in the appendix.

2. weyl groups c2 and g2

in this section we recall certain facts about the weyl groups c2 and g2 [1, 3, 5]. we use four bases in r2, namely the e-, α-, α̌- and ω-bases. the first one, the e-basis, is the natural basis of the euclidean space. the simple root basis, the α-basis, exists for every finite group generated by reflections. the co-root basis α̌ is defined by the formula
$$\check\alpha_i = \frac{2\alpha_i}{\langle\alpha_i|\alpha_i\rangle}.$$
the ω-basis is dual to the simple root basis; the relationship between the considered bases is standard in group theory and is expressed by
$$\langle\check\alpha_i|\omega_j\rangle = \delta_{ij}.$$
below we present the α-basis vectors in cartesian coordinates for each of the considered groups:
$$c_2:\quad \alpha_1 := \tfrac{1}{\sqrt2}(1,-1)_e, \qquad \alpha_2 := \tfrac{2}{\sqrt2}(0,1)_e,$$
$$g_2:\quad \alpha_1 := (\sqrt2,\, 0)_e, \qquad \alpha_2 := \left(-\tfrac{1}{\sqrt2},\, \tfrac{1}{\sqrt6}\right)_e.$$
the following notation for coordinates is used:
$$r^2 \ni \lambda = (a,b)_\omega = a\omega_1 + b\omega_2, \qquad r^2 \ni x = (x_1,x_2)_{\check\alpha} = (y_1,y_2)_e,$$
where the indices ω, e and α̌ denote the ω-, natural and α̌-bases, respectively.
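since the ω-basis is defined only implicitly through the duality relation above, a short numerical sketch may help; it recovers the ω-vectors of c2 from the quoted α-basis (everything here is derived from the formulas above, not quoted from the paper):

```python
import numpy as np

# c2 simple roots in cartesian coordinates, rows are alpha_1, alpha_2
alpha = np.array([[1/np.sqrt(2), -1/np.sqrt(2)],
                  [0, 2/np.sqrt(2)]])
# co-roots: alpha_check_i = 2*alpha_i / <alpha_i|alpha_i>
alpha_check = 2*alpha / (alpha**2).sum(axis=1, keepdims=True)
# duality <alpha_check_i|omega_j> = delta_ij fixes the omega-basis
omega = np.linalg.inv(alpha_check).T        # rows are omega_1, omega_2

print(omega)
print(alpha_check @ omega.T)                # identity matrix, as required
```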
figure 1: the shaded triangles represent the fundamental regions f for the c2 and g2 groups.

the fundamental regions f for the c2 and g2 groups, written in the ω-basis, have the vertices
$$f_{c_2} = \{0, \omega_1, \omega_2\}, \qquad f_{g_2} = \left\{0, \tfrac{\omega_1}{2}, \omega_2\right\}$$
and are shown in figure 1. the groups c2 and g2 can be reduced to the subgroup a1 × a1 using the branching rule method described in [11, 14]. for the c2 case this is done by the projection matrix
$$p_{c_2} = \begin{pmatrix} 1 & 1\\ 0 & 1 \end{pmatrix} \qquad (1)$$
acting on the whole orbit of the group. the branching rule is the following:
$$o(a,b) \xrightarrow{\;p_{c_2}\;} o(a+b)o(b) \cup o(b)o(a+b). \qquad (2)$$
the reduction from g2 to a1 × a1 is given by the matrix
$$p_{g_2} = \begin{pmatrix} 1 & 1\\ 3 & 1 \end{pmatrix}. \qquad (3)$$
the branching rule in this case has the form
$$o(a,b) \xrightarrow{\;p_{g_2}\;} o(a+b)o(3a+b) \cup o(2a+b)o(b) \cup o(a)o(3a+2b). \qquad (4)$$
for the group a1 × a1 we use the following notation for coordinates: $r^2 \ni x = (x,y)_e \in a_1\times a_1$.

3. c-, s-, ss- and sl-functions of g = c2 or g2

the general formula for the special functions corresponding to a weyl group [5] is
$$\sum_{w\in g}\sigma(w)\,e^{2\pi i\langle w\lambda|x\rangle},$$
where the coordinates $x = (x_1,x_2)_{\check\alpha} \in r^2$ and the weight $\lambda = (a,b)_\omega$ are given in the α̌- and ω-bases, respectively. the homomorphisms $\sigma: g \to \{\pm1\}$ (by g we denote the group c2 or g2) determine the four families of special functions [9] that are of interest to us. the value $\sigma(w)$ is a product of the values $\sigma(r_l), \sigma(r_s) \in \{\pm1\}$, where $r_l$, $r_s$ denote the long and short reflections in w, respectively. consequently, there are four types of homomorphisms σ:
$$c:\ \sigma(r_l) = \sigma(r_s) = 1, \qquad s:\ \sigma(r_l) = \sigma(r_s) = -1,$$
$$s^l:\ \sigma(r_l) = -1,\ \sigma(r_s) = 1, \qquad s^s:\ \sigma(r_l) = 1,\ \sigma(r_s) = -1.$$

3.1. explicit forms of the c- and s-functions

in this subsection we give the exact formulas for two types of special functions, namely the c- and s-functions, for the c2 and g2 groups. the upper signs in the formulas correspond to the $c_{(a,b)}(x)$ functions and the lower ones to the $s_{(a,b)}(x)$ functions [9, 13]:
$$c_2:\quad \pm2\big[\cos(2\pi((a+2b)x_1 + (-a-b)x_2)) \pm \cos(2\pi((a+2b)x_1 - bx_2)) + \cos(2\pi(-ax_1 + (a+b)x_2)) + \cos(2\pi(ax_1 + bx_2))\big],$$
$$g_2:\quad 2\big[\cos(2\pi((2a+b)x_1 - (3a+2b)x_2)) + \cos(2\pi(ax_1 + bx_2)) \pm \cos(2\pi((a+b)x_1 - bx_2)) \pm \cos(2\pi(ax_1 - (3a+b)x_2)) \pm \cos(2\pi((2a+b)x_1 - (3a+b)x_2)) + \cos(2\pi((a+b)x_1 - (3a+2b)x_2))\big].$$

3.2. explicit forms of the ss- and sl-functions

similarly to the previous subsection, we present the exact formulas for the sl- and ss-functions. here the upper signs correspond to the $s^s_{(a,b)}(x)$ function and the lower ones to the $s^l_{(a,b)}(x)$ function [9, 13]:
$$c_2:\quad 2\big[\mp\cos(2\pi((a+2b)x_1 - (a+b)x_2)) \pm \cos(2\pi((a+2b)x_1 - bx_2)) - \cos(2\pi(-ax_1 + (a+b)x_2)) + \cos(2\pi(ax_1 + bx_2))\big],$$
$$g_2:\quad 2i\big[\sin(2\pi((a+b)x_1 - (3a+2b)x_2)) + \sin(2\pi(ax_1 + bx_2)) \pm \sin(2\pi((2a+b)x_1 - (3a+2b)x_2)) \mp \sin(2\pi(ax_1 - (3a+b)x_2)) - \sin(2\pi((2a+b)x_1 - (3a+b)x_2)) \mp \sin(2\pi((a+b)x_1 - bx_2))\big].$$
remark 1. the admissible weight coordinates $(a,b)_\omega$ differ between the four families of special functions, namely:
$$c_{(a,b)}(x):\ a,b \in z_{\geq0}; \qquad s_{(a,b)}(x):\ a,b \in z_{>0};$$
$$s^s_{(a,b)}(x):\ \begin{cases} a \in z_{>0},\ b \in z_{\geq0} & \text{for } c_2,\\ a \in z_{\geq0},\ b \in z_{>0} & \text{for } g_2; \end{cases} \qquad s^l_{(a,b)}(x):\ \begin{cases} a \in z_{\geq0},\ b \in z_{>0} & \text{for } c_2,\\ a \in z_{>0},\ b \in z_{\geq0} & \text{for } g_2. \end{cases}$$
the next remark is a consequence of the explicit forms written in subsections 3.1 and 3.2.

remark 2. all four families of special functions are real in the case of the c2 group. in the case of the g2 group the c- and s-functions are real, while the sl- and ss-functions are purely imaginary.

4. helmholtz differential equation

in this section we consider the well-known partial differential equation
$$\Delta\psi(x) = -w^2\psi(x), \qquad w \text{ a positive real constant},$$
called the homogeneous helmholtz equation (see, for example, [7, 8, 15] and references therein), where $x = (y_1,y_2)_e$ and $\Delta = \frac{\partial^2}{\partial y_1^2} + \frac{\partial^2}{\partial y_2^2}$.

remark 3 [5]. the special functions described in the previous section are eigenfunctions of the laplace operator. the explicit form of the laplace operator in coordinates relative to the ω-basis and the α̌-basis is the following:
$$c_2:\quad \Delta_\omega = 2\partial_1^2 - 2\partial_1\partial_2 + \partial_2^2, \qquad \Delta_{\check\alpha} = \tfrac12\partial_1^2 + \partial_1\partial_2 + \partial_2^2,$$
$$g_2:\quad \Delta_\omega = \partial_1^2 - 3\partial_1\partial_2 + 3\partial_2^2, \qquad \Delta_{\check\alpha} = 2\partial_1^2 + 2\partial_1\partial_2 + \tfrac23\partial_2^2.$$
since $\Delta e^{2\pi i\langle\lambda|x\rangle} = -4\pi^2\langle\lambda|\lambda\rangle e^{2\pi i\langle\lambda|x\rangle}$, we have
$$\Delta\psi_\lambda(x) = -4\pi^2\langle\lambda|\lambda\rangle\psi_\lambda(x),$$
where $\psi_\lambda(x)$ is one of the functions c, s, ss or sl. the inner product of the weights equals
$$c_2:\ \langle\lambda|\lambda\rangle = \tfrac12a^2 + ab + b^2, \qquad g_2:\ \langle\lambda|\lambda\rangle = 2a^2 + 2ab + \tfrac23b^2.$$

4.1. separation of variables for the helmholtz equation

using the standard method of separation of variables for the helmholtz equation $\Delta\psi(x) = -w^2\psi(x)$, $x = (y_1,y_2)_e$, and searching for solutions in the form $\psi(x) = x(y_1)y(y_2)$, we obtain the differential equation
$$x''y + xy'' + w^2xy = 0.$$
introducing the separation constant $-k^2$, we get a pair of ordinary differential equations that are easy to solve:
$$x'' + k^2x = 0, \qquad y'' + (w^2-k^2)y = 0. \qquad (5)$$
a basic solution of (5) can be written in the form
$$x_1(y_1) = \cos(ky_1), \quad y_1(y_2) = \cos(\sqrt{w^2-k^2}\,y_2), \qquad x_2(y_1) = \sin(ky_1), \quad y_2(y_2) = \sin(\sqrt{w^2-k^2}\,y_2),$$
where $k \neq 0$ and $w^2-k^2 \neq 0$. in accordance with the assumptions $k \neq 0$ and $w^2 \neq k^2$, we consider the c-, s-, ss- and sl-functions only with positive weights.

4.2. c2 case

from the projection matrix $p_{c_2}$ (1) and the branching rule (2) we find two separation constants $-k_1^2$ and $-k_2^2$, which have the form
$$-k_1^2 = -2(a+b)^2\pi^2, \quad w^2-k_1^2 = 2b^2\pi^2, \qquad -k_2^2 = -2b^2\pi^2, \quad w^2-k_2^2 = 2(a+b)^2\pi^2.$$
noting that $k_1^2 = w^2-k_2^2$, we take as the separation constant
$$-k^2 = -2(a+b)^2\pi^2, \qquad w^2-k^2 = 2b^2\pi^2.$$
using the branching rule (2) from section 2, the special functions c, s, ss, sl can be rewritten in the form
$$c_{a,b}(x) = 4\big[\cos(ky_1)\cos(\sqrt{w^2-k^2}\,y_2) + \cos(\sqrt{w^2-k^2}\,y_1)\cos(ky_2)\big],$$
$$s_{a,b}(x) = 4\big[\sin(\sqrt{w^2-k^2}\,y_1)\sin(ky_2) - \sin(ky_1)\sin(\sqrt{w^2-k^2}\,y_2)\big],$$
$$s^s_{a,b}(x) = 4\big[\cos(ky_1)\cos(\sqrt{w^2-k^2}\,y_2) - \cos(\sqrt{w^2-k^2}\,y_1)\cos(ky_2)\big],$$
$$s^l_{a,b}(x) = -4\big[\sin(\sqrt{w^2-k^2}\,y_1)\sin(ky_2) + \sin(ky_1)\sin(\sqrt{w^2-k^2}\,y_2)\big].$$
by the change of variables $y_1 = \sqrt2\,x$, $y_2 = \sqrt2\,y$ we get the reduction to the a1 × a1 subgroup:
$$c_{a,b}(x) = c_{a+b}(x)c_b(y) + c_b(x)c_{a+b}(y), \qquad s_{a,b}(x) = s_{a+b}(x)s_b(y) - s_b(x)s_{a+b}(y),$$
$$s^s_{a,b}(x) = s_{a+b}(x)s_b(y) + s_b(x)s_{a+b}(y), \qquad s^l_{a,b}(x) = c_{a+b}(x)c_b(y) - c_b(x)c_{a+b}(y). \qquad (6)$$
the functions $c_\mu(x)$ and $s_\mu(x)$ on the right-hand side of (6) are defined in the appendix. the coordinates $(x,y) \in a_1\times a_1$ are written in the α-basis.
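as a sanity check of remark 3, the following sympy sketch applies the α̌-basis laplacian of c2 to the explicit c-function of section 3.1 for the illustrative weight (a, b) = (1, 3); any admissible weight would do:

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
a, b = 1, 3   # illustrative weight
# the four cosine terms of the c2 c-function, coefficients in alpha-check coordinates
pairs = [(a + 2*b, -(a + b)), (a + 2*b, -b), (-a, a + b), (a, b)]
c = 2*sum(sp.cos(2*sp.pi*(p*x1 + q*x2)) for p, q in pairs)

# laplacian in the alpha-check basis: (1/2) d1^2 + d1 d2 + d2^2
lap = (sp.Rational(1, 2)*sp.diff(c, x1, 2) + sp.diff(c, x1, 1, x2, 1)
       + sp.diff(c, x2, 2))
lam = sp.Rational(1, 2)*a**2 + a*b + b**2    # <lambda|lambda> for c2
print(sp.simplify(lap + 4*sp.pi**2*lam*c))   # 0, confirming the eigenvalue
```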
4.3. g2 case

from the projection matrix $p_{g_2}$ (3) and the branching rule (4) we find three separation constants $-k_1^2$, $-k_2^2$ and $-k_3^2$, which have the form
$$-k_1^2 = -2(2a+b)^2\pi^2, \quad w^2-k_1^2 = \tfrac23b^2\pi^2, \qquad -k_2^2 = -2(a+b)^2\pi^2, \quad w^2-k_2^2 = \tfrac23(3a+b)^2\pi^2, \qquad -k_3^2 = -2a^2\pi^2, \quad w^2-k_3^2 = \tfrac23(3a+2b)^2\pi^2. \qquad (7)$$
using the branching rule (4) from section 2, the special functions c, s, ss, sl can be rewritten in the form
$$c_{a,b}(x) = 4\big[\cos(k_1y_1)\cos(\sqrt{w^2-k_1^2}\,y_2) + \cos(k_2y_1)\cos(\sqrt{w^2-k_2^2}\,y_2) + \cos(k_3y_1)\cos(\sqrt{w^2-k_3^2}\,y_2)\big],$$
$$s_{a,b}(x) = 4\big[-\sin(k_1y_1)\sin(\sqrt{w^2-k_1^2}\,y_2) + \sin(k_2y_1)\sin(\sqrt{w^2-k_2^2}\,y_2) - \sin(k_3y_1)\sin(\sqrt{w^2-k_3^2}\,y_2)\big],$$
$$s^s_{a,b}(x) = 4i\big[-\cos(k_1y_1)\sin(\sqrt{w^2-k_1^2}\,y_2) - \cos(k_2y_1)\sin(\sqrt{w^2-k_2^2}\,y_2) + \cos(k_3y_1)\sin(\sqrt{w^2-k_3^2}\,y_2)\big],$$
$$s^l_{a,b}(x) = 4i\big[-\sin(k_1y_1)\cos(\sqrt{w^2-k_1^2}\,y_2) + \sin(k_2y_1)\cos(\sqrt{w^2-k_2^2}\,y_2) + \sin(k_3y_1)\cos(\sqrt{w^2-k_3^2}\,y_2)\big].$$
by the change of variables $y_1 = \sqrt2\,x$, $y_2 = \sqrt6\,y$ we get the reduction to the a1 × a1 subgroup:
$$c_{a,b}(x) = c_a(x)c_{3a+2b}(y) + c_{a+b}(x)c_{3a+b}(y) + c_{2a+b}(x)c_b(y),$$
$$s_{a,b}(x) = s_a(x)s_{3a+2b}(y) - s_{a+b}(x)s_{3a+b}(y) + s_{2a+b}(x)s_b(y),$$
$$s^s_{a,b}(x) = c_a(x)s_{3a+2b}(y) - c_{a+b}(x)s_{3a+b}(y) - c_{2a+b}(x)s_b(y),$$
$$s^l_{a,b}(x) = s_a(x)c_{3a+2b}(y) - s_{a+b}(x)c_{3a+b}(y) + s_{2a+b}(x)c_b(y). \qquad (8)$$
the functions $c_\mu(x)$ and $s_\mu(x)$ on the right-hand side of (8) are defined in the appendix. the coordinates $(x,y) \in a_1\times a_1$ are written in the α-basis.

table 1: behaviour of the functions c, s, ss and sl on the boundary ∂f for the c2 and g2 groups; ∗ denotes a function that is not identically 0. the columns refer to the short (s) and long (l) sides of ∂f under the dirichlet (d) and neumann (n) conditions; the hybrid functions ss and sl realize the mixed condition (m).

               (d)         (n)
   c2, g2     s    l      s    l
   c_{a,b}    ∗    ∗      0    0
   s_{a,b}    0    0      ∗    ∗
   ss_{a,b}   0    ∗      ∗    0
   sl_{a,b}   ∗    0      0    ∗

proposition 1. the functions $c_\mu(x)$ and $s_\mu(x)$ appearing in (8) fulfill the following relationships:
$$-k_3s_{3a+2b}(x)c_a(x) + k_2s_{3a+b}(x)c_{a+b}(x) - k_1s_b(x)c_{2a+b}(x) = \sqrt3\Big(\sqrt{w^2-k_3^2}\,c_{3a+2b}(x)s_a(x) - \sqrt{w^2-k_2^2}\,c_{3a+b}(x)s_{a+b}(x) - \sqrt{w^2-k_1^2}\,c_b(x)s_{2a+b}(x)\Big),$$
$$-k_3s_{3a+2b}(x)s_a(x) + k_2s_{3a+b}(x)s_{a+b}(x) + k_1s_b(x)s_{2a+b}(x) = \sqrt3\Big(\sqrt{w^2-k_3^2}\,c_{3a+2b}(x)c_a(x) - \sqrt{w^2-k_2^2}\,c_{3a+b}(x)c_{a+b}(x) - \sqrt{w^2-k_1^2}\,c_b(x)c_{2a+b}(x)\Big),$$
where $k_i$ and $\sqrt{w^2-k_i^2}$, $i = 1, 2, 3$, are defined by (7).
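the reduction (8) can be verified numerically. the sketch below evaluates the explicit g2 c-function of section 3.1 at a test point given in α̌-coordinates and compares it with the separated cosine-product form; the conversion to cartesian coordinates is computed here from the α-basis quoted in section 2 via the co-root vectors, which is an assumption of this sketch rather than a formula stated in the paper:

```python
import numpy as np

a, b = 2, 1
x1, x2 = 0.137, 0.294                  # an arbitrary test point (alpha-check coords)

# explicit g2 c-function of section 3.1
pairs = [(2*a + b, -(3*a + 2*b)), (a, b), (a + b, -b),
         (a, -(3*a + b)), (2*a + b, -(3*a + b)), (a + b, -(3*a + 2*b))]
c_explicit = 2*sum(np.cos(2*np.pi*(p*x1 + q*x2)) for p, q in pairs)

# cartesian coordinates: x = x1*alpha_check_1 + x2*alpha_check_2 for g2
y1 = np.sqrt(2)*x1 - 3/np.sqrt(2)*x2
y2 = 3/np.sqrt(6)*x2

k = np.pi*np.sqrt(2)*np.array([2*a + b, a + b, a])            # k1, k2, k3 from (7)
wk = np.pi*np.sqrt(2/3)*np.array([b, 3*a + b, 3*a + 2*b])     # sqrt(w^2 - k_i^2)
c_separated = 4*sum(np.cos(k[i]*y1)*np.cos(wk[i]*y2) for i in range(3))

print(np.isclose(c_explicit, c_separated))   # True
```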
5. types of boundary conditions

in this paper we consider three types of boundary conditions.

(1.) the first type, called a dirichlet boundary condition, defines the value of the function itself:
$$\psi(x) = f(x), \quad x \in \partial f, \qquad (d)$$
where $f(x)$ is a given function defined on the boundary.

(2.) the second type, called a neumann boundary condition, defines the value of the normal derivative of the function:
$$\frac{\partial\psi}{\partial n}(x) = f(x), \quad x \in \partial f, \qquad (n)$$
where $n$ denotes the normal to the boundary $\partial f$.

(3.) the third type, called a mixed boundary condition, defines the value of the function itself on one part of the boundary and the value of the normal derivative on the other part:
$$\begin{cases} \psi(x) = f_0(x) & \text{for } x \in \partial f_0,\\[1mm] \dfrac{\partial\psi}{\partial n}(x) = f_1(x) & \text{for } x \in \partial f_1, \end{cases} \qquad (m)$$
where $\partial f = \partial f_0 \cup \partial f_1$ and $f_0(x)$, $f_1(x)$ are given functions defined on the appropriate parts of the boundary.

remark 4 [12]. for the dirichlet boundary condition all eigenvalues are positive; for the neumann boundary condition all eigenvalues are non-negative.

in table 1 we present how the four types of functions defined in section 3 behave on the boundary ∂f of the fundamental region f. the fundamental region f for the c2 and g2 groups is presented in figure 2; the symbol s corresponds to the reflection orthogonal to the short root, and l to the reflection orthogonal to the long root.

figure 2: fundamental regions of the c2 and g2 groups. the sides are marked by the symbols s and l, which correspond to the reflections orthogonal to the short and long roots, respectively.

5.1. c2 case

the normal vectors to the fundamental region f of the weyl group c2 are the following:
$$n_1 = (0,-1)_e, \qquad n_2 = \left(-\tfrac{1}{\sqrt2},\, \tfrac{1}{\sqrt2}\right)_e, \qquad n_3 = (1,0)_e.$$
in figure 3 we present the fundamental region f with the indicated boundaries and the corresponding normal vectors.

figure 3: the normal vectors and boundaries indicated for the weyl group c2.

the values of the four families of special functions c, s, ss and sl satisfying the dirichlet boundary condition (d) on the boundary ∂f of the fundamental region f are presented in tables 2, 4 and 5. tables 3–5 present the values of the functions satisfying the neumann boundary condition (n). examples of the functions and their behaviour on the boundary ∂f are presented in figures 4 and 5.

figure 4: the contour plot of $c_{1,3}(x)$ and $s_{1,3}(x)$. the triangle denotes the fundamental domain f of the affine weyl group c2.

figure 5: the contour plot of $s^l_{1,3}(x)$ and $s^s_{1,3}(x)$. the triangle denotes the fundamental domain f of the affine weyl group c2.

5.2. g2 case

the normal vectors to the fundamental region f of the weyl group g2 are the following:
$$n_1 = (-1,0)_e, \qquad n_2 = \left(\tfrac{\sqrt3}{2},\, -\tfrac12\right)_e, \qquad n_3 = \left(\tfrac12,\, \tfrac{\sqrt3}{2}\right)_e.$$
in figure 6 we present the fundamental region f with the indicated boundaries and the corresponding normal vectors.

figure 6: the normal vectors and boundaries indicated for the weyl group g2.

figure 7: the shaded square represents the fundamental region f of a1 × a1.

figure 8: the boundaries of the fundamental region f of a1 × a1 are indicated.

the values of the functions for the group g2 fulfilling the dirichlet boundary condition (d) on the boundary of the fundamental region f are presented in tables 6, 8 and 9. the values of the functions satisfying the neumann boundary condition (n) are given in tables 7–9. examples of the functions and their behaviour on the boundary ∂f are presented in figures 9 and 10.

table 2: values of the $c_{a,b}(x)$ function at the boundary of the fundamental region f in the c2 case (dirichlet condition).
  f1: $2c_{a+b}(x) + 2c_b(x)$
  f2: $2c_{a+b}(x)c_b(x)$
  f3: $c_{a+b}(1)c_b(y) + c_b(1)c_{a+b}(y)$
table 3: values of the $s_{a,b}(x)$ function at the boundary of the fundamental region f in the c2 case (neumann condition).
  f1: $2i\big(k s_b(x) - \sqrt{w^2-k^2}\,s_{a+b}(x)\big)$
  f2: $2i\big(\sqrt{w^2-k^2}\,s_{a+b}(x)c_b(x) - k s_b(x)c_{a+b}(x)\big)$
  f3: $i\big(\sqrt{w^2-k^2}\,c_{a+b}(1)s_b(y) - k c_b(1)s_{a+b}(y)\big)$

table 4: values of the $s^s_{a,b}(x)$ function at the boundary of the fundamental region f in the c2 case (mixed condition).
  f1 (dirichlet): $2(c_{a+b}(x) - c_b(x))$; (neumann): 0
  f2 (dirichlet): 0; (neumann): $2i\big(-\sqrt{w^2-k^2}\,s_{a+b}(x)c_b(x) + k s_b(x)c_{a+b}(x)\big)$
  f3 (dirichlet): $c_{a+b}(1)c_b(y) - c_b(1)c_{a+b}(y)$; (neumann): 0

table 5: values of the $s^l_{a,b}(x)$ function at the boundary of the fundamental region f in the c2 case (mixed condition).
  f1 (dirichlet): 0; (neumann): $2i\big(-\sqrt{w^2-k^2}\,s_b(x) - k s_{a+b}(x)\big)$
  f2 (dirichlet): $2s_{a+b}(x)s_b(x)$; (neumann): 0
  f3 (dirichlet): 0; (neumann): $i\big(\sqrt{w^2-k^2}\,c_{a+b}(1)s_b(y) + k c_b(1)s_{a+b}(y)\big)$

table 6: values of the $c_{a,b}(x)$ function at the boundary of the fundamental region f in the g2 case (dirichlet condition).
  f1: $2(c_b(y) + c_{3a+b}(y) + c_{3a+2b}(y))$
  f2: $c_{2a+b}(x)c_b(x) + c_{a+b}(x)c_{3a+b}(x) + c_a(x)c_{3a+2b}(x)$
  f3: $c_{2a+b}(x)c_b(x-1) + c_{a+b}(x)c_{3a+b}(x-1) + c_a(x)c_{3a+2b}(x-1)$

table 7: values of the $s_{a,b}(x)$ function at the boundary of the fundamental region f in the g2 case (neumann condition).
  f1: $2i(-k_3s_{3a+2b}(y) + k_2s_{3a+b}(y) - k_1s_b(y))$
  f2: $-2i\big(\sqrt{w^2-k_3^2}\,c_{3a+2b}(x)s_a(x) - \sqrt{w^2-k_2^2}\,c_{3a+b}(x)s_{a+b}(x) + \sqrt{w^2-k_1^2}\,c_b(x)s_{2a+b}(x)\big)$
  f3: $2i(k_3s_{3a+2b}(x-1)c_a(x) - k_2s_{3a+b}(x-1)c_{a+b}(x) + k_1s_b(x-1)c_{2a+b}(x))$

table 8: values of the $s^s_{a,b}(x)$ function at the boundary of the fundamental region f in the g2 case (mixed condition).
  f1 (dirichlet): $2(-s_b(x) - s_{3a+b}(x) + s_{3a+2b}(x))$; (neumann): 0
  f2 (dirichlet): 0; (neumann): $2i\big(\sqrt{w^2-k_3^2}\,c_{3a+2b}(x)c_a(x) + \sqrt{w^2-k_2^2}\,c_{3a+b}(x)c_{a+b}(x) + \sqrt{w^2-k_1^2}\,c_b(x)c_{2a+b}(x)\big)$
  f3 (dirichlet): $-c_{2a+b}(x)s_b(x-1) - c_{a+b}(x)s_{3a+b}(x-1) + c_a(x)s_{3a+2b}(x-1)$; (neumann): 0

table 9: values of the $s^l_{a,b}(x)$ function at the boundary of the fundamental region f in the g2 case (mixed condition).
  f1 (dirichlet): 0; (neumann): $2i(-k_3c_{3a+2b}(y) - k_2c_{3a+b}(y) + k_1c_b(y))$
  f2 (dirichlet): $-s_{2a+b}(x)c_b(x) + s_{a+b}(x)c_{3a+b}(x) + s_a(x)c_{3a+2b}(x)$; (neumann): 0
  f3 (dirichlet): 0; (neumann): $2i(k_3c_{3a+2b}(x-1)c_a(x) + k_2c_{3a+b}(x-1)c_{a+b}(x) - k_1c_b(x-1)c_{2a+b}(x))$
acta polytechnica 61(4):537–551, 2021

detailed modelling and analysis of digital mho distance relay with single-pole operation

paulo henrique barbosa de souza pinheiro, mayara helena moreira nogueira dos santos, angelo cesar colombini, bruno wanderley frança, marcio zamboti fortes∗

fluminense federal university, engineering school, electrical engineering department, 24210-240 niteroi, rj, brazil
∗ corresponding author: mzamboti@id.uff.br

abstract. this paper introduces a methodology for modelling a digital admittance-type distance relay using pscad/emtdc. the proposed distance relay was tested in a simulation of the brazilian power grid with predetermined fault scenarios. the goal of this paper is to make a detailed evaluation of the mho distance relay. the main aspects include the correct operation of the distance relay, the effects of fault resistance on the mho characteristic, and the fault detection time of this relay. a new approach to analysing the fault detection time is presented, considering several simulated fault scenarios. the results demonstrate that the fault resistance influences the fault detection time and severely affects the distance relay's general performance.
the fault detection time is not constant: it varies within a time interval, considering different fault types, fault locations and fault resistances. the confidence interval calculation provides a detailed range of the fault detection time, considering its upper and lower limits.

keywords: electrical engineering, power system protection, pscad/emtdc, time-domain analysis.

1. introduction

power system protection is a wide area of study, with many challenges regarding electrical equipment safety and power system reliability. a well-designed protection scheme must guarantee these two main aspects and remove any undesired conditions that might happen. for high-voltage (hv) and extra-high-voltage (ehv) systems, transmission line protection brings many significant improvements, mainly when it includes distance relay modelling.

distance relays protect transmission lines. they respond to the impedance between the relay and the fault location [1]. the basic principle of distance protection is the measurement of the impedance to the fault location in the transmission line. instrument transformers provide voltage and current signals at the relay location; through these signals, it is possible to estimate the fault location and to determine which type of fault occurred.

the increasing insertion of digital distance relays in electrical power systems has led to several works regarding their modelling. the goals of these works are, in general, to make an accurate evaluation of their algorithms and to propose new ones that solve common problems of distance protection. some common problems related to distance protection can be found in [2–5].

the authors in [2] present a methodology to model a distance relay using a real-time digital simulator (rtds). they highlight every step needed to model the digital distance relay, in the following order:
• signal conditioning unit;
• data acquisition unit;
• data processing unit.
following this sequence, it is possible to model a complete digital relay, beginning with the voltage and current measurements and going up to the command signals that trip the circuit-breakers. this reference considers two different operational characteristics for the distance relay: mho and quadrilateral. the most important reported problems are fault detection and fault classification. this reference applies the three-pole operation scheme for the distance relay model: for every possible fault type in a three-phase system, the relay opens the circuit-breakers of all three phases of the system.

another problem of great importance in distance protection is the fault resistance during a short-circuit, which can lead the distance relay to operate incorrectly. an extension of an existing ground distance relay algorithm to include phase distance relays is presented in [3]. the algorithm uses a fault estimation process to improve the efficiency of the distance protection. the results show that the fault resistance compensation algorithm is suitable for online applications. it is relevant to highlight that conventional distance relays might have limitations in detecting and classifying faults with a high impedance; one of this paper's contributions is to demonstrate this.
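the impedance-measurement principle described above can be condensed into a few lines; all numerical values in this sketch are illustrative assumptions, not parameters of the study:

```python
import numpy as np

# phasors measured at the relay location (illustrative values)
v = 63.5e3 * np.exp(1j * np.deg2rad(-5.0))     # phase voltage phasor [V]
i = 1.2e3 * np.exp(1j * np.deg2rad(-80.0))     # fault current phasor [A]
z_apparent = v / i                             # impedance seen by the relay [ohm]

z_per_km = 0.03 + 0.4j                         # assumed line impedance [ohm/km]
# a common rough estimate uses the reactive part, which is less sensitive
# to fault resistance than the full impedance magnitude
distance_km = z_apparent.imag / z_per_km.imag
print(f"apparent impedance: {abs(z_apparent):.1f} ohm, "
      f"estimated distance to fault: {distance_km:.1f} km")
```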
in [4], the authors investigate the effects of new facts (flexible ac transmission system) installations on existing distance relays. the effect of the presence of a upfc (unified power flow controller) was evaluated; it is a controller that can simultaneously support the bus voltage and control the flow of power. it is reported that a conventional distance relay calculates the distance to the fault location based on the bus voltage and current; if this distance is within a protection zone, the relay clears the fault. when the upfc is connected to the system, the impedance seen by the distance relay must include the impedance of the device. in addition, the upfc has a by-pass circuit-breaker to isolate it during fault conditions. several short circuits were simulated in the power system considered. the main conclusions are that, during steady-state operation, the distance relay operates correctly, and that during most fault conditions the by-pass circuit-breaker isolates the device, so it does not affect the correct operation of the protection relays.

reactive compensation devices must be considered in distance protection because they can affect the impedance measurement made by the distance relay. problems related to facts devices include overreaching of the distance protection [6]; in the presence of faults with impedance, there is also the possibility of underreaching [7].

beyond the aforementioned problems, another critical condition that might happen in electrical power systems and can severely affect distance protection is a power swing. power swings originate from the rotor angle oscillations of synchronous generators, and they can be harmful to distance protection: they might cause the impedance trajectory to encroach on the relay protection zones [5].

unlike electrical networks whose power generation comes mostly from synchronous generators, distance protection using conventional mho and quadrilateral characteristics is affected by renewable energy sources connected to the transmission network through electronic power converters. the adaptive mho characteristic presented in [8] is suitable for solving problems of this type. the topology changes in electrical systems highlight the need to study the existing protection algorithms and identify their limitations; then it is possible to propose new solutions that suit the new scenarios.

there are many types of distance relays, from electromechanical ones to the latest computer relays. the conventional types of distance relays found in the literature (e.g., [1] and [9]) are:
• impedance-type distance relay;
• modified type distance relay;
• reactance type distance relay;
• admittance type distance relay;
• quadrilateral type distance relay.
the last characteristic can be found only in solid-state or computer relays, while the others exist in both electromechanical and digital relays. the two most commonly applied types of distance protection elements are the admittance type and the quadrilateral type [10].

this paper aims to describe the process of modelling a digital admittance (mho) distance relay and to make a detailed evaluation of its behaviour during several simulated fault scenarios. this evaluation focuses mainly on the correct operation of the relay, the effects of fault impedances on the distance protection, and the fault detection time of the proposed distance relay. a novel approach to the fault detection time is presented: it consists of a confidence interval calculation that provides a range of plausible values of the fault detection time, including its upper and lower limits.
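the confidence-interval idea can be sketched as follows; the detection times below are synthetic stand-ins for times that would be measured over many simulated fault scenarios:

```python
import numpy as np
from scipy import stats

# synthetic fault detection times [ms], one per simulated fault scenario
times_ms = np.array([8.1, 9.4, 7.8, 12.3, 10.9, 9.0, 11.7, 8.6, 10.2, 9.8])

mean = times_ms.mean()
sem = stats.sem(times_ms)                      # standard error of the mean
# student-t interval, appropriate for a small sample
lo, hi = stats.t.interval(0.95, len(times_ms) - 1, loc=mean, scale=sem)
print(f"mean detection time: {mean:.2f} ms, 95% ci: [{lo:.2f}, {hi:.2f}] ms")
```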
the distance relay modelling and its evaluation were carried out using the program pscad/emtdc (power system computer-aided design/electromagnetic transients including direct current). the model of a transmission line belonging to the brazilian power grid is used to evaluate the distance relay's performance. the paper is structured as follows: section 2 presents related works regarding the modelling of distance relays, section 3 depicts the proposed distance relay, section 4 provides more details about the studied scenario and presents the simulation results, and section 5 presents the conclusions.

2. related works

this section presents related works about the modelling of digital distance relays. some of them are used as guides for the development of the proposed model and for the evaluation of some parameters concerning distance protection. an mho distance relay is proposed in [11], using the emtpworks package. this implementation is suitable considering the nonlinear nature of an arcing fault, which requires a time-domain implementation of the distance relay. hence, it is possible to determine whether the fault will be detected or not. the authors modelled the distance relay only for ground faults; due to this, it is not possible to evaluate phase faults. the distance relay presented in this paper also implements the mho characteristic. furthermore, it includes the possibility to test all fault types in its algorithm. several simulated fault scenarios are considered to check the algorithm's general performance. these fault scenarios also allow verifying the limitations of the distance relay using the transient analysis program pscad/emtdc. a quadrilateral characteristic of distance protection is presented and tested in [12]. the performance of this quadrilateral distance relay is evaluated using the model of an electrical power system located in india. ground and phase faults are simulated, varying many different conditions in the power system. these conditions are the fault resistance, the power transfer angle, and the line charging. the proposed distance relay uses the following sequence in its structure:
• anti-aliasing filter
• fast fourier transform
• sequence filter
• impedance calculation
• fault detection and classification
• trip signal
following this sequence, it is possible to extract the fundamental components of the voltage and current signals during a fault. these signals allow an accurate estimation of the impedance to the fault location and the correct sending of the trip signal to the circuit-breakers. it is possible to notice, in this reference, that the quadrilateral characteristic is suitable to cover faults with a high impedance, and that varying the power transfer angle affects the impedance calculation. some of the steps adopted in [12] are applied to the mho distance relay model proposed in this paper (e.g., the anti-aliasing filter and the fast fourier transform). further details are given in section 3. an approach to distance protection using the s-transform is presented in [13]. it is an invertible time-frequency localization technique that combines elements of the wavelet transform and the short-time fourier transform. the s-transform was applied to the voltage and current signals to generate the s-matrix. the change of energy is applied in the fault detection process, and the estimated phasors are used for the impedance calculation.
applying this technique makes it possible to detect faults within less than half a cycle of the fundamental frequency. the total time of detecting the fault and sending the trip signal, for the proposed scheme, is about 20 samples (one cycle based on the fundamental frequency of 50 hz). in this paper, the confidence interval calculation of the fault detection time is proposed as an alternative method to treat this parameter. different fault scenarios influence the fault detection time. with this methodology, it is possible to obtain an accurate range of these values, considering the best and worst scenarios. these scenarios were elaborated considering different fault resistances, different fault locations, and different fault types. the last parameter evaluated in this paper is the correct operation of the distance relay facing these different scenarios. after this review of related works about distance relay modelling, section 3 will depict the process of modelling the proposed mho distance relay.

3. digital mho distance relay

in this section, the algorithm of the mho distance relay implemented in the program pscad/emtdc is presented. this transient analysis software is used by engineers, researchers, and students from utilities, manufacturers, consultants, and research and academic institutes [14]. the digital distance relay modelled in this paper uses the admittance characteristic in its protection zones; it is also known as the mho distance relay. the admittance characteristic is a circular tripping characteristic. it can pass through the origin of the r-x diagram or might be offset from it [15]. the mho distance relay whose circular characteristic passes through the origin is the self-polarized mho distance relay. this is the characteristic applied in this paper. the self-polarized mho distance relay decides to trip the circuit-breakers if the measured impedance is located within its protection zone. this study case considered two protection zones, zone 1 and zone 2. protection zone 1 has an instantaneous action characteristic and underreaches the remote bus of the transmission line. protection zone 2, however, has a time delay before sending the trip signal to the circuit-breakers of the protected line, and it overreaches the remote bus. this time delay is used for protection coordination. in general, a protection zone 3 is also used. this third zone has to be set so as to protect the longest adjacent line and to reach 20 % beyond that line, providing a backup to the remote circuit-breakers [16]. this paper did not consider external faults; therefore, the third zone was not implemented in this study. the first step of the distance relay modelling is to acquire the voltage and current signals from the instrument transformers. these analog signals must be passed through anti-aliasing filters to limit the effects of noise and unwanted higher-frequency components on the sampled data [17]. these higher frequencies can emerge mainly due to travelling waves during transient conditions in the transmission line. the anti-aliasing filter can be understood as a prefiltering stage. it is used, in general, if the nyquist frequency of the input is too high or if the signal is not bandlimited. even if the input signal is naturally bandlimited, wideband noise may exist in the higher frequency range. during the sampling of this input signal, these noise components would be aliased into the low-frequency band.
to avoid this phenomenon, the input signal is forced to be bandlimited to frequencies below one half of the desired sampling rate using a low pass filter (i.e., an anti-aliasing filter); [18] presents more details about the subject. in this paper, the function shown in (1) simulates an analog, low pass, anti-aliasing filter consisting of a state variable formulation with trapezoidal integration [19].

$y(t) = \mathcal{L}^{-1}\left\{\dfrac{X(s)}{a_0 + a_1\left(\frac{s}{\omega}\right) + a_2\left(\frac{s}{\omega}\right)^2 + a_3\left(\frac{s}{\omega}\right)^3 + a_4\left(\frac{s}{\omega}\right)^4 + a_5\left(\frac{s}{\omega}\right)^5 + a_6\left(\frac{s}{\omega}\right)^6}\right\}$ (1)

where:
s – laplace variable
ω = 2 · π · f_g (rad/s)
X(s) – input in the laplace domain
$\mathcal{L}^{-1}$ – inverse laplace transform

the values of the coefficients in (1) are f_g = 0.421875 × sampling frequency; a_0 = 1.0; a_1 = 3.8637; a_2 = 7.464; a_3 = 9.1415; a_4 = 7.464; a_5 = 3.8637; a_6 = 1.0.

after the anti-aliasing filter stage, the input signals are sampled at a frequency greater than double the highest harmonic frequency of interest, according to the nyquist criterion [20]. therefore, in this case, as the distance relay is interested only in the fundamental frequency, the 7th harmonic is chosen as the highest component of interest. the data sampling frequency is 16 samples/cycle. these data are then written into a buffer, and the harmonic computations are based on the fast fourier transform (fft) technique [21]. the fft computation is detailed in (2) [22, 23]. the fft results are in the format given by (3), the magnitude computation uses (4), and the phase angle computation uses (5).

$X(k) = \sum_{n=0}^{N/2-1} x(2n)\, W_{N/2}^{nk} + W_N^{k} \sum_{n=0}^{N/2-1} x(2n+1)\, W_{N/2}^{nk}$ (2)

where:
X(k) – output signal vector
x(n) – input signal vector, decomposed into x(2n) and x(2n + 1)
N – number of samples of the input signal
k – harmonic frequency index
$W_N = e^{-j2\pi/N}$

$X(k) = \mathrm{Real}[X(k)] + j\,\mathrm{Imag}[X(k)]$ (3)

$\mathrm{Mag}(k) = \sqrt{\mathrm{Real}[X(k)]^2 + \mathrm{Imag}[X(k)]^2}$ (4)

$\mathrm{Phase}(k) = \tan^{-1}\left(\dfrac{\mathrm{Imag}[X(k)]}{\mathrm{Real}[X(k)]}\right)$ (5)

where:
Real[X(k)] – real portion of X(k)
Imag[X(k)] – imaginary portion of X(k)

these steps of anti-aliasing filtering, data sampling, and harmonic and phase angle computations, depicted in figure 1, are used to estimate the fundamental phasors of the voltage and current signals. they are used to calculate the impedance of the fault loops in the relay.

figure 1. signal conditioning unit.

after this phasor estimation step, the impedances are calculated, for both phase and ground elements, covering all types of faults that might happen in a three-phase system. they are the following:
• phase-ground faults
• phase-phase faults
• phase-phase-ground faults
• three-phase faults
• three-phase-ground faults
the impedance computation of phase elements uses (6), and the impedance of ground elements uses (7) [24].

$z_1 = \dfrac{v_{\phi 1} - v_{\phi 2}}{i_{\phi 1} - i_{\phi 2}}\ (\Omega)$ (6)

where:
z_1 – positive-sequence impedance (Ω)
v_{φ1} – phase 1 voltage (v)
v_{φ2} – phase 2 voltage (v)
i_{φ1} – phase 1 current (a)
i_{φ2} – phase 2 current (a)

$z_1 = \dfrac{v_{\phi}}{i_{\phi} - k\,i_0}\ (\Omega)$ (7)

where:
v_φ – phase voltage (v)
i_φ – phase current (a)
k – zero-sequence compensation factor
i_0 – zero-sequence current (a)

as can be noticed in (7), there are two additional variables: the zero-sequence compensation factor k and the zero-sequence current i_0. their computation uses (8) and (9), respectively.

$k = \dfrac{z_0 - z_1}{z_1}$ (8)

where:
z_0 – zero-sequence impedance of the protected line (Ω)
z_1 – positive-sequence impedance of the protected line (Ω)

$i_0 = \tfrac{1}{3}\,(i_a + i_b + i_c)\ (a)$ (9)

where:
i_a, i_b, i_c – phase currents (a)

the zero- and positive-sequence impedances used to calculate k in (8) will be given in section 4.
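a minimal numerical sketch of this chain in python may help fix ideas. it assumes a plain full-cycle discrete fourier transform in place of the pscad fft block and omits the analog anti-aliasing stage; the function names and the synthetic one-cycle record are illustrative only, while the line impedances in the usage example are the secondary values quoted later in table 4:

```python
import numpy as np

def fundamental_phasor(window):
    """estimate the fundamental-frequency phasor from exactly one cycle
    of samples (n = 16 in the paper); the k = 1 dft bin, scaled by 2/n,
    gives the peak value of the fundamental as a complex phasor."""
    n = len(window)
    x1 = np.sum(window * np.exp(-2j * np.pi * np.arange(n) / n))
    return 2.0 * x1 / n

def ground_loop_impedance(v_ph, i_ph, i_a, i_b, i_c, z1_line, z0_line):
    """ground-element impedance following (7)-(9)."""
    k = (z0_line - z1_line) / z1_line           # eq. (8)
    i0 = (i_a + i_b + i_c) / 3.0                # eq. (9)
    return v_ph / (i_ph - k * i0)               # eq. (7), sign as printed

# synthetic one-cycle record of a phase a-to-ground fault (peak units)
n = 16
t = np.arange(n) / n
va = fundamental_phasor(100.0 * np.cos(2 * np.pi * t - 0.2))
ia = fundamental_phasor(4.0 * np.cos(2 * np.pi * t - 1.3))
z1 = 44.55 * np.exp(1j * np.deg2rad(86.54))     # table 4, secondary ohms
z0 = 182.55 * np.exp(1j * np.deg2rad(71.29))
z = ground_loop_impedance(va, ia, ia, 0.0, 0.0, z1, z0)
print(abs(z), np.angle(z, deg=True))
```

the same phasor routine serves the phase elements of (6), since those only require the difference of two voltage and two current phasors.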
with z_1 being the positive-sequence impedance, a complex number following the format given in (10), the trip decision of the mho distance relay has the following conditions [25, 26]: first, it calculates the difference between the vectors z_set and z_1, denoted by (11), and then it evaluates whether the angle between dz and z_1 is within a specific range. the flowchart of the trip decision is depicted in figure 2.

$z_1 = r + jx\ (\Omega)$ (10)

where:
r – positive-sequence resistance (Ω)
x – positive-sequence reactance (Ω)

$dz = z_{set} - z_1\ (\Omega)$ (11)

where:
z_set – protection zone's reach setting (Ω)

table 1 depicts the reach settings of protection zone 1 and protection zone 2 used in this study case. these settings are based on the positive-sequence impedance of the transmission line model considered in this paper. the reach settings of zone 1 and zone 2 are 80 % and 120 % of its positive-sequence impedance, respectively.

table 1. reach settings.
zone 1: 35.644 ∠ 86.54° Ω
zone 2: 53.46 ∠ 86.54° Ω

the reach settings denoted in table 1 are in secondary ohms, because of the turns ratios of the instrument transformers. section 4 presents more details about them. the distance relay must correctly measure the impedance of the phase and ground elements, which is done using (6) and (7). through the trip decision conditions, the mho distance relay must correctly classify the faulted phases. then, it will send the trip signal only to the respective circuit-breakers. figure 3 shows the logical scheme adopted to trip only the circuit-breakers of the faulted phases. the logical scheme detailed in figure 3 can be understood as follows: if the fault is within protection zone 1 or 2, the relay detects which element, phase or ground, is at logical level 1 and sends the trip signal to the respective circuit-breakers. as can be noticed, the phase elements are ab, ac, bc and the ground elements are ag, bg, and cg. protection zone 2 has a time delay component, configured in this case with a delay of 20 cycles of the fundamental frequency, used for protection coordination. the variable brkx denotes the circuit-breaker's status (e.g., opened or closed), in this case used to hold its current position. another consideration of this distance relay model is a second time delay introduced to open the circuit-breakers. circuit-breakers are electromechanical devices, so they do not open instantaneously. analysing real fault oscillograms, it is possible to conclude that the mean time interval between the trip signal and the effective opening of the circuit-breaker is 33 ms. these real oscillograms belong to the power line studied in this paper (i.e., serra da mesa – samambaia c3), located in the brazilian power grid. simulating this time delay allows more accurate results. the last thing to consider about the distance relay model is the plotting of the protection zones and of the impedance trajectories of the phase and ground elements, so that it is possible to verify a possible unwanted encroachment of the measured impedances into the protection zones and to check incorrect operation conditions.

figure 2. flowchart of trip decision.
figure 3. logical scheme to trip circuit-breakers.
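the trip test of figure 2 can be paraphrased in a few lines of python. this is a sketch of the textbook self-polarized mho comparator (trip when the angle between dz and the measured impedance is within ±90°, which places the measurement inside the circle of diameter z_set), not a transcription of the pscad logic:

```python
import cmath

def mho_trip(z_measured, z_set_mag, z_set_ang_deg):
    """self-polarized mho comparator: trip if z_measured lies inside the
    circle that passes through the origin with diameter z_set."""
    z_set = cmath.rect(z_set_mag, cmath.pi * z_set_ang_deg / 180.0)
    dz = z_set - z_measured                           # eq. (11)
    # angle between dz and z_measured, wrapped to (-pi, pi]
    diff = cmath.phase(dz) - cmath.phase(z_measured)
    diff = (diff + cmath.pi) % (2 * cmath.pi) - cmath.pi
    return abs(diff) < cmath.pi / 2

# hypothetical measured impedance tested against the table 1 settings
z_fault = cmath.rect(20.0, cmath.pi * 80.0 / 180.0)
print(mho_trip(z_fault, 35.644, 86.54))               # zone 1 -> True
print(mho_trip(z_fault, 53.46, 86.54))                # zone 2 -> True
print(mho_trip(cmath.rect(60.0, 1.5), 35.644, 86.54)) # outside -> False
```

in a relay model, a zone 2 assertion would additionally start the 20-cycle delay timer described above before the trip signal is issued.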
the circular characteristic of the self-polarized mho distance relay passes through the origin of the r-x diagram. it is necessary to generate this offset condition as follows: first, one must calculate one point on the r axis and another on the x axis using (12) and (13), respectively.

$r_{axis} = radius \times \cos(\theta_{line})$ (12)

$x_{axis} = radius \times \sin(\theta_{line})$ (13)

where:
r_axis – offset on the real axis (r) (Ω)
x_axis – offset on the imaginary axis (x) (Ω)
radius – one half of the impedance magnitude of the protection zone (Ω)
θ_line – angle of the transmission line (°)

table 1 gives the magnitude and angle values of both protection zones 1 and 2. with the magnitude values of these protection zones, it is possible to plot the circular characteristic of each zone. after this step, two different signals must be calculated, one for the r axis and one for the x axis, and they must be plotted one as a function of the other. these signals are calculated in the time domain for r and x using (14) and (15), respectively.

$r_{signal} = radius \times \cos(\omega t) + r_{axis}$ (14)

$x_{signal} = radius \times \sin(\omega t) + x_{axis}$ (15)

where:
r_signal – signal generated for the real axis
x_signal – signal generated for the imaginary axis
ω – angular frequency (rad/s)
t – simulation runtime (s)

in this study case, the angular frequency ω was empirically chosen as 360π rad/s. the simulation runtime increases with a time step ∆t = 10 µs. r_signal and x_signal are calculated for protection zones 1 and 2; plotting them as (r_signal, x_signal) results in the characteristic shown in figure 4 (a short plotting sketch is given after table 2 below). after this description of how to model the mho distance relay, section 4 presents the details of the studied scenario and the results.

figure 4. mho circular characteristic.

4. simulation and results

to test the distance relay presented in section 3, the model of a transmission line belonging to the brazilian power grid is used. the transmission line of interest interconnects the substations of serra da mesa and samambaia. this line is part of the northern-southern interconnection of the brazilian transmission power grid. the transmission line of interest uses the frequency-dependent (phase) model available in pscad/emtdc, which is the most accurate transmission line model available; [27] and [28] present more details about this frequency-dependent model. the model of the transmission line serra da mesa – samambaia c3 uses the geometric characteristics of its predominant tower, as shown in figure 5. the line is rated at 500 kv, and it has a series capacitor for reactive compensation at the samambaia substation. its length is 249 km and it employs a complete transposition cycle, as shown in figure 6. the conductor and ground wire data are described in table 2.

table 2. conductors and ground wire data.
sections 0 − 12.474 km and 234.732 − 249 km:
conductor: rail; inner radius 0.003705 m; outer radius 0.014805 m; dc resistance 0.0599 Ω/km; 4 bundled sub-conductors
ground wire 1: dotterel; outer radius 0.0077 m; dc resistance 0.324 Ω/km
ground wire 2: opgw; outer radius 0.0077 m; dc resistance 0.429 Ω/km
section 12.474 − 234.732 km:
conductor: rail; inner radius 0.003705 m; outer radius 0.014805 m; dc resistance 0.0599 Ω/km; 4 bundled sub-conductors
ground wire 1: 3/8” ehs; outer radius 0.004572 m; dc resistance 4.1994 Ω/km
ground wire 2: opgw; outer radius 0.0062 m; dc resistance 0.735 Ω/km
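as referenced above, here is a short sketch of the characteristic plotting of (12)-(15) in python. the continuous ωt sweep is replaced by a single sampled revolution, which is an implementation shortcut rather than the paper's ∆t = 10 µs time stepping:

```python
import numpy as np

def mho_circle(z_mag, line_angle_deg, n_points=400):
    """generate (r, x) points of the mho circle per (12)-(15); z_mag is
    the zone reach magnitude (table 1), and the circle passes through
    the origin with its centre offset along the line angle."""
    radius = z_mag / 2.0                          # half the reach
    theta = np.deg2rad(line_angle_deg)
    r_axis = radius * np.cos(theta)               # eq. (12)
    x_axis = radius * np.sin(theta)               # eq. (13)
    wt = np.linspace(0.0, 2.0 * np.pi, n_points)  # one full revolution
    r_signal = radius * np.cos(wt) + r_axis       # eq. (14)
    x_signal = radius * np.sin(wt) + x_axis       # eq. (15)
    return r_signal, x_signal

# zone 1 and zone 2 characteristics from table 1
r1, x1 = mho_circle(35.644, 86.54)
r2, x2 = mho_circle(53.46, 86.54)
# the sampled point closest to the origin should be (numerically) zero,
# confirming that the characteristic passes through the origin
print(min(np.hypot(r1, x1)))
```

plotting (r1, x1) and (r2, x2) against each other reproduces the two nested circles of figure 4.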
for this paper, the series capacitor located at the samambaia substation is kept with its by-pass circuit-breaker closed, so the distance protection is not under the effects of the reactive compensation. the distance relay tests are performed with several fault scenarios applied to the transmission line serra da mesa – samambaia c3. the following parameters are changed to create the different scenarios: fault type, fault location, and fault resistance. with these parameters, the proposed distance relay is accurately evaluated for its general performance. these parameters were changed using the multiple run algorithm available in pscad/emtdc. the algorithm applies 11 different fault types at 12 different locations along the transmission line, with 5 different values of fault resistance. the fault types are detailed in table 3. figure 7 depicts the single-line diagram of the transmission line, highlighting the relay and instrument transformers' locations and the distances considered for the simulated faults. the fault resistance values are 0, 1, 10, 50 and 100 Ω.

figure 5. geometric characteristic of the predominant tower.
figure 6. complete transposition cycle.
figure 7. single-line diagram of serra da mesa – samambaia c3 transmission line.

table 3. fault types.
phase-ground: ag, bg, cg
phase-phase: ab, ac, bc
phase-phase-ground: abg, acg, bcg
three-phase: abc
three-phase-ground: abcg

this paper aims to evaluate the performance of a digital mho distance relay. the main aspect is the correct operation of the relay in isolating the various types of simulated faults: the relay must trip the circuit-breakers belonging to the faulted phases while the healthy phases remain live. a comparison of the influence of the fault resistance on the correct operation of this distance relay is presented. the last evaluation is the confidence interval calculation of the fault detection time. the positive-sequence and zero-sequence impedances of the transmission line are obtained using the “line constants” auxiliary routine available in pscad/emtdc. these impedance values are applied to the distance relay settings and to the calculation of the zero-sequence compensation factor k in (8). table 4 presents these impedance values.

table 4. impedance data.
positive sequence: primary 66.83 ∠ 86.54° Ω; secondary 44.55 ∠ 86.54° Ω
zero sequence: primary 273.82 ∠ 71.29° Ω; secondary 182.55 ∠ 71.29° Ω

the zero-sequence compensation factor in (8) and the reach settings denoted in table 1 are calculated using the secondary impedances. these secondary impedances are calculated using the turns ratios of the voltage transformer and the current transformer. the voltage transformer turns ratio is denoted as vtr = 4500, and the current transformer turns ratio as ctr = 3000. the calculation of the secondary impedances uses (16).

$z_{secondary} = z_{primary} \times \dfrac{CTR}{VTR}$ (16)
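the conversion of (16) and the scenario grid of the multiple run block reduce to a few lines of python; the fault-location spacing below is an assumption, since the paper only states that 12 locations along the 249 km line were used:

```python
import itertools

def to_secondary(z_primary, ctr=3000, vtr=4500):
    """primary-to-secondary ohm conversion, eq. (16)."""
    return z_primary * ctr / vtr

print(to_secondary(66.83))    # ~44.55 ohm (positive sequence, table 4)
print(to_secondary(273.82))   # ~182.55 ohm (zero sequence, table 4)

# enumeration of the simulated scenarios: 11 fault types x 12 locations
# x 5 resistances = 660 runs per relay location
fault_types = ["ag", "bg", "cg", "ab", "ac", "bc",
               "abg", "acg", "bcg", "abc", "abcg"]     # table 3
locations_km = [249 * i / 13 for i in range(1, 13)]    # assumed spacing
resistances_ohm = [0, 1, 10, 50, 100]
scenarios = list(itertools.product(fault_types, locations_km,
                                   resistances_ohm))
print(len(scenarios))   # 660
```

each tuple in `scenarios` corresponds to one pscad run whose detection time and trip correctness are later aggregated.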
the first generated result, presented in figure 8, is a graph regarding the distance relay operation. it includes a comparison considering its correct operation facing different fault resistance values. analysing the results in figure 8, it is possible to notice the derating of the distance relay's performance as the fault impedance increases.

figure 8. correct operation of the distance relay considering the effects of fault resistance.

even in the presence of faults with a low impedance, this distance relay does not operate with 100 % accuracy. other observations include the following: for phase-ground faults close to the relay location, an unwanted encroachment of phase elements into the relay's operation zone might occur, resulting in unwanted trips; for faults near the remote end, the impedance path might dynamically enter and leave protection zone 2 from time to time, so its operation might not occur within the expected time. the effect of the fault resistance can be better understood by analysing figure 9, which shows the impedance path for a phase a – ground fault located at the middle of the transmission line's length, as seen by the relay located at the serra da mesa substation. from figures 8 and 9, it can be seen that, for larger values of fault resistance, the impedance path calculated by the digital relay underreaches the protection zone. figure 10 details an unwanted path encroachment of a phase element for a phase-ground fault. a bolted fault between phase b and ground close to the relay's location at the serra da mesa substation was simulated, and an incorrect operation condition occurred. as shown in figure 10, there was an unwanted encroachment of the phase element zbc, and the distance relay sent trip signals to both phase b and phase c circuit-breakers, which was not supposed to happen.

figure 9. effect of fault resistance on the mho distance relay.
figure 10. unwanted encroachment of the zbc element.

the last case evaluated in this paper is the confidence interval calculation for the fault detection time. this evaluation considers several different simulated scenarios. statements defined in [29] are used for the confidence interval calculation, so it is possible to provide an interval of plausible values for the fault detection time and to demonstrate how confident they are. the fault detection time for protection zone 1 and protection zone 2 was calculated, considering all the simulated cases with the different fault types, fault locations, and fault impedances. the confidence interval calculation only includes the cases with a correct operation of the distance relay. the result is a 95 % confidence interval for the mean, computed using (17).

$\left(\bar{x} - 1.96\,\dfrac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96\,\dfrac{\sigma}{\sqrt{n}}\right)$ (17)

where:
x̄ – mean fault detection time
σ – standard deviation of the fault detection time
n – number of samples

table 5 shows the data required for the confidence interval computation. figure 11 presents a graph of the confidence intervals for the relays located at both substations, serra da mesa and samambaia.

table 5. data to calculate the confidence interval of the fault detection time.
zone 1: serra da mesa – x̄ = 0.0272 s, σ = 0.0727 s, n = 415; samambaia – x̄ = 0.0267 s, σ = 0.0708 s, n = 386
zone 2: serra da mesa – x̄ = 0.0259 s, σ = 0.0688 s, n = 409; samambaia – x̄ = 0.0189 s, σ = 0.044 s, n = 380

figure 11. confidence interval of fault detection time.

analysing table 5 and figure 11, it is possible to conclude that the fault detection time must not be treated as a constant parameter, as it usually is in the literature. it must be treated as a variable with a mean value and a standard deviation, subject to many different conditions in electrical power systems. the fault detection time's upper and lower limits depend on which fault condition the relay is facing. both the faults farther from the relay location and the faults with an impedance lead to a slower detection. with the confidence interval computation, it is possible to have more details about the fault detection time and about critical conditions for protection systems. it is also possible to determine whether the protection systems comply with the requirements defined by power utilities, standards, and regulatory organizations. after these results, section 5 presents the conclusions.
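equation (17) is immediate to evaluate; a minimal sketch using the zone 1 statistics of table 5 above:

```python
import math

def confidence_interval_95(mean, std, n):
    """95 % confidence interval for the mean, eq. (17)."""
    half_width = 1.96 * std / math.sqrt(n)
    return mean - half_width, mean + half_width

# zone 1 detection-time statistics from table 5 (seconds)
for substation, mean, std, n in [("serra da mesa", 0.0272, 0.0727, 415),
                                 ("samambaia",     0.0267, 0.0708, 386)]:
    lo, hi = confidence_interval_95(mean, std, n)
    print(f"{substation}: {lo * 1000:.2f} ms .. {hi * 1000:.2f} ms")
```

the resulting intervals are a few milliseconds wide either side of the mean, which is what the bars of figure 11 convey graphically.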
5. conclusions

this paper presented a detailed methodology on how to model a digital distance relay using pscad/emtdc. the main aspects of a digital mho distance relay, its limitations, and how it can be used to protect transmission lines were shown. the basic steps used in its modelling are:
• signal conditioning unit
• impedance calculation
• fault detection
• trip logical scheme
• plotting of the circular characteristic and the impedance path
in the signal conditioning unit, the anti-aliasing filter was used to avoid noise from higher-frequency harmonics. the data sampling, the buffering, and the fast fourier transform are implemented to estimate the voltage and current phasors at the fundamental frequency and to remove any dc offset. the impedance calculations for phase and ground elements are done using (6)-(9). the flowchart of the trip decision is detailed in figure 2, and the logical scheme that sends the trip signals to the circuit-breakers is shown in figure 3. the plotting of the mho characteristic is demonstrated in a simplified and easy-to-understand manner through (12)-(15); the result is shown in figure 4. section 4 presents a detailed evaluation of the self-polarized mho distance relay. it is possible to notice that its incorrect operation is caused mainly by fault resistance effects, faults close to the relay's location, and faults close to the remote end. a new approach on how to treat the fault detection time of digital distance relays is presented. figure 11 depicts that the fault detection time is not constant for every possible fault that might happen in electrical power systems; it is shown that the fault detection time varies within a time interval. this paper calculates a 95 % confidence interval for the mean, showing the upper and lower boundaries of this time interval. the main scientific contributions of this paper are the modelling of a digital mho distance relay using pscad/emtdc and the detailed evaluation of this relay in the following aspects:
• effects of fault resistance on distance protection
• confidence interval calculation for the fault detection time
• considerations about undesired conditions that might happen in distance protection

acknowledgements

the authors gratefully acknowledge the financial support by capes – coordenação de aperfeiçoamento de pessoal de nível superior – brasil and taesa (r&d) aneel project pd – 07130-0053/2018.

references

[1] s. h. horowitz, a. g. phadke. power system relaying, chap. nonpilot distance protection of transmission lines, pp. 101–131. 3rd ed. john wiley & sons, ltd., 2008.
[2] c. a. b. da costa, n. s. d. brito, b. a. de souza. a methodology for distance relay modeling. ieee latin america transactions 16(5):1388–1394, 2018. https://doi.org/10.1109/tla.2018.8408432.
[3] r. h. salim, d. p. marzec, a. s. bretas. phase distance relaying with fault resistance compensation for unbalanced systems. ieee transactions on power delivery 26(2):1282–1283, 2011. https://doi.org/10.1109/tpwrd.2010.2052967.
[4] j. choi, y. han, i. suh, et al.
a study on the effects of new facts installations on existing distance protection relays. ifac proceedings volumes 36(20):279–283, 2003. https://doi.org/10.1016/s1474-6670(17)34480-4.
[5] s. m. hashemi, m. sanaye-pasand. distance protection during asymmetrical power swings: challenges and solutions. ieee transactions on power delivery 33(6):2736–2745, 2018. https://doi.org/10.1109/tpwrd.2018.2816304.
[6] a. manori, m. tripathy. protection of tcsc transmission line by wavelet based advance mho relay. in 2018 5th ieee uttar pradesh section international conference on electrical, electronics and computer engineering (upcon), pp. 1–4. 2018. https://doi.org/10.1109/upcon.2018.8596759.
[7] p. r. khade, m. p. thakre. optimal reach settings of mho relay for series compensated transmission line protection. in 2020 4th international conference on electronics, communication and aerospace technology (iceca), pp. 307–313. 2020. https://doi.org/10.1109/iceca49313.2020.9297516.
[8] y. liang, w. li, w. zha. adaptive mho characteristic-based distance protection for lines emanating from photovoltaic power plants under unbalanced faults. ieee systems journal pp. 1–11, 2020. https://doi.org/10.1109/jsyst.2020.3015225.
[9] c. r. mason. the art and science of protective relaying. wiley, new york, 1956.
[10] h. beleed, b. k. johnson, h. l. hess. an examination of the impact of d-facts on the dynamic behavior of mho and quadrilateral ground distance elements. in 2020 ieee power & energy society innovative smart grid technologies conference (isgt), pp. 1–5. 2020. https://doi.org/10.1109/isgt45199.2020.9087712.
[11] l. gérin-lajoie. a mho distance relay device in emtpworks. electric power systems research 79(3):484–491, 2009. https://doi.org/10.1016/j.epsr.2008.09.010.
[12] k. pipaliya, v. makwana. modeling and simulation of digital distance protection scheme for quadrilateral characteristics using pscad. international journal of advance engineering and research development 2(5):1238–1246, 2015. https://doi.org/10.21090/ijaerd.0205173.
[13] s. r. samantaray, p. k. dash. transmission line distance relaying using a variable window short-time fourier transform. electric power systems research 78(4):595–604, 2008. https://doi.org/10.1016/j.epsr.2007.05.005.
[14] manitoba hvdc research centre. applications of pscad™/emtdc™. winnipeg, 2008.
[15] institute of electrical and electronic engineers. guide for protective relay applications to transmission lines. in ieee std c37.113-2015 (revision of ieee std c37.113-1999), pp. 1–141. 2016.
[16] a. m. abdullah, k. butler-purry. distance protection zone 3 misoperation during system wide cascading events: the problem and a survey of solutions. electric power systems research 154:151–159, 2018. https://doi.org/10.1016/j.epsr.2017.08.023.
[17] s. g. aquiles perez, m. s. sachdev, t. s. sidhu. modeling relays for use in power system protection studies. in canadian conference on electrical and computer engineering, 2005, pp. 566–569. 2005. https://doi.org/10.1109/ccece.2005.1556994.
[18] a. v. oppenheim, r. w. schafer. discrete-time signal processing, chap. digital processing of analog signals, pp. 153–237. 3rd ed. pearson, 2010.
[19] a. m. goler. pscad on-line help system, chap. pfft 49 low pass, anti-aliasing filter. pscad, winnipeg, 1993.
[20] w. kester. mt-002 tutorial: what the nyquist criterion means to your sampled data system design. analog devices pp. 1–12, 2009.
[21] r. jayasinghe, s. woodford. pscad on-line help system, chap. online fast fourier transform.
pscad, winnipeg, 1992.
[22] m. inci, m. buyuk, m. tumay. fft based reference signal generation to compensate simultaneous voltage sag/swell and voltage harmonics. in 2016 ieee 16th international conference on environment and electrical engineering (eeeic), pp. 3–7. 2016. https://doi.org/10.1109/eeeic.2016.7555876.
[23] k. r. rao, d. n. kim, j.-j. hwang. fast fourier transform algorithms and applications. springer netherlands, dordrecht, 2010.
[24] a. g. phadke, j. s. thorp. computer relaying for power systems, chap. relaying practices. wiley, chichester, uk, 2009.
[25] d. d. fentie. understanding the dynamic mho distance characteristic. in 2016 69th annual conference for protective relay engineers (cpre), pp. 1–15. 2016. https://doi.org/10.1109/cpre.2016.7914922.
[26] m. da c. siqueira. desempenho da proteção de distância sob diferentes formas de polarização, 2007. federal university of rio de janeiro.
[27] b. gustavsen, g. irwin, r. mangelrød, et al. international conference on power system transients (ipst), chap. transmission line models for the simulation of interaction phenomena between parallel ac and dc overhead lines, pp. 61–67. 1999.
[28] a. morched, b. gustavsen, m. tartibi. a universal model for accurate calculation of electromagnetic transients on overhead lines and underground cables. ieee transactions on power delivery 14(3):1032–1038, 1999. https://doi.org/10.1109/61.772350.
[29] f. m. dekking, c. kraaikamp, h. p. lopuhaa, l. e. meester. a modern introduction to probability and statistics, understanding why and how. springer, london, 2005.

elicitation of preference structure in engineering design

j. k. tan

engineering design processes, which inherently involve multiple, often conflicting criteria, can be broadly classified into synthesis and analysis processes. multiple criteria decision making addresses synthesis and analysis processes through multiple objective optimisation to generate sets of efficient design solutions (i.e. on pareto surfaces) and multiple attribute decision making to analyse and select the most preferred design solution(s). mcdm, therefore, has been widely used in all fields of engineering design; for example, it has been applied to such diverse areas as naval battle ship criteria analysis/selection and product appearance design. given a list of design alternatives with multiple conflicting criteria, preferences often determine the final selection of a particular set of design alternative(s). preferences may also be used to drive the design/design optimisation processes. various methods have been proposed to model preference structure, for example simple weights, multiple attribute utility theory, pairwise comparison, etc. preference structure is often non-linear, discontinuous and complex. an artificial neural network (ann) learning-based preference elicitation method is presented in this paper. anns efficiently model the non-linearity, complexity and discontinuity of any given preference structure. a case study is presented to illustrate the learning-based approach to preference structure elicitation.

keywords: engineering design, multiple criteria decision making, preference structure.

1 introduction

in engineering design a designer needs to satisfy a set of functional requirements within a given set of constraints. however, a good engineering design is one that goes beyond merely satisfying these requirements within constraints and achieves a certain level of excellence in some quantifiable or unquantifiable manner. put another way, designers seek to optimise designs during the design process. design optimisation often involves multiple conflicting objectives or criteria, and can be regarded as a form of multiple criteria decision making [1]. multiple criteria decision making (mcdm) can broadly be classified as:
• synthesising a set of competing design alternatives.
• selecting the most preferred design(s) from a set of competing design alternatives.
the search for optimum design solutions involving multiple objectives during the synthesis process usually results in non-dominated or efficient solutions. the search for an efficient solution begins in the feasible solution space; the solution space of a bi-objective design optimisation problem is shown in fig. 1. criterion 1 and criterion 2 are to be maximised, and points a and b are the optimum design solutions if criteria 1 and 2 are optimised as two single objective optimisation problems. the unattainable ideal solution is represented by point “o” in fig. 1. it is clear, from fig. 1, that all the design solutions in the shaded region dominate solution “x”. if a solution on the pareto surface is found, then it is usually sensible to take it to represent the “best solution”, in that no improvements can be made on either criterion without sacrificing the performance of the other criterion.

fig. 1: solution space of a bi-objective design optimisation problem (feasible region, infeasible region and pareto surface).
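a non-dominated (pareto) filter of the kind implied here is simple to express in code; the sketch below, in python, assumes both criteria are to be maximised, as in fig. 1, and the sample points are invented for illustration:

```python
def dominates(a, b):
    """a dominates b if a is at least as good on every criterion
    (here: larger is better) and strictly better on at least one."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def pareto_front(solutions):
    """keep only the non-dominated (efficient) solutions."""
    return [s for s in solutions
            if not any(dominates(t, s) for t in solutions if t is not s)]

# bi-objective example: (criterion 1, criterion 2), both maximised
points = [(1.0, 5.0), (2.0, 4.0), (1.5, 4.5), (1.0, 4.0), (3.0, 1.0)]
print(pareto_front(points))
# -> [(1.0, 5.0), (2.0, 4.0), (1.5, 4.5), (3.0, 1.0)]; (1.0, 4.0) is
#    dominated by (1.0, 5.0) and removed
```

the quadratic scan is adequate for the handful of alternatives typical of design studies; faster sorting schemes exist for large populations.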
this realistic approach of incorporating conflicting objectives in an optimisation framework finds readily available applications in various fields of engineering design: for example, safety design [2] and finite element analysis during design [3]. various methods have been developed that allow one to search for solutions on the pareto surface; two such methods are the interactive step trade-off method and the multiple objective genetic algorithm [1, 4]. the selection of the most preferred design solution(s) from a set of efficient design solutions is a subjective matter and depends on the decision maker's preference. in general, given a set of design alternatives, the decision maker analyses the merits of the various attributes (e.g. cost, performance, and appearance) on the basis of a preference structure before ranking or selecting the most preferred design alternative(s). again, various methods have been developed to allow one to rank and select the most preferred design alternatives from a given set of alternatives and an articulation of the designer's preferences [5, 6, 7].

2 preference structure

in mcdm, it is a difficult task to elicit a designer's or decision maker's preference structure. the preference structure of the designer or decision maker is usually expressed through weights or utility functions. the preference structure may be elicited in terms of pairwise comparison of attributes (or criteria), ranking of all attributes, ranking of a sub-set of alternatives with respect to all attributes, and the definition of ideal and negative ideal solutions.
given the fact that the decision maker may not be able to articulate the preference structure through the comparison of pairs of attributes and/or solutions, and that the comparison of pairs of attributes may not be adequate to capture the interactions between the attributes of a decision making problem, the results of preference elicitation may not be well agreed by the decision maker. in general, the greater the volume of preference information that is provided, the higher the accuracy of the weights or utility function obtained, accompanied by a higher risk of inconsistencies in judgement manifesting themselves during the elicitation process. attempts are being made to take the complexity of preference elicitation into account in mcdm. one such example involves the use of artificial neural networks and fuzzy set theory to model preference relations for mcdm [8]. artificial neural networks (anns) have been used in a large range of applications in many fields [9]. anns are particularly good at recognising complex patterns and images when they are appropriately set up and trained. one such example involved the use of an ann to map the complex response surface of hydrodynamic performance [10]. for the difficult task of eliciting a decision maker's preference, a learning-based approach using an ann is proposed in this paper to capture the designer's or decision maker's preference structure. the proposed learning-based approach is an iterative one that allows the designer or decision maker to state and refine his preference on a set of competing design alternatives. this work is still in its early stage and hence only the results of a preliminary investigation are illustrated as a simple case study example. further results will be disseminated in due course as work progresses.

3 an example

the preliminary investigation of this proposed learning-based approach has been carried out using a set of existing data on a catamaran design problem. the efficient design solutions data for this catamaran design problem is adapted from the example of [1], which uses a utility function to capture the preference structure; here, the learning-based approach is applied to illustrate the decision maker's preference. in this problem, a catamaran vessel is designed by modifying a parent catamaran hull form so as to maximise a number of performance measures. these performance measures (termed attributes, objectives or criteria) are heave, pitch, roll and the relative bow motion (rbm) of the vessel. to optimise the performance for robustness over a range of wave headings, the signal-to-noise ratio is maximised. the results are shown in table 1. the non-dominated optimal or efficient solutions on the pareto surface are obtained (see table 1) by creating variant designs through a simple perturbation of the three primary design variables, these being the length (l), the beam over draft ratio (b/t) and the separation of the demi-hulls (hs).

table 1: efficient solutions for the example problem.
no. | Δl % | Δb/t % | Δhs % | heave (db) | pitch (db) | roll (db) | rbm (db)
1 | −5.0000 | −10.0000 | −10.0000 | 6.9372 | 15.3381 | 2.4447 | 6.3923
2 | 5.0000 | 10.0000 | 10.0000 | 6.8438 | 19.1872 | 2.5865 | 6.8327
3 | 0.0000 | 10.0000 | 10.0000 | 6.1819 | 7.9100 | −0.3776 | 8.5016
4 | 5.0000 | 5.0000 | 10.0000 | 6.9329 | 16.7476 | 7.4053 | 6.4656
5 | 0.0000 | 5.0000 | 10.0000 | 6.1645 | 7.8047 | 5.5222 | 10.0961
6 | 10.0000 | 0.0000 | 10.0000 | 6.8808 | 13.0745 | 10.8571 | 5.2839
7 | 5.0000 | 0.0000 | 10.0000 | 6.9001 | 11.8422 | 8.1982 | 5.0751
8 | −5.0000 | −5.0000 | 0.0000 | 6.9762 | 13.4082 | 5.4809 | 4.9013
9 | −10.0000 | −5.0000 | 0.0000 | 7.2764 | 11.1131 | 4.7353 | 4.3594
10 | 0.0000 | −5.0000 | −10.0000 | 6.1508 | 7.6588 | 8.3186 | 9.8139
11 | −10.0000 | −5.0000 | −10.0000 | 7.2764 | 11.1131 | 3.8911 | 4.9367
12 | −5.0000 | −5.0000 | −10.0000 | 6.9762 | 13.4082 | 4.7291 | 5.6957
13 | −10.0000 | −10.0000 | −10.0000 | 7.2411 | 12.9420 | 3.6884 | 5.7528
14 | 10.0000 | 10.0000 | 10.0000 | 6.8514 | 17.9598 | 6.9071 | 6.1825
15 | 10.0000 | 5.0000 | 10.0000 | 6.8908 | 15.8289 | 9.1926 | 5.9075
16 | −10.0000 | −5.0000 | 10.0000 | 7.2764 | 11.1131 | 5.0582 | 3.7533
17 | 10.0000 | −5.0000 | 10.0000 | 6.6281 | 9.9164 | 11.4248 | 3.8082
18 | 10.0000 | −10.0000 | 0.0000 | 7.2411 | 12.9420 | 3.9183 | 5.2209
the non-dominated optimal or efficient solutions on the pareto surface are obtained (see table 1) by creating variant designs through a simple perturbation of the three primary design variables, these being the length (l), the beam over draft ratio (b/t) and separation of the demi-hulls 6 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 45 no. 3/2005 czech technical university in prague parameters criteria �l % �b/t % �hs % heave(db) pitch(db) roll(db) rbm(db) 1 �5.0000 �10.0000 �10.0000 6.9372 15.3381 2.4447 6.3923 2 5.0000 10.0000 10.0000 6.8438 19.1872 2.5865 6.8327 3 0.0000 10.0000 10.0000 6.1819 7.9100 �0.3776 8.5016 4 5.0000 5.0000 10.0000 6.9329 16.7476 7.4053 6.4656 5 0.0000 5.0000 10.0000 6.1645 7.8047 5.5222 10.0961 6 10.0000 0.0000 10.0000 6.8808 13.0745 10.8571 5.2839 7 5.0000 0.0000 10.0000 6.9001 11.8422 8.1982 5.0751 8 �5.0000 �5.0000 0.0000 6.9762 13.4082 5.4809 4.9013 9 �10.0000 �5.0000 0.0000 7.2764 11.1131 4.7353 4.3594 10 0.0000 �5.0000 �10.0000 6.1508 7.6588 8.3186 9.8139 11 �10.0000 �5.0000 �10.0000 7.2764 11.1131 3.8911 4.9367 12 �5.0000 �5.0000 �10.0000 6.9762 13.4082 4.7291 5.6957 13 �10.0000 �10.0000 �10.0000 7.2411 12.9420 3.6884 5.7528 14 10.0000 10.0000 10.0000 6.8514 17.9598 6.9071 6.1825 15 10.0000 5.0000 10.0000 6.8908 15.8289 9.1926 5.9075 16 �10.0000 �5.0000 10.0000 7.2764 11.1131 5.0582 3.7533 17 10.0000 �5.0000 10.0000 6.6281 9.9164 11.4248 3.8082 18 10.0000 �10.0000 0.0000 7.2411 12.9420 3.9183 5.2209 table 1: efficient solutions for the example problem (hs). the design variations for the three primary variables are as follows: l :1 � 0.1 in steps of 0.05 (i.e. � 10 % variation in steps of � 5 %) b/t : 1 � 0.1 in steps of 0.05 h : 1 � 0.1 in steps of 0.05 it is obvious from the data that it is not possible to maximise all four criteria simultaneously, and a trade-off between the four criteria will be necessary. the designer or decision maker needs to articulate his preference so as to identify the “best” design. more importantly, due to the nature of the problem, the 18 design alternatives presented may not contain the most desirable features in accordance with the yet-to-be captured preference structure. hence the preference structure can be used to guide a further explorative search in an attempt to cover improved designs. to aid the decision maker in the articulation, an overall scoring system between 0 and 1 will be used to rank the design alternatives. a feed-forward ann, as shown in fig. 2, is set up to map and capture the preference structure. the input nodes i1– i4 are used to receive the 4 performance measures of heave, pitch, roll and rbm, and the output node o1 will be used to receive the decision maker’s overall preference. the learning-based approach allows the decision maker or designer to articulate his preference in an incremental manner: � the decision maker first selected four design solutions to represent the best, good, average and poor designs with appropriate scores. � the ann was then trained to capture this initial preference structure. this preference structure is then used to predict the scores of other solutions. � if the decision maker agreed with the predicted scores then those scores would be used as additional training data. otherwise, the decision maker assigned new scores and these new scores are then again used as additional training data set. 
the influence of the decision maker's preference for individual performance measures on the final score can be revealed through simple scatter plots, as shown in fig. 3. it can be seen from fig. 3 that the decision maker, unconsciously perhaps, did not consider heave and roll to be as important as pitch and rbm in affecting the overall selection of the final design solution. pitch and rbm thus became the important measures in the selection of the final “best” design solution. the influence of pitch and rbm on the score can thus be plotted (as shown in fig. 4) to reveal the preference structure and the interrelation of these two criteria in relation to the final score (the desirability to the decision maker). intuitively, one would suspect a correlation between rbm and pitch, and indeed fig. 4 shows that there is a good correlation between these two important performance measures.

fig. 3: correlation of decision score and performance measures.
fig. 4: plots of preference structure and pitch vs rbm.

from the graphs shown in figs. 3 and 4, the decision maker decided to look more closely at the two criteria pitch and rbm. an analysis of these two criteria on the basis of the given 18 competing designs yielded the graph shown in fig. 5, which revealed that the influence of rbm somehow peaked at the value of 6, whereas pitch has a stronger influence, in that a higher value of this criterion is always desirable.

fig. 5: pitch and rbm vs score.

the observations derived from this simple exercise are not dissimilar to those obtained in [1] using the utility function approach, in which the decision maker used a pairwise comparison of competing designs to construct the utility functions of the performance measures pitch, rbm, heave and roll. it should be pointed out that, due to the difference in the nature of the evaluation algorithms, the results in terms of numerical values obtained by this method should not be compared directly to those stated in [1], which were computed using utility functions. however, the overall trend of the preference structure should not be dissimilar between the two methods. the resultant utility functions derived from the analysis are linear for pitch: the higher the value of the pitch signal-to-noise performance, the higher the utility value and hence the desirability. the rbm's utility curve showed significantly less influence than the pitch's utility curve, and the values of the rbm utility were virtually constant and at a value significantly lower than those associated with pitch. the preference structure elicited is then used to assist the designer in performing a further explorative search in an attempt to obtain a better overall desirability in accordance with this preference structure. clearly, the direction of the search is aimed at trading off the performances of heave, roll and, to some extent, rbm in order to gain pitch performance. again, a similar conclusion has been drawn in [1]. a further search for solutions can indeed be performed on the basis of this preference structure, but for brevity the details are not presented here.
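the suspected pitch-rbm relationship can be checked numerically from the table 1 columns; note that among efficient (pareto) solutions a negative coefficient is the expected signature of a trade-off between criteria:

```python
import numpy as np

# pitch and rbm columns transcribed from table 1 (signal-to-noise, db)
pitch = np.array([15.3381, 19.1872, 7.9100, 16.7476, 7.8047, 13.0745,
                  11.8422, 13.4082, 11.1131, 7.6588, 11.1131, 13.4082,
                  12.9420, 17.9598, 15.8289, 11.1131, 9.9164, 12.9420])
rbm = np.array([6.3923, 6.8327, 8.5016, 6.4656, 10.0961, 5.2839,
                5.0751, 4.9013, 4.3594, 9.8139, 4.9367, 5.6957,
                5.7528, 6.1825, 5.9075, 3.7533, 3.8082, 5.2209])

# pearson correlation coefficient between the two criteria
print(np.corrcoef(pitch, rbm)[0, 1])
```

a single number of this kind is of course a cruder summary than the response surfaces of figs. 4 and 5, but it gives a quick sanity check on the trade-off direction before the ann analysis is attempted.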
4 discussion and conclusion

elicitation of a preference structure is not an easy task, principally because it involves the articulation of human preference over a set of competing alternatives. the subjective nature of the design selection process, which involves multiple conflicting criteria, requires trading off between attributes or criteria, with possible interactions between these attributes or criteria. various methods have been developed to assist decision makers in the articulation and mapping of an underlying preference structure. this paper presents a learning-based iterative approach that allows the decision maker to form the preference structure incrementally by stating his preference using a simple scale system. admittedly, one cannot claim that this learning-based iterative approach is perfect; however, the decision maker can, through a series of intuitive refinements, arrive at a credible preference structure on the basis of his intuitive articulation. the artificial neural network handles the complexity of the possible interactions of the criteria through learning, efficiently and transparently, via examples and training, so as to map the complex response surface that fits the given training data. the preference structure derived can be used to perform further explorations of the solution space in search of a better overall desirable design performance. a quantitative example is presented to illustrate the use of this approach and the resultant preference structure. however, in many design problems certain attributes or criteria are somewhat unquantifiable (e.g., the shape and appearance of a design, which often significantly affects the product's desirability), and eliciting the preference structure of a decision making problem involving unquantifiable attributes and criteria poses a significant challenge to researchers and practitioners in the field of decision making and design. the work described in this paper is still in its early stage and therefore has not specifically addressed this issue in depth; however, it is noted that shape factors (an important attribute for industrial designers) may be correlated to customer preference [11]. it is noted, therefore, that the proposed approach can potentially be gainfully employed over a wide range of applications in multiple criteria decision making. in conclusion, the learning-based approach, on the basis of this preliminary investigation, appears to be intuitive and attractive and can potentially be used in a wide range of applications including engineering design, industrial design and product design.

references

[1] sen, p., tan, k.: multiple criteria issues in design: seeking the right balance. international workshop on multi-criteria evaluation, mce 2000, neukirchen, september 14th–15th, 2000.
[2] sen, p., tan, l., spencer, d.: “an integrated probabilistic risk analysis decision support methodology for systems with multiple state variables.” reliability engineering & system safety, vol. 64 (1999), no. 1, p. 73–87.
[3] zhang, w. h.: “pareto optimum sensitivity analysis in multicriteria optimization.” finite elements in analysis and design, vol. 39 (2003), p. 505–520.
[4] pereira, c. m. n. a.: “evolutionary multicriteria optimization in core designs: basic investigations and case study.” annals of nuclear energy, vol. 31 (2004), p. 1251–1264.
[5] jacquet-lagreze, e., siskos, j.: “assessing a set of additive utility functions for multicriteria decision making, the uta method.” european journal of operation research, vol. 10 (1982), p. 151–164.
[6] hwang, c. l., yoon, k.: multiple attribute decision making – methods and applications: a state-of-the-art survey. springer-verlag, berlin, 1981.
[7] saaty, t. l.: “how to make a decision: the analytical hierarchy process.” interfaces, vol. 24 (1994), no. 6, p. 19–43.
[8] wang, j.: “a neural network approach to modelling fuzzy preference relations for multicriteria decision making.” computers and operations research, vol. 21 (1994), no. 9, p. 991–1000.
[9] patterson, d. w.: artificial neural networks: theory and application. prentice hall, eaglewood cliffs, nj, 1997.
[10] tan, k., sen, p.: “the application of a decomposition and reuse approach in marine design.” prads 2001, practical design of ships & other floating structures, 8th intl conf, september 16th–21st 2001, shanghai, china.
[11] vergeest, j. s. m., egmond, r., dumitrescu, r.: correlating shape parameters to customer preference. proceedings of the tmce, april 2002, wuhan, china.

john k. tan, ph.d.
e-mail: k.tan@unn.ac.uk
school of engineering & technology, ellison building, northumbria university, newcastle upon tyne, ne1 8st, united kingdom

evaluating shear strength of sand-ggbfs based geopolymer composite material

alaa hussein jassim al-rkaby (university of thi qar, faculty of engineering, civil engineering department, nasiriyah city 64001, thi qar governorate, iraq; curtin university, civil engineering department, kent st, bentley wa 6102, australia)
correspondence: a.al-rkaby@postgrad.curtin.edu.au

abstract. geopolymer has been emerging as a novel and sustainable replacement for traditional soil improvement materials, such as ordinary portland cement (opc) and lime, which have severe environmental impacts. in this paper, a series of unconfined compression and triaxial tests were conducted on sand and on a sand-ground granulated blast-furnace slag (ggbfs) based geopolymer. a solution of sodium silicate and sodium hydroxide was employed for the geopolymerization process. results revealed that adding the ggbfs resulted in a significant increase in the strength properties. this result indicates that the geopolymer acted as a cementation agent, providing better bonding between the sand particles and consequently improving the performance of the treated sand.

keywords: geopolymer, ground granulated blast-furnace slag, sand, drained triaxial strength, unconfined compression strength.

1. introduction

one of the recent developments in geotechnical engineering is the application of geopolymers to soil-improvement procedures. geopolymers have been emerging as an eco-friendly replacement for the traditional materials, whose production has severe environmental impacts. these impacts include the emission of large quantities of carbon dioxide and an intensive consumption of energy and resources.
for example, the production process of one tonne of opc causes a carbon dioxide emission of about one tonne and an energy consumption of about 5000 mj [1–3]. moreover, one tonne of lime results in about 0.86 tonnes of carbon dioxide. geopolymer can be synthesized by activating a variety of sources, such as industrial wastes (e.g. ground granulated blast-furnace slag (ggbfs), fly ash) and natural materials (e.g. metakaolin). solutions of na2sio3 and naoh are the most widely available activators used to create a high alkaline environment [4]. rios et al. [5] studied the behaviour of silty sand mixed with a geopolymer (fly ash, sodium silicate and sodium hydroxide). their results showed a significant increase in the unconfined compression strength due to the role of the alkaline binder. after 90 days of curing, the strength of the soil mixed with an activator and 15, 20 and 25 % fly ash increased 16, 31 and 77 times, respectively, compared with non-stabilized samples [5]. similar results were found recently by cruz et al. [6], who found that the unconfined compression strength of a soil improved approximately 32 times when an alkali-activated binder was added. in contrast to opc and lime, a clayey soil improved with alkaline-activated low calcium fly ash exhibited a slow increase in the unconfined compressive strength (ucs) until the 28th day, and then it showed a similar or larger ucs than the conventional stabilizers [7]. a similar trend has been found by al-rkaby et al. [8–11]. dassekpo et al. [12] proposed that completely decomposed granite (cdg) can be used as a geopolymer source material without the need to add fly ash. they found that the compressive strength, at a 7th day of curing time, of the cdg-based geopolymer without an addition of fly ash was 13.89 mpa, compared with 0.36 mpa for the cdg without an alkaline activator. regarding the strength parameters, the angles of internal friction ϕ of a soil stabilized by a geopolymer were higher than 50°, associated with cohesion intercepts c higher than 250 kpa [5]. a similar trend was observed by corrêa-silva et al. [7], who found that the friction angles in the critical state increased by 73.6 % and 50.0 % in the total and effective stress analyses, respectively. moreover, the cohesion increased by two times in terms of the total peak strength, while it was very low in effective stress terms [7]. using a geopolymer made by mixing low calcium fly ash, sodium silicate (sio2) and sodium hydroxide (na2o) with silty sand led to a very high strength and stiffness [13]. a ratio of sio2/na2o = 1 produced the maximum early strength of the geopolymer concrete [14]. moreover, the optimum content of the sodium silicate-based additive for improving a low strength clay and a high swelling clay was found to be 6.0 % [15]. at this percentage, the ucs of the low strength clay and the high swelling clay increased 4.7 and 3.3 times, respectively [15]. in addition to the strength properties, studies carried out on geopolymer-based soils showed that treated soils have a high durability [14, 16–18]. similar results have been observed by corrêa-silva et al. [7], who found that the treatment induced a strong reduction of the samples' sensitivity to water action. moreover, stabilized samples showed a significant resistance to wetting and drying tests [19].
the addition of an alkali-activated stabilizer resulted in reduced volume strains and an overall decrease of the compressibility of the treated soil [20]. from an early stage of curing, the geopolymer-soil composite exhibited a deformation modulus two times higher than the untreated samples [6]. after 14 days of curing, the deformation modulus of the geopolymer-soil composite increased to 183.1 mpa compared with 20.0 mpa for the untreated samples [6]. it was found that the compression index (cc) and swelling index (cs) decreased from 2.3 and 0.66 for untreated low strength clay and high swelling clay respectively to about 0.5 and 0.2 for treated samples respectively [15]. this is in agreement with sargent et al. [21], who found that using ground granulated blast-furnace slag (ggbfs)-naoh with alluvial soil decreased cc and cs from 0.13 and 0.013 to 0.014 and 0.003 respectively. abdeldjouad et al. [22, 23] found that admixtures with a higher kaolinite content and palm oil fuel ash (pofa) exhibited a higher long-term strength than the admixtures without pofa. a similar trend was observed by teing et al. [24]. the reason for this improvement is the microstructural changes (such as uptake, re-condensation and the presence of a glassy phase) taking place in the mixture fabric during the stabilization process [22–25]. scanning electron microscopy (sem) showed that the voids between soil particles were almost filled with a structured aluminosilicate polymerized gel. therefore, the discrete particles of the geopolymer composite material exhibit a denser and more closely bound texture [22–25]. although good research efforts have been made into enhancing the shear strength and characteristics of soil by a geopolymer, there is little information on the strength characterization of a sand-ggbfs based geopolymer. therefore, this study aims to investigate the strength improvement of sand with a geopolymer incorporating ggbfs.

2. materials and methods
2.1. sand
in the city of perth in western australia, sand affords considerable cost savings for infrastructure and buildings because it is the predominant type of soil in the city and there are many quarries in the surrounding area. as a consequence, a soil characterized as poorly graded clean sand (sp) according to the unified soil classification system (uscs) is used in the present study, with a coefficient of curvature cc = 1.17 and a coefficient of uniformity cu = 1.64. the physical properties of perth sand are: maximum void ratio emax = 0.844, minimum void ratio emin = 0.521, median particle size d50 = 0.43 mm and specific gravity gs = 2.63.

2.2. ground granulated blast-furnace slag (ggbfs) and the alkaline activator
calcium-rich ground granulated blast-furnace slag (ggbfs), a by-product of the steel and iron industry, was utilised in this research. sodium silicate (na2sio3) and sodium hydroxide (naoh) were used as the alkaline activator. the na2sio3/naoh mass ratio was chosen to be equal to 2.0 to create a high alkaline environment and produce the maximum early strength.

2.3. sample preparation
a total of forty samples of the sand-ggbfs based geopolymer were prepared. initially, the ggbfs was added to the sand at five different weight ratios (10, 15, 20, 30 and 40 % as a fraction of the total weight) and then blended thoroughly in dry conditions until achieving a good distribution and a uniform colour.
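the batch arithmetic behind these mix proportions is straightforward. the following sketch is a minimal python illustration of it; the 1000 g total dry mass, the 15 % ggbfs fraction and the 0.4 activator/ggbfs ratio are example values chosen here, not quantities prescribed by the paper. it uses the na2sio3/naoh mass ratio of 2.0 stated above.

```python
# sketch of the batch proportioning described in section 2.3 (example values assumed)
def batch_masses(total_mass_g, ggbfs_fraction, activator_to_ggbfs, na2sio3_to_naoh=2.0):
    """return component masses [g] for one sand-ggbfs geopolymer batch."""
    ggbfs = total_mass_g * ggbfs_fraction        # ggbfs as a fraction of the total weight
    sand = total_mass_g - ggbfs                  # the remainder of the dry mix is sand
    activator = activator_to_ggbfs * ggbfs       # alkaline activator solution mass
    naoh = activator / (1.0 + na2sio3_to_naoh)   # split the activator: na2sio3/naoh = 2.0
    na2sio3 = activator - naoh
    return {"sand": sand, "ggbfs": ggbfs, "na2sio3": na2sio3, "naoh": naoh}

# example: 1000 g total dry mix, 15 % ggbfs, activator/ggbfs = 0.4 (assumed values)
print(batch_masses(1000.0, 0.15, 0.4))
# -> {'sand': 850.0, 'ggbfs': 150.0, 'na2sio3': 40.0, 'naoh': 20.0}
```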
in this study, the high-ph alkaline activator was prepared by adding sodium hydroxide (naoh) in pellet form to the sodium silicate solution (na2sio3) based on the predefined mix proportions and mixing them for 15 minutes for a full dissolution. the alkaline activator solution was then added to the sample and mixing continued to obtain a final homogeneous mix. each sample for the experiment was prepared according to the desired ggbfs content, activator percentage and dry unit weight. in order to compact the mixture, the mould was divided into several layers; for each layer, the required amount of the sand-ggbfs based geopolymer, determined in advance from the selected density and the layer volume, was poured into the mould and the surface was flattened by careful scraping. each layer was then compacted into the mould to the required height. after the compaction, the sand-ggbfs based geopolymer specimens were kept in the laboratory for 24 hours before being soaked for curing. a period of 28 days was chosen as an average curing time.

3. results
figure 2 shows the variation of the maximum dry unit weight (γd)max with the ground granulated blast-furnace slag (ggbfs) content of 5, 10, 15, 20, 30 and 40 % for all activator ratios (activator/ggbfs = 0.2, 0.4, 0.6 and 0.8). it is clear that 15 % ggbfs exhibited the best performance in terms of the largest maximum dry unit weight. this is due to the fact that such an amount of ggbfs fills the voids as inclusions.

figure 1. (a) sodium hydroxide in pellet form, and (b) sodium silicate solution.
figure 2. variation of the maximum dry unit weight (γd)max with ggbfs content for different activator ratios.
figure 3. variation of the unconfined compression strength ucs with ggbfs content for different activator ratios.
figure 4. drained triaxial strength vs. axial strain relationships for the sand-ggbfs based geopolymer (activator/ggbfs = 0.4).

figure 3 shows the variation of the unconfined compression strength ucs of the sand-ggbfs based geopolymer with the ggbfs content for different activator ratios. the results revealed that there is a significant difference in the ucs due to the variation in the ggbfs content and the activator ratio. the ucs increased continuously with the increasing ggbfs content. the rate of increase was greatest when the ggbfs content was increased from 20 to 30 %, while the increase was slower as the ggbfs content rose from 30 to 40 and 50 %. an activator ratio of 0.4 produced the largest ucs (981.0–4056.5 kpa), while the minimum of 276.5–1342.7 kpa occurred at an activator ratio of 0.8. this trend is similar to the results presented by sukmak et al. [26]. regarding the triaxial tests, the trend is similar to that of the ucs: the composite is strengthened as the ggbfs content increases, for all activator ratios. for clarity, figure 4 shows the relationship between the drained triaxial strength and the axial strain for a selected activator ratio (activator/ggbfs = 0.4). moreover, the variations of the maximum drained triaxial strength against the ggbfs content and the activator ratio are plotted in figures 5 and 6.
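the improvement factors quoted in the next paragraph are simple ratios of the strength of the treated mix to the strength of raw sand; a minimal sketch follows (the strength values used here are hypothetical placeholders, not the measured data).

```python
# improvement factor relative to raw sand: factor = q_treated / q_raw
def improvement_factor(q_treated_kpa: float, q_raw_kpa: float) -> float:
    return q_treated_kpa / q_raw_kpa

# hypothetical example: raw sand at 300 kpa, a treated mix at 810 kpa
print(round(improvement_factor(810.0, 300.0), 1))  # -> 2.7
```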
for the activator ratios of 0.2, 0.4, 0.6 and 0.8, the drained triaxial strength of the 10 % ggbfs-sand increased by 2.7, 3.3, 2.3 and 1.4 times respectively when compared with raw sand. for the same activator ratios, the drained triaxial strength of the 40 % ggbfs-sand increased by 7.6, 10.8, 5.3 and 4.3 times respectively when compared with raw sand. the drained triaxial strength increased by 21.1–53.5 % (depending on the ggbfs content) as the activator ratio changed from 0.2 to 0.4. however, increasing the activator ratio to 0.6 and 0.8 was associated with a significant decrease in the drained triaxial strength, by 8.6–36.3 % and 34.4–53.9 % respectively. moreover, the combination of the largest ggbfs content (50 %) and the optimum activator ratio of 0.4 resulted in the maximum improvement. the observed improvement is due to the geopolymerization process, which consists of a series of chemical reactions between the calcium-aluminosilicate-rich slag (ggbfs) and the alkaline activator. this leads to the formation of a geopolymeric gel, which spreads and hardens in a three-dimensional space. such layers of the geopolymeric gel cover and bind the sand particles. the increase in the strength that occurred with the sand mixture is related to the ability of the geopolymer to resist the applied stress. it is clear that the benefit of the geopolymer depends mainly on the binding effect, which can improve the performance of the composite samples in terms of decreasing the deformation and increasing the strength. the benefits of adding the ggbfs and the activator to the strength of sand can be explained by their role as a binding agent and a cushioning material. the geopolymer acts as a binding agent and provides a significant cohesion to the sand particles. from this conceptual standpoint, the inclusion of this binding agent provides a better bonding between the granular particles of sand, producing a bonded, stable composite. this suggests a better performance of the sand-based geopolymer under different types of stresses.

figure 5. variation of the maximum drained triaxial strength against ggbfs content for different activator ratios.
figure 6. variation of the maximum drained triaxial strength against activator ratio for different contents of ggbfs.

table 1. strength parameters of the sand-based geopolymer (for untreated sand, c = 0 kpa, ϕ = 46°).

              activator/ggbfs = 0.2    0.4               0.6               0.8
ggbfs (%)     c (kpa)  ϕ (°)     c (kpa)  ϕ (°)    c (kpa)  ϕ (°)    c (kpa)  ϕ (°)
10            142.7    53.2      172.2    54.6     164.7    52.4     131.6    52.4
20            167.5    56.4      215.7    56.8     207.7    53.3     141.8    52.3
30            189.9    56.3      256.3    58.5     243.6    54.8     196.6    53.7
40            192.0    57.5      284.7    58.8     280.2    54.5     203.7    53.6
50            191.9    57.1      297.2    59.5     271.9    56.2     222.9    53.8
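table 1 reports mohr-coulomb parameters, so the drained shear strength on the failure plane follows as τ = c + σn·tan ϕ. a minimal sketch using two entries of the table (the normal stress of 100 kpa is an arbitrary example value, not a test condition from the paper):

```python
import math

def shear_strength_kpa(c_kpa: float, phi_deg: float, sigma_n_kpa: float) -> float:
    """mohr-coulomb shear strength: tau = c + sigma_n * tan(phi)."""
    return c_kpa + sigma_n_kpa * math.tan(math.radians(phi_deg))

sigma_n = 100.0  # example normal stress [kpa]
# untreated sand (c = 0 kpa, phi = 46 deg) vs. 40 % ggbfs at activator ratio 0.4 (table 1)
print(round(shear_strength_kpa(0.0, 46.0, sigma_n), 1))    # -> 103.6 kpa
print(round(shear_strength_kpa(284.7, 58.8, sigma_n), 1))  # -> 449.8 kpa
```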
4. conclusions
in this study, a series of unconfined compression strength and triaxial tests were conducted on samples of sand and of the sand-ggbfs based geopolymer composite material. the following conclusions can be drawn:

(a) the geopolymerisation process results in dissolving si and al, forming a semi-firm aluminium silicate hydrate gel, which finally turns into a crystal framework gel. this gel can fill a high percentage of the pores that exist in the untreated sand. therefore, the maximum dry unit weight of the samples increased as the ggbfs content increased to 20 %, and then decreased. however, the difference in the maximum dry unit weight due to the variation in the activator content was insignificant.

(b) the structured aluminosilicate polymerized gel acts as a binding agent and provides a significant cohesion and a better bonding between the discrete particles of sand, producing a bonded, stable composite. therefore, the inclusion of the ggbfs resulted in a significant increase in the unconfined compression strength, cohesion and friction angle for all activator contents. the drained triaxial strength of sand increased by 2.7, 3.8, 6.2, 7.5 and 9.9 times as the ggbfs content increased from 0 % to 10, 20, 30, 40 and 50 % respectively. accordingly, such a geopolymer material can be an important technique in many geotechnical applications. however, due to the limited time and other constraints, there are a number of aspects (such as curing time, permeability, durability, dynamic properties) that require further research.

references
[1] a. bosoaga, o. mašek, j. e. oakey. co2 capture technologies for cement industry. energy procedia 1(1):133–140, 2009 (greenhouse gas control technologies 9). doi:10.1016/j.egypro.2009.01.020.
[2] h. fatehi, s. m. abtahi, h. hashemolhosseini, s. m. hejazi. a novel study on using protein based biopolymers in soil strengthening. construction and building materials 167:813–821, 2018. doi:10.1016/j.conbuildmat.2018.02.028.
[3] a. mosallanejad, h. taghvaei, s. m. mirsoleimani-azizi, et al. plasma upgrading of 4-methylanisole: a novel approach for hydrodeoxygenation of bio-oil without using a hydrogen source. chemical engineering research and design 121:113–124, 2017. doi:10.1016/j.cherd.2017.03.011.
[4] c. shi, p. krivenko, d. roy. alkali-activated cements and concretes. taylor & francis publishing, 2006.
[5] s. rios, n. cristelo, a. da fonseca, c. ferreira. structural performance of alkali-activated soil ash versus soil cement. journal of materials in civil engineering 28:04015125, 2015. doi:10.1061/(asce)mt.1943-5533.0001398.
[6] n. cruz, s. rios, e. fortunato, et al. characterization of soil treated with alkali-activated cement in large-scale specimens. geotechnical testing journal 40:618–629, 2017. doi:10.1520/gtj20160211.
[7] m. corrêa-silva, n. araújo, n. cristelo, et al. improvement of a clayey soil with alkali activated low-calcium fly ash for transport infrastructures applications. road materials and pavement design pp. 1–15, 2018. doi:10.1080/14680629.2018.1473286.
[8] a. h. al-rkaby, a. chegenizadeh, h. nikraz. cyclic behavior of reinforced sand under principal stress rotation. journal of rock mechanics and geotechnical engineering 9(4):585–598, 2017. doi:10.1016/j.jrmge.2017.03.010.
[9] a. h. al-rkaby, a. chegenizadeh, h. nikraz. anisotropic strength of large scale geogrid-reinforced sand: experimental study. soils and foundations 57(4):557–574, 2017. doi:10.1016/j.sandf.2017.03.008.
[10] a. al-rkaby, a. chegenizadeh, h. r. nikraz. an experimental study on the cyclic settlement of sand and cemented sand under different inclinations of the bedding angle and loading amplitudes. european journal of environmental and civil engineering pp. 1–16, 2017. doi:10.1080/19648189.2017.1327891.
[11] a. al-rkaby, a. chegenizadeh, h. nikraz. effect of anisotropy on the bearing capacity and deformation of sand. australian geomechanics journal 52:53–63, 2017.
[12] j.-b. m. dassekpo, x. zha, j. zhan. compressive strength performance of geopolymer paste derived from completely decomposed granite (cdg) and partial fly ash replacement. construction and building materials 138:195–203, 2017. doi:10.1016/j.conbuildmat.2017.01.133.
[13] s. rios, c. ramos, a. v. da fonseca, et al. colombian soil stabilized with geopolymers for low cost roads. procedia engineering 143:1392–1400, 2016 (advances in transportation geotechnics iii). doi:10.1016/j.proeng.2016.06.164.
[14] h. m. alanazi. explore accelerated pcc pavement repairs using metakaolin-based geopolymer concrete. ph.d. thesis, north dakota state university, 2015.
[15] n. latifi, f. vahedifard, e. ghazanfari, et al. sustainable improvement of clays using low-carbon nontraditional additive. international journal of geomechanics 18:04017162, 2018. doi:10.1061/(asce)gm.1943-5622.0001086.
[16] d. v. reddy, j.-b. edouard, k. sobhan. durability of fly ash-based geopolymer structural concrete in the marine environment. journal of materials in civil engineering 25:781–787, 2013. doi:10.1061/(asce)mt.1943-5533.0000632.
[17] s. h. sanni, r. b. khadiranaikar. performance of geopolymer concrete under severe environmental conditions. international journal of civil and structural engineering 3(2):396–407, 2012.
[18] m. a. m. ariffin, m. a. r. bhutta, m. w. hussin, et al. sulfuric acid resistance of blended ash geopolymer concrete. construction and building materials 43:80–86, 2013. doi:10.1016/j.conbuildmat.2013.01.018.
[19] s. rios, c. ramos, a. v. da fonseca, et al. mechanical and durability properties of a soil stabilised with an alkali-activated cement. european journal of environmental and civil engineering 23(2):245–267, 2017. doi:10.1080/19648189.2016.1275987.
[20] e. vitale, g. russo, g. dell'agli, et al. mechanical behaviour of soil improved by alkali activated binders. environments 4:80, 2017. doi:10.3390/environments4040080.
[21] p. sargent, p. hughes, m. rouainia. a new low carbon cementitious binder for stabilising weak ground conditions through deep soil mixing. soils and foundations 56(6):1021–1034, 2016. doi:10.1016/j.sandf.2016.11.007.
[22] l. abdeldjouad, a. asadi, h. nahazanan, et al. effect of clay content on soil stabilization with alkaline activation 5(1), 2019. doi:10.1007/s40891-019-0157-y.
[23] l. abdeldjouad, a. asadi, b. huat, et al. effect of curing temperature on the development of hard structure of alkali-activated soil. international journal of geomate 17:117–123, 2019. doi:10.21660/2019.60.8160.
[24] t. teing, b. huat, s. shukla, et al. effects of alkali-activated waste binder in soil stabilization. international journal of geomate 17:82–89, 2019. doi:10.21660/2019.59.8161.
[25] a. giuma elkhebu, a. zainorabidin, h. b. k. bujang, et al. alkaline activation of clayey soil using potassium hydroxide & fly ash 10:99–104, 2019. doi:10.30880/ijie.2018.10.09.016.
[26] p. sukmak, s. horpibulsuk, s.-l. shen. strength development in clay–fly ash geopolymer. construction and building materials 40:566–574, 2013 (special section on recycling wastes for use as construction materials). doi:10.1016/j.conbuildmat.2012.11.015.
acta polytechnica 57(1):8–13, 2017, doi:10.14311/ap.2017.57.0008, © czech technical university in prague, 2017, available online at http://ojs.cvut.cz/ojs/index.php/ap

determination of water content in pyrolytic tars using coulometric karl-fischer titration

lenka jílková∗, tomáš hlinčík, karel ciahotný; department of gaseous and solid fuels and air protection, university of chemistry and technology prague, technická 5, prague 166 28, czech republic; ∗ corresponding author: lenka.jilkova@vscht.cz

abstract. the liquid organic fraction of pyrolytic tar has a high energy value, which makes its utilization as an energy source possible. before such utilization, however, it is crucial to remove water from the liquid fraction, since the presence of water reduces the energy value of pyrolytic tars. water separation from the organic tar fraction is a complex process, since an emulsion can readily form. therefore, after the phase separation, it is important to know the residual water content in the organic phase and whether it is necessary to dry it further. the results presented in this manuscript focus on the water determination in liquid products from coal and biomass pyrolysis by a coulometric karl-fischer titration. the coulometric karl-fischer titration is often used for a water content determination in gaseous, liquid and solid samples; to date, however, this titration method has not been used for a water determination in tars. a new water determination method has been developed and tested on different types of tar. the coulometric karl-fischer titration is suitable for tar samples with a water content not greater than 5 wt%. the obtained experimental results indicate that the newly introduced method can be used with a very good repeatability for a water content determination in tars.

keywords: karl-fischer titration; pyrolytic tar; coulometric titration.

1. introduction
during the pyrolysis or co-pyrolysis of lignite, biomass and waste, gaseous (ch4, h2, co, co2, h2s, etc.), liquid (pyrolytic tar) and solid (coke or char) products are formed. pyrolytic tars are viscous, malodorous emulsions that contain organic compounds and water. pyrolytic tars have a low ph and are unstable; under storage conditions, among other changes, their gradual polymerization (tar aging) takes place. given their high energy value, the organic compounds present in the liquid pyrolysis fraction can be utilized. however, the energy value decreases due to the presence of water, which is, along with the organic phase, also present in the liquid fraction. based on their different densities, water can be separated from the organic phase. after the phase separation, it is important to determine the residual water content in the organic phase and whether or not it is necessary to further dry the product.
the karl-fischer titration is one option for a water content determination in liquid samples. the karl-fischer method is based on a reaction described by r. w. bunsen in 1853 [1]:
$$\mathrm{I_2 + SO_2 + 2\,H_2O \rightarrow 2\,HI + H_2SO_4}. \quad (1)$$
karl fischer discovered the possibility of applying equation (1) to the water content determination even in systems with an excess of sulphur dioxide. methanol proved to be a suitable solvent, and to neutralize the acids karl fischer used pyridine. the following two-step reaction was formulated by smith, bryant and mitchell in 1939 [2]:
$$\mathrm{I_2 + SO_2 + 3\,Py + H_2O \rightarrow 2\,[Py{\cdot}H]I + Py{\cdot}SO_3}, \quad (2)$$
$$\mathrm{Py{\cdot}SO_3 + CH_3OH \rightarrow [Py{\cdot}H]CH_3SO_4}. \quad (3)$$
if no alcohol is present in the solution, the two-step reaction looks like this:
$$\mathrm{I_2 + SO_2 + 3\,Py + H_2O \rightarrow 2\,[Py{\cdot}H]I + Py{\cdot}SO_3}, \quad (4)$$
$$\mathrm{Py{\cdot}SO_3 + H_2O \rightarrow [Py{\cdot}H]HSO_4}. \quad (5)$$
in the years 1976–1978, j. c. verhoef and e. barendrecht found out that pyridine works only as a buffer and can therefore be replaced by another base. moreover, they determined that the titration reaction rate depends on the ph of the solution [3–7]:
$$-\frac{\mathrm{d}[\mathrm{I_2}]}{\mathrm{d}t} = k\,[\mathrm{I_2}][\mathrm{SO_2}][\mathrm{H_2O}]. \quad (6)$$
it is not the sulphur dioxide itself that is oxidized by iodine in the presence of water, but the methyl sulphite anion, which is formed according to
$$\mathrm{2\,CH_3OH + SO_2 \rightarrow CH_3OH_2^+ + CH_3OSO_2^-}. \quad (7)$$
therefore, the higher the ph of the solution, the more methyl sulphite is formed and, as a result, the reaction rate of the karl-fischer titration increases. at ph 5.5–8, sulphur dioxide is present in the form of methyl sulphite. at ph values higher than 8.5, the reaction rate increases due to side reactions between iodine and hydroxide ions; during the titration, however, this leads to an unclear titration end point and a higher iodine consumption. therefore, the ph of the medium should always be in the range of 4–8. based on these findings, e. scholz developed an imidazole-based karl-fischer reagent [8–12]. given that imidazole acts as a buffering reagent at a more favourable ph, it replaced the toxic pyridine and enabled a quicker and more accurate titration:
$$\mathrm{ROH + SO_2 + 3\,RN + I_2 + H_2O \rightarrow [RNH]SO_4R + 2\,[RNH]I}. \quad (8)$$

1.1. volumetric and coulometric karl-fischer titration
currently, the water determination according to karl fischer is performed by two different methods, volumetric and coulometric titration. the choice of method is determined by the water content of the sample. in the volumetric karl-fischer titration, an iodine solution is added to the sample using a motorized piston burette. the volumetric titration is suitable for samples with a high water content, in the range of 0.01 to 100 wt%. in the coulometric karl-fischer titration, iodine is generated by an electrochemical oxidation in the titration cell. this method is suitable for samples with a low or trace water content, in the range of 0.0001 to 5 wt%. the accuracy of the method depends on the sample weight and, of course, on the chosen device.

1.2. the coulometric karl-fischer titration
the water determination by the coulometric karl-fischer titration is likewise described by equation (1). in the coulometric titration, iodine is generated electrochemically by the anodic oxidation of iodide in the titration cell:
$$\mathrm{2\,I^- \rightarrow I_2 + 2\,e^-}. \quad (9)$$
a classical coulometric cell has two compartments, the anode and the cathode, separated by a diaphragm.
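the charge passed at the generator electrode translates directly into a water mass via faraday's law: by equations (1) and (9), one mole of water consumes one mole of iodine, whose generation requires two electrons, so 1 mg of water corresponds to about 10.71 c. a minimal sketch of this conversion (an illustration of the principle, not the titrator's actual firmware logic):

```python
# coulometric karl-fischer titration: convert the charge passed at the
# generator electrode into a mass of water (1 mol h2o consumes 1 mol i2,
# and generating 1 mol i2 by equation (9) takes 2 mol of electrons)
F = 96485.332   # faraday constant [c/mol]
M_H2O = 18.015  # molar mass of water [g/mol]

def water_mass_mg(charge_c: float) -> float:
    """water mass in mg from the charge in coulombs: m = q * M / (2 F)."""
    return charge_c * M_H2O / (2.0 * F) * 1000.0

print(round(water_mass_mg(10.712), 3))  # -> 1.0 mg, i.e. about 10.71 c per mg of water
```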
the anode compartment is filled with the anolyte solution. the anolyte is a karl-fischer electrolyte consisting of sulphur dioxide, imidazole and iodide salts. methanol or ethanol can be used as a solvent. by applying a current at the generator electrode, iodine is produced by the anodic oxidation of iodide. the cathode compartment is filled with the catholyte solution, in which the final phase of the reaction, i.e. the reduction, takes place. the composition of the catholyte solution is specific to the manufacturer; however, it always contains a reagent, which can be the same as the one in the anolyte solution. in the anode compartment, iodine is formed from iodide: iodide ions transfer electrons to the anode to form iodine, as depicted in equation (9). the formed iodine subsequently reacts with water. in the cathode compartment, hydrogen cations are reduced to molecular hydrogen (hydrogen is the main product):
$$\mathrm{2\,[RN]H^+ + 2\,e^- \rightarrow H_2 + 2\,RN}. \quad (10)$$

1.3. the water content determination
currently, many studies dealing with the water determination by the karl-fischer titration for food industry applications are available [13–22]. for instance, felgner a. et al. [13] investigated the possibility of applying an automated titration to the analysis of edible oils. other studies address the issue of the water determination in dairy products [14], honey [15, 17, 20], inulin [16], lactose [18] and flour [19]. h. wang et al. [23] state in their work on coulometric and volumetric titration that, when determining the water content in octanol, the difference between the methods can be up to one-tenth of a percent (an optimized coulometric method achieved a recovery of 99.76 %; the relative bias between coulometry and volumetry was 0.06 %). another study [24], on a water content determination in cyclodextrins by the karl-fischer titration and thermal methods, states that only minimal differences were found in the results. a series of works has either directly addressed the issue of the water determination in pyrolytic oils (or just oils) by the karl-fischer titration, or has included it in studies of other phenomena [25–31]. choi et al. [25] determined a wide spectrum of properties of red oak-derived pyrolytic oil, and the water content was determined by the karl-fischer titration. mourant et al. [26] examined the effect of temperature on the yields and properties of mallee bark pyrolytic bio-oil, and the water content in the various phases was determined by the karl-fischer titration. jansri et al. [27] used the karl-fischer titration for the determination of the water content in methyl ester made from mixed crude palm oil. oasmaa and meier [28] summarized norms and standards for pyrolysis liquids and recommended the karl-fischer titration for the water content determination according to the astm d1744. meesuk et al. [29] examined the effect of temperature on the product yields and the composition of rice husk pyrolytic bio-oil, and the water content in the bio-oil was determined by the karl-fischer titration. smets et al. [30] compared the karl-fischer titration, the gc/ms-corrected azeotropic distillation and 1h nmr spectroscopy. all three methods had comparable results and were free of interferences; only for samples with a very high water content (> 50 wt%) did the spectroscopy give an underestimation in comparison with the titration and the distillation. he et al. [31] examined the pyrolysis of mallee leaves in a fluidised-bed reactor, and the water content in the resulting bio-oil
was determined by the karl-fischer titration. however, in all currently available works, the volumetric titration was used exclusively. the volumetric karl-fischer titration is used as a standard technique for the water content determination in a lot of products (astm e203). pyrolytic oil contains compounds such as some organic acids, amines, aldehydes and ketones, which might cause interferences. for the water content determination, it is also possible to use the azeotropic distillation. this method is used for various types of samples (herbs, spices and petroleum products), but it is not recommended for pyrolytic oils due to the interference of water-soluble volatile organic compounds [30].

2. material and methods
in this paper, the water content determination in pyrolytic tars of two origins was carried out. one tar was formed by the pyrolysis of brown coal, while the other was formed by the co-pyrolysis of a brown coal and biomass (extracted rapeseed meal) mixture at a weight ratio of 2:1. the water content was determined on the mettler toledo c30 titrator equipped with the mettler toledo do308 drying oven. the standard deviation was calculated for each series of measurements in order to assess the repeatability.

2.1. the mettler toledo c30 titrator equipped with the mettler toledo do308 drying oven
the titrator can be used separately, if the sample is injected directly into the anolyte in the titration cell (see chapter 3). in order to avoid the reaction of the compounds present in the tar with the karl-fischer reagent, the drying oven was a part of the equipment. the sample was placed in the drying oven, which was connected with the titration cell via a tube. once the drying oven was heated to the programmed temperature, the sample was placed in the oven sample tube. the evaporated water was transferred into the titration cell by means of an inert gas (nitrogen) stream. a schema of the equipment is shown in figure 1.

figure 1. a schema of the equipment (1: silica gel column with moisture indicator cocl2, 2: molecular sieve column ms 3a, 3: sample tube of the drying oven, 4: sampler containing the sample, 5: titration cell, 6: working electrode, 7: generator electrode, 8: electrolyte).

the sigma-aldrich hydranal® water standard kf-oven (composition: lactose monohydrate) for temperatures of 140–160 °c, which contained 4.9–5.2 wt% of water, was measured several times in order to verify the accuracy of the measurements carried out with the titrator. the recommended standard weight for a measurement is 50–100 mg. since tars contain aldehydes and ketones, it was necessary to choose a suitable karl-fischer reagent to prevent the interferences of these compounds. hydranal®-coulomat ak (sigma-aldrich; composition: 2-methoxyethanol, chloroform, 2,2,2-trifluoroethanol, imidazole) was used as the anolyte, while hydranal®-coulomat cg-k (sigma-aldrich; composition: n-methylformamide, tetrahydrofurfuryl alcohol, imidazole, sulphur dioxide) was used as the catholyte. both karl-fischer reagents are suitable for a water determination in samples containing the aforementioned compounds.

2.2. measurement
the water content for each analyzed sample (standard and tars) was determined on the titrator in the same way. the sample weight was 0.10–0.15 g. nitrogen of 6.0 purity grade was used as the inert gas.
it should be noted that, before entering the system, the nitrogen was dried using silica gel with the moisture indicator cocl2 (first column) and a molecular sieve ms 3a (second column). the sample was placed into the drying oven by a dropper when the oven was heated to the tested temperature. the dm 143 sc electrode was used for the titration. the "mixing time" was set to 60 seconds (the starting time of the measurement, i.e. the insertion of the sample into the sample tube), the titration itself took place for 900 seconds, and the "delay time" was set to 60 seconds (60 seconds from reaching the first end point to the moment of the titration ending).

2.3. samples
the water content was determined in pyrolytic tars. the first tar sample (hereinafter referred to as tar 1) was formed by the pyrolysis of brown coal. the second tar sample (hereinafter referred to as tar 2) was formed by the co-pyrolysis of brown coal and biomass (extracted rapeseed meal) in the 2:1 weight ratio. in both cases, the pyrolysis was carried out in the same manner. the input material weight was 15 kg (grain size 1–3 mm). the pyrolysis batch reactor was heated for 4 hours to reach the temperature of 650 °c, which was then kept constant for the next 2 hours. nitrogen of 4.0 purity grade was used as the inert gas. afterwards, the pyrolysis was stopped and the reactor was left to cool to room temperature. the organic and aqueous phases of the collected liquid condensate were separated by the gravitational method (the organic phase on top and the aqueous phase at the bottom). the content of the remaining water in the organic phase was determined by the karl-fischer method (tar 1 and 2). then a gc/ms analysis was performed, showing that both tars contained mainly aromatic hydrocarbons, namely benzene, naphthalene and their derivatives, as well as oxygenated hydrocarbons, particularly phenol and its derivatives. the gc/ms analysis showed that the samples contained, among other substances, aldehydes and ketones, which affected the choice of the standard and the reagents for the karl-fischer titration.

3. results and discussion
3.1. water content determination in standards
in order to determine the accuracy of the titrator, the water content of the standard was determined prior to the sample analysis. the results of the analyses, with their repeatability, are depicted in figure 2 (one analysis corresponds to one column). the resulting value of the standard water content was 50.70 ± 0.35 mg g−1.

figure 2. the determination of the water content in the standard.

3.2. water content determination in tar from brown coal pyrolysis
prior to the water content determination in tar 1, a suitable temperature of the drying oven was established so that the sample would release all of its water (see figure 3). at 140 °c, the water content was higher than at 130 °c (by 1–2 mg g−1, which is more than 2 %). at 150 °c, volatile compounds started to be released (in the order of milligrams) and condensed in the tube between the drying oven and the titration cell. therefore, 140 °c was chosen for the water content determination in tar 1. the results of ten further consecutive analyses, with their repeatability, are depicted in figure 4 (one analysis corresponds to one column). the resulting value of the water content in tar 1 was 93.08 ± 0.17 mg g−1.

figure 3. the determination of the water content in tar from the pyrolysis of brown coal: identification of the optimal temperature.
figure 4. the determination of the water content in tar from the pyrolysis of brown coal.
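the reported value of 93.08 ± 0.17 mg g−1 is the mean of the repeated analyses together with their standard deviation; a minimal sketch of such an evaluation follows (the ten replicate values below are hypothetical placeholders with a similar spread, not the measured series):

```python
import statistics

# hypothetical replicate water contents [mg/g] for one tar sample
replicates = [93.2, 92.9, 93.1, 93.0, 93.3, 92.9, 93.1, 93.2, 93.0, 93.1]

mean = statistics.mean(replicates)
std = statistics.pstdev(replicates)  # population std; the paper does not state which form was used
print(f"{mean:.2f} +/- {std:.2f} mg/g")  # -> 93.08 +/- 0.12 mg/g for these placeholder values
```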
3.3. water content determination in tar from co-pyrolysis of brown coal and biomass
also for tar 2, before the water content determination was carried out, a suitable temperature of the drying oven was established so that the sample would release all of its water (see figure 5). at 140 °c, the water content was higher than at 130 °c (by up to 2.3 mg g−1, which is more than 2 %). at 150 °c, volatile compounds started to be released (in the order of milligrams) and condensed in the tube between the drying oven and the titration cell. therefore, 140 °c was chosen for the water content determination in tar 2. the results of ten further consecutive analyses, with their repeatability, are depicted in figure 6 (one analysis corresponds to one column). the resulting value of the water content in tar 2 was 103.96 ± 0.24 mg g−1.

figure 5. the determination of the water content in tar from the co-pyrolysis of brown coal and biomass: identification of the optimal temperature.
figure 6. the determination of the water content in tar from the co-pyrolysis of brown coal and biomass.

4. conclusions
from the data acquired during the water content determination in the standard, it was established that the titrator measures correct values. the measured water content in the standard was 50.7 mg g−1 (the guaranteed water content in the standard is 49–52 mg g−1). prior to the sample analysis, the drying oven temperature had to be established so that the sample would release all of its water while preventing the release and eventual condensation of volatile compounds. for both tar samples, the drying oven temperature was set at 140 °c. the water content in the tar from the brown coal pyrolysis was 93.08 mg g−1 with a standard deviation of 0.173. the water content in the tar from the brown coal and biomass co-pyrolysis was 103.96 mg g−1 with a standard deviation of 0.237. the coulometric karl-fischer titration is suitable for samples with a water content of up to 5 wt%; the water content of the samples analysed within the frame of this work is considerably higher. therefore, the volumetric titration would be more suitable for the analysis of such samples, as mentioned above. however, it was shown that even for samples with a water content higher than 5 %, the coulometric titration can be applied with a very good repeatability.

references
[1] r. w. bunsen. liebigs ann. chem. 86, 265, 1853.
[2] smith, d. m.; et al. analytical procedures employing karl fischer reagent i. nature of the reagent. j. am. chem. soc. 1939, 61, 2407–2412.
[3] verhoef, j. c.; barendrecht, e. mechanism and reaction rate of the karl fischer titration reaction: part i. potentiometric measurements. j. electroanal. chem. 1976, 71, 305–315.
[4] verhoef, j. c.; barendrecht, e. mechanism and reaction rate of the karl fischer titration reaction: part ii. rotating ring disk electrode measurements. j. electroanal. chem. 1977, 75, 705–717.
[5] verhoef, j. c.; barendrecht, e. mechanism and reaction rate of the karl fischer titration reaction: part v. analytical implications. anal. chim. acta 1977, 94, 395–403.
[6] verhoef, j. c.; kok, w. t.; barendrecht, e. mechanism and reaction rate of the karl fischer titration reaction: part iii. rotating ring-disk electrode measurement comparison with the aqueous system. j. electroanal. chem. 1978, 86, 407–415.
[7] verhoef, j. c.; cofino, w. p.; barendrecht, e. mechanism and reaction rate of the karl fischer titration reaction: part iv. first and second order catalytic currents at a rotating disk electrode. j. electroanal. chem. 1978, 93, 75–80.
[8] scholz, e. karl fischer reagentien ohne pyridin. fresenius' z. anal. chem. 1980, 303, 203–207.
[9] scholz, e. karl fischer reagentien ohne pyridin. genauigkeit der wasserbestimmung. fresenius' z. anal. chem. 1981, 306, 394–396.
[10] scholz, e. karl fischer reagentien ohne pyridin. einkomponenten reagentien. fresenius' z. anal. chem. 1981, 309, 30–32.
[11] scholz, e. karl fischer reagentien ohne pyridin. neue eichsubstanzen. fresenius' z. anal. chem. 1981, 309, 123–125.
[12] scholz, e. karl fischer reagentien ohne pyridin. zweikomponenten reagentien mit imidazol. fresenius' z. anal. chem. 1982, 312, 460–464.
[13] felgner, a.; et al. automated karl fischer titration for liquid samples – water determination in edible oils. food chem. 2008, 106, 1379–1384. doi:10.1016/j.foodchem.2007.04.078.
[14] merkh, g.; pfaff, r.; isengard, h. capabilities of automated karl fischer titration combined with gas extraction for water determination in selected dairy products. food chem. 2012, 132, 1736–1740. doi:10.1016/j.foodchem.2011.11.001.
[15] sanchez, v.; et al. comparison between karl fischer and refractometric method for determination of water content in honey. food control 2010, 21, 339–341. doi:10.1016/j.foodcont.2008.08.022.
[16] ronkart, s. n.; et al. determination of total water content in inulin using the volumetric karl fischer titration. talanta 2006, 70, 1006–1010. doi:10.1016/j.talanta.2006.02.024.
[17] morgano, m. a.; et al. determination of water content in brazilian honeybee-collected pollen by karl fischer titration. food control 2011, 22, 1604–1608. doi:10.1016/j.foodcont.2011.03.016.
[18] isengard, h.; haschka, e.; merkh, g. development of a method for water determination in lactose. food chem. 2012, 132, 1660–1663. doi:10.1016/j.foodchem.2011.04.100.
[19] hadaruga, d. i.; et al. differentiation of rye and wheat flour as well as mixtures by using the kinetics of karl fischer water titration. food chem. 2016, 195, 49–55. doi:10.1016/j.foodchem.2015.08.124.
[20] gallina, a.; stocco, n.; mutinelli, f. karl fischer titration to determine moisture in honey: a new simplified approach. food control 2010, 21, 942–944. doi:10.1016/j.foodcont.2009.11.008.
[21] ünlüsayin, m.; et al. nano-encapsulation competitiveness of omega-3 fatty acids and correlations of thermal analysis and karl fischer water titration for european anchovy (engraulis encrasicolus l.) oil/b-cyclodextrin complexes. lwt food science and technology 2016, 68, 135–144. doi:10.1016/j.lwt.2015.12.017.
[22] kestens, v.; conneely, p.; bernreuther, a. vaporisation coulometric karl fischer titration: a perfect tool for water content determination of difficult matrix reference materials. food chem. 2008, 106, 1454–1459. doi:10.1016/j.foodchem.2007.01.07.
[23] wang, h. certification of the reference material of water content in water saturated 1-octanol by karl fischer coulometry, karl fischer volumetry and quantitative nuclear magnetic resonance. food chem. 2012, 134, 2362–2366. doi:10.1016/j.foodchem.2012.04.027.
[24] hadaruga, n. g.; hadaruga, d.; isengard, h. water content of natural cyclodextrins and their essential oil complexes: a comparative study between karl fischer titration and thermal methods. food chem. 2012, 132, 1741–1748. doi:10.1016/j.foodchem.2011.11.003.
[25] choi, y. s.; et al. detailed characterization of red oak-derived pyrolysis oil: integrated use of gc, hplc, ic, gpc and karl-fischer. j. anal. appl. pyrolysis 2014, 110, 147–154. doi:10.1016/j.jaap.2014.08.016.
[26] mourant, d.; et al. effects of temperature on the yields and properties of bio-oil from the fast pyrolysis of mallee bark. fuel 2013, 108, 400–408. doi:10.1016/j.fuel.2012.12.018.
[27] jansri, s.; et al. kinetics of methyl ester production from mixed crude palm oil by using acid-alkali catalyst. fuel process. technol. 2011, 92, 1543–1548. doi:10.1016/j.fuproc.2011.03.017.
[28] oasmaa, a.; meier, d. norms and standards for fast pyrolysis liquids 1. round robin test. j. anal. appl. pyrolysis 2005, 73, 323–334. doi:10.1016/j.jaap.2005.03.003.
[29] meesuk, s.; et al. the effects of temperature on product yields and composition of bio-oils in hydropyrolysis of rice husk using nickel-loaded brown coal char catalyst. j. anal. appl. pyrolysis 2012, 94, 238–245. doi:10.1016/j.jaap.2011.12.011.
[30] smets, k.; et al. water content of pyrolysis oil: comparison between karl fischer titration, gc/ms-corrected azeotropic distillation and 1h nmr spectroscopy. j. anal. appl. pyrolysis 2011, 90, 100–105. doi:10.1016/j.jaap.2010.10.010.
[31] he, m.; et al. yield and properties of bio-oil from the pyrolysis of mallee leaves in a fluidised-bed reactor. fuel 2012, 102, 506–513. doi:10.1016/j.fuel.2012.07.003.

acta polytechnica 56(6):455–461, 2016, doi:10.14311/ap.2016.56.0455, © czech technical university in prague, 2016, available online at http://ojs.cvut.cz/ojs/index.php/ap

finite element modelling and simulation of vehicle impact on steel safety barriers

riassa likhonina∗, michal micka; czech technical university in prague, faculty of transportation sciences, prague, czech republic; ∗ corresponding author: likhorai@fd.cvut.cz

abstract. this paper deals with an fea simulation of a vehicle crash with steel safety barriers in ansys ls-dyna® 15.0. two types of safety barriers are used: the jsnh4/h2 and the jsam-2/h2.
a geometrical model of the barrier was created in the ansys® workbench™ 15.0 modeler and then transferred into ls-dyna® 15.0 to complete the crash test simulation. after a computation in the ansys ls-dyna® 15.0 solver, the results of the simulation, such as the impact forces, the body displacement and the integral energy, were analyzed.

keywords: finite element method; crash simulation; ansys® workbench™; ls-dyna®.

1. introduction
currently, all safety barriers for a permanent and repeated use on the roads have to be tested in real crash tests. the methods, all corresponding specifications of the preparation and performance of crash tests, and the evaluation of the obtained results are specified in the čsn en standards [9]. a numerical simulation by the finite element method (fem) is commonly and widely used in the automotive industry [10]. it can distinctly help, for example, to define the deformation areas of cars and their safety elements, which would eliminate fatal injuries when accidents take place. according to the relevant standards, each type of a safety barrier is required to be tested by several types of crash tests. moreover, it should be considered that, during real crash tests, it is not possible to take into account all the parameters and situations that could appear; it is well known that some deviations will always appear. therefore, an fea simulation with highly sophisticated fem programs can present a very powerful tool and a great support for performing real crash tests, thus contributing to a correct and accurate design and construction of safety barriers. due to the advantages the fea simulation can bring, the interest in this method is increasing. a large number of publications, diverse in their subjects of investigation and main objectives, exists in this field. some of the papers are devoted to the testing of new types of safety barriers, which are made of new materials and are not used on real roads yet [15, 21]. other works describe simulations of impacts on bridge barriers or even on rockfall barriers [20]. one of the earliest publications in this area was a work by gruber k., herrmann m. and pitzer m., where, already in 1991, the authors described a computer simulation of a side impact; for these purposes, they used different types of mobile barriers. borovinšek m. in his work uses a simulation for designing an optimal reinforcement of a crash barrier commonly used in slovenia [6]. another example of an application of the fea simulation is a publication by ulker m. b. c. and rahman m. s., "traffic barriers under vehicular impact from computer simulation to design guidelines", where the problem of a vehicular impact on a portable concrete barrier is investigated [19]. borkowski w., hryciów z., rybak p., wysocki j. and wisniewski a. use the fea simulation to evaluate the effectiveness of an innovative road safety system prototype that has a body made of plastic filled with reinforced concrete [5]. in another interesting work, written by ing. j. drozda and doc. ing. t. rotter, the numerical analysis of designing and testing of bridge barriers is discussed [11]. in his other work, "methodology of validation of fe model for simulations of real crash tests", drozda j. tries to determine a method that leads to the creation of a validated finite element model before performing the full-scale crash test [10].
similar investigations using the fea simulation are made all around the world: in italy [4], in korea [15], in china [14], in the usa [7, 8], in romania [12] and so on. the authors of such publications have come to a similar conclusion: the fea simulation is a very helpful tool for the design and construction as well as for the testing of crash barriers. in the presented work, the fem programs ansys® workbench™ 15.0, ls-dyna® 15.0 and ls-prepost® 4.1 are used for the fea simulation of the static and dynamic tests of barriers.

2. steel safety barriers
the subject of the presented work is road restraint systems, or, to be more precise, steel safety barriers. from a large number of different steel safety barriers, two types have been used for the analysis (both made by the same manufacturer): jsnh4/h2 and jsam-2/h2. the latter is a new type, which claims to have several benefits compared with the jsnh4/h2. the goal of the simulation was to create a model of the safety barrier jsam-2/h2 in the ansys® workbench™ 15.0 and to set the parameters of both models in ls-prepost® 4.1: defining the deformation curves and impact conditions (velocity, vehicle density, etc.), setting the material characteristics for the separate parts of the crash barriers, modifying the contacts between the corresponding parts, etc. this article shows the correlation of the fea results with the literature test data results [16]. in the czech republic, crash barriers can only be used on the roads if they comply with the requirements specified in tp 114 [18]. compliance with these requirements is verified by crash tests according to čsn en 1317-2 [9]. on the basis of these crash tests, the final containment level is determined. for a majority of the containment levels, there should be two or more types of crash tests. this is valid for "approved" crash barriers. for "other" crash barriers, a calculation is used; however, it is not common, for several reasons. the most important of these reasons is the fact that calculations are not able to provide a sufficient basis for an assessment of the complex suitability of a certain crash barrier as a road restraint system and, thus, to forecast the "impact acceptability", because the "impact acceptability" is a combination of many factors. another reason why calculations are not often used is the lack of methodological guidelines, recommendations and documents dealing with this issue. however, for bridge crash barriers, the calculations are allowed; moreover, they are, in accordance with the corresponding standards, recommended [17]. on the other hand, the implementation and evaluation of crash tests, although strictly prescribed for some cases and situations, is expensive and demanding on the equipment of the laboratory. moreover, the application of the results of crash tests in the prediction of how the barrier will behave in practice, i.e. on roads and during certain impacts, is contentious. therefore, emphasizing the advantages and disadvantages of both methods, it could be shown that an fea simulation can be a promising method for forecasting the behavior of a crash barrier when a collision happens [17].
the calculations could be used as a "means for making the preparation of a crash test" [17]. in comparison with the "trial and error" approach, this brings substantial savings. for example, with the help of calculations, it is often possible to predict the influence of a change in the construction of the crash barriers (e.g., construction changes of connecting parts, changes of the element length, etc.) on their strength and stiffness, and then to select the appropriate parameters of the crash tests. moreover, the calculation results can be used to design and test new and completely innovative types of crash barriers. this is where the fea simulation can play a significant role, as it saves time and decreases expenses [17]. in the presented work, there is an attempt to simulate a vehicle impact on two crash barriers and to prove that the fea simulation has great prospects in this area.

3. finite element modelling and simulation
this chapter describes the construction, parameters and installation of the selected types of safety barriers, i.e., jsnh4/h2 and jsam-2/h2. the 3d models of the safety barriers jsnh4/h2 and jsam-2/h2 have been created in the ansys® workbench™ 15.0. the numerical model was created for one field of the barrier due to the size limitations of the task. a model of the safety barrier itself consists of a post, two parts of a spacer, barrier strips and lower beams. all parts of each of the geometrical models have exactly the same sizes as specified in the technical documents by the manufacturer. these elements are actually interconnected by bolts, nuts and washers specified in the technical documents by the manufacturer; in the numerical model, however, these connecting elements have been replaced with bonded face contacts. for the fea simulation purpose, models of the ground and of a vehicle, which impacts the safety barrier according to the given scenario, were created. in both cases, the ground is represented by rectangular blocks with a cylindrical concrete block as the foundation of the barrier post. the vehicle is represented by a semi-cylinder [16]. it is located in such a way that its upper edge is on the level of the upper edge of the post, and its height is sufficient to impact both the barrier strip and the lower beam. in this stage, the vehicle does not touch the crash barrier and is approximately 20 mm away from it. the problem is solved as a symmetrical one; therefore, symmetry is assumed in the plane of the loading of the model. this fact explains why a semi-cylinder is used instead of a whole cylinder for the model of the vehicle. the impact area in the case of symmetry is approximately 0.25 × 0.25 m, which is in compliance with the standards, as mentioned in the previous chapters. the last things defined in the ansys® workbench™ 15.0 were the boundary conditions. for both models, the displacement of the barrier strips and lower beams is defined using symmetry conditions. the displacement of the vehicle was specified so that only a movement in the direction perpendicular to the barrier strip was allowed; the ground is not allowed to move in any direction. as can be seen from table 1, both types of crash barriers have the same containment level, which is consistent with the crash test types performed for their validation and certification [1].
however, they differ in the materials used, as the jsam-2/h2 is made of a new micro-alloyed steel, which, according to the manufacturer, allows a reduction of the overall weight while preserving the same or even better functionality.

table 1. comparison of jsnh4/h2 and jsam-2/h2 [16].

                              jsnh4/h2                                  jsam-2/h2
manufacturer                  arcelormittal distribution solutions      arcelormittal distribution solutions
                              czech republic, s.r.o.                    czech republic, s.r.o.
type                          single-sided safety barrier               single-sided safety barrier
containment level             h2                                        h2
dynamic deflection            1.75 m                                    1.5 m
working width                 1.85 m                                    1.6 m
acceleration severity index   1.186                                     1.1
material                      s235jr, s355mc                            s235jr, s355mc, s420mc
barrier strip                 nh4: thickness 4 mm, length 4250 mm       am: thickness 2.8 mm, length 4250 mm
lower beam                    sp3: thickness 3 mm, height 214 mm,       am: thickness 2.8 mm, height 214 mm,
                              width 28 mm, length 4250 mm               width 28 mm, length 4250 mm
post                          v140: wall thickness 5 mm,                c 150 × 75 × 25: wall thickness 3.5 mm,
                              width 140 mm, length 2170 mm              width 150 mm, length 1755 mm
weight                        42.67 kg/m                                29 kg/m

as table 1 shows, the application of the new steel type enables the production of these parts of the safety barrier jsam-2/h2 with a smaller thickness. besides, the post has been changed both in terms of its cross section and its dimensions. all of this affects important parameters like the dynamic deflection, the working width and the acceleration severity index, and fundamentally improves the characteristics of the safety barrier. but the main achievement is the weight reduction of approximately 32 %, from 42.67 kg/m to 29 kg/m. this is beneficial not only in terms of costs, but also in terms of the influence on the environment; the maintenance and installation are also much easier. in the simulation, the following materials were used: s235jr for the jsnh4/h2 and s355j0 for the jsam-2/h2. their load curves used in ansys® 15.0 are shown in figure 1.

figure 1. comparison of the working curves of the s235jr and s355j0 steels used in the simulation [16].

the vehicle is modeled by a cylinder whose material is isotropic elastic (see table 2). for the ground, a material of the type geologic_cap_model was used; the parameters listed in table 3 were specified for this material. for the concrete, the standard model cscm_concrete with default parameters was used.

table 2. parameters for the specification of the vehicle material type [16].

mass density [kg/m3]:  passenger car 2.621 · 10^4, lorry 2.912 · 10^5, bus 3.785 · 10^5
young's modulus [pa]:  2.000 · 10^11
poisson's ratio [–]:   0.3

notice that the value of the mass density of the "vehicle" differs according to the type of crash test to be simulated. the mass density is calculated from the weight of the car and the dimensions of the car model according to the following formula:
$$\rho = \frac{m}{V}, \qquad V = \frac{\pi r^2 h}{2},$$
where ρ is the mass density [kg/m3], m is the car mass [kg], v is the volume [m3], r = 0.150 m is the radius of the cylinder and h = 0.7 m is the height of the cylinder. the same meshing properties are specified for both models: the element size is 0.20 m, the minimum edge length is 3 · 10^−3 m; the smoothing, the relevance center and the span angle center are set to medium.
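as a numeric illustration of the density formula above, the sketch below evaluates ρ for the half-cylinder with r = 0.150 m and h = 0.7 m from the text. the 900 kg mass is an assumption for illustration only; the densities actually used in the simulation are those listed in table 2.

```python
import math

R, H = 0.150, 0.7  # cylinder radius and height [m] from the text

def vehicle_density(mass_kg: float) -> float:
    """rho = m / V with V = pi * r^2 * h / 2 (half cylinder due to symmetry)."""
    volume = math.pi * R**2 * H / 2.0
    return mass_kg / volume

print(round(vehicle_density(900.0)))  # -> 36378 kg/m^3 for an assumed 900 kg mass
```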
4. results and discussion

the numerical model was transferred from ansys® workbench™ 15.0 to ansys ls-dyna® 15.0 and solved; the results were evaluated in the program ls-prepost® 4.1. the following figures show the most important results of the fea simulation of the crash test tb11 performed for the jsam-2/h2. the graph of the vehicle kinetic energy (see figure 2) shows that its value gradually decreases during the impact and in the end reaches 1.7735 knm. its maximum value, at the beginning of the impact, is 18.288 knm, i.e., 36.576 knm when the symmetry of the task is accounted for. the vehicle velocity decreases as well and reaches almost zero (see figure 3); its initial value of 9.5006 m/s is set before the simulation and corresponds to the crash speed specified in the standards [9]. figure 4 illustrates the safety barrier deformation at the time step of 0.125 s. the most interesting and informative graph shows the internal energy of the construction (see figure 5): the maximum value is 57.611 knm, i.e., 115.222 knm for the symmetrical task. looking at the internal energy at the time step of 0.125 s, as was done for the jsnh4/h2, its value equals 44 knm, i.e., 88 knm for the symmetrical task. these values comfortably exceed both the value specified in tp 114 (40.600 knm) and the kinetic energy obtained during the simulation (36.576 knm). it can therefore be stated that the fea simulation of the crash test tb11 for the jsam-2/h2 was successfully performed according to both criteria: the consumed energy and the maximum displacement value at a given time step. the next graph of interest illustrates the acceleration of the vehicle (see figure 6). the maximum acceleration reaches 92.044 m/s²; at the time step of 0.125 s it equals 42.500 m/s². the impact forces corresponding to these accelerations are 82.8396 kn and 38.250 kn, respectively. note that tp 101 specifies the following alternative load forces for the crash test tb11: 15–35 kn for a crash barrier with a dynamic deflection of 1.500–2.500 m and 35–80 kn for a crash barrier with a dynamic deflection of 0.100–0.500 m [17]. the impact forces obtained in the fea simulation thus match or even exceed the alternative load forces described in tp 101 [17].
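the quoted impact forces follow from the accelerations via f = m·a. the 900 kg mass used below is not stated in this section; it is inferred from the ratio 82.8396 kn / 92.044 m/s² and matches the usual tb11 passenger-car mass, so treat it as an assumption.

```python
# sketch: back-computes the impact forces quoted above via F = m*a.
# the 900 kg vehicle mass is inferred from the force/acceleration ratio
# given in the text (82.8396 kN / 92.044 m/s^2 = 900 kg); it is an
# assumption, not a value stated in this section.
m = 900.0                               # inferred vehicle mass [kg]
for a in (92.044, 42.500):              # accelerations from the simulation [m/s^2]
    F = m * a / 1000.0                  # impact force [kN]
    print(f"a = {a:7.3f} m/s^2  ->  F = m*a = {F:.4f} kN")
# prints 82.8396 kN and 38.2500 kN, the two values quoted in the text
```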
4.1. comparison of fea results for jsnh4/h2 and jsam-2/h2

comparing the two types of crash barrier, it can be noticed that the impact forces which the safety barrier can resist are higher for jsnh4/h2 in all of the crash tests (see table 4). this is explained by the higher stiffness of jsnh4/h2, which follows from its construction dimensions.

table 4. comparison of jsnh4/h2 and jsam-2/h2: impact forces [16].

| crash test | jsnh4/h2 impact force [kn] | jsam-2/h2 impact force [kn] |
| tb11 | 74.250–105.183 | 38.250–82.840 |
| tb42 | 130–183.440 | 45–157.560 |
| tb51 | 104–218.978 | 78–211.380 |

figure 2. kinetic energy of the vehicle: jsam-2/h2, tb11 [16].
figure 3. vehicle velocity: jsam-2/h2, tb11 [16].
figure 4. safety barrier deformation at the time step of 0.125 s: jsam-2/h2, tb11 [16].
figure 5. internal energy: jsam-2/h2, tb11 [16].
figure 6. vehicle acceleration: jsam-2/h2, tb11 [16].

there is also a difference in the internal energy of the two types, presented in table 5.

table 5. comparison of jsnh4/h2 and jsam-2/h2: internal energy [16].

| crash test | jsnh4/h2 internal energy [knm] | jsam-2/h2 internal energy [knm] |
| tb11 | 43 | 88 |
| tb42 | 135 | 120 |
| tb51 | max. 233.760 | max. 245.050 |

the internal energy in tb11 is approximately twice as high for jsam-2/h2 as for jsnh4/h2. from this it can be concluded that jsam-2/h2 absorbs the kinetic energy of an impacting vehicle better, and is therefore safer for the passengers. this is also supported by the asi index specified in the technical documentation, which is lower for jsam-2/h2: 1.1 compared with 1.186 [2, 3]. similarly, the internal energy obtained during tb51 is higher for jsam-2/h2 than for jsnh4/h2; during tb42, however, it is slightly lower.

4.2. correlation of results

table 6 summarizes the kinetic energies prescribed for real crash tests and the alternative load forces used in the calculations for the crash test types tb11, tb42 and tb51, as specified in the technical standards [9, 17], together with the corresponding simulation results.

table 6. correlation of fea results.

| test | alternative load force specified in [17] | impact force obtained in simulation (jsnh4 / jsam-2) | kinetic energy specified in [9] | internal energy obtained in simulation (jsnh4 / jsam-2) |
| tb11 | 15–35 kn | 74.3–105.2 kn / 38.3–82.8 kn | 40.6 knm | 43 knm / 88 knm |
| tb42 | 45–80 kn | 130–183.4 kn / 45–157.6 kn | 126.6 knm | 135 knm / 120 knm |
| tb51 | 90–140 kn | 104–219 kn / 78–211.4 kn | 287.5 knm | 233.8 knm / 245.1 knm |

according to the results obtained in the fea simulation and discussed in the previous section, both safety barriers passed the crash tests tb11 and tb42 according to the consumed-energy criterion, and the models therefore correspond to the containment level h1. according to the impact forces the barriers are able to withstand, both types passed all three crash tests, tb11, tb42 and tb51, and thus correspond to the containment level h2 specified in the manufacturer's technical documents [2, 3]. taking these facts into consideration, it is clear that, while performing the same functions, the crash barrier jsam-2/h2 has several advantages over jsnh4/h2: better energy absorption, and therefore higher safety for the vehicle occupants, and a weight lower by 32 % (29 kg/m against 42.67 kg/m) [2, 3].
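two small consistency checks on the comparison above, using only the values of tables 1 and 5:

```python
# sketch: verifies the quoted 32 % weight saving and the "approximately
# twice as high" tb11 internal energy; all inputs are table values.
w_old, w_new = 42.67, 29.0              # weight per metre [kg/m], table 1
e_old, e_new = 43.0, 88.0               # tb11 internal energy [kNm], table 5

saving = (w_old - w_new) / w_old * 100  # relative weight reduction [%]
ratio = e_new / e_old                   # jsam-2/h2 vs jsnh4/h2 energy ratio

print(f"weight reduction: {saving:.1f} %")          # ~32.0 %
print(f"tb11 internal-energy ratio: {ratio:.2f}")   # ~2.05
```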
5. conclusions

summarizing all the facts, it can be concluded that ansys® workbench™ 15.0 and ansys ls-dyna® 15.0 proved to be very good tools for the stress assessment of different constructions and of their resistance against plastic deformation. although the standards require that safety barriers be tested in real crash tests, the preparation and financing of such tests is a challenging task. ansys® has excellent tools for simulating similar tests and can be used for preliminary calculations, which help to avoid possible mistakes and wasted time. besides, it reduces costs and, most importantly, increases the efficiency of the preparation of the experimental tests and improves the final results.

references

[1] arcelormittal europe: flat products. update. zákaznický časopis, may 2014. retrieved from: http://flateurope.arcelormittal.com/repository2/about/cz_update_may14.pdf
[2] arcelormittal ostrava a.s.: jednostranné svodidlo jsnh4/h2. retrieved from: http://www.kaska.eu/uploads/2012_jsnh4-h2.pdf
[3] arcelormittal ostrava a.s.: jednostranné svodidlo jsam-2/h2. retrieved from: http://www.doznac.cz/wp-content/uploads/6_jsam-2-h2.pdf
[4] bonin g., cantisani g., loprencipe g.: computational 3d models of vehicle's crash on road safety systems. universita di roma "la sapienza", rome, italy.
[5] borkowski w., hryciów z., rybak p., wysocki j., wiśniewski a.: studies on the effectiveness of the innovative road safety system. journal of kones powertrain and transport, vol. 21, no. 2, 2014.
[6] borovinšek m., vesenjak m., ulbin m., ren z.: simulating the impact of a truck on a road-safety barrier. strojniški vestnik: journal of mechanical engineering 52(2006)2.
[7] consolazio g. r., chung h. j., gurley k. r.: impact simulation and full scale crash testing of a low profile concrete work zone barrier. department of civil and coastal engineering, university of florida, usa, 2002.
[8] consolazio g. r., chung j. h.: vehicle impact simulation for curb and barrier design. volume i: impact simulation procedures. center for advanced infrastructure & transportation, civil & environmental engineering, rutgers, the state university, piscataway, nj 08854-8014, october 1998.
[9] čsn en 1317-2 silniční záchytné systémy. část 2: svodidla a mostní svodidla. funkční třídy, kritéria přijatelnosti nárazových zkoušek a zkušební metody. ics 13.200; 93.080.30, únor 2011.
[10] drozda j.: methodology of validation of fe model for simulations of real crash tests. in: sborník semináře doktorandů katedry ocelových a dřevěných konstrukcí, praha, march 3 and october 10, 2014. katedra ocelových a dřevěných konstrukcí fsv čvut, nadace františka faltuse, pp. 43–48. isbn 978-80-01-05522-9. retrieved from: http://www.ocel-drevo.fsv.cvut.cz/nff/docs/sborniky/nff-sbornik14.pdf
[11] drozda j., rotter r.: využití numerické analýzy pro návrh a ověření mostních svodidel. centre for effective and sustainable transport infrastructure, wp3, 3.8b, 2014. retrieved from: http://www.cesti.cz/technicke_listy/tl2014/2014_wp3_tl3_8b.pdf
[12] dumitrache p.: numerical simulation of the behaviour of the safety barriers and experimental validation. dunarea de jos university of galati, engineering faculty braila, romania, 2010.
retrieved from: http://das.tuwien.ac.at/fileadmin/mediapool-das/diverse/publications/boa_siofok/files/p047.pdf
[13] gruber k., herrmann m., pitzer m.: computer simulation of side impact using different mobile barriers. doi 10.4271/910323, 1991. retrieved from: http://papers.sae.org/910323/
[14] hao h., deeks j. a., wu c.: numerical simulations of the performance of steel guardrails under vehicle impact. transactions of tianjin university, october 2008.
[15] hong k.-e., thai h.-t., kim s.-e.: numerical simulation of composite safety barrier under vehicle impact. dept. of civil & environmental engineering, sejong university, korea, 2010.
[16] likhonina r.: numerical simulation of impact with barriers (master's thesis). fd čvut, praha, 2015. 162 p.
[17] tp 101 výpočet svodidel. č.j. 26514/97-120. praha: ministerstvo dopravy a spojů, odbor pozemních komunikací, dopravoprojekt, 1998.
[18] tp 114 svodidla na pozemních komunikacích: zatížení. stanovení úrovně zadržení na pk. navrhování „jiných" svodidel. zkoušení a uvádění svodidel na trh. č.j. 148/10-910-ipk/1. praha: ministerstvo dopravy, odbor silniční infrastruktury, dopravoprojekt brno, a.s., 2010.
[19] ulker m. b. c., rahman m. s.: traffic barriers under vehicular impact: from computer simulation to design guidelines. computer-aided civil and infrastructure engineering 23(6), 2008.
[20] wu c., hao h., deeks j. a.: vehicle impact response analysis of two-rail steel rhs traffic barrier. taylor and francis group, london, 2005. isbn 90 5809 659 9.
[21] zike s., kalkins k., ozolins o.: experimental verification of simulation model of impact response tests for unsaturated polyester/gf and pp/gf composites. cmms journal, 2011. retrieved from: http://www.cmms.agh.edu.pl/abstract.php?p_id=317

solving inverse kinematics – a new approach to the extended jacobian technique

m. šoch, r. lórencz

this paper presents a brief summary of current numerical algorithms for solving the inverse kinematics problem. then a new approach based on the extended jacobian technique is compared with the current jacobian inversion method. the presented method is intended for use in the field of computer graphics for animation of articulated structures.

keywords: inverse kinematics, jacobian inversion, extended jacobian technique.

1 introduction

there are two main approaches to animating articulated structures, e.g., human bodies. the first approach, called forward kinematics, derives the movement of the structure from the movement of all its parts. the second technique, inverse kinematics, works in the opposite way: the movement of each part is determined by the movement of the complete structure. solving forward kinematics is a straightforward transformation between the status vector q (the angles between adjacent segments) and the position vector x (the position of the end point of the structure, also called the end-effector or ee). it can be written as

x = f(q), (1)

while the inverse transformation, required for solving the inverse kinematics problem, can be represented by

q = f⁻¹(x). (2)

the articulated structure is usually redundant, i.e. n > m.
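to make eq. (1) concrete, the following sketch evaluates f(q) for a planar chain of revolute joints: the end-effector position is the sum of the segment vectors, each rotated by the accumulated joint angles. the segment lengths are the test values used later in section 3; the function itself is a standard planar-chain construction, not code from the paper.

```python
import math

def forward_kinematics(q, lengths):
    """return the end-effector position (x, y) of a planar chain, eq. (1)."""
    x = y = 0.0
    angle = 0.0
    for qi, li in zip(q, lengths):
        angle += qi                      # accumulated rotation up to this joint
        x += li * math.cos(angle)
        y += li * math.sin(angle)
    return x, y

lengths = [90, 60, 90, 150, 150, 90]     # segment lengths from section 3
q = [0.0] * 6                            # straight configuration
print(forward_kinematics(q, lengths))    # -> (630.0, 0.0), the stretched chain
```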
thus the solutions to (2) are non-unique (except in degenerate cases where no solution exists at all), and even for n = m several solutions can exist. inverse kinematics algorithms therefore have to select a particular solution of (2) in the face of multiple solutions. heuristic techniques have been proposed, e.g., freezing dofs to eliminate redundancy. however, the redundant dofs are not necessarily disadvantageous, as they can be used to optimize additional constraints. hence it is useful to impose some optimization criterion g(q), usually represented by a function with a unique global optimum. there are two generic approaches to solving the inverse kinematics problem with optimization criteria. global methods find an optimal path of q with respect to the entire trajectory, usually in computationally expensive off-line calculations. in contrast, local methods, which are applicable in real time, compute only an optimal change dq of the status vector for a small displacement dx and then integrate dq to generate a complete joint-space path. resolved motion rate control [12] is one such local method. it uses the jacobian matrix

J(q) = ∂f(q)/∂q

from forward kinematics to relate the change of the status vector to a change of the end-effector:

dx = J(q) dq. (3)

this equation can be solved for a desired dx and unknown dq by taking the inverse of J(q) if it is square (m = n) and non-singular. the solving of (3) for redundant structures (n > m) is described in the following sections. there are also analytical methods, optimization-based methods, and others [1, 7].

1.1 jacobian inversion method

this method follows simply from (3) by inverting J(q). since the standard inverse cannot be applied when the jacobian is non-square, which is the general case, the moore-penrose pseudo-inverse

A⁺ = Aᵀ (A Aᵀ)⁻¹

has to be utilized, and equation (3) results in

dq = J⁺(q) dx. (4)

the pseudo-inverse can also be computed robustly by singular value decomposition [5]. an advantage of using the pseudo-inverse is the minimum-norm solution for dq [12]. liegeois [8] suggested a more general form of optimization with the pseudo-inverse, minimizing an objective function g(q):

dq = J⁺(q) dx + α (I − J⁺(q) J(q)) ∂g(q)/∂q, (5)

where I is the identity matrix and α is a positive gain constant. although this formulation requires only the gradient of g(q) to be calculated, the gain α is difficult to obtain. as a general problem, pseudo-inverse methods are numerically unstable when the system reaches a kinematic singularity [2], i.e. when the jacobian is singular. furthermore, there is still no control of the inner parts of the structure.

1.2 extended jacobian method

the extended jacobian method was originally developed by baillieul [2, 3]. there are r = n − m rows added to the jacobian, with the goal of zeroing the gradient of g(q) in the null space of the jacobian. let εi, i = 1, …, r, be the orthonormal set of vectors that span the null space of J(q), and let g(q) be the desired optimization criterion that should be minimized. equation (3) with the extended jacobian follows.
[dx; 0] = [J(q); ∂/∂q (εiᵀ ∂g(q)/∂q), i = 1, …, r] dq. (6)

hence the solution requires only a standard inverse instead of the pseudo-inverse. the optimization criterion g(q) represents a set of constraints, even though its meaning is again more useful for robotics: it takes into account some objective function (e.g., manipulability) that applies to the whole structure, instead of physically based constraints for each joint. this method was further extended in [6] from the numerical point of view.

1.3 jacobian transposition method

this method comes from the jacobian inversion method (section 1.1); its motivation is the substitution of the computationally difficult inversion by a simple transposition. this simplification is based on the virtual work principle [9], and it results in

dq = Jᵀ(q) dx. (7)

this method suffers in almost the same way as the jacobian inversion method (section 1.1): there is still no control of the internal parts of the structure, while the stability of the solution is much better.

1.4 constrained inverse kinematics problems

the most common constraints on inverse kinematics systems are the joint limits. joint constraints are usually represented by inequality constraints, and several different methods can be used to make sure a solution satisfies them.

1.4.1 the lagrange approach

the lagrange approach with lagrange multipliers will find all minima of the objective function subject to equality constraints, as stated in [7]. those minima that do not satisfy the joint limit inequalities can be discarded immediately. any new minima that arise from the inequalities must lie on the boundary of the region defined by those limits, that is, where one or more of the joint variables take extreme values [7]. therefore, by setting one qi to its lower or upper bound each time, these minima can be found by solving 2n smaller problems, each involving one less variable qi than the original.

1.4.2 introducing new variables

new variables can be introduced to transform each inequality into an equality constraint [7]. assuming the i-th joint angle qi has upper limit ui and lower limit li, 2n new variables yil and yiu are added for the i-th joint to transform each inequality into an equality constraint. the lagrange approach (section 1.4.1) can then be used to solve the original problem plus 2n new variables and 2n new equality constraints.

1.4.3 penalty function methods

another method adds penalty functions to the objective function. the algorithm looks for the minimum value of the objective function, so the penalty causes the value of the objective function to increase as the joints approach their limits. the desired result is that the objective function itself effectively prohibits joint limit violations [1]. unfortunately, penalty function methods are not numerically stable.

2 a new approach to the extended jacobian technique

we have focused on the jacobian inversion method due to its mathematical simplicity and purity, as presented in [11]. we have also appreciated the extended jacobian technique suggested by baillieul in [2, 3].
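before developing the new approach, a compact sketch of the three local update rules reviewed in sections 1.1–1.3 may help: the pseudo-inverse step (4), the null-space variant (5) and the transposition step (7), applied to a planar chain. the chain configuration, the gain α and the objective g(q) are illustrative assumptions, not values from the paper.

```python
import numpy as np

def jacobian(q, lengths):
    """analytic 2xn jacobian of the planar end-effector w.r.t. joint angles."""
    n = len(q)
    J = np.zeros((2, n))
    cum = np.cumsum(q)                       # accumulated angles
    for j in range(n):
        for i in range(j, n):
            J[0, j] -= lengths[i] * np.sin(cum[i])   # dx/dq_j
            J[1, j] += lengths[i] * np.cos(cum[i])   # dy/dq_j
    return J

lengths = np.array([90.0, 60.0, 90.0, 150.0, 150.0, 90.0])
q = np.full(6, 0.1)                          # some nonsingular configuration
dx = np.array([1.0, -0.5])                   # small desired end-effector step

J = jacobian(q, lengths)
J_pinv = np.linalg.pinv(J)                   # moore-penrose pseudo-inverse

dq_pinv = J_pinv @ dx                        # eq. (4): minimum-norm solution

grad_g = 2 * q                               # gradient of g(q) = sum(q_i^2), illustrative
alpha = 0.01                                 # positive gain, illustrative
dq_opt = dq_pinv + alpha * (np.eye(6) - J_pinv @ J) @ grad_g   # eq. (5)

dq_trans = J.T @ dx                          # eq. (7): jacobian transposition

for name, dq in [("pseudo-inverse (4)", dq_pinv),
                 ("with null-space term (5)", dq_opt),
                 ("transpose (7)", dq_trans)]:
    print(f"{name}: |J dq - dx| = {np.linalg.norm(J @ dq - dx):.3e}")
```

the printed residuals illustrate the difference: (4) and (5) satisfy dx = J dq exactly (the null-space term does not move the end-effector), while the transposition rule only provides a descent direction and leaves a nonzero residual.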
extending the jacobian so that it is always a matrix with rank at least n is the only way to avoid using the pseudo-inverse and to exploit the standard gaussian inverse. it also makes it possible to employ constraints on the structure and even on each joint. however, the constraints have to deal with particular joints, which is more interesting in the targeted field of computer animation; it is inappropriate to incorporate an objective function to be optimized, as was suggested in [2], since that is more useful in robotics, for which that method was originally developed. the other motivation for the method presented here is the absence of similarly based approaches in computer animation, probably caused by their high computational complexity for real-time applications. performance optimization of the presented approach is thus a future goal.

2.1 notation and numbering

this section explains the numbering of the segments and joints, for better orientation in the further text. the articulated structure consists of n segments l0, …, ln−1 and n joints 0, …, n−1. the end-effector is treated as an additional joint and designated by the symbol n. the angles between adjacent segments are represented by the status vector q with items q0, …, qn−1. we assume the structure in fig. 1 with 4 segments l0, l1, l2, l3, 4 joints 0, 1, 2, 3 and the end-effector labelled with the number 4; the status vector q contains the 4 elements q0, q1, q2, q3.

fig. 1: articulated structure with notation and numbering.

2.2 treating an articulated structure

it is necessary to extend the jacobian, as stated above, in order to avoid the problems coming from the usage of the pseudo-inverse. it seems natural to control every joint of an articulated structure by some physical characteristic, for instance by its joint stiffness, rather than by setting its angle value. thus we decided to treat articulated structures by pairs instead of manipulating them in their entirety. we assume that a complete articulated structure consists of basic elements: two segments connected by a joint, as shown in fig. 2. using many of these basic elements, it is easy to create an articulated structure with n > 2 segments, and the behaviour of each basic element can be controlled. a basic element is described by two quantities, either [x, y] in cartesian coordinates or [l, φ] in angle coordinates. the basic elements constituting a complex structure are not connected at their end-points; instead, each following basic element is connected into the medial joint of its predecessor, so they share one segment of the final complex structure. each basic element is labelled by a pair of joint numbers within the structure, i : i+2, where i is the joint where the first segment starts and i+2 is the joint where the second (and last) segment finishes. each basic element lies in its local coordinate system, i.e., the coordinates [x(i,i+1), y(i,i+1)] or [l(i,i+1), φ(i,i+1)] are relative. the cartesian coordinates of the basic element i : i+2 are computed by

x(i,i+1) = Σ_{j=i..i+1} lj cos q(i,j),  y(i,i+1) = Σ_{j=i..i+1} lj sin q(i,j),  where q(i,j) = Σ_{k=i..j} qk. (8)

its origin is at the beginning of the corresponding basic element and is rotated by an angle equal to the sum of the previous angles from the status vector.
for example, in fig. 3, three basic elements 0 : 2, 1 : 3 and 2 : 4 form the complete structure; the basic element 2 : 4 is described by [l23, φ23], and its coordinate system origin lies at the end of the second segment l1 and is rotated by the angle q0 + q1.

2.3 extending the jacobian

according to the preceding section, we can obtain a jacobian with rank at most n − 1, the number of pairs in a complex structure. this is not enough to avoid problems with the jacobian's singularity. hence these rows are added to the standard jacobian used for solving (3), whose rows represent the main link 0 : n (base to end-effector). the jacobian then takes the form of equation (9); using such a jacobian extension we can control the end-effector and also the behaviour of the inner joints.

fig. 2: a basic element, the main building block for creating complex articulated structures.
fig. 3: a complex joint structure consisting of basic elements.

the extended jacobian J2n(q) of dimensions (2n, n) stacks the derivatives of the pair coordinates (8) and of the main link 0 : n with respect to the status vector. differentiating (8), its entries are

∂x(i,i+1)/∂qj = −Σ_{k=max(i,j)..i+1} lk sin q(i,k),  ∂y(i,i+1)/∂qj = Σ_{k=max(i,j)..i+1} lk cos q(i,k)  for i ≤ j ≤ i+1 (and zero otherwise),

and, for the rows describing the main link 0 : n,

∂x(0,n)/∂qj = −Σ_{i=j..n−1} li sin q(0,i),  ∂y(0,n)/∂qj = Σ_{i=j..n−1} li cos q(0,i). (9)

the i-th and (i + n − 1)-th rows of the jacobian correspond to the i-th basic element of the articulated structure; for instance, the 0-th and (n − 1)-th rows correspond to the basic element designated l01 in fig. 3. the last two jacobian rows describe the main link, from the base to the end-effector.

2.3.1 extending dx

as the jacobian is extended, the displacement vector dx also has to be extended to fit the jacobian's row dimension, dim dx = 2n. the items of dx represent the change of the corresponding basic element or of the main link 0 : n (the last two rows). setting up the change of the main link 0 : n is straightforward, since it is the only input of the method. all the remaining items, representing the changes of the particular basic elements, have to be derived from the change of the main link, taking into account the desired physical constraints imposed on the joints. while the displacement of the main link is easy to define and clear to understand, defining the displacements of a particular basic element according to the required behaviour of the whole structure is difficult; future work and research remains to be done on this.

2.4 solving inverse kinematics

equation (3) can be understood as a linear system

A x = b, (10)

and thus can be solved by any appropriate method [4]. the vector x is the vector of unknown variables; in the case of (3) it corresponds to the vector dq that is being solved for. the vector b represents the displacement of the end-effector dx, and the matrix A corresponds to the jacobian J(q). since the jacobian is now a matrix of dimensions (2n, n), an approximation method has to be employed for solving (3). we decided to use the ordinary least squares (ols) method based on normal matrices [10], as it minimizes the norm of the residual vector r = A x − b; however, any approximation method can be used [5]. ols produces a square matrix, which can then be solved by the lu decomposition [4] that we decided to use.
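a minimal numerical sketch of sections 2.2–2.4: it assembles the extended system (10) using a finite-difference stand-in for the analytic jacobian of eq. (9), forms the normal equations and solves them by lu decomposition. the segment lengths, the configuration and the chosen extended dx are illustrative assumptions.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

lengths = np.array([90.0, 60.0, 90.0, 150.0, 150.0, 90.0])
n = len(lengths)

def pair_xy(q, i):
    """local coordinates (x, y) of the basic element i:i+2, eq. (8)."""
    a1, a2 = q[i], q[i] + q[i + 1]
    return (lengths[i] * np.cos(a1) + lengths[i + 1] * np.cos(a2),
            lengths[i] * np.sin(a1) + lengths[i + 1] * np.sin(a2))

def main_link_xy(q):
    """end-effector position of the main link 0:n."""
    cum = np.cumsum(q)
    return (np.sum(lengths * np.cos(cum)), np.sum(lengths * np.sin(cum)))

def coords(q):
    """pair x rows, pair y rows, then the two main-link rows (2n values)."""
    pairs = [pair_xy(q, i) for i in range(n - 1)]
    mx, my = main_link_xy(q)
    return np.array([p[0] for p in pairs] + [p[1] for p in pairs] + [mx, my])

def extended_jacobian(q, eps=1e-6):
    """finite-difference stand-in for the analytic matrix of eq. (9)."""
    base = coords(q)
    J = np.empty((2 * n, n))
    for j in range(n):
        qp = q.copy(); qp[j] += eps
        J[:, j] = (coords(qp) - base) / eps
    return J

q = np.full(n, 0.1)
J = extended_jacobian(q)                 # shape (2n, n), always rank n

dx = np.zeros(2 * n)                     # extended displacement vector
dx[-2:] = [1.0, -0.5]                    # move only the main link 0:n

A = J.T @ J                              # normal equations: (J^T J) dq = J^T dx
b = J.T @ dx
dq = lu_solve(lu_factor(A), b)           # lu decomposition of the square system

print("dq =", np.round(dq, 6))
print("residual |J dq - dx| =", np.linalg.norm(J @ dq - dx))
```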
3 experiments and evaluation of results

we performed a comparison test of the presented extended jacobian method against the method based on the pseudo-inverse (section 1.1), as presented in [11]. we chose the jacobian inversion method for the comparison because it is one of the standard methods, together with jacobian transposition (section 1.3) and ccd (cyclic coordinate descent), for solving the inverse kinematics problem in the field of computer animation. our goal was to obtain the same behaviour with both methods, but with good recovery from the singular case. the testing structure consists of 6 segments with lengths l0 = 90, l1 = 60, l2 = 90, l3 = 150, l4 = 150 and l5 = 90 points. the starting configuration was determined by qi = 0, and the starting point lies at [−230, 0]. the desired trajectory was an ellipse with a = 400 and b = 50 centred at the origin; the end-effector therefore touched the ellipse at the point [400, 0]. first we tried to minimize the movement of the inner parts and move the end-effector only. we accomplished this by setting all dxi = 0 except those corresponding to the move of the end-effector. this only partly achieved the desired requirements: the structure recovered well from the singular case (see fig. 4), but its behaviour differs from the behaviour of the method based on the pseudo-inverse. due to these problems with the behaviour, we tried to minimize the influence of the added rows by multiplying each added row by a positive non-zero constant w > 0, expecting a minimizing effect of the added rows; we assumed that for w → 0 the extended jacobian would become the non-extended jacobian, jext → jnext. however, this did not achieve the required effect: with w → 0 the behaviour remained, while the ability to recover from the singular cases disappeared. the behaviour of the tested methods is shown in figs. 5 and 6: fig. 5 shows the standard jacobian inversion method, while fig. 6 displays the presented jacobian extension method. the pictures show the movement of the structure during the first several steps.

fig. 4: graph presenting the distance of the end-effector from the desired position, per step, for the extended jacobian and pseudo-inverse methods.

the jacobian inversion method made a "big jump" when moving from the starting configuration. this was caused by a kinematic singularity and the consequent numerical instability of the system. this did not happen with the extended jacobian method, because its jacobian cannot be singular in any case, i.e., its rank is always guaranteed to be n.

4 conclusions

we have presented a brief overview of methods for solving inverse kinematics, together with a new option for solving the same problem, currently in the plane, mainly for the purposes of computer graphics: for animating articulated structures. the articulated structure is treated by pairs, which allows us to control the inner joints based on physical characteristics. the method is based on an extension of the jacobian: 2(n − 1) rows are added to the jacobian, corresponding to the n − 1 pairs (basic elements).
the system is over-determined (the jacobian has full rank), so the singular cases coming from the usage of the pseudo-inverse are avoided; hence the method is stable. several tests were performed to compare the presented method with the method based on the pseudo-inverse only. the same behaviour was required for both methods, and the ability to recover from singular cases was also required. the extended jacobian method recovered from the singular cases successfully, but the same behaviour of both methods was not achieved.

4.1 future work

the usability of the presented method in 3d space and the exact control of the inner joints have not been evaluated yet. for the first issue, exact structure control, the inverse transformation f⁻¹ of (2) has to be found, either analytically or numerically; then a sensitivity analysis can be performed to reveal the dependencies in the system and hence discover the key to exact control. the remaining problem, usage in 3d space, seems less difficult now, mainly because rotating joints usually operate in a single plane; however, the problem becomes difficult if the joints are capable of unlimited rotation instead of rotation in a specified plane. the usability of the proposed technique is clear: it enables the animation of articulated structures with user-defined constraints (expressed as functions) imposed on the joints. after the tasks mentioned above have been dealt with, the focus will turn to the performance and speed of the proposed solution, as speed is the dominant requirement for real-time computer animation; it is mainly for this reason that only the basic methods for solving the inverse kinematics problem, such as jacobian inversion, jacobian transposition, and ccd, are used in computer animation, due to their relatively low computational complexity. to achieve the speed goal we plan to employ non-standard arithmetic, probably residual arithmetic, together with the special hardware available on today's microprocessors, for instance mmx, sse, 3dnow, etc.

fig. 5: behaviour of the jacobian inversion method during the test (start position and steps 2, 3, 4, 5 and 10).
fig. 6: behaviour of the presented jacobian extension method during the test (start position, steps 2 and 10).

references

[1] badler n. i., phillips c. b., webber b. l.: simulating humans: computer graphics animation and control. oxford university press, 1993.
[2] baillieul j.: kinematic programming alternatives for redundant manipulators. in: ieee international conference on robotics and automation, 1985, p. 722–728.
[3] baillieul j.: avoiding obstacles and resolving kinematic redundancy. in: ieee international conference on robotics and automation, 1986, p. 1698–1704.
[4] friedberg s. h., insel a. j., spence l. e.: linear algebra, 3rd ed. upper saddle river: prentice-hall, 1997. isbn 0-13-233859-9.
[5] golub g. h., van loan c. f.: matrix computations, 3rd ed. baltimore: the johns hopkins university press, 1996. isbn 0-8018-5414-8.
[6] klein c. a., chu-jenq c., ahmed s.: a new formulation of the extended jacobian method and its use in mapping algorithmic singularities for kinematically redundant manipulators. ieee transactions on robotics and automation, vol. 11 (february 1995), no. 1, p. 50–55.
[7] korein j. u., badler n.
i.: techniques for generating the goal-directed motion of articulated structures. ieee transactions on computer graphics and applications, november 1982, p. 71–81.
[8] liegeois a.: automatic supervisory control of the configuration and behavior of multibody mechanisms. ieee transactions on systems, man, and cybernetics, vol. 7 (1977), no. 12, p. 868–871.
[9] paul r. p.: robot manipulators: mathematics, programming and control. cambridge, mass.: mit press, 1981.
[10] van huffel s., vandewalle j.: the total least-squares problem. philadelphia: siam, 1991. isbn 0-89871-275-0.
[11] šoch m., lórencz r.: solving inverse kinematics – jacobian inversion method in a plane. in: 11th international workshop on systems, signals, image processing and ambient multimedia. poznaň: ptetis, 2004, p. 95–97. isbn 83-906074-8-4.
[12] whitney d. e.: resolved motion rate control of manipulators and human prosthesis. ieee transactions on man-machine systems, mms-10, june 1969, no. 2, p. 47–53.

ing. martin šoch, e-mail: sochm@fel.cvut.cz; doc. ing. róbert lórencz, csc., e-mail: lorencz@fel.cvut.cz; department of computer science and engineering, czech technical university in prague, faculty of electrical engineering, karlovo nám. 13, 121 35 praha 2, czech republic.

membrane triangles with drilling degrees of freedom

p. fajman

an accurate triangular plane element with drilling degrees of freedom is presented in this paper. the element can be successfully used for solving linear and nonlinear problems. its main advantage is that the stiffness matrix is obtained from pure deformations, the elongations of the edges. this approach is very suitable for nonlinear analysis, where the unbalanced forces can be obtained directly from the elongations of the edges.

keywords: triangular, element, elongations, rigid, displacement.

1 introduction

the basic unknowns are the elongations in the directions of the edges 21, 32, 13 and the pure rotations φiz of a small neighbourhood of a node (considered rigid) about the z-axis passing through vertex i, where i = 1, 2, 3. the effect of pure swing is transferred to the elongations of the edges and is taken into account in the formulation of the element stiffness matrix. the static matrix (see section "element stiffness matrix") supports the rigid body motion (rbm).

2 element displacement

the derivation of the element stiffness matrix relies on hrenikoff's idea that a triangular element can be represented by a system of lines parallel to the edges. owing to this idea, we further assume that the displacements along the individual element sides are mutually independent. triangular coordinates li (appendix a) are used to describe the basic displacement field uij(r, s) derived from pure deformations and rotations. here, the pure deformations are defined through the elongations of the individual edges, denoted u21, u32, …, u13; they are composed of two parts, pure extensions due to node displacements and pure extensions caused by node rotations.

pure extensions in terms of nodal elongations. with the help of these quantities we may express the elongations of the edges Δuij(r, s) at the point (r, s) by

Δuij(r, s) = ni(r, s) Δuij = li Δuij. (1)

pure extensions in terms of nodal rotations. associating the rotation in node 1 with φ1 = 1, we write the pure extensions along the edges in terms of the cubic shape functions of fig. 2:

¹δ21(r, s) = l2 l1² l21,  ¹δ13(r, s) = l3 l1² l13, (2)

where lij are the lengths associated with the sides ij. the rotation φi causes at a point (r, s) two displacements ᵏδij perpendicular to the lines that are parallel with the sides ij (fig. 2); the superscript k denotes the node about which the rotation takes place. the extensions of the three lines parallel with the edges passing through the point (r, s), due to the vertex rotations φi, need to be expressed; these elongations are obtained by projecting ᵏδij onto the corresponding directions ij.
the elongations from the rotation in node 1 (fig. 3) can be written as

¹uij(r, s) = ¹δ21 (n21ᵀ tij) + ¹δ13 (n13ᵀ tij) = l2 l1² l21 (n21ᵀ tij) + l3 l1² l13 (n13ᵀ tij), (3)

where i = 2, 3, 1 and j = 1, 2, 3, nij = {nxij, nyij} = {cos αij, sin αij} is the outward unit normal vector of the element side ij, and tij = {txij, tyij} is the unit vector along the element side ij (fig. 4). the other contributions, from the rotations in nodes 2 and 3, are obtained by cyclic index replacement. the elongations from all three node rotations are obtained by summing the separate contributions of the rotations at nodes 1, 2 and 3.

fig. 1: node displacement field Δu21(r, s); the shape function n2(r, s) equals the triangular coordinate l2.
fig. 2: extension from the pure rotation φ1 = 1.

the total displacements along the element sides at the point (r, s), due to the Δuij of eq. (1) and the rotation contributions of eq. (3), may be expressed as

uij(r, s) = li Δuij + Σ_{k=1..3} φk ᵏuij(r, s), (4)

where the projections nᵀt reduce to the sines of the angles between the corresponding sides, or more compactly

u*(r, s) = N u*, (5)

where the vector u*(r, s) = {u21(r, s), u32(r, s), u13(r, s)}ᵀ represents the pure displacements of the point (r, s) in the directions of the sides ij, u* = {Δu21, Δu32, Δu13, φ1, φ2, φ3}ᵀ is the vector of the six element connectors (elongations of the sides and pure node rotations), and N is the (3×6) matrix composed of the linear and cubic functions obtained directly from eq. (4) for i = 2, 3, 1 and j = 1, 2, 3.

3 pure element stiffness matrix

the individual components of the side strains are obtained from the pure displacement vector u*(r, s) using the geometrical equations

εij = duij(r, s)/dsij = (duij(r, s)/dlj)(dlj/dsij) + (duij(r, s)/dli)(dli/dsij). (6)

note that εij corresponds to a relative deformation along the side ij. substituting eq. (5) into eq. (6) gives

ε* = B u*, (7)

where ε* = {ε21, ε32, ε13}ᵀ and B is the element strain matrix with components bik (k = 1, …, 6) obtained by differentiating N along the sides. the components of the engineering strain ε = {εx, εy, γxy}ᵀ are obtained by applying the standard transformation

εij = {cos² αij, sin² αij, sin αij cos αij} · ε, in matrix notation ε* = A ε, hence ε = A⁻¹ ε*. (8)

substituting eq. (7) into (8) finally provides

ε = A⁻¹ B u*. (9)

the finite element formulation used in this paper is based on the principle of minimum potential energy. the element stiffness matrix is calculated by considering the strain energy in the form

ei(s) = ½ ∫v εᵀ d ε dv = ½ u*ᵀ ( ∫v bᵀ a⁻ᵀ d a⁻¹ b dv ) u* = ½ u*ᵀ k* u*,  k* = ∫v bᵀ a⁻ᵀ d a⁻¹ b dv, (10)

where d is the elastic material stiffness matrix.
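eq. (10) needs the elastic material stiffness matrix d. the paper does not write d out; the sketch below builds the standard plane-stress form, which is the usual assumption for a membrane element, using the e and ν values from the cantilever test of section 4.1.

```python
import numpy as np

def plane_stress_D(E, nu):
    """standard plane-stress elasticity matrix (an assumption: the paper
    does not state the form of d explicitly)."""
    c = E / (1.0 - nu**2)
    return c * np.array([[1.0, nu,  0.0],
                         [nu,  1.0, 0.0],
                         [0.0, 0.0, (1.0 - nu) / 2.0]])

D = plane_stress_D(E=30000.0, nu=0.25)   # kPa, values from section 4.1
print(D)
```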
seven gauss integration points, with accuracy of order o(h⁵), need to be used for the numerical evaluation of k*, as the stiffness matrix contains functions of the fourth order.

4 element stiffness matrix

the pure element stiffness matrix is extended by including the rigid body motions through the static matrix t, obtained from the relations between the pure displacements u* = {Δu21, Δu32, Δu13, φ1, φ2, φ3}ᵀ and the global node displacements u = {u1, v1, ω1, u2, v2, ω2, u3, v3, ω3}ᵀ. the rigid body motions of the edges are decomposed into translations and rotations. the relations between the elongations of the edges Δuij and the cartesian displacements ui, vi at the nodes are evident from fig. 5:

Δuij(r, s) = uij cos αij + vij sin αij = (ui − uj) cos αij + (vi − vj) sin αij. (11)

fig. 3: relation between the rotation at node 1 and the elongations.
fig. 4: the introduced geometrical quantities nij and tij.

the relation between the pure vertex rotations φi and the node rotations ωi, shown in fig. 6, is

φi = ωi − ω, (12)

where φi is the pure rotation, ωi represents the node rotation, and ω corresponds to the rigid body rotation given by

ω = ½ (∂u/∂y − ∂v/∂x). (13)

the contribution of the pure rotations to the nodal displacements u, v in eq. (13) is neglected, since they cause no translations of the apex of the triangle. using linear functions, the displacement field assumes the form

u(r, s) = Σ_{i=1..3} li ui,  v(r, s) = Σ_{i=1..3} li vi,  ω = (1/(4a)) Σ_{i=1..3} (ci ui − bi vi). (14)

combining eqs. (12), (13) and (14) provides the relation between the pure displacements u* and the node displacements u in the form

u* = t u, (15)

where u = {u1, v1, ω1, u2, v2, ω2, u3, v3, ω3}ᵀ is the vector of nodal degrees of freedom (ui, vi are the nodal cartesian displacements and ωi the nodal rotations). note that the static matrix t is constant and does not cause any artificial strain or stress during pure rigid body motions. relationship (15) is used in the expression for the strain energy (10), from which the element stiffness matrix k is obtained:

ei(s) = ½ uᵀ tᵀ k* t u,  k = tᵀ k* t.

4.1 cantilever beam under tip load

a standard test example often used in the literature is a shear-loaded cantilever beam: a cantilever of rectangular cross section is subjected to a parabolic traction at its tip with a resultant f equal to 40 kn. the geometry and boundary conditions of the beam are shown in fig. 7. the beam thickness is 1 m, and the following material properties are used: young's modulus e = 30000 kpa and poisson's ratio ν = 0.25. the vertical deflection at point c according to the theory of elasticity, ref. [6], is wc = 0.3558 m. table 1 lists the numerical results obtained for the tip deflection, compared with the results obtained using the constant strain triangle (cst), the linear strain triangle (lst), the element of allman [1] and the element of lee [5].

table 1: tip deflection wc (m) for the beam under tip load.

| number of elements | cst | lst | allman | lee | present |
| 8 | 0.0909 | 0.2696 | 0.2068 | – | 0.240 |
| 32 | 0.1988 | 0.3550 | 0.3261 | 0.2958 | 0.3151 |
| 128 | 0.3115 | 0.3556 | 0.3471 | 0.3388 | 0.3480 |

the values of the tip deflection obtained with the present element are closer to the exact solution than those obtained using the cst and the element due to lee. the high accuracy of the results calculated here with a mesh of 128 elements is particularly noteworthy (even outperforming allman's element).
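a small sketch of the congruence transform k = tᵀ k* t used above, which maps the 6×6 pure stiffness (edge elongations and pure rotations) to the 9×9 global stiffness (ui, vi, ωi). the matrices below are random placeholders of the correct sizes; the paper obtains t from eqs. (11)–(14) and k* from eq. (10).

```python
import numpy as np

rng = np.random.default_rng(0)
K_star = rng.standard_normal((6, 6))
K_star = K_star @ K_star.T               # placeholder spd 6x6 pure stiffness
T = rng.standard_normal((6, 9))          # placeholder 6x9 static matrix

K = T.T @ K_star @ T                     # global 9x9 element stiffness

print(K.shape)                           # (9, 9)
print("symmetric:", np.allclose(K, K.T)) # congruence preserves symmetry
# with the true T, rigid-body motions map to zero pure displacements
# (T u_rbm = 0), so K u_rbm = 0: no artificial strain under rbm, as the
# paper states.
```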
fig. 5: rigid body motions: translation.
fig. 6: rigid body motions: rotation.
fig. 7: cantilever beam under tip load (4.8 m × 1.2 m, f = 40 kn at point c).

the three adopted mesh patterns are shown in fig. 8.

fig. 8: mesh patterns.
fig. 9: dependence of the deflection wc on the number of equations.

4.2 cantilever beam under end moment

the next example is a cantilever beam subjected to an end moment m = 100. the dimensions of the specimen together with the support and loading conditions are illustrated in fig. 10. the modulus of elasticity and the poisson ratio are set to e = 768 and ν = 0.25, so that the exact tip deflection is 100. the adopted mesh patterns, ranging from 32×2 to 2×2 (fig. 11), are used to examine element aspect ratios ranging from 1:1 to 16:1. the element performs well for unit aspect ratios, but rapidly becomes worse for aspect ratios over 2:1. the results for the present element are compared in fig. 12 with the results obtained using the constant strain triangle (cst), the element of allman [1] and the effand element due to felippa and alexander [2, 3]. the numerical results for the tip deflection wc at point c are given in table 2.

fig. 10: cantilever beam under end moment (32 m × 2 m, m = 100 at point c).

table 2: tip deflections for the beam under end moment.

| mesh | cst | effand | allman 3i | allman 7i | present |
| 2×2 | 1.28 | 100.07 | 0.39 | 0.16 | 1.10 |
| 4×2 | 4.82 | 99.96 | 5.42 | 2.47 | 3.31 |
| 8×2 | 15.75 | 99.99 | 38.32 | 24.25 | 25.53 |
| 16×2 | 36.36 | 99.99 | 76.48 | 69.09 | 69.70 |

fig. 11: mesh patterns 2×2 and 8×2.

5 conclusions

this paper presents the derivation of a simple triangular membrane finite element using the principle of minimum potential energy. the formulation of two stiffness matrices, the pure stiffness matrix depending on the elongations of the edges and the global matrix depending on the nodal displacements ui, vi, ωi, enables nonlinear problems to be solved more effectively and quickly, owing to the simple calculation of the unbalanced forces; this is an advantage in comparison with quadrilateral elements [4]. the element can represent exactly all constant strain states. the numerical results from the selected test examples show that quite acceptable accuracy is achieved for practical applications.

appendix a: area coordinates

the use of area coordinates enables the relation for one edge or one vertex to be derived; the other relations are obtained by cyclic substitution of the index i (from 1 to 2, from 2 to 3, from 3 to 1):

li = (ai + bi x + ci y) / (2a),

where a is the area of the triangle, ai = xj yk − xk yj, bi = yj − yk, ci = xk − xj, and xi, yi are the coordinates of the nodes in fig. 13.

acknowledgment

financial support from gačr 103 02 0658 and msm 210000001 is gratefully acknowledged.

references

[1] allman d. j.: a compatible triangular element including vertex rigid rotations for plane elasticity analysis. computers & structures, vol. 19 (1984).
[2] bergan p. g., felippa c. a.: a triangular membrane element with rotational degrees of freedom. computer methods in applied mechanics and engineering, vol. 50 (1985), p. 25–69.
[3] felippa c. a., alexander s.: membrane triangles with corner drilling freedoms – iii. implementation and performance evaluation. finite elements in analysis and design, vol. 12 (1992), p. 203–239.
[4] ibrahimbegovic a., taylor r. l., wilson e. l.: a robust quadrilateral membrane finite element with drilling degrees of freedom. international journal for numerical methods in engineering, vol. 30 (1990), p. 445–457.
[5] lee s. c., yoo ch. h.: a novel shell element including in-plane torque effect. computers & structures, vol. 28 (1988).
[6] timoshenko s., goodier j. n.: theory of elasticity. 3rd edn., mcgraw-hill, new york, 1970.

petr fajman, phone: +420 224 354 473, e-mail: fajman@fsv.cvut.cz, department of structural mechanics, czech technical university, faculty of civil engineering, thákurova 7, 166 29 praha 6, czech republic.

fig. 12: dependence of the deflection wc on the number of equations for the beam under end moment.
fig. 13: area coordinates.

influence of γ irradiation on concrete strength

v. sopko, k. trtík, f. vodák

the aging of concrete due to gamma irradiation is studied. the measured concrete strengths are in good agreement with previously published results.

keywords: concrete, gamma irradiation, strength.

1 introduction

the mechanical properties of concrete change in the course of its ageing. the ageing of concrete is accelerated by climatic conditions, mechanical stress, aggressive liquids, etc. this article deals with the influence of irradiation, which arises, for example, as a consequence of the fission of uranium 238u and 235u in a nuclear reactor: neutron and gamma irradiation act on concrete structures and accelerate their surface and internal changes. in these facilities the concrete plays not only a structural role but also the role of a shielding material that has to absorb the aforementioned types of irradiation and prevent their penetration. of the whole irradiation spectrum we observe only the influence of gamma irradiation. gamma irradiation is absorbed during its passage through concrete, which can induce water radiolysis [1] in the material and consequently chemical reactions leading to changes of the concrete phase composition [2], and thus to a decrease of the degree of hydration. we can expect this decrease of the degree of hydration to manifest itself as a decrease of the concrete strength. this hypothesis is tested in this article.

2 concrete composition

the samples for testing were made according to the formulation stated in table 1 [3], which was used in the construction of the containment of the temelín nuclear power plant. the composition of the cement cem i 42,5 r mokrá is listed in [3]. the concrete beams on which the experiment subsequently took place were made of this mixture. the volume mass of the concrete at the age of 28 days was 2358 kg·m⁻³.

table 1: composition of 1 m³ of concrete.

| component | mass |
| cement [42,5 r] | 499 kg |
| water | 215 kg |
| plasticizer (ligoplast sf) | 2.8 kg |
| aggregates 0–4 mm | 705 kg |
| aggregates 8–16 mm | 450 kg |
| aggregates 16–32 mm | 527 kg |

3 method of irradiation

the irradiation was carried out in the company bioster a.s., veverská bítýška, where ⁶⁰co is used as the gamma irradiation source. the activity of the ⁶⁰co source was a = 462 kci (1.73×10¹⁶ bq) at the beginning of the experiment and 449 kci (1.68×10¹⁶ bq) at its end; these values were needed to calculate the total radiation dose rate, which was 0.935 kgy·h⁻¹. concrete beams with dimensions 0.4×0.1×0.1 m, 90 days after concreting, were used for the experiment, and each beam had gamma irradiation dosimeters fixed on it. the beams were irradiated for 90 days with a dose rate of approx. 0.25 kgy·h⁻¹, i.e., the maximum dose for the whole period of irradiation was approx. 0.5 mgy.
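a quick arithmetic check of the dose figures quoted above, assuming continuous irradiation over the whole 90 days:

```python
# sketch: total dose = dose rate x exposure time, using the values in
# the text; continuous exposure is an assumption.
rate = 0.25            # dose rate at the samples [kGy/h]
hours = 90 * 24        # 90 days of irradiation [h]
dose_kGy = rate * hours
print(f"total dose = {dose_kGy:.0f} kGy = {dose_kGy / 1000:.2f} MGy")
# -> 540 kGy ~ 0.54 MGy, consistent with the "approx. 0.5 MGy" in the text
```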
the difference between the radiation dose rates is caused by items placed between the radiation source and the irradiated samples. classical rectifying diodes calibrated by means of standard dosimeters were used as dosimeters [4]; their error is about twenty per cent. standard dosimeters are limited to lower doses of gamma irradiation; therefore, the dosimeters we used were calibrated at lower doses and the calibration was subsequently extrapolated to the high irradiation doses.

4 method of strength measurement

at first, the flexural strength test (three-point bending) was applied to the concrete samples. after this destructive test, the compressive strength test was performed on the first fragment (in accordance with čsn 73 13 18) and the splitting tensile strength test on the second fragment (in accordance with čsn 73 13 17).

fig. 1: dependence of the tensile strength ratio on dose.
fig. 2: dependence of the compressive strength ratio on dose.

conclusion

the measured values of concrete strength (figs. 1, 2) agree with current measurements obtained by various experimenters, viz [5].

acknowledgments

this work was supported by mšmt čr (contract no. j 04-098-2100000 20).

references

[1] assessment and management of ageing of major nuclear power plant components important to safety: concrete containment buildings (pachner j.). iaea-tecdoc-1025, iaea, vienna 1998, issn 1001-4289.
[2] bouniol p., aspart a.: disappearance of oxygen in concrete under irradiation: the role of peroxides in radiolysis. cement and concrete research, vol. 28 (1998), no. 11, p. 1669–1681.
[3] vodák f. et al.: trvanlivost a stárnutí betonových konstrukcí jaderných elektráren. fsv čvut, 2002.
[4] prouza z., obraz o., sopko b., spurný f., škubal a., kits j., látal f.: dosimetric parameters of a new czechoslovak neutron si diode. radiation protection dosimetry, vol. 28 (1989), no. 4, p. 277–281.
[5] kaplan m. f.: concrete radiation shielding (nuclear physics, concrete properties, design and construction). longman scientific & technical, england, 1989.

ing. vít sopko, phone: +420 224 354 435, e-mail: sopko@fsv.cvut.cz, department of physics; doc. ing. karel trtík, csc., phone: +420 224 354 626, e-mail: trtik@fsv.cvut.cz, department of concrete structures and bridges; prof. františek vodák, drsc., phone: +420 224 353 886, e-mail: vodak@fsv.cvut.cz, department of physics; czech technical university in prague, faculty of civil engineering, thákurova 7, 166 29 praha 6, czech republic.
influence of matric suction on the shear strength behaviour of unsaturated sand

a. farouk, l. lamboj, j. kos

as a part of the effort made to understand the behaviour of unsaturated soils, this work studies the shear strength characteristics of a cohesionless unsaturated soil. generally, the determination of the shear strength of unsaturated soils is a great challenge for geotechnical engineers, both in terms of understanding it and in terms of the effort necessary to determine it. matric suction is one of the stress state variables that control the shear strength of unsaturated soils. the main aim of this study is therefore to investigate the effect of matric suction on the shear strength characteristics of a sand known commercially as sand pr33. the shear strength behaviour of the unsaturated sand is studied using the constant water content triaxial test method with measurements of matric suction during the shearing stage. the tests were performed using the axis translation technique in such a way that the pore-air pressure was controlled while the pore-water pressure was measured during all tests.

keywords: unsaturated, suction, sand, triaxial, shear strength, axis translation technique.

1 introduction

the development of soil mechanics for unsaturated soils began about two to three decades after the commencement of soil mechanics for saturated soils; the basic principles underlying unsaturated soil mechanics were formulated mainly in the 1970s [5]. unsaturated soils have recently gained widespread attention in many studies and construction works all over the world, since many soils near the ground surface are unsaturated. the shear strength characteristics of unsaturated sedimentary and compacted cohesive soils have been the traditional subject of a number of research studies in the last 30 years, while only a few studies have focused on analyzing the behaviour of unsaturated cohesionless soils (e.g., [3]); recently, wider attention has been paid to the behaviour of unsaturated sand (e.g., [9] and [10]). in addition, many researchers have presented results of constant water content (cw) triaxial tests on unsaturated soils without analyzing the change of matric suction during these tests; most studies have considered the matric suction inside the tested samples to be constant during the whole shearing process, although the source of this assumption is unclear, as reported by juca et al. [7]. therefore, the decision was taken to study the shear strength behaviour of unsaturated sand using the cw triaxial test method with measurements of matric suction during the shearing stage.

2 experimental work

2.1 tested materials

a siliceous sand with grain sizes ranging from 0.1 to 0.5 mm and an average particle size of 0.32 mm was investigated in this study. the sand consists of about 7.0 % fine sand and nearly 93.0 % medium sand; it has a coefficient of uniformity of 1.20 and an effective diameter, d10, of 0.21 mm. this sand is known as sand pr33, manufactured in the czech republic by provodínské písky. it is purely cohesionless, with a plasticity index, pi, of zero and a specific weight of 2.65.

2.2 description of testing devices

to study the shear strength parameters and the suction inside the tested sand, the triaxial apparatus was preferred, since it loads the sample three-dimensionally, which simulates the real load of the material in nature. an electronically controlled triaxial device was used after implementing some modifications that enabled the control and the measurement of matric suction inside the tested samples. the experimental testing program was executed at the center of experimental geotechnics (ceg) at the faculty of civil engineering of the czech technical university in prague, using a 50 kn triaxial machine manufactured in england by wykeham farrance. as with all the other components of the triaxial apparatus utilized for this work, the computer-controlled compression machine was operated via a host computer using special software. the details and contents of the modified triaxial cell are shown in fig. 1.

2.3 identification of the testing program

a laboratory testing program was planned and carried out to fulfill the objectives of this study. the testing program was divided into two main groups. the first group, gs, deals with samples that have a 100 % degree of saturation (i.e., zero matric suction). the main goal of testing this group was to evaluate the effective shear strength parameters, c′ and φ′, and to be able to compare the behaviour of saturated samples with that of unsaturated samples.
2.2 description of testing devices
to study the shear strength parameters and the suction inside the tested sand, use of the triaxial apparatus was preferred, since it provides a three-dimensional load on the sample, which simulates the real load of the material in nature. an electronically controlled triaxial device was used after implementing some modifications that enabled the control and the measurement of matric suction inside the tested samples. the experimental testing program was executed at the center of experimental geotechnics (ceg) at the faculty of civil engineering of the czech technical university in prague, using a 50 kn triaxial machine manufactured in england by wykeham farrance. as for all the other components of the triaxial apparatus utilized for this work, the computer-controlled compression machine was operated via a host computer using special software. details and contents of the modified triaxial cell are shown in fig. 1.

fig. 1: longitudinal cross section diagram of the modified triaxial cell for testing unsaturated soils

2.3 identification of the testing program
a laboratory-testing program was planned and carried out to fulfill the objectives of this study. the testing program was divided into two main groups. the first group, gs, deals with samples that have a 100 % degree of saturation (i.e., zero matric suction). the main goal of testing this group was to evaluate the effective shear strength parameters, c′ and φ′, and to have the ability to compare the behaviour of saturated samples with that of unsaturated samples. consolidated isotropic drained tests, cid, were utilized when testing the saturated sand samples. the identification of this group is summarized in table 1.

table 1: layout and specification of group gs
  sample   applied cell pressure (kpa)   applied pore pressure (kpa)   initial effective confining pressure (kpa)
  gs1      550                           500                           50
  gs2      550                           400                           150
  gs3      500                           200                           300

the second group, gus, deals with unsaturated sand samples. taking into account the air entry value of the high air entry disk available in the laboratory, which was 150 kpa, a suitable range of matric suctions was chosen for this group. three subgroups of samples with three different matric suctions (30, 50, and 150 kpa) were tested in the triaxial apparatus using the constant water content test. each of these subgroups consists of three samples having the same matric suction but tested under the effect of three different prescribed net normal stresses. during tests, the axis translation technique was utilized to maintain the desired matric suction inside the unsaturated samples and to prevent cavitation in the pore-water pressure measuring system. the pore-water pressure was measured, while the pore-air pressure was controlled, which provided the facility to measure the changes in matric suction during the shearing processes. pore-water pressure, uw, was measured at the base of the sample through the high air entry disk, while the pore-air pressure, ua, was applied at the top of the sample through the porous disk. the values of the initial matric suction, the applied air and water pressures, the confining pressures, and other specifications are tabulated in table 2.

table 2: layout and specification of group gus at the end of the consolidation stage
  sample     pore air pressure ua (kpa)   pore water pressure uw (kpa)   net confining pressure σ3 − ua (kpa)   initial matric suction ua − uw (kpa)
  gus1-30    30                           0                              50                                     30
  gus2-30    30                           0                              150                                    30
  gus3-30    30                           0                              300                                    30
  gus1-50    50                           0                              50                                     50
  gus2-50    50                           0                              150                                    50
  gus3-50    50                           0                              300                                    50
  gus1-150   150                          0                              50                                     150
  gus2-150   150                          0                              150                                    150
  gus3-150   150                          0                              300                                    150
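a small sketch (python; all pressures in kpa, taken from table 2 above) showing how the tabulated quantities relate: the initial matric suction is ua − uw, and the cell pressure implied by each prescribed net confinement follows as σ3 = (σ3 − ua) + ua. the cell pressures printed here are derived values, not quantities stated in the paper.

```python
# relations behind table 2 (all pressures in kpa; uw = 0 at the end of consolidation)
pore_air = {"gus-30": 30.0, "gus-50": 50.0, "gus-150": 150.0}
net_confinements = [50.0, 150.0, 300.0]   # sigma3 - ua for samples 1..3 of each subgroup
u_w = 0.0

for group, u_a in pore_air.items():
    suction = u_a - u_w                   # initial matric suction, ua - uw
    for i, net in enumerate(net_confinements, start=1):
        sigma3 = net + u_a                # implied total cell pressure
        print(f"{group} sample {i}: suction = {suction:.0f} kpa, "
              f"net = {net:.0f} kpa, implied cell pressure = {sigma3:.0f} kpa")
```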
2.4 preparation of samples
samples were prepared from sand pr33 in its dry state without any other additions. the following procedures were similar when preparing both the saturated and the unsaturated samples. a suitable former, consisting of a steel ring and a three-split mould, was used to maintain the required cylindrical shape of the specimens. the sand was dropped in layers inside a membrane fitted inside the mould. each layer was compacted using a wooden rod until the full height of the mould was achieved. after the sample had been leveled, capped and sealed, suction was applied to give sufficient strength and to make the sample self-supporting. this process led to cylindrical samples with a density of about 1.55 g·cm−3. it should be pointed out that the full saturation condition (in the case of evaluating the effective strength parameters) or the required matric suction (in the case of testing the unsaturated samples) were not achieved at this stage. however, these conditions were achieved during the consolidation stage, as will be described later. fig. 2 shows the main procedures followed to prepare the samples of sand.

fig. 2: steps used for preparing the unsaturated sand samples: (a) the three-split mould and the steel ring; (b) fitting the rubber membrane around the triaxial pedestal and installing the lower o-rings; (c) assembly of the mould around the membrane; (d) filling the mould with sand pr33; (e) compaction of sand inside the mould; (f) sample ready to be tested

2.5 testing procedures
during the consolidation stage, the samples were subjected to an all-round stress by pressurizing the water inside the cell at a prescribed confining pressure value. two different types of consolidation stages were conducted according to the purpose of the test. samples used to evaluate the effective stress parameters were consolidated against backpressure to produce saturation. the unsaturated sand samples with matric suction, (ua − uw), less than 150 kpa were consolidated by applying a constant cell pressure and then imposing air pressure at the top of the sample, while the pore-water pressure was opened to the atmosphere at the bottom. this was done in order to control the matric suction inside the samples at prescribed values. thereafter, with the confining pressure being held at a constant value after finishing the consolidation stage, the samples were sheared under strain-controlled conditions. the compression machine was set to a constant strain rate, and then was turned on to apply the axial load. during this stage, the pore-air pressure was maintained constant at the same pressure applied during consolidation, while the pore-water pressure was changing under the undrained conditions. this was achieved by closing all valves except for those that supplied the air and the cell pressures. in such conditions, this test is known as the constant water content triaxial test, cw, since no water was allowed to drain during the shearing stage. the net confining pressure, (σ3 − ua), remained constant during this stage, while the matric suction, (ua − uw), varied. to ensure equalization of pore pressures throughout the samples, the shearing process was performed at a strain rate of 0.03 mm·min−1 (i.e., approximately 0.034 % min−1), which was less than or coincided with the rates suggested for similar tests and a similar type of soil by many researchers, e.g., [4] and [2].
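the quoted strain-rate conversion can be checked directly. the sketch below (python) back-calculates the specimen height implied by the two figures, roughly 88 mm; the specimen height itself is not stated explicitly in the paper, so this is an inference.

```python
# strain-rate conversion check (section 2.5)
displacement_rate = 0.03   # mm per minute, machine setting
strain_rate_pct = 0.034    # percent per minute, as quoted

# strain rate [%/min] = displacement rate [mm/min] / specimen height [mm] * 100
implied_height = displacement_rate / (strain_rate_pct / 100.0)
print(f"implied specimen height ~ {implied_height:.0f} mm")  # ~88 mm
```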
3 test results and discussions
the results showed that the increase in matric suction did not affect the general shape of the stress–strain relationship. thus, for the whole range of the applied matric suction, the shapes of the stress–strain curves for unsaturated samples resemble those for saturated samples. it was also observed that at the same net confining pressure, the strength of an unsaturated sample is greater than that of a saturated one. for example, fig. 3 shows that the shear strength of sample gus1-50 is roughly 1.25 times that of saturated sample gs1, which indicates that an increase in soil suction leads to an increase in shear strength. the interpretation of this phenomenon is that, at low matric suction, air enters the pores and a contractile skin begins to form around the points of contact between particles. the capillary action arising from suction at the contractile skin increases the normal forces at the inter-particle contacts. these additional normal forces may enhance the friction and the cohesion at the inter-particle contacts. as a result, the unsaturated sand exhibits a higher strength than the saturated sand. however, this does not seem to be the dominant behaviour for the whole range of the applied matric suction. when comparing the maximum deviator stress values in the case of a sample exposed to a matric suction of 50 kpa with those of 150 kpa, it can be seen that the maximum effect on the deviator stress and shear strength is reached at a matric suction of 50 kpa. it was observed that beyond this value, the increase in the maximum deviator stress, (σ1 − σ3)max, drops away. this behaviour indicates that, during the consolidation process, as the matric suction increases to a relatively high value, the sand begins to dry and the water menisci start to accumulate only between fewer grain particles. the interpretation of this behaviour can be that, at low matric suction, air replaces some of the water in the large pores between the grains, which causes the water to flow through the smaller pores and form water menisci at the grain contact points. in addition, it is well known that, unlike clayey soils, sand has a very low ability to sustain water menisci between the particles. this is due to the fact that, in coarse-grained soils, water exists mainly as free pore water, while in fine-grained soils the water adsorbed on particle surfaces becomes dominant [8]. accordingly, at higher matric suction values, a further decrease in the water that occupies the pore volume of the studied sand will take place during the consolidation stage, which leads to a further decrease in the number of connections between the water and sand particles. the total contact area of the momentary contact surfaces arising from the water menisci may also be radically decreased as a great amount of continuous air phase is formed throughout the soil. thus, the total number of water menisci, which act as a glue at the grain contact points, will be fewer than at lower matric suction. as a result, the amount of increase in the strength of the studied sand begins to drop.

fig. 3: stress-strain curves for sand pr33 at various matric suctions under the effect of 50 kpa net confining pressure
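one common way to quantify the suction contribution described above is the extended mohr-coulomb envelope presented by fredlund and rahardjo [4]: τ = c′ + (σn − ua) tan φ′ + (ua − uw) tan φb. the sketch below (python) shows the bookkeeping; the c′, φ′ and φb values are illustrative placeholders, not results from this paper.

```python
import math

def shear_strength(c_eff, sigma_net, suction, phi_eff_deg, phi_b_deg):
    """extended mohr-coulomb envelope for unsaturated soil (fredlund & rahardjo):
    tau = c' + (sigma_n - ua) * tan(phi') + (ua - uw) * tan(phi_b)."""
    return (c_eff
            + sigma_net * math.tan(math.radians(phi_eff_deg))
            + suction * math.tan(math.radians(phi_b_deg)))

# illustrative placeholder parameters only; not values measured in this study
c_eff, phi_eff, phi_b = 0.0, 35.0, 15.0     # kpa, degrees, degrees
for suction in (0.0, 30.0, 50.0, 150.0):    # kpa, as in the test program
    tau = shear_strength(c_eff, 50.0, suction, phi_eff, phi_b)
    print(f"suction {suction:5.0f} kpa -> tau = {tau:6.1f} kpa at 50 kpa net stress")
```

note that this linear envelope predicts a monotonic gain with suction, so it cannot reproduce the drop in strength gain observed here beyond 50 kpa; a suction-dependent φb would be needed for that.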
to clarify the effect of matric suction on increasing the strength of the studied sand, four sets of mohr circles, representing four samples which were all studied at 50 kpa net stress but at four different matric suctions, are plotted in fig. 4. however, this figure shows that the general trend of increasing shear strength with matric suction is not as evident for sand as it is for clayey or silty soils. the failure envelopes intercept the ordinate at c = 0, 8.76, 11.51 and 7.36 kpa for matric suctions of 0, 30, 50 and 150 kpa, respectively.

fig. 4: mohr circles for sand pr33 at various matric suctions under the effect of 50 kpa net confining pressure

the variation of matric suction under the constant water content condition during the shearing stage is investigated here. matric suction versus axial strain curves are presented in figs. 5 and 6. in general, these figures show that matric suction decreases as the shearing process develops. this result can be attributed to the fact that the pore-air pressure is maintained at a constant value while the pore-water pressure increases continuously during shearing of the sample. thus, the degree of saturation of the sample is expected to increase, since the pore-air is squeezed out of the soil while the water content remains constant. the change in matric suction at failure in the case of samples tested using an initial matric suction of 50.0 kpa ranged from approximately 5.0 kpa for sample gus1-50 to nearly 20.0 kpa for sample gus3-50 (i.e., from 10 % to 40 %). similarly, the change in matric suction at failure for samples tested using an initial matric suction of 150.0 kpa ranged from 20.0 kpa for sample gus1-150 to approximately 50.0 kpa for sample gus3-150 (i.e., from 12 % to 35 %).

fig. 5: variation of matric suction during the shearing process of group gus-50 under constant water content conditions

furthermore, figs. 5 and 6 show that for the whole range of the net confining pressures used in these tests, the change in matric suction in the case of tests conducted at 50 kpa initial suction is more gradual than in those conducted at 150 kpa, which show much steeper behaviour. this can be related to the procedures followed in this study to apply the desired matric suction to the tested samples (i.e., the axis-translation technique). it is well known that the axis translation technique is usually performed by increasing the pore-air pressure, ua, which in turn increases the pore-water pressure, uw, by an equal amount. although the pore water pressure was controlled to be atmospheric during the whole consolidation stage, this surely was not the case during the shearing process under constant water content conditions. that is why, during shearing of the samples, the high air pressure applied to samples gus-150 in order to reach 150 kpa initial matric suction raised the pore water pressure to a much higher value than that inside samples gus-50, which were exposed to a lower air pressure to reach an initial matric suction of only 50 kpa.
in addition, it can be seen from figs. 5 and 6 that under the same initial matric suction condition, the variation in matric suction for samples under higher net confining pressures is much more pronounced than the variation for samples under lower confinements. similar observations were reported by adams et al. [1], herkal et al. [6] and wulfsohn et al. [11]. however, it seems that using high net confining pressures during the constant water content shearing stage reduced the volume of the sample. this is largely attributed to the fact that higher net stresses cause a greater reduction in the pore spaces (i.e., compression) during the preceding isotropic consolidation stage, which causes the particles of the sample to be in a closely packed form. under these conditions, pore-air and pore-water may become entrapped and thus less air will be expelled, which increases the pore water pressure as a result of shearing the sample while the pore-water phase is in the undrained mode. thus, the significant variation in matric suction for samples tested under high confining pressures can be attributed to the build-up of pore water pressures within the samples.

fig. 6: variation of matric suction during the shearing process of group gus-150 under constant water content conditions

4 conclusion
the results obtained from a series of triaxial tests performed on sand in its unsaturated form indicated that the shear strength of the samples increases as a result of increasing matric suction. however, this did not seem to be the dominant behaviour for the whole range of the applied matric suctions. in fact, the maximum effect on the shear strength was reached at a certain value of matric suction, and then the increase in strength drops. the results showed that unsaturated sand can attain some cohesion although the effective cohesion is equal to zero. furthermore, testing the unsaturated sand showed that the variation in matric suction during tests for samples tested under high net confining pressures is much more pronounced than that for samples under lower confinements. thus, it is recommended, when performing constant water content triaxial tests on unsaturated soils, to perform the tests using low net confining pressures in order to minimize the differences between the initial matric suction and the matric suction at failure.

references
[1] adams b. a., wulfsohn d. h.: "critical-state behaviour of an agricultural soil". journal of agricultural engineering research, vol. 70 (1998), p. 345–354.
[2] bishop a. w., henkel d. j.: the measurement of soil properties in the triaxial test. second edition. edward arnold (publishers) ltd., london, isbn 0-312-52430-7, 1962, 228 p.
[3] drumright e. e., nelson j. d.: "the shear strength of unsaturated tailings sand". proceedings of the 1st international conference on unsaturated soils, paris (france), isbn 90 5410 584 4, vol. 1 (1995), p. 45–50.
[4] fredlund d. g., rahardjo h.: soil mechanics for unsaturated soils. john wiley and sons, inc., new york, ny 10158-0012, isbn 0-471-85008-x, 1993, 517 p.
[5] fredlund d. g.: "historical developments and milestones in unsaturated soil mechanics". proceedings of the 1st asian conference on unsaturated soils (unsat-asia 2000), singapore, isbn 90-5809-139-2, 2000, p. 53–68.
[6] herkal r. n., vatsala a., murthy b. r. s.: "triaxial compression and shear tests on partly saturated soils". proceedings of the 1st international conference on unsaturated soils, paris (france), isbn 90 5410 584 4, vol. 1 (1995), p. 109–116.
[7] juca j. f. t., frydman s.: "experimental techniques". proceedings of the 1st international conference on unsaturated soils, paris (france), isbn 90 5410 586 0, vol. 3 (1995), p. 1257–1292.
[8] kezdi a.: handbook of soil mechanics vol. 1, soil physics. elsevier scientific publishing company, amsterdam (the netherlands), isbn 0-444-99890-x, 1974, 294 p.
[9] lauer c., engel j.: "a triaxial device for unsaturated sand – new developments". proceedings of the issmge international conference: from experimental evidence towards numerical modelling of unsaturated soils, september 18–19, 2003, bauhaus-universität, weimar, germany.
[10] russell a., khalili n.: "a bounding surface plasticity model for sands in an unsaturated state". proceedings of the issmge international conference: from experimental evidence towards numerical modelling of unsaturated soils, september 18–19, 2003, bauhaus-universität, weimar, germany.
[11] wulfsohn d. h., adams b. a., fredlund d. g.: "triaxial testing of unsaturated agricultural soils". journal of agricultural engineering research, vol. 69 (1998), p. 317–330.

ing. ahmed farouk ibrahim
phone: +420 224 354 555
e-mail:

doc. ing. ladislav lamboj, csc.
phone: +420 224 353 874
e-mail:

ing. jan kos, csc.
phone: +420 224 354 552
e-mail:

department of geotechnics
czech technical university in prague
faculty of civil engineering
thakurova 7
166 29 praha 6, czech republic

acta polytechnica
doi: https://doi.org/10.14311/ap.2021.61.0117
acta polytechnica 61(si):117–121, 2021
© 2021 the author(s). licensed under a cc-by 4.0 licence. published by the czech technical university in prague.

on the sensitivity of the nonlinear term in the outflow boundary condition
tomáš neustupa*, ondřej winter
czech technical university in prague, faculty of mechanical engineering, karlovo nám. 13, 121 35 praha 2, czech republic
* corresponding author: tneu@centrum.cz

abstract. this paper studies the artificial outflow boundary condition for the navier-stokes system. this type of condition is widely used and it is therefore very important to study its influence on a numerical solution of the corresponding boundary-value problem. we particularly focus on the role of the coefficient in front of the nonlinear term in the boundary condition on the outflow. the influence of this term is examined numerically, comparing the obtained results in a close neighbourhood of the outflow. the numerical experiment is carried out for a fluid flow through a channel with so called sudden extension. presented numerical results are obtained by means of the openfoam toolbox. they confirm that the kinetic energy of the flow in the channel can be controlled by means of the proposed boundary condition.
keywords: navier-stokes equations, natural outflow boundary condition, finite volume method.

1. introduction
in computational fluid dynamics, the boundaries where the velocity is not known in advance are usually denoted as open/artificial boundaries.
this situation occurs in mathematical models of many types of fluid flow (e.g. the flow of blood, the flow in various blade machines, etc.), see e.g. [1–5]. for instance, the accuracy of the dynamics of a micropolar fluid depends on the boundary conditions [6–8]. in these cases, the velocity profile is rarely available in advance on the whole boundary of the flow field; the pressure is available in some special cases when it is measured or computed with the aid of a reduced model. furthermore, the necessity of setting an appropriate boundary condition on an artificial part of the boundary becomes important also when the computational domain is obtained by truncating the length of the domain in order to reduce the computational cost. one of the boundary conditions addressing the problem of the open boundary is the so-called "do-nothing" boundary condition used e.g. by heywood, rannacher and turek in [9] (see also the so-called natural boundary condition [10]), i.e.,

$$ -\nu \frac{\partial u}{\partial n} + p\,n = 0, \qquad (1) $$

where u = (u, v) denotes the velocity of the fluid, p is the kinematic pressure (i.e., the pressure divided by the constant fluid density), ν is the kinematic viscosity (which is assumed to be a positive constant) and n is the unit outer normal vector to the boundary of the considered domain. however, condition (1) does not enable one to control the amount of kinetic energy in the domain if a backward flow appears on the "open boundary" (which is the part where the velocity profile is not given). bruneau and fabrie proposed in [11] the boundary condition

$$ -\nu \frac{\partial u}{\partial n} + p\,n - \frac{1}{2}\,(u \cdot n)^{-}\,u = 0, \qquad (2) $$

as a natural modification of (1). this extension arises naturally when the symmetric part of the convective term is integrated by parts. the superscript "−" denotes the negative part (i.e. (u·n)− = −u·n if u·n < 0, otherwise (u·n)− = 0). thus, the inequality (u·n)− > 0 is satisfied only in the case of a "backward flow" on the open boundary. this modification enables one to prove the existence of a weak solution, but only if the inflow velocity profile is bounded with respect to the viscosity, see e.g. [2]. the same condition is also used in [12] on a part of the boundary. there, the authors prove the existence of a weak solution under the stronger assumption that the inflow velocity is zero. neustupa in [13] proposed a modification of (2), i.e.,

$$ -\nu \frac{\partial u}{\partial n} + p\,n - \frac{1+\xi}{2}\,(u \cdot n)^{-}\,u = h, \qquad (3) $$

where ξ is a constant dimensionless parameter and h is an arbitrary vector function. this boundary condition, in comparison to (2), contains a correction in the nonlinear part. the correction enables the author to derive the necessary a priori estimates of a solution in the case of an arbitrarily large inflow. these results show that the coefficient in front of the nonlinear part of the used boundary condition plays an important role in theoretical considerations, particularly in the existential theory. if it is less than 1/2 (which corresponds to ξ < 0 in condition (3)), the existence of the weak solution is an open problem. if it is exactly 1/2 (which corresponds to ξ = 0 in condition (3)), the existence of the weak solution can be proved, assuming a certain restriction on the size of the inflow velocity. if the coefficient is greater than 1/2 (which corresponds to ξ > 0 in condition (3)), the existence of the weak solution can be proved for an arbitrarily large inflow, see [13].
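for a concrete feel for the nonlinear term in (3), the sketch below (python/numpy) evaluates the directional factor (1 + ξ)/2 (u·n)− u at a boundary face; the velocity and normal vectors are made-up sample values, not data from the paper.

```python
import numpy as np

def outflow_correction(u, n, xi):
    """nonlinear term of boundary condition (3): (1 + xi)/2 * (u . n)^- * u,
    where (s)^- = -s if s < 0 and 0 otherwise.
    the term is nonzero only where backflow (u . n < 0) occurs on the outlet."""
    un = float(np.dot(u, n))
    un_minus = -un if un < 0.0 else 0.0
    return 0.5 * (1.0 + xi) * un_minus * np.asarray(u)

n = np.array([1.0, 0.0])       # outward unit normal on the outlet
u_out = np.array([0.8, 0.1])   # outflow face: u . n > 0, the term vanishes
u_back = np.array([-0.4, 0.2]) # backflow face: the term switches on

for xi in (-0.6, 0.0, 0.6):
    print(xi, outflow_correction(u_out, n, xi), outflow_correction(u_back, n, xi))
```

with ξ = −1 the factor vanishes entirely and (3) reduces to the do-nothing condition (1), which is the limit invoked later when discussing the shifted contours for negative ξ.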
if the coefficient is greater than 12 (which corresponds to ξ > 0 in condition (3)) then the existence of the weak solution can be proved for arbitrary large inflow, see [13]. in the field of numerical simulations, the problem with the original “do–nothing” boundary condition is well known and the modification (2) is widely used. our goal is to test numerically the behaviour of the flow in the neighbourhood of the outflow part of the boundary, in dependence on the dimensionless parameter ξ in front of the nonlinear part of the boundary condition. we are especially interested in comparison of numerical results, obtained in the three cases, i.e. when the dimensionless parameter is less that, equal to, or greater than zero. 2. mathematical model the stationary flow of a viscous incompressible newtonian fluid is described by the navier–stokes equation and the equation of continuity, i.e., (u ·∇)u + ∇p = ν∆u + f, in ω, (4) div u = 0, in ω. (5) here, f is a specific volume force. since the parts of the boundary of ω are of different types, we impose different boundary conditions. concretely, we assume that the boundary of ω consists of four curves: ∂ω = γi ∪ γo ∪ γw+ ∪ γw−, see figure 1. the curve γi represents the inlet (i.e. the part of boundary where the fluid enters the considered domain ω) and γo is the outlet (where the fluid leaves ω). the curves γw+, γw− are the non–permeable fixed walls of the channel. we assume that the whole boundary ∂ω is lipschitz–continuous. we consider dirichlet’s boundary conditions on γi, γw+ and γw−, i.e., u = g on γi, (6) u = 0 on γw+. (7) u = 0 on γw−. (8) the curve γo represents an artificially chosen part of the boundary, and the velocity profile on γo is therefore not known in advance. we consider this concrete “artificial” boundary condition on γo: −ν ∂u ∂n + p n − 1 + ξ 2 (u · n)− u = 0 on γo. (9) 3. numerical approximation the numerical solution of the governing system is based on a collocated finite-volume method implemented in the freely available cfd toolbox openfoam [14]. the solver uses segregated approach and pressure–velocity coupling is done with aid of the semi-implicit method for pressure-linked equations (simple) algorithm, see e.g. [15]. the convective term appearing in navier–stokes equation is discretized using the limited piece-wise linear reconstruction and the viscous term is approximated using a central scheme, see [16] or [17] for more details on the spatial discretization. 3.1. boundary conditions the boundary conditions are implemented in the usual way, used in the finite volume framework, see e.g. [16]. since the simple algorithm uses the elliptic equation for pressure we need to prescribe boundary condition for the pressure on the whole boundary of ω. the numerical implementation of the boundary conditions given by (6, 7, 8, and 9) is realized as follows: • γi: the velocity profile u = (u(y), 0) is prescribed, i.e., the fully developed parabolic profile is given as u(y) = 3u 2 − 6u d2i y2 , (10) where di is inlet diameter and u denotes the mean value of the magnitude of the velocity at the inlet. the homogeneous neumann boundary condition for pressure is used; • γw+ ∪ γw−: no-slip boundary condition, i.e., u = 0 and the homogeneous neumann boundary condition for pressure is used; • γo: the artificial boundary condition (3) is implemented in the finite volume framework in the following way, i.e., ∂u ∂n ∣∣∣ b = 1 ν [ (pb −p0)n − 1 + ξ 2 (up · n)−up ] , (11) where p0 is the referential value of the pressure, constant on γo. 
figure 2. u along the boundary γo, re = 10, for ξ < 0 and ξ = 0; "long" denotes the result obtained on the prolonged domain (le = 50 di).

4. numerical results
the influence of the coefficient ξ is studied in the case of a flow through a channel with the so-called sudden extension, see figure 1. the computations were done for ξ = −0.6, −0.5, −0.3̄, 0, 0.3̄, 0.5, 0.6, corresponding to 1/5, 1/4, 1/3, 1/2, 2/3, 3/4 and 4/5 for the coefficient in front of the nonlinear term in (3). furthermore, the computations were done for three different reynolds numbers, re = ū di/ν: re = 10, 100, 500. for an optimal occurrence of the backward flow in the neighbourhood of the outlet, we used different ratios of the sudden extension de/di (outlet diameter / inlet diameter) for each reynolds number, namely 4, 2 and 1.5 for re = 10, 100 and 500, respectively. the following data were considered: di = 1 m, ū = 1.5 m/s, li = 1.5 m, le = 0.5 m.

figures 2 and 3 show profiles of the velocity u on γo for re = 10, ξ ≤ 0 and ξ ≥ 0, respectively. there are no visible differences in dependence on the varying ξ. figure 4 shows details of the contours of u for re = 10 for ξ = −0.6, 0, 0.6. one can see that ξ = 0 and ξ = −0.6 give almost the same results, but there is a significant difference (shift) between the contours obtained with ξ = 0 and ξ = 0.6. figure 5 shows profiles of u on γo for re = 100, ξ ≤ 0. small differences can be observed for different ξ. however, for re = 100 we were not able to obtain any solution for ξ > 0 due to a lack of convergence. more cases were computed (not published in this work) with different extension ratios for ξ ∈ 〈0, 0.3̄〉 and we were not able to find a general sharp borderline for the value of ξ. the lack of convergence seems to depend on the magnitude of the velocity occurring in the region of γo where a backward flow occurs.

figure 3. u along the boundary γo, re = 10, for ξ > 0 and ξ = 0; "long" denotes the result obtained on the prolonged domain (le = 50 di).
figure 4. detail of the contours of u in region d, see figure 1, re = 10. black, red, blue, and green lines indicate results for the prolonged domain (le = 50 di), ξ = 0, ξ = −0.6 and ξ = 0.6, respectively.

figure 6 shows a detail of the contours of u for re = 100 for ξ = 0, −0.6. one can see that there is a good correspondence between the contours for ξ = 0 and those in a "longer" domain, while the results for ξ = −0.6 are slightly shifted. this can be explained by the fact that, as ξ → −1, the boundary condition (3) converges to the do-nothing boundary condition (1), which does not take into account any backward flow on γo. figure 7 shows profiles of u at γo for re = 500, ξ ≤ 0. no visible differences can be observed for different ξ. figure 8 shows a detail of the contours of u for re = 500 for ξ = 0, −0.6. a similar conclusion can be made for ξ > 0 as for the case of re = 100.
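two small consistency checks on the quantities quoted in this section (python): the coefficients (1 + ξ)/2 for the tested ξ values, and the kinematic viscosities implied by re = ū di/ν with ū = 1.5 m/s and di = 1 m. the viscosities are derived values, not stated in the paper.

```python
# coefficients (1 + xi)/2 for the tested values of xi (note -0.3bar = -1/3)
for xi in (-0.6, -0.5, -1/3, 0.0, 1/3, 0.5, 0.6):
    print(f"xi = {xi:+.3f} -> (1 + xi)/2 = {(1 + xi) / 2:.3f}")
# -> 0.200, 0.250, 0.333, 0.500, 0.667, 0.750, 0.800  (= 1/5 ... 4/5)

# kinematic viscosities implied by re = u_bar * d_i / nu
u_bar, d_i = 1.5, 1.0
for re in (10, 100, 500):
    print(f"re = {re:3d} -> nu = {u_bar * d_i / re:.4f} m^2/s")
# -> 0.1500, 0.0150, 0.0030 m^2/s
```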
figure 5. u along the boundary γo, re = 100, for ξ < 0 and ξ = 0; "long" denotes the result obtained on the prolonged domain (le = 50 di).
figure 6. detail of the contours of u in region d, see figure 1, re = 100. black, red, and blue lines indicate results for the prolonged domain (le = 50 di), ξ = 0, and ξ = −0.6, respectively.
figure 7. u along the boundary γo, re = 500, for ξ < 0 and ξ = 0; "long" denotes the result obtained on the prolonged domain (le = 50 di).
figure 8. detail of the contours of u in region d, see figure 1, re = 500. black, red, and blue lines indicate results for the prolonged domain (le = 50 di), ξ = 0, and ξ = −0.6, respectively.

5. conclusion
a numerical investigation of the boundary condition (3), which allows one to control the amount of kinetic energy in the domain, was carried out. the boundary condition was tested for different magnitudes of the inflow velocity with respect to the viscosity, namely re = 10, 100, 500. from the obtained results one can conclude that if the inflow velocity is sufficiently small, then ξ can be chosen so that ξ > 0 (ξ ∈ 〈−0.6, 0.6〉 in our simulations). with an increasing inflow velocity, a lack of convergence for ξ > 0 can be observed. from other numerical results (not presented in this work), it is possible to conclude that the convergence for ξ > 0 strongly depends on the magnitude of the possible reverse velocity on the outflow and also on the size of the backward flow area. studying the dependence of the backward flow area, as a subset of γo, on multiple parameters, e.g. inlet velocity, reynolds number, and extension ratio, we were not able to find any sharp general border for the convergence criteria on the coefficient ξ. however, one can observe that the region in which we are able to find a numerical solution is approximately re ∈ (0, 50〉 for ξ ∈ 〈−0.6, 0.6〉. hence, the used numerical method converges for small inlet data for all ξ ∈ 〈−0.6, 0.6〉. for larger inlet data, the method converges only for ξ ∈ 〈−0.6, 0〉, where ξ = 0 corresponds to the boundary condition introduced in [11]. the used numerical method thus does not confirm the theoretical conclusion presented in [13], i.e. that for ξ > 0 the existence of a weak solution can be proved for an arbitrarily large inflow, in the sense that the numerical method does not converge in the situations described above. there are still relatively many open problems connected with the boundary condition (3), both in the field of qualitative analysis and in numerical computations. for example, the condition (3) can be reformulated in terms of the pressure (see e.g. [18]), which can possibly be more suitable for the finite volume framework and the simple algorithm. furthermore, it also seems desirable to test another approach, e.g. based on the finite element method, due to its closer relation to the weak formulation.

acknowledgements
the research was supported by the european regional development fund project "center for advanced applied science" no. cz.02.1.01/0.0/0.0/16_019/0000778.
authors acknowledge support from the eu operational programme research, development and education, and from the center of advanced aerospace technology (cz.02.1.01/0.0/0.0/16_019/0000826), faculty of mechanical engineering, czech technical university in prague, and by the grant no. 19-04477s of the czech science foundation.

references
[1] m. feistauer, t. neustupa. on non-stationary viscous incompressible flow through a cascade of profiles. mathematical methods in the applied sciences 29(16):1907–1941, 2006. doi:10.1002/mma.755.
[2] m. feistauer, t. neustupa. on some aspects of analysis of incompressible flow through cascades of profiles. in operator theoretical methods and applications to mathematical physics, vol. 147, pp. 257–276. birkhäuser basel, 2004. doi:10.1007/978-3-0348-7926-2_29.
[3] m. feistauer. mathematical methods in fluid dynamics. longman scientific & technical, 1993.
[4] s. i. abdelsalam, m. m. bhatti. the study of non-newtonian nanofluid with hall and ion slip effects on peristaltically induced motion in a non-uniform channel. rsc advances 8(15):7904–7915, 2018. doi:10.1039/c7ra13188g.
[5] y. abd elmaboud, s. i. abdelsalam, k. s. mekheimer. couple stress fluid flow in a rotating channel with peristalsis. journal of hydrodynamics 30:307–316, 2018. doi:10.1007/s42241-018-0037-2.
[6] s. s. motsa, i. l. animasaun. bivariate spectral quasi-linearisation exploration of heat transfer in the boundary layer flow of micropolar fluid with strongly concentrated particles over a surface at absolute zero due to impulsive. international journal of computing science and mathematics 9:455–473, 2018. doi:10.1504/ijcsm.2018.10016504.
[7] o. k. koriko, i. l. animasaun. new similarity solution of micropolar fluid flow problem over an uhspr in the presence of quartic kind of autocatalytic chemical reaction. frontiers in heat and mass transfer 8, 2017. doi:10.5098/hmt.8.26.
[8] o. k. koriko, a. j. omowaye, i. l. animasaun, m. e. bamisaye. melting heat transfer and induced-magnetic field effects on the micropolar fluid flow towards stagnation point: boundary layer analysis. international journal of engineering research in africa 29:10–20, 2017. doi:10.4028/www.scientific.net/jera.29.10.
[9] j. g. heywood, r. rannacher, s. turek. artificial boundaries and flux and pressure conditions for the incompressible navier-stokes equations. international journal for numerical methods in fluids 22(5):325–352, 1996. doi:10.1002/(sici)1097-0363(19960315)22:5<325::aid-fld307>3.0.co;2-y.
[10] r. glowinski. numerical methods for nonlinear variational problems. springer, 1984.
[11] c.-h. bruneau, p. fabrie. new efficient boundary conditions for incompressible navier-stokes equations: a well-posedness result. mathematical modelling and numerical analysis 30(7):815–840, 1996. doi:10.1051/m2an/1996300708151.
[12] m. braack, p. b. mucha. directional do-nothing condition for the navier-stokes equations. journal of computational mathematics 32(5):507–521, 2014. doi:10.4208/jcm.1405-m4347.
[13] t. neustupa. a steady flow through a plane cascade of profiles with an arbitrarily large inflow: the mathematical model, existence of a weak solution. applied mathematics and computation 272:687–691, 2016. doi:10.1016/j.amc.2015.05.066.
[14] h. g. weller, g. tabor, h. jasak, c. fureby. a tensorial approach to computational continuum mechanics using object-oriented techniques. computers in physics 12(6):620–631, 1998. doi:10.1063/1.168744.
[15] j. h. ferziger, m. perić. computational methods for fluid dynamics. springer, 3rd edn., 2002.
[16] f. moukalled, l. mangani, m. darwish. the finite volume method in computational fluid dynamics: an advanced introduction with openfoam and matlab. springer, 2016.
[17] o. winter, p. sváček. on numerical simulation of flexibly supported airfoil in interaction with incompressible fluid flow using laminar-turbulence transition model. computers & mathematics with applications, 2020. doi:10.1016/j.camwa.2019.12.022.
[18] t. bodnár, p. fraunié. artificial far-field pressure boundary conditions for wall-bounded stratified flows. in topical problems of fluid mechanics 2018. institute of thermomechanics, as cr, v.v.i., 2018. doi:10.14311/tpfm.2018.002.

a novel approach to power circuit breaker design for replacement of sf6
d. j. telfer, j. w. spencer, g. r. jones, j. e. humphries

this contribution explores the role of ptfe ablation in enhancing current interruption for various background gases in high voltage circuit breakers. an assessment of the current interruption capability has been made in terms of the arcing duration and the contact gap length at which critical arc extinction is achieved. these observations are supported by measurements of the magnitude of extinction and re-ignition voltage peaks. most previous and other current experimental work on gas filled circuit breaker design follows conventional wisdom in investigating arcing behaviours at elevated gas pressures (usually up to 6 bar). but in this work we concentrate on the effects of using low gas pressures (less than 1 bar) in the presence of a close-fitting shield of ablatant polymer material (ptfe) that surrounds the electrode assembly of an experimental high power circuit breaker. we demonstrate that for several different gases, arc extinction capability compares well under these conditions with sf6, suggesting that sf6 could be replaced entirely in this novel system by more environmentally friendly gases. moreover, the critical contact gap lengths at extinction are only slightly greater than when using sf6 at 6 bar. weight loss measurements from the ablatant shield suggest that a chemical puffer action is the most likely mechanism for achieving the observed arc extinctions in this system.
keywords: circuit breaker, power distribution, sf6 replacement, arc-induced ablation, anti-pollution.

1 introduction
there have been many attempts to find arc gas environments that might compete with sf6 as an efficient arc-extinguishing medium. although a range of pure gases and gas mixtures has been investigated [2], a useful combination remains to be identified. this quest has recently been given impetus, fuelled by the realisation that sf6 is one of the most potent greenhouse gases known (christophorou et al., 1997 [1]). this contribution explores the possibility of effective arc extinction with gases at lower pressures than are usually used, and preferably without sf6. in this paper we revisit the problem of investigating the arc-extinguishing capabilities of some relatively environmentally friendly gases, and also sf6, but now in a system that has more affinity with puffer type interrupters, an area that has to some extent been hitherto neglected. particular attention is paid to the possible ablation of ptfe from the flow constraining shield and the manner in which the different gases introduced ('background gases') might affect such ablation effects. various forms of ptfe sleeve inner linings have been investigated, including solid rings and compressed powders. a solid sleeve, honed to a clearance fit without inner lining, was found to perform best, and is the arrangement described in this work. judgements regarding the relative effectiveness of the various conditions for promoting arc extinction are made on the basis of a number of factors which could be conveniently assessed. these have included measurements of the minimum contact gap length at which current interruption can be achieved and the extent to which arc extinction and re-ignition voltage peaks occur. the gases investigated are sf6, n2, air, co2 and ar. the effect of pressure has also been considered for each of these gases.
2 equipment used and experimental approach
the experimental interrupter unit used in these investigations is shown schematically in fig. 1 and is based on that of telfer et al. [4, 5]. the ptfe shield configuration was fitted to an industrial circuit breaker unit rated at 145 kv and 60 ka. the electrode sub-assembly consisted of a hollow fixed contact (copper-tungsten) and a solid cylindrical moving contact (copper-tungsten), which withdrew along a 22 mm diameter sliding-fit cylindrical channel within the ptfe sleeve. the entire arrangement is housed within a gas tight enclosure. the equipment thus allowed exploration of the role of arc-induced ablation from the close-fitting ptfe cylinder, resulting in enhanced current interruption, for various background gases in high voltage circuit breakers. the unit was used to interrupt currents produced by a resonant lc circuit [3] at 6.3 kv and tuned to approximately 50 hz. peak ac currents up to about 20 ka were investigated. the power source also provided a capability to produce a low level quasi-steady current prior to switching to the full ac fault current mode.

fig. 1: relationship between physical distance and current waveform (see text)

fig. 1 shows the various timing sequences, ranging from circuit breaker trigger timing (dark square to left of waveform) to capacitor bank dump operation (end of period duration c). the approximate position of the interrupter contact at various time instances can be recognised by comparing the time intervals on the current waveform diagram with the corresponding position on the contact displacement diagram. in a typical current waveform produced by the current source, the preliminary quasi-steady current (starting left of a) is followed by two half cycles (b, c) at approximately 50 hz. as delay a is increased, the waveform in b and c is pushed to the right, until a point is reached where the contacts will have separated too far to support a two half-cycle regime. at this critical gap and beyond, there is no re-ignition of the arc past current-zero and only one half-cycle of current is sustained. the critical gap (cg), which marks the onset of the single half-cycle regime, is sensitive to arcing conditions including the pressure and nature of the gases present in the arc channel. thus for a high dielectric strength medium like high pressure (> 6 bar) sf6, the value of cg will be small.
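the lc source described above is tuned so that f = 1/(2π√(lc)) ≈ 50 hz. the sketch below (python) shows the relation with illustrative component values; the actual bank capacitance and inductance are not given in the paper, so the numbers are placeholders chosen only to satisfy the 50 hz tuning.

```python
import math

def resonant_frequency(l_henry, c_farad):
    """natural frequency of an lc circuit: f = 1 / (2 * pi * sqrt(l * c))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_henry * c_farad))

# illustrative values only; the real bank parameters are not reported.
# any pair with l * c = 1 / (2 * pi * 50)^2 ~ 1.013e-5 s^2 gives 50 hz.
l, c = 1.0e-4, 0.1013   # 0.1 mh and ~101 mf
print(f"f = {resonant_frequency(l, c):.1f} hz")   # ~50.0 hz
```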
the relationship between timing and distance was ascertained by calibration runs in an open chamber using a high-speed camera. the time-position curve was characteristically s-shaped, with a sufficiently linear central region to allow simple conversion between time and distance of electrode travel. for instance, for the example shown, the contact separation (start of period a) was 2.43 cm at the point at which the first half cycle was triggered (start of b). the duration of the quasi-steady current period was pre-set as required, as was the peak alternating current.

3 results and discussion
in this work, an assessment of the current interruption capability was made in terms of the arcing duration and the contact gap length at which critical arc extinction was achieved. these critical conditions were indicated by a transition from a two half-cycle to a one half-cycle regime, with cg (cm) being measured at the onset of the first one half-cycle firing. in fig. 2 we plot the shortest gap length between moving and fixed contacts needed to successfully interrupt an alternating fault current of peak value 20 ka against gas pressure p (psi, absolute) in the range 0.8 to 84 psi. accuracy was estimated to be ±0.1 division (±1 mm) for cg, and for pressure ±2 % at 10 psi, degrading to ±20 % at 1 psi. fig. 2 shows that there is a sub-range of background gas pressures between 2 and 10 psi for which arc extinction performance improves. in this respect, sf6 maintains its lead, but the other gases are not far behind. the noticeable features are:
(a) relatively small dependence upon gas pressure with sf6;
(b) the improved interruption with n2 for p < 7 psi;
(c) the similar performance of n2 to sf6 for p < 7 psi;
(d) the similar behaviour of co2 and air to n2 and sf6 for p < 7 psi;
(e) the generally poorer performance of ar, but none the less showing a similar trend to n2 and co2;
(f) in this apparatus, there is convergence in performance for n2 and sf6 at higher pressures.
in (f) above, for sf6 and n2 at higher pressures, gas pressure, particularly within the confining space of the cylindrical shield, becomes the dominating influence on arc extinction. on the other hand, the similar performance of most gases below 7 psi implies the dominance of a common feature, believed to be ptfe ablation and subsequent pressurisation due to arc heating of the ablation by-products. weighing the ptfe billet after some 250 test firings indicates on average a ptfe weight loss of 0.14 grams per test (for cylinder and moving electrode diameter 2.2 cm). the erosion of the ptfe channel wall is significant but not excessive, and the performance deteriorates only slightly over 250 tests at fault currents of 20 ka max.
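the quoted per-test weight loss gives a feel for the shield's service life. the sketch below (python) totals it over the 250-test series and converts to volume assuming a typical ptfe density of about 2.2 g/cm3, which is a handbook value and not a figure from the paper.

```python
# cumulative ptfe ablation over the test series (section 3)
loss_per_test_g = 0.14   # g per firing, from billet weighing
n_tests = 250
rho_ptfe = 2.2           # g/cm3, typical handbook density for ptfe (assumed)

total_loss_g = loss_per_test_g * n_tests
print(f"total mass loss:  {total_loss_g:.0f} g")                 # 35 g
print(f"ablated volume:  ~{total_loss_g / rho_ptfe:.0f} cm3")    # ~16 cm3
```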
chemical calculations indicated that during these tests, an average of about 30 % of the energy of the arcing process was being used to effect ablation. thus, irrespective of the 'background gas' being used, transient production of the required chemical species for chemical puffer action is being achieved by arc-induced ablation of the polymeric material from the walls of a quasi-cylindrical channel within which the electric arc burns.

experiments were also conducted to examine the effect of peak alternating current on the critical gap length for current interruption (fig. 3) at a pressure of 3.7 psi. these show a trend for the interruption performance at lower currents to be approximately at least as effective as at 20 ka, as judged by the critical gap length criterion. tests were also carried out using a gas mixture, 80:20 n2:sf6, which behaves in a similar manner to pure sf6 and n2. at the present state of knowledge, there therefore appears to be no significant advantage in utilising n2:sf6 mixtures in preference to pure n2, unless the recovery of dielectric strength might be improved.

fig. 2: critical contact gap length (cm) as a function of background gas pressure (log scale) for a peak alternating current of 20 ka with various types of gases (sf6, n2, air, co2, ar). arc extinction performance varies inversely with the value of cg.
fig. 3: critical contact gap length (cm) as a function of maximum fault current with different gases (sf6, n2, co2, ar). arc extinction performance varies inversely with the value of cg.

the critical gap length results of figs. 2 and 3 are supported by measurements of the magnitude of the voltage extinction and re-ignition peaks (as defined in fig. 4). these were measured close to the critical gap length for current interruption. fig. 5 compares the relative magnitude of the extinction and re-ignition peaks for a number of gases at 20 ka and a pressure of 3.7 psi. the pre-critical extinction and re-ignition peak results show that under conditions of 3.7 psi at 20 ka, the re-ignition peak (rp) in sf6 (ca. 200 v) is considerably greater (~2×) than in n2, but the second half-cycle extinction peaks (xp2) in sf6 and n2, although small, are both larger than those for the other gases. interestingly, the first half-cycle extinction peaks (xp[1] and xp[2]) for n2 are larger than those for sf6 and the other gases. the re-ignition peak magnitude when using ar is relatively high (cf. also telfer et al. [4]), in spite of its relatively poor enhancement of performance (fig. 2). this may possibly be attributable to a difference in the prevailing mechanism for this inert gas and is worth further research. otherwise, the evidence from extinction and re-ignition peak measurements adds credence to the use of n2 as a replacement for sf6 in this type of system.

4 summary and conclusions
an experimental circuit breaker using a close-fitting ablatant shield around the moving electrode has been operated at arc fault currents up to 20 ka using various low pressure background gases, including comparison with sf6. extinction behaviours using n2, air or co2 are shown to be similar under these conditions and approach the operating performance when using sf6. at low pressures below 1 bar (absolute), expansion of the arc flame appears to assist the ablation process.
the background gas molecules, along with gases evolved from the ablatant material and dissociation products, contribute to the localised chemical puffer action. operation at lower pressures reduces the quantity of sf6; but given the desirability of replacing sf6 entirely because of its undesirable 'greenhouse' properties, these results also show that complete replacement of sf6 by n2 at low pressure operation is a viable approach to circuit breaker design.

fig. 4: example voltage profile (at top) showing extinction and re-ignition peaks (n2, 1 bar)
fig. 5: extinction and re-ignition peaks with a ptfe cylindrical block, clearance fitting, 145 kv breaker at 20 ka and 3.7 psi (xp1 = first half-cycle extinction peak, xp2 = second half-cycle extinction peak, rp = second half-cycle re-ignition peak, [1] signifies single half-cycle regime, [2] signifies two half-cycle regime)

5 acknowledgment
the authors appreciate the financial support provided by the engineering and physical sciences research council of the uk.

6 future work
further investigations are warranted to confirm the trends observed and to investigate in more detail the effects of background gas pressure, ptfe configurations, including bore geometries, and weight loss through ablation. future research will include more quantitative determination of ptfe removal under various controlled conditions. it will also be extended to other appropriate polymeric materials, with suitable background gases and gas pressures, and electromagnetic arc driving forces in order to enhance the production of the suitable chemical species. efforts are required to establish the optimum combination of these operational conditions that are most likely to produce the required chemical conditions transiently.

references
[1] christophorou l. g., olthoff j. k., green d. s.: "gases for electrical insulation and arc interruption: possible present and future alternatives to pure sf6." nist tech. note 1425, 1997, 44 p. (us dept. of commerce).
[2] frind g., kinsinger r. e., miller r. d., nagamatsu t. t., noeske h. o.: epri el-284, project 246-1, final report, january 1977.
[3] furlong s. a.: characterisation of electromagnetic emissions from circuit breaker arcs. phd thesis, the university of liverpool, 1999.
[4] telfer d. j., humphries j., spencer j. w., jones g. r.: "influence of ptfe on arc quenching in an experimental self-pressurised circuit breaker." proc. 14th int. conf. on arc discharges and their applications, liverpool, 2002.
[5] telfer d. j., spencer j. w., jones g. r., humphries j.: patent pending, 2002.

d. j. telfer
e-mail: djtelfer@liverpool.ac.uk
j. w. spencer
g. r. jones
j. e. humphries
centre for intelligent monitoring systems
department of electrical engineering and electronics
university of liverpool
brownlow hill, liverpool l69 3gj, uk
acta polytechnica
doi:10.14311/ap.2018.58.0104
acta polytechnica 58(2):104–108, 2018
© czech technical university in prague, 2018
available online at http://ojs.cvut.cz/ojs/index.php/ap

stability of synthesized silver nanoparticles in citrate and mixed gelatin/citrate solution
jana kavuličová (a,*), anna mražíková (b), oksana velgosová (b), dana ivánová (a), martina kubovčíková (c)
a institute of metallurgy, faculty of materials, metallurgy and recycling, technical university of kosice, slovakia
b institute of materials, faculty of materials, metallurgy and recycling, technical university of kosice, slovakia
c institute of experimental physics, slovak academy of sciences, kosice, slovakia
* corresponding author: jana.kavulicova@tuke.sk

abstract. the study focuses on an investigation of the influence of both citrate and mixed gelatin/citrate as a reductant and stabilizer on the colloidal stability of silver nanoparticles (agnps) synthesized by a chemical reduction of ag+ ions, after a short- (7th day) and long- (118th day) term storage. the formed agnps were characterized by uv–vis spectroscopy, transmission electron microscopy (tem), dynamic light scattering (dls) and zeta-potential (zp) measurements. the obtained results revealed that the short-term stability of the synthesized agnps was greatly influenced by the citrate stabilizer in the absence of gelatin. smaller-sized agnps (average particle diameter of 3 nm), roughly spherical in shape, were obtained with a narrow size distribution. the very negative value of the zeta-potential confirmed a strong stability of the citrate capped agnps. however, the surface coating of the agnps by the gelatin/citrate stabilizer was found to be the dominant contributor to improving the long-term stability of the agnps (average particle diameter of 26 nm). the use of gelatin in the mixed stabilizer solution provided agnps with a higher monodispersity and a controllable size after both the short- and long-term storage.
keywords: silver nanoparticles; sodium citrate; gelatin; size; stability.
the fabrication of highly characterized nanomaterials requires the development of their specific physical and chemical properties. among these nanomaterials, silver-containing nanocomposites in particular have a wide range of potential applications in diverse areas, such as electronics, catalysis [1], cosmetics, wastewater treatment, the textile industry and biomedical devices [2]. many practical applications require stable silver nanoparticle (agnp) dispersions. improving the parameters that influence the stability of silver nanoparticles (agnps) against aggregation is an essential task in the optimization of the agnps synthesis [3, 4]. the factors that influence the stability of agnps include the concentration of the precursor, ph, temperature, and the order of mixing of reactants and stabilizers [6, 7]. for example, different kinds of stabilizers/capping agents, such as surfactants, polymers and reducing agents, are used for the surface modification of agnps during their synthesis to control the size, morphology, stability and physicochemical properties of agnps [8–10]. silver nanoparticles are typically prepared through the reduction of a silver precursor using chemical or physical means [4, 9]. the chemical reduction of a silver nitrate salt in an aqueous medium by a reducing agent in the presence of a surfactant is a facile synthetic method. many authors have also reported that sodium citrate serves both as a reducing and a stabilizing agent in the preparation of silver nanoparticles [11–13]. among stabilizers, gelatin has also been used as a natural biopolymer for the stabilization of inorganic nanoparticles and is known to be biodegradable in physiological environments [4, 14]. the novelty of this study lies in its modified approach to the synthesis of agnps by a surface-protecting chemical reduction of silver nitrate using a mixed gelatin/sodium citrate solution both as a stabilizer and as a reducing agent. the effects of the reducing agent and the stabilizer on the particle size, size distribution, shape and stability of the agnps were evaluated after a short-term (7th day) and a long-term (118th day) storage and compared to those of agnps synthesized using the standard reduction of the silver salt in sodium citrate.

2. materials and methods

2.1. chemicals

silver nitrate (p.a., mikrochem), sodium citrate (p.a., lachema) and gelatin (p.a., lachema) were used for the synthesis of the silver nanoparticles without any further purification. demineralized water (conductivity 0.3 µs cm−1) from a demiwa 3 rosa unit (watek) was used to prepare the solutions.

2.2. synthesis of silver nanoparticles by using the sodium citrate method

chemically synthesized agnps were prepared using a chemical reduction method as demonstrated in [15]. aqueous agno3 (0.29 mm) was used as the ag+ salt precursor. to this solution, sodium citrate (1 %) was added drop by drop as a reducing agent. during the synthesis, the solution was heated to an elevated temperature (∼ 90 °c) and mixed vigorously until a colour change was evident. the ph of the prepared agnps colloids was close to 7.

2.3. synthesis of silver nanoparticles by using the mixed gelatin/sodium citrate method

the experiment was modified using the same conditions as in the previous method, with a reaction of aqueous agno3 (0.29 mm) and sodium citrate (1 %), but with gelatin.
a 0.01 % gelatin concentration was added to the solution of agno3 prior to the addition of the reducing agent in order to prevent particle agglomeration.

2.4. methods for determination

the formation of agnps was monitored by measuring the uv–vis spectra with a unicam uv/vis spectrometer uv4. the absorbance was recorded after 5 h, 24 h and on the 3rd, 7th, 10th, 15th, 23rd, 30th, 90th, 118th and 150th day. a transmission electron microscope (tem) (jeol jem-2000fx operated at an accelerating voltage of 200 kv) was used to determine the size and the morphology of the agnps on the 7th and 118th day. the samples were prepared by placing a drop of the colloidal solution of agnps on a carbon-coated nickel grid and drying it completely at room temperature. dynamic light scattering (dls) was carried out to estimate the hydrodynamic size of the agnps in suspension on the 7th and 118th day. the samples were measured using a zetasizer nano zs (malvern instruments). the zeta (ζ) potential was determined using the laser doppler electrophoretic measurement technique with a scattering angle of 173° at 25 ± 0.1 °c.

3. results and discussion

the inset of figure 1 shows the photographs of samples obtained under different conditions. in the reaction of agno3 with citrate without gelatin, the silver colloids turned pale yellow-brown, which implied the formation of agnps, while the addition of gelatin to the ag+ solution led to a change of the solution colour to dark brown (after 20 min), indicating the significant influence of gelatin, in its role as a reducing agent, on the final colour of the agnps dispersion. similarly, the ability of gelatin to act as a good reducing agent in the synthesis of agnps was reported in the study [14]. the synthesis of agnps was confirmed by uv-vis spectroscopy. the surface plasmon resonance (spr) bands for agnps have been reported to be in the range of 380–450 nm. the spr bands are influenced by the size, shape, morphology and composition of the prepared agnps [16]. the symmetry and broadness of the absorption bands may be related to variously shaped and/or sized agnps [17, 18]. the absorption spectra (figure 1a) of the silver colloids without gelatin displayed a symmetric and broader spr peak of a very low intensity with no significant spectral shift (from 420 to 422 nm) after a short storage (range 3–23 days), indicating a synthesis of uniform and stable agnps. based on the symmetry of the peak, it is assumed that the majority of the synthesized agnps did not aggregate. in the case of adding gelatin to the ag+ solution, an asymmetric and narrower spr band with a significant spectral shift (from 414 to 432 nm) after a short storage revealed a less uniform particle size distribution (figure 1b) as compared with that seen in the absence of gelatin.

figure 1. uv-vis spectra of agnps a) capped with citrate, b) capped with gelatin/citrate as a function of reaction time, c) coated with different capping agents after 7 days. the inset shows the colour change upon synthesis of agnps.
however, insignificant changes in the red-shift of the spr band (from 432 to 434 nm) were observed after a long storage (range 30–150 days), suggesting a good stability of the prepared agnps over a long time. the intensity of the spr bands gradually increased over time due to the continuous synthesis of agnps [19, 20]. as shown in figure 1b, the spr bands were proportionally more heightened than in figure 1a, which corresponds to the combined effects of citrate/gelatin as reducing agents, resulting in a more intensive colour of the agnps colloid. according to the authors of [21, 22], the spr peak located between 410 and 450 nm during the synthesis of agnps might be attributed to spherical or roughly spherical silver nanoparticles (figure 1). figure 1c compares the spr bands of the agnps coated with citrate, gelatin and gelatin/citrate after 7 days of storage. it is interesting to note the significant reduction ability of gelatin in the synthesis of agnps. the particle size distribution and the shapes of the prepared agnps were confirmed by a tem analysis on the 7th and 118th day. the tem images revealed roughly spherical agnps in both the absence and the presence of gelatin (figure 2).

figure 2. tem images of agnps prepared using a) citrate and b) gelatin/citrate on the 7th day, c) citrate and d) gelatin/citrate on the 118th day.

smaller agnps were obtained after a short storage (figure 2ab) and the gelatin/citrate-capped agnps showed a higher degree of monodispersity (figure 2bd) than the agnps without gelatin, notably after 118 days (figure 2c). the tem images (on the 7th day) displayed agnps with an average particle diameter of 3 and 15 nm without and with gelatin, respectively (figure 3ab). according to the results, smaller agnps with a narrow size distribution, ranging from 1–22 nm (figure 3a), were prepared when only citrate was used as the reducing and stabilizing agent. as indicated above, the tem results confirmed that the agnps synthesized in the absence of gelatin are more stable after a short storage. the tem images (on the 118th day) demonstrated that a higher percentage of the agnps synthesized with gelatin (42 %) are smaller than those prepared without gelatin (10 %), with the former in the range of 4 to 55 nm and an average diameter of 26 nm (figures 2d and 3d). the usage of the mixed gelatin/sodium citrate method thus allows a synthesis of size-controlled agnps. according to [23], the better control of the nucleation and growth of nanoparticles is caused by the combined effect of two different reducing agents. the size of the agnps without gelatin showed a wider particle size distribution (from 7 to 85 nm in diameter) with an average diameter of 36 nm (figures 2c and 3c). this can be explained by the fact that gelatin forms a coating on the surface of the agnps during the particle growth process and prevents aggregation through a steric hindrance due to its high molecular weight and significant surface activity [4, 24]. it seems that the stability of the sterically stabilized silver colloids in the presence of gelatin predominates over the electrostatic stabilization caused by the capping of citrate onto the surface of the agnps.
moreover, the addition of excess citrate as a reducing agent in the absence of gelatin forms larger agnps in comparison with the agnps made in the presence of gelatin (figure 2). likewise, the effect of higher citrate concentrations during the growth of existing agnps nuclei through the surface reduction of silver ions has also been described by [25]. this supports the observation in the uv-vis spectra that the gelatin/citrate-capped agnps exhibited a long-term stability.

figure 3. the size histogram of agnps prepared using a) citrate and b) gelatin/citrate on the 7th day, c) citrate and d) gelatin/citrate on the 118th day.

in addition, the average hydrodynamic diameter, the polydispersity index (pdi) and the zeta potential of the agnps in dispersion gathered by the dls, both in the absence and presence of gelatin, are shown in table 1.

table 1. average values of the agnps diameter determined by the dls method and from the tem images, pdi and zeta potential of the agnps colloids on the 7th and 118th day (diameters in nm, zeta potentials in mv).

sample      ph    | 7th day:  tem  dls  zeta   pdi  | 118th day:  tem  dls  zeta   pdi
agnps (a)   6.77  |           3    103  −37.3  0.28 |             36   164  −10.5  0.44
agnps (b)   6.93  |           15   146  −7.8   0.20 |             26   174  −19.7  0.21

the sizes observed by the dls (figure 4ab) are significantly higher than the sizes obtained by the tem. this could be attributed to the effective coating of gelatin and/or citrate on the surface of the agnps.

figure 4. dls size distribution of agnps prepared using a) citrate, b) gelatin/citrate on the 7th and 118th day.

the agnps in the colloids are electrostatically stabilized because of their negative surface charge arising from the presence of oxo- or hydroxo-functional groups of the citrate on the agnps surface [26]. the zeta potential analysis also confirmed the strong stability of the agnps without gelatin (−37.3 mv) after a short storage, due to the citrate layer formed on the surface of the agnps through an electrostatic interaction. however, the addition of gelatin resulted in a significantly lower negative value of the zeta potential (−7.8 mv). this probably relates to a possible interaction between the amino groups of gelatin and the carboxyl groups of citrate coated on the surface of the agnps. the stability of agnps coated by organic coatings with varying molecular weights and functional groups is achieved through adsorption or covalent attachment. the attachment of the stabilizer to the agnps surface is probably related to their interaction strength [26]. the effectiveness of the citrate bound with gelatin on the surface of the agnps was enhanced after a long storage, as proved by the fact that the zeta potential was found to be −19.7 mv. the electrostatic stabilization of agnps can be reduced by the presence of counterions in the colloid [26]. the presence of cations (e.g. na+) in the reducing agent could partially destabilize the citrate-coated agnps after a long storage; this is confirmed by the less negative value of the zeta potential (−10.5 mv).
a possible reaction scheme expresses the formation of the citrate-capped agnps (1) by a faster reduction in comparison with the formation of the gelatin-capped agnps (2), achieved by a slower reduction [14]. if the course of these reactions is considered, the synthesis of the gelatin/citrate-capped agnps can take place through the formation of a gelatin-silver cation complex and the gelatin-capped agnps [27]. the concentration of this complex is considerably higher than the content of the gelatin-capped agnps. this is likely due to the partial reduction of ag+ to ag0 by gelatin during the short reaction time of the formation of the gelatin/citrate-agnps. these reactions can be represented by (3)–(4) (figure 5):

ag+(aq) + citrate(aq) → citrate-agnps(s) + citric acid(aq)    (1)
ag+(aq) + gelatin(gel)(aq) → gelatin-agnps(s)    (2)
ag+(aq) + gelatin(gel)(aq) → [ag(gel)]+(aq) + gelatin-agnps(s)    (3)
[ag(gel)]+(aq) + citrate(aq) → gelatin/citrate-agnps(s) + citric acid(aq)    (4)

figure 5. a possible reaction scheme for the synthesis of agnps coated with the capping agents: citrate (1), gelatin (2) and gelatin/citrate (3)–(4).

the agnps prepared with gelatin showed a considerably lower pdi of 0.2 and 0.21 as compared with the agnps without gelatin, with a pdi of 0.28 and 0.44 on the 7th and 118th day, respectively (table 1). the comparison of the pdi values for the agnps prepared without and with gelatin demonstrated that the synthesis of agnps in the mixed gelatin/citrate solution produced agnps with a higher monodispersity after both the short- and long-term storage. the results of this study indicate the importance of factors such as the stabilizer type, the concentration and addition order of the reducing and stabilizing agents, the reaction time and the ionic strength, which significantly affect the stability of the synthesized agnps in the colloid.

4. conclusions

in this study, the influence of citrate and gelatin on the stability of the formed agnps in dispersion was investigated. the modified synthesis of agnps through a chemical reduction of ag+ in a gelatin/citrate mixed solution was compared to the standard reduction of ag+ in citrate. the surface coating of the agnps by a citrate stabilizer proved to be significant for achieving a short-term stability of the agnps in the colloid. in comparison, the use of gelatin as a reducing and surfactant agent greatly increased the long-term stability of the agnps. this shows that the synthesized agnps exhibited an improved stability through the synergetic steric/electrostatic effects of the gelatin/citrate system. the mixed gelatin/sodium citrate method offers a size-controlled synthesis of monodisperse agnps with a potential use in the production of nanocomposite materials.

acknowledgements

this work was financially supported by the slovak grant agency (vega 1/0197/15).

references

[1] sheng, y. et al.: facile synthesis of monodisperse nanostructured silver micro-colloids via controlled agglomeration and coalescence. nanosci. nanotechnol., 17, 2017, p. 626–633.
[2] keat, c. l. et al.: biosynthesis of nanoparticles and silver nanoparticles. bioresour. bioprocess., 2:47, 2015, doi:10.1186/s40643-015-0076-2
[3] tran, q. h., nguyen, v. q., le, a.-t.: silver nanoparticles: synthesis, properties, toxicology, applications and perspectives. adv. nat. sci.: nanosci. nanotechnol., 4, 033001, 2013, 20 pp.
[4] ipek, y. g. et al.: synthesis and immobilization of silver nanoparticles on aluminosilicate nanotubes and their antibacterial properties. appl. nanosci., 6, 2016, p. 607–614.
[5] sivera, m. et al.: silver nanoparticles modified by gelatin with extraordinary ph stability and long-term antibacterial activity. plos one, 9(8): e103675, 2014, doi:10.1371/journal.pone.0103675
[6] jose, m., sakthivel, m.: synthesis and characterization of silver nanospheres in mixed surfactant solution. mater. lett., 117, 2014, p. 78–81.
[7] römer, i. et al.: aggregation and dispersion of silver nanoparticles in exposure media for aquatic toxicity tests. j. chromatogr. a, 1218(27):4226–33, 2011, doi:10.1016/j.chroma.2011.03.034
[8] lodeiro, p. et al.: silver nanoparticles coated with natural polysaccharides as models to study agnp aggregation kinetics using uv-visible spectrophotometry upon discharge in complex environments. sci. total environ., 539, 2016, p. 7–16.
[9] srinithya, b. et al.: synthesis of biofunctionalized agnps using medicinally important sida cordifolia leaf extract for enhanced antioxidant and anticancer activities. mater. lett., 170, 2016, p. 101–104.
[10] abdulla-al-mamun, m., kusumoto, y., muruganandham, m.: simple new synthesis of copper nanoparticles in water/acetonitrile mixed solvent and their characterization. mater. lett., 63, 2009, p. 2007–2009.
[11] khanna, p. k. et al.: synthesis and characterization of copper nanoparticles. mater. lett., 61, 2007, p. 4711–4714.
[12] guzmán, m. g., dille, j., godet, s.: synthesis of silver nanoparticles by chemical reduction method and their antibacterial activity. int. j. chem. biomol. eng., 2, 2009, p. 104–111.
[13] girilal, m. et al.: a comparative study on biologically and chemically synthesized silver nanoparticles induced heat shock proteins on fresh water fish oreochromis niloticus. chemosphere, 139, 2015, p. 461–468.
[14] lee, c., zhang, p.: facile synthesis of gelatin-protected silver nanoparticles for sers applications. j. raman spectrosc., 44, 2013, p. 823–826.
[15] asta, s. et al.: analysis of silver nanoparticles produced by chemical reduction of silver salt solution. mater. sci., 12, 2006, p. 287–292.
[16] kelly, k. l. et al.: the optical properties of metal nanoparticles: the influence of size, shape, and dielectric environment. j. phys. chem. b, 107, 2003, p. 668–677. doi:10.1021/jp026731y
[17] lah, n. a. c., johan, m. r.: facile shape control synthesis and optical properties of silver nanoparticles stabilized by daxad 19 surfactant. appl. surf. sci., 257, 2011, p. 7494–7500.
[18] rashid, m. u., bhuiyan, m. k. h., quayum, m. e.: synthesis of silver nano particles (ag-nps) and their uses for quantitative analysis of vitamin c tablets. j. pharm. sci., 12, 2013, p. 29–33.
[19] ibrahim, h. m. m.: green synthesis and characterization of silver nanoparticles using banana peel extract and their antimicrobial activity against representative microorganisms. j. rad. res. appl. sci., 8, 2015, p. 265–275.
[20] velgosová, o., mražíková, a., marcinčáková, r.: influence of ph on green synthesis of ag nanoparticles. mater. lett., 180, 2016, p. 336–339.
[21] jyoti, k., baunthiyal, m., singh, a.: characterization of silver nanoparticles synthesized using urtica dioica linn. leaves and their synergistic effects with antibiotics. j. rad. res. appl. sci., 9, 2016, p. 217–227.
[22] shankar, s. et al.: effect of reducing agent concentrations and temperature on characteristics and antimicrobial activity of silver nanoparticles. mater. lett., 137, 2014, p. 160–163.
[23] agnihotri, s., mukherji, s., mukherji, s.: size-controlled silver nanoparticles synthesized over the range 5–100 nm using the same protocol and their antibacterial efficacy. rsc adv., 4, 2014, p. 3974–3983. doi:10.1039/c3ra44507k
[24] lin, l.-h., chen, k.-m.: preparation and surface activity of gelatin derivative surfactants. colloids surf. a: physicochem. eng. asp., 272, 2006, p. 8–14.
[25] thanh, n. t. k., maclean, n., mahiddine, s.: mechanisms of nucleation and growth of nanoparticles in solution. chem. rev., 114, 2014, p. 7610–7630.
[26] levard, c., hotze, e. m., lowry, g. v., brown jr., g. e.: environmental transformations of silver nanoparticles: impact on stability and toxicity. environ. sci. technol., 46, 2012, p. 6900–6914. doi:10.1021/es2037405
[27] oluwafemi, o. s., lucwaba, y., gura, a., masabeya, m., ncapayi, v., olujimi, o. o., songca, s. p.: a facile completely 'green' size tunable synthesis of maltose-reduced silver nanoparticles without the use of any accelerator. colloids surf. b: biointerfaces, 102, 2013, p. 718–723.

acta polytechnica 57(5):321–330, 2017, doi:10.14311/ap.2017.57.0321, © czech technical university in prague, 2017

platform with camera system for measurement of compensatory movements of small animals

patrik kutilek a,∗, david skoda a, jan hybl a, rudolf cerny b, daniel frynta c, eva landova c, petra frydlova c, aniko kurali d, radek doskocil e, vaclav krivanek e

a czech technical university in prague, faculty of biomedical engineering, kladno, czech republic
b second faculty of medicine, charles university, prague, czech republic
c faculty of science, charles university, prague, czech republic
d "lendulet" evolutionary ecology research group, plant protection institute, mta centre for agricultural research, budapest, hungary
e university of defence, brno, faculty of military technology, brno, czech republic
∗ corresponding author: kutilek@fbmi.cvut.cz

abstract. the article introduces the systems and methods of a controllable rotational platform used for measuring the compensatory movement of small animals. the system, based on a camera subsystem, is located on a mechanical platform powered by a set of three actuators. the subsystems and methods allow measuring the angles of the platform's orientation in space and the body segment angles in both the anatomical and the earth's coordinate systems. the methods of video processing, the selection of measurement parameters and the detection of anatomical angles are thoroughly described in this article. the study also deals with the software designed in matlab®, which controls the platform, records and processes videos, and obtains the angles for the movement analysis. the system was tested by measuring the head rotation of a small reptile/amphibian while the camera system monitored reflective markers on the creature's body. this method has never been described before.
the new subsystems of the platform and the methods for monitoring an animal's head compensatory movements can be used in studies of the neural system and its evolution.

keywords: rotational platform; animal; compensatory movement; head; camera; mocap system.

1. introduction

the aim to research the process of detection and processing of stimuli by an organism (especially humans) during a change of its position in space stimulated the introduction of the method of dynamic posturography. this method records the body segment movement and is used especially in medicine [1]. examinations use platforms providing a shift or tilt of the platform's base with the measured subject. for monitoring the position of selected body segments [2, 3], some more advanced platforms, e.g., the 6dof2000e (moog, east aurora, ny, usa), are equipped with laser sensors or camera recording systems. these are complex robotic systems with multiple actuators. for the purpose of the evaluation of a movement, dynamic platforms can also provide valuable data for the study of the nervous, vestibular, or musculoskeletal systems of animals. the original measurements were carried out with animals tilted in hands and their movement was recorded as shown in the pictures [4]. this method is, however, considered inaccurate. in order to make the measurements more effective, special systems using tilt platforms have been developed. an example of such a typical system is a platform tilting around a set of axes. even though these systems are usually equipped with multiple electric motors, or are powered manually, the measurements of the movement are always taken and evaluated during a rotation around a single axis. a typical illustration of the mentioned practice is a study focused on measuring the movement of hares [5]; here, the platform was rotated manually and the markers on the bodies of the hares were recorded by cameras. for some experiments with hares, the tilt platforms were additionally equipped with force sensors under the animals, while goniometers were used to measure their movement. the animals are mounted onto the platforms with holders placed on body segments [7], or, if the animals are trained, using only a rough platform surface; this safeguard prevents them from falling [8]. some more complex platforms and mocap systems were used in the studies of the nervous system of a domestic cat for an evaluation of its movement in [9, 10]. the tilting platform system was, again, equipped with force sensors under the feet of the subjects; in this case, however, the electric motor control system recorded not only the momentary values of the angles but also the angular velocity. the animal's movements were recorded by a vicon (vicon motion systems, inc., lake forest, ca) mocap camera system. the system, located outside the platform's structure, identified the locations of the reflective markers on the animal's body automatically. some other specific platforms, such as those that allow not only a rotation but also a translational motion monitoring, have also been tested [11]. this type of platform, however, offers a smaller range of rotation; the accuracy of the positional identification depends on the movement of the platform and the mocap system is located outside the platform.
in the case of measuring smaller animals than cats or hares, e.g., rats or frogs, systems measuring only changes in the body orientation around one axis (i.e., mainly the rotation of the head around the vertical axis) were used. in this case, the movement of the head was examined by means of contrast markers positioned on the head of the monitored animal and a camera placed above the rotation platform [12], or a camera capturing the movement around the horizontal axis [13]. however, due to the requirements of a basic research of the functioning and evolution of the nervous and vestibular system, there is, nowadays, a need to study segments of smaller animals, such as small amphibians or reptiles of 5 cm in size. in the past, such devices allowed to measure the movement of smaller mammals only around a single axis [14]. as already demonstrated, recently designed devices with moving platforms allow measuring a movement of mammals of 10 cm in size, but they are rarely used for measuring smaller animals. all the mentioned works share the same disadvantage – the camera system is located outside the platform and captures the animal from a rather great distance (even up to 2 m). therefore, the markers had to be quite large so that they could be seen on the recorded video. if we wanted to measure animals smaller than cats or dogs (like lizards or frogs in our case), we would be limited by downscaling the size of the markers and would not be able to get any valid results. the mentioned drawbacks are eliminated by the designed system including a tilt platform and a camera system for measuring the movement of smaller animals. the system also allows a rotation of the platform (with the subject) in all directions, while the camera system mounted on the platform records the movements of the body segments in an anatomical coordinate system. the system, manufactured according to the requirements of the faculty of science of the charles university in prague and the first faculty of medicine of the charles university in prague [15], will be described in detail in this article. to describe and test the functions of the designed system, the head of an animal has been chosen. although the system was briefly introduced at the conference [15, 16], this article is focused on the description of the subsystems' functions and the data processing, which has not been described before.

2. methods

2.1. subsystems for measuring movement response

to record the movement of an animal on anatomical axes, a special system consisting of several subsystems has been designed. figure 1 demonstrates a block diagram of the main subsystems.

figure 1. block diagram of the main subsystems used for measuring a movement response of small animals to the change in orientation of their bodies [15].

the basic part of the system consists of the platform with a camera system, the imu, and the mounted animal. the platform is tilted by an actuator unit allowing a change of all three euler's angles.
since the system performs multiple tasks, the two main tasks (the motor control and the camera recording) were allocated to two different pcs, connected via lan, see figure 1. the two pcs need to be synchronized, so that the motor control pc (mc-pc) becomes aware of the camera recording pc (cr-pc) and can start storing the platform positions as soon as the recording starts. a similar synchronized sequence of actions happens when the recording stops. we developed this application using matlab's timers with custom-made file writing and reading functions.
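the paper does not list the synchronization code itself; the following is only a minimal sketch of such a file-based handshake, assuming an illustrative shared-folder path, an illustrative flag-file name and a hypothetical logPlatformPosition helper:

    % cr-pc side: signal the recording window by creating/deleting a flag file
    flagFile = '\\shared\sync\recording.flag';   % illustrative shared location
    fid = fopen(flagFile, 'w'); fclose(fid);     % "recording started"
    % ... video acquisition runs here ...
    delete(flagFile);                            % "recording stopped"

    % mc-pc side: a matlab timer polls the flag and stores the platform
    % positions only while the flag file exists
    t = timer('Period', 0.1, 'ExecutionMode', 'fixedRate', ...
        'TimerFcn', @(obj, event) storeIfRecording(flagFile));
    start(t);

    function storeIfRecording(flagFile)
        if exist(flagFile, 'file') == 2
            logPlatformPosition();   % hypothetical helper appending current angles
        end
    end

the same mechanism, read in the opposite direction, lets the mc-pc detect that the recording has stopped.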
the way of the platform's rotation is set by a user on the motor control pc, from where it is sent to a data collection and control unit located on a stationary base. the base holds the actuator unit connected to the moving platform, see figures 2 and 3. the control unit sends the instructions to the three actuators and simultaneously obtains information about their position. the imu and the set of three mutually perpendicular cameras of the camera subsystem give information on the values of the platform tilt angles and the angles of the animal's body segment's orientation in an anatomical coordinate system. the imu and the camera subsystem send the information about the platform's orientation and the video recording of the animal's body segment to the control and data collection unit. the control and data collection unit is connected to the pc with a data cable to process and determine the platform yaw, roll and pitch angle values as well as the three anatomical angles of the animal's body segment's orientation.

figure 2. diagram demonstrating the structure of the system; 1. moving platform, 2. camera subsystem, 3. camera holder, 4. animal holder, 5. imu, 6. actuator unit, 7. control and data collection unit, 8. pc controlling platform movement, 9. pc for data processing and determination of movement angles, 10. data cables connecting pc and control and data collection units.

figure 3. system for measurement of compensatory movements; 1. moving platform, 2. camera subsystem, 3. camera holder, 6. actuator unit, 7. control and data collection unit, 8. pc controlling platform movement, 9. pc for data processing and determination of movement angles.

the designed system (figure 2) consists of the platform, a moving base (the platform plane) made from thin aluminium, creating a space of 200 × 300 mm for the placement of the animal. the platform plane allows a manual shift of the animal on the platform's plane perpendicular to the transversal plane. in this way, the animal's head can be set to the focus of the camera subsystem. the solid stationary base, where the control and data collection unit is placed, has a framework made of aluminium profiles. this framework makes space for the electric components and a connection point to mount the motors of the actuator unit.
the actuator unit consists of three two-phase motors of the type 60sth88 (motion control products ltd.) with a basic step of 1.8° creating a torque of 3 nm. the three motors of the actuator unit are located in mutually perpendicular axes, each of them capable of rotating around a different axis. we used bipolar stepper motors running at 128 microsteps per step, with a magnetic momentum compensation for a steady speed. microstepping ensures a smooth acceleration/deceleration. specific ranges of each rotation have been limited by the software and sensors for each motor individually; this was determined on the basis of a so-called safe operating range depending on the applied current and the weight of the platform itself. while the lowest motor can rotate in the range covering 0° ± 170°, both higher motors have been confined to 0° ± 23.4°. the three angles of tilt and rotation were achieved with an accuracy of 1 microstep. the minimum settable step was 1.8° for all three motors, but the sensitivity of the movement is 128 microsteps per 1.8°, giving a movement smoothness of 0.014°. the reason why we did not let the user execute movements smoother than 1.8° is that the motors cannot be stopped anywhere in between. every time they stop, the finishing stroke is present within a one second delay, setting the platform's angle to the closest stable point (every 1.8° away). also, the 1.8° step is inevitable for accelerating/decelerating the motion of the moving platform.
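the step arithmetic above can be summarized in a few lines; a minimal sketch (the function name and interface are illustrative, not part of the published software):

    % convert a requested rotation into whole 1.8-deg steps and microsteps;
    % 128 microsteps per step give 1.8/128 = 0.0140625 deg of smoothness
    function [nSteps, nMicrosteps, actualDeg] = angleToSteps(angleDeg)
        stepDeg = 1.8;                      % basic step of the motors
        microPerStep = 128;                 % microstepping used by the drivers
        nSteps = round(angleDeg / stepDeg); % stops are only stable every 1.8 deg
        nMicrosteps = nSteps * microPerStep;
        actualDeg = nSteps * stepDeg;       % angle actually reached after the stop
    end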
it was also proven that the distortion does fig.4 anatomical coordinate system of the animal and its angular movement in relation to coordinate system related to the direction of gravitational acceleration and earth’s horizontal plane [15]. figure 5. movement of body segment (head) in anatomical coordinate system of an animal. not affect the angle calculation. we can also regulate the amount of light coming from the led strip lights placed on the camera holder. they influence the maximum achievable framerates of the used cameras and enable to set the light conditions so that the markers are as contrasting as possible, which facilitates their recognition in the captured frames. as mentioned before — lighting of the scene has a huge impact on frames per second (fps). the general rule is: more light means more fps. the demand for a higher resolution results in lowering the final fps, which actually means that we have to find a compromise. we have discovered that the resolution of 320 × 240 pixels is optimal for our applications. such resolution would normally allow up to 60 fps. the lighting of the scene has to be set so that markers are not over-exposed. 2.2. measurement of body segment movement of small animals in 3-d space the platform allows a rotational motion of the animal’s body in 3-d space. the initial position was a horizontal plane and a direction of a gravitational acceleration. the animal was mounted to the platform in a way ensuring that the three anatomical axes were in accordance with the three main axes of the platform (figure 4). the change in the animal’s orientation is defined by euler’s angles – yaw, pitch and roll, figure 4. to determine the orientation of the body segment in a coordinate system, it is necessary (in the case of the head) to define three movement angles in the anatomical system of the animal’s body, which match the coordinate system of the moving platform. they are – mediolateral flexion (i.e., rotation about mediolateral axis), dorsoventral flexion (i.e., rotation about dorsoventral axis) and the head rotation (i.e., rotation about anterior-posterior or say rostro-caudal axis), figure 5. 324 vol. 57 no. 5/2017 platform for measurement of compensatory movements fig.5 movement of body segment (head) in anatomical coordinate system of an animal. to determine the orientation of the body segment in coordinate system it is necessary (in the case of the head) to define three movement angles in anatomical system of the animal’s body which match the coordinate system of the moving platform. they are – mediolateral flexion (i.e. rotation about mediolateral axis), dorsoventral flexion (i.e. rotation about dorsoventral axis) and head rotation (i.e. rotation about anterior-posterior or say rostro-caudal axis), fig 5. these angles can be measured by appropriate placing of markers on the animal’s head. cameras of the camera subsystem are located in a way ensuring their optical axes are perpendicular to each other, and at the same time perpendicular to sagittal, transversal and frontal axes of the animal placed on the platform [15]. markers can have a form of reflective plastic stickers [9] or their paper alternative [5]. it is, however, their disadvantage that they are rather big [9], therefore, anatomically harmless paint of a contrastive color was applied on the animal skin instead. two spots were made for each plane. 
2.4. video processing

the marker coordinates are detected in the pictures automatically using the contrast between their own colour and the colour of the animal's skin. this phenomenon is used in the calculation software created in matlab. the detection of the markers is based on a pre-set colour tolerance on the scale from 0 to 255 for all three colours of the rgb spectrum, converting each pixel to a binary mask. the tolerance is pre-set for each colour and can be adjusted by the user to any value ranging from 1 to 50. it means, for example, that any pixels found suitable for the selected colour with the value of ±50 will be marked on the frame as detected. the tolerance is common for all three colours of the rgb spectrum and cannot be modified separately.
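a minimal sketch of this thresholding, with the selected marker colour and the common tolerance as inputs (the function name is ours):

    % binary mask of the pixels lying within +/- tol of the selected colour,
    % with one common tolerance applied to all three rgb channels
    function mask = detectByTolerance(frame, targetRGB, tol)
        f = double(frame);                               % h-by-w-by-3 image
        mask = abs(f(:,:,1) - targetRGB(1)) <= tol & ...
               abs(f(:,:,2) - targetRGB(2)) <= tol & ...
               abs(f(:,:,3) - targetRGB(3)) <= tol;      % all channels in range
    end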
cropping a border-radius (figure 7) is another useful technique. sometimes, we can detect pixels in the background of the scene that are of the same colour as the markers, or are within the tolerance range [18]. those pixels are usually fairly far away from those indicating the markers [19]. therefore, after we find the approximate centres of the markers on the previous frame using k-means, we perform a deletion scan of any detected pixels within the specified radius from each previous centre on the current frame and process the frame again in order to receive the information from the pixels near the centres of the previous frame.

figure 7. demonstration of how to work with a border-radius close-up. the pixels taken into consideration are white, the red crosses mark the detected centres and the pixels outside the range of a close-up (highlighted with circles) are in the grey zone.

one approach to the detection of the markers' centres is to make matlab's k-means algorithm detect two centres in relation to the white coloured pixels given as the input arguments. the k-means function, implemented in the matlab software [20], is very accurate (mistakes are very rare) and comfortable, however, also time-consuming. the new method first detects all the pixels complying with the requirements of the tolerance and cropping radius conditions, as described above. then, it checks the x and y coordinates of every detected pixel and, based on that, determines whether the pixels are located above each other (vertically in the image) or next to each other (horizontally in the image). the algorithm then finds the central line between the bordering pixels. for example, if the pixels are horizontally situated in the image, we take the smallest and the biggest x coordinate of the detected pixels and the vertical central line is right between them. the determination of the centre is given by

c_x = (∑_{i=1}^{n} x_i a) / (∑_{i=1}^{n} a),   c_y = (∑_{i=1}^{n} y_i a) / (∑_{i=1}^{n} a),

where c stands for the centre and its index marks the image axis for which we count it, n is the number of detected pixels and a is the area of a detected pixel. using this simplification, we get an equation for the mean value of the x and y coordinates. to be even more accurate in the estimation of the centre positions, we can completely replace the equation by a median calculus. after both centres on both sides of the central line are detected, the method checks the distance of all detected pixels from the previous centre locations using the classical pythagoras's theorem. the pixels which are further than the specified cropping radius value are excluded from the previous centre positions and are stained grey in the preview image. then, the method checks the medians again, using only the remaining pixels, to get more precise centre locations.
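a minimal sketch of this centre estimation, combining the cropping radius, the central dividing line and the median calculus (the function name and interface are ours, not taken from the published software):

    % estimate the two marker centres: crop to the pixels near the previous
    % centres, split along the dimension with the larger spread, take medians
    function [c1, c2] = twoMarkerCentres(mask, prevC1, prevC2, cropR)
        [y, x] = find(mask);                         % coordinates of detected pixels
        near = hypot(x - prevC1(1), y - prevC1(2)) <= cropR | ...
               hypot(x - prevC2(1), y - prevC2(2)) <= cropR;
        x = x(near); y = y(near);
        if (max(x) - min(x)) >= (max(y) - min(y))    % markers lie side by side
            mid = (min(x) + max(x)) / 2;  inA = x < mid;
        else                                         % markers lie above each other
            mid = (min(y) + max(y)) / 2;  inA = y < mid;
        end
        c1 = [median(x(inA)),  median(y(inA))];      % median replaces the mean
        c2 = [median(x(~inA)), median(y(~inA))];     % for a more robust estimate
    end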
figure 8. method of slicing the screen. the crosses mark the computed centres of the markers and α is the detected angle.

this method is much faster than the k-means algorithm; unfortunately, it is not entirely error-proof. the sharp division along just two dimensions (vertically or horizontally) can be inaccurate. for example, when the markers get close to each other and position themselves diagonally, the division line can cut one of the markers and take some of its pixels as pixels of the other marker (figure 8) – slowly shifting the detected centres' locations, which results in losing one of the markers. the situation is optimal when the division line is made at a specific angle to clearly separate both markers. the implementation of such an algorithm would, however, be too time consuming, which we were trying to avoid. based on the above, we decided to conduct a k-means check after the set of every 50 frames to prevent losing the marker for a longer period of time. we also needed to check whether the intercentral distance has not changed by more than 15 % of its length on the previous frame, in which case the k-means is launched again automatically to correct the error. this is particularly convenient in the case of a wrong estimation, which may happen, although very rarely, as was mentioned before.
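a minimal sketch of these two guard conditions inside the frame loop (the variable names are illustrative):

    % re-run k-means every 50th frame, or when the intercentral distance
    % changes by more than 15 % with respect to the previous frame
    distNow = norm(c1 - c2);
    if mod(frameIdx, 50) == 0 || abs(distNow - distPrev) > 0.15 * distPrev
        [y, x] = find(mask);
        [~, C] = kmeans([x, y], 2);     % matlab's k-means on the pixel coordinates
        c1 = C(1, :);  c2 = C(2, :);    % corrected centres
    end
    distPrev = norm(c1 - c2);           % reference for the next frame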
then, it checks the x and y coordinates of every detected pixel and based on that determines whether pixels are located above each other ( vertically in the image), or next to each other (horizontally in the image). the algorithm then finds the central line between bordering pixels. for example, if the pixels are horizontally situated on the slide, we take the smallest and the biggest x coordinate of the detected pixels and the vertical central line is right between them. determination of the centre is given by cx = ∑n i=1 xia∑n i=1 a , cy = ∑n i=1 yia∑n i=1 a , where c stands for centres and its index marks the plane for which we count it. the number of detected pixels is n and a is the area of the detected pixel. using this simplification, we get an equation for the mean value of the x and y coordinates. to be even more accurate in the estimation of the centre positions, we can completely replace the equation by a median calculus. after both centres on both sides of the central line are detected, the method checks the distance of all detected pixels from previous centre locations using a classical pythagoras’s theorem. the pixels, which are further than the specified cropped radius value, are excluded from the previous centre positions and are stained grey in the preview image. then, the method checks the medians again, using only remaining pixels, to get more precise centres’ locations. this method is much faster than the k-means algorithm, unfortunately, it is not entirely error-proof. the sharp division along just two dimensions (vertically, or horizontally) can be inaccurate. for example, when markers get close to each other and position themselves diagonally, the division line can cut one of the markers and take some of its pixels as pixels of the other marker (figure 8) – slowly shifting the detected centres’ locations, which results in losing one of the markers. the situation is optimal when the division line is made at a specific angle to clearly separate both markers. implementation of such algorithm would, however, be too time consuming, which we were trying to avoid. based on the above, we decided to conduct a k-means check after the set of every 50 frames to prevent losing the marker for a longer period of time. we also needed to check whether the intercentral distance has not changed by more than 15 % of its length on the previous frame, in which case k-means are launched again automatically to correct the error. this is particularly convenient in the case of the wrong estimation, which may happen, although very rarely, as was mentioned before. last but not least, we had to deal with a slight changes in the colour of the marker due to the changes of the light conditions. if we rely on detecting only one colour with a specified tolerance throughout the whole video, we may as well end up with only few applicable frames and rest would have to be discarded. therefore, we had to implement a method based on a vector containing all the information about the colour, which the user wants to detect, such as tolerance and cropping radius of each frame where the mentioned values were changed. the user can then save the vector for each video to a txt file and return to it later if necessary. at the beginning, the user also has to indicate that he wishes to use the keyframe vector and click on the colour manual selection button. the preview of the first frame appears along with the preview of the detection. 
last but not least, we had to deal with slight changes in the colour of the marker due to the changes of the light conditions. if we rely on detecting only one colour with a specified tolerance throughout the whole video, we may well end up with only a few applicable frames, while the rest would have to be discarded. therefore, we had to implement a method based on a vector containing all the information about the colour which the user wants to detect, such as the tolerance and the cropping radius of each frame where the mentioned values were changed. the user can then save the vector for each video to a txt file and return to it later if necessary. at the beginning, the user also has to indicate that he wishes to use the keyframe vector and click on the colour manual selection button. the preview of the first frame appears along with the preview of the detection. the user then clicks on one of the markers he wishes to detect in the preview, adjusts the tolerance or cropping radius range if needed, and he can then use the slider to scroll to another frame. once he notices the marker is diminishing on the detection preview, he can refresh the detection colour by another click on the marker. the values are stored within the vector and, from then on, the software detects with the last settings until the end of the video, or until another change is made and another column appears in the vector.

another useful feature was implemented after preliminary video processing tests. since the markers change their colour quite often, an adaptive colour change algorithm (acc algorithm) was implemented. the basic idea is to observe how the colour of the markers changes in the course of time and to adjust the detection colour in the settings automatically, so that the user does not have to do it manually. it works similarly to the keyframe vector described above but, unlike the keyframe, the acc vector stores the information about the colours, tolerance and cropping radius for every frame of the video, not just the keyframes specified by the user. the keyframe vector is, however, superior to the acc vector, because the acc algorithm does not try to change the values belonging to a keyframe, but rather uses them instead. the software always tries to use the detection colours specified by the previous frame and sees what pixels it detects. then, it extracts the information about the rgb components of the detected pixels on the current frame and calculates a median of each of the three values. the median values are stored in the acc vector and then serve as the detection colour for the current frame. the detection is launched once more with the updated values and the centres are located for the needs of the cropping radius method. then it shifts to the next frame and repeats the process described above until the required frame or the last frame of the video is reached.
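a minimal sketch of that per-frame update, assuming an rgb frame and a simple per-channel tolerance test (the exact colour-matching rule is not spelled out in the text, so `acc_update` and its arguments are our hypothetical illustration):

```python
import numpy as np

def acc_update(frame, prev_colour, tolerance):
    """One step of the adaptive colour change (ACC) idea: re-estimate the
    detection colour from the pixels matched on the current frame.

    frame       -- H x W x 3 uint8 RGB image
    prev_colour -- (r, g, b) detection colour carried over from the previous frame
    tolerance   -- maximum per-channel deviation still accepted as the marker
    """
    diff = np.abs(frame.astype(int) - np.asarray(prev_colour))
    mask = (diff <= tolerance).all(axis=2)          # pixels within tolerance
    if not mask.any():
        return prev_colour, mask                    # nothing matched; keep the old colour
    # the median of each RGB channel over the matched pixels becomes the
    # detection colour stored in the ACC vector for this frame
    new_colour = tuple(int(v) for v in np.median(frame[mask], axis=0))
    return new_colour, mask
```

storing the returned colour per frame gives exactly the acc vector described above, while user-set keyframes would simply overwrite the carried-over colour before the update.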
2.5. method for determination of anatomical angles of movement

to measure the position of particular segments of an animal (in our case, the head), the platform is equipped with a camera system and the imu. to determine the anatomical angles, it is necessary to measure the euler angles representing the changes in the angular positions of the platform in space. for this reason, it is essential to calibrate the system prior to each measurement. the platform's coordinate system (i.e., the platform orientation in 3-d space) is set in accordance with the gravitational acceleration and the horizontal plane of the earth on the basis of the imu data before each measurement. in order to secure this, the imu was mounted to the platform during its assembly in such a way that the three main axes of the imu are parallel to the main platform axes. after the calibration, the pitch, yaw and roll angles are measured too, and the information obtained from the three motors verifies the accuracy of the reached pitch, yaw and roll angles. the camera holder mounted onto the platform holds three cameras (set so that the three axes of the cameras are mutually perpendicular and also parallel with the three main axes of the platform) and enables recording the movement of the points in planes perpendicular to the main axes of the platform.

if the animal's body is placed on the platform correctly, it is possible to record the movement of segments in the anatomical planes of the body, i.e., sagittal, frontal and transversal. the animal must be fixed in a way that the three anatomical axes are parallel to the three axes of the platform. if we, however, focus only on measuring the evolution of the angle value change, the accurate fixing of the animal does not seem to be so crucial. the determination of the segment rotation in a particular plane is given by the position of two markers on a body segment (i.e., anatomical points) and the camera setting. let us suppose the angle calculation is defined by two markers placed on the head and a physical axis defined by the tilt of a camera, while the camera is parallel with the main axis of the platform. the cameras are set (during the assembly process) in such a way that their picture matches a particular plane and the two axes of the picture (i.e., the physical axes of the picture, vertical and horizontal) are parallel with the main axes of the platform. figure 8 demonstrates the example of two captured markers on a segment (e.g., the head) used to calculate the mediolateral flexion, i.e., angular movements in the frontal plane. the angle between the anatomical axis and the horizontal axis of the picture is the angle between the vector $\mathbf{v}$, given by the camera setting in accordance with the platform axes, and the vector $\mathbf{u}$, representing the coordinates of the points determined in the picture. the angle between the vectors is given by

$$\alpha = \arccos\frac{\mathbf{u}\cdot\mathbf{v}}{|\mathbf{u}|\,|\mathbf{v}|},$$

where $\mathbf{u} = (a_x - b_x,\ a_y - b_y)$, $\mathbf{v} = (1, 0)$, $a_x$, $b_x$ are the coordinates of the two points/markers on the x axis, and $a_y$, $b_y$ are the coordinates of the two points/markers on the y axis of a picture from one of the cameras [21, 22]. this calculation is utilized when calculating the movement angles in all three anatomical planes, i.e., in the three pictures of a simultaneous record from the cameras of the subsystem. then, the records from the cameras, i.e., the anatomical angles, are coupled with the data, i.e., the angles of rotation, of the motors and the imu. thus, the body segment angles in space for the respective planes are calculated by a mere sum of, or difference between, the anatomical angle and the platform movement angle. if it is not possible to determine the value of the angles due to a short temporary loss of information on the marker positions, a data interpolation method implemented in the software is used. the interpolation method allows to calculate the missing kinematic data within a time series of angles. for this purpose, the cubic spline data interpolation, implemented in the matlab software, was used [23].
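both the angle formula and the gap-filling step are easy to reproduce. the sketch below is our python rendering, not the authors' matlab code; scipy's cubic spline stands in for the matlab routine cited in [23], and the clipping of the cosine argument is a numerical safeguard we added.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def segment_angle(a, b):
    """Angle between the marker axis a -> b and the horizontal picture axis,
    i.e. alpha = arccos(u . v / (|u| |v|)) with u = a - b and v = (1, 0)."""
    u = np.asarray(a, float) - np.asarray(b, float)
    v = np.array([1.0, 0.0])
    cos_a = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0)))

def fill_gaps(t, angles):
    """Replace NaN samples (frames where a marker was lost) by cubic-spline
    interpolation over the valid samples; t must be strictly increasing."""
    t, angles = np.asarray(t, float), np.asarray(angles, float)
    ok = ~np.isnan(angles)
    return CubicSpline(t[ok], angles[ok])(t)
```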
3. experiments

in our measurements, we used frogs and geckos. we were mostly working with common toads (bufo bufo) and leopard geckos (eublepharis macularius), but we also measured some other frog specimens of various sizes. the animals were specifically selected for a systematic testing. we focused on the measurements of the relative movement of the head in relation to the trunk. the body of the subject was placed in such a way that its longitudinal axis corresponded with the longitudinal axis of the platform. the relative angles of the head movement in relation to the platform were determined by markers placed on the upper part of the head, two on the front and two on the side part of the head. the animal was placed onto the platform, and a gradual and sequenced rotation was carried out.

3.1. selection of measurement conditions

to test and verify the designed system and method, the rotation of the animal around only one axis was chosen, similarly to several other studies of animal body segment movement [7, 8]. the animal body can be rotated in the transversal anatomical plane, i.e., by a change in the roll angle of the platform [5–8]. some studies, however, also describe the movement in the sagittal plane, i.e., a change in the pitch angle of the platform [7–9]. we primarily focused on the head movement in the sagittal plane and controlled the change of orientation in the pitch angle of the platform, where the most significant compensatory movements of the head can be expected. as for the selection of the measurement conditions, the platform can be moved periodically by various types of the course of the angular velocity, e.g., by a sine course of angles (with a frequency of 0.5 or 1 hz and an amplitude of ±20°/s) or by a linear course (between ±20°), the so-called ramp-and-hold mode, as used in [5]. in the research of animal movement, the sine course is used less often than the linear course. the sine course was used in [7] (with a frequency of 1 hz) and also in [13] (with a frequency of 0.25–5.5 hz and an acceleration of 100°/s²). in [16, 24], the linear course was used along with a tilt range of ±20°. the tilts are, therefore, usually symmetric in relation to the initial horizontal position of the platform. the maximum values of the tilt angles can reach up to 40° in healthy subjects, in the case of the roll and pitch angles [7, 8]. during an evaluation of pitch angles, even a +90°/−90° angle was used; however, the movement of the subjects was carried out manually, with a movement record step of 20°, instead of an electronically powered platform [4]. in some studies of animals (e.g., cats), we may identify lower angles, such as 6° [9, 11], or ±1° to ±20° [13]. lower values (from 3° to 7°) are, however, used rather in the cases of postural stability in humans, due to a higher centre of mass position compared to quadrupedal animals, where the base of support is larger, with a lesser risk of losing the balance at higher tilt values [25–27]. the initial or final phase of the linear change can be defined by the maximum reached angle value or by setting the period of the angular movement of the platform [7, 8]. certain values of angular velocities then correspond to such a movement. an angular velocity of 40°/s was used in [5, 9, 11], while in [7], the values were higher, e.g., 50°/s. these values are comparable with those used in studies of humans, 36°/s to 50°/s [25–27]. since the mentioned cases discuss dynamic effects (high or low), the velocity was high enough to reliably cause an automatic postural response (apr) in the subjects [28]. the total measurement time lasted only a few seconds, e.g., a maximum of 3 s [9] per one non-cyclic movement. referring to the above-mentioned, we selected the movement angles for the preliminary testing on a greater scale, ranging between +23°/−23°, according to what the design of the construction allows. as for the selection of the angular velocity, it did not fall within the objectives of the research commissioned by the faculty of sciences of the charles university in prague and the first faculty of medicine of charles university in prague.
in both these institutions, the research has been, so far, primarily focused on the movement responses of small reptiles, with a nervous system on a lower evolutionary level than that of mammals. using a high angular velocity is not appropriate with small reptiles. in the case of fast movements of body segments, we may encounter a problem generally known as the d'alembert inertial force and moment of force. these forces and moments would have to be compensated by the forces provided by the muscular system of the animal. for this reason, at a high angular velocity, the change in the orientation of the body segment relative to the vector of gravity is not primarily compensated. therefore, the angular velocities of 2.5°/s and 5°/s were selected for the purpose of studying the movement responses. with each speed, the platform always moves from one side to the other and back, repeating multiple times to ensure useful data. finally, we tried to use a "sine wave" mode [5], which is actually a negative cosine wave. in this regime, the platform starts at one end and slowly accelerates up to the peak speed, reached in the neutral position, and then decelerates until it closes the cycle at the other end. the rotation speed profile resembles a negative cosine wave, hence the name.
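for illustration, the two motion regimes can be generated as in the following python sketch. the ±23° range and the angular velocities are the values quoted above, while the function names and the period of the cosine mode are our assumptions.

```python
import numpy as np

def ramp_profile(t, amplitude=23.0, speed=5.0):
    """Linear ("ramp-and-hold") tilt: sweep from -amplitude to +amplitude
    and back at a constant angular velocity (degrees, seconds)."""
    period = 4.0 * amplitude / speed                  # one full back-and-forth cycle
    phase = np.mod(t, period) * speed
    return np.where(phase < 2 * amplitude,
                    -amplitude + phase,               # going up
                    3 * amplitude - phase)            # coming back down

def cosine_profile(t, amplitude=23.0, period=20.0):
    """'Sine wave' mode: the platform angle follows a negative cosine, so the
    angular velocity is zero at the extremes and peaks in the neutral position."""
    return -amplitude * np.cos(2 * np.pi * t / period)
```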
3.2. results

the system provides information on the flexion angles (dorsoventral and mediolateral) and the rotation angle. an example of the dorsal flexion of the head of an unstressed animal is demonstrated in figure 2. based on the information about the relative head angles in relation to the platform and the angles of the platform movement in the earth's coordinate system (i.e., the platform pitch angle), it is possible to calculate the head movement in relation to the earth's horizontal line [29], i.e., the head pitch angle, figure 9.

figure 9. an example of a diagram measuring head movements of the common toad in the sagittal plane of its body.

4. discussion

a special device with a moving platform was constructed to measure the body segments of small animals. the system is particularly suitable for carrying out a precise measuring of a body response to changes in the body orientation. a record, the output of the system, is supposed to contribute to a further scientific research of the evolution of the segment's anatomical angle values and the segment angles in relation to the horizontal plane of the earth. in the tests of the designed device, we measured the movement of the head during changes in the orientation of the whole body. we discovered that, in the case of the platform moving with the toads, the subjects perform a compensating head movement in almost all the cases. the range and extent of the compensation will be used in the future research. nonetheless, the stress definitely seems to be a relevant factor, as some frogs just got inflated and refused to move, while others relaxed and sat, compensating for even the slightest movement. in the case of unstressed animals, preliminary findings show that the compensation of a platform movement has a certain inertia after the completion of the platform movement, as shown in figure 9. finding the reasons for this behaviour of animals, and its quantitative description, will be a part of the follow-up research. to compare the designed system with some others, we can observe that, for example, in the case of [5], only manual positioning and a lower video frequency were used, while our system uses 30 fps. another shortcoming of the system in [5] is that the recordings were processed manually, i.e., the markers were marked manually in the pictures. the software that we created enables the detection of the contrast markers automatically, which speeds up the processing of the record. another advantage of the outcomes of our study is that, since the skin of our subjects is painted, sticker markers are not used, which, again, facilitates the experiments and makes them cheaper. our system also avoids the difficulties in measuring very small animals (under 10 cm of length) by mounting them directly onto the platform. instead of the mechanical sensors for monitoring angles used in [6], our system is contactless, non-invasive and more suitable for measuring the 3-d movement. compared to the mocap systems used for measuring animal movements based on the electromagnetic principle [13, 30], for example, fastrak (polhemus, colchester, vt), the accuracy of our system is affected by electronic parts such as the actuator unit motors. the method we designed is based on a single-axis rotation, in accordance with [5, 6], and is adjusted to measure the slow movements of reptiles/amphibians. measurements of the head movement of smaller reptiles/amphibians on a rotating platform with a mounted camera system are original and have not been used so far.
although some recordings of mammal head movements were mentioned, for example, in [11, 30], their systems' construction and measuring methods did not allow measuring the relative and absolute values of the head movement angles in all three anatomical planes. further research into the movement might deal with the issue of a cyclic repetition of the transition between the stationary (or extreme) platform positions, which lasts just a few seconds, like in [7, 8], while the cyclic movement is not limited by the number of cycles. the measuring also allows a repetition of identical tests under the same conditions and a collection of long-term data [9]. the system can be used for measuring the movement of animals trained to track targets, as stated in [30]. based on the above, it is obvious that the platform and the methods for monitoring an animal's head compensatory movements can be used in non-contact studies of body movement influenced by the functioning of the nervous system and its evolution, as described in detail in the cited research [4–14]. the measurement conditions (i.e., the measurement methodology for the study of the nervous system) may be the same as in [4–9, 14] or in accordance with the suggestions in this article.

5. conclusions

the article presents a new system and methods for measuring the movement response of small animals to changes in their body orientation. the method is based on a camera system located on a moving platform powered by three actuators. the system was tested on measuring the head rotation of small reptiles/amphibians. the results also include a presentation of the method, which measures the compensation movements of animals. the new system intends to measure the movements of small animals within the scope of veterinary and scientific usage and allows to diagnose pathological symptoms in the animal's posture. further testing of the systems for the evaluation of some more complex movements is expected within a follow-up research.

acknowledgements

this work was done in the framework of project sgs16/109/ohk4/1t/17. the authors would also like to thank progredior kybernetes ltd. for the manufacturing of the electronically powered platform.

references

[1] teszler, c. b., et al. "sonovestibular symptoms evaluated by computed dynamic posturography." the international tinnitus journal 6(2): 140-153, 2000.
[2] luu, b. l., et al. "validation of a robotic balance system for investigations in the control of human standing balance." ieee transactions on neural systems and rehabilitation engineering 19(4): 382-390, 2011. doi:10.1109/tnsre.2011.2140332
[3] tsai, y. c., hsieh, l. f., and yang, s. "age-related changes in posture response under a continuous and unexpected perturbation." journal of biomechanics 47(2): 482-490, 2014. doi:10.1016/j.jbiomech.2013.10.047
[4] heath, j. e., northcutt, r. g., and barber, r. p. "rotational optokinesis in reptiles and its bearing on pupillary shape." journal of comparative physiology a: neuroethology, sensory, neural, and behavioral physiology 62(1): 75-85, 1969. doi:10.1007/bf00298043
[5] beloozerova, i. n., et al. "postural control in the rabbit maintaining balance on the tilting platform." journal of neurophysiology 90(6): 3783-3793, 2003. doi:10.1152/jn.00590.2003
[6] hsu, l. j., et al. "effects of galvanic vestibular stimulation on postural limb reflexes and neurons of spinal postural network." journal of neurophysiology 108(1): 300-313, 2012. doi:10.1152/jn.00041.2012
[7] lyalka, v. f., et al. "impairment and recovery of postural control in rabbits with spinal cord lesions." journal of neurophysiology 94(6): 3677-3690, 2005. doi:10.1152/jn.00538.2005
[8] lyalka, v. f., et al. "impairment of postural control in rabbits with extensive spinal lesions." journal of neurophysiology 101(4): 1932-1940, 2009. doi:10.1152/jn.00009.2008
[9] macpherson, j. m., et al. "bilateral vestibular loss in cats leads to active destabilization of balance during pitch and roll rotations of the support surface." journal of neurophysiology 97(6): 4357-4367, 2007. doi:10.1152/jn.01338.2006
[10] deliagina, t. g., et al. "neural bases of postural control." physiology 21(3): 216-225, 2006. doi:10.1152/physiol.00001.2006
[11] ting, l. h., and macpherson, j. m. "ratio of shear to load ground-reaction force may underlie the directional tuning of the automatic postural response to rotation and translation." journal of neurophysiology 92(2): 808-823, 2004. doi:10.1152/jn.00773.2003
[12] shinder, m. e., and taube, j. s. "active and passive movement are encoded equally by head direction cells in the anterodorsal thalamus." journal of neurophysiology 106(2): 788-800, 2011. doi:10.1152/jn.01098.2010
[13] dieringer, n., and precht, w. "compensatory head and eye movements in the frog and their contribution to stabilization of gaze." experimental brain research 47(3): 394-406, 1982. doi:10.1007/bf00239357
[14] dieringer, n., cochran, s. l., and precht, w. "differences in the central organization of gaze stabilizing reflexes between frog and turtle." journal of comparative physiology a: neuroethology, sensory, neural, and behavioral physiology 153(4): 495-508, 1983. doi:10.1007/bf00612604
[15] kutilek, p., et al. "system for measuring movement response of small animals to changes in their orientation." applied electronics (ae), 2015 international conference on. ieee, 2015.
[16] hybl, j., et al. "methods for evaluation of kinematic motion data of animal's body on dynamic platform." mechatronics-mechatronika (me), 2016 17th international conference on. ieee, 2016.
[17] simsik, d., et al. "the video analysis utilization in rehabilitation for mobility development." lékař a technika, česká republika: 4-5, 2004.
[18] de la bourdonnaye, a., et al. "practical experience with distance measurement based on single visual camera." advances in military technology 7(2): 49-56, 2012.
[19] doskocil, r., krivanek, v., and stefek, a. "contribution to determination of target angular position by single visual camera at indoor close environs." mechatronics 2013, springer, cham: 379-385, 2014.
[20] gonzalez, r. c., woods, r. e., and eddins, s. l. digital image processing using matlab, 2003. isbn-13: 978-0130085191, isbn-10: 0130085197.
[21] kutilek, p., charfreitag, j., and hozman, j. "comparison of methods of measurement of head position in neurological practice." xii mediterranean conference on medical and biological engineering and computing 2010, springer berlin heidelberg, 2010. doi:10.1007/978-3-642-13039-7_114
[22] kutílek, p., and hozman, j. "determining the position of head and shoulders in neurological practice with the use of cameras." acta polytechnica 51(3), 2011.
[23] hazewinkel, m. "spline interpolation." encyclopedia of mathematics, springer, 1994. isbn 978-1-55608-010-4.
[24] dieringer, n. "the role of compensatory eye and head movements for gaze stabilization in the unrestrained frog." brain research 404(1): 33-38, 1987. doi:10.1016/0006-8993(87)91352-7
[25] allum, j. h. j., and pfaltz, c. r. "visual and vestibular contributions to pitch sway stabilization in the ankle muscles of normals and patients with bilateral peripheral vestibular deficits." experimental brain research 58(1): 82-94, 1985. doi:10.1007/bf00238956
[26] carpenter, m., allum, j., and honegger, f. "vestibular influences on human postural control in combinations of pitch and roll planes reveal differences in spatiotemporal processing." experimental brain research 140(1): 95-111, 2001. doi:10.1007/s002210100802
[27] nardone, a., corra, t., and schieppati, m. "different activations of the soleus and gastrocnemii muscles in response to various types of stance perturbation in man." experimental brain research 80(2): 323-332, 1990. doi:10.1007/bf00228159
[28] diener, h. c., et al. "early stabilization of human posture after a sudden disturbance: influence of rate and amplitude of displacement." experimental brain research 56(1): 126-134, 1984. doi:10.1007/bf00237448
[29] haque, a., and dickman, j. d. "vestibular gaze stabilization: different behavioral strategies for arboreal and terrestrial avians." journal of neurophysiology 93(3): 1165-1173, 2005. doi:10.1152/jn.00966.2004
[30] stapley, p. j., et al. "bilateral vestibular loss leads to active destabilization of balance during voluntary head turns in the standing cat." journal of neurophysiology 95(6): 3783-3797, 2006. doi:10.1152/jn.00034.2006

acta polytechnica 57(5):321–330, 2017

acta polytechnica 56(3):180–192, 2016, doi:10.14311/ap.2016.56.0180
© czech technical university in prague, 2016, available online at http://ojs.cvut.cz/ojs/index.php/ap

on immersion formulas for soliton surfaces

alfred michel grundland (a, b, ∗), decio levi (c), luigi martina (d)

a centre de recherches mathématiques, université de montréal, montréal cp 6128 (qc) h3c 3j7, canada
b department of mathematics and computer science, université du québec, trois-rivières, cp 500 (qc) g9a 5h7, canada
c dipartimento di matematica e fisica dell'università roma tre, sezione infn di roma tre, via della vasca navale 84, roma, 00146 italy
d dipartimento di matematica e fisica dell'università del salento, sezione infn di lecce, via arnesano, c.p.
193, 73100 lecce, italy
∗ corresponding author: grundlan@crm.umontreal.ca

abstract. this paper is devoted to a study of the connections between three different analytic descriptions for the immersion functions of 2d-surfaces corresponding to the following three types of symmetries: gauge symmetries of the linear spectral problem, conformal transformations in the spectral parameter and generalized symmetries of the associated integrable system. after a brief exposition of the theory of soliton surfaces and of the main tool used to study classical and generalized lie symmetries, we derive the necessary and sufficient conditions under which the immersion formulas associated with these symmetries are linked by gauge transformations. we illustrate the theoretical results by examples involving the sigma model.

keywords: integrable systems; soliton surfaces; immersion formulas; generalized symmetries.

ams mathematics subject classification: 35q35, 22e60, 53a05.

1. introduction

soliton surfaces associated with integrable systems have been shown to play an essential role in many problems with physical applications (see e.g. [2, 5–9, 11–13, 15, 19, 29, 30, 33–35]). we say that a surface is integrable if the gauss-mainardi-codazzi equations corresponding to it are integrable, i.e. if they can be represented as the compatibility conditions for some "non-fake" linear spectral problem (lsp) [2, 3, 5, 17, 18, 21, 22, 24–26, 31–35]. the possibility of using an lsp to represent a moving frame on the integrable surface has yielded many new results concerning the intrinsic geometric properties of such surfaces (see e.g. [4, 29]). in the present state of development, it has proved most fruitful to extend such characterizations of soliton surfaces via their immersion functions (see e.g. [1, 2, 19, 23, 27, 30] and references therein). the construction of surfaces related to completely integrable models was initiated by a. sym [33–35]. this construction makes use of the conformal invariance of the zero-curvature representation of the system with respect to the spectral parameter. another approach for finding such surfaces has been formulated by j. cieslinski and a. doliwa [5–7] using gauge symmetries of the lsp. a third approach, using the lsp for integrable systems and their symmetries, has been introduced by fokas and gel'fand [8, 9] to construct families of soliton surfaces. most recently, in a series of papers [11–13, 15], a reformulation and extension of the fokas-gel'fand immersion formula has been performed through the formalism of generalized vector fields and their actions on jet spaces. this extension has provided the necessary and sufficient conditions for the existence of soliton surfaces in terms of the symmetries of the lsp and integrable models. the objective of this paper is to investigate and construct the relation between the three approaches concerning 2d-soliton surfaces associated with integrable systems.

this paper is organized as follows. section 2 contains a brief summary of the results concerning the construction of soliton surfaces and symmetries. using the sym-tafel (st) formula, we get the surface by differentiating the solution of the lsp with respect to the spectral parameter. through the cieslinski-doliwa (cd) formula we apply a gauge transformation, and through the fokas-gel'fand (fg) formula we consider the generalized symmetries of the associated integrable equation.
in section 3 we demonstrate that the immersion problem can be mapped through a gauge to any of the three immersion formulas listed above and we show that these formulas correspond to possibly different parametrizations of the same surface. then, in section 4, we apply the results on the example of the sigma model. section 5 contains the concluding remarks.

2. summary of results on the construction of soliton surfaces

in this section we recall the main tools used to study symmetries suitable for the use of the fokas-gel'fand formulas for the construction of 2d surfaces. we make use of the formalism of vector fields and their prolongations as presented in [29]. more specifically, we rewrite the formula for the immersion functions of 2d surfaces in terms of the prolongation formalism of the vector fields instead of the fréchet derivatives.

2.1. classical and generalized lie symmetries

let $X$ (with coordinates $x^\alpha$, $\alpha = 1, \dots, p$) and $U$ (with coordinates $u^k$, $k = 1, \dots, q$) be differential manifolds representing spaces of independent and dependent variables, respectively. let $J^n = J^n(X \times U)$ denote the $n$-jet space over $X \times U$. the coordinates of $J^n$ are given by $x^\alpha$, $u^k$ and $u^k_J = \partial^n u^k / (\partial x^{j_1} \cdots \partial x^{j_n})$, where $J = (j_1, \dots, j_n)$ is a symmetric multi-index. we denote these coordinates by $x$ and $u^{(n)}$. on $J^n$ we can define a system of partial differential equations (pdes) in $p$ independent and $q$ dependent variables given by $m$ equations of the form

$$\Omega^\mu(x, u^{(n)}) = 0, \qquad \mu = 1, \dots, m. \tag{2.1}$$

we consider a vector field $\mathbf{v}$ tangent to $J^0 = X \times U$,

$$\mathbf{v} = \xi^\alpha(x,u)\,\partial_\alpha + \varphi^k(x,u)\,\partial_k, \tag{2.2}$$

where $\partial_\alpha = \partial/\partial x^\alpha$, $\partial_k = \partial/\partial u^k$ and we adopt the summation convention over repeated indices. such a field defines vector fields $\mathrm{pr}^{(n)}\mathbf{v}$ on $J^n$ [29],

$$\mathrm{pr}^{(n)}\mathbf{v} = \xi^\alpha \partial_\alpha + \varphi^k_J \frac{\partial}{\partial u^k_J}. \tag{2.3}$$

the functions $\varphi^k_J$ are given by

$$\varphi^k_J = D_J R^k + \xi^\alpha u^k_{J,\alpha}, \qquad R^k = \varphi^k - \xi^\alpha u^k_\alpha, \tag{2.4}$$

where the operators $D_J$ correspond to multiple total derivatives, each of which is a combination of total derivatives of the form

$$D_\alpha = \partial_\alpha + u^k_{J,\alpha}\,\frac{\partial}{\partial u^k_J}, \qquad \alpha = 1, \dots, p, \tag{2.5}$$

and $R^k$ are the so-called characteristics of the vector field $\mathbf{v}$. in the following, the representation of $\mathbf{v}$ can be written equivalently as

$$\mathbf{v} = \xi^\alpha D_\alpha + \omega_R, \qquad \omega_R = R^k\,\frac{\partial}{\partial u^k}. \tag{2.6}$$

one says that the vector field $\mathbf{v}$ is a classical lie point symmetry of a nondegenerate system of pdes (2.1) if and only if its $n$-th prolongation is such that

$$\mathrm{pr}^{(n)}\mathbf{v}\,\Omega^\mu(x, u^{(n)}) = 0, \qquad \mu = 1, \dots, m, \tag{2.7}$$

whenever $\Omega^\mu(x, u^{(n)}) = 0$, $\mu = 1, \dots, m$, are satisfied. by a totally non-degenerate system we mean a system for which all its prolongations have maximal rank and are locally solvable [29]. it follows from the well-known properties of the symmetries of a differential system that the commutator of two symmetries is again a symmetry. thus, such symmetries form a lie algebra $\mathfrak{g}$, which locally defines an action of a lie group $G$ on $J^0$. every solution of (2.1) can be represented by its graph, $u^k = \theta^k(x)$, which is a section of $J^0$. the symmetry group $G$ transforms solutions into solutions. this means that a graph corresponding to one solution is transformed into a graph associated with another solution. if the graph is preserved by the group $G$ or, equivalently, if the vector fields $\mathbf{v}$ from the algebra $\mathfrak{g}$ are tangent to the graph, then the related solution is said to be $G$-invariant.
invariant solutions satisfy, in addition to the equations (2.1), the characteristic equations equated to zero,

$$\varphi^k_a(x,\theta) - \xi^\alpha_a(x,\theta)\,\theta^k_{,\alpha} = 0, \qquad a = 1, \dots, r, \tag{2.8}$$

where the index $a$ runs over the generators of $\mathfrak{g}$. it may happen that the invariant solutions are restricted in number or trivial if the full symmetry group is small. to extend the number of symmetries, and thus of solutions, one looks for generalized symmetries. they exist only if the nonlinear equation (2.1) is integrable [28], i.e. it has been obtained as the compatibility of a lax pair (see (2.13) and in the following). a generalized vector field is expressed in terms of the characteristics,

$$\omega_R = R^k[u]\,\frac{\partial}{\partial u^k}, \tag{2.9}$$

where $[u] = (x, u^{(n)}) \in J^n$. the prolongation of an evolutionary vector field $\omega_R$ is given by

$$\mathrm{pr}\,\omega_R = \omega_R + D_J R^k\,\frac{\partial}{\partial u^k_J}. \tag{2.10}$$

a vector field $\omega_R$ is a generalized symmetry of a nondegenerate system of pdes (2.1) if and only if [29]

$$\mathrm{pr}\,\omega_R\,\Omega^\mu(x, u^{(n)}) = 0, \tag{2.11}$$

whenever $\Omega(x, u^{(n)}) = 0$ and its differential consequences are satisfied.
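to make the notation concrete, here is a standard textbook illustration (our addition, not part of the original text): for $p = q = 1$, the translation symmetry $\mathbf{v} = \partial_x$ has $\xi = 1$, $\varphi = 0$, so by (2.4) and (2.6)

$$R = \varphi - \xi\,u_x = -u_x, \qquad \omega_R = -u_x\,\partial_u, \qquad \mathrm{pr}\,\omega_R = -u_x\,\partial_u - u_{xx}\,\frac{\partial}{\partial u_x} - u_{xy}\,\frac{\partial}{\partial u_y} - \cdots,$$

so the same point symmetry can equivalently be handled through its characteristic, which is the form used for the generalized symmetries below.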
2.2. the immersion formulas for soliton surfaces

in order to analyse the fokas-gel'fand immersion formula for a surface in 2d, we briefly summarize the results obtained in [2, 5–9, 11–13, 15, 33–35]. let us consider an integrable system of partial differential equations (pdes) in two independent variables $x^1, x^2$ and $m$ dependent variables $u^k(x^1,x^2)$ written as

$$\Omega[u] = 0. \tag{2.12}$$

suppose that the system (2.12) is obtained as the compatibility of a matrix lsp written in the form [33]

$$\partial_\alpha\Phi(x^1,x^2,\lambda) - U_\alpha([u],\lambda)\,\Phi(x^1,x^2,\lambda) = 0, \qquad \alpha = 1, 2. \tag{2.13}$$

in what follows, the potential matrices $U_\alpha$ can be defined on the extended jet space $\mathcal{N} = (J^n, \lambda)$, where $\lambda$ is the spectral parameter. the compatibility condition of the lsp (2.13), often called the zero-curvature condition (zcc),

$$D_2 U_1 - D_1 U_2 + [U_1, U_2] = 0, \tag{2.14}$$

which is assumed to be valid for all values of $\lambda$, implies (2.12); indeed, cross-differentiating (2.13) gives $D_2D_1\Phi - D_1D_2\Phi = (D_2U_1 - D_1U_2 + [U_1,U_2])\Phi$, so (2.14) is exactly the compatibility of the two linear equations. the bracket in (2.14) denotes the lie algebra commutator. equation (2.14) provides a representation for the initial system (2.12) under consideration. the $m$-dimensional matrix functions $U_\alpha$ take values in some semisimple lie algebra $\mathfrak{g}$ and the wavefunction $\Phi$ takes values in the corresponding lie group $G$. we can say that, as long as the potential matrices $U_\alpha([u],\lambda)$ satisfy the zcc (2.14), there exists a group-valued function $\Phi$ which satisfies (2.13). there exists a subclass of $\Phi$ which can be defined formally on the extended jet space $\mathcal{N}$. for $\Phi$ belonging to this subclass we can write formally $\Phi = \Phi([u],\lambda) \in G$, meaning that $\Phi$ depends functionally on $[u]$ and meromorphically on $\lambda$. when $\Phi = \Phi([u],\lambda)$, the lsp (2.13) can be written as

$$\Lambda_\alpha([u],\lambda) \equiv D_\alpha\Phi([u],\lambda) - U_\alpha([u],\lambda)\,\Phi([u],\lambda) = 0, \qquad \alpha = 1, 2, \tag{2.15}$$

which is a convenient form for the analysis we carry out in the following. in reference [9], the authors looked for a simultaneous infinitesimal deformation of the associated lsp (2.15) which preserves the zcc (2.14),

$$\begin{pmatrix} \tilde U_1 \\ \tilde U_2 \\ \tilde\Phi \end{pmatrix} = \begin{pmatrix} U_1 \\ U_2 \\ \Phi \end{pmatrix} + \epsilon \begin{pmatrix} A_1 \\ A_2 \\ \Psi \end{pmatrix} + O(\epsilon^2), \tag{2.16}$$

where the matrices $\tilde U_1, \tilde U_2, A_1$ and $A_2$ take values in the lie algebra $\mathfrak{g}$, while $\tilde\Phi$ and $\Psi$ belong to the corresponding lie group $G$. the parameter $\lambda$ is left invariant under the transformation (2.16) and $0 < \epsilon \ll 1$. the corresponding infinitesimal generator formally takes the evolutionary form

$$\hat X_e = A^1\,\partial_{U_1} + A^2\,\partial_{U_2} + \Psi\,\partial_\Phi. \tag{2.17}$$

equation (2.17) can be written as

$$\hat X_e = A^{1j}\,\partial_{U^j_1} + A^{2j}\,\partial_{U^j_2} + \Psi^j\,\partial_{\Phi^j}, \tag{2.18}$$

where we decompose the matrix functions $A^1$ and $A^2$ in the basis $e_j$, $j = 1, \dots, s$, for the lie algebra $\mathfrak{g}$,

$$U_\alpha = U^j_\alpha e_j \in \mathfrak{g}, \qquad [e_i, e_j] = c^k_{ij} e_k, \tag{2.19}$$

where $[\,,\,]$ is the lie algebra commutator and $c^k_{ij}$ are the structural constants of $\mathfrak{g}$. since this generator $\hat X_e$ does not transform $\lambda$, these symmetries preserve the singularity structure of the potential matrices $U_\alpha$ in the spectral parameter $\lambda$.

the infinitesimal deformation of the lsp (2.13) and the zcc (2.14) under the infinitesimal transformation (2.16) requires that the matrix functions $U_\alpha$ and $\Psi$ satisfy, at first order in $\epsilon$, the equations

$$D_\alpha\Psi = U_\alpha\Psi + A_\alpha\Phi, \qquad \alpha = 1, 2, \tag{2.20}$$

and

$$D_2 A_1 - D_1 A_2 + [A_1, U_2] + [U_1, A_2] = 0; \tag{2.21}$$

the equation (2.21) coincides with the compatibility condition for (2.20). for the given matrix functions $U_\alpha, A_\alpha \in \mathfrak{g}$ and $\Phi \in G$ satisfying equations (2.14), (2.15) and (2.21), an infinitesimal symmetry of the matrix system of the integrable pdes (2.12) allows us to generate a 2d-surface immersed in the lie algebra $\mathfrak{g}$. according to [8] this result is formulated as follows.

theorem 1. if the matrix functions $U_\alpha \in \mathfrak{g}$, $\alpha = 1, 2$, and $\Phi \in G$ of the lsp (2.15) satisfy the zcc (2.14), and $A_\alpha \in \mathfrak{g}$ are linearly independent matrix functions which satisfy (2.21), then there exists (up to affine transformations) a 2d-surface with a $\mathfrak{g}$-valued immersion function $F([u],\lambda)$ such that the tangent vectors to this surface are given by

$$D_\alpha F([u],\lambda) = \Phi^{-1} A_\alpha([u],\lambda)\,\Phi, \qquad \alpha = 1, 2. \tag{2.22}$$

proof. the compatibility condition of (2.22) coincides with (2.21). so an immersion function $F([u],\lambda)$ exists and can be assumed to take its values in the lie algebra $\mathfrak{g}$. if we define the matrix function

$$\Psi = \Phi F, \tag{2.23}$$

then, using (2.22), the function $\Psi$ satisfies (2.20). hence, since $F = \Phi^{-1}\Psi$, the formula $\tilde\Phi = \Phi + \epsilon\Psi = \Phi(\mathbb{I} + \epsilon F)$ implies that $\tilde\Phi$ is in the lie group $G$. the immersion function

$$F = \Big(\Phi^{-1}([u],\lambda)\,\frac{d\tilde\Phi([u],\lambda,\epsilon)}{d\epsilon}\Big)\Big|_{\epsilon=0} \tag{2.24}$$

is an element of the lie algebra $\mathfrak{g}$. this shows that we have constructed an appropriate infinitesimal deformation of the wavefunction $\Phi$.

in [5, 33] it was shown that the admissible symmetries of the zcc (2.14) include a conformal transformation of the spectral parameter $\lambda$, a gauge transformation of the wavefunction $\Phi$ in the lsp (2.13) and generalized symmetries of the integrable system (2.12). all these symmetries can be used to determine explicitly a $\mathfrak{g}$-valued immersion function $F$ of a 2d-surface. thus, a generalization of the fg formula for immersion can be formulated as follows [9, 11].

theorem 2. let the set of scalar functions $\{u^k\}$ satisfy a system of integrable pdes $\Omega[u] = 0$. let the $G$-valued function $\Phi([u],\lambda)$ satisfy the lsp (2.15) of $\mathfrak{g}$-valued potentials $U_\alpha([u],\lambda)$. let us define the linearly independent $\mathfrak{g}$-valued matrix functions $A_\alpha([u],\lambda)$ ($\alpha = 1, 2$) by the equations

$$A_\alpha([u],\lambda) = \beta(\lambda)\,D_\lambda U_\alpha + \big(D_\alpha S + [S, U_\alpha]\big) + \mathrm{pr}\,\omega_R U_\alpha + \big(\mathrm{pr}\,\omega_R(D_\alpha\Phi - U_\alpha\Phi)\big)\Phi^{-1}. \tag{2.25}$$

here $\beta(\lambda)$ is an arbitrary scalar function of $\lambda$, $S = S([u],\lambda)$ is an arbitrary $\mathfrak{g}$-valued matrix function defined on the jet space $\mathcal{N}$, and $\omega_R = R^k[u]\,\partial_{u^k}$ is the vector field, written in evolutionary form, of the generalized symmetries of the integrable pdes $\Omega[u] = 0$ given by the zcc (2.14). then there exists a 2d-surface with immersion function $F([u],\lambda)$ in the lie algebra $\mathfrak{g}$ given by the formula (up to an additive $\mathfrak{g}$-valued constant)

$$F([u],\lambda) = \Phi^{-1}\big(\beta(\lambda)\,D_\lambda\Phi + S\Phi + \mathrm{pr}\,\omega_R\Phi\big). \tag{2.26}$$
the integrated form of the surface (2.26) defines a mapping $F : \mathcal{N} \to \mathfrak{g}$ and we will refer to it as the st immersion formula (when $S = 0$, $\omega_R = 0$) [33–35],

$$F^{ST}([u],\lambda) = \beta(\lambda)\,\Phi^{-1}(D_\lambda\Phi) \in \mathfrak{g}, \tag{2.27}$$

the cd immersion formula (when $\beta = \omega_R = 0$) [5–7],

$$F^{CD}([u],\lambda) = \Phi^{-1} S([u],\lambda)\,\Phi \in \mathfrak{g}, \tag{2.28}$$

or the fg immersion formula (when $\beta = 0$, $S = 0$) [8, 9],

$$F^{FG}([u],\lambda) = \Phi^{-1}(\mathrm{pr}\,\omega_R\Phi) \in \mathfrak{g}. \tag{2.29}$$

2.3. application of the method

the construction of soliton surfaces requires three elements for an explicit representation of the immersion function $F \in \mathfrak{g}$:

(1.) an lsp (2.13) for the integrable pde.
(2.) a generalized symmetry $\omega_R$ of the integrable pde.
(3.) a solution $\Phi$ of the lsp associated with the soliton solution of the integrable pde.

note that item (1.) is always required. in its presence, even without one of the remaining two objects, we can obtain an immersion function $F$. when a solution $\Phi$ of the lsp is unknown, the geometry of the surface $F$ can be obtained using the non-degenerate killing form on the lie algebra $\mathfrak{g}$. the 2d-surface with the immersion function $F$ can be interpreted as a pseudo-riemannian manifold. when the generalized symmetries $\omega_R$ of the integrable pde are unknown but we know a solution $\Phi$ of the lsp, then we can define the 2d-soliton surface using the gauge transformation and the $\lambda$-invariance of the zcc,

$$F = \Phi^{-1}\big(\beta(\lambda)\,D_\lambda\Phi + S\Phi\big), \tag{2.30}$$

where $\beta(\lambda)$ is an arbitrary scalar function of $\lambda$ and $S$ is an arbitrary $\mathfrak{g}$-valued matrix function defined on the extended jet space $\mathcal{N}$. equation (2.30) is consistent with the tangent vectors

$$D_\alpha F = \beta(\lambda)\,\Phi^{-1}(D_\lambda U_\alpha)\Phi + \Phi^{-1}\big(D_\alpha S + [S, U_\alpha]\big)\Phi. \tag{2.31}$$

in all cases, the tangent vectors, given by (2.22), and the unit normal vector to a 2d-surface expressed in terms of matrices are

$$D_\alpha F = \Phi^{-1} A_\alpha\Phi \in \mathfrak{g}, \qquad N = \frac{\Phi^{-1}[A_1, A_2]\,\Phi}{\big(\tfrac{1}{2}\,\mathrm{tr}[A_1,A_2]^2\big)^{1/2}} \in \mathfrak{g}, \tag{2.32}$$

where the $\mathfrak{g}$-valued matrices $A_\alpha$ are given by (2.25). the first and second fundamental forms are given by

$$I = g_{ij}\,dx^i dx^j, \qquad II = b_{ij}\,dx^i dx^j, \qquad i, j = 1, 2, \tag{2.33}$$

where

$$g_{ij} = \frac{\epsilon}{2}\,\mathrm{tr}(A_i A_j), \qquad b_{ij} = \frac{\epsilon}{2}\,\mathrm{tr}\big((D_j A_i + [A_i, U_j])N\big), \qquad \epsilon = \pm 1. \tag{2.34}$$

this gives the following expressions for the mean and gaussian curvatures (written here in the form consistent with the definitions (2.34)):

$$H = \frac{1}{2\Delta}\Big(\mathrm{tr}(A_2^2)\,\mathrm{tr}\big((D_1A_1 + [A_1,U_1])N\big) - 2\,\mathrm{tr}(A_1A_2)\,\mathrm{tr}\big((D_2A_1 + [A_1,U_2])N\big) + \mathrm{tr}(A_1^2)\,\mathrm{tr}\big((D_2A_2 + [A_2,U_2])N\big)\Big),$$
$$K = \frac{1}{\Delta}\Big(\mathrm{tr}\big((D_1A_1 + [A_1,U_1])N\big)\,\mathrm{tr}\big((D_2A_2 + [A_2,U_2])N\big) - \mathrm{tr}^2\big((D_2A_1 + [A_1,U_2])N\big)\Big),$$
$$\Delta = \mathrm{tr}(A_1^2)\,\mathrm{tr}(A_2^2) - \mathrm{tr}^2(A_1A_2), \tag{2.35}$$

which are expressible in terms of $U_\alpha$ and $A_\alpha$ only. the study of soliton surfaces defined via the fg formula for immersion provides a unique mechanism for studying the relationship between these various characteristics of integrable systems. for example, do the infinite families of conservation laws have a geometric characterization? what is the geometry behind the hamiltonian structure? is there a geometric interpretation of the family of surfaces and frames associated with the spectral parameter? these answers will not only serve to construct surfaces with interesting geometric quantities but can also help to clarify some problems in the theory of integrable systems. following the three terms in (2.26) for the immersion of 2d-soliton surfaces in lie algebras, we now show that there exists a relation between the sym-tafel, the cieslinski-doliwa and the fokas-gel'fand formulas.
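a small numerical aid before turning to the mappings between the formulas: the fundamental forms and the curvatures (2.33)–(2.35) only involve traces, so conjugation by the wavefunction $\Phi$ drops out and they can be evaluated without knowing $\Phi$. the python sketch below is our illustration of that observation, not code from the paper; the caller must supply the derivatives $D_jA_i$ (e.g. from finite differences), and the returned values come back as complex numbers whose imaginary parts should vanish up to rounding.

```python
import numpy as np

def comm(a, b):
    return a @ b - b @ a

def curvatures(A1, A2, U1, U2, dA, eps=1.0):
    """Evaluate g_ij, b_ij of (2.34) and H, K of (2.35) at one point.
    A1, A2, U1, U2 are square numpy arrays (a matrix representation of the
    Lie algebra); dA[i][j] holds the derivative D_{j+1} A_{i+1}."""
    tr = np.trace
    C = comm(A1, A2)
    N = C / np.sqrt(0.5 * tr(C @ C) + 0j)        # unit normal of (2.32); +0j keeps sqrt complex-safe
    g11, g12, g22 = eps/2*tr(A1@A1), eps/2*tr(A1@A2), eps/2*tr(A2@A2)
    b = lambda Ai, Uj, dAij: eps/2 * tr((dAij + comm(Ai, Uj)) @ N)
    b11 = b(A1, U1, dA[0][0])
    b12 = b(A1, U2, dA[0][1])
    b22 = b(A2, U2, dA[1][1])
    det_g = g11*g22 - g12**2
    H = (g22*b11 - 2*g12*b12 + g11*b22) / (2*det_g)
    K = (b11*b22 - b12**2) / det_g
    return H, K
```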
3. mapping between the sym-tafel, the cieslinski-doliwa and the fokas-gel'fand immersion formulas

3.1. λ-conformal symmetries and gauge transformations

in this subsection we show that the st immersion formula can always be represented by a gauge transformation through the cd formula for immersion. the converse statement is also true: from a specific gauge it is always possible to determine the st immersion formula for soliton surfaces.

proposition 1. a symmetry of the zcc (2.14) of the lsp associated with an integrable system $\Omega[u] = 0$ is a $\lambda$-conformal symmetry if and only if there exists a $\mathfrak{g}$-valued matrix function $S_1 = S_1([u],\lambda)$ which is a solution of the system of differential equations

$$D_\alpha S_1 + [S_1, U_\alpha] = \beta(\lambda)\,D_\lambda U_\alpha, \qquad \alpha = 1, 2. \tag{3.1}$$

proof. first we show that for any $\lambda$-conformal symmetry of the zcc of the lsp associated with the integrable system $\Omega[u] = 0$, there exists a $\mathfrak{g}$-valued matrix function $S_1 = S_1([u],\lambda)$ which is a solution of the system of differential equations (3.1). indeed, the linearly independent $\mathfrak{g}$-valued matrix functions

$$A_\alpha([u],\lambda) = \beta(\lambda)\,D_\lambda U_\alpha([u],\lambda), \tag{3.2}$$

associated with the $\lambda$-conformal symmetry of the zcc (2.14), satisfy the infinitesimal deformation of the zcc (2.21), and the corresponding st immersion function is

$$F^{ST}([u],\lambda) = \beta(\lambda)\,\Phi^{-1}D_\lambda\Phi \in \mathfrak{g}, \tag{3.3}$$

with linearly independent tangent vectors

$$D_\alpha F^{ST} = \beta(\lambda)\,\Phi^{-1}(D_\lambda U_\alpha)\Phi, \qquad \alpha = 1, 2. \tag{3.4}$$

on the other hand, any $\mathfrak{g}$-valued matrix function can be written as the adjoint group action on its lie algebra. this implies the existence of a $\mathfrak{g}$-valued matrix function $S_1([u],\lambda)$ for which the st immersion formula (3.3) is the cd formula, i.e.

$$F^{CD}([u],\lambda) = \Phi^{-1} S_1([u],\lambda)\,\Phi \in \mathfrak{g}, \tag{3.5}$$

whose tangent vectors are found to be

$$D_\alpha F^{CD} = \Phi^{-1}\big(D_\alpha S_1 + [S_1, U_\alpha]\big)\Phi, \qquad \alpha = 1, 2. \tag{3.6}$$

by comparing the tangent vectors (3.4) and (3.6) we obtain (3.1). it remains to show that the system (3.1) is a solvable one. indeed, from the compatibility condition of (3.1) we get

$$\beta(\lambda)\,D_2(D_\lambda U_1) - \beta(\lambda)\,D_1(D_\lambda U_2) - \big[\beta(\lambda)\,D_\lambda U_2 - [S_1,U_2],\,U_1\big] - [S_1, D_2U_1] + \big[\beta(\lambda)\,D_\lambda U_1 - [S_1,U_1],\,U_2\big] + [S_1, D_1U_2] = 0, \tag{3.7}$$

which has to be satisfied whenever (3.1) holds. using the zcc (2.14) and the jacobi identity it is easy to show that (3.7) is identically satisfied. so if we can find a gauge $S_1([u],\lambda)$ which satisfies (3.1), then the st immersion formula (3.3) can always be represented by a gauge.

conversely, we show that for any $\mathfrak{g}$-valued matrix function $S_1$ defined as a solution of the system of pdes (3.1), there exists a $\lambda$-conformal symmetry of the zcc of the lsp associated with the integrable system $\Omega[u] = 0$. indeed, comparing the immersion formulas (3.3) with (3.5), we find a linear matrix equation for the wavefunction $\Phi$,

$$\beta(\lambda)\,D_\lambda\Phi = S_1([u],\lambda)\,\Phi. \tag{3.8}$$

if the gauge function $S_1([u],\lambda)$ is known, by solving (3.8) we can determine the wavefunction $\Phi$ and consequently obtain the st immersion formula for 2d-soliton surfaces. therefore, the st formula for immersion (2.27) is equivalent to the cd immersion formula (2.28) for the gauge $S_1$, which satisfies the differential equation (3.1).

3.2. generalized symmetries and gauge transformations

in this subsection we discuss the links between gauge transformations and generalized symmetries of the zcc associated with the integrable partial differential system $\Omega[u] = 0$. we show that the immersion formula associated with the generalized symmetries (2.29) can always be obtained by a gauge transformation, and the converse statement is also true.
proposition 2. a vector field $\omega_R$ is a generalized symmetry of the zcc (2.14) of the lsp associated with an integrable system $\Omega[u] = 0$ if and only if there exists a $\mathfrak{g}$-valued matrix function (gauge) $S_2 = S_2([u],\lambda)$ which is a solution of the system of differential equations

$$D_\alpha S_2 + [S_2, U_\alpha] = \mathrm{pr}\,\omega_R U_\alpha + \big(\mathrm{pr}\,\omega_R(D_\alpha\Phi - U_\alpha\Phi)\big)\Phi^{-1}. \tag{3.9}$$

proof. first we demonstrate that for every infinitesimal generator $\omega_R$ which is a generalized symmetry of the zcc (2.14), there exists a $\mathfrak{g}$-valued matrix function (gauge) $S_2 = S_2([u],\lambda)$ which is a solution of the system of differential equations (3.9). indeed, an evolutionary vector field $\omega_R$ is a generalized symmetry of the zcc (2.14) if and only if

$$\mathrm{pr}\,\omega_R\big(D_2U_1 - D_1U_2 + [U_1,U_2]\big) = 0, \tag{3.10}$$

whenever the zcc (2.14) holds. equation (3.10) is equivalent to the infinitesimal deformation of the zcc (2.14) given by (2.21), with linearly independent $\mathfrak{g}$-valued matrix functions

$$A_\alpha([u],\lambda) = \mathrm{pr}\,\omega_R U_\alpha + \big(\mathrm{pr}\,\omega_R(D_\alpha\Phi - U_\alpha\Phi)\big)\Phi^{-1}, \qquad \alpha = 1, 2. \tag{3.11}$$

in the derivation of (3.11) we use the fact that the total derivatives $D_\alpha$ commute with the prolongation of a vector field $\omega_R$ written in the evolutionary form [29],

$$[D_\alpha, \mathrm{pr}\,\omega_R] = 0, \qquad \alpha = 1, 2. \tag{3.12}$$

using the lsp (2.15), equation (3.11) can be written in the equivalent form

$$A_\alpha([u],\lambda) = \big[-U_\alpha(\mathrm{pr}\,\omega_R\Phi) + \mathrm{pr}\,\omega_R(D_\alpha\Phi)\big]\Phi^{-1}, \qquad \alpha = 1, 2. \tag{3.13}$$

substituting (3.13) into (2.21) we obtain

$$\big(-D_2U_1 + D_1U_2 - [U_1,U_2]\big)(\mathrm{pr}\,\omega_R\Phi)\,\Phi^{-1} = 0,$$

which is satisfied identically whenever the zcc (2.14) holds. an integrated form of the immersion function $F^{FG}([u],\lambda)$ of a 2d-surface associated with a generalized symmetry $\omega_R$ of the zcc (2.14) and the tangent vectors (2.22) is given by the fg formula

$$F^{FG}([u],\lambda) = \Phi^{-1}(\mathrm{pr}\,\omega_R\Phi) \in \mathfrak{g}. \tag{3.14}$$

the fact that any $\mathfrak{g}$-valued matrix function can be written under the adjoint group action implies that there exists a $\mathfrak{g}$-valued gauge $S_2$ such that (3.5) holds for the cd immersion function and its tangent vectors $D_\alpha F^{CD}$ given by (3.6). comparing equations (2.22) and (3.11) with (3.6) we get (3.9). let us show that the system (3.9) always possesses a solution. the compatibility condition of (3.9), whenever (3.9) and (2.21) hold, implies the relation

$$[S_2, D_2U_1 - D_1U_2] + \big[[S_2,U_1],U_2\big] - \big[[S_2,U_2],U_1\big] = 0, \tag{3.15}$$

which is identically satisfied in view of the zcc (2.14) and the jacobi identity. so, if we can find a gauge function $S_2([u],\lambda)$ which satisfies (3.9), then the fg formula (3.14) can always be represented by a gauge transformation.

the converse statement is also true. we show that for any $\mathfrak{g}$-valued matrix function $S_2$ defined as a solution of the system of pdes (3.9), there exists a generalized symmetry $\omega_R$ of the zcc of the lsp associated with $\Omega[u] = 0$. indeed, let the lsp of $\Omega[u] = 0$ admit a gauge symmetry. if the gauge $S_2([u],\lambda)$ is given, then the immersion function $F^{CD}$ of a 2d-surface can be integrated explicitly [5–7],

$$F^{CD}([u],\lambda) = \Phi^{-1} S_2([u],\lambda)\,\Phi \in \mathfrak{g}, \tag{3.16}$$

whenever the tangent vectors

$$D_\alpha F^{CD} = \Phi^{-1}\big(D_\alpha S_2 + [S_2, U_\alpha]\big)\Phi \tag{3.17}$$

are linearly independent. it is straightforward to verify that the characteristics of a generalized vector field $\omega^S_R$, written in evolutionary form, associated with a gauge symmetry $S_2$, can be expressed as

$$A_\alpha = D_\alpha S_2 + [S_2, U_\alpha] \in \mathfrak{g}. \tag{3.18}$$

the matrices $A_\alpha$ identically satisfy the determining equations (2.21), which are required for $\omega^S_R$ to be a generalized symmetry of the zcc $\Omega[u] = 0$,

$$D_2A_1 - D_1A_2 + [A_1,U_2] + [U_1,A_2] = \mathrm{pr}\,\omega^S_R\big(D_2U_1 - D_1U_2 + [U_1,U_2]\big) = 0, \tag{3.19}$$

whenever $\Omega[u] = 0$ holds.
hence, the vector field $\omega^S_R$ associated with a gauge symmetry $S_2$ is given by

$$\omega^S_R = \big(D_\alpha S_2 + [S_2, U_\alpha]\big)^j\,\frac{\partial}{\partial U^j_\alpha}, \tag{3.20}$$

where we have decomposed the matrix functions $A_\alpha$ and $U_\alpha$ in the basis $\{e_j\}_1^n$ for the lie algebra,

$$U_\alpha = U^j_\alpha e_j \in \mathfrak{g}, \qquad D_\alpha S_2 + [S_2, U_\alpha] = \big(D_\alpha S_2 + [S_2, U_\alpha]\big)^j e_j. \tag{3.21}$$

hence, for any smooth $\mathfrak{g}$-valued gauge $S_2([u],\lambda)$ there exists a generalized symmetry $\omega^S_R$ of the zcc (2.14), and the converse statement holds as well. comparing the fg formula for immersion (3.14) with the cd immersion formula (3.5) we find the gauge

$$S_2 = (\mathrm{pr}\,\omega_R\Phi)\,\Phi^{-1}. \tag{3.22}$$

hence the fg formula for immersion (2.29) is equivalent to the cd immersion formula (2.28) for the gauge $S_2$ satisfying (3.9).

3.3. the sym-tafel immersion formula versus the fokas-gel'fand immersion formula

under the assumptions of propositions 1 and 2, we have the following result.

proposition 3. let $S_1$ and $S_2$ be the two $\mathfrak{g}$-valued matrix functions determined in propositions 1 and 2, respectively, in terms of a $\lambda$-conformal symmetry and a generalized symmetry of the zcc (2.14) of the lsp associated with an integrable system $\Omega[u] = 0$. if the gauge $S_2$ is a non-singular matrix, then there exists a matrix $M = S_1 S_2^{-1}$ such that

$$\beta(\lambda)(D_\lambda\Phi) = M(\mathrm{pr}\,\omega_R\Phi). \tag{3.23}$$

the matrix $M$ defines a mapping from the fg immersion formula (3.14) to the st immersion formula (3.3). alternatively, if the gauge $S_1$ is a non-singular matrix, then there exists a matrix $M^{-1}$ such that

$$(\mathrm{pr}\,\omega_R\Phi) = M^{-1}\beta(\lambda)(D_\lambda\Phi). \tag{3.24}$$

the matrix $M^{-1}$ defines a mapping from the st immersion formula (3.3) to the fg immersion formula (3.14).

figure 1. representation of the relations between the wavefunction $\Phi \in G$ and the $\mathfrak{g}$-valued st and fg formulas for immersions of 2d-soliton surfaces.

proof. equation (3.23) or (3.24) is obtained by eliminating the wavefunction $\Phi$ from the right-hand side of equations (3.8) and

$$\mathrm{pr}\,\omega_R\Phi = S_2\Phi, \tag{3.25}$$

respectively. so the link between the immersion functions $F^{ST}$ and $F^{FG}$ exists, up to a $\mathfrak{g}$-valued gauge function.

it should be noted that in order to recover soliton surfaces, we have to perform an integration with respect to the curvilinear coordinates in the case of the fg formula. alternatively, by using the st immersion formula, we obtain the same soliton surface by differentiating the wavefunction $\Phi$ with respect to the spectral parameter $\lambda$. the connection between the fg and st approaches for determining the immersion functions $F^{ST}$ and $F^{FG}$ of 2d-surfaces is obtained through the gauge matrix functions $M$ or $M^{-1}$ from the equation (3.23) or (3.24), respectively (see fig. 1). we can also write direct equations relating the generalized symmetries $\omega_R$ with the sym-tafel $\lambda$-conformal symmetry for the zcc (2.14), eliminating the gauge $S_2([u],\lambda)$ in (3.9) by using (3.8). so we get

$$\beta(\lambda)(D_\lambda\Phi)U_\alpha - \beta(\lambda)\,\Phi U_\alpha\Phi^{-1}(D_\lambda\Phi) + \beta(\lambda)(D_\lambda U_\alpha)\Phi + \Phi\big[-\mathrm{pr}\,\omega_R(D_\alpha\Phi) + U_\alpha(\mathrm{pr}\,\omega_R\Phi)\big]\Phi^{-1} = 0. \tag{3.26}$$

however, equations (3.26) are nonlinear differential equations for the wavefunction $\Phi$, which in general are not easy to solve. to conclude, in all three cases we give explicit expressions for 2d-soliton surfaces immersed in the lie algebra $\mathfrak{g}$ and demonstrate that one such surface can be transformed to another one through a gauge.

4. the sigma model and soliton surfaces

for the sake of generality we start by considering the general $cp^{n-1}$ model.
the problem of constructing integrable surfaces associated with the $cp^{n-1}$ models and their deformations under various types of dynamics has generated a great deal of interest over the past decades [1, 23, 37]. the most fruitful approach to the study of general properties of this model has been formulated through descriptions of the model in terms of rank-one hermitian projectors. a matrix $P(z,\bar z)$ is said to be a rank-one hermitian projector if

$$P^2 = P, \qquad P = P^\dagger, \qquad \mathrm{tr}\,P = 1. \tag{4.1}$$

the target space of the projector $P$ is determined by a complex line in $\mathbb{C}^n$, i.e. by a one-dimensional vector function $f(z,\bar z)$ given by

$$P = \frac{f \otimes f^\dagger}{f^\dagger f}, \tag{4.2}$$

where $f$ is the mapping $\mathbb{C} \supseteq \Omega \ni z = x + iy \mapsto f = (f_0, f_1, \dots, f_{n-1}) \in \mathbb{C}^n\setminus\{0\}$. equation (4.2) gives an isomorphism between the equivalence classes of the $cp^{n-1}$ model and the set of rank-one hermitian projectors $P$. the equations of motion,

$$\Omega(P) = [\partial_+\partial_- P,\ P] = 0, \qquad \partial_\pm = \tfrac{1}{2}(\partial_1 \pm i\partial_2), \qquad \partial_1 = \partial_x,\ \partial_2 = \partial_y, \tag{4.3}$$

and other properties of the model take a compact form when the model is written in terms of the projector. now we present some examples which illustrate the theoretical considerations presented in the previous section. our first example shows that the integrated form of the surface associated with the $cp^{n-1}$ model admits conformal symmetries which depend on two arbitrary functions of one complex variable. this model is defined on the riemann sphere $s^2 = \mathbb{C}\cup\{\infty\}$ and its action functional is finite [37]. an entire class of solutions of (4.3) is obtained by acting on the holomorphic (or antiholomorphic) solution $P$ [10] with raising and lowering operators. these operators are given by

$$\Pi_\pm(P) = \begin{cases} \dfrac{(\partial_\pm P)\,P\,(\partial_\mp P)}{\mathrm{tr}\big((\partial_\pm P)\,P\,(\partial_\mp P)\big)} & \text{for } (\partial_\pm P)\,P\,(\partial_\mp P) \neq 0, \\[6pt] 0 & \text{for } (\partial_\pm P)\,P\,(\partial_\mp P) = 0, \end{cases} \qquad \Pi_-(P_k) = P_{k-1}, \quad \Pi_+(P_k) = P_{k+1}. \tag{4.4}$$

the set of $n$ rank-1 projectors $\{P_0, \dots, P_{n-1}\}$ acts on orthogonal complements of one-dimensional subspaces in $\mathbb{C}^n$ and satisfies the orthogonality and completeness relations

$$P_j P_k = \delta_{jk} P_j \ \text{(no summation)}, \qquad \sum_{j=0}^{n-1} P_j = \mathbb{I}_n, \tag{4.5}$$

where $\mathbb{I}_n$ is the $n\times n$ identity matrix on $\mathbb{C}^n$. these projectors provide a basis of commuting elements in the space of the hermitian matrices on $\mathbb{C}^n$ and satisfy the euler-lagrange equation (written in the form of a conservation law)

$$\partial[\bar\partial P_k, P_k] + \bar\partial[\partial P_k, P_k] = 0, \qquad k = 0, 1, \dots, n-1, \tag{4.6}$$

where $\partial = \tfrac{1}{2}(\partial_x - i\partial_y)$ and $\bar\partial = \tfrac{1}{2}(\partial_x + i\partial_y)$. for a given set of rank-1 projector solutions $P_k$ of (4.6),

$$F_k(z,\bar z) = i\int_\gamma\big(-[\partial P_k, P_k]\,dz + [\bar\partial P_k, P_k]\,d\bar z\big), \qquad k = 0, 1, \dots, n-1 \tag{4.7}$$

(where $\gamma$ is a curve locally independent of the trajectory in $\mathbb{C}$), the $su(n)$-valued generalized weierstrass formula for immersion (gwfi) [20] can be explicitly integrated [10]:

$$F_k(z,\bar z) = -i\Big(P_k + 2\sum_{j=0}^{k-1}P_j\Big) + \frac{1+2k}{n}\,i\,\mathbb{I}_n. \tag{4.8}$$

the immersion functions $F_k$ satisfy the algebraic conditions

$$\big[F_k - ic_k\mathbb{I}_n\big]\big[F_k - i(c_k-1)\mathbb{I}_n\big]\big[F_k - i(c_k-2)\mathbb{I}_n\big] = 0, \qquad 0 < k < n-1,$$
$$\big[F_0 - ic_0\mathbb{I}_n\big]\big[F_0 - i(c_0-1)\mathbb{I}_n\big] = 0, \qquad \big[F_{n-1} + ic_0\mathbb{I}_n\big]\big[F_{n-1} + i(c_0-1)\mathbb{I}_n\big] = 0,$$
$$\sum_{j=0}^{n-1}(-1)^j F_j = 0, \qquad c_k = \frac{1+2k}{n}. \tag{4.9}$$

the lsp associated with (4.6) is given by [27, 36]

$$\partial_\alpha\Phi_k = U_{\alpha k}\Phi_k, \qquad U_{\alpha k} = \frac{2}{1\pm\lambda}\,[\partial_\alpha P_k, P_k], \qquad (U_{1k})^\dagger = -U_{2k} \tag{4.10}$$

(where $\alpha = 1, 2$ stands for $\pm$), with soliton solution $\Phi_k = \Phi_k([P],\lambda) \in SU(n)$ which goes to $\mathbb{I}_n$ as $\lambda \to \infty$ [36, 37]:

$$\Phi_k = \mathbb{I}_n + \frac{4\lambda}{(1-\lambda)^2}\sum_{j=0}^{k-1}P_j - \frac{2}{1-\lambda}\,P_k, \qquad \Phi_k^{-1} = \mathbb{I}_n - \frac{4\lambda}{(1+\lambda)^2}\sum_{j=0}^{k-1}P_j - \frac{2}{1+\lambda}\,P_k, \qquad \lambda = it,\ t \in \mathbb{R}. \tag{4.11}$$

the recurrence relation (4.4) is expressed in terms of the rank-1 projectors $P_k$, without any reference to the sequence of functions $f_k$ as in (4.2).
for the sake of simplicity, in this section we drop the index $k$ attributed to the $N$ projectors $P_k$. it is convenient for computational purposes to express the $cp^{N-1}$ model in terms of the matrix
$$\theta \equiv i\Big(P - \frac1N\mathbb{I}_N\Big) = \theta^s e_s \in su(N), \qquad [e_j, e_l] = c^s_{jl}e_s, \quad j, l, s = 1, \dots, N^2-1, \qquad (4.12)$$
where $c^s_{jl}$ are the structural constants of $\mathfrak{g}$ and $e_s$ are the basis elements of the $su(N)$ algebra. due to the idempotency of the projector $P$ we get the following algebraic restriction on $\theta$:
$$\theta \cdot \theta = -i\,\frac{2-N}{N}\,\theta + \frac{1-N}{N^2}\,\mathbb{I}_N \iff P^2 = P. \qquad (4.13)$$
the equations of motion in terms of the matrix $\theta$ are
$$\Omega^j[\theta] = \big[(\partial_1^2 + \partial_2^2)\theta,\, \theta\big]^j = 0, \qquad j = 1, \dots, N^2-1, \qquad (4.14)$$
where $[\cdot,\cdot]^j$ denotes the coefficient of the commutator with respect to the $j$th basis element $e_j$ of the $su(N)$ algebra. the potential matrices $U_\alpha$ in terms of $\theta$ are
$$U_1 = \frac{-2}{1-\lambda^2}\big([\partial_1\theta,\theta] - i\lambda[\partial_2\theta,\theta]\big), \qquad U_2 = \frac{-2}{1-\lambda^2}\big(i\lambda[\partial_1\theta,\theta] + [\partial_2\theta,\theta]\big), \qquad \lambda = it, \ t \in \mathbb{R}. \qquad (4.15)$$
the wavefunction $\Phi$ in terms of $\theta$ is
$$\Phi([\theta],\lambda) = \mathbb{I}_N + \frac{4\lambda}{(1-\lambda)^2}\sum_{j=0}^{k-1}\Pi_-^{\,j}(\theta) - \frac{2}{1-\lambda}\Big(\frac1N\mathbb{I}_N - i\theta\Big) \in SU(N), \qquad (4.16)$$
where $\Pi_\pm$ are the raising and lowering operators acting on the elements $\theta$ of the algebra $su(N)$:
$$\Pi_-(\theta_k) = \theta_{k-1}, \qquad \Pi_+(\theta_k) = \theta_{k+1}. \qquad (4.17)$$
in what follows, we simplify the notation $\Pi_\pm(\theta_k)$ to $\Pi_\pm(\theta)$, where the index $k$ is suppressed. the operators (4.17) are written explicitly as
$$\Pi_-(\theta) = \frac{\bar\partial\theta\,(E - i\theta)\,\partial\theta}{\mathrm{tr}\big(\bar\partial\theta\,(E - i\theta)\,\partial\theta\big)}, \qquad \Pi_+(\theta) = \frac{\partial\theta\,(E - i\theta)\,\bar\partial\theta}{\mathrm{tr}\big(\partial\theta\,(E - i\theta)\,\bar\partial\theta\big)}, \qquad E = \frac1N\mathbb{I}_N, \qquad (4.18)$$
where the traces in the denominators are different from zero unless the whole matrix is zero. for any functions $f$ and $g$ of one variable, the equations of motion (4.14) and their lsp (2.15) (with the potential matrices (4.15)) admit the conformal symmetries
$$\omega_{c_i} = \big[f(x)\,\partial_1\theta^j + g(y)\,\partial_2\theta^j\big]\frac{\partial}{\partial\theta^j}, \qquad i = 1, 2. \qquad (4.19)$$
the vector fields $\omega_{c_i}$ are related to the fields $\eta_{c_i}$ defined on the jet space $M = [(\Phi, U_\alpha)]$,
$$\eta_{c_i} = (\partial_i\Phi^j)\frac{\partial}{\partial\Phi^j} + (\partial_i u^j_\alpha)\frac{\partial}{\partial u^j_\alpha}, \qquad i = 1, 2, \qquad (4.20)$$
which are conformal symmetries of the lsp (2.15). the integrated form of the surface is given by the fg formula [15]
$$F^{fg} = \Phi^{-1}\big(f(x)U_1 + g(y)U_2\big)\Phi \in su(N). \qquad (4.21)$$

4.1. soliton surfaces associated with the $cp^1$ sigma model

we give a simple example to illustrate the construction of 2d-soliton surfaces associated with the $cp^1$ model ($N = 2$) introduced in the previous section. the only solutions with finite action of the $cp^1$ model are holomorphic and antiholomorphic projectors [37]. the rank-one hermitian projectors (i.e. the holomorphic $P_0$ and the antiholomorphic $P_1$) based on the veronese sequence $f_0 = (1, z)$ take the form
$$P_0 = \frac{f_0 \otimes f_0^\dagger}{f_0^\dagger f_0} = \frac{1}{1+|z|^2}\begin{pmatrix}1 & \bar z\\ z & |z|^2\end{pmatrix}, \qquad P_1 = \frac{f_1 \otimes f_1^\dagger}{f_1^\dagger f_1} = \frac{1}{1+|z|^2}\begin{pmatrix}|z|^2 & -\bar z\\ -z & 1\end{pmatrix}, \qquad (4.22)$$
where $f_1 = (\mathbb{I}_2 - P_0)\,\partial f_0$. the corresponding integrated forms of the surfaces are given by the gwfi (4.8):
$$F_0 = i\Big(\frac12\mathbb{I}_2 - P_0\Big) = \frac{i}{1+|z|^2}\begin{pmatrix}\frac12(|z|^2 - 1) & -\bar z\\ -z & \frac12(1 - |z|^2)\end{pmatrix} \in su(2), \qquad F_1 = -i(P_1 + 2P_0) + \frac{3i}{2}\mathbb{I}_2 = F_0. \qquad (4.23)$$
from equation (4.10) the potential matrices $U_{\alpha k}$ become
$$U_{10} = U_{11} = \frac{2}{(\lambda+1)(1+|z|^2)^2}\begin{pmatrix}-\bar z & -\bar z^2\\ 1 & \bar z\end{pmatrix}, \qquad U_{20} = U_{21} = \frac{2}{(\lambda-1)(1+|z|^2)^2}\begin{pmatrix}-z & 1\\ -z^2 & z\end{pmatrix}, \qquad \lambda = it, \ t \in \mathbb{R}. \qquad (4.24)$$
the $su(2)$-valued soliton wavefunctions $\Phi_k$ in the lsp (4.10) for the $cp^1$ model have the form
$$\Phi_0 = \frac{1}{1+|z|^2}\begin{pmatrix}\dfrac{-i+t+(i+t)|z|^2}{t-i} & \dfrac{-2i\bar z}{t-i}\\[2mm] \dfrac{-2iz}{t+i} & \dfrac{i+t+(t-i)|z|^2}{t+i}\end{pmatrix}, \qquad \Phi_1 = \frac{1}{1+|z|^2}\begin{pmatrix}\dfrac{1+t^2+(t+i)^2|z|^2}{(t-i)^2} & \dfrac{2(1-it)\bar z}{(t-i)^2}\\[2mm] \dfrac{-2i(t-i)z}{(t+i)^2} & \dfrac{1+t^2+(t-i)^2|z|^2}{(t+i)^2}\end{pmatrix}. \qquad (4.25)$$
let us now consider separately four different analytic descriptions for the immersion functions of 2d-soliton surfaces in lie algebras which are related to four different types of symmetries. i.
the zcc (2.14) of the cp 1 model admits a conformal symmetry in the spectral parameter λ. the tangent vectors dαfstk associated with this symmetry are given by dαf st k = φ −1 k (dλuαk)φk, α,k = 1, 2 and are linearly independent. the integrated forms of the 2d-surfaces in su(2) are given by the st formulas (2.27) fst0 = φ −1 0 (dλφ0) = 2i (1 + t2)2(1 + |z|2)2 · ( −|z|2[t2 − 3 + |z|2(1 + t2)] z[(t− i)2 + |z|2(3 − 2it + t2)] z̄[(t + i)2 + |z|2(3 + 2it + t2)] |z|2[t2 − 3 + |z|2(t2 + 1)] ) , fst1 = φ −1 1 (dλφ1) = 2i (1 + t2)2(1 + |z|2)2 · ( −[(t2 + 1)(1 + 2|z|4) + 3|z|2(t2 − 3)] z[t2 − 6it− 5 + |z|2(7 + t2 − 6it)] z̄[6it− 5 + t2 + |z|2(7 + 6it + t2)] (t2 + 1)(1 + 2|z|4) + 3|z|2(t2 − 3) ) , (4.26) where, without loss of generality, we can put β(λ) = 1 in the expression (2.27). the surfaces fstk satisfy (fstk ) 2 + 1 4 i2 = 0, for k = 0, 1 and have positive constant gaussian and mean curvatures. hence, they are spheres (see fig. 2a) k0 = k1 = 4, h0 = h1 = 4. (4.27) the surfaces fstk in cartesian coordinates (x,y) take the form for z = x + iy fstk = { x 1 + x2 + y2 , y 1 + x2 + y2 , 1 −x2 −y2 2(1 + x2 + y2) } (4.28) the su(2)-valued gauges sstk associated with the st immersion functions fstk (4.26) take the form sst0 = (dλφ0)φ −1 0 = 2i 1 + |z|2 ( −|z|2 t2+1 z̄ (t−i)2 z (t+i)2 |z|2 t2+1 ) , sst1 = (dλφ1)φ −1 1 = 2i 1 + |z|2 ( −(1+2|z|2) t2+1 z̄(t+i)2 (t−i)4 z(t−i)2 (t+i)4 1+2|z|2 t2+1 ) , det sstk 6= 0, tr s st k = 0. (4.29) ii. the surfaces fgk ∈ su(2) associated with the scaling symmetries of the zcc (2.14) associated with equations of the cp 1 model (4.6) ω g k = ( d1(zu1k) + z̄(d2u1k) ) ∂ ∂θ1 + ( z(d1u2k) + d2(z̄u2k) ) ∂ ∂θ2 , (4.30) have the integrated form [14] f g k = φ −1 k (zu1k + z̄u2k)φk, k = 0, 1 (4.31) where θ1 and θ2 are complex-valued functions determined from the su(2) lie algebra (4.12). the surfaces f g k also have constant positive curvatures k0 = k1 = −4λ2, h0 = h1 = −4iλ, iλ ∈ r. (4.32) 188 vol. 56 no. 3/2016 on immersion formulas for soliton surfaces the surfaces fgk are not spheres (as in the previous cases (4.27)) since they have boundaries (see fig. 2b). the surfaces fgk can be given in the parametric form f g k = (x3 − 2x2y + x(y2 − 1) − 2y(1 + y2) (1 + x2 + y2)2 , − 2x3 + x2y + y(y2 − 1) + 2x(1 + y2) (1 + x2 + y2)2 , 2(x2 + y2) (1 + x2 + y2)2 ) . (4.33) the su(2)-valued gauges sgk = (pr ω gφk)φ−1k associated with the scaling symmetries ωgk are given by s g 0 = s g 1 = 2 (t2 + 1)(1 + |z|2)2 · ( 2it|z|2 iz̄[i− t + |z|2(t + i)] z[1 − it + |z|2(1 + it)] −2it|z|2 ) , (4.34) where det sgk 6= 0. iii. in the case of surfaces associated with the conformal symmetries ωck = −gk(z)∂ − ḡk(z̄)∂̄, (4.35) where for simplicity we have assumed that gk(z) = 1 + i, the su(2)-valued immersion functions fck are given by [12] fck = φ −1 k (u1k + u2k)φk, (4.36) where u10 + u20 = u11 + u21 = 2 (t2 + 1)(1 + |z|2)2 · ( 2z + i(t + i)(z + z̄) −1 − it + iz̄2(t + i) 1 + z2 + it(z2 − 1) −[2z + i(t + i)(z + z̄)] ) . (4.37) by further computation it can be verified that the gaussian curvature and the mean curvature corresponding to the surfaces fck are not constant for any value of λ. the fact that the surfaces fck have the euler-poincaré characters [10] χk = −1 π ∫∫ s2 ∂∂̄ ln [tr(∂pk · ∂̄pk)]dx1dx2 (4.38) equal to 2 and positive gaussian curvature k > 0 means that the surfaces fck are homeomorphic to ovaloids (see fig. 2c). 
the surfaces fck associated with the conformal symmetries ωck are cardioid surfaces which can be parametrized as follows fck = (x2 − 1 − 4xy −y2 (1 + x2 + y2)2 ,− 2(1 + x2 + xy −y2) (1 + x2 + y2)2 , 2(2x−y) (1 + x2 + y2)2 ) (4.39) the su(2)-valued gauges sck associated with the conformal symmetries ωcktake the form sc0 = (pr ω cφ0)φ−10 = 2 (1 + |z|2)2 · ( −i(1−i)(t−i)z+(1+i)(1−it)z̄ t2+1 i(1+i)(t+i)−(1−i)z2(t−i) (t+i)2 (1−i)(1+it)+(1+i)(1−it)z̄2 (t−i)2 (1−i)(1+it)z+i(1+i)(t+i)z̄ t2+1 ) , sc1 = (pr ω cφ1)φ−11 = 2 (1 + |z|2)2 · ( −i(1−i)(t−i)z+(1+i)(1−it)z̄ t2+1 i(t−i)2[(1+i)(t+i)−(1−i)(t−i)z2] (t+i)4 (t+i)2[(1−i)(1+it)+(1+i)(1−it)z̄2] (t−i)4 (1−i)(1+it)z+i(1+i)(t+i)z̄ t2+1 ) , (4.40) where det sck 6= 0. iv. if the generalized symmetries of the cp 1 model (4.6) are written in the evolutionary form ωrk = ( d21u1k + d 2 2u1k + [d1u1k,u1k] +[d2u1k,u1k] ) ∂ ∂θ1 + ( d21u2k +d 2 2u2k +[d2u2k,u2k] + [d1u2k,u2k] ) ∂ ∂θ2 , k = 1, 2 (4.41) (where θ1 and θ2 are complex-valued functions obtained from (4.12)) then the su(2)-valued integrated form of the immersion becomes [14] ffgk = φ −1(pr ωrk φk) = φ −1 k (d1u1k + d2u2k)φk. (4.42) the tangent vectors to this surface are given by d1f fg k = φ −1 k (pr ω r k u1k)φk = φ −1 k (d 2 1u1k + d 2 2u1k +[d1u1k,u1k] + [d2u1k,u1k])φk, d2f fg k = φ −1 k (pr ω r k u2k)φk = φ −1 k (d 2 1u2k + d 2 2u2k +[d2u2k,u2k] + [d1u2k,u2k])φk. (4.43) the surfaces ffgk also have positive gaussian curvatures k > 0 and the euler-poincaré characters are equal to 2. in the parametrization x,y, the surfaces ffgk take the form ffgk = ( − x3 − 6x2y −x(1 + 3y2) + 2y(1 + y2) (1 + x2 + y2)3 , 2x3 + y + 3x2y −y3 + x(2 − 6y2) (1 + x2 + y2)3 , − 2(x2 − 4xy −y2) (1 + x2 + y2)3 ) (4.44) and they are homeomorphic to ovaloids. the su(2)-valued gauges sfgk associated with the generalized symmetry ωrk take the form sfgk = (pr ω r k φk)φ −1 k = d1u1k + d2u2k. (4.45) 189 a. m. grundland, d. levi, l. martina acta polytechnica under the assumption that pk are holomorphic or antiholomorphic projectors (4.22) the gauges sfgk take the form sfg0 = s fg 1 = 4 (t2 + 1)(1 + |z|2)3 · ( −z2(1 + it) + z̄2(1 − it) z̄3(1 − it) + z(it + 1) −iz3(t− i) + iz̄(t + i) z2(1 + it) − z̄2(1 − it) ) , (4.46) where det sfgk 6= 0. hence the mappings mk = sstk (s fg k ) −1 from the fg immersion formulas to the st immersion formulas are given by m0 = sst0 (s fg 0 ) −1 = 1 2(t2 + 1) · ( −2iz3(t−i)+z̄[(t+i)2+|z|2(t2+1)] z(t−i) − [z 3(1+t2)+2z̄(1−it)+z(t−i)2] t+i z[(t+i)2+(t2+1)|z|2+2(1+it)] t−i z(t−i)2+|z|2z(t2+1)+2iz̄3(t+i) z̄(t+i) ) , m1 = sst1 (s fg 1 ) −1 = i 2|z|2 ( −(1+2|z|2) t2+1 (i+t)2z̄ (t−i)4 z(t−i)2 (t+i)4 1+2|z|2 t2+1 ) · ( z2(it + 1) + z̄2(it− 1) −z(it + 1) + z̄3(it− 1) z3(it + 1) + (1 − it)z̄ −z2(1 + it) + (1 − it)z̄2 ) , (4.47) where det mk 6= 0. conversely, the gauges m−1k = sfgk (s st k ) −1 do exist. hence there exist mappings from the st immersion formulas to the fg immersion formulas m−10 = s fg 0 (s st 0 ) −1 = 2 (1 + |z|2)3 ( m11 m12 m21 m22 ) , (4.48) where m11 = (t− i)2z + (1 + t2)|z|2z + 2(it− 1)z̄3 (i + t)z̄ , m12 = 2(1 + it)z2 + (i + t)2z̄2 + (1 + t2)|z|2z̄2 (i− t)z , m21 = (t− i)2z2 + (1 + t2)|z|2z2 + 2(1 − it)z̄2 (i + t)z̄ , m22 = −2(1 + it)z3 + (i + t)2z̄ + (1 + t2)|z|2z̄ (t− i)z , and m−11 =s fg 1 (s st 1 ) −1 = 2 (t2 + 1)(1 + |z|2)3(1 + 4|z|2) · ( −z2(1 + it) + z̄2(1 − it) z̄3(1 − it) + z(1 + it) −z3(1 + it) + z̄(it− 1) z2(1 + it) − z̄2(1 − it) ) · ( (1 + 2|z|2)(1 + it)(i + t) −iz̄(i+t) 4 (t−i)2 −iz(t−i)4 (i+t)2 (1 + 2|z| 2)(1 − it)(−i + t) ) . (4.49) (a) (b) (c) (d) figure 2. 
surfaces $F^{ST}_0$ in (a), $F^{G}_0$ in (b), $F^{C}_0$ in (c) and $F^{FG}_0$ in (d) for $k = 0, 1$ and $\lambda = i/2$, and $\xi_\pm = x \pm iy$ with $x, y \in [-5, 5]$. the axes indicate the components of the immersion function in the basis for $su(2)$,
$$e_1 = \begin{pmatrix}0 & i\\ i & 0\end{pmatrix}, \qquad e_2 = \begin{pmatrix}0 & -1\\ 1 & 0\end{pmatrix}, \qquad e_3 = \begin{pmatrix}i & 0\\ 0 & -i\end{pmatrix}.$$

5. concluding remarks

in this paper we have shown how three different analytic descriptions for the immersion function of 2d-soliton surfaces can be related through different $\mathfrak{g}$-valued gauge transformations. the existence of such gauges is demonstrated by reducing the problem to that of mappings between different forms of the immersion formulas for three types of symmetries: conformal transformations in the spectral parameter, gauge symmetries of the lsp, and generalized symmetries of the integrable system. we have investigated the geometric consequences of these mappings and rephrased them as requirements for the existence of the corresponding vector fields and their prolongations acting on a solution $\Phi$ of the associated lsp for an integrable pde. the explicit expressions for these relations, which we have established, have provided us with a tool for distinguishing the cases in which soliton surfaces can or cannot be related to one another; see proposition 3. the task of finding an increasing number of soliton surfaces associated with integrable systems is related to the symmetry properties of these systems. the construction of soliton surfaces started with the contributions of sym [33, 34] and tafel [35], providing a formula for the immersion of integrable surfaces which is extensively used in the literature (see e.g. [2, 5–16, 30, 33–35]). in this paper we have addressed this question and formulated easily verifiable conditions which ensure that the st formula produces the desired result. this advance can assist future studies of 2d-soliton surfaces of integrable models, which can describe more diverse types of surfaces than the ones discussed in three-dimensional euclidean space for the $cp^1$ sigma model. it may be worthwhile to extend the investigation of soliton surfaces to the case of sigma models defined on other homogeneous spaces via grassmann models, and possibly to models associated with octonion geometry. this case could lead to different classes and types of surfaces than those studied in this paper. this task will be explored in a future work.

acknowledgements

amg has been partially supported by a research grant from nserc of canada and would also like to thank the dipartimento di matematica e fisica of the università roma tre and the dipartimento di matematica e fisica of the università del salento for their warm hospitality. dl has been partly supported by the italian ministry of education and research, 2010 prin continuous and discrete nonlinear integrable evolutions: from water waves to symplectic maps. lm has been partly supported by the italian ministry of education and research, 2011 prin teorie geometriche e analitiche dei sistemi hamiltoniani in dimensioni finite e infinite. dl and lm are also supported by infn is-csn4 mathematical methods of nonlinear physics.

references

[1] babelon o, bernard d and talon m 2006 introduction to classical integrable systems (cambridge monographs on mathematical physics) (cambridge: cambridge university press) doi:10.1017/cbo9780511535024
[2] bobenko ai 1994 surfaces in terms of 2 by 2 matrices. old and new integrable cases, in harmonic maps and integrable systems, eds fordy a, wood j (braunschweig: vieweg) doi:10.1007/978-3-663-14092-4_5
[3] calogero f and nucci mc 1991 lax pairs galore j. math. phys. 32 72–74 doi:10.1063/1.529096
[4] cartan e 1953 sur la structure des groupes infinis de transformation. chapitre i: les systèmes différentiels en involution (paris: gauthier-villars)
[5] cieśliński j 1997 a generalized formula for integrable classes of surfaces in lie algebras j. math. phys. 38 4255–4272 doi:10.1063/1.532093
[6] cieśliński j 2007 pseudospherical surfaces on time scales: a geometric deformation and the spectral approach j. phys. a: math. theor. 40 12525–38 doi:10.1088/1751-8113/40/42/s02
[7] doliwa a and sym a 1992 constant mean curvature surfaces in e3 as an example of soliton surfaces, in nonlinear evolution equations and dynamical systems, eds boiti m, martina l and pempinelli f (singapore: world scientific) pp 111–17
[8] fokas as and gel'fand im 1996 surfaces on lie groups, on lie algebras, and their integrability comm. math. phys. 177 203–220
[9] fokas as, gel'fand im, finkel f and liu qm 2000 a formula for constructing infinitely many surfaces on lie algebras and integrable equations sel. math. 6 347–375 doi:10.1007/pl00001392
[10] goldstein pp and grundland am 2010 invariant recurrence relations for cp^(n-1) models j. phys. a: math. theor. 43 265206 (18pp) doi:10.1088/1751-8113/43/26/265206
[11] grundland am 2016 soliton surfaces in the generalized symmetry approach theor. math. phys. (accepted)
[12] grundland am and post s 2011 soliton surfaces associated with generalized symmetries of integrable equations j. phys. a: math. theor. 44 165203 (31pp) doi:10.1088/1751-8113/44/16/165203
[13] grundland am and post s 2012 surfaces immersed in lie algebras associated with elliptic integrals j. phys. a: math. theor. 45 015204 (20pp) doi:10.1088/1751-8113/45/1/015204
[14] grundland am and post s 2012 soliton surfaces associated with cp^(n-1) sigma models j. phys.: conf. series 380 012023 pp 1–14 doi:10.1088/1742-6596/380/1/012023
[15] grundland am, post s and riglioni d 2014 soliton surfaces and generalized symmetries of integrable systems j. phys. a: math. theor. 47 015201 (14pp) doi:10.1088/1751-8113/47/1/015201
[16] grundland am and yurdusen i 2009 on analytic descriptions of two-dimensional surfaces associated with the cp^(n-1) sigma model j. phys. a: math. theor. 42 172001 doi:10.1088/1751-8113/42/17/172001
[17] gubbiotti g, scimiterna c and levi d 2016 linearizability and fake lax pair for a consistent around the cube nonlinear non-autonomous quad-graph equation teor. math. phys., in press
[18] hay m and butler s, simple identification of fake lax pairs, arxiv:1311.2406v1
[19] helein f 2001 constant mean curvature surfaces, harmonic maps and integrable systems (lectures in mathematics) (boston, ma: birkhäuser) doi:10.1007/978-3-0348-8330-6
[20] konopelchenko bg 1996 induced surfaces and their integrable dynamics stud. appl. math. 96 9–51 doi:10.1002/sapm19969619
[21] levi d, sym a and tu gz 1990 a working algorithm to isolate integrable surfaces in e3, preprint df infn 761, roma, oct. 10, 1990 doi:10.1016/0375-9601(90)90897-w
[22] li yq, li b and lou sy, constraints for evolution equations with some special forms of lax pairs and distinguishing lax pairs by available constraints, arxiv:1008.1375v2
[23] manton n and sutcliffe p 2004 topological solitons (cambridge monographs on mathematical physics) (cambridge: cambridge university press) doi:10.1017/cbo9780511617034
[24] marvan m 2002 on the horizontal gauge cohomology and nonremovability of the spectral parameter acta appl. math. 72 51–65 doi:10.1023/a:1015218422059
[25] marvan m 2004 reducibility of zero curvature representations with application to recursion operators acta appl. math. 83 39–68 doi:10.1023/b:acap.0000035588.67805.0b
[26] marvan m 2010 on the spectral parameter problem acta appl. math. 109 239–255 doi:10.1007/s10440-009-9450-4
[27] mikhailov av 1986 integrable magnetic models, in solitons (modern problems in condensed matter vol 17) ed s e trullinger et al (amsterdam: north-holland) pp 623–90
[28] mikhailov av, shabat ab and sokolov vv 1991 the symmetry approach to classification of integrable equations, in nonlinear dynamics, ed zakharov ve (springer) pp 115–184
[29] olver pj 1993 applications of lie groups to differential equations, 2nd edn (new york: springer) doi:10.1007/978-1-4612-4350-2
[30] rogers c and schief wk 2000 bäcklund and darboux transformations. geometry and modern applications in soliton theory (cambridge: cambridge university press) doi:10.1017/cbo9780511606359
[31] sakovich s yu, true and fake lax pairs: how to distinguish them, arxiv:nlin.si/0112027
[32] sakovich s yu, cyclic bases of zero-curvature representations: five illustrations to one concept, arxiv:nlin/0212019v1
[33] sym a 1982 soliton surfaces lett. nuovo cimento 33 394–400
[34] sym a 1995 soliton surfaces and their applications (soliton geometry from spectral problems), in geometric aspects of the einstein equation and integrable systems (lecture notes in physics vol 239) ed r martini (berlin: springer) pp 154–231 doi:10.1007/3-540-16039-6_6
[35] tafel j 1995 surfaces in r3 with prescribed curvature j. geom. phys. 17 381–90 doi:10.1016/0393-0440(94)00054-9
[36] zakharov ve and mikhailov av 1979 relativistically invariant two-dimensional models of field theory which are integrable by means of the inverse scattering problem method sov. phys. – jetp 47 1017–49
[37] zakrzewski w 1989 low dimensional sigma models (bristol: hilger)
static vs. dynamic list-scheduling performance comparison
t. hagras, j. janeček
the problem of efficient task scheduling is one of the most important and most difficult issues in homogeneous computing environments. finding an optimal solution for a scheduling problem is np-complete. therefore, it is necessary to have heuristics to find a reasonably good schedule rather than evaluate all possible schedules. list-scheduling is generally accepted as an attractive approach, since it pairs low complexity with good results. list-scheduling algorithms schedule tasks in order of priority. this priority can be computed either statically (before scheduling) or dynamically (during scheduling). this paper presents the characteristics of the two main static and the two main dynamic list-scheduling algorithms. it also compares their performance in dealing with randomly generated graphs with various characteristics.
keywords: list scheduling, compile time scheduling, task graph scheduling, homogeneous computing.

1 introduction
efficient scheduling of computationally intensive programs is one of the most essential and most difficult issues for achieving high performance in a homogeneous computing environment [1]. when the characteristics of an application are known a priori, including task execution times, the size of the data communicated between tasks, and task dependencies, the application is represented by a static model [2]. in the static model, the application is represented by a directed acyclic graph (dag) in which the nodes represent the application tasks and the edges represent inter-task data dependencies. each node is labeled by the computation cost (expected computation time) of the task, and each edge is labeled by the communication cost (expected communication time) between tasks [3, 4, 5, 6, 7]. the objective of scheduling is to map the tasks onto the processors (machines) and order their execution so that the task dependencies are satisfied and the minimum overall schedule length (makespan) is achieved. finding an optimal solution for the scheduling problem is np-complete [3, 4, 5, 6, 7]. therefore, it is necessary to have heuristics that find a good schedule rather than evaluate all possible scheduling combinations. most scheduling heuristics are based on list-scheduling [2, 3, 4, 7]. list-scheduling consists of two phases: a task prioritizing phase, where a priority is computed and assigned to each node of the dag, and a processor selection phase, where each task (in order of its priority) is assigned to the processor that minimizes a suitable cost function. the scheduling heuristic is called static if the processor selection phase starts after completion of the task prioritizing phase [2, 8], and it is called dynamic if the two phases are interleaved [9, 10]. this paper presents the characteristics of the two main static and the two main dynamic list-scheduling algorithms, and compares their performance over 90 k randomly generated graph variants. the remainder of this paper is organized as follows. the next section defines the static task-scheduling problem and gives the background of the problem, including some definitions and parameters used in the algorithms. section 3 presents a brief review of the examined algorithms. section 4 presents a performance comparison of the reviewed algorithms. section 5 provides the conclusion.
2 task scheduling problem
this section presents the application model used for static task scheduling and the homogeneous computing environment model used by the surveyed algorithms. the application can be represented by a directed acyclic graph $g(v, e, c, w)$, where:
- $v$ is the set of $v$ nodes; each node $v_i \in v$ represents an application task, which is a sequence of instructions that must be executed serially on the same processor,
- $w$ is the set of computation costs, where $w_i \in w$ is the execution time of task $v_i$,
- $e$ is the set of communication edges; the directed edge $e_{i,j}$ joins nodes $v_i$ and $v_j$, where node $v_i$ is called the parent node and node $v_j$ is called the child node; this also implies that $v_j$ cannot start until $v_i$ finishes and sends its data to $v_j$,
- $c$ is the set of communication costs, and the edge $e_{i,j}$ has a communication cost $c_{i,j} \in c$.
a task without any parent is called an entry task, and a task without any child is called an exit task. if there is more than one exit (entry) task, they may be connected to a zero-cost pseudo exit (entry) task with zero-cost edges, which does not affect the schedule.
the homogeneous computing environment model is a set $p$ of $p$ identical processors connected in a fully connected graph. it is also assumed that:
- any processor can execute a task and communicate with other processors at the same time,
- once a processor has started task execution, it continues without interruption, and on completing the execution it immediately sends the output data to all children tasks in parallel.
the communication cost $c_{i,j}$ for transferring data from task $v_i$ (scheduled on $p_m$) to task $v_j$ (scheduled on $p_n$) is defined as
$$c_{i,j} = s + d_{i,j}\,r,$$
where $s$ is the cost of starting communication between processors (in secs), $d_{i,j}$ is the amount of data transmitted from task $v_i$ to task $v_j$ (in bytes), and $r$ is the cost of communication per transferred byte (in sec/byte). it is assumed that the startup cost $s$ is negligible and the unit cost $r$ is the same for any two processors, so that the communication cost for any two tasks is a function of the amount of transferred data only.
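this dag model translates directly into ordinary adjacency structures. the toy graph in the sketch below is hypothetical (it is not the dag of fig. 1) and only serves as a running example for the sketches that follow:

```python
# computation costs w, communication costs c, and the two adjacency maps
# of a small hypothetical dag with one entry ('A') and one exit ('D')
w = {'A': 2, 'B': 3, 'C': 4, 'D': 1}
c = {('A', 'B'): 4, ('A', 'C'): 1, ('B', 'D'): 1, ('C', 'D'): 5}
succ = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': []}
pred = {'A': [], 'B': ['A'], 'C': ['A'], 'D': ['B', 'C']}
```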
2.1 basic scheduling attributes
the frequently used attributes for assigning priority in list-scheduling are the t-level (top level) and the b-level (bottom level). the t-level of node $v_i$ is the length of the longest path from the entry node to $v_i$ (excluding $v_i$). here, the length of a path is the sum of all node and edge weights along the path. the t-level of $v_i$ is computed recursively by traversing the dag downward, starting from the entry node $v_{entry}$:
$$t\text{-}level(v_i) = \max_{v_m \in pred(v_i)}\big\{t\text{-}level(v_m) + w_m + c_{m,i}\big\},$$
where $pred(v_i)$ is the set of immediate predecessors of $v_i$ and $t\text{-}level(v_{entry}) = 0$. the b-level of node $v_i$ is the length of the longest path from $v_i$ to the exit node. the b-level is computed recursively by traversing the dag upward, starting from the exit node $v_{exit}$:
$$b\text{-}level(v_i) = w_i + \max_{v_m \in succ(v_i)}\big\{c_{i,m} + b\text{-}level(v_m)\big\},$$
where $succ(v_i)$ is the set of immediate successors of $v_i$ and $b\text{-}level(v_{exit}) = w_{exit}$. if the edge weights are not taken into account in computing the b-level, the b-level is called the static b-level or simply the static level (sl). the sl is computed recursively by traversing the dag upward, starting from the exit node:
$$sl(v_i) = w_i + \max_{v_m \in succ(v_i)} sl(v_m),$$
where $sl(v_{exit}) = w_{exit}$.
two other attributes are also used to assign priority to the nodes: est (earliest start time), also called asap (as soon as possible), and lst (latest start time), also called alap (as late as possible). the $est(v_i)$ of $v_i$ is highly correlated with the t-level of $v_i$, and the procedure for computing the t-level can be used to compute the nodes' earliest start times. the est is computed recursively by traversing the dag downward, starting from the entry node:
$$est(v_i) = \max_{v_m \in pred(v_i)}\big\{est(v_m) + w_m + c_{m,i}\big\},$$
where $est(v_{entry}) = 0$. the lst is computed recursively by traversing the dag upward, starting from the exit node:
$$lst(v_i) = \min_{v_m \in succ(v_i)}\big\{lst(v_m) - c_{i,m}\big\} - w_i,$$
where $lst(v_{exit}) = est(v_{exit})$.
the critical path (cp) of a dag is the longest path from the entry node to the exit node. clearly, a dag can have more than one cp. consider the dag shown in fig. 1, where each node has two labels: the upper one indicates the node label and the lower one the node weight. in this dag, the nodes $v_1$, $v_2$, $v_9$, $v_{10}$ are the cp nodes, called cpns (critical path nodes). the edges on the cp are shown by thick arrows. the values of the priorities discussed above are shown in table 1.
fig. 1: application directed acyclic graph (dag)

node   sl   t-level & est   b-level   lst
1      80   0               104       0
2      60   28              76        28
3      50   24              60        44
4      55   22              65        39
5      45   28              61        43
6      40   24              52        52
7      30   52              32        72
8      35   52              39        65
9      40   56              48        56
10     20   84              20        84
table 1: priority attributes
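the recursive definitions above translate almost line for line into code. a sketch in plain python, reusing the w, c, succ and pred maps of the sketch in the previous section:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def t_level(v):                      # equals EST, by the identity above
    return max((t_level(p) + w[p] + c[(p, v)] for p in pred[v]), default=0)

@lru_cache(maxsize=None)
def b_level(v):
    return w[v] + max((c[(v, s)] + b_level(s) for s in succ[v]), default=0)

@lru_cache(maxsize=None)
def sl(v):                           # static level: edge costs ignored
    return w[v] + max((sl(s) for s in succ[v]), default=0)

@lru_cache(maxsize=None)
def lst(v):                          # ALAP
    if not succ[v]:                  # exit node: lst = est
        return t_level(v)
    return min(lst(s) - c[(v, s)] for s in succ[v]) - w[v]
```

the critical-path length is then the b-level of the entry node, and the cpns are exactly the nodes with t_level(v) == lst(v).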
3 list-scheduling algorithms
this section presents the two main static list-scheduling algorithms and the two main dynamic list-scheduling algorithms. all these algorithms are for a limited number of homogeneous processors.

3.1 static list-scheduling algorithms
this section briefly reviews the two main static list-scheduling algorithms: the highest level first with estimated time (hlfet) algorithm [2] and the modified critical path (mcp) algorithm [8].

3.1.1 hlfet algorithm
the highest level first with estimated time (hlfet) algorithm [2] is one of the simplest list-scheduling algorithms and is described in fig. 2.
1. compute the sl (static level) for each node in the graph.
2. put all nodes in a list l and sort l in descending order of the nodes' sl.
3. while l is not empty do:
   - dequeue $v_i$ from l,
   - compute the earliest execution start time for $v_i$ on all processors,
   - schedule $v_i$ to the processor that minimizes its earliest execution start time.
fig. 2: hlfet algorithm
the complexity of the hlfet algorithm is $o(pv^2)$. for the dag shown in fig. 1, the scheduling trace of the hlfet algorithm is given in table 2. in the table, the execution start times of each node on all available processors at each step are given, and the nodes on the list are scheduled one by one, to the processor that allows the earliest execution start time.

step   selected v   p1   p2   p3   selected p
1      1            0    0    0    p1
2      2            20   28   28   p1
3      4            40   22   22   p2
4      3            40   37   24   p3
5      5            40   37   44   p2
6      6            40   42   44   p1
7      9            50   48   50   p2
8      8            45   68   53   p1
9      7            60   68   44   p3
10     10           76   68   76   p2
table 2: a scheduling trace of the hlfet algorithm (makespan = 88)

3.1.2 mcp algorithm
the modified critical path (mcp) algorithm [8] uses the alap attribute (lst, defined in section 2) of a node as the scheduling priority. the mcp algorithm first computes the alaps of all nodes, and then constructs a list of nodes in ascending order of their alap. in the case of equal alap values, the alaps of the children are taken into consideration to break the tie. the mcp algorithm then schedules the nodes on the list one by one such that a node is scheduled to the processor that allows the earliest execution start time. the mcp algorithm is shown in fig. 3.
1. compute the alap (lst in section 2) for each node in the graph.
2. for each node, create a list consisting of the alap of the node itself and of all its children.
3. sort these lists in ascending order of the nodes' alap.
4. create a node list l sorted in ascending order of the nodes' alap; use the sorted lists of the previous two steps to break ties.
5. while l is not empty do:
   - dequeue $v_i$ from l,
   - compute the earliest execution start time for $v_i$ on all processors,
   - schedule $v_i$ to the processor that minimizes its earliest execution start time.
fig. 3: mcp algorithm
the complexity of the mcp algorithm is $o(pv^2)$. for the dag shown in fig. 1, the scheduling trace of the mcp algorithm is given in table 3.

step   selected v   p1   p2   p3   selected p
1      1            0    0    0    p1
2      2            20   28   28   p1
3      4            40   22   22   p2
4      5            40   37   28   p3
5      3            40   37   33   p3
6      6            40   37   53   p2
7      9            41   48   53   p1
8      8            61   44   53   p2
9      7            61   61   53   p3
10     10           65   69   69   p1
table 3: a scheduling trace of the mcp algorithm (makespan = 85)
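both static algorithms share the same processor-selection loop and differ only in how the list is ordered. a sketch under the toy structures above (the two-processor setting is an arbitrary illustration, and mcp's tie-breaking on the children's alap values is omitted):

```python
def static_schedule(order, num_procs=2):
    """List-schedule tasks in the given priority order (fig. 2 / fig. 3)."""
    ready_at = [0.0] * num_procs         # when each processor becomes free
    placed = {}                          # task -> (proc, start, finish)

    def eest(v, p):                      # earliest execution start time of v on p
        t = ready_at[p]
        for u in pred[v]:
            pu, _, fin = placed[u]
            t = max(t, fin if pu == p else fin + c[(u, v)])
        return t

    for v in order:                      # both orders respect precedence
        p = min(range(num_procs), key=lambda q: eest(v, q))
        start = eest(v, p)
        placed[v] = (p, start, start + w[v])
        ready_at[p] = start + w[v]
    return placed

hlfet = static_schedule(sorted(w, key=sl, reverse=True))   # descending SL
mcp = static_schedule(sorted(w, key=lst))                  # ascending ALAP
makespan = max(fin for _, _, fin in hlfet.values())
```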
3.2 dynamic list-scheduling algorithms
this section briefly reviews the two main dynamic list-scheduling algorithms: the earliest time first (etf) algorithm [9] and the dynamic level scheduling (dls) algorithm [10].

3.2.1 etf algorithm
the earliest time first (etf) algorithm [9] computes, at each step, the earliest execution start time (eest) for all ready nodes and selects the one with the lowest value for scheduling. a ready node is a node having all its parents scheduled. when two nodes have the same value of eest, the etf algorithm breaks the tie by scheduling the one with the higher static level. fig. 4 shows the etf algorithm.
1. compute the sl (static level) of each node in the graph.
2. initially, the ready list includes only the entry node.
3. while the ready list is not empty do:
   - compute the earliest execution start time on each processor for each node in the ready list,
   - select the node-processor pair that gives the earliest execution start time; ties are broken by selecting the node with the higher sl,
   - schedule the node to the corresponding selected processor,
   - add the newly ready nodes to the ready list.
fig. 4: etf algorithm
the complexity of the etf algorithm is $o(pv^3)$. for the dag shown in fig. 1, the scheduling trace of the etf algorithm is given in table 4.

step   selected v   p1   p2   p3   selected p
1      1            0    0    0    p1
2      2            20   28   28   p1
3      4            40   22   22   p2
4      3            40   37   24   p3
5      5            40   37   44   p2
6      6            40   42   44   p1
7      7            52   52   44   p3
8      8            45   53   53   p1
9      9            50   48   50   p2
10     10           76   68   76   p2
table 4: a scheduling trace of the etf and dls algorithms (makespan = 88)

3.2.2 dls algorithm
the dynamic level scheduling (dls) algorithm [10] uses an attribute called the dynamic level (dl), which is the difference between the static level of a node and its earliest execution start time. in each scheduling step, the node-processor pair that gives the largest value of dl is selected. this mechanism is similar to the one used by the etf algorithm. however, there is one subtle difference between etf and dls: the etf algorithm schedules the node with the minimum earliest execution start time and uses the static level merely to break ties. in contrast, the dls algorithm tends to schedule nodes in descending order of their static levels at the beginning of the process, but tends to schedule nodes in ascending order of eest near the end of the process. the dls algorithm is shown in fig. 5.
1. compute the sl (static level) of each node in the graph.
2. initially, the ready list includes only the entry node.
3. while the ready list is not empty do:
   - compute the earliest execution start time for every ready node on each processor,
   - compute the dl of every node-processor pair by subtracting the earliest execution start time from the node's static level (sl),
   - select the node-processor pair that gives the largest dl,
   - schedule the node to the corresponding selected processor,
   - add the newly ready nodes to the ready list.
fig. 5: dls algorithm
the complexity of the dls algorithm is $o(pv^3)$. for the dag shown in fig. 1, the scheduling trace of the dls algorithm is exactly the same as that of the etf algorithm, as shown in table 4.
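the dynamic variants re-rank the ready set at every step and differ only in the score attached to each node-processor pair. a sketch in the same setting, reusing w, c, pred, succ and sl() from the sketches above:

```python
def dynamic_schedule(score, best, num_procs=2):
    ready_at = [0.0] * num_procs
    placed = {}
    ready = {v for v in w if not pred[v]}          # entry node(s)

    def eest(v, p):                                # all parents are placed
        t = ready_at[p]
        for u in pred[v]:
            pu, _, fin = placed[u]
            t = max(t, fin if pu == p else fin + c[(u, v)])
        return t

    while ready:
        v, p = best(((v, p) for v in ready for p in range(num_procs)),
                    key=lambda vp: score(eest(*vp), sl(vp[0])))
        start = eest(v, p)
        placed[v] = (p, start, start + w[v])
        ready_at[p] = start + w[v]
        ready.discard(v)
        ready |= {s for s in succ[v] if all(u in placed for u in pred[s])}
    return placed

etf = dynamic_schedule(lambda t, s: (t, -s), min)  # lowest EEST, SL breaks ties
dls = dynamic_schedule(lambda t, s: s - t, max)    # largest dynamic level DL
```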
4 experimental results and discussion
this section presents a performance comparison of the four algorithms given in section 3. for this purpose, we used randomly generated task graphs and the following comparison metrics.

4.1 comparison metrics
the comparison of the algorithms is based on the following metrics.
makespan. the makespan is defined as the overall completion time, and can be specified as $makespan = fet(v_{exit})$, where $fet(v_{exit})$ is the finishing time of the scheduled exit node.
scheduling length ratio (slr). the main performance measure of a scheduling algorithm is the scheduling length (makespan) of its output schedule. since a large set of task graphs with different properties is used, it is necessary to normalize the schedule length to the lower bound; the normalized value is called the schedule length ratio (slr). the slr value of an algorithm on a graph is defined as
$$slr = \frac{makespan}{\sum_{v_i \in cp} w_i}.$$
the denominator is the sum of the computation costs of the tasks on a critical path (cp). the slr of a graph (using any algorithm) cannot be less than one, since the denominator is the lower bound. average slr values are used in our experiments.
speedup. the speedup value is computed by dividing the sequential execution time (i.e., the cumulative computation costs of the tasks) by the parallel execution time (i.e., the makespan of the schedule).
number of occurrences of better quality of schedules. the number of times that each algorithm produced a better, worse, or equal quality of schedule compared to every other algorithm is counted in the experiments.
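for a schedule produced by any of the sketches above, these metrics reduce to a few lines; the critical-path nodes are recovered here from the attribute functions as the nodes whose est equals their lst:

```python
def metrics(placed):
    """placed maps task -> (processor, start, finish)."""
    makespan = max(fin for _, _, fin in placed.values())
    cp = [v for v in w if t_level(v) == lst(v)]      # critical path nodes
    slr = makespan / sum(w[v] for v in cp)           # never below 1
    speedup = sum(w.values()) / makespan             # sequential / parallel
    return makespan, slr, speedup
```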
4.2 random graph generator
a random graph generator was implemented to generate weighted application dags with various characteristics that depend on several input parameters. the generator requires the following input parameters to build weighted dags:
- the number of tasks in a graph, $v$,
- the number of graph levels, $l$,
- the communication to computation ratio, $ccr$, defined as the ratio of the average communication cost to the average computation cost.
in all experiments, graphs with a single entry and a single exit node were considered. in each experiment, the values of these parameters were selected from the corresponding sets
$$v \in \{20, 30, 40, 50, 60, 70, 80, 90, 100\}, \qquad l \in \{0.2\,v,\ 0.8\,v\}, \qquad ccr \in \{0.5,\ 1.0,\ 2.0\}.$$
these input parameters were used to generate 10 k different dags with various characteristics for each $v$ from the set above.

4.3 performance results
the performance of the algorithms was compared with respect to different graph sizes. the experiments were repeated for each $v$ from the set given above. for each $v$, 10 k graphs were generated using random selection of $ccr$ and the number of levels $l$ for each graph. the average slr for each $v$ is given in fig. 6. in general, the performance of the dynamic algorithms is better than that of the static ones; the static algorithms have nearly equal performance, while among the dynamic algorithms etf is better than dls.
fig. 6: average slr
the average speedup is given in fig. 7. in general, the speedup of the dynamic algorithms is better than that of the static algorithms; the two dynamic algorithms have almost the same performance, while hlfet is better than mcp.
fig. 7: average speedup
finally, the percentage of situations in which each scheduling algorithm produced a better (b), equal (e) or worse (w) scheduling length compared to every other algorithm was counted for the 90 k dags used. each cell in table 5 indicates the comparison result of the algorithm on the left with the algorithm at the top.

            hlfet     mcp       etf       dls
hlfet  b    –         49.15 %   37.75 %   22.62 %
       e    –         12.47 %   22.37 %   42.17 %
       w    –         38.39 %   39.89 %   35.22 %
mcp    b    38.39 %   –         40.84 %   36.41 %
       e    12.47 %   –         5.84 %    8.35 %
       w    49.15 %   –         53.33 %   55.24 %
etf    b    39.89 %   53.33 %   –         26.93 %
       e    22.37 %   5.84 %    –         39.53 %
       w    37.75 %   40.84 %   –         33.54 %
dls    b    35.22 %   55.24 %   33.54 %   –
       e    42.17 %   8.35 %    39.43 %   –
       w    22.62 %   36.41 %   26.93 %   –
table 5: pair-wise comparison of the examined algorithms

table 5 indicates that the dynamic algorithms are better than the static ones; as regards the static algorithms, hlfet is better than mcp, while dls is better than etf as regards the dynamic algorithms. we note that the algorithm complexity is an important factor that has to be taken into account when comparing the performance of different algorithms. as shown in table 6, the complexity of the dynamic algorithms is much higher than that of the static ones.

algorithm   complexity
hlfet       o(pv²)
mcp         o(pv²)
etf         o(pv³)
dls         o(pv³)
table 6: algorithms complexity

this makes it unfair, if a static algorithm gives the same schedule length as a dynamic one, to consider it an equivalent trial. if we consider an equal scheduling length of two algorithms as a better trial for the lower-complexity algorithm, the better, equal and worse comparison between the examined algorithms is as shown in table 7.

            hlfet     mcp       etf       dls
hlfet  b    –         49.15 %   60.12 %   64.79 %
       e    –         12.47 %   –         –
       w    –         38.39 %   39.89 %   35.22 %
mcp    b    38.39 %   –         46.68 %   44.76 %
       e    12.47 %   –         –         –
       w    49.15 %   –         53.33 %   55.24 %
etf    b    39.89 %   53.33 %   –         26.93 %
       e    –         –         –         39.53 %
       w    60.12 %   46.68 %   –         33.54 %
dls    b    35.22 %   55.24 %   33.54 %   –
       e    –         –         39.53 %   –
       w    64.79 %   44.76 %   26.93 %   –
table 7: pair-wise comparison of the scheduling algorithms, complexity based

4.4 ranking of examined algorithms
based on the above comparison metrics and the average results for the 90 k randomly generated dags, the ranking of the examined algorithms (from best to worst) was as follows:
- average makespan: dls, etf, hlfet, mcp
- average speedup: dls, etf, hlfet, mcp
- average slr: etf, dls, mcp, hlfet
- best results: dls, etf, mcp, hlfet
- complexity: hlfet & mcp, dls & etf

5 conclusion
in this paper we present a brief description of the characteristics of the two best-known static and the two best-known dynamic list-scheduling algorithms. the performance of these algorithms was examined using randomly generated graph variants, and six comparison metrics were used to measure it. in general, the dynamic list-scheduling algorithms performed better than the static list-scheduling algorithms. among the static list-scheduling algorithms, hlfet performed better than mcp, and among the dynamic list-scheduling algorithms, the dls algorithm performed better than the etf algorithm. if the complexities of the algorithms are taken into account, it is highly recommended to use static list-scheduling.

references
[1] feitelson d., rudolph l., schwiegelshohm u., sevcik k., wong p.: theory and practice in parallel job scheduling. jsspp, 1997, p. 1–34.
[2] kwok y., ahmed i.: benchmarking the task graph scheduling algorithms. proc. ipps/spdp, 1998.
[3] liou j., palis m.: a comparison of general approaches to multiprocessor scheduling. proc. int'l parallel processing symp., 1997, p. 152–156.
[4] khan a., mccreary c., jones m.: a comparison of multiprocessor scheduling heuristics. icpp, 1994, vol. 2, p. 243–250.
[5] gerasoulis a., yang t.: a comparison of clustering heuristics for scheduling dags on multiprocessors. journal of parallel and distributed computing, 1992, vol. 16, p. 276–291.
[6] gerasoulis a., yang t.: on the granularity and clustering of directed acyclic task graphs. ieee trans. parallel and distributed systems, 1993, vol. 4, no. 6, p. 686–701.
[7] zhou h.: scheduling dags on a bounded number of processors. int'l conf. parallel and distributed processing techniques and applications, 1996.
[8] min-you w., gajski d.: hypertool: a programming aid for message-passing systems. ieee trans. parallel and distributed systems, 1990, vol. 1, no. 3.
[9] hwang j., chow y., anger e., lee c.: scheduling precedence graphs in systems with interprocessor communication times. siam journal on computing, 1989, vol. 18, no. 2, p. 244–257.
[10] sih g., lee e.: a compile-time scheduling heuristic for interconnection-constrained heterogeneous processor architectures. ieee trans. parallel and distributed systems, 1993, vol. 4, no. 2, p. 75–87.

ing. tarek hagras
phone: +420 224 357 267
e-mail: tarek@felk.cvut.cz
doc. ing. jan janeček, csc.
phone: +420 224 357 267
e-mail: janecek@fel.cvut.cz
dept. of computer science and engineering, czech technical university in prague, faculty of electrical engineering, karlovo nám. 13, 121 35 prague 2, czech republic

dynamic bridge response for a bridge-friendly truck
v. šmilauer, j. máca, m. valášek
a truck with controlled semi-active suspensions traversing a bridge is examined for benefits to the bridge structure. the original concept of a road-friendly truck was extended to a bridge-friendly vehicle, using the same optimization tools. a half-car model with two independently driven axles is coupled with simply supported bridges (beam, slab model) with a span range from 5 m to 50 m. the surface profile of the bridge deck is either stochastic or in the shape of a bump or a pot in the mid-span. numerical integration in the matlab/simulink environment solves the coupled dynamic equations of motion with optimized truck suspensions. the rear axle generates the prevailing load and to a great extent determines the bridge response. a significant decrease in contact road-tire forces is observed, and the mid-span bridge deflections are on average smaller when compared to commercial passive suspensions.
keywords: half-car model, semi-active suspension, bridge-friendly truck, bridge response, dynamics.

1 introduction
one of the most important factors determining road design is the intensity of heavy trucks, which is often higher than originally assumed by the designers. according to the traffic prognoses for europe and the usa, there is an annual increase in such vehicles, and new trucks with multiple axles or trailers appear. this brings additional costs for road repair and maintenance. unevenness of the road surface generates, aside from the static force, a dynamic contact force, which can possibly be controlled and reduced. while direct reduction of bridge deflections, e.g. by tuned mass dampers or intelligent stiffeners [1], was found to be complicated, new trends in suspension development follow the concept of semi-active suspensions, placed directly on the vehicle axles. the driving force that controls the damping properties of the suspension is negligible when compared to an active damper force, so semi-active suspensions offer a good compromise between performance and price. a mechatronic solution combined with a controlled damper provides a tool for its optimization and for reducing the dynamic contact force. the concept of road-friendliness has been extended to bridge-friendliness by optimizing the damper parameters on bridges [2, 3, 4].
the aim of this work is to explore the benefits of bridge-friendly trucks on ordinary, simply supported bridges. the results from previous studies with a quarter-car model were so promising that a more accurate model with a half-car and a slab bridge was assembled [3]. a similar model was produced for a simply supported bridge with a specific road profile, traversed by a quarter-car model with a passive, sky-hook and ground-hook control configuration. the effect of a semi-controlled damper on the bridge response is significant when the natural frequencies of the vehicle and the bridge are close [5].

2 the half-car model
the proposed half-car model is based on the parameters of the commercially available liaz truck, simplified to four dof [6]. fig. 1 displays the configuration of the truck together with a bridge. the front axle of the model comprises two axles of the real car, and the rear axle comprises the remaining four axles.
fig. 1: nonlinear half-car model traversing the bridge
the model parameters were set to: $m_1 = 15$ t, $m_2 = 0.75$ t, $m_3 = 1.5$ t, $a = 4$ m, $b = 1.3$ m, $k_{12} = 430$ kn/m, $k_{13} = 650$ kn/m, $k_{20} = 1700$ kn/m, $k_{30} = 4900$ kn/m, $I = 50$ t·m², and the damping factors of the tires were set to zero. the damper forces $F_{d12}$ and $F_{d13}$ result from the movement of the damper attachment, and they depend on an operative algorithm, as will be explained further.
all springs in the car model are considered to be linear, without hysteresis. the road profile superimposed on the bridge deck deflection determines the position of the tire contact area. the equations of motion describing the car dynamic behavior and the damper forces are as follows:
$$m_1\ddot z_1 = -k_{12}(z_1 - z_2 - a\varphi) - F_{d12} - k_{13}(z_1 - z_3 + b\varphi) - F_{d13}, \qquad (1)$$
$$I\ddot\varphi = a\,k_{12}(z_1 - z_2 - a\varphi) + a\,F_{d12} - b\,k_{13}(z_1 - z_3 + b\varphi) - b\,F_{d13}, \qquad (2)$$
$$m_2\ddot z_2 = k_{12}(z_1 - z_2 - a\varphi) + F_{d12} - k_{20}\big(z_2 - z_{02}(t)\big) - b_{20}\big(\dot z_2 - \dot z_{02}(t)\big), \qquad (3)$$
$$m_3\ddot z_3 = k_{13}(z_1 - z_3 + b\varphi) + F_{d13} - k_{30}\big(z_3 - z_{03}(t)\big) - b_{30}\big(\dot z_3 - \dot z_{03}(t)\big), \qquad (4)$$
$$F_{d12} = b_{f1}(\dot z_2 - \dot z_{02}) + b_{f2}(\dot z_1 - a\dot\varphi) + b_{f12}(\dot z_1 - a\dot\varphi - \dot z_2) + \Delta k_{f10}(z_2 - z_{02}) + \Delta k_{f12}(z_1 - a\varphi - z_2), \qquad (5)$$
$$F_{d13} = b_{r1}(\dot z_3 - \dot z_{03}) + b_{r2}(\dot z_1 + b\dot\varphi) + b_{r12}(\dot z_1 + b\dot\varphi - \dot z_3) + \Delta k_{r10}(z_3 - z_{03}) + \Delta k_{r12}(z_1 + b\varphi - z_3), \qquad (6)$$
$$z_{02}(t) = z_4 + z_r, \qquad (7)$$
$$z_{03}(t) = z_5 + z_r, \qquad (8)$$
where $z_4$, $z_5$ are the bridge displacements under the axles and $z_r$ is the road irregularity. this extended ground-hook model contains three damping rates: $b_{f(r)1}$ corresponds to the damping factor of the ground-hook, $b_{f(r)2}$ to that of the sky-hook, and $b_{f(r)12}$ to the passive damper. the semi-active damper forces $F_{d12}$ and $F_{d13}$ are changed by setting the damping rates $b_{f(r)1}$, $b_{f(r)2}$ and $b_{f(r)12}$ in such a manner that the damping force approaches the desired value. the typical 17 ms delay of the damper response is also considered. four values can be assigned to each damping factor, depending on the velocity and direction of the damper attachment [6]. the numerical experiments showed only a small dependence on the fictitious changes in stiffness $\Delta k_{f(r)10}$ and $\Delta k_{f(r)12}$; therefore only the damping factors are employed in the optimization process. during optimization, $4 \times 3 = 12$ free damping parameters are involved for each axle. since the half-car model holds two independently controlled dampers, 24 free parameters in total are optimized, using genetic algorithms [4]. the multi-objective parameter optimization (mopo) method within the matlab/simulink environment was found appropriate for such a large task [4]. the first part of the objective function for optimization on both axles took the form of the square root of the time integral of the squared dynamic contact force:
$$F_{sum,rms} = \sqrt{\int_0^{t} F_{dyn}^{2}\,\mathrm{d}t}. \qquad (9)$$
the second performance criterion considered driver comfort in the form of the truck sprung mass acceleration:
$$acc_{sum} = \int_0^{t} (\ddot z_1 + a\,\ddot\varphi)^2\,\mathrm{d}t. \qquad (10)$$
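a minimal sketch of how equations (1)–(8) can be integrated numerically (scipy rather than simulink, so this is an illustration, not the authors' setup). the damping rates below are placeholders rather than the optimized values, the Δk corrections of (5)–(6) are dropped since the text reports their influence to be small, and the tire damping b20, b30 is zero as stated above:

```python
import numpy as np
from scipy.integrate import solve_ivp

m1, m2, m3, I = 15e3, 0.75e3, 1.5e3, 50e3           # kg, kg, kg, kg.m^2
a, b = 4.0, 1.3                                     # m
k12, k13, k20, k30 = 430e3, 650e3, 1700e3, 4900e3   # n/m
bf1, bf2, bf12 = 5e3, 10e3, 20e3                    # placeholder rates, n.s/m
br1, br2, br12 = 5e3, 10e3, 20e3

def rhs(t, y, z02, dz02, z03, dz03):
    """State y = [z1, phi, z2, z3, dz1, dphi, dz2, dz3]."""
    z1, phi, z2, z3, v1, om, v2, v3 = y
    fd12 = bf1*(v2 - dz02(t)) + bf2*(v1 - a*om) + bf12*(v1 - a*om - v2)  # (5)
    fd13 = br1*(v3 - dz03(t)) + br2*(v1 + b*om) + br12*(v1 + b*om - v3)  # (6)
    s12 = k12*(z1 - z2 - a*phi)                     # front suspension spring
    s13 = k13*(z1 - z3 + b*phi)                     # rear suspension spring
    dv1 = (-s12 - fd12 - s13 - fd13) / m1           # (1)
    dom = (a*(s12 + fd12) - b*(s13 + fd13)) / I     # (2)
    dv2 = (s12 + fd12 - k20*(z2 - z02(t))) / m2     # (3)
    dv3 = (s13 + fd13 - k30*(z3 - z03(t))) / m3     # (4)
    return [v1, om, v2, v3, dv1, dom, dv2, dv3]

flat = lambda t: 0.0                                # rigid, smooth road (7)-(8)
sol = solve_ivp(rhs, (0.0, 5.0), np.zeros(8),
                args=(flat, flat, flat, flat), max_step=1e-3)
```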
(9) the second performance criterion considered driver comfort in the form of truck sprung mass acceleration: acc z a tsum t � ��(�� �� )1 2 0 � d . (10) the truck moved at a velocity of 50 km/h during all simulations, and passive or bridge-friendly damper control was adopted for the purposes of comparison. the unevenness amplitudes of the road profile were set to 20 mm for a bump or a pot in the mid-span, or for a stochastic road, fig. 2. 3 the bridge model the bridge is modeled as a simply supported euler-bernoulli beam or as a slab bridge for shorter spans, in order to include the bridge torsional effect. the bridge span varies from 5 to 50 m, covering the majority of real bridges made from concrete or steel with such a statical system [7]: � reinforced concrete bridges – span of 5 to 12 m � prestressed concrete bridges – span of 12 to 30 m � composite steel-concrete bridges – span of 15 to 50 m © czech technical university publishing house http://ctn.cvut.cz/ap/ 153 acta polytechnica vol. 44 no.5–6/2004 fig. 2: bump and pot used for damper response on the bridges fig. 3: bending stiffness and the first eigenfrequency of bridges (5–50) m each bridge consists of two road lanes and two sidewalks, and the preliminary design calculations provide bridge parameters for the simulation [7, 8]. the bending stiffness and the first eigenfrequency for all considered bridges are given in fig. 3. the first truck eigenvalue is 9.3 hz. it is assumed that the vehicle passes over the beam bridge on the symmetry axis and on the slab bridge as close as possible to the sidewalks, fig. 5. fem with equally spaced nodes in combination with the bridge parameters provides the equation of motion in the form: m b k�� �r r� � �r f, (11) where m, b, k, r, f are the mass, damping, stiffness matrices, the displacement vector and the vector of contact axle forces. rayleigh damping with a logarithmic decrement of 0.05 was used for all bridges for the damping matrix assemblage. two contact forces from the truck axles are linearly distributed between adjacent nodes to vector f: f k z zdyn1 20 2 02� �( ) , (12) f k z zdyn2 30 3 03� �( ) . (13) the connection links between the bridge and the car model are the bridge deflections below the axle and the contact force between the tire and the road, fig. 4. the equations of motion (1)–(8) and (11) are simultaneously solved in the matlab/simulink environment with an implicit trapezoidal integration scheme, using a variable time step. fig. 6 displays the simulink scheme of the bridge model as an example. the lowest critical speed of the vehicle is over 220 km/h and the first natural frequency is higher than all first bridge frequencies considered. no frequency-matching phenomenon is observed during the simulations. 4 dynamic contact force dynamic contact force is a variable part of the total contact force between the tire and the road, with the positive direction upward. the slab bridge model, fig. 5, as a short bridge with the same parameters as the beam model was proposed and verified. in all such cases, the bump is placed in the mid-span of a beam or slab bridge. fig. 7 shows similar behavior of both bridges with a different damper control strategy. even when compared to a road on a solid base, only the damper control mode plays a significant role in reducing the dynamic contact force. 154 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 44 no.5–6/2004 fig. 4: interconnection of truck and bridge model fig. 
fig. 5: slab bridge model with 10 m span and the traversing axle path
fig. 6: the simulink scheme of the bridge model
fig. 7: comparison of semi-active and passive damper performance on the bridge type across a bump; the bridge span is 10 m
fig. 8: contact dynamic force of the rear axle across a bump on bridges (5–50) m
fig. 9: contact dynamic force of the rear axle across a pot on bridges (5–50) m
for the half-car model, the results for the rear axle are shown in figs. 8 and 9 for the bump and the pot. the change of the dynamic contact force is evident shortly after passing the unevenness, reducing the force value in the next peak. again, the bridge parameters have a minor effect on the truck response. it is evident that the damper is not as effective for the pot as for the bump; the reason lies in the low damping values when the damper is under contraction. nevertheless, with the mopo control strategy the phase of the dynamic contact force is shifted and its value is slightly reduced as well [8]. the front axle carries about a third of the total static truck load, and the effect of the damper is lower there than for the rear axle.

5 bridge deflections
bridge deflections express the effect of the truck on the bridge structure itself. the difference between the beam and slab bridge responses reveals the qualitative behavior of the two models. the torsional effect is taken into account for the slab bridge, hence higher deflection values are expected. fig. 10 shows similar behavior of both bridge systems; an approx. 15 % difference in bridge deflection is observed. the beam bridge model is found to be sufficient even for a bridge with a nearly square deck.
short and long bridges are compared as an example of bridge excitation, using the same truck. the majority of the car weight is located at the rear axle, and the readings in the graphs are therefore from this axle. short bridges are mainly influenced by the shape of the unevenness, fig. 11. there are two reasons for their excitation: the truck load prevails on short bridges because of the available space and the bridge stiffness, and also because their mass is low. no significant force impulse appears for a stochastic road, and the deflections are then close to the static values. the overall response of the 5 m – 50 m spans is illustrated in fig. 12, depending on the damper control strategy. on average, semi-active dampers reduce maximum deflections on a stochastic road by 2.5 %, and on a bump unevenness by 3.6 %.
fig. 10: deflections of the beam and slab bridge under one axle, 10 m span
fig. 11: mid-span bridge deflections for spans of 5 m – 50 m and a comparison with static displacements

6 conclusions
the paper describes the effect of a bridge-friendly truck on road and bridge structures, and shows that the concept of road-friendliness can be extended to bridges. a semi-active optimized damper, when compared to a passive damper, reduces the local contact force in all cases and shifts the contact force peaks after passing the unevenness. bridge-friendly truck dampers are beneficial for decreasing road damage, mainly at the rear axle, which bears the prevailing truck load. the dynamic contact force is influenced mainly by the shape of the unevenness and the control strategy of the damper.
a beam bridge of a 10 m span captures the qualitative behavior of the truck well when compared to the more sophisticated slab bridge. the average reduction of the maximum deflections on bridge spans of 5 m – 50 m using semi-active suspension, in comparison with passive dampers, is 2.5 % for a stochastic road and 3.6 % for a bump.

fig. 12: maximum bridge deflection on a stochastic road on (5–50) m spans for various bridges and two control strategies

7 acknowledgments

the authors gratefully acknowledge the support of this work by the ministry of education, youth and sports as a part of grant msm 210000003.

references
[1] kwon h. c., kim m. c., lee i. w.: "vibration control of bridges under moving loads." computers & structures, vol. 66 (1998), p. 473–480.
[2] valášek m., kejval j.: "new direct synthesis of nonlinear optimal control of semi-active suspensions." proc. of avec 2000, ann arbor, 2000, p. 691–697.
[3] máca j., valášek m.: "dynamic interaction of trucks and bridges." advanced engineering design, 3rd international conference, prague, 2003.
[4] valášek m., kejval j., máca j.: "control of truck-suspension as bridge-friendly." in: structural dynamics eurodyn 2002 (editors: h. grundmann and g. i. schueller), munich, balkema, 2002, p. 1015–1020.
[5] chen y. et al.: "smart suspension systems for bridge-friendly vehicles." paper 4696-06, proc. of 9th spie annual international symposium on smart structures and materials, san diego, 2002.
[6] valášek m., kejval j.: "bridge-friendly truck suspension." in: proc. of mechatronics, robotics and biomechanics, vut brno, trest, 2001, p. 277–284.
[7] šmilauer v., máca j.: "dynamic interaction of bridge and truck with semi-active suspension." engineering mechanics 2003, svratka, czech republic, 2003.
[8] šmilauer v., máca j., valášek m.: "dynamic bridge response for a truck with controlled suspensions." engineering mechanics 2004, svratka, czech republic, 2004.

ing. vít šmilauer
e-mail: vit.smilauer@fsv.cvut.cz
doc. ing. jiří máca, csc.
phone: +420 224 354 500
e-mail: maca@fsv.cvut.cz
department of structural mechanics, czech technical university in prague, faculty of civil engineering, thákurova 7, 166 29 prague 6, czech republic
prof. ing. michael valášek, drsc.
phone: +420 224 357 361, fax: +420 224 916 709
e-mail: valasek@fsik.cvut.cz
department of mechanics, czech technical university in prague, faculty of mechanical engineering, karlovo nám. 13, 121 35 prague 2, czech republic

controlling laminate plate elastic behavior
t. mareš

abstract: this paper aims to express the relation of a measure of laminate plate stiffness to the fiber orientation of its plies. the inverse of the scalar product of the lateral displacement of the central plane and the lateral loading of the plate is taken as the measure of laminate plate stiffness. in the case of a simply supported rectangular laminate plate this measure of stiffness is maximized, and the optimum orientation of its plies is searched.

keywords: mechanical properties, laminate, plate, composites.

1 introduction

problems of structural optimization and of the analysis of stress and strain often lead to the problem of minimizing a given functional subject to a system of differential equations and inequalities with boundary conditions, and to a system of integral equations and inequalities. this class of structural optimization problems is called function structural optimization, and the branch of mathematics that deals with it is variational calculus. the difficulty of solving such problems has led to the idea of transforming the given function optimization problem, by means of the galerkin method (based on the finite element method), into a problem of parameter optimization. this paper deals with both function optimization and parameter optimization of laminate plate stiffness.
2 preliminaries

we contemplate the problem of a rectangular laminate plate that is simply supported on its boundary and has given
- dimensions,
- number of plies,
- thickness of plies,
- mechanical properties of the orthotropic plies,
- lateral loading,
with the aim of specifying an orientation of its plies that maximizes the measure of stiffness

$s(w) = \dfrac{1}{l(w)}$,

where

$l(w) = \int_\Omega w(x,y)\, q(x,y)\, \mathrm{d}x\, \mathrm{d}y$

is the measure of compliance, $w(x,y)$ is the deflection function describing the deformed middle plane of the plate, $q(x,y)$ is the lateral loading, and $\Omega$ is the projection of the plate into the middle plane $x$-$y$. we must search the actual deformed state $w(\varphi)$ corresponding to the common orientation of the plies $\varphi$. this solution is used in the measure of compliance expressed above, and the minimum of this expression is searched; i.e., we must solve the problem

$\hat\varphi = \arg\min_\varphi \, l\big(w(\varphi)\big)$.

3 common rules

it is well known that the actual deformed state minimizes the potential energy

$\Pi(u) = a(u,u) - l(u)$,

where $a(u,u)$ is the elastic potential energy

$a(u,u) = \tfrac{1}{2} \int_\Omega E_{ijkl}\, \varepsilon_{ij}(u)\, \varepsilon_{kl}(u)\, \mathrm{d}\Omega$

and $l(u)$ is the potential energy of the external loads

$l(u) = \int_\Omega p_i u_i\, \mathrm{d}\Omega + \int_{\Gamma_t} t_i u_i\, \mathrm{d}\Gamma$.

it is also well known that in the actual state $\hat u$ the potential energy takes the value

$\Pi(\hat u) = -\tfrac{1}{2}\, l(\hat u) \le 0$.

hence the problem of maximizing the stiffness measure is transformed into the problem of searching a min-max point of the potential energy:

$\hat e = \arg\max_e \min_u \Pi_e(u)$, with $\Pi_e(u) = a_e(u,u) - l(u)$.
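a minimal numerical sketch of the measures $l(w)$ and $s(w)$, assuming the deflection $w$ and the loading $q$ are sampled on a regular grid over $\Omega$; the deflection field in the example is a placeholder, not a laminate solution:

```python
import numpy as np

def compliance(w, q, dx, dy):
    """l(w): integral of w*q over the mid-plane projection, approximated
    by two-dimensional trapezoidal quadrature on a regular grid."""
    return np.trapz(np.trapz(w * q, dx=dy, axis=1), dx=dx)

def stiffness(w, q, dx, dy):
    """s(w) = 1 / l(w), the stiffness measure to be maximized."""
    return 1.0 / compliance(w, q, dx, dy)

# example: unit square plate with the loading q = x*y used in section 6
x = np.linspace(0.0, 1.0, 101)
y = np.linspace(0.0, 1.0, 101)
xx, yy = np.meshgrid(x, y, indexing="ij")
q = xx * yy
w = 1e-3 * np.sin(np.pi * xx) * np.sin(np.pi * yy)  # placeholder deflection
print(stiffness(w, q, x[1] - x[0], y[1] - y[0]))
```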
4 formulation of the problem

in the case introduced above, the problem reads

$\hat w, \hat\varphi = \arg\max_\varphi \min_w \Pi(w, \varphi)$,

$\Pi(w,\varphi) = \tfrac{1}{2} \sum_{l=1}^{n} \int_\Omega \lambda^l_{im} \lambda^l_{jn} E^l_{mnop} \lambda^l_{ko} \lambda^l_{lp}\; w_{,ij}\, w_{,kl}\; \mathrm{d}\Omega - \int_\Omega q(x,y)\, w\, \mathrm{d}x\, \mathrm{d}y$,

where $E^l_{mnop}$ is the elasticity tensor of the $l$-th orthotropic ply, $w_{,ij}$ denotes the second derivatives (curvatures) of the deflection, and

$\lambda^l_{ik} = \begin{pmatrix} \cos\varphi_l & \sin\varphi_l \\ -\sin\varphi_l & \cos\varphi_l \end{pmatrix}$

is the tensor of the transformation from the local coordinate system of the $l$-th ply into the global coordinate system. the way of constructing this relation is not presented here.

5 the way of resolving

in this work the method of alternating fulfilment of the necessary conditions of optimality was used. for a fixed ply orientation, the stationarity of $\Pi$ with respect to the deflection yields a linear system for the coefficients $w_k$ of the deflection expansion; for a fixed deflection, the stationarity with respect to each ply angle $\varphi_l$ reduces to trigonometric polynomial equations in $\cos\varphi_l$, $\sin\varphi_l$ and $\tan 2\varphi_l$, whose coefficients $p_{\dots}$ and $r_{\dots}$ are known numbers computed from the current deflection. the two steps are alternated until the searched ply orientations $\varphi_l$ converge.

6 results

first example:
- a square plate,
- six plies laid symmetrically with respect to the middle plane of the plate,
- continuous loading $q = xy$ [n/mm²].

a graphic representation of the stiffness maximizing ply orientation is given in fig. 1. the three solution variants shown there have the same value of our measure of stiffness, and this value is the maximum of all values. it is interesting that the first variant is balanced and therefore not twisted. the same holds for the three solution variants of the following example.

fig. 1: first example – the three stiffness-maximizing variants; all ply angles are 45° ($\varphi_1 = \varphi_2 = \varphi_3 = \pm 45°$)

second example:
- a rectangular plate with side ratio 1:2,
- six plies laid symmetrically with respect to the middle plane of the plate,
- continuous loading $q = xy$ [n/mm²].

a graphic representation of the stiffness maximizing ply orientation is given in fig. 2.

fig. 2: second example – the three stiffness-maximizing variants; all ply angles are 75° ($\varphi_1 = \varphi_2 = \varphi_3 = \pm 75°$)
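the alternating scheme of section 5 can be sketched as follows. here `solve_deflection(angles)` stands for any plate solver returning the deflection field for given ply angles, `compliance(w)` evaluates $l(w)$, and the paper's closed-form stationarity conditions are replaced by a bounded one-dimensional search per ply; all names and the search strategy are illustrative assumptions, not the authors' implementation:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def optimize_ply_angles(solve_deflection, compliance, n_plies,
                        angles0=None, n_sweeps=20, tol=1e-6):
    """alternate between (a) solving the plate for fixed ply angles and
    (b) improving one ply angle at a time so that the compliance drops;
    maximizing the stiffness s = 1/l means minimizing the compliance l."""
    angles = np.zeros(n_plies) if angles0 is None else np.asarray(angles0, float)
    last = np.inf
    c = np.inf
    for _ in range(n_sweeps):
        for l in range(n_plies):
            def objective(phi, l=l):
                trial = angles.copy()
                trial[l] = phi
                return compliance(solve_deflection(trial))
            res = minimize_scalar(objective, bounds=(0.0, np.pi),
                                  method="bounded")
            angles[l] = res.x
        c = compliance(solve_deflection(angles))
        if abs(last - c) < tol * max(abs(c), 1.0):  # converged sweep
            break
        last = c
    return angles, 1.0 / c                          # angles and stiffness
```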
7 conclusions

we contemplated the problem of a rectangular laminate plate with fixed given dimensions, number of plies, thickness of plies, mechanical properties of the orthotropic plies and lateral loading, simply supported on its boundary, with the aim of specifying the orientation of its plies that maximizes the measure of stiffness, i.e. the inverse of the measure of compliance defined above. the ways of constructing and resolving the governing relations were outlined in sections 4 and 5; here we summarize only the results.

first example: a square plate with six plies laid symmetrically with respect to the middle plane of the plate, continuously loaded with a value increasing with the coordinates. there are three solution variants that yield the same value of our measure of stiffness, and this value is the maximum of all values. the optimal ply orientation is 45° for all plies; the only difference between the variants is the sign of the orientation (plus, minus). it is interesting that the first variant is balanced and therefore not twisted.

second example: a rectangular plate with side ratio 1:2, with six plies laid symmetrically with respect to the middle plane of the plate, also continuously loaded with a value increasing with the coordinates. the same remark as for the previous example holds here; the only contrast is the optimal orientation of the plies, which in this example is 75°.

8 acknowledgment

this work has been supported by the grant agency of the czech republic under contract no. 106/01/0958.

ing. tomáš mareš
phone: +420 224 352 525
e-mail: marest@sgi.fsid.cvut.cz
department of mechanics, division of strength of materials, czech technical university in prague, faculty of mechanical engineering, technická 4, 166 07 prague 6, czech republic

a comparison of acoustic field measurement by a microphone and by an optical interferometric probe
r. bálek, z. šlegrová

abstract: the objective of this work is to show that our optical method for measuring acoustic pressure is in some ways superior to measurement using a microphone. measurement of the integral acoustic pressure in the air by a laser interferometric probe is compared with measurement using a microphone. we determined the particular harmonic components of an acoustic field of relatively high acoustic power in the ultrasonic frequency range.

keywords: acousto-optics, ultrasonic light diffraction, heterodyne laser interferometer.

1 introduction

conventional methods for measuring acoustic pressure using a microphone are not quite suitable for some applications. these conventional methods have three drawbacks: they are restricted to about a 100 khz frequency band, the body of the microphone influences the acoustic field, and contact measurement may sometimes not be possible. these problems are found, e.g., in acoustic measurements in small-diameter pipes. the acoustic field was generated by a power ultrasonic generator with a steel buffer at a frequency of 20 khz, with second and third harmonics of the base signal. other harmonic components in the acoustic field arose as a result of the relatively high energy of the propagating waves. we solved this problem in optically transparent pipes by using an optical interferometric sensor – a laser probe. first, we optically measured the acoustic pressure from a piston source in open space, before measuring in the glass waveguide. the optically obtained results were compared with the results obtained by the microphone and with piston radiator theory. we intended to prove that the data from the two measurements of the higher harmonic components of the acoustic field are in good agreement.

2 theory

2.1 acousto-optic interaction

the process of acousto-optic interaction is described in [1]. when a light ray goes through a volume with variable pressure, the refractive index and the velocity of light propagation vary according to the density changes. we can describe this process as phase modulation of the light by the acoustic waves.
this holds within the raman-nath diffraction regime, which is delimited by the klein-cook parameter $Q$ and the raman-nath parameter $\nu$ [1] through the bounds

$Q \le 1$ (1)

and

$Q\,\nu \le 1$. (2)

$Q$ and $\nu$ are defined as follows:

$Q = \dfrac{2\pi \lambda L}{n_0 \Lambda^2}$, (3)

$\nu = \dfrac{2\pi}{\lambda}\, \Delta n\, L$, (4)

where $\lambda$ [m] is the laser light wavelength in a vacuum, $\Lambda$ [m] is the acoustic wavelength in the interaction medium, $n_0$ [-] is the refractive index of the medium without ultrasound, $\Delta n$ [-] is the amplitude of the refractive index changes, and $L$ [m] is the interaction length. the integral phase change of the laser (electromagnetic) wave at a distance $z$ from the acoustic source over the interaction length (in the direction of the $x$-axis) is

$\varphi_z(t) = k\, \bar\alpha_0 \int_0^L p_z(x,t)\, \mathrm{d}x$, (5)

where $k$ [m⁻¹] is the laser wave number in a vacuum, $\bar\alpha_0$ [pa⁻¹] is the adiabatic piezo-optic coefficient, calculated from the refractive index of the medium, and $p_z(x,t)$ [pa] is the acoustic pressure at a distance $z$ from the acoustic source at a point on the $x$-axis. the system of coordinates is defined in fig. 1.

let us assume for simplicity only two harmonic components in the acoustic signal, corresponding to two plane acoustic waves; the same steps can be followed in the case of a more complex signal. the time dependence of the acoustic pressure can then be written as

$p_z(x,t) = p_{01z} \sin(\omega_{a1} t + \varphi_{a1}) + p_{02z} \sin(\omega_{a2} t + \varphi_{a2})$. (6)

fig. 1: system of coordinates (laser probe and acoustic source in the x-y-z frame)
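a minimal sketch checking the raman-nath conditions (1)-(2) from the definitions (3)-(4); the air and he-ne laser constants correspond to table 2 later in the paper, while the modulation depth $\Delta n$ and the interaction length are illustrative assumptions:

```python
import numpy as np

def klein_cook_q(lam, big_lambda, n0, L):
    """q = 2*pi*lam*L / (n0 * big_lambda**2), eq. (3)."""
    return 2.0 * np.pi * lam * L / (n0 * big_lambda**2)

def raman_nath_nu(lam, dn, L):
    """nu = (2*pi/lam) * dn * L, eq. (4)."""
    return 2.0 * np.pi * dn * L / lam

lam = 632.8e-9             # he-ne laser wavelength [m]
n0 = 1.000276              # refractive index of air
c_a = 343.6                # speed of sound in air [m/s]
big_lambda = c_a / 20.5e3  # acoustic wavelength at 20.5 khz [m]
dn, L = 1e-9, 0.03         # illustrative modulation depth and path [-, m]

q = klein_cook_q(lam, big_lambda, n0, L)
nu = raman_nath_nu(lam, dn, L)
print(q, nu, q <= 1.0 and q * nu <= 1.0)  # raman-nath regime check
```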
the advantages of this method are the high signal to noise ratio (80 db) and the possibility of absolute determination of the signal level [2]. the output signal of the interferometer carries information about the phase changes of the detected laser beam. this information is transformed from the optical band to a much more acceptable lower frequency range. the detailed characteristics of the output signal created by the simple harmonic acoustic wave are introduced in [3]; this paper also describes methods for determining the amplitude of the pressure from the frequency spectrum of this signal. more complex acoustic fields produce more complex interferometer output signals. we cannot compute the amplitude of the pressure from the simple ratio of individual signal amplitudes corresponding to certain frequencies. it is necessary to use the whole signal frequency spectrum. we solved this problem in a way analogous to the processing of frequency modulated radio signals. after the phase demodulation the obtained amplitudes are directly proportional to the averaged values of the acoustic pressure of the individual harmonic components. 3 experiments 3.1 description of the setup 3.1.1 ultrasonic generator ultrasonic waves are generated at base frequency 20.5 khz by a couple of piezo-electric transducers connected at one end with a steel buffer, which radiates as a piston generator into the ambient air. the acoustic signal is a nearly sinusoidal wave with small second and third harmonics of the base signal. two buffers were used. the first, labeled as n, has a constant circular cross-section. the cross section of the other buffer decreases toward the radiated end. the characteristics of both are shown in table 1. the near-field length was calculated from the formula for a circular piston radiator [4]. the output power of the generator can be changed from 20 % to 100 %. 14 acta polytechnica vol. 42 no. 4/2002 type of buffer radius of radiated aperture r [mm] base frequency f1 [hz] near field distance 0 [mm] n 15 20.5�103 9.00 k 8 20.5�103 0.37 table 1: parameters of the buffers pd pbs s r b o1 l r s gb ug ma sa m sadfm la o3 fig. 2: schematic diagram: b – bragg cell, dfm – frequency demodulator, gb – bragg cell signal generator, l – lens, la – he-ne laser, m – microphone, ma – amplifier, o1,o2,o3 – mirrors, pbs – polarizing beam splitter, pd – photodiode, sa – spectral analyzer, ug – ultrasonic generator with piston buffer 3.1.2 measurement by microphone the level of the generated signal was measured by a 1/8" g.r.a.s. type 40dp microphone with a 26ac preamplifier. the bandwidth of this microphone within a 0.5 db drop is 100 khz. the acoustic field was scanned at different distances from the aperture of the radiator by means of a computer controlled scanning 3-axis bridge. the scanning area is (90×90) mm in the x-y plane with a minimum possible step of 50 m. the signal from the microphone was measured with a hewlett packard 8560e spectrum analyzer and recorded by a computer. position data was also recorded. 3.1.3 optical setup and signal processing a schematic diagram of the optical measurement system is shown in fig. 2. a detailed description of this apparatus is given in [3]. the hewlett packard 8593e spectral analyzer, which has a frequency demodulator (dfm) with variable bandwidth, demodulates the signal from the photodiode pd. the demodulated signal is recorded by the other spectral analyzer sa (hewlett packard 8560e). 
3.2 raman-nath regime assessment the constants from table 2 were used for computing [5]. as was stated above, proof of the validity of using acousto-optic interaction requires that we satisfy all conditions of the klein-cook and raman-nath parameters (1), (2). the amplitude of the integral pressure at distance � 0 on the z-axis and the amplitudes obtained from piston radiator theory are shown in table 3. the limit value of the raman-nath parameter � � 1 was used. the klein-cook parameter q is also presented. the interaction length l, defined as the width of the acoustic field, perpendicular to the z-axis, had to be doubled (the laser light goes through twice). 4 results, discussion 4.1 a comparison of measurement by microphone with results from piston radiator theory fig. 3 shows examples of acoustic field distribution, with effective values of acoustic pressure, in the x-y plane measured by a microphone. fig. 4 compares the measured amplitudes © czech technical university publishing house http://ctn.cvut.cz/ap/ 15 acta polytechnica vol. 42 no. 4/2002 amplitudes units refraction index of air 1.000276 – velocity of ultrasound 343.6 m�s 1 piezo-optic coefficient 2�10 9 pa 1 laser wavelength 632.8 nm table 2: constants used for computing buffer pil0 [pa�m] pl0max [pa] q [–] n 25 1475 8.43�10 4 k 25 2268 4.5�10 4 table 3: raman-nath regime fig. 3: microphone measured acoustic field distribution in the x-y plane at distance z � 6.5 mm from the buffer k with 60 % power from the generator: a) on base frequency f1 � 20.5 khz b) on second harmonic frequency f2 � 41 khz c) on third harmonic frequency f3 � 61.5 khz on the x-axis with piston radiator theory. the theory is derived in [4] and the individual amplitudes are related to the maximum measured value on the z-axis. the theory is derived for an ideal piston radiator placed in the infinity plane. a real buffer radiates to free space. this explains the difference between the two curves with increasing x. 4.2 optical method measurement, comparison of results we performed many measurements to check the laser probe measurements with those using the microphone. microphone measurements had to be performed over the entire 16 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 42 no. 4/2002 buffer z [mm] power [%] f [khz] pi [pa�m] over center of buffer, y � 0 pi [pa�m] over edge of buffer, y � r laser microphone theory laser microphone theory n 9.0 20 20.5 1.575 1.77 1.71 0.59 0.60 0.68 41.0 0.0044 0.043 0.0017 0.0017 k 6.5 60 20.5 15.22 15.55 15.83 10.70 8.00 9.20 41.0 0.13 0.21 0.14 0.09 0.10 0.063 table 4: results 0 100 200 300 400 500 600 700 800 900 1000 -20 -10 0 10 20 x [mm] p [p a ] pm pt fig. 4: dependence of acoustic pressure from k buffer on the x-coordinate at distance z � 6.5 mm, f1 � 20.5 khz, generator power 60 %. pt – from theory, pm – from experiment with microphone. a) 0 20 40 60 80 100 0 20 40 60 80 100 pm [%] p i [% ] b) 0 20 40 60 80 100 0 20 40 60 80 100 pm [%] p i [% ] fig. 5: dependence of optically measured integral pressure in the direction perpendicular to the z-axis of buffer on the measured pressure of the microphone on the z-axis at frequency f � 20.5 khz: a) buffer k, z � 6.5 mm, b) buffer n, z � 9 mm. length of the acoustic field to obtain integral values that were similar to those obtained by the optic method. the transfer constant of frequency modulation was estimated from a calculation of the simple harmonic signal demodulation. 
optical measurement was performed in the presence and in the absence of the microphone to determine whether the microphone influences the acoustic field. examples of the integral pressure obtained from the measurements and from piston radiator theory are introduced in table 4. table 4 shows that the integral values obtained by all three methods are much closer for the central acoustic beam area and for higher acoustic pressures. the central area is better determined because of the maximum of acoustic pressure. at the edge, small coordinate changes produced large changes in the obtained amplitudes. in addition, the measured data are influenced by noise for small signals. the amplitude of the integral pressure for the second harmonic component was below the pitch of sensitivity for buffer n, and is therefore missing. the minimum detected amplitude of integral pressure was 0.01 pa�m. another deviation can be explained by the finite size of the body of the microphone. the measurements could not be made at the same place and time because the microphone would stop the laser beam. also, the long term stability of the power generator and the settings of the output level could cause some errors. in spite of all these factors, the results are generally in agreement. the z-axis dependence of integral pressure pi obtained from the laser probe on the pressure measured by microphone pm is displayed in fig. 5. the first graph is for buffer k and the second’s for n at the basic frequency of radiation. the acoustic pressure is related to 100 % of the power generator setting. the acoustic pressure measured by the laser probe is linearly dependent on the pressure measured by the microphone along the z-axis of the acoustic field. 5 conclusion our results prove that it is possible to use a laser probe for measuring integral acoustic pressure. accuracy in placement of the probes (mainly the microphone) and measurements at different times produced deviations in the results obtained using the different systems. to determine the acoustic pressure at a specific point the tomography method can be used. the results obtained in this study will help us to measure acoustic fields in glass pipes where a microphone cannot be used. acknowledgements this research has been supported by research project msm 212300016 and grant ctu0206313. reference [1] korpel, a.: acousto-optics. new york and basel: marcel dekker, inc., 1988, p. 43–93. [2] bálek, r., šlegrová, z.: modeling and measurement laser induced surface displacement. euromech 419 colloquium, institute of thermomechanics as cr, 2000, p. 32. [3] bálek, r., šlegrová, z.: interferometric measurement of acoustic pressure. atp 2001, vut brno, 2001, p. 41– 45. [4] škvor, z.: akustika a elektroakustika. praha: academia, 2001, p. 120–129. [5] american institute of physics handbook. new york: mcgraw-hill, 1972. rudolf bálek, ph.d., assoc. prof. phone: +420 224 352 384 e-mail: balek@feld.cvut.cz zuzana šlegrová, m.sc. phone: +420 224 352 386 e-mail: slegroz@feld.cvut.cz department of physics czech technical university in prague faculty of electrical engineering technická 2 166 27 praha 6, czech republic © czech technical university publishing house http://ctn.cvut.cz/ap/ 17 acta polytechnica vol. 42 no. 4/2002 ap03_6.vp 1 introduction: the “indoor quality” concept the quality of indoor air is affected by all components of the environment, the so-called constituents of the microclimate (see fig. 1.1, table 1.1) [27]. 
existence of the individual components is obvious from: (1) the differential equation of the environment (only agents creating exposing flows in the human organism can be taken into account), (b) the stress theory according to selye (each constituent is created only by those agents or complexes of agents creating one type of stress in the human body, see [27]). air quality is thus dependent on its temperature and relative humidity, the concentrations of odors and toxic materials, the number of aerosols and microbes in the air, contamination by radioactive gases, static electricity and the numbers of negative and positive ions in the air. the impact of each of these factors depends on the magnitude of the stimulus. taking into account air quality only, i.e. the chemical components of indoor air, pleasant or unpleasant odors dominate the perception of the environment by the occupant, if there are no significant indoor sources. that is-why odors have become the criteria for assessing overall air quality [39], [41]. however, we always have to remember that it is solely the odor microclimate that we are evaluating and not the overall indoor air quality. this means that, if a significant source is present, e.g. water vapour or carbon monoxide, the hygrothermal or the toxic constituent has to be evaluated according to its own criteria (e.g. acceptable relative humidity of the air or acceptable carbon monoxide concentration). simultaneously, the increasing requirements for indoor air quality in buildings need more exact criteria in order to ascertain the real condition of the environment and to allow acta polytechnica vol. 43 no. 6/2003 22 indoor air quality assessment based on human physiology – part 1. new criteria proposal m. v. jokl human physiology research makes evident that the weber-fechner law applies not only to noise perception but also to the perception of other environmental components. based on this fact, new decibel units for odor component representing indoor air quality in majority locations have been proposed: decicarbdiox dcd (for carbon dioxide co2) and decitvoc dtv (for total volatile organic compound tvoc). equations of these new units have been proved by application of a) experimental relationships between odor intensity (representing odor perception by the human body) and odor concentrations of co2 and tvoc, b) individually measured co2 and tvoc levels (concentrations) – from these new decibel units can be calculated and their values compared with decibel units of noise measured in the same locations. the undoubted benefit of using the decibel scale is that it gives much better approximation to human perception of odor intensity compared to the co2 and tvoc concentration scales. keywords: indoor air quality, odors, air changes estimation. fig. 
1.1: types of constituent agents microclimate toxic solids solid aerosols aerosol micro-organismus microbial toxic liquids material liquid aerosols aerosol toxic gases toxic odours odour air (its movement) space (its colourness) man (as an object) water vapours "p s y c h ic m ic r o c l im a t e " heat convection conduction evaporation hygro-thermal respiration radiation light uv radiation laser radiation electro-dynamic energetic microwave radiation (radar) ionizing radiation ions in surrounding air electro-ionic static electricity electrostatic sound acoustic force field (gravitation) vibration table 1.1: common environmental agents and corresponding microclimate types better optimization of its level (see [29], [30]), to remove “sick building” symptoms, i.e. to get the real comfort within a building. human physiology research makes evident that the weber-fechner law applies not only to noise perception but also to the perception of other environmental components. according to this law human body response (r) is proportionate to the logarithm of the stimulus (s; k is a constant): r s� k log . (1) applied to acoustic component of the environment l spl p p p � � 20 0 log [db], (2) where l splp � � sound pressure level [db); p � acoustic pressure (when measuring the rms value, i.e. the square root of the arithmetic average of a set of squared instantaneous values); p0 � acoustic pressure at the threshold of hearing (for air p0 � 20 �pa). analogical relationship could be supposed for odor component of environment [27] which determines the necessary indoor air exchange (by which unpleasant odors should be removed, the toxic gases removal by ventilation usually highly exceeds ordinary requirements and must be solved separately): � �lodor odork� log � � 0 [db], (3) where lodor represents the odor concentration level in db (the value representing the human response, i.e. odor perception [db]), � represents the odor concentration in a building interior [g � m�3], [ppm], �0 represents the odor concentration threshold value [g � m�3], [ppm]. the psycho-physical scale by jaglou (the scale of perceived odors) could be applied to odor concentration levels [18]. in relationship to the percentage of dissatisfied (pd) its experimentally formed course is presented in fig. 1.2. it is a logarithmic function (see equation (1)) that proves the logarithmic form of equation (3). the odor indoor air quality, decisive for air exchange, is dominantly determined by two odor substances: carbon dioxide co2, which cannot be neglected in rooms occupied by a higher number of people (it is produced by respiration) and the complex of volatile organic compounds (voc) produced by the majority of upholstery and building materials. the following equations could be written for co2 and tvoc: lodor co odor co co co k 2 2 22 0 � � � � � log � � [db co2], (4) lodor tvoc odor tvoc tvoc tvoc k� � � � � log � � 0 [db tvoc], (5) where � co2 and �tvoc represent the concentrations of co2 and tvoc. even though these equations (4, 5) look very promising from the physiological point of view, they must be experimentally proved. furthermore, at least two points are necessary for each equation: (1) minimum threshold value, i.e. the weakest odor that can be detected (odor tresholds are statistically determined points usually defined as the point at which 50 % of a given population will perceive on odor” [24]), (2) any second point can be chosen. we prefer the maximum threshold (limit) value, i.e. the beginning of the toxic range. 
the weakest odor that the smell organ of a healthy human can register has an intensity of l, according to the yaglou psycho-physical scale [18], and corresponds approximately to a percentage dissatisfaction (pd) of 5.8 % (see fig. 1.2). if we respect the similarity theory (see [21]), according to which analogous phenomena are governed by the same laws (e.g., concerning the perception of noise, odor etc. with intensity as a logarithmic function), then the corresponding minimal value for thermal comfort, as defined by fanger [16], is 5 which is not too dissimilar from 5.8 for yaglou’s odor value, taking into account the demanding nature of the experimental procedure. there is a good collection of experimental values in literature, therefore it has not been necessary to rely on our own measurements; even complete curves (see for example fig. 1.2) are available. 2 carbon dioxide for a long time, the odor constituent has been evaluated on the basis of co2 concentration and its limit value of 1000 ppm, introduced by von pettenkoffer (see [42]), was used to determine the minimum amount of fresh air (25 m3� h�1 per person). co2 is the most important biologically active agent whose production is proportional to human metabolic ratio [24]. in practice, monitoring co2 levels for the purpose of controlling fresh air supply has proved satisfactory for lecture theatres, halls, cinemas, theatres and similar spaces where the load imposed by occupants can vary rapidly. in order to prove the equation (4), another experimental relationship besides the experimental relationship presented in fig. 1.2 must be available: namely, the relationship between pd and co2 concentration. this is presented in fig. 1.3. now we are able to shape the equation (4). the first point, the minimum threshold value for co2 can be taken as 5.8 % dissatisfaction (yaglou psycho-physical scale: 1), which is 485 ppm, i.e. 875 mg � m�3 figs. 1.2 and 1.3 [15]. the second point used was the short-term exposure limit, which is the beginning of the toxic range, i.e. 15000 ppm. this is based on [22] from the health and safety executive (hse) occupational exposure limits of great britain. © czech technical university publishing house http://ctn.cvut.cz/ap/ 23 acta polytechnica vol. 43 no. 6/2003 odor intensity d is s a t is f ie d [% ] fig. 1.2: the relationship between odor intensity (yaglou’s psycho-physical scale) and the percentage of dissatisfied sedentary subjects during light activity (smoking not permitted) [18] so we are able to formulate the equation for odor level: l i odor co co 2 ppm decicarbdiox, dcd� 90 485 2log [ ] [ ] � (6) � �� �k kodor co odor co2 2log 15000 485 135 90� � � or l i odor co co -3 2 mg m decicarbdiox, dcd� � 90 875 2log [ ] [ ] � (6a) where decicarbdiox, dcd, is a new decibel unit for odor level (decibel carbon dioxide) caused by co2 production by humans, � ico2 is indoor air concentration, kodorco2 is constant. besides experimental functions already applied to equation (4), a lot of individually measured co2 levels from various locations (tables 1.2 and 1.3, fig. 1.4) are available for the verification of the equation (4). new decibel units dcd can be calculated from the measured co2 concentrations and their values can be compared acta polytechnica vol. 43 no. 6/2003 24 fig. 
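these two fixed points determine the constant in equation (4) once the toxic limit is assigned a level of 135 db, which is how the formula stated next is calibrated. a minimal sketch of this calibration follows; the function name is illustrative, and the tvoc values anticipate the calibration of section 3 and equation (7):

```python
import math

def odor_level_db(c, c_threshold, c_toxic, level_at_toxic=135.0):
    """decibel-type odor level l = k*log10(c/c_threshold), with k chosen
    so that the short-term exposure limit maps to level_at_toxic db."""
    k = level_at_toxic / math.log10(c_toxic / c_threshold)
    return k * math.log10(c / c_threshold)

# decicarbdiox: threshold 485 ppm, toxic limit 15000 ppm -> k ~ 90
print(odor_level_db(1000.0, 485.0, 15000.0))  # ~28 dcd at 1000 ppm co2
# decitvoc: threshold 50 ug/m3, toxic limit 25000 ug/m3 -> k ~ 50
print(odor_level_db(460.0, 50.0, 25000.0))    # ~48 dtv
```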
1.3: the percentage of dissatisfied sedentary subjects as a function of the carbon dioxide concentration above outdoors sample mean, dcd / ppm range, dcd / ppm study s ns s ns public places 39 schools (ns) – – 28 dcd 990 ppm – – �5 to 68 425–2800 smedje et al 1994 10 schools (ns) – – 38 1300 – – – – thorstensen et al 1990 14 town halls (s) 20 800 – – 1 to 38 500–1300 – – skov, valbjorn 1987 5 office buildings (s) – – – – <26 <950 – – palonen, sepanen, 1990 4 office buildings (s) 20 800 – – �8 to 38 400–1300 – – loewenstein 1989 26 office buildings (s) 11 639.4 – – �13 to 52 350–1850 – – reynolds et al 1990 10 kindergarten – – 27 962.7 – – 25 to 66 915–2590 piade et al 1988 1 office building (s) �6 420 – – – – – – jaakola et al 1990 10 offices (5 s, 5 ns assumed) 8 590 4 533 – – – – proctor et al 1989 1 office – – – – 8 to 13 600–675 – – berglund et al 1982 1 library – – 16 731.5 – – – – berglund et al 1988 9 office buildings (s) 15 710.6 – – – – – – jones 1980 homesa) 10 homes and apartments 14 692 �1 to 40 470–1360 van der wal et al 1990 table 1.2: measured co2 levels in various locations [23] © czech technical university publishing house http://ctn.cvut.cz/ap/ 25 acta polytechnica vol. 43 no. 6/2003 sample mean, dcd / ppm range, dcd / ppm study s ns s ns 18 homes 6 570 �13 to 24 350–900 keskinen, graeffe 1989 57 homes 16 730.5 – – van dangen, van der wal 1990 living room b.r. 29 1024 – – op’t veld, slijpen 1993 living room a.r. 23 882 – – op't veld, slijpen 1993 kitchen b.r. 29 1013 – – op't veld, slijpen 1993 kitchen a.r. 21 836 – – op't veld, slijpen 1993 bedroom b.r. 25 927 – – op't veld, slijpen 1993 bedroom a.r. 13 672 – – op't veld, slijpen 1993 s – smoking, ns – nonsmoking, a.r. – after renovation, b.r. – before renovation, negative values of dcd – measuring methods allow estimating of values below the detection threshold, a)no distiction made between smoking/non-smoking. table 1.2: measured co2 levels in various locations [23] (continue) before renovation after renovation who 1987 guideline values living room kitchen bedrooms living room kitchen bedrooms (dcd) mean 29 29 25 23 21 13 35 co2 mean 1024 1013 927 882 836 672 1200 (ppm) std 184 277 193 160 122 148 co mean 3.9 4.3 4.2 2.2 2.0 1.4 10 (8 h) (mg � m�3) std 0.4 0.2 1.6 1.0 0.5 0.5 ch2o* mean 665 577 530 405 357 231 120 (0.5 h) (�g � m�3) std 214 51 234 167 153 188 tvoc (ref ch4) mean – – – 4.5 4.1 2.9 – (mg � m�3) std – – – 1.0 1.0 1.0 no2 mean 84 160 30 30 34 16 150 (�g � m�3) std 40 127 9 16 14 3 (24 h) resp.dust mean 30 – – 30 – – 70 (�g � m�3) std 16 – – 15 – – (pm10 24 h) rh(%) mean 42 41 57 45 44 45 (30–70) % std 4 6 9 4 3 4 * including a.o. other aldehydes, c5h12, c6h14 table 1.3: measured indoor quality parameters before and after renovation (mean values and standard deviations, n � 16) [38] with decibel units for noise measured in the same locations (fig. 1.5). a perfect agreement is evident. 26 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 43 no. 6/2003 fig. 1.4: the measured co2 levels in various buildings (see table 1.2); the psychophysical scale slightly modified by fanger (1988), s – smoking, ns – nonsmoking, a.r. – after renovation, b.r. 
– before renovation 3 volatile organic compounds although co2 is a good indicator of the perceived air quality by sedentary persons, it is frequently also an unsuitable indicator: it does not represent further perceived sources of air contamination, such as building materials and fittings, especially carpets and other floor covering materials, producing voc. this was why fanger [17] proposed a new system based on the units of the “olf ” and “decipol”. it was presented in a number of international periodicals and publications and in 1992 became the eu recommended method of evaluating indoor air quality (eur 14449 en, 1992). it was not, however, accepted for the bsr/ashrae standard 62-1989 r (1989). there are certain obvious problems with this system, as were analysed by oseland [39], jokl et al [28] and especially by parine [41]. fanger’s system, as used in the ec standard, is based, rather than on co2, on a new criterion: the total of all volatile organic compounds (tvoc) produced by humans and especially by building materials, furniture and other fittings. tvoc is also used for outside air quality as used for ventilation purposes, especially in areas with sources of contamination (chemical and other factories). tvoc is defined by the world health organisation (who) as a set of agents (toluene, xylene, pinene, 2-(2etoxyetoxy), ethanol etc.) with a melting point below room temperature and a boiling point in the range 50–260 °c. more detailed definitions also exist. humans sense tvoc by means of olfactory (smell) sensors (see [27]). adaptation during the course of exposure is small. the response of the human organism to indoor tvoc has been classified as acute sensing of the environment, acute © czech technical university publishing house http://ctn.cvut.cz/ap/ 27 acta polytechnica vol. 43 no. 6/2003 or sub-acute inflammation of the skin or mucous membranes, or a sub-acute and mild stress reaction [15]. in order to prove the equation (5), besides the experimental relationship presented in fig. 1.2, another experimental relationship must be available: namely, the relationship between pd and tvoc concentrations. it is presented in fig. 1.6. now we are able to shape the equation (5). the first point, the minimum threshold value for tvoc, can be taken as 5.8 % dissatisfaction (yaglou psychophysical scale: 1) which is 50 �g � m�3, see fig. 1.6 (adapted from fig. 1.2 in eur 14449 en, see [30]). the second point used was the short-term exposure limit, which is the beginning of the toxic range, i.e. 25000 �g m�3, which has been estimated by molhave [36]. so we are able to formulate the equation for odor level: l iodor tvoc tvoc� 50 50 log � [decitvox, dtv] (7) ( � �k kodor tvoc odor tvoclog 25000 50 135 50� � � ) where decitvoc, dtv, is a new decibel unit for odor level caused by tvoc release from building materials and other sources (decibel tvoc), kodor tvoc is a constant. besides experimental functions already applied to equation (5), a lot of individually measured tvoc levels from various locations (tables 1.3 and 1.4, fig. 1.7) are available for the verification of the equation (5). new decibel units dtv can be calculated from the measured tvoc concentrations and their values can be compared with decibel units for noise measured in the same locations (fig. 1.5). a perfect agreement is evident. 4 conclusions 1. the undoubted benefit of using the decibel scale is that it gives a much better approximation to human preception of odor intensity compared to the co2 and tvoc concentration scales. 
this is because the human olfactory organ (see [27]) reacts to a logarithmic change in level 28 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 43 no. 6/2003 fig. 1.6: the percentage of dissatisfied sedentary subjects as a function of the total volatile organic compound concentration (tvoc) above outdoors sample tvoc [�g � m�3] odor level [dtv] study min mean max min mean max living rooms and bedrooms old houses 130 200 240 21 30 34 brown, crump 1993 new houses 330 460 580 41 48 53 non-asthmatics 72 320 1600 8 40 75 nordbäck et al 1993 asthmatics 67 540 8300 6 52 111 office buildings a) without sbs during working hours 43 51 61 �3* 0 4 ekberg 1993 during night-time 37 44 49 �6* �3* 0 during weekend 37 – – �6* – – b) with sbs (liquid process photo copiers) – 5000 90000 – 100 163 broder et al 1993 sbs � sick building syndrome * measuring methods allow to estimate values under the detection threshold table 1.4: measured tvoc values in various locations which corresponds to the decibel scale, where a change of 1 db is approximately the same relative change everywhere on the scale. 2. the new decicarbdiox and decitvoc values also fit very well with the db values for sound, e.g., the optimal odor value of 30 db corresponds to the iso noise rating acceptable value nr 30 for libraries and private offices. they can therefore be compared to each other. 3. it is possible, by comparing dcd and dtv values, to estimate, which component – co2 or tvoc – plays a more important role and hence which sources of contamination are more serious. © czech technical university publishing house http://ctn.cvut.cz/ap/ 29 acta polytechnica vol. 43 no. 6/2003 30 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 43 no. 6/2003 references see presented at the end of part 3 (p. 44). prof. ing. miloslav v. jokl, drsc. phone: +420 224 354 432 email: miloslav.jokl@fsv.cvut.cz department of engineering equipment of buildings czech technical university in prague faculty of civil engineering 166 29 prague 6, czech republic acta polytechnica doi:10.14311/ap.2020.60.0151 acta polytechnica 60(2):151–157, 2020 © czech technical university in prague, 2020 available online at https://ojs.cvut.cz/ojs/index.php/ap the use of the two-handed collaborative robot in non-collaborative application michal vocetka∗, jiří suder, daniel huczala vsb – technical university of ostrava, faculty of mechanical engineering, department of robotics, 17. listopadu 2172/15, 708 00 ostrava – poruba, czech republic ∗ corresponding author: michal.vocetka@vsb.cz abstract. the article deals with possibilities of using of a two-handed collaborative robot in automated production. the introductory part of this paper is about robot manufacturers’ proposed ways of use of collaborative robots and a consideration of correctness of this stance. in this matter, an alternative point of view is proposed and tested, where a collaborative robot does not cooperate with a worker but replaces him/her completely. the main part of the study focuses on a specific installation of the yumi collaborative robot into an already existing production line of a leading czech supplier in the automotive industry. this real application is verified in simulations with an alternative solution consisting of two traditional industrial robots abb irb 120 instead. 
these data are evaluated and the advantages of deploying the collaborative robot and the industrial robots in the specific assembly application are compared. economic return and productivity in high production cycle applications are considered. the article then describes the difficulties caused by the low load capacity of the yumi collaborative robot and an alternative approach using the fem methodology and topology optimization in the robot grip jaws design. keywords: collaborative robot, yumi, industrial robot, industrial automation, semi-collaborativ. 1. introduction in 2008, a danish company universal robots had launched the first collaborative robot (cobot) that has a worldwide sales success and changed the established rules of the market [1]. the small ur3 [2] is simple to operate and thanks to its favourable price and collaborative principle, it is also available to customers who would not otherwise consider traditional industrial robotics. most of the world leading robot manufacturers have sensed the potential of this idea and come up with their own designs and solutions. the ur concept is a concept of the traditional, sixaxis kinematic structure, but with less robust tubular construction. the robots have an excellent load-toweight ratio, in the case of the ur3, it is 3:11 kg. it is very easy to control and install the cobot, after putting it out of the box, it is possible to put it in service within an hour. these cobots are very convenient for simple applications. taiwan enterprise techman produces conceptually the same robots but improved by adding a camera system installed on its fifth axis [3]. the system can then, to some extent, evaluate the work environment by itself. fanuc has created the strongest cobot so far. their cr-35ia has a load capacity of 35 kg [4]. according to many, such a weight defies collaboration, because dropping an object of this weight could easily cause pain and injury, which collaborativity does not allow. however, fanuc offers this system to applications where it works as a hand-guided manipulator, the operator holds a control located on the upper arm and guides the manipulator where needed. abb has come up with a completely new concept. the cobot yumi is supposed to work with person directly by being seated at one desk in front of the operator and work simultaneously, in the same work space and on the same task. the cobot is equipped with two seven-axis arms (most robots and cobots have a six-axis arm). these arms are set up in a common body with a built-in controller. a collaborative tool, the smartgripper, is included with the yumi cobot. the basic configuration includes an electric gripper with a clamping force of up to 20 n, the gripper can be fitted with one or two suction cups and possibly a camera. [5] this cognex ae3 industrial camera has a resolution of 1.3 mpx and an integrated led for illumination. the standard robot control system is upgraded with the integrated vision module to control it. also, it is possible to control the smart gripper camera through an insight software, provided by cognex [6]. this solution is designed for industrial–use, but camera guidance is time consuming. the principle of cooperativity has also caught attention of people outside of the industrial automation, this type of automation is now widely supported by various government grants, large enterprises have ordered deploying cobots in applications for which it often does not seem to fit. 
the reason is the fear and apprehension that industrial automation, and especially robotization, will exclude people from pro151 https://doi.org/10.14311/ap.2020.60.0151 https://ojs.cvut.cz/ojs/index.php/ap m. vocetka, j. suder, d. huczala acta polytechnica figure 1. ur3, cr-35ia and yumi cobots. duction processes and increase unemployment, and so the cobotics is perceived as a compromise. collaborative robotics has many advantages, but it has some limitations too making it in most manufacturing processes so disadvantageous that it cannot stand in comparison with traditional industrial robotics. these limitations result from the safety functions and operation of these manipulators, which do not have to be installed behind the protective fence, optical barriers, zone scanners and are generally considered to be safe (which applies only to the manipulator, not to its tool and surrounding technology). the disadvantage is that the robots have a small load capacity, and, above all, are slow. it is not physically possible to create a fast robot with a kilogram load capacity that will be safe at the same time. in case of a collision with a human, more energy than would be acceptable could be transmitted. moreover, according to the european legislation, any contact with human head and neck at any speed is completely unacceptable [7, 8]. therefore, the question how to use cobots in practice arises [9–13]. 2. a specific yumi application in the specific case described below, the choice of yumi cobot is more appropriate in all aspects than the other considered options with traditional industrial robots. the task is to install a motor, a centrifugal clutch, an upper and a lower housing. the process is now carried out by a human and proceeds as follows: the operator puts both plastic housings into the press; the motor and the centrifugal clutch are inserted into the press as well. the automatic press gradually compresses all parts. the finished part is taken out from the press by the operator and placed in a technological palette. due to the lack of workers, however, there is a risk of production interruption and the manufacturer is therefore trying to fully automate these simpler processes. however, the production must not be disturbed, so the installation of any new equipment must take place very quickly, and, if necessary (in the case of a robot malfunction), it must be possible to disconnect the robot immediately and to carry out the production by a human operator. it is also not possible to fence the workplace because of a lack of space. the last limiting parameter is the production cycle, which is set at 12 s. these requirements directly figure 2. bosch technological pallet. lead to the use of a robot that works in a similar way to a human being. in addition, the installation of the yumi cobot is also suitable because of the limited working space on the pallet and the presses. indeed, yumi has seven-axis arms that are much smaller in cross-section than the smallest industrial abb robots. a pair of these industrial robots would not fit into the target space at the same time, the robots could not work simultaneously, which results in a longer duty cycle [14, 15]. the fig. 2 above depicts the bosch technology palette, the clutch is in the left corner, and the completed motor is placed in the green socket. the proposed workstation consists of a press that is placed behind the pallet conveyor, a bowl feeder for housing supply and a robot on a docking station. 
the press had to be adapted for automation purposes, the pressing of the housings takes place separately and is provided by pneumatic cylinders with a diameter of 20 mm. the pneumatic manipulator, whose individual drives only move between the end positions, solves the transfer of housings from the draw-off plate. this solution is suitable with respect to price and adjusting the machine. the pressing plate can be rotated by 180 degrees. pressing takes place on the front side while the back side is filled with the new set of housings by a pneumatic manipulator. the clutch is pressed on the motor by a pneumatic cylinder with a diameter of 80 mm, the motor is locked by an angular gripper during pressing to avoid misalignment. in fig. 3, the yumi robot is in the foreground and the manual assembly press is on its right side. opposite the robot, the automatic press is mounted – it works only when the robot is connected and operational. the housings are delivered by the bowl feeder installed in the background. the robot duty cycle differs slightly from the manual assembly. left arm realizes pressing of the housings, right arm assembles the clutch to the motor. the whole process is tuned, so there is no delay caused by the arms waiting for each other. deploying and testing of a new technology on an already functioning production line that cannot be stopped for a longer time is difficult. for this reason, 152 vol. 60 no. 2/2020 the use of the two-handed collaborative robot. . . figure 3. assembly line workstation. figure 4. automatic press detail. the robot is placed on a detachable docking station. if this new technology does not work properly, then it can be easily disconnected, and another robot or line operator could continue the work manually instead. the workplace remains unchanged from the worker’s point of view, but the production process will be stabilized after debugging, as the line performance is proportional to the performance of the slowest operator, and many tasks are performed at this station in a short time. 3. low load capacity problems however, the load capacity of the yumi robot is problematic. it is set by the manufacturer at a maximum of 500 g as shown in fig. 6. this weight must not exceed the sum of the weight of the object of manipulation and the end-effector. with these values, the maximum permissible weight of a one gripper jaw is 12 g when the smart gripper is used. if it is made of duralumin and the functional surface is coated with rubber, this weight can be achieved, but the robot will work on its limits. an alternative is a design of a custom end-effector with a pneumatic gripper, which figure 5. abb smart gripper. will have a quadruple clamping force, which is very desirable, total price per the tool will be lower, but it will not meet the principles of cooperativity. the design of the jaw for the collaborative gripper was created based on the results of the topology optimization. in order to minimize the weight, this method seems to be the most appropriate. the initial testing revealed that the jaw contact surface must copy the radius of the motor body on the largest possible area, otherwise there is a risk of displacement or rotation of the motor in the jaws. 153 m. vocetka, j. suder, d. huczala acta polytechnica figure 6. yumi load diagram. material mass [g] tension [mpa] pc 11.5 10 20al6061 23.9 table 1. comparison of fem analysis results. in addition, this surface must be coated with a soft coating to enhance adhesion. 
the original design of the jaw, which would be printed on a 3d polycarbonate printer, meets the strength and weight limits, but the lifetime of such a jaw would be short, as the 3d-printed part would not withstand the cyclic loading. however, with the same design but a higher-quality aluminium instead, the weight would rise above the acceptable limit, as shown in table 1. the principle of topology optimization is to exclude from the optimized part all the mass that does not carry any load and is therefore useless. the fem analysis has shown that this part has a potential to be topologically optimized. in the case of the aluminium, the optimization results in half the weight at roughly double the stress values, but this is still acceptable. the optimized part will thus be modified to reduce the production costs; however, it can be considered sufficient at this phase.

table 2. comparison of fem analysis results of the topologically optimized part.
material | mass [g] | tension [mpa]
pc       | 5.6      | 25
al6061   | 12       | 40

figure 7. topology optimization process.
figure 8. fem analysis.

4. industrial robot solution

the possible solution with a pair of traditional six-axis robots is also functional, but in comparison with the yumi cobot, it is slower by two seconds. the industrial solution simulation contemplates a pair of abb irb120 robots. these small six-axis industrial robots can reach a speed of 6.2 m/s and an acceleration of 28 m/s² with a load capacity of 3 kg [16]. the robots would be faster if they were suspended, but this option is not possible. even though such a robot is generally much faster, it does not reach higher speeds on such a short trajectory, and the bulk of its arms causes additional delays when one robot waits for the other. all the simulations were performed in abb robotstudio software. it is probably the most accurate software for simulating abb robots, given that detailed robot specifications and performance data are confidential and considered the brand's know-how. the last and probably the most serious parameter is the price, where this solution would be more expensive by an estimated € 6,500. based on these parameters, the yumi cobot seems to be more appropriate for this specific case.

figure 9. a pair of irb120.

5. results

the total cost of the automation is estimated at about 60,000 €. considering that the line operates in two-shift operation, and so two operators will be spared, the return is estimated to be about three years. the more important fact is that the production will not be hampered by a lack of personnel; the robotized station is three seconds faster than a human operator, so there may be a small increase in production. the comparison of the cobot and industrial robots has shown that a cobot is more appropriate in this case. the parameters are compared in table 3.

table 3. comparison of the cobot and industrial robot solutions.
parameter    | yumi     | 2× irb120
investment   | 60 000 € | 66 500 €
tact         | 10.5 s   | 12.8 s
modularity   | yes      | no
fenced       | no       | yes
power excess | no       | yes

the key parameters in the decision making are the price and the modularity, i.e. whether it is necessary to fence the station and whether it is possible to quickly replace the cobot with another cobot or a human. this comparison can be surprising, but it is valid only for this specific application. in other applications, yumi would probably not be able to withstand the comparison with industrial robots.
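the economics in table 3 and the quoted three-year return can be reproduced with a back-of-envelope calculation. in the minimal sketch below, the annual labour saving is an assumption chosen so that the 60,000 € yumi cell pays back in about three years; the investments, cycle times and the 12 s line takt come from the text:

```python
# back-of-envelope comparison of the two candidate cells using the data
# from table 3. the annual labour saving (two spared operators over two
# shifts) is a hypothetical figure, not stated in the text.

LINE_TAKT_S = 12.0
ANNUAL_LABOUR_SAVING_EUR = 20_000  # assumption

cells = {
    "yumi":      {"investment_eur": 60_000, "cycle_s": 10.5},
    "2x irb120": {"investment_eur": 66_500, "cycle_s": 12.8},
}

for name, c in cells.items():
    meets_takt = c["cycle_s"] <= LINE_TAKT_S
    payback_yr = c["investment_eur"] / ANNUAL_LABOUR_SAVING_EUR
    print(f"{name}: meets 12 s takt: {meets_takt}, "
          f"simple payback: {payback_yr:.1f} years")
```

note that, besides the slightly longer payback, the industrial pair also misses the 12 s takt, which is what rules it out here.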
6. conclusion

it is known that cobots are very suitable for applications with a long production cycle. in industrial applications, there are many cases where a cobot is installed in the manner of an industrial robot, behind a fence, and works in a non-collaborative mode. this approach may be advantageous in terms of the cost of the manipulator. cobots are also very useful tools for programming and robot-control education: in those applications, there is no need for high speeds and the cobot can move unloaded. due to the safe operation of the manipulator itself, which does not need safety peripherals for its function, this is a suitable solution for technical schools and universities or training centres that offer this type of education. in this specific case, surprisingly, a situation has been achieved where deploying a cobot is more useful than deploying an industrial robot. a standard collaboration would not be possible due to the surrounding technology, such as the presses, the conveyor, etc. a traditional industrial robot solution is applicable, but it requires more floor space and needs to be fenced, so there is no room for modularity. the idea of using a semi-collaborative installation would be interesting: the yumi robot works as a standard industrial manipulator, using maximum speed and acceleration; however, its body shields the workspace and yumi's parameters are not that dangerous. protection in the form of a light curtain, whose disruption results in a stop of all presses and of the cobot as well, is acceptable. this safety solution, together with a small docking station for a small cobot, saves floor space; the workstation can be "opened" so that the robot can be easily and quickly replaced, even by a human worker. this idea is economical especially in the case of low-volume production, or production of several product variants. if a variant does not need, for example, a technological operation provided by the cobot, that cobot can be used on another production line in another manufacturing process, as needed. as the next step of research, we would like to work on adaptive jaws. jaws of this kind are already available on the market [17], but we believe that, for cylindrical objects of manipulation, a better fin-ray jaw design could be devised.

acknowledgements

this work was supported by the european regional development fund in the research centre of advanced mechatronic systems project, project number cz.02.1.01/0.0/0.0/16_019/0000867, within the operational programme research, development and education, and by project vega 1/0355/18 the use of experimental methods of mechanics for refinement and verification of numerical models of mechanical systems with a focus on composite materials. this article has also been elaborated under the support of the specific research project sp2019/69, financed by the ministry of education, youth and sports of the czech republic.

references
[1] universal robots. https://en.wikipedia.org/wiki/universal_robots, 2019. accessed: 19 march 2019.
[2] universal robots. https://www.universal-robots.com/cs/, 2019. accessed: 19 march 2019.
[3] techman robots. http://tm-robot.com/tm5.php, 2019. accessed: 19 march 2019.
[4] fanuc. https://www.fanuc.eu/cz/en/robots/robot-filter-page/collaborative-robots/collaborative-cr35ia, 2019. accessed: 19 march 2019.
[5] abb yumi irb14000.
https://new.abb.com/products/robotics/cs/prumyslove-roboty/yumi?utm_source=google&utm_campaign=se_roboty_roboty&utm_medium=cpc&utm_term=roboty&utm_content=robot-yumi&gclid=eaiaiqobchmitqtrrj-o4qivqzztch06bayreaayasaaegjvovd_bwe, 2019. accessed: 19 march 2019.
[6] product manual irb 14000 gripper: document id: 3hac054949-001. https://abb.sluzba.cz/pages/public/irc5userdocumentationrw6/en/3hac054949%20pm%20irb%2014000%20gripper-en.pdf, 2018. accessed: 12 february 2020.
[7] iso 10218 robots for industrial environments — safety requirements — part 1: robot. standard, international organization for standardization, geneva, switzerland, 2006. this version is no longer in effect.
[8] iso/ts 15066 robots and robotic devices — collaborative robots. standard, international organization for standardization, geneva, switzerland, 2016.
[9] v. villani, f. pini, f. leali, c. secchi. survey on human–robot collaboration in industrial settings: safety, intuitive interfaces and applications. mechatronics 55:248–266, 2018. doi:10.1016/j.mechatronics.2018.02.009.
[10] v. villani, f. pini, f. leali, et al. survey on human–robot interaction for robot programming in industrial applications. ifac-papersonline 51(11):66–71, 2018. 16th ifac symposium on information control problems in manufacturing incom 2018. doi:10.1016/j.ifacol.2018.08.236.
[11] g. michalos, s. makris, p. tsarouchi, et al. design considerations for safe human-robot collaborative workplaces. procedia cirp 37:248–253, 2015. cirpe 2015 understanding the life cycle implications of manufacturing. doi:10.1016/j.procir.2015.08.014.
[12] j. fryman, b. matthias. safety of industrial robots: from conventional to collaborative applications. in german conf. robot., robotik, pp. 1–5. munich, germany, 2012.
[13] i. maurtua, a. ibarguren, j. kildal, et al. human–robot collaboration in industrial applications: safety, interaction and trust. international journal of advanced robotic systems 14(4), 2017. doi:10.1177/1729881417716010.
[14] a. m. zanchettin, n. m. ceriani, p. rocco, et al. safety in human-robot collaborative manufacturing environments: metrics and control. ieee transactions on automation science and engineering 13(2):882–893, 2016. doi:10.1109/tase.2015.2412256.
[15] h. ding, j. heyn, b. matthias, h. staab. structured collaborative behavior of industrial robots in mixed human-robot environments. in ieee international conference on automation science and engineering, case, 2013.
[16] product specification: irb 120. https://library.e.abb.com/public/7139d7f4f2cb4d0da9b7fac6541e91d1/3hac035960%20ps%20irb%20120-en.pdf. accessed: 11 february 2020.
[17] festo. adaptive gripper fingers dhas. https://www.festo.com/cat/cs_cz/data/doc_engb/pdf/en/dhas_en.pdf, 2017. accessed: 27 february 2020.
superior properties of ultra-fine-grained steels

j. i. leinonen

acta polytechnica vol. 44 no. 3/2004

abstract: a description of the improved mechanical properties obtained in ultra-fine-grained steels up to now will be presented in this paper, and some potential applications of these new generation steels will be described. in addition, the principle and implementation of a novel hot rolling process developed by the author will be introduced. this novel thermomechanical nonrecrystallisation control process (tncp) has been shown to give an ultra-fine ferrite (uff) structure with grain sizes of 2 to 3 μm in various test steels, thus resulting in super-toughness. charpy v impact test results suggest that some of these steels could still be tough at temperatures lower than −100 °c. this novel process, tncp, is one potential candidate for the commercial production of superior ultra-fine-grained steels in the future.

keywords: ultra-fine-grained steel, super-toughness, yield strength, fatigue strength, tncp.

1 introduction

steel is nowadays a very suitable material for many structural uses because of its low cost, good formability, variable strength with good toughness, and good weldability. still better steels are required, however, because those available at present do not meet the high standards of this century. ultra-fine ferrite (uff) microstructures, with grain sizes usually not more than 3 μm (0.003 mm), have been shown to result in superior combinations of mechanical properties, much better than those in the present steels with grain sizes of about 5 to 20 μm. high strength, excellent impact toughness even at very low temperatures, and very good fatigue strength, together with good formability and weldability, can be obtained at the same time in an ultra-fine-grained steel. most of the test steels used in numerous investigations worldwide have been produced on a laboratory scale, but a few have been made in commercial plate mills. numerous research projects to develop still better structural steels, e.g. "ultra steels" or "super steels", have been commenced all over the world since the mid-1990s, especially in asia (e.g. japan, south korea, china), australia and europe. one of the main goals of japan's enormous national 10-year "ultra steel project", for example, is to achieve a strength level of 800 mpa for unalloyed silicon-manganese steels [1, 2]. this means that the grain structure of the steel will have to be much finer than at present. the aim of most current projects is to achieve an ultra-fine grain structure, with a grain size usually of not more than 3 μm (0.003 mm), in conventional steels. this is assumed to ensure high strength and very good toughness even at very low temperatures [3]. novel processes for producing these ultra-fine-grained steels have been developed lately, e.g. in japan [2], finland [4, 5] and australia [6]. a description of the improved mechanical properties obtained in ultra-fine-grained steels up to now will be presented in this paper, and some potential applications of these new generation steels will also be described. in addition, the principle and implementation of a novel hot rolling process, tncp [4, 5], developed by the author will be introduced.

2 superior properties of uff steels

ultra-fine-grained steels have often been shown to have mechanical properties superior to conventional structural steels with common microstructures. yield strength and fatigue strength have been shown to increase markedly with reduced grain size, and impact toughness has improved radically. the effect of ferrite grain size on yield strength and impact transition temperature (tough/brittle), extrapolated to ultra-fine grain sizes, is shown in fig. 1.

fig. 1: dependence of yield strength (ys) and charpy impact transition temperature (itt) on ferrite grain size, extrapolated to ultra-fine grain sizes [7]
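the grain-size dependence of the yield strength in fig. 1 follows the classical hall–petch relation, $\sigma_y = \sigma_0 + k\,d^{-1/2}$. the sketch below evaluates it with assumed, order-of-magnitude constants for a plain c-mn ferritic steel; σ0 and k are not given in the paper and are chosen here so that d ≈ 0.5 μm lands near the 800 mpa target quoted in the text:

```python
import math

# illustrative hall-petch estimate of yield strength vs. ferrite grain
# size. sigma0 (friction stress) and k (hall-petch slope) are assumed
# values, not taken from this paper.

SIGMA0_MPA = 70.0     # assumption
K_MPA_SQRT_M = 0.52   # assumption, MPa*sqrt(m)

def yield_strength_mpa(grain_size_um: float) -> float:
    d_m = grain_size_um * 1e-6          # grain size in metres
    return SIGMA0_MPA + K_MPA_SQRT_M / math.sqrt(d_m)

for d in (20.0, 10.0, 3.0, 1.0, 0.5):
    print(f"d = {d:4.1f} um  ->  ys ~ {yield_strength_mpa(d):4.0f} MPa")
```

with these constants, refining the grain size from 10 μm to 0.5 μm roughly triples the estimated yield strength, which is the trend extrapolated in fig. 1.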
the main attention in the research work that arose in the 1990s regarding the benefits of ultra-fine-grained steels was paid to higher yield and tensile strength. one of the main goals of japan's enormous national 10-year "ultra steel project", for example, was to achieve a strength level of 800 mpa for unalloyed silicon-manganese steels [1, 2]. this means that the ferrite grain size should be about 0.5 μm (fig. 1). a stronger but lighter steel will contribute to improving the fuel efficiency of automobiles and increasing the thermal efficiency of power generation plants, as well as reducing carbon dioxide emissions [2]. it is problematic in practice, however, to use uff steels with grain sizes of 1 μm or smaller in real structures, because these involve a substantial reduction in uniform elongation and very little work hardening occurs [8]. therefore the ferrite grain size should preferably be 2 to 3 μm or a little more, but this in turn results in a lower strength than a grain size of 1 μm. this drawback can be largely eliminated by producing low-carbon steels with bainitic or even martensitic microstructures, which have a higher original strength. in this case, maximum yield strengths of about 1000 mpa may be achieved, as well as good impact toughness and reasonable weldability [5]. an ultra-fine grain structure has been shown in various steels to result in super-toughness. the impact toughness transition temperatures of these steels were radically lowered as the grain sizes were reduced from about 10 μm to 2 to 3 μm by tncp [9]. their good toughness at very low temperatures means that these steels could very well be called "ultra-arctic", for example, and could be employed in various structures used at very low temperatures, e.g. in arctic regions. for a low-carbon high-manganese test steel, the absorbed energy in the charpy v impact test at −85 °c, the lowest temperature used, was 96 j (fig. 2), a quite exceptional toughness for a structural steel, whereas the corresponding energy before the treatment was only 8 j. likewise, the difference in transition temperature between the original and treated materials was at least 40 °c. due to the limited cooling capability, it was not possible to measure the exact transition temperature of the ultra-fine structure, but fig. 2 suggests that it could be −100 °c or even lower.

fig. 2: absorbed energy in the charpy v impact test at various temperatures after conventional hot rolling (o) and after tncp, a novel hot rolling process (x). steel a: 0.05 % c, 0.1 % si, 2.5 % mn [9]
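transition curves like the one in fig. 2 are commonly quantified by fitting a hyperbolic-tangent function to the absorbed-energy data and reading off the midpoint temperature. a minimal sketch follows; the data points are synthetic illustrations (not read from fig. 2), with only the ~96 j value at −85 °c anchored in the text:

```python
import numpy as np
from scipy.optimize import curve_fit

# tanh fit of a charpy transition curve: lower/upper shelf energies and a
# transition midpoint t0. the temperatures and energies below are
# illustrative, synthetic values for a tncp-treated steel; only the
# ~96 J point at -85 C is taken from the text.

def charpy_tanh(t_c, lower, upper, t0, width):
    return lower + 0.5 * (upper - lower) * (1.0 + np.tanh((t_c - t0) / width))

t = np.array([-120.0, -100.0, -85.0, -60.0, -40.0, -20.0, 0.0, 20.0])
e = np.array([10.0, 40.0, 96.0, 130.0, 150.0, 158.0, 160.0, 161.0])

popt, _ = curve_fit(charpy_tanh, t, e, p0=(5.0, 160.0, -80.0, 20.0))
print(f"fitted transition midpoint t0 ~ {popt[2]:.0f} C")
```

whether the true midpoint lies below −100 °c, as suggested above, would require measured points at still lower temperatures.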
the absorbed impact energy of a medium-carbon steel with a pearlitic-ferritic microstructure is known to be much lower than that of a low-carbon steel with a ferritic or ferritic-pearlitic structure, and its impact toughness transition temperature is fairly high. this poor toughness is mainly due to the high proportion of brittle pearlite in the microstructure. tncp treatment was shown to improve the impact toughness of a medium-carbon test steel as well, by bringing down the transition temperature sharply, as shown in fig. 3.

fig. 3: absorbed energy in the charpy v impact test at various temperatures after conventional hot rolling (o) and after tncp, a novel hot rolling process (x). steel b: 0.33 % c, 0.3 % si, 1.2 % mn [9]

in addition to impact toughness and tensile/yield strength, the fatigue strength of ultra-fine-grained steels has also been found to be superior to that of conventional steels. test specimens from steel plates with ultra-fine-grained surface layers (suf steels) exhibited not only a higher resistance to fatigue crack initiation and early propagation, but also a superior resistance to the propagation of long fatigue cracks [10]. suf steel plates have already been employed in important members of large structures such as large ships, and the microstructure also has a superior fatigue strength in artificial sea water compared with a conventional microstructure [11]. weight reduction and rigid auto-bodies have been examined extensively in the automotive industry in recent years, from the viewpoints of environmental protection and improvement of collision safety. some high-strength, high-elongation tubular products based on ultra-fine grain metallurgy have been developed in order to meet this demand [12]. advanced steel structures to be manufactured from ultra-fine-grained steels will usually be built by welding. efficient joining processes with a low heat input and a narrow heat affected zone (haz) have to be used to obtain high-performance welded structures while preserving the ultra-fine microstructure. laser welding [13] and ultra narrow gap gma welding [14], for example, have been shown to be adequate processes.

3 how to produce ultra-fine-grained steels

novel processes for producing ultra-fine-grained steels have been developed lately in many places, e.g. in japan [2], finland [4, 5] and australia [6]. a detailed description of tncp, a novel hot rolling process (patent pending) developed by the author (fig. 4), which results in an ultra-fine grain size [4], is given here. in tncp, the steel is first heated to above the ac3 temperature to transform it to a homogeneous austenitic structure. the annealing temperature is often not more than 1000 °c, and in any case not more than 1150 °c, in order to prevent excessive growth of the austenite grains. the holding time at that temperature is also constrained, because the austenite grain size before rolling must be less than 20 μm, and preferably 10 μm or less. the steel is then cooled below the non-recrystallization temperature, tnr, for hot rolling. tnr depends on the steel composition and is often even above the ac3 temperature, but usually not more than 1050 °c.
the structure of the steel at this point is austenitic, but no substantial recrystallization of the flat, elongated austenite grains occurs during hot rolling, which can also be continued below the ar3 temperature. the total reduction ratio to be achieved in rolling can be rather low, too, but usually not less than 20 to 30 %. after rolling, the steel is cooled below the ar3 and ar1 temperatures, at which the austenite grains are transformed to various phases. during this cooling and transformation, the elongated austenite grains change to ultra-fine grains of ferrite, pearlite, etc., depending on the steel composition and cooling rate.

fig. 4: schematic time-temperature cycle for producing an ultra-fine grain structure in unalloyed and low-alloy steels by tncp [9]

an example of an ultra-fine-grained microstructure produced by tncp is shown in fig. 5. the grain size was about 3 μm. the microalloyed test steel contained 0.08 % carbon, 0.20 % silicon, 1.68 % manganese and 0.04 % niobium, and it was in the conventional hot-rolled state before the treatment. the tncp treatment was carried out according to the principles shown in fig. 4. specimens of thickness 8 mm were heated to the austenite region in an air furnace (900 °c, 30 min) and then cooled to 800 °c. hot rolling with one pass was carried out at this temperature using a laboratory roller with a reduction ratio of 36 %. after rolling, the specimens were cooled in a water spray at a rate of about 15 °c/s.

fig. 5: microstructure of the ultra-fine-grained tncp metal

the charpy v impact toughness of the tncp steel was still very good (114 j) at −90 °c, the lowest test temperature used. it can be proposed that this steel is still tough at least at −100 °c.
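the processing window described above amounts to a handful of constraints; the following minimal sketch encodes them and checks the trial schedule for the nb-microalloyed test steel. note that 1050 °c is used here only as the generic upper bound on tnr quoted in the text, not as a measured value for this particular steel:

```python
# a schematic check of the tncp processing window described above.
# limits: soak at <= 1150 C (preferably <= 1000 C), austenite grain size
# before rolling < 20 um (preferably <= 10 um), rolling below t_nr, and a
# total reduction of at least about 20-30 %.

def tncp_window_ok(soak_c: float, roll_c: float, t_nr_c: float,
                   reduction_pct: float, austenite_grain_um: float) -> bool:
    return (soak_c <= 1150.0                 # avoid austenite grain growth
            and austenite_grain_um < 20.0    # grain size before rolling
            and roll_c < t_nr_c              # roll below non-recryst. temp.
            and reduction_pct >= 20.0)       # minimum total reduction

# trial schedule from the text: 900 C / 30 min soak, one pass at 800 C,
# 36 % reduction; the austenite grain size of 10 um is the preferred value.
print(tncp_window_ok(soak_c=900.0, roll_c=800.0, t_nr_c=1050.0,
                     reduction_pct=36.0, austenite_grain_um=10.0))  # True
```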
4 conclusions

the superior mechanical properties obtained in ultra-fine-grained steels, and the novel hot rolling process, tncp, for producing these steels introduced here, lead to the following conclusions:
1. tensile/yield strength can be markedly improved by producing steels with ultra-fine grain sizes of usually not more than 3 μm.
2. high absorbed energies also occur in the charpy v impact test at very low temperatures, thus showing the super-toughness of these ultra-fine-grained steels.
3. a novel hot rolling process, tncp, is one potential candidate for the commercial production of ultra-fine-grained steels in the future.

references
[1] sato s.: world of steel (japan), vol. 48, (1997), no. 12, p. 14–15.
[2] furukawa t.: american metal market, vol. 106, (1997), no. 84.
[3] hanamura t., yamashita t., umezava o., torizuka s., nagai k.: proc. of the int. symp. on ultrafine grained steels (isugs 2001), uminonakamichi, fukuoka, japan, september 20–22, 2001, p. 228–231.
[4] leinonen j.: pct patent application pct/fi00/00902, 2000.
[5] leinonen j. i.: advanced materials & processes (usa), vol. 159, nov. 2001, p. 31–33.
[6] hodgson p. d., hickson m. r., gibbs r. k.: u.s. patent 6,027,587, 2000.
[7] howe a. a.: "ultrafine grained steels: industrial prospects". materials science and technology, vol. 16, november–december 2000, p. 1264–1266.
[8] priestner r., ibraheem a. k.: "ultra-fine grained ferritic steel". proceedings of the j. j. jonas symposium on thermomechanical processing of steel, august 20–23, 2000, ottawa, ontario, canada, p. 351–363.
[9] leinonen j. i.: "a novel process results in ultra-fine structure and improved properties". proc. of the 2nd int. symp. on high strength steel, verdal, norway, 23–24 april 2002, 10 p.
[10] kimura h., akiniwa y., tanaka k., kondo j., ishikawa t.: "microstructural effect on the initiation and propagation behavior of fatigue cracks in ultrafine-grained steel". proc. of the int. symp. on ultrafine grained steels (isugs 2001), uminonakamichi, fukuoka, japan, september 20–22, 2001, p. 248–251.
[11] fukui t., yajima h., ishikawa t., koseki t.: "microstructural effect on corrosion fatigue properties in ultrafine grained steel". proc. of the int. symp. on ultrafine grained steels (isugs 2001), uminonakamichi, fukuoka, japan, september 20–22, 2001, p. 240–243.
[12] koyama y.: "high strength and high elongation tubular products "history steel tube" with good bendability". kawasaki steel technical report, oct. 2000, no. 43, p. 55–57.
[13] chen w., peng y., wan c., bao g., tian z.: "welding thin plate of 400 mpa grade ultra-fine grained steel using co2 laser". proc. of the int. symp. on ultrafine grained steels (isugs 2001), uminonakamichi, fukuoka, japan, september 20–22, 2001, p. 252–255.
[14] ito r., shiga c., kawaguchi y., nakamura t., hiraoka k., hayashi t., torizuka s.: "controlling of the softened region in weld heat affected zone of ultra fine grained steels". isij international, vol. 40 (2000), p. s29–s33.

jouko i. leinonen, ph.d. (eng.)
phone: +358 50 325 0926
fax: +358 8 553 2165
e-mail: jouko.leinonen@oulu.fi
department of mechanical engineering
university of oulu
linnanmaa, fin-90014 oulu, finland

heat transfer enhancement and pressure drop of grooved annulus of double pipe heat exchanger

putu wijaya sunu∗, i made rasta

mechanical engineering department, bali state polytechnic, bali, indonesia
∗ corresponding author: wijayasunu@pnb.ac.id

acta polytechnica 57(2):125–130, 2017, doi:10.14311/ap.2017.57.0125, © czech technical university in prague, 2017, available online at http://ojs.cvut.cz/ojs/index.php/ap

abstract. this investigation was performed to experimentally investigate the enhancement of heat transfer and the friction of an annulus in a double pipe heat exchanger system with rectangular grooves in the turbulent flow regime. the shell is made of acrylic and its diameter is 28 mm. the tube is made of aluminium and its diameter is 20 mm. grooves were incised in the annulus room with a circumferential pattern, with a groove space of 2 mm, a distance between the grooves of 8 mm and a groove height of 0.3 mm. the experiments consist of temperature and pressure measurements and a flow visualization. throughout the investigation, the cold fluid flowed in the annulus room. the reynolds number of the cold fluid varied from about 31981 to 43601 in a counter-flow condition. the volume flow rate of the hot fluid remained constant, with a reynolds number of about 30904. the results showed the effect of the grooves applied in the annulus room. the grooves induce a pressure drop; the pressure drop in the grooved annulus was greater by about 15.82 % to 16.72 % than the one in the smooth annulus. the total heat transfer enhancement is 1.09–1.11. moreover, the use of grooves in the annulus of the heat exchanger not only increases the heat transfer, but also increases the pressure drop, which is related to the friction factor.

keywords: heat transfer; friction factor; circumferential rectangular grooves; annulus.
1. introduction

heat exchangers are the most important devices for handling thermal energy in many industrial applications, such as power plants, waste heat recovery, chemical, automotive and food-handling processes, refrigeration and air conditioning, and many other applications. the important parameters of a heat exchanger are the temperature of each fluid at the inlet and outlet, the flow rate of each fluid and the pressure drop [1]. further reasons to consider the application of a heat exchanger are simplicity and easy maintenance; the double pipe heat exchanger is the best candidate in this respect. to improve the thermal performance of a heat exchanger, many techniques have been developed. one of the surface modifications for increasing the heat transfer in turbulent flow is grooving. the groove, as a passive technique, was developed for these applications because it does not require additional energy, needs no auxiliary power, is quite simple and reduces the weight of the heat exchanger. therefore, it is widely used in modern heat exchangers and other cooling equipment [2]. the grooves increase the surface area with little change in diameter and potentially affect not only the heat transfer, but also the pressure drop. many investigations have been done to improve the flow characteristics and friction around grooves [2–9]. in past years, most research focused on rectangular grooves [10–16], and studies on various shapes of grooves have also been done to improve the flow and the flow structure details [17–20]. over the years, much research on the fluid flow through grooves and corrugated tubes and their influence on the heat transfer characteristics has been carried out [21–28]. although much research has focused on the groove shape and the flow structure correlated to a heat transfer enhancement in various applications, little has been done on the annulus of a heat exchanger. thus, this study fills an important gap in the literature by demonstrating and describing the feasibility of these surface techniques for an annulus application. the specific objective and scope of the presented work is to evaluate the heat transfer performance and to observe the flow phenomenon and friction through the grooved annulus in a double pipe heat exchanger.

2. experimental apparatus and method

the sketch of the experimental apparatus for this investigation is shown in figure 1.

figure 1. the schematic representation of the experiment apparatus set-up.

the apparatus was designed to determine the pressure drop and heat transfer and to allow a flow visualization. this study used a counter-flow arrangement and water as the working fluid; in a counter-flow heat exchanger, the hot-side inlet temperature is in contact with the cold-side outlet temperature and vice versa. the experimental installation consisted of two centrifugal pumps that circulate hot and cold water through a pipe system loop. a rotameter was installed for varying the volume flow rate of the hot and cold fluid. a minimum of 11 l/min was used and it was increased up to 15 l/min. this flow rate correlated to the reynolds number re of the annulus room, which was about 31981 and 43610, respectively. meanwhile, the re for the hot fluid was maintained constant at about 30904. the hot water temperature was 50 ± 0.5 °c and the cold water temperature was 30 ± 0.5 °c.
the heat exchanger test section was 50 cm long, with a tube diameter of 20 mm and a shell diameter of 28 mm. the tube was made of aluminium and the shell side was made of an acrylic tube for an easier flow visualization. the annulus room was etched by circumferential rectangular grooves on the outer surface of the tube side. the grooves were formed by a conventional etching technique.

figure 2. schematic of groove.

k-type thermocouples were used to measure the temperature of the inlet and outlet of the hot and cold fluid in the test section. the signals from the thermocouples were then digitized using a data logger and recorded in a computer memory for 600 s. pressure transducers mpx 5050 d were used for the pressure measurement, which took place upstream and downstream of the annulus section, with a sampling frequency of 10 hz. dye was used as the medium for the visualization, recorded with a high-speed camera at 240 fps.

2.1. data processing

the aim of the current investigation is to determine the heat transfer characteristics and friction in a grooved annulus of a double pipe heat exchanger. the parameters of interest are the reynolds number re, the friction factor f, the heat transfer q, the ntu and the effectiveness ε. the reynolds number is given by
$$\mathrm{Re} = \frac{u D_h}{\nu}. \quad (1)$$
the hydraulic diameter of the annulus is calculated from
$$D_h = \frac{4 \cdot \frac{\pi}{4}\left(d_2^2 - d_1^2\right)}{\pi (d_2 + d_1)} = d_2 - d_1. \quad (2)$$
the friction factor f is calculated from the pressure drop as
$$f = \frac{2\, \Delta p\, D_h}{L\, \rho u^2}. \quad (3)$$
the heat transfer, the heat transfer enhancement and the heat capacity ratio are defined as
$$q = \dot{m} c_p \Delta T, \quad (4)$$
$$E_h = \frac{q_{\mathrm{groove}}}{q_{\mathrm{smooth}}}, \quad (5)$$
$$C = \frac{C_{\min}}{C_{\max}}. \quad (6)$$
the ntu and the effectiveness follow from
$$\mathrm{NTU} = \frac{U A}{C_{\min}}, \qquad U = \frac{\mathrm{NTU}\, C_{\min}}{A}, \quad (7)$$
$$\varepsilon = \frac{1 - \exp\left(-\mathrm{NTU}(1 - C)\right)}{1 - C \exp\left(-\mathrm{NTU}(1 - C)\right)}. \quad (8)$$

2.2. uncertainty analysis

the experimental uncertainty was determined from the deviations of the parameters, including temperature, pressure, and flow rate. the precision of the temperature data acquisition is ±0.1 °c, the pressure drop was measured by pressure transducers with an accuracy of ±2.5 %, and the flow rate was measured by a rotameter with a full-scale accuracy of ±5 %. the uncertainty of the experimental results can be expressed as follows [29]:
$$U_m = \pm \left( \left(\frac{\Delta T}{T}\right)^2 + \left(\frac{\Delta p}{p}\right)^2 + \left(\frac{\Delta V}{V}\right)^2 \right)^{1/2}. \quad (9)$$
therefore, the experimental uncertainty was about 5 %.
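the data reduction of equations (1)–(3) and (7)–(9) is straightforward to script; a minimal sketch follows for the geometry used here, where the water properties and the input values of u, Δp and the relative errors are illustrative assumptions rather than measured values from this experiment:

```python
import math

# transcription of eqs. (1)-(3) and (7)-(9) for the d1 = 20 mm / d2 = 28 mm
# annulus with a 50 cm test length. property and input values are
# illustrative only.

D1, D2, L = 0.020, 0.028, 0.50   # tube od, shell id, test length [m]
NU = 0.801e-6                    # kinematic viscosity of water ~30 C [m^2/s]
RHO = 995.7                      # density of water ~30 C [kg/m^3]

D_H = D2 - D1                    # eq. (2): hydraulic diameter of the annulus

def reynolds(u):                 # eq. (1)
    return u * D_H / NU

def friction_factor(dp, u):      # eq. (3)
    return 2.0 * dp * D_H / (L * RHO * u ** 2)

def effectiveness(ntu, c):       # eq. (8), valid for c < 1
    x = math.exp(-ntu * (1.0 - c))
    return (1.0 - x) / (1.0 - c * x)

def overall_uncertainty(dt, dp, dv):   # eq. (9), relative errors
    return math.sqrt(dt ** 2 + dp ** 2 + dv ** 2)

u, dp = 0.6, 1500.0              # assumed velocity [m/s] and pressure drop [Pa]
print(f"re = {reynolds(u):.0f}, f = {friction_factor(dp, u):.3f}")
print(f"eps(ntu=0.8, c=0.7) = {effectiveness(0.8, 0.7):.3f}")
print(f"u_m = {overall_uncertainty(0.003, 0.025, 0.05):.3f}")   # ~0.056
```

with the quoted instrument accuracies, the flow-rate term dominates eq. (9), which is why the overall uncertainty lands near the ±5 % of the rotameter.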
3. results and discussion

preliminary experiments were done using the smooth annulus of a double pipe heat exchanger to determine the amount of the pressure drop. these data were then used as a comparison for the grooved annulus.

figure 3. pressure drop and friction factor of double pipe heat exchanger: (a) overall pressure drop; (b) friction factor.
figure 4. the variation of heat transfer with reynolds number.

figure 3a shows the overall pressure drop of the double pipe heat exchanger at various values of the reynolds number. from figure 3a, it can be seen that the application of grooves in the annulus room of a double pipe heat exchanger leads to a substantial increase of the pressure drop, about 15.82 % to 16.72 % compared to that of the smooth annulus. the grooves enhance the turbulence by increasing the recirculation region near the pipe wall; this phenomenon tends to increase the shear stress and the pressure drop. figure 3a indicates a linear pressure drop for an increased re of the annulus. the influence of the circumferential rectangular grooves on the friction of the annulus is displayed in figure 3b. in figure 3b, it can be seen that the grooved annulus tends to increase the friction factor when compared to the smooth annulus. the friction factor decreases with the increasing re. the friction loss is mainly caused by the increased surface area, a higher recirculation intensity and a stronger vortex flow around the groove (figure 7). figure 4 shows the total heat transferred by the heat exchanger over a period of time. it was found that the heat transfer increased with the increase of the reynolds number, and the grooved annulus gave a higher total heat transfer than the smooth annulus. from this figure, the heat transfer enhancements were calculated using (5); the highest enhancement, observed at the highest re, was 1.11. the swirl flow of the fluid near the pipe wall generated by the groove was responsible for thinning the thermal boundary layer and thus increased the heat transfer through the fluid.

figure 5. effectiveness and ntu relationship.

the effectiveness and the ntu relationship of the investigation are presented in figure 5. numbers 1 to 5 represent the heat capacity ratio c of this experiment at a specific reynolds number re. the ntu and effectiveness ε of the grooved annulus were compared to those of the smooth annulus. it was found that the heat capacity ratio influenced the ntu and the effectiveness: figure 5 shows that an increased heat capacity ratio caused a decrease in the ntu and effectiveness. at the same heat capacity ratio, the effectiveness and the ntu of the grooved annulus were higher than those of the smooth annulus. the increase of the effectiveness is indicated by the increased rate of the actual heat transfer, while the increase of the ntu indicates an increase of the overall heat transfer coefficient and surface area (ua) value. the grooves lead to an increase of turbulence; therefore, the fluid interaction and momentum also increase. the effectiveness and the ntu of the grooved annulus were close to the results from [30]. to give more detailed information about the heat transfer phenomenon in the grooved annulus, a visualization technique was developed. the flow structures over the annulus were visualized using a dye technique to give more understanding about what happens inside the groove valley.

figure 6. visualisation of the flow in the smooth double pipe heat exchanger at 240 fps: (a) initial condition; (b) condition after 0.004 s.
figure 7. visualisation of the flow in the grooved double pipe heat exchanger at 240 fps: (a) initial condition; (b) condition after 0.004 s.

the images were transferred to a computer for image processing and analysis, as shown in figures 6 and 7. the small round shape is the fluid lump considered to be the vortex near the groove. the change of the vortex dimensions indicates a change of the velocity gradient inside it. after 0.004 s, in the smooth annulus of the double pipe heat exchanger (figure 6), only a small change of the vortex dimensions can be seen; this indicates a small gradient of the fluid velocity in the annulus space. however, in the case of the grooved annulus, the vortex dimensions became smaller and the vortices tended to fall into the groove valley. the vortices crept inside the valley and increased the velocity gradient and shear stress. the presence of a velocity gradient was proved by the value of the pressure drop (figure 3a).
it is well known that the pressure drop is proportional to the square of the velocity. this phenomenon increases the turbulence intensity, recirculation and fluid momentum. this process disrupts the thermal boundary layer, so that the resistance to the heat transfer becomes smaller; this is what increases the heat transfer rate of the grooved double pipe heat exchanger.

4. conclusions

an experimental investigation was carried out for the heat transfer enhancement, the ntu, the effectiveness ε, the pressure drop and the flow visualization of a grooved annulus of a double pipe heat exchanger. the main scope of this study was to compare a grooved double pipe heat exchanger with a smooth double pipe heat exchanger. moreover, new and simple flow visualizations were developed. the results can be summarized as follows:
(1.) the heat transfer increased with the increase of the re. for the grooved double pipe heat exchanger, the heat transfer enhancement was about 1.09–1.11 and the pressure drop increased by about 15.82–16.72 %.
(2.) the effectiveness increased with the decrease of the heat capacity ratio, and so did the ntu value.
(3.) grooves influence the flow structure in the annulus of a double pipe heat exchanger.

list of symbols
u fluid velocity [m/s]
a surface area [m2]
ν kinematic viscosity [m2/s]
l pipe length [m]
ṁ mass flow rate [kg/s]
ρ fluid density [kg/m3]
dh hydraulic diameter [m]
d1 outer diameter of the tube side [m]
d2 inner diameter of the shell side [m]
∆p pressure drop [pa]
∆t temperature difference of the hot/cold fluid over the time period [°c s]
q heat transfer [j]
cp specific heat [j/kg °c]
eh heat transfer enhancement
ε effectiveness
u overall heat transfer coefficient [w/m2 °c]
cmin the smallest heat capacity rate [w/°c]
cmax the highest heat capacity rate [w/°c]
c heat capacity ratio
ntu number of transfer units

references
[1] jan opatril, jan havlik, ondrej batros, tomas dlouhy. an experimental assessment of the plate heat exchanger characteristics by wilson plot method. acta polytechnica 56(5), 2016, 367-372.
[2] sunu p.w., ing wardana, a.a. sonief, nurkholis hamidi. the effect of wall groove numbers on pressure drop in pipe flows. int. j. fluid mech. resch., 42(2), 2015, 119-130.
[3] hongwei, m.a., qiao tian, and hui wu. experimental study of turbulent boundary layers on groove/smooth flat surfaces. j. of thermal science 14(3), 2005, 93-97.
[4] litvinenko, y.a., chernoray, v.g., kozlov, v.v., loefdahl, l., grek, g.r. and chun, h.h. the influence of riblets on the development of a structure and its transformation into a turbulent spot. doklady physics 51(3), 2006, 144-147.
[5] sunu p.w. the characteristics of increased pressure drop in pipes with grooves. adv. studies theor. phys., vol. 9, 2015, no. 2, 57-61.
[6] sunu p.w., ing wardana, a.a. sonief, nurkholis hamidi. flow behavior and friction factor in internally grooved pipe wall. adv. studies theor. phys., vol. 8, 2014, no. 14, 643-647.
[7] sunu p.w., ing wardana, a.a. sonief, nurkholis hamidi. optimal grooves number for reducing pressure drop. contemporary engineering sciences, vol. 9, 2016, no. 22, 1067-1074.
[8] ding g., haitao hu, huang x., bin deng, yifeng gao. experimental investigation and correlation of two-phase frictional pressure drop of r410a–oil mixture flow boiling in a 5 mm microfin tube. international journal of refrigeration 32, 2009, 150-161.
[9] frohnapfel, b., lammers, p., jovanović, j. the role of turbulent dissipation for flow control of near-wall turbulence. new res. in num. and exp. fluid mech. vi, nnfm 96, 2007, 268-275.
[10] s. lorenz, d. mukomilow, w. leiner. distribution of the heat transfer coefficient in a channel with periodic transverse grooves. exp. therm. fluid sci. 11, 1995, 234-242.
[11] t. adachi and h. uehara. correlation between heat transfer and pressure drop in channels with periodically grooved parts. int. j. heat mass transfer 44, 2001, 4333-4343.
[12] s. eiamsa-ard and p. promvonge. numerical study on heat transfer of turbulent channel flow over periodic grooves. int. commun. heat mass transfer 35, 2008, 844-852.
[13] s. eiamsa-ard and p. promvonge. thermal characteristics of turbulent rib-grooved channel flows. int. commun. heat mass transfer 36, 2009, 705-711.
[14] t. adachi, y. tashiro, h. arima, y. ikegami. pressure drop characteristics of flow in a symmetric channel with periodically expanded grooves. chem. eng. sci. 64, 2009, 593-597.
[15] m. jain, a. rao, k. nandakumar. numerical study on shape optimization of groove micromixers. microfluid nanofluid 15, 2013, 689-699.
[16] c. wang, z.l. liu, g.m. zhang, m. zhang. experimental investigations of flat plate heat pipes with interlaced narrow grooves or channels as capillary structure. exp. therm. fluid sci. 48, 2013, 222-229.
[17] sutardi and c.y. ching. effect of transverse square groove on a turbulent boundary layer. experimental thermal and fluid science, 20, 1999, 1-10.
[18] lee, s.j. and y.g. jang. control of flow around a naca 0012 airfoil with a micro-riblet film. journal of fluid and structure, 20, 2005, 659-672.
[19] shan huang. viv suppression of a two degree of freedom circular cylinder and drag reduction of a fix circular cylinder by the use of helical grooves. journal of fluids and structures, 27, 2011, 1124-1133.
[20] lee, s.j. and s.h. lee. flow field analysis of a turbulent boundary layer over a riblet surface. experiments in fluids, 30, 2001, 153-166.
[21] sunu p.w., anakottapary d.s., santika w.g. temperature approach optimization in the double pipe heat exchanger with groove. matec web of conference 58, 04006, 2016. doi:10.1051/matecconf/20165804006.
[22] aroonrat, k., jumpholkul, c., leelaprachakul, r., dalkilic, a.s., mahian, o., wongwises, s. heat transfer and single-phase flow in internally grooved tube. international communication in heat and mass transfer 42, 2013, 62-68.
[23] katoh, k., k.s. choi, t. azuma. heat-transfer enhancement and pressure loss by surface roughness in turbulent channel flows. international journal of heat and mass transfer, 43, 2000, 4009-4017.
[24] jian liu, gongnan xie, terrence w. simon. turbulent flow and heat transfer enhancement in rectangular channels with novel cylindrical grooves. international journal of heat and mass transfer 81, 2015, 563-577.
[25] k. bilen, m. cetin, h. gul, t. balta. the investigation of groove geometry effect on heat transfer for internally grooved tubes. appl. therm. eng. 29, 2009, 753-761.
[26] r. kamali, a. binesh. the importance of rib shape effects on the local heat transfer and flow friction characteristics of square ducts with ribbed internal surfaces. int. commun. heat mass transf. 35, 2008, 1032-1040.
[27] ali najah al-shamani, k. sopian, h.a. mohammed, sohif mat, mohd hafidz ruslan, azher m. abed. enhancement heat transfer characteristics in the channel with trapezoidal rib-groove using nanofluids. case studies in thermal engineering, 5, 2015, 48-58.
[28] hamed sadighi dizaji, samad jafarmadar, farokh mobadersani. experimental studies on heat transfer and pressure drop characteristics for new arrangements of corrugated tubes in a double pipe heat exchanger. international journal of thermal sciences, 96, 2015, 211-220.
[29] moffat, r. j. describing the uncertainties in experimental results. experimental thermal and fluid science, 1, 1988, 3-17.
[30] kays w.m. and london a.l. compact heat exchangers. 3rd ed. new york: mcgraw-hill, 1984.

an investigation of corrosion of friction welded and post-weld heat-treated aa6061/sic/graphite hybrid composites

jadamuni senthilkumar a,∗, pavan s. m. kumar b, manickam balasubramanian c

a sathyabama institute of science and technology, department of mechanical engineering, sholinganallur, 600119 chennai, india
b anand institute of higher technology, department of mechanical engineering, kazhipattur, 603103 chennai, india
c rmk college of engineering and technology, department of mechanical engineering, puduvoyal, 601206 chennai, india
∗ corresponding author: jsenthil_trt@rediffmail.com

acta polytechnica 61(3):456–464, 2021, https://doi.org/10.14311/ap.2021.61.0456, © 2021 the author(s), licensed under a cc-by 4.0 licence, published by the czech technical university in prague.

abstract. aluminium-based hybrid metal matrix composites have noteworthy applications in sub-sea installations, structures of deep-sea crawlers, submarine parts, engine cylinders, drum brakes etc., as they possess high strength, corrosion resistance, and chemical and dimensional stability. in this investigation, the pitting corrosion behaviour of friction welded and post-weld heat-treated aa6061/sic/graphite hybrid composites was analysed. the corrosion rates of aw (as welded), st (solution treated), sta (solution treated and aged), and aa (artificially aged) weld joints were experimentally determined. the corrosion behaviour has been discussed in the light of the microstructure. the experimental results revealed that the sta joints exhibited better corrosion resistance characteristics as compared to the aw, aa, and st joints. the corrosion rate was the highest for the aw joints, followed by the aa and st joints, respectively. taking into account the corrosion rates of the aw and sta joints, the sta joints have a corrosion rate 34.6 % lower than that of the aw joints. a comparison of the aa and st joints with the sta joints reveals that the rate of corrosion for the sta joints was 31.1 % lower than that of the aa joints and 28.8 % lower than that of the st joints. a lower corrosion rate was thus observed for the sta joints as compared to the aa, aw, and st joints.

keywords: corrosion, aluminium, friction welding, pwht, hybrid composite.

1. introduction

aluminium-based hybrid metal matrix composites have promising applications in engine cylinder blocks, pistons, disc brakes, aircraft wing panels, etc., as they possess high strength, corrosion resistance, chemical stability, and dimensional stability at high temperatures [1]. solidification cracking, hot tearing, and fissuring are some common defects observed in fusion joints of aluminium alloys. these undesirable features in the welded joints can be eliminated by using a solid-state welding process like continuous drive friction welding [2].
nunes et al. investigated the corrosion behaviour of silicon carbide/aluminium metal matrix composites. the cathodic oxygen reduction process was the main driving force for the corrosion process. in al-si-mg composites, the eutectic silicon, the sic particles, and the precipitated phases prove to be the active cathodic sites for the corrosion process; the pitting resistance can be improved through anodization and the application of ceria coatings [3]. mcintyre et al. investigated the pitting behaviour of aluminium-sicw composites. pit nucleation takes place due to the presence of sic intermetallics, which tend to form at dislocations, voids, and grain boundaries; the sicw can act as efficient nucleation sites for intermetallic formation in sic/aa2124 composites [4]. the pitting behaviour of gr-al and sic-al metal matrix composites has also been examined, the corrosion resistance of the unreinforced alloy being higher than that of the reinforced alloy [5]. the galvanic corrosion behaviour of aluminium matrix composites reveals that aluminium coupled to graphite fibre in aerated 3.15 wt % nacl shows a high corrosion current density value, which in turn corresponds to a high corrosion rate [6]. after the friction welding process, aa6061/sic/graphite hybrid composites show mg2si as the predominant precipitate present in all the joints [7]. sori won et al. studied the corrosion behaviour of friction welded dissimilar aluminium alloys; an improvement in the corrosion performance was observed in aa6063 due to the growth of the passivation layer [8]. if the corrosion potential is more positive, pitting corrosion is initiated in the aluminium alloy matrix; mathematical models can be used for the prediction of accurate results [9]. abhishek sharma et al. reported that an al6061-sic-graphite hybrid surface composite has an improved corrosion resistance due to the decrease of the interfacial corrosion; this can be attributed to the formation of a graphite layer that behaves as a passive layer, which resists the corrosion [10]. hot extrusion produces a high level of deformation in al6061-sic-graphite composites, which results in improved mechanical and microstructural properties; an extruded component has the lowest corrosion rate [11]. sriram vikas et al. studied the influence of post-weld heat treatment (pwht) on the pitting corrosion behaviour of dissimilar aluminium alloy friction stir welds. all the regions of the weld have an improved pitting resistance after pwht, which can be attributed to the precipitation of fine sub-microscopic particles during natural ageing; the active sites for pit initiation are reduced due to the dissolution of secondary intermetallic precipitation [12]. sunitha et al. reported that al6065/sic/graphite hybrid composites show a good corrosion resistance if the graphite percentage of the composite is low [13]. subba rao et al. reported the influence of heat treatment on the corrosion behaviour of aluminium metal matrix composites: cast composites show larger pits in greater numbers compared to heat-treated composites, which indicates that a cast composite has a poorer corrosion resistance, while heat-treated composites show smaller pits in lower numbers, which in turn plays a role in the corrosion minimization [14].
the intermetallic phases in aluminium alloys prove to be the regions where a localized form of corrosion occurs; the corrosion behaviour of reinforced alloys is influenced by these intermetallic phases [15]. the corrosion behaviour of aa7075 aluminium alloy friction stir welds after pwht has been examined: the size and spacing of the precipitates influence the pitting corrosion. a large spacing and discontinuous precipitates reduce the susceptibility to pitting corrosion, while finer grains and continuous precipitates promote it [16]. the corrosion behaviour of aa7075 aluminium metal matrix composites was studied in a 3.5 wt % nacl aqueous solution: for low ph values (i.e., acidic), the corrosion rate was larger, and with regard to the volume percentage, the corrosion rate decreased with an increasing reinforcement content [17]. trzaskoma investigated the pit morphology of an aluminium alloy and of silicon carbide/aluminium alloy metal matrix composites. the composites have a greater number of pit initiation sites than the alloy due to a greater number of intermetallic phases present in the composite; the large number of detrimental intermetallic phases found on the composites leads to the formation of more pits, and the metal dissolution rates are larger in the case of the composite than in that of the alloy [18]. bhat et al. investigated the corrosion behaviour of silicon carbide particle reinforced 6061 al alloy composites. the composites corrode faster than the base alloy because the reinforcement interfaces give rise to crevices or pits; this can be attributed to the fact that a thin layer of reaction product present at the interface acts as an effective cathode, enabling a higher corrosion rate. the homogeneous distribution of particles and the absence of defects like pores make extruded composites less susceptible to corrosion [19]. sunil sinhmar et al. investigated the corrosion behaviour of a friction stir weld joint of aa2014 aluminium alloy. due to the dynamic recrystallization and plastic deformation experienced during the welding, a grain refinement happened in the weld zone; the dissolution and coarsening of the precipitates increase the corrosion loss, and the precipitate-free zones, which are comparatively soft, act as anodes and increase the corrosion loss further [20]. farhad gharavi et al. analysed the corrosion behaviour of friction stir welded lap joints of aa6061-t6 aluminium alloy. a localized corrosion attack in the form of semi-pitting and circumferential pits was generated around the strengthening precipitates; the higher number of intermetallic particles leads to cathodic reactions, and the corrosion resistance of the weld regions was poorer as compared to the parent alloy [21]. sara bocchi et al. investigated the influence of process parameters on the corrosion behaviour of friction stir welded aluminium joints: during the welding process, an orientation is produced in the grain structure, and the coarsening and dissolution of precipitates enhanced the corrosion loss [22]. vinoth jebaraj et al. analysed the corrosion behaviour of aluminium alloy 5083 and its weldment for marine applications: second-phase particles enriched with mg are formed during the welding, and the parent metal exhibited a better corrosion resistance as compared to the weldment [23]. ramesh et al. carried out an experimental erosion-corrosion analysis of friction stir welds of aa5083 and aa6061 for sub-sea applications.
the erosion was the highest for the basic material, and it decreased with decreasing ph values. the corrosion and erosion were higher in the centre of the weld as compared to the base material [24]. shuai li et al. made experimental studies on the corrosion behaviour and mechanical properties of an al-zn-mg aluminium alloy weld. a comparison between the base metal and the weld joints revealed that the base metal possesses superior mechanical properties. the variance in corrosion potential depends on the number of precipitates and the difference in the composition of constituents [25].

2. experimental details
2.1. materials
aa6061 aluminium alloy was used as the matrix material. silicon carbide with a weight fraction of 10 % was used as the primary reinforcement and graphite was used as the secondary reinforcement with a weight fraction of 5 %. the chemical composition of aa6061 is given in table 1.

mg    si    fe    cu    cr    mn    zn    ti    al
0.9   0.62  0.33  0.28  0.17  0.06  0.02  0.02  bal.
table 1. chemical composition (wt. %) of aa6061.

2.2. fabrication of composites
the stir casting process was used to fabricate the cast samples. aa6061 alloy billets were melted at 650 °c. at this stage, pre-heated particles of sic and graphite were added and the samples were cast. the samples were then subjected to machining and specimens of 45 mm length and 16 mm diameter were produced. then, the process of hot extrusion was carried out at a temperature of 450 °c. after the extrusion process, specimens of increased length and decreased diameter were obtained. the extruded specimens after machining had final dimensions of 60 mm in length and 12 mm in diameter. fig. 1 shows the aluminium matrix before the extrusion and fig. 2 shows the aluminium matrix after the extrusion.

figure 1. al matrix before extrusion.
figure 2. al matrix after extrusion.

2.3. friction welding
the extruded specimens were joined in a continuous drive friction welding machine (model: spm-fw-3t), as shown in fig. 3. in this friction welding machine, the clamping was done by a hydraulically controlled chuck. the machine was equipped with a data acquisition and analysis software. the maximum speed of the machine spindle was 2 000 rpm. friction welding was carried out at a rotational speed of 1 600 rpm and an upset load of 3.5 kn for 3 seconds. in total, 32 joints were made, and the friction welded joints are shown in fig. 4. fig. 5 shows the as-welded condition (aw) covering the parent metal (pm), the heat affected zone (haz) and the weld zone (wz).

figure 3. friction welding machine.
figure 4. friction welded samples.

2.4. post weld heat treatment
eight joints were isolated and kept in the as-welded (aw) condition without a post-weld heat treatment. the remaining joints were kept in a muffle furnace for the heat treatment, as shown in fig. 6. solution treatment (st) was performed on eight joints at a temperature of 525 °c; the joints were quenched in water after a 1-hour soaking period in the furnace. eight specimens were subjected to the solution treatment followed by artificial ageing (sta), wherein the solution-treated samples were aged at 163 °c for 8 hours. the remaining eight specimens were subjected to artificial ageing (aa) at a temperature of 163 °c for 8 hours and then let to cool in the furnace itself. the post-weld heat-treated samples are shown in fig. 7.

2.5. corrosion test
a gill ac potentiostat was used to carry out the corrosion test. it has a 100 khz frequency response.
it can supply currents of up to 2 amps in the same style of enclosure. all the specimens were ground with emery paper to obtain a uniform standard surface. acetone was used to clean the specimens' surfaces before the exposure to the solution of 3.5 wt. % nacl. fig. 8 shows the specimens after the corrosion test. the specimens were cleaned and dried according to astm stp 534 after the corrosion tests.

figure 5. aw covering the parent metal, heat affected zone and weld zone.
figure 6. post weld samples in the furnace.
figure 7. post weld heat treated samples.
figure 8. specimens after the corrosion test.

2.6. microstructure analysis
the optical micrographs of the weld zones were captured for the aw, st, sta, and aa joints using standard metallographic procedures. the etchant used was keller's reagent. the microstructural analysis was performed using a 100 w halogen powered optical microscope (model: mm-400/800) with a magnification range of 10× to 400×.

3. results and discussion
3.1. microstructure of weld regions
fig. 9a shows the weld region of the aw sample. the weld region shows eutectic particles along with primary aluminium grains. micro voids were present in the vicinity of the grain boundaries. the grain structure seems to be coarse in this aw sample and the distribution of the reinforcement was also not uniform. the coarse grain structure can be attributed to the effect of the severe heat generated during the friction welding process. fig. 9b shows the weld region of the aa condition. the microstructure shows longer primary α-aluminium grains with eutectic particles. an agglomeration of silicon carbide particles can be seen in the microstructural image. due to the post-weld heat treatment process, the grain structure has been improved. fig. 9c shows the microstructure of the weld region of the st condition. the microstructure shows fine acicular primary α-aluminium with eutectic particles near the grain boundaries. a fine grain structure was visible and a uniform distribution of reinforcement particles can be seen in the microstructure. fig. 9d shows the microstructure of the weld region of the sta condition. the microstructure shows the primary aluminium grains with eutectic particles precipitated along the grain boundaries. the frictional stress during the friction welding process has produced a perpendicular re-orientation of the grain structure.

figure 9. microstructure of the weld regions of the samples: (a) aw, (b) aa, (c) st, (d) sta.

the grain boundary precipitates seem to be discontinuous and with large spacing between them. the homogeneous distribution of reinforcement particles can be seen in the microstructure. during the heat treatment process, the dissolution of secondary intermetallic precipitation took place and resulted in the formation of fine sub-microscopic particles in the structure. the precipitates expected to form from this alloy are mg2si precipitates. the precipitate formation enhances the strength and corrosion resistance of the composite. re-precipitation was observed to increase the hardness, while the recrystallization of finer grains enhanced the strength.
3.2. electrochemical measurement
the potentiodynamic polarization test results used to analyse the corrosion behaviour of the aw, aa, st and sta joints are shown in fig. 10. the corrosion potential (ecorr) and the corrosion current densities (icorr) obtained by tafel extrapolation are shown in table 2. the as-welded joints show a shift of ecorr towards more negative values. the rate of corrosion of the aw weld surface was found to be 9.735 mm/year. the rates of corrosion in the case of the aa and st joints were found to be 9.234 mm/year and 8.934 mm/year, respectively. a graphite layer gets formed at the al-sic interface, which reduces the corrosion. among the weld regions of the different joints, the sta has shown the lowest corrosion rate; the rate of corrosion was found to be 6.368 mm/year. an increased grain refinement seems to reduce the possibility of pitting corrosion.

figure 10. potentiodynamic curves of the different weld samples: (a) aw, (b) aa, (c) st, (d) sta.

condition  starting potential (mv)  reverse potential (mv)  sweep rate (mv/min)  rest potential ecorr (mv)  current density icorr (ma/cm2)  corrosion rate (mm/year)
aw         -250                     250                     100                  -793.09                    1.8021                          9.735
aa         -250                     250                     100                  -775.98                    1.6727                          9.234
st         -250                     250                     100                  -733.30                    1.5621                          8.934
sta        -250                     250                     100                  -653.67                    1.2170                          6.368
table 2. results obtained from the potentiodynamic plots.
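the paper reports only the extrapolated values; for readers who want to reproduce such an evaluation, the sketch below illustrates one common way a tafel extrapolation and the astm g102 rate conversion can be carried out in matlab. the data file name, the fitting windows, the rough rest-potential estimate and the material constants (equivalent weight and density of pure aluminium) are illustrative assumptions, not values taken from this work.

% hypothetical two-column potentiostat export: potential e (mv), current density i (ma/cm^2)
data = load('polarization.txt');             % assumed file name
e = data(:,1); i = data(:,2);
logi = log10(abs(i));

ecorr0 = -780;                               % rough rest-potential estimate (mv), assumed
ca = e < ecorr0 - 50 & e > ecorr0 - 150;     % cathodic tafel window (assumed)
an = e > ecorr0 + 50 & e < ecorr0 + 150;     % anodic tafel window (assumed)
pc = polyfit(logi(ca), e(ca), 1);            % linear fit e = p(1)*log10|i| + p(2)
pa = polyfit(logi(an), e(an), 1);

% the intersection of the two tafel lines gives ecorr and icorr
logicorr = (pa(2) - pc(2)) / (pc(1) - pa(1));
ecorr = polyval(pa, logicorr);               % mv
icorr = 10^logicorr;                         % ma/cm^2

% astm g102 conversion, assuming aluminium (ew = 8.99 g, rho = 2.70 g/cm^3)
cr = 3.27e-3 * (icorr*1e3) * 8.99 / 2.70;    % mm/year, icorr converted to ua/cm^2
fprintf('ecorr = %.1f mv, icorr = %.4f ma/cm2, cr = %.2f mm/year\n', ecorr, icorr, cr)

the constant 3.27e-3 comes from faraday's law; with composite electrodes, the effective equivalent weight differs from that of pure aluminium, which is one reason why such a conversion can deviate from instrument-reported rates.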
3.3. corrosion morphology of weld regions
the pitting corrosion behaviour of the weld regions of aw, st, sta, and aa is shown in fig. 11. the corrosion morphology of the weld region of aw is shown in fig. 11a. the aw weld region shows a higher rate of corrosion attack due to more sic intermetallic particles in the grain interiors and at the grain boundaries. a clustered attack can be seen around the intermetallic particles, which can be ascribed to pitting corrosion. the sic intermetallic, which gets formed at voids and grain boundaries, results in the pit initiation. at the interface of the matrix and the reinforcement, the sic intermetallic gets formed and acts as a cathode. the intermetallics were continuous in this aw weld region, which in turn increases the cathode-to-anode ratio and caused a higher corrosion rate. the distribution of particles in this aw weld region seems to be non-uniform, hence the higher corrosion loss. here, grain coarsening had happened due to the severe heat generated during the friction welding process; the coarse grains promote corrosion. the microstructure of the weld region of aa is shown in fig. 11b. the weld region shows the fine pitting corrosion that was caused by anode etching. the pitting spots were larger and resembled islands. the pitting spots were continuous, which resulted in a higher corrosion loss. dark fine particles of graphite agglomerates were found on the weld surface of the aa specimen. the increase in corrosion loss, in this case, can be attributed to the porous nature of the graphite particle agglomerates, which leads to the sucking in of the electrolyte. this, in turn, increases the corrosion current density, which ultimately increases the corrosion loss. the weld region of the st surface is shown in fig. 11c. the pits were caused by the etching process. here, the pit density was lower as compared to the aw and aa samples. the distribution of reinforcement particles was uniform and no agglomeration of graphite particles happened here, which also reduced the corrosion loss in this weld region.

figure 11. corrosion morphology of the different weld samples: (a) aw, (b) aa, (c) st, (d) sta.

with the reduction in the graphite particle concentration, there is a decrease in the number of potential cathodic sites, which has also reduced the rate of corrosion. the weld region of the sta is shown in fig. 11d. the weld region shows fine eutectic precipitates that were not continuous. the interspaces between the precipitates were also considerably larger. the microstructure shows a distribution of fine sub-microscopic particles throughout the matrix, which formed from the particles of the supersaturated solution and the dissolution of secondary intermetallic particles. the pit density was very low in this sta weld region. due to the discontinuous grain boundary precipitates with large spacing, no continuous chain exists for the corrosion to take place; as a result, the susceptibility to pitting corrosion was reduced. no particle agglomeration of sic and graphite happened here, which plays a role in the reduced corrosion loss of the sta joints. the microstructure of the sta joint weld surface reveals no voids and pores; they were eliminated, and this has reduced the chances of the pit initiation. during the sta condition, precipitate formation takes place. the fine precipitates enhance the corrosion resistance of the material. the grain structure in the case of the sta condition is also fine, which improves the corrosion resistance. all the above-mentioned reasons played a significant role in the reduction of the corrosion loss in the case of the sta joints.

4. conclusions
the aw (as-welded) joints exhibit more active open circuit potentials as compared to the aa, st and sta joints. the corrosion loss in the aw joints is high compared with the other joints. the sta joints exhibit the lowest corrosion loss as compared to the other joints. this is because of the fine grain structure, the very few agglomerates of reinforcement particles and the formation of discontinuous mg2si precipitates. the corrosion rate of the composite was lower than that of the matrix alloy. the heat treatment resulted in an improved corrosion resistance of the composite. the enhanced pitting corrosion is associated with the pores in the composites. the corrosion rate of the composites decreases due to the high bonding strength associated with composites. the solution treatment and ageing (sta) has improved the corrosion resistance characteristics of the sta joint.

references
[1] a. m. hassan, m. almomani, t. qasim, a. ghaithan. effect of processing parameters on friction stir welded aluminum matrix composites wear behavior. materials and manufacturing processes 27(12):1419-1423, 2012. https://doi.org/10.1080/10426914.2012.700156.
[2] m. şahin. joining of aluminium and copper materials with friction welding. the international journal of advanced manufacturing technology 49:527-534, 2010. https://doi.org/10.1007/s00170-009-2443-7.
[3] p. nunes, l. ramanathan. corrosion behavior of alumina-aluminum and silicon carbide-aluminum metal-matrix composites. corrosion 51:610-617, 1995. https://doi.org/10.5006/1.3293621.
[4] j. f. mcintyre, r. conrad, s. golledge. technical note: the effect of heat treatment on the pitting behavior of sicw/aa2124. corrosion 46:902-905, 1990. https://doi.org/10.5006/1.3580856.
[5] d. aylor, p. moran. effect of reinforcement on the pitting behavior of aluminum-base metal matrix composites. journal of the electrochemical society 132:1277-1281, 1985. https://doi.org/10.1149/1.2114101.
[6] l. hihara, r. latanision.
galvanic corrosion of aluminum-matrix composites. corrosion 48:546-552, 1992. https://doi.org/10.5006/1.3315972.
[7] j. senthilkumar, p. suresh mohan kumar, m. balasubramanian. effect of post weld ageing treatment on tensile properties and micro structural characteristics of friction welded aa6061/sic/graphite hybrid composites 10:1275-1284, 2020. https://doi.org/10.24247/ijmperdapr2020122.
[8] s. won, b. seo, j. m. park, et al. corrosion behaviors of friction welded dissimilar aluminum alloys. materials characterization 144:652-660, 2018. https://doi.org/10.1016/j.matchar.2018.08.014.
[9] v. balasubramanian, j. senthilkumar, m. balasubramanian. optimization of the corrosion behavior of aa7075 al/sicp and al/al2o3 composites fabricated by powder metallurgy. journal of reinforced plastics and composites 27(15):1603-1613, 2008. https://doi.org/10.1177/0731684407082629.
[10] a. sharma, v. mani sharma, b. sahoo, et al. study of nano-mechanical, electrochemical and raman spectroscopic behavior of al6061-sic-graphite hybrid surface composite fabricated through friction stir processing. journal of composites science 2:32, 2018. https://doi.org/10.3390/jcs2020032.
[11] j. senthilkumar, p. suresh mohan kumar, v. balasubramanian. post weld heat treatment of continuous drive friction welded aa6061/sic/graphite hybrid composites - an investigation. materials research express 6(12):1265e1, 2019. https://doi.org/10.1088/2053-1591/ab6407.
[12] k. sri ram vikas, v. s. n. venkata ramana, r. mohammed, et al. influence of post weld heat treatment on microstructure and pitting corrosion behavior of dissimilar aluminium alloy friction stir welds. materials today: proceedings 15:109-118, 2019. https://doi.org/10.1016/j.matpr.2019.05.032.
[13] n. sunitha, k. manjunatha, s. khan, m. sravanthi. study of sic/graphite particulates on the corrosion behavior of al 6065 mmcs using tafel polarization and impedance. sn applied sciences 1, 2019. https://doi.org/10.1007/s42452-019-1063-6.
[14] e. subba rao, n. ramanaiah. influence of heat treatment on mechanical and corrosion properties of aluminium metal matrix composites (aa 6061 reinforced with mos2). materials today: proceedings 4(10):11270-11278, 2017. https://doi.org/10.1016/j.matpr.2017.09.050.
[15] r. loto, a. adeleke. corrosion of aluminum alloy metal matrix composites in neutral chloride solutions. journal of failure analysis and prevention 16:874-885, 2016. https://doi.org/10.1007/s11668-016-0157-3.
[16] p. vijaya kumar, g. madhusudhan reddy, k. srinivasa rao. microstructure, mechanical and corrosion behavior of high strength aa7075 aluminium alloy friction stir welds - effect of post weld heat treatment. defence technology 11(4):362-369, 2015. https://doi.org/10.1016/j.dt.2015.04.003.
[17] j. senthilkumar, m. balasubramanian, v. balasubramanian. effect of metallurgical factors on corrosion behavior of al-sicp composites fabricated by powder metallurgy. journal of reinforced plastics and composites 28(9):1087-1098, 2009. https://doi.org/10.1177/0731684407087005.
[18] p. p. trzaskoma. pit morphology of aluminum alloy and silicon carbide/aluminum alloy metal matrix composites. corrosion 46(5):402-409, 1990. https://doi.org/10.5006/1.3585124.
[19] m. bhat, m. surappa, h. nayak. corrosion behaviour of silicon carbide particle reinforced 6061/al alloy composites. journal of materials science 26:4991-4996, 1991. https://doi.org/10.1007/bf00549882.
[20] s. sinhmar, d. k. dwivedi.
effect of weld thermal cycle on metallurgical and corrosion behavior of friction stir weld joint of aa2014 aluminium alloy. journal of manufacturing processes 37:305-320, 2019. https://doi.org/10.1016/j.jmapro.2018.12.001.
[21] f. gharavi, k. a. matori, r. yunus, et al. corrosion evaluation of friction stir welded lap joints of aa6061-t6 aluminum alloy. transactions of nonferrous metals society of china 26(3):684-696, 2016. https://doi.org/10.1016/s1003-6326(16)64159-6.
[22] s. bocchi, m. cabrini, g. d'urso, et al. the influence of process parameters on mechanical properties and corrosion behavior of friction stir welded aluminum joints. journal of manufacturing processes 35:1-15, 2018. https://doi.org/10.1016/j.jmapro.2018.07.012.
[23] a. vinoth jebaraj, k. aditya, t. sampath kumar, et al. mechanical and corrosion behaviour of aluminum alloy 5083 and its weldment for marine applications. materials today: proceedings 22:1470-1478, 2020. https://doi.org/10.1016/j.matpr.2020.01.505.
[24] n. ramesh, v. senthil kumar. experimental erosion-corrosion analysis of friction stir welding of aa 5083 and aa 6061 for sub-sea applications. applied ocean research 98:102121, 2020. https://doi.org/10.1016/j.apor.2020.102121.
[25] s. li, h. dong, l. shi, et al. corrosion behavior and mechanical properties of al-zn-mg aluminum alloy weld. corrosion science 123:243-255, 2017. https://doi.org/10.1016/j.corsci.2017.05.007.

acta polytechnica 61(4):562-569, 2021
https://doi.org/10.14311/ap.2021.61.0562
© 2021 the author(s). licensed under a cc-by 4.0 licence. published by the czech technical university in prague.

on a static and dynamic calibration of thermochromic liquid crystals
stanislav solnař∗, jan medek, abubakar shola suleiman, patrik vyhlídal
czech technical university in prague, faculty of mechanical engineering, department of process engineering, technická 4, 160 00 prague 6, czech republic
∗ corresponding author: stanislav.solnar@fs.cvut.cz

abstract.
the paper deals with a static and dynamic calibration of selected wide-range thermochromic liquid crystals (tlc solardust 24c), which are very easy to apply and can provide accurate information on the temperature distribution on different surfaces. static calibration problems, such as the tlc illumination angle or the illumination intensity, are solved and conclusions are drawn for the application of such a measurement. dynamic experiments also indicate a certain time delay (around 50 ms) of the applied tlc measurement layer, which is very important to know for dynamic experimental methods, but the error and inaccuracy of the experiment are too high to draw conclusions.

keywords: thermochromic liquid crystals, tlc, static, dynamic, calibration.

1. introduction
temperature measurement is a very important aspect of perhaps every industry, from heavy machinery to the very precise pharmaceutical industry. temperature affects all mechanical, thermophysical and process parameters, so its knowledge is usually critical. for many common applications of a (even very accurate) temperature measurement, we use touch (contact) thermometers, which are based on various physical principles (e.g., the thermal expansion of substances or the thermal change of the electrical resistance of conductors). in cases where it is not possible to measure the temperature by contact or the temperatures are very high, we can use contactless measurements using ir cameras. these cameras measure the intensity of the emitted infrared radiation or its wavelength and thus evaluate the surface temperature of the measured object. such a measurement can also be very accurate, but ir cameras are usually very expensive and other phenomena, such as surface emissivity, reflections, etc., can affect the quality of the measurements ([1], [2]). thermochromic liquid crystals, which are both relatively inexpensive and easy to use, can offer a very simple and accurate surface temperature measurement. we most often encounter tlcs as a liquid that is sprayed on the monitored surface or in the form of self-adhesive plates. these thermochromic liquid crystals have often been used for scientific experiments in the past, so their reliability is considered to be very good. from a physical point of view, a tlc is a substance in the narrow range between the liquid and crystalline phases; in the literature, this state is also called a mesophase. this mesophasic state allows the undissolved crystals to have different orientations in the fluid depending on the ambient conditions, e.g., uv radiation, electric field voltage or temperature. it is the temperature dependence of the inclination of the crystals that is very often used, because a change in the temperature of the tlc layer leads to a change in the visible colour of the surface: a change in the rotation of the individual crystals leads to a change in the refraction of the incident white light. the first mesophasic substance dates back to the end of the 19th century, and a large number of them have been described since then. from a chemical point of view, they are mostly crystalline organic molecules in the right temperature interval. several mesophases of minerals are also already known ([1], [3]). tlc layers have been used successfully in various scientific fields; for example, newton et al. [4] have used a thin layer of tlc to measure the convective heat transfer coefficient in gas turbine cooling ducts.
in the experiments, the layer was applied to the monitored surface and the authors used the "slow transient" method to minimise the errors of a step change. this method leads to the solution of a non-stationary one-dimensional fourier equation, and it can be assumed that the delay of the measuring tlcs is so small that no fundamental measurement error occurs (a minimal numerical sketch of this evaluation is given at the end of this introduction). ireland and jones [5] have successfully applied similar measurements of the heat transfer coefficient and, in addition, have presented measurements of the surface shear stress on the wall in their article. the beginnings of similar experiments date back to the 1980s, and on the basis of pre-calibrated positions of the experiment, it is possible to evaluate the surface shear stress. the experimental surface is illuminated with white light at a given angle and then the camera captures the monitored surface from different angles. this surface is coated with tlcs and is affected by the air flow. based on the evaluation of the surface temperature from different measurement angles, it is then possible to recalculate the surface shear stress value. the article also mentions their earlier work on the dynamic behaviour of tlcs; however, the authors only state that the delay was measured in the order of milliseconds. in addition, stasiek et al. [6] have shown the use of tlcs in biomedicine, where we can meet with measurements that can non-invasively detect cancer. the tlc layer is applied to the skin surface and the temperature field in this area is monitored. due to the different thermophysical properties of a tumor, it is possible to evaluate whether a tumor is in this area or not. due to the breadth of the applications of tlcs, we can encounter them very often. however, a very important part of accurate tlc measurements is also their calibration, which is necessary for each applied layer. the calibration curve represents the dependence of the surface colour shade on the temperature, and in the literature, we can find a large number of articles that deal with the static calibration of the tlc layer. one article that should be cited is rao and zang [7], who prepared an article on the application, calibration, and inaccuracy of wide-band tlc measurements. at the beginning of the article, they presented their experimental apparatus, which consisted of an isolated calibrator and an isothermal copper plate. they investigated whether the quality of the tlc coating affects the measurement and concluded that thicker tlc layers (20-40 µm) lead to more accurate measurements and lower measurement inaccuracies. according to the authors, the quality of the tlc application on the surface has a very fundamental effect on the measurement and can halve the measurement uncertainty. they also described that the angle of inclination of the light has an effect on the absolute value of the measurement; however, the effect on the opacity is minimal. the authors also recommend using a median filter to reduce the impact of the digital noise generated during the measurement on the calibration. their work is very extensive and we consider it a very good basis for our own tlc calibration measurements.
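the "slow transient" evaluation mentioned for [4] rests on the classical step response of a semi-infinite wall: the dimensionless surface temperature follows 1 − exp(β²)·erfc(β) with β = h·sqrt(t/(ρ·c·k)). the minimal sketch below, with assumed material properties and temperatures (not data from [4]), shows how the heat transfer coefficient can be extracted from the measured time at which the tlc colour change appears.

rho = 1190; c = 1470; k = 0.19;   % assumed pmma-like wall properties (si units)
ti = 20; tg = 60;                 % initial wall and hot gas temperatures (degc), assumed
tw = 24;                          % tlc colour-change temperature (degc), assumed
t  = 12.5;                        % measured time of the colour change (s), assumed

theta = (tw - ti)/(tg - ti);      % dimensionless surface temperature rise
beta  = @(h) h*sqrt(t/(rho*c*k)); % h*sqrt(alpha*t)/k written with rho*c*k
f = @(h) 1 - exp(beta(h).^2).*erfc(beta(h)) - theta;
h = fzero(f, 100);                % w/(m^2 k), initial guess 100
fprintf('estimated heat transfer coefficient h = %.1f w/(m2 k)\n', h)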
2. static calibration
the measuring device for a static tlc calibration is based on a comparison method between a colour image from an rgb camera (gopro hero 5 black) with a fixed setting (iso 400, wb 4000 k and shutter speed 1/25 s) and an accurate pt 1000 thermometer (greisinger gmh 175). the constant temperature in the calibrator was maintained by a water thermostat (lauda e-100). a polarizing filter (hoya cir-pl) was installed in front of the rgb camera lens to prevent light from reflecting off the captured image. all components were placed on a solid metal sheet and placed in the dark to avoid any ambient optical influences. a sufficient light for a correct evaluation of the surface colour of the tlc layer is guaranteed by yn-14 40 camera led lights. these led lights allow us to set the amount of lighting as well as the temperature of the light, which was set to 4000 k for all experiments. the camera was placed on the measuring axis at a distance of 300 mm from the measuring plate; the lights were set to an angle of 45 degrees at a distance of 200 mm from the measuring axis. the details of our static experiment are shown in figure 1.

figure 1. schematic drawing of the static calibration experiment. the rgb camera (gopro hero 5), the calibrator, the pt 1000 thermometer and the led lights (yn-14 40) are firmly placed on a solid metal sheet. during the experiment, the entire assembly is covered with a wall and a black cloth to minimise the ambient light and optical phenomena.

the calibrator was made of plexiglass, which has a very poor thermal conductivity, and thus the losses to the environment were minimised. the measuring plate itself was made of steel (thickness 1 mm) and a layer of the measured tlcs was applied on the outside. the design of the calibrator minimised the possibility of natural convection around the calibrator. a schematic drawing of the static experiment calibrator is in figure 2.

figure 2. drawing of the static calibrator made of plexiglass with a metal measuring plate (water inlet/outlet, measurement plate, plexiglass body, mounting screws, water chamber). the external dimensions of the calibrator are 200 × 200 × 40 mm; the measuring plate is a circular plate with a diameter of 110 mm and a thickness of 1 mm. inside the calibrator is a water chamber with a size of 140 × 140 mm, which is milled in the area of the measuring plate up to the measuring plate.

prior to the application of the wide-band tlc layer (solardust 24c, range 60-90 °f, approx. 16-32 °c), a thin layer of matt black paint was applied to the measured metal target, which is recommended for an easier reading of the colours of the tlc layer. a total of 7 layers (about 2.5 ml before drying) of tlcs were then applied to the black coating using an air brush for a perfect layer distribution and thickness minimization. each layer was applied in a perpendicular arrangement to the previous layer. this created a very high-quality painted surface, which did not show any defects. due to the controlled conditions of the measurement environment, we did not apply any covering transparent layers to the tlcs. before each measurement, the temperature in the calibrator was allowed to settle for 10 minutes before the surface colour image of the tlc layer was taken and the temperature was read from the pt 1000.
from the rgb camera, we obtain an image (or several images), of which each point consists of three layers (r, g and b). however, the rgb model of the colour representation is not well suited for this experiment, because it takes into account other shades of the same colour. the hsv (or sometimes hsb) model is commonly used in the literature for the evaluation, because the hue value (degree of colour) is not affected by the colour shade or its lightness. there are several equations for the conversion between the rgb and hsv models; we used the rgb2hsv command in the matlab environment, which is based on the dominant colour. if the red channel is dominant, then

hue = (g − b) / (6 (r − min(r, g, b))), (1)

if the green channel is dominant, then

hue = (1/6) (2 + (b − r) / (g − min(r, g, b))) (2)

and if the blue channel is dominant, then

hue = (1/6) (4 + (r − g) / (b − min(r, g, b))). (3)

after this transformation, we obtain a matrix of hue values that is not affected by the brightness or shadowing of the colour in any way. to eliminate the noise that occurs when taking digital photos, we scanned the entire measured target and evaluated a selected area of points in the middle of the target (usually 500 × 500 measured pixels). a typical dependence of the hue value on the temperature (also known as a calibration curve) for our type of the tlc layer can be seen in figure 3.

figure 3. typical temperature dependence of the hue value for our type of the tlc layer. this is a very nonlinear dependence, which does not even have a clear tendency at the ends. in practice, a narrower strip is usually used, where the tendency is unambiguous and the sensitivity of the layer to temperature changes is high.

the calibration curve also shows that the tlc layer has various sensitivities to temperature changes. in the region around 24 °c, the sensitivity is about 85 hue/°c, while in the region around 28 °c, the sensitivity is only about 3 hue/°c. for tlc measurements, it is also very important to find out if there is only one colour in the image or if the evaluated image is composed of many different hue values (colours). for this confirmation, we prepared a histogram of the colour distribution in the evaluated field of the image, see figure 4.

figure 4. histogram of the distribution of the individual hue values for 5 various cases. it is obvious that only one hue value is always dominant.

one image usually contains more than 98 % hue of one colour (98 % of the measured pixels show one colour). a measurement using this tlc layer is, therefore, possible; one colour is always dominant in the measurement and so the evaluation can be carried out without any problems. the temperature dependence of the hue value (calibration curve) is strongly nonlinear and has various sensitivities to temperature changes.
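as a check of eqs. (1)-(3), the short sketch below computes the hue of one test colour by the dominant-channel rule and compares it with matlab's built-in rgb2hsv; the test colour is an arbitrary assumption.

rgb = [0.35 0.60 0.20];           % arbitrary test colour, channels in 0..1 (assumed)
r = rgb(1); g = rgb(2); b = rgb(3);
mx = max(rgb); mn = min(rgb);
if mx == r
    hue = mod((g - b)/(6*(mx - mn)), 1);   % eq. (1), wrapped into 0..1
elseif mx == g
    hue = (2 + (b - r)/(mx - mn))/6;       % eq. (2)
else
    hue = (4 + (r - g)/(mx - mn))/6;       % eq. (3)
end
hsv = rgb2hsv(rgb);                        % built-in reference
fprintf('hue = %.4f (%.1f deg), rgb2hsv gives %.4f\n', hue, 360*hue, hsv(1))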
2.1. influence of the amount of light
for the same measuring target with the applied tlc layer, we verified whether the amount of light affects the measured values. the led lights used allow to adjust the amount of light in several steps; for the experiment, the settings of 25, 50, 75 and 100 % intensity were used. the experiment was basically the same as the static calibration (comparison method) with different lighting conditions, and the results are shown in figure 5.

figure 5. comparison of the calibration curves for different light intensities (25, 50, 75 and 100 %). although the calibration curves do not show a significant difference, it is clear that a deviation is noticeable for a lighting level of 25 %.

if we consider the calibration curve with an illumination level of 100 % as a reference, we can plot the deviations of the other calibration curves, see figure 6. based on our measurements, we can judge that if we use an illumination intensity greater than 25 %, the deviation from the reference calibration curve (illumination intensity 100 %) will be minimal. we detected a larger deviation at the beginning of the calibration curve, but if the measurement takes place in the range from 24 °c upwards (the area of the highest sensitivity of the tlc layer to temperature changes), this deviation is also negligible. we do not recommend measuring with a very limited lighting intensity.

figure 6. deviations from the reference calibration curve. we can see fundamental deviations at the beginning of the calibration curves and also for the whole curve with an illumination level of 25 %. the other curves have a deviation of less than 1 % from the reference.

2.2. influence of the angle of light
we also investigated the effect of the light tilt on the measured values by the same method as the calibration curve and the effect of the light intensity. we adjusted the experimental setup for the possibility of moving the lights; the relocated lights then always aimed at the centre of the measuring plate. three light setting angles (36, 45 and 66°) were selected for the measurement to observe the changes in the measurement. the changes in the calibration curves are clearly visible in this experiment, see figure 7.

figure 7. comparison of the calibration curves for different light angle settings (36, 45 and 66°).

from the comparison of the calibration curves, we can conclude that with an increasing angle of light, the value of hue also increases, and the measurement must, therefore, be approached with a uniform setting of the angle of light. for a comparison, we again chose one reference calibration curve (in this case with an angle of 45°) and monitored the deviations of the other curves, see figure 8. although all the calibration curves have a very similar trend and show a similar sensitivity in the individual sections of the curve, the differences are not negligible and the shapes are partially different. nor does there appear to be a difference that can be expressed by any simple multiplicative constant. from this point of view, it is critical to know the illumination angle and to prepare a calibration curve for this setting before the actual measurement.

figure 8. deviations of the calibration curves from the reference curve. we can see deviations in the tens of percent, which approach below 10 % only in the range of 26-30 °c.
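in practice, the measured calibration curve is used in the opposite direction, from a hue value to a temperature. a minimal sketch of such an inversion is below; the calibration pairs are made-up numbers restricted to the monotonic, high-sensitivity band of the curve, and the real curve measured for the actual light setting must be used instead.

tcal   = [24.0 24.5 25.0 25.5 26.0 27.0 28.0];   % degc, assumed calibration points
huecal = [ 60  105  140  165  185  210  225];    % deg, assumed hue values

% hue must be strictly monotonic on the chosen band for the inversion
huemeas = 150;                                   % a measured hue value (assumed)
tmeas = interp1(huecal, tcal, huemeas, 'pchip'); % shape-preserving interpolation
fprintf('hue = %.0f deg corresponds to t = %.2f degc\n', huemeas, tmeas)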
3. dynamic calibration
the static calibration has shown that, with some limitations (especially the knowledge of the angle of the incident light), tlcs can be used for static experiments. for dynamic experiments (e.g., the thermal oscillation method, step or response methods), however, it is also necessary to know the dynamic properties of the tlc layer - its delay compared to the excitation function or other time deviations. to monitor the dynamic behaviour of the applied tlc layer, we prepared a similar experiment, which consisted of two measuring targets connected together by a delay channel. the calibrator for the dynamic behaviour was made by 3d printing (material abs with a very low thermal conductivity, k = 0.25 w/(m k); printed on a prusa i3 mk3) and sealed by etching the walls with acetone vapour. a model of the calibrator can be seen in figure 9.

figure 9. model of our dynamic calibrator with an adjustable delay loop (abs body, measurement targets, water inlet, water outlet). during the experiment, all the tubes are provided with thermal insulation to minimise the losses to the environment, see [3].

the 3d printed body has two windows on which a tlc layer is applied. in this case, a pmma plate was used as the measuring plate, and a tlc layer was applied to the inside of the calibrator in the same number of layers and the same pattern. the tlc layer thus prepared was then covered with a sprayed layer of black matte paint for a correct colour identification. the black colour also served as a barrier against washing away the tlc layer. the measuring targets thus prepared were then fastened to the calibrator body with screws and a delay loop was fitted. for observation possibilities, the whole calibrator was also monitored by an ir camera (micro-epsilon tim 160, 160 × 120 pixels, 80 mk sensitivity). the emissivity of the pmma surface was measured by a comparative static method and was determined to be ϵ = 0.73.
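the comparative determination of the emissivity is not detailed here; one simple grey-body variant is sketched below: the surface is held at a temperature known from a contact probe, the camera reads an apparent temperature with the emissivity set to 1, and the radiation balance is solved for the emissivity. the temperatures are illustrative assumptions, and a real camera integrates over a finite spectral band, so this is only an approximation.

ttrue = 40 + 273.15;   % contact reference temperature (k), assumed
tapp  = 35 + 273.15;   % ir camera reading with emissivity set to 1 (k), assumed
trefl = 22 + 273.15;   % reflected ambient temperature (k), assumed

% grey-body balance: tapp^4 = eps*ttrue^4 + (1 - eps)*trefl^4
eps_est = (tapp^4 - trefl^4) / (ttrue^4 - trefl^4);
fprintf('estimated emissivity = %.2f\n', eps_est)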
from both measured windows, the field in the middle of the measurement was selected, which had dimensions of about 10 x 10 mm and the corresponding camera resolution (ir and rgb camera have different resolutions). scanning a part of the measured area also helped to eliminate any digital noise. the scanned field was always averaged to the final value 566 vol. 61 no. 4/2021 static and dynamic calibration of tlc ir and rgb camera led lights cold and hot water tank 3 way valve delay loop figure 10. schematic drawing of an experiment for measuring the dynamic behaviour of a tlc layer. the distance of the cameras from the measuring windows is about 300 mm. (colours or temperatures). the measured temperature profiles (both from the tlcs and from the ir camera) show a very similar course (continuous changes), both during the heating and the cooling. by evaluating the course of heating and cooling, we obtain the same time difference (with an error of about ± 10 ms) for both the tlc and the ir camera. the detection of the system is, therefore, efficient and both temperature changes (heating and cooling) can be used for a system time delay evaluation. the evaluation of time dependences (a comparison of time delays of the tlc and ir) shows that the difference between the two measured signals is in the order of tens of milliseconds, the average value is about 49 ms. but we must take into account the inaccuracy of this experiment. if we assume that the scanning error of both ir and rgb cameras is a maximum of ± 1 frame, we obtain a systematic error of ± 10 ms of measured signal differences. we create another introduced error by fitting the function to the measured data and searching for their inflection point. if we again assume that this operation creates an inaccuracy of only ± 10 ms and we also consider other inaccuracies (digital noise, random errors, etc.), we find that the measured value of the tlc signal delay is about as large as the inaccuracy of the experiment. therefore, we do not recommend the use of this experimental device to determine the dynamic behaviour of tlcs. either the tlc delay would have to be in the hundreds of milliseconds, or more (maybe only faster) sophisticated experimental equipment must be used. however, this experiment showed that tlc delays can be expected in the order of tens of milliseconds. 4. discussion the article discusses the issue of calibrating thermochromic liquid crystals (tlc), which very often serve as temperature indicators. the great advantage of tlcs is the very low cost and ease of application when measuring. the disadvantage is their sensitivity to damage and usually a narrow range of measured temperatures. for the experiments, we used a wide-range tlcs labeled solardust 24c, which has a range of approximately 16-32 °c. however, for each experimental measurement, it is necessary to calibrate any measuring element. while carrying out the experiments, we also discovered several changes that interested us and we tried to analyse them. the calibration curve (surface temperature dependence and tlc colour shade) is a very non-linear curve that changes the sensitivity from 5 hue/°c to 85 hue/°c. from the point of view of experimental use, it is most advantageous to use a narrower band of measured values, where the sensitivity to the temperature change is the highest. the data show that only one colour shade is always dominant for the set temperature, which is crucial for the application. 
the measured data contained 98 % or more pixels of the same shade of colour. we found that the light intensity has no significant effect on the change in the calibration curve. only when the lighting is insufficient, there is a fundamental deviation and this is also accompanied by a greater digital noise. when we used a sufficient high-power illumination in the experiments, we recorded a deviation from the reference calibration curve of about 1 % and about 8 % at the beginning of the measurement interval. the data show a fundamental influence of the calibration curve in terms of the angle of adjustment of the lights and this is essential for experimental measurements. for a further use of tlcs, we therefore recommend using the same light angle setting, or obtaining a calibration curve for the current experiment settings. deviations from the reference calibration curve were significant in the order of tens of percent. dynamic experiments have shown that there is probably some time lag of the tlc layer to the temperature change. the data show that this delay can be expected in the order of tens of milliseconds, but the inaccuracy and insensitivity of the experiment (in the order of tens of milliseconds) does not allow conclusions to be drawn. for a more detailed examination of the dynamic behaviour of tlcs, we therefore recommend using the same experiment, but with a higher sensitivity, or using another dynamic experimental technique that would be able to detect these dynamic phenomena with a reasonable error. for further experimental measurements, we recommend shading other ambient light effects using a box or black fabric curtains. we also recommend using a large amount of heat transfer medium (in our case water) to obtain the largest possible heat capacity of the measuring system and thus the smallest possible temperature fluctuations in time. in our experiments, professional camera lights with their own voltage sta567 s. solnař, j. medek, a. s. suleman, p. vyhlídal acta polytechnica bilization proved to be very successful, other lights did not guarantee a uniform light colour. 5. conclusion from our experimental measurements we can conclude: • calibration curve of chosen tlc layer is very nonlinear with varying sensitivity to temperature change • experimental measurements with our camera always show only one dominant colour in the whole image, so it is possible to use tlcs for surface temperature measurements • light intensity has no significant effect on the change in the calibration curve • changing the illumination angle has a significant effect on the calibration curve • dynamic experiment prepared by us shows a time delay of about 50 ms, but it is necessary to create a more accurate measurement to detect the dynamic behaviour list of symbols b blue [–] g green [–] hue shade of color [°] r red [–] ir infra red rgb red-green-blue t lc thermochromic liquid crystal acknowledgements the main author would like to thank the other colleagues for their excellent cooperation during their thesis. authors acknowledge the support from the grant agency of the czech technical university in prague, grant no. sgs20/119/ohk2/2t/12. references [1] p. vyhlídal. problems of surface temperature measurement using themochromic crystals (in czech). master’s thesis, czech technical university in prague, faculty of mechanical engineering, 2019. [2] a. s. suleiman. thermochromic liquid crystals as a temperature indicator. 
master’s thesis, czech technical university in prague, faculty of mechanical engineering, 2019.
[3] j. medek. thermochromic liquid crystals (tlc) calibration (in czech). master’s thesis, czech technical university in prague, faculty of mechanical engineering, 2020.
[4] p. j. newton, y. yan, n. e. stevens, et al. transient heat transfer measurements using thermochromic liquid crystal. part 1: an improved technique. international journal of heat and fluid flow 24(1):14-22, 2003. https://doi.org/10.1016/s0142-727x(02)00206-0.
[5] p. t. ireland, t. v. jones. liquid crystal measurements of heat transfer and surface shear stress. measurement science and technology 11(7):969-986, 2000. https://doi.org/10.1088/0957-0233/11/7/313.
[6] j. stasiek, m. jewartowski, t. a. kowalewski. the use of liquid crystal thermography in selected technical and medical applications - recent development. journal of crystallization process and technology 4(1):46-59, 2014. https://doi.org/10.4236/jcpt.2014.41007.
[7] y. rao, s. zang. calibrations and the measurement uncertainty of wide-band liquid crystal thermography. measurement science and technology 21(1):1-8, 2010. https://doi.org/10.1088/0957-0233/21/1/015105.

a. attachments
matlab script for the evaluation of the measured data. the measured data are saved in the .jpg format and the file name is identical to the measured temperature multiplied by 10 (for an easier data storage, there are no decimal points). the region of interest x1:x2, y1:y2 must be set to the evaluated area in the middle of the target before the script is run.

clear all; format compact; clc
% region of interest (pixels) in the middle of the measured target,
% to be adjusted to the actual images (here a 500 x 500 px field)
x1 = 1000; x2 = 1500; y1 = 1000; y2 = 1500;
a = dir('*.jpg');
number = length(a);
for i = 2:number                     % the first image is skipped, as in the original script
    file = a(i).name;
    [~, base] = fileparts(file);     % file name without the extension
    temp(i) = str2double(base)/10;   % the file name encodes the temperature x 10
    foto = imread(file);
    field = foto(x1:x2, y1:y2, :);   % evaluated area of the image
    color = rgb2hsv(field);
    hue = color(:,:,1);
    huem(i) = mean(mean(hue))*360;   % mean hue value in degrees
end
plot(temp, huem, 'k.')
xlabel('temperature (c)'); ylabel('hue value (deg)');
table(transpose(temp), transpose(huem))

acta polytechnica 58(6):334-338, 2018
doi:10.14311/ap.2018.58.0334
© czech technical university in prague, 2018, available online at http://ojs.cvut.cz/ojs/index.php/ap

the influence of cutting conditions on the selected parameters of the surface integrity
robert čep a,∗, šárka malotová a, jiří lichovník a, michal hatala b, stanislaw legutko c
a všb - technical university of ostrava, 17. listopadu 15/2172, ostrava, czech republic
b technical university in košice with the seat in prešov, bayerova 1, prešov, slovakia
c poznan university of technology, 5 m. sklodowska-curie square, 60-965, poznan, poland
∗ corresponding author: robert.cep@vsb.cz

abstract. the article deals with an investigation of the residual stress in a machined surface under the conditions of high-feed milling and with a determination of the influence of the machining conditions on the size and type of the stress induced by cutting into the machined surface. as a testing material, the hardened tool steel w. nr. 1.2343 (čsn 19552) was used. for the realization of the experimental activity, a high-feed milling head with exchangeable cutting inserts marked h600 wxcu 070515t was used.
all the surfaces were machined under different cutting conditions (cutting speeds of 200, 300, 350, 400 and 500 m min−1) with regard to the recommended parameters and the machine tool options. the evaluated residual stress was measured at a depth of 8 µm under the surface with the proto ixrd device working on the principle of the x-ray diffraction. the monitoring was carried out using an analysis of the occurrence of tensile or compressive residual stress, and from these results, a possible dependence of the residual stress on the cutting conditions during the milling process was determined.

keywords: residual stress; milling; machining; x-ray diffraction; surface integrity.

1. introduction
worldwide industrial production deals with the manufacturing of airplanes, the energy sector, construction, machines, tooling, cutting, and the quality and final properties of manufactured components. the progressive technologies of machining with high speeds must be the priority of all manufacturing. during machining, the final surfaces are exposed to mechanical and thermal loading and to changes, which influence the surface and subsurface layers [1-3]. these changes can be classified as the residual stress. the presence of stress in machined surfaces is related to the mechanical loading or the phase transformation. during machining, the residual stress is generated after the plastic deformation during the chip forming. a thermal treatment or other processes can change the size and type of the residual stress [4-6]. the residual stress is an important parameter of the surface integrity. its investigation can predict the future behaviour of the formed surfaces. several authors (e.g., r. m'saoubi [7]) deal with the surface integrity and the relationship between geometrical and physical properties such as the residual stress. as mentioned above, during the chip forming, a change of the tension field in the material occurs. for that reason, many authors [8-10] also deal with the influence of the cutting parameters on selected parameters of the surface integrity. as frederick gunnberg et al. [11] describe in the paper "the influence of cutting conditions on residual stress and surface topography", the stress can be compressive (−) or tensile (+). compressive stress is generally useful, because it tends to increase the fatigue strength, restricts the spreading of cracks and increases the corrosion resistance. tensile stress is dangerous for components, and it has the opposite impact on the above mentioned material properties. the mechanisms of the creation of the residual stress are explained in figure 1.

figure 1. mechanism of the residual stress [11].

during the chip forming, tensile stress occurs (a), which may be explained by the plastic deformation in the surface layer and the elastic deformation in the subsurface layer. when the machining process is completed, a balance in the microstructure is achieved; compressive stress is caused by the dilatation (b). the mechanism of the residual stress due to heat (figure 1, right) is caused by the amount of the heat in the cutting zone, which extends into the surface layer and creates compressive stress (a).

figure 2. cutting tool.

vc [m min−1]   fz [mm]
200            0.6  0.7  0.9
300            0.6  0.7  0.8  0.9
350            0.6  0.7  0.8  0.9
400            0.6  0.8
500            0.6
table 1. cutting parameters.
for this reason, it is also important to know the temperature of the cutting wedge, because this temperature will reduce the tensile stress induced into the surface. there are several methods for the investigation of residual stress with one main classification: destructive or non-destructive. these methods are not only expensive but also time-consuming and they require a broad knowledge of residual stress. a frequently used method is the x-ray diffraction. its principle of measurement is a positioning of the sample under the measuring head, when the collimator must be in position perpendicular to the measuring surface [12]. when the focus is set at a right height, the x-ray beam radiation penetrates into the material, which bounces back. the measuring head leans in a ratio of −30° and +30° and the measurement is realized step by step into an approximately 8 µm depth. the data are processed, and they can be used for the determination of dependences. 2. set up of the experiment the aim of the experiment was focused on the research and distribution of residual stress under the conditions of high-feed milling. used samples consisting of hardened material wr. n. 1.2343 (čsn 19552) figure 3. chip forming. 42 ± 2 hrc were used. the dimensions of samples were 40 × 40 × 330 mm. the reason for this choice of material was its repeatability for the form production or shearing tools, and thus it is important to know the state and future behaviour of this material after the manufacturing. this steel is characterized by good hardness, high heat-strength, and very good resistance against creating of cracks. the sample set was machined under the cutting conditions recommended by producer iscar. the whole experiment was carried out using the 3-axis milling centre hurco wmx 30t, dry machining. the milling process requires a holding toughness and an increased durability of exchangeable cutting inserts. therefore, five-teeth milling head iscar ff fwx d050-22-07 was used. the head was fit with the inserts h600 wxcu 070515t made from the material ic810. these inserts contain a very fine-grained structure with a high wear resistance. the inserts had a pvd coating on their base, altin+tin. it is nitride titanium and it prevents a chip jamming at the cutting edge point, and nitride titanium and additives improve the properties and decrease the fracture toughness of the cutting edge. the producer of these cutting tools offers a width ratio of cutting parameters. in regard to the influence of cutting conditions on the milling process, following conditions were selected (table 1). the depth of the cut was set to 1 mm. the residual stress was measured in machined surfaces. a non-destructive method based on the x-ray diffraction was used. the stress was measured in the middle part of every sample into approximately 8 µm of depth. the values were processed with proto ixrd software and are described in the next section. 335 r. čep, š. malotová, j. lichovník et al. acta polytechnica 3. results a characteristic feature for machining hard material is the creation of short comma chips, typically of a purple colour. during the process of machining, sparks and chips are generated, which are clearly visible. (figure 3) each sample was machined on four surfaces and the results were statistically processed. the stress always had a positive character – tensile stress. figure 4 shows an example of the stress distribution for a selected sample. international standard čsn en 15305 deals with non-destructive methods. 
the measured points are not grouped tightly around the fitted curve, and the detected stress does not tend to counteract a crack propagation in the material structure; this is a hallmark of tensile stress. the shape of the parabola characterizes the accuracy of the measurement – the deviation of ±19.7 mpa is about 28 % of the measured stress. the direction of the axis of the parabola describes the type of the stress; a rising axis of the parabola confirms the presence of tensile stress.

figure 4. graphic output of residual stress.

figure 5. dependence of residual stress on (left) cutting speed and (right) feeds per tooth.

if we consider the potential dependence of the stress on the cutting speed at a constant feed per tooth, no dependence was detected. figure 5 (left) illustrates all variants of feeds; the highest stress value occurred predominantly at the highest chosen cutting speed. however, the results could not confirm that an increasing cutting speed also increases the stress value. minor fluctuations could be due to structural changes in the material. furthermore, we focused on the results of the stress in dependence on the increasing feed per tooth at a constant cutting speed (figure 5 right). even here, the results did not confirm a direct dependence between the stress and the feed. initially, the values grew; however, only slight fluctuations followed. in both cases, the tensile stress was significantly influenced by the thermal effects during the cutting process. as mentioned earlier in the introduction, the stress in the surface is predominantly compressive, and with the cooling process, it converts into positive values – tensile stress. for experimental purposes, the stress was also measured in the non-machined sample. despite the fact that the values were small, the stress was still tensile. the only finding that can really be confirmed is that the stress occurs in the material even without the mechanical loading from machining. another finding was that the external force changes the size or type of the stress. figure 6 shows a three-dimensional map of the results.

figure 6. results of residual stress.

the residual stress in the surface did not confirm its dependence on the cutting parameters. therefore, another parameter of the surface integrity, the surface roughness, was evaluated within this experiment. the roughness was evaluated on the basis of čsn en iso 4287/88 using the device surftest sj 210. the assumption was that, with an increasing feed per tooth at a constant cutting speed, the surface roughness (ra value) would increase; this was confirmed. a worsened surface quality was observed at higher machining parameters, accompanied by significant tool traces.

4. conclusion

the residual stress was evaluated in relation to the selected machined material, technology, and the cutting conditions. for this experiment, we used a hardened tool steel machined by the high-feed milling technology. the evaluation and measurement of the residual stress was carried out with the use of a non-destructive x-ray method. the results did not confirm any dependence of the residual stress on an increasing cutting speed with a constant feed per tooth. each combination of conditions exhibited a presence of tensile stress, which was caused during the cooling process. tensile stress can speed up corrosion, promote a crack propagation in the material structure and lower the fatigue strength of the material. the stress has a wide range of values.
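statements such as "no dependence was detected" can be backed by a simple statistical check. the sketch below applies the pearson correlation to hypothetical stress values, not to the measured data set of this experiment.

```python
# a small sketch of testing a stress-vs-cutting-speed dependence;
# the stress values are hypothetical placeholders.
import numpy as np
from scipy import stats

vc = np.array([200, 300, 350, 400, 500])    # cutting speed [m/min]
sigma = np.array([62, 85, 70, 66, 95])      # tensile stress [MPa]

r, p = stats.pearsonr(vc, sigma)
print(f"pearson r = {r:.2f}, p-value = {p:.2f}")
# a large p-value means the apparent trend is not statistically
# significant, i.e., no dependence of the stress on the speed is confirmed
```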
the stress with the highest value was measured under the conditions of the highest value of the cutting speed or feed per tooth; there, the stress was up to three times higher. from the viewpoint of the surface roughness, increasing values of the machining conditions result in a poor surface quality with significant tool traces. with an increasing feed per tooth and a constant cutting speed, the roughness increased. we can observe that, generally, a machining process has an impact on the tensile and compressive stress and increases the residual stress in the material. in conclusion, within the tested range, no dependence of the residual stress on the cutting conditions has been confirmed. further research of the dependencies could evaluate the residual stress as a function of the depth below the surface and of the temperature at the cutting point.

acknowledgements

this paper has been supported from funds of the moravian-silesian region within the project „the support of talented students of doctoral study at všb-tu ostrava" (reg. no. 04766/2017/rrc). the article has been completed in connection with the project education system for personal resource development and research in the field of modern trends of surface engineering – surface integrity, reg. no. cz.1.07/2.3.00/20.0037, financed by the structural funds of the european union and from the state budget of the czech republic, and by the students grant competition projects sp2018/150 and sp2018/136 financed by the ministry of education, youth and sports and the faculty of mechanical engineering všb-tuo.

references
[1] twardowski, p. et al. investigation of wear and tool life of coated carbide and cubic boron nitride cutting tools in high speed milling. advances in mechanical engineering. 2015, vol. 7, no. 6. doi:10.1177/1687814015590216
[2] čep, r.; petrů, j.; zlámal, t. influence of feed speed on machined surface quality. in: metal 2013: 22nd international conference on metallurgy and materials. ostrava: tanger ltd., 2013, p. 1033-1038. isbn 978-80-87294-41-3.
[3] wojciechowski, s.; wiackiewicz, m.; krolczyk, g. m. study on metrological relations between instant tool displacements and surface roughness during precise ball end milling. measurement. 2018, vol. 129, p. 686-694. doi:10.1016/j.measurement.2018.07.058
[4] ganev, n.; pala, z.; kolařík, k. the impact of cutting conditions on residual stresses in the case of plain milling. materials engineering. 2010, vol. 17, no. 3, p. 9-14.
[5] petrů, j.; zlámal, t.; špalek, f.; čep, r. surface microhardening studies on steels after high feed milling. advances in science and technology research journal. 2018, vol. 12, no. 2, p. 222-230. doi:10.12913/22998624/91888
[6] petru, j.; zlamal, t.; cep, r.; pagac, m.; grepl, m. influence of strengthening effect on machinability of the welded inconel 625 and of the wrought inconel 625. in: imeti 2013: 6th international multi-conference on engineering and technological innovation, proceedings. orlando: international institute of informatics and systemics (iiis), 2013, p. 155-159. isbn 978-193633892-4.
[7] m'saoubi, r. et al. a review of surface integrity in machining and its impact on functional performance and life of machined products. international journal of sustainable manufacturing. 2008, vol. 1, no. 1/2, p. 203-236. doi:10.1504/ijsm.2008.019234
[8] krolczyk, g. m.; legutko, s.
experimental analysis by measurement of surface roughness variations in turning process of duplex stainless steel. metrology and measurement systems. 2014, vol. 21, no. 4, p. 686-694. doi:10.2478/mms-2014-0060
[9] beiyhi, l. et al. high-speed milling characteristics and the residual stresses control methods analysis of thin-walled parts. advanced materials research. 2011, vol. 223, p. 456-463.
[10] krolczyk, g. m.; legutko, s.; stoic, a. influence of cutting parameters and conditions onto surface hardness of duplex stainless steel after turning process. tehnicki vjesnik – technical gazette. 2013, vol. 20, no. 6, p. 1077-1080.
[11] gunnberg, f.; escursell, m.; jacobson, m. the influence of cutting parameters on residual stresses and surface topography during hard turning of 18mncr5 case carburised steel. journal of materials processing technology. 2006, vol. 174, p. 82-90. doi:10.1016/j.jmatprotec.2005.02.262
[12] malotová, š. et al. evaluation of residual stresses after irregular interrupted machining. tehnicki vjesnik – technical gazette. 2018, vol. 25, no. 4, p. 1009-1013. doi:10.17559/tv-20160615125650

acta polytechnica 60(3):197–205, 2020, doi:10.14311/ap.2020.60.0197

model reference adaptive control-based genetic algorithm design for heading ship motion

nasir ahmad al-awad
university of mustansiriyah, faculty of engineering, department of computer engineering, palestine street, 10001 baghdad, iraq
correspondence: nasir.awad@uomustansiriyah.edu.iq

abstract. in this paper, the heading control of a large ship is enhanced with the specific goal of counteracting the unwanted impact of the waves on the actuator system. the nomoto model is investigated to describe the ship's steering dynamics; first and second order models are considered here. the viability of the models is examined based on the principal properties of the nomoto model. different controllers are proposed for the ship heading control: a proportional integral derivative (pid) controller, a linear quadratic regulator (lqr) and a model reference adaptive control with a genetic optimization algorithm (mrac-ga). the results show that the mrac-ga controller provides the best results to satisfy the design requirements. the matlab/simulink tool is utilized to simulate the proposed scheme in the control loop.

keywords: nomoto ship mathematical models, course changing, ship heading control, pid controller, lqr controller, mrac-ga controller.

1. introduction

automatic control techniques for marine vehicles are proposed to improve their abilities with an adequate reliability and economy. the primary purpose of the rudder is to control the heading of the ship in course-keeping and course-changing manoeuvres [1]. applying more complex autopilots to ship guidance is essential for the precision of the execution and for a reduced wear [2]. a ship in waves is considered as a rigid body with six degrees of freedom.
nonlinear numerical models of ship elements are used to reproduce the ship movements and to design the closed-loop control systems [3]. in order to consider the yaw movement of the ship, the steering dynamics can be described by a basic linear [4] or nonlinear [5] mathematical model. the ship steering control systems are intended to perform two types of movements: course-keeping and course-changing manoeuvres; more details are addressed in [6]. there are two basic methods to approach the mathematical model. the first one is classic modelling, which includes the analytical approach. this model is developed based on a physical knowledge of the system; due to computational and other practical requirements, the order of the model needs to be reduced for the synthesis, evaluation, and implementation of control systems. the second approach is used to simulate the ship's movements in real time, and it possesses the ability to cope with different shapes of a ship, engines, and sea conditions without a loss of efficiency [7].

the history of ship autopilots goes back over 80 years. minorsky's [8] is one of the early works in the field of automatic control. sperry [9] presents the first automatic steering control system for ships. these controllers are completely mechanical, as they provide a basic steering action: the rudder command is proportional to the heading error. when pid controllers became commercially available, they significantly enhanced the performance of the ship behaviour. the primary drawback of the pid is the need of tuning [10]. to overcome the drawbacks of the pid controllers, adaptive controllers were presented [11–13]. artificial neural networks (anns) seem to offer several advantages over other types of control for ship autopilots, as a result of the capability of neural networks to deal with various system dynamics [14]. an ann controller can be trained to combine the good properties of both the constant-parameter controller and the adaptive controller. fuzzy controllers are effectively employed in ship steering. fuzzy autopilots, h-infinity autopilots, and lqr controllers have been proposed in different works [15–18]. in fuzzy control, it is generally hard to determine and optimize the control rules, and it is difficult to tune the parameters of a disturbance rejection controller because there are many of them. the backstepping technique can provide a methodical development process for the controller design, yet it fails in determining the optimal values of the control parameters. in general, adaptive fuzzy backstepping control can provide a systematic technique of tackling tracking or trajectory control problems, where fuzzy systems are utilized to approximate unknown nonlinear functions with unknown parameters [19].

the motivation for this paper is based on the practical need to develop a control system to steer a ship along a reference trajectory defined in a given reference frame. this paper is organized as follows: section 2 describes the rigid body dynamics of the ship in six degrees of freedom (6dof), derives the equation of motion and, finally, introduces the mathematical model of a cargo ship including the rudder servo system. section 3 describes the different controllers' methods. the simulation, results, and comparison are addressed in section 4. section 5 illustrates the main conclusions.
2. ship mathematical modeling

there are many established ship modelling procedures [20], such as the nomoto model, the norrbin model and the bech model. each of these modelling procedures has a distinct design flow and methodology. the nomoto model is one of the basic linear models utilized for modelling the ship dynamics; it supports all 6 degrees of freedom (dof) and is appropriate for small rudder angles. the nomoto model is the most broadly used ship model for autopilot design [21–23]. this model performs well for small rudder angle turnings and is a very reasonable solution for big ships, since it supports every one of the 6 dof. the bech and norrbin models are nonlinear models, which are derived from the nomoto model; these models are used for bigger rudder angles [24].

this section discusses the connection between the changes in the heading angle and the rudder angle. during course changing, it appears to be desirable to turn to the new heading with a constant rate of turn, or possibly with a constant turning radius. the new heading must be reached without an overshoot. the possibility to adjust the rate of turn is the main setting required by the user. the full nonlinear model is too complex to be utilized for designing the controller, so the linear model is chosen. the first linear transfer function of a ship steering system was proposed by davidson and schiff [25]. nomoto et al. [4] have proposed two basic linear mathematical models in light of the model of davidson and schiff; nomoto's models have been utilized broadly by control engineers for the investigation and design of ship controllers. these models represent a reasonably precise description of the steering behaviour for a large class of ships.

for autopilot applications, there is no compelling reason to incorporate all 6 dofs (three translational motions: surge, sway and heave, and three rotational motions: roll, pitch and yaw) in the estimations. for surface vessels, it is common to reduce the model to the movement in surge, sway, and yaw. this is done under the assumption that the movements in heave, roll, and pitch are small, see figure 1.

figure 1. ship motion description [4].

the ship control problem is usually treated by applying the classical technique, which considers just two coupled movements, yaw and sway [4]. the equations below describe the horizontal motion of a well-established ship. these equations can be derived from newton's laws, after a linearization of the equations of the movement, see [4] and [26–28]. in the ship manoeuvring problem, just the horizontal movement is considered: the surge, sway, and yaw movements. the equations are [29]:

x = m (u̇ − rv)   (1)
y = m (v̇ − ru)   (2)
n = iz ṙ   (3)

where x, y, and n are the surge, sway, and yaw motions, m is the mass of the ship, iz is the moment of inertia about the z-axis, (u, v and r) are the surge, sway, and yaw speeds, and (u̇, v̇, ṙ) are the surge, sway and yaw accelerations. in considering the steering movement, the surge is decoupled from the steering movement; in particular, by ignoring the sway movement in eqs. (2), (3) and by taking the laplace transform of the yaw motion at a constant speed, a nomoto second order approximation, which describes the heading dynamics of the ship through a simple transfer function, is obtained as [29, 30]:

r(s)/δ(s) = k (1 + t3 s) / [(1 + t1 s)(1 + t2 s)]   (4)

where (r) is the yaw rate, (δ) is the rudder angle, and (k) is the static rudder gain.
(t1, t2, and t3) are time constants; these parameters depend on the working conditions and essentially characterize the steering dynamics. in practice, because the pole term (1 + t2 s) and the zero term (1 + t3 s) in eq. (4) nearly cancel each other out, a further simplification of eq. (4) is possible, giving the first order nomoto model:

r(s)/δ(s) = k / (1 + t s)   (5)

where t = t1 + t2 − t3. sometimes called the yaw model, the nomoto model is valid under the assumptions that the ship moves at a constant speed, the propelling thrust is constant and the rudder angle is small. given that these conditions are satisfied, the nomoto model gives a reasonably accurate description of the course-keeping behaviour [22]. the nomoto model, which is widely employed in autopilot design and yaw dynamics, is characterised by the parameters k and t, which can be determined by manoeuvring tests. in practice, it is more helpful to work with the heading angle (ψ) than with the yaw rate (r); it is therefore useful to modify eq. (5), since the yaw rate is the time derivative of the ship heading angle, so that:

ψ(s)/δ(s) = k / [s (1 + t s)]   (6)

this model is widely used for the ship autopilot design due to its simplicity and accuracy. eq. (6) can be written in the state-space form:

[ψ̇; ṙ] = [0, 1; 0, −1/t] [ψ; r] + [0; k/t] δ   (7)

i.e., ẋ = ax + bu, and

ψ = [1, 0] [ψ; r] + [0] δ   (8)

i.e., y = cx, where r = dψ/dt. in other words, eq. (6) shows that the ship steering autopilot is designed for the heading angle control [30]. therefore, the nomoto model might be considered as a linear approximation of the full ship model. if we take k = 0.2803 and t = 47.44 sec [27], the transfer function of eq. (6) is:

ψ(s)/δ(s) = 0.2803 / [s (1 + 47.44 s)]   (9)

figure 2 shows the closed-loop transient response of the ship motion without a controller. the maximum overshoot (the maximum amount by which the response exceeds the steady-state value, i.e., the amplitude of the first peak) is mp = 64.7 %, and the settling time (defined as the time required by the response to reach and stay within the specified range of 2 %) is ts = 342 sec. these values do not meet the design requirements for ship movements; therefore, a controller needs to be designed to address this.

figure 2. closed-loop transient response of ship motion without controller.
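the behaviour of figure 2 can be cross-checked with a minimal simulation sketch: only k and t are taken from the text; the unity-feedback loop and the scipy machinery are generic assumptions, not the authors' simulink model.

```python
# a minimal sketch reproducing the uncontrolled closed-loop behaviour of
# figure 2: unity feedback around the nomoto model of eq. (9).
import numpy as np
from scipy import signal

k, t = 0.2803, 47.44
loop = signal.TransferFunction([k], [t, 1.0, k])   # k / (t s^2 + s + k)
tt, psi = signal.step(loop, T=np.linspace(0, 1500, 6000))

zeta = 1.0 / (2.0 * np.sqrt(k * t))     # closed-loop damping ratio ~ 0.137
mp = (psi.max() - 1.0) * 100.0
print(f"overshoot ~ {mp:.1f} %, damping ~ {zeta:.3f}")
# the overshoot evaluates to about 64.7 %, matching the value quoted
# for figure 2
```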
3. ship steering control methods

in general, a ship steering system is a single-input single-output system. this can be seen in figure 3, where (ψd) is the desired heading, (ψ) is the actual heading and (δc) is the commanded rudder angle (all in degrees).

figure 3. ship steering control system.

the purpose of the steering machine is to move the rudder to a desired angle when it is commanded by the control structure; in this paper, this machine is considered a part of the ship model. a ship heading controller normally drives the rudder to diminish the error between the reference heading angle and the actual heading angle. for the basic design, it will be assumed that the ship's steering dynamics are linear and of a known order and structure and that there is no disturbance. the objective is to design the autopilot so that the closed-loop system satisfies the robust stability and design specifications for the course-changing problem. there are many types of ship autopilots.

3.1. pid controller

minorsky proposed the utilization of a pid controller [8] in 1930. indeed, even today, most ships utilize this kind of controller. there are a few requirements that need to be taken into account in the design [21]:
(1.) for a course-changing without oscillations, the damping ratio is chosen between 0.8 and 1;
(2.) the choice of the natural frequency (wn) will be limited by the resulting bandwidth of the rudder wδ (rad/sec) and the ship's dynamics 1/t (rad/sec) for a critically damped ship.

the design requirements are, considering no overshoot, ζ = 1 and wn = 0.15 rad/sec, with the settling time (ts = 4/ζwn) = 39 sec. ref. [23] has derived straightforward relations for computing the pd/pid controller parameters from the (ζ) and (wn) of a closed-loop ship control system:

kp = t wn² / k
kd = (2 t ζ wn − 1) / k   (10)
ki = wn kp / 10

so that kp = 3.8, kd = 47.2, and ki = 0.057. figure 4 shows the closed-loop transient response of the ship motion with a pid controller.

figure 4. closed-loop transient response of ship motion with pid controller.

the figure demonstrates that the ship does not follow the desired response; an overshoot of about mp = 10.3 % can clearly be observed. this can be dangerous if other ships are in the environment or if the ship is cruising in a narrow channel. the performance of the controller is not sufficiently accurate, even at a fixed forward speed of the ship.
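the gains of eq. (10) and the overshoot of figure 4 can be verified numerically; the fragment below is a sketch with scipy, not the authors' matlab/simulink model, and the ideal pid c(s) = kp + kd s + ki/s in a unity-feedback loop is an assumption about the loop structure.

```python
# a quick numerical cross-check of the pid design of eq. (10).
import numpy as np
from scipy import signal

k, t = 0.2803, 47.44
zeta, wn = 1.0, 0.15

kp = t * wn**2 / k                    # eq. (10)
kd = (2.0 * t * zeta * wn - 1.0) / k
ki = wn * kp / 10.0
print(f"kp = {kp:.2f}, kd = {kd:.2f}, ki = {ki:.3f}")   # 3.81, 47.21, 0.057

# unity-feedback loop of c(s) with the plant of eq. (9):
# closed loop = k(kd s^2 + kp s + ki) / (t s^3 + (1 + k kd) s^2 + k kp s + k ki)
num = [k * kd, k * kp, k * ki]
den = [t, 1.0 + k * kd, k * kp, k * ki]
tt, psi = signal.step(signal.TransferFunction(num, den),
                      T=np.linspace(0, 300, 3000))
print(f"overshoot ~ {(psi.max() - 1.0) * 100.0:.1f} %")   # roughly 10 %, cf. figure 4
```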
3.2. lqr controller

an lqr can be designed for the state-space model. in order to design a linear optimal control law, the system (a, b, c) must be controllable; refer to eqs. (7) and (8). ref. [31] covers the derivation of the lqr and it will not be repeated here. the essential problem is to minimize a quadratic cost function (j) constrained by the dynamics of the system. using the matlab tool [k,p,e] = lqr(a,b,q,r), and letting r = 0.1, q = c′ · c, the values of the gain matrix k = [3.1623 33.4766] can be calculated. figure 5 shows the transient response of the ship motion with the lqr controller; as seen from the figure, mp = 4 % and ts = 50.5 sec. this type of controller needs a pre-filter to cancel the offset between the input and the output.

figure 5. closed-loop transient response of ship motion with lqr controller.

3.3. model reference adaptive control-based genetic algorithm (mrac-ga)

an adaptive controller is a fixed-structure controller with adjustable parameters and a mechanism for automatically adjusting those parameters. in the mrac framework, the desired response is determined by a reference transfer function, and the parameters are adjusted based on the error, which is the difference between the output of the closed-loop system and the output of the reference model. the adaptation law adjusts the parameters so that the error between the plant and the model outputs converges to zero. various adaptation laws have been developed; the two essential approaches are the gradient and the lyapunov approach. here, the gradient method using the mit rule is utilized to build up the adaptation law [32, 33]. to apply the mrac, a reference transfer function might be viewed as a pre-filter, see figure 6. this guarantees that the numerical challenges related to the input signal are avoided in the calculations [28].

figure 6. the reference model.

the dynamics of the reference transfer function ought to be matched to the dynamics of the ship, regardless of the magnitude of the requested change of the reference yaw angle. if the reference transfer function is too sluggish, it cannot produce an ideal performance, since the ship then cannot accomplish the required heading in the minimum time. on the other hand, we ought not to utilize a reference model that is too fast compared to the ship response characteristics, since this may cause a rudder actuator saturation and a performance degradation. to correctly choose a reference model, the following should be considered:
(1.) a unity steady-state gain;
(2.) the speed of response, defined by the natural frequency (wn);
(3.) the damping, specified by the damping coefficient (ζ).

generally, a second-order transfer function is utilized. such a model can be described numerically as:

ψd(s)/ψr(s) = wn² / (s² + 2 ζ wn s + wn²)   (11)

3.4. mit adaptation law

the mit law is the first approach to the model reference adaptive control. the name comes from the fact that it was developed at the instrumentation laboratory (now the draper laboratory) at the massachusetts institute of technology (mit), u.s.a. figure 7 shows the mrac block diagram.

figure 7. the mrac block diagram.

to present the mit law, we will consider a closed-loop structure in which the controller has one adjustable parameter. the desired closed-loop response is specified by a model output (ym). the error (e) is the difference between the output of the system (y) and the output of the reference model (ym). the modelling error (e) is given by:

e = y − ym   (12)

there are many performance indices for control, such as iae, ise, itae and itse, but in this work, we are using the rss (residual sum of squares). one approach is to adjust the parameters such that the cost function (j), as the squared difference of outputs, is:

j(n) = ½ e² = ½ [y(n) − ym(n)]²   (13)

to make (j) small, it is sensible to adjust the parameters in the direction of the negative gradient of (j). that is,

du/dt = −η dj/du = −η e (de/du)   (14)

where η is an adjustable parameter (the adaptation gain), used to adjust the convergence speed of the gradient descent optimization method, and (de/du) is the sensitivity derivative of the system, which indicates how the error (e) is affected by the adjustable parameter. by hand calculations, it was found that the response is slow for a small value of the adaptation gain; as this gain is increased, the settling time decreases and the rise time is diminished. it is extremely hard to check every possible value of the adaptation gain in the simulink model. in order to overcome this issue, a genetic algorithm was employed. a genetic algorithm (g.a) is a heuristic search method used in artificial intelligence for searching through large and complex data sets. this iterative parameter search is based on darwin's "survival of the fittest" principle [34]. the ga strategy is motivated by two biological principles, namely the process of natural selection and the mechanics of natural genetics. it operates on a set of potential solutions, which is known as the population. the size of the population is chosen by trial and error; for the most part, the projected size of a population is in a range from 30 to 100. the potential solutions of a population are called chromosomes; these are encoded representations of all the parameters of the solution.
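the combination of the mit rule with a ga search over the adaptation gain can be sketched compactly. the fragment below uses a generic first-order plant and the textbook two-parameter mit-rule structure, not the simulink ship model of the paper, so all numerical values are illustrative assumptions.

```python
# a compact sketch of the mit-rule adaptation of eq. (14) wrapped in a
# ga search over the adaptation gain, in the spirit of the mrac-ga.
import numpy as np

rng = np.random.default_rng(1)
a, b = 1.0, 0.5                  # "unknown" plant: dy/dt = -a*y + b*u
am, bm = 2.0, 2.0                # reference model: dym/dt = -am*ym + bm*r
dt, n = 0.02, 2000               # 40 s simulation horizon
r_sig = np.sign(np.sin(0.6 * dt * np.arange(n)))   # square-wave command

def rss(gamma):
    """run the mrac loop; return the residual sum of squares of e = y - ym."""
    y = ym = rf = yf = th1 = th2 = 0.0
    cost = 0.0
    for r in r_sig:
        u = th1 * r - th2 * y          # adjustable controller
        e = y - ym                     # eq. (12)
        th1 -= dt * gamma * e * rf     # mit rule, eq. (14)
        th2 += dt * gamma * e * yf
        rf += dt * am * (r - rf)       # sensitivity filters am/(s+am)
        yf += dt * am * (y - yf)
        y += dt * (-a * y + b * u)
        ym += dt * (-am * ym + bm * r)
        cost += e * e
    return cost

# a tiny ga over the scalar gain; the paper's settings are population 70,
# crossover 0.2, mutation 0.05 and 100 generations -- fewer generations
# are used here to keep the sketch fast.
pop = rng.uniform(0.05, 5.0, 70)
for _ in range(20):
    order = np.argsort([rss(g) for g in pop])
    best = pop[order[:35]]                          # selection
    kids = best[rng.integers(0, 35, 35)].copy()     # reproduction
    cross = rng.random(35) < 0.2                    # crossover
    kids[cross] = 0.5 * (kids[cross] + rng.choice(best, cross.sum()))
    mut = rng.random(35) < 0.05                     # mutation
    kids[mut] += rng.normal(0.0, 0.2, mut.sum())
    pop = np.concatenate([best, np.clip(kids, 0.05, 5.0)])

print(f"ga-tuned adaptation gain ~ {pop[0]:.3f}")
```

the ga run of the paper, performed on the full simulink model, arrived at a gain of 0.90456; this toy search will generally settle on a different value, since the plant and the cost horizon differ.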
there are three stages of the g.a: reproduction, crossover, and mutation. the optimization is accomplished in cycles called generations; a new set of chromosomes is created at every generation through the crossover and mutation, and the best chromosomes are passed to the next generation. in this paper, the g.a parameters are chosen by the trial-and-error strategy as follows: population size = 70; crossover rate = 0.2; mutation rate = 0.05; maximum generation = 100. the parameter value of the adaptation gain, which is optimized by using the g.a, is 0.90456. figure 8 shows the flow-chart of the g.a.

figure 8. the flow-chart of the g.a.

the transient response of the ship motion with the mrac-ga controller can be seen in figure 9; it shows that the output tracks the model reference and satisfies all the design requirements, with mp = 0 % and ts = 26 sec.

figure 9. closed-loop transient response of ship motion with (mrac-ga) controller for step input.

controller type        settling time (sec)   rise time (sec)   max. overshot %   steady-state error
pid                    37.1                  5.51              10.3              0
lqr (with prefilter)   50.4                  18.4              4.12              0
mrac-ga                26                    16.4              0                 0

table 1. comparison of the transient response parameters for the pid, lqr and mrac-ga controllers.

4. discussion

different controllers are used to improve the design requirements (settling time, maximum overshoot, and rise time) for the ship motion. in spite of the fact that pid controllers are widely utilized as a part of marine control systems, their control effectiveness is seriously challenged by the system model uncertainties. with the lqr, there is some overshoot, which may endanger a ship at sea; also, the ts is out of the design specification. in the comparison with the two (pid, lqr) approaches, the outcomes in the two cases demonstrate that the control performance of the proposed technique (mrac-ga) is better than the others in terms of the maximum overshoot (mp %) and the speed of response, i.e., the settling time (ts) to reach the steady state; table 1 shows the numerical comparison between all the controllers. when comparing the proposed method (mrac-ga) with the classic controller designs (pid) and (lqr) using the same model, eq. (6), it is obvious that the design specifications (max. overshoot, settling time, and rise time) are better satisfied by the proposed method.

5. conclusions

this paper shows the response of a system controlled by a model reference adaptive control method utilizing the mit scheme. when compared with the conventional fixed-gain controllers (pid) and (lqr), the following was found:
(1.) adaptive controllers are exceptionally effective in dealing with circumstances where the parameter variations and environmental changes are substantial and heavily affect the results.
(2.) the adaptive controller maintains a consistent dynamic performance in the presence of disturbances and large variations.
(3.) it is shown that, for suitable values of the adaptation gain, the mit rule can make the transient response of the second order model output as close as possible to the adaptive reference model. also, the response in terms of the settling time, peak time and rise time is improved with a suitable estimation of the adaptation gain.
(4.) the g.a is utilized to estimate the adaptation gain, which requires 9.3245 seconds of computational time to tune the parameter.
(5.) the mrac-ga is effective in disturbance rejection.
(6.) all design requirements (mp, ts, tr) are satisfied when the mrac-ga is applied.

acknowledgements

the author would like to thank mustansiriyah university (www.uomustansiriyah.edu.iq), baghdad – iraq, for its support in the present work.

references
[1] t. i. fossen. guidance and control of ocean vehicles. john wiley & sons, new york, 2014.
[2] c.-n. hwang. the integrated design of fuzzy collision-avoidance and h-infinity autopilots on ships. journal of navigation 55(1):117–136, 2002. doi:10.1017/s0373463301001631.
[3] x. gu, n. ma, s. j. tong, et al. a simplified simulation model for a ship steering in regular waves. in proceedings of the 12th international conference on the stability of ships and ocean vehicles, pp. 14 – 19. 2015.
[4] k. nomoto, t. taguchi, k. honda, s. hirano. on the steering qualities of ships. tech. rep., international shipbuilding progress, 1957.
[5] r. a. ibrahim, i. m. grace. modeling of ship roll dynamics and its coupling with heave and pitch. mathematical problems in engineering 2010, 2010. doi:10.1155/2010/934714.
[6] t. i. fossen. marine control systems: guidance, navigation, and control of ships, rigs and underwater vehicles. marine cybernetics, trondheim, 2002.
[7] s. a. hariprasad, v. singh, t. d. shashikala, k. shashikala. a design approach to rudder controller. international journal of computer science and technology 3(03):23 – 29, 2012.
[8] n. minorsky. automatic steering tests. journal of the american society for naval engineers 42(2):285 – 310, 1930. doi:10.1111/j.1559-3584.1930.tb05036.x.
[9] e. sperry. automatic steering. transactions of the society of naval architects and marine engineers pp. 53 – 57, 1922.
[10] a. white, p. gleeson, m. karamanoglu. control of ship capsize in stern quartering seas. international journal of simulation 8:20 – 31, 2007.
[11] n. hovakimyan, c. cao. l1 adaptive control theory: guaranteed robustness with fast adaptation. society for industrial and applied mathematics, philadelphia, 2010.
[12] y. morel, a. leonessa. indirect adaptive control of a class of marine vehicles. international journal of adaptive control and signal processing 24(4):261 – 274, 2010. doi:10.1002/acs.1128.
[13] j. alvarez, i. r. bertaska, k. von ellenrieder. nonlinear control of an unmanned amphibious vehicle. in proceedings of the american society of mechanical engineers, vol. 3 of dynamic systems and control conference. 2013. doi:10.1115/dscc2013-4039.
[14] a. nighat, m. unar, d. yaping. design of heading controller for cargo ship using feedforward artificial neural network. international journal of advancements in computer technology 5:556 – 566, 2013. doi:10.4156/ijact.vol5.issue9.66.
[15] e. omerdic, g. n. roberts, z. vukic. a fuzzy track-keeping autopilot for ship steering. journal of marine engineering & technology 2(1):23 – 35, 2003. doi:10.1080/20464177.2003.11020163.
[16] a. a. ammar. a self-learning fuzzy logic controller for the ship steering system. iraq journal of electrical and electronic engineering 8(1):124 – 139, 2012.
[17] j. velagic, z. vukic, e. omerdic. adaptive fuzzy ship autopilot for track-keeping. control engineering practice 11(4):433 – 443, 2003. doi:10.1016/s0967-0661(02)00009-6.
[18] g. antonio, c. francisco.
robust design of h-infinity controller for a launch vehicle autopilot against disturbances. journal of electrical and electronics engineering 11(5):135 – 140, 2016. doi:10.9790/1676-110502135140.
[19] a. witkowska, r. śmierzchalski. designing a ship course controller by applying the adaptive backstepping method. international journal of applied mathematics and computer science 22(4):985 – 997, 2012. doi:10.2478/v10006-012-0073-y.
[20] s. carrillo, j. contreras. obtaining first and second order nomoto models of a fluvial support patrol using identification techniques. ciencia y tecnología de buques 11(22):19 – 28, 2018. doi:10.25043/19098642.160.
[21] v. nicolau, c. miholcă, d. aiordachioaie, e. ceangă. qft autopilot design for robust control of ship course-keeping and course-changing problems. control engineering and applied informatics 7(1):44 – 56, 2005.
[22] p. mishra, s. k. panigrah, s. dass. ships steering autopilot design by nomoto model. international journal on mechanical engineering and robotics 3(3):37 – 41, 2015.
[23] i. temiz. an investigation on the ship rudder with a different control system. electronics and electrical engineering 9(105):28 – 32, 2010.
[24] m. ejaz, m. chen. sliding mode control design of a ship steering autopilot with input saturation. international journal of advanced robotic systems 14(3):1729881417703568, 2017. doi:10.1177/1729881417703568.
[25] r. skjetne, o. smogeli, t. fossen. a nonlinear ship manoeuvering model: identification and adaptive control with experiments for a model ship. modeling, identification and control 25:3 – 27, 2004. doi:10.4173/mic.2004.1.1.
[26] s. das, s. talole. robust steering autopilot design for marine surface vessels. ieee journal of oceanic engineering 41:1 – 10, 2016. doi:10.1109/joe.2016.2518256.
[27] m. pande, k. k. mangrulkar. design of a heading autopilot for mariner class ship with wave filtering based on a passive observer. international journal on mechanical engineering and robotics 3(3):30 – 36, 2015.
[28] r. skjetne, t. i. fossen, p. v. kokotović. robust output maneuvering for a class of nonlinear systems. automatica 40(3):373 – 383, 2004. doi:10.1016/j.automatica.2003.10.010.
[29] m. beirami, h. y. lee, y.-h. yu. implementation of an auto-steering system for recreational marine crafts using android platform and nmea network. journal of the korean society of marine engineering 39:577 – 585, 2015. doi:10.5916/jkosme.2015.39.5.577.
[30] m. tomera. nonlinear controller design of a ship autopilot. international journal of applied mathematics and computer science 20(2):271 – 280, 2010. doi:10.2478/v10006-010-0020-8.
[31] h. a. bouziane, r. b. bouiadjra, m. b. debbat. design of robust lqr control for dc-dc multilevel boost converter. in 4th international conference on electrical engineering (icee), pp. 1 – 5. 2015.
[32] p. jain, m. j. nigam. design of a model reference adaptive controller using a modified mit rule for a second order system. advance in electronic and electrical engineering 3(4):477 – 484, 2013.
[33] p. swarnkar, s. jain, r. k. nema. effect of adaptation gain in model reference adaptive controlled second order system. engineering, technology & applied science research 1(3):70 – 75, 2011.
[34] i. abuiziah, n. hakarneh. a review of genetic algorithm optimization: operations and applications to water pipeline systems. international journal of mathematical and computational sciences 7(12):1782 – 1788, 2013.
acta polytechnica 57(3):159–166, 2017, doi:10.14311/ap.2017.57.0159

evaluation of the application of a thermal insulation system: in-situ comparison of seasonal and daily climatic fluctuations

jan fořt a,∗, pavel beran b, petr konvalinka a, zbyšek pavlík a, robert černý a
a czech technical university in prague, faculty of civil engineering, thákurova 7, 166 29 prague, czech republic
b academy of sciences of the czech republic, institute of theoretical and applied mechanics, centre of excellence telč / arcchip, batelovská 485-486, 588 56 telč, czech republic
∗ corresponding author: jan.fort.1@fsv.cvut.cz

abstract. the current outdated state of many institutional and administrative buildings in the eu region poses a significant burden from the energy sustainability point of view. according to the contemporary eu requirements on the energy efficiency of building maintenance, an evaluation of the performed improvements is essential for the assessment of the expended investments. this paper describes the effect of a building envelope reconstruction consisting in the installation of a thermal insulation system. here, a long-term continuous monitoring is used for an extensive assessment of the seasonal and daily temperature and relative humidity fluctuations. the obtained results include temperature and relative humidity profiles in the wall cross-section as a response to the changing exterior climatic conditions. the analysis of the measured data reveals substantial improvements in the thermal stability of the analyzed wall during temperature peaks. while indoor temperatures exceeding 28 °c were recorded during summer before the application of the thermal insulation layer, the thermal stability of the indoor environment is distinctly upgraded after the performed improvements. based on the complex long-term monitoring, a relevant experience is gained for the future work on the energy sustainability and the fulfilment of the eu directives.

keywords: in-situ monitoring; temperature; relative humidity; thermal insulation; energy sustainability; seasonal fluctuations.

1. introduction

building maintenance represents a substantial part of the total worldwide energy consumption. according to numerous studies, buildings pose a large and unexploited source for a reduction of carbon dioxide emissions as well as potential energy savings. to be specific, buildings present the third most demanding sector after industry and transportation from the environmental point of view.
particularly, the heating and cooling requirements are responsible for almost 40 % of the total energy consumed by the building sector [1]. within the sustainable development, the new european directives are aimed at a better energy performance of buildings in all involved countries. according to the 2010/31/eu directive, the public administration and institution buildings are considered as the leading sector for making progress in the energy efficiency of buildings [2]. therefore, all member states are required to minimize the energy demands related to the building maintenance. moreover, for the new public administration and institution buildings, it will be obligatory to fulfil the directive standards of a nearly zero energy building [3]. these commitments, including the exact definition of the nearly zero energy building, should be precisely defined at the national level. the 2012/27/eu directive defines proper strategies of investments into building renovations to increase the modernization rate of current buildings [4].

notwithstanding, many institutional and public administration buildings in the czech republic are substantially outdated, especially from the thermal performance point of view. the buildings designed in previous decades, in many cases, do not provide stable ambient conditions, particularly during seasonal temperature fluctuations. they also do not meet the requirements on the thermal comfort of inhabitants. moreover, regarding the efforts to decrease the energy consumption, these outdated buildings with a poor thermal performance require an installation of additional heating/cooling devices [5]. this common temporary solution is beneficial for managing the thermal comfort, but it unfortunately leads to an energy demanding and noisy service. the utilization of portable a/c devices does not provide an adequate solution from the long-term perspective.

a proper understanding of the relationship between the exterior climatic conditions and the quality of the indoor environment is based on the monitoring of temperature, relative humidity, illumination, air quality, etc. [6]. the influence of the utilization of various monitoring techniques on the energy reduction solution was studied by batista et al. [7]. here, a decision based mainly on a building simulation is not recommended. the explanation of this conclusion lies in the differences between the building plans and the real state of the building, including faults that occurred during the construction process, which can significantly affect the obtained data. the research methods can be categorized as qualitative and quantitative [8]. spot or long-term monitoring methods can be distinguished in order to evaluate the indoor thermal comfort as well as the environment quality from the perspective of the quantitative methods [9]. from the environmental viewpoint, the total energy consumption related to a certain time interval is probably the most commonly used evaluation method [10]. the life cycle assessment, the primary consumption or the carbon emissions equivalent are extensively used, especially in relation to the life cycle of the building or of particular materials [11].

figure 1. photo of the studied building: before the application of thermal insulation (left); after reconstruction works (right).
the low thermal resistance of the external walls of an outdated building is mostly fixed by the application of an exterior insulation system, which ensures substantial improvements of the thermal comfort as well as of the energy sustainability. however, besides the accomplishment of the eu directives aimed at the nearly zero energy buildings, appropriate ambient climatic conditions must be ensured for the convenience of the building residents [12]. although modern energy efficiency measures are more or less concentrated on the energy consumption, mikulčioniene et al. [13] presented a multi-criteria method to promote building retrofitting. the indoor thermal comfort can be perceived as one of the most important factors, and its achievement for the building occupants represents a substantial objective, which should be satisfied in order to provide appropriate working and health conditions [14]. according to the study of zhang et al. [15], some differences can be found in terms of the ideal indoor temperature: in china, the indoor comfort zone is around 21.7 °c, while the ideal indoor temperature in europe is 23.4 °c. therefore, many studies were proposed in order to optimize the indoor thermal comfort, the natural ventilation, or the employment of an hvac system. however, the maintenance of the ambient conditions directly affects the energy needed for the heating or cooling purposes. the importance of the maintenance of the ambient air is connected not only with the comfort of the building occupants, but also with health issues. the negative health consequences of undesirable indoor conditions, such as overheating and a low or high relative humidity, consist in an irritation of eyes, dryness of skin and throat, a reduced work efficiency and respiratory problems [16]. moreover, a higher risk of condensation may cause a material defacement and a proliferation of microorganisms [17]. the risks of water vapour condensation, frost damage and mould growth should be considered in all renovation and building insulation works [18].

within this study, an outdated institutional building from the 1960s is subjected to a long-term temperature and relative humidity monitoring beginning in 2012. based on the preliminary data, the overheating during the summer period and substantial heat losses during the winter period resulted in a high energy consumption. the installation of the additional external insulation system was done in autumn/winter 2013, in order to improve the indoor environment quality for the building inhabitants and to decrease the sensitivity of the building to the exterior temperature fluctuations. the performance of the applied insulation system is evaluated on the basis of the comparison of the long-term monitoring data before and after the conducted reconstruction works.

2. experimental

2.1. studied building

the investigations were done on an institutional building built in the 1960s, located in the north-west suburb of prague, the capital of the czech republic. the studied building is composed of a reinforced concrete column structure having 3 stories. the exterior envelope is constructed from hung facade slabs formed of steel frames, aluminium window frames, glass, and insulation boards. part of the envelope is built from cavity brick blocks provided on the surface with a cement-lime plaster.
element   u-value [w/m2k]: initial state   final state
facade    1.12                             0.29
roof      0.86                             0.15
window    2.40                             1.21
door      1.72                             1.22

table 1. summarized description of the initial and final state of the studied building.

the additional exterior contact thermal insulation system was applied during autumn 2013 (october–december). it consists of the adhesive mortar jub jubizol (a polymer-modified micro-reinforced cement mortar mixture) used for gluing the insulation boards, the expanded polystyrene board eps perimetr having a thickness of 150 mm as a brown coat, an alkali resistant reinforcing mesh, the acrylic base paint unigrund applied under thin-film plasters, and the water vapour permeable siloxane plaster unisil g with fungicidal additives. the insulation boards are fastened by fixing plastic/metal anchors. the outside view of the reconstructed building envelope and the applied thermal insulation system is given in figure 1. the monitoring of the hygrothermal performance was done for the part of the envelope formed by the cavity brick blocks and the exterior and interior plasters. this part of the building facade is considered to be the main cause of the heat losses of the building; here, material durability problems are also expected. new windows with an insulating glazing and frames, together with a new insulated door and gates, were installed in order to improve the thermal stability of the building. the summarized information on the initial and final state of the building is described in table 1.
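the improvement of the envelope u-values in table 1 follows from the series addition of thermal resistances. the fragment below is a back-of-envelope sketch; the thermal conductivity used is an assumed catalogue value, so the result only approximates the tabulated figures.

```python
# a back-of-envelope sketch of how an added insulation layer lowers the
# u-value via series thermal resistances.
def u_with_insulation(u_old: float, d: float, lam: float) -> float:
    """u_old [w/(m2 k)], insulation thickness d [m], conductivity lam [w/(m k)]."""
    return 1.0 / (1.0 / u_old + d / lam)   # add the insulation resistance

u_new = u_with_insulation(1.12, 0.15, 0.04)   # facade, 150 mm of insulation
print(f"facade u-value: 1.12 -> {u_new:.2f} w/(m2 k)")   # ~0.22
```

the tabulated final value of 0.29 w/m2k is somewhat higher than this estimate, which is plausible once the thermal bridges from the fixing anchors and the actual material data enter the full calculation.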
2.2. hvac system

the studied building is not equipped with its own heat source, but is connected to the central heat supply system. the time heat regulation is installed only for the administrative part of the building; the rest is equipped only with thermostatic valves. there is no regulating fitting on the return pipes. the running of the heating system is ensured by suitable pumps with a continuous regulation. due to the size of the object, it is possible to consider the heating system as regulated at the main points and meeting the requirements of the decree no. 194/2007; however, the hydraulic regulation is not optimal. the thermal insulation of the heat distribution pipes is in a good visual condition, and the energy losses related to the material conditions are approximately 7 %. the building is not equipped with any central ventilation or a/c devices; only portable a/c devices are partly used. ventilation is managed only naturally, by opening windows.

2.3. sensors placing

the tested structure was continuously monitored by relative humidity and temperature measurements. the combined mini-sensors produced by ahlborn (germany) were used; they allow a measurement of the relative humidity in the range of 5–98 % with an accuracy within ±2 %, and the resistance temperature sensors have an accuracy of ±0.4 °c in the temperature range of −20 to 0 °c and ±0.1 °c in the range of 0–70 °c. the particular sensors were connected with a measuring unit that was controlled by a computer. the data were continuously collected, and the experiment was operated by a remote computer station. in total, 16 measuring channels were installed. the measurement has been running since april 2012 and still continues. the sensor placement is shown in figure 2; the dimensions used for the sensor placement description are stated in mm.

figure 2. scheme of sensors placement.

together with the monitoring of the temperature and relative humidity profiles, a simultaneous measurement of the exterior climatic conditions was done using the weather station fiedler-mágr (czech republic) with 16 recording channels. the following sensors were applied: the w2t sensor for the measurement of the wind velocity and direction (heated version), the rv12 sensor for the relative humidity and temperature measurement, the precipitation gauge sr02 with a collecting area of 200 cm2 and a resolution of 0.2 mm, and a pyranometer for the measurement of the global solar radiation. the weather station was placed on the roof of the studied building.

3. results and discussion

the exterior climatic conditions, namely the monthly average temperatures and the monthly average relative humidity, are shown in figure 3.

figure 3. average monthly climatic data.

here, an overview and a year-by-year comparison show only minor differences between the average values of the monitored outdoor conditions. in general, the average monthly temperature was slightly higher in 2014 compared to 2013, which was more distinct during spring and autumn.

3.1. seasonal fluctuations

3.1.1. temperature profiles

the temperature profiles in selected representative days from spring, summer, autumn and winter are presented in figure 4, where distinct differences can be observed. the temperature profiles were determined for selected days, which allows a year-to-year comparison based on similar day weather conditions. the days selected within the evaluation represent critical exterior conditions typical for the particular period, for example, a high temperature during the summer and a low temperature during the winter period. this selection allows an assessment of the applied insulation layer by temperature profiles, which are typical for a certain period and represent the exterior climate loads. for the day used for the description of 2012, the corresponding days with a very similar temperature and relative humidity were identified and further analysed. the conditions before the application of the insulation system were unpleasant, especially for the building occupants, as an apparent thermal discomfort (during the summer and autumn seasons) was noted. to be specific, the temperature during the summer reached almost 28 °c and in autumn dropped to 20 °c. in terms of the ideal indoor temperature [15], this unsatisfactory state requires a remediation. the undesirable indoor environment could certainly affect the work efficiency and health of the building occupants in a negative way. according to the temperature profiles from 2013 (before the reconstruction) plotted in figure 4, it is possible to conclude that the building envelope suffered from the changing climatic conditions and the thermal stability was considerably poor. the temperature profiles from the summer period referred to the building overheating problems and the necessity to apply additional cooling devices [19]. from the perspective of the building occupants, the interior temperature of almost 28 °c was very unpleasant and worsened the quality of the working environment [9]. it was revealed that the interior temperature remained high due to the solar heat gains through the hung facade system. in the light of these findings, the application of the external insulation system was urgent in order to improve the thermal performance of the outdated building envelope [20]. the importance of the consideration of the day temperature peak was studied in [21]; here, a continuous monitoring was used in order to estimate the time of the temperature peak of several buildings. the obtained results point to the substantial influence of the building orientation and the consequent differences between the energy released during various day periods or seasons.
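the seasonal statistics of this section reduce the continuously logged data to monthly means and daily peaks. the fragment below is a minimal sketch of such a reduction; the file name and column names are hypothetical, not the actual export format of the logging station.

```python
# a minimal sketch of the data reduction behind figure-3-style monthly
# means and the peak-time analysis discussed above.
import pandas as pd

df = pd.read_csv("weather_station.csv",
                 parse_dates=["timestamp"], index_col="timestamp")

# monthly means for the year-by-year comparison (cf. figure 3)
monthly = df[["temperature", "relative_humidity"]].resample("MS").mean()
print(monthly.head(12))

# time of the daily temperature peak, as used in peak-load studies
peak_time = df["temperature"].groupby(df.index.date).idxmax()
print(peak_time.head())
```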
a complex assessment of a campus building in france, including different aspects, allowed to overcome the dilemma between the comfort and the energy efficiency. the employment of the pmv (predicted mean vote) and the ppd (predicted percentage of dissatisfied) indices revealed, in some cases, situations when certain phenomena induce a thermal discomfort and an increased energy consumption at the same time [22]. research focused on the influence of seasonal variations on the indoor temperature clarified these fluctuations: the authors concluded that the neutral temperature is higher in autumn compared to spring, due to a higher number of used fans and opened windows. thus, the occupants feel cooler at the same temperature in autumn, as a result of a higher indoor air velocity.

figure 4. seasonal comparison of temperature profiles.

figure 5. seasonal comparison of relative humidity profiles.

the temperature profiles measured after the installation of the thermal insulation layer show a significant effect of the applied layer towards the energy sustainability and a better thermal stability. the insulation layer, formed by a mineral wool, substantially reduced the negative effect of the temperature fluctuations on the interior temperature response. compared to the temperature data before the reconstruction, the temperature profiles in the load bearing structure exhibited a lower temperature gradient. this fact can be assigned to the significant improvement of the thermal performance of the building envelope. in general, the average indoor temperature of about 26.5 °c during the summer period was decreased to a more comfortable 23.2 °c. moreover, the costs related to the building maintenance, particularly heating, were substantially reduced – approximately by half.

3.1.2. relative humidity profiles

the representative relative humidity profiles in the wall cross-section obtained during winter, spring, summer and autumn are given in figure 5. the obtained values referred to a low level of the indoor relative humidity, which could have a negative influence on the occupants' health (eye irritation, dry throat) and work efficiency, especially during the winter period, when the rh level drops below 30 %. however, for the rest of the year, the relative humidity values are kept in the comfort zone and the ambient conditions are satisfactory for the building inhabitants. from this point of view, the application of the thermal insulation layer did not have any substantial effect on the indoor relative humidity levels. here, only minor changes were obtained within the year-to-year comparison; for example, a slight decrease of the indoor air humidity was observed. the effect of the application of thermal insulation layers on the indoor relative humidity was studied, e.g., by korjenic et al. [23], with similar conclusions. this problem is widely analysed in modern low energy and passive houses, where the application of an insulation polystyrene layer has a negative effect on the relative humidity level. here, the air recuperation is the most commonly used way to moderate the relative humidity level and keep it in the desired range in order to eliminate the health risks related to a low level of relative humidity.
the low level of relative humidity measured during 07.2014 can be attributed to the use of the portable a/c device in order to reduce the increased indoor temperature.

3.2. daily temperature fluctuations

diurnal changes in the exterior conditions, especially during certain periods such as sunny spring days, when the nocturnal temperature often drops below zero, represent a considerable burden for the building envelope, especially due to the possible risk of water vapour condensation. moreover, daily temperature variations in the wall cross-section refer to unstable and undesirable conditions. a low thermal performance of the building envelope is responsible for the higher energy consumption required for the maintenance of the indoor thermal comfort. furthermore, a low thermal inertia plays an important role during temperature peaks and significantly affects the quality of the interior climate for the building occupants. the employment of additional cooling devices is unfortunately accompanied by higher energy demands [24]. also, it is not always possible to use these devices due to space limitations and, last but not least, the operational noise of portable cooling devices substantially limits their utilization on a daily basis. it should be noted that the application of a central a/c solution is not widespread in institutional buildings in the czech republic.

in the measurements performed in this paper, the negative effect of the daily temperature fluctuations was successfully minimized by the application of the mineral wool thermal insulation layer. the overheating of the building was substantially reduced, as documented in figure 6. in the light of these findings, one can conclude that the indoor thermal comfort was significantly improved and the occupants did not suffer from the high temperatures. in this case, days with high temperature values were selected for the evaluation of the influence of the applied insulation system. for this purpose, days with a temperature of about 34.5 °c (at 14:00) were selected in order to assess the maximal temperature load induced by the exterior weather conditions, which significantly affect the ambient conditions. the reason for choosing the summer period is based on the problems related to room overheating. during the winter period, almost no changes are observed, because the temperature profiles are strongly connected with the building heating, and the changes are compensated by the increased energy consumption.

the daily variation of the relative humidity level was negligible, as illustrated in figure 7. in order to assess the diurnal relative humidity fluctuation, typical winter days with 78.6 % rh were used for this comparison. the application of the thermal insulation layer did not affect the relative humidity profiles in the wall cross-section; however, lower values of the relative humidity pose a possible health risk to the building occupants, as stated above. notwithstanding, the diurnal fluctuations are only minor and, from this point of view, no significant changes occurred.
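how strongly an insulation layer damps the daily temperature wave can be estimated from the theory of periodic heat conduction. the sketch below computes the decrement factor and time lag of a single homogeneous layer treated as a semi-infinite medium under a sinusoidal daily cycle; the mineral-wool properties are illustrative assumptions, not measured values of the installed system:

```python
import math

# attenuation of the daily temperature wave in a homogeneous layer
lam, rho, cp = 0.04, 100.0, 840.0       # W/mK, kg/m3, J/kgK (mineral wool, assumed)
alpha = lam / (rho * cp)                # thermal diffusivity [m2/s]
omega = 2.0 * math.pi / 86400.0         # angular frequency of the daily cycle [1/s]
delta = math.sqrt(2.0 * alpha / omega)  # penetration depth [m]

L = 0.10                                # layer thickness [m]
decrement = math.exp(-L / delta)        # amplitude ratio inner/outer
time_lag_h = L / (delta * omega) / 3600.0

print(f"penetration depth: {delta * 100:.1f} cm")
print(f"decrement factor:  {decrement:.3f}")
print(f"time lag:          {time_lag_h:.1f} h")
```

for these assumed values, a 10 cm layer passes only about 40 % of the exterior daily amplitude and shifts the interior peak by roughly three hours, which is consistent with the reduced overheating documented in figure 6.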
4. conclusions

the evaluation of the functionality of the thermal insulation system applied on the outdated institutional building from the 1960s proved its beneficial influence on the thermal stability. thanks to the sensors installed in different depths, it was possible to perform a continuous monitoring of the temperature and relative humidity fluctuations related to the changing exterior climatic conditions. a detailed analysis of the pre-reconstruction conditions in the building envelope allowed the identification of the fundamental factors, which substantially influenced the obtained results and the previous low thermal performance. together with the seasonal variations, the diurnal fluctuations of the studied parameters were also identified as critical factors for the seasonal and daily overheating. the comparison of the temperature and relative humidity profiles in the wall cross-section between the pre-reconstruction state and the state after the reconstruction confirmed the substantial improvements achieved in the field of the thermal stability, resulting in more comfortable conditions for the building occupants. the obtained results also showed decreased energy consumption demands.

acknowledgements

this research has been supported by the czech science foundation, under project no. gbp105/12/g059.

references

[1] iea: co2 emissions from fuel combustion. international energy agency, paris, france, p. 512, 2008.
[2] eu commission: commission delegated regulation (eu) no. 244/2012 supplementing directive 2010/31/eu of the european parliament and of the council on the energy performance of buildings. official journal of the european union, 2012.
[3] berry, s., whaley, d., davidson, s., saman, w.: near zero energy homes – what do users think? energ. policy, 73, 2014, p. 127-137, doi:10.1016/j.enpol.2014.05.011
[4] ascione, f., bianco, n., bottcher, o., kaltenbrunner, r., vanoli, g. p.: net zero-energy buildings in germany: design, model calibration and lessons learned from a case-study in berlin. energy build., 133, 2016, p. 688-710, doi:10.1016/j.enbuild.2016.10.019
[5] carbonara, g.: energy efficiency as a protection tool. energy build., 95, 2015, p. 9-12, doi:10.1016/j.enbuild.2014.12.052
[6] balaras, c. a.: european residential buildings and empirical assessment of the hellenic building stock, energy consumption, emissions and potential energy savings. build. environ., 42, 2007, p. 1298-1314, doi:10.1016/j.buildenv.2005.11.001
[7] batista, a. p., freitas, m. a., jota, f. g.: evaluation and improvement of the energy performance of a building's equipment and subsystems through continuous monitoring. energy build., 75, 2014, p. 368-381, doi:10.1016/j.enbuild.2014.02.029
[8] guerra-santin, o., tweed, ch. a.: in-use monitoring of buildings: an overview of data collection methods. energy build., 93, 2015, p. 189-207, doi:10.1016/j.enbuild.2015.02.042
[9] citherlet, s., hand, j.: assessing energy, lighting, room acoustic, occupant comfort and environmental impacts performance of building with a single simulation program. build. environ., 37, 2002, p. 845-856, doi:10.1016/s0360-1323(02)00044-6
[10] berry, s., davidson, k.: zero energy homes – are they economically viable? energ. policy, 85, 2015, p. 12-21, doi:10.1016/j.enpol.2015.05.009
[11] chantrelle, f. p., lahmidi, h., keilholz, w., el mankibi, m., michel, p.: development of a multicriteria tool for optimizing the renovation of buildings. appl. energy, 88, 2011, p. 1386-1394, doi:10.1016/j.apenergy.2010.10.002
[12] moatassem, a., el-rayes, k., liu, l.: economic and ghg emission analysis of implementing sustainable measures in existing public buildings. j. performance constr. facilit., vol. 30, no. 6, 2016, p. 165-174,
doi:10.1061/(asce)cf.1943-5509.0000911
[13] miulčioniene, r., martinaitis, v., keras, e.: evaluation of energy efficiency measures sustainability by decision tree method. energy build., 76, 2014, p. 64-71, doi:10.1016/j.enbuild.2014.02.048
[14] ascione, f., bianco, n., de masi, r. f., vanoli, g. p.: rehabilitation of the building envelopes of hospitals: achievable energy savings and microclimatic control on varying the hvac systems in mediterranean climates. energy build., 60, 2013, p. 125-138, doi:10.1016/j.enbuild.2013.01.021
[15] zhang, n., cao, b., wang, z., zhu, y., lion, b.: a comparison of winter indoor thermal environment and thermal comfort between regions in europe, north america and asia. build. environ., 117, 2017, p. 208-217, doi:10.1016/j.buildenv.2017.03.006
[16] zhang, h., gong, n., zhao, l.: indirect effects of air-conditioning relative humidity of hospital buildings in summer on human health. journal of central south university (science and technology), 43, 2012, p. 3727-3733.
[17] dall'o', g., sarto, l., galante, a., pasetti, g.: comparison between predicted and actual energy performance for winter heating in high-performance residential buildings in the lombardy region (italy). energy build., 47, 2012, p. 247-253, doi:10.1016/j.enbuild.2011.11.046
[18] toftum, j., jorgensen, a. s., fanger, p. o.: upper limits for indoor air humidity to avoid uncomfortably humid skin. energy build., 28, 1998, p. 1-13, doi:10.1016/s0378-7788(97)00017-0
[19] brunner, s., simmler, h.: in-situ performance assessment of vacuum insulation panels in a flat roof construction. vacuum, 82, 2008, p. 700-707, doi:10.1016/j.vacuum.2007.10.016
[20] kruger, e., givoni, b.: thermal monitoring and indoor temperature predictions in a passive solar building in an arid environment. build. environ., 43, 2008, p. 1792-1804, doi:10.1016/j.buildenv.2007.10.019
[21] sham, j. f. c., memon, s. a., lo, t. y.: application of continuous surface temperature monitoring technique for investigation of nocturnal sensible heat release characteristics by building fabrics in hong kong. energy build., 58, 2013, p. 1-10, doi:10.1016/j.enbuild.2012.11.025
[22] allab, y., pellegrino, m., guo, x., nefzaoui, e., kindinis, a.: energy and comfort assessment in educational building: case study in a french university campus. energy build., 143, 2017, p. 202-219, doi:10.1016/j.enbuild.2016.11.028
[23] korjenic, a., teblick, h., bednar, t.: increasing the thermal indoor humidity levels in buildings with ventilation systems: simulation aided design in case of passive houses. building simul., 2010, p. 295-310, doi:10.1007/s12273-010-0015-2
[24] andersen, r., fabi, v., corgnati, s.: predicted and actual indoor environmental quality: verification of occupants' behaviour models in residential buildings. energy build., 127, 2016, p. 110-115, doi:10.1016/j.enbuild.2016.05.074
acta polytechnica 57(3):159–166, 2017

acta polytechnica vol. 44 no. 2/2004

dynamics of micro-air-vehicle with flapping wings

k. sibilski

abstract: small (approximately 6 inch long, or hand-held) reconnaissance micro air vehicles (mavs) will fly inside buildings, and require hover for observation, and agility at low speeds to move in confined spaces. for this flight envelope insect-like flapping wings seem to be an optimal mode of flying. investigation of the aerodynamics of flapping wing mavs is very challenging. the problem involves complex unsteady, viscous flow (mainly laminar), with the moving wing generating vortices and interacting with them. at this early stage of research only a preliminary insight into the nature of the little known aerodynamics of mavs has been obtained. this paper describes computational models for simulation of the controlled motion of a microelectromechanical flying insect – entomopter. the design of software simulation for entomopter flight (ssef) is presented. in particular, we will estimate the flight control algorithms and performance for a micromechanical flying insect (mfi), a 80–100 mm (wingtip-to-wingtip) device capable of sustained autonomous flight. the ssef is an end-to-end tool composed of several modular blocks which model the wing aerodynamics and dynamics, the body dynamics, and in the future, the environment perception, control algorithms, the actuators dynamics, and the visual and inertial sensors. we present the current state of the art of its implementation, and preliminary results.

keywords: micromechanical flying insect, entomopter aerodynamics, entomopter dynamics, insect flight, software simulator, insect aerodynamics, insect dynamics.

1 introduction

the development of small (approximately 6 inch long, or hand-held) autonomous flying vehicles is driven by the need for intelligent reconnaissance robots, capable of discreetly penetrating confined spaces, and manoeuvring in them without the assistance of a human telepilot [7]. this is particularly relevant to military operations in urban terrain. flight inside buildings, stairwells, shafts and tunnels has significant military and civilian value, and requires agility at low speeds to avoid obstacles and move in confined spaces. the vehicles can be used in dull, dirty or dangerous (d3) environments, where direct or remote human assistance is not practical. non-military uses will include law enforcement and rescue operations. the ability to explore d3 environments without human involvement will be of interest for many industries, for example, enabling air sampling in inaccessible areas, and examination of confined spaces in buildings, installations and large machines. the flight envelope of mavs requires high agility (including hover) at low speeds (1–2 m/s) and silent flight, which is not easily met by scaled-down fixed or rotary wing aircraft. however, insect-like wing-flapping flight would appear to be very suitable for such applications requiring highly manoeuvrable flight through confined spaces. very recently, it has been recognized that flapping wing propulsion can be more efficient than conventional propellers if applied to mavs, because of the very small reynolds numbers encountered on such vehicles. flapping flight is more complicated than flight with fixed or rotating wings. the key to understanding the mechanisms of flapping flight is adequate physical and mathematical modeling. animal propulsion by means of flapping wings was the focus of considerable interest in the late 1990s. this is due to the relatively high efficiency obtainable by such a mode of flight. flapping flight for micro-robots (known also as mavs or micro-flyers) is not only an intriguing mode of locomotion but provides manoeuvrability not obtainable with fixed or even rotary wing aircraft. the mav is comparable in size with small birds and large insects. in order to obtain a satisfactory explanation of animal flight features, it is necessary to create adequate physical, mathematical and computational models. the key to this is to understand how the complex motions of animal wings generate aerodynamic forces. however, very little is still known about the flight dynamics and automatic control of flying micro-robots. the unconventional aerodynamic concept associated with mavs deserves a more detailed explanation.
insects fly by oscillating (frequency range: 5–200 hz) and rotating their wings through large angles, which is possible because their wing articulation is not limited by an internal skeleton. the wing beat cycle can be divided into two distinct phases, the downstroke and the upstroke. at the beginning of the downstroke the wing (as seen from the front of the insect) is in the uppermost position with the leading edge pointing forward. the wing is then pushed downwards and rotated continuously, resulting in large changes to the angle of attack. at the end of the downstroke the wing is twisted rapidly so that the leading edge points backwards, and the upstroke begins. during the upstroke the wing is pushed upwards and rotated again, changing the angle of attack throughout this phase. at the highest point the wing is twisted, so that the leading edge is pointing forwards again, and the next downstroke begins.

another important problem is the control of motion. stabilising control is made difficult because the wings do not have typical control surfaces such as ailerons. influence on the motion is possible only through changing the amplitudes and frequencies of flapping, lagging, and feathering of the wings. the thrust of an entomopter depends on the local angles of attack, and these depend on the parameters of flapping and feathering. in forward flight the downstroke lasts longer than the upstroke, because of the need to generate thrust in addition to lift. in the hover, where only lift is required, the two strokes are of equal duration. this mode of flying relies on unsteady aerodynamics [1], producing high lift coefficients (a peak cl of the order of 3 is typical [1, 2, 3]), and excellent manoeuvrability. the unsteady mechanism varies with different insects, the most important being a bound leading edge vortex [4].
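the stroke cycle described above can be parametrised very compactly. the sketch below is a minimal first-harmonic model of the flapping and feathering angles, with the wing pitching near stroke reversal; the frequency and amplitudes are illustrative round numbers, not data from the paper:

```python
import math

F = 25.0        # wingbeat frequency [Hz] (assumed)
PHI_0 = 60.0    # flapping amplitude [deg] (assumed)
ALPHA_0 = 45.0  # feathering amplitude [deg] (assumed)

def wing_angles(t):
    """return (flapping, feathering) angles in degrees at time t [s];
    the 90-degree phase shift makes the wing rotate near stroke reversal."""
    phase = 2.0 * math.pi * F * t
    flap = PHI_0 * math.cos(phase)
    feather = ALPHA_0 * math.sin(phase)
    return flap, feather

for k in range(5):
    t = k / (4.0 * F)   # quarter-period samples over one wingbeat
    flap, feather = wing_angles(t)
    print(f"t = {t * 1000:5.1f} ms  flap = {flap:6.1f} deg  feather = {feather:6.1f} deg")
```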
high lift is a major factor in the high efficiency of the mechanism: a typical power requirement for insects is 30 w/kg [2, 11], whereas small, electrically-powered, propeller-driven, fixed wing aircraft require about 150 w/kg. insect wing flapping occurs in a stroke plane that generally remains at the same orientation to the body, and may be horizontal or inclined. rapid rotations occur at each end of the flapping half-stroke. to a first approximation, kinematic control of insect flight manoeuvres is provided by changes in the tilt of the stroke plane, which is analogous to helicopter control. precise control is achieved by including inter-wing differences in the magnitude of the force produced, the timing of the downstroke-to-upstroke wing rotation, and the geometric position of the wings when the rotation occurs.

the primary goal of this work is to design the software simulation for a micromechanical flying insect (entomopter). the entomopter flight simulator should be an end-to-end tool composed of several modular blocks, which model: the wing aerodynamics, the body motion, and the control algorithms. we present the design of a software simulation for entomopter flight, the current state of the art of its implementation and preliminary results of calculations.

2 the main issues

features of airfoils for animals

the typical airfoils for birds and bats are thin and cambered, which means that they generate very little leading-edge suction. cruising birds and bats fly with their flapping axes aligned close to horizontal. this could produce an interesting dilemma for the upstroke. insects have low-aspect-ratio wings, which are not suitable for cruising flight. during the downstroke, the insect generates mainly a vertical force. the acceleration of the insect's body during the first half of the downstroke is especially large, and this acceleration is mainly caused by a large unsteady pressure drag acting on the wings. during the upstroke the insect generates mainly a horizontal force. the change of direction of the forces during the down-and-up-strokes is controlled by variations in the inclination of the stroke plane.

aerodynamic phenomena

as the size of an aircraft is reduced, the need for efficiency in terms of lift and propulsion generation becomes more evident. reducing the size of the lifting surfaces and keeping the flight speed around 15 m/s makes the aerodynamic phenomena different from those found in normal size aircraft, mainly due to the very low reynolds number of the flow. moreover, entomopter manoeuvring in this regime is subject to non-linear, unsteady aerodynamic loads [1, 2, 12]. the non-linearities and unsteadiness are due mainly to the large regions of 3-d separated flow and concentrated vortex flows that occur at large angles of attack. accurate prediction of these non-linear, unsteady airloads is of great importance in the analysis of entomopter flight motion and in the design of its flight control system. prediction of the unsteady airloads is complicated by the fact that the instantaneous flowfield surrounding a manoeuvring body, and thus the loading, is not determined solely by the instantaneous values of the motion variables, such as the angles of attack and sideslip, and, particularly in this study, by the deforming flexible wing parameters.
in general, the instantaneous state of the flowfield depends on the time-history of the motion, that is, on all the states taken by the flowfield during the manoeuvre prior to the instant in question.

degrees of freedom

at first sight, the study of entomopter flight dynamics and control may seem very complicated, since each wing possesses degrees of freedom in addition to those of the "fuselage". detailed analyses of kinematics are central to an integrated understanding of animal flight. four degrees of freedom in each wing are used to achieve flight in nature: flapping, lagging, feathering, and spanning. flapping is a rotation of the animal wing about the longitudinal axis of the animal body (this axis lies in the direction of the flight velocity), i.e. this is the "up and down" motion. lagging is a rotation about a "vertical" axis; this is the "forward and backward" wing motion parallel to the body. feathering is an angular movement about the wing longitudinal axis (which may pass through the centre of gravity of the wing) which tilts the wing to change its angle of attack. spanning is an expansion and contraction of the wingspan. not all flying animals perform all of these motions. for instance, insects with low wing flap frequencies of about 20 hz (17–25 hz) generally have very restricted lagging capabilities. unlike birds, most insects do not use the spanning technique. insects such as the alderfly (apatele alni) and the mayfly (ephemera) have fixed stroke planes with respect to their bodies. thus, flapping flight is possible with only two degrees of freedom: flapping and feathering. in the simplest physical models heaving and pitching represent these degrees of freedom. the motion of each bird wing may be decomposed into flapping, lagging, feathering (the rigid body motions) and also into more complex deflections of the surface from the base shape (vibration modes).

example of numerical calculations

sample results of calculations illustrating the current capabilities of the method and providing a preliminary insight into the aerodynamic behaviour of flapping wings are shown in fig. 1. in order to evaluate the performance of flight control algorithms, a software for simulation of entomopter flight (ssef) is being implemented to simulate the flight of an mfi inside a virtual environment. the ssef is decomposed into several modular units, each of them responsible for an independent task. the aerodynamic module takes as its input the wing motion and the mfi body velocities, and gives as the output the corresponding aerodynamic forces and torques. this module corresponds to a mathematical model for the aerodynamics. the body dynamics module takes the aerodynamic forces and torques generated by the wing kinematics and integrates them along with the dynamic model for the mfi body, thus computing the body's position and attitude as a function of time. the control system module takes as its input the mfi body state and eventually the perception of the external world. its task is to decide a control strategy to achieve a desired mission and to generate the control signals to the electromechanical system. the electromechanical system takes as its input the electrical control signals generated by the control system module and generates the corresponding wing kinematics. it consists of the model of the electromechanical wing-thorax architecture and the aerodynamic damping on the wings.
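the modular decomposition described above maps naturally onto a simple simulation loop. the skeleton below is our own illustration of such an architecture — the class and method names are invented for this sketch and the module bodies are trivial stubs, not the actual ssef code:

```python
# illustrative skeleton of a modular flight-simulator loop; all names
# are hypothetical and the physics is stubbed out.
class AerodynamicsModule:
    def loads(self, wing_state, body_state):
        # a real module would evaluate strip theory or an unsteady
        # vortex lattice model; here: zero forces and torques
        return (0.0, 0.0, 0.0), (0.0, 0.0, 0.0)

class BodyDynamicsModule:
    def __init__(self):
        self.state = {"position": [0.0] * 3, "attitude": [0.0] * 3}
    def step(self, forces, torques, dt):
        # a real module would integrate the rigid-body equations of motion
        return self.state

class ControlModule:
    def command(self, body_state):
        return [0.0]  # electrical control signals (stub)

class ElectromechanicalModule:
    def wing_kinematics(self, signals, dt):
        return {"flap": 0.0, "feather": 0.0}  # wing-thorax response (stub)

def simulate(t_end, dt=1e-3):
    aero, body = AerodynamicsModule(), BodyDynamicsModule()
    ctrl, emech = ControlModule(), ElectromechanicalModule()
    body_state, t = body.state, 0.0
    while t < t_end:
        signals = ctrl.command(body_state)          # control system module
        wings = emech.wing_kinematics(signals, dt)  # electromechanical system
        f, m = aero.loads(wings, body_state)        # aerodynamic module
        body_state = body.step(f, m, dt)            # body dynamics module
        t += dt
    return body_state

simulate(t_end=0.1)
```

the point of such a loop is exactly the flexibility claimed for the ssef: any one module can be swapped for a better model without touching the others.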
the traces of the motion of the wings, the corresponding aerodynamic forces acting on them, and the trajectory and heading of the mfi produced by the ssef are combined together into a virtual environment simulation. the ssef architecture is flexible, since it readily allows modifications or improvements of one single module without rewriting the whole simulator. for example, different combinations of control algorithms and electromechanical structure can be tested, giving rise to the more realistic setting of flight control with limited kinematics due to electromechanical constraints. moreover, the dimensions and masses of the wings and body can be modified to analyze their effects on flight stability, power efficiency and maneuverability. finally, as soon as better aerodynamic models are available, the aerodynamic module can be updated to improve accuracy. at present, two parts of the ssef are not implemented: the electromechanical system module and the sensory system module. the following sections present the state of the art of the ssef.

aerodynamic module

animal flapping flight represents an unusual aerodynamic problem because of the inherent "unsteadiness" and the low reynolds number of the airflow. a large number of models for unsteady animal flight have been formulated, and these have been categorised and evaluated in a recent review by smith et al. [17], and by pietrucha et al. [12]. usually, unsteady flow is defined as that in which the aerodynamic characteristics depend on time. among various unsteady flows, linear, harmonic flows are of special importance. the linearity means that the amplitudes of the oscillations are small and that separation does not take place. for such flows it is sufficient that the aerodynamic characteristics are presented versus a frequency parameter; time does not appear explicitly in the functions describing these characteristics.

panel methods

there are several methods for aerodynamic modelling by means of panel methods. in our opinion, the most valuable is the unsteady vortex lattice method. the unsteady vortex lattice method [6] employs an explicit routine for generating the unsteady wakes, instead of the implicit scheme requiring iteration. the wing may be viewed as moving through air either at rest or in motion. thus, the effects of gusts on a manoeuvre can be modelled. the method is especially suitable for an impulsive start. however, all the steady flows can be calculated by giving the wing an impulsive start and then having it move at constant velocity until a steady state is developed. the method is based on the continuity equation with a few additional conditions. the method is especially valuable for analysing different manoeuvres, such as a steady turn, a fast roll about the wing longitudinal axis, response to aileron deflection, etc. to determine the forces and aerodynamic moments acting on an entomopter's wing we have used the modified panel method [6, 17]. the choice of the method was dictated by its easy application and low cost of calculations, which enables this problem to be solved on pc computers. since the object is in unsteady motion, the solution has been found by the time-stepping method – which means that for every time step the wake vortex was suitably modified.

strip theory

the classical strip theory approximation is based on the assumption that each element of a wing can be considered as an airfoil segment of a finite span wing.
lift and drag are then calculated from the resultant velocity acting on the airfoil, each element being considered independent of the adjoining elements. the aerodynamic characteristics of the wing are obtained by integrating the individual contribution of each element along the span. in order to obtain the resultant velocity at a wing element, the total flow over the wing must be known. it is composed of the resultant flight velocity, the flapping velocity, and the induced velocity. a detailed explanation of the assumed formulas and algorithms can be found in [1], [2], and [9].

body dynamics module

given the aerodynamic forces generated by the wing kinematics, the body dynamics module integrates the rigid body equations of motion, and gives the body position and attitude trajectories. the input to the body dynamics module is the stroke angle and the lift and drag forces. it is a well-known fact that larger flying creatures fly principally by gliding or slow beating, whereas smaller animals fly by strong beating at high frequency. thus, the range of the beating frequency and the reynolds number varies greatly. the beating motion of the wings is exclusively used in the powered flight of birds and insects. in flying, this is the only way by means of which these flying creatures can counter the gravity forces and propel themselves against the aerodynamic drag. therefore, detailed analyses of kinematics are central to an integrated understanding of animal flight [1, 2, 4, 8, 9, 11, 12, 13, 14, 15, 16]. as noted above, the motion of an animal wing may be decomposed into flapping, lagging and feathering (the rigid body motions) and also into more complex deflections of the surface from the base shape (vibration modes); this requires a universal joint similar to the shoulder in a human, and a good model of such a joint is the articulated rotor hub. this kind of motion can be generated principally by a flapping (up and down) motion of the wing, but not by a feathering (pitch-up and pitch-down) motion. the mode and frequency of the beating motion differ among different species and are strongly dependent on body size and shape. a typical difference in the beating motion between birds and insects is observed in the way they use the aerodynamic forces, lift and drag. birds rely entirely on lift because the reynolds number of their wings is high enough. however, insects use drag as well as lift, thanks to the low reynolds number and the high frequency beating of low aspect ratio wings.
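as a numerical illustration of the strip approximation just described, the sketch below integrates the quasi-steady lift over independent spanwise elements of one flapping wing in hover. the geometry, flapping rate and the empirical translational lift coefficient are assumed round numbers for an mfi-sized wing, not data of a particular vehicle or of the modified panel method used in this work:

```python
import math

RHO = 1.225  # air density [kg/m3]

def strip_lift(flap_rate, alpha, span=0.05, chord=0.015, n=50):
    """instantaneous lift on one flapping wing in hover: every spanwise
    element moves at u = flap_rate * r and is treated independently of
    its neighbours, as in the classical strip approximation."""
    cl = 1.8 * math.sin(2.0 * alpha)  # crude empirical coefficient (assumed)
    dr = span / n
    lift = 0.0
    for i in range(n):
        r = (i + 0.5) * dr            # element mid-span position [m]
        u = flap_rate * r             # local flapping velocity [m/s]
        lift += 0.5 * RHO * u * u * chord * cl * dr
    return lift

peak_rate = 2.0 * math.pi * 25.0 * math.radians(60.0)  # 25 Hz, 60 deg amplitude
print(f"peak lift per wing: {strip_lift(peak_rate, math.radians(45.0)) * 1000:.2f} mN")
```

for these assumed values the peak lift is of the order of 20 mN per wing, i.e. a few gram-force for a pair of wings — the right order of magnitude for an 80–100 mm mfi.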
3 equations of entomopter motion

gibbs-appel equations

the formalism of analytical mechanics allows us to present the dynamic equations of motion of an entomopter as a mechanical system in generalised co-ordinates. this formalism provides a remarkably convenient tool for constructing the equations of motion of an entomopter. the gibbs-appel equations have the following form [16]:

$$\frac{\partial S}{\partial \ddot{\mathbf{q}}} = \mathbf{Q} \tag{1}$$

where $\mathbf{q}$ is the vector of generalised co-ordinates and $S(\ddot{\mathbf{q}}, \dot{\mathbf{q}}, \mathbf{q}, t)$ is the so-called appel function, or functional of accelerations. the functional $S$ for the $i$-th element of the mechanical system is given by the equation [16]:

$$S_i = \frac{1}{2}\int_{V_i} \dot{\mathbf{v}}_i \cdot \dot{\mathbf{v}}_i \, \mathrm{d}m_i \tag{2}$$

where $\dot{\mathbf{v}}_i$ means the vector of absolute acceleration of the elementary mass $\mathrm{d}m_i$ of the $i$-th body of the dynamical system considered (fig. 2):

$$\dot{\mathbf{v}}_i = \dot{\mathbf{v}}_{0i} + \dot{\boldsymbol{\omega}}_{0i}\times\mathbf{r}_i + \boldsymbol{\omega}_{0i}\times\left(\boldsymbol{\omega}_{0i}\times\mathbf{r}_i\right) \tag{3}$$

assuming that

$$\mathbf{r}_i \in \mathbb{R}^3, \qquad \tilde{\mathbf{r}}_i \in \mathbb{R}^{3\times 3} \tag{4}$$

and

$$\mathbf{M}_i = \begin{bmatrix} m_i\mathbf{I} & -m_i\tilde{\mathbf{r}}_i \\ m_i\tilde{\mathbf{r}}_i^{\mathrm T} & \mathbf{J}_{0i} \end{bmatrix} \tag{5}$$

$$\mathbf{H}_i = \begin{bmatrix} m_i\tilde{\boldsymbol{\omega}}_{0i}\tilde{\boldsymbol{\omega}}_{0i}\mathbf{r}_i + 2m_i\tilde{\boldsymbol{\omega}}_{0i}\mathbf{v}_{0i} \\ \tilde{\boldsymbol{\omega}}_{0i}\mathbf{J}_{0i}\boldsymbol{\omega}_{0i} \end{bmatrix} \tag{6}$$

where $m_i$ is the mass of the $i$-th element, $\mathbf{J}_{0i}$ the tensor of inertia of the $i$-th element, $\boldsymbol{\omega}_{0i}$ the vector of the angular velocity and $\mathbf{v}_{0i}$ the vector of the velocity of the $i$-th element, and where for $\mathbf{a} = [a_1, a_2, a_3]^{\mathrm T}$ the tilde denotes the skew-symmetric (cross-product) matrix

$$\tilde{\mathbf{a}} = \begin{bmatrix} 0 & -a_3 & a_2 \\ a_3 & 0 & -a_1 \\ -a_2 & a_1 & 0 \end{bmatrix},$$

term (2) can be expressed in the following matrix form:

$$S_i = \frac{1}{2}\left(\mathbf{M}_i\dot{\mathbf{v}}_i + \mathbf{H}_i\right)^{\mathrm T}\mathbf{M}_i^{-1}\left(\mathbf{M}_i\dot{\mathbf{v}}_i + \mathbf{H}_i\right) \tag{7}$$

calculating the matrices $\mathbf{M}_i$, $\mathbf{H}_i$ and the appel function $S_i$ for all $k$ bodies of the system, and defining the matrices

$$\mathbf{M} = \mathrm{diag}\left(\mathbf{M}_1, \mathbf{M}_2, \dots, \mathbf{M}_k\right) \tag{8}$$

$$\mathbf{v} = \left[\mathbf{v}_1^{\mathrm T}, \mathbf{v}_2^{\mathrm T}, \dots, \mathbf{v}_k^{\mathrm T}\right]^{\mathrm T} \tag{9}$$

$$\mathbf{H} = \left[\mathbf{H}_1^{\mathrm T}, \mathbf{H}_2^{\mathrm T}, \dots, \mathbf{H}_k^{\mathrm T}\right]^{\mathrm T} \tag{10}$$

the functional $S$ for the whole mechanical system is given by the equation

$$S = \frac{1}{2}\left(\mathbf{M}\dot{\mathbf{v}} + \mathbf{H}\right)^{\mathrm T}\mathbf{M}^{-1}\left(\mathbf{M}\dot{\mathbf{v}} + \mathbf{H}\right) \tag{11}$$

assuming that $\mathbf{q}$ is the vector of generalised coordinates of the mechanical system, the relations between $\mathbf{q}$ and $\mathbf{v}$ are given by the equation

$$\mathbf{v} = \mathbf{D}(\mathbf{q}, t)\,\dot{\mathbf{q}} + \mathbf{f}(\mathbf{q}, t) \tag{12}$$

hence

$$\dot{\mathbf{v}} = \mathbf{D}(\mathbf{q}, t)\,\ddot{\mathbf{q}} + \boldsymbol{\Phi}(\mathbf{q}, \dot{\mathbf{q}}, t) \tag{13}$$

where $\boldsymbol{\Phi} = \dot{\mathbf{D}}\dot{\mathbf{q}} + \dot{\mathbf{f}}$. therefore the appel function can be expressed by the following relation:

$$S(\ddot{\mathbf{q}}, \dot{\mathbf{q}}, \mathbf{q}, t) = \frac{1}{2}\left[\mathbf{M}\left(\mathbf{D}\ddot{\mathbf{q}} + \boldsymbol{\Phi}\right) + \mathbf{H}\right]^{\mathrm T}\mathbf{M}^{-1}\left[\mathbf{M}\left(\mathbf{D}\ddot{\mathbf{q}} + \boldsymbol{\Phi}\right) + \mathbf{H}\right] \tag{14}$$

assuming that $\mathbf{M}_g = \mathbf{D}^{\mathrm T}\mathbf{M}\mathbf{D}$ and $\mathbf{H}_g = \mathbf{D}^{\mathrm T}\left(\mathbf{M}\boldsymbol{\Phi} + \mathbf{H}\right)$, and remembering that $\mathbf{D}\left(\mathbf{D}^{\mathrm T}\mathbf{M}\mathbf{D}\right)^{-1}\mathbf{D}^{\mathrm T} = \mathbf{M}^{-1}$, equation (14) can be expressed in the form

$$S(\ddot{\mathbf{q}}, \dot{\mathbf{q}}, \mathbf{q}, t) = \frac{1}{2}\left(\mathbf{M}_g\ddot{\mathbf{q}} + \mathbf{H}_g\right)^{\mathrm T}\mathbf{M}_g^{-1}\left(\mathbf{M}_g\ddot{\mathbf{q}} + \mathbf{H}_g\right) \tag{15}$$

non-linear equations of entomopter motion are expressed in a moving system of co-ordinates [16]. in the case when we consider the model of an entomopter treated as a mechanical system containing a rigid fuselage and n rigid wings fixed to the fuselage by means of three hinges, the vector of generalised co-ordinates has the following form (fig. 3):

$$\mathbf{q} = \left[x_s, y_s, z_s, \phi, \theta, \psi, \phi_1, \psi_1, \theta_1, \dots, \phi_n, \psi_n, \theta_n\right]^{\mathrm T} \tag{16}$$

the vector of quasi-velocities can be expressed by the following equation:

$$\mathbf{w} = \left[u, v, w, p, q, r, \dot{\phi}_1, \dot{\psi}_1, \dot{\theta}_1, \dots, \dot{\phi}_n, \dot{\psi}_n, \dot{\theta}_n\right]^{\mathrm T} \tag{17}$$

for the holonomic dynamical system the relation between the generalised velocities $\dot{\mathbf{q}} = \left[\dot{q}_1, \dot{q}_2, \dots, \dot{q}_N\right]$ and the quasi-velocities $\mathbf{w} = \left[w_1, w_2, \dots, w_N\right]$ is as follows:

$$\dot{\mathbf{q}} = \mathbf{A}_T(\mathbf{q})\,\mathbf{w} \tag{18}$$

the matrix $\mathbf{A}_T$ has the construction

$$\mathbf{A}_T = \begin{bmatrix} \mathbf{A}_g & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{C}_T & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \mathbf{I} \end{bmatrix} \tag{19}$$

the matrices $\mathbf{A}_g$ and $\mathbf{C}_T$ are classical matrices of transformations of the kinematic relations and can be found in [19]; the unit matrix $\mathbf{I}$ has the dimension (3n+1) × (3n+1), n being the number of wings.

fig. 2: location of points, radius vectors and vectors of velocities and accelerations

from (18) we have the following relation:

$$\ddot{\mathbf{q}} = \mathbf{A}_T\dot{\mathbf{w}} + \dot{\mathbf{A}}_T\mathbf{w} \tag{20}$$

finally, the appel function has the following form:

$$S^*(\dot{\mathbf{w}}, \mathbf{w}, \mathbf{q}, t) = \frac{1}{2}\left(\mathbf{M}_w\dot{\mathbf{w}} + \mathbf{H}_w\right)^{\mathrm T}\mathbf{M}_w^{-1}\left(\mathbf{M}_w\dot{\mathbf{w}} + \mathbf{H}_w\right) \tag{21}$$

where $\mathbf{M}_w(\mathbf{q}) = \mathbf{A}_T^{\mathrm T}\mathbf{M}_g\mathbf{A}_T$ and $\mathbf{H}_w(\mathbf{q}, \mathbf{w}) = \mathbf{A}_T^{\mathrm T}\left(\mathbf{M}_g\dot{\mathbf{A}}_T\mathbf{w} + \mathbf{H}_g\right)$. the gibbs-appel equations of motion, written in quasi-velocities, have the following form:

$$\dot{\mathbf{w}} = \mathbf{M}_w^{-1}(\mathbf{q})\left[\mathbf{Q}^*(\mathbf{q}, \mathbf{w}, t) - \mathbf{H}_w(\mathbf{q}, \mathbf{w}, t)\right] \tag{22}$$

the vector $\mathbf{Q}^*$ is the sum of the aerodynamic loads, the potential forces acting on the entomopter, and the other non-potential forces acting on the system. those equations are written in a form allowing the creation of procedures meant for their automatic formulation (e.g., by means of such well-known commercial software as mathematica® or mathcad®).
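equations (18) and (22) form a first-order system that can be marched in time once the model terms are assembled. a minimal sketch of such a time step, with all of the physics replaced by placeholders (the function names and the identity/zero stubs are ours, not the paper's):

```python
import numpy as np

def a_t(q):
    """kinematic transformation matrix of eq. (19); placeholder identity."""
    return np.eye(q.size)

def m_w(q):
    """generalised mass matrix in quasi-velocities; placeholder identity."""
    return np.eye(q.size)

def h_w(q, w, t):
    """velocity-dependent term of eq. (22); placeholder zeros."""
    return np.zeros(q.size)

def q_star(q, w, t):
    """aerodynamic + potential + other applied loads; placeholder zeros."""
    return np.zeros(q.size)

def step(q, w, t, dt):
    """one explicit euler step of q' = a_t(q) w together with eq. (22)."""
    w_dot = np.linalg.solve(m_w(q), q_star(q, w, t) - h_w(q, w, t))
    return q + dt * (a_t(q) @ w), w + dt * w_dot, t + dt

n_wings = 2
q = np.zeros(6 + 3 * n_wings)   # fuselage position/attitude + wing angles, eq. (16)
w = np.zeros_like(q)            # quasi-velocities, eq. (17)
t = 0.0
for _ in range(1000):
    q, w, t = step(q, w, t, dt=1e-4)
```

in practice a higher-order integrator would be used, but the structure — solve eq. (22) for the quasi-accelerations, then propagate eq. (18) — stays the same.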
fig. 3: systems of coordinates

fig. 4: typical results of simulation (time histories of the pitch attitude and pitch rate for cg positions of 25 % and 45 % sca)

4 results

sample calculations illustrating the current capabilities of the method and providing a preliminary insight into the behaviour of an entomopter are shown in fig. 4. the dynamics of the entomopter shows an oscillatory motion superimposed on a vertical drifting term. the vertical drifting term is a result of a mean non-zero force along a wingbeat, while the oscillatory motion is the result of the time-varying nature of aerodynamic forces for insect flight.

5 conclusions

this paper proposes the design of an accurate software simulation for entomopter flight that includes all major components involved: aerodynamics, body dynamics and control. several of these components are modeled and implemented, and we have obtained simulation results that are consistent with observations of real flying animals (especially birds and bats). finally, we plan to implement a 3d graphical visualization tool which can animate the motion of the simulated animalopter in a 3d environment. current research is directed at improving some of the models considered, the aerodynamic models and the control process, and at taking advantage of this simulator to evaluate flight control schemes.

references

[1] azuma a., masato o., kunio y.: aerodynamic characteristics of wings at low reynolds numbers. fixed and flapping wings aerodynamics for micro air vehicle applications. ed. t. j. mueller, progress in astronautics and aeronautics, aiaa, reston, va., 2001, p. 341–398.
[2] azuma a.: the biokinetics of flying and swimming. springer verlag, tokyo, 1998.
[3] ellington c. p.: "the aerodynamics of hovering insect flight. iii kinematics." philosophical transactions of the royal society of london. series b. biological sciences. vol. 305, 1122, (1984), p. 41–78.
[4] ellington c. p.: "the novel aerodynamics of insect flight: applications to micro-air-vehicles." the journal of experimental biology. vol. 202, (1999), p. 3439–3448.
[5] gessow a., myers g. c.: aerodynamics of the helicopter. college park press, 1985.
[6] goraj z., pietrucha j.: "basic mathematical relations of fluid dynamics for modified panel methods." journal of theoretical and applied mechanics. vol. 36, no. 1, (1998), p. 47–66.
[7] mcmichael j. m., francis m. s.: micro air vehicles – toward a new generation of flight: http://www.darpa.mil/tto/mav/mav_auvsi.html
[8] lasek m., pietrucha j., sibilski k., złocka m.: analogies between rotary and flapping wings from control theory point of view. aiaa meeting papers on disc, vol. 5, 3, aiaa 2001–4002 cp, 2001.
[9] lasek m., pietrucha j., sibilski k., złocka m.: a study of flight dynamics and automatic control of an animalopter. icas paper no 339, toronto, 2002.
[10] lasek m., pietrucha j., sibilski k.: micro air vehicle manoeuvres as a control problem of flexible flapping wings. aiaa meeting papers on disc, vol. 6, 1, (2002), aiaa 2002–0526 cp.
[11] marusak a., pietrucha j., sibilski k., złocka m.: mathematical modelling of flying animals as aerial robots. 7th ieee inter. conf. on methods and models in automation and robotics (mmar 2001), międzyzdroje, poland, aug. 28–31, 2001.
[12] pietrucha j., sibilski k., złocka m.: modelling of aerodynamic forces on flapping wings – questions and results. proc. of 4th inter. seminary on rrdpae-2000, warsaw.
[13] pornsin-sisirak t., lee s. w., nassef h., grasmeyer j., tai y. c., ho c. m., keenon m.: mems wing technology for a battery-powered ornithopter. 13th ieee inter. conf. on micro electro mechanical systems (mems '00), miyazaki, japan, jan. 23–27, 2000.
[14] schenato l., deng x., wu w. c., sastry s.: virtual insect flight simulator (vifs): a software testbed of insect flight. ieee international conference on robotics and automation, 2001.
[15] shyy w., berg m., ljungqvist d.: "flapping and flexible wings for biological and micro air vehicles." progress in aerospace sciences. vol. 35, (1999), p. 455–505.
[16] sibilski k.: modeling of an agile aircraft flight dynamics in limiting flight conditions. military university of technology, warsaw, 1998.
[17] smith m. j. c., wilkin p. j., williams m. h.: "the advantages of an unsteady panel method in modelling the aerodynamic forces on rigid flapping wings." journal of experimental biology. vol. 199, (1996), p. 1073–1083.
[18] tobalski b. w., dial k. p.: "flight kinematics of black-billed magpies and pigeons over a wide range of speed." journal of experimental biology. vol. 199, (1996), p. 263–280.
[19] willmott a. p., ellington c. p.: "the mechanics of flight in the hawkmoth manduca sexta. part i. kinematics of hovering and forward flight." journal of experimental biology. vol. 200, (1997), p. 2705–2722.

krzysztof sibilski
mechanical engineering department
radom technical university
institute of applied mechanics
malczewskiego 29
26-600 radom, poland
acta polytechnica 57(1):14–21, 2017, doi:10.14311/ap.2017.57.0014
© czech technical university in prague, 2017, available online at http://ojs.cvut.cz/ojs/index.php/ap

deformation response of gellan gum based bone scaffold subjected to uniaxial quasi-static loading

daniel kytýř a,b,∗, nela krčmářová a,b, jan šleichrt a,b, tomáš fíla a, petr koudelka a, ana gantar c,d, sasa novak c,d

a institute of theoretical and applied mechanics as cr, v.v.i., prosecká 76, 190 00 prague 9, czech republic
b czech technical university in prague, faculty of transportation sciences, department of mechanics and materials, konviktská 20, 120 00 prague 1, czech republic
c department for nanostructured materials, jozef stefan institute, jamova cesta 39, 1000 ljubljana, slovenia
d jozef stefan international postgraduate school, jamova cesta 39, 1000 ljubljana, slovenia
∗ corresponding author: kytyr@itam.cas.cz

abstract. this study is focused on an investigation of the reinforcement effect of the bioactive glass nano-particles in the gellan gum (gg) scaffolds used in bone tissue engineering. the investigated material was synthesized as the porous spongy-like structure improved by the bioactive glass (bag) nano-particles. cylindrical samples were subjected to a uniaxial quasi-static loading in tension and compression. the very soft nature of the material, which makes the sample susceptible to damage, required the employment of a custom designed experimental device for the mechanical testing. moreover, as the mechanical properties are significantly influenced by the testing conditions, the experiment was performed using dry samples and also using samples immersed in the simulated body fluid. material properties of the pure gg scaffold and the gg-bag reinforced scaffold were derived from a set of tensile and compression tests under dry and simulated physiological conditions. the results are represented in the form of stress-strain curves calculated from the acquired force and displacement data.

keywords: gellan gum scaffold; reinforcement; uni-axial loading.

1. introduction

the worldwide incidence of bone disorders and conditions is growing at a steeply increasing rate. particularly in the high income regions, a twofold increase between 2010 and 2020 is expected [1]. this is attributable to population aging coupled with improper nutrient consumption and poor physical activity. globally, more than 40 % of women and 30 % of men are under an increased risk of occurrence of bone disorders [2].
annually, only in the usa, more than half a million bone defects are reported. the treatment expenditures reach more than $2.5 billion per year. treatment of bone disorders using engineered bone tissue has been considered as a promising alternative to traditional medical treatment methods including the use of autografts and allografts. currently, the field of artificial tissue engineering aims at overcoming problems such as donor site morbidity, loss of bone inductive factors, and scaffold resorption during healing [3]. generally, several fundamental requirements have to be simultaneously fulfilled by the artificial tissue: i) normal cellular activity without toxicity effects, ii) minimizing of the stress shielding effect, iii) successful diffusion of nutrients and oxygen, and iv) controlled degradation coupled with the resorption of the artificial material [4].

the presented paper deals with uni-axial quasi-static testing of the artificial spongy-like structure [5] proposed for bone tissue engineering purposes as the bone scaffold. the investigated gg-bag material combines organic (polysacchariditic) components with inorganic (silicon-calcium based) nanoparticles. this approach effectively enables the adaptation of the physical and mechanical properties of the synthesized material according to the desired application [6]. mechanical properties of gellan-gums can vary significantly depending on the procedure of synthetization. young's modulus in the range 0.15 − 150 kpa was reported in [7]. therefore, the newly synthesized material was subjected to quasi-static loading in both tension and compression to evaluate primarily its stiffness and yield behaviour. firstly, the testing was carried out under dry conditions to evaluate the material properties of the synthesized material itself. then, the experiment was repeated under simulated physiological conditions by the employment of the bioreactor with circulating synthetic plasma.

2. gellan-gum based scaffolds

the gg-bag material investigated in this study was synthesized at the jozef stefan institute (slovenia) as a porous spongy-like structure improved by the bag nano-particles with a size of ≈ 200 nm [8]. the gg is a microbially extracted polysaccharide used in the food and pharmaceutical industry [9, 10]. it is composed of repeating units consisting of two d-glucose and one of each l-rhamnose and d-glucoronic acids [11].

sample    h [mm]   dmin [mm]   dmax [mm]   m [mg]
gg00 1    8.77     4.97        5.09        10.9
gg00 2    8.73     5.03        5.04        10.9
gg00 3    8.93     4.95        5.04        10.4
gg00 4    9.18     5.01        5.02        10.5
gg00 5    9.01     4.93        4.96        10.0

table 1. dimensions of pure gg samples for the compression test under dry conditions.

sample    h [mm]   dmin [mm]   dmax [mm]   m [mg]
gg00 1    9.12     5.04        5.05        10.4
gg00 2    8.92     4.91        5.00        10.6
gg00 3    8.91     4.92        4.94        10.6
gg00 4    8.90     4.94        5.00        10.7
gg00 5    9.04     4.91        5.02        11.0

table 2. dimensions of pure gg samples for the tensile test under dry conditions.

sample    h [mm]   dmin [mm]   dmax [mm]   m [mg]
gg00 1    8.35     4.94        4.97        12.0
gg00 2    8.28     5.05        5.06        10.6
gg00 3    8.23     5.04        5.11        10.1
gg00 4    8.22     4.99        5.05        11.3
gg00 5    8.17     5.08        5.10        11.4

table 3. dimensions of pure gg samples for the compression test under wet conditions.

sample    h [mm]   dmin [mm]   dmax [mm]   m [mg]
gg00 1    8.25     4.96        5.02        10.6
gg00 2    8.24     4.93        4.98        10.7
gg00 3    8.21     4.97        5.01        10.8
gg00 4    8.13     4.96        5.02        11.5
gg00 5    8.37     4.99        5.02        10.1

table 4. dimensions of pure gg samples for the tensile test under wet conditions.
sample    h [mm]   dmin [mm]   dmax [mm]   m [mg]
gg50 1    8.87     4.99        5.10        14.8
gg50 2    9.03     5.09        5.09        14.5
gg50 3    9.10     4.94        5.10        15.0
gg50 4    8.80     5.04        5.06        14.0
gg50 5    8.97     4.92        4.97        14.2

table 5. dimensions of gg samples with 50 wt% bag for the compression test under dry conditions.

sample    h [mm]   dmin [mm]   dmax [mm]   m [mg]
gg50 1    8.98     4.93        5.07        15.1
gg50 2    8.62     5.06        5.08        14.4
gg50 3    8.74     4.97        5.05        14.7
gg50 4    9.09     4.94        4.95        14.6
gg50 5    9.05     4.90        5.01        14.4

table 6. dimensions of gg samples with 50 wt% bag for the tensile test under dry conditions.

the main advantage is an ability to form highly porous 3d structures, when properly cross-linked and fabricated [12]. in terms of the bone regeneration characteristics, the gg is biocompatible and biodegradable, but its mechanical properties are inappropriate to bear the stresses during a normal movement of a patient and to enable a successful bone regeneration. the material also does not promote natural bone formation. therefore, the gg needs to be reinforced by the bioactive glass particles (bag). the bag is a nano-particulate amorphous material with the chemical composition of 70 n% sio2 and 30 n% cao, which is prepared by the modified sol-gel method. the bag is unique since, during degradation, it can induce precipitation of hydroxyapatite formation (even in vitro) and consequently bonds toward soft and hard tissues. therefore, the bag particles are highly interesting for bone regeneration applications.

during the production process of the investigated gg-bag samples, the gellan gum was dissolved in ultrapure water by heating the solution for 30 minutes at 90 °c. to the hot gg solution, a dispersion of the bag was admixed and 0.18 wt% cacl2 was added. kept under high temperatures, this mixture was subsequently poured into the required mould and left there to spontaneously jellify. the weight ratio of the gg and the bag was 1 : 1 and the final concentration of cacl2 was 0.03 wt% in all samples. such samples were frozen for 12 hours at −80 °c and freeze-dried for three days in a freeze dryer. cylindrical samples with height h ≈ 9 mm, diameter d ≈ 5 mm, and weight m ≈ 11 mg, ≈ 16 mg and ≈ 24 mg for the pure gg scaffold and the gg-bag reinforced scaffold with 50 wt% and 70 wt% bag, respectively, were used for the testing. the dimensions are listed in detail in tabs. 1–12.

x-ray microtomography imaging was performed on all three types of material. based on the reconstructed volumetric data, porosity and pore size analysis was performed. a region of interest with the face of a square inscribed to the face of the sample and a height of 75 % of the sample was binarized. using this data, the porosity of 67.46, 61.56, and 35.79 % for the pure gg, 50 wt% bag and 70 wt% bag content samples, respectively, was obtained. the microstructure of the samples was irregular with closed pores (but with some broken cell-walls). the pore size was assessed from the medial, lateral and frontal section of each imaged sample.

sample    h [mm]   dmin [mm]   dmax [mm]   m [mg]
gg50 1    8.40     4.93        4.98        18.0
gg50 2    8.56     5.07        5.10        17.7
gg50 3    8.38     4.91        4.96        18.0
gg50 4    8.24     4.93        5.01        16.5
gg50 5    8.34     5.06        5.11        17.1

table 7. dimensions of gg samples with 50 wt% bag for the compression test under wet conditions.

sample    h [mm]   dmin [mm]   dmax [mm]   m [mg]
gg50 1    8.27     5.05        5.07        17.2
gg50 2    8.30     5.05        5.08        16.5
gg50 3    8.20     5.07        5.08        17.0
gg50 4    8.18     5.03        5.05        17.3
gg50 5    8.31     5.10        5.14        17.0

table 8. dimensions of gg samples with 50 wt% bag for the tensile test under wet conditions.
sample    h [mm]   dmin [mm]   dmax [mm]   m [mg]
gg70 1    9.16     5.01        5.08        23.6
gg70 2    9.12     5.00        5.09        24.4
gg70 3    8.85     4.93        4.94        24.1
gg70 4    8.88     5.04        5.05        24.5
gg70 5    8.66     5.05        5.09        24.5

table 9. dimensions of gg samples with 70 wt% bag for the compression test under dry conditions.

sample    h [mm]   dmin [mm]   dmax [mm]   m [mg]
gg70 1    8.61     4.93        4.96        23.0
gg70 2    8.86     4.91        4.94        23.6
gg70 3    8.63     4.93        5.00        24.2
gg70 4    8.91     4.97        5.10        24.4
gg70 5    9.12     4.94        5.00        24.5

table 10. dimensions of gg samples with 70 wt% bag for the tensile test under dry conditions.

sample    h [mm]   dmin [mm]   dmax [mm]   m [mg]
gg70 1    8.64     5.06        5.09        24.1
gg70 2    9.10     4.95        5.02        24.4
gg70 3    8.62     4.90        5.04        24.3
gg70 4    9.05     4.95        4.96        24.4
gg70 5    8.66     4.94        5.07        23.3

table 11. dimensions of gg samples with 70 wt% bag for the compression test under wet conditions.

sample    h [mm]   dmin [mm]   dmax [mm]   m [mg]
gg70 1    8.20     5.09        5.15        22.6
gg70 2    8.34     5.10        5.15        23.2
gg70 3    8.26     4.93        4.99        22.2
gg70 4    8.24     4.97        5.01        23.3
gg70 5    8.25     5.05        5.07        24.7

table 12. dimensions of gg samples with 70 wt% bag for the tensile test under wet conditions.

the pores in the shape of prolate ellipsoids were dominantly oriented in the loading direction. the length of the pores was 796 ± 334, 565 ± 127 and 429 ± 72 µm for the pure gg, 50 wt% bag and 70 wt% bag content samples, respectively.

3. methods

to obtain information about the deformation characteristics of the synthesized material, sets of quasi-static experiments were performed. the first goal was to demonstrate the possibilities of the in-house developed experimental infrastructure for such a measurement. the expected collapse force ranged within single newton orders and a precise loading plate positioning was required as well. to obtain more relevant results, modifications of the experimental devices presented in detail in section 3.2 were carried out. then, the material properties in terms of the stress-strain response were obtained.

3.1. experimental procedure

cylindrical samples with the dimensions and weights listed in tabs. 1–12 were subjected to tensile and compressive loading under dry and wet conditions. here, the dry conditions represented the final state of the scaffold synthetization, whereas the wet conditions simulated the physiological environment of the human body using artificial plasma with the content listed in tab. 13. the dry samples were loaded at room temperature (22 °c). the samples placed in the bioreactor chamber were fully immersed in the solution with a temperature of 37 ± 2 °c. because of the permeability (required for the transport of nutrients) of the material, full saturation by the simulated body fluid was used during the test. the loading platen displacement was set to approximately 1000 µm, which corresponds to ≈ 11–12 % deformation, sufficient to induce some observable damage to the samples' microstructure [13, 14]. the loading rate was set to 2 µm s⁻¹, yielding a quasi-static strain rate ≈ 2 · 10⁻⁴ s⁻¹. the force and position were read out with the sampling frequency of 50 sps.

natrii chloridum                    5.26 g/l
kalii chloridum                     0.37 g/l
magnesii chloridum hexahydricum     0.30 g/l
natrii acetas trihydricus           3.68 g/l
natrii gluconas                     5.02 g/l

table 13. content of the artificial plasma.

figure 1. custom designed micro-loading device with control unit.
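the quoted strain rate and final deformation of section 3.1 follow directly from the platen speed, the platen travel and the nominal sample height of ≈ 9 mm:

$$\dot{\varepsilon} = \frac{v}{h} = \frac{2\,\mu\mathrm{m\,s^{-1}}}{9\,\mathrm{mm}} \approx 2.2\cdot 10^{-4}\,\mathrm{s^{-1}}, \qquad \varepsilon_{\max} = \frac{1000\,\mu\mathrm{m}}{9\,\mathrm{mm}} \approx 11\,\%,$$

with the shorter wet samples (h ≈ 8.2 mm) giving the upper end of the quoted 11–12 % range.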
3.2. instrumentation

the in-house developed loading device (depicted in fig. 1) for low-force indentation was adapted for both tensile and compression testing. for the indentation, the device was designed using the stiff modular aluminium (30 × 30 mm hollow profiles) frame bearing the following components: (i) two motorized axes kk40 (hiwin, japan) for sample positioning with an accuracy of 10 µm, (ii) a loading axis based on the linear stage mgw12 (hiwin, japan), and (iii) a linear actuator of the 43-series (haydon kerk, usa) with a positioning full-step resolution of 3 µm with a mounting for a u9b/c series (hbm, germany) load-cell. this axis was upgraded using the same series linear actuator with a positioning full-step resolution of 1.5 µm, an encoder with a resolution of 0.5 µm, and a u9b/c load cell with a nominal force of 50 n. for the testing under the wet conditions, the device was equipped with a custom designed bioreactor (see fig. 2). the bioreactor enables full control over the flow-rate and temperature of the fluid used for the experiments. the fluid is pumped from a heated reservoir to a basin surrounding the samples and loading platens; the circulation of the fluid is provided by a peristaltic pump with an adjustable stream velocity. the control unit of the experimental setup was designed specifically for controlling stepper- or servomotor based positioning devices. the control unit is also equipped with drivers, an i/o board for controlling peripheral devices (i.e. lights, etc.), and a unit for the readout of signals from the load-cells. the used drivers are capable of microstepping up to 64 microsteps per step to achieve a maximum possible smoothness of motion, when stepper motors are used for the experiments.

figure 2. visualization of bioreactor mounted on loading device.

3.3. strain calculation

the investigated material was expected to exhibit a very low stiffness, which, coupled with the high porosity and suboptimal geometry of the samples, induces a high potential for significant boundary effects (i.e. localized plastic deformation of the sample near the contact with the loading platens) yielding a low reliability of the strain evaluation based on the displacement of the loading platens. the accuracy of the presented measurements was reduced due to the geometrical properties of the samples, the type of boundary conditions during the experiments, and also by the very low stiffness of the material. currently, the production process of the gg-bag samples does not allow to produce cylindrical samples too reliably. the diameter of the specimens varied on average by ±100 µm, the loaded faces were rough and not plane-parallel. furthermore, the character of the material, particularly its brittleness making it prone to severe damage during any type of preparation procedure, made machining of the samples impossible. thus, the non-contact optical displacement measurement was employed instead of calculations based on the known position of the loading platens. the strains were evaluated from the optically acquired displacements using the digital image correlation (dic) method. this method is based on the comparison of differences within the sequence of images of the deforming object. when an image with an identifiable texture is divided into subsets, the centre coordinates (x′, y′) of arbitrary subsets in a subsequent image are given by:

$$x' = x + u + \frac{\partial u}{\partial x}\Delta x + \frac{\partial u}{\partial y}\Delta y, \tag{1}$$

$$y' = y + v + \frac{\partial v}{\partial x}\Delta x + \frac{\partial v}{\partial y}\Delta y, \tag{2}$$

where u and v are the displacements of the subset centroid in the x and y directions, respectively. ∆x and ∆y represent the distances from the centroid of the sub-image to the point x and y. this enables the creation of an arbitrary correlation pattern on the investigated surface, including a matrix for full-field displacement tracking. here, the matrix of correlation points for full field measurements was constructed on every measured sample and the paths of the correlation points were observed for calculating local deformations over the whole image sequence using the lucas-kanade tracking algorithm [15].
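the point-tracking step can be reproduced with the pyramidal lucas-kanade implementation available in opencv. the sketch below is a minimal illustration; the window size, pyramid depth and the grid-row indices are assumed parameters, not the settings used by the authors:

```python
import cv2
import numpy as np

def track_points(frames, points):
    """follow a grid of correlation points through an image sequence
    with the pyramidal lucas-kanade tracker; frames are grayscale
    uint8 arrays, points an (n, 2) float32 array of seed coordinates."""
    pts = points.reshape(-1, 1, 2).astype(np.float32)
    paths = [pts.copy()]
    for prev, nxt in zip(frames, frames[1:]):
        pts, status, _ = cv2.calcOpticalFlowPyrLK(
            prev, nxt, pts, None, winSize=(21, 21), maxLevel=3)
        paths.append(pts.copy())
    return np.stack(paths)  # shape: (n_frames, n_points, 1, 2)

def axial_strain(paths, top_idx, bottom_idx):
    """engineering strain from the mean vertical displacement of the top
    and bottom rows of the grid (the comparative measurement above)."""
    top = paths[:, top_idx, 0, 1].mean(axis=1)
    bot = paths[:, bottom_idx, 0, 1].mean(axis=1)
    gauge0 = bot[0] - top[0]
    return ((bot - top) - gauge0) / gauge0
```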
here, the matrix of correlation points for full-field measurements was constructed on every measured sample, and the paths of the correlation points were observed for calculating local deformations over the whole image sequence using the lucas-kanade tracking algorithm [15]. using the obtained data, strain fields were derived and used for the calculation of the stress-strain relations.

figure 3. surface of a well manufactured sample with an artificial pattern and the grid of correlation points for optical strain measurement.

figure 4. stress-strain curves of two samples (pure gg in blue and 50 wt% bag content in red) subjected to compressive loading under dry conditions, obtained from the optical measurement (dic) and from the encoder signal (dis.).

figure 5. stress-strain curve envelopes of the dry scaffolds under compressive loading (in-plot values: e_gg70 = 4733.46 ± 307.96 kpa, e_gg50 = 2145.80 ± 255.84 kpa, e_gg00 = 1223.88 ± 1.19 kpa).

figure 6. stress-strain curve envelopes of the dry scaffolds under tensile loading (in-plot values: e_gg70 = 2613.80 ± 511.45 kpa, e_gg50 = 3322.55 ± 441.35 kpa, e_gg00 = 2256.83 ± 391.67 kpa).

the full-field optical strain measurement of the wet samples placed in the fluid basin was not possible with the existing setup. therefore, only comparative measurements on the dry pure gg samples and the gg with 50 wt% bag were performed. here, the average displacement of the bottom and top lines of the correlation points (see fig. 3) was used for the strain calculation together with the encoder output. the resulting stress-strain curves of the two compressive experiments are nearly identical in the elastic region. the strain measurement using the encoder signal can therefore be considered acceptable only for measurements such as those shown in the comparative stress-strain diagrams depicted in fig. 4.

3.4. stress calculation
the stress σ in all experimental analyses was considered as the engineering stress obtained using:

$$\sigma = \frac{F}{A_c}, \qquad (3)$$

where $A_c$ is the cross-sectional area of the specimen calculated from the minimal sample diameter measured before the deformation, and the force $F$ was acquired by the load-cell. for the purpose of the stress calculations, the samples were considered ideally cylindrical, neglecting all geometrical irregularities.

4. results
the material properties and the deformation behaviour of the pure gg scaffold and the gg-bag reinforced scaffolds with 50 wt% and 70 wt% bag content were studied during the tensile and compression tests under the dry and wet conditions. five experiments for each type of material, ambient conditions and loading mode were performed. the stress–strain curves for each experimental batch are presented in figs. 5–8. the young's modulus was calculated using linear regression applied to the elastic part of the stress–strain diagrams. the regression error r was calculated using:

$$r = \sqrt{\frac{\sum_{i=1}^{n}\left(\sigma_i - \hat{\sigma}_i\right)^2}{n}}, \qquad (4)$$

where $\hat{\sigma}_i$ represent the calculated values of σ.
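a minimal sketch of this evaluation chain, using hypothetical load-cell and dic readings rather than measured data, could look as follows (eq. (3) for the stress, a linear fit for the modulus, eq. (4) for the regression error):

```python
import numpy as np

# a minimal sketch (hypothetical data, not measurements) of the evaluation
# in section 4: engineering stress via eq. (3), young's modulus from a
# linear fit to the elastic region, regression error r via eq. (4).
force_N = np.array([0.00, 0.05, 0.10, 0.15, 0.20, 0.25])       # load-cell
strain = np.array([0.000, 0.002, 0.004, 0.006, 0.008, 0.010])  # dic strain
d_min_mm = 4.93                                                # minimal diameter
A_c = np.pi * (d_min_mm / 2.0) ** 2 * 1e-6                     # area [m^2]

stress_kPa = force_N / A_c / 1e3                               # eq. (3)
E_kPa, intercept = np.polyfit(strain, stress_kPa, 1)           # linear regression
stress_hat = E_kPa * strain + intercept
r = np.sqrt(np.sum((stress_kPa - stress_hat) ** 2) / stress_kPa.size)  # eq. (4)
print(f"E = {E_kPa:.0f} kPa, r = {r:.2f} kPa")
```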
figure 7. stress-strain curve envelopes of the wet scaffolds under compressive loading (in-plot values: e_gg00 = 113.78 ± 22.21 kpa, e_gg50 = 107.65 ± 30.52 kpa, e_gg70 = 82.01 ± 29.96 kpa).

figure 8. stress-strain curve envelopes of the wet scaffolds under tensile loading (in-plot values: e_gg00 = 53.12 ± 17.56 kpa, e_gg50 = 24.18 ± 2.06 kpa, e_gg70 = 22.01 ± 1.96 kpa).

sample     e [kpa]             σy [kpa]
gg00 1     1194.32 ± 4.16      61.52
gg00 2     1479.81 ± 6.27      76.65
gg00 3     1146.19 ± 4.77      88.51
gg00 4     1119.64 ± 5.26      56.71
gg00 5     1179.43 ± 3.07      62.28
average    1223.88 ± 1.19      69.14 ± 13.14

table 14. elastic properties and yield stresses of the pure gg samples for the compression test under dry conditions.

sample     e [kpa]             σy [kpa]
gg00 1     2164.31 ± 0.92      36.18
gg00 2     2259.77 ± 0.62      45.83
gg00 3     2915.55 ± 1.10      67.12
gg00 4     2036.73 ± 0.40      48.78
gg00 5     1906.54 ± 0.42      45.18
average    2256.58 ± 391.67    48.62 ± 11.36

table 15. elastic properties and yield stresses of the pure gg samples for the tensile test under dry conditions.

sample     e [kpa]
gg00 1     88.58 ± 0.14
gg00 2     119.50 ± 0.15
gg00 3     126.15 ± 0.14
gg00 4     93.57 ± 0.14
gg00 5     141.08 ± 0.14
average    113.78 ± 22.21

table 16. elastic properties of the pure gg samples for the compression test under wet conditions.

sample     e [kpa]
gg00 1     56.34 ± 0.84
gg00 2     33.40 ± 0.20
gg00 3     36.38 ± 0.41
gg00 4     69.07 ± 0.94
gg00 5     70.43 ± 0.95
average    53.12 ± 17.56

table 17. elastic properties of the pure gg samples for the tensile test under wet conditions.

for the evaluation of the yield stress σy, the second-derivative criterion of christensen [16] was used:

$$\sigma_y = \sigma \ \ \text{at} \ \ \left|\frac{d^2\sigma}{d\varepsilon^2}\right| = \max. \qquad (5)$$

for the reduction of noise, which significantly distorts the resulting derived function, a five-point rolling average filter was computed as:

$$\sigma'_{n,a} = \frac{\sigma'_{n-2} + \sigma'_{n-1} + \sigma'_{n} + \sigma'_{n+1} + \sigma'_{n+2}}{5}. \qquad (6)$$

the calculated young's moduli with the regression errors, presented as e ± r, and the yield stresses σy are listed in tabs. 14–25, together with the mean values of each set of measurements and the standard deviations of the calculated values. the yield stresses are presented only for the dry samples; in the case of testing under the wet conditions, the signal-to-noise ratio of the load-cell output was too low for the yield stress evaluation. all obtained results in the form of stress–strain curve envelopes are plotted in figs. 5–8.

5. conclusion and discussion
the gg-bag samples with different fractions of the reinforcing bag particles were subjected to optically evaluated uni-axial measurements under both dry and wet conditions. it was found that the ambient environment has a significant influence on the mechanical response of the material, and the measured properties of the dry and wet scaffolds were distinctly different. the scaffolds wetted by the synthetic plasma solution exhibited a radical loss of stiffness, with the elastic modulus decreasing more than tenfold.

sample     e [kpa]             σy [kpa]
gg50 1     2100.57 ± 0.46      132.52
gg50 2     2066.95 ± 3.01      127.33
gg50 3     1818.36 ± 1.57      93.46
gg50 4     2520.59 ± 1.01      132.68
gg50 5     2222.56 ± 0.42      140.42
average    2145.81 ± 255.84    125.28 ± 18.39

table 18. elastic properties and yield stresses of the gg samples with 50 wt% bag for the compression test under dry conditions.

sample     e [kpa]             σy [kpa]
gg50 1     5805.60 ± 0.30      40.15
gg50 2     3137.46 ± 0.23      85.71
gg50 3     2520.69 ± 0.27      67.99
gg50 4     2157.34 ± 0.31      63.40
gg50 5     2991.67 ± 1.02      85.25
average    3322.55 ± 1441.35   68.50 ± 18.75

table 19. elastic properties and yield stresses of the gg samples with 50 wt% bag for the tensile test under dry conditions.
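the yield-stress evaluation of eqs. (5)–(6) can be illustrated with a short sketch on a synthetic, noisy stress-strain curve; the curve shape and noise level below are invented for the example and do not represent measured data:

```python
import numpy as np

# a minimal sketch (synthetic curve, not measured data) of the yield-stress
# evaluation: the five-point rolling average of eq. (6) applied to the
# stress derivative, then the christensen criterion of eq. (5), taking
# sigma_y where |d^2(sigma)/d(eps)^2| is maximal.
eps = np.linspace(0.0, 0.12, 600)
stress = np.where(eps < 0.05, 1200.0 * eps, 60.0 + 200.0 * (eps - 0.05))
stress = stress + np.random.default_rng(0).normal(0.0, 0.05, eps.size)

d1 = np.gradient(stress, eps)                       # first derivative
kernel = np.ones(5) / 5.0
d1_smooth = np.convolve(d1, kernel, mode="same")    # eq. (6)
d2 = np.gradient(d1_smooth, eps)                    # second derivative
i_y = np.argmax(np.abs(d2[5:-5])) + 5               # skip filter edge effects
print(f"sigma_y ~ {stress[i_y]:.1f} kPa at eps = {eps[i_y]:.3f}")   # eq. (5)
```

for this synthetic curve the criterion locates the knee near 5 % strain and a yield stress near 60 kpa, i.e. the point where the curve changes slope fastest.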
sample     e [kpa]
gg50 1     81.45 ± 0.14
gg50 2     131.88 ± 0.13
gg50 3     70.84 ± 0.14
gg50 4     113.89 ± 0.28
gg50 5     140.22 ± 0.20
average    107.65 ± 30.52

table 20. elastic properties of the gg samples with 50 wt% bag for the compression test under wet conditions.

sample     e [kpa]
gg50 1     24.40 ± 0.09
gg50 2     27.15 ± 0.08
gg50 3     24.28 ± 0.08
gg50 4     23.71 ± 0.08
gg50 5     21.37 ± 0.08
average    24.18 ± 2.06

table 21. elastic properties of the gg samples with 50 wt% bag for the tensile test under wet conditions.

in addition, the values of the peak force measured during the tests, which averaged at ≈ 0.5 n, may be influenced by the load-cell nonlinearity and the low signal-to-noise ratio. therefore, it was not possible to evaluate the yield stress properly using the acquired load-cell signal. all compressed wet samples showed a very similar deformation behaviour; the standard deviation within each group of samples was higher than the effect of the bag reinforcement. unexpectedly, the results obtained for the wet samples subjected to tensile loading showed a higher elastic modulus and higher ultimate stresses for the pure gg samples. this can be attributed to the fact that, on the microstructural level, the bag particles disrupt the integrity of the wet gellan gum. in the case of the dry samples subjected to compressive loading, a significant reinforcement effect of the bag was observed: the values of the elastic modulus and yield stress are two and three times higher for the samples reinforced with 50 wt% and 70 wt% bag, respectively. during the tensile loading, the elastic properties of the samples were very similar independently of the content of the reinforcing bag particles, but a higher reinforcement increased the measured ultimate stress. the yield stresses of both the 50 wt% and the 70 wt% bag samples were approximately 20 kpa higher compared to the pure gg samples (cf. tabs. 15, 19 and 23). the samples loaded in tension were also able to resist deformation up to 4 % strain. the possibilities of the experimental infrastructure for the testing of the newly synthesized gg based scaffolds were successfully demonstrated. the most limiting aspects of the experimental setup are the signal-to-noise ratio of the load-cell at the desired loading level in the case of the wet samples and the generally suboptimal geometrical characteristics of the samples, which induce shear stresses during loading. for a more complex investigation, including a detailed analysis of the deforming microstructure, time-lapse radiographical methods will be used [17, 18].

acknowledgements
the research was supported by the grant agency of the czech technical university in prague (grant no. sgs15/225/ohk2/3t/16), by the european regional development fund in the frame of the project com3d-xct (atcz-0038) in the interreg v-a austria–czech republic programme and by the institutional support rvo: 68378297. the slovenian research agency is acknowledged for its financial support of the phd study of the co-author, ms. ana gantar (pr-03770).

references
[1] w. h. organization. world health statistics. who press, 2015.
[2] w. h. organization. global recommendations on physical activity for health. who press, switzerland, 2010.
[3] a. r. vaccaro, k. chiba, j. g. heller, et al. bone grafting alternatives in spinal surgery. the spine journal 2(3):206–215, 2002. doi:10.1016/s1529-9430(02)00180-8.
[4] a. r. amini, c. t. laurencin, s. p. nukavarapu. bone tissue engineering: recent advances and challenges. critical reviews in biomedical engineering 40(5):363–408, 2012.
doi:10.1615/critrevbiomedeng.v40.i5.10.
[5] l. polo-corrales, m. latorre-esteves, j. ramirez-vick. scaffold design for bone regeneration. journal of nanoscience and nanotechnology 14(1):15–56, 2014. doi:10.1166/jnn.2014.9127.

sample     e [kpa]             σy [kpa]
gg70 1     4857.36 ± 0.34      208.05
gg70 2     4441.11 ± 0.38      204.68
gg70 3     5058.45 ± 0.32      202.55
gg70 4     4371.74 ± 4.38      224.20
gg70 5     4938.64 ± 0.23      213.25
average    4733.46 ± 307.96    210.55 ± 8.63

table 22. elastic properties and yield stresses of the gg samples with 70 wt% bag for the compression test under dry conditions.

sample     e [kpa]             σy [kpa]
gg70 1     2565.45 ± 0.68      83.81
gg70 2     3087.56 ± 0.22      56.13
gg70 3     2847.83 ± 1.07      68.55
gg70 4     1761.72 ± 0.67      65.98
gg70 5     2806.60 ± 1.06      88.45
average    2613.83 ± 511.06    70.78 ± 13.30

table 23. elastic properties and yield stresses of the gg samples with 70 wt% bag for the tensile test under dry conditions.

sample     e [kpa]
gg70 1     134.42 ± 0.12
gg70 2     63.56 ± 0.07
gg70 3     79.59 ± 0.07
gg70 4     67.37 ± 0.08
gg70 5     65.13 ± 0.07
average    82.01 ± 29.96

table 24. elastic properties of the gg samples with 70 wt% bag for the compression test under wet conditions.

sample     e [kpa]
gg70 1     22.76 ± 0.08
gg70 2     23.44 ± 0.08
gg70 3     20.90 ± 0.08
gg70 4     21.14 ± 0.10
gg70 5     22.70 ± 0.07
average    22.19 ± 1.10

table 25. elastic properties of the gg samples with 70 wt% bag for the tensile test under wet conditions.

[6] e. r. morris, k. nishinari, m. rinaudo. gelation of gellan – a review. food hydrocolloids 28(2):373–411, 2012. doi:10.1016/j.foodhyd.2012.01.004.
[7] d. f. coutinho, s. v. sant, h. shin, et al. modified gellan gum hydrogels with tunable physical and mechanical properties. biomaterials 31(29):7494–7502, 2010. doi:10.1016/j.biomaterials.2010.06.035.
[8] a. gantar, l. da silva, j. oliveira, et al. nanoparticulate bioactive-glass-reinforced gellan-gum hydrogels for bone tissue engineering. materials science and engineering c 43:27–36, 2014. doi:10.1016/j.msec.2014.06.045.
[9] d. hoikhman, y. sela. gellan gum based oral controlled release dosage forms: a novel platform technology for gastric retention, 2005. wo patent app. pct/il2004/000,654.
[10] m. bououdina. emerging research on bioinspired materials engineering. igi global, 2016. doi:10.4018/978-1-4666-9811-6.
[11] j. t. oliveira, l. martins, r. picciochi, et al. gellan gum: a new biomaterial for cartilage tissue engineering applications. journal of biomedical materials research part a 93a(3):852–863, 2010. doi:10.1002/jbm.a.32574.
[12] n. drnovšek, s. novak, u. dragin, et al. bioactive glass enhances bone ingrowth into the porous titanium coating on orthopaedic implants. international orthopaedics 36(8):1739–1745, 2012. doi:10.1007/s00264-012-1520-y.
[13] j. garrison, c. slaboch, g. niebur. density and architecture have greater effects on the toughness of trabecular bone than damage. bone 44(5):924–929, 2009. doi:10.1016/j.bone.2008.12.030.
[14] o. jiroušek, p. zlámal, d. kytýř, m. kroupa. strain analysis of trabecular bone using time-resolved x-ray microtomography. nuclear instruments and methods in physics research, section a: accelerators, spectrometers, detectors and associated equipment 633(suppl. 1):s148–s151, 2011. doi:10.1016/j.nima.2010.06.151.
[15] b. d. lucas, t. kanade. an iterative image registration technique with an application to stereo vision.
in proceedings of the 7th international joint conference on artificial intelligence, volume 2, ijcai'81, pp. 674–679. morgan kaufmann publishers inc., san francisco, ca, usa, 1981.
[16] r. m. christensen. observations on the definition of yield stress. acta mechanica 196(3):239–244, 2008. doi:10.1007/s00707-007-0478-0.
[17] i. kumpová, d. vavřík, t. fíla, et al. high resolution micro-ct of low attenuating organic materials using large area photon-counting detector. journal of instrumentation 11(02):c02003, 2016. doi:10.1088/1748-0221/11/02/c02003.
[18] d. kytýř, t. doktor, o. jiroušek, et al. deformation behaviour of a natural-shaped bone scaffold. materiali in tehnologije 50(3):301–305, 2016. doi:10.17222/mit.2014.190.

acta polytechnica 57(1):14–21, 2017

acta polytechnica 60(6):528–539, 2020, doi:10.14311/ap.2020.60.0528, © czech technical university in prague, 2020, available online at https://ojs.cvut.cz/ojs/index.php/ap

a combined strain element to functionally graded structures in thermal environment

hoang lan ton-that (a, b)
a) ho chi minh city university of architecture, department of civil engineering, 196 pasteur street, district 3, ho chi minh city, viet nam
b) ho chi minh city university of technology and education, department of civil engineering, 01 vo van ngan street, thu duc district, ho chi minh city, viet nam
correspondence: lan.tonthathoang@uah.edu.vn; lantth.ncs@hcmute.edu.vn

abstract. functionally graded materials are commonly used in thermal environments; the grading changes the properties of the constituent materials. they inherently withstand high temperature gradients due to a low thermal conductivity, core ductility, a low thermal expansion coefficient, and many other features. it is essential to thoroughly study their mechanical responses and to develop new effective approaches for an accurate prediction of solutions. in this paper, a new four-node quadrilateral element based on a combined strain strategy and the first-order shear deformation theory is presented to capture the behaviour of functionally graded plate/shell structures in a thermal environment. the main notion of the combined strain strategy is the combination of the membrane strain and the shear strain related to tying points, as well as the bending strain with respect to a cell-based smoothed finite element method. within a finite element analysis, the first-order shear deformation theory (fsdt) is simple to implement and apply to structures, but shear correction factors are used to achieve an accurate solution. the author assumes that the temperature distribution is uniform throughout the structure. the rule of mixtures is also considered to describe the variation of material compositions across the thickness.
many desirable characteristics and the performance of this element are verified and proved through various numerical examples. the numerical solutions and a comparison with other available solutions suggest that the procedure based on this new combined strain element is accurate and efficient.

keywords: combined strain, four-node quadrilateral element, first-order shear deformation theory, thermal environment.

1. introduction
functionally graded materials have been successfully applied in numerous fields of engineering. the material is usually made from a mixture of ceramic and metal and provides a continuous variation of material properties from the bottom surface to the top surface of the plate. functionally graded materials have attracted more attention in thermal environment applications, such as spacecraft and nuclear tanks. the analytical solutions [1–4] are valuable in certain cases, but in general, with complicated geometries or complex conditions like the high temperatures of a thermal environment, they are often limited. besides the analytical approaches, numerical methods are used in structural analyses [5–27]. bui et al. [5] presented new numerical results on the high temperature mechanical behaviour of heated functionally graded plates, emphasizing the effects of high temperature on static bending deflections and natural frequencies; a displacement-based finite element formulation associated with the third-order shear deformation plate theory of shi was developed there. an improved four-node element based on a twice-interpolation strategy was introduced by ton-that et al. [6, 7] for linear and nonlinear analyses of composite plate/shell structures. thermal buckling analyses of functionally graded plates and cylindrical shells were investigated by s. trabelsi et al. [12]; in this reference, a finite element formulation based on a modified fsdt shell formulation was elaborated. a dynamic analysis of functionally graded carbon nanotube-reinforced plate and shell structures using a double directors finite shell element was first presented by a. frikha et al. [13, 14]; the governing equations were developed using a linear discrete double directors finite element model. the generalized differential quadrature method [15–18] was used to study the behaviour of functionally graded and laminated doubly curved shells and panels. the free vibration of beams made of imperfect functionally graded materials including porosities was investigated in [19], and the free vibration of functionally graded beams resting on a two-parameter elastic foundation was examined in [20] by m. avcar et al. using the discrete singular convolution method, ö. civalek et al. [21–23] studied the behaviour of carbon nanotube reinforced and functionally graded shells and plates, respectively.

figure 1. the functionally graded plate: a) in 3d space, b) with the variation of the volume fraction.

moreover, based on the refined plate theory, some references [24–26] reviewed the mechanical behaviour of functionally graded sandwich or functionally graded porous plates under various boundary conditions. in this paper, a new four-node quadrilateral finite element related to the combined strain strategy is introduced.
the main idea of this combined strain strategy is the combination of the membrane strain and the shear strain related to tying points, as well as the bending strain with respect to a cell-based smoothed finite element method. some difficulties that arise in the standard fem may be listed as follows: it requires a large computer memory and computational time to obtain the desired results; mapping or coordinate transformation is involved, so the elements are not allowed to be of arbitrary shape; the restriction on the shape of bilinear isoparametric elements cannot be removed; and the problem domain cannot be discretized in more flexible ways. the element presented in this paper, based on the combined strain strategy, does not suffer from the above-mentioned difficulties and is used to analyse the behaviour of functionally graded plate/shell structures in a thermal environment. many desirable characteristics of the proposed element, such as accuracy, efficiency and the removal of shear and membrane locking, are verified through several examples. the article is organized into four sections. in sect. 2, the formulation of this new element for functionally graded structures related to the first-order shear deformation theory is presented. several examples are subsequently given in sect. 3. we end the paper with a summary and some concluding remarks in the last section.

2. formulation of the combined strain element for functionally graded material
2.1. functionally graded material
a functionally graded plate [28] with thickness h is considered, as shown in figure 1a. the volume fractions of the ceramic ($V_c$) and the metal ($V_m$) are described in (1), and the variation of the volume fraction for several volume fraction coefficients of a functionally graded plate using the power-law distribution is plotted in figure 1b:

$$V_c = \left(\frac{z}{h} + \frac{1}{2}\right)^n, \qquad V_m = 1 - V_c, \qquad n \ge 0, \qquad (1)$$

where z is the thickness coordinate variable with −h/2 ≤ z ≤ h/2, and c, m and n represent the ceramic constituent, the metal constituent and the non-negative volume fraction gradient index, respectively. the values of e, ρ, ν and α, which vary through the thickness of the plate, are formulated as below:

$$E(z) = E_m + (E_c - E_m)\left(\frac{1}{2} + \frac{z}{h}\right)^n, \qquad (2)$$

$$\rho(z) = \rho_m + (\rho_c - \rho_m)\left(\frac{1}{2} + \frac{z}{h}\right)^n, \qquad (3)$$

$$\nu(z) = \nu_m + (\nu_c - \nu_m)\left(\frac{1}{2} + \frac{z}{h}\right)^n, \qquad (4)$$

$$\alpha(z) = \alpha_m + (\alpha_c - \alpha_m)\left(\frac{1}{2} + \frac{z}{h}\right)^n. \qquad (5)$$

a material property p as a function of the temperature t (k) can be expressed by the following equation [1]:

$$P = P_0\left(P_{-1}T^{-1} + 1 + P_1 T + P_2 T^2 + P_3 T^3\right), \qquad (6)$$

where $T = T_0 + \Delta T$ and $T_0 = 300$ k (ambient or stress-free temperature), $\Delta T$ is the temperature change, and $P_0$, $P_{-1}$, $P_1$, $P_2$, $P_3$ are the coefficients of the temperature t (k), unique to each constituent.

2.2. formulation of the combined strain element
in this section, the construction of the combined strain element is briefly given with respect to the first-order shear deformation theory:

$$u(x,y,z) = u_0(x,y) + z\theta_y,$$
$$v(x,y,z) = v_0(x,y) - z\theta_x, \qquad (7)$$
$$w(x,y,z) = w_0(x,y),$$

with $u_0$, $v_0$, $w_0$ being the displacements of a point located in the mid-surface, and $\theta_x$, $\theta_y$ the rotations of the transverse normal (i.e. in the z direction) about the x- and y-axes, respectively.
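as an illustration of the material model of section 2.1, the following sketch (not part of the paper itself) evaluates the temperature dependence of eq. (6) and the through-thickness modulus of eqs. (1)–(2), using the si3n4 and sus304 coefficients listed in table 1 below:

```python
# a minimal sketch of eqs. (1)-(6): temperature-dependent young's modulus
# of a functionally graded plate by the rule of mixtures. the coefficients
# are the si3n4 and sus304 values from table 1.
def property_T(p0, pm1, p1, p2, p3, T):
    """eq. (6): P = P0 * (P_-1 / T + 1 + P1*T + P2*T^2 + P3*T^3)."""
    return p0 * (pm1 / T + 1.0 + p1 * T + p2 * T**2 + p3 * T**3)

def E_through_thickness(z, h, n, Ec, Em):
    """eqs. (1)-(2): E(z) for -h/2 <= z <= h/2, gradient index n."""
    Vc = (z / h + 0.5) ** n
    return Em + (Ec - Em) * Vc

T = 300.0
Ec = property_T(348.43e9, 0.0, -3.070e-4, 2.160e-7, -8.946e-11, T)  # si3n4
Em = property_T(201.04e9, 0.0, 3.079e-4, -6.534e-7, 0.0, T)         # sus304
h, n = 0.025, 1.0
for z in (-h / 2, 0.0, h / 2):
    E = E_through_thickness(z, h, n, Ec, Em)
    print(f"z = {z:+.4f} m: E = {E / 1e9:.1f} GPa")
```

at t = 300 k the script reproduces the tabulated values e ≈ 322.3 gpa (si3n4) and e ≈ 207.8 gpa (sus304) at the top and bottom faces, with the rule-of-mixtures variation in between.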
the membrane and bending strain vectors can be written as

$$\varepsilon = \begin{Bmatrix}\varepsilon_x\\ \varepsilon_y\\ \gamma_{xy}\end{Bmatrix} = \begin{Bmatrix}u_{0,x}\\ v_{0,y}\\ u_{0,y} + v_{0,x}\end{Bmatrix} + z\begin{Bmatrix}\theta_{y,x}\\ -\theta_{x,y}\\ \theta_{y,y} - \theta_{x,x}\end{Bmatrix} = \varepsilon_m + z\,\varepsilon_b, \qquad (8)$$

and the shear strain vector is given by

$$\varepsilon_s = \begin{Bmatrix}\gamma_{xz}\\ \gamma_{yz}\end{Bmatrix} = \begin{Bmatrix}\theta_y + w_{0,x}\\ -\theta_x + w_{0,y}\end{Bmatrix}. \qquad (9)$$

under hooke's law, the constitutive equations are expressed as

$$\sigma = D_m(z)\left(\varepsilon_m + z\,\varepsilon_b - \varepsilon^{(T)}\right), \qquad (10)$$

$$\tau = D_s(z)\,\varepsilon_s, \qquad (11)$$

in which

$$\sigma = \begin{bmatrix}\sigma_x & \sigma_y & \sigma_{xy}\end{bmatrix}^T, \qquad \tau = \begin{bmatrix}\tau_{yz} & \tau_{xz}\end{bmatrix}^T, \qquad (12)$$

$$D_m(z) = \frac{E(z)}{1-\nu^2}\begin{bmatrix}1 & \nu & 0\\ \nu & 1 & 0\\ 0 & 0 & (1-\nu)/2\end{bmatrix}, \qquad D_s(z) = \frac{E(z)}{2(1+\nu)}\begin{bmatrix}1 & 0\\ 0 & 1\end{bmatrix}, \qquad (13)$$

$$\varepsilon^{(T)} = \begin{bmatrix}\varepsilon_x^{(T)} & \varepsilon_y^{(T)} & 0\end{bmatrix}^T = \begin{bmatrix}\alpha(z)\Delta T & \alpha(z)\Delta T & 0\end{bmatrix}^T. \qquad (14)$$

the mid-surface of a four-node quadrilateral element is subdivided into four non-overlapping 3-node triangular domains defined by the vertices and the centre point '5' of the element, as shown in figure 2a. the coordinates of the point '5' in the natural coordinate system are interpolated by $x_5 = \zeta_1 x_1 + \zeta_2 x_2 + \zeta_3 x_3 + \zeta_4 x_4$ with

$$\begin{Bmatrix}\zeta_1\\ \zeta_2\\ \zeta_3\\ \zeta_4\end{Bmatrix} = \frac{1}{2}\frac{A_{234}}{A_{234}+A_{124}}\begin{Bmatrix}1/3\\ 1/3\\ 0\\ 1/3\end{Bmatrix} + \frac{1}{2}\frac{A_{124}}{A_{234}+A_{124}}\begin{Bmatrix}0\\ 1/3\\ 1/3\\ 1/3\end{Bmatrix} + \frac{1}{2}\frac{A_{134}}{A_{134}+A_{123}}\begin{Bmatrix}1/3\\ 1/3\\ 1/3\\ 0\end{Bmatrix} + \frac{1}{2}\frac{A_{123}}{A_{134}+A_{123}}\begin{Bmatrix}1/3\\ 0\\ 1/3\\ 1/3\end{Bmatrix}, \qquad (15)$$

in which $A_{234}$, $A_{124}$, $A_{134}$ and $A_{123}$ are the areas of the triangles "234", "124", "134" and "123". from the four non-overlapping triangular domains "125", "235", "345" and "415", four tying points at four positions are determined, as depicted in figure 2b and clearly presented in [29].

figure 2. (a) four triangular subdivisions, (b) four tying points corresponding to these subdivisions.

figure 3. (a), (b) & (c) nc smoothing cells with the values of the shape functions at the nodes in the format (n1, n2, n3, n4), (d) four tying points for calculating the shear strain.

the membrane strain is then approximated as

$$\varepsilon_m = \frac{1}{4}\left(\tilde{\varepsilon}_m^{(A)} + \tilde{\varepsilon}_m^{(B)} + \tilde{\varepsilon}_m^{(C)} + \tilde{\varepsilon}_m^{(D)}\right) + \frac{1}{2}\left(-\tilde{\varepsilon}_m^{(D)} + \tilde{\varepsilon}_m^{(C)}\right)\xi + \frac{1}{2}\left(-\tilde{\varepsilon}_m^{(B)} + \tilde{\varepsilon}_m^{(A)}\right)\eta = \tilde{B}_m q_m = B_m q, \qquad (16)$$

in which $\tilde{\varepsilon}_m^{(A)}$, $\tilde{\varepsilon}_m^{(B)}$, $\tilde{\varepsilon}_m^{(C)}$, $\tilde{\varepsilon}_m^{(D)}$ are, respectively, the membrane strains of the four triangular domains evaluated at the tying points. the bending strain is smoothed following [30], see figures 3a, 3b & 3c. the liaison between the nodal displacements and the smoothed bending strain field is written as

$$\varepsilon_b = \sum_{i=1}^{n_c}\tilde{B}_{bi}\,q_{bi} = \tilde{B}_b q_b = B_b q, \qquad \tilde{B}_{bi} = \frac{1}{A_c}\int_{\Gamma_c}\begin{bmatrix}0 & N_i n_x & 0\\ 0 & 0 & N_i n_y\\ 0 & N_i n_y & N_i n_x\end{bmatrix} d\Gamma, \qquad (17)$$

where $A_c$ and $\Gamma_c$ are the area and the boundary of the smoothing cell, respectively, and $n_x$ and $n_y$ are the components of the vector normal to the boundary $\Gamma_c$.
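the boundary integration in eq. (17) can be illustrated by a minimal sketch; the smoothing cell below is a unit square with one-point (midpoint) quadrature per edge, which is an illustrative setup of my own rather than the element's actual smoothing-cell subdivision:

```python
import numpy as np

# a minimal sketch of the smoothing operation in eq. (17): the domain
# integral of shape-function derivatives over a smoothing cell is replaced
# by a boundary integral, evaluated here with midpoint quadrature on each
# edge. the cell geometry and shape values are illustrative.
def smoothed_shape_gradient(cell_xy, shape_at_midpoints):
    """return (1/Ac) * sum over edges of N(midpoint) * n * edge length."""
    n_edges = cell_xy.shape[0]
    area = 0.5 * abs(sum(cell_xy[i, 0] * cell_xy[(i + 1) % n_edges, 1]
                         - cell_xy[(i + 1) % n_edges, 0] * cell_xy[i, 1]
                         for i in range(n_edges)))
    grads = np.zeros((shape_at_midpoints.shape[1], 2))
    for e in range(n_edges):
        p0, p1 = cell_xy[e], cell_xy[(e + 1) % n_edges]
        edge = p1 - p0
        length = np.linalg.norm(edge)
        normal = np.array([edge[1], -edge[0]]) / length  # outward for ccw cell
        grads += np.outer(shape_at_midpoints[e], normal) * length
    return grads / area

# unit square cell; rows = edge midpoints, cols = bilinear nodal shape values
cell = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
N_mid = np.array([[0.5, 0.5, 0.0, 0.0],
                  [0.0, 0.5, 0.5, 0.0],
                  [0.0, 0.0, 0.5, 0.5],
                  [0.5, 0.0, 0.0, 0.5]])
print(smoothed_shape_gradient(cell, N_mid))
```

for a bilinear shape function this returns the cell average of its gradient, which is exactly what the smoothing of eq. (17) is meant to produce.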
based on the assumed constant transverse shear strain conditions along the edges and using the four tying points shown in figure 3d, the shear strain related to [31] can be expressed as

$$\varepsilon_s = \tilde{B}_s q_s = B_s q, \qquad \tilde{B}_{si} = \begin{bmatrix}x_\xi & y_\xi\\ x_\eta & y_\eta\end{bmatrix}^{-1}\begin{bmatrix}N_{i,\xi} & b_i^{11} N_{i,\xi} & b_i^{12} N_{i,\xi}\\ N_{i,\eta} & b_i^{21} N_{i,\eta} & b_i^{22} N_{i,\eta}\end{bmatrix}, \qquad (18)$$

where $b_i^{11} = \xi_i x_\xi^m$, $b_i^{12} = \xi_i y_\xi^m$, $b_i^{21} = \eta_i x_\eta^l$ and $b_i^{22} = \eta_i y_\eta^l$, in which $\xi_i \in \{-1, 1, 1, -1\}$, $\eta_i \in \{-1, -1, 1, 1\}$ and $(i, m, l) \in \{(1, F, H), (2, F, G), (3, E, G), (4, E, H)\}$, as well as

$$N = \frac{1}{2}\begin{bmatrix}1-\xi & 0 & 1+\xi & 0\\ 0 & 1-\eta & 0 & 1+\eta\end{bmatrix}. \qquad (19)$$

the normal forces, bending moments and shear forces can then be computed through the following relations:

$$N = \left\{\tilde{N}_x\ \tilde{N}_y\ \tilde{N}_{xy}\right\}^T = \int_{-h/2}^{h/2}\left\{\sigma_x\ \sigma_y\ \sigma_{xy}\right\}^T dz = \int_{-h/2}^{h/2} D_m(z)\left(\varepsilon_m + z\varepsilon_b - \varepsilon^{(T)}\right)dz, \qquad (20)$$

$$M = \left\{\tilde{M}_x\ \tilde{M}_y\ \tilde{M}_{xy}\right\}^T = \int_{-h/2}^{h/2}\left\{\sigma_x\ \sigma_y\ \sigma_{xy}\right\}^T z\,dz = \int_{-h/2}^{h/2} D_m(z)\left(\varepsilon_m + z\varepsilon_b - \varepsilon^{(T)}\right)z\,dz, \qquad (21)$$

$$Q = \left\{\tilde{Q}_y\ \tilde{Q}_x\right\}^T = \int_{-h/2}^{h/2}\left\{\tau_{yz}\ \tau_{xz}\right\}^T dz = \int_{-h/2}^{h/2} D_s(z)\,\varepsilon_s\,dz. \qquad (22)$$

the above equations can be presented in the matrix form

$$\begin{Bmatrix}N\\ M\\ Q\end{Bmatrix} = \begin{bmatrix}A & B & 0\\ B & D & 0\\ 0 & 0 & \hat{A}\end{bmatrix}\begin{Bmatrix}\varepsilon_m\\ \varepsilon_b\\ \varepsilon_s\end{Bmatrix} - \begin{Bmatrix}N^{(T)}\\ M^{(T)}\\ 0\end{Bmatrix}, \qquad (23)$$

with

$$\left(A, B, D\right) = \int_{-h/2}^{h/2}\left(1, z, z^2\right)D_m(z)\,dz, \qquad (24)$$

$$\hat{A} = \int_{-h/2}^{h/2} D_s(z)\,dz, \qquad (25)$$

$$\left(N^{(T)}, M^{(T)}\right) = \int_{-h/2}^{h/2} D_m(z)\,(1, z)\left\{1\ 1\ 0\right\}^T \alpha(z)\Delta T\,dz. \qquad (26)$$

the total strain energy of a plate due to the normal forces, shear forces and bending moments, together with the work of the external surface load f, can be given by

$$U = \frac{1}{2}\int_{V_e}\varepsilon^T\sigma\,dV = \frac{1}{2}q^T\int_{S_e}\left(B_m^T A B_m + B_m^T B B_b + B_b^T B B_m + B_b^T D B_b + B_s^T \hat{A} B_s\right)dS\,q - q^T\int_{S_e}\left(B_m^T N^{(T)} + B_b^T M^{(T)}\right)dS - q^T\int_{S_e}N^T f\,dS + \frac{1}{2}\int_{S_e}\left(\varepsilon^{(T)}\right)^T A\,\varepsilon^{(T)}dS. \qquad (27)$$
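before the energy expression is discretized, the through-thickness integrations of eqs. (24)–(26) can be sketched numerically; the gradation and material values below (the al/al2o3 moduli from section 3.1 and a constant ν = 0.3) are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

# a minimal sketch of eqs. (24)-(26): gauss integration of the extensional
# (A), coupling (B) and bending (D) stiffness matrices through the
# thickness for a metal-ceramic power-law gradation (illustrative values).
Em, Ec, nu, h, n = 70e9, 380e9, 0.3, 0.01, 1.0

def Dm(z):
    E = Em + (Ec - Em) * (z / h + 0.5) ** n
    c = E / (1.0 - nu**2)
    return c * np.array([[1.0, nu, 0.0],
                         [nu, 1.0, 0.0],
                         [0.0, 0.0, (1.0 - nu) / 2.0]])

zs, w = np.polynomial.legendre.leggauss(8)   # 8-point rule on [-1, 1]
zs, w = zs * h / 2.0, w * h / 2.0            # map to [-h/2, h/2]

A = sum(wi * Dm(zi) for zi, wi in zip(zs, w))
B = sum(wi * zi * Dm(zi) for zi, wi in zip(zs, w))
D = sum(wi * zi**2 * Dm(zi) for zi, wi in zip(zs, w))
print(f"A[0,0] = {A[0,0]:.3e} N/m, B[0,0] = {B[0,0]:.3e} N, "
      f"D[0,0] = {D[0,0]:.3e} N*m")
```

note that the coupling matrix b is non-zero here, as expected for an unsymmetric through-thickness gradation.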
the comparisons of between the proposed method and others, such as reddy’s theory (rt) [33], the sinusoidal shear deformation theory (ssdt) [34], the hyperbolic shear deformation theory (hysdt) and shi’s theory (st) [5] are presented in table 2 for different values of n. the results of this paper show a good agreement with other reference solutions as depicted in figure 4a. however, as the functionally graded plates become more and more metallic, larger deflections are obtained as compared to those more and more ceramic. by considering the accuracy of the proposed element to analyse the functionally graded plates under a temperature environment, the temperature is set to be t = 300 k (∆t = 0) and the same plate as the previous example is also studied, but now it is made of si3n4/sus304. the properties related to this material can be seen in table 1. the maximum central deflection is then normalized by w = [100h3emw(a/2, b/2)]/[12(1 −ν2m)qa4] and compared with the analytical solutions given by [1]. with each value of n, a good agreement between the two results can be found, as presented in table 3. to additionally explore the physical behaviour of functionally graded plates, the fully clamped al2o3/sus304 plate with n = 0.5 and a/b = 1 is considered by changing the thickness (a/5, a/10, a/20, a/30 & a/50) and the temperature from 300 k to 1400 k. the temperature-deflection curves are plotted in figure 4b. generally, under high temperature environments, it clearly indicates a very important effect of the material combinations on the overall mechanical behaviour of functionally graded materials. the numerical results obtained are very interesting as the thinner plates yield larger deflections than the thicker ones. the mechanical deflections of all functionally graded plates increase for the higher temperature range. it means that when the functionally graded plates are subjected to higher temperature environments, larger deflections for all considered functionally graded plates can be reached. 533 hoang lan ton-that acta polytechnica ceramic p (300 k) p0 p−1 p1 p2 p3 al2o3 e (pa) 320.24e9 349.55e9 0 -3.853e-4 4.027e-7 -1.673e-10 α (1/k) 7.203e-6 6.8269e-6 0 1.838e-4 0 0 ν 0.260 0.26 0 0 0 0 ρ (kg/m3) 3800 3800 0 0 0 0 si3n4 e (pa) 322.27e9 348.43e9 0 -3.070e-4 2.160e-7 -8.946e-11 α (1/k) 7.475e-6 5.8723e-6 0 9.095e-4 0 0 ν 0.240 0.24 0 0 0 0 ρ (kg/m3) 2370 2370 0 0 0 0 zro2 e (pa) 168.06e9 244.27e9 0 -1.371e-3 1.214e-6 -3.681e-10 α (1/k) 18.591e-6 12.766e-6 0 -1.491e-3 1.006e-5 -6.778e-11 ν 0.298 0.288 0 1.133e-4 0 ρ (kg/m3) 3657 3657 0 0 0 metal p (300 k) p0 p−1 p1 p2 p3 sus304 e (pa) 207.79e9 201.04e9 0 3.079e-4 -6.534e-7 0 α (1/k) 15.321e-6 12.330e-6 0 8.086e-4 0 0 ν 0.318 0.326 0 -2.002e-4 3.797e-7 0 ρ (kg/m3) 8166 8166 0 0 0 0 table 1. temperature dependent coefficient of young’s modulus e (pa), thermal expansion coefficient α(1/k), poisson’s ratio ν, mass density ρ(kg/m3) for various materials. n results rt ssdt hysdt st present ssss ceramic 0.4665 0.4665 0.4665 0.4630 0.4673 1 0.9421 0.9287 0.9421 0.9130 0.9304 2 1.2227 1.1940 1.2228 1.2069 1.1929 3 1.3530 1.3200 1.3533 1.3596 1.3144 5 1.4646 1.4356 1.4653 1.4874 1.4225 10 1.6054 1.5876 1.6057 1.6308 1.5716 metal 2.5328 2.5327 2.5327 2.5120 2.5351 table 2. comparison of the dimensionless deflections of a functionally graded al/al2o3 plate (a/b = 1, a/h = 10) for different values of volume fraction exponent n. 
method n = 0.5 n = 1 n = 5 n = 10 analytical approach based on hsdt 0.3251 0.3430 0.3800 0.3960 present 0.3342 0.3553 0.3905 0.4056 table 3. comparison of the dimensionless deflections of a functionally graded si3n4/sus304 plate (a/b = 1, a/h = 10) for different values of volume fraction exponent n under ambient temperature. 534 vol. 60 no. 6/2020 a combined strain element to functionally graded structures. . . (a). (b). figure 4. (a) the convergence of results for a fully simply supported functionally graded al/al2o3 plate (a/b = 1, a/h = 10), (b) temperature-deflection curves for a fully clamped functionally graded al2o3/sus304 plate (a/b = 1, n = 0.5). figure 5. the geometry of a fully simply supported functionally graded zro2/al cylindrical panel and the central deflection of the cylindrical panel. the last example in this section is given with a fully simply supported zro2/al cylindrical panel subjected to a uniform load q = 106 n/m2 as depicted in figure 5. the material properties are also given as em = 70 gpa, ec = 151 gpa, νm = νc = 0.3. the geometric properties of this structure are denoted by l = 0.2 m, r = 1 m, and ϕ = 0.2 rad. the dimensionless central deflection is introduced by w = wmax/h and the results of this paper are compared with the solutions based on the element-free kp-ritz method of [35]. table 4 presents the dimensionless central deflections with respect to the changes of ratio s = r/h (50, 100 & 200) and three values of n (0.5, 1 & 2). once again, the accuracy and efficiency of the combined strain element are proved by the very small errors shown in table 4 between the results of this element and the results in [35]. from this section, it can be concluded that the mechanical bending behaviour of the functionally graded structures is material dependent, mainly caused by the nonlinear thermal properties and material behaviour of constituent materials. in other words, not all the functionally graded structures in a high temperature environment act and react in the same manner, they, as observed numerically, behave differently from each other. therefore, material combinations in terms of functionally graded materials are important and greatly affect the mechanical static bending behaviour of resultant functionally graded structures and their performance under high temperature conditions. consequently, this phenomenon and behaviour of functionally graded materials may be important for the design and development of the functionally graded materials in engineering applications, especially for those that suffer tough temperature conditions. 535 hoang lan ton-that acta polytechnica s = r/h method n 0.5 1 2 50 the element-free kp-ritz method 0.0038 0.0043 0.0047present 0.0038 0.0043 0.0048 100 the element-free kp-ritz method 0.0542 0.0607 0.0666present 0.0542 0.0607 0.0666 200 the element-free kp-ritz method 0.6503 0.7283 0.8057present 0.6422 0.7192 0.7954 table 4. comparison of the dimensionless deflections at a central point of a fully simply supported functionally graded zro2/al cylindrical panel (l = 0.2 m, r = 1 m, ϕ = 0.2 rad) for different values of volume fraction exponent n under an ambient temperature. (a). (b). (c). figure 6. the first three mode shapes of fully simply supported si3n4/sus304 square plate. 3.2. vibration analysis the first example in this section is related to the fully simply supported square functionally graded plates and their natural frequencies in a thermal environment. 
three types of materials are used in this study: si3n4/sus304, al2o3/sus304 and zro2/sus304. the geometrical parameters are set to be: length a = b = 0.2 m, thickness h = 0.025 m. the natural frequency results presented in the dimensionless values ω = (ωa2/h)[ρm(1−ν2)/em]1/2 with em and ρm are taken at t = 300 k. table 5 gives the comparison of first three modes of dimensionless frequencies of si3n4/sus304, al2o3/sus304 and zro2/sus304 square plates under different values of n (0.5, 1 & 2) among the proposed element and others related to two analytical solutions based on higher-order shear deformation theories [1, 3] and numerical solution based on the standard finite element method connected to the higher-order shear deformation theory [5]. as expected, the presented results show a good agreement with exact solutions. although it only uses the first-order shear deformation theory, its results are almost identical to other results based on a higher-order shear deformation theory [1, 3, 5]. moreover, under the same conditions, the si3n4/sus304 square plate has the largest frequencies while zro2/sus304 square plate provides the smallest values. in order to further validate the accuracy of the combined strain element, especially for structures in high temperature environments, table 6 presents a comparison of the fundamental frequency at high temperatures t = 400 k, 500 k & 600 k of two types of materials, si3n4/sus304 and al2o3/sus304, with fully clamped plates for three values of n (0.5, 1 & 5) between the proposed element and analytical method based on shi theory [1]. it can be seen that the frequencies at high temperatures achieved by the combined strain element are in close agreement with the analytical solutions [1]. finally, the first three modes of a simply supported functionally graded si3n4/sus304 square plate (a = b = 0.2 m and h = 0.025 m) are also depicted in figure 6. the frequencies increase with increasing the temperature from 400 k upto 600 k, but these frequencies decrease when the plates are more and more metallic. 4. conclusions 9. in this work, an efficient numerical method based on the combined strain element is developed for modelling functionally graded structures in a thermal environment. the combination of the membrane strain and shear strain related to tying points and bending strain with respect to cell-based smoothed finite element method is established to build the proposed element. the numerical results show that the presented element can be used to analyse and predict the behaviour of functionally graded plate/shell structures in a thermal environment. in each case of the study with, the achieved results are found to agree well with other analytical results, or with 536 vol. 60 no. 6/2020 a combined strain element to functionally graded structures. . . 
n      material        mode    [1]      [3]      [5]      present
0.5    si3n4/sus304    1       8.675    8.646    8.554    8.555
0.5    si3n4/sus304    2       20.262   20.080   20.559   20.849
0.5    si3n4/sus304    3       30.359   29.908   31.088   31.514
0.5    al2o3/sus304    1       7.803    7.805             7.759
0.5    al2o3/sus304    2       18.253   19.003            18.892
0.5    al2o3/sus304    3       27.569   28.018            28.538
0.5    zro2/sus304     1       6.368    6.406             6.418
0.5    zro2/sus304     2       14.824   15.119            15.567
0.5    zro2/sus304     3       24.570   24.719            23.453
1      si3n4/sus304    1       7.555    7.599    7.487    7.502
1      si3n4/sus304    2       17.649   17.705   17.987   18.286
1      si3n4/sus304    3       26.606   26.727   27.209   27.643
1      al2o3/sus304    1       7.114    6.997             7.058
1      al2o3/sus304    2       16.633   16.518            17.185
1      al2o3/sus304    3       24.700   25.433            25.958
1      zro2/sus304     1       6.037    6.075             6.080
1      zro2/sus304     2       14.014   14.544            14.760
1      zro2/sus304     3       21.456   21.582            22.250
2      si3n4/sus304    1       6.777    6.825    6.705    6.737
2      si3n4/sus304    2       15.809   15.947   16.083   16.413
2      si3n4/sus304    3       23.806   24.147   24.326   24.803
2      al2o3/sus304    1       6.563    6.519             6.511
2      al2o3/sus304    2       15.323   15.833            15.841
2      al2o3/sus304    3       23.048   23.346             23.916
2      zro2/sus304     1       5.753    5.796             5.794
2      zro2/sus304     2       13.294   13.898            14.081
2      zro2/sus304     3       20.247   20.636            21.243

table 5. comparison of the first three modes of the dimensionless frequencies of fully simply supported functionally graded plates under ambient temperature.

n      method                 si3n4/sus304                  al2o3/sus304
                              400 k    500 k    600 k       400 k    500 k    600 k
0.5    analytical solution    15.938   15.468   14.939      14.384   14.003   13.592
0.5    present                14.966   14.990   15.083      13.568   13.611   13.727
1      analytical solution    13.915   13.426   12.941      13.025   12.631   12.188
1      present                13.121   13.133   13.190      12.339   12.365   12.441
5      analytical solution    11.175   10.749   10.242      10.965   10.556   10.073
5      present                10.672   10.679   10.694      10.533   10.539   10.564

table 6. dimensionless fundamental frequencies (mode 1) of fully clamped functionally graded plates (a/b = 1, a/h = 10) at high temperatures.

4. conclusions
in this work, an efficient numerical method based on the combined strain element is developed for modelling functionally graded structures in a thermal environment. the combination of the membrane strain and the shear strain related to tying points, together with the bending strain with respect to the cell-based smoothed finite element method, is established to build the proposed element. the numerical results show that the presented element can be used to analyse and predict the behaviour of functionally graded plate/shell structures in a thermal environment. in each case of the study, the achieved results are found to agree well with other analytical results or with other numerical methods. based on this element, the presented numerical solutions are more stable than others, and the applicability of the proposed element has been clearly shown in the sections above. the present formulation is general and can be extended to other problems, especially those in high temperature environments. the paper also helps to supplement knowledge for engineers in the design process. functionally graded materials, in which the excellent heat and corrosion resistance of ceramics is combined with the energy absorption, plastic deformability and toughness of metals, are outstanding advanced materials that can withstand large mechanical loads under high temperatures. the mechanical information might also be helpful to designers or researchers in the appropriate selection of functionally graded materials for specific purposes; for instance, the right selection of functionally graded materials for the right conditions, such as structures working under high temperatures, is of course a great benefit in practice.

references
[1] n. wattanasakulpong, g. b. prusty, d. w. kelly. free and forced vibration analysis using improved third-order shear deformation theory for functionally graded plates under high temperature loading. journal of sandwich structures & materials 15(5):583–606, 2013. doi:10.1177/1099636213495751.
[2] n. wattanasakulpong, b. gangadhara prusty, d. w. kelly. thermal buckling and elastic vibration of third-order shear deformable functionally graded beams. international journal of mechanical sciences 53(9):734–743, 2011. doi:10.1016/j.ijmecsci.2011.06.005.
[3] x.-l. huang, h.-s. shen. nonlinear vibration and dynamic response of functionally graded plates in thermal environments. international journal of solids and structures 41(9):2403–2427, 2004. doi:10.1016/j.ijsolstr.2003.11.012.
[4] j. yang, h.-s. shen.
nonlinear bending analysis of shear deformable functionally graded plates subjected to thermo-mechanical loads under various boundary conditions. composites part b: engineering 34(2):103–115, 2003. doi:10.1016/s1359-8368(02)00083-5.
[5] t. q. bui, t. v. do, l. h. t. ton, et al. on the high temperature mechanical behaviors analysis of heated functionally graded plates using fem and a new third-order shear deformation plate theory. composites part b: engineering 92:218–241, 2016. doi:10.1016/j.compositesb.2016.02.048.
[6] h. l. ton-that, h. nguyen-van, t. chau-dinh. an improved four-node element for analysis of composite plate/shell structures based on twice interpolation strategy. international journal of computational methods 17(06):1950020, 2020. doi:10.1142/s0219876219500208.
[7] h. l. ton that, h. nguyen-van, t. chau-dinh. nonlinear bending analysis of functionally graded plates using sq4t elements based on twice interpolation strategy. journal of applied and computational mechanics 6(1):125–136, 2020. doi:10.22055/jacm.2019.29270.1577.
[8] l. t. that-hoang, h. nguyen-van, t. chau-dinh, c. huynh-van. enhancement to four-node quadrilateral plate elements by using cell-based smoothed strains and higher-order shear deformation theory for nonlinear analysis of composite structures. journal of sandwich structures & materials 22(7):2302–2329, 2020. doi:10.1177/1099636218797982.
[9] h. nguyen-van, h. l. ton-that, t. chau-dinh, n. d. dao. nonlinear static bending analysis of functionally graded plates using misq24 elements with drilling rotations. in h. nguyen-xuan, p. phung-van, t. rabczuk (eds.), proceedings of the international conference on advances in computational mechanics 2017, pp. 461–475. springer singapore, singapore, 2018. doi:10.1007/978-981-10-7149-2_31.
[10] h. l. ton-that. finite element analysis of functionally graded skew plates in thermal environment based on the new third-order shear deformation theory. journal of applied and computational mechanics 6(4):1044–1057, 2020. doi:10.22055/jacm.2019.31508.1881.
[11] m. bayat, i. alarifi, a. khalili, et al. thermo-mechanical contact problems and elastic behaviour of single and double sides functionally graded brake disks with temperature-dependent material properties. scientific reports 9:15317, 2019. doi:10.1038/s41598-019-51450-z.
[12] s. trabelsi, a. frikha, s. zghal, d. fakhreddine. a modified fsdt-based four nodes finite shell element for thermal buckling analysis of functionally graded plates and cylindrical shells. engineering structures 178:444–459, 2018. doi:10.1016/j.engstruct.2018.10.047.
[13] a. frikha, s. zghal, f. dammak. dynamic analysis of functionally graded carbon nanotubes-reinforced plate and shell structures using a double directors finite shell element. aerospace science and technology 78:438–451, 2018. doi:10.1016/j.ast.2018.04.048.
[14] s. zghal, a. frikha, d. fakhreddine. mechanical buckling analysis of functionally graded power-based and carbon nanotubes-reinforced composite plates and curved panels. composites part b: engineering 150:165–183, 2018. doi:10.1016/j.compositesb.2018.05.037.
[15] f. tornabene, n. fantuzzi, m. bacciocchi, e. viola. effect of agglomeration on the natural frequencies of functionally graded carbon nanotube-reinforced laminated composite doubly-curved shells. composites part b: engineering 89:187–218, 2016. doi:10.1016/j.compositesb.2015.11.016.
[16] f. tornabene, n. fantuzzi, e. viola. stress and strain recovery for functionally graded free-form and doubly-curved sandwich shells using higher-order equivalent single layer theory. composite structures 119:67–89, 2015. doi:10.1016/j.compstruct.2014.08.005.
[17] f. tornabene, a. liverani, g. caligiana. fgm and laminated doubly curved shells and panels of revolution with a free-form meridian: a 2-d gdq solution for free vibrations. international journal of mechanical sciences 53:446–470, 2011. doi:10.1016/j.ijmecsci.2011.03.007.
[18] f. tornabene. free vibration analysis of functionally graded conical, cylindrical shell and annular plate structures with a four-parameter power-law distribution. computer methods in applied mechanics and engineering 198:2911–2935, 2009. doi:10.1016/j.cma.2009.04.011.
[19] m. avcar. free vibration of imperfect sigmoid and power law functionally graded beams. steel and composite structures 30:603–615, 2019. doi:10.12989/scs.2019.30.6.603.
[20] m. avcar, w. k. m. mohammed. free vibration of functionally graded beams resting on winkler-pasternak foundation. arabian journal of geosciences 11:232, 2018. doi:10.1007/s12517-018-3579-2.
[21] ö. civalek, m. acar. discrete singular convolution method for the analysis of mindlin plates on elastic foundations. international journal of pressure vessels and piping 84:527–535, 2007. doi:10.1016/j.ijpvp.2007.07.001.
[22] ö. civalek. free vibration of carbon nanotubes reinforced (cntr) and functionally graded shells and plates based on fsdt via discrete singular convolution method. composites part b: engineering 111:45–59, 2017. doi:10.1016/j.compositesb.2016.11.030.
[23] ö. civalek, m. avcar. free vibration and buckling analyses of cnt reinforced laminated non-rectangular plates by discrete singular convolution method. engineering with computers pp. 1–33, 2020. doi:10.1007/s00366-020-01168-8.
[24] a. menasria, a. kaci, a. a. bousahla, et al. a four-unknown refined plate theory for dynamic analysis of fg-sandwich plates under various boundary conditions. steel and composite structures 36:355–367, 2020. doi:10.12989/scs.2020.36.3.355.
[25] m. rahmani, k. abdelhakim, a. bousahla, et al. influence of boundary conditions on the bending and free vibration behavior of fgm sandwich plates using a four-unknown refined integral plate theory. computers and concrete 25:225–244, 2020. doi:10.12989/cac.2020.25.3.225.
[26] a. zine, a. a. bousahla, f. bourada, et al. bending analysis of functionally graded porous plates via a refined shear deformation theory. computers and concrete 26:63–74, 2020. doi:10.12989/cac.2020.26.1.063.
[27] h. l. ton-that.
improvement on eight-node quadrilateral element (iq8) using twice-interpolation strategy for linear elastic fracture mechanics. engineering solid mechanics 8:323–336, 2020. doi:10.5267/j.esm.2020.3.005.
[28] d. k. jha, t. kant, r. k. singh. a critical review of recent research on functionally graded plates. composite structures 96:833–849, 2013. doi:10.1016/j.compstruct.2012.09.001.
[29] k. yeongbin, p.-s. lee, k.-j. bathe. the mitc4+ shell element and its performance. computers & structures 169:57–68, 2016. doi:10.1016/j.compstruc.2016.03.002.
[30] h. nguyen-van, n. mai-duy, t. tran-cong. a simple and accurate four-node quadrilateral element using stabilized nodal integration for laminated plates. computers, materials & continua 6:159–176, 2007. doi:10.3970/cmc.2007.006.159.
[31] k.-j. bathe, e. n. dvorkin. a formulation of general shell elements: the use of mixed interpolation of tensorial components. international journal for numerical methods in engineering 22:697–722, 1986. doi:10.1002/nme.1620220312.
[32] d. d. fox, j. c. simo. a drill rotation formulation for geometrically exact shells. computer methods in applied mechanics and engineering 98:329–343, 1992. doi:10.1016/0045-7825(92)90002-2.
[33] j. n. reddy. analysis of functionally graded plates. international journal for numerical methods in engineering 47:663–684, 2000. doi:10.1002/(sici)1097-0207(20000110/30)47:1/3<663::aid-nme787>3.0.co;2-8.
[34] a. m. zenkour. generalized shear deformation theory for bending analysis of functionally graded plates. applied mathematical modelling 30:67–84, 2006. doi:10.1016/j.apm.2005.03.009.
[35] x. zhao, y. lee, k. liew. thermoelastic and vibration analysis of functionally graded cylindrical shells. international journal of mechanical sciences 51:694–707, 2009. doi:10.1016/j.ijmecsci.2009.08.001.

acta polytechnica 60(6):528–539, 2020

acta polytechnica vol. 43 no. 5/2003

experimental evaluation of mountain bike suspension systems

j. titlestad, t. fairlie-clarke, m. davie, a. whittaker, s. grant

abstract
a significant distinction between competitive mountain bikes is whether they have a suspension system. research studies indicate that a suspension system gives advantages, but it is difficult to quantify the benefits because they depend on so many variables, including the physiology and psychology of the cyclist, the roughness of the track and the design of the suspension system. a laboratory based test rig has been built that allows the number of variables in the system to be reduced and test conditions to be controlled. the test rig simulates regular impacts of the rear wheel with bumps in a rolling road. the physiological variables of oxygen consumption and heart rate were measured, together with speeds and forces at various points in the system. physiological and mechanical test results both confirm a significant benefit in using a suspension system on the simulated rough track, with oxygen consumption reduced by around 30 % and power transmitted through the pedals reduced by 30 % to 60 %.

keywords: mountain bike, suspension, dynamics.

1 introduction
off-road cycling, or mountain biking, has developed as an important element of the sport of cycling in the last 20 years. a significant distinction between competition bicycles is whether or not they have a suspension system. there are three categories. a rigid frame (rf) mountain bike has no suspension.
a hard tail (ht) mountain bike has a front wheel suspension only, and a full suspension (su) mountain bike has front and rear wheel suspensions. at the present time, there is a lack of information about the conditions under which a suspension system offers an advantage, the extent of the advantage and possible detrimental effects. the great majority of competition mountain bikes include a front suspension, but most professional cross-country cyclists do not ride full suspension bicycles [1]. however, research studies indicate that there can be advantages with a full suspension on rough terrain, and it is of interest to explore how these can be realised under race conditions and to provide data that can be used for future design developments. a full suspension system provides potential benefits of reduced fatigue, better traction for both uphill and downhill cycling, and the ability to control the bicycle at faster downhill speeds [2]. it is difficult to quantify these benefits because they may depend on so many variables, including the physiology and psychology of the cyclist, the roughness of the track, the cyclist's riding style and the design of the bicycle and suspension system. published results show definite physiological advantages for full suspension bicycles under laboratory conditions, and some advantages in controlled time trials on cross country trails. however, there is significant conflict between experimental evidence, time trials and race results. this paper reports on work [1] that has been done to help clarify the differences.

2 test rig design
past work
most experiments on the physiological effects of riding bicycles are carried out using standard cycle dynamometer training machines where the machine is static; there are no wheels, and the cyclist pedals against a largely frictional loading. clearly, this is not suitable for the investigation of the effect of suspension systems, and other methods must be used. a standard bicycle can be tested in the laboratory by using a power driven treadmill, usually with the treadmill inclined at a slope of about 4 % to 6 % so that the rider has to do work to generate a reaction force at the wheel equal to the component of the weight of the rider and bicycle acting parallel to the surface of the treadmill. laboratory tests can also be conducted using a roller system such as used for training purposes. the rear wheel is supported by two closely spaced rollers while the front wheel runs on a single roller. it is important that the front roller be driven by the rotation of the rear rollers so that the rider can maintain balance. experiments can also be conducted on outdoor tracks. these provide the greatest realism but the least control of conditions and restricted opportunities for measurements. one area of interest has been the loss of energy arising because of the suspension movements induced by the cyclic variation of pedal force on a smooth road.
one area of interest has been the loss of energy arising because of the suspension movements induced by the cyclic variation of pedal force on a smooth road. this has been investigated by wang and hull [3] using treadmill tests, and the results indicate that about 1.3 % of the rider's power input is dissipated in the suspension system. such results have provided the motivation for recent designs of suspension systems that minimise the response to rider induced loading.

experiments have also been conducted to investigate the effect of suspension systems on bicycles ridden over bumps. berry et al [4] made measurements under laboratory conditions where they attached a bump to the belt of a power driven treadmill, tilted to give a 4 % slope, and tested bicycles with no suspension system, with a front suspension system, with a rear suspension system and with a full suspension system. the bump impact frequency was 0.7 hz. their results show that the oxygen consumed and the heart rate are highest with the rigid frame bicycle and lowest with the rear only suspension system. the oxygen consumed when riding the full suspension bicycle over bumps was close to that measured for the rear only suspension bicycle, but the heart rate was higher.

experiments on outdoor tracks have been conducted by seifert et al [2]. on a flat track at 16.1 km/h and a bump frequency of 0.5 hz, they measured the 24-hour change in creatine kinase, the volume of oxygen consumed and the heart rate. the results indicate that a significant advantage is gained over an rf bicycle by having a suspension system, but there was little difference between the ht and su types of suspension. time trials were also conducted over a cross-country trail, an uphill trail and a downhill trail. they found that the ht bicycle was significantly faster than the rf and su bicycles on the cross-country trail, but there was no significant difference between the bicycles in the ascent and descent time trials. macrae et al [5] conducted uphill time trials on an 'on road' asphalt course and on an 'off road' course to compare the performance of an ht bicycle and a su bicycle. they found no significant difference in times or physiological measurements, but the power transmitted through the pedals on the su bicycle was significantly higher than on the ht bicycle.
some increase in the power transmitted is to be expected due to the heavier weight of the su bicycle and the energy dissipated through the suspension damper, but the difference is much greater than can be explained by this, and the cause is not clear.

design approach

the results of these tests do not provide consistent data. su bicycles perform well in laboratory tests, but ht bicycles seem to have the advantage in time trials and races. the plan for the work reported in this paper was to clarify the issues by establishing an ongoing program that would start from a simple but secure basis with a well-controlled experiment where the number of variables could be minimised. such an experiment could be repeated with consistent outcomes, and later on the constraints could be relaxed and the effect on performance of particular variables investigated.

it is important to simulate accurately the inertia of the bicycle and rider and the overall resistance to forward movement in order to obtain meaningful physiological results. at the same time, the bicycle needs to be held stationary in a laboratory environment to allow accurate instrumentation and observation of the rider. a standard treadmill would not meet the inertia and resistance requirements and would leave too many unconstrained variables, with the rider having to maintain balance and position on the treadmill. it was decided to design the test rig so that the rear wheel would run on a single large diameter roller while the front forks were held by a frame, but free to rotate about the front wheel axle. bumps were fitted across the width of the roller. this arrangement allows the inertia and resistance to be simulated through the dynamics of the roller and limits the bump impact to the rear wheel, although a consequence is that the bump frequency is higher than would be experienced on most trails, since it is dictated by the circumference of the roller. the subject cyclists were asked to remain in the seat and to minimise their body movement so that the measured differences between the ht and su tests could be clearly correlated with the rear wheel impact with the bumps and the effectiveness of the suspension system in attenuating the effects of the impact.

the test rig is shown in fig. 1. its primary elements are a bracket to hold the front forks of the bicycle and a large diameter steel roller against which the rear wheel rotates. the inertia of the roller was set to simulate a cyclist of 74 kg (the mean weight of the subjects was 74 ± 6.3 kg) and a mountain bike of 12 kg. this meant that an impact with a bump would decelerate the roller by the same amount as the bicycle on a road would be decelerated, and the subject cyclist would have to do the same amount of work to regain the speed of the roller as to regain the speed of the bicycle on the road. a strip of carpet was attached to the roller to simulate riding on a soft surface, and two bumps were formed by evenly spaced rectangular wooden blocks (70 mm wide by 30 mm high) bolted across the roller. a range of bump sizes was tried in the preliminary testing, and the chosen size provided the largest bump impact that could be applied while still allowing the cyclist to cycle at a sub-maximal level. the lower stanchions of the front forks of the bicycle were held vertical in the transverse plane and stationary, so that the cyclists did not expend energy to balance the bicycle nor use their upper body to respond to front wheel impact.
the bicycles were free to move on the front shock absorbers, which compress when a rear wheel impact occurs because the centre of gravity of the cyclist is between the two wheels and the inertial reaction force is therefore shared between the wheels. a resisting force was applied during the no bump tests by friction between the roller and a web strap passed over the roller and loaded by weights. the load on the web strap was set so that the time for the bicycle wheel and roller to come to a natural stop was an average of the times for the ht and su bicycles to come to a natural stop with bumps fitted to the roller and no web strap.

fig. 1: the test rig arrangement

the bicycles

two bicycles typical of those available on the market were provided on loan for the tests. the two bicycles were essentially the same apart from one having a swing arm rear suspension system (marin mt. vision®) and one having a rigid frame (marin rocky ridge®). the same front shock absorbers (manitou magnum® r) were attached to both bicycles, with the setting for pre-load and damping kept constant throughout the tests. the same rear wheel was used in all the tests, and the tyre pressure was kept constant at 3.4 bar.

the frame (fig. 2)

the frame consists of a modular welded construction to allow easy transportation, and provides a foundation to locate the roller axle and the front bracket, the position of which can be adjusted to suit bicycles of different wheelbase so that the rear axle of the bicycle remains vertically over the axle of the roller. the frame also provides some support for the platform.

fig. 2: the frame

the roller (fig. 3)

the size and mass of the roller were chosen to create an inertial effect equivalent to that of the cyclist and the bicycle. the roller is constructed out of 0.61 m diameter, 9 mm thick rolled and welded mild steel pipe. eight tie rods are used to clamp a 24 inch bicycle wheel at each end of the roller between metal rings welded inside each end of the pipe and a second free metal ring at each end of the pipe. the two wheels thus provide the axle around which the roller rotates. the maximum run out error with this arrangement was measured at 3 mm, giving slight undulations during the bump free tests. the total inertia of the roller assembly is 8.21 kg·m².

fig. 3: the roller

front bracket (fig. 4)

the front bracket, which is fabricated from mild steel plate and square tube, holds the front wheel axle of the bicycle stationary. it also locates with the bicycle brake stanchions to provide additional transverse stiffness to ensure that the front suspension forks of the bicycle remain vertical in the transverse plane while allowing the bicycle to rotate around the front wheel axle. this arrangement allows the suspension system at the front of the bicycle to operate normally. the axle itself is mounted on a small frame that is attached to the bracket in a way that allows the horizontal reaction force at the axle to be separated from the vertical force. the frame is pivoted relative to the front bracket immediately below the front wheel axle so that vertical forces pass directly through the pivot. horizontal forces tend to cause the frame to rotate about the pivot, but this rotation is restrained by a thin cantilever arm. a strain gauge fitted to this cantilever arm is calibrated to measure horizontal force.

fig. 4: the front bracket
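the equivalence between the roller inertia and the simulated mass can be checked directly: a mass m moving at the rim speed of a roller of radius r stores the same kinetic energy as a roller with a moment of inertia m·r². a quick plausibility check against the figures quoted above, under the simplifying assumption that rider plus bicycle are lumped as one translating mass (the small surplus in the stated 8.21 kg·m² may reflect rotating parts of the bicycle itself, which the paper does not break down):

```python
def equivalent_roller_inertia(rider_kg: float, bike_kg: float,
                              roller_diameter_m: float) -> float:
    """moment of inertia (kg*m^2) a roller needs so that decelerating it
    absorbs the same kinetic energy as decelerating bike + rider on a road.

    matching translational energy 0.5*m*v^2 with rotational energy
    0.5*i*omega^2 under the rolling condition v = omega * r gives i = m*r^2.
    """
    r = roller_diameter_m / 2.0
    return (rider_kg + bike_kg) * r ** 2

print(f"{equivalent_roller_inertia(74, 12, 0.61):.2f} kg*m^2")  # ~8.00 vs. 8.21 stated
```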
instrumentation

the saddle and handlebar accelerations were measured using linear accelerometers fitted to the seat post just under the saddle and on the top of the steerer tube of the front shock absorber. velocity and displacements at these points were obtained by integration of the acceleration signals. the front bracket horizontal force was measured by a strain gauge, as described above. this force provides an estimate of the force between the rear tyre and the roller surface, although it is also affected by inertial loads from the movements of the cyclist.

the component of the pedal force applied perpendicular to one crank arm was calculated from the measured torque in the crank arm. the spline locking the spider and chain ring to the right crank arm and the bottom bracket axle was machined off and replaced with a bushing to allow free relative rotation. a slot was then machined into the right crank arm so that a cantilever beam could be inserted. the end of this cantilever beam was bolted to the chain rings, thus fixing the chain rings to the crank arm. all torque generated by the pedals was thus transmitted to the chain rings through the cantilever beam, which was strain gauged.

the rotational velocity of the crank set was measured using an optical sensor with a disc attached to the chain wheel spider in place of the 22-tooth chain ring. electronics mounted on a circuit board fitted on the base of the bottom bracket were used to convert the output frequency to a voltage proportional to the tangential velocity of the pedals. the voltage was amplified by a factor of 2000 to ensure a high signal to noise ratio when transmitted via the slip rings. the front derailleur of the bicycles was locked so that only the 32-tooth chain ring could be used. this prevented any damage to the instrumentation by accidental shifting of the gears. a similar optical sensor and disc were used to measure the roller velocity. a position sensor generated a signal showing when the crank passed the top position.

physiological measurements

a polar favor® heart rate monitor (polar heart rate monitor, kempele, finland) was used to continuously monitor the heart rate of the subject, while one minute samples of expired air were collected using a two-way hans rudolph® 2770 mouthpiece, tubing and douglas® bags. the expired air was analysed using a servomex 570a® o2 analyser (servomex, crowborough, uk) and a pk morgan® td 801a co2 analyser (morgan, rainham, uk). both analysers were calibrated before testing with gases of known concentrations. gas volumes were measured using a parkinson cowan® (cranlea, birmingham, uk) meter calibrated against a tissot® spirometer (collins, massachusetts, usa). standard formulae [6] were used to calculate o2 consumption and co2 production.
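the velocities and displacements used in the results below are derived from the accelerometer signals described above. the paper does not state which integration or drift-correction scheme was used; a minimal sketch of one common approach, cumulative trapezoidal integration with mean removal to limit drift, on a synthetic signal:

```python
import numpy as np

def integrate_twice(acc: np.ndarray, fs: float):
    """integrate an acceleration trace (m/s^2, sampled at fs hz) to
    velocity (m/s) and displacement (m) using the trapezoidal rule.

    the mean is removed before each integration to limit the
    low-frequency drift that double integration amplifies; a proper
    high-pass filter would be the more careful choice for real signals.
    """
    dt = 1.0 / fs
    acc = acc - np.mean(acc)
    vel = np.concatenate(([0.0], np.cumsum(0.5 * (acc[1:] + acc[:-1]) * dt)))
    vel -= np.mean(vel)
    dis = np.concatenate(([0.0], np.cumsum(0.5 * (vel[1:] + vel[:-1]) * dt)))
    return vel, dis

# assumed example: a synthetic 4 hz seat vibration sampled at 1 khz
fs = 1000.0
t = np.arange(0.0, 2.0, 1.0 / fs)
acc = 10.0 * np.sin(2.0 * np.pi * 4.0 * t)   # m/s^2
vel, dis = integrate_twice(acc, fs)
print(f"seat displacement range: {1000.0 * (dis.max() - dis.min()):.1f} mm")
```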
3 experiments

eight male subjects participated in the tests. they were aged between 19 and 27 years and were all active in either cycling or some other physical sport. they all signed a consent form, and the study was approved by the local ethics committee. each subject was tested on both the su and the ht bicycles, with and without bumps on the roller. the first tests were conducted with bumps on the roller but without any additional braking effect. the tests were then repeated without bumps but with a brake applied to the roller and adjusted so that the time to free roll to rest was the same with and without bumps. this ensured that the workload and heart rate of the subjects during the bump tests and the no bump tests were of the same order. the tests with each subject were conducted at the same time of day. the order in which the two bicycles were ridden was randomly assigned to the subjects. the saddle height was set so that when the pedal was at its lowest position the subject's leg was straight with the heel on the pedal.

the first test included a familiarization session during which the subject was instructed to cycle at a speed between 10 and 15 km/h that could be maintained comfortably for ten minutes. the subjects were asked to maintain this speed to within 0.5 km/h during all their tests. to allow for different riding styles, the subjects were permitted to ride in the rear gear of their choice, but then had to use the same gear for all the tests. all the subjects wore a nose clip. the subjects were instructed to remain motionless and not apply any load to the pedals for the first 10 seconds of the test while zero load readings were recorded. they then had the remainder of the first minute to attain their chosen test speed at the start of each test. the test proper started at the end of the first minute and continued for a further ten minutes while readings and samples were taken. the subjects were then instructed to allow the bicycle to come to a halt, to remove all load from the pedals and to remain motionless for a further 10 seconds while the zero readings were checked. each subject was instructed to adopt a passive riding style during the tests by remaining seated on the bicycle at all times and not consciously transferring their weight. this was to ensure that the effects of the suspension were measured, and not the ability of the rider to use body movements to minimise the effects of the bumps.

measurement

velocity, acceleration and load measurements were recorded continuously. heart rate was recorded 45 seconds into each minute of the test. it was found that the readings remained quite steady during the last five minutes of the test, and the mean of the last two recordings was taken as an indicative value. one minute samples of expired air were taken in the ninth and tenth minutes of the test. verbal feedback was also obtained from the subjects during the tests to indicate perceived exertion and comfort.

results

the results of the experiments are reported in detail by titlestad [1], and samples of these are given here in bar graph form to demonstrate the main outcomes. the measurements taken for each subject are indicated by an individual group of bars, given in the order from left to right of ht and su bicycles over bumps and then ht and su on the smooth roller with additional resistance provided by the web brake. figs. 5 and 6 show the physiological measurements of oxygen consumption and heart rate, while figs. 7 and 8 show the dynamic measurements.

fig. 5: oxygen consumption
fig. 7 shows the displacement of the seat post, derived by double integration of the acceleration data and given as the range of movement in millimetres from its highest position to its lowest position as the bicycle rides over a bump and then lands back on the roller surface. fig. 8 shows the power being transmitted by the rider through the pedal crank arms. for each of the variables shown in the figures, the objective is to achieve lower values that will provide an advantage for the rider. thus, it can be seen that, for the current test conditions, the su bicycle performs better over the bumps than the ht bicycle. the physiological results show that when riding the su bicycle the subjects consume less oxygen and their heart rates are slower. the dynamic results confirm the physiological results, with the su bicycle rider having to transmit less power through the crank arms and thence to the rear wheel when riding over bumps. fig. 7 also shows that there is considerable attenuation of the displacement of the seat, and this will improve rider comfort.

for the tests using the smooth roller (i.e., without bumps), additional loading was applied using the web strap wrapped around the roller. this resulted in a total resistance that was higher than that experienced by the su bicycle in the presence of bumps but lower than that experienced by the ht bicycle (note that because the ht bicycle is stiffer, it experiences higher forces on impact with the bumps than the su bicycle and therefore has a larger overall resistance). the results for the ht and su bicycles on the smooth roller are similar to each other, with a tendency for slightly higher physiological readings from the su bicycle. this is as expected, since the suspension system reacts slightly to the varying crank arm loading due to pedalling cadence and a little energy is dissipated through the damping action of the suspension system. the seat displacement shown in fig. 7 for the smooth roller arises primarily because of the run out error in the cylindricity of the roller, which is about 3 mm. marginally higher displacements occur with the su bicycle. the power transmitted through the crank is generally, but not always, lower for the su bicycle. it is not clear why this should be so, but it may be that some subjects rock forward and back a little, so that inertia effects come into play, and there may sometimes be a back pressure applied on the second crank, which would result in an overestimate of the transmitted torque.

fig. 6: heart rate
fig. 7: seat displacement range
fig. 8: power transmitted through crank
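the power plotted in fig. 8 is, in essence, the product of the measured crank torque and the crank angular velocity obtained from the optical cadence sensor. a minimal sketch of that reduction; the constant example values are illustrative, not data from the tests:

```python
import numpy as np

def crank_power(torque_nm: np.ndarray, cadence_rpm: np.ndarray) -> np.ndarray:
    """instantaneous power (w) transmitted through the cranks:
    p = torque * angular velocity."""
    omega = cadence_rpm * 2.0 * np.pi / 60.0   # rad/s
    return torque_nm * omega

# assumed example: 20 n*m mean torque at 75 rpm -> about 157 w
torque = np.full(10, 20.0)
cadence = np.full(10, 75.0)
print(f"{crank_power(torque, cadence).mean():.0f} w")
```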
4 conclusions

a test rig has been constructed that allows laboratory tests to investigate the effect of the impact of bicycles with bumps under controlled conditions with a minimal number of variables. a number of such tests have been completed to compare the performance of ht and su bicycles. the test conditions represent a severely bumpy track with a high frequency of encounter with the bumps. under these conditions, the results show a clear physiological and dynamic advantage for the su bicycle. there is around a 30 % reduction in the consumption of oxygen, and the heart rate is reduced by between 20 to 50 beats per minute. the reduction of oxygen consumption indicates a reduced expenditure of physiological energy, and this correlates with the mechanical measurements that show a reduction of between 30 % and 60 % in the power transmitted through the crank arms, which provides the driving force to rotate the roller. the suspension system also provided about a 50 % reduction in the vertical displacement of the seat as the bicycle goes over a bump, which, as well as providing a more comfortable ride, also indicates that less energy is transferred from the roller to the bicycle during an impact. the objective to establish a test configuration that would produce clear cut results that are not counter intuitive, and where the physiological findings are reinforced by the mechanical measurements, has been achieved. a basis has therefore been established for comparison with future experiments. these should be conducted under less constrained and less severe conditions, with the aim of understanding better the effect of riding style and track conditions on performance, and to provide data that can help optimise bicycle configurations and settings to achieve peak performance on designated types of terrain.

acknowledgements

our thanks go to jon whyte of whyte bicycles for the loan of the bicycles used in the tests. thank you also to the department of mechanical engineering and the institute of biomedical & life sciences at the university of glasgow for providing resources and technician support.

references

[1] titlestad, j.: mountain bicycle rear suspension dynamics and their effect on cyclists. thesis, the university of glasgow, 2001.
[2] seifert, j., luetkemeier, m., spencer, m., miller, d., burke, e.: the effect of mountain bike suspension systems on the energy expenditure, physical exertion, and time trial performance during mountain bicycling. international journal of sports medicine, vol. 18, 1997, p. 197–200.
[3] wang, e. l., hull, m. l.: a model for determining rider induced energy losses in bicycle suspension systems. vehicle system dynamics, vol. 25, 1996, p. 223–246.
[4] berry, m. j., woodard, c. m., dunn, c. j., edwards, d. g., pittman, c. l.: the effects of a mountain biking suspension system on metabolic energy expenditure. cycling science, 1993, p. 8–14.
[5] macrae, h. s-h., hise, k. j., allen, p. j.: effects of front and dual suspension mountain bike systems on uphill cycling performance. medicine and science in sports and exercise, vol. 32, 2000, p. 1276–1280.
[6] mcardle, w., katch, f., katch, v.: exercise physiology, energy, nutrition, and human performance. 4th edition, london: williams & wilkins, 1996.

figs. 1–4 are reproduced from reference [1]; copyright j. titlestad. this article includes words that are believed to be, or asserted to be, proprietary terms or trademarks. the authors recognise their status as such, and no other judgement is implied concerning their legal status.

john titlestad
dr. tony fairlie-clarke
phone: +441 413 304 327, e-mail: tonyfc@mech.gla.ac.uk
mark davie
dr. arthur whittaker
department of mechanical engineering, the university of glasgow, glasgow g12 8qq, united kingdom

dr. stan grant
phone: +441 413 306 490, e-mail: s.granty@bio.gla.ac.uk
institute of biomedical & life sciences, the university of glasgow, glasgow g12 8qq, united kingdom
acta polytechnica 59(6):593–600, 2019
doi:10.14311/ap.2019.59.0593
© czech technical university in prague, 2019, available online at https://ojs.cvut.cz/ojs/index.php/ap

optimization of computational burden of the point information gain

jan urban∗, renata rychtáriková, petr macháček, dalibor štys, pavla urbanová, petr císař

university of south bohemia in české budějovice, faculty of fisheries and protection of waters, south bohemian research center of aquaculture and biodiversity of hydrocenoses, kompetenzzentrum mechanobiologie in regenerativer medizin, institute of complex systems, zámek 136, 373 33 nové hrady, czech republic
∗ corresponding author: urbanj@frov.jcu.cz

abstract. we developed a method of image preprocessing based on the information entropy, namely, on the information contribution made by each individual pixel to the whole image or to an image's part (i.e., a point information gain; pig). the idea of the pig calculation is that an image background remains informatively poor, whereas objects carry relevant information. in one calculation, this method preserves details, highlights edges, and decreases random noise. this paper describes the optimization and implementation of the pig calculation on graphical processing units (gpu) to overcome a high computational burden.

keywords: pixel information contribution, entropy, image analysis, computation optimization.

1. introduction

one of the tasks in image analysis is an accurate segmentation of entities from their background. the correct finding of objects depends on plenty of factors, such as the kind of illumination, shadows, level of noise, proper focusing, overlaps of objects, and dissimilarity between an object and its background. the traditional methods, e.g., thresholding or edge detectors, are generally based on local or global characteristics of intensity histograms [1–4]. the final result of a chosen segmentation algorithm is more or less conditioned by the sufficiency of the preprocessing computations, which include color space transformations, denoising, or other adjustments [5–7]. the segmentation process is of a high quality if the borders of the objects of interest are sharp, clearly visible, and separated from the background. a promising way to automate the whole process of the selection of a sub-image of proper parameters is the usage of the equation of information entropy, which was defined by shannon and generalized by rényi [8]. although the formula for the information entropy has been well known for a long time, other ways how to use it for one- or two-dimensional thresholding and filtering, e.g., [1–4, 9–11], still appear. the entropy also generally measures an information content. an important question is how much information is included in a data point and how the points can be distinguished from each other. this question was solved by the derivation of the variable point information gain (pig, γ) [12–22], which evaluates the information content of a single pixel in a local and global context. this variable says how important one individual pixel is for the understanding of an image or of an image's part. in addition, the method of the pig preserves the details, highlights the edges, and decreases random noise in one calculation.
examples of the usage of the pig in image (pre)processing were published previously [12–22]. the computation tool for image preprocessing using the pig method is called the image info extractor professional software (iiep; institute of complex systems, nové hrady). the theoretical concept and practical utilization of this kind of software were introduced previously, and this paper is supplemental to [17]. the task of this paper is, using the examples of the shannonian pig with the whole-image approach and with the cross neighbourhood for each pixel, to present a kind of computational optimization and the implementation of this computation on gpus in order to achieve a higher computation performance of the iiep software.

2. the shannon entropy

the shannon entropy is a special case of the rényi entropy for α = 1. any discrete probability distribution \(P = \{p_0, p_1, \ldots, p_k\}\) fulfils the condition

\[ p_j \geq 0; \qquad \sum_{j=1}^{k} p_j = 1. \qquad (1) \]

in an intensity image, the approximation of the probability distribution is given by the histogram function that shows the counts of the pixels φ(x,y) with intensity j at the position (h,w) in the image [3, 4]. more conditions are assumed when measuring the information. the information must be additive for two independent events with the probabilities of occurrence \(p_1\) and \(p_2\):

\[ I(p_{1,2}) = I(p_1) + I(p_2). \qquad (2) \]

the information itself is dependent only on the probability distribution or, as in our case, on the normalized histogram function. eq. (2) describes a modified cauchy functional equation with the unique solution \(I(p_1) = -K \log_2(p_1)\). in statistical thermodynamics, the constant K refers to the boltzmann constant [23]; in the hartley information, K = 1 [24]. if different amounts of information occur with different probabilities, the total amount of information corresponds to the average of the individual information contributions weighted by the probabilities of their individual occurrences, \(\sum_j p_j I_j\) [24–26]. this leads to the definition of the shannon information entropy as

\[ \mathcal{H}(P) = -\sum_j p_j \log_2(p_j). \qquad (3) \]

the image histogram N is normalized to the total amount of pixels [27, 28] to fulfil the condition (1):

\[ N = \Big[ \sum_{h,w} n_0(h,w),\; \sum_{h,w} n_1(h,w),\; \ldots,\; \sum_{h,w} n_k(h,w) \Big]; \qquad n_j(h,w) = \begin{cases} 1 & \text{for } \phi(h,w) = j \\ 0 & \text{for } \phi(h,w) \neq j; \end{cases} \qquad (4) \]

\[ p_j(h,w) = \frac{n_j(h,w)}{WH}; \qquad P = \Big[ \sum_{h,w} p_0(h,w),\; \sum_{h,w} p_1(h,w),\; \ldots,\; \sum_{h,w} p_k(h,w) \Big], \]

where WH is the total amount of pixels in an image of the size \([H \times W]\). then, the normalized histogram function P is used to define the information in the form of the shannon entropy as

\[ \mathcal{H}(P) = -\sum_j p_j \log_2 p_j. \qquad (5) \]

for the simplification of the next deduction, the binary logarithm \(\log_2\), which expresses the information in bits, is replaced by the natural logarithm ln.
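a minimal numeric sketch of eqs. (4)–(5): building the normalized intensity histogram of an image and evaluating its shannon entropy (numpy illustration with our own naming, using the natural logarithm introduced above; this is not the iiep implementation):

```python
import numpy as np

def shannon_entropy(image: np.ndarray, bit_depth: int = 8) -> float:
    """shannon entropy (in nats, eq. (5) with ln) of an intensity image.

    the histogram over all 2**bit_depth intensity levels is normalized
    by the pixel count w*h; empty bins contribute nothing to the sum.
    """
    counts = np.bincount(image.ravel(), minlength=2 ** bit_depth)
    p = counts / image.size
    nz = p > 0                       # convention: 0 * ln(0) = 0
    return float(-np.sum(p[nz] * np.log(p[nz])))

# example: a flat background is informatively poor, noise is not
rng = np.random.default_rng(0)
flat = np.full((64, 64), 128, dtype=np.uint8)
noisy = rng.integers(0, 256, (64, 64), dtype=np.uint8)
print(shannon_entropy(flat), shannon_entropy(noisy))   # 0.0 vs. ~5.5
```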
3. point information gain

in the form of the variable pig (\(\gamma^{(i)}\)), the shannon entropy allows measuring the information content of either the whole image or a selected part of the image. the key idea behind the definition of the pig is whether the occurrence of the intensity of a single pixel is a surprise. one can predict that, on the one hand, the background pixels will not carry much information and, on the other hand, the pixels of structurally complicated objects will increase the entropy at their positions after discarding one of them.

in order to investigate the contribution of one single pixel with the intensity value i to the total entropy, it is necessary to introduce a histogram \(N^{(i)}\), which is created without this investigated pixel:

\[ n^{(i)}_j = \begin{cases} n_j & \text{for } j \neq i \\ n_j - 1 & \text{for } j = i. \end{cases} \]

the intensity value i of the investigated pixel φ(h,w) = φ(x,y) is now discarded from the computation, but only once; one single pixel of intensity i will only decrease the histogram value \(n^{(i)}_i\) at its intensity position i. then, the histogram is normalized according to condition (1). the probability (the value of the normalized histogram) \(p^{(i)}_i\) of the intensity i is slightly lower than the probability \(p_i\) of the primary normalized histogram P (with all pixels). the other probabilities \(p^{(i)}_j\), where j is not the value of the investigated pixel i, are slightly higher than the probabilities \(p_j\) of the primary normalized histogram P (with all pixels). then, the second shannon information entropy \(\mathcal{H}(P^{(i)})\) without the pixel of intensity i at the position (x,y) is computed from the modified normalized histogram \(P^{(i)}\) as

\[ \mathcal{H}(P^{(i)}) = -\sum_j p^{(i)}_j \ln(p^{(i)}_j), \qquad (6) \]

where the individual information contributions \(\ln(p^{(i)}_j)\) as well as their weights \(p^{(i)}_j\) slightly differ from those in the shannon information entropy \(\mathcal{H}(P)\) (eq. (5)). thus, using eqs. (5)–(6), we obtained two different values of the information entropy: the entropy \(\mathcal{H}(P)\) represents the total information in the whole original image, whereas the entropy \(\mathcal{H}(P^{(i)})\) represents the information in the image without the investigated pixel. the difference

\[ \gamma^{(i)} = \mathcal{H}(P) - \mathcal{H}(P^{(i)}) = \mathcal{H} - \mathcal{H}^{(i)} \qquad (7) \]

refers then to the difference between the entropies of the two histograms and thus also to the difference between the entropies of two images – with and without the investigated pixel φ(x,y) of the intensity i. recall that the histograms P and \(P^{(i)}\) were normalized and, therefore, the difference \(\gamma^{(i)}\) is usually a small number. (in the iiep software, the point information gain is defined with the opposite sign, as \(\gamma^{(i)} = \mathcal{H}^{(i)} - \mathcal{H}\).) the difference \(\gamma^{(i)}\) represents either the entropy contribution of the pixel φ(x,y) or the contribution of the intensity value of the pixel φ(x,y) to the information content of the image. in other words, the transformation of the intensity value of the pixel φ(x,y) to its contribution to the image via eq. (7) represents the information contribution carried by this pixel. the computation of the \(\gamma^{(i)}\) for each single pixel transforms an original image into an entropy map, i.e., into an image that shows the contribution of each pixel to the total information content of the image. the variable \(\gamma^{(i)}\) is dependent only on the intensity of the pixel φ(x,y) and does not carry any information about the pixel's position. however, the histogram can include pixels not only from the whole image but also from a selected area around an investigated pixel. if the image has a semi-uniform background with changes in intensity, and the background is completely different in different parts of the image, it is advantageous to use the row and the column with the investigated pixel in the centre as the area for the computation of the intensity histogram

\[ N = N(x) + N(y) - E_i, \qquad (8) \]

where the value of the pixel φ(x,y) is counted only once. the value of the second pixel φ(x,y) is subtracted using the unit histogram \(E_i = [0, 0, \ldots, 1, \ldots, 0, 0]\) with 1 at the position relevant to \(n_j = n_i > 0\).
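a small sketch of how the cross histogram of eq. (8), and the additional histogram with the centre pixel discarded that eq. (9) introduces just below, can be assembled from row and column histograms (illustrative numpy code; the function and variable names are ours, not the iiep api):

```python
import numpy as np

def cross_histograms(image: np.ndarray, x: int, y: int, bit_depth: int = 8):
    """row+column ("cross") histograms around pixel (x, y).

    returns n  = n(x) + n(y) - e_i   (centre pixel counted once, eq. (8))
    and     ni = n(x) + n(y) - 2e_i  (centre pixel discarded,    eq. (9)).
    """
    levels = 2 ** bit_depth
    n_row = np.bincount(image[x, :], minlength=levels)   # histogram of row x
    n_col = np.bincount(image[:, y], minlength=levels)   # histogram of column y
    e_i = np.zeros(levels, dtype=np.int64)
    e_i[image[x, y]] = 1
    n = n_row + n_col - e_i          # h + w - 1 samples in total
    ni = n_row + n_col - 2 * e_i     # h + w - 2 samples in total
    return n, ni

rng = np.random.default_rng(1)
img = rng.integers(0, 256, (32, 48), dtype=np.uint8)
n, ni = cross_histograms(img, 10, 20)
print(n.sum(), ni.sum())   # 32 + 48 - 1 = 79 and 78
```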
the outputs consist of the amount of information for the cross centred on the investigated pixel itself. the additional histogram is again a histogram of the values from the cross; however, in this case, the intensity value of the centre pixel φ(x,y) is discarded:

\[ N^{(i)} = N(x) + N(y) - 2E_i. \qquad (9) \]

the subsequent application of eq. (5) results in the entropies \(\mathcal{H}_{(x,y)}(P) = \mathcal{H}_{(x,y)}\) and \(\mathcal{H}_{(x,y)}(P^{(i)}) = \mathcal{H}^{(i)}_{(x,y)}\) with and without the centre pixel, respectively. the difference \(\gamma^{(i)}_{(x,y)} = \mathcal{H}_{(x,y)} - \mathcal{H}^{(i)}_{(x,y)}\) refers to the difference between the information contents of these two crosses. recall again that the difference \(\gamma^{(i)}_{(x,y)}\) is a small number and represents either the entropy contribution of the pixel φ(x,y) or the contribution of the value of the pixel φ(x,y) to the cross. both (whole and cross) pig can be considered as special cases of the kullback-leibler divergence [29] of two distributions: one distribution is formed from the neighbourhood of the investigated pixel including this pixel, while the second distribution is formed from the same neighbourhood but without this pixel. the whole approach considers the whole image as the neighbourhood, while the cross approach considers the designed cross with the investigated pixel in the centre. thus, the novelty of the pig approach lies in the practical definition of the investigated normalized distributions.

4. optimization of the algorithm

4.1. point information gain for the whole image

the \(\gamma^{(i)}\) values for the whole image do not need to be computed for each pixel repeatedly: the value of the total entropy \(\mathcal{H}\) remains the same through the whole image, and the value of the entropy \(\mathcal{H}^{(i)}\) is the same for all pixels φ(x,y) with the same intensity j = i. the algorithm was optimized by the modification of the relevant equations as follows. the histogram \(N = [n_0, n_1, \ldots, n_{2^d-1}]\) for the whole image was redefined to

\[ n_j = \big\| \{ \forall (x,y): \phi(x,y) = j \wedge x \in \{1, 2, \ldots, H\} \wedge y \in \{1, 2, \ldots, W\} \} \big\|; \qquad j = [0, 1, \ldots, i, \ldots, 2^d - 1]. \]

the entropy \(\mathcal{H}\) can subsequently be expressed as

\[ \mathcal{H} = -\sum_{j=0}^{2^d-1} p_j \ln(p_j) = -\sum_{j=0}^{2^d-1} \frac{n_j}{WH} \ln\!\Big(\frac{n_j}{WH}\Big) = \frac{\ln(WH)}{WH} \underbrace{\sum_{j=0}^{2^d-1} n_j}_{WH} - \frac{1}{WH} \sum_{j=0}^{2^d-1} n_j \ln(n_j) = \ln(WH) - \frac{1}{WH} \sum_{j=0}^{2^d-1} n_j \ln(n_j). \qquad (10) \]

the function \(\mathcal{H}^{(i)}\), which expresses the entropy of the modified image with the omitted investigated pixel φ(x,y), was reformulated analogously, using \(\sum_j n^{(i)}_j = WH - 1\), as

\[ \mathcal{H}^{(i)} = -\sum_{j=0}^{2^d-1} p^{(i)}_j \ln(p^{(i)}_j) = \ln(WH - 1) - \frac{1}{WH - 1} \sum_{j=0}^{2^d-1} n^{(i)}_j \ln(n^{(i)}_j). \qquad (11) \]

the histogram \(N^{(i)}\) is then defined as

\[ n^{(i)}_j = \begin{cases} n_j & j \neq i \\ n_j - 1 & j = i, \end{cases} \qquad j = [0, 1, \ldots, 2^d - 1]. \]

the first optimization follows from the correspondence

\[ N^{(i)} \Leftrightarrow \mathcal{H}^{(i)} \Leftrightarrow \phi(x,y) = i; \qquad x \in \{1, 2, \ldots, H\}; \; y \in \{1, 2, \ldots, W\}; \; i \in \{0, 1, \ldots, 2^d - 1\}. \]

figure 1. image of the size of \([H \times W]\) with the intensity levels of d bits.

from this optimization, as well as from eq. (7), it follows directly that the entropy \(\mathcal{H}^{(i)}\) is dependent only on the intensity i of the pixel φ(x,y) and not on the pixel's position (x,y) in the image. then, the entropy \(\mathcal{H}^{(i)}\) can be redefined as

\[ \mathcal{H}^{(i)} = \big( \mathcal{H}^{(i)} \,\big|\, \phi(x,y) = i \big); \qquad i = [0, 1, \ldots, 2^d - 1]. \]
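a numeric sketch of this first optimization (our code and naming, not the iiep implementation): the leave-one-out entropy of eq. (11) is evaluated once per occupied intensity level, at most 2^d times in total, and the per-pixel entropy map is then a pure lookup by intensity:

```python
import numpy as np

def entropy_from_counts(counts: np.ndarray) -> float:
    """eqs. (10)/(11): ln(total) - sum(n_j ln n_j) / total, in nats."""
    total = counts.sum()
    nz = counts > 0
    return float(np.log(total) - np.sum(counts[nz] * np.log(counts[nz])) / total)

rng = np.random.default_rng(2)
img = rng.integers(0, 16, (60, 80))            # 4-bit example image
counts = np.bincount(img.ravel(), minlength=16)

# h^(i) per intensity level: once per occupied level, not once per pixel
h_i = np.zeros(16)
for i in np.flatnonzero(counts):
    c = counts.copy()
    c[i] -= 1                                  # discard one pixel of intensity i
    h_i[i] = entropy_from_counts(c)

entropy_map = h_i[img]                         # per-pixel map by lookup
print(entropy_map.shape)                       # (60, 80) from <= 16 evaluations
```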
eventually, the pig for the whole image is expressed as

\[ \gamma^{(i)} = \mathcal{H} - \mathcal{H}^{(i)} = \ln(WH) - \frac{\sum_{j=0}^{2^d-1} n_j \ln(n_j)}{WH} - \ln(WH - 1) + \frac{\sum_{j=0}^{2^d-1} n^{(i)}_j \ln(n^{(i)}_j)}{WH - 1}. \]

separating the term with j = i (the only term in which the two histograms differ) and collecting the remaining, common sums yields

\[ \gamma^{(i)} = \ln\!\Big(\frac{WH}{WH - 1}\Big) + \frac{\sum_{j=0}^{2^d-1} n_j \ln(n_j)}{WH(WH - 1)} + \frac{1}{WH - 1} \ln \frac{(n_i - 1)^{(n_i - 1)}}{n_i^{\,n_i}}. \qquad (12) \]

from eq. (12) it follows that the \(\gamma^{(i)}\) can be evaluated as

\[ \gamma^{(i)} = B + C \ln \frac{(n_i - 1)^{(n_i - 1)}}{n_i^{\,n_i}}, \qquad (13) \]

where the constant \(C = \frac{1}{WH - 1}\) and the base \(B = \ln\big(\frac{WH}{WH - 1}\big) + \frac{1}{WH(WH - 1)} \sum_{j=0}^{2^d-1} n_j \ln(n_j)\).

figure 2. image of the size of \([H \times W]\) with the intensity levels of d bits, with the investigated cross.

4.2. point information gain for the cross computation

a column histogram N(y) is defined as

\[ N(y) = [n_0(y), n_1(y), \ldots, n_{(2^d-1)}(y)]; \qquad y = [1, 2, \ldots, W]; \quad j = [0, 1, \ldots, i, \ldots, 2^d - 1]. \]

similarly, a row histogram N(x) is defined as

\[ N(x) = [n_0(x), n_1(x), \ldots, n_{(2^d-1)}(x)]; \qquad x = [1, 2, \ldots, H]; \quad j = [0, 1, \ldots, i, \ldots, 2^d - 1]. \]

the elements of the unit histogram \(E_i\) are (cf. eqs. (8)–(9))

\[ e_j = \begin{cases} 0 & j \neq i = \phi(x,y) \\ 1 & j = i = \phi(x,y), \end{cases} \qquad j = [0, 1, \ldots, i, \ldots, 2^d - 1]. \]
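before moving on to the cross variant, a sketch of the closed form (13): the base b and the constant c are computed once from the whole-image histogram, after which the pig of every pixel is a table lookup by its intensity (numpy illustration with our own naming):

```python
import numpy as np

def pig_whole_image(image: np.ndarray, bit_depth: int = 8) -> np.ndarray:
    """entropy map gamma^(i) of eq. (13), one value per intensity level."""
    n = np.bincount(image.ravel(), minlength=2 ** bit_depth).astype(float)
    wh = image.size
    nz = n > 0
    base = np.log(wh / (wh - 1)) + np.sum(n[nz] * np.log(n[nz])) / (wh * (wh - 1))
    c = 1.0 / (wh - 1)

    # ln((n_i - 1)^(n_i - 1) / n_i^n_i) = (n_i - 1) ln(n_i - 1) - n_i ln(n_i)
    gamma = np.zeros_like(n)
    for i in np.flatnonzero(n):
        ni = n[i]
        t = (ni - 1.0) * np.log(ni - 1.0) if ni > 1 else 0.0   # 0 * ln 0 = 0
        gamma[i] = base + c * (t - ni * np.log(ni))
    return gamma[image]           # per-pixel map by lookup

rng = np.random.default_rng(3)
img = rng.integers(0, 256, (128, 128), dtype=np.uint8)
print(pig_whole_image(img).shape)
```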
, (h + w − 1) ln(h + w − 1)]. then, eq. (16) can be written as γ(i)(x,y) = = ln ( h + w − 1 h + w − 2 ) + ∑i−1 j=0 t(nj(x)+nj(y)) (h + w − 1)(h + w − 2) (17) + ∑2d−1 j=i+1 t(nj(x)+nj(y)) (h + w − 1)(h + w − 2) − t(ni(x)+ni(y)−1) h + w − 1 + t(ni(x)+ni(y)−2) h + w − 2 . 597 j. urban, r. rychtáriková, p. macháček et al. acta polytechnica in this way, the time consuming computation of the logarithms was replaced by the summing of precomputed vectors k ln(k). the length of the vector t does not depend on the number of pixels in the image but on the dimensions (sizes of the sides) of the image. the number of rows and columns of the image determines how many points in the histogram created from the cross is available. it is, however, possible that some values in the vector t will never be used in the computation. nevertheless, due to the possibility of searching in the list of all stored k ln(k), the approach of the precomputed histograms makes the computation faster. eq. (17) implicates γ(i)(x,y) = c (x,y) + b(x,y) + f (x,y); c(x,y) = ln ( h + w − 1 h + w − 2 ) ; b(x,y) = ∑i−1 j=0 (nj(x) + nj(y)) ln(nj(x) + nj(y)) (h + w − 1)(h + w − 2) + ∑2d−1 j=i+1 (nj(x) + nj(y)) ln(nj(x) + nj(y)) (h + w − 1)(h + w − 2) ; (18) fi = (ni(x) + ni(y) − 2) ln(ni(x) + ni(y) − 2) h + w − 2 − (ni(x) + ni(y) − 1) ln(ni(x) + ni(y) − 1) h + w − 1 . in eq. 18 the term c(x,y) is a constant, the b(x,y) is a base of the computation, and the fi corresponds to a fluctuating part of the computation. 5. implementation on gpu the algorithm optimised above can be further split into a few threads when multi-core cpus can be fully utilized, however this is relevant mainly for batch processing of image sets. the computation process was further optimized by an implementation on graphics cards [30]. the algorithm was executed using the cuda architecture to run on nvidia hardware. the key to the next acceleration was to fit the algorithm to gpu highly-parallel architecture which is typical of a double hierarchy. all multiprocessors can access data to the device memory. in the cuda data-parallel programming, the data has to be split into two levels of algorithmically independent parts, into a grid of blocks where a kernel processes the block with the same algorithm. during the processing of the block, several threads run – the second level of the hierarchy. all threads of one block run per one multiprocessor. the iiep software tries to detect the graphics card with the cuda support and, if the card is available, the calculation is realised as the parallel computation on the graphics card kernels (gpu). depending on the image resolution and the type of the entropy calculation, the calculation is typically 50–150× faster on the gpu than on the cpu. the original concept of the software expected an 8-bpc (bits per channel) images (which could be optimized directly for the gpu memory block), while the current cameras are also able to provide raw file formats with ≥10 bpc (which requires a different gpu static-field allocation). the possibility to work with the raw data files deals with signal processing without debayerization [14]. 6. conclusion we investigated the degree of modality for a computation of the difference of two entropies: h , entropy with a central pixel φ(x,y), and h (i), entropy without such pixel. an output image then represents a map of pixels’ importance. 
6. conclusion

we investigated the degree of modality for a computation of the difference of two entropies: \(\mathcal{H}\), the entropy with a central pixel φ(x,y), and \(\mathcal{H}^{(i)}\), the entropy without such a pixel. an output image then represents a map of the pixels' importance. we simplified the calculation of the information contribution of a pixel to the whole shannonian information of the image and to the local information expressed for pixels lying on the shanks crossing the studied pixel φ(x,y). the pre-computation of a fixed set (for a given bit precision) of the possible logarithm values reduces the total number of necessary calculations in the algorithm and decreases the computational burden in the evaluation of the entropy difference. the calculation kernel was implemented for a hardware acceleration using a graphics card.

list of symbols

B – base in the computation of \(\gamma^{(i)}\)
\(B_{(x,y)}\) – base in the computation of \(\gamma^{(i)}_{(x,y)}\)
C – constant member in the computation of \(\gamma^{(i)}\)
\(C_{(x,y)}\) – constant member in the computation of \(\gamma^{(i)}_{(x,y)}\)
cuda – compute unified device architecture
d – image intensity bit depth
\(e_j\) – element of the unit vector \(E_i\)
\(E_i\) – unit vector [0, 0, ..., 1, ..., 0, 0], where 1 is at the position of intensity i
\(F_i\) – fluctuating member in the computation of \(\gamma^{(i)}_{(x,y)}\)
gpu – graphical processing unit
h – row position of the pixel of intensity j
H – image height (in pixels)
\(\mathcal{H}(P) = \mathcal{H}\) – shannon entropy for the probability histogram function of the whole image
\(\mathcal{H}_{(x,y)}\) – shannon entropy for the probability histogram function from the cross around an investigated pixel of the intensity i at the position (x,y)
\(\mathcal{H}(P^{(i)}) = \mathcal{H}^{(i)}\) – shannon entropy for the probability histogram function from the whole image without an investigated pixel of intensity i
\(\mathcal{H}^{(i)}_{(x,y)}\) – shannon entropy for the probability histogram function from the cross around, and without, an investigated pixel of the intensity i at the position (x,y)
i – intensity of an investigated pixel
I – information
iiep – image info extractor professional software
j – pixel intensity; j ∈ {0, 1, ..., i, ..., k}
K – weight of the entropic contribution; e.g., K = 1 in the hartley and shannon entropy; K = 1.38 × 10⁻²³ in the boltzmann entropy
\(n_j(h,w) = n_j\) – contribution of the pixel of intensity j at the position (h,w) to the histogram N
\(n^{(i)}_j\) – element in the histogram N, where a pixel of intensity i is discarded
\(n_j(x)\) – element of the intensity histogram of an image pixel row x
\(n_j(y)\) – element of the intensity histogram of an image pixel column y
N – image histogram function
\(N^{(i)}\) – intensity histogram without an investigated pixel of intensity i
N(x) – intensity histogram for the image pixel row x
N(y) – intensity histogram for the image pixel column y
\(p_i\) – probability of the occurrence of the investigated pixels of intensity i in the histogram P
\(p_j(h,w) = p_j\) – contribution of the pixel of intensity j at the position (h,w) to the probability histogram P
\(p^{(i)}_j\) – element in the probability histogram P, where the pixel of intensity i is discarded
P – discrete probability distribution (of the whole image)
\(P^{(i)}\) – probability intensity histogram without an investigated pixel of the intensity i
pig – point information gain; γ
t – mathematical substitution for a vector of weighted logarithms with the element \(t_{(n_j)} = n_j \ln n_j\)
\(t_{(n_j)}\) – element of the vector t of weighted logarithms
w – column position of the pixel of intensity j
W – image width (in pixels)
x – row position of the pixel of intensity i
y – column position of the pixel of intensity i
α – rényi dimensionless coefficient
\(\gamma^{(i)}\) – point information gain for a pixel of intensity i (from a whole-image histogram)
\(\gamma^{(i)}_{(x,y)}\) – point information gain for a pixel of intensity i from a histogram of the pixels forming a cross around the investigated pixel
φ(h,w) – pixel of intensity j at the position (h,w)
φ(x,y) – pixel of intensity i at the position (x,y)

acknowledgements

the authors are thankful to j. vaněk, t. levitner, a. zhyrova, and v. březina for several discussions. this work was supported by the ministry of education, youth and sports of the czech republic (projects cenakva (lm2018099) and the cenakva centre development (no. cz.1.05/2.1.00/19.0380)) and from the european regional development fund in frame of the project kompetenzzentrum mechanobiologie (atcz133) in the interreg v-a austria–czech republic programme. the work was further financed by the ta cr gama poc 03-24 rychtáriková sub-project.

references

[1] s. beucher. applications of mathematical morphology in material sciences: a review of recent developments. international metallography conference, colmar, france. proc mc95 pp. 41–46, 1995.
[2] n. otsu. a threshold selection method from gray-level histogram. ieee trans syst man cybern 9(1):62–66, 1979. doi:10.1109/tsmc.1979.4310076.
[3] m. sonka, v. hlavac, r. boyle. image processing, analysis and machine vision. brooks/cole publishing company, 1999.
[4] r. c. gonzales, r. e. woods. digital image processing. addison-wesley publishing co., reading, mass.-london-don mills, ont, 1992.
[5] j. yang, w. lu, a. waibel. skin-color modeling and adaptation. proc. accv 1998, vol. ii. lect notes comp sci 1352:687–694, 1998. doi:10.1007/3-540-63931-4_278.
[6] l. vincent. morphological grayscale reconstruction in image analysis: applications and efficient algorithms. ieee trans image process 2(2):176–201, 1993. doi:10.1109/83.217222.
[7] j. vaněk, j. urban, z. gardian. automated detection of photosystems ii in electron microscope photographs. technical computing prague, prague, czechia, pp. 102–105, 2006.
[8] a. rényi. on measures of information and entropy. proc 4th berkeley symp math stat prob 1:547–561, 1961.
[9] r. moddemeijer. on estimation of entropy and mutual information of continuous distributions. signal process 16(3):233–246, 1989. doi:10.1016/0165-1684(89)90132-1.
[10] t. pun. a new method for grey-level picture thresholding using the entropy of the histogram. signal process 2(3):223–237, 1980. doi:10.1016/0734-189x(85)90125-2.
[11] p. iliev, p. tzvetkov, g. petrov. multidimensional dynamic scene analysis using 3d image histogram and entropy sequences analysis (bulgarian). int j comp 6(1):35–43, 2007.
[12] a. zhyrova, d. stys, p. cisar. information entropy approach as a method of analysing belousov-zhabotinsky reaction wave formation. chem listy 107(suppl. 3):s341–s342, 2013.
[13] r. rychtáriková. clustering of multi-image sets using rényi information entropy. in f. ortuño, i. rojas (eds.), iwbbio 2016, lect notes bioinf, vol. 9656, pp. 517–526. 2016. doi:10.1007/978-3-319-31744-1_46.
[14] r. rychtáriková, t. náhlík, k. shi, et al. superresolved 3-d imaging of live cell's organelles from brightfield photon transmission micrographs. ultramicroscopy 179:1–14, 2017. doi:10.1016/j.ultramic.2017.03.018.
[15] a. zhyrova, d. štys. construction of the phenomenological model of belousov-zhabotinsky reaction state. int j comp math 91(1):4–13, 2014. doi:10.1080/00207160.2013.766332.
[16] d. štys, j. urban, r. rychtáriková, et al. measurement in biological systems from the self-organisation point of view. in f. ortuño, i. rojas (eds.), iwbbio 2015, lect notes bioinf, vol. 9044, pp. 431–443. 2015. doi:10.1007/978-3-319-16480-9_43.
[17] r. rychtáriková, j. korbel, p. macháček, et al. point information gain and multidimensional data analysis. entropy 18(10):372, 2016. doi:10.3390/e18100372.
[18] t. náhlík, j. urban, p. císař, et al. entropy based approximation to cell monolayer development. in a. jobbággy (ed.), 5th ifmbe, ifmbe proc., vol. 37, pp. 563–566. 2011. doi:10.1007/978-3-642-23508-5_146.
[19] t. náhlík, d. štys. microscope point spread function, focus and calculation of optimal microscope set-up. int j comp math 91(2):221–232, 2014. doi:10.1080/00207160.2013.851379.
[20] j. urban, j. vanek, d. stys. preprocessing of microscopy images via shannon's entropy. prip 2009, belarusian state university, belarus. proc prip 2009 pp. 183–187, 2009.
[21] j. urban, j. vaněk. preprocessing of microscopy images via shannon's entropy on gpu. gravisma 2009, pilsen, czechia. workshop proc pp. 48–51, 2009.
[22] j. urban, j. vaněk, d. stys. preprocessing of microscopy images via shannon's entropy. öagm/aapr workshop, graz, austria. öagm/aapr workshop pp. 48–51, 2011.
[23] t. boublík. statistická termodynamika (czech) [statistical thermodynamics]. academia, prague, czechia, 1996.
[24] p. jizba, t. arimitsu. the world according to rényi: thermodynamics of multifractal systems. ann phys 312(1):17–59, 2004. doi:10.1016/j.aop.2004.01.002.
[25] c. e. shannon. a mathematical theory of communication. bell syst techn j 27(3):379–423, 1948. doi:10.1002/j.1538-7305.1948.tb01338.x.
[26] c. e. shannon. a mathematical theory of communication. bell syst techn j 27(4):623–656, 1948. doi:10.1002/j.1538-7305.1948.tb00917.x.
[27] o. demirkaya, m. h. asyali, p. k. sahoo. image processing with matlab: applications in medicine and biology. crc press, 2009.
[28] m. nixon, a. aguado. feature extraction & image processing. academic press, 2002.
[29] s. kullback, r. a. leibler. on information and sufficiency. ann math stat 22(1):79–86, 1951. doi:10.1214/aoms/1177729694.
[30] general-purpose computation using gpus. www.gpgpu.org.

acta polytechnica 58(4):245–252, 2018
doi:10.14311/ap.2018.58.0245
© czech technical university in prague, 2018, available online at http://ojs.cvut.cz/ojs/index.php/ap

influence of aggressive environment on the tensile properties of textile reinforced concrete

jan machovec a,b, pavel reiterman a,b,∗

a faculty of civil engineering, czech technical university in prague, thakurova 7, prague 6, 166 29, czech republic
b university centre for energy efficient buildings, czech technical university in prague, trinecka 1024, bustehrad, 273 43, czech republic
∗ corresponding author: pavel.reiterman@fsv.cvut.cz

abstract. this article deals with the long-term durability of a relatively new composite – textile reinforced concrete (trc). the studied composite material introduces a modern and favourite solution in contemporary architecture and structural engineering. it could also be used in renovation and monument restoration due to its high utility properties. the experimental program was focused on the determination of the resistance of the trc in an aggressive environment using accelerated durability tests. the high performance concrete (hpc), which we used in our study, exhibited a compressive strength exceeding 100 mpa after 28 days. specimens were subjected to a 10 % solution of h2so4, a 10 % solution of naoh, and freeze-thaw cycling, respectively. all these environments can occur in real conditions in the practical utilization of the trc. the testing was carried out on "dog-bone" shaped specimens, specially designed for the tensile strength measurement. the studied trc specimens were reinforced by textiles of three different square weights that were applied in one or two layers, which led to the expected increase of the tensile strength. the freeze-thaw cycling had the biggest influence on the tensile properties, because it causes micro-crack formation. the specimens exposed to the chemically aggressive environment deteriorated mostly on the surface, because of the high density of the concrete and the generally low penetration of the media used. the resistance of the studied trc to the aggressive environment increased with the applied reinforcement rate. the performed experimental programme highlighted the necessity of including the durability properties in the design of structural elements.

keywords: textile concrete; tensile strength; accelerated durability test; acid environment; freezing-thawing; alkaline solution.
textile reinforced concrete (trc) is a composite material which is becoming more and more widely known. textile concrete may have one or (usually) more layers of textile reinforcement instead of steel [1–3]; mostly, 2d textile reinforcement is used. it can also be prestressed, in two ways: mechanically [1] or through the chemical composition of the reinforcement (nylon shrinks in an alkaline environment) [2]. the trc is a composite material based on a high-density matrix, usually a high-performance concrete (hpc). the concrete matrix used in combination with the textile reinforcement must fulfil several technological conditions: the maximum size of its particles must be smaller than the "eye" of the textile reinforcement, and the mix must be fluid enough to penetrate all parts. the textile reinforcement is easy to use, light and flexible.

this material is currently very favoured by architects, as evidenced by the increasing number of its applications. the trc is used mostly for facade panels, bridges, decorative roofs, etc., which means that it is very often used in aggressive conditions. that is why its long-term durability is a very relevant issue. its long-term durability depends on the properties of the particular components. the durability of the concrete matrix was studied intensively by a number of research teams [4–6]. a high resistance to the action of the environment is ensured by the application of supplementary cementitious materials (scms), which significantly reduce the matrix permeability. the most frequently applied scm is silica fume; however, fly-ash or metakaolin could be successfully used as well [7–9].

the highest quality textiles are made of carbon, but, at the same time, it is the most expensive option. another alternative is glass. glass itself is very sensitive to the alkaline environment, which is typical for concrete; therefore, it needs a surface treatment or a special primary material (ar glass – alkali resistant). organic materials can be used as well, such as plastics, nylon, hemp, jute, etc.; however, these materials can react differently in extreme conditions. they have a different coefficient of thermal expansion and different chemical properties, and are generally sensitive to degradation.

one of the biggest chemical expositions to acidic solutions comes from acid rain. wang et al. [10] discussed the problem of acid rain in china, where it affects almost one third of the country's area. not only rain but also industrial and agricultural environments expose concrete to acidic conditions, and this problem is worldwide, not only in china. a comprehensive worldwide survey [11] shows that almost 54 % of all problems with concrete structures are caused by chemical reactions (carbonation, chloride attack, acids, etc.), mostly owing to a wrong choice of the concrete mixture for the required purpose. nobili [12] studied the influence of the alkaline environment, which is typical for all concrete structures and with which the used reinforcement must deal.
a tensile strength experiment on a concrete reinforced with a glass textile showed that there is an obvious difference in the strength loss between the matrix and the textile: the cementitious matrix showed a faster degradation (24 % compared to the reference sample) than the glass textile (11 %). the glass textile reinforcement can be protected from the influence of the alkaline environment in two basic ways, depending on the glass used. a lower quality glass is covered with a styrene-butadiene layer, while a higher quality glass, the so-called ar glass (alkali resistant), contains 15–20 % of zirconium [13, 14], which improves the durability of the glass.

butler et al. [7, 8] also experimented with the durability of the textile concrete. they mention three factors that are crucial for the long-term durability of glass-fibre-reinforced concrete, which has the same damage mechanism as the trc: first, the corrosion of the fibre material itself due to the attack of oh− ions in the pore solution; second, the static fatigue of the glass fibre under a sustained load in the highly alkaline environment; and lastly, the densification of the matrix adjacent to the filaments and the enhancement of the fibre–matrix bond with continued hydration. the alkalinity resistance is improved by a polymer layer, which also has a mechanical purpose (it holds the textile together). the focus of their article was to find the ideal composition of the matrix for the best durability results. experiments showed that metakaolin and fly-ash are the best solution, because they create a high-density matrix with a suitable resistance to the aggressive environment. however, cementitious matrixes show better results than the blended binding systems in early stages, because of the faster hydration, which causes a better bond between the matrix and the textile reinforcement.

the main aim of the present article is to assess the durability of the trc using a complex system of experimental procedures. the performed set of accelerated tests introduces the most harmful deterioration mechanisms attacking the concrete. samples of the studied trc were exposed to freeze-thaw cycling, acid and alkaline environments, respectively, in terms of direct tension tests.

2. experimental program

the performed experimental program follows previous research [15, 16], which was focused on the development of a cement matrix applicable for the trc production. a new type of portland cement matrix was formulated and its technological aspects were verified.

2.1. materials

the concrete matrix was based on the hpc, whose composition is shown in tab. 1.

component             weight ratio [–]
cement cem i 42.5 r   1
silica fume           0.19
silica sand 0.1/0.6   0.48
silica sand 0.3/0.8   0.5
silica sand 0.6/1.2   0.38
silica powder         0.48
superplasticizer      0.011
water                 0.35

table 1. composition of the used concrete matrix.

this composite contains a high amount of fine components to obtain a suitable consistency in the fresh state. the high amount of silica fume significantly reduces the permeability and the ph of the pore water. the compressive strength exceeds 100 mpa after 28 days when using cubes 100 × 100 × 100 mm. the testing was conducted in accordance with bs en 12390-3 [17].
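as a small practical illustration (ours, not part of the paper), the weight ratios of tab. 1 translate into batch masses once a cement mass is chosen; the batch size and the water/binder convention below are arbitrary assumptions.

# batch masses from the weight ratios of tab. 1 (relative to cement = 1);
# a minimal sketch, not the authors' mix-design procedure.
ratios = {
    "cement cem i 42.5 r": 1.0,
    "silica fume": 0.19,
    "silica sand 0.1/0.6": 0.48,
    "silica sand 0.3/0.8": 0.50,
    "silica sand 0.6/1.2": 0.38,
    "silica powder": 0.48,
    "superplasticizer": 0.011,
    "water": 0.35,
}

cement_mass = 25.0  # kg, arbitrary example batch

for component, ratio in ratios.items():
    print(f"{component:22s} {ratio * cement_mass:7.2f} kg")

# water-to-binder ratio, counting cement + silica fume as binder (an assumption)
w_b = ratios["water"] / (1.0 + ratios["silica fume"])
print(f"water/binder = {w_b:.3f}")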
the name code of the textiles is, for example, "r 585 a101", where r means a textile fabric, 585 stands for the weight in grams per square metre, and a101 is the type of the surface treatment, in this case, alkali resistant. the textiles have different properties in perpendicular directions, caused by the production technology – warp and weft weaving. the glass textile used in this work is made from glass type e with a styrene-butadiene coating layer ensuring its alkali-resistance. glass textile reinforcements of three different square weights, produced by the company adfors, were used in this work. their square weights were 131, 275 and 585 g/m² (fig. 1). each textile was applied in one and two layers, respectively.

figure 1. three types of used textile reinforcement.

2.2. experimental methods

the determination of the flexural strength and the compressive strength was carried out according to bs en 1015-11 [18]. the flexural strength measurement was executed on the accompanying prismatic specimens 40 × 40 × 160 mm; it was organized as a three-point bending test with a support span of 100 mm and axial loading. the compressive strength was investigated by using the fragments left after the bending test.

the main testing part – the tensile property test – was executed on the "dog-bone" specimens placed into special claws on the testing machine. the loading speed was 0.5 mm/min and the test ended at the point of a full break-up. several data outputs were observed, such as the time, loading force, elongation and strain, thanks to the installed extensometer (fig. 2).

figure 2. specimen in claws with installed extensometer.

the incorporation of the reinforcement complicates the interpretation of the test results, because it is necessary to take into consideration the entire failure behaviour of the specimen. that is why four main points of the test record were monitored. the characterisation of the obtained test record is illustrated in fig. 3. point 1 describes the strength at which the first crack occurred. point 2 is where the trc specimen started acting again and began to bear the load again. point 3 describes the maximal obtained strength in the textile reinforcement. point 4 is the point when the specimen was torn apart. area "a" is where just the cementitious matrix bears all the loads; area "b" is the area where the textile reinforcement is activated.

figure 3. typical loading chart.
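the four points described above lend themselves to an automated read-out of a sampled record. the sketch below is our own illustration, not the authors' evaluation procedure; the 50 % force-drop criterion for detecting the first crack is an assumed threshold.

import numpy as np

def characteristic_points(elongation, force):
    """heuristic extraction of the four points of fig. 3 from a sampled
    force-elongation record; the 50 % drop criterion is an assumption,
    not a threshold prescribed by the paper."""
    f = np.asarray(force, dtype=float)
    e = np.asarray(elongation, dtype=float)
    # point 1: first crack, detected as the first pronounced force drop
    running_max = np.maximum.accumulate(f)
    drops = np.where(f[1:] < 0.5 * running_max[:-1])[0]
    i1 = int(drops[0]) if drops.size else int(np.argmax(f))
    # point 3: maximal force carried by the textile after the first crack
    tail = f[i1 + 1:]
    i3 = i1 + 1 + int(np.argmax(tail)) if tail.size else i1
    # point 2: minimum between the crack and point 3, where the textile takes over
    i2 = i1 + int(np.argmin(f[i1:i3 + 1]))
    # point 4: full break-up, taken here as the end of the record
    i4 = len(f) - 1
    return [(e[i], f[i]) for i in (i1, i2, i3, i4)]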
the main aim of the experimental program was to evaluate the impact of the environment with a set of accelerated tests. this testing was inspired by the japanese code jsce-e 549-2000 [19], which prescribes specific solutions for 60 days without air circulation and solution renewal. samples were submersed in a 10 % naoh solution and a 10 % h2so4 solution; reference samples were cured in water. the above described procedure includes monitoring the residual weight of the samples and their visual description.

the trc is used for the production of thin-walled structural elements, which is why the freeze-thaw resistance test was included, because this exposure could cause a progress of cracks. the freeze-thaw resistance test was realised according to [20]. the load cycle started with a frosting phase down to −18 °c lasting for four hours, and then continued with a defrosting period up to 20 °c for another two hours (as shown in figure 4). the defrosting is realized by flooding the climatic chamber with water with a temperature of 20 °c. any loss of mass is also monitored.

figure 4. freeze-thaw scheme.

the residual properties after the accelerated durability tests were determined on the "dog-bone" shaped specimens. all tests were carried out on three samples.
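for completeness, the temperature programme of one cycle can be written down directly; the linear ramp shapes below are our reading of the scheme in figure 4, not values taken from the standard [20].

def cycle_temperature(t_hours):
    """temperature [degrees c] within one 6 h freeze-thaw cycle: a 4 h
    frosting phase down to -18 c followed by a 2 h defrost up to 20 c.
    linear ramps are an assumption based on the scheme of figure 4."""
    t = t_hours % 6.0
    if t < 4.0:  # frosting phase: 20 c -> -18 c
        return 20.0 + (-18.0 - 20.0) * t / 4.0
    return -18.0 + (20.0 - (-18.0)) * (t - 4.0) / 2.0  # defrost by flooding

# example: temperatures at half-hour steps over one cycle
print([round(cycle_temperature(h / 2.0), 1) for h in range(13)])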
3. results

the performed experimental program was focused on the assessment of the resistance of the trc made of the hpc matrix and glass textile reinforcement to various aggressive environments. the cement based matrix, which was used, was developed in a previous research; nevertheless, the determination of the actual mechanical properties was necessary for the correct evaluation. the flexural and compressive strength exhibited values of 11.5 mpa and 122.7 mpa, respectively, after 28 days, which met the initial expectations. the dominant part of the experimental program was focused on the influence of various aggressive environments, where samples were immersed in acidic and alkaline solutions for 60 days; a freeze-thaw test was in progress at the same time for different specimens. the residual values of the flexural and compressive strength obtained using the prismatic specimens are shown in fig. 5.

figure 5. mechanical properties of used matrix.

the specimens exposed to the alkaline solution exhibited a minimal loss of mass and minimal visual changes as well; the surface was coated by a thin dark blue film, which was formed by silica fume released from the surface. a highly alkaline environment leads to a metastable state of the created hydrates. the alkaline solution was nearly constant during the testing; its ph dropped from 13.3 to 13.2 during the 60 days of exposure. these changes explain well the minimal decay of the studied mechanical properties.

figure 6. studied samples after the exposure.

however, the acidic solution was significantly neutralized, which is clearly evident from the ph changes; the initial ph of the acidic solution was 0.2 and increased to 3.4 after the prescribed time. this significant change was accompanied by the deterioration of the immersed specimens, which supplied alkalinity to the solution.
however, it is necessary to note that the measurement of the initial value of the ph was at the limit of the apparatus resolution. the loss of ca(oh)2 and the consequent decomposition of hydrated phases [9, 21] were accompanied by a massive loss of mass and a decay of the mechanical properties.

an interesting trend was exhibited by the accompanying prismatic samples subjected to freezing-thawing. the values of the flexural strength after 50 cycles decreased by more than 40 %; however, additional freezing-thawing led to a slight increase in comparison with the original values. this is probably caused by the additional matrix hydration. similar conclusions can be found in the work of sahmaran et al. [22] and chung et al. [23].

the most important part of the experiment was the tensile testing of the trc specimens. the impact of the studied aggressive environments is well demonstrated by the results of the dog-bone specimen testing. the immersion of the produced dog-bone samples into the aggressive environment caused the expected loss of mass; however, increasing the amount of the textile reinforcement did not affect this parameter. the average results are summarized in tab. 2 for both the acidic and alkaline environments.

environment   loss of mass
acidic        9.93 %
alkaline      1.57 %

table 2. loss of mass of immersed samples.

the conduction of the freeze-thaw cycling did not cause any mass loss. the documentation of the visual condition of the studied samples is shown in fig. 6. the decay of the determined tensile strength is introduced in tab. 3.

              environment         freeze-thaw cycles
              acidic   alkaline    50     100    150
without tr    63.9     86.9        63.9   48.2   51.0
1 × 131       74.7     79.4        85.6   54.4   49.6
1 × 275       90.6     67.7        67.0   47.9   52.1
1 × 585       101.3    99.5        79.1   58.4   60.7
2 × 131       91.9     84.5        78.7   48.9   51.5
2 × 275       56.8     76.7        60.7   39.2   38.3
2 × 585       72.7     75.8        58.4   41.5   41.5

table 3. the decay of the tensile strength due to the action of the studied environments [%].
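the entries of tab. 3 are strengths after an exposure expressed relative to the reference set (values above 100 % indicate an apparent gain). the conversion from absolute strengths is the simple ratio sketched below; the numbers in the example are invented placeholders, not measured values.

def residual_percent(strength_after, strength_ref):
    """tab. 3-style entry: tensile strength after an exposure expressed as a
    percentage of the reference strength of the same reinforcement layout."""
    return 100.0 * strength_after / strength_ref

# invented placeholder values, only to show the arithmetic:
print(round(residual_percent(3.2, 5.0), 1))  # -> 64.0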
the absolute values of the obtained tensile strength after the freezing-thawing are introduced in fig. 7; these values present the stress corresponding to the crack initiation – point 1 according to fig. 3.

figure 7. tensile strength of studied trc samples – point "1".

the elongation between points two and four is a crucial parameter describing the failure mode of the samples, because a brittle rupture is not desirable for thin-walled structures. this parameter was affected by the reduced cohesion of the tr and the used matrix, which is well visible in fig. 8. the results obtained on the samples subjected to the cyclic freezing-thawing are nearly similar to the reference samples.

figure 8. elongation of trc samples between the second and the fourth point.

figure 9 describes the behaviour of a specimen after the first crack occurs. the values of the force corresponding to point "2" are significantly affected by the action of the aggressive environment; this is well visible in the influence of the freezing-thawing. the reason for this result is the fact that the freezing-thawing affects the whole mass, in comparison with the surface degradation induced by the solutions. we can clearly see the difference between one and two layers of all the reinforcements, which is more than double in some cases. the biggest positive effect appeared in the case of the reinforcements with a lower square weight. the degree of reinforcement has a demonstrable effect on the after-crack behaviour.

figure 9. tensile force in trc after the first crack – point "2".

figure 10 shows the maximum force achieved in the textile reinforcement after the first crack – point "3". this moment was followed by a slow decrease, which is positive because of the absence of a fast rupture, which is undesirable. the biggest negative effect was caused by the naoh solution, where the most noticeable strength decrease was almost 31 % in the case of two layers of the r 585 a101 reinforcement compared to the reference samples. this is a very surprising result because of the presence of the alkali resistant coating on the reinforcement.

figure 10. maximal tensile force in trc after the first crack – point "3".

4. conclusion

this research programme describes the behaviour of the trc after an exposure to various aggressive environments. the test specimens were dog-bone shaped, made of the hpc cement based matrix with one or two layers of the glass textile reinforcement.
besides being subjected to the freezing-thawing, the trc samples were also submersed in chemically aggressive solutions – alkaline and acidic. the applied accelerated tests simulated the impact of an external environment.

the freeze-thaw cycles seem to be a crucial parameter for the trc. a number of studies declared a sufficient resistance of the hpc to this type of exposure; however, the limited cross-section of the trc amplifies the negative impact of the micro-crack propagation. a very interesting finding is the stimulated self-healing of the hpc matrix induced by the freezing. the observed effect is probably caused by the surplus of the active mineral additive. the negative frost effect is determined by the character of the consequent deterioration – the presence of cracks. however, the action of the acidic and alkaline environments, respectively, breaks the material down from the surface, which was well visible on the resulting mass loss. the alkaline environment had a significant impact on the durability of the used glass reinforcement, despite the application of the alkali-resistant coating. the acidic environment led to the weakening of the matrix and of its mutual cohesion with the used tr. that is why we can conclude that, from the point of view of the ways of the environmental loading, the crucial parameter is the frost resistance.

a very important factor in the composite behaviour is the type and the amount of the reinforcement layers. the lightest textile reinforcement (131 g/m²) showed to be inappropriate for the reinforcing. the strongest textile (585 g/m²) seems to be adequate. generally, the specimens with two layers exhibited approximately two times better properties compared to just one layer. however, the final durability and the resistance to the actions of an aggressive environment will be crucial parameters for the design of the trc structures.

acknowledgements

this research was supported by project ltc18063.

references

[1] du, y.
et al.: experimental study on basalt textile reinforced concrete under uniaxial tensile loading. construction and building materials, 138, 2017, p. 88–100. doi:10.1016/j.conbuildmat.2017.01.083 [2] zargaran, m. et al.: minimum reinforcement ratio in trc panels for deflection hardening flexural performance. construction and building materials, 137, 2017, p. 459–469. doi:10.1016/j.conbuildmat.2017.01.091 [3] sharei, e. et al.: structural behavior of a lightweight, textile-reinforced concrete barrel vault shell. composite structures, 171, 2017, p. 505–514. doi:10.1016/j.compstruct.2017.03.069 [4] chira, a. et al.: property improvements of alkali resistant glass fibres/epoxy composite with nanosilica for textile reinforced concrete applications. materials & design, 89, 2016, p. 146–155. doi:10.1016/j.matdes.2015.09.122 [5] williams portal, n. et al.: tensile behaviour of textile reinforcement under accelerated ageing conditions. journal of building engineering, 5, 2016, p. 57–66. doi:10.1016/j.jobe.2015.11.006 [6] kong, k. et al.: comparative characterization of the durability behaviour of textile-reinforced concrete (trc) under tension and bending. composite structures, 179, 2017, p. 107–123. doi:10.1016/j.compstruct.2017.07.030 [7] butler, m. et al.: experimental investigations on the durability of fibre–matrix interfaces in textile-reinforced concrete. cement and concrete composites, 31 (4), 2009, p. 221–231. doi:10.1016/j.cemconcomp.2009.02.005 [8] butler, m. et al.: durability of textile reinforced concrete made with ar glass fibre: effect of the matrix composition. materials and structures, 43 (10), 2010, p. 1351–1368. doi:10.1617/s11527-010-9586-8 251 http://dx.doi.org/10.1016/j.conbuildmat.2017.01.083 http://dx.doi.org/10.1016/j.conbuildmat.2017.01.091 http://dx.doi.org/10.1016/j.compstruct.2017.03.069 http://dx.doi.org/10.1016/j.matdes.2015.09.122 http://dx.doi.org/10.1016/j.jobe.2015.11.006 http://dx.doi.org/10.1016/j.compstruct.2017.07.030 http://dx.doi.org/10.1016/j.cemconcomp.2009.02.005 http://dx.doi.org/10.1617/s11527-010-9586-8 jan machovec, pavel reiterman acta polytechnica [9] reiterman, p., tomek, j.: resistance of concrete with metakaolin addition to acid environment. key engineering materials, 677, 2016, p. 144–149. doi:10.4028/www.scientific.net/kem.677.144 [10] wang, z. et al.: deterioration of fracture toughness of concrete under acid rain environment. engineering failure analysis, 77, 2017, p. 76–84. doi:10.1016/j.engfailanal.2017.02.013 [11] basheer, p.a.m. et al.: predictive models for deterioration of concrete structures. construction and building materials, 10 (1), 1996, p. 27–37. doi:10.1016/0950-0618(95)00092-5 [12] nobili, a.: durability assessment of impregnated glass fabric reinforced cementitious matrix (gfrcm) composites in the alkaline and saline environments. construction and building materials, 105, 2016, p. 465–471. doi:10.1016/j.conbuildmat.2015.12.173 [13] orlowsky, j., raupach, m.: durability model for ar-glass fibres in textile reinforced concrete. materials and structures, 41 (7), 2007, p. 1225–1233. doi:10.1617/s11527-007-9321-2 [14] orlowsky, j., raupach, m.: modelling the loss in strength of ar-glass fibres in textile-reinforced concrete. materials and structures, 39 (6), 2006, p. 635–643. doi:10.1617/s11527-006-9100-5 [15] holčapek, o. et al.: analysis of mechanical properties of hydrothermally cured high strength cement matrix for textile reinforced concrete. acta polytechnica, 55 (5), 2015, p. 313. 
doi:10.14311/ap.2015.55.0313
[16] holčapek, o. et al.: using of textile reinforced concrete wrapping for strengthening of masonry columns with modified cross-section shape. procedia engineering, 195, 2017, p. 62–66. doi:10.1016/j.proeng.2017.04.524
[17] bs en 12390-3 – testing hardened concrete – part 3: compressive strength of test specimens, bsi 2009.
[18] bs en 1015-11 – methods of test for mortars for masonry – part 11: determination of flexural and compressive strength of hardened mortars, bsi 1999.
[19] jsce-e 549-2000 – test method for water, acid and alkali resistance of continuous fiber sheets, jsce 2000. http://www.jsce.or.jp/committee/concrete/e/newsletter/newsletter01/recommendation/frp-sheet/2-10.pdf [2018-08-01].
[20] csn 731322 – determination of frost resistance of concrete, csi 1968.
[21] neville, a. m.: properties of concrete. 4th and final ed. harlow: longman group, 1995. isbn 978-0-582-23070-5.
[22] sahmaran, m. et al.: self-healing capability of cementitious composites incorporating different supplementary cementitious materials. cement and concrete composites, 35 (1), 2013, p. 89–101. doi:10.1016/j.cemconcomp.2012.08.013
[23] chung, c.-w. et al.: chloride ion diffusivity of fly ash and silica fume concretes exposed to freeze–thaw cycles. construction and building materials, 24 (9), 2010, p. 1739–1745. doi:10.1016/j.conbuildmat.2010.02.015

doi:10.14311/ap.2018.58.0118 – acta polytechnica 58(2):118–127, 2018. © czech technical university in prague, 2018. available online at http://ojs.cvut.cz/ojs/index.php/ap

quasi-exactly solvable schrödinger equations, symmetric polynomials and functional bethe ansatz method

christiane quesne

physique nucléaire théorique et physique mathématique, université libre de bruxelles, campus de la plaine cp229, boulevard du triomphe, b-1050 brussels, belgium
correspondence: cquesne@ulb.ac.be

abstract. for applications to quasi-exactly solvable schrödinger equations in quantum mechanics, we consider the general conditions that have to be satisfied by the coefficients of a second-order differential equation with at most $k+1$ singular points in order that this equation has particular solutions that are nth-degree polynomials. in a first approach, we show that such conditions involve $k-2$ integration constants, which satisfy a system of linear equations whose coefficients can be written in terms of elementary symmetric polynomials in the polynomial solution roots whenever such roots are all real and distinct.
in a second approach, we consider the functional bethe ansatz method in its most general form under the same assumption. comparing the two approaches, we prove that the above-mentioned $k-2$ integration constants can be expressed as linear combinations of monomial symmetric polynomials in the roots, associated with partitions into no more than two parts. we illustrate these results by considering a quasi-exactly solvable extension of the mathews-lakshmanan nonlinear oscillator corresponding to $k = 4$.

keywords: schrödinger equation; quasi-exactly solvable potentials; symmetric polynomials.

1. introduction

in quantum mechanics, solving the schrödinger equation is a fundamental problem for understanding physical systems. exact solutions may be very useful for developing a constructive perturbation theory or for suggesting trial functions in variational calculus for more complicated cases. however, very few potentials can actually be exactly solved (see, e.g., one of their lists in [1]; here we do not plan to discuss the recent development of the exceptional orthogonal polynomials and the associated polynomially solvable analytic potentials, see, e.g., [2] and references quoted therein). these potentials are connected with second-order differential equations of hypergeometric type and their wavefunctions can be constructed by using the theory of corresponding orthogonal polynomials [3].

a second category of exact solutions belongs to the so-called quasi-exactly solvable (qes) schrödinger equations. these occupy an intermediate place between exactly solvable (es) and non-solvable ones in the sense that only a finite number of eigenstates can be found explicitly by algebraic means, while the remaining ones remain unknown. the simplest qes problems, discovered in the 1980s, are characterized by a hidden sl(2,r) algebraic structure [4–8] and are connected with polynomial solutions of the heun equation [9]. generalizations of this equation are related through their polynomial solutions to more complicated qes problems. several procedures are employed in this context, such as the use of high-order recursion relations (see, e.g., [10]) or the functional bethe ansatz (fba) method [11–13], which has proven very effective [14–17].

the purpose of the present paper is to reconsider the general conditions under which a second-order differential equation $X(z)y''(z) + Y(z)y'(z) + Z(z)y(z) = 0$ with polynomial coefficients $X(z)$, $Y(z)$, and $Z(z)$ of respective degrees $k$, $k-1$, and $k-2$, has nth-degree polynomial solutions $y_n(z)$. choosing appropriately the polynomial $Z(z)$ for such a purpose is known as the classical heine-stieltjes problem [18, 19]. such a differential equation with at most $k+1$ singular points covers the cases of the hypergeometric equation (for $k = 2$), the heun equation (for $k = 3$), and generalized heun equations (for $k \ge 4$), and therefore plays a crucial part in es and qes quantum problems. here we plan to emphasize the key role played by symmetric polynomials in the polynomial solution roots whenever such roots are all real and distinct. this will be done by comparing two different approaches: a first one expressing the polynomial $Z(z)$ in terms of $k-2$ integration constants, and a second one based on the fba method.

in section 2, the problem of second-order differential equations with polynomial solutions is discussed and $k-2$ integration constants are introduced. in section 3, the fba method is derived in its most general form. a comparison between the two approaches is carried out in section 4.
the results so obtained are illustrated in section 5 by considering a qes extension of the mathews-lakshmanan nonlinear oscillator. finally, section 6 contains the conclusion.

2. second-order differential equations with polynomial solutions and integration constants

on starting from a schrödinger equation $(T + V)\psi(x) = E\psi(x)$, where the real variable x varies in a given domain, the potential energy $V$ is a function of x, and the kinetic energy $T$ depends on $d/dx$ (and possibly on x whenever the space is curved or the mass depends on the position [20]), and on making an appropriate gauge transformation and a change of variable $x = x(z)$, one may arrive at a second-order differential equation with at most $k+1$ singular points,

$$X(z) y''(z) + Y(z) y'(z) + Z(z) y(z) = 0, \quad (2.1)$$

where

$$X(z) = \sum_{l=0}^{k} a_l z^l, \quad Y(z) = \sum_{l=0}^{k-1} b_l z^l, \quad Z(z) = \sum_{l=0}^{k-2} c_l z^l, \quad (2.2)$$

and $a_l$, $b_l$, $c_l$ are some (real) constants. polynomial solutions $y_n(z)$ of this equation will yield exact solutions $\psi_n(x)$ of the starting schrödinger equation provided the latter are normalizable. on deriving equation (2.1) $k-2$ times, we obtain

$$X y^{(k)} + \left[\binom{k-2}{1} X' + Y\right] y^{(k-1)} + \left[\binom{k-2}{2} X'' + \binom{k-2}{1} Y' + Z\right] y^{(k-2)} + \cdots + \left[X^{(k-2)} + \binom{k-2}{1} Y^{(k-3)} + \binom{k-2}{2} Z^{(k-4)}\right] y'' + \left[Y^{(k-2)} + \binom{k-2}{1} Z^{(k-3)}\right] y' + Z^{(k-2)} y = 0, \quad (2.3)$$

which is a kth-order homogeneous differential equation with polynomial coefficients of degree not exceeding the corresponding order of differentiation. since all its derivatives will have the same property, it can be differentiated n times by using the new representation $y^{(n)}(z) = v_n(z)$. in such a notation, equation (2.3) can be written as

$$X v_0^{(k)} + \left[\binom{k-2}{1} X' + Y\right] v_0^{(k-1)} + \left[\binom{k-2}{2} X'' + \binom{k-2}{1} Y' + Z\right] v_0^{(k-2)} + \cdots + \left[X^{(k-2)} + \binom{k-2}{1} Y^{(k-3)} + \binom{k-2}{2} Z^{(k-4)}\right] v_0'' + \left[Y^{(k-2)} + \binom{k-2}{1} Z^{(k-3)}\right] v_0' + Z^{(k-2)} v_0 = 0. \quad (2.4)$$

its nth derivative can be easily shown to be given by

$$\sum_{l=0}^{k} \binom{n+k-2}{k-l} X^{(k-l)} v_n^{(l)} + \sum_{l=0}^{k-1} \binom{n+k-2}{k-l-1} Y^{(k-l-1)} v_n^{(l)} + \sum_{l=0}^{k-2} \binom{n+k-2}{k-l-2} Z^{(k-l-2)} v_n^{(l)} = 0. \quad (2.5)$$

when the coefficient of $v_n$ in (2.5) is equal to zero, i.e.,

$$\binom{n+k-2}{k} X^{(k)} + \binom{n+k-2}{k-1} Y^{(k-1)} + \binom{n+k-2}{k-2} Z^{(k-2)} = 0, \quad (2.6)$$

there only remain derivatives of $v_n$ in the equation. it is then obvious that the latter has a particular solution for which $v_n = y^{(n)}$ is a constant or, in other words, there exists a particular solution $y(z) = y_n(z)$ of equation (2.1) that is a polynomial of degree n. on integrating equation (2.6) $k-2$ times, we find that this occurs whenever $Z(z) = Z_n(z)$ is given by

$$Z_n(z) = -\frac{n(n-1)}{k(k-1)} X''(z) - \frac{n}{k-1} Y'(z) + \sum_{l=0}^{k-3} C_{k-l-2,n} \frac{z^l}{l!}, \quad (2.7)$$

where $C_{1,n}, C_{2,n}, \ldots, C_{k-2,n}$ are $k-2$ integration constants. this is the first main result of this paper. it is worth observing that in the hypergeometric case, we have $k = 2$ so that the polynomial $Z(z)$ reduces to a constant $\lambda$. then $\lambda = \lambda_n$, where, in accordance with equation (2.7),

$$\lambda_n = -\tfrac{1}{2} n(n-1) X''(z) - n Y'(z), \quad (2.8)$$

with no integration constant, which is a well-known result [3].
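as a quick consistency check of formula (2.8) – our own illustration, not part of the paper – take the legendre equation, for which $X(z) = 1 - z^2$ and $Y(z) = -2z$ (so $k = 2$): (2.8) gives $\lambda_n = n(n-1) + 2n = n(n+1)$, the familiar legendre eigenvalue. the same check can be scripted:

import sympy as sp

z, n = sp.symbols("z n")

# legendre case: X = 1 - z^2, Y = -2z (k = 2, so Z reduces to a constant lambda_n)
X = 1 - z**2
Y = -2*z

# equation (2.8): lambda_n = -(1/2) n (n-1) X'' - n Y'
lam = sp.expand(-sp.Rational(1, 2)*n*(n - 1)*sp.diff(X, z, 2) - n*sp.diff(Y, z))
print(lam)  # -> n**2 + n, i.e. n(n+1)

# cross-check: the legendre polynomial P_3 solves the equation with lambda_3 = 12
y3 = sp.legendre(3, z)
residual = sp.simplify(X*sp.diff(y3, z, 2) + Y*sp.diff(y3, z) + 12*y3)
print(residual)  # -> 0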
in the heun-type equation case, we have $k = 3$ and the linear polynomial $Z(z) = Z_n(z)$ is given by

$$Z_n(z) = -\tfrac{1}{6} n(n-1) X''(z) - \tfrac{1}{2} n Y'(z) + C_n \quad (2.9)$$

in terms of a single integration constant $C_n$, which is a result previously derived by karayer, demirhan, and büyükkılıç [21]. on inserting now equation (2.2) in (2.7) and equating the coefficients of equal powers of z on both sides, we obtain the set of relations

$$c_{k-2} = -n(n-1) a_k - n b_{k-1}, \quad (2.10)$$

$$c_l = -\frac{n(n-1)}{k(k-1)} (l+2)(l+1) a_{l+2} - \frac{n}{k-1} (l+1) b_{l+1} + \frac{C_{k-l-2,n}}{l!}, \quad l = 0, 1, \ldots, k-3. \quad (2.11)$$

the nth-degree polynomial solutions $y_n(z)$ of equation (2.1) can be written as

$$y_n(z) = \prod_{i=1}^{n} (z - z_i), \quad (2.12)$$

where from now on we assume that the roots $z_1, z_2, \ldots, z_n$ are real and distinct. we now plan to show that the integration constants $C_{1,n}, C_{2,n}, \ldots, C_{k-2,n}$ satisfy a system of linear equations whose coefficients can be expressed in terms of elementary symmetric polynomials in $z_1, z_2, \ldots, z_n$ [22],

$$e_l \equiv e_l(z_1, z_2, \ldots, z_n) = \sum_{1 \le i_1 < i_2 < \cdots < i_l \le n} z_{i_1} z_{i_2} \cdots z_{i_l}.$$

5. example: qes extension of the mathews-lakshmanan nonlinear oscillator

the hamiltonian (5.1) of the mathews-lakshmanan nonlinear oscillator [23–25] depends on a parameter $\lambda$; for $\lambda > 0$ or $\lambda < 0$, the range of the coordinate x is $(-\infty, \infty)$ or $(-1/\sqrt{|\lambda|}, 1/\sqrt{|\lambda|})$, respectively. such a hamiltonian is formally self-adjoint with respect to the measure $d\mu = (1+\lambda x^2)^{-1/2}\, dx$. the corresponding schrödinger equation is es [24, 25] and its bound-state wavefunctions can be expressed in terms of gegenbauer polynomials [26]. noting that the potential energy term in (5.1) can also be written as

$$V_0(x) = \lambda A - \frac{\lambda A}{1+\lambda x^2}, \quad \text{where } A = \frac{\beta}{\lambda}\left(\frac{\beta}{\lambda} + 1\right), \quad (5.2)$$

let us extend it to

$$V_m(x) = \lambda A - \frac{\lambda A}{1+\lambda x^2} + \lambda \sum_{k=1}^{2m} B_k (1+\lambda x^2)^k, \quad (5.3)$$

where m may take the values $m = 1, 2, 3, \ldots$, $A, B_1, B_2, \ldots, B_{2m}$ are $2m+1$ parameters, and the range of x is the same as before. the starting schrödinger equation therefore reads

$$\left(-(1+\lambda x^2) \frac{d^2}{dx^2} - \lambda x \frac{d}{dx} + \lambda A - \frac{\lambda A}{1+\lambda x^2} + \lambda \sum_{k=1}^{2m} B_k (1+\lambda x^2)^k - E\right) \psi(x) = 0. \quad (5.4)$$

to reduce it to an equation of type (2.1), let us make the change of variable

$$z = \frac{1}{1+\lambda x^2} \quad (5.5)$$

and the gauge transformation

$$\psi(x) = x^p z^a \exp\left(-\sum_{j=1}^{m} \frac{b_j}{z^j}\right) y(z), \quad (5.6)$$

where $p = 0, 1$ is related to the parity $(-1)^p = +1, -1$, and $a, b_1, b_2, \ldots, b_m$ are $m+1$ parameters connected with the previous ones. it turns out that k in (2.2) is related to m in (5.3) by the relation $k = m + 2$. for $m = 2$, for instance, equation (5.4) yields

$$\Big\{-4z^3(1-z) \frac{d^2}{dz^2} + 2\big[(4a+3)z^3 - 2(2a-2b_1+1-p)z^2 - 4(b_1-2b_2)z - 8b_2\big] \frac{d}{dz} + [2a(2a+1) - A] z^2 + [-4a^2 + 8ab_1 + 4ap - 2b_1 - p - \epsilon] z + B_1 - 4b_1(2a-b_1-1-p) + 4b_2(4a-3)\Big\} y(z) = 0, \quad (5.7)$$

after setting $E = \lambda(\epsilon + A)$ and

$$B_2 = 4[b_1^2 + 2b_2(2a-2b_1-2-p)], \quad B_3 = 16 b_2 (b_1 - b_2), \quad B_4 = 16 b_2^2. \quad (5.8)$$

with the identifications

$$a_4 \to 4, \quad a_3 \to -4, \quad a_2, a_1, a_0 \to 0, \quad b_3 \to 2(4a+3), \quad b_2 \to -4(2a-2b_1+1-p), \quad b_1 \to -8(b_1-2b_2), \quad b_0 \to -16 b_2, \quad c_2 \to 2a(2a+1) - A, \quad c_1 \to -4a^2 + 8ab_1 + 4ap - 2b_1 - p - \epsilon, \quad c_0 \to B_1 - 4b_1(2a-b_1-1-p) + 4b_2(4a-3) \quad (5.9)$$

in equation (2.2), from equations (2.10) and (2.11) we obtain

$$2a(2a+1) - A = -4n(n-1) - 2n(4a+3),$$
$$-4a^2 + 8ab_1 + 4ap - 2b_1 - p - \epsilon = 2n(n-1) + \tfrac{8}{3} n (2a-2b_1+1-p) + C_{1,n},$$
$$B_1 - 4b_1(2a-b_1-1-p) + 4b_2(4a-3) = \tfrac{8}{3} n (b_1 - 2b_2) + C_{2,n}, \quad (5.10)$$

where, from (2.22), the integration constants $C_{1,n}$ and $C_{2,n}$ are given by

$$C_{1,n} = -2[4(n-1) + 4a + 3]\, m_{(1,\dot{0})} + 2n(n-1) + \tfrac{4}{3} n (2a-2b_1+1-p),$$
$$C_{2,n} = -2[4(n-1) + 4a + 3]\, m_{(2,\dot{0})} + 4[2(n-1) + 2a-2b_1+1-p]\, m_{(1,\dot{0})} - 8 m_{(1^2,\dot{0})} + \tfrac{16}{3} n (b_1 - 2b_2). \quad (5.11)$$
on the other hand, the direct application of the fba relations (3.16) and (3.17) yields the equivalent results

$$2a(2a+1) - A = -4n(n-1) - 2n(4a+3),$$
$$-4a^2 + 8ab_1 + 4ap - 2b_1 - p - \epsilon = 4n(n-1) + 4n(2a-2b_1+1-p) - 8(n-1)\, m_{(1,\dot{0})} - 2(4a+3)\, m_{(1,\dot{0})},$$
$$B_1 - 4b_1(2a-b_1-1-p) + 4b_2(4a-3) = 8n(b_1-2b_2) + 8(n-1)\, m_{(1,\dot{0})} - 8[(n-1)\, m_{(2,\dot{0})} + m_{(1^2,\dot{0})}] + 4(2a-2b_1+1-p)\, m_{(1,\dot{0})} - 2(4a+3)\, m_{(2,\dot{0})}. \quad (5.12)$$

in both cases, on setting $n = 0$ for instance, we get that the schrödinger equation (5.4) with $m = 2$,

$$A = 2a(2a+1), \quad B_1 = 4b_1(2a-b_1-1-p) - 4b_2(4a-3), \quad (5.13)$$

and $B_2$, $B_3$, $B_4$ given in (5.8), has an eigenvalue

$$E_{0,p} = \lambda(8ab_1 + 4ap + 2a - 2b_1 - p) \quad (5.14)$$

with the corresponding eigenfunction

$$\psi_{0,p}(x) \propto x^p (1+\lambda x^2)^{-a}\, e^{-\lambda(b_1+2b_2)x^2 - \lambda^2 b_2 x^4}. \quad (5.15)$$

the latter is normalizable with respect to the measure $d\mu$ provided $b_2 > 0$ if $\lambda > 0$ or $a < \tfrac{1}{4}$ if $\lambda < 0$, and it corresponds to a ground state for $p = 0$ and to a first excited state for $p = 1$. furthermore, for $n = 1$, on taking into account that $m_{(1,\dot{0})} = z_1$, $m_{(2,\dot{0})} = z_1^2$, and $m_{(1^2,\dot{0})} = 0$ in terms of the single root $z_1$ of $y_1(z)$, we obtain that equation (5.4) with $m = 2$,

$$A = (2a+2)(2a+3), \quad B_1 = -2(4a+3)z_1^2 + 4(2a-2b_1+1-p)z_1 + 4b_1(2a-b_1+1-p) - 4b_2(4a+1), \quad (5.16)$$

and $B_2$, $B_3$, $B_4$ given in (5.8), has an eigenvalue

$$E_{1,p} = \lambda[8ab_1 + 4ap + 2a + 6b_1 + 3p + 2 + 2(4a+3)z_1] \quad (5.17)$$

with the corresponding eigenfunction

$$\psi_{1,p}(x) \propto x^p (1+\lambda x^2)^{-a-1} [1 - z_1(1+\lambda x^2)]\, e^{-\lambda(b_1+2b_2)x^2 - \lambda^2 b_2 x^4}. \quad (5.18)$$

the normalizability condition is now $b_2 > 0$ if $\lambda > 0$ or $a < -\tfrac{3}{4}$ if $\lambda < 0$. here $z_1$ is a real solution of the cubic equation

$$(4a+3) z_1^3 - 2(2a-2b_1+1-p) z_1^2 - 4(b_1-2b_2) z_1 - 8b_2 = 0, \quad (5.19)$$

hence, for instance,

$$z_1 = \frac{2(2a-2b_1+1-p)}{3(4a+3)} + \left(-\frac{v}{2} + \sqrt{\left(\frac{v}{2}\right)^2 + \left(\frac{u}{3}\right)^3}\right)^{1/3} + \left(-\frac{v}{2} - \sqrt{\left(\frac{v}{2}\right)^2 + \left(\frac{u}{3}\right)^3}\right)^{1/3},$$
$$u = \frac{4}{4a+3}\left(-b_1 + 2b_2 - \frac{(2a-2b_1+1-p)^2}{3(4a+3)}\right),$$
$$v = -\frac{8}{4a+3}\left(b_2 + \frac{2}{27}\frac{(2a-2b_1+1-p)^3}{(4a+3)^2} + \frac{1}{3}\frac{(2a-2b_1+1-p)(b_1-2b_2)}{4a+3}\right). \quad (5.20)$$

according to its value, the wavefunction $\psi_{1,p}(x)$ may correspond to a ground or second-excited state for $p = 0$ and to a first- or third-excited state for $p = 1$.
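as a numeric cross-check of (5.19) and (5.17) – our own illustration with arbitrary parameter values – the real root $z_1$ can also be obtained directly from the coefficients of the cubic instead of the explicit cardano form (5.20):

import numpy as np

# arbitrary illustrative parameters (p = 0 parity sector, lambda > 0)
a, b1, b2, p, lam = 1.0, 0.3, 0.2, 0, 1.0

# cubic (5.19): (4a+3) z^3 - 2(2a-2b1+1-p) z^2 - 4(b1-2b2) z - 8 b2 = 0
coeffs = [4*a + 3, -2*(2*a - 2*b1 + 1 - p), -4*(b1 - 2*b2), -8*b2]
roots = np.roots(coeffs)
z1 = next(r.real for r in roots if abs(r.imag) < 1e-10)

# energy (5.17): E_{1,p} = lam*[8ab1 + 4ap + 2a + 6b1 + 3p + 2 + 2(4a+3) z1]
E1 = lam*(8*a*b1 + 4*a*p + 2*a + 6*b1 + 3*p + 2 + 2*(4*a + 3)*z1)
print(z1, E1)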
6. conclusion

in the present paper, we have reconsidered the general conditions that have to be satisfied by the coefficients of a second-order differential equation with at most $k+1$ singular points in order that the equation has particular solutions that are nth-degree polynomials $y_n(z)$, and we have expressed them in terms of symmetric polynomials in the polynomial solution roots. this has been done in two different ways. in the first one, we have shown that these conditions involve $k-2$ integration constants, which satisfy a system of linear equations whose coefficients can be expressed in terms of elementary symmetric polynomials in the polynomial solution roots whenever such roots are all real and distinct. in the second approach, we have considered the solution of the fba method in its most general form under the same assumption. comparing the outcomes of both descriptions, we have proved that the above-mentioned $k-2$ integration constants can be expressed as linear combinations of monomial symmetric polynomials in the polynomial solution roots, corresponding to partitions into no more than two parts. as far as the author knows, this property and the general solution of the fba method in terms of such symmetric polynomials are new results. in addition, their practical usefulness has been demonstrated by solving a qes extension of the mathews-lakshmanan nonlinear oscillator, corresponding to $k = 4$.

7. appendix

the purpose of this appendix is to solve equation (2.20) for $r = 3, 4, 5$ and to show that the resulting expressions of $C_{1,n}$, $C_{2,n}$, and $C_{3,n}$ agree with equation (2.22). for $r = 3$, equation (2.20) directly leads to

$$\frac{C_{1,n}}{(k-3)!} = -[2(n-1)a_k + b_{k-1}]\, e_1 - \frac{2n(n-1)}{k} a_{k-1} - \frac{n}{k-1} b_{k-2}, \quad (7.1)$$

which corresponds to equation (2.22) for $q = 1$ because $e_1 = m_{(1,\dot{0})}$. for $r = 4$, equation (2.20) becomes

$$\frac{C_{2,n}}{(k-4)!} - \frac{C_{1,n}}{(k-3)!}\, e_1 = 2[(2n-3)a_k + b_{k-1}]\, e_2 + \left[\frac{2}{k}(n-1)(n-k)\, a_{k-1} + \frac{1}{k-1}(n-k+1)\, b_{k-2}\right] e_1 - \frac{2n(n-1)}{k(k-1)} (2k-3)\, a_{k-2} - \frac{2n}{k-1} b_{k-3}. \quad (7.2)$$

on inserting (7.1) in (7.2) and using the identities $e_2 = m_{(1^2,\dot{0})}$, $e_1^2 = m_{(2,\dot{0})} + 2m_{(1^2,\dot{0})}$, we get

$$\frac{C_{2,n}}{(k-4)!} = -[2(n-1)a_k + b_{k-1}]\, m_{(2,\dot{0})} - [2(n-1)a_{k-1} + b_{k-2}]\, m_{(1,\dot{0})} - 2a_k\, m_{(1^2,\dot{0})} - \frac{2n(n-1)}{k(k-1)} (2k-3)\, a_{k-2} - \frac{2n}{k-1} b_{k-3}, \quad (7.3)$$

which agrees with equation (2.22) for $q = 2$. on setting now $r = 5$ in equation (2.20), we obtain

$$\frac{C_{3,n}}{(k-5)!} - \frac{C_{2,n}}{(k-4)!}\, e_1 + \frac{C_{1,n}}{(k-3)!}\, e_2 = -3[2(n-2)a_k + b_{k-1}]\, e_3 - \left\{\frac{2}{k}[n^2 - (2k+1)n + 3k]\, a_{k-1} + \frac{1}{k-1}(n-2k+2)\, b_{k-2}\right\} e_2 + \left\{\frac{2}{k(k-1)}(n-1)[(2k-3)n - k(k-1)]\, a_{k-2} + \frac{1}{k-1}(2n-k+1)\, b_{k-3}\right\} e_1 - \frac{6n(n-1)}{k(k-1)}(k-2)\, a_{k-3} - \frac{3n}{k-1} b_{k-4}. \quad (7.4)$$

here, let us employ equations (7.1) and (7.3), as well as the identities $e_3 = m_{(1^3,\dot{0})}$, $m_{(2,\dot{0})} m_{(1,\dot{0})} = m_{(3,\dot{0})} + m_{(2,1,\dot{0})}$, and $m_{(1^2,\dot{0})} m_{(1,\dot{0})} = m_{(2,1,\dot{0})} + 3m_{(1^3,\dot{0})}$. for the coefficient of $m_{(1^3,\dot{0})}$ in $C_{3,n}/(k-5)!$, we obtain $-3[2(n-2)a_k + b_{k-1}]$ from the right-hand side of (7.4), $-6a_k$ from $C_{2,n} e_1/(k-4)!$, and $3[2(n-1)a_k + b_{k-1}]$ from $-C_{1,n} e_2/(k-3)!$, respectively. we conclude that $m_{(1^3,\dot{0})}$ does not occur in $C_{3,n}/(k-5)!$, which is given by

$$\frac{C_{3,n}}{(k-5)!} = -[2(n-1)a_k + b_{k-1}]\, m_{(3,\dot{0})} - [2(n-1)a_{k-1} + b_{k-2}]\, m_{(2,\dot{0})} - [2(n-1)a_{k-2} + b_{k-3}]\, m_{(1,\dot{0})} - 2a_k\, m_{(2,1,\dot{0})} - 2a_{k-1}\, m_{(1^2,\dot{0})} - \frac{6n(n-1)}{k(k-1)}(k-2)\, a_{k-3} - \frac{3n}{k-1} b_{k-4}, \quad (7.5)$$

in agreement with equation (2.22) for $q = 3$.
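the product identities quoted above are easy to verify symbolically. the sketch below (our own illustration) builds the monomial symmetric polynomials from their definition and checks each identity for a small number of roots; the trailing $\dot{0}$ in the partition labels denotes the zeros padding a partition to length n, so it is absorbed in the zero-padding here.

import sympy as sp
from itertools import permutations

n = 4  # number of roots; any n large enough for the partitions involved
zs = sp.symbols(f"z1:{n + 1}")

def m(*parts):
    """monomial symmetric polynomial m_lambda in z1..zn for the partition
    'parts' (zero-padded): m(2) = sum z_i^2, m(1, 1) = sum_{i<j} z_i z_j, ..."""
    exps = tuple(parts) + (0,) * (n - len(parts))
    return sp.Add(*[sp.Mul(*[z**k for z, k in zip(zs, perm)])
                    for perm in set(permutations(exps))])

def e(l):
    """elementary symmetric polynomial e_l = m_(1^l)."""
    return m(*([1] * l))

# the identities used in the appendix:
assert sp.expand(e(2) - m(1, 1)) == 0
assert sp.expand(e(1)**2 - m(2) - 2*m(1, 1)) == 0
assert sp.expand(e(3) - m(1, 1, 1)) == 0
assert sp.expand(m(2)*m(1) - m(3) - m(2, 1)) == 0
assert sp.expand(m(1, 1)*m(1) - m(2, 1) - 3*m(1, 1, 1)) == 0
print("all appendix identities verified for n =", n)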
references

[1] f. cooper, a. khare, u. sukhatme. supersymmetry and quantum mechanics, phys. rep. 251:267–385, 1995. doi:10.1016/0370-1573(94)00080-m
[2] d. gómez-ullate, y. grandati, r. milson. extended krein-adler theorem for the translationally shape invariant potentials, j. math. phys. 55:043510, 30 pages, 2014. doi:10.1063/1.4871443
[3] g. szegö. orthogonal polynomials, american mathematical society, new york, 1939. doi:10.1090/coll/023
[4] a. v. turbiner, a. g. ushveridze. spectral singularities and quasi-exactly solvable quantal problem, phys. lett. a126:181–183, 1987. doi:10.1016/0375-9601(87)90456-7
[5] a. v. turbiner. quasi-exactly-solvable problems and sl(2) algebra, commun. math. phys. 118:467–474, 1988. doi:10.1007/bf01466727
[6] a. g. ushveridze. quasi-exactly solvable models in quantum mechanics, iop, bristol, 1994. doi:10.1201/9780203741450
[7] a. gonzález-lópez, n. kamran, p. j. olver. normalizability of one-dimensional quasi-exactly solvable schrödinger operators, commun. math. phys. 153:117–146, 1993. doi:10.1007/bf02099042
[8] a. v. turbiner. one-dimensional quasi-exactly solvable schrödinger equations, phys. rep. 642:1–71, 2016. doi:10.1016/j.physrep.2016.06.002
[9] a. ronveaux. heun differential equations, oxford university press, oxford, 1995.
[10] h. ciftci, r. i. hall, n. saad, e. dogu. physical applications of second-order linear differential equations that admit polynomial solutions, j. phys. a: math. theor. 43:415206, 14 pages, 2010. doi:10.1088/1751-8113/43/41/415206
[11] m. gaudin. la fonction d'onde de bethe, masson, paris, 1983.
[12] c.-l. ho. prepotential approach to exact and quasi-exact solvabilities, ann. phys. (ny) 323:2241–2252, 2008. doi:10.1016/j.aop.2008.04.010
[13] y.-z. zhang. exact polynomial solutions of second order differential equations and their applications, j. phys. a: math. theor. 45:065206, 20 pages, 2012. doi:10.1088/1751-8113/45/6/065206
[14] d. agboola, y.-z. zhang. exact solutions of the schrödinger equation with spherically symmetric octic potential, mod. phys. lett. a27:1250112, 8 pages, 2012. doi:10.1142/s021773231250112x
[15] d. agboola, y.-z. zhang. novel quasi-exactly solvable models with anharmonic singular potentials, ann. phys. (ny) 330:246–262, 2013. doi:10.1016/j.aop.2012.11.013
[16] d. agboola, j. links, i. marquette, y.-z. zhang. new quasi-exactly solvable class of generalized isotonic oscillators, j. phys. a: math. theor. 47:395305, 17 pages, 2014. doi:10.1088/1751-8113/47/39/395305
[17] c. quesne. families of quasi-exactly solvable extensions of the quantum oscillator in curved spaces, j. math. phys. 58:052104, 19 pages, 2017. doi:10.1063/1.4983563
[18] e. heine. handbuch der kugelfunctionen, vol. 1, pp. 472–479, g. reimer, berlin, 1878.
[19] t. j. stieltjes. sur certains polynômes qui vérifient une équation differentielle linéaire du second ordre et sur la théorie des fonctions de lamé, acta math. 6:321–326, 1885.
[20] c. quesne, v. m. tkachuk. deformed algebras, position-dependent effective masses and curved spaces: an exactly solvable coulomb problem, j. phys. a: math. gen. 37:4267–4281, 2004. doi:10.1088/0305-4470/37/14/006
[21] h. karayer, d. demirhan, f. büyükkılıç. extension of nikiforov-uvarov method for the solution of heun equation, j. math. phys. 56:063504, 14 pages, 2015. doi:10.1063/1.4922601
[22] d. e. littlewood. a university algebra: an introduction to classic and modern algebra, dover, new york, 1971.
[23] p. m. mathews, m. lakshmanan. on a unique nonlinear oscillator, quart. appl. math. 32:215–218, 1974. doi:10.1090/qam/430422
[24] j. f. cariñena, m. f. rañada, m. santander. one-dimensional model of a quantum nonlinear harmonic oscillator, rep. math. phys. 54:285–293, 2004. doi:10.1016/s0034-4877(04)80020-x
[25] j. f. cariñena, m. f. rañada, m. santander. a quantum exactly solvable non-linear oscillator with quasi-harmonic behaviour, ann. phys. (ny) 322:434–459, 2007. doi:10.1016/j.aop.2006.03.005
[26] a. schulze-halberg, j. r. morris. special function solutions of a spectral problem for a nonlinear quantum oscillator, j. phys. a: math. theor. 45:305301, 9 pages, 2012. doi:10.1088/1751-8113/45/30/305301

doi:10.14311/ap.2019.59.0051 – acta polytechnica 59(1):51–58, 2019. © czech technical university in prague, 2019. available online at http://ojs.cvut.cz/ojs/index.php/ap

modelling, simulation and parameter identification of active pollution reduction with photocatalytic asphalt

jens kruschwitz b, martin lind a, adrian muntean a,∗, omar richardson a, yosief wondmagegne a

a department of mathematics and computer science, karlstad university, universitetsgatan 2, karlstad, sweden
b capital city kiel, transportation infrastructures department, fleethörn 9, kiel, germany
∗ corresponding author: adrian.muntean@kau.se

abstract.
we develop and implement a numerical model to simulate the effect of photocatalytic asphalt on reducing the concentration of nitrogen monoxide (no) due to the presence of heavy traffic in an urban environment. the contributions in this paper are threefold: we model and simulate the spread and breakdown of pollution in an urban environment, we provide a parameter estimation process that can be used to find missing parameters, and finally, we train and compare this simulation with different data sets. we analyse the results and provide an outlook on further research.

keywords: pollution; environmental modelling; parameter identification; finite element simulation.

1. introduction

pollution in urban environments has been a major issue for several decades and efforts of combating it have spanned across many areas of research. emerging technologies, such as those based on photo-catalysis, target the removal of vehicular nitrogen oxides (nox) to mitigate the roadside air pollution problem. these techniques have proven effective and research in this area is still ongoing. recently, a lot has been done from the experimental perspective as well; see, e.g., [1] for experimental studies on visible-light activated photocatalytic asphalt, [2–5] for photo-catalytic concrete products, and [6] for actual observational studies. a comprehensive overview of the underlying principles of photo-catalysis processes in connection with the removal of air pollutants is presented in [7–9] and the references therein. however, from the mathematical modelling and simulation point of view, this setting is less studied. a remotely related situation is handled via a fluid dynamics-based study on the pollutant propagation in the near-ground atmospheric layer, as discussed in [10]. we strongly believe that there is a need for a deeper insight, via mathematical modelling and simulations, into the connections between experiments, theory and practical applications of the methodology. this is where we wish to contribute. similar techniques to the ones we apply are detailed in [11], to which we refer the interested reader for more details on the modelling structure.

in this paper, we report on the use of numerical simulations to mimic the effect that the presence of a street paved with photocatalytic asphalt has on the no reduction in the local environment. comparisons are based on data collected in a neighbourhood of one of the streets in kiel, germany. the findings in this work contribute to enhancing the understanding of the interplay between the various factors involved in the air pollution control.

the contributions of this paper are threefold:
• we model and simulate the spread and breakdown of pollution in an urban environment.
• we provide a parameter estimation process that can be used to determine relevant missing parameters.
• we train and compare this simulation with different data sets.

the paper is structured as follows: first, we present the model equations in a dimensionless formulation and explain what they describe. then, we provide an insight into the reference set of parameters we use. as a next step, we use the designed model to formulate and solve a parameter identification problem required to identify the effect of the local environment on the no evolution. finally, we compare our numerical results with the measured data from both before and after the photocatalytic asphalt was placed.

2. no pollution model

2.1. model description

the urban environment we model is the cross-section of a road.
this is a three-lane highway within city limits. in our two-dimensional model, we represent the introduction of the pollution by the cars, which is diffused through the air. we model sunlight as a reactive factor [7, 8]. finally, we model the interaction with the rest of the environment by using robin boundary conditions. these boundary conditions include environmental effects from neighbouring locations (especially molecular diffusion and dispersion), represented in a single parameter later referred to as σ.

figure 1. sketch of the cross-section. γ represents the face of the asphalt, the box is a cross-section of the street. the grey rectangular box inside of the cross-section illustrates the location of no emission.

we do not incorporate wind flux in our model. our motivation for this is two-fold: (1) we are not aware of measurements of wind data in the region under observation; (2) our model can handle a slow-to-moderate wind by a simple translation of the fluxes with an averaged convective term. however, the size effects that a strong drift introduces would be too significant for our model to capture in the current setting and with the current data available. introducing genuine wind effects in our model would necessarily force us to handle a big area of the urban environment, where more localized measurements would be needed to determine a correct no pollution level. instead, we opt for introducing some level of dispersion correcting the molecular diffusion of the main pollutant agent; see section 6 for a further discussion of this topic.

2.2. setting of the model equations

the equations are defined in the aforementioned urban environment, denoted by ω. the geometric representation of the environment is illustrated in figure 1. to keep the presentation concise, we introduce the model directly in a dimensionless form. we refer the reader to the appendix for a concise explanation of the non-dimensionalization procedure.

let x be the variable denoting the position in space and t ∈ [0, 1) be the variable denoting the time of day. the unknown concentration profile is denoted by u(x, t) and represents the no concentration at the position x and time t. we refer here to no as air pollutant. we choose l to represent a reference length scale: the width of the two times three-lane highway plus the two corresponding banks (see section 2.3). t_r is a reference time scale. concerning the concentration of the air pollutant, let u_0 denote the initial concentration value, u_t a preset (threshold) value and u_r a reference concentration value. the preset concentration of the pollutant might correspond, for example, to the concentration of no naturally present in the environment. the transport (dispersion) coefficient for no is denoted by d, while t = 0 and t = 1 denote the dimensionless initial and final times of the process observation; the start and the end of one day, respectively. let f denote the traffic intensity on the street, measuring the average number of vehicles passing through the observation point during a specified period, with f_r as a reference value. let s denote the effect of the solar radiation, with s_r as a reference value. introducing non-dimensional variables and rewriting gives the main equation of this study:

∂u/∂t − ∇ · (∇u) = a_f f(x, t) − κ a_s s(t) u,    (1)

where a_f f represents the contribution from the emission of no by motor vehicles.
the dependence of the coefficient a_f on the other parameters can be expressed as

a_f = f_r l^2 / (u_r d) = 5.5,

a damköhler-like number that expresses the relation between the no emitted and the density of the traffic. the baseline value for the molecular diffusion coefficient d is 0.146 cm^2/s; this reference value is taken from [12]. to account for the effects of dispersion and slow winds, we use values for this coefficient that are of order 10^2 higher. this brings the numerical output into the range of emission measurements for large ranges of all the other model parameters. to quantify the evolution of the air pollutant, one has to first get a grip on the typical sizes of σ, κ and γ. the term a_s s(t) represents the contribution from the reaction where no is converted to no2 with a reaction rate κ. the dependence of a_s on the other parameters is given by

a_s = s_r l^2 / d.

the following initial and boundary conditions are imposed on (1):

u(x, 0) = u_0/u_r for x ∈ ω,
−∇u · n = 0 at γ_n × (0, 1),
∇u · n = (σ l / d) (u − u_t/u_r)^+ at γ_r × (0, 1),
∇u · n = (γ κ l / d) u at γ × (0, 1),    (2)

where g^+ denotes the positive part of g, namely g^+(x) = max(g(x), 0) for x ∈ γ_r. we refer to σ as the environmental parameter, to κ as the reaction rate and to γ as the asphalt reactivity. all parameters can be seen as mass-transfer coefficients; σ represents the exchange of no with the ambient atmosphere, κ expresses the speed of the reaction from no to no2, while γ expresses the capacity of the photocatalytic asphalt. in our model, we choose a value of σ = 300 m^3/µg, consistent with the base level of no concentration in the environment according to the measurements. when simulating the scenario prior to the photocatalytic asphalt, γ = 0.

it is worth mentioning that κ and γ are influenced by a multitude of effects, including but not limited to the local atmospheric conditions, the effect of uv, temperature and humidity, and the porosity and the chemical composition of the asphalt. for this reason, these parameters are situation specific and require tuning for each scenario that one wants to model. we will do so by applying a parameter identification technique, described in section 4.

figure 2. traffic density m(t) as a function of t.

2.3. other parameters

2.3.1. initial and threshold concentrations of no

u_0 represents the initial mass concentration of no present in the environment. in this simulation, we choose a concentration of u_0 = 37 µg/m^3. this value corresponds to the lowest available no concentration level from the measurements. u_t represents the threshold concentration level from which no disperses out of the environment we consider. in this model, we choose u_t = 0 µg/m^3, which means we assume low ambient levels of no, ensuring natural dispersion even at low concentrations. in the case of high ambient levels of no due to, e.g., nearby factories or other highways, u_t can be higher.

2.3.2. patch-wise vehicle distribution

the emission of no is proportional to the amount of motor vehicles passing through this cross-section. the distribution of motor vehicles is derived from measurements of [13], a german municipal service that performs automated traffic counts for a large number of cities. the traffic count we collect our data from is located 3.5 kilometers from the no measuring point. because the roads in question are similar in size and geographically close, we expect this data to provide a reliable estimate.
each day, 72,278 cars are counted in both directions of the road. the traffic count is aggregated on an hourly level, so in order to obtain a vehicle distribution, we interpolate the hourly data with cubic splines and normalize the resulting function. we end up with a nominal density m(t) of cars for each time t ∈ [0, 1) such that

∫_0^1 m(t) dt = 1.

a plot of m(t) is presented in figure 2.

2.3.3. patch-wise no emission (per vehicle)

we model the emission of no by choosing a specific shape for the source term f(x, t) from (1). this source term is defined as

f(x, t) = m(t) if x ∈ a, and f(x, t) = 0 if x ∉ a,    (3)

for all x ∈ ω and t ∈ [0, 1), where a is a rectangle of 15 × 0.4 square meters, located 0.1 meter above the asphalt. this box represents the location of the emission of the vehicles. it is illustrated by the grey box in figure 1.

2.3.4. cross-section geometry

the dimensions of the simulated cross-section (displayed in figure 1) are 40 meters by 8 meters. in the simulation, the road has a width of 15 meters (2 × 3 lanes, each 2.5 meters wide) and is located in the middle of the cross-section. this is a simplified representation of the road under consideration, but it can be generalized to model roads of any dimension. the measuring point we use to evaluate the simulated no concentration profile is located 1.75 meters above the centre of the road, in accordance with the real measuring point that provided us the no concentration data.

figure 3. nondimensional uv strength s(t) as a function of t.

2.3.5. solar effects on no

the natural conversion from no to no2 due to sunlight is represented by the term κ a_s s(t). in this term, the intensity of the uv radiation is expressed by the dimensionless factor s(t). we compute this factor by interpolating the sunrise, sunset and solar noon data from [14] and normalizing to obtain a function s(t) such that

∫_0^1 s(t) dt = 1.

this is done under the assumption that there is no uv radiation between sunset and sunrise and that the maximum of uv radiation takes place at solar noon. for our simulation, we choose a reference value s_r = 1 uvi (see appendix), where by uvi we mean the uv index of the sunlight, a dimensionless quantity. a plot of the shape of s(t) is presented in figure 3.

2.3.6. pollution-reducing effects

as mentioned above, the values of κ and γ strongly depend on the environment. by using the measurement data detailed in the next section, we have enough information to derive the values of these parameters for our model. a short sketch of the source construction of sections 2.3.2–2.3.3 follows.
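before turning to the measurements, the interpolation and normalization of section 2.3.2 together with the source term (3) can be sketched in a few lines of python. the hourly counts and the box coordinates below are illustrative stand-ins, not the bast data of [13] or the exact geometry of figure 1.

```python
import numpy as np
from scipy.interpolate import CubicSpline

# hourly traffic counts over one day, wrapped so that m(t) is periodic;
# these numbers are illustrative stand-ins, not the bast measurements
hours = np.linspace(0.0, 1.0, 25)
counts = 1.0 + np.cos(2.0 * np.pi * (hours - 0.5)) ** 2

spline = CubicSpline(hours, counts, bc_type='periodic')

t = np.linspace(0.0, 1.0, 1441)          # one-minute resolution of the day
m = spline(t)
m /= np.trapz(m, t)                      # enforce int_0^1 m(t) dt = 1

def source(x, y, t_now):
    """patch-wise emission f(x, t) of (3): m(t) inside the 15 m x 0.4 m box a
    placed 0.1 m above the asphalt (road centred at x = 20 m), zero outside."""
    in_box = (abs(x - 20.0) <= 7.5) and (0.1 <= y <= 0.5)
    return float(np.interp(t_now % 1.0, t, m)) if in_box else 0.0

print(np.trapz(m, t), source(20.0, 0.3, 0.5))
```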
3. measurements

the no measurements used in the simulation cover a period from 2012 to 2017, where no was measured in 30-minute intervals. we clean the data by limiting ourselves to a specific period, from september 1st to december 10th: 101 measurements outside of the holiday period, with an average intensity of the sun (between summer and winter). we interpolate this data to obtain the emission profile of an average day. data in this period from 2017 corresponds to the newly installed photocatalytic asphalt. to compare this scenario with the initial situation, we use the measured data from 2016. to assert that the traffic intensity remained relatively constant, we compare the measurements from 2016 to the same period in 2015. figure 4 shows comparable concentration profiles. this indicates, as a reasonable assumption, that the only major change between 2016 and 2017 was the new asphalt, given the local weather conditions incorporated in the parameter κ. figure 4 reveals a large reduction in no pollution after 2016.

figure 4. no measurements at the theodor-heuss-ring in kiel, germany. the green curve (2017) has the same shape as the other two curves, but presents a significant reduction of no levels.

4. parameter identification

as stated in section 2, we require situation-specific values for the parameters κ and γ. to obtain these, we propose and solve a parameter identification problem based on the datasets described in section 3. in a two-step procedure, we first use the measurements prior to the installation to obtain κ, knowing that in this case γ = 0, and then use the measurements after the installation to obtain γ. mathematically speaking, we wish to solve the following problem: let u( · ; κ, γ) be the solution to (1) for given parameters κ and γ and let u_r be the measurement data. find κ and γ that solve the optimization problem

min_{κ ∈ p_κ} min_{γ ∈ p_γ} ‖u_r(x, t) − u(x, t; κ, γ)‖_{l2(ω×(0,1))}.    (4)

here p_κ and p_γ are compact sets in ℝ where the parameters are searched.

5. simulation

this section describes the setup for determining the effectiveness of the photocatalytic asphalt.

5.1. simulation framework

we numerically compute the solution to (1) with (2) with a finite element simulation using the fenics library [15]. we investigate the process within the specified cross-section, leading to a two-dimensional reaction-diffusion process. the finite element mesh has 30 × 30 elements on a rectangular grid. the solution to (1) is approximated with quadratic basis functions. we simulate two scenarios: (a) the no concentration profile prior to the photocatalytic asphalt (corresponding to measurements from 2015 and 2016); and (b) the no concentration profile after the installation of the photocatalytic asphalt (corresponding to measurements from 2017). in these simulations, we use the parameters as described in section 2 and choose a diffusion coefficient of d = 43.8, which shows an agreement with the current setting. in case (a) specifically, we fix γ = 0. our goal is to use the data set from (a) to train our simulation on the value of the parameter κ. then we use the data set from (b) to determine the effect the photocatalytic asphalt has on the reduction of the local no concentration, i.e., to find γ; a sketch of this two-step search is given below.
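the two-step fit reduces to two one-dimensional searches of the discrepancy (4). the sketch below mimics it on a deliberately crude surrogate of the forward model, a scalar euler march standing in for the finite element solver of (1)–(2), so that the script is self-contained; all numerical scales in it are illustrative, not the paper's.

```python
import numpy as np

T = np.linspace(0.0, 1.0, 49)                    # half-hourly grid over one day
S = np.clip(np.sin(np.pi * T), 0.0, None)        # crude daylight shape (toy)
M = 1.0 + np.cos(2.0 * np.pi * (T - 0.5)) ** 2   # crude traffic shape (toy)

def solve_no(kappa, gamma, u0=37.0):
    """toy surrogate for the forward model: explicit euler march of
    du/dt = a_f m(t) - (kappa*c1*s(t) + gamma*c2) u; all scales illustrative."""
    a_f, c1, c2 = 5.5e3, 1.0e-4, 1.0e3
    u = np.empty_like(T)
    u[0] = u0
    for i in range(T.size - 1):
        dt = T[i + 1] - T[i]
        u[i + 1] = u[i] + dt * (a_f * M[i] - (kappa*c1*S[i] + gamma*c2) * u[i])
    return u

def rel_err(u_sim, u_meas):
    # relative l2 discrepancy of (5) on the common sampling grid
    return np.linalg.norm(u_sim - u_meas) / np.linalg.norm(u_meas)

u_2016 = solve_no(kappa=1.85e4, gamma=0.0)       # synthetic "measurements"
u_2017 = solve_no(kappa=1.85e4, gamma=3.0e-3)

kappas = np.linspace(0.0, 3.5e4, 36)             # step 1: fit kappa, gamma = 0
kappa_opt = kappas[np.argmin([rel_err(solve_no(k, 0.0), u_2016) for k in kappas])]

gammas = np.linspace(0.0, 1.0e-2, 41)            # step 2: fit gamma, kappa fixed
gamma_opt = gammas[np.argmin([rel_err(solve_no(kappa_opt, g), u_2017)
                              for g in gammas])]
print(kappa_opt, gamma_opt)                      # recovers 1.85e4 and 3.0e-3
```

for the real problem, solve_no would of course be replaced by the fenics solve of section 5.1.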
figure 5 displays this process, where for a range of κ, we plotted the relative discrepancy with the measurements. more precisely, let u_r denote the interpolation of the measurements and u_κ the result of our simulation for a specific κ; then the relative discrepancy e_κ(u) is defined as

e_κ(u) = ‖u_r − u_κ‖_{l2(ω×(0,1))} / ‖u_r‖_{l2(ω×(0,1))}.    (5)

figure 5. the relative discrepancy between the simulated no concentrations and the measurements for a range of κ values and γ = 0. the optimum is reached for κ = 1.85 × 10^4 1/[day uvi], displayed as the minimum of this curve.

having captured the nature of the process, we can estimate the effect of the photocatalytic asphalt in terms of γ by starting a new parameter identification process using the data set from the second scenario. figure 6 shows the relative discrepancy (similarly defined as in (5)) for u_γ (with an optimal value of κ). figure 5 and figure 6 suggest that we are dealing with a convex minimization problem. the simulations with optimally fitted parameters, before and after the construction of the photocatalytic asphalt, are displayed in figure 7.

figure 6. the relative discrepancy between the simulated no concentrations and the measurements for a range of γ values and κ = 1.85 × 10^4 1/[day uvi]. the optimum is reached for γ = 3.0 × 10^−3, displayed as the minimum of this curve.

figure 7. simulated no concentration profiles. the dash-dotted line corresponds to the period in 2016, the continuous line corresponds to the period in 2017.

we use the measurements of no to compare the simulation results with. figure 8 shows the measured and simulated data combined. the simulation agrees qualitatively and quantitatively, with a mass error of 58.64 µg/m^3 in the pre-photocatalytic case and 22.68 µg/m^3 in the post-photocatalytic one. this mass error is defined as

∫_0^t ∫_ω |u(x, t) − u_r(x, t)| dx dt,    (6)

where u represents the simulation solution and u_r the measurements.

figure 8. no measurements compared to simulations in the two cases. the simulations show good quantitative agreement.

6. discussion and outlook

this report shows that it is possible to use a mathematical model like the one presented in section 2 to describe the evolution of the no concentration as a function of time in an urban environment where both intense motorized vehicle traffic and photocatalytic asphalt are present. a number of things can be done to continue this investigation and improve the quantitative predictions obtained in this study.

• perform a 3d simulation along the whole length of the photocatalytic asphalt.
• account for the presence of uncertainties in the weather conditions (especially the variation in the uv radiation and the effect of precipitation).
• numerically identify scenarios leading to an extreme no pollution.
• by neglecting wind flux, we emphasize that our interest lies in quantifying the reactive part of this reaction-diffusion process. this approach allows us to identify two extreme scenarios: (1) excellent performance of the asphalt, and (2) a low response of the no pollution to the presence of the asphalt. including more detailed wind effects is certainly of interest (both from a theoretical and a practical perspective), since it is known that the location under investigation (kiel, germany) is prone to varied weather conditions.
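for completeness, the error measures used in section 5 are easy to evaluate once the simulated and the measured profiles are sampled on a common grid. a minimal numpy sketch of the mass error (6) on a tensor space-time grid follows; the grid sizes are illustrative only.

```python
import numpy as np

def mass_error(u_sim, u_meas, x, y, t):
    """space-time l1 mass error of (6): the integral over t and omega of
    |u - u_r|, computed with an iterated trapezoidal rule."""
    diff = np.abs(u_sim - u_meas)          # arrays of shape (nt, ny, nx)
    over_x = np.trapz(diff, x, axis=2)
    over_xy = np.trapz(over_x, y, axis=1)
    return np.trapz(over_xy, t, axis=0)

# toy check: a constant discrepancy of 1 over a 40 m x 8 m domain and one day
x = np.linspace(0.0, 40.0, 41)
y = np.linspace(0.0, 8.0, 9)
t = np.linspace(0.0, 1.0, 25)
u1 = np.zeros((t.size, y.size, x.size))
u2 = u1 + 1.0
print(mass_error(u1, u2, x, y, t))         # 40 * 8 * 1 = 320
```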
7. acknowledgements

we gratefully thank the state agency for agriculture, environment and rural areas schleswig-holstein (llur) for providing the data measured at the theodor-heuss-ring in kiel, germany. the authors acknowledge the very valuable input from the side of the referees regarding the shaping of the final form of the manuscript.

8. appendix

for x ∈ ω and t ∈ [0, t), we consider the following equation:

∂u/∂t − ∇ · (d∇u) = f(x, t) − κ s(t) u(x, t).    (7)

let us also introduce the following rescalings:

x̄ = x/l, t̄ = t/t_r, ū = u/u_r, f̄ = f/f_r and s̄ = s/s_r,

where f_r = max_{(x,t) ∈ ω×[0,t]} |f(x, t)|. rewriting (7) using these rescaled functions and variables yields

∂ū/∂t̄ − (t_r d / l^2) ∇̄ · (∇̄ū) = (t_r f_r / u_r) f̄ − κ t_r s_r s̄ ū.    (8)

this equation is posed in a rescaled space domain ω̄ = ω/l. choosing t_r = l^2/d reduces (8) to the form

∂ū/∂t̄ − ∇̄ · (∇̄ū) = (l^2 f_r / (d u_r)) f̄ − κ (l^2 s_r / d) s̄ ū.    (9)

setting a_f = l^2 f_r / (d u_r), a_s = l^2 s_r / d and abusing the notation, namely writing ω instead of ω̄, u instead of ū, f instead of f̄, and s for s̄, (9) posed in ω̄ gives (1) posed in ω. a similar discussion yields the structure of the initial and boundary conditions in a non-dimensional form.

references

[1] w. fan, k. y. chan, c. zhang, m. k. leung. advanced solar photocatalytic asphalt for removal of vehicular nox. energy procedia 143:811–816, 2017. doi:10.1016/j.egypro.2017.12.767.
[2] g. hüsken, m. hunger, h. brouwers. experimental study of photocatalytic concrete products for air purification. building and environment 44(12):2463–2474, 2009. doi:10.1016/j.buildenv.2009.04.010.
[3] m. hunger, h. brouwers, m. ballari. photocatalytic degradation ability of cementitious materials: a modeling approach. in w. sun, k. van breugel, c. miao, et al. (eds.), proceedings of the 1st international conference on microstructure related durability of cementitious composites, 13–15 october 2008, nanjing, china, pp. 1103–1112. 2008.
[4] m. ballari, m. hunger, g. hüsken, h. brouwers. modelling and experimental study of the nox photocatalytic degradation employing concrete pavement with titanium dioxide. catalysis today 151(1):71–76, 2010. 2nd european conference on environmental applications of advanced oxidation processes (eaaop2). doi:10.1016/j.cattod.2010.03.042.
[5] j. sikkema, s. ong, j. alleman. photocatalytic concrete pavements: laboratory investigation of no oxidation rate under varied environmental conditions. construction and building materials 100:305–314, 2015. doi:10.1016/j.conbuildmat.2015.10.005.
[6] g. l. guerrini. photocatalytic performances in a city tunnel in rome: nox monitoring results. construction and building materials 27(1):165–175, 2012. doi:10.1016/j.conbuildmat.2011.07.065.
[7] j. ângelo, l. andrade, l. m. madeira, a. mendes. an overview of photocatalysis phenomena applied to nox abatement. journal of environmental management 129:522–539, 2013. doi:10.1016/j.jenvman.2013.08.006.
[8] j. lasek, y.-h. yu, j. c. wu. removal of nox by photocatalytic processes. journal of photochemistry and photobiology c: photochemistry reviews 14:29–52, 2013. doi:10.1016/j.jphotochemrev.2012.08.002.
[9] t. ochiai, a. fujishima. photoelectrochemical properties of tio2 photocatalyst and its applications for environmental purification. journal of photochemistry and photobiology c: photochemistry reviews 13(4):247–262, 2012. doi:10.1016/j.jphotochemrev.2012.07.001.
[10] a. i. sukhinov, d. s. khachunts, a. e. chistyakov. a mathematical model of pollutant propagation in near-ground atmospheric layer of a coastal region and its software implementation. computational mathematics and mathematical physics 55(7):1216–1231, 2015. doi:10.1134/s096554251507012x.
[11] a. muntean. continuum modeling: an approach through practical examples. springer, 2015.
[12] j. targa, a. loader. diffusion tubes for ambient no2 monitoring: practical guidance. tech. rep., aea energy & environment, 2008.
[13] federal ministry of transport and digital infrastructure. federal highway research institute (bast). http://www.bast.de/de/home/home_node.html. accessed: 2018-01-16.
[14] global radiation group. earth system research laboratory; global monitoring division. https://www.esrl.noaa.gov/gmd/grad/solcalc/. accessed: 2018-01-16.
[15] m. alnæs, j. blechta, j. hake, et al. the fenics project version 1.5. archive of numerical software 3(100), 2015. doi:10.11588/ans.2015.100.20553.

acta polytechnica vol. 43 no. 6/2003

a direct algorithm for pole placement by state-derivative feedback for single-input linear systems

taha h. s. abdelaziz, m. valášek

this paper deals with the direct solution of the pole placement problem for single-input linear systems using state-derivative feedback. this pole placement problem is always solvable for any controllable system if all eigenvalues of the original system are nonzero. then any arbitrary closed-loop poles can be placed in order to achieve the desired system performance. the solving procedure results in a formula similar to the ackermann formula. its derivation is based on the transformation of a linear single-input system into frobenius canonical form by a special coordinate transformation, then solving the pole placement problem by state-derivative feedback. finally, the solution is extended also to single-input time-varying control systems. simulation results are included to show the effectiveness of the proposed approach.

keywords: pole placement, state-derivative feedback, linear single-input systems, feedback stabilization.

1 introduction

an important problem in the theory and practice of control system design is the design of feedback controllers which place the closed-loop poles of a linear system at desired locations. the literature in this field is quite rich. the state feedback control problem has been given a lot of attention in the control community during the last three decades. several researchers have developed design methods for a wide class of linear systems under full-state feedback with the objective of stabilizing control systems (e.g. [1–10]). in designing control systems based on pole placement, it may be satisfactory in practice that the closed-loop system has all poles at a desired location. however, this paper focuses on a special feedback using only state derivatives instead of full state feedback; this feedback is therefore called state-derivative feedback. the problem of arbitrary pole placement using full state-derivative feedback naturally arises. to the best of the authors' knowledge, there has not yet been any general study solving this pole placement problem for such feedback.

the motivation for the state-derivative feedback comes from controlled vibration suppression of mechanical systems. the main sensors of vibration are accelerometers. from accelerations it is possible to reconstruct velocities with reasonable accuracy, but not the displacements. therefore the available signals for feedback are accelerations and velocities only, and these are exactly the derivatives of the states of mechanical systems, which are the velocities and displacements. one necessary condition for a control strategy to be implementable is that it must use the available measured responses to determine the control action. all of the previous research in control has assumed that all of the states can be directly measured (i.e., full state feedback). to control this class of systems, many papers have been published (e.g. [11–16]) describing acceleration feedback for controlled vibration suppression. however, the pole placement approach for feedback gain determination has not been used at all, or has not been solved generally. the approach in [11–14] is based on dynamic derivative output feedback. the feedback uses acceleration only (the velocity is not used, therefore it is not full state-derivative feedback, but only output derivative feedback) and the acceleration is processed by a dynamic filter (dynamic feedback). the feedback gains are determined using root locus analysis [11–15], optimization of the h2 norm of the closed-loop transfer function [14], or just numerical parameter optimization of performance indices [16]. other papers dealing with acceleration feedback for mechanical systems are [17–18], but there the feedback uses all states (positions, velocities) and accelerations additionally. recently, paper [19] has presented a nonlinear controller based on state-derivative feedback control for a magnetic bearing; the state-derivative feedback is used there for feedback linearization and not for pole placement.

in this paper the problem of pole placement by state-derivative feedback for single-input linear systems, both time-invariant and time-varying, is generally formulated and solved. the solution is based on recent efficient techniques for solving the pole placement problem by state feedback for siso and mimo linear time-invariant and time-varying systems [8, 9, 10]. it uses the transformation of a linear system into frobenius canonical form and results in different versions of ackermann's formula. this methodology is also utilized in this paper.

in summary, this paper is organized as follows. in section 2, we begin with the general formulation of the problem, followed by the solution for single-input linear time-invariant systems. section 3 deals with the extension of this pole placement to time-varying systems. in section 4, illustrative examples and simulation results for several systems are presented. finally, conclusions follow in section 5.

2 pole placement by state-derivative feedback for linear time-invariant systems

pole placement by state-derivative feedback is solved in this section for linear time-invariant systems using the technique from [8, 9, 10].

2.1 pole placement problem formulation

consider a controllable linear time-invariant single-input system,

ẋ(t) = a x(t) + b u(t), x(t_0) = x_0,    (1)

where x(t) ∈ ℝ^n and u(t) ∈ ℝ are the state vector and the scalar control input, respectively, while a ∈ ℝ^{n×n} and b ∈ ℝ^n are the system matrix and control gain vector, respectively. the characteristic polynomial of matrix a can be given by

det(s i − a) = s^n + a_{n−1} s^{n−1} + ⋯ + a_1 s + a_0,    (2)

where a = [a_0, a_1, …, a_{n−1}] are the coefficients of the characteristic polynomial and i ∈ ℝ^{n×n} is the identity matrix. it is known that a_0 = (−1)^n det(a) and a_{n−1} = −trace(a). the objective is to place the desired poles of the closed-loop system using the constant state-derivative feedback control

u = −k ẋ,    (3)

that enforces a desired characteristic behavior of the states and thus stabilizes the system. using the feedback (3), the closed-loop system becomes

ẋ = (i + b k)^{−1} a x.    (4)

the design problem is to find the feedback gain matrix k such that the closed-loop poles of system (4), λ_1, λ_2, …, λ_n, satisfying spec((i + b k)^{−1} a) = {λ_1, λ_2, …, λ_n}, λ_i ≠ 0, are assigned at the desired values. a solution of this problem using the procedures of state feedback is difficult, and a direct approach usually leads to iterative optimization. the solution is accomplished here by transforming the system into the frobenius canonical form and placing the desired eigenvalues there.

2.2 transformation into frobenius canonical form for time-invariant systems

the frobenius canonical form is constructed by transforming the state vector to a new coordinate system in which the system equation takes a particular form. let us take the following time-invariant state transformation,

z(t) = q^{−1} x(t),    (5)

where z(t) ∈ ℝ^n is the new (transformed) state variable vector and q^{−1} ∈ ℝ^{n×n} is the transformation matrix. the original system is transformed into a system with the transformed system matrix a_f ∈ ℝ^{n×n} and the transformed control gain vector b_f ∈ ℝ^n, which are given by

a_f = q^{−1} a q, b_f = q^{−1} b.    (6)

the transformation matrix is chosen as

q^{−1} = [ q_1 ; q_1 a ; ⋮ ; q_1 a^{n−1} ],    (7)

where the row vector q_1 ∈ ℝ^{1×n} is computed as

q_1 = e_n^T r^{−1},    (8)

from the controllability matrix r ∈ ℝ^{n×n} of system (1),

r = [ b, a b, a^2 b, …, a^{n−1} b ],    (9)

where e_n = [0, …, 0, 1]^T is a unit vector. the system is then transformed into the frobenius canonical form

ż = a_f z + b_f u,
a_f = [ 0 1 0 ⋯ 0 ; 0 0 1 ⋯ 0 ; ⋮ ; 0 0 0 ⋯ 1 ; −a_0 −a_1 −a_2 ⋯ −a_{n−1} ], b_f = [ 0, …, 0, 1 ]^T.    (10)

the system is thus reduced to a simple and convenient form that can be easily manipulated, in which we can solve the pole placement problem by state-derivative feedback. if the transformation matrix is nonsingular, then the transformation to the generalized frobenius canonical form can be made.
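a numerical sketch of the construction (5)–(10): assemble the controllability matrix r, extract q_1 = e_n^T r^{−1}, stack the rows q_1 a^i, and verify that q^{−1} a q comes out in frobenius form. the 4 × 4 matrices below are randomly generated test data, not one of the systems of section 4.

```python
import numpy as np

def frobenius_transform(A, b):
    """rows of q^{-1} per (7)-(9): q1 = e_n^T R^{-1}, then q1 A^i."""
    n = A.shape[0]
    R = np.column_stack([np.linalg.matrix_power(A, i) @ b for i in range(n)])
    q1 = np.linalg.solve(R.T, np.eye(n)[-1])      # q1 = e_n^T R^{-1}
    return np.vstack([q1 @ np.linalg.matrix_power(A, i) for i in range(n)])

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
b = rng.standard_normal(4)

Qinv = frobenius_transform(A, b)
Q = np.linalg.inv(Qinv)
Af, bf = Qinv @ A @ Q, Qinv @ b
print(np.round(Af, 6))   # companion form: ones above the diagonal,
                         # last row = -[a0, a1, a2, a3]
print(np.round(bf, 6))   # e_n = [0, 0, 0, 1]
```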
2.3 solution of the pole placement problem for time-invariant systems

in this subsection, we find the state-derivative feedback gain matrix k that assigns the desired closed-loop poles in a computationally efficient and simple manner. utilizing the above transformation into the frobenius canonical form, the system can be manipulated by a linear feedback for a desired behavior. by differentiating the transformation equation (5), the resulting closed-loop system in the z-coordinates is

ż(t) = q^{−1} ẋ(t).    (11)

hence, after the substitution of (4) and (5) into the above equation, we obtain

ż = q^{−1} (i + b k)^{−1} a q z = a_z z,    (12)

where the closed-loop system matrix a_z ∈ ℝ^{n×n} is given by

a_z = q^{−1} (i + b k)^{−1} a q.    (13)

inverting equation (13), we can rewrite it as

a_z^{−1} = q^{−1} a^{−1} (i + b k) q.    (14)

given the desired eigenvalues {λ_1, …, λ_n}, the desired closed-loop characteristic polynomial is

d(s) = (s − λ_1)(s − λ_2) ⋯ (s − λ_n) = s^n + d_{n−1} s^{n−1} + ⋯ + d_1 s + d_0,    (15)

where d = [d_0, d_1, …, d_{n−1}] are the coefficients of the desired characteristic polynomial. the structure of the desired closed-loop matrix can be written in a canonical form as

a_z = [ 0 1 0 ⋯ 0 ; 0 0 1 ⋯ 0 ; ⋮ ; 0 0 0 ⋯ 1 ; −d_0 −d_1 −d_2 ⋯ −d_{n−1} ].    (16)

equation (14) can be rewritten in terms of the row vectors q_i = q_1 a^{i−1} of the transformation matrix q^{−1} as

q_{i+1} a^{−1} (i + b k) q = e_i^T, i = 1, …, n − 1, and
q_1 a^{−1} (i + b k) q = −(1/d_0) (d_1 e_1^T + d_2 e_2^T + ⋯ + d_{n−1} e_{n−1}^T + e_n^T),    (17)

the right-hand sides being the rows of the inverse of the desired companion matrix (16).
the resulting equivalent efficient formula based on desired coefficients of the characteristic polynomial is the following recursive procedure k q q� � �� � � �� � � � � � �� � � � �� � �ad dn i i i n 0 0 0 1 , (26) where � � � � q q a e a r0 1 1 1 n t , � � � q q ai i 1 , i � 1, …, n. it is clear that the computation with row vectors is more efficient than with the full square matrices. now, if the stabilizing feedback control defined by a set of desired eigenvalues �i, i � 1, …, n, instead of the coefficients di of the characteristic equation. then, the feedback gain matrix is � � � � � k q a a i a e ar a � � � � � � � � � a i i n i i n i i n n 0 1 1 1 1 1 1 � � � det t � � � � � � � � � � � � � � � � � i i n i i n n i i n i e r a a i 1 1 1 1 1 t adj (27) and again utilizing the above simplification (26), the above equation can be written as k q� � �� a i i n n 0 1 � , (28) where � � � � q q a e ar0 1 1 1 n t , � �� � � q q a ii i i1 � , i � 1, …, n. one can easily note that the proposed algorithm is simple and easy to implement. in addition, the state-derivative feedback gain matrix calculations are not done in the intermidiate domain and direct implementation is performed in the original state space. we do not need to compute the transformation into the generalized frobenius form. one of the main advantages of the transformation matrix is the posibility to easily derive an explicit analytic expression for the feedback gain matrix. the above algorithm is valid for desired eigenvalues that are either real or complex-conjugate poles. from the derivation of the state-derivative feedback pole placement the necessary and sufficient conditions for arbitrary pole placement can be described. for the transformation into the frobenius canonical form and/or computing formulas (25)–(28) the controllability matrix r must be of full rank, i.e. the original system must be controllable as for traditional state feedback. in addition, the coefficient a0 must be non-zero. using our knowledge about the coefficients of the characteristic polynomial the coefficients is equivalent to the condition that a oi i n 0 1 0� � � � � (29) where here �oi, (i � 1, …, n), are the original poles of the system (1). this means that all the original poles must be non-zero. this is equivalent according to [20, 1.1.7] to the 54 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 43 no. 6/2003 condition that the system matrix a of the original system (1) is nonsingular (it has full rank n). based on the above necessary and sufficient conditions, we are now in the position to present the first main result of the present work. theorem 1: consider the controllable single-input time-invariant linear system of (1). if matrix a is non-singular, then the poles of system (1) can be arbitrarily placed in the desired places by the state-derivative feedback (3) using the constant feedback gain matrix k computed by one of the formulas (25)–(28). it is clear that this solution is general, it requires no iteration and can be easily directly computed. this solves the pole placement for a linear time-invariant system by state-derivative feedback. another interesting feature is that in many cases the state-derivative feedback gains (26)-(28) are in absolute values smaller than the traditional state feedback gains [8, 9, 10] if the inverse of system matrix a reduces the gains in (26)–(28). the state-derivative feedback is a derivative feedback. 
3 pole placement by state-derivative feedback for linear time-varying systems

the above methodology can be extended to general single-input controllable linear time-varying systems

ẋ(t) = a(t) x(t) + b(t) u(t), x(t_0) = x_0,    (30)

where x(t) ∈ ℝ^n and u(t) ∈ ℝ are the state vector and the scalar control input, respectively, while a(t) ∈ ℝ^{n×n} and b(t) ∈ ℝ^n are the system matrix and control gain vector, respectively. a sufficient condition for the existence of a unique solution is that all elements of a(t) be continuous in the time interval of interest, t ∈ [t_0, ∞). the objective here is to find a time-dependent linear feedback gain matrix that will stabilize the system. this system can be stabilized by the varying state-derivative feedback control law

u(t) = −k(t) ẋ(t),    (31)

and the closed-loop system can be written as

ẋ = (i + b(t) k(t))^{−1} a(t) x.    (32)

the objective now is to construct the varying feedback gain matrix k(t) in order to stabilize the system. in this treatment, we again utilize the frobenius transformation as an intermediate step to simplify the pole placement problem. let us take the following time-varying state transformation:

z = q^{−1}(t) x, x = q(t) z.    (33)

then the system is transformed to the frobenius canonical form and the dynamic system matrices can be computed as

a_f(t) = q^{−1} (a q − q̇), b_f = q^{−1} b,    (34)

where a_f(t) ∈ ℝ^{n×n} and b_f ∈ ℝ^n are the transformed system matrix and control gain vector, respectively. the transformed system is the same as (10), however with a(t) = [a_0(t), a_1(t), …, a_{n−1}(t)] being the time-varying coefficients. note that the eigenvalues of a time-varying dynamic system do not have any meaning regarding its behaviour or its stability features. the state transformation matrix q^{−1}(t) ∈ ℝ^{n×n} can be calculated as

q^{−1}(t) = rows(q_1, q_2, …, q_n),    (35)

where the rows q_i ∈ ℝ^{1×n} are computed recursively as

q_1 = e_n^T r^{−1}, q_{i+1} = q_i a + q̇_i, i = 1, …, n − 1.    (36)

the controllability matrix for the time-varying system, r(t) ∈ ℝ^{n×n}, is formed as

r(t) = [ r_1, r_2, …, r_n ],    (37)

where the columns r_i ∈ ℝ^n can be computed algebraically using the recursion

r_1 = b, r_{i+1} = a r_i − ṙ_i, i = 1, …, n − 1.    (38)

if q(t), q^{−1}(t) and q̇(t) are continuous and bounded matrices and q^{−1}(t) has full rank on the time interval of interest, t ∈ [t_0, ∞), then this transformation is called a lyapunov transformation. note that a lyapunov transformation means that the transformation from one system to the other preserves the property of stability. consequently, this ensures that we can stabilize the time-varying system by means of placing the poles of the lyapunov-equivalent linear time-invariant system. assuming that the above transformation to the frobenius canonical form is of the lyapunov kind, the pole placement technique from the time-invariant case can be extended to the time-varying case.
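before completing the derivation, note that the recursions (35)–(38) are convenient to set up symbolically. a small sympy sketch for a generic 2 × 2 time-varying system follows; the entries are arbitrary symbols, not the system of example 3.

```python
import sympy as sp

t = sp.symbols('t')
a11, a12, a21, a22, b1, b2 = [sp.Function(s)(t) for s in
                              ('a11', 'a12', 'a21', 'a22', 'b1', 'b2')]
A = sp.Matrix([[a11, a12], [a21, a22]])
b = sp.Matrix([b1, b2])

# controllability columns of (37)-(38): r1 = b, r_{i+1} = A r_i - r_i'
r = [b]
for _ in range(A.rows - 1):
    r.append(A * r[-1] - sp.diff(r[-1], t))
R = sp.Matrix.hstack(*r)

# rows of Q^{-1} per (35)-(36): q1 = e_n^T R^{-1}, q_{i+1} = q_i A + q_i'
q = [(R.inv().T * sp.Matrix([0] * (A.rows - 1) + [1])).T]
for _ in range(A.rows - 1):
    q.append(q[-1] * A + sp.diff(q[-1], t))

# sanity check: q_i b = 0 for i < n and q_n b = 1
print([sp.simplify((qi * b)[0]) for qi in q])
```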
by differeintiating the transformation equation (33) and substitute (32), the resulting closed-loop system in the z-coordinate is, � � � � � � � z q x q x q q i bk a q z a z � � � � � � 1 1 1 1 1+ ,z (39) where az n n� �r is the closed-loop system matrix and is given as (16), and can be computed as � � �a q q i bk a qz +� � � 1 1 1 . (40) hence, we can rewrite the above equation as � � � a q q a i bk qz + �1 1 1 1� . (41) applying the same procedure for the time-invariant system, it is easy to write the n equations describing the system in terms of the row vectors qi, i � 1, …, n of � �q 1 t as, � � � q q a i bk qi i i+� �1 1� , i � 1, …, n – 1 and � � � � d q q a i bk q1 1�n n+ , (42) then, the feedback gain matrix k(t) for the time-varying system can be written as © czech technical university publishing house http://ctn.cvut.cz/ap/ 55 acta polytechnica vol. 43 no. 6/2003 � � � � � �k dq q a b q dq q at n n n� ��� � �� � �� �� � �� 1 1 1 1 1 � � . (43) the feedback gain matrix k(t) can be rewritten as � �k q q a bt dn i i i n � � � � �� � � � �� � � � � � � � � � �� � � �1 1 1 0 1 1 1 q q an i i i n d� � � � � � �� � � � ���1 1 0 1 1. (44) next, we consider a stabilizing feedback control defined by the desired eigenvalues {�1, …, �n}. the gain matrix is directly computed by an efficient numerical algorithm as � � � �k q a b q at n n� � �� � 1 1 1 1 1 1, (45) where � � � q q e r1 1 1 n t , � �� � � ��q q a i qi i i i1 � � , i � 1, …, n . from the derivation of the state-derivative feedback pole plaement the necessary and sufficient conditions for arbitrary pole placement can be described. for the transformation into the frobenius canonical form and/or computing formulas (34)–(38) the controllability matrix r(t) and the transformation matrix � �q 1 t must be of full rank, the matrices � �q t , � �q 1 t , � ��q t are continuous and bounded in order for the transformation be of the lyapunov kind. in addition, matrix a(t) is continuous, nonsingular, and its inverse is bounded. finally, the term . everything must be valid at the time interval of interest, � t t� �0, . now, the following theorem can be presented for the single-input time-varying control systems. theorem 2: consider the controllable single-input time-varying linear system of (30). if matrix a(t) is continuous, nonsingular and its inverse is bounded. furthermore, the transformation matrix � �q 1 t is a transformation of a lyapunov kind and the term � �� q a bn 1 1 1 for the time interval of interest � t t� �0, . then the poles of the system (30) can be arbitrarily placed in the desired places by the state-derivative feedback (31), using the time-varying feedback gain matrix k(t) computed by one of the formulas (44)–(45). with the above development, it is clear that this solution is general, it requires no iteration and can be relatively easily directly computed. we do not need to complete the transformation into a generalized frobenius form. furthermore, the characteristic polynomial coefficients or eigenvalues do not have to be calculated. this solves the pole placement for a linear time-varying system by state-derivative feedback. the procedure defined here represents a unique treatment for state-derivative feedback in the literature. 4 illustrative examples in this section, simulation results are given to demonstrate the feasibility and effectiveness of the proposed pole placement algorithm by state-derivative feedback. example 1 the configuration of the mechanical system and its parameters are shown in fig. 1. 
the dynamic equation of this system can be described in the state-space form as �x � 0 0 1 0 0 0 0 1 1 2 1 2 1 1 2 1 2 1 2 2 2 2 2 2 k k m k m b b m b m k m k m b m b2 2 1 2 0 0 1 1 m m m � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � x where m1 and m2 are the first and second mass, k1 and k2 are the spring constants, b1 and b2 are the damper constants, x1 and x2 are the first and second mass vertical displacement, the state vector � x � x x x x1 2 1 2� � , and u is the control input. the model parameters are taken as m1 � 100 kg; m2 � 10 kg; k1 � 360 kn/m; k2 � 36 kn/m; b1 � 70 n�s/m, and b2 � 50 n�s/m. the transformation matrix q � 1 0 02778 0 00278 0 0 019444 0 0 02778 0 00278 100 0 0 0 0 . . . . . 0 100 0 � � � � � � � � � � � while the equivalent frobenius canonical form has a state description as � . . . x � � � � � � � � � 0 1 0 0 0 0 1 0 0 0 0 1 1296 10 20520 7563 5 6 27 � � � � � � � � � � � � � � � � z 0 0 0 1 u , with the coefficients of the characteristic polynomial � a � �1296 10 20520 7563 5 6 27. , , . , . . for this system the open-loop poles are –2.1835 � 70.1294 i and –0.9165 � 51.3006 i. the desired closed-loop poles are selected as –5 � 2 i and – 1 0 � 5 i. let us apply the control synthesis procedure of pole placement from the previous sections to this system. the computed state-derivative feedback gain matrix is k � �105 [ –105.19, 0.18117, –3.2225, 0.0349]. 56 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 43 no. 6/2003 m1 k1 b1 x1 b2 x2 k2 m2 u fig. 1: mechanical system to verify the design a simulation was performed. the initial conditions of the states are taken as � � � x t0 0 5 0 5 0 2 0 2� . , . , . , . t. the transient responses of states and control input are displayed in fig. 2. for a comparison, the ackermann gain matrix for the same desired system poles is k s � �10 5 [ 3.6074, –0.3599, –0.02829, –0.00045 ]. the stabilized results for the same initial conditions are plotted in fig. 3. they are the same as with state-derivative feedback in both transient response and control input. example 2 consider the familiar ball and beam system. fig. 4 presents the configuration of this system and its parameters. the dynamic equations of the linearized state space model for small motions about � 0.0, and r � r0 is �x � � � � � � � � � 0 1 0 0 0 0 0 0 0 1 0 0 0 0 2 0 2 2 2 b m r i m g m r i m g m ib � � � � � � � � � � � � � � � � � � � � � � � � � x 0 1 0 0 0 2m r i u where m, ib, and are the mass, moment of inertia and radius of the ball, respectively, i is the beam moment of inertia, g is the gravitational acceleration, b is the coefficient of viscous friction opposing the beam rotation, is the angle that the beam makes with the horizontal, r is the distance between the center of the ball and center of rotation of the beam, the state vector � x � � �r r , and u is the control torque input. in this treatment, we assume that the ball is frictionless rolling along the beam. the control objective is to balance the ball at a distance from the central pivot point by tilting the beam back and forth using the motor. the values of various parameters of the model are m � 1 kg; � 0.05 m; r0 � 0.1 m; g � 9.81 m/s2; b � 0.1 n�s/m, ib � 5 kg�m 2, and i � 1 kg�m2. the characteristic polynomial coefficients are a � [ –0.0479, 0.0, 0.0, 0.020]. the original poles are, +0.4629, –0.4729 and –0.005 � 0.4678 i. obviously, the open loop system is unstable. the desired closed-loop poles are –7 � 2 i and –10 � 5 i. 
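before the time-varying example, the time-invariant examples can be checked numerically. the sketch below rebuilds a two-mass model with the physical parameters of example 1 and verifies the closed-loop poles; the state ordering x = [x1, x2, ẋ1, ẋ2]^T and the choice of the input column (an internal force acting between the two masses) are our assumptions, so the computed gain need not reproduce the printed k entry by entry.

```python
import numpy as np

m1, m2 = 100.0, 10.0           # masses [kg]
k1, k2 = 360e3, 36e3           # spring constants [n/m]
b1, b2 = 70.0, 50.0            # damper constants [n s/m]

# assumed state ordering x = [x1, x2, x1dot, x2dot]; input column assumed
A = np.array([[0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0],
              [-(k1 + k2)/m1,  k2/m1, -(b1 + b2)/m1,  b2/m1],
              [  k2/m2,       -k2/m2,   b2/m2,       -b2/m2]])
b = np.array([0.0, 0.0, -1.0/m1, 1.0/m2])

poles = [-5 + 2j, -5 - 2j, -10 + 5j, -10 - 5j]

# gain of (25): k = (a0/d0) e_n^T R^{-1} A^{-1} d(A)
n = 4
R = np.column_stack([np.linalg.matrix_power(A, i) @ b for i in range(n)])
a0 = (-1) ** n * np.linalg.det(A)
d0 = np.prod([-p for p in poles]).real
dA = np.eye(n)
for p in poles:
    dA = dA @ (A - p * np.eye(n))
K = (a0 / d0) * np.linalg.solve(R.T, np.eye(n)[-1]) @ np.linalg.solve(A, dA)
K = np.real_if_close(K)

Ac = np.linalg.solve(np.eye(n) + np.outer(b, K), A)
print(np.sort_complex(np.linalg.eigvals(Ac)))   # -> the prescribed poles
```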
the state-derivative feedback gain matrix is computed as, � k � 01012 5 01 4 1609 0 6782. , . , . , . . taking the initial states as � � � x t0 01 01 0 05 0 05� . , . , . , . t. the simulation results are shown in fig. 5. compared with the ackermann formula, the gain matrix for the same desired system poles is � k s � � 10 0 023 0 002 13 568 5 755 5 . , . , . , . . similar to the results of the previous example, the performance of the two cases are the same in transient response and control input. © czech technical university publishing house http://ctn.cvut.cz/ap/ 57 acta polytechnica vol. 43 no. 6/2003 fig. 2: transient response and control input of the system controlled by state-derivative feedback fig. 3: transient response and control input of the system controlled by the ackermann formula i m, ib, r u r q fig. 4: ball and beam system fig. 5: transient response and control input of the system controlled by state-derivative feedback example 3 consider the dynamic equation of the single-input time-varying system � � � �� . . . . . . . .x xt e e t t t� � � � � � � � � � � 01 0 2 0 01 0 2 01 01 0 01 0 0 � �1 0 � � � � � � � � � u t . this system is unstable and the zero-input transient response of an open-loop system are shown in fig. 6. to stabilize the above system, the pole placement technique is used. first, the controllability matrix and its inverse are computed as � � � � r t t � � � � � �� � � � � �� 0 0 02 0 002 2 01 0 02 0 006 0 0 0 002 . . . . . . e , and � � � � � �r � � � � � � � � � � � � � 1 10 10 10 1 50 0 50 2 0 0 500 t t t e e . it is clear that the controllability matrix is a full rank in the time interval of interest � t t� �0, . this means the system is completely state controllable. now, the transformation matrix can be computed as follows. the rows qi, i � 1, …, n are computed as � �q e r1 1 0 0 500� � n t , � �q q a q2 1 1 50 0 50� � � � , and � �� �q q a q3 2 2 5 1 10 5� � � � � e t . then, the transformation matrix, inverse, and derivative are � � � � q � � � � � �� � � � � �� 1 0 0 500 50 0 50 5 1 10 5 t te , � � � �q t . . t t� � � � � � � � 0 002 0 02 0 0 001 0 01 1 01 0 002 0 0 . . . . e e � � � , � ��q � � � � � � � � � � 1 0 0 0 0 0 0 5 0 0 t te , and � �� . .q t t t� � � � � � � � � � 0 0 0 0 001 0 001 0 0 0 0 e e . these matrices are continuous and bounded and the transformation matrix has a full rank at the time interval of interest � t t� �0, . then this transformation is a lyapunov transformation and the proposed pole placement technique can be applied. the transformed system matrix � �af t and the control gain vector bf are � � � � � � � � a q a q qf t t t t � � � � � � 1 0 1 0 0 0 1 0 01 0 2 013 01 3 � . . . .e e e�� � � � � �� , and b q bf � � � � � � � � � � � 1 0 0 1 , with the time-varying coefficients � � � � � �� �a t t t t� � 0 01 0 2 013 01 3. . . .e e e . 58 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 43 no. 6/2003 fig. 6: zero-input transient response of an open-loop system the computation of the feedback gain matrix for the desired eigenvalues �1,�2,�3 can be done as follows. first, the rows �q1, i � 1, …, n+1 are � �� � � q q1 1 0 0 500 , � � � �� �� � � � � q q a i q2 1 1 1 150 0 500 01� �� . , � � � � � �� � � � � � q q a i q3 2 2 2 1 2 150 01 01 10 500 01 01� � � �� . . . .e t � �� � �1 , and � � � � �� �� � � � � � � � � q q a i q4 3 3 3 2 1 2 30 5 4 5 5 1 50� � � � �� . .e e et t t � � � � � � 1 2 1 3 2 3 1 2 3 1 15 10 3 500 01 0 � � � � � � � � � � � � � � � . , , . 
.e et t � �� ��1 012 3 � �. taking the desired eiginvalues and the same initial conditions for the state-derivative feedback, the simulation results indicate the same transient performance and control input with lower gains. the elements of the gain matrix are displayed in fig. 9. based on the simulation analysis above, we note the great reduction in the state-derivative feedback gain matrix compared to the well-known state feedback approach, with the same performance for time-invariant and time-varying systems. this paper shows how the pole placement approach can be used to design a controller-based state-derivative feedback control which yields a closed-loop system with specified characteristics. the approach is relevant for design with perservation of stability when some necessary and sufficient conditions are provided. compared with a well-known state feedback, the state-derivative feedback controller in some cases achieves the same performance with a lower gain. © czech technical university publishing house http://ctn.cvut.cz/ap/ 59 acta polytechnica vol. 43 no. 6/2003 then, we can compute � �� � � � � � q a4 1 1 2 3 1 2 3 155 5000 50 5 50 5000e e t t � � � � � � �,� � � � � � � � � � � � � � � 2 3 2 1 2 3 1 2 3 1 2 10 50 5000 1 50 500 � � � � � � , e et t � � �� � � �1 3 2 3 5� , and � � � � q a b4 1 1 2 35 500 1e t � � � . finally, the feedback gain matrix is � � � � � �k q a b q at t t � � � � � � � � 1 10 10 1 100 4 1 4 1 1 2 3 1 2 e e � � � � � �3 1 2 3 1 2 3 1 2 10 2 100 10 1000 100 � � � � � , , e e t t � � � � � � � � �� � � �1 3 2 3 1 2 3 1 2 3 10 1 100 � � � � � � � � � � � � � � � � � � e t . our selection of the desired closed-loop poles are –5 � 2 i and –7. applying the pole placement technique to this system, the initial condition of the states is selected as � � � x t0 1 1 2� , , t. the transient response and control input are shown in fig. 7. in addition, the elements of the gain matrix are displayed in fig. 8. as a comparison with the state feedback, the state feedback gain matrix for this system can be computed as [8, 9], � � � �� �k qs e e et . t t t� � � � � � � � � 4 2 1 2 3 1 2 1 30 5 4 5 5 1 50. � � � � � � �� �� � � � �� � � � � � � � � � 2 3 1 2 3 1 2 15 10 3 500 01 01 01 � � � � . , , . . .e et t � ��3 . fig. 7: transient response and control input of a time-varying system controlled by state-derivative feedback fig. 8: gain elements of a time-varying system controlled by state-derivative feedback fig. 9: gain elements of a time-varying system controlled by state feedback 5 conclusions this paper has generally formulated and proposed a new concept and technique for solving the pole placement problem by full state-derivative feedback for a linear time-invariant and time-varying single-input system. there have been formulated necessary and sufficient conditions for solving such pole placement. the resulting formula for the time-invariant case is a generalization of the ackermann formula for traditional state feedback. the described algorithm avoids previous iterative approaches and provides a fast and computationally efficient solution. an interesting feature of the state-derivative feedback is that it in many cases gives feedback gains with smaller absolute values than traditional state feedback gains. the simulation results prove the feasibility and effectiveness of the proposed technique. references [1] tuel, w. g.: on the transformation to (phase-variable) canonical form. ieee trans. on automatic control, ac-11, 1966, p. 607. [2] wonham, w. 
m.: on pole assignment in multi-input controllable linear systems. ieee trans. on automatic control, ac-12, 1967, p. 660–665.
[3] luenberger, d. g.: canonical forms for linear multivariable systems. ieee trans. on automatic control, ac-12, 1967, p. 290–292.
[4] ackermann, j.: der entwurf linearer regelungsysteme im zustandraum. regel. tech. proz.-datenverarb., 1972, vol. 7, p. 297–300.
[5] wolowich, w. a.: linear multivariable systems. springer verlag, new york, 1974.
[6] kautsky, j., nichols, n. k., van dooren, p.: robust pole assignment in linear state feedback. int. j. of control, 1985, vol. 41, p. 1129–1155.
[7] lewis, f. l.: applied optimal control and estimation, digital design and implementation. prentice-hall and texas instruments, englewood cliffs, nj, 1992.
[8] valášek, m., olgac, n.: an efficient pole placement technique for linear time-variant siso systems. iee control theory appl. proc. d, 1995, vol. 142, no. 5, p. 451–458.
[9] valášek, m., olgac, n.: efficient eigenvalue assignments for general linear mimo systems. automatica, 1995, vol. 31, no. 11, p. 1605–1617.
[10] valášek, m., olgac, n.: pole placement for linear time-varying non-lexicographically fixed mimo systems. automatica, 1999, vol. 35, no. 11, p. 101–108.
[11] preumont, a., loix, n., malaise, d., lecrenier, o.: active damping of optical test benches with acceleration feedback. machine vibration, 1993, vol. 2, p. 119–124.
[12] preumont, a.: vibration control of active structures. kluwer, 1998.
[13] bayon de noyer, m. p., hanagud, s. v.: single actuator and multi-mode acceleration feedback control. adaptive structures and material systems, asme, ad, 1997, vol. 54, p. 227–235.
[14] bayon de noyer, m. p., hanagud, s. v.: a comparison of h2 optimized design and cross-over point design for acceleration feedback control. proceedings of 39th aiaa/asme/asce/ahs structures, structural dynamics and materials conference, 1998, vol. 4, p. 3250–3258.
[15] olgac, n., elmali, h., hosek, m., renzulli, m.: active vibration control of distributed systems using delayed resonator with acceleration feedback. trans. of asme journal of dynamic systems, measurement and control, 1997, vol. 119, p. 380.
[16] kejval, j., sika, z., valasek, m.: active vibration suppression of a machine. proceedings of interaction and feedbacks '2000, ut av cr, praha, 2000, p. 75–80.
[17] deur, j., peric, n.: a comparative study of servosystems with acceleration feedback. proceedings of the 35th ieee industry applications conference, roma (italy), 2000, vol. 2, p. 1533–1540.
[18] ellis, g.: cures for mechanical resonance in industrial servo systems. proceedings of pcim 2001 conference, nuremberg, 2001.
[19] necsulescu, d. s., ceru, m.: nonlinear control of magnetic bearing. journal of electrical engineering, vol. 2, 2002.
[20] horn, r. a., johnson, c. r.: matrix analysis. cambridge university press, cambridge, 1986.

prof. ing. michael valášek, drsc.
phone: +420 224 357 361
e-mail: michael.valasek@fs.cvut.cz

eng. taha helmy sayed abdelaziz, ph.d.
e-mail: tahahelmy@yahoo.com

department of mechanics
czech technical university in prague, faculty of mechanical engineering
karlovo nám. 13, 121 35 prague 2, czech republic

acta polytechnica doi:10.14311/ap.2017.57.0399, acta polytechnica 57(6):399–403, 2017, © czech technical university in prague, 2017, available online at http://ojs.cvut.cz/ojs/index.php/ap

lambert function methods for laser dynamics with time-delayed feedback

yogesh n. joglekar∗, andrew wilkey, gautam vemuri

department of physics, indiana university purdue university indianapolis (iupui), indianapolis 46202, indiana, usa
∗ corresponding author: yojoglek@iupui.edu

abstract.
acta polytechnica 57(6):399–403, 2017, doi: 10.14311/ap.2017.57.0399
© czech technical university in prague, 2017, available online at http://ojs.cvut.cz/ojs/index.php/ap

lambert function methods for laser dynamics with time-delayed feedback
yogesh n. joglekar∗, andrew wilkey, gautam vemuri
department of physics, indiana university purdue university indianapolis (iupui), indianapolis 46202 indiana, usa
∗ corresponding author: yojoglek@iupui.edu

abstract. time-delayed differential equations arise frequently in the study of nonlinear dynamics of lasers with optical feedback. traditionally, one has resorted to numerical methods because the analytical solution of such equations is intractable. in this manuscript, we show that under some conditions, the rate equations model that is used to describe semiconductor lasers with feedback can be analytically solved by using the lambert w-function. in particular, we discuss the conditions under which the coupled rate equations for the intra-cavity electric field and excess carrier inversion can be reduced to a single equation for the field, and how this single rate equation can be cast in a form that is amenable to the use of the lambert w-function. we conclude the manuscript with a similar discussion for two lasers coupled via time-delayed feedbacks.

keywords: time-delayed differential equations; lambert function; lasers.

1. introduction
time-delayed differential equations arise naturally in a wide variety of physical phenomena where one or more system parameters are fed back into the system after a certain amount of time. such time-delayed feedbacks are seen in population behaviors in biology and ecology [1–3], chemical reactions [4], interactions between time-delayed non-markovian laser fields and resonant media [5], and a host of nonlinear dynamical systems [6]. one of the more prominent examples of a physical system with time-delayed feedback is a laser wherein the light emitted by the laser is injected back into it by reflection from a distant mirror outside the laser cavity [7]. the mathematical model for a time-delayed feedback system often reduces to a first-order differential equation with a time-delayed term, and the analytical solution of such differential equations can be difficult because one has to deal with an infinite-dimensional equation. in this article, we demonstrate that the lambert w-function [8] can be invoked in some situations to obtain analytical solutions to time-delayed equations of physical interest, and explore some of the consequences of using this method. for the sake of concreteness, we focus on the problem of a semiconductor laser that is subject to time-delayed feedback of light into the laser [9].
lasers with time-delayed feedback are a paradigm for the study of time-delayed systems, in part because the delay can be easily controlled, which allows one to study the behavior of the system for delays that are shorter than the intrinsic time-scales of the laser as well as for delays that are longer than the natural time scales of the isolated laser system. such lasers are of fundamental interest due to the variety of nonlinear dynamical behaviors that arise as a function of the time-delay and the strength of the feedback [10]. in particular, there are combinations of delay and feedback strengths that produce single-tone oscillations in the optical frequency of the laser, period-doubling routes to chaos, and coherence collapse and line-narrowing. each of these dynamical responses has been studied for a variety of applications, such as the development of stable, all-optical microwave frequency oscillators, chaotic synchronization for all-optical encryption, and stable, narrow line-width lasers [11]. another system that has been of immense interest to the semiconductor laser community is the coupling of two lasers by mutual injection of light from each laser into the other [13]. these systems have a natural time-delay built into them due to the finite amount of time it takes for the light from one laser to reach the other because of the physical separation between the lasers.

2. lang–kobayashi equations
semiconductor lasers are usually modeled by the lang–kobayashi equations [14], which are known to describe the experimentally observed behavior of these lasers very well. for a single-mode laser, these equations describe the coupled time evolution of the electric field and the excess carrier inversion inside a laser cavity. in the slowly-varying envelope approximation, these equations in the non-dimensional form are
\[ \frac{de}{dt} = (1 + i\alpha)\,\zeta n(t)\,e(t) + \kappa\,e(t-\tau), \tag{1} \]
\[ T\,\frac{dn}{dt} = p - n(t) - \bigl(1 + 2n(t)\bigr)\,|e(t)|^2. \tag{2} \]
here e(t) is the complex, time-dependent, intra-cavity electric field, α is the line-width enhancement factor for the gain medium, n(t) is the time-dependent excess carrier inversion (above the carrier inversion at the lasing threshold), ζ is the differential-gain coefficient, κ is the feedback coupling strength, τ is the time-delay, T is the ratio of the excess carrier-inversion lifetime to the cavity photon lifetime, and p is the external pumping of the laser. note that the rate equation for the macroscopic polarization within the gain medium does not enter this model because it decays very rapidly, relative to the time scale at which e(t) and n(t) evolve in semiconductor lasers, and hence can be adiabatically eliminated.

figure 1. full temporal dynamics of the electric field intensity i(t) = |e(t)|² and the excess carrier inversion n(t) in a typical semiconductor laser with line-width enhancement factor α = 3 and T = 100, as a function of the pump power p above the threshold. note that both dynamics have the same time scale, and change from overdamped to underdamped in their approach to equilibrium as the pump power increases.
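equations (1)–(2) can also be integrated numerically. the minimal sketch below uses an explicit euler scheme with a history buffer for the delayed field; the step size, duration and initial conditions are illustrative assumptions, and euler integration is chosen only for brevity (it is not necessarily the method used to produce figure 1).

```python
import numpy as np

# lang-kobayashi parameters: values follow the text where quoted, otherwise assumed
alpha, zeta, T, p = 3.0, 1.0, 100.0, 1e-2
kappa, tau = 0.05, 2.0
dt, n_steps = 1e-3, 200_000
delay_steps = int(round(tau / dt))

e = np.zeros(n_steps, dtype=complex)
n = np.zeros(n_steps)
e[0] = 1e-3            # small seed field (assumed)
n[0] = -0.1            # initial excess inversion below threshold (assumed)

for i in range(n_steps - 1):
    # before the delay horizon, take the history to be the constant initial field
    e_delayed = e[i - delay_steps] if i >= delay_steps else e[0]
    de = (1 + 1j * alpha) * zeta * n[i] * e[i] + kappa * e_delayed      # eq. (1)
    dn = (p - n[i] - (1 + 2 * n[i]) * abs(e[i]) ** 2) / T               # eq. (2)
    e[i + 1] = e[i] + dt * de
    n[i + 1] = n[i] + dt * dn

intensity = np.abs(e) ** 2   # i(t), cf. figure 1
```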
figure 1 shows the results for the envelope field intensity |e(t)|² and the excess carrier inversion n(t) as a function of the pump power above the lasing threshold. when the pump power is small, both the intensity and the excess inversion approach their equilibrium values rapidly in an overdamped manner. as the pump power increases, the approach to the equilibrium values changes to an underdamped one. in dissipative systems, this transition from an overdamped to an underdamped approach can be mapped onto a parity-time (pt) symmetry breaking transition [15], where the overdamped region is associated with the pt-broken phase and the underdamped region with the pt-symmetric phase. for realistic parameters of standard semiconductor lasers, used in figure 1, we note that the time scales for the variation of the electric field and the excess inversion are the same. thus, it is not possible to eliminate the excess carrier density dynamics in the current setup, and the lambert function formalism cannot be directly applied to the above set of equations (1)–(2).

the lambert w-function is defined by the solutions of the equation
\[ w\,e^{w} = z. \]
for a general complex number z, this equation has a countably infinite number of solutions, denoted by w_k(z) for integer k; by convention, only the branches k = 0 and k = −1 take real values, for real z ≥ −1/e and −1/e ≤ z < 0, respectively [8]. suppose we are able to reduce the lang–kobayashi equations to a single equation of the form
\[ \frac{dx}{dt} = a\,x(t) + b\,x(t-\tau), \tag{3} \]
where x is complex and a, b are constants that could be real or complex. we note that such equations commonly arise in time-delayed population dynamics models in biology and ecology [1–3], but in those cases, the coefficients and the solution of the equation are both constrained to be purely real. since (3) is a linear equation, its general solution will be given by a linear superposition of exponential-in-time terms. assuming x(t) ∼ e^{λt} leads to the transcendental characteristic equation
\[ (\lambda - a)\tau\, e^{(\lambda - a)\tau} = b\tau\, e^{a\tau}. \tag{4} \]
equation (4) shows that the eigenvalues λ can be expressed in terms of the lambert w-function. specifically, when bτe^{aτ} > 0, the real solution is given by λ = a + w₀(bτe^{aτ})/τ. if bτe^{aτ} < 0, there are two real solutions provided −1/e < bτe^{aτ} < 0, one with (λ − a)τ smaller than −1 and the other with (λ − a)τ larger than −1. thus, in general, the eigenvalues λ that characterize the exponential-in-time behavior of x(t) are given by
\[ \lambda_k = a + \frac{1}{\tau}\, w_k\!\left(b\tau e^{a\tau}\right), \tag{5} \]
where for bτe^{aτ} < −1/e or complex arguments, the general solution is obtained by an analytical continuation of the function to the complex plane [8].
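the eigenvalues in (5) are directly computable with scipy's implementation of the lambert w-function. the sketch below (the helper name and sample parameters are ours) evaluates several branches and reports the rightmost root; for real coefficients, the rightmost root comes from the k = 0 (or k = −1) branch, so scanning a handful of branches suffices for a stability check.

```python
import numpy as np
from scipy.special import lambertw

def delay_eigenvalues(a, b, tau, branches=range(-5, 6)):
    """eigenvalues lambda_k = a + W_k(b*tau*exp(a*tau))/tau of eq. (3), cf. eq. (5)."""
    z = b * tau * np.exp(a * tau)
    return np.array([a + lambertw(z, k) / tau for k in branches])

lams = delay_eigenvalues(a=-0.5, b=0.3, tau=1.0)    # sample parameters
rightmost = lams[np.argmax(lams.real)]
print("rightmost root:", rightmost, "| stable:", bool((lams.real < 0).all()))
```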
3. domain of validity of the lambert formalism
it is generally the case that the modeling of semiconductor lasers requires the coupled rate equations for the intracavity electric field e(t) and the excess carrier inversion n(t). however, the use of the lambert function formalism, as discussed above, requires that the two coupled rate equations be reduced to a single equation with time-delay. in this case, it means that the model must be reduced to a rate equation for the electric field only. this is, strictly speaking, not possible for semiconductor lasers because, as seen in figure 1, the characteristic time scales on which the intracavity field and the excess carrier inversion evolve are of the same order of magnitude. however, with advancements in technology, it may well be possible to fabricate lasers in which the excess carrier inversion evolves much faster than the electric field, in which case the rate equation for the excess carrier inversion can be eliminated. this entails setting the time evolution of the inversion to zero, i.e., dn(t)/dt = 0, solving for the steady-state value of n(t), and substituting this expression for n(t) into the rate equation for the electric field (1); from (2) this gives n₀ = (p − |e|²)/(1 + 2|e|²), which reduces to n₀ ≈ p at small intensities. the model is thereby reduced to a single, time-delayed rate equation that is amenable to the use of the lambert function. another possibility is to fabricate a laser in which the carrier inversion relaxation time is much larger than the electric field decay time. in this case, the inversion does not evolve during the time in which the electric field reaches a steady state, and one can, once again, focus on the single, time-delayed rate equation for the electric field.

in addition to the elimination of the inversion equation, the lambert formalism is strictly applicable to the case of a semiconductor laser with time-delayed feedback only when the nonlinearities that arise from the cubic term in the electric field are also ignored. physically, this means that the formalism is valid when two conditions are simultaneously satisfied: (i) the excess carrier inversion n(t) has reached its steady-state value, and (ii) the laser has reached threshold and the intracavity electric field e(t) is starting to grow exponentially, but the intensity saturation that typically sets in due to the nonlinearities in the gain is yet to occur. for typical semiconductor laser parameters (with no time-delayed feedback), this time window between the inversion settling to a steady-state value and the intensity getting saturated is negligibly small. figure 2 shows the typical time evolution of a solitary semiconductor laser (α = 3, pumping p = 0.03, and differential gain coefficient ζ = 1) with initial excess carrier inversion n(t = 0) = −0.1, and we see that the exponential electric field rise occurs in a very short time interval. this essentially means that the predictions of the analytic solutions (5), obtained via the lambert function approach, will be visible only in a very tiny time window.

figure 2. time evolution of the dimensionless electric field intensity i(t) = |e(t)|² and the excess carrier inversion n(t) in a typical, solitary semiconductor laser (p = 0.03, ζ = 1) shows that the electric field rises exponentially in a very short window before nonlinearities in the gain medium become effective and saturate the intensity to a steady state value.

what is desired, however, is a wider time window between the inversion reaching a steady state and the intensity of the laser not yet being saturated, i.e., the intensity still being in the exponential amplification regime. since the intensity at very short times grows exponentially with the product of the differential gain coefficient ζ and n(t), and because the steady-state value of n(t) is a function of the pumping p, one can manipulate these two quantities to slow down the exponential growth of the laser intensity and thereby enhance the time window in which both conditions are simultaneously met.
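a rough, order-of-magnitude estimate of this window follows from the short-time growth law i(t) ≈ i₀ e^{2ζn₀t} with n₀ ≈ p. the sketch below (the seed and saturation intensities are illustrative assumptions) shows how reducing ζ or p widens the window before a given saturation level is reached.

```python
import numpy as np

def exponential_window(zeta, p, i0=1e-6, i_sat=1.0):
    """time for i(t) = i0 * exp(2*zeta*p*t) to reach i_sat (dimensionless units)."""
    return np.log(i_sat / i0) / (2.0 * zeta * p)

for zeta, p in [(0.1, 0.09), (0.1, 0.07), (0.06, 0.08)]:
    print(f"zeta={zeta}, p={p}: window ~ {exponential_window(zeta, p):.0f} time units")
```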
figure 3 shows the time-dependent laser intensities i(t) = |e(t)|² when the inversion is adiabatically eliminated and only the electric field equation is solved, as a function of the differential gain ζ and the pump power p; the time axis only contains the range after the carrier inversion has settled down. it is evident from the left-hand panel that for a fixed ζ, as the pump power is reduced from p = 0.09 to p = 0.07, the exponential growth of the intensity is slowed, thereby providing a wider window of time in which to make any desired measurements of the laser intensity dynamics. the right-hand panel shows that for a fixed pumping, as ζ is reduced, the intensity growth is slowed. while the external pump is an easily varied parameter in experiments, the differential gain coefficient ζ is a material parameter and hence cannot be tuned in a given laser. on the other hand, reducing the pump power close to zero means the laser is operating very close to the threshold, and that leads to enhanced quantum fluctuations whose effects are not included in the present analysis. thus, our results provide some guidance on the material parameters a laser must meet for the lambert formalism to be applicable to an analytical study of its dynamics.

figure 3. rise of the intracavity electric field e(t) when the carrier inversion is adiabatically eliminated. the left-hand panel shows that for a fixed ζ = 0.1, as the pump current is reduced, the exponential growth of intensity slows down. the right-hand panel shows that for a fixed pumping p = 0.08, as ζ is reduced, the time window during which the gain-nonlinearity can be ignored is widened. in both cases, the calculations are carried out using typical semiconductor laser parameters, α = 3. note that the time axis only contains the time window after the carrier inversion has reached its steady-state value.

under these conditions, the rate equation for the electric field can be written as
\[ \frac{de}{dt} = (1 + i\alpha)\,\zeta n_0\, e(t) + \kappa\, e(t-\tau), \tag{6} \]
where n₀ ∼ p is the steady-state value of the carrier inversion. this equation is identical to (3) with a manifestly complex a = (1 + iα)ζn₀ and a purely real, positive b = κ. thus, within the appropriate time window, the electric field exponents are determined by the properties of the lambert w-function.
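in this regime, the eigenvalue computation sketched earlier applies directly with a complex coefficient a; the parameter values below are illustrative assumptions.

```python
import numpy as np
from scipy.special import lambertw

# field exponents for eq. (6): a = (1 + i*alpha)*zeta*n0 is complex, b = kappa is real
alpha, zeta, n0, kappa, tau = 3.0, 1.0, 0.03, 0.1, 1.0   # illustrative values
a = (1 + 1j * alpha) * zeta * n0
z = kappa * tau * np.exp(a * tau)
lams = np.array([a + lambertw(z, k) / tau for k in range(-5, 6)])   # eq. (5)
print("largest growth rate re(lambda):", lams.real.max())
```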
4. bidirectionally coupled lasers
another interesting situation in which the lambert formalism can be invoked is when two identical semiconductor lasers, at optical frequencies ω₁ and ω₂, are mutually coupled to each other. such systems have been extensively studied in the context of their nonlinear dynamics [10]. the four rate equations for such a system, two for the intracavity electric fields and two for the corresponding excess carrier inversions, are given by a modified form of the lang–kobayashi equations wherein the bidirectional coupling is accounted for, i.e.,
\[ \frac{de_1}{dt} = (1 + i\alpha)\,\zeta n_1(t)\,e_1(t) + i\Delta\omega\, e_1(t) + \kappa e^{-i\theta\tau}\, e_2(t-\tau), \tag{7} \]
\[ \frac{de_2}{dt} = (1 + i\alpha)\,\zeta n_2(t)\,e_2(t) - i\Delta\omega\, e_2(t) + \kappa e^{-i\theta\tau}\, e_1(t-\tau), \tag{8} \]
\[ T\,\frac{dn_1}{dt} = j_1 - n_1(t) - \bigl(1 + 2n_1(t)\bigr)\,|e_1(t)|^2, \tag{9} \]
\[ T\,\frac{dn_2}{dt} = j_2 - n_2(t) - \bigl(1 + 2n_2(t)\bigr)\,|e_2(t)|^2. \tag{10} \]
here the subscripts 1, 2 denote the laser index, and these equations are written in a frame rotating at a frequency that is the average of the two laser frequencies, θ = (ω₁ + ω₂)/2, so that each laser is detuned by an equal amount ±∆ω = ±(ω₁ − ω₂)/2. κ is the coupling coefficient between the two lasers, τ is the time-delay in the coupling that depends on the physical separation between the two lasers, e^{−iθτ} is the phase accumulated by the light propagating from one laser to the other, and j₁,₂ are the injection currents above threshold for each laser. if these four coupled equations can be reduced to two by eliminating the carrier inversion equations, as discussed for the single laser with feedback, one is left with two coupled, time-delayed rate equations for the intracavity fields e(t) = [e₁(t), e₂(t)]ᵀ,
\[ \frac{d}{dt}\, e(t) = m\!\left(\frac{d}{dt}\right) e(t), \tag{11} \]
where the 2 × 2 non-hermitian, time-delay operator m is given by
\[ m = \begin{pmatrix} (1 + i\alpha)\zeta n_{10} + i\Delta\omega & \kappa e^{-i\theta\tau}\, e^{-\tau\partial_t} \\ \kappa e^{-i\theta\tau}\, e^{-\tau\partial_t} & (1 + i\alpha)\zeta n_{20} - i\Delta\omega \end{pmatrix}, \tag{12} \]
and n₁₀ and n₂₀ are the steady-state carrier inversions. equations (11) and (12) are also amenable to an analytic solution via the lambert w-function formalism. an experimental study of the bidirectionally coupled lasers, and a detailed comparison between the predictions of the lambert function solution and numerical solutions of the full lang–kobayashi equations, will be reported elsewhere.

in summary, we have shown that the lambert w-function provides a hitherto unexplored, analytic method for studying the intensity dynamics of a semiconductor laser with time-delayed optical feedback. the formalism is valid in a regime where the two rate equations given by the lang–kobayashi model are reduced to a single, time-delayed rate equation for the intracavity electric field. the lambert function can be invoked when the nonlinearities that arise from gain saturation are neglected, which implies that the analytic results are valid at short times after the laser intensity crosses its threshold value, i.e., when the intensity is still in the exponentially amplifying stage. furthermore, the analytic technique assumes that the carrier inversion has reached its steady-state value while the intensity is still growing. to overcome the problem of a very narrow observation time window for the laser intensity dynamics, we have suggested some remedies that could be implemented at the laser fabrication step to modify the material parameters of the laser. in particular, reducing the differential gain coefficient, and modifying other properties of the laser so that a very weak pump will induce the desired population inversion, will enable a much wider time window between the time at which the inversion reaches a steady state and the time at which the laser intensity saturates. once a wider time window is attained, the predictions of the lambert function results can be tested.
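one tractable special case, not worked out explicitly in the text: for identical pumping (n₁₀ = n₂₀ = n₀) and zero detuning (∆ω = 0), the operator in (12) is diagonalized by the symmetric and antisymmetric combinations e₁ ± e₂, and each mode obeys a scalar delay equation of the form (3) with b = ±κe^{−iθτ}. the sketch below (all parameter values are illustrative assumptions) evaluates both mode families with the lambert function.

```python
import numpy as np
from scipy.special import lambertw

# symmetric/antisymmetric modes of eqs. (11)-(12): identical lasers, dw = 0
alpha, zeta, n0 = 3.0, 1.0, 0.03          # illustrative values
kappa, tau, theta = 0.1, 1.0, 2.0
a = (1 + 1j * alpha) * zeta * n0
for sign, label in [(+1, "e1 + e2"), (-1, "e1 - e2")]:
    b = sign * kappa * np.exp(-1j * theta * tau)    # mode coupling +/- kappa e^{-i theta tau}
    z = b * tau * np.exp(a * tau)
    lams = np.array([a + lambertw(z, k) / tau for k in range(-5, 6)])   # eq. (5) per mode
    print(label, "-> rightmost re(lambda):", lams.real.max())
```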
references
[1] see e.g. h. smith, an introduction to delay differential equations with applications to the life sciences, isbn 978-1-4419-7646-8, springer (2011).
[2] s. p. blythe, r. m. nisbet, and w. s. c. gurney, instability and complex dynamic behavior in population models with long time delays. theoretical population biology 22, 147–176 (1982).
[3] y. kuang, delay differential equations with applications in population dynamics. academic press, inc. (1993).
[4] marc r. roussel, the use of delay differential equations in chemical kinetics, j. phys. chem. 100, 8323–8330 (1996).
[5] m. h. anderson, g. vemuri, j. cooper, p. zoller and s. j. smith, experimental study of absorption and gain from two-level atoms in non-markovian, phase diffusing optical fields, phys. rev. a 47, 3202–3209 (1993).
[6] b. krauskopf and d. lenstra, fundamental issues of nonlinear laser dynamics, aip conference proc. 548 (2000).
[7] see e.g. d. kane and k. a. shore, unlocking dynamical diversity: optical feedback effects on semiconductor lasers, isbn 0-470-85619-x.
[8] r. m. corless, g. h. gonnet, d. e. g. hare, d. j. jeffery, and d. e. knuth, on the lambert w function, adv comput math 5, 329 (1996).
[9] g. h. m. van tartwijk and d. lenstra, quant. and semiclassical optics 7, 87 (1995).
[10] j. mork, j. mark and b. tromborg, route to chaos and competition between relaxation oscillations for a semiconductor laser with optical feedback, phys. rev. a 65, 1999–2002 (1990).
[11] m. soriano, j. garcia-ojalvo, c. r. mirasso, and i. fischer, rev. mod. phys. 85, 421–470 (2013).
[12] m. sciamanna and k. a. shore, nature photonics 9, 151–162 (2015).
[13] j. mulet, c. masoller and c. r. mirasso, modeling bidirectionally coupled single-mode semiconductor lasers, phys. rev. a 65, 063815 (2002).
[14] r. lang and k. kobayashi, ieee j. quant. elec. qe-16, 347–355 (1980).
[15] j. li, a. k. harter, l. de melo, y. n. joglekar, and l. luo, arxiv:1608.05061.

acta polytechnica 61(5):579–589, 2021, doi: 10.14311/ap.2021.61.0579
© 2021 the author(s). licensed under a cc-by 4.0 licence, published by the czech technical university in prague

performance characteristics of low carbon waste material to stabilise soil with extremely high plasticity
ali al-baidhani∗, abbas jawad al-taie
al-nahrain university, college of engineering, civil engineering department, al-jadriya, 10070, baghdad, iraq
∗ corresponding author: al.baidhani7471@gmail.com

abstract. the application of low-carbon and natural materials to mitigate the undesired properties of difficult soils is considered a sustainable solution to the issues regarding these soils. selecting some natural materials, of a low-carbon type, from the rubble of demolished buildings or debris from the construction of new buildings, and recycling them in the stabilisation of poor or weak soils, is a very little explored field of research in iraq. this paper investigated the geotechnical characteristics of an extremely high plasticity soil (ehps) improved with low-carbon building stone debris (bsd). five dosages of the coarse and fine soil-size fractions of bsd (bsdc and bsdf) were prepared for use in the ehps-bsd mixtures. the laboratory tests included atterberg limits, linear shrinkage, unconfined compression, consolidation, and swelling. the effect of the bsd on the time to zero water content and on the maximum swell was included. the efficiency of the bsd was proved by the amelioration of the compressibility and strength, and by the reduction of the shrinkage, swell pressure, and swelling potential. the shrinkage, compressibility, and swelling properties of the ehps were reduced depending on the gradation and content of bsd. the gradation of bsd had a major role in the strength development and in controlling the time required to reach the final shrinkage and maximum swell stages.

keywords: low carbon materials, extremely high plasticity soil, swelling, shrinkage.

1. introduction
high plasticity soils (hps) have undesirable properties upon wetting and drying, like shrinkage and expansion. these soils exhibit a considerable shrinkage/swelling when their water or moisture content is subjected to a seasonal fluctuation [1, 2].
these undesirable properties of hps have negative effects on the strength and compressibility of these soils; as such, hps are considered unfavourable and unsuitable soils to be used as foundation materials. the construction of different engineering infrastructures on hps represents a big challenge for engineers. where hps occur, they are difficult as layers for roads or as foundations for light structures. due to the unfavourable properties of hps, their difficult nature, and their widespread presence across the world, these soils have gained worldwide attention. therefore, high plasticity soils need an enhancement of their index and mechanical properties [2–4].

there are numerous methods that can be utilised and adopted in the enhancement of the properties of difficult soils, such as electrical, mechanical, chemical, physical, and biological treatment methods. for more sustainable technologies, the recycling of waste materials for soil stabilisation is also applied. however, mitigating the particle volume changes in hps by chemical alteration is among the most effective stabilisation techniques that have gained the acceptance of geotechnical engineers [5–22]. the efficiency of the stabilising material (economic and design) is an important factor that affects the selection of a chemical stabiliser. also, the impacts of the selected stabiliser on the environment should be kept in mind. improvement technologies for difficult soils have to be used in a manner that takes the environmental aspects into account, so as not to cause adverse effects to the environment (soil, air, and water). hence, the improvement and stabilisation of soils using low-carbon natural materials are highly encouraged [2, 22, 23]. using such materials is considered a sustainable solution to the problems of difficult soils [23].

there is a number of materials defined as "low carbon materials", lcm. among these materials are lime, rice-husk (reactive ash), fly ash, silica fume, granulated slag, blended cement, natural stones, stone dust, and several liquid materials like "sodium silicate-based" ones [23–26]. materials like "ground-granulated blast-furnace slag", which have low carbon emissions, have been used in improving expansive soil. such an application causes a positive reduction in the emissions of carbon dioxide. the solid block units produced from the compaction of a mixture of soil, lime, and water are considered an lcm and an eco-friendly construction material [2, 27]. the use of high volumes of coal fly ash, silica fume, rice-husk (reactive ash), or granulated slag in the cement manufacturing process produces a low-carbon cement called blended cement. also, the compacted mixture of stone dust, fly ash, and lime is used to produce a block with a high density; such a block has been considered an lcm. natural materials like stones, biomass, and soil are considered ideal building materials [18, 19]. such materials were considered ideal due to their low emission of carbon, recycling potential, reusability, and a low carbon footprint. the "cement stabilized rammed earth" is another material that was proven to be a low-carbon material [28]. also, the autoclaved dredged mud-brick is another lcm. this material is produced by mixing 10 % of fly ash, 12 % of steel slag, and 15 % of calcium carbide with dredged soil [29].
furthermore, the "ground granulated blast furnace slag – gbs" has been considered a sustainable lcm. the rammed earth (including different additives like lime or gravel and sand), for wall construction purposes, has been used as an lcm. in addition, some of the natural stones were also referred to as being lcm [25, 30]. in recent years, a low-carbon liquid additive named "sodium silicate-based" has been used to improve the compressibility and strength properties of soils [31]. the application of "sodium silicate-based" was considered a sustainable solution to the problems of expansive soil. this is due to the composition of this material: it is mainly composed of elements like si, na, fe, and al; therefore, it is classified as an lcm. based on the chemical composition of the material, tile factory waste was also classified as an lcm and considered a successful stabiliser to improve the properties of clay soil. it was found that the waste of tiles is mainly composed of sodium, silicon, oxygen, and magnesium [31]. recent studies emphasised the geoenvironmental issues regarding the stabilisation of expansive soils. these studies show the necessity of sustainability of soil stabilisation processes. parameters (like the dosage and type of additive) that cause adverse effects on the environment (like the emission of carbon) should be avoided. it is undesirable to use additives with a high carbon or heavy metal content in soil stabilisation processes [32].

selecting some natural materials of the lcm type from the rubble of demolished buildings or debris from the construction of new buildings, and recycling them in the stabilisation of poor or weak soils, is a very little explored field of research in iraq. in this work, low-carbon natural building stone debris (bsd) was selected as a solid additive to ameliorate the geotechnical characteristics of an extremely high plasticity soil (ehps). extensive laboratory testing was carried out. bsd was prepared in two different sizes for use in the ehps-bsd mixtures, a coarse soil-size and a fine soil-size, denoted bsdc and bsdf, respectively. five dosages of each bsd size, ranging from 10 % to 50 % in increments of 10 %, were prepared. the experiments included atterberg limits, linear shrinkage, unconfined compression, consolidation, and swelling tests. also, the effect of curing on the development of soil strength has been included.

2. materials and methods
2.1. extremely high plasticity soil (ehps)
the soil selected in this study is a plastic clay soil; it is a naturally occurring problematic soil with a clay content higher than 80 %. the disturbed soil sample used in this study was obtained from al-anbar province (latitude 32° 31' n and longitude 41° 54' e), iraq. the basic properties of the soil have been determined and are shown in table 1. according to bs 5930, the selected soil was classified as "clay of extremely high plasticity", "ce".

table 1. properties and characteristics of the ehps.
  color                                   yellow
  specific gravity, gs                    2.58
  sand fraction (0.06–2 mm, %)            0
  silt fraction (0.002–0.06 mm, %)        18
  clay fraction (less than 0.002 mm, %)   82
  liquid limit, wl (%)                    139
  plastic limit, wp (%)                   43
  plasticity index, ip (%)                96
  soil activity, a                        1.17
  bscs                                    ce
  optimum water content (%)               35
  maximum dry density (g/cm³)             1.23
  silicon, si (%)                         20.71
  aluminium, al (%)                       6.68
  calcium, ca (%)                         3.38
  magnesium, mg (%)                       2.14
  iron, fe (%)                            0.62
  sodium, na (%)                          0.75
  nitrogen, n (%)                         1.57
  oxygen, o (%)                           60.88
  carbon, c (%)                           0.00
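two of the index values in table 1 follow directly from the others: the plasticity index is ip = wl − wp, and the soil activity is a = ip divided by the clay fraction (in %). a quick consistency check (a sketch using the table values):

```python
wl, wp, clay_percent = 139.0, 43.0, 82.0   # values from table 1

ip = wl - wp                    # plasticity index
activity = ip / clay_percent    # soil activity
print(f"ip = {ip:.0f} %, activity = {activity:.2f}")   # -> ip = 96 %, activity = 1.17
```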
throughout this paper, the selected soil of extremely high plasticity is denoted (ehps). the ehps has a high plasticity index (ip = 96 %) with an activity value (a) of 1.17. based on the ip and a values and the potential expansivity, the ehps can be classified as "very high" [33]. the chemical composition of the ehps (table 1) indicates that silicon is the most abundant element; the ehps also has a high content of calcium, with the other elements being aluminium, iron, etc. the qualitative x-ray diffraction (x.r.d.) analysis indicated that montmorillonite and quartz are the main constituents of the soil [34]. the value of the "cation exchange capacity" for the ehps is 80 meq/100 g. as can be seen, the results of the physical, chemical, and mineralogical tests indicate that the studied ehps is potentially expansive.

2.2. building stone debris (bsd)
in iraq, natural stones are an essential building material used in construction. they are used for the masonry of buildings. also, natural stones are preferred for finishing and facing works. mostly, natural massive rocks are the main source of the building stones in iraq. the building stone debris (bsd) was selected as a solid additive to ameliorate the geotechnical characteristics of the extremely high plasticity soil (ehps). the bsd used was obtained and collected from a dumping area at one of the construction sites in baghdad city, the capital of iraq. the bsd was prepared in the laboratory in steps including cleaning, drying, crushing, and sieving. after cleaning the surface of the bsd from foreign materials and drying at a temperature of 105 °c, the bsd was crushed in two stages: a small hammer was first used to crush the bsd manually, while in the second stage, the bsd was crushed mechanically. subsequently, the crushed bsd was divided into two sets. the crushed bsd in the first set was sieved on sieve no. 40; the remaining material was then subjected to further mechanical crushing and sieved again to ensure all the material passed through sieve no. 40. the crushed bsd from the second set was subjected to further crushing using the los angeles machine. the resultant material was sieved on sieve no. 200, and the residual material was then returned to the los angeles machine for further crushing. this process was continued until all the material passed sieve no. 200. to study the effect of the gradation of bsd on the geotechnical properties of the soil, two different gradations were prepared: the first one is the coarse gradation, which included the material passing through sieve no. 40, denoted bsdc, while the second is the fine gradation, obtained from the material passing sieve no. 200, denoted bsdf. the basic properties of the bsd are shown in table 2. clearly, the bsd is a non-plastic material and has higher gs values than the ehps.

table 2. properties and characteristics of bsd.
  material property            bsdc               bsdf
  color                        very pale brown    light yellowish brown
  specific gravity, gs         2.78               2.71
  liquid limit, wl (%)         28                 32
  plastic limit, wp (%)        –                  –
  plasticity index, ip (%)     np                 np
  d10 (mm)                     0.13               0.001
  d30 (mm)                     0.18               0.0028
  d50 (mm)                     0.23               0.0065
  d60 (mm)                     0.26               0.01
  cu                           2                  10
  cc                           0.99               0.78
  bscs                         sp                 m
  uscs                         sp                 ml
  cao (%)                      63.41
  fe2o3 (%)                    20.09
  sio2 (%)                     5.30
  k2o (%)                      1.19
  mno (%)                      0.15
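the gradation indices in table 2 follow from the characteristic particle diameters: the coefficient of uniformity is cu = d60/d10 and the coefficient of curvature is cc = d30²/(d10 · d60). a quick check with the table values (a sketch; a small rounding difference against the tabulated cc of bsdc is expected):

```python
def gradation_indices(d10, d30, d60):
    """coefficient of uniformity (cu) and coefficient of curvature (cc)."""
    return d60 / d10, d30 ** 2 / (d10 * d60)

print("bsdc: cu=%.2f, cc=%.2f" % gradation_indices(0.13, 0.18, 0.26))     # cu=2.00, cc=0.96
print("bsdf: cu=%.2f, cc=%.2f" % gradation_indices(0.001, 0.0028, 0.01))  # cu=10.00, cc=0.78
```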
the bsd's chemical components show that, notably, ca, fe, and si are the main elements of the bsd. accordingly, it can be confirmed that the stabilising agent selected in this work has a low carbon content; in other words, it is an eco-friendly agent.

2.3. preparation of specimens and laboratory testing programme
the ehps was mixed with different dosages of bsdc and bsdf (table 3). for this purpose, dosages of 10 %, 20 %, 30 %, 40 %, and 50 % by weight of the dry soil were added to the ehps. the ehps-bsd mixtures were prepared by mixing the dosages of bsd with the ehps to form completely homogeneous mixtures. in general, the dry mixing method was used to prepare the testing soil specimens. the designation of each produced sample is shown in table 3.

table 3. designation of the produced ehps-bsd mixtures.
  bsd content (%)   10       20       30       40       50
  designation       bsdc10   bsdc20   bsdc30   bsdc40   bsdc50
                    bsdf10   bsdf20   bsdf30   bsdf40   bsdf50

the laboratory tests included atterberg limits, linear shrinkage [35], a compaction test, unconfined compression [36], and consolidation and swelling tests [36]. the effect of curing on the unconfined compression test samples has also been included in the testing programme (see figure 1); specimens were cured for 3 and 7 days.

figure 1. testing programme.

for the basic tests, disturbed mixtures were used, while for the engineering tests, compaction was applied to achieve the specified dry unit weights and corresponding water contents. the compacted specimens were extruded from a standard compaction mould. the homogeneous wet mixtures were prepared at the optimum water content, then compacted to the corresponding maximum unit weight inside a standard compaction mould. the specimens for the engineering tests were obtained by pushing the steel mould of the unconfined tests (3.6 cm diameter and 7.2 cm height) into the compaction mould. the obtained specimens were either directly tested for unconfined compressive strength [36] or cured for three and seven days and then tested. it should be noted that all cured specimens were weighed with a precision of 0.01 g after the preparation and after the completion of the curing period to ensure minimum variations and maximal consistency. the difference in weight was very small (in the second decimal of the weight), which indicates a great accuracy of the curing process.

the compressibility of the ehps-bsd mixtures was studied by carrying out a set of oedometer tests (one-dimensional consolidation tests [36]); each wet homogeneous ehps-bsd mixture was statically compacted inside the oedometer ring (5 cm diameter and 2 cm height) to obtain the required unit weight. a set of swell tests was carried out in accordance with [36] to study the volume changes of the ehps-bsd mixtures. the preparation of the swell test specimens was similar to that of the consolidation test specimens; however, the height of the specimen for the swell test (1.65 cm) is smaller than the height of the ring (2 cm). this is, as stated by head [37], to ensure total lateral confinement throughout the test. the swelling was measured using a dial gauge of 0.002 mm/division. after soaking, the following intervals were fixed to record the swelling: 0.5 min, 1 min, 2 min, 4 min, 8 min, 16 min, 30 min, 60 min, 120 min, 180 min, 240 min, 1440 min, 2880 min, 4320 min, 5760 min, 7200 min, and 8640 min.

3. results and discussion
3.1. plasticity of ehps-bsd mixtures
the results of the atterberg limits (liquid limit (wl), plastic limit (wp), and plasticity index (ip)) for the ehps mixed with the two graded types of bsd are provided in figure 2. there is a clear improvement of the properties of the ehps when mixed with bsdc and bsdf.
there is a characteristic reduction in wl due to the addition of bsdc and bsdf. however, a higher reduction in wl values can be noted in the ehps-bsdc mixtures as compared to the ehps-bsdf mixtures. the percentual decrease in the wl of the ehps was about 50 % and 49 % for bsdc50 and bsdf50, respectively. this is also the case for the wp values: wp decreased with an increasing content of both types of bsd, but a distinct response can be seen to the addition of bsdc in comparison with bsdf. as shown in figure 2, wp decreased considerably with the addition of bsdc. the gross effect on the improvement of plasticity with bsd is a decrease in the ip of the ehps-bsd mixtures. as for the wl, figure 2 clearly shows that the improvement effect of bsdf was better than that of an equal amount of bsdc. while the ip of the ehps decreased below 35 % for bsdf50, the ehps mixed with the identical quantity of bsdc had an ip value of 40 %. in general, an increased bsd percentage led to a decrease in the ip. the same finding is also reported in [38] for an expansive soil treated with "pyroclastic rock dust". as such, an immediate alteration in the ehps workability can be caused by the exchange of calcium ions, released by the cao from the bsd structure, with other ions in the structure of the ehps. such a cation exchange caused the wl to decrease and, as a result, reduced the ip. this reduction in ip produced a more tenuous texture of the ehps-bsd mixtures, and this makes the in-situ handling of the ehps easier.

figure 2. variation of atterberg limits with various bsd percentages.

3.2. linear shrinkage (ls) results
according to the procedure presented in bs 1377 [35], the linear shrinkage (ls) values for the ehps-bsd mixtures were determined. the test specimens were prepared at a high water content (near the wl) and placed in the designed mould under lab conditions. the changes that take place in the air and water phases of the specimens have been analysed, and the shrinkage paths have been examined for ls. figure 3 shows the variation of ls and ip with different contents of bsd. as can be seen, when the content of bsd increased from 0 % to 50 %, the ls reduced along with the decreasing ip. in particular, the ls of the bsdc50 and bsdf50 mixtures decreased by 45.7 % and 48.8 %, respectively, in comparison to the unaltered ehps specimens. a re-examination of figure 3 showed that the efficiency of bsdf was higher than that of bsdc in decreasing the shrinkage values of the ehps; however, the trend of decreasing ls is comparable for both bsdc and bsdf.

figure 3. variation of ls with various bsd percentages.

the study of ls included the time to reach the final drying stage (zero water content). the effect of the type and content of bsd on the time needed to reach the final shrinkage stage has been investigated. for the ehps and each ehps-bsd mixture, the water content during the drying stage was calculated at specified time periods; then, the ratio of the final drying time for each mixture (tn) to that of the unaltered soil (tf) was calculated and named the "time ratio". the variation of the time ratio with various ip and bsd contents is shown in figure 4. as illustrated, the addition of bsd decreases the time required to reach the final shrinkage stage; this decrease depends mainly on the type and content of bsd. the ehps-bsdf mixtures required a shorter time than the ehps-bsdc mixtures.

figure 4. effect of bsd and ip on the time ratio of ehps.
bsdf50 required only 50 % of the time, while bsdc50 required 70 %.

3.3. shear strength of ehps-bsd mixtures
based on the results of the unconfined compression test, the variation of the shear strength of the ehps (su) with the type and amount of bsd has been studied. the su values were calculated for the uncured ehps-bsd samples tested directly after the preparation, and for samples cured for three and seven days. as illustrated in figure 5, for the uncured ehps-bsdc samples, there is only a slight change in su after the addition of bsdc, which indicates the importance of the curing period for the generation of a pozzolanic reaction. this was proved by curing the ehps-bsdc samples for three and seven days (figure 5), for which a significant increase in the su values has been noted. this may be attributed to a change in the soil matrix due to the formation of cementation agents as a result of the pozzolanic reaction during the curing period. in contrast, an early generation of strength was noted for both the uncured and cured ehps-bsdf samples. the gradation of bsd has a major role in the strength development: the start of the pozzolanic reaction is mainly affected by the fineness of the pozzolanic additives. in fact, the pozzolanic reaction starts earlier for materials with a paramount fineness, as such materials have a noticeably larger surface area [39–41].

figure 5. the effect of the type and content of bsd and curing period on su of ehps.

the influence of the curing periods on the su of the ehps-bsd samples is very noticeable in figure 5. a significant improvement in su can be noted for the ehps-bsdf samples in comparison to the ehps-bsdc samples. in fact, this improvement is not absolute; it depends on the content and the type of the bsd. it was found that the shear strength begins to develop and increase with the increase in the content of both types of bsd until it reaches an optimal value. beyond this value, the shear strength begins to decrease, but the final shear strength remains higher than that of the unaltered soil. regardless of the curing period, the optimal ehps-bsd mixtures for this work were bsdc40 and bsdf20. based on this observation, it can be concluded that the content of bsd, up to an optimal value, supports the pozzolanic reaction required to form the cementitious materials for the amelioration of su. a further addition of bsd then shows the same result as adding sand or silt materials; such materials lead to a reduction in the shear strength of the ehps-bsd mixtures.

3.4. compressibility of ehps-bsd mixtures
the compressibility of the ehps-bsd mixtures has been examined in the one-dimensional consolidation test. the compression index (cc) and swelling index (cs) values obtained from this test are plotted in figure 6. there is a clear effect of bsd on the compressibility of the ehps: regardless of the gradation type of bsd, both cc and cs were reduced with an increasing bsd content. for bsdc50 and bsdf50, the addition of bsd reduced the cc values by 21 to 25 %, while the reduction in the cs values ranged from 35 to 40 %. in other words, a lesser value of the consolidation settlement can be expected for the ehps due to the addition of bsd.

figure 6. the effect of the type and content of bsd on cc and cs of ehps.
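to see what such a reduction in cc means in practice, the primary consolidation settlement of a normally consolidated clay layer can be estimated from the standard relation s = cc · h / (1 + e₀) · log₁₀(σ'f / σ'₀). the layer thickness, void ratio, stresses, and cc values in the sketch below are hypothetical, chosen only to illustrate the effect of a 25 % lower cc; they are not values reported in this paper.

```python
import math

def consolidation_settlement(cc, h, e0, sigma0, sigma_f):
    """primary consolidation settlement of a normally consolidated clay layer."""
    return cc * h / (1.0 + e0) * math.log10(sigma_f / sigma0)

# hypothetical layer: 3 m thick, e0 = 1.1, effective stress raised from 100 to 200 kpa
for label, cc in [("untreated ehps", 0.40), ("with bsd (25 % lower cc)", 0.30)]:
    s = consolidation_settlement(cc, h=3.0, e0=1.1, sigma0=100.0, sigma_f=200.0)
    print(f"{label}: settlement ~ {1000 * s:.0f} mm")
```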
the reaction of bsd with the ehps coincides with the reaction of "pyroclastic rock dust" with an expansive soil: the addition of pyroclastic rock alters the clay minerals due to its larger sand-sized grains, and this leads to a reduction of the water absorption rate; as a result, an immediate modification of the soil swell properties takes place [38].

3.5. swelling of ehps-bsd mixtures
the efficiency and performance of bsdc and bsdf in mitigating the potential of the ehps to swell have been included in this work. the results of the swell tests for the designated specimens were used to determine the swell potential (sp) and the swelling pressure (ps). figure 7 highlights the variation of sp with the bsd content; the ls values are also presented in this figure. it is clear that both bsd types are efficient in mitigating the sp values. there is a non-linear decrease in the sp and ls values with an increasing bsd content. however, the values of sp for the ehps-bsdf specimens are significantly lower than those for the ehps-bsdc specimens. for the bsdf50 specimen, the sp value was 6 %; for the comparable coarse-gradation specimen (i.e., bsdc50), the sp value was 7.7 %. research on the stabilisation of soils using additives with a high cao content showed that such additives lead, in addition to a cation exchange, to flocculation and agglomeration of the soil grains. as a result of these two phenomena, larger particles form by the flocking of small particles together [42, 43]. the clear result of using such additives is a considerable reduction in the volume changes of the soil.

figure 7. variation of sp and ls with various bsd percentages.

figure 8 illustrates the axial strain-log pressure curves for the ehps-bsd mixtures. the swell pressure (ps) values were determined from these curves, and the variation of ps with the applied pressure is shown in figure 8. it can be seen that, under the various applied pressures and for both bsd types, all the ehps-bsd specimens showed the same trend in the axial strain-log pressure curves. it was noted that the time to achieve 99 % of the maximum soil swell decreases with an increasing content of bsd; for bsdc50, the time was about 40 % of that required to reach the maximum swell of the unaltered ehps. however, as the applied pressure increased, the reduction in ps for each ehps-bsd specimen significantly decreased; beyond a particular pressure value, the axial strain converted from swell (positive zone) to compression (negative zone). as shown, the bsd mitigated the swell potential and swell pressure to various degrees.

figure 8. axial strain-log pressure curves for ehps-bsd mixtures.

to evaluate the efficiency and performance of bsdc and bsdf in improving the ehps, the degree of improvement, or the improvement ratio (ir), of the sp and ps values has been calculated as shown below:
\[ ir = \frac{sp_i - sp_f}{sp_i} \times 100, \tag{1} \]
\[ ir = \frac{ps_i - ps_f}{ps_i} \times 100, \tag{2} \]
where ir is the improvement ratio (%), sp_i and ps_i are the swell potential and swell pressure of the unaltered ehps specimen, and sp_f and ps_f are the swell potential and swell pressure of the ehps-bsd mixture specimen.

figure 9. improvement ratio for swell pressure and swell potential for ehps-bsd mixtures.

the variation of the improvement ratio with the designated mixtures is shown in figure 9. the statistical parameters for the results shown in figure 8 are given in table 4. as can be seen, the bsd reduced the sp and ps to different degrees: as the bsd content increased, both sp and the swell pressure decreased.

table 4. various statistical parameters for sp and ps.
  parameter            ps (bsdc)   ps (bsdf)   sp (bsdc)   sp (bsdf)
  mean                 162         145         11          9
  cov                  34          46          26          38
  standard deviation   55          67          3           3
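equations (1)–(2) are straightforward to apply. the sketch below recovers the quoted improvement ratios from the treated sp values given in the text; the untreated swell potential of about 13.9 % is inferred from those quoted numbers, as it is not stated explicitly here.

```python
def improvement_ratio(initial, final):
    """improvement ratio of eqs. (1)-(2), in percent."""
    return (initial - final) / initial * 100.0

sp_untreated = 13.9   # inferred from the quoted ratios, not stated explicitly
for label, sp_f in [("bsdc50", 7.7), ("bsdf50", 6.0)]:
    print(f"{label}: ir = {improvement_ratio(sp_untreated, sp_f):.1f} %")
# -> bsdc50: 44.6 %, bsdf50: 56.8 %  (the text quotes 44.7 % and 57.0 %)
```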
however, the performance of the fine-sized bsd is better than that of the coarse-gradation bsd. the maximum improvement ratio of the sp and swell pressure obtained with bsdf was 57.0 % and 71.7 %, respectively, while the best performance of bsdc in improving the sp and swell pressure was 44.7 % and 57.8 %, respectively. these maximum improvement ratios were obtained for the mixtures designated bsdf50 and bsdc50. taking into account the aforementioned reasons, such an improvement can be attributed to the formation of cementitious materials as a result of the pozzolanic reaction, which increases the resistance of the soil to expansion [38, 44]. overall, the addition of bsd altered the structure of the expansive particles and caused a mitigation of the soil swelling potential.

4. conclusions
in this experimental research, the following findings have been obtained:
(1.) there is a characteristic reduction in wl due to the addition of bsdc and bsdf. the percentual decrease in the wl of the ehps was about 50 % and 49 % for bsdc50 and bsdf50, respectively. this is also noted for the wp and ip values. the ip of the ehps declined below 35 % when mixed with bsdf50, while the ehps mixed with the identical quantity of bsdc resulted in an ip value of 40 %. bsd made the texture of the ehps more tenuous.
(2.) the improvement of the ehps with bsd reduces the amount of shrinkage. increasing the bsd content from 0 % to 50 % reduced the ls; at a 50 % bsd content, the percentual decrease in ls was 45.7 % (for bsdc) and 48.8 % (for bsdf). bsdf50 reduced the time required by the ehps to reach the final shrinkage by approximately 50 %.
(3.) the addition of bsd improved the su of the ehps; the soil became stiffer and harder with the addition of bsd. however, the curing period, the type, and the content of bsd had an important role.
(4.) the consolidation parameters of the ehps were reduced due to the addition of bsd; the reduction was 21 to 25 % in cc and 35 to 40 % in cs.
(5.) there is a non-linear decrease in the swelling potential of the ehps with an increasing content of bsd. for bsdf50 and bsdc50, the sp was reduced to 6 % and 7.7 %, respectively. the maximum reduction in the swell pressure due to using bsdf and bsdc was 71.7 % and 57.8 %, respectively.
overall, the addition of bsd altered the structure of the expansive particles and caused a mitigation of the soil swelling potential; however, bsdf was more effective. finally, building stone debris can be a sustainable low-carbon stabiliser of significance in geotechnical applications.

list of symbols
cc  compression index
cs  swelling index
ip  plasticity index [%]
ir  improvement ratio [%]
ls  linear shrinkage [%]
ps_i  swell pressure of the unaltered ehps specimen
ps_f  swell pressure of the ehps-bsd mixture specimen
sp_i  swell potential of the unaltered ehps specimen
sp_f  swell potential of the ehps-bsd mixture specimen
su  shear strength [kpa]
wl  liquid limit [%]
wp  plastic limit [%]

references
[1] a. al-baidhani, a. j. al-taie. recycled crushed ceramic rubble for improving highly expansive soil. transportation infrastructure geotechnology 7(3):426–444, 2020. https://doi.org/10.1007/s40515-020-00120-z.
[2] m. ashfaq, a. a. b. moghal, b. m. basha. carbon footprint analysis on the expansive soil stabilization techniques.
in international foundations congress and equipment expo, asce, 2020. https://doi.org/10.1061/9780784483411.021.
[3] a. j. al-taie. practical aid to identify and evaluate plasticity, swelling and collapsibility of the soil encountered in badrah, shatra and nassirya cities. journal of engineering and sustainable development 20(1):38–47, 2016.
[4] a. al-baidhani, a. j. al-taie. shrinkage and strength behavior of highly plastic clay improved by brick dust. journal of engineering 26(5):95–105, 2020. https://doi.org/10.31026/j.eng.2020.05.07.
[5] a. j. puppala, e. wattanasanticharoen, l. r. hoyos. ranking of four chemical and mechanical stabilization methods to treat low-volume road subgrades in texas. transportation research record 1819(1):63–71, 2003. https://doi.org/10.3141/1819b-09.
[6] s. horpibulsuk, c. phetchuay, a. chinkulkijniwat, a. cholaphatsorn. strength development in silty clay stabilized with calcium carbide residue and fly ash. soils found 53(4):477–486, 2013. https://doi.org/10.1016/j.sandf.2013.06.001.
[7] y. j. du, n. j. jiang, s. y. li, et al. field evaluation of soft highway subgrade soil stabilized with calcium carbide residue. soils and foundations 56(2):301–314, 2016. https://doi.org/10.1016/j.sandf.2016.02.012.
[8] a. arulrajah, e. yaghoubi, y. c. wong, s. horpibulsuk. recycled plastic granules and demolition wastes as construction materials: resilient moduli and strength characteristics. construction and building materials 147:639–647, 2017. https://doi.org/10.1016/j.conbuildmat.2017.04.178.
[9] a. j. al-taie, y. al-shakarchi. shear strength, collapsibility and compressibility characteristics of compacted baiji dune soils. journal of engineering science and technology 12(3):767–779, 2017. http://jestec.taylors.edu.my/vol%2012%20issue%203%20march%202017/12_3_14.pdf.
[10] f. maghool, a. arulrajah, s. horpibulsuk, y. j. du. laboratory evaluation of ladle furnace slag in unbound pavement-base/subbase applications. journal of materials in civil engineering 29(2), 2017. https://doi.org/10.1061/(asce)mt.1943-5533.0001724.
[11] a. j. puppala, a. pedarla. innovative ground improvement techniques for expansive soils. innovative infrastructure solutions 2:24, 2017. https://doi.org/10.1007/s41062-017-0079-2.
[12] s. rios, n. cristelo, a. viana da fonseca, c. ferreira. stiffness behavior of soil stabilized with alkali-activated fly ash from small to large strains. international journal of geomechanics 17(3), 2017. https://doi.org/10.1061/(asce)gm.1943-5622.0000783.
[13] i. o. jimoh, a. a. amadi, e. b. ogunbode. strength characteristics of modified black clay subgrade stabilized with cement kiln dust. innovative infrastructure solutions 3:55, 2018. https://doi.org/10.1007/s41062-018-0154-3.
[14] v. farhangi, m. karakouzian. design of bridge foundations using reinforced micropiles.
in proceedings of the international road federation global r2t conference & expo, las vegas, nv, usa, 2019.
[15] a. j. al-taie, b. s. albusoda, s. alabdullah, a. j. dabdab. an experimental study on leaching in gypseous soil subjected to triaxial loading. geotechnical and geological engineering 37(6):5199–5210, 2019. https://doi.org/10.1007/s10706-019-00974-2.
[16] y. gao, j. he, x. tang, j. chu. calcium carbonate precipitation catalyzed by soybean urease as an improvement method for fine-grained soil. soils and foundations 59(5):1631–1637, 2019. https://doi.org/10.1016/j.sandf.2019.03.014.
[17] a. j. al-taie, a. al-obaidi, m. alzuhairi. utilization of depolymerized recycled polyethylene terephthalate in improving poorly graded soil. transportation infrastructure geotechnology 7(2):206–223, 2020. https://doi.org/10.1007/s40515-019-00099-2.
[18] p. s. k. raja, t. thyagaraj. effect of compaction time delay on compaction and strength behavior of lime-treated expansive soil contacted with sulfate. innovative infrastructure solutions 5:14, 2020. https://doi.org/10.1007/s41062-020-0268-2.
[19] v. farhangi, m. karakouzian. effect of fiber reinforced polymer tubes filled with recycled materials and concrete on structural capacity of pile foundations. applied sciences 10(5):1554, 2020. https://doi.org/10.3390/app10051554.
[20] v. farhangi, m. karakouzian, m. geertsema. effect of micropiles on clean sand liquefaction risk based on cpt and spt. applied sciences 10(9):3111, 2020. https://doi.org/10.3390/app10093111.
[21] a. al-kalili, a. ali, a. al-taie. effect of metakaolin and silica fume on the engineering properties of expansive soil. journal of physics: conference series 1895:012017, 2021. https://doi.org/10.1088/1742-6596/1895/1/012017.
[22] b. v. venkatarama reddy. sustainable materials for low carbon buildings. international journal of low-carbon technologies 4(3):175–181, 2009. https://doi.org/10.1093/ijlct/ctp025.
[23] n. latifi, f. vahedifard, e. ghazanfari, et al. sustainable improvement of clays using low-carbon nontraditional additive. international journal of geomechanics 18(3), 2018. https://doi.org/10.1061/(asce)gm.1943-5622.0001086.
[24] m. p. kumar. cements and concrete mixtures for sustainability. in proceedings of structural engineering world congress, bangalore, india, 2–7 november 2007.
[25] l. f. cabeza, c. barreneche, l. miró, et al. low carbon and low embodied energy materials in buildings: a review. renewable and sustainable energy reviews 23:536–542, 2013. https://doi.org/10.1016/j.rser.2013.03.017.
[26] m. m. roshani, s. h. kargar, v. farhangi, m. karakouzian. predicting the effect of fly ash on concrete's mechanical properties by ann. sustainability 13(3):1469, 2021. https://doi.org/10.3390/su13031469.
[27] b. v. venkatarama reddy, a. gupta. tensile bond strength of soil-cement block masonry couplets using cement-soil mortars. journal of materials in civil engineering 18(1):36–45, 2006. https://doi.org/10.1061/(asce)0899-1561(2006)18:1(36).
[28] b. v. venkatarama reddy, p. kumar. embodied energy in cement stabilized rammed earth walls. energy and buildings 42(3):380–385, 2010. https://doi.org/10.1016/j.enbuild.2009.10.005.
[29] s. jiao, m. cao, y. li. impact research of solid waste on the strength of low carbon building materials. in 2011 international conference on electrical and control engineering, 2011. https://doi.org/10.1109/iceceng.2011.6058160.
[30] m. al-bared, i. harahap, a. marto.
sustainable strength improvement of soft clay stabilized with two sizes of recycled additive. international journal of geomate 15(51):39–46, 2018. https://doi.org/10.21660/2018.51.06065. [31] j. kinuthia, j. oti. designed non-fired clay mixes for sustainable and low carbon use. applied clay science 59-60:131–139, 2012. https://doi.org/10.1016/j.clay.2012.02.021. [32] c. ikeagwuani, d. nwonu. emerging trends in expansive soil stabilisation: a review. journal of rock mechanics and geotechnical engineering 11(2):423–440, 2019. https://doi.org/10.1016/j.jrmge.2018.08.013. [33] d. h. van der merwe. contribution to speciality session b, current theory and practice for building on expansive clays. in proceeding of 6th regional conference for africa on smfe, durban, p. 166–167. 1975. [34] n. al-rubaiey, f. kadhim, a. ati. nano ferrites as corrosion inhibitors for carbon steel in local iraqi bentonite mud. engineering and technology journal 35(8):849–855, 2017. [35] british standard institution. 1990 method of testing soils for civil engineering purposes, b.s. 1377. [36] astm. 2003 annual book of astm standards, vol. 04.08. astm international, west conshohocken, pa. [37] k. head. manual of soil laboratory testing. whittles publishing, dunbeath mill, crc press, scotland, uk, 2011. 588 https://doi.org/10.1007/s41062-017-0079-2 https://doi.org/10.1061/(asce)gm.1943-5622.0000783 https://doi.org/10.1061/(asce)gm.1943-5622.0000783 https://doi.org/10.1007/s41062-018-0154-3 https://doi.org/10.1007/s10706-019-00974-2 https://doi.org/10.1016/j.sandf.2019.03.014 https://doi.org/10.1007/s40515-019-00099-2 https://doi.org/10.1007/s41062-020-0268-2 https://doi.org/10.3390/app10051554 https://doi.org/10.3390/app10093111 https://doi.org/10.1088/1742-6596/1895/1/012017 https://doi.org/10.1093/ijlct/ctp025 https://doi.org/10.1061/(asce)gm.1943-5622.0001086 https://doi.org/10.1061/(asce)gm.1943-5622.0001086 https://doi.org/10.1016/j.rser.2013.03.017 https://doi.org/10.3390/su13031469 https://doi.org/10.1061/(asce)0899-1561(2006)18:1(36) https://doi.org/10.1061/(asce)0899-1561(2006)18:1(36) https://doi.org/10.1016/j.enbuild.2009.10.005 https://doi.org/10.1109/iceceng.2011.6058160 https://doi.org/10.21660/2018.51.06065 https://doi.org/10.1016/j.clay.2012.02.021 https://doi.org/10.1016/j.jrmge.2018.08.013 vol. 61 no. 5/2021 performance characteristics of low carbon waste . . . [38] e. ene, c. okagbue. some basic geotechnical properties of expansive soil modified using pyroclastic dust. engineering geology 107(1-2):61–65, 2009. https://doi.org/10.1016/j.enggeo.2009.03.007. [39] j. paya, m. borrachero, j. monzo, et al. enhanced conductivity measurements techniques for evaluation of fly ash pozzolanic activity. cement and concrete research 31(1):41–49, 2001. https://doi.org/10.1016/s0008-8846(00)00434-8. [40] j. tangpagasit, r. cheerarot, c. jaturapitakkul, k. kiattikomol. packing effect and pozzolanic reaction of fly ash in mortar. cement and concrete research 35(6):1145–1151, 2005. https://doi.org/10.1016/j.cemconres.2004.09.030. [41] m. r. jones, a. mccarthy, a. booth. characteristics of the ultrafine component of fly ash. fuel 85(16):2250–2259, 2006. https://doi.org/10.1016/j.fuel.2006.01.028. [42] a. jha, p. sivapullaiah. mechanism of improvement in the strength and volume change behavior of lime stabilized soil. engineering geology 198:53–64, 2015. https://doi.org/10.1016/j.enggeo.2015.08.020. [43] a. amadi, a. okeiyi. 
the gnu gama project – adjustment of geodetic networks

a. čepek

the development of free software is a well established and successful phenomenon which could hardly exist without the internet, where groups of programmers scattered all around the world are developing software. the idea of free software is highly attractive to talented, creative students and can stimulate and support their professional activities. gnu gama [4], a project for the adjustment of geodetic networks with input data described in xml, is given here as a concrete example. free software [3] (or open source) projects need not be limited to software development but can generally cover any professional project based on free information exchange; a suggested example is the planned collection of model geodetic networks described in xml.

keywords: gnu gama, geodetic network adjustment, free software.

1 software freedom as an academic motivator

this paper reports on the gama free software project for the adjustment of geodetic networks, which was started at the department of mapping and cartography of the faculty of civil engineering, ctu prague. the gama project received the status of gnu software [3] in 2001, and in this paper we would like to describe our positive experience of the attractiveness of the idea of free software (namely the gnu project [1]) to talented students, and how it can help to stimulate and support their creative work, international contacts and professional growth. free software has much in common with the spirit of university education (most of all, the tradition of free sharing of information). for this reason we start our introduction to the gama project with a description of its evolution against the background of the activities of commission 2 for professional education of fig (international federation of surveyors) and its virtual academy working group. the discussions and projects presented at the fig workshop and seminar in helsinki [14] in 2001 show that the virtual academy is mainly associated with the exploitation of technical facilities and communication tools (provided by the internet today), and most of the current fig projects in the field of internet education are oriented more or less toward mass education at the bachelor or master level.
perhaps due to our different background (our country is still in a transitional period), we put more emphasis on our phd programs, and the idea of the virtual academy is particularly attractive to us as a way of building professional international contacts and collaboration. our phd students will be responsible for our future development, as they will be the source of our future academic staff. ways of limiting time-consuming and routine tasks like testing students' knowledge are undoubtedly attractive, and deserve extensive support and development. yet however fascinating present technologies may be, they lack the maturity of traditional educational tools and courses. technically speaking, we have systems and educational programs today that are capable of serving an almost unlimited number of students. does this mean the end of most traditional universities, and are they going to be replaced by virtual universities equipped with sufficiently powerful hardware and sophisticated educational programs? surely not, because the distinction of a university reflects above all the quality of its people, while its equipment is always a secondary element. we are convinced that when thinking about the virtual academy, and about education in general, more emphasis should be put on human communication and collaboration than on technical facilities. in this sense the best inspiration for the virtual academy is provided by successful free software projects like gnu and many others.

2 about gnu gama

the gama project (the acronym comes from geodesy and mapping) provides free software written in c++, released under the terms of the gnu general public licence [2], aimed at adjusting geodetic networks (http://www.gnu.org/software/gama/). the project was started by aleš čepek in 1998, but soon his first students joined the project. the original idea was to start a project to demonstrate to students the capability and power of object-oriented programming, and at the same time to present and conserve practical experience and knowledge gained in previous years at the research institute for geodesy (vúgtk zdiby) [11]. the gama project has been strongly influenced by the work of františek charamza, namely his research in the field of applying gram-schmidt orthogonalisation as a general numerical adjustment method (the gso algorithm). without going into detail, we can say that gso is an orthogonalisation algorithm that solves the adjustment without normal equations, directly from the project equations, and thus avoids possible numerical problems in the case of ill-conditioned systems (the condition number of a normal equations matrix is the square of the condition number of a design matrix). the gama project uses the singular value decomposition algorithm (svd) for the numerical solution of the adjustment, but gso is also available as an alternative. thanks to its object-oriented design, gama can easily adopt other numerical methods, like cholesky decomposition of normal equations or numerical solutions exploiting the sparse structure of the design matrix (however, in our current plans these enhancements have low priority).
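the conditioning argument above is easy to demonstrate numerically. the following sketch is our own illustration (python with numpy, not gama itself, which is written in c++): it builds an ill-conditioned design matrix and compares a least-squares solution obtained via normal equations with one obtained directly from the design matrix by svd.

```python
import numpy as np

# an ill-conditioned design matrix a (vandermonde-like) and true parameters
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 100)
a = np.vander(t, 12, increasing=True)        # cond(a) is large
x_true = rng.normal(size=12)
b = a @ x_true

print("cond(a)     =", np.linalg.cond(a))
print("cond(a^t a) =", np.linalg.cond(a.T @ a))  # roughly cond(a)**2

# adjustment via normal equations (squares the condition number)
x_normal = np.linalg.solve(a.T @ a, a.T @ b)

# adjustment directly from the design matrix via svd
x_svd, *_ = np.linalg.lstsq(a, b, rcond=None)

print("error, normal equations:", np.linalg.norm(x_normal - x_true))
print("error, svd:             ", np.linalg.norm(x_svd - x_true))
```

on such a matrix the normal-equation solution typically loses roughly twice as many significant digits as the svd solution, which is exactly the point made in the text.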
originally the project was not expected to be presented outside our faculty, so everything was done in czech, from the c++ names to the documentation. this version of the program allowed only uncorrelated horizontal directions and distances in a local triangulation network to be adjusted. the project was first introduced to a wider public at the fig working week 2000 [13]. professor henrik haggrén (from hut, finland [12]) proposed that the gama project be presented in 2001 at the workshop and seminar in helsinki, and we then started working on adapting gama for use outside the local community of czech users. two major changes had to be made:

• redefinition of the czech tags in the xml input data into their english equivalents, and
• implementation of new observation data structures to enable general adjustment of correlated observations.

in this second point, we benefitted from discussions with henrik haggrén and from his interest in possible future enhancements to enable adjustment of photogrammetric observations together with classical observation types. the design of the data structures is the cornerstone of any software project. to implement a new observation type in our project, it is only necessary to derive the new corresponding class and to define in it a few virtual functions. the design of gama classes enables the implementation of adjustment of any information with a given variance-covariance matrix that can be linearized with respect to the coordinates and other unknown parameters. the gama project uses the concept of clusters to handle possibly correlated observations of a general kind. in c++ terminology a cluster is a container class that maintains a set of observations with a common variance-covariance matrix (symmetric, band or diagonal), as depicted in fig. 1.

fig. 1: basic observation data structures

observations are independent of point data which, in the current version, is defined only for a local geodetic network (an adjustment model on the ellipsoid is our next planned goal). observation data structures have been designed to enable easy implementation of sparse matrix algorithms in the network adjustment for future versions. currently, the following observation types are supported:

• horizontal directions and distances,
• horizontal angles,
• slope distances and zenith angles,
• height differences,
• observed coordinates (used in sequential adjustment, etc.), and
• observed coordinate differences (vectors).

in order to add a new observation type to the project, it is necessary to derive the corresponding observation class and to override its virtual functions for linearization with the actual code for that observation type. in this way, for example, photogrammetric measurements can be incorporated and simultaneously adjusted together with classical surveying observables in a common three-dimensional network. the cluster design is illustrated schematically below.
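the following is a minimal schematic of the design just described, written in python purely for illustration; gama's real implementation is in c++ and differs in detail, and all class, method and parameter names here are invented. a concrete observation type supplies its own linearization, while a cluster groups observations sharing one variance-covariance matrix.

```python
import numpy as np

class Observation:
    """base class; a concrete type supplies its own linearization,
    i.e. a row of the design matrix and the misclosure."""
    def linearize(self, points, unknowns):
        raise NotImplementedError

class Distance(Observation):
    def __init__(self, frm, to, value):
        self.frm, self.to, self.value = frm, to, value

    def linearize(self, points, unknowns):
        xa, ya = points[self.frm]
        xb, yb = points[self.to]
        d0 = np.hypot(xb - xa, yb - ya)       # distance from approximations
        row = np.zeros(len(unknowns))
        for pid, sign in ((self.frm, -1.0), (self.to, 1.0)):
            if (pid, "x") in unknowns:
                row[unknowns[(pid, "x")]] = sign * (xb - xa) / d0
            if (pid, "y") in unknowns:
                row[unknowns[(pid, "y")]] = sign * (yb - ya) / d0
        return row, self.value - d0           # misclosure (observed - computed)

class Cluster:
    """a set of observations sharing a common variance-covariance matrix."""
    def __init__(self, observations, covariance):
        self.observations = observations
        self.covariance = np.asarray(covariance)
```

adding, say, a zenith angle then amounts to deriving one more class with its own `linearize`, exactly the extension mechanism described in the text.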
in parallel with the above-mentioned changes we have worked on other enhancements and improvements necessary for making gama into gnu software [3]. we passed the gnu review process, and in november 2001 gama was declared a gnu package. the status of gnu software is for the gama project not only a matter of prestige: it also provides extensive technical support, from the gnu web space and archives to the cvs repository, a know-how background for project policies, and legal advice. last but not least, the status of gnu software will help us to involve more students and developers in the project.

3 using xml as a data format

the primary motivation for the use of extensible markup language (xml) in the gama project was to define structured input data for the adjustment of a local geodetic network. extensible markup language [7] describes a class of data objects called xml documents, and is generally expected to be one of the most important communication standards in the near future. the data is described (similarly to html) with a set of user-defined xml markup tags. probably the most important feature of xml is the ease of defining a grammar for our data (a class of xml documents); we can thus validate the data even independently of our application. this grammar is known as a document type definition, or dtd. the document type definition for the gama xml input data format is available at http://www.gnu.org/software/gama/gama-xml.dtd

a dtd is not the only possible schema for defining xml syntax, and generally not even the best one. however, it is quite sufficient for the relatively simple syntax of gama input, as shown in the short (but complete) example of fig. 2.

fig. 2: example of gama input

most of this example can surely be understood intuitively without bothering about the syntax, whereas formally parsing xml is not a trivial task. without going into technical details, let us just note that gama uses the expat parser [8] version 1.1, written by james clark.
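to give a flavour of what processing such an input looks like, here is our own toy sketch in python using the standard-library parser. the tag and attribute names are illustrative only, patterned loosely on the structure of gama's input (points plus observations grouped by a station), and do not follow the official dtd; gama itself parses its xml with expat in c++.

```python
import xml.etree.ElementTree as ET

# a toy input; element and attribute names are invented for illustration
doc = """
<network>
  <points>
    <point id="1" x="1054.81" y="683.29" fixed="xy"/>
    <point id="2"/>
  </points>
  <observations from="1">
    <distance  to="2" val="303.861" stdev="5.0"/>
    <direction to="2" val="28.3712" stdev="2.0"/>
  </observations>
</network>
"""

root = ET.fromstring(doc)
station = root.find("observations").get("from")
for obs in root.find("observations"):
    print(obs.tag, station, "->", obs.get("to"),
          "value:", float(obs.get("val")), "stdev:", obs.get("stdev"))
```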
4 rocinante

xml is a replacement for ascii data files, and can be viewed or edited with a text editor or simply printed. many tools are available for xml processing, e.g., for viewing it, transforming it into html, or even editing it. however, a typical user would never edit xml input manually as a raw text file. the gama project originally focused on developing a platform-independent c++ library (gamalib) with only a simple command line tool (the gama program) for processing xml input data. our primary platform is gnu/linux, and to help our users to use gama adjustment we have been running a mail-server and a web interface (webgama) based on the php scripting language. we considered the possibility of presenting our adjustment as a web-based application implemented with c++ cgi scripts, but this proved to be impractical, and for various reasons we did not want to use the java language. as most of our potential users come from the microsoft windows platform, we provided a windows application (gamawed) built with the borland c++ builder. actually, we wanted to be prepared in advance for the case when inprise/borland would port their c++ builder to linux; but until now only kylix is available. however, none of these solutions was satisfactory.

in 2001, jan pytel (a student from our faculty) decided to write a graphical user interface (gui) for gama based on the qt graphical library [10]. the first beta versions of his program rocinante [6] became available in november 2001. rocinante now forms part of the gama project, and an example screenshot is shown in fig. 3.

fig. 3: rocinante – the qt-based gui for gnu gama

it should be stressed that rocinante is a platform-independent gui; the qt library is available on linux, ms windows and macos, among others. the qt c++ library is one of the best graphical libraries available; for example, it is used in linux kde and in borland kylix, and for linux it is available free of charge under the gpl licence. rocinante provides a model example of how students can be involved and can actively contribute to a project, simply for the joy of doing good work and supporting the idea of free software.

5 future plans

to conclude our paper, we summarise what is still missing in the gnu gama project, what has to be done, and our plans for future development.

• we have prepared a project to produce outputs in any language that can be expressed in the utf-8 encoding (unicode). technically, this is done by separating all texts into xml files, with possibly different input encodings, from which the corresponding c++ functions for dynamic language switching and utf-8 strings are generated. this could form the basis for an international student project (participants need not be programmers).
• naturally, adjustment results should also be available in xml format, like the input data. one of our plans is to enable adjustment in steps using earlier adjustment results. this point was deliberately postponed until we decide whether to define an ad hoc gama xml adjustment output format or to use a more general scheme inspired by the borland xml general data packets.
• another goal of our project is to gather a collection of model geodetic networks and to present them with their adjustment results on the web. xml is the best tool for this purpose. a lack of reliable model data for debugging and testing (namely in the case of 3d networks) was a problem that we had to cope with. typically, the examples given in textbooks on geodetic adjustment and least squares are trivial and useful only for demonstration purposes; they cannot be used in software development.
• this year we plan to add to rocinante a plug-in for graphical output described in scalable vector graphics (svg) [9], a general graphical format based on xml (a minimal illustration of such output follows this list).
• closely related to gama is another project of ours aimed at processing levelling observations, namely a project for a general xml format describing levelling data from various recording units.
• a major enhancement planned for gama is the implementation of adjustment in a global coordinate system, together with classes for handling cartographic projections (which could be added as plug-ins). discussions during the fig congress [15] with frank leahy from melbourne university about his experience and work [16] helped us to clarify the choice of a geocentric reference frame to be used in the gama adjustment model.
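svg output of the kind mentioned in the list above is easy to produce, since svg is itself plain xml. the fragment below is our own illustration only, not the planned rocinante plug-in; it writes a trivial network sketch (points, labels and connecting lines) to an svg file.

```python
points = {"1": (30, 40), "2": (160, 90), "3": (80, 150)}  # id -> (x, y)
lines = [("1", "2"), ("2", "3")]

svg = ['<svg xmlns="http://www.w3.org/2000/svg" width="200" height="200">']
for a, b in lines:
    (x1, y1), (x2, y2) = points[a], points[b]
    svg.append(f'<line x1="{x1}" y1="{y1}" x2="{x2}" y2="{y2}" stroke="black"/>')
for pid, (x, y) in points.items():
    svg.append(f'<circle cx="{x}" cy="{y}" r="3" fill="red"/>')
    svg.append(f'<text x="{x + 5}" y="{y - 5}" font-size="10">{pid}</text>')
svg.append("</svg>")

with open("network.svg", "w") as f:
    f.write("\n".join(svg))
```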
the gama project is gnu software. anybody can use and redistribute it freely, and all source codes are available under the gpl licence. furthermore, it is platform independent and currently runs on gnu/linux and on microsoft windows (we are looking for a volunteer to port it to the macos operating system). starting this year, we have included gnu gama/rocinante in one of our undergraduate courses, in which our students are introduced to the basic geodetic software used at our faculty. at the current stage, rocinante is ready to be used in areas from basic surveying computations (intersections, traverses, etc.) to the adjustment of special measurements of 3d networks in engineering geodesy.

acknowledgement

support of the ministry of education, youth and sports of the czech republic, project msm: 210000007, is highly appreciated.

references

[1] free software foundation, inc.: gnu's not unix! [online]. fsf, boston, ma, usa [cit. 2002-07-13]. updated 2002-07-01.
[2] free software foundation, inc.: gnu general public license [online]. [cit. 2002-07-13]. updated 2001-07-15.
[3] free software foundation, inc.: categories of free and non-free software [online]. [cit. 2002-07-13]. updated 2002-03-09.
[4] free software foundation, inc.: gnu gama [online]. [cit. 2002-07-13]. updated 2002-06-03.
[5] free software foundation, inc.: gnu gama dtd [online]. version 2.0.10. [cit. 2002-07-13]. updated 2002-05-04.
[6] free software foundation, inc.: gnu gama / rocinante [online]. [cit. 2002-07-13].
[7] world wide web consortium (w3c): extensible markup language (xml) 1.0 (second edition), 6 oct. 2000 [online]. [cit. 2002-07-13].
[8] clark, james: expat – xml parser toolkit [online]. [cit. 2002-07-13].
[9] world wide web consortium (w3c): scalable vector graphics (svg) 1.0 specification, 4 sept. 2001 [online]. [cit. 2002-07-13].
[10] trolltech a.s.: trolltech [online]. [cit. 2002-07-13].
[11] výzkumný ústav geodetický, topografický a kartografický ve zdibech: vúgtk – úvodní stránka [online]. [cit. 2002-07-13]. updated 2002-04-29.
[12] helsinki university of technology: teknillinen korkeakoulu – tkk [online]. [cit. 2002-07-13].
[13] čepek, a., hnojil, j.: internet in education, practical experience and future plans. fig working week 2000, may 21–26, prague, 2000.
[14] čepek, a.: free open source project gama for adjustment of local geodetic networks. fig workshop and seminar: virtual academy, june 5–8, 2001, helsinki university of technology, lifelong learning institute dipoli.
[15] čepek, a., pytel, j.: free software – an inspiration for virtual academy. xxii fig international congress and acsm/asprs conference and technology exhibition 2002, april 19–26, 2002, washington, dc, usa.
[16] leahy, f. j.: the automatic segmenting and sequential phased adjustment of large geodetic networks. (dissertation), university of melbourne, 1999.

doc. ing. aleš čepek, csc.
phone: +420 224 354 657
fax: +420 224 355 419
e-mail: cepek@fsv.cvut.cz
web site: http://gama.fsv.cvut.cz/~cepek
dept. of mapping and cartography
czech technical university in prague
faculty of civil engineering
thákurova 7
166 29 prague, czech republic
numerical calculation of the complex berry phase in non-hermitian systems

marcel wagner, felix dangel, holger cartarius, jörg main, günter wunner
institut für theoretische physik 1, universität stuttgart, 70550 stuttgart, germany
corresponding author: marcel.wagner@itp1.uni-stuttgart.de

abstract. we numerically investigate topological phases of periodic lattice systems in a tight-binding description under the influence of dissipation. the effects of dissipation are effectively described by pt-symmetric potentials. in this framework we develop a general numerical gauge smoothing procedure to calculate complex berry phases from the biorthogonal basis of the system's non-hermitian hamiltonian. further, we apply this method to a one-dimensional pt-symmetric lattice system and verify our numerical results by an analytical calculation.

keywords: complex berry phase; pt symmetry; gauge smoothing.

1. introduction

due to their robustness against local defects or disorder, topologically protected states such as majorana fermions [1–3] are of high value for physical applications such as quantum computation [4]. however, no physical system is completely isolated, and dissipation can have an important influence on the states [5]. majorana fermions can even be created with the help of dissipative effects [6, 7]. of special importance in this context is the case of balanced gain and loss as described by pt-symmetric complex potentials [8], which has attracted large interest in quantum mechanics [9–13]. the stationary schrödinger equation was solved for scattering solutions [14] and bound states [15], and questions concerning other symmetries [16, 17] as well as the meaning of non-hermitian hamiltonians have also been discussed [18, 19]. their influence on many-particle systems has been studied mainly in the context of bose-einstein condensates [20–25] but also on lattice systems [26–31]. in the latter systems it was shown that pt-symmetric complex potentials may eliminate the topologically protected states existing in the same system under complete isolation [26–30, 32, 33]. however, in some pt-symmetric potential landscapes they can survive [28–32, 34], which has been confirmed in impressive experiments [33, 35].

in most theoretical studies the topologically protected states have been identified via their property of closing the energy gap of the band structure or via their localisation at edges or interfaces of the systems [26, 28–30, 32]. the identification and calculation of topological invariants such as the zak phase [36], known from hermitian systems, leads to new challenges in the case of non-hermitian operators [27, 37, 38]. this is especially true if the eigenstates are only available numerically. indeed, in refs. [27, 37, 38] all calculations have been done for eigenstates which are analytically available. however, for arbitrary pt-symmetric complex potentials analytical access to the eigenstates is not available, and a reliable numerical procedure is required.

for the calculation of the invariants an integration of phases over a loop in parameter space is typically necessary. for example, in the case of the zak phase, which is applied to one-dimensional systems, this integral runs over the first brillouin zone. the integrand contains the eigenstates of the hamiltonian and their first derivatives.
in a numerical calculation it is evaluated at discrete points in momentum space, and each state possesses an arbitrary global phase spoiling the phase relation. this is the point where our study sets in.

in this article we introduce a robust method of calculating the complex berry phase numerically. it is based on a normalisation of the left- and right-hand eigenvectors with the biorthogonal inner product [39], which reduces to the c product [40] in the case of complex symmetric hamiltonians. to obtain an unambiguous complex berry phase we introduce a numerical gauge smoothing procedure. it consists of two parts. first, we have to remove the influence of the arbitrary and unconnected global phases of the eigenstates, which are unavoidably attached to them at each point in parameter space. this is achieved by relating the eigenvectors of consecutive steps in parameter space, and then normalising them. with this approach the eigenvectors are not yet single-valued, i.e. the vectors at the starting and end point of the loop possess different phases. these points have to be identified and will be referred to as the basepoint of the loop later on. the phase difference between the different left and right states at the basepoint has to be corrected by ensuring that the eigenvectors are identical at the basepoint.

the article is organised as follows. first, we introduce the complex berry phase in section 2. in section 3 we establish the algorithm of the gauge smoothing procedure for non-hermitian (and hermitian) hamiltonians. an example is presented in section 4, where we apply the previously developed method to a non-hermitian extension of the su-schrieffer-heeger (ssh) model [41] to calculate its complex zak phase.

2. complex berry phase

topological phases of closed one-dimensional periodic lattice systems are characterised by the zak phase [36], which is the berry phase [42] picked up by an eigenstate when it is transported once along the brillouin zone. in the presence of an antiunitary symmetry these phases are quantised [43] and can be related to the winding number of a vector n(k) determining the bloch hamiltonian

\[ h(k) = \mathbf{n}(k) \cdot \boldsymbol{\sigma}, \tag{1} \]

where σ denotes the vector of pauli matrices and k is the wave number parametrising the brillouin zone, which acts as the parameter space. in this case the zak phase characterises the system's topological phase.

this concept can be generalised to dissipative systems effectively described by a pt-symmetric non-hermitian hamiltonian h(α). the complex berry phase γ_n of a biorthogonal pair of eigenvectors ⟨χ_n| and |φ_n⟩ of h(α),

\[ \gamma_n = \mathrm{i} \oint_{c} \langle \chi_n | \nabla_{\boldsymbol{\alpha}} | \phi_n \rangle \cdot \mathrm{d}\boldsymbol{\alpha}, \tag{2} \]

follows from the lowest order of the adiabatic approximation of the time evolution of a state in parameter space [44]. here c is a loop in parameter space and α = (α_1, ..., α_i, ...) are its coordinates. we consider pt-symmetric non-hermitian hamiltonians of the form

\[ h(\boldsymbol{\alpha}) = h_{\mathrm{h}}(\boldsymbol{\alpha}) + h_{\mathrm{nh}}(\boldsymbol{\alpha}), \tag{3} \]

where h_h denotes the hermitian part and h_nh represents the pt-symmetric non-hermitian part. the non-hermitian part is a complex potential modelling the gain and loss of particles.

the complex berry phase γ_n arising from the periodic modulation of states in the parameter space of a pt-symmetric one-dimensional system cannot be related to a real winding number calculated from (1).
hence, the calculation of the complex berry phase requires the determination of gauge-smoothed eigenvector pairs along the loop c in parameter space, allowing for the evaluation of (2). here it is important to note that the argumentation of hatsugai [43] for the quantisation of berry phases of hermitian hamiltonians can be extended to complex berry phases of non-hermitian pt-symmetric hamiltonians in the case of unbroken pt symmetry. one finds the real part of the complex berry phase to take the values 0 or π modulo 2π. thus, a strict quantisation is still present, and the pt symmetry protects the topological phases occurring in such systems.

3. numerical gauge smoothing

in this section we present a numerical procedure to determine the left and right eigenvectors ⟨χ_n| and |φ_n⟩ in an appropriately smoothed gauge to compute complex berry phases on a discretised loop c = (α_1, ..., α_j, ..., α_m = α_1) in parameter space. this is necessary for the evaluation of integrals of the form in (2). typically the left and right eigenvectors have to be calculated independently, and each of them has an arbitrary global phase. the biorthogonal normalisation condition [39],

\[ \langle\chi_n(\boldsymbol{\alpha}_j)| \to \frac{\langle\chi_n(\boldsymbol{\alpha}_j)|}{\sqrt{\langle\chi_n(\boldsymbol{\alpha}_j)|\phi_n(\boldsymbol{\alpha}_j)\rangle}}, \tag{4a} \]
\[ |\phi_n(\boldsymbol{\alpha}_j)\rangle \to \frac{|\phi_n(\boldsymbol{\alpha}_j)\rangle}{\sqrt{\langle\chi_n(\boldsymbol{\alpha}_j)|\phi_n(\boldsymbol{\alpha}_j)\rangle}}, \tag{4b} \]

chooses one arbitrary global phase for each α_j. this is sufficient if only products or matrix elements of eigenstates belonging to the same point in parameter space are required. however, for the numerical derivatives used in (2) the remaining global phases of successive steps α_j along the loop c can spoil the complex zak phase. a fixation of the phase between consecutive steps that does not distort the desired result is required.

the starting point for the gauge smoothing procedure are the left- and right-handed versions of the time-independent schrödinger equation,

\[ \langle\chi_n(\boldsymbol{\alpha})|\, h(\boldsymbol{\alpha}) = e_n(\boldsymbol{\alpha})\,\langle\chi_n(\boldsymbol{\alpha})|, \tag{5a} \]
\[ h(\boldsymbol{\alpha})\,|\phi_n(\boldsymbol{\alpha})\rangle = e_n(\boldsymbol{\alpha})\,|\phi_n(\boldsymbol{\alpha})\rangle, \tag{5b} \]

defining a set of natural left and right basis states ⟨χ_n(α)| and |φ_n(α)⟩. these equations are solved for every point α_j of the discretised loop c in parameter space, providing the eigenvalues e_n(α_j) and the unnormalised states of a biorthogonal basis {⟨χ_n(α_j)|, |φ_n(α_j)⟩} of the hamiltonian h(α_j) at each point α_j. here the basis states are determined up to the aforementioned arbitrary phases.

to smooth the gauge within the loop in parameter space with basepoint α_1 one chooses an arbitrary global phase; it is most convenient to do this at the basepoint. the corresponding eigenstates are normalised according to the conditions (4a) and (4b). the following two-stage procedure transfers the choice of the global phase at α_1 onto the other basis states along the loop c.

first one modifies the phases of the states ⟨χ_n(α_j)| and |φ_n(α_j)⟩ iteratively by

\[ \langle\chi_n(\boldsymbol{\alpha}_j)| \to \langle\chi_n(\boldsymbol{\alpha}_j)|\, e^{-\mathrm{i}\arg(\langle\chi_n(\boldsymbol{\alpha}_j)|\phi_n(\boldsymbol{\alpha}_{j-1})\rangle)}, \tag{6a} \]
\[ |\phi_n(\boldsymbol{\alpha}_j)\rangle \to |\phi_n(\boldsymbol{\alpha}_j)\rangle\, e^{-\mathrm{i}\arg(\langle\chi_n(\boldsymbol{\alpha}_{j-1})|\phi_n(\boldsymbol{\alpha}_j)\rangle)}, \tag{6b} \]

followed by a normalisation of the states according to (4a) and (4b). equations (6a) and (6b) relate the vectors of step j to those of step j − 1 by ensuring

\[ \mathrm{im}\big(\langle\chi_n(\boldsymbol{\alpha}_j)|\phi_n(\boldsymbol{\alpha}_{j-1})\rangle\big) = 0, \tag{7a} \]
\[ \mathrm{im}\big(\langle\chi_n(\boldsymbol{\alpha}_{j-1})|\phi_n(\boldsymbol{\alpha}_j)\rangle\big) = 0, \tag{7b} \]

which is a valid condition in the continuous limit. the normalisation conditions (4a) and (4b) ensure that the basis states now fulfil

\[ \langle\chi_m(\boldsymbol{\alpha}_j)|\phi_n(\boldsymbol{\alpha}_j)\rangle = \delta_{mn} \tag{8} \]

for j ∈ {1, ..., m} and for all n and m. as a result of this first step the arbitrary global phases have been removed.
only one arbitrary phase is left, which has no influence, since it is identical for all right eigenvectors and its complex conjugate for all left eigenvectors. however, the biorthogonal basis following from the procedure so far is not single-valued in the parameter space. in particular, the vectors at the starting and end point of the loop are not identical. for the calculation of a berry phase a continuous single-valued phase function is essential [42], and thus has to be established. to this end, in the second step one adjusts the basis states such that they are the same at the starting and the end point of the loop. this can be achieved by compensating the phase difference between the states at the basepoint, ⟨χ_n(α_1)| and ⟨χ_n(α_m)|, respectively, as well as |φ_n(α_1)⟩ and |φ_n(α_m)⟩. this remains true for single vector components. therefore we calculate the phase difference of the first non-vanishing component p of the left basis states ⟨χ_n(α_1)| and ⟨χ_n(α_m)|,

\[ \Delta\varphi_n = \varphi_{n,m} - \varphi_{n,1} + 2\pi x_n, \tag{9} \]

where φ_{n,j} = arg(⟨χ_n(α_j)|_p) is the argument of component p of the left eigenvector at the point α_j in parameter space, and x_n denotes the sum of directed crossings of the phase φ_{n,j} over the borders of the standard interval [−π, π). starting with x_n = 0, we increase x_n by one for every jump of φ_{n,j} from −π to π and subtract 1 for the opposite direction. the states of the biorthogonal basis can then finally be gauge-smoothed by multiplying the states at α_j by a phase factor according to

\[ \langle\chi_n(\boldsymbol{\alpha}_j)| \to \langle\chi_n(\boldsymbol{\alpha}_j)|\, e^{-\mathrm{i}\, f_{\Delta\varphi_n}((j-1)/(m-1))}, \tag{10a} \]
\[ |\phi_n(\boldsymbol{\alpha}_j)\rangle \to |\phi_n(\boldsymbol{\alpha}_j)\rangle\, e^{\mathrm{i}\, f_{\Delta\varphi_n}((j-1)/(m-1))} \tag{10b} \]

for j ∈ {1, ..., m}, where f_{Δφ_n}(x) is any "smooth" real-valued continuous function

\[ f_{\Delta\varphi_n} : [0,1] \to \mathbb{R}, \tag{11a} \]

fulfilling

\[ f_{\Delta\varphi_n}(0) = 0, \qquad f_{\Delta\varphi_n}(1) = \Delta\varphi_n \pm 2z\pi, \tag{11b} \]

with z ∈ ℤ. its explicit form is not critical, since it only has to correct the total phase change over the whole range of the loop. however, a linear progression of the phase correction from step to step turns out to be a good choice.

it should be mentioned that in the case of a degeneracy of the eigenvalue at α_j the solution of (5a) and (5b) yields an arbitrary linear combination of eigenvectors of the degenerate eigenspace. to find the correct eigenvectors ⟨χ_n(α_j)| and |φ_n(α_j)⟩ one can apply a biorthogonal gram-schmidt algorithm [45]. if α_{j−1} is a point neighbouring the degeneracy, one tries to find a linear combination of the vectors of the left degenerate eigenspace fulfilling

\[ \langle\chi_m(\boldsymbol{\alpha}_j)|\phi_n(\boldsymbol{\alpha}_{j-1})\rangle \approx \delta_{mn} \tag{12} \]

and then chooses the right eigenvectors such that

\[ \langle\chi_m(\boldsymbol{\alpha}_j)|\phi_n(\boldsymbol{\alpha}_j)\rangle = \delta_{mn}. \tag{13} \]

alternatively, one can treat the real and imaginary parts of the degenerate eigenvector components as "smooth" functions. then the eigenvector components at degeneracy points can be predicted by fitting a spline to the vector components at neighbouring points α_l. an approximation to the correct eigenvectors of the degenerate eigenspace can be determined by a linear combination of the obtained vectors of the degenerate eigenspace such that they fit best to the prediction. hermitian hamiltonians can be treated as a special case, in which the left eigenvector fulfils ⟨χ_n| = (|φ_n⟩^t)^*.
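the procedure above translates into a few lines of numerical code. the sketch below is our own minimal illustration (python with numpy; all function names are invented): instead of carrying out the explicit two-step phase correction of (6) and (10), it uses the equivalent discrete-overlap formulation, in which the arbitrary per-point phases of the biorthogonal pairs cancel along the closed loop, so that the same complex zak phase is obtained (its real part modulo 2π). it assumes non-degenerate bands and an unbroken pt phase, so that the bands can be ordered by the real part of the energy.

```python
import numpy as np

def biorthogonal_pairs(h):
    """right eigenvectors as the columns of vr; taking the rows of
    inv(vr) as the left eigenvectors enforces the biorthogonal
    normalisation <chi_m|phi_n> = delta_mn, cf. (4a)/(4b) and (8)."""
    w, vr = np.linalg.eig(h)
    order = np.argsort(w.real)      # band ordering; assumes unbroken pt phase
    vr = vr[:, order]
    vl = np.linalg.inv(vr)          # rows are the left eigenvectors
    return w[order], vl, vr

def complex_zak_phase(h_of_k, band=0, nk=2000):
    """gamma_n = i * sum_j log <chi_n(k_j)|phi_n(k_{j+1})>, a closed-loop
    discretisation of the integral; the real part is defined modulo 2*pi."""
    ks = np.linspace(-np.pi, np.pi, nk, endpoint=False)
    chi, phi = [], []
    for k in ks:
        _, vl, vr = biorthogonal_pairs(h_of_k(k))
        chi.append(vl[band])
        phi.append(vr[:, band])
    gamma = 0.0 + 0.0j
    for j in range(nk):
        # on a fine mesh each overlap is close to 1, so the principal
        # branch of the logarithm is safe; j+1 mod nk closes the loop
        overlap = chi[j] @ phi[(j + 1) % nk]
        gamma += 1j * np.log(overlap)
    return gamma
```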
4. application to a one-dimensional lattice system

in this section we apply the gauge smoothing procedure developed in section 3 to a pt-symmetric one-dimensional lattice system to calculate the complex zak phase

\[ \gamma_n = \oint_{\mathrm{bz}} \langle\chi_n|\partial_k|\phi_n\rangle\,\mathrm{d}k, \tag{14} \]

where the parameter space is given by the discretised brillouin zone bz and k is the wave number. as an example we consider the ssh model [41] with n lattice sites, tunnelling amplitudes t_+ and t_−, and creation (annihilation) operators c†_n (c_n) of spinless fermions at site n,

\[ h_{\mathrm{ssh}} = \sum_{n=1}^{n/2} t_- \big( c^{\dagger}_{a_n} c_{b_n} + \mathrm{h.c.} \big) + \sum_{n=1}^{n/2-1} t_+ \big( c^{\dagger}_{b_n} c_{a_{n+1}} + \mathrm{h.c.} \big). \tag{15} \]

we apply an alternating non-hermitian pt-symmetric potential of the form

\[ u = \mathrm{i}\,\frac{\gamma}{2} \sum_{n=1}^{n/2} \big( c^{\dagger}_{b_n} c_{b_n} - c^{\dagger}_{a_n} c_{a_n} \big), \tag{16} \]

where γ denotes the parameter of gain and loss. the pt-symmetric hamiltonian describing this model (sketched in figure 1) is given by

\[ h = h_{\mathrm{ssh}} + u. \tag{17} \]

figure 1. (colour online) sketch of the ssh model with n lattice sites subject to the alternating imaginary potential u. the minus (plus) sign marks a negative (positive) imaginary potential corresponding to particle sinks (sources).

to evaluate (14) we need to represent this hamiltonian in the reciprocal space, where the brillouin zone acts as parameter space. this is done by rewriting the hamiltonian with creation and annihilation operators in the reciprocal space in the limit n → ∞,

\[ h = \sum_{k=-\pi}^{\pi} \big( c^{\dagger}_{a,k},\, c^{\dagger}_{b,k} \big) \begin{pmatrix} -\mathrm{i}\gamma/2 & t_- + t_+ e^{\mathrm{i}k} \\ t_- + t_+ e^{-\mathrm{i}k} & \mathrm{i}\gamma/2 \end{pmatrix} \begin{pmatrix} c_{a,k} \\ c_{b,k} \end{pmatrix}, \tag{18} \]

where the sum runs over discrete values of k in steps of 2π/n, and the annihilation operator of an electron with wave number k is given by

\[ c_n = \frac{1}{\sqrt{n}} \sum_{k} c_k\, e^{-\mathrm{i}k r_n} \tag{19} \]

with r_n = an and the lattice spacing a. the matrix occurring in (18) is the bloch hamiltonian h(k) of the system, which can be decomposed into the pauli matrices,

\[ h(k) = \big( t_- + t_+ \cos(k) \big)\,\sigma_1 - t_+ \sin(k)\,\sigma_2 - \mathrm{i}\,\frac{\gamma}{2}\,\sigma_3 = \mathbf{n}(k) \cdot \boldsymbol{\sigma} \tag{20} \]

with a coefficient vector n and the pauli vector σ. from this form the energy eigenvalues can be obtained explicitly,

\[ e_{\pm}(k) = \pm|\mathbf{n}(k)|. \tag{21} \]

in the limit γ → 0 the hamiltonian in (18) reproduces the hermitian ssh model, which possesses time-reversal, reflection, particle-hole, and a chiral symmetry (represented by σ_3). the introduction of a pt-symmetric non-hermitian on-site potential γ breaks these symmetries. the non-hermitian bloch hamiltonian is invariant under the combined action of the parity and the time inversion operator. further, the particle-hole symmetry is broken, because the sources (sinks) of an electron correspond to sinks (sources) of holes; the non-hermitian system is therefore symmetric under the action of the combination of the parity and the charge conjugation operator. further, it has no chiral symmetry λ = a_0σ_0 + a·σ, because a chiral symmetry would fulfil

\[ \{\lambda, h\} = \sum_{i=0}^{3}\sum_{j=1}^{3} a_i n_j \{\sigma_i, \sigma_j\} = 2\big( a_1 n_1 + a_2 n_2 + a_3 n_3 \big)\,\sigma_0 + 2 a_0 \sum_{j=1}^{3} n_j \sigma_j \overset{!}{=} 0 \tag{22} \]

with a coefficient vector a which is independent of the value of k, the 2×2 identity matrix σ_0, and the anti-commutation relations {σ_i, σ_j} = 2δ_{ij}σ_0 of the pauli matrices. therefore one finds a_0 = 0, and thus

\[ a_1 n(k)_1 + a_2 n(k)_2 + a_3 n_3 \overset{!}{=} 0, \tag{23} \]

which cannot be satisfied for a constant vector a, because the vector n(k) rotates on a cylindrical surface as k runs through the brillouin zone. hence the non-hermitian hamiltonian in (18) does not possess a chiral symmetry. however, this does not mean there is no quantised real part of the zak phase, since its quantisation is ensured by the argument of hatsugai [43] in the pt-unbroken parameter regime, as mentioned above. at the critical point γ = 0 the system reproduces the hermitian ssh model, which possesses the previously mentioned symmetries.
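for this model the illustrative `complex_zak_phase` sketch from section 3 can be exercised directly; the matrix below is just the bloch hamiltonian of (18), with the parametrisation t_± = t(1 ± ∆cos(θ)) introduced in (25) below, and the parameter values t = 1, ∆ = 1/2, γ = 1 used in the example.

```python
import numpy as np

def h_bloch(k, t=1.0, delta=0.5, gamma=1.0, theta=0.0):
    """bloch matrix of (18); t(+/-) = t * (1 +/- delta*cos(theta))."""
    tp = t * (1.0 + delta * np.cos(theta))
    tm = t * (1.0 - delta * np.cos(theta))
    return np.array([[-0.5j * gamma, tm + tp * np.exp(1j * k)],
                     [tm + tp * np.exp(-1j * k), 0.5j * gamma]])

# theta values chosen inside the pt-unbroken regime (cf. figure 3);
# the real part comes out 0 or pi (mod 2*pi), the imaginary part is finite
for theta in (0.0, 0.3 * np.pi, np.pi):
    g = complex_zak_phase(lambda k: h_bloch(k, theta=theta))
    print("theta = %.2f pi:  gamma_1/pi = %+.3f %+.3fi"
          % (theta / np.pi, g.real / np.pi, g.imag / np.pi))
```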
for γ < 0 the particle sinks and sources are interchanged, leading to a spatially reflected system with the same general properties as the system with γ > 0.

from the bloch hamiltonian (cf. (18)) the complex zak phase can be calculated following the steps explained in section 3. we choose (cf. (11a))

\[ f_{\Delta\varphi_n}(x) = \Delta\varphi_n\, x - 2\pi x \quad\text{with}\quad x = \frac{k + \pi/a}{2\pi/a}, \tag{24} \]

which is the simplest function fulfilling the conditions (11b). figure 2 illustrates the gauge smoothing process using the first non-vanishing component p = 1 of the left basis states as an example. the component ⟨χ_1(α_j)|_1 of the unnormalised left eigenvector is shown in figure 2 (a). one can see two different phase branches as a result of the numerical diagonalisation, and a constant modulus. after the gauge smoothing and normalisation according to the first step described in section 3, as shown in figure 2 (b), the modulus varies with the wave number k and there is only one phase branch left, but the basis is not yet single-valued, as there is still a phase difference at the boundaries of the brillouin zone. in this example the factor x_1 = −1 has to be used, as φ_{1,j} jumps from −π to π. after the second step the component ⟨χ_1(α_j)|_1 of the final left eigenvector is the same at the brillouin zone boundaries, see figure 2 (c).

figure 2. (colour online) first component of the left-handed eigenvector ⟨χ_1(α_j)|_1 in dependence of the wave number k with t = 1, ∆ = 1/2, γ = 1 and θ ≈ 0.3π: (a) before the steps described in (6a) and (6b) one can identify two different phase branches (blue line) and a constant modulus (filled red dots). (b) after the steps characterised by (4a) and (4b) (here ∆φ̃_1 = φ_{1,m} − φ_{1,1} and x_1 = −1, cf. (9)) the phase is smooth within the brillouin zone but discontinuous at its boundaries, and the modulus varies with k. (c) after the gauge smoothing process the phase difference ∆φ̃_1 is compensated, and the phase is continuous and smooth in the whole brillouin zone.

the two complex zak phases γ_1 and γ_2 following from the eigenvector pairs of h(k) are shown in figures 3 (a) and (b), where the control parameter θ is used to describe the difference between the two tunnelling amplitudes t_+ and t_−,

\[ t_{\pm} = t\big( 1 \pm \Delta\cos(\theta) \big) \tag{25} \]

with the mean value of the tunnelling amplitude t and the dimerization strength ∆. to verify our results we compare them with the analytical ones derived in [37],

\[ \gamma_{1/2} = \pi\,\Theta(q-1) \pm \mathrm{i}\,\frac{\eta}{2} \sqrt{\frac{\nu}{q}} \left( K(\nu) + \frac{q-1}{q+1}\,\Pi(\mu,\nu) \right), \tag{26} \]

where Θ denotes the heaviside step function, q = t_+/t_− is the ratio of the tunnelling amplitudes, η = γ/(2t_−), ν = 4q/((q + 1)² − η²) and µ = 4q/(q + 1)². K(ν) and Π(µ,ν) are elliptic integrals of the first and third kind,

\[ K(\nu) = \int_0^{\pi/2} \frac{\mathrm{d}k}{\sqrt{1 - \nu\sin^2(k)}}, \tag{27} \]
\[ \Pi(\mu,\nu) = \int_0^{\pi/2} \frac{\mathrm{d}k}{\big(1 - \mu\sin^2(k)\big)\sqrt{1 - \nu\sin^2(k)}}. \tag{28} \]

the numerical calculations perfectly reproduce the analytical results. the grey shaded areas in figure 3 mark the values of θ for which the system is in a pt-broken phase; for all other values of θ the system is in a pt-unbroken phase. in the pt-symmetric regime the real part of the complex zak phase is either 0 or π modulo 2π and can be used to characterise the topological phase of the system.

figure 3. (colour online) numerical results for the real part (filled red dots) and the imaginary part (open blue circles) of the complex zak phases γ_1 (a) and γ_2 (b) following from the hamiltonian given in (17) in dependence of the control parameter θ with t = 1, ∆ = 1/2 and γ = 1 (all values in a.u.), compared to the analytical result following from (26) (real and imaginary parts are represented by a solid black line coinciding with the numerical results). the grey shaded area marks the pt-broken phase of the bloch hamiltonian.

5. conclusion

we developed a numerical method to determine a gauge-smoothed biorthogonal basis of eigenstates of a pt-symmetric non-hermitian hamiltonian, as required for complex zak phases. it is also applicable to hermitian systems.
in the course of this we removed the arbitrary and unconnected global phases of the biorthogonal eigenstates of the pt-symmetric hamiltonian at each point in parameter space and made the basis single-valued. this allows for the calculation of the complex berry phase by explicitly evaluating (2) even in complicated lattice systems for which no analytical access to the eigenstates is available. we demonstrated the action of the gauge smoothing method by applying the developed algorithm to a pt-symmetric extension of the ssh model. an excellent agreement of the numerical and analytical results was found. in future work this provides the basis for the identification of the zak phase as a topological invariant in many-body systems with arbitrary pt-symmetric complex potentials.

references

[1] v. mourik, k. zuo, s. m. frolov, et al. signatures of majorana fermions in hybrid superconductor-semiconductor nanowire devices. science 336:1003–1007, 2012. doi:10.1126/science.1222360.
[2] t. d. stanescu, s. tewari. majorana fermions in semiconductor nanowires: fundamentals, modeling, and experiment. j phys: condens matter 25(23):233201, 2013. doi:10.1088/0953-8984/25/23/233201.
[3] s. r. elliott, m. franz. colloquium: majorana fermions in nuclear, particle, and solid-state physics. rev mod phys 87:137–163, 2015. doi:10.1103/revmodphys.87.137.
[4] a. stern, n. h. lindner. topological quantum computation – from basic concepts to first experiments. science 339(6124):1179–1184, 2013. doi:10.1126/science.1231473.
[5] a. carmele, m. heyl, c. kraus, m. dalmonte. stretched exponential decay of majorana edge modes in many-body localized kitaev chains under dissipation. phys rev b 92, 2015. doi:10.1103/physrevb.92.195107.
[6] c.-e. bardyn, m. a. baranov, c. v. kraus, et al. topology by dissipation. new j phys 15(8):085001, 2013. doi:10.1088/1367-2630/15/8/085001.
[7] p. san-jose, j. cayao, e. prada, r. aguado. majorana bound states from exceptional points in non-topological superconductors. sci rep 6:21427, 2016. doi:10.1038/srep21427.
[8] c. m. bender, s. boettcher. real spectra in non-hermitian hamiltonians having pt symmetry. phys rev lett 80:5243–5246, 1998. doi:10.1103/physrevlett.80.5243.
[9] m. znojil. pt-symmetric square well. phys lett a 285(1–2):7, 2001. doi:10.1016/s0375-9601(01)00301-2.
[10] c. m. bender, d. c. brody, h. f. jones. complex extension of quantum mechanics. phys rev lett 89:270401, 2002. doi:10.1103/physrevlett.89.270401.
[11] m. znojil. decays of degeneracies in pt-symmetric ring-shaped lattices. phys lett a 375(39):3435, 2011. doi:10.1016/j.physleta.2011.08.005.
[12] h. f. jones, e. s. moreira, jr. quantum and classical statistical mechanics of a class of non-hermitian hamiltonians. j phys a 43(5):055307, 2010. doi:10.1088/1751-8113/43/5/055307.
[13] k. li, p. g. kevrekidis, b. a. malomed, u. günther. nonlinear pt-symmetric plaquettes. j phys a 45(44):444021, 2012. doi:10.1088/1751-8113/45/44/444021.
[14] h. mehri-dehnavi, a. mostafazadeh, a. batal. application of pseudo-hermitian quantum mechanics to a complex scattering potential with point interactions. j phys a 43:145301, 2010. doi:10.1088/1751-8113/43/14/145301.
[15] v. jakubský, m. znojil. an explicitly solvable model of the spontaneous pt-symmetry breaking. czech j phys 55:1113, 2005. doi:10.1007/s10582-005-0115-x.
[16] g. lévai, m. znojil. the interplay of supersymmetry and pt symmetry in quantum mechanics: a case study for the scarf ii potential. j phys a 35(41):8793, 2002. doi:10.1088/0305-4470/35/41/311.
[17] n. abt, h. cartarius, g. wunner. supersymmetric model of a bose-einstein condensate in a pt-symmetric double-delta trap. int j theor phys 54(11):4054–4067, 2015. doi:10.1007/s10773-014-2467-0.
[18] a. mostafazadeh. delta-function potential with a complex coupling. j phys a 39:13495, 2006. doi:10.1088/0305-4470/39/43/008.
[19] h. f. jones. interface between hermitian and non-hermitian hamiltonians in a model calculation. phys rev d 78:065032, 2008. doi:10.1103/physrevd.78.065032.
[20] e. m. graefe, h. j. korsch, a. e. niederle. mean-field dynamics of a non-hermitian bose-hubbard dimer. phys rev lett 101:150408, 2008. doi:10.1103/physrevlett.101.150408.
[21] e. m. graefe, u. günther, h. j. korsch, a. e. niederle. a non-hermitian pt symmetric bose-hubbard model: eigenvalue rings from unfolding higher-order exceptional points. j phys a 41(25):255206, 2008. doi:10.1088/1751-8113/41/25/255206.
[22] z. h. musslimani, k. g. makris, r. el-ganainy, d. n. christodoulides. analytical solutions to a class of nonlinear schrödinger equations with pt-like potentials. j phys a 41:244019, 2008. doi:10.1088/1751-8113/41/24/244019.
[23] w. d. heiss, h. cartarius, g. wunner, j. main. spectral singularities in pt-symmetric bose-einstein condensates. j phys a 46(27):275307, 2013. doi:10.1088/1751-8113/46/27/275307.
[24] d. dast, d. haag, h. cartarius, g. wunner. quantum master equation with balanced gain and loss. phys rev a 90:052120, 2014. doi:10.1103/physreva.90.052120.
[25] r. gutöhrlein, j. schnabel, i. iskandarov, et al. realizing pt-symmetric bec subsystems in closed hermitian systems. j phys a 48(33):335302, 2015. doi:10.1088/1751-8113/48/33/335302.
[26] y. c. hu, t. l. hughes. absence of topological insulator phases in non-hermitian pt-symmetric hamiltonians. phys rev b 84:153101, 2011. doi:10.1103/physrevb.84.153101.
[27] k. esaki, m. sato, k. hasebe, m. kohmoto. edge states and topological phases in non-hermitian systems. phys rev b 84:205128, 2011. doi:10.1103/physrevb.84.205128.
[28] c. yuce. topological phase in a non-hermitian symmetric system. phys lett a 379(18–19):1213–1218, 2015. doi:10.1016/j.physleta.2015.02.011.
[29] c. yuce. pt symmetric floquet topological phase. eur phys j d 69(7):184, 2015. doi:10.1140/epjd/e2015-60220-7.
[30] c. yuce. majorana edge modes with gain and loss. phys rev a 93:062130, 2016. doi:10.1103/physreva.93.062130.
[31] m. klett, h. cartarius, d. dast, et al. relation between pt-symmetry breaking and topologically nontrivial phases in the su-schrieffer-heeger and kitaev models. phys rev a 95:053626, 2017. doi:10.1103/physreva.95.053626.
[32] h. schomerus. topologically protected midgap states in complex photonic lattices. opt lett 38(11):1912–1914, 2013. doi:10.1364/ol.38.001912.
[33] j. m. zeuner, m. c. rechtsman, y. plotnik, et al. observation of a topological transition in the bulk of a non-hermitian system. phys rev lett 115:040402, 2015. doi:10.1103/physrevlett.115.040402.
[34] x. wang, t. liu, y. xiong, p. tong. spontaneous pt-symmetry breaking in non-hermitian kitaev and extended kitaev models. phys rev a 92:012116, 2015. doi:10.1103/physreva.92.012116.
[35] s. weimann, m. kremer, y. plotnik, et al. topologically protected bound states in photonic parity-time-symmetric crystals. nat mater 16(4):433–438, 2017. doi:10.1038/nmat4811.
[36] j. zak. berry's phase for energy bands in solids. phys rev lett 62:2747–2750, 1989. doi:10.1103/physrevlett.62.2747.
[37] s.-d. liang, g.-y. huang. topological invariance and global berry phase in non-hermitian systems. phys rev a 87:012118, 2013. doi:10.1103/physreva.87.012118.
[38] i. mandal, s. tewari. exceptional point description of one-dimensional chiral topological superconductors/superfluids in bdi class. physica e 79:180–187, 2016. doi:10.1016/j.physe.2015.12.009.
[39] d. c. brody. biorthogonal quantum mechanics. j phys a 47(3):035305, 2014. doi:10.1088/1751-8113/47/3/035305.
[40] n. moiseyev. non-hermitian quantum mechanics. cambridge university press, cambridge, 2011.
[41] w. p. su, j. r. schrieffer, a. j. heeger. solitons in polyacetylene. phys rev lett 42:1698–1701, 1979. doi:10.1103/physrevlett.42.1698.
[42] m. v. berry. quantal phase factors accompanying adiabatic changes. proc r soc london, ser a 392(1802):45–57, 1984. doi:10.1098/rspa.1984.0023.
[43] y. hatsugai. quantized berry phases as a local order parameter of a quantum liquid. j phys soc jpn 75(12):123601, 2006. doi:10.1143/jpsj.75.123601.
[44] j. c. garrison, e. m. wright. complex geometrical phases for dissipative systems. phys lett a 128(3-4):177–181, 1988. doi:10.1016/0375-9601(88)90905-x.
[45] b. n. parlett, d. r. taylor, z. a. liu. a look-ahead lanczos algorithm for unsymmetric matrices. math comp 44(169):105–124, 1985. doi:10.1090/s0025-5718-1985-0771034-2.
how reliable is the durability of rc structures?

b. teplý, p. rovnaník, z. keršner, p. rovnaníková

the goal of this paper is to show some trends and time profiles of the reliability index relevant to the serviceability limit state, considering the design service life of rc structures. the interactive web page "rc_lifetime" – originated by the authors – is used (see http://www.stm.fce.vutbr.cz/). the depassivation of reinforcing steel due to carbonation is considered conservatively as a limiting condition. it is based on a model of concrete carbonation with 12 random input variables; the latin hypercube sampling simulation method is used. rc_lifetime offers the following options: service life assessment – a statistical evaluation of service life, where optionally the target value of the reliability index β may be an additional input value, and then the corresponding service life is the output value; concrete cover assessment – a statistical evaluation of the concrete cover value for the target service life, where optionally the required concrete cover value may be input, and the relevant reliability index β then describes the reliability of reinforcement depassivation.

keywords: carbonation depth, concrete cover, durability, rc structures, reliability index.

1 introduction

concrete is the premier construction material, and design for durability is a decisive issue in concrete building. several levels of this design exist, and the most sophisticated level – a probabilistic approach at the micro-level, currently in the focus of research activities – contrasts with the prescriptive approach given in current codes. in addition to the assessment or design of service life and its statistical features, the probabilistic approach offers the possibility to estimate the reliability grade in the context of durability. the disadvantages of such an approach are the necessity to utilize mathematical models of deteriorating processes, to deal with random variables or random fields, and to use special statistical methods and simulation techniques. the lack of sufficient and reliable statistical data is an important and rather problematic factor in these situations. for these reasons a probabilistic approach is not commonly used in everyday application.

2 designing tool

the authors have recently introduced [1] a simple auxiliary tool for the process of designing concrete structures with consideration of durability – thus attempting to build a "bridge" between the two approaches mentioned above, the micro-level and the prescriptive level. the interactive web page rc_lifetime is freely accessible at http://www.stm.fce.vutbr.cz/. the depassivation of reinforcing steel due to carbonation is considered conservatively as a limiting condition, i.e. the initiation period governs. this is based on a relatively complex model for the carbonation of concrete [2], whose input variables are treated as random variables [3]. the theoretical background and some useful recommendations for the input data are provided. rc_lifetime offers the following options:

(i) service life assessment provides an evaluation of service life based on the equality

carbonation depth = concrete cover.   (1)

the input data are the concrete cover value (as a deterministic value at present, but another version with a random value option is being prepared) together with 12 model variables (optionally deterministic or random). the output data are the statistical characteristics of the relevant service life – mean value and standard deviation/coefficient of variation (cov). this estimated service life may be used for a structural service life assessment or as the "reference service life" value when using the factor method (according to iso 15686-1 (1998) buildings – service life planning – part 1: general principles). optionally, the target value of the reliability index β may be an additional input value, and then the corresponding service life is the output value.

(ii) the concrete cover assessment option provides an evaluation of the concrete cover appropriate to equality (1).
3 reliability consideration and limit states

the goal of this paper is to show some trends and time profiles of the reliability index relevant to the serviceability limit state (sls), taking into consideration the design service life and utilizing the rc_lifetime web application. first, some comments on the limit state issue: according to en 1990, the ultimate limit state (uls) is defined as "associated with collapse or with other similar forms of structural failure", whereas the sls is defined as a state "corresponding to conditions beyond which specified service life requirements for a structure are no longer met". the failure criteria of the uls are linked to structural resistance, while the failure criteria of the sls are, e.g., a limiting deflection or crack width, and might also be characterized by a design service life (a number of years). the last type of sls criteria is, however, only described in a qualitative manner and is not suited as a direct basis for probabilistic calculations. moreover, different levels of reliability should be adopted for structural resistance and serviceability. the choice of levels of reliability for a particular structure must take account of the relevant factors, including: the possible cause and/or mode of attaining a limit state; the possible consequences of failure in terms of risk to life, injury and potential economic losses; public aversion to failure; and also the expense and procedures necessary to reduce the risk of failure.
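for orientation in what follows: the reliability index β relates to the probability pf of reaching the limit state through the standard identity β = −Φ⁻¹(pf), where Φ is the standard normal distribution function (a textbook relation, not spelled out in the paper); β = 1.5 thus corresponds to pf ≈ 6.7 %, and β = 1.0 to pf ≈ 15.9 %.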
a problem for the sls is the lack of specific quantified failure criteria for different structural components and materials, and of the corresponding required levels of reliability. concentrating on reinforced concrete structures and corrosion of the reinforcement, it is evident that the following limit states should be considered: (i) depassivation of the reinforcement; (ii) cracking (visible cracks); (iii) spalling of the concrete cover; (iv) decrease in the effective reinforcement area (leading to excessive deformation or possibly to collapse). types (i)–(iii) belong to the sls category of limit states, whereas (iv) belongs to the uls category. sls requirements should be described as specific limit states comprising a number of years (service life), the limit state itself (for instance, a certain percentage of the surface reinforcement depassivated by a decrease in hydroxide ions in the ambient cement paste due to carbonation), and the level of reliability needed to reach these limits, for instance given by a reliability index. requirements of this kind are not yet included in codes; the authors believe that the utilization of rc_lifetime in general, and some results in the following text specifically, might provide a closer insight into: (1) the progress of carbonation and its dependence on various parameters/conditions; (2) the reliability issue in the durability design of rc structures. note: a similar problem (using different models and a different approach) was also treated in [4, 5]; these works provide some guidance in this field of investigation, but they do not allow for practical and versatile use.

example 1:

a) the process of concrete carbonation is driven by the ambient carbon dioxide, the concentration of which varies in different locations. this example shows the influence of the co2 concentration on the progress of the carbonation front in a concrete of medium strength class. according to continuous measurements recently performed in brno (and compared to existing data from other parts of the world – see [6]), the usual mean value in urban areas is about 800 mg/m3; in heavy industrial areas it can be more than 1500 mg/m3. fig. 1 depicts the carbonation depth as a function of the co2 concentration and its possible scatter for a service life of 50 years, showing the mean value and this value plus or minus one standard deviation (note: about 68 % of possible realizations lie between the upper and lower curves in the case of a normal probability distribution). for the purposes of this study, all the input data were considered as deterministic, apart from the coefficient of model uncertainty (lognormal probability distribution, mean value 1.0, standard deviation equal to 0.15 – according to the jcss recommendation). fig. 1 shows how the progress of carbonation is influenced by the co2 concentration; certainly, the statistical scatter of the carbonation depth would be greater in reality, as all other technological and environmental parameters involved in the carbonation process are more or less random.

b) to illustrate this feature, the same example has been solved, this time leaving out the coefficient of model uncertainty and consecutively changing the variability of the individual input parameters only. table 1 lists some of these results, showing, e.g., the rapid increase in the coefficient of variation of the carbonation depth due to changes in the input variability of the relative humidity. in the other cases, the increase is practically linear.

fig. 1: carbonation depth vs. co2 concentration for a service life of 50 years: mean value +/− standard deviation

table 1: input vs. output variability [%]

input variable          cov_inp   cov_outp
ambient co2 content        5          2
                          10          5
                          20         10
relative humidity          5          1
                          10          3
                          20         52
unit cement content        5          9
                          10         18
                          20         38
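the sensitivity study behind table 1 can be mimicked with a short script: vary the input cov of one parameter at a time and record the resulting cov of the carbonation depth. the sketch below uses the same generic k*sqrt(t) stand-in as before, with the front scaling as the square root of the co2 concentration as in papadakis-type models; it reproduces the procedure, not the table's numbers.

# one-at-a-time sensitivity sketch: input cov of the ambient co2 content
# versus the resulting cov of the carbonation depth. model form and all
# numbers are illustrative.
import numpy as np

rng = np.random.default_rng(2)
t = 50.0
for cov_in in (0.05, 0.10, 0.20):
    co2 = rng.normal(800.0, cov_in * 800.0, 200_000)   # ambient co2 [mg/m3]
    k = 4.0 * np.sqrt(np.maximum(co2, 0.0) / 800.0)    # front factor ~ sqrt(co2)
    x_c = k * np.sqrt(t)                               # carbonation depth [mm]
    print(f"cov_in {cov_in:.0%} -> cov(x_c) {x_c.std() / x_c.mean():.1%}")

the roughly halved output cov in the co2 rows of table 1 is consistent with such a square-root dependence of the front on the co2 concentration.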
example 2: in order to show the trend of the reliability index associated with the carbonation front reaching the concrete cover thickness (i.e. the danger that reinforcement depassivation and possible corrosion will be initiated), the concrete and environment data from example 1a) were again taken and the reliability index β computed. three different concrete cover values are presented – see fig. 2. as described above, the specific values of β are not standardized and depend on several conditions. the recommended value for an irreversible sls is β = 1.5 (iso 2394); lower values are also mentioned in the literature – see [4] and [5]. considering, e.g., β = 1.0 and following fig. 2, 30 mm of cover would be reliable in a very clean environment, while 40 mm of cover can be safely used only up to a co2 concentration of 900 mg/m3.

fig. 2: reliability index β vs. ambient carbon dioxide concentration for a service life of 50 years: three levels of concrete cover

4 conclusions

the web-site tool rc_lifetime may serve as an easy-to-use tool for carbonation progress, service life and reliability prediction for reinforced concrete structures. it may be utilized for verification or for justification of special durability requirements.

acknowledgment

this work was supported by project no. 103/03/1350 and partially by project no. 103/02/1161, backed by the grant agency of the czech republic.

references

[1] teplý, b. et al.: "support to durability design of rc structures." beton tks, vol. 3 (2004), p. 38–40 (in czech).
[2] papadakis, v. g., fardis, m. n., vayenas, c. g.: "effect of composition, environmental factors and cement-lime mortar coating on concrete carbonation." materials and structures, vol. 25 (1992), p. 293–304.
[3] keršner, z., teplý, b., novák, d.: "uncertainty in service life prediction based on carbonation of concrete." 7th international conference on the durability of building materials and components, e & fn spon, stockholm, 1996, p. 13–20.
[4] gehlen, ch.: "probabilistische lebensdauerbemessung von stahlbetonbauwerken." deutscher ausschuss für stahlbeton, 510, berlin, 2000.
[5] maage, m., smeplass, s.: "carbonation – a probabilistic approach to derive provisions for en 206-1." duranet workshop, tromsø, norway, june 2001.
[6] teplý, b., králová, h., stewart, m.: "ambient carbon dioxide, carbonation and deterioration of rc structures." international journal of materials & structural reliability, vol. 1 (2002), p. 31–36.

prof. ing. břetislav teplý, csc., phone: +420 541 147 642, e-mail: teply.b@fce.vutbr.cz, ústav chemie fast vut v brně, žižkova 17, 602 00 brno, czech republic
rndr. pavel rovnaník, phone: +420 541 147 631, e-mail: rovnanik.p@fce.vutbr.cz, ústav chemie fast vut v brně, žižkova 17, 602 00 brno, czech republic
ing. zbyněk keršner, csc., phone: +420 541 147 362, e-mail: kersner.z@fce.vutbr.cz, ústav stavební mechaniky fast vut v brně, veveří 95, 602 00 brno, czech republic
doc. rndr. pavla rovnaníková, csc., phone: +420 541 147 633, e-mail: rovnanikova.p@fce.vutbr.cz, ústav chemie fast vut v brně, žižkova 17, 602 00 brno, czech republic
acta polytechnica 61(1):279–291, 2021. https://doi.org/10.14311/ap.2021.61.0279
© 2021 the author(s). licensed under a cc-by 4.0 licence. published by the czech technical university in prague.

performance of sustainable green concrete incorporated with fly ash, rice husk ash, and stone dust

ayesha siddika a,b,∗, md. ruhul amin b, md. abu rayhan b, md. saidul islam b, md. abdullah al mamun c, rayed alyousef d, y. h. mugahed amran d

a university of new south wales, sydney, school of civil and environmental engineering, nsw 2052, australia
b pabna university of science and technology, department of civil engineering, pabna-6600, bangladesh
c rajshahi university of engineering & technology, department of civil engineering, rajshahi-6204, bangladesh
d prince sattam bin abdulaziz university, college of engineering, department of civil engineering, 11942 alkharj, saudi arabia
∗ corresponding author: a.siddika@unsw.edu.au

abstract. the performance of a sustainable green concrete with fly ash (fa), rice husk ash (rha), and stone dust (sd) as partial replacements of cement and sand was experimentally explored. the fa and rha have a high silica content, are highly pozzolanic in nature and have a high surface area without any treatment. these by-products show filler effects, which enhance the density of concrete. the results showed that the fa and rha have a good hydration behaviour and effectively develop strength at an early age of the concrete. the sd acts as a stress-transferring medium within the concrete, thereby allowing the concrete to be stronger in compression and bending. consequently, the water absorption capacity of the sustainable concrete was lower than that of the ordinary one. a small reduction in strength was observed after the replacement of the binder and aggregate by the fa, rha and sd, but the reduction was insignificant. the reinforced structure with the sustainable concrete containing the fa, rha, and sd generally fails by concrete crushing, initiated by flexural cracking followed by shear cracks. the sustainable concrete can be categorized as a suitable material with no significant compromise in strength properties and can be applied to design under-reinforced elements for a low-to-moderate service load.

keywords: fly ash, rice husk ash, stone dust, sustainable green concrete, mechanical properties.

1. introduction

sustainable construction is gradually becoming challenging due to environmental and economic considerations, because the construction sector is the main consumer of natural resources and produces a large mass of waste [1]. concrete is an extensively used construction material that requires a large mass of ingredients and is, thus, costly. around 5–8 % of global carbon dioxide emissions are generated by the production of cement, which is an important component of a concrete mix [2–4]. meanwhile, natural sands are used in concrete construction, and this utilization adversely affects natural resources and river bed levels [5, 6]. therefore, optimizing the use of cement and natural sands in concrete is a priority.
many available by-products and waste materials can be used in concrete, such as the fa, rha, palm oil fuel ash, limestone waste, sd, slag, silica fume, fibres, glass and rubber waste [3, 7–15]. the use of these by-products in concrete for waste management reduces not only the material cost but also the waste management cost [2, 11, 16]. the fa is a by-product with high pozzolanic properties produced in large quantities each year from coal combustion in different industries. an optimum amount of cement can be replaced in concrete by the fa [17]. being siliceous, the fa reacts readily with calcium hydroxide and forms cementitious compounds, thereby improving the strength of the concrete. the addition of the fa lowers the water demand of the concrete mix and thus results in less bleeding and a low heat evolution [3, 18, 19]. the fa has a high silica content and is highly amorphous in nature. the specific surface area of the fa is approximately 300–495 m2/kg, which helps to produce a high-density concrete matrix [20, 21]; thus, the use of finer fa particles results in a higher compressive strength than the use of coarser ones [21]. additionally, the fa enhances the density of the concrete in the interfacial transition zone (itz) by reducing the permeable voids, which leads to a highly durable and high strength concrete [22–24]. besides, the reduced micro-voids along the itz result in the high fracture toughness of the siliceous fa-based concrete [23]. additionally, the fa-based concrete possesses a better chloride penetration resistance than opc concrete and an increased electrical resistivity [25]. the most concerning matter of the fa-based concrete is the low strength development rate at an early age due to a low reactivity [25]; but it can gain strength over a long period. moreover, an ordinary concrete slows down its strength gain at around 56 days of curing, sometimes stopping altogether, whereas the fa concrete shows a gradual improvement in strength over a long period of time [3, 26]. the improvement in strength depends on the class and fineness of the fa and on the other ingredients and admixtures. it is not possible to determine an optimum level of the fa addition for a global application, as the performance varies depending on the source, chemical composition and structure of the fa, and on the other ingredients in the concrete [27]. however, in the literature, the optimum value of the replacement of cement by the fa is said to be around 10–25 %wt [3, 21, 22, 27]. meanwhile, rice husk is a by-product produced in millions of tons per year by agricultural and industrial factories. when rice husks are burned as fuel in industry, around 20–25 % of their mass remains as rha. for every tonne of harvested rice, approximately 220 kg of husks are produced; when these husks are burned, 55 kg of ash are generated [28]. the rha contains a highly amorphous silica, which is a good supplementary material for cement in concrete [25, 29]. in concrete, the rha forms a calcium silicate hydrate (c-s-h) gel around the cement particles, which leads to a highly dense concrete matrix, resulting in an accelerated early strength development [30] and an average density of around 2200–2550 kg/m3 [7].
however, the incineration time, temperature, and environment of the combustion of the rice husks affect the specific surface area, the amorphous silica, and the carbon content of the rha, and consequently the strength of the concrete in which it is used [28]. the pozzolanic activity of the rha depends on the specific surface area and fineness of its particles; it can be controlled by adopting a controlled combustion and grinding process [31]. open-field burning or a special incineration with a controlled temperature can be applied to produce the rha from rice husks. however, the rha produced in an open-field, uncontrolled burning system contains a high carbon content, which adversely affects the properties of concrete, and develops highly crystalline forms in its structure [32, 33]. generally, the rha contains around 85–95 % amorphous silica, exhibits an eco-friendly behaviour as a supplementary cementing material in concrete [34–36], and can be used at around 10–25 %wt to replace cement in concrete without decreasing the strength [2]. moreover, the rha exhibits a good performance even without any further processing and helps to accelerate the early-age strength development in concrete [7, 37]. the shrinkage and absorption capacities of an ordinary and an rha-based concrete are nearly the same. at optimum levels, additions of the fa and rha in concrete as replacements of cement can reduce the carbon dioxide emissions from cement production; this replacement improves the greenness of the environment and decreases the overall costs of construction [35]. additionally, when the fa and rha are used simultaneously in concrete, more beneficial outcomes are noticed, because the problems of the low reactivity of the fa in concrete and of the slow strength development at an early age can be overcome by replacing a part of it by the rha [25, 37]. moreover, the addition of these supplementary cementitious materials also brings economic and environmental benefits by reducing the use of natural limestone, the energy consumption and the carbon emissions from cement production [23, 38]. in addition, the mineral additives ensure a high degree of homogenization of the composites in the concrete mix, which leads to smaller internal micro-cracks [39]; thus, these mineral wastes are suitable for reinforced concrete designed for dynamic loading. the sd is produced in large amounts in stone quarries during the processing of stone aggregates. these waste products can be an excellent alternative to natural sands [5] or cement [8] in concrete and can help to develop a sustainable green concrete [40]. the highly fine silica content in the sd increases the amorphous content and the filler activity, which increase the density of a concrete mix up to a certain limit of replacement. the replacement of fine aggregates by 25 %wt of the sd improves the mechanical properties and durability; a further increase in the replaced amount may decrease the strength [41]. previous studies have shown that replacing sand with quarry rock dust produces a concrete with an approximate 8–20 % increase in strength [41]. as the amount of the sd addition increases, the density of the concrete increases and the absorption of water decreases, which leads to a high durability [42, 43]. the sd-based concrete has better mechanical properties than an ordinary concrete and is cost effective due to the use of waste products [44]. a gradation of the sd is required prior to its use as a replacement of sand in concrete. in their study, bahoria et al.
[5] found that the sd with a particle size below 75 µm acts as a filler, increasing the workability of the mix and decreasing the porosity of the hardened concrete; by contrast, the sd with a particle size between 75 µm and 6 mm acts as a fine aggregate. additionally, a very fine sd increases the workability, and lower amounts of superplasticizers are needed in the sd concrete than in an ordinary fine aggregate concrete [43]. singh et al. [40] reached a contrary conclusion: they found that the water absorption capacity of the sd is higher than that of natural aggregates, and its addition therefore reduces the workability of the concrete. however, the performance when these supplementary materials are used simultaneously in concrete is still unclear. thus, this study aims to determine the performance of a sustainable green concrete with a partial replacement of cement by the fa and rha and of the fine aggregate by the sd in different percentages. physical and mechanical properties, such as the water absorption capacity and the load-carrying capacity of the tested specimens, were investigated. compressive, flexural, and splitting tensile strength tests were performed to assess the performance of the sustainable green concrete for eco-friendly construction material systems and their application.

2. methodology of experimental work

2.1. material selection and specimen preparation

the control specimens were prepared with ordinary portland cement (opc 43 grade), coarse sand, and crushed stone, maintaining a ratio of 1 : 2 : 4 with a water–cement ratio of 0.5. the specific gravity of the sand was 2.60, and its fineness modulus was 2.50. a 20 mm downgraded stone aggregate was used, with a specific gravity of 2.64. the water absorption capacities of the sand and the stone chips were 2.35 % and 1.50 %, respectively. the fa by-product from cement industries, the rha from different mills and factories, and the sd from stone quarries were used in the concrete. in the sustainable specimens, the rha and fa were used as replacements of cement. the fa and rha met the requirements of astm c 618 [45]. the chemical and physical properties of the materials were collected from the manufacturer and supplier. a class-f fa with more than 75 % oxide components (sio2 + al2o3 + fe2o3) and around 2 % loss on ignition was used in this experiment. as per the manufacturer, the fa contained 2.15 % cao, which is much lower than the 61.4 % contained in the opc used. the rha was ground and sieved with a 150 mesh sieve. the specific gravity of the rha was 2.01, and its bulk density was 106 kg/m3. the specific surface area of the rha was measured by applying the bet theory [46]; it was 19.7 m2/kg after 5 minutes of grinding, and the rha contained 79.7 % of sio2. the average size of the rha particles was in the range of 90–100 microns. in addition, 100 % of the particles of the opc and fa passed through the 90 micron opening sieve. meanwhile, the specific gravity and fineness modulus of the sd were 2.63 and 2.80, respectively, which meet the fine aggregate gradation criteria as per astm c33 [47]. the sd passed through a 4.75 mm opening sieve, and around 24 % of it was retained on the 1 mm opening sieve; this graded material was used for the specimen preparation. fig. 1 shows the samples.

figure 1. raw materials for concrete specimens.
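the fineness moduli quoted above (2.50 for the sand, 2.80 for the sd) follow from a standard sieve-analysis bookkeeping: the sum of the cumulative percentages retained on the standard sieves, divided by 100. a minimal sketch with a hypothetical gradation, since the paper does not report the sieve-by-sieve data:

# fineness modulus = sum of cumulative percent retained on the standard
# sieves / 100. the gradation below is hypothetical; the paper reports
# only the resulting values (2.50 for sand, 2.80 for stone dust).
def fineness_modulus(pct_retained):
    """pct_retained: percent retained on each standard sieve, coarsest first
    (4.75 mm, 2.36 mm, 1.18 mm, 600 um, 300 um, 150 um)."""
    cumulative, total = 0.0, 0.0
    for p in pct_retained:
        cumulative += p
        total += cumulative
    return total / 100.0

print(fineness_modulus([0, 10, 20, 20, 25, 20]))  # hypothetical sand gradation -> 2.60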
the specimens were prepared with varying contents of the fa and rha as replacements of cement and of the sd as a replacement of sand in the concrete. table 1 lists the composition of each type of concrete mix with its specimen id. cement and sand in the concrete were replaced by different weight percentages of the fa, rha, and sd. the concrete mix proportions are listed in table 2.

table 1. specimen id and replacement level.

replacement level                  specimen id
0 % fa + 0 % rha + 0 % sd          f0r0s0
5 % fa + 10 % rha + 0 % sd         f5r10s0
10 % fa + 5 % rha + 0 % sd         f10r5s0
10 % fa + 10 % rha + 5 % sd        f10r10s5
5 % fa + 5 % rha + 10 % sd         f5r5s10

table 2. concrete mix proportions (constituents in kg/m3).

specimen id   cement   water   fa   rha   fine aggregate   sd   coarse aggregate
f0r0s0          380     190     0     0        760           0        1520
f5r10s0         323     190    19    38        760           0        1520
f10r5s0         323     190    38    19        760           0        1520
f10r10s5        304     190    38    38        722          38        1520
f5r5s10         342     190    19    19        684          76        1520
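the constituent masses of table 2 can be reproduced from the replacement percentages of table 1 and the control proportions; a minimal sketch, where the base quantities of 380/760/1520 kg/m3 and w/c = 0.5 are those of the f0r0s0 row:

# reproduce a table 2 row from the replacement percentages of table 1.
def mix(fa_pct, rha_pct, sd_pct, cement=380.0, sand=760.0, coarse=1520.0, wc=0.5):
    return {
        "cement": cement * (1 - (fa_pct + rha_pct) / 100),  # remaining opc
        "water": cement * wc,                               # w/c on base cement -> 190
        "fa": cement * fa_pct / 100,
        "rha": cement * rha_pct / 100,
        "fine aggregate": sand * (1 - sd_pct / 100),
        "sd": sand * sd_pct / 100,
        "coarse aggregate": coarse,
    }

print(mix(5, 5, 10))  # f5r5s10 -> cement 342, fa 19, rha 19, fine 684, sd 76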
2.2. experimental tests

a slump test was conducted to evaluate the workability in line with astm c 143 [48] for all concrete mixes. a water absorption test was carried out for each type of specimen following astm c 642-13 [49]: the percentage of weight gained by dried specimens (dried in an oven for 24 h at around 110 °c) after a submersion in water for 24 h was taken as the water absorption capacity. a compressive strength test was performed on cylindrical specimens with a diameter of 100 mm and a height of 200 mm at 28 days of curing using a universal testing machine as per astm c 39 [50]. meanwhile, a flexural strength test (fig. 2) with prism beam specimens of 150 mm × 150 mm × 700 mm was performed for all types of concrete in accordance with astm c78 [51]. a splitting tensile strength test was conducted using cylindrical specimens with a diameter of 100 mm and a height of 200 mm for all types of concrete following astm c 496 [52]. figs. 2a–2f illustrate the specimen preparation and the test setups. each of the tests was performed on at least three specimens, and the average results were taken.

figure 2. specimen preparation and test setup: (a) casting of specimens; (b) hardening of specimens; (c) curing of specimens; (d) compressive strength test; (e) splitting tensile strength test; (f) flexural strength test.

3. results and discussion

3.1. workability

fig. 3 shows the slump test results of the different concrete mixes. the maximum slump value was found for the control specimen, whereas the minimum slump value was found for the mix with 5 %wt of the fa, 5 %wt of the rha, and 10 %wt of the sd. the fa and rha have a high specific surface area, which leads to an immediate absorption of water after mixing; hence, their addition caused a reduction of the slump value of the concrete mix. unexpectedly, a slightly higher slump was observed for the mix f10r10s5, which may be caused by the reaction between the fa and rha and by variations in the mixing process. in general, the addition of the fa increases the workability of a concrete mix due to the spherical shape of its particles, as reported in several studies [20, 53, 54], whereas the rha causes a reduction of the slump value [55].

figure 3. slump values of the different concrete mixes.

as observed from the results of this experiment, the fa reduced the slump value of the concrete, because it required more water than the rha to complete the hydration and pozzolanic reactions within the mix; this is due to the fa having finer particles than the rha. another reason for the high water demand of the concrete mix can be the high carbon content in the rha and fa [35]. because of the uncontrolled burning during their production, the fa and rha used in this experiment had a high carbon content, and thus caused a high water demand and a low slump of the concrete mix. meanwhile, the sd addition also resulted in a reduction of the slump value of the concrete mix, because the sd has a higher water absorption capacity than natural sands [40]. the addition of the sd to a concrete mix increases the interparticle friction, which increases the cohesiveness and stiffness of the mix, thereby resulting in a low slump value and, consequently, a poor workability [56]. for these reasons, the addition of the sd considerably decreased the workability of the concrete mix. to increase the workability of the fa-, rha-, and sd-based concrete mixes, researchers have suggested specific doses of superplasticizers [43, 44, 55].

3.2. water absorption capacity

the addition of very fine pozzolanic particles to a concrete matrix produces a densely packed and less porous structure, which results in a low water absorption capacity. as shown in fig. 4, the water absorption capacity of the concrete decreased after cement and sand were replaced by the supplementary materials. the fa, rha, and sd act as filler materials due to their high fineness, which produces a dense concrete matrix and thus lowers the water absorption capacity. the specimen of the concrete mix with 5 %wt of the fa and 10 %wt of the rha shows an increased water absorption; this result may be attributed to its less dense matrix caused by imperfect mixing and hydration. the hydration of cementitious materials is important to enhance the binding action, and the hydration products should be closely packed with a high bonding energy to produce less porous structures. the addition of the sd results in an excellent performance of the f5r5s10 specimen, whose water absorption capacity is reduced by 15.85 %. these results are in a good agreement with the studies of ghorbani et al. [42, 57]. during hydration, the calcium hydroxide available for the formation of the c-s-h product depends on the content of cao in the fa and rha, and their chemical reactivity is controlled by the active silica content and the fineness of both materials [55, 58]. in this experiment, the fa and rha contained a low amount of cao, and the relatively coarse rha caused a lower binding energy during the hydration process; the formation of the c-s-h gel was therefore lower compared to cement, and the porosity of these concrete mixes increased. the porosity of the f10r10s5 specimen was higher than that of f5r5s10, because with the lower level of cement replacement and the higher stone dust addition, a stronger binding energy and more void filling occur in the latter specimen [55, 57]. hence, the water absorption was minimal for the specimen f5r5s10. therefore, the durability of concrete can be enhanced by adding the fa, rha, and sd up to an optimum limit.

figure 4. water absorption capacity of the different concrete specimens.
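the absorption values of fig. 4 follow the astm c 642 bookkeeping described in section 2.2; a minimal sketch with hypothetical specimen masses:

# water absorption per the astm c 642 procedure of section 2.2: percentage
# mass gain of an oven-dried specimen after 24 h of immersion. masses are
# hypothetical.
def absorption_pct(oven_dry_g, saturated_g):
    return 100.0 * (saturated_g - oven_dry_g) / oven_dry_g

print(f"{absorption_pct(3650.0, 3780.0):.2f} %")  # hypothetical cylinder masses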
3.3. mechanical performance

3.3.1. compressive strength

the compressive strength test results at 28 days of age (fig. 5) show that the addition of any of the supplementary materials causes a reduction of strength. the addition of the fa and rha as replacements of cement in the concrete mixes gradually reduced the compressive strength as the level of replacement increased. the overall strength reduction observed for the test specimens was around 3.99–11.42 %, depending on the replacement ratio. this result is consistent with those of previous studies [2, 3, 7, 20]. the fa has better binding properties than the rha; therefore, a lower strength reduction was observed for the fa than for the rha. specifically, the f5r10s0 specimens show an around 7.74 % lower compressive strength than the f10r5s0 specimens. in the present study, the lower fineness of the fa and rha can be the cause of the strength reduction: a rapid pozzolanic action and an accelerated hydration are related to the addition of fine fa and rha, so an addition of very fine fa and rha is required to enhance the strength [25]. additionally, the rha produced from uncontrolled burning has a high carbon content and shows a low reactivity because of its crystalline particle structure [33].

figure 5. compressive strength of the different concrete specimens (28 days of age).

the replacement of natural sands by the sd caused an improvement in the strength of the concrete. this finding agrees well with those of current studies [8, 43, 57]. the strength lost by the fa and rha additions compared to the control was recovered, to some extent, after the addition of the sd to the mix. the f5r5s10 specimen shows a moderate strength compared to the ordinary concrete, with a strength lower by only 3.5 %. the fracture surfaces of the different specimens showed that the most densely packed specimens were those of f5r5s10, which also showed a lower water absorption after hardening as a proof of a densely packed matrix. by contrast, the concrete containing a large amount of the rha showed porous structures in the hardened state. this happened due to the low binding energy and inferior cementitious activity of the untreated, coarser rha, which may contain impurities that prevent the hydration and the formation of the c-s-h links [55]. meanwhile, the addition of the sd stiffens the itz, which creates a strong surface for the load transfer and stress distribution [56]. hence, a lower amount of the rha and a higher amount of the sd can be beneficial for the strength development. for this reason, the lowest strength was observed for f5r10s0 and the highest for f5r5s10. therefore, a 10 %wt replacement of cement by equal parts of the fa and rha and a 10 %wt replacement of natural sands by the sd is the optimum in this experiment in terms of compressive strength.

3.3.2. flexural strength

the overall flexural strength of the specimens with different percentages of the fa and rha is between 2.91 and 3.18 mpa at 28 days of age, which is around 8.62–16.4 % lower than that of the control specimens (fig. 6).

figure 6. flexural strength of the different concrete specimens (28 days of age).

the fa and rha contain considerably less cao than the opc, thereby causing a low bonding energy within the concrete [3, 7, 20]. thus, the sustainable concrete specimens show a lower flexural strength than the control specimens.
however, the sustainable concrete specimens show a reliable mechanical strength due to the high pozzolanicity of the fa- and rha-based concretes, caused by their very high specific surface area and the accelerated reaction between calcium hydroxide and silica [3, 55]. meanwhile, the addition of the sd to the concrete is very effective in terms of the strength improvement: the sd can transfer load to a wider range of adjacent coarse aggregates than natural sands, and this helps to improve the flexural strength of the concrete. a high fineness of these supplementary materials can help to enhance the overall performance of the composites, because fine materials show an improved packing density. the graded sd is effective because it provides a less porous structure after hardening; consequently, the density along the interfacial transition zone and the mechanical strength are improved. all the specimens failed in a brittle manner by a single crack along the mid span, very near the loading point (fig. 7). this happened because the addition of the rha generally increases the brittleness of concrete [59].

figure 7. failure mode of a sustainable green concrete specimen under bending.

3.3.3. splitting tensile strength

fig. 8 illustrates the variation in the splitting tensile strength of the different concrete specimens at 28 days of age. the splitting tensile strength of a concrete specimen is its load-carrying capacity in the transverse direction, applied perpendicularly to the horizontal axis; it is therefore related to the compressive strength. the splitting tensile strength of the sustainable concrete specimens was around 7.52–14.16 % lower than that of the control concrete. the reduction in strength considerably depends on the replacement level and on the mixing accuracy of the concrete mix: poor mixing and compaction lead to porous structures, which lead to a reduction in strength. due to its sharp and angular shape, the sd provides interlocking within the concrete matrix and therefore resists greater loads than a natural sand-based concrete. meanwhile, the stiffness provided by the sd against tensile stress was found to be higher compared to natural sands, due to the interlocking of the binding agent with the rough surface of the sd [56]. the addition of the sd of up to 10 %wt as a replacement of natural sands reliably improves the mechanical properties, and the sd can be used in amounts up to 15 %wt, as found in [40]; a further addition causes a very brittle concrete matrix, which is not desirable.

figure 8. splitting tensile strength of the different concrete specimens (28 days of age).

4. performance of reinforced specimens

a bending test of steel reinforced concrete (rc) specimens was performed to evaluate the applicability and performance of the sustainable green concrete in steel reinforced concrete. the specimens were prepared by reinforcing beam specimens of 150 mm × 150 mm × 700 mm with four 12 mm-diameter mild steel deformed bars, two of which were placed along the soffit and the other two at the top. the stirrups were 8 mm-diameter deformed bars placed at a distance of 150 mm centre to centre. the yield strength and ultimate strength were 423 mpa and 619 mpa, respectively, for the 12 mm steel reinforcement, and 414 mpa and 625 mpa, respectively, for the 8 mm-diameter bars. the beams were cast with a concrete cover of around 25 mm.
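for reference, the modulus of rupture implied by a third-point loading test such as the astm c 78 prism test of section 3.3.2 follows from the peak load as f = p·l/(b·d²) when the fracture starts in the middle third. a minimal sketch, where the 600 mm span and the 19 kn peak load are assumptions since the paper reports only the 150 mm × 150 mm × 700 mm specimen size:

# modulus of rupture for a third-point loading test (astm c 78):
# f = p * l / (b * d**2), with p in n and lengths in mm, giving mpa.
def modulus_of_rupture_mpa(peak_load_n, span_mm=600.0, b_mm=150.0, d_mm=150.0):
    return peak_load_n * span_mm / (b_mm * d_mm**2)

print(f"{modulus_of_rupture_mpa(19_000):.2f} MPa")  # ~3.4 MPa, near the fig. 6 range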
fig. 9 shows the flexural strength of these reinforced concrete beams cast with the different mixtures. the additions of the fa and sd significantly improve the flexural strength of these specimens, but the addition of the rha reduces the strength relative to the control specimens. the f10r5s0 and f5r5s10 specimens show a 20 % and 26 % improvement in strength, respectively, compared with the control specimens. however, the addition of the rha of more than 5 %wt considerably reduces the flexural strength.

figure 9. flexural strength of the rc beams with the different concrete mixes.

all of the rc beams failed due to crushing of the concrete, and the failure was initiated by shear cracks (fig. 10). meanwhile, the control specimen with ordinary sand and opc failed in flexure, with flexural cracks induced in the mid span under the maximum bending zone. the reason is that the shear strength of the concrete is low in the sustainable concrete specimens, which causes a failure of the material before a considerable yielding of the steel reinforcement.

figure 10. failure modes of the rc beam specimens cast with the different concrete mixes: (a) f0r0s0; (b) f5r10s0; (c) f10r5s0; (d) f10r10s5; (e) f5r5s10.

the analysis of the failure modes and strength diagram shows that the use of the sd strengthens the tension and compression capacity of the concrete due to its filler and stress-transferring effects. however, the use of the rha in excess amounts has the opposite effect: all the supplementary materials negatively affect the properties of the concrete when they exceed the optimum level. the experiment shows that the sustainable concrete mixes are applicable in moderate amounts for an under-reinforced structural system to produce lightweight and low-cost structural systems. generally, the lowest grade of concrete required for rc member construction under typical and dynamic loading conditions is c12/15 concrete [39]. in this research, the developed sustainable green concrete can be classified as grade c15 and c20 concrete, equivalent to normal grade m20 and m25 concrete, and is therefore suitable for rc construction according to is 456–2000 [60] and as 3600-2009 [61].

5. correlation between hardened properties

the analysis of the mechanical properties and fracture surfaces (fig. 11) of all types of concrete specimens reveals a general relationship of the splitting tensile and flexural strengths of the concrete with the compressive strength. the minimum bonding between materials is found for the specimens with 10 %wt of the rha. the observation of less loose material and a lower porosity along the interface of the stone aggregates and the other materials indicates a high bond development in the concrete (figs. 11b and 11d). with a high amount of the rha, the interface of the stone aggregate and the binder paste is weakened because of the lower reactivity of the rha. the poor bond formation within the concrete matrix causes a decrease in the compressive strength compared with the other specimens with a low rha content. the high water absorption result is a proof of the porous structure of these specimens; thus, these specimens possess a low splitting tensile strength and flexural strength. however, the combined addition of the fa and rha accelerates the development of strength in the early stages. the f10r5s0 specimens show an around 8.4 % higher compressive strength than the f5r10s0 specimens. similarly, the variations in the flexural and splitting tensile strengths are consistent with this result.
therefore, the fa exhibits a higher cementitious activity than the rha. the addition of the sd causes filler and crack-bridging effects within the sustainable concrete, which can then absorb a large amount of energy under compression. this condition caused a strength increment of the rha- and fa-based concretes after the addition of the sd. the analysis of the fracture surfaces and the water absorption test results indicates that the sd-based concrete has a dense matrix. all the specimens show a general trend in the variation of strength, and most of the results are consistent with those of the compressive test. this study finds an 11.42 %, 3.99 %, 6.50 %, and 3.49 % reduction in compressive strength; a 16.4 %, 12.93 %, 9.48 %, and 8.62 % reduction in flexural strength; and a 14.16 %, 9.74 %, 8.41 %, and 7.52 % reduction in splitting tensile strength for the f5r10s0, f10r5s0, f10r10s5, and f5r5s10 specimens, respectively, compared with the control f0r0s0 specimen. as the compressive strength varied with the addition of any of the by-products, the splitting tensile strength and flexural strength showed a nearly similar trend of variation for the same concrete. therefore, a very general conclusion on the mechanical properties can be obtained with the addition of these by-products to concrete.

figure 11. fracture surfaces of the different hardened concrete mixtures: (a) f5r10s0; (b) f10r5s0; (c) f10r10s5; (d) f5r5s10.

5.1. comparison of results

the experimental results show that the optimum replacement of cement is 5 %wt of the fa and 5 %wt of the rha, with 10 %wt of the sd as a replacement of sand. the addition of the fa can be increased up to 10 %wt, because this amount improves the strength. however, a further addition of any of the three materials may increase the porosity and reduce the strength because of the corresponding reduction in bonding energy and the occurrence of agglomeration. this result agrees well with those of current studies. chindaprasirt et al. [22] indicated that additions of 10 %wt of the fa and 10 %wt of the rha in concrete reduce the early strength of the concrete and slightly increase the strength during the aging process; a further addition considerably decreases the compressive strength [22, 25]. jung et al. [30] added a limestone powder with a high content of calcium oxide to increase the strength of the fa- and rha-based concretes; they found approximate 8.2 % and 14.75 % increases in compressive strength when 5 %wt and 10 %wt of the limestone powder was added to specimens with 5 %wt and 10 %wt of the rha and fa, respectively. the addition of the rha by more than 10 %wt reduces the strength, and this result is also consistent with the one obtained by kathirvel et al. [62]. the addition of hydrated lime can accelerate the pozzolanic reaction in the rha-concrete [63]. additionally, nano-silica can also help to improve the strength of the rha-concrete [59]. moreover, any pre-treatment of the rha and fa can improve the strength properties, because fine particles produce a dense matrix [3, 64]. the sustainable concrete containing the fa, rha, and sd simultaneously has rarely been investigated. the current experiment shows that the addition of the sd of up to 10 %wt as a replacement of natural sands improves the performance; this is consistent with previous results that did not consider the cement replacement level [8, 57].
however, a further addition of the sd may cause agglomeration and a porous structure, which weaken the concrete. therefore, the optimum level of replacement of natural sands by the sd is 10 %wt.

6. conclusion

the performance of a sustainable green concrete with a partial replacement of cement and sand by the fa, rha, and sd in different percentages was investigated. the production of a sustainable and low-cost concrete with the studied by-products (i.e., fa, rha, and sd) as partial replacements of cement and sand is still challenging. the addition of the fa, rha, and sd in concrete showed insignificant decreases in strength up to a certain limit. from the experimental test results, the following conclusions can be drawn:

• the binding action of the fa is considerably superior to that of the rha, but both show similar results when no treatment is applied before the concrete mix.
• the use of the sd as a replacement of natural sands in concrete is considerably effective and improves the mechanical properties, and it reduces the water absorption capacity by enhancing the density of the matrix.
• the reduction in the water absorption capacity of the fa-, rha-, and sd-based concrete is a sign of an improvement in durability.
• around 20 % of the ordinary materials in concrete can be replaced by the studied by-products or waste materials (i.e., fa, rha, and sd) without significantly compromising the strength properties; an excess amount, however, may have negative effects. the optimum combination found in this study is 5 %wt of the fa and 5 %wt of the rha as replacements of cement and 10 %wt of the sd as a replacement of natural sands, without decreasing the strength of the concrete.
• the replacement of cement by the fa and rha provides sustainability in the construction industry and in waste management systems, and the replacement of natural sands by the sd can help save natural resources.

several studies have been conducted on the individual uses of the fa, rha, and sd in concrete; however, the combined effect of these materials in concrete is still under investigation. extensive chemical analyses and microstructural studies are required to establish the appropriateness of these materials in terms of strength and durability against high loads and exposure conditions.

acknowledgements

the experimental work was carried out at pabna university of science and technology, pabna-6600, bangladesh. the authors are grateful to the chairman and the faculty members of the department of civil engineering for the facilities provided and the co-operation rendered. the authors gratefully acknowledge the support provided by the department of civil engineering, college of engineering, prince sattam bin abdulaziz university, saudi arabia.

references

[1] i. taji, s. ghorbani, j. de brito, et al. application of statistical analysis to evaluate the corrosion resistance of steel rebars embedded in concrete with marble and granite waste dust. journal of cleaner production 210:837–846, 2019. doi:10.1016/j.jclepro.2018.11.091.
[2] j. alex, j. dhanalakshmi, b. ambedkar. experimental investigation on rice husk ash as cement replacement on concrete production. construction and building materials 127:353–362, 2016. doi:10.1016/j.conbuildmat.2016.09.150.
[3] a. k. saha. effect of class f fly ash on the durability properties of concrete. sustainable environment research 28(1):25–31, 2018.
doi:10.1016/j.serj.2017.09.001.
[4] a. siddika, m. a. al mamun, r. alyousef, h. mohammadhosseini. state-of-the-art-review on rice husk ash: a supplementary cementitious material in concrete. journal of king saud university – engineering sciences, 2020.
[5] b. bahoria, d. parbat, p. nagarnaik. xrd analysis of natural sand, quarry dust, waste plastic (ldpe) to be used as a fine aggregate in concrete. materials today: proceedings 5(1):1432–1438, 2018. doi:10.1016/j.matpr.2017.11.230.
[6] m. r. hasan, a. siddika, m. p. a. akanda, m. r. islam. effects of waste glass addition on the physical and mechanical properties of brick. innovative infrastructure solutions 6(36):1–13, 2021. doi:10.1007/s41062-020-00401-z.
[7] a. siddika, m. a. al mamun, m. h. ali. study on concrete with rice husk ash. innovative infrastructure solutions 3(18):1–9, 2018. doi:10.1007/s41062-018-0127-6.
[8] s. ghorbani, i. taji, m. tavakkolizadeh, et al. improving corrosion resistance of steel rebars in concrete with marble and granite waste dust as partial cement replacement. construction and building materials 185:110–119, 2018. doi:10.1016/j.conbuildmat.2018.07.066.
[9] a. siddika, m. a. al mamun, r. alyousef, et al. properties and utilizations of waste tire rubber in concrete: a review. construction and building materials 224:711–731, 2019. doi:10.1016/j.conbuildmat.2019.07.108.
[10] a. hajimohammadi, t. ngo, a. kashani. glass waste versus sand as aggregates: the characteristics of the evolving geopolymer binders. journal of cleaner production 193:593–603, 2018. doi:10.1016/j.jclepro.2018.05.086.
[11] h. mohammadhosseini, m. m. tahir. production of sustainable green concrete composites comprising industrial waste carpet fibres. in green composites, pp. 25–52. springer, 2019.
[12] r. alyousef, h. alabduljabbar, h. mohammadhosseini, et al. utilization of sheep wool as potential fibrous materials in the production of concrete composites. journal of building engineering 30:101216, 2020. doi:10.1016/j.jobe.2020.101216.
[13] r. alyousef, k. aldossari, o. ibrahim, et al. effect of sheep wool fiber on fresh and hardened properties of fiber reinforced concrete. international journal of civil engineering and technology 10:190–199, 2019.
[14] y. mugahed amran, r. alyousef, h. alabduljabbar, et al. performance properties of structural fibred-foamed concrete. results in engineering 5:100092, 2020. doi:10.1016/j.rineng.2019.100092.
[15] h. assaedi, t. alomayri, a. siddika, et al. effect of nanosilica on mechanical properties and microstructure of pva fiber-reinforced geopolymer composite (pva-frgc). materials 12(21):3624, 2019. doi:10.3390/ma12213624.
[16] a. siddika, m. a. al mamun, w. ferdous, et al. 3d-printed concrete: applications, performance, and challenges. journal of sustainable cement-based materials 9(3):127–164, 2020.
doi:10.1080/21650373.2019.1705199.
[17] r. prakash, r. thenmozhi, s. n. raman, et al. mechanical characterisation of sustainable fibre-reinforced lightweight concrete incorporating waste coconut shell as coarse aggregate and sisal fibre. international journal of environmental science and technology pp. 1–12, 2020. doi:10.1007/s13762-020-02900-z.
[18] a. k. saha, p. k. sarker, v. golovanevskiy. thermal properties and residual strength after high temperature exposure of cement mortar using ferronickel slag aggregate. construction and building materials 199:601–612, 2019. doi:10.1016/j.conbuildmat.2018.12.068.
[19] r. prakash, r. thenmozhi, s. n. raman, c. subramanian. characterization of eco-friendly steel fiber-reinforced concrete containing waste coconut shell as coarse aggregates and fly ash as partial cement replacement. structural concrete 21:437–447, 2019. doi:10.1002/suco.201800355.
[20] f. moghaddam, v. sirivivatnanon, k. vessalas. the effect of fly ash fineness on heat of hydration, microstructure, flow and compressive strength of blended cement pastes. case studies in construction materials 10:e00218, 2019. doi:10.1016/j.cscm.2019.e00218.
[21] p. chindaprasirt, c. jaturapitakkul, t. sinsiri. effect of fly ash fineness on compressive strength and pore size of blended cement paste. cement and concrete composites 27(4):425–428, 2005. doi:10.1016/j.cemconcomp.2004.07.003.
[22] p. chindaprasirt, s. rukzon, v. sirivivatnanon. resistance to chloride penetration of blended portland cement mortar containing palm oil fuel ash, rice husk ash and fly ash. construction and building materials 22(5):932–938, 2008. doi:10.1016/j.conbuildmat.2006.12.001.
[23] g. l. golewski, t. sadowski. a study of mode iii fracture toughness in young and mature concrete with fly ash additive. in advanced materials and structures vi, vol. 254 of solid state phenomena, pp. 120–125, 2016. doi:10.4028/www.scientific.net/ssp.254.120.
[24] r. prakash, r. thenmozhi, s. n. raman, et al. an investigation of key mechanical and durability properties of coconut shell concrete with partial replacement of fly ash. structural concrete pp. 1–12. doi:10.1002/suco.201900162.
[25] v. n. kanthe, s. v. deo, m. murmu. effect of fly ash and rice husk ash on strength and durability of binary and ternary blend cement mortar. asian journal of civil engineering 19:963–970, 2018. doi:10.1007/s42107-018-0076-6.
[26] m. a. mamun, m. s. islam. experimental investigation of chloride ion penetration and reinforcement corrosion in reinforced concrete member. journal of construction engineering and project management 7:26–29, 2017. doi:10.6106/jcepm.2017.3.30.026.
[27] g. l. golewski. estimation of the optimum content of fly ash in concrete composite based on the analysis of fracture toughness tests using various measuring systems. construction and building materials 213:142–155, 2019. doi:10.1016/j.conbuildmat.2019.04.071.
[28] s. pavía, r. walker, p. veale, a. wood. impact of the properties and reactivity of rice husk ash on lime mortar properties. journal of materials in civil engineering 26(9):04014066, 2014. doi:10.1061/(asce)mt.1943-5533.0000967.
[29] s. n. raman, t. ngo, p. mendis, h. b. mahmud. high-strength rice husk ash concrete incorporating quarry dust as a partial substitute for sand. construction and building materials 25(7):3123–3130, 2011. doi:10.1016/j.conbuildmat.2010.12.026.
[30] s.-h. jung, s. velu, k. subbiah, et al.
microstructure characteristics of fly ash concrete with rice husk ash and lime stone powder 12(17):1–9, 2018.
[31] c. fapohunda, b. akinbile, a. shittu. structure and properties of mortar and concrete with rice husk ash as partial replacement of ordinary portland cement. international journal of sustainable built environment 6(2):675–692, 2017. doi:10.1016/j.ijsbe.2017.07.004.
[32] c. l. hwang, s. chandra. the use of rice husk ash in concrete. in waste materials used in concrete manufacturing, pp. 184–234. william andrew publishing. doi:10.1016/b978-081551393-3.50007-7.
[33] d. g. nair, k. s. jagadish, a. fraaij. reactive pozzolanas from rice husk ash: an alternative to cement for rural housing. cement and concrete research 36(6):1062–1071, 2006. doi:10.1016/j.cemconres.2006.03.012.
[34] k. c. panda, s. d. prusty. influence of silpozz and rice husk ash on enhancement of concrete strength. advances in concrete construction 3(3):203–221, 2015. doi:10.12989/acc.2015.3.3.203.
[35] g. cordeiro, r. toledo filho, e. fairbairn. use of ultrafine rice husk ash with high-carbon content as pozzolan in high performance concrete. materials and structures 42(7):983–992, 2008. doi:10.1617/s11527-008-9437-z.
[36] c. elakkiah. rice husk ash (rha) – the future of concrete. in lecture notes in civil engineering. springer, singapore, 2019. doi:10.1007/978-981-13-3317-0_39.
[37] a. abalaka. strength and some durability properties of concrete containing rice husk ash produced in a charcoal incinerator at low specific surface. international journal of concrete structures and materials 7(4):287–293, 2013. doi:10.1007/s40069-013-0058-8.
[38] g. l. golewski. energy savings associated with the use of fly ash and nanoadditives in the cement composition. energies 13(9):2184, 2020.
[39] g. golewski. a novel specific requirements for materials used in reinforced concrete composites subjected to dynamic loads. composite structures 223:110939, 2019. doi:10.1016/j.compstruct.2019.110939.
[40] s. singh, r. nagar, v. agrawal. a review on properties of sustainable concrete using granite dust as replacement for river sand. journal of cleaner production 126:74–87, 2016. doi:10.1016/j.jclepro.2016.03.114.
[41] b. k. meisuh, c. k. kankam, t. k. buabin. effect of quarry rock dust on the flexural strength of concrete. case studies in construction materials 8:16–22, 2018.
doi:10.1016/j.cscm.2017.12.002. [42] m.-c. han, d. han, j.-k. shin. use of bottom ash and stone dust to make lightweight aggregate. construction and building materials 99:192 – 199, 2015. doi:10.1016/j.conbuildmat.2015.09.036. [43] h. li, f. huang, g. cheng, et al. effect of granite dust on mechanical and some durability properties of manufactured sand concrete. construction and building materials 109:41 – 46, 2016. doi:10.1016/j.conbuildmat.2016.01.034. [44] s. mundra, p. sindhi, v. chandwani, et al. crushed rock sand – an economical and ecological alternative to natural sand to optimize concrete mix. perspectives in science 8:345 – 347, 2016. recent trends in engineering and material sciences, doi:10.1016/j.pisc.2016.04.070. [45] astm c618-03 standard specification for coal fly ash and raw or calcined natural pozzolan for use in concrete. standard, american society for testing and materials, west conshohocken, 2003. doi:10.1520/c0618-03. [46] b. hellack. specific surface area by brunauer-emmettteller (bet) theory. tech. rep. www.nanopartikel. info/wp-content/uploads/2020/11/nanoximet_sop_ specific-surface-area-analysis-by-bet-theory_ v1.pdf. [47] astm c33-03 standard specification for concrete aggregates. standard, american society for testing and materials, west conshohocken, 2003. doi:10.1520/c0033-03. [48] astm c143/c143-03 standard test method for slump of hydraulic-cement concrete. standard, american society for testing and materials, west conshohocken, 2003. doi:10.1520/c0143_c0143m-03. [49] astm c642-06 standard test method for density, absorption, and voids in hardened concrete. standard, american society for testing and materials, west conshohocken, 2006. doi:10.1520/c0642-06. [50] astm c39/c39m-18 standard test method for compressive strength of cylindrical concrete specimens. standard, american society for testing and materials, west conshohocken, 2018. doi:10.1520/c0039_c0039m-18. [51] astm c78/c78m-18 standard test method for flexural strength of concrete (using simple beam with third-point loading). standard, american society for testing and materials, west conshohocken, 2018. doi:10.1520/c0039_c0039m-18. [52] astm c 496/c496m-11 standard test method for splitting tensile strength of cylindrical concrete specimens. standard, american society for testing and materials, west conshohocken, 2011. doi:10.1520/c0496_c0496m-11. [53] z. yao, x. ji, p. sarker, et al. a comprehensive review on the applications of coal fly ash. earth-science reviews 141:105 – 121, 2015. doi:10.1016/j.earscirev.2014.11.016. [54] r. prakash, r. thenmozhi, s. n. raman. mechanical characterisation and flexural performance of eco-friendly concrete produced with fly ash as cement replacement and coconut shell coarse aggregate. international journal of environment and sustainable development 18(2):131 – 148, 2019. doi:10.1504/ijesd.2019.099491. [55] s. k. antiohos, j. g. tapali, m. zervaki, et al. low embodied energy cement containing untreated rha: a strength development and durability study. construction and building materials 49:455 – 463, 2013. doi:10.1016/j.conbuildmat.2013.08.046. [56] a. rana, p. kalla, h. k. verma, j. k. mohnot. recycling of dimensional stone waste in concrete: a review. journal of cleaner production 135:312 – 331, 2016. doi:10.1016/j.jclepro.2016.06.126. [57] s. ghorbani, i. taji, j. de brito, et al. mechanical and durability behaviour of concrete with granite waste dust as partial cement replacement under adverse exposure conditions. 
acta polytechnica 59(3):224–237, 2019, doi:10.14311/ap.2019.59.0224
© czech technical university in prague, 2019, available online at http://ojs.cvut.cz/ojs/index.php/ap
general model of radiative and convective heat transfer in buildings: part ii: convective and radiative heat losses

tomáš ficker

brno university of technology, faculty of civil engineering, department of physics, veveří 95, 602 00 brno, czech republic
correspondence: ficker.t@fce.vutbr.cz

abstract. the present paper is the second part of a serial publication dealing with convective and radiative heat transfer in buildings. an algebraic computational method for the combined convective-radiative heat transport in buildings is proposed. the convective transport of heat is formulated by means of correlation functions of the nusselt number. the radiative heat transfer is specified using the radiosity method explained in the first part of the serial publication. a system of transcendent equations is formed to couple the convective and radiative heat transports. the transcendent system is solved iteratively, which makes it possible to obtain optimized surface temperatures as well as optimized values of the coefficients of heat transfer. on the basis of these optimized values, a more precise overall heat loss is computed and compared with the results obtained from the thermal standard. the strong and weak points of both numerical methods are discussed.

keywords: convective heat transfer, radiative heat transfer, room envelope, heat losses, iterative optimization.

1. introduction

the present paper continues the previous contribution entitled general model of radiative and convective heat transfer in buildings: part i: algebraic model of radiative heat transfer [1]. the first part of the serial contribution explained the basics of the radiosity method for quantifying radiative heat transfer in buildings. this method will also be used in the present paper in combination with convective methods so that the heat losses of buildings may be assessed more rigorously.

in the field of building physics, heat and moisture transports are often studied [2–6]. these transports represent a core problem in building performance. heat transport is often investigated as heat conduction through building envelopes, but the heat transfer inside buildings usually remains overlooked. the heat transfer in closed spaces may consist of conduction, convection and radiation. in thermal equilibrium, these transports not only compensate the heat losses going through building envelopes but may also influence the temperatures of interior surfaces that occasionally suffer from the condensation of water vapours.

the electromagnetic waves of heat radiation do not need a mass environment for their propagation, in contrast to conductive and convective transports, which are incapable of propagating in a vacuum. radiative heat usually represents the most effective transfer of energy. however, in some special cases, the other two mechanisms may compete with radiation. for this reason, it will be instructive to specify such cases. this will be illustrated in the following introductory paragraphs by comparing the effectiveness of heat radiation with the other two transfer mechanisms within large and small enclosures.

let us have a common living room with a heated floor (an absolutely black surface with a temperature of $T_1 = 301.15$ k). the ceiling of the room is covered with a lime-cement plaster whose emissivity is 0.9 and whose temperature is $T_2 = 291.15$ k.
the spacing between the floor and the ceiling is $L = 2.8$ m and is filled with air that is 'diathermal' with respect to radiative heat but behaves as a normal fluid with respect to heat conduction and convection. its thermal conductivity is $\lambda = 0.025992$ w m⁻¹ k⁻¹, kinematic viscosity $\nu = 15.5473 \cdot 10^{-6}$ m²/s and thermal diffusivity $\alpha = 21.9918 \cdot 10^{-6}$ m²/s. let us consider this inner configuration to be a large hollow cavity with two parallel sides (floor versus ceiling), neglecting the influence of the side walls. let us calculate the amount of heat transferred separately by conduction, convection and radiation per one hour ($t = 3600$ s) and per one square metre ($S = 1$ m²).

heat transferred by conduction (fourier's relation [5, 6]):
$$Q_c = \frac{\lambda (T_1 - T_2) \cdot S \cdot t}{L} = \frac{0.025992 \cdot 10 \cdot 1 \cdot 3600}{2.8} \approx 334 \text{ j} \quad (1)$$

heat transferred by convection (nusselt's number $\mathrm{Nu}$ [7]):
$$T_f = (T_1 + T_2)/2 = 296.15 \text{ k}$$
the rayleigh number:
$$\mathrm{Ra} = \frac{g\beta (T_1 - T_2) L^3}{\alpha\nu} = \frac{9.81 \cdot \frac{1}{296.15} \cdot 10 \cdot 2.8^3}{21.9918 \cdot 10^{-6} \cdot 15.5473 \cdot 10^{-6}} = 2.12675 \cdot 10^{10}$$
$$h = \frac{\lambda}{L}\,\mathrm{Nu} = \frac{\lambda}{L} \left\{ 1 + 1.44 \left[ 1 - \frac{1708}{\mathrm{Ra}} \right]^{*} + \left[ \left( \frac{\mathrm{Ra}}{5830} \right)^{1/3} - 1 \right]^{*} \right\} = 1.442 \text{ w m}^{-2}\text{k}^{-1}$$
the bracket $[\ ]^{*}$ is to be taken as zero if its argument is negative.
$$Q_{cv} = S \cdot t \cdot h \cdot (T_1 - T_2) = 1 \cdot 3600 \cdot 1.442 \cdot 10 = 51\,912 \text{ j} \quad (2)$$

heat transferred by radiation (mutual radiant capacity $C_{12}$ [5, 6]):
$$Q_r = S \cdot t \cdot q_{12} = S \cdot t \cdot C_{12} \cdot \left[ \left( \tfrac{T_1}{100} \right)^4 - \left( \tfrac{T_2}{100} \right)^4 \right] = S \cdot t \cdot \frac{C_b}{\frac{1}{\varepsilon_1} + \frac{1}{\varepsilon_2} - 1} \cdot \left[ \left( \tfrac{T_1}{100} \right)^4 - \left( \tfrac{T_2}{100} \right)^4 \right] = 1 \cdot 3600 \cdot \frac{5.67}{\frac{1}{0.9} + \frac{1}{1} - 1} \cdot \left[ (3.0115)^4 - (2.9115)^4 \right] = 190\,918.08 \text{ j} \quad (3)$$

by comparing the heat transferred by conduction (eq. (1)), convection (eq. (2)) and radiation (eq. (3)), it is obvious that the transfer realized by radiation is much more effective than the other two mechanisms. radiation transmits more than 78.5 % of the heat energy, whereas conduction and convection together transmit only 21.5 % within the tested room with a common height of 2.8 m between the floor and the ceiling. such a room represents a typical room of a family house. however, it should be highlighted that it is the large spacing between the floor and the ceiling that causes such disproportions between the transferred amounts of heat. if the spacing is considerably reduced, e.g. to several millimetres (a narrow cavity), the convection degenerates into conduction ($\mathrm{Nu} \approx 1$) and the amounts of heat transferred by conduction and radiation become comparable. for example, for a spacing of 4.9011 millimetres, the heat transferred by conduction equals that of radiation, i.e. 190 918 joule. this is because the radiation mechanism does not depend on the spacing, whereas the conduction mechanism is reciprocally dependent on the spacing.

based on the foregoing estimations, it can be seen that heat radiation dominates over the other two transfer mechanisms in large inner spaces of buildings. however, the convective contributions in such spaces cannot be neglected either. for this reason, the discussion should be aimed at the radiative and convective heat transfers. since the radiative transfer was thoroughly discussed in the preceding contribution [1], the convective transport is the subject of central interest here.
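the three estimates in eqs. (1)–(3) are straightforward to reproduce numerically. the following python sketch (an illustrative transcription of the formulas above, not part of the original paper) evaluates all three mechanisms with the air properties quoted in the text:

```python
# air properties at the film temperature (values quoted in the text)
lam   = 0.025992      # thermal conductivity [W/(m K)]
nu    = 15.5473e-6    # kinematic viscosity [m^2/s]
alpha = 21.9918e-6    # thermal diffusivity [m^2/s]

T1, T2 = 301.15, 291.15   # floor and ceiling temperatures [K]
L      = 2.8              # floor-to-ceiling spacing [m]
S, t   = 1.0, 3600.0      # area [m^2] and time [s]

# eq. (1): conduction (fourier's relation)
Q_cond = lam * (T1 - T2) * S * t / L                       # ~334 J

# eq. (2): free convection, hollands-type correlation for a horizontal layer
Tf = 0.5 * (T1 + T2)
Ra = 9.81 * (1.0 / Tf) * (T1 - T2) * L**3 / (alpha * nu)   # ~2.13e10
star = lambda x: max(x, 0.0)       # the [ ]* bracket: zero if negative
Nu = 1.0 + 1.44 * star(1.0 - 1708.0 / Ra) + star((Ra / 5830.0)**(1/3) - 1.0)
h = lam / L * Nu                                           # ~1.442 W/(m^2 K)
Q_conv = S * t * h * (T1 - T2)                             # ~51 900 J

# eq. (3): radiation between two parallel grey surfaces
eps_ceiling, eps_floor = 0.9, 1.0
C12 = 5.67 / (1.0 / eps_ceiling + 1.0 / eps_floor - 1.0)   # ~5.103
Q_rad = S * t * C12 * ((T1 / 100.0)**4 - (T2 / 100.0)**4)  # ~190 900 J

print(Q_cond, Q_conv, Q_rad)
```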
2. correlation relations and their role in convective heat transport

in research works, the problem of convective heat transport is often solved by using the basic physical conservation laws concerning mass, momentum and energy [8, 9]. this leads to a system of governing partial differential equations that have different forms for forced and free convections [10]. these equations are analytically solvable only in quite simple cases, when surfaces assume uncomplicated shapes. in other, more complicated cases, discretization and numerical solutions are needed. such procedures would be rather cumbersome for a quick engineering estimation of the convective heat transport. fortunately, instead of solving the partial differential equations, correlation relations associated with the nusselt number can be used to determine the coefficient of heat transfer $h$ (w m⁻² k⁻¹). this parameter is crucial for the estimation of the convective heat transport. the correlation functions $f$ associated with the nusselt number assume different forms for forced and free convections:

$$\mathrm{Nu}_L = \frac{hL}{\lambda} = \begin{cases} f_1(\mathrm{Ra}, \mathrm{Pr}) & \text{free convection} \\ f_2(\mathrm{Re}, \mathrm{Pr}) & \text{forced convection} \end{cases} \quad (4)$$

where ra, re, and pr are rayleigh's, reynolds', and prandtl's similarity numbers [10], respectively. the symbol $L$ is a characteristic dimension of the surface. the particular forms of $f_1$ and $f_2$ depend on the shapes of surfaces and are summarized in various monographs. for example, in ref. [10], a large series of correlation functions is published. the correlation functions result either from the solutions of the governing partial differential equations or, more frequently, from convective experiments. one of the forms of correlation functions has actually been used in the introductory paragraphs to estimate the convective heat transport between the floor and the ceiling of the room.

figure 1. investigated room.

table 1. thermal characteristics of the investigated room shown in fig. 1:

physical property | no. 1 (floor), interior | no. 2 (side walls), interior | no. 2 (side walls), exterior | no. 3 (ceiling), interior | no. 3 (ceiling), exterior
area $S$ (m²) | 80 | 108 | 108 | 80 | 80
emissivity $\varepsilon$ (-) | 0.95 | 0.85 | 0.90 | 0.90 | 0.90
reflectivity $\rho$ (-) | 0.05 | 0.15 | 0.10 | 0.10 | 0.10
temperature $T$ (k) | $T_{si1}$ | $T_{si2}$ | $T_{se2}$ | $T_{si3}$ | $T_{se3}$
radiation $\varepsilon E_b$ (w/m²) | $5.3865 (T_{si1}/100)^4$ | $4.8195 (T_{si2}/100)^4$ | $5.103 (T_{se2}/100)^4$ | $5.103 (T_{si3}/100)^4$ | $5.103 (T_{se3}/100)^4$
thermal resistance $R$ (m²k/w) | ∞ | 2.50 (walls) | | 1.20 (ceiling) |

temperatures: interior $T_i = 293$ k (20 °c), exterior $T_e = 258$ k (−15 °c), sky $T_o = 243$ k (−30 °c). speed of wind: $u_\infty = 20$ m/s.

in order to develop a fast and simple procedure for estimating the convective heat transport near the inner and outer surfaces of buildings, we will use the corresponding correlation functions to determine the convective heat transports. these correlations will be coupled with the relations for radiative heat transfers expressed by means of the radiosity concept. in this way, the general combined heat transfer will be described and will serve as a basis for estimating heat losses. since the correlation relations are mostly represented by transcendent functions, the resulting system of algebraic equations will also be transcendent, and this will require iterative solutions.

the goal of the present study is to develop a general numerical procedure for assessing heat losses consisting of radiative and convective components. the procedure should be as simple as possible. most of the numerical operations should be realizable with a pocket calculator or an excel spreadsheet.
such a procedure should be more tractable in comparison with the governing partial differential equations but, simultaneously, it should yield results whose quality is comparable with that achieved by the partial differential equations.

3. thermal characteristics of the investigated room

fig. 1 shows the scheme of the room whose heat losses are to be investigated. the floor (no. 1) is heated. since it is well insulated, its heat losses may be neglected. the thermal resistances of the side walls (no. 2) and the ceiling (no. 3) are $R_2 = 2.5$ m²k/w and $R_3 = 1.2$ m²k/w, respectively. the temperature of the air on the exterior is $T_e = -15$ °c. the wind blows on the exterior with a speed of 20 m/s. the temperature of the sky is $T_o = -30$ °c. the goal is to determine the temperature of the floor $T_{si1}$ that would guarantee a room temperature of $T_i = 20$ °c. other input data are summarized in tab. 1.

in the stationary thermal state, the heat from the heated floor (no. 1) is transported by convection and radiation towards the cooler side walls (no. 2) and the cooler ceiling (no. 3). the cooler surfaces absorb the energies and transport them by conduction through their volumes to their external sides. at the external side, the transported energy is transferred by radiation and convection into the open space (radiation towards the very cold sky). these transport processes will be described by a system of transcendent equations that contains 5 unknown quantities, namely $T_{si1}$, $T_{si2}$, $T_{si3}$, $T_{se2}$ and $T_{se3}$ (see tab. 1). prior to the formulation of this general system of equations, it is necessary to solve the problem of convection (coefficients of heat transfer) and the heat radiation from the floor to the walls and the ceiling. the next two sections solve both these transports separately.

4. estimation of convective heat transport

coefficients of heat transfer $h$ are calculated separately for surfaces nos. 1, 2 and 3. these coefficients are termed as follows: $h_{si1}$ (floor), $h_{si2}$ (internal side of walls), $h_{si3}$ (internal side of ceiling), $h_{se2}$ (external side of walls), $h_{se3}$ (external side of ceiling). for computational purposes, the correlation functions associated with the nusselt number are used [10–12]. the thermodynamic data for air are taken from the table published in [10].

4.1. external surfaces

the basic input data for the calculation of the heat transfer coefficients are the average temperature of the convective film $T_f$, the speed of wind $u_\infty$, the kinematic viscosity $\nu$ of air, the thermal conductivity of air $\lambda$, the characteristic dimension of the surface $L$, the reynolds number $\mathrm{Re}_L$, the prandtl number pr and the average nusselt number $\mathrm{Nu}_L$ [10, 11]:

$$T_f = \frac{T_e + T_{se(2)(3)}}{2} \approx T_e = 258 \text{ k} \quad (5)$$
$$u_\infty = 20 \text{ m/s} \quad (6)$$
$$\nu(258 \text{ k}) = 12.152 \cdot 10^{-6} \text{ m}^2/\text{s} \quad (7)$$
$$\lambda(258 \text{ k}) = 22.94 \cdot 10^{-3} \text{ w m}^{-1}\text{k}^{-1} \quad (8)$$
$$\mathrm{Pr}(258 \text{ k}) = 0.718 \quad (9)$$
$$\mathrm{Re}_L = \frac{u_\infty \cdot L}{\nu} \quad (10)$$
$$\mathrm{Nu}_L = \frac{h \cdot L}{\lambda} \quad (11)$$

due to the action of wind, forced convection will be established at the external surfaces of the room, and this kind of convection will be included in further considerations. the estimation of the dimension $L$ is problematic because the wind direction is unknown. fortunately, it may be shown that a convective flow along the smaller dimension leads to a higher coefficient of heat transfer compared to a convective flow along the larger dimension. if a higher value of the coefficient of heat transfer is used in the calculations, larger heat losses are obtained, i.e.
the results are on the side of a larger safety. in order to illustrate the dependence of the coefficient of heat transfer on the value of $L$, the following calculations are performed twice, i.e. for the maximum and the minimum of the dimension $L$.

surface no. 2 (the external side of the walls):

$$L = \begin{cases} 3 \text{ m} \Rightarrow \mathrm{Re}_L = 4.9375 \cdot 10^6 \\ 10 \text{ m} \Rightarrow \mathrm{Re}_L = 1.6458 \cdot 10^7 \end{cases} \quad (12)$$

since both values of the reynolds number are much greater than the critical one, $\mathrm{Re}_L^{(critical)} \approx 5 \cdot 10^5$, which is the limit for the transition of laminar flow into turbulent flow, it is the turbulent flow that will dominate over the laminar flow. this is also clearly documented by the value of the critical distance $x_c = 5 \cdot 10^5 \nu / u_\infty = 0.3038$ m, determining the border for the transition between laminar and turbulent flows. the critical distance is very small in comparison with the dimensions of the walls (3 m or 10 m), which means that the turbulent flow will dominate. the correlation function of the average nusselt number in the case of forced, turbulent convection and a plane surface of arbitrary orientation with respect to the turbulent convection was experimentally established in the following form [10, 11]:

$$\mathrm{Nu}_L = 0.037\,\mathrm{Re}_L^{4/5}\,\mathrm{Pr}^{1/3} \quad (L \gg x_c, \ \mathrm{Re}_L \gg \mathrm{Re}_L^{(critical)}) \quad (13)$$

from the definition of the nusselt number (eq. (11)) and the correlation function (eq. (13)), it follows:

$$\mathrm{Nu}_L = \frac{h \cdot L}{\lambda} \Longrightarrow h = \frac{\lambda}{L} \left[ 0.037\,\mathrm{Re}_L^{4/5}\,\mathrm{Pr}^{1/3} \right] \quad (14)$$

$$h_{se2} = \begin{cases} \text{for } L = 3 \text{ m}: & \frac{22.94 \cdot 10^{-3}}{3} \left[ 0.037 \cdot \left( \frac{20 \cdot 3}{12.152 \cdot 10^{-6}} \right)^{4/5} \cdot 0.718^{1/3} \right] = 57.354 \text{ w m}^{-2}\text{k}^{-1} \\ \text{for } L = 10 \text{ m}: & \frac{22.94 \cdot 10^{-3}}{10} \left[ 0.037 \cdot \left( \frac{20 \cdot 10}{12.152 \cdot 10^{-6}} \right)^{4/5} \cdot 0.718^{1/3} \right] = 45.081 \text{ w m}^{-2}\text{k}^{-1} \end{cases} \quad (15)$$

from eq. (15), it is clear that a smaller $L$ leads to a larger $h$. to be on the side of a larger safety, it is reasonable to prefer the following value:

$$h_{se2} = 57.354 \text{ w m}^{-2}\text{k}^{-1} \quad (16)$$

surface no. 3 (the external side of the ceiling):

$$L = \begin{cases} 8 \text{ m} \Rightarrow \mathrm{Re}_L = 1.3166 \cdot 10^7 \\ 10 \text{ m} \Rightarrow \mathrm{Re}_L = 1.6458 \cdot 10^7 \end{cases} \quad (17)$$

the reynolds numbers in eq. (17) are much larger than the critical reynolds number $5 \cdot 10^5$, and thus the turbulent convective flow will be assumed in the further calculations as well. the same correlation function of the nusselt number as in eqs. (14)/(15) is applicable [10, 11]:

$$h_{se3} = \begin{cases} \text{for } L = 8 \text{ m}: & \frac{22.94 \cdot 10^{-3}}{8} \left[ 0.037 \cdot \left( \frac{20 \cdot 8}{12.152 \cdot 10^{-6}} \right)^{4/5} \cdot 0.718^{1/3} \right] = 47.138 \text{ w m}^{-2}\text{k}^{-1} \\ \text{for } L = 10 \text{ m}: & \frac{22.94 \cdot 10^{-3}}{10} \left[ 0.037 \cdot \left( \frac{20 \cdot 10}{12.152 \cdot 10^{-6}} \right)^{4/5} \cdot 0.718^{1/3} \right] = 45.081 \text{ w m}^{-2}\text{k}^{-1} \end{cases} \quad (18)$$

as seen from eq. (18), a smaller $L$ again leads to a larger $h$. to be on the side of a larger safety, the following value is preferred:

$$h_{se3} = 47.138 \text{ w m}^{-2}\text{k}^{-1} \quad (19)$$
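for convenience, eq. (14) can be wrapped into a small function; the sketch below (illustrative only) reproduces the values of eqs. (15), (16), (18) and (19):

```python
def h_forced_turbulent(u_inf, L, lam=22.94e-3, nu=12.152e-6, Pr=0.718):
    """eq. (14): h = (lam/L) * 0.037 * Re_L**(4/5) * Pr**(1/3),
    valid for Re_L far above the critical value ~5e5 (air at 258 K)."""
    Re = u_inf * L / nu
    return lam / L * 0.037 * Re**0.8 * Pr**(1.0 / 3.0)

print(h_forced_turbulent(20.0, 3.0))    # walls,   L = 3 m  -> ~57.4 W/(m^2 K)
print(h_forced_turbulent(20.0, 10.0))   # walls,   L = 10 m -> ~45.1 W/(m^2 K)
print(h_forced_turbulent(20.0, 8.0))    # ceiling, L = 8 m  -> ~47.1 W/(m^2 K)
```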
4.2. internal surfaces

inside the room, the floor is the only source of heat. its warm surface heats the adjacent air, which rises due to the archimedes buoyancy force; this is the typical mechanism of free convection. the basic input data for the calculation of the coefficients of heat transfer are the average temperature of the convective film $T_f$, the kinematic viscosity $\nu$ of air, the thermal conductivity of air $\lambda$, the thermal diffusivity of air $\alpha$, the volumetric thermal expansion coefficient of air $\beta$, the gravity acceleration $g$, the characteristic dimension of the surface $L$, the rayleigh number $\mathrm{Ra}_L$, the prandtl number pr, and the average nusselt number $\mathrm{Nu}_L$ [10, 12].

at first, it is necessary to decide whether the convective flow along the internal surfaces is laminar or turbulent. for this purpose, we need to know the temperature of the convective film $T_f$, which is calculated as the average of the room and surface temperatures. this value will surely be very close to the room temperature $T_i$ in the case of surfaces nos. 2 and 3, but not for the surface of the floor (no. 1) due to its heating. for this reason, the estimation of the following thermodynamic values solely concerns the walls and the ceiling, but the conclusions concerning the floor will be extrapolated.

$$T_f = \frac{T_i + T_{si(2)(3)}}{2} \approx T_i = 293 \text{ k} \quad (20)$$
$$\nu(293 \text{ k}) = 15.267 \cdot 10^{-6} \text{ m}^2/\text{s} \quad (21)$$
$$\lambda(293 \text{ k}) = 25.74 \cdot 10^{-3} \text{ w m}^{-1}\text{k}^{-1} \quad (22)$$
$$\alpha(293 \text{ k}) = 21.576 \cdot 10^{-6} \text{ m}^2/\text{s} \quad (23)$$
$$\beta = \frac{1}{T_f} = \frac{1}{293} \text{ k}^{-1} \quad (24)$$
$$g = 9.81 \text{ m s}^{-2} \quad (25)$$
$$\mathrm{Pr}(293 \text{ k}) = 0.7088 \quad (26)$$
$$\mathrm{Re}_L = \frac{u_\infty \cdot L}{\nu} \quad (27)$$
$$\mathrm{Nu}_L = \frac{h \cdot L}{\lambda} \quad (28)$$
$$\mathrm{Ra}_L = \frac{g\beta \left( T_i - T_{si(2)(3)} \right) L^3}{\nu\alpha} \quad (29)$$

surface no. 2 (the internal side of the walls): the height of the walls is 3 m. this is the characteristic length $L$ along which the free convection develops. to estimate the rayleigh number $\mathrm{Ra}_L$, the temperature difference $(T_i - T_{si(2)(3)})$ is needed. this difference is unknown, but, on the basis of experience, it may amount to about 5 k (a preliminary guess). the corresponding rayleigh number calculated according to eq. (29) then reaches the following value:

$$\mathrm{Ra}_L = \frac{9.81 \cdot \frac{1}{293} \cdot 5 \cdot 3^3}{15.267 \cdot 10^{-6} \cdot 21.576 \cdot 10^{-6}} = 1.372 \cdot 10^{10} \quad (30)$$

this value is one order of magnitude larger than the critical rayleigh number $\mathrm{Ra}_L^{(critical)} \approx 10^9$, specifying the limit for the transition between laminar and turbulent flows [10]. the turbulent flow is inevitably present, since the critical distance $x_c$

$$x_c = \sqrt[3]{\frac{10^9 \cdot \nu\alpha}{g\beta \left( T_i - T_{si(2)(3)} \right)}} = \sqrt[3]{\frac{10^9 \cdot 15.267 \cdot 10^{-6} \cdot 21.576 \cdot 10^{-6}}{9.81 \cdot \frac{1}{293} \cdot 5}} = 1.25 \text{ m} \quad (31)$$

is smaller than the characteristic length $L = 3$ m. as soon as the laminar flow passes the critical distance of 1.25 m, it starts converting into a turbulent flow. this is a typical case of a combined (mixed) convective flow, in which both types of convection participate almost equally. the correlation function of the average nusselt number in the case of free mixed convection running along a vertical surface was experimentally established in the following form [10]:

$$\mathrm{Nu}_L = \left\{ 0.825 + \frac{0.387 \cdot (\mathrm{Ra}_L)^{1/6}}{\left[ 1 + \left( \frac{0.492}{\mathrm{Pr}} \right)^{9/16} \right]^{8/27}} \right\}^2 \quad \text{(for the entire range of } \mathrm{Ra}_L\text{)} \quad (32)$$

from the definition of the nusselt number (eq. (11)) and the correlation function (eq. (32)), it follows:

$$\mathrm{Nu}_L = \frac{h \cdot L}{\lambda} \Longrightarrow h = \frac{\lambda}{L} \left\{ 0.825 + \frac{0.387 \cdot (\mathrm{Ra}_L)^{1/6}}{\left[ 1 + \left( \frac{0.492}{\mathrm{Pr}} \right)^{9/16} \right]^{8/27}} \right\}^2 \quad (33)$$

the length $L = 3$ m is utilized as the characteristic dimension in this case:

$$h_{si2} = \frac{25.74 \cdot 10^{-3}}{3} \cdot \left\{ 0.825 + \frac{0.387 \cdot \left( \frac{9.81 \cdot \frac{1}{293} \cdot (T_i - T_{si2}) \cdot 3^3}{15.267 \cdot 10^{-6} \cdot 21.576 \cdot 10^{-6}} \right)^{1/6}}{\left[ 1 + \left( \frac{0.492}{0.7088} \right)^{9/16} \right]^{8/27}} \right\}^2 \quad (34)$$
as a common practice, the ratio of area s and circumference c of the horizontal surface has been implemented [10, 12] l ≈ s c (37) for the internal surface of the ceiling, the characteristic length is l ≈ 80/36 = 2.22 m, which is comparable with the characteristic length of the walls. thus, the approximate rayleigh number will be comparable with the rayleigh number of the wall (eq. (30)), namely: ral = gβ(ti −tsi3)l3 να = 9.81 · 1293 · 5 · 2.22 3 15.267 · 10−6 · 21.576 · 10−6 = 5.5604 · 109 (38) the correlation function of the average nusselt number in the case of the horizontal cold surface faced downward experiencing free convection has been experimentally found in the following form [10, 12] nul = 0.15 · (ral)1/3 (for the range 107 < ral < 1010) (39) from the definition of the nusselt number (eq. (11)) and the correlation function (eq. (39)), it follows: nul = h ·l λ =⇒ hsi3 = λ l · 0.15 · (ral)1/3 (107 < ral < 1010) (40) hsi3 = 25.74 · 10−3 2.22 · 0.15 · ( 9.81 · 1293 · (ti −tsi3) · 2.22 3 15.267 · 10−6 · 21.576 · 10−6 )1/3 (41) hsi3 = 1.7894 · (ti −tsi3)1/3 (42) surface no. 1 (floor): the characteristic length l of the horizontal floor is the same as the horizontal ceiling, i.e l = 2.22 m. the surface of the floor has a higher temperature when compared with the room temperature ti and, thus, the corresponding rayleigh number will be larger when compared with the ceiling. supposing tsi3 −ti ≈ 10 k, the rayleigh number would be ral = 1.1121 · 1010. the warm horizontal floor whose surface is faced upwards and the rayleigh number laying in the interval 107 < ral < 1010 , has the same correlation function of the nusselt number as the cold ceiling [10, 11] and, therefore, the relation for the coefficient of heat transfer may be obtained in the same form as for the ceiling: hsi1 = 1.7894 · (tsi1 −ti)1/3 (43) 5. radiative heat in the first part of the serial publication [1], the radiative transport of heat in enclosures has been thoroughly discussed and the computational radiosity method [10, 13–15] has been analysed. the method has been applied not only to closed systems of surfaces but also to open systems. due to that thorough treatise, there is no need to repeat the basics of that computational method and the particular operations may be presented without detailed explanations. the matrix of view factors fij related to the investigated room:  0 0.460 0.5400.341 0.318 0.341 0.540 0.460 0   (44) 230 vol. 59 no. 3/2019 model of radiative and convective heat transfer in buildings: part ii radiosities wi of the inner surfaces: wi = εiebi + ρi n∑ j=1 fijwj i,j = 1, 2, 3 (45) w1 = 0.95 · 5.67 ·x1 + 0.05(0.46w2 + 0.54w3) w2 = 0.85 · 5.67 ·x2 + 0.15(0.341w1 + 0.318w2 + 0.34w3) w3 = 0.90 · 5.67 ·x3 + 0.10(0.54w1 + 0.46w2) (46) the system of linear algebraic equations (46) is rewritten by means of the gauss elemination method so that the radiosities wi may be functions of the temperature terms x1, x2, and x3: w1 = 5.401810x1 + 0.12334x2 + 0.14485x3 w/m2 w2 = 0.306568x1 + 5.08046x2 + 0.28299x3 w/m2 w3 = 0.305790x1 + 0.24035x2 + 5.12380x3 w/m2 (47) heat flows φi associated with the inner surfaces: φi = si ·qi = si εi ρi (ebi −wi) (48) φ1 = 407.6488x1 − 187.4768x2 − 220.172x3 w φ2 = 360.7985x2 − 187.6196x1 − 173.1899x3 w φ3 = 393.2640x3 − 220.1688x1 − 173.0520x2 w (49) for further processing, it is important to realize which of the flows {φi} is positive (heat emission from the surface into the room) and which of the flows {φi} is negative (heat absorption from the room into the surface). 
according to the analysis presented in the first part of the serial communication [1], the positive flow is associated with the heated floor (warm surface) and the rest of the inner surfaces absorb energies (cold surfaces): φ1 > 0 φ2 < 0 φ3 < 0 (50) 6. transport equations in the preceding sections 3 and 4, the convective and radiative transports have been solved. the internal and external coefficients of heat transfer hsi1, hsi2, hsi3, hse2, hse3 have been determined in section 3. some of these coefficients are in general forms requiring the surface temperatures tsi1, tsi2, tsi3. in section 4, the radiative heat flows φ1, φ2, and φ3 inside the room have been expressed in dependence on the temperature terms x1 = (tsi1/100)4, x2 = (tsi2/100)4, and x3 = (tsi3/100)4, which also contain the unknown temperatures tsi1, tsi2, tsi3. as soon as the room reaches the stationary thermal state, the heat losses at its external sides are equal to the internal heat transports. the external heat losses comprise losses caused by convection and thermal radiation against the sky. to calculate the radiation against the sky, two other temperatures are required, namely tse2 and tse3. by summarizing the unknown temperatures that are required, it is evident that there are five unknown temperatures tsi1, tsi2, tsi3, tse2 and tse3 that have to be determined. this problem will be resolved by formulating five equations, the first two of which describe the heat transfer at the external sides whereas the last three equations describe the heat transfers at the internal sides of the room: tsi2 −tse2 r2 = hse2 · (tse2 −te) + εse2 · 5.67 · [( tse2 100 )4 − ( t0 100 )4] w/m2 (51) tsi3 −tse3 r3 = hse3 · (tse3 −te) + εse3 · 5.67 · [( tse3 100 )4 − ( t0 100 )4] w/m2 (52) s2 tsi2 −tse2 r2 = s2 ·hsi2 · (ti −tsi2) −φ2 w (53) s3 tsi3 −tse3 r3 = s3 ·hsi3 · (ti −tsi3) −φ3 w (54) 231 tomáš ficker acta polytechnica s1 ·hsi1 · (tsi1 −ti) + φ1 = s2 tsi2 −tse2 r2 + s3 tsi3 −tse3 r3 w (55) first eq. (51) declares the thermal stationarity between the heat conduction inside the walls and their convective and radiative heat transfers at their external sides. similarly, second eq. (52) puts the equality between the conductive transport inside the ceiling and the sum of convective and radiative heat transfers at its external side. in short, first two equations (51) and (52) solve the heat losses at the external sides of the investigated room. third eq. (53) defines the numerical equality between the heat conduction inside the walls and the convective and radiative transfers at their internal sides. similarly, fourth eq. (54) specifies the stationary equivalence between the conductive transport inside the ceiling and the convective and radiative heat transfers at its internal side. the minus signs with the terms φ2 and φ2 are used because these two terms have a negative numerical value (see eqs. (50)). fifth eq. (55) describes the thermal equilibrium between the energy coming from the floor and that absorbed by the walls and the ceiling, i.e. it represents the overall heat losses. in short, the last three equations (53), (54), and (55) solve the thermal transfers on the internal sides of the investigated room. the general system of eqs. (51) (55) represents a procedure for finding heat losses of a room whose envelope has been reduced to three surfaces. in a more general case, when the room has a larger number of inner surfaces, the number of equations increases. 
for example, a room with n cold surfaces and 1 warm surface (heated floor or one radiant panel) would require 2n + 1 equations whose forms would closely resemble eqs. (51) (55). if a source of the heat in the room consisted of several units of the same temperature (e.g. several radiant panels), the total surface of these units would enter the equation related to the warm surface (eq. (55)) and, in addition, the radiative heat flow φi in that equation would be calculated as the sum of the radiative heat flows generated by all the units. to prepare the system of equations (51) (55) for the numerical processing, it is necessary to insert the data from tab. 1, the coefficients of heat transfer from section 2 and the heat flows from section 3 in the system of equations (51) (55). after inserting all those items, the system of equations assumes the following form: 0.4 · (tsi2 −tse2) = 57.354 · (tse2 − 258) + 0.9 · 5.67 · 10−8 · [ (tse2) 4 − 2434 ] (56) 0.8333 · (tsi3 −tse3) = 47.138 · (tse3 − 258) + 0.9 · 5.67 · 10−8 · [ (tse3) 4 − 2434 ] (57) 108 · 0.4 · (tsi2 −tse2) = 108 · [0.076418 + 1.124267 · (293 −tsi2)1/6]2 · (293 −tsi2)− − 10−8(360.7985 ·t 4si2 − 187.6196 ·t 4 si1 − 173.1899 ·t 4 si3) (58) 80 · 0.8333 · (tsi3 −tse3) = 80 · 1.7894 · (293 −tsi3)4/3− − 10−8(393.264 ·t 4si3 − 220.1688 ·t 4 si1 − 173.052 ·t 4 si2) (59) 80 · 1.7894 · (tsi1 − 293)4/3 + 108 · (407.6488 ·t 4si1 − 187.4768 ·t 4 si2 − 220.172 ·t 4 si3) = = 108 · 0.4 · (tsi2 −tse2) + 80 · 0.8333 · (tsi3 −tse3) (60) eqs. (56) (60) represent the system of five non-linear transcendent equations. solving such system may require an iterative procedure. in our case, the newton iterative method utilizing the jacobi matrix has been used. the starting guess of five temperatures (tsi1 = 303 k, tsi2 = 295 k, tsi3 = 292 k, tse2 = 265 k, tse3 = 260 k) has been used. after six iterations, the resulted temperatures have satisfied the system of eqs. (56) (60) satisfactorily. two types of computations have been performed. the first type has not included the heat losses caused by the radiation of exterior surfaces against the sky. the second type of computations has taken the external radiation into account. these two types of computations have enabled us to evaluate the influence of external radiation on the overall heat loss. 6.1. heat losses without external radiation in this type of computations, the radiations of external surfaces have been excluded, i.e. eqs. (56) (57) have not contained the terms 0.9 · 5.67 · 10−8 · [ (tse2)4 − 2434 ] and 0.9 · 5.67 · 10−8 · [ (tse3)4 − 2434 ] , respectively. the resulted temperatures, coefficients of heat transfers and heat losses are shown in tab. 2. from tab. 2, it can be seen that the floor temperature amounts to about 23.40 °c whereas internal surfaces of the walls and the ceilings have lower temperatures, namely about 17 °c. the temperatures of the external 232 vol. 59 no. 3/2019 model of radiative and convective heat transfer in buildings: part ii tsi1 (floor) 296.3970 k (∼ 23.40 °c) hsi2 (walls interior) 2.0723 w/(m2k) tsi2 (walls interior) 289.8231 k (∼ 16.82 °c) hsi3 (ceiling interior) 2.6148 w/(m2k) tsi3 (ceiling interior) 289.8763 k (∼16.88 °c) tse2 (walls exterior) 258.2204 k (∼−14.78 °c) heat losses through walls 1 365.2366 w tse3 (ceiling exterior) 258.5537 k (∼−14.45 °c) heat losses through ceiling 2 088.0898 w hsi1 (floor) 2.6888 w/(m2k) overall heat losses 3 453.3264 w table 2. resulting quantities obtained by the computations excluding external heat radiations. 
two types of computations have been performed. the first type does not include the heat losses caused by the radiation of the exterior surfaces against the sky. the second type takes the external radiation into account. these two types of computations make it possible to evaluate the influence of the external radiation on the overall heat loss.

6.1. heat losses without external radiation

in this type of computations, the radiation of the external surfaces has been excluded, i.e. eqs. (56) and (57) do not contain the terms $0.9 \cdot 5.67 \cdot 10^{-8} \cdot \left[ (T_{se2})^4 - 243^4 \right]$ and $0.9 \cdot 5.67 \cdot 10^{-8} \cdot \left[ (T_{se3})^4 - 243^4 \right]$, respectively. the resulting temperatures, coefficients of heat transfer and heat losses are shown in tab. 2.

table 2. resulting quantities obtained by the computations excluding external heat radiation:

$T_{si1}$ (floor) | 296.3970 k (∼ 23.40 °c)
$T_{si2}$ (walls interior) | 289.8231 k (∼ 16.82 °c)
$T_{si3}$ (ceiling interior) | 289.8763 k (∼ 16.88 °c)
$T_{se2}$ (walls exterior) | 258.2204 k (∼ −14.78 °c)
$T_{se3}$ (ceiling exterior) | 258.5537 k (∼ −14.45 °c)
$h_{si1}$ (floor) | 2.6888 w/(m²k)
$h_{si2}$ (walls interior) | 2.0723 w/(m²k)
$h_{si3}$ (ceiling interior) | 2.6148 w/(m²k)
heat losses through walls | 1 365.2366 w
heat losses through ceiling | 2 088.0898 w
overall heat losses | 3 453.3264 w

from tab. 2, it can be seen that the floor temperature amounts to about 23.40 °c, whereas the internal surfaces of the walls and the ceiling have lower temperatures, namely about 17 °c. the temperatures of the external sides of the room envelope have dropped down to −14 °c, but this is still above the temperature of the external air (−15 °c). however, if the external radiative heat losses are included in the computations, all these temperatures will change their values.

6.2. heat losses with external radiation

in this type of computations, the radiation of the external surfaces has been included, i.e. eqs. (56) and (57) contain the terms $0.9 \cdot 5.67 \cdot 10^{-8} \cdot \left[ (T_{se2})^4 - 243^4 \right]$ and $0.9 \cdot 5.67 \cdot 10^{-8} \cdot \left[ (T_{se3})^4 - 243^4 \right]$, respectively. the resulting temperatures, coefficients of heat transfer and heat losses are shown in tab. 3.

table 3. resulting quantities obtained by the computations including external heat radiation:

$T_{si1}$ (floor) | 296.4994 k (∼ 23.50 °c)
$T_{si2}$ (walls interior) | 289.7869 k (∼ 16.79 °c)
$T_{si3}$ (ceiling interior) | 289.8111 k (∼ 16.81 °c)
$T_{se2}$ (walls exterior) | 257.4211 k (∼ −15.58 °c)
$T_{se3}$ (ceiling exterior) | 257.5791 k (∼ −15.42 °c)
$h_{si1}$ (floor) | 2.7155 w/(m²k)
$h_{si2}$ (walls interior) | 2.0797 w/(m²k)
$h_{si3}$ (ceiling interior) | 2.6328 w/(m²k)
heat losses through walls | 1 398.2026 w
heat losses through ceiling | 2 148.7140 w
overall heat losses | 3 546.9166 w

in this second type of computations, one would expect the overall heat loss to be increased by the amount of energy represented by the external radiative heat computed according to the expressions $108 \cdot 0.9 \cdot 5.67 \cdot 10^{-8} \cdot \left[ (T_{se2})^4 - 243^4 \right]$ plus $80 \cdot 0.9 \cdot 5.67 \cdot 10^{-8} \cdot \left[ (T_{se3})^4 - 243^4 \right]$. this sum of energies amounts to about ∼ 8 720 w. however, the real increase of the heat loss is only $(3\,546.9166 - 3\,453.3264) \approx 93.6$ w. the explanation of this paradoxical discrepancy is actually easy. from tab. 3 it follows that the temperatures of the external surfaces, −15.58 °c and −15.42 °c, lie below the temperature of the air (−15 °c), which means that these surfaces absorb heat from the air due to convection. these convective gains can be calculated according to the terms $108 \cdot 57.354 \cdot (T_{se2} - 258)$ plus $80 \cdot 47.138 \cdot (T_{se3} - 258)$, which provide ∼ 5 173.1 w. the convective gains compensate the radiative losses and, in addition, the original convective loss of ∼ 3 453.3 w (see tab. 2) is no longer active. the net gain is, therefore, much smaller, i.e. $8\,720 - 3\,453.3 - 5\,173.1 \approx 93.6$ w, which is in fact the real increase of the overall heat loss. this phenomenon is a consequence of the second law of thermodynamics and can be termed 'natural thermal pumping'. if the radiative heat loss (∼ 8 720 w) is considered along with the convective heat gain (∼ 5 173.1 w), then the resulting heat loss can be calculated as the difference $8\,720 - 5\,173.1 = 3\,546.9$ w, which perfectly agrees with the net heat loss given in tab. 3.

as seen, the very cold sky (−30 °c) is responsible for dropping the temperatures of the external surfaces below the temperature of the air. such a situation, when external surfaces are cooler than the air environment, is favourable for the condensation of water vapours. for example, if the relative humidity in our case were about 96.6 %, condensate would appear on both external surfaces (on the walls and the ceiling), but due to the low temperatures, the water droplets would immediately freeze and an icy cover would arise on the surfaces. a similar situation may occur in summer seasons, when the cold night sky makes solid surfaces cooler compared to the air, and wet surfaces may be observed in the early morning.
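the 'natural thermal pumping' balance can be checked directly from the numbers of tab. 3; the short sketch below reproduces the ∼ 8 720 w radiative loss, the ∼ −5 173 w convective gain and their ∼ 3 547 w net:

```python
sigma, eps = 5.67e-8, 0.9
Te, To = 258.0, 243.0                 # external air and sky temperatures [K]
Tse2, Tse3 = 257.4211, 257.5791       # external surface temperatures (tab. 3)

# radiative loss towards the cold sky, ~8720 W
phi_rad = (108 * eps * sigma * (Tse2**4 - To**4)
           + 80 * eps * sigma * (Tse3**4 - To**4))

# convective exchange with the outside air; negative here, i.e. a gain,
# because both surfaces are colder than the air (~ -5173 W)
phi_conv = 108 * 57.354 * (Tse2 - Te) + 80 * 47.138 * (Tse3 - Te)

print(phi_rad + phi_conv)             # ~3547 W, the overall loss of tab. 3
```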
7. comparison with thermal standard

the heat losses of the investigated room may also be evaluated by the appropriate thermal standard. in our case, the czech thermal standard [16] has been chosen. this standard specifies fixed coefficients of heat transfer for internal surfaces ($h_i$) and external surfaces ($h_e$) according to their orientations and the two seasons of the year (winter and summer):

• $h_i = 7.7$ w/(m²k) (internal vertical surface, arbitrary season)
• $h_i = 5.9$ w/(m²k) (internal horizontal surface faced downwards, arbitrary season)
• $h_i = 10.0$ w/(m²k) (internal horizontal surface faced upwards, arbitrary season)
• $h_e = 25$ w/(m²k) (external surface, arbitrary orientation, winter season)

heat losses according to the thermal standard [16]:

walls:
$$108 \cdot \frac{20 - (-15)}{\frac{1}{7.7} + 2.5 + \frac{1}{25}} = 1\,415.7992 \ \text{w} \quad (61)$$

ceiling:
$$80 \cdot \frac{20 - (-15)}{\frac{1}{5.9} + 1.2 + \frac{1}{25}} = 1\,986.5320 \ \text{w} \quad (62)$$

the overall heat loss:
$$\sum = 3\,402.3312 \ \text{w} \quad (63)$$

the heat loss of 3 402.3312 w is close to the result shown in tab. 2, which, however, does not include the heat loss due to the external radiation against the very cold sky. the difference is only ∼ 51 w. comparing the heat loss 3 402.3312 w with the result shown in tab. 3, it is obvious that the difference is larger and amounts to ∼ 145 w. undoubtedly, the result shown in tab. 3 is more precise than the calculation according to eqs. (61)–(63), since the result in tab. 3 includes not only the influence of the external radiation but also the influence of the internal radiation inside the room, the actual temperature of the fluid flow, the turbulent air movement at the external side of the room and the laminar flow of air inside the room.

the simple calculations performed according to the thermal standard [16] (see eqs. (61)–(63)) utilise universal fixed values of the coefficients of heat transfer $h_i$ and $h_e$. unfortunately, these fixed values cannot satisfy all varieties of physical conditions inside and outside the room. the coefficients of heat transfer depend on the thermodynamic state of the air inside and outside the room (e.g. on the temperature, the speed of the circulating movement, turbulent or laminar flows) and also on the geometrical character of the internal and external surfaces (their shapes, lengths, roughness, etc.). for these reasons, the universal fixed values of the coefficients recommended by the thermal standard cannot reliably serve for an accurate estimation of the heat transfer; they may only provide an approximate guess for a quick orientation.
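the standard calculation of eqs. (61)–(63) reduces to a one-line formula; a minimal sketch:

```python
def standard_loss(area, dT, h_i, R, h_e=25.0):
    """eqs. (61)-(62): Q = A * dT / (1/h_i + R + 1/h_e) with the
    fixed coefficients of the thermal standard [16]."""
    return area * dT / (1.0 / h_i + R + 1.0 / h_e)

walls   = standard_loss(108, 35, h_i=7.7, R=2.5)   # ~1415.8 W
ceiling = standard_loss(80, 35, h_i=5.9, R=1.2)    # ~1986.5 W
print(walls + ceiling)                             # ~3402.3 W, eq. (63)
```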
tab. 4 summarizes the surface temperatures and the coefficients of heat transfer computed by means of the transport equations (eqs. (56)–(60)) and calculated according to the thermal standard [16]. this table offers a comparison of the results achieved by both methods.

table 4. resulting temperatures and coefficients of heat transfer:

physical property | results from transcendent eqs. (56)–(60) | results from thermal standard
$T_{si1}$ (floor) | 23.50 °c | –
$T_{si2}$ (walls interior) | 16.79 °c | 18.30 °c
$T_{si3}$ (ceiling interior) | 16.81 °c | 16.26 °c
$T_{se2}$ (walls exterior) | −15.58 °c | −14.47 °c
$T_{se3}$ (ceiling exterior) | −15.42 °c | −14.12 °c
$h_{si1}$ (floor) | 2.7155 w/(m²k) | 10 w/(m²k)
$h_{si2}$ (walls interior) | 2.0797 w/(m²k) | 7.7 w/(m²k)
$h_{si3}$ (ceiling interior) | 2.6328 w/(m²k) | 5.9 w/(m²k)
$h_{se2}$ (walls exterior) | 57.354 w/(m²k) | 25 w/(m²k)
$h_{se3}$ (ceiling exterior) | 47.138 w/(m²k) | 25 w/(m²k)

a clear difference can be seen between the results concerning the external temperatures $T_{se2}$ and $T_{se3}$ achieved by these two methods. the approximate standard calculations set these temperatures above the air temperature, whereas the transport equations (56)–(60) have found their values below the air temperature as a consequence of the very cold sky (a situation favourable for 'natural thermal pumping'). in fact, the standard calculations are not capable of taking the effects of the cold sky into account, since they have no numerical tool for lowering the temperatures of external surfaces below the temperature of the external air. this is a significant weak point of the standard calculations.

some technical literature suggests that the coefficients of heat transfer may be corrected by the radiative contribution as follows:

$$h_{tot} = h_{convection} + \Delta h_{radiation} \quad (64)$$

thanks to our separate computations of the convective and radiative heat transfers, it is possible to find such corrections and then compare the corrected coefficients with those suggested by the thermal standard [16]. our convective coefficients of heat transfer $h_{convection}$ are given by eqs. (16), (19), (36), (42), and (43). the radiative corrections $\Delta h_{radiation}$ may be computed from the radiative heat energies $\Phi_{si1}$, $\Phi_{si2}$, $\Phi_{si3}$ given by eqs. (49) at the internal side of the room and from the radiative heat energies at the external side of the room specified by the expressions $\Phi_{se2} = 0.9 \cdot 5.67 \cdot 10^{-8} \cdot \left[ (T_{se2})^4 - 243^4 \right]$ and $\Phi_{se3} = 0.9 \cdot 5.67 \cdot 10^{-8} \cdot \left[ (T_{se3})^4 - 243^4 \right]$:

$$\Delta h_{radiation,si1} = \frac{\Phi_{si1}}{S_1 \cdot (T_{si1} - T_i)} \quad (65)$$
$$\Delta h_{radiation,si2} = \frac{\Phi_{si2}}{S_2 \cdot (T_{si2} - T_i)} \quad (66)$$
$$\Delta h_{radiation,si3} = \frac{\Phi_{si3}}{S_3 \cdot (T_{si3} - T_i)} \quad (67)$$
$$\Delta h_{radiation,se2} = \frac{\Phi_{se2}}{S_2 \cdot (T_{se2} - T_e)} \quad (68)$$
$$\Delta h_{radiation,se3} = \frac{\Phi_{se3}}{S_3 \cdot (T_{se3} - T_e)} \quad (69)$$

tab. 5 summarizes the corrected coefficients $h_{tot}$ and the coefficients suggested by the thermal standard [16]. it is evident that the agreement between these two groups of coefficients is not satisfactory.

table 5. the comparison of the coefficients of heat transfer:

physical property | resulting $h_{tot}$ from eqs. (64)–(69) | h-values from thermal standard | differences
$h_{si1}$ (floor) | 12.56 w/(m²k) | 10 w/(m²k) | 26 %
$h_{si2}$ (walls interior) | 5.76 w/(m²k) | 7.7 w/(m²k) | 25 %
$h_{si3}$ (ceiling interior) | 8.43 w/(m²k) | 5.9 w/(m²k) | 43 %
$h_{se2}$ (walls exterior) | 56.62 w/(m²k) | 25 w/(m²k) | 126 %
$h_{se3}$ (ceiling exterior) | 45.75 w/(m²k) | 25 w/(m²k) | 83 %

as seen from tab. 5, there are large differences between the results of these two methods. in some cases, the difference reaches 126 % (see $h_{se2}$ in tab. 5). this indicates that the standard fixed coefficients can hardly be considered as coefficients that contain reliable contributions from heat radiation. in short, the standard method [16] may be used only for a very approximate estimation of heat losses.
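eq. (64) together with eqs. (65)–(69) can be expressed compactly; the sketch below reproduces, as an example, the corrected external wall coefficient of tab. 5 (illustrative only):

```python
def h_total(h_conv, phi, area, dT):
    """eq. (64) with eqs. (65)-(69): h_tot = h_conv + phi / (S * dT),
    i.e. the radiative flow converted into an equivalent coefficient."""
    return h_conv + phi / (area * dT)

# external wall: the correction is negative because the surface lies
# below the air temperature (dT < 0) while the flux to the sky is positive
sigma, eps = 5.67e-8, 0.9
Tse2, Te, To = 257.4211, 258.0, 243.0
phi_se2 = eps * sigma * (Tse2**4 - To**4)           # flux, cf. eq. (68)
print(h_total(57.354, phi_se2, 108.0, Tse2 - Te))   # ~56.6, cf. tab. 5
```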
8. conclusions

the presented computational analysis has investigated the convective and radiative heat transports as coupled phenomena. the procedure consists of three numerical steps. in the first step, the convective coefficients of heat transfer are expressed in a general form by means of the correlation functions of the nusselt numbers. in the second step, the radiative heat flows are specified in a general form inside and outside the investigated space. in the third and concluding step, a system of transcendent equations is formed, in which the convective and radiative transports are coupled. the system is solved iteratively. the results provide optimized surface temperatures as well as optimized convective coefficients of heat transfer. on the basis of these optimized values, the total heat loss is computed.

the presented analysis has shown that the heat losses determined by the thermal standard [16] are underestimated. the problem of this thermal standard is that it uses constant coefficients of heat transfer $h$ that are actually highly variable quantities. these coefficients depend on many parameters, such as the thermodynamic conditions of the convective flows, the speed of the flows, the types of flows (free or forced, laminar or turbulent), the shapes of the surfaces (plane or curved), and the topology of the surfaces (smooth or rough). for example, a high speed of convective flows may considerably increase the value of the coefficients of heat transfer. all these effects can be taken into account by the combined convective-radiative computational model but not by the standard method [16]. for this reason, it should not be surprising that the coefficients of heat transfer provided by the standard document [16] show such large deviations from the coefficients obtained by the convective-radiative method (see tab. 5). since the thermal standard [16] does not include the influence of the thermodynamic state of the external environment (e.g. the cold sky, among others), it is incapable of finding reliable external surface temperatures, and thus effects like 'natural thermal pumping' are beyond the calculational capability of the standard procedure. these drawbacks also concern all other thermal methods that use constant coefficients of heat transfer.

the presented computational analysis based on the combined convective-radiative heat transfer allows several summarizing conclusions to be drawn:

• the radiative heat transfer is the dominant energy transport in large enclosures like the inner spaces of buildings. however, in narrow air cavities whose widths are only several millimetres, the conductive heat transfer assumes the governing role.

• the combined radiative-convective computational model for assessing the heat transport can be introduced on the basis of the algebraic radiosity method and the correlation functions of the nusselt number.

• in comparison with the governing partial differential equations, the radiative-convective computational model is more convenient for common engineering practice, since a system of algebraic equations is easier to solve than a system of partial differential equations.

• the accuracy of the radiative-convective model is not worse than the accuracy of the governing partial differential equations, since the radiative-convective model utilizes correlation functions, most of which have been determined by precise experiments, and such experiments are usually closer to reality than discretized differential equations.

• the radiative-convective model includes the majority of the influencing parameters, such as the thermodynamic parameters of the convective flows, the speed of the flows, the types of flows (free or forced, laminar or turbulent), the shapes of the surfaces (plane or curved), and the topology of the surfaces (smooth or rough).

• most of the preparatory numerical operations of the radiative-convective model are easy to perform even on a pocket calculator. the model applied to a heated enclosure consists of $2n + 1$ transcendent algebraic equations, where the symbol $n$ is the number of cold surfaces.
the system is solved iteratively, which facilitates an optimization of the results.

• the radiative-convective model is primarily focused on the heat losses of buildings. it takes into account the influence of not only the external convective heat losses but also the external radiative heat losses (radiation towards the cold sky). this makes it possible to optimize the temperatures on the external surfaces and, thanks to this optimization, the special effect of 'natural thermal pumping' has been revealed and explained.

• one of the valuable properties of the radiative-convective model consists in the possibility of computing the coefficients of heat transfer as a sum of convective and radiative contributions, i.e. $h_{tot} = h_{convection} + \Delta h_{radiation}$. by comparing these optimized coefficients $h_{tot}$ with those provided by the thermal standard [16], large discrepancies have been found (see tab. 5). this is due to the fact that the thermal standard uses constant coefficients that are optimized neither to the actual convective thermodynamic states nor to the real heat radiation of the surfaces. as a consequence, the heat losses assessed by such a thermal standard may be only very approximate and incapable of taking into account special phenomena like 'natural thermal pumping'.

the developed model treats all of the energy flows in the interior of a structure, through the walls and 'roof' of the structure, and between the exterior of the structure and the surrounding environment. this is relevant to computing optimized interior and exterior surface temperatures. the method can compute the energy demand needed to maintain interior temperatures at desired values. this may be a useful tool in the fields of building thermal technology and building physics. the model is more advanced than the common methods based on thermal standards. its results concerning heat losses have the potential to approach the quality of the corresponding results yielded by the partial differential models employed in recent thermal research.

references

[1] t. ficker. general model of radiative and convective heat transfer in buildings: part i: algebraic model of radiative heat transfer. acta polytechnica 59(3):211 – 223, 2019. doi:10.14311/ap.2019.59.0211.
[2] j. du, m. chan, d. pan, s. deng. a numerical study on the effects of design/operating parameters of the radiant panel in a radiation-based task air conditioning system on indoor thermal comfort and energy saving for a sleeping environment. energy and buildings 151:250 – 262, 2017. doi:10.1016/j.enbuild.2017.06.052.
[3] o. acikgoz, a. çebi, a. s. dalkilic, et al. a novel ann-based approach to estimate heat transfer coefficients in radiant wall heating systems. energy and buildings 144:401 – 415, 2017. doi:10.1016/j.enbuild.2017.03.043.
[4] t. ficker. non-isothermal steady-state diffusion within glaser's condensation model. international journal of heat and mass transfer 46(26):5175 – 5182, 2003. doi:10.1016/s0017-9310(03)00356-9.
[5] t. ficker. handbook of building thermal technology, acoustics and day lighting. cerm, brno, 2004.
[6] t. ficker. applied building physics - heat transfer and condensation in buildings. but, brno, 2017.
[7] k. hollands, g. raithby, l. konicek. correlation equations for free convection heat transfer in horizontal layers of air and water. international journal of heat and mass transfer 18(7):879 – 884, 1975. doi:10.1016/0017-9310(75)90179-9.
[8] d. k. singh, s. singh. combined free convection and surface radiation in tilted open cavity. international journal of thermal sciences 107:111 – 120, 2016. doi:10.1016/j.ijthermalsci.2016.04.001.
[9] v. vivek, a. k. sharma, c. balaji. interaction effects between laminar natural convection and surface radiation in tilted square and shallow enclosures. international journal of thermal sciences 60:70 – 84, 2012. doi:10.1016/j.ijthermalsci.2012.04.021.
[10] f. p. incropera, d. p. dewitt. fundamentals of heat transfer. john wiley & sons, new york, 1981.
[11] s. w. churchill, h. h. chu. correlating equations for laminar and turbulent free convection from a vertical plate. international journal of heat and mass transfer 18(11):1323 – 1329, 1975. doi:10.1016/0017-9310(75)90243-4.
[12] r. goldstein, e. sparrow, d. jones. natural convection mass transfer adjacent to horizontal plates. international journal of heat and mass transfer 16(5):1025 – 1035, 1973. doi:10.1016/0017-9310(73)90041-0.
[13] w. a. gray, r. müller. engineering calculations in radiative heat transfer. pergamon press, oxford, 1974.
[14] r. siegel, j. r. howell. thermal radiation heat transfer. mcgraw-hill, new york, 1972.
[15] h. c. hottel. radiant-heat transmission. in w. h. mcadams (ed.), heat transmission. mcgraw-hill book co., new york, 1954.
[16] čsn 73 0540-3. thermal protection of buildings, part 3: design value quantities. standard, czech standardization agency, 2005.

acta polytechnica 61(5):661–671, 2021, doi:10.14311/ap.2021.61.0661
© 2021 the author(s). licensed under a cc-by 4.0 licence. published by the czech technical university in prague.

subjective approach to optimal cross-sectional design of biodegradable magnesium alloy stent undergoing heterogeneous corrosion

najmeh zarei (a), seyed ahmad anvar (a, *), sevan goenezen (b)

a shiraz university, department of civil and environmental engineering, shiraz 71348, iran
b texas a & m university, department of mechanical engineering, college station, 77843 texas, usa
* corresponding author: anvar@shirazu.ac.ir

abstract. existing biodegradable magnesium alloy stents (mas) have several drawbacks, such as high restenosis, hasty degradation, and a bulky cross-section, that limit their widespread application in current clinical practice. to find the optimum stent with the smallest possible cross-section and an adequate scaffolding ability, 3d finite element models of 25 mas stents of different cross-sectional dimensions were analysed while localized corrosion was underway. for the stent geometric design, a generic sine-wave ring of biodegradable magnesium alloy (az31) was selected.
previous studies have shown that the long-term performance of mas is characterized by two key features: the stent recoil percent (srp) and the stent radial stiffness (srs). in this research, the variation with time of these two features during the corrosion phase was monitored for the 25 stents. to find the optimum profile design of the stent subjectively (without using optimization codes and at much lower computational cost), the radial recoil was limited to 27 % (corresponding to about a 10 % probability of in-stent diameter stenosis after an almost complete degradation) and the stent with the highest radial stiffness was selected. the comparison of the recoil performance of the 25 stents during the heterogeneous corrosion phase showed that four stents would satisfy the recoil criterion and, among these four, the one having a width of 0.161 mm and a thickness of 0.110 mm showed a 24 % – 49 % higher radial stiffness at the end of the corrosion phase. accordingly, this stent, which also showed a 23.28 % mass loss, was selected as the optimum choice; it has a thinner cross-sectional profile than commercially available mas, which leads to a greater deliverability and lower rates of restenosis.
keywords: coronary stent, biodegradation, magnesium, pitting corrosion, finite element.
1. introduction
recently, using magnesium alloy stents in clinical treatments has become relatively common owing to their biosafety, biocompatibility, superior mechanical properties, and comparatively larger stiffness than other metallic and polymeric biodegradable stents. although mg alloy stents have shown promising results in stent implantation in clinical trials, relatively high rates of vessel lumen loss, a premature loss of scaffolding ability, and especially bulky cross-sectional profiles remain major shortcomings of mas [1–3]. it is noteworthy that either or both of the following approaches can be adopted to improve the mechanical and clinical performance of mas: (1) the development of new materials to enhance the mechanical strength and alter the stent degradation rate, and (2) optimizing the shape of the stent and its cross-sectional geometry to reduce the rate of lumen loss and the bulkiness of the mas and also to increase their scaffolding ability [4, 5]. in line with the second approach, the role of the device design as a key indicator for determining the long-term scaffolding behaviour has to be studied further. in recent years, only a few studies have focused on the influence of the stent design and geometry on the mechanical performance of mas. in the study conducted by gastaldi et al. [6], a numerical simulation of the behaviour of mas was investigated based on a uniform corrosion mechanism and a phenomenological stress-corrosion mechanism. the results obtained from these numerical simulations were validated against an experimental corrosion test [7]. furthermore, wu et al. [8] applied a 3d fem combined with a degradable material model to three different mas designs. they compared the mechanical performance of the three designs while corrosion was in progress. in another study, grogan et al. [9] developed a numerical model for predicting the effects of a pitting corrosion mechanism on the mechanical performance of mas. wu et al. [10] proposed a shape optimization method for a two-dimensional (2d) finite element model of a single stent hinge without considering corrosion.
it should be emphasized that they used a morphing procedure to simplify the optimization and compared the results for four different mg alloys: az31, az80, we43, and zm21. in another study, grogan et al. [11] developed a new method to simulate uniform corrosion by employing an arbitrary lagrangian-eulerian (ale) adaptive meshing technique. the abaqus/standard implicit solver was used in their study. in addition, they combined the uniform corrosion model based on ale adaptive meshing with an optimization strategy to find out the relationship between the geometry and the mechanical performance of mas. chen et al. [12] developed a 2d shape optimization framework for mas with a generic sine-wave design without considering corrosion in the optimization process. they concluded that the complete deformation history, i.e. the deformation induced by deployment and crimping, had a significant effect on the mechanical performance of mas and should be considered in its simulation. recently, ni et al. [13] investigated the effects of the degradation of the stent on the biomechanical performance in the stent-bile duct coupling system. they developed a numerical model based on a continuum damage approach for a biodegradable magnesium alloy bile duct stent. furthermore, they showed how computational methods can be used as a powerful tool for studying the degradation mechanism of a magnesium alloy stent in nonvascular cavities. by looking at the influence of the stent design and geometry, it may be possible to identify the causes of the poor efficacy of mas due to corrosion and to determine suitable biomechanical baseline designs for absorbable magnesium stents. in the present study, 3d finite element models of a biodegradable magnesium alloy stent were developed, which explicitly accounted for localized pitting corrosion and were deployed in idealized stenosed vessels. in the current finite element simulations, a continuum damage-based corrosion model was applied, which can fully capture a localized pitting attack [9]. the overall goal of this research is to subjectively determine the most appropriate stent profile concerning its radial recoil, stiffness properties, and mass loss when localized or pitting corrosion is in progress. the stent profile selection is based on the influence of the stent strut profile on the long-term properties of mas due to localized corrosion, which is investigated through a parametric study. this parametric study focuses on a ring of the stent with different randomly selected strut widths and thicknesses subjected to pitting corrosion and determines the impairment of the stent recoil and radial stiffness due to pitting corrosion.

figure 1. stent geometry and numerical model: (a) complete unit; (b) single ring used in 3d model; (c) fe 3d model of the investigated system.

2. methods
2.1. stent's geometry and finite element model
generally, the stent is made up of two components: the tubular-like rings and the connecting links, fig. 1a. as stated by petrini et al. [14], the tubular-like ring scaffolds the vessel after the expansion, whereas the connecting links provide the flexibility of the stent during its delivery through the artery. when a stent is expanded in a straight vessel, the links act as connectors and can be excluded from the computational model.
therefore, to minimize the computational costs, advantage is taken of the geometrical repetition of a stent design, and in the present study, only one ring of the stent was selected for the 3d model, fig. 1b. the analytical model selected for this investigation is composed of a biodegradable magnesium stent ring, a rigid cylinder (representing a non-folded balloon used to expand the stent), an artery, and a plaque, fig. 1c. the ring of the stent has twelve arches (six peaks and six valleys) in the circumferential direction; it is 1.5 mm long and its outer diameter is 2 mm; these parameters are kept unchanged through the optimization process. as for the cross-sectional dimensions of the stent, five strut widths and five strut thicknesses were selected by a latin hypercube sampling (lhcs) scheme within the bounds stated for these in the literature [15–17] and shown in eq. (1) (a short sketch of this sampling is given at the end of this subsection). table 1 shows the parameter values (w, t) for the 25 stents.

$$\begin{cases} 0.05~\mathrm{mm} \le w \le 0.17~\mathrm{mm} \\ 0.06~\mathrm{mm} \le t \le 0.14~\mathrm{mm} \end{cases} \qquad (1)$$

where w and t are the strut width and thickness, respectively.

table 1. clustering of selected cross-sectional width, w, and thickness, t (mm).
cluster        1        2        3        4        5
w              0.058    0.079    0.110    0.141    0.161
t = 0.062      st.1     st.6     st.11    st.16    st.21
t = 0.078      st.2     st.7     st.12    st.17    st.22
t = 0.098      st.3     st.8     st.13    st.18    st.23
t = 0.110      st.4     st.9     st.14    st.19    st.24
t = 0.133      st.5     st.10    st.15    st.20    st.25

to generate the helical stent ring of specified dimensions and a variable strut width and thickness, a matlab user-subroutine code was prepared in order to automatically generate the three-dimensional stent geometries. as for the artery vessel, it is modelled as a straight, hollow cylinder, 2.8 mm long, with an inner diameter of 3.3 mm and a wall thickness of 0.9 mm. the plaque has a length of 2.2 mm and its thickness gradually changes from 0.2 mm at each side to 0.6 mm at the center [15]. the rigid cylinder within the stent is 2.6 mm long. the whole balloon-stent-plaque-vessel fe model used in the present study consisted of a total of 18500 elements. a mesh convergence test was performed to ensure a sufficient mesh refinement. the cylinder was meshed with surface elements and the other components (stent, artery and plaque) with solid elements. to avoid rigid body movements of the system, the ends of the cylindrical balloon and the arterial vessel were fixed in the axial and circumferential directions. furthermore, twelve outer nodes on the central cross-section of the stent were constrained longitudinally and circumferentially to prevent motions and yet allowed to expand freely in the radial direction. as for the interaction between the inner stent surface and the cylindrical balloon, and between the stent outer surface and the plaque inner surface, a hard contact formulation for the normal behaviour and a frictional coefficient of 0.2 for all tangential behaviour were used, based on de beule et al. [18]. furthermore, to capture the interaction between the degrading stent and the vessel following the element removal, the general contact algorithm was incorporated to redefine the faces in contact.
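as a rough illustration of the sampling referred to above, the following sketch draws five stratified (latin hypercube style) values per dimension inside the bounds of eq. (1) and crosses them into 25 (w, t) pairs, mirroring the structure of table 1. this is an illustrative reconstruction, not the authors' matlab code; the random seed and rounding are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(seed=1)  # arbitrary seed, for reproducibility only

def lhs_1d(low: float, high: float, n: int) -> np.ndarray:
    """one value per equal-width stratum of [low, high] (1d latin hypercube)."""
    edges = np.linspace(low, high, n + 1)
    return np.sort(edges[:-1] + rng.random(n) * (edges[1:] - edges[:-1]))

# bounds of eq. (1)
widths = lhs_1d(0.05, 0.17, 5)       # strut widths w [mm]
thicknesses = lhs_1d(0.06, 0.14, 5)  # strut thicknesses t [mm]

# full factorial of the sampled values -> 25 candidate cross-sections,
# numbered st.1 ... st.25 by cluster (fixed w), as in table 1
stents = {
    f"st.{j * 5 + i + 1}": (round(w, 3), round(t, 3))
    for j, w in enumerate(widths)
    for i, t in enumerate(thicknesses)
}
for name, (w, t) in stents.items():
    print(name, "w =", w, "t =", t)
```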
2.2. materials' properties and models
the stent was assumed to be made of the commercial magnesium alloy az31, modelled as a homogeneous, isotropic, elastoplastic material with non-linear isotropic hardening plasticity. the extruded az31 has a poisson's ratio of 0.35, a yield stress of 138 mpa and an ultimate tensile strength (uts) of 245 mpa at a strain of 17 % [9]. the material properties of the elements on the exposed surfaces of the stent susceptible to corrosion were controlled by damage parameters based on the continuum damage model. in addition to the failure due to excessive corrosion, a ductile failure condition was also set for elements in which the strains exceed a specified ultimate value. as for the balloon, an isotropic, linear-elastic material model, characterized by a young's modulus of 920 mpa and a poisson ratio of 0.4, was used [19]. a hyper-elastic isotropic constitutive model based on a sixth-order reduced polynomial strain energy density function was used for the vessel tissue and the atherosclerotic plaque. more details can be found in references [8, 9, 20]. table 2 shows the model parameters for the artery vessel and plaque.

table 2. constitutive material parameters for the strain energy density function of the artery and plaque (all values in mpa).
constant   c10       c20       c30       c40    c50      c60       k
artery     6.52e-3   4.89e-2   9.26e-3   0.76   -0.43    8.69e-2   –
plaque     0.00238   0.189     -0.388    3.73   -2.540   0.573     4.762

2.3. corrosion model
in this study, a user-prepared explicit solver subroutine for modelling corrosion based on a continuum damage theory was incorporated within a commercial fe code. the continuum damage approach accounts for the effects of corrosion-induced microscale geometric discontinuities by reducing the macroscopic mechanical properties of the material (e.g., stiffness). the degradation of the mechanical properties is considered through the introduction of a scalar damage parameter, d, as described in gastaldi et al. [6]:

$$\frac{\mathrm{d}d}{\mathrm{d}t} = \frac{\delta_u}{l_e}\, k_u \qquad (2)$$

where δu and le are the material and finite element characteristic lengths provided by the utility routine for each element, and ku represents the corrosion kinetic parameter. in the cdm approach, the effects of the geometrical discontinuities can be related to the reduction of the macroscopic mechanical properties of the material (e.g., stiffness) through the introduction of the damage parameter, d, and an effective stress tensor, σ̃ij. the effective stress tensor components can then be defined as:

$$\tilde{\sigma}_{ij} = \frac{\sigma_{ij}}{1 - d} \qquad (3)$$

where σij is the corresponding undamaged stress tensor component. based on the isotropic damage assumption, d is a scalar field that quantifies the loss of integrity due to the corrosion for each element. during the corrosion process, the corrosion damage variable d gradually changes from 0 to 1. in particular, d = 0 corresponds to an undamaged state and d = 1 represents a completely damaged or fully corroded state of the material. in this work, d is only related to the material damage resulting from a localized form of corrosion called pitting corrosion. while the pitting corrosion model can accurately describe the dependency between the fracture time and the applied stress, the uniform corrosion model is not able to predict the fracture time either qualitatively or quantitatively [9]; hence it is not considered in the present study. the damage evolution law for the case of a pitting or heterogeneous corrosion process is [9]:

$$\frac{\mathrm{d}d}{\mathrm{d}t} = \frac{\delta_u}{l_e}\, \lambda_e\, k_u \qquad (4)$$

where λe is, for heterogeneous corrosion, the degradation rate of each element, determined for the surface elements susceptible to corrosion by a weibull distribution function. in this study, following grogan et al. [9], δu and ku were set equal to 0.017 mm and 0.026 h⁻¹, respectively. the corrosion model is implemented in a finite element framework through the development of a user subroutine.
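the per-element damage update of eqs. (2)-(4) can be sketched as follows. this is a simplified, stand-alone illustration of the update logic, not the authors' fe subroutine: λe is drawn once per surface element from a weibull distribution, the damage is integrated explicitly in time, and an element is flagged for removal when d reaches 0.9 or the equivalent plastic strain exceeds 17 %. the weibull shape parameter, element count, and element lengths are assumed placeholders, and the neighbour-exposure update of the corrosion layer after element removal is omitted.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

DELTA_U = 0.017   # material characteristic length [mm], after grogan et al. [9]
K_U = 0.026       # corrosion kinetic parameter [1/h], after grogan et al. [9]
D_CRIT = 0.9      # damage threshold for element deletion
EPS_ULT = 0.17    # equivalent plastic strain at ultimate tensile strength

n_surf = 1000                             # surface elements exposed to corrosion
l_e = np.full(n_surf, 0.01)               # element characteristic lengths [mm] (assumed)
lam_e = rng.weibull(a=1.0, size=n_surf)   # per-element pitting rate (shape assumed)

damage = np.zeros(n_surf)
alive = np.ones(n_surf, dtype=bool)

def corrosion_step(eps_pl: np.ndarray, dt: float) -> None:
    """explicit damage update per eq. (4), with the removal criteria of the paper."""
    damage[alive] += (DELTA_U / l_e[alive]) * lam_e[alive] * K_U * dt
    np.clip(damage, 0.0, 1.0, out=damage)
    alive[(damage >= D_CRIT) | (eps_pl >= EPS_ULT)] = False

def effective_stress(sigma: np.ndarray) -> np.ndarray:
    """effective stress per eq. (3): sigma_tilde = sigma / (1 - d)."""
    return sigma / (1.0 - np.minimum(damage, 0.999))

# one illustrative hour-long increment with zero plastic strain everywhere
corrosion_step(eps_pl=np.zeros(n_surf), dt=1.0)
print("removed elements:", int((~alive).sum()))
```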
to update the state of the elements, the user subroutine is called at each material point in the explicit time integration scheme. the user subroutine determines the element connectivity data and generates a random number through an input file as a pre-processor requirement. the elements on the surface layer, which are susceptible to corrosion, are determined based on the element connectivity map. at the beginning of each analysis increment, the user subroutine receives the stress, strain and time increments from the fe solver and then returns the material properties based on an updated damage parameter at the end of the analysis increment. when the damage parameter of an element, d, reaches 0.9, or the equivalent plastic strain of any element, εpl, exceeds the strain of 17 % at the ultimate tensile strength [9], the element is deleted from the finite element model by this subroutine. after the strongly corroded elements are deleted, the corrosion layer is updated at the end of the time increment so that the elements around the removed element are designated as an updated corrosion layer and exposed to the aggressive environment.

2.4. verification of corrosion model
in this study, the results of the corrosion model simulation, as outlined in section 2.3, were validated against the experimental data provided by grogan et al. [9]. the same foil samples as those of grogan et al. [9], constructed in the finite element framework, were subjected to the pitting corrosion routine, while different uniform tensile stresses, constant throughout the corrosion process, were applied to them, fig. 2. the time required for the foil samples to rupture due to the corrosion was then determined and the results are shown in fig. 3, along with the experimental and fe simulation results of grogan et al. [9]. the comparison of the results showed a good correlation between the present fe corrosion simulation and the experimental results. therefore, the predictive capability of the corrosion computer code was verified. the r² values of the fe simulation results of grogan et al. [9] and of the present research against the experimental results of grogan et al. [9] were calculated, and the values 0.851 and 0.870, respectively, were obtained. the obtained values show a satisfying similarity.

figure 2. foil specimen used for verification against grogan et al. [9] experimental results: (a) fe model, boundary condition and loading; (b) corroded model.
figure 3. verification of numerical simulation against experimental results.
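for completeness, the r² agreement measure quoted above can be computed as in the short sketch below; the arrays of failure times are placeholders, not the measured values of [9].

```python
import numpy as np

def r_squared(observed: np.ndarray, predicted: np.ndarray) -> float:
    """coefficient of determination between experimental and simulated values."""
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# placeholder failure times [h] at several applied stress levels (illustrative only)
t_exp = np.array([40.0, 31.0, 24.0, 18.0, 13.0])
t_sim = np.array([38.5, 32.0, 23.0, 18.5, 14.0])
print(f"r2 = {r_squared(t_exp, t_sim):.3f}")
```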
3. results
3.1. stenting procedure and simulation
the stenting simulation, consisting of four steps, is shown in fig. 4, which shows the variation of the outer diameter of stent st.19, in mm, vs. the normalized time unit (ntu). as the first two steps take place in a very short time period, the time scale has been elongated to capture the peculiarities of these two steps. in the first step (expansion), after the stent is deployed to the desired point of the artery (which is partially clogged), it is expanded to an outer diameter of 3 mm by enforcing a radial displacement-driven process on the rigid cylindrical balloon. in the second step (elastic recoil), the cylindrical balloon is contracted elastically in the radial direction. this allows the expanded stent and artery to recoil (yet retain a larger diameter than the original one). in the third step (degradation process), the biodegradable stent undergoes corrosion, while still acting as a temporary scaffolding for the opened vessel. during this phase, the arterial wall exerts compression on the stent. finally, in the last step (collapse), the stent collapses due to the excessive degradation. fig. 5 shows (a) the stress distribution of the stent after the expansion, (b) at the end of the elastic recoil, and (c) at three time points in the process of the stent degradation; the last state corresponds to the collapsed stent. the four-step stenting procedure depicted in fig. 4 should be simulated by any realistic numerical algorithm. however, the expansion and elastic recoil steps occur in a very short time as compared to the degradation time, as can be seen in fig. 4, and therefore the corrosive attack during these two steps is negligible and thus ignored.

figure 4. four steps of the stenting procedure.
figure 5. von mises stress contours at different stages of stenting: (a) at the end of the expansion stage; (b) at the end of the recoil stage; (c) at the degradation stages corresponding to t∗ = 0.4, 0.6 and 0.9.

3.2. significant parameters to be studied
one of the essential issues in clinical trials regarding mas is their high restenosis due to the stent's early recoil. besides this, the early and fast degradation of magnesium alloys weakens the radial stiffness of the stent and could be the primary cause of the premature loss of the stent scaffolding and recoil. hence, the radial recoil and stiffness of the stent are the main mechanical characteristics of biodegradable stents, and any attempt to improve these properties reduces the risk of restenosis [1]. as a result, the recoil and radial stiffness of the stent, as defined in this section, are the key features considered in the present study as criteria for a subjective optimal design of the stent (a short numerical sketch of the three parameters defined below is given at the end of this subsection).

stent recoil percent (srp). this is defined as the relative difference between the stent diameter at the beginning of the corrosion process (step 2 of section 3.1), d_st^step2, and the stent diameter during the recoil, as the corrosion proceeds (step 3 of section 3.1), d_st^recoil, expressed as a percentage, see eq. (5):

$$\mathrm{srp}(t) = \frac{d_{st}^{\,step2} - d_{st}^{\,recoil}(t)}{d_{st}^{\,step2}} \times 100 \quad [\%] \qquad (5)$$

stent radial stiffness (srs). this is defined as the ratio of the average contact normal force exerted on the outer surface of the stent by the plaque, f_c, to the external radius of the stent, r, as the corrosion proceeds, see eq. (6):

$$\mathrm{srs}(t) = \frac{f_c(t)}{r(t)} \quad [\mathrm{n/mm}] \qquad (6)$$

mass loss (ml). in addition to the two previously defined parameters, a third one, the mass loss, is considered in the optimum design. the mass loss percent of the whole stent model at any time is calculated as the damage parameter d averaged over all of the elements, times 100, see eq. (7):

$$\mathrm{ml}(t) = \left( \frac{1}{n_{elements}} \sum_{i=1}^{n_{elements}} d_i(t) \right) \times 100 \quad [\%] \qquad (7)$$
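a minimal sketch of eqs. (5)-(7), assuming the diameter, contact force, radius, and per-element damage histories are available as arrays (the inputs below are placeholders, not simulation output):

```python
import numpy as np

def srp(d_step2: float, d_recoil: np.ndarray) -> np.ndarray:
    """stent recoil percent, eq. (5)."""
    return (d_step2 - d_recoil) / d_step2 * 100.0

def srs(f_contact: np.ndarray, radius: np.ndarray) -> np.ndarray:
    """stent radial stiffness [n/mm], eq. (6)."""
    return f_contact / radius

def mass_loss(damage: np.ndarray) -> np.ndarray:
    """mass loss percent, eq. (7); damage has shape (n_times, n_elements)."""
    return damage.mean(axis=1) * 100.0

# illustrative time histories (placeholders)
d_recoil = np.array([2.95, 2.90, 2.80, 2.65])   # outer diameter [mm]
f_c = np.array([0.20, 0.18, 0.14, 0.08])        # average contact force [n]
r = d_recoil / 2.0                              # external radius [mm]
dmg = np.linspace(0.0, 0.25, 4)[:, None] * np.ones((4, 100))

print(srp(3.0, d_recoil), srs(f_c, r), mass_loss(dmg))
```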
3.3. stent application
in this study, the 25 stents of different cross-sectional dimensions, as categorized in five clusters of fixed width and variable thickness (table 1), were modelled in the prepared computer environment. fig. 6 shows the variation of srp (in percent) with the normalized time unit, ntu, t∗ for clusters 1 to 5, respectively. it is to be noted that the ntu of fig. 6 is measured from the end of phase 2 of fig. 4. inter-vascular ultra-sound (ivus) imaging of the absorbable magnesium stent (ams) in human coronaries has shown that recoil is a significant contributor to restenosis. this imaging has also shown that the average value of the recoil at the time of a complete stent degradation is about 27 % [21, 22]. because of this, a horizontal line corresponding to an srp of 27 % is drawn in fig. 6. the stents, however, lose their scaffolding ability at approximately a normalized time unit, t∗, of 0.95, beyond which the stent is degraded completely. therefore, a vertical line corresponding to t∗ of 0.95 is also drawn in this figure.

figure 6. recoil of the stents of the five clusters as the degradation proceeds: (a) cluster 1; (b) cluster 2; (c) cluster 3; (d) cluster 4; (e) cluster 5 (all w's and t's are in mm).

4. discussion
in this research, 25 stents having different cross-sectional dimensions were subjected to heterogeneous corrosion and their performance with regard to radial recoil, stiffness, and mass loss was monitored. the two-tier process of determining the optimum stent design by using these simulation results is described below. in this regard, any stent showing an srp greater than 27 % prior to t∗ of 0.95 was excluded from the pool of admissible stents. with reference to figs. 6a, 6b and 6c, it is concluded that all stents of clusters 1 to 3 have to be removed from the pool. but stents st.19 and st.20 of cluster 4, fig. 6d, showed an srp lower than 27 % at t∗ of 0.95. these are two of the eligible candidate stents. from fig. 6e, it is concluded that stents st.24 and st.25 of cluster 5 also show the same feature. the recoil percent histories of the four stents, st.19, st.20, st.24 and st.25, are shown in fig. 7a. these cross the vertical line corresponding to t∗ of 0.95 at srp values of 22.555, 23.736, 24.808, and 24.868 %, respectively, showing that all these stents retain an adequate diameter at the time of the stent collapse due to the degradation. from t∗ = 0 to approximately 0.75, the four stents show almost the same recoil behaviour, but from t∗ ≈ 0.75 onward, st.19 shows a smaller recoil percentage. no definite conclusion with regard to the optimum stent cross-section can be drawn just by considering the srp. as stated in section 3.2, the radial stiffness is another important characteristic of mas, which directly controls the scaffolding ability of the stent. nevertheless, no criterion or specified values regarding the radial stiffness have been reported in the literature. as both the recoil and the radial stiffness are essential properties of the stent if it is to function properly, the variation with the normalized time unit of the radial stiffness of just those stents complying with the recoil requirement, i.e., st.19, st.20, st.24, and st.25, is shown in fig. 7b. in this figure, the vertical line corresponding to t∗ of 0.95 is drawn; this is the time the stent loses its scaffolding ability. to make drawing conclusions easy, the srp and srs values of the four stents at t∗ of 0.95 are tabulated in table 3, along with the mass loss percent.

table 3. comparison of the recoil percent and radial stiffness of st.19, st.20, st.24 and st.25.
column   (1) st #   (2) srp at t∗=0.95 [%]   (3) recoil ratio   (4) srs at t∗=0.95 [n/mm]   (5) stiffness ratio vs st.19   (6) stiffness ratio vs st.24   (7) mass loss [%]
         st.19      22.555                   1                  0.053                       1                              0.6625                         25.438
         st.20      23.736                   1.052              0.041                       0.771                          0.5125                         24.986
         st.24      24.808                   1.099              0.080                       1.485                          1                              23.281
         st.25      24.868                   1.102              0.061                       1.134                          0.7625                         22.912

column 2 of this table shows that st.19 has the lowest recoil, while column 4 shows that st.24 has the largest stiffness. in column 3, the recoil ratio relative to st.19 (which has the smallest recoil percentage) is shown; in columns 5 and 6, the radial stiffness ratios relative to st.19 and st.24, respectively, are shown. the values in columns 3 and 6 indicate that st.24 has an almost 10 % higher recoil as compared to st.19, while st.19 has an almost 34 % lower stiffness as compared to st.24.
this shows that the optimum stent design would be st.24, even though it requires almost 14 % more material for its production as compared to st.19. besides the recoil and stiffness of the stent, the variation with time of the percentage mass loss of the stents due to the degradation (as defined in section 3.2) was monitored, and the results for st.19, st.20, st.24, and st.25 are shown in fig. 7c. the percentage loss corresponding to the ntu of 0.95 for these four stents is listed in column 7 of table 3. the 8.5 % smaller mass loss percentage of st.24 as compared to st.19 is another indication of the superiority of st.24 over st.19.

figure 7. variation of the srp, srs and ml of stents st.19, st.20, st.24, and st.25 with time: (a) variation of srp; (b) variation of srs; (c) variation of ml (all w's and t's are in mm).

erbel et al. [1] have shown, through a clinical trial of coronary implantations of absorbable magnesium stents, that the mean and standard deviation of the in-stent diameter stenosis percentage would be 48.37 and 17.00, respectively. considering these numbers, it can be concluded that the 27 % recoil percentage reported by waksman [21] and waksman et al. [22], and used by the authors to this point for finding the optimum profile for the stents, ensures an almost 90 % proper functionality during the required scaffolding period. to study the effect of a higher probability of stenosis, say 15.87 %, a recoil percent of 48.37 − 17 = 31.37 % is used instead of 27 % in screening the stents. the result is shown in fig. 8a, with the stent recoil percent (srp) as the ordinate and the normalized time unit (ntu) as the abscissa. five more stents than the four previously complying with the criterion of the 27 % recoil percentage are included in the batch; overall, nine stents have a recoil percentage lower than 31.37 at t∗ of 0.95.

figure 8. variation of srp and srs: (a) variation of srp of the stents having an srp smaller than 31.37 %; (b) variation of srs of stents st.14, st.15, st.19, st.20, st.24, and st.25 (all w's and t's are in mm).

to find the optimum profile design, the variation of the stent radial stiffness (srs) versus time is shown in fig. 8b for six of the nine stents passing the recoil criterion. it is clearly evident that st.24 has a higher stiffness than the others, and therefore would be the optimum choice. this indicates that the selection of the recoil percentage does not alter the optimum design, and st.24, with a 0.110 mm thickness and a 0.161 mm width, would be the optimum choice, with an almost 90 % probability of surpassing the recoil criteria and having an adequate stiffness, while not having the largest cross-section.
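the two-tier screening described in this section reduces to a simple selection rule; a minimal sketch, assuming the srp and srs values at t∗ = 0.95 have already been extracted (here the four values of table 3):

```python
# two-tier subjective selection: (1) keep the stents whose recoil at t* = 0.95
# stays below the threshold, (2) among those, pick the highest radial stiffness.
candidates = {  # name: (srp at t*=0.95 [%], srs at t*=0.95 [n/mm]) -- from table 3
    "st.19": (22.555, 0.053),
    "st.20": (23.736, 0.041),
    "st.24": (24.808, 0.080),
    "st.25": (24.868, 0.061),
}

def select_optimum(stents: dict, srp_limit: float = 27.0) -> str:
    admissible = {k: v for k, v in stents.items() if v[0] < srp_limit}
    return max(admissible, key=lambda k: admissible[k][1])

print(select_optimum(candidates))  # -> st.24
```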
5. conclusion
in this study, the recoil and radial stiffness of the stents were considered as the criteria for a subjective optimal design of stents. the results of the analysis of the 25 stents showed that the optimum choice, st.24, with a 0.110 mm thickness and a 0.161 mm width, had an at least 24 % higher stiffness than the others and passed the recoil criteria with an almost 90 % probability of proper functionality. although st.24 requires almost 14 % more material for its production as compared to st.19, it has an 8.5 % and a 6.8 % smaller mass loss percent as compared to st.19 and st.20, respectively, and thus it would be the final choice, even though it does not have the largest cross-section within the cluster set selected by the lhcs. it is noteworthy that the optimum choice has a thinner cross-sectional profile than the commercially available magnesium alloy stents, and this makes the stent delivery much easier and lowers the risk of restenosis as compared to the commercially available thicker mas stents. to the best of the authors' knowledge, this is the first attempt at subjectively determining (without using optimization codes and extensive computational effort) the optimal cross-section of a biodegradable magnesium alloy stent undergoing heterogeneous corrosion. it should also be noted that the method set forth in the present research and applied to a sine-wave magnesium alloy (az31) stent could, as well, be applied to any stent with a different configuration and material, provided its mechanical properties and corrosion parameters are known.

references
[1] r. erbel, c. di mario, j. bartunek, et al. temporary scaffolding of coronary arteries with bioabsorbable magnesium stents: a prospective, non-randomised multicentre trial. the lancet 369(9576):1869–1875, 2007. https://doi.org/10.1016/s0145-4145(08)04009-4.
[2] m. haude, r. erbel, p. erne, et al. safety and performance of the drug-eluting absorbable metal scaffold (dreams) in patients with de-novo coronary lesions: 12 month results of the prospective, multicentre, first-in-man biosolve-i trial. the lancet 381(9869):836–844, 2013. https://doi.org/10.1016/s0140-6736(12)61765-6.
[3] b. d. gogas. bioresorbable scaffolds for percutaneous coronary interventions. global cardiology science and practice 2014(4), 2015. https://doi.org/10.5339/gcsp.2014.55.
[4] p. e. mchugh, j. a. grogan, c. conway, e. boland. computational modeling for analysis and design of metallic biodegradable stents. journal of medical devices 9(3):030946, 2015. https://doi.org/10.1115/1.4030576.
[5] t. hu, c. yang, s. lin, et al. biodegradable stents for coronary artery disease treatment: recent advances and future perspectives. materials science and engineering: c 91:163–178, 2018. https://doi.org/10.1016/j.msec.2018.04.100.
[6] d. gastaldi, v. sassi, l. petrini, et al. continuum damage model for bioresorbable magnesium alloy devices — application to coronary stents. journal of the mechanical behavior of biomedical materials 4(3):352–365, 2011. https://doi.org/10.1016/j.jmbbm.2010.11.003.
[7] w. wu, s. chen, d. gastaldi, et al. experimental data confirm numerical modeling of the degradation process of magnesium alloys stents. acta biomaterialia 9(10):8730–8739, 2013. https://doi.org/10.1016/j.actbio.2012.10.035.
[8] w. wu, d. gastaldi, k. yang, et al. finite element analyses for design evaluation of biodegradable magnesium alloy stents in arterial vessels. materials science and engineering: b 176(20):1733–1740, 2011. https://doi.org/10.1016/j.mseb.2011.03.013.
[9] j. a. grogan, b. j. o'brien, s. b. leen, et al. a corrosion model for bioabsorbable metallic stents. acta biomaterialia 7(9):3523–3533, 2011. https://doi.org/10.1016/j.actbio.2011.05.032.
[10] w. wu, l. petrini, d. gastaldi, et al. finite element shape optimization for biodegradable magnesium alloy stents. annals of biomedical engineering 38(9):2829–2840, 2010. https://doi.org/10.1007/s10439-010-0057-8.
[11] j. a. grogan, s. b. leen, p. e. mchugh. optimizing the design of a bioabsorbable metal stent using computer simulation methods. biomaterials 34(33):8049–8060, 2013. https://doi.org/10.1016/j.biomaterials.2013.07.010.
[12] c. chen, j. chen, w. wu, et al. in vivo and in vitro evaluation of a biodegradable magnesium vascular stent designed by shape optimization strategy. biomaterials 221:119414, 2019. https://doi.org/10.1016/j.biomaterials.2019.119414.
[13] x. ni, y. zhang, c. pan. the degradable performance of bile-duct stent based on a continuum damage model: a finite element analysis. international journal for numerical methods in biomedical engineering 36:e3370, 2020. https://doi.org/10.1002/cnm.3370.
[14] l. petrini, f. migliavacca, f. auricchio, et al. numerical investigation of the intravascular coronary stent flexibility. journal of biomechanics 37(4):495–501, 2004. https://doi.org/10.1016/j.jbiomech.2003.09.002.
[15] s. pant, n. w. bressloff, g. limbert. geometry parameterization and multidisciplinary constrained optimization of coronary stents. biomechanics and modeling in mechanobiology 11(1-2):61–82, 2012. https://doi.org/10.1007/s10237-011-0293-3.
[16] h. li, j. gu, m. wang, et al. multi-objective optimization of coronary stent using kriging surrogate model. biomedical engineering online 15:148, 2016. https://doi.org/10.1186/s12938-016-0268-9.
[17] c. mccormick. overview of cardiovascular stent designs. in functionalised cardiovascular stents, pp. 3–26. woodhead publishing, 2018. https://doi.org/10.1016/b978-0-08-100496-8.00001-9.
[18] m. de beule, s. van cauter, p. mortier, et al. virtual optimization of self-expandable braided wire stents. medical engineering & physics 31(4):448–453, 2009. https://doi.org/10.1016/j.medengphy.2008.11.008.
[19] m. de beule, p. mortier, s. g. carlier, et al. realistic finite element-based stent design: the impact of balloon folding. journal of biomechanics 41(2):383–389, 2008. https://doi.org/10.1016/j.jbiomech.2007.08.014.
[20] f. gervaso, c. capelli, l. petrini, et al. on the effects of different strategies in modelling balloon-expandable stenting by means of finite element method. journal of biomechanics 41(6):1206–1212, 2008. https://doi.org/10.1016/j.jbiomech.2008.01.027.
[21] r. waksman. current state of the absorbable metallic (magnesium) stent. eurointervention: journal of europcr in collaboration with the working group on interventional cardiology of the european society of cardiology 5:f94–f97, 2009. https://doi.org/10.4244/eijv5ifa16.
[22] r. waksman, r. erbel, c. di mario, et al. early- and long-term intravascular ultrasound and angiographic findings after bioabsorbable magnesium stent implantation in human coronary arteries. jacc: cardiovascular interventions 2(4):312–320, 2009. https://doi.org/10.1016/j.jcin.2008.09.015.
acta polytechnica 56(2):112–117, 2016. doi:10.14311/ap.2016.56.0112
tests on 10.9 bolts under combined tension and shear
anne k. kawohl∗, jörg lange
institute for steel structures and materials mechanics, technische universität darmstadt, darmstadt, germany
∗ corresponding author: kawohl@stahlbau.tu-darmstadt.de
abstract. prior investigations of the load-bearing capacity of bolts during fire have shown differing behaviour between bolts that have been loaded by shear or by tensile loads. a combination of the two loads has not yet been examined under fire conditions. this paper describes a series of tests on high-strength bolts of property class 10.9 both during and after fire under a combined shear and tensile load.
keywords: structural fire design; steel structures; material behaviour; high-strength bolts; experimental study; connections.
1. introduction
in steel structures, connections are essential for the stability of the entire structure. they not only join one load-bearing member to another, but they also transfer the load and influence the internal forces through their rigidity. the failure of a connection can lead not only to the failure of individual connected members but also, for example, to a change in the buckling length within the structure and, as a consequence, to its collapse. in the load case of fire, connections are not only additionally strained by the thermal exposure, but the thermal exposure of the connected members also leads to a change in the strain within the connection over the duration of the fire. at the beginning of a fire, while the temperature is still rising, the thermal expansion of the connected members leads to compression in the connection, which is commonly designed to carry shear and moment forces at ambient temperatures. in a later stage of the fire, at high temperatures, the connected steel members have lost most of their resistance, which again leads to massive deflections of the members and, in consequence, to large tensile and shear forces in the connection.
after the fire, during the cooling phase, as the thermal expansion of the bearing members is reversed, the connection is strained by large tensile forces. in the case of a fire load, the ductility of the connections is therefore essential. in recent years, the behaviour of joints in fire has been the focus of numerous investigations. al-jabri et al. [1] and burgess et al. [2] give a good overview of the research in this field. all of these investigations have in common that they examine the connections as a whole. within a connection, there are many effects that come together and influence the load-bearing and deflection behaviour at elevated temperatures. to understand these different effects, it is important to have a thorough knowledge of the load-bearing behaviour of the individual elements of the connection, for example the bolts that are used. appendix d of eurocode 3 part 1–2 [3] states reduction factors for the strength of bolts at elevated temperatures. these reduction factors are independent of the property class of the bolts. as high-strength bolts of property classes 8.8 and 10.9 (fu = 800 n/mm² or fu = 1000 n/mm²) obtain their enhanced strength through different heat treatments, the assumption that these bolts behave at elevated temperatures in the same way as bolts of property classes 4.6 and 5.6 (fu = 400 n/mm² or fu = 500 n/mm²) must be questioned. furthermore, ec 3 [3] states nothing about the deflection behaviour of bolts at elevated temperatures. there have been fewer studies on the load-bearing behaviour of bolts at elevated temperatures than on complete connections, but the number of studies has increased in recent years. the tests conducted on bolts so far have focused either on pure tension or on pure shear. studies that have included tests both under pure tension and under pure shear show deviating reductions of the load-bearing capacity of the bolts with respect to the temperature, depending on whether the bolts were loaded by tension or by shear. the above-mentioned reduction factors for the bolt strength at elevated temperatures are based on an extensive series of tests by kirby [4] on bolts of property class 8.8. kirby conducted steady-state tests on bolt sets under pure tension and under pure shear. as stated in the eccs model code on fire engineering [5], the reduction factors stated in ec 3 [3] are based on the results of the pure tension tests by kirby, as his results lead to more conservative values. there was less strength loss in the pure shear tests.

figure 1. example of a connection in which the bolts are loaded by a combination of tension and shear.

another study that includes both tension tests and shear tests at elevated temperatures is by kodur et al. [6] on a325 and a490 bolts (fu = 830 n/mm² and fu = 1030 n/mm²). the test results also show a deviating strength reduction depending on the strain on the bolts and the temperature. the absolute value of the shear strength for a325 bolts lies above the absolute value of the tensile strength in the temperature range between 450 °c and 550 °c. for a490 bolts, the absolute shear strength values are above the absolute tensile strength, beginning at a temperature of approximately 550 °c. the effect of a different strength reduction depending on the strain was also observed by gonzález [7] in his tests on bolts of property class 10.9.
for the bolts tested under pure shear, the reduction factors given by ec 3 overestimated the load-bearing capacity only in the temperature range between 400 °c and 600 °c. for the bolts tested under pure tension, on the other hand, the factors from ec 3 overestimate the strength of the bolts from temperatures of 450 °c onward. as described above, in the event of fire the bolts in a connection are loaded by a combination of tension and shear. however, there are also connections in which the bolts are by design already loaded by interacting tensile and shear stresses at ambient temperatures. an example is shown in figure 1. a closer examination of the load-bearing behaviour of high-strength bolts under a combination of tension and shear at elevated temperatures is therefore of interest. a preliminary series of tests by the authors [8] investigated the post-fire performance of bolts of property class 10.9 under various combinations of tension and shear. it confirms the assumption that at least the post-fire performance depends on the type of strain. a more comprehensive series of tests of the load-bearing behaviour of these bolts under combined tension and shear after fire and also during fire is presented in this paper.

2. test set-up
in the above-mentioned preliminary test series [8], test rigs were used that had been designed for a very extensive series of tests on threaded and shank bolts under combined tension and shear at ambient temperature. figure 2 shows the test rig assembled for an angle of 45°. renner [9] designed a total of three different test rigs with which, depending on the assembly, each bolt can be stressed under two different angles or shear-to-tension ratios. the strain is applied by pulling on each of the two parts of the test rig.

figure 2. test rig for angles 0° and 45°, assembled for a test with an angle of 45°.

the test rigs designed by renner [9] are not practical for the tests described in this paper, due to the boundary conditions of the furnace that is required for the fire tests. however, the basic idea of applying the load in only one direction was to be retained, while straining the bolt both by tension (in the bolt axis) and by shear (perpendicular to the axis of the bolt) through a suitable test rig. new test rigs based on this idea were designed, while at the same time taking into account that the furnace used for the tests only allows compression to be applied. the design combines the principles of a compact test rig by godley et al. [10], where tension tests on bolts were performed by applying compression to the test rig, with the use of different angles as in the test rigs used by renner [9]. figure 3 shows the newly-designed test rigs for angles of 0° (left), 30° (centre) and 45° (right). the bolt is stressed by applying compression on the bottom and top plates of each test rig, causing tension and – for angles of 30° and 45° – also shear in the bolt. the test rigs are made of nimonic® 80a, a nickel-based high-temperature alloy, to ensure that only the bolt that is being tested receives deflection and ultimately fails. the tested bolts are m20 zinc-coated shaft bolts of property class 10.9. the test layouts for the tests at elevated temperatures and for the post-fire performance are described in greater detail below.

2.1. tests at elevated temperatures
the tests are conducted as steady-state tests, with the temperature rising until the exposed bolt reaches the specified temperature.
the temperature is then kept constant, and after a stabilising time the bolt is loaded until failure. a 4-zone electric furnace with a maximum temperature of 1000 °c is used for the fire tests. the furnace is fitted with a 3 mn compression machine. type k thermocouples are installed to monitor the temperature on the surface of the test rig and also the temperature of the bolt. in order to ensure that the specified temperature is reached throughout the entire bolt, the surface temperature of the bolt is measured in the shear plane, where the surrounding mass of the test rig is highest. the furnace has a heating rate of approximately 10 k/min. during the heating phase, the temperature is monitored carefully to ensure that the bolt is uniformly heated. after the stabilising time at the specified temperature, the load is applied in a displacement-controlled manner at a rate of 1.5 mm/min. since experimental studies in fire are both time-consuming and expensive, only two temperatures are specified: approximately 500 °c and 700 °c. as the temperature of the test rig, and therefore also the temperature of the bolt, is not the same as the air temperature in the furnace, it is very difficult to meet the specified temperature exactly; it can only be reached approximately. the initial plan was to conduct three tests for every combination of temperature and angle. however, after the first three tests (angle 45°, temperature 500 °c) showed a very good compliance, it was decided to conduct only two tests for each combination.

figure 3. test rigs for an angle of 0° (left), 30° (centre) and 45° (right).
figure 4. load-deflection diagram of the fire tests with an angle of 30° and 45°.

2.2. post-fire performance tests
the same test rigs are used for the tests on the post-fire performance of the bolts as for the tests at elevated temperatures. as these tests are less time-consuming and less expensive, a larger number of temperatures is tested. the bolts are heated to 500 °c, 600 °c, 700 °c, 800 °c and 900 °c without any additional mechanical loading, and are then cooled slowly to ambient temperature. once the bolts have reached the target temperature, the temperature is kept constant for 30 min to ensure that the temperature is reached throughout the bolt section. a compression machine with a maximum load of 1 mn is used for these tests. they are carried out in a displacement-controlled manner at a rate of 1.5 mm/min, and the bolts are loaded until failure. five tests are done for each combination of angle and temperature. in the planning phase of the tests, a large scatter of the results was assumed, but this has not been confirmed thus far. in addition, five bolts from the same batch are tested without further heat treatment, as references.

figure 5. results of the fire tests in reference to the reduction factors from ec 3 [3], kirby [4] and lange and gonzález [11] and also the test results by gonzález [7].

3. results
the tests have not yet been completed. up to now, the tests for the angles of 30° and 45° have been carried out. the results of these fire tests and the post-fire performance tests will be presented in the following.

3.1. results of the fire tests
in the tests at approximately 500 °c, the bolts tested at an angle of 45° and those tested at 30° show widely differing load-deflection behaviour, see figure 4.
the bolts tested at an angle of 45° show quite a large deflection after reaching the maximum load, while at the same time the load decreases. the bolts tested at an angle of 30°, on the other hand, fail quite suddenly after reaching the maximum load. the temperature in the bolts tested at an angle of 30° was about 20 °c higher than in the bolts tested at an angle of 45°. disregarding this, the absolute value of the maximum load would have been higher at an angle of 30° than at an angle of 45°, as the amount of tension is greater at this angle. however, the differing temperature in the bolts is not the reason for the differing failure behaviour. the temperature of approximately 500 °c is within the critical temperature range of lme (liquid metal induced embrittlement), where the liquid zinc flows into the micro-cracks along the grain boundaries due to the tensile stresses and consequently leads to a failure of the microstructure. the failure mode of the bolts seemed to indicate a failure due to lme. the bolts tested at both angles at ambient temperature fail in the shear plane. in the fire tests, both bolts tested at an angle of 30° failed in the thread, and of the bolts tested at an angle of 45°, only one bolt failed in the shear plane, although failure in the shear plane was imminent. the thread is especially prone to lme, as the thread acts more or less as a series of notches. the failure surfaces were investigated with an edx analysis to verify this assumption. for both angles, zinc was found within the fracture surface. at an angle of 30°, the amount of zinc was much higher and was also detected further into the cross-section than for an angle of 45°. the other investigated temperature of approximately 700 °c is outside the critical temperature range for an lme failure. as predicted, all bolts failed within the shear plane but showed a large necking of the cross-section. the maximum load reached in these tests is only about one fifth of the load in the tests at 500 °c, see figure 4. however, the bolts show a very large deflection. the elongation of the bolts is about a third of the original bolt length. figure 5 shows the results of the current tests in comparison with the reduction factors for the bolt strength at elevated temperatures given by eurocode 3 part 1–2, appendix d [3] for all property classes, the reduction factors given by kirby [4] from his tests on 8.8 bolts, and the reduction factors given by lange and gonzález [11] for bolts of property class 10.9. the figure also shows gonzález's test results on pure tension and pure shear tests applied to 10.9 bolts [7]. all test results lie below the reduction factor given by appendix d of eurocode 3 part 1–2. gonzález has shown that the reduction factor stated in eurocode 3 is not applicable for bolts of property class 10.9, as the strength reduction at temperatures above 450 °c is much higher than for bolts of property class 8.8. the tests conducted at angles of 30° and 45° lead to tension and shear in the bolts. the failure loads of the tests, therefore, lie, as predicted, between the test results of gonzález under pure tension and pure shear. at an angle of 45°, the amount of shear stress in the bolt is higher, which in turn has a positive effect on the ultimate strength of the bolts.
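for orientation, the decomposition of the rig load into the bolt's tension and shear components at a test angle, together with the ambient-temperature interaction check of eurocode 3 part 1-8 (fv,ed/fv,rd + ft,ed/(1.4·ft,rd) ≤ 1.0), can be sketched as below. this is a simplified illustration, not the authors' evaluation procedure; the applied load and design resistances are placeholder assumptions.

```python
import math

def components(force_kn: float, angle_deg: float) -> tuple[float, float]:
    """tension (along the bolt axis) and shear (perpendicular) parts of the load.

    angle_deg = 0 gives pure tension; larger angles raise the shear share.
    """
    a = math.radians(angle_deg)
    return force_kn * math.cos(a), force_kn * math.sin(a)

def ec3_interaction(ft_ed: float, fv_ed: float, ft_rd: float, fv_rd: float) -> float:
    """combined tension/shear utilization per en 1993-1-8, table 3.4:
    fv_ed / fv_rd + ft_ed / (1.4 * ft_rd) <= 1.0"""
    return fv_ed / fv_rd + ft_ed / (1.4 * ft_rd)

# placeholder ambient-temperature design resistances for an m20 bolt of class
# 10.9 (shear plane in the shank); assumed values for illustration only
ft_rd, fv_rd = 176.4, 150.8  # [kn]

for angle in (0.0, 30.0, 45.0):
    ft_ed, fv_ed = components(150.0, angle)  # 150 kn applied load (assumed)
    u = ec3_interaction(ft_ed, fv_ed, ft_rd, fv_rd)
    print(f"angle {angle:4.1f}: ft = {ft_ed:6.1f} kn, fv = {fv_ed:6.1f} kn, u = {u:.2f}")
```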
3.2. results of the post-fire performance tests
the investigation of the post-fire performance of high-strength bolts is quite sensitive, as these bolts obtain their enhanced strength through a carefully controlled heat treatment. the uncontrolled heating and cooling in the event of fire can consequently lead to a complete change in the material properties. in his dissertation, gonzález [7] states two reduction factors for evaluating the post-fire strength of bolts of property class 10.9. the minimum reduction factor kred,min is based on tension tests carried out on specimens and bolts that were heated without an additional mechanical load. the maximum reduction factor kred,max is based on tension tests where the specimens were loaded by both thermal and mechanical loading. the post-fire reduction of the strength of a 10.9 bolt should usually lie within the following bounds (see the shaded area in figure 6):

$$k_{red,min} = \begin{cases} 1.0 & 20\,°\mathrm{c} \le t \le 500\,°\mathrm{c}, \\ -1.434 \cdot 10^{-3}\, t + 1.717 & 500\,°\mathrm{c} \le t \le 800\,°\mathrm{c}, \end{cases}$$

$$k_{red,max} = \begin{cases} 1.0 & 20\,°\mathrm{c} \le t \le 450\,°\mathrm{c}, \\ -2.0 \cdot 10^{-3}\, t + 1.9 & 450\,°\mathrm{c} \le t \le 800\,°\mathrm{c}. \end{cases}$$

figure 6. results of the post-fire tests in comparison with the post-fire reduction factors by gonzález [7].

both reduction factors are based on pure tension tests of specimens and bolts. in the current series of tests, the tested bolts were heated to each specified temperature without an additional mechanical loading. thus, the values lie nearer to the reduction factor kred,min, see figure 6. as with the tests during fire, the higher ratio of shear at an angle of 45° has a positive influence on the load-bearing capacity of the bolts. the above-stated reduction factors are also applicable for the tested batch of 10.9 bolts and for a combination of tension and shear. in addition to the tests on whole bolt sets, specimens milled from the bolts were tested under pure tension to obtain the stress-strain relations. the unaltered bolts (20 °c) and the bolts heated to 500 °c show a very close compliance with each other. up to 600 °c, the stress-strain relations show no distinct yield plateau, which is typical for quenched and tempered steels. the specimens taken from the bolts heated to 900 °c have a lower proportional limit than the bolts heated to 800 °c, but the ultimate stress value is higher. micrographs show a change to a coarser microstructure in the bolts heated to 900 °c.

4. conclusions
the tests presented in this paper are intended to give a better understanding of the load-bearing behaviour of high-strength bolts of property class 10.9 under a combination of tension and shear during and after fire. although the test programme has not yet been completed, initial conclusions can be drawn. the load-bearing capacity of bolts of property class 10.9 at elevated temperatures degrades more than the load-bearing capacity of bolts of property class 8.8. however, the 10.9 bolts behave in a very ductile manner, even after reaching the maximum load and with quite a large rate of shear stress. this is a positive characteristic for the stability of a connection in a steel structure during fire. as was pointed out in the introduction, connections encounter great deformations over the course of a fire. the positive effect of a combination of tension and shear also holds for the post-fire performance of bolts of property class 10.9. the reduction factors stated by gonzález [7] can also be used for a combined stress. in order to draw final conclusions, further tests need to be conducted.
acknowledgements
the authors thank prof. dr. mario fontana, eth zürich, and his staff for their most helpful support during the fire tests.

references
[1] al-jabri, k.s., davison, j.b., burgess, i.w.: performance of beam-to-column joints in fire – a review, fire safety journal 43 (2008), p. 50–62. doi:10.1016/j.firesaf.2007.01.002
[2] burgess, i.w., davison, j.b., dong, g., huang, s.-s.: the role of connections in the response of steel frames to fire, structural engineering international 4 (2012), p. 449–461. doi:10.2749/101686612x13363929517811
[3] din en 1993-1-2: eurocode 3: design of steel structures, part 1-2: general rules – structural fire design, german version; en 1993-1-2:2005 + ac:2009
[4] kirby, b.r.: the behaviour of high-strength grade 8.8 bolts in fire, journal of constructional steel research, 33 (1995), p. 3–38. doi:10.1016/0143-974x(94)00013-8
[5] european convention for constructional steelwork / technical committee fire safety of steel structures eccs, model code on fire engineering. isbn: 92-9174-000-65
[6] kodur, v., kand, s., khaliq, w.: effect of temperature on thermal and mechanical properties of steel bolts, journal of materials in civil engineering, 24 (2012), p. 765–774. doi:10.1061/(asce)mt.1943-5533.0000445
[7] gonzález orta, f.: untersuchungen zum material- und tragverhalten von schrauben der festigkeitsklasse 10.9 während und nach einem brand, dissertation, technische universität darmstadt, 2011. isbn: 978-3-939195-24-5
[8] kawohl, a., renner, a., lange, j.: experimental study of post fire performance of high-strength bolts under combined tension and shear, 8th international conference on structures in fire, june 11-13 2014, shanghai, china, 2014. p. 89–96, isbn: 978-7-5608-5494-6
[9] renner, a., lange, j.: load-bearing behaviour of high-strength bolts in combined tension and shear, steel construction, 5 (2012), p. 151–157. doi:10.1002/stco.201290033
[10] godley, m.h.r., needham, f.h.: comparative tests on 8.8 and hsfg bolts in tension and shear, the structural engineer, 60 (1982), p. 94–99. udc: 624.078.0001.4
[11] lange, jörg; gonzález, fernando: behavior of high-strength bolts under fire conditions, structural engineering international, 4 (2012), p. 470–475. doi:10.2749/101686612x13363929517451

acta polytechnica vol. 44 no. 2/2004
speech signal recovery in communication networks
v. zagursky, a. riekstinsh
abstract: interpolation approaches to the shape recovery of a speech signal in transmission over packet-switched communication networks are proposed. the samples of signal fragments are mixed and transmitted in correspondence with the standard procedure for packet-switched transmission. after reception, a reverse permutation is made. in the case of packet losses, the missing samples are separated by several samples of the source signal. correlation properties of the signal are used for the recovery of samples by means of first- and second-order non-adaptive and adaptive interpolation. for the loss of 25 % of packets and second-order adaptive interpolation, a 2–4 % error distribution range has been achieved.

1 introduction
in packet speech transmission over a network, part of the information is lost. if we are to preserve an acceptable quality of speech, the permissible percentage of losses is limited. the authors of [1] and other papers admit a 1–3 % loss percentage. as a result, the average network load is limited. the authors of [2], [3] propose the use of a variable speech packet-encoding rate to enable smoothing of the effect of network overloads on the received speech quality. paper [4] proposes the classification of speech segments in accordance with their structure. packets belonging to different classes are assigned different priorities of delivery. when network overloads appear, packets with lower priority are discarded first.
at the receiving end, regeneration of lost packets is performed. the articles mentioned above deal mainly with information transfer based on the encoding of speech signal parameters. this paper deals with transfer based on waveform encoding using pulse-code modulation. section 2 considers the main principle of loss recovery in waveform encoding. section 3 analyses first-order interpolation methods. section 4 analyses the possibilities of second-order interpolation. section 5 discusses some experimental results on the recovery of phrases and suggests some future investigations.

2 basic interpolation principle

for channels with error bursts, the sequence is often subjected to interleaving (an alternation of samples) prior to transmission and is recovered at the receiving end. in this case the errors are distributed in a more uniform manner [5]. we will use this principle for packet speech transmission. the source sequence of the samples of a signal segment is memorized, and some permutation of the samples is made, which is followed by separation into packets, and then transfer. at the receiving end a reverse procedure is performed. if a packet is lost during transmission, the lost samples are separated by one or several samples of the source signal. such a procedure enables the recovery of losses as a result of correlation.

interpolation of samples was applied in [6]. a signal is divided into two groups of even and odd samples. each group is shaped into a packet; following the reverse permutation, in the case of the loss of one packet, one source sample appears between the missing samples. this allows the application of extrapolation and first-order interpolation. the "odd-even" alternation allows only the correlation of the neighbouring samples to be used for the recovery. using the information on a greater number of samples, the value of the lost samples can be recovered more accurately. to do this, the samples of the source signal have to be interchanged on a segment containing more than two packets. for example, using a block encoder, the sequence of samples is written into an n × m matrix column-wise and is read row-wise. having defined the length of row n as being equal to the packet length, after the reading we will obtain m packets. if one of the m packets is lost, then after the reverse permutation the lost samples will be separated by m − 1 samples of the source signal. in this case interpolation procedures ranging from the zero order up to the (m − 1)-th order may be applied for the recovery.

3 first-order interpolation

we will assume that not more than one packet is randomly lost from a sequence of signal samples consisting of m packets. the sequence $x_i$ ($i = 1, 2, \ldots$) of the centered signal $x(t)$ is transmitted. the reception and the reverse permutation follow. in the case of the loss of one packet, the received signal $\hat{x}_i$ differs from $x_i$ only at the points $t_i = t_j$ where the samples are missing; the probability of such a point is $P(t_i = t_j) = 1/m$. the transmission error is $\varepsilon_i = \hat{x}_i - x_i$. the mean square error of the transmission of a sequence consisting of m packets is

$E(\varepsilon^2) = E[(\hat{x}_i - x_i)^2]\,P(t_i = t_j) = E[(\hat{x}_i - x_i)^2]/m.$  (1)

if the estimate $\hat{x}_i$ of the values of the missing samples is equated to zero, then

$E(\varepsilon_0^2) = E(x_i^2)/m = r_{xx}(0)/m,$  (2)

where $r_{xx}(0)$ is the value of the correlation function for a zero shift. the values of $x_i$ can also be recovered on the basis of the first-order prediction $\hat{x}_i = \beta x_{i-1}$; it is usually assumed that the coefficient $\beta = 1$.
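a minimal sketch (ours, not the authors' code) of the block permutation described in section 2: samples are written column-wise into a matrix with m rows and read row-wise into m packets, so that after the loss of one packet and the reverse permutation each missing sample is surrounded by m − 1 received ones.

```python
def make_packets(samples, n, m):
    """write the n*m samples column-wise into an m-row x n-column matrix
    and read it row-wise: packet r = samples[r], samples[r+m], ..., so
    consecutive source samples end up in m different packets."""
    assert len(samples) == n * m
    return [[samples[r + c * m] for c in range(n)] for r in range(m)]

def restore(packets, n, m, lost=None):
    """reverse permutation; samples of a lost packet come back as None,
    each separated by m-1 correctly received samples."""
    out = [None] * (n * m)
    for r, packet in enumerate(packets):
        if r == lost:
            continue
        for c, value in enumerate(packet):
            out[r + c * m] = value
    return out

signal = list(range(12))                 # toy segment: n = 3, m = 4
packets = make_packets(signal, 3, 4)
print(restore(packets, 3, 4, lost=1))
# [0, None, 2, 3, 4, None, 6, 7, 8, None, 10, 11]
```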
then, in accordance with (1), the mean square error will be

$E(\varepsilon_{p1}^2) = E[(x_i - x_{i-1})^2]/m = 2\,[r_{xx}(0) - r_{xx}(1)]/m,$  (3)

where $r_{xx}(1)$ is the value of the correlation function of the signal $x(t)$ for a shift equal to $\Delta t$. for the first-order interpolation,

$\hat{x}_i = a_{-1} x_{i-1} + a_{+1} x_{i+1}.$  (4)

for the first-order interpolation it is usually assumed that $a_{-1} = a_{+1} = a$. in the most commonly used procedure $a = 0.5$ (linear interpolation). this procedure was referred to in [6] as non-adaptive, as was the prediction with $\beta = 1$. for $\hat{x}_i = 0.5\,(x_{i-1} + x_{i+1})$ the mean square error will be

$E(\varepsilon_{1.1}^2) = [1.5\,r_{xx}(0) - 2\,r_{xx}(1) + 0.5\,r_{xx}(2)]/m.$  (5)

let us compare expressions (3) and (5):

$E(\varepsilon_{p1}^2) - E(\varepsilon_{1.1}^2) = [r_{xx}(0) - r_{xx}(2)]/(2m).$  (6)

expression (6) shows that interpolation provides a smaller recovery error than prediction, since for any random signal $r_{xx}(2) < r_{xx}(0)$. the error can be minimized further by applying the adaptive approach, as was demonstrated in [6]. by calculating the error for interpolation with the use of (4) and determining the minimal error depending on the coefficients $a_{-1}$ and $a_{+1}$, we obtain

$a_{\mathrm{opt}} = a_{-1} = a_{+1} = \bar{r}_{xx}(1)/(1 + \bar{r}_{xx}(2)),$  (7)

where $\bar{r}_{xx}(n) = r_{xx}(n)/r_{xx}(0)$ is the correlation coefficient.

figure 1 shows the charts of the errors of recovering a signal which corresponds to the sounds a, d, c, sh. the vertical axis shows, in percent, the value of the normalized mean square error $\delta = E(\varepsilon^2)/r_{xx}(0)$. the horizontal axis displays the ordinal numbers of the appropriate signal segment with length n = 128 samples. a real speech signal was used in the experiments. a situation was simulated in which every fourth packet is lost (m = 4). when no recovery is applied, the error at the receiving end equals 25 %, since from (2) we have $\delta = 1/m$. it follows from the figure that for practically all signal segments $\delta_{1.2} < \delta_{1.1}$.

4 second-order interpolation

the estimate of the values of the lost samples $x_i$ using the known samples with numbers $i \pm 1$, $i \pm 2$ will be made using the expression

$\hat{x}_i = b_1 (x_{i-1} + x_{i+1}) + b_2 (x_{i-2} + x_{i+2}).$  (8)

the normalized mean square error of the second-order interpolation is determined in a way similar to (3):

$\delta_2 = [1 + 2b_1^2 + 2b_2^2 - 4b_1(1 - b_2)\,\bar{r}_{xx}(1) + 2(b_1^2 - 2b_2)\,\bar{r}_{xx}(2) + 4b_1 b_2\,\bar{r}_{xx}(3) + 2b_2^2\,\bar{r}_{xx}(4)]/m.$  (9)
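a sketch (ours; the helper names are hypothetical) of how the adaptive coefficients follow from the sample correlation coefficients: $a_{\mathrm{opt}}$ from (7), and the pair $(b_1, b_2)$ from the 2 × 2 linear system we obtain by setting the derivatives of (9) with respect to $b_1$ and $b_2$ to zero.

```python
def corr_coeffs(x, max_lag):
    """normalized correlation coefficients r(0..max_lag) of a centered segment."""
    r0 = sum(v * v for v in x)
    return [sum(x[i] * x[i + k] for i in range(len(x) - k)) / r0
            for k in range(max_lag + 1)]

def a_opt(r):
    """first-order adaptive coefficient, eq. (7): a = r(1) / (1 + r(2))."""
    return r[1] / (1.0 + r[2])

def b_opt(r):
    """second-order adaptive coefficients, from d(delta2)/db1 = d(delta2)/db2 = 0:
         b1*(1 + r(2)) + b2*(r(1) + r(3)) = r(1)
         b1*(r(1) + r(3)) + b2*(1 + r(4)) = r(2)
    solved by cramer's rule (our own derivation from (9)); note that the
    non-adaptive chebyshev pair b1 = 2/3, b2 = -1/6 substituted into (9)
    reproduces the numeric coefficients of (10) below."""
    a11, a12 = 1 + r[2], r[1] + r[3]
    a21, a22 = r[1] + r[3], 1 + r[4]
    det = a11 * a22 - a12 * a21
    return ((r[1] * a22 - r[2] * a12) / det,
            (r[2] * a11 - r[1] * a21) / det)

# example with illustrative correlation coefficients, not measured data:
r = [1.0, 0.9, 0.7, 0.5, 0.3]
print(a_opt(r))  # 0.529...
print(b_opt(r))  # (0.76, -0.28)
```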
fig. 1: errors of recovering a signal which corresponds to the sounds a, d, c, sh. 1 – linear interpolation ($\delta_{1.1}$); 2 – first-order adaptive interpolation ($\delta_{1.2}$); 3 – second-order chebyshev interpolation ($\delta_{2.1}$); 4 – second-order adaptive interpolation ($\delta_{2.2}$)

like for the first-order interpolation, the procedure for determining the coefficients $b_1$, $b_2$ may be non-adaptive or adaptive. for a non-adaptive procedure some known family of polynomials can be used, for example, the family of second-order chebyshev polynomials. it can be shown that $b_1 = 0.667$ and $b_2 = -0.167$. then (9) is converted to the following form:

$\delta_{2.1} = [1.94 - 3.11\,\bar{r}_{xx}(1) + 1.56\,\bar{r}_{xx}(2) - 0.44\,\bar{r}_{xx}(3) + 0.056\,\bar{r}_{xx}(4)]/m.$  (10)

let us compare (10) and (5) for the normalized error $\delta_{1.1} = E(\varepsilon_{1.1}^2)/r_{xx}(0)$:

$\delta_{1.1} - \delta_{2.1} = [-0.44 + 1.11\,\bar{r}_{xx}(1) - 1.06\,\bar{r}_{xx}(2) + 0.44\,\bar{r}_{xx}(3) - 0.056\,\bar{r}_{xx}(4)]/m.$  (11)

analysing (11), it is easy to see that the efficiency of the interpolation is determined by the type of correlation function of the signal. specifically, $\delta_{2.1} < \delta_{1.1}$ for the signals for which the correlation between the adjacent samples is high and afterwards rapidly decreases. this is true for many speech sounds [7]. however, for hushing sounds and fricatives, the value of $r_{xx}(1)$ may not be high; then it is possible that $\delta_{1.1} < \delta_{2.1}$. in selecting the coefficients in (8), the second-order adaptive interpolation allows us to take account of the effect of all values of the correlation function. the formulas for the optimum values of the coefficients $b_1$ and $b_2$ are obtained by equating the partial derivatives $\partial\delta_2/\partial b_1$ and $\partial\delta_2/\partial b_2$ to zero in (9). this is illustrated by fig. 1, which shows the experimental results of signal recovery for 25 % losses for the first- and second-order non-adaptive and adaptive interpolation. for the sounds "a" and "d", the second-order interpolation provides better results than the first-order interpolation. for the sounds "c" and "sh" this is true for adaptive interpolation, while with the use of the second-order chebyshev interpolation the recovery error increases.

5 experimental results and conclusions

experiments have been made on the recovery of losses in fused speech. signal samples are divided into packets of 128 readings each. the packets are combined into groups of 4 packets. within one group, permutations are made so as to ensure that all 4 packets are interrelated. each fourth packet is discarded, after which a reverse permutation and recovery are performed. use was made of first- and second-order interpolation, both non-adaptive and adaptive. table 1 shows the integral error estimates for all recovery procedures for different speech phrases up to 5 s in length. our investigations testify to the efficiency of approaches that involve waveform recovery. the first-order adaptive interpolation yielded results acceptable in terms of sound quality for the loss of 25 % of packets. the second-order adaptive interpolation yields better results in terms of both sound quality and root-mean-square error. the adaptive procedure requires additional processing of the signal at the transmitting end in order to calculate the correlation instants and coefficients, followed by transmission of the calculated information in the packet.
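a compact end-to-end sketch (ours, under the assumptions of section 5: one segment, m = 4, linear interpolation; the signal is a synthetic correlated sequence, not the authors' speech data):

```python
import random

def recover_segment(samples, m=4, a=0.5):
    """drop every m-th sample (one lost packet after deinterleaving) and
    fill the gaps by first-order interpolation x_i = a*(x_{i-1} + x_{i+1});
    returns the normalized error delta = e(eps^2)/r_xx(0)."""
    lost = random.randrange(m)
    received = [None if i % m == lost else v for i, v in enumerate(samples)]
    restored = list(received)
    for i, v in enumerate(received):
        if v is None:
            left = received[i - 1] if i > 0 else 0.0
            right = received[i + 1] if i + 1 < len(received) else 0.0
            restored[i] = a * (left + right)
    err = sum((r - s) ** 2 for r, s in zip(restored, samples))
    return err / sum(s * s for s in samples)

# toy correlated segment of 128 samples (ar-like, coefficient 0.9)
random.seed(1)
x, prev = [], 0.0
for _ in range(128):
    prev = 0.9 * prev + random.gauss(0.0, 1.0)
    x.append(prev)
print(recover_segment(x))  # well below the 25 % error of no recovery
```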
the authors intend to investigate other approaches – determining the relation between the interpolation coefficients and the sign characteristics of the signal, which are easier to determine than the correlation ones, as well as calculating the characteristics at the receiving end directly from the signal with losses.

type of interpolation               error distribution range
linear interpolation                5–12 %
1st-order adaptive interpolation    3–5 %
2nd-order chebyshev interpolation   5–15 %
2nd-order adaptive interpolation    2–4 %
table 1: fused speech recovery error

references

[1] chlamtac, i.: "an ethernet compatible protocol for real-time voice/data integration." computer networks and isdn systems, vol. 10 (1985), no. 2, p. 81–96.
[2] bially, t., gold, b., seneff, s.: "a technique for adaptive voice flow control in integrated packet networks." ieee trans. on communications, vol. com-28 (march 1980), no. 3, p. 325–333.
[3] frost, v. s., friedman, e. m., minden, g. j.: "multirate voice coding for load control on csma/cd local computer networks." computer networks and isdn systems, vol. 11 (1986), p. 99–110.
[4] petr, d. w., dasilva, l. a., frost, v. s.: "priority discarding of speech in integrated packet networks." ieee journal on selected areas in communications, vol. 7 (june 1989), no. 5, p. 644–656.
[5] clark, g. c., cain, j. b.: "error-correction coding for digital communications." new york: plenum press, 1982, p. 352.
[6] van den bos, a.: "complex electron wave reconstruction using parameter estimation." ieee transactions on instrumentation and measurement, vol. 46, no. 4, p. 826–830.
[7] rabiner, l. r., schafer, r. w.: "digital processing of speech signals." new jersey 07632: prentice-hall, 1978, p. 436.

v. zagursky
phone: +371 755 8448
fax: +371 755 5337
e-mail: zagursky@edi.lv, aigars@egle.cs.rtu.lv
institute of electronics and computer science of latvian university

a. riekstinsh
institute of automation and computer engineering of riga technical university
14 dzerbenes str., riga, lv-1006, latvia
on cfd investigation of radial clearance of labyrinth seals of a turbine engine

michal čížek a,∗, zdeněk pátek b

a czech technical university in prague, center of aviation and space research, jugoslávských partyzánů 1580/3, 16000 praha, czech republic
b vzlú, czech aerospace research centre, beranových 130, 19905 praha, czech republic
∗ corresponding author: michal.cizek@fs.cvut.cz

acta polytechnica 60(1):38–48, 2020. doi:10.14311/ap.2020.60.0038

abstract. fluid flow in labyrinth seals of a turbine engine is described. the aim is to describe numerical calculations of fluid flow in labyrinth seals and to evaluate the calculated data for different settings of the radial clearance of the labyrinth seals. the results are achieved by detailed 3d cfd simulations of a typical seal geometry. the calculations are performed for different radial clearances at a constant pressure drop. the calculated data are evaluated in terms of mass flow, static pressure, total enthalpy and total temperature of the air. based on the calculated data, it is visible that the total temperature of the air increases in the labyrinth seals. the static pressure of the air acts as expected – the static pressure decreases across all teeth. the mach number is similar in all teeth, but the maximum value occurs in the last tooth, because of the expansion to the ambient conditions. the result of the calculations is that the total temperature in labyrinth seals is not constant, as is usually presented or supposed in the common literature.

keywords: labyrinth seal, cfd calculation, turbine engine.

1. introduction

this article describes and analyses the flow in a labyrinth seal of a small turbine engine. the objective is to analyse the air flow for a constant pressure drop in the seals with different geometrical settings – that means different radial clearances between the rotor and stator and different numbers of teeth. generally, the labyrinth seals in a turbine engine prevent the air flow from entering the engine modules where the flow is useless; the turbine disc is screwed to the shaft, i.e., it is a rotating part. the air flow is primarily used to cool the turbine blades, turbine discs, shafts, etc. (see [1]). thanks to the labyrinth seals, it is possible to direct the air flow to the parts of the engine where it can be useful – for example, to decrease the axial force on the shaft (see [2]). in numerical analysis, it is important to correctly design and define the cavity between the rotating and non-rotating parts, because the tooth profile has an important influence at high mach numbers (see [1, 3]). historically, the research of labyrinth seals has been carried out more extensively on steam turbines than on aircraft turbine engines (see [1, 4]). the steam turbines use the labyrinth seals especially to reduce the mass flow over the tips of the turbine blades and to increase the efficiency of the turbine [4]. in steam turbines, the radial clearance has a higher influence on the turbine performance than in the turbine engine.
but when the radial clearance is too large in a critical part of the turbine engine, the influence on the engine performance parameters (e.g., fuel consumption) is more pronounced than with the standard clearance (see [5]); this is because a steam turbine has bigger dimensions than an aircraft turbine engine (see [1, 4, 6]). in a small aviation turbine engine, a device continually changing the radial clearance during the flight cannot be used – technically it is not a problem, but its weight is. this is the reason why it is necessary to understand the air flow in the cavities between the rotating and non-rotating parts (see [7, 8]). the way to understanding the flow is through cfd simulation (see [9]). the simulation is set up so that the part with teeth rotates – as follows from [10], where only the rotor wall rotated. the stator parts are located in front of and behind the rotor part. a future evaluation of the calculations will be performed by measurements on the rotating part of the labyrinth teeth (see [11, 12]).

thermodynamically, the process in the labyrinth seal is a conversion of the kinetic energy of the shaft to the heat energy of the flow (see [1]). the thermal energy manifests itself by a higher total temperature and a higher enthalpy of the air flow. generally, the dissipation of the kinetic energy is an important factor, although the small dimensions of a small turbine engine typically render it relatively small. this is a very difficult situation for engine designers. the aim of this work is directly defined: to clarify why the temperature increases. the problem is clarified by a thorough, detailed numerical analysis based on the cfd calculation of the labyrinth seals of the turbine engine. in the end, the engine designer will have more information about the temperature through the labyrinth seal and will be able to better define the material of the seal with respect to the conditions of its use.

2. geometrical description

the labyrinth seals of a small turbine engine in the cfd calculation are composed of rotor and stator parts. the stator part is formed by a non-rotational surface (see fig. 1). the rotor part consists of a shaft with teeth. the shaft rotates with a predetermined constant speed (see fig. 2). there are two surfaces between the selected teeth where a swirl is created. thanks to the circumferential swirl, the kinetic energy of the air flow is dissipated (a similar geometry is in [7]). details of the geometrical parameters are in tab. 1.

hub radius             86.5 mm
tip radius             89 mm
length of rotor part   17 mm
length of stator part  21 mm
table 1. geometrical parameters.

the geometric parameters correspond to the small turbine engine. the rotor of the labyrinth seals consists of straight teeth that are tapered on the external side. there is a thin flat on the tooth tip, which remains there due to manufacturing reasons (see fig. 3). for a better quality of the air flow, the tooth should be as sharp as possible – in an ideal situation, it would be a sharp edge. this geometry is a compromise between the ideal and the real tooth.
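a small worked example (ours) of how the geometry of tab. 1 translates into the clearances studied below: the flow-channel height without teeth is the tip-minus-hub difference (2.5 mm, which matches the reference clearance rcref given later in tab. 4), so the dimensionless clearances rccorr = 0.02–0.06 used in the calculations correspond to absolute gaps of 0.05–0.15 mm.

```python
hub_radius_mm, tip_radius_mm = 86.5, 89.0
rc_ref_mm = tip_radius_mm - hub_radius_mm       # channel height without teeth: 2.5 mm
for rc_corr in (0.02, 0.04, 0.06):              # the three clearances studied
    print(rc_corr, rc_corr * rc_ref_mm, "mm")   # absolute gaps: 0.05, 0.10, 0.15 mm
```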
3. 3d model and calculating mesh

the 3d model for the cfd calculation consists of three basic volumes:

• inlet control volume – non-rotating;
• sealing volume – rotating (the volume rotates, but a "counter rotating wall" boundary condition is set on the wall of the stator, i.e., the face rotates in the reverse direction to the volume – resulting in a stator; this solution was chosen based on the manual [13] and on a discussion with an ansys cfd specialist);
• outlet control volume – non-rotating.

all three parts are designed as circular cuts with a 5° opening angle. all parts were designed in ansys design modeler v18. the mesh of the sealing volume is presented in fig. 4. the calculating mesh was prepared in the ansys meshing v18 software. the inflation function was used on all edges for a better description of the fluid flow; even though the dimensions of the seal are very small, inflation layers are additionally created on the edges. at the narrowest point of the labyrinth seals – the radial clearance between the teeth and the stator part – there are 14 rows of a hexahedral mesh. in this setting, 2.47 million cells are generated in the sealing volume, which contains 5 teeth. in the inlet and outlet control volumes, 158 thousand cells are created. "frozen-rotor" interfaces are placed between the rotor and stator parts.

figure 1. stator parts. figure 2. rotor part. figure 3. teeth. figure 4. mesh of labyrinth seals.

4. boundary conditions

all variants were calculated with a constant pressure drop. the total pressure and the total temperature are defined in the inlet control volume; in the outlet control volume, the static pressure is defined (see tab. 2). the rotating sealing volume is defined by a constant rotational speed. air as an ideal gas is used as the working fluid. the heat transfer model is total energy, and the turbulence model is k − ε. the boundaries of the volumes are defined by periodic conditions to save computing time (see fig. 5). the ansys cfx v18 software was used for the calculations.

the k − ε model was selected based on a preliminary analysis of the turbulence models. three turbulence models were tested at a constant radial clearance, with an identical mesh and identical boundary conditions:

• k − ε
• sst (shear stress transport)
• rng k − ε

the analysis was performed using the total temperature and mass flow differences in the first and the last tooth of the labyrinth seal. the results of this preliminary analysis show that k − ε is the best turbulence model (see fig. 6 and fig. 7). the k − ε turbulence model gives good results with a reasonable computing time, and it also has a wall function for a better description of the boundary layer. similar results can be found in [2] and [9].

5. results of the calculation

the calculation was finished in 1000 iterations (see fig. 8 and fig. 9, where the convergence of the residuals and of the mass flow through the labyrinth seals is visible). the time steps differ for specific numbers of iterations (see tab. 3). the calculated thermodynamic parameters are evaluated by the following formulas:

• mass flow is represented by a dimensionless flow coefficient:

$q_{corr} = q/q_{ref}$  (1)

• static pressure is represented by a dimensionless pressure coefficient:

$ps_{corr} = ps/ps_{ref}$  (2)
• total temperature is represented by a dimensionless temperature coefficient:

$tc_{corr} = tc/tc_{ref}$  (3)

• total enthalpy is represented by a dimensionless enthalpy coefficient:

$h_{corr} = h/h_{ref}$  (4)

• the radial clearance is represented by a dimensionless clearance:

$rc_{corr} = rc/rc_{ref}$  (5)

the reference values of the thermodynamic parameters were established by the ambient conditions corresponding to a standard operation of the turbine engine. the reference value of the radial clearance is the height of the flow channel without the teeth (see tab. 4).

pressure ratio [–]   inlet total temperature [k]   inlet total pressure [kpa]   rotating speed [rpm]
1.3                  542                           660                          35e+03
table 2. boundary conditions.

number of iterations   time step [s]
1                      10e-6
300                    10e-5
500                    10e-4
1000                   10e-4
table 3. time steps.

figure 5. boundary and periodic conditions. figure 6. turbulent model comparison. figure 7. turbulent model comparison. figure 8. convergence of residuals. figure 9. convergence of mass flow.

the field of mach number through all 5 teeth and rccorr from 0.02 to 0.06 is presented in the pictures from fig. 10 to fig. 12. the highest mach number is in the last tooth. in the chambers between the teeth, lower speeds than in the radial clearance area can be observed. in the following charts, calculations with a constant number of teeth and a variable radial clearance are presented. in the charts in fig. 16 to fig. 19, the standard thermodynamic parameters that were calculated by the cfd calculation are presented. on the x-axis, the number of the tooth is shown; it is possible to see the trend of the thermodynamic parameters in all teeth (not only at the inlet and outlet). due to this fact, it should be possible to better understand which thermodynamic phenomena occur in the teeth. on the y-axis, the thermodynamic parameters (as dimensionless values defined by formulas (1) to (5)) are shown. in the charts, 3 lines represent 3 different radial clearances. in the following figures (from fig. 13 to fig. 15), the velocity vectors in the labyrinth seals are presented. the decreasing mass flow (seen in fig. 17) is explained as a numerical error: based on the convergence analysis of the mass flow presented in fig. 9, the inlet and outlet mass flows are identical, and the differences in fig. 17 are very small.

6. results and discussion

in the previous paragraphs, the steps of the cfd calculation in the labyrinth seals were summarized. the geometrical setting is different in comparison with the original geometrical setting in [10]. the original idea was without the non-rotating parts; regarding the parts with teeth, only the wall with teeth was rotating. based on the results of the analysis, it was decided that such a simulation is not accurate enough. the fluid model was modified to achieve a better description – inlet and outlet non-rotating parts and a rotating part with teeth. the calculation mesh presented in fig. 4 is equivalent to the calculating mesh in [7, 14, 15]. based on this analysis, the mesh is usable and fully and properly functioning. the mach number field through the teeth corresponds to the expectations based on thermodynamics (see fig. 10, fig. 11 and fig. 12). from the flow field, it can be seen that the maximum speed is in the last tooth, which has the greatest effect on the flow in the labyrinth seals. the maximum mach number occurs at the position of the maximal radial clearance.
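a minimal sketch (ours; the variable names are our own) of how the dimensionless quantities of (1)–(5) are formed from the reference values of tab. 4:

```python
REF = {"q": 0.01, "ps": 101.0, "tc": 288.0, "h": 260.0, "rc": 2.5}
# units: q [kg/s], ps [kpa], tc [k], h [kj/kg], rc [mm], as in tab. 4

def dimensionless(name, value):
    """formulas (1)-(5): divide the calculated quantity by its reference value."""
    return value / REF[name]

# example: the inlet total temperature of tab. 2 (542 k) as tccorr
print(dimensionless("tc", 542.0))  # about 1.88
```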
from the velocity vector field (see fig. 13, fig. 14 and fig. 15), results similar to the mach number field can be seen, and the velocity vortexes are fully developed in the cells between the teeth. the trends of the non-dimensional static pressure (see fig. 16) and mass flow (see fig. 17) are as expected: both are decreasing, which can be observed as a decreasing static pressure and a decreasing mass flow at a constant pressure drop. these trends are visible in all situations with a different radial clearance. fig. 17 shows that the minimum mass flow through the seals is reached when the minimum radial clearance (orange line) is used. a different situation is in the trend of the non-dimensional total temperature (see fig. 18) and total enthalpy (see fig. 19). with the minimum radial clearance (orange line), the temperature gradient reaches its maximum (∆tccorr ≈ 0.26); with the maximum radial clearance (green line), the temperature gradient reaches its minimum (∆tccorr ≈ 0.12). the identical trend is observed in the enthalpy: the maximum enthalpy gradient is reached with the minimum radial clearance (∆hcorr ≈ 0.29) and the minimum enthalpy gradient is reached with the maximum radial clearance (∆hcorr ≈ 0.14).

comparing the velocity distribution in [2, 15] with the velocity vectors in fig. 13, fig. 14 and fig. 15, the velocity vectors are similar. the rotating swirl is fully developed and the labyrinth seal is working correctly, as can be seen when comparing the pressure distribution in [7] with the static pressure distribution in fig. 16, where the pressure decreases; this corresponds with the labyrinth seal theory [1]. the mass flow through the labyrinth seals in fig. 17 is similar to the one in [3]. the mach number distribution in fig. 10, fig. 11 and fig. 12 is comparable with the results in [7]. the mach number distribution is logical – the maximum mach number is in the last tooth because of the expansion to the ambient conditions. based on the above-mentioned facts, it can be stated that the labyrinth seal calculation is correct in the sense that it provides results corresponding to basic thermodynamic considerations. the result of the total temperature distribution is that the total temperature is not constant over the teeth. the static pressure and the mass flow decrease across all teeth, and the total enthalpy increases across all teeth.

psref [kpa]   tcref [k]   href [kj·kg−1]   qref [kg·s−1]   rcref [mm]
101           288         260              0.01             2.5
table 4. reference conditions.

figure 10. mach number field – rccorr = 0.02. figure 11. mach number field – rccorr = 0.04. figure 12. mach number field – rccorr = 0.06. figure 13. velocity vectors – rccorr = 0.02. figure 14. velocity vectors – rccorr = 0.04. figure 15. velocity vectors – rccorr = 0.06. figure 16. non-dimensional static pressure. figure 17. non-dimensional mass flow. figure 18. non-dimensional total temperature. figure 19. non-dimensional total enthalpy.

7. conclusions

the conclusion from the calculations is that the total temperature and enthalpy increase because of the conversion of the shaft kinetic energy to heat energy, which disproves the frequently used assumption that the temperature is constant.
this new knowledge can be important for the design of aircraft turbine engines. the designers can improve the design of the shaft of the turbine and thus improve the performance characteristics of the engine. when the temperature gradients in all teeth are calculated more precisely, the appropriate material of the shaft can be chosen accordingly; based on it, harmful vibrations of the shaft that are dangerous for the engine (see [1]) can be eliminated. as the next step, these studies should be carried out:

(1.) the loss performance parameter (e.g., power) should be analysed.
(2.) the development of the enthalpy, total temperature and static pressure should be experimentally tested.

after the loss performance parameter analysis in the teeth (point 1), it should be interesting to analyse the stability of the flow path. this should be important for a better evaluation of the calculated data and also helpful for the design or appropriate selection of an experimental laboratory where the seals should be tested. after the experimental tests of the labyrinth seals, it would be possible to evaluate the calculated data, and the cfd model of the seals would be modified accordingly. then, the calculation model could be simplified and modified to be more user friendly.

list of symbols

rc radial clearance [mm]
q mass flow [kg s−1]
ps static pressure [pa]
tc total temperature [k]
h total enthalpy [j kg−1]

subscripts
corr corrected
ref reference

acknowledgements

the authors acknowledge support from the esif, eu operational programme research, development and education, and from the center of advanced aerospace technology (cz.02.1.01/0.0/0.0/16_019/0000826), faculty of mechanical engineering, czech technical university in prague.

references

[1] j. jerie. teorie motorů. ediční středisko čvut, prague, 1981.
[2] g. ilieva, c. pirovsky. labyrinth seals with application to turbomachinery. materials science & engineering technology 50(5):479–491, 2019. doi:10.1002/mawe.201900004.
[3] g. bondarenko, v. baga, i. bashlak. flow simulation in a labyrinth seal. in problems of mechanics in pump and compressor engineering, vol. 630 of applied mechanics and materials, pp. 234–239. trans tech publications ltd, 2014. doi:10.4028/www.scientific.net/amm.630.234.
[4] a. v. sčeglajev. parní turbíny 1. nakladatelství technické literatury, prague, 1983.
[5] h. l. stocker. advanced labyrinth seal design performance for high pressure ratio gas turbines. in asme 1975 winter annual meeting: gt papers, vol. turbo expo: power for land, sea, and air. 1975. doi:10.1115/75-wa/gt-22.
[6] j. a. demko. the prediction and measurement of incompressible flow in a labyrinth seal. ph.d. thesis, texas a&m university, 1986.
[7] j. fürst. numerical simulation of flows through labyrinth seals. in engineering mechanics 2015, vol. 821 of applied mechanics and materials, pp. 16–22. 2016. doi:10.4028/www.scientific.net/amm.821.16.
[8] h. zimmermann. some aerodynamic aspects of engine secondary air systems. in turbo expo: power for land, sea, and air, vol. 1: turbomachinery. 1989. doi:10.1115/89-gt-209.
[9] l. san andrés, t. wu. leakage and dynamic force coefficients for two labyrinth gas seals: teeth-on-stator and interlocking teeth configurations – a cfd approach to their performance.
in turbo expo: power for land, sea, and air, vol. 7b: structures and dynamics. 2018. doi:10.1115/gt2018-75205.
[10] m. čížek. chambers of labyrinth seals of turbine engine. in new trends of civil aviation 2018, conference proceedings. 2018.
[11] d. sun, s. wang, c.-w. fei, et al. numerical and experimental investigation on the effect of swirl brakes on the labyrinth seals. journal of engineering for gas turbines and power 138(3), 2015. doi:10.1115/1.4031562.
[12] h. l. chae. a numerical and experimental study of windback seals. ph.d. thesis, texas a&m university, 2009.
[13] ansys help viewer 18.0, 2016.
[14] j. j. moore. three-dimensional cfd rotordynamic analysis of gas labyrinth seals. journal of vibration and acoustics 125(4):427–433, 2003. doi:10.1115/1.1615248.
[15] t. wu, l. san andrés. gas labyrinth seals: on the effect of clearance and operating conditions on wall friction factors – a cfd investigation. tribology international 131:363–376, 2019. doi:10.1016/j.triboint.2018.10.046.

modeling of material balance from the experimental ucg

milan durdán∗, ján terpák, ján kačur, marek laciak, patrik flegner

technical university of košice, faculty berg, institute of control and informatization of production processes, němcovej 3, 040 01 košice, slovak republic
∗ corresponding author: milan.durdan@tuke.sk

acta polytechnica 60(5):391–399, 2020. doi:10.14311/ap.2020.60.0391

abstract. underground coal gasification is a continually evolving technology which converts coal to calorific gas. there are many important parameters in this technology which are difficult to measure. these parameters include the underground cavity growth, the amount of gasified coal, and the leakage of input and output gaseous components into the surrounding layers during the coal gasification process. mathematical modeling of this process is one of the possible alternatives for determining these unknown parameters. in this paper, the structure of the mathematical model of the laboratory underground coal gasification process from the material balance aspect is presented. the material balance consists of the mass components entering and leaving the ucg process. the paper presents the material balance in the form of a general mass balance and an atomic species balance. the material balance was tested by six ucg laboratory experiments, which were realized in two ex-situ reactors.

keywords: underground coal gasification, material balance, atoms, ex-situ reactors, losses.

1. introduction

underground coal gasification (ucg) is a continually evolving technology and seems to be a great source of energy that can be obtained at a lower cost than in the case of using conventional mining methods. this process offers a less costly alternative to traditional coal mining methods, especially when the coal seam is placed deep under the earth's surface. a system of two wells needs to be built:
at least one for the injection (i.e., for the gasification agent input) and at least one for the production (i.e., for the product gas output). the ucg is a process of transformation of coal into the output gas (i.e., syngas). methane, hydrogen, carbon monoxide, carbon dioxide, and hydrogen sulfide are the primary components of the produced gas. the produced gas can have various uses, e.g., it may serve as a fuel for a gas turbine for electricity generation, or it can be used as a chemical feedstock for a conversion to synthetic fuels. the product gas composition depends on the coal geology and gasification parameters. the diversity of the conditions, factors, and structure of individual coal seams hampers the transfer of ucg process knowledge [1–3].

the ucg reactor can be divided into four zones (i.e., the oxidation zone, the reducing zone, the drying zone, and the pyrolysis zone). the reactions take place in these zones depending on the coal geology, the composition of the gasification agent, and the coal seam temperature. the oxidation reactions increase the coal seam temperature to above 900 °c. the reduction reactions generate the produced gas (i.e., synthetic gas – syngas) in a temperature range from 550 to 900 °c. the coal seam is initially dried and then pyrolyzed at temperatures ranging from 200 to 550 °c. the product gas is cleaned and stored after being extracted from the production well [4–6]. the scheme of the ucg process is shown in figure 1.

figure 1. the scheme of the ucg process.

a decisive role throughout the ucg process is correctly identifying the geology of the area. impermeable over/underlying strata that have low porosity and little deformation are the most suitable; they act as a seal between the surrounding rock layers and the gasified coal seam [7]. the cavity growth in time and space has a random nature due to the impact of the surrounding rock and the lack of information to control the ucg process. fracturing and cracks are created inside the gasified coal seam and the surrounding rocks when the underground cavity grows. it was identified that the operational conditions (i.e., operation times, feed gases, and flow rates of gasification) significantly affect the fracture occurrence, cavity growth, rock behavior, and coal gasification efficiency [8]. it is recommended to implement in the ucg process an effective pressure control system and also a control of the gas production (i.e., calorific value and volumetric production rate) by the proper composition and volume of the gasification agent [9].

numerical models, as a tool for optimizing the technologies, are used to simulate the processes. the paper [10] reviews the approaches, key concepts, assumptions, and limitations of models (e.g., the packed bed model, the channel model, and the coal slab model) for a prediction of the cavity growth and gas production in the ucg process. the results of this study showed that the currently used 3d computing models are very simple (i.e., the details of the physical and chemical models are not considered), and that developing a ucg model requires an understanding of the physical theory and a correct interpretation of the operating and laboratory results. the paper [11] describes the analysis of the temperature field in the gasified coal layers by a model experiment in the laboratory gasifier and two-dimensional nonlinear and dynamic mathematical models.
the temperature of the coal layer is considered steady in the thickness direction and unsteady (i.e., in time) along the strike and slope of the coal seam. the results showed that the relative error between the calculated value and the measured value was below 15 %, except for the measured points near the flame working face. the mathematical modeling of combustion and in-situ gasification is described in [12]; chemical reactions in multiphase porous media were simulated in this work. the influence of the oxidant injection position (i.e., bottom and top position) on the buoyant forces and product gas properties, using a dynamic model of an underground cavity partially filled with an ash bed, is described in [13]. this dynamic model simulates the combined effect of transport phenomena (i.e., heat and mass) and chemical reactions during the ucg process. a mathematical model in the energy balance form is described in [14]; the task of this model is realizing energy balances and estimating the efficiency of the hydrogen conversion from syngas.

the development of information concerning the efficiency of processes, either through calculation from first principles or by experimentation, is also associated with the development of material and energy balance calculations. recently, incorporating material balance principles in industrial and agricultural performance measurement systems with pollutant factors has been on the rise. the issue of eco-efficiency measurement adjusted for pollution, taking into account materials flow conditions and the material balance principle requirements to provide more precise measures of performance, is described in paper [15]; in this work, a new approach integrating a slacks-based measure to enhance the malmquist–luenberger index by a material balance condition was presented. new approaches to material balance calculations, namely graph and neuro-fuzzy, are described in [16]; the results show that a multimodel combining both methods was suitable. the paper [17] described a new comprehensive version of the flowing material balance to address errors in the hydrocarbons-in-place estimation. the developed method was used to reasonably determine the original volumes of hydrocarbons in place in both conventional and unconventional reservoirs.
the materialbalance equation for the estimation of the gas and condensate in place in shale-gas-condensate reservoirs has been described in [21]. the task of this materialbalance is estimating the critical time for using the gas injection when the condensate buildup represents a problem. it considers the stress-dependency of porosity and permeability and also takes into account the effects of adsorbed, free, and dissolved gas-condensate production. the paper [22] described the material balance of organic matter for initial rock sample (domanic black shale) and after a thermal treatment at 300 and 500 ◦c. the amounts of generated liquid hydrocarbon at these temperatures are presented. the paper [23] describes equations of mass and energy balances for the solid and gas phases in the ucg process. the heat transfer by conduction in the solid phase and convection in the gas phase is considered. the results of the longwall generator method showed that the energy content of the output gas depends on the coal seam dimensions and the applied pressure gradient. the use of the mass and heat balance calculations for thin coal seams in faulting zones of the coal basin is described in [24]. these calculations are used to analyze the effectiveness of the ucg process and to obtain quantitative and qualitative indicators of the output parameters for the ucg products prediction. the mass balance is based on a mathematical model of the physical and chemical processes and takes into 392 vol. 60 no. 5/2020 modeling of material balance from the experimental ucg account phases of individual substances (i.e., solid, liquid, and gaseous phases). this paper described the proposal of the material/mass balance of the ucg process due to the information given above. this balance is verified by measurements realized under laboratory conditions on the two physical models (i.e., ex-situ reactors). the total material balance will be based on the balance of the input (i.e., coal and gasification agent) and output components (i.e., unburned coal, condensate, ash, and produced gas). the elemental material balances will be based on the balance of elements such as c (carbon), h (hydrogen), n (nitrogen), o (oxygen), and s (sulphur), which are dominant in the mass components of the total material balance. 2. material balance the subject of research (i.e., system) can be defined as a volume in space with set boundaries corresponding to the purpose of the research. the system includes a quantity of material and an assembly of equipment where it is necessary to supply energy (e.g., in heat form) to transform the material from the initial to the final state (i.e., a mass of the system is variable with time) [25]. the laboratory ucg process may be marked as a batch process because the raw material is added to the ex-situ reactor at the beginning of the process and kept there until the desired final state is reached. chemical reactions taking place in this ex-situ reactor involve the separation and combination of molecules and elements to form new chemical substances (e.g., synthetic gas) during the ucg process. in this paper, a material balance is used as an application of conservation of mass in the ucg process. the task of this material balance is to determine mass flow by accounting for material entering and leaving a system (ex-situ ucg reactor). 
the entering materials in the process are the coal and the gasification agent (i.e., a mixture of air and oxygen), and the leaving materials are the product gas (i.e., syngas), ash, condensate, and unburned coal. the total mass flow that enters the process must be equal to the overall mass flow that leaves the process. a schematic drawing of the material balance of the ucg process in the ex-situ reactor is shown in figure 2.

figure 2. scheme of material balance.

since the total mass is conserved, a general mass balance of the overall ucg process can be written as follows:

$G_{coal} + G_{air} + G_{oxygen} = G_{coal,unburned} + G_{ash} + G_{syngas} + G_{condensate}$  (1)

where G_coal is the mass of input coal (kg), G_air is the mass of air (kg), G_oxygen is the mass of oxygen (kg), G_coal,unburned is the mass of unburned coal (kg), G_ash is the mass of ash (kg), G_syngas is the mass of product gas (kg), and G_condensate is the mass of condensate (kg).

chemical reactions are common occurrences in the production and processing of materials. one of the expected outputs from the ucg process is the higher temperature that ensures the realization of the chemical reactions needed for the syngas creation. the equation of the material balance can be written based on the principle of the conservation of atoms, either as species or as elements. an atomic species balance is based on the principle that input equals output, because atomic species can neither be generated nor consumed in chemical reactions [26, 27]. the balance of atomic species for the considered elements c (carbon), h (hydrogen), n (nitrogen), o (oxygen), and s (sulphur) has the following form:

$\sum_{i=1}^{m} G_{x,input_i} = \sum_{j=1}^{n} G_{x,output_j}$  (2)

where G_x,input_i is the mass of the x-th chemical element (i.e., c, h, n, o, and s) in the individual input materials (i.e., input coal, air, and oxygen) (kg), and G_x,output_j is the mass of the x-th chemical element in the individual output materials (i.e., unburned coal, ash, and gas) (kg).

table 1 shows the representation of atoms in the mass of the entering and leaving materials. similarly, table 2 shows the representation of atoms in the mass of the syngas. these representations were determined according to the chemical composition of the particular entering and leaving materials. the chemical composition of the syngas (i.e., without so2) was determined by the type and percentage content of the gas components measured during the laboratory experiments. the mass of so2 in the syngas was calculated from the difference of s atoms between the input coal and the unburned coal. the mass of the x-th chemical element in the syngas was determined from the individual chemical compounds (i.e., co, co2, h2, ch4, so2, o2, and n2) in which this chemical element is found. for example, we can calculate the mass of the c chemical element from the chemical compounds of the syngas according to

$G_{c,syngas} = \sum_{k=1}^{3} V_{k,syngas} \cdot \rho_k \cdot \frac{M_c}{M_k}$  (3)

where G_c,syngas is the mass of the c chemical element in the syngas (kg), V_k,syngas is the volume of the k-th chemical compound (m3), ρ_k is the density of the k-th chemical compound (kg·m−3), M_k is the molar mass of the k-th chemical compound (g·mol−1), M_c is the molar mass of the c chemical element (g·mol−1), and k is the index of the chemical compounds in which c is found (i.e., co, co2, and ch4).
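a sketch (ours; the volumes below are illustrative, not experimental data) of the element bookkeeping in (3): the mass of carbon carried by the syngas is summed over the carbon-bearing compounds listed in table 2 below.

```python
# molar masses [g/mol] and normal densities [kg/m3] of the carbon-bearing
# syngas compounds of table 2 (densities at 0 degrees c, 101.325 kpa)
M = {"co": 28.01, "co2": 44.01, "ch4": 16.04, "c": 12.011}
RHO = {"co": 1.25, "co2": 1.98, "ch4": 0.72}

def carbon_mass_in_syngas(volumes_m3):
    """eq. (3): g_c = sum_k v_k * rho_k * (m_c / m_k) over co, co2, ch4."""
    return sum(v * RHO[k] * M["c"] / M[k] for k, v in volumes_m3.items())

# illustrative (made-up) measured volumes:
print(carbon_mass_in_syngas({"co": 120.0, "co2": 310.0, "ch4": 45.0}))
# about 256 kg of carbon leaving in the syngas
```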
the representation of considered atoms in entering and leaving materials atom co co2 h2 ch4 so2 o2 n2 c yes yes no yes no no no h no no yes yes no no no n no no no no no no yes o yes yes no no yes yes no s no no no no yes no no table 2. the representation of considered atoms in syngas compound (kg·m−3), mk is the molar mass of k-th chemical compound (g·mol−1), mc is the molar mass of c chemical element (g·mol−1), and k is the index of the chemical compound where c is found (i.e., co, co2, and ch4). 3. verification of material balance the material balance of the ucg process was verified using data obtained from four experimental measurements. these measurements were realized in two ex-situ reactors. a description of ex-situ reactors, experimental measurements, and the evaluation of these measurements by total and atoms material balance is in the following subsections. 3.1. ex-situ reactors and experimental measurements ex-situ reactors were created for the investigation of the ucg process. these ex-situ reactors were created in a geometric similarity to a real coal seam. the coal cubes or broken coal represent the coal seam model. the clays, isolation material (i.e., sibral and nobasil), and a mixture of gravel and water glass represent the surrounding rock. these components were embedded in ex-situ reactors before the experimental gasification. ex-situ reactors include vessel and lid. the vessel of the first ex-situ reactor has a half-round shape. its length is 3000 mm, and height is 500 mm. the vessel of the second ex-situ reactor has a shape of a large steel box. dimensions of this ex-situ reactor are 5000 × 1500 × 500 mm (i.e., length × width × height). the ucg process realized in created ex-situ reactors is based on the principle of a controlled gasification agent (i.e., air and oxygen) supply to the burning coal seam model and the extraction of syngas. the laboratory ucg process scheme is shown in figure 3. the flow of the gasification agent and syngas, the syngas compositions, and temperatures of coal and surrounding rock were measured during laboratory experiments. the principal scheme of ex-situ reactors is shown in figure 4. the temperature and gas composition were measured using measuring probes designed for a gasification process analysis. the six experiments were realized in described exsitu reactors four experiments in the first ex-situ reactor and two experiments in the second ex-situ reactor. these experiments were different in the coal seam model. the cross-sectional design of the coal seam model for the first experiment is shown in figure 5a and for the second experiment in figure 5b. the first and second measurement were realized in the first ex-situ reactor. the cross-sectional design of the coal seam model for the third experiment is shown in figure 5c and for the fourth experiment in figure 5d. the third and fourth measurement were realized in the second ex-situ reactor. the cross-sectional design of the coal seam model for the fifth and sixth experiment is shown in figure 5e. the fifth and sixth measurement were realized in the first ex-situ reactor and the individual components volume of these two experiments was different. the physical model of the coal seam consisted of broken coal with a total weight of 521 kg in the first experiment. there was 167 kg of the unburned crushed coal after the gasification process. coal cubes with a total weight of 532 kg were used in the second experiment. in total, 62 kg of unburned coal remained after the gasification process. 
coal cubes with a total weight of 702 kg were used in the third experiment; in total, 147 kg of unburned coal remained after the gasification process. the ex-situ reactor was filled with coal cubes with a total weight of 766 kg in the fourth experiment; in total, 133.5 kg of unburned coal remained after the gasification process. coal cubes with a total weight of 214 kg were used in the fifth experiment; no unburned coal remained after this experiment. coal cubes with a total weight of 472 kg were used in the sixth experiment; in total, 66 kg of unburned coal remained after the gasification process. the flows of the gasification agents and syngas, the mass flow rate of the condensate, and the chemical composition of the syngas (i.e., co, co2, ch4, h2, and o2) were measured in these experiments. lignite was used for the creation of the coal model. the analysis of the gasified coal, for both the input and the unburned coal, is shown in table 3.

figure 3. scheme of the laboratory ucg process.
figure 4. principal scheme of the ex-situ reactors (legend: (a) the first ex-situ reactor, (b) the second ex-situ reactor, 1 – gasification agent supply, 2 – gasification channel, 3 – exhaust gas, 4 – the lid of the generator, 5 – probes for analysis).
figure 5. the cross-sectional design of the coal seam model.

               moisture (%)  ash (%)   elementary analysis (%)
                                       c     h    n    o     s
input coal     20.4          24.1      35.9  3.1  0.6  15    0.9
unburned coal  0             30.4      45.1  3.9  0.7  18.8  1.1
table 3. the coal analysis.

3.2. the simulation results

the values of the general and atomic species mass balance were calculated based on the measured components: the flow of the gasification agent and syngas, the composition of the gasification agent and syngas, the mass of the condensate, and the mass of the input and unburned coal. the values of the mass balance for the individual experiments are shown in tables 4–9. a loss component was also added to the tables; this component was determined as the difference between the entering and leaving material of the individual mass balance components. the percentage of losses in the individual experiments is shown in figure 6.

          entering material (kg)         leaving material (kg)
balance   coal     air      oxygen      unburned coal  ash      gas       condensate   losses
general   521      1504.13  21.42       167            75.18    1403.23   84.75        316.4
c         186.98   0        0           75.31          0        95.49     0            16.18
h         27.95    0        0           6.5            0        9.75      9.44         2.25
n         3        1153.65  0           1.21           0        952.52    0            202.93
o         172.4    350.48   21.42       31.39          0        342.95    75.56        94.41
s         4.68     0        0           1.88           0        2.52      0            0.27
table 4. the value of the general and atomic balance in the first experiment.

          entering material (kg)         leaving material (kg)
balance   coal     air      oxygen      unburned coal  ash      gas       condensate   losses
general   532      903.51   129.6       32             118.83   1237.97   53           123.3
c         190.93   0        0           14.43          0        170.17    0            6.33
h         28.54    0        0           1.25           0        20.64     5.89         0.76
n         3.07     692.98   0           0.23           0        599.61    0            96.21
o         176.04   210.53   129.6       6.01           0        443.35    47.11        19.69
s         4.78     0        0           0.36           0        4.19      0            0.22
table 5. the value of the general and atomic balance in the second experiment.

          entering material (kg)         leaving material (kg)
balance   coal     air      oxygen      unburned coal  ash      gas       condensate   losses
general   702      3220.74  129.6       147            124.99   2979.69   114.84       685.83
c         251.94   0        0           66.29          0        174.97    0            10.68
h         37.66    0        0           5.72           0        18.19     12.76        0.99
n         4.05     2470.28  0           1.06           0        1819.82   0            653.44
o         232.3    750.46   129.6       27.63          0        962.23    102.08       20.42
s         6.3      0        0           1.66           0        4.47      0            0.18
table 6. the value of the general and atomic balance in the third experiment.

          entering material (kg)         leaving material (kg)
balance   coal     air      oxygen      unburned coal  ash      gas       condensate   losses
general   766      1801.16  129.6       133.5          144.55   2043.06   49.68        325.97
c         274.91   0        0           60.21          0        198.52    0            16.18
h         41.09    0        0           5.2            0        28.15     5.52         2.23
n         4.41     1381.48  0           0.97           0        1296.01   0            88.91
o         253.48   419.69   129.6       25.09          0        515.32    44.16        218.19
s         6.88     0        0           1.51           0        5.05      0            0.32
table 7. the value of the general and atomic balance in the fourth experiment.

          entering material (kg)         leaving material (kg)
balance   coal     air      oxygen      unburned coal  ash      gas       condensate   losses
general   214      1179.02  129.6       0              51.71    1285.53   10.97        174.41
c         76.8     0        0           0              0        72.55     0            4.25
h         11.48    0        0           0              0        9.51      1.22         0.75
n         1.23     904.3    0           0              0        851.27    0            54.26
o         70.81    274.72   129.6       0              0        350.39    9.75         115
s         1.92     0        0           0              0        1.82      0            0.11
table 8. the value of the general and atomic balance in the fifth experiment.

figure 6. percentage of losses in the experiments.

the difference between the entering and leaving material can be caused by a gasification agent leak on the input side (i.e., atoms n and o) and a syngas leak on the output side (i.e., atoms c, h, n, o, and s). the highest amount of losses was found in the first (15.46 %) and third (16.92 %) experiment. this can be caused by the type of gasified coal (i.e., broken coal was gasified in the first experiment) and by the surrounding rock type (i.e., sibral, clay and a mixture of gravel and water glass in the third experiment). the broken coal caused gasification agent and syngas leaks through the coal seam, and the surrounding rock (i.e., a mixture of gravel and water glass) cracked under the influence of the high temperatures; the syngas and mainly the gasification agent leaked through this mixture.

figure 7. average percentage of atom losses.

the average percentage of atom losses is shown in figure 7, where the highest atom losses were for the components n (15.41 %) and o (13.05 %). this confirms that the largest losses were in the pipeline at the inlet of the reactor. figure 8 shows that the losses of the components n and o were not the same in the individual experiments, except for the first experiment, where they were approximately equal (i.e., 17.35 % for the component o and 17.45 % for the component n). the loss of the component o was the highest in the fourth and fifth experiments, while the loss of the component n was the highest in the second, third, and sixth experiments. we assume that the leaks were mainly through the o2 pipeline in the fourth and fifth experiments and through the air pipeline in the second, third, and sixth experiments. the mass balance calculation showed that the lowest syngas losses were measured for the c and h atoms; these atoms in the syngas were obtained only from the gasified coal.
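a minimal sketch (ours; the helper name is hypothetical) of how the loss column of tables 4–9 is obtained – the loss is simply the difference between the entering and leaving masses of each balance row; the numbers below are the "general" row of table 4 (first experiment), and table 9 for the sixth experiment follows below.

```python
def balance_loss(entering, leaving):
    """loss component of tables 4-9: sum(entering) - sum(leaving)."""
    return sum(entering) - sum(leaving)

# 'general' row of table 4: coal, air, oxygen in; unburned coal, ash,
# gas, condensate out; the loss reproduces the tabulated 316.4 kg
loss = balance_loss([521.0, 1504.13, 21.42], [167.0, 75.18, 1403.23, 84.75])
print(loss)                                    # 316.39 kg
print(100 * loss / (521.0 + 1504.13 + 21.42))  # about 15.5 %, cf. figure 6
```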
the loss of the component o was the highest in the fourth and fifth experiments, while the loss of the component n was the highest in the second, third, and sixth experiments. we assume that the leaks occurred mainly through the o2 pipeline in the fourth and fifth experiments and through the air pipeline in the second, third, and sixth experiments. the mass balance calculation showed that the lowest syngas losses were obtained for the c and h atoms; these atoms enter the syngas only from the gasified coal.

table 4. the values of the general and atomic balance in the first experiment (entering material: coal, air, oxygen; leaving material: unburned coal, ash, gas, condensate; all values in kg)
balance | coal   | air     | oxygen | unburned coal | ash    | gas     | condensate | losses
general | 521    | 1504.13 | 21.42  | 167           | 75.18  | 1403.23 | 84.75      | 316.4
c       | 186.98 | 0       | 0      | 75.31         | 0      | 95.49   | 0          | 16.18
h       | 27.95  | 0       | 0      | 6.5           | 0      | 9.75    | 9.44       | 2.25
n       | 3      | 1153.65 | 0      | 1.21          | 0      | 952.52  | 0          | 202.93
o       | 172.4  | 350.48  | 21.42  | 31.39         | 0      | 342.95  | 75.56      | 94.41
s       | 4.68   | 0       | 0      | 1.88          | 0      | 2.52    | 0          | 0.27

table 5. the values of the general and atomic balance in the second experiment (same layout as table 4)
general | 532    | 903.51  | 129.6  | 32            | 118.83 | 1237.97 | 53         | 123.3
c       | 190.93 | 0       | 0      | 14.43         | 0      | 170.17  | 0          | 6.33
h       | 28.54  | 0       | 0      | 1.25          | 0      | 20.64   | 5.89       | 0.76
n       | 3.07   | 692.98  | 0      | 0.23          | 0      | 599.61  | 0          | 96.21
o       | 176.04 | 210.53  | 129.6  | 6.01          | 0      | 443.35  | 47.11      | 19.69
s       | 4.78   | 0       | 0      | 0.36          | 0      | 4.19    | 0          | 0.22

table 6. the values of the general and atomic balance in the third experiment (same layout as table 4)
general | 702    | 3220.74 | 129.6  | 147           | 124.99 | 2979.69 | 114.84     | 685.83
c       | 251.94 | 0       | 0      | 66.29         | 0      | 174.97  | 0          | 10.68
h       | 37.66  | 0       | 0      | 5.72          | 0      | 18.19   | 12.76      | 0.99
n       | 4.05   | 2470.28 | 0      | 1.06          | 0      | 1819.82 | 0          | 653.44
o       | 232.3  | 750.46  | 129.6  | 27.63         | 0      | 962.23  | 102.08     | 20.42
s       | 6.3    | 0       | 0      | 1.66          | 0      | 4.47    | 0          | 0.18

table 7. the values of the general and atomic balance in the fourth experiment (same layout as table 4)
general | 766    | 1801.16 | 129.6  | 133.5         | 144.55 | 2043.06 | 49.68      | 325.97
c       | 274.91 | 0       | 0      | 60.21         | 0      | 198.52  | 0          | 16.18
h       | 41.09  | 0       | 0      | 5.2           | 0      | 28.15   | 5.52       | 2.23
n       | 4.41   | 1381.48 | 0      | 0.97          | 0      | 1296.01 | 0          | 88.91
o       | 253.48 | 419.69  | 129.6  | 25.09         | 0      | 515.32  | 44.16      | 218.19
s       | 6.88   | 0       | 0      | 1.51          | 0      | 5.05    | 0          | 0.32

table 8. the values of the general and atomic balance in the fifth experiment (same layout as table 4)
general | 214    | 1179.02 | 129.6  | 0             | 51.71  | 1285.53 | 10.97      | 174.41
c       | 76.8   | 0       | 0      | 0             | 0      | 72.55   | 0          | 4.25
h       | 11.48  | 0       | 0      | 0             | 0      | 9.51    | 1.22       | 0.75
n       | 1.23   | 904.3   | 0      | 0             | 0      | 851.27  | 0          | 54.26
o       | 70.81  | 274.72  | 129.6  | 0             | 0      | 350.39  | 9.75       | 115
s       | 1.92   | 0       | 0      | 0             | 0      | 1.82    | 0          | 0.11

table 9. the values of the general and atomic balance in the sixth experiment (same layout as table 4)
general | 472    | 1946.06 | 129.6  | 66            | 94.01  | 2001.15 | 10.97      | 375.54
c       | 169.4  | 0       | 0      | 29.76         | 0      | 130.01  | 0          | 9.62
h       | 25.32  | 0       | 0      | 2.57          | 0      | 20.01   | 1.22       | 1.53
n       | 0      | 1492.61 | 0      | 0             | 0      | 1159.88 | 0          | 332.74
o       | 156.19 | 453.45  | 129.6  | 12.4          | 0      | 687.98  | 9.75       | 29.11
s       | 4.24   | 0       | 0      | 0.74          | 0      | 3.28    | 0          | 0.21

figure 8. percentage of atom losses in the experiments: (a) the first experiment, (b) the second experiment, (c) the third experiment, (d) the fourth experiment, (e) the fifth experiment, (f) the sixth experiment.
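the loss bookkeeping above is simple enough to automate. the following python sketch (our own illustration, not the authors' code) recomputes the general balance loss of the first experiment from the table 4 entries, and illustrates the atom-balance rule quoted earlier: the mass of c carried by the syngas is the sum over the c-bearing compounds co, co2 and ch4, weighted by the molar-mass ratio. the gas-phase compound masses in the second part are hypothetical placeholders, not measured values.

```python
# general balance of the first experiment (entering and leaving masses, kg,
# taken from table 4); the loss is the entering minus the leaving material
entering = {"coal": 521.0, "air": 1504.13, "oxygen": 21.42}
leaving = {"unburned coal": 167.0, "ash": 75.18, "gas": 1403.23, "condensate": 84.75}

loss = sum(entering.values()) - sum(leaving.values())
print(round(loss, 2), "kg,", round(100 * loss / sum(entering.values()), 2), "%")
# -> ~316.39 kg, ~15.46 % (the loss reported for the first experiment)

# atomic balance of c in the syngas: each compound k with one c atom per
# molecule contributes m_k * M_C / M_k to the c mass leaving in the gas
M = {"co": 28.01, "co2": 44.01, "ch4": 16.04}   # molar masses, g/mol
M_C = 12.011
gas_mass = {"co": 120.0, "co2": 300.0, "ch4": 20.0}  # kg, placeholder values only
m_C_gas = sum(gas_mass[k] * M_C / M[k] for k in gas_mass)
print(round(m_C_gas, 2), "kg of c leaves in the syngas")
```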
4. summary and conclusions

this paper describes the proposal of a general and atomic species mass balance for improving the knowledge about the ucg process. the information about the composition of the entering and leaving materials in the ucg process is significant for the optimization and efficiency of the process. this balance can provide information about the coal mass gasified (i.e., cavity formation) during the ucg process by tracking atoms, e.g., the c atom in the syngas. the information about the c atom can be obtained from the input coal on the input side and from the syngas (i.e., molecular compounds such as co, co2, and ch4) on the output side. the proposed mass balance was verified on the data obtained from six laboratory experiments. there is a visible difference between the input and output sides of the mass balance in the results from the described experiments. an examination of the overall mass flow clearly indicates that the process suffers from a mass loss when:
• the flow meter on one of the streams is faulty,
• a leak of the gasification agent or syngas is present,
• the syngas analysis is wrong.
we must avoid these failures to make the proposed mass balance calculation for the ucg process more precise. the crushed coal caused a higher syngas leak through the generator; for this reason, it is imperative to know the coal seam permeability and cracks in real and laboratory conditions. the achieved accuracy of the calculation of the mass balance components confirmed that the proposed approach is suitable for testing under real conditions. the results of the mass balance in real conditions could determine the amount of gasified coal at a specific time (i.e., cavity growth). the information about the syngas composition and the proportion of important atoms can be used in the control of the ucg process, e.g., control of the gasification agent ratio (i.e., the ratio of air, o2, and also h2o) to improve the syngas calorific value. it will also enable the proposal of new approaches to the control of the ucg process.

acknowledgements

this work was supported by the slovak research and development agency under the contracts no. apvv-14-0892 and apvv-18-0526.
eigenstructure assignment by state-derivative and partial output-derivative feedback for linear time-invariant control systems

t. h. s. abdelaziz, m. valášek

abstract: this paper introduces a parametric approach for solving the problem of eigenstructure assignment via state-derivative feedback for linear time-invariant control systems. this problem is always solvable for any controllable system if the open-loop system matrix is nonsingular. in this work, a parametric solution for the feedback gain matrix is introduced that describes the available degrees of freedom offered by state-derivative feedback in selecting the associated eigenvectors from an admissible class. these freedoms can be utilized to improve the robustness of the closed-loop system. finally, the eigenstructure assignment problem via partial output-derivative feedback is introduced. numerical examples are included to show the effectiveness of the proposed approach.

keywords: eigenstructure assignment, state-derivative feedback, output-derivative feedback, linear systems, feedback stabilization, parametrization.

1 introduction

eigenstructure assignment is one of the basic techniques for designing linear control systems. the eigenstructure assignment problem is that of assigning both a given self-conjugate set of eigenvalues and the corresponding eigenvectors. assigning the eigenvalues allows one to alter the stability characteristics of the system, while assigning the eigenvectors alters the transient response of the system. eigenstructure assignment via state feedback has produced design methods for a wide class of linear systems under full-state feedback with the objective of stabilizing control systems. parametric solutions of eigenstructure assignment for state feedback have been studied by many researchers [4–9]. however, this paper focuses on a special feedback that uses only state derivatives instead of the full state; this feedback is therefore called state-derivative feedback, and the problem of arbitrary eigenstructure assignment using full state-derivative feedback naturally arises. the motivation for state-derivative feedback comes from controlled vibration suppression of mechanical systems. the main sensors of vibration are accelerometers. from accelerations it is possible to reconstruct velocities with reasonable accuracy, but not displacements. the available signals for feedback are therefore accelerations and velocities only, and these are exactly the derivatives of the states of mechanical systems, which are velocities and displacements. direct measurement of the full state is thus difficult to achieve. one necessary condition for a control strategy to be implementable is that it must use the available measured responses to determine the control action.
all of the previous research work in control has assumed that all of the states can be directly measured (i.e., full-state feedback). to control this class of systems, many papers (e.g. [10–15]) have been published describing acceleration feedback for controlled vibration suppression. however, the eigenstructure assignment approach for feedback gain determination has either not been used at all or has not been solved generally. other papers dealing with acceleration feedback for mechanical systems are [16, 17], but there the feedback uses all states (positions, velocities) and, additionally, accelerations. only recently, abdelaziz and valášek [1–3] have presented a pole placement technique by state-derivative feedback for single-input and multi-input, time-invariant and time-varying linear systems. this paper proposes a parametric approach for eigenstructure assignment in linear systems via state-derivative and partial output-derivative feedback. the necessary and sufficient conditions for the existence of a solution to the eigenstructure assignment problem are described. the proposed controller is based on the measurement and feedback of the state derivatives of the system. this work extends previous state-feedback techniques and modifies them for state-derivative feedback. finally, numerical examples are included to demonstrate the effectiveness of this approach. the main contribution of this work is an efficient technique that solves the eigenstructure assignment problem for state-derivative feedback systems; the procedure defined here represents a unique treatment of the extension of the eigenstructure assignment technique to derivative feedback in the literature. the paper is organized as follows. the next section introduces the eigenstructure assignment problem formulation via state-derivative feedback; the necessary and sufficient conditions that ensure solvability are described, and the parametric solution of the eigenstructure assignment problem via state-derivative feedback for distinct and repeated right eigenvectors is presented. section 3 introduces the eigenstructure assignment problem by partial output-derivative feedback. section 4 presents illustrative examples. finally, conclusions are discussed in section 5.
2 eigenstructure assignment by state-derivative feedback for time-invariant systems

in this section, we present a parametric approach for solving the eigenstructure assignment problem via state-derivative feedback for linear time-invariant systems.

2.1 eigenstructure assignment problem formulation

consider a linear, time-invariant, completely controllable system

$\dot{x}(t) = A\,x(t) + B\,u(t), \qquad x(t_0) = x_0,$  (1)

where $x(t) \in \mathbb{R}^n$ and $u(t) \in \mathbb{R}^m$ are the state and the control input vector, respectively ($m \le n$), while $A \in \mathbb{R}^{n \times n}$ and $B \in \mathbb{R}^{n \times m}$ are the system and control gain matrices, respectively. the fundamental assumption imposed on the system is that the system is completely controllable and that $B$ has full column rank $m$. the objective is to stabilize the system by means of a linear feedback that enforces a desired characteristic behavior of the states. the eigenstructure assignment problem is to find the state-derivative feedback control law

$u(t) = -K\,\dot{x}(t)$  (2)

which assigns the prescribed closed-loop eigenvalues and the corresponding eigenvectors that stabilize the system and achieve the desired performance. the closed-loop system dynamics then becomes

$\dot{x}(t) = (I_n + BK)^{-1} A\, x(t),$  (3)

where $I_n \in \mathbb{R}^{n \times n}$ is the identity matrix. in what follows, we assume that $(I_n + BK)$ has full rank, so that the closed-loop system is well defined. the closed-loop characteristic polynomial is given by

$\det\bigl(\lambda I_n - (I_n + BK)^{-1} A\bigr) = 0.$  (4)

denote the right eigenvector matrix of the closed-loop matrix $(I_n + BK)^{-1} A$ by $V$; we then have, by definition,

$(I_n + BK)^{-1} A\, V = V \Lambda,$  (5)

where $\Lambda = \mathrm{diag}\{\lambda_1, \dots, \lambda_n\}$ and $\det(V) \ne 0$. we now formulate the eigenstructure assignment problem via state-derivative feedback as follows.

eigenstructure assignment problem 1: given the controllable pair $(A, B)$ and the desired self-conjugate set $\{\lambda_1, \dots, \lambda_n\} \subset \mathbb{C}$, select the appropriate state-derivative feedback gain matrix $K \in \mathbb{R}^{m \times n}$ that will make the closed-loop matrix $(I_n + BK)^{-1} A$ have the admissible eigenvalues and the associated set of eigenvectors $V$.

now, we will prove the necessary and sufficient conditions for the existence of a solution to the eigenstructure assignment problem via state-derivative feedback. these conditions are presented in the following theorem.

theorem 1: the eigenstructure assignment problem for the real pair $(A, B)$ is solvable for any arbitrary self-conjugate set of closed-loop poles if and only if $(A, B)$ is completely controllable, that is, $\mathrm{rank}\,[B, AB, \dots, A^{n-1}B] = n$ and $\mathrm{rank}\,[\lambda I_n - A,\ B] = n$ for all $\lambda \in \mathbb{C}$, and the matrix $A$ is nonsingular.

proof: for the closed-loop matrix to be defined, the matrix $(I_n + BK)$ must have full rank, $\det(I_n + BK) \ne 0$. then, (5) can be rewritten as

$A V = V \Lambda + B K V \Lambda,$  (6)

which can be written as

$B K = A V \Lambda^{-1} V^{-1} - I_n.$  (7)

then,

$\det(I_n + BK) = \det(I_n + A V \Lambda^{-1} V^{-1} - I_n) = \det(A V \Lambda^{-1} V^{-1}) \ne 0,$  (8)

since $V$ and $\Lambda$ must be nonsingular. this leads to $\det(A) \ne 0$, i.e., the matrix $A$ must have full rank in order for the closed-loop matrix to be defined. thus, the necessary and sufficient conditions for the existence of a solution to the eigenstructure assignment problem via state-derivative feedback are that the system is completely controllable and that all eigenvalues of the original system are nonzero ($A$ has full rank). we remark that the requirement that the matrix $\Lambda$ is diagonal, together with the invertibility of $V$, ensures that the closed-loop system is non-defective.
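theorem 1 reduces to two finite numerical checks: the rank of the controllability matrix and the nonsingularity of $A$. the short python sketch below (our own illustration under our own naming, not the authors' code) performs both tests; the test system is the one used later in example 1.

```python
import numpy as np

def sdf_assignable(A, B, tol=1e-9):
    """check the two conditions of theorem 1: (A, B) controllable
    and A nonsingular (no zero open-loop eigenvalue)."""
    n = A.shape[0]
    # controllability matrix [B, AB, ..., A^(n-1) B]
    ctrb = np.hstack([np.linalg.matrix_power(A, k) @ B for k in range(n)])
    return np.linalg.matrix_rank(ctrb, tol) == n and abs(np.linalg.det(A)) > tol

A = np.array([[1.0, 2.0], [0.0, 3.0]])
B = np.array([[0.0], [1.0]])
print(sdf_assignable(A, B))   # True: poles can be assigned by derivative feedback
```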
based on the above necessary and sufficient conditions, the parametric formula for the state-derivative feedback gain matrix $K$ that assigns the desired closed-loop poles is derived.

2.2 case of distinct eigenvalues

in this subsection, all the desired closed-loop eigenvalues $\{\lambda_1, \dots, \lambda_n\}$ are distinct. let $(\lambda_i, v_i)$ be a closed-loop eigenvalue and the corresponding eigenvector; then, for the right eigenvectors of the closed-loop system,

$(I_n + BK)^{-1} A\, v_i = \lambda_i v_i, \qquad i = 1, \dots, n.$  (9)

the above equation can be rewritten as

$A v_i = \lambda_i (I_n + BK)\, v_i,$  (10)

then

$A v_i = \lambda_i v_i + \lambda_i B K v_i.$  (11)

let $w_i = K v_i$; then we have

$A v_i = \lambda_i v_i + \lambda_i B w_i.$  (12)

then (12) can be rewritten in matrix form as

$[\lambda_i I_n - A, \ \lambda_i B] \begin{bmatrix} v_i \\ w_i \end{bmatrix} = 0, \qquad i = 1, \dots, n.$  (13)

thus, the subspace in which the closed-loop eigenvector $v_i$ and the gain-eigenvector product $w_i$ must lie is the null space $\mathrm{null}\,[\lambda_i I_n - A, \ \lambda_i B]$, $i = 1, \dots, n$. finally, the parametric equation for the right eigenvector can be expressed as

$[\lambda_i I_n - A, \ \lambda_i B]\ \zeta_i = 0, \qquad \zeta_i = \begin{bmatrix} v_i \\ w_i \end{bmatrix}, \qquad w_i = K v_i.$  (14)

here $v_i \in \mathbb{C}^n$ and $K \in \mathbb{R}^{m \times n}$ are unknown, and notational simplicity is gained by defining the $(n + m) \times 1$ vector $\zeta_i$. the determination of $K$ consists of two general steps. first, a sufficient number of solution vectors $\zeta_i$ is found. then, the internal structure among the components of these vectors is used to find the feedback gain matrix $K$. a real feedback gain matrix $K$ exists if and only if the following three conditions are satisfied:
1. the assigned eigenvalues are symmetric with respect to the real axis.
2. the vectors $\{v_i \in \mathbb{C}^n,\ i = 1, \dots, n\}$ are linearly independent, and for complex-conjugate poles $\lambda_i = \bar{\lambda}_j$, $v_i = \bar{v}_j$.
3. there exists a set of vectors $\{w_i \in \mathbb{C}^m,\ i = 1, \dots, n\}$ satisfying (13), with $w_i = \bar{w}_j$ for $\lambda_i = \bar{\lambda}_j$.

the parametric formula for the state-derivative feedback gain matrix $K$ that assigns the desired closed-loop eigenvalues and eigenvectors follows from

$K\,[v_1(\lambda_1), \dots, v_n(\lambda_n)] = [w_1(\lambda_1), \dots, w_n(\lambda_n)].$  (15)

finally, the gain matrix can be computed as

$K = W V^{-1},$  (16)

where $V = [v_1(\lambda_1), \dots, v_n(\lambda_n)]$ and $W = [w_1(\lambda_1), \dots, w_n(\lambda_n)]$. this is a parametric solution of the eigenstructure assignment problem via state-derivative feedback. it is clear that in the multi-input case there are infinitely many solutions for the feedback gain. eigenstructure assignment uses the extra degrees of freedom in the underdetermined solution to specify the closed-loop right eigenvectors $V$ corresponding to the desired self-conjugate set $\{\lambda_i\}$, $i = 1, \dots, n$.
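the two-step construction of this subsection translates directly into a few lines of numerical code. the sketch below is a minimal illustration under our own naming (not the authors' implementation); it assumes distinct, real, nonzero desired poles and simply takes the first basis vector of each null space as the parameter vector $\zeta_i$; any other admissible choice of the free parameters would serve equally, since scaling $\zeta_i$ leaves $K = WV^{-1}$ unchanged.

```python
import numpy as np
from scipy.linalg import null_space

def sdf_gain(A, B, poles):
    """eigenstructure assignment u = -K x_dot for distinct real poles:
    pick zeta_i = [v_i; w_i] from null([l_i*I - A, l_i*B]) (eqs. (13)-(14)),
    then K = W V^{-1} (eq. (16))."""
    n = B.shape[0]
    V, W = [], []
    for l in poles:
        zeta = null_space(np.hstack([l * np.eye(n) - A, l * B]))[:, 0]
        V.append(zeta[:n]); W.append(zeta[n:])
    return np.column_stack(W) @ np.linalg.inv(np.column_stack(V))

A = np.array([[1.0, 2.0], [0.0, 3.0]])
B = np.array([[0.0], [1.0]])
K = sdf_gain(A, B, [-3.0, -4.0])
Acl = np.linalg.inv(np.eye(2) + B @ K) @ A   # closed loop of eq. (3)
print(np.sort(np.linalg.eigvals(Acl)))       # ~[-4., -3.]
```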
2.3 case of repeated eigenvalues

our main task is to find a parametric solution in the case that some or all of the desired closed-loop eigenvalues $\{\lambda_i\}$ are repeated. let $\{\lambda_i \in \mathbb{C},\ i = 1, \dots, s\}$, $s \le n$, be the set of desired eigenvalues, and denote the algebraic and geometric multiplicity of $\lambda_i$ by $m_i$ and $q_i$, respectively ($q_i \le m_i$). the lengths of the $q_i$ chains of generalized eigenvectors associated with $\lambda_i$ are denoted by $p_{ij}$ ($j = 1, \dots, q_i$), satisfying $\sum_{i=1}^{s} \sum_{j=1}^{q_i} p_{ij} = n$. the right generalized eigenvectors of the closed-loop system associated with $\lambda_i$ are denoted by $v_{ij}^k \in \mathbb{C}^n$, $i = 1, \dots, s$, $j = 1, \dots, q_i$, $k = 1, \dots, p_{ij}$. then it is satisfied that

$(I_n + BK)^{-1} A\, v_{ij}^k = \lambda_i v_{ij}^k + v_{ij}^{k-1}, \qquad v_{ij}^0 = 0,$  (17)

for $i = 1, \dots, s$, $j = 1, \dots, q_i$, $k = 1, \dots, p_{ij}$. this equation demonstrates the relation of the assignable right generalized eigenvectors to the corresponding eigenvalue, and can be rewritten as

$A v_{ij}^k = \lambda_i (I_n + BK)\, v_{ij}^k + (I_n + BK)\, v_{ij}^{k-1}, \qquad v_{ij}^0 = 0.$  (18)

let $w_{ij}^k = K v_{ij}^k$, $i = 1, \dots, s$, $j = 1, \dots, q_i$, $k = 1, \dots, p_{ij}$, and define the notations $V = [V_1, \dots, V_s]$, $V_i = [V_{i1}, \dots, V_{iq_i}]$, $V_{ij} = [v_{ij}^1, \dots, v_{ij}^{p_{ij}}]$; the set of $w_{ij}^k$ is arranged in the matrix $W$ in the same manner as the set of $v_{ij}^k$. this leads to

$[\lambda_i I_n - A, \ \lambda_i B] \begin{bmatrix} v_{ij}^k \\ w_{ij}^k \end{bmatrix} = -[I_n, \ B] \begin{bmatrix} v_{ij}^{k-1} \\ w_{ij}^{k-1} \end{bmatrix}, \qquad v_{ij}^0 = 0, \ w_{ij}^0 = 0.$  (19)

finally, the parametric equation for the right generalized eigenvectors can be expressed as

$[\lambda_i I_n - A, \ \lambda_i B]\ \zeta_{ij}^k = -[I_n, \ B]\ \zeta_{ij}^{k-1}, \qquad \zeta_{ij}^k = \begin{bmatrix} v_{ij}^k \\ w_{ij}^k \end{bmatrix}, \quad \zeta_{ij}^0 = 0,$  (20)

for $i = 1, \dots, s$, $j = 1, \dots, q_i$, $k = 1, \dots, p_{ij}$. similarly to the case of distinct eigenvalues, the $(n+m)$-dimensional parameter vectors $\zeta_{ij}^k$ are chosen arbitrarily, under the condition that the columns of the eigenvector matrix $V$ are linearly independent. a parametric solution $K$ of the eigenstructure assignment problem via state-derivative feedback is given by

$K = W V^{-1},$  (21)

where $V = [V_1(\lambda_1), \dots, V_s(\lambda_s)]$ and $W = [W_1(\lambda_1), \dots, W_s(\lambda_s)]$. the feedback gain matrix is thus parameterized directly in terms of the eigenstructure of the closed-loop system, which can be selected to ensure robustness by exploiting the freedom of these parameters. a real feedback gain matrix $K$ such that (19) holds exists if and only if the following three conditions are satisfied:
1. the assigned eigenvalues are symmetric with respect to the real axis.
2. the vectors $\{v_{ij}^k \in \mathbb{C}^n\}$ are linearly independent, and for complex-conjugate poles $\lambda_{i_1} = \bar{\lambda}_{i_2}$, $v_{i_1 j}^k = \bar{v}_{i_2 j}^k$.
3. there exists a set of vectors $\{w_{ij}^k \in \mathbb{C}^m\}$ satisfying (19), with $w_{i_1 j}^k = \bar{w}_{i_2 j}^k$ for $\lambda_{i_1} = \bar{\lambda}_{i_2}$.

from the above results we observe that the system poles can always be assigned by state-derivative feedback for any controllable system if and only if all eigenvalues of the original system are nonzero. in the single-input case, $m = 1$, there is at most one solution. in the multi-input case, $m > 1$, the solution is generally non-unique, and extra conditions must be imposed to specify a solution. the extra freedom can be used to give the closed-loop system other desirable properties, for example, to decrease the norm of the feedback gain matrix or to improve the conditioning of the eigenvalues of the closed-loop matrix. additionally, the robustness of the closed-loop system against system parameter perturbations is increased. this issue becomes very important when the system model is not sufficiently precise or the system is subject to parameter uncertainty.
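the chain construction of eqs. (19)–(21) can likewise be sketched numerically. the following minimal python illustration (our own code, not the authors') builds a length-2 chain for a double pole at −1 on a small test system; any particular solution of (20) works, as long as the columns of $V$ stay linearly independent, which is assumed here.

```python
import numpy as np
from scipy.linalg import null_space

A = np.array([[1.0, 2.0], [0.0, 3.0]]); B = np.array([[0.0], [1.0]])
lam, n = -1.0, 2
M = np.hstack([lam * np.eye(n) - A, lam * B])   # left block of eqs. (13)/(19)
z1 = null_space(M)[:, 0]                        # chain start: eq. (19) with zeta^0 = 0
rhs = -np.hstack([np.eye(n), B]) @ z1           # right-hand side -[I, B] zeta^1
z2 = np.linalg.lstsq(M, rhs, rcond=None)[0]     # one particular solution of eq. (20)

V = np.column_stack([z1[:n], z2[:n]])
W = np.column_stack([z1[n:], z2[n:]])
K = W @ np.linalg.inv(V)                        # eq. (21)
Acl = np.linalg.inv(np.eye(n) + B @ K) @ A
print(K, np.linalg.eigvals(Acl))                # both closed-loop eigenvalues at -1
```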
3 eigenstructure assignment by partial output-derivative feedback for time-invariant systems

let us now consider eigenstructure assignment via output-derivative feedback for linear time-invariant systems. in practice, it may be expensive to measure all the state-derivative variables, or they may not all be available for measurement. we then feed back some of the output derivatives via a controller. output-derivative feedback is a more difficult problem than state-derivative feedback. consider a linear, time-invariant, completely controllable and observable system

$\dot{x}(t) = A\,x(t) + B\,u(t), \quad x(t_0) = x_0, \qquad y(t) = C\,x(t),$  (22)

where the state $x(t) \in \mathbb{R}^n$, the output $y(t) \in \mathbb{R}^r$ and the control input vector $u(t) \in \mathbb{R}^m$, while $A \in \mathbb{R}^{n \times n}$, $B \in \mathbb{R}^{n \times m}$ and $C \in \mathbb{R}^{r \times n}$ are the system, control and output gain matrices, respectively. the fundamental assumptions imposed on the system are that the system is completely controllable and observable; additionally, the matrices $B$ and $C$ have full rank. the objective is to stabilize the system by means of a linear feedback that enforces a desired characteristic behavior of the states. the eigenstructure assignment problem is to find the output-derivative feedback control law

$u(t) = -F\,\dot{y}(t),$  (23)

which assigns prescribed closed-loop eigenvalues that stabilize the system and achieve the desired performance; the closed-loop system dynamics becomes

$\dot{x}(t) = (I_n + BFC)^{-1} A\, x(t).$  (24)

in what follows, we assume that $(I_n + BFC)$ has full rank, so that the closed-loop system is well defined. the closed-loop characteristic equation of this system is given by

$\det\bigl(\lambda I_n - (I_n + BFC)^{-1} A\bigr) = 0.$  (25)

denote the right eigenvector matrix of the matrix $(I_n + BFC)^{-1} A$ by $V$; then we have

$(I_n + BFC)^{-1} A\, V = V \Lambda.$  (26)

the eigenstructure assignment problem for output-derivative feedback is now as follows.

eigenstructure assignment problem 2: given the real triple $(A, B, C)$ and the desired self-conjugate set $\{\lambda_1, \dots, \lambda_n\} \subset \mathbb{C}$, select the appropriate output-derivative feedback matrix $F \in \mathbb{R}^{m \times r}$ that will make the closed-loop matrix $(I_n + BFC)^{-1} A$ have the admissible eigenvalues and the corresponding right generalized eigenvectors $V$.

the necessary and sufficient conditions for the existence of a solution to the eigenstructure assignment problem via output-derivative feedback are presented in the following theorem.

theorem 2: the eigenstructure assignment problem for the triple $(A, B, C)$ is solvable for any arbitrary self-conjugate set of closed-loop poles if and only if $(A, B)$ is completely controllable, that is, $\mathrm{rank}\,[B, AB, \dots, A^{n-1}B] = n$ and $\mathrm{rank}\,[\lambda I_n - A, \ B] = n$ for all $\lambda \in \mathbb{C}$; the pair $(A, C)$ is completely observable, that is, $\mathrm{rank}\,[C^{\top}, A^{\top}C^{\top}, \dots, (A^{\top})^{n-1}C^{\top}] = n$ and $\mathrm{rank}\,[\lambda I_n - A^{\top}, \ C^{\top}] = n$ for all $\lambda \in \mathbb{C}$; and the matrix $A$ is nonsingular.

proof: for the closed-loop matrix to be defined, the matrix $(I_n + BFC)$ must have full rank, $\det(I_n + BFC) \ne 0$. then, (26) can easily be put in the following form:

$A V = V \Lambda + B F C V \Lambda,$  (27)

which can be written as

$B F C = A V \Lambda^{-1} V^{-1} - I_n.$  (28)

then,

$\det(I_n + BFC) = \det(I_n + A V \Lambda^{-1} V^{-1} - I_n) = \det(A V \Lambda^{-1} V^{-1}) \ne 0,$  (29)

since $V$ and $\Lambda$ must be nonsingular. thus, the matrix $A$ must have full rank in order for the closed-loop matrix to be defined. from the above results we observe that the necessary and sufficient conditions for the existence of a solution to the eigenstructure assignment problem via output-derivative feedback are that the system is completely controllable and observable and that all eigenvalues of the original system are nonzero ($A$ has full rank). our main task in this section is to find a parametric solution for the gain matrix in the case of right generalized eigenvectors. let $\{\lambda_i \in \mathbb{C},\ i = 1, \dots, s\}$, $s \le n$, be the set of desired eigenvalues, and denote the algebraic and geometric multiplicity of $\lambda_i$ by $m_i$ and $q_i$, respectively.
the lengths of the $q_i$ chains of generalized eigenvectors associated with $\lambda_i$ are denoted by $p_{ij}$, satisfying $\sum_{i=1}^{s} \sum_{j=1}^{q_i} p_{ij} = n$. the right generalized eigenvectors of the closed-loop system associated with $\lambda_i$ are denoted by $v_{ij}^k \in \mathbb{C}^n$, $i = 1, \dots, s$, $j = 1, \dots, q_i$, $k = 1, \dots, p_{ij}$. then we have the following relation:

$(I_n + BFC)^{-1} A\, v_{ij}^k = \lambda_i v_{ij}^k + v_{ij}^{k-1}, \qquad v_{ij}^0 = 0,$  (30)

for $i = 1, \dots, s$, $j = 1, \dots, q_i$, $k = 1, \dots, p_{ij}$. this equation demonstrates the relation of the assignable right generalized eigenvectors to the eigenvalue, and can be rewritten as

$A v_{ij}^k = \lambda_i (I_n + BFC)\, v_{ij}^k + (I_n + BFC)\, v_{ij}^{k-1}, \qquad v_{ij}^0 = 0.$  (31)

let $z_{ij}^k = F C v_{ij}^k$, $i = 1, \dots, s$, $j = 1, \dots, q_i$, $k = 1, \dots, p_{ij}$, and define the notations $V = [V_1, \dots, V_s]$, $V_i = [V_{i1}, \dots, V_{iq_i}]$, $V_{ij} = [v_{ij}^1, \dots, v_{ij}^{p_{ij}}]$; the set of $z_{ij}^k$ is arranged in the matrix $Z$ in the same manner as the set of $v_{ij}^k$. the above equation can be rewritten as

$[\lambda_i I_n - A, \ \lambda_i B] \begin{bmatrix} v_{ij}^k \\ z_{ij}^k \end{bmatrix} = -[I_n, \ B] \begin{bmatrix} v_{ij}^{k-1} \\ z_{ij}^{k-1} \end{bmatrix}, \qquad v_{ij}^0 = 0, \ z_{ij}^0 = 0.$  (32)

the right generalized eigenvectors can then be expressed parametrically as

$[\lambda_i I_n - A, \ \lambda_i B]\ \zeta_{ij}^k = -[I_n, \ B]\ \zeta_{ij}^{k-1}, \qquad \zeta_{ij}^k = \begin{bmatrix} v_{ij}^k \\ z_{ij}^k \end{bmatrix}, \quad \zeta_{ij}^0 = 0,$  (33)

for $i = 1, \dots, s$, $j = 1, \dots, q_i$, $k = 1, \dots, p_{ij}$. similarly to the case of state-derivative feedback, a real matrix $F$ exists if and only if the $(n + m) \times 1$ parameter vectors $\zeta_{ij}^k$ can be chosen, arbitrarily, under the condition that the columns of the matrix $V$ are linearly independent. finally, since $F\,C V = Z$, a real gain matrix $F$ is expressed as

$F = Z\,(C V)^{-1},$  (34)

where $V = [V_1(\lambda_1), \dots, V_s(\lambda_s)]$ and $Z = [Z_1(\lambda_1), \dots, Z_s(\lambda_s)]$. the feedback gain matrix $F$ is thus parameterized directly in terms of the eigenstructure of the closed-loop system. a real feedback gain matrix $F$ such that (32) holds exists if and only if the following three conditions are satisfied:
1. the assigned eigenvalues are symmetric with respect to the real axis.
2. the vectors $\{v_{ij}^k \in \mathbb{C}^n\}$ are linearly independent, and for complex-conjugate poles $\lambda_{i_1} = \bar{\lambda}_{i_2}$, $v_{i_1 j}^k = \bar{v}_{i_2 j}^k$.
3. there exists a set of vectors $\{z_{ij}^k \in \mathbb{C}^m\}$ satisfying (32), with $z_{i_1 j}^k = \bar{z}_{i_2 j}^k$ for $\lambda_{i_1} = \bar{\lambda}_{i_2}$.

4 illustrative examples

in this section, we present numerical examples to illustrate the feasibility and effectiveness of the proposed eigenstructure assignment technique.

example 1: consider a single-input linear system described in the state-space form

$\dot{x}(t) = \begin{bmatrix} 1 & 2 \\ 0 & 3 \end{bmatrix} x(t) + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u(t).$

in the following, we consider the assignment of two different cases.

case 1: the desired closed-loop eigenvalues are selected as $\{-3, -4\}$. for $\lambda_1 = -3$,

$[\lambda_1 I_n - A, \ \lambda_1 B] = \begin{bmatrix} -4 & -2 & 0 \\ 0 & -6 & -3 \end{bmatrix};$

arbitrarily selecting $\zeta_1 = (-0.5,\ 1,\ -2)^{\top}$ leads to $v_1 = (-0.5,\ 1)^{\top}$ and $w_1 = K v_1 = -2$. for $\lambda_2 = -4$,

$[\lambda_2 I_n - A, \ \lambda_2 B] = \begin{bmatrix} -5 & -2 & 0 \\ 0 & -7 & -4 \end{bmatrix};$

taking $\zeta_2 = (-0.4,\ 1,\ -7/4)^{\top}$ leads to $v_2 = (-0.4,\ 1)^{\top}$ and $w_2 = K v_2 = -7/4$. taking the two equations together, we can write

$K \begin{bmatrix} -0.5 & -0.4 \\ 1 & 1 \end{bmatrix} = [-2, \ -7/4].$

finally, the unique state-derivative feedback gain matrix is $K = [2.5, \ -0.75]$.

case 2: the desired eigenvalues are $\{-1, -1\}$. for $\lambda = -1$,

$[\lambda I_n - A, \ \lambda B] = \begin{bmatrix} -2 & -2 & 0 \\ 0 & -4 & -1 \end{bmatrix},$

which leads to $\zeta^1 = (-1,\ 1,\ -4)^{\top}$ and $K(-1,\ 1)^{\top} = -4$.
the generalized eigenvector equation is $[\lambda I_n - A, \ \lambda B]\ \zeta^2 = -[I_n, \ B]\ \zeta^1$, i.e.,

$\begin{bmatrix} -2 & -2 & 0 \\ 0 & -4 & -1 \end{bmatrix} \zeta^2 = \begin{bmatrix} 1 \\ 3 \end{bmatrix},$

then $\zeta^2 = (-1.5,\ 1,\ -7)^{\top}$ and $K(-1.5,\ 1)^{\top} = -7$. the gain equation can then be written as

$K \begin{bmatrix} -1 & -1.5 \\ 1 & 1 \end{bmatrix} = [-4, \ -7].$

finally, the unique feedback gain is $K = [6, \ 2]$.

example 2: consider a multi-input controllable linear system

$\dot{x}(t) = \begin{bmatrix} 1 & 2 \\ 0 & 3 \end{bmatrix} x(t) + \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} u(t).$

in the following, we consider the assignment of two different cases.

case 1: the desired closed-loop eigenvalues are selected as $\{-3, -5\}$. for $\lambda_1 = -3$,

$[\lambda_1 I_n - A, \ \lambda_1 B] = \begin{bmatrix} -4 & -2 & -3 & 0 \\ 0 & -6 & 0 & -3 \end{bmatrix};$

taking $v_1 = (a,\ b)^{\top}$ leads to $\zeta_1 = (a,\ b,\ -(4a + 2b)/3,\ -2b)^{\top}$ and

$K \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} -(4a + 2b)/3 \\ -2b \end{bmatrix}.$

for $\lambda_2 = -5$,

$[\lambda_2 I_n - A, \ \lambda_2 B] = \begin{bmatrix} -6 & -2 & -5 & 0 \\ 0 & -8 & 0 & -5 \end{bmatrix};$

choosing $v_2 = (c,\ d)^{\top}$ gives $\zeta_2 = (c,\ d,\ -(6c + 2d)/5,\ -8d/5)^{\top}$ and

$K \begin{bmatrix} c \\ d \end{bmatrix} = \begin{bmatrix} -(6c + 2d)/5 \\ -8d/5 \end{bmatrix}.$

the feedback gain matrix equation is then

$K \begin{bmatrix} a & c \\ b & d \end{bmatrix} = \begin{bmatrix} -(4a + 2b)/3 & -(6c + 2d)/5 \\ -2b & -8d/5 \end{bmatrix}.$

any values of $a$, $b$, $c$ and $d$ give a valid gain matrix as long as the required inverse exists. to obtain an orthogonal set of closed-loop eigenvectors, we set $a = d = 1$ and $b = c = 0$. the gain matrix is

$K = \begin{bmatrix} -4/3 & -2/5 \\ 0 & -8/5 \end{bmatrix}.$

case 2: the desired closed-loop poles are $\{-1, -1\}$. for $\lambda_1 = -1$,

$[\lambda_1 I_n - A, \ \lambda_1 B] = \begin{bmatrix} -2 & -2 & -1 & 0 \\ 0 & -4 & 0 & -1 \end{bmatrix};$

taking $v_1^1 = (a,\ b)^{\top}$ leads to $\zeta_1^1 = (a,\ b,\ -2a - 2b,\ -4b)^{\top}$. the generalized eigenvector equation $[\lambda_1 I_n - A, \ \lambda_1 B]\ \zeta_1^2 = -[I_n, \ B]\ \zeta_1^1$ reads

$\begin{bmatrix} -2 & -2 & -1 & 0 \\ 0 & -4 & 0 & -1 \end{bmatrix} \zeta_1^2 = \begin{bmatrix} a + 2b \\ 3b \end{bmatrix};$

taking $v_1^2 = (c,\ d)^{\top}$ leads to $\zeta_1^2 = (c,\ d,\ -a - 2b - 2c - 2d,\ -3b - 4d)^{\top}$. the gain matrix equation can then be written as

$K \begin{bmatrix} a & c \\ b & d \end{bmatrix} = \begin{bmatrix} -2a - 2b & -a - 2b - 2c - 2d \\ -4b & -3b - 4d \end{bmatrix}.$

we can set $a = d = 1$ and $b = c = 0$; the gain matrix is then

$K = \begin{bmatrix} -2 & -3 \\ 0 & -4 \end{bmatrix}.$

example 3: consider a controllable and observable multi-input linear system

$\dot{x}(t) = \begin{bmatrix} 0 & 1 \\ -3 & -4 \end{bmatrix} x(t) + \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} u(t), \qquad y(t) = [1 \ \ 1]\, x(t).$

in the following, we consider the first eigenvalue at $\lambda_1 = -5$:

$[\lambda_1 I_n - A, \ \lambda_1 B] = \begin{bmatrix} -5 & -1 & -5 & 0 \\ 3 & -1 & 0 & -5 \end{bmatrix};$

taking $v_1 = (a,\ b)^{\top}$ leads to $\zeta_1 = (a,\ b,\ -a - 0.2b,\ 0.6a - 0.2b)^{\top}$. this means

$F C v_1 = F\,(a + b) = \begin{bmatrix} -a - 0.2b \\ 0.6a - 0.2b \end{bmatrix},$

hence the gain matrix is

$F = \frac{1}{a + b} \begin{bmatrix} -a - 0.2b \\ 0.6a - 0.2b \end{bmatrix},$

which clearly indicates that the real output-derivative gain matrix is not unique; any choice of $a$ and $b$ with $a + b \ne 0$ is valid. evaluating the closed-loop system matrix $(I_n + BFC)^{-1} A$ then shows that the closed-loop system has eigenvalues at $-5$ and $-1$, regardless of the values of $a$ and $b$.
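the numerical claims of the three examples can be checked directly by substituting the gains into the closed-loop matrices of eqs. (3) and (24). the following sketch does so (our own code; the helper name cl_eigs is ours, and the sign conventions follow the reconstruction above):

```python
import numpy as np

def cl_eigs(A, B, K):
    """closed-loop spectrum of x_dot = (I + B K)^{-1} A x for u = -K x_dot"""
    n = A.shape[0]
    return np.sort(np.linalg.eigvals(np.linalg.inv(np.eye(n) + B @ K) @ A))

A = np.array([[1.0, 2.0], [0.0, 3.0]])
b1 = np.array([[0.0], [1.0]])
print(cl_eigs(A, b1, np.array([[2.5, -0.75]])))                  # example 1: ~[-4, -3]
print(cl_eigs(A, b1, np.array([[6.0, 2.0]])))                    # example 1: ~[-1, -1]
print(cl_eigs(A, np.eye(2), np.array([[-4/3, -2/5], [0.0, -8/5]])))  # example 2: ~[-5, -3]

# example 3: output-derivative feedback u = -F y_dot, closed loop (I + B F C)^{-1} A
A3 = np.array([[0.0, 1.0], [-3.0, -4.0]])
C = np.array([[1.0, 1.0]])
for a, b in [(1.0, 0.0), (0.0, 1.0), (2.0, 3.0)]:
    F = np.array([[-a - 0.2 * b], [0.6 * a - 0.2 * b]]) / (a + b)
    print(cl_eigs(A3, np.eye(2), F @ C))   # ~[-5, -1] for every admissible (a, b)
```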
it has been shown how the eigenstructure assignment approach can be used to design a controller based on state-derivative and partial output-derivative feedback, which yields a closed-loop system with the specified characteristics. the approach is relevant for design with preservation of stability when the necessary and sufficient conditions given above are satisfied. compared to state feedback, the state-derivative feedback controller in some cases achieves the same performance with lower gain elements. from the practical point of view, it is desirable to determine a feedback with small gains. intuitively, this must be advantageous, since small gains lead to smaller control signals, and thus to less energy consumption.

5 conclusions

this paper proposes a parametric approach for solving the eigenstructure assignment problem via state-derivative feedback. the necessary conditions that ensure solvability are that the system is controllable and that the open-loop system matrix is nonsingular. the main result of this work is an efficient algorithm for solving the eigenstructure assignment problem of linear systems via state-derivative feedback. the extra degrees of freedom in the choice of the feedback gains can be exploited further to improve the closed-loop robustness against perturbations. additionally, the eigenstructure assignment problem via partial output-derivative feedback is introduced. the numerical examples prove the feasibility and effectiveness of the proposed technique.

references

[1] abdelaziz t. h. s., valášek m.: "pole placement for siso linear systems by state-derivative feedback." (submitted to iee proceedings part d: control theory and applications).
v.: a comparison of h2 optimized design and cross-over point design for acceleration feedback control. proceedings of 39th aiaa/asme/ asce/ahs, structures, structural dynamics and materials conference, vol. 4, 3250-3258, (aiaa-98-2091), 1998. [14] olgac n., elmali h., hosek m., renzulli m.: “active vibration control of distributed systems using delayed resonator with acceleration feedback.” transactions of asme journal of dynamic systems, measurement and control, vol. 119 (1997), p. 380. [15] kejval j., sika z., valášek m.: active vibration suppression of a machine, proceedings of interaction and feedbacks 2000, ut av cr, praha, 2000, p. 75–80. [16] deur j., peric n.: a comparative study of servosystems with acceleration feedback. proceedings of the 35th ieee industry applications conference, roma, italy, vol. 2 (2000), p. 1533–1540. [17] ellis g.: cures for mechanical resonance in industrial servo systems. proceedings of pcim 2001 conference, nuremberg, 2001. [18] horn r. a., johnson c. r.: matrix analysis. cambridge university press, cambridge 1986. eng. taha h. s. abdelaziz, ph.d. e-mail: tahahelmy@yahoo.com doc. michael valášek, drsc. e-mail: valasek@fsik.cvut.cz department of mechanics czech technical university in prague faculty of mechanical engineering karlovo nám. 13 121 35 praha 2, czech republic 60 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 44 no. 4/2004 acta polytechnica doi:10.14311/ap.2016.56.0373 acta polytechnica 56(5):373–378, 2016 © czech technical university in prague, 2016 available online at http://ojs.cvut.cz/ojs/index.php/ap experimental investigation on chromium(vi) removal from aqueous solution using activated carbon resorcinol formaldehyde xerogels eghe a. oyedoha, b, ∗, michael c. ekwonuc a school of chemistry and chemical engineering, queen’s university belfast, northern ireland, united kingdom b department of chemical engineering, university of benin, benin city, edo state, nigeria c department of chemical & petroleum engineering, afe babalola university, ado-ekiti, nigeria ∗ corresponding author: egheoyedoh@uniben.edu abstract. the adsorption of chromium(vi) metal ion in aqueous solutions by activated carbon resorcinol formaldehyde xerogels (acrf) was investigated. the results showed that pore structure, surface area and the adsorbent surface chemistry are important factors in the control of the adsorption of chromium(vi) metal ions. the isotherm parameters were obtained from plots of the isotherms and from the application of langmuir and freundlich isotherms. based on regression analysis, the langmuir isotherm model was the best fit. the maximum adsorption capacity of acrf for chromium (vi) was 241.9 mg/g. the pseudo-second-order kinetic model was the best fit to the experimental data for the adsorption of chromium metal ions by activated carbon resorcinol formaldehyde xerogels. the thermodynamics of cr(vi) ions adsorption onto acrf was a spontaneous and endothermic process. keywords: adsorption, chromium(vi) ion, langmuir isotherm, freundlich isotherm, activated carbon resorcinol formaldehyde xerogels (acrf). 1. introduction chromium is used in various industries such as the metallurgical industry (steel, ferroand nonferrous alloys), refractories (chrome and chrome-magnesite), and in the chemical industry (pigments, electroplating and tanning) [1]. as a result of these industrial processes, large amounts of chromium compounds are discharged into the environment. 
these compounds are toxic and have negative effects on humans and the environment. persistent exposure to cr(vi) causes cancer in the digestive tract and lungs, and may cause other health problems, for instance skin dermatitis, bronchitis, perforation of the nasal septum, severe diarrhoea and haemorrhaging [2, 3]. the maximum level of chromium in drinking water permitted by the world health organization (who) is 0.05 mg/l [4]. cr(iii) and cr(vi) species are the two stable forms of chromium present in the environment. they have different chemical, biological and environmental characteristics. the most toxic form of chromium is cr(vi), which exists with oxygen as chromate (cro4^2−) or dichromate (cr2o7^2−) oxyanions. cr(vi) compounds are highly soluble and mobile. cr(iii) is less mobile, less toxic and is mainly found bound to organic matter in soil and aquatic environments [5]. typical methods for the removal of dissolved heavy metals from aqueous solution include chemical precipitation, chemical oxidation or reduction, filtration, ion exchange, electrochemical treatment and the application of membrane technology. however, these processes have some major drawbacks, which include incomplete metal removal, requirements for expensive equipment and monitoring systems, high reagent and energy requirements, and the generation of toxic sludge with special disposal requirements, especially in comparison with the application of low-cost adsorbents [6]. adsorption can be an effective method for the removal of chromium from aqueous solution, especially in combination with suitable regeneration steps, which resolves the problems associated with sludge disposal and makes the process more economically viable [6]. previous studies on the removal of cr(vi) using activated carbons produced from coconut shells [7], clays [8], wheat bran [9], rice husk [10], tyres and sawdust [11], etc. have been reported in the literature. this study investigates the adsorption of chromium(vi) onto activated carbon resorcinol formaldehyde xerogels using the langmuir and freundlich isotherms. the kinetics of adsorption was fitted with pseudo-first-order and pseudo-second-order models, and the controlling rate of adsorption was described by intra-particle diffusion.

2. material and methods

2.1. material

all chemical reagents and materials used were of analytical grade. deionised water (18.0 MΩ·cm) was used as the solvent in the preparation of stock solutions of chromium metal ions by dissolving 2.828 g of potassium dichromate in 1 dm3 of deionised water.

2.2. synthesis of activated carbon resorcinol formaldehyde xerogels

activated carbon obtained from the synthesis of resorcinol formaldehyde xerogels (acrf) was used for the adsorption studies [12]. the rf xerogels were synthesised from the polycondensation of resorcinol, c6h4(oh)2 (r), with formaldehyde, hcho (f), according to the method proposed by pekala et al. [13, 14]. rf solutions were prepared by mixing resorcinol (r), formaldehyde (f), sodium carbonate na2co3 (c) and distilled water. the solution was mixed vigorously for 45 min. the resorcinol/formaldehyde ratio r/f was fixed at 0.5, while the molar ratio r/c and the ratio r/w (g/cm3) were varied. the homogeneous clear solution was then poured into sealed glass vials to avoid water evaporating during the gelation process. the sealed vials were then placed in an oven set at 25 °c for 24 h.
the oven temperature was then increased to 60 °c for 48 h, and finally to 80 °c for an additional 24 h to complete the curing process. the wet gels were then removed from the oven and allowed to cool to room temperature. in order to remove water from the pores of the gels, the gels were immersed in acetone for solvent exchange at room temperature for three days. after the third day, the acetone was poured out and the gels were placed in a vacuum oven for drying. the gels were dried in the vacuum oven at 64 °c for 3 days.

2.3. adsorption studies

all adsorption experiments were carried out in batch reactors (glass bottles and beakers). stock solutions (1000 ppm) of cr(vi) metal ions were prepared. different concentrations of standard solutions (25, 50, 100, 150, 200 and 250 ppm) were prepared by appropriate dilutions of the stock solutions with deionised water. chromium(vi) concentrations were analysed at a 540 nm wavelength using a hach dr-2800 uv-visible spectrophotometer with 1,5-diphenylcarbazide reagent. the reagent was prepared by dissolving 250 mg of 1,5-diphenylcarbohydrazide in 50 ml of methanol (hplc grade). 250 ml of h2so4 solution (containing 14 ml of 98 % h2so4) was added to the above solution, which was then diluted with deionised water to 500 ml.

2.4. adsorption isotherms

the experimental data obtained from the batch tests were analysed using the langmuir and freundlich isotherms to determine the isotherm model that described the experimental data more accurately.

langmuir isotherm. the langmuir isotherm assumes monolayer, uniform and finite adsorption sites, so that saturation is reached, beyond which no further adsorption takes place. it is also based on the assumption that there is no interaction between the molecules adsorbed on neighbouring sites [15]. the model developed by langmuir (1916) is given by

$q_e = \frac{q_{max}\, b\, C_e}{1 + b\, C_e}.$  (1)

a very important characteristic of the langmuir isotherm can be expressed in terms of a dimensionless constant called the separation factor [6, 16]:

$R_L = \frac{1}{1 + b\, C_o}.$  (2)

freundlich isotherm. the freundlich isotherm is an empirical equation for multilayer adsorption on heterogeneous sites [17]. the freundlich isotherm is given by

$q_e = K_F\, C_e^{1/n}.$  (3)
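the non-linear regression used in section 3.2 below can be reproduced with a generic least-squares routine; the following python sketch (our own illustration) fits eqs. (1) and (3) with scipy's curve_fit. the equilibrium data points here are hypothetical placeholders, not the measured values of this study.

```python
import numpy as np
from scipy.optimize import curve_fit

def langmuir(ce, qmax, b):        # eq. (1)
    return qmax * b * ce / (1.0 + b * ce)

def freundlich(ce, kf, n):        # eq. (3)
    return kf * ce ** (1.0 / n)

# placeholder equilibrium data (mg/l, mg/g), not the paper's measurements
ce = np.array([0.5, 2.0, 5.0, 20.0, 60.0, 110.0])
qe = np.array([35.0, 95.0, 150.0, 210.0, 230.0, 240.0])

for model, p0 in [(langmuir, (250.0, 0.1)), (freundlich, (50.0, 3.0))]:
    popt, _ = curve_fit(model, ce, qe, p0=p0, maxfev=10000)
    r2 = 1.0 - np.sum((qe - model(ce, *popt)) ** 2) / np.sum((qe - qe.mean()) ** 2)
    print(model.__name__, np.round(popt, 4), "r2 =", round(r2, 4))
```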
3. results and discussion

3.1. effect of initial ph

the effect of ph on the adsorption of the cr(vi) metal ions was studied with the ph varied from 2.0 to 11.0. the studies were performed with a constant initial metal ion concentration of 100 ppm, an adsorbent dose of 1 g/l of solution and a contact time of 72 h. the adsorption of chromium(vi) (fig. 1) increases with the ph up to a maximum at ph 3, and thereafter decreases with a further increase in ph. this shows that the adsorption of chromium ions is ph dependent. the maximum adsorption at ph 3 may be attributed to the existence of chromium ions as hcro4−, which is the dominant form of cr(vi) at ph 3. the high adsorption of cr(vi) at ph 3 might be a result of electrostatic attraction between positively charged groups of the acrf surface and hcro4−. this can also be attributed to the surface charge on the acrf: the ph_zpc of acrf is 9.19, and below this ph the surface charge of the acrf is positive. hence, the adsorption of cr(vi) might also be due to electrostatic attraction between the positively charged adsorbent and the negatively charged hcro4− [18]. as the ph increased, the overall surface charge on the adsorbent became negative and adsorption decreased [19]. the decrease in removal at higher ph may also be due to the abundance of oh− ions, which compete with the negatively charged cr(vi) species for the active sites on the acrf.

figure 1. effect of ph on the adsorption of cr(vi) ions.

3.2. effect of initial chromium(vi) concentration on acrf

the effect of the initial chromium metal ion concentration on the adsorption was studied at the optimum ph of 3, which was observed in the previous study. the experimental data for the adsorption of cr(vi) onto activated carbon resorcinol formaldehyde xerogels were fitted to the langmuir and freundlich isotherms using non-linear regression analysis. the isotherm parameters are given in table 1. the langmuir isotherm (fig. 2) was seen to have a better fit based on non-linear regression analysis. the value of the separation factor, rl, determines whether the isotherm is favourable (0 < rl < 1), linear (rl = 1), unfavourable (rl > 1) or irreversible (rl = 0) [20]. the low value of rl (0.000049) showed that the adsorption of chromium(vi) onto acrf was favourable (table 1).

figure 2. application of the langmuir and freundlich isotherms to the adsorption of cr(vi) ions.

table 1. isotherm parameters for cr(vi) adsorption onto acrf
langmuir isotherm:   qmax = 241.9, b = 0.3529, rl = 0.000049, r2 = 0.9740
freundlich isotherm: kf = 79.06, 1/n = 0.3426, r2 = 0.9441

3.3. effect of temperature on chromium(vi) adsorption

the effect of temperature on the adsorption of the metal ions was studied with the temperature varied from 20 °c (293 k) to 60 °c (333 k), with an initial concentration of 25–250 ppm, an adsorbent dosage of 1 g/l and the optimal ph. the adsorption of cr(vi) ions was found to increase with an increase of temperature in the range 20–60 °c (fig. 3). this increase in the adsorption capacity of acrf is an indication of an endothermic process [21]. it might be a result of complexation and reduction reactions [22]. also, diffusion is an endothermic process, and an increase in temperature increases the diffusion rate of the adsorbate molecules across the external boundary layers and into the pores of acrf. similar results were observed for the adsorption of cr(vi) onto activated carbon [23].

figure 3. effect of temperature on the adsorption of cr(vi).

3.4. effect of contact time

the effect of contact time on the adsorption of cr(vi) was studied by varying the contact time from 0 to 420 min at a ph of 3. in fig. 4, it can be seen that the uptake of the metal ions increased with increasing contact time until equilibrium was reached. the adsorption of cr(vi) ions initially increased rapidly and then reached equilibrium. the optimum chromium removal was 74.86 % at 60 min for 25 ppm and 77.21 % at 240 min for 200 ppm; it was 100 % at 420 min for 25 ppm and 83 % at 420 min for 200 ppm.

figure 4. effect of contact time on the adsorption of cr(vi).

3.5. adsorption kinetics

the pseudo-first-order, pseudo-second-order and intra-particle diffusion models were used to fit the experimental data for the different initial chromium ion concentrations. the results of pseudo-second-order kinetics observed in this study are supported by the findings of bhattacharya [8].
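the kinetic models named above can be fitted in their non-linear forms in the same way; the sketch below fits the pseudo-first-order expression qt = qe(1 − e^(−k1 t)) and the pseudo-second-order expression qt = k2 qe² t / (1 + k2 qe t), with initial rate h = k2 qe², to placeholder uptake data, which again are not the measured values of this study (a minimal illustration, not the authors' code).

```python
import numpy as np
from scipy.optimize import curve_fit

def pfo(t, qe, k1):   # pseudo-first-order
    return qe * (1.0 - np.exp(-k1 * t))

def pso(t, qe, k2):   # pseudo-second-order; initial rate h = k2 * qe**2
    return k2 * qe**2 * t / (1.0 + k2 * qe * t)

t = np.array([5.0, 15.0, 30.0, 60.0, 120.0, 240.0, 420.0])   # min, placeholder
qt = np.array([9.0, 14.0, 17.5, 19.5, 21.5, 23.0, 23.8])     # mg/g, placeholder

for model, p0 in [(pfo, (24.0, 0.05)), (pso, (26.0, 0.002))]:
    popt, _ = curve_fit(model, t, qt, p0=p0, maxfev=10000)
    r2 = 1.0 - np.sum((qt - model(t, *popt)) ** 2) / np.sum((qt - qt.mean()) ** 2)
    print(model.__name__, np.round(popt, 4), "r2 =", round(r2, 4))
```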
table 2. kinetic models and parameters of the adsorption of cr(vi)
co (mg/l) | pseudo-first-order: k1, r2 | pseudo-second-order: k2, h, r2 | intra-particle diffusion: ki, r2
25  | 0.1705, 0.9768 | 0.0136, 3.2938, 0.9743 | 1.5737, 0.7411
50  | 0.0935, 0.9742 | 0.0041, 3.3706, 0.9833 | 2.2374, 0.7888
150 | 0.0706, 0.9138 | 0.0015, 8.3693, 0.9647 | 3.0394, 0.7584
200 | 0.0530, 0.9499 | 0.0010, 5.7566, 0.9864 | 3.3937, 0.8209

the values of the second-order rate constants (k2) were found to decrease from 0.0136 to 0.0010 g mg−1 min−1 as the initial concentration increased from 25 to 200 mg/l. this indicates that the process is highly concentration dependent [24].

3.6. mechanism of adsorption

as seen in fig. 5, the acrf spectra displayed a change of intensity and a shift of the carbonyl stretching band around 1630 cm−1 after the contact with the chromium solution. this is a result of the complexation of the carbonyl group with chromium. another shift can be observed as a result of the complexation of the oxygen from the carboxyl c–o bond, at wave numbers 1166 and 1066 cm−1. the o–h (3434 cm−1) and c–o (2390 and 2361 cm−1) absorption peaks are also observed to shift when acrf is loaded with chromium. two new peaks, at 719 and 910 cm−1, were observed in the ftir spectra of the cr(vi)-loaded sorbent; they are attributed to the cr–o and cr=o bonds of chromate anions, which confirms the sorption of cr(vi) onto the activated carbon resorcinol formaldehyde xerogel (acrf) [25].

figure 5. ftir spectra of the acrf adsorbent (a) before and (b) after cr(vi) adsorption.

the mechanism of chromium(vi) adsorption from aqueous solution is attributed to physical adsorption by electrostatic attraction between positively charged adsorption sites in the adsorbent and the negatively charged cr(vi) species. it can be seen that carboxyl groups are involved in the removal mechanism, as shown by the ftir results [26]. other functional groups may also be involved in the metal ion adsorption. from the sem image (figure 6a, unloaded acrf), it can be seen that acrf has a large surface area. the chromium metal ions were adsorbed onto the pores and surfaces of the adsorbent, as shown by the sem image (figure 6b, loaded acrf). the edx analyses of acrf before adsorption were: c 97.48 %, o 2.31 %, na 0.21 %. the edx analyses for cr(vi)-loaded acrf were: c 39.56 %, o 3.0 %, na 0.05 %, cr 57.38 %.

figure 6. sem images of acrf (a) before cr(vi) adsorption and (b) after cr(vi) adsorption.

3.7. adsorption thermodynamics

the thermodynamic parameters, i.e., the gibbs free energy, the enthalpy change and the entropy change, were obtained using the following equations [6]:

$K_c = \frac{q_e}{C_e},$  (4)

$\Delta G^o = -RT \ln K_c,$  (5)

$\ln K_c = \frac{\Delta S^o}{R} - \frac{\Delta H^o}{RT}.$  (6)

the values of ∆h° and ∆s° are obtained from the slope and intercept of the linear van't hoff plot of ln kc versus 1/t, eq. (6). table 3 shows the calculated values of the thermodynamic parameters for the adsorption of cr(vi) on acrf. the negative values of ∆g° at the various temperatures indicate the spontaneous nature of the adsorption process. the increase in the magnitude of ∆g° with temperature clearly indicates a more favourable adsorption at high temperature. the positive value of ∆h° indicates that the adsorption process is endothermic. moreover, the positive value of ∆s° indicates the degree of randomness at the solid-solution interface during the adsorption process. similar results were reported for cr(vi) adsorption [6, 27].

table 3. thermodynamic parameters for cr(vi) adsorption onto acrf
t (k) | kc     | ∆g° (kj/mol) | ∆h° (kj/mol) | ∆s° (kj/(mol·k))
293   | 3.0398 | −2.71        | 7.51         | 0.0355
303   | 4.0450 | −3.52        |              |
333   | 4.6185 | −4.24        |              |
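the values in table 3 follow from eqs. (4)–(6) by a one-line linear regression; the sketch below reproduces them from the tabulated kc values (a minimal illustration, our own code):

```python
import numpy as np

R = 8.314e-3                                   # universal gas constant, kj/(mol K)
T = np.array([293.0, 303.0, 333.0])            # temperatures, K
Kc = np.array([3.0398, 4.0450, 4.6185])        # equilibrium constants from table 3

dG = -R * T * np.log(Kc)                       # eq. (5): ~[-2.71, -3.52, -4.24] kj/mol
slope, intercept = np.polyfit(1.0 / T, np.log(Kc), 1)   # van't hoff plot, eq. (6)
dH, dS = -R * slope, R * intercept             # ~7.5 kj/mol and ~0.0355 kj/(mol K)
print(np.round(dG, 2), round(dH, 2), round(dS, 4))
```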
as reported by malkoc and nuhoglu [27], the positive value of ∆s° reflects the affinity of the adsorbent for cr(vi) ions and suggests some structural changes in the chromium and the adsorbent.

4. conclusions
the effect of the initial chromium(vi) metal ion concentration on the adsorption on acrf was studied at the optimum ph observed from a previous study. the langmuir isotherm was seen to give a better fit based on non-linear regression analysis. the maximum adsorption capacity of acrf for chromium(vi) was 241.9 mg/g. the pseudo-second-order kinetic model was the best fit to the experimental data for the adsorption of chromium(vi) metal ions by the activated carbon resorcinol formaldehyde xerogels. the optimal removal was 74.86 % at 60 min for 25 ppm and 77.21 % at 240 min for 200 ppm. different mechanisms were responsible for the chromium(vi) metal ion adsorption: the results showed that the adsorption of cr(vi) is a result of electrostatic attraction, ion exchange/complexation and reduction reactions. the thermodynamic analysis showed that the chromium adsorption process was endothermic and spontaneous in nature.

list of symbols
qe    amount of solute adsorbed per unit weight of adsorbent [mg/g]
ce    equilibrium concentration of solute in the bulk solution [mg/l]
b     constant related to the free energy of adsorption [l/mg]
qmax  maximum adsorption capacity [mg/g]
c0    initial cr(vi) concentration [mg/l]
kf    freundlich constant [mg/g]
1/n   heterogeneity factor [–]
k1    pseudo-first-order rate constant [min−1]
k2    pseudo-second-order rate constant [g mg−1 min−1]
h     initial adsorption rate [mg g−1 min−1]
ki    intra-particle diffusion constant [mg g−1 min−0.5]
rl    separation factor [–]
t     temperature [k]
r     universal gas constant, 8.314 j mol−1 k−1
kc    equilibrium constant [–]
∆h°   enthalpy change [kj mol−1]
∆s°   entropy change [kj mol−1 k−1]
∆g°   gibbs free energy change [kj mol−1]
r2    coefficient of determination

references
[1] j. kotaś, z. stasicka. chromium occurrence in the environment and methods of its speciation. environmental pollution 107(3):263–283, 2000.
[2] a. albadarin, c. mangwandi, a. al-muhtaseb, et al. kinetic and thermodynamics of chromium ions adsorption onto low-cost dolomite adsorbent. chemical engineering journal 179:193–202, 2012.
[3] m. owlad, m. k. aroua, w. a. w. daud, s. baroutian. removal of hexavalent chromium-contaminated water and wastewater: a review. water, air, and soil pollution 200(1):59–77, 2009. doi:10.1007/s11270-008-9893-7.
[4] l. wang, c. lin. adsorption of chromium (iii) ion from aqueous solution using rice hull ash. journal of the chinese institute of chemical engineers 39(4):367–373, 2008.
[5] a. shanker, c. cervantes, h. loza-tavera, s. avudainayagam. chromium toxicity in plants. environment international 31(5):739–753, 2005.
[6] h. demiral, i̇. demiral, f. tümsek, b. karabacakoğlu. adsorption of chromium(vi) from aqueous solution by activated carbon derived from olive bagasse and applicability of different adsorption models. chemical engineering journal 144(2):188–196, 2008. doi:10.1016/j.cej.2008.01.020.
[7] s.-j. park, y.-s. jang. pore structure and surface properties of chemically modified activated carbons for adsorption mechanism and rate of cr(vi). journal of colloid and interface science 249(2):458–463, 2002. doi:10.1006/jcis.2002.8269.
[8] a. bhattacharya, s. mandal, s. das. adsorption of zn (ii) from aqueous solution by using different adsorbents. chemical engineering journal 123(1-2):43–51, 2006.
[9] m. nameni, m. r. alavi moghadam, m. arami. adsorption of hexavalent chromium from aqueous solutions by wheat bran. international journal of environmental science & technology 5(2):161–168, 2008. doi:10.1007/bf03326009.
[10] n. r. bishnoi, m. bajaj, n. sharma, a. gupta. adsorption of cr(vi) on activated rice husk carbon and activated alumina. bioresource technology 91(3):305–307, 2004. doi:10.1016/s0960-8524(03)00204-9.
[11] n. k. hamadi, x. d. chen, m. m. farid, m. g. lu. adsorption kinetics for the removal of chromium(vi) from aqueous solution by adsorbents derived from used tyres and sawdust. chemical engineering journal 84(2):95–105, 2001. doi:10.1016/s1385-8947(01)00194-2.
[12] e. oyedoh, a. albadarin, g. walker, et al. preparation of controlled porosity resorcinol formaldehyde xerogels for adsorption applications. chemical engineering transactions 32:1651–1656, 2013.
[13] m. mirzaeian, p. j. hall. preparation of controlled porosity carbon aerogels for energy storage in rechargeable lithium oxygen batteries. electrochimica acta 54(28):7444–7451, 2009. doi:10.1016/j.electacta.2009.07.079.
[14] r. w. pekala, d. w. schaefer. structure of organic aerogels. 1. morphology and scaling. macromolecules 26(20):5487–5493, 1993. doi:10.1021/ma00072a029.
[15] i. langmuir. the constitution and fundamental properties of solids and liquids. part i, solids. journal of the american chemical society 38(11):2221–2295, 1916. doi:10.1021/ja02268a002.
[16] k. r. hall, l. c. eagleton, a. acrivos, t. vermeulen. pore- and solid-diffusion kinetics in fixed-bed adsorption under constant-pattern conditions. industrial & engineering chemistry fundamentals 5(2):212–223, 1966. doi:10.1021/i160018a011.
[17] h. freundlich. over the adsorption in solution. j. phys. chem. 57:385–470, 1906.
[18] a. albadarin, a. al-muhtaseb, n. al-laqtah, et al. biosorption of toxic chromium from aqueous phase by lignin: mechanism, effect of other metal ions and salts. chemical engineering journal 169(1-3):20–30, 2011.
[19] b. singha, s. das. biosorption of cr (vi) ions from aqueous solutions: kinetics, equilibrium, thermodynamics and desorption studies. colloids and surfaces b: biointerfaces 84(1):221–232, 2011.
[20] m. belhachemi, f. addoun. comparative adsorption isotherms and modeling of methylene blue onto activated carbons. applied water science 1(3):111–117, 2011. doi:10.1007/s13201-011-0014-1.
[21] k. s. rao, g. r. chaudhury, b. k. mishra. kinetics and equilibrium studies for the removal of cadmium ions from aqueous solutions using duolite es 467 resin. international journal of mineral processing 97(1-4):68–73, 2010. doi:10.1016/j.minpro.2010.08.003.
[22] d. duranoğlu, a. trochimczuk, u. beker. a comparison study of peach stone and acrylonitrile-divinylbenzene copolymer based activated carbons as chromium (vi) sorbents. chemical engineering journal 165:56–63, 2010.
[23] e. ozdemir, d. duranoğlu, u. beker, a. avcı. process optimization for cr (vi) adsorption onto activated carbons by experimental design. chemical engineering journal 172:207–218, 2011.
[24] e. demirbas, n. dizge, m. sulak, m. kobya. adsorption kinetics and equilibrium of copper from aqueous solutions using hazelnut shell activated carbon. chemical engineering journal 148(2-3):480–487, 2009.
[25] g. kyzas, m. kostoglou, n. lazaridis. copper and chromium (vi) removal by chitosan derivatives – equilibrium and kinetic studies. chemical engineering journal 152(2-3):440–448, 2009.
[26] v. murphy, s. tofail, h. hughes, p. mcloughlin. a novel study of hexavalent chromium detoxification by selected seaweed species using sem-edx and xps analysis. chemical engineering journal 148(2-3):425–433, 2009.
[27] e. malkoc, y. nuhoglu. potential of tea factory waste for chromium(vi) removal from aqueous solutions: thermodynamic and kinetic studies. separation and purification technology 54(3):291–298, 2007. doi:10.1016/j.seppur.2006.09.017.

acta polytechnica 56(5):373–378, 2016

acta polytechnica 62(2):303–312, 2022. https://doi.org/10.14311/ap.2022.62.0303
© 2022 the author(s). licensed under a cc-by 4.0 licence. published by the czech technical university in prague.

results of research on backlash compensation in a power electric drive by low-power electronic device
semen l. samsonovich, boris k. fedotov, nikolay b. rozhnin, roman v. goryunov∗
national research university, moscow aviation institute, department 702, a-80, volokolamskoye av. 4, 125993 moscow, russia
∗ corresponding author: radiofizika01@mail.ru

abstract. this article considers the feasibility of restoring and maintaining the kinematic accuracy of the support-rotary device drives by introducing a backlash compensation device into the control system. the power electromechanical drives of the support-rotary device considered in this article contain two motors whose torques are summed on a common output shaft. it is shown that the required kinematic accuracy of the drives can be restored by introducing one of two variants of an electronic device for backlash compensation into the control system. in the first variant, equal and opposite displacement signals are introduced into the control signals of the motors. in the second variant, an electronic cross-connections backlash compensation scheme is introduced into the control system. the study of the operation of the support-rotary device drive system with the two backlash compensation devices, carried out by a simulation method, showed that the use of the cross-connection scheme is the most preferable and effective. as a result of the research, it was shown that the introduction of an electronic backlash compensation device into the control system makes it possible to ensure the operability of the power electromechanical drives of a support-rotary device with the initial kinematic accuracy.
keywords: backlash, backlash compensation device, multi-motor drive, tandem control.

1. introduction
the long-term presence of a large-sized support-rotary device (srd) in the open atmosphere leads, as a result of corrosion, to the appearance of additional gaps in the gearboxes, which impairs the kinematic accuracy of the electromechanical drives. it was found that after 30 years of the srd being in the open air, the gears of the drive systems were corroded; the resulting loss of the metal layer led to the appearance of backlash (gaps) and a decrease in the kinematic accuracy of the drive system [1]. the restoration of the kinematic accuracy of the electromechanical drive of a large-sized srd after a long stay in the open atmosphere is of great practical importance, and finding a way to restore it is a relevant scientific task.
there are known ways to compensate for backlash in mechanical transmissions of electromechanical drives, described in [2–10], based on the following principles:
a) using mechanical spring devices to compensate for backlash [2];
b) frequency correction and degradation of the system bandwidth [4, 5];
c) the use of a backlash and elastic deformation sensor [5, 6];
d) converting signals from speed and torque sensors to determine the amount of backlash and elastic deformation [4, 7];
e) introducing a nonlinear corrective "dead zone" element into the error signal circuit [6].
since the drive systems of the srd consist of two drives operating on the same load (figure 1), it is possible to restore the kinematic accuracy by using special control circuits for the drive channels to compensate for the backlash and to improve the operation in dynamic tracking modes. two ways of backlash compensation have seen the greatest development in multi-motor electric drives. the first way is similar to the operation of an "electromechanical spring": it is based on the creation of torques that are opposite in sign but equal in magnitude and thus take up the gap, which is implemented by introducing constant displacements of a fixed value into the control signals of the motors of the drive. methods implementing this way are described in [11–17]. another way is "tandem control", based on the phase displacement of the control signals for the individual electric motors, depending on the calculated coordinates [18–24]. a feature of this way is the need to carry out a large amount of calculations and to build a complex digital control system using microprocessors for its implementation.

figure 1. kinematic diagram of the support-rotary device drive.
figure 2. block diagram of a system with backlash non-linearity.

2. methods
the article discusses two ways to compensate for the backlash of multi-motor electric drive gears, both based on a transformation of the dynamic error signal: the well-known method of backlash compensation with the introduction of displacement signals (electromechanical spring) [11–17], and a newly developed method with a delay of the dynamic error signal [25]. each of the methods is implemented by a special electronic device introduced into the dynamic error channel of the electric drive. a comparison of the schemes was carried out by mathematical modelling.
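the comparison rests on a backlash element modelled as a dead zone coupled with the transmission stiffness, as described in the next paragraph. a minimal sketch of such an element follows (a sketch under stated assumptions, not the authors' simulink block; the stiffness value is a hypothetical placeholder, while the 0.017 deg gap is the model value quoted below):

```python
# a minimal dead-zone backlash element: no torque is transmitted
# while the relative twist of the shafts lies inside the gap, and
# an elastic torque proportional to the transmission stiffness acts
# outside of it.
import math

def shaft_torque(twist, gap=math.radians(0.017), k=1.0e6):
    half = gap / 2.0
    if abs(twist) <= half:
        return 0.0                              # inside the backlash gap
    return k * (twist - math.copysign(half, twist))
```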
the simulation was carried out in the matlab simulink software environment. a mathematical model of the investigated drive system was compiled, consisting of two motors operating on a common load and a backlash model in the form of a dead zone coupled with the transmission stiffness. the block diagram of the developed model is shown in figure 2. the parameters of the srd drives are considered using the example of a harmonic guidance signal φ(t) = a sin(ωt). the following parameter values are used in the model: electric motor armature voltage – 440 v; armature current – 115 a; rated power of the electric motor – 45 kw; rated speed – 750 rpm; moment of inertia of the rotor – 2.575 kg m²; gear ratio of the gearbox – 400; backlash of the mechanical transmissions – 0.017 deg.
initially, the operation of an idealized drive without any backlash was simulated; as a result, the reference characteristics of the drive with the guidance signal were obtained. then backlash was introduced into the model and the results of its impact were considered. then, one of the electronic devices for backlash compensation was introduced into the control system of the drives, and the parameters were synthesized until the reference characteristics of the srd drives corresponding to the idealized model without backlash were obtained.

figure 3. backlash compensation with connecting additional supply voltages.

3. backlash compensation device with input of displacement signals
in the well-known scheme with an input of displacements into the control channels (the electromechanical spring scheme), a certain constant displacement signal is supplied to each channel, equal in magnitude and opposite in sign. this creates an expansion of the two gears relative to the common wheel. in the first known works [2], the thrust torques were created by additional supply voltages to the electric motors (figure 3). later, to create the thrust torques, instead of supplying voltages to the motors, an electronic device was used which introduces displacements into the dynamic error signal (figure 4); the displacement is created by the preliminary displacement block [3]. this ensures opposite torques of the motors at a zero value of the control signal, similar to the action of a spring. the control signal θ in a system with the main position feedback is the dynamic error signal. when a control signal arrives, for example a positive one, the offset signal is added to the signal value in the corresponding channel, and in the opposite channel the offset signal is subtracted from the control signal. if the values of the control signals are smaller than the offset value, a thrust torque is created to compensate for the backlash in the system. when the value of the control signal exceeds the offset value, the signal in the second channel changes sign, and both motors create torques in the same direction.

figure 4. backlash compensation device scheme based on the input of displacement signals.

4. backlash compensation device with cross connections
the disadvantage of the known method of backlash compensation, based on the introduction of displacement signals, is the low efficiency of the electric drive due to the high mutual loading of the electric motors.
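for comparison with the cross-connection scheme developed next, the offset logic of section 3 can be sketched as follows (a sketch under stated assumptions, not the authors' preliminary displacement block; the offset value is a hypothetical placeholder):

```python
# a minimal sketch of the displacement-signal ("electromechanical
# spring") law: a constant offset is added to the dynamic error in
# one channel and subtracted in the other, so that at small errors
# the two motors develop opposite (thrust) torques and take up the gap.
def split_with_offset(theta, offset=0.05):
    u1 = theta + offset
    u2 = theta - offset   # opposite sign to u1 while |theta| < offset
    return u1, u2
```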
in the process of research, in order to compensate for the influence of backlash in the gears, to ensure the summation of the dynamic capabilities of the channels in the steady-state motion mode, and also to decrease the mutual torques and increase the efficiency, a control scheme for the drive channels with the introduction of cross-connections has been developed. the scheme implements a method of backlash compensation by introducing a delay into the dynamic error signal of one of the drive channels. the developed scheme of the backlash compensation device will be called the cross-connections scheme (figure 5). this scheme can be considered as a special type of tandem control, in which the phase shift is carried out not by calculating the coordinates of the individual electric motors, but by transforming the error signal through a dynamic delay that depends on the value of the backlash.
the backlash compensation device consists of two splitters, two switches, two aperiodic links with time constants t1, t2, a signal module extraction block, four multiplying blocks, and two summing blocks. the input of the backlash compensation device is branched by a splitter into three outputs. two outputs are connected to the corresponding switches. one output of a switch is connected to the first input of the summator through the first product block, and the second through the aperiodic link. the second product block forms a cross-connection line to the second input of the second summator. the switches are configured for positive and negative dynamic error signals; the outputs of the switches are the signals (0, 1) and (−1, 0), respectively. after the switch, one of the branches includes a signal module extraction block, which is necessary to match the signs of the signals at the outputs of the product blocks; the second inputs of the four product blocks are connected to the remaining outputs of the input signal splitter. the signals from the outputs of the summation blocks are the output signals of the backlash compensation device, θ1 and θ2.

figure 5. scheme of the backlash compensation device with cross-connections.

the principle of operation of the cross-connections scheme is based on the phase shift of the signal in the second channel, the value of which is selected depending on the amount of backlash. for this purpose, the aperiodic links with time constants t1, t2 are used. initially, the values of the time constants are set equal, t1 = t2; then the values of t1,2 are selected depending on the individual parameters of each channel. it is known that the backlash has the greatest influence in a tracking system when the direction of movement is changed; at that moment, the working surfaces (sides) of the gear teeth switch, and for the time of this switching the system is effectively open-loop. changing the direction of movement of the control object requires switching the channels. switches are used to prevent the backlash from opening when the direction of movement is changed; for this purpose, the switching filters and cross-connections are used. in the mode of steady motion, after the end of the transient processes, the signal in the second channel is close to the signal in the first channel; therefore both channels operate in the same direction, and in this mode of movement the drive torques are summed up. the backlash in the mechanical transmissions causes an amplitude limitation and a phase displacement during the processing of the guidance signal.
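the signal path just described can be sketched in discrete time as follows (a sketch under stated assumptions, not the authors' exact block diagram: first-order lags stand in for the aperiodic links, and the switch logic follows the (0, 1) and (−1, 0) switch outputs; parameter values are hypothetical, with t_lag = 0.65 s taken from the transition point reported in section 5):

```python
# a minimal discrete-time sketch of the cross-connections scheme:
# the dynamic error is split; the branch selected by the sign of the
# error passes directly, while the cross branch is delayed by a
# first-order (aperiodic) lag, which shifts its phase.
class CrossConnectionCompensator:
    def __init__(self, t_lag=0.65, dt=0.001):
        self.alpha = dt / (t_lag + dt)   # first-order lag coefficient
        self.y1 = 0.0                    # lag state, positive-error branch
        self.y2 = 0.0                    # lag state, negative-error branch

    def step(self, theta):
        pos = theta if theta > 0.0 else 0.0       # switch output (0, 1)
        neg = theta if theta < 0.0 else 0.0       # switch output (-1, 0)
        self.y1 += self.alpha * (pos - self.y1)   # aperiodic link
        self.y2 += self.alpha * (neg - self.y2)   # aperiodic link
        theta1 = pos + self.y2    # direct branch plus delayed cross branch
        theta2 = neg + self.y1
        return theta1, theta2
```

in this reading, smaller lag time constants keep the two channel signals close together (thrust-free summation of torques), while larger ones hold the channels apart and reproduce the thrust-torque mode discussed below.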
if the selected time constants t1, t2 provide a displacement in the second channel greater than the displacement produced by the action of the backlash, then in an electric drive consisting of two motors connected by a common mechanical transmission, there will be thrust torques for backlash compensation.

5. experiment results
the study of the electric drive operation with backlash was carried out by simulation in simulink using the example of a harmonic guidance signal. the results of the operation of the drive under study with the known scheme of the backlash compensation device based on the introduction of displacement signals are shown in figure 6. it follows from the graphs that the dynamics of the drive is close to the dynamics of an idealized model without any backlash. the smoothness of the change in the coordinates of the electric drive is ensured in the entire range of operation (figure 6a). figure 6b shows significant thrust torques exceeding the required total torque. a significant amount of energy is spent on the mutual torques of the motors, which leads to a decrease in the efficiency of the electric drive, which does not exceed 30 % in the steady-state motion mode (figure 6c).

figure 6. the operation of the drive with a backlash compensation device with input of displacement signals with a harmonic guidance signal: a) guidance signal – φc, execution of the guidance signal by the drive – φobj, and dynamic system error signal θ; b) torques developed by the motors and total drive torque; c) efficiency of the electric drive.

the simulation results show that the use of a backlash compensation device with an input of displacement signals makes it possible to compensate for the effect of backlash caused by corrosion and to ensure the required accuracy of operation. the operating parameters, in terms of processing the control signal, are as close as possible to the parameters of the drive without backlash; however, the high thrust torque leads to high energy costs and a decrease in efficiency.
the results of the operation of the drive with a cross-connection backlash compensation device when small values t1, t2 = 0.2 s are selected (the displacement in the second channel is lower than the displacement produced by the backlash) and when a harmonic guidance signal arrives are shown in figure 7. the simulation showed the drive executing the guidance signal; when the sign of the dynamic error signal changes, a shift in the amplitudes of the torques of the individual motors is observed (figure 7b), which compensates for the backlash effect. a feature of this mode is the absence of thrust torques. figure 7c shows the change in the efficiency of the electric drive, which in the steady-state motion mode approaches 80 %.
figure 8 shows the harmonic guidance signal processing by the drive when high values of the time constants, t1, t2 = 1.0 s, are selected. in figure 8b, one can see that the increase in the time constants led to the appearance of thrust torques, which act in almost the entire operating range. thus, when choosing high values of the time constants t1, t2, the backlash is compensated by creating thrust torques by the electric drive, similar to the action of the known device with the input of displacement signals.
the disadvantage of this mode, as in the case of the known device, is the decrease in the efficiency of the electric drive (figure 8c), which in the mode of steady motion is in the range of 30–40 %.

figure 7. operation of the drive with a cross-connection backlash compensation device with a harmonic guidance signal when small values of the time constants t1, t2 are selected: a) guidance signal – φc, execution of the guidance signal by the drive – φobj, and dynamic system error signal θ; b) the torques developed by the motors and the total drive torque; c) the efficiency of the electric drive.

the dependence of the efficiency of the electric drive on the selected values of the time constants t1, t2 is shown in figure 9. the graph was obtained by simulating the processing of a harmonic guidance signal with an amplitude of 10 by the electric drive. the figure shows that an increase in the time constants t1, t2 leads to a decrease in efficiency, since thrust torques begin to appear in the system, which leads to energy losses. by simulation, it was found that for the selected guidance signal and the given amount of backlash, there is a value of the time constants, t1, t2 = 0.65 s, at which the phase displacement in the second channel is close to the displacement produced by the backlash. relative to this point, a combined graph of the dependence of the efficiency of the electric drive and the torques of the individual motors on the value t (figure 10) is plotted. for convenience, this point will be called the transition point. when choosing values of t1, t2 to the left of the transition point, the electric drive will operate without thrust torques; to the right, there will be thrust torques in the system. an increase in the values of t1, t2 significantly to the right of the transition point will lead to a decrease in the participation of the second channel in the joint work, up to the transformation of the drive into a single-channel drive with a passive load. this graph is only valid for a specific backlash value and specific guidance signal parameters. when the values t1, t2 = 0 are selected, the drive will operate without displacements and the signals in the channels will be equal. an increase in backlash will lead to a shift of the transition point to the right, a decrease in backlash – to the left. the specific values of t1, t2 depend on the speed parameters of the electric motors and power amplifiers; therefore, the choice of values must be made individually, based on the combination of parameters.
when considering the operation of the cross-connections backlash compensation device, it was found that the electric drive can operate with the device in two modes of operation: with thrust torques and without thrust torques. the thrust torques mode (figure 8) occurs when the control signal in the second (reversible) channel has a displacement greater than the displacement caused by the backlash. without thrust torques (figure 7), the indicated displacement is smaller than the displacement produced by the backlash, and in this mode the electric drive operates more efficiently. if the drive operates only with certain, previously known guidance signals, then it is possible to fine-tune the operating mode of the cross-connections backlash compensation device (with or without thrust torques).
figure 8. operation of the drive with a cross-connection backlash compensation device with a harmonic guidance signal when large values of the time constants t1, t2 are selected: a) guidance signal – φc, execution of the guidance signal by the drive – φobj, and dynamic system error signal θ; b) the torques developed by the motors and the total drive torque; c) the efficiency of the electric drive.
figure 9. the dependence of the efficiency of the electric drive on the selected values of the time constants t1, t2.
figure 10. a combined graph of the dependence of the drive efficiency and the torques of the individual motors on the value t.

when the drive is operating with arbitrary (random) guidance signals, a variable operation of the electric drive with backlash in the gears in two modes is possible: with and without thrust torques.

6. comparison results
the comparison of the schemes of the backlash compensation devices shows that the device based on the introduction of the displacement signal is the simplest to implement and gives an effect similar to the effect of a mechanical spring. the simulation results show that, when choosing the smallest value of the error signal displacements according to the condition of the absence of self-oscillations, one of the channels develops a torque that is almost twice the required one (figure 6b), while excess energy is expended to create the thrust torques. the effect of backlash compensation in the dynamics is achieved at the cost of significant energy losses for the mutual torques of the channels. the efficiency of a drive with a device with an input of displacement signals does not exceed 40 % with a harmonic guidance signal.
the cross-connection scheme shows that, when a small value of the time constant of the aperiodic block is selected on the condition of the absence of self-oscillations in the steady-state mode of motion, the summation of the torques of the individual motors is provided; the backlash is compensated by shifting the control signals in the individual channels. the dynamics of the main movement is close to the dynamics of the ideal model without any backlash. the efficiency of the electric drive approaches 80 % when processing the harmonic guidance signal. an increase in the time constants t1, t2 of the aperiodic links leads to a change in the operation of the electric drive: thrust torques of the electric drive appear (figure 8b), which increases the energy consumption and leads to a decrease in efficiency (figure 8c), similar to the device with an input of displacement signals.
the use of a backlash compensation device for a drive containing two control channels will ensure the required kinematic accuracy of the drive systems during operation and will not require any additional effort to restore the accuracy. the simulation showed that the developed device for backlash compensation with cross-connections makes it possible to ensure the accuracy of an electromechanical drive with backlash in the mechanical transmissions and to compensate for the gap caused by prolonged idle periods: when the dynamic error signal changes, it creates a shift of the control signals in the individual channels, providing compensation for the backlash, and in the mode of steady motion it provides the summation of the torques of the channels.
when choosing large values of the time constants t1, t2 of the cross-connection device, the backlash is compensated by creating thrust torques, similar to the device with an input of displacement signals; in this case, the efficiency does not exceed 40 %. the choice of small values of the time constants t1, t2 of the aperiodic links of the backlash compensation device provides high-speed performance, an increase in the accuracy of operation (the value of the dynamic error is lower (figure 7a)) and an increase in efficiency of up to 85 %. an analysis of the operating parameters of a drive with the various schemes for constructing backlash compensation devices shows that it is advisable to use the circuit with cross-connections.

7. conclusions
(1.) the conducted research on the compensation of backlash in the mechanical transmissions of a power electromechanical drive by introducing various devices for backlash compensation shows that the devices provide the required dynamic characteristics with different energy efficiency.
(2.) the proposed electronic device for backlash compensation makes it possible, through the transformation of the dynamic error signal of the system in low-power electrical control circuits, to create a shift of the signals in each channel such that, when the sign of the dynamic error signal changes, it compensates for the backlash, and, when the motion is steady, the torques of the individual motors are summed up.
(3.) it is shown that the proposed cross-connection backlash compensation scheme provides the kinematic accuracy with an increased efficiency, which is two times higher than the efficiency of the circuit with the input of displacement signals.

references
[1] s. l. samsonovich, r. v. goryunov. research of the influence of atmospheric corrosion on the kinematic accuracy of the drive of a large-sized supporting and rotating device. handbook. an engineering journal (2):16–22, 2019. https://doi.org/10.14489/hb.2019.02.pp.016-022.
[2] p. v. belyanskiy, b. g. sergeev. control of terrestrial antennas and radio telescopes [in russian]. sovet radio publishing house, moscow, 1980.
[3] a. a. kirillov, v. g. stebletsov. bases of the electric drive of aircraft. tutorial. biblio-globus publishing house, moscow, 2013. [2020-03-05], https://rucont.ru/efd/260901.
[4] m. nordin, p.-o. gutman. controlling mechanical systems with backlash – a survey. automatica 38(10):1633–1649, 2002. https://doi.org/10.1016/s0005-1098(02)00047-x.
[5] r. m. r. bruns, j. f. p. b. diepstraten, x. g. p. schuurbiers, j. a. g. wouters. motion control of systems with backlash, 2006. dct rapporten, vol. 2006.075. [2020-02-16], https://pure.tue.nl/ws/files/4295876/633392.pdf.
[6] b. k. chemodanov. servo drives [in russian]. bauman mstu publishing house, moscow, 1999. isbn 5-7038-1383-2.
[7] l. márton. adaptive friction compensation in the presence of backlash. control engineering and applied informatics 11(1):3–9, 2009.
[8] r. r. selmic, f. l. lewis. neural net backlash compensation with hebbian tuning using dynamic inversion. automatica 37(8):1269–1277, 2001. https://doi.org/10.1016/s0005-1098(01)00066-8.
[9] g. tao, f. l. lewis. adaptive control of nonsmooth dynamic systems. springer-verlag, london, 2001. isbn 978-1-84996-869-0. https://doi.org/10.1007/978-1-4471-3687-3.
[10] s. suraneni, i. n. kar, o. v. ramana murthy, r. k. p. bhatt. adaptive stick-slip friction and backlash compensation using dynamic fuzzy logic system. applied soft computing 6(1):26–37, 2005. https://doi.org/10.1016/j.asoc.2004.10.005.
[11] v. v. yavorsky. backlash compensation device in a two-motor electric drive, 1980. patent su746399, bull. no. 25.
[12] y. postnikov, et al. dc twin-motor driver, 1984. patent su1075360a, bull. no. 7.
[13] y. oho, k. iijima. motor control device and motor control method, 2019. patent us 2019/0079487 a1.
[14] t. uchida, a. ito, n. furuya, t. oshima. 3d14 positioning system based on twin motor cooperative control with gear backlash compensation. the proceedings of the symposium on the motion and vibration control, pp. 3d14-1–3d14-12, 2010. https://doi.org/10.1299/jsmemovic.2010._3d14-1_.
[15] w. zhao, x. ren. adaptive robust control for four-motor driving servo system with uncertain nonlinearities. control theory and technology 15(1):45–57, 2017. https://doi.org/10.1007/s11768-017-5120-7.
[16] f. xu, h. wang. clearance elimination method with two motors based on fuzzy control for turntable. in proceedings of the seventh asia international symposium on mechatronics, pp. 702–710. springer singapore, singapore, 2020. https://doi.org/10.1007/978-981-32-9437-0_72.
[17] m. deng. operator-based nonlinear control systems: design and applications. wiley–ieee press, piscataway, 2014. isbn 978-1-118-13122-0.
[18] t. uchida, a. ito, t. kitamura, n. furuya. positioning system with backlash compensation by twin motor cooperative control (evaluation of rectilinear motion mechanism installed planetary gear speed reducer). transactions of the jsme (in japanese) 80(814):dr0162, 2014. https://doi.org/10.1299/transjsme.2014dr0162.
[19] w. gawronski, j. j. beech-brandt, h. g. ahlstrom, e. maneri. torque-bias profile for improved tracking of the deep space network antennas. ieee antennas and propagation magazine 42(6):35–45, 2000. https://doi.org/10.1109/74.894180.
[20] z. haider, f. habib, m. h. mukhtar, k. munawar. design, control and implementation of 2-dof motion tracking platform using drive-anti drive mechanism for compensation of backlash. in 2007 international workshop on robotic and sensors environments. 2007. https://doi.org/10.1109/rose.2007.4373968.
[21] s. g. robertz, l. halt, s. kelkar, et al. precise robot motions using dual motor control. in 2010 ieee international conference on robotics and automation. 2010. https://doi.org/10.1109/robot.2010.5509384.
[22] y. toyozawa, k. maeda, n. sonoda. tandem control method based on a digital servo mechanism, 1997. patent us005646495a.
[23] s. tararykin, et al. method for controlling interconnected electric motors (variants), 2008. patent ru2316886c1, bull. no. 18.
[24] i. polyushchenkov, et al. method of the interconnected electric drives coordinates adjustment, 2018. patent ru2655723c1, bull. no. 16.
[25] s. l. samsonovich, b. k. fedotov, r. v. goryunov. method and device for selection of backlash in kinematic transmission of support-rotary device with two interconnected electric drives, 2020. patent ru2726951c1, bull. no. 20.

non linear modelling and control of hydraulic actuators
b. šulc, j. a. jan
this paper deals with non-linear modelling and control of a differential hydraulic actuator. the nonlinear state space equations are derived from basic physical laws. they are more powerful than the transfer function in the case of linear models, and they allow the application of an object oriented approach in simulation programs. the effects of all friction forces (static, coulomb and viscous) have been modelled, and many phenomena that are usually neglected are taken into account, e.g., the static term of friction, the leakage between the two chambers and external space.
proportional differential (pd) and fuzzy logic controllers (flc) have been applied in order to make a comparison by means of simulation. simulation is performed using matlab/simulink, and some of the results are compared graphically. the flc is tuned in such a way that it produces a constant control signal close to its maximum (or minimum), where possible. in the case of pd control, the occurrence of peaks cannot be avoided. these peaks produce a very high velocity that oversteps the allowed values.
keywords: modelling, simulation, fluid dynamics, robotics, fuzzy control, proportional-derivative control.

1 introduction
hydraulic actuators are used for delivering high actuation forces and high power density. due to their simple construction and low cost, hydraulic actuators are widely used. these actuators have highly nonlinear model characteristics. two types of actuators are used: differential (single-rod) and synchronizing (double-rod). from the control engineering point of view, synchronizing or symmetric actuators are preferred, because there is no piston area difference and this fact reduces the non-linearities; on the other hand, the construction of this type of actuator is difficult and expensive. also, in some situations, e.g., robots, cranes, etc., symmetric actuators cannot be used due to limited space. in the literature [13, 6, 11, etc.] various models of hydraulic actuators are presented. some are linear models, mostly applied to synchronizing cylinders. many authors have neglected the leakage; also, the friction forces are not completely modeled. these factors are all included in the derivation of our model, which provides the basis for creating object oriented library blocks. the schematic diagram of a differential actuator, as shown in fig. 1, consists of a constant pressure supply pump, a magnetically controlled spool valve and a differential hydraulic cylinder. all the variables are explained in the list of symbols (table 2). the model is derived in the next section. the motivation is to prepare the relations and parameter definitions for object-oriented blocks that are suitable for programming in special matlab s-functions.

fig. 1: schematic diagram of a differential hydraulic actuator

2 mathematical modelling
let the flow areas to the supply and return port of the spool valve be proportional to the spool displacement xs, i.e., a0 = k0 xs. then the flow rate of oil across the spool valve can be given as
    q = c0 xs √(∆p),   (1)
where
    c0 = cd k0 √(2/ϱ) / √(1 − (a1/a2)²).
at the nominal pressure drop (∆pn) and nominal flow (qn), this constant can be estimated as
    c0 = qn / (xs,max √(0.5 ∆pn)).   (2)
applying the continuity equation to both chambers 1 and 2,
    d/dt [ϱoil(p1)(v0,1 + a1 xc)] = ϱoil(p1)(q1 − qint),
    d/dt [ϱoil(p2)(v0,2 − a2 xc)] = ϱoil(p2)(qint − q2 − qext,2),   (3)
and considering the compressibility of oil eoil as
    eoil = ϱoil (∂p/∂ϱoil),
after putting this formula into eq. (3) we obtain the equations
    ϱoil(p1) a1 (dxc/dt) + (v0,1 + a1 xc)(∂ϱoil(p1)/∂p1)(dp1/dt) = ϱoil(p1)(q1 − qint),
    −ϱoil(p2) a2 (dxc/dt) + (v0,2 − a2 xc)(∂ϱoil(p2)/∂p2)(dp2/dt) = ϱoil(p2)(qint − q2 − qext,2),   (4)
and hence
    dp1/dt = eoil(p1)/(v0,1 + a1 xc) · (q1 − a1 dxc/dt − qint),
    dp2/dt = eoil(p2)/(v0,2 − a2 xc) · (qint − q2 − qext,2 + a2 dxc/dt),   (5)
where
    q1 = c0 [sg(xs) xs sgn(ps − p1) √|ps − p1| + sg(−xs) xs sgn(p1 − p0) √|p1 − p0|],
    q2 = c0 [sg(xs) xs sgn(p2 − p0) √|p2 − p0| + sg(−xs) xs sgn(ps − p2) √|ps − p2|],   (6)
    qint = kint (p1 − p2),   qext,2 = kext,2 p2,
where the coefficients kint and kext,2 can be evaluated by means of the general leakage formula by [12],
    kleak = π di rc³ / (6 η lc).
eoil [1] can be given as
    eoil(p) = eoil,max [1 + c1 log10(1 + c2 p/pmax)],   (7)
and the following notation is used:
    sgn x = 1 for x > 0, 0 for x = 0, −1 for x < 0;
    sg x = 1 for x > 0, 0 for x ≤ 0.
the forces on the cylinder can be expressed (using newton's second law of motion) as
    m d²xc/dt² = p1 a1 − p2 a2 − ff(vc) − fdist,   (8)
where vc = dxc/dt. the friction force ff(vc) comprises viscous, coulomb and static friction, i.e.,
    ff(vc) = fv vc + [fc + fs exp(−|vc|/cs)] sgn(vc).
the plot of friction versus velocity is shown in fig. 2. the factors of the friction forces fv (viscous), fc (coulomb) and fs (static) can be identified from the total friction (ff) and velocity (vc). these (ff and vc) can be estimated from the values of the position and of the pressures in both chambers of the cylinder, measured experimentally with zero load, using numerical differentiation of the position for determining the velocity.

spool valve
the spool valve dynamics can be derived in a similar way as for the cylinder, but the following linear second order differential equation is a widely used and sufficient approximation:
    d²xs/dt² + 2 b ωn dxs/dt + ωn² xs = ωn² u(t).   (9)
typical values of the spool valve parameters are: natural frequency ωn = 300–500 s−1 and damping factor b = 0.7–1.0. correct modelling of a spool valve requires resetting of the integrators by bringing their inputs to zero as soon as the end positions of the spool valve are overstepped. this precaution is shown in fig. 4.
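the friction law above can be sketched directly; a minimal version (assuming the exponential stribeck-type decay written above; the parameter values are hypothetical placeholders, not identified values):

```python
# a minimal sketch of the friction force ff(vc): a viscous term plus
# coulomb and exponentially decaying static terms acting against the
# direction of motion.
import math

def friction_force(v_c, f_v=500.0, f_c=90.0, f_s=60.0, c_s=0.01):
    if v_c == 0.0:
        return 0.0   # the sticking regime is handled by the solver logic
    stribeck = f_c + f_s * math.exp(-abs(v_c) / c_s)
    return f_v * v_c + math.copysign(stribeck, v_c)
```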
state space form of the hydraulic actuator
let x1 = xc, x2 = dxc/dt, x3 = p1, x4 = p2, x5 = xs, x6 = dxs/dt; then from eqs. (5)–(9) we get
    ẋ1 = x2,
    ẋ2 = (1/m)[x3 a1 − x4 a2 − ff(x2) − fload],
    ẋ3 = eoil(x3)/(v0,1 + a1 x1) · [q1(x3, x5) − a1 x2 − qint(x3, x4)],
    ẋ4 = eoil(x4)/(v0,2 − a2 x1) · [qint(x3, x4) − q2(x4, x5) − qext,2(x4) + a2 x2],
    ẋ5 = x6,
    ẋ6 = ωn² u − 2 b ωn x6 − ωn² x5.   (10)

fig. 2: friction versus velocity profile

3 modelling in matlab/simulink
the model has been implemented in the versatile software matlab/simulink, which is widely used in control engineering communities around the world. the cylinder, spool valve and controller models are shown in fig. 3. these blocks were created using the above derived equations, and a gui (graphical user interface) will appear by clicking on a block. parameters can be entered through these menus. fig. 4 shows the spool valve model, where anti-windup has been included. fig. 5 shows the cylinder model, which is modeled using four state variables, and in which the cylinder stops at the ends have been introduced.

fig. 3: simulink block diagram of the hydraulic actuator model
fig. 4: simulink spool valve model with end position stops
fig. 5: simulink block diagram of the hydraulic cylinder model

4 fuzzy logic controller (flc)
looking for an algorithm that would be closer to the time optimal controller (bang-bang control), we tried using non-linear fuzzy logic control design techniques. the basic structure of a fuzzy logic controller is shown in fig. 6. from the many types of fuzzy controllers, e.g., pid-like, sliding mode, takagi-sugeno etc., we selected a mamdani type fuzzy logic controller. this controller uses two inputs (the control error and its derivative) and one output (the control signal). for both inputs, three fuzzy sets (two sigmoid and one gaussian membership functions), and for the output five fuzzy sets (two sigmoid and three gaussian membership functions) have been taken, as shown in fig. 6. the rules are defined such that for the region where we have large errors, a constant and maximum allowable control signal will be delivered, whereas in the small error region the controller behaves as a non-linear pd-like controller, see table 1. two types of inference engines can be used; one is composition-based inference and the other is individual-rule-based inference. in defuzzification, the set of modified control output values in the form of fuzzy sets is converted into single point-wise or crisp values. various defuzzification methods can be used; we used the centre of gravity or centre of area method.

fig. 6: the structure of a fuzzy knowledge based controller

table 1: lookup table (explaining rules; output terms range over nl .. pl)
change of control error \ control error:   n    z    p
n                                          ..   ..   p
z                                          ..   z    ..
p                                          n    ..   ..

fuzzy controllers have the disadvantage that it is difficult to tune their large number of parameters, in contrast to the proportional derivative (pd) controller, which can be expressed by the equation
    upd = kp e + kd ė,   (11)
where upd is the control signal, e the control error, and the constants kp, kd represent the proportional and derivative gains. the controller parameters can be optimised heuristically, but good support for optimising these parameters is provided by the matlab toolbox called the ncd blockset.

fig. 7: inserting the flc into the simulink model
fig. 8: gui for output in matlab for the flc
fig. 9: schematic diagram of the fuzzy inference system
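a minimal sketch of such a two-input mamdani evaluation follows, using the membership parameters listed in the next section (minus signs restored where the extraction apparently lost them); the five-rule base here is a hypothetical set consistent with the pattern of table 1, not the authors' exact rules:

```python
# a minimal two-input mamdani sketch: sigmoid/gaussian memberships,
# min as the and-operator, max-aggregation and centre-of-gravity
# defuzzification on a sampled output universe.
import numpy as np

def sigmf(x, a, c):    # 1/(1 + exp(-a(x - c))), as in step 1 below
    return 1.0 / (1.0 + np.exp(-a * (x - c)))

def gaussmf(x, s, c):  # exp(-(x - c)^2 / (2 s^2)), as in step 1 below
    return np.exp(-((x - c) ** 2) / (2.0 * s ** 2))

def flc(e, de):
    # fuzzification of the two inputs (step 1)
    e_n, e_z, e_p = sigmf(e, -250, -0.02), gaussmf(e, 0.015, 0.0), sigmf(e, 250, 0.02)
    d_n, d_z, d_p = sigmf(de, -90, -0.05), gaussmf(de, 0.04, 0.0), sigmf(de, 90, 0.05)
    # sampled output universe with the five reference sets (step 5)
    y = np.linspace(-0.2, 0.2, 401)
    out = {"nl": sigmf(y, -175, -0.12), "n": gaussmf(y, 0.008, -0.05),
           "z": gaussmf(y, 0.008, 0.0), "p": gaussmf(y, 0.008, 0.05),
           "pl": sigmf(y, 175, 0.12)}
    # hypothetical five-rule base, e.g. "if error is p and change of
    # error is n then control signal is p" (the rule quoted in step 3)
    rules = [(min(e_p, d_n), "p"), (min(e_n, d_p), "n"), (e_z, "z"),
             (min(e_p, d_p), "pl"), (min(e_n, d_n), "nl")]
    agg = np.zeros_like(y)
    for w, term in rules:   # steps 2-4: min-implication, max-aggregation
        agg = np.maximum(agg, np.minimum(w, out[term]))
    return float((y * agg).sum() / max(agg.sum(), 1e-12))  # centre of gravity
```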
even in the presented case of a fuzzy logic controller, when a minimum number of fuzzy variables and rules have been used, better results than with a pd controller have been achieved. as a fuzzifier, three fuzzy sets (two sigmoid and one gaussian membership functions) have been used to convert the crisp values of the inputs into fuzzy linguistic variables. similarly, five fuzzy sets (two sigmoid and three gaussian membership functions) have been used to convert the fuzzy linguistic variables into crisp values of the output as a control signal.
the whole process of fuzzy control can be expressed in the following steps, considering x as the input and y as the output of each step. two types of membership functions are used, as given below:
sigmoid membership function:
    f1(x; a, c) = 1 / (1 + e^(−a(x−c))),
gaussian membership function:
    f2(x; a, c) = e^(−(x−c)² / (2a²)).
step 1: fuzzifier: conversion of crisp values to fuzzy variables. for input 1, i.e., the error, the conversion has been done using three fuzzy variables: negative (n) using the sigmoid membership function with parameters [−250, 0.02], zero (z) using the gaussian membership function with parameters [0.015, 0], and positive (p) using the sigmoid membership function with parameters [250, 0.02]. the same procedure was followed for input 2, i.e., the change of error, with parameters [−90, 0.05], [0.04, 0], [90, 0.05].
step 2: logical operators. the logical operator and used in the rules is described as y = min(x1, x2).
step 3: implication. the implication used in the rules is as follows: if error is p and change of error is n, then control signal is p.
step 4: aggregation. in aggregation, the rules directed towards one output are combined together. we used the logical operator or with the max function, y = max(x1, x2). in our case we had one output and five rules.
step 5: defuzzification. defuzzification is the process of converting fuzzy variables to crisp values. the centre of gravity method has been used to obtain crisp values of the output from the reference output fuzzy sets. the reference fuzzy sets consist of five fuzzy variables: negative large (nl) using the sigmoid membership function [−175, −0.12], negative (n) using the gaussian membership function [0.008, −0.05], zero (z) using the gaussian membership function [0.008, 0], positive (p) using the gaussian membership function [0.008, 0.05], and positive large (pl) using the sigmoid membership function [175, 0.12]. the following formula has been used to convert the fuzzy variables to crisp values:
    y = Σ(i=1..p) cij aij(x) / Σ(i=1..p) aij(x).
as shown in fig. 8, guis have been used in matlab, where all the input and output fuzzy variables, the inference system, the fuzzifier and the defuzzifier can be defined very easily by setting the corresponding parameters. then the controller is connected to the model in simulink, as shown in fig. 7.

5 results and conclusions
this work has two aims: the first is to elaborate a way of non-linear modelling of a differential hydraulic actuator suit-
fig.
10: gui for defining rules in matlab for flc 0 0.5 1 1.5 0 10 20 30 40 50 60 70 80 90 100 110 position [mm] with small gain time [s] x desired x pd x flc 0 0.5 1 1.5 0 10 20 30 40 50 60 70 80 90 100 110 position [mm] with large gain time [s] x desired x pd x flc fig. 11: position trajectory of hydraulic actuator for flc and pd control able for programming in simulink. the second goal is to use this model for comparing a classical and a non-linear fuzzy logic control loop design. the obtained results do not show any discrepancy with our experience or with the available experimental data. the responses that we obtained confirm the general advantage of flc in quicker responses, due to the fact that only flc is able to produce control signals that are not only proportionally dependent on control error as in the case of a pd controller. the fundamental importance of a proper choice of the gain in the pd controller is demonstrated in by the columns in fig. 13, where the left column shows responses with half of the gain used in the experiments depicted in the right column. pd for small step changes surprisingly achieved worse control responses than for larger step changes. the anti-windup problem has been solved, as shown in fig. 15, which explains the importance of 46 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 42 no. 3/2002 0 0.5 1 1.5 50 0 50 100 150 200 250 300 350 400 450 velocity [mm/s] with large gain time [s] v pd v flc 0 0.5 1 1.5 50 0 50 100 150 200 250 300 350 400 450 velocity [mm/s] with small gain time [s] v pd v flc fig. 12: velocity trajectory of hydraulic actuator for flc and pd control 0 0.5 1 1.5 50 0 50 100 150 200 250 300 control signal [mm] with large gain time [s] u pd u flc limit (160 mm) 0 0.5 1 1.5 50 0 50 100 150 200 250 300 control signal [mm] with small gain time [s] u pd u flc limit (160 mm) fig. 13: control signal for a hydraulic actuator applying flc and pd control 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0.03 0.025 0.02 0.015 0.01 0.005 0 0.005 0.01 0.015 velocity [mm/s] with and without anti-windup time [s] with anti-windup without anti-windup with no matlab limits fig. 14: velocity profile with windup problems 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 2 1.5 1 0.5 0 0.5 1 1.5 2 ×10 3 position [mm] with and without anti-windup time [s] with anti-windup without anti-windup with no matlab limits fig. 15: position curves with windup problems respecting the limits in the model. however in this case less difference is shown between with and without antiwindup, due to the unsuitability of the parameters. acknowledgement this work has been supported by ga čr, research project no: 102/01/1347; “numerical methods for optimal control system design”. references [1] berzen, w.: nonlinear control of hydraulic cylinders. proceedings of the 6th ieee mediterranean conference on control and systems (mccs’98), italy, 1998. [2] blackburn, j. f., reethof, g., shearer j. l.: fluid power control. massachusetts, (u.s.a.): the mit press, 1960. [3] costa branco, p. j., dente, j. a.: the application of fuzzy logic in automatic modelling electromechanical systems. fuzzy sets and systems, 1998, vol. 95, no. 3, p. 273–293. [4] daugherty, r. l., franzini, j. b., finnemore, e. j.: fluid mechanics with engineering applications. singapore: mc graw-hill book company, 1989. [5] dimiter, d., hans, h., michael, r.: an introduction to fuzzy control. berlin: springer-verlag, 1996, isbn 3-540-60691-2. 
[6] horacek, p., binder, z.: an approach for design and implementation of fuzzy controller. in: proceedings of european congress on fuzzy and intelligent technologies eufit’93, germany, september 1993, vol. 2, p. 163–169, isbn 3-86073-176-9. [7] jan, j. a., šulc, b.: conventional and fuzzy logic control of non linear hydraulic actuators for flexible robots. iasted international conference applied simulation and modelling (asm 2001), marbella, spain, 2001, p. 302–307, isbn 0-88986-311-3. [8] jan, j. a., šulc, b.: fuzzy logic control in hydraulic actuator modelled from own-built simulink block library. proceedings of the conference on information engineering and process control, ctu, prague, czech republic, 2001, p. 73–74, isbn 80-902131-7-0. [9] jan, j. a., šulc, b.: modelling and simulation of nonlinear hydraulic actuator using conventional and fuzzy logic control. 16th international conference on production research (icpr-16), prague, czech republic, 2001, isbn 80-02-01438-3. [10] jan, j. a., šulc, b.: fuzzy logic control in differential hydraulic cylinders driven by servo-valves: simulation and visualization. 5th international student conference, ctu in prague, prague (czech republic): poster 2001. [11] kugi, a., schlacher, k., keintzel, g.: position control and active eccentricity compensation in rolling mills. automatisierungstechnik germany, 1999, at 8/99, isbn 3-486-23477-3. [12] merrit, h. e.: hydraulic control systems. new york: john wiley & sons, 1967. 13] noskievič, p.: modelling and system identification. (in czech), ostrava (czech republic): montanex a.s., 1999, isbn 80-7225-030-2. [14] stadler, w.: analytical robotics and mechatronics. new york: mcgraw-hill inc., 1995, isbn 0-07-0060608-0. [15] sugeno, m., yasukawa, t.: a fuzzy-logic-based approach to qualitative modeling. ieee trans. on fuzzy systems, february 1993, vol. 1, no. 1, p. 7–31. [16] šulc, b.: integral wind-up in control and system simulation. in: control engineering solutions. a practical approach (p. albertos, r. strietzel and n. mort (editors)), iee, london 1997, p. 61–76, isbn 0-85296-829-9. bohumil šulc phone:+420 224 352 531 fax: +420 233 336 414 e-mail: sulc@fsid.cvut.cz ing. javed alam jan, ph.d. e-mail: javed@student.fsid.cvut.cz department of instrumentation and control engineering czech technical university in prague faculty of mechanical engineering technická 4 166 07 prague 6, czech republic © czech technical university publishing house http://ctn.cvut.cz/ap/ 47 acta polytechnica vol. 42 no. 3/2002 x1, xc position of cylinder piston [m] x2, vc velocity of cylinder piston [m/s] x3, p1 pressure of oil in chamber 1 [pa] x4, p2 pressure of oil in chamber 2 [pa] x5, xs position of valve piston [m] x6, vs velocity of valve piston [m/s] q1/2 flow in pipe 1 and 2 [m 3/s] v0,1/2 volume of oil in chamber at xc � 0 [m 3] ff friction force [n] eoil bulk modulus of oil [pa] a1 piston area in chamber 1 [m 2] a2 piston area in chamber 2 [m 2] d diameter of piston of cylinder [m] rc radial clearance between piston and cylinder [m] l length of piston head [m] b spool valve damping [1] �n spool valve natural frequency [s �1] � dynamic viscosity [pa � s] table 2: lists of symbols ap04-bittnar1.vp 1 introduction calcined gypsum is a historical binder that was used already several thousands of years ago. it has been found in the binder of buildings in the territory of present-day syria dated 7000 b. c., and it was also used in the cheops pyramid 2650 b. c. 
calcined gypsum is used in many technological modifications, which aim to improve its properties, in particular as a binder of rendering mortars, for the production of stuccowork and also for plasters. in interior applications and in the binder of rendering mortars, calcined gypsum is still employed, due to easy processing and low energy consumption in the production process. in the second half of the 20th century, technologies were developed for the desulfurization of flue gases in power stations and heating plants. these methods are based on the reaction of sulfur dioxide (so2), formed during the combustion of brown coal with a high content of sulfur, with limestone caco3. although it seemed that these methods are suitable from the point of view of protection of the environment, there is currently opposition to these technologies. it has been pointed out that the price of desulfurization equipment is too high, and that a large amount of high quality limestone is consumed while a huge amount of flue gas desulfurization (fgd) gypsum is formed as a waste product. according to the information given in the mining yearbook 2000, the amount of sulfur in higher quality brown coal for households ranges from 0.9 % to 1.78 %. the coal for energy production contains up to 2.5 % of sulfur. flue gas desulfurization of one power station block creates up to 20 t of fgd gypsum per hour. fgd gypsum is produced in great quantities, but is insufficiently used. calcined gypsum is produced from fgd gypsum in only one power station in the czech republic, while the remaining production ends with gypsum that is used only partially as an additive for retarding the setting of cement. calcined gypsum is mostly used for the production of gypsum plasterboard. gypsum that is not utilized is deposited as waste. therefore, it is very desirable to pay attention to the utilization of calcined gypsum also in those applications where it has not previously been used, i.e. in exteriors. utilization of binders with minimal energetic demand is in accordance with the current trend in production, when building materials including binders should be produced with a minimized impact on the environment, i.e., with minimal or no production of co2 and minimal demands on energy. examples of such binders are belitic cements, binders based on silicate waste products and also calcined gypsum. calcined gypsum as a low-energy material can be produced from waste fgd gypsum by dehydrating it at temperatures between 110 and 150 °c. then, the β-form of calcined gypsum is formed according to the equation
caso4·2h2o → caso4·½h2o + 1½h2o. (1)
the solid structure of calcined gypsum is created by reverse hydration, when gypsum caso4·2h2o is again formed. this compound is relatively soluble in water, its solubility being 0.256 g in 100 g of water at 20 °c. therefore, it cannot be utilized in exterior applications, as rain water could dissolve the product that should safeguard the mechanical properties of the material. in order to use gypsum elements in exteriors, it is necessary to modify them so that they will exhibit more suitable properties and a longer service life. modifications of gypsum are usually performed using polymer materials. bijen and van der plas [1] reinforced gypsum with e-glass fibers, and modified the binder by using acrylic dispersion in a mixture with melamine. the results show that this material had higher flexural strength and higher toughness than glass fiber reinforced concrete after 28 days.
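as a side note, the mass balance of equation (1) can be checked numerically. the following short python sketch is our illustration (simple molar-mass bookkeeping), not part of the original study:

# theoretical mass loss for caso4·2h2o -> caso4·0.5h2o + 1.5 h2o (eq. 1)
M_CASO4 = 136.14   # molar mass of caso4 [g/mol]
M_H2O = 18.015     # molar mass of water [g/mol]

m_dihydrate = M_CASO4 + 2.0 * M_H2O      # gypsum, caso4·2h2o
m_hemihydrate = M_CASO4 + 0.5 * M_H2O    # calcined gypsum, caso4·0.5h2o
loss = 1.5 * M_H2O / m_dihydrate         # fraction of mass released as water

print(f"gypsum:      {m_dihydrate:.2f} g/mol")
print(f"hemihydrate: {m_hemihydrate:.2f} g/mol")
print(f"mass loss on calcination: {100 * loss:.1f} %")   # about 15.7 %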
a disadvantage of polymers based on the carbon chain is a decrease in the fire resistance of calcined gypsum elements. generally, it can be stated that the resistance of hardened gypsum to water has not yet been resolved in a satisfactory way. in the literature, only applications of lime and artificial resins (polyvinylacetate, urea formaldehyde and melamine formaldehyde) have been studied; some inorganic substances such as fluorosilicates, sulfates and silicates were found to increase the surface hardness and impermeability, see schulze et al. [2]. therefore, our primary aim is to adjust basic technologies for the production of modified gypsum, particularly from the point of view of hydrophobization and improving its mechanical, hygric and thermal properties. in this paper, we present reference measurements of the mechanical, thermal and hygric properties of common fgd gypsum, which will be utilized for a comparison with various types of modified gypsum in the future.
2 experimental methods
2.1 bending strength and compressive strength
the measurement of bending strength was performed according to the czech standard čsn 72 2301 [3] on 40×40×160 mm prisms. the specimens were demolded 15 minutes after the final setting time and stored in the testing room. each specimen was positioned in such a way that the sides that were horizontal during the preparation were in the vertical position during the test. the experiment was performed as a common three-point bending test using a wpm 50 kn device. the distance of the supporting cylinders was 100 mm. the bending strength was calculated according to the standard evaluation procedure. the measurements were done 2 hours, 1 day, 3 days, 7 days, 14 days and 28 days after mixing. compressive strength was determined in accordance with the czech standard čsn 72 2301 on the halves of the specimens left over after the bending tests. the specimens were placed between the two plates of the wpm 100 kn device in such a way that their lateral surfaces, adjoining during the preparation the vertical sides of the molds, were in contact with the plates. in this way, the imprecision of the geometry on the upper cut-off side did not have a negative effect on the experiment. the compressive strength was calculated as the ratio of the ultimate force and the load area.
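the "standard evaluation procedure" is not spelled out in the text; for a three-point bending test it is usually the elastic beam formula. the sketch below is our illustration of such an evaluation for the 40×40×160 mm prisms with the 100 mm span quoted above; the forces are assumed values, not measured data:

# three-point bending and compressive strength of a 40x40x160 mm prism
# (assumed forces; the 100 mm span and the cross-section are from the text)
b = h = 0.040        # cross-section [m]
L = 0.100            # distance of the supporting cylinders [m]

def bending_strength(F):
    """sigma_f = 3*F*L / (2*b*h^2), the usual three-point bending formula [pa]."""
    return 3.0 * F * L / (2.0 * b * h ** 2)

def compressive_strength(F, area=0.040 * 0.040):
    """ratio of the ultimate force and the load area [pa]."""
    return F / area

print(bending_strength(1.5e3) / 1e6, "mpa")      # e.g. F = 1.5 kn -> ~3.5 mpa
print(compressive_strength(21e3) / 1e6, "mpa")   # e.g. F = 21 kn -> ~13.1 mpa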
2.2 moisture diffusivity
2.2.1 determination of the apparent moisture diffusivity from a water sorption experiment
a common water sorption experiment was carried out. the specimen was water- and vapor-proof insulated on four lateral surfaces and the face side was immersed 2 mm in water. a constant water level in the tank was achieved using a bottle placed upside down. the known water flux into the specimen during the suction process was then employed to determine the water absorption coefficient. the samples were tested in constant temperature conditions. to calculate the apparent moisture diffusivity d_w [m^2 s^-1], the following approximate relation was employed:
d_w = (a / w_c)^2, (2)
where a is the water absorption coefficient [kg m^-2 s^-1/2], and w_c is the saturated moisture content [kg m^-3].
2.2.2 determination of moisture diffusivity from moisture profiles
the capacitance method [4] was employed to measure the moisture content; the measuring frequency was 250–350 khz. the parallel electrodes of the capacitance moisture meter had dimensions 20×40 mm. the moisture profiles were determined using a common capillary suction 1-d experiment in the horizontal position, and the lateral surfaces of the specimens were water- and vapor-proof insulated. a moisture meter reading along the specimen was taken every 5 mm. the calibration curve was determined after the last moisture meter reading, when the moisture penetration front was about one half of the length of the specimen, using this last reading and the standard gravimetric method after cutting the specimen into 1 cm wide pieces. the final calibration curve for the material was constructed from the data of 6 samples. the moisture profiles were then calculated from the calibration curve. the measurements were done at 25 °c ambient temperature. the moisture diffusivity was determined by the matano method [5].
2.3 water vapor diffusion coefficient
2.3.1 standard cup methods
in the standard cup methods (dry and wet), the water vapor diffusion coefficient d was calculated from the measured data according to the equation
d = (Δm · d · r · t) / (s · τ · m · Δp_p), (3)
where d is the water vapor diffusion coefficient [m^2 s^-1], Δm the amount of water vapor diffused through the sample [kg], d the sample thickness [m], s the specimen surface in contact with the water vapor [m^2], τ the period of time corresponding to the transport of the mass of water vapor Δm [s], Δp_p the difference between the partial water vapor pressure in the air below and above the specimen [pa], r the universal gas constant [j mol^-1 k^-1], m the molar mass of water [kg mol^-1], and t the absolute temperature [k]. on the basis of the diffusion coefficient d, the water vapor diffusion resistance factor μ was determined:
μ = d_a / d, (4)
where d_a is the diffusion coefficient of water vapor in the air [m^2 s^-1]. in the dry cup method, a sealed cup containing silica gel was placed in a controlled climate chamber with 50 % relative humidity and weighed periodically. for the wet cup method, a sealed cup containing water was placed in an environment with a temperature of about 25 °c and relative humidity about 50 %. the measurements were done at 25 °c over a period of two weeks. the steady-state values of mass gain or mass loss determined by linear regression for the last five readings were used to determine the water vapor transfer properties.
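to make equations (2)-(4) concrete, the short python sketch below evaluates them; the sorption and cup inputs (a, w_c, Δm, τ, Δp_p) are illustrative assumptions, while the cup surface follows from the 105 mm specimen diameter given in section 3:

# equations (2)-(4); illustrative inputs, not measured values from this paper
import math

R = 8.314      # universal gas constant [j mol^-1 k^-1]
M = 0.018015   # molar mass of water [kg mol^-1]

def apparent_moisture_diffusivity(A, w_c):
    """eq. (2): d_w = (A / w_c)^2; A in kg m^-2 s^-1/2, w_c in kg m^-3."""
    return (A / w_c) ** 2

def cup_diffusion_coefficient(dm, d, S, tau, dpp, T=298.15):
    """eq. (3): water vapor diffusion coefficient from a cup test [m^2 s^-1]."""
    return (dm * d * R * T) / (S * tau * M * dpp)

def resistance_factor(D, D_a=2.5e-5):
    """eq. (4); D_a ~ 2.5e-5 m^2 s^-1 for water vapor in air at about 25 °c."""
    return D_a / D

S = math.pi * 0.0525 ** 2                 # face of a 105 mm diameter cylinder
d_w = apparent_moisture_diffusivity(A=0.1, w_c=350.0)
D = cup_diffusion_coefficient(dm=1.2e-4, d=0.015, S=S, tau=86400.0, dpp=1580.0)
print(d_w, D, resistance_factor(D))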
2.3.2 transient method
in the transient method designed in [6], the measuring device consists of two airtight glass chambers separated by a board-type specimen of the measured material. in the first chamber, a state near to 100 % relative humidity is maintained (achieved with the help of a cup of water), while in the second chamber, there is a state close to 0 % relative humidity (established using a desiccant, in our case silica gel). alternately, saturated salt solutions establishing defined relative humidity conditions can be placed in either the wet or the dry chamber, or in both of them. the change in the mass of water in the cup and the mass of the desiccant are recorded by an automatic balance in dependence on time. if steady-state measurements are also required, the validity of the condition that the change in the mass of water equals in absolute value the change in the mass of the desiccant is tested, and the experiment continues until this condition is realized. the experiment is carried out under isothermal conditions, as in the case of the standard cup methods.
2.4 thermal conductivity and volumetric heat capacity
thermal properties were measured using isomet 2104 (applied precision, ltd., sk). this is a multifunctional instrument for measuring the thermal conductivity λ [w m^-1 k^-1], volumetric heat capacity c [j m^-3 k^-1] and temperature [°c] of a wide range of materials. the thermal diffusivity a [m^2 s^-1] is calculated by the device from the formula
a = λ / c. (5)
the measurements were done using surface probes with samples, which were placed at laboratory conditions of 25 °c and about 50 % relative humidity. the relative moisture content by mass of the samples was about 18 %.
2.5 linear thermal expansion coefficient
the linear thermal expansion coefficient α_t was determined in a common way using the measured length changes (carl zeiss optical contact comparator with a precision of ±0.5 μm) between two different temperatures: 25 °c and 80 °c. it was calculated from the formula
α_t = (1 / l_0,t) · (dl / dt), (6)
where l_0,t is the length at a reference temperature.
3 material and samples
the material used for the reference measurements was the β-form of calcined gypsum with a purity higher than 98 %, produced from fgd gypsum at the electric power station at počerady, cz. the water/gypsum ratio was 0.627. the samples were mixed according to czech standard čsn 72 2301. for the measurements of the particular mechanical, thermal and hygric parameters, we used the following samples:
bending strength and compressive strength: 8 sets of 3 specimens, each 40×40×160 mm;
moisture diffusivity (capacitance method): 6 specimens 20×40×300 mm;
apparent moisture diffusivity: 4 specimens 50×50×23–25 mm;
water vapor diffusion coefficient: 12 cylinders with a diameter of 105 mm and a thickness of 10–22 mm;
thermal conductivity and volumetric heat capacity: 6 specimens 70×70×70 mm;
linear thermal expansion coefficient: 5 specimens 40×40×160 mm.
the samples for determining the moisture diffusivity were insulated on all lateral surfaces by water- and vapor-proof plastic foil; the samples for measuring the water vapor diffusion coefficient were also water- and vapor-proof insulated on the lateral surfaces by epoxy resin.
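before turning to the results, equations (5) and (6) can be cross-checked numerically; the conductivity and heat capacity below are the fgd gypsum values reported later in table 3, while the specimen lengths in the second part are assumed for illustration:

# consistency check of eq. (5) with the values reported in table 3
lam = 0.47        # thermal conductivity [w m^-1 k^-1]
c_vol = 1.60e6    # volumetric heat capacity [j m^-3 k^-1]
a = lam / c_vol   # thermal diffusivity [m^2 s^-1], eq. (5)
print(f"a = {a:.2e} m^2/s")   # ~2.9e-07, matching the tabulated 0.29e-6

# eq. (6) approximated by a finite difference between the two test temperatures
l_25, l_80 = 0.160000, 0.160064   # assumed specimen lengths [m] at 25 and 80 °c
alpha_t = (1.0 / l_25) * (l_80 - l_25) / (80.0 - 25.0)
print(f"alpha_t = {alpha_t:.2e} 1/k")  # ~7.3e-06, close to the tabulated 7.22e-6

the recomputed diffusivity reproduces the tabulated value, which is a useful internal consistency check of table 3.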
4 experimental results
the basic properties of the studied material are shown in table 1. in addition to these measurements, which are commonly performed for all porous building materials, we also made a classification of fgd gypsum according to czech standard čsn 72 2301. this classification consists in determining the grinding fineness using the 0.2 mm sieve residue, the initial and final setting times using a vicat device, and the compressive strength two hours after mixing. the results are summarized in table 2. according to these results, fgd gypsum can be classified as g-13 b iii.
table 1: basic properties of fgd gypsum
bulk density [kg m^-3]: 1019 ± 1.5 %; matrix density [kg m^-3]: 2530 ± 2.0 %; open porosity [% by volume]: 60 ± 3.4 %
table 2: classification of fgd gypsum using čsn 72 2301
compressive strength [mpa]: measured 13.3 (čsn limit: minimum 13.0)
initial setting time [min]: measured 9 (čsn limit: earliest 6)
final setting time [min]: measured 13 (čsn limit: latest 30)
0.2 mm sieve residue [%]: measured 1.79 (čsn limit: maximum 2)
classification according to čsn: g-13 b iii
the dependence of compressive strength and bending strength on time for the first 28 days after mixing is given in fig. 1. we can see that both strengths decrease slightly for approximately 3 days, but then they begin to increase rapidly and the maximum strengths are achieved after 14 days. these changes are apparently related to the change of moisture content in the specimens. while the moisture content for the 2-hour specimens was 67 % kg/kg, for the 28-day specimens it was only 24 %. so, both compressive strength and bending strength were significantly improved by the drying of the specimens.
fig. 1: compressive strength and bending strength of fgd gypsum during the first 28 days
fig. 2 shows typical moisture profiles determined by the capacitance method. fig. 3 presents the dependence of the moisture diffusivity on the moisture content calculated using the moisture profiles, together with the apparent moisture diffusivity determined on the basis of the water absorption coefficient. clearly, the agreement of both measurements is very good, the value of the apparent moisture diffusivity being equal to the moisture diffusivity determined from moisture profiles at 83 % of the capillary water saturation value. table 3 presents the basic thermal properties of fgd gypsum. table 4 shows the results of the measurements of the water vapor diffusion resistance factor using the dry cup and wet cup methods and also the transient method. we can see that the results obtained by the transient method are very close to the results of the wet cup method, the difference being within the error range of both methods. this seems to indicate that the boundary condition on the wet side, with the relative humidity close to 100 %, affected the measurements more significantly than the condition on the dry side.
fig. 2: typical moisture profiles in fgd gypsum specimens (reading times t1 = 300 s to t9 = 11400 s)
fig. 3: moisture diffusivity of fgd gypsum (from moisture profiles and from the water sorption experiment)
table 3: basic thermal properties of fgd gypsum
thermal conductivity [w m^-1 k^-1]: 0.47 ± 10 %; volumetric heat capacity [j m^-3 k^-1]: (1.60 ± 10 %) e6; thermal diffusivity [m^2 s^-1]: (0.29 ± 10 %) e-6; linear thermal expansion coefficient [k^-1]: (7.22 ± 15 %) e-6
table 4: water vapor diffusion resistance factor
dry cup: 17.3 ± 15 %; wet cup: 5.44 ± 15 %; transient method: 5.3 ± 5 %
5 discussion
the possibilities of comparing the material parameters of fgd gypsum measured in this paper with the parameters determined by other scientists, at least for common gypsum, are very limited. as for fgd gypsum, no data at all were found in common sources. among the basic properties, klein and von ruffer [7] found a porosity of 55 % for gypsum with a water to gypsum ratio of 0.67–0.72. mrovec and perková [8] indicate the bulk density of cast gypsum blocks to be between 840 kg m^-3 and 1130 kg m^-3. the bulk density of fgd gypsum falls between these limits. according to čsn 73 0540-3 [9] the bulk density of plasterboard is 750 kg m^-3. as for the mechanical properties, klein and von ruffer [7] determined a compressive strength of 20 mpa and a bending strength of 4 mpa for β-gypsum with a water to gypsum ratio of 0.67–0.72. singh and garg [10] determined the compressive strength of raw gypsum to be 12–14 mpa in dependence on ph. for gypsum with a water to gypsum ratio of 0.6, tazawa [11] measured a compressive strength of 18.2 mpa and a bending strength of 5.59 mpa. we can see that particularly the compressive strength of fgd gypsum is much higher than the mentioned reference data. for the thermal properties, mehaffey et al. [12] indicate a thermal conductivity of gypsum of 0.25 w m^-1 k^-1. sultan [13] gives a thermal conductivity of 0.25 w m^-1 k^-1 for gypsum in the temperature range of 20–100 °c. mrovec and perková [8] indicate the thermal conductivity of gypsum as 0.20 w m^-1 k^-1. in comparison with these data, the thermal conductivity of fgd gypsum is about two times higher. among the hygric parameters, hanusch [14] determined the water vapor diffusion resistance factor μ in dependence on the thickness of the plasterboard. for a thickness of 9.5 mm he obtained μ = 10 (for 0 and 50 % relative humidity) and μ = 6.5 (for 50 and 100 % relative humidity); for a thickness of 18 mm, μ = 8.5 (for 0 and 50 % relative humidity) and μ = 5.5 (for 50 and 100 % relative humidity). as we can see, the results of the wet cup measurements with fgd gypsum in this paper correspond reasonably well with the data in [14], but the dry cup data are higher than in [14]. for a detailed and more serious comparison of the data obtained for fgd gypsum in this paper with the results measured by other scientists, we unfortunately suffer from a lack of more detailed information in the above sources. the authors usually make just references to national standards and requirements that apply to the production and processing of the specimens, and to the testing methods, which are not easily accessible. this complicates a possible comparison. in addition, some of the authors used plasterboard as the studied material, i.e., a gypsum board covered by a paper surface. from the technological point of view, special additives such as setting retarders, additives increasing the fire resistance, etc., are used in the production of plasterboard. therefore, it is an open question whether we can talk of a common, unmodified gypsum in this case. as follows from the above considerations, any comparison with reference data can only be approximate. however, even from such a rough comparison it is quite clear that the fgd gypsum analyzed in this paper had significantly higher compressive strength, which is the most important parameter for cast gypsum blocks.
this indicates that the overall quality of fgd gypsum was much higher than that of the gypsum studied in the above reference papers.
6 conclusions
the main aim of the work done in this paper was to obtain a reference data set for unmodified fgd gypsum without any additives. this data set is relatively extensive, including not only mechanical properties but also thermal and hygric properties. these parameters will help in simulating the processes in the material, for instance in contact with water or air humidity, due to changes of temperature, or due to some other load. it also follows from the results obtained here that the basic material will have to be subjected to substantial modifications, primarily due to the worsening of its properties with increasing moisture content. protection against water and air humidity will have to be provided with the use of hydrophobization additives. also, some modifications need to be made, aimed at increasing compressive strength and bending strength, for instance using plasticizers. improvement of thermal properties also seems to be an important topic for future modifications. however, all these modifications will have to be cross-checked for possible negative effects on parameters other than those intended to be improved. the modified gypsum that will be developed in future research will be used in the production of cast blocks for application in envelope parts of building structures. using the measured data it will be possible to simulate both the mechanical and the hygrothermal performance of the designed structure of the envelope and to predict its long-term behavior and service life.
acknowledgment
this research has been supported by the czech science foundation, under grant no. 103/03/0006.
references
[1] bijen j., van der plas c.: “polymer-modified glass fibre reinforced gypsum.” materials and structures, vol. 25 (1992), p. 107–114.
[2] schulze w. et al.: “necementové malty a betony.” sntl praha, 1990.
[3] čsn 72 2301 gypsum binding materials, czech standard (in czech), vydavatelství úřadu pro normalizaci a měření, praha 1979.
[4] semerák p., černý r.: “a capacitance method for measuring the moisture content of building materials.” stavební obzor, vol. 6 (1997), p. 102–103 (in czech).
[5] drchalová j., černý r.: “non-steady-state methods for determining the moisture diffusivity of porous materials.” int. comm. heat and mass transfer, vol. 25 (1998), p. 109–116.
[6] černý r., hošková š., toman j.: “a transient method for measuring the water vapor diffusion in porous building materials.” proc. of international symposium on moisture problems in building walls, v. p. de freitas, v. abrantes (eds.), univ. of porto, porto, 1995, p. 137–146.
[7] klein d., von ruffer c.: “grundlagen zur herstellung von formengips.” keramische zeitschrift, vol. 49 (1997), p. 275–281.
[8] mrovec j., perková j.: pokyny pro projektanty. technické podklady firmy gypstrend (2003).
[9] čsn 73 0540–3. tepelná ochrana budov, část 3: výpočtové hodnoty veličin pro navrhování a ověřování, s. 15. vydavatelství úřadu pro normalizaci a měření, praha, 1994.
[10] singh m., garg m.: “retarding action of various chemicals on setting and hardening characteristics of gypsum plaster at different ph.” cement and concrete research, vol. 27 (1997), p. 947–950.
[11] tazawa e.: “effect of self-stress on flexural strength of gypsum-polymer composites.” advanced cement based materials, vol. 7 (1998), p. 1–7.
[12] mehaffey j. r., cuerrier p., carisse g.: “a model for predicting heat transfer through gypsum board/wood-stud walls exposed to fire.” fire and materials, vol. 18 (1994), p. 297–305.
[13] sultan m. a.: “model for predicting heat transfer through noninsulated unloaded steel-stud gypsum board wall assemblies exposed to fire.” fire technology, vol. 32 (1996), p. 239–259.
[14] hanusch h.: “übersicht über eigenschaften und anwendung von gipskartonplatten.” zement-kalk-gips, no. 5 (1974), p. 245–251.
ing. pavel tesárek, phone: +420 224 355 436, e-mail: tesarek@fsv.cvut.cz, department of structural mechanics
rndr. jaroslava drchalová, csc., phone: +420 224 354 586, e-mail: drchalov@fsv.cvut.cz, department of physics
czech technical university in prague, faculty of civil engineering, thákurova 7, 166 29 praha 6, czech republic
ing. jiří kolísko, ph.d., phone: +420 224 353 537, e-mail: kolisko@klok.cvut.cz, czech technical university in prague, klokner institute, šolínova 7, 166 08 prague 6, czech republic
doc. rndr. pavla rovnaníková, csc., phone: +420 541 147 633, e-mail: chrov@fce.vutbr.cz, institute of chemistry, university of technology brno, faculty of civil engineering, žižkova 17, 662 37 brno, czech republic
prof. ing. robert černý, drsc., phone: +420 224 354 429, e-mail: cernyr@fsv.cvut.cz, department of structural mechanics, czech technical university in prague, faculty of civil engineering, thákurova 7, 166 29 praha 6, czech republic
numerical analysis of the temperature field in luminaires
j. murín, m. kropáč, r. fric
this paper contains a calculation of the thermal field caused by electro-heat in lighting devices. after specifying the heat sources, a thermal analysis is made using the finite element method and the equivalent thermal scheme method. the calculated results have been verified experimentally.
keywords: thermal analysis, luminaire, choke coil, heat loss, surface temperature, finite element method.
1 introduction
thermal simulations and determination of surface temperatures play a very important role in the design of many engineering applications, including luminaires and their electronic components. knowing the temperature distribution can help us to achieve optimal properties of the final product. there are two main factors that can be important for a thermal analysis of a luminaire:
- the type of light source and control gear (with the exception of linear and compact fluorescent lamps, all conventional light sources have a declared surface temperature of more than 100 °c);
- the working conditions of the luminaire (the external climate or a special kind of surrounding can have an unwelcome influence on the temperature in the luminaire and on the electronic parts inside it).
we assume a luminaire for a linear fluorescent lamp, ip65, with conventional control gear, which consists of a power-factor capacitor, starting gear and a choke coil. this paper contains a numerical analysis of the temperature field in the choke coil. the heat sources are defined from an experimental determination of the heat losses in the choke coil. the results of the numerical analysis are checked by an analytic calculation of the surface temperature using an equivalent scheme. finally, an experimental measurement was made of the surface temperature on a real choke coil. the results and values obtained from the measurements will be used as inputs for a thermal analysis of the whole luminaire.
2 experimental determination of the heat loss in the choke coil
computing systems based on finite element methods, e.g., ansys [1], allow a thermal analysis during the design. using the results of such an analysis, the construction and configuration of the components inside the luminaire can be updated in accordance with allowable temperature limits.
in order to describe the heat losses as correctly as possible, we need to make the following measurement series:
- resistance measurement on the coil winding;
- measurement on the choke coil;
- measurement on the electric circuit of the light source.
3 resistance measurement on the coil winding
the resistance of the coil winding depends on the operating temperature, i.e., on the value of the current passing through it. first, we measured the resistance when the temperature of the choke coil was the same as the ambient temperature, and then after the choke coil had been working for 20 minutes, when we can assume steady running conditions and a constant working temperature. if the resistance of the coil winding is known and we assume that the nominal current passes through it, we can calculate the joule heat in the coil winding as
p = r_n · i_n^2, (1)
where r_n is the resistance at the nominal operating temperature (measured 58.9 Ω), i_n the nominal current, and p the joule heat (calculated 8.06 w).
fig. 1: example of a luminaire for a linear fluorescent lamp ip65 and the choke coil
4 measurement on the choke coil
it is necessary to measure the characteristic curve p = f(i) in the area of current values about the working point of the choke coil, because we are not able to make an accurate measurement of p directly. then, from p = f(i) we get p of the choke coil for i_n. this measurement gives us the total heat loss in the choke coil, and then we obtain the iron heat loss from the following equation:
p_i = p − p_w, (2)
where p_i is the iron heat loss, p_w the heat loss in the coil winding, and p the joule heat.
fig. 2: measurement on the choke coil
5 measurement on the electric circuit of the light source
to check the energetic state in the circuit of the light source, we have to make the following measurement and find out the general loss of energy in this circuit (fig. 3). we assume that the input power of the light source (fluorescent lamp, 18 w) is 17.5–18.5 w; we can ignore the loss in the other electric components, so the loss in the choke coil is about 12 w, and this value is in accordance with our previous measurement. this measurement series provides the values of the heat sources represented by the conventional control gear, which will now be used for the thermal analysis.
table 1: measured and computed parameters
u [v]; i [a]; p’ [w]; i1 [a]; pvw [w]; p [w]; pj [w]; pfe [w]; pj [%]; pfe [%]
230; 0.396; 20; 0.36; 8.45; 11.55; 7.60; 3.95; 66; 34
232; 0.401; 20.5; 0.36; 8.60; 11.90; 7.80; 4.10; 66; 34
234; 0.402; 21; 0.36; 8.75; 12.25; 7.83; 4.42; 64; 36
236; 0.407; 21.5; 0.37; 8.90; 12.60; 8.03; 4.57; 64; 36
238; 0.410; 22; 0.37; 9.05; 12.95; 8.15; 4.80; 63; 37
240; 0.421; 22.5; 0.38; 9.20; 13.30; 8.62; 4.67; 65; 35
fig. 3: measurement on the electric circuit of the light source
fig. 4: heat loss divided in the choke coil
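equations (1) and (2) amount to simple bookkeeping; the python sketch below is our illustration using the quoted winding resistance of 58.9 Ω and the 236 v row of table 1 for the total loss:

# joule heat in the winding (eq. 1) and splitting of the total loss (eq. 2)
R_n = 58.9      # operating-temperature winding resistance [ohm]
I_n = 0.37      # nominal current [a]

P_w = R_n * I_n ** 2          # heat loss in the coil winding, ~8.06 w
P_total = 12.60               # total choke-coil loss, 236 v row of table 1 [w]
P_fe = P_total - P_w          # iron (eddy-current and hysteresis) loss, eq. (2)

print(f"winding loss: {P_w:.2f} w, iron loss: {P_fe:.2f} w")
print(f"split: {100 * P_w / P_total:.0f} % / {100 * P_fe / P_total:.0f} %")  # ~64/36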
6 analytic calculation of surface temperature
for an analytic calculation of the surface temperature we assume that: the choke coil represents the heat source, where the heat generation is constant in the whole volume; the ambient heat sources are irrelevant; and the heat from the surface is transferred by convection and radiation (see fig. 5). the heat resistances r_k (convection) and r_r (radiation) depend on the surface temperature, so an iterative computation of the final surface temperature must be used, ref. [2], [3].
step i. inputs for the calculation: joule heat losses p [w]; surface of the choke coil a [m^2]; characteristic dimension l [m]; emissivity of the surface ε [-]; ambient temperature t∞ [°c].
step ii. calculation of the heat resistances:
r_k = l / (nu · λ · a), (3)
where r_k is the convection resistance, l the characteristic dimension, nu the nusselt number, λ the air heat conduction coefficient, and a the surface of the choke coil;
r_r = (t_s − t∞) / { ε · c_0 · a · [ (t_s/100)^4 − (t∞/100)^4 ] }, (4)
where r_r is the radiation resistance, ε the surface emissivity, c_0 the radiation coefficient of the black body surface, t_s the surface temperature, and t∞ the ambient temperature (absolute temperatures in the bracketed terms).
step iii. surface temperature calculation:
r = (r_k · r_r) / (r_k + r_r), (5)
t_s = p · r + t∞, (6)
where r is the total heat resistance. the iteration cycle repeats step ii and step iii until the surface temperature difference between the cycles is Δt < 0.1 °c.
fig. 5: equivalent thermal scheme for the analytic calculation of the surface temperature
fig. 6: analytic calculation of the surface temperature for different emissivities (0.7, 0.8, 0.9) and joule heat losses (p = 10.5–13.5 w, t = 58–70 °c)
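steps i-iii translate directly into a fixed-point iteration. the sketch below is our reconstruction: the nusselt correlation of eq. (3) is deliberately replaced by an assumed constant convection coefficient (8.5 w/m^2k, the value arrived at in section 7), and the heat loss and surface area are assumed inputs:

# fixed-point iteration for the surface temperature (steps i-iii)
P = 12.0        # joule heat losses [w] (assumed, cf. section 5)
A = 0.03        # choke-coil surface [m^2] (assumed value)
eps = 0.8       # surface emissivity [-]
C0 = 5.67       # black-body radiation coefficient [w m^-2 k^-4], (t/100)^4 form
alpha = 8.5     # convection coefficient replacing nu*lambda/l [w m^-2 k^-1]
t_amb = 19.0    # ambient temperature [degc]

t_s = t_amb + 10.0                      # initial guess
while True:
    R_k = 1.0 / (alpha * A)             # convection resistance, cf. eq. (3)
    Ts4 = ((t_s + 273.15) / 100.0) ** 4
    Ta4 = ((t_amb + 273.15) / 100.0) ** 4
    R_r = (t_s - t_amb) / (eps * C0 * A * (Ts4 - Ta4))   # eq. (4)
    R = R_k * R_r / (R_k + R_r)                          # eq. (5)
    t_new = P * R + t_amb                                # eq. (6)
    t_s, t_prev = t_new, t_s
    if abs(t_s - t_prev) < 0.1:         # stop at dt < 0.1 degc
        break
print(f"surface temperature: {t_s:.1f} degc")

with these assumed inputs the iteration settles near 48 °c, i.e., within the 46-73 °c band spanned by the extreme cases discussed in the next section.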
7 numerical modelling of the heat source in a lighting device
following the previous analysis, the conditions of unambiguity for the evaluation of the temperature rise in the choke coil can be specified by the finite element method (fem, [4]). the choke coil was modelled as a box with an assembly plate and with separated volumes of the coil winding and the magnetic circuit iron. we applied the appropriate boundary conditions and material properties to these volumes. we defined the heat flux conditions on the surfaces. the heat was generated in both the coil winding and the iron (heat loss caused by the eddy current). we assumed that the choke coil was hanging in the air and that the heat from the surface was transferred by convection and radiation into space. we evaluated the heat convection coefficient from the criteria equations by the iteration cycle, where the final value is α = 8.5 w/m^2k. in real cases the coefficient value can vary between 7.5 and 9.5. next, we assumed the surface emissivity of the radiation to be in the range 0.7–0.9.
table 2: heat generation in the materials of the choke coil
material; generated heat [w/m^3]; generated heat [w]; thermal conductivity [w/m k]
cu; 564 799; 8.35; 380
fe; 79 319; 4.15; 67
table 3: expected parameters of convection and radiation of the heat
surface emissivity: 0.7–0.9; bulk temperature: 19 °c; coefficient of convection: 7.5–9.5
fig. 7: volumes and heat generation (w/m^3, ansys)
we assumed two marginal extreme cases, in which we obtain the minimum and maximum possible values of the heat and temperatures. the real temperature will occur within this range (between these two values). we modelled the temperature field in the choke coil by the ansys program [1], using 3d solid and surface elements. the boundary conditions applied in the modelling corresponded to the conditions from the measurements of the heating of the real choke coil. the results of the analysis (using average values of the emission and convection coefficients) are displayed in figs. 8, 9 and 10.
fig. 8: temperature field on the surface of the choke coil (α = 8.5 w/m^2k; ε = 0.8)
fig. 9: temperature field of the iron magnetic circuit (α = 8.5 w/m^2k; ε = 0.8)
fig. 10: temperature field of the copper coil and assembly sheet (α = 8.5 w/m^2k; ε = 0.8)
table 4: temperature of the choke coil in extreme cases
surface emissivity: 0.7; 0.8; 0.9
coefficient of convection: 7.5; 8.5; 9.5
final maximum temperature [°c]: 72.9; 67.3; 62.9
final minimum temperature [°c]: 55.1; 50; 45.9
8 measurement of the temperature rise in the choke coil
as a verification of the numerical analysis results, we measured the heating of the real choke coil. the measurement was performed under the same conditions as those assumed in the numerical analysis, with the choke coil hanging in the air. the coil was fed by its nominal current i_n = 0.37 a. the measurement of the surface temperatures was performed by several cu-ko heat sensors installed on the choke coil surface at the locations displayed in fig. 11 (sensor no. 4 remained free to check the bulk temperature). the measurement was dynamic and ran during the first two thirds of the heating curve. the final temperature was approximated according to the slope of the heating curve. the increases in the temperature (heating curves) for each measurement point are displayed in fig. 12. the following temperature values were measured: maximum measured temperature 68.5 °c (at point 5); minimum measured temperature 41 °c (at point 3).
fig. 11: temperature measurement points
fig. 12: temperature rise curves of the choke coil
9 conclusion
it can be seen that our results for the numerical and analytical analyses of the surface temperature correspond to the measured values. the differences in the lowest temperatures can be caused by the fact that the real abrasive and multiform surface is too difficult to model, so we cannot create an exact and precise model of the choke coil. nevertheless, in general we can state that the measurement of the surface temperature confirmed the results of the numerical analysis obtained by ansys, as regards both the values and the distribution, so we can use them in a thermal analysis of the whole luminaire.
references
[1] ansys 5.7, theory manual.
[2] kalousek m., hučko b.: prenos tepla. bratislava: vydavateľstvo stu, 1996.
[3] janíček f., murín j., lelák j.: “load conditions and evaluation of the rise of temperature in an enclosed conductor”. journal of electrical engineering, vol. 52, no. 7–8, 2001, p. 210–215.
[4] rao s. s.: the finite element method in engineering. oxford: pergamon press, 1982.
prof. ing. justín murín, drsc., phone: +420 260 291 452, fax: +420 265 427 192, e-mail: justin.murin@elf.stuba.sk, department of mechanics
ing. miroslav kropáč, phone: +420 260 291 151, fax: +420 265 425 826, e-mail: kropac@elf.stuba.sk, department of power engineering
ing. róbert fric, phone: +420 260 291 415, fax: +420 265 427 192, e-mail: fric@elf.stuba.sk, department of mechanics
slovak university of technology in bratislava, faculty of electrical engineering and information technology, ilkovičova 3, 812 19 bratislava, slovak republic
acta polytechnica 61(4):511–515, 2021, https://doi.org/10.14311/ap.2021.61.0511, © 2021 the author(s), licensed under a cc-by 4.0 licence, published by the czech technical university in prague
use of thermal analysis for the detection of calcium oxalate in selected forms of plastering exposed to the effects of serpula lacrymans
drahomíra cígler žofková a,∗, jiří frankl b, dita frankeová b
a czech technical university in prague, faculty of civil engineering, thákurova 7, 166 29 prague 6, czech republic
b czech academy of sciences, institute of theoretical and applied mechanics, prosecká 809/76, 190 00 prague 9, czech republic
∗ corresponding author: drahomira.zofkova@post.cz
abstract. the article discusses the interaction of metabolic products of a wood-destroying fungus of the dry rot species (serpula lacrymans (wulfen) p. karst.) with a commonly used lime mortar. mortar samples used in the presented experiment were made mostly in laboratory conditions so as to make it possible to set input conditions and to determine initial properties of the examined samples.
matured lime mortar samples were placed in cultivation boxes with a growth of serpula lacrymans and exposed to its action for a predetermined period of time. for a comparison, mortar samples taken “in situ” from real structures were also subjected to the experiment. the examined samples were subjected to a thermal analysis and a comparative measurement by infrared spectroscopy (ftir). the results of the measurement of infected samples were compared with the results obtained in the reference (control) samples. the experiment carried out was focused on assessing the presence of calcium oxalate, which is secreted into the surroundings of the mycelium during the active growing of serpula lacrymans.
keywords: dry rot, serpula lacrymans, lime mortar, thermal analysis, infrared spectroscopy (ftir), calcium oxalate.
1. introduction
our working group focused on acquiring new knowledge in the field of detecting calcium oxalate in the activity of serpula lacrymans when in contact with lime mortars, i.e., building materials containing calcium ca2+, which is typical of czech buildings. serpula lacrymans is one of the brown rot fungi whose activities cause significant material and financial losses in the order of millions of dollars a year in many countries around the world [1]. this fungus is the most common saprophyte and is considered to be the most destructive and least controllable internal wood-destroying fungus, especially due to its ability to transport water and nutrients over long distances. the basic abiotic factors that affect the growth of serpula lacrymans include an optimal temperature in the range of 21-25 °c, a ph value in the range of acidic to neutral (optimally in the range of 4-5), limited air movement, a culture water potential of not less than -6 mpa, a moisture content in the wood around 20 % [2, 3], and probably also a source of a divalent ion such as calcium [1]. among the characteristics of the organism that rank serpula lacrymans at the top of the list of the most common wood-destroying fungi in buildings [4] is the highly efficient transport system that transports water, nitrogen, iron, calcium, etc., through rhizomorphs, and an efficient solubilization system. this allows the extraction of metal ions (especially fe2+ and ca2+) from plaster and stone around the mycelium [5], which the fungus then uses to its advantage [6–8]. during its growth, serpula lacrymans regulates the ph of the substrate through its own metabolic processes, while oxalic acid (c2h2o4) is formed as a metabolic product [6, 9], which has an effect on the decomposition of wood [10]. the issue of oxalic acid in wood-destroying fungi and its influence on the modification of the microenvironment is extensively discussed in several specialist publications [10–13].
2. aims
the presented study focuses on the issues of processes taking place in mortars that are in contact with serpula lacrymans. the objective is to monitor changes and evaluate the use of the method of thermal analysis in assessing the formation of calcium oxalate in lime mortars exposed to the effects of serpula lacrymans. a thermal analysis is a method commonly used in the evaluation of building materials (e.g., the carbonation of concrete [14], ceramic and glass raw materials [15], low-fired ceramics [16]) and was also used in the evaluation of the patina formed on the surface of dolomitic rocks used in the construction of a romanesque church in spain [17]. calcium oxalate is the most commonly found oxalate.
its occurrence is usually associated with the activity of fungi in both natural and building environments. in living organisms and the environment, the main forms are monohydrate (whewellite) and dihydrate (weddellite). the solubility of calcium oxalate increases at ph<5. its production can be influenced by many factors (e.g., the carbon and nitrogen source, ph of the environment, etc. [18]), which contribute to the biological deterioration of the properties of rock and mineral substrates, lignocellulosic materials and, indirectly, to the change and loss of elements (or traces) of cultural heritage [11]. oxalate can be potentially toxic and must therefore be eliminated or separated from the cells of the organism. oxalate detoxification can be performed by means of degradation, which is a common property of wood-destroying fungi [19]. oxalate detection can be evaluated during surveys of constructions as an indicator of the activity of living organisms. this study also focuses on this fact.
3. materials and methods
3.1. organisms
within the research activity, a pure form of serpula lacrymans culture was used, which was prepared by inoculation in the laboratory of the biological degradation of material of the institute of theoretical and applied mechanics of the cas, v. v. i. in the samples taken from a structure, the culture of the wood-destroying fungus was determined after staining it with agno3 solution on a versamet union 9666 epifluorescence microscope (japan). the identification was performed by a comparison with keys [20] and [21].
3.2. culture medium and conditions
culture vessels with a surface area of 210 × 140 mm and a height of 60 mm were prepared in which the test mortar bodies were exposed to the activity of the wood-destroying fungus. a 10 mm layer of malt extract agar base gel (w/mycological peptone – himedia laboratories pvt. ltd.) was prepared at the bottom of sterile boxes and a pure serpula lacrymans culture was inoculated onto it. the culture vessels were placed in a climatic chamber with stable conditions at a temperature of 23.5 ± 2.5 °c and a relative humidity of 65 ± 5 %. they were left there for 14 days. during this time, the growth of the serpula lacrymans mycelium covered the entire bottom of the culture vessels with a continuous layer.
3.3. mortar test specimens
a lime mortar consisting of lime putty and aggregate in a ratio of 1:3 was chosen for the production of test specimens [22]. lime putty was made from aqua lime putty burnt lime (produced by aqua obnova staveb s.r.o.), aggregate (fraction 0/4 mm) from the straškov locality and lukewarm drinking water from the water supply system. the test specimens were produced at room temperature with a size of 20 × 20 × 100 mm while making tests and complying with the requirements for fresh mortars pursuant to the prescribed čsn en 1015-3,7 standards as amended. the specimens were stored for 6 days in a moist environment (in a polyethylene bag), then removed from their moulds and a day later transferred to an air-conditioned room at 20 ± 3 °c and 60 ± 5 % relative humidity where they remained for 30 days. after the mortar samples were collected from the building, they were stored in closeable microtene bags and stored in their original condition in a cooling box at a temperature of 15 °c until laboratory measurements were performed.
3.4. verification of calcium oxalate (caox) crystal production by means of thermal analysis
for our study, we used the thermogravimetric (tg) analysis. during the heating of the measured specimens, the change in weight was continuously recorded as a function of the rising temperature [23] on the sdt q600 instrument produced by ta instruments in the temperature range of 25-1000 °c. for the purposes of the analyses, 20-30 mg of a sieved mortar sample was weighed out into ceramic crucibles using a standardized sieve of 0.063 mm in order to separate the aggregate from the binder and to refine the results. the thermal decomposition was performed in a nitrogen atmosphere at a heating rate of 20 °c per minute for all measurements.
3.5. infrared spectroscopy (ftir)
the spread test samples of mortars used in the above thermal analyses were further analysed on the iz10 sub-module of the in10 infrared microscope (thermo scientific). they were measured using a transmission technique in a kbr tablet in the range of 4000-400 cm−1 with a spectral resolution of 4 cm−1.
4. results and discussion
within the study, samples of a laboratory-prepared lime mortar exposed to serpula lacrymans were tested under optimal conditions for its growth. samples of mortars taken “in situ” from a real building, where, at the place of sampling, the presence of serpula lacrymans had been recorded on a long-term basis, were also subjected to thermal analyses. the results of the analyses of samples of lime mortars damaged by the activity of the wood-destroying fungus were compared with samples that underwent the same temperature and humidity conditions but which, however, were not in contact with serpula lacrymans. this is a control measurement (so-called zero control). the comparative measurement was performed by means of the decomposition of calcium oxalate with the general formula cac2o4, the output of which is shown in figure 1. calcium oxalate is the result of the reaction of oxalic acid (c2h2o4) with a calcium-containing material. in our study, the source of this element is lime plastering.
figure 1. tg/dtg curves of calcium oxalate cac2o4.
both the tg and dtg curves recorded in figure 1 show the thermal decomposition of the whewellite mineral in three partial steps (1, 2, 3) that are separated from each other by plateaus. these plateaus correspond to thermally stable phases. the thermal decomposition begins at 100 °c. from this point, the dehydration of cac2o4·h2o begins; crystal water is released. this phase is completed at 205 °c. in the following section, up to a temperature of 380 °c, the formed anhydrous cac2o4 is thermally stable. between 380 and 545 °c, co is then released while caco3 is formed. up to a temperature of 605 °c, the calcium carbonate formed is thermally stable. beyond this point, its thermal decomposition to calcium oxide takes place and carbon dioxide is released. the decomposition is completed at 825 °c. the individual decomposition steps are described stoichiometrically by the equations below.
cac2o4·h2o → cac2o4 + h2o (1)
cac2o4 → caco3 + co (2)
caco3 → cao + co2 (3)
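the three steps (1)-(3) imply fixed theoretical mass losses, which is what makes the tg curve in figure 1 diagnostic. the sketch below is our illustration (simple molar-mass bookkeeping), not part of the original analysis:

# theoretical tg mass losses for cac2o4·h2o (whewellite), eqs. (1)-(3)
M_CA, M_C, M_O, M_H2O = 40.078, 12.011, 15.999, 18.015
m0 = M_CA + 2 * M_C + 4 * M_O + M_H2O   # cac2o4·h2o, ~146.1 g/mol

steps = {
    "step 1 (h2o, 100-205 degc)": M_H2O,
    "step 2 (co, 380-545 degc)": M_C + M_O,
    "step 3 (co2, 605-825 degc)": M_C + 2 * M_O,
}
for name, dm in steps.items():
    print(f"{name}: {100 * dm / m0:.1f} % of the initial mass")
# expected: ~12.3 % (h2o), ~19.2 % (co), ~30.1 % (co2)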
from the test beams of the lime mortars (marked sd1) produced in the laboratory, the surface of which was overgrown with mycelium of serpula lacrymans for 56 days, the mycelium was mechanically cleaned before sampling of the mortar sample, and sampling was performed from the surface of the tested specimens. spread-out mortar samples (without aggregate) weighing 21 mg were placed in the ceramic crucibles of the ta instrument. the output of the measurement is recorded in figure 2, in which three peaks corresponding to the temperature range for the decomposition of calcium oxalate are visible. the most significant interval corresponds to the thermal decomposition of caco3 in the range of 600-870 °c.
figure 2. tg/dtg curve of sample sd1.
figure 3. tg/dtg curve of sample hrádek nad ohří.
thermal measurements that followed were performed on mortar samples taken as part of a construction and technical survey of a structure in hrádek nad ohří, where they had been exposed to the long-term effects of serpula lacrymans activity. the mortar samples from the actual structure were treated similarly to the above-mentioned laboratory-prepared and damaged samples. a ground plaster sample weighing 27 mg was taken from the surface part, from which the serpula lacrymans mycelium had been mechanically removed with a brush, and weighed into ceramic crucibles. three areas corresponding to the calcium oxalate degradation peaks are visible in the measurement outputs, see figure 3. in the first phase, water is released, in the second phase, co is released, and the peak for the decomposition of caco3 is again the most marked. on the thermal analysis record (figure 3), the results correspond to an approximate content of 10 % cac2o4 in the measured sample (i. e., in the sieved fraction). the proportion of calcium carbonate formed by the transformation of calcium oxalate cannot be quantified, because it partly belongs to the decomposition of caco3 contained in the plaster. the temperature interval of the calcium carbonate decomposition is slightly shifted in the experimental measurements of the thermal analysis as compared to the temperature range in the decomposition of the comparative measurement of calcium oxalate. this is probably related to the crystallization modification of caco3 [24]. control measurements were made on the lime plaster samples produced in the laboratory (marked r1). the spread sieved samples weighing 23 mg were placed in ceramic cups of the thermal analysis apparatus and heated up to 1000 °c. the output of the measurement is shown in figure 4 and corresponds to the course of the thermal decomposition of a lime binder (plaster).
figure 4. tg/dtg curve of sample r1 (control).
at the beginning, the measured samples are gradually dehydrated and, in the temperature range of approximately 604-860 °c, a single distinct band of decomposition of caco3 is recorded. the above samples (sd1, hrádek nad ohří, r1) were independently subjected to a comparative measurement by the ftir method. the laboratory-produced plaster samples overgrown with serpula lacrymans (sd1) and the plaster samples from the structure (hrádek nad ohří) showed a similar course of spectra as the plaster samples without the activity of the biodeteriogen (r1) (see figure 5 a, b, c). the dominant features of the measured samples’ spectra were bands extending from 1350 to 1550 cm−1 and also vibration bands at 875 cm−1 and 1800 cm−1. the outputs were compared with the stored library spectra of caco3, shown in figure 5 (d), and sio2, the content of which in the measured samples was expected.
in the case of mortar samples from hrádek nad ohří, bands that correspond to the presence of calcium oxalate were recorded (∼1630 cm−1, 1320 cm−1 and a weak band of ∼780 cm−1); their library spectra in a pure form are shown in figure 5 (e). the outputs of the ftir measurement indicate the effect of the period of time during which the measured sample was exposed to the activity of the biodeteriogen. for this method of measurement, the time period of 56 days, during which the mortar sample was exposed to the activity of serpula lacrymans, was insufficient, and this exposure did not have a significant effect on the results as compared to the outputs in the plaster samples taken from the structure.
figure 5. ftir spectrum of the sample sd1 (a), hrádek nad ohří (b), r1 (c), caco3 (d), cac2o4 (e).
5. conclusions
as a part of the research, we used thermal analysis and infrared spectroscopy (ftir) techniques. samples of lime mortars produced in the laboratory and, for a comparison, mortars taken “in situ” from a real structure were subjected to these two analyses. the initial experiment using the thermal analysis was focused on mortar samples with various forms of surface damage caused by serpula lacrymans activity. figures 2, 3 and 4 indicate slight changes in the decomposition of the mortars observed in the thermal measurements. the data obtained indicate changes in the composition of the mortar samples in the absence of the wood-destroying fungus (r1) and when there is an activity of serpula lacrymans on the surface of the test specimens (sd1). however, it is not entirely clear from the performed measurements to what extent the presence of cac2o4 is caused exclusively by the influence of metabolic changes during the active growth of serpula lacrymans. the most pronounced bands (peaks) in the thermal analysis, corresponding to the first and second stages of calcium oxalate decomposition, were recorded in mortar samples exposed to the activity of serpula lacrymans. this phenomenon showed most markedly in the case of the mortar sample taken from the structure where the activity of the wood-destroying fungus was long-lasting. the bands (peaks) corresponding to the decomposition of caco3 are of a similar intensity for all tested mortar samples and are recorded in the same temperature range. for the weight losses caused by the release of co2 from caco3, as shown in the individual graphs (Δmc = 34.6 %, 33.6 % and 30.4 %), it is not possible to exactly determine the part of the carbon dioxide originating from the decomposition of the calcium carbonate from the binder of the mortar and the part originating from the decomposition of the calcium oxalate produced by serpula lacrymans. the conclusions of the performed ftir measurements correspond to the results of the thermal analysis for all measured sets. the most prominent bands of calcium oxalate were shown in mortar samples taken from the structure where the activity of serpula lacrymans was long-lasting.
acknowledgements
we thank our collaborator petra mácová from the centre of excellence telč (cet) of the czech academy of sciences, v. v. i., who performed the infrared spectroscopy (ftir) measurements and thus contributed to our research. the results mentioned in the article were obtained with the support of the sgs19/144/ohk1/3t/11 grant and were also financially supported by the czech academy of sciences under the framework strategy av21, research program 23: “city as a laboratory of change; construction, historical heritage and place for safe and quality life”.
references

[1] j. w. palfreyman, n. a. white, t. e. j. buultjens, h. glancy. the impact of current research on the treatment of infestations by the dry rot fungus serpula lacrymans. international biodeterioration & biodegradation 35(4):369–395, 1995. https://doi.org/10.1016/0964-8305(95)00064-3.
[2] o. schmidt. wood and tree fungi — biology, damage, protection, and use. springer, berlin, 2006. https://doi.org/10.1007/3-540-32139-x.
[3] d. h. jennings, a. f. bravery. serpula lacrymans: fundamental biology and control strategies. wiley, chichester, 1991. isbn 047193058x.
[4] j. gabriel, k. švec. occurrence of indoor wood decay basidiomycetes in europe. fungal biology reviews 31(4):212–217, 2017. https://doi.org/10.1016/j.fbr.2017.05.002.
[5] j. w. palfreyman. the domestic dry rot fungus, serpula lacrymans, its natural origins and biological control. in: workshop ariadne 8 – bio-degradation of cultural heritage, arcchip, 2001.
[6] j. bech-andersen. the dry rot fungus and other fungi in houses. 5th edition. hussvamp laboratoriet, denmark, 1995. isbn 87-89560-25-6. irg/wp 95-10124.
[7] j. bech-andersen. the influence of the dry rot fungus (serpula lacrymans) in vivo on insulation materials. material und organismen 22:191–202, 1987.
[8] j. bech-andersen. serpula lacrymans, the dry rot fungus: review of previous papers, 1989. international research group on wood preservation, document irg/wp/1393.
[9] j. w. palfreyman, e. m. philips, h. j. staines. the effect of calcium ion concentration on the growth and decay capacity of serpula lacrymans and coniophora puteana. holzforschung 50:3–8, 1996. https://doi.org/10.1515/hfsg.1996.50.1.3.
[10] j. s. schilling, j. jellison. oxalate regulation by two brown rot fungi decaying oxalate-amended and non-amended wood. holzforschung 59:681–688, 2005. https://doi.org/10.1515/hf.2005.109.
[11] g. m. gadd, j. bahri-esfahani, q. li, et al. oxalate production by fungi: significance in geomycology, biodeterioration and bioremediation. fungal biology reviews 28(2-3):36–55, 2014. https://doi.org/10.1016/j.fbr.2014.05.001.
[12] f. green, m. j. larsen, j. e. winandy, t. l. highley. role of oxalic acid in incipient brown-rot decay. material und organismen 26(3):191–213, 1991.
[13] m. guggiari, r. bloque, m. arango, et al. experimental calcium-oxalate crystal production and dissolution by selected wood-rot fungi. international biodeterioration & biodegradation 65(6):803–809, 2011. https://doi.org/10.1016/j.ibiod.2011.02.012.
[14] p. mec, t. murínová, k. kubečka. možnosti využití termické analýzy v oblasti stavebních materiálů. stavební obzor (2):39–43, 2013.
[15] termická analýza keramických a sklářských surovin, learning text. ict in prague, prague, 2012.
[16] termická analýza a mikrostruktura nízkopálené keramiky, learning text. ict in prague, prague, 2012.
[17] j. l. perez-rodriguez, a. duran, m. a. centeno, et al. thermal analysis of monument patina containing hydrated calcium oxalate. thermochimica acta 512(1-2):5–12, 2011. https://doi.org/10.1016/j.tca.2010.08.015.
[18] y. akamatsu, m. takahashi, m. shimada. production of oxalic acid by wood-rotting basidiomycetes grown on low and high nitrogen culture media. material und organismen 28(4):251–264, 1994.
[19] t. watanabe, n. shitan, s. suzuki, et al. oxalate efflux transporter from the brown rot fungus fomitopsis palustris. applied and environmental microbiology 76(23):7683–7690, 2010. https://doi.org/10.1128/aem.00829-10.
[20] k. m. nobles.
identification of cultures of wood-inhabiting hymenomycetes. canadian journal of botany 43(9):1097–1138, 1965. https://doi.org/10.1139/b65-126.
[21] t. huckfeldt, o. schmidt. identification key for european strand-forming house-rot fungi. mycologist 20(2):42–56, 2006. https://doi.org/10.1016/j.mycol.2006.03.012.
[22] o. severin. stavba domu v praxi. díl ii. grada publishing, praha, 2002. isbn 80-247-0263-0.
[23] a. blažek. termická analýza. sntl – nakladatelství technické literatury, praha, 1974.
[24] uhličitan vápenatý. [2020-06-13], https://cs.wikipedia.org/wiki/uhli%c4%8ditan_v%c3%a1penat%c3%bd.

acta polytechnica 61(2):350–363, 2021. https://doi.org/10.14311/ap.2021.61.0350
© 2021 the author(s). licensed under a cc-by 4.0 licence. published by the czech technical university in prague.

adaptive wavelets sliding mode control for a class of second order underactuated mechanical systems

fares nafa a,∗, aimad boudouda a, billel smaani b
a boumerdes university, faculté de technologie, laboratoire d'ingénierie des systèmes et des télécommunications, cité frantz fanon, boumerdes, algeria
b laboratoire hyperfréquences et semiconducteurs, constantine 1 university, b.p. 325 route ain el bey, constantine, algeria
∗ corresponding author: f.nafa@univ-boumerdes.dz

abstract. the control of underactuated mechanical systems (ums) remains an attractive field where researchers can develop their control algorithms. to this date, various linear and nonlinear control techniques using classical and intelligent methods have been published in the literature. in this work, an adaptive controller using sliding mode control (smc) and a wavelets network (wn) is proposed for a class of second-order ums with two degrees of freedom (dof). this adaptive control strategy takes advantage of both sliding mode control and wavelet properties. in the main result, we consider the case of unmodelled dynamics of the above-mentioned ums, and we introduce a wavelets network to design an adaptive controller based on the smc. the update algorithms are directly extracted by using the gradient descent method, and conditions are then settled to achieve the required convergence performance. the efficacy of the proposed adaptive approach is demonstrated through an application to the pendubot.

keywords: adaptive, gradient descent, pendubot, sliding mode control, wavelets.

1. introduction
1.1. motivation: a literature review

challenging control strategies for underactuated mechanical systems (ums) are still relevant for researchers and remain a major open field.
those systems have fewer control inputs than degrees of freedom (dof) and often arise from nonholonomic systems with non-integrable constraints: first- and/or second-order nonholonomic constraints [1]. for decades, many applications have included this class of mechanical systems in different fields, such as robotics [1, 2], aeronautics [3], space systems, marine and underwater systems [4], and flexible and mobile systems [1]. in particular, the set of ums examples includes the inverted pendulum, the double inverted pendulum, the acrobot, the pendubot, the beam and ball system, the tora system, underactuated surface vessels, underactuated aerial vehicles, etc. having fewer actuators than degrees of freedom offers several advantages for the ums, namely a reduction of weight and a saving of cost through a reduction of the number of actuators and, consequently, of the consumed energy. in addition, a fully actuated system can become underactuated in the case of a breakdown, and the control strategies for the ums can then be used as a rescue plan for the control of fully actuated systems.

recently, there have been extensive and remarkable research efforts in the control of the ums, and several classifications and papers have been presented, covering modelling, stability and controllability issues [5–12]. indeed, an overview of control strategies shows the use of both linear and nonlinear algorithms. among linear controllers, [13] uses a robust lqr-based anfis (adaptive neuro-fuzzy inference system), and [14] involves the jacobian linearization method to extract a linear quadratic regulator and a hybrid pid with lqr. for nonlinear approaches, a feedback linearization controller has been proposed by [15] to control the inertia wheel pendulum, by [16] for the tora system and by [17] to control the inverted pendulum. however, the smc is still the most important nonlinear strategy, and many works have used it [7, 9, 18, 19]. indeed, using the smc for the ums remains the most important perspective because of its robust performance against disturbances and uncertainties. this robust approach is designed by using a systematic scheme based on a sliding surface and the lyapunov stability theorem [20]. other variants of the smc strategy have also been introduced, such as the smc with an integral sliding function, as in [21] for the stabilization of an underactuated unicycle, and second-order sliding mode control, as in [22] for the control of a class of second-order ums. the common idea of those works is to build a hierarchy of sliding surfaces and combine them according to some conditions to guarantee the stability and convergence.

although these control strategies are widely used for nonlinear systems and, in particular, for the ums, the nonlinearities of the control plants and the difficulty of extracting accurate models make the implementation of model-based control strategies difficult, and therefore adaptive techniques become inevitable [23, 24]. in recent control research works, developers have merged various techniques from approximation theory with adaptive nonlinear controllers to enhance the adaptive capability while ensuring stability and convergence.
particularly, the ums remains a challenging class of coupled nonlinear systems, and the problem of the adaptive control of the ums presents more difficulties because of the coupling that exists between the control input and the outputs. in the literature, the main trends of adaptive controllers are fuzzy logic and neural networks combined with a nonlinear control, such as optimal control and the smc [7, 9, 10, 25, 26]. besides, in the presented works focused on adaptive controllers for the ums, the adaptation laws are adjusted indirectly by using the lyapunov approach, i.e., the learning laws are extracted to guarantee the convergence of the lyapunov function. for an efficient adaptation, it is more judicious to extract the parameter adaptation laws directly from the identification error between the unknown function and its adaptive approximation [26, 27]. in fact, optimization algorithms offer the advantage of finding the parameters that directly minimize any truncation error by following some specific update rules. the gradient descent algorithm can be a good alternative to reach this objective.

nowadays, wavelet theory is successfully applied to multi-scale analysis and synthesis and to time-frequency signal analysis in signal processing, as well as in nonlinear control system design [28]. by definition, a wavelets network is a one-layer network consisting of orthonormal "father wavelets" and "mother wavelets", widely used as building blocks for the approximation of unknown functions [28, 29]. it has been proven that a wavelet network is a universal approximator that can approximate any function with an arbitrary degree of accuracy by a linear combination of father and mother wavelets [29–31]. besides, adaptive wavelets networks are especially well suited for such function approximation tasks and for the problem of adaptation to unknown dynamics [29]. moreover, the wn has been applied to nonlinear systems [32–35]. nevertheless, in recent works, the wn has rarely been applied to the ums, and this opens an interesting direction of application.

1.2. contributions

from the aforementioned background, the main contributions of this paper are emphasized as follows:

(1.) the design of a new sliding mode controller for second-order ums with two degrees of freedom. indeed, by exploiting the characteristics of the ums dynamics, we have adapted the sliding surfaces to make the control law more flexible, to eliminate the existing coupling between the input and the output, and to guarantee the convergence and stability of the closed-loop system.
(2.) based on the wavelets network and the smc approach, an adaptive control scheme is designed to overcome some difficulties related to the nonlinearities and model inaccuracy of the ums. this approach combines the approximation properties of the wavelets network and the robustness of the smc.
(3.) the adjustable parameters of the wavelet networks are updated using a gradient descent algorithm by minimizing the truncation error between the existing unknown ideal sliding mode controller and the adaptive one.

1.3. outlines of the paper

the following parts of the paper are organized as follows: preliminaries about the ums and the problem formulation are given in section 2. section 3 presents the proposed smc strategy with the adequate stability analysis. the adaptive control algorithm, developed according to the results of the previous section, is presented in section 4.
throughout these sections, the stability is proved analytically by using the lyapunov theory and barbalat's lemma. in section 5, the developed control strategies are applied to an illustrative example: the pendubot system. at the end, we wrap up the paper with a conclusion in section 6.

2. preliminaries and problem formulation

the dynamics of an underactuated system with two dof can be written in a compact form as [36]:

\[
\begin{bmatrix} m_{11}(q) & m_{12}(q) \\ m_{21}(q) & m_{22}(q) \end{bmatrix}
\begin{bmatrix} \ddot q_1 \\ \ddot q_2 \end{bmatrix}
+ \begin{bmatrix} h_1(q,\dot q) \\ h_2(q,\dot q) \end{bmatrix}
= \begin{bmatrix} 0 \\ u \end{bmatrix} \tag{1}
\]

with \(q = [q_1, q_2]^T\), \(q_1\) and \(q_2\) being the generalized coordinates, \(u\) the control input, \(m_{ij}\) the inertia matrix elements, and \(h_i\) containing the coriolis, centrifugal and gravity terms. it has been shown that such systems can be given according to the following dynamics [36]:

\[
\begin{aligned}
\dot x_1(t) &= x_2(t), \\
\dot x_2(t) &= f_1(x) + b_1(x)\,u(t), \\
\dot x_3(t) &= x_4(t), \\
\dot x_4(t) &= f_2(x) + b_2(x)\,u(t),
\end{aligned} \tag{2}
\]

where \(x = (x_1(t), x_2(t), x_3(t), x_4(t))\) is the state vector, representing \(q = [q_1, \dot q_1, q_2, \dot q_2]^T\), \(u(t)\) is the control input, and \(f_1(x)\), \(f_2(x)\), \(b_1(x)\) and \(b_2(x)\) are nonlinear functions. throughout this work, we will consider the following assumptions:

(1.) assumption 1. the state vector \(x\) is measurable.
(2.) assumption 2. the control gains \(b_i(x)\) \((i = 1, 2)\) are finite, nonzero, and of known sign for all states \(x\). it is assumed that the sign of \(b_i(x)\) does not change; without a loss of generality, this sign can be taken as positive. in addition, these functions are bounded, i.e., \(0 \le b_{\min} \le b_i(x) \le b_{\max}\).

the system (2) can be viewed as two subsystems in a second-order canonical form including the states \((x_1, x_2)\) and \((x_3, x_4)\), for which we define the following pair of sliding surfaces:

\[
\begin{aligned}
s_1 &= \dot{\tilde x}_1 + \lambda_1 \tilde x_1 = x_2 + \lambda_1 \tilde x_1, \\
s_2 &= \dot{\tilde x}_3 + \lambda_2 \tilde x_3 = x_4 + \lambda_2 \tilde x_3,
\end{aligned} \tag{3}
\]

where \(\tilde x_1 = x_1 - x_{1d}\), \(\tilde x_3 = x_3 - x_{3d}\) (\(x_{1d}\) and \(x_{3d}\) are constant desired values), and \(\lambda_1\), \(\lambda_2\) are positive constants. the time derivatives of both equations in (3) are given as follows:

\[
\begin{aligned}
\dot s_1 &= f_1 + b_1 u + \lambda_1 x_2, \\
\dot s_2 &= f_2 + b_2 u + \lambda_2 x_4.
\end{aligned} \tag{4}
\]

if \(f_i\) and \(b_i\) are known, then we can obtain the equivalent control laws of each subsystem by setting \(\dot s_i = 0\) in (4):

\[
u_{eq_1} = -b_1^{-1}(f_1 + \lambda_1 x_2), \tag{5}
\]
\[
u_{eq_2} = -b_2^{-1}(f_2 + \lambda_2 x_4). \tag{6}
\]

the control objective is to stabilize the whole system and to force the outputs \(x_1\) and \(x_3\) to follow their desired values \(x_{1d}\) and \(x_{3d}\). however, using (5) or (6) alone ensures only the stabilization of the corresponding subsystem. for such challenging systems, we need to design a total control law able to simultaneously attract both subsystems to their desired values and to guarantee the overall stability. this will be the aim of the next section.
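before moving on, a small numeric sketch (ours, not the authors' code) makes the surfaces (3) and the equivalent controls (5)-(6) concrete for the case where the model functions are known; the state and the model values below are purely illustrative, while the gains λ1 = 0.1 and λ2 = 5 are the ones later used in section 5.

```python
import numpy as np

# sliding surfaces (3) and equivalent controls (5)-(6) for known f_i, b_i
def surfaces(x, x1d, x3d, lam1, lam2):
    x1, x2, x3, x4 = x
    return x2 + lam1 * (x1 - x1d), x4 + lam2 * (x3 - x3d)

def u_eq(f, b, lam, xdot):
    # setting sdot_i = f_i + b_i*u + lam_i*xdot_i to zero gives u_eq_i
    return -(f + lam * xdot) / b

x = np.array([0.2, 0.0, -0.1, 0.0])                 # illustrative state
s1, s2 = surfaces(x, 0.0, 0.0, lam1=0.1, lam2=5.0)
print(s1, s2, u_eq(f=1.0, b=2.0, lam=0.1, xdot=x[1]))
```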
3. sliding mode controller design

by using the second equation of (4) and (6), it is easy to find that:

\[
u = \frac{1}{b_2}\,\dot s_2 + u_{eq_2}. \tag{7}
\]

substituting (7) into the first equation of (4) yields:

\[
\dot s_1 = b_{21}\,\dot s_2 + \Delta_{21}, \tag{8}
\]

where

\[
b_{21} = \frac{b_1}{b_2} \quad\text{and}\quad \Delta_{21} = b_1\,(u_{eq_2} - u_{eq_1}). \tag{9}
\]

similarly, from (4) and (6) one can have:

\[
\dot s_2 = b_{12}\,\dot s_1 + \Delta_{12}, \tag{10}
\]

with

\[
b_{12} = \frac{b_2}{b_1} \quad\text{and}\quad \Delta_{12} = b_2\,(u_{eq_1} - u_{eq_2}). \tag{11}
\]

if the dynamics (8) of \(s_1\) is driven so that

\[
\dot s_1 = -\beta \dot s_2 - k\,\mathrm{sgn}(s_2) - q s_2 - k_s\,\mathrm{sgn}\big(s_2 + \alpha\,\mathrm{sgn}(s_1)\big), \tag{12}
\]

with \(\alpha\), \(\beta\), \(k\), \(k_s\), \(q\) being positive constants and \(0 < \alpha < 1\), then, by combining (10) and (12), we get:

\[
\dot s_2 = -\,\frac{k\,\mathrm{sgn}(s_2) + q s_2 + k_s\,\mathrm{sgn}\big(s_2 + \alpha\,\mathrm{sgn}(s_1)\big) + \Delta_{21}}{\beta + b_{21}}. \tag{13}
\]

in order to show that this resulting dynamics yields the stability and the convergence of the whole system, we consider the following lyapunov function:

\[
v = \tfrac{1}{2}\,s_2^2. \tag{14}
\]

then, the time derivative of (14) is obtained as follows:

\[
\dot v = s_2\,\dot s_2. \tag{15}
\]

by replacing (13) in (15), we get:

\[
\dot v = -\,s_2\,\frac{k\,\mathrm{sgn}(s_2) + q s_2 + k_s\,\mathrm{sgn}\big(s_2 + \alpha\,\mathrm{sgn}(s_1)\big) + \Delta_{21}}{\beta + b_{21}}. \tag{16}
\]

if we choose

\[
\beta = \beta_m + \frac{b_{\max}}{b_{\min}}, \qquad k = k_m + k_s + \max|\Delta_{21}|, \tag{17}
\]

with \(\beta_m > 0\) and \(k_m > 0\), then (16) can be rewritten as:

\[
\dot v \le -\beta_m^{-1} k_m\,|s_2| - \beta_m^{-1} q\,s_2^2. \tag{18}
\]

according to the smc properties, (18) guarantees that an ideal sliding motion takes place from any initial conditions once the dynamics reach the sliding surface [37]. clearly, (18) shows that \(s_2\) is bounded and goes to zero as \(t \to \infty\), and its dynamics satisfies \(\dot s_2 \approx 0\). as a result, (13) gives \(\dot s_1 \to -k_s\,\mathrm{sgn}(\alpha\,\mathrm{sgn}(s_1)) \to -k_s\,\mathrm{sgn}(s_1)\), which easily proves the stability and the convergence of \(s_1\) to zero as \(t \to \infty\). therefore, by recalling the second equation of (4) and (13), the control signal has the following expression:

\[
u = -b^{-1}\Big[\big[k\,\mathrm{sgn}(s_2) + q s_2 + k_s\,\mathrm{sgn}\big(s_2 + \alpha\,\mathrm{sgn}(s_1)\big)\big] - \big(b_1 u_{eq_1} + \beta b_2 u_{eq_2}\big)\Big], \tag{19}
\]

where \(b = b_1 + \beta b_2\).

3.1. proposition-1

consider the class of underactuated mechanical systems with two dof given by (2), satisfying assumptions 1 and 2, with the sliding surfaces designed as in (3). the control law (19) satisfying the condition (17) ensures that all signals in the closed-loop system are bounded and that the sliding surfaces converge to zero asymptotically.

4. adaptive wavelets sliding mode controller (awsmc)
4.1. wavelets network overview

a wavelets network is a type of building block for function approximation, developed based on the concept of the multiresolution approximation. the building block is formed by shifting and dilating the basis functions: the mother wavelet \(\psi\) and the father wavelet \(\phi\). a multiresolution analysis was proposed in [28], which provides a mathematical tool to describe the "increment in information" from a coarse approximation to a higher-resolution approximation. the multiresolution analysis consists of a sequence of successive approximation spaces \(v_j \subset l^2(\mathbb{r})\) which satisfies \(\cdots \subset v_{-2} \subset v_{-1} \subset v_0 \subset v_1 \subset v_2 \subset \cdots\), \(\bigcap_{m \in \mathbb{z}} v_m = \{0\}\) and \(\overline{\bigcup_{m \in \mathbb{z}} v_m} = l^2(\mathbb{r})\), with:

\[
\begin{aligned}
g(x) \in v_j &\iff g(2x) \in v_{j+1}, \\
g(x) \in v_j &\implies g(x - 2^{-j}k) \in v_j,\ k \in \mathbb{z}; \qquad v_j = \mathrm{span}\{\phi_{j,k},\ k \in \mathbb{z}\},
\end{aligned} \tag{20}
\]

where \(\mathbb{z}\) is the set of all integers, \(\phi_{j,k}(x) = 2^{j/2}\,\phi(2^j x - k)\) and \(\phi_{j,k} \in v_j\). \(\phi_{j,k}\) is an orthonormal basis of \(v_j\), and \(\phi\) is called the scaling function (or father wavelet). for every \(j \in \mathbb{z}\), \(w_j\) is defined to be the orthogonal complement of \(v_j\) in \(v_{j+1}\). at every resolution \(j\):

\[
\begin{aligned}
v_{j+1} &= v_j \oplus w_j, \\
w_j \perp w_i \ \text{if}\ i \ne j, \qquad w_j &\subset v_i \ \text{if}\ i > j, \qquad w_j = \mathrm{span}\{\psi_{j,k},\ k \in \mathbb{z}\},
\end{aligned} \tag{21}
\]

where \(\psi_{j,k}(x) = 2^{j/2}\,\psi(2^j x - k)\) and \(\psi_{j,k} \in w_j\). \(\psi_{j,k}\) is an orthonormal basis of \(w_j\), and \(\psi\) is called the mother wavelet [28, 30]. a function \(g(x) \in l^2(\mathbb{r})\) can be approximated in the space \(v_j\) as follows:

\[
g(x) = g_j(x) + e(j) = \sum_{k=1}^{n_j} \big\langle \phi_{j,k}(x),\, g(x) \big\rangle\, \phi_{j,k}(x) + e(j), \tag{22}
\]

where \(g_j(x)\) is the projection of \(g\) on the space \(v_j\), \(\langle\cdot,\cdot\rangle\) is the inner product in \(l^2\), \(e(j)\) is the approximation error at the \(j\)-th resolution, and \(n_j\) is the number of basis functions used at the \(j\)-th resolution. note that a larger \(j\) means a higher resolution, which contains a lower resolution \(v_{j-1}\) and the complement space \(w_{j-1}\). therefore, as \(j \to \infty\), \(v_{j-1}\) tends to \(l^2\) and \(\lim_{j\to\infty} |e(j)| = 0\); from the multiresolution property of wavelets, we have \(|e(j+1)| < |e(j)|\) [28]. it is important to note that all wavelets satisfy the multiresolution and orthonormality properties, and hence the choice of the wavelets does not affect the approximation results [28].
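to make the projection (22) concrete, the following sketch (ours, not from the paper) projects a test function onto \(v_j\) using the haar scaling function; any orthonormal wavelet family gives the same qualitative behaviour, with the error \(e(j)\) shrinking as the resolution \(j\) grows. the test function and the grid are illustrative choices.

```python
import numpy as np

# projection of g onto v_j, eq. (22), with the haar father wavelet
# phi = 1 on [0, 1); phi_jk(x) = 2^(j/2) phi(2^j x - k).
def phi_jk(x, j, k):
    y = 2**j * x - k
    return 2**(j / 2) * ((0.0 <= y) & (y < 1.0))

def project(g, j, x):
    dx = x[1] - x[0]
    gj = np.zeros_like(x)
    for k in range(2**j):                        # supports tile [0, 1)
        c = np.sum(g(x) * phi_jk(x, j, k)) * dx  # <phi_jk, g>
        gj += c * phi_jk(x, j, k)
    return gj

x = np.linspace(0.0, 1.0, 2048, endpoint=False)
g = lambda t: np.sin(2 * np.pi * t) + 0.3 * t
for j in (2, 4, 6):                              # e(j) shrinks as j grows
    err = np.max(np.abs(g(x) - project(g, j, x)))
    print(f"resolution j={j}: max |e(j)| = {err:.4f}")
```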
next, we will exploit the above-mentioned approximation properties to design an adaptive controller by approximating the unknown ideal controller (19).

4.2. adaptive controller design

for the adaptive control purpose, we now assume that the nonlinear functions \(f_1(x)\), \(f_2(x)\), \(b_1(x)\) and \(b_2(x)\) are unknown and that only the state vector \(x\) is available for measurement. in addition, we keep all the terms of assumption 2 regarding the control gains \(b_i(x)\). we construct the aforementioned ideal control input (19) by using a wavelet network in the form of (22). thus, the general form of this ideal controller \(u^*(x)\) can be expressed as:

\[
u^*(x) = u_J^*(x) + e_u(J) = \sum_{k=1}^{n_J} v_{J,k}^*\,\phi_{J,k}(x) + \sum_{j \ge J,\, k \in \mathbb{z}} w_{j,k}^*\,\psi_{j,k}(x), \tag{23}
\]

where \(J\) is the coarse resolution. it is worth noticing that using wavelets networks to approximate the ideal controller requires the input vector \(x\) to remain in a compact region [38, 39]. in practical applications, a very large compact set should be assigned to avoid a violation of this requirement. however, a very large wavelet basis is then needed, and this may result in a high computational complexity. fortunately, in many physical systems, such as mechanical and electrical systems, an appropriate selection of the pre-assigned compact set can be obtained via the knowledge of some physical limitations [39]. accordingly, in order to meet the control requirements, we assume the existence of a finite unknown integer \(j_c\) such that the desired approximation performance can be met at this resolution, and hence (23) becomes:

\[
u^*(x) = (v_J^*)^T \phi_{J,k}(x) + \sum_{j=J}^{j_c} (w_j^*)^T \psi_{j,k}(x). \tag{24}
\]

in addition, we will use a weighting vector \(c\) such that the wavelet bases become \(\phi_{J,k}(c^T x)\) and \(\psi_{j,k}(c^T x)\) [36]. also, we introduce an additional adaptation so that (24) takes the following form:

\[
u^*(x) = (v_J^*)^T \phi_{J,k}(c^T x) + \sum_{j=J}^{j_c} \xi(\Delta_e)\,(w_j^*)^T \psi_{j,k}(c^T x), \tag{25}
\]

where \(\Delta_e\) is the performance evaluation given by \(\Delta_e = s_1^2 + s_2^2\), \(\xi(\Delta_e) = \eta\big(1 - e^{-\Delta_e^2/\omega^2}\big)\), \([\omega\ \eta] \in \mathbb{r}^{*2}\), while \(v_J^*\) and \(w_j^*\) represent the ideal approximation parameters.

therefore, the estimate of the unknown ideal controller \(u^*(x)\) can be given as:

\[
u(x) = (v_J)^T \phi_{J,k}(c^T x) + \sum_{j=J}^{j_c} \xi(\Delta_e)\,(w_j)^T \psi_{j,k}(c^T x). \tag{26}
\]

now, we define the approximation error between the two controllers \(u^*(x)\) and \(u(x)\) as:

\[
e_J = u(x) - u^*(x). \tag{27}
\]

thus, by using (25) and (26), it is easy to find the following expression for \(e_J\):

\[
e_J = (\tilde v_J)^T \phi_{J,k}(c^T x) + \sum_{j=J}^{j_c} \xi(\Delta_e)\,(\tilde w_j)^T \psi_{j,k}(c^T x), \tag{28}
\]

where \(\tilde v_J = v_J - v_J^*\) and \(\tilde w_j = w_j - w_j^*\). recalling that \(u = u^* + (u - u^*)\), (4) becomes:

\[
\begin{aligned}
\dot s_1 &= f_1 + b_1 u^* + b_1 e_J + \lambda_1 x_2, \\
\dot s_2 &= f_2 + b_2 u^* + b_2 e_J + \lambda_2 x_4.
\end{aligned} \tag{29}
\]

thus, we can calculate the following expression:

\[
\dot s_1 + \beta \dot s_2 = (f_1 + \lambda_1 x_2) + \beta (f_2 + \lambda_2 x_4) + (b_1 + \beta b_2)\,u^* + (b_1 + \beta b_2)\,e_J. \tag{30}
\]

plugging the value of \(u^*\) from (19) into (30) gives:

\[
\dot s_1 + \beta \dot s_2 = -k\,\mathrm{sgn}(s_2) - q s_2 - k_s\,\mathrm{sgn}\big(s_2 + \alpha\,\mathrm{sgn}(s_1)\big) + b\,e_J. \tag{31}
\]

in the next step, we define a quadratic cost function that measures the discrepancy between the ideal and the actual wavelet controller. such a function can be defined as:

\[
i = \frac{b}{2}\Big[(v_J)^T \phi_{J,k}(y) + \sum_{j=J}^{j_c} \xi(\Delta_e)\,(w_j)^T \psi_{j,k}(y) - u^*\Big]^2, \tag{32}
\]

where \(y = c^T x\). then, we use the gradient descent method to minimize the cost function (32) with respect to the adjustable parameters \(v_J\) and \(w_j\). applying the gradient method [40], the minimizing trajectories \(v_J(t)\) and \(w_j(t)\) are given by the following equations:

\[
\dot v_J = -\gamma_v\,\nabla_v(i), \qquad \dot w_j = -\gamma_w\,\nabla_w(i), \tag{33}
\]

with

\[
\nabla_v(i) = \frac{\partial i(v,w)}{\partial v} = \phi_{J,k}(y)\,b\,e_J, \tag{34}
\]
\[
\nabla_w(i) = \frac{\partial i}{\partial w} = \psi_{j,k}(y)\,b\,e_J. \tag{35}
\]
finally, we obtain the following form of the gradient descent algorithm:

\[
\dot v_J = -\gamma_v\,\phi_{J,k}(y)\,b\,e_J, \tag{36}
\]
\[
\dot w_j = -\gamma_w\,\psi_{j,k}(y)\,b\,e_J. \tag{37}
\]

however, the adaptive laws (36) and (37) cannot be computed, because both \(b\) and \(e_J\) are unavailable. thus, we use (31) to overcome this unavailability, that is:

\[
b\,e_J = \dot s_1 + \beta \dot s_2 + k\,\mathrm{sgn}(s_2) + q s_2 + k_s\,\mathrm{sgn}\big(s_2 + \alpha\,\mathrm{sgn}(s_1)\big). \tag{38}
\]

consequently, (36) and (37) become:

\[
\dot v_J = -\gamma_v\,\phi_{J,k}(y)\Big[\dot s_1 + \beta \dot s_2 + k\,\mathrm{sgn}(s_2) + q s_2 + k_s\,\mathrm{sgn}\big(s_2 + \alpha\,\mathrm{sgn}(s_1)\big)\Big], \tag{39}
\]
\[
\dot w_j = -\gamma_w\,\psi_{j,k}(y)\Big[\dot s_1 + \beta \dot s_2 + k\,\mathrm{sgn}(s_2) + q s_2 + k_s\,\mathrm{sgn}\big(s_2 + \alpha\,\mathrm{sgn}(s_1)\big)\Big]. \tag{40}
\]

the second main result of this paper is summarized in the following proposition.

4.3. proposition-2

let assumptions 1 and 2 be satisfied for the class of ums (2) driven by the controller (26). if the condition (17) holds, with the parameter adaptation laws (39) and (40), where \(\gamma_v\) and \(\gamma_w\) are positive definite constants, then all signals in the closed-loop system are bounded and the sliding surfaces given by (3) converge to zero asymptotically.

4.4. proof

in order to study the stability of the whole system, we define the following lyapunov function \(v\):

\[
v = \frac{1}{2}\Big[\beta s_2^2 + \tilde v_J^T \gamma_v^{-1} \tilde v_J + \sum_{j=J}^{j_c} \tilde w_j^T \gamma_w^{-1} \tilde w_j\Big]. \tag{41}
\]

the time derivative of \(v\) along the dynamics (4), (39) and (40) is then given by:

\[
\begin{aligned}
\dot v &= \beta s_2 \dot s_2 + \tilde v_J^T \gamma_v^{-1} \dot v_J + \sum_{j=J}^{j_c} \tilde w_j^T \gamma_w^{-1} \dot w_j \\
&= \beta s_2 \big(f_2 + b_2 u^* + b_2 e_J + \lambda_2 x_4\big) - \tilde v_J^T \phi_{J,k}(y)\,b\,e_J - \sum_{j=J}^{j_c} \tilde w_j^T\,\xi(\Delta_e)\,\psi_{j,k}(y)\,b\,e_J \\
&= -\,s_2\,(\beta + b_{21})^{-1}\Big(k\,\mathrm{sgn}(s_2) + q s_2 + k_s\,\mathrm{sgn}\big(s_2 + \alpha\,\mathrm{sgn}(s_1)\big) + \Delta_{21}\Big) + \beta b_2 e_J s_2 - b\,e_J^2.
\end{aligned} \tag{42}
\]

invoking the condition (17), the expression in (42) becomes:

\[
\dot v \le -\beta_m^{-1} k_m\,|s_2| - \beta_m^{-1} q\,s_2^2 + \beta b_2 e_J s_2 - b\,e_J^2. \tag{43}
\]

by using the inequality

\[
\beta b_2 e_J s_2 \le \tfrac{1}{2}\beta b_2 e_J^2 + \tfrac{1}{2}\beta b_2 s_2^2, \tag{44}
\]

we can rewrite (43) as:

\[
\dot v \le -k_m\,|s_2| - \beta_m^{-1} q\,s_2^2 + \tfrac{1}{2}\beta b_2 s_2^2 + \tfrac{1}{2}\beta b_2 e_J^2 - b\,e_J^2. \tag{45}
\]

in addition, since \(b = \beta b_2 + b_1\), a straightforward calculation from (45) gives:

\[
\dot v \le -k_m\,|s_2| - \Big(\beta_m^{-1} q - \tfrac{1}{2}\beta b_2\Big) s_2^2 - \Big(\tfrac{1}{2}\beta b_2 + b_1\Big) e_J^2. \tag{46}
\]

hence, provided that \(q \ge \tfrac{1}{2} b_{\max} \beta^2\), the expression in (46) can be upper-bounded as:

\[
\dot v \le -k_m\,|s_2| - \Big(\tfrac{1}{2}\beta b_2 + b_1\Big) e_J^2 \le 0. \tag{47}
\]

this guarantees that \(s_2\), \(\tilde v_J\) and \(\tilde w_j\) are bounded. furthermore, by using barbalat's lemma, the sliding surface \(s_2\) and the error \(e_J\) converge to zero as \(t \to \infty\), and the dynamics of \(s_2\) satisfies \(\dot s_2 \approx 0\). assuming that the adaptation process converges and that \(e_J\) is very small, the convergence of \(s_2\) to zero as \(t \to \infty\) gives, from (31):

\[
\dot s_1 = -k_s\,\mathrm{sgn}\big(\alpha\,\mathrm{sgn}(s_1)\big) + b\,e_J. \tag{48}
\]

then, we infer from (48) that, for

\[
k_s = k_e + \max(b\,e_J), \tag{49}
\]

where \(k_e > 0\), one can obtain:

\[
\dot s_1 \to -k_e\,\mathrm{sgn}\big(\alpha\,\mathrm{sgn}(s_1)\big) \to -k_e\,\mathrm{sgn}(s_1), \tag{50}
\]

which clearly shows that \(s_1\) converges to zero as \(t \to \infty\).

figure 1. block diagram of the awsmc algorithm.

4.5. remarks

(1.) the derivatives of \(s_1\) and \(s_2\) are not available, so we devise a discrete implementable version. indeed, the time derivative of \(v_J\) can be approximated as:

\[
\dot v_J = \frac{v_J(t) - v_J(t - \Delta t)}{\Delta t}, \tag{51}
\]

where \(\Delta t\) is a positive constant, assumed small enough. thus, the discrete implementable version of (39) and (40) can be obtained as:

\[
v_J(t) = v_J(t-\Delta t) - \gamma_v\,\phi_J(y)\Big[\beta\big(s_2(t) - s_2(t-\Delta t)\big) + \big(s_1(t) - s_1(t-\Delta t)\big) + \Delta t\Big(k\,\mathrm{sgn}(s_2(t)) + q s_2(t) + k_s\,\mathrm{sgn}\big(s_2(t) + \alpha\,\mathrm{sgn}(s_1(t))\big)\Big)\Big], \tag{52}
\]

\[
w_j(t) = w_j(t-\Delta t) - \gamma_w\,\psi_j(y)\Big[\beta\big(s_2(t) - s_2(t-\Delta t)\big) + \big(s_1(t) - s_1(t-\Delta t)\big) + \Delta t\Big(k\,\mathrm{sgn}(s_2(t)) + q s_2(t) + k_s\,\mathrm{sgn}\big(s_2(t) + \alpha\,\mathrm{sgn}(s_1(t))\big)\Big)\Big]. \tag{53}
\]

(2.) in order to remedy the control discontinuity in the boundary layer, the sign function is replaced throughout this paper by a saturation function, defined for \(\theta > 0\) as:

\[
\mathrm{sat}(x) =
\begin{cases}
\mathrm{sgn}(x/\theta), & \text{if } |x| \ge \theta,\\
x/\theta, & \text{if } |x| < \theta.
\end{cases} \tag{54}
\]
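a minimal sketch (ours, not the authors' code) of the discrete updates (52)-(53) together with the saturation (54) follows; the gains are the ones quoted in the simulation section below, and phi/psi stand for the wavelet bases evaluated at y = cᵀx.

```python
import numpy as np

# boundary-layer saturation (54); theta = 0.1 as in the simulations
def sat(x, theta=0.1):
    return np.sign(x) if abs(x) >= theta else x / theta

# one step of the discrete adaptation (52)-(53)
def update(v, w, phi, psi, s1, s2, s1_prev, s2_prev, dt,
           beta=1.5, k=9.0, q=11.0, ks=1.0, alpha=0.3,
           gamma_v=0.1, gamma_w=5.0):
    drive = (beta * (s2 - s2_prev) + (s1 - s1_prev)
             + dt * (k * sat(s2) + q * s2 + ks * sat(s2 + alpha * sat(s1))))
    return v - gamma_v * phi * drive, w - gamma_w * psi * drive
```

inside the layer \(|x| < \theta\) the switching terms vary linearly with \(x\), which is what removes the high-frequency chattering discussed in the results below.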
5. simulation and results

in this section, we apply the awsmc to the pendubot system depicted in figure 2. this two-link robot has an actuated joint mounted at the base of the first arm (arm 1), to which a second arm (arm 2) is connected. such a system presents four equilibrium positions: down-down \((-\frac{\pi}{2}, 0, 0, 0)\), down-up \((-\frac{\pi}{2}, 0, \pi, 0)\), up-down \((\frac{\pi}{2}, 0, \pi, 0)\) and up-up \((\frac{\pi}{2}, 0, 0, 0)\). down-up and up-up are unstable positions, while down-down and up-down are stable [1, 41]. the control challenge is to bring the pendubot to one of its unstable equilibrium positions and stabilize it there. the main mechanical parameters of the pendubot are listed in table 1. recalling the euler-lagrange principle, the dynamics of this system are given by (55) (see the appendix). by introducing the state vector \(x = [x_1, \dot x_1, x_2, \dot x_2]^T\) such that \(x = q = [q_1, \dot q_1, q_2, \dot q_2]^T\), it is easy to conclude that (55) has the form of (2).

to synthesize the awsmc, daubechies wavelets (db5) with n = 2 are chosen as the basis of the wavelet network, and the vector \(c\) is taken as \(c = 10^{-3} \times [5, 2, -3, -10]^T\). the parameters and coefficients of the designed controllers are: \(\alpha = 0.3\), \(\beta = 1.5\), \(k = 9\), \(q = 11\), \(k_s = 1\), \(\lambda_1 = 0.1\), \(\lambda_2 = 5\), \(\gamma_v = 0.1\), \(\gamma_w = 5\), \(j_c = 5\), \(j = 7\), \(\omega = 1\), \(\eta = 0.5\) and \(\theta = 0.1\). in addition, the initial conditions are taken as \(x(0) = [\pi/2, 0, 0, 0]\) and \(x(0) = [-\pi/2, 0, 0, 0]\) for down-up and up-up, respectively, and the initial values of the parameter estimates \(v_J(0)\) and \(w_j(0)\) are set to zero.

figure 2. schematic of the pendubot in a relative coordinate system.
figure 3. system response for the "down-up" position \((-\frac{\pi}{2}, 0, \pi, 0)\), including disturbances during the time interval [6 s, 8 s]: (a) angles response, (b) control input.

in order to show the effectiveness of the proposed control strategies, we consider two scenarios: with and without noisy measurements. in all the reported simulations, a random disturbance representing 10 % of unmodelled dynamics has been added in order to show the robustness of the proposed controllers. for the first scenario, the \(q_1\) and \(q_2\) responses for both positions are shown in figure 3 and figure 4, respectively. referring to the down-up position, it is clear from figure 3 that both angles converge asymptotically to their desired values. the adaptive strategy shows a good performance and ensures the asymptotic stability of all the state variables of the system. in addition, as figure 3a shows, the plant response presents an oscillatory form for the awsmc strategy; indeed, the adaptation laws lead to a control input less smooth than that of the smc method (figure 3b). regarding the effect of the disturbance during the time interval [6 s, 8 s], its impact on the stability and convergence of the angles is minimal, which confirms the robustness of the proposed control strategies. it is also important to mention that the adaptive controller has a shorter response time, representing a 50 % reduction as compared to the smc.
moreover, it is worth noticing that the use of the adaptive controller has considerably reduced the chattering seen in the case of the smc.

figure 4. system response for the "up-up" position \((\frac{\pi}{2}, 0, 0, 0)\), including disturbances during the time interval [6 s, 8 s]: (a) angles response, (b) control input.
figure 5. error estimation \(e_J\), including disturbances during the time interval [6 s, 8 s]: (a) "down-up" position, (b) "up-up" position.

finally, in figure 5 we can observe that the approximation error tends to zero even while the disturbance acts, which is an additional positive point of the proposed adaptation law. the same notes and observations can be reported for the second case, related to the up-up position; see figure 4(a) and figure 4(b).

for the second scenario, we introduced noisy measurements for all the states. the noise is a periodic non-smooth function with a nonzero average, representing 10 % of the measured signal.

figure 6. system response for the "down-up" position \((-\frac{\pi}{2}, 0, \pi, 0)\), including noisy measurements and disturbances: (a) angles response, (b) control input.

the results obtained for the "down-up" position using the awsmc are shown in figure 6. the performances do not change significantly, even if the frequency of the noise varies from 500s to 100000s. the controller signal shown in figure 6(b) is less smooth than in the first scenario but tends to a steady, smooth form; this is due to the variation of the measured signal. for both cases, we can see that the smc controller presents some low-frequency chattering, while this effect is reduced in the case of the awsmc. this proves another advantage of combining the smc with intelligent approximation algorithms.

6. conclusion and outlook

in this paper, we have developed an adaptive control strategy for a class of second-order ums with two degrees of freedom (dof). the proposed algorithm is based on a wavelets network (wn) and sliding mode control (smc). in the first stage, we considered the system dynamics to be well modelled and presented a new control algorithm based on the smc; it was shown that the proposed approach ensures the asymptotic stability and the convergence of the closed-loop system. to address the real-case scenario where the dynamics present uncertainties due to unmodelled parameters, we designed an adaptive controller that uses a wavelets network to approximate an ideal smc control law. in particular, the adaptation laws were extracted by using the gradient descent method. based on the lyapunov theory, we proved that the proposed control strategy guarantees the stability and the convergence of the whole system to the desired values. as a numerical application, we took the pendubot system as an illustrative example. the obtained results revealed satisfactory performances of the smc and awsmc strategies despite the lack of information about the system. with respect to the results obtained in this work and to the existing literature, many future improvements can be implemented and will be the scope of potential works:
(1.) the generalization of the proposed approach to ums with a higher number of dof.
(2.) the use of new intelligent approximation techniques for the ums.
(3.) the use of adaptive higher-order and fractional sliding mode controllers.

references

[1] x. xin, y. liu. control design and analysis for underactuated robotic systems. springer-verlag, london, uk, 2014. https://doi.org/10.1007/978-1-4471-6251-3.
[2] l. birglen, t. laliberté, c. m. gosselin. underactuated robotic hands. springer-verlag, berlin, germany, 2008. https://doi.org/10.1007/978-3-540-77459-4.
[3] p. masarati, m. morandini, a. fumagalli. control constraint of underactuated aerospace systems. journal of computational and nonlinear dynamics 9(2):395–401, 2014. https://doi.org/10.1115/1.4025629.
[4] d. k. duc, j. pan. control of ships and underwater vehicles, design for underactuated and nonlinear marine systems, advances in industrial control. first edition. springer-verlag, london, 2009. https://doi.org/10.1007/978-1-84882-730-1.
[5] s. krafes, z. chalh, a. saka. review on the control of second order underactuated mechanical systems. complexity 2018. https://doi.org/10.1155/2018/9573514.
[6] l. yang, y. hongnian. a survey of underactuated mechanical systems. iet control theory & applications 7(7):921–935, 2013. https://doi.org/10.1049/iet-cta.2012.0505.
[7] m. reyhanoglu, a. schaft, n. mcclamroch, i. kolmanovsky. dynamics and control of a class of underactuated mechanical systems. ieee transactions on automatic control 44(9):1663–1671, 1999. https://doi.org/10.1109/9.788533.
[8] d. huynh, v. dat, t. nguyen. application of fuzzy algorithm in optimizing hierarchical sliding mode control for pendubot system. robotica and management 22(2):2–12, 2017. https://doi.org/10.1109/fuzz.2002.1005070.
[9] m. spong. underactuated mechanical systems. in: control problems in robotics and automation, lecture notes in control and information sciences. springer, berlin, heidelberg, 1998. https://doi.org/10.1007/bfb0015081.
[10] x. huang, a. ralescu, h. gao. a survey on the application of fuzzy systems for underactuated systems. journal of systems and control engineering 233(3):217–244, 2018. https://doi.org/10.1177/0959651818791027.
[11] m. rehan, m. reyhanoglu. control of rolling disk motion on an arbitrary smooth surface. ieee control systems letters 2(3):357–362, 2018. https://doi.org/10.1109/lcsys.2018.2838528.
[12] a. choukchou-braham, b. cherki, m. djemai, k. busawon. analysis and control of underactuated mechanical systems. springer international publishing, cham, switzerland, 2014. https://doi.org/10.1007/978-3-319-02636-7.
[13] i. chawla, a. singla. real-time control of a rotary inverted pendulum using robust lqr-based anfis controller. international journal of nonlinear sciences and numerical simulation 19(3-4):379–389, 2018. https://doi.org/10.1515/ijnsns-2017-0139.
[14] m. keshmiri, a. singla. modeling and control of ball and beam system using model based and non-model based control approaches. international journal on smart sensing and intelligent systems 5(1):14–35, 2012. https://doi.org/10.21307/ijssis-2017-468.
[15] c. aguilar-avelar, j. moreno-valenzuela. new feedback linearization-based control for arm trajectory tracking of the furuta pendulum. ieee/asme transactions on mechatronics 21(2):638–648, 2016. https://doi.org/10.1109/tmech.2015.2485942.
[16] t. taniguchi, m. sugeno.
piecewise multi-linear model based control for tora system via feedback linearization. proceedings of the international multiconference of engineers and computer scientists, 2018.
[17] r. chanchareon, v. sangveraphunsiri, s. chantranuwathana. tracking control of an inverted pendulum using computed feedback linearization technique. ieee international conference on robotics, automation and mechatronics, 2006. https://doi.org/10.1109/ramech.2006.252680.
[18] l. hsu, t. oliveira, j. cunha, l. yan. adaptive unit vector control of multivariable systems using monitoring functions. international journal of robust and nonlinear control 29(3):583–600, 2019. https://doi.org/10.1002/rnc.4253.
[19] t. oliveira, j. cunha, l. hsu. adaptive sliding mode control based on the extended equivalent control concept for disturbances with unknown bounds. in: s. li, x. yu, l. fridman, z. man, x. wang (eds.), advances in variable structure systems and sliding mode control — theory and applications, studies in systems, decision and control. springer, cham, 2018. https://doi.org/10.1007/978-3-319-62896-7_6.
[20] c. edwards, s. spurgeon. sliding mode control: theory and applications. taylor and francis, london, uk, 1998.
[21] j.-x. xu, z.-q. guo, t. lee. a synthesized integral sliding mode controller for an underactuated unicycle. 11th international workshop on variable structure systems (vss), mexico city, 2010, pp. 352–357. https://doi.org/10.1109/vss.2010.5545132.
[22] i. shah, f. ur rehman. smooth second order sliding mode control of a class of underactuated mechanical systems. ieee access 6:7759–7771, 2018. https://doi.org/10.1109/access.2018.2806568.
[23] h. cho, g. kerschen, t. oliveira. adaptive smooth control for nonlinear uncertain systems. nonlinear dynamics 99:2819–2833, 2020. https://doi.org/10.1007/s11071-019-05446-z.
[24] t. oliveira, v. rodrigues, l. fridman. generalized model reference adaptive control by means of global hosm differentiators. ieee transactions on automatic control 5(64):2053–2060, 2020. https://doi.org/10.1109/tac.2018.2862466.
[25] s. ramos-paz, f. ornelas-tellez, a. g. loukianov. nonlinear optimal tracking control in combination with sliding modes: application to the pendubot. in: proceedings of the 2017 ieee international autumn meeting on power, electronics and computing (ropec), mexico, 6:1–6, 2017. https://doi.org/10.1109/ropec.2017.8261619.
[26] f. nafa, s. labiod, h. chekireb. direct adaptive fuzzy sliding mode decoupling control for a class of underactuated mechanical systems. turk j electr eng co 21:1615–1630, 2013. https://doi.org/10.3906/elk-1112-17.
[27] s. labiod, t. guerra.
direct adaptive fuzzy control for a class of mimo nonlinear systems. international journal of systems science 38(8):665–675, 2007. https://doi.org/10.2307/2001373.
[28] s. mallat. multiresolution approximation and wavelet orthonormal bases of l2(r). transactions of the american mathematical society 315(1):69–87, 1989.
[29] i. daubechies. ten lectures on wavelets. siam, philadelphia, pennsylvania, usa, 1992.
[30] j. xu, y. tan. nonlinear adaptive wavelet control using constructive wavelet networks. ieee transactions on neural networks 18(1):115–127, 2007. https://doi.org/10.1109/tnn.2006.886759.
[31] t. kugarajah, q. zhang. multidimensional wavelet frames. ieee transactions on neural networks 6(6):1552–1556, 1995. https://doi.org/10.1109/72.471353.
[32] a. stephen, h. wei. new class of wavelet network for nonlinear system identification. ieee transactions on neural networks 16(4):862–874, 2005. https://doi.org/10.1109/tnn.2005.849842.
[33] h. r. karimi, b. lohmann, b. moshiri, p. maralani. wavelet-based identification and control design for a class of nonlinear systems. international journal of wavelets, multiresolution and information processing 4(1):213–226, 2006. https://doi.org/10.1142/s0219691306001178.
[34] b. chen, y. cheng. adaptive wavelet network control design for nonlinear systems. proceedings of the 35th ieee conference on decision and control 3(6):3224–3229, 1997. https://doi.org/10.1109/cdc.1996.573635.
[35] m. zekri, s. sadri, f. sheikholeslam. adaptive fuzzy wavelet network control design for nonlinear systems. fuzzy sets and systems 159(20):2668–2695, 2008. https://doi.org/10.1016/j.fss.2008.02.008.
[36] r. olfati-saber. normal forms for underactuated mechanical systems with symmetry. ieee transactions on automatic control 47(2):305–308, 2002. https://doi.org/10.1109/9.983365.
[37] j. slotine, w. li. applied nonlinear control. prentice-hall, englewood cliffs, nj, usa, 1991.
[38] r. sanner, j. slotine. structurally dynamic wavelet networks for adaptive control of robotic systems. international journal of control 70(3):405–421, 1998. https://doi.org/10.1080/002071798222307.
[39] b.-s. chen, y.-m. cheng. adaptive wavelet network control design for nonlinear systems. proceedings of the 35th ieee conference on decision and control, kobe, japan, 3:3224–3229, 1996. https://doi.org/10.1109/cdc.1996.573635.
[40] l.-x. wang. adaptive fuzzy systems and control: design and stability analysis. prentice-hall, englewood cliffs, nj, usa, 1994.
[41] m. gulan, m. salaj, b. rohal'-ilkiv. achieving an equilibrium position of pendubot via swing-up and stabilizing model predictive control. journal of electrical engineering 65(6):356–363, 2014. https://doi.org/10.2478/jee-2014-0058.

7. appendices

by considering the friction in both joints, the dynamic behaviour of the pendubot can be given as follows [41]:

\[
\begin{aligned}
\ddot q_1 = \frac{1}{\theta_1\theta_2 - \theta_3^2\cos^2 q_2}\Big[&\theta_2\theta_3(\dot q_1 + \dot q_2)^2\sin q_2 + \theta_3^2\,\dot q_1^2\cos q_2\sin q_2 - \theta_2\theta_4\,g\cos q_1 \\
&+ \theta_3\theta_5\,g\cos q_2\cos(q_1 + q_2) - \theta_2 d_1\dot q_1 + (\theta_2 + \theta_3\cos q_2)\,d_2\dot q_2 + \theta_2\tau_1\Big], \\
\ddot q_2 = \frac{1}{\theta_1\theta_2 - \theta_3^2\cos^2 q_2}\Big[&-\theta_3(\theta_2 + \theta_3\cos q_2)(\dot q_1 + \dot q_2)^2\sin q_2 - (\theta_1 + \theta_3\cos q_2)\,\theta_3\dot q_1^2\sin q_2 \\
&- (\theta_1 + \theta_3\cos q_2)\,\theta_5\,g\cos(q_1 + q_2) + (\theta_2 + \theta_3\cos q_2)\,d_1\dot q_1 \\
&- (\theta_1 + \theta_2 + 2\theta_3\cos q_2)\,d_2\dot q_2 + (\theta_2 + \theta_3\cos q_2)\big(\theta_4\,g\cos q_1 - \tau_1\big)\Big].
\end{aligned} \tag{55}
\]

the pendubot parameters are listed in table 1 below:

table 1. mechanical parameters of the pendubot model [41].

arm 1: mass m1 = 0.256 kg; length l1 = 0.206 m; distance of the centre of mass of link 1, lc1 = 0.107 m; moment of inertia about the centroid, i1 = 0.0025 kg m²; friction coefficient d1 = 0.08 kg m² s⁻¹.
arm 2: mass m2 = 0.226 kg; length l2 = 0.298 m; distance of the centre of mass of link 2, lc2 = 0.133 m; moment of inertia about the centroid, i2 = 0.0011 kg m²; friction coefficient d2 = 0.00001 kg m² s⁻¹.
torque (control input): τ1 [n m].
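a direct transcription of (55) with the table-1 values is sketched below (ours, not the authors' code). note that the paper does not restate how θ1 to θ5 derive from the arm parameters, so the mapping used here is the standard pendubot parametrization and should be read as our assumption.

```python
import numpy as np

m1, l1, lc1, i1, d1 = 0.256, 0.206, 0.107, 0.0025, 0.08
m2, l2, lc2, i2, d2 = 0.226, 0.298, 0.133, 0.0011, 0.00001
g = 9.81
# assumed standard mapping to theta_1..theta_5 (not restated in the paper)
th1 = m1 * lc1**2 + m2 * l1**2 + i1
th2 = m2 * lc2**2 + i2
th3 = m2 * l1 * lc2
th4 = m1 * lc1 + m2 * l1
th5 = m2 * lc2

def pendubot(x, tau1):
    """x = [q1, dq1, q2, dq2]; returns dx/dt per eq. (55)."""
    q1, dq1, q2, dq2 = x
    c2, s2 = np.cos(q2), np.sin(q2)
    det = th1 * th2 - th3**2 * c2**2
    ddq1 = (th2 * th3 * (dq1 + dq2)**2 * s2 + th3**2 * dq1**2 * c2 * s2
            - th2 * th4 * g * np.cos(q1) + th3 * th5 * g * c2 * np.cos(q1 + q2)
            - th2 * d1 * dq1 + (th2 + th3 * c2) * d2 * dq2 + th2 * tau1) / det
    ddq2 = (-th3 * (th2 + th3 * c2) * (dq1 + dq2)**2 * s2
            - (th1 + th3 * c2) * th3 * dq1**2 * s2
            - (th1 + th3 * c2) * th5 * g * np.cos(q1 + q2)
            + (th2 + th3 * c2) * d1 * dq1
            - (th1 + th2 + 2 * th3 * c2) * d2 * dq2
            + (th2 + th3 * c2) * (th4 * g * np.cos(q1) - tau1)) / det
    return np.array([dq1, ddq1, dq2, ddq2])

x_up = np.array([np.pi / 2, 0.0, 0.0, 0.0])   # up-up equilibrium
print(pendubot(x_up, tau1=0.0))               # ~zero accelerations expected
```

as a sanity check, the up-up state with zero torque returns zero accelerations, consistent with it being an equilibrium of (55).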
acta polytechnica vol. 44 no. 5–6/2004

a discontinuous model to study fracture of brittle materials

k. de proft, w. p. de wilde

in this paper, the partition of unity property of finite element shape functions is used to introduce displacement discontinuities into finite elements. the discontinuous character of the displacement field is captured with the heaviside step function. using the partition of unity concept, the governing equations of the continuum and the discontinuity are separated and can consequently be described by different constitutive laws. inside the discontinuity, a plasticity-based constitutive law is used to describe the decrease of the tractions as a function of the crack opening, while the continuum is assumed to remain elastic. the methodology is described and validated with a comparison between numerical simulations and experimental results.

this paper is dedicated to j. sejnoha, tu prague, with respect and admiration for his scientific achievement.

1 introduction

a variety of computational techniques exist to describe the fracture behaviour of quasi-brittle materials. these models can be classified into two main groups: continuous and discontinuous models. in a continuous model, the displacement and strain fields remain continuous, even after a strong localization of the deformations. localization of deformation can be triggered by strain softening. a major problem with classical continuum models is that the governing equations lose ellipticity for quasi-static problems and hyperbolicity for dynamic problems if strain softening is introduced. when using finite elements, this results in a strong sensitivity to the mesh. upon mesh refinement, deformations localize into a band of zero thickness, and complete structural failure can occur without dissipation of energy. to regularize the governing differential equations, non-locality or rate dependency can be introduced into the constitutive model. this controls the zone in which the deformations tend to localize. examples of regularized continuum models are non-local damage models [1], gradient damage models [2], cosserat continuum models [3] and viscous models [4, 5].

discontinuous models represent cracks as displacement discontinuities. a discontinuous term can be incorporated into the strain field (weak discontinuity) [6–8] or into the displacement field (strong discontinuity) [9–18]. in this paper, a displacement discontinuity is introduced using a specific property of finite element shape functions: they form a partition of unity, which allows enhancing nodes with additional degrees of freedom. the first section covers the kinematics of a body crossed by a discontinuity. then, the governing equations are derived. in the third section, implementation aspects are discussed, and finally an example is treated.
2 cohesive zone model based on partitions of unity

2.1 kinematics of a displacement jump

consider a body \(\Omega\) crossed by two non-intersecting discontinuities, \(\Gamma_1\) and \(\Gamma_2\), as shown in fig. 1.

fig. 1: body crossed by 2 discontinuities

the displacement field is given by:

\[
\mathbf{u} = \hat{\mathbf{u}} + \sum_{i=1}^{m} \mathcal{H}_{\Gamma_i}\,\tilde{\mathbf{u}}_i, \tag{1}
\]

where \(\mathbf{u}\) is the displacement field of the body \(\Omega\), \(\hat{\mathbf{u}}\) and \(\tilde{\mathbf{u}}_i\) are continuous fields, and \(\mathcal{H}_{\Gamma_i}\) is the heaviside step function defined as:

\[
\mathcal{H}_{\Gamma_i} =
\begin{cases}
1, & \mathbf{x} \in \Omega_i^{+},\\
0, & \mathbf{x} \in \Omega_i^{-},
\end{cases} \tag{2}
\]

where \(\Omega_i^{+}\) and \(\Omega_i^{-}\) are subvolumes of \(\Omega\) such that \(\Omega_i^{+} \cup \Omega_i^{-} = \Omega\), and the discontinuity \(\Gamma_i\) is the border between the two subvolumes. the normal \(\mathbf{n}_i\) to the discontinuity points towards \(\Omega_i^{+}\). taking the symmetric gradient of the displacement field (1) results in the infinitesimal strain field:

\[
\boldsymbol{\varepsilon} = \nabla^{s}\hat{\mathbf{u}} + \sum_{i=1}^{m} \mathcal{H}_{\Gamma_i}\,\nabla^{s}\tilde{\mathbf{u}}_i + \sum_{i=1}^{m} \delta_{\Gamma_i}\,(\tilde{\mathbf{u}}_i \otimes \mathbf{n}_i)^{s}, \tag{3}
\]

where \(\delta_{\Gamma_i}\) is the dirac delta distribution centred on the discontinuity.

2.2 partition of unity concept

a partition of unity is a set of functions \(n_i\) such that

\[
\sum_{i=1}^{n} n_i(\mathbf{x}) = 1, \tag{4}
\]

where \(n\) is the number of discrete points.
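a small 1d sketch (ours, not the authors' code) illustrates both the partition of unity (4) and, anticipating eq. (6) below, how a heaviside-enriched field carries a jump across a discontinuity at x = xd; all names and values are illustrative.

```python
import numpy as np

def shape(xi):                       # linear shape functions on [0, 1]
    return np.array([1.0 - xi, xi])

def u_enriched(xi, x, xd, a, b):
    n = shape(xi)
    h = 1.0 if x(xi) >= xd else 0.0  # heaviside centred on the discontinuity
    return n @ a + h * (n @ b)       # u = N a + H_gamma N b, cf. eq. (6)

x = lambda xi: xi                    # element occupies [0, 1]
a, b = np.array([0.0, 1.0]), np.array([0.5, 0.5])
assert np.isclose(shape(0.3).sum(), 1.0)   # partition of unity, eq. (4)
# the enhanced dofs b produce a jump of ~0.5 across xd = 0.5:
print(u_enriched(0.49, x, 0.5, a, b), u_enriched(0.51, x, 0.5, a, b))
```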
2.3 governing finite element equations the weak form of the virtual work equation without body forces reads: � � ��� s s s e � � �: d d e� � t (7) where � is taken from the set of admissible displacement variations and se is the outer surface where external tractions t are applied. using a galerkin approach, the admissible displacement variations can be decomposed in the same manner as the actual displacement field. inserting the kinematical expressions for multiple (in this case m) non-intersecting discontinuities, the virtual work equation can be rewritten as: � � � � � � �� � � s s i i m i i s i i n � : ~ : ~ ) : � � � � � � � d d d � � � � � � � � h 1 � i m s i si m s s � � � � �� � � � � � 1 1 � ~ .� �t td de e e e (8) taking first variations of ��(~�i � 0) followed by variations of ~�i (�� � 0), a set of m � 1 equations is obtained. � � ��� s s s� : �� � �d d e e � � t (9) h� � � � � i i s i i i it� � � �� �~ : ~� � �d d 0 (i � 1, … m), (10) where ti are the traction forces working at the discontinuity �i. to obtain equation (9), the enhanced displacement field, ~u, is assumed to be zero where essential boundary conditions are imposed. from equations (9, 10), the set of discretized equations is obtained by introducing the discretized form of the displacement and the strain field. b n tt t s s e � d d e� � � �� � (11a) h� � � � � i i t t i itb n� d d� �� � � 0 (i � 1, … m). (11b) in these equations, the continuum response and the discontinuity response are completely split. the continuum is assumed to remain elastic during the complete computation. the stress rate in the continuum can be easily obtained from the expression of the strain field in the bulk, given by eq. (3): � � � �� � � � � � � � � � � � � �c c ba bbe e� h�i i i m 1 , (12) where ce is the elastic material tensor. the behaviour at the discontinuity is stated in terms of tractions and separations. the separation of the discontinuity can be computed as: � � � � � �� u u u u u u nb� ~ � ~ . (13) eq. (13) shows that the separation of the discontinuity is expressed as a function of the enhanced degrees of freedom. the traction rate is defined as: � �t dnbi i� , (14) where d is the material tangent for the discontinuity. this material tangent will be further elaborated in the next paragraph. eq. (11) is further linearized by inserting eq. (12) and eq. (14). after some mathematical manipulations, the linearized set of equations is given by k k k k k k k k k aa ab ab b a b b b b b a b b b b m m m m m m 1 1 1 1 1 1 � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � �d d d ext da b b f 1 0 0 � � m a t t, � � � � � � � � � � � � � � � � � � � � � f f f a t b t b t m int, int, int, 1 � � (15) where k b c baa t� � e d� � (16a) k b c bab t j j � �h� � � e d (16b) k b c bb a t e j j � �h� � �d (16c) k b c bb b t e i j i j � �h h� � � �d (16d) k b c b n dnb b t t jj j j j � �� �h� � � � � e d d (16e) © czech technical university publishing house http://ctn.cvut.cz/ap/ 43 acta polytechnica vol. 44 no. 5 – 6/2004 it can be seen from previous equations that the global system of equations (eq. (15)) remains symmetric when ce and d are symmetric. it should be noted that equations (16) are valid when only discontinuity j crosses the element, but the element is also influenced by discontinuity i. note that all stiffness contributions in equations (16) are very similar. the crucial difference between the terms in equations (16) is the presence of the heaviside function. 
this makes the finite element implementation relatively simple. 2.4 numerical implementation aspects 2.4.1 integration of the crossed elements from eq. (16) it can be seen that the heaviside step functions should be taken into account for the integration of the stiffness matrix. this means that the integration must only be performed on the side of the element where the heaviside function equals one, i.e. � i �. obviously, enough integration points must be present on that side so that an adequate integration is performed. since a discontinuity can cross the element arbitrarily, the safest solution is to redefine the integration scheme. when only one discontinuity crosses the considered element, the integration rule is adapted following wells and sluys [17]. for the triangular quadratic element as the underlying finite element, 23 integration points – 21 in the continuum and 2 for the discontinuity – are inserted as shown in fig. 2a. when 2 non-intersection discontinuities cross the same element, the integration rule is re-adapted again as shown in fig. 2b. in this case, 15 integration points for the continuum and 4 integration points for the discontinuity are used. of course, depending on the position of the discontinuities, the integration rule might change. 2.4.2 enhanced nodes another implementation issue is the enhancement of the nodes. only nodes whose support is crossed by a discontinuity should be enhanced. furthermore, the enhanced degrees of freedom of the nodes on the support of the crack tip remain constrained. consequently, the separation of the crack tip is zero. an overview of which nodes should be enhanced is given in fig. 3. the enhanced nodes are represented by squares, while circles represent the nodes at the support of the crack tip. since the crack can freely run through the finite elements, it is possible that a discontinuity runs close to a node. as a result, a small proportion of the support of the node lies in either � i � or � i . in this case, the global stiffness matrix is not necessarily well conditioned. for this reason, an extra condition is introduced: min( , )� � � s s s tol � � (17) where � s is the volume of the support of a node. the tolerance is dependent on the precision of the solver. when condition (17) is not met, the considered node is constrained. the influence will be spread out over the other nodes. 44 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 44 no. 5 – 6/2004 fig. 3: definition for enhancement of nodes (squares represent enhanced nodes) (a) (b) fig. 2: location of integration points for an element crossed by (a) 1 discontinuity and (b) 2 discontinuities initiation and propagation rules an important issue is the initiation of the discontinuity. a criterion is needed to decide when a discontinuity should be initiated and should propagate. a typical example of an initiation criterion is a plasticity yield surface: whenever the stress state in one integration point of the considered element is situated outside the elastic domain bounded by the yield surface, a discontinuity is initiated, as will be explained in section 7. another important item is the direction of the discontinuity. oliver [16] stated that the direction of the discontinuity could be found from the stress state at the moment of initiation. when using a plasticity yield surface as the initiation criterion, the direction can be obtained by means of a bifurcation analysis. 
2.4.3 initiation and propagation rules

an important issue is the initiation of the discontinuity: a criterion is needed to decide when a discontinuity should be initiated and when it should propagate. a typical example of an initiation criterion is a plasticity yield surface: whenever the stress state in one integration point of the considered element is situated outside the elastic domain bounded by the yield surface, a discontinuity is initiated. another important item is the direction of the discontinuity. oliver [16] stated that the direction of the discontinuity can be found from the stress state at the moment of initiation. when using a plasticity yield surface as the initiation criterion, the direction can be obtained by means of a bifurcation analysis.

furthermore, a discontinuity can only grow from a previous crack tip, to ensure path continuity, and discontinuities can only grow at the end of a time step. in this way, no discontinuities are introduced in non-equilibrium states and the quadratic convergence of the newton-raphson process is preserved [17]. a discontinuity always crosses the whole element.

when allowing multiple discontinuities to grow without intersection, specific interaction rules must be defined. four cases can be considered: a) only one crack tip touches the considered element, b) two crack tips touch the considered element, c) three crack tips touch the considered element, d) one crack tip touches an element that is already crossed.

the first case is the most common. the stress state in the considered element is checked and the initiation criterion decides whether the discontinuity should propagate along the computed direction or not. the second case implies several possibilities:
– the two crack tips link and form one crack,
– one crack propagates while the other stops,
– both cracks grow without intersecting.

to decide which case occurs, the normal n_comp is computed according to the present stress state in the considered element. then the normal n_con to the connecting line, i.e. the line that connects the two crack tips, is computed. from this normal, a tolerance is computed; the obtained limit values for the normal are n_tol1 and n_tol2. when the normal computed from the stress field lies within the zone defined by n_tol1 and n_tol2, the cracks are connected. in the other case, one crack propagates along the normal computed from the stress field while the other crack is stopped. the propagating crack can be chosen arbitrarily; another possibility is to allow the crack with the highest energy dissipation (the main crack) to propagate. another option is to let both cracks propagate, in which case the cracks may or may not intersect. the case of intersecting cracks is not considered in this paper. the third case can be solved in the same way, only more linking possibilities have to be considered. the last case is simply solved by arresting the crack tip. it would also be possible to let the crack grow, but only if the discontinuities inside the element do not intersect; this is, however, not considered. it is clear that the defined interaction rules are more or less arbitrary. nevertheless, they are straightforward and relatively easy to implement and, if necessary, can be refined.

3 cohesive zone model

the behaviour within the discontinuity is described by a plasticity-based cohesive zone model. the adopted plasticity model was proposed by carol et al. [22] for use in interface elements; consequently, the plastic yield function is given in the traction space instead of the stress space. a hyperbolic yield surface is introduced,

f = t_t^{2} - (c - t_n \tan\varphi)^{2} + (c - f_t \tan\varphi)^{2}, (18)

where t_n and t_t are the normal and tangential components of the traction vector, c is the cohesion, f_t the tensile strength and \varphi the internal friction angle of the material. for tension, an associative flow rule is adopted. the evolution of the yield surface is governed by the decrease of tensile strength and cohesion throughout the computation:

f_t = f_{t0} \left( 1 - \frac{w_{cr}}{G_f^{I}} \right), (19a)

c = c_0 \left( 1 - \frac{w_{cr}}{G_f^{II}} \right), (19b)

where f_{t0} and c_0 are the initial values of the tensile strength and the cohesion, G_f^{I} is the mode-i fracture energy, G_f^{II} is the mode-ii fracture energy and w_{cr} is the energy dissipated during the fracture processes. this energy is defined as:

\mathrm{d}w_{cr} = t_n \, \mathrm{d}\delta_n^{pl} + t_t \, \mathrm{d}\delta_t^{pl} (20a)

and

w_{cr} = \int_0^{t} \mathrm{d}w_{cr}, (20b)

where \delta_n^{pl} and \delta_t^{pl} are the normal and tangential components of the plastic separation vector. the decrease of tensile strength and cohesion is thus coupled to the energy dissipated during the fracture processes. moreover, the choice of eqs. (19a)/(19b) ensures that the total mode-i fracture energy/mode-ii fracture energy is dissipated when the tensile strength/cohesion vanishes. furthermore, the decrease of tensile strength and cohesion is coupled: when a material is damaged due to tensile loading, not only the tensile strength but also the cohesion decreases. the tangential stiffness and the stress update are obtained with classical elasto-plastic equations. the tractions are defined through:

t = D^{e} (\delta - \delta^{pl}). (21)

the plastic separation rate is defined as:

\dot{\delta}^{pl} = \dot{\lambda} \frac{\partial f}{\partial t} = \dot{\lambda} n, (22)

where \dot{\lambda} is the plastic multiplier rate.
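the yield surface (18) with the softening laws (19) is compact enough to state directly in code. the following sketch evaluates the yield function for a given traction state and dissipated energy. the values of c_0, G_f^{II} and tan φ are the ones listed for the numerical example in section 4; f_{t0} and G_f^{I} are placeholder values, not given in the text:

```python
def yield_function(t_n, t_t, w_cr, f_t0, c_0, G_fI, G_fII, tan_phi):
    """Hyperbolic yield surface of eq. (18) with the energy-based
    softening of eqs. (19a)/(19b). Tractions in MPa, energies in N/mm."""
    f_t = f_t0 * max(1.0 - w_cr / G_fI, 0.0)   # eq. (19a), clipped at zero
    c   = c_0  * max(1.0 - w_cr / G_fII, 0.0)  # eq. (19b)
    return t_t**2 - (c - t_n * tan_phi)**2 + (c - f_t * tan_phi)**2

# c_0, G_fII and tan_phi from section 4; f_t0 and G_fI are hypothetical
params = dict(f_t0=3.0, c_0=20.0, G_fI=0.05, G_fII=0.1, tan_phi=0.5)

# a virgin interface point loaded in pure tension yields once t_n
# exceeds the tensile strength: f changes sign at t_n = f_t0
print(yield_function(t_n=2.9, t_t=0.0, w_cr=0.0, **params) < 0)  # True
print(yield_function(t_n=3.1, t_t=0.0, w_cr=0.0, **params) > 0)  # True
```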
the plastic separation rate (22) can be introduced into the rate form of eq. (21):

\dot{t} = D^{e} (\dot{\delta} - \dot{\lambda} n). (23)

the plastic multiplier rate can be obtained through the consistency equation:

\dot{\lambda} = \frac{n^{T} D^{e} \dot{\delta}}{h + n^{T} D^{e} n}, (24)

where h is the hardening/softening modulus. inserting the result for the plastic multiplier into equation (23) yields the tangential stiffness:

\dot{t} = \left( D^{e} - \frac{D^{e} n \, n^{T} D^{e}}{h + n^{T} D^{e} n} \right) \dot{\delta}. (25)

this tangential stiffness can be inserted in the finite element equations (see eq. (14)). the elastic stiffness is chosen very high, in theory infinite, in order to suppress the artificial elastic part of the solution. since a discontinuity is only inserted when the yield surface is violated, the jump is completely inelastic.
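the tangent operator in eq. (25) is a rank-one correction of the elastic interface stiffness and can be evaluated in a few lines. a minimal numpy sketch, with illustrative (hypothetical) values for the penalty-like elastic stiffness, the flow direction and the softening modulus:

```python
import numpy as np

def interface_tangent(De, n, h):
    """Elasto-plastic interface tangent of eq. (25):
    D_ep = De - (De n n^T De) / (h + n^T De n)."""
    Den = De @ n
    return De - np.outer(Den, Den) / (h + n @ Den)

# high (penalty-like) elastic stiffness, as advocated in the text
De = np.diag([1.0e6, 1.0e6])          # [normal, tangential], MPa/mm
n  = np.array([1.0, 0.0])             # flow direction: pure opening
h  = -50.0                            # softening modulus (negative)

D_ep = interface_tangent(De, n, h)
# in pure opening, almost the whole elastic stiffness is removed,
# leaving essentially the (small, negative) softening stiffness
print(D_ep[0, 0])                     # ~ -50, i.e. ~ h for De >> |h|
```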
4 numerical example

nooru-mohamed [23] examined double-edge notched specimens of different sizes (200×200×50, 100×100×50, 50×50×50 mm), subjected to different loading paths. all specimens were placed in a special loading frame, allowing a combination of shear and tensile loading. for the numerical simulations presented in this section, only one loading path is considered (path 4 in the experiments). the double-edge notched specimens, shown in fig. 4, are first loaded in shear until a certain value of the lateral force p_s is reached. afterwards, the lateral shear force is kept constant while a tensile load (p_n) is applied. the specimen is supported at the bottom and along the right side below the notch. nooru-mohamed [23] applied three different values for the lateral shear force, i.e. (a) p_s = 5 kn, (b) p_s = 10 kn and (c) p_s = max; for the last case, the specimen is loaded laterally until it no longer sustains the lateral force.

fig. 4: geometry of the specimen (all dimensions in mm)

fig. 5: mesh used for analysis

the material parameters used for the discontinuity are:
– c_0 = 20 mpa,
– G_f^{II} = 0.1 n/mm,
– tan φ = 0.5.
for the continuum, the material parameters are: young's modulus e = 25000 mpa and poisson's ratio ν = 0.2.

the obtained load-deformation curve is shown in fig. 6a. when compared with the experimental curve, a good agreement is found; especially the peak load is simulated remarkably well. the post-peak response in the finite element simulation is more brittle. the computed crack path is compared with the experimentally obtained path: as can be seen in fig. 6b, the computed crack path is in good agreement with the experimentally observed crack path.

fig. 6: (a) experimental versus computed tensile load-deformation curve and (b) experimental versus computed crack path

during the experiments, nooru-mohamed also connected lvdts to the specimen in order to study local deformations. the position of these additional lvdts is visualized in fig. 7. the recorded deformations are plotted versus the shear deformation δ_s and compared with the computed values in figs. 8a–e.

fig. 7: location of the additional lvdts

examining fig. 8a and fig. 8e, the calculated deformations for lvdt 1 and lvdt 5 show a good agreement with the measured values. the deformation for lvdt 2, fig. 8b, is not captured by the calculations: the computed crack path runs outside the measuring range of lvdt 2, so the experimentally observed increase in deformations does not appear in the computations. for lvdt 3 and lvdt 4, the measured and calculated deformations show the same tendency: first a small increase is noticed; when the crack has passed the location of the lvdt, the deformations start to decrease because the crack passes outside the measuring range of the lvdt.

fig. 8: experimental versus calculated local deformations

the overview in figs. 8a–e shows that, apart from the load-deformation curve and the final crack path, the local information is also captured in a reasonable way. finally, the same model and model parameters are used to simulate load case (a). in this case, the lateral force is increased until p_s = 5 kn; then the lateral force is kept constant and a tensile load is applied. the comparison of the computed load-deformation curve with the experimentally obtained curve is shown in fig. 9a, while the experimental and computed crack paths are visualized in fig. 9b. again, the peak load is captured remarkably well; the computed post-peak response is slightly more brittle. when the crack path is studied, it can be observed that the crack which grows from the left notch is in agreement with the experimentally observed crack, while the crack growing from the right notch is more curved than the experimentally observed crack.

fig. 9: (a) experimental versus computed load-deformation curve and (b) experimental versus computed crack path

5 acknowledgments

financial support from the fwo-vlaanderen (fonds voor wetenschappelijk onderzoek, fund for scientific research – flanders) is gratefully acknowledged.

references

[1] pijaudier-cabot g., bažant z.: "non-local damage theory." asce journal of engineering mechanics, vol. 113 (1987), no. 10, p. 1512–1533.
[2] peerlings r. h. j., de borst r., brekelmans w. a. m., geers m. g. d.: "gradient-enhanced damage modelling of concrete fracture." mechanics of cohesive-frictional materials, vol. 3 (1998), no. 4, p. 323–342.
[3] de borst r., sluys l. j.: "localization in a cosserat continuum under static and dynamic loading conditions." computer methods in applied mechanics and engineering, vol. 90 (1991), no. 1–3, p. 805–827.
[4] needleman a.: "material rate dependence and mesh sensitivity in localization problems." computer methods in applied mechanics and engineering, vol. 67 (1988), no. 1, p. 69–97.
[5] sluys l. j., de borst r.: "wave-propagation and localization in a rate-dependent cracked medium: model formulation and one-dimensional examples." international journal of solids and structures, vol. 29 (1992), no. 23, p. 2945–2958.
[6] ortiz m., leroy y., needleman a.: "a finite element method for localized failure analysis." computer methods in applied mechanics and engineering, vol. 61 (1987), p. 189–214.
[7] belytschko t., fish j., engelmann b. e.: "a finite element with embedded localization zones." computer methods in applied mechanics and engineering, vol. 70 (1988), p. 59–89.
[8] sluys l. j., berends a. h.: "discontinuous failure analysis for mode-i and mode-ii localization problems." international journal of solids and structures, vol. 35 (1998), no. 31, p. 4257–4274.
[9] dvorkin e. n., cuitino a. m., gioia g.: "finite elements with displacement interpolated embedded localization lines insensitive to mesh size and distortions." international journal of numerical methods in engineering, vol. 30 (1990), p. 541–564.
[10] klisinski m., runesson k., sture s.: "finite element with inner softening band." asce journal of engineering mechanics, vol. 17 (1991), p. 575–587.
[11] simo j. c., oliver j., armero f.: "an analysis of strong discontinuities induced by strain-softening in rate-independent inelastic solids." computational mechanics, vol. 12 (1993), p. 277–296.
[12] armero f., garikipati k.: "recent advances in the analysis and numerical simulation of strain localization in inelastic solids." in computational plasticity, fundamentals and applications, d. r. j. owen, e. onate and e. hinton, eds. pineridge press, swansea, 1995, p. 547–561.
[13] larsson r., runesson k., akesson m.: "embedded localization based on regularized strong discontinuity." in computational plasticity, fundamentals and applications, d. r. j. owen, e. onate and e. hinton, eds. pineridge press, swansea, 1995, p. 599–610.
[14] lotfi h. r., shing p. b.: "embedded representation of fracture in concrete with mixed finite elements." international journal of numerical methods in engineering, vol. 38 (1995), p. 1307–1325.
[15] larsson r., runesson k.: "element-embedded localization band based on regularized displacement discontinuity." journal of engineering mechanics, vol. 122 (1996), p. 402–411.
[16] oliver j.: "modelling strong discontinuities in solid mechanics via strain softening constitutive equations. part 1: fundamentals." international journal for numerical methods in engineering, vol. 39 (1996), p. 3575–3600.
[17] wells g. n., sluys l. j.: "a new method for modeling cohesive cracks using finite elements." international journal for numerical methods in engineering, vol. 50 (2001), no. 12, p. 2667–2682.
[18] jirasek m.: "comparative study on finite elements with embedded discontinuities." computer methods in applied mechanics and engineering, vol. 188 (2000), p. 307–330.
[19] duarte c. a., oden j. t.: "h-p clouds and h-p meshless method." numerical methods for partial differential equations, vol. 12 (1996), no. 6, p. 673–705.
[20] belytschko t., black t.: "elastic crack growth in finite elements with minimal remeshing." international journal for numerical methods in engineering, vol. 45 (1999), p. 601–620.
[21] moës n., dolbow j., belytschko t.: "a finite element method for crack growth without remeshing." international journal for numerical methods in engineering, vol. 46 (1999), p. 131–150.
[22] carol i., prat p. c., lopez c. m.: "normal/shear cracking model: application to discrete crack analysis." journal of engineering mechanics, vol. 123 (1997), p. 765–773.
[23] nooru-mohamed m. b.: "mixed-mode fracture of concrete: an experimental approach". phd thesis, delft university of technology, 1992.

k. de proft, w. p. de wilde
department of mechanics of materials and constructions, vrije universiteit brussel
pleinlaan 2, 1050 brussels, belgium

vibration damping of a new ionic liquid under electric field effect

m. m. a. bicak, h. t. belek, a. göksenli

ionic liquids are recently-developed smart materials that are not well known by mechanical engineers. they are of great interest due to their non-volatility, viscosity and extremely high electrical conductivity. up to now, no reports have appeared on their rheological properties under magnetic or electric fields. in this work, we study the electro-rheological behaviour of a newly presented ionic liquid (2-hydroxyethylammonium formate). our experiments show that the ionic liquid is not sensitive to magnetic fields. nevertheless, reasonably high damping ratios (42.8 %) have been attained under relatively low electric fields (0.6 kv cm^{-1}).

keywords: damping characteristics, ionic liquid, electrorheological fluid.

1 introduction

electrorheological (er) fluids are liquids with flow properties that can be changed by stimulation by an electric field. in particular, changes in dynamic viscosity under an electric field have found many applications in various devices such as clutches, brakes, active engine mounts and shock absorbers [1, 2]. the rheological properties of oil dispersions of stimulus-responsive particles have been well documented. the key principle of an er fluid is the dispersion of electro-conducting particles in a non-conducting liquid medium. we considered that it would be interesting to investigate the rheological properties of an ionic liquid. in this work, we study the rheology of a new ionic liquid in electric and magnetic fields. to the best of our knowledge, no reports have appeared in the literature dealing with the rheology of ionic liquids.

ionic liquids are salts which are fluids at around room temperature. in recent years they have attracted increasing interest; in 2001–2002 alone, more than five hundred papers were published. ionic liquids show unusual physical properties such as high ionic conductivity and powerful solution efficiency for various organic and inorganic substrates. two types of ionic liquids, imidazolinium salts [3, 4, 5] and trialkylammonium salts [6], have been studied extensively by chemists. more recently a new type of ionic liquid, 2-hydroxyethylammonium formate, was discovered at the chemistry department of our university. according to the report [7], it is obtained simply by mixing ethanolamine and formic acid, both of which are commercially available. the freezing temperature of this salt is −82 °c, which is the lowest freezing temperature among ionic liquids. this work deals with measurements of the vibration damping effect of the ionic liquid by the logarithmic decrement approach under magnetic and electric fields. the variation of the damping coefficient is derived and the results are discussed. our preliminary experiments show that the new ionic liquid is not stimulated by magnetic fields. however, a relatively high vibration damping effect was detected even under low electric fields, compared with the effects discussed in previous reports.

2 mechanism of the electro-rheological effect

the electrorheological effect refers to the sudden and reversible change in the flow characteristics by means of an electric field.
an abrupt change in the molecular orientation of the structure, from an initially random distribution to a more ordered one, takes place. furusho classified er fluids into two-phase (particle-type) and one-phase (homogeneous-type) systems in terms of their characteristics [8]. conventional er fluids, for example suspensions of polarizable solid particles in insulating oil, demonstrate an orientational change in response to an external electric field because of the induced aggregation of the particles. the main disadvantage of two-phase er fluids is that their characteristics change greatly with the shape and dimensions of the particles [9]. homogeneous-type er fluids have been developed using solutions of low-molecular-weight liquid crystals or macromolecular liquid crystals [10, 11].

up to now, the er effects of electron-conducting dispersed particles have been studied. here we present the rheological effect of an ionic liquid in which electrical charges are created by ion migration rather than by electron movement. the er effect is strongly dependent on the polarization rate, the permittivity, the dielectric loss factor and the conductivity [12]. it is important to note that the structure of the carrier fluid may break down under high electric fields when highly conducting substrates are used. because of this, it is not possible to obtain high damping forces; the dampings obtained are substantially lower than those produced by magnetorheological fluids. zhao et al. demonstrated that the er effects produced by conducting and dielectric systems are completely different in nature [13]: for instance, network systems are formed rather than linearly aligned chain systems. in fact, the conductivity effect dominates in low-frequency ac fields, and the permittivity dominates at high frequencies. in other words, low-frequency ac fields must be applied to conductive er fluids in order to achieve a better er effect. in this study, we have investigated the rheological properties of an ionic liquid under dc electric fields.

3 physical properties of the ionic liquid

recently a new ionic liquid (2-hydroxyethylammonium formate) with an extremely low melting temperature (−82 °c) has been reported [7]. this ionic liquid shows a reasonably high room-temperature ionic conductivity (3.3 ms cm^{-1}) and heat stability up to 150 °c. other physical properties are given in table 1.

table 1: some physical characteristics of the ionic liquid (adapted from ref. [7])
  appearance: viscous clear liquid
  density: 1.204 g cm^{-3}
  refractive index: n_d = 1.4772 (at 25 °c)
  viscosity: η = 105 cp (at 25 °c)
  conductivity: σ = 3.3 ms cm^{-1} (at 25 °c)
  decomposition temperature: approx. 150 °c (by thermogravimetric analysis)
  vapor pressure: 2.2×10^{-2} torr (air saturation method)
  melting point (freezing point): −82 °c

the room-temperature conductivity of the ionic liquid is 3.3 ms cm^{-1}, which is of the order of metallic conductivities. the ionic conductivity increases exponentially with temperature and reaches 40 ms cm^{-1} at 92 °c (fig. 1). this effect can be attributed to fast ion mobilities at elevated temperatures.

fig. 1: effect of temperature on conductivity

the ac conductivity–frequency plot of the ionic liquid (fig. 2) shows a sharp increase in the (0.1–10) hz range. a plateau appears between 10 hz and 10 mhz, in which the conductivity is around 68 ms cm^{-1} at room temperature.

fig. 2: ac conductivity – frequency plot

the viscosity of the ionic liquid decreases with temperature: for instance, the room-temperature viscosity of 105 cp reduces to 15 cp at 70 °c. processing of the temperature-dependent viscosity data shows an arrhenius-type relationship, from which the following correlation is obtained:

\log \eta = -5.265 + \frac{1919.5}{T},

where η denotes the viscosity in cp and T is the temperature (k).
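the slope of such a correlation can be recovered directly from the two viscosity values quoted above. a short python sketch of this fit, using only the data points stated in the text (105 cp at 25 °c, 15 cp at 70 °c); the resulting slope agrees with the printed correlation, while the intercept should be taken as indicative only:

```python
import math

# viscosity data quoted in the text: (temperature in K, viscosity in cP)
T1, eta1 = 25.0 + 273.15, 105.0
T2, eta2 = 70.0 + 273.15, 15.0

# Arrhenius-type form: log10(eta) = A + B / T  ->  solve for B and A
B = math.log10(eta1 / eta2) / (1.0 / T1 - 1.0 / T2)
A = math.log10(eta1) - B / T1

print(f"B = {B:.1f} K")        # ~1921 K, close to the printed 1919.5
print(f"A = {A:.2f}")          # ~-4.42

# predicted viscosity at 50 degC from the fitted correlation
T = 50.0 + 273.15
print(f"eta(50 degC) ~ {10**(A + B / T):.0f} cP")
```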
4 experimental

we designed the home-made experimental setup shown in fig. 3 for damping measurements under an electric field. the same setup can also be used under a magnetic field by replacing the electric field electrodes with a magnetic coil. the measurements were carried out under a dc electric field of (0–0.6) kv cm^{-1}. for this purpose the damper reservoir was filled with about 25 ml of the ionic liquid. the electric field was created by two parallel plates using a controllable power supply. meanwhile, the shaker was stimulated by amplified signals. the oscillations generated by the shaker were applied to the system, and the output responses were collected by a data collector (bruel & kjaer).

fig. 3: schematic diagram of the testing unit (shaker, spring, er damper, force and displacement transducers, signal generator, amplifiers, power supply and data collector)

the displacement–time plots are illustrated in fig. 4, and the logarithmic decrement of the oscillations was calculated by tracing the maxima of the displacement signals using the following formulae:

\delta = \frac{1}{r} \ln \frac{A_i}{A_{i+r}}, \qquad \zeta = \frac{1}{\sqrt{1 + \left( \frac{2\pi}{\delta} \right)^{2}}},

where A_i is the first significant amplitude and A_{i+r} is the amplitude after r cycles.

fig. 4: impulse response of the oscillator
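a minimal python sketch of this evaluation, computing the logarithmic decrement and the damping ratio from two traced displacement maxima; the amplitude values are illustrative only:

```python
import math

def damping_ratio(a_i, a_i_plus_r, r):
    """Logarithmic decrement from two displacement maxima r cycles
    apart, and the corresponding damping ratio."""
    delta = math.log(a_i / a_i_plus_r) / r
    zeta = 1.0 / math.sqrt(1.0 + (2.0 * math.pi / delta) ** 2)
    return delta, zeta

# illustrative amplitudes read off an impulse response like fig. 4
delta, zeta = damping_ratio(a_i=3.0, a_i_plus_r=0.5, r=2)
print(f"delta = {delta:.3f}, zeta = {zeta:.3f}")
```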
the damping forces were recorded simultaneously by means of the force transducers. these experiments were carried out under various electric fields. the collected data are pictured as a function of the electric field, as shown in fig. 5.

fig. 5: damping force – time relationship under an electric field

4.1 measurements under a magnetic field

the experiments described above were repeated under constant and sinusoidal magnetic fields supplied from an electromagnetic coil with an inner diameter of 10 cm. variation of the magnetic field in the (0–1) tesla range did not give any significant change in response relative to blank experiments.

5 results and discussion

although we were not able to apply high electric fields, we observed reasonable rises in the damping ratio, as high as 42.8 %, in moderate electric fields up to 0.6 kv cm^{-1} (fig. 6). this seems to be due to the quick orientation of the dipoles of the ionic liquid under an electric field. this result is especially significant because such high damping can otherwise only be achieved under high electric fields (i.e. 40 kv cm^{-1}), as described many times in the literature [13].

fig. 6: effect of an electric field on the damping coefficients

since the viscosity of the liquid is inversely proportional to the temperature, high damping performance is expected at lower temperatures. however, the conductivity of the ionic liquid increases with temperature; in other words, the effects of viscosity and conductivity on the damping act, in our case, in opposite directions. nevertheless, the high solvating power of the liquid compensates for this disadvantage: for instance, the addition of ammonium chloride (nh4cl) increases both the conductivity and the viscosity at the same time.

it is important to note that there is a great difference between the present system and the reported data obtained from electro-conducting particle dispersions. in the present case, electric conduction is provided by ion migration rather than by electron conduction. the conductivity of 3.3 ms cm^{-1} at room temperature is comparable with the conductivities of conducting metals; in other words, the conductivity of the ionic liquid is about 10^6 times that of the particle dispersions reported so far. better damping effects can be attained by dissolving dissociable salts in the ionic liquid at low temperatures. the advantage of the liquid presented here over the reported systems is that it can be stimulated by low electric field strengths. the non-volatility of the liquid is another advantage. we have also studied the damping response under a magnetic field in the (0–1) t range supplied by a magnetic coil; however, no significant response was detected, as expected.

6 conclusion

ionic liquids are ion-conducting viscous liquids which provide reasonable damping under relatively low electric fields. the new ionic liquid without any added ingredient does not show any magnetorheological effect. however, due to its powerful solvating effect, homogeneous or semi-homogeneous magnetic particle dispersions are of interest. further studies are under consideration.

7 acknowledgments

we are grateful to prof. n. bicak for his valuable help in donating the ionic liquid.

references

[1] carlson, j. d., catanzarite, d. m., st. clair, k. a.: electrorheological fluids, magnetorheological suspensions and associated technology (ed. w. a. bullough). singapore: world scientific, 1996, p. 20.
[2] wu, x. m., wong, j. y., sturk, m., russell, d. l.: electrorheological fluids: mechanisms, properties, technology and applications (ed. r. tao and g. d. roy). singapore: world scientific, 1994, p. 538.
[3] watanabe, m., yamada, s. i., ogata, n.: electrochim. acta, vol. 40 (1995), p. 2285–2288.
[4] fuller, j., breda, a. c., carlin, r. t.: j. electrochem. soc., vol. 144 (1997), no. 4, l67–l70.
[5] noda, a., watanabe, m.: electrochim. acta, vol. 45 (2000), p. 1265–1270.
[6] forsyth, m., sun, j., macfarlane, d. r.: electrochim. acta, vol. 45 (2000), p. 1249–1254.
[7] bicak, n.: "a new ionic liquid: 2-hydroxy ethyl ammonium formate." j. molecular liquids (in press).
[8] furusho, j.: "control of mechatronics systems using electrorheological fluids." journal of the society of instrument and control engineers, vol. 34 (1995), p. 687–691.
[9] johnson, a. r., makin, j., bullough, w. a.: advances in electrorheological fluids (ed. m. a. kohudic). technomic publ., 1994.
[10] inoue, a., maniwa, s.: "electrorheological effect of liquid crystalline polymers." journal of applied polymer science, vol. 55 (1995), p. 113–118.
[11] inoue, a.: "trends of homogeneous er fluid development." journal of the society of instrument and control engineers, vol. 34 (1995), p. 698–701.
[12] akhavan, j., slack, k., wise, v., block, h.: "coating of polyaniline with an insulating polymer to improve the power efficiency of electrorheological fluids." proceedings of the 6th international conference on electrorheological fluids, magnetorheological suspensions and their applications, eds. m. nakano, k. koyama, 1997, p. 322.
[13] zhao, x. p., chen, j., shen, y. x., lu, k. q.: "properties of conductive electrorheological systems." proceedings of the 6th international conference on electrorheological fluids, magnetorheological suspensions and their applications, eds. m. nakano, k. koyama, 1997, p. 302.

m. m. altug bicak, tel.: +90-212-293 13 00/2510, fax: +90-212-245 07 95, e-mail: bicakme@itu.edu.tr
h. temel belek, tel.: +90-212-293 13 00/2577, e-mail: belek@itu.edu.tr
ali göksenli, tel.: +90-212-293 13 00, e-mail: goksenli@itu.edu.tr
istanbul technical university, faculty of mechanical engineering, taksim, istanbul, turkey

effect of stirrups on behavior of normal and high strength concrete columns

j. němeček, p. padevět, z. bittnar

this paper deals with an experimental investigation and numerical simulation of reinforced concrete columns. the behavior of normal and high strength columns is studied, with special attention paid to the confinement effects of transversal reinforcement in columns with a square cross section. the character of the failure, the strengths, the ductility and the post-peak behavior of the columns are observed in the experiments and also in the numerical solution. a three-dimensional computational model based on the microplane model for concrete was constructed and compared with the experimental data. the results of the numerical model showed good agreement in many aspects, and proved the capabilities of the material model used.

keywords: concrete, reinforced concrete columns, stirrups, normal strength concrete, high strength concrete, experiments, simulation, microplane model.

1 introduction

nowadays, attention in civil engineering is focused on high performance structural materials. like it or not, concrete is still the world's leading structural material. typically, it is used in columns (due to its high bearing capacity in compression) in high-rise buildings. the high performance of the material also leads to substantial questions concerning not only the bearing capacity, but also the ductility and the post-peak behavior. high strength concrete members usually suffer from lower ductility. thus, special attention must be paid to the post-peak behavior of such columns, because a reduction of ductility can lead to a significant reduction of the overall load bearing capacity of the structure during abnormal loading, such as an earthquake or a terrorist attack, for example. the problem of concrete ductility is complicated by its dependence on the amount of confinement, i.e. the amount of transversal reinforcement. this concerns both high and normal strength concretes. a better understanding of concrete behavior in reinforced concrete structures such as columns is needed, and precise and verified models are required. during the past decade, many authors have investigated the load bearing capacity of columns and the confinement effects of reinforcement. however, these studies have been oriented mainly towards strength investigations; less is known about the post-peak behavior of such columns. moreover, the studies are limited mainly to circular cross sections, where the effect of confinement is very significant.
fam and rizkalla [9] investigated a series of small- to large-scale circular columns cast into steel and composite tubes. they showed the importance of lateral confinement, which can lead to more than a 100 % strength increase; unfortunately, no post-peak investigation was performed. similar results were obtained by mortazavi et al. [16] for concrete columns with pretensioned carbon fiber polymer tubes. existing knowledge about the design of steel-confined concrete is already incorporated in eurocode 8 [8]. it assumes the confining steel is fully utilized (yielded), and cross-sectional aspects are taken into account by the confinement effectiveness coefficient (equal to one for cylindrical cross sections). some investigations of confinement effects and also of the size effect for reinforced concrete columns can be found e.g. in hollingworth [10], bažant and kwon [1] and sener et al. [21]. however, these works concern mainly strength issues and not ductility.

2 motivation

eccentrically loaded reinforced concrete columns were chosen for this research. centric loading would not be suitable, especially when post-peak behavior is the matter of investigation (e.g. němeček [13], [14]). the choice of columns was influenced by the following facts. a column is a typical structural element, which is used many times over in a structure. it has an enormous influence on the ductility and overall performance of the structure. a combination of compression with small eccentricity produces a relatively complicated triaxial stress state in the concrete, which is longitudinally and transversally reinforced. the character of the failure can be readily observed and measured, and the measured parameters can be compared with the model simulation. all these considerations made the column a perfect candidate for this study.

3 methods

we decided to study the problem both experimentally and numerically. one typical geometry with a square cross section was chosen for all tested columns. the columns were reinforced with the same amount of longitudinal reinforcement and a variable amount of transversal reinforcement (stirrups). three different stirrup distances were used, and two concrete grades (normal and high strength) were tested; thus, the total number of studied cases was 6. a three-dimensional finite element model of the columns was constructed. a sophisticated three-dimensional material model, capable of describing all the important natural phenomena (such as tension and compression softening, path dependence, anisotropy and so on), had to be used; the m4 microplane model (bažant et al. [5], caner and bažant [6]) was our choice. the use of this model has been justified in similar applications (brocca and bažant [7], němeček and bittnar [13]). the problem was studied using the oofem finite element package [11] developed at the department of structural mechanics, ctu prague.
4 experiments

4.1 specimens

a common geometry was used for all columns. the columns had a square cross section of 150×150 mm and a length of 1150 mm. the longitudinal reinforcement (ribbed bars with a diameter of 12 mm) was placed in the corners of the cross section. the transversal reinforcement was formed by closed stirrups with a diameter of 6 mm. the longitudinal distance between the stirrups in the middle part of the columns was 50, 100 or 150 mm (these dimensions are used in the subsequent notation of the series, e.g. n50 is a normal strength concrete column with a 50 mm stirrup distance at the midheight). the stirrup spacing at the ends was denser, to prevent damage in this region caused by the introduction of the load and by possible geometrical imperfections. the longitudinal reinforcement was further fixed into massive end blocks made of steel (80 mm in depth) for a smooth load transfer. the dimensions of the specimens and their reinforcement are depicted in fig. 1a. each series consisted of five identical specimens. the columns were loaded in uniaxial eccentric compression; the eccentricity of the compressive load was 15 mm (i.e. 0.1 of the cross-sectional depth).

fig. 1: (a) geometry of the specimens in mm, (b) gauge arrangement and corresponding measured parameters (f stands for the overall force, w for the midheight lateral deflection, ε_0 is the overall longitudinal strain and ε_t and ε_b are the top and bottom strains measured at the column ends)

4.2 instrumentation

the experiments were carried out on a feedback-controlled test machine with a very stiff steel frame (12 mn/mm) and a maximum load of 2500 kn (inova dsm2500, cz). the control was based on a constant increment of the longitudinal deformation. each specimen was equipped with a set of tensometric gauges used for longitudinal strain measurements. rotary potentiometric gauges were used for the midheight lateral deflection measurements. the measured experimental parameters were as follows: the overall axial force, the midheight lateral deflection, the strains measured over the whole length of the column (base 960 mm) and the strains at the ends of the column (base 50 mm). the type and character of the failure was also observed. the gauge arrangement is depicted in fig. 1b.

4.3 materials

it was decided to study the confinement effects of stirrups on two grades of concrete: normal (n series) and high strength (h series). the concrete mixture proportions are given in table 1.

table 1: concrete mixture proportions (kg/m³)
  series | aggregate (0–4) mm | aggregate (4–8) mm | cement | water | plasticizer
  n      | 800                | 880                | 350    | 204   | –
  h      | 800                | 880                | 420    | 120   | 8.4

uniaxial compression tests on cylinders (diameter 150 mm, height 300 mm, six for each series) were performed. the average measured stress-strain diagrams of the cylinders are shown in fig. 2. it is clear that not only the peak strengths but also the ductility differ for the two concrete mixtures: the n series was much more ductile than the h series, as can be seen from the post-peak slope in fig. 2. the stress-strain diagrams of the cylinders were used for the subsequent calibration of the microplane model. the mean strengths and standard deviations for the concrete (in uniaxial compression) and the steel (in uniaxial tension) are given in table 2.

fig. 2: stress-strain diagrams measured on cylinders in uniaxial compression tests (mean values) and the corresponding computed curves used for the calibration of the microplane model

table 2: mean values of material strengths ± standard deviations (uniaxial compression for concrete, uniaxial tension for reinforcement, all in mpa)
  concrete, n series: 30 ± 1.6
  concrete, h series: 67.2 ± 3.4
  longitudinal reinforcement: 561 ± 12.2
  stirrups: 314 ± 11.6

4.4 test results

the behavior of all series was very similar. almost all specimens failed around the midheight. as an example, all specimens of series n100 after collapse can be seen in fig. 3a.
column collapse was initiated by concrete softening at the © czech technical university publishing house http://ctn.cvut.cz/ap/ 159 acta polytechnica vol. 44 no.5–6/2004 (a) (b) fig. 1: (a) geometry of specimens in mm, (b) gauge arrangement and corresponding measured parameters (f stands for overall force, w stands for midheight lateral deflection, �0 is overall longitudinal strain and �t and �b are top and bottom strains measured at column ends) series aggregate (0–4) mm aggregate (4–8) mm cement water plastisizer n 800 880 350 204 – h 800 880 420 120 8.4 table 1: concrete mixture proportions (kg/m3) fig. 2: stress-strain diagrams measured on cylinders in uniaxial compression tests (mean values) and corresponding computed curves used for calibration of the microplane model n series h series longitudinal reinforcement stirrups 30 � 1.6 67.2 � 3.4 561 � 12.2 314 � 11.6 table 2: mean values of material strengths � standard deviations (uniaxial compression for concrete and uniaxial tension for reinforcement, all in mpa) midheight, accompanied by symmetric buckling of both reinforcing bars at the compressed side of the cross section. the bars always buckled between stirrups, as can be seen in fig. 3b. failure localized at the middle part of the column where a wedge-shaped pattern developed (see fig. 3 c). the front longitudinal dimension of the wedge was measured to characterize the size of the damage zone (labeled as o in fig. 3c). the dimensions of the damage zone for all series are summarized in table 3. note that these dimensions are just guide values because it is hard to find a sharp end of the damage zone in concrete and also the final force applied in the column was not absolutely the same for all specimens (loading finished at approximately 30 � 10 % of the peak force). however, it can be seen that the damage zone is approximately equal for all series regardless of the density of the stirrups. the means that the damage zone runs across one or more stirrups but the damage size remains approximately the same. the yield plateau in the force-deflection curve was very small and the load-bearing capacity decreased from the peak value. the loading diagrams plotted for overall axial force versus midheight lateral deflection are shown in fig. 4 for all series. the peak values of deflection and force for all series are summarized in table 4. the results show no significant in160 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 44 no.5–6/2004 (a) (b) (c) fig. 3: experiments: (a) series n100 after collapse, (b) front view of the damaged zone at the midheight, (c) side view of the damage zone (wedge shape) specimen dimension o specimen dimension o n50 250 � 11.7 h50 266 � 18.5 n100 256 � 18.6 h100 240 � 16.3 n150 196 � 15.0 h150 213 � 9.4 table 3: dimensions of the wedge-shaped damage zone (all dimensions in mm) fig. 4: experiments: force vs. midheight lateral deflection diagrams of n (left) and h (right) series peak deflection wexp [mm] peak force pexp [kn] n50 3.84 � 0.18 617.6 � 18.7 n100 3.69 � 0.35 607.8 � 11.8 n150 3.48 � 0.44 602.2 � 15.9 h50 2.75 � 0.21 1053.2 � 45.7 h100 2.71 � 0.16 1038.4 � 46.2 h150 2.38 � 0.41 1007.0 � 55.1 table 4: experimental results: mean peak values of midheight lateral deflection and overall axial force � standard deviations fluence of density of stirrups on the peak values, i.e. strength and strain. however, this dependence occurs in the post-peak region. 
the ductility characterized by the slope of the force-deflection diagram increases as the distance between the stirrups decreases. tensometric measurements of longitudinal strains were also performed. the strain was measured over the whole length of the column and at the ends. the measurements at the ends of columns showed unloading in this area after reaching the peak load (see fig. 5 where a specimen from the h100 series was taken as an example). this provided a clear evidence of the localization of the deformation in the middle part of the column while the remaining part underwent unloading. 5 numerical simulation 5.1 computational model a three-dimensional finite element model of the specimens was developed. the microplane model m4 (bažant et al. [5], caner and bažant [6]) was chosen as a model for concrete due to its capability to describe many natural phenomena of this material, such as compression and tension softening, path dependence, development of anisotropy and others. it is a conceptually simple but computationally demanding triaxial model. the crucial aspect of a constitutive model of such a kind is proper fitting of the material parameters. the microplane model constitutive relations are based on a set of parameters that have generally no direct physical meaning. by an optimal fitting of standard uniaxial compression tests on cylinders (see fig. 2) the sets of material parameters for two concrete grades, n and h were found. the appropriate set of material parameters was always used for computations of all specimens of n series and all of h series, respectively. the microplane model parameters for concrete are summarized in table 5. structured meshes were generated for all columns, because only local formulation of the microplane model was used. this means that the energy dissipated from each element must be kept constant in order to receive mesh independent results. thus, the models consisted of the same size cubic elements in the middle part of the column where the microplane model was used. the end parts of the columns were modeled as elastic to save computational time. the reinforcement was modeled by 3d beam elements with both geometrical and material nonlinearities (j2 plasticity with hardening), in order to capture also yielding and buckling. the fe model of the rc column is shown in fig. 6. the deformed geometry is depicted in the post-peak phase, where the deformation is already localized in the middle part. the deformed embedded reinforcement buckling at midheight between the stirrups can also be seen in fig. 6. details concerning the fe model and the computational times are mentioned in table 6. the number of elements varied only slightly between the 50, 100 and 150 series due to the slightly different amount of stirrups. as a side effect, some computational aspects were observed for the model. the computation times are very long on a single processor pc. © czech technical university publishing house http://ctn.cvut.cz/ap/ 161 acta polytechnica vol. 44 no.5–6/2004 fig. 5: strain measured at the compressed side of rc column over the whole length and at the end series e k1 k2 k3 k4 c3 c20 mpa nondimensional microplane model m4 parameters n 33000 0.000088 500 15 150 4 1.0 h 46039 0.000140 500 15 150 4 0.4 table 5: material parameters of the microplane model m4 for concrete fig. 
this indicates the high computational complexity of the microplane model; it is the penalty to be paid for such a complex but concise material model. this feature can be overcome by using a parallel approach, see for instance němeček et al. [15].

5.2 structural analysis

as mentioned above, the microplane model is computationally very demanding. moreover, the m4 version of the model gives no direct formulation of the tangential stiffness matrix, and the only way is to use the initial elastic stiffness matrix throughout the computation, which gives a very slow convergence. the solution is to use nonlinear dynamic analysis with explicit integration (němeček [14], němeček et al. [15]). to solve a static loading case, we used a special form of the load-time function d(t) that minimizes the inertia forces (řeřicha [20]). the formula has the form

d(t) = a \left( \frac{t}{T} \right)^{2} \left( 3 - 2\frac{t}{T} \right),

where T is the total time of the computation and a is a constant dependent on the final displacement. the load-time function was applied as a displacement at the appropriate points at the top of the columns.
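a short sketch of this smooth loading ramp; as reconstructed above, it rises from 0 to the final displacement a with zero velocity at both ends, which is what keeps the inertia forces small in the explicit dynamic computation (variable names are illustrative):

```python
def load_time_function(t, T, a):
    """Smooth displacement ramp d(t) = a*(t/T)**2*(3 - 2*t/T):
    d(0) = 0, d(T) = a, and d'(0) = d'(T) = 0, so the prescribed
    displacement starts and ends with zero velocity."""
    x = t / T
    return a * x * x * (3.0 - 2.0 * x)

# the ramp reaches half of the final displacement at mid-time
print(load_time_function(t=5.0, T=10.0, a=1.0))   # 0.5
```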
the problem was solved using the oofem fe-code ([11], patzák [18], [19]) developed at the department of structural mechanics at ctu in prague.

5.3 results of the simulation

it was found that the model is capable of capturing all the important features of the rc-column behavior. it gives a good prediction of the shape and size of the damage zone in the concrete and of the buckling of the steel (see fig. 7). the deformation of the fe mesh is depicted in fig. 6, where one can see the overall deformation of the column in the post-peak phase and also the deformed reinforcement with the longitudinal bars buckled between the midheight stirrups.

fig. 7: comparison of the experiment (left) and the fe model: damage zone with buckled reinforcement

the computed loading diagrams are shown in fig. 8, where the overall axial force is plotted versus the midheight lateral deflection for all series.

fig. 8: simulation: force vs. midheight lateral deflection diagrams of the n (left) and h (right) series

if we compare the computed diagrams (fig. 8) with their experimental counterparts (fig. 4), we can draw the following conclusions. the character of the failure, including the softening branch, is in good agreement with the experiments: it lacks a yielding plateau, as in the experiments. however, it does not follow the slope of the experimental curve in the post-peak region; the model gives a less ductile response in this case. the peak values of the loading diagram were captured relatively well by the model. one must bear in mind that the material parameters were extracted from the uniaxial compression tests only; no further fitting of the experimental data on the columns was done. the peak values, together with the percentage changes with respect to the experimental results (see table 4), are summarized in table 7. the values of the peak force are in excellent agreement with the experiments (within 5 %), while the peak deflections are less satisfactory (within 20–40 %).

table 7: numerical results: peak values of the midheight lateral deflection and the overall axial force, with the percentage changes 100(w_sim/w_exp − 1) and 100(p_sim/p_exp − 1) with respect to the experiments
  series | peak deflection w_sim [mm] | peak force p_sim [kn] | change in deflection | change in peak force
  n50    | 2.33 | 623.9  | −39.3 % | +1.0 %
  n100   | 2.38 | 594.2  | −35.5 % | −2.2 %
  n150   | 2.10 | 614.6  | −39.6 % | +2.1 %
  h50    | 3.40 | 1096.6 | +23.6 % | +4.1 %
  h100   | 3.25 | 1059.7 | +19.9 % | +2.1 %
  h150   | 3.00 | 1053.7 | +26.1 % | +4.6 %

the reason for the different post-peak behavior of the simulation and the experiments is probably that the concrete-steel interaction was not taken into account: no bond of the reinforcement was assumed. the reinforcement, modeled by beam elements, was connected only to the nodes of the corresponding finite element. the interaction of the materials is what makes the column more ductile in the post-peak regime, in comparison with a plain concrete plus plain steel structure. a consequence of this finding is that the pre-peak behavior of the columns is not very strongly influenced by the concrete-to-steel interaction, while for the post-peak behavior the interaction cannot be neglected. however, the incorporation of the concrete/steel interaction into the model and the corresponding experimental measurements are beyond the scope of this study.
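the percentage changes in table 7 follow directly from the peak values quoted in tables 4 and 7; a few lines of python reproduce them (a worked check, not part of the original analysis):

```python
# experimental (table 4) and computed (table 7) peak values:
# series: (w_exp [mm], p_exp [kN], w_sim [mm], p_sim [kN])
peaks = {
    "n50":  (3.84,  617.6, 2.33,  623.9),
    "n100": (3.69,  607.8, 2.38,  594.2),
    "n150": (3.48,  602.2, 2.10,  614.6),
    "h50":  (2.75, 1053.2, 3.40, 1096.6),
    "h100": (2.71, 1038.4, 3.25, 1059.7),
    "h150": (2.38, 1007.0, 3.00, 1053.7),
}

for series, (w_exp, p_exp, w_sim, p_sim) in peaks.items():
    dw = 100.0 * (w_sim / w_exp - 1.0)   # change in peak deflection
    dp = 100.0 * (p_sim / p_exp - 1.0)   # change in peak force
    print(f"{series}: deflection {dw:+.1f} %, force {dp:+.1f} %")
# e.g. n50: deflection -39.3 %, force +1.0 %
```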
6 conclusions

the behavior of six series of reinforced concrete columns with a square cross section was investigated. two different grades of concrete (normal and high strength) and three different densities of stirrups were chosen. the columns were loaded in eccentric compression with a small eccentricity. the problem was studied experimentally and numerically: a computational model based on the m4 microplane model for concrete (bažant et al. [5], caner and bažant [6]) was constructed and used for the simulation of the problem. the major experimental and numerical results are as follows:
– compression failure (crushing), accompanied by concrete softening and steel buckling, developed in the columns.
– the failure of the columns localized in the middle part, where a wedge-shaped failure pattern developed in the concrete, together with buckling of the reinforcement between the stirrups. the damage zone had approximately the same dimensions for all tested series.
– the influence of the density of the stirrups on the column strength was negligible in the investigated cases (i.e. square cross section, stirrup distance 50 mm–150 mm).
– a significant influence of the stirrup density was observed in the post-peak region. the post-peak response is characterized by the lack of a yield plateau, and the slope of the descending branch depends on the density of the stirrups: the ductility of the columns increases as the distance between the stirrups becomes smaller. this was observed for both normal and high strength concretes.
– the proposed computational model based on the m4 microplane model (bažant et al. [5], caner and bažant [6]) is able to provide a good description of all the observed parameters, such as the shape and size of the damage process zone, the buckling of the steel reinforcement, the load capacity of the structure (peak values), and the character of the post-peak behavior (decreasing load with increasing deformations).
– the computational model gives less ductility in the post-peak region, which is caused by the lack of steel-to-concrete interaction in the model; the model should be improved in this respect.

acknowledgments

this work has been supported by the czech grant agency (project no. 103/02/1273). this support is gratefully acknowledged.

references

[1] bažant z. p., kwon y. w.: "failure of slender and stocky reinforced concrete columns: test of size effect." materials and structures, vol. 27 (1994), p. 79–90.
[2] bažant z. p., xiang y., prat p. c.: "microplane model for concrete i.: stress-strain boundaries and finite strain." journal of engineering mechanics, (1996).
[3] bažant z. p., xiang y., adley m. d., prat p. c., akers s. a.: "microplane model for concrete ii.: data delocalization and verification." journal of engineering mechanics, (1996).
[4] bažant z. p., planas j.: "fracture and size effect in concrete and other quasibrittle materials." crc press llc, boca raton, florida, 1998.
[5] bažant z. p., carol i., adley m. d., akers s. a.: "microplane model m4 for concrete i: formulation with work-conjugate deviatoric stress." journal of engineering mechanics, vol. 126 (2000), no. 9, p. 944–953.
[6] caner f. c., bažant z. p.: "microplane model m4 for concrete ii.: algorithm and calibration." journal of engineering mechanics, vol. 126 (2000), no. 9, p. 954–961.
[7] brocca m., bažant z. p.: "size effect in concrete columns: finite-element analysis with microplane model." journal of structural mechanics, vol. 127 (2001), no. 12, p. 1382–1390.
[8] eurocode 8: design provisions for earthquake resistance of structures, part 1–4: strengthening and repair of buildings. commission of the european communities, brussels, 2001.
[9] fam a., rizkalla s.: "large scale testing and analysis of hybrid concrete/composite tubes for circular beam-column applications." construction and building materials, vol. 17 (2003), p. 507–516.
[10] hollingworth s. c.: "structural and mechanical properties of high strength concrete", ph.d. thesis, university of wales, cardiff, 1998.
[11] http://ksm.fsv.cvut.cz/oofem, "finite element software oofem", department of structural mechanics, ctu prague.
[12] jirásek m., bažant z. p.: "inelastic analysis of structures." john wiley & sons, uk, 2002.
[13] němeček j., bittnar z.: "experimental investigation and numerical simulation of post-peak behavior and size effect of reinforced concrete columns." materials and structures, vol. 37 (2004), no. 67, p. 161–169.
[14] němeček j.: "modeling of compressive softening of concrete", ph.d. thesis, ctu reports, prague, 2000.
[15] němeček j., patzák b., rypl d., bittnar z.: "microplane models: computational aspects and proposed parallel algorithm." computers and structures, vol. 80 (2002), p. 2099–2108.
[16] mortazavi a. a., pilakoutas k., son k. s.: "rc column strengthening by lateral pre-tensioning of frp." construction and building materials, vol. 17 (2003), p. 491–497.
[17] ouyang c., shah s. p.: "fracture energy approach for predicting cracking of reinforced concrete tensile members." structural journal aci, vol. 91 (1994), p. 69–78.
[18] patzák b.: "material models for concrete", ph.d. thesis (in czech), ctu prague, 1997.
[19] patzák b.: "object oriented finite element modeling." acta polytechnica, prague, 1999.
[20] řeřicha p.: "optimum load history for non-linear analysis using dynamic relaxation." journal of structural mechanics, vol. 23 (1986), p. 2313–2324.
[21] sener s., barr b. i. g., abusiaf h. f.: "size-effect tests in unreinforced concrete columns." magazine of concrete research, vol. 51 (1999), no. 1, p. 3–11.
[22] van mier j. g. m.: "strain-softening of concrete under multiaxial loading conditions", ph.d. thesis, eindhoven university of technology, the netherlands, 1984.
[23] vonk r.: "softening of concrete loaded in compression", ph.d. thesis, technische universiteit eindhoven, 1992.

ing. jiří němeček, ph.d., e-mail: jiri.nemecek@fsv.cvut.cz
ing. pavel padevět, ph.d.
prof. ing. zdeněk bittnar, drsc., phone: +420 224 354 309, fax: +420 224 310 775
department of structural mechanics, czech technical university in prague, faculty of civil engineering, thákurova 7, 166 29 prague 6, czech republic

additive laser barcode printing on high reflective stainless steel

acta polytechnica 60(5):435–439, 2020, doi:10.14311/ap.2020.60.0435

mihail stoyanov mihalev^a,*, chavdar momchilov hardalov^b, christo georgiev christov^b, monika rinke^c, harald leiste^c, johannes schneider^d

a technical university of sofia, faculty of mechanical engineering, department of precision engineering and measurement instruments, 8 kl. ohridski blvd, 1000 sofia, bulgaria
b technical university of sofia, department of applied physics, 8 kl. ohridski blvd, 1000 sofia, bulgaria
c karlsruhe institute of technology, institute of applied materials – applied materials physics, hermann-von-helmholtz-platz 1, 76344 eggenstein-leopoldshafen, germany
d karlsruhe institute of technology, institute of applied materials – computational materials science, hermann-von-helmholtz-platz 1, 76344 eggenstein-leopoldshafen, germany
* corresponding author: mmihalev@tu-sofia.bg

abstract. in this paper, the process of additive laser marking on stainless steel parts for barcode printing is presented. it is based on the use of one transition metal oxide, chemically well bonded to the stainless steel substrate, without the use of any additional materials or cleaning substances. the resulting additive coatings, produced from an initial moo3 powder by irradiation with a laser beam, reveal strong adhesion, high hardness, long durability and high optical contrast, which makes the process suitable for barcode printing on materials such as highly reflective stainless steel, which has always been a challenge for classical laser marking technologies. the obtained bar patterns comply with the requirements of the existing standards.

keywords: additive laser marking, direct part marking, moo3, transition metal oxides, chemical bonding.

1. introduction

nowadays, product information becomes more and more important due to the globalization of the world industry. this kind of production requires a consistent system for the identification and classification of products. since 1952, bar and 2d codes have been established as a standard tool for identification and quality classification in industrial part production, part assembly, logistics and traceability. the codes carry information about the origin and the history of production, assembly and transportation, which companies need for the improvement of their quality procedures, and help to avoid errors before the products are shipped.
to meet these challenges, the barcodes were standardized in iso/iec 15415 [1]. nasa also developed a standard for the matrix identification of its numerous parts, aiming at a reliable synchronization during production and assembly [2]. this approach is followed by many other federal agencies and departments, such as the department of defense, as well as by many companies in the usa and the eu. the marking process plays a key role in the reliable recognition of the barcode symbols and the identification of the parts. from a technological point of view, the marking coating is expected to have a high optical contrast in a broad spectral range as well as a good adhesion to the substrate, and it must be durable as well. one of the most versatile techniques is laser marking. however, the inscription of metal parts, especially of highly reflective stainless steel, with standard subtractive methods like laser ablation is a challenge for this approach due to the low contrast, the change of the substrate material properties during the laser treatment and the emergence of habitats for micro-flora and fauna, which makes it unacceptable for the marking of medical instruments, for example. to overcome most of the mentioned disadvantages, we present an additive marking process, which needs only one transition metal oxide for the preparation of a dark coloured coating, chemically well bonded to the stainless steel substrate, without any additional materials and cleaning substances [3, 4]. henceforth, it will be referred to as "additive laser marking" (alm). the aim of the present publication is to reveal the feasibility of the alm for a specific industrial application, namely marking barcodes on highly reflective stainless steel, and the compliance of the obtained barcodes with the standards.

figure 1. digital photo of a real alm barcode with a microscopic image of selected black bars (square below the barcode).

2. experimental
2.1. technology of additive laser direct part marking
a plate of mirror-finished stainless steel grade 304 is chosen as a substrate and moo3 powder as a typical transition metal oxide for the alm. it is well known that moo3 melted by laser irradiation in the liquid phase exhibits strong acid and oxidant properties [3, 4], building an fe2(moo4)3 interface layer between the substrate and the amorphised moo3. the obtained coating shows a dark colour due to oxygen vacancies [5], so that no additional pigments for colourising are necessary. the equipment for the production of a marking coating consists of an industrial laser engraver jq6090 with a sealed co2 laser tube with a maximal output power of 80 w. the industrial application of the presented modification of the alm requires a line-art scanning of the assigned topology, so that the desired image is obtained.

2.2. alm printed barcode sample
fig. 1 shows a real barcode sample, printed using the suggested alm, containing the coded sequence of numbers 9788420 532318, with a magnified microscopic excerpt from selected bar edges. the picture is obtained by using a commercial digital camera sony hd. a magnified view (objective 4×, ocular 16×) of a region, observed by an optical microscope motic bm13, is situated in the square below the barcode.
2.3. experimental sample
using the alm, a sequence of bars of different widths is obtained by irradiating a layer of pre-coated moo3 powder on stainless steel, following the topology of a given design (fig. 2). the working modes (intensity of the laser beam and the scan velocity) are chosen so as to produce a well-bonded black layer on the stainless steel substrate [3, 4]. a sequence of bars consisting of laser lines shifted by 50 µm is produced (in fig. 2 from left to right); the width of the narrowest bar is approximately equal to the diameter of the diffraction spot. every consecutive empty space has the same width as the previous black bar.

figure 2. digital photo of a sequence of black bars and empty spaces with a progressively increasing width.

3. results
3.1. material properties of the coating
3.1.1. adhesion
a standard scratch test is performed by a csem revetest (knmf kit karlsruhe, germany) scratch tester. during the scratch test, no critical load for a coating failure is registered. moreover, when the maximal load is reached and the indenter scrapes the substrate, traces of the coating in the substrate can still be observed [3]. these results confirm the excellent adhesion of the coating to the substrate.

3.1.2. reflectance of substrate and coating
optical reflectance measurements are performed in the whole visible spectrum by using an avaspec spectrophotometer. fig. 3 presents the optical reflectance of both the coating and the substrate, which is an important characteristic for the barcode and 2d code recognition, as required by the standard [6]. a large difference between the substrate and coating reflectance is observed. in the spectral region 600–700 nm, in which most barcode readers operate, the difference is nearly constant (approx. 68 %), which meets the gs1 requirements (at least 15 %, [7]).

figure 3. spectral reflectance profile of the stainless steel substrate and the moo3 layer, obtained by the alm.
figure 4. intensity profile along a line across the black bars and empty spaces (a. u.).

3.2. barcode recognition properties
3.2.1. scan reflectance profile
to receive the scan reflectance profile, the bar pattern obtained by the alm is illuminated by a light source of a wavelength in the red region, imitating a barcode reader scanning the bars along a given scan line. the intensity of each pixel along the scan path is acquired from a bitmap image of the pattern. the profile is graphically represented in fig. 4. the regions with high reflectance levels correspond to the white spaces; the regions with low levels to the black bars, as shown in fig. 4. all barcode symbol parameters, defined in [6] and measured experimentally, are listed in the next subsections.

rmax              75 %
rmin              12 %
global threshold  44 %
table 1. minimal and maximal reflectance.

3.2.2. global threshold
in order to distinguish a bar, more precisely its width, the reflectance level must be under a certain value, defined as the global threshold. in compliance with [6], the global threshold is a horizontal line midway between the minimal rmin and the maximal rmax levels of the pattern reflectance, as shown in fig. 4 and table 1.
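the scan reflectance profile and the global threshold defined above translate directly into a numerical procedure. the following python sketch is only an illustration under our own assumptions (a linear mapping between pixel value and reflectance, calibrated by the rmin/rmax values of table 1; function names are ours):

```python
import numpy as np

def scan_reflectance_profile(image_row, r_min=12.0, r_max=75.0):
    """map one row of an 8-bit grayscale bitmap to reflectance in
    percent; the linear pixel-to-reflectance mapping is an assumed
    calibration against the values of table 1."""
    row = np.asarray(image_row, dtype=float)
    return r_min + (row / 255.0) * (r_max - r_min)

def global_threshold(profile):
    """horizontal line midway between the minimal and maximal
    reflectance of the pattern: (75 % + 12 %) / 2 ≈ 44 % (table 1)."""
    return 0.5 * (float(np.min(profile)) + float(np.max(profile)))
```

an element (bar or space) is then delimited by the crossings of the profile with this threshold, which is also how the bar widths of section 3.2.7 can be measured.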
3.2.3. symbol contrast (sc)
generally, the contrast defines the possibility to distinguish between a bar and an empty space. the symbol contrast sc is an integral measure of the readability of the barcode. the sc is related to the highest and lowest levels of the reflectance in the bar pattern. usually, barcode scanners operate at a wavelength of 650 nm, so from fig. 3, the evaluated values are rmax = 77 % and rmin = 12 %:

sc = rmax − rmin = 77 % − 12 % = 65 %,

which is in compliance with the standard.

3.2.4. edge contrast (ec)
edge contrast is the local reflectance difference between two adjacent elements, an empty space and a black bar. similarly to the sc, the edge contrast is defined as the difference of the highest and lowest reflectance values (rs and rb, respectively) in a pair of adjacent elements (bar + space or space + bar). from the intensity profile (fig. 4), rs and rb are evaluated in table 2 for each bar:

ec = rs − rb.

in the worst case, ecmin = 32.7 % is reached.

3.2.5. modulation (mod)
modulation is the ratio of the edge contrast to the symbol contrast:

mod = ecmin/sc.

the modulation is estimated to be mod = 32.7 %/65 % ≈ 0.5.

3.2.6. defects
spots in the empty spaces or light areas in the bars will cause a ripple in the scan reflectance profile at the point where the scan path crosses them. this is referred to as element reflectance non-uniformity (ern, fig. 2). in the profile of a space, they show as a valley; in that of a bar, they show as a peak. if this peak or valley approaches the threshold between light and dark, the risk of the element being seen as more than one, and of the scan failing to decode, increases. the defect parameter d is defined as:

d = ernmax/sc.

from fig. 4 it can be seen that ernmax is most expressed in bar no. 11:

ernmax = 21.9 % − 13.8 % = 8.1 %,

then the defect parameter is estimated to be:

d = 8.1 %/65 % ≈ 0.13.

3.2.7. minimal recognized bar width (x-dimension)
in order to recognize the barcode, the minimal width of the bar, more precisely the space width, must be evaluated (x-dimension). in accordance with [5], this value must be in a range between 264 and 660 µm. technologically, the narrowest possible bar can be obtained by moving the co2 laser spot along a single line, which is the case with bar no. 1. in fig. 4, the x-dimension (xd) is estimated as the space between both dimension lines, entitled "x-dimension", which defines the well-expressed initial and final points of the bar. from the graph, the xd is evaluated to be approximately 270 µm, which is in compliance with the standard as well as with the diffraction spot size when a lens with a focal length of 50 mm is used for the co2 laser, as is the case here [3].

4. discussion
marking of stainless steel has always been a challenge because of its high direct reflectance: despite the high reflectance coefficient, the directly reflected light does not return to the reader, which reduces the readability. table 3 compares the requirements of the standard and the experimental results obtained. all low levels of reflectance (fig. 4) are below the global threshold (44 %), even in the case of the narrowest dark bar no. 1 [5, 6]. the minimal width, which can be recognized, is estimated as 270 µm, above the required minimal width of 264 µm. from table 3, it can be seen that all parameters of the pattern are in compliance with the requirements of the standard [6]. they are all between grade 3 and 4 (i. e., the barcode symbol produced by using the alm is expected to be of a high quality and reliably recognizable by a barcode reader).
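the parameters just compared can be reproduced numerically from the scan reflectance profile of fig. 4. the sketch below pairs with the previous snippet; the segmentation of the profile at the global threshold is our own simplification, not the verifier algorithm prescribed by [6]:

```python
import numpy as np

def barcode_quality(profile):
    """evaluate sc, ec_min, mod and d from a 1-d scan reflectance
    profile given in percent."""
    profile = np.asarray(profile, dtype=float)
    r_max, r_min = profile.max(), profile.min()
    threshold = 0.5 * (r_max + r_min)        # global threshold
    sc = r_max - r_min                       # symbol contrast

    # segment into alternating bars/spaces at the threshold; with this
    # simplification an element never crosses the threshold internally
    dark = profile < threshold
    cuts = np.flatnonzero(np.diff(dark.astype(int))) + 1
    elements = np.split(profile, cuts)

    # element reflectance: lowest value in a bar, highest in a space
    refl = [seg.min() if seg[0] < threshold else seg.max()
            for seg in elements]

    # edge contrast of each adjacent pair, ec = rs - rb
    ec_min = min(abs(a - b) for a, b in zip(refl, refl[1:]))
    mod = ec_min / sc                        # modulation

    # element reflectance non-uniformity and defect parameter
    ern_max = max(seg.max() - seg.min() for seg in elements)
    d = ern_max / sc

    return {"threshold": threshold, "sc": sc,
            "ec_min": ec_min, "mod": mod, "defects": d}
```

with the values reported above (rmax = 77 %, rmin = 12 %, ecmin = 32.7 %, ernmax = 8.1 %), such a routine reproduces sc = 65 %, mod ≈ 0.5 and d ≈ 0.13.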
clearly, all parameters can be improved by choosing more suitable technological conditions of the alm process – feed rate, intensity of the laser beam, overlapping distance of the laser scan lines. but even in the worst case of mirror-finish stainless steel with working parameters not perfectly optimised, a high contrast barcode symbol is obtained.

bar no.  1     2     3     4     5     6     7     8     9     10    11    12    13    14
ec, %    32.7  37.5  46.2  47.5  51.5  58.9  49.7  48.4  52.8  55.8  56.7  54.5  58.0  59.8
table 2. edge contrast.

grade     symbol contrast  edge contrast  modulation  defects
4         ≥ 70 %           ≥ 15 %         ≥ 0.7       ≤ 0.15
3         ≥ 55 %                          ≥ 0.6       ≤ 0.20
2         ≥ 40 %                          ≥ 0.5       ≤ 0.25
1         ≥ 20 %                          ≥ 0.4       ≤ 0.30
0 (fail)  < 20 %           < 15 %         < 0.4       > 0.30
experimental results (worst case): 63 %, 32.7 %, 0.5, 0.13
table 3. comparison between standard requirements and experimental results.

5. conclusions
• a modification of the alm method, which needs only one transition metal oxide for the preparation of a high contrast coating over a well-polished stainless steel substrate, chemically well bonded to the metal, without any additional materials and cleaning substances, has been developed and presented. the coatings show a strong adhesion, high hardness, long durability and a high optical contrast.
• the alm is a suitable technique for a direct laser printing of barcodes on stainless steel parts, especially those made from a mirror-finished material.

acknowledgements
the support of knmf, karlsruhe institute of technology (kit), karlsruhe, germany, the institute of applied materials/applied materials science at kit and the german academic exchange service (daad) is gratefully appreciated.

references
[1] iso/iec 15415: information technology – automatic identification and data capture techniques – bar code symbol print quality test specification – two-dimensional symbols. standard, international organization for standardization, geneva, 2011.
[2] nasa-std-6002d: applying data matrix identification symbols on aerospace parts. standard, national aeronautics and space administration, washington, 2002.
[3] m. s. mihalev, c. m. hardalov, c. g. christov, et al. transition metal oxides as materials for additive laser marking on stainless steel. acta polytechnica 57(4):252–262, 2017. doi:10.14311/ap.2017.57.0252.
[4] m. mihalev, c. hardalov, c. christov, et al. structural and adhesional properties of thin moo3 films prepared by laser coating. journal of physics: conference series 514, 2014. doi:10.1088/1742-6596/514/1/012022.
[5] n. v. borisova, e. p. surovoy. zakonomernosty izmeneniya opticheskih svoistv nanorazmernyh sloev oxida molibdena (vi) v resultate termoobrabotki [regularities of the changes in the optical properties of nanosized molybdenum(vi) oxide layers as a result of a heat treatment]. izvestiya tomskogo polytechnicheskogo universiteta 310(3):68–72, 2007. http://earchive.tpu.ru/handle/11683/1659.
[6] ansi incits 182: for information systems – bar code print quality guideline. standard, american national standards institute, washington, 1990.
[7] bar code verification for linear symbols, ver. 4.3. report, gs1, washington, 2009.
acta polytechnica doi:10.14311/ap.2019.59.0248 acta polytechnica 59(3):248–259, 2019 © czech technical university in prague, 2019 available online at http://ojs.cvut.cz/ojs/index.php/ap

dynamic smart grid communication parameters based cognitive radio network

haider tarish haider a,∗, dhiaa halboot muhsen a, haider ismael shahadi b, ong hang see c, wilfried elmenreich d

a university of mustansiriyah, department of computer engineering, 10001 baghdad, iraq
b university of karbala, department of electrical and electronic engineering, 56001 karbala, iraq
c universiti tenaga nasional, department of electronics and communication engineering, 43000 kajang, selangor, malaysia
d alpen-adria-universität klagenfurt, institute of networked & embedded systems/lakeside labs, 9020 klagenfurt, austria
∗ corresponding author: haiderth@uomustansiriyah.edu.iq

abstract. the demand for more spectrum in a smart grid communication network is a significant challenge given the originally scarce spectrum resources. cognitive radio (cr) is a powerful technique for solving the spectrum scarcity problem by adapting the transmission parameters according to predefined objectives in an active wireless communication network. this paper presents a cognitive radio decision engine that dynamically selects optimal radio transmission parameters for wireless home area networks (han) of smart grid applications via the multi-objective differential evolution (mode) optimization method. the proposed system helps to derive optimal communication parameters to realize power saving, maximum throughput and minimum bit error rate communication modes. a differential evolution algorithm is used to select the optimal transmission parameters for given communication modes, based on a fitness function that combines multiple objectives with appropriate weights. simulation results highlight the superiority of the proposed system in terms of accuracy and convergence as compared with other evolutionary algorithms (genetic optimization, particle swarm optimization and ant colony optimization) for different communication modes (power saving mode, high throughput mode, emergency communication mode and balanced mode).

keywords: smart grid, home area network, cognitive radio, decision engine, differential evolution.

1. introduction
the integration of information and communication technology (ict) systems has transformed the traditional power grid into a smart grid [1]. ict systems enable an efficient use of energy by deploying intelligent devices and control systems to automate power grids for energy and cost savings [2]. furthermore, advanced communication systems contribute to the interaction between utility companies and customers. consequently, customers can save energy and cost, while utility companies can maintain the system reliability and resilience.
wireless technologies are a preferred option in several parts of a smart grid to provide a flexible and low-cost data communication and networking [3]. in a smart grid, three main wireless communication networks exist, ranging from those used in a home area network (han) to connect various appliances and devices within a home [4], to a neighbourhood area network (nan), directly connecting multiple end users (hans) in specific areas to the data concentrator/substation, and, ultimately, to the wide area network (wan), which connects many nans to the central control unit. due to the rapid development of smart grids, an increasing number of smart meters have been installed in the han, thus the amount of data to be transmitted is growing rapidly. furthermore, the emerging new paradigms, such as the internet of things (iot), device-to-device (d2d) communication and smart appliances, are expected to have a massive spectrum demand [5]. therefore, more spectrum bands are required to provide an accurate and flexible wireless communication in the smart grid, and this requirement presents a significant challenge given the originally scarce spectrum resources [6]. cognitive radio (cr) has provided a powerful technique to overcome the stringent spectrum resource constraints [7, 8]. the cr has the possibility to sense the wireless environment parameters and adapt intelligently to provide an optimized service that improves the communication performance [9]. spectrum sensing and spectrum decision operations involve the cognitive cycle; its applications are not limited to licensed bands and can be applied to cognitive radio users while accessing unlicensed bands to increase the efficiency and capacity [10]. the cr-based communication in smart grids has been investigated in terms of various aspects, such as architecture management [11, 12], channel selection [13], reliability of event estimation [14, 15], security and protection [16, 17], and multimedia communications [18]. however, the optimal selection of the han network parameters based on the communication environment has not been thoroughly investigated, despite it being one of the most important requirements of smart grid communications. there are many works on cognitive radio applications in a smart grid. in [19], an approach based on cellular learning automata for designing cognitive engines in cognitive peer-to-peer networks is proposed. in [20], a dynamic channel selection algorithm for the cr-based smart grid communication network is proposed. the proposed algorithm is based on a fuzzy inference system to select suitable channel parameters, including the bandwidth, snr and probability of missed detection. in [14], a reliable spectrum access and reaching a consensus with the cognitive radio sensor and actor nodes are discussed. furthermore, a consensus scheme is proposed to increase the reliability by enabling a consensus convergence of actor nodes with a minimum spectrum access. in [13], the channel selection problem is investigated for cognitive radio based smart grid communications in the distribution section. furthermore, several recent studies have emphasized the need to address the optimal selection of transmission parameters in cognitive radio decision engine (crde) systems for non-smart grid applications using evolutionary algorithms.
in [21], a multi-objective genetic algorithm (ga) is presented to select the optimal communication transmission parameters. however, the result indicates that the ga has a slow convergence for the given communication modes. in [22], a population adaptation technique is used for the ga to decrease the time required to reach the final decision. the authors attempted to improve the work presented in [21] by taking advantage of feedback learning from previous cognition cycles to speed up the system convergence. the algorithm starts at a high initial fitness, which leads to a faster convergence than a standard ga. however, at high seeding values, the resulting fitness is lower than the fitness obtained with the standard ga. in [23, 24], a two-dimensional chromosome structure for the ga was used to optimize the parameters of a cr engine. the results indicate that the non-dominated sorting ga has a faster convergence than the conventional ga. in [25], a mutated ant colony optimization (maco)-based cr engine is proposed to find optimal transmission parameters. the results indicate that the fitness scores obtained by the maco engine in the given communication scenarios are larger than those obtained by the conventional ant colony (ac) and ga engines. a cognitive radio adaptation method, which uses particle swarm optimization (pso) as the decision method, was proposed in [26]. in this system, a discrete pso was used to optimize parameters given a set of objectives for cognitive radios. in [27], a hybrid architecture of a cognitive decision engine based on the pso algorithm and a case database is proposed. the case database can reduce the response delay of the cognitive radio engine and provides the radio with the ability to learn from its past running experiences. the result indicates that the hybrid pso can achieve a better convergence than the original pso and ga. however, this method requires a large storage for an efficient and considerably high performance. these mentioned works generally refer to cr engine optimization methods that optimize the transmission parameters based on the surrounding wireless communication environment. however, the required computation time to obtain the optimal transmission parameters has been observed to be unrealistic for smart grid applications [28]. furthermore, some studies exhibit a trade-off between the convergence time and the fitness scores. therefore, optimization methods with a fast and accurate convergence are required to achieve optimal transmission parameters within a short period of time to support active communications.

1.1. study contributions
this paper attempts to fill the aforementioned gaps in the previous cr-related work. the contribution of this paper as compared to previous approaches can be summarized as follows. first, a multi-objective differential evolution (mode) optimization algorithm is proposed for the optimal selection of transmission parameters of a cr engine in a han for the home management application of a smart grid. the mode provides a fast convergence and a high-score fitness function. second, four communication modes are adopted to utilize different environment conditions: power saving mode (minimum power), high throughput mode (maximum throughput), emergency communication mode (minimizing the bit error rate) and balanced mode (equal parameter priorities). the system adapts the transmission parameters dynamically according to the sensed wireless communication environment of the han.
these modes are taken to address all environmental sensing statuses. third, the simulation results indicate that the proposed system can achieve higher fitness scores and a faster convergence than other evolutionary-algorithm-based crdes, such as the ga, pso and maco.

2. cognitive radio parameters
the cr provides the ability to sense the surrounding wireless environment periodically and to adapt the transmission parameters appropriately according to the objectives for an optimal utilization of the spectrum bands [29]. to achieve these goals, the cr needs a crde to provide efficient transmission parameters for the current environment, including the transmission link, user demand and system policies, as shown in fig. 1. the crde must balance multiple objectives [30].

figure 1. cognitive radio system architecture.

the environmental variables are responsible for enabling the cr system to be alert to the surrounding environment so as to maximize the objectives of the system [31]. this information is periodically sensed by external sensors. the three commonly used environmental parameters are the bit error rate (ber), the signal-to-noise ratio (snr) and the noise power (n). the ber parameter represents the proportion of erroneous bits relative to the bits being sent for a specific modulation type. the snr represents the ratio of the signal power to the noise power [21]. the transmission parameters are envisioned as system knobs that can be adapted based on the measured readings of the environmental parameters and under the instruction of predefined objective functions. in the context of optimization, these parameters can also be defined as decision variables that should be determined on the basis of the prescribed optimization procedures. the three parameters used to generate a fitness function are the transmitting power (pt), the modulation type (mt) and the modulation index (mi).

3. objective function
the objective functions of a cognitive engine should be defined to guide the search direction of the optimization process for the optimal selection of the transmission parameters. in this work, three individual objective functions are combined to achieve a compromise among these objectives, based on the predefined trade-off requirements. table 1 presents the considered objectives of a cr system: minimum ber, maximum throughput and minimum transmitting power. the fitness function of a minimum power consumption is defined as follows [21]:

f_min-power = 1 − p_t/p_tmax    (1)

where p_t is the transmit power of the single carrier, and p_tmax is the maximum available transmit power. the fitness function of the maximum throughput is defined as follows [21]:

f_max-throughput = log10(m_i)/log10(m_imax)    (2)

where m_i is the modulation index of the given modulation type, and m_imax is the maximum modulation index used in the system. the fitness function of the minimum ber is given as [21]:

f_min-ber = 1 − log10(0.5)/log10(p_ber)    (3)

this fitness function is normalized to the worst possible ber value of 0.5 [32]. also, p_ber is the probability of a ber for a given modulation type.
the probability of ber is calculated for the m-ary phase-shift keying (psk) and m-ary quadrature amplitude modulation (qam) modulation types, as described by the following equations [21]. for the binary psk (bpsk) modulation type, the p_ber is

p_ber_bpsk = q(√(p_t/n)),  m_i = 2    (4)

for an m-ary psk modulation signal, the p_ber is

p_ber_mpsk = (2/log2(m_i)) · q(√(2 · log2(m_i) · p_t/n) · sin(π/m_i)),  m_i > 2    (5)

for an m-ary qam modulation signal, the p_ber is

p_ber_mqam = (4/log2(m_i)) · (1 − 1/√m_i) · q(√((3 · log2(m_i)/(m_i − 1)) · p_t/n))    (6)

where p_t/n is the bit energy-to-noise power spectral density ratio and q(·) is the gaussian q-function. these objective functions are obtained by dividing the variable score by its maximum possible value to attain a normalized range falling within (0, 1). this normalization can prevent the optimization trend from being attracted to objectives of relatively large magnitudes. each of the three objective functions is formulated as a maximization problem by transforming any minimization objective f into an equivalent maximization objective 1 − f. the three single objective functions may interfere with each other. for instance, using a higher modulation index for a specific modulation scheme increases the throughput of the cr system, but it consequently increases the ber. increasing the transmit power also reduces the ber, thus improving the objective function of minimizing the ber. however, this increment of the transmit power increases the power consumption, and thus reduces the objective function of minimizing the power consumption. therefore, these conflicting objective functions need to be solved via a multi-objective optimization method.

objective name               description                                                  related parameters
minimize power consumption   to decrease the amount of power consumption                  p_t
maximize throughput          to increase the data throughput                              m_i
minimize the bit-error rate  to improve the overall ber of the transmission environment   p_t, m_i
table 1. cognitive radio objectives and related parameters.
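for illustration, eqs. (1)–(6) translate directly into code. the following python sketch is ours (the q-function is expressed through the complementary error function, q(x) = erfc(x/√2)/2, and p_t/n is taken on a linear scale):

```python
import math

def qfunc(x):
    """gaussian q-function: q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def p_ber(pt_over_n, mi, modulation):
    """probability of a bit error, eqs. (4)-(6)."""
    if modulation == "psk" and mi == 2:          # bpsk, eq. (4)
        return qfunc(math.sqrt(pt_over_n))
    if modulation == "psk":                      # m-ary psk, eq. (5)
        return (2.0 / math.log2(mi)) * qfunc(
            math.sqrt(2.0 * math.log2(mi) * pt_over_n)
            * math.sin(math.pi / mi))
    if modulation == "qam":                      # m-ary qam, eq. (6)
        return (4.0 / math.log2(mi)) * (1.0 - 1.0 / math.sqrt(mi)) * qfunc(
            math.sqrt(3.0 * math.log2(mi) / (mi - 1.0) * pt_over_n))
    raise ValueError("unsupported modulation type")

def f_min_power(pt, pt_max):                     # eq. (1)
    return 1.0 - pt / pt_max

def f_max_throughput(mi, mi_max):                # eq. (2)
    return math.log10(mi) / math.log10(mi_max)

def f_min_ber(pber):                             # eq. (3)
    return 1.0 - math.log10(0.5) / math.log10(pber)
```

each helper returns a value in [0, 1] as long as its argument stays within the ranges discussed above, so the three objectives can be mixed on an equal footing.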
4. multi-objective optimization
multi-objective optimization problems involve multiple conflicting objectives to be simultaneously optimized. given the conflicting objectives, a single solution that is simultaneously optimal with respect to all objectives does not necessarily exist [33]. a solution may be optimal for one objective, but sub-optimal or even poor for another. therefore, a set of satisfying trade-off solutions, known as pareto optimal solutions, is commonly used [34]. for such solutions, no improvement is possible for any objective without sacrificing at least one of the other objective functions. thus, the main aim of the multi-objective optimization is to find the pareto optimal solutions rather than the optimal solutions of the independent objectives [35]. one of the issues that arise when solving multi-objective optimization problems is how to assign a single numerical value to the overall multi-objective function in terms of its dependent objectives and corresponding variables. to realize this aim, preferences for the individual objectives are first identified and numerically expressed, so that the multiple objective functions can be placed in a single scalar multi-objective function [36]. the aggregation method has been used to combine multiple objective functions into one overall utility function. in this method, the optimization preferences are expressed by the weighting coefficients assigned to the individual objective functions. therefore, the aggregation method enables the combination of single-objective functions into a single multiple-objective function [37]. the multi-objective function can be defined for the weighted sum of m objectives as [36]

f(x) = Σ_{i=1}^{m} w_i · f_i(x)    (7)

subject to Σ_{i=1}^{m} w_i = 1 and w_i ≥ 0, where m is the number of possible individual objective functions. for the current optimization problem (i.e., m = 3), we can rewrite (7) as follows:

f_multiple = w_1 · f_min-power + w_2 · f_max-throughput + w_3 · f_min-ber    (8)

where the weight vector w_i, determined by the three distinct cr transmission modes or scenarios, is defined by assigning a higher weight to a specific objective and lower weights to the others. in addition, another mode is introduced through a fair distribution of weights among the three objectives. the different modes for the weighting coefficients are defined by assigning a different weight to the fitness function of each objective. the aggregation (weighted sum) method is considered a flexible mechanism to steer the optimization process towards the highest-priority objective of a higher weight. the resulting four modes (weight vectors) are identified in table 2 for the comparison of results, as obtained in [25]. each multi-objective fitness function is obtained by plugging the corresponding weighting vector into equation (8).

transmission mode                                        weighting vector [w1, w2, w3]   mode
power saving mode (minimum transmit power)               [0.8, 0.05, 0.15]               psm
high throughput mode (maximize throughput)               [0.15, 0.8, 0.05]               htm
emergency communication mode (minimize bit-error rate)   [0.05, 0.15, 0.8]               ecm
balance mode (equal priority)                            [1/3, 1/3, 1/3]                 blm
table 2. definition of cr transmission modes and weights [21, 22, 25–27].

5. differential evolution
evolutionary multi-objective optimization (emo) algorithms are powerful tools to solve multi-objective optimization and decision problems [38], and an emo-based crde can support the awareness-processing, decision-making and learning elements of a cognitive functionality [29]. the differential evolution (de) is a stochastic, population-based optimization algorithm developed by storn and price [39]. the key difference between the de and other evolutionary algorithms (ga/pso) is in the mechanism for generating new solutions: the de generates a new solution by combining several solutions with the candidate solution. the population in the de evolves through repeated cycles of mutation, crossover and selection, unlike the ones used in the ga [32]. the classical de has four main stages: initialization, mutation, crossover and selection [36]. furthermore, three control parameters exist: f, cr and np. f is the scaling factor, typically within (0, 1), that controls the differential mutation process. cr is the crossover rate, the probability that a component of the trial vector is taken from the mutant vector. np is the current population size, i.e., the number of competing solutions in any given generation g.

5.1. initialization
the initial population is generated as random individuals within the entire search space. the search space is prescribed by lower and upper bounds for each parameter of the optimization problem [40]. the ith parameter (i = 1, 2, . . . , d, where d is the total number of decision variables) of the jth individual vector (j = 1, 2, . . . , np) at g = 1 is initialized as follows [40]:

x^1_j,i = x_j,min + rand_j(0, 1) · (x_j,max − x_j,min)    (9)
where rand_j(0, 1) is a random number within the (0, 1) interval, which is multiplied by the interval length (x_j,max − x_j,min) to ensure a distributed sampling of the parameter's domain interval [x_j,min, x_j,max]. different approaches can be used to generate the initial population, although random uniformity is the most common [41].

5.2. mutation
once the initialization is completed, the de mutates and recombines the population to produce a population of np mutant vectors. the differential mutation adds a scaled, randomly sampled vector difference to a third vector. equation (10) shows how to combine three different randomly chosen vectors to create a mutant vector [36]:

v_i,g = x_r0,g + f · (x_r1,g − x_r2,g)    (10)

where x_r0,g, x_r1,g and x_r2,g are vectors randomly sampled from the current population, and r0, r1 and r2 are mutually exclusive integers randomly generated in the range [1, np], such that r0 ≠ r1 ≠ r2 ≠ i. the mutation scale factor f usually takes values within the range [0, 1] [42].

5.3. crossover
the crossover enhances the potential diversity of the population. in a crossover, the trial vectors are produced according to [41]:

u^g_j,i = v^g_j,i  if rand_j(0, 1) ≤ cr or j = j_rand,
u^g_j,i = x^g_j,i  otherwise    (11)

where cr is the crossover rate, a user-specified constant within the range [0, 1] that controls the diversity of the population and enables the algorithm to escape from local optima [40].

5.4. selection
in the selection process, if the trial vector has an objective function value greater than or equal to that of the corresponding target vector, the target vector is replaced by the trial vector, which then becomes part of the population for the next generation. otherwise, the target vector remains in the population for the next generation. the selection operation can be expressed as follows [42]:

x^{g+1}_i = u^g_i  if f(u^g_i) ≥ f(x^g_i),
x^{g+1}_i = x^g_i  otherwise    (12)

once the new population is constructed, the reproduction process (mutation, recombination and selection) is repeated until the optimum solution is located, or a pre-specified termination criterion is satisfied, e.g., the number of generations reaches a pre-set maximum, g_max [36]. fig. 2 shows the flowchart of the differential evolution.
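before turning to the proposed engine, the four de stages of eqs. (9)–(12) can be condensed into a compact loop. the python sketch below is a minimal illustration under our own simplifications (real-valued encoding, bound clipping after crossover, fitness passed in as a callable); the second part anticipates section 6 by assembling the weighted fitness of eq. (8) from the mode weights of table 2, reusing the helpers sketched in section 3:

```python
import random

def de_optimize(fitness, bounds, f=0.85, cr=0.6, np_=None, g_max=100):
    """minimal differential evolution loop, eqs. (9)-(12);
    fitness maps a parameter vector to a score to be maximized."""
    d = len(bounds)
    np_ = np_ or 10 * d
    # initialization, eq. (9)
    pop = [[lo + random.random() * (hi - lo) for lo, hi in bounds]
           for _ in range(np_)]
    for _ in range(g_max):
        for i in range(np_):
            r0, r1, r2 = random.sample(
                [k for k in range(np_) if k != i], 3)
            # mutation, eq. (10)
            v = [pop[r0][j] + f * (pop[r1][j] - pop[r2][j])
                 for j in range(d)]
            # crossover, eq. (11), with clipping to the search bounds
            j_rand = random.randrange(d)
            u = [v[j] if random.random() <= cr or j == j_rand
                 else pop[i][j] for j in range(d)]
            u = [min(max(u[j], bounds[j][0]), bounds[j][1])
                 for j in range(d)]
            # selection, eq. (12): keep the trial vector if not worse
            if fitness(u) >= fitness(pop[i]):
                pop[i] = u
    return max(pop, key=fitness)

MODE_WEIGHTS = {"psm": (0.8, 0.05, 0.15), "htm": (0.15, 0.8, 0.05),
                "ecm": (0.05, 0.15, 0.8), "blm": (1/3, 1/3, 1/3)}

def make_fitness(mode, n, pt_max=2.56e-3, mi_max=512):
    """aggregated objective of eq. (8); x = (pt, k) with mi = 2**k,
    n is the sensed noise power (this interface is ours).
    reuses qfunc/p_ber/f_* from the sketch in section 3."""
    w1, w2, w3 = MODE_WEIGHTS[mode]
    def fitness(x):
        pt, k = x[0], int(round(x[1]))
        mi = 2 ** k
        modulation = "psk" if mi <= 8 else "qam"
        pber = max(p_ber(pt / n, mi, modulation), 1e-300)  # guard the log
        return (w1 * f_min_power(pt, pt_max)
                + w2 * f_max_throughput(mi, mi_max)
                + w3 * f_min_ber(pber))
    return fitness
```

a call such as de_optimize(make_fitness('psm', n=1e-6), bounds=[(0.1e-3, 2.56e-3), (1, 9)]) then mirrors the structure of the proposed engine; the actual study uses matlab and the recommended settings f = 0.85, cr = 0.6, np = 10d of [43].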
6. proposed mode-based crde of wireless han
a scenario is considered in which the crde functionality is deployed inside the smart meter, which is connected to the utility load management centre and to the han base station, to achieve an optimal management of the customer's appliances, as shown in fig. 3. the cognitive radio is proposed to select the optimal transmission parameters for the given communication modes within the han of the home management system. fig. 4 shows the proposed mode-based cr system architecture. the cr system extracts the data of the sensed wireless channel conditions via its environmental parameters and passes these data to the multi-objective de. the system weight generation of the cr transmission is used to assign a distinct weighting coefficient to each given transmission mode. these different weighting coefficients are used to vary the priority level among the radio objectives. the largest weighting coefficient is allocated to the objective of the highest priority, as shown in table 2. the other benefit of these weighting coefficients is the conversion of the multi-objective optimization problem into a single-objective optimization problem using the aggregation method. the sensed environmental parameters and the weighting coefficients are assigned to the three transmission mode objectives to construct the multi-objective function. the de optimization engine is used to find an optimal set of transmission parameters. based on the selected communication mode, the communication in the home load management system of the han is established.

figure 2. flowchart of differential evolution.
figure 3. proposed crde for han of smart grid.
figure 4. proposed mode-based cr system architecture.

6.1. simulation results
the proposed mode cognitive decision engine is simulated using the matlab software package. the de parameters used in this work are based on the recommended values in [43] (f = 0.85, cr = 0.6 and np = 10d). for the sake of simplicity, only unlicensed bands are considered in the han of the home management system. therefore, the transmit power ranges from 0.1 mw to 2.56 mw in increments of 0.0256 mw. this range of the transmit power is close to the maximum transmit power level allowed in the unlicensed national information infrastructure (un-iib) band [21]. however, the proposed crde and its results can be extended to other spectrum bands (e.g., licensed or hybrid bands). the modulation index (mi) is selected as any of the nine possible values 2–512. practically, the psk is commonly used for low modulation indices (i.e., when mi is equal to or less than 8), whereas the qam is used for high modulation indices. thus, the modulation schemes used in this work are 2-psk (or bpsk), 4–8 psk and 16–512 qam. the proposed choices of modulation types and indices offer the modulation diversity needed for the cr adaptation requirement. the performance of the mode-based crde system is simulated under the four transmission modes and presented in table 3.

transmission mode  fitness score  generations to max. fitness  generations to mean fitness  time per generation (ms)  mt   mi   pt (mw)
psm                0.9648         3                            8                            1.51                      qam  512  0.1
htm                0.9880         2                            7                            1.54                      qam  512  1.253
ecm                0.9629         3                            9                            1.57                      psk  4    2.56
blm                0.9596         2                            5                            1.63                      qam  512  0.125
table 3. simulation results of the de-based cr system.

the convergence performance of the power saving mode (psm) is shown in fig. 5. the simulation graph presents the convergence progress of the mean (average) fitness over 10 simulation runs, to provide a time-invariant average, and of the maximum (best) fitness with respect to the number of generations. the maximum achievable fitness under this mode is 0.964, and it is captured after three generations, as indicated in table 3. the pareto-front solution achieved for the transmission parameters of the psm returns a transmit power of 0.1 mw and a modulation scheme of 512-qam. the transmit power controlled by the mode cognitive engine is much lower (to save power), while the modulation scheme is relatively high (which means a high ber).

figure 5. power saving mode (psm) convergence.
figure 6. high throughput mode (htm) convergence.
figure 7. emergency communication mode (ecm) convergence.
figure 8. balance communication mode (blm) convergence.

when a large amount of data must be sent or relayed while maintaining a high throughput, the weighting coefficients of the high throughput mode (htm) function are distributed to prioritize maximizing the throughput.
the results are obtained after a fast convergence of only two generations, scoring a fitness value of 0.988 (see fig. 6). for this communication mode, the obtained pareto-front solution is a transmit power of 1.253 mw and a modulation scheme of 512-qam. under this mode, therefore, the modulation scheme is set to 512-qam by the mode cognitive engine, which maximizes the throughput of the system at the cost of a large transmit power and a high ber. under the emergency communication mode (ecm), an additional emphasis is placed on the objective of minimizing the ber to realize a scenario of a reliable transmission and reception. fig. 7 illustrates the results obtained by running the mode-based crde system for a given set of environmental parameters. the maximum achievable fitness is 0.962, and it is obtained within three generations. as shown in table 3, the pareto-front results return a transmit power of 2.56 mw and a modulation scheme of 4-psk for the ecm communication mode. the modulation scheme is kept extremely low by the mode cognitive engine, at the cost of a larger transmit power. for the balanced priority mode (blm), the three constructed single objective functions are all given equal weights. this scenario may be utilized by the mode-based crde system in the case that no particular operational plan exists, or when the system requires no major bias in favour of a specific objective compared with the other objectives. the convergence performance of the system is shown in fig. 8, which demonstrates the optimal solution obtained after only two generations, scoring a maximum achievable fitness of 0.959. under this communication mode, the pareto-front solution provides a transmit power of 0.125 mw and a modulation scheme of 512-qam. the transmit power is relatively small with a high modulation scheme. another important characteristic of the proposed mode-based crde communication system is the time per generation. the cognitive system is stopped after it reaches the generation that gives the highest possible fitness scores. for all the communication modes considered in table 3, the system reaches the highest-scoring fitness after only two to three generations. the average times per one generation are 1.51, 1.54, 1.57 and 1.63 ms for the psm, htm, ecm and blm, respectively. therefore, the total times required for the psm, htm, ecm and blm are 4.53, 3.08, 4.71 and 3.26 ms, respectively, to complete the computation within the required two to three de generations.

6.2. performance comparison
to highlight the difference between the proposed mode-based crde and previously published systems, the proposed mode-based crde is compared with the ga-based cr system used in [21], the maco-based cr in [25] and the pso-based cr in [26]. the comparison is based on the maximum fitness scores for the given communication modes and on the number of generations needed to reach the maximum fitness scores. for the psm, the ga-based cr had a maximum fitness of 0.93 that was reached within approximately 400 generations. for the maco-based system, the provided maximum fitness was 0.9482, which was reached within approximately 470 generations. meanwhile, the maximum score was 0.944, reached within 300 generations, using the pso-based system. on the contrary, the proposed mode-based crde offers a maximum fitness of 0.964 that is reached within three generations.
in the htm communication mode, the fitness score of the ga-based system was 0.938, which was obtained within around 200 generations. the fitness score of the maco-based system was approximately 0.9422 and was obtained within 470 generations. for the pso-based system, the fitness score was about 0.944, obtained within 150 generations. in the mode-based approach, a fitness score of 0.988 was reached within only two generations. the comparison of the convergence performance of each cr system is presented in table 4 for the given communication modes. the proposed mode-based crde had larger fitness scores for all of the communication modes compared with the other crde systems, as shown in table 4. furthermore, the mode-based crde had a faster convergence; the maximum fitness with optimal parameters for the different communication modes was reached within only two to three generations. the mode provides a fast and accurate convergence, which is a critical issue for the cognitive radio to adapt the transmission parameters with respect to the changes in the wireless environment.

6.3. discussion
the results of the proposed mode-based crde indicate that the convergence towards the optimal communication transmission parameters is essentially fast and accurate. for the various communication modes, the proposed system reaches the optimal parameters within only two to three generations of the de to support an active and rapid communication response. the time required to reach the final highest-scored fitness function is another key parameter of the proposed mode-based crde system. for the given four communication modes, the average time per one generation is about 1.5 ms. furthermore, the fitness scores of the given modes are higher than 95 %, thus supporting the accuracy requirement of active communication systems. the fast and accurate convergence are the most important key properties of the proposed mode-based crde system. regarding the obtained pareto-front solutions for the given communication modes, they tend to match the target objective of each mode while sacrificing the other objectives. according to the results, the proposed mode-based crde performs an optimal management of the originally scarce spectrum resources of the in-home wireless communication of smart grids. this system helps customers manage and schedule the available appliances for the given optimal signals of a home-management-based utility pricing scheme. moreover, the proposed cr communication system maintains the essential use of the wireless spectrum environment of the han.

7. conclusion
a mode-based crde system is proposed to adapt the transmission parameters according to the sensed parameters of a wireless environment in the han of a smart grid. multiple conflicting objectives have been aggregated into one overall multi-objective function using variable weighting coefficients that can be adjusted to define distinct transmission scenarios. the selection among these transmission scenarios is assumed to be based on user preferences or possibly on environmental conditions. these distinct transmission modes are the psm, htm, ecm and blm. the proposed de decision engine is developed to enable the optimization of the transmission parameters for a given set of sensed environmental parameters that can satisfy the requirements of the predefined transmission mode.
the performance of the proposed mode-based crde system is evaluated and verified under the proposed transmission modes.

mode  ga-based cr [21]        maco-based cr [25]      pso-based cr [26]       proposed de-based cr
      fitness   generations   fitness   generations   fitness   generations   fitness   generations
psm   0.930     400           0.9482    470           0.9444    300           0.9648    3
htm   0.938     200           0.9422    470           0.9444    150           0.9880    2
ecm   0.800     400           0.8523    470           0.7776    300           0.9629    3
blm   –         –             0.8460    470           0.8765    150           0.9596    2
table 4. comparison of convergence performance.

the proposed mode decision engine can converge to the optimal transmission solutions within only two to three evolutionary generations, with fitness scores greater than 95 % for all of the communication modes. furthermore, the total time required to reach the highest-scored fitness for the given communication modes is approximately 4.5 ms. the results exhibit the superiority of the proposed mode-based crde systems in terms of accuracy and convergence as compared to the previous works presented in the literature. the convergence speed and the accuracy of the fitness function make the proposed mode decision engine feasible for a cr application in a home management system of a smart grid [28]. in the upcoming work, a multi-criteria decision making (mcdm) method will be used to sort all possible optimal communication parameters from the best to the worst, based on predefined criteria.

acknowledgements
the authors would like to thank mustansiriyah university (www.uomustansiriyah.edu.iq), baghdad, iraq, for its support with the present work.

references
[1] h. tarish haider, o. hang see, w. elmenreich. a review of residential demand response of smart grid. renewable and sustainable energy reviews 59:166–178, 2016. doi:10.1016/j.rser.2016.01.016.
[2] r. lu, s. hong. incentive-based demand response for smart grid with reinforcement learning and deep neural network. applied energy 236:937–949, 2019. doi:10.1016/j.apenergy.2018.12.061.
[3] s. hong, m. yu, x. huang. a real-time demand response algorithm for heterogeneous devices in buildings and homes. energy 80:123–132, 2014. doi:10.1016/j.energy.2014.11.053.
[4] h. tarish haider, o. hang see, w. elmenreich. optimal residential load scheduling based on time varying pricing scheme. in 13th ieee student conference on research & development 2015 (scored), pp. 210–214. kuala lumpur, 2015. doi:10.1109/scored.2015.7449326.
[5] m. e. bayrakdar, a. çalhan. artificial bee colony-based spectrum handoff algorithm in wireless cognitive radio networks. international journal of communication systems 31:1–16, 2018. doi:10.1002/dac.3495.
[6] m. barbeau, g. cervera, j. garcia-alfaro, e. kranakis. channel selection using a multiple radio model. journal of network and computer applications 64:113–123, 2016. doi:10.1016/j.jnca.2016.01.021.
[7] m. e. bayrakdar, a. çalhan. non-preemptive queueing model of spectrum handoff scheme based on prioritized data traffic in cognitive wireless networks. etri journal 39:558–569, 2017. doi:10.4218/etrij.17.0116.0850.
[8] s. alam, m. f. sohail, s. a. ghauri, et al. cognitive radio based smart grid communication network. renewable and sustainable energy reviews 72:535–548, 2017. doi:10.1016/j.rser.2017.01.08.
[9] u. khan, n. dilshad, m. h. rehmani, t. umer. fairness in cognitive radio networks: models, measurement methods, applications, and future research directions. journal of network and computer applications 73:12–26, 2016. doi:10.1016/j.jnca.2016.07.008.
[10] o. bicen, v. cagri gungor, o. akan. delay-sensitive and multimedia communication in cognitive radio sensor networks. ad hoc networks 10:816–830, 2012. doi:10.1016/j.adhoc.2011.01.021.
[11] j. palicot, c. moy, b. résimont, r. bonnefoi. application of hierarchical and distributed cognitive architecture management for the smart grid. ad hoc networks 41:86–98, 2016. doi:10.1016/j.adhoc.2015.12.002.
[12] m. faheem, s. shah, r. butt, et al. smart grid communication and information technologies in the perspective of industry 4.0: opportunities and challenges. computer science review 30:1–30, 2018. doi:10.1016/j.cosrev.2018.08.001.
[13] s. althunibat, q. wang, f. granelli. flexible channel selection mechanism for cognitive radio based last mile smart grid communications. ad hoc networks 41:47–56, 2016. doi:10.1016/j.adhoc.2015.10.008.
[14] o. ergul, o. bicen, o. akan. opportunistic reliability for cognitive radio sensor actor networks in smart grid. ad hoc networks 41:1–10, 2015. doi:10.1016/j.adhoc.2015.10.003.
[15] u. premarathne, i. khalil, m. atiquzzaman. trust based reliable transmission strategies for smart home energy management in cognitive radio based smart grid. ad hoc networks 41:15–29, 2016. doi:10.1016/j.adhoc.2015.12.004.
[16] a. abuadbba, i. khalil, a. ibaida, m. atiquzzaman. resilient to shared spectrum noise scheme for protecting cognitive radio smart grid readings bch based steganographic approach. ad hoc networks 41:30–46, 2016. doi:10.1016/j.adhoc.2015.11.002.
[17] u. premarathne, i. khalil, m. atiquzzaman. secure and reliable surveillance over cognitive radio sensor networks in smart grid. pervasive and mobile computing 22:3–15, 2015. doi:10.1016/j.pmcj.2015.05.001.
[18] h. wang, y. qian, h. sharif. multimedia communications over cognitive radio networks for smart grid applications. ieee wireless communications 20:125–132, 2013. doi:10.1109/mwc.2013.6590059.
[19] a. saghiri, m. meybodi. an approach for designing cognitive engines in cognitive peer-to-peer networks. journal of network and computer applications 70:17–40, 2016. doi:10.1016/j.jnca.2016.05.012.
[20] w. khan, m. zeeshan. qos-based dynamic channel selection algorithm for cognitive radio based smart grid communication network. ad hoc networks 87:61–75, 2019. doi:10.1016/j.adhoc.2018.11.007.
[21] t. r. newman, b. a. barker, a. m. wyglinski, et al. cognitive engine implementation for wireless multicarrier transceivers. wireless communications and mobile computing 7:1129–1142, 2007. doi:10.1002/wcm.486.
[22] t. r. newman, r. rajbanshi, a. m. wyglinski, et al. population adaptation for genetic algorithm-based cognitive radios. mobile networks and applications 13:442–451, 2008. doi:10.1007/s11036-008-0079-8.
[23] c. huynh, w. lee.
multicarrier cognitive radio system configuration based on interference analysis by two dimensional genetic algorithm. in advanced technologies for communications (atc), 2011 international conference, pp. 85–88. da nang, vietnam, 2011. doi:10.1109/atc.2011.6027441.
[24] c. k. huynh, w. c. lee. two-dimensional genetic algorithm for ofdm-based cognitive radio systems. in 2011 ieee 3rd international conference on communication software and networks (iccsn), pp. 100–105. 2011. doi:10.1109/iccsn.2011.6013553.
[25] n. zhao, s. li, z. wu. cognitive radio engine design based on ant colony optimization. wireless personal communications 65:15–24, 2012. doi:10.1007/s11277-011-0225-7.
[26] z. zhao, s. xu, s. zheng, j. shang. cognitive radio adaptation using particle swarm optimization. wireless communications and mobile computing 9:875–881, 2009. doi:10.1002/wcm.633.
[27] x. tan, h. zhang, j. hu. a hybrid architecture of cognitive decision engine based on particle swarm optimization algorithms and case database. annals of telecommunications – annales des télécommunications 69:593–605, 2014. doi:10.1007/s12243-013-0417-0.
[28] v. c. gungor, d. sahin, t. kocak, et al. a survey on smart grid potential applications and communication requirements. ieee transactions on industrial informatics 9(1):28–42, 2013. doi:10.1109/tii.2012.2218253.
[29] e. palacios-garcía, a. chen, i. santiago, et al. stochastic model for lighting's electricity consumption in the residential sector. impact of energy saving actions. energy and buildings 89:245–259, 2015. doi:10.1016/j.enbuild.2014.12.028.
[30] w. chen, t. li, t. yang. intelligent control of cognitive radio parameter adaption: using evolutionary multi-objective algorithm based on user preference. ad hoc networks 26:3–16, 2015. doi:10.1016/j.adhoc.2014.09.006.
[31] e. hossain, d. niyato, d. in kim. evolution and future trends of research in cognitive radio: a contemporary survey. wireless communications and mobile computing 15:1530–1564, 2013. doi:10.1002/wcm.2443.
[32] p. m. pradhan, g. panda. comparative performance analysis of evolutionary algorithm based parameter optimization in cognitive radio engine: a survey. ad hoc networks 17:129–146, 2014. doi:10.1016/j.adhoc.2014.01.010.
[33] e.-g. talbi. metaheuristics: from design to implementation, vol. 74. john wiley & sons, hoboken, new jersey, usa, 2009.
[34] g.-q. zeng, j. chen, l.-m. li, et al. an improved multi-objective population-based extremal optimization algorithm with polynomial mutation. information sciences 330:49–73, 2016. doi:10.1016/j.ins.2015.10.010.
[35] c. a. c. coello, d. a. van veldhuizen, g. b. lamont. evolutionary algorithms for solving multi-objective problems. springer science & business media, new york, usa, 2002. doi:10.1007/978-1-4757-5184-0.
[36] k. v. price, r. storn, j. lampinen. differential evolution: a practical approach to global optimization. springer-verlag berlin heidelberg, 2005. doi:10.1007/3-540-31306-0.
[37] r. marler, j. arora. the weighted sum method for multi-objective optimization: new insights. structural and multidisciplinary optimization 41:853–862, 2010. doi:10.1007/s00158-009-0460-7.
[38] m. a. fotouhi ghazvini, p. faria, s. ramos, et al. incentive-based demand response programs designed by asset-light retail electricity providers for the day-ahead market. energy 82:786–799, 2015. doi:10.1016/j.energy.2015.01.090.
[39] r. storn, k. price. differential evolution – a simple and efficient heuristic for global optimization over continuous spaces.
acta polytechnica 60(2):111–121, 2020. doi:10.14311/ap.2020.60.0111. © czech technical university in prague, 2020. available online at https://ojs.cvut.cz/ojs/index.php/ap

analytical solution of (2+1) dimensional dirac equation in time-dependent noncommutative phase-space

ilyas haouam

université frères mentouri, laboratoire de physique mathématique et de physique subatomique (lpmps), constantine 25000, algeria
correspondence: ilyashaouam@live.fr

abstract. in this article, we studied the system of a (2+1) dimensional dirac equation in a time-dependent noncommutative phase-space. more specifically, we investigated the analytical solution of the corresponding system by the lewis-riesenfeld invariant method, based on the construction of the lewis-riesenfeld invariant. we obtained the time-dependent dirac hamiltonian of the problem in question from a time-dependent bopp-shift translation, then used it to set up the lewis-riesenfeld invariant operators.
thereafter, the obtained results were used to express the eigenfunctions that lead to the determination of the general solution of the system.

keywords: lewis-riesenfeld invariant method, time-dependent bopp-shift translation, bopp's shift, time-dependent dirac equation, time-dependent noncommutative phase-space.

1. introduction

it is known that heisenberg suggested the option of a noncommutative (nc) space-time in 1930, and that in 1947 snyder formalized it [1, 2] in response to the necessity of regularizing the divergences of quantum field theory. in recent years, noncommutative geometry (ncg) has become very interesting for studying several physical problems, and it has become clear that there is a strong connection between ncg and string theories. studies of this geometric type have incorporated important physical concepts and tools, and have been useful in various fields of physics, particularly in matrix theory (the bfss matrix model (1997)) [3]. ncg has also been involved in the description of quantum gravity theories [4], the aharonov-bohm effect [5], the aharonov-casher effect [6], etc. [7]. the origins of ncg are related to the investigations of topological spaces through c∗-algebras of functions. later, this type of geometry was theorized by a. connes and others in 1985 [8–12] by studying and defining a cyclic cohomology; it was shown that the differential calculus on manifolds has an nc equivalent. subsequently, ncg found great encouragement through several mathematical results, such as the k-theory of c∗-algebras, the gelfand-naïmark theorem on c∗-algebras, characterizations of commutative von neumann algebras, the cyclic cohomology of the c∞(m) algebra, relations between dirac operators and riemannian metrics, the serre-swan theorem, etc. the idea of phase-space noncommutativity is largely motivated by the foundations of quantum mechanics through canonical quantization. it is easy to implement phase-space noncommutativity using the ordinary product with weyl operators (weyl-wigner maps) [12], by replacing the ordinary product with the moyal-weyl product (⋆-product) in the functions and actions of our systems [13, 14], by bopp-shift linear transformations [15, 16], or by seiberg-witten maps [8, 10, 17]. studying physics within ncg has attracted a lot of interest in recent years, because noncommutativity is necessary when considering the low-energy effective dynamics of d-branes with a background magnetic field; moreover, at the tiny scale of strings, or at very high energies, the effects of noncommutativity may appear. besides, one of the strong motivations for ncg is to obtain a coherent mathematical framework in which it would be possible to describe quantum gravitation. for all these reasons and advantages, we carry out this work in the nc formalism. in addition, it is interesting to find other models in which noncommutativity emerges. several scientific works have focused on time-independent noncommutativity: experimental research has considered nc parameters of fixed values in the context of the cosmic microwave background radiation, perhaps considered approximately fixed with respect to the celestial sphere, as proposed, for example, in ref. [18]. in our work, however, we deliberately involve time-dependency in the nc parameters, because of the possibility that the nc parameters may show such a time-dependency.
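as a brief aside, the ⋆-product route mentioned above can be made concrete with a few lines of computer algebra. the sketch below is our own illustration in python/sympy, not code from the paper; the function `star` and the truncation to first order in θ are our assumptions. it checks that the first-order moyal-weyl product on the plane already reproduces the coordinate noncommutativity [x, y]_⋆ = iθ.

```python
# first-order moyal-weyl product, f * g = f g + (i/2) theta^{jk} d_j f d_k g,
# with theta^{12} = -theta^{21} = theta on the (x, y) plane.
import sympy as sp

x, y, theta = sp.symbols('x y theta', real=True)

def star(f, g):
    """first-order truncation of the moyal-weyl star product."""
    poisson = sp.diff(f, x) * sp.diff(g, y) - sp.diff(f, y) * sp.diff(g, x)
    return sp.expand(f * g + sp.I * theta / 2 * poisson)

bracket = sp.simplify(star(x, y) - star(y, x))
print(bracket)                 # -> i*theta, the deformed commutator [x, y]
assert bracket == sp.I * theta
```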
for instance, physical measurements must take into account the effect of the earth's rotation around its axis, which causes a time-dependency in the nc parameters. the motivations for choosing to study the (2+1) dimensional dirac equation are due to several important works in this context, such as the investigations of landau levels [19], the oscillation of magnetization [20], weiss oscillations [21], the de haas-van alphen effect [22], analysis through coherent states [23], the motion of electron carriers in graphene and other materials [24], etc. in particular, the 2 dimensional dirac equation in interaction with a homogeneous magnetic field has various applications in graphene, such as in refs. [25, 26], and in studying the quantum hall effect and the fractional hall effect [27, 28], the berry phase [29], etc. in graphene and in other materials, such as weyl semimetals, an important phenomenon takes place if a magnetic field and a uniform electric field are introduced: the spacing between different landau levels decreases when the electric field strength reaches a critical value [30, 31]. in our study of a (2+1) dimensional dirac system, the noncommutativity will be considered time-dependent through a time-dependent darboux transformation, "bopp's shift". this, in turn, makes the studied system a time-dependent one, h(x_i^nc, p_i^nc) → h(t). solving systems of equations in interaction with time-dependent potentials has attracted many physicists over the past years. in addition to the essential mathematical benefit, this topic is related to a lot of physical problems and applications, for instance, in quantum transport [32–34], quantum optics [35–37], quantum information [38], the degenerate parametric amplifier [39], spintronics [40, 41], and in the description of the dynamics of two trapped cold ions in the paul trap [42]. to study systems of time-dependent equations, there are many methods, such as the evolution operator, the change of representation, unitary transformations, the path integral, second quantization and the lewis-riesenfeld (lr) invariant approach; other techniques are also in use, as in refs. [43, 44]. the lr method [45, 46] is a technique that allows obtaining a set of solutions of time-dependent equation systems through the eigenstates of lr invariants. these invariants are built to find the solutions of such systems of equations; lewis and riesenfeld, in their original paper, presented a technique to obtain a set of exact wave-functions for the time-dependent harmonic oscillator in hilbert space. the lr approach has been applied in many settings, such as mesoscopic r(t)l(t)c(t) electric circuits, where the quantum evolution is described [47], as well as in engineering, in shortcuts to adiabaticity [48]. a large variety of scientific papers concerning time-dependent systems were interested in the time-dependent harmonic oscillator or in time-dependent linear potentials; in our current work, to be more specific, we treat the time-dependent background of the nc phase-space. we consider a time-dependent bopp-shift translation to transform the system into a time-dependent nc one; then, by means of the lr invariant method, we obtain the lr invariant and its eigenstates to solve our system's equations.

2. time-dependent noncommutativity

in theory, the coordinates of an nc space may not commute anymore (i.e., ab ≠ ba).
in a d dimensional time-dependent nc phase-space, let us consider the operators of coordinates and momenta, x_j^nc and p_k^nc, respectively. these nc operators satisfy the deformed commutation relations [49]

[x_j^nc, x_k^nc] = i θ_jk(t),   [p_j^nc, p_k^nc] = i η_jk(t),   [x_j^nc, p_k^nc] = i ℏ_eff δ_jk,   (j, k = 1, …, d), (1)

the effective planck constant being

ℏ_eff = ℏ (1 + θη/(4ℏ²)), (2)

where θη/(4ℏ²) ≪ 1 is the consistency condition with the usual quantum mechanics. δ_jk is the kronecker delta, and θ_jk, η_jk are real constant antisymmetric d×d matrices. in some studies concerning the nc parameters, as in the experiment by nesvizhevsky et al. [50, 51], we note that θ ≈ 10⁻³⁰ m² and η ≈ 1.76 × 10⁻⁶¹ kg²m²s⁻². other bounds exist: for example, θ ≈ 4 × 10⁻⁴⁰ m² when assuming natural units, ℏ = c = 1 [52]. likewise, taking into account that the experimental energy resolution is related to the uncertainty principle because of the finite lifetime of the neutron leads to η ≈ 10⁻⁶⁷ kg²m²s⁻² (a kind of correction). these results, including the experiment by nesvizhevsky et al., allow us to evaluate the consistency condition of the nc model, |θη/(4ℏ²)| ≪ 10⁻²⁴. but if we consider the modifications introduced by noncommutativity over the value of ℏ (the precision is about 10⁻⁹), which are at least about 24 orders of magnitude smaller than its value, together with the corrected bounds of η, we have |θη/(4ℏ²)| ≪ 10⁻²⁹ [53]. these values agree with the upper limits on the basic scales of coordinate and momentum. these limits will be suppressed if the magnetic field used in the experiment is weak, about b ≈ 5 mg. as long as the system in which we investigate the effects of noncommutativity is 2 dimensional, we restrict ourselves to the following nc algebra

[x_j^nc, x_k^nc] = i θ e^{γt} ε_jk,   [p_j^nc, p_k^nc] = i η e^{−γt} ε_jk,   [x_j^nc, p_k^nc] = i ℏ_eff δ_jk,   (j, k = 1, 2), (3)

where ε_12 = −ε_21 = 1, ε_11 = ε_22 = 0, and θ, η are real-valued with the dimension of length² and momentum², respectively. while the space coordinates and momenta are fuzzy and fluid [54], they cannot be localized, except in the limit of minus infinite time. the parameters θ, η represent the fuzziness and γ represents the fluidity of the space. the above algebra is that of the ordinary ncg, except that the nc structure constants are considered as exponentially increasing functions of time. certainly, there is a multitude of other possibilities, such as θ(t) = θ cos(γt), η(t) = η sin(γt). the new deformed geometry can be described by the operators

x_1^nc = x^nc = x − (θ/2ℏ) e^{γt} p_y,   p_1^nc = p_x^nc = p_x + (η/2ℏ) e^{−γt} y,
x_2^nc = y^nc = y + (θ/2ℏ) e^{γt} p_x,   p_2^nc = p_y^nc = p_y − (η/2ℏ) e^{−γt} x. (4)

when γ = 0, the time-dependency in the structure of the nc parameters vanishes. in addition, for θ = η = 0, the ncg reduces to a commutative one, and the coordinates x_j and momenta p_k satisfy the ordinary canonical commutation relations

[x_j, x_k] = 0,   [p_j, p_k] = 0,   [x_j, p_k] = i ℏ δ_jk,   (j, k = 1, 2). (5)

3. (2+1) d explicitly time-dependent dirac equation and its invariant operator

3.1. (2+1) d dirac equation in time-dependent noncommutative phase-space

in the presence of an electromagnetic four-potential a_µ = (a_0, a_i), the dirac equation in (2+1) d is given by

( c α_i (p_i − (e/c) a_i(x)) + e a_0(x) + β m c² ) |ψ⟩ = iℏ ∂/∂t |ψ⟩, (6)

with |ψ⟩ being the dirac wave function and p_j = (p_x, p_y, p_z)ᵀ the momentum.
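as a sanity check on the shift (4) against the algebra (3), one can expand the commutators of the shifted operators symbolically and insert the canonical relations (5) by hand. the sketch below is our own construction (not code from the paper), written with sympy's quantum `Commutator`/`Operator` classes; the helper `apply_canonical` is ours. it should reproduce iθe^{γt}, iηe^{−γt} and iℏ_eff, and the last lines redo the order-of-magnitude estimate of the consistency condition with the bounds quoted above.

```python
from sympy import I, exp, simplify, symbols
from sympy.physics.quantum import Commutator, Operator

hbar, theta, eta, gamma, t = symbols('hbar theta eta gamma t', positive=True)
x, y, px, py = map(Operator, ['x', 'y', 'px', 'py'])

# the time-dependent bopp shift (4)
xnc  = x  - theta * exp(gamma * t) / (2 * hbar) * py
ync  = y  + theta * exp(gamma * t) / (2 * hbar) * px
pxnc = px + eta * exp(-gamma * t) / (2 * hbar) * y
pync = py - eta * exp(-gamma * t) / (2 * hbar) * x

def apply_canonical(expr):
    """insert [x, px] = [y, py] = i*hbar; every other basic commutator is 0."""
    for c in expr.atoms(Commutator):
        a, b = map(str, c.args)
        if {a, b} == {'x', 'px'} or {a, b} == {'y', 'py'}:
            val = I * hbar if a in ('x', 'y') else -I * hbar
        else:
            val = 0
        expr = expr.subs(c, val)
    return simplify(expr)

def cc(a, b):
    return apply_canonical(Commutator(a, b).expand(commutator=True))

print(cc(xnc, ync))     # expected: i*theta*exp(gamma*t), as in (3)
print(cc(pxnc, pync))   # expected: i*eta*exp(-gamma*t), as in (3)
print(cc(xnc, pxnc))    # expected: i*hbar*(1 + theta*eta/(4*hbar**2)), i.e. i*hbar_eff

# order-of-magnitude estimate of the consistency condition with the bounds
# quoted above (theta ~ 1e-30 m^2, eta ~ 1.76e-61 kg^2 m^2 s^-2):
ratio = 1e-30 * 1.76e-61 / (4 * 1.054571817e-34 ** 2)
print(f'theta*eta/(4*hbar^2) ~ {ratio:.0e}')   # ~ 4e-24, indeed << 1
```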
the dirac matrices αj , β αj = ( 0 σj σj 0 ) ,α1 = σ1 = ( 0 1 1 0 ) ,α2 = σ2 = ( 0 −i i 0 ) ,β = σ3 = ( 1 0 0 −1 ) , i = ( 1 0 0 1 ) , (7) satisfy the following anticommutation relations {αi,αj} = 2δij , {αi,β} = 0 with α2i = β 2 = 1. (8) we consider the magnetic field −→ b along z-direction, and it is defined in terms of the symmetric potential ai = b 2 (−y,x, 0) , with a0 = 0, (9) most research about time-dependent systems concerns the presence of an electric field. however, in our current work, we do not rely on the electric field. using eq.(9), the hamiltonian of the system becomes h(x,y,pxpy) = cα1px + cα2py + eα1 b 2 y −eα2 b 2 x + βmc2. (10) achieving the ncg in the dirac hamiltonian (10) as follows h ( xnc,ync,pncx ,p nc y ) = cα1pncx + cα2p nc y −eα2 b 2 xnc + eα1 b 2 ync + βmc2, (11) by applying eq.(4), we necessarily express the new nc hamiltonian using the commutative variables {x,y,px,py}, and by assuming that ~ = c = 1 (natural units) to simplify the calculations, then we obtain hnc(x,y,pxpy, t) = α1(1 + eb 4 θeγt)px−α2( eb 2 + η 2 e−γt)x+α2(1 + eb 4 θeγt)py +α1( eb 2 + η 2 e−γt)y +βm. (12) 113 ilyas haouam acta polytechnica the time-dependent dirac equation in nc phase-space is given by i ∂ ∂t ∣∣ψ̄ (t)〉 = hnc (t) ∣∣ψ̄ (t)〉 , (13) where ∣∣ψ̄ (t)〉 is the dirac nc wave function. 3.2. the construction of the lewis-riesenfeld invariants to solve eq.(13), we use the lr invariant method, which assumes the existence of a quantum-mechanical invariant i(t) which satisfies di(t) dt = −i [i(t),hnc (t)] + ∂i(t) ∂t = 0, (14) with i ∂ ∂t ( i(t) ∣∣ψ̄ (t)〉) = hnc (t) i(t) ∣∣ψ̄ (t)〉 . (15) the eq.(14) is called the invariance condition for the dynamical invariant operator i(t), which is a hermitian operator i(t) = i+(t). (16) assuming that i(t) = a1(t)px + b1(t)x + a2(t)py + b2(t)y + c(t), (17) with a1(t), b1(t), a2(t), b2(t), c(t) are time-dependent matrices. the substitution of eqs.(17, 11) into eq.(14) and using the properties of the commutation relations lead to [i,hnc] + i ∂i ∂t = [a1px,hnc] + [b1x,hnc] + [a2py,hnc] + [b2y,hnc] + [c,hnc] + i ∂i ∂t = 0, (18) for simplicity, we take fθ(t) = 1 + eb4 θe γt and fη(t) = eb2 + η 2 e −γt, which are not matrices, then we have [a1,α1fθ] p2x + [a2,α2fθ] p2y − [b1,α2fη] x2 + [b2,α1fη] y2 + {[a1,α2fθ] + [a2,α1fθ]}pxpy +{[a1,α1fη] + [b2,α1fθ]}ypx + { [a1,βm] + [c,α1fθ] + i∂a1∂t } px + { [a2,βm] + [c,α2fθ] + ∂a2∂t } py + { [b1,βm] − [c,α2fη] + i∂b1∂t } x + { [b2,βm] + [c,α1fη] + i∂b2∂t } y + {− [a1,α2fη] + [b1,α1fθ]}xpx +{[b1,α2fθ] − [a2,α2fη]}xpy + {[b1,α1fη] − [b2,α2fη]}xy + {[a2,α1fη] + [b2,α2fθ]}ypy +ia1α2fη + ib1α1fθ − ia2α1fη + ib2α2fθ − i [b1,α1fθ] − i [b2,α2fθ] + [c,βm] + i∂c∂t = 0 . (19) then, to satisfy eq.(14), and always taking advantage of the properties of commutation relations, with pipj = pjpi, xipj = pjxi if i 6=j ∈{1,2}, else pxx = xpx − i, pyy = ypy − i. we demand [a1,α1fθ] = 0, (20) [a2,α2fθ] = 0, (21) [b1,α2fη] = 0, (22) [b2,α1fη] = 0, (23) [a1,β] m + [c,α1fθ] + i ∂a1 ∂t = 0, (24) [a2,β] m + [c,α2fθ] + i ∂a2 ∂t = 0, (25) [b1,β] m− [c,α2fη] + i ∂b1 ∂t = 0, (26) [b2,β] m + [c,α1fη] + i ∂b2 ∂t = 0, (27) [a1,α2fθ] + [a2,α1fθ] = 0, (28) [b1,α1fθ] − [a1,α2fη] = 0, (29) [b1,α2fθ] − [a2,α2fη] = 0, (30) [b1,α1fη] − [b2,α2fη] = 0, (31) [b2,α1fθ] + [a1,α1fη] = 0, (32) 114 vol. 60 no. 2/2020 analytical solution of (2+1) dimensional dirac equation. . . [a2,α1fη] + [b2,α2fθ] = 0, (33) ia1α2fη + ib1α1fθ − ia2α1fη + ib2α2fθ − i{[b1,α1fθ] + [b2,α2fθ]} + [c,βm] + i ∂c ∂t = 0. 
(34) from the relations (20 23), and as long as from eq.(20), we have a1 = a0(t) + a1(t)α1 + a2(t)α21 + a3(t)α 3 1 + a4(t)α 4 1 + ... = a ′ 0(t) + a ′ 1(t)α1, with a ′ i(t) = ai(t), (35) therefore, we obtain a1 = a1 + a2α1, (36) a2 = a3 + a4α2, (37) b1 = b1 + b2α2, (38) b2 = b3 + b4α1. (39) from eqs.(24 27) and with the same manner, supposing that c is written in terms of α1, α2 and β as follows c = c1 + c2α1 + c3α2 + c4β, (40) where aj, bj and cj (with j = 1, .., 4) are supposed to be time-dependent arbitrary functions. substituting eqs.(40, 36) into eq.(24) and eqs.(40, 37) into eq.(25), and taking into consideration eq.(8), yields ∂a1 ∂t = 0, ∂a3 ∂t = 0, a2 = a4 = c2 = c3 = c4 = 0 , (41) thereafter, substituting eqs.(40, 38) into eq.(26) and eqs.(40, 39) into eq.(27), and taking into consideration eq.(8), yields ∂b1 ∂t = 0, ∂b3 ∂t = 0, b2 = b4 = c2 = c4 = 0 . (42) from the eqs.(41, 42) we note that a1, a3, b1, b3 are time-independent constants. we have a1 = a1, a2 = a3, b1 = b1, b2 = b3, c = c1 . (43) in addition, from eqs.(30, 32), and assuming that there exist χ(t) and ϕ(t), which are time-dependent matrices, with [χ(t),α2] = [ϕ(t),α1] = 0. the time-dependency may appear as follows b1fθ −a3fη = χ(t) b3fθ + a1fη = ϕ(t) . (44) now, substituting eq.(43) into eq.(34) and using eq.(44) gives us ∂c1 ∂t = −{a1fη + b3fθ}α2 −{b1fθ −a3fη}α1, (45) using system of relations (44), we find ∂c1 ∂t = 0 and χ = ϕ = 0. (46) last but not least, the dynamical invariant (17) of the time-dependent nc dirac equation can be written as follows i = a1px + b1x + a3py + b3y + c1, (47) we inferred that eq.(14) is verified and c1 should be a constant. we may also note that all the spin-dependent parts, which are proportional to αj, β disappear. which means that i has no spin-dependency, but it is proportional to the matrix of identity in the spinor of space. 115 ilyas haouam acta polytechnica 3.3. eigenvalues and eigenstates of i and h(t) supposing that the invariant in general i(t) is a complete set of eigenfunctions |φ(λ,k)〉 (in this subsection, the analysis is not concerning only on time-independent invariants), with λ being the corresponding eigenvalue (spectrum of the operator), and k represents all other necessary quantum numbers to specify the eigenstates, the eigenvalues equation is written as i(t) |φ(λ,k)〉 = λ |φ(λ,k)〉 , (48) where |φ(λ,k)〉 are an orthogonal eigenfunctions〈 φ(λ,k) | φ(λ ′ ,k ′ ) 〉 = δλλ′ δkk′ . (49) according to eq.(16), the eigenvalues are real and not time-dependent. deriving eq.(48) in time, we find ∂i ∂t |φ(λ,k)〉 + i ∂ ∂t |φ(λ,k)〉 = ∂λ ∂t |φ(λ,k)〉 + λ ∂ ∂t |φ(λ,k)〉 , (50) we apply eq.(14) over the eigenfunctions |φ(λ,k)〉, we have i ∂i ∂t |φ(λ,k)〉 + ihnc |φ(λ,k)〉−hncλ |φ(λ,k)〉 = 0, (51) the scalar product of eq.(51) by 〈 φ(λ ′ ,k ′ ) ∣∣∣ is i 〈 φ(λ ′ ,k ′ ) ∣∣∣∣∂i∂t ∣∣∣∣φ(λ,k) 〉 + ( λ ′ −λ )〈 φ(λ ′ ,k ′ ) |hnc|φ(λ,k) 〉 = 0, (52) which implies 〈 φ(λ ′ ,k ′ ) ∣∣∣∣∂i∂t ∣∣∣∣φ(λ,k) 〉 = 0, (53) the scalar product of eq.(50) by 〈 φ(λ ′ ,k ′ ) ∣∣∣is〈 φ(λ ′ ,k ′ ) ∣∣∣∣∂i∂t ∣∣∣∣φ(λ,k) 〉 = ∂λ ∂t , (54) from eq.(53), the eq.(54) shows that 〈 φ(λ ′ ,k ′ ) ∣∣∣∣∂i∂t ∣∣∣∣φ(λ,k) 〉 = ∂λ ∂t = 0. (55) while the eigenvalues are time-independent, the eigenstates should be time-dependent. 
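before constructing these eigenstates explicitly, the cancellation behind the invariant (47) can be checked mechanically: for a spin-scalar ansatz, only the four cross commutators [p_x, x], [x, p_x], [p_y, y] and [y, p_y] survive in [i, h_nc]. the sketch below is our own sympy construction (not the author's code), with ℏ = c = 1 as in the text; the operator names and the helper `evaluate` are ours. it should reduce the commutator to i(α_1(b_1 f_θ − a_3 f_η) + α_2(a_1 f_η + b_3 f_θ)), which vanishes exactly under (44) with χ = φ = 0.

```python
from sympy import I, simplify, symbols
from sympy.physics.quantum import Commutator, Operator

a1, a3, b1, b3, f_theta, f_eta, m = symbols('a1 a3 b1 b3 f_theta f_eta m')
x, y, px, py = map(Operator, ['x', 'y', 'px', 'py'])
al1, al2, beta = map(Operator, ['alpha1', 'alpha2', 'beta'])

# hamiltonian (12) and the candidate invariant (47); the constant c1
# commutes with everything and is dropped.
hnc = al1*f_theta*px - al2*f_eta*x + al2*f_theta*py + al1*f_eta*y + beta*m
inv = a1*px + b1*x + a3*py + b3*y

def evaluate(expr):
    """insert [x, px] = [y, py] = i (hbar = 1); spin operators commute with
    the phase-space operators, and [x, y] = [px, py] = 0."""
    for c in expr.atoms(Commutator):
        a, b = map(str, c.args)
        if {a, b} == {'x', 'px'} or {a, b} == {'y', 'py'}:
            val = I if a in ('x', 'y') else -I
        else:
            val = 0
        expr = expr.subs(c, val)
    return expr

comm = evaluate(Commutator(inv, hnc).expand(commutator=True))
print(simplify(comm))
# expected: i*(alpha1*(b1*f_theta - a3*f_eta) + alpha2*(a1*f_eta + b3*f_theta));
# imposing (44) with chi = phi = 0 should make it vanish:
print(simplify(comm.subs({b1: a3*f_eta/f_theta, b3: -a1*f_eta/f_theta})))  # -> 0
```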
in order to find the link between the eigenstates of the invariant i(t) and the solutions of the relativistic dirac equation, we first start with writing the motion equation of |φ(λ,k)〉, so that, using eq.(50) and eq.(55), we obtain ∂i ∂t |φ(λ,k)〉 = (λ− i) ∂ ∂t |φ(λ,k)〉 , (56) by using the scalar product with 〈 φ(λ ′ ,k ′ ) ∣∣∣, and taking eq.(52) to eliminate 〈φ(λ′,k′ ) ∣∣∂i∂t ∣∣φ(λ,k)〉, then we obtain i 〈 φ(λ ′ ,k ′ ) ∣∣∣∣(λ−λ′) ∂∂t ∣∣∣∣φ(λ,k) 〉 = ( λ−λ ′ )〈 φ(λ ′ ,k ′ ) |hnc|φ(λ,k) 〉 , (57) for λ ′ 6= λ, we deduce i 〈 φ(λ ′ ,k ′ ) ∣∣∣∣ ∂∂t ∣∣∣∣φ(λ,k) 〉 = 〈 φ(λ ′ ,k ′ ) |hnc|φ(λ,k) 〉 , (58) then we deduce immediately that |φ(λ,k)〉 satisfy the dirac equation, that is to say |φ(λ,k)〉 are particular solutions of the dirac equation. it is assumed that a phase has been taken, but it is still always possible to multiply it by an arbitrary time-dependent phase factor, which means that we can define a new set of i(t) eigenstates linked to our overall by a time-dependent gauge transformation, and |φ(λ,k)〉α = e iαλ(t) |φ(λ,k)〉 , (59) 116 vol. 60 no. 2/2020 analytical solution of (2+1) dimensional dirac equation. . . where αλ(t) is a real time-dependent function arbitrarily chosen called lr phase, |φλ(x,y,t)〉αare eigenstates of i(t) which are orthonormal and associated with λ. by putting eq.(59) in eq.(58) and using eq.(49), we find ∂αλ,k ∂t δλλ′ δkk′ = 〈 φ(λ ′ ,k ′ ) ∣∣∣i ∂ ∂t −hnc |φ(λ,k)〉 . (60) all the eigenstates of the invariant are also solutions of the time-dependent dirac equation, it was shown in [46] that its general solution is done by ∣∣ψ̄(t)〉 = ∑ λ,k cλ,ke iαλ,k(t) |φ(λ,k,t)〉 , (61) we remark that eq.(61) is also spin-independent in its state. but maybe the spin-dependent part is entangled in the coefficient c. |φ(λ,k,t)〉 are the orthonormal eigenstates of i(t), with cλ,k being time-independent coefficients, which correspond to |ψ(0)〉 cλ,k = 〈λ,k | ψ(0)〉 . (62) for a discrete spectrum of i(t), with λ = λ ′ , k = k ′ , and from eq.(60) the lr phase is defined as α(t) = � t 0 〈 φ(λ,k,t ′ ) ∣∣∣i ∂ ∂t ′ −hnc(t ′ ) ∣∣∣φ(λ,k,t′ )〉dt′. (63) but in the continuous spectrum case, the general expression of the phase is ∂αλ,k ∂t 〈 φ(λ ′ ,k ′ , t ′ | φ(λ,k,t) 〉 = 〈 φ(λ ′ ,k ′ , t ′ ) ∣∣∣i ∂ ∂t −hnc |φ(λ,k,t)〉 , (64) where k is an index that varies continuously in the real values, thus〈 φ(λ ′ ,k ′ , t ′ | φ(λ,k,t) 〉 = δλλ′ δ(k −k ′ ), (65) substituting eq.(65) in eq.(64) yields α(t) = � � t 0 〈 φ(λ,k ′ , t ′ ) ∣∣∣i ∂ ∂t ′ −hnc ∣∣∣φ(λ,k,t′ )〉dt′dk′. (66) once the expression of the phase α(t) is found, we can write the particular solution of our nc time-dependent dirac equation (61). for simplicity, we use the notation of the discrete spectrum of i(t). we see that the eigenfunction of i(t) has the form of [55, 56] |φλ,k(x,y,t)〉∝ |λ,k〉exp [ i ( ξ1(t)x + ξ2(t)y + ξ3(t)x2 + ξ4(t)y2 )] , (67) where ξ1(t), ξ2(t), ξ3(t), ξ4(t) are arbitrary time-dependent functions. by substituting eq.(67) into eq.(63) yields α(t) = ϑ− � t 0 encdt ′ , (68) with ϑ(x,y,t) = (ξ1(0) − ξ1(t)) x + (ξ2(0) − ξ2(t)) y + (ξ3(0) − ξ3(t)) x2 + (ξ4(0) − ξ4(t)) y2, (69) and enc is the eigenvalue of the hamiltonian (12). finally, the solution of the nc dirac equation (13) is [46] ∣∣ψ̄(t)〉 = ∑ λ,k cλ,ke i[ϑ− � t 0 e ncdt ′ ] |φ(λ,k,t)〉 , (70) 117 ilyas haouam acta polytechnica 3.4. 
the exact form of the solutions of the problem as agreed [55–57], the wave function of the nc dirac equation is given by the following trial function∣∣ψ̄(x,y,t)〉 = f(t) |φ(x,y,t〉 , (71) where f is a time-dependent vector of 2 components (2 × 1) f(t) = ( f1(t) f2(t) ) , (72) as long as i(t) is independent in time, eq.(15) goes to eq.(13). then the substitution of eq.(71) into eq.(13), and using eqs.(67, 7) give { i∂f1 ∂t −f1 ∂ξ1∂t x−f1 ∂ξ2 ∂t y −f1 ∂ξ3∂t x 2 −f1 ∂ξ4∂t y 2 i∂f2 ∂t −f2 ∂ξ1∂t x−f2 ∂ξ2 ∂t y −f2 ∂ξ3∂t x 2 −f2 ∂ξ4∂t y 2 } ={ m α1fθpx −α2fηx + α2fθpy + α1fηy α1fθpx −α2fηx + α2fθpy + α1fηy −m } × ( f1 f2 ) , (73) then, we obtain i∂f1 ∂t −f1 ∂ξ1∂t x−f1 ∂ξ2 ∂t y −f1 ∂ξ3∂t x 2 −f1 ∂ξ4∂t y 2 = fθf2px + ifηf2x− ifθf2py + fηf2y + mf1 i∂f2 ∂t −f2 ∂ξ1∂t x−f2 ∂ξ2 ∂t y −f2 ∂ξ3∂t x 2 −f2 ∂ξ4∂t y 2 = fθf1px − ifηf1x + ifθf1py + fηf1y −mf2 , (74) by solving the above system of equations, we find ∂f1 ∂t = −imf1, ∂f2 ∂t = imf2, (75) f1 ∂ξ1 ∂t = −i{ eb 2 + η 2 e−γt}f2, (76) f1 ∂ξ2 ∂t = −{ eb 2 + η 2 e−γt}f2, (77) ∂ξ3 ∂t = ∂ξ4 ∂t = 0, (78) which leads to obtaining f1 = e−imt+q1, f2 = eimt+q2, (79) ∂ξ1 ∂t = i ∂ξ2 ∂t = −i{ eb 2 + η 2 e−γt}ei2mt+q2−q1, (80) ξ1 = iξ2 = −i{ κ 4l2bim ei2mt + ηκ 4im− 2γ e(−γ+i2m)t}, (81) with q1, q2 and κ = eq2−q1 being real constants, l−1b = √ eb is the magnetic length [58]. in commutative case (θ = η = γ = 0), then the above relations (79, 81) return to that of general quantum mechanics ξ1(t) = iξ2(t) = − κ 4l2bm ei2mt, |φ(x,y,t〉 |η=γ=0 ∼ e − iκ 4l2 b m ei2mt(x+iy)+o1ix2+o2iy2 , (82) with o1, o2 are real constants, and in t = 0 ξ1(t = 0) = iξ2(t = 0) = − κ 4l2bm , and f1 = κf2 = eq1. (83) 4. conclusion in conclusion, the dynamics of the system of a time-dependent nc dirac equation has been analysed and formulated using the lr invariant method. we introduced the time-dependent noncommutativity using a timedependent bopp-shift translation. knowing that the nc structure constants postulated expanding exponentially with the evolution of time, and the time-dependency have a multitude of other possibilities. we benefit from the dynamical invariant following the standard procedure allowed to construct and to obtain an analytical solution of the system. having obtained the explicit solutions could also help to investigate and reformulate the modified version of heisenberg’s uncertainty relations emerging from non-vanishing commutation relations (1). the uncertainty for the observables a, b has to satisfy the inequality 4a4b |ψ≥ 12 | 〈ψ| [a,b] |ψ〉 | with 4a |2ψ= 〈ψ|a2 |ψ〉−〈ψ|a |ψ〉 2 and the same for b for any state. depending on these results, we are planning to study the pair creation process, and investigate its implications in quantum optics. 118 vol. 60 no. 2/2020 analytical solution of (2+1) dimensional dirac equation. . . acknowledgements the author wishes to express thanks to pr lyazid chetouani for his interesting comments and suggestions, and would also like to appreciate anonymous reviewers for their careful reading of the manuscript and their insightful comments. references [1] h. s. snyder. quantized space-time. physical review 71:38 – 41, 1947. doi:10.1103/physrev.71.38. [2] h. s. snyder. the electromagnetic field in quantized space-time. physical review 72:68 – 71, 1947. doi:10.1103/physrev.72.68. [3] a. connes, m. douglas, a. schwarz. noncommutative geometry and matrix theory: compactification on tori. journal of high energy physics 1998, 1998. [4] t. banks, w. fischler, s. h. shenker, l. susskind. m theory as a matrix model: a conjecture. 
physical review d 55:5112–5128, 1997. doi:10.1103/physrevd.55.5112. [5] m. chaichian, m. m. sheikh-jabbari, a. tureanu. hydrogen atom spectrum and the lamb shift in noncommutative qed. physical review letters 86:2716–2719, 2001. doi:10.1103/physrevlett.86.2716. [6] b. mirza, m. zarei. non-commutative quantum mechanics and the aharonov-casher effect. european physical journal c 32:583–586, 2004. doi:10.1140/epjc/s2003-01522-8. [7] c. duval, p. horváthy. the exotic galilei group and the "peierls substitution". physics letters b 479(1):284–290, 2000. doi:10.1016/s0370-2693(00)00341-5. [8] a. connes. non-commutative differential geometry. publications mathématiques de l'institut des hautes études scientifiques 62:41–144, 1985. doi:10.1007/bf02698807. [9] a. connes. a short survey of noncommutative geometry. journal of mathematical physics 41(6):3832–3866, 2000. doi:10.1063/1.533329. [10] s. woronowicz. twisted su(2) group. an example of a non-commutative differential calculus. publications of the research institute for mathematical sciences 23:117, 1987. doi:10.2977/prims/1195176848. [11] n. seiberg, e. witten. string theory and noncommutative geometry. journal of high energy physics 1999(09):032, 1999. doi:10.1088/1126-6708/1999/09/032. [12] j. madore, s. schraml, p. schupp, j. wess. gauge theory on noncommutative spaces. the european physical journal c – particles and fields 16:161–167, 2000. doi:10.1007/s100520050012. [13] i. haouam. continuity equation in presence of a non-local potential in non-commutative phase-space. open journal of microphysics 09:15–28, 2019. doi:10.4236/ojm.2019.93003. [14] i. haouam. on the fisk-tait equation for spin-3/2 fermions interacting with an external magnetic field in noncommutative space-time. journal of physical studies 24, 2020. doi:10.30970/jps.24.1801. [15] i. haouam. the non-relativistic limit of the dkp equation in non-commutative phase-space. symmetry 11:223, 2019. doi:10.3390/sym11020223. [16] i. haouam, l. chetouani. the foldy-wouthuysen transformation of the dirac equation in noncommutative phase-space. journal of modern physics 09:2021–2034, 2018. doi:10.4236/jmp.2018.911127. [17] b. jurčo, l. möller, s. schraml, et al. construction of non-abelian gauge theories on noncommutative spaces. european physical journal c 21:383, 2001. doi:10.1007/s100520100731. [18] j.-i. kamoshita. probing noncommutative space-time in the laboratory frame. the european physical journal c 52:451, 2002. doi:10.1140/epjc/s10052-007-0371-y. [19] z. alisultanov. landau levels in graphene in crossed magnetic and electric fields: quasi-classical approach. physica b: condensed matter 438:41–44, 2014. doi:10.1016/j.physb.2013.12.033. [20] z. alisultanov. oscillations of magnetization in graphene in crossed magnetic and electric fields. jetp letters 99:232–236, 2014. doi:10.1134/s0021364014040055. [21] n. ma, s. zhang, d. liu, v. wang. influence of electrostatic field on the weiss oscillations in graphene. physics letters a 378(45):3354–3359, 2014. doi:10.1016/j.physleta.2014.09.026. [22] s. zhang, n. ma, e. zhang. the modulation of the de haas–van alphen effect in graphene by electric field. journal of physics: condensed matter 22(11):115302, 2010. doi:10.1088/0953-8984/22/11/115302. [23] n. m. r. peres, e. v. castro. algebraic solution of a graphene layer in transverse electric and perpendicular magnetic fields. journal of physics: condensed matter 19(40):406231, 2007. doi:10.1088/0953-8984/19/40/406231. [24] k. novoselov, a. geim, s.
morozov, et al. two-dimensional gas of massless dirac fermions in graphene. nature 438:197–200, 2005. doi:10.1038/nature04233. [25] a. de martino, l. dell'anna, r. egger. magnetic confinement of massless dirac fermions in graphene. physical review letters 98:066802, 2007. doi:10.1103/physrevlett.98.066802. [26] l. dell'anna, a. de martino. multiple magnetic barriers in graphene. physical review b 79:045420, 2009. doi:10.1103/physrevb.79.045420. [27] y. zhang, y.-w. tan, h. stormer, p. kim. experimental observation of the quantum hall effect and berry's phase in graphene. nature 438:201–204, 2005. doi:10.1038/nature04235. [28] k. bolotin, f. ghahari, m. shulman, et al. observation of the fractional quantum hall effect in graphene. nature 462:196–199, 2009. doi:10.1038/nature08582. [29] g. p. mikitik, y. v. sharlai. the berry phase in graphene and graphite multilayers. low temperature physics 34(10):794–800, 2008. doi:10.1063/1.2981389. [30] v. lukose, r. shankar, g. baskaran. novel electric field effects on landau levels in graphene. physical review letters 98:116802, 2007. doi:10.1103/physrevlett.98.116802. [31] v. arjona, e. v. castro, m. a. h. vozmediano. collapse of landau levels in weyl semimetals. physical review b 96:081110, 2017. doi:10.1103/physrevb.96.081110. [32] g. burmeister, k. maschke. scattering by time-periodic potentials in one dimension and its influence on electronic transport. physical review b 57:13050–13060, 1998. doi:10.1103/physrevb.57.13050. [33] c. s. tang, c. s. chu. coherent quantum transport in narrow constrictions in the presence of a finite-range longitudinally polarized time-dependent field. physical review b 60:1830–1836, 1999. doi:10.1103/physrevb.60.1830. [34] w. li, l. e. reichl. transport in strongly driven heterostructures and bound-state-induced dynamic resonances. physical review b 62:8269–8275, 2000. doi:10.1103/physrevb.62.8269. [35] c. figueira de morisson faria, m. dörr, w. sandner. time profile of harmonic generation. physical review a 55:3961–3963, 1997. doi:10.1103/physreva.55.3961. [36] i. h. deutsch, p. s. jessen. quantum-state control in optical lattices. physical review a 57:1972–1986, 1998. doi:10.1103/physreva.57.1972. [37] h. maeda, t. f. gallagher. nondispersing wave packets. physical review letters 92:133004, 2004. doi:10.1103/physrevlett.92.133004. [38] c. e. creffield, g. platero. ac-driven localization in a two-electron quantum dot molecule.
physical review b 65:113304, 2002. doi:10.1103/physrevb.65.113304. [39] h. p. yuen. two-photon coherent states of the radiation field. physical review a 13:2226–2243, 1976. doi:10.1103/physreva.13.2226. [40] m. governale, f. taddei, r. fazio. pumping spin with electrical fields. physical review b 68:155324, 2003. doi:10.1103/physrevb.68.155324. [41] a. g. mal'shukov, c. s. tang, c. s. chu, k. a. chao. spin-current generation and detection in the presence of an ac gate. physical review b 68:233307, 2003. doi:10.1103/physrevb.68.233307. [42] l. s. brown. quantum motion in a paul trap. physical review letters 66:527–529, 1991. doi:10.1103/physrevlett.66.527. [43] m. feng. complete solution of the schrödinger equation for the time-dependent linear potential. physical review a 64:034101, 2001. doi:10.1103/physreva.64.034101. [44] m.-l. liang, z.-g. zhang, k.-s. zhong. quantum-classical correspondence of the time-dependent linear potential. czechoslovak journal of physics 54:397–402, 2004. doi:10.1023/b:cjop.0000020579.42018.d9. [45] h. r. lewis. classical and quantum systems with time-dependent harmonic-oscillator-type hamiltonians. physical review letters 18:510–512, 1967. doi:10.1103/physrevlett.18.510. [46] h. r. lewis, w. b. riesenfeld. an exact quantum theory of the time-dependent harmonic oscillator and of a charged particle in a time-dependent electromagnetic field. journal of mathematical physics 10(8):1458–1473, 1969. doi:10.1063/1.1664991. [47] i. a. pedrosa, j. l. melo, e. nogueira. linear invariants and the quantum dynamics of a nonstationary mesoscopic rlc circuit with a source. modern physics letters b 28(27):1450212, 2014. doi:10.1142/s0217984914502121. [48] x. chen, a. ruschhaupt, s. schmidt, et al. fast optimal frictionless atom cooling in harmonic traps: shortcut to adiabaticity. physical review letters 104:063002, 2010. doi:10.1103/physrevlett.104.063002. [49] s. dey, a. fring. noncommutative quantum mechanics in a time-dependent background. physical review d 90:084005, 2014. doi:10.1103/physrevd.90.084005. [50] v. v. nesvizhevsky, h. g. börner, a. m. gagarski, et al. measurement of quantum states of neutrons in the earth's gravitational field. physical review d 67:102002, 2003. doi:10.1103/physrevd.67.102002. [51] v. nesvizhevsky, h. börner, a. petukhov, et al. quantum states of neutrons in the earth's gravitational field. nature 415:297–299, 2002. doi:10.1038/415297a.
[52] s. m. carroll, j. a. harvey, v. a. kostelecký, et al. noncommutative field theory and lorentz violation. physical review letters 87:141601, 2001. doi:10.1103/physrevlett.87.141601. [53] o. bertolami, j. g. rosa, c. m. l. de aragão, et al. noncommutative gravitational quantum well. physical review d 72:025010, 2005. doi:10.1103/physrevd.72.025010. [54] f. delduc, q. duret, f. gieres, m. lefrancois. magnetic fields in noncommutative quantum mechanics. journal of physics: conference series 103:012020, 2008. doi:10.1088/1742-6596/103/1/012020. [55] x. jiang, c. long, s. qin. solution of dirac equation with the time-dependent linear potential in non-commutative phase space. journal of modern physics 04:940–944, 2013. doi:10.4236/jmp.2013.47126. [56] h. sobhani, h. hassanabadi. two-dimensional linear dependencies on the coordinate time-dependent interaction in relativistic non-commutative phase space. communications in theoretical physics 64(3):263–268, 2015. doi:10.1088/0253-6102/64/3/263. [57] m. merad, s. bensaid. wave functions for a duffin-kemmer-petiau particle in a time-dependent potential. journal of mathematical physics 48(7):073515, 2007. doi:10.1063/1.2747609. [58] z. jiang, e. a. henriksen, l. c. tung, et al. infrared spectroscopy of landau levels of graphene. physical review letters 98:197403, 2007. doi:10.1103/physrevlett.98.197403.
acta polytechnica 58(6):370–377, 2018. doi:10.14311/ap.2018.58.0370. © czech technical university in prague, 2018. available online at http://ojs.cvut.cz/ojs/index.php/ap

a method of complex calculation of rational structural parameters of railway humps

sergii panchenko^a, oleksandr ohar^b, maksym kutsenko^c, julia smachilo^c,*

a rector, ukrainian state university of railway transport, feyerbach square 7, kharkiv, ukraine
b head of department of railway station and junctions, ukrainian state university of railway transport, feyerbach square 7, kharkiv, ukraine
c department of railway station and junctions, ukrainian state university of railway transport, feyerbach square 7, kharkiv, ukraine
* corresponding author: smachilo.julia@gmail.com

abstract. the article deals with a method of complex calculation of rational structural parameters of humps at classification yards. unlike existing methods, this method allows an implementation of the technology of guided gravity regulation of the cut speed by applying a special layout and profile arrangement. the authors believe that it will decrease the maintenance costs to refund damaged cars and cargo, the costs of the electricity needed for the cut speed regulation, and some extra charges due to demurrages caused by waiting for breaking-up at arrival yards.

keywords: railway transport, railway humps, optimization.

1. introduction

due to certain trends in the world energy resources market and severe competition in transportation, research into the optimization of rail transportation charges is urgent. besides, the problem is also of importance for sorting operations at railway stations, as their parameters are greatly conditioned by the structural parameters of the humps. in article [1], the authors pay attention to the fact that one of the components considerably influencing the total costs of transportation is the car processing costs on humps. cars can be processed several times on their way from departure stations to destination stations. according to [1], a lot of factors influence the processing costs, among which are the costs of depreciation, spare parts and maintenance of cut speed regulators (the maintenance costs are proportional to the costs of the regulators). over a long period of time, many scientists have paid attention to improving the layout and profile structure of humps, cut rolling speed regulators, systems of automated hump technological processes and cut braking modes in order to increase the breaking-up efficiency [2–7].
thus, in research [2, 3], the aim is achieved by optimized structures of hump necks; in [4], by defining the optimal parameters of the longitudinal hump profile; in [5], by developing new and improving existing structures of cut rolling speed regulators; in [6], by forming approaches to automated braking regulation; and in [7], by optimizing cut braking modes. the analysis of the above-mentioned sources testifies that many scientists theoretically prove the possibility to increase the breaking-up efficiency with the methods they propose for humps. research [8] implies that there are some factors of the sorting process substantially influencing the breaking-up efficiency, though they are rather difficult to consider, forecast or formalize. the operational condition of cut rolling speed regulators and automation devices, the degree of consideration and the method of representing factors of a random nature, the state of the wheelsets and other factors are among them. the research concluded that the reduction of the influence of the human factor on the breaking-up efficiency is still a problem. thus, consideration, forecasting or formalization of the above-mentioned factors is currently a very difficult challenge, and a solution has not been found yet. also, in research [8], the authors substantiate the importance of implementing the technology of guided gravity braking for cuts. the special layout and profile arrangement for classification yards proposed by the authors (figure 1) can be used to implement the given technology. a special feature of such an arrangement is the location of the switching area (sa), either in part or in whole, at the origin of the sorting tracks up to the yard retarder position (yrp) on the ascent. the other elements on the section from the hump crest (hc) to the design point (dp) are located on the descent. the height and longitudinal profile of the arrangement provide, firstly, the rolling of a slow light car (sl) in unfavourable winter conditions from hc to dp along the most difficult track in terms of resistance and, secondly, sufficient intervals at the dividing switching points in the sl-fh link (where fh is a fast heavy car).

figure 1. the arrangement of the guided gravity regulation of cut braking.

therefore, intervals between cuts, sufficient for throwing over points from one position to another, are only provided due to a special structure of the descent part profile, and the location of certain profile elements on the ascent allows slowing the cars. in other words, such an arrangement generates the gravity braking effect. when applying this sorting arrangement, the yrp does not change its functionality. in [8], the authors put forward the hypothesis that, under automated car processing, the accrued economic benefit over the calculation operational period for the arrangement will exceed the accrued economic benefit over a similar operational period for a conventional automated sorting hump, even if the costs of cut rolling speed regulators are twice as high when applying the technology of guided gravity regulation of cut braking (additional investment into automated devices for conventional humps can compensate the difference in investment into car retarders).

2.
a method of complex calculation of rational structural parameters of railway humps

therefore, the most efficient structural variant for a hump design with the guided gravity regulation of cut braking is the variant with the least needed capacity of the yrp (h_yrp) that still meets the requirements for safety and fail-safety of sorting operations. thus, the needed capacity of the yrp is the criterion for rationalized structural parameters of a hump with the guided gravity braking technology for cuts. for a certain hump, if the position of its crest and the lengths of the profile elements are constant, then

h_yrp = f(i_1, i_2, …, i_n), (1)

where i_1, i_2, …, i_n are the slopes of the section elements from the hump crest to the origin of the switching area. let us develop the objective function to define the rational values of the structural parameters of a hump. under favourable rolling conditions, the needed capacity of the yard retarder position, according to [1], is defined as

h_yrp = k_en [ ( ∑_{r=1}^{n} l_r i_r + l_switch i_switch + Δl_st i_st ) · 10⁻³ + v₀²/(2g′) − ( w₀ l_hc-sa + v²_mid(hc-sa) ( 0.56 n_hc-sa + 0.23 ∑α_hc-sa ) ) · 10⁻³ − ( l_switch i_switch + Δl_st i_st ) · 10⁻³ ], (2)

where k_en – the enlargement factor of the minimal design capacity of the retarder positions on the hump descent; n – the number of profile elements of the descent (from the hump crest to the origin of the switching area); l_r, i_r – the length (m) and the slope (‰) of the profile elements of a hump from the crest to the origin of the switching area, respectively; l_switch, i_switch – the length (m) and the slope (‰) of the switching area, respectively; Δl_st, i_st – the length (m) and the slope (‰) of the sorting track section from the switching area to the origin of the yrp; v₀ – the initial speed of the consist shunting over the hump crest, m/s; g′ – the acceleration due to the gravity force of a heavy car with consideration of the rotating wheelset masses, m/s²; w₀ – the basic specific resistance of a heavy car, n/kn; l_hc-sa – the section length from the hump crest to the origin of the switching area, m; v_mid(hc-sa) – the average speed of a heavy car on the section from the hump crest to the origin of the switching area, m/s; n_hc-sa – the number of switching points on the section from the hump crest to the origin of the switching area; ∑α_hc-sa – the sum of the rotation angles on the section from the hump crest to the origin of the switching area.

since each profile element consists of technological elements,

∑_{r=1}^{n} l_r i_r = ∑_{i=1}^{m} l_i i_i,   v²_mid(hc-sa) ( 0.56 n_hc-sa + 0.23 ∑α_hc-sa ) = ∑_{i=1}^{m} ( 0.56 n_i + 0.23 ∑α_i ) v²_mid(i), (3)

where m – the number of technological elements. the values v₀, g′, w₀, l_i, n_i, ∑α_i for i = 1, …, m are constants. let

k_en = a,   v₀²/(2g′) − l_hc-sa w₀ · 10⁻³ = b,   ( 0.56 n_i + 0.23 ∑α_i ) · 10⁻³ = c_i, (4)

then

h_yrp = a ( ∑_{i=1}^{m} (l_i i_i) · 10⁻³ + b + ∑_{i=1}^{m} c_i v²_mid(i) ) → h_yrp(min). (5)

the average rolling speed of a heavy car on the i-th technological element is

v_mid(i) = (v′_i + v_{i−1}) / 2, (6)

where v′_i – the speed of a heavy car at the end of the i-th element in a first approximation (the calculation of v′_i considers only those specific resistances which do not depend on the average rolling speed on a technological element: the basic resistance w_0(i), the resistance from snow and frost w_sn(i) and the braking resistance w_b(i)), m/s:

v′_i = √( v²_{i−1} + 2 g′ l_i ( i_i − w_0(i) − w_sn(i) − w_b(i) ) · 10⁻³ ), (7)

where l_i, i_i – the length and slope of the i-th technological element, respectively.
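equations (6) and (7) translate directly into a per-element update rule for the rolling speed. the sketch below is our own numeric illustration, not code from the paper; the profile data, the value of g′ and the resistance figures are assumed for demonstration only. it steps a heavy car through a hypothetical three-element descent and prints the first-approximation exit and average speeds.

```python
import math

G_PRIME = 9.5  # m/s^2: acceleration of a heavy car including rotating
               # masses (an illustrative assumption, not a value from the paper)

def exit_speed(v_prev, length, slope, w0, w_sn=0.0, w_b=0.0):
    """eq. (7): v'_i = sqrt(v_{i-1}^2 + 2 g' l_i (i_i - w0 - w_sn - w_b) 1e-3)."""
    rad = v_prev**2 + 2 * G_PRIME * length * (slope - w0 - w_sn - w_b) * 1e-3
    return math.sqrt(max(rad, 0.0))

def mean_speed(v_prev, v_exit):
    """eq. (6): v_mid(i) = (v'_i + v_{i-1}) / 2."""
    return 0.5 * (v_exit + v_prev)

# hypothetical descent profile: (length in m, slope in promille)
profile = [(30.0, 40.0), (25.0, 20.0), (40.0, 8.0)]
v, w0 = 4.5, 1.2            # illustrative hump speed (m/s) and basic resistance
for length, slope in profile:
    v_next = exit_speed(v, length, slope, w0)
    print(f"l={length:5.1f} m  i={slope:4.1f}  v_mid={mean_speed(v, v_next):.2f}  v'={v_next:.2f} m/s")
    v = v_next
```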
since the hyrp is defined under favourable rolling conditions, wsn(i) and, besides, according to [1] wb(i) = 0. thus, v ′i = √ v 2i−1 + 2 ·g′li(ii −w0(i)) · 10−3, (8) where vi−1 – the heavy car speed at the end of i− 1 element in a second approximation (with consideration of wrol(i − 1) and wb(i − 1)). the heavy car speed at the end of mth element: vm = ( v 20 + 2 ·g ′l1(i1 −w0 −wrol(1) −wb(1)) · 10−3 + 2 ·g′l2(i2 −w0 −wrol(2) −wb(2)) · 10−3 + · · · + 2 ·g′lm(im −w0 −wrol(m) −wb(m)) · 10−3 )1/2 = √ v 20 + 2 ·g′ ∑m s=1 ls(is −w0 −wrol(s) −wb(s)) · 10 −3. (9) therefore, the heavy car speed at the origin of (i− 1)th element is: vi−1 = √ v 20 + 2 ·g′ ∑i−1 s=1 ls(is −w0 −wrol(s) −wb(s)) · 10 −3. (10) 372 vol. 58 no. 6/2018 example of an article with xa long title according to [1], the actual average rolling speed of a heavy car on the technological element is: v amid = vst + ∑z i=1 √ v 2st + 2g′li(ii −w0)/(1000z) z + 1 , (11) where z – the number of elemental sections as components of the technological element (for calculation of v amid , the elemental section length is taken 0.5 m, i.e. l/z = 0.5). according to [1], the authors propose to define the average speed on a technological element by the formula: vmid(i) = kiv ′m(i), (12) where v ′m(i) – the heavy car’s speed in a first approximation in the middle of a technological element: vm(i) = √ v 2i−1 + g′li(ii −wvrg0 ) · 10−3, (13) and ki – the correction index. the authors propose to define the correction index on the basis of the conditions of equality of errors in calculation of the average heavy car speed on technological elements; one of them is of an infinitesimal length, thus, vmid on the element can be taken as vst, the second – as a length of 30 m and located on a slope of 50%0, thus: vst −kv ′st = kv ′ m −v a mid. (14) from the equation obtained: k = vst + v amid vst + v ′m . (15) the research made by the authors on the basis of calculations of the correction index at various vst showed that it can be set as an exponential function: k = −0.0576191 ·e−0.5710201·vst + 0.9966873. (16) thus, hyrp = a ( m∑ i=1 (liii) · 10−3 + b + m∑ i−1 cik 2 i v 2 i−1g ′li(ii −wvgc0 ) · 10 −3 ) → hyrp(min), (17) where v 2i−1 = v 2 0 + 2g ′ vgc i−1∑ s=1 ( ls(is −wvgc0 ) · 10 −3 − ( 0.56 ·nswitch(s) + 0.23 ∑ as ) k2s (v 2 s−1 + g ′lg (ig −wvgc0 ) · 10 −3 − 1000 ·hb(s) ) . (18) let: cik 2 i = di, v 2 i−1 −g ′liwo · 10−3 = ei, g′li · 10−3 = fi, li · 10−3 = gi. (19) then: hyrp = a ( m∑ i−1 giii + b + m∑ i=1 di(ei + fiii) ) → hyrp(min), (20) or hyrp = a ( b + ( (g1i1) + d1(e1 + f1i1) ) + ( (g2i2) + d2(e2 + f2i2) ) + · · · + ( (gmim) + dm(em + fmim) )) → hyrp(min). (21) as far as each profile element consists of the several technological elements, let us specify: i1 = ii, where i = 1, . . . ,x1, i2 = ii, where i = x + 1, . . . ,x2, in = ii, where i = xn−1 + 1, . . . ,xn. (22) where n – the number of profile elements of the descent part of a hump, and xi, i = 1, 2, . . . ,n – the number of the last technological element of the ith profile element 373 s. panchenko, o. ohar, m. kutsenko, j. smachilo acta polytechnica and, eventually, the objective function takes the form: hyrp = a ( b + x1∑ i=1 ( gii1 + di(ei + fii1) ) + x2∑ i=x1+1 ( gii2 + di(ei + fii2) ) + · · · + xn∑ i=xn−1+1 ( giin + di(ei + fiin) )) → hyrp(min). 
(23)

in order to realize the guided gravity technology for cuts, the objective function (23) should be minimized subject to the non-linear limitations-equalities

d_1 = f_d1(v₀), e_1 = f_e1(v₀), d_2 = f_d2(v₀, i_1), e_2 = f_e2(v₀, i_1), d_3 = f_d3(v₀, i_1, i_2), e_3 = f_e3(v₀, i_1, i_2), …, d_xn = f_dxn(v₀, i_1, i_2, …, i_{xn−1}), e_xn = f_exn(v₀, i_1, i_2, …, i_{xn−1}), (24)

the linear limitations-inequalities

0 ≤ i_1 ≤ 50, 25 ≤ i_2 ≤ 50, −50 ≤ i_n ≤ 50, i_1 − i_2 ≤ 25, h_yrp^b ≤ n_r h_one, v_en^yrp ≤ v_en(max)^yrp, t_0 ≤ t_0^max, (25)

and the linear limitations-equalities

l_run^unf = l_c, v_ex^yrp = 1.4, (26)

where i_1, i_2, …, i_n – the profile element slopes of a hump, ‰; h_yrp^b – the value of the heavy car braking under favourable summer conditions at the yrp, kj/kn; n_r – the number of retarders installed at the yrp; h_one – the capacity of one retarder installed at the yrp, kj/kn; v_en^yrp – the entry speed of a heavy car under favourable summer conditions at the yrp, m/s; v_en(max)^yrp – the maximum admissible entry speed of a heavy car under favourable summer conditions at the yrp, m/s; t_0 – the time interval at the dividing elements between cars rolling in turn, s; t_0^max – the maximum accessible time at the dividing elements between cars rolling in turn, s; l_run^unf – the run of a heavy car under unfavourable winter conditions along a difficult track in terms of resistance, m; l_c – the calculation length of a difficult track in terms of resistance from the hump crest to the design point, m; v_ex^yrp – the exit speed of a heavy car from the yrp, m/s.

according to [8], the task cannot be reduced to an unconditional extremum task; therefore, let us consider other ways to solve it. the method of lagrange multipliers is technically difficult to implement here, as the number of limitations-equalities is rather great; moreover, the method cannot be directly used if the limitations are inequalities [8]. thus, there is a need for a method that allows finding the minimal value of h_yrp with a minimal variant search. the standard method of lagrange multipliers, supplemented with terms stemming from duality theory, was generalized to the non-linear programming task of a general kind with limitations of the equality-inequality type [8]. the necessary optimality conditions of such tasks are called the kuhn-tucker conditions. in order to build the kuhn-tucker task, the mathematical model of the non-linear programming task must be written with strict limitations-inequalities:

z = f(x) → min,   h_k(x) = 0, k = 1, …, s,   g_i(x) ≤ 0, i = 1, …, m. (27)

the conditions of non-negativity of variables are included in the task as limitations-inequalities:

g_i = −x_j ≤ 0. (28)

the lagrange function of the task is built with s + m undetermined coefficients:

l(x, v, u) = f(x) + ∑_{k=1}^{s} v_k h_k(x) + ∑_{i=1}^{m} u_i g_i(x). (29)

the coefficients v_k (k = 1, …, s), u_i (i = 1, …, m) are called the lagrange multipliers. they are sign-unrestricted dual variables corresponding to the limitations-equalities (v_k, k = 1, …, s) and non-negative dual variables corresponding to the limitations-inequalities (u_i ≥ 0, i = 1, …, m). the following system of equations with n + s + m unknown variables is called the kuhn-tucker task for minimization:

∇f(x) + ∑_{k=1}^{s} v_k ∇h_k(x) + ∑_{i=1}^{m} u_i ∇g_i(x) = 0,   h_k(x) = 0, k = 1, …, s,   g_i(x) ≤ 0, i = 1, …, m,   u_i g_i(x) = 0, i = 1, …, m,   u_i ≥ 0, i = 1, …, m.
(30) the equations uigi(x) = 0, i = 1,m are the complementary slackness conditions; they are an analogy of the second duality theorem of linear programming tasks. if in point x, the limitation gi(x) is inactive (gi(x) > 0) then ui = 0, if gi(x) is active (gi(x) = 0), than ui > 0. the solution to the kuhn-tucker task should start with the analysis of this group of equations searching all possible combinations of equality to zero ui or gi(x)in turn. the optimal solution should be sought among points meeting the kuhn-tucker conditions (30). let us build the kuhn-tucker task at linear limitations-equalities for output task (23): { lunfrun −lc = 0, v phrex − 1.4 = 0, (31) non-linear limitations-equalities:   d1 −fd1 (v0) = 0, e1 −fe1 (v0) = 0, d2 −fd2 (v0,i1) = 0, e2 −fe2 (v0,i1) = 0, d3 −fd3 (v0,i1,i2) = 0, e3 −fe3 (v0,i1,i2) = 0, ... dzx −fdzx (v0,i1,i2, ...,izx−1 ) = 0, ezx −fezx (v0,i1,i2, ...,izx−1 ) = 0; (32) linear limitations-inequalities: 0 ≤ i1 ≤ 50 → { 0 − i1 ≤ 0, i1 − 50 ≤ 0, 25 ≤ i2 ≤ 50 → { 25 − i2 ≤ 0, i2 − 50 ≤ 0, ... −50 ≤ in ≤ 50 → { −50 − in ≤ 0, in − 50 ≤ 0; (33) and   i1 − i2 − 25 ≤ 0, v yrpen −v yrpen(max) ≤ 0, t0 −t max0 ≤ 0. (34) 375 s. panchenko, o. ohar, m. kutsenko, j. smachilo acta polytechnica the lagrange function has the form: l(i,v,u) = a ( b + z1∑ j=1 ( cji1 + dj (ej + fji1) ) + z2∑ j=z1+1 ( cji2 + dj (ej + fji2) ) + · · · + zx∑ j=zx−1+1 ( cjix + dj (ej + fjix) )) + v1(lunfrun −lc) + v2(v yrp ex − 1.4) + v3(d1 −fd1 (v0)) +v4(e1−fe1 (v0))+v5(d2−fd2 (v0,i1))+v6(e2−fe2 (v0,i1))+v7(d3−fd3 (v0,i1,i2))+v8(e3−fe3 (v0,i1,i2)) + · · · + v2zx+2 (dzx −fdzx (v0,i1,i2, . . .izx−1 )) + v2zx+3 (ezx −fezx (v0,i1,i2, . . .izx−1 )) + u1(0 − i1) + u2(i1 − 50) + u3(25 − i2) + u4(i2 − 50) + · · · + u2x−1(−50 − ix) + u2x(ix − 50) + u2x+1(i1 − i2 − 25) + u2x+2(v yrpen −v yrp en(max)) + u2x+3(t0 −t max 0 ), (35) and conditions: (1.) a ( z1∑ j=1 (gj + djfj ) ) −v6 ∂fd2 ∂i1 −v7 ∂fe2 ∂i1 −v8 ∂fd3 ∂i1 −v9 ∂fe3 ∂i1 −···−v2zx+2 ∂fdzx ∂i1 −v2zx+3 ∂fezx ∂i1 −u1 + u2 + u2x+1 = 0, ... a ( zx∑ j=zx−1+1 (gj + djfj ) ) −u2x−1 −u2x = 0; (36) (2.) partial derivatives by dual variables ∂l ∂vk and ∂l ∂ui : lunfrun −lc = 0, v yrpex − 1.4 = 0, d1 −fd1 (v0) = 0,e1 −fe1 (v0) = 0, d2 −fd2 (v0,i1) = 0,e2 −fd2 (v0,i1) = 0, d3 −fd3 (v0,i1,i2) = 0,e3 −fe3 (v0,i1,i2) = 0, ... dzx −fdzx (v0,i1,i2, . . . ,izx−1 ) = 0, ezx −fezx (v0,i1,i2, . . . ,izx−1 ) = 0, 0 − i1 = 0; i1 − 50 = 0, 25 − i2 = 0,i2 − 50 = 0, ... −50 − ix = 0; ix − 50 = 0; i1 − i2 − 25 = 0, v yrpen −v yrp en(max) = 0; t0 −t max 0 = 0; (37) (3.) the complementary slackness condition (2nd duality theorem): u1(0 − i1) = 0, u2(i1 − 50) = 0, u3(25 − i2) = 0, u4(i2 − 50) = 0, ... u2x−1(−50 − ix ) − 0 = 0, u2x(ix − 50) = 0, u2x+1(i1 − i2 − 25) = 0, 376 vol. 58 no. 6/2018 example of an article with xa long title u2x+2(v yrpen −v yrp en(max)) = 0, u2x+3(t0 −t max0 ) = 0; (38) (4.) ui ≥ 0. (39) according to [8], the solution should start with a linear search of all possible combinations of equality to zero for multipliers of group iii. at first, let us suppose that dual variables are equal to zero or not equal to zero: ui = 0 (∀i) then one out of ui = 0 , the others are not equal to zero, etc. 3. conclusions the solution to the optimization task with limitations presented in the study will make it possible to minimize the demand for the capacity of braking facilities and implement the technology of guided gravity regulation of the cut speed. 
the authors believe that the implementation of the technology will encourage a decrease of the operational costs of compensating car and freight damage (due to better conditions improving the quality of the cut rolling speed regulation) and of the energy needed for this regulation (a possible decrease of the compressed-air consumption by car retarders). besides, it will reduce the additional charges for demurrage caused by waiting for breaking-up at arrival yards (due to a possible reduction of hump intervals and a lower volume of marshalling work for pulling cars and the subsequent remedy of the consequences).

references
[1] m. i. kutsenko, i. v. berestov, o. m. ohar, o. b. akhiiezer. as for the question of development of methodology of complex calculation of optimal structural parameters of gravity hump. eastern-european journal of enterprise technologies (2/3):56–60, 2009.
[2] v. i. bobrovskiy, a. kolesnik, a. s. dorosh. the perfection of construction of plan of the track development of humping access door. transport systems and transportations technologies: collection of research papers of dnipropetrovsk national university of railway transport named after academician v. lazaryan (1):27–33, 2001.
[3] e. a. mamaev. the perfection of construction of plan of the track development of humping access door. international journal of applied engineering research 11(23):11515–11524, 2016.
[4] s. a. bessonenko. the optimization of basic parameters of gravity humps. the perfection of railways operating work, novosibirsk (1):4–25, 2008.
[5] v. a. kobzev. the state and prospects of braking hill technique development. automation, connection, informatics (11):2–5, 2004.
[6] a. n. shabelnikov. the development of theory and methods of automation of management with difficult processes at the marshalling yard. first printing. russian state technical university, moscow, 2005.
[7] v. i. bobrovskiy, d. n. kozachenko, n. p. bozhko, et al. the optimization of the modes of uncoupling braking on gravity humps. first printing. makovets, dnepropetrovsk, 2010.
[8] t. a. hemdi. introduction to the analysis of operations. first printing. vilyams, st-peterburg, 2007.

acta polytechnica 60(5):428–434, 2020
doi:10.14311/ap.2020.60.0428
© czech technical university in prague, 2020, available online at https://ojs.cvut.cz/ojs/index.php/ap

properties of a differential sequence based upon the kummer-schwarz equation

adhir maharaj^{a,∗}, kostis andriopoulos^{b}, peter leach^{a,b}
a durban university of technology, steve biko campus, department of mathematics, durban, 4000, republic of south africa
b university of kwazulu-natal, school of mathematical sciences, private bag x54001, durban, 4000, republic of south africa
∗ corresponding author: adhirm@dut.ac.za

abstract. in this paper, we determine a recursion operator for the kummer-schwarz equation, which leads to a sequence with unacceptable singularity properties. a different sequence is devised based upon the relationship between the kummer-schwarz equation and the first-order riccati equation, for which a particular generator has been found to give interesting and excellent properties.
we examine the elements of this sequence in terms of the usual properties to be investigated – symmetries, singularity properties, integrability, alternate sequence – and provide an explanation of the curious relationship between the results of the singularity analysis and a consideration of the solution of each element obtained by quadratures. keywords: lie symmetries, singularity analysis, differential sequence. 1. introduction in a prescient paper of 1977 olver [1], the idea of a recursion operator for an evolution of partial differential equations, but with definite indication of extension to wider classes of partial differential equations was introduced1. given an evolution differential equation, ut = f(t,x,u,ux, . . . ,unx), (1) r[u] is a recursion operator for (1) if it satisfies the equation [1] [p 1213, (8)] [l[f] −dt,r[u]]lb v ∈ (ut −f) , (2) where l[f] is the linearised operator l[f] = ∂f ∂unx dnx + . . . + ∂f ∂ux dx + ∂f ∂u , (3) dt and dx denote the total differentiation with respect to t and x, respectively, and v is a solution of the linearised equation, ie lv = 0. over the last thirty years, recursion operators have found a wide application in the study of nonlinear evolution partial differential equations with particular reference to their integrability in terms of the possession of an infinite number of conservation laws. more recently, euler et al [4] and petersson et al [5] examined the linearisability of nonlinear hierarchies of evolution equations in (1 + 1) dimensions with a particular reference to the generalised x-hodograph transformation. as a natural development from this work, euler et al [6] initiated a parallel study of recursion 1in this respect olver made use of a number of results due to various authors [2, 3]. the beauty of the subsequent work is in his synthesis. operators applied to ordinary differential equations. this was subsequently amplified by andriopoulos et al [7] in their detailed study of the riccati differential sequence2. a further development was the introduction of the concept of an alternate sequence [8]. in this paper, we construct a first-order recursion operator for a sequence of nonlinear ordinary differential equations based upon the schwarzian differential invariant in its expression as the kummer-schwarz equation and investigate the properties of the higherorder differential equations so generated. as our primary interest is in differential equations integrable in the sense of poincaré (iewith solutions which are analytically away from isolated polelike singularities) we are disappointed with the results. however, we are able to develop a sequence based upon a different generator and find that the elements of that sequence have a combination of very interesting properties, which help to illustrate a facet of this type of analysis not displayed in the literature. in addition to the examination of the singularity properties of this sequence, we also demonstrate its complete symmetry group and provide a differential generator for an alternate representation of the elements of the sequence. 2. the first-order recursion operator for the kummer-schwarz equation we write the kummer-schwarz equation in the form u′′′ 2u′ − 3u′′2 4u′2 = 0, (4) 2the change in terminology is justified in that paper. 428 https://doi.org/10.14311/ap.2020.60.0428 https://ojs.cvut.cz/ojs/index.php/ap vol. 60 no. 5/2020 properties of a differential sequence based upon the kummer-schwarz. . . 
where the prime denotes the differentiation with respect to the independent variable, x, to emphasise its connection with the schwarzian derivative, (u′−1/2)′′. we construct a first-order recursion operator for (4), r = d + a, where we write d as the operator of the total differentiation with respect to the single independent variable, x, and a is a function to be determined. the linearised operator for (4) is l[u] = 1 2u′ d3 − 3u′′ 2u′2 d2 − ( u′′′ 2u′2 − 3u′′2 2u′3 ) d. (5) as we are dealing with an ordinary differential equation, (2) is somewhat simpler. we calculate the lie bracket of l and r and require its action upon v to give zero when the (linearised) equation for v is satisfied. this gives a large equation with terms in v′′, v′ and v. the coefficient of each of these terms is required separately to be zero. that of v′′ gives 6a′ u′ − 6u′′2 u′3 + 6u′′′ u′2 = 0 (6) which has the solution a = − u′′ u′ (7) up to an arbitrary constant of integration, which we consider to be zero. when the right side (7) is substituted into the remaining terms, the result is identically zero. we apply the recursion operator r = d − u′′ u′ (8) to the kummer-schwarz equation, (4), to initiate the construction of the elements of the sequence. the first equation we obtain is u(4) u′ − 5 u′′u(3) u′2 + 9 u′′3 2u′3 = 0. (9) when we seek the leading-order behaviour of (9) by setting u = αχp, where χ = x−x0 and x0 is the location of the putative singularity, we find that p = −1(bis), 1. this is not acceptable for a scalar equation! we recall that the kummer-schwarz equation is closely related to the equation v′′ + v′2 = 0 (10) which is the autonomous version of the potential form of burgers equation for which the first-order recurrence relation is d + v′. this recurrence relation becomes that of the kummer-schwarz equation obtained above, under the transformation connecting the two equations. equation (10) is an elementary riccati equation in the dependent variable v′. in terms of y = v′ it is y′ + y2 = 0. (11) the connection of this form of the riccati equation to the kummer-schwarz equation is well known. it is a simple calculation to show that the recursion operator for (11) is d + 2y. however, it has been shown [6, 7] that the operator, d + y, generates a sequence of differential equations based upon the riccati equation, which has very satisfying properties in terms of their singularity characteristics and algebras. there is a price to pay. this operator is not a recursion operator – hence the descriptor ‘generator of sequence’ [7] to emphasise the distinction – and so one cannot expect to obtain the properties normally associated with the recursion operator for an hierarchy of evolution equations. however, there is much to be said for the attractive properties which we do find. in terms of the dependent variable of the kummerschwarz equation, the generator of the sequence, d+y, becomes d −u′′/2u′ and it is this operator which we employ henceforth. 3. the early elements of the new sequence we now apply the generator r = d − u′′ 2u′ (12) to the kummer-schwarz equation, (4), to initiate the construction of the elements of the sequence and then apply it to each new element in turn. the process may be continued for as long as the memory of one’s symbolic manipulator can contain the results. we simply list the first few members to give an indication of the structure of these equations. we include the first element, the original kummer-schwarz equation, for the sake of completeness in terms of the notation used. 
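a single application of the generator (12) to the kummer-schwarz equation (4) is easily checked with a computer algebra system. the sympy sketch below (variable names are ours) reproduces the second element of the listing that follows:

```python
import sympy as sp

x = sp.symbols('x')
u = sp.Function('u')(x)
up = lambda n: sp.diff(u, x, n)  # n-th derivative of u

def generator(expr):
    """apply r = d - u''/(2 u'), eq. (12), to an expression in x."""
    return sp.diff(expr, x) - up(2)/(2*up(1))*expr

ks101 = up(3)/up(1) - sp.Rational(3, 2)*up(2)**2/up(1)**2  # eq. (4)
ks102 = sp.simplify(generator(ks101))
print(ks102)
# up to sympy's ordering this is
# u''''/u' - 9 u'' u'''/(2 u'^2) + 15 u''^3/(4 u'^3),
# i.e. the element ks102 of the listing below
```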
the ks should need no explanation. the initial ‘1’ indicates that the sequence is generated by a firstorder operator. the two subsequent digits indicate the place of the particular equation as an element of the sequence. we obtain the following sequence, ks101: = u′′′ u′ − 3u′′2 2u′2 = 0 ks102: = u′′′′ u′ − 9u′′u ′′′ 2u′2 + 15u′′3 4u′3 = 0 ks103: = u′′′′′ u′ − 6u′′u′′′′ u′2 − 9u′′′2 2u′2 + 45u′′2u′′′ 2u′3 − 105u′′4 8u′4 = 0 ks104: = u′′′′′′ u′ − 15u′′u′′′′′ 2u′2 − 15u′′′u′′′′ u′2 + 75u′′2u′′′′ 2u′3 + 225u′′u′′′2 4u′3 − 525u′′3u′′′ 4u′4 + 945u′′5 16u′5 = 0 ... 429 a. maharaj, k. andriopoulos, p. leach acta polytechnica ks10n:= ( d − u′′ 2u′ )n ( u′′′ u′ − 3u′′2 2u′2 ) = 0. 4. singularity analysis we perform the singularity analysis in the usual fashion by firstly making the substitution u = αχp, (13) where χ = x−x0 and x0 is the location of the putative singularity, to determine the possible exponents of the leading-order term and the corresponding coefficients. the results for the first few elements of the sequence are given in table 1. there are three points to note. firstly, the elements of the sequence are of degree zero in u and so the coefficient of the leading-order term is arbitrary. secondly, if one removes the fractions by multiplying by the highest power of u′ in the denominator, the possibility of p = 0 has multiple occurrences. for the sequence as we have written it, these possibilities are spurious. as further analysis is more convenient to perform when the denominators are removed, we ignore these values. thirdly, there is always the possibility that p = 1. such a value is without the ambit of the singularity analysis. it is amusing to note that ks101 is invariant under the transformation u −→ 1/u. this property does not persist for higher elements of the sequence. the second step in the singularity analysis is to determine at what powers of χ the additional constants of integration occur. the performance of this computation is greatly facilitated by the removal of the fractions involving the derivative of the dependent variable. in table 2, we list the resonances for the various permissible values of the exponents, p. 5. complete symmetry group the element ks101 possesses six lie point symmetries with the algebra sl(2,r)⊕sl(2,r). all other elements of the differential sequence possess just four lie point symmetries, namely γ1 = ∂x, γ2 = x∂x, γ3 = ∂u and γ4 = u∂u with the algebra 2a1 ⊕ 2a1. neither algebra is sufficient to specify the corresponding equation completely. it is necessary to have a recourse to nonlocal symmetries for the complete specification. the determination of the nonlocal symmetries is facilitated by the fact that all elements of the sequence may be linearised by means of the same transformation, which linearises the kummer-schwarz equation. we recall that (4) can be written as u′1/2(u′−1/2)′′ = 0. consequently the linearising transformation is u′ = 1/w2. the linear equation corresponding to the nth element of the sequence is3 w(n+1) = 0 (14) for which a representation of its complete symmetry group [9, 10] is σi = xi∂w, i = 0, n, and σn+1 = w∂w. (15) when we reverse the transformation, the symmetries in (15) become σ̄i = {∫ xiu′3/2dx } ∂u, i = 0, n, and σ̄n+1 = u∂u. (16) to these n + 2 symmetries, we add the symmetry behind the reduction of order to (14), namely ∂u. it is a simple matter to demonstrate that this indeed is a representation of the complete symmetry group. the algebra is a2 ⊕s (n + 1)a1. 
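the linearisation invoked above can also be confirmed symbolically. in the following sketch (again our notation) the substitution u′ = 1/w² collapses the first element of the sequence to −2w′′/w, in agreement with the statement that (4) can be written as u′^{1/2}(u′^{−1/2})′′ = 0:

```python
import sympy as sp

x = sp.symbols('x')
w = sp.Function('w')(x)

# express the derivatives of u through w via the linearising
# transformation u' = 1/w^2
u1 = 1/w**2
u2 = sp.diff(u1, x)
u3 = sp.diff(u1, x, 2)

ks101 = u3/u1 - sp.Rational(3, 2)*u2**2/u1**2
print(sp.simplify(ks101))
# -> -2*Derivative(w(x), (x, 2))/w(x), so ks101 = 0 <=> w'' = 0
```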
the (n + 1)-dimensional abelian subalgebra is comprised of the symmetries σ̄i, i = 0, n. 6. the alternate sequence in the introduction, we mention the work of euler and leach [8] in which they demonstrated that the elements of a given sequence of ordinary differential equations of increasing order could be written in terms of equations of a lower order with a nonhomogeneous term of increasing complexity as one rose through the sequence. this is also the case with the present sequence. we illustrate the procedure with the first few elements of the sequence. we recall that the principle of the construction of the alternate sequence is based upon the relationship between one member of the sequence and the next member through the generator of sequences. the initial step is to solve the equation ( d − u′′ 2u′ ) q1 = 0. (17) thereafter, one proceeds in a sequential manner by solving ( d − u′′ 2u′ ) qn = qn−1, n = 1, . . . , (18) where we adopt the convention that q0 = 0. the solution of (17) is q1 = c0u′1/2, (19) where u′−1/2 is an integrating factor for ks1014. the representation of the second member of the sequence 3in a usual manner of dealing with differential equations one forgets about the factor 1/w corresponding to the u′1/2. when it comes to the generation of differential sequences, there is no place for such slackness. were one to wish to consider the differential sequence corresponding to that of the kummerschwarz sequence, the base equation would have to be w′′/w = 0. as we see below, the removal of denominators entails an adjustment of the generator. 4this property was firstly noted in the case of recursion operators. it appears to be somewhat robust! 430 vol. 60 no. 5/2020 properties of a differential sequence based upon the kummer-schwarz. . . element identifier possible exponents ks101 -1, 1 ks102 -3, -1, 1 ks103 -5, -3, -1, 1 ks104 -7, -5, -3, -1, 1 ks105 -9, -7, -5, -3, -1, 1 ks106 -11, -9, -7, -5, -3, -1, 1 ks107 -13, -11, -9, -7, -5, -3, -1, 1 ks108 -15, -13, -11, -9, -7, -5, -3, -1, 1 ... ks10n -2n+1, -2n+3, -2n+5,... -7, -5, -3, -1, 1 table 1. kummer-schwarz sequence: possible exponents of the leading-order term for the first eight elements is then ks102 : u′′′ u′ − 3u′′2 2u′2 = c0u′1/2. (20) it is an easy matter to demonstrate that in general ks10n : u′′′ u′ − 3u′′2 2u′2 = ( n−1∑ i=0 1 i! cn−1−ix i ) u′1/2. (21) 7. conclusion in the article, we have reported the results of our analysis of the elements of this sequence. it remains to interpret the results of the singularity analysis in the light of our knowledge of a route to obtain the solution of each element of the sequence in terms of a quadrature. the nth element of the sequence has n possible values for the exponent of the leading-order term, which are acceptable, ie negative integers. as was noted in table 1, for the first eight elements of the sequence, the precise values are −(2i − 1), i = 1, 2, . . . , n. in table 2, the resonances for each of the negative exponents of the leading-order term are given. several features are to be noted. firstly, there is the possibility of repeated resonances, which indicates the introduction of a logarithmic term into the laurent expansion for the solution. the number of exponents of the leading-order term for which this happens increases as one proceeds down the sequence. 
secondly, the highest exponent leads to a right painlevé series, albeit with the intrusion of the unhappy logarithm, and the lowest exponent to what would be a left painlevé series were it not for a single positive resonance equal in value to the negative of the exponent. for the exponents between the highest and the lowest, one sees a selection of both negative and positive resonances (apart from the generic −1), which indicates a full laurent expansion, again with the possibility of logarithmic terms occurring. as should be by now well known, the right painlevé series is a laurent expansion in the neighbourhood of the singularity convergent on a punctured disc and the left painlevé series has the nature of an asymptotic expansion in that it exists on the exterior of a disc centred on the singularity. the full series is defined over an annulus centred on the similarity and the existence of multiple instances of these series indicates a succession of annuli, the bounding circles of which are defined by successive singularities. thirdly, the value of the highest resonance shows some variability. for patterns of resonances, in which there is no repeated resonance or the repeated resonance is the highest resonance, the highest resonance is the negative of the exponent of the leading-order term. for the other patterns of the resonances – always starting from the more positive exponents – this value is exceeded to the extent that it is necessary to provide a full number of constants of integration. evidently, some explanation of these features is required! fortunately, we are in a position to describe the solution of each element of the sequence due to the property of the linearisation noted in §5. to provide the explanation, we make use of the third element of the sequence, which has the double merits of featuring in almost every property (the repetition of logarithms is not one of them) mentioned in the previous paragraph. the third element of the sequence, ks103, u′′′′′ u′ − 6u′′u′′′′ u′2 − 9u′′′2 2u′2 + 45u′′2u′′′ 2u′3 − 105u′′4 8u′4 = 0, (22) takes the linear form w(4) = 0 (23) under the transformation u′ −→ w−2. the solution of (23) is w(x) = p3(x), (24) 431 a. maharaj, k. andriopoulos, p. leach acta polytechnica element identifier exponents resonances ks101 -1 -1, 0, 1 ks102 -1 -1, 0, 1 (bis) -3 -2, -1, 0, 3 ks103 -1 -1, 0, 1 (bis), 2 -3 -2, -1, 0, 1, 3 -5 -3, -2, -1, 0, 5 ks104 -1 -1, 0, 1 (bis), 2, 3 -3 -2, -1, 0, 1, 2, 3 -5 -3, -2, -1, 0, 1, 5 -7 -4, -3, 2, -1, 0, 7 ks105 -1 -1, 0, 1 (bis), 2, 3, 4 -3 -2, -1, 0, 1, 2, 3 (bis) -5 -3, -2, -1, 0, 1, 2, 5 -7 -4, -3, -2, -1, 0, 1, 7 -9 -5, -4, -3, -2, -1, 0, 9 ks106 -1 -1, 0, 1 (bis), 2, 3, 4, 5 -3 -2, -1, 0, 1, 2, 3 (bis), 4 -5 -3, -2, -1, 0, 1, 2, 3, 5 -7 -4, -3, -2, -1, 0, 1, 2, 7 -9 -5, -4, -3, -2, -1, 0, 1, 9 -11 -6, -5, -4, -3, -2, -1, 0, 11 ks107 -1 -1, 0, 1 (bis), 2, 3, 4, 5, 6 -3 -2, -1, 0, 1, 2, 3 (bis), 4, 5, -5 -3, -2, -1, 0, 1, 2, 3, 4, 5 -7 -4, -3, -2, -1, 0, 1, 2, 3, 7 -9 -5, -4, -3, -2, -1, 0, 1, 2, 9 -11 -6, -5, -4, -3, -2, -1, 0, 1, 11 13 -7, -6, -5, -4, -3, -2, -1, 0, 13 ks108 -1 -1, 0, 1 (bis), 2, 3, 4, 5, 6, 7 -3 -2, -1, 0, 1, 2, 3 (bis), 4, 5, 6 -5 -3, -2, -1, 0, 1, 2, 3, 4, 5 (bis) -7 -4, -3, -2, -1, 0, 1, 2, 3, 4, 7 -9 -5, -4, -3, -2, -1, 0, 1, 2, 3, 9 -11 -6, -5, -4, -3, -2, -1, 0, 1, 2, 11 13 -7, -6, -5, -4, -3, -2, -1, 0, 1, 13 -15 -8, -7, -6, -5, -4, -3, -2, -1, 0, 15 table 2. kummer-schwarz sequence: resonances for the permissible values of the exponent of the leading-order term 432 vol. 60 no. 
5/2020 properties of a differential sequence based upon the kummer-schwarz. . . where p3(x) is a polynomial of degree three in x. consequently, the solution of (22) can be written in terms of the quadrature, u(x) = ∫ dx p3(x)2 + m. (25) the evaluation of the quadrature in (25) is, in principle, a simple matter due to the fundamental theorem of algebra and the use of partial fractions. however, the interpretation of the results of the singularity analysis obtained above requires that the quadrature is to be approached with a certain degree of delicacy to illustrate the different possibilities. we commence with the case p = −1 and write the solution in terms of a polynomial about the singularity at x0 in terms of the variable χ = x−x0. we have p3(x) = kχ(χ−a)(χ− b), (26) where we have written the factors in the sense that 0 < |a| ≤ |b| to empathise that we are dealing with a simple pole at x0. when we substitute (26) into (25), we obtain u(x) = ∫ dx (kχ(χ−a)(χ− b))2 + m. (27) we may write the integrand of (27) as 1 k2χ2a2b2 ( 1 − χ a )−2 ( 1 − χ b )−2 . (28) after applying the binomial expansion and simplifying, we may write (28) as 1 k2a2b2 [ 1 χ2 − 2 ( 1 a + 1 b ) 1 χ + ( 3 a2 + 3 b2 + 4 ab ) +2 ( 2 a3 + 2 b3 − 3 ab2 − 3 a2b ) χ... ] . (29) when we substitute (29) into (27) and perform the quadrature, we obtain u(x) = m − 1 k2a2b2 [ 1 χ + 2 ( 1 a + 1 b ) log χ − ( 3 a2 + 3 b2 − 4 ab ) χ − ( 2 a3 + 2 b3 − 3 ab2 − 3 a2b ) χ2... ] . (30) the occurrence of the unfortunate logarithm and the requisite number of constants of integration (five in this case, k,a,b,x0 and m) is obvious. the quadrature evaluated above is valid to the next singularity at χ = a. for the case p = −5, we may write the integrand of (27) as 1 k2χ6 ( 1 − a k )−2 ( 1 − b k )−2 . (31) after the application of the binomial expansion and simplification of (31), we obtain 1 k2χ6 [ 1 − 2(a + b) χ + 3(a2 + b2) + 4ab χ2 − 4(a3 + b3) + 6a2b2 + 6ab2 χ3 ... ] . (32) when we substitute (32) into (27) and perform the quadrature, we have u(x) = 1 k2 [ − 1 5χ5 + (a + b) 4χ6 − 3(a2 + b2) + 4ab 7χ7 + 2(a3 + b3) + 3a2b + 3ab2 4χ8 ... ] + m, (33) where the requisite number of constants of integration is also obvious. finally, we consider the case p = −3 where the full series is defined over an annulus centred on the singularity and we write the factors in the sense that |a| < |χ| < |b|. we now write integrand of (27) as 1 k2b2χ4 ( 1 − a χ )−2 ( 1 − χ b )−2 . (34) by applying the above method to (34) we obtain u(x) = m + 1 b2k2 [( − 4 b3 ... ) log χ− 1 χ ( 3 b2 + 8a b3 ... ) − 1 2χ2 ( − 2 b − 6a b2 − 12a2 b3 ... ) − 1 3χ3 ( 1 + 4a b + 9a2 b2 + 16a3 b3 ... ) − 1 4χ4 ( −2a− 6a2 b − 12a3 b2 ... ) − 1 5χ5 ( 3a2 + 8a3 b ... )] . (35) we observe that (35) contains an unfortunate logarithmic term, which is not indicated in the resonances for p = −3. this is possible since we have a complete laurent series, which must necessarily be convergent in an annulus centred on the singularity.the kummerschwarz equation and the members of the sequence generated from it by the generator (12) can be easily integrated. what we have done here is to explore its properties from a wider perspective so that a greater appreciation of its properties can be gained. the exploration of different aspects of such sequences adds to our understanding of them. acknowledgements pgll would like to thank the national research foundation of the republic of south africa, university of kwazulu-natal and durban university of technology for their continued support. 
am would like to thank the durban university of technology for their continued support.

references
[1] p. j. olver. evolution equations possessing infinitely many symmetries. journal of mathematical physics 18:1212 – 1215, 1977. doi:10.1063/1.523393.
[2] c. s. gardner. korteweg-de vries equation and generalisations iv. the korteweg-de vries equation as a hamiltonian system. journal of mathematical physics 12:1548 – 1551, 1971. doi:10.1063/1.1665772.
[3] i. m. gelfand, l. a. dikii. asymptotic behaviour of the resolvent of sturm-liouville equations and the algebra of the korteweg-de vries equations. russian mathematical surveys 30:77 – 113, 1975. doi:10.1070/rm1975v030n05aneh001522.
[4] m. euler, n. euler, n. petersson. linearizable hierarchies of evolution equations in (1 + 1) dimensions. studies in applied mathematics 111:315 – 337, 2003. doi:10.1111/1467-9590.t01-1-00236.
[5] n. petersson, n. euler, m. euler. recursion operators for a class of integrable third-order evolution equations. journal of mathematical physics 112:201 – 225, 2004. doi:10.111/j.0022-2526.2004.02511.x.
[6] m. euler, n. euler, p. g. l. leach. the riccati and ermakov-pinney hierarchies. journal of nonlinear mathematical physics 14:290 – 310, 2007. doi:10.2991/jnmp.2007.14.2.11.
[7] k. andriopoulos, p. g. l. leach, a. maharaj. on differential sequences. applied mathematics and information sciences an international journal 5(3), 2011.
[8] n. euler, p. g. l. leach. aspects of proper differential sequences of ordinary differential equations. theoretical and mathematical physics 159:473 – 486, 2009. doi:10.1007/s11232-009-0038-y.
[9] k. andriopoulos, p. g. l. leach, g. p. flessas. complete symmetry groups of ordinary differential equations and their integrals: some basic considerations. journal of mathematical analysis and applications 262:256 – 273, 2001. doi:10.1006/jmaa.2001.7570.
[10] k. andriopoulos, p. g. l. leach. the economy of complete symmetry groups for linear higher dimensional systems. journal of nonlinear mathematical physics 9(s-2), 2002.

1 introduction
concurrent engineering technologies provide essential tools for acceleration of time-to-market procedures in automotive industry. reciprocating engines could be a prime mover in future decades if all theoretically promising improvements are evaluated, tested and quickly realized after passing through the selection procedures. the huge amount of simulation codes in the field of computational fluid dynamics, experimental tools and the data-processing capacities of contemporary computers on the one hand, and human inventive capacities on the other, call for their application in the thermodynamic design of contemporary engines. the use of a virtual engine during the design phase is nowadays essential.
this is due to both the new demands on efficiency and emission levels (including carbon dioxide levels) and the new possibilities in engine fuelling and combustion management together with the downsizing of engines thanks to super/turbocharging. the current situation calls for changes not only due to great flexibility of control but also due to the new actuating concepts, e.g., in common rail injection, valve actuation, boost pressure control, etc. this should be reflected in the early design phase, giving an opportunity to predict new possibilities not only to provide a perfect simulation of the standard state usable in parametric design but also to incorporate new concepts. this paper summarizes the possibilities and achievements in spark ignited (si) turbocharged engine design. engine models employed up to now, their hierarchical structure and the philosophy of the use of previous experience are described. a modular system of general thermodynamic elements usable for all types of volumetric engines (reciprocating, rotating, etc.) based on a finite volume approach has already been described [1, 2, 3]. using these tools, real limits for high-b.m.e.p., downsized si engines and optimized cycle parameters can be estimated in advance, taking into account the auto-ignition and knock limits. it is especially worthwhile to estimate an appropriate compression ratio that cannot be changed simply in a wide range after the engine structure is manufactured. the application of advanced models to the simulation of new combustion concepts is also discussed. the purpose of the paper is to show the different level model use in the concurrent procedures. therefore, a brief description is given of the applied methods and the results of simulations. 2 concept of computer aided concurrent design of an ic engine engineering computational models can be classified according to the depth of reality description. this reflects the detail of reality description, namely whether both space and time are treated continuously, or one of them only discretely [4]. a distinction can be made between computational models on the deep level, medium level and shallow level. the deep level considers both space and time as continuous (partial differential equations (pde) are used). the medium level usually considers space as discrete and time as continuous (ordinary differential equations (ode) or differential-algebraic equations (dae) are used). the other possibility of continuous space and discrete time is also used, but less frequently. the shallow level considers both space and time discretely (nonlinear algebraic equations are generally used) (fig. 1). the initial model is usually at the deep level. the medium level is a simplification of the deep level obtained by describing of the whole space or time area (interval) by a single variable. the shallow level is simplified by a further step where not only space (time) but also time (space) is described by a single variable. this means that the created object is considered as a chain of discrete events in discrete space and time. certainly there exists a mixture of these models where different elements of different system levels are described by different kinds of models. the primary reasons for simplifying of computational models is to accelerate simulation of the product behavior. the other purpose is to provide the inversion of the models on the shallow level, which is usually impossible or possible only with great difficulty on the medium or deep level. 
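the deep-to-medium transition described above is, in practice, a semi-discretisation. as a minimal sketch (a generic one-dimensional diffusion equation serves as a stand-in here, not one of the engine models), the method of lines replaces the spatial derivatives of a pde by finite differences and leaves a system of odes in time:

```python
import numpy as np
from scipy.integrate import solve_ivp

# deep level: u_t = a * u_xx on 0 < x < 1 with u = 0 at the boundaries.
# medium level: keep time continuous, discretise space on n interior
# nodes, leaving du_i/dt = a*(u_{i-1} - 2 u_i + u_{i+1})/dx^2.
a, n = 1.0e-2, 50
dx = 1.0/(n + 1)

def rhs(t, u):
    u_pad = np.concatenate(([0.0], u, [0.0]))  # dirichlet boundaries
    return a*(u_pad[:-2] - 2.0*u_pad[1:-1] + u_pad[2:])/dx**2

x = np.linspace(dx, 1.0 - dx, n)
u0 = np.sin(np.pi*x)                  # initial profile
sol = solve_ivp(rhs, (0.0, 10.0), u0, method='BDF')  # stiff ode solver
print(sol.y[:, -1].max())  # decays like exp(-a*pi^2*t), here ~ exp(-1)
```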
the inversion of models [3] is necessary for the application of the principle of successive approximations as the basic solving principle of the fundamental paradox of system theory – for details see [4]. © czech technical university publishing house http://ctn.cvut.cz/ap/ 41 acta polytechnica vol. 44 no. 3/2004 integration of cfd methods into concurrent design of internal combustion engine m. polášek, j. macek, o. vítek, k. kozel this paper describes patterns of algorithms for different innovative levels of design at parametric, configuration and conceptual levels. they can be applied to computer-aided engine design (ced). data structures, process simulation hierarchy, engine simulation modules and the requirements for further development are described. an example of advanced thermodynamics modeling of combustion engines is included. keywords: internal combustion engine, simulation, cad, cfd, combustion, knock. the list of necessary engine sub-models is shown in the following table. models of different levels (ae, ode, etc.) are used either simultaneously or uniformly. the first approximation after the engine configuration is set provisionally, and is based mostly on ae methods (if these methods are missing a simple regression formula is used interpolating between the most similar design cases already analyzed). for the next turns of iteration, more profound methods are used. pde methods are not mature in some cases for these aims. this concerns especially cfd models and also some complicated solid continuum dynamics. the reasons lie in the insufficient result of a trade-off between cpu time, the number of inputs needed (e.g., a very detailed description of the geometry is not available in the starting phase of engine design) and reliability – precision of results. on the other hand, top level methods should be used even for parameter estimation in the case of high boost pressures, where no experience is available. the results naturally influence the demands on engine motor-management. if a conflict of targets occurs, the initially estimated nominal power has to be changed. some examples of the formulation of si-focused sub-models and their significance for engine optimization will be presented in the following text. further, the application of cfd models in the design of the ic engine is demonstrated on a simulation of an unconventional cycle. 3 application of a thermodynamic model to si engine downsizing concept simulation the current trend toward significant co2 reduction calls for an increase in the fuel economy of engines and cars. in the case of passenger cars, there is permanent competition between si engines and di diesel engines. before the two concepts converge, both have to be further optimized to fully utilize all potentials. the future of the conventional si engine is the downsizing concept. it brings in a highly turbocharged small displacement engine. it is obvious that this causes new problems of higher mechanical and thermal stresses, knocking, turbocharger control, etc. the problems of downsized si engine boost pressure optimization can be summarized as follows: 1. boost pressure at a certain speed is primarily given by the required torque. 2. the high engine load causes higher temperature of the walls, as well as higher temperature of the cylinder charge. 3. knocking combustion may occur at high temperature and pressure. this has to be avoided. knocking considerably limits the parameters of si engines. 
it is usually necessary to find a compromise among the used fuel knock rating/engine compression ratio/spark advance/boost pressure to operate the engine at knock limits to gain as high engine efficiency as possible. 4. knock can be eliminated by delaying combustion or by decreasing the compression ratio. both of these changes significantly deteriorate efficiency. 5. boost pressure (or required torque) should be checked against the engine efficiency gain of a downsized engine. knock simulation is probably the most crucial problem of turbocharged si engine simulations. knock prediction is an important point in basic decisions on target parameters in si engine development. a quasi-dimensional model (q-d or 0-d) was used for the engine parameter set-up in the first stage. in fact, the model 42 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 44 no. 3/2004 fig. 1: trace of the design process through design procedures level of model ae ode dae pde thermodynamic model in the standard 0-d or 1-d version x x thermodynamic model decomposed to processes in engine cycle parts (see examples in the next section) x x thermoaerodynamic 2.5-d model suitable for new engine lay-outs x fuel properties evaluation (especially for gas mixtures) x fuel spray model x aerodynamic simulation of valves and ports x aerodynamic simulation of exhaust gas aftertreatment systems (catalysts, soot filters) x radial centripetal turbocharger turbine simulation x gas mixing devices simulation x piston ring leakage and thermal load simulation x boundary conditions of thermal load evaluation x fem static analysis applied to pistons, piston rings, heads, cylinder liners, valves, crankcases (including cylinder head gasket), connecting rods and crankshafts, etc. x stands at the second stage of the original model hierarchy, but it represents a very conventional and easy-to-use approach to simulation of the engine cycle. experimental data and current knowledge can be used to calibrate the model. therefore, it is not necessary to employ a shallow model for a preliminary determination of the engine parameters. instead, the quasi-dimensional model is quite complex as regards the extent to which the engine lay-out is described – see fig. 2. the model enables us to simulate an si engine including turbocharger, crank mechanism, etc. the obvious limitation of the q-d model is the discrete treatment of all parts (pipes), which does not account for the fluid dynamics phenomena. the procedure for using a different level model is described on the example of turbocharged si engine optimizations. in the procedure presented here, three major levels of simulation are employed in an integrated procedure. the hierarchy is demonstrated in the following table. this work uses a decoupled simulation of the engine cycle by means of the obeh code and the knock properties of a fuel (knock code. describing knock in an si engine, we usually consider self-ignition of unburned gas at a late stage of combustion. the phenomenon is often called end-gas knock. the end-gas knock is determined mainly by the temperature of the unburned mixture, which is pre-heated and compressed by the propagating flame and is cooled by the heat transfer to the cooling system via the cylinder walls. therefore, it is necessary to formulate a zone model enabling a description of the unburned mixture, the reaction zone and the burned mixture. 
this is usually done using a q-d approach introducing several zones describing the pertinent region in the combustion chamber. each of the zones represents an open thermodynamic system to be described by conservation laws. the temperature of the unburned mixture can be used to estimate the occurrence of knock. this procedure uses the assumption of a critical amount of products, i.e., free radicals, to be prepared to induce self-ignition. in a perfectly stirred reactor with constant pressure and temperature, the time to self-ignition induction can be estimated using the guldberg-waage-arrhenius law for a particular fuel: � � � � � � � � � a p b t n exp (1) � defines ignition delay, constants a, b and exponent n depend on the fuel under consideration. in fact, the reaction mechanism describing oxidation of a fuel is usually very complicated, and the solution is demanding. equation (1) simplifies the procedure for a global description of all undergoing reactions. the ignition delay time itself is not sufficient for knock prediction, as both the temperature and the pressure in the cylinder of an engine change rapidly. the history of the states, i.e., the temperature and pressure that unburned mixture is exposed to, has to be taken into account. this can be done using an empirical assumption of transport and conservation of “readiness” of the mixture for self-ignition [5]. hence, the instant of knock is estimated as follows: dt t t i � � � � 0 1 . (2) the integration starts at the beginning of compression, and integration limit ti identifies the time of knock. there are two possibilities for determining induction time � – use of the empirical reaction time correlation or that of the chemical mechanisms describing the full oxidation process of a fuel [6]. in the work presented here, an empirical correlation for the induction time proposed in [7] has been used to describe the fuel behavior. the authors further employed a chemical kinetics solution to describe the induction time of isooctane and methane. the model has been combined in decoupled way with the obeh cycle, as this provides us with the parameters of the knock sub-model. the results of the chemical kinetics computations and a comparison with data obtained by the empirical correlation are shown in fig. 3. the figure shows that the empirical correlation provides a good description of the ignition delay time in the region between high and low temperature ignition. this region is typically exhibited by hydrocarbon fuels – negative temperature behavior – and it is important to describe it properly when making si engine © czech technical university publishing house http://ctn.cvut.cz/ap/ 43 acta polytechnica vol. 44 no. 3/2004 fig. 2: layout of a standard 1-d engine model 1st stage 2nd stage 3rd stage quasi-dimensional model – obeh code 1-d fluid dynamics model + fe solver for engine part temperature – gt-power code cfd code + coupled chemical kinetics solution – amem code decoupled chemical kinetics solution – regression formulas (ae) to be used as sub-models knock predictions. typical results of si engine optimization [8] – are shown in fig. 4. 4 engine parameter tuning – wall heat transfer assessment in the next stage of engine development, a more sophisticated 1-d model together with the fe solver of the temperature field of the engine parts was employed [9, 10]. the layout of this engine was the same, but the model accounts for the fluid dynamics in the pipes. 
the fe solver was used for a more detailed simulation of the influence of the cylinder wall temperature on knock induction. the solver was also used to evaluate the heat transfer to the cooling water [11]. this is very important precisely in the case of a highly turbocharged si engine, because the high power density in the cylinder of the engine enhances the heat transfer to the walls and the cooling water. this leads to very hot regions on the cylinder liner/cooling water and the cylinder head/cooling water interfaces, which exhibit local boiling of the cooling water. the influence of the heat transfer coefficients is described in fig. 5. 5 detailed in-cylinder phenomena solution – cfd model application the cfd model is the most advanced model in the simulation tool hierarchy. it is based on pde and it treats time and space continuously. such a model is usually limited to the simulation of an engine part, i.e., in-cylinder phenomena, etc., as it is very demanding. it can be combined with the previous models in order to determine initial and boundary conditions. in the work presented here, the advanced eulerian multizone model [12] developed by the authors was used to 44 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 44 no. 3/2004 fig. 5: temperature field of the engine head for 4 different heat transfer coefficients (alfa) from head to cooling water. the heat transfer coefficient was chosen according to the heat flux density on the wall in an iterative procedure to ensure a realistic temperature drop on the wall 0.01 0.1 1 10 100 1000 0.6 0.8 1 1.2 1.4 1.6 1.8 10 3 /t [1/k] t [m s ] lo g . m ì øí tk o p = 10 bar (chem) p = 20 bar (chem) p = 40 bar (chem) p = 60 bar (chem) p = 10 bar (d-e) p = 20 bar (d-e) p = 40 bar (d-e) p = 60 bar (d-e) fig. 3: ignition delay time of isooctane. comparison of detailed chemistry solution (chem) and empirical correlation of douaud-eyzat [7] (d–e) 0,24 0,26 0,28 0,3 0,32 0,34 0,36 0,38 0,4 -20 -15 -10 -5 0 5 10 15 pøedstih [°] e fe k ti v n í ú è in n o s t [] 200 220 240 260 280 300 320 340 360 m p e [g /k w .h ] (m ì rn á s p o tø e b a ) k. p. = 8,5 k. p. = 9,5 k. p. = 10,5 ef. úèinnost mpe, k. p. = 8,5 mpe, k. p. = 9,5 mpe, k. p. = 10,5 mpe fig. 4: si engine optimization. engine efficiency and s.f.c. (mpe) as a function of spark advance and compression ratio (k.p.). boost pressure of 1.5 bar. simulate knock induction in a disc-shape cylinder. it requires the formulation of a very comprehensive model, including the chemistry. unlike the previous models, the results of cfd simulations are not completely useable in the engine design procedure, but they are very useful as they give a deep insight into phenomena which are of concern. in the case presented here, the results show that knock occurs very close to the cylinder walls, which leads to a significant thermal load [12]. 6 application of the cfd model to unconventional cycle simulation the simulation of an unconventional cycle by means of cfd simulation is a typical example of the use of an advanced method to estimate engine parameters and to propose an engine design. the amem code as adopted to simulate an ic engine with a porous medium (pm) taking place in the cylinder [13]. the purpose of the pm insert is to realize in-cylinder heat regeneration (icr) in order to increase engine efficiency. the porous medium acts as a heat regenerator as it transfers energy to an instantaneous cycle from a the previous cycle. 
the energy supplied during the combustion may help to homogenize and stabilize the combustion. in addition, the “charging” of the regenerator restricts the temperature in the cylinder, preventing the formation of thermal nox. as for the design of the heat regenerator, the authors considered a ceramic insert placed in a pre-chamber of a cylinder. the insert is made of sic. a pm burner with high porosity (0.9) is assumed in the presented simulations. the pm burner concept has been used in steady state combustion systems, e.g., furnaces, boilers, with great success. however, the pm – ic engine simulation has to answer the question whether the concept provides all the advantages under unsteady conditions. both engine efficiency and pollutant formation issues have to be considered. this explains why the cfd model should be used. combustion in the new concept usually has to be modeled in a very complex way, employing chemical kinetics applied to a full reaction scheme of fuel oxidation. examples of the results are shown in fig. 6 and fig. 7. the local temperature during combustion is unfavorably high – fig. 6 – which enhances no formation – fig. 7. the fuel is injected into the pm and the main part of the combustion occurs in the pm, which is inconvenient for the temperature. the combustion system only partially homogenize the temperature field during combustion. see [13] for a more detailed description of results. the first simulations show that the pm concept needs to be further optimized. despite the drawbacks of the cfd simulation, it provides significant advantages in the novel combustion system case, where there is a lack of experimental data. moreover, some experimental techniques, e.g., piv and lif, cannot be employed because optical access to the pm is not possible. 7 conclusions this paper describes the main steps in inplementing of advanced models of the ice system for preliminary optimization and design proposals. the current developments in engine structure caused by new fuels, materials and mechatronics approaches to control actuation are involved in it. this situation changes many old standard paradigms. on the other the model reacts adequately to ever increasing demands on efficiency and environmental impact of engine operation, manufacture and disposal. new thermodynamic procedures modularized according to the presented demands were described together with their hierarchy of precision (ae, …, pde). the use of precision levels should be relevant to the trade-off between predictable force, dependence on previous experience and requirements on detailed information. the use of downsized engines may ensure better well-to-payload efficiency of a vehicle, thus reducing carbon dioxide and other emissions. the application of the different level models, and combination of them considerably enhance and accelerate the design process. the knock prediction capability is important mainly when proposing engine pa© czech technical university publishing house http://ctn.cvut.cz/ap/ 45 acta polytechnica vol. 44 no. 3/2004 fig. 6: temperature field of the gas phase at the end of combustion 40 degca after tdc fig. 7: no mass fraction at 40 degca after tdc. hydrogen combustion. chemistry model prediction. rameters which cannot be easily changed for real engines, e.g., the compression ratio. the knock estimation procedure based on a combination of the chemistry and the q-d zone model can be extended to new fuels. it adds predictive capabilities to the model. 
the classical simple-to-advanced model hierarchy is advantageous in problems where either calibration of the model with experiments or current knowledge can be used to determine a model parameter. however, very advanced models based on pde, e.g., cfd models have a great potential in problems where there is a lack of experimental data. in this case, such a model can be favorably employed in the first stage of the design procedure to explore this limits of a new concept. moreover, the results of cfd computations can be generalized. this provides the basis for sub-model formulation to be used in simpler simulation tools. acknowledgments this research has been supported by the research center project no. ln00b073 of the ministry of education of the czech republic. the support is gratefully acknowledged. references [1] macek j., valášek m.: “initial methodology for concurrent engineering”. in: fiala p., smrcek l. (eds.): proc. of first international conference on advanced engineering design, ctu prague, prague 1999, p. 286–290, isbn 80-01-02055-x. [2] macek j., valášek m.: “computer aided configuration design of internal combustion engines – ced system”. sae paper 2002-01-0903.. in: modeling of si engines and multi-dimensional engine modeling. warrendale, pa: society of automotive engineers, 2002, vol. 1, p. 225–241, isbn 0-7680-0970-7. [3] macek j., polášek m., vítek o., hvězda j.: “comparison of governing equations for lagrangian/eulerian approaches to engine modelling”. journal of kones internal combustion engines. vol. 9 (2002), no. 1–2, p. 172–180, issn 1231-4005. [4] valášek m.: “principles of computer-aided systems operation”. in: trappl r. (ed.): cybernetics and systems 88, kluwer, dordrecht 1988, p. 203–210. [5] livengood j. c., wu p. c.: “correlation of autoignition phenomenon in internal combustion engines and rapid compression machines”. proceedings of fifth international symposium on combustion, reinhold (1998), p. 347. [6] polášek m., hofman k.: “modelování detonací zážehových motorů”. in: koka 2002. račkova dolina: spu nitra, 2002, vol. 1, p. 59–66, isbn 80-8069-051-0. [7] doaud a. m., eyzat p.: four-octane-number method for predicting the anti-knock behavior of fuels and engines. sae paper 780080, sae trans., vol. 87 (1987). [8] hofman k.: “vliv detonací na oběh zážehového motoru”. diplomová práce u220.1, vedení takáts m., macek j., polášek m., čvut fs, praha: 2002. [9] gt-power, user’s manual – vers. 5.1. gamma technologies inc., 601 oakmont lane, suite 220, westmont, il, usa. [10] vítek o., polášek, m.: “tuned manifold system application of 1-d pipe model”. sae paper 2002-01-0004. in: modeling of si engines and multi-dimensional engine modeling. warrendale, pa : society of automotive engineers, vol. 1 (2002), p. 37–46, isbn 0-7680-0970-7. [11] hlávka t.: “studie přeplňování zážehového motoru”. diplomová práce u220.1, vedení macek j., polášek m., čvut fs, praha: 2002. [12] polášek m., macek j., takáts m., vítek, o.: “application of advanced simulation methods and their combination with experiments to modeling of hydrogen fueled engine emission potentials”. sae paper 2002-01-0373. in: modeling of si engines and multi-dimensional engine modeling. sae congress 2002. detroit: sae. 2002. isbn 0-7680-0970-7. [13] polášek m., macek j.: “homogenization of combustion in cylinder of ci engine using porous medium”. sae paper 2003-01-1085. in: homogeneous charge compression ignition (hcci) combustion. warrendale, pa: society of automotive engineers, vol. 1 (2003), isbn 0-7680-1165-5. ing. 
miloš polášek, phone: +420 224 352 492, e-mail: polasekm@fsid.cvut.cz
ing. oldřich vítek, phone: +420 224 352 507, e-mail: oldrich.vitek@fs.cvut.cz
prof. ing. jan macek, drsc., phone: +420 224 352 504, e-mail: jan.macek@fs.cvut.cz
josef bozek research center, czech technical university in prague, faculty of mechanical engineering, technická 4, 166 07 prague 6, czech republic
prof. rndr. karel kozel, drsc., phone: +420 224 357 365, e-mail: karel.kozel@fs.cvut.cz
department of technical mathematics, czech technical university in prague, faculty of mechanical engineering, karlovo nám. 13, 121 35 praha 2, czech republic

acta polytechnica 57(6):454–461, 2017
doi:10.14311/ap.2017.57.0454
© czech technical university in prague, 2017, available online at http://ojs.cvut.cz/ojs/index.php/ap

simple models of three coupled pt-symmetric wave guides allowing for third-order exceptional points

jan schnabel^{a}, holger cartarius^{a,∗}, jörg main^{a}, günter wunner^{a}, walter dieter heiss^{b,c}
a 1. institut für theoretische physik, universität stuttgart, 70550 stuttgart, germany
b department of physics, university of stellenbosch, 7602 matieland, south africa
c national institute for theoretical physics (nithep), western cape, south africa
∗ corresponding author: holger.cartarius@itp1.uni-stuttgart.de

abstract. we study theoretical models of three coupled wave guides with a pt-symmetric distribution of gain and loss. a realistic matrix model is developed in terms of a three-mode expansion. by comparing with a previously postulated matrix model it is shown how parameter ranges with good prospects of finding a third-order exceptional point (ep3) in an experimentally feasible arrangement of semiconductors can be determined. in addition it is demonstrated that continuous distributions of exceptional points, which render the discovery of the ep3 difficult, are not only a feature of extended wave guides but appear also in an idealised model of infinitely thin guides shaped by delta functions.

keywords: optical wave guides; third-order exceptional point; matrix model.

1. introduction
it is a well-known fact that the spectra of non-hermitian quantum systems can exhibit exceptional points of second order (ep2), i.e. branch point singularities at which two eigenstates coalesce [1–3]. they have been extensively studied theoretically [4–17] and their physical relevance has been demonstrated in impressive experiments [18–26]. in much rarer cases exceptional points of higher order (epn) are discussed [27–30]. in a matrix representation they can be identified by the fact that the matrix is not diagonalisable. with a similarity transformation one can reduce the matrix to a jordan normal form, where the epn appears as an n-dimensional jordan block [31]. in a third-order exceptional point (ep3) three states coalesce in a cubic-root branch point singularity, which already turned out to exhibit new effects beyond those of ep2s, such as an unusual chiral behaviour [28]. the exchange behaviour of the eigenstates for circles around an ep3 shows a complicated structure. it does not in all cases uncover the typical cubic-root behaviour [29, 32]. of special interest in the investigation of exceptional points are pt-symmetric systems, i.e. systems whose hamiltonians are invariant under the combined action of the parity operator p and the time reversal operator t [33].
in these systems the exceptional point marks a quantum phase transition, in which real eigenvalues merge under variation of a parameter and become complex if the parameter is varied further in the same direction. the eigenstates of the complex eigenvalues are not pt symmetric, this is only the case for the eigenstates with real eigenvalues. one speaks of broken pt symmetry, and the ep marks the position of the pt symmetry breaking. since the occurrence of exceptional points is a generic feature of the pt phase transition a large number of works exists for pt -symmetric quantum mechanics [27, 34– 49], quantum field theories [50, 51], electromagnetic waves [52–58], and electronic devices [59]. in these papers exceptional points of second order have been investigated in great detail. pt -symmetric optical wave guides, in particular, with an appropriate coupling between them, are ideally suited to generate higher-order exceptional points [29]. three coupled wave guides have already been used to theoretically investigate the influence of loss on a stirap procedure [60]. klaiman et al. [61] showed that a detailed theoretical modelling of a setup of two wave guides predicts the occurrence of an ep2, and directly proved it via its signatures, among them an increasing beat length in the power distribution. exactly this strategy has later been used experimentally [62]. encouraged by these findings the model was extended by heiss and wunner, who added a third wave guide between those with gain and loss to allow for a third-order exceptional point. they studied a simplified model consisting of infinitely thin wave guides modelled by delta functions [30]. in a follow-up paper a detailed investigation of a spatially extended setup with experimentally accessible parameters was used [63]. it was shown that the system is well capable of manifesting a third-order exceptional point in an experimentally feasible procedure. the purpose of this paper is to show that the essential properties of the system studied in [63] can already be found in much simpler descriptions, allowing for deeper insight. the whole three wave-guide setup can be mapped to a matrix model. in such a matrix model an ep3 can be found in a simple manner. however, the largest benefit is in the predictability 454 http://dx.doi.org/10.14311/ap.2017.57.0454 http://ojs.cvut.cz/ojs/index.php/ap vol. 57 no. 6/2017 simple models of three coupled pt -symmetric wave guides figure 1. pt -symmetric wave guide setup allowing for the occurrence of an ep3. three coupled slab wave guides are formed on a background material with refractive index n0 = 3.3 of gaas via an index variation of ∆n = 1.3 × 10−3. the middle wave guide can exhibit an additional index shift given by the value of nm. a gain-loss profile is introduced via the imaginary refractive index part γ. all distances are measured in terms of a constant length scale a = 2.5 µm via the dimensionless parameters sm and s1,2. the variation of the refractive index is only in x direction and obeys pt symmetry. of matrix structures allowing for an easy access to an ep3. since the influence of the physical parameters on the matrix elements is known from the mapping, the matrix can guide the search for appropriate physical parameter ranges. in [63] it was demonstrated that the ep3 of interest is surrounded by continuous distributions of ep2s or ep3s in the space of the physical parameters. 
this effect, in combination with the fact that an ep3 can show a square-root behaviour for parameter space circles [29, 32], renders its identification difficult. in this paper we show that this difficulty can be studied in the much simpler delta-functions model introduced in [30]. the remainder of the paper is organised as follows. in section 2 we provide a brief introduction into the system. the matrix model is developed in section 3, where we introduce the mapping of the full system onto three modes, which can be used to search for the best parameter ranges. in section 4 we show the appearance of continuously distributed exceptional points in the delta-functions model. the central results are summarised in section 5.

2. three optical wave guides with a complex pt-symmetric refractive index profile
the starting point of our investigation is the pt-symmetric optical wave guide system introduced in [63]. it consists of three coupled planar wave guides on a background material of gaas, which has a refractive index of n0 = 3.3 at the vacuum wavelength used in that study. the refractive index profile is supposed to be pt-symmetric, i.e. it possesses a symmetric real part representing the index guiding profile and an antisymmetric imaginary part describing the gain-loss structure, i.e. $n(x) = n^*(-x)$. it extends the idea klaiman et al. pursued for two wave guides. the physical parameters consist of dimensionless scaling factors sm and s1,2 used to define distances in units of a constant length a = 2.5 µm (cf. figure 1) and variations of the refractive index. the background index is shifted by a constant value of ∆n = 1.3 × 10−3, and an additional shift nm can be applied to the middle wave guide. the gain-loss parameter is labelled γ. the vacuum wavelength is assumed to be λ = 1.55 µm.

figure 1. pt-symmetric wave guide setup allowing for the occurrence of an ep3. three coupled slab wave guides are formed on a background material of gaas with refractive index n0 = 3.3 via an index variation of ∆n = 1.3 × 10−3. the middle wave guide can exhibit an additional index shift given by the value of nm. a gain-loss profile is introduced via the imaginary refractive index part γ. all distances are measured in terms of a constant length scale a = 2.5 µm via the dimensionless parameters sm and s1,2. the variation of the refractive index is only in the x direction and obeys pt symmetry.

for a wave propagation of transverse electric modes along the z axis the ansatz
$$E_y(x,z,t) = E_y(x)\,e^{i(\omega t - \beta z)} \qquad (1)$$
with $k = 2\pi/\lambda$ and the propagation constant $\beta$ can be applied, and leads to the wave equation
$$\left(\frac{\partial^2}{\partial x^2} + k^2 n(x)^2\right) E_y(x) = \beta^2 E_y(x), \qquad (2)$$
which is formally equivalent to the one-dimensional schrödinger equation
$$\underbrace{\left(-\frac{1}{2}\frac{\partial^2}{\partial x^2} + V(x)\right)}_{=H_\mathrm{sys}} \psi(x) = E\,\psi(x) \qquad (3)$$
with the relations
$$V(x) = -\frac{1}{2}k^2 n(x)^2, \qquad E = -\frac{1}{2}\beta^2. \qquad (4)$$
thus, the formalism of pt-symmetric quantum mechanics can be used for the setup. eq. (2) was solved numerically in [63]. in this work we introduce approximations preserving the main properties.

3. mapping onto a matrix model
in the first approximation we map the full hamiltonian of the setup shown in figure 1 to a three-mode matrix model. by this approach we check whether the system can give rise to a third-order branch point in a simple manner. this is clearly the case if the resulting hamiltonian is of the form proposed in [29], viz.
$$\hat{H}_\mathrm{math} = \begin{pmatrix} a - 2i\gamma & \sqrt{2}\,v & 0 \\ \sqrt{2}\,v & 0 & \sqrt{2}\,v \\ 0 & \sqrt{2}\,v & b + 2i\gamma \end{pmatrix} \qquad (5)$$
with $\gamma, v \in \mathbb{R}$ and $a, b \in \mathbb{C}$. the real and imaginary parts of the diagonal elements simulate the refractive index as well as the gain (loss) behaviour in the wave guides. the parameter $v$ represents a coupling between neighbouring wave guides via evanescent fields, and is thus related to the distance between them. all in all, $\hat{H}_\mathrm{math}$ reflects a situation in which each wave guide supports a single mode. however, this matrix is merely an abstract mathematical model without a direct connection to an experimental realisation.
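as an illustration of why the form of eq. (5) supports an ep3, note that for $a = b = 0$ its characteristic polynomial gives the eigenvalues $\lambda = 0, \pm 2\sqrt{v^2 - \gamma^2}$, so all three coalesce at $\gamma = v$, where the matrix has rank 2 and hence only a single eigenvector. the following numerical sketch is ours, not part of the original paper; it only assumes standard numpy:

```python
import numpy as np

def h_math(gamma, v, a=0.0, b=0.0):
    """matrix model of eq. (5); a and b may be complex."""
    s = np.sqrt(2.0) * v
    return np.array([[a - 2j * gamma, s, 0.0],
                     [s, 0.0, s],
                     [0.0, s, b + 2j * gamma]])

v = 1.0
for gamma in (0.5, 0.99, 1.0, 1.01):
    ev = np.sort_complex(np.linalg.eigvals(h_math(gamma, v)))
    print(f"gamma = {gamma:4.2f}: eigenvalues = {np.round(ev, 6)}")
# below gamma = v the spectrum is real, at gamma = v all three
# eigenvalues meet at zero (the ep3), above it two become complex.
```

the same coalescence condition reappears below for the mapped hamiltonian of eq. (8), where it reads $\eta = \sqrt{2}\,\sigma$.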
to establish such a connection we use the formal analogy between the wave equation (2) and the one-dimensional schrödinger equation (3) and calculate a matrix representation of our system in terms of
$$\hat{H}'_{ij} = \langle \psi_i | H_\mathrm{sys} | \psi_j \rangle. \qquad (6)$$
for this purpose we assume the same real index difference ∆n = 1.3 × 10−3 (and nm = 0) as well as the same width w = 2a = 5.0 µm for all three wave guides and use the ground state modes of each single potential well with the corresponding basis functions $\psi_i, \psi_j$. this results in a matrix of the form
$$\hat{H}' = \begin{pmatrix} \alpha + i\eta & \sigma' & \xi \\ \sigma' & \alpha & \sigma' \\ \xi & \sigma' & \alpha - i\eta \end{pmatrix} \qquad (7)$$
with $\alpha, \eta, \xi \in \mathbb{R}$ and $\sigma' \in \mathbb{C}$. the use of a common width is compatible only with sm = 1.0 and s2 − s1 = 2.0, which implies that the matrix elements still depend on the doublet (γ, s1), and thus on the distance s1 − sm between the wave guides. eq. (7) does not show the form predicted for the appearance of an ep3 in eq. (5), as $\xi \neq 0$ and $\sigma' \in \mathbb{C}$. this corresponds to a situation in which the two outer wave guides are also connected by a coupling of their waves, which can happen due to a too small separation between the wave guides. however, for a sufficiently large separation s1 − sm the hamiltonian $\hat{H}'$ reduces to the form
$$\hat{\bar{H}} = \begin{pmatrix} \alpha + i\eta & \sigma & 0 \\ \sigma & \alpha & \sigma \\ 0 & \sigma & \alpha - i\eta \end{pmatrix} \qquad (8)$$
with $\alpha, \eta, \sigma \in \mathbb{R}$, which resembles the desired form of eq. (5) for a = b = 0 up to a constant shift $\alpha$. the transition $\hat{H}' \to \hat{\bar{H}}$ can be observed in figure 2, where the real eigenenergies of eq. (7) are shown as a function of the wave guides' distances. for comparatively small distances the energies are far apart from each other due to a stronger coupling between the modes. in this range only $\hat{H}'$ describes the full system correctly. for larger distances the energies approach each other, which means that we likewise obtain an accurate description of the system in terms of the hamiltonian $\hat{\bar{H}}$. this is the regime in which a search for an ep3 is most promising. the small separation between the modes also ensures that they are well separated from further states, and thus the reduction to three modes is justified.

figure 2. real eigenenergies $E = -\beta^2/2$ of the hamiltonian $\hat{H}'$ as a function of the scaled distance s1 − sm between the middle wave guide and the outer ones. for a small separation between the wave guides there is a comparatively strong coupling among the modes (grey area). with increasing distance the coupling becomes negligible and the system is well described by the new hamiltonian $\hat{\bar{H}}$ from eq. (8).

knowing the shape of an appropriate wave guide we now focus on a large separation between the wave guides with sm = 1.0, s1 = 28.0 and s2 = 30.0 and verify the existence of the ep3. we find it in the spectra by varying the gain-loss coefficient γ. as in the case of the numerical solution of eq. (2) carried out in [63], the ep3 can be reached by varying only one parameter. the result is depicted in figure 3a. we obtain the coalescence of all three real eigenenergies for $\gamma^m_{\mathrm{ep3}} = 0.0636\,\mathrm{cm}^{-1}$ in a cubic-root branch point singularity. beyond this point the spectrum becomes complex with two complex conjugate energies and one with a vanishing imaginary part. note that the middle state stays widely unaffected by an increase of the non-hermiticity parameter, which is also found in the more realistic descriptions [30, 63] as well as in flat band systems [64, 65].
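the following short sketch is ours and is meant only as an illustration; α and σ are placeholder magnitudes loosely inspired by figure 2, not the actual overlap integrals of eq. (6). it scans the gain-loss strength in the mapped hamiltonian $\hat{\bar{H}}$ of eq. (8) and locates the coalescence of the three eigenvalues, which occurs at $\eta = \sqrt{2}\,\sigma$:

```python
import numpy as np

def h_bar(alpha, eta, sigma):
    """mapped three-mode hamiltonian of eq. (8)."""
    return np.array([[alpha + 1j * eta, sigma, 0.0],
                     [sigma, alpha, sigma],
                     [0.0, sigma, alpha - 1j * eta]])

alpha, sigma = -89.5035, 1.0e-4   # illustrative numbers only
for eta in np.linspace(0.0, 2.0e-4, 9):
    ev = np.linalg.eigvals(h_bar(alpha, eta, sigma))
    # maximum pairwise distance of the eigenvalues -> 0 at the ep3
    spread = max(abs(ev[i] - ev[j]) for i in range(3) for j in range(3))
    print(f"eta = {eta:.2e}  spread = {spread:.3e}")
# the spread vanishes at eta = sqrt(2)*sigma; in the physical model eta
# is controlled by the gain-loss coefficient gamma, which is why the
# ep3 can be reached by varying gamma alone.
```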
to ascertain that this is indeed an ep3 we follow a standard procedure and perform a closed loop in a suitable parameter space around the supposed branch point singularity. to do so we have to introduce the complex parameters a, b from eq. (5), which break the underlying pt symmetry. we restrict ourselves to the specific choice $a = b = a_r + i a_i$ and add this to $\hat{\bar{H}}$, ending up with
$$\hat{H}_a = \hat{\bar{H}} + a \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \qquad (9)$$
using this form the circle is performed in the complex plane of $a$ along an ellipse with the parametrization
$$[0, 2\pi] \to \mathbb{R}^2, \qquad \varphi \mapsto \begin{pmatrix} a_r \\ a_i \end{pmatrix} = \begin{pmatrix} 10^{-5}\cos\varphi \\ 10^{-6}\sin\varphi \end{pmatrix} \qquad (10)$$
as illustrated in figure 3c. the corresponding state permutation is depicted in figure 3b and clearly exhibits the threefold exchange behaviour of an ep3; a numerical sketch of this loop test is given below.

figure 3. evidence of a third-order exceptional point in the matrix model for a large separation between the wave guides (sm = 1.0, s1 = 28.0, s2 = 30.0). the real index difference to the background material (∆n = 1.3 × 10−3) was set to be equal for all three wave guides, i.e. nm = 0. the coalescence of the three eigenenergies as a function of the non-hermiticity parameter γ is depicted in a) and appears at γep3 = 0.0636 cm−1. the lower panel shows the characteristic state permutation of an ep3 b) as one performs a closed loop around it in a specific parameter space c). here we circle the ep3 counter-clockwise in the complex plane of the asymmetry parameter a = ar + iai, where the specific symbols mark the starting points of the permutation and the arrows the corresponding direction.
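the loop of eq. (10) can be reproduced numerically. the sketch below is ours; α and σ are again illustrative placeholders, with the matrix placed exactly at its ep3 ($\eta = \sqrt{2}\,\sigma$). it diagonalises $\hat{H}_a$ of eq. (9) along the ellipse and follows the eigenvalues by continuity; after one full circle the three branches return cyclically permuted, which is the threefold exchange behaviour of figure 3b:

```python
import numpy as np
from itertools import permutations

def h_a(a, alpha=0.0, sigma=1.0):
    """eq. (9): the mapped hamiltonian at its ep3 (eta = sqrt(2)*sigma)
    plus the pt-breaking perturbation a on the outer diagonal."""
    eta = np.sqrt(2.0) * sigma
    h = np.array([[alpha + 1j * eta, sigma, 0.0],
                  [sigma, alpha, sigma],
                  [0.0, sigma, alpha - 1j * eta]], dtype=complex)
    h[0, 0] += a
    h[2, 2] += a
    return h

phis = np.linspace(0.0, 2.0 * np.pi, 4001)
ev = np.linalg.eigvals(h_a(1e-5))                    # starting point, phi = 0
start = ev.copy()
for phi in phis[1:]:
    a = 1e-5 * np.cos(phi) + 1j * 1e-6 * np.sin(phi)  # ellipse of eq. (10)
    new = np.linalg.eigvals(h_a(a))
    # continuity matching: choose the assignment closest to the last step
    p = min(permutations(range(3)),
            key=lambda q: sum(abs(new[q[i]] - ev[i]) for i in range(3)))
    ev = new[list(p)]
print("start:         ", np.round(start, 7))
print("after one loop:", np.round(ev, 7))
# the two printed sets contain the same three numbers, but cyclically
# permuted with respect to each other -- the signature of an ep3.
```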
4. continuously distributed exceptional points around the ep3
the third-order exceptional point investigated in section 3 was verified with a parameter space circle and its typical threefold permutation behaviour. as was found in [63], this can become a difficult task in a realistic setup. on the one hand, not every parameter space circle leads to the threefold permutation; a twofold state exchange misleadingly indicating an ep2 is also possible for certain parameter choices [29, 32]. on the other hand, additional exceptional points, which are accidentally located within the area enclosed by the parameter space loop, can distort the signature of the ep3. in the spatially extended investigation of [63] it turned out that the ep3 is accompanied by continuous distributions of exceptional points in such a way that it is very hard to find a parameter plane in which a circle reveals the pure cubic-root branch point signature of the ep3. here we show that this is not only a property of the special shape of the three wave guides used in [63] but a generic feature of three coupled guiding profiles. to do so, we return to the delta-functions model from [30]. the model is given by an effective schrödinger equation of the form
$$-\psi''(x) - \big((1 + i\gamma)\,\delta(x + b) + \Gamma\,\delta(x) + (1 - i\gamma)\,\delta(x - b)\big)\,\psi(x) = -k^2\,\psi(x), \qquad (11)$$
where three delta-function potential wells are located at x = ±b and x = 0. loss is added to the left well and the same amount of gain is added to the right one via the parameter γ. the units are chosen in such a way that the strength of the real and imaginary parts of the two outer wells is normalised to unity, while in the middle well we allow for a different depth given by the real parameter $\Gamma > 0$, similarly to its spatially extended counterpart from section 2. as the system is non-hermitian, the eigenvalues k are in general complex with re(k) > 0. we are interested in bound state solutions with real eigenvalues, and the bound state wave functions have the form
$$\psi(x) = \begin{cases} A e^{kx}, & x < -b, \\ 2\,(r \cosh(kx) + \varrho_1 \sinh(kx)), & -b < x < 0, \\ 2\,(r \cosh(kx) + \varrho_2 \sinh(kx)), & 0 < x < b, \\ B e^{-kx}, & x > b. \end{cases} \qquad (12)$$
as the continuity conditions for the wave functions and the discontinuity conditions for their first derivatives have to be fulfilled at the delta functions, we obtain a system of linear equations of the form
$$M \begin{pmatrix} r \\ \varrho_1 \\ \varrho_2 \end{pmatrix} = 0, \qquad (13)$$
for which nontrivial solutions exist only if the secular equation
$$\det M = \Gamma \big( e^{-4kb}(1 + \gamma^2) - 2 e^{-2kb}(\gamma^2 - 2k + 1) + \gamma^2 + (2k - 1)^2 \big) + 2k \big( e^{-4kb}(1 + \gamma^2) - \gamma^2 - (2k - 1)^2 \big) = 0 \qquad (14)$$
is fulfilled. hence we obtain the eigenvalues k as roots of the determinant $\det M(k) \equiv f(k)$ depending on the distance b, the non-hermiticity parameter γ, and the parameter $\Gamma$ of the middle well. assuming k to be purely real, the position of the third-order branch point singularity is fixed by
$$f(k) = \frac{\partial f}{\partial k} = \frac{\partial^2 f}{\partial k^2} = 0. \qquad (15)$$
with $\Gamma_{\mathrm{ep3}} = 1.002$ the ep3 appears at
$$\gamma^\delta_{\mathrm{ep3}} = 0.06527796794065678, \qquad (16a)$$
$$b_{\mathrm{ep3}} = 6.2012417361076206, \qquad (16b)$$
$$k_{\mathrm{ep3}} = 0.49584858490334327. \qquad (16c)$$
this can be verified by circling this point in the complex plane of the distance b (as was done in [30]) or by introducing asymmetry parameters breaking the underlying pt symmetry. a verification without pt symmetry breaking, i.e. a circle around the ep3 in the b-γ space, turns out to be impossible in this simplified model, as it is always dominated by a signature belonging to an ep2. this suggests the conclusion that, in analogy with the spatially extended model from [63], the ep3 may be accompanied by ep2s which disturb the exchange behaviour. to expose this behaviour we attenuate the condition of eq. (15) to a twofold zero, from which we get the pair of variables (k, γ) or (k, b), and therefore the positions of the ep2s, via a two-dimensional root search. the results are shown in figure 4 (top left). it can be seen that the ep2s are distributed continuously around the ep3 in the b-γ space. the lines represent ep2s either between the ground state and the first excited mode or between the two excited modes. they coalesce at the position of the ep3 at $\gamma^\delta_{\mathrm{ep3}}$, leaving a knee in the parameter space. moreover, further branches appear at $\gamma^c_{1,2}$ along the blue line, which cannot be explained by purely real parameters b and γ. hence we continue either b or γ analytically into the complex plane and allow for $k \in \mathbb{C}$, which turns the two-dimensional root search into a four-dimensional one. the resulting effects on the re(γ)–im(b) space and the re(γ)–im(γ) space are depicted on the right-hand side of figure 4.

figure 4. continuously distributed second-order exceptional points in the b-γ space of the idealised system made up of three delta-function potentials. the system's ep3 appears at $\gamma^\delta_{\mathrm{ep3}}$, from which several ep2s arise (top left, solid lines), either between the ground and first excited (1. ex) or between the first and second excited (2. ex) state. they reveal an even more profound branch structure at the points $\gamma^c_1$ and $\gamma^c_2$, which can be explained in terms of an analytical continuation of b or γ (top right and bottom right).
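the conditions of eq. (15) are easy to implement numerically. the following sketch is ours: it encodes the secular determinant of eq. (14) and solves $f = \partial f/\partial k = \partial^2 f/\partial k^2 = 0$ for $(k, \gamma, b)$ at fixed $\Gamma = 1.002$ with scipy's root finder; the derivatives are taken by central differences, so the step size may need tuning:

```python
import numpy as np
from scipy.optimize import fsolve

big_gamma = 1.002                       # middle-well strength (capital gamma)

def f(k, gamma, b):
    """secular determinant of eq. (14) for purely real k."""
    e2, e4 = np.exp(-2.0 * k * b), np.exp(-4.0 * k * b)
    return (big_gamma * (e4 * (1.0 + gamma**2)
                         - 2.0 * e2 * (gamma**2 - 2.0 * k + 1.0)
                         + gamma**2 + (2.0 * k - 1.0)**2)
            + 2.0 * k * (e4 * (1.0 + gamma**2)
                         - gamma**2 - (2.0 * k - 1.0)**2))

def ep3_conditions(x, h=1e-5):
    """the threefold-zero conditions of eq. (15)."""
    k, gamma, b = x
    d1 = (f(k + h, gamma, b) - f(k - h, gamma, b)) / (2.0 * h)
    d2 = (f(k + h, gamma, b) - 2.0 * f(k, gamma, b) + f(k - h, gamma, b)) / h**2
    return [f(k, gamma, b), d1, d2]

sol = fsolve(ep3_conditions, x0=[0.5, 0.065, 6.2])
print("k, gamma, b at the ep3:", sol)   # should approach eq. (16)
```

attenuating eq. (15) to a twofold zero, i.e. dropping the condition on the second derivative, turns the same routine into the two-dimensional search that produces the ep2 lines of figure 4.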
5. conclusion
in this paper we applied two approximations to the system of three coupled pt-symmetric wave guides studied in [63]. in a mapping of the system to a three-mode matrix model we could show that the matrix can serve as an intuitive guide to parameter regimes in which the prospects of finding a third-order exceptional point are best. this is exactly the case when the correctly mapped matrix assumes, due to appropriately chosen physical parameters, the shape proposed in [29]. the continuous distributions of ep2s around the ep3 of interest in the space of the accessible physical parameters can be found in the much simpler delta-functions model from [30]. thus, it is possible to search for adequate physical parameters allowing for the identification of the ep3 via its characteristic threefold state permutation without the need of having to solve the full problem.

the appearance of the ep3 has experimentally observable effects in the wave guide system. circles in parameter space such as those performed in section 3 can be used; they can lead to the unambiguous signal of a threefold state permutation. due to the continuous distribution of ep2s around the ep3, however, this might become a difficult task. thus, a temporally resolved measurement of the field intensity in the three wave guides as proposed in [63] might be the best way of obtaining an observable effect. close to the ep3 an increasing beat length and a simultaneous pulsating behaviour in all three wave guides will be present.

in principle the approach of extending the system with additional wave guides to allow for higher-order exceptional points can be continued. with four wave guides it should, for example, be possible to access a fourth-order exceptional point. the observations made in this work suggest that all further extensions should first be studied in simple approaches before a laborious modelling of a realistic physical setup is done. an n-mode matrix model can tell whether a promising search for an epn in a setup with n wave guides is possible. if this is the case, it can provide rough estimates for suitable physical parameters. since algorithms for the detection of higher-order jordan blocks exist [66], their presence can quickly be investigated. the reduction of the full system to delta functions leads to much simpler equations but preserves the whole richness of effects. as such it can be used as a first access to the structure of the eigenstates. in particular, it can be used to evaluate whether an identification of the epn via the n-fold permutation of the eigenstates seems to be feasible. this can give valuable information before costly numerical calculations of the full system are done.

acknowledgements
gw and wdh gratefully acknowledge support from the national institute for theoretical physics (nithep), western cape, south africa.
gw expresses his gratitude to the department of physics of the university of stellenbosch where parts of this paper were developed.

references
[1] t. kato. perturbation theory for linear operators. springer, berlin, 1966.
[2] w. d. heiss. the physics of exceptional points. j phys a 45(44):444016, 2012. doi:10.1088/1751-8113/45/44/444016.
[3] n. moiseyev. non-hermitian quantum mechanics. cambridge university press, cambridge, 2011.
[4] w. d. heiss, a. l. sannino. avoided level crossing and exceptional points. j phys a 23(7):1167, 1990. doi:10.1088/0305-4470/23/7/022.
[5] w. d. heiss. repulsion of resonance states and exceptional points. phys rev e 61:929–932, 2000. doi:10.1103/physreve.61.929.
[6] e. hernández, a. jáuregui, a. mondragán. non-hermitian degeneracy of two unbound states. j phys a 39(32):10087, 2006. doi:10.1088/0305-4470/39/32/s11.
[7] r. lefebvre, o. atabek, m. šindelka, n. moiseyev. resonance coalescence in molecular photodissociation. phys rev lett 103:123003, 2009. doi:10.1103/physrevlett.103.123003.
[8] h. cartarius, j. main, g. wunner. exceptional points in the spectra of atoms in external fields. phys rev a 79:053408, 2009. doi:10.1103/physreva.79.053408.
[9] s.-b. lee, j. yang, s. moon, et al. observation of an exceptional point in a chaotic optical microcavity. phys rev lett 103:134101, 2009. doi:10.1103/physrevlett.103.134101.
[10] s. longhi. spectral singularities and bragg scattering in complex crystals. phys rev a 81:022102, 2010. doi:10.1103/physreva.81.022102.
[11] a. guo, g. j. salamo, d. duchesne, et al. observation of pt-symmetry breaking in complex optical potentials. phys rev lett 103:093902, 2009. doi:10.1103/physrevlett.103.093902.
[12] h. cartarius, n. moiseyev. fingerprints of exceptional points in the survival probability of resonances in atomic spectra. phys rev a 84:013419, 2011. doi:10.1103/physreva.84.013419.
[13] r. gutöhrlein, j. main, h. cartarius, g. wunner. bifurcations and exceptional points in dipolar bose-einstein condensates. j phys a 46(30):305001, 2013. doi:10.1088/1751-8113/46/30/305001.
[14] j. wiersig. enhancing the sensitivity of frequency and energy splitting detection by using exceptional points: application to microcavity sensors for single-particle detection. phys rev lett 112:203901, 2014. doi:10.1103/physrevlett.112.203901.
[15] w. d. heiss, g. wunner. resonance scattering at third-order exceptional points. j phys a 48(34):345203, 2015. doi:10.1088/1751-8113/48/34/345203.
[16] l. schwarz, h. cartarius, g. wunner, et al. fano resonances in scattering: an alternative perspective. eur phys j d 69(8):196, 2015. doi:10.1140/epjd/e2015-60202-9.
[17] h. menke, m. klett, h. cartarius, et al. state flip at exceptional points in atomic spectra. phys rev a 93:013401, 2016. doi:10.1103/physreva.93.013401.
[18] m. philipp, p. v. brentano, g. pascovici, a. richter.
frequency and width crossing of two interacting resonances in a microwave cavity. phys rev e 62:1922–1926, 2000. doi:10.1103/physreve.62.1922.
[19] c. dembowski, h.-d. gräf, h. l. harney, et al. experimental observation of the topological structure of exceptional points. phys rev lett 86:787–790, 2001. doi:10.1103/physrevlett.86.787.
[20] c. dembowski, b. dietz, h.-d. gräf, et al. observation of a chiral state in a microwave cavity. phys rev lett 90:034101, 2003. doi:10.1103/physrevlett.90.034101.
[21] b. dietz, t. friedrich, j. metz, et al. rabi oscillations at exceptional points in microwave billiards. phys rev e 75:027201, 2007. doi:10.1103/physreve.75.027201.
[22] t. stehmann, w. d. heiss, f. g. scholtz. observation of exceptional points in electronic circuits. j phys a 37(31):7813, 2004. doi:10.1088/0305-4470/37/31/012.
[23] m. lawrence, n. xu, x. zhang, et al. manifestation of pt symmetry breaking in polarization space with terahertz metasurfaces. phys rev lett 113:093901, 2014. doi:10.1103/physrevlett.113.093901.
[24] t. gao, e. estrecho, k. y. bliokh, et al. observation of non-hermitian degeneracies in a chaotic exciton-polariton billiard. nature 526(7574):554–558, 2015. doi:10.1038/nature15522.
[25] j. doppler, a. a. mailybaev, j. böhm, et al. dynamically encircling an exceptional point for asymmetric mode switching. nature 537(7618):76–79, 2016. doi:10.1038/nature18605.
[26] h. xu, d. mason, l. jiang, j. g. e. harris. topological energy transfer in an optomechanical system with exceptional points. nature 537(7618):80–83, 2016. doi:10.1038/nature18604.
[27] e. m. graefe, u. günther, h. j. korsch, a. e. niederle. a non-hermitian pt symmetric bose-hubbard model: eigenvalue rings from unfolding higher-order exceptional points. j phys a 41(25):255206, 2008. doi:10.1088/1751-8113/41/25/255206.
[28] w. d. heiss. chirality of wavefunctions for three coalescing levels. j phys a 41(24):244010, 2008. doi:10.1088/1751-8113/41/24/244010.
[29] g. demange, e.-m. graefe. signatures of three coalescing eigenfunctions. j phys a 45(2):025303, 2012. doi:10.1088/1751-8113/45/2/025303.
[30] w. d. heiss, g. wunner. a model of three coupled wave guides and third order exceptional points. j phys a 49(49):495303, 2016. doi:10.1088/1751-8113/49/49/495303.
[31] u. günther, i. rotter, b. f. samsonov. projective hilbert space structures at exceptional points. j phys a 40(30):8815, 2007. doi:10.1088/1751-8113/40/30/014.
[32] w. d. heiss. some features of exceptional points. in f. bagarello, r. passante, c. trapani (eds.), non-hermitian hamiltonians in quantum physics: selected contributions from the 15th international conference on non-hermitian hamiltonians in quantum physics, palermo, italy, 18-23 may 2015, pp. 281–288. springer international publishing, cham, 2016. doi:10.1007/978-3-319-31356-6_18.
[33] c. m. bender, s. boettcher. real spectra in non-hermitian hamiltonians having pt symmetry. phys rev lett 80:5243–5246, 1998. doi:10.1103/physrevlett.80.5243.
[34] c. m. bender, s. boettcher, p. n. meisinger. pt-symmetric quantum mechanics. j math phys 40(5):2201, 1999. doi:10.1063/1.532860.
[35] m. znojil. exact solution for morse oscillator in pt-symmetric quantum mechanics. phys lett a 264(2-3):108, 1999. doi:10.1016/s0375-9601(99)00805-1.
[36] v. jakubský, m. znojil. an explicitly solvable model of the spontaneous pt-symmetry breaking. czech j phys 55:1113–1116, 2005. doi:10.1007/s10582-005-0115-x.
[37] h. f. jones. interface between hermitian and non-hermitian hamiltonians in a model calculation.
phys rev d 78:065032, 2008. doi:10.1103/physrevd.78.065032.
[38] h. mehri-dehnavi, a. mostafazadeh, a. batal. application of pseudo-hermitian quantum mechanics to a complex scattering potential with point interactions. j phys a 43:145301, 2010. doi:10.1088/1751-8113/43/14/145301.
[39] h. f. jones, e. s. moreira, jr. quantum and classical statistical mechanics of a class of non-hermitian hamiltonians. j phys a 43(5):055307, 2010. doi:10.1088/1751-8113/43/5/055307.
[40] e.-m. graefe. stationary states of a pt symmetric two-mode bose-einstein condensate. j phys a 45(44):444015, 2012. doi:10.1088/1751-8113/45/44/444015.
[41] w. d. heiss, h. cartarius, g. wunner, j. main. spectral singularities in pt-symmetric bose-einstein condensates. j phys a 46(27):275307, 2013. doi:10.1088/1751-8113/46/27/275307.
[42] d. dast, d. haag, h. cartarius, g. wunner. quantum master equation with balanced gain and loss. phys rev a 90:052120, 2014. doi:10.1103/physreva.90.052120.
[43] n. abt, h. cartarius, g. wunner. supersymmetric model of a bose-einstein condensate in a pt-symmetric double-delta trap. int j theor phys 54(11):4054–4067, 2015. doi:10.1007/s10773-014-2467-0.
[44] r. gutöhrlein, j. schnabel, i. iskandarov, et al. realizing pt-symmetric bec subsystems in closed hermitian systems. j phys a 48(33):335302, 2015. doi:10.1088/1751-8113/48/33/335302.
[45] d. dast, d. haag, h. cartarius, g. wunner. purity oscillations in bose-einstein condensates with balanced gain and loss. phys rev a 93:033617, 2016. doi:10.1103/physreva.93.033617.
[46] m. kreibich, j. main, h. cartarius, g. wunner. realizing pt-symmetric non-hermiticity with ultracold atoms and hermitian multiwell potentials. phys rev a 90:033630, 2014. doi:10.1103/physreva.90.033630.
[47] m. znojil. bound states emerging from below the continuum in a solvable pt-symmetric discrete schrödinger equation. phys rev a 96:012127, 2017. doi:10.1103/physreva.96.012127.
[48] l. schwarz, h. cartarius, z. h. musslimani, et al. vortices in bose-einstein condensates with pt-symmetric gain and loss.
phys rev a 95:053613, 2017. doi:10.1103/physreva.95.053613.
[49] m. klett, h. cartarius, d. dast, et al. relation between pt-symmetry breaking and topologically nontrivial phases in the su-schrieffer-heeger and kitaev models. phys rev a 95:053626, 2017. doi:10.1103/physreva.95.053626.
[50] c. m. bender, v. branchina, e. messina. ordinary versus pt-symmetric φ3 quantum field theory. phys rev d 85:085001, 2012. doi:10.1103/physrevd.85.085001.
[51] p. d. mannheim. astrophysical evidence for the non-hermitian but pt-symmetric hamiltonian of conformal gravity. fortschr phys 61(2-3):140, 2013. doi:10.1002/prop.201200100.
[52] r. el-ganainy, k. g. makris, d. n. christodoulides, z. h. musslimani. theory of coupled optical pt-symmetric structures. opt lett 32(17):2632–2634, 2007. doi:10.1364/ol.32.002632.
[53] k. g. makris, r. el-ganainy, d. n. christodoulides, z. h. musslimani. beam dynamics in pt symmetric optical lattices. phys rev lett 100:103904, 2008. doi:10.1103/physrevlett.100.103904.
[54] a. mostafazadeh, h. mehri-dehnavi. spectral singularities, biorthonormal systems and a two-parameter family of complex point interactions. j phys a 42:125303, 2009. doi:10.1088/1751-8113/42/12/125303.
[55] c. m. bender, m. gianfreda, ş. k. özdemir, et al. twofold transition in pt-symmetric coupled oscillators. phys rev a 88:062111, 2013. doi:10.1103/physreva.88.062111.
[56] s. bittner, b. dietz, h. l. harney, et al. scattering experiments with microwave billiards at an exceptional point under broken time-reversal invariance. phys rev e 89:032909, 2014. doi:10.1103/physreve.89.032909.
[57] a. mostafazadeh. nonlinear spectral singularities of a complex barrier potential and the lasing threshold condition. phys rev a 87:063838, 2013. doi:10.1103/physreva.87.063838.
[58] b. peng, s. k. ozdemir, f. lei, et al. parity-time-symmetric whispering-gallery microcavities. nat phys 10(5):394–398, 2014. doi:10.1038/nphys2927.
[59] j. schindler, a. li, m. c. zheng, et al. experimental study of active lrc circuits with pt symmetries. phys rev a 84:040101, 2011. doi:10.1103/physreva.84.040101.
[60] e.-m. graefe, a. a. mailybaev, n. moiseyev. breakdown of adiabatic transfer of light in waveguides in the presence of absorption. phys rev a 88:033842, 2013. doi:10.1103/physreva.88.033842.
[61] s. klaiman, u. günther, n. moiseyev. visualization of branch points in pt-symmetric waveguides. phys rev lett 101:080402, 2008. doi:10.1103/physrevlett.101.080402.
[62] c. e. rüter, k. g. makris, r. el-ganainy, et al. observation of parity-time symmetry in optics. nat phys 6:192, 2010. doi:10.1038/nphys1515.
[63] j. schnabel, h. cartarius, j. main, et al. pt-symmetric waveguide system with evidence of a third-order exceptional point. phys rev a 95:053868, 2017. doi:10.1103/physreva.95.053868.
[64] l. ge. parity-time symmetry in a flat-band system. phys rev a 92:052103, 2015. doi:10.1103/physreva.92.052103.
[65] b. qi, l. ge. defect states emerging from a non-hermitian flat band of photonic zero modes. in frontiers in optics 2017, p. jw3a.55. optical society of america, 2017. doi:10.1364/fio.2017.jw3a.55.
[66] a. a. mailybaev. computation of multiple eigenvalues and generalized eigenvectors for matrices dependent on parameters. numer linear algebra appl 13(5):419–436, 2006. doi:10.1002/nla.471.
acta polytechnica doi:10.14311/ap.2018.58.0323, acta polytechnica 58(5):323–333, 2018

the time difference in emission of light and pressure pulses from oscillating bubbles

karel vokurka
physics department, technical university of liberec, studentská 2, 461 17 liberec, czech republic
correspondence: karel.vokurka@tul.cz

abstract. oscillations of spark-generated bubbles are studied experimentally. in this work, attention is paid to the time difference in the radiation of light flashes and pressure pulses from a bubble at the final stages of the first bubble contraction and the early stages of the first bubble expansion. it is found that light and pressure pulses are not radiated synchronously. in some experiments the light flashes are radiated before the pressure pulses by a few µs, and in other experiments the light flashes are radiated later than the pressure pulses by a few µs. the time difference in the radiation of the two pulses is examined in detail in relation to the bubble size, the bubble oscillation intensity, the maximum value of the light flash and the width of the light flash. it is shown that the magnitude of the time differences is very weakly correlated with the bubble size, the intensity of oscillation and the intensity of the light flashes, and that it is only moderately correlated with the light flash widths.

keywords: spark-generated bubbles; light emission; bubble oscillations.

1. introduction
bubble oscillations have long been an important topic in fluid dynamics. while they have traditionally been studied in connection with erosion damage [1], recent efforts have been aimed at medical applications such as contrast enhancement in ultrasonic imaging [2–7]. physical processes in oscillating bubbles are very complex, and many points in this field have not yet been clarified. one of these points that is not well understood is the emission of light from bubbles, and this phenomenon will be discussed in this work.

during the last thirty years, the light emission from oscillating bubbles has been intensively studied in a number of laboratories using acoustic resonators. a recent review of papers published in this area [20] mentions 309 references. although this review concentrates on the light emission from bubbles oscillating in acoustic resonators, it also includes papers dealing with the light emission from bubbles generated during acoustic cavitation. papers dealing with the light emission from bubbles generated during hydrodynamic cavitation and from laser- and spark-generated bubbles are not included in this review. in experiments, oscillating bubbles are generated using a wide variety of techniques. these techniques encompass, e.g.,
laser beam focusing in the liquid [8–10], spark discharge in the liquid [11–15], multiple bubbles oscillating in ultrasonic fields [16], hydrodynamic cavitation in the liquid flow [17, 18], and shock-induced bubble oscillations [19]. all these techniques are also used in studies of the light emission from bubbles [9, 11, 13–19].

in this work, the emission of light from large spark-generated bubbles freely oscillating in water far from boundaries is studied. an obvious advantage of large bubbles is that the optical and acoustic radiation from them can be recorded more easily than in the case of smaller bubbles, because the physical processes in large oscillating bubbles take place more slowly. the light emitted by large bubbles is also sufficiently intensive, so that averaging over light pulses from different experiments, which is always accompanied by a loss of natural variety (e.g., in pulse shape), is not necessary. the technique of low-voltage spark discharges also makes it possible to generate bubbles of different sizes oscillating with different intensities [21], which further enhances the data analysis.

in this work, the time difference in the radiation of the optical and acoustic pulses at the first bubble contraction and at the following first bubble expansion is studied. it will be shown that the instants when the maxima in the optical pulses and pressure pulses are radiated may differ by a few µs. in some experiments the maxima in the optical pulses are radiated earlier than the peaks in the pressure pulses, and in some experiments the maxima of the optical pulses are radiated later than the pressure peaks. this phenomenon has also been observed by golubnichiy et al. [11], huang et al. [13] and zhang et al. [14]. the results discussed here have been presented in a brief form at conferences [22, 23].

2. experimental setup
the experimental setup used in this work is schematically shown in figure 1. freely oscillating bubbles were generated by discharging a capacitor bank via a sparker submerged in a laboratory water tank having dimensions of 6 m (length) × 4 m (width) × 5.5 m (depth). the experiments were performed in tap water at a constant hydrostatic pressure p∞ = 125 kpa, at a room temperature θ∞ = 292 k, and far from any boundaries. the capacitance of the capacitor bank could be varied in steps by connecting 1 to 10 capacitors in parallel, each with a capacitance of 16 µf. the capacitors were charged from a high voltage source of 4 kv.

figure 1. experimental setup used to generate oscillating bubbles and to record the optical and acoustic radiation from them (abbreviations in the figure: daq – data acquisition board, hv – additional high voltage used to trigger the air gap).
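for orientation (this arithmetic is ours and is not quoted in the paper), the electrostatic energy available per discharge follows directly from the quoted bank parameters:
$$E = \frac{1}{2}\,C U^2 = \frac{1}{2}\,(16\text{–}160\,\mu\mathrm{f})\,(4\,\mathrm{kv})^2 \approx 128\text{–}1280\,\mathrm{j}.$$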
an air-gap switch was used to trigger the discharge through the sparker. earlier measurements [24] have shown that the current flowing through the discharge circuit has the form of a highly damped sinusoid and, depending on the total bank capacitance, it drops to zero 0.3–0.7 ms after the liquid breakdown. a more detailed description of the experimental setup is given in an earlier work [24].

both the spark discharge and the subsequent bubble oscillations were accompanied by an intensive optical and acoustic radiation. the optical radiation was monitored by a detector consisting of a fiber optic cable, a photodiode, an amplifier, and an a/d converter. the input surface of the fiber optic cable was positioned in water at the same level as the sparker at a distance r = 0.2 m aside, pointing perpendicularly to the sparker gap and the electrodes. at the output surface of the fiber optic cable a hamamatsu photodiode type s2386-18l was positioned. the usable spectral range of the photodiode is 320 nm to 1100 nm. an analysis of the optical spectra given in the literature showed that the maximum temperatures in spark-generated and laser-generated bubbles range from 5800 k to 8150 k [9, 14]. then, using the wien and planck laws, it can be verified that the spectral maxima of the optical radiation are within the photodiode band-pass and that the prevailing part of the radiation is received by the detector. the load resistance of the photodiode was 75 ω, so the rise time of the measured pulses is about 50 ns. a broadband amplifier (0–10 mhz) was connected to the photodiode output terminals. the output voltage from the amplifier was recorded using a data acquisition board (national instruments pci 6115, 12 bit a/d converter) with a sampling frequency of 10 mhz. the presented optical data refer to the photodiode output.

the acoustic radiation was monitored using a reson broadband hydrophone type tc 4034. the hydrophone was positioned with the sensitive element at the same depth as the sparker. the distance between the hydrophone acoustic centre and the sparker gap was rh = 0.2 m. the output of the hydrophone was connected via a 10:1 divider to the second channel of the a/d converter.

in the experiments, a large number of almost spherical bubbles freely oscillating in a large expanse of liquid were successively generated. the sizes of these bubbles, as described by the first maximum radius rM1, ranged from 18.5 mm to 56.5 mm, and the bubble oscillation intensity, as described by the non-dimensional peak pressure in the first acoustic pulse $p_{zp1} = p_{p1} r_h / (p_\infty r_{M1})$, ranged from 24 to 153 [21]. here, pp1 is the peak pressure in the first acoustic pulse p1(t). the non-dimensional quantity pzp1 can be best interpreted by multiplying it by the hydrostatic pressure p∞: then it represents the peak acoustic pressure pp1 in the first acoustic pulse p1(t) measured at a distance rh = rM1. both rM1 and pzp1 were determined in each experiment from the respective pressure record using an iterative procedure described in detail in [21]. this iterative procedure is an extension of the well-known rayleigh formula for the "collapse time" of a bubble having a size rM1. the rayleigh formula is commonly used in studies of spark- and laser-generated bubbles (see, e.g., [1, 9]). it has been verified experimentally many times that for bubbles oscillating sufficiently intensively it gives satisfactory results. however, for bubbles oscillating with a lower intensity it gives less precise values; the iterative procedure extends this approach to any oscillation intensity.
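the iterative procedure itself is given in [21]; only its rayleigh starting point can be written down directly. a minimal sketch (ours, with an illustrative oscillation time; it also assumes that one full oscillation lasts roughly twice the rayleigh collapse time, an approximation the published method refines for arbitrary oscillation intensity):

```python
import numpy as np

rho, p_inf = 998.0, 125e3     # water density [kg/m^3], hydrostatic pressure [pa]

def rayleigh_radius(t_o1):
    """zeroth-order size estimate: a bubble of maximum radius rM1 collapses
    in t_c = 0.915 * rM1 * sqrt(rho / p_inf), and one full oscillation
    (growth plus contraction) lasts roughly 2 * t_c."""
    return t_o1 / (2.0 * 0.915 * np.sqrt(rho / p_inf))

t_o1 = 8.0e-3                 # illustrative first oscillation time [s]
print(f"estimated rM1 = {rayleigh_radius(t_o1) * 1e3:.1f} mm")   # ~49 mm
```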
prior to the measurements reported here, a limited number of high-speed camera records were taken with framing rates ranging from 2800 to 3000 frames/s. these records were used to check the shape of the generated bubbles, and the photographs also yielded useful visual information on the bubble content. examples of the photographs of the spark-generated bubbles taken by the high-speed camera at different instants of their life, and the experimentally determined variations of the bubble radius r with time t, were given in earlier works [21, 25].

3. results
let us assume that at a time t0 the liquid breakdown initiates a spark discharge. thus at the instant t0 the bubble starts growing explosively and radiating light (optical) and pressure (acoustic) waves intensively. the instant t0 therefore represents the beginning of all the physical processes considered here. the bubble wall motion is oscillatory. at a time t1, the explosively growing spherical bubble attains its first maximum volume (a sphere of radius rM1). then the bubble starts contracting, and at a time tc1 it attains its first minimum volume (a sphere of radius rm1, the first minimum radius). then the bubble starts expanding again, and at a time t2 it attains its second maximum volume (a sphere of radius rM2). after the time t2 the bubble performs several further oscillations; however, these are already out of the scope of the present work. the interval (t0, t1) represents the growth phase, the interval (t1, tc1) the first contraction phase, and the interval (tc1, t2) the first expansion phase. the interval (t0, tc1) represents the time of the first bubble oscillation to1. in this work we shall concentrate on the processes taking place in a very short interval encompassing the final stages of the bubble contraction and the early stages of the bubble expansion. to abbreviate the description of this interval, the term "subinterval in the vicinity of the minimum bubble volume" (shortly "subinterval mbv") will be used in the following. this subinterval is centred on the instant tc1, when the bubble is compressed to its first minimum volume, and its extent is about 0.5 % of the time of the first bubble oscillation to1.

as already said in section 2, both the spark discharge and the subsequent bubble oscillations are accompanied by an intensive optical and acoustic radiation. an example of an optical record, represented by the voltage u(t) at the output of the optical detector, is given in figure 2. as can be seen, the voltage u(t) consists of two pulses. first, there is a pulse u0(t) that corresponds to the optical radiation from the bubble during the growth phase (t0, t1). second, there is a pulse u1(t) that represents the optical radiation from the bubble during the first contraction phase and the first expansion phase, that is, in the interval (t1, t2). in this work only the pulse u1(t) will be considered, and therefore the pulse u0(t) is shown clipped in figure 2. the maximum value of the pulse u1(t) is denoted um1 and the time of its occurrence tu1.

figure 2. an example of a radiated optical wave u(t). the bubble size is rM1 = 49 mm, the intensity of bubble oscillation is pzp1 = 142.1.

an example of an acoustic record p(t) is given in figure 3. the pressure wave has been measured at a distance rh = 0.2 m from the bubble centre and recalculated to the nominal distance rn = 1 m. as can be seen, the pressure wave also consists of several pulses.
first, there is a pressure pulse p0(t) radiated by the bubble during the growth phase (t0, t1). second, there is a pressure pulse p1(t) radiated by the bubble during the first contraction phase and the first expansion phase, that is, in the interval (t1, t2). the peak value of the pulse p1(t) is denoted pp1 and the time of its occurrence tp1. a further pressure pulse p2(t) can also be seen in figure 3; however, this pulse will not be considered in this work.

figure 3. an example of a radiated pressure wave p(t). the bubble size is rM1 = 49 mm, the intensity of bubble oscillation is pzp1 = 142.1.

the pressure wave propagates from the bubble wall to the hydrophone at the distance rh = 0.2 m at the speed of sound in water c = 1482 m/s, and thus the instants t0, t1, tp1, and t2 in the pressure record are delayed by about 135 µs after the instants t0, t1, tc1, and t2 defined above for the bubble wall motion. however, to simplify the discussion, the propagation time of the pressure wave in water is not considered here, and it will be assumed that the instants t0 defined above for the beginning of the bubble wall motion, the optical radiation, and the acoustic radiation are identical, even if the hydrophone is at the distance rh.

from the above discussion it is evident that an accurate determination of the instants t0 in the optical and acoustic records is crucial in this work. the instants t0 are defined as the points in the records at which the pulses u0(t) and p0(t) start rising steeply from an undisturbed level. small portions of the optical and acoustic pulses u0(t) and p0(t), extracted from the records in the vicinity of t0, are displayed together in figure 4. it can be seen that the instant t0 can be determined relatively accurately; the precision of its determination is given by the sampling interval dt = 1/fs = 0.1 µs.

figure 4. an example of voltage and pressure pulses u0(t) and p0(t) in the vicinity of the instant t0. the bubble size is rM1 = 49 mm, the intensity of bubble oscillation is pzp1 = 142.1.

the instants t0 have been defined as the starting points of the recorded waves u(t) and p(t). in the following, the instants t0 in both waves will be assumed to be identical, that is, the propagation time of the pressure wave will be ignored, or, which is the same, it will be assumed that the pressure wave propagates with the speed of light. then both waveforms can be displayed together in one figure with an identical starting point t0 on the time axis, and in this way both waveforms can easily be mutually compared (see also figure 4).

because in this work we are interested in a comparison of the optical and acoustic radiation from the bubble in the subinterval mbv, only small portions of the pulses u1(t) and p1(t), extracted from the records in the vicinity of the instants tu1 and tp1, will be considered in the following discussion. examples of the pulses u1(t) and p1(t) recorded in two different experiments are shown in figures 5 and 6. in these figures the time origins have been set at the instants tp1, and the sizes of the waveforms u1(t) have been adjusted using arbitrary units so that the shapes of both waveforms can be compared easily. as said above, the instants t0 of both waves are identical on the time axis. it can be seen in figures 5 and 6 that the shapes of the pulses u1(t) and p1(t) differ, and that the times tu1 and tp1 are not identical. the difference in the shapes of the pulses u1(t) and p1(t) is evidently connected with the autonomous behaviour of the plasma in the bubble interior, already discussed in [25–27].
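the determination of t0 just described is easy to automate. a minimal sketch (ours; the threshold constants are our choices, not the authors'):

```python
import numpy as np

fs = 10e6                          # sampling frequency [hz], dt = 0.1 us

def onset_index(x, n_noise=1000, k=6.0):
    """instant t0: the first sample at which the record rises steeply
    above the undisturbed level, taken here as the mean plus k standard
    deviations of the leading noise segment."""
    base, noise = x[:n_noise].mean(), x[:n_noise].std()
    return int(np.argmax(x > base + k * noise))

def align_at_t0(u, p):
    """shift both records so that their own onsets t0 coincide at index 0;
    this alignment is what removes the acoustic propagation delay of
    rh / c = 0.2 / 1482 ~ 135 us from the comparison."""
    return u[onset_index(u):], p[onset_index(p):]
```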
the difference in the times tu1 and tp1 has not been discussed yet, and the existence of this time difference is a surprising fact, because a "reasonable" assumption is that the maxima in the optical and acoustic radiation will occur at the same instant tc1, when the bubble is contracted to the first minimum volume. even if this has not yet been verified experimentally, it seems highly probable that the instants tc1 and tp1 are identical. this assumption then means that the maximum in the optical radiation is not firmly tied to the instant of the maximum bubble contraction tc1, but can occur a bit earlier (figure 5) or a bit later (figure 6). in figures 5 and 6 the time difference between the instants tp1 and tu1 has been denoted δ1; this quantity is defined by the relation $\delta_1 = t_{u1} - t_{p1}$. as shown in figures 5 and 6, the time difference δ1 can have both a positive and a negative value.

as said in section 2, the spark-generated bubble is described by two parameters: its size rM1 and its oscillation intensity pzp1. it is convenient to characterise the optical pulse u1(t) by two parameters as well [26]. first, it is the maximum value of the pulse, um1. second, it is the pulse width ∆ at one-half of the maximum value (that is, at um1/2). the pulse widths ∆ thus defined are shown in figures 5 and 6. in the following, the dependence of the time differences δ1 on these four parameters, that is, on rM1, pzp1, um1 and ∆, will be shown and discussed.

figure 5. an example of optical and pressure waves in the vicinity of the instants tu1 and tp1. the bubble size is rM1 = 38.1 mm, the intensity of bubble oscillation is pzp1 = 107, the hydrophone was at a distance rh = 0.2 m, the optical pulse width is ∆ = 57.9 µs, and the time difference in the occurrence of the maxima in both pulses is δ1 = −12.7 µs.

figure 6. an example of optical and pressure waves in the vicinity of the instants tu1 and tp1. the bubble size is rM1 = 49 mm, the intensity of bubble oscillation is pzp1 = 142.1, the hydrophone was at a distance rh = 0.2 m, the optical pulse width is ∆ = 9.4 µs, and the time difference in the occurrence of the maxima in both pulses is δ1 = 2.6 µs.

the variation of the time difference δ1 with the bubble size rM1, determined on a larger set of experimental data, is shown in figure 7. the regression line for the mean value of the time difference δ1 in dependence on rM1 is $\langle\delta_1\rangle = -0.2\,r_{M1} + 5.9$ [µs, mm]. it can be seen in figure 7 that the time difference δ1 is only very weakly correlated with the bubble size rM1 and that the dispersion of the time differences δ1 grows with the bubble size rM1.
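both quantities defined above, δ1 and the width ∆, can be read off the aligned records with a few lines. a sketch (ours; it assumes the pulses have been cut to a window around the first collapse and that u1(t) is single-peaked):

```python
import numpy as np

def peak_time(t, x):
    """time of the pulse maximum (t_u1 or t_p1) within a window that has
    already been cut around the first-collapse pulse."""
    return t[int(np.argmax(x))]

def pulse_width(t, u):
    """width delta of the optical pulse at one-half of its maximum u_m1;
    the two half-level crossings are located by linear interpolation."""
    i = int(np.argmax(u))
    half = u[i] / 2.0
    t_left = np.interp(half, u[:i + 1], t[:i + 1])        # rising edge
    t_right = np.interp(half, u[i:][::-1], t[i:][::-1])   # falling edge
    return t_right - t_left

# with records aligned at their own t0 (see the sketch above):
# delta1 = peak_time(t, u1) - peak_time(t, p1)
```

a negative result corresponds to figure 5 (light flash first), a positive one to figure 6.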
the waveforms displayed in figures 2, 3, 4 and 6 were recorded in the same experiment. if the records shown in figures 2 and 3 are aligned as is done in figure 4, the overlapping records shown in figure 6 are obtained.

figure 7. the variation of the time difference δ1 with the bubble size rM1.

figure 8. the variation of the time difference δ1 with the bubble oscillation intensity pzp1.

figure 9. the variation of the time difference δ1 with the maximum voltage um1. bubble sizes: (◦) rM1 > 50 mm, (×) 50 mm ≥ rM1 > 40 mm, (+) 40 mm ≥ rM1 > 30 mm, (∗) 30 mm ≥ rM1 > 20 mm, (·) 20 mm ≥ rM1.

figure 10. the variation of the time difference δ1 with the pulse width ∆. bubble sizes: (◦) rM1 > 50 mm, (×) 50 mm ≥ rM1 > 40 mm, (+) 40 mm ≥ rM1 > 30 mm, (∗) 30 mm ≥ rM1 > 20 mm, (·) 20 mm ≥ rM1.

the variation of the time difference δ1 with the bubble oscillation intensity pzp1 is shown in figure 8. the regression line for the mean value of the time difference δ1 in dependence on pzp1 is $\langle\delta_1\rangle = 0.07\,p_{zp1} - 10.3$ [µs, –]. it can be seen that the time difference δ1 is very weakly correlated with the bubble oscillation intensity pzp1, which is another proof of the autonomous behaviour of the plasma in the bubble interior that has already been observed in earlier works [25–27]. as discussed in [25–27], this autonomous behaviour manifests itself in the relatively large independence of the light radiation from the plasma surface on the pressure at the bubble wall.

the variation of the time difference δ1 with the maximum voltage in the optical pulse um1 (and thus with the maximum intensity of the light radiated by the bubble) is shown in figure 9. the regression line for the mean value of the time difference δ1 in dependence on um1 is $\langle\delta_1\rangle = -5.9\,u_{m1} - 1.03$ [µs, mv]. it can be seen again that the time difference δ1 is weakly correlated with the maximum voltage in the optical pulse um1. however, the dependence of the time difference δ1 on the bubble size rM1, already observed in figure 7, can also be seen in figure 9.

the variation of the time difference δ1 with the pulse width ∆ is shown in figure 10. the regression line for the mean value of the time difference δ1 in dependence on ∆ is $\langle\delta_1\rangle = -0.17\,\Delta + 1.3$ [µs, µs]. it can be seen that now the time difference δ1 is moderately correlated with the optical pulse width ∆. for broader optical pulses u1(t) (that is, for light flashes with larger widths ∆), the time difference δ1 is negative, which means that the light flashes are radiated before the pressure pulses p1(t) (and thus also before the bubble contraction to rm1). however, for narrower optical pulses u1(t) (that is, for light flashes with smaller widths ∆), the time difference δ1 is positive, which means that the optical pulses are radiated later than the pressure pulses p1(t) (and thus also after the instant tc1, when the bubble volume is contracted to rm1). the dependence of δ1 on the bubble size rM1 can also be observed: for larger bubbles the time difference δ1 is predominantly negative, while for smaller bubbles it can be both positive and negative (cf. also figure 7).
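the regression lines quoted in this section are plain least-squares fits. a sketch (ours) of how such a line and the corresponding correlation coefficient are obtained from the measured pairs:

```python
import numpy as np

def regression(x, delta1):
    """least-squares line <delta_1> = a*x + b and the correlation
    coefficient r; x is rM1, pzp1, um1 or the pulse width delta."""
    a, b = np.polyfit(x, delta1, 1)
    r = np.corrcoef(x, delta1)[0, 1]
    return a, b, r

# with the measured pairs (rM1_i, delta1_i) this should come out close to
# <delta_1> = -0.2 * rM1 + 5.9 [us, mm], as quoted for figure 7.
```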
4. discussion
when comparing the instants of occurrence of the maxima in the first optical pulse and in the first acoustic pulse, it can be seen that the times tu1 and tp1 may differ by a few µs. in some experiments the maxima of the optical radiation are radiated earlier than the peaks in the pressure pulses (an example of this case is given in figure 5), and in some experiments the maxima in the optical radiation are radiated later than the peaks in the pressure pulses (an example of this case is given in figure 6). as can be seen in figures 7–10, the occurrence of the optical maxima before the pressure maxima prevails. the occurrence of the optical maxima after the pressure maxima can also be seen, but it is not so frequent and occurs predominantly for smaller bubbles and for more intensively oscillating bubbles. the time differences δ1 between these maxima are very small when compared with the times of the first bubble oscillations to1; the ratio δ1/to1 is typically of the order of 10−3. due to this small magnitude, it is difficult to observe the time differences δ1 when studying smaller bubbles, such as those oscillating in acoustic resonators. at present there is no explanation for the existence of the time differences δ1, but their presence further confirms the earlier findings concerning the autonomous plasma behaviour in bubbles [25, 26].

as shown in figures 7–10, the time difference δ1 may take both positive and negative values, and these values are only very weakly correlated with the bubble size rM1, the intensity of bubble oscillation pzp1, and the maximum values of the optical radiation um1. the time difference δ1 is moderately correlated only with the optical pulse widths ∆; as can be seen in figure 10, it grows with ∆. for large bubbles and for large ∆ the time differences δ1 are negative, and thus the maxima um1 always occur before the peaks pp1. the time difference δ1 can be positive, and thus the maxima um1 can be delayed behind the peaks pp1, only for small bubbles and small optical pulse widths ∆. the time difference between tu1 and tp1 has also been observed by golubnichiy et al. [11], huang et al. [13] and zhang et al. [14]. in those experiments the time difference δ1 was negative (that is, the optical pulse occurred earlier than the pressure pulse). the variation of the time difference δ1 with the bubble parameters and the optical pulse parameters was not studied in [11, 13, 14].

for the analysis carried out in this and in previous works [25, 26], the measurement of the pressure waves radiated by oscillating bubbles is essential. however, in the review paper by crum [20] there are only 2 papers mentioned in which the pressure waves radiated by oscillating bubbles were recorded [20, section viii, subsection l], and even in these 2 papers the pressure waves were not used for a more detailed analysis. therefore it is not surprising that in the works included in the above review no findings similar to those presented here and in our earlier works [25, 26] were mentioned. in the review [20], altogether 40 various theoretical models trying to clarify the origins of the light emission from bubbles are also summarized [20, section vi, subsections a, b, c, and section viii, subsection e].
an interested reader may find the full bibliographical data of these papers and short summaries of their main results in the review. the presented theories include the hot spot model, the electro-hydrodynamic hypothesis, a re-entrant jet impacting the opposite bubble wall, electron-neutral atom bremsstrahlung, proton-tunnelling radiation, the becquerel effect and quantum vacuum radiation, just to name some of the hypotheses given in the review. however, the conclusion that can be made from the review is that none of these theoretical models has been verified experimentally and none has been accepted by the research community as a definitive, valid clarification of the processes taking place in bubbles that lead to the light emission. and, unfortunately, none of these theoretical models can explain the facts observed when studying the light radiation from spark-generated bubbles, that is, none can explain the persisting light emission during the whole first bubble oscillation, the relatively autonomous behaviour of the plasma in the bubble interior, and the differences in the instants of the radiation of light and pressure pulses. in view of the fact that at present there is no suitable theoretical explanation for the observed phenomena, at the end of this section we would like to draw attention to the results published by researchers studying plasmoids generated by electrical discharges in wet air [28–33]. the aim of those works is to simulate ball lightning, also known as fireballs. the electrical discharges in wet air are performed with voltages and capacitor banks roughly similar to those used in this work, i.e. the voltages are about 5 kv and the capacitor banks have a capacitance of about 1 mf. the generated plasmoids usually have an almost spherical form and live for about 0.5–1 s. the main conclusions of these works can be summarized as follows [33]: the plasmoid consists of a hot core surrounded by a cool shell, possessing a translational temperature of about 600–1300 k, an electron temperature of 2000–5000 k and a rotational temperature of about 15000 k, and is displacing air with a warm, partially ionized water aerosol produced by the discharge. and, according to the authors of [33], this conclusion is consistent with the ball-lightning models proposed by shevkunov [34–36], where cation–anion recombination is inhibited by many orders of magnitude by the clustering of water around atomic and molecular ion cores. we believe that the strange behaviour of the plasma in the spark-generated bubbles can be best compared with these results. of course, the plasmoids studied in [28–33] are generated under conditions different from those of the plasma in spark-generated bubbles. however, the research on plasmoids most likely shows the way that should be followed in studies of the light emission from spark-generated bubbles.

5. conclusions
when analysing the experimental data, it was found that there is no exact coincidence in the radiation of the light flashes and pressure pulses from the spark-generated bubbles at the final stages of their first contraction and early stages of their first expansion (that is, in the subinterval mbv). the time difference between the maxima of the two pulses is only a few µs and is, therefore, about three orders of magnitude smaller than the time of the first bubble oscillation. thus it is not easy to detect in the case of smaller bubbles.
this time difference is further evidence of the relatively autonomous plasma behaviour that has already been observed earlier in connection with other physical processes taking place in oscillating bubbles [25–27]. unfortunately, at the present state of knowledge of bubble oscillations, there is no clear explanation for this phenomenon. the only physical processes that may be considered similar can be observed in plasmoids generated in wet air [28–33].

acknowledgements
this work was supported by the ministry of education, youth and sports of the czech republic as the research project msm 245100304. the experimental part of this work was carried out during the author's stay at the underwater acoustics laboratory of the italian acoustics institute, cnr, rome, italy. the author wishes to thank dr. silvano buogo from the cnr-insean marine technology research institute, rome, italy, for his very valuable help in preparing the experiments.

references
[1] a. jayaprakash, c.-t. hsiao, g. chahine. numerical and experimental study of the interaction of a spark-generated bubble and a vertical wall. trans. asme j. fluid eng. 134, 031301, 2012. doi:10.1115/1.4005688
[2] p.a. dayton, j.s. allen, k.w. ferrara. the magnitude of radiation force on ultrasound contrast agents. j. acoust. soc. am. 112, 2183-2192, 2002. doi:10.1121/1.1509428
[3] v. sboros. response of contrast agents to ultrasound. adv. drug deliver. rev. 60, 1117-1136, 2008. doi:10.1016/j.addr.2008.03.011
[4] e.p. stride, c.c. coussios. cavitation and contrast: the use of bubbles in ultrasound imaging and therapy. proc. inst. mech. eng. j. eng. med. 224, 171-191, 2010. doi:10.1243/09544119jeim622
[5] t. faez, m. emmer, k. kooiman, m. versluis, a.f.w. van der steen, n. de jong. 20 years of ultrasound contrast agent modelling. ieee trans. ultrason. ferroelectr. frequency control. 60, 7-20, 2013. doi:10.1109/tuffc.2013.2533
[6] d.h. thomas, m. butler, n. pelekasis, t. anderson, e. stride, v. sboros. the acoustic signature of decaying resonant phospholipid microbubbles. phys. med. biol. 58, 589-599, 2013. doi:10.1088/0031-9155/58/3/589
[7] t. segers, n. de jong, m. versluis. uniform scattering and attenuation of acoustically sorted ultrasound contrast agents: modeling and experiments. j. acoust. soc. am. 140, 2506-2517, 2016. doi:10.1121/1.4964270
[8] b. ward, d.c. emmony. interferometric studies of the pressures developed in a liquid during infrared-laser-induced cavitation-bubble oscillation. infrared phys. 32, 489-515, 1991. doi:10.1016/0020-0891(91)90138-6
[9] e.a. brujan, d.s. hecht, f. lee, g.a. williams. properties of luminescence from laser-created bubbles in pressurized water. phys. rev. e. 72, 066310, 2005. doi:10.1103/physreve.72.066310
[10] c. frez, g.j. diebold. laser generation of gas bubbles: photoacoustic and photothermal effects recorded in transient grating experiments. j. chem. phys. 129, 184506, 2008. doi:10.1063/1.3003068
[11] p.i. golubnichiy, v.m. gromenko, a.d. filonenko.
microsecond scanning of the optical spectra of the electrohydrodynamic sonoluminescence (in russian). pisma v zh. tekh. fiz. 5, 568-571, 1979.
[12] y. huang, h. yan, b. wang, x. zhang, z. liu, k. yan. the electro-acoustic transition process of pulsed corona discharge in conductive water. j. phys. d: appl. phys. 47, 255204, 2014. doi:10.1088/0022-3727/47/25/255204
[13] y. huang, l. zhang, j. chen, x. zhu, z. liu, k. yan. experimental observation of the luminescence flash at the collapse phase of a bubble produced by pulsed discharge in water. appl. phys. lett. 107, 184104, 2015. doi:10.1063/1.4935206
[14] l. zhang, x. zhu, h. yan, y. huang, z. liu, k. yan. luminescence flash and temperature determination of the bubble generated by underwater pulsed discharge. appl. phys. lett. 110, 034101, 2017. doi:10.1063/1.4974452
[15] a.h. aghdam, b.c. khoo, v. farhangmehr, m.t. shervani-tabar. experimental study on the dynamics of an oscillating bubble in a vertical rigid tube. exp. therm. fluid sci. 60, 299-307, 2015. doi:10.1016/j.expthermflusci.2014.09.017
[16] i. ko, h.-y. kwak. measurement of pulse width from a bubble cloud under multibubble sonoluminescence conditions. j. phys. soc. japan. 79, 124401, 2010. doi:10.1143/jpsj.79.124401
[17] t.g. leighton, m. farhat, j.e. field, f. avellan. cavitation luminescence from flow over a hydrofoil in a cavitation tunnel. j. fluid mech. 480, 43-60, 2003. doi:10.1017/s0022112003003732
[18] m. farhat, a. chakravarty, j.e. field. luminescence from hydrodynamic cavitation. proc. r. soc. a. 467, 591-606, 2011. doi:10.1098/rspa.2010.0134
[19] n.k. bourne, j.e. field. shock-induced collapse and luminescence by cavities. phil. trans. r. soc. lond. a. 357, 295-311, 1999. doi:10.1098/rsta.1999.0328
[20] l.a. crum. resource papers: sonoluminescence. j. acoust. soc. am. 138, 2181-2205, 2015. doi:10.1121/1.4929687
[21] s. buogo, k. vokurka. intensity of oscillation of spark generated bubbles. j. sound vib. 329, 4266-4278, 2010. doi:10.1016/j.jsv.2010.04.030
[22] k. vokurka, s. buogo. experimental study of light emission from spark generated bubbles. in: 36. jahrestagung für akustik daga 2010, berlin 15.-18.3.2010 (conference proceedings on cd-rom, deutsche gesellschaft für akustik, berlin 2010, isbn: 978-3-9808659-8-2, editors: m. möser et al., pp. 671–672), http://kfy.fp.tul.cz/katedra/zamestnanci/vokurkakarel
[23] k. vokurka, s. buogo. light from oscillating bubbles – persisting mystery. in: the 80th acoustic seminar, hrdoňov 4.-6.5.2010 (conference proceedings: české vysoké učení technické v praze, česká akustická společnost, prague 2010, isbn: 978-80-01-04547-3, editors m. brothánek and r. svobodová, pp. 65-72), http://kfy.fp.tul.cz/katedra/zamestnanci/vokurkakarel
[24] s. buogo, j. plocek, k. vokurka. efficiency of energy conversion in underwater spark discharges and associated bubble oscillations: experimental results. acta acust. united with acust. 95, 46-59, 2009. doi:10.3813/aaa.918126
[25] k. vokurka. experimental determination of temperatures in spark-generated bubbles oscillating in water. acta polytechnica. 57, 149-158, 2017. doi:10.14311/ap.2017.57.0149
[26] k. vokurka, s. buogo. experimental study of light emitted by spark-generated bubbles in water. eur. phys. j. appl. phys. 81, 11101, 2018. doi:10.1051/epjap/2017170332
[27] k. vokurka, j. plocek. experimental study of the thermal behavior of spark-generated bubbles in water. exp. therm. fluid sci. 51, 84-93, 2013. doi:10.1016/j.expthermflusci.2013.07.004
[28] a.i. egorov, s.i. stepanov.
long-lived plasmoids produced in humid air as analogues of ball lightning. tech. phys. 47, 1584-1586, 2002. doi:10.1134/1.1529952
[29] a.i. egorov, s.i. stepanov, g.d. shabanov. laboratory demonstration of ball lightning. physics-uspekhi. 47, 99-101, 2004. doi:10.1070/pu2004v047n01abeh001691
[30] y. sakawa, k. sugiyama, t. tanabe, r. more. fireball generation in a water discharge. plasma and fusion res.: rap. commun. 1, 039-1 – 039-2, 2006. doi:10.1585/pfr.1.039
[31] a.i. egorov, s.i. stepanov. properties of short-living ball lightning produced in the laboratory. tech. phys. 53, 688-692, 2008. doi:10.1134/s1063784208060029
[32] a. versteegh, k. behringer, u. fantz, g. fussmann, b. jüttner, s. noack. long-living plasmoids from an atmospheric water discharge. in: 28th int. conf. on phenomena in ionized gases, prague 2007, plasma sources sci. technol. 17, 024014, 2008. doi:10.1088/0963-0252/17/024014
[33] d.m. friday, p.b. broughton, t.a. lee, g.a. schutz, j.n. betz. further insight into the nature of ball-lightning-like atmospheric pressure plasmoids. j. phys. chem. a. 117, 9931-9940, 2013. doi:10.1021/jp400001y
[34] s.v. shevkunov. cluster mechanism of the energy accumulation in a ball electric discharge. dokl. phys. 46, 467-472, 2001. doi:10.1134/1.1390398
[35] s.v. shevkunov. scattering of centimeters radiowaves in a gas ionized by radioactive radiation: cluster plasma formation. j. exp. theor. phys. 92, 420-440, 2001. doi:10.1134/1.1364740
[36] s.v. shevkunov. a high energy barrier to charge recombination in ionized water vapor. high energy chem. 43, 341-349, 2009. doi:10.1134/s0018143909050026

cost-effective architectures for rc5 brute force cracking
j. buček, j. hlaváč, m. matušková, r. lórencz
in this paper, we discuss the options for brute-force cracking of the rc5 block cipher, that is, for revealing the unknown secret key, given a sample ciphertext and a portion of the corresponding plaintext. first, we summarize the methods employed by the current cracking efforts. then, we present two hardware architectures for finding the secret key using the "brute force" method. we implement the hardware in fpga and asic and, based on the results, we discuss the cost and time needed to crack the cipher using today's technology and suggest a minimum key length that can be considered secure.
keywords: rc5 cipher, decryption, brute-force cracking, fpga, asic.

1 introduction
rc5 is a fast block cipher designed by r. rivest [1, 2, 3]. it is a parameterized algorithm in which the block size, key length, and number of rounds are variable. the parameters are often specified in the rc5-w/r/b notation, where w is the word size in bits, r ∈ [0, 255] is the number of rounds, and b ∈ [0, 255] is the number of bytes in the secret key. the block size is twice the word size and the allowed values are 32 bits (recommended for experiments and testing only), 64 bits, or 128 bits. this flexibility allows the optimal level of security and efficiency to be chosen. a suggested "nominal" choice is rc5-32/12/16, that is, rc5 with 32-bit words (64-bit blocks), 12 rounds, and a 128-bit (16-byte) key. although rc5 is extremely simple and easy to implement, it provides a good level of security, as long as a sufficient key length and enough rounds are employed.
attempts to crack rc5 through various cryptanalysis techniques [4, 5, 6, 7, 8, 9] have been published; [6] concludes that at least 16 rounds are necessary to prevent a partial differential attack. however, all of these works consider a chosen-plaintext attack, which is not always plausible. in 1997, rsa security inc. announced a "secret-key challenge" [10]. the objective of the challenge is to find the secret key, given one ciphertext sample together with a portion of the corresponding plaintext. this corresponds to a real-world scenario, where an attacker could, for instance, capture packets of certain network communication and guess the contents of the encrypted packet header. unfortunately, chosen-plaintext attack methods are of limited use in this case, and the exhaustive keyspace search method is the only known option. in this paper, we analyze the feasibility of a brute-force attack, using general-purpose computers as well as dedicated hardware. we present the design of a fast, fully pipelined cracking engine, and compare the cost and speed with software-based solutions used by the currently employed methods. based on the results, we suggest a minimum key length that can be considered secure against various types of attackers. these numbers are then compared to suggestions published by a group of computer scientists and cryptographers in january 1996 [11].

2 the rc5 algorithm
the rc5 algorithm is fully described and discussed in [1, 2, 3]. we include the relevant routines here for the sake of completeness. the algorithm consists of three routines: key expansion, encryption, and decryption. it makes heavy use of data-dependent rotations, besides additions and subtractions modulo 2^w and xors. in the following description, + and − denote addition and subtraction modulo 2^w respectively, ⊕ denotes bit-wise xor, a ≪ b and a ≫ b denote a rotation of a by b bits to the left or right respectively. the least-significant bit is assumed to be at the rightmost position.

2.1 constants and variables
besides the already-defined constants w, r, b, [1] defines the following constants to describe the algorithm: c = ⌈b/(w/8)⌉ is the number of w-bit words necessary to hold the b bytes of the key, and t = 2r + 2 is the size of the expanded key table (array s). the key expansion routine uses the magic constants p_w and q_w, which are defined in [1] and are derived from certain irrational numbers. their values, for w = 32, are equal to (in hex): p32 = 0xb7e15163, q32 = 0x9e3779b9. the key is initially stored in the array k of b bytes. the key expansion routine uses two arrays, s and l, containing t or c words respectively, two w-bit variables a and b, and the counters i, j.

2.2 key expansion
first, the array l[0..c−1] is filled with the secret key k[0..b−1]. on little-endian machines, this is accomplished by zeroing-out the array l and copying the contents of k directly into the same memory area. then, the array s[0..t−1] is initialized and the secret key is mixed in:
s[0] = p_w;
for i = 1 to t−1 do s[i] = s[i−1] + q_w;
i = j = a = b = 0;
repeat 3·max(t, c) times:
  a = s[i] = (s[i] + a + b) ≪ 3;
  b = l[j] = (l[j] + a + b) ≪ (a + b);
  i = (i + 1) mod t;
  j = (j + 1) mod c;

2.3 decryption
the following decryption routine assumes that the array s contains the mixed key and that the ciphertext block is in registers a and b.

for i = r downto 1 do:
  b = ((b − s[2i+1]) ≫ a) ⊕ a;
  a = ((a − s[2i]) ≫ b) ⊕ b;
b = b − s[1];
a = a − s[0];

2.4 encryption
although encryption is not a primary concern of this paper, the appropriate routine is given here for completeness. the plaintext is assumed to be in a and b.

a = a + s[0];
b = b + s[1];
for i = 1 to r do:
  a = ((a ⊕ b) ≪ b) + s[2i];
  b = ((b ⊕ a) ≪ a) + s[2i+1];

3 current cracking efforts
3.1 distributed.net
the most well-known project aiming to crack the rc5 challenges is organized by distributed computing technologies, inc. [12]. on october 19, 1997, after 250 days of searching, distributed.net found the 56-bit secret key for the rc5-32/12/7 challenge. on july 14, 2002, they found the 64-bit secret key for the rc5-32/12/8 challenge (it took 1757 days). at the time of writing this paper (june 2004), the remaining challenges still remain to be solved. the philosophy of the distributed.net project is to search the keyspace using the idle cpu time of many computers connected to the internet, creating a virtual massively parallel system. participating users download and install client software that runs in the background, searching portions of the keyspace and sending the results to the main server. table 1 lists some of the results published at the distributed.net web site [12], combined with the approximate cost of the respective cpu. the numbers pertain to the currently running rc5-72 project (rc5-32/12/9). (note that the frequency of the amd processors is actually their performance rating; this follows from the numbers from distributed.net.)

cpu                       | rate (mkeys/s) | cost (usd) | cost/rate
amd athlon 64 (3.4 ghz)   | 8.4            | 245        | 29.2
amd athlon xp (3 ghz)     | 7.7            | 163        | 21.1
intel pentium 4 (3.2 ghz) | 4.5            | 237        | 52.7
powerpc g4 (1.25 ghz)     | 13.1           | 399        | 30.5
table 1: general-purpose cpu results

it turns out that older common processors provide a more favorable cost per mkey/s than recent releases. in addition, the intel pentium 4 implements the rotate instructions less efficiently than the other cpus, attaining the least favorable results of those presented.

3.2 distributed reconfigurable hardware
a recent work by morrison et al. [13] attempts to crack rc5 using a cluster of pc-compatible computers with fpga accelerator boards. a rate of 168 kkeys/s for a pentium 4 and 1.7 mkeys/s per node of their distributed system is cited. these numbers appear rather low in comparison with the distributed.net [12] results; unfortunately, it is not clear what the cause of the mismatch is.

4 hardware cracking engine
we have designed two different architectures of dedicated hardware (one of them was presented in [14]) for brute-force cracking of the rc5 cipher (any of its variants with w = 32). to attain the maximum speed possible, the design is not reconfigurable: the parameters of the rc5 variant which is to be cracked must be specified at design (synthesis) time.
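before describing the hardware, it is convenient to have a software reference model against which the pipeline can be validated in simulation. the following is a minimal python sketch of our own (not the authors' code) of the routines in sections 2.2–2.4, hard-coding w = 32 and the little-endian key loading described above:

```python
W = 32
MASK = (1 << W) - 1
P32, Q32 = 0xB7E15163, 0x9E3779B9   # magic constants for w = 32

def rotl(x, s):
    s %= W
    return ((x << s) | (x >> (W - s))) & MASK if s else x

def rotr(x, s):
    s %= W
    return ((x >> s) | (x << (W - s))) & MASK if s else x

def expand_key(key: bytes, r: int):
    """Key expansion of section 2.2 for RC5-32/r/b."""
    t, c = 2 * r + 2, max(1, (len(key) + 3) // 4)
    L = [int.from_bytes(key[4 * i:4 * i + 4], "little") for i in range(c)]
    S = [(P32 + i * Q32) & MASK for i in range(t)]
    A = B = i = j = 0
    for _ in range(3 * max(t, c)):
        A = S[i] = rotl((S[i] + A + B) & MASK, 3)
        B = L[j] = rotl((L[j] + A + B) & MASK, A + B)
        i, j = (i + 1) % t, (j + 1) % c
    return S

def encrypt(block, S, r):
    A, B = (block[0] + S[0]) & MASK, (block[1] + S[1]) & MASK
    for i in range(1, r + 1):
        A = (rotl(A ^ B, B) + S[2 * i]) & MASK
        B = (rotl(B ^ A, A) + S[2 * i + 1]) & MASK
    return A, B

def decrypt(block, S, r):
    A, B = block
    for i in range(r, 0, -1):
        B = rotr((B - S[2 * i + 1]) & MASK, A) ^ A
        A = rotr((A - S[2 * i]) & MASK, B) ^ B
    return (A - S[0]) & MASK, (B - S[1]) & MASK

# Round-trip self-check for RC5-32/12/9 with an all-zero 9-byte key:
S = expand_key(bytes(9), r=12)
pt = (0x01234567, 0x89ABCDEF)
assert decrypt(encrypt(pt, S, 12), S, 12) == pt
```

a plausible brute-force loop then simply iterates expand_key over candidate keys and compares decrypt of the captured ciphertext block with the known plaintext block; this is exactly the computation the architectures below implement in hardware.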
4.1 pipelined architecture
the first design utilizes a long pipeline in order to attain a high clock rate and effectively check one key every clock cycle. the number of pipeline stages is equal to 3·max(t, c) + r. the overall block diagram of our hardware design is shown in figure 1. the inputs are "secret key" – the key which is to be tested, "initial values for s" – the contents of the s array detailed in section 2.2, and "ciphertext" – the one-block sample of the encoded message, to which the corresponding plaintext block is known. hardware for generating individual keys and for comparing the computed plaintext to the expected value is omitted from the figure.

fig. 1: block diagram

the implementation of the key expansion pipeline ("key expansion" block in fig. 1) in our design is shown in fig. 2. again, its inputs are the initial contents of the array s (constants calculated at design time using the values p_w and q_w) and the secret key to be tested. the dash-and-dotted line represents registers (fields of the array s) that would still contain original initialization values (which are known at design time) and therefore are not physically implemented. the structure of one key expansion element (k-e in fig. 2) is shown in fig. 3. it performs the operations that are inside the body of the main loop described in section 2.2. the structure of the decryption pipeline (the "decryption" block in fig. 1) is shown in fig. 4. it performs the functions described by the algorithm in section 2.3. its inputs are the computed array s (with the secret key mixed in) and the ciphertext block. the dotted lines represent registers (portions of the array s) that are no longer needed for the calculation and thus are not physically implemented. the structure of one decryption round element is shown in fig. 5. it performs the operations inside the main loop detailed in section 2.3. the final round (not included in r) differs from the others and its hardware structure is shown in fig. 6.

fig. 2: key expansion pipeline
fig. 3: key expansion element
fig. 4: decryption pipeline
fig. 5: decryption round element
fig. 6: final decryption round

to give the reader an idea about the size of the entire hardware, let us take the rc5-32/12/9 variant as an example and count the number of hardware elements. every stage of the key expansion pipeline contains 3 registers (l array), up to 26 registers (s array), 4 adders, 1 barrel shifter. every stage of the decryption pipeline contains up to 26 registers (s array), 2 registers (a, b), 2 subtractors, 2 xors, 2 barrel shifters. the last round element contains 2 subtractors. thus, the entire pipeline would contain 1906 w-bit registers, 312 adders, 26 subtractors, 24 xors and 102 barrel shifters.
4.2 sequential (not pipelined) architecture
as stated in section 4.1 above, the pipelined architecture is resource hungry. we anticipate that we would need a large fpga (or a large asic) to implement the full pipeline. however, there is a class of cheap fpgas (e.g. the xilinx spartan-3 series) which could also be utilized, if we can design a circuit that fits into these small-to-medium sized devices. we have therefore designed another variant of the cracking engine, which is sequential but not pipelined. (in fact, it still has two stages, but it is not fully pipelined, so we call it sequential here.) the basic block diagram stays the same (see figure 1). the key expansion and decryption units have been changed so that the computation is performed sequentially, much like the specification of the original algorithm (sections 2.2 and 2.3). the key expansion and decryption units contain only one key expansion element and one decryption round element, respectively. the structure of the sequential key expansion unit is shown in fig. 7. the main difference from the pipelined unit (fig. 2) is that the s and l arrays are now implemented as shift registers and the a and b registers are now inside the key expansion element. the s and l registers shift by one word to the lower indices (to the left in the figure) in order to implement the key expansion sequence. the number of cycles is the same as the number of pipeline stages in the pipelined version, i.e., 3·max(t, c) + r, and the end of key expansion is determined by a counter (not shown here for simplicity). the unit now contains only one key expansion element, whose structure is shown in fig. 8. it performs the operations that are inside the body of the main loop described in section 2.2, and contains the a and b registers. the structure of the sequential decryption unit is shown in fig. 9. it performs the functions described by the algorithm in section 2.3. the s register is now implemented as a shift register, but this time it shifts by two words toward the higher indices (to the right in the figure). the structure of one decryption round element is shown in fig. 10. it performs the operations inside the main loop detailed in section 2.3. the last round is now treated differently, and the final decrypted output is now taken directly from the decryption round element (outputs a′ and b′ in fig. 10). the end of decryption is determined by a counter (not shown here for simplicity).

fig. 7: sequential key expansion unit
fig. 8: sequential key expansion element
fig. 9: sequential decryption unit
fig. 10: sequential decryption round element

5 results
we described both hardware architectures in vhdl, simulated them and synthesized them for fpga. we synthesized the pipelined architecture also for asic. the following experiments were conducted with the rc5-32/12/9 variant to make the results directly comparable with those in section 3.1 (distributed.net).

5.1 fpga implementation
the pipelined hardware is rather complicated. furthermore, barrel shifters are not "fpga-friendly" due to high requirements on routing resources. as a result, a rather large fpga is necessary to implement the entire design, e.g. a xilinx xc2vp100 (virtex ii pro) or xc2v8000 (virtex ii). it is clear that fpga is not a suitable platform for the pipelined architecture, compared to general-purpose cpus (table 1). the sequential hardware is much smaller than our pipelined implementation. the circuit now fits into cheap, small- to medium-sized fpgas such as xilinx xc3s200 and xc3s400 (spartan-3). the downside is the substantial drop in the decryption rate. however, as we can see, the lower price compensates for the speed decrease.
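the throughput gap between the two architectures follows directly from the cycle counts: the pipeline retires one key per clock cycle once it is full, while the sequential unit needs roughly 3·max(t, c) + r cycles per key. a back-of-the-envelope sketch of this relation (the clock frequency here is an illustrative assumption, not a measured value):

```python
def cycles_per_key(w=32, r=12, b=9):
    """Approximate cycles per candidate key for the sequential engine."""
    t, c = 2 * r + 2, -(-b // (w // 8))   # t = 2r+2, c = ceil(b/(w/8))
    return 3 * max(t, c) + r              # 90 cycles for RC5-32/12/9

f_clk = 60e6                              # illustrative clock frequency [Hz]
pipelined  = f_clk                        # one key per cycle when the pipe is full
sequential = f_clk / cycles_per_key()     # roughly one key per ~90 cycles
print(f"pipelined ~ {pipelined/1e6:.0f} Mkeys/s, "
      f"sequential ~ {sequential/1e6:.2f} Mkeys/s")
```

with a clock around 60 mhz, the sequential estimate comes out near the xc3s200 figure reported in table 2 below, which suggests this first-order cycle-count model captures the essential difference.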
we can see that the sequential fpga implementation is comparable with general-purpose cpus from the cost/rate point of view. the results of both variants implemented in fpga are listed in table 2.

device    | architecture | rate [mkeys/s] | cost [usd] | cost/rate
xc2vp100  | pipelined    | 39             | 7900       | 202
xc2v8000  | pipelined    | 43             | 11000      | 256
xc3s200   | sequential   | 0.66           | 13.45      | 20.4
xc3s400   | sequential   | 1.32           | 22         | 16.7
table 2: fpga results

5.2 asic implementation
we synthesized the pipelined design using a 0.18 µm technology library (unfortunately, we did not have the necessary data for synthesis into the latest 90 nm and 60 nm technologies). out of the wide range of tradeoffs between speed and area, one of the favorable settings resulted in a design that occupies approximately 15 mm² and can run at 215 mhz under typical conditions. several instances of the design can be placed on one die to further reduce the cost. for asics, it is difficult to obtain an exact price per chip, since it greatly depends, among other factors, on the total volume of chips produced. we tried to get a reasonable estimate for a small quantity of chips (based on mosis price lists [15] for a lot of 40 chips) and for a large quantity (based on the market price of common ics with a similar die size). table 3 lists our results and the estimated prices. while it would be possible to fine-tune and optimize the synthesis process or implement the design in more advanced technology for even better results, the obtained figures are sufficient for comparison with the other platforms considered in this paper.

production type        | rate/chip (mkeys/s) | cost/chip (usd) | cost/rate
low volume, small die  | 215                 | 1050            | 4.9
low volume, large die  | 2150                | 4500            | 2.1
high volume, large die | 2150                | 180             | 0.08
table 3: asic results

5.3 minimum secure key length
concluding from the previous sections, it is clear that fpga is not a reasonable option for the pipelined architecture since it is too expensive. on the other hand, the sequential architecture can be used with smaller and cheaper fpgas. a low-budget attacker would turn to systems built upon general-purpose cpus or cheap fpgas, while a well-funded attacker is likely to use asics. the cost/rate ratio of small fpgas is comparable or even superior to that of general-purpose cpus. however, when considering whether to build an fpga-based cracking engine or a cpu-based one, we have to take into account additional costs that were not addressed in this paper. these involve the costs of custom-built or off-the-shelf mainboards, input/output processing circuits, etc. in this respect, the cpu-based cracking engine would probably be better. to find the unknown secret key using a brute-force attack, an attacker must search 50% of the keyspace (2^(8b−1) keys) on average, and the entire keyspace (2^8b keys) in the worst case. table 4 details the number of keys an attacker would be able to try every second, based on the amount invested, using the technology discussed in the preceding sections.

investment | technology     | keys/sec | keys/day | keys/year
$5k        | gp cpu or fpga | 2^28     | 2^44     | 2^53
$50k       | gp cpu or fpga | 2^31     | 2^47     | 2^56
$50k       | asic           | 2^33     | 2^49     | 2^58
$500k      | asic           | 2^38     | 2^54     | 2^63
$5m        | asic           | 2^46     | 2^62     | 2^70
table 4: speed vs. investment

from the data in table 4, we can conclude that 40-bit keys are not secure at all, and 56-bit keys can only protect sensitive data from low-budget attackers for a limited period of time. should we wish to protect sensitive data against the strongest attacker listed in table 4, the secret key should be longer than 70 bits. however, considering that more advanced technology is available today than the discussed 0.18 µm process, we would recommend at least 80 to 90 bits. 128-bit keys that are common in today's cryptographic systems satisfy this condition. another view is that, for instance, a 54-bit key is inadequate if an attacker can gain more than $500,000 by recovering it in a day. it should be noted that while the cost of a brute-force attack increases exponentially with the length of the secret key, the cost of encryption and decryption increases negligibly. therefore, a prudent cryptographic system would use a key that is at least twice or thrice as long as the recommended minimum, to allow a margin for error or technology improvements [11].
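the figures in table 4 translate directly into expected attack times by dividing half the keyspace by the search rate. a small sketch, assuming the $5m asic rate of 2^46 keys/s from the table:

```python
def expected_years(key_bits, keys_per_second):
    # On average, half of the 2^key_bits keyspace must be searched.
    seconds = 2 ** (key_bits - 1) / keys_per_second
    return seconds / (365 * 24 * 3600)

rate = 2 ** 46  # $5M ASIC attacker, per Table 4
for bits in (40, 56, 64, 72, 80, 90):
    print(f"{bits}-bit key: {expected_years(bits, rate):.3g} years on average")
```

for example, a 72-bit key falls in about a year at this rate, while an 80-bit key already requires centuries, which is consistent with the recommendation above.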
6 conclusion
we have designed two hardware architectures for attacking the rc5 cipher using the exhaustive search (also known as "brute-force") method. we have described the architectures in vhdl, synthesized them for various fpga and asic platforms and compared the speed with common general-purpose processors. it turns out that fpgas are comparable to general-purpose cpus, although the additional costs may favor using cpu-based systems. for a low-budget attacker, today's general-purpose cpus provide better performance. for an attacker with a generous budget, asic is the best option. today, an 80 to 90-bit key seems to be secure enough against most attackers. for data that should remain protected for some time to come, at least 128 bits should be used; however, a prudent cryptographer would use a much longer key – the increased cost of encryption and decryption is negligible, while a brute-force attack becomes impossible. our future work will include testing the sequential design in a physical implementation in a small or medium-sized fpga.

references
[1] rivest, r. l.: the rc5 encryption algorithm. cryptobytes, vol. 1 (1995), no. 1, p. 9–11.
[2] baldwin, r., rivest, r.: the rc5, rc5-cbc, rc5-cbc-pad, and rc5-cts algorithms. rfc 2040, network working group, 1996.
[3] rivest, r. l.: block encryption algorithm with data-dependent rotations. u.s. patents no. 5,724,428 and no. 5,835,600, 1998.
[4] kaliski, b. s. jr., yin, y. l.: "on differential and linear cryptanalysis of the rc5 encryption algorithm." advances in cryptology – crypto'95, springer-verlag, 1995, p. 171–183.
[5] knudsen, l. r., meier, w.: "improved differential attacks on rc5." advances in cryptology – crypto'96, springer-verlag, 1996, p. 216–228.
[6] biryukov, a., kushilevitz, e.: "improved cryptanalysis of rc5." advances in cryptology – eurocrypt'98, springer-verlag, 1998.
[7] selcuk, a. a.: "new results in linear cryptanalysis of rc5." proceedings of 5th international workshop on fast software encryption, springer-verlag, 1998, p. 1–16.
[8] shimoyama, t., takeuchi k., hayakawa j.: "correlation attack to the block cipher rc5 and the simplified variants of rc6." proceedings aes3, new york, 2001.
[9] yin, y. l.: "the rc5 encryption algorithm: two years on." cryptobytes, vol. 2 (1997), no. 3, p. 14–15.
[10] rsa data security, inc.: secret-key challenges. on-line at http://www.rsasecurity.com/rsalabs/chalenges/secretkey/index.html
[11] blaze, m., diffie, w., rivest, r. l.
et al.: "minimal key lengths for symmetric ciphers to provide adequate commercial security." a report by an ad hoc group of cryptographers and computer scientists, 1996. on-line at http://theory.lcs.mit.edu/rivest/bsa-final-report.pdf
[12] distributed computing technologies, inc.: the "distributed.net" project. on-line at http://www.distributed.net
[13] morrison, j. p., o'dowd, p. j., healy, p. d.: "searching rc5 keyspaces with distributed reconfigurable hardware." in: engineering of reconfigurable systems and algorithms, csrea press, 2003, p. 269–272.
[14] matušková, m., hlaváč, j., buček, j., lórencz, r.: "rc5 brute force cracking engine." proceedings of the sixth international scientific conference electronic computers and informatics eci 2004. technical university košice, 2004, p. 259–264. isbn 80-8073-150-0.
[15] mosis ic fabrication service. on-line at http://www.mosis.org

ing. jiří buček, e-mail: bucekj@fel.cvut.cz
ing. josef hlaváč, e-mail: hlavacj2@fel.cvut.cz
ing. róbert lórencz, csc., e-mail: lorencz@fel.cvut.cz
department of computer science and engineering, czech technical university in prague, faculty of electrical engineering, karlovo nám. 13, 121 35 prague, czech republic
ing. monika matušková, e-mail: monika.matuskova@vslib.cz
department of software engineering, technical university of liberec, faculty of mechatronics, hálkova 6, 460 15 liberec, czech republic

acta polytechnica 57(6):430–445, 2017, doi:10.14311/ap.2017.57.0430, © czech technical university in prague, 2017, available online at http://ojs.cvut.cz/ojs/index.php/ap

on self-similarities of cut-and-project sets
zuzana masáková∗, jan mazáč
department of mathematics, fnspe, czech technical university in prague, trojanova 13, 120 00 praha 2, czech republic
∗ corresponding author: zuzana.masakova@fjfi.cvut.cz

abstract. among the commonly used mathematical models of quasicrystals are delone sets constructed using a cut-and-project scheme, the so-called cut-and-project sets. a cut-and-project scheme (l, π1, π2) is given by a lattice l in r^s and projections π1, π2 to suitable subspaces v1, v2. in this paper we derive several statements describing the connection between self-similarity transformations of the lattice l and transformations of its projections π1(l), π2(l). for a self-similarity of a set σ we take any linear mapping a such that aσ ⊂ σ, which generalizes the notion of self-similarity usually restricted to scaled rotations. we describe a method of construction of cut-and-project schemes with required self-similarities and apply it to produce a cut-and-project scheme such that π1(l) ⊂ r² is invariant under an isometry of order 5. we describe all linear self-similarities of this scheme and show that they form an 8-dimensional associative algebra over the ring z. we give an example of a cut-and-project set with a linear self-similarity which is not a scaled rotation.
keywords: self-similarity; quasicrystal; cut-and-project scheme.

1. introduction
quasicrystals, their mathematical models and their physical properties have stood in the front row of interest of scientists since 1984, when shechtman and his colleagues [18] published his 1982 discovery of non-crystallographic materials with long-range order.
advances in the description of these materials, obtained first as rapidly solidified intermetallic alloys, have since been achieved both on the mathematical side and in the experiments. a number of overview books and survey papers have been published, see for example [1]. a fresh impulse to the research on quasicrystals was given by awarding the 2011 nobel prize for chemistry to dan shechtman for his discovery. while crystals are modeled by periodic lattices, as a suitable mathematical model of atomic positions in quasicrystals one recognizes the cut-and-project method, which stems from projecting lattice points from a higher-dimensional space to suitable subspaces (usually called the physical and inner spaces). a cut-and-project scheme is thus given by a lattice l and two projections π1, π2, see details in section 2. then, choosing properly a delone subset σ(ω) of π1(l), one obtains a quasiperiodic structure in the physical space which allows symmetries forbidden for periodic sets by the crystallographic restriction theorem. the choice is directed by a suitable window ω in the inner space. the set σ(ω) is then called a cut-and-project set. the origins of the idea can be traced back to times long before the quasicrystal discovery, to bohr [6], who developed his theory of quasiperiodic functions, and then to meyer in connection with harmonic analysis [15]. de bruijn performed [8] this construction to obtain the vertices of the famous penrose tiling [17]. the utility of this method for constructing quasicrystal models was then recognized by kramer and neri [10]. when studying two-dimensional quasicrystal models, one is interested in those displaying 5-, 8-, 10- and 12-fold rotational symmetry, which corresponds to the experimentally observed cases [19]. the family of symmetries is, however, much richer; besides rotations/reflections it contains scalings by irrational factors and other affine symmetries. it follows from the results of lagarias [11] that if σ(ω) is a cut-and-project set and η > 1 is such that ησ(ω) ⊂ σ(ω), then η can only be a pisot or salem number. some authors [2] have considered self-similarities of quasilattices in the form of scaled rotations, i.e., mappings ηr, where η > 1 and r is an orthogonal map. to our knowledge, no systematic study of general affine self-similarities, i.e., affine mappings a such that aσ(ω) ⊂ σ(ω), is found in the literature. in this paper we focus on two main problems concerning general linear self-similarities. first, given a linear map a, we are interested in which cut-and-project schemes allow a as a self-similarity, i.e., are such that aπ1(l) ⊂ π1(l). then, we may want to fix the cut-and-project scheme and ask about all linear self-similarities allowed by this specific scheme. to this aim we present a matrix formalism for the study of cut-and-project schemes and derive several general statements (theorem 3.1 and proposition 3.3). these statements we then apply in the context of quasicrystal models with 5-, resp. 10-fold symmetry. we derive the necessary form of the cut-and-project scheme (l ⊂ r⁴, π1, π2) if one aims to obtain a quasicrystal model with 10-fold symmetry. it turns out that the requirement of the symmetry alone leads to
a construction equivalent to the classical one [3], which uses preliminary knowledge of the space group description, provided by fedorov and schönflies in dimension ≤ 3 and then generalized by bieberbach to any dimension [5]. we demonstrate the comparison of the two constructions, see section 5. next, given the cut-and-project scheme allowing 5-fold symmetry, we study its linear self-similarities in section 6. we show that these mappings form an 8-dimensional z-algebra z and we provide an explicit description of its elements. the z-algebra z has a 4-dimensional commutative subalgebra, whose elements give scaled rotations. this subalgebra is ring-isomorphic to the ring z[ω] of cyclotomic integers, where ω = e^{2πi/5}. not all self-similarities of the set π1(l) are self-similarities of some cut-and-project set σ(ω). general statements providing a necessary condition and some sufficient conditions for the existence of a suitable window ω are given in section 4. an application of this theory is then performed in section 7. we focus on the commutative subalgebra of the algebra z and describe the scaled rotations s for which a window ω such that s(σ(ω)) ⊂ σ(ω) exists. we also provide an example illustrating that a cut-and-project set may have a linear self-similarity which is not a scaled rotation.

2. preliminaries
it is commonly understood that a mathematical model for quasicrystals should satisfy the so-called delone property.

definition 2.1. we say that x ⊂ r^n is delone, if the following conditions are satisfied:
(1.) there exists r > 0 such that every open ball of radius r contains at most one point of x (uniform discreteness).
(2.) there exists r > 0 such that every closed ball of radius r contains at least one point of x (relative density).

note that the supremum of the values r from uniform discreteness bounds the distances between points in the delone set x from below. if the supremum is achieved, then it is the minimal distance in x. the infimum of the values r from relative density is the so-called covering radius of the set x. the essential idea behind the cut-and-project scheme is to project elements of a higher-dimensional lattice to suitable subspaces. there are two basic approaches when doing this: either one takes the standard lattice z^{n+m} and projects to general irrationally oriented subspaces v1, v2 of dimensions n, m, respectively, whose direct sum is equal to r^{n+m}; or one chooses a general lattice l and projects to subspaces spanned by vectors of the standard basis e1, . . . , e_{n+m}. both methods are equivalent in principle. for formal reasons, it is suitable for us to choose the second approach.

definition 2.2. let l ⊂ r^{n+m} be an (n + m)-dimensional lattice. let further v1 = span_r{e1, e2, . . . , e_n} and v2 = span_r{e_{n+1}, e_{n+2}, . . . , e_{n+m}}. let π1 : r^{n+m} → r^n and π2 : r^{n+m} → r^m be projections to v1 and v2, respectively. the triple (l, π1, π2) is called a cut-and-project scheme. we say that the cut-and-project scheme is non-degenerated if π1 restricted to l is injective. we say that the cut-and-project scheme is irreducible if π2(l) is dense in r^m.

it can be easily seen that the set π1(l) is a z-module in r^n and, in case the cut-and-project scheme is non-degenerated, it is not a discrete set. a quasicrystal model is constructed as a suitable subset of the z-module π1(l). in order to choose a delone subset of π1(l), one puts a condition on the second projection of lattice points.
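before proceeding, note that for a finite patch of a point set the two delone parameters of definition 2.1 can be estimated numerically: uniform discreteness from the minimal pairwise distance, relative density from the largest empty distance found by random sampling. a rough illustrative sketch of our own (edge effects near the patch boundary are ignored):

```python
import numpy as np
from scipy.spatial import cKDTree

def delone_parameters(points, samples=20000, seed=0):
    """Estimate (r, R) of Definition 2.1 for a finite patch of points."""
    tree = cKDTree(points)
    d, _ = tree.query(points, k=2)     # distance to the nearest neighbour
    r = d[:, 1].min() / 2              # open r-balls then hold at most one point
    lo, hi = points.min(0), points.max(0)
    rng = np.random.default_rng(seed)
    grid = lo + rng.random((samples, points.shape[1])) * (hi - lo)
    R = tree.query(grid)[0].max()      # largest gap seen by the sampling
    return r, R
```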
definition 2.3. let (l, π1, π2) be a non-degenerated irreducible cut-and-project scheme. given a bounded set ω with non-empty interior, we define the cut-and-project set σ(ω) with acceptance window ω by

σ(ω) := {π1(x) : x ∈ l, π2(x) ∈ ω}. (1)

in the literature, one sometimes puts different requirements on the acceptance window ω; for example, lagarias [11] asks it to be bounded and open. on the other hand, cotfas [7] requires compactness and int(ω) ≠ ∅. moody [16] requires that the bounded set ω satisfy ω ⊂ int(cl ω). some additional conditions, such as empty intersection of the boundary of ω with π2(l), or convexity, may influence some specific properties of the cut-and-project set, namely repetitivity [12], or closedness under quasiaddition [4]. here we stick to the two basic requirements which ensure the delone property of σ(ω), see [16]. in the study of quasicrystal models with the observed rotational symmetries, one necessarily encounters certain number-theoretic notions. an algebraic number α is a root of a polynomial with rational coefficients. if this polynomial is monic and irreducible over the rationals, it is called the minimal polynomial of α, and its degree is the degree of α. algebraic numbers with the same minimal polynomial are called algebraic conjugates. if the minimal polynomial of α has integer coefficients, then α is said to be an algebraic integer. a special class of algebraic integers is given by the so-called pisot numbers. a pisot number is an algebraic integer β > 1 whose algebraic conjugates lie in the interior of the unit disk. the most prominent example of a pisot number is the golden ratio τ = (1 + √5)/2, with minimal polynomial x² − x − 1. the golden ratio is strongly linked to 5-fold symmetry, namely by the equality 2cos(2π/5) = τ − 1. pisot numbers appear as self-similarity factors of cut-and-project sets. another class of important numbers are salem numbers: algebraic integers > 1 with conjugates in the unit disk, at least one of them lying on the unit circle. the notion of pisot numbers is transferred to the complex plane by the term complex pisot number – a complex algebraic integer β such that all algebraic conjugates other than β and its complex conjugate β̄ belong to the interior of the unit disk. these will play an important role in section 7.

3. matrix formalism for the cut-and-project method
in what follows, we use a matrix formalism for describing the cut-and-project sets. on its basis, we will set the conditions on the vectors generating the lattice l = {∑_{i=1}^s a_i l_i : a_i ∈ z}, so that the scheme is self-similar. denote by v the s × s matrix formed by the vectors l1, . . . , l_s written in columns. every lattice vector can then be written as l = v x with x ∈ z^s. the projections π1, π2 then act on a lattice vector l ∈ l as

π1(l) = (i_n, o) l, π2(l) = (o, i_{s−n}) l,

where i_k is the identity matrix of order k and o stands for the zero matrix of size n × (s−n) or (s−n) × n, respectively. assume having a window ω ⊂ b(0, r) ⊂ r^{s−n}. an n-dimensional cut-and-project set with window ω can be expressed as

σ(ω) = {(i_n, o) v x : x ∈ z^s, (o, i_{s−n}) v x ∈ ω}.

denoting (i_n, o) v x = π1(v x) = b ∈ r^n and (o, i_{s−n}) v x = π2(v x) = b* ∈ r^{s−n}, we obtain the cut-and-project set in the form

σ(ω) = {b ∈ π1(l) : b* ∈ ω},

which fully corresponds to the definition (1). we further use the above matrix formalism for deriving several statements about self-similarities of the cut-and-project scheme and cut-and-project sets.
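the formalism is easy to exercise on the classical one-dimensional example with s = 2 and n = 1. the following is a minimal sketch with our own illustrative choice of v (not a scheme used later in this paper), built on the golden ratio τ and its algebraic conjugate τ′; the resulting σ(ω) is a fibonacci-type chain:

```python
import numpy as np

tau, taup = (1 + 5**0.5) / 2, (1 - 5**0.5) / 2   # tau and its conjugate

# Columns of V generate the lattice L = V Z^2; row 1 gives pi1, row 2 gives pi2.
V = np.array([[tau,  1.0],
              [taup, 1.0]])

def sigma(lo=0.0, hi=(1 + 5**0.5) / 2, N=30):
    """Sigma(Omega) = { pi1(Vx) : x in Z^2, pi2(Vx) in [lo, hi) }."""
    pts = []
    for a in range(-N, N + 1):
        for b in range(-N, N + 1):
            p1, p2 = V @ np.array([a, b])
            if lo <= p2 < hi:
                pts.append(p1)
    return np.sort(np.array(pts))

gaps = np.unique(np.round(np.diff(sigma()), 9))
print(gaps)   # two tile lengths, 1 and tau: a Fibonacci-type quasiperiodic chain
```

the window here is a half-open interval of length τ, a standard choice that produces exactly two tile lengths.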
theorem 3.1. let (l, π1, π2) be a non-degenerate irreducible cut-and-project scheme with l ⊂ r^s. let a ∈ r^{n×n} satisfy aπ1(l) ⊂ π1(l). then there exists a matrix c ∈ z^{s×s}, similar to a matrix

$$\begin{pmatrix} a & o \\ o & b \end{pmatrix},$$

where b ∈ r^{(s−n)×(s−n)}. in particular,

$$c = v^{-1}\begin{pmatrix} a & o \\ o & b \end{pmatrix} v,$$

where v ∈ r^{s×s} is the matrix formed by the vector generators of the lattice l written in columns.

proof. let l1, l2, . . . , l_s be the linearly independent vectors in r^s generating the lattice l. since a is a self-similarity of the set π1(l), for every l ∈ l there exists l′ ∈ l such that aπ1(l) = π1(l′). correspondingly, to every x ∈ z^s there exists x′ ∈ z^s such that aπ1(v x) = π1(v x′). the mapping x ↦ x′ is linear over z, and thus there exists a matrix c ∈ z^{s×s} such that c x = x′. we further define a linear map b by bπ2(v x) = π2(v x′), for x ∈ z^s. together, rewritten in the matrix formalism, we have

a (i_n, o) v x = (i_n, o) v c x, b (o, i_{s−n}) v x = (o, i_{s−n}) v c x,

which can be put together into

$$\begin{pmatrix} a & o \\ o & b \end{pmatrix} v x = v c x.$$

since this holds for every x ∈ z^s, we derive that

$$c = v^{-1}\begin{pmatrix} a & o \\ o & b \end{pmatrix} v, \tag{2}$$

which we aimed to show.

we have an obvious corollary to the above theorem.

corollary 3.2. let (l, π1, π2) be a non-degenerate irreducible cut-and-project scheme with l ⊂ r^s. let a ∈ r^{n×n} satisfy aπ1(l) ⊂ π1(l). then the eigenvalues of the matrix a are algebraic integers and their minimal polynomial divides the characteristic polynomial of the matrix c over q.

the matrix framework also enables us to find a cut-and-project scheme displaying a self-similarity defined by a given integer matrix c.

proposition 3.3. let c ∈ z^{s×s}, and let there exist a matrix v ∈ r^{s×s} of rank s such that v c v^{−1} is block diagonal, i.e.,

$$v c v^{-1} = \begin{pmatrix} a & o \\ o & b \end{pmatrix},$$

where a ∈ r^{n×n}, b ∈ r^{(s−n)×(s−n)}. denote by l = {∑_{i=1}^s a_i l_i : a_i ∈ z} the lattice generated by the linearly independent columns l_i of v, and for a lattice vector l ∈ l set the projections π1 : r^s → r^n, π2 : r^s → r^{s−n} to π1(l) = (i_n, o) l, π2(l) = (o, i_{s−n}) l.
(1.) then aπ1(l) ⊂ π1(l).
(2.) if z ∈ z^{s×s} is another matrix satisfying

$$v z v^{-1} = \begin{pmatrix} s & o \\ o & t \end{pmatrix}$$

for some s ∈ r^{n×n}, t ∈ r^{(s−n)×(s−n)}, then sπ1(l) ⊂ π1(l).

the above proposition follows from theorem 3.1. note that the proposition does not state anything about non-degeneracy or irreducibility of the cut-and-project scheme (l, π1, π2) obtained as shown. examples of degenerate or reducible schemes may be constructed. when studying the structure of the set of all self-similarities of a given cut-and-project scheme, one easily realizes that they form an associative algebra over the ring z. this follows from the fact that π1(l) is a z-module.

proposition 3.4. let r ⊂ r^n be a z-module. denote by 𝓡 the set of all linear mappings s on r^n such that s r ⊂ r. then 𝓡 is an associative z-algebra.

proof. let s1, s2 ∈ 𝓡, i.e., s1, s2 ∈ r^{n×n} such that s_i r ⊂ r. then clearly

(s1 + s2) r = s1 r + s2 r ⊂ r + r = r, (s1 s2) r = s1 (s2 r) ⊂ s1 r ⊂ r,

where we have used that for a z-module r we have r + r = r. this means that s1 + s2, s1 s2 ∈ 𝓡, and necessarily also k s1 ∈ 𝓡 for any k ∈ z. associativity is obvious.
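theorem 3.1 can be checked numerically on the toy scheme of the earlier sketch: multiplication by τ on π1(l) corresponds to an integer matrix c (our illustrative example), with the blocks a = (τ) and b = (τ′):

```python
import numpy as np

tau, taup = (1 + 5**0.5) / 2, (1 - 5**0.5) / 2

# Rows of V are left eigenvectors of C, so V C V^{-1} is diagonal.
V = np.array([[tau,  1.0],
              [taup, 1.0]])
C = np.array([[1, 1],
              [1, 0]])     # integer matrix realizing multiplication by tau on pi1(L)

# Theorem 3.1 in this toy case: C = V^{-1} diag(A, B) V with A = (tau), B = (tau')
print(np.round(V @ C @ np.linalg.inv(V), 12))   # -> diag(1.618..., -0.618...)
```

the block b = (τ′) has modulus smaller than 1, in line with the pisot property of τ quoted from [11] in the introduction.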
4. self-similarities of a cut-and-project set
the statements in the previous section concerned self-similarities of the z-module π1(l). let us now concentrate on what can be said in general about the self-similarities of cut-and-project sets, given a cut-and-project scheme (l, π1, π2) with a self-similarity a.

theorem 4.1. let (l, π1, π2) be a non-degenerated irreducible cut-and-project scheme with a self-similarity a. if there exists a window ω ⊂ r^m such that aσ(ω) ⊂ σ(ω), then the eigenvalues of the matrix b from theorem 3.1 are in modulus smaller than or equal to 1.

proof. assume that for some ω ⊂ r^m we have aσ(ω) ⊂ σ(ω). this means that a^k z ∈ σ(ω) for any z ∈ σ(ω) and k ∈ n. for the integer matrix c ∈ z^{s×s} and the real matrix b ∈ r^{(s−n)×(s−n)} from theorem 3.1 we have, by iterating (2), that for any k ∈ n

$$\begin{pmatrix} a^k & o \\ o & b^k \end{pmatrix} = v c^k v^{-1}.$$

realize that if z ∈ σ(ω), then z = π1(l) for some l ∈ l with π2(l) ∈ ω. thus for any k there exists l′ ∈ l such that a^k z = π1(l′) and π2(l′) ∈ ω. rewriting in the matrix formalism,

a^k π1(l) = a^k (i_n, o) l = (i_n, o) v c^k v^{−1} l = π1(l′).

since the scheme is non-degenerate, the projection π1 is injective, and thus we can derive that l′ = v c^k v^{−1} l. the condition π2(l′) ∈ ω is therefore equivalent to

π2(l′) = (o, i_{s−n}) l′ = (o, i_{s−n}) v c^k v^{−1} l = b^k (o, i_{s−n}) l = b^k π2(l). (3)

now realize that by irreducibility the set {π2(l) : l ∈ l, π1(l) ∈ σ(ω)} is dense in the bounded window ω. by linearity of b, we must have for the closure of the window that b^k (cl ω) ⊂ cl ω. suppose that b has a real eigenvalue λ of modulus strictly exceeding 1. as b ∈ r^{(s−n)×(s−n)}, we have a real eigenvector w of b corresponding to the eigenvalue λ. iterating, we obtain a contradiction: b^k w = λ^k w ∉ cl ω for sufficiently large k ∈ n. if λ is a non-real eigenvalue of b with |λ| > 1 and a non-real eigenvector w, then λ̄ is an eigenvalue of b corresponding to the eigenvector w̄. over the real space of dimension 2, spanned by the vectors w + w̄ and i(w − w̄), the mapping b acts as multiplication by |λ| combined with rotation by the argument of λ. we obtain a similar contradiction with the boundedness of the window ω as before.

theorem 4.2. let (l, π1, π2) be a non-degenerated irreducible cut-and-project scheme with a self-similarity a. if the matrix b from theorem 3.1 has all eigenvalues in modulus strictly smaller than 1, then there exists a window ω such that a is a self-similarity of the cut-and-project set σ(ω).

proof. since the eigenvalues of the matrix b are in modulus strictly smaller than 1, by [9, corollary 1.2.3] there exists a metric ρ in r^{s−n} such that the mapping b is contracting in that metric, i.e., there exists δ < 1 such that for every x, y ∈ r^{s−n} we have δρ(x, y) > ρ(bx, by). choosing for the window the set ω = {x ∈ r^m : ρ(x, 0) ≤ 1}, we have for any l ∈ l with π2(l) ∈ ω that bπ2(l) ∈ ω. therefore aπ1(l) ∈ σ(ω) for any l ∈ l such that π1(l) ∈ σ(ω). thus aσ(ω) ⊂ σ(ω).

when the matrix b is diagonalizable, we can weaken the assumption on its eigenvalues.

theorem 4.3. let (l, π1, π2) be a non-degenerated irreducible cut-and-project scheme with a self-similarity a. if the matrix b from theorem 3.1 is diagonalizable and all its eigenvalues are in modulus smaller than or equal to 1, then there exists a window ω such that a is a self-similarity of the cut-and-project set σ(ω).

proof. we will construct a positive semi-definite matrix which induces an inner product on r^{s−n} (and consequently a metric) in which the mapping b is non-expanding, i.e., does not enlarge distances. first we define an inner product in which the eigenvectors of b form an orthonormal basis of r^{s−n}. denote by η1, η2, . . . , η_m, m := s−n, the eigenvalues of b and the corresponding eigenvectors by w1, w2, . . . , w_m. the inner product is defined using the hermitian matrix h = g*g, where g* stands for the conjugate transpose.
for g we take the matrix transferring the eigenvectors w1, w2, . . . , w_m into the standard basis e1, e2, . . . , e_m. the inner product for x, y ∈ r^m is then defined as ⟨x, y⟩_h := x* g* g y. consider a general vector x = ∑_{i=1}^m α_i w_i. then

⟨x, x⟩_h = ⟨∑_i α_i w_i, ∑_j α_j w_j⟩_h = ∑_{i,j} ᾱ_i α_j ⟨w_i, w_j⟩_h = ∑_i |α_i|²,
⟨bx, bx⟩_h = ⟨∑_i α_i b w_i, ∑_j α_j b w_j⟩_h = ∑_{i,j} ᾱ_i η̄_i α_j η_j ⟨w_i, w_j⟩_h = ∑_i |η_i|² |α_i|².

as all eigenvalues satisfy |η_i| ≤ 1, we thus have ⟨x, x⟩_h ≥ ⟨bx, bx⟩_h for any x ∈ r^m, and therefore the mapping b is not expanding. setting for the acceptance window ω a ball in the metric induced by this inner product, i.e., ω = {x ∈ r^m : ⟨x, x⟩_h ≤ 1}, it can again be easily derived that the cut-and-project set σ(ω) has self-similarity a.

5. construction of a cut-and-project scheme with 5-fold symmetry
in the following, we shall apply proposition 3.3 in order to construct a cut-and-project scheme allowing 5-fold symmetry, and subsequently, to describe all its self-similarities. the desired cut-and-project scheme must admit a cut-and-project set closed under an isometry of order 5, i.e., a mapping satisfying a⁵ = i. the minimal polynomial of the matrix a over z (the monic polynomial µ_a ∈ z[x] of lowest degree satisfying µ_a(a) = o) must divide the polynomial x⁵ − 1, which over z factors as x⁵ − 1 = (x − 1)(x⁴ + x³ + x² + x + 1). the smallest non-trivial example is thus the cyclotomic polynomial µ_a(x) = φ5(x) := x⁴ + x³ + x² + x + 1. the minimal polynomial of the integer matrix c obtained in theorem 3.1 should be divisible by µ_a. thus, as the simplest example, we consider for c the companion matrix of the polynomial φ5(x), namely

$$c = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ -1 & -1 & -1 & -1 \end{pmatrix}. \tag{4}$$

let ω = e^{2πi/5}. then the eigenvalues of c are ω, ω², ω³ = ω̄² and ω⁴ = ω̄, the four roots of the polynomial φ5(x) = x⁴ + x³ + x² + x + 1, which is irreducible over the rationals. note that these are precisely the primitive 5th roots of unity and they generate the cyclotomic field q(ω). since the minimal polynomial of ω is φ5 and is of degree 4, the cyclotomic field is expressed as q(ω) = {a + bω + cω² + dω³ : a, b, c, d ∈ q}. the field q(ω) has four automorphisms, induced by σ_j(ω) = ω^j for j = 1, 2, 3, 4; these act as the identity over the rationals. we can diagonalize the matrix c using a matrix y composed of the corresponding eigenvectors written in columns, and its inverse. denote

$$y_1 = \frac15\begin{pmatrix}1\\ \omega\\ \omega^2\\ \omega^3\end{pmatrix},\quad y_2 = \frac15\begin{pmatrix}1\\ \omega^4\\ \omega^3\\ \omega^2\end{pmatrix},\quad y_3 = \frac15\begin{pmatrix}1\\ \omega^2\\ \omega^4\\ \omega\end{pmatrix},\quad y_4 = \frac15\begin{pmatrix}1\\ \omega^3\\ \omega\\ \omega^4\end{pmatrix}, \tag{5}$$

where the scaling factor 1/5 is chosen just for convenience. then y and its inverse y^{−1} are given by

$$y = \frac15\begin{pmatrix} 1 & 1 & 1 & 1\\ \omega & \omega^4 & \omega^2 & \omega^3\\ \omega^2 & \omega^3 & \omega^4 & \omega\\ \omega^3 & \omega^2 & \omega & \omega^4 \end{pmatrix},\qquad y^{-1} = \begin{pmatrix} 1-\omega & \omega^4-\omega & \omega^3-\omega & \omega^2-\omega\\ 1-\omega^4 & \omega-\omega^4 & \omega^2-\omega^4 & \omega^3-\omega^4\\ 1-\omega^2 & \omega^3-\omega^2 & \omega-\omega^2 & \omega^4-\omega^2\\ 1-\omega^3 & \omega^2-\omega^3 & \omega^4-\omega^3 & \omega-\omega^3 \end{pmatrix}. \tag{6}$$

note that we grouped the columns into pairs that are complex conjugates. then y^{−1} c y = diag(ω, ω⁴, ω², ω³) is a diagonal matrix over c. in order to obtain a block diagonal matrix over r we use the matrices

$$p = \begin{pmatrix} 1 & -i & 0 & 0\\ 1 & i & 0 & 0\\ 0 & 0 & 1 & -i\\ 0 & 0 & 1 & i \end{pmatrix},\qquad p^{-1} = \frac12\begin{pmatrix} 1 & 1 & 0 & 0\\ i & -i & 0 & 0\\ 0 & 0 & 1 & 1\\ 0 & 0 & i & -i \end{pmatrix}. \tag{7}$$

we thus have

$$p^{-1} y^{-1} c\, y p = \begin{pmatrix} \cos\frac{2\pi}{5} & \sin\frac{2\pi}{5} & 0 & 0\\ -\sin\frac{2\pi}{5} & \cos\frac{2\pi}{5} & 0 & 0\\ 0 & 0 & \cos\frac{4\pi}{5} & \sin\frac{4\pi}{5}\\ 0 & 0 & -\sin\frac{4\pi}{5} & \cos\frac{4\pi}{5} \end{pmatrix}. \tag{8}$$
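the chain of similarity transformations (4)–(8) is easy to verify numerically; a short check (the 1/5 normalization of (5) drops out of the similarity transformation):

```python
import numpy as np

w = np.exp(2j * np.pi / 5)
C = np.array([[0, 1, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1],
              [-1, -1, -1, -1]], dtype=complex)

# Columns of Y are the eigenvectors of (5), ordered as y1, y2, y3, y4.
Y = np.column_stack([[1, w**k, w**(2 * k), w**(3 * k)] for k in (1, 4, 2, 3)])
P = np.array([[1, -1j, 0, 0],
              [1,  1j, 0, 0],
              [0, 0, 1, -1j],
              [0, 0, 1,  1j]], dtype=complex)

B = np.linalg.inv(P) @ np.linalg.inv(Y) @ C @ Y @ P
print(np.round(B.real, 10))   # two 2x2 rotation blocks, angles -2*pi/5 and -4*pi/5
```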
therefore, by proposition 3.3, as the matrix composed of the vectors generating the lattice l we can take

$$v = p^{-1} y^{-1} = \begin{pmatrix} 1-\cos\frac{2\pi}{5} & 0 & \cos\frac{4\pi}{5}-\cos\frac{2\pi}{5} & \cos\frac{4\pi}{5}-\cos\frac{2\pi}{5} \\ \sin\frac{2\pi}{5} & 2\sin\frac{2\pi}{5} & \sin\frac{4\pi}{5}+\sin\frac{2\pi}{5} & \sin\frac{2\pi}{5}-\sin\frac{4\pi}{5} \\ 1-\cos\frac{4\pi}{5} & 0 & \cos\frac{2\pi}{5}-\cos\frac{4\pi}{5} & \cos\frac{2\pi}{5}-\cos\frac{4\pi}{5} \\ \sin\frac{4\pi}{5} & 2\sin\frac{4\pi}{5} & \sin\frac{4\pi}{5}-\sin\frac{2\pi}{5} & \sin\frac{4\pi}{5}+\sin\frac{2\pi}{5} \end{pmatrix}$$

$$= \tfrac{1}{2}\begin{pmatrix} 2-\omega-\omega^4 & 0 & \omega^3+\omega^2-(\omega+\omega^4) & \omega^3+\omega^2-(\omega+\omega^4) \\ i(\omega^4-\omega) & 2i(\omega^4-\omega) & i(\omega^3-\omega^2+\omega^4-\omega) & i(\omega^2-\omega^3+\omega^4-\omega) \\ 2-\omega^2-\omega^3 & 0 & \omega+\omega^4-(\omega^2+\omega^3) & \omega+\omega^4-(\omega^2+\omega^3) \\ i(\omega^3-\omega^2) & 2i(\omega^3-\omega^2) & i(\omega-\omega^4+\omega^3-\omega^2) & i(\omega^4-\omega+\omega^3-\omega^2) \end{pmatrix}. \quad (9)$$

the lattice l is then of the form $l = z\, l_1 + z\, l_2 + z\, l_3 + z\, l_4$, where

$$l_1 = \begin{pmatrix} 1-\cos\frac{2\pi}{5} \\ \sin\frac{2\pi}{5} \\ 1-\cos\frac{4\pi}{5} \\ \sin\frac{4\pi}{5} \end{pmatrix},\ l_2 = \begin{pmatrix} 0 \\ 2\sin\frac{2\pi}{5} \\ 0 \\ 2\sin\frac{4\pi}{5} \end{pmatrix},\ l_3 = \begin{pmatrix} \cos\frac{4\pi}{5}-\cos\frac{2\pi}{5} \\ \sin\frac{4\pi}{5}+\sin\frac{2\pi}{5} \\ \cos\frac{2\pi}{5}-\cos\frac{4\pi}{5} \\ \sin\frac{4\pi}{5}-\sin\frac{2\pi}{5} \end{pmatrix},\ l_4 = \begin{pmatrix} \cos\frac{4\pi}{5}-\cos\frac{2\pi}{5} \\ \sin\frac{2\pi}{5}-\sin\frac{4\pi}{5} \\ \cos\frac{2\pi}{5}-\cos\frac{4\pi}{5} \\ \sin\frac{4\pi}{5}+\sin\frac{2\pi}{5} \end{pmatrix}. \quad (10)$$

relation (8) is thus the expression of the matrix c in the block diagonal form $v c v^{-1} = \begin{pmatrix} a & o \\ o & b \end{pmatrix}$, where the matrices

$$a = \begin{pmatrix} \cos\frac{2\pi}{5} & \sin\frac{2\pi}{5} \\ -\sin\frac{2\pi}{5} & \cos\frac{2\pi}{5} \end{pmatrix}, \qquad b = \begin{pmatrix} \cos\frac{4\pi}{5} & \sin\frac{4\pi}{5} \\ -\sin\frac{4\pi}{5} & \cos\frac{4\pi}{5} \end{pmatrix} \quad (11)$$

correspond to rotations by the angles −2π/5 and −4π/5, respectively.

notation 5.1. for further reference we denote by λ the cut-and-project scheme λ := (l, π1, π2), $r^2 \xleftarrow{\ \pi_1\ } l \subset r^4 \xrightarrow{\ \pi_2\ } r^2$, composed of the lattice l defined in (10), with the projections π1, π2 : r^4 → r^2 given as before in our formalism, i.e., π1(l) = (i2, o)l, π2(l) = (o, i2)l.

in order to understand completely the structure of cut-and-project sets defined by the cut-and-project scheme λ, let us apply the projections to the generating vectors l1, ..., l4. we obtain for the π1 projection

$$l_1^{\parallel} = \begin{pmatrix} 1-\cos\frac{2\pi}{5} \\ \sin\frac{2\pi}{5} \end{pmatrix},\quad l_2^{\parallel} = \begin{pmatrix} 0 \\ 2\sin\frac{2\pi}{5} \end{pmatrix},\quad l_3^{\parallel} = \begin{pmatrix} \cos\frac{4\pi}{5}-\cos\frac{2\pi}{5} \\ \sin\frac{4\pi}{5}+\sin\frac{2\pi}{5} \end{pmatrix},\quad l_4^{\parallel} = \begin{pmatrix} \cos\frac{4\pi}{5}-\cos\frac{2\pi}{5} \\ \sin\frac{2\pi}{5}-\sin\frac{4\pi}{5} \end{pmatrix} \quad (12)$$

and for the π2 projection

$$l_1^{\perp} = \begin{pmatrix} 1-\cos\frac{4\pi}{5} \\ \sin\frac{4\pi}{5} \end{pmatrix},\quad l_2^{\perp} = \begin{pmatrix} 0 \\ 2\sin\frac{4\pi}{5} \end{pmatrix},\quad l_3^{\perp} = \begin{pmatrix} \cos\frac{2\pi}{5}-\cos\frac{4\pi}{5} \\ \sin\frac{4\pi}{5}-\sin\frac{2\pi}{5} \end{pmatrix},\quad l_4^{\perp} = \begin{pmatrix} \cos\frac{2\pi}{5}-\cos\frac{4\pi}{5} \\ \sin\frac{4\pi}{5}+\sin\frac{2\pi}{5} \end{pmatrix}. \quad (13)$$

let us consider the projection π1. rewritten in another form, we have for the projected lattice vectors

$$l_1^{\parallel} = 2\sin\tfrac{\pi}{5}\begin{pmatrix} \sin\frac{\pi}{5} \\ \cos\frac{\pi}{5} \end{pmatrix},\quad l_2^{\parallel} = 2\sin\tfrac{2\pi}{5}\begin{pmatrix} 0 \\ 1 \end{pmatrix},\quad l_3^{\parallel} = 2\sin\tfrac{3\pi}{5}\begin{pmatrix} -\sin\frac{\pi}{5} \\ \cos\frac{\pi}{5} \end{pmatrix},\quad l_4^{\parallel} = 2\sin\tfrac{\pi}{5}\begin{pmatrix} -\sin\frac{3\pi}{5} \\ -\cos\frac{3\pi}{5} \end{pmatrix}.$$

in this form, it is easily seen how one can draw the vectors π1(lj) into the plane. [figure: the vectors l1∥, ..., l4∥ drawn from the origin, with consecutive vectors separated by the angle π/5.]

as the vectors li∥ together with the origin form the vertices of a regular pentagon, we can rewrite l2∥ = l4∥ + τ l1∥, l3∥ = l1∥ + τ l4∥, where τ is the golden ratio. these relations can be verified with the use of 2cos(2π/5) = τ^{-1}, 2cos(4π/5) = −τ. for any integers a, b, c, d ∈ z we have

$$a\, l_1^{\parallel} + b\, l_2^{\parallel} + c\, l_3^{\parallel} + d\, l_4^{\parallel} = (a + c + b\tau)\, l_1^{\parallel} + (b + d + c\tau)\, l_4^{\parallel},$$

and thus

$$\pi_1(l) = \{a\, l_1^{\parallel} + b\, l_2^{\parallel} + c\, l_3^{\parallel} + d\, l_4^{\parallel} : a, b, c, d \in z\} = z[\tau]\, l_1^{\parallel} + z[\tau]\, l_4^{\parallel}. \quad (14)$$

similarly, we have for the second projection [figure: the vectors l1⊥, ..., l4⊥ drawn from the origin, again separated by angles π/5], and we can derive that

$$\pi_2(l) = z[\tau]\, l_1^{\perp} + z[\tau]\, l_4^{\perp}. \quad (15)$$

we will now show that the constructed cut-and-project scheme is non-degenerate and irreducible, which is obligatory in order that it allow constructing cut-and-project sets. first we show non-degeneracy, i.e., that the projection π1 restricted to l is a one-to-one mapping.
lemma 5.2. let l be given by (10) and π1 : r^4 → r^2 by π1(l) = (i2, o)l. then π1 restricted to l is injective.

proof. in order to verify the injectivity of π1 restricted to l, it suffices to show that the preimage of the zero vector is 0. this amounts to showing that the vectors π1(li) = li∥, i = 1, ..., 4, are linearly independent over q. recalling (12), this can be verified with the use of the equality

$$\cos\tfrac{2\pi}{5} + \cos\tfrac{4\pi}{5} = -\tfrac{1}{2}, \quad (16)$$

which follows from the obvious relation $0 = \omega^4 + \omega + \omega^3 + \omega^2 + 1 = 2\cos\frac{2\pi}{5} + 2\cos\frac{4\pi}{5} + 1$.

it remains to show that the second projection of the lattice, π2(l), is dense in r².

lemma 5.3. let l be given by (10) and π2 : r^4 → r^2 by π2(l) = (o, i2)l. then π2(l) is dense in r².

proof. recall (15), where the vectors l1⊥, l4⊥ are linearly independent, and that, due to the irrationality of the golden ratio τ, the set z[τ] = z + zτ is dense in r. thus π2(l) is a cartesian product of two sets, each of them dense in the subspace it generates. whence, π2(l) is dense in r².

lemmas 5.2 and 5.3 can be summarized as follows.

corollary 5.4. the cut-and-project scheme λ = (l, π1, π2) defined above is non-degenerate and irreducible. if ω ⊂ r² is bounded and such that int(ω) ≠ ∅, then the cut-and-project set σ(ω) can be written as

$$\sigma(\omega) = \{(a+b\tau)\, l_1^{\parallel} + (c+d\tau)\, l_4^{\parallel} : a, b, c, d \in z,\ (a+b\tau')\, l_1^{\perp} + (c+d\tau')\, l_4^{\perp} \in \omega\},$$

where τ′ denotes the algebraic conjugate of τ. one can review the results of barache et al. [3] to see that our method leads to the same model as the classical method of defining decagonal cut-and-project sets based on coxeter groups, where one projects the crystallographic root system a4 to the non-crystallographic system h2.

6. self-similarities of the constructed scheme

let us now study, according to item (ii) of proposition 3.3, what other self-similarities are present in the cut-and-project scheme λ constructed in section 5. we thus need to find all integer matrices z which by the similarity transformation v z v^{-1} (with the matrix v defined in (9)) become block diagonal with blocks of size 2. this means that z has the same two invariant subspaces of dimension 2 as c. let us first consider those integer matrices z which have the same eigenvectors yi, i = 1, ..., 4, as c defined in (4). rewriting this requirement into a matrix equation for a general integer matrix z, we obtain

$$z\, y_1 = \begin{pmatrix} a & b & c & d \\ e & f & g & h \\ j & k & l & m \\ n & o & p & q \end{pmatrix}\begin{pmatrix} 1 \\ \omega \\ \omega^2 \\ \omega^3 \end{pmatrix} = \rho\begin{pmatrix} 1 \\ \omega \\ \omega^2 \\ \omega^3 \end{pmatrix}$$

for some ρ ∈ c. from the first row we obtain ρ = a + bω + cω² + dω³. using this, and the expression for ω⁴ in terms of lower powers of ω, namely ω⁴ = −1 − ω − ω² − ω³, we get from the remaining rows the following relations between the integer coefficients a, ..., q:

e = −d, f = a − d, g = b − d, h = c − d, j = d − c, k = −c, l = a − c, m = b − c, n = c − b, o = d − b, p = −b, q = a − b.

one can check that, not surprisingly, a matrix z satisfying such conditions, i.e.,

$$z = \begin{pmatrix} a & b & c & d \\ -d & a-d & b-d & c-d \\ d-c & -c & a-c & b-c \\ c-b & d-b & -b & a-b \end{pmatrix} =: z_{a,b,c,d} \quad (17)$$

is nothing else than the integer combination z = ai + bc + cc² + dc³ of the powers of the matrix c given in (4), namely

$$i = c^0 = \begin{pmatrix} 1&0&0&0\\0&1&0&0\\0&0&1&0\\0&0&0&1 \end{pmatrix},\ c = \begin{pmatrix} 0&1&0&0\\0&0&1&0\\0&0&0&1\\-1&-1&-1&-1 \end{pmatrix},\ c^2 = \begin{pmatrix} 0&0&1&0\\0&0&0&1\\-1&-1&-1&-1\\1&0&0&0 \end{pmatrix},\ c^3 = \begin{pmatrix} 0&0&0&1\\-1&-1&-1&-1\\1&0&0&0\\0&1&0&0 \end{pmatrix}.$$

note that we do not need higher than third powers of the matrix c, as by the cayley–hamilton theorem, $c^4 = -c^3 - c^2 - c - i$.
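a small numerical cross-check of (17) (again our own illustrative sketch, not part of the original text): the matrix z_{a,b,c,d} equals the combination ai + bc + cc² + dc³, and its eigenvalues are the galois conjugates σj(η) of η = a + bω + cω² + dω³.

```python
import numpy as np

omega = np.exp(2j * np.pi / 5)
c = np.array([[0, 1, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1],
              [-1, -1, -1, -1]], dtype=float)

def z_abcd(a, b, c2, d):
    """the matrix (17), built as a*i + b*c + c2*c^2 + d*c^3
    (c2 names the scalar coefficient, c is the companion matrix)."""
    return (a * np.eye(4) + b * c
            + c2 * np.linalg.matrix_power(c, 2)
            + d * np.linalg.matrix_power(c, 3))

a, b, c2, d = 2, -1, 0, 1
z = z_abcd(a, b, c2, d)

# eigenvalues of z are sigma_j(eta) = a + b w^j + c2 w^2j + d w^3j, j = 1..4
conjugates = [a + b * omega**j + c2 * omega**(2 * j) + d * omega**(3 * j)
              for j in (1, 2, 3, 4)]
key = lambda v: (round(v.real, 9), round(v.imag, 9))
assert np.allclose(sorted(np.linalg.eigvals(z.astype(complex)), key=key),
                   sorted(conjugates, key=key))
```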
note also that we do not need to use z yi = ρi yi for i = 2, 3, 4, since the result is obtainable by applying the galois automorphisms of the field q(ω), and it would not provide any new information. it follows that the matrix z of (17) can be diagonalized using the matrix y (of (6)) composed of the eigenvectors yi, and on the diagonal we find the numbers σj(a + bω + cω² + dω³). these are thus the eigenvalues of z.

corollary 6.1. the set zcom := {z_{a,b,c,d} : a, b, c, d ∈ z} with the standard matrix addition and multiplication is a commutative ring isomorphic to the ring z[ω] of cyclotomic integers.

in order to transform the matrix z into the real block diagonal form, we use the similarity transformation by the matrix p of (7). this yields $p^{-1} y^{-1} z\, y\, p = \begin{pmatrix} s & o \\ o & t \end{pmatrix}$, where

$$s = \begin{pmatrix} a + b\cos\frac{2\pi}{5} + (c+d)\cos\frac{4\pi}{5} & b\sin\frac{2\pi}{5} + (c-d)\sin\frac{4\pi}{5} \\ -b\sin\frac{2\pi}{5} + (d-c)\sin\frac{4\pi}{5} & a + b\cos\frac{2\pi}{5} + (c+d)\cos\frac{4\pi}{5} \end{pmatrix},$$

$$t = \begin{pmatrix} a + b\cos\frac{4\pi}{5} + (c+d)\cos\frac{2\pi}{5} & b\sin\frac{4\pi}{5} + (c-d)\sin\frac{2\pi}{5} \\ -b\sin\frac{4\pi}{5} + (d-c)\sin\frac{2\pi}{5} & a + b\cos\frac{4\pi}{5} + (c+d)\cos\frac{2\pi}{5} \end{pmatrix}.$$

the matrices s, t are of the form λr, where λ > 0 and r is an orthogonal matrix. indeed, denote by η the cyclotomic integer η = a + bω + cω² + dω³ and find its goniometric form η = |η|(cos ϕ + i sin ϕ). then, by construction, s satisfies

$$s = |\eta|\begin{pmatrix} \cos\varphi & -\sin\varphi \\ \sin\varphi & \cos\varphi \end{pmatrix}. \quad (18)$$

similarly,

$$t = |\nu|\begin{pmatrix} \cos\psi & -\sin\psi \\ \sin\psi & \cos\psi \end{pmatrix}, \quad (19)$$

where ν = σ3(η) = |ν|(cos ψ + i sin ψ). we thus see that our original assumption on the integer matrix z having the same eigenvectors as c leads to self-similarities s of the cut-and-project scheme in the form of scaled rotations.

as we will see, these are not the only self-similarities of the constructed cut-and-project scheme. in order to find all of the self-similarities, we relax the condition on the eigenvectors of z and require only that c and z have the same invariant subspaces of dimension 2. with this, we obtain the following proposition. in order to formulate the statement, define

$$\mathcal{z} := \Big\{ z \in z^{4\times 4} : \exists\, s, t \in r^{2\times 2},\ v z v^{-1} = \begin{pmatrix} s & o \\ o & t \end{pmatrix} \Big\}, \quad (20)$$

where v is the matrix defining the lattice l of the cut-and-project scheme λ.

proposition 6.2. for integers a, b, ..., h ∈ z denote

$$z_{a,b,\ldots,h} := \begin{pmatrix} a & b & c & d \\ e & f & g & h \\ -a-e+f-h & -b+g-h & -c-e+f & -d-e+g-h \\ a-b+d+e-f+h & -c+d-g+h & a-b+e-f & a-c+d+e-g+h \end{pmatrix}.$$

then $\mathcal{z} = \{z_{a,b,\ldots,h} : a, b, \ldots, h \in z\}$.

proof. recalling the definition of v in (9), we rewrite the requirement $\begin{pmatrix} s & o \\ o & t \end{pmatrix} = v z v^{-1}$, for some matrices s, t ∈ r^{2×2}, as

$$p\begin{pmatrix} s & o \\ o & t \end{pmatrix} p^{-1} = y^{-1} z\, y \iff y\, p\begin{pmatrix} s & o \\ o & t \end{pmatrix} p^{-1} = z\, y. \quad (21)$$

since also p and p^{-1} are block diagonal, the latter represents the requirement that the matrix z have two invariant subspaces, namely span{y1, y2} and span{y3, y4}. stated otherwise, we require the existence of complex coefficients µ, µ′, ν, ν′, ζ, ζ′, η, η′ such that

$$z y_1 = \mu y_1 + \nu y_2,\quad z y_2 = \mu' y_1 + \nu' y_2,\quad z y_3 = \zeta y_3 + \eta y_4,\quad z y_4 = \zeta' y_3 + \eta' y_4. \quad (22)$$

since $y_1 = \overline{y_2}$ and $y_3 = \overline{y_4}$, it follows that $\mu' = \bar\nu$, $\nu' = \bar\mu$, $\zeta' = \bar\eta$, $\eta' = \bar\zeta$. consider a general integer matrix z ∈ z^{4×4},

$$z = \begin{pmatrix} a & b & c & d \\ e & f & g & h \\ j & k & l & m \\ n & o & p & q \end{pmatrix}.$$

conditions (22) are conveniently rewritten as

$$\underbrace{\begin{pmatrix} a & b & c & d \\ e & f & g & h \end{pmatrix}}_{z_u}\begin{pmatrix} 1 \\ \omega \\ \omega^2 \\ \omega^3 \end{pmatrix} = \underbrace{\begin{pmatrix} 1 & 1 \\ \omega & \omega^4 \end{pmatrix}}_{y_u^{(1)}}\begin{pmatrix} \mu \\ \nu \end{pmatrix}, \qquad \underbrace{\begin{pmatrix} j & k & l & m \\ n & o & p & q \end{pmatrix}}_{z_d}\begin{pmatrix} 1 \\ \omega \\ \omega^2 \\ \omega^3 \end{pmatrix} = \underbrace{\begin{pmatrix} \omega^2 & \omega^3 \\ \omega^3 & \omega^2 \end{pmatrix}}_{y_d^{(1)}}\begin{pmatrix} \mu \\ \nu \end{pmatrix}. \quad (23)$$
excluding the parameters µ, ν, we obtain a relation between the coefficients of the matrices z_u and z_d, namely $y_d^{(1)} \big(y_u^{(1)}\big)^{-1} z_u\, y_1 = z_d\, y_1$:

$$\frac{1}{\omega^4-\omega}\begin{pmatrix} \omega^2 & \omega^3 \\ \omega^3 & \omega^2 \end{pmatrix}\begin{pmatrix} \omega^4 & -1 \\ -\omega & 1 \end{pmatrix}\begin{pmatrix} a & b & c & d \\ e & f & g & h \end{pmatrix}\begin{pmatrix} 1 \\ \omega \\ \omega^2 \\ \omega^3 \end{pmatrix} = \begin{pmatrix} j & k & l & m \\ n & o & p & q \end{pmatrix}\begin{pmatrix} 1 \\ \omega \\ \omega^2 \\ \omega^3 \end{pmatrix},$$

$$\begin{pmatrix} -1 & \omega+\omega^4 \\ -\omega-\omega^4 & -\omega-\omega^4 \end{pmatrix}\begin{pmatrix} a + b\omega + c\omega^2 + d\omega^3 \\ e + f\omega + g\omega^2 + h\omega^3 \end{pmatrix} = \begin{pmatrix} j + k\omega + l\omega^2 + m\omega^3 \\ n + o\omega + p\omega^2 + q\omega^3 \end{pmatrix},$$

$$\begin{pmatrix} -a-e+f-h + \omega(-b+g-h) + \omega^2(-c-e+f) + \omega^3(-d-e+g-h) \\ a-b+d+e-f+h + \omega(-c+d-g+h) + \omega^2(a-b+e-f) + \omega^3(a-c+d+e-g+h) \end{pmatrix} = \begin{pmatrix} j + k\omega + l\omega^2 + m\omega^3 \\ n + o\omega + p\omega^2 + q\omega^3 \end{pmatrix}.$$

since the entries are, as elements of the cyclotomic field q(ω), uniquely written as rational combinations of 1, ω, ω², ω³, we find the expressions for j, ..., q in terms of a, ..., h. the matrix z thus has only eight independent integer parameters,

$$z = \begin{pmatrix} a & b & c & d \\ e & f & g & h \\ -a-e+f-h & -b+g-h & -c-e+f & -d-e+g-h \\ a-b+d+e-f+h & -c+d-g+h & a-b+e-f & a-c+d+e-g+h \end{pmatrix}. \quad (24)$$

as µ, ν ∈ q(ω), the remaining relations in (22) are obtained from the first of them by application of the field automorphisms. therefore µ′ = σ4(µ), ν′ = σ4(ν), ζ = σ2(µ), η = σ2(ν), ζ′ = σ3(µ), η′ = σ3(ν).

remark 6.3. note that setting e = −d, f = a−d, g = b−d, h = c−d, we obtain the matrix (17). therefore the set zcom is a commutative z-subalgebra of the z-algebra z.

in order to describe the self-similarities of the module π1(l) corresponding to the matrices z_{a,...,h}, let us determine the values of µ, ν from the relations (23):

$$\begin{pmatrix} \mu \\ \nu \end{pmatrix} = \frac{1}{\omega^4-\omega}\begin{pmatrix} \omega^4 & -1 \\ -\omega & 1 \end{pmatrix}\begin{pmatrix} a & b & c & d \\ e & f & g & h \end{pmatrix}\begin{pmatrix} 1 \\ \omega \\ \omega^2 \\ \omega^3 \end{pmatrix} = \tfrac{1}{5}(2 + 4\omega + \omega^2 + 3\omega^3)\begin{pmatrix} b-e + \omega(c-f) + \omega^2(d-g) - h\omega^3 + a\omega^4 \\ e + \omega(f-a) + \omega^2(g-b) + \omega^3(h-c) - d\omega^4 \end{pmatrix},$$

µ = 1/5 (2a + 2b − 3c + 2d − 2e + 3f − 2g + 3h + ω(−a + 4b − c − d − 4e + f + g + h) + ω²(a + b + c + d − e − f − g + 4h) + ω³(−2a + 3b − 2c + 3d − 3e + 2f − 3g + 2h)),

ν = 1/5 (3a − 2b + 3c − 2d + 2e − 3f + 2g − 3h + ω(a + b + c + d + 4e − f − g − h) + ω²(−a − b + 4c − d + e + f + g − 4h) + ω³(2a − 3b + 2c + 2d + 3e − 2f + 3g − 2h)).

in the matrix formalism, (22) rewrites as

$$z\, y = y\begin{pmatrix} \mu & \bar\nu & 0 & 0 \\ \nu & \bar\mu & 0 & 0 \\ 0 & 0 & \zeta & \bar\eta \\ 0 & 0 & \eta & \bar\zeta \end{pmatrix}.$$

comparing to (21), we obtain the expression for s, t:

$$\begin{pmatrix} s & o \\ o & t \end{pmatrix} = p^{-1}\begin{pmatrix} \mu & \bar\nu & 0 & 0 \\ \nu & \bar\mu & 0 & 0 \\ 0 & 0 & \zeta & \bar\eta \\ 0 & 0 & \eta & \bar\zeta \end{pmatrix} p = \begin{pmatrix} \mathrm{re}(\mu+\nu) & \mathrm{im}(\mu+\nu) & 0 & 0 \\ \mathrm{im}(\nu-\mu) & \mathrm{re}(\mu-\nu) & 0 & 0 \\ 0 & 0 & \mathrm{re}(\zeta+\eta) & \mathrm{im}(\zeta+\eta) \\ 0 & 0 & \mathrm{im}(\eta-\zeta) & \mathrm{re}(\zeta-\eta) \end{pmatrix}, \quad (25)$$

where the coefficients are of the form

re(µ+ν) = a + b cos(2π/5) + (c+d) cos(4π/5),
im(µ+ν) = b sin(2π/5) + (c−d) sin(4π/5),
re(µ−ν) = −c + d + f + h + (2g−b) cos(2π/5) + (−c+d+2h) cos(4π/5),
im(ν−µ) = 1/5 ((2a − 3b + 2c + 2d + 8e − 2f − 2g − 2h) sin(2π/5) + (−6a + 4b − c − d − 4e + 6f − 4g − 4h) sin(4π/5)),
re(ζ+η) = a + (c+d) cos(2π/5) + b cos(4π/5),
im(ζ+η) = (d−c) sin(2π/5) + b sin(4π/5),
re(ζ−η) = −c + d + f + h + (−c+d+2h) cos(2π/5) + (2g−b) cos(4π/5),
im(η−ζ) = 1/5 ((6a − 4b + c + d + 4e − 6f + 4g + 4h) sin(2π/5) + (2a − 3b + 2c + 2d + 8e − 2f − 2g − 2h) sin(4π/5)).

remark 6.4. the matrices s of (25) of all linear self-similarities of the cut-and-project scheme λ = (l, π1, π2) of notation 5.1 form an 8-dimensional associative z-algebra.

7. self-similarities of cut-and-project sets with 5-fold symmetry

the requirement of preserving the invariant subspaces alone is not sufficient for a complete description of the linear mappings that are self-similarities of some cut-and-project set. in order that a matrix z give rise to a self-similarity of σ(ω) for some window ω, it is necessary to set conditions on the eigenvalues of the matrix b, as specified in proposition 4.2.
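this eigenvalue criterion is easy to test numerically. the sketch below is our own illustration (the function names are ours): for η = a + bω + cω² + dω³ ∈ z[ω], it checks whether the second block t, whose eigenvalues are the conjugates σ2(η), σ3(η), has modulus at most 1, which by the results of section 4 decides whether the corresponding scaled rotation fixes some window (t is diagonalizable here, being a scaled rotation).

```python
import numpy as np

omega = np.exp(2j * np.pi / 5)

def sigma(eta_coeffs, j):
    """galois conjugate sigma_j(eta) for eta = a + b w + c w^2 + d w^3."""
    a, b, c, d = eta_coeffs
    return a + b * omega**j + c * omega**(2 * j) + d * omega**(3 * j)

def admits_window(eta_coeffs, tol=1e-12):
    """true iff |sigma_2(eta)| <= 1, i.e. the eigenvalues of the block t
    are in modulus at most 1, so some window is fixed by the mapping."""
    return abs(sigma(eta_coeffs, 2)) <= 1 + tol

# eta = -(w^2 + w^3) = 2 cos(pi/5) = tau, the golden ratio, a quadratic
# pisot number: |sigma_1| = tau > 1 while |sigma_2| = 1/tau < 1
print(abs(sigma((0, 0, -1, -1), 1)), abs(sigma((0, 0, -1, -1), 2)))
print(admits_window((0, 0, -1, -1)))   # True
```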
we shall do that for the self-similarities given by matrices of the z-algebra zcom defined in corollary 6.1.

proposition 7.1. let z ∈ zcom and let s correspond to z by $v z v^{-1} = \begin{pmatrix} s & o \\ o & t \end{pmatrix}$. if s is a self-similarity of a cut-and-project set σ(ω), then there exists an algebraic integer η = a + bω + cω² + dω³ = |η|(cos ϕ + i sin ϕ) ∈ z[ω] such that s = |η|r, where r is the rotation by the angle ϕ. moreover, η is a pisot number, a complex pisot number, or a tenth root of unity.

proof. recall that a matrix z_{a,b,c,d} has as its eigenvalues the numbers σj(η), j = 1, 2, 3, 4, where η = a + bω + cω² + dω³. the product $\prod_{j=1}^4 \sigma_j(\eta)$ is equal to the determinant of the integer matrix z_{a,b,c,d}. the corresponding matrices s, t from (18), (19) have the same eigenvalues, namely σ1(η), σ4(η) for the matrix s and σ2(η), σ3(η) for the matrix t. recall that $\sigma_1(\eta) = \overline{\sigma_4(\eta)}$ and $\sigma_2(\eta) = \overline{\sigma_3(\eta)}$. if s is a self-similarity of a cut-and-project set σ(ω), then by proposition 4.2 the eigenvalues of t must be in modulus smaller than or equal to 1. as $\prod_{j=1}^4 \sigma_j(\eta) \in z$, we have $|\eta|^2 = \sigma_1(\eta)\sigma_4(\eta) \geq 1$.

assume first that |σ2(η)| = 1. then necessarily |σ2(η)σ3(η)| = 1, i.e., σ2(η) = σ3(η)^{-1}. the characteristic polynomial of the matrix z_{a,b,c,d} is therefore reciprocal and its roots are algebraic units lying on the unit circle. by the well-known result of kronecker, these must be roots of unity, but the only roots of unity lying in the field q(ω) are the tenth roots of unity.

secondly, let |σ2(η)| < 1. then |σ1(η)| > 1. in this case, either the characteristic polynomial of the matrix z_{a,b,c,d} is irreducible over q, and then η is a complex pisot number of degree 4; or, if it is reducible, then this is possible only if η ∈ r is of degree 2. in this case η = σ1(η) = σ4(η) and σ2(η) = σ3(η) is its algebraic conjugate. it follows that η is a quadratic pisot number.

the above proposition states that non-trivial scaled rotations correspond to complex pisot numbers in the ring z[ω] of cyclotomic integers. it is clear that any complex pisot number in z[ω] gives a scaled rotation of some cut-and-project set.

proposition 7.2. let η ∈ z[ω] be a complex pisot number, η = |η|(cos ϕ + i sin ϕ), and denote by s the mapping s = |η|r, where r is the rotation by the angle ϕ. then there exists a window ω ⊂ r² such that the mapping s is a self-similarity of the cut-and-project set σ(ω).

however, scaled rotations are not the only linear self-similarities possible. the following example shows a non-trivial linear map s for which we construct a cut-and-project set σ(ω) with self-similarity s. consider the matrix z_{a,...,h} ∈ z where we choose a = e = g = h = 0, b = −1, and c = d = f = 1, i.e.,

$$z = \begin{pmatrix} 0 & -1 & 1 & 1 \\ 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & -1 \\ 1 & 0 & 0 & 0 \end{pmatrix}.$$

the corresponding mappings s, t related to z by $v z v^{-1} = \begin{pmatrix} s & o \\ o & t \end{pmatrix}$ are of the form

$$s = \begin{pmatrix} -\cos\frac{2\pi}{5} + 2\cos\frac{4\pi}{5} & -\sin\frac{2\pi}{5} \\ \sin\frac{2\pi}{5} & \cos\frac{2\pi}{5} + 1 \end{pmatrix} = \begin{pmatrix} -\tau - \frac{1}{2\tau} & -\frac{1}{2}\sqrt{\tau^2+1} \\ \frac{1}{2}\sqrt{\tau^2+1} & \frac{1}{2\tau} + 1 \end{pmatrix},$$

$$t = \begin{pmatrix} 2\cos\frac{2\pi}{5} - \cos\frac{4\pi}{5} & -\sin\frac{4\pi}{5} \\ \sin\frac{4\pi}{5} & \cos\frac{4\pi}{5} + 1 \end{pmatrix} = \begin{pmatrix} \frac{1}{\tau} + \frac{\tau}{2} & -\frac{1}{2\tau}\sqrt{\tau^2+1} \\ \frac{1}{2\tau}\sqrt{\tau^2+1} & 1 - \frac{\tau}{2} \end{pmatrix}.$$

let us study the action of the linear map s. its eigenvalues are λ1 = −τ, λ2 = 1, corresponding to the eigenvectors

$$w_1 = \begin{pmatrix} -\sin\frac{2\pi}{5} \\ \cos\frac{2\pi}{5} \end{pmatrix}, \qquad w_2 = \begin{pmatrix} \cos\frac{2\pi}{5} \\ -\sin\frac{2\pi}{5} \end{pmatrix}.$$

the mapping s thus acts in the direction of w1 as a scaling by the factor −τ, and in the direction of w2 as the identity. figure 1 shows the action of s on the regular decagon v0, ..., v9 centered in the origin.
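the claimed block structure and eigenvalues of this example can again be checked numerically (our own illustrative sketch; the eigenvalues of t, 1/τ and 1, are discussed next in the text):

```python
import numpy as np

omega = np.exp(2j * np.pi / 5)
y = np.column_stack([[omega**(k * e) for k in range(4)] for e in (1, 4, 2, 3)]) / 5
p = np.array([[1, -1j, 0, 0], [1, 1j, 0, 0],
              [0, 0, 1, -1j], [0, 0, 1, 1j]], dtype=complex)
v = np.real(np.linalg.inv(p) @ np.linalg.inv(y))   # the lattice matrix (9)

# the example with a = e = g = h = 0, b = -1, c = d = f = 1
z = np.array([[0, -1, 1, 1],
              [0, 1, 0, 0],
              [1, 1, 0, -1],
              [1, 0, 0, 0]], dtype=float)

m = v @ z @ np.linalg.inv(v)
assert np.allclose(m[:2, 2:], 0) and np.allclose(m[2:, :2], 0)   # block diagonal

tau = (1 + np.sqrt(5)) / 2
s_eigs = np.sort(np.linalg.eigvals(m[:2, :2]).real)
t_eigs = np.sort(np.linalg.eigvals(m[2:, 2:]).real)
assert np.allclose(s_eigs, [-tau, 1.0])      # lambda_1 = -tau, lambda_2 = 1
assert np.allclose(t_eigs, [1 / tau, 1.0])   # eta_1 = 1/tau, eta_2 = 1
```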
let us study the action of the mapping t. its eigenvalues are η1 = 1/τ, η2 = 1, with the corresponding eigenvectors

$$z_1 = \begin{pmatrix} -\sin\frac{4\pi}{5} \\ \cos\frac{4\pi}{5} \end{pmatrix}, \qquad z_2 = \begin{pmatrix} \cos\frac{4\pi}{5} \\ -\sin\frac{4\pi}{5} \end{pmatrix}.$$

figure 1. the action of the mapping s on the regular decagon. its vertices, namely the points vi, i = 0, ..., 9, are marked with black dots; the points s vi are marked by larger grey dots.

since the eigenvalues of the matrix t are in modulus ≤ 1, by proposition 4.3 there exists a window ω such that the cut-and-project set σ(ω) has the self-similarity s. the window ω must be chosen such that it is invariant under the action of t. obviously, we could choose for example a parallelogram ω = {α1 z1 + α2 z2 : αi ∈ [−1, 1]}. for illustration, let us find a window according to the construction presented in the proof of proposition 4.3, namely with the use of a special inner product. it can be easily shown that a matrix g transferring the eigenvectors zi of t to the vectors of the standard basis is of the form

$$g = \frac{1}{\cos\frac{2\pi}{5}}\begin{pmatrix} \sin\frac{4\pi}{5} & \cos\frac{4\pi}{5} \\ \cos\frac{4\pi}{5} & \sin\frac{4\pi}{5} \end{pmatrix}.$$

it is a symmetric real matrix, and thus g = g*. then the matrix h = g*g = g² determining the inner product is given by

$$h = \frac{1}{\cos^2\frac{2\pi}{5}}\begin{pmatrix} 1 & -\sin\frac{2\pi}{5} \\ -\sin\frac{2\pi}{5} & 1 \end{pmatrix}.$$

for the window ω we can choose a ball in the metric induced by the new inner product, namely ω = {x ∈ r² : ⟨x, x⟩h ≤ const.}. in particular, we can take

$$\omega = \Big\{\begin{pmatrix} x \\ y \end{pmatrix} \in r^2 : x^2 - 2xy\sin\tfrac{2\pi}{5} + y^2 \leq 1\Big\}.$$

such a window ω satisfies tω ⊂ ω and thus sσ(ω) ⊂ σ(ω). figure 2 shows the window ω and its transformation by t.

figure 2. the acceptance window ω is marked by a full line; the action of the mapping t on ω is marked by a dashed line.

8. comments

in this article we have studied affine self-similarities of quasicrystal models obtained by the cut-and-project method. it is a first step towards solving the general question: given a linear map a : r^n → r^n, under which conditions does there exist a cut-and-project scheme that admits a cut-and-project set σ(ω) such that aσ(ω) ⊂ σ(ω)? how can one find such a cut-and-project scheme? which other linear self-similarities does such a cut-and-project set have? many problems that concern these questions remain, however, unsolved. for example, what are the conditions on the linear map a, or the corresponding integer matrix c, so that the constructed cut-and-project scheme with self-similarity a is non-degenerate and irreducible? a second important question is about the linear maps a that admit a window ω such that a cut-and-project set σ(ω) satisfies aσ(ω) ⊂ σ(ω). the answer to such a question could be viewed as a generalization of lagarias' result of [11]. one can also ask a further question, namely: given a cut-and-project set σ(ω), what are its possible self-similarities? the answer to such a question, however, heavily depends on the form of the acceptance window ω. some research in this direction has been done for pentagonal quasicrystals in [4] and [13, 14], all of it, however, only for scaling symmetries.

9. acknowledgements

this work was supported by the czech science foundation, grant no. 13-03538s. we also acknowledge the financial support of the grant agency of the czech technical university in prague, grant no. sgs17/193/ohk4/3t/14.

references
[1] michael baake and uwe grimm, aperiodic order. vol. 1: a mathematical invitation, encyclopedia of mathematics and its applications, vol. 149, cambridge university press, cambridge, 2013, with a foreword by roger penrose. mr 3136260
[2] michael baake and robert v. moody, self-similarities and invariant densities for model sets, algebraic methods in physics (montréal, qc, 1997), crm ser. math. phys., springer, new york, 2001, pp. 1–15. mr 1847245
[3] damien barache, bernard champagne, and jean-pierre gazeau, pisot-cyclotomic quasilattices and their symmetry semigroups, quasicrystals and discrete geometry (toronto, on, 1995), fields inst. monogr., vol. 10, amer. math. soc., providence, ri, 1998, pp. 15–66. mr 1636775
[4] stephen berman and robert v. moody, the algebraic theory of quasicrystals with five-fold symmetries, j. phys. a 27 (1994), no. 1, 115–129. mr 1288000
[5] ludwig bieberbach, über die bewegungsgruppen der euklidischen räume, math. ann. 70 (1911), no. 3, 297–336. mr 1511623
[6] harald bohr, zur theorie der fastperiodischen funktionen, acta math. 45 (1924), 29–127.
[7] nicolae cotfas, on the self-similarities of a model set, j. phys. a 32 (1999), no. 15, l165–l168. mr 1685699
[8] nicolaas g. de bruijn, algebraic theory of penrose's nonperiodic tilings of the plane. i, ii, nederl. akad. wetensch. indag. math. 43 (1981), no. 1, 39–52, 53–66. mr 609465
[9] anatole katok and boris hasselblatt, introduction to the modern theory of dynamical systems, encyclopedia of mathematics and its applications, vol. 54, cambridge university press, cambridge, 1995, with a supplementary chapter by katok and leonardo mendoza. mr 1326374
[10] peter kramer and r. neri, on periodic and nonperiodic space fillings of e^m obtained by projection, acta cryst. sect. a 40 (1984), no. 5, 580–587. mr 768042
[11] jeffrey c. lagarias, geometric models for quasicrystals i. delone sets of finite type, discrete comput. geom. 21 (1999), no. 2, 161–191. mr 1668082
[12] jeffrey c. lagarias and peter a. b. pleasants, repetitive delone sets and quasicrystals, ergodic theory dynam. systems 23 (2003), no. 3, 831–867. mr 1992666
[13] zuzana masáková, jiří patera, and edita pelantová, inflation centres of the cut and project quasicrystals, j. phys. a 31 (1998), no. 5, 1443–1453. mr 1628499
[14] zuzana masáková, jiří patera, and edita pelantová, self-similar delone sets and quasicrystals, j. phys. a 31 (1998), no. 21, 4927–4946. mr 1630499
[15] yves meyer, algebraic numbers and harmonic analysis, north-holland mathematical library, vol. 2, north-holland publishing co., amsterdam-london; american elsevier publishing co., inc., new york, 1972. mr 0485769
[16] robert v. moody, model sets: a survey, from quasiperiodic to more complex systems (les houches, 1998), springer, berlin, 2000, pp. 145–166.
[17] robert penrose, pentaplexity: a class of nonperiodic tilings of the plane, math. intelligencer 2 (1979/80), no. 1, 32–37. mr 558670
[18] dan shechtman, ilan a. blech, denis gratias, and john w. cahn, metallic phase with long-range orientational order and no translational symmetry, phys. rev. lett. 53 (1984), no. 20, 1951–1954.
[19] walter steurer, twenty years of structure research on quasicrystals. i. pentagonal, octagonal, decagonal and dodecagonal quasicrystals, z. krist. 219 (2004), no. 7, 391–446. mr 2082031
acta polytechnica 57(6):430–445, 2017

fictitious domain method for numerical simulation of incompressible viscous flow around rigid bodies

acta polytechnica 57(4):245–251, 2017, doi:10.14311/ap.2017.57.0245

matej beňo (corresponding author: matej.beno@fsv.cvut.cz), bořek patzák
czech technical university in prague, faculty of civil engineering, department of mechanics, thákurova 7, 166 29 prague, czech republic

abstract. this article describes a method for the efficient simulation of flow around potentially many rigid obstacles. the finite element implementation is based on the incompressible navier-stokes equations, using a structured, regular, two-dimensional triangular mesh. the fictitious domain method is introduced to account for the presence of rigid particles, representing obstacles to the flow. to enforce the rigid body constraints in the parts corresponding to rigid obstacles, lagrange multipliers are used. for the time discretization, an operator splitting technique is used. the model is validated using 2d channel flow simulations with circular obstacles. different possibilities of enforcing the rigid body constraints are compared to fully resolved simulations, and an optimal strategy is recommended.

keywords: finite element method; computational fluid dynamics; fictitious domain method; lagrange multipliers; flow around obstacles.

1. introduction

the motivation of the paper comes from the modeling of fresh concrete casting. simulation models of fresh concrete aiming at structural-scale applications typically consider the concrete suspension as a homogeneous, non-newtonian fluid, whose rheological properties can be derived from the mix composition [1]. the flow can then be efficiently described by the navier-stokes equations for a non-newtonian fluid and solved using the finite element method (fem), for example. of course, the efficiency of the homogeneous approach comes at the expense of a coarser description of the flow, where sub-scale phenomena can only be approximately accounted for by post-processing simulation results, e.g. to determine the distribution and orientation of reinforcing fibers [2], or by a heuristic modification of constitutive parameters, e.g. to account for the effect of traditional reinforcement [3]. especially the latter aspect is critical in the modeling of casting processes in highly-reinforced structures, which represent the major field of application for self-compacting concrete. the explicit representation of individual reinforcing bars in the fe model would lead to extremely fine computational meshes and would result in extremely high computational demands in terms of resources and time.
this article aims at reducing these computational costs by avoiding the need for an explicit representation of individual reinforcing bars, adopting an approach based on the fictitious domain method to solve the problem of newtonian incompressible flow with rigid obstacles on a regular, structured computational grid, where the individual reinforcing bars can be inserted arbitrarily and independently of the underlying mesh. the individual bars are accounted for by enforcing no-flow constraints in the volume occupied by the bar using lagrange multipliers. to solve the incompressible navier-stokes problem, a 2d eulerian formulation of the finite element method is developed, using the "taylor-hood" p2/p1 elements, which are quadratic in velocity and linear in pressure, satisfying the lbb condition [4]. the time discretization is based on an operator splitting approach; in particular, the fractional-time-step scheme described by marchuk has been employed [5].

the fundamental idea of the fictitious domain method consists in extending the problem to one geometrically simpler domain covering both the fluid and the obstacles, the so-called fictitious domain. the no-flow constraints in the subdomains corresponding to individual bars are enforced using distributed lagrange multipliers, which represent the additional body forces needed to enforce zero flow inside the regions representing obstacles. the pressure is constrained by the incompressibility of the fluid in the fictitious domain. this so-called distributed lagrange multiplier fictitious domain method was introduced by glowinski et al. [6]. the method has then been used by bertrand et al. [7] and tanguy et al. [8] to calculate three-dimensional flows. recently, the method has been extended by baaijens [9] and yu [10] to handle fluid-structure interactions. the main advantage of this method is a much simpler generation of the computational grid, because the mesh does not need to conform to the geometry of the reinforcement. it can be fully structured and regular, allowing one to take advantage of specialized, fast numerical solvers. this feature can significantly reduce the computational requirements of the problem.

to the authors' knowledge, there has been no study published investigating the influence of the sampling point selection on the quality of the solution. the goal of this paper is to propose and evaluate different strategies for enforcing the no-flow constraints using different sets of sampling points, and to compare their performance to fully resolved simulations.

figure 1. original and fictitious domain problems.

1.1. fictitious domain method

the basic idea of the method is based on a cancellation of the forces and moments between the obstacles and the fluid in the combined weak formulation for particle-fluid motion, solved everywhere in the fictitious domain. to enforce the no-flow constraints (sometimes referred to as rigid-body constraints when moving rigid particles are considered) to account for rigid particles, lagrange multipliers are introduced. these multipliers represent the additional body forces needed to maintain the rigid body motion inside the moving particles, or the no-flow constraint inside the regions representing fixed particles. as already mentioned, the advantage of this approach consists in the ability to use geometrically simpler, regular meshes.
2. problem formulation

2.1. the governing equation of the flow

assume an incompressible, viscous, newtonian fluid occupying at a given time t ∈ (0, t) the delimited domain ω ⊂ r² with boundary γ. let us denote by x = {x_i}_{i=1}^2 a generic point in ω. let us further denote by u(x, t) the velocity and by p(x, t) the pressure, both governed by the navier-stokes equations

$$\rho \frac{du}{dt} = \rho g + \nabla\cdot\sigma \quad \text{on } \omega\setminus p, \quad (1)$$
$$\nabla\cdot u = 0 \quad \text{on } \omega\setminus p, \quad (2)$$

where ρ is the density of the fluid, u is the velocity of the fluid, and σ is the stress tensor. for an incompressible, newtonian viscous fluid, the stress can be decomposed into hydrostatic and deviatoric components,

$$\sigma = -p\, i + 2\eta\, d[u], \quad (3)$$

where p is the hydrostatic pressure in the fluid, η is the viscosity (assumed constant), and 2ηd[u] is the deviatoric stress tensor. relations (1)–(3) are to be complemented by the appropriate initial and boundary conditions:

$$u(x, 0) = u_0 \quad \text{on } \omega\setminus p, \quad (4)$$
$$\nabla\cdot u_0 = 0, \quad (5)$$
$$u(x, t) = u_\gamma(t) \quad \text{on } \gamma, \quad (6)$$
$$\int_\gamma u_\gamma(t)\cdot\hat n\, dx = 0, \quad (7)$$

where n̂ is the unit normal vector pointing out of γ.

2.2. governing equations for obstacles

the hydrodynamic force fi and torque ti acting on the i-th particle can be evaluated by summing up the corresponding fluid-induced forces acting on the particle boundary:

$$f_i = \int_{\partial p_i} \sigma\hat n\, ds, \qquad t_i = \int_{\partial p_i} r_i \times \sigma\hat n\, ds, \quad (8)$$

where pi is the region occupied by the particle and ri = x − xi is the relative position vector to the particle center xi.

3. weak form

let us introduce the spaces for the test and trial functions for velocities and pressure in the fluid part of the domain ω \ p, without obstacles:

w = {u ∈ h¹(ω \ p) : u = uγ(t) on γ}, v = {v ∈ h¹(ω \ p) : v = 0 on γ}, l = {p ∈ l²(ω \ p)}, k = {q ∈ l²(ω \ p)}.

by using the method of weighted residuals applied to (1), (2) and using finite-dimensional spaces w_{0,h}, v_{0,h}, l_{0,h}, k_{0,h} approximating the spaces w, v, l, k defined above, we arrive at the following finite-element approximation of the navier-stokes equations: find u_h ∈ w_{0,h}, p_h ∈ l_{0,h} satisfying

$$\int_{\omega\setminus p} \rho\Big(\frac{\partial u_h}{\partial t} + (u_h\cdot\nabla)u_h\Big)\cdot v_h\, dx - \int_{\omega\setminus p} p_h \nabla\cdot v_h\, dx + \int_{\omega\setminus p} 2\eta\, d[u_h] : d[v_h]\, dx = 0 \quad \text{for all } v_h \in v_{0,h}, \quad (9)$$
$$\int_{\omega\setminus p} q_h \nabla\cdot u_h\, dx = 0 \quad \text{for all } q_h \in l_{0,h}, \quad (10)$$
$$u_h(0) = u_{0,h} \quad \text{on } \omega, \quad (11)$$

where u_{0,h} is a divergence-free initial velocity. since in (9) u is divergence-free and satisfies the dirichlet boundary condition (4) on γ, we can write

$$\int_{\omega\setminus p} 2\eta\, d[u_h] : d[v_h]\, dx = \int_{\omega\setminus p} \eta\, \nabla u_h : \nabla v_h\, dx \quad \text{for all } v_h \in v_{0,h}.$$

3.1. a fictitious domain formulation

to extend the problem from the domain ω \ p to the fictitious domain ω, it is necessary to enforce no-flow constraints inside each pi. let us introduce the spaces for the test and trial velocities and pressure on the fictitious domain ω:

w = {u ∈ h¹(ω) : u = uγ(t) on γ}, v = {v ∈ h¹(ω) : v = 0 on γ}, l = {p ∈ l²(ω)}, k = {q ∈ l²(ω)}.

by extending the incompressibility condition to the whole ω, we get

$$\int_\omega q\, \nabla\cdot u\, dx = 0 \quad \text{for all } q \in l^2_h(\omega).$$

the condition enforcing the rigid body (or no-flow) constraint for each obstacle can be expressed as

$$u(x, t) - u(x_i, t) = 0 \quad \text{for all } x \in p_i. \quad (12)$$

to relax the no-flow constraint defined above, a family of lagrange multipliers is introduced, representing a discrete set of points covering each obstacle, such that

$$\lambda_h = \Big\{\mu_h : \mu_h = \sum_{i=1}^m \mu_i\, \delta(x - x_i),\ \mu_1, \ldots, \mu_m \in r^2\Big\},$$

where δ is the dirac delta function at x = 0. using the space λh defined above, the condition (12) is relaxed to

$$\langle \mu_h, u(x, t) - u(x_i, t)\rangle = 0 \quad \text{for all } \mu_h \in \lambda_h, \quad (13)$$

where the inner product on the obstacle is defined as

$$\langle \mu_h, v_h\rangle_p = \sum_{i=1}^m \mu_{h,i} \cdot v_h(x_i).$$

in the case of a fixed particle, it is necessary to prescribe zero velocity at the particle center. in the case of a moving particle, the equations of motion for the particle have to be solved as well.
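to make the point-wise constraint (13) concrete: in the discrete saddle-point system, each sampling point xi contributes two constraint rows that simply evaluate the velocity interpolation at xi. the following is a minimal sketch under our own simplifying assumptions (linear triangular shape functions instead of the quadratic taylor-hood velocities used in the paper, and our own naming); it is an illustration, not the oofem implementation:

```python
import numpy as np

def p1_shape_values(tri_xy, x):
    """values of the linear (p1) shape functions of a triangle at point x;
    the paper uses quadratic (p2) velocities, p1 keeps the sketch short."""
    a = np.ones((3, 3))
    a[1, :] = tri_xy[:, 0]     # vertex x-coordinates
    a[2, :] = tri_xy[:, 1]     # vertex y-coordinates
    return np.linalg.solve(a, np.array([1.0, x[0], x[1]]))

def constraint_rows(tri_xy, conn, n_nodes, x_i):
    """two rows of a constraint matrix b enforcing u_h(x_i) = 0 for nodal
    velocities stored as (u_x1, u_y1, u_x2, u_y2, ...); the lagrange
    multiplier mu_i is the pair of point forces dual to these rows."""
    n = p1_shape_values(tri_xy, x_i)
    rows = np.zeros((2, 2 * n_nodes))
    for local, node in enumerate(conn):
        rows[0, 2 * node] = n[local]       # x-velocity at the sampling point
        rows[1, 2 * node + 1] = n[local]   # y-velocity at the sampling point
    return rows

# one sampling point inside a single triangle with global nodes 0, 1, 2
tri = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
b = constraint_rows(tri, (0, 1, 2), n_nodes=3, x_i=np.array([0.25, 0.25]))
```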
the combined weak formulation is then: find u_h ∈ w_{0,h} and p_h ∈ l_{0,h}, where the finite-dimensional spaces w_{0,h}, l_{0,h} approximate the spaces w, l defined above, and λ_h ∈ λ_h, such that

$$\int_\omega \rho\Big(\frac{\partial u_h}{\partial t} + (u_h\cdot\nabla)u_h\Big)\cdot v_h\, dx - \int_\omega p_h \nabla\cdot v_h\, dx + \int_\omega 2\eta\, d[u_h] : d[v_h]\, dx = \langle \lambda_h, v_h\rangle_p \quad \text{for all } v_h \in v_h,$$
$$\int_\omega q_h \nabla\cdot u_h\, dx = 0 \quad \text{for all } q_h \in l^2_h,$$
$$\langle \mu_h, u_h\rangle_p = 0 \quad \text{for all } \mu_h \in \lambda_h,$$
$$u_h(0) = u_0 \quad \text{on } \omega.$$

3.2. time discretization by operator splitting

as pointed out by glowinski [11], numerical solutions of the nonlinear navier-stokes partial differential equations are not trivial, mainly for the following reasons: (i) the above equations are nonlinear, and (ii) the incompressibility condition is difficult to handle. the above equations represent a system of partial differential equations coupled through the nonlinear term (u·∇)u, the incompressibility condition ∇·u = 0, and sometimes through the boundary conditions. the use of time discretization by operator splitting partly overcomes the above difficulties; in particular, it decouples the difficulties associated with the nonlinearity from those associated with the incompressibility condition. following the work of glowinski and pironneau [12], assume the initial value problem

$$\frac{d\varphi}{dt} + a(\varphi) = 0, \qquad \varphi(0) = \varphi_0,$$

where a is an operator (possibly nonlinear, and even multivalued) from a hilbert space h into itself and where ϕ0 ∈ h. a number of splitting techniques have been proposed; for an overview, we refer to [5]. in this work, we have used the scheme proposed by marchuk [5], which is described in the next section.

3.3. marchuk's fractional-step scheme

marchuk assumed a decomposition of the operator a into the following nontrivial decomposition:

$$a = a_1 + a_2 + a_3 \quad (14)$$

(by nontrivial, we mean that the operators a1, a2 and a3 are individually simpler than a). assuming a time discretization with the time step ∆t, the updated value at the end of the time step can be computed using a three-step integration procedure as follows:

$$\frac{\varphi^{n+1/3} - \varphi^n}{\Delta t} + a_1(\varphi^{n+1/3}) = f_1^{n+1},$$
$$\frac{\varphi^{n+2/3} - \varphi^{n+1/3}}{\Delta t} + a_2(\varphi^{n+2/3}) = f_2^{n+1},$$
$$\frac{\varphi^{n+1} - \varphi^{n+2/3}}{\Delta t} + a_3(\varphi^{n+1}) = f_3^{n+1}.$$
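the three-step scheme is easy to exercise on a toy linear problem. the sketch below (our own illustration, not the navier-stokes solver of the paper) applies the marchuk splitting with implicit fractional steps and zero right-hand sides f1 = f2 = f3 = 0 to dϕ/dt + (a1 + a2 + a3)ϕ = 0:

```python
import numpy as np

# a simple linear decomposition a = a1 + a2 + a3
a1 = np.diag([0.5, 1.0])
a2 = np.diag([0.3, 0.2])
a3 = np.diag([0.1, 0.4])
eye = np.eye(2)

dt = 0.001
phi = np.array([1.0, 1.0])
for step in range(1000):                          # integrate to t = 1
    phi = np.linalg.solve(eye + dt * a1, phi)     # phi^{n+1/3}
    phi = np.linalg.solve(eye + dt * a2, phi)     # phi^{n+2/3}
    phi = np.linalg.solve(eye + dt * a3, phi)     # phi^{n+1}

exact = np.exp([-0.9, -1.6])                      # exp(-(a1+a2+a3) t) phi_0
print(phi, exact)   # agreement up to the first-order splitting error
```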
in this paper, we have proposed and tested several strategies to enforce constrains by combining one or more lagrange multipliers per particle and using different strategies for the sampling point selection. for the considered figure 3. the problem of 2d flow around single obstacle. figure 4. the mesh of 2d flow with single obstacle. circular geometry of particle, the proposed strategies consist of generating the different patterns of sampling points in radial and longitudinal directions, see figure 2. these results of these strategies are compared using two test scenarios to results obtained with fully resolved simulations, where the geometry of each individual particle is represented exactly by the mesh and corresponding no-flow constraints are exactly satisfied. 5. numerical tests 5.1. flow around single obstacle the first example to evaluate proposed strategies solves the 2d flow around a single, rigid particle. the geometry of the problem is depicted in figure 3, together with initial and boundary conditions. the perfect friction (no slip) boundary condition has been assumed on horizontal edges. the domain has been discretized into 1820 triangular taylor-hood elements, see figure 4 top. the values of mass density % = 1.0 kg/m3, viscosity η = 10−1 pa s, and time step ∆t = 0.001 s have been used in this study. the described model has been implemented into 248 vol. 57 no. 4/2017 fictitious domain method for a numerical simulation figure 5. velocity profiles through obstacle. figure 6. the geometry and the boundary conditions of the flow in tube with obstacle (448 el.). an object oriented fe code oofem [13]. the different strategies for a selection of sampling points have been evaluated and compared to results obtained by fully resolved simulation (see figure 4 bottom for discretization, consisting of 1656 elements and 3462 nodes). the comparison with fully resolved simulation has been made at a steady state. the individual strategies considered are described in table 1. the figure 5 shows the profiles of velocity (horizontal component) in a vertical section passing through the particle center. table 1 also contains the l2 norm of the difference of the obtained velocity and velocity from fully resolved simulation along the vertical profile passing the center of the particle. the obtained results indicate that an optimum strategy for sampling point selection is to have multiple rings with a dedicated multiplier for each ring. the best result (in terms of smaller difference) has been obtained for strategy 05 consisting of 12 rings and 12 multipliers for each ring. however, it is necessary to balance the computational effort (related to number of rings and number of multipliers) and obtained error. from this point of view, the optimal seems to be the strategy 03 with 4 rings. 5.2. flow around two obstacles the second example represents a flow in a tube with two particles, as illustrated in figure 6. in this test, three different discretizations of the fictitious domain have been used with an increasing mesh density, containing 448, 1792, and 7168 elements. the perfect friction on both horizontal edges (no slip condition) has been assumed. the results are again compared by means of comparing the velocity profiles along the 249 matej beňo, bořek patzák acta polytechnica no. 
5.2. flow around two obstacles

the second example represents a flow in a tube with two particles, as illustrated in figure 6. in this test, three different discretizations of the fictitious domain have been used, with increasing mesh density, containing 448, 1792, and 7168 elements. the perfect friction (no-slip condition) on both horizontal edges has been assumed. the results are again compared by means of the velocity profiles along the vertical section indicated in figure 6. four (eight, twenty-four) lagrange multipliers per obstacle have been used. figure 7 shows the velocity profiles obtained using the fictitious domain method and using the fully resolved simulation. the obtained results show excellent agreement with the results obtained by the fully resolved simulation.

figure 7. vertical profiles of velocity passing through the centers of both particles.

6. conclusion

the presented paper deals with the modelling of incompressible flow around fixed (or moving, rigid) particles using the concept of the fictitious domain method, and demonstrates the capability of this method to describe flow with solid obstacles. the paper demonstrates that the accuracy of the method depends on the strategy for choosing the sampling points, in addition to traditional aspects such as time stepping and integration or space discretization. based on the simulations performed using different strategies, a uniform distribution of sampling points is strongly recommended, and each element should contain at least 1 sampling point. although the obtained results are problem and grid dependent, they demonstrate the clear dependence of the quality of the solution on the sampling point selection.

acknowledgements

the authors would like to acknowledge the support of the czech science foundation under the project 13-23584s.

references

[1] h. mori, y. tanigawa. simulation methods for fluidity of fresh concrete. memoirs of the school of engineering. 1992, vol. 1, 44.
[2] f. kolařík, b. patzák, l.n. thrane. modeling of fiber orientation in viscous fluid flow with application to self-compacting concrete. computers & structures. 2015, vol. 154, pp. 91-100.
[3] k. vassilic, b. meng, h.c. kühne, n. roussel. flow of fresh concrete through steel bars: a porous medium analogy. cement and concrete research. may 2011, vol. 41, 5, pp. 496-503.
[4] babuška, i. the finite element method with lagrangian multipliers. numerische mathematik. 20, pp. 179-192.
[5] marchuk, g.i. splitting and alternating direction methods. in: p.g. ciarlet, j.l. lions (eds.), handbook of numerical analysis. amsterdam, north-holland, 1990, vol. 1, pp. 197-461.
[6] r. glowinski, t.-w. pan, t.i. hesla, d.d. joseph. a distributed lagrange multiplier/fictitious domain method for particulate flows. international journal of multiphase flow. 1999, vol. 25, pp. 755-794.
[7] f. bertrand, p.a. tanguy, f. thibault. a three-dimensional fictitious domain method for incompressible fluid flow problems. int. j. numer. meth. fluids. 1997, vol. 25, pp. 719-736.
[8] p.a. tanguy, f. bertrand, r. labie, e. brito-de la fuente. numerical modelling of the mixing of viscoplastic slurries in a twin-blade planetary mixer. trans. icheme. 1996, vol. 74 (part a), pp. 499-504.
[9] baaijens, f.p.t. a fictitious domain/mortar element method for fluid-structure interaction. int. j. numer. meth. fluids. 2001, vol. 35, pp. 743-761.
[10] yu, z. a dlm/fd method for fluid/flexible-body interactions. j. comput. phys. 2005, vol. 207, pp. 1-27.
[11] glowinski, r. finite element methods for incompressible viscous flow. in: p.g. ciarlet, j.l. lions (eds.), handbook of numerical analysis. amsterdam, north-holland, 2003, vol. 9, pp. 3-76.
[12] r. glowinski, o. pironneau. finite element methods for navier-stokes equations. annual reviews fluid mech. 1992, vol. 24, pp. 167-204.
[13] patzák, b. oofem: an object-oriented simulation tool for advanced modeling of materials and structures. acta polytechnica. 2012, vol. 52, 6, pp. 59-66.

acta polytechnica 57(4):245–251, 2017

effect of fibre aspect ratio and fibre volume fraction on the effective fracture energy of ultra-high-performance fibre-reinforced concrete

acta polytechnica 56(4):319–327, 2016, doi:10.14311/ap.2016.56.0319

radoslav sovják (corresponding author: sovjak@fsv.cvut.cz), petr máca, tomáš imlauf
experimental centre, faculty of civil engineering, czech technical university in prague, thákurova 7, 166 29 prague 6, czech republic

abstract. this paper investigates the effective fracture energy of uhpfrc with various fibre volume fractions and various fibre aspect ratios. we have concluded that the effective fracture energy is dependent on both the fibre volume fraction and the fibre aspect ratio. in addition, we have found that both dependencies follow a linear trend.

keywords: effective fracture energy; uhpfrc; straight steel micro-fibres; fibre volume fraction; fibre aspect ratio; simplified predictive model.

1. introduction

ultra-high-performance fibre-reinforced concrete (uhpfrc) is an advanced cementitious composite with enhanced mechanical and durability properties that outperforms conventionally used concretes in many ways. such a material, with certain properties and specifications, is well suited for energy-absorbing facade panels and key elements of building structures that may be exposed to impacts [1–3] or blast loads [4–7]. in the event of such a disaster, a large deformation of the structural member is expected, while the exposed member is required to continue to possess some residual capacity to carry the load. it can be stated that the resistance of civil infrastructure is strongly related to the energy absorption capacity. the capacity of a member to absorb energy can be quantified via the effective fracture energy, which represents the overall energy that a material can absorb per square meter. many researchers have demonstrated the considerably higher energy absorption capacity of uhpfrc (as indicated by the fracture energy) compared to conventionally used fibre-reinforced concretes or normal-strength concretes [8–11].
the energy absorption capacity is the main material property that benefits from fibre reinforcement. the effective fracture energy (gf) is a key parameter for evaluating the ability of a material to withstand an impact or blast load and also to redistribute the load from the exposed structure to its surrounding parts. the aim of this study is to investigate the effective fracture energy of uhpfrc with various fibre volume fractions and various aspect ratios. different behaviour of uhpfrc in terms of gf can be expected for various fibre volume fractions and various aspect ratios, as the fibres are the key component of uhpfrc that results in enhanced ductility. the results provided in the present study can serve as valuable information for verifying material models, and also for design purposes. this paper also provides an overview of the basic mechanical properties of uhpfrc and the experimental techniques used. in addition, a brief mixing procedure as well as the sample preparation is presented.

figure 1. effective fracture energy of a notched specimen.

2. experimental program

the effective fracture energy (gf) of a material is defined as the energy required to open a unit crack surface area. the fracture energy is primarily governed by the tensile mechanism of the material, and represents the amount of energy consumed when a crack propagates through a beam. the fracture energy (gf) is expressed as the work of external forces acting on the beam related to the actual depth of the crack. the overall work of the external forces related to the final crack depth is considered as the average fracture energy, the so-called effective fracture energy (fig. 1). the effective fracture energy (gf) was determined in this study on the basis of recommendations presented by the rilem technical committee [12] and also by other studies [13, 14]:

$$g_f = \frac{w_f + m g u_u}{b(h - a_0)}, \quad (1)$$

where gf is the effective fracture energy, wf is the work of external forces (i.e., the area beneath the load-deflection diagram), and mguu is the contribution of the weight of the beam and the measurement equipment that is not connected to the loading frame. in detail, m is the mass of the beam and the free experimental equipment, g is the gravity acceleration, uu is the ultimate deflection of the beam, b is the width of the beam, h is the height of the beam, and a0 is the height of the notch.

figure 2. experimental setup.

experiments were performed on beams of 100 × 100 × 550 mm in dimensions with a clear span of 500 mm. similar spans were used in previous studies by habel and gauvreau [15] and sovják et al. [9], for instance. the beams had a notch in their bottom edge which was 30 mm in height and 5 mm in width (fig. 2). each beam was turned 90° from the casting surface and then sawed through completely at midspan [16]. the aspect ratio and the fibre volume fraction were set as the main test variables in this study. three different fibre volume fractions were tested, covering 1, 2, and 3 % of the fibre volume content. other studies have shown that the optimal fibre volume fraction for a protective structure or for a defence structure is 2 % by volume [17, 18]. the effect of the aspect ratio on gf was therefore investigated in this study on samples with 2 % of fibres by volume only.
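equation (1) translates directly into a few lines of code. the following sketch is our own illustration (the synthetic load-deflection record and the beam mass are made-up placeholder values, not measured data); it evaluates the effective fracture energy from a recorded diagram by trapezoidal integration:

```python
import numpy as np

def effective_fracture_energy(load, defl, m, b, h, a0, g=9.81):
    """effective fracture energy (1): g_f = (w_f + m*g*u_u) / (b*(h - a0));
    load in n, deflection in m, beam + free equipment mass m in kg,
    beam width b, beam height h and notch height a0 in m."""
    w_f = np.trapz(load, defl)        # work of external forces [j]
    u_u = defl[-1]                    # ultimate (final recorded) deflection
    return (w_f + m * g * u_u) / (b * (h - a0))   # [j/m2]

# notched 100 x 100 mm beam with a 30 mm notch; synthetic record
defl = np.linspace(0.0, 0.01, 200)
load = 20e3 * (defl / 0.002) * np.exp(-defl / 0.002)
gf = effective_fracture_energy(load, defl, m=14.0, b=0.1, h=0.1, a0=0.03)
print(round(gf))
```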
the aspect ratios investigated in this study were 50.0, 59.1, 72.2 and 108 : 1, using fibres of 8.5 × 0.17 mm, 13 × 0.22 mm, 13 × 0.18 mm and 14 × 0.13 mm, respectively. quasi-static loading conditions were simulated by a deformation-controlled test with a cross-head speed of 0.2 mm/min. this speed corresponded to a strain rate of 5.6 · 10⁻⁶ s⁻¹, which is considered the quasi-static strain rate [19, 20]. during the experimental program, the force acting on the beam and the deflection measured by two lvdt (linear variable differential transformer) sensors were recorded at a 5 hz frequency. steel yokes were implemented in the experimental setup as mounts for the lvdt sensors, in order to subtract the settlement of the supports from the measured deflections [21].

the compressive strength and the secant modulus of elasticity were measured on cylinders 100 mm in diameter and 200 mm in height. because the strength of the best available capping material (100 mpa) was significantly lower than the expected measured strengths, the tops of the cylinders were cut off and ground. the compressive strength was measured on the cylinders by monotonic increments of load with an average speed of 36 mpa/min up to the level of 70 % of the expected compressive strength. at this point, the loading was switched to deformation control with a speed of 0.048 mm/min for about 20 minutes in order to measure the peak and post-peak behaviour. the modulus of elasticity was measured using two strain gauges with a 50 mm base, attached to the sides of the cylinder specimen. a hydraulic loading machine was used and the loading procedure was stress controlled. as a first step, the specimens were loaded to 1/3 of the expected maximal compressive strength (in this case 50 mpa) for 60 seconds. afterwards, the specimens were unloaded to 5 mpa. this procedure was repeated three times. the secant modulus of elasticity was calculated from the third unloading cycle.

the modulus of rupture in a three-point bending configuration was measured on prisms with dimensions of 400 × 100 × 100 mm with a clear span of 300 mm. the loading speed was 0.1 mm/min. a direct tensile test was carried out on dog-bone shaped specimens without a notch. the length of the specimens was 330 mm, and the cross-section of the narrowed part was 30 × 30 mm with a length of 80 mm. the specimens were mounted into specially developed grips and the loading speed was 0.1 mm/min. the splitting tensile strength was measured on cylinders 100 mm in diameter and 100 mm in length. the splitting tensile test was stress controlled and the loading speed was 3 mpa/min.

3. material

the uhpfrc tested in this study was developed on the basis of components widely available in the czech republic. the material design process has been fully described elsewhere [17, 22, 23]. high particle packing density is a key factor for the ultra-high compressive strength of concrete. therefore, the mixture design was based on optimizing the particle packing density of sand (s), silica fume (sf), glass powder (gp) and cement (c). an improvement in particle packing was achieved mainly by changing the matrix composition and proportions and by selecting ranges of particles for the sand [24]. the first mixture was designed following the proportions of c : sf : gp recommended by wille et al. [25] as 1 : 0.25 : 0.25, with a water-binder (w/b) ratio of 0.2.
subsequent changes in the most important parameters, such as the high-range water reducer (i.e., superplasticizer), the water content (w), sf, and gp, led to an optimized cementitious matrix (tab. 1).

table 1. mixture design of the uhpfrc used in this study (values in kg/m³).

fibre content            1 %    2 %    3 %
cement cem i 52,5r       800    800    800
silica fume              200    200    200
silica powder            200    200    200
water                    176    176    176
superplasticizer         39     39     39
fine sand 0.1/0.6 mm     336    336    336
fine sand 0.3/0.8 mm     720    640    560
straight steel fibres    80     160    240

figure 3. the various micro-fibres used in this study.

the uhpfrc was mixed in conventional mixers, and the beams were cured in water tanks. the mixture contained a high volume of cement and silica fume, and the water-to-binder ratio was 0.18. in this study, the fibre aspect ratio and the fibre volume fraction (i.e., the fibre content) were selected as the main test variables. the high-strength steel micro-fibres used in this study were straight and smooth, with a tensile strength of 2800 mpa (fig. 3), as specified by the manufacturer. the high tensile strength of the fibres was chosen in order to achieve the pull-out failure mode. the pull-out failure mode (fig. 4a) is a much more energy-consuming mode than the fibre failure mode (fig. 4b) [26].

figure 4. a) pull-out failure mode. b) fibre failure mode.

straight fibres also provided a good trade-off between the workability and the mechanical properties of the resulting mixture. when mixing uhpfrc, it is very important to achieve good workability, particle distribution and packing density. in comparison to normal-strength concrete, uhpfrc contains more constituents and finer particles. several researchers recommend mixing all fine dry particles first, before adding water and superplasticizer [27–29]. they do so because small particles tend to agglomerate, and it is easier to break these chunks when the particles are dry. the specific mixing procedure was as follows: as the first step, both types of aggregates and the silica fume were mixed for five minutes. as the second step, the cement and the silica powder were mixed for another five minutes. at the end of the procedure, water and superplasticizer were added. the addition of superplasticizer was gradual. the mixture became fully workable after another 5 minutes. fibres were added gradually into the flowable mixture, to avoid chunk formation, during the last 5 minutes of mixing. the shear action of the fibres helped to destroy any remaining agglomerates in the mixture, thus improving workability. the total mixing time was 20 minutes. a food-type mixer with a capacity of 25 l was used to prepare the samples.

the placement of the fresh uhpfrc mixture in the moulds caused the fibres to be aligned along the length of the beam [30]. uhpfrc had flowable and self-consolidating characteristics, so beams made of uhpfrc were fabricated by placing the concrete at a certain point of the mould and allowing it to flow [16, 31]. this led to fibre alignment in the direction of the tensile stress, because when filling the fresh uhpfrc into the formwork, the fibres in the zones near the surface are predominantly oriented parallel to the formwork walls (the so-called wall effect) [32]. no other technique was used to align the fibres. all beams were tested 28 days after casting in order to avoid the effect of ageing, which may also influence the results [33]. uhpfrc is usually cured using pressure or elevated temperature.
this helps to enhance its properties by accelerating the hydration reaction of the binder. however, this is not only energy-expensive but also limits the use of uhpfrc to precast element production. therefore, the uhpfrc investigated in this study is a self-consolidating concrete with fast strength development which does not require heat curing or special mixing techniques. as shown in tab. 2 and tab. 3, the compressive strength measured on cylinders 200 mm in height and 100 mm in diameter was around 150 mpa. the compressive strength did not vary, either with an increasing fibre content or with the various fibre aspect ratios. however, the uniaxial tensile strength, the splitting tensile strength (tab. 2) and the modulus of rupture (tab. 3) showed a linear dependence on the actual fibre content or aspect ratio. the maximum tensile strength was determined to be 11.7 mpa when the fibre content was 3 % by volume [10].

fibre content                          1 %     2 %     3 %
fibre geometry                         13 × 0.22 mm (all mixtures)
cylinder compressive strength (mpa)    150     152     150
tensile strength (mpa)                 7.80    9.90    11.7
modulus of rupture (mpa)               15.8    25.6    33.8
splitting tensile strength (mpa)       14.9    20.5    26.6
modulus of elasticity (gpa)            45.1    56.3    51.5

table 2. mechanical properties of the uhpfrc used in this study.

fibre content   8.5 × 0.17 mm (50.0 : 1)   13 × 0.22 mm (59.1 : 1)   13 × 0.18 mm (72.2 : 1)   14 × 0.13 mm (108 : 1)
                ccs        mor             ccs        mor            ccs        mor            ccs        mor
1 %             –          –               150 (2.4)  15.8 (2.5)     –          –              –          –
2 %             144 (6.1)  21.0 (1.6)      152 (7.4)  25.6 (1.5)     147 (5.2)  27.0 (1.8)     148 (3.3)  36.5 (2.5)
3 %             –          –               150 (6.5)  33.8 (1.4)     –          –              –          –

table 3. cylinder compressive strength (ccs) and modulus of rupture (mor) of the uhpfrc used in this study (values in mpa). the value in parentheses gives the standard deviation from the tested specimens.

4. results and discussion
load-deflection (l–d) diagrams were plotted for all beams, including the various fibre volume fractions (fig. 5) and the various aspect ratios (fig. 6). six beams were tested for each fibre content and for each fibre aspect ratio, making a total of 36 tested beams.

figure 5. load-deflection diagrams for uhpfrc beams with various fibre contents.

figure 6. load-deflection diagrams for uhpfrc beams with various aspect ratios.

the experimental results obtained in this study showed that an increase both in the aspect ratio of the fibres and in the fibre volume fraction resulted in an increase in the effective fracture energy (tab. 4). in addition, both results appear to follow a linear trend (fig. 7, fig. 8). data obtained by other researchers dealing with uhpfrc with straight steel micro-fibres subjected to flexure using notched specimens were examined in order to verify the results derived from this study. kang et al. [34] tested uhpfrc using a notched 3-point bending test, where the fibre volume fraction varied from 1 to 5 %. the uhpfrc used in their study was mixed using a w/b ratio of 0.2 and steel fibres with an aspect ratio of 65 (13 × 0.2 mm). the tensile strength of the fibres was specified to be 2500 mpa. the beams were 400 mm in length (300 mm clear span) with a notch 30 mm in height. the effective fracture energy derived from their study increased approximately linearly with an increasing fibre volume fraction up to 5 %. a linear increase in gf was also observed in the work conducted by yoo et al. [35]. yoo et al.
tested uhpfrc up to 4 % of the fibre volume fraction, and the effective fracture energy values tended to increase as the fibre volume increased (fig. 9). yoo et al. also tested smooth steel fibres with an aspect ratio of 65 (13 × 0.2 mm) and with a tensile strength of 2500 mpa. the beams were 400 mm in length with a 300 mm-long clear span, and the notch was 10 mm in height. the average compressive strength of the uhpfrc used by yoo et al. ranged from 182 mpa to 207 mpa.

fibre content   8.5 × 0.17 mm (50.0 : 1)   13 × 0.22 mm (59.1 : 1)   13 × 0.18 mm (72.2 : 1)   14 × 0.13 mm (108 : 1)
1 %             –                          12700 (2000)              –                         –
2 %             10900 (1300)               18700 (1000)              21000 (1700)              32400 (4000)
3 %             –                          26200 (800)               –                         –

table 4. effective fracture energy (gf) of uhpfrc, in j/m2. the value in parentheses gives the standard deviation from the tested beams.

figure 7. development of the effective fracture energy in the framework of various fibre volume fractions.

fig. 9 shows that gf increases as vf increases. in addition, it tends to follow a linear progression. the forecasting equation for the simplified linear trend model can be expressed as

gf = g∗f + a ∆vf,   (2)

where gf is the effective fracture energy, g∗f is the (measured) reference sample value of the effective fracture energy, a indicates the size of the increase in gf when vf increases by one percent, and ∆vf is the change between the reference sample value of vf and the actual sample value of vf. the domain of the proposed equation is between 1 and 5 % of vf. the experimentally derived constant a was calculated as an average slope, and was determined to be 6500 j/m2/%. we have found very little research on uhpfrc with different aspect ratios of the fibres in the framework of high-strength, straight and smooth steel micro-fibres. yoo et al. [31] tested various aspect ratios of fibres in uhpfrc incorporating 2 % of fibres by volume. the fibres were 13.0 mm, 16.3 mm, 19.5 mm and 30.0 mm in length and 0.2 mm, 0.2 mm, 0.2 mm and 0.3 mm in diameter, respectively.

figure 8. development of the effective fracture energy in the framework of various aspect ratios.

figure 9. effect of fibre volume fraction on the effective fracture energy of uhpfrc.

the uhpfrc used in the study by yoo et al. reached its highest cylinder compressive strength, 204.5 mpa, with the 16.3 mm fibre length. yoo et al. also reported that the compressive strength was somewhat independent of the fibre geometry. the data obtained by yoo et al. tended to maintain a linear trend throughout the range of aspect ratios used in their study. this trend was also found to depend on the reference value of gf, which must be included in order to reach a uniform gradient of increments per change in aspect ratio across the different studies. it can therefore be noted that the effective fracture energy increases in proportion to the aspect ratio, where the proportionality constant also incorporates the reference value of the effective fracture energy. the proportionality constant for this trend was specified by the experimental constant b and the reference value of the effective fracture energy (g∗f).

figure 10. effect of aspect ratio on effective fracture energy of uhpfrc.

the experimental constant b was determined to be 0.0338 (fig. 10). the value of the experimental constant b was calculated as the average slope of the two lines presented in fig. 10.
fig. 10 shows that gf increases as l/d increases. the forecasting equation for the simplified linear trend model can therefore be expressed as follows:

gf = g∗f + b g∗f ∆(l/d),   (3)

where gf is the effective fracture energy, g∗f is the reference sample value of the effective fracture energy, b is the experimentally derived constant, and ∆(l/d) is the change between the reference sample value of l/d and the actual sample value of l/d. the domain of the proposed equation is limited from 50 to 108 : 1 of l/d. within the framework of the studies discussed here, it is interesting to note that the slope of the linear dependence is not influenced by the matrix composition in the case of various fibre volume fractions. only the intercept of the line is influenced by the various matrix compositions in each study, despite the fact that various matrix compositions lead to different fibre-matrix bond properties, which have a direct influence on the fracture process. on the other hand, the slope of the linear dependence in the case of various aspect ratios was found to be influenced by the composition of the matrix. this was reflected by the incorporation of the reference value of gf (i.e., the initial value of the line at the lowest l/d in the domain) into the experimental constant b, which reflects the size of the increase in gf when l/d increases by one unit. using the reference value of gf (i.e., g∗f), a uniform slope of the individual lines was achieved. however, it is important to note that only a limited number of results were found and used in this simplified model, and it is essential to provide more data to upgrade or extend the scope of this trend. the simplified model for estimating the change in the effective fracture energy due to a change in the aspect ratio and/or in the fibre volume fraction can be established using both dependencies, as presented in (2) and (3), as follows:

gf = g∗f + a ∆vf + b g∗f ∆(l/d),   (4)

where g∗f is the reference (measured) value of the effective fracture energy, a and b are experimentally derived constants, ∆vf is the change between the reference sample fibre volume fraction and the actual sample fibre volume fraction, and ∆(l/d) is the change between the reference sample fibre aspect ratio and the actual sample fibre aspect ratio.

figure 11. correlation of the results.

the correlation between the results obtained experimentally and the results obtained by the proposed model is shown in fig. 11. fig. 11 shows that the scatter plot follows an approximately linear pattern. the correlation coefficient for all data was calculated to be 0.986. this value implies that the simplified linear model describes the relationship between gf and ∆vf and between gf and ∆(l/d) with reasonable accuracy. these calculated plots were derived from (4), and they are valid only within the given intervals. we cannot provide any guarantee that the relationship will continue beyond the range for which the data were collected. it can be expected that a further increase in fibre content or aspect ratio beyond these limits will lead the linear relationships to break down. therefore, an extension of the scope of the current testing by other experimental data is needed.
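as a numerical cross-check of the simplified model, the following sketch (python; a non-authoritative illustration using the constants a and b derived above and reference values taken from tab. 4) evaluates (4) against two of the measured points.

# simplified linear model of eqs. (2)-(4); gf_ref is the measured reference
# value g*f, d_vf the change in fibre volume fraction (percentage points),
# d_ld the change in fibre aspect ratio
A = 6500.0   # j/m2 per 1 % of vf, from eq. (2)
B = 0.0338   # per unit change of l/d, from eq. (3)

def effective_fracture_energy(gf_ref, d_vf=0.0, d_ld=0.0):
    return gf_ref + A * d_vf + B * gf_ref * d_ld

# reference: 1 % fibres at 59.1:1, g*f = 12700 j/m2; predicted at 3 %:
print(effective_fracture_energy(12700.0, d_vf=2.0))            # 25700 vs measured 26200
# reference: 2 % fibres at the lowest aspect ratio 50:1, g*f = 10900 j/m2;
# predicted at 108:1:
print(effective_fracture_energy(10900.0, d_ld=108.0 - 50.0))   # ~32270 vs measured 32400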
5. conclusions and further outlook
the effective fracture energy was determined on a total of 36 beams, which were tested with various fibre aspect ratios and various fibre contents. a three-point bending configuration was applied and l–d diagrams were examined. a closer examination of the fracture surface revealed fibre pull-out from the matrix, which is a more energy-consuming mode than fibre rupture (fig. 12). the fibre volume fraction ranged from 1 to 3 % by volume, and the aspect ratio ranged from 50.0 to 108 : 1. other studies [31, 34, 35] were examined and relevant data were obtained to create the simplified model.

figure 12. a) fibres used in this study. b) fibre pull-out failure mode.

the following conclusions can be drawn on the basis of the experimental outcomes derived from our study:
(1.) the effective fracture energy (gf) increases as the fibre volume fraction increases. it has been specified that the average increase in gf per 1 % increase in vf is approximately 6500 j/m2.
(2.) the effective fracture energy also increases as the aspect ratio increases. a higher aspect ratio is reflected by a higher absolute number of fibres in the mixture and therefore by a larger surface area. in addition, it was found that the dependence of gf on the aspect ratio of the fibres tends to follow a linear trend. this has also been confirmed by another study.
(3.) a simplified model was established to calculate the change in the effective fracture energy when the fibre volume fraction or the aspect ratio of the fibres changes. two experimental constants were established on the basis of the results from this study and other available literature [31, 34, 35].
(4.) an idealization of the linear dependence has been proposed, but the model is valid only within the limits given in tab. 5. further extension of the model by other experimental data is needed, especially for higher aspect ratios.

compressive strength of uhpfrc   144–207 mpa
fibre type                       straight and smooth steel fibres
tensile strength of the fibre    2400–2800 mpa
fibre diameter                   0.13–0.22 mm
fibre length                     8.5–30 mm
aspect ratio                     50–108 : 1
fibre volume fraction            1–5 %
clear span of the beam           300–500 mm
cross-section of the beam        100 × 100 mm
height of the notch              10–30 mm

table 5. limits on the validity of the model.

acknowledgements
the authors gratefully acknowledge the support provided by the czech science foundation under project number gap 105/12/g059. the authors would like to acknowledge michaela kostelecká, from the klokner institute, for her assistance with the microscopic investigation of the fibres used in this study. the authors would also like to acknowledge the assistance given by the technical staff of the experimental centre, faculty of civil engineering, ctu in prague, and by the students who participated in the project.

references
[1] millon, o., riedel, w., et al.: fiber-reinforced ultra-high performance concrete – a material with potential for protective structures. in: proceedings of the first international conference of protective structures (editors: q. m. li et al.). manchester, 2010, p. no.–013.
[2] máca, p., sovják, r., et al.: mix design of uhpfrc and its response to projectile impact. international journal of impact engineering, 63, 2014, p. 158–163. doi:10.1016/j.ijimpeng.2013.08.003
[3] sovják, r., vavřiník, t., et al.: experimental investigation of ultra-high performance fiber reinforced concrete slabs subjected to deformable projectile impact. in: procedia engineering. 2013, p. 120–125. doi:10.1016/j.proeng.2013.09.021
[4] lai, j., guo, x., et al.: repeated penetration and different depth explosion of ultra-high performance concrete.
international journal of impact engineering, 84, 2015, p. 1–12. doi:10.1016/j.ijimpeng.2015.05.006
[5] li, j., wu, c., et al.: investigation of ultra-high performance concrete slab and normal strength concrete slab under contact explosion. engineering structures, 102, 2015, p. 395–408. doi:10.1016/j.engstruct.2015.08.032
[6] nicolaides, d., kanellopoulos, a., et al.: experimental field investigation of impact and blast load resistance of ultra high performance fibre reinforced cementitious composites (uhpfrccs). construction and building materials, 95, 2015, p. 566–574. doi:10.1016/j.conbuildmat.2015.07.141
[7] nicolaides, d., kanellopoulos, a., et al.: development of a new ultra high performance fibre reinforced cementitious composite (uhpfrcc) for impact and blast protection of structures. construction and building materials, 95, 2015, p. 667–674. doi:10.1016/j.conbuildmat.2015.07.136
[8] tran, n.t., tran, t.k., et al.: fracture energy of ultra-high-performance fiber-reinforced concrete at high strain rates. cement and concrete research, 79, 2015, p. 169–184. doi:10.1016/j.cemconres.2015.09.011
[9] sovják, r., rašínová, j., et al.: effective fracture energy of ultra-high-performance fibre-reinforced concrete under increased strain rates. acta polytechnica, 54 (5), 2014, p. 358–362. doi:10.14311/ap.2014.54.0358
[10] máca, p., sovják, r., et al.: experimental investigation of mechanical properties of uhpfrc. procedia engineering, 65 (0), 2013, p. 14–19. doi:10.1016/j.proeng.2013.09.004
[11] xu, m., wille, k.: fracture energy of uhp-frc under direct tensile loading applied at low strain rates. composites part b: engineering, 80, 2015, p. 116–125. doi:10.1016/j.compositesb.2015.05.031
[12] recommendation, r.d.: determination of the fracture energy of mortar and concrete by means of three-point bend tests on notched beams. 1985.
[13] bažant, z.p., kazemi, m.t.: size dependence of concrete fracture energy determined by rilem work-of-fracture method. international journal of fracture, 51 (2), 1991, p. 121–138. doi:10.1007/bf00033974
[14] hu, x.-z., wittmann, f.h.: fracture energy and fracture process zone. materials and structures, 25 (6), 1992, p. 319–326. doi:10.1007/bf02472590
[15] habel, k., gauvreau, p.: response of ultra-high performance fiber reinforced concrete (uhpfrc) to impact and static loading. cement and concrete composites, 30 (10), 2008, p. 938–946.
[16] yang, i.h., joh, c., et al.: structural behavior of ultra high performance concrete beams subjected to bending. engineering structures, 32 (11), 2010, p. 3478–3487. doi:10.1016/j.engstruct.2010.07.017
[17] máca, p., sovják, r., et al.: mix design of uhpfrc and its response to projectile impact. international journal of impact engineering, 2013. doi:10.1016/j.ijimpeng.2013.08.003
[18] sovják, r., vavřiník, t., et al.: resistance of slim uhpfrc targets to projectile impact using in-service bullets. international journal of impact engineering, 76, 2015, p. 166–177. doi:10.1016/j.ijimpeng.2014.10.002
[19] li, q.m., reid, s.r., et al.: local impact effects of hard missiles on concrete targets. international journal of impact engineering, 32 (1), 2005, p. 224–284. doi:10.1016/j.ijimpeng.2005.04.005
[20] beckmann, b., hummeltenberg, a., et al.: strain behaviour of concrete slabs under impact load.
structural engineering international, 22 (4), 2012, p. 562–568. doi:10.2749/101686612x13363929517893
[21] banthia, n., trottier, j.-f.: test methods for flexural toughness characterization of fiber reinforced concrete: some concerns and a proposition. aci materials journal, 92 (1), 1995. doi:10.14359/1176
[22] maca, p., zatloukal, j., et al.: development of ultra high performance fiber reinforced concrete mixture. in: ieee symposium on business, engineering and industrial applications (isbeia). ieee, 2012, p. 861–866. doi:10.1109/isbeia.2012.6423015
[23] sovják, r., vogel, f., et al.: triaxial compressive strength of ultra high performance concrete. acta polytechnica, 53 (6), 2013. doi:10.14311/ap.2013.53.0901
[24] máca, p., sovják, r.: resistance of ultra high performance fibre reinforced concrete to projectile impact. structures under shock and impact, 2012, p. 261.
[25] wille, k., naaman, a.e., et al.: ultra-high performance concrete with compressive strength exceeding 150 mpa (22 ksi): a simpler way. aci materials journal, 108 (1), 2011, p. 46–54. doi:10.14359/51664215
[26] bencardino, rizzuti, et al.: stress-strain behavior of steel fiber-reinforced concrete in compression. journal of materials in civil engineering, 20 (3), 2008, p. 255–263. doi:10.1061/(asce)0899-1561(2008)20:3(255)
[27] wille, k., el-tawil, s., et al.: properties of strain hardening ultra high performance fiber reinforced concrete (uhp-frc) under direct tensile loading. cement and concrete composites, 2014. doi:10.1016/j.cemconcomp.2013.12.015
[28] habel, k., viviani, m., et al.: development of the mechanical properties of an ultra-high performance fiber reinforced concrete (uhpfrc). cement and concrete research, 36 (7), 2006, p. 1362–1370. doi:10.1016/j.cemconres.2006.03.009
[29] habel, k., charron, j.-p., et al.: ultra-high performance fibre reinforced concrete mix design in central canada. canadian journal of civil engineering, 35 (2), 2008, p. 217–224. doi:10.1139/l07-114
[30] fornůsek, j., tvarog, m.: influence of casting direction on fracture energy of fiber-reinforced cement composites. key engineering materials, 594, 2014, p. 444–448. doi:10.4028/www.scientific.net/kem.594-595.444
[31] yoo, d.-y., kang, s.-t., et al.: effect of fiber length and placement method on flexural behavior, tension-softening curve, and fiber distribution characteristics of uhpfrc. construction and building materials, 64, 2014, p. 67–81.
doi:10.1016/j.conbuildmat.2014.04.007
[32] mechtcherine, v., millon, o., et al.: mechanical behaviour of strain hardening cement-based composites under impact loading. cement and concrete composites, 33 (1), 2011, p. 1–11. doi:10.1016/j.cemconcomp.2010.09.018
[33] holčapek, o., vogel, f., et al.: time progress of compressive strength of high performance concrete. applied mechanics and materials, 486, 2014, p. 167–172. doi:10.4028/www.scientific.net/amm.486.167
[34] kang, s.-t., lee, y., et al.: tensile fracture properties of an ultra high performance fiber reinforced concrete (uhpfrc) with steel fiber. composite structures, 92 (1), 2010, p. 61–71. doi:10.1016/j.compstruct.2009.06.012
[35] yoo, d.-y., lee, j.-h., et al.: effect of fiber content on mechanical and fracture properties of ultra high performance fiber reinforced cementitious composites. composite structures, 106, 2013, p. 742–753. doi:10.1016/j.compstruct.2013.07.033

constraint-aided product design

g. mullineux, b. hicks, t. medland

abstract
the importance of supporting the early stages of design is widely accepted. in particular, the development of supportive tools and methods for modelling and analysis of evolving design solutions presents a difficult challenge. one reason for this is the need to model both the product design and the design knowledge from which the design is created. there are a number of limitations with many existing techniques, and an alternative approach that deals with the design constraints themselves is presented. dealing directly with the constraints affords a more generalised approach that represents the process by which a product is designed. this enables modelling and reasoning about a product from an often abstract and evolving set of requirements. the constraint methodology is an iterative process where the design requirements are elaborated, the constraint rules altered, and design ideas generated and tested as functional structures. the incorporation of direct search techniques to solve the constrained problem enables different solutions to be explored and allows the determination of 'best compromises' for related constraints. a constraint modelling environment is discussed and two example cases are used to demonstrate the potential of a constraint-aided approach for supporting important issues such as the design of product variants and product families.

keywords: constraint modelling, design knowledge, synthesis, product design, product families.

1 introduction
it is widely accepted that many of the tasks involved during the early stages of design are critical to the success of a product, however defined [1, 2]. a model of the overall design process is shown in fig. 1. the ability to effectively undertake many of these tasks is largely dependent upon the level of understanding of the design problem and the representation and handling of design knowledge [3, 4]. in fact, modelling both the product design and the design knowledge from which it is developed is one of the most important and difficult tasks to support. a number of attempts to address this have been undertaken by academia and industry. these include function-based modelling [5], domain representations [6], ontologies [7] and knowledge-based methods [8]. however, the majority of these supportive tools and methods are poor at dealing with generalised problems or problems where the design knowledge is incomplete and continually changes.
one alternative approach is to deal directly with the design constraints themselves. most design decisions involve considering some form of restriction (real or artificial) on the choices available, and formulating problems in terms of design parameters and these restrictions is a more intuitive approach. constraint modelling is concerned with formalising and representing design parameters and their relationships as a set of inter-related constraints. the relationships may include mutually conflicting requirements and are generated directly from the design knowledge currently available. as the design proceeds and the design knowledge evolves, the constraint set develops. this constraint set represents the process which led to the development of a particular engineering solution. a constraint-aided process provides a more holistic approach that allows the designer to iterate between the specification, concept and layout stages of the design process. this is shown in the context of the traditional design process in fig. 1. constraint rules are derived directly from the specification and used in the evaluation of the proposed functional structure. failure of a proposed functional description to meet the specified rules results in the re-evaluation of that proposed solution, and possibly in a reconsideration of those rules that could not be met. success leads on to the construction of a preliminary layout in the form of a set of preferred values of a parametric model. similarly, the parametric model itself contains rules that control the limits of its geometry. as a result, other failures in rules may occur that need to be resolved by further modifications of the selected specification rules, the chosen concept and the geometric configuration. manipulation of all these parameters against the goals set by the rules also provides information to the designer on those aspects of the design that are highly sensitive to variation and, conversely, those that have little to no effect upon the solution. this information is invaluable in the process of deriving further or alternative concepts. this paper presents a constraint-aided approach for supporting the designer during the early stages of product design. a constraint modelling environment based on the notion of constraint satisfaction is proposed as the basis of a design support tool. the approach addresses the issue of modelling and reasoning about the design of products from an abstract set of requirements. it also demonstrates how design knowledge can be incorporated and handled during the early stages of the design of a product. the constraint modelling environment is described and a number of examples are used to illustrate the approach. the examples demonstrate the process of creating rules from the specification, testing ideas as functional structures as well as against the specification, generating preliminary layouts and selecting the best preliminary layout. furthermore, the examples demonstrate the potential of a constraint-aided approach for supporting important issues such as the design of product variants and product families.

2 constraint-aided modelling
the constraint-based approach aims to represent what is to be achieved rather than how it is to be achieved. these objectives or goals are represented as 'constraint rules'. these represent the relationships between the design parameters which must be satisfied if the design is to fulfil all of the requirements.
these constraints may include performance and physical requirements of the design and also constraints imposed by resources. in this manner the design of the artefact is not process-led but goal-orientated. when considering a system it is very rare that any single element or operation is independent of all the other elements. consequently, all the goals must be dealt with concurrently and their relationships considered. the aim is to find a solution that satisfies all these imposed constraints as closely as possible. the solution space is the intersection of all the individual constraint fields, as shown in fig. 2. this intersection is determined computationally by direct search techniques and the manipulation of the design parameters. using direct search techniques means that convergence on a fully successful solution is obtained if one exists or, if not, a best compromise is determined.

fig. 1: typical activities in the design process

fig. 2: constraint rules illustrated as sets

this holistic approach allows the
representation of design knowledge and, more importantly, enables this knowledge to be expanded or modified at any time during the process. in this manner, the approach allows changes in both the proposed solution and in the governing constraints of the particular design problem. in the approach discussed in this paper, no large assumptions about the form of the engineering constraints are made, except that it is assumed that the underlying variables are (more or less) continuously varying. because of this, different forms of constraints between the design parameters can, in principle, be dealt with. the approach implements a general-purpose resolution strategy based upon optimisation, with the constraints themselves being treated as penalty functions. although unrefined when compared with specific techniques for dedicated applications, it has proved successful in design software where the forms of the application and constraints are not known a priori. the language of the constraint modeller has been created to handle design variables of several types, including structured types to represent, for example, geometric objects. the language supports user-defined functions, which are essentially collections of commands that can be invoked when required. input variables can be passed into a function and the function itself can return a single value or a sequence of values. an important inbuilt function is the "rule" command. each rule command is associated with a constraint expression between design variables which is zero (as a real number) when true. a non-zero value is a measure of its falseness (error). during resolution the constraint expression for each rule command is evaluated and the sum of the squares of these is found. if this is already zero, then each constraint expression represents a true state. if the sum is non-zero, resolution commences. this resolution process involves varying a set of design parameters specified by the user. the sum is now treated as a function of these variables and a numerical technique can be applied to search for values of the parameters which minimise the sum. if a minimum of zero can be found then the constraints are fully satisfied. if not, then the minimum represents some form of best compromise for a set of constraints in which there is conflict.
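the behaviour of this resolution strategy can be illustrated with a minimal sketch (python is used here purely for illustration; the constraint modeller has its own language): two deliberately conflicting rules are summed as squared penalties and a direct search returns the best compromise.

from scipy.optimize import minimize

# each rule expression is zero when true; a non-zero value measures its error
rules = [lambda v: v[0] - 2.0,   # rule: x must equal 2
         lambda v: v[0] - 3.0]   # conflicting rule: x must equal 3

def penalty(v):
    return sum(r(v) ** 2 for r in rules)  # sum of squares over all rules

res = minimize(penalty, x0=[0.0], method='Nelder-Mead')
print(res.x)  # ~2.5: the minimum is non-zero, so this is a best compromise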
furthermore, it is possible to identify those constraints that are not satisfied, and as a result less important constraints can be relaxed, enabling an overall solution to be determined. as an example, consider the modelling of a four bar linkage. this is shown schematically in fig. 3. in part (a) of the figure, the two fixed pivot points are declared, and the lines representing the three links are defined, each in a local model space. the model space of the coupler link is "embedded" in the space of the crank, and the spaces for the crank and the driven links are embedded in world space. if transformations are applied to the links in each respective space then partial assembly is achieved. this is shown in part (b) of the figure. if the space of either the crank or the coupler is rotated, the hierarchy of their spaces ensures their ends remain attached. to complete the assembly, the ends of the coupler and driven link have to be brought together. this cannot be done by model space manipulation alone, as this would break the structure of the model space hierarchy. instead a constraint rule is applied whose value represents the distance between the ends of the lines [9]. to facilitate this, the user language has a binary function "on" which returns the distance between its two geometric arguments. if l1 and l2 are the lines representing the coupler and the driven links, then, in the user language, the constraint rule is expressed as follows

rule( l1:e2 on l2:e1 );

where the colon followed by e1 or e2 denotes either the first or second endpoint of the line segment. in order to satisfy this rule, the system is allowed to alter the angle of rotation of the two relevant model spaces. if the rule is applied then the correct assembly is obtained, as shown in part (c) of the figure. by rotating the space of the crank link and performing the assembly of the other two links at each stage, a simulation of the motion is achieved, as in part (d) of the figure. if solid objects representing the links are constructed, these can be entered into each model space, as shown in part (e).

fig. 3: modelling a four bar/six bar
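to give a flavour of how the "on" rule is resolved numerically, the following sketch (an illustrative re-implementation in python, not the modeller's own code; the link lengths and crank angle are assumed values) assembles the four bar linkage by minimising the squared distance between the coupler end and the driven-link end.

import numpy as np
from scipy.optimize import minimize

a, b, c, d = 2.0, 5.0, 4.0, 6.0   # crank, coupler, driven link, ground (assumed)
theta1 = np.radians(40.0)         # input crank angle

def on_rule_error(angles):
    # squared "on" distance between coupler end (l1:e2) and driven end (l2:e1)
    t2, t3 = angles
    crank_end = np.array([a * np.cos(theta1), a * np.sin(theta1)])
    coupler_end = crank_end + np.array([b * np.cos(t2), b * np.sin(t2)])
    driven_end = np.array([d, 0.0]) + np.array([c * np.cos(t3), c * np.sin(t3)])
    return float(np.sum((coupler_end - driven_end) ** 2))

res = minimize(on_rule_error, x0=[0.0, np.pi / 2], method='Nelder-Mead')
print(res.x, res.fun)  # res.fun ~ 0: the assembly rule is fully satisfied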
the following sections discuss in detail two cases which are used to illustrate the supportive capabilities of a constraint-aided process for product design. in particular, the examples discuss the ability to generate design ideas and test them as functional structures, embody and evaluate various alternatives or layouts, and identify which performs best. furthermore, the consideration of product families and product variants during the early stages of design is also discussed.

3 designing a hole punch
the first example involves the design of a variety of hole punches capable of handling stacks of paper of a variable number of sheets. this example demonstrates in principle the processes of capturing design knowledge, representing and assessing alternatives, and the representation of geometric relations and product assembly. in the example considered, a spreadsheet is used as an intermediary in which illustrations and notes may be added to supplement the constraint rules. the design variables and constraint rules are themselves linked to the constraint modeller by means of the windows protocol for interprocess communication (dynamic data exchange). these notes and illustrations represent evolving ideas and preliminary layouts. fig. 4 shows the initial sketches and design notes as well as the associated design variables (geometry) and the conceptual rules (constraint rules). these rules are derived from the product specification and are elaborated as the iterative process of developing constraint rules and exploring ideas is undertaken. firstly, a parametric model of the geometry is constructed. the constraints define design requirements or conceptual rules governing the relationships between parts and their assembly operations. the constraint satisfaction problem not only attempts to assemble the product but also to satisfy relations such as 'base should be greater than handle width' and 'overall height must be less than draw depth'. these requirements are expressed as rules that describe the relationships between the design parameters.

fig. 4: representing evolving design knowledge for the product development process

fig. 5: product variants determined through constraint satisfaction

as the development process proceeds and the design knowledge evolves, the constraint rules alter. by supplementing the constraint set with new and altered rules, a historical record of the outcome or implication of key decisions is constructed. this represents the elaboration of the specification and is particularly useful for design audit or design reuse activities, where a fundamental understanding of the particular solution needs to be obtained rapidly. fig. 5 shows two members of the hole punch family as determined by a global solution of the constraints. design data describing these two specific cases is held within the intermediary spreadsheet shown in fig. 4. this example illustrates the capability to develop product variants driven by a small number of requirements whilst satisfying or determining best compromises with other essential design criteria. the use of constraints to define assembly operations can be extended to more complex problems, such as the assembly of a container involving 11 parts and utilising 30 degrees of freedom [10].

4 a bicycle range
the second example involves the design of a range of bicycles (a family of products). in this example, design objectives were first collected and a list of design knowledge describing the requirements/specification was generated. the design objectives (design requirements) total more than 30 and relate to both the rider and the bicycle. these requirements extend from necessary performance needs and physical restrictions (for example 'pedal not hitting the ground' and 'foot not going through front wheel') to styling considerations that limit the range and sizes of the wheels. this design knowledge is transformed into a series of constraint rules that relate to a parametric model of the bicycle and the rider. for example, 'reach the pedal at the bottom of the stroke' requires that the total length of the rider is greater than the distance from the saddle to the pedal position at bottom dead centre. the corresponding constraint rule is expressed in terms of the design variables for the rider and for the bicycle. fig. 6 shows a parametric model of the bicycle and its associated design variables.
for the case considered, the objective is to develop a product family. to achieve this, a global solution is sought for the constrained problem for various sizes of rider. part (b) of fig. 6 shows two variants for different sizes of rider. in these solutions everything has been varied in order to satisfy the constraint set. however, in practice it is desirable to use a number of standard components. these can be included either by fixing the values of certain design parameters or by restricting their range of permissible values to within a bounded interval, or even a list of values such as sprocket sizes. in part (c) of fig. 6 a solution obtained using standard components for the wheels and the sprockets is shown.

fig. 6: a parametric model of a bicycle and design solutions determined through constraint satisfaction (part (a): parametric model; part (b): product variants; part (c): inclusion of standard components; part (d): strategies for minimal change)

when developing a product range or family it is often desirable to explore strategies for minimal change. this is made possible by investigating the sensitivity of the design solution to changes in the design variables [11]. dominant or critical design variables may then be altered to best achieve the desired effect with the minimal change to the solution. a number of strategies to achieve the desired changes in design objectives are shown in part (d) of fig. 6. each solution is determined by altering only a proportion of the total design variables.
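restricting a variable to a list of standard values, as mentioned above, can itself be expressed as a penalty term; a minimal sketch follows (the sprocket tooth counts are assumed for illustration, not taken from the paper).

def standard_component_penalty(value, allowed):
    # zero only when the design variable coincides with a standard size
    return min((value - s) ** 2 for s in allowed)

# e.g. a sprocket size being driven towards available tooth counts
print(standard_component_penalty(46.3, [44, 46, 48, 52]))  # -> 0.09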
5 conclusions
the need to model both the product design and the design knowledge from which the product is created has been discussed. there are limitations with many current techniques, and an alternative approach that deals with the design constraints themselves has been presented. dealing directly with the design constraints allows a more generalised approach that represents the process by which a product is designed. furthermore, dealing with the constraints allows modelling and reasoning about a product from an often abstract and evolving set of requirements, which is particularly important during the early stages of design. the constraint process is an iterative one where design requirements are elaborated, constraint rules altered, and design ideas generated and then tested as functional structures. the constraint-aided process provides for the representation and handling of design knowledge through the creation of constraint rules. the process of forming constraint rules acts to elicit and formalise design knowledge. the constraint rules not only describe relationships between design parameters but also assembly operations and desired performance capabilities. direct search techniques are applied to satisfy the set of constraint rules fully or to determine a best compromise. in particular, the constraint-aided approach supports the exploration, evaluation and assessment of alternative design solutions. design alternatives can be generated by altering the constraint set or the relative weighting between constraint rules, whilst a potential solution can be embodied by specifying desired performance characteristics or system outputs as a set of constraint rules. finally, design ideas can be assessed by testing them against the set of constraint rules. the example cases presented also demonstrate the potential of a constraint-aided approach for supporting important issues such as the design of product families and the inclusion of standard components at the early stages of product design.

acknowledgement
the ideas reported in this paper have emerged from work supported by a number of grants funded by the engineering and physical sciences research council (epsrc), the department of trade and industry and the department for environment, food and rural affairs (defra), involving a large number of industrial collaborators. in particular, current research is being undertaken as part of the epsrc innovative manufacturing research centre at the university of bath (reference gr/r67507/01). the authors gratefully express their thanks for the advice and support of all concerned.

references
[1] pahl, g., beitz, w.: engineering design: a systematic approach. second edition, springer-verlag, 1996, isbn 3-540-19917-9.
[2] pugh, s.: total design. addison-wesley publishing company, new york, 1991, isbn 0-201-41639-5.
[3] hubka, v., ernst, eder w.: design science: introduction to the needs, scope and organization of engineering design knowledge. springer-verlag, 1996, isbn 3-540-19997-7.
[4] miles, j., moore, c.: practical knowledge-based systems in conceptual design. springer-verlag, 1994, asin 0387198237.
[5] rosenman, m. a., gero, j. s.: "collaborative cad modelling in multidisciplinary design domains." in maher, m. l., gero, j. s., sudweeks, f. (eds.), preprints formal aspects of collaborative computer-aided design. key centre of design computing, university of sydney, sydney, 1997, p. 387–403.
[6] chakrabarti, a., blessing, l.: "representing functionality in design." artificial intelligence for engineering design and manufacture, vol. 10 (1996), no. 4, p. 251–253.
[7] gerhenson, j. k., stauffer, l.: "a taxonomy for design requirements from corporate customers." journal of research in engineering design, vol. 11 (1999), no. 2, p. 103–115.
[8] hayes-roth, f., jacobstein, n.: "the state of knowledge-based systems." communications of the acm, vol. 37 (1994), no. 3, p. 27–39.
[9] anantha, r., kramer, g. a., crawford, r. h.: "assembly modelling by geometric constraint satisfaction." computer-aided design, vol. 28 (1996), no. 9, p. 707–722.
[10] mullineux, g.: "constraint resolution using optimisation techniques." computers & graphics, vol. 25 (2001), p. 483–492.
[11] apps, b., mullineux, g.: "sensitivity analysis within a constraint modelling system." international conference in engineering design, proc. 11th iced, riitahuhta, a., ed., august, 1997, tampere university of technology, vol. 3, 1997, p. 309–312, isbn 951-722-788-4.

dr. glen mullineux
dr. ben hicks, e-mail: b.j.hicks@bath.ac.uk
prof. tony medland
innovative manufacturing research centre
department of mechanical engineering
university of bath
bath, ba2 7ay, united kingdom

implementation of component-based simulation support tool for conceptual design

t. brandejský

abstract
the presented paper discusses the problem of simulation support at the conceptual design stage. simulation support is a very useful part of a conceptual design system due to its capability to verify ideas in the early design stages. the problem excluding classical simulation tools is the lack of information about the designed device, its inconsistency and uncertainty; thus a specialised tool must be developed. a component-oriented editor of component descriptions and models is presented. this tool enables components to be described not only in terms of algebraic equations, but also by fuzzy rules. the problem of dynamic work with uncertainty representations during the simulation and design processes is also solved. the presented tool also differs from standard tools like mathematica or matlab in its ability to work with component-based models, where each component is described from many aspects and only a few of them are valid in a concrete use. the tool must be able to select the relations relevant to a concrete simulation task and omit the rest.

keywords: component-based simulation, conceptual design, mixed uncertainty description, heterogeneous model, dynamic uncertainty representation selection.

1 introduction
conceptual design is a term covering the design activities that precede the actual modelling and drawing of the system in a cad system. conceptual design is a process of formulating, and making precise, vaguely formulated preconditions, function descriptions and the structure of a designed system. due to the lack of information in these early design stages, we must use appropriate techniques, especially qualitative ones.
we must also recognise that the information about the system increases during the design process, and so the formalism used must enable movement from imprecise to more precise, from qualitative to quantitative description. the design process is, from many points of view, an optimisation process: a process not only of decreasing uncertainty, but also of searching for as good a solution as possible within the limits given by technical, physical and economical constraints. optimisation needs to have a model and a simulation tool in the background. in the field of conceptual design we also expect such a tool to be capable of supporting part-dimensioning work and of simulating under conditions of uncertainty. the sources of uncertainty are especially the incompleteness of the model, its vagueness and its inner contradictions. human designers decrease the complexity of a design problem by decomposition techniques. this approach enables them to use known parts and subsystems and to focus their creativity on limited sub-problems. thus, a simulation support tool for conceptual design must also be component-based. each component is represented by a special file with a structure enabling direct reading by the prolog consult predicate, and inheritance is solved by a special predicate in this description. only single inheritance is allowed, not multiple inheritance as in c++, because the model used contains object collections, whose use in the field of modelling is usually clearer than the use of multiple inheritance. for example, c++ programmers often use multiple inheritance in place of static collections, because the multiple inheritance mechanism fits the features of information systems better; but in the field of technical systems the grouping of different functions into one indivisible system is not frequent.

2 component based model for conceptual design
each component-based simulation tool consists of two fundamental parts: a component editor and the simulator itself. the presented editor supports all model features, including inheritance, encapsulation, structural information (physical types of variables) and some consistency checks. the editor also enables work with component libraries, see figure 1. the editor distinguishes two types of components, simple and container ones. the structure of the model is described in the work [1]. the model is described by prolog-like structures; an example is sketched in table 1. the simulation tool developed in the prolog language works in three ways. in the first, the presented simulation tool works as a
pre-processor generating, on the basis of the component libraries, models for commercial simulators like matlab or mathematica (it selects the relevant relations – equations). in the second, the simulation tool generates and simulates a qualitative model (from algebraic relations or from a functional ontology). in the last, the simulation tool uses all its capabilities, selects the proper relations as in the previous cases, and simulates the system with or without conditions of uncertainty.

parent(p:\tom\creativity\cmb\heating\tank.scf)
variable(t,real,temperature, 1.10000000000000e+0000)
inherited(heating,p:\tom\creativity\cmb\heatying\exchanger.scf)

table 1: an example of a container component derived from the parent component class "tank", with a new variable t of physical meaning temperature and default value 1.1, and with an inherited object heating of the exchanger class

the model structure can be derived from a uml description by the special mapping scheme described in the work [2]. the uncertainty model used is based on interval-valued fuzzy sets. as is known from many works of me, atanassov, mizumoto and tanaka, interval fuzzy sets enable the description of intervals of possible values and of fuzzy uncertainty and, because they are a special case of second-order fuzzy sets, also a rough approximation of probabilistic uncertainty. because the use of this universal description is computationally intensive and brings problems in cases of state variable integration and some functions, an alternative way, based on the dynamic choice of a representation from a given set of representations (fuzzy numbers, fuzzy linguistic variables, intervals, interval-valued fuzzy sets) on the basis of the input data uncertainty type and the operations used in the computed relations, will also be discussed in the next part. in the case of conceptual design it is impossible to eliminate uncertainty, e.g. by defuzzyfication, because the uncertainty distribution brings significant information about the sources of uncertainty – about wrongly determined system components, about contradictions in the designed system, etc.

2 methods applicable in the area of heterogeneous models of systems
within the conceptual design stage, the function of the designed system can be described not only in the form of algebraic equations, but also in the form of fuzzy rules [3]. similarly, we must consider the parameters and initial conditions of the simulation in at least three forms – crisp data, fuzzy numbers and fuzzy linguistic variables. thus we must solve four basic cases: crisp data – algebraic equations, crisp data – fuzzy rules, fuzzy data – fuzzy rules, fuzzy data – algebraic operations, and their combinations. we must also distinguish fuzzy data in the form of predefined
fuzzy linguistic variables and fuzzy data in the form of fuzzy numbers (which usually have a dynamic structure).

fig. 1: component editor editing a relation of a container component

operations \ data       crisp                                            fuzzy numbers                    fuzzy linguistic variables
algebraic operations    numerical mathematics                            operations with fuzzy numbers    aofulv method
rules                   fuzzyfication – fuzzy rules – defuzzyfication    similarity-based reasoning       fuzzy rules

table 2: basic methods applicable for the particular combinations of operation and data descriptions

table 2 represents the basic methods applicable for solving each combination; we can recognise six basic situations in it. the following points describe each method in more detail.
• numerical mathematics represents a sophisticated mathematical discipline applicable in the area of crisp (numerical) data and algebraically described operations. there are many information sources, e.g. [4].
• operations with fuzzy numbers are used in situations where the input data contain fuzzy uncertainty and are described as fuzzy numbers, while the operations are described algebraically (a minimal sketch follows this list). from the viewpoint of pedrycz's granulation [5], fuzzy numbers can in some applications be better than fuzzy linguistic variables due to their better approximation of uncertainty. analogously, algebraic operations with fuzzy numbers do not increase the uncertainty of the system description, in contrast with fuzzy rules; the increase of uncertainty in the case of fuzzy-rule use is analysed in the work [6]. the application of fuzzy numbers is possible only under particular limitations, especially the need to use convex fuzzy numbers. the use of non-convex fuzzy numbers is possible (e.g. they are used in the aofulv method), but it tends to lead to computationally intensive calculations.
• the aofulv method (algebraic operations with fuzzy linguistic variables) was developed for the calculation of algebraic operations with data in the form of fuzzy linguistic variables. the construction is as follows: the membership of the k-th linguistic value of the resulting variable r (after execution of the operation "⊛") is computed as the maximum, over the elements K(e1, …, em) of the cartesian product K, of the conjunction of the membership function of the k-th linguistic value of r, evaluated at the result of the partial operation, with the minimum of the element memberships:

μ_r^k = max over K(e1, …, em) ∈ K of min( f_r^k(⊛ f(K(e1, …, em))), min f(K(e1, …, em)) )   (1)

the symbol f(K(e1, …, em)) denotes the m-dimensional vector of membership functions which form the element of the cartesian product K = A1 × A2 × … × Am. the membership function of the e-th linguistic value of the k-th linguistic variable is understood there as a fuzzy number. the symbol "⊛" in (1) represents an algebraic function with fuzzy numbers (not only an elementary algebraic operation like "+", "−", "*", "/"). the method is described e.g. in [6]. more universal than fuzzy sets are interval-valued fuzzy sets and membership-interval fuzzy sets [7, 8].
• fuzzyfication → fuzzy rules → defuzzyfication: such an operation sequence represents an approach well known from many implementations in the areas of automatic control and fuzzy modelling. the approach applies a rule-based description of a system to crisp data calculations. along the way it uses two transformations between the crisp and fuzzy linguistic variable descriptions – fuzzyfication and defuzzyfication. we often speak about a fuzzy approximation of an (unknown) function; the possibility of this approach is proved by the fat theorem [9]. the problem is the selection of the optimal combination of fuzzyfication and defuzzyfication operations and of the method of rule-result membership calculation.
3 dynamic selection of the uncertainty description

it is possible to approach the problem of selecting the uncertainty description – and, following from it, the implementation of the operations – from two viewpoints. either our goal is to describe the influence of the uncertainty of the initial parameters and operations on the precision of the output parameters with maximal credibility (as is usual in the conceptual design field), or our goal is to choose the most effective method of working with uncertain information while preserving the basic information about the distribution of uncertainty (this situation occurs in the case of complex systems, when it is possible to eliminate the information about the uncertainty influence of the components that are less significant from the viewpoint of the actually solved design operations).

in the following text, the transcription x → y denotes that x is transformed during evaluation to y; capital letters denote variables; the symbol '_' is used for an undefined value; op(x1, …, xn, r) denotes an n-ary operation op; op(type(x), type(r)) means a unary operation op with an argument of the given type and value x and a result of the given type and value r. analogously, the form op(t1(x), tr(r)) represents a unary operation op with an argument of type t1 and value x and a result of type tr and value r. the particular uncertainty description types are denoted as in table 3:

crisp – crisp(x)
singleton – sgltn(x)
fuzzy number – fn(x)
fuzzy linguistic variable – flv(x)
mi fuzzy number – mifs(x)
mi linguistic variable – milv(x)

table 3: uncertainty representation naming

3.1 algebraically described operations

algebraically described operations represent the basic type of system behaviour description from the viewpoint of the presented system of conceptual design simulation support, so this type will be discussed in this chapter. the second type is the rule-based description; it will not be discussed here, because the solution is analogous.

3.2 solution of the uncertainty description type for the case where the uncertainty description of the operation result is not given

two basic cases arise, from the viewpoint of whether the uncertainty description type of the partial result is given or not, and from the method of operation evaluation.
the situation when the result uncertainty description type is given usually arises as a consequence of a user choice (e.g. the postulate of representation by pre-defined linguistic variables in place of fuzzy numbers, which must otherwise be interpreted and which are not suitable e.g. for the co-operation of the simulation system with an expert system).

3.3 transformations of argument types

unary operator: a unary operator is described by an ordered pair of parameters – a description of the value and uncertainty of the argument, and a description of the value and uncertainty of the result. if no concrete type of uncertainty description of the result is asked, the type of uncertainty of the result is the same as that of the argument:

op(n(x), _(r)) → op(n(x), n(r))    (2)

binary operator:

op(n(x1), n(x2), _(r)) → op(n(x1), n(x2), n(r))    (3)
op(fn(x1), crisp(x2), _(r)) → crisp(x2) → sgltn(x2), op(fn(x1), sgltn(x2), fn(r))    (4)
op(flv(x1), crisp(x2), _(r)) → crisp(x2) → sgltn(x2), op(fn(x1), sgltn(x2), fn(r))    (5)
op(flv(x1), fn(x2), _(r)) → op(flv(x1), fn(x2), fn(r))    (6)
op(mifs(x1), crisp(x2), _(r)) → crisp(x2) → sgltn(x2), op(mifs(x1), sgltn(x2), mifs(r))    (7)
op(mifs(x1), fn(x2), _(r)) → op(mifs(x1), fn(x2), mifs(r))    (8)
op(mifs(x1), flv(x2), _(r)) → op(mifs(x1), flv(x2), mifs(r))    (9)
op(milv(x1), crisp(x2), _(r)) → crisp(x2) → sgltn(x2), op(milv(x1), sgltn(x2), mifs(r))    (10)
op(milv(x1), fn(x2), _(r)) → op(milv(x1), fn(x2), mifs(r))    (11)
op(milv(x1), flv(x2), _(r)) → op(milv(x1), flv(x2), mifs(r))    (12)
op(milv(x1), mifs(x2), _(r)) → op(milv(x1), mifs(x2), mifs(r))    (13)

the transformations of the other argument description types and of higher-order operation results are analogous and, for brevity, will not be described in detail here.

3.4 transformation of a pre-defined type of result

it is possible to reduce this situation to the previous case, with a subsequent transformation of the uncertainty description type into the asked one; or it is possible to start from the asked type of uncertainty and search for the simplest sufficient method of solving. the first case can be described by transformation (14) for a unary operator and (15) for a binary one:

op(p(x), g(r)) → op(p(x), _(r)), _(r) → g(r)    (14)
op(p1(x1), p2(x2), g(r)) → op(p1(x1), p2(x2), _(r)), _(r) → g(r)    (15)

the search for an effective way of calculation in the case of an asked type of result uncertainty description is expressed by transformations (16) and (17); it is easy to add the other cases in an analogous way. in every case we search for the operation adequate to the most precise type of argument uncertainty representation, increased to the level of the result uncertainty representation type (if that is higher).

op(p1(x1), p2(x2), crisp(r)) → p1(x1) → crisp(x1), p2(x2) → crisp(x2), op(crisp(x1), crisp(x2), crisp(r))    (16)
op(crisp(x1), crisp(x2), t(r)) → op(crisp(x1), crisp(x2), crisp(r)), crisp(r) → t(r)    (17)
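the transformation rules (2)–(13) can be read as a small result-type inference table; the following sketch mechanises that reading. the precision ordering and the helper names are assumptions made for illustration, and only the binary cases written out above are covered.

# hedged sketch: result-type inference for a binary operation following the
# spirit of rules (2)-(13); representation tags follow table 3. the exact
# coverage and names are assumptions, not the published tool's code.
PRECISION = ["crisp", "sgltn", "fn", "flv", "mifs", "milv"]

# crisp arguments are first lifted to singletons when the other argument
# is fuzzy, as in rules (4), (5), (7) and (10)
def lift(tag, other):
    return "sgltn" if tag == "crisp" and other != "crisp" else tag

# the result representation is chosen by the more general argument; milv
# results are represented as mifs, as in rules (10)-(13)
def result_type(t1, t2):
    if t1 == t2 == "crisp":
        return "crisp"                        # rule (3)
    a, b = lift(t1, t2), lift(t2, t1)
    r = max(a, b, key=PRECISION.index)        # the more general tag wins
    if r in ("milv", "mifs"):
        return "mifs"
    return "fn"                               # fn/flv/sgltn cases yield fn

assert result_type("fn", "crisp") == "fn"     # rule (4)
assert result_type("milv", "flv") == "mifs"   # rule (12)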
3.5 the case of a composed operation

in the case of a composed operation, the type of the partial-result representation is not always defined; everything is given by the selected approach (maximal credibility, or the precision adequate to the asked type of uncertainty representation of the result of the whole composed operation). in the first case, the type of the partial-result uncertainty representation is selected by the method described in chapter 3.3. in the case of the effective-method search the following rule is valid: a more precise type of the partial result than the representation asked for the final result's uncertainty is never selected; if the arguments are of a less precise type, this less precise type is selected. the following order of representation types follows from the viewpoint of precision:

crisp – singleton – fuzzy number – mifs

table 4: the order of the uncertainty representation types from the viewpoint of precision

this choice, too, can be described by a proper grammar, e.g.:

op1(crisp(x1), mifs(x2), op2(milv(x3), _), fn(r)) → op1(crisp(x1), mifs(x2), op2(milv(x3), fn(_)), fn(r))    (18)

4 simulation model construction from the component description

many aspects of the component description and of the simulation model derivation are described in [1], so in this chapter, after a small recapitulation, only the novel approaches to model construction and simulation will be described. the first chapter of this paper brought an introduction to the component description and to the proper editor tool. because many relations in this model describe the value of the same variable, the simulation tool must select the optimal one on the basis of the variables whose values are already known (determined by the user or by a previous calculation). simulation and dimensioning of a technical system usually lead to situations when more than one relation must be used; this fact increases the combinatorial complexity of the task of deriving a simulation model from the component model of a device. the component model and the component description used here differ from standard models (e.g. in simulink), because a component model applicable in the field of conceptual design must be applicable in any possible use of the component: it must describe all component behaviours, although in a concrete use only a few of them will be relevant.

the first version of the mechanism for collecting the simulation model (the solving set of equations) was described in [1]. this method works similarly to the prolog language: it starts by selecting an equation determining the value of the asked variable (on the basis of the ratio of known and unknown attributes); then it tries recursively to determine its unknown attributes; if the way turns out to be wrong, it returns one step up and selects a different relation, and so on. this method is successful in the one-component case and in the case of encapsulated components without relations between them. in cases when the input of one encapsulated component depends on the output of another, a different solution method must be used: in such situations it is necessary to determine which further relations can be calculated and which further variables can be determined on the basis of the previous results. a sketch of the collecting algorithm follows at the end of this chapter.

at the beginning of a session the tool asks for the name of the main – root – component (the whole model can be understood as a container component). then it asks whether a set of equations for specialised simulation tools should be generated or whether the simulation will be provided by this tool; then it asks for the variables with known magnitudes and, at the end, for the asked variables. the tool then collects the solving set of equations and presents the results. the tool is also capable of using default values of variables, and in the future it will be completed by a mechanism of physical unit management (by now supported in the component editor – see fig. 1) and by the dynamic uncertainty description management presented in chapter 3. a part of the communication with the tool, using both methods, is sketched in fig. 2. the prototype of the conceptual design simulation support tool does not implement multiple uncertainty support and is used only for verification of the algorithm collecting the solving relation set.

c:\documents and settings\brandejsky\my documents\doc a gacr\exe
command line empty
enter root component (model) file name
..\cmb\gacr\x1.ccf
if you want to generate model enter 'g', otherwise anything else
z
i simulate model
entering known variables
enter single name, then you will be asked for magnitude insertion
empty line ends entering,
n_in
n_in:=12
total_transmission_ratio
total_transmission_ratio:=0.1
enter names of asked variables
n_out
["n_out"]
n_out = 1.2

fig. 2: communication with the conceptual design simulation support tool prototype
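the prolog-like collecting of the solving set of equations can be sketched as a recursive backtracking search; the representation of a relation as a pair (needed variables, determined variable) is an assumption for illustration, and the toy data mirrors the session of fig. 2.

# hedged sketch of the prolog-like collection of a solving set of equations:
# pick a relation that determines the asked variable, then recursively try
# to determine its unknown arguments, backtracking on failure.
def collect(asked, known, relations, plan=None):
    """relations: list of (needed_vars, determined_var) pairs.
    returns an ordered list of relations that determines `asked`, or None."""
    plan = [] if plan is None else plan
    if asked in known:
        return plan
    for needed, out in relations:
        if out != asked or (needed, out) in plan:
            continue                      # wrong target or already used
        trial, ok = list(plan) + [(needed, out)], True
        now_known = set(known)
        for v in needed:                  # recursively resolve unknowns
            sub = collect(v, now_known, relations, trial)
            if sub is None:
                ok = False
                break                     # dead end: step back, try another
            trial = sub
            now_known |= {r[1] for r in trial}
        if ok:
            return trial
    return None                           # no relation chain found

rels = [(("n_in", "total_transmission_ratio"), "n_out")]
print(collect("n_out", {"n_in", "total_transmission_ratio"}, rels))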
conclusion

on the background of a conceptual design simulation tool, the presented paper summarises the possibilities of the use of universal uncertainty management within a conceptual design tool, without the need for an implicit conversion to the most common possibilistic uncertainty representation. the method applicable to rule-based descriptions is not discussed, because the solution is analogous to the method for algebraic ones. the paper also describes a universal component-based representation, the equation-collecting algorithm, and the proper component and model editor tool. the tool is developed as a part of an intelligent conceptual design system.

acknowledgement

the presented work was supported by the czech research grant agency project, contract no. gacr 102/01/0763.

references

[1] brandejský, t.: "architecture of simulation tool for conceptual/functional design." in: kerckhoffs, e. j. h., snorek, m. (eds.): modelling and simulation 2001. 15th european simulation multiconference, june 6–9, 2001, prague. scs, san diego, ca, usa, 2001, p. 390–393.
[2] brandejský, t.: transformation of uml into bylander qualitative description of systems. research report 13/2002. ctu, faculty of transportation sciences, prague, 2002.
[3] brandejský, t.: "integrated fuzzy-numerical models for conceptual design." in: kerckhoffs, e. j. h., snorek, m. (eds.): modelling and simulation 2001. 15th european simulation multiconference, june 6–9, 2001, prague. scs, san diego, ca, usa, 2001, p. 387–389.
[4] törnig, w.: numerische mathematik für ingenieure und physiker. band i: numerische methoden der algebra. band ii: eigenwertprobleme und numerische methoden der analysis. berlin–heidelberg–new york: springer-verlag, 1979.
[5] pedrycz, w.: "granular computing for system modelling." in: szczerbicka, h. (ed.): modelling and simulation: a tool for the next millennium. scs, warsaw, poland, 1999, p. 391–394.
[6] brandejský, t., bíla, j., brož, k.: "fuzzy qualitative modelling of distributed energy and heat supply complex." in: szczerbicka, h. (ed.): modelling and simulation: a tool for the next millennium. scs, warsaw, poland, 1999, p. 391–394.
[7] brandejský, t.: "methods of building membership interval logic models." mendel 2000 conf., brno university of technology, 2000, p. 238–242.
[8] brandejský, t.: "membership interval fuzzy logic reasoning." proceedings of the east-west fuzzy colloquium 2000,
8th zittau fuzzy colloquium, september 6–8, 2000, p. 36–41.
[9] kosko, b.: "fuzzy systems as universal approximators." proceedings of the ieee international conference on fuzzy systems, ieee, 1992, p. 1153–1162.
[10] setnes, m.: fuzzy rule-base simplification using similarity measures. m.sc. thesis, delft university, 1995.

doc. dr. ing. tomáš brandejský
phone: +420 224 359 530
e-mail: brandejsky@fd.cvut.cz
department of telecommunications and informatics
czech technical university in prague
faculty of transportation sciences
konviktská 20
110 00 prague 1, czech republic

the role of cad in enterprise integration process
m. ota, i. jelínek
this article deals with the problem of the mutual influence between software systems used in an enterprise environment and enterprise integration processes. the position of cad data and cax systems in the integrated environment of manufacturing enterprises is clarified, and as a consequence the key role of the cax systems used in those companies is emphasized. it is noted that the integration of cad data is nowadays only on a secondary level, via primarily integrated pdm systems. this limitation is a reason why we are developing a unified communication model focused on product-oriented data. our approach is based on internet technologies, so we believe that it is independent enough. the proposed system of communication is based on a simple request-reply dialogue. the structure of this model is open and extensible, but we assume supervision supported by an internet portal.

keywords: cad, enterprise information systems, internet portals, pdm.

1 introduction

in recent times we have been witnessing a strong trend toward enterprise integration. this tendency results from the push for increasing efficiency of enterprises, which is a logical consequence of classic competition among companies. enterprise integration processes are related to the integration of enterprise software systems [1]. modern information technologies are being used increasingly by businesses. during the 1990s the situation changed with the introduction of internet-oriented technologies not only in the academic sphere, but also in commerce. this change meant that enterprises started to be physically joined by the computer network. users of enterprise software and, in particular, managers started to call for a logical link-up of territorially isolated software equipment. this has speeded up mutual communication among software systems and has led to integration. at the same time, the globalisation of world industry and the international economy initiated massive enterprise integration processes and intensive enterprise collaboration. the new methods of collaboration (both internal and external) are solidly based on communicating software systems [2]. thus in this period many standards, communication mechanisms and protocols have been developed, which effectively support software communication and formal integration [3].

the typical architecture of enterprise software systems is nowadays based on a complex enterprise information system (eis), which communicates with other software. the eis logically wraps all the enterprise software, and this shell is integrated on the information level. in present-day applications of internet mechanisms, the eis approach is applied not only to the communication layer, but also to the presentation layer. solutions based on this new approach are usually called portal solutions or internet portals. the phenomenon of portal solutions also enables more heterogeneity of hardware platforms [1]. this is very important, because the spectrum of eis users is widely extended, and the territorial boundaries of an eis are less sharp than in the past. a typical example is the field-worker who is using a mobile device (such as a mobile telephone) connected to the eis via the internet.

the portal solutions used in an enterprise environment can be classified into two basic groups: the first is the enterprise application portal (eap), the second the enterprise information portal (eip) [4]. the aim of eips is to provide information (and they are more widespread – see fig. 1), while eaps are gateways for application runs (and are usually combined with an eip – see fig. 2). however, this classification is disputable, because eips are also built on some application logic. the use of enterprise software applications via an internet portal is studied intensely in the field known as enterprise application integration (eai).

fig. 1: enterprise information portal
fig. 2: enterprise application portal

each manufacturing enterprise stores, manages and uses data about the products that it produces. this data is called product-oriented data; in particular, the data created by cad systems is product-oriented data. this data is specific, so the software systems that operate on it (known generally as cax systems, especially cad, cam and cae) must be perceived as special in the context of data sharing and information flow. these software systems play a major role in all technically oriented companies. cax software is intensive in computer power, so its integration into an eap does not yet have good results; however, content integration into the context of the information system is desirable. there are ways to solve this, but we believe that the present degree of integration is not sufficient, and we suppose there are other ways that could markedly improve the integration of cax systems. the products of a manufacturing company are at the centre of its concern, and all the cax data is just about the product. integration of the cax technologies, vitalized by the integration of enterprise processes, improves the support for teamwork and makes necessary information about the products available to non-cax users, such as managers, marketing workers, etc. cax systems must therefore be designed as open systems, in order to be able to integrate them into the enterprise information system. last but not least, there is the problem of using more than one cax system in the same area (e.g. two or more cad systems). no cax system is generally best, and many companies use a variety of cad systems for different parts of the design process, e.g. for sheet metal and for metal casting support. as a result of this specialisation there are heterogeneous subsets of product data for a single product, and there is no easy way to exchange related information between two different cad systems.

2 observation

interoperability and data connectivity of enterprise software equipment is an area where dynamic developments are taking place.
there are complex methodologies for data-oriented and data-intensive software integration, such as agendas, databases and data warehouses. however, cax systems are excluded from the focus of the primary software integration processes. since computer-aided design is widely used in the enterprise environment of manufacturing companies, the gap between the cad systems on the one hand and the rest of the company and its other software equipment on the other hand is a serious problem. relationships between cad and other cax software have been solved more or less satisfactorily (partly outside the wrap of the eis).

the field of product data management (pdm) tries to systematize and manage data about products, including cax data [5]. pdm systems manage the storage of product data, both attribute and documentary product data, as well as the relationships between them; this is usually implemented through a relational database system. data classification should be a fundamental capability of pdm systems. cad data is therefore secondarily integrated into the eis via a primarily integrated pdm system. this type of integration is less intensive and not so close, and the efficiency of the integration of cad data soon becomes weak: the binding between cad data and the eis is not always complete and fully up to date. although all production data, including cad data, is stored in the pdm (e.g. files with virtual 3d models), much information is not reachable (e.g. some characteristics and/or physical relations among parts of virtual 3d models).

a typical situation that illustrates the problem of managing this data is when a manager or a salesman needs to know some physical characteristic of a manufactured product. he is able to find the appropriate cad file via the eis – thanks to collaboration with the integrated pdm. even the desired characteristic is easy to obtain, but only in the cad system in which the model was created. however, managers are not able to use cad systems (and why should they have such skills?); they therefore rely on some other employee – a cad worker. although such pdm systems may try to store as much information as possible (including data extracted from the cad model), it is not feasible to cover every potential user query. moreover, there is widespread data redundancy in the present model, which implies many known problems. we are convinced that a better compromise between the complexity and the efficiency of retrieving information held in cad data can be found.

3 proposal

the main aim of our research is to improve the integration of product data into the enterprise information environment managed by the eis. we would like to move the integration of cax data from the secondary level of integration into the eis to the primary level. this does not mean any radical changes, such as eliminating pdm systems, but it involves making the information contained in cad data available. this information can be released by enabling communication between the cad software that is appropriate to a particular data record (i.e. has an operational relation to this record) and other software that needs to obtain this information and/or mediates it to the user. the model of an enterprise environment that we propose integrates cax systems more tightly into the eis.
the organisation and the architecture of the eis and its integration with other software remain the same. the pdm system is also used in accordance with the former model, and even specific pdm software can be retained: the pdm system is used for classifying and managing the data storage, but the set of redundant information obtained from cax data and stored in the pdm is minimized in our model (it is reasonable to store only frequently queried information). other product-oriented queries are solved in collaboration with an appropriate cax system. obviously, there is a need for a new mechanism for querying the information contained in cax data, and this is the only way in which our model differs from the present-day one. the current state and the proposed state of communication between cad systems and enterprise software are compared in figs. 3 and 4. communication between two different cad systems follows the same scenario as communication between a cad system and other software – there is usually more than one cad system in use in a real enterprise; there is only a higher probability that the cad systems share a part of the spectrum of supported export/import file formats.

fig. 3 shows the present-day approach: communication between a cad system and another program. the need for user assistance is quite common, so it is indicated in the figure. if the figure were to describe full communication, there would be a complete graph, and each program would have to implement communication mechanisms for each cad system and for each other program. this heterogeneity is the reason why only a low degree of automatic connectivity is achieved in present-day practice. thus there are various types of data or information flow channels in present-day communication models, sometimes even a human being (for this reason there is a dashed line in the figure). the circle shows the imaginary boundary of software integration into the eis (systems wrapped by the eis are connected by a solid line, because first-level integration is applied at this point).

fig. 4 illustrates the proposed model. each program that would communicate with some cad system (i.e., obtain information from cad data) should be extended by an interface for using the communication model; a sketch of what such an interface might look like is given below. this extension can be customised – not only by a cad system producer, but also by the individual who implements the complex software solution of an enterprise. cad systems must therefore be open/extensible, as noted above. there is only one interface for all product-data-oriented communication among all the participating software. if some program does not implement the interface of the communication mechanism and retains the architecture shown in fig. 3, this software will not be excluded from communication, because it will still be allowed to use the former mechanism. this is significant, because it means that the movement from the present model to the proposed model can be performed stepwise. fig. 4 describes communication among all the shown programs and cad systems, unlike the structure presented in fig. 3 – the graph of the software connections is much simpler due to the unified communication channel. the circle showing the integration boundary is dashed, signifying that this boundary no longer has any relevant sense. the text above indicates that the described problem of communication is especially related to cad; however, if any cax system implements the interface of the proposed model, it can fully use the new communication approach and can provide data for it.

fig. 3: the currently used model (the figure shows communication between two systems only; there should be similar communication among all programs able to communicate with cad)
fig. 4: the proposed model
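to give the single interface of fig. 4 a concrete shape, the following is a minimal sketch of a per-program adapter; the class name, method names and reply shapes are illustrative assumptions only, not the published api of the model.

# hedged sketch: the per-program adapter (the "int" boxes of fig. 4) through
# which any program queries product data; all names are illustrative only.
from abc import ABC, abstractmethod

class ProductDataInterface(ABC):
    """one interface for all product-data-oriented communication."""

    @abstractmethod
    def query_property(self, model_ref: str, prop: str) -> dict:
        """e.g. query_property("part-0042", "weight")"""

    @abstractmethod
    def list_items(self, model_ref: str, category: str) -> list:
        """e.g. list_items("part-0042", "conversion-formats")"""

class CadSystemAdapter(ProductDataInterface):
    """wraps one concrete cad system; the real cad api calls are omitted."""

    def query_property(self, model_ref, prop):
        # a real adapter would open the model in its cad system and read
        # the property; here we only show the assumed reply shape
        return {"type": "result", "value": "12.5 kg"}

    def list_items(self, model_ref, category):
        return ["iges", "step", "vrml"]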
4 model of communication

the proposed model of communication relies on the ubiquity of the internet in the business environment. it is a simple communication protocol based on tcp/ip protocols, such as http; only a clear structure and the scenario of interaction are defined. a real system of communication is based on a system of requests and replies. requests are product-oriented queries, of two basic types:

- property (e.g., the weight of a part, or whether the iges conversion format is supported),
- list of available items (e.g., get the supported conversion formats).

the categories of queries can be classified into several classes, such as physics (e.g., get weight or get material info), presentation (e.g., get vrml presentation), conversion (e.g., convert to step or get supported conversion formats), and others. replies can be classified into three basic types (a sketch of this classification is given at the end of this section):

- result (e.g., 12.5 kg, or a vrml file),
- exception (e.g., 2d model without weight info),
- error (e.g., an appropriate cad system for solving the query was not found).

the level of independence of this communication model is high – the only requirement is the existence of the tcp/ip protocols. there is no need for a component technology such as corba, dcom, j2ee or .net, which is a problem in some enterprise communication systems [6, 7]; our model can be used in a very heterogeneous environment – at both the software and the hardware level. a disadvantage of this communication model is the need to have the appropriate cad software installed on at least one computer connected to the eis, because this software is needed for solving the queries. however, if the pdm system in the present-day approach stores some cad files, this software must also be installed, because otherwise the stored data is not useful; thus this disadvantage is debatable.

it is evident that the draft of the communication protocol can neither encompass each field of the cad application sphere, nor fully cover each field. thus the model has to remain an open and extensible system. as a consequence the model is never definitively complete, but this also makes it applicable to any evolution in computer-aided design and/or in the sphere of eis. however, every open and/or extensible system is problematic, because its evolution tends to be unsystematic and parallel: such a system is often not useful, because each user demands incompatible extensions of the model. for this reason we intend to supervise extensions to the system. the first precaution is the definition of exact rules for extending the model, as these rules are themselves an (extensible) part of the system. the second basic element of supervision is designed in the form of an internet portal of the communication model (of the eip type – see fig. 1). this portal solution allows users to make a simple and unambiguous extension automatically or semi-automatically. it will also be able to collect requests, suggestions, comments, etc., from users. these inputs will be evaluated and processed, and the results will be released by the portal (perhaps in the form of an extension of the model).

compatibility among users will be ensured by the system of questions (e.g., is the iges conversion format supported?) or by results in the form of exceptions (e.g., get weight → 2d model without weight info). a versioning system will also be helpful.
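as an illustration of the request–reply dialogue and of the three reply types listed above, the following sketch classifies the outcome of a product-oriented query as a result, an exception or an error; the message layout and all names are assumptions for illustration.

# hedged sketch of the request-reply dialogue: a query either yields a
# result, an exception (query understood, answer not applicable), or an
# error (no appropriate cad system found). layout is an assumption.
def solve_query(query, cad_registry):
    handler = cad_registry.get(query.get("model"))
    if handler is None:
        return {"type": "error",
                "info": "appropriate cad system for query solving not found"}
    try:
        return {"type": "result", "value": handler(query["ask"])}
    except KeyError:
        return {"type": "exception", "info": "2d model without weight info"}

# toy registry: one 3d model with a weight, one 2d drawing without it
registry = {
    "part-3d": lambda ask: {"weight": "12.5 kg"}[ask],
    "part-2d": lambda ask: {}[ask],
}
print(solve_query({"model": "part-3d", "ask": "weight"}, registry))
print(solve_query({"model": "part-2d", "ask": "weight"}, registry))
print(solve_query({"model": "unknown", "ask": "weight"}, registry))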
5 state of our research and future work

we have completed the basic design of the communication model, as described above. a part of our model is a draft of its lifecycle supported by an internet portal: we want to keep this model extensible and open, but with internet-powered supervision. the system of supervision is now being tested. this experimental phase is based on microsoft .net technology, but the system of support is planned to be independent of this platform for its users, so it should be no problem to change this background of the lifecycle of our model. we are considering using web services technology [8] as an addition to the classic portal solution.

the first problem that has still not been fully satisfactorily solved is that we have to harmonize our proposed model with existing standards, such as step – iso 10303 [9]. our model cannot ignore world standardisation processes, but it should remain simple, effective and usable. the second problem concerns the efficient use of computer resources. if each cad system remains in operation, communication will not be problematic; however, if the software is invoked and stopped for each query, computer power is wasted in the cyclic starting and exiting of the systems (fig. 5). in the case of two (or more) queries applied to a single cad model, the problem becomes considerable. if the communication model is to be effective in query-intensive applications, this problem will have to be solved.

we will now start experiments on the proposed model. in the first trial phase we plan to use autodesk cad systems and .net technologies (asp.net, c# and vb(a)) to simulate the enterprise information system environment. other cad systems and other eis platforms (such as java and j2ee) will be included in the experiments in subsequent phases of our laboratory experiment.

fig. 5: the usual plan for solving an information query to the cad data file (cad system start – particular data file open – problem solve – particular data file close – cad system exit, along the time axis)

6 conclusion

present-day methods for integrating cax software are very limited, because there is a great deal of unreachable information in cad data (for non-cad-system users). our research focuses on the problem of communication among enterprise software, between cad systems and other programs. as a result we are developing a unified communication model focusing on the product data of manufacturing enterprises. we have a formal draft of this model, and we are harmonizing it with the existing standards used in the product-data-oriented sphere. we are aware of the computer-power intensity problem, and we are now working on solving it.

acknowledgment

this research has been supported by gačr grant no. 102/01/0723.

references

[1] ota, m.: "the influence of enterprise integration processes on software systems." acta polytechnica, prague: accepted.
[2] ota, m., jelínek, i.: "simulated enterprise information system (seis)." proceedings of workshop 2003. prague: ctu, 2003, vol. a, p. 108.
[3] ota, m.: "what is covered in the core of information systems." (in czech). it-system, ccb, brno: accepted, in print 2003, no. 6.
[4] řepa, v.: "business processes and evolution of information systems." (in czech).
systémová integrace, čssi, prague: 2002, no. 2, p. 7.
[5] pdm information centre: http://www.pdmic.com/, as of april 2003.
[6] michalčík, l., jelínek, i.: "advanced technologies and standards in the area of integration of cad and is." proceedings of workshop 1999. prague: ctu, 1999, p. 89.
[7] bohms, m., van der waal, j.: "combining step, java and mapping gives user control of data." epm technology, the express way, no. 2, 1998.
[8] kačmář, d.: we are programming the .net applications in visual studio .net. (in czech). praha: computer press, 2001.
[9] al-timini, k., mackrell, j.: step – towards open systems. step fundamentals & business benefits. cim data, 1986.

martin ota
fax: +420 224 923 325
e-mail: otam@fel.cvut.cz
prof. dr. ivan jelínek
dept. of computer science and engineering
czech technical university
faculty of electric engineering
karlovo nám. 13
121 35 praha 2, czech republic

modeling by petri nets
h. kubátová

one specific model of a digital system in different types of petri nets is presented. the formal definitions of the basic (black-and-white) petri net, a place/transition net (p/t net), an arc-constant coloured petri net (ac-cpn) and a coloured petri net (cpn) are presented and explained on the basis of this example. real models of dining philosophers, a producer-consumer system and railway tracks are described.

keywords: petri nets, formal models, hardware, digital design, field programmable gate array (fpga), pnml, vhdl, finite state machine (fsm)

1 introduction and motivation

petri nets (pn) are a well-established mechanism for system modeling. they are a mathematically defined formal model and can be applied to a large variety of systems. pn-based models have been widely used due to their ease of understanding, their declarative, logic-based and modular modeling principles, and finally because they can be represented graphically. since petri nets began to be exploited in the 1960s, many different types of models have been introduced and used. the most popular models are presented in this paper by their definitions and by specific models; their main advantages are shown and the differences between them are mentioned.
petri-net-based models have been used in our research on digital design methodology: the design of processor or control system architectures with special properties (e.g. fault-tolerant or fault-secure), hardware-software co-design, computer network architectures, etc. this has led to the development of pn models in petri net design tools (design/cpn [1], jarp [2], cpn tools [3]) and to the analysis and simulation of the models using these tools. after this high-level design has been developed and validated, it becomes possible, through automatic translation to a vhdl description, to employ an fpga implementation that enables a custom device to be rapidly prototyped and tested (an asic implementation is also possible). an fpga version of a digital circuit is likely to be slower than the equivalent asic version, due to the regularly structured fpga wiring channels compared to the asic custom logic. however, the ease of custom design changes, the possibility of easy fpga reconfiguration, and the relatively easy manipulation make fpgas a very good final implementation base for experiments.

most models used in the hardware design process are equivalent to the finite state machine (fsm) [4, 5, 6, 7]. it is said that the resulting hardware must be deterministic, but we have found real models that are not equivalent to an fsm, and their real behavior was tested on the final fpga design kit platform [8]. therefore we have concentrated on models with really concurrent actions and with various types of dependencies (mutual exclusion, parallel, scheduled), and have studied their hardware implementation.

fig. 1: design methodology block diagram, with dark parts corresponding to the possible use of petri nets

petri nets are a good platform and tool in the "multiple-level" design process, see fig. 1. they can serve as a specification language on all levels of specification, and as a formal verification tool throughout these specification and architectural description levels. the first problem to be solved during the design process is the construction of a good model, which will enable the specification and the further handling and verification of the different levels of this design. therefore this paper presents such model constructions on the basis of a number of simple examples.

2 petri net definitions and examples

petri nets can be introduced in many ways, according to their numerous features and various applications. this text will focus on the basic principles and on the modeling of actions. in this section, formal definitions of place/transition nets and coloured petri nets are given; they have been presented in many books and publications, and the definitions presented here are taken from [9]. many attempts have been made to define the principles of the basic types of petri nets.
the way chosen here involves a brief introduction to the basic principles and to the hierarchical construction of the most complicated and widely used petri-net-based models used in professional software tools. the essential features of petri nets are the principles of duality, locality, concurrency, and graphical and algebraic representation. these notions will be presented on a simple model of a handshake used by printers communicating with a control unit that transmits data according to the handshake scheme. the control unit uses the control signal strobe to signal "data valid" to the target units – the printers (receivers). the printers signal "data is printing" to the control unit by ack signals. after the falling edge of the strobe signal, all printers must react by the falling edges of their ack signals to obtain the next portion of data (e.g., a byte). our petri net will model the cooperation of only two printers a and b with one control unit c, see fig. 2. the following essential conditions and actions have been identified:

- list of conditions:
p1: control unit c has a byte prepared for printing
p2: control unit c is waiting for the ack signals
p3: control unit c is sending a byte and a strobe signal to printer a
p4: printer a is ready to print
p5: printer a is printing a byte
p6: printer a sends the ack signal
p7: control unit c sends strobe = 0 to a
p8: control unit c is sending a byte and a strobe signal to printer b
p9: printer b is ready to print
p10: printer b is printing a byte
p11: printer b sends the ack signal
p12: control unit c sends strobe = 0 to b

- list of actions:
t1: control unit c sends strobe = 1
t2: control unit c sends strobe = 0
t3: printer a sends ack = 1
t4: printer a sends ack = 0
t5: printer b sends ack = 1
t6: printer b sends ack = 0

fig. 2: the petri net model of two printers working in parallel

separating, or identifying, passive elements (such as conditions) and active elements (such as actions) is a very important step in the design of systems. this duality is strongly supported by petri nets. whether an object is seen as active or passive may depend on the context or on the point of view of the system, but it is always necessary to construct a correct petri net model according to definitions 1–5; basically, the arcs must connect only places with transitions, or vice versa (a petri net is a bipartite graph). the following principles belong to the essential features of petri nets that express locality and concurrency:

- the principle of duality for petri nets: there are two disjoint sets of elements: p-elements (places) and t-elements (transitions). entities of the real world interpreted as passive elements are represented by p-elements (conditions, places, resources, waiting pools, channels, etc.); entities of the real world interpreted as active elements are represented by t-elements (events, transitions, actions, executions of statements, transmissions of messages, etc.).
� the principle of graphical representation for petri nets: p-elements are represented by rounded graphical elements (circles, ellipses, …), t-elements are represented by edged graphical symbols (rectangles, bars, …). arcs connect each t-element with its locality, which is a set of p-elements. additionally, there may be inscriptions such as names, tokens, expressions, guards. � the principle of algebraic representation for petri nets: for each graphical representation there is an algebraic representation containing equivalent information. it contains the set of places, transitions and arcs, and additional information such as inscriptions. in contrast to concurrency, there is the notion of conflict. some transitions can fire independently (e.g. t4 and t6 in fig. 2, only tokens must be inside the input places), but there can be petri nets that model mutual exclusion, see fig. 3. concurrent transitions behave independently and should not have any impact on each other. sometimes this can depend on the state of the net – these transitions can behave independently. situations that show the sophisticated interaction of concurrency and conflicts are called confu© czech technical university publishing house http://ctn.cvut.cz/ap/ 7 czech technical university in prague acta polytechnica vol. 45 no. 2/2005 fig. 3.: initial state of the petri net from fig. 2 a) b) fig. 4: concurrency of t3 and t5 transitions a) after t1 firing both t3 and t5 are enabled, b) after t3 firing t5 still remains enabled sions [9], [10]. building hierarchies by abstraction or refinement is an important technique in system design. pn supports such approaches by abstraction techniques that are inherently compatible with the structure of the model, [9]. definition 1: a net is a triple n = (p, t, f) where � p is a set of places � t is a set of transitions, disjoint from p, and � f is a flow relation f p t t p� � �( ) ( )� for the set of arcs. if p and t are finite, the net is said to be finite. the state of the net is represented by tokens in places. the tokens distributions in places are called markings. the holding of a condition (which is represented by a place) is represented by a token in the corresponding place. in our example, in the initial state control system c is prepared to send data (a token in place p1), printers a and b are ready to print (token s in places p4 and p9), see fig. 3. a state change or marking change can be performed by firing a transition. a transition “may occur” or “is activated” or “is enabled” or “can fire” if all its input places are marked by a token. transition firing (the occurrence of a transition) means that all tokens are removed from the input places and are added to the output places. the transitions can fire concurrently (simultaneously – independently, e.g. t3 and t5 in fig. 4, or in conflict, see fig. 5). the arc in the sense of definition 1 can be only simple – only one token can be transmitted (removed or added) from or to places by transition firings. place/transition nets are nets in the sense of definition 1, together with a definition of arc weights. this can be seen as an abstraction obtained from more powerful coloured petri nets by removing the individuality of the tokens, see below. the example derived from the petri net from fig. 2 is shown in fig. 6. here more (two) printers are expressed only by two tokens in one place p4. the condition “all printers are ready” expressed by two tokens in place p4 and fulfilled by multiply edge from place p4 to transition t3. 
definition 2: a place/transition net (p/t net) is defined as a tuple n_pt = (p, t, pre, post) where
- p is a finite set (the set of places of n_pt),
- t is a finite set (the set of transitions of n_pt), disjoint from p, and
- pre, post ∈ ℕ^(|p|×|t|) are matrices (the backward and forward incidence matrices of n_pt).

c = post − pre is called the incidence matrix of n_pt. the set of arcs is f := {(p, t) | pre[p, t] ≠ 0} ∪ {(t, p) | post[p, t] ≠ 0}. this interpretation leads to an alternative definition, which is closer to the graphical representation.

definition 3: a place/transition net (p/t net) is defined as a tuple n_pt = (p, t, f, w), where
- (p, t, f) is a net (see definition 1) with finite sets p and t, and
- w : f → ℕ \ {0} is a function (the weight function).

n_pt together with an initial marking m0 is called a p/t net system s = ⟨n_pt, m0⟩ or s = ⟨p, t, f, w, m0⟩. for a net system s = ⟨n_pt, m0⟩, the set rs(s) := {m | ∃w ∈ t*: m0 [w⟩ m}, where t* is the set of transition sequences, is the reachability set. fs(s) := {w ∈ t* | ∃m: m0 [w⟩ m} is the set of occurrence-transition sequences (the firing-sequence set) of s. it is sometimes convenient to define the set occ(s) of occurrence sequences as the set of all sequences of the form m0, t1, m1, t2, m2, …, tn, mn such that mi [t(i+1)⟩ m(i+1) for i ∈ {0, …, n−1}.
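the matrix form of definition 2 gives the successor marking directly as m' = m − pre[·, t] + post[·, t] whenever m ≥ pre[·, t] componentwise; a small sketch, with the weight-2 arc from p4 to t3 of fig. 6 in mind (the concrete numbers are assumptions for illustration):

# hedged sketch: p/t-net firing via the incidence matrices of definition 2.
# rows = places (p3, p4, p5, p6), columns = one transition t3; the weight-2
# entry models the multiple arc from p4 to t3 in fig. 6.
pre  = [[1], [2], [0], [0]]       # t3 consumes p3 and two tokens from p4
post = [[0], [0], [1], [1]]       # t3 produces into p5 and p6
m    = [1, 2, 0, 0]               # marking: one token in p3, two in p4

t = 0                             # column index of t3
if all(m[p] >= pre[p][t] for p in range(len(m))):            # enabledness
    m = [m[p] - pre[p][t] + post[p][t] for p in range(len(m))]
print(m)                          # -> [0, 0, 1, 1]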
the tokens in figs. 2–5 are not distinguished from each other; the tokens representing printers a and b are distinguished by their places p4 and p9. a more compact and more natural way is to represent them in one place p4&p9 by individual tokens a and b. distinguishable tokens are said to be coloured; colours can be thought of as data types. for each place p a colour set cd(p) is defined; in our case cd(p4&p9) = {a, b}. for a coloured net we have to specify the colours and, for all places and transitions, the particular colour sets (colour domains). since arc inscriptions may contain different elements or multiple copies of an element, multisets (bags) are used. a bag over a non-empty set a is a function bg: a → ℕ, sometimes denoted as a formal sum Σ_{a∈a} bg(a)·a. the set operations of sum and difference are extended to bag(a) in the usual way [9].

definition 4: an arc-constant coloured petri net (ac-cpn) is defined as a tuple n_ac = (p, t, c, cd, pre, post) where
- p is a finite set (the set of places of n_ac),
- t is a finite set (the set of transitions of n_ac), disjoint from p,
- c is the set of colour classes,
- cd: p → c is the colour domain mapping, and
- pre, post ∈ b^(|p|×|t|) are matrices (the backward and forward incidence matrices of n_ac) such that pre[p, t] ∈ bag(cd(p)) and post[p, t] ∈ bag(cd(p)) for each (p, t) ∈ p × t.

c = post − pre is called the incidence matrix of n_ac. in this definition b is taken as the set bag(a), where a is the union of all colour sets from c. the difference operator in c = post − pre is a formal one here, i.e. the difference is not computed as a value. a marking is a vector m such that m[p] ∈ bag(cd(p)) for each p ∈ p. the reachability set, firing sequences, net systems and occurrence sequences have the same meaning as for p/t nets.

the construction of coloured petri nets (cpn) is discussed in the following examples and figures, derived from our original model of the parallel printers. the arc-constant cpn in fig. 7 is simply derived from the initial example, with the same meaning of all places and transitions. the places p4 and p9 (and p5 and p10), originally used for distinguishing the two printers, are here connected ("folded") into one place named p4&p9 (p5&p10). for a transition t it is necessary to indicate which of the individual tokens should be removed (with respect to its input places); this is done by the inscriptions on the corresponding arcs in fig. 7. transition t3 can fire if there is an object a in place p4&p9 (and an indistinguishable token in place p3). when it fires, token a is removed from place p4&p9 and added to place p5&p10, and an (indistinguishable) token is added to p6. places p4&p9 and p5&p10 have the colour domain printers = {a, b}, denoting printer a and printer b. the control process is modeled by the token s (strobe). the colour domains are represented by lower-case italics near the place symbols in fig. 7. places p3, p6, p7, p8, p11 and p12 are assumed to hold an indistinguishable token and therefore have the colour domain token = {•}, which is assumed to hold by default.

fig. 7: arc-constant cpn (colour sets: control = {s}, printers = {a, b}; constants: s, a, b)

the net from fig. 2 (an ordinary, black-and-white pn) and the net from fig. 7 (a coloured pn) contain the same information and have similar behavior; only two places are "safe". this cpn is called arc-constant because the inscriptions on the arcs are constants, not variables.

the next step is to simplify the graph structure of the ac-cpn. we will represent the messages "strobe signal sent to printer a" (sta) and "strobe signal sent to printer b" (stb), the ack signal sent from printer a (acka) and the ack signal sent from printer b (ackb). we can connect the places p3 and p8, p6 and p11, and p7 and p12; in fig. 8 they are named by the first name of the connected places. the behaviour of the net is the same. as a new feature of this net, transition t2 has to remove both signals acka and ackb from place p6; the expression acka + ackb denotes the multiset {acka, ackb}. transition t2 is thus enabled only if both acka and ackb are in place p6, and by its firing both tokens are removed. therefore, in the general case, bags (multisets) are used instead of sets.

fig. 8: arc-constant cpn without three places (colour sets: control = {s}, printers = {a, b}, ack = {acka, ackb}, strobe = {sta, stb}; constants: s, a, b, acka, ackb, sta, stb)

the transition firing rule for an arc-constant cpn can be expressed as follows: all input places must contain at least as many individual tokens as specified by the corresponding arcs; the firing of the transition means that these tokens are removed and tokens are added to the output places as indicated by the arc inscriptions. the firing rule for an ac-cpn is sketched in fig. 9 (a small sketch of this rule with multisets is given below).

fig. 9: firing rule for ac-cpn

in a coloured petri net the incidence matrices cannot be defined over b = bag(a) as for arc-constant cpns: the different modes, or bindings, of a transition have to be represented. these are called colours and are denoted by cd(t); the colour domain mapping cd is therefore extended from p to p ∪ t. in the entries of the incidence matrices a multiset has to be specified for each transition colour. this is formalized by a mapping from cd(t) into the bags over cd(p) for each (p, t) ∈ p × t.
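the ac-cpn firing rule above manipulates bags of coloured tokens rather than token counts; in a sketch, python's collections.Counter can stand in for bag(a). the arc inscription below follows the spirit of fig. 8 (t2 removing acka + ackb), with the remaining details assumed for illustration.

# hedged sketch of the ac-cpn firing rule with multisets (bags) of coloured
# tokens; Counter plays the role of bag(a). the arc inscription follows
# fig. 8: t2 must remove both acka and ackb from place p6.
from collections import Counter

marking = {"p6": Counter({"acka": 1, "ackb": 1}), "p7": Counter()}
pre_t2  = {"p6": Counter({"acka": 1, "ackb": 1})}   # inscription acka + ackb
post_t2 = {"p7": Counter({"s": 1})}                 # control token s (assumed)

def enabled(pre, m):
    # bag inclusion: each input place holds at least the inscribed tokens
    return all(m[p][c] >= n for p, need in pre.items() for c, n in need.items())

def fire(pre, post, m):
    assert enabled(pre, m)
    for p, bag in pre.items():
        m[p] -= bag                                 # remove the token bag
    for p, bag in post.items():
        m[p] += bag                                 # add the token bag

fire(pre_t2, post_t2, marking)
print(marking["p6"], marking["p7"])   # -> Counter() Counter({'s': 1})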
our example expressed as a cpn is shown in fig. 11. the number of places and transitions corresponds to the p/t net in fig. 6, but the expressive power is greater. for each transition a finite set of variables is defined, which is strictly local to this transition. these variables have types, or colour domains, which are usually the colours of the places connected to the transition. in fig. 11 the set of variables of transition t3 is {x, y}; the types of x and y are dom(x) = printers and dom(y) = ack, respectively. an assignment of values to variables is called a binding. not all possible bindings can be allowed for a correctly behaving net; the appropriate restriction is defined by a predicate at the transition, which is called a guard. the occurrence (firing) rule is now as follows (see fig. 10, where all places have the colour set cd(p) = objects = {a, b, c}, and the colour domain of all variables is also objects; a sketch of this rule follows after definition 5):

1. select a binding such that the guard holds (associate with each variable a value of its colour), fig. 10b;
2. temporarily replace the variables by the associated constants, fig. 10c;
3. apply the firing rule of the ac-cpn from fig. 9, as shown in fig. 10d (remove all appropriate tokens from the input places and add tokens to the output places according to the arc inscriptions).

the firing rule should be understood as a single step from fig. 10a to fig. 10d. if the binding x = a, y = b, z = c is selected, then the transition is not enabled in this binding, since the guard is not satisfied. the selection of a binding is local to a transition.

fig. 10: firing rule for cpn (a binding such that the guard holds is selected, e.g. x = a, y = b, z = b)
fig. 11: cpn model of the printers (colour sets: control = {s}, printers = {a, b}, ack = {acka, ackb}, strobe = {st}; variables: x, y, z, st; constants: acka, ackb, sta, stb)

definition 5: a coloured petri net (cpn) is defined by a tuple n_cpn = (p, t, c, cd, pre, post) where
- p is a finite set (the set of places of n_cpn),
- t is a finite set (the set of transitions of n_cpn), disjoint from p,
- c is the set of colour classes,
- cd: p ∪ t → c is the colour domain mapping, and
- pre, post ∈ b^(|p|×|t|) are matrices (the backward and forward incidence matrices of n_cpn) such that pre[p, t]: cd(t) → bag(cd(p)) and post[p, t]: cd(t) → bag(cd(p)) are mappings for each pair (p, t) ∈ p × t.

here b can be taken as the set of mappings of the form f: cd(t) → bag(cd(p)). c = post − pre is called the incidence matrix. the mapping pre[p, t]: cd(t) → bag(cd(p)) defines, for each colour (occurrence mode) β ∈ cd(t) of the transition t, a bag pre[p, t](β) ∈ bag(cd(p)) denoting the token bag to be removed from p when t occurs (fires) in colour β. in a similar way, post[p, t](β) specifies the bag to be added to p when t occurs (fires) in colour β. the overall effect of the action performed by the transition firing is given by a tuple corresponding to the arcs connected with t. the colours of the transition can be seen as particular subsets of tuples, cd(t) ⊆ bag(cd(p1)) × … × bag(cd(p_|p|)), i.e., vectors having an entry for each place – but cd(t) can be an arbitrary set as well. effective representations of this set are necessary.
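the three-step occurrence rule (select a binding satisfying the guard, substitute, fire as in the ac-cpn case) can be sketched as a search over bindings; the net fragment and the guard x ≠ y below are assumptions in the flavour of fig. 10, not the net of fig. 11.

# hedged sketch of the cpn firing rule: enumerate bindings of the local
# variables, keep those satisfying the guard, then fire as in the ac-cpn
# case. the guard "x != y" and the colour set are illustrative assumptions.
from itertools import product

objects = ["a", "b", "c"]
marking = {"p_in1": ["a"], "p_in2": ["b", "c"], "p_out": []}

def bindings(guard):
    """step 1: all assignments of colours to x, y for which the guard holds"""
    return [(x, y) for x, y in product(objects, objects) if guard(x, y)]

def fire(x, y, m):
    """steps 2+3: substitute the binding into the arc inscriptions and fire"""
    if x in m["p_in1"] and y in m["p_in2"]:
        m["p_in1"].remove(x)
        m["p_in2"].remove(y)
        m["p_out"].append((x, y))
        return True
    return False                     # not enabled in this binding

for x, y in bindings(lambda x, y: x != y):
    if fire(x, y, marking):
        break
print(marking)   # e.g. {'p_in1': [], 'p_in2': ['c'], 'p_out': [('a', 'b')]}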
the mappings pre[p, t] and post[p, t] can be denoted by vectors, projections, functions and terms with functions and variables.

3 experiments with hardware implementation

we performed several experiments with a direct implementation of petri net models in hardware (fpga). the results were presented in [14], [15] and [16]; these models are briefly described here. they were constructed in software tools (design/cpn or the jarp editor), and from these tools their unified description in the pnml language [11], [12] was directly transformed into the fpga bitstream.

we have modelled 5 philosophers who are dining together, fig. 12. the philosophers each have two forks next to them, both of which they need in order to eat. as there are only five forks, it is not possible for all 5 philosophers to be eating at the same time. the petri net shown here models a philosopher who takes both forks simultaneously, thus preventing the situation where some philosophers may have only one fork but are not able to pick up the second fork because their neighbours have already done so. a token in a fork place (places p1, p2, …, p5) means that this fork is free. a token in an eat place (places p6, p7, …, p10) means that this philosopher is eating.

fig. 12: the dining philosophers pn model

we also performed experiments with a “producer-consumer” system, fig. 13. our fpga implementation used 59 clb blocks and 47 flip-flops, with a maximum working frequency of 24.4 mhz. the maximum input capacity parameter for places (the size of the counter) was set to the value 3. the average buffer occupation during 120 cycles (transition firings) was 1.43, [13], [14].

fig. 13: producer – consumer model

our real application experiment modelled a railway with one common critical part – a rail, see fig. 14. the pn model, fig. 15, has the initial marking where tokens are in places “1t” and “3t” (two trains are on rails 1 and 3, respectively), “4f” (the critical rail is free) and “2f” (rail 2 is free). this model has eight places, two places t (train) and f (free) for each rail: a token in the first place means that the train is on this rail (t-places), and a token in the second means that this rail is free (f-places). this was described and simulated in the design/cpn system and then implemented in a real fpga design kit (promox, [15]).

fig. 14: railway semaphore model
fig. 15: the pn model of 4 rails

conclusions

this paper deals with the practical use of petri nets and modeling by petri nets. different levels and types, and practical and concrete styles of modeling, are presented on the basis of a simple and clear example. the practical results obtained for specific fpga implementations have been published and can be found in [14], [15], [16]. the specific petri net models are shown here. the example presented here, in which parallel printers are served by a controlling process, was chosen due to its practical presentation and practical iterative construction during the teaching process at the department of computer science and engineering (dcse) of the czech technical university in prague. future work will involve optimizing the direct implementation of petri nets with respect to space, time, power and reliability.

acknowledgment

this research was in part supported by grant 102/04/0737 of the czech grant agency (gačr) and by the msm 212300014 research program.
references

[1] http://www.daimi.au.dk/designcpn/
[2] home page, 2002: http://jarp.sourceforge.net/
[3] http://wiki.daimi.au.dk/cpntools/cpntools.wiki
[4] adamski, m.: “a rigorous design methodology for reprogrammable logic controllers.” proc. desdes ’01, zielona gora, poland, 2001, p. 53–60.
[5] erhard, w., reinsch, a., schober, t.: “modeling and verification of sequential control path using petri nets.” proc. desdes ’01, zielona gora, poland, 2001, p. 41–46.
[6] gomes, l., barros, j.-p.: “using hierarchical structuring mechanism with petri nets for pld based system design.” proc. desdes ’01, zielona gora, poland, 2001, p. 47–52.
[7] uzam, m., avci, m., kürsat, m.: “digital hardware implementation of petri nets based specification: direct translation from safe automation petri nets to circuit elements.” proc. desdes ’01, zielona gora, poland, 2001, p. 25–33.
[8] projects (2003): http://service.felk.cvut.cz/courses/36apz/archives/2002-2003/w/prj/36apz124/
[9] girault, c., valk, r.: petri nets for systems engineering. berlin, heidelberg: springer-verlag, 2003, 607 p.
[10] češka, m.: petriho sítě. brno: akademické nakladatelství cerm, 1994, 95 p. (in czech)
[11] humboldt-universität berlin: http://www.informatik.hu-berlin.de/top/pnml
[12] kindler, e. (ed.): “definition, implementation and application of a standard interchange format for petri nets.” proceedings of the workshop of the satellite event of the 25th international conf. on application and theory of petri nets, bologna, italy, june 2004, 85 p.
[13] koblížek, m.: “hardware implementation of petri nets.” diploma thesis, ctu, prague, 2002, 51 p. (in czech)
[14] kubátová, h.: “direct implementation of petri net based model in fpga.” proceedings of the international workshop on discrete-event system design – desdes ’04. zielona gora: university of zielona gora, 2004, p. 31–36.
[15] kubátová, h.: “petri net models in hardware.” ecms 2003, technical university, liberec, 2003, p. 158–162.
[16] kubátová, h.: “direct hardware implementation of petri net based models.” proceedings of the work in progress session of euromicro 2003, linz: j. kepler university – faw, 2003, p. 56–57.

ing. hana kubátová, csc.
e-mail: kubatova@fel.cvut.cz
department of computer science and engineering
czech technical university in prague
faculty of electrical engineering
karlovo nám. 13
121 35 prague, czech republic

an integrated decision making model for evaluation of concept design

g. green, g. mamtani

the conceptual design phase generates various design concepts and these are then evaluated in order to identify the ‘best’ concept. identifying the best concept is important because much of the product life cycle cost is decided in this phase. various evaluation techniques are performed so as to aid decision-making. different criteria are weighted against concepts for the comparison. this paper describes the research being carried out at the university of glasgow on design evaluation. it presents the application of fuzzy logic for design evaluation and proposes an integrated decision-making model for design evaluation. this is a part of a research project that aims at developing a computer tool for the evaluation process to aid decision-making.

keywords: concept design, design evaluation, fuzzy logic.

1 introduction

a typical product passes through the life cycle shown in fig. 1. the decreased product life cycle in the present industrial scenario has led to more focus on product design. this is the result of intense competition among companies. to sustain this competition, enterprises have to pay more attention to the conceptual design phase. the conceptual design phase is one of the important phases of total design [1], and it is the phase where 70 % of the product life cycle cost is decided [2]. the conceptual design phase consists of generating design concepts and evaluating those concepts to identify the best one out of them. the evaluation process comes into the picture for selecting the best concept. a number of concepts can be generated, depending on the experience of the designers and their know-how of the component or the product, but the number of concepts generated is still limited. hence, the best one out of these is, in general, only a local optimum.
the ‘best’ is a pseudo-term in the sense that one never knows how many more concepts could have been generated, which means that the global optimum may or may not be achieved [3]; the local optimum may or may not coincide with it. for this to happen:

1. one has to generate a large number of concepts.
2. the evaluation process for selecting the concept has to be effective.

if the right concept is not selected, more cost and time than necessary is spent on its production, and it may transpire after the product rolls out that the cost has exceeded the limits. to avoid such a situation, it is wise to take care in the conceptual design phase when selecting the concept. the concepts are weighed and measured against criteria, and various procedures have been put forward for the final decision. one of them is the application of fuzzy logic as a tool to evaluate design concepts against criteria. the next section revisits the evaluation procedure, which is followed by a description of some design evaluation methods, i.e. the method of controlled convergence and the application of fuzzy logic [4]. a model is then proposed for the computational evaluation of design concepts. this is an integrated model for decision-making and is intended to enhance the capability of novice designers.

2 design evaluation

the importance of design evaluation has been recognised over the past few years. initially, it was applied considering a holistic approach to the product, or taking a total product view [1]. bjarnemo [3] suggested a classification of general evaluation methods and proposed an integrated evaluation procedure. green [5] used a combination of various models that leads to an integrated evaluation of concepts for the evaluation process. evaluation has been defined as “the activity of judging between and selecting from a range of competing design options” [5]. design evaluation is a decision-making process whereby all the concepts are enlisted and evaluated against different criteria. it plays an important role in the current scenario, as various enterprises are keen on introducing new products within a short span of time. they want to make sure that the product they introduce is the best in terms of every criterion, e.g., reliability, manufacturability and cost. this fact has led to paying more attention to the evaluation process. table 1 shows a matrix generally used for evaluation purposes.
fig. 1: product life cycle (need from the market → generation of product design specification → concept design → embodiment design → detailed design → manufacturing → sold to the market)

the matrix in table 1 was generated for the evaluation of concepts for the design of a stretcher cum wheelchair (shown in fig. 2). the values in the matrix are the scores provided by members of the product development team. the number of concepts generated is 3, and there are 6 criteria.

table 1: concept/criteria evaluation matrix
criteria        | concept 1 | concept 2 | concept 3
portable        | 6         | 7         | 8
safe & reliable | 6         | 6         | 7
easy to use     | 7         | 7         | 7
flexible        | 7         | 8         | 8
good aesthetics | 8         | 7         | 8
good ergonomics | 8         | 7         | 8

fig. 2: stretcher cum wheelchair

3 method of controlled convergence

for selecting the right concept, the method of controlled convergence is an effective evaluation tool. alternate convergent and divergent thinking forms the basis of this method. the reasoning is followed by a reduction in the number of concepts, and then new concepts are generated. this alternate reduction and generation of concepts is followed until the final concept to be considered is arrived at. initially, a product design specification is generated, which in turn forms the basis for the generation of various concepts. these concepts are then compared using an evaluation matrix, leading to a reduction in the number of concepts. after this, the concept generation process is reapplied and new concepts are added. again, a concept comparison and re-reduction process is run through, so as to filter the concepts or aid the final selection. this reduction and generation of concepts is followed until the best concept is finally selected. fig. 3 depicts this model.

fig. 3: method of controlled convergence (after pugh): generation of product design specification (pds) → generation of concepts → reduction in the number of concepts → reapplication of the concept generation process, repeated until the final concept is arrived at

4 fuzzy logic and its application to design evaluation

4.1 fuzzy and crisp sets

fuzzy sets are used when the information available is fuzzy or vague. for example, when we say that a car has a high mileage, we are not sure about the exact mileage the car has. this is a fuzzy expression. there may be several cars with a high mileage, but the degree to which they belong to this set of cars (i.e. high-mileage cars) is variable. as such, each element of a fuzzy set belongs to it with a certain degree of membership. thus, a fuzzy statement is not either true or false, but may be partly true or partly false to some extent. crisp sets or classical sets are special cases of fuzzy sets where the degree of membership of an element is either 0 or 1. they are based on the logic that uses one of two values: true or false. for example, we may say that all cars with a mileage of more than 10 miles/litre are considered to have a high mileage and, below this, a ‘not high’ mileage. this means that a car can have either a high mileage or a ‘not high’ mileage.

a crisp set a of universe x is defined by the characteristic function f_a(x):

f_a(x): x → {0, 1},  f_a(x) = 1 if x ∈ a, and f_a(x) = 0 if x ∉ a.  (1)

a fuzzy set a of universe x is defined by the membership function m_a(x):

m_a(x): x → [0, 1],  (2)

where (a) m_a(x) = 1 if x is totally in a, (b) m_a(x) = 0 if x is not in a, and (c) 0 < m_a(x) < 1 if x is partly in a. in cases (a) and (b) above, the fuzzy set reduces to a crisp set.
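equations (1) and (2) are easy to illustrate. in the short sketch below — ours, with the mileage thresholds invented for the example above — the crisp set is a step function, while the fuzzy set grades membership continuously:

```python
def crisp_high_mileage(mpl):
    # characteristic function, eq. (1): either in the set or not
    return 1 if mpl > 10 else 0

def fuzzy_high_mileage(mpl, low=8.0, high=14.0):
    # membership function, eq. (2): a ramp from 0 to 1 (break-points are illustrative)
    if mpl <= low:
        return 0.0
    if mpl >= high:
        return 1.0
    return (mpl - low) / (high - low)

for mpl in (7, 10, 11, 15):  # miles/litre
    print(mpl, crisp_high_mileage(mpl), round(fuzzy_high_mileage(mpl), 2))
```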
4.2 linguistic values and fuzzy rules

linguistic values are used to describe the fuzziness of a situation. for example, when we say that a person is a quick runner, the linguistic value is quick. it does not actually say how quick the person is, and it is therefore fuzzy in this respect. some other examples of linguistic values are tall, short, good, high, better, etc. these values are used to define fuzzy rules, such as: “if the distance is more, the time taken is also more” or “if the speed is high, the time taken is less”. fuzzy rules, in turn, are the expert rules that drive the conclusion. these must be prepared in advance and stored in the database. the rules are generated with the help of experts, heuristics, various books, journals and databases.

4.3 fuzzy set operations

let a and b be two fuzzy sets and x be the universe of discourse. the following are a few common operations applied on fuzzy sets:

complement: m_¬a(x) = 1 − m_a(x),  (3)
union: m_(a∪b)(x) = max[m_a(x), m_b(x)],  (4)
intersection: m_(a∩b)(x) = min[m_a(x), m_b(x)].  (5)

these operations are useful when fuzzy rules are applied. there are combinations of fuzzy rules that use the above operations for the generation of the output.

4.4 fuzzy logic inference

the fuzzy inference technique discussed here is the mamdani-style inference [6], which is the most commonly used inference. this inference comprises the following steps:

1. fuzzification: input of crisp values and allocating them to the fuzzy sets they belong to, thereby determining their membership values.
2. rule application: the fuzzy rules are then applied on the fuzzified inputs. these fuzzy rules make use of fuzzy operators (and or or) and arrive at the solution.
3. output generation: the solutions arrived at after the rule applications provide the unified outputs, which are combined into a single output fuzzy set.
4. defuzzification: the output value arrived at is finally defuzzified so as to get the final crisp result.

4.5 application of fuzzy logic to evaluation

referring to table 1 of the evaluation matrix, fuzzy logic is applied to fill the matrix. this means that in place of the scores allocated to the matrix, fuzzy linguistic values are now allocated [7]. table 2 shows the same example with fuzzy values in the matrix. this is the same example of the design of a stretcher cum wheelchair; the number of concepts generated is 3, and there are 6 criteria. the inference discussed in the previous section is used for decoding this matrix and for getting the final crisp output. the final defuzzified result gives a value for each of the above 3 concepts, and the one with the largest value is finally selected.

5 proposal of an integrated decision making model for evaluation

as seen in the previous sections, fuzzy linguistic values are allocated by human beings. these fuzzified values are used for fuzzy inference to get the solution. such a model is shown in fig. 4. this model requires a lot of intervention by humans, and the result depends a lot on the experience of the designer. if the designer is experienced, the chance of getting a good solution is increased because of the allocation of appropriate linguistic values. but if the designer is a novice, the chances are that he may not be able to provide appropriate fuzzy variables and that the solution is affected by this. hence, a model is sought to take care of such cases. in this model, the score values are provided by computation.
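before turning to the proposed model, the mamdani procedure of section 4.4, applied to the linguistic matrix of table 2 (reproduced below), can be sketched in a few lines. everything in this sketch is our illustration, not the authors' tool: the score universe, the triangular membership functions and the choice of a label's firing strength as its share of the criteria are all assumptions.

```python
import numpy as np
from collections import Counter

x = np.linspace(0.0, 10.0, 201)  # crisp score universe

def tri(a, b, c):
    # triangular membership function (shape and break-points assumed)
    return np.clip(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0, 1.0)

# assumed fuzzy sets for the linguistic scale used in table 2
terms = {"good": tri(4, 6, 8), "very good": tri(6, 8, 10), "excellent": tri(8, 10, 12)}

def evaluate(labels):
    # mamdani-style: firing strength of each label = share of criteria using it,
    # implication clips the set (min), aggregation is the union of eq. (4),
    # and the centroid gives the defuzzified crisp result
    agg = np.zeros_like(x)
    for label, n in Counter(labels).items():
        agg = np.maximum(agg, np.minimum(terms[label], n / len(labels)))
    return float((agg * x).sum() / agg.sum())

concept_1 = ["good", "good", "very good", "very good", "excellent", "excellent"]
concept_3 = ["excellent", "very good", "very good", "excellent", "excellent", "excellent"]
print(round(evaluate(concept_1), 2), round(evaluate(concept_3), 2))  # the larger value wins
```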
table 2: use of fuzzy values for weighting criteria
criteria        | concept 1 | concept 2 | concept 3
portable        | good      | very good | excellent
safe & reliable | good      | good      | very good
easy to use     | very good | very good | very good
flexible        | very good | excellent | excellent
good aesthetics | excellent | very good | excellent
good ergonomics | excellent | very good | excellent

this may be done by calculating the criteria in question to get the score values, so as to fill their respective positions in the evaluation matrix. this computational model is depicted in fig. 5 for a concept evaluation with 2 criteria only. for such a model, rules have to be discovered and generated for the calculation of the various criteria used in different systems. this means that if the criteria are, for example, reliability, manufacturability and ease of assembly, a specific product shall be considered for which the calculations are to be done. a questionnaire has been prepared and sent to various scottish industries involved in design, to establish the practical interest of industry in such a proposed model. it contains various questions on the importance of the evaluation activity within their company and their inclination towards computational evaluation. the results will be reported elsewhere in due course.

6 conclusion

the previous sections have shown the importance of evaluation procedures and the requirements of various models to enhance the capability of novice designers. they have also shown how the application of fuzzy logic to evaluation helps in the selection of concepts. it is particularly helpful in the conceptual design phase, which lacks information content. the model proposed is helpful in the sense that it will compensate for the lower experience of novice designers. it will, in due course, be subjected to experimental testing to determine its validity. future work involves evaluating concepts with this model; the criteria considered will be reliability, manufacturability and ease of assembly. initially, it will be tested with a specific product, and then work will be done towards a generalisation of the model for products with more or less common attributes.

references

[1] pugh, s.: total design: integrated methods for successful product engineering. workingham, 1991.
[2] nevins, j. l. et al.: concurrent design of product and processes. daniel e. whitney, new york, 1989.
[3] bjarnemo, r.: towards a computer implementable evaluation procedure for the mechanical engineering design process. lund, 1994.
[4] zadeh, l. a.: “fuzzy sets”. information and control, vol. 8 (1965), p. 338–353.
[5] green, g.: “towards an integrated design evaluation (ide) tool”. acta polytechnica, vol. 40 (2000), p. 50–56.
[6] negnevitsky, m.: artificial intelligence – a guide to intelligent systems. harlow, 2001.
[7] wang, j.: “a fuzzy outranking method for conceptual design evaluation”. international journal of production research, vol. 35 (1997), p. 995–1010.

dr. graham green
e-mail: g.green@mech.gla.ac.uk
mr. girish mamtani
mechanical engineering department
james watt building
university of glasgow
glasgow, g12 8qq, scotland, uk

fig. 4: fuzzy logic application process (generation of pds → generation of concepts → allocation of linguistic values for evaluation → fuzzy logic application → crisp result)
fig. 5: proposed computational model (generation of pds → generation of concepts → computation of criterion 1 and criterion 2 → crisp result)

computational fluid dynamic simulation (cfd) and experimental study on wing-external store aerodynamic interference of a subsonic fighter aircraft

tholudin mat lazim, shabudin mat, huong yu saint

the main objective of the present work is to study the effect of an external store on a subsonic fighter aircraft. generally, most modern fighter aircraft are designed with an external store installation. in this study, a subsonic fighter aircraft model has been manufactured using a computer numerical control machine for the purpose of studying the effect of the aerodynamic interference of the external store on the flow around the aircraft wing. a computational fluid dynamic (cfd) simulation was also carried out on the same configuration. both the cfd and the wind tunnel testing were carried out at a reynolds number of 1.86×10^5, to ensure that the aerodynamic characteristics can certify that the aircraft will not face any difficulties in its stability and controllability. both the experiments and the simulation were carried out at the same reynolds number in order to verify each other. in the cfd simulation, a commercial cfd code was used to simulate the interference and aerodynamic characteristics of the model. subsequently, the model together with an external store was tested in a low speed wind tunnel with a test section sized 0.45 m × 0.45 m. measured and computed results for the two-dimensional pressure distribution were satisfactorily comparable; there is only a 19 % deviation between the pressure distribution measured in the wind tunnel testing and the result predicted by the cfd. the results show that the effect of the external store is only significant on the lower surface of the wing and almost negligible on the upper surface. aerodynamic interference due to the external store was most evident on the lower surface of the wing and almost negligible on the upper surface at a low angle of attack. in addition, the area of the wing surface influenced by the store interference increased as the airspeed increased.

keywords: computational fluid dynamic (cfd), wind tunnel testing, cfd validation, aerodynamic interference.

1 introduction

fighter aircraft are mostly designed to carry stores such as a launcher or an external tank under the wing. when these stores are installed, the flow on surrounding components such as the control surfaces can be considerably changed. this may introduce several aerodynamic interference characteristics, such as changes in aerodynamic force, an increase in turbulence and possibly flow separation. these phenomena may have an adverse effect on other aircraft components, such as the horizontal tail and the vertical stabilizer, and consequently may affect the controllability and stability of the aircraft. research on external store installation is complex and extensive. it covers several research areas, such as aerodynamics, structure, flutter, physical integration, trajectory prediction, aircraft performance, stability analysis and several other engineering disciplines. however, the focus of this work was to study the aerodynamic interference, particularly the change in the aerodynamic characteristics. the aerodynamic characteristics are a prerequisite for the other analyses, since the aerodynamic data are required for subsequent aircraft structural analysis, stability analysis, performance analysis and store trajectory analysis. investigations of the aerodynamic characteristics in an external store clearance program usually involve a complex flow field study with multi-component interferences. flow of such a nature is usually investigated through wind tunnel testing and empirical methods. the main objective of this study was to identify the interference effect on a subsonic fighter aircraft currently used by the royal malaysian air force in the presence of an external store installation. a generic model of one of the subsonic fighter aircraft used by the royal malaysian air force was chosen for the study. wind tunnel testing and computational fluid dynamics (cfd) simulation were conducted to investigate these interference effects. a low speed wind tunnel with a working section of 0.45 m × 0.45 m was used to conduct the experiments, and commercial cfd software was used for the simulation. other milestones in this study include the verification and validation process and an assessment of the suitability of applying a commercial cfd code for predicting the wing and external store aerodynamic interference effects.

2 simulation and experimental works

the methodology adopted to conduct the study consists of several steps. the first and foremost was to obtain the digitized wing section geometry. the digitization process was done using photomodeller software. the next step was to construct a scale model of the wing based on the digitized wing geometry using a computer numerical control (cnc) machine. then several series of experiments were carried out on the scale model in the wind tunnel at a low speed of approximately 22.8 m/s. the digitized wing geometry was also used in the cfd simulation.
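the tunnel speed of 22.8 m/s and the quoted reynolds number of 1.86×10^5 are consistent for a model-scale chord. as a quick sanity check — our sketch; the chord value and the air properties are assumptions, not values from the paper — the reynolds number can be estimated as follows:

```python
def reynolds(velocity_m_s, chord_m, nu_m2_s=1.5e-5):
    # re = v * c / nu, with nu the kinematic viscosity of air (assumed sea-level value)
    return velocity_m_s * chord_m / nu_m2_s

# an assumed mean chord of ~0.12 m reproduces the order of magnitude quoted
print(f"{reynolds(22.8, 0.12):.3g}")  # ≈ 1.8e+05
```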
gambit preprocessor software was used to produce the necessary mesh. the setup was then simulated using fluent 5 cfd software, and the cfd simulation was carried out with various physical models, numerical algorithms, a discretization method and boundary conditions. in the final step, the study was wrapped up by comparing the computed and measured results in a further investigation of the nature of the interference effect of the wing and external store configuration.

3 aircraft wing external geometry digitization

digitization of the wing geometry was vital in order to obtain an adequate aircraft model geometry representing the real aircraft. photomodeler 3.0 software was used to capture and digitize the aircraft wing external geometry. the software processed the images of 84 photographs taken at various angles of the aircraft wing. these photos were taken using a digital camera. a number of points were marked on the aircraft wing and the adjacent fuselage part using masking tape, as shown in fig. 1. the size of the markers was designed to ensure clear and sharp visibility in photographs taken from a certain distance; this was determined using the relationship between the number of pixels and the distance from the camera. the placement and location of the markers were determined based on the profile of the wing, with denser markers in the high-curvature areas. fig. 1 also shows part of the total of 84 photographs used to generate the wing profile and some part of the fuselage. the output from the digitization process was a set of coordinates conforming to the wing geometry, as shown in fig. 2.
most of the coordinates were on the wing surface and the wingtip pylon. unfortunately, the wing geometry image was not of high quality in terms of accuracy and perfection; therefore, cad software was used to smooth the image. after minor adjustments were made, the image became as in fig. 3.

fig. 1: photographs of the marked wing surface at various projections
fig. 2: aircraft geometry produced by photomodeller
fig. 3: digitized wing geometry smoothed with the use of cad

wind tunnel testing

a wing model is required for wind tunnel testing; therefore, a 20 % scale wing model of the fighter aircraft was fabricated with the use of a cnc machine. the model was made from a single solid piece of aluminium alloy with nine conduits each on the upper and lower surfaces. fig. 4 shows the semi-span model of a generic fighter aircraft taken from the digitized geometry produced by photomodeller. the figure indicates three main parts of the wing: the root section, the mid section and the tip section. since the external store was installed on the mid section, it was decided to fabricate only this section. furthermore, it was not feasible to test the full wing model, due to the limitation of the size of the wind tunnel test section. the model has three main chordwise stations for the pressure measurement study, parallel to each other and placed at equal distances. every station was equipped with static pressure-tapping points on the upper and lower surfaces. besides the mid wing model, a 1/5 scale model of a launcher and a pylon as the external stores were also fabricated. these external stores were designed in such a way that they can easily be secured to and removed from the wing section. fig. 5 shows the complete assembly of the aircraft wing together with the external stores inside the test section.

fig. 4: semi-span model of a generic fighter aircraft
fig. 5: model installation inside the wind tunnel

the test was conducted using two different configurations of the wing model. the first configuration was without the external store, while the second configuration was with the external store installed. both configurations were tested at zero angle of attack at two different speeds: 22 m/s and 27 m/s. in this study, the wing model was tested in an open suction type low speed wind tunnel with a working section of 0.45 m × 0.45 m. the pressure measurement was carried out using a multi-channel manometer. the wing model was designed in such a way that three static pressure holes, one at each of the three spanwise stations, share a single tube; hence, during the experiment the pressure was taken on a station-by-station basis, and the remaining 2 pressure holes were closed using a thin tape. to avoid a larger flow separation between the model and the wall at a higher angle of attack, the test with the store installed was carried out at zero angle of attack only.

computational fluid dynamic simulation

in the cfd simulation, the mid wing was simulated under two different conditions. in the first condition, the mid wing was meshed into 111 239 elements, while in the second condition it was meshed into 221 112 elements, as shown in figs. 6a and 6b. the flow was simulated at a speed of 22 m/s; the flow was incompressible and was considered laminar. fig. 6c shows the simulation for the wing with the external store, with 122 158 elements.

fig. 6: cfd model surface meshes – (a) mesh for wing in tunnel, 111 239 elements; (b) mesh for wing in tunnel, 221 112 elements; (c) mesh for wing and store in tunnel, 122 158 elements
fig. 7 below shows the simulation results for the mid wing generated with the two different meshes. this implies that the simulation at around 120 000 elements was acceptable.

results

wind tunnel testing results

after a series of experiments had been conducted, the pressure distribution at mid span for the upper and lower surfaces was plotted, as shown in fig. 8. from these results it was found that, at station 1, the difference in the pressure coefficient due to the external store installation is only 3 % on the upper surface of the wing, compared to the lower surface. there are substantial differences in the pressure coefficient on the lower surface with the external store configuration. stations 2 and 3 indicate the same phenomenon: there is a small difference in the pressure distribution on the upper surface, while the lower surface shows some reduction in the pressure distribution. these experimental results give an initial indication that the flow on the upper surface is not severely affected by the external store configuration, in contrast to the lower surface.

computational fluid dynamics results

fig. 9 shows the results of the cfd simulation on the upper and lower surfaces of the wing. at station 3, the pressure coefficient is almost constant and unchanged from the leading edge to the trailing edge on the upper surface, and its value does not change very much with the external store installation. this shows that the external store did not affect the flow on the upper surface. in contrast, the pressure coefficient shows a significant change on the lower surface with the external store compared to the clean wing configuration. this result shows that the external store only affected the lower surface. the same phenomenon also occurs at station 1, where the coefficient of pressure is unchanged on the upper surface but there is some reduction in the pressure coefficient distribution on the lower surface. the results at station 2 are the same.

fig. 7: pressure distributions at the mid span for the upper and lower surface (cp vs. x/c; fine grid)
fig. 8: pressure distribution at various chordwise locations surrounding the store (experiment; stations 1 and 2, clean and store configurations)

5 analysis and discussion

comparison between cfd and experimental works

the study shows that the values of the pressure coefficient predicted by the simulation compare well with those obtained in the experimental study. there is an average difference of about 19 % between the two values.
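the quoted 19 % figure is an average relative deviation between the measured and the simulated pressure coefficients. a minimal sketch of such a comparison (ours; the sample cp values are placeholders, not the paper's data):

```python
import numpy as np

def mean_percent_deviation(measured, simulated):
    # average relative deviation between two cp distributions, in percent
    measured, simulated = np.asarray(measured), np.asarray(simulated)
    return 100.0 * np.mean(np.abs(measured - simulated) / np.abs(measured))

cp_exp = [-0.42, -0.35, -0.20, 0.05, 0.18]  # placeholder values at x/c stations
cp_cfd = [-0.50, -0.30, -0.24, 0.06, 0.15]
print(f"{mean_percent_deviation(cp_exp, cp_cfd):.1f} %")
```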
in the experimental study, problems during the setup of the experiment, such as misalignment in determining the angle of attack, the accuracy of the model, blockage effects and the wind tunnel calibration, can significantly influence the result. though the wing was machined accurately by computer numerical control (cnc), there was still some doubt about the accuracy of the model. moreover, the fluid level of the manometer used for measuring the pressure fluctuated constantly by 2 to 3 mm.

discussion

in the study we observed that the external store configuration only affects the lower surface of the wing. fig. 10 shows the pressure distribution at the quarter chord point along the spanwise direction from the tip to the root. on the upper surface, the pressure distribution is almost constant in the spanwise direction, from the tip to the root of the wing. the pressure coefficient on the lower surface was reduced by 40 % compared to the upper surface. with the external store installed, the pressure distribution on the lower surface was increased by around 20 % compared to the clean wing configuration. there was a sudden increase in the pressure distribution at the 0.2 spanwise location, where the external store was mounted to the wing.

fig. 9: pressure coefficient at three different spanwise locations (cfd)
fig. 10: external store interference on the pressure distribution (spanwise cp at quarter chord, tip to root)

fig. 11 shows the simulated aerodynamic force coefficients at zero angle of attack, a reynolds number of 1.86 × 10^5 and a mach number of 0.067. this figure shows the simulated values of the lift coefficient and the drag coefficient for the wing alone and for the wing with the store configuration. it should be noted here that the external store decreased the total coefficient of lift from 0.2528 to 0.1767 and increased the coefficient of drag from 0.0146 to 0.0318. this shows that the store installation influenced the drag coefficient much more than the lift coefficient. in this initial stage of the research, the wind tunnel test was carried out at a speed of 22.8 m/s, corresponding to a reynolds number of 1.86 × 10^5, which is low for a fighter aircraft. however, the test was meant to validate the cfd simulation, which was performed on the wing and store configuration at the same reynolds number. in order to perform an experimental and simulation study as close as possible to the real situation, the reynolds number similarity has to be increased to an order of several million. this will be performed in the next phase of the study in a bigger wind tunnel facility with a working section of 2 m × 1.5 m.

6 conclusion

in conclusion, the main objective of this project was achieved.
it has been shown that cfd simulation is an important tool for investigating the influence of store interference characteristics for a subsonic fighter aircraft. the static pressure measured around the wing was about 19 % higher than the simulated values. the results show that the flow over the upper surface of the wing was not much affected when the pylon and launcher were installed. the study also shows that the flow over the lower surface was much more affected by the presence of the external store. the methods for determining the influence of stores on the wing will be used to simulate the full size of this fighter aircraft with a real-size external store. the ongoing project is to compare the full cfd model of this fighter aircraft with the wind tunnel model tested in a bigger wind tunnel facility with a working section of 2 m × 1.5 m at universiti teknologi malaysia.

fig. 11: simulated aerodynamic forces, α = 0, re = 1.86×10^5, m = 0.067
            |           wing            |        wing-store
            | pressure | viscous    | total  | pressure | viscous    | total
lift coeff. | 0.2528   | negligible | 0.2528 | 0.1767   | negligible | 0.1767
drag coeff. | 0.0128   | 0.0018     | 0.0146 | 0.0295   | 0.0024     | 0.0318

tholudin mat lazim, ph.d., e-mail: tholudin@fkm.utm.my
shabudin mat, m.sc., e-mail: shabudin@fkm.utm.my
department of aeronautics & automotive
faculty of mechanical engineering
universiti teknologi malaysia
81310 utm skudai, johor, malaysia

huong yu saint, m.eng.
royal malaysian air force
wisma pertahanan, jalan padang tembak
50634 kuala lumpur, malaysia

comparative study and empirical modelling of pulverized coconut shell, periwinkle shell and palm kernel shell as pozzolans in concrete

efe ewaen ikponmwosa, samuel onosedeba ehikhuenmen, karieren kate irene
university of lagos, faculty of engineering, department of civil and environmental engineering, sapara road, 100213, akoka, lagos, nigeria
corresponding author: sehikhuenmen@unilag.edu.ng
(acta polytechnica 59(6):560–572, 2019, doi: 10.14311/ap.2019.59.0560, available online at https://ojs.cvut.cz/ojs/index.php/ap)

abstract. infrastructural development across the world is fast growing due to the rapid growth in population. this has consequently created an increase in the demand for construction materials, especially cement. this study presents a report of a concise investigation into the pozzolanic potential of pulverized coconut shell (pcs), pulverized periwinkle shell (pps) and pulverized palm kernel shell (ppks). the chemical composition of the pcs, pps and ppks, the physical properties of the concrete constituents and the mechanical properties of the pozzolan-blended concrete were determined. a concrete mix ratio of 1 : 2 : 4 with a w/c of 0.6 was adopted. a total of 270 cubes and 120 cylinders were cast at intervals of 10 % from 0 % to 50 % replacement level and cured in water for 7, 14, 21 and 28 days. the results indicate that the pcs, pps and ppks are good retarders, as they increased the setting time of the cement paste and decreased the workability of the concrete as the percentage replacement increased. the addition of the pcs in the mix produced a concrete of lower density, while the addition of the pps and ppks produced a concrete of a higher density up to a 30 % replacement level; a further increase resulted in a decrease in density. the compressive and tensile strengths increased with the curing age but decreased with an increasing percentage replacement level for the three pozzolans investigated. however, a 10 % replacement level of the pcs, pps and ppks in concrete is suitable for the production of pozzolan-blended concrete for structural works. the models developed are in a good agreement with the experimental results.

keywords: pulverized coconut shell (pcs), pulverized periwinkle shell (pps), pulverized palm kernel shell (ppks), structural strengths, empirical models.

1. introduction

infrastructural development across the world is fast rising, and this is consequently creating an increase in the demand for construction materials. concrete is listed among the most extensively used construction materials worldwide, and its production is daily on the increase to service the demand for infrastructure development. cement, which is the main binder in the production of concrete, is very expensive, particularly in developing countries [1]. besides its high cost, its production is associated with very high temperatures, with an attendant emission of poisonous gases such as co2, no and no2, as well as a depletion of natural resources such as limestone [2]. more so, due to increasing industrial and agricultural activities, tonnes of waste materials such as steel slag, palm kernel, rice husk, sawdust, groundnut husk, periwinkle shell, coconut shell, etc., are deposited in the environment without any effective method of waste management and recycling [3]. some of these deposits are not easily decomposed, and their accumulation has brought about various forms of environmental challenges; hence there is a need to reuse them in order to minimize their negative effects on the environment.

the american society for testing and materials (astm) [4] defines pozzolans as siliceous or aluminous materials which possess little or no cementitious properties but will, in finely divided form and in the presence of moisture, react with lime [ca(oh)2] at ordinary temperature to form compounds having cementitious properties [5]. many achievements have been made regarding pozzolanic materials, and the subject is still attracting much research due to its functional benefit of waste reusability and sustainable development; its indigenous technology and equipment requirements, and the reduction in construction costs, are added advantages. some findings on waste materials reveal a significant effect on the mechanical properties of pozzolan-blended concrete. golewski [6] investigated the effect of the addition of siliceous fly ashes (fa) in the amount of 0, 20 and 30 % by weight of cement on the interfacial microcracks and mechanical parameters in plain concrete. he discovered that using the 20 % fa cement binder could trigger favourable changes in the microstructure of concrete, leading to an improvement in its mechanical properties. also, the addition of fly ash into the concrete matrix up to 20 % resulted in a marginal increase of the fracture toughness of plain concrete, while the 30 % fa additive led to a significant decrease [7, 8]. falade et al. [9] examined the potential of pulverized bone as a pozzolanic material. the results of the investigation revealed no substantial difference in the strength properties of specimens having pulverised bone up to a 20 % replacement at 28 days curing age when compared to the control specimens. from the results of the study, it was concluded that pulverised bone has pozzolanic properties and can be used to partially replace cement. ikponmwosa et al. [10] investigated the strength characteristics of concrete beams with cement partially replaced by uncalcined soldier-ant mound clay (samc). the results indicated that the addition of samc in the mix produced a concrete of a lower density than normal concrete, and increased the setting times of the cement with an increased workability. the density as well as the flexural strength of the concrete decreased with an increase in the samc content. however, a 5 % samc content in the mix was considered optimal for an improved structural performance when compared with normal concrete.
grant process design for hot forging of asymmetric to symmetric rib-web shaped steel 21 h. cho, b. oh, c. jo, k. lee analysis of gear wheel-shaft joint characterized by comparable pitch diameter and mounting diameter 26 j. ryœ, h. sanecki, a. trojnacki heat transfer analysis of a diesel engine head 34 m. diviš, r. tichánek, m. španiel reducing time for the product development process by evaluation in the phase of solution searching 40 b. jokele, d. k. fuchs numerical calculation of electric fields in housing spaces due to electromagnetic radiation from antennas for mobile communication 44 h.-p. geromiller, a. farschtschi numerical simulation of stresses in thin-rimmed spur gears with keyway 47 b. brùžek, e. leidich anisochronic internal model control design 54 t. vyhlídal, p. zítek computational fluid dynamic simulation (cfd) and experimental study on wing-external store aerodynamic interference of a subsonic fighter aircraft 60 tholudin mat lazim, shabudin mat, huong yu saint acta polytechnica doi:10.14311/ap.2019.59.0560 acta polytechnica 59(6):560–572, 2019 © czech technical university in prague, 2019 available online at https://ojs.cvut.cz/ojs/index.php/ap comparative study and empirical modelling of pulverized coconut shell, periwinkle shell and palm kernel shell as a pozzolans in concrete efe ewaen ikponmwosa, samuel onosedeba ehikhuenmen∗, karieren kate irene university of lagos, faculty of engineering, department of civil and environmental engineering, sapara road, 100213, akoka, lagos, nigeria ∗ corresponding author: sehikhuenmen@unilag.edu.ng abstract. infrastructural development across the world is fast growing due to the rapid growth in population. this has resulted in consequently creating an increase in the demand for construction materials, especially cement. this study presents a report of a concise investigation on the pozzolanic potentials of pulverized coconut shell (pcs), periwinkle shell (pps) and palm kernel shell (ppks). the chemical composition of the pcs, pps and ppks, physical properties of concrete constituents and mechanical properties of the pozzolans blended concrete were determined. a concrete mix ratio of 1 : 2 : 4 with a w/c of 0.6 was adopted. a total of 270 cubes and 120 cylinders were cast at an interval of 10 % from 0 % to 50 % replacement level and cured in water for 7, 14, 21 and 28 days. results indicate that the pcs, pps and ppks are good retarders as they increased the setting time of cement paste and decreased the workability of concrete as the percentage replacement increased. the addition of the pcs in the mix produced a concrete of lower density while the addition of the pps and ppks produced a concrete of a higher density up to a 30 % replacement level, further increase resulted to a decrease in density. the compressive and tensile strengths increased with a curing age but decreased with an increasing percentage replacement level for the three pozzolans investigated. however, 10 % replacement level of the pcs, pps and ppks in concrete is suitable for the production of pozzolans blended concrete for structural works. the models developed are in a good agreement with the experimental results. keywords: pulverized coconut shell (pcs), pulverized periwinkle shell (pps), pulverized palm kernel shell (ppks), structural strengths, empirical models. 1. introduction infrastructural development across the world is fast rising, and this is consequently creating an increase in the demand for construction materials. 
concrete is listed among the most extensively used construction materials worldwide and its production is daily on the increase to service the demand for the infrastructure development. cement, which is the main binder in the production of concrete is very expensive; particularly in developing countries [1]. besides its high cost, its production is associated with very high temperatures with an attendant emission of poisonous gases, such as co2, no, no2, as well as depletion of natural resources such as limestone [2]. more so, due to increasing industrial and agricultural activities, tonnes of waste materials like as steel slag, palm kernel, rice husk, saw dust, groundnut husk, periwinkle shell, coconut shell, etc, are deposited in the environment without littleany effective method of waste management and recycling [3]. some of these deposits are not easily decomposed and the accumulation has constituted to various forms of environmental challenges, hence there is a need to reuse them in order to minimize their negative effects on the environment. the american society of testing materials (astm) [4] defines pozzolans as siliceous or aluminous materials, which possess little or no cementitious properties but will, in finely divided form and in the presence of moisture, react with lime [ca(oh)2] at ordinary temperature to form a compound having cementitious properties [5]. many achievements have been made regarding to pozzolanic materials and the subject is still attracting much research due to its functional benefit of waste reusability and sustainable development, its indigenous technology and equipment requirements, and reduction in construction costs are added advantages. some findings of waste materials reveal a great significant effect on its mechanical properties of pozzolan blended concrete. golewski [6], investigated the effect of the addition of siliceous fly ashes (fa) in the amount of 0, 20 and 30 % by weight of cement on the interfacial microcracks and mechanical parameters in plain concrete. he discovered that the using the 20 % fa cement binder could trigger favourable changes in the microstructure of concrete leading to an improvement in its mechanical properties. also, the addition of fly ash into the concrete matrix up to 20 % resulted to a marginal increase of the fracture 560 https://doi.org/10.14311/ap.2019.59.0560 https://ojs.cvut.cz/ojs/index.php/ap vol. 59 no. 6/2019 comparative study and empirical modelling of pcs. . . toughness of plain concrete, while 30 % fa additive led to a significant decrease [7, 8]. falade et al. [9], examined the potential of pulverized bone as a pozzolanic material. results of an investigation revealed no substantial difference in strength propertiesy of specimens having pulverised bone up to 20 % replacement at 28 days curing age when compared to the control specimens. from the results of the study, it was concluded that pulverised bone had pozzolanic properties and could be used to replace cement partially. ikponmwosa et al. [10], investigated the strength characteristics of concrete beams with cement partially replaced by uncalcined soldier-ant mound clay. results indicated that the addition of samc in the mix produced a concrete of a lower density than normal concrete; increased the setting times of cement with increased workability. the density as well as the flexural strength of concrete decreased with increase in samc content. 
however, 5 % samc content in the mix was considered as optimalum for an improved structural performance when compared with normal concrete. ikponmwosa et al. [11], presented a study on the suitability of polyvinyl waste powder as a partial replacement for cement in concrete production. as part of their findings, polyvinyl waste powder has a pozzolanic effect, served as a retarder, reduced workability and higher structural strength up to a 20 % replacement. reddy et al. [12], studied the utilization of sugarcane bagasse ash in concrete by a partial replacement of cement. the results showed that the performance of concrete having up to a 10 % scba replacement met the requirements of bs 8110 [13] and can be used for the production of concrete for structural works. marthong [14], researched on the usage of sawdust ash (sda) as a part replacement of cement in concrete production. the test results disclosed that the addition of sda triggered only a little expansion due to a little calcium content. early strength growth was observed to be around 50-60 % of their 28 days strength. it was recommended that the usage of sda as a part replacement of cement should not exceed 10 % by volume in all grades of cement. utsev & taku [15], studied the application of coconut shell ash as a replacement of ordinary portland cement in concrete production. the results revealed that the densities of concrete cubes for a 10-15 % replacement was were above 2400 kg/m3 and the compressive strength increased from 12.45 n/mm2 at 7 days to 31.78 n/mm2 at 28 days curing; thus, meeting the requirement for the use in both heavy weight and light weight concreting. in conclusion, the study revealed that a 10 to 15 % partial replacement of the opc with the csa using a w/c ratio of 0.5 was suitable for a production of both heavy weight and light weight concrete. umoh & femi [16] investigated the behaviour of ternary blended cement concrete integrating periwinkle shell ash (psa) and bamboo leaf ash (bla) as cement enhancers. the result revealed that at 28and 56-days hydration, ternary blended cement concrete containing a combined percentage of psa and bla of 20 % cement replacement attained high compressive and tensile strength and low water absorption. hence, from the result, it was concluded that blended cement concrete with a 20 % replacement of cement with the psa and bla was considered optimum for ternary blended cement concrete. offiong & akpan [17] discovered, in the assessment of physico-chemical properties of periwinkle shell ash as part replacement for cement in concrete production, that periwinkle shell ash calcined at 800 °c met the required maximum 34 % as stipulated by astmc618 [18] is appropriate for the usage as a part replacement for cement in concrete production. as the percentage replacement level of psa content increases, the strength properties of concrete reduces and the increase in curing age led to an increased strength for specimens cured in water (control); while for the specimen cured in sulphuric acid solutions, the strength properties decreased with age [19]. similar findings were observed by abdullahi and sara [20] and olutoge et al. [21]. olutoge et al. [22] examined the strength properties of palm kernel shell ash concrete. from their research findings, the compressive strength for the pksa blended concrete was lower than for a normal concrete but at a 10 % replacement, its 28-day strength of been 22.8 n/mm2 is within the recommended strength for reinforced concrete. 
the combination of palm kernel shell ash with other supplementary cementitious materials (scm) had been researched and found to improve the strength properties of concrete [23, 24]. utsev and taku [15] investigated coconut shell ash as a partial replacement of ordinary portland cement in concrete production. they discovered that the density of concrete increases with the increase in the csa contents but the compressive strength decreases as the replacement level of csa increases. it was concluded that a 10 % replacement level is optimal for replacing cement with the csa. the decline in strength can be attributed to the nature of the csa acting more as a filler material than a binding material [25]. oyedepo et al. [26] carried out a study on the performance of coconut shell ash and palm kernel shell ash as partial replacement for cement in concrete. the results revealed that a 20 % pksa and csa replacement in cement gave an average optimum compressive strength of 15.4 n/mm2 and 17.26 n/mm2 , respectively, at 28 days. it was concluded that the optimum replacement level of a 10 % replacement is suitable for both light weight and heavy weight concrete production. akhionbare [27] evaluated the usage of agro-waste as a partial replacements for cement in construction production. the seven selected agro-wastes are wood ash (wa), bone powder ash (bpa), acha husk ash (aha), bambara ground shell ash (bgsa), rice husk ash (rha), palm oil shell ash (posa) and groundnut husk ash (gha). the chemical analysis of the ash 561 e. e. ikponmwosa, s. o. ehikhuenmen, k. k. irene acta polytechnica for each material was done with the bogue’s model. a linear program calculates the equations governing the model for an ease of complex computations. the results revealed that bone ash (ba) has the highest compressive strength and c3s composition while rice husk is the most economically viable. global pollution coupled with resource depletion has inspired several engineers and researchers to search for these locally available resources with a view goal to investigateing their usefulness either wholly as a construction material or partly as a substitute for conventional ones in concrete production. the search revealed that coconut shell ash, periwinkle shell ash and palm kernel shell ash haves been used as a partial replacement of cement. however, little or no literature exists on the use of these three materials in their natural state (in pulverized form). thus, this paper presents the comparative study between the characterization of pulverized coconut shell, periwinkle shell and palm kernel shell as pozzolans in concrete. this research is aimed at reducing the cost of concrete production and co2 emission by incorporating pozzolan (biodegradable waste) without compromising the concrete strength. also, its serves as an encouragement for a low-cost housing scheme for real estate investors and developers. 2. materials & methods 2.1. materials the cement used in this study was ordinary portland cement (opc), which conforms to the requirements set under astm c-150 [28], bs 12 [29] and bs en 197-1 [30]. the pulverized coconut shell, periwinkle shell and palm kernel shell were gotten from matured shells at badagry (lagos state), bariga (lagos state) and abeokuta (ogun state), respectively. the shells were washed to remove impurities and sun-dried for 48 hours after which they were broken into small pieces and fed into a pulveriser. the powder from the pulveriser was then sieved using the sieve size with 90µm micron size. 
the shaft retained on the sieve was then collected and taken back to the pulveriser for a further processing. coarse aggregates were crushed granite ranging from 12.5 mm to 19 mm sizesin size, obtained from a quarry located in ogun state. the fine aggregate used was river sand gotten from river ogun, which was free from organic matter and salt. the gradation test as presented in table 3 showed that they met specifications requirements in accordance to bs 882 [31]. the water used for this research was clean, portable and impurities-free obtained from university of lagos water distribution system, which was in an accordance with bs 3148 [32]. 2.2. methodology 2.2.1. mix design and sample preparation the design mix proportions for grade 30 concrete were 310.85 kg/m3, 621.70 kg/m3, 1243.40 kg/m3 and 186.51 kg/m3 for cement, sand, granite and water respectively, with a w/c ratio of 0.60 for one cubic meter of concrete. a total of 270 nos. 150 mm × 150 mm × 150 mm concrete cubes and 120 nos. 150 mm × 300 mm concrete cylinder specimens were cast, cured and tested. 90 cubes and 40 cylinders were cast using each of the materials at different replacement levels from 0 % to 50 % at 10 % intervals for pulverized coconut shell, periwinkle shell and palm kernel shell respectively. a concrete mixer was used for mixing the concrete constituents to produce freshly mixed concrete. the mixtures were poured into various moulds for different concrete elements and compacted using tapping rod and vibrating machine. the specimens were demoulded after 24 ± 2 hours and cured in potable water. the specimens were de-moulded 24 ± 2 hours after the casting and stored in the curing medium until the age of the test of 7, 14, 21 and 28 days for the cubes and cylinders. 2.2.2. testing procedure a cchemical analysis was carried out on the pcs, pps and ppks in the department of chemistry, university of lagos. setting time test was conducted on the cement paste replacement with the pcs, pps and ppks. also, workability was determined in the fresh state of the concrete having the pcs, pps and ppks using a slump test. compressive strength test was conducted using avery dension universal testing machine having a loading rate of 120 kn/min, which was is in an accordance with bs en 12390-3 [33]. the ssplitting tensile strength test was done in an accordance with bs en 12390-6 [34] and astm c496-96 [35] using a loading rate of 120 kn/min. 2.2.3. mathematical model the results of and the experimental data for various properties of pozzolan blended concrete were analysed using a bilinear interpolation method to develop mathematical models for predicting parameters with respect to its variables. the algorithm for the bilinear interpolation method for the value of the unknown function f at points x and y. for a known value of f at four points; q11 = (x1, y1), q12 = (x1, y2), q21 = (x2, y1), q22 = (x2, y2). linear interpolation in the x-direction, we have: fx,y1 = x2 − x x2 − x1 f (q11) + x − x1 x2 − x1 f (q21) (1) fx,y2 = x2 − x x2 − x1 f (q12) + x − x1 x2 − x1 f (q22) (2) 562 vol. 59 no. 6/2019 comparative study and empirical modelling of pcs. . . (a). (b). (c). (d). (e). (f). figure 1. laboratory equipment and experimental activities (a, b) avery dension universal testing machine (c) material measurement using weighing machine (d) slump test (e) casting of concrete cubes (f) concrete cubes immersed in potable water curing. 
3. results and discussion

3.1. chemical analysis of pulverized coconut shell, periwinkle shell and palm kernel shell

the chemical analysis of the pcs, pps and ppks was carried out and the chemical compositions were determined. the results are shown in table 1. the physical observation shows that the samples differ in colour (brown) from cement (light grey); the colour itself does not determine whether they can perform alike, since their individual performance depends on the amounts of the constituent chemical elements present in them. it was observed that the combined percentage masses of silica (sio2), alumina (al2o3) and ferric oxide (fe2o3) for the samples were lower than those required for the various classes of pozzolans according to astm c618-12a [36]. cao, known for providing strength in cement, was observed to be very low in the samples; this low cao content contributes to the low strength performance and to the increase in setting time reported below. the low sio2 and al2o3 contents of the samples have an adverse effect on the strength and on the setting properties of the cement paste binding the aggregates together. the mgo content was found to be within the specified limits for cement, pps and ppks, but the pcs sample recorded a higher value, which can contribute to a reduction of the strength of concrete [28, 36]. the percentages of na2o and k2o, known as the alkali oxides, were observed to be large in the samples when compared to the standard range [36]. this resulted in some difficulties in regulating the setting time of the cement paste.

figure 2. supplementary cementitious materials (scm): (a) pulverized coconut shell, (b) pulverized palm kernel shell, (c) pulverized periwinkle shell.

parameter | cement (%) | pulverized coconut shell (%) | pulverized periwinkle shell (%) | pulverized palm kernel shell (%)
calcium oxide (cao) | 63.97 | 8.918 | 10.84 | 9.68
silica (sio2) | 18.34 | 27.57 | 22.32 | 20.28
aluminium oxide (al2o3) | 4.73 | 14.38 | 9.37 | 3.38
ferric oxide (fe2o3) | 0.38 | 5.82 | 6.89 | 7.42
potassium oxide (k2o) | 0.48 | 6.981 | 4.98 | 6.81
magnesium oxide (mgo) | 2.16 | 3.196 | 2.49 | 2.89
sodium oxide (na2o) | 0.55 | 5.189 | 6.12 | 5.81
chlorine (cl) | 1.83 | 6.28 | 7.62 | 6.58
sulphate (so4^2-) | 0.51 | 8.00 | 7.40 | 7.60

table 1. chemical composition of supplementary cementitious materials (scm) and cement.

3.2. physical properties and sieve analysis

the results of the physical properties tests on the concrete constituents used in this study are presented in table 2. according to the unified soil classification system [37, 38], the coarse aggregate is classified as poorly graded granite, because its coefficient of uniformity ($c_u = d_{60}/d_{10}$) is less than 4 and its coefficient of curvature ($c_c = d_{30}^2/(d_{60}\cdot d_{10})$) lies between 1 and 3, as presented in figure 3. also, the fine aggregate is classified as poorly graded sand, because its $c_u$ value is less than 6 and its $c_c$ value lies between 1 and 3 [37, 38].
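the gradation indices just quoted are simple ratios of characteristic particle sizes; the sketch below (our own illustration) computes $c_u$ and $c_c$ from assumed $d_{10}$, $d_{30}$, $d_{60}$ values and applies the uscs criteria used above. the particle sizes in the example are hypothetical, chosen so that the resulting coefficients match the granite values reported in table 2.

```python
def gradation_coefficients(d10, d30, d60):
    """coefficient of uniformity cu = d60/d10 and
    coefficient of curvature cc = d30^2 / (d60 * d10)."""
    cu = d60 / d10
    cc = d30 ** 2 / (d60 * d10)
    return cu, cc

def is_well_graded(cu, cc, coarse=True):
    """uscs criterion: cu > 4 (coarse aggregate) or cu > 6 (sand),
    together with 1 <= cc <= 3; otherwise poorly graded."""
    cu_limit = 4.0 if coarse else 6.0
    return cu > cu_limit and 1.0 <= cc <= 3.0

# hypothetical characteristic sizes (mm) reproducing cu = 1.58, cc = 1.03
cu, cc = gradation_coefficients(d10=10.0, d30=12.75, d60=15.8)
print(round(cu, 2), round(cc, 2), is_well_graded(cu, cc, coarse=True))
```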
the densities and specific gravities of the concrete constituents met the specified standards: the specific gravities lie in the range 2.4-2.9 and the bulk densities classify the aggregates as normal weight aggregates according to the american standard test method [39]. the aggregate crushing and impact values of 22.01 % and 13.26 %, respectively, were in accordance with the relevant standards [40].

parameter | granite | sand | cement | pcs | pps | ppks
coefficient of uniformity (cu) | 1.58 | 3.03 | – | – | – | –
coefficient of curvature (cc) | 1.03 | 1.08 | – | – | – | –
fineness modulus | 3.88 | 4.49 | 4.25 | – | – | –
dry density (kg/m3) | 1403.29 | 1405.01 | – | – | – | –
bulk density (kg/m3) | 1407.36 | 1409.55 | 1297 | 1312 | 1320 | 1384
specific gravity | 2.66 | 2.63 | 2.75 | 2.40 | 2.46 | 2.64
moisture content (%) | 0.29 | 0.6 | – | 1.45 | 1.23 | 1.34
aggregate crushing value (%) | 22.01 | – | – | – | – | –
aggregate impact value (%) | 13.26 | – | – | – | – | –

table 2. physical properties of concrete constituents.

figure 3. particle size distribution for aggregates.

3.3. effect of pozzolans on the setting times of cement

the results of the setting time tests of cement pastes with pozzolans incorporated at various replacement levels are presented in figure 4. it was observed that the initial setting times of the cement and of the blended pastes exceed the specified minimum of 45 minutes, while the final setting times are below the specified maximum of 600 minutes [41]. from figures 4a and 4b, it was observed that both the initial and final setting times of the pastes incorporating pozzolans increased as the replacement level of the pozzolans increased. specifically, the addition of each of the three materials to the cement paste caused a retardation of the setting time. the cement blended with pulverized coconut shell (pcs) had the highest initial and final setting times. in other words, these pozzolans act as retarders and could be considered good for temperate regions, or for ready-mix concrete transported over a long distance. this behaviour may be attributed to the reduction of the strength-forming compounds (c3s and c3a) in the blended cement through the partial replacement of cement with pozzolans; these compounds are responsible for the early strength gain and for the initial and final setting times of cement.

figure 4. effect of pozzolans on the setting time of cement paste.

3.4. effect of pcs, pps and ppks on the workability of concrete

the variations in the workability of the pozzolan blended concrete were determined using a slump test. the results revealed a high degree of workability for the pps and ppks from the control up to a 20 % replacement level, while the pcs exhibited a high workability only for the control and at the 10 % replacement level. a further increase in the replacement level of the pcs, pps and ppks resulted in a low degree of workability. figure 5 shows the variations of the slump values of the pcs, pps and ppks blended concretes, taken at the different replacement levels. as depicted in the figure, the workability decreased as the replacement level of the different pozzolans increased. the trends observed in the slump values can be attributed to the high moisture absorption capacity of the pozzolan particles in the mix as the percentage replacement increased, so that the mix transits from a plastic to a stiff-plastic consistency. pulverized palm kernel shell was observed to have higher slump values when compared to the other pozzolans.

figure 5. slump values of pozzolans blended concrete in the fresh mix stage.
3.5. effects of pcs, pps and ppks on the density of concrete

the density of the concretes produced with the different pozzolans incorporated into the concrete matrix generally increased as the curing age increased. this could be attributed to the adequate amount of moisture available for continuous hydration, and to volume stability. figure 6a shows the effect of the percentage replacement of the three pozzolans (pcs, pps & ppks) on the density of concrete at the 7 days curing age. as depicted in the figure, the density of the pulverized coconut shell blended concrete decreased as the replacement level increased, while the densities of the pulverized periwinkle shell and pulverized palm kernel shell blended concretes increased for values up to a 30 % replacement level; further replacement led to a decrease. figure 6b shows the effect of the different pozzolans on the density of concrete at the 28 days curing age. trends similar to those at 7 days were observed: the increase in the pcs replacement led to a decrease in density, while increasing the pps and ppks contents resulted in an increase in the density for values up to a 30 % replacement level, with a subsequent decrease beyond it. these trends can be attributed to the quantity of magnesia (magnesium oxide) present in the concrete matrix. the decrease in the densities of the pulverized coconut shell blended concrete can be ascribed to its magnesia content of 3.196 %, which is above the specified range of 0.1 to 3 % and could impair the soundness of the concrete. also, when the quantity of magnesia in the concrete matrix exceeds the allowable value, it can result in a decrease in the density of concrete, as observed in the pulverized periwinkle shell and palm kernel shell blended concretes.

figure 6. densities of pozzolans blended concrete at 7 days and 28 days curing ages.

3.6. effects of pcs, pps and ppks on compressive strength of concrete

the variation in the compressive strength of the pozzolan blended concretes at different curing ages is presented in figure 7. it was observed that the compressive strength increased with an increase in the curing age. the reason for this behaviour is that the concrete specimens were exposed to an environment that facilitates continuous hydration, with resultant strength gains; the curing condition also helped to prevent the loss of moisture needed for the continuous hydration. figures 7a and 7b illustrate the outcome of incorporating the pozzolans in concrete at the two curing ages. from the figures, it is observed that the compressive strength gradually decreased as the percentage replacement increased from 0 % to 50 % at 10 % intervals for the three pozzolans. however, pulverized coconut shell yielded the lowest compressive strength as its replacement level increased, while pulverized periwinkle shell recorded the highest values of compressive strength when compared to the other two pozzolans. the decline in the compressive strength for the three pozzolans can be attributed to the lack of adhesion between the pozzolans and the cement paste, leading to losses in stability and surface area; in addition, the breakdown of the bond between the cement-pozzolan paste and the aggregates resulted in a reduced compressive strength.

figure 7. compressive strength of pozzolans blended concrete at 7 and 28 days curing ages.
the compressive strength at the 28 days curing age decreased by 13.38 %, 10.50 % and 16.50 % at the 10 % replacement level, by 23.98 %, 16.88 % and 23.93 % at the 20 % replacement level, by 36.21 %, 27.53 % and 32.95 % at the 30 % replacement level, by 45.47 %, 39.23 % and 45.00 % at the 40 % replacement level, and by 71.27 %, 52.23 % and 70.07 % at the 50 % replacement level for the pcs, pps and ppks, respectively. the values of the compressive strength for the three pozzolans are above 17 n/mm2 at the 10 % replacement level. this indicates that the pcs, pps and ppks at the 10 % replacement level are suitable for the production of pozzolan blended concretes for structural works using a water-binder ratio of 0.6.

3.7. strength activity index (sai) on pozzolans blended concrete

the strength activity index (sai) measures the pozzolanicity of cement replacement materials (crms); the measure is based on the strength of the blended concrete relative to its control, in percent. according to astm [4], a crm can be classified as a pozzolan if the strength of the blended cement at 7 days and/or 28 days is not less than 75 % of the strength of normal concrete. the strength activity index for the pozzolan blended concretes is presented in table 3.

curing age | replacement (%) | pcs cs (n/mm2) | pcs sai (%) | pps cs (n/mm2) | pps sai (%) | ppks cs (n/mm2) | ppks sai (%)
7 days | 0 | 14.3 | 100 | 14.3 | 100 | 14.3 | 100
7 days | 10 | 12.81 | 89.58 | 13.62 | 95.25 | 13.11 | 91.68
7 days | 20 | 9.26 | 64.76 | 12.3 | 86.01 | 12.01 | 83.99
7 days | 30 | 7.06 | 49.37 | 9.22 | 64.48 | 9.07 | 63.43
7 days | 40 | 5.34 | 37.34 | 6.78 | 47.41 | 5.67 | 39.66
7 days | 50 | 2.44 | 17.06 | 5.21 | 36.43 | 3.45 | 24.13
28 days | 0 | 20.85 | 100 | 20.85 | 100 | 20.85 | 100
28 days | 10 | 18.06 | 86.62 | 18.66 | 89.50 | 17.41 | 83.50
28 days | 20 | 15.85 | 76.02 | 17.33 | 83.12 | 15.86 | 76.07
28 days | 30 | 13.3 | 63.79 | 15.11 | 72.47 | 13.98 | 67.05
28 days | 40 | 11.37 | 54.53 | 12.67 | 60.77 | 11.47 | 55.01
28 days | 50 | 5.99 | 28.73 | 9.96 | 47.77 | 6.24 | 29.93

table 3. strength activity index of pcs, pps and ppks.

from table 3, at 7 days, the three pozzolans met the specification at the 10 % replacement level, and at the 20 % replacement level two pozzolans (pps & ppks) met the minimum permissible level of 75 % of their control strength. at the 28 days curing age, the sais for the pcs, pps and ppks are 86.62 %, 89.50 % and 83.50 % at the 10 % replacement level, 76.02 %, 83.12 % and 76.07 % at the 20 % replacement level, 63.79 %, 72.47 % and 67.05 % at the 30 % replacement level, 54.53 %, 60.77 % and 55.01 % at the 40 % replacement level, and 28.73 %, 47.77 % and 29.93 % at the 50 % replacement level. this implies that up to a 20 % replacement could be adopted as the optimum replacement level of cement with pulverized coconut shell, periwinkle shell and/or palm kernel shell for normal concrete production.
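the sai values in table 3 are a direct ratio of blended to control strength. a minimal sketch of this computation, reusing the 28-day pcs strengths from the table (the function names are ours):

```python
def strength_activity_index(cs_blend, cs_control):
    """sai in percent: strength of the blended concrete
    relative to the control strength."""
    return cs_blend / cs_control * 100.0

def meets_astm_criterion(sai):
    """astm criterion used above: sai not less than 75 %
    of the control strength."""
    return sai >= 75.0

# 28-day pcs compressive strengths (n/mm^2) from table 3
cs_control = 20.85
for level, cs in [(10, 18.06), (20, 15.85), (30, 13.3), (40, 11.37), (50, 5.99)]:
    sai = strength_activity_index(cs, cs_control)
    print(f"{level} %: sai = {sai:.2f} %, meets criterion: {meets_astm_criterion(sai)}")
```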
3.8. effect of pcs, pps and ppks on tensile strength of concrete

the experimental results of the tensile strength of the investigated pozzolan blended concretes are presented in figures 8a and 8b. the tensile strength increased significantly as the curing age increased from 7 days to 28 days. figure 8 shows the test results of the pcs, pps and ppks blended concretes from the control (0 %) to the 50 % replacement level at an interval of 10 %, for 7 and 28 days of water curing. from these figures, it is observed that the use of these pozzolans as a cement replacement resulted in a decline in the tensile strength of the concretes. the increase in the replacement level of cement in the concrete matrix resulted in reductions of the tensile strength, when compared with the control mix at 28 days, of 19.08 %, 25.77 %, 44.79 %, 48.47 % and 67.49 % for the pcs, of 16.33 %, 18.84 %, 37.27 %, 42.57 % and 45.93 % for the pps, and of 21.93 %, 25.72 %, 41.96 %, 48.01 % and 58.90 % for the ppks. also, it was observed that the tensile strength of the pozzolan blended concretes varies between 1/8 and 1/12 of their compressive strength.

figure 8. split tensile strength of pozzolans blended concrete at 7 and 28 days curing ages.

3.9. mathematical model to predict compressive strength of pozzolans blended concrete

predictive models were developed using the bilinear interpolation method for a system with two inputs and one output, the two inputs being the curing age and the percentage replacement for the different pozzolans. the mathematical models developed are presented below, where $r$ is the percentage replacement level and $c$ is the curing age.

for the compressive strength of pulverized coconut shell:

$$\begin{aligned}
f_{cu} ={}& 12.91 + 0.9554333322\,r - 0.1152666679\,r^2 + 0.002996249948\,r^3 \\
&- 0.00001083332963\,r^4 - 2.558333359\times10^{-7}\,r^5 \\
&+ 0.125476196\,c + 0.012408169\,c^2 - 0.00022837706\,c^3 \\
&- 0.1369956349\,rc + 0.00574642857\,rc^2 - 0.0000978781988\,rc^3 \\
&+ 0.00899871059\,r^2c - 0.0002122023963\,r^2c^2 + 3.39731152\times10^{-6}\,r^2c^3 \\
&+ 0.4369424621\times10^{-4}\,r^3c - 0.2134353783\times10^{-3}\,r^3c^2 + 4.02089414\times10^{-7}\,r^3c^3 \\
&- 0.00001167758013\,r^4c + 1.071003446\times10^{-6}\,r^4c^2 - 1.988176304\times10^{-8}\,r^4c^3 \\
&+ 1.525396828\times10^{-7}\,r^5c - 1.222789117\times10^{-8}\,r^5c^2 + 2.275672175\times10^{-10}\,r^5c^3
\end{aligned} \qquad (4)$$

for the compressive strength of pulverized periwinkle shell:

$$\begin{aligned}
f_{cu} ={}& 12.91 - 0.9488333240\,r + 0.1228083290\,r^2 - 0.006229166683\,r^3 \\
&+ 0.0001254166631\,r^4 - 8.749999850\times10^{-7}\,r^5 \\
&+ 0.125476196\,c + 0.012408169\,c^2 - 0.00022837706\,c^3 \\
&+ 0.1413428547\,rc - 0.007122618889\,rc^2 + 0.0000971088407\,rc^3 \\
&- 0.01218710214\,r^2c + 0.0000203655832\,r^2c^2 + 0.00001091877345\,r^2c^3 \\
&+ 0.0003516666700\,r^3c + 0.00003064200662\,r^3c^2 - 1.341715255\times10^{-6}\,r^3c^3 \\
&- 3.069443624\times10^{-6}\,r^4c - 1.137329988\times10^{-6}\,r^4c^2 + 4.047214231\times10^{-8}\,r^4c^3 \\
&- 4.52381311\times10^{-9}\,r^5c + 1.145408186\times10^{-8}\,r^5c^2 - 3.753644352\times10^{-10}\,r^5c^3
\end{aligned} \qquad (5)$$

for the compressive strength of pulverized palm kernel shell:

$$\begin{aligned}
f_{cu} ={}& 12.91 - 2.548666671\,r + 0.3921833343\,r^2 - 0.02144708339\,r^3 \\
&+ 0.0004691666657\,r^4 - 3.56250000\times10^{-6}\,r^5 \\
&+ 0.125476196\,c + 0.012408169\,c^2 - 0.00022837706\,c^3 \\
&+ 0.5147194461\,rc - 0.03239914980\,rc^2 + 0.0005877713022\,rc^3 \\
&- 0.08025902824\,r^2c + 0.004841454123\,r^2c^2 - 0.8636216480\times10^{-4}\,r^2c^3 \\
&+ 0.4369424621\times10^{-2}\,r^3c - 0.2604336751\times10^{-3}\,r^3c^2 + 4.616942050\times10^{-6}\,r^3c^3 \\
&- 0.00009674305549\,r^4c + 5.758928567\times10^{-6}\,r^4c^2 - 1.018788466\times10^{-7}\,r^4c^3 \\
&+ 7.457142890\times10^{-7}\,r^5c - 4.452380980\times10^{-8}\,r^5c^2 + 7.871720176\times10^{-10}\,r^5c^3
\end{aligned} \qquad (6)$$

table 4 shows the comparison of the results of the experimental findings and of the bilinear interpolation models.

curing age | rep. (%) | pcs exp. | pcs model | per. diff. | pps exp. | pps model | per. diff. | ppks exp. | ppks model | per. diff.
7 days | 0 | 14.3 | 14.300 | -4.895e-07 | 14.3 | 14.300 | -4.9e-07 | 14.3 | 14.300 | -4.9e-07
7 days | 10 | 12.81 | 12.810 | -3.123e-07 | 13.62 | 13.620 | -5.9e-07 | 13.11 | 13.110 | -2.3e-07
7 days | 20 | 9.26 | 9.259 | 3.24e-07 | 12.3 | 12.300 | -6.5e-07 | 12.01 | 12.009 | 2.16e-06
7 days | 30 | 7.06 | 7.059 | 1.261e-06 | 9.22 | 9.220 | -5.6e-07 | 9.07 | 9.069 | 6.63e-06
7 days | 40 | 5.34 | 5.339 | 5e-06 | 6.78 | 6.779 | 5.32e-06 | 5.67 | 5.669 | 6.48e-05
7 days | 50 | 2.44 | 2.439 | 1.643e-05 | 5.21 | 5.209 | 4.07e-05 | 3.45 | 3.449 | 0.000137
28 days | 0 | 20.85 | 20.850 | -3.453e-06 | 20.85 | 20.850 | -3.5e-06 | 20.85 | 20.850 | -3.5e-06
28 days | 10 | 18.06 | 18.060 | -3.433e-06 | 18.66 | 18.660 | -4.1e-06 | 17.41 | 17.410 | -4.6e-06
28 days | 20 | 15.85 | 15.850 | -2.082e-06 | 17.33 | 17.330 | -4.7e-06 | 15.86 | 15.860 | -1.3e-06
28 days | 30 | 13.3 | 13.299 | 3.158e-06 | 15.11 | 15.110 | -6.3e-06 | 13.98 | 13.980 | -2e-05
28 days | 40 | 11.37 | 11.369 | 1.706e-05 | 12.67 | 12.670 | -8.7e-07 | 11.47 | 11.469 | 8.55e-05
28 days | 50 | 5.99 | 5.989 | 8.681e-05 | 9.96 | 9.959 | 4.69e-05 | 6.24 | 6.240 | -2.7e-05

table 4. validation of the developed models for the compressive strength (n/mm2) of the pozzolan (pcs, pps and ppks) blended concretes.

from the percentage differences calculated for each replacement level for all the pozzolans, it was observed that the values of the mathematical models are in good agreement with the experimental results.
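as a small illustration of how the models are applied, the sketch below evaluates eq. (4) at the control point and computes the percentage difference; the coefficient dictionary is our transcription of the printed digits of eq. (4), and the function names are ours. note that with the rounded coefficients as printed the control point evaluates to about 14.32 n/mm2 rather than the tabulated 14.300, so the per. diff. values of order 10^-7 in table 4 were evidently obtained with unrounded coefficients.

```python
# coefficients of eq. (4) (pcs), keyed by powers (i, j) of r and c,
# transcribed from the text; rounding in the printed digits propagates
# into the evaluation
EQ4 = {
    (0, 0): 12.91, (1, 0): 0.9554333322, (2, 0): -0.1152666679,
    (3, 0): 0.002996249948, (4, 0): -1.083332963e-5, (5, 0): -2.558333359e-7,
    (0, 1): 0.125476196, (0, 2): 0.012408169, (0, 3): -0.00022837706,
    (1, 1): -0.1369956349, (1, 2): 0.00574642857, (1, 3): -9.78781988e-5,
    (2, 1): 0.00899871059, (2, 2): -0.0002122023963, (2, 3): 3.39731152e-6,
    (3, 1): 4.369424621e-5, (3, 2): -2.134353783e-4, (3, 3): 4.02089414e-7,
    (4, 1): -1.167758013e-5, (4, 2): 1.071003446e-6, (4, 3): -1.988176304e-8,
    (5, 1): 1.525396828e-7, (5, 2): -1.222789117e-8, (5, 3): 2.275672175e-10,
}

def fcu(coeffs, r, c):
    """evaluate a model surface sum(a_ij * r^i * c^j)."""
    return sum(a * r**i * c**j for (i, j), a in coeffs.items())

def percentage_difference(actual, model):
    """per. diff. = (actual - model) / actual * 100 %"""
    return (actual - model) / actual * 100.0

# control point (r = 0 %, c = 7 days): experimental value 14.3 n/mm^2
model = fcu(EQ4, r=0, c=7)
print(round(model, 3), round(percentage_difference(14.3, model), 3))
```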
the two-way statistical analysis of variance (anova) without replication on the 7, 14, 21 and 28-day compressive strength results at the 95 % confidence level (i.e., α = 5 %) is presented in table 5. the percentage replacement level and the curing age were considered as the sources of variation of the compressive strength.

anova for pcs
source of variation | ss | df | ms | f | p-value | f crit
rows | 459.2312 | 5 | 91.84623 | 305.9878 | 1.54e-14 | 2.901295
columns | 109.5279 | 3 | 36.50929 | 121.6315 | 9.47e-11 | 3.287382
error | 4.502446 | 15 | 0.300163 | | |
total | 573.2615 | 23 | | | |

anova for pps
source of variation | ss | df | ms | f | p-value | f crit
rows | 286.7838 | 5 | 57.35675 | 339.8477 | 7.06e-15 | 2.901295
columns | 103.4237 | 3 | 34.47458 | 204.2673 | 2.2e-12 | 3.287382
error | 2.531579 | 15 | 0.168772 | | |
total | 392.7391 | 23 | | | |

anova for ppks
source of variation | ss | df | ms | f | p-value | f crit
rows | 431.5447 | 5 | 86.30893 | 235.6973 | 1.06e-13 | 2.901295
columns | 72.07297 | 3 | 24.02432 | 65.60696 | 7.46e-09 | 3.287382
error | 5.492783 | 15 | 0.366186 | | |
total | 509.1104 | 23 | | | |

table 5. results of the two-way analysis of variance for the compressive strength (n/mm2) of the pozzolan (pcs, pps and ppks) blended concretes.

since, in table 5, the p-values for the rows and columns of the pcs, pps and ppks are less than 0.05 and f is greater than f-crit, the null hypothesis is rejected; at the 95 % level of confidence, we conclude that the curing age and the percentage replacement have a significant effect on the compressive strength of concrete.
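the table entries follow the standard two-way anova without replication; the sketch below reproduces that computation for a generic 6 × 4 strength matrix (rows = replacement levels, columns = curing ages). the example matrix is hypothetical, since only the 7- and 28-day strengths are printed in table 3; scipy is used for the f-distribution tail probability.

```python
import numpy as np
from scipy.stats import f as f_dist

def two_way_anova_without_replication(data):
    """data: 2-d array, rows = replacement levels, columns = curing ages;
    returns (f, p) for rows and for columns."""
    data = np.asarray(data, dtype=float)
    a, b = data.shape
    grand = data.mean()
    ss_rows = b * ((data.mean(axis=1) - grand) ** 2).sum()
    ss_cols = a * ((data.mean(axis=0) - grand) ** 2).sum()
    ss_err = ((data - grand) ** 2).sum() - ss_rows - ss_cols
    df_rows, df_cols, df_err = a - 1, b - 1, (a - 1) * (b - 1)
    ms_rows, ms_cols, ms_err = ss_rows / df_rows, ss_cols / df_cols, ss_err / df_err
    f_rows, f_cols = ms_rows / ms_err, ms_cols / ms_err
    return ((f_rows, f_dist.sf(f_rows, df_rows, df_err)),
            (f_cols, f_dist.sf(f_cols, df_cols, df_err)))

# hypothetical 6 levels x 4 ages strength matrix (n/mm^2), illustration only
strengths = [[14.3, 16.0, 18.5, 20.85],
             [12.8, 14.5, 16.8, 18.06],
             [ 9.3, 11.5, 13.9, 15.85],
             [ 7.1,  9.2, 11.4, 13.30],
             [ 5.3,  7.3,  9.5, 11.37],
             [ 2.4,  3.8,  5.0,  5.99]]
print(two_way_anova_without_replication(strengths))
```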
4. conclusions

from the research carried out, the following conclusions can be made:

(1.) the quality of the pcs, pps and ppks materials was below the limits for class f, c and n pozzolanic materials according to astm c 618.

(2.) the addition of the pcs, pps and ppks to the cement paste matrix resulted in an increase in the setting times. this indicates that the three pozzolans are good retarders; hence, they can be used for constructions where an early setting of concrete is not required, such as the plastering of walls.

(3.) the workability of the pozzolan blended concrete decreases with an increase in the pozzolan replacement level, resulting in stiff mixes.

(4.) the density of the pulverized coconut shell (pcs) blended concrete decreased as the replacement level increased, while the densities of the pulverized periwinkle shell (pps) and pulverized palm kernel shell (ppks) blended concretes increased for values of up to a 30 % replacement level; further replacement led to a decrease.

(5.) the compressive strength and the tensile strength declined as the percentage replacement increased from 0 % to 50 % at 10 % intervals for the three pozzolans. the values of the compressive strength for the three pozzolans are above 17 n/mm2 at the 10 % replacement level, an indication that the pcs, pps and ppks at the 10 % replacement level are suitable for the production of pozzolan blended concrete for structural works using a water-binder ratio of 0.6.

(6.) at the 28 days curing age, the strength activity index for the pozzolan blended concretes was more than 75 % of the strength of normal concrete up to the 20 % replacement level. the tensile strength of the pozzolan blended concretes varied between 1/8 and 1/12 of their compressive strength.

(7.) the models developed using the bilinear interpolation method for the compressive strengths of the pcs, pps and ppks blended concretes are in a good agreement with the experimental results. the statistical analysis of variance showed that the replacement level of the pcs, pps and ppks contents and the curing age have a significant effect on the compressive strength of concrete.

acknowledgements

the authors would like to express their gratitude to the university of lagos for providing the enabling environment to conduct this research work.

references

[1] b. c. mclellan, r. p. williams, j. lay, et al. costs and carbon emissions for geopolymer pastes in comparison to ordinary portland cement. journal of cleaner production 19(9):1080-1090, 2011. doi:10.1016/j.jclepro.2011.02.010.
[2] r. m. andrew. global co2 emissions from cement production, 1928-2017. earth system science data 10(4):2213-2239, 2018. doi:10.5194/essd-10-2213-2018.
[3] e. ikponmwosa, s. ehikhuenmen. the effect of ceramic waste as coarse aggregate on strength properties of concrete. nigerian journal of technology 36(3):691-696, 2017. doi:10.4314/njt.v36i3.5.
[4] astm c618-08 standard specification for coal fly ash and raw or calcined natural pozzolan for use in concrete. standard, american society for testing and materials, west conshohocken, 2008.
[5] s. donatello, m. tyrer, c. cheeseman. comparison of test methods to assess pozzolanic activity. cement and concrete composites 32(2):121-127, 2010. doi:10.1016/j.cemconcomp.2009.10.008.
[6] g. l. golewski. the influence of microcrack width on the mechanical parameters in concrete with the addition of fly ash: consideration of technological and ecological benefits. construction and building materials 197:849-861, 2019. doi:10.1016/j.conbuildmat.2018.08.157.
[7] g. l. golewski. determination of fracture toughness in concretes containing siliceous fly ash during mode iii loading. structural engineering and mechanics 62(1):1-9.
[8] g. l. golewski. effect of fly ash addition on the fracture toughness of plain concrete at third model of fracture. journal of civil engineering and management 23(5):613-620, 2017. doi:10.3846/13923730.2016.1217923.
[9] f. falade, e. ikponmwosa, c. fapohunda. potential of pulverized bone as a pozzolanic material. international journal of scientific & engineering research 3(7):1-6, 2012.
[10] e. ikponmwosa, m. salau, s. mustapha. strength characteristics of concrete beams with cement partially replaced by uncalcined soldier-ant mound clay. in second international conference on advances in engineering and technology, pp. 402-408. 2011.
[11] e. ikponmwosa, c. fapohunda, s. ehikhuenmen. suitability of polyvinyl waste powder as partial replacement for cement in concrete production.
nigerian journal of technology 33(4):504-511, 2014. doi:10.4314/njt.v33i4.11.
[12] m. v. s. reddy, k. ashalatha, m. madhuri, p. sumalatha. utilization of sugarcane bagasse ash (scba) in concrete by partial replacement of cement. iosr journal of mechanical and civil engineering 12(6):12-16, 2015.
[13] bs 8110-1:1997 structural use of concrete. standard, british standards institution, london, 2004.
[14] c. marthong. sawdust ash (sda) as partial replacement of cement. international journal of engineering research and applications 2(4):1980-1985, 2012.
[15] j. utsev, k. taku. coconut shell ash as partial replacement of ordinary portland cement in concrete production. international journal of scientific & technology research 1(8):86-89, 2012.
[16] a. umoh, o. o. femi. comparative evaluation of concrete properties with varying proportions of periwinkle shell and bamboo leaf ashes replacing cement. ethiopian journal of environmental studies and management 6(5):570-580, 2013. doi:10.4314/ejesm.v6i5.15.
[17] u. offiong, g. akpan. assessment of physico-chemical properties of periwinkle shell ash as partial replacement for cement in concrete. international journal of scientific engineering and science 1(7):33-36, 2017.
[18] astm c618-12a standard specification for coal fly ash and raw or calcined natural pozzolan for use in concrete. standard, american society for testing and materials, west conshohocken, 2012.
[19] i. attah, r. etim, d. ekpo. behaviour of periwinkle shell ash blended cement concrete in sulphuric acid environment. nigerian journal of technology 37(2):315-321, 2018. doi:10.4314/njt.v37i2.5.
[20] i. abdullahi, s. g. sara. assessment of periwinkle shells ash as composite materials for particle board production. in international conference on african development issues: materials technology track, pp. 158-163. 2015.
[21] f. a. olutoge, o. m. okeyinka, o. s. olaniyan, i. oyo. assessment of the suitability of periwinkle shell ash (psa) as partial replacement for ordinary portland cement (opc) in concrete 10(3):428-434, 2012.
[22] f. a. olutoge, h. a. quadri, o. s. olafusi. investigation of the strength properties of palm kernel shell ash concrete. engineering, technology & applied science research 2(6):315-319, 2012.
[23] j. oti, j. kinuthia, r. robinson, p. davies. the use of palm kernel shell and ash for concrete production. world academy of science, engineering and technology international journal of civil, structural, construction and architectural engineering 9(3):210-217, 2015.
[24] h. hardjasaputra, i. fernando, j. indrajaya, et al. the effect of using palm kernel shell ash and rice husk ash on geopolymer concrete. matec web of conferences 251, 2018. doi:10.1051/matecconf/201825101044.
[25] a. s. leman, s. shahidan, m. s. senin, n. i. r. r. hannan. a preliminary study on chemical and physical properties of coconut shell powder as a filler in concrete. iop conference series: materials science and engineering 160(1), 2016. doi:10.1088/1757-899x/160/1/012059.
[26] o. j. oyedepo, l. m. olanitori, s. p. akande. performance of coconut shell ash and palm kernel shell ash as partial replacement for cement in concrete. journal of building materials and structures 2(1):18-24, 2015. doi:10.5281/zenodo.241986.
[27] w. akhionbare. a comparative evaluation of the application of agrowaste as construction material. international journal of science and nature 4(1):141-144, 2013.
[28] astm c150/c150m-18 standard specification for portland cement. standard, american society for testing and materials, west conshohocken, 2018.
[29] bs 12:1996 specification for portland cement. standard, british standards institution, london, 1996.
[30] bs en 197-1:2011 cement. composition, specifications and conformity criteria for common cements. standard, british standards institution, london, 2011.
[31] bs 882:1992 specification for aggregates from natural sources for concrete. standard, british standards institution, london, 1992.
[32] bs 3148:1980 methods of test for water for making concrete (including notes on the suitability of the water). standard, british standards institution, london, 1980.
[33] bs en 12390-3:2009 testing hardened concrete. compressive strength of test specimens. standard, british standards institution, london, 2009.
[34] bs en 12390-6:2009 testing hardened concrete. tensile splitting strength of test specimens. standard, british standards institution, london, 2010.
[35] astm c496-96 standard test method for splitting tensile strength of cylindrical concrete specimens. standard, american society for testing and materials, west conshohocken, 1996.
[36] astm c618-12a standard specification for coal fly ash and raw or calcined natural pozzolan for use in concrete. standard, american society for testing and materials, west conshohocken, 2012.
[37] astm d2487-11 standard practice for classification of soils for engineering purposes (unified soil classification system). standard, american society for testing and materials, west conshohocken, 2011.
[38] astm d3282-09 standard practice for classification of soils and soil-aggregate mixtures for highway construction purposes. standard, american society for testing and materials, west conshohocken, 2009.
[39] astm c33/c33m-18 standard specification for concrete aggregates. standard, american society for testing and materials, west conshohocken, 2018.
[40] bs 812-110:1990 testing aggregates. method for determination of aggregate crushing value. standard, british standards institution, london, 1990.
[41] bs en 196-3:2016 methods of testing cement. determination of setting time and soundness. standard, british standards institution, london, 2016.
acta polytechnica 59(6):560-572, 2019

heat transfer analysis of a diesel engine head

m. diviš, r. tichánek, m. španiel

this paper documents the research carried out at the josef božek research center of engine and automotive engineering dealing with extended numerical stress/deformation analyses of engine parts loaded by heat and mechanical forces. it contains a detailed description of a c/28 series diesel engine head fe model and a discussion of the heat transfer analysis tuning and results. the head model, consisting of several parts, allows a description of contact interactions in both the thermal and the mechanical analysis.

keywords: heat transfer analysis, fem, internal-combustion engine.

1 introduction

the peak temperatures of burning gases inside the cylinder of diesel engines are of the order of 2500 k. to prevent overheating, the maximum temperatures of the metal surfaces enclosing the combustion chamber are limited to much lower values and, therefore, cooling must be provided for the cylinder, cylinder head and piston. the substantial heat fluxes and temperature nonuniformities arising from these conditions lead to thermal stresses, which further escalate the otherwise significant mechanical loading from the combustion pressures. the design must take all these considerations into account to ensure trouble-free operation of the engine, which, especially in the case of parts of complex design, requires an extended analysis based on detailed information on all the processes involved.

the cylinder head is one of the most complicated parts of an internal combustion engine. it is directly exposed to high combustion pressures and temperatures. in addition, it needs to house the intake and exhaust valve ports, the fuel injector and complex cooling passages. compliance with all these requirements leads to many compromises in design. as a result, cylinder heads tend to fail in operation (distortions, fatigue cracking) due to overheating in regions of limited cooling. in this study, we have put emphasis on the problematic regions around the valve seats and the narrow bridges between the valves. these regions experience especially severe thermal loading, as they receive heat not only from the in-cylinder burning gases during the combustion period but also from the burned gases flowing through the exhaust valve and along the exhaust-port walls during the exhaust. although the temperatures of the exhaust gases are significantly lower than the peak in-cylinder temperatures, the rapid movement of the flowing gases promotes the heat transfer to the walls. most of the heat accumulated in the valve is rejected through the contact surface of the valve seat. therefore, any deformation of these parts accompanied by improper contact and the occurrence of leakage on the conical valve contact face dramatically increases the thermal loading of the valves and, therefore, may lead to their destruction.

a detailed fe heat-transfer analysis can provide valuable information on the temperature distribution in the overall assembly of the cylinder head, especially in those regions where experimental data is almost impossible to gather. moreover, this is the first logical stage of a cylinder head strength analysis.
in the next step, the temperature and mechanical stresses have to be analysed using the temperature field and the pressure (and other mechanical loads, e.g., bolt pre-stress). the resulting displacement/stress fields may be utilised for an evaluation of the operational conditions, e.g., the contact pressure between the valves and the valve ports and its uniformity, as well as the strength and failure resistance of the assembly. such information contributes to a detailed understanding of the thermal and mechanical processes in the cylinder-head assembly under engine operation, which is a prerequisite for a further optimisation of the engine design.

2 cylinder-head assembly

in the present study, the cylinder head of a big turbocharged direct-injection diesel engine is analysed. the engine is used in power generation units. the basic parameters of the engine are: bore 275 mm, stroke 330 mm, maximum brake mean effective pressure 1.96 mpa, nominal speed 750 rpm.

fig. 1: cylinder head assembly

the cylinder head (fig. 1, link 1) is made of cast iron. it contains two intake valves (fig. 1, link 6) and two exhaust valves (fig. 1, link 7) made of forged alloy steel. the valve guides (2, 3) as well as the valve seats (4, 5) are pressed into the head. the exhaust-valve seats are cooled by cooling water flowing through annular cavities around the seats. the fuel injector is situated on the axis of the cylinder. the bottom face of the cylinder head, directly exposed to the in-cylinder gases, is cooled by special bores, which, however, represent a further complication in the design of this mechanically highly loaded region of the cylinder head.

3 the fe model

our fe model includes all the components mentioned above, see also table 1. the real design of the cylinder head was slightly modified in details to enable manageable meshing. the model of the cylinder head block was created using pro/engineer 3d product development software and was imported as a cad model, unlike the models of the other components (valves, seats, valve guides and fuel injector), which were developed directly in abaqus cae. some parts of the valves and of the fuel injector were considerably simplified or completely left out, as they were considered to have a negligible influence on the results. more information on the mesh geometry and statistics is provided in fig. 2 and table 1.

fig. 2: mesh geometry

part | nodes | dc3d8 (brick) | dc3d6 (prism) | dc3d4 (tetrahedron) | ds4 (shell) | dc1d2 (link)
cylinder head | 67 282 | 21 636 | 174 | 149 162 | – | –
inlet valve seat (2×) | 4 800 | 3 360 | – | – | – | –
exhaust valve seat (2×) | 4 640 | 2 880 | – | – | – | –
inlet valve (2×) | 5 038 | 3 448 | 56 | 2 205 | – | –
exhaust valve (2×) | 6 112 | 4 008 | 56 | 2 048 | – | –
inlet valve guide | 1 560 | 1 000 | – | – | – | –
exhaust valve guide | 1 716 | 1 100 | – | – | – | –
fuel injector | 6 644 | 4 412 | – | 5 171 | 32 | 2

table 1: mesh statistics of individual parts (elements labelled according to their abaqus names)

4 interactions and boundary conditions

although the thermal loadings of engine parts vary considerably in time due to the cyclical nature of engine operation, the computations were performed assuming steady-state heat fluxes evaluated on the basis of time-averaged values.
taking into account the speed of the periodical changes and the thermal inertia of the components of the cylinder head, the temperature variations are damped out within a small distance from the wall surface (~1 mm), and this simplification is therefore acceptable.

the thermal contact interactions between the individual parts of the cylinder head assembly are described by the heat flux $q_{ab}$ from the solid face a to the face b, which is related to the difference of their surface temperatures $t_a$, $t_b$ according to

$$q_{ab} = k\,(t_b - t_a),$$

where $k$ is the contact heat-transfer coefficient. the values of the coefficient used in the present analysis are summarised in table 2; they follow the values reported in [3].

fig. 3: interactions and boundary conditions on the cylinder-head block

interaction description | link | contact heat-transfer coefficient k [w m-2 k-1]
contact of cylinder head vs. cylinder head gasket | 1 | 6000
contact of cylinder head vs. inlet valve seat | 2a,b,c | 6000
contact of cylinder head vs. exhaust valve seat | 3a,b,c | 6000
contact of inlet valve vs. seat | 4 | 6000
contact of inlet valve vs. valve guide | 5 | 600
contact of exhaust valve vs. seat | 6 | 6000
contact of exhaust valve vs. valve guide | 7 | 600
contact of inlet valve guide vs. cylinder head | 8 | 6000
contact of exhaust valve guide vs. cylinder head | 9 | 6000
contact of cylinder head vs. fuel injector (o-ring) | 10 | 6000
contact of cylinder head vs. fuel injector | 11 | 6000

table 2: interactions between the single parts considered in the fe model (acting surfaces linked in fig. 3 and 4)

fig. 4: interactions and boundary conditions on the other parts of the cylinder-head assembly

the boundary conditions on the surfaces contacted by flowing gases are described as a steady-flow convective heat-transfer problem, where the heat flux $q$ transferred from a solid surface at temperature $t$ to a fluid at bulk temperature $t_0$ is determined from the relation

$$q = h\,(t - t_0),$$

where $h$ denotes the heat-transfer coefficient. this coefficient depends on the flow, the properties of the fluid and the geometry of the surfaces; functional forms of these relationships are usually developed with the aid of dimensional analysis. in the present study, the values of the gas-side heat-transfer coefficients and bulk gas temperatures (i.e., for the in-cylinder surfaces and the intake and exhaust port walls) were obtained from a detailed thermodynamic analysis of the engine operating cycle performed with the use of the 0-d thermodynamic model obeh, see [1]. this analysis uses the well-known eichelberg empirical heat-transfer coefficient correlation.
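as a small illustration of these two boundary-condition types, the sketch below evaluates the contact and convective heat fluxes for a few of the tabulated coefficients (the gas-side values come from table 3 below, the contact coefficient from table 2); the wall temperatures in the example are our own assumptions, chosen only to show the order of magnitude of the fluxes.

```python
def contact_flux(k, t_a, t_b):
    """contact heat flux q_ab = k * (t_b - t_a) across an interface,
    with k the contact heat-transfer coefficient [w m^-2 k^-1]."""
    return k * (t_b - t_a)

def convective_flux(h, t_wall, t_bulk):
    """convective heat flux q = h * (t_wall - t_bulk) from a solid
    surface to a fluid; a negative value means the surface receives
    heat from the fluid."""
    return h * (t_wall - t_bulk)

# gas-side conditions from table 3 with an assumed wall temperature of 550 k
for name, h, t_bulk in [("in-cylinder", 450.0, 1120.0),
                        ("intake port", 800.0, 330.0),
                        ("exhaust port", 800.0, 700.0)]:
    print(name, convective_flux(h, t_wall=550.0, t_bulk=t_bulk), "w/m^2")

# valve/seat contact with k = 6000 w m^-2 k^-1 and an assumed 20 k jump
print("valve seat", contact_flux(6000.0, t_a=570.0, t_b=550.0), "w/m^2")
```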
more detailed information on the used values are provided in table 3 in conjunction with fig. 3 and 4. the coolant-side boundary conditions for the water-cooled passages are based on values reported in the literature, see [4]. however, if local boiling occurs at the surface, different relationships for h must be used. heat-transfer coefficients for boiling features even more complicated dependencies, since in addition to all the mentioned influences affecting the values of heat-transfer in convection, in boiling processes additional variables play a role, e.g., those linked to the phase change, microstructure and material of the surface. in the first computed case (a) the possibility of occurrence of local boiling effects was neglected. however, the computed results suggest that the boiling point of water was exceeded in some parts of the cooling passages. in further computations (b, c) the temperature-dependent heat-transfer coefficient was assumed, see fig. 5, the values of which approximately follow, the steep dependence of the heat-transfer coefficient on the temperature of superheating, as reported in [4, 5]. the reasons for this simple approximation were twofold: the complexity of the problem, resulting in significant deviations of the coefficient values calculated according to the different equations, and the multiplicity of variables needed, some of which were not at our disposal. in this connection, two different dependencies � �h h t� were assumed in cases b and c, which made it possible to assess the sensitivity of the computational results to the potential errors in approximation of the heat-transfer coefficient for boiling. 5 results and discussions experimental data provided by the engine manufacturer enabled us to make a comparison with the computed results. the temperature measurement arrangement with the positioning of all the measured points is sketched in fig. 6. the thermocouples were placed in special bores. all the bores were situated at a distance of 18 mm from the bottom margin of the cylinder head. despite the lack of further detailed information on conditions of the experiment (errors caused © czech technical university publishing house http://ctn.cvut.cz/ap/ 37 acta polytechnica vol. 43 no. 5/2003 boundary condition description link heat transfer coefficient [w m�2k�1] bulk temperature [k] insulated surfaces (negligible heat-transfer rate) 20 0 (adiabatic) – free surfaces (contact with ambient air) 29 5 320 cooling passages 30 3 000* 350 in-cylinder surfaces 21 450 1120 intake-port surfaces 22 800 330 exhaust-port surfaces 23 800 700 *for cases b and c, the heat-transfer coefficient was assumed dependent on surface temperature table 3: boundary conditions considered in fe model (acting surfaces linked in fig. 3 and 4) 340 360 380 400 2000 4000 6000 8000 10000 12000 14000 16000 case a case b case c h e a t tr a n s fe r c o e ff ic ie n t h [w m 2 k 1 ] t [k] fig. 
5 results and discussions

experimental data provided by the engine manufacturer enabled us to make a comparison with the computed results. the temperature measurement arrangement, with the positions of all the measured points, is sketched in fig. 7. the thermocouples were placed in special bores, all situated at a distance of 18 mm from the bottom margin of the cylinder head.

measured point | case a [k] | case b [k] | case c [k] | experimental data [k]
1 | 568 | 488 | 488 | 425
2 | 558 | 485 | 485 | 509
3 | 497 | 430 | 429 | 442
4 | 517 | 453 | 453 | –
5 | 493 | 434 | 433 | 412
6 | 528 | 454 | 453 | 448
7 | 508 | 444 | 443 | 415
8 | 553 | 501 | 501 | 468
9 | 501 | 443 | 443 | 394
10 | 553 | 501 | 501 | 430
11 | 516 | 448 | 448 | 400
12 | 489 | 445 | 443 | 361
13 | 518 | 486 | 483 | 414

table 4: comparison of the computed and measured temperatures at points p1-p13

despite the lack of further detailed information on the conditions of the experiment (errors caused by the measuring equipment, the influence of the location and fixation of the thermocouples in the bores, etc.), the authors found the data a usable and useful resource for the verification of the presented model. a direct comparison is hindered by the fact that the experimental data and the computed values do not correspond to exactly the same load of the engine: the thermal boundary conditions for the fe analysis were computed for an approximately 5 % higher bmep (brake mean effective pressure) than that prevailing during the experiment. according to [2], raising the bmep of the experiment to the value used in the fe analysis might increase the observed temperatures by about 15 k.

table 4 provides a comparison of the results of all the computed cases (a, b, c) with those from the measurement. the tabulated values indicate a significantly closer agreement of the results of cases b and c with the experimental data, which confirms the importance of including the local boiling effects. the negligible differences between the values computed for cases b and c indicate the insensitivity of the steady-state temperature field to any possible shortcoming in the evaluation of the heat-transfer coefficient for boiling, see fig. 5, which confirms the suitability of the approach chosen for incorporating the local boiling effects into the fe model.

due to some uncertainty about the precision of the placement of the quite lengthy bores, the sensitivity of the calculated values to the positioning of the measuring points was tested: for case c, all the temperatures were also observed at a distance of 20 mm from the bottom margin of the cylinder head. the comparison displayed in fig. 6 affords a rough estimate of the possible errors arising from an inaccurate fit of the points at which the temperatures were observed in the model to the real measuring points.

the results of the calculations shown in the cross-section in fig. 8 provide some interesting information on the temperature distribution within the valves and seats. in particular, there is an apparent positive influence of the cooled seats of the exhaust valves on the thermal loading of those parts, as the temperature values within the exhaust valves only slightly exceed those in the intake valves (which experience a significantly lower thermal loading).

fig. 6: computed temperature vs. path distance. the paths go through the measured points p6, p11 defined in fig. 7. squares indicate the measured temperatures.

fig. 7: measured points p1-p13 distributed over the head. all measurement holes are positioned 18 mm from the bottom of the head. the coloured isotherms represent the computational results of case c (measured values are linked directly with the points).
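a quick way to quantify the agreement reported above is to compute, for each case, the mean deviation from the experimental temperatures in table 4; a minimal sketch (point 4 is skipped, since no experimental value is available for it):

```python
# (case a, case b, case c, experiment) temperatures in k from table 4;
# point 4 is omitted because its experimental value is missing
data = [(568, 488, 488, 425), (558, 485, 485, 509), (497, 430, 429, 442),
        (493, 434, 433, 412), (528, 454, 453, 448), (508, 444, 443, 415),
        (553, 501, 501, 468), (501, 443, 443, 394), (553, 501, 501, 430),
        (516, 448, 448, 400), (489, 445, 443, 361), (518, 486, 483, 414)]

for idx, name in enumerate(["case a", "case b", "case c"]):
    deviations = [row[idx] - row[3] for row in data]
    mean_dev = sum(deviations) / len(deviations)
    print(f"{name}: mean deviation {mean_dev:+.1f} k")
```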
6 conclusion

a steady-state heat transfer analysis of a c/28 big diesel engine cylinder head assembly was performed using fem. the model was verified using the measured temperatures; the computed values correspond to the experimental data. the results of the computation confirmed the need to incorporate the possible occurrence of local boiling and the associated steep changes in the values of the heat-transfer fluxes. an analysis of the thermal loading of the head assembly will be performed in the near future.

references

[1] macek, j., vávra, j., tichánek, r., diviš, m.: calculation of the operating cycle of the 6c28 engine and determination of the boundary conditions for the strength and deformation analysis of the cylinder head bottom (in czech). čvut v praze, fakulta strojní, vcjb, 2001.
[2] macek, j., vítek, o., vávra, j.: a cogeneration unit with a gas engine with an output greater than 3 mw - ii (in czech). čvut v praze, fakulta strojní, 2000.
[3] horák, f., macek, j.: use of predicted fields in main parts of supercharged diesel engine. proceedings of xix. conference of international centre of mass and heat transfer. new york: pergamon press, 1987.
[4] kreith, f., black, w.: basic heat transfer. new york: harper and row, 1980.
[5] baehr, h. d., stephan, k.: heat and mass transfer. berlin: springer-verlag, 1998.

ing. marcel diviš
phone: +420 224 351 827
e-mail: divis@student.fsid.cvut.cz

ing. radek tichánek
phone: +420 224 352 507
e-mail: tichanek@student.fsid.cvut.cz

department of automotive and aerospace engineering
czech technical university in prague
josef božek research center
technická 4, 166 07 prague 6, czech republic

ing. miroslav španiel, csc.
phone: +420 224 352 561
e-mail: spaniel@lin.fsid.cvut.cz

department of mechanics
czech technical university in prague
josef božek research center
technická 4, 166 07 prague 6, czech republic

fig. 8: temperature field on the exhaust valve and seat (case c)

acta polytechnica 58(5):285-291, 2018, doi:10.14311/ap.2018.58.0285

minimal non-integer alphabets allowing parallel addition

jan legerský (a, b)

(a) research institute for symbolic computation, johannes kepler university, altenbergerstraße 69, a-4040 linz, austria
(b) faculty of nuclear sciences and physical engineering, czech technical university in prague, trojanova 13, 120 00 praha 2, czech republic
correspondence: jan.legersky@risc.jku.at

abstract. parallel addition, i.e., addition with limited carry propagation, has so far been studied for complex bases and integer alphabets. we focus on alphabets consisting of integer combinations of powers of the base. we give necessary conditions on the alphabet allowing parallel addition. under certain assumptions, we prove the same lower bound on the size of the generalized alphabet that is known for alphabets consisting of consecutive integers. we also extend the characterization of bases allowing parallel addition to numeration systems with non-integer alphabets.

keywords: numeration system, parallel addition, minimal alphabet.

1. introduction

the concept of parallel addition in a numeration system with a base β and an alphabet a was introduced by a. avizienis [1]. the crucial difference from standard addition is that the carry propagation is limited, and hence an output digit depends only on a bounded number of input digits. therefore, the whole operation can run in constant time in parallel. it is known that the alphabet a must be redundant [2], otherwise parallel addition is not possible. necessary conditions on the base and alphabet were further studied by c. frougny, p. heller, e. pelantová, and m.
svobodová [3-5], under the assumption that the alphabet a consists of consecutive integers containing 0. it was shown that there exists an integer alphabet allowing parallel addition if and only if the base is an algebraic number with no conjugates of modulus 1. lower bounds on the size of the alphabet were given.

the main result of this paper is a generalization of these results to non-integer alphabets, namely a ⊂ z[β]. such alphabets may have elements smaller in modulus compared to integer ones. this is useful, for instance, in online multiplication and division [6]. parallel addition algorithms that use non-integer alphabets are discussed in [7]. the paper [8] discusses consequences of parallel addition for eventually periodic representations in q(β).

this paper is organized as follows: in section 2, we recall the necessary definitions and show that for parallel addition we can consider only bases which are algebraic numbers. in section 3, we prove that if (β,a) allows parallel addition and β′ is a conjugate of β, then there is an alphabet a′ such that (β′,a′) allows parallel addition. if a[β] = z[β], we show that a must contain all representatives modulo β and β − 1. if β is an algebraic integer, a consequence is the same lower bound on the size of a ⊂ z[β] as for integer alphabets. the assumption a[β] = z[β], or the existence of parallel addition without anticipation, implies that β is expanding, i.e., all its conjugates are greater than one in modulus. the key result from [3] is generalized to a ⊂ z[β] in section 4: namely, there is an alphabet in z[β] allowing so-called k-block parallel addition if and only if β is an algebraic number with no conjugates of modulus one.

2. preliminaries

the concept of positional numeration systems with integer bases and digits is very old and can be easily generalized:

definition 2.1. if β ∈ c is such that |β| > 1 and a ⊂ c is a finite set containing 0, then the pair (β,a) is called a numeration system with a base β and a digit set a, usually called an alphabet.

numbers in a numeration system (β,a) are represented in the following way: let x be a complex number and $x_n, x_{n-1}, x_{n-2}, \dots \in a$, $n \ge 0$. we say that $^{\omega}0\,x_n x_{n-1}\cdots x_1 x_0 \bullet x_{-1} x_{-2}\cdots$ is a (β,a)-representation of x if $x = \sum_{j=-\infty}^{n} x_j \beta^j$, where $^{\omega}0$ denotes the left-infinite sequence of zeros.

the set of all numbers which have a (β,a)-representation with only finitely many non-zero digits is denoted by

$$\mathrm{fin}_a(\beta) := \Big\{ \sum_{j=-m}^{n} x_j\beta^j : n, m \in \mathbb{N},\ x_j \in a \Big\}.$$

the set of all numbers with a finite (β,a)-representation with only non-negative powers of β is denoted by

$$a[\beta] := \Big\{ \sum_{j=0}^{n} x_j\beta^j : n \in \mathbb{N},\ x_j \in a \Big\}.$$

we remark that the definition of a[β] is analogous to the one of z[β], i.e., the smallest ring containing z and β, which is equivalent to the set of all sums of powers of β with integer coefficients.
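to make the notation concrete: a finite (β,a)-representation is just a digit string evaluated as a polynomial in β. the following short sketch (our own illustration, not from the paper) evaluates such a representation for base 10 with the redundant signed alphabet {−6, ..., 6}, a classical avizienis-style choice; redundancy shows in the two different strings with the same value.

```python
def value(digits, beta, point=0):
    """evaluate a finite (beta, a)-representation: digits is the list
    (x_n, ..., x_0, x_-1, ..., x_-point) with 'point' digits after
    the radix point; returns sum x_j * beta**j."""
    n = len(digits) - 1 - point
    return sum(x * beta ** (n - i) for i, x in enumerate(digits))

# base 10 with the redundant signed alphabet {-6, ..., 6}:
# the string 1 3 (-4) represents 126, and so does 1 2 6
print(value([1, 3, -4], 10), value([1, 2, 6], 10))
```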
now we show that whenever we require the alphabet to be finite and the sum of two numbers with finite (β,a)-representations to have again a finite (β,a)-representation (which is the case of parallel addition), we can consider only bases which are algebraic numbers.

lemma 2.2. let β be a complex number such that |β| > 1 and let a ⊂ z[β] be a finite alphabet with 0 ∈ a and 1 ∈ fina(β). if n ⊂ fina(β), then β is an algebraic number.

proof. since a ⊂ z[β], all digits can be expressed as finite integer combinations of powers of β. let d be the maximal exponent of β occurring in these expressions and let c be the maximal absolute value of the integer coefficients of all digits in a. hence, for every $N \in \mathbb{N}$, there exist $m, n \in \mathbb{N}$ and $a_{-m}, \dots, a_n \in a$, where $a_i = \sum_{j=0}^{d} \alpha_{ij}\beta^j$ with $\alpha_{ij} \in \mathbb{Z}$ and $|\alpha_{ij}| \le c$, such that

$$N = \sum_{i=-m}^{n} a_i\beta^i = \sum_{i=-m}^{n}\sum_{j=0}^{d} \alpha_{ij}\beta^{i+j}.$$

suppose for contradiction that β is transcendental. then the corresponding integer coefficients of the powers of β on the left-hand side and on the right-hand side must be equal; in particular,

$$\sum_{\substack{i+j=0 \\ 0\le j\le d,\ -m\le i\le n}} \alpha_{ij} = N.$$

this is a contradiction, since the left-hand side is bounded by $(d+1)\cdot c$, whereas $N$ can be arbitrarily large.

corollary 2.3. let β be a complex number such that |β| > 1 and let a ⊂ z[β] be a finite alphabet with 0 ∈ a and 1 ∈ fina(β), resp. 1 ∈ a[β]. if the set fina(β), resp. a[β], is closed under addition, then β is an algebraic number.

proof. the closedness of fina(β) under addition and 1 ∈ fina(β) imply n ⊂ fina(β). if a[β] is closed under addition and 1 ∈ a[β] ⊂ fina(β), then n ⊂ a[β] ⊂ fina(β). in both cases, lemma 2.2 applies.

the concept of parallelism for operations on representations is formalized by the following definition.

definition 2.4. let a and b be alphabets. a function ϕ : b^z → a^z is said to be p-local if there exist r, t ∈ n satisfying p = r + t + 1 and a function φ : b^p → a such that, for any w = (w_j)_{j∈z} ∈ b^z and its image z = ϕ(w) = (z_j)_{j∈z} ∈ a^z, we have $z_j = \phi(w_{j+t}, \dots, w_{j-r})$ for every j ∈ z. the parameter t, resp. r, is called the anticipation, resp. the memory.

in other words, every output digit can be determined from only a limited number of neighboring input digits. since a (β, a + a)-representation of the sum of two numbers can be easily obtained by digit-wise addition, the crucial part of parallel addition is the conversion from the alphabet a + a to a.

definition 2.5. let β be a base and let a and b be alphabets containing 0. a function ϕ : b^z → a^z such that

(1.) for any w = (w_j)_{j∈z} ∈ b^z with finitely many non-zero digits, z = ϕ(w) = (z_j)_{j∈z} ∈ a^z has only a finite number of non-zero digits, and

(2.) $\sum_{j\in\mathbb{Z}} w_j\beta^j = \sum_{j\in\mathbb{Z}} z_j\beta^j$,

is called a digit set conversion in the base β from b to a. such a conversion ϕ is said to be computable in parallel if ϕ is a p-local function for some p ∈ n. parallel addition in a numeration system (β,a) is a digit set conversion in the base β from a + a to a which is computable in parallel.

3. necessary conditions on alphabets allowing parallel addition in a ⊂ z[β]

throughout this section, we assume that the base β is an algebraic number and the alphabet a is a finite subset of z[β] such that {0} ⊊ a. the finiteness of the alphabet is a natural assumption for a practical numeration system, whereas the requirement that β is an algebraic number is justified by corollary 2.3.

we recall that, for an algebraic number β, if α, γ, δ are elements of z[β], then γ is congruent to δ modulo α in z[β], denoted by γ ≡_α δ, if there exists ε ∈ z[β] such that γ − δ = αε.

in this section, we recall the known results on the necessary properties of integer alphabets allowing parallel addition, and we extend them to non-integer alphabets. in [4], the following statement is proven:

theorem 3.1. let (β,a) be a numeration system such that a ⊂ z[β]. if there exists a p-local parallel addition in (β,a) defined by a function φ : (a+a)^p → a, then φ(b, ..., b) ≡_{β−1} b for any b ∈ a + a.

the same paper explains that, when considering only integer alphabets a ⊂ z from the perspective of parallel addition algorithms, all the numbers β, 1/β, and their algebraic conjugates behave analogously: parallel addition algorithms exist either for all of them, or for none of them. this statement can be extended to non-integer alphabets as well; the following lemma summarizes that, if we have a parallel addition algorithm for a base β, then we easily obtain such an algorithm also for the conjugates of β via the field isomorphism. regarding the base 1/β, we can use the equality fina(β) = fina(1/β) to transfer the parallel addition algorithm, and thus, in fact, drop the requirement on the base to be greater than 1 in modulus.
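the p-locality of definition 2.4 is a sliding window: each output digit is a fixed function of t input digits to its left (anticipation) and r to its right (memory). a minimal sketch, with a deliberately trivial window function chosen by us for illustration (the helper name is ours):

```python
def apply_p_local(w, phi, t, r, pad=0):
    """apply a p-local map (p = r + t + 1) to a dict {index: digit}
    representing a two-sided sequence with finitely many non-zeros:
    z_j = phi(w_{j+t}, ..., w_{j-r}); outside the computed range the
    output equals phi(0, ..., 0)."""
    if not w:
        return {}
    lo, hi = min(w) - t, max(w) + r
    z = {}
    for j in range(lo, hi + 1):
        window = tuple(w.get(j + t - i, pad) for i in range(r + t + 1))
        z[j] = phi(*window)
    return z

# toy window function with t = 1, r = 1: it passes the middle digit
# through, so the resulting digit set conversion is the identity
phi = lambda left, mid, right: mid
print(apply_p_local({0: 3, 1: 5}, phi, t=1, r=1))
```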
3. Necessary conditions on alphabets allowing parallel addition in $\mathcal{A} \subset \mathbb{Z}[\beta]$

Throughout this section, we assume that the base $\beta$ is an algebraic number and the alphabet $\mathcal{A}$ is a finite subset of $\mathbb{Z}[\beta]$ such that $\{0\} \subsetneq \mathcal{A}$. The finiteness of the alphabet is a natural assumption for a practical numeration system, whereas the requirement that $\beta$ is an algebraic number is justified by Corollary 2.3.

We recall that for an algebraic number $\beta$, if $\alpha,\gamma,\delta$ are elements of $\mathbb{Z}[\beta]$, then $\gamma$ is congruent to $\delta$ modulo $\alpha$ in $\mathbb{Z}[\beta]$, denoted by $\gamma \equiv_\alpha \delta$, if there exists $\varepsilon \in \mathbb{Z}[\beta]$ such that $\gamma - \delta = \alpha\varepsilon$.

In this section, we recall the known results on necessary properties of integer alphabets allowing parallel addition, and we extend them to non-integer alphabets. In [4], the following statement is proven:

Theorem 3.1. Let $(\beta,\mathcal{A})$ be a numeration system such that $\mathcal{A} \subset \mathbb{Z}[\beta]$. If there exists a $p$-local parallel addition in $(\beta,\mathcal{A})$ defined by a function $\Phi : (\mathcal{A}+\mathcal{A})^p \to \mathcal{A}$, then $\Phi(b,\ldots,b) \equiv_{\beta-1} b$ for any $b \in \mathcal{A}+\mathcal{A}$.

The same paper explains that, when considering only integer alphabets $\mathcal{A} \subset \mathbb{Z}$ from the perspective of parallel addition algorithms, all the numbers $\beta$, $1/\beta$ and their algebraic conjugates behave analogously: parallel addition algorithms exist either for all of them, or for none. This statement can be extended to non-integer alphabets as well; the following lemma summarizes that if we have a parallel addition algorithm for a base $\beta$, then we easily obtain such an algorithm also for the conjugates of $\beta$ by a field isomorphism. Regarding the base $1/\beta$, we can use the equality $\mathrm{Fin}_\mathcal{A}(\beta) = \mathrm{Fin}_\mathcal{A}(1/\beta)$ to transfer the parallel addition algorithm, and thus in fact drop the requirement on the base to be greater than 1 in modulus.

Lemma 3.2. Let $(\beta,\mathcal{A})$ be a numeration system such that $\mathcal{A} \subset \mathbb{Z}[\beta]$ and $\beta$ is an algebraic number. Let $\beta'$ be a conjugate of $\beta$ such that $|\beta'| \neq 1$ and let $\sigma : \mathbb{Q}(\beta) \to \mathbb{Q}(\beta')$ be the corresponding field isomorphism. If there is a $p$-local parallel addition function $\varphi$ in $(\beta,\mathcal{A})$, then there exists a $p$-local parallel addition function $\varphi'$ in $(\beta',\mathcal{A}')$, where $\mathcal{A}' = \{\sigma(a) : a \in \mathcal{A}\}$.

Proof. Let $\Phi : \mathcal{A}^p \to \mathcal{A}$ be a mapping which defines $\varphi$ with $p = r+t+1$. We define a mapping $\Phi' : (\mathcal{A}')^p \to \mathcal{A}'$ by
$$\Phi'(w'_{j+t},\ldots,w'_{j-r}) = \sigma\big(\Phi(\sigma^{-1}(w'_{j+t}),\ldots,\sigma^{-1}(w'_{j-r}))\big).$$
Next, we define a digit set conversion $\varphi' : (\mathcal{A}'+\mathcal{A}')^{\mathbb{Z}} \to (\mathcal{A}')^{\mathbb{Z}}$ by $\varphi'(w') = (z'_j)_{j\in\mathbb{Z}}$, where $w' = (w'_j)_{j\in\mathbb{Z}}$ and $z'_j = \Phi'(w'_{j+t},\ldots,w'_{j-r})$. Obviously, if $w'$ has only finitely many non-zero entries, then there are only finitely many non-zeros in $(z'_j)_{j\in\mathbb{Z}}$, since $\Phi'(0,\ldots,0) = \sigma(\Phi(0,\ldots,0)) = \sigma(0) = 0$. The value of the number represented by $w'$ is also preserved:
$$\sum_{j\in\mathbb{Z}} w'_j\beta'^j = \sum_{j\in\mathbb{Z}} \sigma(w_j)\sigma(\beta)^j = \sigma\Big(\sum_{j\in\mathbb{Z}} w_j\beta^j\Big) = \sigma\Big(\sum_{j\in\mathbb{Z}} z_j\beta^j\Big) = \sum_{j\in\mathbb{Z}} \sigma\big(\Phi(w_{j+t},\ldots,w_{j-r})\big)\beta'^j = \sum_{j\in\mathbb{Z}} z'_j\beta'^j,$$
where $w_j = \sigma^{-1}(w'_j)$ for $j \in \mathbb{Z}$ and $\varphi((w_j)_{j\in\mathbb{Z}}) = (z_j)_{j\in\mathbb{Z}}$. $\square$

Next, it is shown again in [4] that if a base $\beta$ has a real conjugate greater than one, then there are some extra requirements on the alphabet $\mathcal{A} \subset \mathbb{Z}[\beta]$. The following lemma strengthens those results a bit. We write $\lambda_{\min} = \min\mathcal{A}$ and $\lambda_{\max} = \max\mathcal{A}$.

Lemma 3.3. Let $(\beta,\mathcal{A})$ be a numeration system such that $\mathcal{A} \subset \mathbb{Z}[\beta]$ and $1 < \beta \in \mathbb{R}$. If there exists a $p$-local parallel addition in $(\beta,\mathcal{A})$, with $p = r+t+1$, defined by a mapping $\Phi : (\mathcal{A}+\mathcal{A})^p \to \mathcal{A}$, then:
(1.) $\Phi(b,\ldots,b) \neq \lambda_{\min}$ for all $b \in \mathcal{A}+\mathcal{A}$ such that $b > \lambda_{\min}$ and ($b \geq 0$ or $t = 0$),
(2.) $\Phi(b,\ldots,b) \neq \lambda_{\max}$ for all $b \in \mathcal{A}+\mathcal{A}$ such that $b < \lambda_{\max}$ and ($b \leq 0$ or $t = 0$),
(3.) if $\lambda_{\max} \neq 0$, then $\Phi(\lambda_{\max},\ldots,\lambda_{\max}) \neq \lambda_{\max}$,
(4.) if $\lambda_{\min} \neq 0$, then $\Phi(\lambda_{\min},\ldots,\lambda_{\min}) \neq \lambda_{\min}$.

Proof. (1.): Let $b \in \mathcal{A}+\mathcal{A}$ be such that $b > \lambda_{\min}$. Assume, for contradiction, that $\Phi(b,\ldots,b) = \lambda_{\min}$. We follow the proof of Claim 3.5 in [4]. For any $n \in \mathbb{N}$, $n \geq 1$, we consider the number represented by
$$^\omega 0\ \underbrace{b\cdots b}_{t}\ \underbrace{b\cdots b}_{n}\ \bullet\ \underbrace{b\cdots b}_{r}\ 0^\omega.$$
Its representation after the digit set conversion has the form
$$^\omega 0\ \underbrace{w_{r+t}\cdots w_1}_{\text{value } \beta^n w}\ \underbrace{\lambda_{\min}\cdots\lambda_{\min}}_{n}\ \bullet\ \tilde w_1\cdots\tilde w_{r+t}\ 0^\omega,$$
where $w = \sum_{j=1}^{r+t} w_j\beta^{j-1}$ and $w_j,\tilde w_j \in \mathcal{A}$. Since both representations have the same value, we get
$$b\sum_{j=-r}^{n+t-1}\beta^j = \beta^n w + \lambda_{\min}\sum_{j=0}^{n-1}\beta^j + \sum_{j=1}^{r+t}\tilde w_j\beta^{-j} \quad\text{for all } n \geq 1.$$
Corollary 3.6 in [4] gives $w = \frac{b\beta^t-\lambda_{\min}}{\beta-1}$. Thus
$$b\sum_{j=-r}^{-1}\beta^j + b\,\frac{\beta^{n+t}-1}{\beta-1} = \beta^n\,\frac{b\beta^t-\lambda_{\min}}{\beta-1} + \lambda_{\min}\,\frac{\beta^n-1}{\beta-1} + \sum_{j=1}^{t+r}\tilde w_j\beta^{-j}.$$
Hence
$$b\Big(\sum_{j=1}^{r}\frac{1}{\beta^j} - \frac{1}{\beta-1}\Big) = -\frac{\lambda_{\min}}{\beta-1} + \sum_{j=1}^{t+r}\frac{\tilde w_j}{\beta^j} \geq \lambda_{\min}\Big(-\frac{1}{\beta-1} + \sum_{j=1}^{t+r}\frac{1}{\beta^j}\Big).$$
Using $\frac{1}{\beta-1} = \sum_{j=1}^{\infty}\frac{1}{\beta^j}$, we get
$$-\frac{b}{\beta^r}\cdot\frac{1}{\beta-1} = -b\sum_{j=r+1}^{\infty}\frac{1}{\beta^j} \geq -\lambda_{\min}\sum_{j=r+t+1}^{\infty}\frac{1}{\beta^j} = -\frac{\lambda_{\min}}{\beta^{r+t}}\cdot\frac{1}{\beta-1}.$$
Thus we have $\lambda_{\min} \geq b\beta^t$. If $t = 0$, this contradicts the assumption $b > \lambda_{\min}$. If $b \geq 0$, then $\lambda_{\min} \geq b\beta^t \geq b$ since $\beta > 1$, which is also a contradiction. The proof of (2.) is similar. For (3.) and (4.), see [4]. $\square$

Now we can conclude with the following statement.

Theorem 3.4. Let $(\beta,\mathcal{A})$ be a numeration system such that $\mathcal{A} \subset \mathbb{Z}[\beta]$, $\beta$ is an algebraic number with a positive real conjugate, and there is parallel addition in $(\beta,\mathcal{A})$. If $\lambda_{\min} \equiv_{\beta-1} \lambda_{\max}$, then there exists $c \in \mathcal{A}$, $\lambda_{\min} \neq c \neq \lambda_{\max}$, such that $\lambda_{\min} \equiv_{\beta-1} c \equiv_{\beta-1} \lambda_{\max}$. If $\lambda_{\min} \not\equiv_{\beta-1} \lambda_{\max}$, then there exist $a,b \in \mathcal{A}$, $a \neq \lambda_{\min}$, $b \neq \lambda_{\max}$, such that $a \equiv_{\beta-1} \lambda_{\min}$ and $b \equiv_{\beta-1} \lambda_{\max}$.

Proof. By Lemma 3.2, we can assume that the base $\beta$ itself is real and greater than one. Let $\Phi$ be a mapping which defines the parallel addition. Since $\{0\} \subsetneq \mathcal{A}$, we know that $\lambda_{\max} > 0$ or $\lambda_{\min} < 0$. Assume that $\lambda_{\max} > 0$; the latter case is analogous. By Theorem 3.1 and Lemma 3.3, $\lambda_{\max} \equiv_{\beta-1} \Phi(\lambda_{\max},\ldots,\lambda_{\max}) \in \mathcal{A}$ and $\lambda_{\min} \neq \Phi(\lambda_{\max},\ldots,\lambda_{\max}) \neq \lambda_{\max}$. Hence, $\Phi(\lambda_{\max},\ldots,\lambda_{\max})$ is a digit of $\mathcal{A}$ which belongs to the same congruence class as $\lambda_{\max}$. If $\lambda_{\min} \equiv_{\beta-1} \lambda_{\max}$, the claim follows with $c = \Phi(\lambda_{\max},\ldots,\lambda_{\max})$.

The case $\lambda_{\min} \not\equiv_{\beta-1} \lambda_{\max}$ is divided into two sub-cases. If $\lambda_{\min} \neq 0$, then $\lambda_{\min} \equiv_{\beta-1} \Phi(\lambda_{\min},\ldots,\lambda_{\min}) \in \mathcal{A}$ and $\Phi(\lambda_{\min},\ldots,\lambda_{\min}) \neq \lambda_{\min}$, again by Theorem 3.1 and Lemma 3.3, which implies the statement with $a = \Phi(\lambda_{\min},\ldots,\lambda_{\min})$ and $b = \Phi(\lambda_{\max},\ldots,\lambda_{\max})$. If $\lambda_{\min} = 0$, then all elements of $\lambda_{\max} + \mathcal{A}$ are positive. Suppose, for contradiction, that there is no non-zero element of the alphabet $\mathcal{A}$ congruent to 0 modulo $\beta-1$. Let $k$ be the number of congruence classes occurring in $\mathcal{A}$ and let $R$ be a subset of $\mathcal{A}$ such that there is exactly one representative of each of those $k$ congruence classes. For $d \in \lambda_{\max}+R$, the value $\Phi(d,\ldots,d) \in \mathcal{A}$ is not congruent to 0, as $\Phi(d,\ldots,d) \neq \lambda_{\min} = 0$ by Lemma 3.3 and the congruence class containing zero has only one element in $\mathcal{A}$ by the previous assumption. Therefore, the values $f_j = \Phi(d_j,\ldots,d_j) \in \mathcal{A}$ for the $k$ distinct digits $d_j = \lambda_{\max}+e_j \in \lambda_{\max}+R$ belong to only $k-1$ congruence classes modulo $\beta-1$. Hence, there exist two distinct elements $d_1,d_2 \in \lambda_{\max}+R$ such that $f_1 \equiv_{\beta-1} f_2$. Since $f_j = \Phi(d_j,\ldots,d_j) \equiv_{\beta-1} d_j = \lambda_{\max}+e_j$ for each $j$, we obtain also $e_1 \equiv_{\beta-1} e_2$, which contradicts the construction of the set $R$. $\square$

3.1. $\mathcal{A}[\beta]$ closed under addition

In order to express the properties of the alphabet $\mathcal{A}$ allowing parallel addition in terms of representatives modulo $\beta$ and $\beta-1$, we restrict ourselves to alphabets such that $\mathcal{A}[\beta]$ is closed under addition, or even satisfying the slightly stronger condition $\mathcal{A}[\beta] = \mathbb{Z}[\beta]$. The following theorem summarizes some consequences of these assumptions.

Theorem 3.5. Let $(\beta,\mathcal{A})$ be a numeration system such that $\mathcal{A} \subset \mathbb{Z}[\beta]$ and $1 \in \mathcal{A}[\beta]$. The following statements hold:
(1.) If $\mathcal{A}[\beta]$ is closed under addition, then $\mathbb{N}[\beta] \subset \mathcal{A}[\beta]$.
(2.) $\mathcal{A}[\beta]$ is an additive Abelian group if and only if $\mathcal{A}[\beta] = \mathbb{Z}[\beta]$.
(3.) If $\mathbb{N} \subset \mathcal{A}[\beta]$, then $\beta$ is expanding, i.e., $\beta$ is an algebraic number with all conjugates greater than 1 in modulus.

Proof. (1.): Obviously, if $\mathcal{A}[\beta]$ is closed under addition, then $\mathbb{N} \subset \mathcal{A}[\beta]$. Since $0 \in \mathcal{A}$ by our general assumption, also $\beta\cdot\mathcal{A}[\beta] \subset \mathcal{A}[\beta]$. Therefore $\beta\cdot\mathbb{N} \subset \mathcal{A}[\beta]$, and the claim $\mathbb{N}[\beta] \subset \mathcal{A}[\beta]$ follows by induction.
(2.): The assumption that $\mathcal{A}[\beta]$ is closed also under subtraction together with (1.) implies that $\mathbb{Z}[\beta] = \mathbb{N}[\beta]-\mathbb{N}[\beta] \subset \mathcal{A}[\beta]$, and obviously $\mathcal{A}[\beta] \subset \mathbb{Z}[\beta]$. The opposite implication is trivial.
(3.): By Lemma 2.2, $\beta$ is an algebraic number, since $\mathcal{A}[\beta] \subset \mathrm{Fin}_\mathcal{A}(\beta)$. The proof that $\beta$ must be expanding is based on the paper of S. Akiyama and T. Zaïmi [9].
Let $\beta'$ be an algebraic conjugate of $\beta$ and let $\sigma : \mathbb{Q}(\beta) \to \mathbb{Q}(\beta')$ be the field isomorphism such that $\sigma(\beta) = \beta'$. Since $\mathbb{N} \subset \mathcal{A}[\beta]$, for all $N \in \mathbb{N}$ there exist $a_0,\ldots,a_n \in \mathcal{A}$ such that
$$\sum_{i=0}^{n} a_i\beta^i = N = \sigma(N) = \sum_{i=0}^{n} \sigma(a_i)(\beta')^i.$$
Denoting $\tilde M := \max\{|\sigma(a)| : a \in \mathcal{A}\}$, we have
$$N = |N| \leq \sum_{i=0}^{n} |\sigma(a_i)|\cdot|\beta'|^i \leq \tilde M\sum_{i=0}^{\infty} |\beta'|^i.$$
As $N$ is arbitrarily large, the sum on the right-hand side diverges, which implies that $|\beta'| \geq 1$. Thus, all conjugates of $\beta$ are at least one in modulus.

If the degree of $\beta$ is one, the statement is obvious; therefore we may assume that $\deg\beta \geq 2$. Suppose for contradiction that $|\beta'| = 1$ for an algebraic conjugate $\beta'$ of $\beta$. The complex conjugate $\overline{\beta'}$ is also an algebraic conjugate of $\beta$. Take any algebraic conjugate $\gamma$ of $\beta$ and the isomorphism $\sigma' : \mathbb{Q}(\beta') \to \mathbb{Q}(\gamma)$ given by $\sigma'(\beta') = \gamma$. Now
$$\frac{1}{\gamma} = \frac{1}{\sigma'(\beta')} = \sigma'\Big(\frac{1}{\beta'}\Big) = \sigma'\Big(\frac{\overline{\beta'}}{\beta'\overline{\beta'}}\Big) = \sigma'\Big(\frac{\overline{\beta'}}{|\beta'|^2}\Big) = \sigma'(\overline{\beta'}).$$
Hence $\frac{1}{\gamma}$ is also an algebraic conjugate of $\beta$. Moreover, $\big|\frac{1}{\gamma}\big| \geq 1$ and $|\gamma| \geq 1$, which implies $|\gamma| = 1$. We may choose $\gamma = \beta$, which contradicts $|\beta| > 1$. Thus all conjugates of $\beta$ are greater than one in modulus, i.e., $\beta$ is an expanding algebraic number. $\square$

Let us remark that the assumption of $\mathcal{A}[\beta]$ being closed under addition is satisfied by a wide class of numeration systems. Namely, if a numeration system $(\beta,\mathcal{A})$ allows $p$-local parallel addition such that $p = r+1$, i.e., there is no anticipation, then $\mathcal{A}[\beta]$ is obviously closed under addition. Hence, (1.) and (3.) give the following corollary.

Corollary 3.6. Let $(\beta,\mathcal{A})$ be a numeration system such that $1 \in \mathcal{A}[\beta]$ and $\mathcal{A} \subset \mathbb{Z}[\beta]$. If $(\beta,\mathcal{A})$ allows parallel addition without anticipation, then $\beta$ is expanding.

For $\beta$ expanding, Lemma 8 in [10] provides a so-called weak representation of zero property such that the absolute term is dominant, and hence parallel addition in the base $\beta$ without anticipation is obtained for some integer alphabet $\mathcal{A}_{int}$ according to Theorem 4.3 in [5].

In what follows, we assume $\mathcal{A}[\beta] = \mathbb{Z}[\beta]$, although the weaker assumption of $\mathcal{A}[\beta]$ being closed under addition would be sufficient. The reason is that subtraction is also required in applications using parallel addition, such as on-line multiplication and division; hence the assumption is justified by (2.) of Theorem 3.5. Let us mention that, for instance, the numeration system $(2,\{0,1,2\})$ allows parallel addition, but $\{0,1,2\}[2]$ is obviously not closed under subtraction.

Theorem 3.7. If a numeration system $(\beta,\mathcal{A})$ with $\mathcal{A}[\beta] = \mathbb{Z}[\beta]$ allows parallel addition, then the alphabet $\mathcal{A}$ contains at least one representative of each congruence class modulo $\beta$ and modulo $\beta-1$ in $\mathbb{Z}[\beta]$.

Proof. Let $x = \sum_{i=0}^{n} x_i\beta^i$ be an element of $\mathbb{Z}[\beta]$. Since $x_0 \in \mathbb{Z} \subset \mathcal{A}[\beta]$, we have
$$x \equiv_\beta x_0 = \sum_{i=0}^{n} a_i\beta^i \equiv_\beta a_0,$$
where $a_i \in \mathcal{A}$. Hence, for any element $x \in \mathbb{Z}[\beta]$, there is a digit $a_0 \in \mathcal{A}$ such that $x \equiv_\beta a_0$.

In order to prove that there is an element of $\mathcal{A}$ congruent to $x$ modulo $\beta-1$, we use the binomial theorem:
$$x = \sum_{i=0}^{n} x_i\beta^i = \sum_{i=0}^{n} x_i(\beta-1+1)^i = \sum_{j=0}^{n} x'_j(\beta-1)^j$$
for some $x'_j \in \mathbb{Z}$. Hence
$$x \equiv_{\beta-1} x'_0 = \sum_{i=0}^{n} a_i\beta^i$$
for some $a_i \in \mathcal{A}$. We prove by induction with respect to $n$ that $x'_0 \equiv_{\beta-1} a$ for some $a \in \mathcal{A}$. If $n = 0$, then $x'_0 = a_0 \in \mathcal{A}$. For $n+1$, we have
$$x'_0 = \sum_{i=0}^{n+1} a_i\beta^i = a_0 + (\beta-1)\sum_{i=0}^{n} a_{i+1}\beta^i + \sum_{i=0}^{n} a_{i+1}\beta^i \equiv_{\beta-1} a_0 + a' \equiv_{\beta-1} a \in \mathcal{A},$$
where we use the induction assumption $\sum_{i=0}^{n} a_{i+1}\beta^i \equiv_{\beta-1} a' \in \mathcal{A}$ and the statement of Theorem 3.1, i.e., for each digit $b \in \mathcal{A}+\mathcal{A}$ there is a digit $a \in \mathcal{A}$ such that $b \equiv_{\beta-1} a$. $\square$
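Theorem 3.7 is easy to check numerically in the integer case, where congruence modulo $\beta$ and $\beta-1$ in $\mathbb{Z}[\beta]$ reduces to ordinary congruence in $\mathbb{Z}$. The following minimal sketch (our own illustration, with $\beta = 3$ and the alphabet $\{-2,\ldots,2\}$, a system known to allow parallel addition [5]) confirms that the alphabet hits every class:

```python
# For an integer base beta, congruence modulo beta (resp. beta - 1) in Z[beta]
# is ordinary congruence in Z. The alphabet A = {-2,...,2} for beta = 3 must
# therefore contain every residue mod 3 and mod 2, as Theorem 3.7 requires.

beta = 3
A = range(-2, 3)

classes_mod_beta = {a % beta for a in A}
classes_mod_beta_minus_1 = {a % (beta - 1) for a in A}

assert classes_mod_beta == set(range(beta))              # all residues mod 3
assert classes_mod_beta_minus_1 == set(range(beta - 1))  # all residues mod 2
print("A contains representatives of every class mod beta and mod beta - 1")
```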
3.2. Lower bound on $\#\mathcal{A}$

When deriving the minimal size of alphabets for parallel addition, we assume (in this whole subsection) that the base $\beta$ is an algebraic integer, since this enables us to count the number of congruence classes, and hence to provide an explicit lower bound on the size of an alphabet allowing parallel addition. In what follows, the monic minimal polynomial of an algebraic integer $\alpha$ is denoted by $m_\alpha$.

Let $d$ be the degree of $\beta$. It is well known that $\mathbb{Z}[\beta] = \{\sum_{i=0}^{d-1} x_i\beta^i : x_i \in \mathbb{Z}\}$ if and only if $\beta$ is an algebraic integer. Hence, there is an obvious bijection $\pi : \mathbb{Z}[\beta] \to \mathbb{Z}^d$ given by $\pi(u) = (u_0,u_1,\ldots,u_{d-1})^T$ for every $u = \sum_{i=0}^{d-1} u_i\beta^i \in \mathbb{Z}[\beta]$. Moreover, the additive group $\mathbb{Z}^d$ can be equipped with a multiplication such that $\pi$ is a ring isomorphism. In order to do that, we recall the concept of the companion matrix.

Definition 3.8. Let $p(x) = x^d + p_{d-1}x^{d-1} + \cdots + p_1x + p_0 \in \mathbb{Z}[x]$ be a monic polynomial with integer coefficients, $d \geq 1$. The matrix
$$S := \begin{pmatrix} 0 & 0 & \cdots & 0 & -p_0 \\ 1 & 0 & \cdots & 0 & -p_1 \\ 0 & 1 & \cdots & 0 & -p_2 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & -p_{d-1} \end{pmatrix} \in \mathbb{Z}^{d\times d}$$
is the companion matrix of the polynomial $p$.

It is well known (see for instance [11]) that the characteristic polynomial of the companion matrix $S$ is $p$. The matrix $S$ is also a root of the polynomial $p$. The claim of the following theorem, which provides the required multiplication in $\mathbb{Z}^d$, is discussed in [12]; these topics are elaborated in more detail in [13, 14].

Theorem 3.9. Let $\beta$ be an algebraic integer of degree $d \geq 1$ and let $S$ be the companion matrix of $m_\beta$. If the multiplication $\odot_\beta : \mathbb{Z}^d\times\mathbb{Z}^d \to \mathbb{Z}^d$ is defined by
$$u \odot_\beta v := \Big(\sum_{i=0}^{d-1} u_iS^i\Big)\cdot v \quad\text{for all } u = (u_0,u_1,\ldots,u_{d-1})^T,\ v \in \mathbb{Z}^d,$$
then $(\mathbb{Z}^d,+,\odot_\beta)$ is a commutative ring which is isomorphic to $\mathbb{Z}[\beta]$ by the mapping $\pi$.

One of the consequences is the following lemma. Although it is a known result, we include its proof here to be more self-contained. Let us recall that for a non-singular integer matrix $M \in \mathbb{Z}^{d\times d}$, two vectors $x,y \in \mathbb{Z}^d$ are congruent modulo $M$ in $\mathbb{Z}^d$, denoted by $x \equiv_M y$, if $x-y \in M\mathbb{Z}^d$.

Lemma 3.10. Let $\beta$ be an algebraic integer of degree $d$ and let $\alpha \in \mathbb{Z}[\beta]$ be such that $\deg\alpha = \deg\beta$. The number of congruence classes modulo $\alpha$ in $\mathbb{Z}[\beta]$ is $|m_\alpha(0)|$.

Proof. The number $\alpha$ is an algebraic integer, since it is well known that sums and products of algebraic integers are algebraic integers. Let $\gamma,\delta \in \mathbb{Z}[\beta]$ and let $S$ be the companion matrix of the minimal polynomial $m_\beta$ of the algebraic integer $\beta$. Let $\pi(\alpha) = (a_0,a_1,\ldots,a_{d-1})^T$, with $\alpha = \sum_{i=0}^{d-1} a_i\beta^i$. If we set $S_\alpha := \sum_{i=0}^{d-1} a_iS^i$, then the congruences $\equiv_\alpha$ in $\mathbb{Z}[\beta]$ and $\equiv_{S_\alpha}$ in $\mathbb{Z}^d$ fulfill:
$$\gamma \equiv_\alpha \delta \iff \exists\,\varepsilon \in \mathbb{Z}[\beta] : \gamma-\delta = \alpha\varepsilon \iff \exists\,z = \pi(\varepsilon) \in \mathbb{Z}^d : \pi(\gamma)-\pi(\delta) = \pi(\gamma-\delta) = \pi(\alpha)\odot_\beta z = S_\alpha\cdot z \iff \pi(\gamma) \equiv_{S_\alpha} \pi(\delta).$$
Thus, the number of congruence classes modulo $\alpha$ in $\mathbb{Z}[\beta]$ equals the number of congruence classes modulo $S_\alpha$ in $\mathbb{Z}^d$, which is known to be $|\det S_\alpha|$.

To show that $|\det S_\alpha| = |m_\alpha(0)|$, we proceed in the following way. The characteristic polynomial of the companion matrix $S$ is the same as the minimal polynomial of $\beta$. Since minimal polynomials have no multiple roots, $S$ is diagonalizable over $\mathbb{C}$, i.e., $S = P^{-1}DP$, where $D$ is a diagonal matrix with the conjugates of $\beta$ on the diagonal and $P$ is a non-singular complex matrix. The matrix $S_\alpha$ is also diagonalized by $P$:
$$S_\alpha = \sum_{i=0}^{d-1} a_iS^i = \sum_{i=0}^{d-1} a_i(P^{-1}DP)^i = P^{-1}\underbrace{\Big(\sum_{i=0}^{d-1} a_iD^i\Big)}_{D_\alpha}P.$$
It is known (see for instance [15]) that if $\sigma : \mathbb{Q}(\beta) \to \mathbb{Q}(\beta')$ is a field isomorphism and $\alpha \in \mathbb{Q}(\beta)$, then $\sigma(\alpha)$ is a conjugate of $\alpha$, and we obtain all conjugates of $\alpha$ in this way. Since $\alpha = \sum_{i=0}^{d-1} a_i\beta^i$, $\deg\alpha = \deg\beta$ and $D$ has the conjugates of $\beta$ on the diagonal, the diagonal elements of the diagonal matrix $D_\alpha$ are precisely all the conjugates of $\alpha$. Hence, $|\det S_\alpha|$ equals the absolute value of the product of all conjugates of $\alpha$, which is $|m_\alpha(0)|$. $\square$
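Lemma 3.10 can be illustrated numerically. A minimal sketch, using the base $\beta = -\sqrt5$ (minimal polynomial $x^2-5$, counting here in $\mathbb{Z}[\beta]$) and $\alpha = \beta-1$; the expected number of classes is $|m_\beta(1)| = 4$:

```python
import numpy as np

# Companion matrix of m_beta(x) = x^2 - 5, i.e. p0 = -5, p1 = 0 (Definition 3.8).
S = np.array([[0, 5],
              [1, 0]])

# alpha = beta - 1 has pi(alpha) = (-1, 1)^T, so S_alpha = -I + S.
S_alpha = -np.eye(2) + S

# Lemma 3.10: the number of classes modulo alpha equals |det S_alpha|,
# which must coincide with |m_{beta-1}(0)| = |m_beta(1)| = |1 - 5| = 4.
n_classes = round(abs(np.linalg.det(S_alpha)))
assert n_classes == 4
print("classes modulo beta - 1 in Z[beta]:", n_classes)
```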
Finally, we put together the fact that the alphabet $\mathcal{A}$ for parallel addition in base $\beta$ contains all representatives modulo $\beta$ and modulo $\beta-1$, the derived formula for the number of congruence classes, and also the specific restrictions on alphabets for parallel addition in a base with some positive real conjugate.

Theorem 3.11. Let $(\beta,\mathcal{A})$ be a numeration system such that $\beta$ is an algebraic integer and $\mathcal{A}[\beta] = \mathbb{Z}[\beta]$. If the numeration system $(\beta,\mathcal{A})$ allows parallel addition, then
$$\#\mathcal{A} \geq \max\{|m_\beta(0)|,\,|m_\beta(1)|\}.$$
Moreover, if $\beta$ has a positive real conjugate, then
$$\#\mathcal{A} \geq \max\{|m_\beta(0)|,\,|m_\beta(1)|+2\}.$$

Proof. By Theorem 3.7, the alphabet $\mathcal{A}$ for parallel addition must contain all representatives modulo $\beta$ and modulo $\beta-1$ in $\mathbb{Z}[\beta]$. The numbers of congruence classes are $|m_\beta(0)|$ and $|m_{\beta-1}(0)|$, respectively, by Lemma 3.10. Obviously, $m_{\beta-1}(x) = m_\beta(x+1)$; thus $m_{\beta-1}(0) = m_\beta(1)$. Theorem 3.4 ensures that if the minimal and maximal elements of $\mathcal{A}$ are congruent modulo $\beta-1$, then there are at least three digits of $\mathcal{A}$ in this class. Otherwise, the class of the minimal and also the class of the maximal element of $\mathcal{A}$ have at least two elements. Both cases lead to the conclusion $\#\mathcal{A} \geq |m_\beta(1)|+2$. $\square$

We remark that the obtained bound is basically the same as the one for integer alphabets in [4].

4. Necessary and sufficient condition on bases for parallel addition

P. Kornerup [2] proposed a more general concept of parallel addition called $k$-block parallel addition. The idea is that blocks of $k$ digits are considered as one digit in the new numeration system, whose base is the $k$-th power of the original one.

Definition 4.1. For a positive integer $k$, the numeration system $(\beta,\mathcal{A})$ allows $k$-block parallel addition if there exists parallel addition in $(\beta^k,\mathcal{A}^{(k)})$, where $\mathcal{A}^{(k)} = \{a_{k-1}\beta^{k-1} + \cdots + a_1\beta + a_0 : a_i \in \mathcal{A}\}$.

We remark that 1-block parallel addition is the same as parallel addition.

C. Frougny, P. Heller, E. Pelantová and M. Svobodová [3] showed that for a given base $\beta$, there exists an integer alphabet $\mathcal{A}$ such that $(\beta,\mathcal{A})$ allows parallel addition if and only if $\beta$ is an algebraic number with no conjugates of modulus 1. Moreover, it was shown that the concept of $k$-block parallel addition does not enlarge the class of bases allowing parallel addition in the case of integer alphabets. We prove an extension of these statements to alphabets being subsets of $\mathbb{Z}[\beta]$ in Theorem 4.2. Although the $k$-block concept does not enlarge the class of bases for parallel addition, it might decrease the minimal size of the alphabet.
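The construction of $\mathcal{A}^{(k)}$ in Definition 4.1 is mechanical; the following short sketch (our own illustration) enumerates the block alphabet:

```python
from itertools import product

def block_alphabet(beta, A, k):
    """A^(k) from Definition 4.1: the values of all k-digit blocks over A,
    where block = (a_0, ..., a_{k-1}) represents a_0 + a_1*beta + ..."""
    return sorted({sum(a * beta**i for i, a in enumerate(block))
                   for block in product(A, repeat=k)})

# Illustrative choice: beta = 2 with A = {-1, 0, 1}; the 2-blocks serve as
# digits for the base beta^2 = 4 and form the alphabet {-3,...,3}.
print(block_alphabet(2, [-1, 0, 1], 2))   # [-3, -2, -1, 0, 1, 2, 3]
```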
Theorem 4.2. Let $\beta$ be a complex number such that $|\beta| > 1$. There exists an alphabet $\mathcal{A} \subset \mathbb{Z}[\beta]$ with $0 \in \mathcal{A}$ and $1 \in \mathrm{Fin}_\mathcal{A}(\beta)$ which allows $k$-block parallel addition in $(\beta,\mathcal{A})$ for some $k \in \mathbb{N}$ if and only if $\beta$ is an algebraic number with no conjugate of modulus 1. If this is the case, then there also exists an alphabet in $\mathbb{Z}$ allowing 1-block parallel addition in base $\beta$.

Proof. If the base $\beta$ is an algebraic number with no conjugates of modulus 1, then [5] provides $a \in \mathbb{N}$ such that the alphabet $\mathcal{A} = \{-a,-a+1,\ldots,0,\ldots,a-1,a\}$ allows 1-block parallel addition. Obviously, $0 \in \mathcal{A} \subset \mathbb{Z}$ and $1 \in \mathrm{Fin}_\mathcal{A}(\beta)$.

For the opposite implication, $\beta$ is an algebraic number by Corollary 2.3. Let $r,s \in \mathbb{N}$ and $u_{-r},\ldots,u_s \in \mathcal{A}$ be such that
$$\sum_{j=-r}^{s} u_j\beta^j = 1 \in \mathrm{Fin}_\mathcal{A}(\beta). \tag{1}$$
Now we slightly modify the proof from [3] to show that if $\beta$ has a conjugate of modulus 1, then there is no alphabet in $\mathbb{Z}[\beta]$ allowing block parallel addition. Let $\gamma$ be a conjugate of $\beta$ such that $|\gamma| = 1$ and let $\sigma : \mathbb{Q}(\beta) \to \mathbb{Q}(\gamma)$ be the field isomorphism such that $\sigma(\beta) = \gamma$. Let $\mathcal{A}' := \{\sigma(a) : a \in \mathcal{A}\}$. Assume, for contradiction, that there are $k,p \in \mathbb{N}$ such that there exists a $p$-local function performing $k$-block parallel addition on $(\beta,\mathcal{A})$. We denote
$$S := \max\Big\{\Big|\sum_{j=0}^{pk-1} a_j\gamma^j\Big| : a_j \in \mathcal{A}'\Big\}.$$
Since there are infinitely many $j$ such that $\mathrm{Re}\,\gamma^j > \frac12$, there exist $n > p$ and indices $0 \leq j_1 < \cdots < j_m \leq kn-1$ satisfying $j_{i+1}-j_i > r+s$ for all $i \in \{1,\ldots,m-1\}$ such that
$$2S < \mathrm{Re}\sum_{j=0}^{kn-1}\varepsilon_j\gamma^j \leq \Big|\sum_{j=0}^{kn-1}\varepsilon_j\gamma^j\Big|,$$
where $\varepsilon_j = 1$ if $j = j_i$ for some $i \in \{1,\ldots,m\}$ and $\varepsilon_j = 0$ otherwise. By using the representation (1) of 1 and the fact that $j_{i+1}-j_i > r+s$, we have $\sum_{j=0}^{kn-1}\varepsilon_j\gamma^j = \sum_{i=1}^{m} 1\cdot\gamma^{j_i} = \sum_{j=-r}^{kn-1+s} v'_j\gamma^j$ for some $v'_j \in \mathcal{A}'$. Hence
$$2S < T := \max\Big\{\Big|\sum_{j=-r}^{kn-1+s} a_j\gamma^j\Big| : a_j \in \mathcal{A}'\Big\}.$$
Let $x' = \sum_{j=-r}^{kn-1+s} \sigma(x_j)\gamma^j$, where $x_{-r},\ldots,x_{kn-1+s} \in \mathcal{A}$, be such that $|x'| = T$, and let $x = \sum_{j=-r}^{kn-1+s} x_j\beta^j$, i.e., $x' = \sigma(x)$. Since there is a $k$-block parallel addition in $(\beta,\mathcal{A})$, we have
$$x + x = \sum_{j=kn+s}^{k(n+p)-1+s} z_j\beta^j + \sum_{j=-r}^{kn-1+s} z_j\beta^j + \sum_{j=-kp-r}^{-r-1} z_j\beta^j,$$
where $z_j \in \mathcal{A}$. We denote $z'_j := \sigma(z_j)$; hence $z'_j \in \mathcal{A}'$, and
$$|x'| + 2S < |x'| + |x'| = |x'+x'| \leq \Big|\sum_{j=kn+s}^{k(n+p)-1+s} z'_j\gamma^j\Big| + \Big|\sum_{j=-r}^{kn-1+s} z'_j\gamma^j\Big| + \Big|\sum_{j=-kp-r}^{-r-1} z'_j\gamma^j\Big| \leq |\gamma^{kn+s}|S + |x'| + |\gamma^{-kp-r}|S = 2S + |x'|,$$
which is a contradiction. $\square$

5. Conclusion

We have shown that the necessary conditions on $\beta$ and $\mathcal{A}$ allowing parallel addition that were known for alphabets consisting of consecutive integers can be largely extended to alphabets $\mathcal{A}$ being subsets of $\mathbb{Z}[\beta]$.

During our investigation, we also considered an even more general case: $\beta \in \mathbb{Z}[\omega]$ and $\mathcal{A} \subset \mathbb{Z}[\omega]$, where $\omega$ is an algebraic number. Clearly, $\mathbb{Z}[\beta] \subset \mathbb{Z}[\omega]$, but if $\mathbb{Z}[\beta] \subsetneq \mathbb{Z}[\omega]$, then congruences modulo $\beta$ or $\beta-1$ behave differently in $\mathbb{Z}[\omega]$ than in $\mathbb{Z}[\beta]$. Due to this fact, Theorem 3.7 does not hold in the $\mathbb{Z}[\omega]$ setting, as the following counterexample shows: let $\omega = \frac12\sqrt5+\frac12$, $\beta = -2\omega+1 = -\sqrt5$ and $\mathcal{A} = \{-2,-1,0,1,2,3\} \subset \mathbb{Z}[\omega]$. This numeration system allows parallel addition, but $-2 \equiv_{\beta-1} 0 \equiv_{\beta-1} 2$ and $-1 \equiv_{\beta-1} 1 \equiv_{\beta-1} 3$, with $\equiv_{\beta-1}$ taken in $\mathbb{Z}[\omega]$. The minimal polynomial of $\beta$ is $x^2-5$, thus there are six congruence classes modulo $\beta-1$ in $\mathbb{Z}[\omega]$, i.e., $\mathcal{A}$ does not contain representatives of all congruence classes. See [14] for further elaboration.

We conjecture the converse of Corollary 3.6, that is: let $(\beta,\mathcal{A})$ be a numeration system such that $1 \in \mathcal{A}[\beta]$ and $\mathcal{A} \subset \mathbb{Z}[\beta]$, and let $(\beta,\mathcal{A})$ allow parallel addition by a $p$-local function with $p = r+t+1$. If $\beta$ is expanding, then the parallel addition is without anticipation, i.e., $t = 0$.
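The congruences claimed in the counterexample can be verified directly. Since $\beta-1 = -2\omega$ and $1/\omega = \omega-1$ in $\mathbb{Z}[\omega]$, an integer difference $d$ satisfies $d/(\beta-1) = \frac{d}{2} - \frac{d}{2}\omega$, which lies in $\mathbb{Z}[\omega]$ exactly when $d$ is even. A minimal check (our own sketch):

```python
# Counterexample of Section 5: omega = (1 + sqrt5)/2, beta = -2*omega + 1,
# A = {-2,...,3}. gamma = delta (mod beta - 1) in Z[omega] means that
# (gamma - delta)/(beta - 1) lies in Z[omega]. With beta - 1 = -2*omega and
# 1/omega = omega - 1, an integer difference d gives
#     d / (-2*omega) = d/2 - (d/2)*omega,
# which has integer coordinates in the basis {1, omega} iff d is even.

def congruent_mod_beta_minus_1(gamma, delta):
    """gamma, delta rational integers; divisibility test explained above."""
    return (gamma - delta) % 2 == 0

assert congruent_mod_beta_minus_1(-2, 0) and congruent_mod_beta_minus_1(0, 2)
assert congruent_mod_beta_minus_1(-1, 1) and congruent_mod_beta_minus_1(1, 3)
assert not congruent_mod_beta_minus_1(0, 1)   # the two chains are distinct
```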
6. Acknowledgments

This work was supported by GAČR 13-03538S and SGS 17/193/OHK4/3T/14. The author thanks Milena Svobodová and Edita Pelantová for fruitful discussions.

References

[1] A. Avizienis. Signed-digit number representations for fast parallel arithmetic. IEEE Trans. Comput. 10:389–400, 1961.
[2] P. Kornerup. Necessary and sufficient conditions for parallel, constant time conversion and addition. Proc. 14th IEEE Symp. on Comp. Arith., pp. 152–155, 1999.
[3] C. Frougny, P. Heller, E. Pelantová, M. Svobodová. K-block parallel addition versus 1-block parallel addition in non-standard numeration systems. Theoret. Comput. Sci. 543:52–67, 2014.
[4] C. Frougny, E. Pelantová, M. Svobodová. Minimal digit sets for parallel addition in non-standard numeration systems. J. Integer Seq. 16:36, 2013.
[5] C. Frougny, E. Pelantová, M. Svobodová. Parallel addition in non-standard numeration systems. Theoret. Comput. Sci. 412:5714–5727, 2011.
[6] M. Brzicová, C. Frougny, E. Pelantová, M. Svobodová. On-line multiplication and division in real and complex bases. In 2016 IEEE 23rd Symp. Comput. Arith. (ARITH), pp. 134–141. IEEE, 2016. doi:10.1109/ARITH.2016.13.
[7] J. Legerský, M. Svobodová. Construction of algorithms for parallel addition, 2018. http://arxiv.org/abs/1801.01062.
[8] S. Baker, Z. Masáková, E. Pelantová, T. Vávra. On periodic representations in non-Pisot bases. Monatshefte für Math. 184(1):1–19, 2017. doi:10.1007/s00605-017-1063-9.
[9] S. Akiyama, T. Zaïmi. Comments on the height reducing property. Cent. Eur. J. Math. 11:1616–1627, 2013.
[10] S. Akiyama, P. Drungilas, J. Jankauskas. Height reducing problem on algebraic integers. Funct. Approx. Comment. Math. 47:105–119, 2012. doi:10.7169/facm/2012.47.1.9.
[11] R. A. Horn, C. R. Johnson. Matrix Analysis. Cambridge University Press, 1990.
[12] I. Kátai. Generalized number systems in Euclidean spaces. Math. Comput. Model. 38:883–892, 2003.
[13] J. Legerský. Construction of algorithms for parallel addition. Research project, Czech Technical University in Prague, FNSPE, Czech Republic, 2015. http://jan.legersky.cz/pdf/research_project_parallel_addition.pdf.
[14] J. Legerský. Construction of algorithms for parallel addition in non-standard numeration systems. Master thesis, Czech Technical University in Prague, FNSPE, Czech Republic, 2016. http://jan.legersky.cz/pdf/master_thesis_parallel_addition.pdf.
[15] R. Chapman. Algebraic number theory - summary of notes. http://empslocal.ex.ac.uk/people/staff/rjchapma/notes/ant2.pdf. Accessed: 2017-12-16.

Acta Polytechnica Vol. 44 No. 1/2004

The psychological image of realistic physical quantities. The psychological speed of aging

L. Végh

The article deals with psychological images of real physical quantities, such as time, length, etc., and with their interrelationship in general and in the scientific field in particular. It is shown that the importance of the above is reflected by the possibility to influence or reshape, within certain limits, positively or negatively, the psychological images of real values, so as to meet the outlined or desired expectation. As shown, one possible practical application of this phenomenon is the possibility to positively influence the speed of aging. This problem is demonstrated in the article analytically as well as graphically.

Keywords: real values, psychological image, biological vs. psychological age, psychological speed of aging.

1 The real physical quantities and their psychological images

The investigation of the psychological image of real physical quantities like time, distance, intensity of sound, etc. has a particular significance. Psychological quantities (the designation of psychological images of real quantities) are actually necessary components of our decision making. In reality, they exist even though they have a subjective character. The determination of psychological quantities and their dimensions has its specialities or rules, which it is necessary to investigate and recognize should they be correctly utilized or applied. The knowledge of the regularities by which these rules are governed may be used for seeking or finding methods that take advantage of their positive sides.
For example: the bringing near or drawing away of psychological images to or from reality. A good example, for instance, is the effort to positively influence the so-called "psychological speed of aging" or "psychological age", a phenomenon which, among others, this article deals with.

The real physical quantities such as time, age, length, mass, temperature and others are relative values that are referred to an internationally chosen and accepted unit of a constant value. The real physical quantities are objectively measurable. In the past, similar samples of "local" units of constant values for measurements were used. An example of this was the so-called "Prague elbow", used in the past as a constant unit of length of a steel rod for measuring textile goods; it was fixed to the wall of the Prague city hall on Charles' Square. On the contrary, the psychological image of a real quantity is a subjective and variable value.

The real-actual physical quantities (PQ) are called real or global PQ, in short RPQ or GPQ. Real time is also designated as clock time, calendar time or global time. Next to the physical quantities for length, time, speed, space, etc., valid in global terms, physical quantities exist also in cosmic or astronomical conditions. They are generally multiples of RPQ; global units in astronomy can hardly be used.

Similarly to the existence of the global and space physical quantities or phenomena, it is possible to speak of psychological quantities or phenomena and the corresponding units (in short PPQ). As has been stated, psychological quantities correspond to the psychological images (sensations or phenomena) of quantities that exist in reality. The size or dimension of psychological quantities has a subjective character, but exists objectively. For example, one may be fifty years old but "feel" like a forty year old. The given case refers to a so-called psychological image of internal phenomena (sensation), for the evaluating subject is identical with the object of evaluation. The psychological image is dependent on variable influences like the age, biological or physical well-being, fatigue, health, IQ and other traits of the evaluated person. Unlike the units of real quantities, the units of psychological quantities are variable and subjective. A year, as a unit of time, varies psychologically in length for a young and an aged person.

Next to the psychological images of internal phenomena, which accord with psychological images of internal realities (for example, age), there are psychological images of external phenomena. The example illustrated is a plant or odour that one finds beautiful while another finds displeasing. Warmth, coolness or a produced good are other examples whose usefulness is for one individual advisable while for another harmful. Similarly, it can be a political party, religion, political system, government and others, whose nature is good for one, while unacceptable, harmful, even dictatorial, etc. for another. Psychological images of the external phenomena may be: a) uninfluencable phenomena (like the scent of flowers, the beauty of nature), or b) influencable phenomena, which can be a subject of various commercial, political or other interests. In all cases it is possible to speak of the importance of the evaluation of the psychological quantities.
In conclusion, it can be stated that the three existing systems of physical quantities and the corresponding dimensions are: real or global physical quantities (RPQ, corresponding to the aforementioned conditions); cosmic or astronomical physical quantities (CPQ); and psychological physical quantities (PPQ). The quantities expressed in the corresponding units cannot and should not be interchanged.

2 The characteristics of psychological images of real physical quantities

RPQ are objective physical quantities that exist realistically and are objectively measurable. On the contrary, psychological images of real physical quantities have a subjective character and dimension. It has been stated that the psychological images are dependent on the receiving subject: on its age, IQ, memory and its specific conditions. The psychological image of an external phenomenon or quantity is, for the receiving subject, generally distinct from the real dimension of the PQ. The same real physical quantity can generally evoke different psychological images in different receiving subjects. For example, different individuals psychologically observe one hour to be of various psychological quantities. In his publication [1], professor M. Nakonečný states about the psychological perception of time: "An elder individual perceives one hour to be shorter than a young individual perceives it to be." It is evident that it is the dependence of the psychological perception or image of time on the age of the receiving subject that the author has in mind. He concludes that the psychological quantity of the elapsed time is dependent on the age of the individual.

Psychological quantities can be divided into two categories: a) The first category holds the realistic, easily controlled and objectively uninfluencable psychological images of quantities and phenomena of the internal and external world. For example: biological data, age, objective natural and social phenomena of the external world, which allow themselves to be physically, statistically, historically or economically explicitly identified. The psychological images of true inner phenomena (for example age, time) and of true external, but media-uninfluenced, phenomena and data (nature, weather, warmth, etc.) also belong in this category.
b) The second category comprises the psychological images of external phenomena and data that allow media, commercial or other special-interest influences and evaluations to shape the psychological image of the receiving subjects. For this purpose, psychological images of realistic external quantities can be influenced by the media or be commercially biased with distorted information and statistical data. Simplified, the discussed evaluation is referred to as "brain-washing". The reality described above is to some extent contiguous with the conception of democracy and the principle of "the freedom of individuals to create and evaluate their psychological images of external phenomena". An example is the paradox of the various roles of alcohol in the times of war and peace as a means of influencing the psychological behaviour of warriors in the interests of authority: in past wars, soldiers on battlefields were given alcohol before attacking the enemy, so as to overcome their humane senses, whereas in the time of peace the interest is to keep the senses active to protect lives. Another paradoxical example is the bylaw that prohibits the election campaign during the last 24 hours before elections into the highest government offices. Why precisely at the 24 hour mark, not more, not less, is a psychological mystery.

The interrelationship between RPQ and PPQ is always a multi-parametrical dependence that cannot be expressed generally, analytically or accurately. However, it is possible to represent the mutual dependence of PPQ and RPQ in a simplified, single-parameter form with the predominating parameter:

a) $\mathrm{PPQ} = f(\mathrm{RPQ})$, (1)
b) $\mathrm{RPQ} = p(\mathrm{PPQ})$. (2)

Just as there is a direct dependence of the psychological image PPQ on the real quantity RPQ (1), it is also possible to speak of the dependence in the opposite direction (2), which is in general different from (1); mathematical reciprocity does not hold here. (An example is the varying possible testimony of witnesses during the police investigation of a punishable crime.)

Psychological images of realistic quantities may play an important role in an individual's life. For a young woman around the age of twenty, the upcoming years of looking for a prospective partner are "long" and there is no "hurry" in the search for her partner. Nonetheless, the young woman need not be in agreement with the opinion of her parents or grandparents, who may assess the years to come as far not so long. With her psychological evaluation of time, the young woman may "throw away" her time for the natural selection of a lifelong partner and, due to the time delay, "lose her chance in life". Examples of faulty psychological evaluations of time are not unusual. An interesting, historically proven example is the psychological evaluation of time made by admiral Nelson before his victorious sea battle with Napoleon: he correctly assessed psychologically the right moment for launching the victorious sea battle. PPQ may play a significant role in every decision-making process. However difficult the relationships between RPQ and PPQ may be, it is necessary to examine and recognize them.

3 The psychological speed of aging, the real (biological) and psychological length of time

The psychological speed of aging (in short PSA), denoted $v_{px}$, expresses the relationship between the real time $t_x$, measured by means of calendar or clock, and its psychological image $t_{px}$.
The index 'p' always indicates the psychological image of reality:
$$\mathrm{PSA} = v_{px} = \frac{t_x}{t_{px}}. \tag{3}$$
If $\mathrm{PSA} = 1$, then the real value of the speed of aging is identical with its psychological image, i.e., if $\mathrm{PSA} = 1$ then $t_x = t_{px}$. The psychological speed of aging (PSA) expresses how many times faster or slower the psychological time elapses (runs out) in comparison with the real or clock time. For better understanding, consider the case of an elder individual, whose value $\mathrm{PSA} = v_{px}$ is usually large. Let us assume that, in a particular case, the psychological time at higher age elapses 3 times faster than the time measured by a calendar or clock. Therefore, the PSA of that individual is three times the real speed of time. This PSA expresses that one calendar year, in the given case of high age, is psychologically equivalent to a period 3 times shorter, i.e., equivalent to only 4 calendar months of a younger, middle-aged individual (for whom $t_x = t_{px}$). Therefore $\mathrm{PSA} = 12\ \text{months}/4\ \text{months} = 3$. The psychological speed of elapsing time, or aging ($\mathrm{PSA} = v_{px}$), is in the given case multiplied by three in comparison to the speed of the real or clock time. It is as if the psychological time has shrunk to one third of reality, and therefore its contents are practically going to be one third only in comparison to the contents of the real time. The theoretically forthcoming end of life (eternal rest) occurs when PSA is equal to infinity ($\mathrm{PSA} = \infty$). The opposite is true for a child, as its PSA is smaller than in reality and is theoretically equal to zero at birth: it is as though the corresponding psychological age for a child is infinitely long, and adulthood and demise are at birth out of sight.

In the above formulae:
$\mathrm{PSA} = v_{px}$ ... psychological speed of aging (elapsing of time),
$t_x$ ... the real time (measured by the clock or calendar),
$t_{px}$ ... psychological time (psychological image of real time).

A single-parametric model of the dependence of $\mathrm{PSA} = v_{px}$ on aging, at ages between $t_0$ and $t_{x,\max} = 100$, may be graphically demonstrated, among others, with a trigonometric function, as shown in Fig. 1. The maximum age is represented here by a chosen value of $t_{x,\max} = 100$ years. The boundary conditions for the function $\mathrm{PSA} = v_{px}$ are empirically assumed. The analytical expression of the function $\mathrm{PSA} = v_{px}$ is chosen in such a way that the function is continuous and satisfies the boundary conditions as well as a condition in one chosen intermediate point. A trigonometric function in the form of (4) satisfies these conditions; a logarithmic function could, however, also be considered. As shown below, the intermediate point lies at $\frac12 t_{x,\max}$, for at this point $v_{px} = 1$ for all values of the power $z$ in Eq. (5). PSA decreases with decreasing age and increases with increasing age within the theoretical limits from zero to infinity; this is confirmed by psychological experience.

The boundary conditions are determined from the following consideration: as the PSA is small at birth, it is assumed that theoretically for $t_x = 0$, $v_{px} = 0$ (to children, time goes by very slowly and the length of a lifetime appears to be "infinite"). The tangent to the curve at this point is close to horizontal. In contrast, for a very old individual (theoretically at the age of $t_{\max} = 100$), $\mathrm{PSA} = v_{100} = \infty$: psychologically, at a high age, time goes by very quickly.
It is thus assumed that the boundary condition of the function (4) for $t_{\max} = 100$ has the value $\mathrm{PSA} = v_{100} = \infty$; the tangent to the curve is here vertical. The diagram in Fig. 1, showing the function $v_{px}$, is a graphic aid in finding a method for slowing down the psychological speed of aging (PSA) in the second half of the life-cycle period, i.e., between $\frac12 t_{x,\max}$ and $t_{x,\max}$, and also a method of lengthening the so-called psychological age. The analytical function that satisfies the above-mentioned conditions is expressed by Eq. (4), which Fig. 1 illustrates graphically:
$$v_{px} = f(t_x) = \tan^z\Big(\frac{\pi}{2}\cdot\frac{t_x}{t_{x,\max}}\Big). \tag{4}$$

Figure 1: Graphical illustration of the function $v_{px}$.

In Fig. 1, the horizontal coordinate axis is identical to the axis $t_x$ and the vertical axis is identical to the axis $\mathrm{PSA} = v_{px}$. The exponent $z$ determines the shape of the curve $v_{px}$. At the point $t_x = \frac12 t_{x,\max}$, $v_{px}$ is equal to one for all values of $z$; therefore,
$$v_{px} = \tan^z\Big(\frac{\pi}{4}\Big) = 1. \tag{5}$$
From this it can be concluded that all curves corresponding to various values of $z$ intersect the same point $(\frac12 t_{\max},\ v_{px} = 1)$. It is evident that the PSA in the first half of a life-span never exceeds the real speed of aging. The values of $v_{px}$ for $\frac14$, $\frac12$, $\frac34$ and $0.9$ of $t_{\max}$ are given in Table 1.

Table 1: Values of $v_{px}$ for $z = 1$.
  $t_x/t_{x,\max}$:   0     0.25    0.50    0.75    0.90    1.00
  $v_{px}$:           0     0.41    1.00    2.41    6.31    $\infty$

Psychological age (time) $t_{px}$: similarly as with the function (4), it is possible to express the dependence of the psychological age $t_{px}$ (not the speed of aging) on the real "calendar" age $t_x$. The dependence can be visualized with the following statement: "The psychological length of a life-cycle (the expected life-span) is theoretically infinite (or is an unidentified value) at birth and falls gradually to zero at death." The function $t_{px}$ is expressed, in accordance with Eq. (3), by the expression (6) and demonstrated graphically in Fig. 2:
$$t_{px} = \frac{t_x}{v_{px}} = \frac{t_x}{\tan^z\big(\frac{\pi}{2}\cdot\frac{t_x}{t_{x,\max}}\big)}. \tag{6}$$
The functions (4) and (6) are interdependent. The moment at which the psychological and calendar age equalize occurs in the middle of the span of a lifetime; therefore, when $t_x = \frac12 t_{x,\max} = 50$, then $t_{px} = t_x = 50$. According to the initial assumptions, $t_{px}$ has the following boundary values: for $t_x = 0$, $t_{px}$ is undetermined; for $t_x = 50$, also $t_{px} = 50$; and for $t_{\max} = 100$, $t_{px} = 0$. The shape of the curve is dependent on the exponent $z$; the function $t_{px}$ in Fig. 2 is illustrated for the value $z = 1$.

Figure 2: The function $t_{px}$.
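A minimal numerical check of Eqs. (4) and (6) reproduces the values of Table 1 and the midpoint property $t_{px}(50) = 50$ (the parameter names in the sketch are ours):

```python
import math

t_max = 100.0

def psa(t_x, z=1.0):
    """Psychological speed of aging, Eq. (4): v_px = tan^z(pi*t_x/(2*t_max))."""
    return math.tan(math.pi * t_x / (2.0 * t_max)) ** z

def psychological_age(t_x, z=1.0):
    """Psychological age, Eq. (6): t_px = t_x / v_px."""
    return t_x / psa(t_x, z)

for t in (25, 50, 75, 90):
    print(f"t_x = {t:3d}   v_px = {psa(t):5.2f}   t_px = {psychological_age(t):6.2f}")
# v_px: 0.41, 1.00, 2.41, 6.31 -- matching Table 1; t_px(50) = 50 as stated.
```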
4 The interrelationship between real and psychological quantities

The quantities RPQ and PPQ are expressed in different units and therefore cannot be mixed. For example, it is not possible to collect statistical data about the realistic aging of citizens from the psychological ages of individuals. A statistical quantity collected from realistic physical data is once more a realistic quantity; in contrast, a statistical quantity composed of psychological data is again only a psychological quantity. Should a group of individuals of the same sex with exactly the same biological age stand next to each other, each individual is going to have a different psychological age. If evaluation criteria for the determination of psychological age were to exist, it would be possible to determine the psychological age (PA) of each individual (for example by means of wrinkles, physical and spiritual activity, memory, and others). It would also be possible to determine some sort of statistical "index" for the entire society, which would characterize "the average psychological age of the society".

Conclusion

To summarize the above analysis, two important concluding comments can be made:

1) In chapter 3, the course of the so-called psychological speed of aging (PSA) was analyzed. From the course of the PSA it follows that psychological aging (PA) may be extended by slowing down the PSA. From experience it is known that the decrease of the PSA at a higher age is favourably influenced, among other factors, by slowing down the reduction of physical and intellectual activities. Although "psychological rejuvenation" is not the same as "biological rejuvenation", an interrelationship does exist in this case. A favourable influence on the course of the PSA can be achieved through correction of those influences that significantly affect the PSA, like intellectual activity and physical activity, as stated above. Analytically, any positive change of the PSA would be reflected by changes of the curvature of the function $v_{px}$ in its right half. The greatest danger in aging is the inactivity of mind and body, as confirmed by long-term experience; that would certainly deserve more research in the given area. Practical experience with the post-retirement age and activities of university professors often proves the above statement.

2) Each attempt to apply one-sided intentional influence on the generation of psychological images of real events by individuals during their decision-making process has an adverse effect. A numerical result in the elections does not necessarily give an exact picture of the voters' long-term standpoint: under different psychological conditions, the same individuals could decide in a different manner. For that reason, each one-sided psychological influencing of individuals in their decision-making process is hardly acceptable in a free society.

References

[1] Nakonečný, M.: The Basics of Psychology. Academia, Praha, 1998 and 2002, p. 251, 252.

Prof. L. Végh, DrSc., Doubravčická 10, 100 00 Praha 10, Czech Republic, phone: +420 274 811 674, e-mail: vegh@mbox.vol.cz

Acta Polytechnica 57(1):58–70, 2017, doi:10.14311/ap.2017.57.0058

Numerical modelling of the soil behaviour by using newly developed advanced material model

Jan Vesely, Department of Geotechnics, Faculty of Civil Engineering, Czech Technical University in Prague, Thakurova 7, Prague, Czech Republic; correspondence: veselyjan@centrum.cz

Abstract. This paper describes the theoretical background, implementation and validation of the newly developed Jardine plastic hardening-softening model (JPHS model), which can be used for numerical modelling of the soil behaviour. Although the JPHS model is based on the elasto-plastic theory, like the Mohr-Coulomb model that is widely used in geotechnics, it contains some improvements that remove the main disadvantages of the MC model.
The presented model is coupled with an isotropic hardening and softening law, a non-linear elastic stress-strain law, a non-associated elasto-plastic material description and a cap yield surface. The validation of the model is done by comparing the numerical results with real measured data from laboratory tests and by testing the model on a real project of a tunnel excavation. A 3D numerical analysis is performed and a comparison between the JPHS, Mohr-Coulomb, Modified Cam-Clay and Hardening Small Strain models and the in-situ monitoring data is made.

Keywords: numerical modelling; advanced material model; soils; small strain stiffness; tunnel excavation.

1. Introduction

In the last few years, numerical analyses have been increasingly used in the design of underground structures [1–8]. Designers can work with different user-friendly commercial software and relatively easily simulate the behaviour of the soil mass. Unfortunately, one important fact is overlooked: numerical analyses are only an approximation of the real behaviour and are highly dependent on the input parameters [9, 10]. One area that can affect the calculated results is the correct choice of the material model [11–13]. Many laboratory tests show that the behaviour of soils is highly non-linear [14–16], and therefore the use of the common linear elastic, perfectly plastic material models based on the Mohr-Coulomb yield criterion is not appropriate. Instead, an advanced material model like the newly developed Jardine plastic hardening-softening model (the JPHS model) should be used.

2. Theoretical background of the model

The Jardine plastic hardening-softening model (the JPHS model) is based on the elasto-plastic theory, as is the Mohr-Coulomb model widely used in geotechnics, but it contains other features which improve its capabilities and allow better simulations of the ground behaviour. Firstly, the Mohr-Coulomb failure yield criterion is replaced by the Willam-Warnke failure yield criterion, which eliminates the singular tips from the Mohr-Coulomb surface and has a better agreement with data from experimental tests. Secondly, the model contains a non-associated material description and the plastic flow is controlled by a Drucker-Prager potential function. Thirdly, the model has the ability to simulate the non-linear isotropic hardening and softening of soils. Fourthly, it is capable of increasing the stiffness modulus with increasing depth or stress level. Fifthly, the volumetric hardening is controlled by a cap yield surface. Finally, the model has the possibility to calculate the non-linear stress-strain dependence in the small strain range.

2.1. Yield failure criterion

The Mohr-Coulomb failure criterion for soils is one of the oldest failure criteria. It is experimentally verified in triaxial compression and extension, but it is also very conservative for intermediate principal stress states between triaxial compression and extension, as can be seen in Figure 1.

Figure 1: The three-dimensional failure surface of kaolin clay in the octahedral plane; laboratory data from [18].

To eliminate this conservative behaviour and bring the JPHS model closer to reality, the Willam-Warnke failure criterion [17] is implemented in the model. If $r_c$ is the distance from the hydrostatic axis to the failure surface at the compressive meridian and $r_t$ at the tension meridian, then at any intermediate position the distance $r$ is given by (see Figure 2):
$$r = \frac{2r_c(r_c^2-r_t^2)\cos\theta + r_c(2r_t-r_c)\sqrt{D_1}}{4(r_c^2-r_t^2)\cos^2\theta + (r_c-2r_t)^2}, \tag{1}$$
where
$$D_1 = 4(r_c^2-r_t^2)\cos^2\theta + 5r_t^2 - 4r_tr_c. \tag{2}$$

Figure 2: The geometrical interpretation of the Willam-Warnke failure criterion.

After several mathematical operations, the locus of the yield surface in the deviatoric plane can be written as
$$g(\theta) = \frac{2(1-e^2)\cos\theta + (2e-1)\sqrt{D_2}}{4(1-e^2)\cos^2\theta + (2e-1)^2}, \tag{3}$$
where
$$D_2 = 4(1-e^2)\cos^2\theta + 5e^2 - 4e, \tag{4}$$
$$e = \frac{r_t}{r_c} = \frac{3-\sin\varphi'}{3+\sin\varphi'}, \tag{5}$$
and the whole Willam-Warnke failure criterion can be expressed as
$$f = a_1p + \frac{q}{g(\theta)} + a_2 = 0, \tag{6}$$
where
$$a_1 = -\frac{6\sin\varphi'}{3-\sin\varphi'}, \qquad a_2 = -\frac{6c\cos\varphi'}{3-\sin\varphi'}. \tag{7}$$
On the one hand, it is evident that the formula is relatively complex, which can cause some difficulties during the evaluation of its first- and second-order differential forms; on the other hand, this complexity allows the locus of the yield surface in the deviatoric plane to have the following features:
• it fits the Mohr-Coulomb key points, i.e., the experimental extension and compression meridian points lie on the curve;
• it is differentiable at the compression and extension meridian points, i.e., the singularities in the corners, which cause numerical difficulties, are eliminated;
• it is convex in the whole range $0 < \varphi \leq \pi/2$ ($0.5 < e \leq 1$).
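The shape function of Eqs. (3)–(5) can be sanity-checked numerically. In this convention $g(\theta)$ equals $e$ on the tensile meridian ($\theta = 0$) and 1 on the compressive meridian ($\theta = 60°$), where Eq. (6) reduces to the Mohr-Coulomb compression line. A small sketch (our own, with an arbitrary friction angle):

```python
import numpy as np

def ww_shape(theta, phi_deg):
    """Deviatoric shape function g(theta) of Eqs. (3)-(5) (Willam-Warnke)."""
    e = (3.0 - np.sin(np.radians(phi_deg))) / (3.0 + np.sin(np.radians(phi_deg)))
    c = np.cos(theta)
    d2 = 4.0 * (1.0 - e**2) * c**2 + 5.0 * e**2 - 4.0 * e
    num = 2.0 * (1.0 - e**2) * c + (2.0 * e - 1.0) * np.sqrt(d2)
    den = 4.0 * (1.0 - e**2) * c**2 + (2.0 * e - 1.0)**2
    return num / den, e

phi = 26.0                          # the stiff-clay friction angle used later
g0, e = ww_shape(0.0, phi)          # tensile meridian
g60, _ = ww_shape(np.pi / 3.0, phi) # compressive meridian
assert abs(g0 - e) < 1e-12 and abs(g60 - 1.0) < 1e-12
print(f"e = {e:.3f}, g(0) = {g0:.3f}, g(60 deg) = {g60:.3f}")
```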
2.2. Plastic flow

The JPHS model contains a non-associated material description and the plastic flow is controlled by the Drucker-Prager yield criterion [19], which has the following form:
$$f = \sqrt{J_2} - \alpha_{DP}I_1 - k_{DP} = 0. \tag{8}$$
The coefficients $\alpha_{DP}$ and $k_{DP}$ are not commonly used parameters in geotechnics and their proper calibration can lead to some difficulties in practice. Since the Drucker-Prager yield surface is a smooth version of the Mohr-Coulomb yield function, the Drucker-Prager constants can be expressed in terms of cohesion and friction angle. By relating the hydrostatic stress $p$ and the deviatoric stress $q$ to the invariants $I_1$ and $J_2$, the Drucker-Prager yield criterion can be rewritten as
$$f = \frac{q}{\sqrt3} - 3p\,\alpha_{DP} - k_{DP} = 0, \tag{9}$$
$$f = -3\sqrt3\,p\,\alpha_{DP} + q - \sqrt3\,k_{DP} = 0, \tag{10}$$
where the coefficients $\alpha_{DP}$ and $k_{DP}$ can be expressed for triaxial compression (TC) and extension (TE) as
$$\alpha_{TC} = \frac{2\sin\psi}{\sqrt3\,(3-\sin\psi)}, \qquad \alpha_{TE} = \frac{2\sin\psi}{\sqrt3\,(3+\sin\psi)}, \tag{11}$$
$$k_{TC} = \frac{6c\cos\psi}{\sqrt3\,(3-\sin\psi)}, \qquad k_{TE} = \frac{6c\cos\psi}{\sqrt3\,(3+\sin\psi)}. \tag{12}$$
The general form of the Drucker-Prager potential function is then
$$f = a_1p + \frac{q}{g(\theta)} + a_2 = 0, \tag{13}$$
where
$$a_1 = -\frac{6\sin\psi}{3-\sin\psi}, \qquad a_2 = -\frac{6c\cos\psi}{3-\sin\psi}, \qquad g(\theta) = 1. \tag{14}$$
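The conversion of Eqs. (11)–(12) from an angle and cohesion to the Drucker-Prager constants is mechanical; a minimal sketch follows (in the JPHS potential of Eqs. (13)–(14) the dilatancy angle $\psi$ plays the role of the angle; the values below are illustrative, not calibrated):

```python
import math

def drucker_prager_constants(angle_deg, c):
    """alpha and k matched to the Mohr-Coulomb compression (TC) and
    extension (TE) meridians, Eqs. (11)-(12)."""
    s, co = math.sin(math.radians(angle_deg)), math.cos(math.radians(angle_deg))
    tc = (2.0 * s / (math.sqrt(3.0) * (3.0 - s)),
          6.0 * c * co / (math.sqrt(3.0) * (3.0 - s)))
    te = (2.0 * s / (math.sqrt(3.0) * (3.0 + s)),
          6.0 * c * co / (math.sqrt(3.0) * (3.0 + s)))
    return tc, te

(alpha_tc, k_tc), (alpha_te, k_te) = drucker_prager_constants(26.0, 5.0)
print(f"TC: alpha = {alpha_tc:.4f}, k = {k_tc:.3f} kPa")
print(f"TE: alpha = {alpha_te:.4f}, k = {k_te:.3f} kPa")
```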
2.3. Non-linear hardening/softening

In the JPHS model, an isotropic hardening rule is used to describe the hardening and softening law. The change in the yield criterion can be described as
$$\mathrm{d}f = \frac{\partial f}{\partial\sigma_{ij}}\,\mathrm{d}\sigma_{ij} + h(\mathrm{d}\epsilon_p). \tag{15}$$
According to the strain hardening hypothesis [20], the hardening process can be work hardening or strain hardening. To cover both cases, it is useful to define the hardening variable $\mathrm{d}\kappa$ and the parameter $H$, which expresses $\mathrm{d}\kappa$ in terms of the equivalent plastic strain $\mathrm{d}\epsilon_p$:
$$H = \frac{\mathrm{d}\kappa}{\mathrm{d}\epsilon_p}. \tag{16}$$
The change in the yield function can then be rewritten as
$$\mathrm{d}f = \frac{\partial f}{\partial\sigma_{ij}}\,\mathrm{d}\sigma_{ij} + \frac{\partial f}{\partial\kappa}\,\mathrm{d}\kappa = 0, \tag{17}$$
$$\mathrm{d}f = \frac{\partial f}{\partial\sigma_{ij}}\,\mathrm{d}\sigma_{ij} + \frac{\partial f}{\partial\kappa}\,\frac{\mathrm{d}\kappa}{\mathrm{d}\epsilon_p}\,\frac{\mathrm{d}\epsilon_p}{\mathrm{d}\lambda}\,\mathrm{d}\lambda = 0, \tag{18}$$
where the equivalent plastic strain can also be characterised as
$$\mathrm{d}\epsilon_p = \sqrt{\tfrac23\,\mathrm{d}\boldsymbol{\epsilon}^p : \mathrm{d}\boldsymbol{\epsilon}^p} = \sqrt{\tfrac23}\,\mathrm{d}\lambda\,\sqrt{\mathrm{d}q}. \tag{19}$$
Finally, substituting equations (16), (17) and (19) gives the expression of the hardening function $h_\kappa$, which is directly related to the input parameter $H$:
$$h_\kappa = \sqrt{\tfrac23}\,H\,\sqrt{\mathrm{d}q}\,\frac{\partial f}{\partial\kappa}. \tag{20}$$

Figure 3: The hardening-softening law – dependence of the friction angle on the equivalent deviatoric plastic deformation.

The JPHS model allows the friction angle $\varphi'$ to vary with the accumulated plastic strain, as shown in Figure 3. There are three zones. In zone 1, $\varphi'$ is assumed to increase from the initial value ($\varphi'_{crit}$) to the peak value ($\varphi'_{peak}$); in zone 2, $\varphi'$ is reduced from the peak value to the residual value ($\varphi'_{res}$); and in zone 3, $\varphi'$ remains constant and equal to the residual value ($\varphi'_{res}$). In each of these zones, mathematical expressions can be assigned to the variation of $\varphi'$ with the equivalent plastic strain $\mathrm{d}\epsilon_p$, and therefore the hardening/softening rules can be expressed in a piecewise manner; the parameter $H$ is defined in the JPHS model as
$$H_{hardening} = \frac{(\varphi'_{peak}-\varphi')\,\delta}{\epsilon_{hardening}}, \tag{21}$$
$$H_{softening} = \frac{(\varphi'-\varphi'_{res})\,\delta}{\epsilon_{softening}}. \tag{22}$$
Note that $\varphi'_{peak}$, $\varphi'_{res}$, $\delta$, $\epsilon_{hardening}$ and $\epsilon_{softening}$ are input parameters based on the results of triaxial tests.

2.4. Cap yield surface

The shear yield surface defined in Section 2.1 does not explain the plastic volume strain that is measured in isotropic compression. Therefore, a second type of yield surface is implemented in the JPHS model to close the elastic region for compressive stress paths. The shape of the cap yield surface in the p-q plane (see Figure 4) is chosen so as to obtain the best agreement with the laboratory test data and is defined as
$$f = p + k_{cap}\,q - p_c = 0. \tag{23}$$

Figure 4: Cap yield surface in the p-q plane.

The initial pre-consolidation is calculated according to
$$p_{c,ini} = \mathrm{OCR}\,(p_{ini} + k_{cap}\,q_{ini}), \tag{24}$$
$$q_{c,ini} = \frac{6\sin\varphi'_{peak}}{3-\sin\varphi'_{peak}}\,p_{c,ini} + \frac{6c\cos\varphi'_{peak}}{3-\sin\varphi'_{peak}}. \tag{25}$$
During the volumetric strain hardening, the value of the pre-consolidation pressure is then updated as
$$p_c = p_{c,ini} + K\,\epsilon_v. \tag{26}$$

2.5. Non-linear elasticity

The strain range in which soils can be considered truly elastic is very small, and with increasing strain amplitude the soil stiffness decays non-linearly. To simulate this soil behaviour, a non-linear elastic stress-strain law based on the Jardine function is implemented in the model. Jardine et al. [14] proposed a periodic logarithmic function to express the non-linear relationship between the normalized secant Young's modulus $E_u$ and the axial strain $\epsilon_a$:
$$\frac{E_u}{c_u} = A + B\cos\Big(\alpha\,\log^\gamma\frac{\epsilon_a}{C}\Big), \tag{27}$$
where $A$, $B$, $C$, $\alpha$, $\gamma$ are the Jardine material model constants based on triaxial tests. The normalized tangent Young's modulus $E_{ut}$ corresponding to the secant Young's modulus $E_u$ can be derived by differentiating and rearranging (27):
$$\frac{E_{ut}}{c_u} = A + B\cos\Big(\alpha\,\log^\gamma\frac{\epsilon_a}{C}\Big) - \frac{B\alpha\gamma\,\log^{\gamma-1}\frac{\epsilon_a}{C}}{2.303}\,\sin\Big(\alpha\,\log^\gamma\frac{\epsilon_a}{C}\Big). \tag{28}$$

Figure 6: Jardine model – secant stiffness defined as a logarithmic function of strain [14].

For the purpose of numerical modelling, it is appropriate to replace the axial strain $\epsilon_a$ by the deviatoric strain invariant $\epsilon_{dev}$, defined as
$$\epsilon_{dev} = \sqrt{\tfrac23\big((\epsilon_1-\epsilon_2)^2 + (\epsilon_2-\epsilon_3)^2 + (\epsilon_3-\epsilon_1)^2\big)}. \tag{29}$$
By substituting the strain state of the undrained triaxial test ($\epsilon_1 = \epsilon_a$ and $\epsilon_2 = \epsilon_3 = -\frac12\epsilon_a$) into equation (29), the dependence between the axial strain $\epsilon_a$ and the deviatoric strain invariant is $\epsilon_{dev} = \sqrt3\,\epsilon_a$, and the normalized tangent Young's modulus $E_{ut}$ can be rewritten as
$$\frac{E_{ut}}{c_u} = A + B\cos(\alpha I^\gamma) - \frac{B\alpha\gamma I^{\gamma-1}}{2.303}\sin(\alpha I^\gamma), \qquad I = \log\frac{\epsilon_{dev}}{\sqrt3\,C}. \tag{30}$$
Finally, to simulate the dependency of the modulus on the depth or stress level, the equation is modified as follows [21]:
$$E_{ut} = \sigma_m\Big(A + B\cos(\alpha I^\gamma) - \frac{B\alpha\gamma I^{\gamma-1}}{2.303}\sin(\alpha I^\gamma)\Big), \tag{31}$$
where $\sigma_m$ is the mean stress and $I = \log\frac{\epsilon_{dev}}{\sqrt3 C}$. Due to the trigonometric nature of the Jardine model, it is also mandatory to specify the exact strain range in which the non-linear stress-strain law is to be applied. When the upper ($\epsilon_{\max}$) or lower ($\epsilon_{\min}$) limit of the specified strain range is exceeded, the stiffness is set as constant. A practical value for $\epsilon_{\min}$ is the smallest strain for which test data are available. For $\epsilon_{\max}$, it is required to ensure compatibility with the onset of plastic yield and to check that a high value will not lead to a negative tangent Young's modulus (see Figure 5).

Figure 5: Jardine function – strain range.
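The tangent stiffness of Eq. (31), together with the strain-range clamping just described, can be sketched as follows (the constants are illustrative, not the calibrated Brno clay values; outside $[\epsilon_{\min},\epsilon_{\max}]$ the modulus is held constant):

```python
import math

def jardine_tangent_modulus(eps_dev, sigma_m, A, B, C, alpha, gamma,
                            eps_min, eps_max):
    """Tangent modulus of Eq. (31) with the strain-range clamping of
    Section 2.5 (parameter names follow the paper)."""
    eps = min(max(eps_dev, eps_min), eps_max)   # outside the range: constant E
    I = math.log10(eps / (math.sqrt(3.0) * C))
    trig = alpha * I**gamma
    return sigma_m * (A + B * math.cos(trig)
                      - B * alpha * gamma * I**(gamma - 1.0) / 2.303
                        * math.sin(trig))

# Illustrative constants chosen so that I > 0 over the demonstrated range:
pars = dict(A=1100.0, B=1000.0, C=1e-6, alpha=1.3, gamma=0.6,
            eps_min=1e-5, eps_max=1e-2)
for eps in (1e-5, 1e-4, 1e-3, 1e-2):
    print(f"eps_dev = {eps:.0e}   E_ut/sigma_m = "
          f"{jardine_tangent_modulus(eps, 1.0, **pars):8.1f}")
# The normalized stiffness decays monotonically (about 1212 -> 83 here),
# mimicking the small-strain stiffness degradation of Figure 6.
```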
3. Implementation of the model

The software PLAXIS allows users to implement a wide range of material models into the program. These models must be programmed in the FORTRAN language, compiled as a dynamic link library (DLL), and then added to the PLAXIS program directory. Such models simulate the soil behaviour in a single material point, and the global behaviour is then governed by the finite element method implementation in PLAXIS. The whole flow chart containing all steps needed for the implementation of the model is presented in Figure 7.

Figure 7: JPHS model – flow chart.

The first action that must be done is the declaration and initialization of the material properties and state variables and the specification of the undrained behaviour. The second action is to calculate the current value of the E-modulus. This is done by calling the following three subroutines:

STRAIN_CALCULATION: this subroutine calculates the deviatoric strain invariant $\epsilon_{dev}$ according to formula (29). The input parameters are all components of the total strains and the output is the deviatoric strain invariant $\epsilon_{dev}$.

JARDINE: this subroutine calculates the current E-modulus according to formula (31) and the rules mentioned in Section 2.5. The input parameters are the deviatoric strain invariant $\epsilon_{dev}$ and the JPHS model parameters. The output is the actual E-modulus.

UNLOADING: this subroutine controls the value of the E-modulus during unloading and reloading cycles according to the formula
$$E_{unload} = E\,k_{unload}, \tag{32}$$
where $k_{unload}$ is an unloading coefficient.

The fourth action is the calculation of the actual hardening/softening parameter by using the user-defined subroutine HARDENING. The input of this subroutine is the JPHS model parameters; the output is a new value of the hardening/softening parameter $H$, determined according to formulas (21) and (22).

The last action is the definition of the constitutive stresses. After the calculation of the actual E-modulus, the new stiffness matrix is determined and the trial stresses are calculated as
$$^{pred}\sigma_{n+1} = \sigma_n + D\,\mathrm{d}\epsilon_{n+1}. \tag{33}$$
These stresses are one of the main inputs into the user-defined subroutine STRESS_INTEGRATION. This complex subroutine is the core element of the implementation, and the backward return-mapping algorithm is implemented in it. The first step in the subroutine is the check of the trial stresses.
the last action is the definition of the constitutive stresses. after the calculation of the actual e-modulus, the new stiffness matrix is determined and the trial stresses are calculated as follows:

{}^{pred}\sigma_{n+1} = \sigma_n + D\,d\epsilon_{n+1}.  (33)

these stresses are one of the main inputs into the user-defined subroutine stress_integration. this complex subroutine is the core element of the implementation; the backward return-mapping algorithm is implemented there. the first step in the subroutine is the check of the trial stresses. the jphs model contains three different yield surfaces (the willam-warnke yield surface f_s, the cap yield surface f_c and the tension cut-off yield surface f_t), and therefore the trial stresses can occur in 6 different regions:
• elastic region (f_s < 0, f_c < 0 and f_t < 0);
• willam-warnke region (f_s ≥ 0, f_c < 0 and f_t < 0);
• cap region (f_s < 0, f_c ≥ 0 and f_t < 0);
• tension cut-off region (f_s < 0, f_c < 0 and f_t ≥ 0);
• willam-warnke and tension cut-off region (f_s ≥ 0, f_c < 0 and f_t ≥ 0);
• willam-warnke and cap region (f_s ≥ 0, f_c ≥ 0 and f_t < 0).
if the trial stress occurs in the elastic region (f_s < 0, f_c < 0 and f_t < 0), then the strain increment is elastic and the trial stress is the real stress for the increment:

\sigma_{n+1} = {}^{pred}\sigma_{n+1}.  (34)

if the trial stress occurs in any other region (f_s ≥ 0 or f_c ≥ 0 or f_t ≥ 0), then the trial stress is an inadmissible stress state and the return-mapping algorithm is called. the general implementation of the return-mapping algorithm is as follows [22]:
(1.) calculate the starting point:
(a) calculate the derivatives (according to the region)

{}^{n+1}n_{mn} = \Big(\frac{\partial f}{\partial \sigma_{mn}}\Big)_{n+1},  (35)

{}^{n+1}m_{pq} = \Big(\frac{\partial q}{\partial \sigma_{pq}}\Big)_{n+1};  (36)

(b) update the plastic multiplier λ (with the yield function {}^{pred}f according to the region)

d\lambda^{(0)} = \frac{{}^{pred}f}{{}^{pred}n_{mn}\,D_{mnpq}\,{}^{pred}m_{pq} - h_\kappa};  (37)

(c) update the stress and the state variable

{}^{(n+1)}\sigma^{(0)}_{mn} = {}^{pred}\sigma_{mn} - d\lambda^{(0)}\,D_{mnpq}\,{}^{pred}m_{pq},  (38)

{}^{(n+1)}\kappa^{(0)} = {}^{pred}\kappa + d\lambda^{(0)}\,h_\kappa.  (39)

(2.) calculate the backward euler algorithm:
do while {}^{n+1}f^{(i)} > tol
(a) calculate the yield function {}^{n+1}f^{(i)} (according to the region)

{}^{n+1}f^{(i)} = f\big({}^{n+1}\sigma^{(i)}_{ij},\, {}^{n+1}\kappa^{(i)}\big);  (40)

(b) calculate the derivatives (according to the region)

{}^{n+1}n^{(i)}_{mn} = \Big(\frac{\partial f}{\partial \sigma_{mn}}\Big)_{n+1},  (41)

{}^{n+1}m^{(i)}_{kl} = \Big(\frac{\partial q}{\partial \sigma_{kl}}\Big)_{n+1},  (42)

\frac{\partial m_{kl}}{\partial \sigma_{mn}}\Big|^{(i)}_{n+1}, \qquad \frac{\partial m_{kl}}{\partial \kappa}\Big|^{(i)}_{n+1};  (43)

(c) update the plastic multiplier λ (with the yield function {}^{pred}f according to the region)

d\lambda^{(i+1)} = \frac{{}^{n+1}f^{(i)} - {}^{n+1}n^{(i)}_{mn}\,{}^{old}r_{ij}\,({}^{n+1}t_{ijmn})^{-1}}{d_3 - {}^{n+1}h_\kappa},  (44)

where

d_3 = {}^{n+1}n^{(i)}_{mn}\,D_{ijkl}\,{}^{n+1}m_{kl}\,({}^{n+1}t_{ijmn})^{-1},  (45)

{}^{n+1}t_{ijmn} = \delta_{im}\delta_{nj} + \lambda^{(i)}\,D_{ijkl}\,\frac{\partial m_{kl}}{\partial \sigma_{mn}}\Big|^{(i)}_{n+1},  (46)

r_{ij} = \sigma_{ij} - \big({}^{pred}\sigma_{ij} - \lambda^{(i)}\,D_{ijkl}\,{}^{n+1}m^{(i)}_{kl}\big);  (47)

(d) update the stress and the state variable

{}^{(n+1)}\sigma^{(i+1)}_{mn} = {}^{pred}\sigma_{mn} - d\lambda^{(i)}\,D_{mnpq}\,{}^{n+1}m_{pq},  (48)

{}^{(n+1)}\kappa^{(i)} = {}^{pred}\kappa + d\lambda^{(i)}\,h_\kappa.  (49)

(e) i = i + 1
end do

figure 7. jphs model — flow chart.
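the loop above can be condensed into the following sketch (python for readability; a single smooth yield surface is assumed, stresses are handled as 6-component vectors, and the geometric correction through the t-matrix of eq. (46) is omitted for brevity — a sketch, not the paper's fortran implementation):

import numpy as np

def stress_integration(sig_trial, kappa, f, dfdsig, dqdsig, D, h_kappa,
                       tol=1e-8, max_iter=50):
    """simplified backward-euler return mapping, cf. eqs. (34)-(49).
    f(sig, kappa)  .. yield function value
    dfdsig, dqdsig .. gradients n and m of the yield/potential functions
    D              .. (6, 6) elastic stiffness matrix
    h_kappa        .. hardening modulus of eq. (20)"""
    if f(sig_trial, kappa) < 0.0:            # elastic region, eq. (34)
        return sig_trial, kappa
    sig, kap = sig_trial.copy(), kappa
    for _ in range(max_iter):
        fval = f(sig, kap)
        if abs(fval) < tol:                  # converged onto the yield surface
            break
        n, m = dfdsig(sig, kap), dqdsig(sig, kap)
        dlam = fval / (n @ D @ m - h_kappa)  # plastic multiplier, eq. (37)
        sig = sig - dlam * (D @ m)           # stress update, eqs. (38)/(48)
        kap = kap + dlam * h_kappa           # state update, eqs. (39)/(49)
    return sig, kap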
4. validation of the model
the following chapter is focused on the validation and verification of the jphs model. the basic soil tests (the triaxial test and the oedometer test) are simulated using numerical analyses and the results are compared with the real measured data from laboratory tests done on the brno clay [15].
4.1. the triaxial test
the 2d analysis (plaxis v.2016) of the triaxial test is simulated using an axisymmetric model. the input parameters for the jphs model are summarized in table 1 and table 4. the coarseness of the mesh, geometry and dimensions of the model are presented in figure 8. the left and the bottom boundaries are fixed in the horizontal and vertical directions, respectively. the isotropic/axial loading is represented by a distributed load applied on the top and right boundaries. the modelling sequence is as follows:
(1.) initial conditions.
(2.) isotropic loading (the loads applied to the top and right boundary are activated, the undrained behaviour is ignored; after the isotropic loading, the prescribed displacements are set to zero).
(3.) axial compression to failure (the load applied to the top is increased to a value which causes the failure, the undrained behaviour is active).

figure 8. the triaxial test — 2d model.

figures 9 and 10 present the results of the triaxial tests for different values of the initial stress conditions (275, 500 and 750 kpa). the results confirm that the jphs model gives a good match with the laboratory data. only the peak strength for 750 kpa is slightly over-predicted, but this initial value corresponds to a depth of approximately 100 m, which is not a common area for the construction of geotechnical structures.

figure 9. the triaxial test results — strain versus deviatoric stress, lab. data from [15].
figure 10. the triaxial test results — hydrostatic pressure versus deviatoric stress, lab. data from [15].

the comparison of the water pressure development is shown in figure 11 and again confirms a good match. the curve's shape is almost identical with the laboratory data and the slight underprediction of the maximum excess pore pressure is only 15%.

figure 11. the triaxial test results — strain versus water pressure, lab. data from [15].

4.2. oedometer test
the numerical simulation of the oedometer test is done by a 3d analysis (plaxis v.2013). the input parameters for the jphs model are summarized in table 1 and table 4. the coarseness of the mesh, geometry and dimensions of the model are presented in figure 12. the side boundaries are fixed in the horizontal directions; the vertical fixities are used on the bottom side. the axial load σ1 is represented by a distributed load on the top plane. the modelling sequence is as follows:
(1.) initial conditions.
(2.) axial compression (the load applied on the top is gradually increased).
(3.) unloading (the load applied on the top is gradually decreased).
the results are shown in figure 13 and confirm a good match between the jphs model simulation and the laboratory data for both the loading and the unloading stage. if the absolute values of the axial strain are compared, the jphs model slightly underestimates the maximum value (by approx. 10%).

figure 12. oedometer test — 3d model.
figure 13. oedometer test results — strain versus stress, lab. data from [15].

5. practical use of the model — numerical modelling of tunnel excavation
a 3d numerical analysis of a shallow tunnel in stiff clay is performed to verify the jphs model and to compare the results calculated by this model with the real data from the geotechnical monitoring. for the calculation, the kralovopolske tunnels (exploration adit) are chosen. the kralovopolske tunnels are a part of the ring road of the brno town in the czech republic and are excavated in difficult geological conditions. the overburden varies between 6 m and 22 m, the tunnels are excavated in the brno clay and there is an urban area on the surface.
5.1. model discretization and boundary conditions
the numerical analysis is carried out in the software plaxis 3d v.2013.

table 1. mc model — geotechnical parameters.
                           loess and clay loams   sand deposits   stiff clay
unit weight γ [kn m⁻³]             19.0                19.0          18.0
e-modulus e [mpa]                    10                  65            15
poisson's ratio ν [–]              0.35                0.35          0.40
cohesion c′ [kpa]                    10                 5.0           5.0
friction angle φ′ [°]                20                  30            26
dilatation angle ψ [°]              4.0                 8.0           1.0
table 2. modified cam clay model — geotechnical parameters (stiff clay).
cam-clay swelling index κ [–]           0.2
cam-clay compression index λ [–]        0.09
tangent of critical state line m [–]    1.55
initial void ratio e [–]                0.5
poisson's ratio νur [–]                 0.2

the model represents a 100 m wide, 48 m high and 90 m deep section of the soil mass, where the top boundary of the model represents the ground surface. the bottom and the side model boundaries are set at a distance required to reliably predict the stress redistribution and the ground deformation around the tunnel. the tunnel has a triangular shape, 4.8 m wide and 4.2 m high, and the crown of the tunnel is situated in the stiff clay approx. 22 m under the surface. the model geometry, including the mesh and the coordinate system, is presented in figure 14. the model boundary conditions are set so that the vertical displacement is fixed at the bottom boundary and the horizontal displacements are fixed at the side boundaries.

figure 14. model geometry.

5.2. ground properties and primary stress conditions
the area of the kralovopolske tunnels is formed by miocene marine deposits of the carpathian fore-deep. the main geological strata, taken from the ground surface to the bedrock, are loess and clay loams with a thickness between 3–10 m, followed by a layer of sand deposits (thickness approx. 10 m) and stiff clay (locally called brno clay). the thickness of this clay is expected to be several hundreds of meters and most of the route of the tunnels is located there. the ground water level is connected with the sand layer and is observed in the depth of 13–20 m.
according to the study done by svoboda et al. [15], the material properties of the loams and the sand deposits have only a small influence on the prediction of the surface settlement, and therefore they are modelled only by using the mohr-coulomb model in this study. however, the stiff clay is modelled using the jphs model and its prediction is then compared with the predictions calculated by the mohr-coulomb model, the modified cam clay model and the hardening small strain model. the model parameters are calibrated based on laboratory tests [15], which have been carried out during the site investigation. the parameters used for the numerical analysis are summarized in table 1 to table 4 and the calibration of the stiff clay for the jphs model is shown in figure 9, figure 11, figure 13 and figure 15. the calibration of the other models is mentioned in [23].
to set the initial stress conditions, it is necessary to determine the coefficient of earth pressure k0. the brno clay can be characterized as an over-consolidated clay with ocr = 6.5 [15]. by using the formula by mayne and kulhawy [24], the k0 coefficient is:

k_0 = (1 - \sin\varphi')\,\mathrm{ocr}^{\sin\varphi'} = (1 - \sin 26.5°)\cdot 6.5^{\sin 26.5°} = 1.25.  (50)
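the arithmetic of eq. (50) is easy to verify, e.g.:

import math

# mayne & kulhawy [24]: k0 = (1 - sin(phi')) * ocr**sin(phi')
phi = math.radians(26.5)
k0 = (1.0 - math.sin(phi)) * 6.5 ** math.sin(phi)
print(round(k0, 2))   # -> 1.28; the paper quotes the slightly lower rounded value 1.25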
5.3. primary lining and modelling stages
the kralovopolske exploration adit is excavated by the new austrian tunnelling method (natm) with a full-face adit excavation.

table 3. hss model — geotechnical parameters (stiff clay).
e-modulus e50ref [mpa]                  3.0
e-modulus eoedref [mpa]                 2.9
e-modulus eurref [mpa]                  9.0
g0ref [mpa]                            11.0
γ0.7 [–]                           2.0 · 10⁻⁴
power of stress dependency m [–]        1.0
poisson's ratio νur [–]                 0.2
k0 value for nc k0nc [–]               0.54
reference pressure pref [kpa]           100

table 4. jphs model — geotechnical parameters (stiff clay).
jardine stiffness parameters:
a [–]                                   180
b [–]                                   165
c [–]                              1.7 · 10⁻⁵
α [–]                                  0.85
γ [–]                                  1.05
εmin [–]                         1.701 · 10⁻⁵
εmax [–]                               0.02
critical friction angle φ′crit [°]     15.0
peak friction angle φ′peak [°]         28.0
residual friction angle φ′res [°]      18.0
hardening constant εhardening [–]      0.07
softening constant εsoftening [–]      0.04
hard./soft. constant δ [–]            1.075
unloading coefficient kunload [–]       2.7
cap coefficient kcap [–]               0.45

figure 15. jphs model calibration — small strain stiffness, laboratory data from [15].

the numerical analysis corresponds to a construction process with a round length of 1.0 m, which leads to 90 calculation phases. each phase represents an excavation progress of 1.0 m and the installation of the tunnel lining in a period of 8 hours. the tunnel lining consists of 100 mm of sprayed concrete, lattice girders (1 m span) and wire meshes. the lining is modelled using shell elements, which are capable of taking normal forces and bending moments. the lining is directly connected to the soil mass and is modelled as linearly elastic. to simulate the influence of the lattice girders on the stiffness of the lining, a homogenization of the steel-concrete lining is done according to [25]. this procedure converts the cross-section of a lining consisting of two components with different e-moduli to a substitute homogenized cross-section with only one modulus of elasticity. to simulate the increase in the stiffness of the sprayed concrete with time, the young's modulus of the shotcrete is updated in the construction stage process and is based on the meschke equation [26]:

E_t = \beta_{Et}\,E_{28},  (51)

where

\beta_{Et} = 0.0468t - 0.00211t^2 \quad \text{if } t < 1 \text{ day}, \qquad
\beta_{Et} = \big(0.9506 + 32.89\,t^{-6}\big)^{-0.5} \quad \text{if } t \geq 1 \text{ day}.  (52)
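a small sketch of this stiffness update follows (python; note that the exponent −6 in the second branch of eq. (52) is a reading of the garbled source and should be checked against [26]):

def shotcrete_modulus(t_days, e28):
    """young's modulus growth of the sprayed concrete, eqs. (51)-(52)."""
    if t_days < 1.0:
        beta = 0.0468 * t_days - 0.00211 * t_days**2
    else:
        # the t**-6 term is an assumption where the extracted source was ambiguous
        beta = (0.9506 + 32.89 * t_days**-6) ** -0.5
    return beta * e28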
5.4. results of the numerical analysis
four 3d analyses are carried out to determine the effect of the material models on the deformation of the ground during the excavation of the adit in the stiff clay. three analyses are done using the standard material models included in the software plaxis (mohr-coulomb, modified cam-clay and hardening small strain model), while the fourth analysis is done using the jphs model. in the following chapter, the results of all the analyses are discussed and compared with the real data from the geotechnical monitoring.
figure 16 presents the surface settlement after the full excavation of the adit. the results show that the settlement trough calculated by the mc model has a totally unrealistic shape: the vertical displacement above the tunnel is lower than at a certain distance from the axis. this phenomenon is caused by a high value of k0 and it is evident that the mc model is not suitable for a numerical modelling of overconsolidated soils. the other models predict a better shape of the settlement trough and differ mainly in its width and depth. the closest results to the real monitored data are calculated by the jphs model. if we look at the absolute values of the surface settlement, we can clearly see that the jphs model predicts almost the same values (a difference of around 5%), while the hss model underestimates the vertical displacement by more than 25%. also, the width of the settlement trough calculated by the jphs model is almost two times more accurate than the one predicted by the modified cam-clay model.

figure 16. the settlement trough after the full profile excavation, monitoring data from [11].

another interesting result can be observed when the development of the settlement in the longitudinal direction is compared (see figure 17). the development of the settlement in the longitudinal direction calculated by the mc and the modified cam-clay model has an unrealistic shape: these models predict that 80% of the deformation occurs before the adit face passes through the measured profile. one of the reasons for this behaviour is that the mc and the modified cam-clay model do not contain the non-linear small-strain behaviour of soils. however, this feature is implemented in the hss and the jphs model, and especially the jphs model fits the monitoring data very well.

figure 17. development of the settlement in the longitudinal direction, monitoring data from [15].

similar results can also be observed for the horizontal displacement, as presented in figure 18. the jphs model is closest to the real measured data. the difference is higher (approx. 50%), but one of the reasons can be the anisotropy of the brno clay (mentioned by [27]), which is not implemented in the model. the jphs model is also the only model that correctly simulates the increase of the horizontal displacement near the ground surface.

figure 18. horizontal displacement after the full profile excavation, monitoring data from [15].

a good match is also indicated when comparing the deformations of the adit lining. the differences between the material models are, in this case, not so significant, but the jphs model is still closest to the reality (see figure 19).

figure 19. deformations of the adit lining, monitoring data from [15].

6. conclusions
the presented results show that the choice of the material model has a significant influence on the correct modelling of the soil behaviour. the commonly used mohr-coulomb model is not appropriate for a numerical analysis in overconsolidated clay and it is necessary to use advanced material models. the presented jphs model demonstrates that it is able to simulate the behaviour of soils quite precisely and, with a proper calibration of the input parameters, it is possible to predict the correct displacements of the ground during an excavation.
list of symbols
αdp  drucker-prager coefficient [–]
δ  hard./soft. constant based on the triaxial test [–]
εa  axial strain [–]
εdev  deviatoric strain invariant [–]
εhardening  hardening constant based on the triaxial test [–]
εn  strain tensor [–]
εsoftening  softening constant based on the triaxial test [–]
εv  volumetric strain [–]
dεp  plastic strain scalar [–]
κ  hardening/softening variable [–]
λ  plastic multiplier [–]
φ′  actual friction angle [rad]
φ′peak  peak friction angle [rad]
φ′res  residual friction angle [rad]
ψ  dilatation angle [rad]
σm  mean stress pressure [kpa]
σn  stress tensor [kpa]
θ  lode angle [rad]
cu  undrained shear strength [kpa]
kcap  cap coefficient [–]
kunload  unload coefficient [–]
p  hydrostatic stress [kpa]
pc  pre-consolidation stress [kpa]
q  deviatoric stress [kpa]
rc  distance from the hydrostatic axis to the failure surface at the compressive meridian [m]
rt  distance from the hydrostatic axis to the failure surface at the tension meridian [m]
a, b, c, α, γ, εmin, εmax  jardine material model constants based on the triaxial test [–]
d  stiffness matrix [–]
et  young's modulus of sprayed concrete in time [kpa]
e28  young's modulus of sprayed concrete after 28 days [kpa]
eunload  unloading young's modulus [kpa]
eu  secant young's modulus [kpa]
eut  tangent young's modulus [kpa]
f  yield function [–]
k  bulk modulus [kpa]
ocr  over-consolidation ratio [–]
q  potential function [–]

acknowledgements
this work was supported by the grant agency of the czech technical university in prague, grants sgs15/045/ohk1/1t/11 and sgs16/051/ohk1/1t/11.

references
[1] g. r. dasari, c. g. rawlings, m. d. bolton. numerical modelling of a natm tunnel construction in london clay. in geotechnical aspects of underground construction in soft ground, rotterdam, pp. 491–496. 1996.
[2] p. a. vermeer, p. bonnier, s. c. moller. on a smart use of 3d-fem in tunnelling. in proceedings of the 8th international symposium on numerical models in geomechanics — numog viii, rome, italy, pp. 361–366. 1996. doi:10.1201/9781439833797-c52.
[3] m. dolezalova. approaches to numerical modelling of ground movements due to shallow tunnelling. in proc. 2nd int. conference on soil structure interaction in urban civil engineering, eth zurich, pp. 365–376. 2002.
[4] c. ng, g. lee. three-dimensional ground settlements and stress-transfer mechanisms due to open-face tunnelling. canadian geotechnical journal 42(4):1015–2029, 2005. doi:10.1139/t05-025.
[5] l. vydrova, j. vesely. optimization of the numerical modeling utilization for the design of underground structures. in geotechnical engineering: new horizons. 2011. isbn 978-1-60750-807-6.
[6] s. maras-dragojevic. analysis of ground settlement caused by tunnel construction. gradevinar 64(7):573–581, 2012.
[7] a. lambrughi, l. rodriguez, r. castellanza. development and validation of a 3d numerical model for tbm-epb mechanised excavations. computers and geotechnics 40:97–113, 2012. doi:10.1016/j.compgeo.2011.10.004.
[8] t. janda, m. sejnoha, j. sejnoha. modeling of soil structure interaction during tunnel excavation: an engineering approach. advances in engineering software 62-63:51–60, 2013. doi:10.1016/j.advengsoft.2013.04.011.
[9] j. bartak, j. pruska, m. hilar. probability analysis of the effect of input parameters on the mrazovka tunnel deformations modelling. tunel 11(3):27–33, 2002.
[10] t. svoboda, m. hilar. probabilistic analyses of tunnel loads using variance reduction. proceedings of the institution of civil engineers — geotechnical engineering 168(4):348–357, 2015. doi:10.1680/geng.14.00062.
[11] d. masin. 3d modeling of an natm tunnel in high k0 clay using two different constitutive models. journal of geotechnical and geoenvironmental engineering 135:1326–1335, 2009. doi:10.1061/(asce)gt.1943-5606.0000017.
[12] d. masin, i. herle. numerical analyses of a tunnel in london clay using different constitutive models. in geotechnical aspects of underground construction in soft ground, amsterdam, pp. 795–801. 2015. doi:10.1201/noe0415391245.ch81.
[13] t. benz. small-strain stiffness of soils and its numerical consequences. ph.d. thesis, university of stuttgart, 2010.
[14] r. jardine, d. potts, a. fourie, j. burland. studies of the influence of non-linear stress-strain characteristics in soil-structure interaction. geotechnique 36(3):377–396, 1986. doi:10.1680/geot.1986.36.3.377.
[15] t. svoboda. numericky model nrtm tunelu v tuhem jilu [a numerical model of a natm tunnel in stiff clay]. ph.d. thesis, charles university in prague, 2010.
[16] a. gasparre. advanced laboratory characterisation of london clay. ph.d. thesis, imperial college london, 2005.
[17] k. j. willam, e. p. warnke. constitutive model for the triaxial behavior of concrete. in international association for bridge and structural engineering proceedings, vol. 19. bergamo, italy, 1975.
[18] a. prashant. three-dimensional mechanical behavior of kaolin clay with controlled microfabric using true triaxial testing. ph.d. thesis, university of tennessee, 2004.
[19] d. drucker, w. prager. soil mechanics and plastic analysis or limit design. quarterly of applied mathematics 10(2):157–165, 1952.
[20] m. jirasek, z. bazant. inelastic analysis of structures. london: j. wiley & sons, 2002.
[21] b. jones, a. thomas, y. hsu, m. hilar. evaluation of innovative sprayed-concrete-lined tunnelling. geotechnical engineering 161(ge3):137–149, 2008. doi:10.1680/geng.2008.161.3.137.
[22] b. jeremic, s. sture. implicit integrations in elasto-plastic geotechnics. international journal for mechanics of cohesive-frictional materials and structures 2:165–183, 1997. doi:10.1002/(sici)1099-1484(199704)2:2<165::aid-cfm31>3.0.co;2-3.
[23] j. vesely. the use of advanced material model for numerical modelling of underground structures in clay. ph.d. thesis, czech technical university in prague, 2017.
[24] p. mayne, f. kulhawy. k0–ocr relationships in soil. journal of geotechnical engineering, asce 108(gt6):851–872, 1982.
[25] j. rott. homogenisation and modification of composite steel-concrete lining, with the modulus of elasticity of sprayed concrete growing with time. tunel 23(3):53–60, 2014.
[26] a. thomas. sprayed concrete lined tunnels: an introduction. taylor & francis, 2009.
[27] j. rott, d. masin, j. bohac, et al. evaluation of k0 in stiff clay by back-analysis of convergence measurements from unsupported cylindrical cavity. acta geotechnica 10:719–733, 2015. doi:10.1007/s11440-015-0395-7.
acta polytechnica 57(1):58–70, 2017.

acta polytechnica 58(1):57–68, 2018, doi:10.14311/ap.2018.58.0057
© czech technical university in prague, 2018, available online at http://ojs.cvut.cz/ojs/index.php/ap

anaerobic digestion of landfill leachate with natural zeolite and sugarcane bagasse fly ash as the microbial immobilization media in packed bed reactor

hanifrahmawan sudibyo (a, b), zata lini shabrina (a), hartika rafih wondah (a), retno tri hastuti (a), lenny halim (a), chandra wahyu purnomo (a), wiratni budhijanto (a, b, ∗)
(a) chemical engineering department, faculty of engineering, universitas gadjah mada, jalan grafika no. 2, yogyakarta 55281, indonesia
(b) center for energy studies, universitas gadjah mada, jalan bhinneka tunggal ika sekip ugm k-1a, yogyakarta 55281, indonesia
∗ corresponding author: wiratni@ugm.ac.id

abstract. to enhance the digestion rate of landfill leachate in an anaerobic packed bed reactor, natural zeolite and sugarcane bagasse fly ash (bfa) were tested as the immobilization media. in order to scale up this process and systematically optimize the reactor performance, a kinetics model was needed. the suitability of the contois and haldane growth kinetic models was tested on the experimental data. it turned out that contois gave the best fit for both the acidogenic and the methanogenic step. a statistical analysis of the contois kinetic parameters using the pearson correlation coefficient indicated that, in comparison with the bfa, the zeolite as an immobilization medium showed more positive effects on the performance of the anaerobic digestion of leachate.

keywords: landfill leachate, natural zeolite, sugarcane bagasse fly ash, growth kinetics, contois, haldane.

1. introduction
1.1. landfill leachate problem in developing countries
the increasing accumulation of municipal waste in conventional landfill sites has caused severe environmental impacts. in indonesia, the high organic fraction in the municipal solid waste leads to an excessive leachate release. the leachate generated from landfill sites usually contains a high amount of organic and inorganic contaminants [1]. the organic and inorganic contaminants in the leachate are commonly characterized by high values of the chemical oxygen demand (cod), ph, ammonia nitrogen and heavy metals, and by a strong colour and bad odour. the removal of the organic material, represented by the cod, the biochemical oxygen demand (bod) and the ammonium, from the leachate is a mandatory prerequisite for discharging the leachate into water bodies [2].
a biological treatment of the leachate is quite complicated due to its excessive amount, its possibly toxic content for the digesting microbes and the uncertainty of its composition. the leachate composition can vary depending on several factors, including the degree of compaction, the waste composition, the moisture content in the waste, the composition and volume, and the age of the landfill [3, 4]. several methods are currently available to treat the landfill leachate. most of them are adapted from wastewater treatment processing and can be divided into two main categories: biological treatments and physical/chemical treatments [5]. the latter is often chosen over the biological treatment because it is easier and faster. however, it is usually a costlier and more energy-intensive operation and creates other environmental problems due to the release of chemicals into the water bodies [5].
in a densely populated country like indonesia, a large landfill site input could be as high as 5000–6000 ton/day of municipal solid waste (msw), with a 60–70 % organic fraction. this msw characteristic produces a high amount of leachate with a high organic content. an anaerobic digestion of this organic-rich landfill leachate would potentially produce a significant amount of biogas. the energy generation from the leachate in the form of biogas makes the anaerobic digestion even more attractive for indonesia, because this country now heavily relies on imported fossil fuel [6]. the biogas produced in the landfill site could be converted into electricity that could be used for a further aerobic treatment of the effluent, using a more energy-efficient aeration system [7, 8].
however, the common hindrance to running the anaerobic digestion for the leachate treatment is the slow growth of the microorganisms, so the usage of a conventional anaerobic digester requires a huge digester volume [9]. besides, a wash-out often happens in the conventional anaerobic digester, especially for a high flow rate feed. the solution for stabilizing and maximizing the microbial growth is immobilizing the cells on solid media [9, 10]. the cell immobilization on solid media can be defined as a localization of an intact cell via a physical adsorption between the carrier and the cell membrane [11]. the microbial immobilization can be applied in various designs of reactors, such as the fixed bed, the fluidized bed, or the membrane reactor, to minimize the possibility of the cells being washed out.
1.2. material of the immobilization media
the natural zeolite and the sugarcane bagasse fly ash (bfa), which are both widely and abundantly available in indonesia, have a potential to be used as immobilization media. as an inorganic material, the natural zeolite is able to immobilize biological species, offering interesting characteristics such as a mechanical and chemical resistance and a high surface area. it has the advantage of its mineral content, such as silica and alumina (the major components) [12], calcium, magnesium, etc., which could enhance the cell growth. zeolites are also known to be stable both in wet and dry conditions and to be well-tolerated by microorganisms, and are therefore normally compatible with bioprocess applications.
the sugarcane bagasse, a residue obtained after crushing the sugarcane to extract the broth, is the most abundant lignocellulosic residue, with 1–1.2 ton of bagasse produced for every 10 tons of sugarcane consumed for the sugar production [13]. although most of the bagasse has been used in the sugarcane industry itself to generate energy, there is a surplus of this agro-industrial residue, and several alternatives for its utilization have been evaluated, among which is the production of vanillin [14] and xylitol [15, 16]. in this respect, the sugarcane bagasse has already been used, with promising results, as a cell support in different bioprocesses [17].
the performance of both the bfa and the natural zeolite as the immobilization media in the anaerobic digestion of landfill leachate was evaluated in this work. the comparison was conducted quantitatively by means of a mathematical model, comparing the kinetics parameters of the process using the natural zeolite and the process using the bfa. the appropriate growth kinetics model was chosen based on the best fit to the experimental data, indicated by the minimum sum of squares of errors (sse). the pearson correlation coefficient was selected as the statistical tool to verify the correlation between the addition of the immobilization media and the digester performance represented by the kinetics parameters. furthermore, the mathematical model suggested in this paper could be very useful in the future for scaling up and optimizing the design of full scale reactors.
1.3. kinetic model of anaerobic digestion with immobilized microbes
in digesting the leachate, there are two processes that must be carried out: acidogenesis and methanogenesis. currently, there are two models suitable for describing the leachate or wastewater digestion: the contois kinetics [18] and the haldane kinetics [19]. on the one hand, the contois kinetics has been proven to describe the acidogenesis step well, especially with a substrate like wastewater and leachate [18]. on the other hand, for the methanogenesis step, there is still some discussion about the most suitable model due to an existing phenomenon: after the acidogenesis begins and starts producing volatile fatty acids (vfa), the vfa usually inhibits the microorganism growth; this is known as the substrate inhibition. that is why using the contois kinetics for the methanogenesis is sometimes not suitable to describe the phenomenon well. fortunately, the haldane kinetics accommodates the substrate inhibition phenomenon. therefore, it was necessary to compare which model suits better to describe the mechanism of the anaerobic leachate digestion, especially for the methanogenesis step. to do so, an anaerobic batch digestion inside a packed-bed reactor was conducted to obtain the necessary data for the kinetics study. this kinetics study would be beneficial for optimizing the performance of a continuous anaerobic packed-bed reactor.
according to the aforementioned explanation, there were two scenarios that could be set up to find the mechanism of the anaerobic leachate digestion. the first scenario consisted of the acidogenesis step described by the contois kinetics, followed by the methanogenesis step described by the haldane kinetics. the other scenario consisted of the acidogenesis and methanogenesis steps both described by the contois kinetics. a set of differential equations was derived from each of the two scenarios.
the first scenario's set of differential equations consists of (1), (2), (4), (5) and (6), whereas the second scenario's set consists of (1), (3), (4), (5) and (6), where

\frac{dX_1}{dt} = \frac{\mu_{m1}\,S_{COD}\,X_1}{K_{SX1}X_1 + S_{COD}} - k_{d1}X_1^{c_1},  (1)

\frac{dX_2}{dt} = \frac{\mu_{m2}\,C_{VFA}\,X_2}{K_{SX2}X_2 + C_{VFA} + C_{VFA}^2/K_I} - k_{d2}X_2^{c_2},  (2)

\frac{dX_2}{dt} = \frac{\mu_{m2}\,C_{VFA}\,X_2}{K_{SX2}X_2 + C_{VFA}} - k_{d2}X_2^{c_2},  (3)

\frac{dS_{COD}}{dt} = -\frac{1}{Y_{X1/COD}}\frac{dX_1}{dt},  (4)

\frac{dC_{VFA}}{dt} = Y_{VFA/X1}\frac{dX_1}{dt} - \frac{1}{Y_{X2/VFA}}\frac{dX_2}{dt},  (5)

\frac{dC_{CH4}}{dt} = Y_{CH4/X2}\frac{dX_2}{dt}.  (6)

these differential equations are solved numerically, and the corresponding kinetics constants can be determined by minimizing the sum of squares of errors (sse) between the calculated and the experimental data of the organic matter concentrations, i.e. the acidogenic cells (x1), the methanogenic cells (x2), the substrate (scod), the volatile fatty acids (cvfa) and the methane (cch4). afterwards, the change of the kinetics constants related to the addition of more immobilization media is verified through the pearson correlation coefficient:

r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\big(n\sum x^2 - (\sum x)^2\big)\big(n\sum y^2 - (\sum y)^2\big)}}.  (7)

the calculated correlation coefficient was transformed into an absolute value and then compared with the critical value of the pearson correlation coefficient. the absolute value of the calculated correlation coefficient must be greater than the critical value to show that there is a correlation. after verifying that there is a correlation, the sign of the correlation coefficient (either positive or negative) shows what the correlation is. the interpretation would be linearly correlated, inversely correlated, or not correlated at all.
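a minimal python sketch of the second (contois–contois) scenario, eqs. (1) and (3)–(6), is given below; the function and parameter names are illustrative (the paper's own calculations were done in matlab), and the fitting then amounts to minimizing the sse over the parameter vector, e.g. with scipy.optimize.minimize:

import numpy as np
from scipy.integrate import solve_ivp

def contois_contois(t, y, mu1, mu2, ks1, ks2, yx1, yx2, ych4, yvfa,
                    kd1, kd2, c1, c2):
    """rhs of eqs. (1), (3)-(6); y = [x1, x2, s_cod, c_vfa, c_ch4]."""
    eps = 1e-12   # guards the empty-reactor singularity in the denominators
    x1, x2, s, v = max(y[0], 0.0), max(y[1], 0.0), max(y[2], 0.0), max(y[3], 0.0)
    r1 = mu1 * s * x1 / (ks1 * x1 + s + eps) - kd1 * x1**c1   # eq. (1)
    r2 = mu2 * v * x2 / (ks2 * x2 + v + eps) - kd2 * x2**c2   # eq. (3)
    return [r1, r2,
            -r1 / yx1,                # eq. (4): substrate consumed by acidogens
            yvfa * r1 - r2 / yx2,     # eq. (5): vfa produced, then consumed
            ych4 * r2]                # eq. (6): methane formation

def sse(params, t_obs, y_obs, y0):
    """sum of squares of errors; y_obs rows are [x1, x2, scod, cvfa, cch4]."""
    sol = solve_ivp(contois_contois, (t_obs[0], t_obs[-1]), y0,
                    t_eval=t_obs, args=tuple(params))
    return float(np.sum((sol.y.T - y_obs) ** 2))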
2. materials and methods
2.1. materials
a fresh leachate was obtained from the piyungan sanitary landfill, yogyakarta, indonesia. the starter, in the form of an active digester effluent, was supplied by the cow-manure-based biogas mini-plant located at gadjah mada university's piat (pusat inovasi agroteknologi) at berbah, sleman. the immobilization media were produced from the lampung natural zeolite and the pt. madukismo sugarcane bagasse fly ash, which were supported by bentonite as the adhesive agent. high purity chemicals were used in this work for the analytical routines, which included h2so4 98 % (merck), hcl 37 % (merck), naoh (merck), c8h5ko4 p.a. (emsure), hgso4 p.a. (emsure), agso4 p.a. (merck), k2cr2o7 p.a. (emsure), na2b4o7 · 10 h2o 99.55 % (merck), and ch3cooh 96 % p.a. (merck).
2.2. production of immobilization media from natural zeolite
the bfa and the raw natural zeolite powder (undersize 100 mesh) were each mixed with bentonite with a mass ratio of 1 : 1. afterwards, the mixture was moulded by an extruder to form raschig rings with a size of 1 cm inside diameter, 5 mm thickness and 2 cm length. lastly, the moulded mixture was kept heated at 110 °c for 12 hours by using the furnace thermolyne tube heater type f21100.
2.3. anaerobic digestion of leachate
the anaerobic digestion was operated in a batch system using a vertical-cylinder-formed digester made of acrylic and equipped with a vertical tube gasometer (figure 1). in this work, the digester volume was 3 l with an l/d ratio of 4.6.

figure 1. experimental set-up.

the fresh leachate was fed to the digester without any dilution, so the initial concentration of the leachate (in the form of scod) was different for each experiment batch. the scod and vfa concentration analysis, identifying the initial leachate characteristics, resulted in the data shown in table 1. the difference was caused mostly by rainfall.

table 1. leachate characteristics over three different months in 2016.
month    scod [mg/l]    vfa [mg/l]
1           3660           1132
2           5895           1220
3           2620            970

to identify the effect of the addition of the immobilization media on the digester performance, the amount of the immobilization media was varied. since both immobilization media had a different bulk density, the amount of the immobilization media was calculated based on the volume fraction inside the digester. because of the same basic area, the immobilization media filled 1/4, 1/2, or 3/4 of the digester height. in this way, the zeolite ratios were 150 g/g scod, 240 g/g scod and 350 g/g scod, while the bfa ratios were 43 g/g scod, 110 g/g scod and 164 g/g scod. the anaerobic digestion without the immobilization media was set as the control (0 g/g scod) and was executed for each immobilization media.
2.4. analytical method of scod, vfa, ch4, and microbial concentration
the quantification of the microbes in the digester was conducted using the heterotrophic plate counts method explained by boothe et al. [20] and by the apha [21]. the obtained number of cells n (in cell/ml) was then converted into a mass concentration (mg/l) by multiplying it with the mass of a cell (1.15 pg/cell) [22]. the mass concentrations of the acidogenic and the methanogenic microbes were estimated by using the ratio of 9 : 1 (acidogenic : methanogenic microbes) [23]. the calculation is as follows:

X_1\,[\mathrm{mg/l}] = 0.9\,\Big(N\,[\mathrm{cell/ml}] \cdot 1.15\,\mathrm{pg/cell} \cdot 10^{-6}\,\tfrac{\mathrm{mg\cdot ml}}{\mathrm{pg\cdot l}}\Big),  (8)

X_2\,[\mathrm{mg/l}] = 0.1\,\Big(N\,[\mathrm{cell/ml}] \cdot 1.15\,\mathrm{pg/cell} \cdot 10^{-6}\,\tfrac{\mathrm{mg\cdot ml}}{\mathrm{pg\cdot l}}\Big).  (9)

in this work, the variable used to represent the substrate concentration was the soluble cod (scod). the analyses of the scod, vfa and ammonia during the experiment followed the standard procedures by the apha [24]: the scod analysis was conducted by the closed reflux colorimetric method, the vfa analysis used the titrimetric method and the ammonia measurement used the ion selective electrode (ise) measurement. the gas volume was measured using the gasometer method outlined by walker [25], while the methane content was analysed by using the gas chromatograph (gc) shimadzu gc 8a. the data of the methane production are presented in the ml/g scod (removed) unit [26] and in a volume percentage to describe the purity.
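for illustration, the conversion of eqs. (8)–(9) is a one-liner (the 9 : 1 split and the 1.15 pg/cell mass follow [22, 23]; the function name is an assumption):

def cell_mass_concentrations(cells_per_ml, mass_pg_per_cell=1.15,
                             acidogenic_fraction=0.9):
    """plate count [cell/ml] -> (x1, x2) in mg/l, eqs. (8)-(9).
    pg -> mg is 1e-9 and ml -> l is 1e3, hence the 1e-6 factor."""
    total_mg_per_l = cells_per_ml * mass_pg_per_cell * 1e-6
    return (acidogenic_fraction * total_mg_per_l,
            (1.0 - acidogenic_fraction) * total_mg_per_l)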
3. results and discussion
3.1. acidogenic and methanogenic cell growth behavior
according to the experimental data (figure 2), both for the acidogenic cells (x1) and for the methanogenic cells (x2), each growth phase ran for the same period of time when the natural zeolite and the bfa were used as the immobilization media. for instance, the lag phase ran for the same period of time before and after the addition of the immobilization media (from day 0 to day 7). from day 7 to day 13, the slope was increasing sharply, so it could be considered as the beginning of the log phase. between day 13 and day 35, the slope started declining, so it could be assumed that the stationary phase began in this period. however, the maximum concentration of the cells was the differentiator. for instance, with an increased amount of the immobilization media, the maximum concentration of the cells reached during the lag phase was greater than the one without using the immobilization media. the maximum concentration increased largely when the immobilization media filled 3/4 of the digester volume.

figure 2. concentration of acidogenic cell (x1) and methanogenic cell (x2) in leachate during anaerobic digestion: (a) x1-zeolite; (b) x1-bfa; (c) x2-zeolite; (d) x2-bfa (diamonds — no media; triangles — 1/4 height of digester; squares — 1/2 height of digester; crosses — 3/4 height of digester).

3.2. scod and cvfa profile
as a consequence of the cell growth, four parameters experienced changes: the scod, the cvfa, the cumulative methane purity (% ch4) and the cumulative methane production. in the digester immobilized with the zeolite, the scod and the cvfa decreased consistently in time (figure 3a,c). this consistent decrease stood in line with the growth characteristic of the acidogenic cells (figure 2a), which increased in time. however, the fastest significant scod decrease occurred in the digesters immobilized by the natural zeolite filling 1/2 and 3/4 of the digester volume (240 g/g scod and 350 g/g scod). in those digesters, the scod decreased significantly from day 7 to day 13 (shown by a steeper slope). meanwhile, the digester without any immobilization media and the digester with the immobilization by the natural zeolite filling 1/4 of the digester volume had a significant but late scod decrease from day 21 to day 35. therefore, the addition of the zeolite seemed to influence the rate of the scod consumption by the cells once the zeolite ratio was greater than 110 g/g scod.

figure 3. scod and cvfa profile during anaerobic digestion: (a) scod-zeolite; (b) scod-bfa; (c) vfa-zeolite; (d) vfa-bfa (diamonds — no media; triangles — 1/4 height of digester; squares — 1/2 height of digester; crosses — 3/4 height of digester).

however, the digester immobilized by the bfa had both the scod and the cvfa increasing from day 0 to day 21 (figure 3b,d). this phenomenon could possibly be explained as the degradation (through hydrolysis) of the insoluble compounds, such as complex carbohydrates and proteins in the form of particulates, into simple ones, which are soluble. thus, the scod could increase, because the scod increase caused by the hydrolysis was greater than the scod decrease caused by the consumption by the acidogenic cells.
figure 4. cumulative methane production and methane percentage (% ch4) profile during anaerobic digestion: (a) ch4-zeolite; (b) ch4-bfa; (c) % ch4-zeolite; (d) % ch4-bfa (diamonds — no media; triangles — 1/4 height of digester; squares — 1/2 height of digester; crosses — 3/4 height of digester).

after day 21, the scod started decreasing, which means that the insoluble compounds had completely degraded into the simple and soluble compounds.
3.3. cumulative volume of biogas and methane content
the cumulative volume of methane produced from the digester immobilized either by the natural zeolite or by the bfa increased along the time and tended to be stable (reaching the asymptote point) at the stationary phase (see figure 4a,b). however, the digesters immobilized by the bfa by more than 1/2 of the digester volume were unfortunately unable to produce more methane. thus, in this work, the optimum amount of the bfa to obtain a large cumulative volume of methane was 1/2 of the digester volume (110 g/g scod). different from the bfa, adding more natural zeolite increased the cumulative volume of the methane produced.
with respect to the methane content, the digester immobilized by the natural zeolite and the one immobilized by the bfa had different methane content profiles. the methane content of the zeolite-immobilized digester had a specific trend in which a lower volume of a higher-purity methane was produced during the lag phase and the log phase. afterwards, during the stationary phase, the methane purity decreased and tended to be stable (see figure 4c). the highest stable methane content was reached when the natural zeolite filled 3/4 of the digester volume. a lower amount of the natural zeolite only produced a methane purity in the range of 12–15 % at stable conditions. when the digesters were immobilized by the bfa, the maximum methane content reached at stable conditions was about 14 % (see figure 4d). the increase of the methane content tended to be similar among the growth periods (lag phase, log phase and stationary phase). when the amount of the bfa inside the digester was increased to 3/4 of the digester volume (164 g/g scod), the digester became unproductive in terms of the biogas production rate and the methane content. to understand this phenomenon, a kinetic study was conducted on the aforementioned data.

table 2. comparison of the sse results of the two proposed models (natural zeolite as the immobilization media).
variable    no media       1/4        1/2        3/4
contois-haldane:
x1         1 168 700    126 620    141 820    837 480
x2           677 380    100 750    226 880  1 719 100
scod       5 084 200  2 987 200    752 470    951 060
cvfa          52 551    187 090     37 764     15 562
cch4          110.24      43.62       12.8       0.01
contois-contois:
x1             8 138    101 610     72 256    280 810
x2             7 117     62 042     60 491    161 210
scod       3 240 500  2 604 100    561 560  1 321 900
cvfa          60 428     89 010     29 270     59 133
cch4            0.47      45.95      12.15       0.01

table 3. comparison of the sse results of the anaerobic leachate digestion using the sugarcane bagasse fly ash as the immobilization media.
variable    no media       1/4        1/2        3/4
contois-haldane:
x1            42 534    252 370    143 390  1 745 400
x2            13 440     63 460    351 560  1 140 600
scod         411 590     17 974    208 410    232 730
cvfa          78 732     53 479     68 476     28 177
cch4           12.07     129.41      10.57     280.21
contois-contois:
x1            12 788     28 870     11 343    481 880
x2             4 344     25 903      8 244    265 090
scod         230 980    174 150    105 620    257 410
cvfa          19 792     17 427     26 633     27 526
cch4            9.54      79.21       2.67     113.48
3.4. kinetic study
generally, the sse results of the scod, cvfa and cch4 depicted the same performance of both scenarios in fitting the experimental data (see tables 2 and 3). for each amount of the immobilization media, both for the natural zeolite and for the bfa, the sse result stayed in the same order of magnitude. for instance, when the digester was immobilized by the natural zeolite filling 1/2 of the digester volume, the sse result of the scod was of the order of hundreds of thousands. although the sse had a gap of about two hundred thousand, it was considered a small gap due to the sse concept: with six data points for one experimental condition, an sse of this size corresponds to roughly 33 000 per data point on average, i.e. a pointwise deviation of √33 000 ≈ 180 between each experimental data point and each calculated value.
differently, the sse results of x1 and x2 depicted that the anaerobic leachate digestion in the zeolite-immobilized and in the bfa-immobilized digester was well described by the contois model for both the acidogenesis and the methanogenesis step. the comparison between the x1 and x2 sse of each scenario showed a huge difference, i.e. a different order of magnitude (see tables 2 and 3). for instance, when the digester was immobilized by the bfa filling 3/4 of the digester volume, the second scenario had the sse of the order of hundreds of thousands, while the first scenario had the sse of the order of millions (see tables 2 and 3). visually observed, the huge difference of the sse was caused by the inability of the first scenario (the contois-haldane models) to fit the experimental data.

table 4. kinetics constants of the contois-contois scenario (second scenario) for the natural zeolite as the immobilization media.
constant    no media      1/4       1/2       3/4
µm1          1.1989    1.2028    0.8779    2.5125
µm2          1.1782    0.9671    0.8564    1.2192
ksx1        69.5063   18.3186   11.8596    8.9268
ksx2        25.0034    8.2281    8.0195    2.824
yx1/cod      1.1205    1.1066    1.5885    2.9432
yx2/vfa      0.464     2.4561    0.5124    5.9081
ych4/x2     15.7046   18.2752   16.8141   16.476
yvfa/x1      0.9845    0.0134    1.1001    0.0283
kd1         4 · 10⁻⁶   3.1014    2.6587   10.5556
c1           0.3601    0.3599    0.3613    0.3589
kd2          0.0281    0.103     0.0386    0.1899
c2           0.8321    0.8301    0.8311    0.8278

table 5. kinetics constants of the contois-contois scenario (second scenario) for the bfa as the immobilization media.
constant    no media      1/4       1/2       3/4
µm1          0.1709    0.2847    0.3766    0.6775
µm2          0.1222    0.2746    0.3655    0.6695
ksx1        14.6413    5.6303    2.7611    4.1738
ksx2         2.7716    9.0647    0.4556    2.8435
yx1/cod      0.7595    0.8443    1.1089    2.7326
yx2/vfa      0.0663    0.1116    0.1856    0.1587
ych4/x2     16.4151   50.1092   23.6147    0.144
yvfa/x1      9.9998    6.0225    3.552     4.1674
kd1         3 · 10⁻⁶   0.2414    0.6659    0.5249
c1           0.0092    0.7588    0.7588    0.7588
kd2          0.1182   3 · 10⁻⁸   0.2997    0.0735
c2           0.5788    0.9571    0.9521    0.9571

table 6. comparison of the obtained parameters with previous studies.
constant    this study          previous studies
µm1         0.17–0.25           0.315 [27], 0.156 [28]
µm2         0.12–1.22           0.271 [29], 1.2 [28]
ksx1        2.76–69.51          126.32 [27], 0.983 [29], 20–50 [28]
ksx2        0.46–25.00          151.32 [29], 0.4 [30], 20–50 [28]
yx1/cod     0.76–2.94           0.82 [30]
yx2/vfa     0.07–5.91           0.983 [30]
ych4/x2     0.14–50.11          0.27 [30], 11–25 [31], 74 [32]
yvfa/x1     0.01–10.00          0.4 [33]
kd1         4 · 10⁻⁶–10.56      0.48 [30]
c1          0.01–0.36           —
kd2         3 · 10⁻⁸–0.30       0.48 [30]
c2          0.58–0.96           —
it clearly revealed that, by using the contois model for both steps, each growth phase, such as the lag phase, the log phase and the beginning of the stationary phase, was well depicted (see figures 5 and 6). the values of the kinetics constants for the contois-contois scenario, resulting from the numerical calculation using matlab, are shown in tables 4 and 5. compared with previous studies focused on finding the kinetics of an anaerobic digestion of organic or food waste, the obtained parameters showed only a slight difference. in table 6, there are two parameters which cannot be compared with the previous studies, since in this work the death rate equations opened the possibility of a non-elementary kinetics model (not of the first order), together with proving whether it was true that the first order was applicable in this study.

figure 5. matlab calculation result for the acidogenic cell (x1) inside the digester using the natural zeolite and the bfa as the immobilization media: (a) zeolite 1/4; (b) zeolite 1/2; (c) zeolite 3/4; (d) bfa 1/4; (e) bfa 1/2; (f) bfa 3/4.

to identify the effect of the increase of the amount of the immobilization media, a statistical approach was used. according to the correlation coefficient results for the experimental data of the zeolite-immobilized digester, most of the absolute values of the correlation coefficient were not greater than its critical value. the critical value set in this work was 0.951, obtained from a level of significance of 0.05 and a degree of freedom of 2. therefore, statistically, the addition of more natural zeolite to the digester did not have any correlation with the digester performance (see tables 7 and 8). however, the correlation coefficients greater than zero still showed that there actually was a correlation, though a weak one.
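as an example of this test, eq. (7) applied to the fitted µm2 of the bfa runs (table 5) against the media ratio reproduces the r = 0.96 of table 8 (a python sketch; np.corrcoef would give the same number):

import numpy as np

def pearson_r(x, y):
    """pearson correlation coefficient, eq. (7)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    num = n * np.sum(x * y) - np.sum(x) * np.sum(y)
    den = np.sqrt((n * np.sum(x**2) - np.sum(x)**2)
                  * (n * np.sum(y**2) - np.sum(y)**2))
    return num / den

ratio = [0.0, 43.0, 110.0, 164.0]           # g bfa / g scod
mu_m2 = [0.1222, 0.2746, 0.3655, 0.6695]    # from table 5
r = pearson_r(ratio, mu_m2)                 # ~0.96
print(abs(r) > 0.951)                       # True -> correlated at alpha = 0.05, dof = 2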
both the zeolite-immobilized and the bfa-immobilized digesters had the ksx1 and ksx2 values decreasing with the increase of the amount of the immobilization media (see tables 4 and 5). this decrease meant that the cells/microbes could attach more easily, because ks is the parameter representing the affinity of the microbes to the solid substrate [34]. a large ks usually indicates a low affinity and vice versa for a small ks [34]. because of the lower values of ksx1 and ksx2, the cells/microbes preferred the bfa to the natural zeolite for the attachment. however, the cells/microbes attached to a solid substrate have been found to grow at a much slower rate than the unattached ones, due to their lack of a direct access to food [35]. when more cells/microbes attach to the solid substrate, the overall growth rate of the cells is slower. thus, the growth rates of the acidogenic and the methanogenic cells were better when the digester used the natural zeolite as the immobilization media, since the number of the cells/microbes attached to the zeolite was quite low. this is revealed by the values of µm1 and µm2, which are greater when using the natural zeolite inside the digester (see tables 4 and 5).

figure 6. matlab calculation result for the methanogenic cell (x2) inside the digester using the natural zeolite and the bfa as the immobilization media: (a) zeolite 1/4; (b) zeolite 1/2; (c) zeolite 3/4; (d) bfa 1/4; (e) bfa 1/2; (f) bfa 3/4.

the values of yx1/cod and yx2/vfa were also greater when using the natural zeolite inside the digester, due to the ability of the cells/microbes to reach the food (in the form of the scod and vfa). other kinetic constants related to the aforementioned explanation are kd1 and kd2 (see tables 4 and 5): the more immobilization media added, the greater the value of both constants. this means that the addition of more immobilization media caused more cells/microbes to attach to it, leading to more cells/microbes dying afterwards due to the lack of food. this result stood in line with the previous work by wang et al. [35]. as a consequence of the better growth of the cells/microbes when using the zeolite, the rate of the methane production was more stable in the zeolite-immobilized digester. the yield of the methane produced per unit increase of the methanogenic cells (ych4/x2) was still greater, even though the methanogenic cell growth rate was also greater in the zeolite-immobilized digester.

table 7. pearson correlation coefficients of the kinetics constants of the contois-contois scenario (second scenario) for the natural zeolite as the immobilization media.
constant    correlation coefficient (r)    indication
µm1                  0.617                 no correlation
µm2                 −0.046                 no correlation
ksx1                −0.899                 no correlation
ksx2                −0.935                 no correlation
yx1/cod              0.851                 no correlation
yx2/vfa              0.732                 no correlation
ych4/x2              0.266                 no correlation
yvfa/x1             −0.447                 no correlation
kd1                  0.885                 no correlation
c1                   0                     no correlation
kd2                  0.746                 no correlation
c2                   0                     no correlation

table 8. pearson correlation coefficients of the kinetics constants of the contois-contois scenario (second scenario) for the bfa as the immobilization media.
constant    correlation coefficient (r)    indication
µm1                  0.944                 no correlation
µm2                  0.96                  correlated
ksx1                −0.868                 no correlation
ksx2                −0.205                 no correlation
yx1/cod              0.831                 no correlation
yx2/vfa              0.865                 no correlation
ych4/x2             −0.375                 no correlation
yvfa/x1             −0.913                 no correlation
kd1                  0.871                 no correlation
c1                   0.833                 no correlation
kd2                  0.119                 no correlation
c2                   0.834                 no correlation

for the order of the death-rate term: according to shuler and kargi [9], the death rate equation is of the first order in the cell concentration. this work tried to prove the validity of this order. the numerical calculation using matlab showed that the death rates of the acidogenic and the methanogenic cells had order values almost identical to the first order. thus, its validity was verified.
4. conclusions
the addition of the zeolite and the bfa as the immobilization media in the anaerobic digester showed different digester behaviours. in the bfa-immobilized digester, the hydrolysis seemed to take part at the beginning of the operation, therefore causing the scod and the cvfa to increase at first. in contrast, the zeolite-immobilized digester had only the acidogenesis and the methanogenesis as the limiting processes, causing the scod and the cvfa to decrease from the start. besides, the methane purity and the cumulative methane volume of the zeolite-immobilized digester were greater than those of the bfa-immobilized digester. kinetically, the anaerobic leachate digestion using the natural zeolite and the bfa as the immobilization media followed the contois model both for the acidogenesis and for the methanogenesis step. a statistical analysis showed that a higher ratio of the immobilization media did positively affect some kinetics parameters.
it indicated that the natural zeolite is plausible to be further studied as a potential immobilization medium for anaerobic digestion purposes.

acknowledgements
the study was conducted under the clean project financially supported by the usaid peer-science research grant [nas sub-grant award letter agreement number 2000004934 and sponsor grant award number aid-oaaa-11-00012]. the authors also express the highest appreciation to the office of civil work and energy/mineral resources of yogyakarta and the bureau of waste treatment, infrastructure, and municipal water supply of the government of the d.i. yogyakarta province.

list of symbols
µm1  maximum specific growth rate of acidogenic cell (day⁻¹)
µm2  maximum specific growth rate of methanogenic cell (day⁻¹)
ksx1  half-saturation constant associated with scod (mg scod/mg acidogenic cell)
ksx2  half-saturation constant associated with vfa (mg vfa/mg methanogenic cell)
yx1/cod  yield of cell formation per mg scod reduction (mg acidogenic cell/mg scod)
yx2/vfa  yield of cell formation per mg vfa reduction (mg methanogenic cell/mg vfa)
ych4/x2  yield of ch4 formation per mg methanogenic cell/l increase ((mg ch4/l)/(mg methanogenic cell/l))
yvfa/x1  yield of vfa formation per mg acidogenic cell (mg vfa/mg acidogenic cell)
ki  inhibition constant associated with vfa (mg vfa/l)
kd1  death rate constant of acidogenic cell
kd2  death rate constant of methanogenic cell
c1  order of acidogenic cell death rate equation
c2  order of methanogenic cell death rate equation

references
[1] el-salam, m. m. a., abu-zuid, g. i., 2015. impact of landfill leachate on the groundwater quality: a case study in egypt. journal of advanced research 6(4), 579–586. doi:10.1016/j.jare.2014.02.003.
[2] kettunen, r. h., hoilijoki, t. h., rintala, j. a., 2009. anaerobic and sequential anaerobic–aerobic treatments of municipal landfill leachate at low temperatures. bioresource technol. 58, 40–41.
[3] silva, a. c., dezotti, m., sant'anna jr., g. l., 2004. treatment and detoxification of a sanitary landfill leachate. chemosphere 55(2), 207–214. doi:10.1016/j.chemosphere.2003.10.013.
[4] im, j. h., woo, h. j., choi, m. w., han, k. b., kim, c. w., 2001. simultaneous organic and nitrogen removal from municipal landfill leachate using an anaerobic–aerobic system. water res. 35, 2403–2410. doi:10.1016/s0043-1354(00)00519-4.
[5] wiratni, w., subandiyono, 2009. enhancement of methane formation in biogas production by addition of landfill leachate. proceeding of the regional conference on chemical engineering, de la salle university, manila.
[6] santosa, n. b., 2014. pemanfaatan lng sebagai sumber energi di indonesia [utilization of lng as an energy source in indonesia]. jurnal rekayasa proses 8(1), 33–39.
[7] deendarlianto, d., wiratni, w., tontowi, a., indarto, i., iriawan, a., 2015. the implementation of a developed microbubble generator on the aerobic wastewater treatment. international journal of technology 6(6), 924–930. doi:10.14716/ijtech.v6i6.1696.
[8] budhijanto, w., deendarlianto, d., kristiyani, h., satriawan, d., 2015. enhancement of aerobic wastewater treatment by the application of attached growth microorganisms and microbubble generator. international journal of technology 6(7), 1101–1109. doi:10.14716/ijtech.v6i7.1240.
[9] shuler, m., kargi, f., 2002. bioprocess engineering basic concepts, second ed. prentice hall, new jersey.
[10] mshandete, a. m., björnsson, l., kivais, a. k., rubindamayugi, m. s. t., mattiasson, b., 2008.
performance of biofilm carriers in anaerobic digestion of sisal leaf waste leachate. electronic journal of biotechnology, 11 (1), 1-9. doi:10.2225/vol11-issue1-fulltext-7
[11] kourkoutas, y., xolias, v., kallis, m., bezirtzoglou, e., kanellaki, m., 2005. lactobacillus casei cell immobilization on fruit pieces for probiotic additive, fermented milk and lactic acid production. process biochem. 40, 411-416.
[12] wirawan, s.k., sudibyo, h., setiaji, m.f., warmada, i.w., wahyuni, e.t., 2015. development of natural zeolites adsorbent: chemical analysis and preliminary tpd adsorption study. journal of engineering science and technology, special issue 4 on somche 2014 & rsce 2014 conference, 87-95.
[13] rainey, t.j., 2009. a study of the permeability and compressibility properties of bagasse pulp. brisbane, australia: queensland university of technology.
[14] mathew, s., abraham, t.e., 2005. studies on the production of feruloyl esterase from cereal brans and sugar cane bagasse by microbial fermentation. enzyme microb. tech. 36 (4), 565-570. doi:10.1016/j.enzmictec.2004.12.003
[15] carvalho, w., santos, j.c., canilha, l., silva, s.s., perego, p., converti, a., 2005. xylitol production from sugarcane bagasse hydrolysate: metabolic behaviour of candida guilliermondii cells entrapped in ca-alginate. biochem. eng. j. 25 (1), 25-31. doi:10.1016/j.bej.2005.03.006
[16] santos, j.c., carvalho, w., silva, s.s., converti, a., 2003. xylitol production from sugarcane bagasse hydrolyzate in fluidized bed reactor. effect of air flowrate. biotechnology progress, 19 (4), 1210-1215. doi:10.1021/bp034042d
[17] sene, l., converti, a., felipe, m.g.a., zilli, m., 2002. sugarcane bagasse as alternative packing material for biofiltration of benzene polluted gaseous streams: a preliminary study. bioresource technol. 83 (2), 153-157. doi:10.1016/s0960-8524(01)00192-4
[18] nelson, m., sidhu, h., 2007. reducing the emission of pollutants in food processing wastewaters. chem. eng. proc. process intensification, 46 (5), 429-436. doi:10.1016/j.cep.2006.04.012
[19] hussain, a., dubey, s.k., kumar, v., 2015. kinetic study for aerobic treatment of phenolic wastewater. water resources and industry, 11, 81-90. doi:10.1016/j.wri.2015.05.002
[20] boothe, d.d.h., smith, m.c., gattie, d.k., das, k.c., 2001. characterization of microbial populations in landfill leachate and bulk samples during aerobic bioreduction. adv. environ. res. 5, 285-294. doi:10.1016/s1093-0191(00)00063-0
[21] american public health association (apha), 2005. standard methods for the examination of water and wastewater, twentieth ed. american public health association, new york.
[22] fabregas, j., herrero, c., cabezas, b., abalde, j., 1986. biomass production and biochemical composition in mass cultures of the marine microalga isochrysis galbana parke at varying nutrient concentrations. aquaculture, 53 (2), 101-113. doi:10.1016/0044-8486(86)90280-2
[23] wirth, r., kovács, e., maróti, g., bagi, z., rákhely, g., kovács, k.l., 2012. characterization of a biogas-producing microbial community by short-read next generation dna sequencing. biotechnol. biofuels, 5, 41. doi:10.1186/1754-6834-5-41
[24] american public health association (apha), 1984. compendium of methods for the microbiological examination of foods, second ed. american public health association, washington, d.c.
[25] walker, m., zhang, y., heaven, s., banks, c., 2009. potential errors in the quantitative evaluation of biogas production in anaerobic digestion processes. bioresource technol.
100, 6339-6346. doi:10.1016/j.biortech.2009.07.018
[26] budiyono, syaichurrozi, i., sumardiono, s., 2014. effect of total solid content to biogas production rate from vinasse. ije transactions b: applications, 27 (2), 177-184.
[27] tomei, l., altamura, s., bartholomew, l., bisbocci, m., bailey, c., bosserman, m., cellucci, a., forte, e., incitti, i., orsatti, l., koch, u., 2004. characterization of the inhibition of hepatitis c virus rna replication by nonnucleosides. j. virol. 78 (2), 938-946. doi:10.1128/jvi.78.2.938-946.2004
[28] grady jr., c.p.l., daigger, g.t., love, n.g., filipe, c.d.m., 2011. biological wastewater treatment. crc press taylor & francis group: boca raton, florida, p. 296.
[29] geed, s.r., kureel, m.k., giri, b.s., singh, r.s., rai, b.n., 2017. performance evaluation of malathion biodegradation in batch and continuous packed bed bioreactor (pbbr). bioresour. technol. 227, 56-65. doi:10.1016/j.biortech.2016.12.020
[30] fedailaine, m., moussi, k., khitous, m., abada, s., saber, m., tirichine, n., 2015. modeling of the anaerobic digestion of organic waste for biogas production. procedia comput. sci. 52, 730-737. doi:10.1016/j.procs.2015.05.086
[31] stucki, m., jungbluth, n., leuenberger, m., 2011. life cycle assessment of biogas production from different substrates. final report, federal department of environment, transport, energy and communications, federal office of energy, bern.
[32] achinas, s., achinas, v., euverink, g.j.w., 2017. technological overview of biogas production from biowaste. engineering, 3 (3), 299-307. doi:10.1016/j.eng.2017.03.002
[33] chiu, s.f., chiu, j.y., kuo, w.c., 2013. biological stoichiometric analysis of nutrition and ammonia toxicity in thermophilic anaerobic co-digestion of organic substrates under different organic loading rates. renew. energ. 57, 323-329. doi:10.1016/j.renene.2013.01.054
[34] liu, y., 2006. a simple thermodynamic approach for derivation of a general monod equation for microbial growth. biochem. eng. j. 31, 102-105. doi:10.1016/j.bej.2006.05.022
[35] wang, z.w., hamilton-brehm, s.d., lochner, a., elkins, j.g., morrell-falvey, j.l., 2011. mathematical modeling of hydrolysate diffusion and utilization in cellulolytic biofilms of the extreme thermophile caldicellulosiruptor obsidiansis. bioresour. technol. 102, 3155-3162.
doi:10.1016/j.biortech.2010.10.104

acta polytechnica 58(5):279-284, 2018, doi:10.14311/ap.2018.58.0279
© czech technical university in prague, 2018, available online at http://ojs.cvut.cz/ojs/index.php/ap

effects on the final intensity of input forces in longbolts installed at the mining operation 2 area, okd, inc.

pavel dvořák^a,∗, eva jiránková^b
a minova bohemia s.r.o., lihovarská 1199/10, 716 00 ostrava - radvanice, czech republic
b všb-technical university of ostrava, 17. listopadu 15/2172, 708 33 ostrava - poruba, czech republic
∗ corresponding author: pavel.dvorak@minovaglobal.com

abstract. in the deep coal mines of okd, inc., both bolts and long bolts of different designs are used for reinforcing the rock massif and the steel arch support. continuous measurement of forces in 6 strand bolts and 1 cable bolt (long bolts, generally) was carried out during the trial operation of the modified room and pillar mining method at mining operation 2, site north, okd, inc. hydraulic dynamometers were installed on these long bolts, and the forces were monitored throughout the life-time of the mining panel no. v. this measurement yielded knowledge of their different load behaviour with respect to the input stress parameters. the input intensity of the force applied to the bolting elements is burdened by losses of various kinds. the subject of this article is a description and analysis of the intensity of the initial stressing force applied to individual long bolts (with a threaded clamping bush or a wedge barrel) and a quantification of the short-term stress losses, with a description and analysis of these.

keywords: bolting; mining; geotechnics; longbolts; stressing; stressing losses; mine working.

1. introduction
for the excavation of coal seams in the ostrava-karviná district, longwall mining is the exclusively used mining method [4]. longwall mining can negatively affect the rock massif and the mining works therein. it is, therefore, necessary to create protective shaft and stone drift pillars systematically, so that these mining effects do not cause damage in the form of unacceptable deformations of galleries or shafts. protective pillars, however, bind large coal reserves, and their subsequent mining is, for a number of reasons, difficult.
for the possible removal of these negative phenomena and for the possible mining of remnant pillars, it was decided to prepare the trial operation of the modified room and pillar mining method in the protective shaft pillar of the mining plant 2, site north, okd, inc. the purpose of the trial operation was to prove the applicability of this mining method under the conditions of the deep coal mines of the ostrava-karviná district. the extent of the mining is shown in figure 1.

[figure 1: the subject panel no. v as a part of the trial operation of the modified room and pillar mining method]

this method [13] is characterized by forming retained stable coal pillars, with galleries driven exclusively in rockbolt support. it is precisely the absence of caving of the mined-out areas that is the most important factor for the protection of the surface, surface objects and main mining workings. the trial operation started in may 2014 by excavating the panel no. v [4] in the seam no. 30 in the lower suchá bed of the karviná formation. the seam no. 30 was picked mainly because, based on the regional prognosis and in accordance with decree no. 659/2004 coll. [15], there is no coal mine burst hazard. the seam is classified as a seam without a risk of a coal mine burst event, and therefore does not require any active coal mine burst prevention measures. the seam thickness varies from 2.5 meters up to 5 meters at the point of the connection of the three coal seams no. 30, 31 and 32, with the mining depth of about 850 meters below the surface. the immediate overburden of the 30th seam is formed by a layer of siltstone, followed by a thick bench of medium-grained sandstone. in the subsequent overburden, up to the seam no. 29 sp. l. (648), benches of sandstones and siltstones alternate. the underlying bed of the coal seam no. 30 is formed by the coal seams no. 31 and no. 32, along with sandstone benches alternating with siltstone benches. the trial operation was terminated in october 2017 after the finishing of the panel no. ii.

the primary objective and function of the installed bolting elements (rock bolts, long bolts) is not only the transfer of shear and tensile loads acting in the massif, but also the ability to bring a certain degree of stress into the massif and thereby positively affect it (closing microcracks, increasing the shear strength). however, it remains a question how large the forces actually acting on the bolting element are, in combination with the torque of a drilling machine or device, what the intensity of these forces will be during the process, and what factors affect this development of internal forces in the elements.

2. support systems and goals of geotechnical monitoring
for the purpose of the actual operation and, above all, the long-term stability of a mine supported by long bolts, it was decided to exclusively use resin rebar bolts fully encapsulated along their entire length, where the resin forms not only the bond between the rockbolt and the rock, but also the required protection of the bolt from corrosion [5]. based on the calculation according to the technical director's standard no. 2/2012 of okd, a.s.
[9], it was decided to use the apb-1k rebar bolts in the project, fully encapsulated along their entire length with polyester adhesive ampoules lokset and completed with a steel mesh and square-shaped washers of a thickness of 8 mm and dimensions 150 mm × 150 mm. the same rebar bolts were used for the rib reinforcement, with the exception of future roadway crossings, where fibreglass bolts were used, mostly the fib type, or the rockbolt k60-25 power system with washers and nuts in areas with a high horizontal stress. the specific bolt pattern, the number of bolts, their length and spacing are determined by the geological conditions at the site. there are three variants - a bolting support system for normal conditions, worsened conditions and bad conditions. these variants differ in the number of bolts, their location, the time of installation, and the possible addition of a 6 meter long ir-6 strand bolt. in some critical situations, e.g., thrust crossings, longer longbolts were used, but this is beyond the scope of this paper. there were two types of longbolts used: the vast majority were strand bolts ir-6, and the rest were cable bolts mca-m. both types were 6 meters long and bonded at the root with two pieces of lokset hs slow 24/800 ampoules (the expected length of the resin bond was 2 metres). the bolts were installed regularly in roadway crossings with a wide span, directly in the roof, and alternatively in bad geological conditions.

table 1. features of the steel bolts used [6-8].

bolting element    diameter/cross-sectional area [mm/mm2]    maximum load capacity [kn]    length of used bolting element [m]
rockbolt apb-1k    21.7/370       285    2.4 or 2.8
strandbolt ir-6    27/308         450    mostly 6
cablebolt mca-m    21.8/312.9     545    mostly 6

[figure 2: cutting part of the mining map of the mining panel no. v with the dynamometers shown]

the monitoring of the overburden was carried out by tell-tales, strain gauge rock bolts and 3-dimensional ccbm probes [13]. additional measurements of convergence were done at selected locations, and hydraulic dynamometers were installed on several long bolts. dual- or triple-height tell-tales were commonly used. the tell-tales were set to height levels of 2.2 and 5.2 metres, or 2.2, 5.2 and 7 metres, respectively. these heights correspond to the anchoring technique used: 2.2 metres corresponds to the range of the apb-1k rockbolts, 5.2 metres corresponds to the width of the corridor, and 7 metres is set to determine the dynamics of the movements in the higher overburden, should a longer bolting element be used.

3. stress input and its intensity
for the anchor element to cooperate properly with the massif, it must be activated. the activation of the element is important for ensuring the proper contact between all involved elements (nut, plate, bolt, massif) and for the input of the stabilizing force into the massif.

[figure 3: relationship between torque and force in the bolting element, for apb-1k (m 24×3) and ir-6 (m 42×2) [3]]

this is done in two ways, depending on the design of the bolting element. if the element is provided with a thread or a threaded clamp bushing, the activation is performed by tightening the nut on the thread. for elements with a wedge barrel, stressing jacks or other hydraulic devices are used.
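the torque-force relationship of figure 3 can be approximated with the common short-form preload formula f = t/(k·d), where the nut factor k lumps the thread and face friction discussed in the next section. the k values below are illustrative assumptions, not values measured in [3].

# minimal sketch: estimating axial preload from installation torque, F = T / (K * d)
def preload_kn(torque_nm: float, nut_factor: float, thread_dia_m: float) -> float:
    """short-form bolt preload; nut_factor lumps thread and washer-face friction."""
    return torque_nm / (nut_factor * thread_dia_m) / 1e3

# hypothetical nut factors: a clean lubricated thread vs a thread clogged with coal dust
for k in (0.15, 0.35):
    f = preload_kn(torque_nm=300.0, nut_factor=k, thread_dia_m=0.024)  # apb-1k m24 thread
    print(f"K = {k:.2f}: F ≈ {f:.0f} kn")
# higher friction (clogged or corroded threads) sharply reduces the force for the same torque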
the stressing effect affects the rock mass in several ways:
• the compression of the layers reinforces the material of the layers by increasing the shear friction between the individual parts and the material particles;
• gripping the thin layers together by the stressing forces helps to form a compact, rigid beam that better resists shear deformations;
• stressing helps to close cracks and other discontinuities in the massif; this prevents the migration of liquids and gases, which can have a negative effect on the mechanical properties of both the massif and the support itself.

dvořák [3] investigated the intensity of the force introduced into a particular bolting element (apb-1k, ir-6) with a threaded part by means of the torque of the drilling and installation equipment. the greatest effect on the achieved results was the friction between the individual elements (i.e., the friction between the nut, the washer and the threaded rod). it is evident that the magnitude of the force generated in the element is proportional to the magnitude of friction in the thread and between the individual elements. in underground conditions, threads are often heavily clogged with coal dust or corroded as a result of mineralized mine water. this can result in a significant reduction of the input stress.

the activation of the bolting elements with the wedge barrel is most often done by means of hydraulic stressing jacks of various designs. the jack is attached to the free end of the cable behind the barrel, the inner clamping jaws are locked on the cable, and the simultaneous combination of pulling the cable out of the borehole and pressing the barrel against the rock produces the stress. after releasing the jack, the barrel, owing to its function, preserves the stress in the cablebolt and, thus, in the massif. the intensity of the input stress is determined by the ability of the jack to develop the tensile force. here, however, the rule is: the higher the input stress, the heavier and bigger the stressing jack must be, e.g., pp-17 (maximum stressing force 170 kn, weight 13 kg, minova), zpe 23 (230 kn, 23 kg, vsl) or ycw240qx (240 kn, 20.5 kg, ovm). in geotechnical or road constructions, the equipment used for stressing multi-strand anchors can weigh several tons and develop forces in the order of dozens of meganewtons [10, 12].

the depth reach and effect of the induced stress in the rock massif can be simulated in several ways. numerical modeling or analytical approaches, such as steinbrenner's and boussinesq's, can be used. the results of these models [16] show that the relevant depth reach and the resulting influence of the force is only in the immediate vicinity of its location. with the distance from the perpendicular projection of the point of action and in the direction of the depth, the influence of the force decreases rapidly.
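as an illustration of the rapid decay noted above, the classical boussinesq point-load solution for an elastic half-space can be evaluated directly; this is a generic textbook formula, not the specific model of [16], and the 150 kn load below is simply the initial stressing force quoted later for the mca-m cablebolt.

import math

def boussinesq_sigma_z(p_n: float, r_m: float, z_m: float) -> float:
    """vertical stress [pa] at radial offset r and depth z below a surface point load p."""
    R2 = r_m**2 + z_m**2
    return 3.0 * p_n * z_m**3 / (2.0 * math.pi * R2**2.5)

p = 150e3  # n, an input stressing force of 150 kn
for z in (0.1, 0.5, 1.0, 2.0):
    print(f"z = {z:3.1f} m: sigma_z = {boussinesq_sigma_z(p, 0.0, z)/1e3:8.1f} kpa")
# on the load axis the stress falls off with 1/z**2, i.e. the influence is confined
# to the immediate vicinity of the washer, as the models cited in [16] indicate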
4. development and effects on the resultant intensity of the stress
the measurements in the mining panel no. v in the area of the trial operation of the modified room and pillar mining method yielded the intensity of the input stressing force in the 6 ir-6 strandbolts and 1 mca-m cablebolt of a 6 m length, fitted with glötzl kn500 a5 dynamometers (figure 2). the ir-6 anchors were installed and activated with an air-operated, telescopic-leg type drilling and bolting super turbo bolter machine. the mca-m cablebolts were stressed by the hydraulic jack pp-17 to the initial intensity of 150 kn immediately after the installation. this intensity decreased to 70 kn immediately after releasing the jack.

table 2. initial forces in bolts 6 m long [3].

mark    type     real initial force [kn]
vd 1    ir-6     30
vd 2    ir-6     30
vd 3    ir-6     45
vd 4    ir-6     25
vd 5    ir-6     25
vd 6    ir-6     10
vd 7    mca-m    70

there is a relatively large variation in the intensity of the input stress of the ir-6 strand bolts. this is due to several factors: the varying inlet air pressure at the individual workplaces (2.5-5.5 bar), i.e. a lower or higher input torque, machine wear, thread clogging, and the friction between the elements and the tightening nut. a completely different situation occurred at the anchor vd-7 (cable bolt mca-m with an anchor sleeve), where the initial stressing force intensity immediately dropped from 150 to 70 kn after the release of the stressing jack. this was mainly due to the different construction of the cablebolt head (i.e., the clamp bushing with the nut (ir-6) as opposed to the mca-m wedge barrel).

generally, losses of stress in structures are treated in eurocode 2: design of concrete structures [2, 14]. this standard divides short-term prestressing losses, based on their origin, into losses caused by:
• friction (curvature and wobble) - the loss incurred due to the friction of the cable on the wall of the injection duct, depending on their curvature;
• anchorage losses - losses due to a wedge draw-in of the anchorage devices during the operation of the anchoring after the stressing;
• elastic shortening of concrete.

in the following, the relevant types of losses will be discussed in more detail; in this case, the anchorage loss and the elastic deformation of the base material. because the borehole for a long bolt is straight, it is not necessary to evaluate the friction loss.

4.1. anchorage loss
as already mentioned above, an anchorage loss is a loss due to a wedge draw-in back into the barrel after releasing the stressing jack. to reach the final intensity of stress, the wedges must be released from the barrel by the pressure of the jack, thereby creating a gap between the barrel and the wedges. the size of this gap is critical for the resulting value of the anchorage loss (figure 4). some degree of anchorage loss can also occur with coarse threaded rods, where the movement of the nut along the thread can reach an order of tenths of millimetres. for fine metric threads, the nut movement is completely negligible. the anchorage loss [2] is given by the equation

$\Delta\sigma_{pw} = -\frac{w E_p}{L_p} = -\frac{(x_1 - x_2) E_p}{L_p},$  (1)

where $w$ is the anchorage wedge draw-in, $E_p$ is the modulus of elasticity of the cablebolt material, $L_p$ is the cablebolt free length, and $x_1$, $x_2$ are the wedge positions during the stressing and after releasing the stressing jack (figure 4).

[figure 4: wedge draw-in mechanism]
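equation (1), evaluated with the values quoted in the next paragraph (4 m free length, 3 mm average wedge draw-in, 195 gpa modulus and the mca-m cross-section from table 1), reproduces the loss figures of table 4; the sketch below simply carries out this evaluation.

# evaluating the anchorage loss of eq. (1) with the values quoted in the text
w   = 3e-3      # m, measured average wedge draw-in
e_p = 195e9     # pa, young's modulus of the cablebolt (spring steel)
l_p = 4.0       # m, free length (6 m bolt minus the 2 m resin bond)
a_p = 312.9e-6  # m2, mca-m cross-sectional area (table 1)

d_sigma = -w * e_p / l_p            # stress loss, pa
d_force = d_sigma * a_p             # force loss, n
print(f"stress loss = {d_sigma/1e6:8.2f} mpa")   # -146.25 mpa
print(f"force loss  = {d_force/1e3:8.2f} kn")    # approx. -45.76 kn, as in table 4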
for strand bolts, due to their fine metric thread m 42 × 2, the anchorage loss can be ignored. 4.2. loss of stress due to elastic deformation of the surrounding rock stressing leads to compression of the surrounding rock mass. the compression rate corresponds to the rate of the loss. the calculation includes rheological properties of the rock mass. the loss is thus dependent on the instant condition of the rock and its alteration level depending on the amount of time since the berock δ α claystone 0.0157 0.89338 siltstone 0.00947 0.87255 medium-grained sandstone 0.00713 0.63503 coarse-grained sandstone 0.04017 0.57291 table 3. rheological coefficients for individual materials [1]. mca-m ir-6 [kn] [kn] initial stressing force 150 aver. 30.00 anchorage loss −45.76 ≈ 0 elastic deformation loss −32.45 −9.34 total loss −78.21 −9.34 remaining force 71.79 20.66 table 4. total quantification of losses in longbolts. ginning of the road-heading works. the loss ∆pe is calculated according to the following formula [2]: ψ = apep aheh , (2) where σp is the residual stress in the element after deducting the present losses ∑ ∆p from the initial intensity of the stress p0, ap is the cross sectional area of the element, ep is the modulus of elasticity of the element, ah is the surface area of the washer, eh is the modulus of elasticity of the rock mass: ∆pe = σpψ 1 + ψ , (3) σp = p0 − ∑ ∆p. (4) rock properties can be considered as varying in time, a rock mass may degrade over time due to weathering activities of wind, water, etc. the formula for the auxiliary constant ψ is: ψ = apep aheh(t) . (5) the modulus of elasticity of rock mass in time [1] φ = δt1−α 1 − α , (6) eh(t) = e 1 + φ , (7) ∆σpw = −wep lp = −(x1 − x2)ep lp . (8) when we put values for siltstone (estimated value for e = 6000 gpa and the area of the washer 150 mm × 150 mm (22500 mm2)) into previous equations (6)–(8) and due to the minimal time interval of the excavation from the installation of the longbolts, we do not use the time-dependent parameter of the elastic modulus of the siltstone. we then get the total loss of the cable bolt mca-m and stran dbolt ir-6, which is shown in table 4. 283 pavel dvořák, eva jiránková acta polytechnica 5. conclusions in the protective shaft pillar of mining operation 2, site north, okd, inc. from may 2014 until october 2017, the trial operation of the modified mining method room and pillar was carried out. the principle of this mining method is to leave stable coal pillars, which will be the permanent support of the overburden. in order to avoid a massive stratification of the overburden, a system of reinforcement of mining works was designed using a rock bolt reinforcement and various types of long bolts (cable bolts, strand bolts). the input intensity of the forces applied to the bolting elements are subjected to losses of various kind. on the basis of the measurements made during the trial operation of the modified room and pillar method, the relationship between the intensity of the theoretical and the actual input force in the bolting element was determined. it can be said that on the basis of the in situ measurements and the calculated values, the difference in the input stressing force intensity is obvious. this difference is mainly due to the design of the longbolts and the intensity of the input stressing force. the cablebolts are significantly affected by the anchorage loss depending on the movement of wedges in the barrel body during the stressing. 
5. conclusions
in the protective shaft pillar of mining operation 2, site north, okd, inc., the trial operation of the modified room and pillar mining method was carried out from may 2014 until october 2017. the principle of this mining method is to leave stable coal pillars, which will be the permanent support of the overburden. in order to avoid a massive stratification of the overburden, a system of reinforcement of the mining works was designed, using rock bolt reinforcement and various types of long bolts (cable bolts, strand bolts). the input intensity of the forces applied to the bolting elements is subject to losses of various kinds. on the basis of the measurements made during the trial operation of the modified room and pillar method, the relationship between the intensity of the theoretical and the actual input force in the bolting element was determined. it can be said that, on the basis of the in situ measurements and the calculated values, the difference in the input stressing force intensity is obvious. this difference is mainly due to the design of the longbolts and the intensity of the input stressing force. the cablebolts are significantly affected by the anchorage loss, depending on the movement of the wedges in the barrel body during the stressing. on the contrary, the ir-6 strand bolts do not suffer from this, because of the tightness of the metric thread on the clamp bushing. losses caused by the deformation of the base material (rock mass) are proportional to the applied stress; these losses are therefore lower for the strandbolts. considering the calculated elastic deformation loss at the strandbolt, it can be assumed that the input stress intensity was still about 10 kn higher than the intensity measured by the dynamometer. in any case, the measurements have shown that short-term losses have a significant effect on the intensity of the input force applied to the element and should be considered in the projects. although the practical use of cable bolts is quite common, there is not much research on the relationship between the theoretical and the real input force. the results obtained are of practical significance not only for the design of the reinforcement of mine workings, but also wherever the stressing of bolting elements is considered (i.e., in geotechnics, civil and transportation engineering and other branches of human activity).

references
[1] aldorf, j., 1999. mechanika podzemních konstrukcí (mechanics of underground structures). ostrava: nakladatelství všb-tu ostrava, 410 p. (in czech).
[2] british standards institution, 2008. eurocode 2: design of concrete structures: british standard. london: bsi.
[3] dvořák, p., 2017. activation of threaded bolting elements. journal of fundamental and applied sciences, 9, no. 1s, 1-11.
[4] jiránková, e., mučková, j., jadviščok, p., vochta, r., molčák, v., havlicová, m., 2016. geodetic monitoring of the surface within the trial operation of the room and pillar mining method in the karviná part of the ostrava-karviná coal district. international journal of clean coal and energy, 5, no. 2, 37-44. doi:10.4236/ijcce.2016.52004
[5] meikle, t., tadolini, s.c., sainsbury, b., bolton, j., 2016. laboratory and field testing of bolting systems subjected to highly corrosive environments. international journal of mining science and technology, no. 27, 101-106.
[6] minova, 2012. resin bolts type app, apg, apb. available on: http://www.arnall.com.pl/app-apg-apb
[7] minova, 2014. pramencové svorníky typu ir-6 (strandbolts ir-6). minova bohemia s.r.o. (in czech).
[8] minova, 2012. lanový svorník mca-m (cablebolt mca-m). minova bohemia s.r.o., 2014. (in czech).
[9] okd, a.s., 2012. technický standard č. 2/2012 technického ředitele okd, a.s. - dimenzování samostatné svorníkové výztuže dlouhých důlních děl obdélníkového a lichoběžníkového průřezu v podmínkách okd (technical standard no. 2/2012 of the technical director of okd, a.s. - dimensioning of rockbolt support of long mine excavations with rectangular and trapezoidal cross-sections in the conditions of okd, inc.). okd, a.s. (in czech).
[10] ovm, 2010. ovm prestressing systems. available on: http://www.tensindo-ovm.com
[11] rataj, m., 2008. development of hi-ten bolt in australian coal mines. [online]. available on: http://www.dsiminingproducts.com
[12] vsl international ltd, 2002. multistrand post-tensioning. available on: http://www.vsl.net
[13] waclawik, p., šňupárek, r., kukutsch, r., 2017. rock bolting at the room and pillar method at great depths. procedia engineering, 191, 575-582. doi:10.1016/j.proeng.2017.05.220
[14] zhou, z., he, j., chen, g., ou, j., 2009. a smart steel strand for the evaluation of prestress loss distribution in post-tensioned concrete structures. journal of intelligent material systems and structures, 20, no. 6, 1901-1912.
doi:10.1177/1045389x09347021
[15] the czech republic. the czech mining authority in prague. regulation (decree) no. 659/2004 on safety and health protection at work and operation safety in mines with the rockburst risk. (in czech).
[16] hulla, j., turček, p., 2005. zakládání staveb (foundation of buildings). bratislava: jaga group, v.o.s., 350 p. (in czech).

acta polytechnica 57(5):367-372, 2017, doi:10.14311/ap.2017.57.0367
© czech technical university in prague, 2017, available online at http://ojs.cvut.cz/ojs/index.php/ap

relation between left ventricular unloading during ecmo and drainage catheter size assessed by mathematical modeling

svitlana strunina^a,∗, jiri hozman^a, petr ostadal^b
a faculty of biomedical engineering, czech technical university in prague, nám. sítná 3105, 272 01 kladno, czech republic
b cardiovascular center, na homolce hospital, roentgenova 2/37, 150 30 prague, czech republic
∗ corresponding author: svitlana.strunina@fbmi.cvut.cz

abstract. the flow-dependent left ventricle overload is a well-known complication of the veno-arterial extracorporeal membrane oxygenation in a severe cardiogenic shock, which leads to a distension of the left ventricle and, frequently, to a severe pulmonary edema. recently, unloading of the left ventricle using a catheter inserted into the left ventricle and connected to the extracorporeal membrane oxygenation circuit has been proposed. a computational method was used to simulate the blood flow in the extracorporeal membrane oxygenation system with a drainage catheter incorporated into the left ventricle and connected to the inflow part of the extracorporeal membrane oxygenation circuit by a y-shaped connector. the whole system was modelled in the modelica modelling language. the impact of various catheter sizes (from 5 fr to 10 fr) and extracorporeal blood flow values (from 1 l/min to 5 l/min) was investigated. in our simulation model, the extracorporeal blood flow only modestly affected the volume that was withdrawn from the left ventricle by the catheter. conversely, the size of the drainage catheter was the principal factor responsible for achieving an adequate left ventricle decompression. a 10 fr drainage catheter, inserted into the left ventricle and connected to the venous part of the ecmo system, presents a promising solution for unloading the left ventricle during an extracorporeal membrane oxygenation.

keywords: catheter; decompression; extracorporeal membrane oxygenation; mathematical modeling; modelica modeling language; overload.

1. introduction
an improvement of the heart function following a cardiac support was reported early [1]. the veno-arterial extracorporeal membrane oxygenation (va-ecmo) currently represents the most effective minimally invasive circulatory support system. the va-ecmo provides a support sufficient to enable an adequate tissue perfusion even in the case of cardiac arrest.
however, a marked increase in systemic blood pressure caused by the va-ecmo may also impair the function of the left ventricle (lv). in the presence of a severe left ventricle dysfunction, the left ventricle is unable to eject a sufficient blood volume; this leads to an increased afterload caused by the extracorporeal membrane oxygenation blood flow (ebf) and, consequently, to an impairment of the lv performance. in the extreme situation, the aortic valve remains closed even during systole. this results in the lv overload with distension, increased wall stress and increased myocardial oxygen consumption [2]. an insufficient decompression of the left ventricle during the va-ecmo is considered a major factor preventing an adequate lv recovery [3]. several methods are used for the lv decompression during the va-ecmo therapy: a surgical approach with a minimally invasive thoracotomy, percutaneous approaches via the pulmonary artery or the aortic valve, or through a septostomy, a percutaneously inserted microaxial pump, or an intraaortic balloon pump. the lv decompression during the ecmo therapy seems to be associated with a significant improvement of the lv function [4]. the left ventricle can be unloaded by an insertion of a pigtail catheter into the lv through the aortic valve, connected to the inflow line of the ecmo circuit by a y-shaped connector [2]. figure 1 shows this unloading method during the ecmo therapy by the catheter inserted in the lv.

[figure 1: unloading method during ecmo therapy by catheter - oxygenator, drainage catheter, mechanical blood pump, y-shaped connector]

there is a lack of information about the impact of the drainage catheter diameter and the extracorporeal blood flow value on the lv decompression by the catheter inserted in the lv. the main objective of this study is to assess the unloading capacities of various diameters of the drainage catheter and various ebf values. the modelica modelling language was employed to identify the association between the catheter diameter and the volume withdrawn from the left ventricle in a big animal model with cardiogenic shock during the ecmo and, consequently, to evaluate the effect of the catheter diameter on the lv unloading. modelica is an object oriented, hierarchical, equation based and acausal modeling language, in which models can be created and graphically represented from pre-prepared components or by connecting instances of classes from libraries [5, 6]. modelica is acausal, which means that the equations can be expressed declaratively, and the modelica tool determines which of the variables are dependent and which are independent based on the context upon a compilation [7]. the models are ready to simulate after simply setting the parameters. as the result of a simulation, the user can examine the changes of the variable values over time [6]. a common problem of medical research, articles, and experiments is the use of obscure units from medicine, pharmacology, biology and other non-physics disciplines. one of the advantages of the modelica environment is the support of non-si units in the parameter dialog of each component.
values are represented by si-units in the model code, but the modelica environment supports non-si units in the parameter dialog of each component. physiological units are implemented as the displayunits for each variable. using displayunits, the user can establish and observe the "physiological" values [8].

2. materials and methods
in this study, a computer model was used to assess the lv unloading capacities of various diameters of the drainage catheter inserted in the lv and of various ebf values.

2.1. computer model
the whole system was modelled in the modelica modeling language, which uses a hierarchical object-oriented modelling. the model is described using the following compartments: left ventricle, right atrium, aorta, oxygenator, pump, the tube set of the ecmo system and an integrated drainage catheter incorporated into the lv and connected to the ecmo system (figure 2).

[figure 2: system diagram. ppc - pulse pressure change, lv - left ventricle, ra - right atrium, cat - vent, vs - volume sensor, vcp - venous circuit part, oxy - oxygenator, acp - arterial circuit part, ac - arterial cannula, a - aorta, ic - inflow cannula]

each compartment is modelled using a mathematical relationship between the blood volume $V_i(t)$, the input flow rate $F_{i,\mathrm{in}}(t)$ and the output flow rate $F_{i,\mathrm{out}}(t)$ relative to the $i$-th compartment, given as

$\frac{dV_i(t)}{dt} = F_{i,\mathrm{in}}(t) - F_{i,\mathrm{out}}(t),$  (1)

with a flow rate $F_{ij}(t)$ between compartments $i$ and $j$ defined in general by [9]

$F_{ij}(t) = \frac{P_i(t) - P_j(t)}{R_{ij}}, \quad j = i - 1.$  (2)

analogous to kirchhoff's current law, which is applied in the electrical domain, the sum-to-zero law is applied in the hydraulic domain to the flow variables. the sum of all mass flows at any given point is zero [10]:

$F_{\mathrm{in}} - F_{\mathrm{out}} = 0.$  (3)

the left ventricular pressures were established according to a single cycle of cardiac activity in time, given as [11]

$\mathrm{pressure} = \begin{cases} \mathrm{diapressure} & \text{if } t_c < t_{d1}, \\ \mathrm{diapressure} + \sin((t_c - t_{d1})\pi)\,(\mathrm{syspressure} - \mathrm{diapressure}) & \text{if } t_c < t_{d2}, \\ \mathrm{diapressure} & \text{otherwise}. \end{cases}$  (4)

the blood flow in the system components can be completely described by the hagen-poiseuille equation, which brings together all of the variables that determine the flow:

$Q = \frac{\pi \Delta P r^4}{8 \mu L}.$  (5)

the hagen-poiseuille equation states that, for a circular cross-section of the lumen, the maximum flow is inversely proportional to the lumen length and directly proportional to the fourth power of its radius [12]. the radii and lengths of the components varied according to the required quantities. a dynamic viscosity of blood of 0.001 pa s was chosen [13]. the pump element was taken from the modelica library for physiological calculations - physiolibrary. for the simulation purpose, the pump flow rate was gradually increased from 1 l/min to 5 l/min. the oxygenator was modelled as a compartment with a pressure gradient

$\Delta P = P_{\mathrm{in}} - P_{\mathrm{out}}.$  (6)

the simulation in this study is based on data from a standard in vivo experiment on large animal models. the initial parameter values were derived from measurements on a female swine [14]. the parameters for the simulations are presented in table 1.

table 1. initial values of the state variables and parameters of the model in modelica.

extracorporeal blood flow [l/min]         1            2            3            4            5
end-systolic volume (esv) [ml]            64 ± 11      70 ± 11      74 ± 11      78 ± 12      83 ± 14
systolic blood pressure (sbp) [mmhg]      60 ± 7       72 ± 7       81 ± 6       89 ± 7       97 ± 8
lv end-diastolic pressure (edp) [mmhg]    17.2 ± 1.4   18.2 ± 0.7   18.6 ± 1.5   18.9 ± 2.4   19.0 ± 2.9
heart rate (hr)                           94 ± 4       89 ± 3       84 ± 3       80 ± 2       77 ± 2
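a minimal, self-contained sketch of the model's building blocks - the piecewise pressure driver of eq. (4), a poiseuille lumen resistance from eq. (5), and the volume integration of eq. (1) - is given below. the lumen radius, line pressure and pressure levels are illustrative assumptions, not the paper's parameters, and the sine term is normalized over the systolic interval here to give a well-formed pulse; the paper's exact arguments may differ.

import math

MU = 0.001  # pa*s, dynamic viscosity of blood used in the model [13]

def lv_pressure(tc, td1=0.1, td2=0.4, dia=1500.0, sys=8000.0):
    """eq. (4)-style lv pressure [pa] over one cardiac cycle (times in s, assumed values)."""
    if td1 <= tc < td2:
        return dia + math.sin((tc - td1) / (td2 - td1) * math.pi) * (sys - dia)
    return dia

def catheter_flow(p_lv, p_line, r=1.2e-3, length=1.0):
    """eqs. (2)/(5): pressure-driven flow [m3/s] through the drainage lumen."""
    resistance = 8.0 * MU * length / (math.pi * r**4)
    return (p_lv - p_line) / resistance

# integrate the withdrawn volume over one 0.75 s cycle (80 bpm), following eq. (1)
dt, volume, t = 1e-4, 0.0, 0.0
while t < 0.75:
    volume += catheter_flow(lv_pressure(t), p_line=0.0) * dt
    t += dt
print(f"withdrawn per cycle: {volume*1e6:.1f} ml")
# the result is of the same order of magnitude as the 10 fr values of table 2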
2.2. model output
the output variables consisted of the time-varying flow rate and pressure in different parts of the ecmo circuit and of the time-varying volume withdrawn from the lv. the values are presented in the following units: the time-varying flow rate in ml/s and the volume withdrawn from the lv in ml.

2.3. simulations
for the simulation purpose, the pump volume flow rate was gradually increased from 1 l/min to 5 l/min (table 2). for each value of the extracorporeal blood flow, the values of the systolic blood pressure, the lv end-diastolic blood pressure and the heart rate were changed in accordance with the data obtained from the experiment on big animal models. the internal diameter of the drainage catheter was gradually increased from 5 fr to 10 fr (table 2) for the various ebf values. the entire process was done for each unique combination of the ebf, catheter size and vital parameters.

3. results
the conducted study indicates that the size of the drainage catheter is a crucial factor when the insertion of a pigtail catheter into the lv through the aortic valve is used as a method of the lv decompression during the ecmo therapy. figure 3 depicts the withdrawal volume per cardiac cycle according to the catheter size. the results of the simulation have indicated that the ebf does not greatly affect the volume withdrawn from the lv by the drainage catheter. the relationship between the withdrawal volume per cardiac cycle and the ebf is shown in figure 4.

[figure 3: withdrawal volume value per cardiac cycle according to the catheter size; the values can be found in table 2]
[figure 4: the effect of the veno-arterial extracorporeal membrane oxygenation blood flow on the withdrawal volume value per cardiac cycle; the values can be found in table 2]

figure 5 depicts the flow rate profiles throughout the cardiac cycles and the time-varying volume withdrawn from the lv by the drainage catheter according to the catheter size. the red circles depict the volume values withdrawn from the lv during one cardiac cycle.

[figure 5: flow rate profiles throughout the cardiac cycle and the time-varying volume withdrawn from the lv by the drainage catheter, ebf 3 l/min]

table 2 presents the simulation results of the volumes withdrawn from the lv during one cardiac cycle. the drainage catheter size ranges from 5 fr to 10 fr; the ebf value varies from 1 l/min to 5 l/min.

table 2. volume withdrawn from the lv during va-ecmo in modelica.

drainage catheter    blood withdrawal per cardiac cycle [ml]
size [fr]            at ebf 1 / 2 / 3 / 4 / 5 l/min
5                    0.25    0.27    0.32    0.36    0.41
6                    0.52    0.58    0.67    0.76    0.85
7                    0.96    1.06    1.24    1.41    1.56
8                    1.63    1.82    2.12    2.41    2.67
9                    2.63    2.90    3.37    3.84    4.26
10                   3.96    4.37    5.08    5.78    6.41

4. discussion
an increase of the lv afterload, together with a severe systolic dysfunction during the va-ecmo, often requires an urgent lv unloading. a number of sources mention that the drainage catheter is successfully used for the lv unloading during the ecmo therapy [3, 16-20], but there is very little knowledge of the catheter size needed for an adequate lv unloading. this study was focused on the blood volumes withdrawn from the lv during one cardiac cycle. one common complication of the ecmo is the lv overload and distention, primarily due to the increased afterload caused by the ebf [2]. the ecmo application causes an increase of the end-systolic volume (esv) and can induce a deterioration of the lv function [1]. the end-systolic volume is the amount of blood left in the ventricle at the end of the contraction [15]. therefore, the objective was to identify the appropriate size of the drainage catheter for the extraction of the excessive esv, to maintain an adequate emptying of the lv at the end of each cardiac cycle, and thereby a decompression of the lv during the ecmo.
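table 2 follows the fourth-power radius dependence of eq. (5) closely; the short check below compares the tabulated per-cycle withdrawals at an ebf of 1 l/min against pure r^4 scaling from the 5 fr row.

# checking table 2 (ebf = 1 l/min column) against the r**4 scaling of eq. (5)
withdrawal_ml = {5: 0.25, 6: 0.52, 7: 0.96, 8: 1.63, 9: 2.63, 10: 3.96}

base_fr, base_ml = 5, withdrawal_ml[5]
for fr, simulated in withdrawal_ml.items():
    predicted = base_ml * (fr / base_fr) ** 4   # poiseuille: q scales with r**4
    print(f"{fr:2d} fr: table {simulated:4.2f} ml, r^4 scaling {predicted:4.2f} ml")
# 10 fr comes out at 0.25 * 2**4 = 4.00 ml vs the simulated 3.96 ml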
as can be seen in table 1, with an increase of the ebf from 1 l/min to 5 l/min, the lv end-systolic volume increases by, on average, 4.75 ± 0.95 millilitres per litre (table 1). the outcomes of the simulation have shown that the 10 fr drainage catheter withdraws 5.38 ± 0.76 ml during one cardiac cycle at an ebf from 1 l/min to 5 l/min (table 2). thereby, the 10 fr catheter presents a promising solution to achieve the purpose of the lv unloading during the ecmo by a drainage catheter inserted in the lv and connected to the venous part of the ecmo system. the present study demonstrates that the volume withdrawn by a drainage catheter connected to the inflow part of the ecmo system depends mainly on the size of the catheter. the ebf did not demonstrate a notable effect: when the catheter diameter was kept constant and the ebf varied, the flow in the drainage catheter varied only marginally.

the limitations of the study are related to the mathematical model. the created model replicates the general specification of the flow in the ecmo circuit; it does not behave completely the same way as the ecmo system. it has been assumed that blood is modelled as an incompressible, newtonian fluid; the flow is considered to be laminar, with no acceleration of the fluid in the ecmo circuit, and gravitational effects were neglected. to make the study results clinically feasible, future work should be verified with a model specific to the ecmo system.

5. conclusions
the unloading capacities of various drainage catheter diameters and various ebf values during the ecmo, assessed by applying a numerical method, are presented in this paper. the results suggest that the catheter diameter is the crucial factor when the insertion of a pigtail catheter into the lv through the aortic valve is used as a method of the lv decompression during the ecmo therapy. the ebf hardly affects the volume withdrawn by the drainage catheter from the lv. the model used in the presented work provides interesting answers to the question regarding the determining parameters in the ecmo circuit. the model can predict details of the pressure, withdrawal volume and flow rate at any position in the system throughout the cardiac cycles.

list of symbols
vi blood volume [m3]
fi,in input flow rate [m3/s]
fi,out output flow rate [m3/s]
fij flow rate between compartments i and j [m3/s]
pi pressure of compartment i [pa]
pj pressure of compartment j [pa]
rij resistance between compartments i and j [(n/m2)(m3/s)−1]
fin input flow [m3/s]
fout output flow [m3/s]
diapressure diastolic blood pressure [pa]
syspressure systolic blood pressure [pa]
tc relative time in cardiac cycle [s]
td1 relative time of start of systole [s]
td2 relative time of end of systole [s]
q volumetric flow rate [m3/s]
µ dynamic viscosity [pa s]
r pipe radius [m]
l length of pipe [m]
∆p pressure reduction [pa]
pin input pressure [pa]
pout output pressure [pa]

references
[1] s. vandenberghe, p. segers, b. meyns and p. verdonck.
unloading effect of a rotary blood pump assessed by mathematical modeling. artif organs 27(12), pp. 1094-1101, 2003.
[2] s. strunina and p. ostadal, "left ventricle unloading during veno-arterial extracorporeal membrane oxygenation", current research: cardiology, vol. 3, pp. 5-8, 2016.
[3] hong, byun, yoo, hwang, kim and park. successful left-heart decompression during extracorporeal membrane oxygenation in an adult patient by percutaneous transaortic catheter venting. the korean journal of thoracic and cardiovascular surgery 48(3), pp. 210-213, 2015.
[4] g. douflé, a. roscoe, f. billia and e. fan, "echocardiography for adult patients supported with extracorporeal membrane oxygenation", critical care, vol. 19, pp. 1-10, 2015.
[5] ježek filip et al, "zkušenosti z inovace výuky modelování a simulace na fel čvut (experience from the innovation of the modelling and simulation course at fee ctu)." 2012, pp. 139-146. (in czech).
[6] m. mateják, "physiology in modelica," mefanet journal, vol. 2, (1), pp. 10-14, 2014.
[7] t. kulhánek et al, "simple models of the cardiovascular system for educational and research purposes," mefanet journal, vol. 2, (2), pp. 56-63, 2014.
[8] mateják marek et al, "physiolibrary - modelica library for physiology," (96), pp. 499-505, 2014.
[9] j. fernandez de canete, j. luque, j. barbancho and v. munoz. modelling of long-term and short-term mechanisms of arterial pressure control in the cardiovascular system: an object-oriented approach. computers in biology and medicine 47, pp. 104-112, 2014.
[10] p. fritzson, principles of object-oriented modeling and simulation with modelica 3.3: a cyber-physical approach. wiley, 2014.
[11] t. kulhánek, m. tribula, j. kofránek and m. mateják. simple models of the cardiovascular system for educational and research purposes. mefanet journal 2(2), pp. 56-63, 2014.
[12] k. kohler, k. valchanov, g. nias and a. vuylsteke, "ecmo cannula review," perfusion, vol. 28, pp. 114-124, 2013.
[13] b. uggla and t. k. nilsson, "whole blood viscosity in plasma cell dyscrasias," clin. biochem., vol. 48, pp. 122-124, 2015.
[14] ostadal, mlcek, kruger, hala, lacko, mates, vondrakova, svoboda, hrachovina, janotka, psotova, strunina, kittnar and neuzil. increasing venoarterial extracorporeal membrane oxygenation flow negatively affects left ventricular performance in a porcine model of cardiogenic shock. journal of translational medicine 13(1), 2015.
[15] d. u. silverthorn, b. r. johnson, w. c. ober, c. w. garrison and a. c. silverthorn, human physiology: an integrated approach. pearson education, 2013.
[16] h. kurihara, m. kitamura, m. shibuya, y. tsuda, m. endo and h. koyangi, "effect of transaortic catheter venting on left ventricular function during venoarterial bypass," asaio journal, vol. 43, 1997.
[17] m. kitamura, k. hanzawa, m. takekubo, k. aoki and j. hayashi. preclinical assessment of a transaortic venting catheter for percutaneous cardiopulmonary support. artificial organs 28(3), pp. 298-302, 2004.
[18] m. guirgis, k. kumar, a. h. menkis and d. h. freed. minimally invasive left-heart decompression during venoarterial extracorporeal membrane oxygenation: an alternative to a percutaneous approach. interact cardiovasc thorac surg 10(5), pp. 672-674, 2010.
[19] barbone a, malvindi pg, ferrara p, et al. left ventricle unloading by percutaneous pigtail during extracorporeal membrane oxygenation. 13(3), pp. 293-295, 2011.
[20] t. h. hong, j. h. byun, h. m. lee, y. h. kim, g. kang, j. h. oh, s. w. hwang, h. y. kim and j. h.
park, "initial experience of transaortic catheter venting in patients with venoarterial extracorporeal membrane oxygenation for cardiogenic shock," asaio journal, vol. 62, 2016.

acta polytechnica vol. 45 no. 1/2005, © czech technical university publishing house, http://ctn.cvut.cz/ap/
czech technical university in prague

schema management for data integration: a short survey

a. almarimi, j. pokorný

schema management is a basic problem in many database application domains such as data integration systems. users need to access and manipulate data from several databases. in this context, in order to integrate data from distributed heterogeneous database sources, data integration systems demand the resolution of several issues that arise in managing schemas. in this paper, we present a brief survey of the problem of schema matching which is used for solving problems of schema integration processing. moreover, we propose a technique for integrating and querying distributed heterogeneous xml schemas.

keywords: schema matching, schema integration, data integration.

1 introduction
heterogeneous data sets contain data that may be represented using different data models and different structuring primitives. they may use different definition and manipulation facilities, and run under different operating systems and on different hardware [3]. schemas have been used in information systems for a long time for these data sets. they provide a structural representation of data or information. a schema is a model of data sets which can be used for both understanding and querying data. as diverse data representation environments and application programs are developed, it is becoming increasingly difficult to share data across different platforms, primarily because the schemas developed for these purposes are developed independently and suffer from problems like data redundancy and incompatibility. when we consider different systems interacting with each other, it is very important to be able to transfer data from one system to another. this has led to research on heterogeneous database systems. (multidatabase systems make up a subclass of heterogeneous database systems.) heterogeneity in databases also leads to problems like schema matching and integration. the problem of schema matching is becoming an even more important issue in view of the new technologies for the semantic web [4].

the operation which produces a match of schemas in order to perform some sort of integration between them is known in the literature as a matching operation. matching is intended to determine which attribute in one schema corresponds to which attribute in another. performing a matching operation among schemas is useful for many particular applications such as mediation, schema integration, electronic commerce, ontology integration, data warehousing, and schema evolution. such an operation takes two schemas as input and produces a mapping between elements of the two schemas that correspond semantically to each other [29]. until recently, schema matching operations have typically been performed manually, sometimes with some support from graphical tools, and they are therefore time-consuming and error-prone. moreover, as systems become able to handle more complex databases and applications, their schemas become larger. this increases the number of matches to be performed. the main goal of this paper is to briefly survey the different issues that arise in managing schemas and to show how they are tackled from different perspectives.

the remainder of the paper is structured as follows. section 2 describes schema heterogeneity. section 3 presents schema matching approaches. section 4 introduces schema integration methodologies. section 5 describes data integration. in section 6 we present our proposal for a data integration system in the context of heterogeneous xml data sources. section 7 concludes the paper.

2 schema heterogeneity
schemas developed for different applications are heterogeneous in nature, i.e. although the data is semantically similar, the structure and syntax of its representation are different.
data heterogeneity is classified according to the level of abstraction at which they are detected and handled (data instance, schema or data model). schema heterogeneity arises due to different alternatives provided by one data model to develop schemas from the same part of the real world. for example, a data element modelled as an attribute in one relational schema may be modelled as a relation in another relational schema for the same application domain. the heterogeneity of schemas can be classified into three broad categories: � platform and system heterogeneity [22] – differences in operating systems, hardware, and dbms systems. � syntactic and structural heterogeneity, which encompasses the differences between data model, schema isomorphism [35], domain, and entity definition incompatibility [14] and data value incompatibility [10]. � semantic heterogeneity – this includes naming conflicts (synonym and homonyms) and abstraction level conflicts [23] due to generalization and aggregation. 3 schema matching to integrate or reconcile schemas we must understand how they correspond. if the schemas are to be integrated, the corresponding information should be reconciled and modelled in a single consistent way. methods for automating the discovery of correspondences use linguistic reasoning on schema labels and the syntactic structure of the schema. such methods have come to be referred to as schema matching. schema matching is a basic problem in many database application domains, such as data integration, e-business, data warehousing, and semantic query processing. 24 © czech technical university publishing housee http://ctn.cvut.cz/ap/ acta polytechnica vol. 45 no. 1 /2005 czech technical university in prague schema management for data integration: a short survey a. almarimi, j. pokorný schema management is a basic problem in many database application domains such as data integration systems. users need to access and manipulate data from several databases. in this context, in order to integrate data from distributed heterogeneous database sources, data integration systems demand the resolution of several issues that arise in managing schemas. in this paper, we present a brief survey of the problem of schema matching which is used for solving problems of schema integration processing. moreover, we propose a technique for integrating and querying distributed heterogeneous xml schemas. keywords: schema matching, schema integration, data integration. to motivate the importance of schema matching, we should understand the relation between a symbol and its meaning. we can consider a word to be a symbol that evokes a concept which refers to a thing. the meaning is in the application that deals with the symbol, and in general in the mind of the designer, and not in the symbol itself. hence, it is difficult to discover the meaning of a symbol. the problem gets more complicated as soon as we move to a more realistic situation in which, for example, an attribute in one schema is meant to be mapped in two more specialized attributes in another schema. in general we can say that the difficulty of schema matching is related to the lack of any formal way to expose the intended semantic of the schema. to define a match operation, a particular structure for its input schemas and output mapping must be chosen. it can be represented by an entityrelationship model, an object-oriented model, xml, or directed graphs. 
in each sort of representation, there is a correspondence among the sets of elements of the schemas: for example, entities and attributes in an entity-relationship model; objects in an object-oriented model; elements in xml; and nodes and edges in graphs. a mapping is defined to be a set of mapping elements, each of which indicates how the elements in the schemas are related. there are several classification criteria that must be considered for the realization of individual matching. matching techniques may consider instance-level data [17, 38] or schema-level information [12, 15]. such techniques can be performed for one or more elements of one schema to one or more elements of the other. various approaches have been developed over the years that can be grouped into classes, according to the kind of information and the actual idea used:

• manual approaches. the mechanisms used in these approaches involve the use of an expert to solve the matching, for example by drag and drop.
• schema based approaches. these are based on knowledge of the internal structure of a schema and its relation with other schemas.
• data driven approaches. here, the similarities are more likely to be observed in the data than in the schema.

4 schema integration

schema integration is the process of combining database schemas into a coherent global view. schema integration is necessary in order to reduce data redundancy in heterogeneous database systems. it is often hard to combine different database schemas because of the different data models or structural differences in how the data is represented and stored. thus, there are many factors that may cause schema diversity [6]:

• different user or view perspectives,
• equivalence among constructs of the model,
• incompatible design specifications,
• common concepts represented in different ways.

there are several features of schema integration that make it difficult. the key issue is the resolution of conflicts among the schemas. a schema integration method can be viewed as a set of steps to identify and resolve conflicts. schema conflicts represent differences in the semantics that different schema designers associate with syntactic representations in the data definition language. even when two schemas are in the same data model, naming and structural conflicts may arise. naming conflicts occur when the same data is stored in multiple databases, but is referred to by different names; they arise when names are homonyms and when names are synonyms. the homonym naming problem occurs when the same name is used for two different concepts. the synonym naming problem occurs when the same concept is described using two or more different names. structural conflicts arise when data is organized using different model constructs or integrity constraints. some common structural conflicts are:

• type conflicts – using different model constructs to represent the same data,
• dependency conflicts – a group of concepts related differently in different schemas (e.g. 1-to-1 participation versus 1-to-n participation),
• key conflicts – a different key for the same entity,
• interschema properties – schema properties that only arise when two or more schemas are combined.

the schema integration process involves three major steps:

1. pre-integration, a step in which input schemas are re-arranged in various ways to make them more homogeneous (both syntactically and semantically).
2. correspondence identification, a step devoted to the identification of related items in the input schemas and the precise description of the relationships between these inter-schema items.
3. the final step, which actually unifies the corresponding items into an integrated schema and produces the associated mappings.

a robust integration methodology must be able to handle both naming and structural conflicts. there have been various attempts from different perspectives. the work [25] broadly classifies these attempts into two categories:

• structural approaches – also called the common data model approach. in this, the participating databases are mapped to a common data model. the problem with such systems is the amount of human participation required: human intervention is required to qualify the mappings between the individual databases and the common model.
• semantic approaches – these use a higher order language that can express information ranging over individual databases. ontology based integration approaches belong to this category. many research projects (shoe [21], ontobroker [7], observer [19]) and others use ontologies to create a global schema [20, 30].

in the past several years, many systems have been developed in various research projects on data integration using the techniques mentioned above. here are some of the more prominent representative systems:

• pegasus [1] takes advantage of object-oriented data modelling and programming capabilities. it allows the user to access and to manipulate multiple autonomous heterogeneous distributed object-oriented, relational and other information systems through a uniform interface.
• mermaid [36] uses a relational common data model and allows only relational schema integration.
• clio [34] was developed by ibm around 2000. it involves transforming legacy data into a new target schema. clio introduces an interactive schema mapping paradigm, based on value correspondences.
• garlic [11, 18] uses an odmg-93 based object oriented model. it extends odmg to allow the modelling of data items in the case of a relational schema with weak entities.
• tsimmis [13, 37] and medmaker [31] were developed at stanford around 1995. they use the object exchange model (oem) [32] as a common data model. oem allows irregularity in data. the main focus is to generate mediators and wrappers based on application specifications.
• mix [8, 3], a successor of tsimmis, uses xml to provide the user with an integrated view of the underlying database systems. it provides a query/browsing interface called blended browsing and querying.

these were the prominent techniques in the structural approach. there are many other techniques which use an ontology as a common data model or use ontologies to translate queries over component databases. below we present some of these techniques:

• information manifold [24] employs a local-as-view approach. it has an explicit notion of a global schema/ontology.
• the observer [28] system uses a different strategy for information integration. it allows individual ontologies and defines terminological relationships between them, instead of creating a global ontology to support all the underlying source schemas.

5 data integration

data integration is the process of combining data at the entity level. after schema integration has been completed, a uniform global view has been constructed.
however, it may be difficult to combine all the data instances in the combined schemas in a meaningful way. combining the data instances is the focus of data integration. data integration is difficult because similar data entities in different databases may not have the same key. determining which instances in two databases are the same is a complicated task if they do not share the same key. entity identification [27] is the process of determining the correspondence between object instances from more than one database. data integration is further complicated because attribute values in different databases may disagree or be range values. simply said, data integration is the process which:

• takes as input a set of databases (schemas), and
• produces as output a single unified description of the input schemas (the integrated schema) and the associated mapping information supporting integrated access to existing data through the integrated schema.

parent and spaccapietra [33] present a general data integration process in their survey on database integration. first, they convert a heterogeneous schema to a homogeneous representation, using transformation rules that explain how to transform constructs from the source data models to the corresponding ones in the target common data model. the transformation specification produced by this step specifies how to transform instance data from the source schema to the corresponding target schema. then, correspondences are investigated, using the semantic descriptions of the data to produce correspondence assertions. finally, correspondence assertions and integration rules are used to produce the unified schema.

in general, data integration systems can be classified into data-warehouse and mediator-wrapper systems. a data warehouse [9] is a decision support database that is extracted from a set of data sources. the extraction process requires data to be transformed from the source format into the data warehouse format. a mediator-wrapper approach [39] is used to integrate data from different databases and other data sources by introducing a middleware virtual database, called a mediator, between the data sources and the applications using them. wrappers are interfaces to data sources that translate data into a common data model used by the mediator. based on the direction of the mappings between a source schema and a global or common schema, mediator-wrapper systems can be classified into so-called global-as-view and local-as-view systems [19, 26]. in global-as-view (gav) approaches [16], each item in the global schema/ontology is defined in terms of the source schemas/ontologies. in local-as-view (lav) approaches, each item in each source schema/ontology is defined in terms of the global schema/ontology. methods for query rewriting and query answering using views, including the most important techniques in the literature for lav, are presented in [11].

6 integration and querying xml via mediation

in this section, we propose a general framework for a system for xml data integration and querying xml via mediation (iqxm) [2]. the architecture of iqxm is shown in fig. 1. iqxm mainly refers to the problem of integrating heterogeneous xml data sources. it can be used for resolving structural and semantic conflicts for distributed heterogeneous xml data. a global xml schema is specified by the designer to provide a homogeneous view over heterogeneous xml data. a mediation layer is proposed for describing mappings between global and local schemas.
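one conceivable way to picture such a mapping layer is a table that groups corresponding global and local paths, as in the hypothetical sketch below; all paths, source names and the dictionary layout are invented for illustration and are not taken from the iqxm implementation.

```python
# hypothetical sketch of a path-mapping table for a mediation layer;
# the global/local paths and source names are invented examples
METADATA = {
    "/university/student/name": {          # global path
        "src1": "/uni/students/student/fullname",
        "src2": "/college/person/name",
    },
    "/university/student/id": {
        "src1": "/uni/students/student/id",
        "src2": "/college/person/pid",
    },
}

def translate(global_path: str):
    """decompose a global query path into per-source local subqueries."""
    try:
        locals_ = METADATA[global_path]
    except KeyError:
        raise KeyError(f"no mapping recorded for {global_path}")
    return sorted(locals_.items())

for source, local_path in translate("/university/student/name"):
    print(source, "->", local_path)
```

in gav terms, each global path above is defined by the set of local paths grouped under it; a lav system would instead store, for each source path, its definition over the global schema.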
an xml mediation layer is introduced to manage: (1) establishing appropriate mappings between the global schema and the schemas of the sources; (2) querying xml data sources in terms of the global schema. the xml data sources are described by the xml schema language. the former task is performed through a semi-automatic process that generates local and global paths. a tree structure for each xml schema is constructed and represented in a simple form. this is in turn used for assigning indices manually to match local paths to corresponding global paths. by gathering all paths with the same indices, the equivalent local and global paths are grouped automatically, and an xml metadata document is constructed. the query translator acts to decompose global queries into a set of subqueries. a global query from an end-user is translated into local queries for the xml data sources by looking up the corresponding paths in the xml metadata document.

fig. 1: system architecture

7 conclusion

in this paper, we have presented some problems behind schema management, such as schema matching and schema integration. schema matching is a basic problem in many database application domains. we have introduced some of the past and current approaches employed to solve these problems. finally, we have described a framework for an xml data integration and querying system.

acknowledgements

this work was supported in part by the national programme of research (information society project 1et100300419).

references

[1] ahmed, r. et al.: “the pegasus heterogeneous multidatabase system.” ieee computer, vol. 24, 1991, p. 19–27.
[2] almarimi, a., pokorný, j.: “querying heterogeneous distributed xml data.” in: databases and information systems, int. baltic conf. on db&is 2004, riga, latvia, acta universitatis latviensis, latvias universitate, 2004, p. 177–191.
[3] attaluri, g. et al.: “the cords multidatabase project.” ibm systems journal, vol. 34, 1995, no. 1, p. 39–62.
[4] berners-lee, t., hendler, j., lassila, o.: “the semantic web: a new form of web content that is meaningful to computers will unleash a revolution of new possibilities.” the scientific american, vol. 284, 2001, p. 34–43.
[5] baru, c. et al.: “xml-based information mediation with mix.” in: proc. of the acm sigmod international conference on management of data, 1999, p. 597–599.
[6] batini, c., lenzerini, m., navathe, s.: “a comparative analysis of methodologies for database schema integration.” acm computing surveys, vol. 18, 1986, no. 4, p. 323–364.
[7] benjamins, r., fensel, d.: “the ontological engineering initiative ka2.” in: proc. of the 1st int. conf. on formal ontologies in information systems, fois’98 (ed. n. guarino), trento, italy, ios press, 1998, p. 287–301.
[8] baru, c. et al.: “xml-based information mediation with mix.” in: proc. of sigmod’99, 1999, p. 597–599.
[9] bernstein, p. a., rahm, e.: “data warehouse scenarios for model management.” in: proc. 19th int. conf. on entity-relationship modeling, lecture notes in computer science, vol. 1920, springer, berlin heidelberg new york, 2000, p. 1–15.
[10] breibart, y. j. et al.: “database integration in a distributed heterogeneous database system.” in: proc. of 2nd int. ieee conf. on data engineering, los angeles, ca, 1986.
[11] calvanese, d., lembo, d., lenzerini, m.: “survey on methods for query rewriting and query answering using views.” technical report, university of roma, italy, april 2001.
[12] castano, s. et al.: “global view of heterogeneous data sources.” ieee trans. data knowledge eng., vol. 13, 2001, no. 2, p. 277–297.
[13] chawathe, s. et al.: “the tsimmis project: integration of heterogeneous information sources.” in: proc. of the information processing society of japan conference, tokyo, japan, 1995, p. 7–18.
[14] czejdo, d. b., rusinkiewicz, m., embley, d.: “an approach to schema integration and query formulation in federated database systems.” in: proc. of icde, 1987, p. 477–484.
[15] doan, a. h., domingos, p., levy, a.: “learning source descriptions for data integration.” in: proc. webdb workshop, 2000, p. 81–92.
[16] friedman, m., levy, a., millstein, t.: “navigational plans for data integration.” in: proc. of the 16th national conf. on aaai ’99, orlando, florida, 1999, p. 67–73.
[17] goldman, r., widom, j.: “data guides: enabling query formulation and optimization in semi-structured databases.” in: proc. of 23rd int. conf. on vldb, athens, greece, 1997, p. 436–445.
[18] haas, l. et al.: “optimizing queries across diverse data sources.” in: proc. of the 23rd int. conf. on vldb, athens, greece, 1997, p. 276–285.
[19] halevy, a. y.: “answering queries using views: a survey.” vldb journal, vol. 10, no. 4, december 2001, p. 270–294.
[20] hakimpour, f., geppert, a.: “resolving semantic heterogeneity in schema integration: an ontology based approach.” in: proc. of int. conf. on formal ontologies in information systems fois’01 (eds. ch. welty and b. smith), new york, acm press, october 2001, p. 297–308.
[21] heflin, j., hendler, j.: “semantic interoperability on the web.” in: proc. of extreme markup languages 2000, graphic communications association, 2000, p. 111–120.
[22] hull, r.: “managing semantic heterogeneity in databases: a theoretical perspective.” in: proc. of principles of database systems (pods’97), tucson, arizona, usa, 1997, p. 51–61.
[23] kashyap, v., sheth, a.: “semantic and schematic similarities between database objects: a context-based approach.” vldb journal, vol. 5, no. 4, 1996, p. 276–304.
[24] kirk, t. et al.: “the information manifold.” in: proc. of aaai spring symposium on information gathering, aaai, stanford, ca, march 1995, p. 85–91.
[25] lakshmanan, l., sadri, f., subramanian, i.: “on the logical foundations of schema integration and evolution in heterogeneous database systems.” in: proc. of dood’93, phoenix, az, 1993, p. 81–100.
[26] lenzerini, m.: “data integration: a theoretical perspective.” in: proc. of the acm symposium on principles of database systems, madison, wisconsin, usa, june 2002, p. 233–246.
[27] lim, e. et al.: “entity identification in database integration.” in: proc. of int. conf. on data engineering, los alamitos, ca, usa, ieee computer society press, 1993, p. 294–301.
[28] mena, e. et al.: “domain specific ontologies for semantic information brokering on the global information infrastructure.” in: proc. of international conference on formal ontologies in information systems, fois’98, trento, italy, ios press, june 1998, p. 269–283.
[29] milo, t., zohar, s.: “using schema matching to simplify heterogeneous data translation.” in: proc. 24th int. conf. on vldb, 1998, p. 122–133.
[30] visser, p. r. s. et al.: “resolving ontological heterogeneity in the kraft project.” in: proc. of 10th int. conf. on database and expert systems applications dexa’99, university of florence, italy, august 1999, p. 668–677.
[31] papakonstantinou, y., garcia-molina, h., ullman, j.: “medmaker: a mediation system based on declarative specifications.” in: proc. of icde conference, new orleans, feb. 1996, p. 132–141.
[32] papakonstantinou, y., garcia-molina, h., widom, j.: “object exchange across heterogeneous information sources.” in: proc. of 11th int. conf. on data engineering, taipei, taiwan, march 1995, p. 251–260.
[33] parent, c., spaccapietra, s.: “issues and approaches of database integration.” cacm, vol. 41, 1998, no. 5, p. 166–178.
[34] miller, r. j. et al.: “schema mapping as query discovery.” in: proc. 26th int. conf. on vldb, cairo, egypt, september 2000, p. 77–87.
[35] sheth, a., kashyap, v.: “so far (schematically) yet so near (semantically).” in: proc. of the ifip ds-5 conference on semantics of interoperable database systems, lorne, australia, november 1992, p. 283–312.
[36] templeton, m. et al.: “mermaid: a front end to distributed heterogeneous databases.” in: proc. of the ieee, vol. 75, 1987, no. 5, p. 695–708.
[37] ullman, j.: “information integration using logical views.” in: proc. of the int. conf. on database theory, 1997, p. 19–40.
[38] wang, q., wong, k.: “approximate graph schema extraction for semi-structured data.” in: proc. extended database technologies, lecture notes in computer science, vol. 1777, springer, berlin heidelberg new york, 2000, p. 302–316.
[39] wiederhold, g.: “mediators in the architecture of future information systems.” ieee computer, vol. 25, no. 3, march 1992, p. 38–49.

abdelsalam almarimi, msc.
e-mail: belgasem_2000@yahoo.com
department of computers
czech technical university
faculty of electrical engineering
karlovo nám. 13
121 35 praha 2, czech republic

prof. rndr. jaroslav pokorný, csc.
e-mail: pokorny@ksi.ms.mff.cuni.cz
department of software engineering
charles university
faculty of mathematics and physics
malostranské nám. 25
118 00 praha 1, czech republic

acta polytechnica doi:10.14311/ap.2017.57.0412
acta polytechnica 57(6):412–417, 2017, © czech technical university in prague, 2017, available online at http://ojs.cvut.cz/ojs/index.php/ap

on the common limit of the pt-symmetric rosen–morse ii and finite square well potentials

józsef kovács, géza lévai∗

institute for nuclear research, hungarian academy of sciences (mta atomki), debrecen, pf. 51, hungary 4001
∗ corresponding author: levai@atomki.mta.hu

abstract. two pt-symmetric potentials are compared, which possess asymptotically finite imaginary components: the pt-symmetric rosen–morse ii and the finite pt-symmetric square well potentials. despite their different mathematical structure, their shape is rather similar, and this fact leads to similarities in their physical characteristics. their bound-state energy spectrum was found to be purely real, and this finding was attributed to their asymptotically non-vanishing imaginary potential components. here the v(x) = γδ(x) + 2iλ sgn(x) potential is discussed, which can be obtained as the common limit of the two other potentials. the energy spectrum, the bound-state wave functions and the transmission and reflection coefficients are studied in the respective limits, and the results are compared.

keywords: pt-symmetric potential; bound states; scattering; dirac-δ limit.

1. introduction
the introduction of pt-symmetric quantum mechanics [1] gave strong impetus to the investigation of non-hermitian quantum mechanical systems (for a review, see [2]). in most cases these systems represent one-dimensional complex potentials that are invariant with respect to simultaneous space (p) and time (t) reflection, where the latter corresponds to complex conjugation. although these potentials are manifestly non-hermitian, they possess several features that are characteristic of hermitian systems, i.e. real potentials. perhaps the most spectacular one among these is that their discrete energy spectrum is partly or completely real. this feature was first attributed to pt symmetry, but it soon turned out that pt symmetry is neither a necessary nor a sufficient condition for the presence of real energy eigenvalues. it was found that in many such systems the energy eigenvalues merge pairwise with increasing non-hermiticity, and reappear as complex conjugate pairs. since at the same time the energy eigenstates cease to be eigenfunctions of the pt operator, this phenomenon was interpreted as the breakdown of pt symmetry. it was shown that from the mathematical point of view pt symmetry is a particular case of pseudo-hermiticity [3]. more recently, after a decade of theoretical investigations, the existence of pt symmetry, as well as that of its breakdown, was verified in quantum optical experiments [4].

although the first pt-symmetric potentials were solved by numerical methods, it was soon realized that most exactly solvable potentials can be cast into a pt-symmetric form, and the usual techniques applied to their hermitian versions can be used in the pt-symmetric setting too (see [5–7] for reviews). the pt-symmetrization of shape-invariant [5, 6, 8] and of the more general natanzon-class potentials [7] that are solved in terms of the (confluent) hypergeometric function [9] revealed that the characteristic features of pt-symmetric potentials can conveniently be studied using the exact analytical solutions of these potentials. a particularly interesting issue was the study of the breakdown of pt symmetry: the transition through the critical point could be reached by fine tuning of some potential parameter, and the whole process could be kept under control. it was found that there are exactly solvable potentials that do not exhibit this feature [10–13], while some others do [14–18]. it was also noticed that in most cases the complexification of the energy eigenvalues occurs at the same value of the control parameter (sudden mechanism) [15–17], while in some cases it is a continuous process [18]. although there are examples of this latter, gradual mechanism among natanzon-class potentials [18], it seems to be characteristic of potentials not belonging to the natanzon (and thus, the shape-invariant) class. examples are the numerically solvable bender–boettcher potentials [1], some piecewise constant potentials, like the pt-symmetric infinite square well [19], and the pt-symmetric exponential potential [20].

the asymptotic behaviour of the imaginary potential component was found to play an important role in determining the characteristics of the energy spectrum. the pt-symmetric scarf ii and rosen–morse ii potentials share the same real component, cosh⁻²(x), while their imaginary components are different.
in the case of the scarf ii potential the imaginary potential component vanishes asymptotically, while in the case of the rosen–morse ii potential it is the i tanh(x) function, reaching finite values for x → ±∞. in the former case the breakdown of pt symmetry can occur [15], while in the latter the discrete energy spectrum is purely real [10]. this latter finding was later proven for all three pii-class shape-invariant potentials (rosen–morse i, ii, eckart) using a thorough analysis of pt-symmetric natanzon-class potentials [7]. this clear difference can obviously be attributed to the different asymptotic behaviour of the two potentials [21].

this peculiar character of the asymptotically constant imaginary potential component characterising the pt-symmetric rosen–morse ii potential inspired the investigation of further potentials with similar structure. a natural candidate was the finite pt-symmetric square well potential [22], which is essentially the finite real square well potential supplemented outside the well by a constant imaginary component with opposite sign on the two sides. in a way, this potential can be considered an approximation of the pt-symmetric rosen–morse ii potential: the cosh⁻²(x) and i tanh(x) terms are mimicked by the finite real square well and the constant imaginary terms, respectively. it was supposed [22] that given the similar shapes, the main physical features of the two potentials would also be close to each other. it was found that the energy spectrum of the finite pt-symmetric square well potential is purely real, similarly to that of the pt-symmetric rosen–morse ii potential. another similarity was that by increasing the non-hermiticity, i.e. the coupling coefficient of the imaginary potential component, the energy eigenvalues are rapidly lifted to the positive domain e > 0. there was, however, an important difference: while the number of bound states was fixed for the rosen–morse ii potential, it was infinite for the finite pt-symmetric square well potential. the additional states were found to be the equivalents of transmission resonances of the real finite square well potential [22].

these results naturally inspire further investigation of pt-symmetric potentials with an asymptotically constant imaginary component. here a potential of this kind is investigated, which, furthermore, can be obtained as the common limit of the pt-symmetric versions of the rosen–morse ii and finite square well potentials:

$$v(x) = \gamma\delta(x) + 2i\lambda\,\mathrm{sgn}(x). \qquad (1)$$

the purpose of this work is to explore how the physical quantities of the pt-symmetric rosen–morse ii and finite square well potentials behave when the limit as above is implemented.

the paper is organized as follows. sections 2 and 3 discuss the specific limits of the pt-symmetric rosen–morse ii and finite square well potentials, respectively. in section 4 the results are summarized and are compared with those obtained for other potentials with various asymptotic behaviour.

2. the pt-symmetric rosen–morse ii potential

let us consider the potential

$$v(x) = -\frac{s(s+1)a^2}{\cosh^2(ax)} + 2i\Lambda a^2\tanh(ax). \qquad (2)$$

noting that the s → −s − 1 replacement leaves v(x) invariant, we may choose s ≥ −1/2.
following the discussion of [10], with the difference that x is rescaled by the positive real constant a as ax, the bound-state eigenvalues are

$$e_n = -a^2(s-n)^2 + \frac{\Lambda^2 a^2}{(s-n)^2}, \qquad (3)$$

while the corresponding wave functions are expressed in terms of jacobi polynomials [23]

$$\psi_n(x) = c_n \big(1-\tanh(ax)\big)^{\alpha_n/2}\big(1+\tanh(ax)\big)^{\beta_n/2}\, p_n^{(\alpha_n,\beta_n)}(\tanh(ax)). \qquad (4)$$

here c_n is the normalization constant

$$c_n = \frac{i^n 2^{n-s}}{\big|\Gamma\big(s+1+i\Lambda/(s-n)\big)\big|}\left(\frac{a\, n!\,\Gamma(2s-n+1)\big((s-n)^2+\Lambda^2/(s-n)^2\big)}{s-n}\right)^{1/2}, \qquad (5)$$

while

$$\alpha_n = s-n+\frac{i\Lambda}{s-n}, \qquad \beta_n = s-n-\frac{i\Lambda}{s-n}. \qquad (6)$$

it was shown in [10] that the number of bound states is always finite, and the upper limit does not depend on the parameter Λ:

$$n < s. \qquad (7)$$

it may be noted that for −1 ≤ s ≤ 0 the real component of (2) turns into a barrier with v_max(x) ≤ a²/4, and there are no bound states in this case.

let us reparametrize the potential in the following way:

$$\gamma = -2as(s+1), \qquad \lambda = a^2\Lambda. \qquad (8)$$

considering then the following limits,

$$\delta(x) = \lim_{a\to\infty}\frac{a}{2\cosh^2(ax)} \qquad (9)$$

and

$$\mathrm{sgn}(x) = \lim_{a\to\infty}\tanh(ax), \qquad (10)$$

the potential in (2) can be transformed into

$$v(x) = \gamma\delta(x) + 2i\lambda\,\mathrm{sgn}(x). \qquad (11)$$

let us now clarify the effect of this limit on the energy eigenvalues (3) and wave functions (4). from (8) it follows that

$$s = -\frac{1}{2}\Big(1-(1-2\gamma/a)^{1/2}\Big), \qquad (12)$$

where the “−” sign inside the brackets follows from the requirement s > 0. s can be expressed in a series involving the powers of 1/a as

$$s = -\frac{\gamma}{2a} - \frac{\gamma^2}{4a^2} + \mathcal{O}(a^{-3}). \qquad (13)$$

note that for s > 0, (8) implies that γ < 0. recalling (7), this result also leads to the finding that only the ground state remains normalizable. substituting s from (13) and Λ = λ/a² from (8), one finds that in the a → ∞ limit the ground-state eigenvalue becomes

$$e_0 = -\frac{\gamma^2}{4} + \frac{4\lambda^2}{\gamma^2}. \qquad (14)$$

the corresponding wave function can be calculated after substituting s from (13), Λ from (8) and n = 0 into (4) and (5), and taking the a → ∞ limit:

$$\psi_0(x) = \begin{cases} c_0\exp(-\kappa_+ x), & x > 0,\\ c_0\exp(-\kappa_- x), & x < 0, \end{cases} \qquad (15)$$

where

$$\kappa_\pm = -ik_\pm = \mp\frac{\gamma}{2} - i\,\frac{2\lambda}{\gamma} \qquad (16)$$

and

$$c_0 = \left(-\frac{2}{\gamma}\Big(\frac{\gamma^2}{4}+\frac{4\lambda^2}{\gamma^2}\Big)\right)^{1/2}. \qquad (17)$$

it can be seen that (15) satisfies pt symmetry, i.e. pt ψ₀(x) = ψ₀*(−x) = ψ₀(x). it is also found that

$$e_0 = -\kappa_\pm^2 \pm 2i\lambda, \qquad (18)$$

as expected. it is seen that the 2iλ sgn(x) potential alone cannot support any bound state; rather, the dirac delta term is also required for it. this is similar to the case of the pt-symmetric rosen–morse ii potential [10]: there the imaginary 2iΛa² tanh(ax) potential cannot support bound states without the presence of the real −s(s+1)a²/cosh²(ax) potential component.

it is worthwhile to check some special cases against already known results. for λ = 1/2 the potential in (11) reduces to the one considered in [24], and (14) confirms the result e₀ = −γ²/4 + 1/γ² discussed there. furthermore, the λ = 0 choice recovers the simple dirac delta potential [25]. finally, the hermitian case with a real step function can also be considered after replacing λ → ±iλ. in this case κ± in (16) become real, and a bound state can appear only for

$$\frac{\gamma}{2} + \frac{2|\lambda|}{|\gamma|} < 0, \qquad (19)$$

i.e. for a negative γ satisfying γ < −2|λ|^{1/2}. the single real energy eigenvalue can also be obtained from the transmission coefficient.
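these closed formulas are easy to cross-check numerically. the short sketch below verifies (14), (16) and (18) for the arbitrarily chosen sample values γ = −2 and λ = 1/2; the values themselves have no special meaning.

```python
# numerical cross-check of eqs. (14), (16) and (18);
# gamma = -2 and lam = 0.5 are arbitrary sample values (gamma < 0)
gamma, lam = -2.0, 0.5

e0 = -gamma**2 / 4 + 4 * lam**2 / gamma**2        # eq. (14)
kappa_p = -gamma / 2 - 2j * lam / gamma           # eq. (16), upper sign
kappa_m = +gamma / 2 - 2j * lam / gamma           # eq. (16), lower sign

# eq. (18): e0 = -kappa_pm**2 +/- 2i*lambda, with a real e0
assert abs(e0 - (-kappa_p**2 + 2j * lam)) < 1e-12
assert abs(e0 - (-kappa_m**2 - 2j * lam)) < 1e-12
# psi_0 of eq. (15) is normalizable: re kappa_+ > 0 > re kappa_-
assert kappa_p.real > 0 > kappa_m.real

print("e0 =", e0)    # -0.75 for these sample values
```

for λ = 1/2 and γ = −2 this reproduces e₀ = −γ²/4 + 1/γ² = −0.75, in line with the special case [24] mentioned above.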
adapting the corresponding formulas from [10], one obtains for an incoming wave from the left

$$t_{l\to r} = \frac{-ik_-/a}{-s-\frac{ik_-}{2a}-\frac{ik_+}{2a}}\;\frac{\Gamma\!\left(1-\frac{ik_-}{2a}-\frac{ik_+}{2a}-s\right)\Gamma\!\left(1-\frac{ik_-}{2a}-\frac{ik_+}{2a}+s\right)}{\Gamma\!\left(1-\frac{ik_+}{a}\right)\Gamma\!\left(1-\frac{ik_-}{a}\right)} \qquad (20)$$

and

$$r_{l\to r} = t_{l\to r}\;\frac{-s+\frac{ik_-}{2a}-\frac{ik_+}{2a}}{ik_-/a}\;\frac{\Gamma\!\left(1-\frac{ik_+}{a}\right)\Gamma\!\left(1+\frac{ik_-}{a}\right)}{\Gamma\!\left(1+\frac{ik_-}{2a}-\frac{ik_+}{2a}-s\right)\Gamma\!\left(1+\frac{ik_-}{2a}-\frac{ik_+}{2a}+s\right)}, \qquad (21)$$

where

$$k_\pm^2 = e \mp 2i\lambda \qquad (22)$$

are the squared wave numbers obtained from the asymptotic limits x → ∞ and x → −∞, respectively. for the sake of consistency the original notation k and k′ was replaced by k₋ and k₊, and an error in the sign of k′ was corrected in [10, eqs. (38)–(41)]. recalling (13) and taking the a → ∞ limit, the terms with the gamma functions reduce to unity, so (20) and (21) turn into

$$t_{l\to r}(k_-,k_+) = \frac{2ik_-}{ik_-+ik_+-\gamma} \qquad (23)$$

and

$$r_{l\to r}(k_-,k_+) = \frac{ik_--ik_++\gamma}{ik_-+ik_+-\gamma}. \qquad (24)$$

the transmission and reflection coefficients for the reverse direction are obtained by the k₋ ↔ k₊ replacement. it is found that t_{r→l}(k₋,k₊) = t_{l→r}(k₋,k₊) k₊/k₋, so the difference is represented by a phase factor (as can be seen from (22)), while the reflection coefficients are related by r_{r→l}(k₋,k₊) = r_{l→r}(k₋,k₊)(k₊ − k₋ − iγ)/(k₋ − k₊ − iγ). this handedness effect is similar to that observed for the pt-symmetric rosen–morse ii potential [10]. note that in the case of asymptotically vanishing pt-symmetric potentials the two transmission coefficients are strictly identical [26, 27]. the connection of the transmission and reflection coefficients to the bound state (15) will be discussed in section 3, together with the corresponding results obtained there.

3. the finite pt-symmetric square well potential

let us consider the potential

$$v(x) = \begin{cases} -iv, & x < -\ell,\\ -v_0, & |x| < \ell,\\ iv, & x > \ell, \end{cases} \qquad (25)$$

where ℓ and v₀ take on positive real values, and v takes on a real value. for the case v = 0, potential (25) reduces to the real square well potential [28]. the sign of v is not significant, because changing the sign of v is practically equivalent to a spatial reflection, i.e. to the p operation. following the discussion of [22] and using 2m = ħ = 1 units, the solution of the time-independent schrödinger equation can be described as

$$\psi(x) = \begin{cases} f_-(p_-)e^{ip_-x} + f_-(-p_-)e^{-ip_-x}, & x < -\ell,\\ \alpha\cos(kx) + i\beta\sin(kx), & |x| < \ell,\\ f_+(p_+)e^{ip_+x} + f_+(-p_+)e^{-ip_+x}, & x > \ell, \end{cases} \qquad (26)$$

where

$$p_\pm = (e \mp iv)^{1/2}, \qquad k = (e+v_0)^{1/2}. \qquad (27)$$

e denotes the energy eigenvalue and the complex square root function is understood as in [23]. note that in this case [p±]* = p∓ holds. the coefficients f₋(p₋) and f₊(p₊) can be determined from matching the solutions at x = ±ℓ as

$$f_-(p_-) = \frac{1}{2ip_-}e^{ip_-\ell}\Big((\alpha p_-+\beta k)\,i\cos(k\ell) + (\alpha k+\beta p_-)\sin(k\ell)\Big), \qquad (28)$$

$$f_+(p_+) = \frac{1}{2ip_+}e^{-ip_+\ell}\Big((\alpha p_++\beta k)\,i\cos(k\ell) - (\alpha k+\beta p_+)\sin(k\ell)\Big), \qquad (29)$$

while f₋(−p₋) and f₊(−p₊) follow from (28) and (29) by changing the sign of p₋ and p₊. considering the case v < 0 and following [22], the energy eigenvalues corresponding to solutions that vanish asymptotically in both directions are searched for as roots of the equation

$$2kp_{+i}\cos(2k\ell) + (p_{+r}^2 + p_{+i}^2 - k^2)\sin(2k\ell) = 0 \qquad (30)$$

on the real axis, where p₊ᵣ and p₊ᵢ are defined as the real and imaginary parts of p₊, i.e. p₊ = p₊ᵣ + ip₊ᵢ. note that (30) establishes a connection between p₊ᵣ and p₊ᵢ when e is real. taking (27) into consideration and separating its real and imaginary components, it turns out that the latter occurs only in the second term, in the form −i(v + 2p₊ᵣp₊ᵢ) sin(2kℓ).
since this expression has to be zero in general (irrespective of ℓ), it follows that

$$p_{+r} = -\frac{v}{2p_{+i}}. \qquad (31)$$

to reproduce (11), let us reparametrize the potential (25) by introducing

$$\gamma = -2\ell v_0, \qquad \lambda = \frac{v}{2}. \qquad (32)$$

then, keeping γ fixed and considering the ℓ → 0 limit, the potential in (25) can be transformed into

$$v(x) = \gamma\delta(x) + 2i\lambda\,\mathrm{sgn}(x) \qquad (33)$$

as well. in this limit, after applying l’hospital’s rule, equation (30) transforms into 2p₊ᵢ + γ = 0. together with (31) and (32) this means that there is a single bound state with

$$p_\pm = \frac{2\lambda}{\gamma} \mp i\,\frac{\gamma}{2}, \qquad (34)$$

with the energy eigenvalue

$$e_0 = -\frac{\gamma^2}{4} + \frac{4\lambda^2}{\gamma^2}, \qquad (35)$$

which coincides with (14). the transmission and reflection coefficients can be obtained from the corresponding limit of those in [22]. these coefficients for an incoming wave from the left are given by

$$t_{l\to r} = \frac{2ip_-k\,e^{-ip_-\ell}e^{-ip_+\ell}}{ik\cos(2k\ell)(p_++p_-) + \sin(2k\ell)(p_+p_-+k^2)} \qquad (36)$$

and

$$r_{l\to r} = e^{-2ip_-\ell}\,\frac{ik(p_--p_+)\cos(2k\ell) + (p_+p_--k^2)\sin(2k\ell)}{ik(p_++p_-)\cos(2k\ell) + (p_+p_-+k^2)\sin(2k\ell)}, \qquad (37)$$

respectively. in the ℓ → 0 limit they turn into

$$t_{l\to r} = \frac{2ip_-}{ip_-+ip_+-\gamma} \qquad (38)$$

and

$$r_{l\to r} = \frac{ip_--ip_++\gamma}{ip_-+ip_+-\gamma}, \qquad (39)$$

respectively. the equivalent coefficients for a wave incoming from the right are obtained by the p₊ ↔ p₋ change, similarly to the results of section 2. note that pt symmetry, and in particular the asymptotically non-vanishing potential component, has a strong influence on the asymptotic properties of the wave functions, and this fact manifests itself in the structure of the transmission and reflection coefficients too, in accordance with the findings of [22]. it turns out that for real e (as is the case here) the asymptotically vanishing (i.e. bound) states can be identified with the zeros of the reflection coefficients, rather than with the poles of the transmission (and reflection) coefficients. the reason is that in these solutions exp(±ip±x) occurs with the same sign in the exponent for both x > 0 and x < 0, corresponding to a transmitting wave. in particular, for the potential (33) the bound state occurs for 2p₊ᵢ + γ = 0, which is the zero of (39), while for the reverse direction the zero of r_{r→l} occurs at 2p₋ᵢ + γ = 0 = −2p₊ᵢ + γ, corresponding to the interchange of p₋ and p₊ or spatial reflection. the same results are obtained from the discussion of section 2 too.

4. conclusions

we investigated the pt-symmetric rosen–morse ii and finite square well potentials in the limit when their real even potential component turns into the dirac delta, while their imaginary odd components tend to the sign function, respectively. the energy spectrum was found to contain a single real eigenvalue for γ < 0 and arbitrary λ, depending on both parameters. the transmission and reflection coefficients were also determined, and it was found that they exhibit the expected handedness effect. the results of [24] were recovered for the bound-state energy after setting λ = 1/2, while for λ = 0 the dirac delta potential was obtained. the results were also derived for the hermitian version of this potential with an imaginary λ.

the transmission and reflection coefficients were also considered in the appropriate limit for the two potentials. it was confirmed that the handedness effect occurs in this case too, i.e. in contrast with real potentials, the reflection coefficients differ essentially for waves arriving from the two directions, while the transmission coefficients differ only in a phase. (note that for potentials with asymptotically vanishing imaginary component even this phase is missing.)
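the statement that the bound state appears as a zero of the reflection coefficient rather than as a pole of the transmission coefficient can be verified directly. the sketch below evaluates (38) and (39) at the bound-state momenta (34), for the same arbitrary sample values γ = −2, λ = 1/2 used earlier.

```python
# the bound state of the limit potential is a zero of r_{l->r}, eq. (39),
# and not a pole of t_{l->r}, eq. (38); gamma, lam are arbitrary samples
gamma, lam = -2.0, 0.5

e0 = -gamma**2 / 4 + 4 * lam**2 / gamma**2        # eq. (35)
p_m = 2 * lam / gamma + 1j * gamma / 2            # p_- from eq. (34)
p_p = 2 * lam / gamma - 1j * gamma / 2            # p_+ from eq. (34)

# consistency with the squared wave numbers, cf. eq. (27):
assert abs(p_p**2 - (e0 - 2j * lam)) < 1e-12
assert abs(p_m**2 - (e0 + 2j * lam)) < 1e-12

denom = 1j * p_m + 1j * p_p - gamma
r_lr = (1j * p_m - 1j * p_p + gamma) / denom      # eq. (39)
t_lr = 2j * p_m / denom                           # eq. (38)

print(abs(r_lr))   # ~0: the bound state is a zero of the reflection coefficient
print(abs(t_lr))   # finite: no pole of the transmission coefficient here
```

note that taking p± directly from (34) automatically selects the branch of the complex square root appropriate for the bound state, which a naive principal-branch evaluation of (27) would miss.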
it was shown that the only bound state that occurs in the limiting case of both potentials is obtained as the zero of the reflection coefficient, rather than as the pole of the transmission coefficient, in accordance with the findings of [22].

the present study confirms the importance of the asymptotically non-vanishing imaginary potential component, which was already pointed out in connection with the pt-symmetric rosen–morse ii potential [10, 21]. it is notable that supplementing the same real even asymptotically vanishing potential component with an asymptotically vanishing odd imaginary potential component (scarf ii [15, 29]) leads to complex conjugate energy eigenvalues when the relative intensity of the imaginary component is increased, but this phenomenon does not occur when the imaginary potential component is chosen asymptotically non-vanishing (rosen–morse ii). in this case increasing the intensity of the imaginary component leads to lifting the energy spectrum to higher energies, such that even the ground-state energy can be tuned to positive values. this was also found for the pt-symmetric finite square well potential [22].

it is instructive to consider further pt-symmetric potentials with various asymptotic behaviour. the energy spectrum of the bender–boettcher potentials v(x) = x²(ix)^ε [1] contains complex conjugate energy eigenvalues for ε < 0. in this case complexification is a gradual process, starting from higher energies. for ε ≥ 0 the energy eigenvalues are all real, similarly to the case of the pt-symmetric rosen–morse ii potential. the ε = 1 choice recovers the purely imaginary ix³ potential. another interesting case is the pt-symmetric exponential potential [20]. its solutions are expressed in terms of bessel functions, so it is also outside the natanzon class. this two-parameter potential is purely imaginary, and it tends to infinity asymptotically more strongly than the imaginary component of the bender–boettcher potential. it has the unusual feature that its energy spectrum generally contains both real and complex energy eigenvalues, such that it is not possible to separate parametric domains where only real energy eigenvalues occur. increasing non-hermiticity leads to the gradual complexification of the energy spectrum from above; however, the ground-state energy always remains real.

all these findings indicate that the breakdown of pt symmetry occurs in potentials with rather different patterns of the imaginary component. for asymptotically strongly divergent imaginary potential components the complexification of the energy eigenvalues generally occurs gradually, starting from above. for potentials with an asymptotically vanishing imaginary component the same process occurs from below, either suddenly [15–17] or gradually [18], but in these cases a non-vanishing real potential component is also necessary to obtain bound states. it is notable that although potentials with an asymptotically constant imaginary potential component (such as the rosen–morse ii and its limit discussed here) fall between the two potential types mentioned above, their energy spectrum is purely real. furthermore, a real potential component is also necessary for them to support bound states. further studies concerning potentials with various asymptotic patterns seem worthwhile in order to shed more light on the possible mechanisms of the breakdown of pt symmetry.
acknowledgements

this work was supported by the hungarian scientific research fund – otka, grant no. k112962.

references

[1] c. m. bender, s. boettcher. phys. rev. lett. 80:5243, 1998. doi:10.1103/physrevlett.80.5243
[2] c. m. bender. rep. prog. phys. 70:947, 2007. doi:10.1088/0034-4885/70/6/r03
[3] a. mostafazadeh. j. math. phys. 43:205, 2814 and 3944, 2002. doi:10.1063/1.1418246, doi:10.1063/1.1461427, doi:10.1063/1.1489072
[4] c. e. rüter, k. g. makris, r. el-ganainy, d. n. christodoulides, m. segev, d. kip. nature physics 6:192, 2010. doi:10.1038/nphys1515
[5] g. lévai, m. znojil. j. phys. a: math. gen. 33:7165, 2000. doi:10.1088/0305-4470/33/40/313
[6] g. lévai, m. znojil. mod. phys. lett. a 16:1973, 2001. doi:10.1142/s0217732301005321
[7] g. lévai. int. j. theor. phys. 54:2724, 2015. doi:10.1007/s10773-014-2507-9
[8] l. e. gendenshtein. jetp lett. 38:356, 1983.
[9] g. a. natanzon. teor. mat. fiz. 38:146, 1979.
[10] g. lévai, e. magyari. j. phys. a: math. theor. 42:195302, 2009. doi:10.1088/1751-8113/42/19/195302
[11] a. sinha, g. lévai, p. roy. phys. lett. a 322:78, 2004. doi:10.1016/j.physleta.2004.01.009
[12] g. lévai. j. phys. a: math. gen. 39:10161, 2006. doi:10.1088/0305-4470/39/32/s17
[13] g. lévai. phys. lett. a 372:6484, 2008. doi:10.1016/j.physleta.2008.08.073
[14] m. znojil. phys. lett. a 264:108, 1999. doi:10.1016/s0375-9601(99)00805-1
[15] z. ahmed. phys. lett. a 282:343, 2001. doi:10.1016/s0375-9601(01)00218-3
[16] g. lévai, a. sinha, p. roy. j. phys. a: math. gen. 36:7611, 2003. doi:10.1088/0305-4470/36/27/313
[17] g. lévai. pramana j. phys. 73:329, 2009. doi:10.1007/s12043-009-0125-5
[18] g. lévai. j. phys. a: math. theor. 45:444020, 2012. doi:10.1088/1751-8113/45/44/444020
[19] m. znojil, g. lévai. mod. phys. lett. a 16:2273, 2001. doi:10.1142/s0217732301005722
[20] z. ahmed, d. ghosh, j. a. nathan. phys. lett. a 379:1639, 2015. doi:10.1016/j.physleta.2015.04.032
[21] g. lévai. int. j. theor. phys. 50:997, 2011. doi:10.1007/s10773-010-0595-8
[22] g. lévai, j. kovács. submitted to phys. lett. a, 2017.
[23] m. abramowitz, i. a. stegun. handbook of mathematical functions. dover, new york, 1970.
[24] r. henry, d. krejčiřík. arxiv:1503.02478v2.
[25] d. j. griffiths. introduction to quantum mechanics. prentice hall, upper saddle river, nj, 1995.
[26] z. ahmed. phys. lett. a 324:152, 2004. doi:10.1016/j.physleta.2004.03.002
[27] f. cannata, j.-p. dedonder, a. ventura. ann. phys. (ny) 322:397, 2007. doi:10.1016/j.aop.2006.05.011
[28] s. flügge. practical quantum mechanics i. springer, berlin, heidelberg, new york, 1971.
[29] g. lévai, m. znojil. j. phys. a: math. gen. 35:8793, 2002. doi:10.1088/0305-4470/35/41/311
acta polytechnica vol. 45 no. 3/2005, czech technical university in prague

fuzzy optimisation of structural performance

m. holický

structural performance has become a fundamental concept in advanced engineering design in construction. basic criteria concerning the action effects and imposed performance requirements are analysed assuming two types of uncertainties: randomness and vagueness. the natural randomness of the action effect is handled by commonly used methods of the theory of probability. the vagueness of performance requirements due to indistinct or imprecise specification and perception is analysed using the basic tools of the theory of fuzzy sets. both types of uncertainties are considered to define fuzzy probabilistic measures of structural performance, the damage function and the fuzzy probability of failure. fuzzy probabilistic optimisation of vibration constraints indicates that the limiting values recommended for the acceleration of building structures may be uneconomical. further research should focus on verifying the input theoretical models using available experimental data.

keywords: performance requirements, randomness, fuzziness, optimisation.

notation

c total cost
c0 construction and maintenance cost
cd cost of full damage (full malfunction or performance failure)
dr(x) damage function
f(z, t) load function of a generic point z and time t
k(z) function expressing the shape of the deflection curve of a beam
k constant
n normalising factor
s action effect
r resistance (performance requirement)
r1 lower bound of the transition region
r2 upper bound of the transition region
x a generic point of a relevant performance indicator
z a point of a beam
t time point
π fuzzy probability of performance failure
ρ decision parameter
μr the mean of r
σr the standard deviation of r
μs the mean of s
σs the standard deviation of s
νr(x) membership function
φr(x|ν) probability density function of r at the damage level ν
v(z, t) deflection at a generic point z and time t
ej stiffness of a beam

1 introduction

structural performance has become a fundamental concept in advanced engineering design in construction. however, the performance requirements (including serviceability, safety, security, comfort, functionality) of buildings and engineering works are often affected by various uncertainties that can hardly be entirely described by traditional probabilistic models. as a rule, the transformation of human needs and desires, particularly of those describing occupancy comfort and aesthetical aspects, to performance (user) requirements often results in an indistinct or imprecise specification of the technical criteria for relevant performance indicators (for example permissible deflection, crack width, velocity, acceleration) [1]. thus, in addition to the natural randomness of basic variables, the performance requirements may be considerably affected by vagueness in the definition of technical criteria. two types of uncertainty of performance requirements are therefore identified here: randomness, handled by commonly used methods of the theory of probability, and fuzziness, described by the basic tools of the recently developed theory of fuzzy sets [2, 3].

similarly as in previous studies [4, 5], the fundamental condition of structural performance, s < r, relating an action effect s and a relevant performance requirement r, is analysed assuming the randomness of s and both the randomness and the fuzziness of r. an illustrative example of continuous vibration in offices is used throughout the paper to clarify the general concepts. in this example, it is shown that it is impossible to identify a distinct value of an appropriate indicator (a root mean square value of acceleration) that would separate satisfactory performance from unsatisfactory performance (see also [6, 7, 8]).
typically, a broad transition region is observed, where the building gradually loses its ability to perform adequately and where the degree of damage (inadequate performance or malfunction) gradually increases. this paper is an extension of previous studies [4, 5, 9, 10].

2 theoretical model for performance requirements

fuzziness due to vagueness and imprecision in the definition of the performance requirement r is described by the membership function νr(x) indicating the degree of membership of a structure in a fuzzy set of damaged (unserviceable) structures [1, 4, 5]; here x denotes a generic point of a relevant performance indicator (in the illustrative example the root mean square value of acceleration) used to assess both s and r. a simple piecewise linear membership function νr(x), shown in fig. 1, is considered in the following analysis.
the assessment of the upper limit r2, above which an office is fully unserviceable, is even more difficult than the appraisal of the lower limit r1. the upper limit r2 may vary considerably depending on the definition of a fully unserviceable state of a building. in accordance with the discussion in [8], an adverse comment is probable for accelerations above 0.10 ms�2. although this value may not imply a full disability of a building space to be used as an office, it is accepted here as an assessment of the upper limit r2. to show the effect of the upper limit of the transition region on optimum constraints, two indicative values r2 � 3 r1 (� 0.06 ms �2) and r2 � 5 r1 (� 0.10 ms �2) are considered in the following analysis. in addition to fuzziness, performance requirements are also dependent on the natural randomness of user needs. as already indicated above, this uncertainty is described in fig. 1 by the normal probability density function �r(x|�). the mean of �r(x|�) for a given damage level � is considered as the value of the indicator x for which � � �r(x), the standard deviation is taken as independent of x and equal to 0.1 r1 (0.002 ms �2). the above described theoretical model of performance requirements, including the fuzziness and randomness characteristics, should however be considered only as a conceivable representation of actual user needs. in order to determine a more accurate and more precise fuzzy probabilistic model of performance requirements, there is an urgent need for further development in the definitions of newly introduced characteristics of performance uncertainties using appropriate experimental data. 3 theoretical model of public perception the fuzzy probabilistic concepts introduced above may be effectively used as a theoretical model of public perception [9, 10]. public perception plays an important role in the assessment and final decision concerning any existing structure. due to its performance deficiency the new department store described in [9] soon became a building closely watched by a large number of users and local authorities. after a few years in service serious performance defects of the cladding, interior partitions, and other secondary elements were observed [9]. incidentally, at the same time another department store suffered from construction faults and this was partly the reason why all the deficiencies have been carefully recorded. this unfavourable engineering climate seems to enhance the intensity of public perception. the observed defects were often exaggerated and regarded as indicators of insufficient structural safety. widespread public perception of the defects and discrepancies in expert assess100 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 45 no. 3/2005 czech technical university in prague 1.0 fig. 1: fuzzy probabilistic model of the performance requirement r ments was reported in the newspapers and finally resulted in a strong public demand for the building to be strengthened [9]. the evaluation of public as well as expert assessments confirms the fundamental fuzzy probabilistic concepts described above. there is no distinct point in any commonly used performance indicator x (e.g. deflection, crack width) that would uniquely separate acceptable and unacceptable structures. rather, there seems to be a transition region in which the structure gradually becomes unserviceable and the degree of caused damage �r(x) increases. it appears that the conceivable model for �r(x) indicated in fig. 
note that the values of νr(x) are within the conventional interval from 0 to 1. in the assessment of existing structures there is no damage below a certain lower limit value r1, and full damage above the upper limit r2. the fuzzy probabilistic measures developed below can be used for the analysis of the structural performance of new structures as well as for an assessment of the public perception of existing structures exhibiting performance deficiencies.

4 fuzzy probabilistic measures of structural performance

the damage function dr(x) is defined as the weighted average of damage probabilities reduced by the corresponding damage level [4, 5]

$$d_r(x) = \frac{1}{n}\int_0^1 \nu \int_{-\infty}^{x} \varphi_r(x'\mid\nu)\,\mathrm{d}x'\,\mathrm{d}\nu, \qquad (1)$$

where n denotes a factor normalising the damage function dr(x) to the conventional interval [0, 1] and x′ is a generic point of x. the damage function dr(x) defined by equation (1) may be used [4, 5] to specify the design (or characteristic) value of the performance requirement corresponding to a given level of the total expected damage. thus, for the fuzziness characteristics r1 and r2, and the randomness characteristic σr, the design value of the performance requirement r may be specified in a rational way using fuzzy probabilistic concepts. the fuzzy probability of performance (serviceability) failure π is then defined as [4, 5]

$$\pi = \int_{-\infty}^{\infty} \varphi_s(x)\, d_r(x)\,\mathrm{d}x, \qquad (2)$$

where φs(x) is the probability density function of the action effect s. similarly, the fuzzy probability of performance failure π defined by equation (2) enables the formulation of various design criteria in terms of relevant randomness as well as fuzziness characteristics. then, however, besides the fuzziness characteristics r1, r2 and the randomness characteristic σr of the performance requirement r, the characteristics of the action effect s, particularly the mean μs and the standard deviation σs, are also needed. in the following, a symmetric normal (laplace–gauss) distribution of s is accepted. the general case of an asymmetric three-parameter lognormal distribution is considered in earlier studies [4, 5].

5 optimisation procedure

the optimum value of the fuzzy probability of performance failure can be estimated using the technique of design optimisation [4, 5]. it is assumed that the objective function is given by the total cost c(ρ) expressed approximately as a sum

$$c(\rho) = c_0(\rho) + \pi(\rho)\, c_d, \qquad (3)$$

where c0(ρ) is given as the sum of the construction and maintenance cost, and π(ρ) cd is the expected malfunction cost; here cd denotes the cost of full damage (full malfunction or serviceability failure) and ρ denotes the decision parameter (for example the mass per unit length or the cross section area). it has been shown [4, 5] that this equation can be used if the malfunction cost due to the damage level ν is given as the multiple ν cd (in the illustrative example this represents the cost due to disturbance and the lower efficiency of occupancies in the offices). further, it is assumed that both the initial cost c0(ρ) and the fuzzy probability of performance failure π(ρ) are dependent on the decision parameter ρ (in the illustrative example it is the mass per unit length of a floor component), while the cost of full damage cd is independent of ρ.
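since equations (1)–(3) are the computational core of the method, a small numerical sketch may be helpful. the implementation below uses the piecewise linear membership function of section 2 (νr rising from 0 at r1 to 1 at r2) and places the mean of φr(·|ν) at the point where νr equals ν; the quadrature grids, the action-effect statistics μs, σs and the cost values are illustrative assumptions only, not data from the paper.

```python
# minimal numerical sketch of eqs. (1)-(3); all grids and the assumed
# values of mu_s, sigma_s, c0 and cd are illustrative only
import math

r1, r2, sigma_r = 0.02, 0.06, 0.002       # m/s^2, as in the office example
mu_s, sigma_s = 0.018, 0.002              # assumed action-effect statistics
c0, cd = 1.0, 100.0                       # assumed costs, ratio cd/c0 = 100

def pdf(x, mu, sig):                      # normal probability density
    return math.exp(-0.5 * ((x - mu) / sig) ** 2) / (sig * math.sqrt(2 * math.pi))

def cdf(x, mu, sig):                      # normal distribution function
    return 0.5 * (1.0 + math.erf((x - mu) / (sig * math.sqrt(2.0))))

def trapz(ys, xs):                        # trapezoidal quadrature
    return sum((xs[i + 1] - xs[i]) * (ys[i + 1] + ys[i]) / 2
               for i in range(len(xs) - 1))

def damage(x, n=200):                     # eq. (1); normalised so d -> 1
    nus = [i / n for i in range(n + 1)]
    ys = [nu * cdf(x, r1 + nu * (r2 - r1), sigma_r) for nu in nus]
    return trapz(ys, nus) / 0.5           # integral of nu over [0, 1] is 1/2

xs = [i * 1e-4 for i in range(1201)]      # indicator grid, 0 ... 0.12 m/s^2
pi_f = trapz([pdf(x, mu_s, sigma_s) * damage(x) for x in xs], xs)   # eq. (2)
print("pi =", round(pi_f, 5), "  total cost c =", c0 + pi_f * cd)   # eq. (3)
```

repeating such a computation over a grid of μs, σs values and minimising (3) is essentially what stands behind the results shown in figs. 2–4 below.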
if c0(λ) is proportional to the decision parameter λ, and the load effect s is proportional to a power λ⁻ᵏ (k ≥ 1), then the optimum ratio cd/c0(λ) may be expressed [4, 5] as

cd/c0(λ) = [ k ( μs(λ) ∂Π(λ)/∂μs(λ) + σs(λ) ∂Π(λ)/∂σs(λ) ) ]⁻¹ ,  (4)

where the quantities c0(λ), μs(λ) and σs(λ) are dependent on the decision parameter λ. the partial derivatives of the fuzzy probability of failure Π in equation (4) are to be determined using equation (2) and numerical methods of integration and differentiation.

6 vibration of a floor member

vibration of a load-bearing horizontal member supporting the floor structure of a building may be analysed using the equation of motion [8] for a beam:

μ ∂²v(z,t)/∂t² + ej ∂⁴v(z,t)/∂z⁴ = f(z,t) ,  (5)

where v(z,t) denotes the vertical deflection and f(z,t) denotes a load function of a generic point z and time t, μ denotes the mass of the beam per unit length and ej the stiffness of the beam. in the case of vibration criteria for building structures [7] ensuring human comfort, the relevant variable used to verify the serviceability conditions of a beam is the acceleration a(z,t), which follows from equation (5) as

a(z,t) = ∂²v(z,t)/∂t² = [ f(z,t) − ej ∂⁴v(z,t)/∂z⁴ ] μ⁻¹ .  (6)

if the decision parameter is the mass per unit length μ, then the load effect s, being the root mean square value of the acceleration, can be expressed in terms of μ as

s = k(z) μ⁻¹ ,  (7)

where k(z) is a function expressing the shape of the deflection curve. thus, in this case of vibration of a floor member, the load effect s is proportional to μ⁻¹ and the parameter k, entering equation (4), is equal to 1.

the optimum cost ratio cd/c0 obtained from equation (4) for k = 1, σr = 0.1 r1 and r2/r1 = 3 is shown for selected values of μs and σs in fig. 2. similar results are shown in fig. 3 for r2/r1 = 5.

fig. 2: optimum cost ratio cd/c0 for k = 1, σr = 0.1 r1 and r2/r1 = 3
fig. 3: optimum cost ratio cd/c0 for k = 1, σr = 0.1 r1 and r2/r1 = 5

assuming σr = 0.1 r1 and r2/r1 = 5, it follows from fig. 3 that the optimum values of cd/c0 are slightly higher than those corresponding to σr = 0.1 r1 and r2/r1 = 3, which are indicated in fig. 2. in both cases (for r2/r1 = 3 and r2/r1 = 5) the optimum cost ratio cd/c0 for μs > r1 is very low (less than 100) and almost independent of the standard deviation σs. it is interesting to note that the optimum probability ratio Π cd/c0 is almost independent of the characteristics of the load effect s described by the mean μs and the standard deviation σs. fig. 4 shows the variation of Π cd/c0 with μs/r1 for selected σs/r1, assuming the same input data as in the illustrative example above: k = 1, σr = 0.1 r1 and r2/r1 = 3. it should be noted that the resulting values for k = 1, σr = 0.1 r1 and r2/r1 = 5 are almost exactly the same. considering σs/r1 ≤ 0.2 and μs/r1 ≤ 1, it follows from fig. 4 that the optimum fuzzy probability of performance failure may be assessed using the approximate relation

Π ≈ 0.1 c0/cd .  (8)

fig. 4 may be used to adjust equation (8) for the relevant values of μs/r1 and σs/r1.
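equation (4) only requires Π(μs, σs) and its partial derivatives, so the optimum cost ratio can be approximated numerically. the sketch below restates the Π computation of the previous sketch in compact form, so that it stands alone, and takes the partial derivatives by central finite differences; it relies on our reconstructed reading of (4), uses k = 1 as derived for the floor member above, and the sample points are illustrative only.

import numpy as np
from scipy.stats import norm
from scipy.integrate import trapezoid, cumulative_trapezoid

R1, R2, SIG_R = 0.02, 0.06, 0.002

def pi_failure(mu_s, sigma_s):
    # equation (2) with the linear nu_r(x) assumption of the earlier sketches
    x = np.linspace(R1 - 10 * SIG_R, R2 + 10 * SIG_R, 2001)
    nu = np.linspace(0.0, 1.0, 201)
    inner = trapezoid(nu[None, :] * norm.pdf(x[:, None], R1 + nu[None, :] * (R2 - R1), SIG_R),
                      nu, axis=1)
    d = cumulative_trapezoid(inner, x, initial=0.0)
    d /= d[-1]
    return trapezoid(norm.pdf(x, mu_s, sigma_s) * d, x)

def optimum_cost_ratio(mu_s, sigma_s, k=1.0, h=1e-4):
    """equation (4): cd/c0 = [k (mu_s dPi/dmu_s + sigma_s dPi/dsigma_s)]^-1."""
    dpi_dmu = (pi_failure(mu_s + h, sigma_s) - pi_failure(mu_s - h, sigma_s)) / (2 * h)
    dpi_dsig = (pi_failure(mu_s, sigma_s + h) - pi_failure(mu_s, sigma_s - h)) / (2 * h)
    return 1.0 / (k * (mu_s * dpi_dmu + sigma_s * dpi_dsig))

for mu in (0.5 * R1, 0.9 * R1, R1):
    print(mu / R1, optimum_cost_ratio(mu, 0.1 * R1))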
thus, equation (8) may be used for a first estimate of the optimum fuzzy probability of performance failure Π, assuming that the action effect is described by σs/r1 ≤ 0.2 and μs/r1 ≤ 1 and the performance constraint by k = 1, σr = 0.1 r1 and by r2/r1 approximately from 3 to 5. then, having the costs c0 and cd, the optimum fuzzy probability of performance failure Π may be assessed using equation (8). for example, if the expected cost ratio cd/c0 = 100 (the damage cost cd is a hundred times greater than c0), then the optimum Π is 0.001.

it should be noted that the acceleration constraints required for various buildings (loading areas and activities) may vary significantly (in particular, the lower and upper limits r1 and r2 may be related in various ways) [8]. furthermore, requirements concerning other aspects of structural performance may be characterised by input data dissimilar to those considered in fig. 4 [4, 5]. then an additional analysis considering appropriate input data k, σr and r2/r1 is needed.

7 discussion

fuzzy optimisation of structural performance is applicable to many types of serviceability and other functional requirements, particularly those affected by significant vagueness (for example deflection and acceleration). an example of an acceleration constraint for continuous vibration of a structure in an office building illustrates the general concepts well. it appears that the lower and upper bounds for acceleration constraints, denoted here r1 and r2 respectively, may vary within a broad range. similar conditions may be observed in the case of other serviceability indicators such as deflection or crack width.

assuming r2/r1 = 3 and σr = 0.1 r1, it follows from fig. 2 for a cost ratio cd/c0 equal to about 100 that the building should be designed in such a way that the characteristics of the action effects correspond to the horizontal line at the level cd/c0 = 100; for example, if σs = 0.1 r1, then the optimum mean is μs = 0.9 r1 (which is equal to 0.018 ms⁻² if r1 = 0.02 ms⁻² and r2/r1 = 3). for r2/r1 = 5 it follows from fig. 3 that for σr = σs = 0.1 r1 the optimum mean is μs = r1 (which is equal to 0.02 ms⁻²). thus, with the increasing upper limit r2 the optimum mean μs also increases (from 0.018 to 0.02 ms⁻²). note that for the standard deviation σs = 0.2 r1 the optimum mean μs is in both cases considerably lower (from 0.8 r1 to 0.95 r1).

if the cost ratio cd/c0 is 10⁴, then for r2/r1 = 3 (fig. 2) and, as above, for σr = σs = 0.1 r1, the optimum mean μs of the load effect is about 0.63 r1 (0.0136 ms⁻²); for r2/r1 = 5 (fig. 3) the optimum mean μs of the load effect is about 0.72 r1 (0.0144 ms⁻²). again, with the increasing upper limit r2 the optimum mean μs also increases (from 0.0136 to 0.0144 ms⁻²). for the standard deviation σs = 0.2 r1 the optimum mean μs is in both cases again considerably lower (less than 0.4 r1).

generally, with an increasing cost ratio cd/c0 and increasing standard deviations σr and σs, the optimum values of the mean μs and the standard deviation σs of the action effect s decrease. for higher values of these quantities the optimum values of the mean μs and the standard deviation σs may be quite severe and may not be achievable without introducing adequate structural measures. in some cases it may be necessary to revise the overall design of the building.
the acceleration constraints considered in various international documents [1, 6, 7], which are generally greater than the lower limit r1, correspond to optimum cost ratios cd/c0 in the range from 1 to 100 (see figs. 2 and 3). in the case of office buildings such values of the cost ratio seem to be rather low. consequently, the values of the acceleration constraints recommended in [6, 7, 8] may be uneconomical. a similar observation was obtained in previous studies concerning deflections [4, 5].

experience from the assessment of existing structures confirms that there is no distinct value that would uniquely distinguish acceptable and unacceptable structural conditions. the fuzzy probabilistic concept may well explain the disturbing variance in public perception and in expert assessments of the observed defects. in particular, it was difficult to explain the great differences in the experts' judgements. it appears that there is an optimum value of the performance indicator that would lead to the minimum total cost and may be considered as the most likely outcome of the expert assessment [10].

fig. 4: optimum probability ratio Π cd/c0 for k = 1, σr = 0.1 r1 and r2/r1 = 3

however, the presented concepts are based on merely hypothetical (although quite reasonable and plausible) assumptions concerning the theoretical models describing the randomness and vagueness of performance requirements. for example, the acceleration constraints used in the case of vibration were derived from data available in the literature and standards that may not fully fit actual conditions. obviously, to make a more credible assessment of the optimum structural characteristics, appropriate experimental data enabling a realistic definition of the relevant theoretical models is needed. at present only limited data is available, and it will be difficult to obtain new data experimentally. the most difficult problems seem to be connected with the definition of the membership function νr(x) and the specification of the relevant probability distributions, for which only conceivable theoretical models are used so far. in particular, both the limits r1 and r2 and the type of the function νr(x) (which may be a non-linear function of the indicator x) should be derived from appropriate experimental data. nevertheless, the general concepts and the developed methodical principles, supplemented by auxiliary computer programs, seem to provide effective tools for comparative studies and further investigation of structural performance.

8 conclusions

(1) performance requirements on structural behaviour are generally affected by two types of uncertainty: randomness, and vagueness due to indistinct or imprecise definitions and perceptions; the theory of probability and the theory of fuzzy sets may be used to analyse them.

(2) the newly developed fuzzy probabilistic concepts, including the damage function and the fuzzy probability of performance failure, provide effective measures enabling rational analysis and optimisation of structural performance.

(3) the proposed fuzzy probabilistic concepts are confirmed by available experience from the assessment of new as well as existing structures. fuzzy probabilistic concepts may well explain the disturbing variance in public perception and in expert assessments of existing structures.
(4) optimisation analysis indicates that commonly used performance criteria, including the acceleration constraints for continuous vibration of a structure in office buildings, may be uneconomical. similar observations were obtained in previous studies concerning deflections.

(5) appropriate experimental data enabling the specification of more realistic theoretical models is needed for the further development and practical applications of fuzzy probabilistic concepts, including the optimisation of structural performance.

9 acknowledgment

this paper presents a part of the findings of the research project cez: j04/98:210000029 "reliability and risk engineering of technological systems", supported by the ministry of youth and education of the czech republic.

references

[1] holický, m., östlund, l.: "probabilistic design concept." proc. international colloquium iabse: structural serviceability of buildings, göteborg, 1983, p. 91–98.
[2] brown, c. b., yao, j. t. p.: "fuzzy sets and structural engineering." journal of structural engineering, vol. 109 (1983), no. 5, p. 1211–1225.
[3] shiraishi, n., furuta, h.: "structural design as fuzzy decision model." proc. icasp 4, pitagora editrice, bologna, 1983, p. 741–752.
[4] holický, m.: "fuzzy optimisation of structural reliability." proc. icossar'93, a. a. balkema, rotterdam, 1994, p. 1379–1382.
[5] holický, m.: "fuzzy probabilistic optimisation of building performance." proc. cib-astm-iso-rilem international symposium application of the performance concept in building, tel aviv, 2001, p. 4-75 to 84 (see also automation in construction, vol. 8 (1999), no. 4, p. 437–443).
[6] iso 2631-2: evaluation of human exposure to whole-body vibration – part 2: continuous and shock-induced vibration in buildings (1 to 80 hz), 1989.
[7] iso 10137: basis for design of structures – serviceability of buildings against vibration, 1991.
[8] bachmann, h., ammann, w.: vibration in structures induced by man and machines. iabse, zurich, 1987.
[9] holický, m.: "performance deficiency of a department store – case study." proc. safety, risk and reliability – trends in engineering, iabse, zürich, 2001, p. 321–326.
[10] holický, m.: "structural failure and assessment of a department store." second international conference on forensic engineering, ice, london, november 2001, p. 55–63.

prof. ing. milan holický, ph.d., drsc.
phone: +420 224 353 842
fax: +420 224 355 232
e-mail: holicky@klok.cvut.cz

czech technical university in prague
klokner institute
šolínova 7
166 08 praha 6, czech republic
neural network based identification of material model parameters to capture experimental load-deflection curve

d. novák, d. lehký

abstract
a new approach is presented for identifying material model parameters. the approach is based on coupling stochastic nonlinear analysis and an artificial neural network. the model parameters play the role of random variables. the monte carlo type simulation method is used for training the neural network. the feasibility of the presented approach is demonstrated using examples of high performance concrete for prestressed railway sleepers and an example of a shear wall failure.

keywords: neural network, nonlinear fracture mechanics, latin hypercube sampling, identification.

1 introduction

utilization of an appropriate material model for realistic modeling of concrete is essential in order to capture both experimental results and real structures. generally, the more sophisticated the model we deal with, the greater the number of model parameters that have to be considered. in better cases, basic parameters such as compressive strength, modulus of elasticity, etc. are known. in worse cases, practically nothing is known. some parameters can be estimated using recommended formulas from the literature, but in most cases these formulas can be used only as a first approximation of the parameters.

if an experimental load-deflection curve is to be captured, e.g. by a nonlinear fracture mechanics model, the first calculation using an initial set of material model parameters usually deviates from the experimental curve. then it is necessary to make some correction of the parameters using a trial-and-error method. the parameters are changed step by step, the numerical calculations have to be repeated many times, and the numerical simulation results are compared with the experimental results. such a classical approach is not very efficient, especially if a complex material model with many parameters is used. that is why different alternatives of identification algorithms have been proposed in the literature, and they are now becoming increasingly attractive (e.g. [1], [2], [3]).

the aim of this paper is to present a new approach for identifying material model parameters. the proposed approach is based on coupling stochastic nonlinear fracture mechanics analysis and an artificial neural network. the identification parameters play the role of basic random variables, with the scatter reflecting the physical range of possible values. the efficient monte carlo type simulation method latin hypercube sampling (lhs) is used. the statistical simulation provides the set of data, "a bundle" of numerically simulated load-deflection curves. the generated basic random variables and the subsequently calculated load-deflection curves are used for training a suitable type of neural network. once the network is trained, it represents an approximation which can be utilized in a reverse way: for a given experimental load-deflection curve, to provide the best possible set of material model parameters.

several software tools had to be combined in order to make the identification possible: first, atena [4] nonlinear fracture mechanics software and the freet [5] probabilistic software package – these can be combined under the sara software shell [6], [7].
then dlnnet, a new neural network software tool, was developed [8]. the results of the approach were recently presented [9]. a similar concept of using latin hypercube sampling statistical simulation for stochastic training of a neural network was used to estimate microplane model parameters [10]. the methodology is demonstrated on selected numerical examples of identifying a material model for concrete, called sbeta (the classical, often used model available in atena software): a notched specimen under three-point bending of high-strength concrete used for railway sleepers, and shear wall experiments.

2 fundamental difficulty of nonlinear fracture mechanics modeling

for realistic modeling of structural failure of quasibrittle materials, an advanced computational analysis should utilize nonlinear fracture mechanics. atena software [4] is an efficient tool for the analysis of concrete, reinforced concrete and prestressed structures. the software employs a set of advanced material models for realistic calculation of the structural response. the well-known sbeta material model was verified during the long development of the software, and it reflects all important aspects of concrete behaviour in both tension and compression.

a fundamental difficulty naturally exists when an experimental load-deflection curve is to be captured by numerical simulation. such a virtual experiment needs good material data in order to reproduce the load-deflection curve properly. in the case of the sbeta material model the main parameters are: modulus of elasticity e, tensile strength ft, compressive strength fc, fracture energy gf, critical compressive displacement wd, and compressive strain in the uniaxial compressive test εc. a typical situation is that a set of preliminary parameters is first used for modeling, and in most cases only poor agreement with the experimental data is achieved. a heuristic user-based iteration has to be performed. for illustration, fig. 1 shows an experimental load-deflection curve and a result of numerical simulation with the first set of preliminary parameters of a notched beam. this example will be discussed later in this paper. although most of the material parameters were known (the relevant experiments had been done), the disagreement between the two curves is significant. it is clear that the parameters were not determined satisfactorily (errors, assumptions, contaminated experiments, influence of size effect, etc.), and the parameters need to be modified.

3 identification of material model parameters

3.1 general concept

the new identification technique is based on a combination of statistical simulation and training of a neural network.
several software tools had to be combined to make the identification possible. the whole procedure can be itemized as follows (the software relevant to the individual steps is referenced):

1. first, a computational model has to be developed using appropriate fem software, which enables modeling of both pre-peak and post-peak behaviour. the initial calculation uses a set of initial material model parameters. software: atena [4].

2. the parameters of the material model to be identified are considered as random variables described by a probability distribution. a rectangular distribution is a "natural choice", as the lower and upper limits represent the bounded range of physical existence. however, other distributions can also be used, e.g. gaussian (in spite of the fact that it is not bounded). these parameters are simulated randomly based on a monte carlo type simulation; lhs small-sample simulation is recommended. the statistical correlation between some parameters can be taken into account. software: freet [5].

3. a multiple calculation of the deterministic computational model using the random realizations of the material model parameters is performed, resulting in "a bundle" of load-deflection curves (usually overlapping the experimental curve). software: sara [6], [7].

4. the random load-deflection curves serve as the basis for training an appropriate neural network. such training can be called stochastic training, due to the stochastic origin of the load-deflection curves. after training, the neural network is ready to answer the reverse task: to select the material model parameters which capture the experimental load-deflection curve as closely as possible. software: dlnnet [8].

5. the final calculation using the identified material model parameters should verify how well the parameters were identified (atena).

the complex program communication and the necessary interfaces are schematically shown in fig. 2; a compact sketch of the whole loop is given below.

fig. 1: load-deflection curves – experiment and initial numerical simulation
fig. 2: identification software communication scheme

3.2 statistical simulation

in order to prepare the set of random load-deflection curves for training the neural network, a proper, efficient monte carlo type simulation has to be performed. the sara system, originally developed for statistical and reliability analysis of concrete structures, consists of two major parts – the freet statistical and reliability package and the atena nonlinear finite element simulation. the stochastic part of the sara system is the freet – feasible reliability engineering efficient tool – probabilistic program. this probabilistic software for statistical, sensitivity and reliability analysis of engineering problems was designed with its focus on computationally intensive problems, which do not allow thousands of samples to be performed [5], [11]. a special type of numerical probabilistic simulation, lhs, makes it possible to use only a small number of monte carlo simulations for a good estimation of the first and second moments of the response function. lhs uses stratification of the theoretical cumulative probability distribution function (cpdf) of the input random variables. the cpdfs of all random variables are divided into n equiprobable, non-overlapping intervals, where n is the number of simulations. the representative parameters of the variables are selected randomly on the basis of random permutations of the integers 1, 2, …, j, …, n. every interval of each variable must be used only once during the simulation.
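the five-step loop above, together with the lhs sampling just described, can be condensed into a short python sketch. run_fem_model and train_inverse_network are hypothetical placeholders standing in for an atena analysis and a dlnnet training run respectively, not the actual sara/freet/atena/dlnnet api; only the lhs sampler is fully worked out, and it assumes rectangular distributions with interval midpoints as representative values, a simplification of freet's scheme.

import numpy as np

rng = np.random.default_rng(0)

def lhs_uniform(bounds, n_sim):
    """latin hypercube sampling for rectangular distributions: every variable's
    probability range is split into n_sim non-overlapping intervals, and a random
    permutation guarantees each interval is used exactly once."""
    samples = np.empty((n_sim, len(bounds)))
    for j, (lo, hi) in enumerate(bounds):
        probs = rng.permutation((np.arange(n_sim) + 0.5) / n_sim)
        samples[:, j] = lo + probs * (hi - lo)
    return samples

def run_fem_model(params, n_points=20):
    """placeholder for a nonlinear fem run (atena) returning a sampled l-d curve."""
    raise NotImplementedError

def train_inverse_network(curves, params):
    """placeholder for stochastic training of the inverse network (dlnnet)."""
    raise NotImplementedError

def identify(bounds, experimental_curve, n_sim=20):
    params = lhs_uniform(bounds, n_sim)                  # step 2 (freet)
    curves = [run_fem_model(p) for p in params]          # step 3 (sara/atena)
    net = train_inverse_network(curves, params)          # step 4 (dlnnet)
    identified = net(experimental_curve)                 # reverse task: curve -> parameters
    run_fem_model(identified)                            # step 5: verification run
    return identified

# the sampler itself is runnable, e.g. for illustrative ranges of e_c [gpa] and f_t [mpa]:
print(lhs_uniform([(30.0, 90.0), (3.0, 7.0)], n_sim=5))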
atena nonlinear finite element software is well-established for realistic computer simulation of damage and failure of concrete and reinforced concrete structures in a deterministic way [12], [13]. the constitutive relation at a material point (the constitutive model) plays the most crucial role in the finite element analysis and decides how the structural model represents reality. since concrete is a complex material with a strongly nonlinear response even under service load conditions, special constitutive models for the finite element analysis of concrete structures are employed [4].

the sara system can easily be used for stochastic training of a neural network. the parameters for identification are simulated as random variables with prescribed variability. the resulting random load-deflection curves with the random realizations of the parameters serve for training the network. such a "bundle" of curves is shown in fig. 5.

3.3 artificial neural networks

the original idea of an artificial neural network was to provide a numerical model of processes in the brain. nowadays this approach is used in various fields of technical practice, mainly for classification problems [14]. in our identification technique a multilayered neural network is used. all neurons in one layer are connected with all neurons in the following layer. the connecting paths among the neurons are weighted, which models their conductivity. at the level of a neuron, the bias is added to the sum of the weighted impulses from each neuron of the previous layer, and then the transfer function is applied. three types of transfer functions can be used: hard-limit, linear and nonlinear (e.g. sigmoid) transfer functions. the synaptic weights, biases and transfer functions determine the behavior of the neurons and of the whole neural network. the output from a single neuron (fig. 3a) can be calculated as

y = f(x) = f( Σ_k w_k p_k + b ) ,  (1)

where the sum runs over the k = 1, …, K input impulses, w_k is the synaptic weight of the connecting path from the k-th neuron of the previous layer, p_k is the impulse from the k-th neuron of the previous layer, b is the bias of the neuron and f is the transfer function of the neuron. if the output vector of the whole neural network is required, the output vectors have to be calculated layer by layer, from the input layer to the output layer of the network. the output of the u-th layer of the network is

y_k^u = f^u( Σ_j w_kj^u y_j^(u−1) + b_k^u ) ,  (2)

where k indexes the components of the output vector of the u-th layer (k = 1, …, K, the number of neurons in the u-th layer), j indexes the components of the output vector of the (u−1)-th layer (j = 1, …, J, the number of neurons in the (u−1)-th layer), y_k^u is one component of the output vector, w_kj^u is the synaptic weight connecting the k-th neuron of the u-th layer with the j-th neuron of the (u−1)-th layer, y_j^(u−1) is one component of the output vector of the previous layer, b_k^u is the bias of the k-th neuron of the u-th layer and f^u is the transfer function of the neurons of the u-th layer. if u is the number of the last layer, then y^u is the output vector of the network.
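equations (1)–(2) translate directly into numpy. the minimal sketch below implements the fully connected forward pass with a sigmoid hidden layer and a linear output layer, the layer types used in the examples that follow; the weights here are random stand-ins, not trained values.

import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def layer_output(w, b, y_prev, transfer):
    # equation (2): y_k^u = f^u( sum_j w_kj^u * y_j^(u-1) + b_k^u )
    return transfer(w @ y_prev + b)

def forward(p, weights, biases, transfers):
    """propagate an input vector p through all layers (equation (1) per neuron)."""
    y = p
    for w, b, f in zip(weights, biases, transfers):
        y = layer_output(w, b, y, f)
    return y

# architecture of example 4.1: 20 curve points in, 15 sigmoid neurons, 5 linear outputs
sizes = [20, 15, 5]
weights = [rng.standard_normal((sizes[i + 1], sizes[i])) for i in range(2)]
biases = [rng.standard_normal(sizes[i + 1]) for i in range(2)]
y = forward(rng.standard_normal(20), weights, biases, [sigmoid, lambda x: x])
print(y.shape)   # (5,) -> one value per identified material parameter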
an artificial neural network works in two phases – active and adaptive. in the active phase the signal passes through the connecting paths from the input layer to the output layer of the network. to obtain correct results from this process, the weights and biases must have appropriate values; to assign these values, the adaptive phase must be used. this process is called training the neural network. for network training, a set of training data is needed. this set consists of ordered pairs [p_i, y_i], where y_i are the expected output vectors (in our case, random realizations of the model parameters selected for identification), which should be yielded by simulation of the network with input vectors p_i (in our case, points on the load-deflection curves). the main aim during training is to minimize the following criterion:

e = ½ Σ_{i=1}^{n} Σ_{k=1}^{K} ( y*_ik − ỹ_ik )² ,  (3)

where n is the number of ordered input–output pairs in the training set, y*_ik is the required output value of the k-th output neuron at the i-th input and ỹ_ik is the real output value (at the same input). in order to minimize the criterion e, some optimization technique is used; descriptions of these techniques can be found, e.g., in [15], [8].

fig. 3: a) scheme of a single neuron, b) scheme of a neural network with two layers

4 numerical examples

4.1 notched specimen of high-strength concrete used for railway sleepers

a classical experiment involving three-point bending of a notched plain concrete beam was performed in order to determine the fracture parameters of concrete for the mass production of railway sleepers. six specimens 80×80×480 mm were tested, with a notch 25 mm in depth. the fracture-mechanical parameters were determined on the basis of the rilem recommendation [16] and improvements according to elices [17], stibor [18] and veselý [19]. based on this experiment, the following parameters were obtained: modulus of elasticity ec = 32.4 gpa, fracture energy gf = 188.7 n/m, compressive strength fc = 75.0 mpa and an estimate of the tensile strength ft = 4.0 mpa.

fig. 4: notched beam under three-point bending (l = 200 mm, w = 80 mm, a = 25 mm, b = 80 mm)

the mean values of the parameters used for the stochastic simulation and consequent identification are as follows: modulus of elasticity ec = 60 gpa, tensile strength ft = 5 mpa, compressive strength fc = 75 mpa, fracture energy gf = 170 n/m and compressive strain at compressive strength in the uniaxial compressive test εc = 0.003. for stochastic training, randomness was introduced using the same coefficient of variation 0.15 and a rectangular probability distribution for all random variables. twenty simulations of lhs resulted in the load-deflection curves presented in fig. 5.

fig. 5: random load-deflection curve realizations – 20 simulations of lhs
fig. 6: load-deflection curves – experiment and numerical simulation using identified parameters
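to make the adaptive phase concrete, the sketch below trains the 20–15–5 network of this example by plain gradient descent on criterion (3). dlnnet uses its own optimization techniques [15], [8], so this is only an illustrative substitute, and the training pairs here are synthetic random data rather than simulated load-deflection curves.

import numpy as np

rng = np.random.default_rng(2)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

# synthetic stand-ins: 20 lhs simulations of 20 curve points -> 5 parameters
P = rng.standard_normal((20, 20))      # inputs p_i (points on the l-d curves)
Y = rng.standard_normal((20, 5))       # targets y_i* (parameter realizations)

w1 = 0.1 * rng.standard_normal((15, 20)); b1 = np.zeros(15)
w2 = 0.1 * rng.standard_normal((5, 15));  b2 = np.zeros(5)

lr = 0.05
for epoch in range(2000):
    h = sigmoid(P @ w1.T + b1)         # hidden layer, equation (2)
    out = h @ w2.T + b2                # linear output layer
    err = out - Y
    e = 0.5 * np.sum(err**2)           # criterion (3)
    # backpropagation of the quadratic criterion
    g2 = err                           # dE/d(out)
    gw2 = g2.T @ h; gb2 = g2.sum(0)
    gh = g2 @ w2 * h * (1.0 - h)       # gradient through the sigmoid
    gw1 = gh.T @ P; gb1 = gh.sum(0)
    w2 -= lr * gw2 / len(P); b2 -= lr * gb2 / len(P)
    w1 -= lr * gw1 / len(P); b1 -= lr * gb1 / len(P)
print(e)   # training error after the last epoch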
note that none of these random curves captured the experiment well. this input–output information serves for training the selected neural network: a network with 20 inputs (20 points on the load-deflection curve of each simulation are utilized for training), one hidden layer consisting of 15 neurons with a nonlinear transfer function and one output layer of 5 neurons with a linear transfer function. instead of 5 neurons, a second alternative with only 3 output neurons was also used. the original number of parameters (5) could be decreased (to 3), as the sensitivity analysis showed the dominating and non-dominating random variables. the trained neural network provided the material model parameters (for the 3 and 5 considered parameters, respectively): ec = 70.3 and 73.2 gpa, ft = 5.5 and 5.3 mpa, fc = 75 (mean value) and 103.22 mpa, gf = 128 and 141 n/m, εc = 0.003 (mean value) and 0.004. the final calculation using atena resulted in a very good agreement with the experimental load-deflection curve for both alternatives, fig. 6.

4.2 shear wall failure

the shear wall shown in fig. 7 was tested by maier and thürliman [20]. the square panel was orthogonally reinforced and provided with stiffening flanges. loading by a vertical force was first applied, representing a dead load. then a horizontal force was applied and increased to failure. during the experiment there was extensive diagonal cracking prior to failure, followed by explosive crushing of the concrete under maximum load. the experimental failure pattern is shown in fig. 7(a). the analysis was done by atena using plane-stress isoparametric finite elements with the composite reinforced concrete material. all 10 parameters of the shear wall material models (both concrete and steel reinforcement) were identified here. the mean values of the parameters used for the stochastic simulation and consequent identification are as follows: for concrete (sbeta model) – modulus of elasticity ec = 30 gpa, compressive strength fc = 35 mpa, tensile strength ft = 2.5 mpa, fracture energy gf = 75 n/m, compressive strain in the uniaxial compressive test εc = 0.0025, critical compressive displacement wd = 0.003 m; for steel (bilinear law) – yield strain x1 = 0.0027, yield stress fx1 = 574 mpa, ultimate strain x2 = 0.015 and ultimate stress fx2 = 764 mpa.

for stochastic training, randomness was intuitively introduced using a coefficient of variation of 0.10 for ec, ft and fc, 0.2 for gf and εc, 0.3 for wd and 0.1 for all steel parameters. a rectangular probability distribution is used for all random variables. the experimental load-deflection curve and 20 simulations of lhs are presented in fig. 8. this input–output information serves for training the selected neural network: a network with 24 inputs (24 points on the load-deflection curve of every simulation are utilized for training), two hidden layers consisting of 12 and 10 neurons with nonlinear transfer functions and one output layer of 10 neurons with a linear transfer function.

fig. 7: shear wall experimental failure and its atena simulation: (a) experimental failure, (b) failure simulated by atena
fig. 8: random load-deflection curve realizations – 20 simulations of lhs
the trained neural network provided the material model parameters: ec = 33 gpa, fc = 35.3 mpa, ft = 2.47 mpa, gf = 77.85 n/m, εc = 0.0026, wd = 0.0031 m, x1 = 0.0028, fx1 = 570.7 mpa, x2 = 0.0147 and fx2 = 768.8 mpa. the final calculation using atena resulted in a very good agreement with the experimental load-deflection curve, fig. 9.

fig. 9: load-deflection curves – experiment and simulation using identified parameters

5 conclusion

determining the material model parameters generally presents a great problem when using nonlinear analysis and sophisticated material constitutive laws. a methodology for efficient numerical identification of material model parameters is suggested. this approach utilizes stochastic computational analysis in combination with an artificial neural network. small-sample simulation techniques are employed, which enables the analysis of computationally intensive problems of nonlinear fracture mechanics. the feasibility of the approach is documented by numerical examples. application of the concept to problems where the experimental load-deflection curve is known can result in much better identification of the material model parameters than a heuristic "trial and error" approach. such well-identified parameters based on experiment can then be used for calculating a real structure. very good results have been achieved, which indicates that efficient techniques have been combined at all three basic levels: deterministic nonlinear modeling, probabilistic stratified simulation and neural network approximation.

references

[1] bažant, z. p., becq-giraudon, e.: "statistical prediction of fracture parameters of concrete and implications for choice of testing standard." cement and concrete research, vol. 32 (2002), no. 4, p. 529–556.
[2] iacono, c., sluys, l. j., van mier, j. g. m.: "development of an inverse procedure for parameter estimates of numerical models." in: proceedings of the euro-c conference, st. johann im pongau, austria, rotterdam: balkema, 2003, p. 259–268.
[3] fairbairn, e. m. r., paz, c. n. m., ebecken, n. f. f., ulm, f. j.: "use of neural networks for fitting of fe probabilistic scaling model parameters." international journal of fracture, vol. 95 (1999), p. 315–324.
[4] červenka, v., pukl, r.: atena program documentation. červenka consulting, prague, czech republic, 2003.
[5] novák, d. et al.: freet – user's manual and theory manual. brno/červenka consulting, czech republic, 2004.
[6] pukl, r., červenka, v., strauss, a., bergmeister, k., novák, d.: "an advanced engineering software for probabilistic-based assessment of concrete structures using nonlinear fracture mechanics." in: 9th international conference on applications of statistics and probability in civil engineering (icasp9), san francisco, california, usa, rotterdam: millpress, 2003, p. 1165–1171.
[7] bergmeister, k. et al.: "structural analysis and safety assessment of existing concrete structures." in: 1st fib congress concrete structures in 21st century, osaka, session 11, 2002, p. 47–54.
[8] lehký, d.: dlnnet – program documentation, theory and user's manual. brno, 2004 (in preparation).
[9] lehký, d., novák, d.: "identification of material model parameters using stochastic training of neural network." in: 5th international phd symposium in civil engineering, delft, netherlands, 2004.
[10] kučerová, a., lepš, m., zeman, j.: "soft computing methods for estimation of microplane model parameters." in: proc. of conference computational mechanics, wccm vi in conjunction with apcom 04, beijing, china, tsinghua university press & springer-verlag, 2004 (submitted).
[11] novák, d., vořechovský, m., rusina, r.: "small-sample probabilistic assessment – software freet." in: 9th international conference on applications of statistics and probability in civil engineering (icasp9), san francisco, usa, rotterdam: millpress, 2003, p. 91–96.
[12] červenka, v.: "simulating a response." concrete engineering international, vol. 4 (2000), no. 4, p. 45–49.
[13] červenka, v.: "computer simulation of failure of concrete structures for practice." in: 1st fib congress 2002 concrete structures in 21st century, osaka, japan, keynote lecture, 2002, p. 289–304.
[14] cichocki, a., unbehauen, r.: neural networks for optimization and signal processing. john wiley & sons ltd. & b. g. teubner, stuttgart, germany, 1993.
[15] the mathworks, inc.: matlab neural network toolbox. user's guide, 2002.
[16] rilem: "determination of fracture energy of mortar and concrete by means of three-point bend tests." materials and structures, vol. 18 (1985), p. 285–290.
[17] elices, m., guinea, g. v., planas, j.: "on the measurement of concrete fracture energy using three-point bend tests." materials and structures, vol. 30 (1997), p. 375–376.
[18] stibor, m.: "problémy s určováním lomové energie cementových kompozitů." in: sborník semináře problémy lomové mechaniky, brno, 2001, p. 72–77 (in czech).
[19] veselý, v.: "lomově-mechanické vlastnosti pražcového betonu." in: sborník 4. odborné konference doktorského studia (cd-rom), vut fast v brně, brno, 2002 (in czech).
[20] maier, j., thürliman, b.: "bruchversuche an stahlbetonscheiben." institut für baustatik und konstruktion eth zürich, bericht nr. 8003-1, 1985.

prof. ing. drahomír novák, drsc.
phone: +420 541 147 360
fax: +420 541 240 994
e-mail: novak.d@fce.vutbr.cz

ing. david lehký
phone: +420 541 147 376
fax: +420 541 240 994
e-mail: lehky.d@fce.vutbr.cz

institute of structural mechanics
brno university of technology
faculty of civil engineering
veveří 95
602 00 brno, czech republic

acta polytechnica 59(5):458–466, 2019
doi:10.14311/ap.2019.59.0458
© czech technical university in prague, 2019, available online at https://ojs.cvut.cz/ojs/index.php/ap

inverted pendulum with linear synchronous motor swing up using boundary value problem

lukáš koska, slávka jadlovská, dominik vošček, anna jadlovská∗

technical university of košice, faculty of electrical engineering and informatics, department of cybernetics and artificial intelligence, letná 9, 042 00 košice, slovak republic
∗ corresponding author: anna.jadlovska@tuke.sk

abstract. research in the field of underactuated systems shows that control algorithms which take the natural dynamics of the system's underactuated part into account are more energy-efficient than those utilizing fully-actuated systems. the purpose of this paper is to apply the two-degrees-of-freedom (feedforward/feedback) control structure to design a swing-up manoeuver that involves tracking the desired trajectories so as to achieve and maintain the unstable equilibrium position of the pendulum on the cart system. the desired trajectories are obtained by solving the boundary value problem of the internal system dynamics, while the optimal state-feedback controller ensures that the desired trajectory is tracked with minimal deviations. the proposed algorithm is verified on the simulation model of the available laboratory model actuated by a linear synchronous motor, and the resulting program implementation is used to enhance the custom simulink library inverted pendula modeling and control, developed by the authors of this paper.

keywords: automatic model generator, pendulum on the cart, linear synchronous motor, feedforward/feedback control structure, boundary value problem, swing-up control.

1. introduction

underactuated systems are mechanical systems that have fewer control inputs than degrees of freedom to control, which makes both the control design and the analysis more difficult when compared to their fully actuated alternatives. these systems exploit the natural dynamics of the system in order to achieve a higher level of performance in terms of speed, efficiency, or robustness [1]. a system of interconnected pendulums on a stabilizing base such as a cart/rotary arm is a typical example of a benchmark underactuated system. this system has proven to be important in studying the dynamics of more complex higher-order systems, such as mobile (notably legged) robots, robot manipulators, segways, aircraft or submarines [2] [3] [4] [5].

the basic control objective for pendulum-based underactuated systems is to stabilize each pendulum link in the vertical upright (inverted) position, often following a swing-up from the downward position. to perform the swing-up manoeuver, the pendulum can either be simply required to reach the upright position without regard to the trajectory tracked, or to reach it while simultaneously tracking the prescribed trajectory. the former approach can be realized via a number of methods, such as furuta's energy-based method [6], partial feedback linearization [7] or lyapunov function design [8]. an example application of the latter was notably published in [9] and [10], where graichen et al. studied the swing-up manoeuver problem for a double pendulum on the cart based on a combined feedforward/feedback control scheme resulting in a two-point boundary value problem (bvp) of the internal dynamics of the system. the bvp solution for the swing-up of the triple pendulum on the cart was introduced in [11].

this paper is a follow-up to the research of our underactuated systems group at the center of modern control techniques and industrial informatics (cmct&ii) at dcai, feei, tu in the field of modelling and control of underactuated systems (http://matlab.fei.tuke.sk/underactuated). the obtained results, such as functions that automate the generation of mathematical models for the n-link inverted pendulum system on a cart or a rotary base, implemented control algorithms and demo simulations, have been included in the custom matlab/simulink library, inverted pendula modeling and control (ipmac) [12], developed since 2011. the library has been gradually expanded: [13] describes the addition of a weight in the form of a sphere/ring/cylinder at the end of the pendulum, which is reflected in its inertia; additional benchmark models were introduced in [14].
the desired trajectories are obtained by solving the boundary value problem of the internal system dynamics, while the optimal state-feedback controller ensures that the desired trajectory is tracked with minimal deviations. the proposed algorithm is verified on the simulation model of the available laboratory model actuated by a linear synchronous motor, and the resulting program implementation is used to enhance the custom simulink library inverted pendula modeling and control, developed by the authors of this paper. keywords: automatic model generator, pendulum on the cart, linear synchronous motor, feedforward/feedback control structure, boundary value problem, swing-up control. 1. introduction underactuated systems are mechanical systems that have fewer control inputs than degrees of freedom to control, which make both the control design and the analysis more difficult when compared to their fully actuated alternatives. these systems exploit the natural dynamics of the system in order to achieve a higher level of performance in terms of speed, efficiency, or robustness [1]. a system of interconnected pendulums on a stabilizing base such as a cart/rotary arm is a typical example of a benchmark underactuated system. this system has proven to be important in studying the dynamics of more complex higher-order systems, such as mobile (notably legged) robots, robot manipulators, segways, aircrafts or submarines [2] [3] [4] [5]. the basic control objective for pendulum-based underactuated systems is to stabilize each pendulum link in the vertical upright (inverted) position, often following a swing-up from the downward position. to perform the swing-up manoeuver, the pendulum can either be simply required to reach the upright position without regard to the trajectory tracked, or to reach it while simultaneously tracking the prescribed trajectory. the former approach can be realized via a number of methods, such as furuta’s energy-based method [6], partial feedback linearization [7] or lyapunov function design [8]. an example application of the latter was notably published in [9] and [10] where graichen et al. studied the swing-up manoeuver problem for a double pendulum on the cart-based combined feedforward/feedback control scheme resulting in a two-point boundary value problem (bvp) of the internal dynamics of the system. the bvp solution for the pendulum swing-up for the triple pendulum on the cart was introduced in [11]. this paper is a follow-up to the research of our underactuated systems group at the center of modern control techniques and industrial informatics (cmct&ii) at dcai, feei, tu in the field of modelling and control of underactuated systems (http: //matlab.fei.tuke.sk/underactuated). the obtained results, such as functions that automate the generation of mathematical models for the n-link inverted pendulum system on the cart or a rotary base, implemented control algorithms and demo simulations, have been included in the custom matlab/simulink library, inverted pendula modeling and control (ipmac) [12], developed since 2011. the library has been gradually expanded, i.e. [13] describes the addition of weight in the form of a sphere/ring/cylinder at the end of the pendulum, which is reflected in its inertia; additional benchmark models were introduced in [14]. 
stabilizing control algorithms most importantly include state-feedback approaches, such as pole-placement or lqr, and the implementation of swing-up algorithms covers the energy method or the partial feedback linearization. the hybrid control structure, which enables switching between the swingup and stabilizing controller, is used to meet the stated control objective [14] [15]. the development of the ipmac as a framework for solving analysis/control problems of inverted pendulum systems also enabled us to unify and generalize the nomenclature and labelling of input/output/state variables and physical 458 https://doi.org/10.14311/ap.2019.59.0458 https://ojs.cvut.cz/ojs/index.php/ap http://matlab.fei.tuke.sk/underactuated http://matlab.fei.tuke.sk/underactuated vol. 59 no. 5/2019 swing-up of the inverted pendulum on a cart quantities used in mathematical modelling of these benchmark underactuated systems. the present paper proposes the algorithm for calculating the swing-up trajectories and their subsequent tracking in the two-degree of freedom control scheme, which is then verified on the simulation model of the laboratory single inverted pendulum system on a cart (cart-pole system), the multipurpose workplace for nondestructive diagnostics with linear synchronous motor, located in the cmct&ii’s laboratory of production lines and image recognition at dcai, feei tu in košice. this laboratory model was first presented in [16], where the process of obtaining its mathematical model and parameter identification, as well as design and subsequent verification of swing-up and stabilizing control was described in detail. unlike the approaches based on switching between swing-up and stabilizing control algorithms, the proposed approach has not yet been implemented in the ipmac library, and therefore will enhance it. the structure of the paper is as follows. in section 2, the functionality of the ipmac library regarding the generation of the mathematical model of the inverted pendulum on the cart with the force input will be presented and the model will further be adjusted to generate the optimal trajectory for the natural, underactuated pendulum dynamics. in section 3, the paper describes the solution of the bvp with the objective of trajectory generation for pendulum swing-up using the two-degrees-of-freedom control structure. at the end of section 3, the optimal control algorithm for tracking desired trajectories is described, which ensures the minimal deviations of the state-space vector from the generated trajectories. in section 4, the extended functionality of the existing ipmac library is presented, based on the proposed swing-up algorithm. in the end, the obtained results are evaluated and future research directions are outlined. 2. inverted pendulum model generated via ipmac library determining the correct and sufficiently accurate mathematical model is a basic prerequisite for any further analysis of a given physical system. to design advanced control algorithms such as swing-up trajectory planning/tracking for an underactuated system, such a mathematical model is essential [13]. the ipmac library, via the inverted pendula model equation derivator application, allows to derive the mathematical model of an inverted pendulum system in various configurations, considering a given number of links and several types of weights (or no weight) at the end of the pendulum [14]. 
the implemented algorithm uses the lagrange equations of the second kind to derive the model of a selected inverted pendulum system [12]. this method is based on the definition of the mass point coordinates and the determination of kinetic ek and potential ep energy with respect to the defined vector of the figure 1. pendulum on the cart (cart-pole) system generalized coordinates θθθ(t), which is specified for a n-link pendulum as: θθθ(t) = [ θ0(t) θ1(t) · · · θn(t) ] (1) where the θ0(t) represents the position of the cart and the remaining coordinates θ1(t), . . . ,θn(t) represent the angles of the individual pendulum links. lagrangian (lagrange function) for n mass points is defined as [17]: l(θθθ(t),θ̇θθ(t)) = n∑ i=0 eki (θθθ(t),θ̇θθ(t))− n∑ i=0 epi (θθθ(t)) (2) by deriving the lagrange function with respect to time t and generalized coordinates θθθ(t), the resulting motion equations of the system can be obtained in the form: d dt ( ∂l(t) ∂θ̇̇θ̇θ(t) ) − ∂l(t) ∂θθθ(t) + ∂r(t) ∂θ̇̇θ̇θ(t) = qqq∗(t) (3) where r(t) represents the rayleigh dissipative function (friction/damping) and qqq∗(t) is the external generalized force [17]. our specific case is considered to have only one pendulum, as in fig. 1. thus, kinetic ek and potential ep energy is defined for two mass points, for the cart θ0(t) and for the pendulum θ1(t), respectively. we consider the model with a sphere at the end of the pendulum, which is the closest representation of the laboratory model. the external input to the system corresponds to the force f(t) applied on the cart. the resulting nonlinear differential equations have the form of a cart equation: (m + m0 + m1)θ̈0(t) + c cos(θ1(t))(m + m1)θ̈1(t)+ +δ0θ̇0(t) −c sin(θ1(t))θ̇20 (t)(m + m1) = f(t) (4) and the pendulum equation: c(m + m1) cos(θ1(t))θ̈0(t) + j1θ̈1(t)+ +δ1θ̇1(t) −c(m1 + m)g sin(θ1(t)) = 0 (5) with the parameters listed in tab. 1. to find a trajectory for the underactuated part of the system, we declare the acceleration of the cart 459 l. koska, s. jadlovská, d. vošček, a. jadlovská acta polytechnica m0 [kg] weight of the cart m1 [kg] weight of the pendulum m [kg] weight of the sphere l1 [m] length of the pendulum c [m] distance between the centre of gravity (cog) and the point of rotation: c = (m1l1/2 + m(l1 + r)) /(m + m1) r [m] radius of the weight j1 [kgm2] moment of inertia δ0 [kgm2/s] cart friction δ1 [kgm2/s] pendulum friction table 1. parameters of the inverted pendulum on the cart system θ̈0(t) to be the input of the pendulum subsystem (5). hence, we consider the system of equations θ̈0(t) = u(t) (6) θ̈1(t) = c(m + m1) j1 cos(θ1(t))u(t) + δ1 j1 θ̇1(t)− − c(m1 + m)g j1 sin(θ1(t)) (7) where (6) represents the input-output dynamics of the cart [18] and (7) represents the internal dynamics of the system. for the design of desired (nominal) swing-up trajectories [θ∗1 (t) θ̇∗1 (t)] and the control input θ̈∗0 (t) for swing-up, we also have to take into account the following constraints resulting from the laboratory setup: |θ0(t)| < 0.45m, |θ̇0(t)| < 3.5m/s, |θ̈0(t)| < 12m/s2 (8) this concludes the modelling section. 3. trajectory planning using bvp the problem of trajectory planning for the inverted pendulum system involves solving the task of pendulum swing-up from the downward equilibrium position to the upright unstable equilibrium position [19]. let us suppose that the swing-up must be executed for the bounded time interval t ∈ [0,t]. 
the boundary conditions for the swing-up of the considered system from the initial downward to the terminal upward equilibrium are as follows:

θ0(0) = 0, θ̇0(0) = 0, θ1(0) = −π, θ̇1(0) = 0,
θ0(T) = 0, θ̇0(T) = 0, θ1(T) = 0, θ̇1(T) = 0.  (9)

in this paper, the design of the swing-up trajectory for the underactuated system of the inverted pendulum on the cart is based on the solution of the two-point boundary value problem, considering the bounds (8) on the input signal. the input signal represents the required acceleration of the cart θ̈0*(t); hence the obtained solution will be suitable for a subsequent implementation on the laboratory model [9] [16]. the result of the trajectory planning is represented not only by the trajectories of the state variables of the selected system (θ0*(t), θ̇0*(t), θ1*(t), θ̇1*(t)), but also by the trajectory of the adequate input u*(t) = θ̈0*(t) that produces the required behaviour [20].

the task of inverted pendulum swing-up is most commonly implemented in a way where the pendulum is simply required to reach the upright position without regard to the trajectory tracked; once close to the unstable equilibrium, the control law is switched to a stabilizing controller. this approach, known as the hybrid control structure, has already been successfully implemented as a part of the ipmac [14] [16]. rather than this control structure, this paper uses the two-degrees-of-freedom (feedforward/feedback) scheme shown in fig. 2 to ensure the offline generation of the desired trajectories based on a nonlinear feedforward design and their subsequent online tracking by a feedback lq control design. upon the formulation of the bvp assumptions and constraints, the solution can be searched for.

figure 2. feedforward/feedback control structure

3.1. solution of the bvp

from the equation describing the dynamics of the inverted pendulum (5), we obtained a nonlinear second-order differential equation (7) whose input is the acceleration of the cart (6). the system of equations (6)−(7), which is already in input-output normal form, together with the boundary conditions (9), defines a nonlinear two-point bvp for the states θ0(t), θ̇0(t), θ1(t), θ̇1(t) that depends on the input trajectory u(t). using the feedforward control design principle based on inverting the input-output dynamics of the cart (6) and solving the bvp of the internal dynamics of the pendulum (7), a feedforward control input u*(t) = θ̈0*(t) that ensures that the pendulum performs the swing-up is obtained. such a control input u*(t) can be expressed in the form of polynomial series, splines, harmonic functions or other functions that contain free parameters; hence the task transforms into the search for these free parameters [21]. the type of the input series depends on the choice of the expert [18]; in our case, we have chosen the cosine series θ0*(t) = γ(t, σ) with free parameters for the cart position θ0(t) in the form

γ(t, σ) = −(σ1 + σ3) − (σ2 + σ4) cos(πt/T) + Σ_{i=2}^{5} σ_{i−1} cos(iπt/T) .  (10)

the free parameters σ = (σ1, σ2, σ3, σ4) represent the coefficients of the cosine series for the highest frequencies. to evaluate them, we use the function bvp5c from matlab with an initial estimate σ = {0, 0, 0, 0} and the swing-up time set to T = 1.5 s. solving the bvp leads to the set of free parameters σ1 = 0.1087, σ2 = 0.1588, σ3 = −0.0530, σ4 = −0.0611. the desired trajectories θ0*(t), θ̇0*(t) and θ̈0*(t) for the cart that result from this set of parameters are shown in fig. 3, and the desired trajectories satisfying the control objective and the constraints for the pendulum swing-up, θ1*(t) and θ̇1*(t), are shown in fig. 4.

figure 3. the desired position θ0*(t), velocity θ̇0*(t) and acceleration θ̈0*(t) of the cart generated by the solution of the bvp
figure 4. desired trajectory of the pendulum angle θ1*(t) and pendulum angular velocity θ̇1*(t) generated by the solution of the bvp
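for readers without matlab, the same search for the free parameters σ can be posed in scipy as a single-shooting problem: integrate the internal dynamics (7) from the downward equilibrium under u(t) = γ̈(t, σ) and drive the terminal conditions of (9) to zero. this is a variant of, not a transcription of, the authors' bvp5c collocation; the parameter values come from tab. 2, the starting guess is placed near the published σ to aid convergence, and a small regularization residual resolves the remaining freedom (θ̇0(T) = γ̇(T) vanishes identically for the series (10), so it provides no condition).

import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import least_squares

T = 1.5                                        # swing-up time [s]
c, J1, m1m, d1, g = 0.3098, 0.0613, 0.5983, 0.0027, 9.81   # tab. 2 values

def gamma(t, s, d=0):
    """cosine series (10) for theta0*(t) and its d-th time derivative (d = 0, 1, 2)."""
    coef = {1: -(s[1] + s[3]), 2: s[0], 3: s[1], 4: s[2], 5: s[3]}
    out = -(s[0] + s[2]) if d == 0 else 0.0
    for i, a in coef.items():
        w = i * np.pi / T
        if d == 0:
            out += a * np.cos(w * t)
        elif d == 1:
            out -= a * w * np.sin(w * t)
        else:
            out -= a * w * w * np.cos(w * t)
    return out

def pendulum_rhs(t, x, s):
    th1, om1 = x
    u = gamma(t, s, d=2)                       # u(t) = cart acceleration, eq. (6)
    # internal dynamics (7), signs consistent with the pendulum equation (5)
    dom = (-c * m1m * np.cos(th1) * u - d1 * om1 + c * m1m * g * np.sin(th1)) / J1
    return [om1, dom]

def residuals(s):
    sol = solve_ivp(pendulum_rhs, (0.0, T), [-np.pi, 0.0], args=(s,),
                    rtol=1e-8, atol=1e-10)
    th1_T, om1_T = sol.y[:, -1]
    # terminal conditions of (9) plus a small regularization of the parameters
    return [th1_T, om1_T, gamma(T, s)] + list(1e-3 * np.asarray(s))

sol = least_squares(residuals, x0=[0.11, 0.16, -0.05, -0.06])
print(sol.x)                                   # recovered sigma_1..sigma_4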
3.2. design of the control algorithm for tracking the desired trajectories calculated by the bvp

the generated inverted pendulum model (4)−(5) is only appropriate for the trajectory tracking control design if the drive control unit allows a force input on the cart. our system is actuated by means of a linear synchronous motor (lsm), hence it does not satisfy this condition. therefore, we first need to modify the model by replacing equation (4) with a linear approximation of the cart-lsm subsystem to obtain the simulation model of the laboratory system [16]. the state-space vector of the inverted pendulum with the lsm remains unchanged and is of the form

x(t) = ( θ0(t) θ̇0(t) θ1(t) θ̇1(t) )ᵀ ,  (11)

and the same holds for the vector of desired trajectories x*(t) = ( θ0*(t) θ̇0*(t) θ1*(t) θ̇1*(t) )ᵀ. in this paper, we consider the cart-lsm subsystem as a linear system whose input is the desired (reference) velocity θ̇0*(t) of the cart and whose output is the actual velocity θ̇0(t) of the cart. the desired velocity trajectory θ̇0*(t) was obtained in the previous section by means of the cosine series (10). a first-order linear system is proposed in the form

θ̈0(t) + q0 θ̇0(t) = p0 θ̇0*(t) .  (12)

the coefficients q0 and p0 of the linear differential equation (12) were obtained using the arx function from the system identification toolbox, i.e. by performing an experimental identification on the laboratory pendulum system, as in [16]. after combining the approximate cart-lsm representation (12) with the pendulum equation (5), we obtain a modified model of the inverted pendulum system in a form that is suitable for linearization [16].

for the design of the control algorithm which ensures tracking of the desired trajectories obtained by the solution of the bvp, an algorithm based on the time-varying lq principle (feedback control ∆u(t)) is combined with the results of the bvp algorithm (feedforward control u*(t)) in the control structure depicted in fig. 2.
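as a complement to the arx identification mentioned above, the coefficients q0 and p0 of (12) can also be estimated by a plain least-squares regression on a forward-difference discretization. the sketch below does this on synthetic data generated from the tab. 2 values, so it only illustrates the regression itself, not the authors' experiment.

import numpy as np

rng = np.random.default_rng(3)
dt, n = 0.001, 3000
p0_true, q0_true = 31.53, 31.24

# synthetic experiment: random steps of the reference velocity, noisy response
v_ref = np.repeat(rng.uniform(-1.0, 1.0, n // 300), 300)
v = np.zeros(n)
for k in range(n - 1):
    v[k + 1] = v[k] + dt * (-q0_true * v[k] + p0_true * v_ref[k])
v += 0.002 * rng.standard_normal(n)

# forward difference of (12): (v[k+1] - v[k]) / dt = -q0 v[k] + p0 v_ref[k]
dv = np.diff(v) / dt
A = np.column_stack([-v[:-1], v_ref[:-1]])
(q0_hat, p0_hat), *_ = np.linalg.lstsq(A, dv, rcond=None)
print(q0_hat, p0_hat)   # least-squares estimates of q0 and p0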
to implement an optimal feedback time-varying lq control Δu(t), the linearized inverted pendulum model is used, which is characterized by the time-varying system dynamics matrix a(t) and input matrix b(t), given as follows:

a(t) = [ 0    1    0    0
         h3  −h2   0   −h1 q0
         0    0    0    1
         0    0    0   −q0 ]   evaluated at [x*(t), u*(t)]

b(t) = [ 0, h1 p0, 0, p0 ]^t   evaluated at [x*(t), u*(t)]

where:
h1 = c (m1 + m) / j1
h2 = δ1
h3 = c (m1 + m) g sin(θ1(t)) / j1

the proposed control law yields the optimal feedback lq control Δu(t) = −k(t) Δx(t), which minimizes the quadratic criterion:

jlq = ∫ from 0 to ∞ (Δx^t(t) q Δx(t) + Δu^t(t) r Δu(t)) dt (13)

where q ∈ r^(4×4) is a symmetric positive semidefinite matrix, r is a positive constant and Δx(t) = x*(t) − x(t) is the vector of desired trajectory perturbations. the feedback gain k(t) is represented as:

k(t) = r^(−1) b^t(t) p (14)

where p is the solution of the riccati system of algebraic equations corresponding to the given time step:

p b(t) r^(−1) b^t(t) p − p a(t) − a^t(t) p − q = 0 (15)

the feedback gain vector k(t) was computed in each time step of 0.1 second for specific values of the state-space matrices with the built-in matlab function lqr, with the weighting matrices adjusted manually to q = diag(20000, 100, 1000, 220000) and r = 10000. the time behaviour of the feedback gain k(t) components obtained by the simulation is shown in fig. 5.

figure 5. time-varying feedback gains k1(t), k2(t), k3(t), k4(t)
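the gain scheduling described above can be sketched as follows: lqr is evaluated at each update instant along the reference trajectory. the reference vectors tref and th1ref are assumed inputs (e.g. taken from the bvp solution), and the state ordering matches the structure of a(t) and b(t) given above.

```matlab
% sketch of the time-varying lq gain computation; tref/th1ref are an
% assumed reference trajectory from the bvp solution
Q = diag([20000 100 1000 220000]);  R = 10000;
h1 = c*(m1 + m)/j1;  h2 = delta1;   % constants as defined above
tk = 0:0.1:2;                       % gain update instants [s]
K  = zeros(numel(tk), 4);
for i = 1:numel(tk)
    th1 = interp1(tref, th1ref, tk(i), 'linear', 'extrap');
    h3  = c*(m1 + m)*g*sin(th1)/j1; % evaluated along [x*(t), u*(t)]
    A = [0   1   0   0;
         h3 -h2  0  -h1*q0;
         0   0   0   1;
         0   0   0  -q0];
    B = [0; h1*p0; 0; p0];
    K(i,:) = lqr(A, B, Q, R);       % feedback gain k(t_i)
end
```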
4. enhancement of the ipmac library
the ipmac library [12] contains algorithms enabling inverted pendulum control in the hybrid control structure, i.e. by switching between the swing-up and the stabilizing control algorithm. the swing-up manoeuvre has been implemented either via the energy-based methods or via partial feedback linearization, without considering the swing-up trajectory. when the pendulum approaches the upright equilibrium, the switch to the state-feedback stabilizing control algorithm takes place, which has been implemented using methods such as lqr, pole placement or model predictive control [12]. the modification of the ipmac library presented in this paper consists of adding a control structure that uses the generated swing-up trajectories and a trajectory-tracking algorithm to raise the pendulum from the downward to the upward unstable equilibrium position and stabilize it there. the simulation scheme in fig. 7 is used to implement the trajectories calculated offline by the bvp in the control structure depicted in fig. 2. the physical parameters of the laboratory model, necessary for the simulations, are listed in tab. 2. the parameters of the inertia moment j1 and the damping coefficient δ1 were obtained through an experimental identification; the procedure for obtaining these parameters is described in [16].

table 2. numerical parameters of the simulation model of the laboratory pendulum model:

cog   0.3098 [m]        center of gravity
j1    0.0613 [kg m2]    moment of inertia
m     0.5983 [kg]       weight of the pendulum
δ1    0.0027 [kg m2/s]  pendulum friction
p0    31.53             numerator coefficient of the cart subsystem
q0    31.24             denominator coefficient of the cart subsystem

to calculate the swing-up trajectories, two time intervals were selected: t1 = 1.5 s and t2 = 1.8 s. for these settings, the bvp5c function returned two sets of parameters σ. the desired trajectories for the pendulum swing-up based on these σ parameters are shown in fig. 6. the generated trajectories were verified and compared, and the trajectory corresponding to the time interval t = 1.5 s was selected for further steps. based on (10), the parameters σ are used to generate the desired trajectories for the position θ0*(t), velocity θ̇0*(t) and acceleration θ̈0*(t) of the cart. the final trajectories of the cart position θ0*(t) and pendulum angle θ1*(t) are shown in fig. 8 and fig. 9.

figure 6. comparison of the desired cart position trajectories θ0*(t) and pendulum angle θ1*(t) generated by the matlab function bvp5c with different time intervals t1 = 1.5 s, t2 = 1.8 s for swing-up
figure 7. simulation scheme for tracking the desired trajectories using the solution of the bvp and the time-varying lq algorithm
figure 8. tracking of the desired cart position θ0*(t) and the velocity θ̇0*(t) against the laboratory system position θ0(t) and velocity θ̇0(t) for t = 1.5 s
figure 9. tracking of the desired pendulum angle θ1*(t) and the angular velocity θ̇1*(t) against the laboratory system pendulum angle θ1(t) and angular velocity θ̇1(t) for t = 1.5 s

it is shown that the proposed tracking control algorithm with the control law u(t) = u*(t) + Δu(t) and the suitably selected weight matrices q and r exploits the effect of the natural dynamics of the system and ensures that the desired trajectory is tracked with minimal deviations of the actual pendulum trajectory from the swing-up trajectories calculated by the bvp. moreover, the proposed control algorithm reduces the control input of the feedback component against the feedforward component that plays the key role in the swing-up to the upward equilibrium. as suggested by fig. 8 and fig. 9, the proposed enhancing functions for the ipmac library correctly fulfil the stated control objective. it can, therefore, be concluded that the designed control algorithm not only swings up the pendulum but also stabilizes it in the unstable equilibrium position.
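a compact sketch of the tracking loop of fig. 2 is given below. the plant function f, the reference arrays xref and uref and the scheduled gains Kgain are assumed to be available (e.g. from the bvp solution and the lq design above); note that the feedback term is written here with the stabilizing sign convention Δx = x(t) − x*(t).

```matlab
% sketch of the feedforward/feedback control law u = u* + Delta-u;
% xref, uref, Kgain and the plant model f are assumed inputs
dt = 1e-3;  t = 0:dt:2;
x  = [-pi; 0; 0; 0];              % initial state [th1 th1dot th0 th0dot]
for i = 1:numel(t)-1
    xs = xref(:, i);              % desired state x*(t_i)
    us = uref(i);                 % feedforward input u*(t_i)
    Ki = Kgain(i, :);             % gain k(t_i) from the lq design
    u  = us - Ki*(x - xs);        % feedback correction Delta-u
    x  = x + dt*f(x, u);          % explicit euler step of the model
end
```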
5. conclusions
the aim of this paper was to introduce a modification of the swing-up control algorithm based on the feedforward/feedback control scheme and to verify it on the simulation model of a cart-pendulum system, which is part of the laboratory model multipurpose workplace for nondestructive diagnostics with a linear synchronous motor. the control algorithm modification was included in our proprietary matlab/simulink inverted pendula modeling and control library. first of all, the mathematical model of the single inverted pendulum system with an attached weight was generated via a gui application, which is part of the ipmac library. considering the cart acceleration as the pendulum subsystem input together with the constraints resulting from the laboratory setup defines a two-point nonlinear boundary value problem with free parameters as the coefficients of the input cosine series. the solution of the boundary value problem is represented by the desired trajectories for the state-space variables of the system during the swing-up phase. simulations have confirmed that, based on the calculated set of parameters of the input cosine series, a suitable feedforward component signal u*(t) was generated. before designing the control algorithm for the trajectory tracking, the model was modified from its generated form to mirror the actual laboratory implementation, using the linear approximation of the motor-cart subsystem. after the simulation model of the laboratory system was obtained, the control signal u*(t) was able to steer the pendulum system from the downward equilibrium position to the upward, unstable equilibrium position. tracking of the generated trajectories and the subsequent stabilization in the unstable equilibrium were ensured by means of a time-varying lq control. an actual implementation of the designed control algorithm into the laboratory model will require the hardware configuration of the model to be enhanced by a suitable embedded system. the trajectory planning problem considered in this paper is not only a relevant control objective for benchmarking underactuated systems, but it can also be applied in gait generation for bipedal or multi-legged robotic systems, with several different trajectories stored in the motion database and selected as the most appropriate for the current conditions. our research will further focus on the verification of this approach for more complicated underactuated systems such as the compass gait and other benchmarks for a passive legged locomotion.

acknowledgements
this work has been supported by the grant kega implementation of research results in the area of modelling and simulation of cyber-physical systems into the teaching process: development of modern university textbooks, 072tuke-4/2018 (100%).

references
[1] r. tedrake. underactuated robotics: learning, planning, and control for efficient and agile machines. course notes for mit 6.832, working draft edition, p. 3, 2009.
[2] o. boubaker. the inverted pendulum: history and survey of open and current problems in control theory and robotics. the inverted pendulum in control theory and robotics: from theory to new innovations 111:1, 2017.
[3] y. wang, k. zhu, b. chen, h. wu. a new model-free trajectory tracking control for robot manipulators. mathematical problems in engineering 2018, 2018.
[4] i. kafetzis, l. moysis. inverted pendulum: a system with innumerable applications. school of mathematical sciences, aristotle university of thessaloniki, 2017.
[5] o. boubaker. the inverted pendulum benchmark in nonlinear control theory: a survey. international journal of advanced robotic systems 10(5):233, 2013.
[6] k. j. åström, k. furuta. swinging up a pendulum by energy control. automatica 36(2):287–295, 2000. doi:10.1016/s0005-1098(99)00140-5.
[7] m. w. spong. the swing up control problem for the acrobot. ieee control systems magazine 15(1):49–55, 1995. doi:10.1109/robot.1994.350934.
[8] i. fantoni, r. lozano. non-linear control for underactuated mechanical systems. springer science & business media, 2001.
[9] k. graichen, m. treuer, m. zeitz. swing-up of the double pendulum on a cart by feedforward and feedback control with experimental validation. automatica 43(1):63–71, 2007. doi:10.1016/j.automatica.2006.07.023.
[10] k. graichen, m. treuer, m. zeitz. swing-up of the double pendulum on a cart by feedforward and feedback control with experimental validation. automatica 43(1):63–71, 2007.
[11] t. glück, a. eder, a. kugi. swing-up control of a triple pendulum on a cart with experimental validation. automatica 49(3):801–808, 2013. doi:10.1016/j.automatica.2012.12.006.
[12] s. jadlovská, j. sarnovský. modelling of classical and rotary inverted pendulum systems: a generalized approach. journal of electrical engineering 64(1):12–19, 2013. doi:10.2478/jee-2013-0002.
[13] s. jadlovská, j. sarnovský, j. vojtek, d. vošček. advanced generalized modelling of classical inverted pendulum systems. in emergent trends in robotics and intelligent systems, pp. 255–264. springer, 2015. doi:10.1007/978-3-319-10783-7_28.
[14] s. jadlovská, l. koska, m. kentoš. matlab-based tools for modelling and control of underactuated mechanical systems. transactions on electrical engineering 6(3):56–61, 2017. doi:10.14311/tee.2017.3.056.
[15] s. jadlovska, j. sarnovsky. a complex overview of modeling and control of the rotary single inverted pendulum system. advances in electrical and electronic engineering 11(2):73–85, 2013.
[16] a. jadlovská, s. jadlovská, d. vošček. cyber-physical system implementation into the distributed control system. ifac-papersonline 49(25):31–36, 2016. 14th ifac conference on programmable devices and embedded systems pdes 2016. doi:10.1016/j.ifacol.2016.12.006.
[17] r. p. feynman, r. b. leighton, m. sands. the feynman lectures on physics, vol. i: the new millennium edition: mainly mechanics, radiation, and heat. basic books, 2011.
[18] s. ozana, m. schlegel. computation of reference trajectories for inverted pendulum with the use of two-point bvp with free parameters. ifac-papersonline 51(6):408–413, 2018. 15th ifac conference on programmable devices and embedded systems pdes 2018. doi:10.1016/j.ifacol.2018.07.119.
[19] s. ozana, m. pies, r. hajovsky. computation of swing-up signal for inverted pendulum using dynamic optimization. in ifip international conference on computer information systems and industrial management, pp. 301–314. springer, 2014. doi:10.1007/978-3-662-45237-0_29.
[20] p. boscariol, d. richiedei. robust point-to-point trajectory planning for nonlinear underactuated systems: theory and experimental assessment. robotics and computer-integrated manufacturing 50:256–265, 2018. doi:10.1016/j.rcim.2017.10.001.
[21] k. graichen, m. zeitz. feedforward control design for nonlinear systems under input constraints, pp. 235–252. springer berlin heidelberg, berlin, heidelberg, 2005. doi:10.1007/11529798_15.
simulation and analysis of magnetisation characteristics of interior permanent magnet motors
j. a. walker, c. cossar, t. j. e. miller

modern permanent magnet (pm) synchronous brushless machines often have magnetic circuits in which the patterns of saturation are complex and highly variable with the position of the rotor. the classical phasor diagram theory of operation relies on the assumption of sinusoidal variation of flux-linkage with rotor position, and neglects the non-linear effects that arise in different operating states. the finite element method is a useful tool for detailed magnetic analysis, but it is important to verify simulation results by direct measurement of the magnetic characteristics of the motor, in terms of "magnetisation curves" of current and flux-linkage. this paper presents results from finite element simulations to determine the magnetisation in a split-phase interior permanent magnet (ipm) motor. investigation has been made to determine the effects of the rotor geometry on the synchronous reactances and airgap flux distribution. comparisons are made with a second ipm motor with a different rotor configuration.
keywords: permanent magnet, finite element method, flux-linkage measurement, rotor bridges.

notation and units
e: voltage associated with the permanent magnets [v]
e0: open circuit magnet voltage [v]
ψ1: fundamental flux-linkage associated with b1 [v-s]
b1: peak value of fundamental airgap flux density [t]
d: bore diameter [m]
lstk: stack length [m]
nph: number of phases
kw1: fundamental winding factor
p: number of pole pairs
xq: quadrature axis synchronous reactance [Ω]
xd: direct axis synchronous reactance [Ω]
r: phase resistance [Ω]
ω: angular frequency [rad/sec]
i: phase current [a]
i: instantaneous current [a]
iq: quadrature axis current component [a]
id: direct axis current component [a]
ra,b: non-inductive resistance [Ω]
rvar: variable resistance [Ω]
rm: winding resistance [Ω]
lm: winding inductance [h]
ψ: flux-linkage due to current from wheatstone bridge [v-s]

1 introduction
the permanent magnet synchronous motor (pmsm) has risen in prominence owing to its comparatively high efficiency and torque per volume ratio. the motor is salient-pole and highly saturable. the rotor may have interior rather than surface-mounted magnets, and may include a cage for starting. the saturation of the magnetic circuit varies with rotor position, resulting in localised effects. the operation of the motor can be analysed using the phasor diagram method, which transforms the phase currents and flux-linkages into direct (polar) and quadrature (interpolar) axis components. the direct axis (d-axis) flux-linkage can be split into two contributions, one from the current and one from the permanent magnets. the emf associated with the magnets is denoted by the resultant emf, e. there is no flux-linkage contribution from the permanent magnets on the quadrature axis (q-axis). it is not possible to measure the emf associated with the magnets with current flowing in the winding, and so it is necessary to assume that it remains constant at the open circuit value e0, irrespective of loading. as the diagram is based on phasor quantities, it can only be used to calculate sine-wound motors driven by sinusoidal voltages and currents. in cases where the winding distribution is non-sinusoidal, or where the excitation waveforms are non-sinusoidal, it is useful to analyse the motor using finite element (fe) software. it is not possible to separate the total flux-linkage calculated using finite elements without resorting to superposition, which cannot be considered valid in the case of non-linear magnetic circuits. regardless of the method used, it is important to verify the results by measurement. the amount of magnet flux crossing the airgap is heavily dependent on the rotor design.
the rotor slots are sometimes fully enclosed by bridge sections, which lowers the noise or harmonic content in the airgap flux distribution. the bridges quickly saturate and create magnetic short circuits within the rotor, contributing significantly to the levels of leakage flux and thus reducing the amount of flux crossing the airgap. the rotor slots can be designed so as to provide lower harmonic content in the flux density whilst limiting leakage flux.

2 simulation of magnetic characteristics using finite elements
the motor cross sections can be modelled by finite elements, as shown in fig. 1 [1]. single load point simulations can be run to determine the airgap flux density distribution when the phase axis is aligned with the direct and quadrature axis rotor positions, for increasing load current. the direct and quadrature axis synchronous reactances are calculated from the fundamental component of the airgap flux density. the peak fundamental flux-linkage is given by (1), and the rms synchronous reactances can then be calculated from (2) and (3):

ψ1 = b1 d lstk nph kw1 / p [v-s] (1)
xd = (ω ψ1 / √2 − e0) / id [Ω] (2)
xq = ω ψ1 / (√2 iq) [Ω] (3)

according to (2), the direct axis synchronous reactance calculation requires separation of the current component of flux-linkage from the magnet component. it is, therefore, once again convenient to assume that the open circuit magnet flux is constant and independent of loading. the properties of the permanent magnets can be matched in the finite element simulations by comparing simulated open circuit back emf waveforms to experimental results. the static magnetisation curves of the motor represent the variation of flux-linkage with current at successive rotor positions and can be represented in terms of either the direct and quadrature axes or phase quantities. by a minor alteration of a scripting routine in the finite element software, it is possible to calculate the flux-linkages at incremental rotor positions with constant current in the winding.
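to make the use of (1)-(3) concrete, the following sketch evaluates them numerically. the geometry is taken from appendix 1 for test motor 1, while b1, kw1, e0 and the current components are placeholder values standing in for fe results; nph is interpreted here as the series turns per phase (970), which is an assumption.

```matlab
% numeric sketch of equations (1)-(3); b1, kw1, e0, id, iq are
% placeholders for fe results, not values reported in the paper
b1 = 0.25;                    % assumed fundamental airgap density [t]
d = 2*31.72e-3 + 2*0.28e-3;   % bore diameter: rotor od + 2*airgap [m]
lstk = 39e-3;  nph = 970;     % stack length; turns/phase (assumption)
kw1 = 0.9;  p = 1;            % assumed winding factor; 2 poles -> p = 1
w = 2*pi*50;                  % angular frequency at 50 hz [rad/s]
e0 = 220;  id = 1.0;  iq = 1.0;

psi1 = b1*d*lstk*nph*kw1/p;   % (1) peak fundamental flux-linkage [v-s]
xd = (w*psi1/sqrt(2) - e0)/id;% (2) d-axis synchronous reactance [ohm]
xq = (w*psi1/sqrt(2))/iq;     % (3) q-axis synchronous reactance [ohm]
```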
the flux-linkage is calculated from the magnetic vector potential in each of the stator slots, and so the direct axis value includes the flux-linkage contributions from both the direct axis current and the permanent magnets.

3 verification of simulation results by measurement
the testing of ipm motors necessarily differs from that of wound-field synchronous machines, due to the permanent excitation resulting from the magnets. for wound-field machines, the synchronous reactances are measured from open circuit saturation tests and short-circuit tests in accordance with ieee standard 115-1995 [2]. the method for determining synchronous reactances, independently discovered by jones and el-kharashi [3, 4], was first applied to permanent magnet motors by miller [7]. the phase of the motor to be tested is connected into one leg of a wheatstone bridge circuit, as in fig. 2. the resistance rm and inductance lm represent the winding under test. the variable resistor rvar is adjusted so that when the switch is open there is no voltage across the centre of the bridge. the switch is initially closed, to allow a dc current idc to flow through the bridge circuit. when the switch is then opened, the current through the inductor decays from idc to zero. during this transient period, the voltage across the centre of the bridge is given by (4). the voltage waveform is stored in a digital storage oscilloscope (dso). by integrating this voltage with respect to time, the flux-linkage for the given level of current can be found. if the bridge is balanced and the resistor ratios have been selected such that ra = rb, then the inductance of the winding is given by (5). from this, the synchronous reactances can be determined (6). the synchronous reactances will vary as a function of load.

v = vb − vvar = (rvar / (rvar + rm)) lm (di/dt) [v] (4)
lm = 2ψ / idc [h] (5)
x = ω lm [Ω] (6)

the direct measurement of magnetisation curves in switched reluctance motors using locked rotor tests with pulsed voltage waveforms is described by miller [5]. the bridge circuit used for measurement of the synchronous reactances can be incorporated into the locked rotor test rig. however, the flux-linkage calculated by integration of the instantaneous voltage is due to the winding current only and does not include any contribution from the permanent magnets. it is commonly assumed that the flux-linkage contribution from the permanent magnets is independent of current and varies only with rotor position. under this assumption, the contribution from the magnets can be calculated from integration of the open circuit back emf waveform.

fig. 1: finite element plots: a) test motor 1, b) test motor 2
fig. 2: wheatstone bridge circuit for measurement of synchronous reactances
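the data reduction implied by (4)-(6) is short enough to sketch directly: the decay voltage recorded by the dso is integrated to give the flux-linkage, from which the inductance and reactance follow. the vectors tv and v below are assumed oscilloscope recordings, not data from the paper.

```matlab
% sketch of the bridge-test data reduction (4)-(6); tv, v are assumed
% recordings of the decay transient from the dso
idc = 1.0;                  % dc current before the switch opens [a]
w = 2*pi*50;                % supply angular frequency [rad/s]
psi = abs(trapz(tv, v));    % flux-linkage = integral of v dt [v-s]
lm  = 2*psi/idc;            % (5) winding inductance, with ra = rb [h]
x   = w*lm;                 % (6) synchronous reactance [ohm]
```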
the flux-linkage due to current at the new rotor position is easily measured using the bridge circuit. subtraction of this value from the total flux-linkage leaves the magnet flux-linkage contribution at the new rotor position, which if the magnet flux is independent of current, will equal the open circuit magnet flux-linkage at that rotor angle. taking the starting point for the measurements as the quadrature axis, there will be no flux-linkage contribution from the permanent magnets and so the q-axis magnetisation curve can be determined solely from the wheatstone bridge circuit. using the rotational test method, the total flux-linkage at each successive rotor position will be the sum of the flux-linkage at the previous rotor position and the change in flux-linkage measured during rotation. in this way, the complete set of magnetisation curves can be measured without any assumption of the magnet flux-linkage. measured magnetisation curves for test motor 1 have been compared with simulated results, shown as dashed lines, in fig. 3. 4 analysis of magnetisation characteristics simulations have been run on two test motors. parameter information is given in appendix 1. simulation results from the first test motor have been compared with measured values for verification. the flux-linkages due to current of motor 1, a split-phase, 2 pole ipm motor are shown in fig. 4. the quadrature axis synchronous reactance is higher than that of the direct axis, due to presaturation of the direct axis from the permanent magnets. there is greater variation in the quadrature axis synchronous reactance because the slope of the magnetisation curve is steeper in the q-axis operating region than in the d-axis region. there is a difference in the direct axis static inductance levels between magnetising and demagnetising currents, caused by presaturation of the magnetic circuit by the permanent magnets. the operating point of the motor is shifted high up the linear region of the material saturation characteristic, so that the introduction of magnetising current will shift the operating point into saturation. a demagnetising current produces magnetic flux in opposition to that of the permanent magnets, shifting the operating point further down the linear region of the curve. because the slope in the linear region is steeper than in the saturation region, the demagnetising synchronous reactance will be larger for a given current magnitude. the permanent magnets have no effect on the quadrature axis saturation levels. the synchronous reactance is the same for both magnetising and demagnetising current. © czech technical university publishing house http://ctn.cvut.cz/ap/ 27 czech technical university in prague acta polytechnica vol. 45 no. 4/2005 -0.60 -0.40 -0.20 0.00 0.20 0.40 0.60 0.80 1.00 1.20 1.40 0 0.25 0.5 0.75 1 1.25 1.5 1.75 2 2.25 fl u x -l in k a g e (v o lt -s e c s ) measured fea d -d q phase current (amps) fig. 3: comparison between measured and fe-simulated magnetisation curves 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 0 0.5 1 1.5 2 2.5 fl u x -l in k a g e (v o lt -s e c s ) d-axis magnetising d-axis demagnetising q-axis idc (amps) fig. 4: test motor 1 flux-linkages due to current a number of papers discuss the pre-saturation of the magnetic circuit by the permanent magnets. [7] suggests that the direct axis flux-linkage will initially be the same for both magnetising and demagnetising currents. 
a number of papers discuss the pre-saturation of the magnetic circuit by the permanent magnets. [7] suggests that the direct axis flux-linkage will initially be the same for both magnetising and demagnetising currents. when the demagnetising current reaches a sufficient level to saturate the rotor bridge areas in the opposite direction, there will then be a step increase in flux-linkage, creating a difference between the magnetising and demagnetising flux-linkages that will remain as the current increases further. the explanation given in [7] is specific to the geometry of the rotor tested, and the true nature of the synchronous reactances is, in fact, slightly different. the change between the magnetising and demagnetising d-axis flux-linkage occurs as the saturation of the bridges is initially neutralised, not as it is reversed. the demagnetising current creates a flux in the rotor bridges that opposes the direction of the flux created by the permanent magnets. when the current is sufficiently high, these two components of flux will be of equal magnitude. at this point, both components of flux will flow across the airgap rather than through the bridges, resulting in the step change noted in [7, 8]. the step change results in an initial difference between the demagnetising and magnetising flux-linkages that will gradually decrease as the level of current is increased, due to saturation of the rotor bridges in the opposing direction. this phenomenon is not immediately obvious from either the measured or simulated results of test motor 1, due to the construction of the rotor. the motor tested in [7] has solid rotor bridges; the slots are fully enclosed. in test motor 1, the slots are partially open. it is the difference in bridge permeability that affects the synchronous reactances. whereas in the motor used in [7] the bridge areas are saturated, the bridge areas in test motor 1 act as an extension of the airgap. most flux flows across the airgap to the stator rather than between adjacent rotor bars, and the difference in synchronous reactances is gradual rather than a step change.

fig. 5: results of flux-linkage simulations for test motor 1 with remodelled rotor bridges
fig. 6: reliance motor with original geometry showing significant change in flux-linkage due to current
fig. 7: reliance motor with remodelled geometry showing no step change in flux-linkage due to current

test motor 1 was remodelled with the rotor bridges specified as the same material as the bars; the width of the bridges is such that there is a significant amount of leakage flux. the resulting flux-linkages are shown in fig. 5. a second test motor, similar to that used in [7], has also been simulated. fig. 6 shows the simulated values of flux-linkage due to current for the original geometry. when the bridges are removed to create open rotor slots, there is no longer a significant change in the flux-linkage, as shown in fig. 7. the initial level of saturation in the bridges is decreased if the width is increased, thereby reducing the difference in flux-linkage levels for positive and negative currents.
5 dependence of magnetisation characteristics on rotor bridge design
fuller investigation into the effects of the rotor bridges has been carried out using the fe software. test motor 1 has been modelled with four different bridge configurations: with the original open rotor slots (bridge areas are air) and with three different thicknesses of bridges of the same material as the rotor bars. fig. 8 shows the results of synchronous reactance simulations for each configuration. test motor 2 has been modelled with the original rotor design (rotor bridges are the same material as the rotor bars), with the bridges at half the original thickness and with open rotor slots. the synchronous reactance simulation results are shown in fig. 9.

fig. 8: synchronous reactance simulation results for different bridge types (test motor 1)
fig. 9: synchronous reactance simulation results for different bridge types (test motor 2)

the effect of the rotor bridges is to limit the levels of harmonics in the airgap flux density waveform. figs. 10 and 11 show the open circuit airgap flux density distributions for test motors 1 and 2, respectively. for both test motors, the airgap flux density waveforms with the highest harmonic content are those where the rotor slots are open. introducing magnetic bridge sections to close the slots reduces the harmonic content of the waveforms, but also decreases the levels of airgap flux density, as the bridges act as leakage paths for the flux.

fig. 10: open circuit airgap flux density distribution of test motor 1
fig. 11: open circuit airgap flux density distribution of test motor 2

table 1: comparison between partially open and fully closed rotor slots, showing significant increases in d-axis reactance and decreases in airgap flux density when bridge thickness is increased

             test motor 1                   test motor 2
parameter    0.1 mm   0.25 mm   0.5 mm     0.25 mm   0.51 mm
xq           +2 %     +4 %      +8 %       +0.7 %    +1.3 %
xd           +14 %    +36 %     +63 %      +8 %      +19 %
bgap (oc)    −9 %     −13 %     −17 %      −2 %      −4 %
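as an aside, the harmonic content discussed above can be quantified from the fe waveforms with a discrete fourier transform; the sketch below assumes a vector bgap of flux density samples spanning exactly one electrical period.

```matlab
% sketch of extracting the fundamental and a harmonic-content measure
% from an assumed airgap flux density waveform bgap (one period)
n = numel(bgap);
c = fft(bgap)/n;
b1  = 2*abs(c(2));                    % peak of the fundamental [t]
bh  = 2*abs(c(3:floor(n/2)));         % higher harmonic amplitudes
thd = sqrt(sum(bh.^2))/b1;            % total harmonic distortion
```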
the effects of adding the magnetic bridge sections are greater in the direct axis than the quadrature, because the rotor bars lying on the direct axis form a path for the magnet flux, whereas there is assumed to be no flux-linkage contribution from the magnet flux on the quadrature axis. the effects of closing the rotor slots with magnetic bridge material can also been seen in the magnetisation curves of the test motors. at low current levels, the magnetic circuit is dominated by the flux contribution from the permanent magnets and so the total flux-linkage will be less for the rotor designs with magnetic bridges than for those with open slots. the degree to which the flux-linkage is reduced is dependent on the thickness of the bridges, but also on the rotor position. the quadrature axis magnetisation curve has no flux-linkage contribution from the magnet and so the difference will be minimal. on the direct axis the magnet contribution is greatest and so the difference in flux-linkage levels will be greatest. as the level of current increases, the difference in flux-linkage contributions from the current also increases. when the current reaches a certain level, the difference in flux-linkage will be greater than the difference in magnet flux crossing the airgap, and so the rotors with magnetic bridges will eventually produce more flux-linkage than those with open slots, as in fig. 12. 6 conclusions the results from the finite element analysis show that the magnetisation characteristics of the permanent magnet motors are highly dependent on the rotor bridge design. the use of magnetic rotor bridges is advantageous in reducing harmonics in the flux density distribution and the addition of the bridge sections can lead to increases in flux-linkage at high current levels. however, designs incorporating closed rotor slots have been shown to increase leakage flux. a compromise must therefore be reached whereby the dimensions of the bridge sections reduce the harmonic levels in the flux density distribution and the leakage flux is kept to a reasonable level. 7 acknowledgments the authors acknowledge the support of the speed consortium. j. a. walker is funded by the uk engineering and physical sciences research council, robert bosch gmbh and the speed consortium. thanks are given to electrolux compressors, for supply of test motors, to jimmy kelly and wilson macdougall for help with construction of the test rigs and to dr. mircea popescu for many useful discussions. references [1] miller, t. j. e., mcgilp, m. i: pc-fea version 3.0/5.0 user’s manual. glasgow: speed laboratory, may 2002. [2] ieee guide: test procedures for synchronous machines. ieee standard 115-1995 new york: institute of electrical and electronic engineers, inc. 1995. [3] jones, c. v.: the unified theory of electrical machines. london: butterworths, 1967. [4] prescott, j. c., el-kharashi, a. k.: “a method of measuring self-inductances applicable to large electrical machines.” proceedings of the institution of electrical engineers 106 (part a), 1959, p. 169–173. [5] miller, t. j. e.: switched reluctance motors and their control. oxford: clarendon press, 1993. [6] miller, t. j. e. et al.: “calculating the interior permanent-magnet motor.” conference record of the ieee international electric machines and drives conference 2, 2003, p. 1181–1187. [7] miller, t. j. 
e.: “methods for testing permanent magnet polyphase ac motors.” conference record of the © czech technical university publishing house http://ctn.cvut.cz/ap/ 31 czech technical university in prague acta polytechnica vol. 45 no. 4/2005 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0 1 2 3 4 5 6 7 8 fl u x -l in k a g e (v o lt -s e c s ) air bridges 0.51mm bridges 0.25mm bridges idc (amps) fig. 12: direct and quadrature axis magnetisation curves for one phase of test motor 2 ieee industry applications society annual meeting, 1981, p. 494–499. [8] rahman, m. a., zhou, p.: “analysis of brushless permanent magnet synchronous motors.” ieee transactions on industrial electronics. vol. 43 (1996), p. 256–267. jill alison walker e-mail: jwalker@elec.gla.ac.uk calum cossar t. j. e. miller speed laboratory, university of glasgow rankine building oakfield avenue glasgow g12 8lt, scotland, uk 32 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 45 no. 4/2005 czech technical university in prague parameter test motor 1 test motor 2 stator lamination shape circular, chamfered edges circular stack length 39 mm 95.25 mm shaft radius 9.5 mm 15.75 mm rotor outer radius 31.72 mm 46.4 mm airgap length 0.28 mm 0.32 mm stator outer radius 64 mm 77.22 mm magnet thickness 5.8 mm 6.35 mm no. of poles 2 4 no. of rotor bars 28 44 no. of stator slots 24 36 rated voltage 220 v, 50 hz 230 v, 60 hz turns/ phase 970 168 winding configuration custom sine-distributed lap appendix 1: test motor parameters acta polytechnica https://doi.org/10.14311/ap.2021.61.0740 acta polytechnica 61(6):740–748, 2021 © 2021 the author(s). licensed under a cc-by 4.0 licence published by the czech technical university in prague determination of bond model for 7-wire strands in pretensioned concrete beam vadzim parkhats∗, rafał krzywoń, jacek hulimka, jan kubica silesian university of technology, faculty of civil engineering, department of structural engineering, akademicka 5, 44-100 gliwice, poland ∗ corresponding author: vadzpar036@student.polsl.pl abstract. a correct choice of a bond model for prestressing tendons is crucial for the right modelling of a structural behaviour of a pretensioned concrete structure. the aim of this paper is the determination of an optimal bond model for 7-wire strands in a prestressed concrete beam produced in a precast concrete plant of consolis poland. atena 3d is used to develop finite element models of the beam that differ only in a bond stress-slip relationship of tendons. the bond stress-slip relationships for modelling are taken from the results of bond tests carried out by different researchers in previous years. moreover, for comparison purposes, a simplified 2d model of the beam is created in autodesk robot. the strain distribution at the time of the strand release is found for each of the finite element models. the determined strain distributions are compared with the strain distribution in the beam established by an experimental test using a measuring system based on a digital image correlation. on the basis of the comparison results, the most appropriate bond models for 7-wire strands used in the beam are identified. keywords: bond stress-slip relationship, digital image correlation, end zone, prestressed concrete, pretensioned concrete beam, strand release. 1. introduction as is known, the prestressing force is transferred to the concrete of a pretensioned member by the bond between the concrete and the prestressing steel [1]. 
therefore, an accurate definition of a bond model for prestressing tendons in finite element modelling of pretensioned concrete structures is key to the determination of correct results. unfortunately, a bond stress-slip relationship for a prestressing tendon is omitted from most design standards. model code 2010 [2] contains bond models for ribbed and plain reinforcing bars, but not for prestressing strands.

digital image correlation (dic) is a non-contact optical technique for measuring strain and displacement [3]. in recent years, this measurement technique has been used more and more often in various fields of study: civil engineering [4], applied mechanics [5], biology [6], aerospace engineering [7] and others [8, 9]. in particular, it should be noted that the dic technique is nowadays used in the research of prestressed concrete [10–15].

in this paper, a review of bond models for tendons found in the literature is done. moreover, an experimental test carried out with the help of a dic measurement system and consisting in a determination of the strain distribution in the end zone of a pretensioned concrete beam at the time of the strand release is described. the bond models found in the literature are applied in the finite element modelling of the beam used in the experimental test to establish the most appropriate ones. the appropriateness of the bond models is evaluated by the comparison of the strain distributions determined in specific points on the side surface of the beam by means of dic measurements and finite element modelling. the best fitting bond models will be additionally verified on other pretensioned concrete structures of the same manufacturer in the future. furthermore, it is also planned to carry out bond tests for deducing our own bond model for the used 7-wire strands.

2. literature review of bond models for strands
bond models for 7-wire strands found in the literature are presented in this section. bond models for other types of tendons are omitted, since only 7-wire strands were used in the beam analysed in the experimental test. balazs [16] presented the bond stress-slip relationship (1), based on the results of pull-out tests with 7-wire strands with a diameter of 12.8 mm. the specified concrete strength at transfer was f′ci = 40 mpa.

τ = ψ c (f′ci)^0.5 (s/db)^b (1)

where: τ is the bond stress [mpa]; s is the slip [m]; db is the strand diameter [m]; ψ is a factor [–] for the upper bound (ψ0.95 = 1.35), the mean value (ψm = 1.00) and the lower bound (ψ0.05 = 0.65) of bond stresses; c and b are experimental constants; for db = 12.8 mm: c = 2.055 mpa^0.5 and b = 0.25.

oh et al. [17] carried out bond tests for 12.7 mm and 15.2 mm strands using concrete with the strength f′ci = 32.71–35.50 mpa. the following equation was obtained:

τ = c (s/db)^b (2)

where: c and b are experimental constants; for db = 12.7 mm: c = 13.787 mpa and b = 0.3301; for db = 15.2 mm: c = 9.331 mpa and b = 0.2688.

figure 1. bond stress-slip relationship [20]: (a) for ribbed bars in the case of a pull-out failure according to model code 2010, (b) for 7-wire smooth strands according to the equations 8–12.
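the two power-law models above are straightforward to evaluate; the sketch below plots them over a slip range using the constants quoted in the text (the slip range itself is arbitrary).

```matlab
% sketch of the power-law bond models (1) and (2); the slip range is
% arbitrary, the constants are those quoted above
s   = linspace(1e-6, 2e-3, 200);           % slip [m]
fci = 40;                                  % balazs: strength at transfer
tau_balazs = 1.00*2.055*sqrt(fci)*(s/12.8e-3).^0.25;  % (1), psi_m = 1.00
tau_oh     = 9.331*(s/15.2e-3).^0.2688;               % (2), db = 15.2 mm
plot(s*1e3, tau_balazs, s*1e3, tau_oh)
xlabel('slip [mm]'), ylabel('bond stress [MPa]')
legend('balazs (1)', 'oh et al. (2)')
```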
lim et al. [18] presented a bond stress-slip relationship obtained by a measurement of strains in pretensioned members with the help of strain gauges. their relationship for 15.2 mm strands is identical to the equation 1 with ψ = 1.00, but the values of the experimental constants are different: c = 10.7 mpa^0.5 and b = 0.27.

orr et al. [19] carried out pull-out tests for both unstressed and stressed 15.2 mm strands. the specified concrete strength at the transfer was f′ci = 54.2 mpa. their model is based on the bond stress-slip relationship (3)–(6) for ribbed bars in the case of a pull-out failure presented in model code 2010 [2] (figure 1a):

τ = τmax (s/s1)^α for 0 ≤ s ≤ s1 (3)
τ = τmax for s1 ≤ s ≤ s2 (4)
τ = τmax − (τmax − τf)(s − s2)/(s3 − s2) for s2 ≤ s ≤ s3 (5)
τ = τf for s > s3 (6)

the following values of the parameters of their model were proposed: α = 0.5; s1 = s2 = s3 = 0.1 mm for stressed strands and s1 = s2 = s3 = 2 mm for unstressed strands. the equation 7 is stated for the determination of the bond stress:

τmax = τf = δ1 δ2 0.70 (fcm)^0.5 (7)

where: δ1 accounts for the reduction in the bonded perimeter in specimens with reduced cover [–]; δ2 accounts for the confinement from cover or transverse reinforcement [–]; fcm is the mean concrete cylinder strength of the specimen [mpa].

as opposed to the above-mentioned researchers, khalaf and huang [20] developed an analytical bond model for both 3-wire and 7-wire strands. it was validated by a comparison of its results and the results of experimental tests. the model considers the surface condition of a strand, the geometry and the number of wires, the concrete parameters, and the influence of elevated temperatures. the model is based on the model code 2010 relationship (eq. 3–6), but is modified in the case of 7-wire smooth strands (figure 1b):

τ = τmax (s/s1)^α for 0 ≤ s ≤ s1 (8)
τ = τmax − (τmax − τ2)(s − s1)/(s2 − s1) for s1 ≤ s ≤ s2 (9)
τ = τ2 + (τmax − τ2)(s − s2)/(s3 − s2) for s2 ≤ s ≤ s3 (10)
τ = τmax − (τmax − τ4)(s − s3)/(s4 − s3) for s3 ≤ s ≤ s4 (11)
τ = τf for s > s4 (12)

the following parameters are considered in the case of 7-wire smooth strands: s1 = 0.25 mm; s2 = 0.5 mm; s3 = 3.5 mm; s4 = 8 mm; τ2 = 0.75 τmax; τ4 = 0.35 τmax. the bond stress is calculated in accordance with the equation 13:

τmax = tb / ab (13)

where: tb is the maximum bond force [mn] found according to the equation 15; ab is the contact area between the strand and the concrete [m2]:

ab = π db lb (14)

where lb is the embedded length of the strand [m].

tb = [µ vc dw lw n + 0.6 π dw lw n (c′ + µ σn)] / cos θ (15)

where: dw, lw, n, θ = 9° are the diameter [m], the length [m], the number [–], and the pitch angle of the outer wires, respectively; µ is the coefficient of friction between the concrete and the steel [–]; c′ is the cohesion between the concrete and the steel [–]; for 7-wire smooth strands, µ = 0.4 and c′ = 1.3; σn is the normal stress perpendicular to the strand axis [mpa]; vc is the shear strength of the shear keys in the concrete mass [mpa]; vc should not be greater than 0.2 f′c, where f′c is the concrete compressive strength [mpa]; for the pull-out bond (σn = 0), vc is calculated according to the equation 16:

vc = ft [f′c/ft + 2 − 2 (1 + f′c/ft)^0.5]^0.5 (16)

where ft is the concrete tensile strength [mpa].
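the piecewise law (8)-(12) can be packaged as a small function, sketched below; τmax would come from (13)-(16). the exponent α is an assumption here, since its value for the 7-wire curve is not quoted in this section (α = 0.5 is the value used by orr et al. for the model code shape), and the residual stress after s4 is taken equal to τ4.

```matlab
function tau = bond7wire(s, tau_max)
% sketch of the piecewise relationship (8)-(12) for 7-wire smooth
% strands; alpha and the residual stress tau_f = tau4 are assumptions
s1 = 0.25e-3; s2 = 0.5e-3; s3 = 3.5e-3; s4 = 8e-3;   % slips [m]
t2 = 0.75*tau_max;  t4 = 0.35*tau_max;  alpha = 0.5; % alpha assumed
if s <= s1
    tau = tau_max*(s/s1)^alpha;                          % (8)
elseif s <= s2
    tau = tau_max - (tau_max - t2)*(s - s1)/(s2 - s1);   % (9)
elseif s <= s3
    tau = t2 + (tau_max - t2)*(s - s2)/(s3 - s2);        % (10)
elseif s <= s4
    tau = tau_max - (tau_max - t4)*(s - s3)/(s4 - s3);   % (11)
else
    tau = t4;                                            % (12)
end
end
```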
3. experimental test
in this section, the experimental test carried out in a precast concrete plant of consolis poland is described. it consisted in the measurement of strains in the end region of the prestressed concrete beam at the time of the strand release by a dic system. the test results are necessary to verify the strain distributions in the finite element models of the beam.

3.1. specimen description
the 11.66-metre-long pretensioned concrete beam (figures 2 and 3) was used for the test. the cross section was i-shaped in the central part, whereas the anchorage zone was equipped with an end block. three horizontal openings in the central part and four vertical openings in the end zones were provided. twenty prestressing tendons with a diameter of 15.7 mm made of steel y1860s7 were used in the beam. their tension was 1250 mpa. it should be noted that two of them had a shielding two metres long. the strands were cut with an acetylene torch in a sequence shown in figure 2b. the stirrups, 8 and 10 mm in diameter, were made of steel b500. their spacing in the end region was not greater than 95 mm. the beam was made of concrete of the strength class c50/60. its composition is presented in table 1. the mechanical properties at the time of the strand release (44 hours after the pouring) were determined with the help of additional tests on the cubic and cylindrical concrete specimens and are summarised in table 2.

table 1. composition of concrete [kg/m3]:

cement cem i 52,5r lafarge          414
limestone powder cemex rudniki      58
sand (0–2 mm)                       681
crushed granodiorite (2–8 mm)       474
crushed granodiorite (8–16 mm)      664
water                               116.6
admixture sika 34rs                 3.68

table 2. average properties of concrete at the time of strand release:

property                               value    standard deviation
cube compressive strength [mpa]        56.6     0.50
cylinder compressive strength [mpa]    48.32    1.09
mean tensile strength [mpa]            3.98     –
modulus of elasticity [gpa]            25.81    0.27
poisson's ratio [–]                    0.19     0.016
strain at failure [‰]                  2.72     0.20
plastic part of strain [‰]             0.85     0.15
tensile strain at failure [‰]          0.154    –
density [kg/m3]                        2444     –

3.2. test procedure
the side surface of the beam was covered with paint and recorded during the strand release by a dic measurement system. on the basis of the obtained images, the strain distribution was established in the 2.4-metre-long end region. the gom aramis system equipped with two prime-lens 6-megapixel cameras (24 mm focal length) was used. the recording frequency was 4 hz.

figure 2. pretensioned concrete beam: a) a side view of the end region and the location of virtual extensometers, b) the arrangement of prestressing steel and the order of the strand release, c) the reinforcing steel in the end zone.
figure 3. sections 1–1 and 2–2 in the end region (see also figure 2c).
figure 4. finite element models created in autodesk robot (on the left) and atena 3d (on the right).
the concrete is defined by the material “3d nonlinear cementitious 2”. the mesh size is 5 cm. the models differ only in a bond stress-slip relationship for prestressing tendons. the relationships for strands deduced by [16, 17, 19, 20] are applied. additionally, the model code 2010 relationship (3–6) for ribbed bars in the case of a pull-out failure is used to evaluate its appropriateness for 7-wire strands. the relationship of lim et al. [18] is omitted, because it leads to questionable results. one simplified 2d finite element model based on linear elastic properties of concrete is developed in autodesk robot. the reinforcement and openings in the beam are omitted. prestressing is simulated by a linear increase of negative temperature over the transmission length of the beam (i. e., approximately 780 mm). the value of the transmission length is calculated with the help of the simplified method presented in [21]. 5. results and discussion the small values of strains during the strand release lead to difficulties in interpretation of the results obtained by aramis. firstly, instead of using deformation maps, the results are presented with the help of virtual extensometers 200 mm long. the location of the longitudinal extensometers is shown in figure 2a. they are situated on the side surface of the beam in the characteristic points of strain variation, namely on the level of the bottom row of strands (points 1, 2, 3, 4, and 5). the location of the extensometers is chosen so that they evenly cover the transmission length. the other problem are great fluctuations of the values in the strain distribution diagrams. therefore, the weighted moving average method is used to decrease the fluctuations: the strains in the diagrams are averaged out for seven previous values and seven subsequent ones. the transverse strain distributions in the end block are not presented in the paper, because they are characterised by such significant fluctuations that the verification of the models using these results seems pointless. the longitudinal strain distributions in the characteristic points of the beam (figure 2a) obtained by aramis and finite element modelling are shown in figure 5. the horizontal axis corresponds to the space of time when the strand release was carried out. it is observed that, with the increase of the distance from the end of the beam, all the finite element models overestimate the strains in comparison with the values obtained by aramis approximately until the time when the bottom row of strands starts to be released (between 125 and 150 s, especially at the point 5, a characteristic leap is seen). after this moment, the values obtained by the modelling and the experimental test are close. the possible cause is that between 125 and 150 s, a detachment of the concrete from the formwork happens, so the structural behaviour of the beam is changed. therefore, in the finite element modelling, the bottom surface of the beam should be restrained from moving in a vertical direction over the full length of the beam until the moment of the detachment. after the detachment, only the place of junction of the end and bottom surfaces should be restrained in this way (as is done in the analysed finite element models – see figure 4). this explanation 744 vol. 61 no. 6/2021 determination of bond model for 7-wire . . . figure 5. longitudinal strains in the end zone during strand release (location of the points is shown in figure 2a). 
it can be seen that the values in the diagrams are similar until the time between 125 and 150 s, i.e., until the moment of the detachment. coefficients of determination between the strains according to aramis and the predicted strains are presented in tables 3 and 4. on the basis of the strain distributions (figures 5, 7) and the coefficients of determination (tables 3, 4), it is established that the finite element models using the bond stress-slip relationships of model code 2010 [2] for ribbed bars, orr et al. [19] for unstressed strands, and khalaf and huang [20] give results that are significantly different from those obtained with the help of aramis. the finite element model based on the model code 2010 relationship overestimates the values of the strains, whereas the models using the relationships of orr et al. [19] for unstressed strands and khalaf and huang [20] underestimate them. in the case of the model based on the model code 2010 relationship, the difference has been expected, since this relationship is developed for ribbed reinforcing bars, but not for prestressing tendons. the inappropriateness of the model using the relationship of orr et al. [19] for unstressed strands indicates that the prestressing of strands in bond tests is crucial for deducing a realistic bond stress-slip relationship. concerning the model based on the relationship of khalaf and huang [20], it is difficult to explain the cause of the different results.

figure 6. longitudinal strains in the end zone during strand release: moment of the detachment of the concrete from the formwork.
figure 7. longitudinal strains along the beam length on the level of the bottom row of strands after release of all the tendons.

table 3. coefficient of determination between the strains according to aramis and the predicted strains (for the strains shown in figure 5):

bond model                              point 1   point 2   point 3   point 4   point 5
autodesk robot (2d model)               0.505     0.978     0.962     0.974     0.971
orr et al. (stressed strand)            0.538     0.975     0.944     0.953     0.959
orr et al. (unstressed strand)          0.546     0.974     0.942     0.954     0.959
oh et al. (equation for db = 15.2 mm)   0.546     0.975     0.946     0.957     0.960
oh et al. (equation for db = 12.7 mm)   0.548     0.974     0.945     0.956     0.960
balazs, ψ = 1.00                        0.549     0.973     0.943     0.956     0.961
model code 2010 (ribbed bar)            0.552     0.968     0.942     0.957     0.961
khalaf and huang                        0.545     0.977     0.949     0.958     0.960

table 4. coefficient of determination between the strains according to aramis and the predicted strains (for the strains presented in figure 7):

bond model                              coefficient of determination
autodesk robot (2d model)               0.954
orr et al. (stressed strand)            0.980
orr et al. (unstressed strand)          0.937
oh et al. (equation for db = 15.2 mm)   0.966
oh et al. (equation for db = 12.7 mm)   0.976
balazs, ψ = 1.00                        0.971
model code 2010 (ribbed bar)            0.896
khalaf and huang                        0.924
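the goodness-of-fit measure used in tables 3 and 4 can be computed as follows; eps_meas and eps_pred are assumed, equally sampled strain vectors (measured and predicted).

```matlab
% sketch of the coefficient of determination between measured and
% predicted strains; eps_meas, eps_pred are assumed vectors
ss_res = sum((eps_meas - eps_pred).^2);
ss_tot = sum((eps_meas - mean(eps_meas)).^2);
r2 = 1 - ss_res/ss_tot;
```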
experimental tests on more pretensioned concrete structures with different design parameters should be carried out to draw definitive conclusions about the appropriateness of the analysed bond stress-slip relationships for the strands and concrete used.

6. conclusions

in this paper, bond stress-slip relationships for 7-wire strands proposed by different researchers are analysed to evaluate their appropriateness for use in finite element modelling of a pretensioned concrete beam made in a precast concrete plant of consolis poland. the assessment is done by a comparison of the strain distributions in the beam found by aramis and by the finite element modelling. the following conclusions are drawn:

(1.) the longitudinal strain distributions in the finite element models based on the bond stress-slip relationships of balazs [16], orr et al. [19] (for stressed strands), and oh et al. [17] are fairly similar to the results of the dic measurements. moreover, the simplified model developed in autodesk robot gives satisfactory results as well. however, it is worth noting that the relationships of oh et al. [17] do not consider the concrete strength at the transfer, so the observed similarity might be accidental. neglecting the concrete strength at the transfer restricts their applicability to the finite element modelling of pretensioned concrete structures. in addition, it is found that the finite element models based on the bond relationships of orr et al. [19] for unstressed strands and khalaf and huang [20] underestimate the strains, whereas the model using the model code 2010 relationship for ribbed bars overestimates them. however, these findings have to be additionally verified on the basis of experimental tests on other pretensioned concrete members that are planned for the future. besides, bond tests utilising the same prestressing strands and concrete mix are proposed as a direction for future research.

(2.) the strain distributions obtained by aramis are characterised by great fluctuations that complicate the analysis of the results. this concerns the transverse strain distributions in particular. in future tests, the scanned region is planned to be reduced to increase the resolution and make the results more legible.

(3.) it is established that the structural behaviour of a pretensioned concrete beam might change during the strand release because of the detachment of the concrete from the formwork. thus, a finite element model of a pretensioned concrete beam should be restrained from moving in a vertical direction differently before and after the detachment.

references

[1] c. w. dolan, h. r. hamilton. prestressed concrete. springer international publishing, 2019. https://doi.org/10.1007/978-3-319-97882-6.
[2] fib model code for concrete structures 2010. ernst & sohn, berlin, 2013. https://doi.org/10.1002/9783433604090.
[3] n. mccormick, j. lord. digital image correlation. materials today 13(12):52–54, 2010. https://doi.org/10.1016/s1369-7021(10)70235-2.
[4] m. a. sutton, f. matta, d. rizos, et al. recent progress in digital image correlation: background and developments since the 2013 w m murray lecture. experimental mechanics 57:1–30, 2017. https://doi.org/10.1007/s11340-016-0233-3.
[5] m.-t. lin, c. furlong, c.-h. hwang (eds.). advancement of optical methods & digital image correlation in experimental mechanics. springer international publishing, 2021. https://doi.org/10.1007/978-3-030-59773-3.
[6] e. b. dolan, s. w. verbruggen, r. a. rolfe. mechanobiology in health and disease, chap.
1 – techniques for studying mechanobiology, pp. 1–53. academic press, 2018. https://doi.org/10.1016/b978-0-12-812952-4.00001-5.
[7] a. pagani, r. azzara, e. carrera, e. zappino. static and dynamic testing of a full-composite vla by using digital image correlation and output-only ground vibration testing. aerospace science and technology 112:106632, 2021. https://doi.org/10.1016/j.ast.2021.106632.
[8] b. pan. digital image correlation for surface deformation measurement: historical developments, recent advances and future goals. measurement science and technology 29(8):082001, 2018. https://doi.org/10.1088/1361-6501/aac55b.
[9] j.-n. perie, j.-c. passieux (eds.). advances in digital image correlation (dic), special issue of applied sciences, 2020. https://doi.org/10.3390/books978-3-03928-515-0.
[10] b. omondi, d. g. aggelis, h. sol, c. sitters. improved crack monitoring in structural concrete by combined acoustic emission and digital image correlation techniques. structural health monitoring 15(3):359–378, 2016. https://doi.org/10.1177/1475921716636806.
[11] d. zhu, s. liu, y. yao, et al. effects of short fiber and pre-tension on the tensile behavior of basalt textile reinforced concrete. cement and concrete composites 96:33–45, 2019. https://doi.org/10.1016/j.cemconcomp.2018.11.015.
[12] e. martinelli, a. hosseini, e. ghafoori, m. motavalli. behavior of prestressed cfrp plates bonded to steel substrate: numerical modeling and experimental validation. composite structures 207:974–984, 2019. https://doi.org/10.1016/j.compstruct.2018.09.023.
[13] c. lakavath, s. s. joshi, s. s. prakash. investigation of the effect of steel fibers on the shear crack-opening and crack-slip behavior of prestressed concrete beams using digital image correlation. engineering structures 193:28–42, 2019. https://doi.org/10.1016/j.engstruct.2019.05.030.
[14] a. b. sturm, p. visintin, r. seracino, et al. flexural performance of pretensioned ultra-high performance fibre reinforced concrete beams with cfrp tendons. composite structures 243:112223, 2020. https://doi.org/10.1016/j.compstruct.2020.112223.
[15] h. zhao, b. andrawes. innovative prestressing technique using curved shape memory alloy reinforcement. construction and building materials 238:117687, 2020. https://doi.org/10.1016/j.conbuildmat.2019.117687.
[16] g. balazs. transfer control of prestressing strands. pci journal 37(6):60–71, 1992. https://doi.org/10.15554/pcij.11011992.60.71.
[17] b. h. oh, e. s. kim, y. c. choi. derivation of development length in pretensioned prestressed concrete members. journal of the korea concrete institute 12(6):3–11, 2000. https://doi.org/10.22636/jkci.2000.12.6.3.
[18] s. n. lim, y. c. choi, b. h. oh, et al. bond characteristics and transfer length of prestressing strand in pretensioned concrete structures.
in framcos-8 – viii international conference on fracture mechanics of concrete and concrete structures, pp. 121–128, 2013. http://www.framcos.org/framcos-8/p348.pdf.
[19] j. j. orr, a. darby, t. ibell, et al. anchorage and residual bond characteristics of 7-wire strand. engineering structures 138:1–16, 2017. https://doi.org/10.1016/j.engstruct.2017.01.061.
[20] j. khalaf, z. huang. analysis of the bond behaviour between prestressed strands and concrete in fire. construction and building materials 128:12–23, 2016. https://doi.org/10.1016/j.conbuildmat.2016.10.016.
[21] a. ajdukiewicz, j. mames. konstrukcje z betonu sprężonego [prestressed concrete structures]. wyd. 2 popr. stowarzyszenie producentów cementu, kraków, 2008.

preliminary determination of propeller aerodynamic characteristics for small aeroplanes

s. slavík

abstract: this paper deals with the preliminary determination of propeller thrust and power coefficients depending on the advance ratio by means of some representative geometric parameters of the blade at a specific radius: the propeller blade chord and blade angle setting at 70 % of the tip radius, the airfoil thickness at the radius near the tip and the position of the maximum blade width. a rough estimation of the non-linear influence of the number of propeller blades is included. the published method is based on lock's model of the characteristic section and the bull-bennett lift and drag propeller blade curves. lock's integral decomposition factors and the loss factor were modified by the evaluation of the experimental propeller characteristics. the numerically obtained factors were smoothed and expressed in the form of analytical functions depending on the geometric propeller blade parameters and the advance ratio.

keywords: propeller, propeller aerodynamics, thrust coefficient, power coefficient, propeller efficiency, propeller design.

1 introduction

one of the main tasks during the introduction stage of aeroplane design is to determine the basic aeroplane performance. one of the inputs is the thrust curve of the power plant – available thrust versus flight velocity. the optimisation procedure requires combinations of suitable engines and propellers offered on the market to compare different thrust curves and, consequently, aircraft performance. designers of small sport aircraft very often have only the shape and the number of propeller blades without any aerodynamic characteristics. it is evident that very sophisticated and precise numerical methods (helix vortex surfaces or sophisticated solutions by means of fems of the real flow around the rotating lift surfaces) require large input data files. these conclusions have led the author to present an easy and sufficiently precise procedure for calculating the integral propeller aerodynamic characteristics with minimum demands on geometric and aerodynamic propeller input data.

2 lock's propeller model of the referential section

lock's model [1] considers a referential section on a propeller blade located at 70 % of the tip radius to be representative of the total aerodynamic forces acting on the blade (thrust $T_{bl}$ and tangential force $Q_{bl}$, or lift $L_{bl}$ and drag $D_{bl}$) – see fig. 1. it is assumed that these forces are configured according to the local relative wind $W$ determined by the incoming flow $W_0$ at this section (composed of the tangential speed $U$ and the flight speed $V$) and the induced speed $v_i$. the induced speed is related to lifting line theory.

fig. 1: lock's scheme of the referential blade section

the propeller blade lift and drag are expressed by means of the lift and drag coefficients:

$$L_{bl} = \tfrac{1}{2}\,\rho\,W^2\,c_L(\alpha)\,b_{0.7}\,r_{0.7} \qquad (1)$$

$$D_{bl} = \tfrac{1}{2}\,\rho\,W^2\,c_D(\alpha)\,b_{0.7}\,r_{0.7} \qquad (2)$$

this enables us to write the total thrust of a propeller with $z$ blades as:

$$T = z\,T_{bl} = z\left[L_{bl}\cos\beta - D_{bl}\sin\beta\right] \qquad (3)$$

in the form:

$$T = \tfrac{1}{2}\,\rho\,W^2\,b_{0.7}\,r_{0.7}\,z\left[c_L\cos\beta - c_D\sin\beta\right]. \qquad (4)$$

substitution of the apparent relations from fig. 1 for the resultant velocity $W$:

$$W = W_0\cos\beta_i = \sqrt{V^2 + (2\pi n_s r_{0.7})^2}\,\cos\beta_i = n_s D\,\sqrt{\lambda^2 + \pi^2\bar r_{0.7}^{\,2}}\,\cos\beta_i \qquad (5)$$

and the angle of the real incoming flow $\beta$:

$$\beta = \beta_{0.7} + \beta_i = \operatorname{arctg}\frac{\lambda}{\pi\,\bar r_{0.7}} + \beta_i, \qquad \alpha = \vartheta_{0.7} - \beta \qquad (6)$$

($\lambda$ is the advance ratio, $\lambda = V/(n_s D)$, and $\vartheta_{0.7}$ is the blade angle setting) into the expression for the thrust (4), together with the use of non-dimensional geometry ($\bar b_{0.7} = b_{0.7}/R$, $\bar r_{0.7} = r_{0.7}/R = 0.7$), gives the propeller thrust coefficient

$$c_T = \frac{T}{\rho\,n_s^2\,D^4} \qquad (7)$$

in the final form:

$$c_T = \frac{z}{8}\,\bar b_{0.7}\,\bar r_{0.7}\left(\lambda^2 + \pi^2\bar r_{0.7}^{\,2}\right)\cos^2\!\beta_i\left[c_L\cos\beta - c_D\sin\beta\right], \qquad \beta = \operatorname{arctg}\frac{\lambda}{\pi\,\bar r_{0.7}} + \beta_i. \qquad (8)$$

calculation of the thrust coefficient at given advance ratios requires not only the lift $c_L(\alpha)$ and drag $c_D(\alpha)$ blade curves but also a relationship to determine the angle of attack $\alpha$. lock [1] developed the induced equation as a dependence of the lift coefficient on the tip loss factor $\kappa$ (a function of the advance ratio $\lambda$ and the angle $\beta$ of the real incoming flow):

$$s_{0.7}\,c_L = 4\,\kappa\,\sin\beta\,\operatorname{tg}\beta_i \qquad (9)$$

where $s_{0.7}$ is the propeller solidity factor related to the referential section:

$$s_{0.7} = \frac{z\,b_{0.7}}{2\pi\,r_{0.7}} = \frac{z\,\bar b_{0.7}}{2\pi\,(0.7)}. \qquad (10)$$

the relations among the thrust $T_{bl}$ and tangential force $Q_{bl}$ acting on the propeller blade and the equivalent blade lift $L_{bl}$ and drag $D_{bl}$ forces were designed by lock [1] in a decomposition form based on the angle of the real incoming flow $\beta$ at the referential section. this resolution is corrected by the integration factors $e$ and $f$:

$$c_L\,s_{0.7} = e\,c_T\cos\beta + f\,c_M\sin\beta \qquad (11)$$

$$c_D\,s_{0.7} = f\,c_M\cos\beta - e\,c_T\sin\beta. \qquad (12)$$

the torque coefficient $c_M$ represents the tangential force:

$$c_M = \frac{M_k}{\rho\,n_s^2\,D^5} = \frac{z\,Q_{bl}\,r_{0.7}}{\rho\,n_s^2\,D^5}. \qquad (13)$$

introducing the propeller power $P$ and the power coefficient $c_P$:

$$c_P = \frac{P}{\rho\,n_s^3\,D^5} \qquad (14)$$

provides a constant relation between the power and torque coefficients: $c_P = 2\pi\,c_M$. the integration factors $e$ and $f$ were developed by lock in dependence on the advance ratio:

$$e = 3.276 - 4.336\,\lambda^2 \qquad (15)$$

$$f = \frac{2\,e}{\bar r_{0.7}}, \qquad \bar r_{0.7} = 0.7. \qquad (16)$$
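to make the chain of relations (5)–(8) concrete, the following sketch evaluates the thrust coefficient for a prescribed induced angle. it is a minimal illustration of the formulas as reconstructed above, not the complete method; all function and variable names are illustrative, and the blade curves are passed in as plain callables.

```python
import math

def thrust_coefficient(lam, beta_i, z, b_bar, theta_07, c_l, c_d, r_bar=0.7):
    """Evaluate c_T from eq. (8) for a given induced angle beta_i [rad].
    lam      -- advance ratio V/(n_s*D)
    b_bar    -- relative blade chord b_0.7/R
    theta_07 -- blade angle setting at 0.7R [rad]
    c_l, c_d -- blade lift/drag curves as functions of alpha [rad]"""
    beta0 = math.atan(lam / (math.pi * r_bar))   # eq. (6), undisturbed inflow angle
    beta = beta0 + beta_i                        # angle of the real incoming flow
    alpha = theta_07 - beta                      # angle of attack at 0.7R
    return (z / 8.0) * b_bar * r_bar \
        * (lam**2 + (math.pi * r_bar)**2) * math.cos(beta_i)**2 \
        * (c_l(alpha) * math.cos(beta) - c_d(alpha) * math.sin(beta))
```

in the full procedure, the induced angle is of course not prescribed but solved from the induced equation (9), as sketched after eq. (22) below.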
decomposition equations (11) and (12) can be used for an inverse procedure to calculate the thrust and power coefficients by means of the integration factors and the blade lift and drag curves:

$$c_T = \frac{s_{0.7}}{e}\,c_L(\alpha)\cos\beta\left(1 - \frac{c_D(\alpha)}{c_L(\alpha)}\,\operatorname{tg}\beta\right) \qquad (17)$$

$$c_P = 2\pi\,c_M = \frac{2\pi\,s_{0.7}}{f}\,c_D(\alpha)\cos\beta\left(1 + \frac{c_L(\alpha)}{c_D(\alpha)}\,\operatorname{tg}\beta\right). \qquad (18)$$

3 lift and drag of the propeller blade

bull and bennett [2] published the lift and drag propeller blade curves gained by applying lock's scheme (11) and (12) to a set of experimental propeller aerodynamic characteristics. to calculate the induced values, the set of lock's induced equation (9) and the decomposition relation (11) for the lift was used. the results of the calculations were presented in [2] and are shown in fig. 2.

fig. 2: propeller blade lift and drag curve

these curves represent different raf-6 section propellers covering a wide range of setting angles and advance ratios. the tip mach number never reached 0.7. the mean values of the lift and drag blade curves are described by simple linear and quadratic forms [2] ($\alpha$ in degrees):

$$c_L = 0.4996 + 0.1096\,\alpha \quad (\alpha \le 4.98°), \qquad c_L = 0.9867 + 0.0001\,\alpha - 0.0024\,\alpha^2 \quad (\alpha > 4.98°) \qquad (19)$$

$$c_D = 0.0258 - 0.00318\,\alpha + 0.00173\,\alpha^2 \qquad (20)$$

4 modification of lock's method

there are at least three reasons for improving lock's procedure to obtain a more effective and accurate method for quick preliminary calculations of integral propeller aerodynamic characteristics:

1) lock's loss factor $\kappa(\lambda, \beta)$ is given only in tabular form, and requires interpolation procedures.
2) the blade geometry represented by the referential section ($b_{0.7}$ and $\vartheta_{0.7}$) is too reduced to represent the entire propeller acceptably.
3) lock's method involves the number of blades in a linear form.

use was made of experimental thrust and power coefficients and of the presumption that the bull-bennett lift and drag blade curves are independent of the geometry and flight regime (fixed curves in fig. 2 for all types of propellers) to meet the above outlined requirements. eleven two-blade propellers with raf-6 sections were involved in the calculations. two other geometric parameters were added: the blade thickness at 90 % of the propeller tip radius – $t_{0.9}$ – and the position (radius) of the maximum blade chord – $r_{max}$. all of the geometric parameters and the tip mach numbers $M$ are presented in table 1, where $\bar t_{0.9} = (t_{0.9}/b_{0.9})\cdot 100$ and $\bar r_{max} = r_{max}/R$.

5 induced velocity

lock's expression for the thrust coefficient (8), with the bull-bennett lift (19) and drag (20) blade curves, was set equal to the experimental thrust coefficient, and the unknown induced angle $\beta_i^{exp}$ was solved numerically for the corresponding advance ratio and blade geometry:

$$c_T^{exp} = c_T\!\left(\lambda,\, z,\, \bar b_{0.7},\, \bar r_{0.7},\, \vartheta_{0.7},\, c_L(\alpha),\, c_D(\alpha),\, \beta = \operatorname{arctg}\frac{\lambda}{\pi\,\bar r_{0.7}} + \beta_i^{exp}\right). \qquad (21)$$

the calculated induced angles $\beta_i^{exp}$ were then used to determine the loss factor directly from lock's induced equation (9):

$$\kappa = \frac{s_{0.7}\,c_L(\alpha)}{4\,\sin\beta\,\operatorname{tg}\beta_i^{exp}}. \qquad (22)$$

a three-step procedure was used to simulate the influence of the blade geometric parameters on the induced values. the first step smoothed the loss factor only as a function of the advance ratio and the induced angle. subsequently, this analytical expression of the loss factor was used to calculate the induced angles $\beta_i$ by solving the induced equation (9) for all the experimental propellers.
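numerically, solving the induced equation (9)/(21)–(22) amounts to a one-dimensional root-finding problem. a minimal sketch under the reconstructed formulas above (using a standard bracketing solver from scipy and the bull-bennett lift curve of eq. (19), with angles handled in degrees where the curve requires it) could look as follows.

```python
import math
from scipy.optimize import brentq

def c_l(alpha_deg):
    # bull-bennett mean blade lift curve, eq. (19), signs as reconstructed
    if alpha_deg <= 4.98:
        return 0.4996 + 0.1096 * alpha_deg
    return 0.9867 + 0.0001 * alpha_deg - 0.0024 * alpha_deg**2

def induced_angle(lam, s07, theta_deg, kappa=1.0, r_bar=0.7):
    """Solve lock's induced equation (9):
    s07 * c_L(alpha) - 4*kappa*sin(beta)*tan(beta_i) = 0  for beta_i [rad]."""
    beta0 = math.atan(lam / (math.pi * r_bar))

    def residual(beta_i):
        beta = beta0 + beta_i
        alpha_deg = theta_deg - math.degrees(beta)
        return s07 * c_l(alpha_deg) - 4.0 * kappa * math.sin(beta) * math.tan(beta_i)

    # the root is bracketed between a tiny positive angle and ~30 degrees
    return brentq(residual, 1e-6, math.radians(30.0))

print(math.degrees(induced_angle(lam=0.3, s07=0.06, theta_deg=16.0)))
```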
finally, these induced angles $\beta_i$ were correlated with the experimental set $\beta_i^{exp}$. the function of the smooth loss factor that approximates the numerical results was stated in the form:

$$\kappa = 1 - a\,\beta_i^{\,b} \qquad (23)$$

with the coefficients:

$$a = 0.3254 + 0.3529\,\lambda - 0.4449\,\lambda^2 \qquad (23a)$$

$$b = 0.8213 - 0.0854\,\lambda + 0.0628\,\lambda^2. \qquad (23b)$$

a comparison of the experimental set of induced angles $\beta_i^{exp}$ with the induced angles $\beta_i$ calculated by means of the smooth loss factor showed differences that were evaluated by regression analysis into the final linear correction function:

$$\beta_i^{exp} \approx \beta_i^{cor} = a\,\beta_i + b \qquad (24)$$

$$a = 1.088 - 0.0149\,\vartheta_{0.7} - 1.74\,s_{0.7} + 0.462\,\bar r_{max}\,\lambda \qquad (24a)$$

$$b = 1.286 - 0.113\,\bar t_{0.9}. \qquad (24b)$$

this linear expression gives a good approximation in the region of higher angles. in order to keep the simple linear correlation through all the angles, a slightly different form based on the coefficients $a$ (24a) and $b$ (24b) is used for the range of small induced angles:

$$\beta_i \le 0.5°:\qquad \beta_i^{exp} \approx \beta_i^{cor} = 1.3\,a\,\beta_i + 0.65\,b\,\frac{\beta_i}{0.5}. \qquad (25)$$

6 integral factors

the calculated experimental induced angles $\beta_i^{exp}$ can also be used to express more precisely the integral factors $e$ and $f$ of lock's decomposition equations (11) and (12). such modified factors are necessary for more accurate calculations of the thrust and power (torque) coefficients directly with the use of lock's decomposition equations (17) and (18). the integral factors were explicitly derived from the decomposition equations (11) and (12):

$$e = \frac{2\pi\,s_{0.7}\,c_L - f\,c_P\sin\beta}{2\pi\,c_T\cos\beta} \qquad (26)$$

$$f = \frac{2\pi\,s_{0.7}}{c_P}\left(c_D\cos\beta + c_L\sin\beta\right) = \frac{2\pi\,s_{0.7}\,c_D\cos\beta}{c_P}\left(1 + \frac{c_L}{c_D}\,\operatorname{tg}\beta\right). \qquad (27)$$

table 1: set of experimental propellers

propeller   s_0.7 [1]   t_0.9 [%]   r_max [1]   θ_0.7 [°]   m [1]            ref.
m 337       0.0620      10.5        0.313       14.60       0.45, 0.5, 0.55  [3]
m60-180     0.0568      13.4        0.355       18.77       0.5, 0.6, 0.7    [3]
m60-130     0.0606      13.5        0.320       15.74       0.5, 0.6, 0.7    [3]
m30-011     0.0576      11.7        0.365       12.66       0.5, 0.6, 0.7    [3]
r503-2v     0.0502      13.8        0.300       10.59       0.5, 0.6, 0.7    [3]
m30-04a     0.0605      12.0        0.335       11.53       0.5, 0.6, 0.7    [3]
n5868-15    0.0596      8.3         0.500       16.06       0.46             [4]
n5868-25    0.0596      8.3         0.500       21.06       0.46             [4]
n3647-15    0.0888      8.3         0.500       16.06       0.46             [4]
n3647-25    0.0888      8.3         0.500       21.06       0.46             [4]
vr 411      0.0947      6.4         0.680       9.50        0.6              [5]

the numerical dependences of the two factors on the blade geometry and flight regime were obtained by using the set of experimental induced angles $\beta_i^{exp}$ in the expressions (6) for the angles $\alpha$ and $\beta$ and introducing these angles, with the corresponding experimental values of the thrust and power coefficients, into equations (26) and (27). the numerical results were smoothed by the following functions:

$$e = a_e + b_e\,\lambda + c_e\,\lambda^2 \qquad (28)$$

$$f = a_f + \frac{b_f}{1 + c_f\,\lambda} \qquad (29)$$

with parameters describing the influence of the blade geometry and also partly the flight regime:

$$a_e = 0.565, \qquad b_e = 0.0825, \qquad c_e = 0.0375; \qquad (29a)$$

the parameters $a_f$, $b_f$ and $c_f$ are functions of the tip mach number $M$ and of the geometric parameters $\bar r_{max}$, $\bar t_{0.9}$ and $\bar b_{0.7}$, built from the numerical constants 0.639, 1.8189, 0.965, 0.393, 0.9731, 0.027, 0.414, 0.182, 0.0234, 0.0475, 1.0777, 8.676, 2.542 and 1.9144, with a case distinction in $\bar r_{max}$. the analysis confirmed the independence of the $e$ factor from the propeller geometry, in accordance with lock's original model.
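the regression step behind eqs. (24)–(24b) is an ordinary least-squares fit of the experimental induced angles against the smoothed ones; a minimal sketch is given below, where the data arrays are placeholders standing in for the values of one propeller/regime set.

```python
import numpy as np

# beta_i from the smoothed loss factor (23), beta_i_exp from eq. (21);
# placeholder values for demonstration only
beta_i = np.array([1.2, 2.5, 3.8, 5.1, 6.9])        # degrees
beta_i_exp = np.array([1.5, 2.9, 4.4, 5.8, 7.6])    # degrees

# least-squares fit of beta_i_exp ~ a*beta_i + b, cf. eq. (24)
A = np.vstack([beta_i, np.ones_like(beta_i)]).T
(a, b), *_ = np.linalg.lstsq(A, beta_i_exp, rcond=None)
print(f"a = {a:.3f}, b = {b:.3f}")
```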
7 number of blades

lock's scheme considers a linear dependence on the number of blades through the solidity factor (10), both in the tip loss factor (22) and in the relations for the thrust (17) and power (18) coefficients derived from the decomposition equations. the linear model gives thrust and power coefficients that are higher than they are in reality, and the propeller propulsive efficiency does not depend on the number of blades. to preserve the simplicity of the developed procedure for two-blade propellers, an initial correction of the linear model was designed on the basis of the evaluation of experimental thrust and power coefficients. by comparing propellers with different numbers of blades but the same blade geometry [4], it was found that the mean value of the ratio between the thrust (power) per blade of a two-blade propeller and of a $z$-blade propeller systematically increases from 1 ($z = 2$) to higher values ($z > 2$). the analytical expressions of the mean thrust ratio $k_t$ and power ratio $k_p$ are as follows:

$$k_t(z) = \frac{c_t(z{=}2)/2}{c_t(z)/z} = 0.837 + 0.08583\,z - 1.5\cdot10^{-3}\,z^2 - 3.333\cdot10^{-4}\,z^3 \qquad (30)$$

$$k_p(z) = \frac{c_p(z{=}2)/2}{c_p(z)/z} = 0.764 + 0.16533\,z - 2.7\cdot10^{-2}\,z^2 + 1.6663\cdot10^{-3}\,z^3. \qquad (31)$$

the ratios $k_t(z)$ and $k_p(z)$ can therefore be used as conversion factors between the two-blade and $z$-blade propeller thrust and power coefficients:

$$c_t(z) = \frac{z}{2}\,\frac{c_t(z{=}2)}{k_t(z)} \qquad (32)$$

$$c_p(z) = \frac{z}{2}\,\frac{c_p(z{=}2)}{k_p(z)}. \qquad (33)$$

in order to ensure the correct internal calculation of a two-blade propeller even if the actual solidity factor (10) of the $z$-blade propeller is given, an effective solidity factor must be considered during the calculation:

$$s_{0.7}^{ef} = s_{0.7}\,\frac{2}{z}. \qquad (34)$$

8 calculation procedure

1) geometric input data: referential section (chord, setting angle) – $\bar b_{0.7}$ [1], $\vartheta_{0.7}$ [°]; relative thickness – $(t_{0.9}/b_{0.9})\cdot 100$ [%]; position of the maximum blade chord – $r_{max}/R$ [1]; number of blades – $z$.
2) flight regime input: advance ratio – $\lambda$ [1].
3) calculation of the effective solidity factor – (34).
4) solution of the induced angle – the root of the transcendent induced equation (22) with the modified loss factor (23) (the blade lift curve (19) is required).
5) linear correction of the induced angle – (24) and (25).
6) calculation of the modified integral factors $e$ and $f$ – (28) and (29).
7) calculation of the thrust and power coefficients with the effective solidity factor – (17) and (18) (the blade lift and drag curves (19) and (20) are required).
8) conversion of the obtained thrust and power coefficients by means of the $k_t$ and $k_p$ factors with respect to the number of blades – (32) and (33).
9) calculation of the propulsive efficiency: $\eta = \lambda\,(c_t/c_p)$ [1] (or $\eta_0 = 0.8\,c_t^{3/2}/c_p$ in the case of $\lambda = 0$).

9 validity

it was proved by systematic reversal calculations of the experimental propeller set (table 1) that the maximum relative error of both the thrust and the power coefficient is less than 10 %, and the mean error is about 5 %. these differences are valid from the start regime up to flight regimes with maximum propeller efficiency. the range of geometric parameters that ensures the 10 % relative error limit can therefore be directly estimated from the data in table 1: blade width $\bar b_{0.7} = 0.09$–$0.22$, blade angle setting $\vartheta_{0.7} = 9°$–$23°$, maximum blade width position $\bar r_{max} = 0.3$–$0.7$ and airfoil thickness $(t_{0.9}/b_{0.9})\cdot 100 = 6$–$14$ %. the tip mach number should not exceed 0.75. all analyses were performed with raf-6 blade airfoil propellers.
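the blade-number correction of eqs. (30)–(33) is a direct polynomial evaluation; a small sketch follows, with the coefficients as reconstructed above (they return k = 1 for z = 2, as the normalization requires).

```python
def k_t(z: int) -> float:
    # mean thrust-per-blade ratio, eq. (30); equals 1.0 for z = 2
    return 0.837 + 0.08583*z - 1.5e-3*z**2 - 3.333e-4*z**3

def k_p(z: int) -> float:
    # mean power-per-blade ratio, eq. (31); equals 1.0 for z = 2
    return 0.764 + 0.16533*z - 2.7e-2*z**2 + 1.6663e-3*z**3

def convert_two_blade(ct2: float, cp2: float, z: int):
    """Convert two-blade coefficients to a z-blade propeller, eqs. (32)-(33)."""
    return ct2 * z / (2.0 * k_t(z)), cp2 * z / (2.0 * k_p(z))

print(convert_two_blade(0.08, 0.04, z=3))
```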
10 examples

the first example presents calculations of the two-blade propeller vlu 001 [6] with the upper geometric limits of the blade chord and angle setting. the tip mach number is m = 0.45. the numerical results are compared with experimental values. the input geometric parameters are as follows:

- blade angle setting at 70 % of the tip radius – $\vartheta_{0.7}$ = 24.4 [°],
- relative chord of the blade at 70 % of the tip radius – $b_{0.7}/R$ = 0.227 [1],
- relative position of the maximum blade width – $r_{max}/R$ = 0.535 [1],
- relative airfoil thickness at 90 % of the tip radius – $(t_{0.9}/b_{0.9})\cdot 100$ = 8.5 [%].

the thrust and power coefficients are shown in fig. 3. the propeller efficiency is presented in fig. 4. the relative error of the power coefficient $c_p$ reached about 10 %; the thrust coefficient gives better results.

fig. 3: thrust and power coefficients of two-blade propeller vlu 001
fig. 4: efficiency of two-blade propeller vlu 001

the second example shows the possibilities of a parametric study. the propeller efficiency in a static regime ($\lambda = 0$, non-forward-moving propeller, $V = 0$ – see fig. 1), defined as $\eta_0 = 0.8\,c_t^{3/2}/c_p$, is calculated for the case of a two-blade propeller with fixed $r_{max}/R$, two thickness parameters $(t_{0.9}/b_{0.9})$ and four values of $\bar b_{0.7}$. the results are plotted in fig. 5 as a function of $\vartheta_{0.7}$. fig. 6 depicts the propulsive efficiency of the same propeller at the advance ratio $\lambda = 0.2$.

fig. 5: efficiency of a two-blade propeller at $\lambda = 0$ with different geometric parameters
fig. 6: efficiency of a two-blade propeller at $\lambda = 0.2$ with different geometric parameters

11 conclusion

the published method presents a simple and quick calculation procedure for the thrust and power propeller coefficients based on lock's 2d scheme of the referential section. the numerical demands are restricted to the solution of one non-linear algebraic equation to obtain the induced angle. the thrust and power coefficients are consequently calculated directly by explicit analytical algebraic formulae. the aerodynamic characteristics of the propeller are obtained with an acceptable error for preliminary aircraft performance analyses: the maximum relative deviation of both the thrust and the power coefficient does not exceed 10 % from the start regime up to flight regimes with maximum propeller efficiency. the mean error is about 5 %. the range of blade geometric parameters was set to keep the calculations within this error limit.

in addition to applications in the small aeroplane industry, the presented method is also suitable for student projects at technical universities with aerospace study programmes. the parametrical input of the propeller blade geometry and the number of blades enables easy studies of the influence of the propeller geometry on the aerodynamic characteristics. this procedure can be further enhanced by considering the tip mach number effect, the aerodynamics of the blade airfoils, and a more detailed analysis of the influence of the number of blades. systematic use of fems (e.g. fluent) can supply the experimental basis.

references

[1] lock c. n. h.: a graphical method of calculating the performance of an airscrew. british a. r. c. report and memoranda 1675, 1935.
[2] bull g., bennett g.: propulsive efficiency and aircraft drag determined from steady state flight test data.
mississippi state university, dep. of aerospace engineering, society of automotive engineers, 1985.
[3] slavík s., theiner r.: měření aerodynamických charakteristik vrtulí pro motor m-60 a m-30 v tunelu vzlú 3 m [measurement of the aerodynamic characteristics of propellers for the m-60 and m-30 engines in the vzlú 3 m wind tunnel]. czech technical university in prague, faculty of mechanical engineering, department of aerospace engineering, prague, 1989.
[4] hartman e. p., biermann d.: the aerodynamic characteristics of full scale propellers having 2, 3, and 4 blades of clark y and r.a.f. airfoil sections. technical report no. 640, n.a.c.a., 1938.
[5] hacura e. p.: charakteristiky vrtule vr 411 4003 [characteristics of the vr 411 4003 propeller]. aeronautical research and test institute, prague, 1952.
[6] hacura e. p., biermann d.: zkoušky rodiny vrtulí r & m 829 v aerodynamickém tunelu 3 m [tests of the r & m 829 propeller family in the 3 m wind tunnel]. aeronautical research and test institute, prague, 1949.

doc. ing. svatomír slavík, csc.
phone: +420 224 357 227, +420 224 359 216
e-mail: svatomir.slavik@fs.cvut.cz
department of automotive and aerospace engineering
czech technical university in prague
faculty of mechanical engineering
karlovo náměstí 13
121 35 prague 2, czech republic

on adequacy of two-point averaging schemes for composites with nonlinear viscoelastic phases

j. zeman, r. valenta, m. šejnoha

abstract: finite element simulations on fibrous composites with a nonlinear viscoelastic response of the matrix phase are performed to explain why so-called two-point averaging schemes may fail to deliver a realistic macroscopic response. nevertheless, the potential of two-point averaging schemes (the overall response estimated in terms of localized averages of a two-phase composite medium) has been put forward in a number of studies, either in its original format or modified to overcome the inherited stiffness of classical "elastic" localization rules. however, when the material model and the geometry of the microstructure promote the formation of shear bands, none of the existing two-point averaging schemes will provide an adequate macroscopic response, since they all fail to capture the above phenomenon. several examples are presented here to support this statement.

keywords: fiber-reinforced composite materials, microstructure, nonlinear viscoelastic behavior, leonov model, energy methods, finite element modeling.

1 introduction

it is in the nature of mankind to search for simplicity, efficiency and stability. when translated into the computational mechanics language, a new landscape for efficient numerical schemes arises from the introduction of the multi-scale or multilevel solution strategies currently at the forefront of engineering interest when studying complex heterogeneous materials and structures. to provide an illustrative example of such a structure, consider the multi-layered wound composite tube shown in fig. 1. an accurate prediction of the mechanical response of this particular class of structures inevitably calls for analyses on three widely separated length scales.

fig. 1: an example of three-scale modeling (macro-scale ~ m, meso-scale ~ mm, micro-scale ~ μm)

it is now becoming widely accepted that models constructed on the basis of hierarchical or multi-scale modeling offer a reliable route in the numerical investigation of deformation and failure processes taking place at the individual scales [1–5]. since their introduction, the need for a feasible computational framework has challenged the applicability of various numerical techniques at individual scales. the high cost of traditional numerical techniques, e.g., the finite element method, then provided an opportunity for classical averaging schemes as a cost-effective and computationally simple alternative. the motivation for the present paper, in particular, arises from the idea of using the well-known variational principles of hashin and shtrikman (hs) [6] when studying the behavior of a composite on the level of individual constituents, fig. 1c. assuming that the material response of a two-phase composite system is well described by volume averages of local fields, all these methods, including the hs principles, can be conveniently referred to as two-point averaging schemes. a comprehensive overview of various micromechanical techniques can be found in [7]. an extension to loading conditions that promote inelastic deformation is presented in [8–14].

a number of studies, however, have revealed the main drawback of the so-called "elastic localization rule" for the evaluation of local stress and strain averages with the two-point averaging schemes. in particular, defining the localization rules based on the elastic or tangent behavior of the individual phases [8] yields a macroscopic response that is significantly stiffer than that provided by the finite element method or a sufficiently refined transformation field analysis [13]. a number of approaches have been proposed to address this task with the main goal of improving the way the macroscopic strains or stresses are redistributed into the individual phases.
this led to the appearance of alternative methods, based on a different linearization of the governing non-linear problem, such as the secant, modified secant and affine methods [11, 12]. another example is the second-order variational estimates for material systems described by convex potentials [11]. a drawback of the previously mentioned methods, when presented in the framework of an incremental formulation, is the need to solve a nonlinear system of equations for each load increment. a remedy has been offered in [13], where the classical "total localization rule", presented in the framework of transformation field analysis but enhanced by corrected values of the eigenstrains to soften the localization rule, was proposed. although appealing in the macroscopic response that they deliver, most of the above methods require the existence of an instantaneous or asymptotic tangent stiffness, which is not always available, so that an extension to more complex and/or rate-dependent constitutive laws might not be an easy task [14].

in this work in particular, the authors examined the use of the two-point averaging scheme in conjunction with the extended hs principles for the modeling of unidirectional composites with a disordered microstructure. when allowing for a nonlinear viscoelastic response of the matrix phase, a rather surprising phenomenon was observed, suggesting severe limitations in the application of two-point averaging schemes. to introduce the subject, we recall the essential conclusions put forward in [14] that illustrate the well-known drawback of the so-called "elastic localization rule" for the evaluation of local stress and strain averages in a two-phase medium. to begin with, consider the results presented in fig. 2, featuring macroscopic stress-strain curves derived for a hexagonal arrangement of fibers under transverse shear strain loading; see [14] for more details. the inability of the hs principle, when used in its original form, to correctly capture the stress redistribution is further revealed in fig. 3, which shows plots of the "localized" phase averages. for the present material system (matrix response described by the generalized leonov model), the deficiency of the family of elastic localization rules can be attributed to the fact that the significant non-uniformity of the local fields, which manifested itself in an evolution of the shear bands as will be shown later in section 5, cannot be accurately represented by the piecewise uniform variation of the local linearized moduli.

to further support this statement, we consider an example of a two-phase laminate having a uniform distribution of local fields in the individual phases, so that the piecewise uniform approximations fit exactly, thus suggesting a perfect match with the response provided by the standard finite element method. fig. 4 shows the variation of the matrix shear stress due to the overall shear strain rate $\dot e_{12} = 1\cdot10^{-4}\ \mathrm s^{-1}$ for several values of the fiber volume fraction $c_f$. the results plotted in fig. 4 indicate an increase of the matrix strain rate for higher values of the fiber volume fraction, manifested here by higher values of the plateau stress. this supposition is confirmed in fig. 5, showing the time evolution of the matrix shear strain. it is worth noting, however, that a similar conclusion cannot be drawn from fig. 2, when the fe method is used to derive the volume averages of the local fields.
clearly, the volume average of the matrix stress for the composite plots below the curve obtained for a pure matrix subjected to the same loading conditions. this particular result is in direct contradiction with the observations gained from figs. 4–5. a sound explanation of this behavior is therefore needed.

fig. 2: motivation: macroscopic response for $\dot e_{12} = 1\cdot10^{-4}\ \mathrm s^{-1}$ – hexagonal packing
fig. 3: motivation: localized response for $\dot e_{12} = 1\cdot10^{-4}\ \mathrm s^{-1}$ – hexagonal packing
fig. 4: motivation: laminate response for strain rate $\dot e_{12} = 1\cdot10^{-4}\ \mathrm s^{-1}$
fig. 5: motivation: laminate response for strain rate $\dot e_{12} = 1\cdot10^{-4}\ \mathrm s^{-1}$ – evolution of the matrix strain rate as a function of the volume fractions
fig. 6: motivation: comparison of fem and hs-based estimates for strain rate $\dot e_{12} = 1\cdot10^{-4}\ \mathrm s^{-1}$

similar experiments were examined in [14] with reference to the primary hs principle in the framework of the two-point integration scheme augmented with a special choice of the reference medium. the results summarized in fig. 6 provide comparisons of the stress-strain curves obtained for this formulation. clearly, for material systems with uniform fields the modified incremental elastic localization rule is sufficient for all values of the fiber volume fraction $c_f$. this result just demonstrates that the resulting mismatch between the hs principles and the finite element simulations, fig. 2, is a consequence of the non-uniformity of the fields appearing in a composite. this phenomenon is examined below, particularly with reference to geometrically more complex microstructures, fig. 1. in achieving this goal, the behavior of so-called statistically equivalent unit cells is studied. the derivation of such a unit cell is reviewed in section 2. the first order homogenization scheme is then addressed in section 3 in the framework of the finite element method. the basic ingredients of the constitutive model used are presented in section 4. the paper concludes with a detailed discussion of the studied phenomena presented in section 6.

in the following text, lowercase boldface letters, e.g., $\mathbf a$, are used for column vectors, while capital boldface letters, e.g., $\mathbf A$, are used to denote matrices. lightface letters are used for scalar quantities, e.g., $a$. specific dimensions of individual quantities follow from the context. the inverse of a non-singular matrix is denoted as $\mathbf A^{-1}$, while the superscript $\mathsf T$ indicates the transpose of a matrix. finally, only loading due to a constant overall strain rate is considered, and the analysis is carried out under the generalized plane strain conditions [24, appendix b] with $x_3$ being the axis of the fibers.

2 geometrical modeling: construction of statistically optimized unit cells

this section offers a certain statistically optimized periodic unit cell (puc) consisting of only a small number of particles as a suitable representative of a real microstructure. such a unit cell is found through a minimization procedure formulated in terms of certain statistical descriptors characterizing the geometrical configuration of a random medium. it is believed that if the two systems, the real microstructure and the corresponding unit cell, are the same in the statistical sense, then the mechanical response of both systems will also be the same. this idea has been successfully exploited in a number of the authors' previous works [15–19]; see also [20] for other routes to tackle disordered microstructures. it has been shown that successful completion of this step requires some measures for a reliable quantification of the random microstructure.
to do so, it is convenient to introduce the concept of an ensemble – a collection of a large number of systems, which are identical from the macroscopic point of view and different in their microscopical details. the morphology of such a composite system is fully characterized by a random function $\chi_r(\mathbf x, \alpha)$, which is equal to one when a point $\mathbf x$ lies in the phase $r$ within the sample $\alpha$, and equal to zero otherwise. with the aid of the function $\chi_r$, the $n$-point probability function $S_{r_1,\dots,r_n}$ can be defined as [21, 22]

$$S_{r_1,\dots,r_n}(\mathbf x_1,\dots,\mathbf x_n) = \left\langle \chi_{r_1}(\mathbf x_1,\alpha)\cdots\chi_{r_n}(\mathbf x_n,\alpha)\right\rangle, \qquad (1)$$

where $\langle\cdot\rangle$ denotes the ensemble average. thus, $S_{r_1,\dots,r_n}$ gives the probability of finding $n$ points $(\mathbf x_1,\dots,\mathbf x_n)$, randomly thrown into a medium, located in the phases $r_1,\dots,r_n$. in the following, we limit our attention to functions of the order of one and two.

analysis of random composites usually relies on various hypotheses, which simplify the computational effort to a great extent. in particular, under the ergodic hypothesis [21] the volume average of the function $\chi_r(\mathbf x,\alpha)$, given by

$$\overline{\chi_r}(\alpha) = \lim_{V\to\infty}\frac{1}{V}\int_V \chi_r(\mathbf y,\alpha)\,\mathrm d\mathbf y, \qquad (2)$$

is independent of $\alpha$ and identical to the ensemble average

$$\left\langle\chi_r(\mathbf x)\right\rangle = S_r(\mathbf x) = c_r, \qquad (3)$$

where $c_r$ is the volume fraction of the $r$-th phase. the statistical homogeneity assumption means that the value of the ensemble average is translation invariant. then, for example, the two-point matrix probability function reads

$$S_{mm}(\mathbf x_1,\mathbf x_2) = S_{mm}(\mathbf x_{12}), \qquad (4)$$

where $\mathbf x_{ij} = \mathbf x_j - \mathbf x_i$. note that for an ergodic and periodic microstructure, the two-point probability function $S_{rs}$ has the following form

$$S_{rs}(\mathbf x) = \frac{1}{|\Omega|}\int_\Omega \chi_r(\mathbf y)\,\chi_s(\mathbf x + \mathbf y)\,\mathrm d\mathbf y, \qquad (5)$$

where $|\Omega|$ represents the size of the analyzed domain. the fourier transform of the function $S_{rs}$ is given by

$$\mathcal F(S_{rs})(\boldsymbol\xi) = \frac{1}{|\Omega|}\,\overline{\mathcal F(\chi_r)(\boldsymbol\xi)}\,\mathcal F(\chi_s)(\boldsymbol\xi), \qquad (6)$$

where $\overline{(\cdot)}$ now denotes the complex conjugate. when introducing a binary image of the actual microstructure and taking into account the periodicity of the rve, we may approximate eq. (6) by the discrete fourier transform:

$$S_{mm} \approx \frac{1}{N_x N_y}\,\operatorname{idft}\!\left(\overline{\operatorname{dft}(\chi_m)}\,\operatorname{dft}(\chi_m)\right), \qquad (7)$$

where $N_x$ and $N_y$ are the horizontal and the vertical resolution of the bitmap. note that this approach requires only $\mathcal O(N_x N_y \log(N_x N_y))$ operations, instead of the $\mathcal O(N_x^2 N_y^2)$ operations needed by the direct method. moreover, the possibility of using highly optimized public software packages makes the dft-based approach very efficient; see, e.g., [16] for a detailed discussion and numerical experiments.

having the original microstructure characterized in an appropriate sense, the construction of an equivalent puc is relatively straightforward. in particular, the puc is derived by matching a selected microstructure-describing function of the real microstructure and of the puc. to be more concrete, consider a periodic unit cell consisting of $N$ particles. the dimensions $H_1$ and $H_2$ and the $x$ and $y$ coordinates of all particle centers determine the geometry of such a unit cell.
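a minimal numpy sketch of the dft-based evaluation of eq. (7) on a binary bitmap follows; the array chi_m is the matrix-phase indicator image, and the names are illustrative only.

```python
import numpy as np

def two_point_probability(chi_m: np.ndarray) -> np.ndarray:
    """DFT-based two-point probability S_mm of a periodic binary image,
    cf. eq. (7): S_mm = IDFT(conj(DFT(chi_m)) * DFT(chi_m)) / (Nx*Ny)."""
    nx, ny = chi_m.shape
    f = np.fft.fft2(chi_m)
    return np.fft.ifft2(np.conj(f) * f).real / (nx * ny)

# a random periodic "microstructure" with matrix volume fraction ~0.6
chi = (np.random.rand(128, 128) < 0.6).astype(float)
s = two_point_probability(chi)
print(s[0, 0])   # at zero offset S_mm equals the matrix volume fraction c_m
```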
the particle locations, together with an optimal ratio of the cell dimensions $H_1/H_2$, are found by minimizing an objective function involving the two-point probability function

$$F(\mathbf x^N, H_1, H_2) = \sum_{i=1}^{N_m}\left(\overline S(r_i, s_i) - S_0(r_i, s_i)\right)^2, \qquad (8)$$

where $\mathbf x^N = \{x_1, y_1, \dots, x_N, y_N\}$ stores the positions of the particle centers, $S_0(r_i, s_i)$ is the value of the two-point (matrix) probability function corresponding to the original medium evaluated at a point $(r_i, s_i)$, and $N_m$ is the number of points at which both functions are matched. a closer inspection of the objective function (8) reveals that it is discontinuous and multimodal, with a large number of local plateaus. this is a direct consequence of using bitmap images, where the individual entries of the searched vector are integer variables. based on our previous experience with the optimization of such functions [17], the augmented simulated annealing method [19] is implemented to solve the optimization problem. examples of the resulting 5- and 10-particle pucs are displayed in fig. 7. the hexagonal unit cell with the same volume fraction is shown for comparison.

fig. 7: examples of statistically optimized periodic unit cells

3 numerical modeling: finite element discretization

in this section, the numerical analysis is performed in the spirit of the first order homogenization of periodic fields [1, 2, 23]. to that end, consider a representative volume element $Y$ in terms of one of the statistically optimized unit cells derived in the previous section. next, let the applied loading conditions produce a uniform distribution of the macroscopic strain $\mathbf E$ or macroscopic stress $\boldsymbol\Sigma$ fields. when further assuming a periodicity of the microstructure, the local displacement field $\mathbf u(\mathbf x)$ and strain field $\boldsymbol\varepsilon(\mathbf x)$ admit the following decomposition

$$\mathbf u(\mathbf x) = \mathbf E\,\mathbf x + \mathbf u^*(\mathbf x), \qquad \boldsymbol\varepsilon(\mathbf x) = \mathbf E + \boldsymbol\varepsilon^*(\mathbf x), \qquad (9)$$

where $\mathbf u^*(\mathbf x)$ and $\boldsymbol\varepsilon^*(\mathbf x)$ represent fluctuations of the local fields due to the presence of heterogeneities [21, 23]. the local stresses are related to the strains by the constitutive equation

$$\boldsymbol\sigma(\mathbf x) = \mathbf L(\mathbf x)\,\boldsymbol\varepsilon(\mathbf x) + \boldsymbol\lambda(\mathbf x), \qquad (10)$$

where $\mathbf L(\mathbf x)$ is the material stiffness matrix and $\boldsymbol\lambda(\mathbf x)$ is the vector of eigenstresses (strain-free stresses). note that $\boldsymbol\lambda(\mathbf x)$ can represent a variety of physical phenomena, like temperature or moisture effects, creep stresses, etc. employing the principle of virtual work (the hill lemma) in the form

$$\int_Y \delta\boldsymbol\varepsilon^{*\mathsf T}(\mathbf x)\left[\mathbf L(\mathbf x)\left(\mathbf E + \boldsymbol\varepsilon^*(\mathbf x)\right) + \boldsymbol\lambda(\mathbf x)\right]\mathrm dY = 0, \qquad (11)$$

and introducing the standard finite element approximation $\mathbf u^*(\mathbf x) = \mathbf N(\mathbf x)\,\mathbf r$, $\boldsymbol\varepsilon^*(\mathbf x) = \mathbf B(\mathbf x)\,\mathbf r$ [24], we finally obtain the discretized equilibrium equations

$$\begin{bmatrix} \int_Y \mathbf L(\mathbf x)\,\mathrm dY & \int_Y \mathbf L(\mathbf x)\,\mathbf B(\mathbf x)\,\mathrm dY \\ \int_Y \mathbf B^{\mathsf T}(\mathbf x)\,\mathbf L(\mathbf x)\,\mathrm dY & \int_Y \mathbf B^{\mathsf T}(\mathbf x)\,\mathbf L(\mathbf x)\,\mathbf B(\mathbf x)\,\mathrm dY \end{bmatrix} \begin{Bmatrix}\mathbf E \\ \mathbf r\end{Bmatrix} = \begin{Bmatrix} |Y|\,\boldsymbol\Sigma - \int_Y \boldsymbol\lambda(\mathbf x)\,\mathrm dY \\ -\int_Y \mathbf B^{\mathsf T}(\mathbf x)\,\boldsymbol\lambda(\mathbf x)\,\mathrm dY\end{Bmatrix}. \qquad (12)$$

denoting $\mathbf K_{ij}$ the individual blocks of the left-hand side of eq. (12), we can formally eliminate the unknown vector $\mathbf r$ from (12) and obtain the resulting macroscopic homogenized constitutive law

$$\boldsymbol\Sigma = \mathbf L_{fe}\,\mathbf E + \boldsymbol\lambda_{fe}, \qquad (13)$$

where

$$\mathbf L_{fe} = \frac{1}{|Y|}\left(\mathbf K_{11} - \mathbf K_{12}\,\mathbf K_{22}^{-1}\,\mathbf K_{12}^{\mathsf T}\right), \qquad \boldsymbol\lambda_{fe} = \frac{1}{|Y|}\left(\int_Y \boldsymbol\lambda(\mathbf x)\,\mathrm dY - \mathbf K_{12}\,\mathbf K_{22}^{-1}\int_Y \mathbf B^{\mathsf T}(\mathbf x)\,\boldsymbol\lambda(\mathbf x)\,\mathrm dY\right). \qquad (14)$$

when analyzing a non-linear material system, eq. (10) needs to be replaced by an appropriate linearized constitutive law, and relations (13)–(14) are, rather straightforwardly, replaced by their incremental counterparts.
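the static condensation in eqs. (13)–(14) is a schur complement of the block system (12); assuming the blocks and load vectors have already been assembled, a small numpy sketch reads as follows.

```python
import numpy as np

def homogenized_stiffness(K11, K12, K22, lam_avg, lam_b, volume):
    """Condense the block system (12) into the macroscopic law (13)-(14):
    L_fe   = (K11 - K12 K22^-1 K12^T) / |Y|
    lam_fe = (lam_avg - K12 K22^-1 lam_b) / |Y|"""
    L_fe = (K11 - K12 @ np.linalg.solve(K22, K12.T)) / volume
    lam_fe = (lam_avg - K12 @ np.linalg.solve(K22, lam_b)) / volume
    return L_fe, lam_fe

# tiny illustrative system: 4 macroscopic strain components, 6 fluctuation dofs
rng = np.random.default_rng(0)
A = rng.normal(size=(10, 10)); K = A @ A.T          # SPD stand-in for the blocks
K11, K12, K22 = K[:4, :4], K[:4, 4:], K[4:, 4:]
L_fe, lam_fe = homogenized_stiffness(K11, K12, K22, np.zeros(4), np.zeros(6), 1.0)
print(L_fe.shape)   # (4, 4) homogenized stiffness matrix
```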
this step is discussed in the next section. similar results can also be derived with the help of other numerical techniques, such as the boundary element method [25].

4 constitutive modeling: generalized leonov model

as suggested in the introductory part, a graphite/epoxy material system is selected as a particular example in this study. note that the fiber is assumed to remain elastic during deformation, so that the inelastic effects are limited to the matrix phase. for the composite structure plotted in fig. 1, the pr100/2+em100e epoxy is used as a bonding agent. an experimental program carried out on this type of material [27, 29] demonstrates that the relevant rate-dependent response of the epoxy is well described by the generalized leonov model. combining the eyring flow model for the plastic component of the shear strain rate

$$\frac{\mathrm de^p}{\mathrm dt} = \frac{1}{2A}\,\sinh\frac{\tau}{\tau_0} \qquad (15)$$

with the elastic shear strain rate $\mathrm de^e/\mathrm dt$ yields the one-dimensional leonov model [26]

$$\frac{\mathrm de}{\mathrm dt} = \frac{\mathrm de^e}{\mathrm dt} + \frac{\mathrm de^p}{\mathrm dt} = \frac{1}{2G}\frac{\mathrm d\tau}{\mathrm dt} + \frac{\tau}{2\eta}, \qquad (16a)$$

$$\eta = \frac{\eta_0}{a_\tau}, \qquad a_\tau = \frac{\sinh(\tau/\tau_0)}{\tau/\tau_0}, \qquad (16b)$$

where $\eta$ is the shear-dependent viscosity. in eq. (15), $A$ and $\tau_0$ are material parameters; $a_\tau$, which appears in eq. (16b), is the stress shift function with respect to the zero-shear viscosity $\eta_0$ (the viscosity corresponding to an elastic response). clearly, the phenomenological representation of eq. (16a) is the maxwell model with a variable viscosity $\eta$. note that a single leonov mode is not able to describe a realistic response of real polymers, as it only accounts correctly for the initial stiffness and the strain rate dependent yield stress. to describe the multi-dimensional behavior of the material, the generalized compressible leonov model, equivalent to the generalized maxwell chain model, can be used. the viscosity term corresponding to the $\mu$-th unit receives the form

$$\eta_\mu = \frac{\eta_{0,\mu}}{a_\sigma(\sigma_{eq})}, \qquad \sigma_{eq} = \sqrt{\tfrac12\,s_{ij}\,s_{ij}}, \qquad (17)$$

where $\sigma_{eq}$ is the equivalent shear stress and $s_{ij}$ is the stress deviator tensor. admitting only small strains and isotropic materials, a set of constitutive equations defining the generalized compressible leonov model can be written as

$$\sigma_m = K\,\varepsilon_v, \qquad (18)$$

$$\dot s_{ij,\mu} = 2G_\mu\left(\dot e_{ij} - \dot e^p_{ij,\mu}\right), \qquad (19)$$

$$\dot e^p_{ij,\mu} = \frac{s_{ij,\mu}}{2\eta_\mu} = \frac{a_\sigma(\sigma_{eq})\,s_{ij,\mu}}{2\eta_{0,\mu}}, \qquad (20)$$

$$\sigma_{ij} = \sigma_m\,\delta_{ij} + s_{ij}, \qquad s_{ij} = \sum_{\mu=1}^{M} s_{ij,\mu}, \qquad (21)$$

where $\sigma_m$ is the mean stress, $\varepsilon_v$ is the volumetric strain, $K$ is the bulk modulus, $G_\mu$ is the shear modulus of the $\mu$-th unit and $\delta_{ij}$ is the kronecker delta. the material parameters required by the generalized compressible leonov model are the parameters $A$, $\tau_0$ and $G_\mu(\eta_{0,\mu})$.
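for illustration, the stress dependence of eqs. (15)–(17) can be coded directly; the sketch below integrates a single leonov mode in simple shear at a constant strain rate with an explicit update. all parameter values are made up for the demonstration, and a single mode is of course only a caricature of the thirteen-unit chain used in the paper.

```python
import math

def stress_shift(tau, tau0):
    """a_tau = sinh(tau/tau0) / (tau/tau0), eq. (16b); -> 1 as tau -> 0."""
    x = tau / tau0
    return math.sinh(x) / x if abs(x) > 1e-8 else 1.0

def single_mode_response(G, eta0, tau0, rate, t_end, dt=1e-2):
    """Explicit integration of eq. (16a): dtau/dt = 2G*de/dt - G*tau/eta."""
    tau, t, history = 0.0, 0.0, []
    while t < t_end:
        eta = eta0 / stress_shift(tau, tau0)   # eq. (16b), stress-dependent viscosity
        tau += dt * (2.0 * G * rate - G * tau / eta)
        t += dt
        history.append((t, tau))
    return history

# made-up parameters: G = 500 MPa, eta0 = 1e5 MPa.s, tau0 = 10 MPa
resp = single_mode_response(G=500.0, eta0=1e5, tau0=10.0, rate=1e-4, t_end=200.0)
print(f"shear stress at t = 200 s: {resp[-1][1]:.2f} MPa")
```

the stress history produced by this sketch shows the two features mentioned above: an initially elastic slope of 2G and a rate-dependent stress plateau set by the eyring viscosity.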
thirteen elements of the generalized kelvin-voight chain model were used to obtain an accurate description of the linear compliance function. the resulting coefficients for the pr100/2+em100e epoxy needed in the generalized leonov model are stored in table 1. modeling the mechanical behavior of a bonding agent in polymer matrix composites requires a reliable and stable procedure for integration of the set of governing equations (18)-(21). to avoid possible numerical instabilities linked to explicit integration schemes a fully implicit euler backward integration procedure was developed. providing the total strain rate is constant during integration a new state of stress in the matrix phase at the end of the current time step assumes the form � � �m i m it t k( ) ( )� ��1 � , (22) s s e( ) ( ) �( ) ( )t t g t ti i i i� � ��1 2 q� �� , (23) where ti is the current time at the end of the i-th time increment; �m(ti) is the elastic mean stress, s(ti) stores the deviatoric part of the stress vector �(ti) and �e is the deviatoric part of the total strain increment. with reference to the backward euler integration scheme the time dependent variables at time ti receive the form �( ) ( ) exp ( ) g t g a t t t a t i i i � � � � � � � ! " " � � � � � � � � � � � � � �� � 1 �� � 1 m , (24) � � �( ) exp ( ) ( )t t a t ti i i� � � � � � � � ! " " � � � � � � � � � � 1 1 � � � � � s 1 m , (25) where s�(ti), � � 1, 2, …, m, is the deviatoric stress vector in individual units of the maxwell chain model evaluated at the beginning of a new time increment �t t ti i� � �1, and m is the assumed number of mawell units in the chain model. the stress shift factor is given by a t t ti eq i eq i � � � � � ( ) ( ) sinh ( ) � � � �� ! "" 0 0 , (26) where the equivalent stress �eq follows from �eq i i it t t( ) ( ) ( )� �1 2 1s st q , (27) and q � � �� � �� diag 1 1 1 1 2 1 2 1 2 . (28) clearly, the backward euler step makes all variables nonlinearly dependent on the stress values found at time ti. therefore, successful completion of a given integration step requires the solution of a system of nonlinear equations. here, the solution is established employing the newton-raphson method. to that end, define a set of residuals � �r � t g a tas t t t teq i i i� � �� ( ) ( ) ( ) 1 2 1s st q , (29) g g t g a t t t a t i i i � � � � � � � � ! " " � � � � � �( ) ( ) exp ( ) � � � � � � �� � 1 � � �� � 1 m , (30) a a t t ti eq i eq i � � � � �� ! "" � � � � � ( ) ( ) sinh ( ) 0 0 , (31) with the primary variables � �a � � �eq i i it g t a t( ) �( ) ( ) t . (32) note that the current increment of the vector ��(ti), which appears in eq. (23), is considered as a secondary variable. then, under the condition that � e is constant, the newton-raphson iterative scheme reads a a rk i k i k kt t� �� �1 1( ) ( ) h , (33) where jacobian matrix h is given by h � d d r a . (34) the initial values of primary variables at time ti for k � 0 are set to forward euler estimates. more details about the numerical implementation including comparison of an explicit and implicit integration scheme can be found in [27,28]. the accuracy of the proposed numerical procedure is demonstrated in fig. 9, which shows a typical uniaxial re© czech technical university publishing house http://ctn.cvut.cz/ap/ 205 acta polytechnica vol. 
the accuracy of the proposed numerical procedure is demonstrated in fig. 9, which shows a typical uniaxial response of the pr100/2+em100e epoxy resin for three different constant strain rates, where the solid lines were obtained experimentally, while the others follow from the numerical analysis.

fig. 9: experiments vs. numerical simulations

5 inadequacy of two-point averaging

recall that, in section 2, rather encouraging results were obtained for (an appropriately modified) two-point averaging scheme for binary laminates. this scheme, however, when applied to more complex microstructures, such as the hexagonal packing of fibers, delivers rather inadequate results, although having a significantly softer prediction of the macroscopic response than the classical one, see fig. 3. an explanation is offered by examining, among other things, the response of a pure matrix subjected to the same overall shear strain rate as the composite. it is interesting to observe, particularly in view of the results presented in figs. 4–6, that the stress-strain curve for the pure matrix plots above the finite element estimates of the volume average of the matrix stress derived for the composite. to reconcile this "paradox" – namely, the fact that the volume average of the matrix yield stress decreases with an increase in the volume average of the matrix strain rate – we can set up a very simple geometrical model with the finite element discretization depicted in fig. 10. elements that exhibit the same stress response are marked with the same number. note that 6-noded elements with the 7-point integration rule were used, so that the results displayed in figs. 11–12 correspond to element averages.

fig. 10: a simple finite element mesh with groups of elements
fig. 11: element averages of matrix strains for $\dot e_{12} = 1\cdot10^{-4}\ \mathrm s^{-1}$
fig. 12: element averages of matrix stresses for $\dot e_{12} = 1\cdot10^{-4}\ \mathrm s^{-1}$

fig. 11 indeed confirms a large non-homogeneity in the distribution of the local fields. it also shows an increase in the rate of deformation of the volume average of the matrix shear strain in comparison with the response of the pure matrix under the same loading conditions. this is clearly due to the heterogeneity of the material system and, consequently, to the required compatibility of the local and overall fields. this condition inevitably leads to an increase in the reference yield stress of the matrix phase provided by the hs procedure; recall also the laminate response studied in the first example, figs. 4–6. the expected increase in the volume average of the matrix yield stress, however, is not observed in the finite element calculations, even for such a simple geometry. this can be attributed to the highly nonlinear dependency of the yield stress on the applied rate of deformation provided by the leonov model. also note that the elements found in the closer vicinity of the fiber experience a considerably slower response than those located away from the fiber phase. this eventually results in a softer response in comparison with the pure matrix. clearly, such a behavior cannot be represented by any of the existing localization rules dealing with two-phase material systems and a piecewise uniform variation of the local fields, at least for the present material model. the conclusions of this case study can, in principle, be extended by analogy to more complex periodic unit cells, such as those displayed in fig. 7.
to illustrate the governing mechanism responsible for this fact, the distribution of the local equivalent strain in the hexagonal array and in the 5- and 10-fiber unit cells is shown in fig. 13. each microstructure was discretized with 6-noded triangular elements with 7 integration points. in particular, the plotted local fields correspond to the value of overall deformation $e_{12} = 0.1$. for all the studied microstructures, the final deformation pattern has the character of a localized shear layer that is responsible for the plateau regions clearly visible in the graphs depicted in figs. 3–4. this behavior simply cannot be reliably captured by a two-point averaging scheme, and it finally leads to the erroneous response of the simplified method. note that a quite similar character of the overall response appears even for discretization of the structure with three-noded constant strain triangular elements, see fig. 4. in this case, however, the approximation is not rich enough to correctly capture the fact that the formed shear band prohibits interaction of the fibers with the matrix phase, fig. 15.

fig. 13: distribution of local equivalent strain fields for discretization with 6-noded elements: (a) hexagonal array, (b) 5-fiber puc, (c) 10-fiber puc
fig. 14: distribution of local equivalent strain fields for discretization with 3-noded elements: (a) hexagonal array, (b) 5-fiber puc, (c) 10-fiber puc
fig. 15: overall response of optimized unit cells – effect of the order of discretization

an additional explanation comes from the basic assumption of the selected material model. recall that the model assumes the nonlinear viscoelastic response to be limited purely to the deviatoric part of the stress-strain relationship, while the volumetric response remains elastic. as often observed with j2 plasticity, this may eventually lead to what we call volumetric locking, and consequently to a substantially stiffer response than observed in reality. this, in particular, suggests that even when using a refined variant of the tfa method – “multi-point incremental homogenization” [10] – the global response will probably not be captured properly.

6 conclusions

the paper reviewed the essential steps of the first-order homogenization procedure implemented in the framework of the finite element method. a family of so-called statistically equivalent unit cells was introduced to deal with random microstructures. unlike the elastic analysis, where a 5-fiber unit cell was found sufficient to arrive at reliable estimates of the homogenized properties [16, 17], at least a 10-fiber unit cell is needed when a highly non-homogeneous evolution of local fields may occur; recall the formation of shear bands displayed in figs. 13, 14 and the corresponding overall response plotted in fig. 15. the existence of shear bands, zones of highly localized deformation, may on the other hand raise a question about mesh objectivity when a local formulation is used. it is worth noting that in the present study the size of the shear band is explicitly given by the underlying microstructure. clearly, when the finite element mesh is refined, such a shear band will be captured more accurately, but it will not depend on the size of the elements and will always attain a finite depth. regularization from the mathematical point of view is thus not needed.
nevertheless, problems might be encountered for relatively coarse meshes or for microstructures with a rather low value of the fiber volume fraction. in this regard, owing to the material model used in this study, a phenomenon known as volumetric locking occurred for the 3-noded triangular elements; recall figs. 14, 15 and the discussion in section 5.

as for the main goal of this paper, the analysis clearly uncovered major drawbacks of two-point averaging schemes when applied to random microstructures with possibly inelastic phases. as already suggested, the existence of regions of highly localized deformation attributed to the heterogeneity of the microstructure may considerably influence the overall response (this phenomenon can be further magnified if one of the phases is elastic while the other experiences an inelastic response). here we refer to the results presented in figs. 11, 12, identifying the mechanism of the actual localization rule and clearly suggesting the inadequacy of the “total elastic” localization rule used with the present form of the hs principles. the two-point averaging schemes, if at hand, should therefore be used with caution. nevertheless, if the “exact” local response is not of the main interest, the parameters of a given material model can be adjusted to meet the desired overall response. in such a case, numerical experiments on a refined geometry are usually used in place of laboratory measurements. an example of combining such an approach with a two-point averaging scheme has been presented in [14].

acknowledgments

financial support for this work was provided by gačr grant no. 103/01/d052.

references

[1] fish j., shek k., pandheeradi m., shephard m. s.: “computational plasticity for composite structures based on mathematical homogenization: theory and practice.” computer methods in applied mechanics and engineering, vol. 148 (1997), no. 1–2, p. 53–73.
[2] fish j., shek k.: “multiscale analysis of large-scale nonlinear structures and materials.” international journal for computational civil and structural engineering, vol. 1 (2000), no. 1, p. 79–90.
[3] kouznetsova v. g., geers m. g. d., brekelmans w. a. m.: “multi-scale second-order computational homogenization of multi-phase materials: a nested finite element solution strategy.” computer methods in applied mechanics and engineering, (2004), accepted for publication.
[4] massart t. j.: “multi-scale modeling of damage in masonry structures.” ph.d. thesis, technische universiteit eindhoven, 2003.
[5] kabele p.: “assessment of structural performance of engineered cementitious composites by computer simulation.” ctu reports, vol. 5 (2001), no. 4.
[6] hashin z., shtrikman s.: “on some variational principles in anisotropic and nonhomogeneous elasticity.” journal of the mechanics and physics of solids, vol. 10 (1962), p. 335–342.
[7] willis j. r.: “variational and related methods for the overall properties of composites.” advances in applied mechanics, vol. 21 (1981), p. 1–78.
[8] lagoudas d. c., gavazzi a. c., nigam h.: “elastoplastic behavior of metal matrix composites based on incremental plasticity and the mori-tanaka averaging scheme.” computational mechanics, vol. 8 (1991), p. 193–203.
[9] dvorak g. j.: “transformation field analysis of inelastic composite materials.” proceedings of the royal society of london, series a, vol. 437 (1992), p. 311–327.
[10] dvorak g. j., bahei-el-din y. a., wafa a.
m.: “implementation of the transformation field analysis for inelastic composite materials.” computational mechanics, vol. 14 (1994), no. 3, p. 201–228.
[11] castaneda p. p., suquet p.: “nonlinear composites.” advances in applied mechanics, vol. 34 (1998), p. 171–302.
[12] masson r., bornert m., suquet p., zaoui a.: “an affine formulation for the prediction of the effective properties of nonlinear composites and polycrystals.” journal of the mechanics and physics of solids, vol. 48 (2000), p. 1203–1227.
[13] chaboche j. l., kruch s., maire j. f., pottier t.: “towards a micromechanics based inelastic and damage modeling of composites.” international journal of plasticity, vol. 17 (2001), no. 4, p. 411–439.
[14] šejnoha m., valenta r., zeman j.: “nonlinear viscoelastic analysis of statistically homogeneous random composites.” international journal of multiscale computational engineering, (2004), submitted for publication.
[15] šejnoha m., zeman j.: “overall viscoelastic response of random fibrous composites with statistically quasi uniform distribution of reinforcements.” computer methods in applied mechanics and engineering, vol. 191 (2002), no. 44, p. 5027–5044.
[16] šejnoha m., zeman j.: “micromechanical analysis of random composites.” ctu reports, vol. 6 (2002), no. 1, czech technical university in prague.
[17] zeman j., šejnoha m.: “numerical evaluation of effective properties of graphite fiber tow impregnated by polymer matrix.” journal of the mechanics and physics of solids, vol. 49 (2001), no. 1, p. 69–90.
[18] zeman j., šejnoha m.: “homogenization of plain weave composites with imperfect microstructure: part i – theoretical formulation.” international journal of solids and structures, (2004), in print.
[19] matouš k., lepš m., zeman j., šejnoha m.: “applying genetic algorithms to selected topics commonly encountered in engineering practice.” computer methods in applied mechanics and engineering, vol. 190 (2000), no. 13–14, p. 1629–1650.
[20] shan z., gokhale a. m.: “representative volume element for non-uniform microstructures.” computational materials science, vol. 24 (2002), p. 361–379.
[21] beran m. j.: “statistical continuum theories.” monographs in statistical physics, interscience publishers, 1968.
[22] torquato s.: “random heterogeneous materials: microstructure and macroscopic properties.” springer-verlag, 2002.
[23] michel j. c., moulinec h., suquet p.: “effective properties of composite materials with periodic microstructure: a computational approach.” computer methods in applied mechanics and engineering, vol. 172 (1999), p. 109–143.
[24] bittnar z., šejnoha j.: “numerical methods in structural engineering.” asce press, 1996.
[25] procházka p., šejnoha j.: “a bem formulation for homogenization of composites with randomly distributed fibers.” engineering analysis with boundary elements, vol. 27 (2003), no. 2, p. 137–144.
[26] leonov a. i.: “non-equilibrium thermodynamics and rheology of viscoelastic polymer media.” rheologica acta, vol. 15 (1976), p. 85–98.
[27] valenta r.: „numerické modelování polymerů“ [numerical modeling of polymers]. master thesis, czech technical university in prague, 2003 (in czech).
[28] valenta r.: “numerical implementation of the leonov model.” in: engineering mechanics 2004 (i. zolotarev, a. poživilová, editors), institute of thermomechanics, czech academy of sciences, 2004, p. 303–304.
[29] valenta r., šejnoha m.: “epoxy resin as a bonding agent in polymer matrix composites: material properties and numerical implementation.” to be presented at icces’04, madeira, portugal, july 26–29, 2004.

ing. jan zeman, ph.d.
phone: +420 224 354 482
fax: +420 224 310 775
e-mail: zemanj@cml.fsv.cvut.cz

ing. richard valenta
phone: +420 224 354 472
fax: +420 224 310 775
e-mail: richard.valenta@fsv.cvut.cz

doc. ing. michal šejnoha, ph.d.
phone: +420 224 354 494
fax: +420 224 310 775
e-mail: sejnom@fsv.cvut.cz

department of structural mechanics
czech technical university in prague
faculty of civil engineering
thákurova 7
166 29 prague 6, czech republic

effect of boundary constraints in the formulation of the partition of unity method: one-dimensional setting

m. audy, m. šejnoha

the paper examines the effect of boundary constraints applied to the enhanced degrees of freedom of partition of unity based discontinuous elements. to highlight the issue, the problem is studied in a one-dimensional setting. in particular, an example of a one-dimensional bar element crossed by a set of discontinuities having a finite elastic stiffness clearly shows the need for a proper approximation of the displacement field within a discontinuous element in order to correctly represent the structural response. while the discontinuous elements with boundary constraints applied to the enhanced degrees of freedom display an unrealistic dependence of the global response on the locations of the discontinuities, the discontinuous elements with a complete approximation of the discontinuous part of the displacement field provide the expected global response, independent of the locations of the discontinuities.

keywords: strong discontinuity approach, partition of unity method (pum), boundary constraints.

1 introduction

the partition of unity concept [2], which allows a local enrichment of the standard finite element basis by special functions, has been widely used to model displacement discontinuities in a number of applications, e.g., the quasi-brittle failure of natural stones such as massangis limestone [6] or continuous-discontinuous modeling of failure in high performance fiber-reinforced cement composites [7]. in this framework, the discontinuity in the displacement field is introduced by enriching the standard finite element polynomial basis with the heaviside function [8]. this enrichment, however, results in additional degrees of freedom (enhanced degrees of freedom) in the nodes that belong to the domain affected by the enrichment. since these degrees of freedom enter the set of displacement degrees of freedom, proper constraints must be applied to those located on the domain boundary in order to maintain the regularity of the resulting system of equations. although not immediately evident, in certain applications this step may significantly pollute the correct solution of a given boundary value problem.

to introduce the subject, recall the problem of localization of inelastic deformation in problems free of initial stress concentrators. this problem has been addressed, e.g., in the thesis of brocca [5] and recently revisited in [1] using the concept of the partition of unity method, which allows the necessary splitting of the total displacement field into elastic displacements and inelastic displacements associated with the crack opening. to test the ability of the latter approach to provide the desired results, and also to gain a clear insight into the problem formulation, a one-dimensional setting was used, for which the exact solution is available. the presented numerical examples revealed several drawbacks of this approach. among others, the study showed a possible depreciation of the results when an element crossed by a discontinuity contains a boundary node that has to be eliminated by the boundary constraints.

motivated by the above result, this paper attempts to shed more detailed light on this problem and to clearly illustrate the need for a complete approximation of the discontinuous part of the displacement field in order to arrive at correct results. to keep the analysis simple, attention is again limited to a one-dimensional bar element crossed by a set of discontinuities, with a finite elastic stiffness assigned to each of the predefined discontinuities.

the paper is organized as follows. section 2 outlines the derivation of the linearized weak form of the governing equations.
application to a one-dimensional problem is then discussed in section 3 and compared to the analytical solutions provided by a conventional chain of spring elements.

2 strong discontinuity problem

this section reviews the general steps in the formulation of the problem of embedded discontinuities based on the partition of unity method. in this framework, the discontinuous modes are introduced through the heaviside step function directly in the kinematic relations. the standard principle of virtual work is then used to arrive at the discrete system of linearized governing equations.

2.1 kinematics of a displacement jump

consider a body $\Omega$ bounded by a surface $\Gamma$ and crossed by a discontinuity $\Gamma_d$, fig. 1. $\Gamma_u$ represents a portion of $\Gamma$ with prescribed displacements $\bar{\mathbf{u}}$, while tractions $\bar{\mathbf{t}}$ are prescribed on $\Gamma_t$ ($\Gamma = \Gamma_u \cup \Gamma_t$, $\Gamma_u \cap \Gamma_t = \emptyset$). the internal discontinuity surface $\Gamma_d$ divides the body into two subdomains, $\Omega^{+}$ and $\Omega^{-}$ ($\Omega = \Omega^{+} \cup \Omega^{-}$).

fig. 1: body $\Omega$ crossed by discontinuity $\Gamma_d$

suppose that the displacement field can be split into discontinuous and continuous parts,

$\mathbf{u}(\mathbf{x}, t) = \hat{\mathbf{u}}(\mathbf{x}, t) + H_{\Gamma_d}(\mathbf{x})\,\tilde{\mathbf{u}}(\mathbf{x}, t)$, (1)

where $H_{\Gamma_d}(\mathbf{x})$ is the heaviside function centered at the discontinuity surface $\Gamma_d$ ($H_{\Gamma_d}(\mathbf{x}) = 1$, $\forall \mathbf{x} \in \Omega^{+}$ and $H_{\Gamma_d}(\mathbf{x}) = 0$, $\forall \mathbf{x} \in \Omega^{-}$), and $\hat{\mathbf{u}}$ and $\tilde{\mathbf{u}}$ are continuous functions on $\Omega$. note that the discontinuity is introduced by the heaviside function $H_{\Gamma_d}(\mathbf{x})$ at the discontinuity surface $\Gamma_d$, and that the magnitude of the displacement jump $[[\mathbf{u}]]$ at the discontinuity surface is given by $\tilde{\mathbf{u}}$. for small displacements, the strain field assumes the form

$\boldsymbol{\varepsilon} = \boldsymbol{\partial}^{\mathsf{T}}\hat{\mathbf{u}} + H_{\Gamma_d}\,\boldsymbol{\partial}^{\mathsf{T}}\tilde{\mathbf{u}}, \quad \mathbf{x} \in \Omega \setminus \Gamma_d$, (2)

where the operator matrix $\boldsymbol{\partial}$ can be found in [4].

2.2 governing equations

the displacement field can be interpolated over the body $\Omega$ using the concept of the partition of unity method. for the purpose of the present work, it is sufficient to define a partition of unity as a collection of functions $\varphi_i$ which satisfy (see, e.g., [2] for more details)

$\sum_{i=1}^{n} \varphi_i(\mathbf{x}) = 1, \quad \forall \mathbf{x} \in \Omega$, (3)

where $n$ is the number of discrete points (nodes).
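to illustrate the enriched kinematics of eq. (1) in the simplest setting, the sketch below evaluates a heaviside-enriched displacement field in a single 1-d element. the linear shape functions and the side on which the heaviside function equals one are our own choices for the illustration, not prescribed by the paper.

```python
import numpy as np

def enriched_u(x, x1, x2, a, b, d):
    """heaviside-enriched displacement in a 1-d element [x1, x2] crossed
    at x = d: u(x) = N(x) a + H(x - d) N(x) b, cf. eq. (1)."""
    N = np.array([(x2 - x) / (x2 - x1), (x - x1) / (x2 - x1)])
    H = 1.0 if x > d else 0.0      # H = 1 on one side of the discontinuity
    return N @ a + H * (N @ b)

a = np.array([0.0, 1.0])           # standard nodal displacements
b = np.array([0.5, 0.5])           # enhanced (jump) degrees of freedom
d = 0.4
# the jump across the discontinuity equals N(d) b, here 0.5
print(enriched_u(d + 1e-9, 0.0, 1.0, a, b, d)
      - enriched_u(d - 1e-9, 0.0, 1.0, a, b, d))
```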
the displacement field is then interpolated in terms of discrete nodal values by

$\mathbf{u}(\mathbf{x}) = \sum_{i=1}^{n} \varphi_i(\mathbf{x})\left(a_i + \gamma_i(\mathbf{x})\,b_i\right)$, (4)

where $\varphi_i$ is a partition of unity function, $a_i$ is the discrete nodal value and $b_i$ is the ‘enhanced’ nodal value with respect to the ‘enhanced’ basis $\gamma_i$. note that the standard finite element shape functions $N_i$ also form a partition of unity, since

$\sum_{i=1}^{n} N_i(\mathbf{x}) = 1, \quad \forall \mathbf{x} \in \Omega$. (5)

in the standard finite element method, the partition of unity functions are the shape functions and the enhanced basis is empty. when adopting the general scheme (4), the discretized form of the displacement field becomes

$\mathbf{u}(\mathbf{x}) = \mathbf{N}(\mathbf{x})\,\mathbf{a} + \mathbf{N}_\gamma(\mathbf{x})\,\mathbf{b}$, (6)

where $\mathbf{N}$ is the matrix of standard nodal shape functions interpolating the regular nodal degrees of freedom, and the vector $\mathbf{N}_\gamma(\mathbf{x})\,\mathbf{b}$ serves to introduce certain specific features of the displacement field $\mathbf{u}$ using the so-called enhanced degrees of freedom stored in the vector $\mathbf{b}$. introducing eq. (4) into eq. (2) gives the strain field in the form

$\boldsymbol{\varepsilon}(\mathbf{x}) = \mathbf{B}(\mathbf{x})\,\mathbf{a} + \mathbf{B}_\gamma(\mathbf{x})\,\mathbf{b}$, (7)

where $\mathbf{B} = \boldsymbol{\partial}^{\mathsf{T}}\mathbf{N}$ and $\mathbf{B}_\gamma = \boldsymbol{\partial}^{\mathsf{T}}\mathbf{N}_\gamma$. a suitable choice of $\mathbf{N}_\gamma$ may considerably improve the description of the displacement field for a specific class of problems [3]. when solving, e.g., the localized damage problem, the discontinuous displacement field can be easily modeled by replacing the matrix $\mathbf{N}_\gamma$ with the scalar heaviside function $H_{\Gamma_d}$ multiplied by a boolean matrix $\mathbf{H}$ (a matrix with entry 1 if the corresponding degree of freedom is enhanced and zero otherwise). eqs. (6) and (7) then become

$\mathbf{u}(\mathbf{x}) = \mathbf{N}(\mathbf{x})\,\mathbf{a} + H_{\Gamma_d}(\mathbf{x})\,\mathbf{N}(\mathbf{x})\,\mathbf{H}\,\mathbf{b}$, (8)

$\boldsymbol{\varepsilon}(\mathbf{x}) = \mathbf{B}(\mathbf{x})\,\mathbf{a} + H_{\Gamma_d}(\mathbf{x})\,\mathbf{B}(\mathbf{x})\,\mathbf{H}\,\mathbf{b}$, (9)

where eqs. (8)–(9) hold for $\mathbf{x} \in \Omega \setminus \Gamma_d$. thus the constitutive equations for the stress $\boldsymbol{\sigma}$ at a point $\mathbf{x} \in \Omega \setminus \Gamma_d$ and for the tractions developed on the discontinuity surface $\Gamma_d$ read

$\boldsymbol{\sigma}(\mathbf{x}) = \mathbf{D}\,\boldsymbol{\varepsilon} = \mathbf{D}\left(\mathbf{B}(\mathbf{x})\,\mathbf{a} + H_{\Gamma_d}(\mathbf{x})\,\mathbf{B}(\mathbf{x})\,\mathbf{H}\,\mathbf{b}\right)$, (10)

$\mathbf{t}(\mathbf{x}) = \tilde{\mathbf{D}}\,\mathbf{N}(\mathbf{x})\,\mathbf{H}\,\mathbf{b}$. (11)

employing the principle of virtual work, the resulting discrete system of linear equations receives the traditional form (see [1] for more details)

$\mathbf{K}\,\mathbf{u} = \mathbf{f}$, (12)

where $\mathbf{u}$ represents the vector of nodal displacements consisting of standard and enhanced degrees of freedom,

$\mathbf{u} = \left[\mathbf{a}\;\;\mathbf{b}\right]^{\mathsf{T}}$, (13)

and $\mathbf{f}$ lists the externally applied forces,

$\mathbf{f} = \left[\mathbf{f}_a\;\;\mathbf{f}_b\right]^{\mathsf{T}}$, (14)

where

$\mathbf{f}_a = \int_{\Gamma_t} \mathbf{N}^{\mathsf{T}}\,\bar{\mathbf{t}}\,\mathrm{d}\Gamma$, (15)

$\mathbf{f}_b = \int_{\Gamma_t} H_{\Gamma_d}\,\mathbf{N}^{\mathsf{T}}\,\bar{\mathbf{t}}\,\mathrm{d}\Gamma$, (16)

and finally $\mathbf{K}$ represents the enhanced stiffness matrix

$\mathbf{K} = \begin{bmatrix} \mathbf{K}_{aa} & \mathbf{K}_{ab} \\ \mathbf{K}_{ba} & \mathbf{K}_{bb} \end{bmatrix}$, (17)

where the individual sub-matrices are defined as

$\mathbf{K}_{aa} = \int_{\Omega} \mathbf{B}^{\mathsf{T}}\,\mathbf{D}\,\mathbf{B}\,\mathrm{d}\Omega$, (18)

$\mathbf{K}_{ab} = \mathbf{K}_{ba}^{\mathsf{T}} = \int_{\Omega} H_{\Gamma_d}\,\mathbf{B}^{\mathsf{T}}\,\mathbf{D}\,\mathbf{B}\,\mathrm{d}\Omega$, (19)

$\mathbf{K}_{bb} = \int_{\Omega} H_{\Gamma_d}\,\mathbf{B}^{\mathsf{T}}\,\mathbf{D}\,\mathbf{B}\,\mathrm{d}\Omega + \int_{\Gamma_d} \mathbf{N}^{\mathsf{T}}\,\tilde{\mathbf{D}}\,\mathbf{N}\,\mathrm{d}\Gamma$. (20)

3 numerical analysis of a one-dimensional problem

the general formulation presented in the previous section will now be given in the context of a one-dimensional discontinuous bar element. in particular, we will consider elements with one, two or an arbitrary number of discontinuities, with a constant elastic stiffness assigned to each discontinuity. the effect of the problem setup in terms of the number of elements, the location of the crack element with respect to the prescribed boundary conditions, and the discontinuity position within an element is of primary interest.

3.1 simple chain model

first, assume a simple chain model consisting of a spring and a set of discontinuities, as shown in fig. 2.

fig. 2: simple chain model
here $k_i$ is the spring constant and $h_i$ represents the elastic stiffness of discontinuity $i$, which relates the force transmitted across the discontinuity to the discontinuity opening displacement. the assumed arrangement of the individual elements in the chain model suggests

$f = f_{k_1} = f_{k_2} = f_{h_1} = f_{h_2}$, (21)

$u = u_{k_1} + u_{k_2} + u_{h_1} + u_{h_2}$, (22)

$u = \dfrac{f}{k}$. (23)

substituting from eq. (23) into eq. (22) gives

$\dfrac{f}{k} = \dfrac{f_{k_1}}{k_1} + \dfrac{f_{k_2}}{k_2} + \dfrac{f_{h_1}}{h_1} + \dfrac{f_{h_2}}{h_2}$, (24)

and then using eq. (21) provides the effective stiffness $k$ in the form

$\dfrac{f}{k} = \dfrac{f}{k_1} + \dfrac{f}{k_2} + \dfrac{f}{h_1} + \dfrac{f}{h_2}$, (25)

$\dfrac{1}{k} = \dfrac{1}{k_1} + \dfrac{1}{k_2} + \dfrac{1}{h_1} + \dfrac{1}{h_2}$. (26)

a simple generalization to $m$ springs and $n$ discontinuities yields

$\dfrac{1}{k} = \sum_{i=1}^{m}\dfrac{1}{k_i} + \sum_{i=1}^{n}\dfrac{1}{h_i}$. (27)

note that the previous derivation requires no assumption about the locations of the discontinuities. it is therefore expected that, if the same problem is addressed in the framework of pum-based discontinuous elements, the jumps across the individual discontinuities should be independent of the discontinuity locations and should depend solely on the assigned discontinuity stiffnesses. the latter condition arises from the fact that the tensile stress in the structure should remain constant and equal to $\sigma = f/A$, where $f$ is the applied force and $A$ is the element cross-sectional area, recall eqs. (22)–(25). fulfillment of the above requirements will now be explored for several configurations.
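eq. (27) is a two-line computation; the following sketch evaluates it for the parameter values used later in table 1 ($k_a$ = 100 n/m, $h$ = 50 n/m), emphasizing that the result contains no information about where the discontinuities sit.

```python
def chain_stiffness(k_springs, h_discontinuities):
    """effective stiffness of springs and discontinuities in series, eq. (27):
    1/k = sum_i 1/k_i + sum_i 1/h_i -- no discontinuity locations enter."""
    return 1.0 / (sum(1.0 / k for k in k_springs)
                  + sum(1.0 / h for h in h_discontinuities))

print(chain_stiffness([100.0], [50.0]))   # one spring, one discontinuity
```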
3.2 pum-based discontinuous elements

three different configurations will be examined. first, we consider the simplest structure, consisting of a spring and a single discontinuity. an element with two discontinuities is studied next, and finally we provide general results for an element with $n$ discontinuities.

3.2.1 pum-based element with one discontinuity

two representatives of the possible numerical models appear in figs. 3(a), (b). before commenting on the individual configurations, we present the derivation of the element stiffnesses for the typical elements in figs. 3(a), (b). to that end, we introduce the notation

$k_a = \dfrac{EA}{a}, \qquad k_b = \dfrac{EA}{b}$, (28)

where $E$, $A$, $a$, $b$ are the young modulus, the cross-sectional area and the lengths of the individual elements, respectively; $k_a$ and $k_b$ then represent, in analogy with fig. 2, the corresponding spring constants, and $h$ is reserved for the discontinuity elastic stiffness.

fig. 3: one discontinuity model

to proceed, consider the element in fig. 3(a). by analogy with eq. (13), the element degrees of freedom are ordered as

$\mathbf{u} = \left[a_1\;\;a_2\;\;b_1\;\;b_2\right]^{\mathsf{T}}$. (29)

note that a one-dimensional bar element crossed by a discontinuity has two degrees of freedom in each node, one standard and one enhanced. with reference to fig. 3(a), eqs. (18)–(20) now become

$\mathbf{K}_{aa} = EA\int_{0}^{a} \mathbf{B}^{\mathsf{T}}\mathbf{B}\,\mathrm{d}x$, (30)

$\mathbf{K}_{ab} = EA\int_{0}^{d} \mathbf{B}^{\mathsf{T}}\mathbf{B}\,\mathrm{d}x$, (31)

$\mathbf{K}_{bb} = EA\int_{0}^{d} \mathbf{B}^{\mathsf{T}}\mathbf{B}\,\mathrm{d}x + h\,\mathbf{N}^{\mathsf{T}}\mathbf{N}\big|_{x=d}$. (32)

assuming the standard linear interpolation functions for a one-dimensional bar element,

$\mathbf{N} = \left[\dfrac{a-x}{a}\;\;\dfrac{x}{a}\right], \qquad \mathbf{B} = \left[-\dfrac{1}{a}\;\;\dfrac{1}{a}\right]$, (33)

and employing the notation in eq. (28), we arrive at the following element stiffness matrix:

$\mathbf{K} = \begin{bmatrix} k_a & -k_a & k_a\frac{d}{a} & -k_a\frac{d}{a} \\ -k_a & k_a & -k_a\frac{d}{a} & k_a\frac{d}{a} \\ k_a\frac{d}{a} & -k_a\frac{d}{a} & k_a\frac{d}{a}+h\left(1-\frac{d}{a}\right)^2 & -k_a\frac{d}{a}+h\left(1-\frac{d}{a}\right)\frac{d}{a} \\ -k_a\frac{d}{a} & k_a\frac{d}{a} & -k_a\frac{d}{a}+h\left(1-\frac{d}{a}\right)\frac{d}{a} & k_a\frac{d}{a}+h\left(\frac{d}{a}\right)^2 \end{bmatrix}$. (34)

finally, after imposing the boundary constraints on both the standard and the enhanced degrees of freedom and introducing the applied loading, we get for the configuration displayed in fig. 3(a) the following global system of equations:

$\begin{bmatrix} k_a & k_a\frac{d}{a} \\ k_a\frac{d}{a} & k_a\frac{d}{a}+h\left(\frac{d}{a}\right)^2 \end{bmatrix} \begin{bmatrix} a_2 \\ b_2 \end{bmatrix} = \begin{bmatrix} f \\ 0 \end{bmatrix}$.

solving for the free degrees of freedom then yields

$\mathbf{u} = \begin{bmatrix} a_2 \\ b_2 \end{bmatrix} = \begin{bmatrix} \dfrac{(dh + ak_a)\,f}{k_a\,(dh + ak_a - dk_a)} \\[2mm] -\dfrac{a\,f}{dh + ak_a - dk_a} \end{bmatrix}$. (38)

eq. (38) clearly shows that the solution of the first configuration violates the basic requirement of being independent of the discontinuity location. this can be attributed to the fact that in this case the displacement field of the discontinuous element is not well approximated, as one of the two enhanced degrees of freedom is constrained. consequently, the above solution, when introduced into eq. (10), gives a linear variation of the stress over the element, which is in direct contradiction with the results summarized in section 3.1.

on the contrary, rather different results are obtained for the configuration of fig. 3(b). on the structural level, the vector of unknown degrees of freedom assumes the form

$\mathbf{u} = \left[a_1\;\;a_2\;\;a_3\;\;b_1\;\;b_2\;\;b_3\right]^{\mathsf{T}}$. (35)

the element stiffness matrix for element #2 is identical to that given by eq. (34) once $k_a$ is replaced by $k_b$ and $a$ by $b$. it thus remains to determine the element stiffness matrix for element #1. note that this element contains a node whose support (element #2) is crossed by a discontinuity. also note that node #1 does not necessarily have to be enhanced, as the support of the associated nodal base function is not crossed by any discontinuity. here, the enhanced degree of freedom $b_1$ of this node is preserved for the sake of simplicity in the derivation of the element stiffness matrix, and will be eliminated via the boundary constraints. since the entire element is contained in the domain $\Omega^{+}$ and is discontinuity-free, the element stiffness matrix provided by eq. (34) reduces to

$\mathbf{K} = k_a \begin{bmatrix} 1 & -1 & 1 & -1 \\ -1 & 1 & -1 & 1 \\ 1 & -1 & 1 & -1 \\ -1 & 1 & -1 & 1 \end{bmatrix}$. (36)

after assembly, the global stiffness matrix of the configuration in fig. 3(b) (eq. (37)) is obtained by combining eq. (36) for element #1 with eq. (34), written with $k_b$, $b$ and $d$, for element #2. introducing the boundary constraints, i.e., removing the degrees of freedom of node #1, and applying the load $f$ at node #3 leads to a system for $\left[a_2\;a_3\;b_2\;b_3\right]^{\mathsf{T}}$ with the right-hand side $\left[0\;f\;0\;0\right]^{\mathsf{T}}$, whose solution

$\mathbf{u} = \begin{bmatrix} a_2 \\ a_3 \\ b_2 \\ b_3 \end{bmatrix} = \begin{bmatrix} \left(\dfrac{1}{k_a}+\dfrac{1}{h}\right)f \\[2mm] \left(\dfrac{1}{k_a}+\dfrac{1}{k_b}+\dfrac{1}{h}\right)f \\[2mm] \dfrac{f}{h} \\[2mm] \dfrac{f}{h} \end{bmatrix}$ (39)

is clearly independent of the discontinuity location $d$. in addition, the variation of the discontinuous part of the displacement field, recall eq. (39), is constant in the discontinuous element.

table 1: material, geometrical and loading parameters
k_a = ea/a [n/m]: 100
k_b = ea/b [n/m]: 50
h [n/m]: 50
a [m]: 1
b [m]: 2
f [n]: 100

a graphical representation of the above results, derived using the material setting from table 1, is plotted in figs. 4(a), (b). the figure shows the variation of the displacement field for two different crack locations.

fig. 4: variation of the displacement field for a single crack element for the two different configurations of fig. 3
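the d-dependence in eq. (38) is easy to check numerically. the sketch below assembles the element matrix as reconstructed in eq. (34), constrains both degrees of freedom of node #1 as in fig. 3(a), and solves the reduced system for several crack positions, using the parameters of table 1; the changing results reproduce the pathology discussed above.

```python
import numpy as np

def k_element(k_a, a, d, h):
    """element stiffness matrix, eq. (34); dof order [a1, a2, b1, b2]."""
    r = d / a
    kb = np.array([[1.0, -1.0], [-1.0, 1.0]])
    Nd = np.array([1.0 - r, r])              # shape functions at x = d, eq. (33)
    K = np.zeros((4, 4))
    K[:2, :2] = k_a * kb                     # K_aa, eq. (30)
    K[:2, 2:] = k_a * r * kb                 # K_ab, eq. (31)
    K[2:, :2] = k_a * r * kb                 # K_ba
    K[2:, 2:] = k_a * r * kb + h * np.outer(Nd, Nd)   # K_bb, eq. (32)
    return K

k_a, a, h, f = 100.0, 1.0, 50.0, 100.0      # table 1 values
for d in (0.25, 0.50, 0.75):
    K = k_element(k_a, a, d, h)
    free = [1, 3]                            # a1 = b1 = 0, as in fig. 3(a)
    a2, b2 = np.linalg.solve(K[np.ix_(free, free)], np.array([f, 0.0]))
    print(f"d = {d}: a2 = {a2:.3f}, b2 = {b2:.3f}")   # varies with d, eq. (38)
```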
note the expected constant distribution of the tensile stress found for the second configuration and plotted in fig. 4(b). the same results, however, are not obtained for the first configuration; see fig. 4(a), suggesting an unrealistic jump in the tensile stresses at the discontinuity location. a similar conclusion can be drawn for the problem of an element with two discontinuities studied below.

3.2.2 pum-based element with two discontinuities

for the case of two discontinuities placed within an element, the two possible configurations are plotted in figs. 5(a), (b), where $d_1$ and $d_2$ represent two arbitrary locations of the element discontinuities.

fig. 5: two-discontinuities model

moving in the footsteps of the previous section, we first derive the element stiffness matrix. owing to the presence of two discontinuities, there are two enhanced degrees of freedom in each node. the two enhanced degrees of freedom in the first node of the second configuration in fig. 5(b) are, however, inactive, since the support of the node base function is not crossed by a discontinuity; for the solution of the underlying problem they will again be eliminated by the boundary constraints. in order to derive the element stiffness matrix, suppose that the enhanced degrees of freedom are ordered consecutively with respect to the individual discontinuities according to figs. 5(a), (b). thus the degrees of freedom ($b_1$, $b_2$, $b_3$) correspond to discontinuity #1, whereas the degrees of freedom ($b_4$, $b_5$, $b_6$) are linked to discontinuity #2. the element stiffness matrix then receives the form

$\mathbf{K} = \begin{bmatrix} \mathbf{K}_{aa} & \mathbf{K}_{ab} & \mathbf{K}_{ac} \\ \mathbf{K}_{ba} & \mathbf{K}_{bb} & \mathbf{K}_{bc} \\ \mathbf{K}_{ca} & \mathbf{K}_{cb} & \mathbf{K}_{cc} \end{bmatrix}$, (40)

where the individual submatrices are provided by

$\mathbf{K}_{aa} = EA\int_{0}^{a} \mathbf{B}^{\mathsf{T}}\mathbf{B}\,\mathrm{d}x$, (41)

$\mathbf{K}_{ab} = \mathbf{K}_{ba}^{\mathsf{T}} = EA\int_{0}^{d_1} \mathbf{B}^{\mathsf{T}}\mathbf{B}\,\mathrm{d}x$, (42)

$\mathbf{K}_{ac} = \mathbf{K}_{ca}^{\mathsf{T}} = EA\int_{0}^{d_2} \mathbf{B}^{\mathsf{T}}\mathbf{B}\,\mathrm{d}x$, (43)

$\mathbf{K}_{bb} = EA\int_{0}^{d_1} \mathbf{B}^{\mathsf{T}}\mathbf{B}\,\mathrm{d}x + h_1\,\mathbf{N}^{\mathsf{T}}\mathbf{N}\big|_{x=d_1}$, (44)

$\mathbf{K}_{bc} = \mathbf{K}_{cb}^{\mathsf{T}} = EA\int_{0}^{d_1} \mathbf{B}^{\mathsf{T}}\mathbf{B}\,\mathrm{d}x$, (45)

$\mathbf{K}_{cc} = EA\int_{0}^{d_2} \mathbf{B}^{\mathsf{T}}\mathbf{B}\,\mathrm{d}x + h_2\,\mathbf{N}^{\mathsf{T}}\mathbf{N}\big|_{x=d_2}$. (46)

as in the problems discussed in the previous section, the solution of the two problems in fig. 5 requires the introduction of the boundary constraints and of the loading. in particular, removing all the degrees of freedom in node #1 gives, after some algebra, the solution of the first problem, $\mathbf{u} = \left[a_2\;b_2\;b_4\right]^{\mathsf{T}}$, as a lengthy closed-form expression (eq. (47)) in which the discontinuity locations $d_1$ and $d_2$ appear explicitly.

as expected, the solution in eq. (47) thus depends, for the same reasons as already pointed out, on the locations of the two discontinuities and must be disqualified. in contrast to the first configuration, the solution for the second configuration in fig. 5(b), eq. (48), does not suffer from this drawback:
$\mathbf{u} = \begin{bmatrix} a_2 \\ a_3 \\ b_2 \\ b_3 \\ b_5 \\ b_6 \end{bmatrix} = \begin{bmatrix} \left(\dfrac{1}{k_a}+\dfrac{1}{h_1}+\dfrac{1}{h_2}\right)f \\[2mm] \left(\dfrac{1}{k_a}+\dfrac{1}{k_b}+\dfrac{1}{h_1}+\dfrac{1}{h_2}\right)f \\[2mm] \dfrac{f}{h_1} \\[2mm] \dfrac{f}{h_1} \\[2mm] \dfrac{f}{h_2} \\[2mm] \dfrac{f}{h_2} \end{bmatrix}$. (48)

the correctness of this solution is again supported by fig. 6(b), which shows a constant variation of the tensile stress along the bar, unlike the plot in fig. 6(a) derived for the first configuration. also note the constant variation of the discontinuous part of the displacement field for both discontinuities.

fig. 6: variation of the displacement field for a single crack element for the two different configurations of fig. 5

3.2.3 pum-based element with n discontinuities – general case

to complete our discussion, we also present the derivation of the element stiffness matrix for the case of $n$ discontinuities. keeping the same ordering of the enhanced degrees of freedom as in the previous section (see also fig. 7), we get

$\mathbf{K}_{aa} = EA\int_{0}^{a} \mathbf{B}^{\mathsf{T}}\mathbf{B}\,\mathrm{d}x$, (49)

$\mathbf{K}_{ij} = \mathbf{K}_{ji}^{\mathsf{T}} = EA\int_{0}^{\min(d_i, d_j)} \mathbf{B}^{\mathsf{T}}\mathbf{B}\,\mathrm{d}x + \boldsymbol{\delta}_{ij}\,h_i\,\mathbf{N}^{\mathsf{T}}\mathbf{N}\big|_{x=d_i}$, (50)

where $\boldsymbol{\delta}_{ij}$ is assumed to represent an identity matrix for $i = j$ and a zero matrix for $i \neq j$.

fig. 7: n-discontinuities model

by analogy with eq. (48), the solution of the problem plotted in fig. 7 reads

$\mathbf{u} = \left[a_2\;\;a_3\;\;b_2\;\;b_3\;\;b_5\;\;b_6\;\cdots\;b_{3n-1}\;\;b_{3n}\right]^{\mathsf{T}}$, (51)

$\mathbf{u} = \left[\left(\dfrac{1}{k_a}+\sum_{i=1}^{n}\dfrac{1}{h_i}\right)f,\;\left(\dfrac{1}{k_a}+\dfrac{1}{k_b}+\sum_{i=1}^{n}\dfrac{1}{h_i}\right)f,\;\dfrac{f}{h_1},\;\dfrac{f}{h_1},\;\dfrac{f}{h_2},\;\dfrac{f}{h_2},\;\ldots,\;\dfrac{f}{h_n},\;\dfrac{f}{h_n}\right]^{\mathsf{T}}$, (52)

revealing again a constant distribution of the discontinuous part of the displacement field within the discontinuous element.

4 conclusions

a simple one-dimensional example was given to demonstrate the essential drawback of pum-based discontinuous elements associated with constraining the enhanced degrees of freedom. it was shown that, for the results to be correct and independent of the locations of the discontinuities, the discontinuous part of the displacement field must be fully approximated. this can be accomplished by placing the discontinuous element away from the domain boundary. when the discontinuous element, however, contains a boundary node that must be constrained, the free degree of freedom in the other node is not sufficient to provide a correct representation of the discontinuous part of the displacement field, resulting in an erroneous response that depends on the discontinuity location. although the present results cannot be directly transplanted to the general case, they suggest possible problems when applying fixed kinematic boundary conditions to the enhanced degrees of freedom in higher dimensions as well, as typically done in [2].

5 acknowledgment

this work was sponsored by research projects msm 210000001 and msm 210000003.

references

[1] audy m.: localization of inelastic deformation in problems free of initial stress concentrators.
master thesis, czech technical university in prague, faculty of civil engineering, prague, 2003.
[2] babuška i., melenk j. m.: “the partition of unity method.” international journal for numerical methods in engineering, vol. 40 (1997), p. 727–758.
[3] babuška i., banerjee u., osborn j. e.: “on principles for the selection of shape functions for the generalized finite element method.” computer methods in applied mechanics and engineering, vol. 191 (2002), no. 49–50.
[4] bittnar z., šejnoha m.: numerical methods in structural engineering. asce press, 1996.
[5] brocca m.: analysis of cracking localization and crack growth based on thermomechanical theory of localization. ph.d. thesis, university of tokyo, 1997.
[6] de proft k.: combined experimental-computational study to discrete fracture of brittle materials. ph.d. thesis, vrije universiteit brussel, brussel, 2003.
[7] simone a.: continuous-discontinuous modelling of failure. ph.d. thesis, delft university of technology, delft, 2003.
[8] moës n., belytschko t.: “extended finite element method for cohesive crack growth.” engineering fracture mechanics, vol. 69 (2002), p. 813–833.

ing. miroslav audy
phone: +420 224 354 472
fax: +420 224 310 775
e-mail: miroslav.audy@fsv.cvut.cz

doc. ing. michal šejnoha, ph.d.
phone: +420 224 354 494
fax: +420 224 310 775
e-mail: sejnom@fsv.cvut.cz

department of structural mechanics
czech technical university in prague
faculty of civil engineering
thákurova 7
166 29 prague 6, czech republic

pseudorandom testing – a study of the effect of the generator type

p. fišer, h. kubátová

the test pattern generator produces the test vectors that are applied to the tested circuit during pseudo-random testing of combinational circuits. the nature of the generator thus directly influences the fault coverage achieved. in this paper we discuss the influence of the type of pseudo-random pattern generator on the stuck-at fault coverage. linear feedback shift registers (lfsrs) are mostly used as test pattern generators, and the generating polynomial is primitive to ensure the maximum period. we show that it is not necessary to use primitive polynomials, and moreover that their use is even undesirable in most cases. this fact is documented by statistical graphs. the necessity of a proper choice of the generating polynomial and the lfsr seed is shown by designing a mixed-mode bist for the iscas benchmarks. an alternative to lfsrs are cellular automata (ca); we study the effectiveness of ca when used as pseudo-random pattern generators. the observations are documented by statistical results.

keywords: built-in self-test, diagnostics, testability, lfsr, test pattern generators, column-matching.

1 introduction

the complexity of present-day vlsi devices has risen to millions of gates, and the chips are therefore becoming untestable by the standard external ate (automated test equipment) testers used in manufacturing. the test lengths are rapidly increasing, as are the testing times and the ate memory requirements. hence the built-in self-test (bist) has become a necessary part of vlsi circuits. with bist, the circuit is able to test itself without any ate equipment, or, when used together with an external tester, bist significantly reduces the test time and the tester memory demands.

many bist techniques have been developed [1, 2]. the vast majority of them use a pseudo-random pattern generator (prpg) to produce test vectors that detect the easy-to-detect faults, which mostly represent more than 90 % of the total faults. for the remaining faults, test vectors are either applied externally, or they are generated by the bist structure itself. linear feedback shift registers (lfsrs) or cellular automata (ca) are mostly used as prpgs, due to their simplicity, their low implementation area demands and their good fault coverage.

a general bist structure is shown in fig. 1. the patterns are generated by a test pattern generator (tpg), then they are fed to the circuit-under-test (cut), and the circuit’s responses are evaluated. test patterns may be applied to the circuit in parallel, which is denoted as a test-per-clock bist, or serially (test-per-scan) [1].

fig. 1: the bist scheme

the design of the tpg is of key importance for the whole bist, since it determines the fault coverage achieved and the area overhead of the bist equipment. a simple lfsr often cannot ensure satisfactory fault coverage, thus it has to be augmented in some way. in some approaches, the lfsr code word sequence is modified to produce patterns that detect more faults.
these methods involve reseeding the lfsr during the test, possibly also modifying the generating polynomial [3], or the lfsr patterns are modified by additional logic [4, 5, 7, 13]. the best results are produced by mixed-mode bist methods: some of the prpg patterns are applied to the circuit unmodified, to detect the easy-to-detect faults; after that, either deterministic or somehow modified prpg patterns are generated to detect the remaining faults [2, 5, 6, 7].

the proper choice of a prpg is very important in the case of mixed-mode testing. it is desirable to detect as many faults as possible by the prpg, so that the additional logic is reduced as much as possible. this is the main issue addressed in this paper. we present statistics on the stuck-at fault coverage for the iscas [10, 11] and itc’99 [21] benchmarks, using different prpgs. the influence of the prpg on the total bist area overhead is shown for the column-matching method [7, 13, 14], since this method is highly scalable, and the effects of the generator type and of the test lengths can be demonstrated on it very well.

the paper is organized as follows: the basic principles of prpgs are introduced in section 2, the statistics of fault coverages are presented in section 3, section 4 briefly describes the mixed-mode bist principles, together with the column-matching bist method and the results obtained using it, and section 5 concludes the paper.

2 the prpg structure

generally, prpgs are simple sequential circuits generating code words according to a generating polynomial [23]. these code words are then either fed directly to the cut inputs, or they are modified by some additional circuitry. the most common prpg structures are linear feedback shift registers (lfsrs) and cellular automata (ca).

an n-bit (n-stage) lfsr is a linear sequential circuit consisting of d flip-flops and xor gates, generating code words (patterns) of a cyclic code. the structure of an n-stage lfsr-i (with internal xors) is shown in fig. 2. the register has n parallel outputs corresponding to the outputs of the d flip-flops, and one flip-flop output can be used as a serial output of the register.
the coefficients $c_1, \ldots, c_{n-1}$ express whether there exists a connection from the feedback to the corresponding xor gate (1) or no connection (0); thus each coefficient determines whether the respective xor gate is present or the flip-flops are connected directly. the feedbacks leading to the xor gates are also called taps. the sequence of code words produced by an lfsr can be described by a generating polynomial $g(x)$ in gf($2^n$) [22]:

$g(x) = x^n + c_{n-1}x^{n-1} + c_{n-2}x^{n-2} + \cdots + c_1 x + 1$.

if the generating polynomial is primitive, the lfsr has the maximum period of $2^n - 1$, thus it produces $2^n - 1$ different patterns. the initial state of the register (the initial values of the flip-flops) is called the seed.

fig. 2: lfsr structure

the second lfsr type, the lfsr-ii, is implemented with xors in the feedback. its generating polynomial is dual to the lfsr-i polynomial. only the lfsr-i will be considered in this paper, since these two lfsr types are mutually convertible.

cellular automata are sequential structures similar to lfsrs. their periods are often shorter, but the code words generated by ca are sometimes more suitable as test patterns with preferred numbers of ones or zeros at the outputs. an example of a ca performing multiplication of the polynomials corresponding to the code words by the polynomial $x + 1$ (rule 60 for each cell [19]) is shown in fig. 3.

fig. 3: example of a cellular automaton
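for concreteness, a small python sketch of the two generators follows. it is only an illustration: the right-shift (galois-style) realization of the internal-xor lfsr and the null boundary of the rule-60 ca are our choices, and the mapping between the tap mask and $g(x)$ depends on the chosen bit ordering.

```python
def lfsr_step(state, mask):
    """one step of an lfsr with internal xors (right-shift galois form):
    the bit shifted out is xored back into the stages selected by mask,
    which encodes the nonzero coefficients c_i of g(x) in this bit ordering."""
    lsb = state & 1
    state >>= 1
    if lsb:
        state ^= mask
    return state

def ca60_step(cells):
    """rule-60 cellular automaton: s_i' = s_(i-1) xor s_i (null boundary),
    i.e. multiplication of the code word polynomial by x + 1."""
    return [(cells[i - 1] if i > 0 else 0) ^ cells[i] for i in range(len(cells))]

# 4-bit lfsr demo; in this convention mask 0b1001 realizes a primitive
# polynomial, so all 15 nonzero states are visited
state = 0b0001
for _ in range(4):
    print(format(state, "04b"))
    state = lfsr_step(state, 0b1001)
```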
3 fault coverage statistics

we have performed extensive experiments on the standard iscas benchmarks, both the combinational benchmarks [10] and the full-scan versions of the sequential benchmarks [11], to determine the fault coverage achieved by a pseudo-random test sequence generated by a prpg. the fsim fault simulator [12] was used in all the examples to determine the fault coverage. first we show that the testability, i.e., the fault coverage achieved by a certain number of pseudo-random test vectors, strictly depends on the tested circuit; knowledge of the testability of the circuit for which the bist is being designed can help us select the lengths of the bist phases properly [20]. then we demonstrate the effect of the generator type on the stuck-at fault coverage, and show that the simplest lfsr is sufficient for most applications.

3.1 pseudo-random testability of the circuits

a low area overhead and a good speed of the designed bist strictly depend on the nature of the circuit for which the bist is being designed. the pseudo-random testability of a particular circuit strictly depends on the number of hard-to-detect faults. some circuits can be fully tested by an unmodified sequence of lfsr code words in a reasonable number of cycles, while other circuits are practically untestable in this way.

we have studied the pseudo-random testability of the iscas [10, 11] and itc’99 [21] benchmarks, using standard lfsrs. all the benchmarks were in their full-scan versions, thus turned into combinational circuits. each benchmark was tested 1000 times using different lfsr polynomials and seeds. both the polynomials and the seeds were generated randomly, while a satisfactory period length was ensured by simulating the prpg run. the number of lfsr bits was set equal to the number of cut inputs. the results for a selected set of benchmarks are shown in table 1. the “i” column shows the number of benchmark inputs (including the scan path for sequential circuits), “range” indicates the range of the encountered numbers of test patterns needed to fully test the circuit (over those 1000 samples), and the statistical average is shown in the last column. “k” stands for thousands of patterns, “m” for millions, “g” for billions. for some benchmarks the range was not evaluated, because of the extremely large number of patterns needed to fully test the circuit (more than 10 m).

table 1: pseudo-random testability

bench      i      range                avg
c17        5      2 – 33               4
c432       36     250 – 120 600        –
c499       41     300 – 6 k            1 200
c880       60     2 500 – 57 k         13 k
c1355      41     800 – 12 k           2 800
c1908      33     3 k – 77 k           12 k
c2670      233    2.4 m – 12.5 m       4.4 m
c3540      50     5 k – 174 k          32 k
c5315      178    1 400 – 5 k          2 500
c6288      32     33 – 474             131
c7552      207    > 100 m
s27        7      2 – 192              29
s208.1     18     1 400 – 26 k         6 k
s298       17     100 – 1 000          500
s344       24     60 – 1 000           250
s349       24     70 – 1 000           250
s382       24     150 – 2 000          500
s386       13     1 400 – 15 k         3 600
s400       24     120 – 2 000          500
s420.1     34     165 k – 4 m          1.4 m
s444       24     130 – 2 000          500
s510       25     300 – 2 500          900
s526       24     5 k – 67 k           19 k
s641       54     196 k – 3.2 m        1 m
s713       54     294 k – 3.4 m        1 m
s820       23     10 k – 78 k          27 k
s832       23     9 k – 75 k           27 k
s838       67     > 100 m
s953       45     15 k – 98 k          46 k
s1196      32     196 k – 3.2 m        1 m
s1238      32     21 k – 489 k         118 k
s1423      91     9 k – 138 k          55 k
s1488      14     2 500 – 24 k         6 800
s1494      14     2 200 – 23 k         5 k
s5378      214    50 k – 196 k         82 k
s9234.1    247    6 m – 30 m           15 m
s13207.1   700    97 k – 879 k         329 k
s15850.1   611    > 10 m
s35932     1763   150 – 500            230
s38417     1664   > 10 m
s38584.1   1464   > 1 g
b01        7      1 – 1 100            350
b02        5      1 – 1 000            200
b03        34     30 – 2 600           700
b04        77     14 k – 330 k         60 k
b05        35     10 k – 70 k          25 k
b06        11     1 – 1 200            330
b07        50     220 k – 10 m         3 m
b08        30     2 k – 60 k           13 k
b09        29     4 k – 42 k           16 k
b10        28     300 – 5 k            1 700
b11        38     12 k – 160 k         50 k
b12        126    5 – 44 m             13 m
b13        63     700 – 15 k           5 k
b14        277    > 100 m
b15        485    > 100 m
b17        1452   > 100 m
b18        3307   > 100 m
b19        6666   > 100 m
b20        522    > 100 m
b21        522    > 100 m
b22        767    > 100 m

it can be seen that the number of pseudo-random patterns needed to fully test the circuits varies considerably. the distribution of the number of required patterns follows the curve shown in fig. 4; this particular curve corresponds to the iscas c1908 circuit.

fig. 4: distribution of the number of patterns to achieve full fault coverage for c1908

3.2 influence of lfsr type on test length

an lfsr used as a pseudo-random pattern generator is usually based on a primitive generating polynomial, to provide the longest period of the generated code. in this subsection we show that it is not necessary to use primitive polynomials. we investigated the influence of the number of lfsr taps on the testing capability. in particular, we studied the number of patterns needed to test all the faults in a circuit (as in subsection 3.1), while varying the number and the positions of the lfsr taps. a satisfactory period for each generated lfsr was ensured by simulating its run. the results of the experiment are shown in fig. 5: the number of lfsr cycles needed to cover all the stuck-at faults in the c1355 circuit. 100 different lfsrs were generated randomly for each lfsr size, differing both in the tap positions and in the seed. thus, for the circuit used (having 40 inputs), 3900 different lfsrs were produced (the x-axis – lfsr in fig. 5).
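the period check mentioned above ("a satisfactory period ... ensured by simulating its run") can be written directly. the sketch below is an assumed realization, in the right-shift bit convention of the earlier lfsr sketch, not the authors' tool.

```python
def lfsr_period(seed, mask, n_bits):
    """cycle length of an n-bit lfsr (right-shift galois form), by plain
    simulation; a randomly drawn (polynomial, seed) pair is kept only if
    the period found here is long enough for the planned test."""
    state, count, limit = seed, 0, 1 << n_bits
    while count < limit:
        lsb = state & 1
        state >>= 1
        if lsb:
            state ^= mask
        count += 1
        if state == seed:
            return count
    return count

# a 4-bit example: mask 0b1001 gives the maximum period 2^4 - 1 = 15
print(lfsr_period(0b0001, 0b1001, 4))
```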
lfsrs 0 – 99 correspond to 1-tap lfsrs, lfsrs 100 – 199 correspond to 2-tap lfsrs, and so on. it can be observed that the number of taps does not influence the fault coverage capability at all; the test lengths are steadily distributed. thus, we can conclude that the most advantageous lfsr is one of the 1-tap lfsrs, since its area overhead is the smallest (only one feedback). a 1-tap lfsr having a satisfactory period can be found in most cases. the use of primitive polynomials thus becomes counterproductive, since the number of taps is mostly greater than one, and the additional taps do not make any contribution.

fig. 5: influence of the lfsr

4 mixed-mode bist principles

the number of faults detected by pseudo-random patterns successively applied to the cut follows a saturation curve, see fig. 6. here the lfsr patterns were gradually applied to the s1196 iscas benchmark, while the number of covered faults was recorded. it can be observed that 90 % of the faults were covered in the first 1 000 cycles, while 60 000 cycles were needed to achieve complete fault coverage. thus, it is advantageous to apply a relatively small number of pseudo-random patterns to cover the easy-to-detect faults, and then produce several deterministic patterns to cover the rest. this approach is called a mixed-mode bist. it is necessary to find a trade-off between the numbers of pseudo-random and deterministic patterns. moreover, the prpg has to be chosen properly, since the fault coverage differs notably with the prpg type, as we have shown in subsection 3.1.

fig. 6: fault coverage curve

the probability of covering a given number of faults by a prpg is illustrated by fig. 7. here, sets of 10, 50, 100, 500, 1000 and 5000 lfsr patterns were applied to the c3540 circuit, with 10 000 samples for each test size; the seed and the tap positions were selected randomly for each sample. the distribution of the number of faults that remained undetected is shown. for a low number of patterns many faults are left undetected, and their number also varies considerably. as the number of test patterns increases, the number of undetected faults rapidly decreases, and the standard deviation of this number decreases as well.

fig. 7: pseudo-random fault coverage

4.1 cellular automata vs. lfsrs

many methods using cellular automata instead of lfsrs have been proposed [17, 18]. in general, pseudo-random patterns generated by a ca have a more random nature than those generated by an lfsr. the weights of the particular prpg outputs (i.e., the ratios of zeroes and ones) are balanced in lfsrs, approaching the value 0.5. cellular automata often have unbalanced weights, depending on the seed. this property gives an advantage to ca, since they can be exploited as generators of weighted test patterns [18].

we studied the fault covering capabilities of cellular automata seeded with a random vector and compared the results with random lfsrs (randomly generated polynomial and seed). a very interesting observation was made. the distribution of the number of undetected faults, exactly as in fig. 7, was studied for lfsrs and for the rule-60 ca (see fig. 3). the distribution curves for these two types of generators are shown in fig. 8. here, 500 patterns generated by random lfsrs and by random ca were repeatedly applied to the c3540 circuit (10 000 times). it can be observed that the mean value of the number of undetected faults does not change when a ca is used, but the standard deviation is decreased. hence an important conclusion can be derived: using a ca instead of an lfsr does not increase the number of covered faults on average, but the probability that more faults will be covered by the ca vectors is increased. this experiment shows that the use of randomly seeded ca instead of lfsrs does not make any contribution to the fault coverage achieved.
hence an important conclusion can be derived: using a ca instead of an lfsr does not increase the number of covered faults on an average, but the probability that more faults will be covered by the ca vectors is increased. this experiment shows that the use of randomly seeded ca instead of lfsrs does not make any contribution to the 50 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 45 no. 2/2005 czech technical university in prague 0 500 1000 1500 2000 2500 3000 3500 4000 0 2000 4000 6000 8000 10000 12000 c1355 c y c le s lfsr fig. 5: influence of the lfsr 0 10000 20000 30000 40000 50000 60000 0 200 400 600 800 1000 1200 1400 f a u lt s c o v e re d test patterns fig. 6: fault coverage curve 500 1000 1500 2000 2500 0 200 400 600 800 1000 1200 1400 1600 1800 2000 2200 c3540 10 5000 1000 500 100 50 f re q u e n c y undetected faults fig. 7: pseudo-random fault coverage fault coverage achieved. in all of these experiments the seeds were generated randomly, with a steady distribution of 1’s and 0’s. on the other hand, when a “special” seed is chosen for a cellular automaton, its fault covering properties will change dramatically. we performed an experiment similar to that shown in fig. 8, for the s838 iscas circuit using an lfsr and a ca, once with a steady distribution of values in its seed, and once with a seed having only one “1” value at a random position, so that the weight of this seed was unbalanced. the four tests were run for 500 cycles, and the distribution of the number of undetected faults was measured. the results are shown in fig. 9. we can observe that the fault coverage of the lfsr decreased rapidly for this special seed, but on the other hand the variability of fault coverage of the ca increased, while in some cases many more faults were covered by the vectors produced by this ca (left-hand side). this observation can be explained by unsteady distribution of weights, i.e. the probabilities of occurrence of the “1” value on the prpg outputs. the distribution of weights for four 100-bit prpgs running 1000 cycles is shown in fig. 10a–d. we can see that for a randomly generated seed, for both the lfsr and ca the weights near 0.5, thus there is a balanced distribution of zeroes and ones in a test (fig. 10a and c). when a lfsr is unbalanced by a seed having only one “1” © czech technical university publishing house http://ctn.cvut.cz/ap/ 51 czech technical university in prague acta polytechnica vol. 45 no. 2/2005 200 250 300 350 400 0 200 400 600 800 1000 1200 lfsr ca c3540 500 patterns f re q u e n c y undetected faults fig. 8: comparison of the fault coverage obtained by lfsr and ca 100 200 300 0 100 200 300 400 500 600 700 800 900 1000 1100 ca, unbalanced seed lfsr, unbalanced seed random ca random lfsr f re q u e n c y undetected faults fig. 9: fault coverage of a “specially” seeded ca and lfsr 0,0 0,2 0,4 0,6 0,8 1,0 0 2 4 6 8 10 12 14 16 lfsr random seed f re q u e n c y weight 0,0 0,2 0,4 0,6 0,8 1,0 0 5 10 15 20 25 lfsr seed with one 1 f re q u e n c y weight 0,0 0,2 0,4 0,6 0,8 1,0 0 5 10 15 20 25 30 ca, rule 60 random seed f re q u e n c y weight 0,0 0,1 0,2 0,3 0,4 0,5 0,6 0,7 0,8 0,9 1,0 0 5 10 15 20 ca, rule 60 seed with one 1 f re q u e n c y weight a) b) c) d) fig. 
when an lfsr is unbalanced by a seed having only one “1” value and the rest zeroes, the weights at the outputs are shifted towards the weight of the seed (fig. 10b). the weights do not differ from each other too much; the probabilities of zeroes and ones at all the outputs are approximately equal. a 1-tap lfsr was chosen here, as in all of our experiments; if an lfsr with a larger number of taps were chosen, all the weights would approach 0.5, as in the case of a balanced seed. fig. 10d shows the ca seeded with an unbalanced seed (having one “1” value). here the weights range from negligible values (all zeroes) to more than 0.7. this is the case where weighted pattern testing can be advantageously applied.

4.2 column-matching bist

the column-matching bist method is based on a transformation of the prpg code words into deterministic test patterns pre-computed by an atpg tool. this transformation is performed by a combinational block called the output decoder. the method is designed for combinational or full-scan sequential circuits, thus the order of the test patterns applied to the cut is insignificant. moreover, not all the prpg patterns have to be transformed into test patterns; the excessive ones simply do not test any new faults. in the column-matching method we try to assign the prpg code words to the deterministic patterns so that some of the columns are equal. the decoding logic needed to implement a matched column then reduces to a mere wire connecting the decoder output with its respective input, while the unmatched outputs have to be synthesized by a boolean minimizer. for a more detailed description see [13, 14]; a small sketch of the column-matching idea is given after table 2 below.

this principle has been further extended to support mixed-mode testing [7]. the bist run is divided into two disjoint phases. first, the circuit is tested using an unmodified sequence of lfsr code words, detecting the easy-to-detect faults. the deterministic test patterns for the rest of the faults are computed by the atalanta atpg tool [15]. these vectors are generated from several consecutive lfsr code words, modified by the decoder to obtain the deterministic vectors. some additional logic is needed to control the switch between the two phases. the switch is implemented as an array of multiplexers, one for each cut input; however, we attempt to eliminate the muxes as well, by introducing direct matches [7]. the structure of a mixed-mode bist is shown in fig. 11. the sequence of patterns is fed to the tested circuit, and its response is then evaluated by a multi-input shift register (misr).

fig. 11: mixed-mode bist structure

4.3 test lengths

it is clear that the choice of appropriate lengths of the two phases is of key importance. the maximum number of faults should be detected in the pseudo-random phase, while keeping its length acceptable. according to fig. 6, most of the faults can be detected by a few initial patterns, and deterministic patterns have to be produced for the remaining faults. the more faults remain undetected, the more atpg vectors are needed, which also complicates the decoder design in terms of the area overhead. this can be compensated by a longer run of the deterministic phase to some extent, but not significantly.

the influence of the length of the initial phase on the final result is illustrated by table 2. the lengths of the two phases are shown in the “rand / det.” column. after the “rand” pseudo-random (unmodified) patterns were applied to the cut, “ud.” faults were left undetected, and “vct.” deterministic vectors were produced by the atalanta atpg tool to detect them. 100 % coverage of the detectable stuck-at faults is considered in all cases; thus the “ud.” column does not include the redundant faults, which cannot be detected due to the nature of the tested circuit. the deterministic vectors are generated from the additional “det.” lfsr patterns by the decoder. the area overhead of the bist decoder synthesized by the column-matching method is indicated in the “ges” column, in terms of gate equivalents [16]. only the logic of the decoder and the switching logic is included in this column; the overhead of the prpg and of the bist control logic is not.

table 2: influence of test length

bench    rand / det.      ud.   vct.  ges
c1355    500 / 500        31    12    70
         1000 / 1000      8     1     15
c1908    1000 / 1000      46    30    46.5
         2000 / 1000      19    10    7.5
c3540    1000 / 1000      33    22    15
         2000 / 1000      8     8     7.5
         5000 / 1000      3     3     6
s420     400 / 600        40    30    24.5
         1000 / 1000      35    19    25.5
         3000 / 1000      29    17    27
s526     500 / 500        21    17    30.5
         1000 / 1000      12    11    4.5
         2000 / 1000      7     6     4.5
s641     1000 / 500       12    9     21
         3000 / 1000      8     7     15
         5000 / 1000      7     6     16.5
s820     1000 / 1000      70    28    63
         5000 / 5000      34    14    0
s838     1000 / 1000      129   72    120
         5000 / 1000      105   56    130
         10000 / 1000     106   62    110
s1196    1000 / 1000      89    54    50.5
         5000 / 1000      25    19    28.5
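the core of the column-matching idea can be sketched as follows. for a fixed assignment of the decoded prpg words (rows of c) to the deterministic vectors (rows of t), a cut input whose test column equals some prpg output column costs no logic, only a wire. the real method also searches over the row assignment, exploits test don't cares and negated matches, and synthesizes the remaining outputs [7, 13, 14]; none of that is reproduced here.

```python
import numpy as np

def direct_column_matches(C, T):
    """illustration only: find cut inputs (columns of T) that can be driven
    by a plain wire from an unused prpg output (an equal column of C)."""
    matches, used = {}, set()
    for j in range(T.shape[1]):              # for every cut input column
        for i in range(C.shape[1]):          # try every prpg output column
            if i not in used and np.array_equal(C[:, i], T[:, j]):
                matches[j] = i
                used.add(i)
                break
    return matches   # inputs absent from the result need synthesized logic

C = np.array([[0, 1, 0],
              [1, 0, 1],
              [1, 1, 0]])      # 3 decoded prpg words x 3 outputs
T = np.array([[0, 1],
              [1, 1],
              [1, 0]])         # 3 deterministic vectors x 2 cut inputs
print(direct_column_matches(C, T))   # {0: 0}: input 0 is a wire from output 0
```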
the influence of the length of the initial phase on the final result is illustrated by table 2. the lengths of the two phases are shown in the "rand / det." column. after the "rand" pseudo-random (unmodified) patterns were applied to the cut, "ud." faults were left undetected and "vct." deterministic vectors were produced by the atalanta atpg tool to detect them. 100% coverage of detectable stuck-at faults is considered in all cases; thus, the "ud." column does not include redundant faults, which cannot be detected due to the nature of the tested circuit. the deterministic vectors are to be generated from the additional "det." lfsr patterns by the decoder. the area overhead of the bist decoder synthesized by the column-matching method is indicated in the "ges" column, described in terms of gate equivalents [16]. only the logic of the decoder and the switching logic is stated in this column; the overhead of the prpg and the bist control logic is not included.

fig. 11: mixed-mode bist structure (lfsr - decoder - switch - cut - misr, with a tpg mode signal)

table 2: influence of test length

bench   rand / det.    ud.   vct.   ges
c1355   500 / 500      31    12     70
        1000 / 1000    8     1      15
c1908   1000 / 1000    46    30     46.5
        2000 / 1000    19    10     7.5
c3540   1000 / 1000    33    22     15
        2000 / 1000    8     8      7.5
        5000 / 1000    3     3      6
s420    400 / 600      40    30     24.5
        1000 / 1000    35    19     25.5
        3000 / 1000    29    17     27
s526    500 / 500      21    17     30.5
        1000 / 1000    12    11     4.5
        2000 / 1000    7     6      4.5
s641    1000 / 500     12    9      21
        3000 / 1000    8     7      15
        5000 / 1000    7     6      16.5
s820    1000 / 1000    70    28     63
        5000 / 5000    34    14     0
s838    1000 / 1000    129   72     120
        5000 / 1000    105   56     130
        10000 / 1000   106   62     110
s1196   1000 / 1000    89    54     50.5
        5000 / 1000    25    19     28.5

it can be seen that increasing the length of the pseudo-random phase decreases the bist overhead to some extent. in some cases a significant decrease in the overhead is achieved for a small increase in the test length (c1355, c1908, s820). sometimes the improvement is negligible even when the test length is increased significantly (s641, s838). sometimes a longer pseudo-random phase even causes an increase in the area overhead (s420, s641). this is due to the fact that the amount of test don't cares decreases for a smaller test size, which complicates the decoder synthesis [14].

4.4 influence of the lfsr

the fault coverage achieved in the first phase is influenced not only by the number of pseudo-random test patterns (the length of the pseudo-random phase). the number of detected faults also depends on the pseudo-random sequence itself, so it is influenced by the lfsr (ca) polynomial and seed. this is illustrated by figs. 4 and 7: significantly different results are produced for different lfsrs, even when the lengths of the phases are retained. for illustration, we designed a bist for the c1908 circuit. the pseudo-random phase was run for 2000 cycles, the lfsr polynomial was kept constant (1-tap), and we repeatedly reseeded it randomly. then the deterministic phase was run for 1000 clock cycles. the simulation results are shown in table 3. again, the "ud." column indicates the number of undetected faults in the first phase, "vct." gives the number of deterministic vectors and "ges" shows the complexity of the final bist logic. the entries are sorted by the number of undetected faults. we can see that the complexity of the final circuit strongly depends on the lfsr seed - it varies from 7.5 ges up to 69 ges.

table 3: influence of the lfsr seed (c1908; entries sorted by undetected faults, shown in two columns)

ud.   vct.   ges        ud.   vct.   ges
19    10     7.5        33    15     37
21    9      19.5       34    16     33
24    13     23.5       36    18     38
26    15     28         37    20     40.5
26    13     25         39    22     53
28    15     37.5       44    26     40
28    14     22.5       46    22     42.5
30    14     36         48    24     44
32    16     31         52    28     63.5
33    17     27.5       62    34     69
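the spread in table 3 motivates a simple selection loop: reseed repeatedly, fault-simulate, and keep the best seed. a minimal sketch, with the fault simulator abstracted as a callback; the stand-in below is invented and only mimics the spread of "ud." values.

```python
import random

def pick_best_seed(fault_sim, width=32, trials=20, cycles=2000):
    """fault_sim(seed, cycles) -> number of undetected faults; smaller is better."""
    best_seed, best_ud = None, None
    for _ in range(trials):
        seed = random.getrandbits(width) or 1      # an lfsr seed must be non-zero
        ud = fault_sim(seed, cycles)
        if best_ud is None or ud < best_ud:
            best_seed, best_ud = seed, ud
    return best_seed, best_ud

# invented stand-in for a real fault simulator
demo_sim = lambda seed, cycles: 19 + (seed * 2654435761) % 44
print(pick_best_seed(demo_sim))
```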
it is not possible to compute a proper lfsr seed and/or generating polynomial analytically for practical examples, due to the complexity of this problem. thus, in practice we repeatedly reseed the lfsr and conduct the fault simulation several times, and we pick the best seed for further processing. fault simulation is usually a very fast process, thus it does not significantly influence the bist design time.

5 conclusions

we have discussed the influence of the pseudo-random pattern generator type on its fault detection capability. both lfsrs and ca are studied, with either a random or a "special" seed. the distribution of weights on the individual prpg outputs is shown for all cases, together with the fault coverage curves obtained by the prpgs. we have shown that for the pseudo-random test-pattern generation phase a 1-tap lfsr is mostly a good choice, due to its satisfactory period length, fault coverage and minimal area overhead. the pseudo-random testability of the standard iscas and itc benchmarks is summarized in this paper, to help bist designers properly choose the desired pseudo-random test lengths for these circuits. the effects of the generator type are illustrated on a mixed-mode column-matching bist synthesis, where the generator directly influences the total complexity of the resulting bist circuitry. the claims were confirmed experimentally on a bist design for several iscas benchmarks, but the conclusions can be applied to any circuits.

acknowledgment

this research was supported by grant ga 102/04/2137, "design of highly reliable control systems built on dynamically reconfigurable fpgas", and msm6840770014.

references

[1] agrawal, v. k., kime, c. r., saluja, k. k.: "a tutorial on bist, part 1: principles". ieee design & test of computers, vol. 10, no. 1, march 1993, p. 73-83; "part 2: applications", no. 2, june 1993, p. 69-77.
[2] touba, n. a., mccluskey, e. j.: "synthesis techniques for pseudo-random built-in self-test". technical report csl tr # 96-704, dept. of electrical engineering and computer science, stanford university, august 1996.
[3] hellebrand, s. et al.: "built-in test for circuits with scan based on reseeding of multiple-polynomial linear feedback shift registers". ieee trans. on computers, vol. 44, no. 2, february 1995, p. 223-233.
[4] hartmann, j., kemnitz, g.: "how to do weighted random testing for bist". proc. of international conference on computer-aided design (iccad), 1993, p. 568-571.
[5] chatterjee, m., pradhan, d. k.: "a bist pattern generator design for near-perfect fault coverage". ieee transactions on computers, vol. 52, no. 12, december 2003, p. 1543-1558.
[6] touba, n. a.: "synthesis of mapping logic for generating transformed pseudo-random patterns for bist". proceedings of international test conference, 1995, p. 674-682.
[7] fišer, p., kubátová, h.: "an efficient mixed-mode bist technique". ddecs'04, tatranská lomnica (slovakia), 18.-21. 4. 2004, p. 227-230.
p.: “vector space theoretic analysis of additive cellular automata and its application of pseudoexhaustive test pattern generation”. ieee transactions on computers, vol. 42, no. 3, march 1993, p. 340–352. [9] novák, o., hlavička, j.: “design of a cellular automaton for efficient test pattern generation”. proc. ieee etw 1998, barcelona, spain, p. 30–31. [10] brglez, f., fujiwara, h.: “a neutral netlist of 10 combinational benchmark circuits and a target translator in fortan”. proc. of international symposium on circuits and systems, 1985, p. 663–698. [11] brglez, f., bryan, d., kozminski, k.: “combinational profiles of sequential benchmark circuits”. proc. of international symposium of circuits and systems, 1989, p. 1929–1934. [12] lee, h. k., ha, d. s.: “an efficient forward fault simulation algorithm based on the parallel pattern single fault propagation”. proc. of the 1991 international test conference, oct. 1991, p. 946–955. [13] fišer, p., hlavička, j.: “column-matching based bist design method”. proc. 7th ieee european test workshop (etw’02), corfu (greece), 26.–29. 5. 2002, p. 15-16. [14] fišer, p., hlavička, j., kubátová, h.: “column-matching bist exploiting test don’t-cares.” proc. 8th ieee european test workshop (etw’03), maastricht (the netherlands), 25.–28. 5. 2003, p. 215–216. [15] lee, h. k., ha, d. s.: “atalanta: an efficient atpg for combinational circuits”. technical report, 93-12, dep’t of electrical engineering, virginia polytechnic institute and state university, blacksburg, virginia, 1993. [16] de micheli, g.: synthesis and optimization of digital circuits. mcgraw-hill, 1994. [17] hortensius, et al.: “cellular automata circuits for bist”. ibm j. r&dev, vol. 34 (1990), no. 2/3, p. 389–405. [18] novák, o.: “pseudorandom, weighted random and pseudoexhaustive test patterns generated in universal cellular automata”. springer: lecture notes in computer science, 1667, september 1999, p. 303–320. [19] chaudhuri, p. p. et al.: additive cellular automata theory and applications. volume i. ieee computer society press, 1997, 340 p. [20] fišer, p., kubátová, h.: “influence of the test lengths on area overhead in mixed-mode bist”. proceedings 9th biennial baltic electronics conference (bec’04), tallinn (estonia), 3.–6. 10. 2004, p. 201–204. [21]corno, f., sonza reorda, m., squillero, g.: “rt-level itc 99 benchmarks and first atpg results”. ieee design & test of computers, july-august 2000, p. 44–53. [22] adamek, j.: foundations of coding. john wiley & sons, inc. 1991, 336 p. [23] stroud, ch. e.: a designer’s guide to built-in self-test. kluwer academic publisher, london, 2002. ing. petr fišer e-mail: fiserp@fel.cvut.cz ing. hana kubátová, csc. phone: +420 224 357 281 e-mail: kubatova@fel.cvut.cz dept. of computer science & engineering czech technical university in prague karlovo nám. 13 121 35, prague 2, czech republic 54 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 45 no. 2/2005 czech technical university in prague ap04_2web.vp 1 introduction spinal corrective braces (see fig. 1) are used for treating spine scoliosis in children (pathologic at deformation of chest curvature). the x-ray of the patient from fig. 1 without and with a brace is shown in fig. 2. cheneau-type dynamic corrective braces or according to cerny’s patent no. 281800cz (see fig. 1) are usually used in the czech republic. the breast curvature can be classified according to king. 
fig. 1: patient without and with the dynamic corrective brace according to cerny's patent no. 281800cz

a cheneau-type brace is recommended for spinal curves of king types i, ii and iv, and a cerny-type brace for spinal curves of king types ii, iii and v. the brace constricts the child's trunk and creates a stress state in the patient's spine. the brace changes the spinal curvature, which means that the pathological spinal form is corrected. after long-term use of the brace, the spinal correction is permanent. the brace is made in the following manner: first, a plaster negative form and then a positive form of the child's trunk are made.
according to his experience and the orthopaedist's recommendation, the orthopaedic assistant deepens the positive plaster form in the places where the brace is to push on the child's trunk. the plastic brace is then made according to this plaster form. after it has been applied to the child's trunk, the brace constricts the places where the form has been deepened (the tight shoe principle). if no computer analysis is used, the brace force effect is the result of the orthopaedist's and his assistant's experience only, and it does not ensure that the form of the designed brace and the manner of treatment are optimal. this paper shows computer algorithms that are able to determine the stress state in the vertebrae and the inter-vertebral discs, and the spinal curve changes for specific brace applications. the theoretical conclusions are based on many courses of treatment. the remodelling of a pathological spine curvature depends on the spine stress state, and on the time and manner in which the brace is applied. the course of treatment is simulated on the computer. the aim of this study is to determine the ideal brace form and a course of treatment with the help of computer simulation. the computer program calculates the spine stress state and its curvature changes at each time point. the treatment simulation is now being run simultaneously with the patients' treatment, and the computer model is being verified. if the computer model and the actual treatment have the same behaviour, then the model can be used for treatment prognosis in orthopaedic practice. since the course of treatment takes a long time, the simulation model is still being verified so that its prognoses can become as precise as possible.

2 spinal curvature

the task is solved in cartesian coordinates (x - spine axis direction; y, z - frontal and sagittal planes). the spinal curvature is stored in the computer as the following three functions

y = y(x), z = z(x), φ = φ(x), (1)

where φ is the turning about the x-axis. the spinal curvature can be described if the extreme values of y and/or z are measured from the x-ray (the extremes of the white curve in the x-ray on the left in fig. 2). the method is applied in the same way for the frontal and sagittal planes. the spinal curvature has up to 3 local extremes. the curvature is divided into n segments between the beginning, the extremes and the end point of the curve (max. n is 4).

fig. 2: frontal x-ray of the patient from fig. 1 without and with the corrective brace

the extreme coordinates x_i, y_i, i = 1, ..., n-1, and the coordinate x_n of the spine end (the spine length; the beginning has x_0 = 0) are measured on the x-ray (see fig. 2). the length of segment i is l_i = x_i - x_(i-1). the local coordinate ξ is measured from the beginning of segment i. the function y is considered as a polynomial. for the first segment (a quadratic polynomial) it is

y = y_1 (ξ/l_1)(2 - ξ/l_1),

for the inner segments (cubic polynomials)

y = y_(i-1) + (y_i - y_(i-1)) (ξ/l_i)^2 (3 - 2 ξ/l_i),

and for the last segment (a quadratic polynomial)

y = y_(n-1) (1 - (ξ/l_n)^2).
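a small sketch that evaluates the piecewise polynomials above from the measured extreme coordinates; the sample values are invented for illustration.

```python
import bisect

def spine_curve(xs, ys):
    """xs: x_0 = 0, the extreme x-coordinates and the spine end; ys: y at those points."""
    def y(x):
        i = max(1, min(bisect.bisect_right(xs, x), len(xs) - 1))  # segment index 1..n
        li = xs[i] - xs[i - 1]
        t = (x - xs[i - 1]) / li                                  # local xi / l_i
        if i == 1:                         # first segment, quadratic
            return ys[1] * t * (2 - t)
        if i == len(xs) - 1:               # last segment, quadratic
            return ys[i - 1] * (1 - t * t)
        return ys[i - 1] + (ys[i] - ys[i - 1]) * t * t * (3 - 2 * t)  # inner, cubic

    return y

# invented measurements (mm): beginning, two extremes, spine end
y = spine_curve([0.0, 120.0, 260.0, 400.0], [0.0, 18.0, -12.0, 0.0])
print([round(y(x), 2) for x in (60, 120, 200, 330)])
```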
3 deformation of the spine

the moment of inertia has to be determined for the inter-vertebral disc and ligament cross-section area (see fig. 3). the calculation procedure is as follows: the cross-section area is divided into triangles, and one third of each triangle's area is concentrated at the centres of its sides. the spine is treated as a beam on an elastic foundation (the soft tissue), and the finite element method (the deformation variant, according to the lagrange principle) is used for calculating the stress state. it is assumed that the vertebrae have no deformation. the potential energy is calculated for the inter-vertebral disc volume and for the compressed soft tissue region of the child's trunk. for simplification, the soft tissue width is considered constant (a rectangular cross-section of the trunk). the displacements and the turnings at the vertebra centres are the kinematic unknowns:

r = [r_1; r_2], r_1^T = [φ_x,i, φ_x,i+1], r_2^T = [w_i, φ_i, w_i+1, φ_i+1], (2)

where φ_x are the turnings about the spine axis. the following algorithm is valid for both the frontal and the sagittal plane, and the planes will not be indicated by a plane index. the stiffness matrix for the part of the spine between the centres of neighbouring vertebrae is (torsion and beam influences)

K = [ K_1   0
      0     K_2 ]. (3)

the sub-matrices will be determined separately for the deformation of the spine and for the soft tissue.

4 deformation of inter-vertebral discs

the beam and torsion stiffnesses are

k = 2 E I / l, t = G I_t / l,

where E, I are the modulus of elasticity and the moment of inertia of the cross-section of the inter-vertebral disc and ligaments (see fig. 3), and l is the thickness of the disc. the influence of torsion is

K_1 = [ t   -t
       -t    t ]. (4)

the relation between the forces (moments) R̄_2 and the movements (turnings) r̄_2 on the inter-vertebral disc boundaries is

R̄_2 = K̄_2 r̄_2, (5)

where the stiffness matrix K̄_2 refers to the inter-vertebral disc and the kinematic values at the disc borders. because the vertebrae are stiff, they have no potential energy; the inter-vertebral stiffness matrix will therefore be recalculated to a matrix K_2 for the kinematic unknowns at the vertebra centres.

fig. 3: inter-vertebral disc and ligaments

K̄_2 = [  6k/l^2    3k/l   -6k/l^2    3k/l
          3k/l      2k     -3k/l      k
         -6k/l^2   -3k/l    6k/l^2   -3k/l
          3k/l      k      -3k/l      2k ]. (6)

let the boundary forces z̄_i, m̄_i, z̄_i+1, m̄_i+1 and the kinematic unknowns w̄_i, φ̄_i, w̄_i+1, φ̄_i+1 at the disc boundary points be expressed from the values z_i, m_i, z_i+1, m_i+1, w_i, φ_i, w_i+1, φ_i+1 at the vertebra centres (see fig. 4). as there is no deformation between the vertebra centre and the inter-vertebral disc boundary, the central spinal line is straight in this part, the spine displacement w has a linear course over the part of length a, and the torsion moment m_x and the turnings φ, φ_x are invariable:

w̄_i = w_i + a φ_i, φ̄_i = φ_i, w̄_i+1 = w_i+1 - a φ_i+1, φ̄_i+1 = φ_i+1, (7)

m̄_i = m_i - z_i a, m̄_i+1 = m_i+1 + z_i+1 a, z̄_i = z_i, z̄_i+1 = z_i+1. (8)

substituting (7) and (8) into (5), and writing the transformation (7) as a matrix T,

T = [ 1  a  0   0
      0  1  0   0
      0  0  1  -a
      0  0  0   1 ],   R_2 = T^T K̄_2 T r_2. (9)
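a numerical sketch of this recalculation from the disc boundary to the vertebra centres, with invented values of E, I, l and a; the assertion spot-checks one entry of the closed form (11) given below.

```python
import numpy as np

E, I, l, a = 5.0, 2.0e3, 8.0, 6.0           # illustrative values only
k = 2 * E * I / l

K_disc = k * np.array([[ 6 / l**2,  3 / l, -6 / l**2,  3 / l],
                       [ 3 / l,     2,     -3 / l,     1    ],
                       [-6 / l**2, -3 / l,  6 / l**2, -3 / l],
                       [ 3 / l,     1,     -3 / l,     2    ]])   # eq. (6)

T = np.array([[1, a, 0,  0],
              [0, 1, 0,  0],
              [0, 0, 1, -a],
              [0, 0, 0,  1]])                # rigid-offset transformation of eq. (7)

K_centre = T.T @ K_disc @ T                  # eq. (9)
assert np.isclose(K_centre[1, 1], 2 * k * (1 + 3 * a / l + 3 * (a / l) ** 2))
print(np.round(K_centre, 1))
```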
formula (9) can be written as

R_2 = K_2 r_2, (10)

where K_2 = T^T K̄_2 T is the stiffness matrix for the vertebra centres:

K_2 = [  6k/l^2              (3k/l)(1 + 2a/l)          -6k/l^2              (3k/l)(1 + 2a/l)
         (3k/l)(1 + 2a/l)    2k(1 + 3a/l + 3a^2/l^2)   -(3k/l)(1 + 2a/l)    k(1 + 6a/l + 6a^2/l^2)
        -6k/l^2             -(3k/l)(1 + 2a/l)           6k/l^2             -(3k/l)(1 + 2a/l)
         (3k/l)(1 + 2a/l)    k(1 + 6a/l + 6a^2/l^2)    -(3k/l)(1 + 2a/l)    2k(1 + 3a/l + 3a^2/l^2) ]. (11)

analogous formulas are valid for the y-axis direction.

4.1 compressed soft tissue (elastic ground)

the compressed soft tissue of the child's trunk around the spine is considered as an elastic ground according to [1], p. 86-113, and the final formulas are used here. the width of the ground is considered constant. the parameters c_1 = E_p / h and c_2 = E_p h / 6 are calculated first, and from them the derived constants c_1*, c_2*, c_3*, c_4* (functions of c_1, c_2 and the ground width b; see [1]), where E_p, h, b are the modulus of elasticity, the thickness and the width of the compressed soft tissue. the torsion stiffness sub-matrix is

K̄_1 = (c_3* b l / 6) [ 2  1      + (c_4* b / l) [  1  -1
                        1  2 ]                    -1   1 ], (12)

and the beam stiffness sub-matrix is

K̄_2 = c_1* b l [  13/35      11l/210    9/70      -13l/420
                   11l/210    l^2/105    13l/420   -l^2/140
                   9/70       13l/420    13/35     -11l/210
                  -13l/420   -l^2/140   -11l/210    l^2/105 ]

    + (c_2* b / l) [  6/5      l/10     -6/5      l/10
                      l/10     2l^2/15  -l/10    -l^2/30
                     -6/5     -l/10      6/5     -l/10
                      l/10    -l^2/30   -l/10     2l^2/15 ]. (13)

5 the first algorithm

the brace constricts the trunk in the places where the plaster positive form of the child's trunk has been deepened; this means that the trunk surface (the soft tissue surface) has non-zero prescribed displacements in these places. let us assume that the prescribed displacement acts from above on a lying patient, and that the z-axis is directed from below. the compression of the soft tissue above the spine is w_0 - w and below it is w, where w is the spine displacement and w_0 is the prescribed trunk surface displacement. let the matrices K_above and K_below be calculated according to formulas (12), (13) for the parts of the trunk above and below the spine. the potential energy of the soft tissue is

E_p = (1/2)(r_0 - r)^T K_above (r_0 - r) + (1/2) r^T K_below r. (14)

the term K_above r_0 can be calculated, and its negative form can be considered as a load vector (the right-hand side of the linear algebraic equations of the finite element method). in this way, the potential energy can be considered in the compressed parts of the soft tissue only; this means that the terms K_above (r_0 - r) and/or K_below r are taken into consideration only if they are positive. an iterative calculation is necessary for correct results; this means that the load vector is calculated for the compressed soft tissue part above and/or below the spine according to the results from the last iteration step.

fig. 4: spine deformation is linear in the vertebra parts and curvilinear in the inter-vertebral part of the disc
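the one-sided character of the ground calls for exactly the iteration just described. a minimal sketch with scalar nodal springs standing in for the assembled matrices (12)-(13); the sizes, stiffnesses and the prescribed displacement are invented.

```python
import numpy as np

def solve_contact(K_spine, k_above, k_below, w0, iters=100):
    """fixed-point iteration: the ground acts only where the tissue is compressed."""
    w = np.zeros_like(w0)
    for _ in range(iters):
        above = (w0 - w) > 0                 # tissue above compressed by w0 - w
        below = w > 0                        # tissue below compressed by w
        K = K_spine + np.diag(k_above * above + k_below * below)
        w_new = np.linalg.solve(K, k_above * above * w0)
        if np.allclose(w_new, w, atol=1e-10):
            break
        w = w_new
    return w

n = 5
K_spine = 4.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # toy spine stiffness
w0 = np.array([0.0, 2.0, 5.0, 2.0, 0.0])                       # prescribed surface displacement
print(np.round(solve_contact(K_spine, np.full(n, 3.0), np.full(n, 3.0), w0), 3))
```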
the oblique load is determined as follows. let y, z be the coordinates of the centre of the place where the positive plaster form was deepened, and let δ be the depth to which the plaster positive form has been deepened in the direction perpendicular to the child's trunk surface. then δ is a prescribed trunk surface displacement and y, z are its coordinates (a positive displacement is in the direction from the trunk surface to the spine). let us consider that the transversal cross-section of the trunk has a half-elliptic form with radii a, b for z > 0 and correspondingly for z < 0. the following formula can be written for the ellipse

z = b sqrt(1 - y^2/a^2). (15)

differentiating formula (15), the angle α of the tangent with the y-axis can be calculated; the negative value of the angle α is the angle of the normal with the z-axis:

tan α = z' = -b y / (a sqrt(a^2 - y^2)).

the prescribed surface displacements v_0, w_0 in the y, z directions are

v_0 = -δ sin α, w_0 = δ cos α.

the problem can be solved in the x, y or x, z plane with the prescribed displacements v_0 or w_0, or, more correctly, as a space problem with space spine and soft tissue elements. the stiffness matrix for the space spine element can be considered in the same way as in formulas (4), (5) and (11), but the matrix K_2 (see (11)) also has the elements for the direction of the y-axis. as the vertebrae have no deformations, the kinematic variables at the vertebra surface can be calculated from the kinematic variables of the centre of gravity of the vertebra (see (3)). the normal and tangential stresses on the boundary between a vertebra and an inter-vertebral disc are then calculated from the resulting joint forces and moments. the normal force of the x-axis load has to be taken into account in the normal stress calculation, too, and the influence of shear and torsion should be taken into account in the tangential stress calculation. the parameters and the calculation algorithm are verified against values observed on the x-rays of a child with and without a brace, i.e., the calculated function values y, z and y + v, z + w and their extremes are compared with the patient's x-rays.

6 the second algorithm

the spinal curvature extremes are measured on x-rays taken without and with a brace. the spinal curvature coordinates are determined from the x-rays, and the spine deformation function r_2 is the difference of the two curvature coordinates. the joint force (moment) vector R_2 can then be calculated from (10), and beam theory can be used to calculate the spine stress state. if the displacement vector w is known, we can calculate from (14) the vector w_0 - the necessary displacement of the trunk surface and thus the most suitable places for deepening the brace.
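a sketch of this step for a single disc; the diagonal stand-in for K_2 and the curve samples are invented (a real computation would assemble K_2 as in eq. (11)).

```python
import numpy as np

def joint_forces(K2, r_without, r_with):
    """eq. (10): R2 = K2 r2, where r2 is the difference of the two measured curves."""
    r2 = np.asarray(r_with) - np.asarray(r_without)
    return K2 @ r2

# invented data for one disc: state [w_i, phi_i, w_i+1, phi_i+1]
K2 = np.diag([10.0, 4.0, 10.0, 4.0])         # placeholder stiffness, illustrative only
print(joint_forces(K2, [0.0, 0.0, 0.0, 0.0], [1.5, 0.02, 1.1, -0.01]))
```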
7 simulation of treatment

if the brace is removed from the child's trunk after being applied for some time, the spine does not return to its previous position: the pathological spine form is partly corrected. an example of the result of scoliosis treatment of a king i type spinal defect is shown in fig. 5. the maximum angles of the spine axis were measured at the thoracic and lumbar parts of the spine. the measured angles were 27 degrees and 34 degrees before treatment, 2 and 5 degrees with the brace, and 9 and 12 degrees after treatment. statistical data for the various spinal curve types according to king-moe are given in table 1.

fig. 5: x-rays of a patient with king i type spinal curvature, before treatment, with a brace, and after 2.5 years

table 1: statistical data for spinal angle correction after scoliosis treatment with braces

in table 1, the second column is for the thoracic part of the spine, and the third column is for the lumbar part of the spine. the first set of data is for cheneau and the second set of data for cerny brace types. the percentages of angle correction are given first, followed by the number of patients treated, in parentheses. the prognosis and speed of the treatment effect are assessed according to the statistical data and the spine stress state. the condition for successful treatment is periodic use of the brace in accordance with orthopaedic advice.

8 conclusion

many child patients have been observed within this project, and the dependence between the spinal curvature correction, the spine stress state and the time interval of applying the brace has been studied. theoretical conclusions about spine remodelling have been sought. the computer simulation model and its parameters are being verified to ensure that the behaviour of the model is the same as the child's course of treatment. since the treatment takes a long time, the theoretical conclusions can only be determined after a sufficient number of comparisons have been made between observed treatment courses and their computer simulations.

acknowledgment

this research was supported by grant msm 21-0000012 "trans-disciplinary research in biomedical engineering area" and is being conducted in cooperation with orthopaedist and paediatrician ivo mařík, m.d., phd. (center for defects of the locomotor apparatus, olšanská 7, 130 00 prague 3), who has treated the patients, and ing. pavel černý (orthotic center, truhlářská 8, 110 00 prague 1), who has made the special braces and measurement instruments.

references

[1] bittnar, z., šejnoha, j.: numerical methods in mechanics (in czech). prague: 1st ed., ctu, 1992.
[2] černý, p., mařík, i., zubina, p., hadraba, i.: "application of orthotics as a technical device of rehabilitation by bone dysplasias." (in czech) locomotor systems, vol. 5 (1998), no. 3-4, prague, p. 145-151.
[3] čulík, j.: "computer simulation of spine deformities treatment with orthoses". world congress on computational mechanics, "wccm - books of abstracts", vol. i., international association for computational mechanics, vienna university of technology, p. 303, full paper at http://wccm.tuwien.ac.at, vienna 2002.
[4] čulík, j., mařík, i., černý, p., zubina, p., zemková, d.: "computer control of bone deformities treatment by limb orthoses". journal of musculoskeletal & neuronal interactions, vol. 2 (2002), no. 4, nafplion, p. 389.
[5] denis, f.: "spinal instability as acute spinal trauma." clin. orthop., vol. 189, 1984, p. 65.
[6] chêneau, j.: "bracing scoliosis". locomotor systems, vol. 5 (1998), no. 1-2, prague, p. 60-73.
[7] mařík, i., černý, p., sobotka, z., korbelář, p., kuklík, m., zubina, p.: "conservative therapy of spine deformities with dynamic trunk orthoses." (in czech) locomotor systems, vol. 3 (1996), no. 1, prague, p. 38-41.
[8] roy-camille, r.: rachis dorsolumbale traumatique non neurologique. paris: masson, 1980.
[9] zubina, p.: "prevention of spine deformities after multiple level laminectomy at child age". locomotor systems, vol. 4 (1997), no. 2, prague, p. 3-18.

prof. ing. jan čulík, drsc.
phone: +420 224 354 481, +420 312 608 208
e-mail: culik@ubmi.cvut.cz
czech technical university in prague
institute of biomedical engineering
náměstí sítná 3105/610, 272 01 kladno 2, czech republic

sliding mode implementation of an attitude command flight control system for a helicopter in hover

d. j. mcgeoch, e. w. mcgookin, s. s. houston

this paper presents an investigation into the design of a flight control system, using a decoupled non-linear sliding mode control structure, designed using a linearised, 9th order representation of the dynamics of a puma helicopter in hover. the controllers are then tested upon a higher order, non-linear helicopter model, called rascal. this design approach is used for an attitude command flight control implementation, and the control performance is assessed in terms of handling qualities through the aeronautical design standards for rotorcraft (ads-33). in this context a linearised approximation of the helicopter system is used to design an smc control scheme. these controllers have been found to yield a system that satisfies the level 1 handling qualities set out by ads-33.

keywords: helicopter, sliding mode control, hover, handling qualities, ads-33e-prf, response types.

1 introduction

the issue of helicopter flight control has been discussed extensively in the relevant literature [1, 2, and references therein]. due to the complexity of helicopter dynamics, the design and implementation of controllers is difficult. helicopters are highly coupled systems. the rotor provides propulsion and is the main control actuator, and is therefore the source of much of the complexity. the level of detail used to represent the rotor dynamics is often an important factor during the design of the controller and the selection of the associated parameters [3, 4]. as the flight conditions change, these dynamics change, often resulting in controllers that only perform to specification within the operation margin for which they were designed. all these factors have stimulated this area of research and resulted in a study of various control strategies being applied to this application. the goal is to achieve high-bandwidth, high-gain, robust controllers operable over the entire flight envelope [5]. one area of control that has had little application to the helicopter problem is that of non-linear, variable structure methods, such as sliding mode control (smc) [6, 7, 8, 9]. smc comprises two parts: a linear equivalent term and a non-linear switching term.
this non-linear term is the unique attribute of this type of control scheme. it provides much of the controller's actuation power and provides high robustness to model uncertainty and external disturbance. however, it is also often the source of concern for this controller structure, as the non-linear term tends to switch around the zero error region, giving a high frequency input to the control actuator, called chattering [10, 11], which can be avoided by employing a soft switching structure. the application of an smc scheme to a helicopter system is presented in this paper. the controller is evaluated using the aeronautical design standard performance specification of handling qualities requirements for military rotorcraft, ads-33e [12].

2 helicopter model

the rascal model (rotorcraft aerodynamics simulation for control analysis) [13] is used to represent the helicopter dynamics in this study. this is a high order, nonlinear, individual rotor blade representation of the helicopter dynamics. it differs from many other helicopter models in that it uses an individual blade representation of the rotors, and not a disc. this means that the high-order dynamics of the rotor are captured, instead of assuming that the rotor tilt is quasi-steady [14]. although linearising models involves omitting important high order, non-linear dynamics, control engineering generally utilizes linear models of systems, on which the controller designs are based. it is assumed that a linear model can represent the important rigid body dynamics needed for adequate controller design. this means that a linear model must be formed from the non-linear model described in [13], accomplished using a numerical method based on small perturbations from a given trim condition [14]. this defines a state space system given by

ẋ = A x + B u, (1)

x^T = [u, v, w, p, q, r, φ, θ, ψ], (2)

u^T = [θ_1s, θ_1c, θ_0, θ_0tr], (3)

where A is the system derivative matrix, B is the control derivative matrix, u (surge), v (sway) and w (heave) are the velocities in the body-referenced x, y and z axes respectively, with the rotational velocities p (roll rate), q (pitch rate) and r (yaw rate), and the attitudes φ (roll), θ (pitch) and ψ (yaw) about those axes. θ_0 is the main rotor collective, θ_1s is the main rotor longitudinal cyclic, θ_1c is the main rotor lateral cyclic, and θ_0tr is the tail rotor collective. this model is used for the controller design, but the full representation of the helicopter system is used for testing and evaluating the controllers. for this a puma helicopter in hovering flight is used [15].
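the small-perturbation linearisation can be sketched as central finite differences about the trim point; f below is an invented stand-in for the non-linear dynamics, which are not reproduced here.

```python
import numpy as np

def linearise(f, x_trim, u_trim, eps=1e-6):
    """central differences: A = df/dx, B = df/du about the trim point."""
    n, m = len(x_trim), len(u_trim)
    A, B = np.zeros((n, n)), np.zeros((n, m))
    for j in range(n):
        dx = np.zeros(n); dx[j] = eps
        A[:, j] = (f(x_trim + dx, u_trim) - f(x_trim - dx, u_trim)) / (2 * eps)
    for j in range(m):
        du = np.zeros(m); du[j] = eps
        B[:, j] = (f(x_trim, u_trim + du) - f(x_trim, u_trim - du)) / (2 * eps)
    return A, B

# toy stand-in dynamics: 9 states [u v w p q r phi theta psi], 4 inputs
f = lambda x, u: -0.5 * x + np.concatenate([np.zeros(5), u])
A, B = linearise(f, np.zeros(9), np.zeros(4))
print(A.shape, B.shape)   # (9, 9) (9, 4)
```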
3 sliding mode control

smc is a non-linear control methodology. it has advantages over linear control schemes in that it can be more robust to matched, unmodelled, uncertain system dynamics and to disturbances [10]. the controllers developed in this paper are individual decoupled controllers [16, 17] whose total control effort comprises two parts: a linear equivalent term, u_equivalent, and a non-linear switching term, u_switching:

u = u_equivalent + u_switching. (4)

the closed loop system dynamics are represented by a sliding manifold [10]. this sliding manifold is a hyperplane representing zero steady state error, towards which the controller strives to converge the system. the switching term acts when the system diverges from the sliding surface, driving the system back towards it. the equivalent controller acts on the sliding manifold, representing the desired closed-loop system dynamics:

u_equivalent = -k^T x̃. (5)

k is the decoupled feedback gain vector found from pole placement [18] and x̃ represents the decoupled system states. the switching term drives the system when subjected to disturbances or commands; it is defined by the sliding surface σ, which is represented as [16, 17]

σ(δx̃) = h^T δx̃ = h^T (x̃ - x̃_cmd), (6)

where x̃_cmd is the desired trajectory, so that σ is a function of the state error δx̃, and h is the right eigenvector [16] of the desired decoupled closed loop system matrix A_c, found from

A_c = Ã - b̃ k^T, (7)

where Ã is the decoupled system matrix and b̃ is the decoupled input distribution vector. this leads to an appropriate controller function to represent the switching action [16, 17]:

u_switching = (h^T b̃)^-1 [h^T dx̃_cmd/dt - h^T f̂(x̃) - η sgn(σ(δx̃))], (8)

where f̂(x̃) represents the unmodelled dynamics. however, (8) is not very practical, due to system noise and actuator dynamics, which result in chattering [10]. small values of σ cause the switching term to add a magnitude of η to the control action; the controller consequently overcompensates for the small error and the input signal oscillates around σ = 0. this may result in high actuator wear and may excite high frequency modes of the system. for this reason, other switching regimes that incorporate a boundary layer φ_bl around the sliding surface can be used [10]. for this application, a saturation function is employed. it is similar to the sgn function in that when σ/φ_bl > 1 or σ/φ_bl < -1 the output of the sat function is the same as that of the sgn function; however, within the boundary layer, |σ| ≤ φ_bl, the output is equal to σ/φ_bl. this is known as pseudo (or soft) switching, as it removes the hard transition between -1 and 1 [17]:

sat(σ/φ_bl) = {  1,        σ/φ_bl > 1
                 σ/φ_bl,  -1 ≤ σ/φ_bl ≤ 1
                -1,        σ/φ_bl < -1.   (9)

when within this region, however, there is no guarantee that the sliding surface will be reached [10]. hence, there must be a trade-off in terms of robustness, performance and chattering.
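the soft switch (9) and the switching term (8) in code form; a minimal sketch, with h, b, η (eta) and φ_bl (phi_bl) as invented tuning values.

```python
import numpy as np

def sat(s):
    # eq. (9): linear inside the boundary layer, +/-1 outside
    return np.clip(s, -1.0, 1.0)

def u_switching(h, b, x_cmd_dot, f_hat, sigma, eta, phi_bl):
    # eq. (8) with sgn replaced by the soft switch of eq. (9)
    return (h @ x_cmd_dot - h @ f_hat - eta * sat(sigma / phi_bl)) / (h @ b)

h, b = np.array([1.0, 0.5]), np.array([0.0, 2.0])     # invented 2-state channel
print(u_switching(h, b, np.array([0.1, 0.0]), np.zeros(2),
                  sigma=0.3, eta=0.8, phi_bl=0.5))
```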
combining the equivalent term (5) with the soft-switching term gives the total control effort as

u = -k^T x̃ + (h^T b̃)^-1 [h^T dx̃_cmd/dt - h^T f̂(x̃) - η sat(σ(δx̃)/φ_bl)]. (10)

4 aeronautical design standards ads-33

the ads-33 document outlines the desired handling qualities of military rotorcraft (see ads-33 and cooper-harper for more information [12, 19]). this paper centres upon the design and implementation of an attitude command response type, which offers a lower level of agility than rate response types but tends to offer a higher level of stabilisation. this means that a step input to the cyclic or pedals produces a constant attitude change of pitch, roll or yaw, proportional to the magnitude of the step input. the assessment criteria for attitude response types can be broken down into the following areas: small amplitude, moderate amplitude, large amplitude, and inter-axis coupling. as well as the above requirements, the impulse response must be observed. for attitude hold response types, the attitude should return to trim following a control input. by observing the response following an impulse command, the time for the attitude to return to trim is measured. the requirement for pitch, roll and yaw is that, for a pulse controller input, the attitude should return to within 10 % of the peak value within 10 seconds. for yaw (heading) there is the additional requirement that on release of the directional controller the rotorcraft captures the reference heading within 10 % of the yaw rate at release [12],

ψ_f = ψ_r ± 0.1 r_r, (11)

or within 1° of the attitude at release,

ψ_f = ψ_r ± 1°, (12)

whichever is greater. here r_r is the yaw rate at the time of controller release, ψ_r is the yaw attitude at controller release, and ψ_f is the final yaw attitude. small amplitude inputs are defined in two parts: the short-term and the mid-term response. the short-term response is defined by bandwidth and phase delay parameters. the bandwidth ω_bw is defined to be equal to the phase-limited bandwidth ω_bw_phase (the frequency giving a 45° phase margin) [12]. the phase delay parameter is defined as

τ_p = Δφ_2ω180 / (57.3 · 2 ω_180), (13)

where Δφ_2ω180 is the difference in phase between the 180° frequency, ω_180, and twice the 180° frequency, 2 ω_180. however, ads-33 [12] states that if the gain-limited bandwidth ω_bw_gain is lower than ω_bw_phase (a gain margin of less than 6 db), then the rotorcraft may be prone to pilot induced oscillation (pio). this is due to interactions between the helicopter and the pilot that can cause the aircraft to become unstable. the mid-term response concerns the damping factor ζ at frequencies below the bandwidth frequency ω_bw found above. it is a measure of the controller's ability to reject unwanted oscillations caused by disturbances and high order dynamics. ads-33 states that for level 1, ζ > 0.35 [12]. the moderate amplitude requirements are also known as attitude quickness, because the criterion is a ratio of the peak achievable rate to the peak attitude change. this ratio can be compared with the ads-33 criteria for different magnitudes of attitude change. the criterion is structured so that the overshoot and undershoot characteristics of the attitude response are detrimental. this measure is particularly relevant for rate response types, where the angular rate of change is controlled; rate systems tend to offer the highest level of agility at the sacrifice of stability.
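the quickness ratio can be computed directly from a response trace and compared against the level boundaries for each magnitude of attitude change; the trace below is invented.

```python
import numpy as np

def attitude_quickness(t, attitude, rate):
    """ads-33-style ratio: peak rate over peak attitude change (1/s)."""
    return np.max(np.abs(rate)) / np.max(np.abs(attitude - attitude[0]))

t = np.linspace(0.0, 5.0, 501)
attitude = 20.0 * (1.0 - np.exp(-2.0 * t))     # invented 20-degree attitude capture
rate = np.gradient(attitude, t)
print(round(attitude_quickness(t, attitude, rate), 2))   # ~2.0 1/s for this trace
```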
moderate amplitude requirements are not of major concern here, because attitude control sacrifices agility to improve stability; hence, attitude quickness is degraded. the large amplitude response is important, as it helps to assess the craft's ability to retain high levels of handling at points where the non-linearities are most severe [20]. as controllers are typically designed with linear, small-amplitude approximations of the real system, it is necessary to test the system outside the range where these approximations are valid. the requirements for aggressive target acquisition and tracking manoeuvres (the highest specification), for hover and low speed flight, are ±30° for pitch and ±60° for roll. there is no large amplitude requirement for yaw, as the aircraft should be able to perform 360° rotations indefinitely. the manner in which pitch is affected by roll, and vice versa, is the inter-axis coupling. for pitch and roll coupling, the ratio of the roll attitude change due to a commanded pitch attitude change following a fast input should not exceed ±0.25 for level 1 [12], and vice versa.

5 control system

the control structure employed to implement the smc is shown in fig. 1. the system incorporates a model reference block in the form of a low pass pre-filter. this takes the pilot commands, filters out any high frequency inputs, and provides the demanded attitudes and rates (6 states). the 3 individual controllers provide the control action. however, as the controllers are decoupled, assuming no cross coupling of the dynamics or actuators, this needs to be taken into account. the unmodelled system dynamics (which include the cross coupling, i.e. the off-axis terms in the system matrix A) are represented by the term f̂(x̃) in the controller given in equation (10). the cross coupling caused by the actuator dynamics (the off-axis terms in the input distribution matrix B) is handled by a pre-compensation matrix, whose effect is to diagonalise the B matrix:

[θ_1c, θ_1s, θ_0tr]^T = P [θ_1c,smc, θ_1s,smc, θ_0tr,smc]^T, (14)

where the pre-compensation matrix P has unit diagonal entries and off-diagonal entries formed from ratios of the elements b_i,j (i = 4, ..., 6; j = 2, ..., 4) of B, and θ_smc are the respective outputs from the 3 decoupled smc controllers. the output from the pre-compensator is then added to the trim value found in [14], which is valid for the hover flight condition. the controllers themselves are designed as multi-state systems, with each controller design incorporating the appropriate rate and attitude states. it has been found that when testing the controller using the state space matrices A and B, a very high gain system is realized. however, when testing upon the full helicopter model, this had to be greatly reduced: the high order dynamics of the system become troublesome, and noise, in the form of angular rate transferred from main rotor vibration, necessitates a large reduction of gain and a large increase in the boundary layer. a hard switching controller is not possible in this system for this reason, and therefore a soft switching sat regime is used. finally, the controllers are designed with closed loop poles at 0 and -4. the input pre-filters have a bandwidth of double the ads-33 minimum bandwidth [12], with a dc gain of 10 db. the switching gain and boundary layer are chosen to give stable responses at high amplitude inputs.
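the intent of the pre-compensation (14) - making the effective input distribution diagonal - can be sketched numerically. the sub-block values below are invented, and the exact normalisation used in (14) may differ.

```python
import numpy as np

# invented 3x3 sub-block of B: rows p, q, r; columns theta_1s, theta_1c, theta_0tr
B_sub = np.array([[ 0.9, 6.2, 0.3],
                  [ 4.8, 0.7, 0.1],
                  [-0.4, 0.6, 3.5]])

# candidate pre-compensator: keep the on-axis gains, cancel the off-axis terms
P = np.linalg.inv(B_sub) @ np.diag(np.diag(B_sub))
print(np.round(B_sub @ P, 3))        # diagonal: the three channels are decoupled
```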
fig. 1: control structure diagram
fig. 2: pitch attitude bode plots including the h^T dx̃_cmd/dt term
fig. 3: pitch attitude bode plots not including the h^T dx̃_cmd/dt term

during the evaluation of these control schemes, it has been found that the controller produces a system with a low gain margin, ω_bw_gain < ω_bw_phase, as can be seen in fig. 2, which shows bode plots of gain and phase for pitch between 1 and 10 rad/sec. this produces a system prone to pilot induced oscillations, as the gain-limited bandwidth is lower than the phase-limited bandwidth, giving a gain margin of approximately 2.5 db. this is a highly undesirable attribute for the control system. the way round this problem is to increase the gain roll-off characteristic. the most suitable manner to accomplish this is the removal of the h^T dx̃_cmd/dt term in the switching function. this term produces a large actuator command from the controller, which gives the system a fast actuation following a command input. the effect of omitting this term is to increase the roll-off of the gain and to increase the phase delay. this increased phase delay decreases ω_bw_phase, which in turn increases the gain margin, as can be seen in fig. 3. the other effect of this is evident in the moderate amplitude evaluation. when the h^T dx̃_cmd/dt term is included, the level 1 requirements can be satisfied. however, when the term is omitted, the system does not have the high actuation signal that provides much of the control effort following a command input, resulting in a more sluggish response, as reflected in figs. 6, 10 and 15. another issue concerning attitude control is that of steady state tracking. it has been found that the controller given in equation (10) is insufficient to provide adequate tracking. for this reason it is desirable to include an integral term in the controller. this results in a new controller given by

u = -k^T x̃ + (h^T b̃)^-1 [-h^T f̂(x̃) - λ ∫ σ(δx̃) dt - η sat(σ(δx̃)/φ_bl)], (15)

where λ is the integral gain.

6 pitch control

the pitch form of the smc comprises two states, pitch rate and pitch attitude. this allows both rate control and attitude control to be accomplished with the same controller, only requiring different command inputs for each task. the first requirement for an attitude hold response type concerns the impulse response. fig. 4 shows the response to a rapid controller impulse. it can be seen that the response is not first order, due to the negative overshoot. also observable is the coupling with roll, which can be seen to be oscillatory, due to the high order rotor dynamics. however, the requirement for returning to trim is met, as the pitch returns to 10 % of the peak deviation within approximately 2 seconds, and roll and yaw return to 10 % of their peaks within less than 5 seconds. the second requirement is that of small amplitude bandwidth and phase delay. fig. 5 shows the resulting bandwidth and phase delay in relation to the ads-33 requirements, indicating that the level 1 requirement can be satisfied. the bandwidth can be increased by increasing the gain or increasing the bandwidth of the input filter; however, this has the effect of increasing the phase delay, due to the onset of high order dynamics that are unaccounted for in the controller. these contribute a large decay in phase at high frequencies (over 10 rad/sec).
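for reference, ω_bw and τ_p of eq. (13) can be read off a computed frequency response; the plant below (second order plus a transport delay) is invented.

```python
import numpy as np

w = np.logspace(-1, 2, 2000)                    # rad/s
tau, wn, zeta = 0.05, 4.0, 0.7                  # invented plant parameters
phase = -np.degrees(np.arctan2(2 * zeta * wn * w, wn**2 - w**2)) - np.degrees(w * tau)

i180 = np.argmin(np.abs(phase + 180.0))
w180 = w[i180]
w_bw = w[np.argmin(np.abs(phase + 135.0))]      # 45-degree phase margin frequency
dphi = phase[i180] - phase[np.argmin(np.abs(w - 2.0 * w180))]
tau_p = np.radians(dphi) / (2.0 * w180)         # eq. (13), with the phase in radians
print(f"w_bw ~ {w_bw:.2f} rad/s, tau_p ~ {tau_p:.3f} s")
```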
the mid-term requirement is a minimum damping factor of ζ > 0.35 in order to satisfy level 1 handling [12]. the pitch damping is found to be 0.68, well above this requirement.

fig. 4: attitude response to rapid longitudinal cyclic
fig. 5: ads-33 requirements for pitch attitude (ω_bw_pitch vs phase delay) [12]
fig. 6: moderate amplitude pitch attitude [12]

the attainable attitude quickness and the ads-33 requirements are shown in fig. 6. it can be seen that the level 1 requirements cannot be satisfied over the total range, due to the omission of the h^T dx̃_cmd/dt term in the controller, which provides much of the initial control effort. a constant level in the attitude quickness is not observed in fig. 6, due to the overshoot seen at high attitudes (see fig. 7), which reduces the moderate amplitude response. the large amplitude requirement for attitude control is a minimum of ±30° [12]. this is shown in fig. 7 for a positive step in pitch. it should be noted that there is a slight overshoot present in the response, but the coupling between pitch and roll and yaw is minimal.

7 roll control

the first requirement for roll is that of the impulse response, shown in fig. 8. like that of pitch, the response is not strictly first order, due to the negative overshoot. however, it can be seen that this requirement is satisfied, as the roll returns to less than 10 % of the peak value in under 3 seconds, far less than the 10 second requirement. also, as there is little deviation in pitch and yaw, this is not of concern, and it demonstrates the successful decoupling of pitch and yaw from the roll channel. the phase delay requirement in roll is the most stringent, with a required phase delay of under 0.12 sec for level 1. fig. 9 shows the resulting bandwidth and phase delay in relation to the ads-33 requirements; the level 1 requirement is met, although further tuning is required to reduce the phase delay τ_p. for the mid-term response, the roll damping is found to be 0.70. the moderate amplitude requirements in roll are the most stringent. the attainable attitude quickness and the ads-33 requirements are shown in fig. 10. it can be seen that the level 1 requirements cannot be satisfied over the whole range, due to the omission of the h^T dx̃_cmd/dt term in the controller that provides much of the initial control effort. a constant level in the attitude quickness is not observed in fig. 10. the level of overshoot observed in the roll response does not increase greatly as a function of the commanded input (the overshoot only varies from 1° to 2° over the 10° to 60° attitude range), due to the effect of the integral action improving the steady state tracking. this means that the ratio of peak rate to maximum attitude change is lower for small attitude changes than for large attitude changes. consequently, the moderate amplitude response is poorer in the lower half of the attitude range.

fig. 7: large amplitude response of pitch command
fig. 8: attitude response to rapid lateral cyclic input
fig. 9: ads-33 requirements for roll attitude (ω_bw_roll vs phase delay) [12]
fig. 10: moderate amplitude of roll attitude [12]
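the return-to-trim checks quoted above (back within 10 % of the peak, within 10 s) are easy to script; a sketch on an invented decaying response.

```python
import numpy as np

def returns_to_trim(t, y, window=10.0, fraction=0.1):
    """check the attitude-hold impulse criterion on a sampled response."""
    peak = np.max(np.abs(y))
    bad = np.where(np.abs(y) > fraction * peak)[0]   # samples still outside the band
    if bad.size == 0:
        return True, float(t[0])
    if bad[-1] + 1 >= len(t):
        return False, float(t[-1])                   # never settles within the record
    t_settle = float(t[bad[-1] + 1])
    return t_settle <= window, round(t_settle, 2)

t = np.linspace(0.0, 15.0, 1501)
y = np.exp(-0.8 * t) * np.sin(3.0 * t)               # invented impulse response
print(returns_to_trim(t, y))
```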
the large amplitude requirement demands, for rate control, a minimum of ±50°/sec [12]. fig. 11 shows a positive step command in roll. it should be noted that there is overshoot present in the response, some high frequency oscillation, and coupling between roll, pitch and yaw. as with pitch, this coupling can be improved upon by increasing the gain of the off-axis controllers at the expense of the commanded responses in those axes.

8 coupling

inter-axis coupling is defined as the ratio of roll due to pitch, and vice versa. for the coupling of pitch due to roll, a ratio of 0.008 is attained, and for roll due to pitch, a ratio of 0.022 is found. this is well below the 0.25 level 1 requirement [12].

9 yaw control

yaw attitude (or heading) control has essentially the same requirements applied to it as pitch and roll at hover and low speeds (under 40 kts). the first requirement, like that of pitch and roll, is that the yaw attitude should return to trim within 10 seconds of a rapid controller input. as can be seen from fig. 12, this requirement can be met. yaw control exhibits superior damping to that of pitch and roll, as there is minimal overshoot and the response approximates first order behaviour. however, it can be seen that coupling of pitch and roll from this axis is still present, due to the deviation of both attitudes from trim following the input. the additional requirement for yaw attitude hold is that of capturing the reference heading within 10 % of the yaw rate following the release of a commanded yaw input. fig. 13 shows the response to a command resulting in a step change in yaw rate, which is rapidly applied and released. the final yaw attitude ψ_f should remain within the greater of ψ_r ± 1° or ψ_r ± 0.1 r_r. it is found that for the 6°/sec yaw rate and a release attitude of 18°, ψ_f is 18.5°, which is within the 10 % of the yaw rate at the time of release (0.6°/sec) and well under the 1° requirement. the yaw bandwidth requirements are the highest, with a required bandwidth of at least 3.5 rad/sec. fig. 14 shows the resulting bandwidth and phase delay in relation to the ads-33 requirements; the level 1 requirement is satisfied.

fig. 11: large amplitude response of roll
fig. 12: attitude response to rapid tail rotor cyclic input
fig. 13: yaw attitude (heading) capture
fig. 14: ads-33 bandwidth for yaw attitude (ω_bw_yaw) [12]

for the mid-term response, the yaw damping has been found to be 0.9. the attainable attitude quickness and the ads-33 requirements are shown in fig. 15. it can be seen that the level 1 requirements cannot be satisfied, due to the omission of the h^T dx̃_cmd/dt term in the controller that provides much of the initial control effort.

fig. 15: moderate amplitude of yaw attitude [12]

it must be noted that these results are for negative yaw commands. although the resulting attitude quickness is very similar for positive steps, at inputs over 40°/sec the system can become unstable. this is likely to be due to the asymmetry of the aircraft arising from the direction of rotation of the main rotor, the placement of the tail rotor, and the effect of the wake caused by the former on the latter, which excites coupling between the yaw axis and the roll axis.

10 conclusions

in this paper, the successful implementation of a sliding mode attitude control system has been presented. the assessment of these controllers against the ads-33 requirements has shown that level 1 handling can be achieved for pitch, roll and yaw, creating a high bandwidth, high gain, stable control platform.
10 conclusions
in this paper, the successful implementation of a sliding mode attitude control system has been presented. the assessment of these controllers against the ads-33 requirements has shown that level 1 handling can be achieved for pitch, roll and yaw, creating a high bandwidth, high gain, stable control platform. despite the use of a decoupled control strategy, the coupling between the channels is well within acceptable levels. although the controller design does not take the high order dynamics into account, the testing of the controllers shows that the smc system can cope with such dynamics. however, there is room for improvement in this system. many of the high order dynamics can be observed in the responses, and their removal would be desirable. the issue of attaining sufficient gain margins to avoid pio has been addressed by altering the controller structure, but at the cost of increased phase lag, reduced gain and reduced available bandwidth, resulting in a less agile system, as reflected in the moderate amplitude measurements. the systems exhibit high levels of damping and allow for large amplitude manoeuvres while maintaining stable control and low levels of coupling.
references
[1] manness, m. a., gribble, j. j., murray-smith, d. j.: “helicopter flight control law design methodologies”, september 1991.
[2] prouty, r. w.: “helicopter control systems: a history.” journal of guidance, control, and dynamics, vol. 26 (2003), no. 1, january–february 2003.
[3] ingle, s. j., celi, r.: “effects of higher order dynamics on helicopter flight control law design.” presented at the annual forum of the american helicopter society, washington, d. c., june 1992.
[4] manness, m. a., murray-smith, d. j.: “aspects of multivariable flight control law design for helicopters using eigenstructure assignment.” journal of the american helicopter society, vol. 37 (1992), no. 3, july 1992.
[5] tischler, m. b.: “digital control of highly augmented combat rotorcraft.” nasa technical memorandum 88346, usaavscom technical report 87a-5, may 1987.
[6] bag, s. k., spurgeon, s. k., edwards, c.: “robust sliding mode design based upon output feedback.” conference publication no. 427, ukacc international conference on control, 2–5 september 1996, p. 406–411.
[7] fossard, a. j.: “helicopter control law based on sliding mode with model following.” international journal of control, vol. 57 (1993), no. 5, p. 1221–1235.
[8] sira-ramirez, h., zribi, m., ahmad, s.: “dynamical sliding mode control approach for vertical flight regulation in helicopters.” iee proc. control theory appl., vol. 141 (1994), no. 1, january 1994, p. 19–24.
[9] spurgeon, s. k., edwards, c., foster, n. p.: “robust model reference control using a sliding mode controller/observer scheme with application to a helicopter problem.” ieee workshop on variable structure systems, 1996.
[10] edwards, c., spurgeon, s. k.: sliding mode control: theory and applications. taylor and francis ltd, london, 1998.
[11] slotine, j. j. e., li, w.: applied non-linear control. prentice hall, 1992.
[12] ads-33e-prf, “aeronautical design standard, 33e, handling qualities requirements for military rotorcraft.” united states army aviation and missile command, aviation engineering directorate, redstone arsenal, alabama, 21st march 2000.
[13] houston, s. s.: rotorcraft aerodynamics simulation for control analysis – mathematical model definition. university of glasgow, dept. of aerospace engineering, internal report no. 9123, 1991.
[14] houston, s. s.: “validation of a non-linear individual blade flight dynamics model using a perturbation method.” the aeronautical journal of the royal aeronautical society, vol. 98 (1994), no. 977, august–september 1994.
[15] houston, s.
s.: “validation of a blade-element helicopter model for large amplitude manoeuvres.” the aeronautical journal of the royal aeronautical society, vol. 101 (1997), no. 1001, january 1997, p. 1–7.
[16] healey, a. j., lienard, d.: “multivariable sliding mode control for autonomous diving and steering of unmanned underwater vehicles.” ieee journal of oceanic engineering, vol. 18 (1993), no. 3, p. 327–339.
[17] mcgookin, e. w.: “sliding mode control of a submarine.” m. eng thesis, university of glasgow, dept. of electronic and electrical engineering, 1993.
[18] kautsky, j., nichols, n. k., van dooren, p.: “robust pole assignment in linear state feedback.” international journal of control, vol. 41 (1985), no. 5, p. 1129–1155.
[19] cooper, g. e., harper jr., r. p.: “the use of pilot rating in the evaluation of aircraft handling qualities.” nasa-tn-d-5153, april 1969.
[20] osder, s., caldwell, d.: “design and robustness issues for highly augmented helicopter controls.” journal of guidance, control and dynamics, vol. 15 (1992), no. 6, november–december 1992.
david j. mcgeoch
phone: +44 141 330 6137
fax: +44 131 330 6004
email: d.mcgeoch@elec.gla.ac.uk
dr. euan w. mcgookin
center for systems and control
department of electronic and electrical engineering
university of glasgow, glasgow, g12 8lt, uk
dr. stewart s. houston
rotorcraft flight dynamics group
department of aerospace engineering
university of glasgow, g12 8qq, uk
measurement of two phase flow
j. novotný, j. nožička, j. adamec, l. nováková
this paper presents the results of experiments with wet steam. the aim of the experiment was to measure the growth velocity of a condensing nucleus in wet steam in dependence on the velocity of condensation. for the experiments in wet steam an experimental setup was designed and constructed, which generated superheated steam at lowered pressure and a temperature of 50 °c. the low pressure and temperature of the hot vapour were chosen in order to minimize the risk of an accidental disruption of the wall. the size of the condensing nucleus was measured by the method of interferometric particle imaging (ipi).
the ipi method is a technique for determining the size of transparent and spherical particles, based on counting the fringes captured on a ccd array. the number of fringes depends on the particle size and on the optical configuration. the experimental setup used is identical with the setup for measuring flow by the stereo piv method; the only difference is the use of a special camera mount comprising a semi-transparent mirror, which enables both cameras to be focused on one point. we present the results of the development of the growth of a condensing nucleus, and histograms of the sizes of all measured particles depending on position and condensation velocity.
keywords: two phase flow, condensation, wet steam, ipi measurement.
1 introduction
since 2001, a piv laboratory has been built up at the department of fluid mechanics. the piv system is now equipped with hardware and software for 3d measurements in air and water, for concentration and temperature measurements in water using the plif method, and for two-phase flow measurements with the lif and ipi methods. one of our partial goals is to verify the suitability of the ipi method for measuring the size of the condensing cores in wet steam. we will design a simple measurement course, build it and check its properties.
2 proposition of the measurement course
for the desired outputs, we had to build a measurement course with the following properties:
– it has to be able to regulate the generated steam and its condensation in the measurement area
– the course has to be safe for the personnel
– the price has to be under $1000
– it must be suitable for ipi measurement
fig. 1: scheme of the measurement course
for safety reasons we proposed a measurement course for low temperature superheated steam at lowered pressure. if the wall is damaged, there will be an implosion, which is much more acceptable than the possible damage from a regular explosion of a course filled with overheated steam at high pressure. the measurement course (fig. 1) consists of two reservoirs (a and b), which are connected through an overheater (cooler). both chambers are filled to 1/10 of their volume with water and evacuated down to the boiling point. the water in chamber a is warmer, and so it needs a higher pressure than in chamber b to reach the boiling point. above the water surface there is in both cases saturated steam. because of the higher pressure in chamber a than in chamber b, we get a sufficient hydraulic gradient for flow between the steam source and the measurement area. the pipe that connects the chambers has a jet with a 2-millimeter diameter. the two chambers are separated by a tap (c). after the evacuation of both chambers and the fixation of both pressures, stopcock c is opened and the saturated steam flows through the overheater into the measurement area with lower pressure steam. here the steam condenses. going through the overheater, the steam becomes a little hotter and overheated steam emerges. with the degree of overheating we regulate the beginning of condensation in the measurement area. with the ipi method and our laser (wavelength 532 nm), we are able to measure only particles larger than 5 μm, so the regulation of the beginning of the condensation is essential, because the condensation cores are superfine at the beginning of the condensation. for the construction of the course we used a vessel with a volume of approx. 3 l as the source of warm steam. the measurement area, with a volume of 20 l, was made of glass sheets (10 mm in thickness) and stiffened with crossbars. the vessel of the overheater (cooler) was made of brass sheet. inside the overheater there is a ribbed pipe, which connects the chamber with “hot” steam and the measurement chamber. the pipe has a regulation tap, so that we are able to separate the chambers. the ribbed pipe is surrounded by water at a certain temperature, which has the function of a heat transfer medium. both chambers are fitted with thermocouples and pressure pickups. the pressure and temperature of the water in the overheater can be set at will.
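as a rough, purely illustrative aside (not part of the original design), the hydraulic gradient between the two chambers can be estimated from the saturation curve of water, e.g. with the common antoine correlation; the constants below are the standard ones for water between about 1 °c and 100 °c, and the two chamber temperatures are assumed example values:

```python
import math

def p_sat_water_kpa(t_celsius):
    """saturation pressure of water from the antoine equation
    (standard constants for water, ~1-100 degc; mmhg converted to kpa)."""
    a, b, c = 8.07131, 1730.63, 233.426
    p_mmhg = 10.0 ** (a - b / (c + t_celsius))
    return p_mmhg * 0.133322  # 1 mmhg = 0.133322 kpa

# assumed example temperatures: warmer source chamber a, cooler chamber b
t_a, t_b = 50.0, 30.0
p_a, p_b = p_sat_water_kpa(t_a), p_sat_water_kpa(t_b)
print(f"p_a = {p_a:.2f} kpa, p_b = {p_b:.2f} kpa, gradient = {p_a - p_b:.2f} kpa")
```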
3 measurement method
interferometric particle imaging (ipi) is a measurement method based on obtaining the particle size by measuring the spacing of the interference fringes in the image of a defocused particle. we get the interference fringes in the image of a defocused particle by illuminating the spherical particle with a source of monochromatic light. there are then two reflections of the ray: on the front side, the ray reflects directly from the wall of the particle, while on the distant wall it first passes through the wall and then reflects at the interface between the particle and the surrounding medium. the intensities of these two rays are the same if we place the ccd camera at the right angle to the laser sheet. these two rays can be caught on the ccd chip as two light points. because both light rays are generated by the same source and their paths up to the moment of impact on the ccd chip are different, they are phase shifted. if we defocus the lens of the ccd camera, both points will begin to generate characteristic young fringes. in the place where these fringe systems cross, we can see the characteristic interference stripes. their spacing, together with the optical setup of the lens and other parameters, gives us the size of the particle. the principle of measuring the size of the particles is shown in fig. 2. we can obtain the particle diameter dp = nfr κ through the following expression for the geometric factor κ:

κ = (2λ/α) · [cos(θ/2) + m sin(θ/2) / √(m² + 1 − 2m cos(θ/2))]⁻¹, α = 2 arcsin(da/(2z)),

where: dp is the particle diameter, nfr the number of fringes, κ the geometric factor, λ the wavelength of the light source, α the collecting aperture angle, da the aperture diameter, z the distance from the light sheet to the camera lens, θ the observation angle, and m the relative index of refraction.
fig. 2: the principle of measuring a transparent and spherical particle (light source, particle, reflection and refraction paths, lens, ccd chip). the pattern of a defocused particle with interference fringes is shown in the right hand corner.
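the sizing relation can be evaluated directly. the sketch below is our own illustration of the relation as reconstructed above (the function name and all numerical values, including the 532 nm wavelength and the geometry, are assumptions for the example, not data from the experiment):

```python
import math

def ipi_particle_diameter(n_fringes, wavelength, d_aperture, z, theta, m=1.33):
    """particle diameter from the number of interference fringes (standard
    ipi relation for reflection / first-order refraction). theta is the
    observation angle in radians, m the relative refractive index
    (about 1.33 for water droplets in air)."""
    alpha = 2.0 * math.asin(d_aperture / (2.0 * z))  # collecting aperture angle
    geom = math.cos(theta / 2.0) + (m * math.sin(theta / 2.0)) / math.sqrt(
        m**2 + 1.0 - 2.0 * m * math.cos(theta / 2.0))
    kappa = 2.0 * wavelength / (alpha * geom)        # geometric factor per fringe
    return n_fringes * kappa

# illustrative numbers: 532 nm laser, 50 mm aperture at 0.5 m, 90 deg viewing
d = ipi_particle_diameter(10, 532e-9, 0.05, 0.5, math.pi / 2)
print(f"estimated diameter: {d * 1e6:.1f} um")
```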
4 measurement
the measured area delineated in the picture was 50 mm by 50 mm in size. before the measurement itself, it was necessary to calibrate it with the help of a calibration chart placed in the measurement area. after a successful calibration we performed several experiments with different setups of the temperatures of the saturated steam in the measurement area and of the source of “warm” saturated steam, and even with different temperatures in the overheater. because of the 5 μm limit on the particle size, we were not able to measure the size of the condensation particles using this setup. that is why we proceeded to controlled condensation: we did not warm up the steam going through the overheater, but cooled it down. with this approach it is possible to measure the size and placement of the condensing droplets of water; the results of some experiments are shown in fig. 4.
fig. 3: experimental adjustment
fig. 4: result of experiments. instantaneous flow field and pattern of a defocused particle with an interference fringe: detected position (outer white circle), diameter (inner gray circle), and velocity (arrow).
5 conclusion
the course that we designed and built gives us the opportunity to measure the size of condensing water droplets with the ipi method. using our current laser beam, we are not able to measure the size of particles smaller than 5 μm, or of particles that are placed too close to each other. in order to eliminate these insufficiencies it is necessary to have a camera with a better ccd; with such a camera we will be able to measure in a denser stream of particles. there is also a need to use a light source with better properties for measurements at the beginning of condensation; the wavelength of this light source should be 100 μm. finally, the data should be compared with a numerical approach.
ing. jan novotný
prof. ing. jiří nožička, csc.
e-mail: jiri.nozicka@fs.cvut.cz
doc. ing. jozef adamec, csc.
department of fluid dynamics and thermodynamics
ing. ludmila nováková
department of process engineering
czech technical university in prague
faculty of mechanical engineering
technická 4
166 07 prague 6, czech republic
explicit time integrators for nonlinear dynamics derived from the midpoint rule
p. krysl
we address the design of time integrators for mechanical systems that are explicit in the forcing evaluations. our starting point is the midpoint rule, either in the classical form for the vector space setting, or in the lie form for the rotation group. by introducing discrete, concentrated impulses we can approximate the forcing impressed upon the system over the time step, and thus arrive at first-order integrators. these can then be composed to yield a second order integrator with very desirable properties: symplecticity and momentum conservation.
keywords: time integration, rigid body motion, midpoint rule, symplectic euler, verlet, newmark, midpoint lie algorithm.
1 introduction
it is well-known that the verlet algorithm (explicit newmark for a certain choice of its parameters) may be written as a composition of two first order algorithms, the symplectic euler and its adjoint [2]. what is perhaps less known is that there is an interpretation of this composition in terms of an approximation of the midpoint rule, which is of course implicit in the evaluation of the forcing impulse. our goal in this brief note is to point out that this mechanically inspired derivation yields the well-known second order explicit verlet algorithm in the vector space setting, and also an extremely accurate integrator for rigid body rotations when the midpoint rule is interpreted in the lie sense.
2 vector space midpoint rule approximation
let us write the initial value problem for a mechanical system with configuration u ∈ ℝⁿ in a fairly general form as

ṗ = f, p(0) = p₀,  u̇ = m⁻¹ p, u(0) = u₀,

where ṗ is the rate of linear momentum, and f = f(u, t) is the applied force. for simplicity we shall assume p = mv, where m is a time-independent mass matrix, and v is the velocity. the midpoint approximation of the second equation is

u(t + Δt) = u(t) + Δt m⁻¹ p(t + Δt/2),

where p(t + Δt/2) = p(u(t + Δt/2), t + Δt/2) makes this formula implicit. to approximate the midpoint momentum, we may use the equation of motion in integral form,

p(t + h) = p(t) + ∫_t^{t+h} f(τ) dτ,

and numerically evaluate the impulse integral. this we propose to treat by recourse to the concept of concentrated, discrete impulses delivered at pre-selected time instants. in particular, the impulse may be delivered at the end of the time step, in which case

∫_t^{t+Δt/2} f(τ) dτ = 0.
on the other hand, the impulse may be imposed at the beginning of the time step, and

∫_t^{t+Δt/2} f(τ) dτ = Δt f(t).

therefore, we obtain two algorithms. the first one, φ_Δt, may be recognized as the symplectic euler method, and the second, φ*_Δt, as its adjoint:

φ_Δt : (p_t, u_t) ↦ (p_{t+Δt}, u_{t+Δt}),
p_{t+Δt} = p_t + Δt f_t,
u_{t+Δt} = u_t + Δt m⁻¹ p_{t+Δt}    (symplectic euler);

φ*_Δt : (p_t, u_t) ↦ (p_{t+Δt}, u_{t+Δt}),
u_{t+Δt} = u_t + Δt m⁻¹ p_t,
p_{t+Δt} = p_t + Δt f_{t+Δt}    (symplectic euler adjoint).

these algorithms are symplectic [2] and momentum conserving. their accuracy is only linear in the time step, but their composition preserves both symplecticity and momentum conservation and yields a second order accurate algorithm. that algorithm may be recognized as the well-known verlet algorithm (explicit newmark with γ = 1/2),

φ_Δt = φ*_{Δt/2} ∘ φ_{Δt/2}    (verlet).
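both maps and their composition fit into a few lines of code. the following sketch is ours, not the author's implementation; the unit-mass harmonic oscillator test case is an assumed example:

```python
import numpy as np

def symplectic_euler(p, u, t, dt, f, m_inv):
    """phi_dt: impulse at the beginning of the step."""
    p_new = p + dt * f(u, t)
    u_new = u + dt * m_inv @ p_new
    return p_new, u_new

def symplectic_euler_adjoint(p, u, t, dt, f, m_inv):
    """phi*_dt: impulse at the end of the step."""
    u_new = u + dt * m_inv @ p
    p_new = p + dt * f(u_new, t + dt)
    return p_new, u_new

def verlet(p, u, t, dt, f, m_inv):
    """second-order composition phi*_{dt/2} o phi_{dt/2} (explicit newmark)."""
    p, u = symplectic_euler(p, u, t, dt / 2, f, m_inv)
    return symplectic_euler_adjoint(p, u, t + dt / 2, dt / 2, f, m_inv)

# assumed test case: unit-mass harmonic oscillator, f = -u
m_inv = np.eye(1)
f = lambda u, t: -u
p, u = np.array([0.0]), np.array([1.0])
for n in range(1000):
    p, u = verlet(p, u, n * 0.01, 0.01, f, m_inv)
print(u, p)  # u ~ cos(10), p ~ -sin(10)
```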
3 midpoint rule approximation on the rotation group
now we shall discuss the dynamics of rigid bodies rotating about a fixed point. (more detail is available in reference [3].) the initial value problem may be written in the convected description (body frame) as

π̇ = skew(π) i⁻¹ π + t, π(0) = π₀,  ṙ = r skew(i⁻¹ π), r(0) = r₀,

where π̇ is the rate of the body frame angular momentum, r is the rotation tensor, t is the applied torque in the body frame, skew(w) is defined by skew(w) h = w × h, and i is the time-independent tensor of inertia in the body frame. the second equation is not in a form suitable for midpoint discretization, because the rotation tensors constitute points of the lie group so(3), which is not a vector space, and linear combinations are not legal operations on rotation tensors. to transform the initial value problem to a form suitable for our purposes, we shall introduce the rotation vector representation of the rotation tensor. the equation of motion is written in the spatial frame as ṁ = r t, where ṁ is the rate of the spatial angular momentum, and consequently the equation of motion may be written in integral form as

π(t) = (exp[skew(ψ(t))] r(t₀))ᵀ (r(t₀) π(t₀) + ∫_{t₀}^{t} r(τ) t(τ) dτ),

where exp[skew(ψ(t))] = r(t) r(t₀)ᵀ is the incremental rotation through the vector ψ. upon time differentiation and identification with the original differential equation of motion, we obtain

ψ̇ = dexp⁻¹[skew(ψ)] i⁻¹ π,

where dexp[skew(ψ)] is the differential of the exponential map. the initial value problem may therefore be rewritten as

π̇ = skew(π) i⁻¹ π + t, π(0) = π₀,  ψ̇ = dexp⁻¹[skew(ψ)] i⁻¹ π, ψ(0) = 0.

the midpoint approximation applied to the second equation yields (Δt > 0)

ψ_{t+Δt}/Δt = dexp⁻¹[skew(ψ_{t+Δt}/2)] i⁻¹ π_{t+Δt/2},

which may be simplified by noting

dexp⁻¹[skew(ψ_{t+Δt}/2)] ψ_{t+Δt} = ψ_{t+Δt}

to give

(1/Δt) i ψ_{t+Δt} = π_{t+Δt/2}.

this equation needs to be solved for the rotation vector. therefore, as for the vector space setting, we get two different algorithms, depending on the chosen approximation of π_{t+Δt/2}. for the impulse applied at the beginning of the time step we obtain the so(3) counterpart of the symplectic euler integrator:

φ_Δt : (π_t, r_t) ↦ (π_{t+Δt}, r_{t+Δt}),
π_{t+Δt} = exp[skew(ψ_{t+Δt})]ᵀ (π_t + Δt t_t),
r_{t+Δt} = r_t exp[skew(ψ_{t+Δt})],

where ψ_{t+Δt} solves

(1/Δt) i ψ_{t+Δt} = exp[skew(ψ_{t+Δt}/2)]ᵀ (π_t + Δt t_t).

on the other hand, the total torque impulse applied at the end of the time step yields the adjoint method:

φ*_Δt : (π_t, r_t) ↦ (π_{t+Δt}, r_{t+Δt}),
π_{t+Δt} = exp[skew(ψ_{t+Δt})]ᵀ π_t + Δt t_{t+Δt},
r_{t+Δt} = r_t exp[skew(ψ_{t+Δt})],

where ψ_{t+Δt} solves

(1/Δt) i ψ_{t+Δt} = exp[skew(ψ_{t+Δt}/2)]ᵀ π_t.

both algorithms are first-order, symplectic, and momentum conserving. as before, the composition of these two algorithms in one time step provides us with a second order accurate algorithm, which is an analogy of the verlet (explicit newmark with γ = 1/2) algorithm for the vector space dynamics:

φ_Δt = φ*_{Δt/2} ∘ φ_{Δt/2}.

fig. 1: fast lagrangian top; on the left hand side convergence in the norm of the error in the body-frame angular momentum, on the right hand side convergence in the norm of the error in the attitude matrix. akw: implicit midpoint rule of austin, krishnaprasad and wang [1]; sw: simo and wong [5]; nmb: krysl–endres newmark algorithm [4]; liemid[1]: implicit midpoint lie; liemid[e1]: adjoint of the symplectic euler (explicit midpoint lie variant 1); liemid[e2]: symplectic euler (explicit midpoint lie variant 2); liemid[ea]: alternating explicit midpoint lie method
the above algorithms have been called the explicit midpoint lie variant 2 and 1, respectively, and the alternating midpoint lie algorithm in reference [3]. it bears emphasis that these algorithms are not simply the symplectic euler and its adjoint: they all reduce to the full midpoint lie algorithm for torque-free motion, and it is only the approximation of the midpoint momentum evaluation that distinguishes them from the implicit midpoint lie rule.
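to make the construction concrete, the following sketch (ours, not the author's code) implements the variant with the torque impulse applied at the beginning of the step; the rodrigues formula supplies the exponential map, the rotation vector is obtained by a simple fixed-point iteration, and the inertia values and initial data are invented for the example:

```python
import numpy as np

def skew(w):
    return np.array([[0, -w[2], w[1]], [w[2], 0, -w[0]], [-w[1], w[0], 0]])

def expm_so3(psi):
    """rodrigues formula for exp[skew(psi)]."""
    a = np.linalg.norm(psi)
    if a < 1e-12:
        return np.eye(3) + skew(psi)
    k = skew(psi / a)
    return np.eye(3) + np.sin(a) * k + (1 - np.cos(a)) * (k @ k)

def explicit_midpoint_lie_step(pi, r, torque, dt, i_inv):
    """one step of the so(3) symplectic-euler variant (impulse at the
    beginning of the step): solve (1/dt) i psi = exp[skew(psi/2)]^t (pi + dt t)
    by fixed-point iteration, then update momentum and attitude."""
    rhs = pi + dt * torque
    psi = dt * (i_inv @ rhs)                 # initial guess
    for _ in range(20):
        psi = dt * (i_inv @ (expm_so3(psi / 2).T @ rhs))
    q = expm_so3(psi)
    return q.T @ rhs, r @ q                  # pi_{t+dt}, r_{t+dt}

# illustrative data: torque-free motion of a body with unequal inertias
i_inv = np.diag([1.0, 2.0, 4.0])
pi, r = np.array([1.0, 0.2, 0.1]), np.eye(3)
for _ in range(100):
    pi, r = explicit_midpoint_lie_step(pi, r, np.zeros(3), 0.01, i_inv)
print(np.linalg.norm(r @ pi))  # spatial angular momentum norm is conserved
```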
4 example
the accuracy of the explicit midpoint lie algorithms is rather remarkable, as may be seen in fig. 1. we show convergence graphs for the fast spinning heavy top (the kinetic energy is dominant, and a numerical method has to deal effectively with precession and nutation, which are motions with distinct frequencies). even the first-order methods perform very well for larger time steps, and the present alternating midpoint lie algorithm is the strongest performer out of a selection of the best currently available implicit and explicit algorithms, including the implicit midpoint lie method.
5 conclusions
we have presented an approach to the construction of rigid body integrators, in particular for general 3-d rotations, that are explicit in the evaluation of the forcing, momentum-conserving, and symplectic. the starting point is the midpoint (implicit) rule, which is then treated with a numerical integration of the forcing with concentrated impulses. the resulting algorithms conserve momentum, are symplectic, first-order, and they are adjoint. consequently, their composition leads to a second order algorithm, which may be readily interpreted as a verlet (explicit newmark) integrator, both in the vector space setting and in the setting of the special orthogonal group of rotations. (we would like to suggest explicit midpoint lie algorithm as an appropriate name for the latter.) for rotational dynamics, this integrator is a new addition to the lineup of current high performance algorithms, and in fact numerical evidence suggests that it is the best second-order integrator to date. applications abound: molecular dynamics, micromagnetics, rigid body dynamics, finite element dynamics of deformable solids.
acknowledgment
support for this research by a hughes-christensen award is gratefully acknowledged.
references
[1] austin, m., krishnaprasad, p. s., wang, l. s.: “almost lie-poisson integrators for the rigid body.” journal of computational physics, vol. 107 (1993), p. 105–117.
[2] hairer, e., lubich, c., wanner, g.: geometric numerical integration. structure-preserving algorithms for ordinary differential equations. springer series in computational mathematics, vol. 31, springer-verlag, 2002.
[3] krysl, p.: “explicit momentum-conserving integrator for dynamics of rigid bodies approximating the midpoint lie algorithm.” international journal for numerical methods in engineering, 2004, submitted.
[4] krysl, p., endres, l.: “explicit newmark/verlet algorithm for time integration of the rotational dynamics of rigid bodies.” international journal for numerical methods in engineering, 2004, submitted.
[5] simo, j. c., wong, k. k.: “unconditionally stable algorithms for rigid body dynamics that exactly preserve energy and momentum.” international journal for numerical methods in engineering, vol. 31 (1991), p. 19–52.
petr krysl
e-mail: pkrysl@ucsd.edu
university of california san diego
la jolla, california 92093-0085, usa
application of fmea methodology for checking of construction’s project documentation and determination of the most risk areas
martin tuháček, ondřej franek, pavel svoboda
czech technical university in prague, faculty of civil engineering, department of construction technology, thákurova 7, 160 00 prague, czech republic
corresponding author: martin.tuhacek@fsv.cvut.cz
abstract. the article deals with an innovative method designed to check the project documentation of buildings at the design stage, specifically exploring the possibility to implement the fmea and pdca methodologies. based on the performed measurements and data collection, it theoretically determines the riskiest areas of the project documentation, which should be given special attention in order to reduce the later costs incurred by construction companies to fix the reported complaints. the research proves that the application of the fmea and pdca methodology can be very useful for the elimination of defects in the project documentation of constructions already in the phase of construction preparation.
keywords: quality control, civil engineering, construction, project, project documentation, construction defects, fmea, pdca.
1. introduction
high quality project documentation is a basic prerequisite for the final quality of a construction project. the quality of the project documentation submitted in the design phase significantly affects the result of the construction project after its completion and during its usage, both in financial and in qualitative terms [1]. the issue of the quality of project documentation in the commercial sector is often underestimated. experience from practice proves that the processed project documentation suffers from many shortcomings; this is also confirmed by the following results of an analysis of expert opinions investigating the causes of defects in construction projects [2]. figure 1 shows an analysis of 537 expert opinions that were prepared in the years 2007–2015. a total of 346 defects (i.e., 64 %) were related to the design or to the concept itself (i.e., defects that were already present in the project documentation of the construction). because of this confirmed connection between the design project documentation and project defects, this research focused on creating a tool for determining the risk of individual claimed defects. the assessment of the risk factor of the claimed defects subsequently helps to focus the quality control of newly submitted project documentation on the commodities that are critical from a financial point of view. this makes it possible to target the quality control of project documentation on commodities whose defects cause large financial losses to construction companies. this approach can also build on already existing regulations that incorporate the practical experience with claimed defects into newly audited projects, i.e., on the principle of continuous quality improvement [3]. besides this, the quality of the project documentation has a significant impact on the quality of the final realization and on the quality of maintenance during the operation of the building [4, 5]. the issue of building operation is directly related to the need to set the operating parameters before creating the project documentation. prescribed technical regulations determining the cycles for the replacement of components and equipment, together with a building user guide, can make a significant contribution to the final quality of a construction project during the project’s life cycle [6]. the worldwide trend in the preparation of construction projects is shifting towards increasing the quality already in the design phase (i.e., towards thorough preparation) [7]. in connection with the effective preparation of construction projects, we increasingly encounter the term lean construction [8, 9]. the idea and effect of applying the “lean” approach can clearly be seen in figure 2 with the macleamy curve [7]. the macleamy curve shown in figure 2 describes the current status of a traditional construction project. the curve numbered one clearly shows the possibility of influencing the costs and the required properties of the construction project over time. the red curve numbered two then indicates how the cost of incorporating changes into a construction project increases as the construction documents are progressively realized and created. the vertical lines divide the individual stages of the project documentation.
the graph clearly shows which stages of the project documentation need attention (i.e., it is in particular in the preparation phase that the quality and the financial intensity of the construction can be influenced). within the methods that are applied in lean construction [8, 9], there is an emphasis on the maximum preparation of the construction project in its initial phase.
figure 1. causes of failures – analysis of expert opinions in the years 2007–2015.
figure 2. macleamy curve – distribution of effort in relation to project changes across project development phases [7].
this phase is very important especially in the field of construction, where each construction project is unique in its own way and requires a proper preparation, especially in this phase. any further interventions in the emerging project become costlier as the project progresses, and careful preparation should eliminate such interventions as much as possible. based on the above, it can be stated that it is better to check the project documentation already in the preparation phase [10, 11]. for a higher efficiency of the quality checking of project documentation, the present research presents and describes computational models thanks to which it is possible to obtain data from an analysis of claimed defects. the aim of the research is to analyse construction defects on the basis of a set of defects claimed in completed constructions. the data were collected from the claimed defects within a large construction company in the czech republic, with a focus on building construction. the research is based on theoretical formulas applied to practical measurements in the analysis of claimed defects.
2. materials and methods
2.1. expert analyses and their use for the documentation checking
in general, the research is based on the application of theoretical knowledge to data that were obtained by the provided measurements. based on the calculation model and the obtained data, it is possible to obtain a comprehensive overview of the defects that have the most significant impact on the financial performance of the building unit with regard to the realization of buildings. the fmea method (failure mode and effects analysis), the pdca cycle (plan-do-check-act) and the basic pareto rule are used as the theoretical relations. the use of expert analyses for the quality control of project documentation depends mainly on the set of processed data and on the required outputs. the analysis of the claimed defects is based on the fmea method. the fmea method is a verbal-numerical, qualitative-quantitative rating method used to assess the failure rate of planned projects in risk analysis, quality management and many other areas. originally, the fmea was developed to analyse complex processes and identify their shortcomings, especially in the engineering industry. it was developed for the government agency nasa for the purposes of space research, specifically for the apollo program, but it has also found its application in the nuclear energy sector. subsequently, the fmea method spread to other industries and has found extensive application in the automotive industry.
the research assumes that, in addition to the engineering industry, the method can also be applied in the construction industry. the fmea method is used to determine the risks of the project documentation; for this method, the construction project is described, including the determination of the individual aspects that require increased attention, with the current knowledge of the potential risks and impacts faced by the construction project [12]. depending on what is the subject of the analysis, the fmea method is further divided into design fmea (which researches the causes of defects), process fmea (which looks for the causes of defects in the production process), and system fmea (a combination of both previous variants) [13]. the target value of any variant of the fmea method is the rpn index (risk priority number), defined by the general formula (1) [12]:

rpn = rt₁ × rt₂ × · · · × rtₘ, (1)

where rpn is the risk priority number [-], and rt₁ to rtₘ are dimensionless expert ratings [-] of the attributes 1 to m, where m is the number of the evaluation criteria chosen by the evaluator, who assigns them to the individual pairs [m, e], where m is the cause of the defect and e is the consequence of the defect. at the same time, the overall riskiness of the project can be determined using the fmea method. it is determined by summing the rpn values found for all the identified pairs, using equation (2) [12]:

rpn_tot = Σ_{k=1}^{m} rpn_k, (2)

where rpn_tot is the total risk of the project [-] and rpn_k is the partial risk priority number for item k; the sum runs over all m identified pairs [m, e]. the total riskiness of the project is determined in order to verify the effect of a project modification on its risk compared to the original project, where the original project can be marked rpn_tot(ori), and the corrected project compared to the original can be marked rpn_tot(ori+1). the fmea method has been modified and adapted in a previous research for use in the field of quality of project documentation of constructions, specifically for the categorization of product or process defects [12], where the fmea method determines the idp defect priority index, which is represented by relation (3) [12]:

idp = sv × rm, (3)

where idp is the index of defect priority [-], sv is the severity of the consequences of the defect [-], and rm is the degree of removability [-]; the values are determined by the evaluator, and the scale of values is recommended to be chosen as an even number. the scale can be perceived as penalty points by which the respective defect is evaluated. for the purposes of this research, the general formula (3) is used in the notation of formula (4):

idp = mv × rm, (4)

where mv is the index of the costs incurred to rectify the defect; for idp and rm the definitions given for formula (3) apply. the fmea method can be suitably implemented in the risk analysis, where the subjects of the evaluation are the above-mentioned pairs [m, e], which take into account three basic attributes, namely the severity of the fault sv, the probability of the fault occurrence lk, and the possibility of detecting the fault before its manifestation or later, dt. the values of these criteria are determined by the evaluator. while using the fmea method for the risk analysis, it is also necessary to pay attention to less frequent cases: despite being cases with a low probability, they can have very serious consequences, which is why they need to be assessed separately.
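formulas (1) and (2) translate directly into code; the sketch below is ours, with invented sample ratings on a four-point scale:

```python
def rpn(*ratings):
    """risk priority number, formula (1): the product of the expert ratings."""
    out = 1
    for r in ratings:
        out *= r
    return out

def rpn_tot(rpn_values):
    """total project risk, formula (2): the sum of the partial rpn values."""
    return sum(rpn_values)

# three attributes per pair [m, e]: severity sv, occurrence lk, detection dt
pairs = [(4, 1, 1), (2, 3, 2), (3, 3, 1)]  # invented ratings, 4-point scale
rpns = [rpn(sv, lk, dt) for sv, lk, dt in pairs]
print(rpns, rpn_tot(rpns))  # [4, 12, 9] 25
```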
it can happen that a detected value rpn = 4 (for a given four-point scale) is insignificant with respect to the value rpn_max = 4³ = 64. however, the severity of such a fault can be considerable, sv = 4, while the probability of its occurrence, lk = 1, and the detectability of the fault, dt = 1, are negligible. it was necessary to determine the values of the evaluation criteria for the research. for the criterion mv, four evaluation levels are defined, the influence of the defect on the repair price being monitored at each level. the risk is therefore derived from the financial impact c of eliminating the defect. the values of the criterion mv are shown in table 1.

table 1. determination of defect price groups for the criterion mv.
cost c of removing the defect [eur]    mv
c < 750                                 1
750 ≤ c < 1500                          2
1500 ≤ c < 2250                         3
c ≥ 2250                                4

in terms of the criterion rm, which expresses how difficult the defect is to remove, the values are sorted from easily correctable defects to defects whose removal is quite complicated. as an example, defects that are elaborate to remove include a dysfunctional waterproofing system of the substructure. the values of the rm criterion are shown in table 2.

table 2. determination of groups according to the difficulty of removing the defect, rm.
difficulty in removing the defect                rm
practically impossible                            4
difficult to remove (in time and financially)     3
easy (but time-consuming realization)             2
undemanding (in time and realization)             1

from the above tables it is clear that the expert ratings take values from 1 to 4. the individual values chosen for the performed measurement can be understood as penalty points: the higher the value, the riskier the defect. in the research, individual defects are divided into categories using metadata, and it is therefore possible to evaluate the data set, identify the weakest group of claimed defects, and focus on it as a part of the project documentation and the subsequent realization in production. for this purpose, the research introduces the so-called defect priority index idp_k, which is based on relation (2). idp_tot is the average value of the defect priority indices of the individual defects in a given category; for the purposes of our research, it is determined by relation (5):

idp_tot = (Σ_{k=1}^{n} idp_k) / n, (5)

where idp_tot is the average value of the defect priority indices [-], idp_k is the defect priority index of the partial defect, and n is the number of the defect priority indices. based on the comparison of the individual categories, it is possible to identify the group with the highest risk and focus the attention during the quality check primarily on it. the risk assessment can also be provided according to iso 31000:2018 risk management – guidelines.
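formula (4) with the binning of tables 1 and 2, and the category average (5), can be sketched as follows (our illustration; the list of defects is invented):

```python
def mv_rating(cost_eur):
    """price criterion mv binned according to table 1."""
    if cost_eur < 750:
        return 1
    if cost_eur < 1500:
        return 2
    if cost_eur < 2250:
        return 3
    return 4

def idp(cost_eur, rm):
    """defect priority index, formula (4): idp = mv x rm."""
    return mv_rating(cost_eur) * rm

def idp_tot(defects):
    """category average of the defect priority indices, formula (5)."""
    values = [idp(cost, rm) for cost, rm in defects]
    return sum(values) / len(values)

# invented category: two cheap, easily removable defects and one costly,
# hard-to-remove waterproofing defect (rm = 4, cf. table 2)
category = [(500, 1), (900, 2), (3000, 4)]
print([idp(c, r) for c, r in category], idp_tot(category))  # [1, 4, 16] 7.0
```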
2.2. a system of continuous quality improvement
continuous quality improvement is the basis of any quality management system and consists of planning, manufacturing, checking (inspecting) and improving the monitored product. an illustrative example of how a continuous quality improvement system works is the pdca cycle. the pdca cycle is an iterative, repeated cyclic method that is based on the four steps plan, do, check, act. the basic principle of this scientific method is its repetition: after completing the entire cycle, in its last phase, the knowledge is evaluated and applied to production. subsequently, the whole process is repeated, in order to verify the application of the improvement and, if necessary, to identify new weaknesses in the process that need to be improved again. this fulfils the idea of a continuous quality improvement and of striving for a perfect operation. the repeated implementation of the pdca cycle is often also depicted as a spiral, which symbolizes the increasing knowledge about the system on the way towards the set goal. each new cycle should come closer to the goal, and each subsequent application of the cycle brings a higher knowledge about the system as a whole, which is researched and improved using this method.
2.3. pareto rule
based on this rule, it can be stated that 20 % of the causes are responsible for 80 % of the complications. it is important to apply this fact in terms of work efficiency when solving quality problems. in practice, the pareto diagram began to be more prominent only thanks to j. m. juran, who used the previous knowledge to compile the pareto diagram. at the same time, based on his experience, he argued that 5 % to 20 % of the causes are responsible for 80 % to 95 % of the problems regarding quality and its management. this makes the method one of the most commonly used in the field of quality management and, at the same time, very suitable for identifying priorities [13].
3. results
3.1. data collection and measurement
for the purposes of the research, measurements, i.e. a comprehensive collection of data on the claimed defects, were performed within an operating unit of a large construction company active in the czech republic. the data were collected from 2017 to 2018, in the form of in-house records for the management of claimed defects. the volume of the building unit’s orders in the field of building construction exceeded eur 75.5 million in those years. the number of
table 4 is divided into the following 21 professional categories: a insulation against water and moisture of the superstructure, b external surface treatment (etics), c hole fillings, d high current, e insulation against water and moisture substructure, f floors and floor coverings, g internal surface treatment, h other unclassified defects, i air conditioning and cooling, j tiling and paving, k low current, l internal water supply and sewerage, m fixtures, n central heating, o surface treatment of metal structures and corrosion, p internal dividing and visible structures, q monolithic reinforced concrete structures, r masonry structures, s measurement and regulation, t light perimeter cladding incl. shielding systems, u other surface treatments. 3.2. applications of computational models within the research, a computational models (4) and (5) were applied to the obtained data to determine the riskiness of the claimed defects. due to the extent of the data obtained, it was appropriate to use the sorting of claimed defects according to the calculated risk. the obtained data divided into subcategories and sorted according to the rp n indicator are shown in table 5. table 5 shows that the riskiest construction technologies include a, which shows a total of 232 reported defects in 2 years with a financial volume of eur 224.208 and an rp n of 5.75 [-]. the financial volume for repairs, of this item only, is several times higher than for any other solution. the second place, according to the rp n indicator, is the area of b with an average rp n of 5.65 [-], then the area of e with an rp n of 5.33 [-], c with an rp n of 5.10 [-], the remaining professions have an rp n of less than 5.00 [-]. by applying pareto’s rule (i.e., selecting approximately 20 % of the most risky items), building technology number of defects costs [pieces] [eur] a 232 224 208 b 34 76 000 c 888 58 445 d 324 50 834 e 392 49 253 f 319 47 211 g 534 34 585 h 232 29 985 i 314 28 547 j 597 23 642 k 318 19 189 l 191 18 004 m 189 12 721 n 176 9 381 o 41 6 755 p 62 6 547 q 58 5 755 r 102 4 415 s 23 3 038 t 18 2 151 u 17 811 table 4. the number and financial volume of claimed defects of the researched construction company in 2017 and 2018. the risk coverage of 80 % of future complaint costs should theoretically be achieved. however, by summing up the values of the claimed areas of a, b, e and c (a total of 19.0 % of items) we get the value of eur 407.906, which is only 57.3 % of the total amount of eur 711.475. the results of the rp n according to individual professional areas are clearly shown in figure 3. figure 3 shows that in the case of the application of the rp n indicator, which, in addition to the price, also takes into account the possibility of remediation of the defect, cheaper occupations may also prove to be riskier than areas with higher costs for removal. 4. discussion measurements and calculations showed that in terms of the dividing of defects, the pareto rules cannot be reliably used for the construction industry, with calculations performed by this research showing that approximately 19 % of the most significant items, ac452 vol. 60 no. 5/2020 application of fmea methodology for checking of construction’s. . . figure 3. sorting of individual professional areas according to the average value of the rp n of claimed defects based on the measured data and application of computational models. 
table 5. the number and financial volume of claimed defects of the researched construction company according to the professional areas for the years 2017 and 2018, sorted according to the calculated rpn.
building technology    number of defects [pieces]    costs [eur]    rpn [-]
a                      232                           224 208        5.75
b                      34                            76 000         5.65
e                      392                           49 253         5.33
c                      888                           58 445         5.10
g                      534                           34 585         4.95
j                      597                           23 642         4.75
d                      324                           50 834         4.63
k                      318                           19 189         4.31
l                      191                           18 004         4.24
f                      319                           47 211         4.12
i                      314                           28 547         4.11
q                      58                            5 755          4.05
n                      176                           9 381          3.92
u                      17                            811            3.87
t                      18                            2 151          3.81
p                      62                            6 547          3.75
o                      41                            6 755          3.45
r                      102                           4 415          3.14
s                      23                            3 038          3.05
m                      189                           12 721         2.98
h                      232                           29 985         2.13

4. discussion
the measurements and calculations showed that, in terms of the distribution of defects, the pareto rule cannot be reliably used for the construction industry: the calculations performed in this research show that approximately 19 % of the most significant items, according to the rpn indicator, affect only 57.3 % of the total costs. the performed research also shows that, thanks to the rpn indicator, it is possible to evaluate not only the financial risks in the field of eliminating claimed defects, but also the complexity of removing the individual defects. comparing table 3 and table 4, it is clear that when the rpn indicators are not used, the priority in checking is given to the professional areas a, b, c, d, which constitute the 19 % of majority professional areas in terms of the financial impact, but the costs of remedying the defects are not taken into account. the use of the rpn indicator allows these costs to be included as well. it can therefore be stated that the rpn indicator is capable of a multi-criteria optimization in terms of determining the majority professional areas for checking (inspecting) the project documentation already in the preparation phase. in the performed measurement, the researched data within the construction company were limited by the warranty period; after its expiration, the construction contractor loses control over the further operation of the building. this can be solved by an ongoing data collection by the facility management, which then takes over the management of the building. the data collection from the facility management can provide other valuable data that can be further analysed; the output of such an analysis can provide valuable data for the design and preparation of construction projects, which will increase the resulting quality of buildings. the aim of the analyses is to obtain up-to-date reports on the claimed defects at any time. at the same time, the question of the need for the interconnection and transmission of information within the entire life cycle of the construction is raised. the building must be designed for the intended purpose, built in accordance with the design, and operated in accordance with the proposed parameters. a building user guide should be a part of every quality documentation; it should clearly define and coordinate the parameters for a reliable operation, and, at the same time, the considered cycles of the replacement and renewal of the partial parts of the building should be stated.
5. conclusion
the risk assessment of claimed defects, which the authors of the article perform on the basis of the fmea method and the pdca cycle in the management and evaluation of claimed defects, is an effective tool for checking the project documentation of prepared construction projects. due to the fact that the process of the registration and evaluation of claimed defects is a continuous process, this activity applies the principle of continuous quality improvement. based on the updated results of the analyses, the tools for checking project documentation are presented and specified.
it is appropriate that all the data on the claimed defects be thoroughly analysed and subjected to the computational models, so that the results, i.e. the specification of the majority risk areas of the project documentation, are continuously updated.
acknowledgements
the authors are grateful to the czech technical university in prague. the presented research was supported by the grant sgs20/005/ohk1/1t/11, czech technical university in prague.
references
[1] m. tuhacek, p. svoboda. quality of project documentation. iop conference series: materials science and engineering 471:052012, 2019. doi:10.1088/1757-899x/471/5/052012.
[2] j. synek. nekvalita jako funkce ztráty informací [non-quality as a function of information loss]. https://www.tzb-info.cz/bim-informacni-model-budovy/15018-nekvalita-jako-funkce-ztraty-informaci, 2016.
[3] iso 9001:2015 quality management systems – requirements. standard, international organization for standardization, brussels, 2015.
[4] l. vesela, j. synek. quality control in building and construction. iop conference series: materials science and engineering 471:022013, 2019. doi:10.1088/1757-899x/471/2/022013.
[5] n. forcada, m. macarulla, m. gangolells, m. casals. handover defects: comparison of construction and post-handover housing defects. building research & information 44(3):279–288, 2016. doi:10.1080/09613218.2015.1039284.
[6] a. sharma, a. saxena, m. sethi, et al. life cycle assessment of buildings: a review. renewable and sustainable energy reviews 15(1):871–875, 2011. doi:10.1016/j.rser.2010.09.008.
[7] j. zhang, c. wu, y. wang, et al. the bim-enabled geotechnical information management of a construction project. computing 100:47–63, 2017. doi:10.1007/s00607-017-0571-8.
[8] a. h. a. jamil, m. s. fathi. the integration of lean construction and sustainable construction: a stakeholder perspective in analyzing sustainable lean construction strategies in malaysia. procedia computer science 100:634–643, 2016. doi:10.1016/j.procs.2016.09.205.
[9] d. carvajal-arango, s. bahamón-jaramillo, p. aristizábal-monsalve, et al. relationships between lean and sustainable construction: positive impacts of lean practices over sustainability during construction phase. journal of cleaner production 234:1322–1337, 2019. doi:10.1016/j.jclepro.2019.05.216.
[10] y. a. olawale, m. sun. cost and time control of construction projects: inhibiting factors and mitigating measures in practice. construction management and economics 28(5):509–526, 2010. doi:10.1080/01446191003674519.
[11] y. olawale, m. sun. construction project control in the uk: current practice, existing problems and recommendations for future improvement. international journal of project management 33(3):623–637, 2015. doi:10.1016/j.ijproman.2014.10.003.
[12] m. tichý. ovládání rizika: analýza a management [risk control: analysis and management]. c. h. beck, praha, 2006.
[13] j. veber, m. hůlová, a. plášková. management kvality, environmentu a bezpečnosti práce: legislativa, systémy, metody, praxe [quality, environment and occupational safety management: legislation, systems, methods, practice]. management press, praha, 2006.
note on the problem of motion of viscous fluid around a rotating and translating rigid body
paul deuring (a), stanislav kračmar (b, c), šárka nečasová (c, corresponding author: matus@math.cas.cz)
(a) université du littoral côte d’opale, centre universitaire de la mi-voix 50, rue f. buisson cs 80699, 62228 calais cedex, france
(b) czech technical university in prague, faculty of mechanical engineering, department of technical mathematics, karlovo nám. 13, 121 35 praha 2, czech republic
(c) czech academy of sciences, institute of mathematics, žitná 25, 115 67 praha 1, czech republic
abstract. we consider the linearized and nonlinear systems describing the motion of an incompressible flow around a rotating and translating rigid body d in the exterior domain ω = ℝ³ \ d, where d ⊂ ℝ³ is open and bounded, with lipschitz boundary. we derive the l∞-estimates for the pressure and investigate the leading term for the velocity and its gradient. moreover, we show that the velocity essentially behaves near infinity as a constant times the first column of the fundamental solution of the oseen system. finally, we consider the oseen problem in the bounded domain ωr := br ∩ ω under certain artificial boundary conditions on the truncating boundary ∂br, and we then compare this solution with the solution in the exterior domain ω to get the truncation error estimate.
keywords: incompressible fluid, rigid body, exterior domain, estimates of pressure, leading terms, artificial boundary conditions.
1. introduction
the boundary value problem of the navier–stokes equations describing flows past a rigid body translating with a constant velocity (with or without rotation) is one of the challenging problems in fluid mechanics. in recent decades, much effort has been made to analyze the properties of both stationary and non-stationary solutions, of both linear and nonlinear mathematical models, both in the whole space and in exterior domains. the difficulty which arises in this type of problem is the variability of the spatial domain in time. to solve it there are two possibilities:
(i) to study the problem in the time dependent domain, see conca, starovoitov and tucsnak [1], desjardins and esteban [2], gunzburger, lee and seregin [3], hoffman and starovoitov [4], etc.
(ii) to use a transformation in order to transform the spatial domain varying in time into a fixed domain. for this approach a global or a local transformation can be applied. the global linear transformation implies that the whole space is rigidly rotated and shifted back to its original position at each time t > 0 (cf. [5]). the equations of motion of the fluid–rigid body system are then written in a frame attached to the rigid body, with its origin in the center of mass of the latter, coinciding with an inertial frame at time t = 0 (for works related to this type of transformation see [6–12]). the local transformation implies that the change of variables only acts in a bounded neighbourhood of the body, the solenoidal condition on the fluid velocity is preserved, and the regularity of the solution is not changed; see e.g. the works of tucsnak, cumsille and takahashi (cf. [13–15]).

1.1. formulation of the problem

let us formulate our problem in the fixed domain which results from applying the global linear transformation; for more details, see [5]. the systems of equations are as follows:

$$-\Delta u(z) + \tau\,\partial_1 u(z) - (\omega\times z)\cdot\nabla u(z) + \omega\times u(z) + \tau\,(u(z)\cdot\nabla)u(z) + \nabla\pi(z) = f(z), \quad \operatorname{div} u(z) = 0 \quad \text{for } z\in\Omega, \tag{1.1}$$

$$-\Delta u(z) + \tau\,\partial_1 u(z) - (\omega\times z)\cdot\nabla u(z) + \omega\times u(z) + \nabla\pi(z) = f(z), \quad \operatorname{div} u(z) = 0 \quad \text{for } z\in\Omega, \tag{1.2}$$

where $D\subset\mathbb{R}^3$ is open and bounded, with lipschitz boundary. the systems (1.1) and (1.2), together with some boundary conditions on $\partial\Omega = \partial D$, constitute the mathematical models (non-linear and linear, respectively) describing the stationary flow of a viscous incompressible fluid around a rigid body which moves at a constant velocity and rotates at a constant angular velocity. in this study we consider the case where the rotation is parallel to the velocity at infinity. (for more details concerning the derivation of the model, see [5, 7]. the description and the analysis of the case where the rotation is not parallel to the velocity at infinity can be found in [16, 17].)

the first aim is to obtain $L^\infty$-estimates for the pressure in the linear and nonlinear cases, since such estimates are missing in the literature: only estimates of the velocity field and of its gradient in $L^\infty$ are available, so complete information about the decay of the solution $(u,\pi)$ of the systems (1.1), (1.2) for $|x|\to\infty$ has been lacking. (for other works see [18, 19].)

second, we are interested in the "leray solutions" of (1.1), supplemented by a decay condition at infinity,

$$u(x)\to 0 \quad \text{for } |x|\to\infty, \tag{1.3}$$

and suitable boundary conditions on $\partial\Omega$. weak solutions are characterized by the conditions $u\in L^6(\Omega)^3\cap W^{1,1}_{\mathrm{loc}}(\Omega)^3$, $\nabla u\in L^2(\Omega)^9$ and $\pi\in L^2_{\mathrm{loc}}(\Omega)$. from [18] and [20] it follows that the velocity part of a leray solution $(u,\pi)$ of (1.1), (1.3) decays for $|x|\to\infty$ as expressed by the estimates

$$|u(x)| \le C\,(|x|\,s(x))^{-1}, \qquad |\nabla u(x)| \le C\,(|x|\,s(x))^{-3/2} \tag{1.4}$$

for $x\in\mathbb{R}^3$ with $|x|$ sufficiently large, where $s(x) := 1 + |x| - x_1$ ($x\in\mathbb{R}^3$) and $C>0$ is a constant independent of $x$. the factor $s(x)$ may be considered as a mathematical manifestation of the wake extending downstream behind a body moving in a viscous fluid. in the work by m. kyed (see [21]) it was shown that

$$u_j(x) = \gamma\,E_{j1}(x) + R_j(x), \qquad \partial_l u_j(x) = \gamma\,\partial_l E_{j1}(x) + S_{jl}(x) \qquad (x\in \overline{D}^c,\ 1\le j,l\le 3), \tag{1.5}$$

where $E:\mathbb{R}^3\setminus\{0\}\mapsto\mathbb{R}^{4\times 3}$ denotes a fundamental solution to the oseen system

$$-\Delta v + \tau\,\partial_1 v + \nabla\varrho = f, \qquad \operatorname{div} v = 0 \quad \text{in } \mathbb{R}^3. \tag{1.6}$$
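for orientation, the anisotropy encoded in the wake factor $s(x) = 1+|x|-x_1$ in (1.4) can be made explicit by a short computation (a worked illustration added here; it follows directly from (1.4)): on the downstream axis $x=(r,0,0)$, $r>0$, one has $s(x) = 1+r-r = 1$, whereas in a direction perpendicular to the flow, $x=(0,r,0)$, one has $s(x) = 1+r$. hence

$$|u(r,0,0)| \le C\,r^{-1}, \qquad |u(0,r,0)| \le \frac{C}{r\,(1+r)} = O(r^{-2}) \quad (r\to\infty),$$

so the velocity is allowed to decay markedly more slowly inside the downstream region than across it; this slow-decay region is precisely the wake.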
the term $E_{j1}(x)$ can be expressed explicitly in terms of elementary functions. the coefficient $\gamma$ is also given explicitly, its definition involving the cauchy stress tensor. the remaining terms $R$ and $S$ are characterized by the relations $R\in L^q(\Omega)^3$ for $q\in(4/3,\infty)$ and $S\in L^q(\Omega)^3$ for $q\in(1,\infty)$. from [22, section vii.3] it is known that $E_{j1}|B_R^c\notin L^q(B_R^c)$ for $R>0$, $q\in[1,2]$, and $\partial_l E_{j1}|B_R^c\notin L^q(B_R^c)$ for $R>0$, $q\in[1,4/3]$, $j,l\in\{1,2,3\}$. the function $R$ decays faster than $E_{j1}$, and $S_{jl}$ decays faster than $\partial_l E_{j1}$, in the sense of $L^q$-integrability. thus, the equations in (1.5) can in fact be viewed as asymptotic expansions of $u$ and $\nabla u$, respectively. let us mention that the results in [21] are valid under the assumption that $u$ verifies the boundary conditions

$$u(x) = e_1 + (\omega\times x) \quad \text{for } x\in\partial\Omega, \tag{1.7}$$

which is not our case. reference [21] does not deal with the $L^\infty$-decay of $R$ and $S$, nor does it indicate whether $S = \nabla R$. below, in theorem 4.1, we derive an $L^\infty$-decay of $u$ and $\nabla u$, respectively, which is independent of the boundary conditions. however, in comparison with [21] and with (1.5), our leading term is less explicit than the term $\gamma E_{j1}(x)$ in (1.5), and instead of the fundamental solution $E_{j1}(x)$ of the stationary oseen system we use the time integral of the fundamental solution of the evolutionary oseen system. in [23] it was proved that $Z_{j1}(x,0) = E_{j1}(x)$ for $x\in\mathbb{R}^3\setminus\{0\}$, $1\le j\le 3$, and $\lim_{|x|\to\infty} |\partial_x^\alpha Z_{jk}(x,0)| = O\big((|x|\,s(x))^{-3/2-|\alpha|/2}\big)$ for $1\le j\le 3$, $k\in\{2,3\}$ ([23, corollary 4.5, theorem 5.1]). thus, setting

$$G_j(x) := \sum_{k=2}^{3} \beta_k\,Z_{jk}(x,0) + F_j(x) \qquad (x\in B_{s_1}^c,\ 1\le j\le 3), \tag{1.8}$$

we may obtain from (4.3) that

$$u_j(x) = \beta_1\,E_{j1}(x) + \Big(\int_{\partial\Omega} u\cdot n\,do_x\Big)\,x_j\,(4\pi|x|^3)^{-1} + G_j(x) \qquad (x\in B_{s_1}^c,\ 1\le j\le 3) \tag{1.9}$$

and

$$\lim_{|x|\to\infty} |\partial^\alpha G(x)| = O\big((|x|\,s(x))^{-3/2-|\alpha|/2}\,\ln(2+|x|)\big) \quad \text{for } \alpha\in\mathbb{N}_0^3 \text{ with } |\alpha|\le 1 \tag{1.10}$$

(theorem 4.2, corollary 4.3). comparing the coefficient $\gamma$ from (1.5) in [21] with the coefficient $\beta_1$ from (1.9) in [24] (see theorem 4.1 below), and taking into account the boundary condition (1.7) in [21], it follows that $\gamma$ and $\beta_1$ are equal.

third, we solve the linear system (1.2) in a truncation $\Omega_R := B_R\cap\Omega$ of the exterior domain under certain artificial boundary conditions on the truncating boundary $\partial B_R$. then we compare this solution with the solution of (1.2) in the exterior domain, i.e., we find the error estimates of the method of artificial boundary conditions. for this aim we use the $L^\infty$-estimates of the velocity and of the pressure.

2. definitions and notation

let us define $s(y) := 1 + |y| - y_1$ for $y\in\mathbb{R}^3$, $\Omega = \mathbb{R}^3\setminus\overline{D}$, $\Omega_R := B_R\cap\Omega$, $B_R^c := \mathbb{R}^3\setminus\overline{B_R}$, where $B_R := \{x\in\mathbb{R}^3;\ |x|<R\}$, for $R>0$ such that $B_R\supset\overline{D}$. so $\Omega_R$ is the truncation of the exterior domain $\Omega$ by the ball $B_R$. the boundary of $\Omega_R$ consists of the parts $\partial\Omega$ and $\partial B_R$; the latter we call the truncating boundary. fix $\tau\in(0,\infty)$, $e_1 := (1,0,0)$, $\omega = |\omega|\,e_1$ with $|\omega|\ne 0$, and

$$\Omega := |\omega|\begin{pmatrix} 0 & 0 & 0\\ 0 & 0 & -1\\ 0 & 1 & 0 \end{pmatrix},$$

so that $\Omega\cdot z = \omega\times z$ for $z\in\mathbb{R}^3$ (the context always distinguishes the matrix $\Omega$ from the domain $\Omega$). for $U\subset\mathbb{R}^3$ open, $u\in W^{2,1}_{\mathrm{loc}}(U)^3$, $z\in U$, put

$$(\mathcal{L}u)(z) := -\Delta u(z) + \tau\,\partial_1 u(z) - (\omega\times z)\cdot\nabla u(z) + \omega\times u(z),$$
$$(\mathcal{L}^*u)(z) := -\Delta u(z) - \tau\,\partial_1 u(z) + (\omega\times z)\cdot\nabla u(z) - \omega\times u(z).$$
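as a quick check of the identity $\Omega z = \omega\times z$ stated above (a one-line verification added here for the reader; it follows directly from the definitions): since $\omega = |\omega|\,e_1$,

$$\omega\times z = |\omega|\,e_1\times(z_1,z_2,z_3) = |\omega|\,(0,\,-z_3,\,z_2) = |\omega|\begin{pmatrix}0&0&0\\0&0&-1\\0&1&0\end{pmatrix}\begin{pmatrix}z_1\\z_2\\z_3\end{pmatrix} = \Omega z.$$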
put $N(x) := (4\pi|x|)^{-1}$ for $x\in\mathbb{R}^3\setminus\{0\}$ ("newton potential", the fundamental solution of the poisson equation in $\mathbb{R}^3$), and $O(x) := (4\pi|x|)^{-1}\,e^{-\tau(|x|-x_1)/2}$ for $x\in\mathbb{R}^3\setminus\{0\}$ (fundamental solution of the scalar oseen equation $-\Delta v + \tau\,\partial_1 v = g$ in $\mathbb{R}^3$). put

$$K(z,t) := (4\pi t)^{-3/2}\,e^{-|z|^2/(4t)} \qquad (z\in\mathbb{R}^3,\ t\in(0,\infty)),$$

$$\Lambda(z,t) := \Big( K(z,t)\,\delta_{jk} + \partial_{z_j}\partial_{z_k}\int_{\mathbb{R}^3}(4\pi|z-y|)^{-1}K(y,t)\,dy \Big)_{1\le j,k\le 3} \qquad (z\in\mathbb{R}^3,\ t>0),$$

$$\Gamma(x,y,t) := \Lambda(x-\tau t e_1 - e^{-t\Omega}y,\,t)\cdot e^{-t\Omega}, \qquad \widetilde\Gamma(x,y,t) := \Lambda(x+\tau t e_1 - e^{t\Omega}y,\,t)\cdot e^{t\Omega} \qquad (x,y\in\mathbb{R}^3,\ t>0),$$

$$Z(x,y) := \int_0^\infty \Gamma(x,y,t)\,dt, \qquad \widetilde Z(x,y) := \int_0^\infty \widetilde\Gamma(x,y,t)\,dt \qquad (x,y\in\mathbb{R}^3,\ x\ne y),$$

$$\psi(r) := \int_0^r (1-e^{-t})\,t^{-1}\,dt \quad (r\in\mathbb{R}), \qquad \varphi(x) := (4\pi\tau)^{-1}\,\psi\big(\tau(|x|-x_1)/2\big) \quad (x\in\mathbb{R}^3),$$

$$E_{jk}(x) := (\delta_{jk}\Delta - \partial_j\partial_k)\,\varphi(x), \qquad E_{4k}(x) := x_k\,(4\pi|x|^3)^{-1} \qquad (x\in\mathbb{R}^3\setminus\{0\},\ 1\le j,k\le 3)$$

(the fundamental solution of the oseen system (1.6), with $(E_{jk})_{1\le j,k\le 3}$ the velocity part and $(E_{4k})_{1\le k\le 3}$ the pressure part). for $q\in(1,2)$ and $f\in L^q(\mathbb{R}^3)^3$, put $\mathfrak{R}(f)(x) := \int_{\mathbb{R}^3} Z(x,y)\,f(y)\,dy$ ($x\in\mathbb{R}^3$); see [25, lemma 3.1]. we will use the space

$$D_0^{1,2}(\Omega)^3 := \{ v\in L^6(\Omega)^3\cap H^1_{\mathrm{loc}}(\Omega)^3 :\ \nabla v\in L^2(\Omega)^9,\ v|\partial\Omega = 0 \}$$

equipped with the norm $\|\nabla u\|_2$, where $v|\partial\Omega$ means the trace of $v$ on $\partial\Omega$. for $p\in(1,\infty)$, define $M^p$ as the space of all pairs of functions $(u,\pi)$ such that $u\in W^{2,p}_{\mathrm{loc}}(\Omega)^3$, $\pi\in W^{1,p}_{\mathrm{loc}}(\Omega)$, $u|\Omega_R\in W^{1,p}(\Omega_R)^3$, $\pi|\Omega_R\in L^p(\Omega_R)$, $u|\partial\Omega\in W^{2-1/p,p}(\partial\Omega)^3$, $\operatorname{div}u|\Omega_R\in W^{1,p}(\Omega_R)$, $(\mathcal{L}(u)+\nabla\pi)|\Omega_R\in L^p(\Omega_R)^3$ for some $R\in(0,\infty)$ with $\Omega^c\subset B_R$.

we write $C$ for generic constants. in order to remove possible ambiguities, we sometimes use the notation $C(\gamma_1,\ldots,\gamma_n)$ to indicate that the constant in question depends in particular on $\gamma_1,\ldots,\gamma_n\in(0,\infty)$, for some $n\in\mathbb{N}$; the relevant constant may, however, depend on other parameters as well.

3. decay estimates

in the first part of this section we recall some known results from [25] and [26] about the decay of the velocity part of the solution of the system (1.2). in order to get a full decay characterization of the solution, we then derive the decay of the pressure part of the solution of (1.2). in the second part of this section, we extend the result for the pressure to the non-linear case (1.1).

3.1. decay estimates in the linear case

our starting point is a decay result from [26] for the velocity part $u$ of a solution to (1.2).

theorem 3.1 ([26, theorem 3.12]). suppose that $\Omega^c$ is $C^2$-bounded. let $p\in(1,\infty)$, $(u,\pi)\in M^p$. put $f = \mathcal{L}(u)+\nabla\pi$. suppose there are numbers $s_1, s, \gamma\in(0,\infty)$, $A\in[2,\infty)$, $B\in\mathbb{R}$ such that $s_1<s$, $\Omega^c\cup\operatorname{supp}(\operatorname{div}u)\subset B_{s_1}$, $u|B_s^c\in L^6(B_s^c)^3$, $\nabla u|B_s^c\in L^2(B_s^c)^9$, $A+\min\{1,B\}\ge 3$, and $|f(z)|\le\gamma\,|z|^{-A}\,s(z)^{-B}$ for $z\in B_{s_1}^c$. then

$$|u(y)| \le C\,(|y|\,s(y))^{-1}\,L_{A,B}(y), \tag{3.1}$$
$$|\nabla u(y)| \le C\,(|y|\,s(y))^{-3/2}\,s(y)^{\max(0,\,7/2-A-B)}\,L_{A,B}(y) \tag{3.2}$$

for $y\in B_s^c$, where the function $L_{A,B}$ is given by

$$L_{A,B}(y) := \begin{cases} 1 & \text{if } A+\min\{1,B\} > 3,\\ \max(1,\ln|y|) & \text{if } A+\min\{1,B\} = 3. \end{cases}$$

corollary 3.2. let $p\in(1,\infty)$, $\gamma, s_1, s\in(0,\infty)$ with $\Omega^c\subset B_{s_1}$, $s_1<s$, $A\in[2,\infty)$, $B\in\mathbb{R}$ with $A+\min\{1,B\}\ge 3$. let $f:\Omega\mapsto\mathbb{R}^3$ be measurable with $f|\Omega_{s_1}\in L^p(\Omega_{s_1})^3$ and $|f(z)|\le\gamma\,|z|^{-A}\,s(z)^{-B}$ for $z\in B_{s_1}^c$. let $u\in W^{1,p}_{\mathrm{loc}}(\Omega)^3$ with $u|B_s^c\in L^6(B_s^c)^3$, $\nabla u|B_s^c\in L^2(B_s^c)^9$, $\operatorname{supp}(\operatorname{div}u)\subset B_{s_1}$, and

$$\int_{\overline{D}^c} \big[\nabla u\cdot\nabla\varphi + \big(\tau\,\partial_1 u - (\omega\times z)\cdot\nabla u + \omega\times u - f\big)\cdot\varphi\big]\,dz = 0 \tag{3.3}$$

for $\varphi\in C_0^\infty(\Omega)^3$ with $\operatorname{div}\varphi = 0$. then inequalities (3.1) and (3.2) hold for $y\in B_s^c$. moreover $f\in L^q(\Omega)^3$ for $q\in(1,p]$. if $p\ge 6/5$, the function $f$ may be considered as a bounded linear functional on $D_0^{1,2}(\Omega)^3$, in the usual sense. let $\pi\in L^p_{\mathrm{loc}}(\Omega)$ with

$$\int_{\overline{D}^c} \big[\nabla u\cdot\nabla\varphi + \big(\tau\,\partial_1 u - (\omega\times z)\cdot\nabla u + \omega\times u - f\big)\cdot\varphi - \pi\,\operatorname{div}\varphi\big]\,dz = 0 \tag{3.4}$$

for $\varphi\in C_0^\infty(\Omega)^3$. fix some number $s_0\in(0,s_1)$ with $\overline{D}\cup\operatorname{supp}(\operatorname{div}u)\subset B_{s_0}$.
then the relations $u|B_{s_0}^c\in W^{2,p}_{\mathrm{loc}}(B_{s_0}^c)^3$, $\pi\in W^{1,p}_{\mathrm{loc}}(B_{s_0}^c)$ and $\mathcal{L}(u|B_{s_0}^c)+\nabla\pi = f|B_{s_0}^c$ hold.

the main result of this section, dealing with the $L^\infty$-estimates of the pressure, is stated in

theorem 3.3. let $p, \gamma, s_1, s, A, B, f, u$ be given as in corollary 3.2, but with the stronger assumptions $A = 5/2$, $B\in(1/2,\infty)$ on $A$ and $B$. let $\pi\in L^p_{\mathrm{loc}}(\Omega)$ be such that (3.4) holds. then there is $c_0\in\mathbb{R}$ such that

$$|\pi(x)+c_0| \le C\,|x|^{-2} \quad \text{for } x\in B_s^c. \tag{3.5}$$

corollary 3.4. let $p, \gamma, s_1, s, A, B, f, u$ be given as in corollary 3.2, but with the stronger assumptions $A\ge 5/2$, $A+\min\{1,B\}>3$ on $A$ and $B$. let $\pi\in L^p_{\mathrm{loc}}(\Omega)$ be such that (3.4) holds. then there is $c_0\in\mathbb{R}$ such that inequality (3.5) is valid.

proof: put $B' := A - 5/2 + \min\{1,B\}$. since $A+\min\{1,B\}>3$, we have $B'\in(1/2,\infty)$. moreover, since $A\ge 5/2$, we find for $z\in B_{s_1}^c$ that

$$|f(z)| \le \gamma\,C(s_1,A)\,|z|^{-5/2}\,s(z)^{-A+5/2-B} \le \gamma\,C(s_1,A)\,|z|^{-5/2}\,s(z)^{-B'}.$$

thus the assumptions of theorem 3.3 are satisfied with $B$ replaced by $B'$ and with a modified parameter $\gamma$. this implies the conclusion of theorem 3.3. □

3.2. decay estimates in the non-linear case

let us now consider the non-linear case, i.e. the system (1.1). first, recall the result on the decay properties of the velocity in this non-linear case:

theorem 3.5 ([20, theorem 1.1]). let $\gamma, s_1\in(0,\infty)$, $p_0\in(1,\infty)$, $A\in(2,\infty)$, $B\in[0,3/2]$ with $\Omega^c\subset B_{s_1}$, $A+\min\{B,1\}>3$, $A+B\ge 7/2$. take $f:\mathbb{R}^3\mapsto\mathbb{R}^3$ measurable with $f|B_{s_1}\in L^{p_0}(B_{s_1})^3$ and $|f(y)|\le\gamma\,|y|^{-A}\,s(y)^{-B}$ for $y\in B_{s_1}^c$. let $u\in L^6(\Omega)^3\cap W^{1,1}_{\mathrm{loc}}(\Omega)^3$, $\pi\in L^2_{\mathrm{loc}}(\Omega)$ with $\nabla u\in L^2(\Omega)^9$, $\operatorname{div}u = 0$ and

$$\int_{\overline{D}^c} \big[\nabla u\cdot\nabla\varphi + \big(\tau\,\partial_1 u - (\omega\times z)\cdot\nabla u + \omega\times u + \tau\,(u\cdot\nabla)u - f\big)\cdot\varphi - \pi\,\operatorname{div}\varphi\big]\,dx = 0 \quad \text{for } \varphi\in C_0^\infty(\Omega)^3.$$

let $s\in(s_1,\infty)$. then

$$|\partial^\alpha u(x)| \le C\,(|x|\,s(x))^{-1-|\alpha|/2} \quad \text{for } x\in B_s^c,\ \alpha\in\mathbb{N}_0^3 \text{ with } |\alpha|\le 1. \tag{3.6}$$

now, using theorems 3.3 and 3.5, we are in a position to prove the result on the decay of the pressure in the non-linear case:

theorem 3.6. consider the situation of theorem 3.5. suppose in addition that $A\ge 5/2$. then there is $c_0\in\mathbb{R}$ such that inequality (3.5) holds.

4. leading term

in this section we study the asymptotic behavior of the velocity profile of the system (1.1). let us recall known results from [26] and [24].

theorem 4.1. let $D\subset\mathbb{R}^3$ be open, $p\in(1,\infty)$, $f\in L^p(\mathbb{R}^3)^3$ with $\operatorname{supp}(f)$ compact. let $s_1\in(0,\infty)$ with $\overline{D}\cup\operatorname{supp}(f)\subset B_{s_1}$, $\Omega = \overline{D}^c$. let $u\in L^6(\Omega)^3\cap W^{1,1}_{\mathrm{loc}}(\Omega)^3$, $\pi\in L^2_{\mathrm{loc}}(\Omega)$ with $\nabla u\in L^2(\Omega)^9$, $\operatorname{div}u = 0$ and

$$\int_\Omega \big[\nabla u\cdot\nabla\varphi + \big(\tau\,\partial_1 u + \tau\,(u\cdot\nabla)u - (\omega\times z)\cdot\nabla u + \omega\times u\big)\cdot\varphi - \pi\,\operatorname{div}\varphi\big]\,dz = \int_\Omega f\cdot\varphi\,dz \quad \text{for } \varphi\in C_0^\infty(\Omega)^3. \tag{4.1}$$

(this means the pair $(u,\pi)$ is a leray solution to (1.1), (1.3).) suppose in addition that $\Omega^c$ is $C^2$-bounded, and

$$u|\partial\Omega\in W^{2-1/p,p}(\partial\Omega)^3, \qquad \pi|B_{s_1}\setminus\overline{D}\in L^p(B_{s_1}\setminus\overline{D}). \tag{4.2}$$

let $n$ denote the outward unit normal to $\Omega$, and define

$$\beta_k := \int_\Omega f_k(y)\,dy + \int_{\partial\Omega}\sum_{l=1}^3\big(-\partial_l u_k(y) + \delta_{kl}\,\pi(y) + (\tau e_1-\omega\times y)_l\,u_k(y) - \tau\,(u_l u_k)(y)\big)\,n_l(y)\,do_y \quad \text{for } 1\le k\le 3,$$

$$F_j(x) := \int_\Omega\Big[\sum_{k=1}^3\big(Z_{jk}(x,y)-Z_{jk}(x,0)\big)f_k(y) - \tau\sum_{k,l=1}^3 Z_{jk}(x,y)\,(u_l\,\partial_l u_k)(y)\Big]\,dy$$
$$\quad + \int_{\partial\Omega}\sum_{k=1}^3\Big[\big(Z_{jk}(x,y)-Z_{jk}(x,0)\big)\sum_{l=1}^3\big(-\partial_l u_k(y) + \delta_{kl}\,\pi(y) + (\tau e_1-\omega\times y)_l\,u_k(y)\big)\,n_l(y)$$
$$\qquad + \big(E_{4j}(x-y)-E_{4j}(x)\big)\,u_k(y)\,n_k(y) + \sum_{l=1}^3\big(\partial_{y_l}Z_{jk}(x,y)\,(u_k n_l)(y) + \tau\,Z_{jk}(x,0)\,(u_l u_k n_l)(y)\big)\Big]\,do_y$$

for $x\in B_{s_1}^c$, $1\le j\le 3$. the preceding integrals are absolutely convergent. moreover $F\in C^1(B_{s_1}^c)^3$ and the equation

$$u_j(x) = \sum_{k=1}^3 \beta_k\,Z_{jk}(x,0) + \Big(\int_{\partial\Omega}u\cdot n\,do_x\Big)\,x_j\,(4\pi|x|^3)^{-1} + F_j(x) \tag{4.3}$$

holds.
in addition, for any $s\in(s_1,\infty)$ there is a constant $C>0$, depending on $\tau$, $\omega$, $s_1$, $s$, $f$, $u$ and $\pi$, such that

$$|\partial^\alpha F(x)| \le C\,(|x|\,s(x))^{-3/2-|\alpha|/2}\,\ln(2+|x|) \quad \text{for } x\in B_s^c,\ \alpha\in\mathbb{N}_0^3 \text{ with } |\alpha|\le 1. \tag{4.4}$$

theorem 4.2. let $D$, $p$, $f$, $s_1$, $u$, $\pi$ satisfy the assumptions of theorem 4.1, including (4.2). let $\beta_1,\beta_2,\beta_3$ and $F$ be defined as in theorem 4.1. define the function $G$ by

$$G_j(x) := \sum_{k=2}^3 \beta_k\,Z_{jk}(x,0) + F_j(x) \qquad (x\in B_{s_1}^c,\ 1\le j\le 3). \tag{4.5}$$

then $G\in C^1(B_{s_1}^c)^3$, the equation

$$u_j(x) = \beta_1\,E_{j1}(x) + \Big(\int_{\partial\Omega}u\cdot n\,do_x\Big)\,x_j\,(4\pi|x|^3)^{-1} + G_j(x) \qquad (x\in B_{s_1}^c,\ 1\le j\le 3) \tag{4.6}$$

holds, and for any $s\in(s_1,\infty)$ there is a constant $C>0$, depending on $\tau$, $\omega$, $s_1$, $s$, $f$, $u$ and $\pi$, such that

$$|\partial^\alpha G(x)| \le C\,(|x|\,s(x))^{-3/2-|\alpha|/2}\,\ln(2+|x|) \quad \text{for } x\in B_s^c,\ \alpha\in\mathbb{N}_0^3 \text{ with } |\alpha|\le 1.$$

corollary 4.3. take $D$, $p$, $f$, $s_1$, $u$, $\pi$ as in theorem 4.1, but without requiring (4.2). (this means that $(u,\pi)$ is only assumed to be a leray solution of (1.1), (1.3).) put $\tilde p := \min\{3/2,\,p\}$. then $u\in W^{2,\tilde p}_{\mathrm{loc}}(\Omega)^3$ and $\pi\in W^{1,\tilde p}_{\mathrm{loc}}(\Omega)$. fix some number $s_0\in(0,s_1)$ with $\overline{D}\cup\operatorname{supp}(f)\subset B_{s_0}$, and define $\beta_1,\beta_2,\beta_3$ and $F$ as in theorem 4.1, but with $\overline{D}$ replaced by $B_{s_0}$, and $n(x)$ by $s_0^{-1}x$, for $x\in\partial B_{s_0}$. moreover, define $G$ as in (4.5). then all the conclusions of theorem 4.2 are valid.

5. formulation of the problem with artificial boundary conditions

recall that we defined $\Omega_R = B_R\cap\Omega$. we introduce the subspace $W_R$ of $H^1(\Omega_R)^3$ by setting $W_R := \{v\in H^1(\Omega_R)^3 :\ v|\partial\Omega = 0\}$, where $v|\partial\Omega$ means the trace of $v$ on $\partial\Omega$.

lemma 5.1 ([27, lemma 4.1]). the estimate $\|u\|_2 \le C\,(R\,\|\nabla u\|_2 + R^{1/2}\,\|u|\partial B_R\|_2)$ holds for $R\in(0,\infty)$ with $\Omega^c\subset B_R$ and for $u\in W_R$.

we introduce an inner product $(\cdot,\cdot)^{(R)}$ in $W_R$ by defining

$$(v,w)^{(R)} = \int_{\Omega_R}\nabla v\cdot\nabla w\,dx + \frac{\tau}{2}\int_{\partial B_R} v\cdot w\,do_x \quad \text{for } v,w\in W_R.$$

the space $W_R$ equipped with this inner product is a hilbert space. the norm generated by this scalar product $(\cdot,\cdot)^{(R)}$ is denoted by $|\cdot|^{(R)}$, that is,

$$|v|^{(R)} := \big(\|\nabla v\|_2^2 + (\tau/2)\,\|v|\partial B_R\|_2^2\big)^{1/2} \quad \text{for } v\in W_R.$$

we define the bilinear forms $a_R : H^1(\Omega_R)^3\times H^1(\Omega_R)^3\to\mathbb{R}$ and $b_R : H^1(\Omega_R)^3\times L^2(\Omega_R)\to\mathbb{R}$ by

$$a_R(u,w) := \int_{\Omega_R}\big[\nabla u\cdot\nabla w + \tau\,\partial_1 u\cdot w\big]\,dx + \int_{\Omega_R}\big[-\big((\omega\times x)\cdot\nabla\big)u(x) + \omega\times u(x)\big]\cdot w(x)\,dx + \frac{\tau}{2}\int_{\partial B_R}\big(u(x)\cdot w(x)\big)\Big(1-\frac{x_1}{R}\Big)\,do_x,$$

$$b_R(w,\sigma) := -\int_{\Omega_R}(\operatorname{div}w)\,\sigma\,dx$$

for $u,w\in H^1(\Omega_R)^3$, $\sigma\in L^2(\Omega_R)$, $R\in(0,\infty)$ with $\Omega^c\subset B_R$.

lemma 5.2. let $R\in(0,\infty)$ with $\Omega^c\subset B_R$. then $|a_R(u,w)| \le C(R)\,|u|^{(R)}\,|w|^{(R)}$ for $u,w\in H^1(\Omega_R)^3$.

the key observation in this section is stated in the following lemma, which is the basis of the theory presented here.

lemma 5.3. let $R\in(0,\infty)$ with $\Omega^c\subset B_R$, and let $w\in W_R$. then $(|w|^{(R)})^2 = a_R(w,w)$.

proof: using the definition of $a_R(\cdot,\cdot)$, we get

$$a_R(w,w) = \int_{\Omega_R}\Big[|\nabla w|^2 + \tau\,\partial_1\Big(\frac{|w|^2}{2}\Big) - (\omega\times x)\cdot\nabla\Big(\frac{|w|^2}{2}\Big)\Big]\,dx + \frac{\tau}{2}\int_{\partial B_R}|w(x)|^2\Big(1-\frac{x_1}{R}\Big)\,do_x$$
$$= \int_{\Omega_R}|\nabla w|^2\,dx + \int_{\partial B_R}\Big(\frac{\tau}{2}\,|w(x)|^2\,\frac{x_1}{R} - \frac{1}{2}\,(\omega\times x)\cdot\frac{x}{R}\,|w(x)|^2\Big)\,do_x + \frac{\tau}{2}\int_{\partial B_R}|w(x)|^2\Big(1-\frac{x_1}{R}\Big)\,do_x$$
$$= \int_{\Omega_R}|\nabla w|^2\,dx + \frac{\tau}{2}\int_{\partial B_R}|w(x)|^2\,do_x = (|w|^{(R)})^2.$$

here we applied that $(\omega\times x)\cdot x = 0$ for $x,\omega\in\mathbb{R}^3$. □

as in [28], we obtain that the bilinear form $b_R$ is stable:

theorem 5.4 ([28, corollary 4.3]). let $R>0$ with $\Omega^c\subset B_R$. then

$$\inf_{\rho\in L^2(\Omega_R),\,\rho\ne 0}\ \sup_{v\in W_R,\,v\ne 0}\ \frac{b_R(v,\rho)}{|v|^{(R)}\,\|\rho\|_2} \ \ge\ C(R).$$

we note that functions from $W^{1,1}_{\mathrm{loc}}(\Omega)$ with $L^2$-integrable gradient are $L^2$-integrable on truncated exterior domains:

lemma 5.5 ([29, lemma ii.6.1]). let $w\in W^{1,1}_{\mathrm{loc}}(\Omega)$ with $\nabla w\in L^2(\Omega)^3$, and let $R\in(0,\infty)$ with $\Omega^c\subset B_R$. then $w|\Omega_R\in L^2(\Omega_R)$. in particular the trace of $w$ on $\partial\Omega$ is well defined.
the preceding lemma is implicitly used in the ensuing theorem, where we introduce an extension operator $E : H^{1/2}(\partial\Omega)^3\mapsto W^{1,1}_{\mathrm{loc}}(\Omega)^3$ such that $\operatorname{div}E(b) = 0$.

theorem 5.6 ([29, exercise iii.3.8]). there is an operator $E$ from $H^{1/2}(\partial\Omega)^3$ into $W^{1,1}_{\mathrm{loc}}(\Omega)^3$ satisfying the relations $\nabla E(b)\in L^2(\Omega)^9$, $E(b)|\partial\Omega = b$ and $\operatorname{div}E(b) = 0$ for $b\in H^{1/2}(\partial\Omega)^3$.

in view of lemmas 5.2 and 5.3 and theorems 5.4 and 5.6, the theory of mixed variational problems yields

theorem 5.7. let $s>0$ with $\Omega^c\subset B_s$, $R\in[2s,\infty)$, $f\in L^{6/5}(\Omega_R)^3$, $b\in H^{1/2}(\partial\Omega)^3$. then there is a uniquely determined pair of functions $(\widetilde V, P) = \big(\widetilde V(R,f,b),\,P(R,f,b)\big)\in W_R\times L^2(\Omega_R)$ such that

$$a_R(\widetilde V, g) + b_R(g, P) = \int_{\Omega_R} f\cdot g\,dx - a_R\big(E(b)|\Omega_R,\,g\big) \quad \text{for } g\in W_R, \tag{5.1}$$
$$b_R(\widetilde V, \sigma) = 0 \quad \text{for } \sigma\in L^2(\Omega_R), \tag{5.2}$$

where the operator $E$ was introduced in theorem 5.6.

let us interpret the variational problem (5.1), (5.2) as a boundary value problem. define the expression used in the boundary condition on the artificial boundary $\partial B_R$:

$$L_R(u,\pi)(x) := \Big[\sum_{j=1}^3 \partial_j u_k(x)\,\frac{x_j}{R} - \pi(x)\,\frac{x_k}{R} + \frac{\tau}{2}\Big(1-\frac{x_1}{R}\Big)u_k(x)\Big]_{1\le k\le 3}$$

for $x\in\partial B_R$, $R\in(0,\infty)$ with $\overline{D}\subset B_R$, $u\in W^{2,6/5}(\Omega_R)^3$, $\pi\in W^{1,6/5}(\Omega_R)$.

lemma 5.8. assume that $\Omega^c$ is $C^2$-bounded. let $s\in(0,\infty)$ with $\Omega^c\subset B_s$, $R\in[2s,\infty)$, $f\in L^{6/5}(\Omega_R)^3$ and $b\in W^{7/6,6/5}(\partial\Omega)^3$. put $V := \widetilde V(R,f,b) + E(b)|\Omega_R$, with $\widetilde V(R,f,b)$ from theorem 5.7 and $E(b)$ from theorem 5.6. suppose that $V\in W^{2,6/5}(\Omega_R)^3$ and $P = P(R,f,b)\in W^{1,6/5}(\Omega_R)$, with $P(R,f,b)$ also introduced in theorem 5.7. then

$$-\Delta V(z) + \tau\,\partial_1 V(z) - (\omega\times z)\cdot\nabla V(z) + \omega\times V(z) + \nabla P(z) = f(z), \qquad \operatorname{div}V(z) = 0 \tag{5.3}$$

for $z\in\Omega_R$, and $V|\partial\Omega = b$, $L_R(V,P) = 0$.

theorem 5.9. suppose that $\Omega^c$ is $C^2$-bounded. let $\gamma, s_1\in(0,\infty)$ with $\Omega^c\subset B_{s_1}$, $A\in[5/2,\infty)$, $B\in\mathbb{R}$ with $A+\min\{1,B\}>3$. let $f:\Omega\mapsto\mathbb{R}^3$ be measurable with $f|\Omega_{s_1}\in L^{6/5}(\Omega_{s_1})^3$ and $|f(z)|\le\gamma\,|z|^{-A}\,s(z)^{-B}$ for $z\in B_{s_1}^c$. let $b\in W^{7/6,6/5}(\partial\Omega)^3$ and $u\in W^{1,1}_{\mathrm{loc}}(\Omega)^3\cap L^6(\Omega)^3$ be such that $\nabla u\in L^2(\Omega)^9$, $\operatorname{div}u = 0$, $u|\partial\Omega = b$ and equation (3.3) is satisfied. for $R\in[2s_1,\infty)$, put $V_R := \widetilde V(R,f,b) + E(b)$, with $E(b)$ from theorem 5.6 and $\widetilde V(R,f,b)$ from theorem 5.7. then

$$\big|\,u|\Omega_R - V_R\,\big|^{(R)} \le C\,R^{-1} \quad \text{for } R\in[2s_1,\infty).$$

acknowledgements

the work of s.k. and š.n. was supported by grant no. 19-04243s of gačr in the framework of rvo 67985840; s.k. is also supported by rvo 12000.

references

[1] c. conca, j. san martín h., m. tucsnak. motion of a rigid body in a viscous fluid. comptes rendus de l'académie des sciences, series i, mathematics 328(6):473–478, 1999. doi:10.1016/s0764-4442(99)80193-1.
[2] b. desjardins, m. j. esteban. existence of weak solutions for the motion of rigid bodies in a viscous fluid. archive for rational mechanics and analysis 146:59–71, 1999. doi:10.1007/s002050050136.
[3] m. gunzburger, h.-c. lee, g. seregin. global existence of weak solutions for viscous incompressible flows around a moving rigid body in three dimensions. journal of mathematical fluid mechanics 2:219–266, 2000. doi:10.1007/pl00000954.
[4] k. h. hoffmann, v. n. starovoitov. on a motion of a solid body in a viscous fluid. two dimensional case. advances in mathematical sciences and applications 9:633–648, 1999.
[5] g. p. galdi. handbook of mathematical fluid dynamics, vol. 1, chap. on the motion of a rigid body in a viscous liquid: a mathematical analysis with applications, pp. 653–791. north-holland, amsterdam, 2002.
[6] r. farwig, m. krbec, š. nečasová. a weighted lq-approach to oseen flow around a rotating body. mathematical methods in the applied sciences 31(5):551–574, 2008. doi:10.1002/mma.925.
[7] r. farwig.
an lq-analysis of viscous fluid flow past a rotating obstacle. tôhoku mathematical journal 58:129–147, 2006. doi:10.2748/tmj/1145390210.
[8] r. farwig. estimates of lower order derivatives of viscous fluid flow past a rotating obstacle. regularity and other aspects of the navier-stokes equations 70:73–84, 2005.
[9] r. farwig, t. hishida, d. müller. lq-theory of a singular "winding" integral operator arising from fluid dynamics. pacific journal of mathematics 215:297–313, 2004. doi:10.2140/pjm.2004.215.297.
[10] t. hishida. an existence theorem for the navier-stokes flow in the exterior of a rotating obstacle. archive for rational mechanics and analysis 150:307–348, 1999. doi:10.1007/s002050050190.
[11] t. hishida. the stokes operator with rotation effect in exterior domains. analysis 19(1):51–68, 1999. doi:10.1524/anly.1999.19.1.51.
[12] t. hishida. lq estimates of weak solutions to the stationary stokes equations around a rotating body. journal of the mathematical society of japan 58:743–767, 2006.
[13] t. takahashi. analysis of strong solutions for the equations modeling the motion of a rigid-fluid system in a bounded domain. advances in differential equations 8(12):1499–1532, 2003.
[14] t. takahashi, m. tucsnak. global strong solutions for the two-dimensional motion of an infinite cylinder in a viscous fluid. journal of mathematical fluid mechanics 6:53–77, 2004. doi:10.1007/s00021-003-0083-4.
[15] p. cumsille, t. takahashi. well posedness for the system modelling the motion of a rigid body of arbitrary form in an incompressible viscous fluid. czechoslovak mathematical journal 58:961–992, 2008. doi:10.1007/s10587-008-0063-2.
[16] r. farwig, r. guenther, e. thomann, š. nečasová. the fundamental solution of linearized nonstationary navier-stokes equations of motion around a rotating and translating body. discrete and continuous dynamical systems 34:511–529, 2014. doi:10.3934/dcds.2014.34.511.
[17] m. geissert, t. hansel. a non-autonomous model problem for the oseen-navier-stokes flow with rotating effects. journal of the mathematical society of japan 63, 2010. doi:10.2969/jmsj/06331027.
[18] g. p. galdi, m. kyed. steady-state navier-stokes flows past a rotating body: leray solutions are physically reasonable. archive for rational mechanics and analysis 200:21–58, 2011. doi:10.1007/s00205-010-0350-6.
[19] r. farwig, g. galdi, m. kyed. asymptotic structure of a leray solution to the navier-stokes flow around a rotating body. pacific journal of mathematics 253(2):367–382, 2011. doi:10.2140/pjm.2011.253.367.
[20] p. deuring, s. kračmar, š. nečasová. pointwise decay of stationary rotational viscous incompressible flows with nonzero velocity at infinity. journal of differential equations 255(7):1576–1606, 2013. doi:10.1016/j.jde.2013.05.016.
[21] m. kyed. on the asymptotic structure of a navier-stokes flow past a rotating body. journal of the mathematical society of japan 66(1):1–16, 2014. doi:10.2969/jmsj/06610001.
[22] g. p. galdi. an introduction to the mathematical theory of the navier-stokes equations, vol. i: linearised steady problems. springer, new york, 1998.
[23] p. deuring, s. kračmar, š. nečasová. asymptotic structure of viscous incompressible flow around a rotating body, with nonvanishing flow field at infinity.
zeitschrift für angewandte mathematik und physik 68, 2016. doi:10.1007/s00033-016-0760-x.
[24] p. deuring, s. kračmar, š. nečasová. a leading term for the velocity of stationary viscous incompressible flow around a rigid body performing a rotation and a translation. discrete and continuous dynamical systems 37:1389–1409, 2016. doi:10.3934/dcds.2017057.
[25] p. deuring, s. kračmar, š. nečasová. on pointwise decay of linearized stationary incompressible viscous flow around rotating and translating bodies. siam journal on mathematical analysis 43:705–738, 2011. doi:10.1137/100786198.
[26] p. deuring, s. kračmar, š. nečasová. linearized stationary incompressible flow around rotating and translating bodies – leray solutions. discrete and continuous dynamical systems series s 7:967–979, 2014. doi:10.3934/dcdss.2014.7.967.
[27] p. deuring. finite element methods for the stokes system in three-dimensional exterior domains. mathematical methods in the applied sciences 20(3):245–269, 1997. doi:10.1002/(sici)1099-1476(199702)20:3<245::aid-mma856>3.0.co;2-f.
[28] p. deuring, s. kračmar. artificial boundary conditions for the oseen system in 3d exterior domains. analysis 20(1):65–90, 2000. doi:10.1524/anly.2000.20.1.65.
[29] g. p. galdi. an introduction to the mathematical theory of the navier-stokes equations. springer, new york, 2nd edn., 2011.

intelligent data storage and retrieval for design optimisation – an overview

c. peebles, c. bil, l. drack

this paper documents the findings of a literature review conducted by the sir lawrence wackett centre for aerospace design technology at rmit university. the review investigates aspects of a proposed system for intelligent design optimisation. such a system would be capable of efficiently storing (and compressing if required) a range of types of design data into an intelligent database. this database would be accessed by the system during subsequent design processes, allowing for the search of relevant design data for re-use in later designs, so that the system becomes increasingly efficient at reducing the time for later designs as the database grows in size. extensive research has been performed, on both the theoretical aspects of the project and practical examples of current similar systems. this research covers the areas of database systems, database queries, representation and compression of design data, geometric representation and heuristic methods for design applications.

keywords: database architectures, design optimisation, configuration control, data access and retrieval.

1 introduction

engineering design processes must efficiently incorporate analytical components of varying complexity in multiple disciplines. the computational cost of slow and/or many objective function evaluations can be prohibitive in achieving a meaningful design. furthermore, the financial cost of lengthy design processes is also a real consideration. a system is proposed to enable time-efficient multidisciplinary design optimisation through the use of a design and state database. the same scheme offers similar benefits generally in simulations that depend on computationally efficient models.

the financial and computational cost of design optimisation can be addressed in two ways: the first is to reduce the computational expense of objective function evaluations, the second is to reduce the number of these evaluations required.
the time required for objective function evaluations can be reduced either by selecting analytical components of lesser complexity, which, generally speaking, results in a trade-off between computational speed and accuracy, or by representing these components with a heuristic, empirical or stochastic system. the number of evaluations can be reduced by reusing the designs and states of previous design processes.

a system is proposed that stores designs and states during design optimisation. this would enable the creation of a database that could be accessed in future design optimisations, which, given an efficient search algorithm, would prevent recalculation of data already contained in the database. this situation would improve as the size of the database grew. furthermore, representative systems could be created from the data (for example neural networks), which could then be used for time-efficient system representation in the optimisation process. the significant issues for such a system involve the mechanisms for efficiently storing and retrieving designs and states, data compression, efficient search algorithms and system representations.

a further extension of such a system would be, through the use of heuristics, the ability to switch between various classes of models, such as low-fidelity tabular data, neural networks, analytic or computational models of various degrees of fidelity, and empirical data. this switching would depend on the availability of data in the desired states, confidence, the computational effort required for retrieval, and other comparative measures such as the accuracy required. formulating efficient and robust heuristics for these algorithms would be an important contribution. end-users of the optimisation and simulations based on the intelligent database would not be expected to be overly concerned with data sources whilst the system is in operation. the intelligent database system is then somewhat like an engineering application of data warehousing.

2 database generation and use

"database management systems are now used in almost every computing environment to organise, create and maintain important collections of information" [1]. database systems are also proving to be a critical method for storing many kinds of data for a multitude of purposes. databases have evolved from their beginnings in relational, network and hierarchical data models to such modern and complex data types as object-oriented, spatial and temporal data, allowing for vastly expanded roles and capabilities. each of these data models is particularly suited to a range of data types, creating specialisation within these databases. as data models have evolved, specific models have gained acceptance for specific tasks, such as "object-oriented" for engineering design, "spatial" for medical imaging and geographic studies, and "temporal" for large-scale time-dependent data.

the usefulness of a database is dependent on its applicability to store the data supplied, which in turn depends on the data model of the database. shortcomings can occur, however, when it is unsuitable or infeasible to express data in a manner that allows for storage in a dbms (database management system). one such shortcoming has been noted in the case of commercial dbms. this is due to the fact that commercial dbms (cdbms) are primarily designed for business applications [2], and therefore some of the important features required in design applications are not available in cdbms.
some of the deficiencies of cdbms in design applications have been documented in refs. [2, 3], the primary deficiency being that the data models inherent to the majority of cdbms (relational) are unsuitable for accurately portraying complex design information.

2.1 database data models

it is widely believed [4–10] that early database models are unsuitable for design database purposes, as they fail to sufficiently model certain design problems, or are incompatible with incorporated design software. there are, however, a number of beliefs about the best way to improve this situation. many traditional database systems, in line with commercial dbms, used the relational data model, as mentioned in the previous section. this model was chosen for the simplicity and logical nature of its sets and relationships, the separation of logical from physical parameters (aiding casual users), and its simple and precise semantics (reducing programmer burden). the main argument against the relational data model, however, is its "flatness" of structure, whereby it loses valuable information contained in the relationships of the data. it therefore lacks the expressiveness and semantic richness for which semantic data models are preferred [11].

it is recognised by many [2, 5, 6, 10–14] that a major requirement of design databases is the ability to accurately and naturally model the data as seen by the designer. due to the inherent complexity of design information, as well as the composite nature of design objects (components built of lesser components), the object-oriented data model, or data models built on its foundations [5, 10, 12, 13, 15–22], are believed to be suitable for design databases. the object-oriented data model allows information to be expressed as objects, containing properties and methods (actions to be performed); these objects can be composed of other objects, or combined to create larger objects. as such, object-oriented data models allow engineering constructs to be expressed in a logical manner by the designer. "it is natural for designers to think in terms of the object being designed, the components (i.e. objects) that go into the design, and tools (i.e. operations) for manipulating these objects. a system that directly supports the mapping from the user's mental model to the objects and operations supported by the system will enable design engineers to interact with the system in familiar terms" [12].
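as an illustration of this composite, object-oriented view of design data, the following minimal sketch (our own example, not taken from any of the cited systems; the class and attribute names are hypothetical) models a design object built from lesser objects, with a derived quantity computed over the composite:

import dataclasses
from dataclasses import dataclass, field

@dataclass
class DesignObject:
    # a design object carries its own properties and may be composed
    # of lesser design objects (the composite pattern)
    name: str
    mass_kg: float = 0.0
    children: list = field(default_factory=list)

    def add(self, child: "DesignObject") -> None:
        self.children.append(child)

    def total_mass(self) -> float:
        # a "method" in the object-oriented sense: derived data computed
        # by recursing over the composite structure
        return self.mass_kg + sum(c.total_mass() for c in self.children)

wing = DesignObject("wing", mass_kg=350.0)
wing.add(DesignObject("spar", mass_kg=80.0))
wing.add(DesignObject("rib set", mass_kg=45.0))
print(wing.total_mass())   # 475.0

the point of the pattern is that the designer manipulates exactly the constructs the database stores: components contain components, and operations such as the mass roll-up traverse that structure directly.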
for a detailed survey of object-oriented database technologies, examples of existing systems and their applicability for design/cad tasks, and projections for the development of o-o technologies, the reader is referred to [22].

the very nature of the design process also defines a fundamental requirement of a design database system. "design is an iterative process which begins with a general description of the design object and after repeated and possibly alternative refinements terminates when a complete and correct refinement has been reached. a ddbms must allow template definitions to be refined, must provide facilities to organise the refinements of the templates, and should support the semantics of refinements and alternatives." [10]

for cases where an efficient design-centered database is desired, but a relational or similar early-data-model database already exists, there are methods to translate from the early data model to the more advanced and efficient data model for design. ahad [23] presents such a method for the translation of existing relational databases, in the form of an object management system (oms), which acts to translate the modelling concepts not inherent in the early data model into an object-oriented framework.

much of the above discussion has covered the inadequacies of the relational data model and the improvements achievable through hybridisation or implementation of different models; however, it has been noted [24] that, among the many desired improvements on the relational model for a number of different database applications:

• there is a large collection of constructs, each relevant to one or more application-specific environments; and
• the union of these constructs is impossibly complicated to understand and probably infeasible to implement with finite resources.

as such, "it appears inappropriate to look for a single universal data model which will support all nontraditional applications. in short, what the cad community wants is different from what the semantic modelling community wants which is different from what the expert database community wants, etc. consequently, such users should build application-specific data models containing the constructs needed for their own environment … the thrust of a next-generation database system should be to provide a support system that will efficiently simulate these constructs" [24].

2.2 storage and very large databases

in the majority of database applications, relatively small data records are handled, so there are no problems in i/o accessing of the data, and no time costs for storage and retrieval. this does become a problem, however, when the size of records increases to the point where the memory of the system is no longer sufficient to perform these operations in a single pass [25]. this is becoming more prevalent with the advent of more complex database data models, particularly in applications for medical imaging, geographics, and cad/cam, to name a few. one method to combat this is to compress the database index and contents, thereby reducing the necessary space in memory for the data; this method is discussed in detail in section 5. for a database containing large external files, ramakrishna and larson [26] present a composite perfect hashing scheme. perfect hashing is an efficient and popular technique for organizing internal tables and external files, "perfect" meaning that the method does not result in overflows. this scheme guarantees record retrieval in a single disk access, and can be used in any application that can afford to store the header table in internal memory.
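the single-disk-access property can be conveyed by a toy sketch (our own simplified illustration of the general idea, not the composite perfect hashing algorithm of [26]; the names and file layout are hypothetical): a header table small enough to keep in internal memory maps each key directly to the location of its record in the external file, so retrieval touches the file exactly once.

import json
import os
import tempfile

class SingleAccessStore:
    # an in-memory header table maps each key directly to the byte
    # offset of its record in an external file, so a lookup costs
    # exactly one seek + read of the external storage
    def __init__(self, path):
        self.path = path
        self.header = {}            # key -> (offset, length), kept in RAM

    def write_all(self, records):   # records: dict key -> dict payload
        with open(self.path, "wb") as f:
            for key, payload in records.items():
                blob = json.dumps(payload).encode()
                self.header[key] = (f.tell(), len(blob))
                f.write(blob)

    def get(self, key):
        offset, length = self.header[key]      # in-memory header lookup
        with open(self.path, "rb") as f:
            f.seek(offset)                     # the single external access
            return json.loads(f.read(length))

path = os.path.join(tempfile.gettempdir(), "designs.dat")
store = SingleAccessStore(path)
store.write_all({"wing-v1": {"span_m": 34.1}, "wing-v2": {"span_m": 35.8}})
print(store.get("wing-v2"))        # {'span_m': 35.8}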
2.3 concurrent access to databases

a key aspect of the design process lies in the fact that it is a continual process commonly undertaken by a team, as opposed to a single designer. this means that there will arise times when multiple users need access to data within a design database, and if several users are working concurrently, allowances have to be made for accurate revision histories and relevant updating of the design record.

roller et al. [27] present a cooperative transaction model for shared engineering databases, which provides a higher degree of concurrency and process parallelism in cad. as opposed to traditional transaction models, where intra-transaction results are isolated, the presented model allows the exchanging and sharing of data and supports the integration of subresults into a common solution. the realization of the transaction system is based on concepts of active and object-oriented database systems.

3 database search methods

an important measure of the efficiency of a database system lies not only in the efficiency of data storage, but also in the efficiency of database search and retrieval. database queries allow the user to input given parameters, allowing for a search of the data within the database for similar or matching parameters. for efficient queries, the goal is to find user-specified data from an often very large database efficiently and with an acceptable accuracy. while query processing and optimisation can be seen as an important part of a database system, and a parameter for judging overall system efficiency, it is still but one such parameter among many others, and, as we have seen previously, the weighting of these parameters will differ based on the requirements of any given system. as such, while efficient querying may be vital for a particular system that demands accurate and very fast returns for user queries, there will be other systems for which other parameters, such as efficient storage and compression of very large data files, are paramount, and hence less efficient querying can be acceptable in this light.

there is a range of extensive surveys of the literature on database querying and query optimisation [28–35]. these surveys cover many aspects of querying for different data models, different query methods, and optimisation techniques for a range of scenarios.

3.1 query processing vs. data models

it was discussed in the previous section that there is a wide range of data models available in database systems, each holding advantages and disadvantages for particular applications. due to different modelling structures, levels of complexity, and indexing, these different data models will have consequences for the operation of query processes. a particular case of this arises in distributed, or particularly federated, database systems, which can be composed of numerous smaller databases, each using different data models. in such a case, the query could potentially need to be written and executed separately for each such system.
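the burden this places on the user can be sketched concretely (our own minimal illustration of query translation across data models, not the parallel machine of [36]; all names are hypothetical): the same abstract query must be rendered once per back-end, which is exactly the translation work an automated system removes.

import operator
from types import SimpleNamespace

# one abstract query, translated for two back-ends with different data
# models: a relational store (SQL text) and an object store (a predicate
# applied to objects)
def to_sql(table, conditions):
    # conditions: list of (field, op, value) triples
    where = " and ".join(f"{f} {op} {v!r}" for f, op, v in conditions)
    return f"select * from {table} where {where}"

def to_object_filter(conditions):
    ops = {"=": operator.eq, "<": operator.lt, ">": operator.gt}
    def matches(obj):
        return all(ops[op](getattr(obj, f), v) for f, op, v in conditions)
    return matches

query = [("material", "=", "al-2024"), ("thickness_mm", "<", 3.0)]
print(to_sql("panels", query))
# select * from panels where material = 'al-2024' and thickness_mm < 3.0
panel = SimpleNamespace(material="al-2024", thickness_mm=2.4)
print(to_object_filter(query)(panel))   # True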
in response to this, owrang o. et al. [36] have developed a parallel database machine capable of query translation between different data models. the end user would only need to be proficient in the data model of the local dbms. the system is capable of translating simultaneous queries in parallel to some extent, by processing independent subparts of the translation in parallel. in addition, it can easily be expanded to incorporate other data models into the distributed database system.

3d geometries present a specific problem in query searches, based on the difficulty of formulating efficient indexes within the database by which to govern the search. several papers [32–35, 37–39] have surveyed the problem of 3d geometric search and offered solutions to this dilemma. in his 1999 study [39], keim gives an example of an implementation for searching databases for 3d geometries. his proposed solution for an efficient similarity search is based on a new geometric structure. keim states from his study of the field that "it is widely recognised that 3d similarity search is a difficult problem – by far more difficult than 2d similarity search" [39]. the most widely used techniques for accessing databases of complex objects are feature-based approaches [40, 41], which are mainly used as a simple filter to restrict the search space. the main contribution of keim's paper is a new geometry-based index structure that generalises the well-known r-tree approach for an efficient volume-based similarity search on 3d volume objects. this solution is based on the general concept of using both progressive and conservative approximations. these approximations are used to define a minimum and maximum volume difference measure, which allows an efficient pruning of the search space. while this implementation has been developed with medical applications in mind, the author recognises its applicability to 3d geometries in cad and design applications, and states that it is generally applicable to a wide range of other applications.

in a later paper (2003), funkhouser et al. [33] describe a web-based search engine for 3d geometries, supporting queries based on 3d sketches, 2d sketches, 3d models and/or text keywords. this paper presents a new matching algorithm for shape-based queries, which provides a stated 46–245 % better performance (using five different algorithms for comparison) than related shape-matching methods, and is fast enough to return query results from a repository of 20,000 models in under a second. in this study, which is principally aimed towards a web-based application, the authors describe novel methods for searching 3d databases using orientation-invariant spherical harmonic descriptors. this addresses one of the critical areas in 3d database search, namely providing an efficient indexing method for 3d geometries through efficient geometric representation. such a shape descriptor should be:

• quick to compute;
• concise to store;
• easy to index;
• invariant under similarity transformations;
• insensitive to noise and other small extra features;
• independent of 3d model representation, tessellation or genus;
• robust to arbitrary topological degeneracies; and
• discriminating of shape differences at many scales.

unfortunately, no existing shape descriptor (at the time of that writing, in 2003) had all of these properties. the authors therefore propose their novel shape descriptor based on spherical harmonics. the main idea is to decompose a 3d model into a collection of functions defined on concentric spheres and to use spherical harmonics to discard orientation information for each one. this yields a shape descriptor that is both orientation-invariant and descriptive.
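the flavour of such rotation-invariant descriptors can be conveyed with a much simpler stand-in (our own illustration of the general feature-vector idea, not the spherical-harmonic descriptor of [33]): a histogram of point distances from the centroid, which no rotation can change.

import numpy as np

# toy rotation-invariant shape descriptor: a normalised histogram of
# point distances from the centroid; rotating the model changes no
# distance, so the descriptor is unchanged, and two shapes are compared
# by a simple distance between their descriptors
def descriptor(points, bins=16):
    pts = np.asarray(points, dtype=float)
    centred = pts - pts.mean(axis=0)               # translation invariance
    r = np.linalg.norm(centred, axis=1)
    r = r / r.max()                                # scale invariance
    hist, _ = np.histogram(r, bins=bins, range=(0.0, 1.0))
    return hist / hist.sum()

def dissimilarity(d1, d2):
    return float(np.abs(d1 - d2).sum())            # L1 distance

rng = np.random.default_rng(0)
model = rng.normal(size=(500, 3))
q, _ = np.linalg.qr(rng.normal(size=(3, 3)))       # a random rotation matrix
rotated = model @ q.T
print(dissimilarity(descriptor(model), descriptor(rotated)))  # ~0.0

such a descriptor is quick to compute and concise to store, but far less discriminating than spherical harmonics; it is meant only to make the indexing idea tangible.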
the spherical-harmonics approach yields a significant advantage, in that models can be indexed without registration in a canonical coordinate system. this matters, for example, in the case of similar models, where even minimal dissimilarities can cause a large misalignment of the principal axes, resulting in poor alignments and poor match scores for algorithms that rely upon them.

[35] presents an extensive survey of a number of different methods for performing similarity search in metric spaces, with the main focus being on distance-based indexing methods. it introduces a framework for performing searches based on distances, and presents algorithms for common types of queries. it surveys common query and search algorithms, highlighting examples in the area of spatial data. methods discussed include ball partitioning, hyperplane partitioning, m-tree, sa-tree and distance matrix methods.

3.2 fuzzy search methods

fuzzy logic can be an important inclusion in search algorithms for design applications. this is because, especially in the preliminary stages of the design, where much is unknown, or in cases where the design cannot yet be accurately described in a query, there is the possibility for many applicable but not exact matches to be excluded from the query results. this scenario is well described in [42]. an algorithm is presented whose purpose is to allow the search of a non-homogeneous database containing engineering design information and other data relevant to the engineering design process. the purpose of this work is to develop a quick access mechanism to heterogeneous, complex knowledge. the algorithm is based upon a new type of fuzzy query (i.e. iterative, ranked retrieval) from an exclusive, partitioned, spatial database. a prototype system which implements a small database and this retrieval from a plop hashing database is described, and an example is presented which demonstrates its application. this work attempts to support design processes by providing quick access to relevant engineering knowledge stored in a database. an engineer may not have an exact definition of the information being sought: the problem may be in the early, formative stage, and the information contained in the database is stored in terms of how it was used, rather than in terms of how the engineer now intends to use it. the algorithm described above is developed to help solve this problem. a fuzzy query takes into account tradeoffs among the facets of interest that form the database index. conventional database systems ignore such tradeoffs, and thus may miss potential items.

relaxed queries [43] are similar in aspect to fuzzy queries. relaxed queries use fuzzy logic to introduce a "grey area" in the query space, in order to reduce the number of potentially missed results from a query operation. these fuzzy techniques are especially applicable in cases such as this, where "a query submitted to the system is usually domain knowledge related and a user often fails to appropriately formulate his/her problem to obtain all relevant sequences and relevant data." [43] it can especially be seen in such cases, when working not just with database information but with knowledge either contained within the database or derived from it, that the way queries are formulated can have a great impact on the efficiency of queries in terms of the number of successful results generated. special care must be taken in these cases to ensure that the representation of domain knowledge does not limit the data obtained from a database.
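the contrast between a conventional boolean filter and a fuzzy, ranked retrieval can be made concrete with a small sketch (our own illustration of the general idea, not the algorithms of [42] or [43]; the weights, scales and scoring rule are hypothetical):

# toy fuzzy query: instead of a hard boolean filter, score every record
# by weighted closeness to the query on each facet and return a ranking,
# so near-misses that a conventional query would discard are retained
def fuzzy_rank(records, query, weights, scales):
    def score(rec):
        s = 0.0
        for facet, target in query.items():
            # closeness in [0, 1]: 1 at an exact match, decaying with
            # the normalised distance on that facet
            dist = abs(rec[facet] - target) / scales[facet]
            s += weights[facet] * max(0.0, 1.0 - dist)
        return s
    return sorted(records, key=score, reverse=True)

beams = [
    {"id": "b1", "depth_mm": 200, "max_deflection_mm": 4.0},
    {"id": "b2", "depth_mm": 240, "max_deflection_mm": 2.8},
    {"id": "b3", "depth_mm": 180, "max_deflection_mm": 5.5},
]
query   = {"depth_mm": 220, "max_deflection_mm": 3.0}
weights = {"depth_mm": 1.0, "max_deflection_mm": 2.0}   # tradeoff between facets
scales  = {"depth_mm": 100.0, "max_deflection_mm": 5.0}
print([b["id"] for b in fuzzy_rank(beams, query, weights, scales)])
# ['b2', 'b1', 'b3'] -- no candidate is discarded outright

where a conventional query with hard limits would silently drop near-misses, every record here receives a score reflecting the tradeoffs among the facets of interest, and the engineer inspects a ranking.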
4 selective creation and reuse of design space data

the solving of any design optimisation problem requires the creation of a multidimensional design space (or search/solution space). this design space is a representation of the different variables being studied in the design, and describes the relationships between these variables and the imposed constraints on the system. along with a merit or objective function, the optimal solution within this search space can be found, dependent on the user's requirements. there are many issues to be covered in this pivotal field in design optimisation tools. in light of the current study, this section will concentrate on the definition and optimisation of the design space, and on search methods that can be used to determine an optimal solution.

there is a range of numerical methods and algorithms available to determine solutions for design spaces, depending on the nature of the variables (discrete/continuous), the relationships between the constraints, etc. these methods range in computational cost, and the definition of the design space boundaries has a marked effect on the overall cost of a solution, or even on whether any feasible (much less optimal) designs are contained within the design space.

4.1 design space creation

the creation of a design solution space is a process of representing the limitations or constraints of a design in reference to the design variables. as such, these constraints will then form boundaries that determine whether a design is feasible or infeasible. the design process then continues in a search for a solution that satisfies these constraints, and, if a merit or objective function is defined that determines the "quality" of a design, the design space is searched for a valid design that yields the best quality. although the constraints used during design include heuristics, tables, guidelines, and computer simulations, a majority of those used can be expressed as mathematical constraints. however, it is not enough to define constraints only as equalities, because a majority of the constraints define limitations on a design, rather than stipulating an exact relationship between variables; for example, a possible constraint may be that a beam cannot deflect more than a certain amount [44].

methods have been investigated [45, 46] for the efficient representation of complex design systems. while not concentrating on an automated design system, [46] documents a method for generating a visual description of the design space, for ease of user understanding and navigation. this system is able to take a more active role in the generation of a computable design description than traditional computer-aided design systems; however, the design process itself may remain largely under user control.
[45] presents a method of visualising the design space for user legibility, allowing the entire design space to be shown visually on one single design chart. these charts also allow visualisation of performance changes as design variables are altered. these nested charts allow for plotting of multi-variable design spaces, without complex issues regarding visualising 42 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 45 no. 4/2005 czech technical university in prague relationships between charts for each variable. both of these methods are aimed primarily at team-oriented design projects, rather than computational approaches to design solutions. another area of interest is that of decomposition of complex design spaces. an example of this is a multidimensional design space; in such an example the numerical method used for search of the design space will determine its efficiency in such a space. as such, there may be cases where it is highly advantageous to decompose the design space into numerous simpler spaces, which can be solved simultaneously, thus creating a more efficient solution method. liu and tseng [47] propose a set of algorithms for space-decomposition minimisation (sdm), which decomposes a solution space into a series of sub-problems. these sub-problems, if uncoupled, can be solved independently; otherwise one of the algorithms allows for solution of coupled solution spaces. these algorithms also potentially allow the design space to be broken down into one-dimensional solution spaces, allowing for simple 1-d solution algorithms. also, given the 1-d design spaces, the sdm algorithms can be used as a direct search method, allowing for minimum solutions to be found directly, for example without requiring gradient information for the constraint and objective functions. these algorithms can also theoretically be run in parallel environments; however this case has not been implemented and tested at the time of writing (1999). 4.2 design space optimisation traditionally, the design solution space has been defined by constraints that limit the design, defining its quality for the intended function. these may be performance requirements, materials limitations, or operational limitations. often these constraints are user-specified, and in the desire to maximise the possible return of an optimal solution, it is allowable to assume that the designer will integrate a measure of conservatism in his/her estimation of these constraints, or more likely the range of design variables under study. this has the understandable effect of increasing the solution space, but it may then contain a large number of infeasible solutions, which the computer may have to evaluate, resulting in a higher computational cost. on the other hand, in cases where the computational cost of solution is known to be high, the solution space may be defined as too small, hence removing the optimal or even any feasible solutions from the design space. in light of these considerations, a number of studies [48–50] have been conducted investigating automated processes for efficiently determining the optimal size of the design space, in order to maximise the efficiency in obtaining a solution. yao and johnson [50] present an implementation of a domain propagation algorithm to be used to identify suitable bounds for design variables. 
constraint-based design relies on the designer's experience to select the bounds of design variables, or on conservative estimation, resulting in a larger design space. as such, methods like these are useful, as they do not rely as much on the experience of the user in determining an appropriate boundary for the solution space, since the computer can automatically generate a set of suitable bounds. the domain propagation program outlined in [50] also makes existing constraint-driven design tools more reliable by increasing the chance of finding a feasible solution. in general, the use of the domain propagation program leads to an overall speed-up of the constraint-driven design process.

5 compression of design space data

with the increasing complexity of design data, and the more widespread use of complex data types such as volumetric, spatial and high-resolution imaging data, data compression is becoming very important for the adequate storage of large volumes of data. compression makes use of the redundancy inherent in data files, allowing data to be recoded and stored in a smaller volume. this allows for more efficient use of storage hardware, and of networking and data transmission facilities, through the conservation of transmission bandwidth and the reduction of transmission time. while compression is becoming an increasingly invaluable tool for the storage of large amounts of data, additional complications arise when these compressed files, often compressed using differing methods, are planned for inclusion in a database that requires search and retrieval of data records while they are still in compressed form.

5.1 data compression methods

research has shown [51–54] that there is a vast number of methods for compressing data. this is due in part to the large number of forms that the data can take (binary, text, image, spatial, etc.), and to the intended use of the data, which determines the extent to which compression can take place. in some cases, where a high level of compression is paramount and a certain loss of clarity in the data is acceptable, so-called "lossy" compression methods can be used. there are many applications, however, where this may be unacceptable (medical imaging being an example), and the data must be able to be perfectly reproduced from its compressed form. the wide range of compression algorithms, and the range of their applications, means that these algorithms can become highly specialised, which becomes a problem in large-scale projects where some degree of commonality is sought.
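the lossless requirement simply means that decompression must reproduce the stored record bit for bit; using python's standard zlib module, the round-trip looks as follows (a trivial demonstration added here for concreteness):

import zlib

# lossless round-trip: the decompressed bytes must equal the original
# exactly -- the property demanded for e.g. medical or analysis data
record = b"node,x,y,z\n" + b"\n".join(
    b"%d,%.6f,%.6f,%.6f" % (i, i * 0.1, i * 0.2, i * 0.3) for i in range(1000)
)
packed = zlib.compress(record, level=9)
assert zlib.decompress(packed) == record          # perfect reconstruction
print(len(record), "->", len(packed), "bytes")    # redundancy exploited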
5 compression of design space data
with the increasing complexity of design data, and the more widespread use of complex data types such as volumetric, spatial and high-resolution imaging data, compression is becoming very important for the adequate storage of large volumes of data. compression makes use of the redundancy inherent in data files, allowing data to be truncated and stored in a smaller volume. this allows more efficient use of storage hardware, and of networking and data transmission facilities, by conserving transmission bandwidth and reducing transmission time. while compression is becoming an increasingly invaluable tool for the storage of large amounts of data, additional complications arise when these files, often compressed using differing methods, are to be included in a database that requires search and retrieval of data records while still in compressed form.

5.1 data compression methods
research has shown [51–54] that there is a vast number of methods for compressing data. this is due in part to the large number of forms that the data can take (binary, text, image, spatial, etc.), and to the intended use of the data, which determines the extent to which compression can take place. in some cases, where a high level of compression is paramount and a certain loss of clarity in the data is acceptable, so-called 'lossy' compression methods can be used. there are many applications, however, where this is unacceptable (medical imaging being an example), and the data must be perfectly reproducible from the compressed form. the wide range of compression algorithms, and the range of their applications, means that these algorithms can become highly specialised, which becomes a problem in large-scale projects where some degree of commonality is sought. "researchers continue to strive to develop their own algorithms that maximise compression rates in the least amount of time. however, the all-encompassing algorithm does not exist, and probably never will, since the measurement criteria are both data and application dependent." [55] given the large range of data models available for discussion, we cover those that particularly relate to the intended study, and touch on a number of other cases only very briefly. for interest's sake, the reader is referred to the specific compression areas of string matching [53], compression of volumetric data [54], and adaptive arithmetic coding [51].

5.2 image compression
much work is being done in the area of image compression, particularly in the field of medical imagery. this field has specific requirements, including largely (or in some cases totally) lossless compression, often of large individual images or of sets of high-resolution images. the need for highly efficient compression is highlighted by the large amount of data being produced; at the ghent university hospital in belgium, for example, 10 gigabytes of medical image data are produced each week [56]. owing to the commonality of imagery data, and the high volume of research available in this field, we briefly cover the main aspects of image compression. a range of studies has been performed on image compression techniques, from simple comparisons of techniques [56–58] to more advanced specialised studies. one example is particularly applicable to medical imaging, but also to other forms of imaging, such as geographic or satellite imaging, where large image files are produced but only certain areas are of key significance. reference [59] documents a method in which the image as a whole is compressed lossily, while specific regions are compressed losslessly. this allows very efficient compression without losing image quality in the designated areas of importance. a similar method [60] can be used where large sets of images are produced, for example in ct or mri imaging. it compresses sets of images between which similarity exists, allowing the redundancy of the set to be reduced and therefore a higher lossless compression to be achieved; this is useful for fields such as medical and satellite imagery. the centroid method proposed there extracts the similar regions (similarity patterns) to enable this higher compression to be realised. the method could realistically be implemented for any case where there are sets of images highly common in content, for example the storage of competitive designs of an aircraft fuselage: the images share common aspects (e.g. fuselage dimensions and configuration), while different images depict possibilities for the internal layout, allowing comparisons between potential design choices.
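a minimal sketch of this set-redundancy idea follows: one reference image (here the pixel-wise mean, standing in for the centroid of [60]) is stored once, and each member of the set is stored as a losslessly compressed residual. numpy and zlib stand in for a production coder, and the test images are synthetic.

```python
# store one reference image plus losslessly compressed residuals for
# each member of a set of similar images; reconstruction is exact.
import numpy as np
import zlib

def compress_set(images):
    """images: list of equal-shape uint8 arrays with similar content."""
    centroid = np.mean(images, axis=0).astype(np.int16)   # shared reference
    blobs = [zlib.compress((img.astype(np.int16) - centroid).tobytes())
             for img in images]
    return centroid, blobs

def decompress(centroid, blob, shape):
    residual = np.frombuffer(zlib.decompress(blob), dtype=np.int16)
    return (residual.reshape(shape) + centroid).astype(np.uint8)

imgs = [np.full((64, 64), 100, np.uint8) for _ in range(4)]
imgs[1][10:20, 10:20] = 180            # a small region differs
centroid, blobs = compress_set(imgs)
assert np.array_equal(decompress(centroid, blobs[1], (64, 64)), imgs[1])
print(sum(len(b) for b in blobs), "bytes for 4 images of", imgs[0].nbytes, "bytes each")
```

because the images are nearly identical, the residuals are mostly zeros and compress far better than the raw images would individually.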
5.3 volumetric and elevation data
volumetric data can take many forms, from geometric or particle-density modelling to such design-based examples as cfd solutions, in the form of streamlines or pressure distributions within a control volume. volumetric data take up an immense amount of storage space, and are quoted as a prime example of the need for efficient compression algorithms in [54]. in many cases it is also imperative that lossless compression algorithms are utilised, so that the data are stored efficiently where possible but, above all, without any loss in accuracy. a study was performed [54] to develop an efficient method of compressing volumetric data. the method determined uses optimal linear prediction to exploit correlations in all three dimensions, yielding typical compression of around 50 %. these results were achieved using mri images, ct images and electron-density map data as test data; it is believed, however, that similar results can be obtained for other forms of volumetric data.

5.4 database search of compressed data
many compression techniques are utilised to compress a range of different data types and formats (a comprehensive review of these techniques can be found in [61]). the compression process has the potential to remove much of the schema, or legibility, of the file as the redundant data are removed. while this is of no consequence for storage of the data, it becomes significant when the file must be manipulated or searched in its compressed form, for example in database query processing. searching the file in compressed form is more efficient than expending computational cost and time to decompress the file before accessing the data records; the extreme example is a very large database consisting of large numbers of sizeable records, each of which would otherwise need to be decompressed before every query. "compressing information in a database system is attractive for two major reasons: storage saving and performance improvement. storage saving is a direct and obvious benefit, whereas performance improvement derives from the fact that less physical data need be moved for any particular operation on the database." [62] the primary concern in compressed database querying is that a common compression scheme is utilised for all the data domains in the database, thereby allowing more efficient querying of the data while compressed. graefe and shapiro have shown in their study [63] that many query-processing applications can manipulate compressed data just as well as uncompressed data, and, moreover, that processing compressed data can speed up query processing by a factor much larger than the compression factor of the data. in query processing, compression can be exploited far beyond simply improving i/o performance. exact-match comparisons can be performed on compressed data, as long as the compression scheme is consistent, because all matching values are encoded in the same manner; in the same fashion, projection and duplicate elimination can be performed on compressed data. the main requirement for these query operations is that a common compression method is used, so that comparisons can be made on the compressed data in a similar manner to uncompressed data, without much change to the implemented algorithms [63]. this paper also documents a performance analysis showing that, for data sets larger than the system's memory, performance gains larger than the compression factor can be obtained, because larger amounts of the data can be held in the workspace allocated to the query operator. this is highly applicable to databases containing large amounts of complex data.
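the following sketch illustrates this point with a dictionary-compressed column: once all values share one encoding, exact-match selection and duplicate elimination operate on small integer codes without any decompression. the class and data are invented for illustration and are not drawn from [63].

```python
# exact-match query and duplicate elimination directly on compressed
# codes, relying only on a consistent (dictionary) compression scheme.
class DictionaryColumn:
    def __init__(self, values):
        self.codebook = {}                  # value -> small integer code
        self.codes = [self.codebook.setdefault(v, len(self.codebook))
                      for v in values]

    def select_eq(self, value):
        """row ids matching `value`, compared in compressed form."""
        code = self.codebook.get(value)
        if code is None:
            return []
        return [i for i, c in enumerate(self.codes) if c == code]

    def distinct(self):
        return len(self.codebook)           # duplicates already eliminated

col = DictionaryColumn(["steel", "alloy", "steel", "composite", "steel"])
print(col.select_eq("steel"))               # [0, 2, 4], no decompression
print(col.distinct())                       # 3
```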
this section highlights two major factors for the implementation of a large-volume database of design data. the end intent for the proposed system is a very large database, comprising a large number of records, each potentially of a large file size. efficient access methods are therefore required not only for access and query within the database, but also for access to the files themselves during the query. this calls for a robust filing system, an efficient method for retrieving records from storage, for example disk striping over multiple physical drives, and compressed data coupled with a suitable query engine able to manipulate the data in its compressed form.

6 efficient representation of large data sets, designs and states
this section primarily discusses the different and evolving methods for representing complex geometries, with particular concentration on design applications. large data sets have already been discussed, particularly in the sections describing the database technologies capable of handling them. this section outlines the methods available for accurate yet efficient geometric representation of simple and complex geometries, covering a range of approaches to the problem.

6.1 general shape-representation methods and cad tools
various methods of defining and parameterising shapes for use in design optimisation processes can be found in the literature. an early method for shape parameterisation is the nodal coordinate approach, which uses the coordinates of the nodes of the discrete fem model as design variables [64]. in practical design optimisation situations this unfortunately leads to a large number of design variables, and therefore to a highly inefficient optimisation. several methods have been formulated to overcome these initial drawbacks, including the mesh parameterisation approach [65], the use of solid modelling [66], and the natural design variable method [67]. although these approaches are relatively easy to implement, for realistic design optimisation problems the very large number of design variables associated with them often makes them too costly, and thus makes design optimisation prohibitive.

another popular approach for defining and parameterising shape for use in design optimisation is the so-called spline approach, in which the shape is represented by means of a series of polynomial functions. two common methods of spline-based surface representation are bézier and b-splines [68]. typically, the surface to be represented is broken into a mesh of primarily rectangular curvilinear regions. a surface patch is then defined over each region, whose shape is determined by a set of control points; the shape parameters in this formulation are the coordinates of each control point [69]. representing the shape of a practical object thus requires a number of surface patches, often involving a large number of shape parameters. furthermore, for design optimisation involving complex geometry, the spline-based approach makes it difficult to maintain smooth transitions between adjacent surface patches.

examples have been found in the literature of proposed systems for aiding the user in cad processes.
qin, wright and jordanov [70] present the development of a sketch-based cad system interface for assisting designers during the conceptual design stages. the system captures the designer's intention and interprets the input sketch into geometrically more exact 2d objects and, further, into 3d models. it also allows designers to specify a 3d object or a scene quickly, naturally and accurately. the use of fuzzy knowledge in such a system is particularly useful in the conceptual design stages, where there is still a large proportion of uncertainty in the design; in these stages many designers prefer to work on paper, using rough sketches to develop the design. to support this early stage of geometric design, and to improve the speed, effectiveness and quality of the design decision, the authors' studies indicate that a computer-aided conceptual design system must allow sketched input and must have a variety of interfaces, recognising features and managing constraints.

as a tool, cad systems have been widely used to reduce design time and cost, and to improve design efficiency and quality. however, there are still some problems with current cad systems. first, present cad systems primarily support drafting and detailed design; they have little, if any, support for early-stage design, although early design is critically important in the development of new products. second, the process of making a 3d design with present cad systems is often lengthy and tedious. one reason for this is that, with a 2d interface, users have to decompose 3d design tasks into 2d or 1d modelling operations and have to input detailed and complete specifications of a design. third, the visualisation capability of present cad systems is limited, which may not always satisfy the requirements of design analysis. the integration of vr techniques allows some of these shortcomings to be addressed.

6.2 spline-based methods
spline-based methods are a very efficient way of representing complex curves and surfaces using a small number of control points which define one or more polynomial functions. common methods of spline-based curve and surface representation are bézier curves, b-splines and nurbs (non-uniform rational b-splines). these methods not only allow more data-efficient storage of geometric objects, but also allow geometries to be created when only a limited amount of data is available. while these methods have the advantage of requiring only a small number of control points for definition, the points must be concentrated in areas of complex geometry, for example where there are rapid changes in surface curvature, inflections, etc. in fact, nurbs have been recognised in much of the literature as a prevalent shape-definition method: "the representation of curves and surfaces in nurbs form is now an accepted industry standard; hence, it is of practical interest to have nurbs descriptions of all curves and surfaces occurring in the design and manufacturing process" [71]. there are a number of practical examples available outlining the use of spline methods in geometric representation and design applications [72–76]. in their paper, pottmann and farin [71] describe the use of nurbs for surface definition in sheet-metal and plate-metal based applications. spline methods have also been shown to be useful in the generation of object meshes and grids, as outlined in [77–79]; for a survey of the literature covering various procedures to automate the transition between the modelling and analysis phases for design, the reader is referred to [80].
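as a concrete illustration of control points acting as the complete set of shape parameters, the following sketch evaluates a bézier curve with de casteljau's algorithm; the control polygon is invented for the example.

```python
# de casteljau's algorithm: repeated linear interpolation between the
# control points fully determines every point on the curve.
def de_casteljau(control_points, t):
    """evaluate the bezier curve defined by control_points at t in [0, 1]."""
    pts = [tuple(p) for p in control_points]
    while len(pts) > 1:
        pts = [tuple((1 - t) * a + t * b for a, b in zip(p, q))
               for p, q in zip(pts, pts[1:])]
    return pts[0]

# a cubic segment: four control points define the whole curve
ctrl = [(0, 0), (1, 2), (3, 3), (4, 0)]
curve = [de_casteljau(ctrl, i / 20) for i in range(21)]
print(curve[0], curve[10], curve[-1])   # endpoints interpolate the first and last control points
```

a surface patch is the tensor-product analogue: the same interpolation is applied in two parameter directions over a grid of control points.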
mastin [79] shows that solid modelling techniques using 3-dimensional bézier functions can be used to generate grids in a simple one-step procedure. these grids are loosely formed around a collection of defined points and vertices; for example, simply the vertices of a cube can be used to define a sphere, with the grid then being defined on the sphere's surface using these vertices. many different types of configurations and edge and face treatments can be included in the model, so the complexity of the geometric model is limited only by the amount of input that is desired. this solid modelling technique is designed for free-form solids, where a general shape or design is to be modelled, rather than for constructing a solid with precisely defined edges or surfaces.

yu and soni [77] present another approach for surface grid generation, using nurbs and enhanced algorithms to transform iges entities into nurbs definitions. this application is much more accessible to engineers and designers because it builds on the initial graphics exchange specification (iges) file format, which is already in common usage in fem, cfd and cad software. designers can therefore create the geometry in an existing application, working efficiently with familiar software; the iges file can then be converted into a nurbs or b-spline representation, ready for grid generation for subsequent analysis. the authors note that nurbs is becoming the de facto standard for geometry description in most modern grid-generation applications, with many tools available for nurbs definition.

many of the previously mentioned applications are primarily user-based, rather than automated procedures. this is because free-form shape design is typically accomplished in an interactive manner, with computer-generated shapes rarely being immediately acceptable, making this part of the design process far from being rightfully called computer-aided. often the user has to manipulate a large number of variables (control points) in order to produce the desired geometric properties. keyser et al. propose a method for efficient and exact manipulation of algebraic points and curves [81]. hohenberger and reuding [82] propose a method for the manipulation of b-splines in an automatic optimisation scenario, through the use of weights at the control points. in many cad systems the use of weights for nurbs representations is inadequately supported; the weights are often hidden from the user and therefore remain unused. in this application, the perturbation of the weights at the control points is formulated as an optimisation problem, the objective being to produce a curve with a more gradual change in curvature and the smallest deviation from its initial shape. applications in automotive shape design are presented and discussed by the authors as practical demonstrations of the method.
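the role of the weights that [82] perturbs can be seen in a rational bézier curve, the fixed-knot special case of a nurbs curve: raising the weight of a control point pulls the curve towards that point without moving any coordinates. a minimal sketch follows, with invented control points.

```python
# a rational bezier curve: each basis function is scaled by its control
# point's weight and the result renormalised, so a heavier weight pulls
# the curve towards that control point.
from math import comb

def rational_bezier(ctrl, weights, t):
    n = len(ctrl) - 1
    basis = [comb(n, i) * (1 - t) ** (n - i) * t ** i for i in range(n + 1)]
    denom = sum(w * b for w, b in zip(weights, basis))
    return tuple(sum(w * b * p[k] for w, b, p in zip(weights, basis, ctrl)) / denom
                 for k in range(len(ctrl[0])))

ctrl = [(0, 0), (1, 2), (2, 0)]
print(rational_bezier(ctrl, [1, 1, 1], 0.5))   # plain bezier: (1.0, 1.0)
print(rational_bezier(ctrl, [1, 4, 1], 0.5))   # heavier middle weight: (1.0, 1.6)
```

an optimiser in the spirit of [82] would treat the weights as its design variables, leaving the control-point coordinates, and hence the designer's layout, untouched.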
6.3 partial differential equation (pde) formulations
the pde method defines a shape in terms of a number of surface patches that collectively describe the object's surface. the shape of the surface is defined through boundary conditions and a small set of design parameters, the boundary conditions being specified in terms of curves in 3d-space. it is these features of the method that can be utilised for interactive design. although the pde method has certain features in common with more established techniques for surface design (b-splines, bézier curves and nurbs), what distinguishes it from these conventional techniques is the global smoothing approach associated with its elliptic boundary-value formulation. unlike the conventional, spline-based techniques for surface representation, the pde method can parameterise complex surfaces in terms of a small set of design variables instead of many hundreds of control points. "the principle strength of the pde method lies in the ease with which we can quickly change the geometry using a small number of global design parameters. once a design has been determined, it would be a relatively straightforward procedure to derive the more commonly used design parameters (target points, upsweep angles, etc.) from the geometry. [this allows designers to obtain] useful information from simple assumptions; this is again advantageous when analysis of many models is needed." [83]

practical examples of the use of the pde formulation in geometric modelling and design have been found in the literature [83–87]. in [83], dekanski et al. present a practical implementation for the creation and testing of a geometric design, incorporating simplified geometrical design, cfd modelling and analysis of the gas-exchange cycles in a 2-stroke engine, and optimisation of design parameters to maximise engine scavenging (the removal of combustion products during each cycle). bloor and wilson [87] present a method for the creation of wing geometries based on a series of 2d wing aerofoil sections. this builds on their earlier work on the application of elliptic pdes to the parameterisation of generic aircraft geometries [88]. the method has been shown to be capable of producing pde surface patches of the wing geometries of high-speed civil transport (hsct) aircraft, allowing the designer then to progress with cfd analyses. in this method the 2d sectional data are interpolated between sections via a variable smoothing parameter, in order to control the pde solution. this can be viewed as a lofting method; unlike conventional cad lofting techniques, however, the pde method does not use spline techniques for the loft. it allows radical changes to be made to a design very quickly and cheaply, with a minimum of user intervention, since its surface definition remains valid, i.e. closed, throughout any changes in the design variables. the method can also be used in this example to join the wings smoothly to the aircraft fuselage [88, 89].

the pde formulation has been shown to be an advantageous parameterisation method for the efficient creation and manipulation of complex objects. the low number of design variables inherent to the method, compared with conventional spline-based methods, makes it very efficient to integrate an optimiser and perform shape optimisation.
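the flavour of the method can be seen in the following sketch, which generates a height patch from its four boundary curves by relaxing an elliptic equation over the parameter domain. note the simplification: a second-order laplace equation solved by jacobi iteration stands in for the fourth-order bloor-wilson equation, and the boundary data are invented.

```python
# generate a smooth interior from boundary curves alone: the elliptic
# equation propagates the boundary information globally, so the patch is
# controlled by the curves rather than by interior control points.
import numpy as np

def pde_patch(top, bottom, left, right, iters=2000):
    """each argument is a 1-d array of n boundary heights."""
    n = len(top)
    z = np.zeros((n, n))
    z[0, :], z[-1, :], z[:, 0], z[:, -1] = top, bottom, left, right
    for _ in range(iters):                  # jacobi relaxation of laplace's equation
        z[1:-1, 1:-1] = 0.25 * (z[:-2, 1:-1] + z[2:, 1:-1]
                                + z[1:-1, :-2] + z[1:-1, 2:])
    return z

n = 21
u = np.linspace(0, 1, n)
patch = pde_patch(np.sin(np.pi * u), np.zeros(n), np.zeros(n), np.zeros(n))
print(patch[n // 2, n // 2].round(4))       # interior height blends the boundaries smoothly
```

changing a single boundary curve regenerates the whole patch, which mirrors the method's advantage of cheap, radical design changes from few parameters.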
6.4 expanding past geometric representations
the previous sections have outlined various methods for describing the geometric definition of an object. methods have also been developed that allow representations to expand past this basic functionality, including such information as internal material structure [90, 91], design histories [92], and other capabilities arising from the representation of geometric objects in object-oriented terms [92, 93]. "heterogeneous solid modelling" is an expansion of the solid modelling concept, in which a 'heterogeneous object' is defined that can have a different material composition within the object. this concept is a new area in cad, and still in its infancy. in their proposal of the concept, siu and tan [90] present a representation scheme for heterogeneous objects whereby material information can be integrated as a part of the object representation. kumar et al. [91] present a similar approach, which can include not only material properties but also the grading between different materials, for composite-material applications such as aircraft engine turbine blades, which are a complex blending of metals and ceramics requiring a very accurate grading and geometry definition. these capabilities are called for by recent advances in materials creation, where the analysis software must keep pace with the materials advances.

7 heuristics for switching between classes of models
this section reviews heuristic and stochastic processes that can be utilised by the knowledge-based processes of the proposed system. it is envisaged that a number of potential forms of data will be available for the design at any stage, each having differing levels of accuracy and availability or applicability. these forms of data could range from cfd solutions to empirical tables and formulae. at any point during the design process a number of these options may be open, and it is important to determine which will be the most efficient in terms of the computation and accuracy required. a range of information and practical examples has been found in the literature for intelligent design systems [94–126].

7.1 knowledge base
an expert system is a computer program that has an extensive knowledge base in a specific domain and uses an inference mechanism to draw conclusions in the same manner as a human expert in that domain. since they are not subject to human frailties such as boredom, forgetfulness, bias or tunnel vision, fully developed expert systems frequently out-perform their human models. the development of an expert system consists of three major phases: knowledge acquisition, internal design of the knowledge base, and expert system validation. among these activities, the design of the knowledge base is the most crucial from the performance viewpoint of the expert system [127], and there are a number of methods available to specify the reasoning of knowledge-based systems [128, 129].

a knowledge base usually consists of conceptual, procedural and declarative knowledge. conceptual knowledge is concerned with the underlying ideas, theories, concepts, hypotheses and relationships that exist within a domain. declarative knowledge consists of the truths of a domain, and includes facts, terminology and classifications. procedural knowledge refers to knowledge used to direct pathways of thought or action, leading to solutions to the problems that the system is trying to solve; it is largely concerned with the manipulation of declarative or conceptual knowledge, and consists of rules-of-thumb and their control, weights of evidence, working procedures and strategies.
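a minimal sketch of this split is shown below: declarative knowledge as a set of facts, procedural knowledge as rules, and a forward-chaining loop as the inference mechanism. the facts and rules are invented placeholders, far simpler than a real engineering knowledge base.

```python
# forward-chaining inference: fire every rule whose premises are already
# known, and repeat until no new conclusion can be added.
facts = {"material=aluminium", "load=high"}          # declarative knowledge
rules = [                                            # procedural knowledge
    ({"material=aluminium"}, "density=low"),
    ({"load=high", "density=low"}, "check=buckling"),
]

changed = True
while changed:                                       # run to a fixed point
    changed = False
    for premises, conclusion in rules:
        if premises <= facts and conclusion not in facts:
            facts.add(conclusion)
            changed = True
print(sorted(facts))
```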
it has been noted that the major problem with building a knowledge base to solve aspects of a design problem is finding the most effective representation for the knowledge [130]. for fully developed expert systems these knowledge bases can be substantial in size; dev and murthy [127] present an approach to the problem of knowledge-base partitioning in the context of rule-based expert systems. the advantages of these systems lie in the inference mechanisms used to determine the best approach for a given design domain, thereby removing the bias, favouritism and familiarity of human designers. this not only ensures that an appropriate search or optimisation method will be chosen, it also reduces the reliance on designer experience, which can limit the potential for an optimal design. such systems will also make full use of the data generated by designers during design-evaluate-redesign studies, which are otherwise often discarded.

7.2 heuristic processes utilising previous design data
the design process can be made more efficient if it is able to recognise previous relevant design data; this concept is fundamental to this study. gantovnik, gurdal and watson [131] propose a method for augmenting genetic algorithms to include memory for discrete and continuous variables, the term 'memory' implying preserved data from previously analysed designs. in the standard ga approach, a new population may contain designs that have already been encountered in previous generations, especially near the end of the optimisation process. the memory procedure eliminates the need to re-analyse these design candidates, thereby saving computational time. after a new generation of designs is created by the genetic operations, the binary tree used for memory storage is searched for each new design. if the design is found, its fitness value is retrieved from the binary tree without conducting an analysis; otherwise the fitness is obtained from an exact analysis, and the new design and its data are inserted in the tree as a new node. this approach serves well for design systems based wholly on discrete variables. for cases that include a continuous variable a spline interpolation method is utilised: the main idea is to construct approximations of the fitness function as a function of the continuous variable, using a spline fitted to the historical data, and to interpolate from the stored data whenever possible.
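the discrete-variable case can be sketched in a few lines: cache the fitness of every design already analysed and run the costly analysis only for genuinely new candidates. a python dict stands in for the binary tree of [131], and the fitness function is an invented placeholder for an expensive analysis.

```python
# memoised fitness evaluation for a ga over discrete variables: repeated
# designs are looked up instead of re-analysed.
import random

memory = {}
evaluations = 0

def fitness(design):                 # design: tuple of discrete variables
    global evaluations
    if design in memory:
        return memory[design]        # retrieved without a new analysis
    evaluations += 1                 # stands in for one costly exact analysis
    value = sum(d * d for d in design)
    memory[design] = value
    return value

random.seed(1)
population = [tuple(random.randint(0, 3) for _ in range(4)) for _ in range(200)]
scores = [fitness(p) for p in population]
print(f"{len(population)} candidates, only {evaluations} exact analyses")
```

because the discrete design space here has at most 256 points, most of the 200 candidates are repeats and cost nothing, which is exactly the effect the memory procedure exploits late in an optimisation run.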
7.3 approximation methods for the design process
a common engineering practice is the use of approximation models [132, 133] in place of expensive computer simulations to drive a multidisciplinary design process based on non-linear programming techniques. the use of approximation strategies is designed to reduce the number of detailed, costly computer simulations required during optimisation, while maintaining the pertinent features of the design problem. after each sequence of approximate optimisation, the approximations of the system behaviour are updated with new information about the current design. thus, many iterations of such algorithms may be required before convergence of the optimisation process is achieved, and every additional iteration adds to the cost of the process. in light of this, a primary concern in developing an approximate optimisation strategy is the proper choice of a move-limit management strategy.

two main alternatives have been investigated in the multidisciplinary design optimisation community for approximating physical systems. the first is the use of a simplified physical representation of the system to obtain less costly simulations, as described in [134]. the second, which has grown in interest in recent years, is the response surface approximation (rsa), based on polynomial and interpolation models. polynomial rsas employ the statistical techniques of regression analysis and analysis of variance (anova) to determine the approximate function.

rodriguez et al. [135] overview the current state of the art in model-management strategies for approximate optimisation. model-management strategies coordinate the interaction between the optimisation and the fidelity of the approximation models in order to ensure that the process converges to a solution of the original design problem. approximations play an important role in multidisciplinary design optimisation (mdo) by offering system-behaviour information at a relatively low cost. most approximate mdo strategies are sequential: an optimisation of an approximate problem, subject to design-variable move limits, is repeated iteratively until convergence. the move limits, or trust region, are imposed to restrict the optimisation to regions of the design space in which the approximations provide meaningful information.

as computers advance in speed, more efficient data-sharing and exchange algorithms are developed, and an increasing number of discipline sets are encompassed in actual engineering optimisation processes. problem complexity is observed to grow at a pace that taxes the limits of the advances in processing power. the dimensionality and complexity of mdo problems may therefore always necessitate the use of approximations and decomposition strategies to make the optimisation a practical task.
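the interplay of a local response surface with move-limit management can be sketched as follows. the quadratic fit, the acceptance ratio and the expansion and shrink factors are invented for illustration and do not reproduce the strategies surveyed in [135].

```python
# sequential approximate optimisation with a trust region: fit a local
# quadratic rsa, minimise it within the move limits, and adjust the
# limits according to how well the surrogate predicted the improvement.
import numpy as np

def true_f(x):                        # stands in for a costly simulation
    return np.cos(x) + 0.1 * x * x

x, radius = 3.0, 1.0
for _ in range(15):
    xs = np.linspace(x - radius, x + radius, 5)
    coeffs = np.polyfit(xs, [true_f(v) for v in xs], 2)   # local quadratic rsa
    grid = np.linspace(x - radius, x + radius, 201)
    x_new = grid[np.argmin(np.polyval(coeffs, grid))]     # optimise the rsa only
    actual = true_f(x) - true_f(x_new)
    predicted = np.polyval(coeffs, x) - np.polyval(coeffs, x_new)
    if predicted > 0 and actual / predicted > 0.5:
        x, radius = x_new, radius * 1.5   # good agreement: accept and expand
    else:
        radius *= 0.5                     # poor agreement: shrink the move limits
print(round(x, 3), round(true_f(x), 4))   # converges near the local minimum
```

only five true-function calls are spent per iteration on fitting the surrogate; the inner minimisation runs entirely on the cheap approximation, which is the cost saving the section describes.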
7.4 heuristics for resource selection
the proposed system would include inference algorithms able to choose from a selection of data types and approximation techniques. reference [136] documents a more practically minded implementation of the desired heuristics: where the proposed system would implement these heuristics to determine the best calculations to perform, this implementation determines the best processes to be performed in a construction project. "such resource-assignment and optimisation problems demand efficient combinational computations if all the possible options are to be considered, and decision-making facilitated." consequently, research into efficient methods of resource optimisation has long been an area of study in its own right. previous research on resource optimisation has investigated the use of deterministic models in construction decision-making, while other works have investigated the use of stochastic models [137–139]. ugwu and tah [136] present an investigation into the application of genetic algorithms (gas) to this multidimensional problem. the objective is to investigate the use of gas for both numerical function optimisation and a combinatorial search problem within the framework of a decision-support system (dss). a hybrid ga system was designed for construction-resource selection, and a genetic model that represents the problem and solution space was built into the system (methods for accurately simulating complex non-mathematical processes for solution in an optimisation process can be found in [140, 141]). a genetic state-space search (gsss) technique for multimodal functions was used to evaluate the cost profiles that resulted from different combinations of tasks and resources. the study indicates that ga systems have huge potential applications as dss components in construction-resource assignment. the results also highlighted that gas exhibit the chaotic characteristics often observed in other complex non-linear dynamic systems. the power of their use in applications derives from their ability to combine numerical parameter optimisation with combinatorial searches within an application domain. gas are therefore uniquely suitable for solving multidimensional optimisation problems such as this. the algorithm discussed in this paper also demonstrates the use of a hybrid ga integrated with a project database to perform combinatorial optimisation. this improves the robustness of the ga because the services it provides (functional and combinatorial optimisation) are independent of the data on which it acts in performing those services. this distinct feature means that the imposition of genetic operators such as reproduction, crossover and mutation does not result in an arbitrary loss of information, since the knowledge about the problem domain is stored in the project database. this integration of the ga with the project database allows for a wide range of applications in real-time or real-life situations [142].

while the above approaches have primarily been concerned with practical applications, and with the allocation of physical resources as opposed to calculations of varying complexity, it is possible that a similar approach can be used for the proposed project. parameters can be defined within the ga structure relating to the cost and performance characteristics of different search methods and calculation techniques, allowing genetic algorithms to determine the optimal approach in an optimisation scenario. this approach could also be integrated with previously discussed methods, such as memory integration for genetic algorithms, or approximation techniques.

8 conclusions and recommendations
an extensive literature review has been conducted by the sir lawrence wackett centre for aerospace design technology, rmit university. the review investigates aspects of a proposed system for intelligent design optimisation. such a system would be capable of efficiently storing (and compressing, if required) a range of types of design data in an intelligent database. this database would be accessed by the system during subsequent design processes, allowing relevant design data to be searched for re-use in later designs; the system thus becomes very efficient in reducing the time for later designs as the database grows in size.
extensive research has been performed, covering both theoretical aspects of the project and practical examples of current similar systems. database systems have been reviewed and discussed. aspects of databases such as the dbms and the data model have been discussed, along with the general consensus in the literature that object-oriented technologies offer great advantages for design applications. aspects of database design, integration with existing software and expert systems, as well as storage issues for very large databases, have also been addressed. database query methods have been reviewed and compared for the different data models available. query optimisation techniques have been illustrated, and fuzzy or knowledge-based enhancement methods have also been noted.

a critical component of the proposed system is the efficient and accurate representation of design-space data. design-space creation, optimisation and search methods have been discussed, along with methods for compressing the various types of data that could be encountered in design, including images, volumetric data and basic geometric data, to name a few. particular aspects of database searches performed on such compressed data have also been addressed; it has been noted that searches can be performed on data in compressed form provided that certain information, such as the compression scheme utilised, is known, and this has a great impact on search efficiency within the database. numerous methods have been discussed for the efficient representation of geometric objects, covering basic techniques, spline-based methods, solid modelling, and pde formulations. it has been found that the pde (partial differential equation) formulation shows the best efficiency for accurate portrayal of complex 3d geometries, owing to its reliance on a low number of design variables, compared with spline methods, which can rely on hundreds of control points at a time.

these sections have discussed all the important functional aspects of the proposed system. the final section addresses the intelligent algorithms required to differentiate between different design methods and model classes, a number of which may be available at any time but which can vary in efficiency, cost and accuracy. a range of methods has been discussed for these inference mechanisms, including memory processes for genetic algorithms, approximation methods, and resource-allocation systems. this document provides an accurate and timely review of the literature pertinent to the proposed application. during the extent of this project the proposed system was kept in general terms, and it is envisaged that more accurate or targeted research can be attempted when the proposition becomes more defined.

9 acknowledgments
the authors would like to acknowledge the defence science and technology organisation (dsto) for their support for this project.

references
[1] silberschatz, a., stonebraker, m., ullman, j.: "database systems: achievements and opportunities." communications of the acm, vol. 34 (1991), no. 10, p. 110–120.
[2] sidle, t. w.: "weaknesses of commercial database management systems in engineering applications." in: proceedings of the 17th design automation conference on design automation. 1980, minneapolis, minnesota, usa: acm press, ny, usa.
[3] eastman, c. m.: "system facilities for cad databases." in: proceedings of the 17th design automation conference on design automation. 1980, minneapolis, minnesota, usa: acm press, ny, usa.
[4] bennet, j.: "a database management system for design engineers." in: proceedings of the 19th design automation conference. 1982, ieee press, nj, usa.
[5] lacroix, m., pirotte, a.: "data structures for cad object description." in: proceedings of the 18th design automation conference on design automation. 1981, nashville, tennessee, usa: ieee press, nj, usa.
[6] haynie, m.: "a dbms for large design automation databases." in: 1988 acm sigmod international conference on management of data. 1988, chicago, illinois, usa: acm press, ny, usa.
[7] schutzman, h. b.: "ichabod: a data base manager for design automation applications." in: proceedings of the 22nd acm/ieee conference on design automation. 1985, las vegas, nevada, usa: acm press, ny, usa.
[8] haynie, m. n.: "the relational/network hybrid data model for design automation databases." in: 18th design automation conference on design automation. 1981, nashville, tennessee, usa: ieee press, nj, usa.
[9] hardwick, m.: "extending the relational database data model for design applications." in: 21st proceedings of the design automation conference on design automation. 1984, albuquerque, new mexico, usa: ieee press, nj, usa.
[10] ketabchi, m. a., berzins, v., march, s.: "odm: an object oriented data model for design databases." in: 1986 acm 14th annual conference on computer science. 1986, cincinnati, ohio, usa: acm press, new york.
[11] navathe, s. b.: "evolution of data modeling for databases." communications of the acm, vol. 35 (1992), no. 9, p. 112–123.
[12] heiler, s. et al.: "an object-oriented approach to data management: why design databases need it." in: 24th acm/ieee conference proceedings on design automation conference. 1987, miami beach, fl, usa: acm press, ny, usa.
[13] agrawal, r., gehani, n. h.: "ode (object database and environment): the language and the data model." in: proceedings of the 1989 acm sigmod international conference on management of data. 1989, portland, or, usa: acm press, ny, usa.
[14] haynie, m. h.: "tutorial: the relational data model for design automation." in: 12th design automation conference on design automation. 1983, miami beach, florida, usa: ieee press, nj, usa.
[15] maryanski, f. et al.: "the data model compiler: a tool for generating object-oriented database systems." in: the 1986 international workshop on object-oriented database systems. 1986, pacific grove, ca, usa: ieee computer society press, la, usa.
[16] bedell, j., maryanski, f.: "semantic data modeling support for cad." in: proceedings of the 1987 fall joint computer conference on explorer technology: today and tomorrow. 1987, dallas, tx, usa: ieee computer society press, la, usa.
[17] loomis, m. e. s., chaudhri, a. b.: object databases in practice. 1998, london: prentice-hall international. xviii, 312.
[18] lee, k.-h., lee, d., han, s.-h.: "object-oriented approach to a knowledge-based structural design system." expert systems with applications, vol. 10 (1996), no. 2, p. 223–231.
[19] spooner, d. l.: "an object-oriented data management system for mechanical cad." in: 1986 international workshop on object-oriented database systems. 1986, pacific grove, ca, usa: ieee computer society press, los alamitos, ca, usa.
[20] ketabchi, m. a.: "object-oriented data models and management of cad databases." in: proceedings of the 1986 international workshop on object-oriented database systems.
1986, pacific grove, california, usa: ieee computer society press, la, usa.
[21] ishikawa, h.: object-oriented database system: design and implementation for advanced applications. computer science workbench. 1993, tokyo; new york: springer-verlag.
[22] zand, m., collins, v., caviness, d.: "a survey of current object-oriented databases." acm sigmis database, vol. 26 (1995), no. 1, p. 14–29.
[23] ahad, r.: "a framework to support an object-oriented view of existing engineering databases." in: proceedings of the 1988 acm sigsmall/pc symposium on actes. 1988, cannes, france: acm press, ny, usa.
[24] stonebraker, m.: "object management in postgres using procedures." in: 1986 international workshop on object-oriented database systems. 1986, pacific grove, ca, usa: ieee computer society press, los alamitos, ca, usa.
[25] manegold, s., boncz, p. a., kersten, m. l.: "optimising database architecture for the new bottleneck: memory access." the vldb journal, (2000), no. 9, p. 231–246.
[26] ramakrishna, m. v., larson, p.: "file organisation using composite perfect hashing." acm transactions on database systems (tods), vol. 14 (1989), no. 2, p. 231–263.
[27] roller, d., eck, o., dalakakis, s.: "integrated version and transaction group model for shared engineering databases." data & knowledge engineering, vol. 42 (2002), no. 2, p. 223–245.
[28] graefe, g.: "query evaluation techniques for large databases." acm computing surveys (csur), vol. 25 (1993), no. 2, p. 73–169.
[29] analyti, a., pramanik, s.: "fast search in main memory databases." in: acm sigmod, 1992, ca, usa.
[30] jarke, m., koch, j.: "query optimisation in database systems." acm computing surveys (csur), vol. 16 (1984), no. 2, p. 111–152.
[31] chaudhuri, s.: "an overview of query optimisation in relational systems." in: pods '98. 1998, seattle, wa, usa.
[32] böhm, c., berchtold, s., keim, d. a.: "searching in high-dimensional spaces: index structures for improving the performance of multimedia databases." acm computing surveys (csur), vol. 33 (2001), no. 3, p. 322–373.
[33] funkhouser, t., kazhdan, m. p. et al.: "a search engine for 3d models." acm transactions on graphics (tog), vol. 22 (2003), no. 1, p. 83–105.
[34] gaude, v., günther, o.: "multidimensional access methods." acm computing surveys (csur), vol. 30 (1998), no. 2, p. 170–231.
[35] hjaltason, g. r., samet, h.: "index-driven similarity search in metric spaces." acm transactions on database systems (tods), vol. 28 (2003), no. 4, p. 517–580.
[36] owrang, o., omidvar, m. m.: "a parallel database machine for query translation in a distributed database system." p. 47–52.
[37] ciaccia, p., patella, m.: "searching in metric spaces with user-defined and approximate distances." acm transactions on database systems (tods), vol. 27 (2002), no. 4, p. 398–437.
[38] kozlowski, m. c., panda, m.: "computer-aided design of chiral ligands: part i. database search methods to identify chiral ligand types for asymmetric reactions." journal of molecular graphics and modelling, vol. 20 (2002), no. 5, p. 399–409.
[39] keim, d. a.: "efficient geometry-based similarity search of 3d spatial databases." in: 1999 acm sigmod intl. conf. on management of data. 1999, philadelphia, pennsylvania, usa.
[40] faloutsos, c. et al.: "efficient and effective querying by image content." journal of intelligent information systems, vol. 3 (1994), p. 231–262.
[41] mehrotra, r., gary, j. e.: "feature-based retrieval of similar shapes." in: proc. 9th int. conf. on data engineering.
1993, vienna, austria.
[42] manwaring, m. l., jones, k. l., glagowski, t. g.: "an engineering design process supported by knowledge retrieval from a spatial database." ieee: p. 395–398.
[43] chen, y., che, d., aberer, k.: "on the efficient evaluation of relaxed queries in biological databases." in: proceedings of cikm '02. 2002, mclean, virginia, usa.
[44] thornton, a. c.: "the use of constraint-based design knowledge to improve the search for feasible designs." engineering applications of artificial intelligence, vol. 9 (1996), no. 4, p. 393–402.
[45] burgess, s., pasini, d., alemzadeh, k.: "improved visualisation of the design space using nested performance charts." design studies. in press, corrected proof.
[46] chien, s.-f., flemming, u.: "design space navigation in generative design systems." automation in construction, vol. 11 (2002), no. 1, p. 1–22.
[47] liu, c.-s., tseng, c.-h.: "space-decomposition minimization method for large-scale minimization problems." computers & mathematics with applications, vol. 37 (1999), no. 7, p. 73–88.
[48] chen, t.-y., lin, c.-y.: "determination of optimum design spaces for topology optimization." finite elements in analysis and design, vol. 36 (2000), no. 1, p. 1–16.
[49] sanchez marin, f. t., perez gonzalez, a.: "global optimization in path synthesis based on design space reduction." mechanism and machine theory, vol. 38 (2003), no. 6, p. 579–594.
[50] yao, z., johnson, a. l.: "on estimating the feasible solution space of design." computer-aided design, vol. 29 (1997), no. 9, p. 649–655.
[51] chen, p.-y., jou, j. m.: "adaptive arithmetic coding using fuzzy reasoning and grey prediction." fuzzy sets and systems, vol. 114 (2000), no. 2, p. 239–254.
[52] holtz, k., holtz, e.: "lossless data compression techniques." p. 392–397.
[53] yang, e., kieffer, j. c.: "on the performance of data compression algorithms based upon string matching." ieee transactions on information theory, vol. 44 (1998), no. 1, p. 47–65.
[54] fowler, j. e., yagel, r.: "optimal linear prediction for the lossless compression of volume data." ieee transactions on computers, 1995, p. 458.
[55] kidner, d. b., smith, d. h.: "advances in the data compression of digital elevation models." computers & geosciences, vol. 29 (2003), no. 8, p. 985–1002.
[56] philips, w. et al.: "state-of-the-art techniques for lossless compression of 3d medical image sets." computerized medical imaging and graphics, vol. 25 (2001), no. 2, p. 173–185.
[57] ageenko, e., franti, p.: "lossless compression of large binary images in digital spatial libraries." computers & graphics, vol. 24 (2000), no. 1, p. 91–98.
[58] kivijarvi, j. et al.: "a comparison of lossless compression methods for medical images." computerized medical imaging and graphics, vol. 22 (1998), no. 4, p. 323–339.
[59] strom, j., cosman, p. c.: "medical image compression with lossless regions of interest." signal processing, vol. 59 (1997), no. 2, p. 155–171.
[60] karadimitriou, k., tyler, j. m.: "the centroid method for compressing sets of similar images." pattern recognition letters, vol. 19 (1998), no. 7, p. 585–593.
[61] lelewer, d. a., hirschberg, d. s.: "data compression." acm computing surveys (csur), vol. 19 (1987), no. 3, p. 261–296.
[62] cormack, g. v.: "data compression on a database system." communications of the acm, vol. 28 (1985), no. 12, p. 1336–1342.
[63] graefe, g., shapiro, l.
d.: “data compression and database performance.” ieee transactions on computers, 1991, p. 22–27. [64] francavilla, a., ramakrishanan, c. v., zienkiewicz, o. c. z.: “optimisation of shape to minimise stress concentration.” journal of strain analysis, vol. 10 (1975), p. 63–70. [65] chang, k. h., choi, k. k.: a geometry based parametrisation method for shape design of elastic solids.” mech struct mach, vol. 20 (1992), p. 215–52. [66] kodiyalam, s., kumar, v., finigan, p. m.: “constructive solid geometry approach to threedimensional shape optimisation.” journal of the aiaa, vol. 30 (1992), p. 1408–15. [67] tortorelli, d. a.: “a geometric representation scheme suitable for three-dimensional shape optimisation.” mech struct mach, vol. 21 (1993), p. 95–121. [68] schumaker, l. l.: spline functions: basic theory. new york: john wiley and sons, 1981. [69] braiband, v., fleury, c.: “shape optimal design using b-splines.” computer methods in applied mechanics and engineering, vol. 44 (1984), p. 247–67. [70] qin, s. f., wright, d. k., jordanov, i. n.: “from on-line sketching to 2d and 3d geometry: a system based on fuzzy knowledge.” computer-aided design, vol. 32 (2000), no. 14, p. 851–866. [71] pottmann, h., farin, g.: “developable rational bezier and b-spline surfaces.” computer aided geometric design, vol. 12 (1995), no. 5, p. 513–531. [72] de kemp, e. a.: “visualization of complex geological structures using 3-d bezier construction tools*1.” computers & geosciences, vol. 25 (1999), no. 5, p. 581–597. [73] garcia, a. l., de miras, j. r., fieto, f. r.: “free-form solid modelling based on extended simplicial chains using triangular bezier patches.” computers & graphics, vol. 27 (2003), no. 1, p. 27–39. [74] tsay, d. m., lin, b. j.: “improving the geometry design of cylindrical cams using nonparametric rational b-splines.” computer-aided design, vol. 28 (1996), no. 1, p. 5–15. [75] wang, r. -h.: “multivariate spline and algebraic geometry*1.” journal of computational and applied mathematics, vol. 121 (2000), no. 1–2, p. 153–163. [76] hinton, e., ozakca, m., rao, n. v. r.: “free vibration analysis and shape optimization of variable thickness plates, prismatic folded plates and curved shells: part 2: shape optimization.” journal of sound and vibration, vol. 181 (1995), no. 4, p. 567–581. © czech technical university publishing house http://ctn.cvut.cz/ap/ 51 czech technical university in prague acta polytechnica vol. 45 no. 4/2005 [77] yu, t. y., soni, b. k.: “application of nurbs in numerical grid generation.” computer-aided design, vol. 27 (1995), no. 2, p. 147–157. [78] zheng, y., lewis, r. w., gethin, d. t.: “three-dimensional unstructured mesh generation: part 2. surface meshes.” computer methods in applied mechanics and engineering, vol. 134 (1996), no. 3–4, p. 269–284. [79] mastin, c. w.: “three-dimensional bezier interpolation in solid modeling and grid generation*1.” computer aided geometric design, vol. 14 (1997), no. 9, p. 797–805. [80] natekar, d., zhang, x., subbarayan, g.: “constructive solid analysis: a hierarchical, geometry-based meshless analysis procedure for integrated design and analysis.” computer-aided design, vol. 36 (2004), no. 5, p. 473–486. [81] keyser, j. et al.: “efficient and exact manipulation of algebraic points and curves.” computer-aided design, vol. 32 (2000), no. 11, p. 649–662. [82] hohenberger, w., reuding, t.: “smoothing rational b-spline curves using the weights in an optimization procedure.” computer aided geometric design, vol. 12 (1995), no. 8, p. 837–848. 
[83] dekanski, c. w., bloor, m. i. g., wilson, m. j.: "a parametric model of a 2-stroke engine for design and analysis." computer methods in applied mechanics and engineering, vol. 137 (1996), p. 411–425.
[84] ugail, h., bloor, m. i. g., wilson, m. j.: "manipulation of pde surfaces using an interactively defined parameterisation." computers & graphics, vol. 23 (1999), p. 525–534.
[85] du, h., qin, h.: "a shape design system using volumetric implicit pdes." computer-aided design. in press, corrected proof.
[86] ugail, h., wilson, m. j.: "efficient shape parametrisation for automatic design optimisation using a partial differential equation formulation." computers & structures, vol. 81 (2003), no. 28–29, p. 2601–2609.
[87] bloor, m. i. g., wilson, m. j. i.: "generating parametrizations of wing geometries using partial differential equations." computer methods in applied mechanics and engineering, vol. 148 (1997), p. 125–138.
[88] bloor, m. i. g., wilson, m. j. i.: "the efficient parametrization of generic aircraft geometry." journal of aircraft, vol. 32 (1995), no. 6, p. 1269–1275.
[89] sevant, n. e. et al.: "the automatic design of a generic wing/body fairing." in: cfd 95: third annual conference of the cfd society of canada. 1995, banff, canada.
[90] siu, y. k., tan, s. t.: "'source-based' heterogeneous solid modeling." computer-aided design, vol. 34 (2002), no. 1, p. 41–55.
[91] kumar, v. et al.: "a framework for object modeling." computer-aided design, vol. 31 (1999), no. 9, p. 541–556.
[92] batory, d. s., kim, w.: "modeling concepts for vlsi cad objects." acm transactions on database systems (tods), vol. 10 (1985), no. 3, p. 322–346.
[93] rappoport, a.: "geometric modeling: a new fundamental framework and its practical implications." in: proceedings of the 3rd acm symposium on solid modeling and applications. 1995, salt lake city, utah, usa: acm press, nj, usa.
[94] webber, s. j.: "an expert system for hydraulic fracturing." in: school of petroleum and geological engineering. 1994, university of oklahoma.
[95] li, y.: "an engineering constraint handling and optimisation system for integrated product development." in: industrial engineering. 1998, wayne state university: detroit.
[96] chau, k. w., albermani, f.: "a coupled knowledge-based expert system for design of liquid-retaining structures." automation in construction, vol. 12 (2003), no. 5, p. 589–602.
[97] gantovnik, v. b. et al.: "a genetic algorithm with memory for mixed discrete-continuous design optimization." computers & structures, vol. 81 (2003), p. 2003–2009.
[98] lee, d., kim, s.-y.: "a knowledge-based expert system as a pre-post processor in engineering optimization." expert systems with applications, vol. 11 (1996), no. 1, p. 79–87.
[99] mccorkle, d. s., bryden, k. m., carmichael, c. g.: "a new methodology for evolutionary optimization of energy systems." computer methods in applied mechanics and engineering, vol. 192 (2003), no. 44–46, p. 5021–5036.
[100] walker, m., smith, r. e.: "a technique for the multiobjective optimisation of laminated composite structures using genetic algorithms and finite element analysis." composite structures, vol. 62 (2003), no. 1, p. 123–128.
[101] bos, a. h. w.: "aircraft conceptual design by genetic/gradient-guided optimization." engineering applications of artificial intelligence, vol. 11 (1998), no. 3, p. 377–382.
[102] zakarian, v. l., kaiser, m. j.: "an embedded hybrid neural network and expert system in a computer-aided design system." expert systems with applications, vol.
16 (1999), no. 2, p. 233–243.
[103] leung, r. w. k., lau, h. c. w., kwong, c. k.: "an expert system to support the optimization of ion plating process: an olap-based fuzzy-cum-ga approach." expert systems with applications, vol. 25 (2003), no. 3, p. 313–330.
[104] lu, jingui, et al.: "an improved strategy for gas in structural optimization." computers & structures, vol. 61 (1996), no. 6, p. 1185–1191.
[105] le riche, r., gaudin, j., besson, j.: "an object-oriented simulation-optimization interface." computers & structures, vol. 81 (2003), no. 17, p. 1689–1701.
[106] ko, d.-c., kim, d.-h., kim, b.-m.: "application of artificial neural network and taguchi method to preform design in metal forming considering workability." international journal of machine tools and manufacture, vol. 39 (1999), no. 5, p. 771–785.
[107] hauser, m., scherer, r. j.: "application of intelligent cad paradigms to preliminary structural design." artificial intelligence in engineering, vol. 11 (1997), no. 3, p. 217–229.
[108] chun, h. w.: "automatic simulation program synthesis using a knowledge-based approach." simulation practice and theory, vol. 5 (1997), no. 6, p. 473–488.
[109] soremekun, g. et al.: "composite laminate design optimization by genetic algorithm with generalized elitist selection." computers & structures, vol. 79 (2001), no. 2, p. 131–143.
[110] brekelmans, r. et al.: "constrained optimization involving expensive function evaluations: a sequential approach." european journal of operational research. in press, corrected proof.
[111] de silva garza, a. g., maher, m. l.: "design by interactive exploration using memory-based techniques." knowledge-based systems, vol. 9 (1996), no. 3, p. 151–161.
[112] manohar, p. a., shivathaya, s. s., ferry, m.: "design of an expert system for the optimization of steel compositions and process route." expert systems with applications, vol. 17 (1999), no. 2, p. 129–134.
[113] giannakoglou, k. c.: "design of optimal aerodynamic shapes using stochastic optimization methods and computational intelligence." progress in aerospace sciences, vol. 38 (2002), no. 1, p. 43–76.
[114] wallace, d. r., jakiela, m. j., flowers, w. c.: "design search under probabilistic specifications using genetic algorithms." computer-aided design, vol. 28 (1996), no. 5, p. 405–421.
[115] bullock, g. n. et al.: "developments in the use of the genetic algorithm in engineering design." design studies, vol. 16 (1995), no. 4, p. 507–524.
[116] chau, k. w., albermani, f.: "expert system application on preliminary design of water retaining structures." expert systems with applications, vol. 22 (2002), no. 2, p. 169–178.
[117] linkens, d. a., min-you, c.: "expert control systems – 2. design principles and methods." engineering applications of artificial intelligence, vol. 8 (1995), no. 5, p. 527–537.
[118] liebowitz, j.: "expert systems: a short introduction." engineering fracture mechanics, vol. 50 (1995), no. 5–6, p. 601–607.
[119] renner, g., ekart, a.: "genetic algorithms in computer aided design." computer-aided design, vol. 35 (2003), no. 8, p. 709–726.
[120] teughels, a., de roeck, g., suykens, j. a. k.: "global optimization by coupled local minimizers and its application to fe model updating." computers & structures, vol. 81 (2003), no. 24–25, p. 2337–2351.
[121] shyy, w.
et al.: “global design optimization for aerodynamics and rocket propulsion components.” progress in aerospace sciences, vol. 37 (2001), no. 1, p. 59–118. [122] mejasson, p. et al.: “intelligent design assistant (ida): a case base reasoning system for material and design.” materials & design, vol. 22 (2001), no. 3, p. 163–170. [123] prabhu, b. s., biswas, s., pande, s. s.: “intelligent system for extraction of product data from cadd models.” computers in industry, vol. 44 (2001), no. 1, p. 79–95. [124] stephanopoulos, g., han, c.: “intelligent systems in process engineering: a review.” computers & chemical engineering, vol. 20 (1996), no. 6–7, p. 743–791. [125] lagaros, n. d., papadrakakis, m., kokossalakis, g.: “structural optimization using evolutionary algorithms.” computers & structures, vol. 80 (2002), no. 7–8, p. 571–589. [126] papadrakakis, m., lagaros, n. d., tsompanakis, y.: “structural optimization using evolution strategies and neural networks.” computer methods in applied mechanics and engineering, vol. 156 (1998), no. 1–4, p. 309–333. [127] dev, k., murthy, c. s. r.: “a genetic algorithm for the knowledge base partitioning problem*1. pattern recognition letters, vol. 16 (1995), no. 8, p. 873–879. [128] fensel, d., groenboom, r., de lavalette, g. r. r.: “modal change logic (mcl): specifying the reasoning of knowledge-based systems.” data & knowledge engineering, vol. 26 (1998), no. 3, p. 243–269. [129] clibbon, k., edmonds, e.: “representing strategic design knowledge.” engineering applications of artificial intelligence, vol. 9 (1996), no. 4, p. 349–357. [130] maher, m.: “the importance of representation in ai in design.” in: research directions for artificial intelligence in design. 1995, key centre of design computing, university of sydney. p. 49–54. [131] gantovnik, v. b., gurdal, z., watson, l. t.: “a genetic algorithm with memory for optimal design of laminated sandwich composite panels.” composite structures, vol. 58 (2002), no. 4, p. 513–520. [132] fluery, c., zhang, w. h.: “selection of appropriate approximation schemes in multi-disciplinary engineering optimization.” advances in engineering software, vol. 31 (2000), no. 6, p. 385–389. [133] schmit, l. a., farshi, b.: “some approximation concepts for structural synthesis.” journal of the aiaa, vol. 12 (1974), p. 692–9. [134] barthelemy, j. f., haftka, r. t.: “approximation concepts for optimum structural design – a review. structural optimisation, vol. 5 (1993), p. 129–144. [135] rodriguez, j. f. et al.: “trust region model management in multidisciplinary design optimization.” journal of computational and applied mathematics, vol. 124 (2000), no. 1–2, p. 139–154. [136] ugwu, o. o., tah, j. h. m.: “towards optimising construction-method selection strategies using genetic algorithms. engineering applications of artificial intelligence, vol. 11 (1998), no. 4, p. 567–577. © czech technical university publishing house http://ctn.cvut.cz/ap/ 53 czech technical university in prague acta polytechnica vol. 45 no. 4/2005 [137] abou, r. s. m., shi, j.: “automated construction-simulation optimization.” journal of construction engineering and management, vol. 120 (1994), no. 2, p. 374–385. [138] paulson, b. c. j., chan, w. t., koo, c. c.: “construction operations simulation by microcomputers.” journal of construction engineering and management, vol. 113 (1987), no. 2, p. 302–320. [139] smith, s. d., osborne, j. r., forde, m. 
acta polytechnica 60(2):98–110, 2020, doi:10.14311/ap.2020.60.0098, © czech technical university in prague, 2020, available online at https://ojs.cvut.cz/ojs/index.php/ap

similarity solutions and conservation laws for the beam equations: a complete study

amlan kanti halder (a,∗), andronikos paliathanasis (b,c), peter gavin lawrence leach (c,d)

a pondicherry university, department of mathematics, 605014 kalapet, india
b universidad austral de chile, instituto de ciencias físicas y matemáticas, valdivia, chile
c durban university of technology, institute for systems science, durban 4000, republic of south africa
d university of kwazulu-natal, school of mathematics, statistics and computer science, durban, south africa
∗ corresponding author: amlan91.res@pondiuni.edu.in

abstract. we study the similarity solutions and determine the conservation laws of various forms of beam equations, namely the euler-bernoulli, rayleigh and timoshenko-prescott forms. the travelling-wave reduction leads to solvable fourth-order odes for all three forms. in addition, the reduction based on the scaling symmetry of the euler-bernoulli form leads to certain odes that admit zero lie point symmetries, so we conduct the singularity analysis to ascertain their integrability. we study two reduced odes, of second and third order: the second-order ode is a perturbed form of the painlevé-ince equation, which is integrable, and the third-order ode falls into the category of equations studied by chazy, bureau and cosgrove. moreover, we derive the symmetries, the corresponding reductions and the conservation laws for the forced forms of the abovementioned beam equations. the lie algebra is stated explicitly for all the cases.

keywords: symmetry analysis, singularity analysis, conservation laws, beam equation.

1. introduction

there are basically two types of beams: one type is supported at both ends, while the other is supported at only one end, that is, a cantilever. the latter is of greater mathematical and physical interest, for the free end can vibrate, and this causes stresses in the beam.
the first mathematical description was made by leonhard euler and daniel bernoulli around 1750, although there were some earlier attempts by leonardo da vinci and galileo galilei, who were more than a little hampered by having no knowledge of differential equations. jacob bernoulli laid the groundwork for the development by leonhard euler and daniel bernoulli. in 1894, the polymath lord rayleigh proposed an improvement to the euler-bernoulli model by including a term related to rotational stress. in 1921, timoshenko introduced considerable improvements in what is now termed the timoshenko-prescott model. there has been considerable experimental and numerical work devoted to the comparison of the predictions of the theories with experimental results. it should be emphasised that the infinitesimal theory of elasticity is three-dimensional and that the three models mentioned above are linear models. they make for easier mathematics, but there is a price to pay. curiously, the simplest model, that of euler-bernoulli, still finds favour amongst some practitioners.

some of the experimental work [1] undertaken to compare reality with theoretical prediction tries to make the experiment as close to a one-dimensional model as possible. one of the more interesting studies is the propagation of shock waves along the beam. this involves firing a bullet into the fixed end of the cantilever, which is of a small diameter – 25 mm – to emulate a uniform boundary condition at the fixed end of the beam. the literature devoted to the theory and practice of beams is extensive both in time and space. a fairly recent paper by labuschagne [2] is very good in its historical aspects as well as being clearly written. earlier papers, in addition to that of davies cited above, are by hudson [3] and bancroft [4]. an interesting feature is that the beams are taken to be cylindrical in shape, even though the beams one sees in buildings are anything but cylindrical, with some exceptions to be found in beamed structures of the nineteenth century. one assumes that this makes the analysis simpler due to the radial symmetry; even a square cross-section would significantly complicate the mathematics.

in this work, we study the algebraic properties of the euler-bernoulli, the rayleigh and the timoshenko-prescott equations according to the admitted lie point symmetries, for the source-free equation as well as in the case where a homogeneous source term exists. the application of symmetry analysis to the euler-bernoulli equation is not new and there are various studies in the literature [5–9]; however, in this paper we obtain some new results, such as the reduction of the euler-bernoulli form to a perturbed form of the painlevé-ince equation [10], which is integrable, and to a third-order ode which falls into the category of equations studied by chazy, bureau and cosgrove. also, we show that the three beam equations of our study admit the same travelling-wave solution. certain notable works were recently devoted to the static euler-bernoulli equation by ruiz [11] and da silva [12].
in ruiz [11], the euler-bernoulli equation with an external agent is studied with respect to the joint invariants of the algebra and complete solutions are specified, whereas in [12], for the static euler-bernoulli equation with a specific nonlinear term, it was found that the algebraic structure of the lie point symmetries is similar to that of the noether symmetries. it is worthwhile to mention the paper of freire et al. [13], where the lane-emden system is reduced to the emden-fowler equations and, correspondingly, the solutions of the system are studied with the aid of the point symmetries. to elaborate on the abovementioned works, we focus on the more general euler-bernoulli equation with and without the external forcing term and compute its solutions using the point symmetries. it is also our intuition, following the results of the abovementioned work, that the point symmetries of the form of euler-bernoulli under consideration possess some similarities with the noether symmetries.

this paper is structured in the following way: in section 2, we mention the lie point symmetries and the corresponding algebra. in section 3, we discuss the travelling-wave solutions for all the beam forms and further reductions of the euler-bernoulli form using the scaling symmetries. in section 4, we study the forced forms of the beam equations. section 5 is devoted to the singularity analysis of a third-order equation which is obtained by the reduction of the euler-bernoulli equation using the scaling symmetry. conservation laws for the three beam equations are derived in section 6. the conclusion and the references follow.

2. lie symmetry analysis

for the convenience of the reader, we give a brief discussion of the theory of lie point symmetries. in particular, we present the basic definitions and the main steps for the determination of the lie point symmetries of a given differential equation. consider $H^A(t,x,u,u_{,i}) = 0$ to be a set of differential equations, where $u_{,i} = \partial u/\partial y^i$ and $y^i = (t,x)$. then, under the action of the infinitesimal one-parameter point transformation

$t' = t(t,x,u;\varepsilon)$, (1)
$x' = x(t,x,u;\varepsilon)$, (2)
$u'^A = u^A(t,x,u;\varepsilon)$, (3)

in which $\varepsilon$ is an infinitesimal parameter, the set of differential equations $H^A$ is invariant if and only if

$H^A(t',x',u') = H^A(t,x,u)$, (4)

or, equivalently,

$\lim_{\varepsilon\to 0} \frac{H^A(t',x',u';\varepsilon) - H^A(t,x,u)}{\varepsilon} = 0$. (5)

the latter expression is the definition of the lie derivative $\mathcal{L}$ of $H^A$ along the direction

$\Gamma = \frac{\partial t'}{\partial\varepsilon}\partial_t + \frac{\partial x'}{\partial\varepsilon}\partial_x + \frac{\partial u'}{\partial\varepsilon}\partial_u$. (6)

hence, we shall say that the vector field $\Gamma$ is a lie point symmetry of the set of differential equations $H^A$ if and only if the following condition is true:

$\mathcal{L}_\Gamma(H^A) = 0$. (7)

in other words, the operator $\Gamma$ can be considered to be a symmetry provided $\Gamma^{[n]} H^A = 0$ whenever $H^A(t,x,u,u_{,i}) = 0$, where $\Gamma^{[n]}$ denotes the $n$-th prolongation of the specified operator in its defined space. the set of all such operators can be denoted by $G$, which can be regarded as the symmetry group of the set of differential equations $H^A(t,x,u,u_{,i}) = 0$ [14–16].

2.1. the euler-bernoulli equation.

the euler-bernoulli form of the beam equation is [1, 17]

$\alpha\beta u_{xxxx} + u_{tt} = 0$. (8)

the lie point symmetries are

$\Gamma_{1a} = \partial_x$, $\Gamma_{2a} = \partial_t$, $\Gamma_{3a} = u\partial_u$, $\Gamma_{4a} = 2t\partial_t + x\partial_x$, $\Gamma_{5a} = a(t,x)\partial_u$,

where $a(t,x)$ satisfies the euler-bernoulli form of the beam equation. the lie algebra is $(A_{3,3}\oplus A_1)\oplus_s \infty A_1$, according to the morozov-mubarakzyanov classification scheme [18–21]. throughout, we use the sym package developed by prof. stelios dimas [22, 23].
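as a quick illustration of the symmetry condition (7), one can verify with a computer algebra system that the finite scaling transformation generated by $\Gamma_{4a}$ maps solutions of (8) to solutions. the following sympy sketch is our own illustration (not part of the original computation, which used the sym package):

```python
import sympy as sp

t, x, eps = sp.symbols('t x epsilon')
alpha, beta = sp.symbols('alpha beta', positive=True)
u = sp.Function('u')

# finite transformation generated by Gamma_4a = 2t*d_t + x*d_x:
# (t, x, u) -> (exp(2*eps)*t, exp(eps)*x, u)
v = u(sp.exp(2*eps)*t, sp.exp(eps)*x)

# euler-bernoulli operator of eq. (8) applied to the transformed function
expr = alpha*beta*sp.diff(v, x, 4) + sp.diff(v, t, 2)

# the chain rule produces a common factor exp(4*eps); what remains is the
# euler-bernoulli operator applied to u at the scaled point, so the scaling
# maps solutions of (8) to solutions
print(sp.simplify(expr * sp.exp(-4*eps)))
```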
2.2. the rayleigh equation.

the rayleigh form of the beam equation is [1, 17]

$\alpha\beta u_{xxxx} + u_{tt} - \beta u_{xxtt} = 0$. (9)

the lie point symmetries are

$\Gamma_{1b} = \partial_t$, $\Gamma_{2b} = \partial_x$, $\Gamma_{3b} = u\partial_u$, $\Gamma_{4b} = b(t,x)\partial_u$,

where $b(t,x)$ satisfies the rayleigh form of the beam equation. consequently, the admitted lie algebra is $A_3 \oplus_s \infty A_1$.

2.3. the timoshenko-prescott equation.

the timoshenko and prescott form of the beam equation is [1, 24]

$\alpha\beta u_{xxxx} + u_{tt} - \beta(1+\epsilon) u_{xxtt} + \frac{\epsilon\beta}{\alpha} u_{tttt} = 0$. (10)

the lie point symmetries are

$\Gamma_{1c} = \partial_t$, $\Gamma_{2c} = \partial_x$, $\Gamma_{3c} = u\partial_u$, $\Gamma_{4c} = c(t,x)\partial_u$,

where $c(t,x)$ satisfies the timoshenko and prescott form of the beam equation. hence, the admitted lie algebra is $A_3 \oplus_s \infty A_1$. therefore, we can say that the timoshenko-prescott equation and the rayleigh equation are algebraically equivalent, but different from the euler-bernoulli equation, which admits a higher-dimensional lie algebra. we continue our analysis by applying the lie point symmetries to determine similarity solutions for the three equations of our study.

3. the travelling-wave solution

the travelling-wave reduction of eq. (8) with respect to $\Gamma_{2a} + c\Gamma_{1a}$, where $c$ denotes the frequency, leads to the fourth-order equation

$c^2 v''(s) + \alpha\beta v''''(s) = 0$, (11)

where $s = x - ct$ and $v(s) = u(x,t)$. the lie point symmetries of equation (11) are

$\Gamma_{1d} = \partial_s$, $\Gamma_{2d} = \partial_v$, $\Gamma_{3d} = s\partial_v$, $\Gamma_{4d} = v\partial_v$, $\Gamma_{5d} = \sin\left(\frac{cs}{\sqrt{\alpha\beta}}\right)\partial_v$, $\Gamma_{6d} = \cos\left(\frac{cs}{\sqrt{\alpha\beta}}\right)\partial_v$.

the reduced equation has six symmetries and hence is linearisable; the solution of the fourth-order equation is

$v(s) = c_0 + c_1 s + c_2 \sin\left(\frac{cs}{\sqrt{\alpha\beta}}\right) + c_3 \cos\left(\frac{cs}{\sqrt{\alpha\beta}}\right)$,

where $c_i$, $i = 0,1,2,3$, are arbitrary constants. correspondingly, the solution of the euler-bernoulli form of the beam equation is

$u(x,t) = c_0 + c_1 (x-ct) + c_2 \sin\left(\frac{c(x-ct)}{\sqrt{\alpha\beta}}\right) + c_3 \cos\left(\frac{c(x-ct)}{\sqrt{\alpha\beta}}\right)$.

3.1. further reduction of the euler-bernoulli equation.

the reductions with respect to $\Gamma_{3a}$ and $\Gamma_{4a}$ lead to fourth-order odes. for the similarity variables with respect to $2\Gamma_{3a}+\Gamma_{4a}$, $s = t/x^2$, $u(t,x) = t\,v(s)$, the reduced ode is

$\left(\frac{2}{\alpha\beta} + 120 s^2\right) v' + \left(\frac{s}{\alpha\beta} + 300 s^3\right) v'' + 144 s^4 v''' + 16 s^5 v'''' = 0$.

the latter equation is solvable. we continue by considering the similarity variable $u(t,x) = x\,v(s)$ with respect to $\Gamma_{3a}+\Gamma_{4a}$, with $s$ the same as above, which leads to the fourth-order ode

$24 s v' + \left(\frac{1}{\alpha\beta} + 156 s^2\right) v'' + 16 s^3\left(7 v''' + s v''''\right) = 0$. (12)

this equation has a total of five lie point symmetries, with $\partial_v$ and $v\partial_v$ being the two simpler ones; the other three are in terms of hypergeometric functions and are too complicated to be presented here. we apply $\partial_v$ to perform the reduction. the new invariant functions are $s = h$ and $g(h) = v'(s)$; hence, the reduced equation is the third-order ode

$g'''(h) = -\frac{7 g''(h)}{h} - \left(\frac{39}{4h^2} + \frac{1}{16 h^4 \alpha\beta}\right) g'(h) - \frac{3 g(h)}{2h^3}$. (13)

the latter equation admits four lie point symmetries, the simplest being $g\partial_g$; the other three involve $\sin h$, $\cos h$ and a hypergeometric function, respectively. we consider $g\partial_g$ to perform the reduction. the subsequent second-order equation is

$m''(n) = \left(-3 m(n) - \frac{7}{n}\right) m'(n) - m(n)^3 - \frac{7 m(n)^2}{n} - \left(\frac{39}{4n^2} + \frac{1}{16 n^4 \alpha\beta}\right) m(n) - \frac{3}{2n^3}$, (14)

where $n = h$ and $m(n) = g'(h)/g(h)$. this is a perturbed form of the painlevé-ince equation, and the singularity analysis of this equation shows that it is integrable. in a subsequent paper we look at this analysis and discuss it elaborately.
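the reductions above are easy to confirm symbolically. for example, the following sympy sketch (our own check) substitutes the similarity ansatz $u = x\,v(t/x^2)$ into eq. (8) and recovers eq. (12):

```python
import sympy as sp

t, x, S = sp.symbols('t x s', positive=True)
alpha, beta = sp.symbols('alpha beta', positive=True)
v = sp.Function('v')

# similarity ansatz from Gamma_3a + Gamma_4a: s = t/x**2, u(t,x) = x*v(s)
u = x*v(t/x**2)

pde = alpha*beta*sp.diff(u, x, 4) + sp.diff(u, t, 2)

# rescale by x**3/(alpha*beta) and switch to the similarity variable;
# all explicit x-dependence cancels and eq. (12) remains:
# 24*s*v' + (1/(alpha*beta) + 156*s**2)*v'' + 112*s**3*v''' + 16*s**4*v''''
reduced = sp.simplify((pde * x**3 / (alpha*beta)).subs(t, S*x**2))
print(sp.expand(reduced))
```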
the reduction of (12) with respect to $v\partial_v$ leads to a third-order equation with zero symmetries,

$g'''(h) = \left(-4 g(h) - \frac{7}{h}\right) g''(h) - 3 g'^2 - \left(6 g(h)^2 + 21\frac{g(h)}{h} + \frac{39}{4h^2} + \frac{1}{16 h^4\alpha\beta}\right) g'(h) - g(h)^4 - \frac{7 g(h)^3}{h} - \left(\frac{39}{4h^2} + \frac{1}{16 h^4\alpha\beta}\right) g(h)^2 - \frac{3 g(h)}{2h^3}$, (15)

where $h = s$ and $g(h) = v'(s)/v(s)$. this equation is integrable, as ascertained by the singularity analysis, the calculations of which are presented in a following section.

3.2. the travelling wave solution for the rayleigh equation.

the reduction using $\Gamma_{1b} + c\Gamma_{2b}$, where $c$ is the frequency, leads to a fourth-order equation which is maximally symmetric; the definition of the similarity variables $s$ and $v(s)$ is the same as in the previous case:

$\left(1 - \beta c^2\right) v''''(s) + c^2 v''(s) = 0$, (16)

which is of the form of equation (11).

3.3. the travelling-wave solution for the timoshenko-prescott equation.

in a similar way, the application of the generic symmetry vector $\Gamma_{1c} + c\Gamma_{2c}$ to (10) provides the fourth-order ode

$\left(\alpha^2\beta - \alpha\beta c^2 - \alpha\beta c^2\epsilon + c^4\epsilon\beta\right) v''''(s) + \alpha c^2 v''(s) = 0$, (17)

which is again of the form of (11). consequently, we conclude that the three different beam equations provide the same travelling-wave solutions. we continue our analysis by assuming the existence of a source term $f(u)$ in the beam equations.

4. symmetry analysis with a source term

in this section, we study the impact of a forcing-source term $f(u)$ on the rhs of the euler-bernoulli, rayleigh and timoshenko-prescott beam equations.

4.1. euler-bernoulli.

the lie symmetry analysis for the euler-bernoulli equation (8) with the forced term $f(u)$ leads to the following possible cases for the forcing term:

$f_1(u) = au + b$, (18)
$f_2(u) = (au + b)^n$, (19)
$f_3(u) = e^{au+b}$, (20)
$f_4(u) = \text{arbitrary}$. (21)

for $f_1(u)$, the admitted lie point symmetries of the euler-bernoulli equation are

$\Gamma^{f_1}_1 = \partial_t$, $\Gamma^{f_1}_2 = \partial_x$, $\Gamma^{f_1}_3 = u\partial_u$, $\Gamma^{f_1}_\infty = b(t,x)\partial_u$,

which form the $3A_1$ lie algebra, where $b(t,x)$ is a solution of the original equation. for the source $f_2(u)$, the admitted lie point symmetries are

$\Gamma^{f_2}_1 = \partial_t$, $\Gamma^{f_2}_2 = \partial_x$, $\Gamma^{f_2}_3 = 2(n-1)t\partial_t + (n-1)x\partial_x - 4\left(u + \frac{b}{a}\right)\partial_u$,

which form the $2A_1 \oplus_s A_1$ lie algebra. for $f_3(u)$, the admitted lie point symmetries are

$\Gamma^{f_3}_1 = \partial_t$, $\Gamma^{f_3}_2 = \partial_x$, $\Gamma^{f_3}_3 = 2t\partial_t + x\partial_x - \frac{4}{a}\partial_u$,

where the corresponding lie algebra is $2A_1 \oplus_s A_1$. finally, for an arbitrary functional form of $f(u)$, the admitted lie point symmetries are only the two symmetry vectors

$\Gamma^{f_4}_1 = \partial_t$, $\Gamma^{f_4}_2 = \partial_x$,

which form the $2A_1$ lie algebra and provide the travelling-wave solution.

4.2. rayleigh and timoshenko-prescott equations.

for the other two beam equations, namely the rayleigh and timoshenko-prescott equations with a source term, we find that, for a linear function $f = f_1(u)$, the two equations admit the same lie point symmetries as in the force-free case, while for an arbitrary function $f(u) = f_4(u)$ they admit only the two lie point symmetries $\Gamma^{f_4}_1$, $\Gamma^{f_4}_2$, which provide travelling-wave solutions.

4.3. symmetry classification of the ode.

we present the reduction with the lie point symmetries $\Gamma^{f_4}_1 + c\Gamma^{f_4}_2$, because the three beam equations of our consideration provide the same fourth-order ode, which now, with a source term, takes the form

$v'''' + c^2 v'' = f(v)$. (22)

we perform the symmetry classification of the latter differential equation and we find that, for an arbitrary function $f(v)$, the latter equation admits only the autonomous symmetry vector $\partial_v$. however, for a constant source $f(v) = a_0$, the lie point symmetries are

$\partial_s$, $\partial_v$, $s\partial_v$, $\left(2c^2 v - a_0 s^2\right)\partial_v$, $\cos(cs)\partial_v$, $\sin(cs)\partial_v$,

and the generic solution of equation (22) is

$v(s) = v_1 \sin(cs) + v_2 \cos(cs) + v_3 s + v_4 + \frac{a_0}{2c^2} s^2$. (23)

on the other hand, for $f(v) = a_1 v + a_0$, equation (22) admits the six lie point symmetries

$\partial_s$, $(a_1 v + a_0)\partial_v$, $\exp\left(\pm\frac{i}{2}\sqrt{2c^2 + 2\sqrt{c^4 + 4a_1}}\, s\right)\partial_v$, $\exp\left(\pm\frac{i}{2}\sqrt{2c^2 - 2\sqrt{c^4 + 4a_1}}\, s\right)\partial_v$,

and the generic solution of (22) is

$v(s) = v_1 \exp\left(\frac{i}{2}\sqrt{2c^2 + 2\sqrt{c^4 + 4a_1}}\, s\right) + v_2 \exp\left(-\frac{i}{2}\sqrt{2c^2 + 2\sqrt{c^4 + 4a_1}}\, s\right) + v_3 \exp\left(\frac{i}{2}\sqrt{2c^2 - 2\sqrt{c^4 + 4a_1}}\, s\right) + v_4 \exp\left(-\frac{i}{2}\sqrt{2c^2 - 2\sqrt{c^4 + 4a_1}}\, s\right)$. (24)
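both generic solutions can be checked by a direct substitution; a short sympy verification (our addition) of solution (23) and of one exponential mode appearing in (24) reads:

```python
import sympy as sp

s, c, a0, a1 = sp.symbols('s c a_0 a_1', positive=True)
v1, v2, v3, v4 = sp.symbols('v_1 v_2 v_3 v_4')

# solution (23) of v'''' + c**2*v'' = a_0
v = v1*sp.sin(c*s) + v2*sp.cos(c*s) + v3*s + v4 + a0/(2*c**2)*s**2
print(sp.simplify(sp.diff(v, s, 4) + c**2*sp.diff(v, s, 2) - a0))    # -> 0

# one exponential mode of (24): it satisfies the homogeneous part
# v'''' + c**2*v'' = a_1*v of the linearly forced equation
w = sp.exp(sp.I*sp.sqrt(2*c**2 + 2*sp.sqrt(c**4 + 4*a1))/2*s)
print(sp.simplify(sp.diff(w, s, 4) + c**2*sp.diff(w, s, 2) - a1*w))  # -> 0
```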
4.4. scaling solutions for the forced euler-bernoulli equation.

we continue by presenting the reduction with the scaling symmetries for the euler-bernoulli equation for the power-law and exponential sources $f_2(u)$ and $f_3(u)$. for simplicity and without loss of generality, we select $b = 0$. for the power-law source $f_2(u) = au^n$, the application of the lie point symmetry $\Gamma^{f_2}_3$ provides the reduced fourth-order ode

$4\alpha\beta v'''' + s^2 v'' + \frac{3n+5}{n-1} s v' + \frac{8(1+n)v}{(n-1)^2} - 4av^n = 0$, (25)

in which $s = x/\sqrt{t}$ and $u(t,x) = v(s)\, t^{\frac{2}{n-1}}$. for the exponential source $f_3(u) = e^{au}$, the reduced equation given by the scaling symmetry is

$4\alpha\beta v'''' + s^2 v'' + 3asv' + 4ae^{av} + 8 = 0$, (26)

where now $s = x/\sqrt{t}$ and $u(t,x) = -\frac{2}{a}\ln(t) + v(s)$.

5. singularity analysis

the third-order ode to which we apply the singularity analysis is (15), or

$24\nu x y + y^2 + 156\nu x^2 y^2 + 112\nu x^3 y^3 + 16\nu x^4 y^4 + y' + 156\nu x^2 y' + 336\nu x^3 y y' + 96\nu x^4 y^2 y' + 48\nu x^4 y'^2 + 112\nu x^3 y'' + 64\nu x^4 y y'' + 16\nu x^4 y''' = 0$, (27)

where $g(h) = y(x)$, $h = x$ and $\alpha\beta = \nu$. we apply the ars algorithm [25–27] and make the usual substitution to obtain the leading-order behaviour [28],

$y \to a(x-x_0)^p$, (28)

which provides

$32a\nu p x^4 (x-x_0)^{p-3} - 48a\nu p^2 x^4 (x-x_0)^{p-3} + 16a\nu p^3 x^4 (x-x_0)^{p-3} - 112a\nu p x^3 (x-x_0)^{p-2} + 112a\nu p^2 x^3 (x-x_0)^{p-2} + ap(x-x_0)^{p-1} + 156a\nu p x^2 (x-x_0)^{p-1} + 24a\nu x (x-x_0)^{p} + a^2 (x-x_0)^{2p} + 156a^2\nu x^2 (x-x_0)^{2p} + 112a^3\nu x^3 (x-x_0)^{3p} + 16a^4\nu x^4 (x-x_0)^{4p} - 64a^2\nu p x^4 (x-x_0)^{2p-2} + 112a^2\nu p^2 x^4 (x-x_0)^{2p-2} + 336a^2\nu p x^3 (x-x_0)^{2p-1} + 96a^3\nu p x^4 (x-x_0)^{3p-1} = 0$. (29)

from the latter, it is evident that $p \to -1$. hence,

$-\frac{96a\nu x^4}{(x-x_0)^4} + \frac{176a^2\nu x^4}{(x-x_0)^4} - \frac{96a^3\nu x^4}{(x-x_0)^4} + \frac{16a^4\nu x^4}{(x-x_0)^4} + \frac{224a\nu x^3}{(x-x_0)^3} - \frac{336a^2\nu x^3}{(x-x_0)^3} + \frac{112a^3\nu x^3}{(x-x_0)^3} - \frac{a}{(x-x_0)^2} + \frac{a^2}{(x-x_0)^2} - \frac{156a\nu x^2}{(x-x_0)^2} + \frac{156a^2\nu x^2}{(x-x_0)^2} + \frac{24a\nu x}{x-x_0}$. (30)

we extract the obvious dominant terms,

$-\frac{96a\nu x^4}{(x-x_0)^4} + \frac{176a^2\nu x^4}{(x-x_0)^4} - \frac{96a^3\nu x^4}{(x-x_0)^4} + \frac{16a^4\nu x^4}{(x-x_0)^4}$, (31)

and solve for $a$,

$(a-3)(a-2)(a-1)a = 0$, (32)

to obtain

$a \to 0$, $a \to 1$, $a \to 2$, $a \to 3$. (33)

in order to find the resonances, we substitute

$y \to a(x-x_0)^{-1} + m(x-x_0)^{-1+s}$ (34)

and linearize around $m$. then the usual substitution $x \to z + x_0$ simplifies the calculations and provides the dominant-terms factor

$16\nu(-3 + 2a + s)\left(2 - 6a + 2a^2 - 3s + 2as + s^2\right) x_0^4 z^{s-4}$, (35)

and we obtain a set of resonances for each of the three values of the coefficient term $a$:

a = 1: $s \to -1$, $s \to 1$, $s \to 2$;
a = 2: $s \to -2$, $s \to -1$, $s \to 1$; and
a = 3: $s \to -3$, $s \to -2$, $s \to -1$.

for $a = 1$, the solution is expressed in terms of a right painlevé series; for $a = 3$, in terms of a left painlevé series; and for $a = 2$, in terms of a mixed painlevé series.
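the leading-order coefficients (33) and the resonances can be reproduced mechanically from the dominant $x^4$-block of eq. (27). the following sympy sketch (our illustration) normalises that block by $16\nu x^4$, which near $x = x_0$ is effectively the constant $16\nu x_0^4$ appearing in (35), and works in the shifted variable $z = x - x_0$:

```python
import sympy as sp

z, a, s, m = sp.symbols('z a s m')

# dominant block of eq. (27): the x**4 terms divided by 16*nu*x**4
dom = lambda w: (w**4 + 6*w**2*sp.diff(w, z) + 3*sp.diff(w, z)**2
                 + 4*w*sp.diff(w, z, 2) + sp.diff(w, z, 3))

# leading-order behaviour y = a/z reproduces eqs. (32)-(33)
print(sp.factor(sp.simplify(dom(a/z)*z**4)))     # a*(a - 1)*(a - 2)*(a - 3)

# resonances: linearize dom around y = a/z + m*z**(s-1), cf. eqs. (34)-(35)
res = sp.expand(dom(a/z + m*z**(s - 1)).diff(m).subs(m, 0)*z**(4 - s))
for val in (1, 2, 3):
    print(val, sp.solve(sp.factor(res.subs(a, val)), s))
# prints s = (-1, 1, 2), (-2, -1, 1) and (-3, -2, -1), respectively
```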
we commence the consistency test. for the right painlevé series, we write

$y \to (x-x_0)^{-1} + f_0 + f_1(x-x_0) + f_2(x-x_0)^2 + f_3(x-x_0)^3 + \ldots$, (36)

where $f_0$ and $f_1$ are the second and third constants of integration. the output is enormous and hence omitted; the substitution $x \to z + x_0$ just makes it easier to collect powers. the terms in $z^{-1}$ are

$\frac{2f_0}{z} + \frac{24\nu x_0}{z} + \frac{312 f_0\nu x_0^2}{z} + \frac{336 f_0^2\nu x_0^3}{z} + \frac{336 f_1\nu x_0^3}{z} + \frac{64 f_0^3\nu x_0^4}{z} + \frac{192 f_0 f_1\nu x_0^4}{z} + \frac{128 f_2\nu x_0^4}{z} + \ldots = 0$. (37)

this is solved to give

$64\nu x_0^4 f_2 \to -f_0 - 12\nu x_0 - 156 f_0\nu x_0^2 - 168 f_0^2\nu x_0^3 - 168 f_1\nu x_0^3 - 32 f_0^3\nu x_0^4 - 96 f_0 f_1\nu x_0^4$. (38)

the expression for $f_2$ is substituted into the major output,

$-7f_0^2 + 3f_1 - 240\nu - \frac{22 f_0}{x_0} - 2880 f_0\nu x_0 - 3780 f_0^2\nu x_0^2 - 2220 f_1\nu x_0^2 - 1680 f_0^3\nu x_0^3 - 1680 f_0 f_1\nu x_0^3 - 240 f_0^4\nu x_0^4 - 480 f_0^2 f_1\nu x_0^4 + 240 f_1^2\nu x_0^4 + 480 f_3\nu x_0^4 = 0$, (39)

and the coefficient of the constant term is solved to give $f_3$ as

$480\nu x_0^5 f_3 = 22 f_0 + 7 f_0^2 x_0 - 3 f_1 x_0 + 240\nu x_0 + 2880 f_0\nu x_0^2 + 3780 f_0^2\nu x_0^3 + 2220 f_1\nu x_0^3 + 1680 f_0^3\nu x_0^4 + 1680 f_0 f_1\nu x_0^4 + 240 f_0^4\nu x_0^5 + 480 f_0^2 f_1\nu x_0^5 - 240 f_1^2\nu x_0^5$. (40)

thus, there is no problem with the determination of the coefficients of the terms in the right painlevé series. as certain terms in the third-order ode are less dominant, there cannot be a left painlevé series, and the possibility of the existence of a mixed painlevé series is moot due to the practical difficulty of calculating the coefficients. consequently, equation (15) is integrable according to the painlevé test.

6. conservation laws

ibragimov's theory of nonlinear self-adjointness details the construction of conservation laws for a scalar pde [29–31]. our first step is to verify the self-adjointness condition on the various forms of the beam equation and later on to compute the conservation laws; the preliminaries can be easily accessed from [29]. the main motivation behind using ibragimov's approach is to obtain conservation laws from which to deduce certain special solutions for the beam equations, following the methodology specified by cimpoiasu [32], where the author has used the nonlinear self-adjointness method to compute solutions for the rossby waves. noether's theorem can be easily applied to obtain the conserved terms, but it is our intuition that the non-local conserved terms obtained using ibragimov's method can contribute to obtaining new solutions in a different subspace of the complex plane. the main objective is to deduce new solutions using the point symmetries, singularities and conservation laws. for instance, the euler-bernoulli equation does imply the existence of series-type solutions through the method of singularity analysis, as mentioned in section 5 for equation (15).

let the scalar pde admit the following generator of the infinitesimal transformation,

$V = \xi^i(x,u,u_i,\ldots)\frac{\partial}{\partial x^i} + \eta(x,u,u_i,\ldots)\frac{\partial}{\partial u}$. (41)

then the scalar pde and its adjoint equation, as defined above, admit the conservation law

$c^i = \xi^i \mathcal{L} + w\left(\frac{\partial\mathcal{L}}{\partial u_i} - D_j\left(\frac{\partial\mathcal{L}}{\partial u_{ij}}\right) + D_j D_k\left(\frac{\partial\mathcal{L}}{\partial u_{ijk}}\right) - \ldots\right) + D_j(w)\left(\frac{\partial\mathcal{L}}{\partial u_{ij}} - D_k\left(\frac{\partial\mathcal{L}}{\partial u_{ijk}}\right) + \ldots\right) + D_j D_k(w)\left(\frac{\partial\mathcal{L}}{\partial u_{ijk}} - \ldots\right)$,

where $w = \eta - \xi^i u_i$ and $\mathcal{L}$ denotes the lagrangian of the corresponding form of the beam equation. for the euler-bernoulli, rayleigh and timoshenko-prescott forms the lagrangians are, respectively,

$\mathcal{L} = q(t,x)\left(u_{tt} + \alpha\beta u_{xxxx}\right)$,
$\mathcal{L} = q(t,x)\left(\alpha\beta u_{xxxx} + u_{tt} - \beta u_{xxtt}\right)$,
$\mathcal{L} = q(t,x)\left(\alpha\beta u_{xxxx} + u_{tt} - \beta(1+\epsilon) u_{xxtt} + \frac{\epsilon\beta}{\alpha} u_{tttt}\right)$,

where $q(t,x)$ is the new dependent variable.
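the adjoint equation of each beam form is obtained as the euler-lagrange (variational) derivative of the corresponding formal lagrangian with respect to $u$. as a sketch, the following sympy snippet (our addition) computes it for the euler-bernoulli lagrangian:

```python
import sympy as sp
from sympy.calculus.euler import euler_equations

t, x = sp.symbols('t x')
alpha, beta = sp.symbols('alpha beta', positive=True)
u, q = sp.Function('u'), sp.Function('q')

# formal lagrangian of the euler-bernoulli form (the first one above)
L = q(t, x)*(sp.diff(u(t, x), t, 2) + alpha*beta*sp.diff(u(t, x), x, 4))

# variation with respect to u yields the adjoint equation for q;
# variation with respect to q recovers the original equation (8)
eq_u, eq_q = euler_equations(L, [u(t, x), q(t, x)], [t, x])
print(eq_u)   # q_tt + alpha*beta*q_xxxx = 0
print(eq_q)   # u_tt + alpha*beta*u_xxxx = 0
```

since only even-order derivatives with constant coefficients appear, the adjoint equations of all three forms have the same form as the original ones, which is what makes the substitution $q(t,x) = \phi(t,x,u)$ below workable.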
to verify the nonlinear self-adjointness, the substitution $q(t,x) = \phi(t,x,u)$ into the adjoint equation of (8), (9) and (10) must be satisfied for all solutions $u$ of those equations. the possible values of $\phi(t,x,u)$ are a constant term, say $a_0$, and $a_1 u(t,x) + a_2$, where $a_1$ and $a_2$ are arbitrary constants. a complete description of this method can be obtained from [29].

6.1. conservation laws for the various forms of the beam equation.

with respect to each of the symmetries of (8), we compute the nonzero conservation laws. for $\Gamma_{1a}$, the conservation components are

$c^t = u_x\phi_t - u_{xt}\phi(t,x,u)$,
$c^x = \phi(t,x,u)\left(u_{tt} + \alpha\beta u_{xxxx}\right) + \alpha\beta u_x\phi_{xxx} - \alpha\beta u_{xx}\phi_{xx} + \alpha\beta\phi_x u_{xxx} - \alpha\beta u_{xxxx}\phi(t,x,u)$. (42)

for $\Gamma_{2a}$,

$c^t = \phi(t,x,u)\left(u_{tt} + \alpha\beta u_{xxxx}\right) + u_t\phi_t - u_{tt}\phi(t,x,u)$,
$c^x = \alpha\beta\left(u_t\phi_{xxx} - u_{xt}\phi_{xx} - u_{xxt}\phi_x - u_{xxxt}\phi(t,x,u)\right)$. (43)

for $\Gamma_{3a}$,

$c^t = -u\phi_t + u_t\phi(t,x,u)$,
$c^x = \alpha\beta\left(u_x\phi_{xx} - u\phi_{xxx} + u_{xx}\phi_x + u_{xxx}\phi(t,x,u)\right)$. (44)

for $\Gamma_{4a}$,

$c^t = 2t\,\phi(t,x,u)\left(u_{tt} + \alpha\beta u_{xxxx}\right) + \phi_t\left(2t u_t + x u_x\right) - \phi(t,x,u)\left(2t u_{tt} + 2u_t + x u_{xt}\right)$,
$c^x = x\,\phi(t,x,u)\left(u_{tt} + \alpha\beta u_{xxxx}\right) + \alpha\beta\phi_{xxx}\left(2t u_t + x u_x\right) - \alpha\beta\phi_{xx}\left(2t u_{xt} + x u_{xx} + u_x\right) + \alpha\beta\phi_x\left(2t u_{xxt} + x u_{xxx} + 2u_{xx}\right) - \alpha\beta\phi(t,x,u)\left(2t u_{xxxt} + x u_{xxxx} + 3u_{xxx}\right)$. (45)

for $\Gamma_{5a}$,

$c^t = -a(t,x)\phi_t + a_t\phi(t,x,u)$,
$c^x = -\alpha\beta a(t,x)\phi_{xxx} + \alpha\beta a_x\phi_{xx} - \alpha\beta a_{xx}\phi_x + \alpha\beta a_{xxx}\phi(t,x,u)$. (46)

next, we compute the conservation laws of equation (9) with respect to its symmetries. $\Gamma_{1b}$ leads to the following conserved components:

$c^t = \phi(t,x,u)\left(\alpha\beta u_{xxxx} + u_{tt} - \beta u_{xxtt}\right) + u_t\phi_t - \phi(t,x,u) u_{tt}$,
$c^x = \alpha\beta\left(u_t\phi_{xxx} - u_{xt}\phi_{xx} + u_{xxt}\phi_x - u_{xxxt}\phi(t,x,u)\right)$. (47)

for $\Gamma_{2b}$,

$c^t = u_x\phi_t - u_{xt}\phi(t,x,u)$,
$c^x = \phi(t,x,u)\left(\alpha\beta u_{xxxx} + u_{tt} - \beta u_{xxtt}\right) + \alpha\beta\left(u_x\phi_{xxx} - u_{xx}\phi_{xx} + u_{xxx}\phi_x - \phi(t,x,u) u_{xxxx}\right)$. (48)

for $\Gamma_{3b}$,

$c^t = u_t\phi(t,x,u) - u\phi_t$,
$c^x = \alpha\beta\left(u_x\phi_{xx} - u\phi_{xxx} - u_{xx}\phi_x + u_{xxx}\phi(t,x,u)\right)$. (49)

for $\Gamma_{4b}$,

$c^t = b_t\phi(t,x,u) - b(t,x)\phi_t$,
$c^x = \alpha\beta\left(b_x\phi_{xx} - b(t,x)\phi_{xxx} - b_{xx}\phi_x + b_{xxx}\phi(t,x,u)\right)$. (50)

for the timoshenko-prescott form of the beam equation (10), the conserved components are as follows. for $\Gamma_{1c}$,

$c^t = \phi(t,x,u)\left(\alpha\beta u_{xxxx} + u_{tt} - \beta(1+\epsilon) u_{xxtt} + \frac{\epsilon\beta}{\alpha} u_{tttt}\right) + u_t\left(\phi_t + \frac{\epsilon\beta}{\alpha}\phi_{ttt}\right) - u_{tt}\left(\phi(t,x,u) + \frac{\epsilon\beta}{\alpha}\phi_{tt}\right) + u_{ttt}\frac{\epsilon\beta}{\alpha}\phi_t - u_{tttt}\frac{\epsilon\beta}{\alpha}\phi(t,x,u)$,
$c^x = \alpha\beta\left(u_t\phi_{xxx} - u_{xt}\phi_{xx} + u_{xxt}\phi_x - u_{xxxt}\phi(t,x,u)\right)$. (51)
for $\Gamma_{2c}$,

$c^t = u_x\left(\phi_t + \frac{\epsilon\beta}{\alpha}\phi_{ttt}\right) - u_{xt}\left(\phi(t,x,u) + \frac{\epsilon\beta}{\alpha}\phi_{tt}\right) + u_{xtt}\frac{\epsilon\beta}{\alpha}\phi_t - u_{xttt}\frac{\epsilon\beta}{\alpha}\phi(t,x,u)$,
$c^x = \phi(t,x,u)\left(\alpha\beta u_{xxxx} + u_{tt} - \beta(1+\epsilon) u_{xxtt} + \frac{\epsilon\beta}{\alpha} u_{tttt}\right) + \alpha\beta\left(u_x\phi_{xxx} - u_{xx}\phi_{xx} + u_{xxx}\phi_x - \phi(t,x,u) u_{xxxx}\right)$. (52)

for $\Gamma_{3c}$,

$c^t = -u\left(\phi_t + \frac{\epsilon\beta}{\alpha}\phi_{ttt}\right) + u_t\left(\phi(t,x,u) + \frac{\epsilon\beta}{\alpha}\phi_{tt}\right) - u_{tt}\frac{\epsilon\beta}{\alpha}\phi_t + u_{ttt}\frac{\epsilon\beta}{\alpha}\phi(t,x,u)$,
$c^x = \alpha\beta\left(u_x\phi_{xx} - u\phi_{xxx} - u_{xx}\phi_x + u_{xxx}\phi(t,x,u)\right)$. (53)

for $\Gamma_{4c}$,

$c^t = -c(t,x)\left(\phi_t + \frac{\epsilon\beta}{\alpha}\phi_{ttt}\right) + c_t\left(\phi(t,x,u) + \frac{\epsilon\beta}{\alpha}\phi_{tt}\right) - c_{tt}\frac{\epsilon\beta}{\alpha}\phi_t + c_{ttt}\frac{\epsilon\beta}{\alpha}\phi(t,x,u)$,
$c^x = \alpha\beta\left(c_x\phi_{xx} - c(t,x)\phi_{xxx} - c_{xx}\phi_x + c_{xxx}\phi(t,x,u)\right)$. (54)

7. conclusion

in this work, we focused on the algebraic properties of three different forms of the beam equation, with and without a source. for the source-free equations, we found that the euler-bernoulli equation is invariant under the lie algebra $(A_{3,3}\oplus A_1)\oplus_s \infty A_1$, while the rayleigh and timoshenko-prescott equations are invariant under the lie algebra $A_3\oplus_s \infty A_1$. in the case of an isotropic source $f(u)$, we found that the euler-bernoulli, rayleigh and timoshenko-prescott equations are invariant under the lie algebra $2A_1$ for an arbitrary source $f(u)$. moreover, for the euler-bernoulli beam equation, the admitted lie algebras are $A_3\oplus_s \infty A_1$ for the linear case $f(u) = au + b$ and $2A_1\oplus_s A_1$ for an exponential or power-law functional form.
for the other two beam equations, there are no specific functional forms of $f(u)$ for which the equations admit different algebras. furthermore, for the source-free equations, we derived the conservation laws by applying ibragimov's method. we applied the lie point symmetries to reduce the pdes and we proved that the three beam equations provide exactly the same travelling-wave solutions. the most important result of our paper is the reduction of the euler-bernoulli equation to a second-order equation of the form of the perturbed painlevé-ince equation and to a third-order equation which was studied by chazy, bureau and cosgrove. one of our subsequent papers will be on the singularity analysis of the perturbed painlevé-ince equation. moreover, our future work also includes deriving further solutions of the different forms of the beam equation using conservation laws.

acknowledgements

akh expresses grateful thanks to ugc (india), nfsc, award no. f1-17.1/2017-18/rgnf-2017-18-sc-ori-39488, for financial support, and to the late prof. k. m. tamizhmani for the discussions that formed the basis of this work. pgll acknowledges the support of the national research foundation of south africa, the university of kwazulu-natal and the durban university of technology, and thanks the department of mathematics, pondicherry university, for gracious hospitality.

references

[1] r. m. davies, g. i. taylor. a critical study of the hopkinson pressure bar. philosophical transactions of the royal society of london series a, mathematical and physical sciences 240(821):375 – 457, 1948. doi:10.1098/rsta.1948.0001.
[2] a. labuschagne, n. van rensburg, a. van der merwe. comparison of linear beam theories. mathematical and computer modelling 49(1):20 – 30, 2009. doi:10.1016/j.mcm.2008.06.006.
[3] g. e. hudson. dispersion of elastic waves in solid circular cylinders. physical review 63:46 – 51, 1943. doi:10.1103/physrev.63.46.
[4] d. bancroft. the velocity of longitudinal waves in cylindrical bars. physical review 59:588 – 593, 1941. doi:10.1103/physrev.59.588.
[5] a. h. bokhari, f. m. mahomed, f. d. zaman. symmetries and integrability of a fourth-order euler–bernoulli beam equation. journal of mathematical physics 51(5):053517, 2010. doi:10.1063/1.3377045.
[6] a. fatima, f. m. mahomed, c. m. khalique. noether symmetries and exact solutions of an euler–bernoulli beam model. international journal of modern physics b 30(28-29):1640011, 2016. doi:10.1142/s0217979216400117.
[7] d. huang, x. li, s. yu. lie symmetry classification of the generalized nonlinear beam equation. symmetry 9:115, 2017. doi:10.3390/sym9070115.
[8] c. wafo soh. euler–bernoulli beams from a symmetry standpoint-characterization of equivalent equations. journal of mathematical analysis and applications 345(1):387 – 395, 2008. doi:10.1016/j.jmaa.2008.04.023.
[9] m. tuz. the existence of symmetric positive solutions of fourth-order elastic beam equations. symmetry 11:121, 2019. doi:10.3390/sym11010121.
[10] e. l. ince. ordinary differential equations. longmans green & co, london, 1927.
[11] a. ruiz, c. muriel, j. ramirez. exact general solution and first integrals of a remarkable static euler-bernoulli beam equation. communications in nonlinear science and numerical simulation 69:261 – 269, 2018. doi:10.1016/j.cnsns.2018.09.012.
[12] p. da silva, i. freire. symmetry analysis of a class of autonomous even-order ordinary differential equations. ima journal of applied mathematics 80:1739 – 1758, 2015. doi:10.1093/imamat/hxv014.
[13] i. l. freire, p. l. da silva, m. torrisi. lie and noether symmetries for a class of fourth-order emden–fowler equations. journal of physics a: mathematical and theoretical 46(24):245206, 2013. doi:10.1088/1751-8113/46/24/245206.
[14] w. bluman, s. kumei. symmetries and differential equations. springer, new york, 1989. doi:10.1007/978-1-4757-4307-4.
[15] y. n. grigoriev, v. f. kovalev, s. meleshko, n. ibragimov. symmetries of integro-differential equations: with applications in mechanics and plasma physics. springer dordrecht, 2010.
[16] p. j. olver. applications of lie groups to differential equations. springer science & business media, new york, 2000.
[17] j. m. gere, s. p. timoshenko. mechanics of materials. pws-kent publishing company, 1997.
[18] v. v. morozov. classification of six-dimensional nilpotent lie algebras. izvestia vysshikh uchebn zavendeniĭ matematika 5:161 – 171, 1958.
[19] g. m. mubarakzyanov. on solvable lie algebras. izvestia vysshikh uchebn zavendeniĭ matematika 34:99 – 106, 1963.
[20] g. m. mubarakzyanov. classification of real structures of five-dimensional lie algebras. izvestia vysshikh uchebn zavendeniĭ matematika 32:114 – 123, 1963.
[21] g. m. mubarakzyanov. classification of solvable six-dimensional lie algebras with one nilpotent base element. izvestia vysshikh uchebn zavendeniĭ matematika 35:104 – 116, 1963.
[22] s. dimas, d. tsoubelis. sym: a new symmetry-finding package for mathematica. group analysis of differential equations pp. 64 – 70, 2004.
[23] s. dimas, d. tsoubelis. a new mathematica-based program for solving overdetermined systems of pdes. in 8th international mathematica symposium. 2006.
[24] w. t. thomson, m. d. dahleh. theory of vibration with applications. prentice-hall, new jersey, 1981.
[25] m. ablowitz, a. ramani, h. segur. nonlinear evolution equations and ordinary differential equations of painlevé type. lettere al nuovo cimento 23(9):333 – 338, 1978. doi:10.1007/bf02824479.
[26] m. j. ablowitz, a. ramani, h. segur. a connection between nonlinear evolution equations and ordinary differential equations of p-type. i. journal of mathematical physics 21(4):715 – 721, 1980. doi:10.1063/1.524491.
[27] m. j. ablowitz, a. ramani, h. segur. a connection between nonlinear evolution equations and ordinary differential equations of p-type. ii. journal of mathematical physics 21(5):1006 – 1015, 1980. doi:10.1063/1.524548.
[28] a. paliathanasis, p. g. l. leach. nonlinear ordinary differential equations: a discussion on symmetries and singularities. international journal of geometric methods in modern physics 13(07):1630009, 2016. doi:10.1142/s0219887816300099.
[29] n. h. ibragimov. nonlinear self-adjointness and conservation laws. journal of physics a: mathematical and theoretical 44(43):432002, 2011. doi:10.1088/1751-8113/44/43/432002.
[30] n. h. ibragimov. a new conservation theorem. journal of mathematical analysis and applications 333(1):311 – 328, 2007. doi:10.1016/j.jmaa.2006.10.078.
[31] r. tracina, m. bruzon, m. l. gandarias, m. torrisi. nonlinear self-adjointness, conservation laws, exact solutions of a system of dispersive evolution equations. communications in nonlinear science and numerical simulation 19:3036 – 3043, 2014. doi:10.1016/j.cnsns.2013.12.005.
[32] r. cimpoiasu, r. constantinescu. nonlinear self-adjointness and invariant solutions of a 2d rossby wave equation. central european journal of physics 12:81 – 89, 2014. doi:10.2478/s11534-014-0430-6.

acta polytechnica 61(si):77–88, 2021, doi:10.14311/ap.2021.61.0077, © 2021 the author(s), licensed under a cc-by 4.0 licence, published by the czech technical university in prague

computational methodology to analyze the effect of mass transfer rate on attenuation of leaked carbon dioxide in shallow aquifers

radek fučík (a,∗), jakub solovský (a), michelle r. plampin (b), hao wu (c), jiří mikyška (a), tissa h. illangasekare (d)

a czech technical university in prague, faculty of nuclear sciences and physical engineering, department of mathematics, trojanova 13, 12000 praha, czech republic
b u.s. geological survey, eastern energy resources science center, 12201 sunrise valley drive, reston, va 20192, usa
c virginia polytechnic institute and state university, department of geosciences, 926 west campus drive, blacksburg, va 24061, usa
d colorado school of mines, center for experimental study of subsurface environmental processes, 1500 illinois st., golden, co 80401, usa
∗ corresponding author: fucik@fjfi.cvut.cz

abstract. exsolution and re-dissolution of co2 gas within heterogeneous porous media are investigated using experimental data and mathematical modeling. in a set of bench-scale experiments, water saturated with co2 under a given pressure is injected into a 2-d water-saturated porous media system, causing co2 gas to exsolve and migrate upwards. a layer of fine sand mimicking a heterogeneity within a shallow aquifer is present in the tank to study accumulation and trapping of exsolved co2. then, clean water is injected into the system and the accumulated co2 dissolves back into the flowing water.
simulated exsolution and dissolution mass transfer processes are studied using both near-equilibrium and kinetic approaches and compared to experimental data under conditions that do and do not include lateral background water flow. the mathematical model is based on the mixed hybrid finite element method, which allows for an accurate simulation of both advection- and diffusion-dominated processes.

keywords: compositional flow, two-phase flow, kinetic mass transfer, gas exsolution, gas dissolution.

1. introduction

geologic carbon sequestration has the potential to significantly reduce greenhouse gas emissions [1], but also poses risks to groundwater resources, including the mobilization of contaminants in shallow aquifers due to leakage of co2 from deep storage formations [2]. the extent and severity of the risks depend on complex multiphase flow and transport phenomena that govern the migration of co2 through the shallow subsurface. a persistent issue with predicting these processes is the general difficulty of understanding co2 phase change (a.k.a. inter-phase mass transfer) within macroscopic porous media systems, which is important in the case of co2 due to its high solubility and potential mobility in the gas phase. if a leakage pathway is encountered, stored co2 that is originally supercritical in a deep geologic storage formation may migrate upward due to buoyancy, dissolve into water, exsolve to form a separate gas phase, and eventually re-dissolve into clean water. these interrelated processes are collectively referred to as multiphase evolution, and have recently been studied in various continuum-scale systems. experimental investigations have identified the factors that control multiphase evolution during one-dimensional (1d) vertical flow [3–5] and have qualitatively addressed the various transport phenomena during two-dimensional (2d) flow [6]. numerical modeling has been used to show the effects of 1d flow rate [7] and quasi-2d flow paths [8] on multiphase co2 evolution processes. however, numerical models have not yet been able to fully explain all of the observations from the experimental studies, particularly those that occur during 2d flow under the influence of background water flow.

the purpose of this present study is to develop a numerical modeling methodology to assess the factors that control inter-phase co2 mass transfer during migration through heterogeneous 2d porous media systems in the shallow subsurface, where co2 can only exist in the gaseous and dissolved phases. specifically, we seek to test whether the local equilibrium assumption for mass transfer adequately explains multiphase evolution, and to investigate the effect of lateral water flow on heterogeneity-enhanced gas phase accumulation. our approach is to expand the numerical model developed in [9] to include kinetic mass transfer, and then to compare the results of the simulations performed by the model to the laboratory data from experiments that build upon methods developed in [10]. this methodology forms the basis for another article [11] that investigates multiphase co2 plume dynamics in relatively large-scale synthetic aquifer systems. these investigations improve our understanding of the underlying processes that together lead to the attenuation of co2 within groundwater systems [6].
the main objective of this work is to study the exsolution and dissolution of co2 on a small scale using both experiments and numerical modeling. motivated by fundamental differences in mass transfer in the various scenarios studied in [11], our aim is to determine whether a kinetic or a simplified equilibrium mass transfer model is needed to reproduce the results observed in the experiments, and how the results are affected by changes in the flow field enforced in the experiments.

2. experiment

the experimental methodology is detailed by [10], so it will be only briefly described here. to assess the effects of background (lateral) water flow on multiphase co2 migration, two different bench-scale experiments were performed. in each experiment, co2-saturated water was injected into the bottom of a tank filled with water-saturated porous media until a gas phase formed and accumulated to some steady value. then, clean water was injected until all of the co2 dissolved and migrated out of the tank. the test system was initially reported by [10], but this study expands upon that methodology by incorporating different experimental conditions. most importantly, one experiment was performed with water flowing from right to left across the tank. as is shown in figure 1, a block of low-permeability sand was incorporated into the middle of the tank above the injection ports, and saturation sensors were installed below this block of fine sand, where gas phase co2 was expected to accumulate after exsolving from the injected co2-saturated water. the material properties of the three porous media used are given in table 3. the saturation pressure, as defined by [3], was 10 kpa for the first experiment and 15 kpa for the second, which led to dissolved co2 concentrations that were sufficiently high to cause exsolution in the porous media.

in the first experiment, the constant head devices connected to the regions of granusil #8 (which were installed to distribute the head evenly across the vertical boundaries of the main sand pack) were positioned at equal elevations. this led to a negligible background water flow across the tank, and this case is therefore referred to as the static case. in the second experiment, the left-hand constant head device was positioned at an elevation below that of the right-hand one, thus establishing background flow. in the second case, water inflow was supplied to the right-hand constant head device via a peristaltic pump, while the outflow from the left-hand constant head device was routed into a container placed on a computer-interfaced electronic scale. the scale and the saturation sensors were configured to automatically take measurements at 1-minute intervals. the resulting data from the experiments were compared against a numerical model of two-phase flow in porous media, which extends upon that of [9]. a major addition to the model is its capability to account for non-equilibrium co2 mass transfer between the aqueous and gas phases.

3. mathematical model

in this section, the mathematical model that describes the two-phase compositional flow in porous media and incorporates the phenomena studied in this work is summarized.

3.1. two-phase flow in porous media

the governing equations for the two-phase flow in porous media are based on [12–14]. the quantities corresponding to the liquid (wetting) and gas (non-wetting) phases are denoted by indices ℓ and g, respectively.
the mass balance equations for the incompressible liquid and gas phases are given by

$\rho_\ell \phi \frac{\partial S_\ell}{\partial t} + \rho_\ell \nabla\cdot\vec{v}_\ell = F_\ell$ (1a)

and

$\rho_g \phi \frac{\partial S_g}{\partial t} + \rho_g \nabla\cdot\vec{v}_g = F_g$, (1b)

respectively, where $\phi$ [−] is the material porosity and $S_\alpha$ [−], $\rho_\alpha$ [kg m⁻³], $\vec{v}_\alpha$ [m s⁻¹], $F_\alpha$ [kg m⁻³ s⁻¹] are the $\alpha$-phase saturation, density, velocity, and the sink or source term. the velocity $\vec{v}_\alpha$ is given by darcy's law,

$\vec{v}_\alpha = -\lambda_\alpha K (\nabla p_\alpha - \rho_\alpha \vec{g})$, (2)

where $\vec{g}$ [m s⁻²] is the gravitational acceleration vector, $K$ [m²] is the intrinsic permeability, $p_\alpha$ [pa] is the $\alpha$-phase pressure, $\lambda_\alpha = k_{r\alpha}/\mu_\alpha$ [pa⁻¹ s⁻¹] denotes the mobility of phase $\alpha$, where $\mu_\alpha$ [pa s] is the dynamic viscosity, and $k_{r\alpha}(S_\alpha)$ [−] denotes the relative permeability. the difference between the non-wetting and wetting phase pressures is defined as the capillary pressure $p_c = p_g - p_\ell$, and the brooks and corey model [15] is used in the form

$p_c(S_\ell) = p_d\,(S^e_\ell)^{-\frac{1}{\lambda}}$, (3)

where $p_d$ [pa] is the entry pressure, $\lambda$ [−] is the pore size distribution index, and $S^e_\alpha$ [−] denotes the effective saturation defined by

$S^e_\alpha = \frac{S_\alpha - S_{r,\alpha}}{1 - S_{r,g} - S_{r,\ell}}$, (4)

where $S_{r,\alpha}$ [−] is the residual saturation of phase $\alpha$, $\alpha\in\{\ell,g\}$.

[figure 1. setup for the small tank experiments, adapted from [10]. the diagram labels the granusil #8 (gravel) boundary regions, the granusil #20/#30 main pack (25.4 cm wide), the central 2:1 granusil #110/#250 fine block, the saturation and temperature sensors, the co2-saturated water injection ports a–f, the 1 cm headspace with the co2 gas outflow (flow meter), and the water outflows routed to the scale.]

the brooks and corey model parameters are also used in burdine's model for the relative permeability functions $k_{r,\ell}$ and $k_{r,g}$ [16], in the form

$k_{r,\ell}(S_\ell) = (S^e_\ell)^{\frac{2+3\lambda}{\lambda}}$, (5)

$k_{r,g}(S_g) = (S^e_g)^2 \left(1 - (1 - S^e_g)^{\frac{2+\lambda}{\lambda}}\right)$. (6)

based on [7, 11, 17], however, instead of $k_{r,g}$, a modified formula for the gas phase relative permeability function is used in eq. (2) in the form

$\tilde{k}_{r,g}(S_g) = \begin{cases} 0 & \text{if } S_g < S_c, \\ k_{r,g}\left(\frac{S_g - S_c}{1 - S_c}\right) & \text{otherwise}, \end{cases}$ (7)

where $S_c$ [−] denotes the critical gas saturation.
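the constitutive relations (3)-(7) are straightforward to code. the following python sketch (our illustration, not the authors' solver) implements them for the fine 2:1 granusil #110/#250 mixture of table 3, with the critical gas saturation value $S_c = 0.2$ used later in section 5.2; eq. (7) is read literally, i.e. the rescaled saturation is passed through the burdine function (6):

```python
# parameters of the 2:1 granusil #110/#250 mixture (table 3)
p_d, lam = 8027.0, 5.35      # entry pressure [pa], pore size distribution index
s_rl, s_rg = 0.17, 0.0       # residual liquid/gas saturations
s_c = 0.2                    # critical gas saturation (value used in sec. 5.2)

def s_eff(s, s_r):
    """effective saturation, eq. (4)"""
    return (s - s_r) / (1.0 - s_rg - s_rl)

def p_c(s_l):
    """brooks-corey capillary pressure, eq. (3)"""
    return p_d * s_eff(s_l, s_rl) ** (-1.0 / lam)

def k_rl(s_l):
    """burdine liquid relative permeability, eq. (5)"""
    return s_eff(s_l, s_rl) ** ((2.0 + 3.0 * lam) / lam)

def k_rg(s_g):
    """burdine gas relative permeability, eq. (6)"""
    se = s_eff(s_g, s_rg)
    return se**2 * (1.0 - (1.0 - se) ** ((2.0 + lam) / lam))

def k_rg_mod(s_g):
    """modified gas relative permeability with critical saturation, eq. (7)"""
    return 0.0 if s_g < s_c else k_rg((s_g - s_c) / (1.0 - s_c))

if __name__ == "__main__":
    for s_g in (0.1, 0.25, 0.5):
        print(s_g, p_c(1.0 - s_g), k_rl(1.0 - s_g), k_rg_mod(s_g))
```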
3.2. component transport

the gas phase is considered as a single component (pure co2), whereas the liquid phase is assumed to be a two-component mixture (water and co2). the compositional balance equation for co2 dissolved in the liquid phase is included into (1) and (2) as

$\phi\rho_\ell \frac{\partial (S_\ell X)}{\partial t} + \rho_\ell \nabla\cdot\left(X\vec{v}_\ell - \tau\phi S_\ell D_\ell \nabla X\right) = F_X$, (8)

[14], where $X$ [−] is the mass fraction of co2, $F_X$ [kg m⁻³ s⁻¹] is the sink or source term, $D_\ell$ [m² s⁻¹] is the free molecular diffusion of co2 in water, $D_\ell = 1.92\cdot 10^{-9}$ m² s⁻¹, and $\tau_\ell$ [−] is the tortuosity given by $\tau_\ell = \phi^{1/3} S_\ell^{7/3}$, based on [18].

3.3. kinetic mass transfer

based on [19], the kinetic mass transfer of co2 between both phases (i.e., the dissolution and exsolution processes) is represented by

$-F_g = F_\ell = F_X = k(c_s - X\rho_\ell)$, (9)

where $k$ [s⁻¹] is the lumped mass transfer rate coefficient and $c_s$ [kg m⁻³] is the saturated co2 concentration in water at the relevant pressure and temperature, given by henry's law in the form

$c_s = \frac{p_g}{K_H} M_g$, (10)

where $M_g$ [kg mol⁻¹] is the molar mass of co2, $M_g = 44.01$ g mol⁻¹, and $K_H$ [pa mol⁻¹ m³] is henry's constant, for which the van't hoff equation is employed in the form

$K_H = K_{H,\mathrm{ref}}\, e^{-C\left(\frac{1}{T} - \frac{1}{T_{\mathrm{ref}}}\right)}$, (11)

where $T$ [k] is the temperature, $K_{H,\mathrm{ref}}$ is the value of henry's constant at a reference temperature $T_{\mathrm{ref}}$ [k], and $C$ [k] is the gas-specific constant, i.e., $K_{H,\mathrm{ref}} = 2979.97$ pa mol⁻¹ m³, $T_{\mathrm{ref}} = 298.15$ k, and $C = 2400$ k [20]. the average temperatures in the static case and background flow experiments were 37 °c and 26 °c, respectively.
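for orientation, the saturated concentration (10)-(11) is easy to evaluate with the constants above. the snippet below (our illustration, with an assumed gas pressure of one atmosphere) prints $c_s$ at the two average experiment temperatures:

```python
import math

M_g = 44.01e-3               # molar mass of co2 [kg/mol]
K_H_REF = 2979.97            # henry's constant at t_ref [pa mol^-1 m^3]
T_REF, C = 298.15, 2400.0    # reference temperature [k], gas-specific constant [k]

def henry_constant(T):
    """van't hoff temperature correction, eq. (11)"""
    return K_H_REF * math.exp(-C * (1.0 / T - 1.0 / T_REF))

def c_sat(p_g, T):
    """saturated dissolved co2 concentration [kg/m^3], eq. (10)"""
    return p_g / henry_constant(T) * M_g

# saturation concentrations at an assumed gas pressure of 101325 pa for the
# average experiment temperatures, 37 and 26 degrees c
for T in (310.15, 299.15):
    print(T, c_sat(101325.0, T))
```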
4. numerical method and implementation

the governing equations are solved using a general numerical solver based on the mixed-hybrid finite element method described in [9]. the mixed-hybrid finite element method combines velocity discretizations in the lowest-order raviart–thomas–nédélec space with piecewise constant approximations for the scalar variables. the numerical method can be used for an accurate simulation of degenerate diffusion or advection-dominated problems like the one discussed here.

4.1. numerical method

the numerical method solves a system of $n$ partial differential equations in the coefficient form in a $d$-dimensional polygonal domain $\Omega\subset\mathbb{R}^d$ and a time interval $[0, T_{\mathrm{fin}}]$:

$\sum_{j=1}^{n} N_{i,j}\frac{\partial Z_j}{\partial t} + \sum_{j=1}^{n} \vec{u}_{i,j}\cdot\nabla Z_j + \nabla\cdot\left(m_i(\vec{q}_i + \vec{w}_i)\right) = f_i$, (12a)

with

$\vec{q}_i = -\sum_{j=1}^{n} D_{i,j}\nabla Z_j$, (12b)

where $Z_j = Z_j(t,\vec{x})$, $j = 1,2,\ldots,n$, represent the unknown variables, $\vec{x}\in\Omega$, $t\in[0,T_{\mathrm{fin}}]$. eq. (12) is further supplemented with dirichlet or neumann boundary conditions, or a combination thereof on different parts of the boundary [9]. here, we restrict ourselves to the case $d = 2$, which is relevant for the problems presented in this paper.

in brief, the mixed hybrid finite element discretization approximates the unknown scalar functions $Z_j$ in the space of piecewise constant functions, i.e.,

$Z_j \approx \sum_{A\in\mathcal{A}_h} Z_{j,A}\varphi_A$, (13)

and the vector function $\vec{q}_i$ from eq. (12b) in the lowest-order raviart–thomas–nédélec space $\mathrm{RTN}_0$ as

$\vec{q}_i \approx \sum_{A\in\mathcal{A}_h} \chi_A \sum_{E\in\mathcal{E}_A} q_{i,A,E}\,\vec{\omega}_{A,E}$, (14)

where $\mathcal{A}_h$ denotes the set of triangles discretizing the computational domain $\Omega$, $h$ is the mesh size defined as the largest ball diameter circumscribed around elements in $\mathcal{A}_h$, $\mathcal{E}_A$ is the set of all sides of an element $A\in\mathcal{A}_h$, $\varphi_A$ are the piecewise constant basis functions defined by $\varphi_A = \chi_A$, where $\chi_A$ is the characteristic function of $A$ defined by

$\chi_A(\vec{x}) = \begin{cases} 1 & \vec{x}\in A, \\ 0 & \vec{x}\notin A, \end{cases}$ (15)

and $\vec{\omega}_{A,E}$ are the vector basis functions of $\mathrm{RTN}_0(A)$ [9]. eq. (12b) and the discretization given by eq. (14) allow to express the coefficients $q_{i,A,E}$ as

$q_{i,A,E} = \sum_{j\in\sigma_{i,A}}\left(b_{i,j,A,E} Z_{j,A} - \sum_{F\in\mathcal{E}_A} b_{i,j,A,E,F} Z_{j,F}\right)$, (16)

where $Z_{j,E}$ are the traces of $Z_j$ on the side $E\in\mathcal{E}_A$, $\sigma_{i,A}\subseteq\{1,2,\ldots,n\}$ denotes the set of all indices $j$ for which $D_{i,j}$ is non-zero on the element $A$, and the definitions of the coefficients $b_{i,j,A,E}$ and $b_{i,j,A,E,F}$ are given in [9]. the approximation of $\vec{q}_i$ in $\mathrm{RTN}_0$ given by eqs. (14) and (16), together with the piecewise constant approximation of eq. (12a), allows to express the cell averages $Z_{j,A}$ as a local linear combination of the traces $Z_{j,E}$, $E\in\mathcal{E}_A$, on each element $A\in\mathcal{A}_h$, see [9] for details. hence, $q_{i,A,E}$ can be expressed solely as a linear function of the traces $Z_{j,E}$, $\forall E\in\mathcal{E}_A$. across all interior sides $E$ of the triangulation $\mathcal{A}_h$, the balance of conservative fluxes is given by

$\sum_{\{A:\, E\in\mathcal{E}_A\}} m_{i,E}\left(\vec{q}_{i,A,E} + \vec{w}_{i,A,E}\right) = 0$, (17)

where a single (upwinded) value $m_{i,E}$ is assumed at $E$, see [9]; thus it can be eliminated from eq. (17) if it has a non-zero value, to produce

$\sum_{\{A:\, E\in\mathcal{E}_A\}} \vec{q}_{i,A,E} + \vec{w}_{i,A,E} = 0$. (18)

when $m_{i,E} = 0$ in eq. (17) for some side $E$, eq. (18) is also considered to assure that the resulting system of linear equations, given in the vector form as

$\mathbb{M}^{\ell}\,\vec{Z}^{\ell+1} = \vec{b}^{\ell}$, (19)

is non-singular. in eq. (19), $\mathbb{M}$ is a sparse, positive definite matrix [9] and $\vec{b}$ is the right-hand side, both evaluated on the previous time level $\ell$, and $\vec{Z}$ is the vector containing the side traces $Z_{j,E}$ on the next time level $\ell+1$ for all $j = 1,2,\ldots,n$ and all sides $E$ in $\mathcal{A}_h$.

4.2. problem formulation

the system of governing equations given by eqs. (1), (2), (8), and (9) is represented by (12) using $n = 3$, $d = 2$, $Z_1 = p_c$, $Z_2 = p_g$, $Z_3 = X$, and

$\left(N_{i,j}\right)_{i,j\in\hat{3}} = \begin{pmatrix} -\phi\rho_\ell \frac{dS_\ell}{dp_c} & 0 & 0 \\ -\phi\rho_g \frac{dS_\ell}{dp_c} & 0 & 0 \\ 0 & 0 & \phi S_\ell \rho_\ell \end{pmatrix}$, (20)

$\left(\vec{u}_{i,j}\right)_{i,j\in\hat{3}} = \begin{pmatrix} \vec{0} & \vec{0} & \vec{0} \\ \vec{0} & \vec{0} & \vec{0} \\ \vec{0} & \vec{0} & \rho_\ell\vec{v}_\ell \end{pmatrix}$, (21)

$\left(m_i\right)_{i\in\hat{3}} = \begin{pmatrix} \rho_\ell \frac{\lambda_\ell}{\lambda_\ell+\lambda_g} \\ \rho_g \frac{\lambda_g}{\lambda_\ell+\lambda_g} \\ \rho_\ell \end{pmatrix}$, (22)

$\left(D_{i,j}\right)_{i,j\in\hat{3}} = \begin{pmatrix} (\lambda_\ell+\lambda_g)K & -(\lambda_\ell+\lambda_g)K & 0 \\ 0 & (\lambda_\ell+\lambda_g)K & 0 \\ 0 & 0 & \tau\phi S_\ell D_\ell \end{pmatrix}$, (23)

$\left(\vec{w}_i\right)_{i\in\hat{3}} = \begin{pmatrix} -(\lambda_\ell+\lambda_g)\rho_\ell K\vec{g} \\ (\lambda_\ell+\lambda_g)\rho_g K\vec{g} \\ \vec{0} \end{pmatrix}$, (24)

$\left(f_i\right)_{i\in\hat{3}} = \begin{pmatrix} -F_\ell \\ F_g \\ F_\ell - X F_\ell \end{pmatrix}$, (25)

where $\hat{3} = \{1,2,3\}$.

the computational domain $\Omega$ depicted in figure 2 is discretized using three gradually refined triangular meshes generated by gmsh [21]. the mesh properties are listed in table 2, and figure 2 shows the coarsest mesh.

table 1. properties of fluids.
parameter            | units        | liquid (h2o) | gas (co2)
density ρ            | kg m⁻³       | 997.78       | 1.98
dynamic viscosity μ  | 10⁻⁵ pa s    | 87.2         | 1.48

table 2. properties of meshes.
                 | mesh ∆1 | mesh ∆2 | mesh ∆3
elements         | 5 494   | 18 587  | 72 949
mesh size h [mm] | 5.55    | 2.78    | 1.43

[figure 2. computational domain with the coarse mesh, based on figure 1; the boundary segments are labelled γ1–γ6, γw, γe, γs and γn.]

the numerical method is implemented in an in-house computer code using c++. the applicability of the numerical method to heterogeneous porous media is further discussed in [22], together with parallel implementations of the method on gpu [9] or on cpu using mpi [23].

5. results and discussion

5.1. computational study

after the physical domain and boundaries were set up to mimic the experiments, several simulations were performed for each experimental case (i.e., the static one and the one with background flow), using several different values of $k$. the boundary conditions of the model included hydrostatic pressure in the inflow/outflow ports on the left- and right-hand sides of the tank (see figure 1). for the experiment with background flow, the pressure on the right-hand side was increased to enforce a lateral pressure gradient and thus a background flow through the tank. the pressure difference across the tank was fitted to match the measured outflow rates from the experiment. the injection of co2-saturated and clean water was represented by a neumann boundary condition at the injection port. no-flow boundary conditions were prescribed along the remaining parts of the computational domain boundary.

several additional simulations were performed to investigate the effect of different background water flow rates on the gas accumulation below the heterogeneity. all of these simulations were performed with a high (near-equilibrium) mass transfer rate of $k = 1$ s⁻¹. the simulation representing the experiment with background flow was conducted with a pressure difference of 5 pa, while the other simulations were performed with pressure differences of 2, 4, 6, 7, 8, and 9 pa.
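as an illustration of this boundary setup (a sketch of ours, not the authors' code, with an assumed atmospheric reference pressure at the top of the boundary), the hydrostatic dirichlet values with the lateral offset can be written as:

```python
RHO_L, G = 997.78, 9.81      # liquid density [kg/m^3] (table 1), gravity [m/s^2]

def boundary_pressure(y, y_top, p_top=101325.0, dp=0.0):
    """hydrostatic liquid pressure [pa] at elevation y on a lateral boundary;
    dp is the extra offset enforcing background flow"""
    return p_top + dp + RHO_L * G * (y_top - y)

# static case: dp = 0 on both sides; background-flow case: dp = 5.0 pa on the
# right-hand boundary, matching the fitted pressure difference of 5 pa
```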
5. results and discussion

5.1. computational study

after the physical domain and boundaries were set up to mimic the experiments, several simulations were performed for each experimental case (i.e., the static one and the one with background flow), using several different values for $k$. the boundary conditions of the model included hydrostatic pressure in the inflow/outflow ports on the left- and right-hand sides of the tank (see figure 1). for the experiment with background flow, the pressure on the right-hand side was increased to enforce a lateral pressure gradient and thus a background flow through the tank. the pressure difference across the tank was fitted to match the measured outflow rates from the experiment. injection of co2-saturated and clean water was represented by a neumann boundary condition at the injection port. no-flow boundary conditions were prescribed along the remaining parts of the computational domain boundary.

several additional simulations were performed to investigate the effect of different background water flow rates on gas accumulation below the heterogeneity. all of these simulations were performed with a high (near-equilibrium) mass transfer rate of $k = 1$ s⁻¹. the simulation representing the experiment with background flow was conducted with a pressure difference of 5 pa, while the other simulations were performed with pressure differences of 2, 4, 6, 7, 8, and 9 pa.

5.2. numerical convergence

the mesh effects on the numerical solution are investigated using the three gradually refined triangular meshes described in table 2. the gas saturation profiles at ports a, b, and c for $k = 0.5$ s⁻¹ and $s_c = 0.2$ for both the static and background flow cases are compared in figure 3. the simulated amount of gas present at ports a, b, and c is slightly overestimated on coarser meshes with respect to finer meshes; however, these results are expected due to the first order of convergence of the numerical method, as reported in [9].

table 3. material properties.

parameter                               | units | granusil #8 | granusil #20/30 | 2:1 mixture of granusil #110 and #250
porosity φ                              | -     | 0.4         | 0.32            | 0.35
intrinsic permeability k                | m²    | 1×10⁻⁹      | 2.30×10⁻¹⁰      | 6.36×10⁻¹⁴
pore size distribution index λ          | -     | 4.275       | 7.33            | 5.35
entry pressure pd                       | pa    | 600         | 1200            | 8027
residual liquid phase saturation sr,ℓ   | -     | 0.084       | 0.084           | 0.17
residual gas phase saturation sr,g      | -     | 0           | 0               | 0

figure 3. illustration of mesh resolution effects on the numerical solution of the gas saturation evolution for k = 0.5 s⁻¹ and sc = 0.2 at the sampling ports a (a, b), b (c, d), and c (e, f) for both static (a, c, e) and background flow (b, d, f) cases. the mesh properties are given in table 2.
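the first-order convergence mentioned above can be checked on a scalar quantity of interest (for example, the gas saturation at a port at a fixed time) computed on the three meshes of table 2, whose sizes are roughly halved between refinements. the sketch below, with purely illustrative numbers, estimates the observed order:

```python
import math

def observed_order(q_coarse, q_mid, q_fine, ratio=2.0):
    """richardson-type estimate of the convergence order from solutions
    on three meshes with the mesh size reduced by `ratio` between levels."""
    return math.log(abs(q_coarse - q_mid) / abs(q_mid - q_fine)) / math.log(ratio)

# with first-order convergence, successive differences shrink by ~ratio:
print(observed_order(0.230, 0.215, 0.2075))  # -> 1.0 (illustrative values)
```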
5.3. discussion

in this section, the comparison between the experimental data and the numerical results is presented, and the mass transfer and accumulation processes are discussed for both the static and background flow cases.

figure 4 shows the gas saturation results from the experiments and simulations for the static case and the case with background flow. the experimental data shown in (a) are from port b (directly above the injection port, as co2-saturated water was injected into the left injection port in this case), while those in (b) are from port a (also above the injection port, as co2-saturated water was injected into the right injection port in this case). both experiments show the accumulation of gas phase co2, then re-dissolution during the injection of clean water after the co2 saturations had stabilized.

figure 4. results of the experiments and numerical simulations for (a) the static case and (b) the case with background flow; model curves for k = 0.01, 0.05, 0.1, 0.5, 1, and 2 s⁻¹ are compared with the experiment at ports a, b, and c.

the simulation results for various choices of $k$ show that the best match with the experimental data is achieved for rather large values of the mass transfer coefficient, and that the simulated results are almost the same for values greater than a certain threshold of 0.1 to 0.5 s⁻¹. this indicates that the mass transfer processes are at near-equilibrium and that the background flow is not strong enough to substantially affect the co2 exsolution and dissolution rates in these cases. on the other hand, the numerical results show that the presence of the background flow slows down the accumulation dynamics below the fine layer, as indicated by the different slopes of the saturation curves in figure 4c compared to figure 4b. note that because different injection ports were used in each experiment, the corresponding ports are shifted by one, i.e., results from port b for the static case are compared to results from port a for the background flow case (see the gas saturation and co2 mass fraction spatial evolution shown in figures 6 and 7).

figure 6. spatial distribution of sg (left) and x (right) for the static case at 1 h, 10 h, and 20 h (from top to bottom), computed using k = 0.5 s⁻¹ and sc = 0.2.

figure 7. spatial distribution of sg (left) and x (right) for the background flow case at 1 h, 10 h, and 20 h (from top to bottom), computed using k = 0.5 s⁻¹ and sc = 0.2.

further investigation of the background flow effect on the exsolution process can be done by analyzing the results of the simulations performed with different background flow rates, as shown in figure 5. the results show a wide range of different gas accumulation behaviors, with higher flow rates leading to lower and slower exsolution due to a different gas phase distribution around the injection ports. this indicates that the flow field significantly affects the multiphase co2 evolution; however, experimental data for these scenarios would be needed to quantify the impact.

figure 5. results of the simulations performed with various pressure differences applied across the tank, shown for port a. the experiment with background flow corresponds to the simulation with a pressure difference of 5 pa.

while the analysis above indicated that near-equilibrium behavior occurred in the experiments, further experimental data are needed to determine in general in which cases the equilibrium mass transfer simplification is valid. both experiments considered in this work are on a small scale, and the measurement ports are close to the injection. also, the difference between the static and background flow cases is not very substantial. addressing these remaining knowledge gaps is one of the primary goals in [11], where non-equilibrium mass transfer behavior is observed in a large laboratory-scale system with background flow, and fundamentally different results are observed as compared with predominantly one-dimensional flows. this indicates that, for the decision between the kinetic and equilibrium models, both the flow field and the scale need to be taken into account. additionally, the flow field and scale are not the only factors controlling the multiphase evolution, as also indicated in [11], which showed that temperature fluctuations can also significantly affect mass transfer processes.
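the selection of the mass transfer coefficient described above amounts to comparing simulated and measured saturation histories for each candidate k. a minimal sketch of such a comparison (names and data layout are ours; the candidate values mirror those in figure 4):

```python
import numpy as np

CANDIDATES = (0.01, 0.05, 0.1, 0.5, 1.0, 2.0)  # s^-1, as in figure 4

def best_k(simulated: dict, measured: np.ndarray) -> float:
    """pick the candidate k whose simulated saturation history (sampled on
    the same time grid as `measured`) has the smallest rms deviation."""
    rmse = {k: float(np.sqrt(np.mean((np.asarray(simulated[k]) - measured) ** 2)))
            for k in CANDIDATES}
    return min(rmse, key=rmse.get)
```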
6. conclusions

to assess the potential risks of groundwater contamination via leakage of stored co2 from deep geologic carbon sequestration sites, we must be able to predict multiphase co2 transport through shallow aquifers. to aid in this effort, an innovative numerical model was applied and tested against data from well-controlled, small-scale laboratory experiments. beyond conventional continuum-based two-phase flow in porous media, the model incorporated the ability to simulate non-equilibrium co2 mass transfer between the aqueous and gaseous phases during exsolution and dissolution processes.

the experimental data used in this work clearly showed equilibrium or near-equilibrium behavior for both the static and background flow cases. these findings are in contrast with the non-equilibrium behavior observed on a larger scale for scenarios with background flow. this indicates that the mass transfer in similar scenarios is not controlled by the flow field only. further investigation of the mass transfer is needed to understand for which scenarios the kinetic model is needed and when the equilibrium assumption is sufficient. the computational methodology developed here could be used to assess both equilibrium and non-equilibrium processes in these future scenarios.

7. acknowledgments

the work reported in this paper was supported by the ministry of education, youth and sports of the czech republic under the op rde grant number cz.02.1.01/0.0/0.0/16_019/0000778 centre for advanced applied sciences and inter-excellence grant no. ltausa19021. any use of trade, firm, or product names in this publication is for descriptive purposes only and does not imply endorsement by the u.s. government.

references

[1] s. pacala, r. socolow. stabilization wedges: solving the climate problem for the next 50 years with current technologies. science 305(5686):968–972, 2004. doi:10.1126/science.1100103.

[2] j. a. apps, l. zheng, y. zhang, et al. evaluation of potential changes in groundwater quality in response to co2 leakage from deep geologic storage. transport in porous media 82(1):215–246, 2010. doi:10.1007/s11242-009-9509-8.

[3] m. r. plampin, t. h. illangasekare, t. sakaki, r. j. pawar. experimental study of gas evolution in heterogeneous shallow subsurface formations during leakage of stored co2. international journal of greenhouse gas control 22:47–62, 2014. doi:10.1016/j.ijggc.2013.12.020.

[4] m. r. plampin, r. n. lassen, t. sakaki, et al. heterogeneity-enhanced gas phase formation in shallow aquifers during leakage of co2-saturated water from geologic sequestration sites. water resources research 50(12):9251–9266, 2014. doi:10.1002/2014wr015715.
[5] t. sakaki, m. r. plampin, r. pawar, et al. what controls carbon dioxide gas phase evolution in the subsurface? experimental observations in a 4.5 m-long column under different heterogeneity conditions. international journal of greenhouse gas control 17:66–77, 2013. doi:10.1016/j.ijggc.2013.03.025.

[6] m. r. plampin, m. l. porter, r. j. pawar, t. h. illangasekare. intermediate-scale experimental study to improve fundamental understanding of attenuation capacity for leaking co2 in heterogeneous shallow aquifers. water resources research 53(12):10121–10138, 2017. doi:10.1002/2016wr020142.

[7] m. l. porter, m. plampin, r. pawar, t. illangasekare. co2 leakage in shallow aquifers: a benchmark modeling study of co2 gas evolution in heterogeneous porous media. international journal of greenhouse gas control 39:51–61, 2015. doi:10.1016/j.ijggc.2015.04.017.

[8] r. n. lassen, m. r. plampin, t. sakaki, et al. effects of geologic heterogeneity on migration of gaseous co2 using laboratory and modeling investigations. international journal of greenhouse gas control 43:213–224, 2015. doi:10.1016/j.ijggc.2015.10.015.

[9] r. fučík, j. klinkovský, j. solovský, et al. multidimensional mixed-hybrid finite element method for compositional two-phase flow in heterogeneous porous media and its parallel implementation on gpu. computer physics communications 238:165–180, 2019. doi:10.1016/j.cpc.2018.12.004.

[10] m. r. plampin, m. porter, r. pawar, t. h. illangasekare. multi-scale experimentation and numerical modeling for process understanding of co2 attenuation in the shallow subsurface. energy procedia 63:4824–4833, 2014. doi:10.1016/j.egypro.2014.11.513.

[11] j. solovský, r. fučík, m. r. plampin, et al. dimensional effects of inter-phase mass transfer on attenuation of structurally trapped gaseous carbon dioxide in shallow aquifers. journal of computational physics 40, 2020. doi:10.1016/j.jcp.2019.109178.

[12] p. bastian. numerical computation of multiphase flows in porous media. habilitation thesis, kiel university, 2000.

[13] r. helmig. multiphase flow and transport processes in the subsurface, a contribution to the modelling of hydrosystems. springer, 1997.

[14] k. mosthaf, k. baber, b. flemisch, et al. a coupling concept for two-phase compositional porous-medium and single-phase compositional free flow. water resources research 47(10), 2011. doi:10.1029/2011wr010685.

[15] r. brooks, a. corey. hydraulic properties of porous media. colorado state university, hydrology paper 3:27, 1964.

[16] n. burdine. relative permeability calculations from pore size distribution data. journal of petroleum technology 5:71–78, 1953. doi:10.2118/225-g.

[17] i. tsimpanogiannis, y. c. yortsos. the critical gas saturation in a porous medium in the presence of gravity. journal of colloid and interface science 270:388–395, 2004. doi:10.1016/j.jcis.2003.09.036.

[18] r. j. millington, j. p. quirk. permeability of porous solids. transactions of the faraday society 57:1200–1207, 1961.

[19] j. niessner, s. m. hassanizadeh. modeling kinetic interphase mass transfer for two-phase flow in porous media including fluid-fluid interfacial area. transport in porous media 80:329–344, 2009. doi:10.1007/s11242-009-9358-5.
[20] r. sander. compilation of henry's law constants for inorganic and organic species of potential importance in environmental chemistry. max-planck institute of chemistry, mainz, germany, 1999.

[21] c. geuzaine, j.-f. remacle. gmsh: a 3-d finite element mesh generator with built-in pre- and post-processing facilities. international journal for numerical methods in engineering 79(11):1309–1331, 2009. doi:10.1002/nme.2579.

[22] j. solovský, r. fučík. mass lumping for mhfem in two phase flow problems in porous media. in f. a. radu, k. kumar, i. berre, et al. (eds.), numerical mathematics and advanced applications enumath 2017, pp. 635–643. springer international publishing, 2019. doi:10.1007/978-3-319-96415-7_58.

[23] j. solovský, r. fučík. a parallel mixed-hybrid finite element method for two phase flow problems in porous media using mpi. computer methods in materials science 17:84–93, 2017.

multifractal image analysis of electrostatic surface microdischarges

t. ficker

abstract: the multifractal image analysis of lichtenberg figures has confirmed a self-similar arrangement of surface streamers belonging to the special case of electrostatic separation discharges propagating along the surface of polymeric dielectrics.

keywords: microdischarges, random processes, multifractals, electrets.

1 introduction

the electrostatic discharges arising during separation of a charged sheet of a highly resistive material from a grounded object often assume the form of surface discharges. in such cases charge carriers are forced to propagate along a highly resistive dielectric surface where they are trapped and create latent invisible tracks which can be visualized by means of the well-known powder technique. the visible powder tracks called lichtenberg figures [1] have been used many times [2–11] to study gaseous discharges on the surface of dielectrics. the traditional experimental arrangement for studying surface discharges in the form of lichtenberg figures consists of a point-to-plane electrode system and employs short voltage pulses on the micro- or nanosecond time scale. our experimental arrangement differs somewhat from the traditional one, since the subject of the present investigation is not classical corona discharges but microscopic electrostatic discharges occurring between a charged dielectric surface and a grounded metallic electrode. these microdischarges appear in the narrow wedge-shaped air gap when the dielectric is separating from the electrode.

the morphology of lichtenberg figures has been a subject of interest many times in the past. with the advent of fractal geometry, morphological studies became more sophisticated and more exact.
the pioneering work of niemeyer, pietronero and wiesmann [6] showed that a random structure of positive corona streamers forms a fractal pattern on the dielectric surface and that these patterns can be modeled on a computer. these facts have been verified many times since by other authors [9, 10, 11–18]. this paper analyses the lichtenberg figures created by electrostatic separation discharges. the general multifractal formalism [19–41] has been used to perform the multifractal image analysis of the surface discharge patterns: the discharge figures have been scanned with a resolution of 120 dpi and their digital images have been processed, i.e., the channel structure has been extracted from the background and then subjected to software multifractal analysis.

2 experimental arrangement

the experimental arrangement consists of a sandwiched plane-to-plane electrode system and a dc high voltage applied for various time periods ranging from several minutes to many hours. polyethyleneterephthalate (pet) sheets 0.180 mm thick were pressed between bronze electrodes of diameters ∅1 = 20 mm and ∅2 = 40 mm. the smaller electrode was loaded with a negative electric potential of −8.5 kv while the larger electrode was grounded. this resembles the arrangement used for poling electrets, and the chosen highly resistive dielectric samples – pet sheets – are also good electrets. however, charging of the samples is not performed at higher temperatures as with usual electret poling but at common room temperatures. when the sheet of polymer is separated from the grounded plane electrode, the electrostatic discharges ‘draw’ latent lichtenberg figures on the surface of the polymer.

2.1 computational model

since the 1980s, multifractal calculations [15–19] have been employed as the basic tool for morphological studies of complex objects embedded mostly in euclidean space. for multifractal analysis, the euclidean space of topological dimension $E$ is partitioned into an $E$-dimensional grid whose basic cell is of a linear size $\varepsilon$ (an arrangement necessary for the box-counting method). one of the topological partition sums used in this field is defined by the probability moments

$$M_q(\varepsilon) = \sum_i p_i^q(\varepsilon), \qquad p_i(\varepsilon) = \frac{n_i}{N}, \qquad \sum_i p_i(\varepsilon) = 1, \qquad q \in (-\infty, \infty). \qquad (1)$$

the symbol $n_i$ represents the number of points in the $i$-th cell and $N$ is the number of all points of the object studied. the goal of multifractal analysis is to determine one of the three multifractal spectra. the most frequently used spectrum is that of the generalized dimensions

$$D_q = \lim_{\varepsilon \to 0} \frac{1}{q-1}\,\frac{\ln M_q(\varepsilon)}{\ln \varepsilon}. \qquad (2)$$

the present fractal objects are in the form of graphical bitmap files created by digitizing the pictures. the plane of graphical pixels (points) representing the fractal object is covered with a two-dimensional grid whose basic cell is of linear size $\varepsilon$ pixels. using this grid, the partition sum (1) is computed.
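a minimal python sketch of the box-counting evaluation of the partition sum (1) for a binary (black-and-white) image follows; the q = 1 case, which requires the usual entropy limit, is omitted. function names are ours, not those of the multifran program.

```python
import numpy as np

def partition_sum(image: np.ndarray, eps: int, q: float) -> float:
    """m_q(eps) of eq. (1): image is a 2-d boolean array whose true
    pixels are the points of the object; eps is the cell size in pixels."""
    ys, xs = np.nonzero(image)
    n_total = len(xs)
    counts = {}                       # n_i per occupied eps-cell
    for cx, cy in zip(xs // eps, ys // eps):
        counts[(cx, cy)] = counts.get((cx, cy), 0) + 1
    p = np.array(list(counts.values()), dtype=float) / n_total
    return float(np.sum(p ** q))
```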
since the covering of the plane with a grid is arbitrary and the position of the grid should not influence the results, several positions of the grid were used to find the average value of the partition sum $M_q(\varepsilon)$. for each $\varepsilon$-grid there are $\varepsilon^2$ independent coverings generated by shifting the grid origin within the first $\varepsilon$-cell:

$$\overline{M}_q(\varepsilon) = \frac{1}{\varepsilon^2} \sum_{j=1}^{\varepsilon^2} M_q^{(j)}(\varepsilon). \qquad (3)$$

such a procedure requires the fractal set to be embedded in a larger grid that allows one to move the origin without losing any part of the fractal object. the averages $\overline{M}_q(\varepsilon)$ are estimated for a series of $\varepsilon$-grids and the slopes in the bilogarithmic plot $(\ln \varepsilon, \ln \overline{M}_q(\varepsilon))$ are calculated using the linear regression method. these slopes divided by the corresponding $(q-1)$ values represent the generalized dimensions $D_q$. different $D_q$ values for an analyzed object indicate multifractal behavior, while identical values signify simple fractal features. the algorithm described above has been implemented by means of the software tool delphi. the resulting computer program multifran runs on the nt system or on windows.
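the whole procedure, averaging over the ε² shifted grid origins of eq. (3) and regressing ln M̄_q against ln ε to obtain D_q from eq. (2), can be sketched as follows (again an illustration in python under the stated definitions, not the delphi implementation):

```python
import numpy as np

def averaged_partition_sum(image, eps, q):
    """mean of m_q(eps) over the eps**2 grid origins, eq. (3)."""
    ys, xs = np.nonzero(image)
    n_total = len(xs)
    total = 0.0
    for dy in range(eps):
        for dx in range(eps):
            counts = {}
            for cx, cy in zip((xs + dx) // eps, (ys + dy) // eps):
                counts[(cx, cy)] = counts.get((cx, cy), 0) + 1
            p = np.array(list(counts.values()), dtype=float) / n_total
            total += np.sum(p ** q)
    return total / eps ** 2

def generalized_dimension(image, q, sizes=(2, 4, 8, 16, 32)):
    """d_q: slope of the bilogarithmic plot divided by (q - 1); q != 1."""
    ln_eps = [np.log(e) for e in sizes]
    ln_m = [np.log(averaged_partition_sum(image, e, q)) for e in sizes]
    slope = np.polyfit(ln_eps, ln_m, 1)[0]
    return slope / (q - 1)
```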
3 results and discussion

figs. 1 and 2 show surface structures ‘drawn’ by electrostatic separation microdischarges appearing after electret poling (74.5 hours at 8.5 kv), when the saturated electret state has been reached.

fig. 1: electrostatic discharge structure for a poling time of 1 min and a voltage of −8.5 kv

fig. 2: surface streamers extracted from fig. 1

to our knowledge, the first author to report the channel structure of electrostatic separation discharges on the surface of polyethyleneterephthalate was bertein [3]. she presented a picture (fig. 7 in [3]) showing the ‘charge distribution obtained when a negatively charged foil of mylar is removed far from an earthed plate’. the clearly depicted and ramified channel structure is very similar to that in our fig. 1. similar pictures are also available in the detailed study of separation discharges published by takahashi, fujii, wakabayashi, hirano and kobayashi [8].

a characteristic feature of the morphology of positive streamer channels is their ramification (fig. 2). at first sight it is apparent that the branching of the surface channels determines the geometry of the structure. abundant ramification, when the branches thoroughly fill the surface, leads to a geometrical structure whose dimension d approaches that of a plane (d = 2). on the other hand, at poor ramification, when branches arise sparsely and the structure resembles a group of simple linear channels, the corresponding dimension can be expected to be close to that of a line, d_lin = 1. if no surface streamer channels appear but only point-like microdischarge spots develop, dimension d will approach that of a point, i.e., d_point = 0. therefore, the interval ⟨0, 2⟩ represents all possible values of the dimension d of the surface positive streamers. the actual geometrical dimension d for a given structure can be obtained from the multifractal analysis in terms of the hausdorff–besicovitch dimension d0 [45].

in order to analyse our surface streamers it was necessary to extract the channel structure from the background of the remanent electret surface charge (fig. 2). the results of the corresponding multifractal analysis have shown that the studied structures manifest fractal rather than multifractal features, so that the spectrum of generalized dimensions dq reduces to a single representative value d for all q. the actual value of the dimension d for the structure presented in fig. 2 is 1.45 ± 0.2.

the first authors to recognize possible fractal features of a surface discharge channel structure, and to try to estimate its fractal dimension, were niemeyer, pietronero and wiesmann [6]. they determined the value d = 1.7 for their surface corona streamers. this higher value of the dimension corresponds to a more ramified channel structure, which can easily be checked by visual inspection of their fig. 1 in [6]. many followers appeared in the field of computer simulations of discharge channel structures [12–18]. the dimensions obtained from these simulations show a large variety of values influenced by the chosen model parameters: structures can be found with restricted ramification, i.e., with a lower dimension d = 1.46 [10], which is close to our value d = 1.45, or more ramified structures, d = 1.75 [10], or even very branched structures, d = 1.8–1.96 [6, 14]. although much work in this field has been done, one important question still remains: what physical parameters influence the ramification of the channel structure, and exactly what mechanisms participate?

the original computer model of niemeyer, pietronero and wiesmann [6] solved the problem of ramification by introducing a ‘growth’ probability p dependent on the local electric field e,

$$p \sim E^{\eta}, \qquad (4)$$

where $\eta$ is a model parameter. in this sequence of simplifying simulation steps, the dimension d of the resulting structure depends on the single model parameter $\eta$, i.e., d(η), although there is experimental evidence [3], [8] that branching depends on more than one parameter: the thickness ratio between the discharge gap and the dielectric layer, the electronegativity of the gas used, and the chosen external (global) voltage, to mention some of them. the dependence on the external voltage in the case of our surface electrostatic separation discharges actually means dependence on the remanent electret surface charge density. this study should be followed up by verifying this dependence by performing the analysis at various poling potentials.

a fractal dimension seems to be a characteristic parameter not only for the amplitude statistics [46] used for non-destructive testing of partial microdischarges in the field of high voltage technology, but it also seems to be a promising candidate for assessing electret charge saturation, as indicated by our experiments. further study of this problem is in progress and a following report on the subject is in preparation.
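for illustration, a toy implementation of this stochastic growth rule is sketched below: the potential around the pattern is relaxed with a few jacobi sweeps of laplace's equation, and a perimeter site is added with probability proportional to the η-th power of the local field (estimated here by the local potential, since the pattern is clamped to zero). this is a schematic reading of the model of [6], not the simulation code of any of the cited studies.

```python
import random
import numpy as np

def grow_pattern(n=65, eta=1.0, steps=300, sweeps=200):
    """toy npw-type growth; returns the set of pattern sites."""
    phi = np.ones((n, n))            # potential; far boundary held at 1
    pattern = {(n // 2, n // 2)}     # seed electrode in the centre, at 0
    for _ in range(steps):
        for _ in range(sweeps):      # jacobi relaxation of laplace's eq.
            phi = 0.25 * (np.roll(phi, 1, 0) + np.roll(phi, -1, 0)
                          + np.roll(phi, 1, 1) + np.roll(phi, -1, 1))
            phi[0, :] = phi[-1, :] = phi[:, 0] = phi[:, -1] = 1.0
            for site in pattern:
                phi[site] = 0.0
        # perimeter candidates and growth weights ~ e**eta, eq. (4)
        cands = sorted({(i + di, j + dj)
                        for (i, j) in pattern
                        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))
                        if 0 < i + di < n - 1 and 0 < j + dj < n - 1}
                       - pattern)
        weights = [phi[c] ** eta for c in cands]
        pattern.add(random.choices(cands, weights=weights)[0])
    return pattern
```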
4 acknowledgment

this work was supported by the grant agency of the czech republic under grant no. 202/03/0011.

references

[1] lichtenberg g. c.: novi comm. soc. reg. sci. gott. vol. 8 (1777), p. 168.
[2] morris a. t.: br. j. appl. phys. vol. 2 (1951), p. 98.
[3] bertein h.: j. phys. d vol. 6 (1973), p. 1910.
[4] murooka y., koyama s.: j. appl. phys. vol. 50 (1979), p. 6200.
[5] nakanishi k., yoshioka a., shibuya y., nitta t.: charge accumulation on spacer surface at dc stress in compressed sf6 gas. in gaseous dielectrics iii, ed. l. g. christophorou, pergamon press, new york, 1982, p. 365.
[6] niemeyer l., pietronero l., wiesmann h. j.: phys. rev. lett. vol. 52 (1984), p. 1033.
[7] hidaka k., murooka y.: j. appl. phys. vol. 59 (1986), p. 87.
[8] takahashi y., fujii h., wakabayashi s., hirano t., kobayashi s.: ieee trans. el. insul. vol. 24 (1989), p. 573.
[9] niemeyer l.: 7th internat. symp. on high voltage engineering, dresden, 1991, p. 937.
[10] femia n., lupo g., tucci v.: 7th internat. symp. on high voltage engineering, dresden, 1991, p. 921.
[11] gallimberti i., marchesi g., niemeyer l.: 7th internat. symp. on high voltage engineering, dresden, 1991.
[12] murat m.: phys. rev. b vol. 32 (1985), p. 8420.
[13] fujimori s.: japan j. appl. phys. vol. 24 (1985), p. 1198.
[14] satpathy s.: phys. rev. lett. vol. 57 (1986), p. 649.
[15] wiesmann h. j., zeller h. r.: j. appl. phys. vol. 60 (1986), p. 1770.
[16] evertsz c.: j. phys. a vol. 22 (1989), p. l1061.
[17] pietronero l., wiesmann h. j.: z. phys. b vol. 70 (1988), p. 87.
[18] barclay a. l., sweeney p. j., dissado l. a., stevens g. c.: j. phys. d vol. 23 (1990), p. 1536.
[19] mandelbrot b. b.: possible refinement of the lognormal hypothesis concerning the distribution of energy dissipation in intermittent turbulence. in statistical models and turbulence, lecture notes in physics, eds. m. rosenblatt and c. van atta, springer, new york, 1972, p. 333.
[20] mandelbrot b. b.: j. fluid mech. vol. 62 (1974), p. 331.
[21] mandelbrot b. b.: the fractal geometry of nature. w. h. freeman, new york, 1983.
[22] grassberger p.: phys. lett. a vol. 97 (1983), p. 227.
[23] hentschel h. g. e., procaccia i.: physica d vol. 8 (1983), p. 435.
[24] grassberger p., procaccia i.: physica d vol. 9 (1983), p. 189.
[25] grassberger p., procaccia i.: physica d vol. 13 (1984), p. 34.
[26] badii r., politi a.: phys. rev. lett. vol. 52 (1984), p. 1661.
[27] badii r., politi a.: j. stat. phys. vol. 40 (1985), p. 725.
[28] frisch u., parisi g.: on the singularity structure of fully developed turbulence. in turbulence and predictability in geophysical fluid dynamics and climate dynamics, eds. m. ghil, r. benzi and g. parisi, north-holland, new york, 1985, p. 84.
[29] jensen m. h., kadanoff l. p., libchaber a., procaccia i., stavans j.: phys. rev. lett. vol. 55 (1985), p. 2798.
[30] bensimon d., jensen m. h., kadanoff l. p.: phys. rev. a vol. 33 (1986), p. 362.
[31] halsey t. c., jensen m. h., kadanoff l. p., procaccia i., shraiman b. i.: phys. rev. a vol. 33 (1986), p. 1141.
[32] glazier j. a., jensen m. h., libchaber a., stavans j.: phys. rev. a vol. 34 (1986), p. 1621.
[33] feigenbaum m. j., jensen m. h., procaccia i.: phys. rev. lett. vol. 57 (1986), p. 1503.
[34] chhabra a. b., jensen r. v.: phys. rev. lett. vol. 62 (1989), p. 1327.
[35] chhabra a. b., meneveau c., jensen r. v., sreenivasan k. r.: phys. rev. a vol. 40 (1989), p. 5284.
[36] voss r. f.: random fractals: characterization and measurement. in scaling phenomena in disordered systems, eds. r. pynn and a. skjeltorp, plenum press, new york, 1985, p. 1.
[37] feder j.: fractals. plenum press, new york, 1988, p. 80.
[38] baumann g., nonnenmacher t. f.: determination of fractal dimensions. in gli oggetti frattali in astrofisica, biologia, fisica e matematica, eds. g. a. losa, d. merlini, r. moresi, cerfim, locarno, 1989, p. 93.
[39] losa g. a.: microsc. elett. vol. 12 (1991), p. 118.
[40] baumann g., barth a., nonnenmacher t. f.: measuring fractal dimensions of cell contours: practical approaches and their limitations. in fractals in biology and medicine, eds. g. a. losa, t. f. nonnenmacher, e. r. weibel, birkhäuser verlag, basel, 1993.
[41] losa g. a., nonnenmacher t. f., weibel e. r.: fractals in biology and medicine. birkhäuser verlag, basel, 1993.
[42] belana j., mudarra m., calaf j., cañadas j. c., menéndez e.: ieee trans. el. insul. vol. 28 (1993), p. 287.
[43] kressman r., sessler g. m., günther p.: ieee trans. diel. el. insul. vol. 3 (1996), p. 607.
[44] dodd s. j., dissado l. a., champion j. v., alison j. m.: phys. rev. b vol. 52 (1995), p. r16985.
[45] ficker t.: phys. rev. a vol. 40 (1989), p. 3444.
[46] ficker t.: j. appl. phys. vol. 78 (1995), p. 5289.

assoc. prof. rndr. tomáš ficker, drsc.
phone: +420 541 147 661
e-mail: fyfic@fce.vutbr.cz
department of physics, faculty of civil engineering, university of technology
žižkova 17, 662 37 brno, czech republic

the system for control and stabilization of the beam position in the microtron mt-25 in prague

č. šimáně, m. vognar, d. chvátil

abstract: a method of controlling the beam position at crucial points of the transport system and of stabilizing its output position has been proposed and preliminarily tested. the method is based on secondary electron emission from a thin metallic wire probe induced by electrons from the 25 mev microtron. it was demonstrated that a magnetic field of the order of 0.2 t parallel to the wire probe, in front of the orifice of the extraction channel in the acceleration space, does not prevent the functioning of the method. a strong parasitic effect of secondary electron emission from the material of the channel and its support construction was found, leading to an inversion of the polarity of the electron current from the wire. this effect can be to a great extent eliminated by a negative electric potential bias relative to the channel. at an electron output current of 1 µa, the secondary emission current from a wire probe of 0.3 mm diameter is of the order of several na. two electromechanical systems were designed for the removal of the probes from the beam path, to avoid deterioration of the electron beam quality by scattering. electronic schemes used for remote measurement of small probe currents, suppressing the influence of strong electromagnetic noise, are described. for stabilization of the output beam position, two wire probes situated in air close to the al output window were used. these probes, placed at the periphery of the beam, did not deteriorate the beam quality. the difference of their emission currents was used as an error signal to control the magnetic field of the last dipole, which kept the beam in the center of the output window.

keywords: microtron, electron transport, beam detection, beam position stabilization, secondary electron emission.

1 introduction

the electron beam transmission system in the prague mt-25 microtron requires accurate alignment of the beam with all deflecting or focusing magnetic elements. without knowledge of the beam position at crucial points of the transport system, the alignment is a troublesome operation even for an experienced operator. therefore it is desirable to have actual knowledge of the beam position, which however must be obtained without any deterioration of the electron beam quality. once alignment is obtained, the main duty of the operator is to hold the beam in the center of the output window. this function, requiring permanent attention on the part of the operator, can be automated as well.

there exist several types of non-disturbing beam position detection systems. in our case we decided to apply a system based on secondary electron emission from the impact of accelerated electrons on a thin metallic wire probe. the amount of energy dissipated in a wire probe is negligible and no supplementary cooling of the wire is necessary. the deterioration of the beam by scattering on a single wire is very small; nevertheless it must be taken into account if several wires in series are placed in the beam path. therefore the wire probes can be used only in the period of alignment of the beam and should be removed from the beam path after the alignment has been accomplished. only probes situated at the beam periphery can stay in place permanently.

2 experimental part

2.1 stabilization of the beam position at the output window

the wire probe method was tested in connection with the stabilization of the beam position at the output. to be sure that the method would work in vacuum as well as in air, an electrically isolated copper wire of 0.4 mm diameter was placed in the axis of a thin-walled cylindrical aluminum chamber of 28 mm inner diameter, which could be evacuated. the chamber wall was grounded and a load resistor was inserted between the wire and the ground. the potential on the resistor was measured by a low noise, low drift instrumentation amplifier. the chamber was exposed to the electron beam with the wire in the beam axis and evacuated. the portion of the 0.8 µa, 22 mev electron beam passing through the wire was estimated at 0.15 µa. the total beam intensity was monitored by an induction pick-up system [1] at the end of the electron transmission line. the wire potentials, measured on load resistors ranging from 10⁵ to 10⁶ ohms, were in the mv range. the currents calculated from the potentials and resistor values were in all cases 0.08 µa, independent of the resistor values. the loss of intensity of the primary beam at passage through the wire being negligible, the current is due exclusively to secondary electrons leaving the wire and reaching the wall of the chamber.
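the constant-current behavior noted above is a simple ohm's-law consequence: u = i·r with i fixed by the secondary emission. the snippet below (with potentials implied by the quoted 0.08 µa, not separately measured values) shows the expected millivolt-range readings:

```python
# a constant secondary emission current of 0.08 uA implies potentials
# proportional to the load resistance (illustrative, implied values)
I_WIRE = 0.08e-6  # a
for r_ohm in (1e5, 5e5, 1e6):
    print(f"r = {r_ohm:.0e} ohm -> u = {I_WIRE * r_ohm * 1e3:.0f} mV")
```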
the wire is charged positively and behaves as a constant current source. the situation changed when the same measurements were made at atmospheric air pressure in the chamber. in this case several effects must be taken into account. most of the secondary electrons, with an energy spectrum from fractions of an ev upward, cannot reach the wall of the chamber. their energy is dissipated in collisions with nitrogen and oxygen atoms or molecules and finally, captured by them, they form negative atomic or molecular ions. a negative space charge barrier built up in the space between the wire and the wall of the chamber prevents low energy secondary electrons from escaping the region in the vicinity of the wire. the diffusion of the space charge both toward the wire and toward the chamber represents additional currents. at atmospheric pressure, the backward current of negative ions toward the central wire outweighs the current toward the wall of the chamber. contact potentials between the al chamber and the cu wire, as well as thermoelectric potentials, may also in some way modify the load resistor current. in a detailed analysis, the pulsed character of the microtron electron beam should also be taken into account: each 2.5 µs beam pulse is followed by a 2.5 ms pause, during which the space charge diffuses without being replenished by secondary electrons. some role may also be played by the ionization of air by the primary beam.
the observed net result of all these rather complicated effects is a reduction of the current in the load resistor as compared with the current in the evacuated chamber. when keeping the load resistor below 100 kΩ, the observed reduction represents about 50 %.

based on these experimental results, a system for stabilizing the beam in the horizontal plane was developed [2]. in its simplest version, two parallel 0.4 mm cu wire probes (e1 and e2) were placed in air vertically in front of the exit window of the electron transport line, symmetrically to its center. the 6 mm distance between them was selected as suitable for a normally adjusted 3 mm beam half-width. for this probe configuration, using the measured values from fig. 1, the difference u2 − u1 of the potentials on the load resistors (the error signal) as a function of the beam departure from the central position was calculated (fig. 2). the probes were connected to the inputs of a differential instrumentation amplifier (fig. 3) through a frequency filter suppressing the pulsed character of the secondary electron emission currents and environmental electromagnetic noise. the inherent noise of the amplifier was reduced to units of µv, which corresponds to a noise current of the order of 10 pa at the input. the output of the instrumentation amplifier was connected to a power amplifier feeding the auxiliary dipole winding for fine beam position control. in this way the error signal controls the dipole magnetic field, which compensates the departure of the beam from the central position. in our case, a 0.28 a current in the auxiliary winding is necessary to compensate a 1 mm departure. the current in the auxiliary dipole winding as a function of the beam departure from the symmetry plane of the wire probes at 1 µa mean beam current is presented in fig. 4. the stabilization factor at 1 µa mean beam current, defined as the ratio of the beam departure from the center without and with stabilization, was about 20, increasing linearly with the beam current.

fig. 1: potential on the load resistor r = 10⁵ Ω of the wire probe versus the horizontal beam departure from the axis

fig. 2: calculated values of the error potential difference between the load resistors r = 10⁵ Ω of two wire probes versus the horizontal beam departure from the axis (at 3 mm beam half-width)

fig. 3: principal electronic scheme of the stabilization device

fig. 4: compensation current in the auxiliary dipole winding versus the horizontal beam departure from the axis
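in software form, the loop described above reduces to a proportional controller: the probe potential difference estimates the beam departure, and the auxiliary winding current is set to cancel it. the 0.28 a/mm figure is from the text; the error-signal slope near the centre is an assumed calibration value of the kind read off fig. 2.

```python
MV_PER_MM = 1.8     # assumed error-signal slope near the centre [mv/mm]
AMP_PER_MM = 0.28   # winding current needed to correct a 1 mm departure [a/mm]

def dipole_current(u1_mv: float, u2_mv: float) -> float:
    """auxiliary dipole winding current [a] from the two probe potentials [mv]."""
    departure_mm = (u2_mv - u1_mv) / MV_PER_MM  # estimated beam departure
    return AMP_PER_MM * departure_mm            # proportional correction

print(dipole_current(2.0, 3.8))  # ~1 mm off-centre -> ~0.28 a
```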
the experiment furnished two important results. first, it was confirmed, that magnetic field existing in the acceleration space of the order of 0.2 t and parallel to the wire probe, does not prevent its functioning, which was not at all evident in advance. secondly, a strong parasitic effect, leading to the inversion of electron current direction from the wire probe, was observed, the origin of which was the secondary electron emission from the channel and its support material hit by the beam. the current of the probe is thus an algebraic sum of its own secondary emission current and of current of secondary electrons emitted from the support material and collected by the probe. as seen from fig. 5, the setting of the wire probe on negative bremspotentials �9, �18 and �27 v against the channel potential inhibited progressively the collection of secondary electrons from the construction material. at a bremspotential of �27 v the maximum of the current peak was clearly identifiable and current inversion to great extent suppressed. in next experiments, the possibility of using the system was verified for finding the correct channel position at passage from an electron orbit to the next (to the previous) one. during the passage, the effect of secondary emission from the channel and its support material was very strong and could not be fully suppressed even if the bremspotential was used. from the same reason the determination of the correct channel position by differential wire probes, which has been tried too, gave not satisfactory results. nevertheless, as can be seen from fig. 6, using only a single wire probe the correct channel position on two neighbor orbits can be found unambiguously. the experiments gave also a crude estimate of the order of magnitude of the secondary emission current. the � 0.3 mm wire probe was situated in front of the orifice of the iron channel, where the horizontal width of the beam is about 6 mm. setting the accelerated electron output current to 1 �a, the maximum of the secondary electron current was of the order of 20 na. because of the unknown distribution of the current density in the beam the coefficient of electron secondary emission could not be calculated and compared with the experimental results, so that this value is only informative. a low noise low drift analog amplifier has been built working in two wire current loop with the measuring instrument on the panel in the remote control room. the amplifier was placed in vicinity of the accelerator at a place with relatively low radiation level to avoid its deterioration by radiation. in this way the influence of strong electromagnetic noise in the microtron room on measurement of the small probe currents was effectively suppressed. the supposed deterioration of the beam quality by scattering on the wire probe was experimentally confirmed. therefore a system has been designed, which will be used in the future electron transmission line, with wire probes, which can be removed from the beam after the alignment procedure has been accomplished. it is supposed to place the wire probes at the inputs of the magnetic quadrupole lenses and deflection dipoles, both to align the horizontal and vertical positions of the beam. two wire probes for setting the horizontal and vertical central positions of the beam will © czech technical university publishing house http://ctn.cvut.cz/ap/ 61 czech technical university in prague acta polytechnica vol. 44 no. 
two wire probes for setting the horizontal and vertical central positions of the beam will be oriented perpendicularly to the transmission line axis and will be made extractable from the beam. for this purpose a special mechanism was designed, securing the linear transport of the probe by external magnetic fields (fig. 7). additional differential wire probes will be situated in the beam transmission line at the periphery of the beam, serving to determine the sign of the deviation of the beam from the axis. being situated at the periphery of the beam, they will perturb the beam only insignificantly and can be fixed permanently in the transmission line.

fig. 7: linear shift mechanism for setting the wire probe in and out of the electron beam path

the wire probe in front of the orifice of the extraction channel is fixed on a coil with two stable positions. in one stable position the wire is in front of the channel orifice; in the other position the probe is removed from the beam path (fig. 8). current pulses of appropriate polarity actuate the flipping of the coil from one position to the other when the magnetic field in the acceleration space is on. the system was preliminarily tested inside the acceleration space of the microtron with satisfactory results.

fig. 8: flipping coil mechanism for setting the wire probe in front of the electron extraction channel and for withdrawing it: 1 – extraction channel input orifice, 2 – wire probe, 3 – coil, 4 – holder of the extraction channel, 5 – shaft of the flipping coil holder, 6 – pole pieces of the microtron electromagnet

3 conclusion

the use of secondary electron emission from wire probes represents an advantageous way of controlling the electron beam position when setting the parameters of the magnetic deflecting and focusing elements of the electron transport system. the differential wire probes can be used for automatic stabilization of the output beam position. introduction of these probes into the transport system in the way described does not deteriorate the beam quality. no additional cooling system is required, which represents a very important feature of the wire probes. incorporating the beam position control in the existing transport system is expected to result in optimization of its parameters, improvement of the output beam quality, and simplification of the overall microtron control.
references

[1] šimáně č., vognar m.: induction pick-up system for microtron mt25 current measuring and position indication. czech technical university in prague, workshop 98, prague, 1998.

[2] šimáně č., vognar m., němec v.: stabilization of microtron electron beam position. czech technical university in prague, workshop 2000, part b, prague, 2000, p. 540.

prof. ing. čestmír šimáně, drsc.
ing. miroslav vognar
ing. david chvátil
department of dosimetry and application of ionizing radiation
czech technical university in prague
faculty of nuclear sciences and physical engineering
břehová 7, 115 19 praha 1, czech republic

a simplified analysis of the post-buckling behavior of a compressed reinforcing bar

p. kabele

abstract: recently, a computational methodology based on a sequential multiscale approach, which facilitates numerical simulation of an r/c building demolition, has been developed. in this type of analysis, it is necessary to capture the behavior of compressed reinforcement bars until complete rupture, which occurs due to extensive bending in the post-buckling regime. to this end, a simplified analytical model of the post-buckling behavior of a compressed bar is proposed. the simplification consists namely in considering rigid-plastic material behavior, neglecting axial contraction of the central line, and approximating the shape of the deformed central line in the plastic hinges by a circular arch. consequently, the axial loading force, bar end displacement, and extreme strain can be expressed in relatively simple closed forms. the results obtained with the proposed model show very close agreement with those obtained by a detailed and realistic finite element analysis, which justifies the use of the simplifying assumptions.

keywords: reinforcement bar, hardening plasticity, post-buckling behavior.

1 introduction

when a precast reinforced concrete (r/c) panel building is demolished using controlled explosions, the selection of appropriate sizes, placement, and timing of the charges is crucial in order to ensure complete collapse of the structure while not damaging surrounding objects. the design of safe and efficient deconstruction procedures can be facilitated by means of computer simulations. in contrast to standard structural analysis, the main objective of such a simulation is to predict the mechanical behavior of a structure during the phase when it disintegrates and loses static stability. the mechanical phenomena to be dealt with include dynamic motion (finite displacements and rotations) and interaction of debris on the structural level (macroscale), and fracture and yielding on the material level (mesoscale). simultaneous treatment of all these mechanisms would be computationally too costly; thus, a computational methodology based on a sequential multiscale approach has recently been proposed [1]. in order to reduce the number of degrees of freedom involved in the dynamic analysis at the macroscale, the entire structure is modeled as an assembly of beam finite elements, which represent structural members (panels) and their joints. the governing relations among bending moment, axial force, curvature, and axial strain of the beam elements are formulated by modeling the overall behavior of r/c panel sections or joints that undergo local damage (mesoscale model).
during demolition, structural members and joints may be exposed to loading states that are diametrically different from those for which they were designed. nevertheless, the load bearing capacity of an r/c section exposed to an arbitrary combination of axial force and bending is usually determined by concrete crushing in compression and/or reinforcement yielding in tension. both of these phenomena can be well modeled by plasticity. if a section is exposed to further deformation upon reaching the load capacity, the load drops, but usually not immediately to a zero value. the post-failure response is often dominated by the highly ductile behavior of the reinforcing bars, which can typically sustain strain up to the order of 10⁻¹. the corresponding residual load carried by a failed section or joint cannot be neglected when analyzing the disintegration and dynamic motion of a structure during demolition. in tension, the behavior of reinforcing bars can still be modeled by simple one-dimensional plasticity, but in compression, plastic buckling has to be taken into account. the present paper deals with modeling of the latter phenomenon.

2 physical phenomena

let us consider an r/c member which is loaded by such a combination of axial force and bending moment that at least on one side it is exposed to compression. concrete crushing starts when the maximum compressive strain attains a level of roughly 0.002. as the crushed concrete spalls, it exposes the longitudinal reinforcing bars. upon losing the support of the concrete cover, the reinforcing bars buckle outward (fig. 1). the buckling length is limited either by transversal stirrups or by the length of the spalled concrete zone. eventually the bars yield in plastic hinges, which form in the locations of extensive bending. since the crushed and spalled concrete ceases to contribute to the load bearing ability of the section, its overall response is then dominated by the behavior of the buckling bars.

fig. 1: concrete cover spalling and reinforcement buckling in an r/c member

3 analytical model

3.1 loading and boundary conditions

the behavior of a buckled bar is modeled by the relationship between the applied axial force p and the relative displacement of its ends u, as shown in fig. 2. force p is positive when compressive and, similarly, positive u means contraction. we consider that the bar is free to buckle in the plane x-z within length l.
keywords: reinforcement bar, hardening plasticity, post-buckling behavior. fig. 1: concrete cover spalling and reinforcement buckling in an r/c member consider that the bar is free to buckle in plane x-z within length l. as the bar is in reality continuous, the in-plane rotations and displacements in z-direction are fixed at the end points of length l. since the problem is symmetric about axis z, we solve it only on one half of l, i.e. in the interval ��l/2, 0�. consequently, the displacement in x-direction and the rotation are fixed on the symmetry axis. 3.2 assumptions we accept the following general assumptions: (1) the bar is modeled as a bernoulli-euler beam, i.e. the planar cross-sections perpendicular to the central line prior to deformation remain so also after deformation takes place. this assumption is acceptable, since the length of the bar, which is free to buckle, is usually much larger than the cross-sectional size. (2) the bar undergoes finite displacements, therefore the equilibrium equations are formulated on the deformed configuration. (3) in the post-buckling regime, the elastic deformations are negligibly small compared to the plastic deformations. consequently material is modeled as rigid-plastic with linear hardening. since we do not consider load reversals, we will use the deformation theory of plasticity. (4) when the bar buckles, the contribution of central-line axial contraction to the overall contraction is negligible in comparison with the contribution due to bending (finite deflection). 3.3 moment-curvature relation in the view of assumption (1), the normal strain distribution is linear on the bar cross-section. since axial straining of the central line is neglected [assumption (4)], the strain can be expressed as: � � �� � , (1) where � denotes the local axis normal to and originating at the central line, and � is the curvature of the central line. assumption (3) implies that the material constitutive law can be written as: � � � � � � � � � � � � � � 0 �� �� for for y y h yesgn( ) (2) where � is the normal stress, �y is the yield strength and eh is the linear-hardening modulus. by combining eqs. (1) and (2) and considering the equivalence of bending moment m and the moment due to normal stress, the following relation is obtained: m m k p� �sgn( )� �0 (3) where m ay a 0 � � � � d (4) and k e ap h a � � � 2 d (5) and a is the cross-sectional area of the bar. note that eq. (3) holds only for those parts of the bar which are fully plastic (plastic hinges), i.e. � �� y on the entire section, or, in terms of the bending moment, m m� 0. outside the plastic hinges, the curvature is zero for any m. 3.4 equilibrium on a deformed bar let us consider a symmetric half of a buckled bar, as shown in fig. 3. the maximum deflection is denoted as w. to satisfy the equilibrium of forces in x and z directions, reaction r � p and v � 0. the anti-symmetry of the deformed shape implies that moment reactions �ma � mb. to maintain the equilibrium of moments, we require 2m p wb � . (6) it is obvious from fig. 3 and fig. 4 that the bending moment induced by the load and the reactions is a linear function of coordinate z: m m p zb� � � (7) which in combination with eq. (6) gives: m p z w � � � � � 2 . (8) 3.5 geometry of a deformed bar as discussed in section 3.3, the curvature of the deformed bar outside the plastic hinges is zero, which means that the corresponding part of the central line remains straight. 
3.5 geometry of a deformed bar

fig. 2: configuration and loading of a reinforcing bar: a) undeformed, b) deformed
fig. 3: load and reactions on a deformed bar

as discussed in section 3.3, the curvature of the deformed bar outside the plastic hinges is zero, which means that the corresponding part of the central line remains straight. the shape of the central line within the plastic hinges can in general be obtained by solving together eqs. (3) and (8). however, the solution of the differential equation is rather complicated, since it involves elliptic integrals [3]. to simplify the problem, we approximate the shape of the central line in the plastic hinges by circular arches with constant curvature $\kappa_{av}$ (grey parts in fig. 4). due to the anti-symmetry, the plastic hinges on both ends of the analyzed half of the bar have the same length c. the arches are delimited by angle $\varphi$. then
$$\kappa_{av} = \frac{\varphi}{c} \quad (9)$$
to ensure a smooth transition from the plastic hinges to the straight rigid portion of the bar (black part in fig. 4), the latter has to be inclined at the same angle $\varphi$ from the horizontal direction. taking into account assumption (4) (the central-line length remains constant and equal to l/2), the two parameters c and $\varphi$ completely describe the shape of the deformed central line. then it is obvious from fig. 4 that the maximum deflection w can be expressed as:
$$w = \frac{2c}{\varphi}\,(1 - \cos\varphi) + \left(\frac{l}{2} - 2c\right)\sin\varphi \quad (10)$$
similarly, the relative displacement of the bar ends (fig. 2) is:
$$u = 2\left[\frac{l}{2} - \frac{2c}{\varphi}\,\sin\varphi - \left(\frac{l}{2} - 2c\right)\cos\varphi\right] \quad (11)$$

3.6 derivation of p-u relationship

in the light of eq. (3), the assumption of constant curvature implies that the bending moment within a plastic hinge is also constant. thus, for the right-side hinge:
$$M_{av} = M_0 + K_p\,\kappa_{av} \quad (12)$$
we consider that this constant moment is also equal to the average moment in the plastic hinge (fig. 4); then
$$M_{av} = \tfrac{1}{2}\,(M_0 + M_b) \quad (13)$$
now we substitute the equations in the following order: (9) into (12) into (13), from which we express $M_b$. this result is substituted together with eq. (10) into eq. (6), which allows us to express force p in terms of c and $\varphi$:
$$P = \frac{2M_0 + \dfrac{4K_p\,\varphi}{c}}{\dfrac{2c}{\varphi}\,(1-\cos\varphi) + \left(\dfrac{l}{2} - 2c\right)\sin\varphi} \quad (14)$$
in order to eliminate the length of the plastic hinge c, we consider that the bending moment from eq. (8) must be equal to $-M_0$ at the right end of the left-side hinge, i.e. for
$$z = \frac{c}{\varphi}\,(1 - \cos\varphi) \quad (15)$$
eq. (8) is then rewritten in the form:
$$-M_0 = P\left[\frac{c}{\varphi}\,(1-\cos\varphi) - \frac{w}{2}\right] \quad (16)$$
after substituting eqs. (10) and (14), the above equation is solved for c. of the two roots, the following one is relevant:
$$c = \frac{\varphi\sqrt{2K_p\sin\varphi\,\bigl[\,l M_0\,(1-\cos\varphi) + 2K_p\varphi^2\sin\varphi\,\bigr]} - 2K_p\varphi^2\sin\varphi}{2M_0\,(1-\cos\varphi)} \quad (17)$$
if we now substitute eq. (17) into eqs. (11) and (14), both p and u depend only on the parameter $\varphi$. the desired p-u relationship is thus obtained in a parametric form. note that parameter $\varphi$ has a clear physical meaning: it is the angle of inclination of the straight portion of the buckled reinforcing bar. note also that the derived equations are valid only after the bar cross-sections at $x = -l/2,\ 0,\ l/2$ have completely yielded due to post-buckling bending.
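the parametric p-u relationship is straightforward to sweep numerically. the sketch below (python, our own illustration based on the reconstructed eqs. (10), (11), (14) and (17) above, with the rupture criterion of section 3.7 anticipated) reproduces the validation values reported later in the paper, P = 4.34 kn and u = 0.0342 m at $\varphi$ = 0.805 rad:

```python
import numpy as np

M0, Kp = 51.5, 1.15625         # section constants from the previous sketch (N*m, N*m^2)
l, d, eps_u = 0.2, 0.01, 0.24  # buckling length, bar size, strain capacity (table 1)

for phi in np.linspace(0.05, 1.2, 2000):   # inclination of the straight portion, rad
    s, c1 = np.sin(phi), 1.0 - np.cos(phi)
    # plastic-hinge length, eq. (17)
    c = (phi * np.sqrt(2*Kp*s * (l*M0*c1 + 2*Kp*phi**2*s)) - 2*Kp*phi**2*s) / (2*M0*c1)
    w = 2*c/phi*c1 + (l/2 - 2*c)*s                      # deflection, eq. (10)
    u = 2*(l/2 - 2*c/phi*s - (l/2 - 2*c)*np.cos(phi))   # end displacement, eq. (11)
    P = (2*M0 + 4*Kp*phi/c) / w                         # axial force, eq. (14)
    eps_ext = (d/2) * (P*w/2 - M0) / Kp                 # extreme strain, eqs. (6), (18)
    if eps_ext >= eps_u:                                # rupture criterion of section 3.7
        print(f"rupture: phi={phi:.3f} rad, P={P/1e3:.2f} kN, u={u*1e3:.1f} mm")
        break
```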
3.7 extreme strain

a complete rupture of the buckled bar is initiated when the extreme strain reaches the material strain capacity $\varepsilon_u$. this first happens at the sections with extreme bending moments $M_a$ or $M_b$, that is, for $x = -l/2,\ 0,\ l/2$. on these sections, the rupture criterion is first satisfied at the most distant points with $\xi = \xi_{ext}$. by using eqs. (6), (14), (10), and (17), the extreme moment is expressed in terms of the parameter $\varphi$. then we obtain the extreme strain from eqs. (3) and (1) as:
$$\varepsilon_{ext} = \xi_{ext}\,\frac{M_b - M_0}{K_p} \quad (18)$$

4 validation

in order to validate the proposed analytical model, a typical problem of a buckling bar is solved and the results are compared with those obtained with fem. the material and geometrical properties of the analyzed reinforcement bar are listed in table 1. they correspond to standard steel no. 10 216, which was in the past often used in precast r/c panels. for the sake of simplicity of the fe analysis, the cross-section has the shape of a square with side d.

fig. 4: geometry and distribution of bending moment on a deformed bar

table 1: material and geometrical properties of the analyzed reinforcement bar
e*) = 210 gpa, ν*) = 0.3, σ_y = 206 mpa, e_h = 1387.5 mpa, ε_u = 0.24, l = 0.2 m, d = 0.01 m
*) used only in the fe analysis

the fe analysis was performed assuming plane stress, finite displacements and finite strains. the material was modeled as elastic-plastic with linear hardening. the model represented the entire length l. to facilitate buckling, the bar was given a piecewise linear lateral imperfection with the maximum value at the vertical symmetry line equal to 0.25 % of l. the non-uniform fe mesh consisted of 100x20 9-noded elements, and it was refined in the locations of the plastic hinges. according to eq. (18), bar rupture is initiated at p = 4.34 kn and u = 0.0342 m, which correspond to $\varphi$ = 0.804962 rad. fig. 5 compares the post-buckling load-displacement curves obtained with the proposed model and with fem. a close agreement is evident over the entire relevant range, i.e. up to u = 0.0342 m. fig. 6 shows the values of extreme strain calculated according to eq. (18) and the corresponding extreme values of logarithmic strains obtained with fem. it is obvious that the proposed model captures the extreme compressive strain very well, while it overestimates the tensile strain. note that the difference between the strains on the compressed and tensioned surfaces is a result of considering the finite displacements and strains in the fe analysis.

5 concluding remarks

a simplified approach to the analysis of the post-buckling behavior of a compressed bar has been presented. the simplification consists namely in considering rigid-plastic material behavior, neglecting axial contraction of the central line, and approximating the shape of the deformed central line in the plastic hinges by a circular arch. consequently, the axial loading force p, end displacement u, and extreme strain $\varepsilon_{ext}$ can be expressed in relatively simple closed forms. this feature is particularly desirable, since the formulation will be used in a multiscale context to model the behavior of an r/c structure during demolition [2]. the results obtained with the proposed model show very close agreement with those obtained by a realistic detailed finite element analysis, which justifies the use of the simplifying assumptions.

6 acknowledgment

the research presented in this paper was supported by grant no. 103/02/0658 provided by the science foundation of the czech republic and by contract of the ministry of education of the czech republic no. j04/98: 210000003.
references
[1] kabele p., pokorný t.: "computational analysis of precast concrete building deconstruction." in "proceedings of the fifth world congress on computational mechanics (wccm v), july 7–12, 2002, vienna, austria" (editors: h. a. mang, f. g. rammerstorfer, and j. eberhardsteiner), vienna university of technology, austria, http://wccm.tuwien.ac.at.
[2] kabele p., kalousková m.: "multiscale stochastic simulation of building demolition." in proceedings of the sixth world congress on computational mechanics (wccm vi), to appear.
[3] shames i. h., dym c. l.: "energy and finite element methods in structural mechanics." taylor & francis, 1991.

doc. ing. petr kabele, ph.d.
phone: +420 224 354 485
e-mail: petr.kabele@fsv.cvut.cz
department of structural mechanics
czech technical university in prague, faculty of civil engineering
thákurova 7, 166 29 prague 6, czech republic

fig. 5: post-buckling load-displacement curves of a reinforcement bar
fig. 6: extreme strains vs. end displacement at x = 0 and ξ = ±d/2; absolute values of logarithmic strains are plotted as fem results

indoor air quality assessment based on human physiology - part 2. limits

m. v. jokl

in order to evaluate indoor air quality in practice it is necessary to establish limits, or more exactly, tolerable ranges for unadapted and adapted persons. the optimal value overwhelmingly corresponds to pd = 20 %. a better value of pd = 10 % could be prescribed for asthmatics and for persons with increased requirements, i.e. those allergic to the environment and operators in airport control towers and atomic power stations. a worse value, pd = 30 %, could be accepted as an admissible value. these values differ for unadapted and adapted persons (as introduced by bsr/ashrae 62-1989 r). the long-term tolerable value is the end of the sbs range (for co2 it is based on ussr space research, for tvoc on molhave). the short-term tolerable value is the beginning of the toxic range (for co2 it is taken from british guidance note eh 40/90; for tvoc from molhave).

keywords: indoor air quality, odors, air changes estimation.

1 introduction

the increasing requirements for indoor air quality in buildings need more exact criteria in order to ascertain the real condition of the environment and to allow better optimization of its level, to remove "sick building" symptoms, i.e. to get real comfort within a building. human physiology research makes evident that the weber-fechner law applies not only to noise perception, but also to the perception of other environmental components. based on this fact, new decibel units for the odor component representing indoor air quality in the majority of locations have been proposed: decicarbdiox dcd (for carbon dioxide co2) and decitvoc dtv (for total volatile organic compounds tvoc); see part 1 of this paper. the equations of these new units have been proved by application of a) experimental relationships between odor intensity (representing odor perception by the human body) and odor concentrations of co2 and tvoc, and b) individually measured co2 and tvoc levels (concentrations); from these, the new decibel units can be calculated and their values compared with the decibel units of noise measured in the same locations. to be able to evaluate indoor air quality in practice, we need to establish limits or, more exactly, admissible and tolerable ranges for both unadapted and adapted persons (p).

2 carbon dioxide

the starting points of these values are based on various studies (e.g. [23]) whose results are listed in table 2.1 and in fig. 2.1; see also tables 1.2 and 1.3 (p. 24, 25) and fig. 1.4 (p. 26) in part 1. the optimal value overwhelmingly corresponds to pd = 20 %. a better value of pd = 10 % could be prescribed
for asthmatics and for persons with increased requirements, i.e. those allergic to the environment and operators in airport control towers and power stations (especially atomic power stations). this is analogous to the tvoc limits (see later).

table 2.1: various limits and ranges for co2 concentrations (un = unadapted persons, ad = adapted persons, asthm. = asthmatic persons; ? = for the values for asthmatic persons there is no experimental background, analogy to tvoc is presumed)

no.   [mg/m3]  [ppm]   [dcd]  limit                                    source
1     875      485     0      threshold                                5.8 % dissatisfied
2     1080     600     8      usaf warning                             usaf armstrong laboratory 1992
3     1110     615     9      un asthm. optimal?                       10 % dissatisfied unadapted
4     1440     800     20     osha warning                             osha: federal register 1994
5a    1800     1000    28     optimal limit                            pettenkofer 1858
5b    1800     1000    28     acceptable limit                         ansi/ashrae 62/1989
5c    1800     1000    28     opt. long-term                           who/euro: air quality guidelines 1992
6a    1825     1015    29     un optimal limit                         20 % dissatisfied unadapted
6b    1825     1015    29     un asthm. admissible?
6c    2000     1110    32     concentration of no concern for non-industrial buildings   who (levy 1992)
7a    2160     1200    35     opt. short-term                          who/euro: air quality guidelines 1992
7b    2200     1225    36     ad asthm. optimal?                       10 % dissatisfied adapted
8     2830     1570    46     un admissible                            30 % dissatisfied unadapted
9a    4350     2420    63     ad optimal limit                         20 % dissatisfied adapted
9b    4350     2420    63     ad asthm. admissible?                    (bsr/ashrae standard 62-1989r)
10a   5035     2800    68     limit for direct gas-fired air heaters   bs 5990: 1981 of british standards institution
10b   5035     2800    68     limit for direct gas-fired air heaters   bs 6230: 1982 of british standards institution
11a   6300     3500    77     long-term acceptable                     env. health directorate, canada 1989
11b   7000     3890    81     concentration of concern for non-industrial buildings      who (levy 1992)
11c   7360     4095    83     ad admissible                            30 % dissatisfied adapted
12a   9000     5000    91     long-term exposure limit (8 hrs)         guidance note eh 40/90 from hse of gb
12b   9000     5000    91     average concentration for industrial and non-industrial buildings   commission de la sante et de la securite du travail
12c   9000     5000    91     long-term tolerable                      ussr space research (sbs range ends)
13a   18000    10000   118    maximum allowable concentration for ind. and non-ind. buildings     commission de la sante et de la securite du travail
13b   18000    10000   118    short-term tolerable                     ussr space research
14a   27000    15000   134    short-term tolerable                     toxic range begins
14b   27000    15000   134    short-term exposure limit (10 min)       guidance note eh 40/90 from hse of gb

ranges for unadapted persons:
un a    875–1825     485–1015     0–29     optimal range
un a1   875–1110     485–615      0–9      asthm. optimal range?
un a2   1110–1825    616–1015     10–29    asthm. admissible range?
un b    1826–2830    1016–1570    30–46    admissible range
un c    2831–9000    1571–5000    47–91    long-term tolerable (sbs) range
un d    9001–27000   5001–15000   92–134   short-term tolerable range
un e    > 27001      > 15001      > 135    intolerable range

ranges for adapted persons:
ad a    875–4350     485–2420     0–63     optimal range
ad a1   875–2200     485–1225     0–36     asthm. optimal range?
ad a2   2201–4350    1226–2420    37–63    asthm. admissible range?
ad b    4351–7360    2421–4095    64–83    admissible range
ad c    7361–9000    4096–5000    84–91    long-term tolerable (sbs) range
ad d    9001–27000   5001–15000   92–134   short-term tolerable range
ad e    > 27001      > 15001      > 135    intolerable range
a worse value of pd = 30 % could be accepted as an admissible value. these values differ for unadapted and adapted persons (as introduced by bsr/ashrae 62-1989 r). the long-term tolerable values, which are quoted in occupational health standards and studies, are reached in buildings with sick building syndrome (sbs). short-term tolerable values are those at the beginning of the toxic range, both for unadapted and adapted persons. also the lowest, detectable value is the same for unadapted and adapted persons (p). for unadapted persons, the optimal (pd = 20 %) and admissible (pd = 30 %) values are 1015 ppm and 1570 ppm (fig. 2.2), i.e. 29 dcd and 46 dcd. for adapted persons, the curve must first be added into the diagram (fig. 2.2) as follows:

fig. 2.1: the proposed co2 limits: optimal, admissible and tolerable values; the psycho-physical scale slightly modified by fanger (1988) (ad = adapted persons, un = unadapted persons)

for unadapted persons, the equilibrium equation is valid:
$$R_p = \frac{G_{\mathrm{CO_2}} \cdot 10^3}{3.6\,(\rho_{i,\mathrm{CO_2}} - \rho_{e,\mathrm{CO_2}})} = \frac{19 \cdot 10^3}{3.6\,(1015 - \rho_{e,\mathrm{CO_2}})} \doteq 7.5\ \ \mathrm{[l\cdot s^{-1}\cdot P^{-1}]} \quad (1)$$
where $R_p = 7.5$ l·s−1·P−1 is the prescriptive outdoor air requirement for unadapted persons [9]; $G_{\mathrm{CO_2}} = 19$ l·h−1·P−1 is the co2 load caused by a sedentary person (see table 2.2); $\rho_{i,\mathrm{CO_2}} = 1015$ ppm is the co2 indoor air concentration for 20 % dissatisfied unadapted persons (see fig. 2.2); and $\rho_{e,\mathrm{CO_2}} = 310$ ppm is the co2 outdoor air concentration, as a result of eq. (1). for adapted persons the same eq. (1) is valid, but the prescriptive outdoor air requirement for them, according to [9], is only 2.5 l·s−1·P−1; the co2 load caused by a sedentary person and the outdoor co2 concentration remain the same, i.e. $G_{\mathrm{CO_2}} = 19$ l·h−1·P−1, $\rho_{e,\mathrm{CO_2}} = 310$ ppm. so the co2 indoor concentration $\rho_{i,\mathrm{CO_2}}$ can be calculated:
$$R_p = \frac{19 \cdot 10^3}{3.6\,(\rho_{i,\mathrm{CO_2}} - 310)} = 2.5\ \ \mathrm{[l\cdot s^{-1}\cdot P^{-1}]} \quad (2)$$
where $\rho_{i,\mathrm{CO_2}} \doteq 2420$ ppm, i.e. for 20 % dissatisfied adapted persons the co2 indoor air concentration can be raised from 1015 ppm to 2420 ppm. presuming the same character of the curve, i.e.
$$\Delta\rho_{\mathrm{CO_2}} = k\,(\ln PD - 5.98)^{-4}\ \ \mathrm{[ppm]}$$
$$2420 - 310 = k\,(\ln 20 - 5.98)^{-4}$$
we get for adapted persons:
$$\Delta\rho_{\mathrm{CO_2}} = 167\,350\,(\ln PD - 5.98)^{-4} \quad (3)$$
for adapted persons, the optimal (pd = 20 %) and admissible (pd = 30 %) values are 2420 ppm and 4095 ppm, i.e. 63 dcd and 83 dcd. optimal (pd = 10 %) and admissible (pd = 20 %) values for asthmatics could be, for unadapted persons: 615 ppm, 9 dcd (pd = 10 %) and 1015 ppm, 29 dcd (pd = 20 %); and for adapted persons: 1225 ppm, 36 dcd (pd = 10 %) and 2420 ppm, 63 dcd (pd = 20 %). more experimental data are required.
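the percentage of dissatisfied persons can be recovered from an excess concentration by inverting the curve of eq. (3). the following sketch (python, our own illustration; the constants k are fitted from the anchor points stated above, 1015 ppm and 2420 ppm at pd = 20 %) reproduces the tabulated limits:

```python
import math

def pd_from_excess(delta_rho, k):
    """percentage of dissatisfied from the excess concentration above outdoors,
    inverting delta_rho = k * (ln PD - 5.98)**-4, cf. eq. (3); pd in %."""
    return math.exp(5.98 - (k / delta_rho) ** 0.25)

K_UNADAPTED = 705  * (math.log(20) - 5.98) ** 4   # so that 1015 - 310 ppm -> 20 %
K_ADAPTED   = 2110 * (math.log(20) - 5.98) ** 4   # so that 2420 - 310 ppm -> 20 %, ~167350

print(pd_from_excess(1570 - 310, K_UNADAPTED))    # ~30 % (unadapted admissible limit)
print(pd_from_excess(4095 - 310, K_ADAPTED))      # ~30 % (adapted admissible limit)
```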
the long-term tolerable value (5000 ppm, 91 dcd; the end of the sbs range) is based on ussr space research; the short-term tolerable value (15000 ppm, 134 dcd; the beginning of the toxic range) is taken from british guidance note eh 40/90, as described previously.

fig. 2.2: the percentage of dissatisfied sedentary subjects as a function of the carbon dioxide concentration above outdoors

table 2.2: pollution load caused by occupants
activity                                            tvoc [µg·h−1·P−1] 3)   co2 [l·h−1·P−1]
sedentary, 1–1.2 met 1), 0 % smokers                5140                   19
sedentary, 20 % smokers 2)                          10290                  19
sedentary, 40 % smokers 2)                          15430                  19
sedentary, 100 % smokers 2)                         30870                  19
physical exercise, low level, 3 met                 20580                  50
physical exercise, medium level, 6 met              51440                  100
physical exercise, high level (athletes), 10 met    102890                 170
children, kindergarten, 3–6 years, 2.7 met          6170                   18
children, school, 14–16 years, 1–1.2 met            6690                   19
1) 1 met is the metabolic rate of a resting sedentary person (1 met = 58 w·m−2 of skin area, i.e. approx. 100 w for an average person)
2) average smoking rate 1.2 cigarettes/hour per smoker, emission rate 44 ml co/cigarette
3) converted olf values presented in eur 14449 en

table 2.3: various limits and ranges for tvoc concentrations

no.   [µg/m3]  [dtv]  limit                                source
1     50       0      threshold                            5.8 % dissatisfied (1.0 by yaglou psycho-physical scale)
2     85       12     un asthm. optimal                    10 % dissatisfied unadapted (eur 14449 en)
3a    200      30     un optimal limit                     20 % dissatisfied unadapted (eur 14449 en); old dwelling houses, old office buildings
3b    200      30     un asthm. admissible
4     250      35     ad asthm. optimal                    10 % dissatisfied adapted
5     300      39     target guideline                     seifert (1990)
6     360      43     un admissible                        30 % dissatisfied unadapted (eur 14449 en); new dwelling houses, new office buildings
7a    500      50     level of concern                     national health and medical research council of australia (dingle, murray 1993)
7b    580      53     ad optimal limit                     20 % dissatisfied adapted
7c    580      53     ad asthm. admissible
8     1040     66     ad admissible                        30 % dissatisfied adapted
9a    3000     89     long-term tolerable                  sbs range ends
9b    3000     89     multifactorial exposure range limit  molhave (1990)
10a   25000    135    short-term tolerable                 toxic range begins
10b   25000    135    discomfort range limit               molhave (1990)

ranges for unadapted persons:
un a    50–200       0–30     optimal range
un a1   50–85        0–12     asthm. optimal range
un a2   86–200       13–30    asthm. admissible range
un b    201–360      31–43    admissible range
un c    361–3000     44–89    long-term tolerable (sbs) range
un d    3001–25000   90–135   short-term tolerable range
un e    > 25001      > 136    intolerable range

ranges for adapted persons:
ad a    50–580       0–53     optimal range
ad a1   50–250       0–35     asthm. optimal range?
ad a2   251–580      36–53    asthm. admissible range?
ad b    581–1040     54–66    admissible range
ad c    1041–3000    67–89    long-term tolerable (sbs) range
ad d    3001–25000   90–135   short-term tolerable range
ad e    > 25000      > 136    intolerable range

fig. 2.3: the proposed tvoc limits: optimal, admissible and tolerable values; the psycho-physical scale slightly modified by fanger (ad = adapted persons, asthm = asthmatic persons, un = unadapted persons)

3 total volatile organic compounds

the starting points of these values are based on various studies whose results are listed in table 2.3 and in fig. 2.3
(see also tables 1.3 and 1.4 (p. 25 and 28) and fig. 1.4 (p. 26) in part 1). the philosophy behind the various limits is the same as previously presented for co2. for unadapted persons, the optimal (pd = 20 %) and admissible (pd = 30 %) values are 200 µg·m−3 and 360 µg·m−3, i.e. 30 dtv and 43 dtv. for adapted persons, the curve must first be added into the diagram (fig. 2.4) as follows. for unadapted persons, the equilibrium equation (4) is valid:
$$R_b = \frac{G_{\mathrm{TVOC}}}{3.6\,(\rho_{i,\mathrm{TVOC}} - \rho_{e,\mathrm{TVOC}})} = \frac{5140}{3.6\,(200 - \rho_{e,\mathrm{TVOC}})} \doteq 7.5\ \ \mathrm{[l\cdot s^{-1}\cdot P^{-1}]} \quad (4)$$
where $R_b = 7.5$ l·s−1·P−1 is the prescriptive outdoor air requirement for unadapted persons [9]; $G_{\mathrm{TVOC}} = 5140$ µg·h−1·P−1 is the tvoc load caused by a sedentary person (see table 2.2); $\rho_{i,\mathrm{TVOC}} = 200$ µg·m−3 is the tvoc indoor air concentration for 20 % dissatisfied unadapted persons (see fig. 2.4); and $\rho_{e,\mathrm{TVOC}} = 10$ µg·m−3 is the tvoc outdoor air concentration, as a result of eq. (4). for adapted persons the same eq. (4) is valid, but the prescriptive outdoor air requirement for them, according to bsr/ashrae 62-1989 r, is only 2.5 l·s−1·P−1; the tvoc load caused by a sedentary person and the outdoor air tvoc concentration remain the same, i.e. $G_{\mathrm{TVOC}} = 5140$ µg·h−1·P−1, $\rho_{e,\mathrm{TVOC}} = 10$ µg·m−3. so the tvoc indoor air concentration $\rho_{i,\mathrm{TVOC}}$ can be calculated:
$$R_b = \frac{5140}{3.6\,(\rho_{i,\mathrm{TVOC}} - 10)} = 2.5\ \ \mathrm{[l\cdot s^{-1}\cdot P^{-1}]} \quad (5)$$
where $\rho_{i,\mathrm{TVOC}} \doteq 580$ µg·m−3, i.e. for 20 % dissatisfied adapted persons the tvoc indoor air concentration can be raised from 200 µg·m−3 to 580 µg·m−3. presuming the same character of the curve, i.e.
$$\Delta\rho_{\mathrm{TVOC}} = k\,(\ln PD - 5.98)^{-4}$$
$$580 - 10 = k\,(\ln 20 - 5.98)^{-4}\ \ \mathrm{[\mu g\cdot m^{-3}]}$$
we get for adapted persons, preferring again $\Delta\rho$ as it was with co2:
$$\Delta\rho_{\mathrm{TVOC}} = 46\,000\,(\ln PD - 5.98)^{-4}\ \ \mathrm{[\mu g\cdot m^{-3}]} \quad (6)$$
for adapted persons, the optimal (pd = 20 %) and admissible (pd = 30 %) values are 580 µg·m−3 and 1040 µg·m−3, i.e. 53 dtv and 66 dtv. optimal (pd = 10 %) and admissible (pd = 20 %) values for asthmatics could be, for unadapted persons: 85 µg·m−3, 12 dtv (pd = 10 %) and 200 µg·m−3, 30 dtv (pd = 20 %); and for adapted persons: 250 µg·m−3, 35 dtv (pd = 10 %) and 580 µg·m−3, 53 dtv (pd = 20 %). besides asthmatics, these values are also recommended for people with increased requirements: those allergic to the environment or those having responsible positions, such as operators in airport control towers and power stations, especially atomic power stations. the long-term tolerable value (3000 µg·m−3, 89 dtv; the end of the sbs range) and the short-term tolerable value (25000 µg·m−3, 135 dtv; the beginning of the toxic range) are based on [36].
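the steady-state balance behind eqs. (1), (4) and (5) is the same for both pollutants: the per-person load is diluted by 3.6 times the per-person outdoor airflow (in m3/h per l/s). a minimal sketch (python, our own illustration of the reconstructed equations; the loads and outdoor levels are the paper's values):

```python
def indoor_concentration(load_per_hour, outdoor, airflow_l_s):
    """steady-state indoor level from the equilibrium of eqs. (1) and (4):
    pass the load in cm3/h for co2 (result in ppm) or in ug/h for tvoc
    (result in ug/m3); airflow is the outdoor air rate in l/s per person."""
    return outdoor + load_per_hour / (3.6 * airflow_l_s)

# unadapted (7.5 l/s per person) vs. adapted (2.5 l/s, bsr/ashrae 62-1989 r)
print(indoor_concentration(19_000, 310, 7.5))   # co2:  ~1015 ppm, 20 % dissatisfied, unadapted
print(indoor_concentration(19_000, 310, 2.5))   # co2:  ~2420 ppm, 20 % dissatisfied, adapted
print(indoor_concentration(5_140, 10, 2.5))     # tvoc: ~580 ug/m3, 20 % dissatisfied, adapted
```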
4 conclusions

1. the new units, decitvoc and decicarbdiox, can be a new basis for a constituent mutual interaction study.
2. the units dcd and dtv can be estimated by direct measurement of tvoc and co2 concentrations; instruments can be calibrated directly in the new units.
3. the units dcd and dtv, as indoor air quality criteria, allow an optimal range definition.
4. the units allow an optimal range definition (the so-called asthmatics optimal range) for persons with increased requirements (e.g. those allergic to indoor air quality, operators in airport control towers, power stations etc.).
5. the units allow the admissible range definition (for both healthy and allergic persons).
6. the units allow definition of the sbs range (corresponding to the long-term tolerable range).
7. the units allow the estimation of dangerous indoor air quality (corresponding to the short-term tolerable range, see figs. 2.1 and 2.3).
8. the units allow the efficiency of air cleaners (and other indoor air-improving measures, e.g. using low-polluting building materials) to be expressed (see [30]), i.e. what is the decrease of air contamination after application.

references: see the list presented at the end of part 3 (p. 44).

prof. ing. miloslav v. jokl, drsc.
phone: +420 224 354 432
email: miloslav.jokl@fsv.cvut.cz
department of engineering equipment of buildings
czech technical university in prague, faculty of civil engineering
166 29 prague 6, czech republic

fig. 2.4: the percentage of dissatisfied sedentary subjects as a function of the total volatile organic compound concentration above outdoors

a component-based software development and execution framework for cax applications

n. matsuki, h. tokunaga, h. sawada

digitalization of the manufacturing process and technologies is regarded as the key to increased competitive ability. the mz-platform infrastructure is a component-based software development framework, designed for supporting enterprises to enhance digitalized technologies using software tools and cax components in a self-innovative way. in the paper we show the algorithm, system architecture, and a cax application example on mz-platform. we also propose a new parametric data structure based on mz-platform.

keywords: component-based development, cad, parametric modeling.

1 introduction

we have developed a new component-based software development framework called mz-platform. this research, financed by meti (ministry of economy, trade and industry), started in 2001 and is planned as a 5-year research project [1]. the aim of this project is to develop a new tool for the manufacturing industry, particularly for small and medium-size enterprises, that enables engineers themselves to develop new cax (i.e., cad, cam, cae, cat as a whole) programs and strengthen their competitiveness. information technology (it) tools, typified by cad, have become indispensable for manufacturing enterprises in order to enhance their productivity. however, the costs of purchasing it tools, maintaining the systems, and training engineers and operators are a heavy burden for small and medium-size enterprises. the burden becomes much heavier if they try to customize the tools to make the most of their capabilities. according to our survey, current cad tools have abundant general functions, but are too general for small and medium-size enterprises, because cad tools are designed to satisfy the diversified requirements of the aerospace, shipbuilding, and automobile industries. in particular, die and metal mold design for mechanical parts, in which small and medium-size enterprises play a key role in japan, requires specialized techniques and skills that are difficult to replace by current cad tools, which mainly focus on supporting tasks in the early stage of product design. we conclude that easy-to-use software development tools, generous supplies of software parts (components) and semi-finished cad (or cax) tools will encourage small and medium-size enterprises. mz-platform is designed for this purpose, and we have developed the basic software modules of mz-platform: component bus, application builder, xml-component transmutation, and remote component collaboration. we have also developed an application called mz-checker, which can verify the quality of 3-dimensional geometry data, such as the distance or the break angle of two adjacent free-form surfaces. we consider that the mz-checker development has demonstrated the capability of mz-platform as a development tool. in this paper we show the algorithm, the system architecture of mz-platform, and a new parametric modeling scheme that overcomes the current parametric data issues.

2 background and related work

the idea of component-based development (cbd) is well known in software engineering [2]. in cbd, software systems are built by assembling components already developed and prepared for integration.
component-based software engineering (cbse) has become a subdiscipline of software engineering, and much research has been done, mainly on software architecture and software architecture description languages [3]. on the other hand, commercial cbd tools, such as visual basic and the .net framework of microsoft corporation [4], are quite powerful, but they do not have special features directly related to cax application development. in the development of cad systems, dassault systèmes, the major cad vendor, has adopted a component-based development concept called the om component model [2] for its product catia. the om component model is to be utilized when the catia functions are enhanced. though it seems to have ample functions, we do not use it, because our component framework is not based on catia. in order to design a tool to make cax systems, flexibility of component connection is a key concept, because cad data represented in an object-oriented language, such as java, becomes a component.

3 mz-platform: a component-based software development framework

mz-platform is a fully event-driven component development and execution framework, based on java and the javabeans component model technology. the architecture is shown in figure 1. when users want to make a cax software tool through the use of mz-platform, the first thing they have to do is to consult the component library (fig. 1). many components have been prepared, such as gui components, graphic library components, database access components and components for remote access. as a result, users may find components that are commonly used in cax tools in the component library. some functions specific to user demands have to be designed and developed in the conventional programming environment, and these functions are to be made in the form of components. after storing these components in the component library, users can develop a software tool using application builder, loading components from the library and "wiring" those components. throughout the process, users can check the actions of components and assembled components (applications) anytime they want through the functions of application builder. in this way, we suppose that nonspecialists in software can develop a cax tool very quickly.

3.1 component bus

the core of mz-platform is the event-handling module called component bus. component bus controls all events from components using connection objects. when an event from component a to component b occurs, component bus receives this event and, if component b is ready to activate, invokes component b and passes the event to it.
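mz-platform itself is implemented in java on top of javabeans reflection; the following python sketch is only an illustration of the connector-object idea described above, and the class and method names are ours, not mz-platform's api:

```python
class Connector:
    """intermediate object that routes one component's event to a target method,
    looked up by name via reflection (getattr), as the component bus does."""
    def __init__(self, source, event, target, method):
        self.source, self.event = source, event
        self.target, self.method = target, method

    def fire(self, payload):
        # resolve the target method by name at call time, so the wiring
        # can be changed while the "application" is running
        return getattr(self.target, self.method)(payload)

class ComponentBus:
    def __init__(self):
        self.connections = []

    def connect(self, source, event, target, method):
        self.connections.append(Connector(source, event, target, method))

    def publish(self, source, event, payload):
        # deliver the event through every connector registered for it
        for c in self.connections:
            if c.source is source and c.event == event:
                c.fire(payload)
```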
to establish a connection between components, component bus uses the reflection mechanism of the java language, which enables all the methods that a component has to be seen through a reflection interface. connection objects are designed to propagate events. the concept of a connector object plays an important role in the study of architecture description languages (adl), but few currently available component frameworks have such a kind of intermediate object. almost all component frameworks connect components directly, for the sake of simplicity. we employ connector objects because they allow us abundant flexibility to change the component assembly in a dynamic way, namely, to change the connection while the program is running. it is frequently seen in cax applications that, according to a change of a real-number geometry dimension, the algorithm (and program) is to be switched. the problem is that there is no absolute dimension value (the threshold value) at which it should switch; it depends on the environment, e.g. on the accuracy of the computer hardware. for example, at the corner of break lines we usually define a semicircle to smooth them. but if the break angle approaches 180 degrees, we can no longer define a semicircle at the corner, because the radius goes to infinity. usually, the switch of the algorithm is coded within one component, but it is difficult to switch correctly, as the threshold value depends on the environment. component bus can handle this switching much more easily, because the switch of the program can be coded as an assembly of components, and the threshold value can easily be accommodated.
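extending the sketch above, the switching described here can be expressed as rewiring rather than as a hard-coded branch; again, the names are illustrative only, and the angle threshold is an assumed, environment-dependent value:

```python
# assumed, environment-dependent switching threshold (degrees)
BREAK_ANGLE_THRESHOLD = 179.9

def rewire(bus, source, event, target, method):
    """drop the previous route for (source, event) and install a new connector,
    i.e. change the component assembly while the program is running."""
    bus.connections = [c for c in bus.connections
                       if not (c.source is source and c.event == event)]
    bus.connect(source, event, target, method)

def on_break_angle(bus, corner, fillet_comp, straight_comp, angle_deg):
    # near 180 degrees the fillet radius diverges, so the corner event is
    # redirected to a straight-transition component instead of a fillet one
    target = straight_comp if angle_deg > BREAK_ANGLE_THRESHOLD else fillet_comp
    rewire(bus, corner, "corner", target, "build")
    bus.publish(corner, "corner", angle_deg)
```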
3.2 component library

we developed gui (graphical user interface) components, 3-dimensional geometry visualization components based on java3d, and other components helpful for building cax applications, such as a component that calculates the distance between geometry entities. these components are serializable javabeans and are stored in jar-format files. component bus loads components from the component library. if a new component is developed, it has to be stored in the component library to be utilized as a part of mz-platform. the abundance of the component library is one of the key issues for mz-platform to be an easy-to-use tool and, at the same time, assistance on how to use the components is also important. as this project is planned to continue until 2005, we try to enrich the component library continuously.

3.3 application builder

application builder is the user interface module to define the component assembly. an example of a user-defined component assembly screen is shown in fig. 2. each rectangle of the screen represents a component, and the lines between the rectangles represent the method invocations, the names of which are shown above the lines. once the connection between components is defined, component bus invokes the components and creates a connector object simultaneously. users can select several different modes of application builder using the buttons at the bottom of the screen, which are application execution mode, screen layout mode and load/save components.

fig. 1: architecture of mz-platform
fig. 2: application builder of mz-platform

during the execution of application builder, users can alter the properties of components, like the colors of the screen background, the message string of a dialog and similar attributes of components. also, the usability of the screen layout of an application can be verified while the software is under construction, because the components on application builder are all activated. application builder supports compound components: several components are assembled into a single component; namely, a hierarchical component can also be defined in the same screen.

3.4 xml-component transmutation

component integration information defined by application builder can be stored in an xml (extensible markup language) format file. in the xml file, comments for component and argument explanation are also added. fig. 3 (left) shows an example of component assembly information and the corresponding application. in this example, the xml file has more than four thousand statement lines. this xml file contains the program to make the surface shown in fig. 3 (right) and the geometry data of the four special curves that define the surface. this shows that not only a cax function, such as "create a surface from four bounded curves", but also surface data in parametric representation can be stored and shared in the xml file. the xml-component transmutation module can read this xml file as component assembly information and pass it to component bus as if it were defined by application builder. moreover, if an application receives a message containing such xml data, xml-component transmutation translates the message and loads components interactively. using this function, for example, the application screen layout can be dynamically altered depending on a message from outside the application.

fig. 3: component assembly information stored in xml format and the corresponding application

3.5 remote component collaboration

recently, it has become common that several enterprises are involved in the design and manufacture of a new product. to support these processes, collaboration of distributed software is one of the key features of cax tools. mz-platform has the collaboration capabilities to invoke a remote component and to send a component from one mz-platform to another. remote component methods (or services) can be found using uddi (universal description, discovery and integration) protocols [5]. messages are sent in the soap (simple object access protocol) format, and they can reach the other mz-platform site even if it is behind a firewall. there are several major protocols: corba (common object request broker architecture) [6] defines rpc (remote procedure call)-like remote object invocation protocols; java has rmi (remote method invocation) to use the service of a remote java object.
the feature of remote component collaboration is that users can assemble remote components in the same way as local components in their own component library. component bus creates an xml message if an event for a remote component occurs. a process for remote communication, called broker, creates a soap message and sends it to the destination broker. during this procedure, the uddi server provides the destination information to the broker, as shown in fig. 4.

fig. 4: architecture of remote component collaboration

4 applications

to prove the effectiveness and usability of mz-platform, we have developed an application called mz-checker (fig. 5). the main function of mz-checker is to verify the quality of 3-dimensional cad data: to calculate the angles between a selected axis vector and finely sampled surface normal vectors, and to display them on the screen as color-coded surfaces for the purpose of metal-mold model checking. we have proved that mz-platform is capable of developing commercial-level programs. we believe that 6 months of development is very short compared to development of applications at the same level. moreover, the ease of adding a new function is shown by an example function, which was developed and tested in mz-checker in just two days. as a cax tool, mz-checker has many features. sasig (strategic automotive product data standards industry group) has announced that cad data of inferior quality is recognized as a major cause of rework and cost by many automobile industry organizations [7]. jama (japan automobile manufacturing association) [8] and sasig have published a guideline aimed at preventing inferior-quality cad data from being created, called the pdq (product data quality) guideline. mz-checker is the only tool that has full conformity with the pdq guideline. furthermore, mz-checker can load various types of cad data, such as step (ap203, ap214) and iges; it is also used as a viewer of cad data. as a part of this research project, we started distribution of mz-checker to small and medium enterprises.

fig. 5: mz-checker
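the angle check described above can be sketched as follows (python/numpy; mz-checker itself is a java application, and the sampling, tolerance and function names here are our own assumptions for illustration):

```python
import numpy as np

def classify_normals(normals, axis, limit_deg=3.0):
    """angle between a chosen axis vector (e.g. the mold pull direction) and
    sampled surface normals; faces whose angle exceeds 90 deg + limit would
    trap the mold (undercut) and are flagged for color-coded display."""
    axis = axis / np.linalg.norm(axis)
    n = normals / np.linalg.norm(normals, axis=1, keepdims=True)
    ang = np.degrees(np.arccos(np.clip(n @ axis, -1.0, 1.0)))
    return ang, ang > 90.0 + limit_deg   # angles and an undercut mask

# three sampled normals: aligned, perpendicular and opposed to pull direction z
normals = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.0, 0.1, -1.0]])
print(classify_normals(normals, np.array([0.0, 0.0, 1.0])))
```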
when the new cad version is released, a part of the program codes is modified or upgraded, and the operation that is defined by these modified program codes will behave differently than it used to. this is the essential problem of parametric data, and we call this a problem of the “parametric data not being persistent”. as shown above, the component assembly information written in xml format can be parametric data. if the component is designed to be a suitable size and some tacit factors that cause the inconsistency are controlled from outside the component, parametric data represented in xml format becomes persistent. this is an ongoing research item and we will show the results in future papers. the parametric data example we have developed is shown figure 6. in this example, a curve in the model can be replaced by an arbitrary curve. in fig. 6(a), the 2nd of the three curves is replaced by a new curve. fig. 6(b) shows that the sweep angle is changed from 2 degrees to 10 degrees. 6 conclusions and future work we have developed a component-based software development framework mz-platform. it enables engineers in small and medium-size enterprises to develop programs specific to their needs. basic modules of mz-platform have been designed and developed. there is plenty of scope for improvement. for example, the user-interface of application builder can be simpler and more friendly to nonspecialists in programming. as mentioned above, we will to improve and enrich the components from now on. concerning the parametric modeling problem, we are planning to design a more complicated cad model with fillet surfaces. it will prove that the proposed parametric data in xml format could be a candidate for parametric data of the next generation. references [1] matsuki n.: “a national r&d project plan to establish manufacturing information infrastructure in japan”. proc. of itit symposium on development manufacturing technology infrastructure, aist, (2001), p. 6–8. [2] crnkovic i., larsson m.: building reliable component-based software systems. ma (usa): attech house 2002, isbn 1-58053-327-2. [3] medvidovic n., taylor r. n.: “a classification and comparison framework for software architecture description languages”. ieee trans. on software engineering, vol 26 (jan. 2000), no. 1. [4] http://www.microsoft.com/net/basics/whatisasp [5] http://www.uddi.org [6] http://www.omg.org [7] http://www.sasig-pdq.org [8] http://www.jama.or.jp [9] anderl r., mendgen r.: “parametric design and its impact on solid modeling applications”. acm proc. 3rd symposium of solid modeling, (1995), 1. mr. norio matsuki dr. hitoshi tokunaga dr. hiroyuki sawada digital manufacturing research center (dmrc) national institute of advanced industrial science and technology (aist) 1-2, namiki, tsukuba ibaraki, japan © czech technical university publishing house http://ctn.cvut.cz/ap/ 51 acta polytechnica vol. 44 no. 3/2004 (a) curve replacement example (b) sweep angle modification example fig. 6: parametric representation of a model and result of modification acta polytechnica https://doi.org/10.14311/ap.2021.61.0644 acta polytechnica 61(5):644–660, 2021 © 2021 the author(s). 
preliminary prospects of a carnot-battery based on a supercritical co2 brayton cycle

karin rindt*, františek hrdlička, václav novotný
czech technical university in prague, faculty of mechanical engineering, department of energy engineering, technická 4, 166 07 prague 6, czech republic
* corresponding author: karin.edel@fs.cvut.cz

abstract. as a part of the change towards a higher usage of renewable energy sources, which naturally deliver the energy intermittently, the need for energy storage systems is increasing. for the compensation of the disturbance in power production due to inter-day to seasonal weather changes, a long-term energy storage is required. in the spectrum of storage systems, one out of a few geographically independent possibilities is the use of heat to store electricity, in so-called carnot-batteries. this paper presents a pumped thermal energy storage (ptes) system based on a recuperated and recompressed supercritical co2 brayton cycle. it is analysed if this configuration of a brayton cycle, which is most advantageous for supercritical co2 brayton cycles, can be favourably integrated into a carnot-battery and if a similarly high efficiency can be achieved, despite the constraints caused by the integration. the modelled ptes operates at a pressure ratio of 3 with a low nominal pressure of 8 mpa, in a temperature range between 16 °c and 513 °c. the modelled system provides a round-trip efficiency of 38.9 % and was designed for a maximum of 3.5 mw electric power output. the research shows that an acceptable round-trip efficiency can be achieved with a recuperated and recompressed brayton cycle employing supercritical co2 as the working fluid. however, a higher efficiency would be expected to justify the complexity of the configuration.

keywords: pumped thermal energy storage (ptes), carnot-battery, power-to-heat-to-power (p2h2p), supercritical co2 cycle, brayton cycle, heat exchange, pinch-point analysis.

1. introduction

as the demand for renewable energy is increasing, so is the importance of its reliability. unfortunately, the availability of energy from renewable sources and the electricity demand usually do not correspond. the majority of renewable energy is obtained from so-called variable renewable energy sources (vres), such as solar radiation and wind, of which the electric grid can handle a maximum integration of about 10 % without any further changes [1]. the variation of the hourly wind speed as well as of the solar radiation, compared with the feed-in for the example of nuremberg in germany, can be seen in figure 1 and figure 2. this gives us an idea of the fluctuation of "available weather" and energy demand throughout a day. for this purpose, figure 1 also shows the variation of the wind by highlighting its minimum and maximum values (at the different hours of the day throughout the year). the wind speed and solar radiation data are obtained from the merra database [2–5](1), and the feed-in values from a local energy supplier called n-ergie [6]. these kinds of inequalities of supply and demand can be observed worldwide and throughout the whole year, also requiring shifting of the energy between different seasons of the year.

(1) the data from the merra database was, with kind permission, obtained from prof. dr.-ing. matthias popp, burgstraße 19, d-95632 wunsiedel, germany, and processed with an excel tool (version 18.03.2016).
to tackle this problem, besides focusing on making the energy production as flexible as possible, energy storage is about to become of great importance in the change to a decarbonised future, as also stated by the international energy agency (iea) [8]. energy storage can also be an enabler for making vres power plants, like concentrated solar power (csp), more competitive with traditional, fossil-fuel-burning power plants [9]. a thermal storage, directly integrated into the power cycle, can replace the sun as a source of thermal energy when it is cloudy and allow the csp plant to deliver energy more continuously and reliably. with a competitive solution like this, places with a high availability of solar radiation, for example western china, could effectively use the sun instead of a fossil fuel power plant, even if the interest would be more of an economic nature rather than the protection of the environment. in the spectrum of storage systems, thermal energy storage (tes) is one out of a few possibilities for storing energy in an environmentally friendly and geographically independent way. the pumped thermal energy storage (ptes) proposed in this paper is also called carnot-battery, electro-thermal energy storage (etes) or pumped heat energy storage (phes)(2). the principle is described schematically in figure 3.

(2) not to be confused with phs (pumped hydro storage).

figure 1. intraday load curve network supply and wind speed in nuremberg, germany, 2015 [6, 7].
figure 2. intraday load curve network supply and solar radiation in nuremberg, germany, 2015 [6, 7].
figure 3. principle of a carnot-battery (modified from [10]).
figure 4. comparison between different storage technologies (modified from [11, 12]).

the potential of different storage principles is visualised in figure 4. electro-thermal energy storage (etes) is rated as a technology feasible for longer storage times and higher power. especially due to the possible storage time of several hours and days, it is feasible for handling the offset between supply and demand caused by the unstable nature of renewable energy sources like wind and solar radiation (as can be seen in figure 1 and figure 2). however, it is still in its concept phase [11], with the first prototypes being tested. by 2020, several carnot-battery pilot plants have been operated successfully. the argon-based brayton cycle with a power of 150 kw (600 kwh electric) of newcastle university, built with the help of the company isentropic ltd. in the united kingdom, deploys reciprocating devices. the system has a round-trip efficiency of 60 % to 65 % [13] and is based on the theoretical concept by howes [14]. siemens gamesa built a ptes pilot in hamburg, germany, which went into operation in summer 2019 [15–17]. they use an electric heater to heat air that is blown through a packed-bed storage, thus charging it. for retrieving the power from the system, the thermal energy can be used to generate steam for a conventional rankine cycle. the maximum electrical output power is 1.5 mw, while the storage has a capacity of 30 mwh, with a round-trip efficiency of about 45 % [8–10]. furthermore, two thermally integrated ptes (ti-ptes) with a heat-pump/organic rankine cycle on a lab scale were built in liège, belgium [18].
utilising waste thermal energy at 75 °c as the cold source for the heat pump, a round-trip efficiency of 100 % is reached [18, 19]. a liquid-air energy storage (laes) with an 8 % round-trip efficiency from the company highview power, delivering 350 kw (2.5 mwh), was tested from 2011 to 2014 in greater london [20, 21]. in addition, there are several other projects of carnot-battery pilot plants already under development, in construction or being tested.

2. co2 as working fluid for ptes cycles

co2 is an interesting candidate as a working fluid for ptes systems because of its critical point at nearly ambient temperature and its high density. the low critical temperature is good for the rejection of thermal energy in a brayton cycle [23]. the high density results in a comparably low compression work and a high power density, which results in small turbomachinery [24]. additionally, it is widely available, inexpensive and nontoxic [24]. problems with co2 are that it is highly diffusive, enhances corrosion of surrounding materials, leads to a rapid depressurisation and is very sensitive to small changes in pressure or temperature near the critical point [24]. additionally, the specific heat capacity of carbon dioxide varies strongly, leading to large changes in the temperature difference between the hot and cold fluid in heat exchangers, which can cause the minimal temperature difference to be not at an inlet or exit, but inside the heat exchanger; this is called a pinch-point problem [23]. compared to cycles running with other working fluids, co2 cycles are not so widely tested and used yet, which can lead to possible problems, but also offers a great potential.

table 1. physico-chemical properties of co2 [22]
name: carbon dioxide; formula: co2; molecular weight: 44.01 kg/kmol; critical temperature: 31.03 °c; critical pressure: 7.383 mpa; critical density: 466 kg/m3

most proposed ptes systems which employ co2 as the working fluid are based on transcritical rankine cycles, reaching round-trip efficiencies between 40.9 % and 68.6 % [25–29]. only mctigue et al. proposed a supercritical co2 brayton cycle, reaching a 60.4 % round-trip efficiency with a low-temperature cycle and 78.4 % in a high-temperature cycle [30]. a very different approach is the transcritical isothermal rankine cycle by kim et al., using a double-acting liquid piston system with direct heat transfer to the storage medium water, reaching a 68.6 % round-trip efficiency [31]. for the proposed transcritical co2 rankine cycles, the pinch-point problem is usually solved by using multiple hot storage tanks, using an indirect heat transfer to (mostly pressurised) water tanks [25, 26, 32]. for ayachi et al. [28], who are using a ground storage, the heat transfer is directly between the co2 and the solid storage material through which it is flowing (a direct, passive multi-tube packed-bed storage). their pinch-point considerations are directly connected with the overall storage design. steinmann et al. [29] point out that it is generally impossible to achieve a mean temperature difference of 5 to 10 k between a single pressurised water storage and the co2 while still having a constant mass flow and a significant heat transfer (temperature difference) in the heat exchanger. unlike the before-mentioned research, they choose to assume an ideal storage system, neglecting the pinch-point problem, rather than coming up with a possible solution.
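the strong variation of the isobaric heat capacity behind the pinch-point problem can be inspected with any property library; the paper itself uses refprop [38], while the sketch below uses the open coolprop package as a stand-in (our own assumption, not the authors' tooling):

```python
from CoolProp.CoolProp import PropsSI

# isobaric heat capacity of co2 along the 8 mpa isobar (the near-critical low
# pressure of the proposed cycle): cp peaks sharply close to the critical
# temperature, which is what distorts temperature profiles in heat exchangers
for T in range(305, 360, 5):                         # kelvin
    cp = PropsSI("Cpmass", "T", T, "P", 8e6, "CO2")  # J/(kg*K)
    print(f"T = {T} K  cp = {cp / 1e3:7.2f} kJ/(kg*K)")
```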
mctigue et al. designed a simple, non-recuperated brayton cycle [30], even though they mention that recuperation would be feasible due to the high temperature difference between the compression and the expansion. due to the high potential of co2 in its supercritical state, this paper investigates a new layout of a brayton sco2 ptes. the proposed cycle is recompressed and employs a double recuperation, because this layout reaches the highest efficiencies for sco2 cycles and, generally, has a great potential for recuperation, as also stated by mctigue et al. [30], whereas other research focused on transcritical co2 cycles or simple non-recuperated supercritical co2 cycles [33, 34]. the paper will determine if this layout is favourable for a carnot-battery and how the boundary conditions limit the power cycle performance (e.g., how the carnot-battery's storage temperatures and the flow rates of the working fluid and the storage material limit the options of cycle configuration and influence one another). the effects of different operating temperatures and pressure ratios, as well as of various temperature differences of the heat exchangers and efficiencies of turbines and compressors, were analysed. the key inputs are presented together with the best combination of parameters in section 4.

3. concept of a supercritical co2 brayton cycle with hot and cold liquid storage

3.1. description of the hot and cold storage

the proposed supercritical co2 brayton cycle has a hot and a cold two-tank liquid storage. for the discharging cycle, a turbine and two compressors (one for recompression) are needed. the same principle applies to the charging cycle, where two turbine sections and one compressor are necessary. for the charging cycle, a heat exchanger to ambient air is necessary due to irreversibilities within the cycle. furthermore, each configuration needs two heat exchangers for the recuperation. two heat exchangers for transferring the thermal energy between the cycle and the storage tanks are used for the charging and discharging. since the heat pump and the power cycle can employ the same pressure ratio, due to the flexibility in mass flows through the recompression stage, it seems possible that some of the turbomachinery can be used in both charging and discharging modes; possible effects are not considered further. each storage consists of two tanks: one contains the storage material at a high temperature and the other one a material at a low temperature. within a tank, the storage material is kept at a constant temperature. figure 5 shows the working principle of the two-tank storage. during charging, the material from the colder tank of the hot storage is pumped through a heat exchanger, which transfers the thermal energy from the cycle to the storage medium. it is then stored in the second tank at a higher temperature. for the cold storage, the fluid cools down during the charging cycle, heating the cycle fluid. in the discharging phase, the fluid is pumped the other way around: the hot tank of the hot storage is emptying while the colder tank is filling up, and the level of fluid in the cold tank of the cold storage is decreasing while it is rising in the warmer tank. through maintaining a constant flow between a charged and a discharged tank of a storage, where both tanks maintain a constant temperature but vary only in the amount of stored material, the heat flux in the heat exchanger is constant.
figure 5. scheme of the hot and cold two-tank storage.

the hot storage material is solar salt with the chemical composition 60 % NaNO3 + 40 % KNO3. it has a melting temperature of 221.04 °C and thermal stability up to 588.51 °C [35]. the mean heat capacity for the operating temperatures of the hot storage is 1.518 kJ kg⁻¹ K⁻¹ [36]. the cold storage can work with a thermal oil such as therminol 66 because of its lower temperatures (thermal stability up to 380 °C) [37]. the mean heat capacity of the thermal oil at cold-storage temperatures is 1.782 kJ kg⁻¹ K⁻¹. self-discharge of the storages is neglected.

3.2. double recuperated and recompressed brayton cycle with co2 as working fluid

the ptes is designed to deliver 10 MW of thermal power (3.4 MW electric). self-discharge of the storages, the pumps on the storage side and mechanical and electrical losses are neglected, while losses due to heat transfer were considered. a parameter variation as well as a pinch-point analysis were carried out to determine the best cycle performance.

3.2.1. cycle layout for charging the (hot) storage

figure 6 shows the layout of the charging cycle. at a pressure ratio of 3, the system operates in a temperature range between 16 °C and 513 °C. from point 1 to 2, the co2 is compressed to a pressure of 24 MPa, and between 2 and 3, the thermal energy is used for charging the hot storage, cooling down the fluid. the thermal energy at the lower temperature is recuperated between 3 and 4. at point 4, the mass flow is split to continue with the recuperation from 4 to 5 and with an expansion from 4 to 8. after point 5, irreversibilities are dissipated to the environment in a heat exchanger between 5 and 6. from 6 to 7, the second portion of the working fluid is expanded back to 8 MPa in the turbine and then heated from 7 to 8 by cooling down the cold storage. at point 8, the flow is merged with the parallel expanded portion and, from 8 to 9 and 9 to 1, heated with the thermal energy recuperated from 3 to 4 and 4 to 5. by splitting the flow at point 4 and merging it at point 8, the amount of heat transferred to the cold storage can be influenced.

3.2.2. cycle layout for discharging the (hot) storage

the layout of the discharging cycle can be seen in figure 7. the upper and lower temperatures are given by the terminal temperature difference of 8 K between the working fluid and the hot storage during the charging and discharging. the pressure ratio for this cycle is 3 as well (8 MPa / 24 MPa). from state a to b, the co2 expands, followed by recuperation from b to c and c to d. at d, the flow is split into a share which is cooled by discharging the cold storage (d to e), compressed (e to f) and recuperated (f to g), and a part which is directly recompressed (d to g) before the two are merged again at point g.

figure 6. double recuperated sco2 brayton cycle as a heat pump with dual expansion (charging).
figure 7. double recuperated and recompressed co2 brayton cycle (discharging).
figure 8. t-s diagram of the charging (red) and discharging (blue) of the proposed ptes with the recuperated and recompressed sco2 brayton cycle.
the second heating with recuperation takes place from g to h, followed by heating with the thermal energy from the hot storage from h to a.

3.2.3. thermodynamic model

both the charging and discharging cycles are illustrated in figure 8. the nominal value of the low pressure is 8 MPa and is therefore close to the critical value. with the mentioned pressure ratio of 3, the nominal high pressure is 24 MPa. the efficiency of the discharging cycle of the ptes system, $\eta_{dc}$, is 34.9 % (1), and the coefficient of performance of the heat pump (charging cycle), $COP_{cc}$, is 1.11 (2). all co2 properties were retrieved from the refprop library [38].

$$\eta_{dc} = \frac{w_{net,dc}}{q_{hotstor,dc}} = \frac{w_{exp,ab} - w_{comp,ef} - w_{comp,dg}}{q_{ha}} \quad (1)$$

$$COP_{cc} = \frac{q_{out,hotstor,cc}}{w_{net,cc}} = \frac{q_{23}}{w_{comp,12} - w_{exp,67} - w_{exp,48}} \quad (2)$$

$w_{net}$ denotes the net cycle work, $w_{exp}$ the expansion work and $w_{comp}$ the compression work, while $q_{hotstor}$ and $q_{coldstor}$ describe the heat flux between the cycle and the hot and cold storages, respectively. the indices dc and cc denote the discharging and charging cycles, while the numbers (for the charging cycle) and letters (for the discharging cycle) specify between which states the change takes place. the round-trip efficiency $\eta_{rt}$ of the proposed system is 38.9 % and is calculated as the ratio of the net works of the discharging and charging cycles (3) or, expressed differently with (1) and (2), by multiplying the cop of the heat pump and the efficiency of the heat engine (4).

$$\eta_{rt} = \frac{w_{net,dc}}{w_{net,cc}} = \frac{w_{exp,ab} - w_{comp,ef} - w_{comp,dg}}{w_{comp,12} - w_{exp,67} - w_{exp,48}} \quad (3)$$

$$\eta_{rt} = COP_{cc} \cdot \eta_{dc} \quad (4)$$

table 2 lists the isentropic efficiencies of the turbomachinery and the work and heat transfer rates within and across the boundaries of the cycle. the heat flux from the hot storage to the cycle during the discharging is set by the power of the system, $\dot{q}_{system} = 3.4$ MW, and the efficiency of the discharging cycle (5). furthermore, the total mass flow rate $\dot{m}$ is calculated with the enthalpy difference $\Delta h$ of the storage (6).

$$\dot{q}_{hotstor} = \frac{\dot{q}_{system}}{\eta_{dc}} \quad (5)$$

$$\dot{m}_{dc} = \frac{\dot{q}_{hotstor}}{\Delta h_{ha}} \quad (6)$$

around a mass flow ratio $\dot{m}_{dc,maincomp}/\dot{m}_{dc,total}$ of 0.8, the pinch points within and between the charging and discharging cycles are in the desired range. together with the enthalpy difference from d to e, the heat flux from the cycle to the cold storage fluid is calculated (7).

$$\dot{q}_{coldstor} = \dot{m}_{dc,maincomp} \cdot \Delta h_{de} \quad (7)$$

the mass flow on the storage side, between the two tanks, is calculated in the same manner but is slightly lower due to the irreversibilities in the heat exchanger.
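equations (1)–(4) can be checked directly against the work and heat rates listed in table 2 below. a short script with the tabulated values (in kW):

```python
# Evaluate equations (1)-(4) with the work and heat rates of Table 2 [kW].
w_comp_12, w_exp_67, w_exp_48 = 9294.0, 33.0, 272.0      # charging cycle
w_exp_ab, w_comp_ef, w_comp_dg = 7249.0, 2530.0, 1225.0  # discharging cycle
q_23 = q_ha = 10000.0      # heat exchanged with the hot storage

w_net_dc = w_exp_ab - w_comp_ef - w_comp_dg   # net output while discharging
w_net_cc = w_comp_12 - w_exp_67 - w_exp_48    # net input while charging

eta_dc = w_net_dc / q_ha                      # eq. (1) -> 0.349
cop_cc = q_23 / w_net_cc                      # eq. (2) -> 1.11
print(f"eta_dc = {eta_dc:.3f}, COP_cc = {cop_cc:.3f}")
print(f"eta_rt = {w_net_dc / w_net_cc:.3f}")  # eq. (3) -> 0.389
print(f"COP_cc * eta_dc = {cop_cc * eta_dc:.3f}")  # eq. (4), identical
```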
table 2. thermal model key inputs and results for the proposed ptes:

temperatures and pressures: $t_{low}$ = 16 °C; $t_{high}$ = 513 °C; $p_{nom,low}$ = 8 MPa; pressure ratio = 3.
mass flows: charging cycle $\dot{m}_{total,cc}$ = 56.9 kg/s, ratio $\dot{m}$ = 0.8; discharging cycle $\dot{m}_{total,dc}$ = 56.9 kg/s, ratio $\dot{m}$ = 0.4.
isentropic efficiencies of compressor and turbine: $\eta_{comp}$ = 0.9; $\eta_{exp}$ = 0.9.
terminal temperature difference / pinch point: cold storage 18 K / 7.5 K; hot storage 8 K / 7.4 K; recuperation 8 K / 8 K.
cycle efficiencies: $COP_{cc}$ = 1.10; $\eta_{dc}$ = 0.342; $\eta_{overall}$ = 0.378.
charging cycle: $w_{comp,12}$ = 9294 kW; $w_{exp,48}$ = −272 kW; $w_{exp,67}$ = −33 kW; $q_{in,coldstor,78}$ = 6506 kW; $q_{out,hotstor,23}$ = −10000 kW; $q_{amb,56}$ = −5495 kW; $q_{recup,34}$ = −14987 kW; $q_{recup,91}$ = 14987 kW; $q_{recup,45}$ = −1139 kW; $q_{recup,89}$ = 1139 kW; $w_{net,cc}$ = 8989 kW.
discharging cycle: $w_{exp,ab}$ = −7249 kW; $w_{comp,ef}$ = 2530 kW; $w_{comp,dg}$ = 1225 kW; $q_{out,coldstor,de}$ = −6506 kW; $q_{in,hotstor,ha}$ = 10000 kW; $q_{recup,bc}$ = −5664 kW; $q_{recup,gh}$ = 5664 kW; $q_{recup,cd}$ = −8876 kW; $q_{recup,fg}$ = 8876 kW; $w_{net,dc}$ = −3494 kW.

the effectiveness of the heat exchangers is represented by a minimum pinch point between the storage side and the cycle. the enthalpy difference on the storage side then results from the corresponding temperatures and the heat capacity of the storage material. the minimum amount of stored thermal energy (8) and of storage material (9) for the hot and cold storage, respectively, is found by multiplying the heat and mass flow rates by the storage discharging time.

$$q_{stor} = \dot{q}_{stor} \cdot t_{discharge} \quad (8)$$

$$m_{stor} = \dot{m}_{stor,dc} \cdot t_{discharge} \quad (9)$$

five hours were chosen as the duration of charging and discharging, as this is a typical wind-farm production time [39].
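the sizing rules (8)–(9) take only a few lines to evaluate. the heat rates, mean heat capacities and the five-hour discharge are taken from the text; the tank-to-tank temperature swings are not stated explicitly, so the values below are assumptions chosen to be consistent with the inventories quoted in section 3.2.5 (about 777 t of salt and 387 t of oil):

```python
# Storage sizing per equations (8)-(9), with mdot_stor = q / (cp * dT).
# Temperature swings dT between the two tanks are assumed values.
t_dis = 5 * 3600.0                                   # discharging time [s]
stores = [("hot storage (solar salt)", 10.0e6, 1518.0, 152.6),
          ("cold storage (Therminol 66)", 6.506e6, 1779.0, 170.0)]

for name, q, cp, dT in stores:       # q [W], cp [J/(kg K)], dT [K]
    Q = q * t_dis                    # eq. (8): stored thermal energy [J]
    m = Q / (cp * dT)                # eq. (9): minimum storage material [kg]
    print(f"{name}: Q = {Q / 1e9:.0f} GJ, m = {m / 1e3:.0f} t")
# -> hot: 180 GJ and ~777 t; cold: ~117 GJ and ~387 t
```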
3.2.4. effect of different system parameters on the system's round-trip efficiency

to make full use of the positive effects of co2 close to its critical pressure, a low pressure of 8 MPa was chosen. figure 9 shows that the round-trip efficiency ($\eta_{rt}$) is higher with a higher pressure ratio of the discharging cycle and a lower pressure ratio of the charging cycle. however, pressure ratios lower than 3 in the charging cycle and lower than 4 (at a discharging-cycle pressure ratio of 4), as well as pressure ratios higher than 4 during discharging, are not possible in this ptes system, because the minimum temperature differences would be violated. the best combination is a pressure ratio of 3 for both the charging and the discharging cycle. a greater difference between the minimum and maximum temperatures results in a higher round-trip efficiency, as is to be expected. 10 °C is therefore suggested as the minimum temperature of the ideal ptes ($t_{low}$); the resulting minimum temperature of the cycle fluid for the rejection of thermal energy to ambient is 20 °C. figure 10 shows that a rise in the cycle's low temperature would cause a drastic drop in the system's round-trip efficiency. the ptes is therefore proposed with a maximum temperature of 500 °C ($t_{high}$), which is the limit for many materials that could be used for the storage containers, turbomachinery and storage materials. figure 11 shows that with temperatures greater than 600 °C, nearly 50 % efficiency can be achieved. at temperatures as high as this, storage materials like the solar salt mixture 40 % NaNO2 + 7 % NaNO3 + 53 % KNO3, with a working range between 142.24 °C (melting point) and 630.97 °C and a heat capacity of 1439 J/(kg K), as proposed for csp plants by fernández et al., could be used [35]. the effect of the isentropic efficiencies of the compressors and expanders is also clear. figure 12 shows the change of the round-trip efficiency with the isentropic compressor efficiency ($\eta_{comp}$): once with a constant isentropic expander efficiency of 0.9 ($\eta_{exp} = 0.9$), and once with an equal efficiency for all turbomachinery ($\eta_{comp} = \eta_{exp}$). the latter is a very simplified approach because, generally, the turbine's efficiency can be expected to be much higher than the compressor's. it is, however, certain that the isentropic efficiency of the turbomachinery has a great impact on the system's round-trip efficiency. the last analysed impact is that of the temperature differences of the various heat exchangers of the system. a special focus was also placed on the pinch-point problem, investigated for the best configuration of the cycle in section 3.2.5. generally, a smaller temperature difference (i.e., a higher effectiveness of the heat exchangers) means a higher round-trip efficiency of the system, as can be seen in figure 13. except for the case where the terminal temperature differences of the cold-storage heat exchanger and of the recuperator are each 5 K ($ttd_{coldst} = 5$; $ttd_{recup} = 5$, yellow line), the temperature difference between the heat exchanger and the cold storage is shown on the x-axis; in the mentioned case, it is the temperature difference of the hot-storage heat exchanger ($ttd_{hotst}$) instead. all variations are displayed in this one diagram to allow an easy visual identification of the best combination. the temperature difference compared in the diagram is the terminal temperature difference at the two ends of the heat exchanger and is therefore not transferable to the heat-exchanger effectiveness, which corresponds to the minimum temperature difference. the minimum temperature difference in the cycle-to-storage heat exchangers is sometimes lower than the temperature difference at the ends of the heat exchanger, due to the severe pinch-point problem of sco2 (explained in section 2 and visualised for the proposed cycle in section 3.2.5).

figure 9. effect of the pressure ratios of the charging and discharging cycles on the system's round-trip efficiency.
figure 10. effect of the low temperature on the system's round-trip efficiency (with a high temperature of 500 °C).
figure 11. effect of the high temperature on the system's round-trip efficiency (with a low temperature of 10 °C).
figure 12. effect of the isentropic efficiencies of compressor and expander on the system's round-trip efficiency.
figure 13. effect of the temperature differences at the heat exchanger ends on the system's round-trip efficiency.

3.2.5. heat exchanger analysis

the heat exchangers were checked for the pinch-point problem with the following analysis. the hot-storage heat exchanger transfers thermal energy between the co2 cycle and the solar salt of the storage. figure 14 shows the charging and figure 15 the discharging of the hot storage. the terminal temperature difference at the hot and cold ends of the heat exchanger is 8 K, with a pinch point of 7.5 K (during the charging mode). the necessary heat flux is 10 MJ/s.

figure 14. heat transfer from the (charging) cycle to the hot storage.
figure 15. heat transfer from the hot storage to the (discharging) cycle.
in total, 180 GJ of thermal energy is stored in the solar salt, and at least 777 tons of it are required to transfer the thermal energy between the two-tank storage and the charging cycle. the molten salt has a working range between 221.04 °C (melting point) and 588.51 °C [35], with a mean heat capacity of 1518 J/(kg K) over the working temperature range of the storage [36]. between the co2 cycle and the thermal oil of the cold storage, a heat exchanger with a pinch point of 7.5 K is used (figures 16, 17). the terminal temperature difference during the discharge operation is 18 K. the necessary heat flux is about 6.5 MJ/s. the heat capacity of the oil (therminol 66) is, on average, 1779 J/(kg K) [37] between the heat capacities at the minimum and maximum temperatures of the heat transfer. the thermal oil is stable up to a maximum temperature of 380 °C. 117 GJ of thermal energy is stored in the thermal oil, again assuming no losses through self-discharge. this amount of heat, transferred between the given temperatures, corresponds to 387 tons of the thermal oil. the heat transfer in the recuperators is shown in figures 18 to 21. they have a pinch point of 8 K (in the high-temperature recuperator) or greater, and the terminal temperature difference is always higher than that, up to 43 K.

figure 16. heat transfer from the (charging) cycle to the cold storage.
figure 17. heat transfer from the cold storage to the (discharging) cycle.

4. results and discussion

table 2 summarises the results of the chosen thermodynamic model. the efficiency of the discharging cycle of the ptes system, $\eta_{dc}$, is 34.9 %, which is somewhat lower than the maximum efficiency that can theoretically be reached without an integration into a ptes system. such a standard supercritical co2 brayton cycle (without an integration into a ptes system) can reach efficiencies of about 20.5 %; if it is recuperated, 42.6 % is possible; if it is recompressed, with a double recuperation, it can be up to 45.0 % [23]. the lower efficiency of the cycle, when integrated into a ptes system as its discharging cycle, can be explained by the boundary conditions applied to the cycle through the attached storages, by the unavoidable rejection of thermal energy to the environment, and simply by the desired optimum between a high cop of the heat pump and a high efficiency of the power cycle. the heat dissipation to the environment is necessary because of the inevitable irreversibilities (entropy generation) during the charging and discharging. the coefficient of performance of the heat pump (charging cycle), $COP_{cc}$, is 1.11, yielding a round-trip efficiency $\eta_{rt}$ of 38.9 % for the proposed system. the pressure losses due to the heat exchange were calculated for each heat exchanger, assuming the use of shell-and-tube heat exchangers.

figure 18. heat transfer in the high-temperature recuperator during charging.
figure 19. heat transfer in the high-temperature recuperator during discharging.
figure 20. heat transfer in the low-temperature recuperator during charging.
figure 21. heat transfer in the low-temperature recuperator during discharging.

the power consumption of the pumps transporting the liquid through the storages, as well as the power drain of the motor and generator, is neglected. however, if we assume $\eta_{gen} = 98\,\%$ and $\eta_{mot} = 98\,\%$, while the hot-storage pump reduces the net work by $w_{hotstorpump} = 10.6$ kW and the cold-storage pump by $w_{coldstorpump} = 15.9$ kW, the round-trip efficiency reduces to about 37.6 %, as given by formula (10).

$$\eta_{rt,reduced} = \frac{(w_{net,dc} - w_{storpump}) \cdot \eta_{gen}}{(w_{net,cc} - w_{storpump}) / \eta_{mot}} \quad (10)$$

formula (10) can also be written as (11), with $w_{net,dc,reduced} = w_{net,dc} - w_{hotstorpump}$ and $w_{net,cc,reduced} = w_{net,cc} - w_{coldstorpump}$:

$$\eta_{rt,reduced} = \frac{w_{net,dc,reduced} \cdot \eta_{gen}}{w_{net,cc,reduced} / \eta_{mot}} = \frac{w_{net,dc,reduced}}{w_{net,cc,reduced}} \cdot \eta_{gen} \cdot \eta_{mot}. \quad (11)$$

this allows the reduced round-trip efficiency to be expressed with the reduced cop of the charging cycle and the reduced efficiency of the discharging cycle (12).

$$\eta_{rt,reduced} = COP_{cc,reduced} \cdot \eta_{dc,reduced} \cdot \eta_{gen} \cdot \eta_{mot} \quad (12)$$

the parameters of the proposed cycle were chosen according to the parameter variation presented in section 3.2.4. it is apparent that maximising the pressure ratio and the difference between the minimum and maximum temperature improves not only the efficiency of the stand-alone cycle but also that of the overall system, although the cop of the heat pump decreases through these measures. if materials and costs did not limit the upper temperature, this configuration could reach more than a 50 % round-trip efficiency already at 800 °C, without any further changes to the original system. the lower temperature of the cycle is limited by the minimum temperature for the heat exchange with the environment. a higher turbine efficiency, of course, also results in a higher round-trip efficiency, but it is limited by the available technologies. for today's available sco2 gas turbines, a 90 % efficiency might be possible, while for the compressor this theoretical value is most likely too high. if the compressor did not reach such a high efficiency, the round-trip efficiency would drop drastically, and a realisation would most likely not be feasible, especially considering the complexity of the cycle. the heat-exchanger effectiveness is in a more realistic range, and it is also clear that a higher effectiveness (lower temperature difference) leads to better overall results. because the same pressure ratio was chosen for the charging and the discharging cycles, which is only possible through the recompression, the temperatures for storing and recuperating the thermal energy are relatively fixed, and the possibilities for improving the cycle efficiency are limited. this cycle configuration was, however, still chosen and analysed because it allows the use of the reciprocating devices and all heat exchangers for both the heat pump and the power cycle. for a similar cycle with slightly different pressure ratios, recompression would not be necessary and a similar or higher round-trip efficiency could be reached.

5. conclusion

this paper investigated the feasibility of a carnot battery with a recuperated and recompressed supercritical co2 brayton cycle. the pressure ratio is 3 with a low nominal pressure of 8 MPa. the system operates in a temperature range between 16 °C and 513 °C and provides a round-trip efficiency of 38.9 % with a maximum electric power output of 3.5 MW. the carnot battery could be attached to a small wind park (two to four wind turbines) or, for example, situated in a residential area with a high share of installed solar power generation; both could charge the storage with a power of 9 MW. co2, in its supercritical state, can be used in a brayton cycle as part of an energy storage system with a hot and a cold storage.
the usual measures to enhance the co2 cycle performance are much harder to realise for a carnot battery due to its complexity. the upside of a recuperated sco2 brayton cycle, its small turbomachinery, is partly offset by the space needed for the heat exchangers (heat-exchanger areas around 70 to 100 m²). therefore, other cycle layouts, as well as other working media, should definitely be investigated further. some thermal energy needs to be released to the environment to keep the energy balanced within the system; this thermal energy could be used in district heating or for an attached orc, as it is still at a temperature of about 130 °C. the modelled system provides a round-trip efficiency of 38.9 %. for such a complex system, with recompression and double recuperation, this round-trip efficiency is comparably low. simpler configurations of sco2 brayton cycles might reach the same or higher efficiencies, as the analysis of mctigue et al. in 2019 also suggests. they proposed a simple sco2 brayton cycle which could reach up to 60.4 % with a low-temperature cycle (200 °C) and 78.4 % with a high-temperature cycle (560 °C). they mention that a recuperation would be feasible due to the high temperature difference between the compression and the expansion, but they did not analyse it. this paper, however, shows that a recuperation is difficult to implement and might actually not improve the round-trip efficiency of a sco2 brayton ptes as expected, due to the limitations of the integration in the storage system.

list of symbols

abbreviations:
CHEST compressed heat energy storage
CO2 carbon dioxide
COP coefficient of performance
ES energy storage
ETES electro-thermal energy storage
ORC organic rankine cycle
P2H2P power-to-heat-to-power
PCM phase change material
PHES pumped heat energy storage
PTES pumped thermal energy storage
TES thermal energy storage

symbols:
h specific enthalpy [kJ/kg]
ṁ mass flow [kg/s]
PR pressure ratio [–]
q heat [J]
q̇ heat flow rate [J/s = W]
s specific entropy [J/(kg K)]
t time (e.g., discharging time t_discharge) [s]
T temperature [°C]
w work rate (power) [W]
η isentropic efficiency [–]

subscripts:
amb ambient; cc charging cycle; coldstor cold storage; comp compression; dc discharging cycle; exp expansion; gen generator; hotstor hot storage; in added to the cycle; mot motor; out extracted from the cycle; recup recuperation/recuperated; rt round-trip; stor storage; 1, 2, 3, … cycle state points during charging; a, b, c, … cycle state points during discharging.

acknowledgements

this work was supported by the grant agency of the czech technical university in prague, grant no. SGS20/116/OHK2/2T/12.

references

[1] a. b. gallo, et al. energy storage in the energy transition context: a technology review. renewable and sustainable energy reviews 65:800–822, 2016. https://doi.org/10.1016/j.rser.2016.07.028
[2] global modeling and assimilation office, s. pawson. inst3_3d_asm_cp: merra 3d iau state, meteorology instantaneous 3-hourly (p-coord, 1.25x1.25l42), version 5.2.0, 2008.
[3] global modeling and assimilation office, s. pawson. inst6_3d_ana_np: merra 3d analyzed state, meteorology instantaneous 6-hourly (p-coord, 2/3x1/2l42), version 5.2.0, 2008.
[4] global modeling and assimilation office, s. pawson. tavg1_2d_slv_nx: merra 2d iau diagnostic, single level meteorology, time average 1-hourly (2/3x1/2l1), version 5.2.0, 2008.
[5] global modeling and assimilation office, s. pawson. merra-2 tavg1_2d_rad_nx: 2d, 1-hourly, time-averaged, single-level, assimilation, radiation diagnostics, v5.12.4, 2015.
[6] m. söllch. netzrelevante daten [grid-relevant data]. [2021-07-14], https://web.archive.org/web/20160821034950/https://www.main-donau-netz.de/header/veroeffentlichungen/strom/netzrelevante-daten.html
[7] s. pawson. gmao merra: modern era retrospective-analysis for research and applications. [2020-03-05], https://disc.gsfc.nasa.gov/
[8] iea, international energy agency. technology roadmap energy storage, 2014.
[9] x. wang, et al. investigation of thermodynamic performances for two-stage recompression supercritical co2 brayton cycle with high temperature thermal energy storage system. energy conversion and management 165:477–487, 2018. https://doi.org/10.1016/j.enconman.2018.03.068
[10] v. novotný. pumped thermal energy storage (carnot batteries): overview and prospects.
[11] a. valera-medina, et al. ammonia for power. progress in energy and combustion science 69:63–102, 2018. https://doi.org/10.1016/j.pecs.2018.07.001
[12] t. barmeier. electric thermal energy storage (etes). [2020-08-16], https://windenergietage.de/wp-content/uploads/sites/2/2017/11/26wt0811_f11_1120_dr_barmeier.pdf
[13] the engineer. team connects first grid-scale pumped heat energy storage system. [2020-08-13], https://www.theengineer.co.uk/grid-scale-pumped-heat-energy-storage/
[14] j. howes. concept and development of a pumped heat electricity storage device. proceedings of the ieee 100(2):493–503, 2012. https://doi.org/10.1109/jproc.2011.2174529
[15] siemens gamesa renewable energy, s.a. start of construction in hamburg-altenwerder: siemens gamesa to install fes heat-storage for wind energy. [2020-08-13], https://www.siemensgamesa.com/en-int/newsroom/2017/11/start-of-construction-in-hamburg-altenwerder
[16] siemens gamesa renewable energy, s.a. world first: siemens gamesa begins operation of its innovative electrothermal energy storage system. [2020-08-13], https://www.siemensgamesa.com/en-int/newsroom/2019/06/190612-siemens-gamesa-inauguration-energy-system-thermal
[17] siemens gamesa renewable energy, s.a. thermal energy storage with etes | siemens gamesa. [2020-08-13], https://www.siemensgamesa.com/en-int/products-and-services/hybrid-and-storage/thermal-energy-storage-with-etes
[18] o. dumont, et al. carnot battery technology: a state-of-the-art review. journal of energy storage 32:101756, 2020. https://doi.org/10.1016/j.est.2020.101756
[19] o. dumont, v. lemort. first experimental results of a thermally integrated carnot battery using a reversible heat pump/organic rankine cycle. in 2nd international workshop on carnot batteries 2020. https://www.researchgate.net/publication/344252650_first_experimental_results_of_a_thermally_integrated_carnot_battery_using_a_reversible_heat_pump_organic_rankine_cycle
[20] r. morgan, et al. liquid air energy storage — analysis and first results from a pilot scale demonstration plant. applied energy 137:845–853, 2015. https://doi.org/10.1016/j.apenergy.2014.07.109
[21] highview power. plants: pilot plant. [2020-10-26], https://highviewpower.com/plants/
[22] schick gmbh + co. kg. r744 / co2: kohlendioxid: ein unbegrenzt verfügbarer und natürlicher stoff [carbon dioxide: an unlimited, naturally available substance]. [2020-08-10], https://www.schickgruppe.de/wordpress/?page_id=1103
[23] k. brun, et al. fundamentals and applications of supercritical carbon dioxide (sco2) based power cycles. woodhead publishing, an imprint of elsevier, duxford, united kingdom, 2017.
[24] s. weihe, et al. untersuchung zur evaluierung von forschungspotentialen hinsichtlich prozessen, komponenten und werkstoffen von trans- und superkritischen co2-anwendungen [study evaluating research potentials regarding processes, components and materials of trans- and supercritical co2 applications]. [2020-08-17], http://www.zfes.uni-stuttgart.de/downloads/bericht_zfes_mpa_itsm_ike.pdf
[25] m. morandin, et al. thermo-electrical energy storage: a new type of large scale energy storage based on thermodynamic cycles. [2020-08-17], https://www.researchgate.net/publication/259973612_thermoelectric_energy_storage_a_new_type_of_large_scale_energy_storage_based_on_thermodynamic_cycles
[26] m. mercangöz, et al. electrothermal energy storage with transcritical co2 cycles. energy 45(1), 2012. https://doi.org/10.1016/j.energy.2012.03.013
[27] y. m. kim, et al. transcritical or supercritical co2 cycles using both low- and high-temperature heat sources. energy 43(1):402–415, 2012. https://doi.org/10.1016/j.energy.2012.03.076
[28] f. ayachi, et al. thermo-electric energy storage involving co2 transcritical cycles and ground heat storage. applied thermal engineering 108:1418–1428, 2016. https://doi.org/10.1016/j.applthermaleng.2016.07.063
[29] w.-d. steinmann, h. jockenhöfer, d. bauer. thermodynamic analysis of high-temperature carnot battery concepts. energy technology 8(3):1900895, 2020. https://doi.org/10.1002/ente.201900895
[30] j. mctigue, et al. pumped thermal electricity storage with supercritical co2 cycles and solar heat input. aip conference proceedings 2303(1):190024, 2020. https://doi.org/10.1063/5.0032337
[31] y.-m. kim, et al. isothermal transcritical co2 cycles with tes (thermal energy storage) for electricity storage. energy 49:484–501, 2013. https://doi.org/10.1016/j.energy.2012.09.057
[32] m. morandin, et al. thermoeconomic design optimization of a thermo-electric energy storage system based on transcritical co2 cycles. energy 58:571–587, 2013. https://doi.org/10.1016/j.energy.2013.05.038
[33] v. dostal, et al. a supercritical carbon dioxide cycle for next generation nuclear reactors. [2021-07-14], https://web.mit.edu/22.33/www/dostal.pdf
[34] g. angelino. carbon dioxide condensation cycles for power production. journal of engineering for power 90(3):287–295, 1968. https://doi.org/10.1115/1.3609190
[35] a. g. fernández, et al. thermal characterization of hitec molten salt for energy storage in solar linear concentrated technology. journal of thermal analysis and calorimetry 122(1):3–9, 2015. https://doi.org/10.1007/s10973-015-4715-9
[36] sqm international n.v. thermo-solar salts. [2020-08-21], https://www.sqm.com/wp-content/uploads/2018/05/solar-salts-book-eng.pdf
[37] solutia. therminol 66: high performance highly stable heat transfer fluid. [2020-03-11], http://twt.mpei.ac.ru/tthb/hedh/htf-66.pdf
[38] e. lemmon. nist reference fluid thermodynamic and transport properties database: version 9.0, nist standard reference database 23. https://doi.org/10.18434/t4js3c
[39] k. attonaty, et al. thermodynamic analysis of a 200 mwh electricity storage system based on high temperature thermal energy storage. energy 172:1132–1143, 2019. https://doi.org/10.1016/j.energy.2019.01.153

acta polytechnica 61(5):644–660, 2021

nonlocal theories in continuum mechanics

m. jirásek

abstract

the purpose of this paper is to explain why the standard continuum theory fails to properly describe certain mechanical phenomena and how the description can be improved by enrichments that incorporate the influence of gradients or weighted spatial averages of strain or of an internal variable. three typical mechanical problems that require such enrichments are presented: (i) dispersion of short elastic waves in heterogeneous or discrete media, (ii) size effects in microscale elastoplasticity, in particular the size dependence of the apparent hardening modulus, and (iii) localization of strain and damage in quasibrittle structures and the resulting transitional size effect. problems covered in the examples encompass static and dynamic phenomena, linear and nonlinear behavior, and three constitutive frameworks, namely elasticity, plasticity and continuum damage mechanics. this shows that enrichments of the standard continuum theory can be useful in a wide range of mechanical problems.

keywords: damage mechanics, dispersion, elasticity, enriched continuum, gradient models, nonlocal models, plasticity, quasibrittle materials, size effect, strain localization, wave propagation.

1 introduction

in standard continuum mechanics, a solid body is decomposed into a set of idealized, infinitesimal material volumes, each of which can be described independently as far as the constitutive behavior is concerned. of course, this does not mean that the individual material points are completely isolated, but their interaction can take place only on the level of balance equations, through the exchange of mass, momentum, energy and entropy. in reality, however, no material is an ideal continuum. both natural and man-made materials have a complicated internal structure, characterized by microstructural details whose size typically ranges over many orders of magnitude. the expression "microstructure" will be used as a generic denomination for any type of internal material structure, not necessarily on the level of micrometers. some of these details can be described explicitly by spatial variation of the material properties, but this can never be done simultaneously over the entire range of scales. one reason is that such a model would be prohibitively expensive for practical applications. another, more fundamental, reason is that, on a small enough scale, the continuum description per se is no longer adequate and needs to be replaced by a discrete model (or, ultimately, by interatomic potentials based on quantum mechanics). constructing a material model, one must select a certain resolution level below which the microstructural details are not explicitly "visible" to the model and need to be taken into account approximately and indirectly, by an appropriate definition of "effective" material properties. one should also specify the characteristic wavelength of the imposed deformation fields that can be expected for the given type of geometry and loading.
here, the notion of characteristic wavelength has to be understood in a broad sense, not only as the spatial period of a dynamic phenomenon but also as the length over which the value of strain changes substantially in static problems. if the characteristic wavelength of the deformation field remains above the resolution level of the material model, a conventional continuum description can be adequate. on the other hand, if the deformation field is expected to have important components with wavelengths below the resolution level, the model needs to be enriched so as to capture the real processes more adequately. instead of refining the explicit resolution level, it is often more effective to use various forms of the so-called enriched or generalized continuum formulations. a systematic overview and detailed discussion of generalized continuum theories has been given, e.g., in the recent review paper by bažant and jirásek (2002). the aim of the present study is to demonstrate by specific examples how the need for enriched continuum formulations arises from discrepancies between experimental observations and theoretical predictions based on the standard theories, and also how the model performance can be improved by adding carefully selected enrichment terms. the enrichments to be discussed here are in general referred to as nonlocal, but this adjective must be understood in the broad sense, covering both strongly nonlocal and weakly nonlocal formulations. precise mathematical definitions of strong and weak nonlocality were given by rogula (1982) and are also explained in bažant and jirásek (2002). here we only note that strongly nonlocal theories are exemplified by integral-type formulations with weighted spatial averaging or by implicit gradient models, while weakly nonlocal theories include, for instance, explicit gradient models. the meaning of these expressions will become clear from the examples to follow. we will start from enriched formulations of the theory of elasticity and then proceed to elastoplasticity and damage mechanics. the paper is organized as follows. section 2 treats the dispersion of short elastic waves in heterogeneous or discrete media. it is shown that the standard homogenization procedure erases the information on dispersive properties. dispersion laws are then derived for a host of generalized continuum models, including strain-gradient elasticity, models with mixed spatial-temporal derivatives, and integral-type nonlocal elasticity. advantages and drawbacks of individual formulations are discussed, and a general framework of nonlocal strain-gradient elasticity is outlined. section 3 deals with size effects in microscale elastoplasticity, in particular with the size dependence of the apparent hardening modulus. using the academic example of a semi-infinite shear layer, it is shown that stiffer behavior of smaller structures can be reproduced with explicit or implicit gradient plasticity if appropriate boundary conditions are enforced.
the general trends are discussed and compared to experimental measurements of the size effect in plastic torsion of thin wires. section 4 is concerned with localization of strain and damage in quasibrittle structures and with the resulting transitional size effect. mathematical and numerical difficulties related to the objective description of strain localization due to softening are explained using the one-dimensional example of a tensile bar. it is shown that if a stress-strain law with softening is incorporated in the standard continuum theory, the numerical results suffer from pathological sensitivity to the discretization parameter, such as the size of finite elements. this can be remedied by special enrichments acting as localization limiters, e.g., by a nonlocal damage formulation. the onset of localization is studied analytically, and relations to dispersion analysis are pointed out. it is also shown that the nonlocal model can correctly reproduce the transitional size effect on the nominal strength of a quasibrittle structure. all generalized theories presented here introduce a model parameter with the dimension of length that reflects the intrinsic material length scale. the response of a material point depends not only on the strain and temperature history at that point but also on the history of a certain neighborhood of that point, or even of the entire body. for this reason, such theories are classified as nonlocal in the broad sense.

2 dispersion of elastic waves

2.1 continuum versus discrete models

in the standard continuum theory, propagation of waves in a homogeneous one-dimensional linear elastic medium is described by the hyperbolic partial differential equation

$$\rho \ddot{u} - E u'' = 0, \quad (1)$$

where $\rho$ is the mass density, $E$ is the elastic modulus, $u(x,t)$ is the displacement and, as usual, overdots stand for derivatives with respect to time $t$ and primes for derivatives with respect to the spatial coordinate $x$. since $\rho$ and $E$ are constant coefficients, equation (1) admits solutions of the form

$$u(x,t) = \mathrm{e}^{i(kx - \omega t)} = \mathrm{e}^{ik(x - ct)}, \quad (2)$$

where $i$ is the imaginary unit, $\omega$ is the circular frequency, $k$ is the wave number, and $c = \omega/k$ is the wave velocity. substituting (2) into (1), we find the condition

$$-\rho \omega^2 + E k^2 = 0, \quad (3)$$

which implies that the magnitude of the circular frequency is proportional to the magnitude of the wave number.
the signs of $\omega$ and $k$ are irrelevant: if both change, the real part of (2) does not change, and if one of them changes, the wave propagates in the opposite direction but otherwise remains the same. therefore, we will restrict ourselves to nonnegative values and write the solution of (3) as

$$\omega = k \sqrt{\frac{E}{\rho}}. \quad (4)$$

due to the direct proportionality between $\omega$ and $k$, the velocity of a harmonic elastic wave, $c = \omega/k = \sqrt{E/\rho}$, is constant, independent of the wave number $k$. since a wave of a general shape can be represented by a linear combination of harmonic waves, even such a general wave propagates at constant velocity $c$ and its shape remains invariant.

the situation is different in a discrete mechanical system, which can be best exemplified by a regular infinite one-dimensional array of mass points connected by linear springs (fig. 1a). the governing equations of motion form a system of ordinary differential equations

$$m \ddot{u}_j - K (u_{j+1} - u_j) + K (u_j - u_{j-1}) = 0, \quad j \in \mathbb{Z}, \quad (5)$$

where $m$ is the mass of each mass point, $K$ is the spring stiffness, $u_j$ is the displacement of mass point number $j$, initially located at $x_j$, and $\mathbb{Z}$ is the set of all integer numbers. for an assumed harmonic wave of the form

$$u_j(t) = \mathrm{e}^{i(k x_j - \omega t)} \quad (6)$$

we obtain the condition

$$-m \omega^2 + K \left( 2 - \mathrm{e}^{ik(x_{j+1} - x_j)} - \mathrm{e}^{ik(x_{j-1} - x_j)} \right) = 0, \quad j \in \mathbb{Z}. \quad (7)$$

the spacing of mass points is assumed to be regular, i.e., $x_{j+1} - x_j = x_j - x_{j-1} = a$. the circular frequency corresponding to wave number $k$ is thus

$$\omega = \sqrt{\frac{2K}{m} \left( 1 - \cosh ika \right)} = \sqrt{\frac{2K}{m} \left( 1 - \cos ka \right)} = 2 \sqrt{\frac{K}{m}} \, \sin \frac{ka}{2}. \quad (8)$$

fig. 1: (a) discrete mass-spring model with nearest-neighbor interaction and (b) the corresponding dispersion diagram, with the normalized frequency $\omega \sqrt{m/K}$ plotted as a function of the normalized wave number $ka/\pi$.

for this mass-spring system, the relationship between $\omega$ and $k$ is nonlinear and the wave velocity depends on the wave number. in such a case, one must distinguish between the phase velocity, $c_p = \omega/k$, and the group velocity, $c_g = \mathrm{d}\omega/\mathrm{d}k$. in the long-wave limit (i.e., for $k$ approaching zero), both the phase velocity and the group velocity tend to $c_0 = a\sqrt{K/m}$. as is clear from the $\omega$–$k$ diagram in fig. 1b, $c_0$ is the upper bound on the velocity of waves at any wave number. as $k$ increases from 0 to $2\pi/a$, the corresponding phase velocity decreases from $c_0$ to 0. this means that shorter harmonic waves propagate at slower velocities. for a general wave, propagation of different harmonic components at different velocities leads to changes of the wave shape. this phenomenon is known as dispersion; the equation relating $\omega$ and $k$ is called the dispersion equation, the resulting function $\omega(k)$ is the dispersion law, and its graph is the dispersion curve. it is natural to expect that a discrete mechanical model consisting of mass points $m$ regularly spaced at distance $a$ and connected by springs of stiffness $K$ is in some sense equivalent to a continuum characterized by mass density $\rho = m/a$ and elastic modulus $E = Ka$. however, this "equivalence" has its limits. a standard homogeneous linear elastic continuum has a linear dispersion curve with slope $c = \sqrt{E/\rho} = \sqrt{Ka^2/m}$, which coincides with the initial slope $c_0 = a\sqrt{K/m}$ of the nonlinear dispersion curve of the discrete model. both dispersion curves are almost identical for long waves, but for wave numbers comparable to $\pi/a$ (i.e., for wavelengths comparable to $2a$) they differ substantially.
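the dispersion law (8) is easy to evaluate numerically. the sketch below, with unit mass, stiffness and spacing assumed, compares it with the linear law (4) of the equivalent continuum and shows how the phase velocity drops towards the zone boundary:

```python
# Dispersion of the mass-spring chain, eq. (8), versus the equivalent
# continuum, eq. (4). Unit m, K and a are assumed for illustration.
import numpy as np

m, K, a = 1.0, 1.0, 1.0
c0 = a * np.sqrt(K / m)                       # long-wave velocity

k = np.linspace(1e-6, np.pi / a, 6)           # half of the Brillouin zone
w_chain = 2.0 * np.sqrt(K / m) * np.sin(k * a / 2.0)   # eq. (8)
w_cont = c0 * k                               # eq. (4)

for ki, wc, wd in zip(k, w_cont, w_chain):
    print(f"ka = {ki * a:.2f}: omega continuum {wc:.3f}, chain {wd:.3f}, "
          f"c_p / c_0 = {wd / (ki * c0):.3f}")
# c_p / c_0 falls from 1 to 2/pi ~ 0.637 at the zone boundary ka = pi
```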
the standard continuum can be considered as a long-wave approximation of the discrete model (or vice versa). if the actual physical system is close to the mass-spring model, the "equivalent" continuum model does not capture the phenomenon of dispersion of short elastic waves. on the other hand, if a homogeneous elastic continuum is discretized in space by finite differences (or by finite elements with a lumped mass matrix), the resulting set of equations has the form (5) with $m = \rho a$ and $K = E/a$, where $a$ is the grid parameter of the numerical method (e.g., the size of finite elements). if these ordinary differential equations are integrated exactly in time, the solution captures long-wave phenomena correctly but introduces an artificial (numerical) dispersion of short waves with wavelengths comparable to the element size. note that the numerical dispersion disappears if equations (5) are integrated in time using the central difference scheme with time step $\Delta t = a/c$, which is just at the limit of numerical stability. consequently, propagation of shocks and other wide-spectrum phenomena is not represented accurately.

discrete mass-spring systems are quite realistic models for crystalline materials on a scale of observation close to the atomic spacing. of course, real crystal lattices are three-dimensional, and interaction forces arise not only between immediate neighbors but also at longer distances. still, a one-dimensional lattice is an acceptable model for plane waves propagating perpendicular to a certain set of crystallographic planes. in this case, each mass point actually represents one plane of densely packed atoms instead of one single atom. interactions at longer distances can easily be incorporated by adding springs of stiffness $K_2, K_3, \ldots, K_N$, connecting pairs of mass points spaced by $2a, 3a, \ldots, Na$. a straightforward extension of the foregoing dispersion analysis yields the dispersion relation

$$-m \omega^2 + \sum_{n=1}^{N} K_n \left( 2 - \mathrm{e}^{ikna} - \mathrm{e}^{-ikna} \right) = 0, \quad (9)$$

from which

$$\omega = \sqrt{\frac{2}{m} \sum_{n=1}^{N} K_n \left( 1 - \cos kna \right)} = 2 \sqrt{\frac{1}{m} \sum_{n=1}^{N} K_n \sin^2 \frac{kna}{2}}. \quad (10)$$

in the long-wave limit ($k \to 0$), the phase velocity $c_p = \omega/k$ tends to

$$c_0 = a \sqrt{\frac{1}{m} \sum_{n=1}^{N} n^2 K_n}. \quad (11)$$

fig. 2: (a) dispersion curve of a mass-spring model with interaction up to distance $3a$, (b) dispersion curve of aluminum for longitudinal waves in the direction (110) (after yarnel et al., 1965).

an example of a dispersion curve constructed with $N = 3$ and $K_1 = K_2 = K_3 = K$ is shown in fig. 2a. this curve has a more general shape than the one for nearest-neighbor interaction only (cf. fig. 1b), but the wave number $2\pi/a$ still corresponds to zero frequency. this is natural, because the displacements $u_j$ generated by a harmonic wave with wavelength $a$ are the same at all the lattice sites. the associated mode is a uniform translation of the lattice points, which would not lead to any vibrations. a striking property of the dispersion law (10) is its periodicity. this is closely related to the discrete and periodic nature of the underlying mechanical model. in fact, wave numbers that differ by an integer multiple of $2\pi/a$ correspond to the same physical state of the mass-point chain. consequently, the dispersion curve of this chain is uniquely characterized by its initial part for wave numbers between 0 and $\pi/a$. this range of wave numbers is, in the theory of crystal vibrations, called the first brillouin zone.
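a quick numerical check of (10) and (11), with the equal stiffnesses $K_1 = K_2 = K_3 = K$ of fig. 2a and unit values assumed:

```python
# Dispersion law (10) for interactions up to the third neighbour and its
# long-wave limit (11). Unit m, a and K_n are assumed for illustration.
import numpy as np

m, a = 1.0, 1.0
Kn = np.array([1.0, 1.0, 1.0])               # K_1, K_2, K_3
n = np.arange(1, len(Kn) + 1)

def omega(k):
    return np.sqrt((2.0 / m) * np.sum(Kn * (1.0 - np.cos(n * k * a))))

c0 = a * np.sqrt(np.sum(n**2 * Kn) / m)      # eq. (11): a*sqrt(14 K/m) here
k = 1e-4                                     # a long wave
print(f"c0 = {c0:.4f}, omega(k)/k at small k = {omega(k) / k:.4f}")
# both numbers agree (~3.7417), confirming the long-wave limit
```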
another interesting property is that the frequency range of harmonic waves is limited. the dispersion law maps the first brillouin zone onto a certain interval $[0, \omega_{max}]$, where $\omega_{max}$ is a limiting circular frequency. if a certain mass point is externally excited with a circular frequency larger than $\omega_{max}$, the vibration does not propagate through the entire chain and remains localized in the neighborhood of the excited mass point. the discrete chain therefore acts as a low-band filter. dispersion of elastic waves in crystals is a real physical phenomenon that can be observed and studied experimentally. fig. 2b shows an example of a dispersion curve of aluminum, measured by yarnel, warren and koenig (1965). this specific curve corresponds to waves propagating in the crystallographic direction (110), and it can be well approximated by a function of the form (10).

2.2 strain-gradient elasticity

if dispersive phenomena were limited to the atomistic scale, elastic wave propagation could be described by discrete atomistic models on that scale and by standard continuum models on any coarser scale. however, dispersion arises not only due to the discrete character of the crystal lattice but, in general, due to any type of material heterogeneity. leaving aside the ideal case of a perfect monocrystal, the internal structure of all materials exhibits heterogeneities on various scales. some defects in crystals can still be treated by atomistic models, but in most other cases the material needs to be considered as a continuum (because the relevant scale is already above the atomistic one) with a certain heterogeneous microstructure. for elastic materials, there exist sophisticated and mathematically well-founded homogenization techniques providing the effective elastic moduli of a homogeneous material that is in a certain sense equivalent to the heterogeneous one and can replace it in large-scale simulations. again, this equivalence is limited and holds with reasonable accuracy for long-wave phenomena only. in the present context, waves are considered as long if the wavelength is much larger than the characteristic size of major heterogeneous features in the internal structure of the material. if there is a need to describe shorter waves in a realistic manner, it is in principle possible to explicitly resolve the details of the heterogeneous internal structure, but this approach often leads to extreme demands on the computational resources. also, since the particular microstructure is usually not known exactly, but only in terms of a stochastic description, the method of explicit resolution would need to exploit a monte-carlo type of technique or use stochastic differential equations, which again complicates the procedure and makes it computationally expensive. as an elegant and efficient alternative, it is possible to construct enrichments of the standard continuum theory that reflect the main features of the microstructure without using fast oscillating material properties.

in standard continuum elasticity it is assumed that the density of elastic energy stored per unit volume, $w$, depends only on the strain tensor, which is directly related to the deformation gradient, i.e., to the first gradient of the displacement field. the elastic energy stored by the entire body, $W$, is then evaluated as the spatial integral of the elastic energy density.
in the one-dimensional setting, one can write

$$W = \int_L w(u'(x)) \, \mathrm{d}x, \quad (12)$$

where $u' = \mathrm{d}u/\mathrm{d}x$ is the strain, further denoted as $\varepsilon$, and $L$ is the interval representing geometrically the one-dimensional body. in linear elasticity, the elastic energy density

$$w(\varepsilon) = \tfrac{1}{2} E \varepsilon^2 \quad (13)$$

is a quadratic function of strain. one class of enrichments is based on the incorporation of higher gradients of the displacement field (toupin, 1962; mindlin 1964, 1965). in general, the elastic energy density can be assumed to depend on $u'$, $u''$, $u'''$, etc. the simplest strain-gradient theory of elasticity uses enrichment by the second displacement gradient, $u''$, which is equal to the strain gradient, $\varepsilon'$, further denoted as $\eta$. if we consider one single material point only, the strain gradient is locally independent of the strain value. in the linear case, the enriched elastic energy density potential is written as

$$w(\varepsilon, \eta) = \tfrac{1}{2} E \varepsilon^2 + \tfrac{1}{2} C \eta^2, \quad (14)$$

where $C$ is a higher-order elastic modulus. the variation of the elastic energy density is given by

$$\delta w = \frac{\partial w}{\partial \varepsilon} \, \delta\varepsilon + \frac{\partial w}{\partial \eta} \, \delta\eta = \sigma \, \delta\varepsilon + \mu \, \delta\eta, \quad (15)$$

where $\sigma = \partial w/\partial \varepsilon$ is the (cauchy) stress and $\mu = \partial w/\partial \eta$ is the so-called double stress. based on the extended form of the principle of virtual work, it is possible to derive the static equilibrium equation

$$(\sigma - \mu')' + b = 0, \quad (16)$$

where $b$ is the body force density. in dynamics, $b$ is replaced by the inertial force density, $-\rho\ddot{u}$. combining this with the constitutive equations $\sigma = E\varepsilon$ and $\mu = C\eta$ and with the kinematic equations $\varepsilon = u'$ and $\eta = u''$, we obtain the wave equation of strain-gradient elasticity,

$$\rho\ddot{u} - E u'' + C u^{(iv)} = 0, \quad (17)$$
a similar result was found for the discrete mass-spring model, but in that case the stationary wave in reality represented a uniform translation, because the values of the displacements had physical meaning only at discrete points with spacing equal to the critical wavelength. in contrast to that, a stationary wave in a continuous elastic medium is physically inadmissible. the problem is aggravated by the fact that, for wave numbers exceeding k_crit, the circular frequency ω solved from the dispersion equation (18) becomes imaginary. this means that harmonic modes with wavelengths shorter than 2πl would spontaneously blow up. the source of this instability becomes apparent if one realizes that, for short waves, the negative higher-order part of the elastic energy, -\tfrac{1}{2} e l^2 (u'')^2, exceeds in magnitude the positive standard part, \tfrac{1}{2} e (u')^2, and so the total energy density becomes negative. if the amplitude of the wave grows, energy is released instead of being consumed.

2.3 models with mixed spatial-temporal derivatives

due to the unstable behavior of short waves, equation (17) is sometimes called the "bad boussinesq problem". this equation can describe dispersion of waves with "moderate" wave numbers but leads to instabilities if waves shorter than the critical wavelength 2πl are involved. if the body of interest is discretized by finite elements, the minimum wavelength that can be captured by the numerical approximation is proportional to the element size. therefore, for meshes that are sufficiently coarse with respect to the material length parameter l, the numerical solution leads to reasonable results. however, upon mesh refinement, the solution becomes polluted by unstable modes rapidly oscillating in space. several modifications of the bad boussinesq problem were proposed in the literature. fish, chen and nagai (2002a) replaced the term with the fourth spatial derivative, u^{iv}, by a term with a mixed derivative, \ddot{u}''. their arguments can be rephrased and expanded as follows: for small wave numbers, the fourth-order term in (17) is negligible with respect to the second-order terms, so we can write e u'' \approx \rho \ddot{u}. differentiating this twice with respect to x, we obtain e u^{iv} \approx \rho \ddot{u}''. finally, replacing in (17) u^{iv} by (\rho/e) \ddot{u}'' and c by -e l^2 yields a modified wave equation

\rho \ddot{u} - e u'' - \rho l^2 \ddot{u}'' = 0 ,  (21)

which was called by fish et al. (2002a) the "good boussinesq problem". this problem can be expected to have similar solutions to the original bad boussinesq problem (17) at low wave numbers but a different asymptotic behavior at high wave numbers. indeed, the usual procedure leads to the dispersion equation

\rho \omega^2 - e k^2 + \rho \omega^2 l^2 k^2 = 0 ,  (22)

from which

c_p = \omega / k = \sqrt{\frac{e}{\rho (1 + k^2 l^2)}} = \frac{c_0}{\sqrt{1 + k^2 l^2}} .  (23)

for this model with enrichment by a mixed derivative, the phase velocity remains real and positive for all wave numbers. the dispersion curve, plotted in fig. 3a, is monotonically increasing, has a negative curvature, and for k → ∞ approaches a horizontal asymptote at circular frequency ω = c₀/l.

fig. 3: dispersion curves of (a) strain-gradient elasticity and gradient models with mixed derivatives, (b) nonlocal elastic models with different weight functions; the normalized circular frequency is ωl/c₀ in case (a) and ωa/c₀ in case (b), and the normalized wave number is kl in case (a) and ka/π in case (b)
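the contrast between the two enrichments can be tabulated directly from (20) and (23); the following sketch uses normalized units (c₀ = l = 1) and is purely illustrative:

```python
import numpy as np

# sketch: squared phase velocity of the strain-gradient model with c = -e*l**2
# ("bad boussinesq", eq. 20) versus the mixed-derivative model (eq. 23)
c0, l = 1.0, 1.0
for kl in [0.2, 0.5, 1.0, 2.0, 5.0]:
    cp2_bad = c0**2 * (1.0 - kl**2)        # negative beyond k = 1/l
    cp_good = c0 / np.sqrt(1.0 + kl**2)    # real for all wave numbers
    print(f"k*l = {kl:3.1f}   cp^2 bad = {cp2_bad:+6.2f}"
          f"   cp good = {cp_good:.3f}   omega good = {kl * cp_good / l:.3f}")
# omega of the good model approaches the horizontal asymptote c0/l = 1
```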
the model can reasonably reproduce dispersion of waves at moderate wavelengths and does not give rise to instabilities for very short waves. its extension to multiple dimensions is relatively complicated (fish, chen and nagai, 2002b; nagai, fish and watanabe, 2004). metrikine and askes (2002) used a different line of reasoning and arrived at an equation of motion with two enrichment terms, proportional to \ddot{u}'' and u^{iv}. in its most general form, this equation can be written as

\rho \ddot{u} - e u'' - l^2 (\rho \ddot{u} - \gamma e u'')'' = 0 ,  (24)

where l is an internal length and γ is an additional model parameter. for l = 0 or for γ = 1, the model reduces to the standard elastic continuum, while for γ = 0 it reduces to the good boussinesq problem of fish et al. (2002a). metrikine and askes (2002) used certain heuristic arguments to link l and γ to the microstructure and then proposed a parameter identification procedure based on a reflection-transmission test (askes and metrikine, 2002). of course, one can also consider l and γ as free model parameters and determine them by optimal fitting of the dispersion curve for a given material. the dispersion equation corresponding to (24),

\rho \omega^2 - e k^2 + l^2 k^2 (\rho \omega^2 - \gamma e k^2) = 0 ,  (25)

yields the phase velocity

c_p = \omega / k = c_0 \sqrt{\frac{1 + \gamma k^2 l^2}{1 + k^2 l^2}} ,  (26)

where c_0 = \sqrt{e/\rho}, as usual. if parameter γ is nonnegative, the phase velocity remains real and positive for all wave numbers. for 0 < γ < 1, the dispersion curve has a negative curvature; see fig. 3a. so, with a proper choice of parameters, the model can reasonably approximate dispersion and does not suffer from unstable behavior of short waves. its disadvantage is that the presence of the fourth derivative u^{iv} requires either a c1-continuous finite element approximation (which is hard to construct on general meshes in multiple dimensions) or a mixed approach with independent approximations of several fields (e.g., of the displacement field and the strain field). also, nonstandard higher-order boundary conditions are needed on the physical boundary of the investigated body.

2.4 integral-type nonlocal elasticity

another class of enrichments is based on weighted spatial averaging. the simplest model of this kind can be derived from the elastic potential

W = \tfrac{1}{2} \int_L \int_L e(x, \xi) \, \varepsilon(x) \, \varepsilon(\xi) \, d\xi \, dx ,  (27)

where e(x, ξ) is a function describing the generalized elastic modulus. the variation of the elastic energy is evaluated as

\delta W = \tfrac{1}{2} \int_L \int_L e(x, \xi) \, \delta\varepsilon(x) \, \varepsilon(\xi) \, d\xi \, dx + \tfrac{1}{2} \int_L \int_L e(x, \xi) \, \varepsilon(x) \, \delta\varepsilon(\xi) \, d\xi \, dx = \int_L \int_L \tfrac{1}{2} [e(x, \xi) + e(\xi, x)] \, \varepsilon(\xi) \, \delta\varepsilon(x) \, d\xi \, dx .  (28)

this can be written in the usual form \delta W = \int_L \sigma(x) \, \delta\varepsilon(x) \, dx if the stress is defined as

\sigma(x) = \int_L e_s(x, \xi) \, \varepsilon(\xi) \, d\xi ,  (29)

where

e_s(x, \xi) = \tfrac{1}{2} [e(x, \xi) + e(\xi, x)]  (30)

is the elastic modulus function symmetrized with respect to its arguments. the corresponding equilibrium equations derived from the principle of virtual work then keep their standard form, \sigma' + b = 0. consequently, the wave equation for this model reads

\rho \frac{\partial^2 u(x,t)}{\partial t^2} - \frac{\partial}{\partial x} \int_L e_s(x, \xi) \frac{\partial u(\xi,t)}{\partial \xi} \, d\xi = 0 .  (31)

since function e_s(x, ξ) reflects the strength of the long-distance interaction between points x and ξ, its value can be expected to be negligible if the distance between x and ξ is large compared to the internal length of the material (which corresponds to the characteristic size and spacing of major heterogeneities).
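a minimal sketch of how the nonlocal stress (29) acts in practice, with a triangular weight function of support a and an artificial strain field (the grid, internal length and field values are all assumptions of the demo):

```python
import numpy as np

# sketch: discretized nonlocal stress (29), sigma(x) = int e_s(x,xi) eps(xi) dxi,
# on a 1d grid, with e_s(x,xi) = e0 * alpha0(x - xi) and a triangular weight
e0, a, n = 1.0, 0.1, 201
x = np.linspace(0.0, 1.0, n)
dx = x[1] - x[0]

# strain field with a sharp local peak (purely illustrative)
eps = 0.01 + 0.05 * np.exp(-((x - 0.5) / 0.02)**2)

r = x[:, None] - x[None, :]
alpha0 = np.maximum(a - np.abs(r), 0.0) / a**2   # triangular weight, unit integral

sigma = e0 * (alpha0 @ eps) * dx                 # quadrature of (29)
print("local  peak e0*eps :", (e0 * eps).max())
print("nonlocal peak sigma:", sigma.max())       # smaller: the peak is smeared
```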
for functions e_s with a sufficiently fast decay, the integrals in (29) and (31) make sense even if the integration domain L is considered as the entire real axis. if the body is infinite and macroscopically homogeneous, function e_s(x, ξ) should depend only on the distance between x and ξ. bearing in mind these restrictive assumptions, we present the modulus function in the form

e_s(x, \xi) = e_0 \alpha_0(x - \xi) ,  (32)

where e₀ is a reference value of the elastic modulus and α₀ is a dimensionless even function, further called the nonlocal weight function. the second term in the wave equation (31) can then be transformed as follows (using ∂α₀(x-ξ)/∂x = -∂α₀(x-ξ)/∂ξ and integration by parts):

\frac{\partial}{\partial x} \int_{-\infty}^{\infty} e_s(x, \xi) \frac{\partial u(\xi,t)}{\partial \xi} \, d\xi = e_0 \int_{-\infty}^{\infty} \frac{\partial \alpha_0(x - \xi)}{\partial x} \frac{\partial u(\xi,t)}{\partial \xi} \, d\xi = e_0 \int_{-\infty}^{\infty} \alpha_0(x - \xi) \frac{\partial^2 u(\xi,t)}{\partial \xi^2} \, d\xi .  (33)

substituting the assumed harmonic form of an elastic wave (2) into the transformed wave equation

\rho \frac{\partial^2 u(x,t)}{\partial t^2} - e_0 \int_{-\infty}^{\infty} \alpha_0(x - \xi) \frac{\partial^2 u(\xi,t)}{\partial \xi^2} \, d\xi = 0 ,  (34)

we obtain the dispersion equation

\rho \omega^2 - e_0 k^2 \alpha_0^*(k) = 0 ,  (35)

in which

\alpha_0^*(k) = \int_{-\infty}^{\infty} \alpha_0(r) \, e^{-ikr} \, dr  (36)

is the fourier image of the nonlocal weight function α₀(r). finally, the phase velocity is evaluated as

c_p = \omega / k = \sqrt{e_0 \alpha_0^*(k) / \rho} = c_0 \sqrt{\alpha_0^*(k)} .  (37)

relation (35) shows that there is a unique correspondence between the dispersion law and the fourier image of the nonlocal weight function. if the dispersion law of a certain material is given in the form ω(k) = k c_p(k), where c_p(k) is a known function, it is possible to construct a nonlocal elasticity model that exactly reproduces the dispersion properties. for this, it suffices to set e₀ = ρc₀², where c₀ = c_p(0), and to evaluate the weight function by the inverse fourier transform, taking into account that the phase velocity does not depend on the sign of the wave number:

\alpha_0(r) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{c_p^2(k)}{c_0^2} \, e^{ikr} \, dk = \frac{1}{\pi} \int_0^{\infty} \frac{c_p^2(k)}{c_0^2} \cos(kr) \, dk .  (38)

of course, this is possible only if the dispersion law to be reproduced has reasonable properties, such that the inverse fourier transform exists. for instance, for the dispersion law corresponding to the bad boussinesq problem, c_p(k) is given by (20) and the integral in (38) does not converge (independently of the sign of c). on the other hand, for the good boussinesq problem (21) we have c_p(k) = c_0 / \sqrt{1 + k^2 l^2}, and the inverse fourier transform (38) yields

\alpha_0(r) = \frac{1}{\pi} \int_0^{\infty} \frac{\cos(kr)}{1 + k^2 l^2} \, dk = \frac{1}{2l} e^{-|r|/l} .  (39)

a nonlocal elasticity model with this particular weight function gives exactly the same dispersion law as the model with enrichment by a mixed derivative proposed by fish et al. (2002a). on an infinite domain, both models are equivalent. the transformation of the model of fish et al. (2002a) into an integral-type nonlocal model can also be performed directly. at a fixed time instant, equation (21) can be written as

\rho \ddot{u} - \rho l^2 \ddot{u}'' = e u''  (40)

and interpreted as an ordinary differential equation for the unknown acceleration \ddot{u}, with the current displacement u considered as known.
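the fourier pair (36)/(39) is easy to verify numerically; the fragment below is an illustrative check with an arbitrarily chosen internal length:

```python
import numpy as np

# numerical check (sketch) of the pair (36)/(39): for the weight function
# alpha0(r) = exp(-|r|/l)/(2l), the fourier image should equal 1/(1 + k^2 l^2)
l = 0.2                                       # arbitrary internal length
r = np.linspace(-20.0 * l, 20.0 * l, 40001)
dr = r[1] - r[0]
alpha0 = np.exp(-np.abs(r) / l) / (2.0 * l)

for kl in [0.0, 1.0, 3.0]:
    k = kl / l
    image = np.sum(alpha0 * np.cos(k * r)) * dr   # even function: cos suffices
    print(f"k*l = {kl:3.1f}   numeric {image:.4f}   exact {1/(1+kl**2):.4f}")
```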
equation (40) has the form of the so-called helmholtz equation, and its solution satisfying the conditions of boundedness (which play the role of boundary conditions at plus and minus infinity) can be expressed as

\ddot{u}(x,t) = \frac{e}{\rho} \int_{-\infty}^{\infty} g(x, \xi) \, u''(\xi,t) \, d\xi ,  (41)

where g(x, ξ) is the green function of the helmholtz equation, formally obtained as the solution of this equation with the dirac distribution δ(x - ξ) on the right-hand side. it turns out that the green function is in this case given by

g(x, \xi) = \frac{1}{2l} e^{-|x - \xi|/l} ,  (42)

and so equation (41) is in fact equivalent to (34) if the nonlocal weight function α₀ is selected according to formula (39).

2.5 combination of nonlocal averaging and strain gradients

for the dispersion law corresponding to the gradient model proposed by metrikine and askes (2002), the integral in the inverse fourier transform (38) does not converge. so this model is not equivalent to any integral-type nonlocal elastic model derived from the potential (27). still, using the alternative procedure based on the green function, it is possible to construct a more general nonlocal model equivalent to the original enriched gradient formulation. indeed, rewriting (24) as

\rho \ddot{u} - \rho l^2 \ddot{u}'' = e u'' - \gamma e l^2 u^{iv}  (43)

and "solving" for the acceleration, we obtain

\ddot{u}(x,t) = \frac{e}{\rho} \int_{-\infty}^{\infty} g(x, \xi) \, u''(\xi,t) \, d\xi - \frac{\gamma e l^2}{\rho} \int_{-\infty}^{\infty} g(x, \xi) \, u^{iv}(\xi,t) \, d\xi .  (44)

the result resembles the wave equation of strain-gradient elasticity (17), but with the derivatives u'' and u^{iv} replaced by their weighted spatial averages. this observation motivates the development of a nonlocal strain-gradient model with the elastic energy potential given by

W = \tfrac{1}{2} \int_L \int_L e(x, \xi) \, \varepsilon(x) \, \varepsilon(\xi) \, d\xi \, dx + \tfrac{1}{2} \int_L \int_L c(x, \xi) \, \eta(x) \, \eta(\xi) \, d\xi \, dx .  (45)

taking the variation and using the same line of reasoning as for the basic version of nonlocal elasticity, we identify the constitutive equations

\sigma(x) = \int_L e_s(x, \xi) \, \varepsilon(\xi) \, d\xi ,  (46)

\mu(x) = \int_L c_s(x, \xi) \, \eta(\xi) \, d\xi ,  (47)

in which c_s(x, ξ) = ½[c(x, ξ) + c(ξ, x)] is the symmetrized higher-order modulus. substituting this into the equation of motion of the strain-gradient theory, \rho \ddot{u} - (\sigma - \mu')' = 0, and using the kinematic relations \varepsilon = u' and \eta = u'', we obtain the wave equation

\rho \frac{\partial^2 u(x,t)}{\partial t^2} - \frac{\partial}{\partial x} \int_L e_s(x, \xi) \frac{\partial u(\xi,t)}{\partial \xi} \, d\xi + \frac{\partial^2}{\partial x^2} \int_L c_s(x, \xi) \frac{\partial^2 u(\xi,t)}{\partial \xi^2} \, d\xi = 0 ,  (48)

which generalizes equation (31). for the moduli functions in the special form

e_s(x, \xi) = e_0 \alpha_0(x - \xi) ,  (49)

c_s(x, \xi) = e_0 l^2 \alpha_1(x - \xi) ,  (50)

dispersion analysis provides the following expression for the phase velocity:

c_p(k) = c_0 \sqrt{\alpha_0^*(k) + k^2 l^2 \alpha_1^*(k)} .  (51)

here, α₁* denotes the fourier image of function α₁. the nonlocal strain-gradient model just presented is quite general and covers as special cases all the other models discussed so far. the special choices of weight functions α₀ and α₁ are summarized in table 1. the model makes it possible to reproduce a very wide class of dispersion laws. which terms need to be activated depends on the asymptotic behavior of the dispersion curve at wave numbers approaching infinity. if the dispersion law ω(k) is bounded, it is sufficient to use a regular weight function α₀ obtained by the inverse fourier transform of (ω / c₀k)². if ω(k) tends to infinity but remains of order o(k), it is possible to use either a weight function α₀ with a singular part of the dirac type, or regular functions α₀ and α₁.
finally, if ω(k) grows superlinearly but remains of order o(k²), function α₁ must have a singular dirac-type part. still faster growth could be reproduced by models with second (mindlin, 1965) or still higher (green and rivlin, 1964) strain gradients, but it seems that dispersion laws of real materials do not exhibit such behavior, so this question is purely academic. in fact, all dispersion laws with superlinear growth at k → ∞ are suspicious, because the phase velocity becomes unbounded and disturbances can propagate at an arbitrary velocity if the wavelength is selected as sufficiently short.

2.6 nonlocal model reproducing dispersion of discrete lattice

it is interesting that the nonlocal elastic model with a weight function α₀ that linearly decreases from its maximum value at r = 0 to zero at r = a and vanishes for r > a leads to exactly the same dispersion law as the simplest mass-spring model with nearest-neighbor interactions and with spacing a between neighboring mass points.

table 1: special cases of the nonlocal strain-gradient elasticity model

model | α₀(r) | α₁(r)
standard elasticity | δ(r) | 0
strain-gradient elasticity | δ(r) | δ(r)
fish et al. | (1/2l) e^(−|r|/l) | 0
metrikine and askes | (1/2l) e^(−|r|/l) | (γ/2l) e^(−|r|/l)
mass-spring chain | ⟨a − |r|⟩/a² | 0

the brackets ⟨·⟩ denote the positive part operator, defined by ⟨x⟩ = (x + |x|)/2.

table 2 lists several other nonlocal weight functions and their fourier images, which can be used to construct the corresponding dispersion curves plotted in fig. 3b.

table 2: nonlocal weight functions and their fourier images

model | α₀(r) | α₀*(k)
standard elasticity | δ(r) | 1
mass-spring (neighbors) | ⟨a − |r|⟩/a² | 2(1 − cos ka)/(k²a²)
mass-spring (long interaction) | Σₙ kₙ⟨na − |r|⟩ / (a² Σₙ n²kₙ) | Σₙ 2kₙ(1 − cos kna) / (k²a² Σₙ n²kₙ)
eringen | see eq. (54) | 2(1 − cos ka)/(k²a²) if |k| ≤ π/a (0 otherwise)
uniform averaging | 1/2a for |r| ≤ a | sin(ka)/(ka)
fish et al. | (1/2a) e^(−|r|/a) | 1/(1 + k²a²)
gauss weight function | (1/(a√π)) e^(−r²/a²) | e^(−k²a²/4)
quartic weight function | (15/16a)⟨1 − r²/a²⟩² | (15/(k⁵a⁵))[(3 − k²a²) sin ka − 3ka cos ka]

the dispersion law of a mass-spring model with long-distance interactions can be reproduced by a nonlocal model with a piecewise linear weight function whose characteristics uniquely depend on the spring stiffnesses. if all stiffnesses used by the discrete model are positive, the weight function is concave (for nonzero r). the dispersion law gives real frequencies for all wave numbers, but for wave numbers that are integer multiples of 2π/a the frequency vanishes, i.e., the model admits stationary waves of wavelengths a, a/2, a/3, etc. this is natural for the discrete model, as already explained, but the same property is shared by the nonlocal continuum model. for the simplest weight function, constant for |r| between 0 and a and vanishing for |r| > a, the dispersion law gives real frequencies only for wave numbers in the intervals [0, π/a], [2π/a, 3π/a], [4π/a, 5π/a], etc. between these bands, the frequency becomes imaginary, which indicates an instability. the potential appearance of periodic modes that carry no energy for the nonlocal model with uniform strain averaging over a finite neighborhood was mentioned by bažant and chang (1984). it is clear that every periodic function with period 2a is mapped by the nonlocal operator onto a zero function, and therefore has no influence on the nonlocal average.
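both of these observations can be checked numerically from the images in table 2; the sketch below (normalized units, illustrative only) verifies that the triangular weight reproduces the discrete-chain dispersion law and that uniform averaging yields a negative fourier image, and hence imaginary frequencies, in the band (π/a, 2π/a):

```python
import numpy as np

# sketch in normalized units (c0 = a = 1): omega(k) = c0*k*sqrt(alpha0*(k))
# for two weight functions from table 2
c0, a = 1.0, 1.0
k = np.linspace(0.05, 12.0, 500) / a

# triangular weight <a-|r|>/a^2: image 2(1-cos ka)/(ka)^2, always >= 0;
# omega should reduce to (2*c0/a)*|sin(k*a/2)|, the discrete-chain law
img_tri = 2.0 * (1.0 - np.cos(k * a)) / (k * a)**2
omega_tri = c0 * k * np.sqrt(img_tri)
chain = (2.0 * c0 / a) * np.abs(np.sin(k * a / 2.0))
print("triangular weight = mass-spring chain:", np.allclose(omega_tri, chain))

# uniform averaging over |r| <= a: image sin(ka)/(ka) changes sign
img_uni = np.sinc(k * a / np.pi)          # np.sinc(x) = sin(pi x)/(pi x)
first_band = (k * a > np.pi) & (k * a < 2.0 * np.pi)
print("image negative throughout (pi/a, 2pi/a):", np.all(img_uni[first_band] < 0))
```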
each such function can be decomposed into a sum of harmonic functions of wavelengths 2a, 2a/2, 2a/3, etc., which correspond to zero frequencies. thus the static solution can be modified by a periodic function with period 2a without disturbing equilibrium. the dispersion analysis indicates that in dynamics the situation is even worse, because harmonic modes with wave numbers in the intervals (π/a, 2π/a), (3π/a, 4π/a), etc., are associated with imaginary frequencies and would blow up. to avoid the potential appearance of unstable modes, it is sufficient to use weight functions whose fourier images remain positive for any positive wave number k. this is the case, for instance, for the green function of the helmholtz equation, and also for the gauss-type weight function

\alpha_0(r) = \frac{1}{a\sqrt{\pi}} e^{-r^2/a^2}  (52)

that is often used by nonlocal models. on the other hand, the fourier image of the truncated quartic polynomial function

\alpha_0(r) = \frac{15}{16a} \langle 1 - r^2/a^2 \rangle^2  (53)

is positive only for wave numbers smaller than 5.763/a. consequently, instabilities could develop for fine meshes with element size in the order of a. the relationship between atomic lattices and the nonlocal integral models was studied by eringen (1972), who proposed a weight function that should correspond to the mass-spring model. however, eringen did not use the complete inverse fourier transform but integrated (38) only in the limits -π/a ≤ k ≤ π/a. consequently, his nonlocal model would reproduce the dispersion law of the mass-spring model only for wavelengths larger than 2a. all smaller wavelengths would be associated with a zero frequency, i.e., stationary waves could easily appear. eringen's weight function, given by the truncated inversion integral

\alpha_0(x) = \frac{1}{\pi} \int_0^{\pi/a} \frac{2(1 - \cos ka)}{k^2 a^2} \cos(kx) \, dk ,  (54)

can be expressed in closed form in terms of the sine integral si(x) = \int_0^x \xi^{-1} \sin\xi \, d\xi, but the resulting expression is much more complicated than the piecewise linear function reproducing the dispersion of the discrete model exactly for arbitrary wave numbers, and it does not seem to be very practical.

3 size effects in microscale plasticity

3.1 experimental observations

among the phenomena that are hard to model and predict by standard continuum theories, one also finds various forms of size effects on the apparent "material" properties. such effects can be observed already in the elastic range. for instance, according to standard elasticity, the torsional stiffness of a prismatic beam with a circular cross section should be proportional to the shear modulus of elasticity, g, and to the third power of the sectional diameter, d. however, if the torsional stiffness is evaluated experimentally as the ratio between the torque and the relative twist angle, it turns out that the expected proportionality to d³ holds with sufficient accuracy only for diameters larger than a certain threshold value (morrison, 1939; lakes, 1986; fleck, muller, ashby and hutchinson, 1994). if the results obtained for thick wires are extrapolated to thin wires, the actual stiffness is underestimated. in the context of the standard theory, this can be interpreted as a dependence of the elastic modulus on the size of the sample. however, such an explanation is not satisfactory from the theoretical point of view.
in a more fundamental approach, it is admitted that the standard continuum elasticity theory provides only a large-size approximation to the static torsion problem, just as it provides only a long-wave approximation to the dynamic wave dispersion problem. if the size of the structure is comparable to a certain internal length scale of the material, higher-order effects appear and the classical concept of a homogeneous local continuum needs an adjustment. in elasticity, such an adjustment that accounts for the principal features of the microstructure is provided by the various theories enriched by higher-order gradients or integral-type nonlocal terms, already exposed in section 2. in general, by a size effect we mean a situation when a certain parameter normally considered as a material property appears to be dependent on the size of the sample or specimen for which it is evaluated. the reason for this unexpected behavior is usually that the specimens are either too small or too big and the underlying theory is not adequate on the extreme scale. important size effects in microscale plasticity have been detected in indentation tests (nix, 1989; ma and clarke, 1995; poole, ashby and fleck, 1996), bending of thin sheets (stolken and evans, 1998), plastic torsion (fleck et al., 1996), and void growth. from the practical point of view, proper understanding and modeling of size effects on the small scale is essential for applications of simulation methods in the analysis and design of micro- and nano-devices used in modern technology.

3.2 illustrative example: shear layer

to illustrate the power of plasticity theories extended by gradient or integral-type nonlocal terms, we will analyze a rather academic but instructive problem of a material layer of thickness L, placed between stiff loading plates and loaded statically by shear. this problem can be modeled in one spatial dimension, which facilitates its analytical solution. the choice of the coordinate system and the type of loading are shown in fig. 4. it is assumed that the elastic properties of the material are isotropic and that plastic flow is isochoric (preserves volume), so that volumetric and deviatoric effects can be decoupled. if this is the case, the relevant components of stress and strain are the shear stress τ_xy ≡ τ and the engineering shear strain 2ε_xy ≡ γ. both of them are considered as independent of the spatial coordinates y and z. from the equilibrium equation with vanishing body forces, it follows that τ is also independent of x, i.e., it is uniformly distributed in space and varies only as a function of the pseudo-time parameterizing the loading process. classical plasticity with linear isotropic hardening is described by the basic equations

\gamma = \frac{\tau}{g} + \gamma_p ,  (55)

\dot{\gamma}_p = \dot{\kappa} \, \mathrm{sgn}(\tau) ,  (56)

\dot{\kappa} \ge 0 , \quad f(\tau, \kappa) \le 0 , \quad \dot{\kappa} f(\tau, \kappa) = 0 ,  (57)

where γ_p is the plastic strain and κ is the cumulative plastic strain. the yield function f is given by

f(\tau, \kappa) = |\tau| - \tau_y(\kappa) ,  (58)

where τ_y is the current yield stress in shear, evaluated from the hardening law

\tau_y(\kappa) = \tau_0 + h \kappa  (59)

with τ₀ = initial yield stress in shear and h = hardening modulus (in shear). during monotonic loading with a positive value of the shear stress τ, there is no difference between the plastic shear strain γ_p and the cumulative plastic strain κ. we will therefore replace γ_p by κ and call it simply the plastic strain.
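for completeness, a minimal strain-driven stress update implementing (55)-(59) is sketched below; the return-mapping form and all material constants are illustrative choices, not taken from the paper:

```python
# sketch: strain-driven update of the classical model (55)-(59) with linear
# isotropic hardening; g, tau0 and h are arbitrary demo values
g, tau0, h = 1000.0, 10.0, 50.0

def update(gamma, gamma_p, kappa):
    """one return-mapping step: total strain gamma -> (tau, gamma_p, kappa)."""
    tau_trial = g * (gamma - gamma_p)          # elastic predictor
    f = abs(tau_trial) - (tau0 + h * kappa)    # yield function (58)-(59)
    if f <= 0.0:
        return tau_trial, gamma_p, kappa       # elastic step
    dkappa = f / (g + h)                       # plastic corrector
    sign = 1.0 if tau_trial > 0 else -1.0
    gamma_p += dkappa * sign                   # flow rule (56)
    kappa += dkappa
    return sign * (tau0 + h * kappa), gamma_p, kappa

gamma_p = kappa = 0.0
for gamma in [0.005, 0.012, 0.02]:             # monotonic loading
    tau, gamma_p, kappa = update(gamma, gamma_p, kappa)
    print(f"gamma = {gamma:.3f}   tau = {tau:6.2f}   kappa = {kappa:.5f}")
```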
if the hardening modulus is positive and the loading is monotonic, the stress τ uniquely determines the corresponding strain γ. since the stress is uniform, the strain must be uniform as well. therefore, the strain at any point is equal to the average strain determined as the ratio between the relative displacement of the loading plates, Δv = v(L/2) - v(-L/2), and the layer thickness, L. the stress-strain curve can be directly determined from the measured dependence between Δv and the tangential traction on the boundary, t, and it should be independent of the layer thickness. therefore, the standard plasticity model does not indicate any size effect.

fig. 4: material layer between stiff plates loaded by shear

3.3 explicit gradient plasticity

if the layer thickness is comparable to the characteristic length of the material microstructure, the assumption of the local dependence of stress on the history of strain at the same material point becomes questionable. the reason is that the hardening process does not take place at each infinitely small material point separately and independently of the surrounding points. this can be taken into account by sophisticated models that consider the details of the hardening mechanisms, e.g., by discrete dislocation models. as a simpler alternative, one can use enriched continuum models that take the micromechanical processes into account only "on the average", but by terms of a higher order than in the standard continuum theory. motivated by certain micromechanical considerations, aifantis (1984) proposed a family of models with the yield stress dependent not only on the value of the cumulative plastic strain (the internal variable driving the hardening process) but also on its first and second gradients. in the one-dimensional setting and for linear hardening, the simplest version of the aifantis gradient plasticity model replaces the hardening law (59) by

\tau_y(\kappa, \kappa'') = \tau_0 + h \kappa - h l^2 \kappa'' ,  (60)

where l is a parameter with the dimension of length. the elastic part of the model remains unchanged, and so the strain is uniform in the elastic range. after the onset of yielding, the yield condition must be satisfied in the plastic zone. it is easy to show that in the present case the plastic zone must extend over the entire layer. the yield condition f = 0 combined with the hardening law (60) then provides the ordinary differential equation

\kappa - l^2 \kappa'' = \frac{\tau - \tau_0}{h}  (61)

for the unknown function κ. this equation should be supplemented by boundary conditions at the layer boundaries. the choice of the specific form of boundary conditions has a major influence on the solution and on the resulting size effect. for homogeneous neumann boundary conditions, enforcing a vanishing normal derivative of κ at the boundary, the uniform solution obtained with the classical model would remain valid even for the gradient model. however, if the shear layer is fixed to rigid or very stiff loading plates and the bond between the two materials is perfect, the boundary (or rather the material interface) acts as an obstacle to dislocation motion, which is the main mechanism of plastic flow in crystalline materials. this has been confirmed by simulations based on discrete dislocation models (shu, fleck, van der giessen and needleman, 2001). in the extreme case, plastic flow is completely prevented and the cumulative plastic strain κ must vanish at the boundary.
for homogeneous dirichlet boundary conditions, κ(±L/2) = 0, the solution of (61) is

\kappa(x) = \frac{\tau - \tau_0}{h} \left( 1 - \frac{\cosh(x/l)}{\cosh(L/2l)} \right) .  (62)

the distribution of plastic strain (normalized by the factor (τ - τ₀)/h, which would correspond to the value of the uniform plastic strain in the standard theory) across the layer thickness is plotted in fig. 5b for different relative sizes, L/l. if the layer thickness is substantially larger than the material length, the plastic strain is almost uniform, except for narrow boundary layers. with decreasing structural size L, the relative importance of these boundary layers with reduced plastic strain levels increases, which makes the overall response stiffer. to characterize the overall response of the layer, the relative tangential displacement of one loading plate with respect to the other can be evaluated as

\Delta v = \int_{-L/2}^{L/2} \gamma(x) \, dx = \int_{-L/2}^{L/2} \left( \frac{\tau}{g} + \kappa(x) \right) dx = \frac{\tau L}{g} + \frac{\tau - \tau_0}{h} \left( L - 2l \tanh\frac{L}{2l} \right) .  (63)

defining the average shear strain \tilde{\gamma} = \Delta v / L, we can transform (63) into the average stress-strain law

\tau = \tau_0 + \tilde{g}_{ep} (\tilde{\gamma} - \gamma_0) ,  (64)

where γ₀ = τ₀/g is the limit elastic shear strain and

\tilde{g}_{ep} = \left[ \frac{1}{g} + \frac{1}{h} \left( 1 - \frac{2l}{L} \tanh\frac{L}{2l} \right) \right]^{-1}  (65)

is the elastoplastic (tangent) shear modulus. according to (64)-(65), the overall response of the specimen after the onset of yielding is linear and is equivalent to the response of the same specimen made of a standard elastoplastic material with hardening modulus

\tilde{h} = \frac{h}{1 - \frac{2l}{L} \tanh\frac{L}{2l}} .  (66)

this "apparent" hardening modulus depends not only on the material parameters h and l but also on the layer thickness L, i.e., on the size of the specimen for which it is evaluated; see fig. 5a. therefore, it cannot be considered as an intrinsic material parameter. if the actual behavior of the material is close to the gradient model but the experimental results are interpreted using the standard elastoplastic model, the value of the hardening modulus will appear to be size-dependent. the reason is that the model is oversimplified and the comparison between theory and experiment uses only global characteristics such as the measured relation between the loading force and the relative displacement of one loading plate with respect to the other. if detailed measurements of the strain field inside the specimen were available, they would reveal a discrepancy between the actual strain distribution and the theoretical solution based on the oversimplified assumptions. this would give a hint regarding the necessary refinement of the theoretical model by the inclusion of higher-order terms. but even if such detailed measurements are not possible or not available, the development of an appropriate enriched continuum theory can be guided by the experimentally detected size effect. it turns out that the size effect for one specific type of test performed on a series of geometrically similar specimens can often be well reproduced by several different types of enriched models that are not necessarily equivalent in a general case. ideally, the enriched theory should be verified by several tests leading to different types of stress and strain distributions, and also supported by micromechanical arguments and confirmed by observations of the actual processes in the microstructure. only then can the model be assumed to reasonably reproduce the actual material behavior and to have some predictive power.
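the size effect predicted by (62) and (66) can be tabulated directly; the following sketch uses arbitrary demo values of h and l:

```python
import numpy as np

# sketch: apparent hardening modulus (66) and plastic strain profile (62)
# for the aifantis model; h, l and the sizes are arbitrary demo values
h, l = 100.0, 1.0

def h_apparent(L):
    return h / (1.0 - (2.0 * l / L) * np.tanh(L / (2.0 * l)))   # (66)

for L in [2.0, 5.0, 20.0, 100.0]:
    print(f"L/l = {L/l:5.1f}   h_tilde/h = {h_apparent(L)/h:.3f}")
# h_tilde -> h for large layers, and grows without bound as L shrinks

# normalized plastic strain profile (62) for L = 5*l
L = 5.0 * l
x = np.linspace(-L / 2, L / 2, 5)
kappa_norm = 1.0 - np.cosh(x / l) / np.cosh(L / (2.0 * l))
print("profile:", np.round(kappa_norm, 3))     # vanishes at the boundaries
```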
if only one series of experiments is fitted, the model can usually serve for reliable interpolation within the limits that have been covered by the experiments, but extrapolation to smaller or larger sizes can be dangerous. the diagram linking the shear stress to the average shear strain for different thicknesses L of the shear layer is plotted in fig. 6a, and the dependence of the apparent hardening modulus h̃ on the specimen size (layer thickness) is shown in fig. 5a. if the layer thickness is much larger than the material length l, the stress-strain curve is practically size-independent and the apparent hardening modulus h̃ evaluated from the test is very close to the model parameter h, considered here as an intrinsic material property. so the standard theory can be used as a good approximation if the specimen is large. for layer thicknesses smaller than about 20l, the stress-strain curve becomes size-dependent and it rises above the basic stress-strain curve valid in the large-size limit. the apparent hardening modulus increases with decreasing size. even though the shear test of a layer is difficult to perform, this trend can be considered as realistic, because stronger hardening for smaller specimens is indeed observed in experiments with plastic torsion of tiny wires or microbending of thin films. this is documented in fig. 6b, adapted from fleck et al. (1994). the figure shows the dependence of the normalized torque, m/d³, on the normalized angle of twist, θd, which is equal to the shear strain on the wire surface. according to any theory with the stress at a point dependent on the history of strain at that point only, the resulting curve should be independent of the wire diameter d. in reality, this is true only for sufficiently thick wires. the figure clearly shows that for diameters in the order of dozens of micrometers, the actual response after the onset of yielding is stronger than expected. aifantis (2003) and fleck and hutchinson (2001) have demonstrated that this size effect can be captured by gradient plasticity theories similar to the one presented here.

fig. 5: explicit gradient plasticity: (a) dependence of the apparent hardening modulus on the size of the specimen (layer thickness), (b) distribution of plastic strain across the layer

3.4 implicit gradient plasticity

the model with the yield stress dependent on the second gradient of the cumulative plastic strain falls into the category of explicit gradient models. recently it turned out that certain advantages can be gained by using implicit gradient formulations, which are closely related to integral-type nonlocal models. a prominent example is the so-called ductile damage model of geers, engelen and ubachs (2001). this model introduces the dependence of the yield stress on the nonlocal cumulative plastic strain, κ̄, defined implicitly as the solution of a helmholtz-type differential equation with the local cumulative plastic strain on the right-hand side. for a one-dimensional problem, this equation reads

\bar{\kappa} - l^2 \bar{\kappa}'' = \kappa ,  (67)

where l is, as usual, a model parameter with the dimension of length. in the simplest case, one could replace κ in (58)-(59) by κ̄, but such a formulation would have certain deficiencies. micromechanical arguments based on the idea of a plastically hardening matrix weakened by growing voids lead to the following expression for the yield stress:

\tau_y(\kappa, \bar{\kappa}) = (\tau_0 + h \kappa) \, [1 - \omega_p(\bar{\kappa})] .  (68)
here, the term τ₀ + hκ represents the yield stress of the matrix, which exhibits hardening driven by the local value of the cumulative plastic strain, while 1 - ω_p(κ̄) is an integrity factor taking into account the reduction of the effective area by voids that carry no stress. void propagation is assumed to be driven by the nonlocal cumulative plastic strain, κ̄, and is reflected by the ductile damage function ω_p, which vanishes for κ̄ = 0 and increases later on. due to the nonlinear format of the hardening law (68), analysis of the plastic strain distribution in a general state would lead to a nonlinear differential equation. to allow for an analytical treatment, we restrict attention to the initial distribution of the plastic strain rate. differentiating (68) with respect to time and using the consistency condition \dot{f} = \dot{\tau} - \dot{\tau}_y = 0, we obtain the differential equation

h \, [1 - \omega_p(\bar{\kappa})] \, \dot{\kappa} - (\tau_0 + h \kappa) \frac{d\omega_p}{d\bar{\kappa}} \dot{\bar{\kappa}} = \dot{\tau} .  (69)

at the onset of yielding, κ, κ̄ and ω_p vanish in the entire layer, and the derivative dω_p/dκ̄ evaluated at κ̄ = 0 has the same value ω_p* at all points x. taking all this into account and substituting the rate form of (67) into (69), we get a linear differential equation with constant coefficients,

(h - \tau_0 \omega_p^*) \, \dot{\bar{\kappa}} - h l^2 \dot{\bar{\kappa}}'' = \dot{\tau} .  (70)

in the absence of body forces, static equilibrium implies that the shear stress τ must be uniform, and so the right-hand side of (70) is constant. if h - τ₀ω_p* > 0, the general solution reads

\dot{\bar{\kappa}}(x) = \frac{\dot{\tau}}{\zeta^2 h} + c_1 \cosh\frac{\zeta x}{l} + c_2 \sinh\frac{\zeta x}{l} ,  (71)

where c₁ and c₂ are integration constants and ζ = \sqrt{1 - \tau_0 \omega_p^* / h} is a dimensionless parameter introduced for convenience.

fig. 6: (a) stress versus average strain for different thicknesses of the shear layer, (b) normalized torque, m/d³, versus surface shear strain for twisted copper wires of various diameters d (after fleck et al., 1994)

the particular solution depends on the boundary conditions, which should be formulated in terms of the nonlocal cumulative plastic strain and its normal derivative. for homogeneous neumann boundary conditions, \dot{\bar{\kappa}}'(\pm L/2) = 0, the integration constants vanish and the solution is uniform. consequently, no size effect is predicted by the model. this is similar to what happens for the aifantis model with the homogeneous neumann boundary conditions κ'(±L/2) = 0. a stiff boundary with a perfect bond inhibits the propagation of voids, which can be expressed by the condition that κ̄ vanishes at the boundary. with this homogeneous dirichlet condition applied at both parts of the boundary, the particular solution of (70) becomes

\dot{\bar{\kappa}}(x) = \frac{\dot{\tau}}{\zeta^2 h} \left( 1 - \frac{\cosh(\zeta x / l)}{\cosh(\zeta L / 2l)} \right)  (72)

and the corresponding local cumulative plastic strain rate is

\dot{\kappa}(x) = \frac{\dot{\tau}}{\zeta^2 h} \left( 1 - (1 - \zeta^2) \frac{\cosh(\zeta x / l)}{\cosh(\zeta L / 2l)} \right) .  (73)

following the same procedure as in section 3.3, we find that the shear stress rate is linked to the average shear strain rate by the linear relation

\dot{\tau} = \tilde{g}_{ep} \, \dot{\tilde{\gamma}} ,  (74)

where

\tilde{g}_{ep} = \left( \frac{1}{g} + \frac{1}{\tilde{h}} \right)^{-1}  (75)

is the elastoplastic shear modulus and

\tilde{h} = \frac{\zeta^2 h}{1 - (1 - \zeta^2) \frac{2l}{\zeta L} \tanh\frac{\zeta L}{2l}}  (76)

is the size-dependent hardening modulus. for very large sizes L, the hardening modulus tends to ζ²h, which is the value corresponding to the local theory with ω_p computed from κ instead of κ̄.
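a short sketch of the size dependence implied by (76), with illustrative values of h, l and ζ (the two limits discussed in the text are printed for reference):

```python
import numpy as np

# sketch: size-dependent hardening modulus (76) of the implicit gradient
# model; h, l and zeta are arbitrary demo values
h, l, zeta = 100.0, 1.0, 0.5

def h_tilde(L):
    t = (2.0 * l / (zeta * L)) * np.tanh(zeta * L / (2.0 * l))
    return zeta**2 * h / (1.0 - (1.0 - zeta**2) * t)            # (76)

for L in [2.0, 5.0, 20.0, 1000.0]:
    print(f"L/l = {L/l:6.1f}   h_tilde = {h_tilde(L):8.3f}")
print("large-size limit zeta^2*h =", zeta**2 * h)
# unlike the explicit model, h_tilde stays finite as L -> 0 (it tends to h)
```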
for values of L comparable to l, the hardening modulus increases with decreasing size, which means that the average response becomes stiffer. so the present model predicts a qualitatively similar trend as the explicit version of gradient plasticity. the ratio between the size-dependent hardening modulus and its large-size limit is plotted in fig. 7a as a function of the relative size, L/l, for parameter ζ = 0.5. fig. 7b shows the distribution of the plastic strain rate across the layer for different ratios L/l. even though the general trends are the same as for the explicit gradient model, certain differences can be revealed by comparing fig. 7 with fig. 5. for the explicit model, the hardening modulus tends to infinity as the layer thickness approaches zero, while for the implicit model it tends to a finite value. the local plastic strain on the boundary is zero for the explicit model (as dictated by the boundary condition), while for the implicit model it is positive (because the boundary condition is formulated in terms of the nonlocal plastic strain). for the explicit model, the profile of the plastic strain distribution across the layer thickness keeps the same shape and grows proportionally during hardening, while for the implicit model the analytical solution covers only the initial distribution of the plastic strain rate, and later the shape of the profile would change.

fig. 7: implicit gradient plasticity: (a) dependence of the apparent hardening modulus on the size of the specimen (layer thickness), (b) initial distribution of plastic strain rate across the layer

4 strain localization due to softening

4.1 problems with objective description of softening

in many structures subjected to high levels of solicitation, the initially smooth distribution of strain at a certain critical stage abruptly changes into a localized pattern, with large strains concentrated in relatively narrow regions. this phenomenon, called strain localization, can be caused for instance by the softening character of the material response. the general definition of softening is more involved, but in the one-dimensional case softening means decreasing stress at increasing strain. the physical source of softening usually resides in the growth and coalescence of defects such as voids or cracks. from the micromechanical point of view, this means that the internal structure of the material is evolving, and the approximate description of the material as a macroscopically homogeneous one may become questionable. indeed, softening incorporated into standard inelastic continuum models leads to serious mathematical and numerical problems, and enriched theories are needed to provide an objective description of the softening process. the essence of the localization problem will be explained using a one-dimensional example, which could be interpreted as localization of shear strain in a layer of elastoplastic material between two rigid plates (already studied in the previous section in the context of size effects). however, to better show various facets of nonlocal theories and their broad application field, we will discuss the closely related problem of a prismatic bar of length L and cross-sectional area A under uniaxial tension. the bar is made of a softening material described by a continuum damage model rather than by an elastoplastic model.
damage mechanics is frequently used for quasibrittle materials such as concrete under predominantly tensile loading. many results and conclusions of the present section could be directly reinterpreted in terms of shear bands in ductile materials such as metals or confined soils. in the one-dimensional setting, the stress-strain law used by the damage model reads

\sigma = (1 - \omega) \, e \, \varepsilon ,  (77)

where σ and ε are the (normal) stress and strain, e is young's modulus of elasticity, and ω is a scalar damage variable characterizing the current size and density of the defects that reduce the effective area capable of transmitting stress. in the "virgin", undamaged state of the material with no defects (or with very small initial defects that are incorporated in the elastic modulus), the value of the damage parameter is zero, and it remains zero throughout the elastic stage of loading. when the elastic limit is exceeded, damage starts growing and the elastically computed stress eε is reduced by the integrity factor 1 - ω. the limit value ω = 1 corresponds to a fully damaged material that can no longer carry any stress. the growth of damage must be described by an appropriate damage evolution law for the internal variable ω. this law could be postulated in the rate form, but a particularly simple and practical formulation is obtained with a damage law in the total form

\omega = g(\kappa) ,  (78)

where κ corresponds to the maximum level of strain reached in the previous history of the material. mathematically, the internal variable κ is defined by the loading-unloading conditions

\dot{\kappa} \ge 0 , \quad \varepsilon - \kappa \le 0 , \quad \dot{\kappa} \, (\varepsilon - \kappa) = 0 .  (79)

during monotonic loading, κ is equal to the strain ε, and so the damage evolution function g that appears in (78) can easily be identified from the monotonic stress-strain curve. now suppose that the stress-strain diagram is linear up to a certain strain level ε₀, after which the stress decreases as a linear function of strain until the zero stress level is reached at strain ε_f (fig. 8a). this linear softening model can be considered as the simplest description of concrete cracking under tension. due to the heterogeneous and quasibrittle nature of the material, a contiguous stress-free crack across the entire section of a bar does not form instantaneously but is obtained as the final result of the propagation and coalescence of many smaller cracks. consequently, even after the onset of cracking the bar can still transmit a certain force, but its residual strength decreases as the cracks evolve. under static loading and in the absence of body forces, the stress in the bar must be uniformly distributed. in the elastic range, strain is a unique function of stress, and so the strain distribution must be uniform as well. when the peak stress (tensile strength) f_t is attained, the uniqueness of the response is lost. stress must still remain uniform and it can only decrease, but a given stress level can be reached either by further stretching of the material into the softening regime, or by elastic unloading. consequently, there are many different spatial distributions of the strain increments that lead to the same uniform stress decrement and thus represent a valid solution satisfying the equilibrium equation and the constitutive law. the compatibility equations do not represent any important constraint in the one-dimensional case, because the strain field can always be integrated to yield the displacement field. for example, the material can be softening in an interval of length L_s and unloading everywhere else.
when the stress is completely relaxed to zero, the strain in the softening region is ε_f and the strain in the unloading region vanishes; the total elongation of the bar is therefore u = L_s ε_f. the length L_s remains undetermined, and it can have any value between 0 and L. this means that the problem has infinitely many solutions; the corresponding post-peak branches of the load-displacement diagram fill the fan shown in fig. 8b. the ambiguity is removed if imperfections are taken into account. real material properties and sectional dimensions cannot be perfectly uniform. suppose that the strength in a small region is slightly lower than in the remaining portion of the bar. when the applied stress reaches the reduced strength, softening starts and the stress decreases. consequently, the material outside the weaker region must unload elastically, because its strength has not been exhausted. this leads to the conclusion that the size of the softening region cannot exceed the size of the region with minimum strength. such a region can be arbitrarily small, and the corresponding softening branch can be arbitrarily close to the elastic branch of the load-displacement diagram.

fig. 8: (a) stress-strain diagram with linear softening, (b) fan of possible post-peak branches of the load-displacement diagram

thus the standard strain-softening continuum formulation leads to a solution that has several pathological features: (i) the softening region is infinitely small; (ii) the load-displacement diagram always exhibits snapback, independently of the structural size and of the material ductility; (iii) the total amount of energy dissipated during the failure process is zero. from the mathematical point of view, these annoying features are related to the so-called loss of ellipticity of the governing differential equation. in the present one-dimensional setting, the loss of ellipticity occurs when the tangent modulus ceases to be positive, i.e., it coincides with the onset of softening (this is not always the case in multiple dimensions). the boundary value problem becomes ill-posed, i.e., it does not have a unique solution with continuous dependence on the given data. from the numerical point of view, the ill-posedness is manifested by a pathological sensitivity of the results to the size of the finite elements. for example, suppose that the bar is discretized by n_e two-node elements with linear displacement interpolation. if the numerical algorithm properly captures the most localized solution, the softening region extends over one element, and we have L_s = L / n_e. the slope of the post-peak branch therefore strongly depends on the number of elements, and it approaches the initial elastic slope as the number of elements tends to infinity; see fig. 9a, constructed for a linear softening law with ε_f = 20ε₀. the strain profiles at u = 2ε₀L for various mesh refinements are plotted in fig. 9b (under the assumption that the weakest spot is located at the center of the bar). in the limit, the profiles tend to 2ε₀L δ(x - L/2), where δ denotes the dirac distribution. the limit solution represents a displacement jump at the center, with zero strain everywhere else.

fig. 9: pathological effects of mesh refinement on the numerical results obtained with the local damage model: mesh-dependence of (a) load-displacement diagrams, (b) strain profiles
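the pathological mesh dependence just described can be quantified by a few lines of arithmetic; the sketch below uses the same ratio ε_f = 20ε₀ as fig. 9 but is otherwise an illustrative construction:

```python
# sketch: mesh dependence of the local damage model with linear softening
# (eps_f = 20*eps0, as in fig. 9); the softening region is one element long
eps0, L = 1.0, 1.0
eps_f = 20.0 * eps0
u_peak = eps0 * L                      # elongation at peak stress

for ne in [2, 5, 20, 100]:
    ls = L / ne                        # softening zone = one element
    u_final = ls * eps_f               # elongation at complete failure
    note = "snapback" if u_final < u_peak else "no snapback"
    print(f"ne = {ne:4d}   u_final = {u_final:6.3f}   ({note})")
# u_final -> 0 with mesh refinement; the dissipated energy tends to zero
```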
4.2 nonlocal formulation serving as localization limiter

in real materials, inelastic processes typically localize in narrow bands that initially have a small but nonzero width. propagation and coalescence of microdefects in the localization band can eventually lead to the formation of a displacement discontinuity, e.g., of a macroscopic stress-free crack or a sharp slip line. the initial thickness of the localization band depends on the material microstructure and is usually of the same order of magnitude as the characteristic material length, determined by the size or spacing of the dominant heterogeneities. therefore, it is natural to expect that enriched continuum theories can better reflect the actual deformation and failure processes and restore the mathematical well-posedness of the boundary value problem. indeed, when properly formulated, nonlocal or gradient enrichments regularize the problem in the sense that the resulting strain field is highly concentrated in certain narrow zones but remains continuous. the corresponding numerical solutions converge upon mesh refinement to a physically meaningful limit, and the numerical results do not suffer from pathological sensitivity to the discretization. enrichments that prevent localization of strain into arbitrarily small regions are called localization limiters. nonlocal material models of the integral type were first exploited as localization limiters in the 1980s. after some preliminary formulations exploiting the concept of an imbricate continuum (bažant, belytschko and chang, 1984), the nonlocal damage theory emerged (pijaudier-cabot and bažant, 1987). nonlocal formulations were then developed for a number of constitutive theories, including softening plasticity, smeared cracking, microplane models, etc. for a list of references, see e.g. bažant and jirásek (2002) or chapter 26 in jirásek and bažant (2002). generally speaking, the nonlocal approach consists in replacing a certain variable by its nonlocal counterpart obtained by weighted averaging over a spatial neighborhood of each point under consideration. in nonlocal elasticity, the averaged quantity is usually the strain. nonlocal elastic models can correctly reflect the experimentally observed dispersion of short elastic waves. however, in typical structural applications, the strain in the elastic regime remains relatively smoothly distributed (with the exception of stress concentrations and singularities around specific points, e.g., tips of pre-existing sharp cracks). steep strain gradients appear only after the onset of localization and are accompanied by a highly nonuniform distribution of damage. therefore, most nonlocal models serving as localization limiters reduce to standard local elasticity at low strain levels, and the nonlocal effects are considered only in the inelastic regime. for instance, one widely used nonlocal damage formulation replaces the strain in the loading-unloading conditions (79) by its nonlocal average, ε̄, while the strain entering the stress-strain law (77) is still considered as local. according to the modified loading-unloading conditions,

\dot{\kappa} \ge 0 , \quad \bar{\varepsilon} - \kappa \le 0 , \quad \dot{\kappa} \, (\bar{\varepsilon} - \kappa) = 0 ,  (80)

the internal variable κ has the meaning of the largest previously reached value of the nonlocal strain.
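one step of the averaging that drives (80) can be sketched as follows; the gauss weight (52) is used, and the grid, internal length and strain field are illustrative assumptions of the demo:

```python
import numpy as np

# sketch of one step of the nonlocal damage update (80): the internal
# variable kappa is driven by the nonlocal average of strain, so a strain
# spike raises kappa over a finite neighborhood of width ~a
a, n = 0.05, 401
x = np.linspace(0.0, 1.0, n)
dx = x[1] - x[0]

w = np.exp(-((x[:, None] - x[None, :]) / a)**2) / (a * np.sqrt(np.pi))
w /= w.sum(axis=1, keepdims=True) * dx            # renormalize near boundaries

eps = 0.001 + 0.004 * (np.abs(x - 0.5) < 0.01)    # strain spike at midspan
eps_bar = (w @ eps) * dx                          # nonlocal strain
kappa = np.maximum(np.zeros_like(x), eps_bar)     # loading branch of (80)

print("local peak   :", eps.max())
print("nonlocal peak:", eps_bar.max())            # lower, but spread over ~a
```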
if strain has a tendency to localize in a very narrow region, e.g., in one single cross section, the nonlocal strain becomes high not only in this region but also in its close neighborhood. this leads to damage growth in that neighborhood, and the local strain at the damaged points must be increased in order to keep the stress distribution uniform. by this mechanism, strain and damage are prevented from localizing into a single cross section. the localized region always has a certain finite size, controlled by the length parameter that appears in the definition of the nonlocal weight function.

4.3 localization analysis

the initial bifurcation from a uniform state can be studied analytically under the simplified assumption that the strain keeps increasing at all points, both for the fundamental uniform solution and for the bifurcated nonuniform solution. if the bifurcated solution is considered as a small perturbation of the uniform solution, the stress perturbation δσ can be linked to the strain perturbation δε by the linearized equation

\delta\sigma = (1 - \omega) \, e \, \delta\varepsilon - e \varepsilon \, \delta\omega ,  (81)

and the perturbation of the damage field δω is linked to the strain perturbation by the nonlocal relation

\delta\omega(x) = g^* \int_L \alpha(x, \xi) \, \delta\varepsilon(\xi) \, d\xi ,  (82)

where g* is the derivative of the damage function g with respect to its argument, evaluated for the fundamental (uniform) solution and therefore independent of the spatial coordinate. even though the fundamental solution is considered as static, we will analyze the evolution of the perturbation as a dynamic process. this approach provides more insight into the localization phenomena. in dynamics, the stress perturbation and the displacement perturbation must satisfy the equation of motion

\rho \, \delta\ddot{u} - \delta\sigma' = 0 .  (83)

for an infinite body and a nonlocal weight function in the form α(x, ξ) = α₀(x - ξ), the assumed solution for δu in the form of a harmonic wave substituted into (81)-(83) leads to the dispersion equation

\rho\omega^2 - (1 - \omega) \, e k^2 + g^* \varepsilon \, e k^2 \alpha_0^*(k) = 0 ,  (84)

where ω is the circular frequency of the wave (not to be confused with the damage variable ω), k is the wave number, and α₀* is the fourier image of the weight function. the resulting dispersion law reads

\omega = c_0 k \sqrt{1 - \omega - g^* \varepsilon \, \alpha_0^*(k)} .  (85)

the fourier image α₀* has a unit value at k = 0 and smaller values at positive wave numbers k. so if 1 - ω - g*ε > 0, all wave numbers have real positive frequencies, and a small perturbation of an arbitrary shape propagates through the body but does not grow in magnitude. if 1 - ω - g*ε < 0, there exists a band of low wave numbers between zero and a positive limit k_crit for which the frequencies are imaginary. this indicates an instability if the perturbation contains a component with a sufficiently long wavelength. in statics, a stationary wave of wavelength 2π/k_crit can be superimposed on the fundamental uniform solution without violating the equilibrium condition. the critical wave number can be determined from the condition ω = 0. in a monotonic loading process, the damage variable ω is a function of strain, ω = g(ε), and also the derivative g* = dg/dε is a function of strain. the expression under the square root in (85) vanishes for the wave number satisfying the condition

\alpha_0^*(k_{crit}) = \frac{1 - g(\varepsilon)}{\varepsilon \, dg(\varepsilon)/d\varepsilon} .  (86)

for a given damage function g and nonlocal weight function α₀, this equation can be solved (either analytically or numerically) and the critical wavelength l_crit = 2π/k_crit can be determined as a function of strain.
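condition (86) can be solved by simple bisection; the sketch below assumes the gauss image α₀*(k) = exp(−k²a²/4) and uses, for the demo, the exponential damage law introduced in the next paragraph with ε_f = 11ε₀, so the printed value can be compared with the closed-form result quoted there:

```python
import numpy as np

# bisection for the critical wave number in (86),
# alpha0*(k_crit) = (1 - g(eps)) / (eps * g'(eps)),
# assuming the gauss image alpha0*(k) = exp(-k^2 a^2 / 4)
eps0, ef, a = 1.0, 11.0, 1.0   # illustrative values, ef = 11*eps0

def g(e):                       # exponential damage law, cf. (87) below
    return 1.0 - (eps0 / e) * np.exp(-(e - eps0) / (ef - eps0))

def dg(e, d=1e-7):              # numerical derivative of g
    return (g(e + d) - g(e - d)) / (2.0 * d)

def k_crit(eps):
    rhs = (1.0 - g(eps)) / (eps * dg(eps))
    lo, hi = 1e-9, 50.0 / a     # bracket: the image decreases from 1 to 0
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if np.exp(-(mid * a)**2 / 4.0) > rhs:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

l_crit = 2.0 * np.pi / k_crit(eps0)   # state at peak stress, eps = eps0
print(l_crit)                          # ~10.176*a, cf. the closed form below
```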
for instance, for an exponential softening curve (more realistic for concrete than the linear one), the damage function is given by

g(\varepsilon) = 1 - \frac{\varepsilon_0}{\varepsilon} \exp\left( -\frac{\varepsilon - \varepsilon_0}{\varepsilon_f - \varepsilon_0} \right) ,  (87)

where ε₀ is the limit elastic strain and ε_f is another parameter controlling the slope of the softening curve. substituting (87) and the fourier image α₀*(k) = exp(-k²a²/4) of the gauss-like weight function (52) into (86), we obtain

l_{crit} = \frac{2\pi}{k_{crit}} = \frac{\pi a}{\sqrt{\ln\left( 1 + \frac{\varepsilon}{\varepsilon_f - \varepsilon_0} \right)}} .  (88)

of course, this expression is valid only in the inelastic range, i.e., for ε ≥ ε₀. for the present damage function (87), the elastic limit coincides with the peak of the stress-strain curve. if, for instance, ε_f = 11ε₀, the critical wavelength for the uniform state at peak stress is l_crit = πa/√(ln 1.1) = 10.176a. this example shows that the critical wavelength is proportional to the model parameter a that plays the role of the internal material length (and is often denoted as l). in an infinite bar at peak stress, the appearance of a stationary wave of wavelength l_crit corresponds to a particular periodic localization pattern. in reality, there exists an energetically more favorable localization pattern that is not periodic but is concentrated into one single interval of length slightly below the critical wavelength. the exact size of the localized zone could be solved from a fredholm integral equation of the second kind combined with the complementarity conditions, but since the solution is not available in closed form, we will not elaborate on that. if the bar is finite but longer than l_crit, it can be expected that at peak stress the subsequent increments of strain localize into a band of thickness approximately equal to l_crit. for shorter bars, localization can be delayed, and the actual behavior depends on the specific form of the nonlocal averaging operator around the boundaries. however, in general it can be expected that localization occurs when the critical wavelength becomes approximately equal to the bar length. as shown in fig. 10a, the critical wavelength monotonically decreases with increasing strain and asymptotically tends to zero. this property has important implications for the evolution of the localized strain profile: it indicates that the active zone in which strains are increasing tends to shrink during the loading process. such a trend has indeed been confirmed by numerical simulations.

4.4 mesh sensitivity and size effect

fig. 10b shows the load-displacement diagram for strain localization in a bar under uniaxial tension, calculated by the finite element method using a nonlocal damage model with the weight function (53) and with the exponential damage function (87). as the number of elements n_el increases, the load-displacement curve rapidly converges to the exact solution. convergence of the strain and damage profiles generated by an applied displacement u = 2ε₀L is documented in fig. 10c, d. in contrast to the local model, the process zone does not shrink to a single point as the mesh is refined. its size is controlled by the parameter a, which sets the internal length scale. this example demonstrates that the nonlocal formulation serves as a localization limiter and provides an objective description of the localization process, with no pathological sensitivity of the numerical results to the discretization.

fig. 10: nonlocal damage model: (a) critical wavelength as a function of strain (for reference, the stress-strain curve is plotted by the dashed line), (b) convergence of load-displacement diagram, (c) convergence of strain profile, (d) convergence of damage profile; n_el = number of elements
another important advantage of nonlocal softening models is that they can realistically describe the size effect on the nominal strength of quasibrittle materials. the nominal strength is understood as the peak load divided by a characteristic area of the structure. according to the standard (local) version of perfect plasticity theory, the nominal strength for a set of geometrically similar structures of different sizes should be the same, independent of the size. for instance, for a beam of a rectangular cross section, subjected to three-point bending, the plastic collapse load is

$F_0 = \dfrac{4M_p}{l} = \dfrac{\sigma_0 b d^{2}}{l}$ , (89)

where $b$ is the width and $d$ the depth of the cross section, $l$ is the span, $M_p$ is the plastic limit moment of the cross section, and $\sigma_0$ is the yield stress. the nominal strength, defined as the peak load divided by the cross-sectional area,

$\sigma_N = \dfrac{F_0}{bd} = \sigma_0\,\dfrac{d}{l}$ , (90)

is equal to the material property $\sigma_0$ multiplied by the geometrical factor $d/l$, which depends only on the shape of the structure but not on its size (assuming proportional scaling of all structural dimensions for three-dimensional similarity, or at least of the in-plane dimensions $l$ and $d$ for two-dimensional similarity). in contrast to elastic-perfectly plastic structures, structures made of quasibrittle materials often exhibit a strong size dependence of the nominal strength. for certain specimen geometries with initial notches scaled proportionally to other structural dimensions, the large-size limit is adequately described by linear elastic fracture mechanics, which predicts proportionality of the nominal strength to $1/\sqrt{d}$.

fig. 10: nonlocal damage model: (a) critical wavelength as a function of strain (for reference, the stress-strain curve is plotted by the dashed line), (b) convergence of load-displacement diagram, (c) convergence of strain profile, (d) convergence of damage profile; $n_{el}$ = number of elements

the transition between no size effect for small sizes and strong size effect for large sizes can be approximated by bažant's (1984) formula

$\sigma_N = \dfrac{\sigma_{N0}}{\sqrt{1+\dfrac{d}{d_0}}}$ , (91)

where $d_0$ and $\sigma_{N0}$ are parameters to be determined by fitting of the experimental results. the transitional size effect characteristic of quasibrittle structures is nicely reproduced by nonlocal softening models, if the model parameters are chosen correctly. as an example, consider the compact tension test of concrete depicted in fig. 11a. the experimental results obtained by wittmann, mihashi and nomura (1990) are shown in the logarithmic size effect diagram in fig. 11b, along with the optimal fit by formula (91) and the results of numerical simulations. the role of the characteristic structural size $d$ is played by the ligament length, and the nominal strength is defined as the peak load divided by the ligament area $bd$, where $b$ is the out-of-plane thickness of the specimen. the experimentally measured size effect can be accurately reproduced by a nonlocal isotropic damage model with different combinations of the internal length parameter $a$ and the parameter $\varepsilon_f$ of the damage function (87). if one parameter is fixed, the other can be determined by optimal fitting, but both parameters cannot be determined simultaneously from the size effect on nominal strength only.
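formula (91) is easily fitted to measured nominal strengths by nonlinear least squares. in the sketch below, the data points are made up for demonstration only; they are not the wittmann, mihashi and nomura results.

    import numpy as np
    from scipy.optimize import curve_fit

    def size_effect(d, sigma_n0, d0):
        # bazant's size effect law, eq. (91)
        return sigma_n0 / np.sqrt(1.0 + d / d0)

    # hypothetical measurements: ligament lengths [mm] and nominal strengths [mpa]
    d_meas = np.array([50.0, 100.0, 200.0, 400.0, 800.0])
    s_meas = np.array([3.1, 2.8, 2.3, 1.8, 1.4])

    (sigma_n0, d0), _ = curve_fit(size_effect, d_meas, s_meas, p0=(4.0, 100.0))
    print(sigma_n0, d0)          # fitted parameters of the transitional law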
a unique parameter identification requires additional information such as the distribution of strain or damage in the process zone (geers, de borst, brekelmans and peerlings, 1999) or the size effect on fracture energy (jirásek, rolshoven and grassl, 2004).

5 concluding remarks

the common denominator of all examples presented in the preceding sections is that the characteristic wavelength of the deformation field becomes comparable to the characteristic size of the internal material structure. here, the notion of characteristic wavelength has to be understood in a broad sense, not only as the spatial period of a dynamic phenomenon but also as the length on which the value of strain changes substantially in static problems. such a more general definition could be based e.g. on a suitably normalized ratio between the maximum strain and the maximum strain gradient (both in absolute values). thus the characteristic wavelength is necessarily close to the internal material length if the size of the specimen is not much larger than the size and spacing of major heterogeneities, or if strain localizes due to softening. the enrichment terms introduced by various generalized continuum theories have a differential or integral character, but all of them can be considered as nonlocal, at least in the weak sense. they always introduce a model parameter with the dimension of length, which reflects the internal length scale of the material. the three typical cases covered here encompass static and dynamic phenomena, linear and nonlinear behavior, and three classes of material laws, namely elasticity, plasticity and continuum damage mechanics. this shows that the nonlocal enrichments can be useful in a wide range of mechanical problems. unfortunately, so far there is no general and universally accepted theory covering this entire range within one unified framework. even though the first nonlocal theories were pioneered in the 1960s, many problems remain open and many issues unresolved. some of the most challenging questions include the correct formulation of boundary conditions, the micromechanical justification of models with nonlocal internal variables, or identification techniques for the internal length parameter and its possible evolution.

fig. 11: (a) compact tension specimen, (b) size effect diagram

references

[1] aifantis e. c.: “on the microstructural origin of certain inelastic models.” journal of engineering materials and technology, asme, vol. 106 (1984), p. 326–330.

[2] aifantis e. c.: “update on a class of gradient theories.” mechanics of materials, vol. 35 (2003), p. 259–280.

[3] askes h., metrikine a. v.: “one-dimensional dynamically consistent gradient elasticity models derived from a discrete microstructure. part 2: static and dynamic response.” european journal of mechanics a, vol. 21 (2002), p. 573–588.

[4] bažant z. p.: “size effect in blunt fracture: concrete, rock, metal.” journal of engineering mechanics, asce, vol. 110 (1984), p. 518–535.

[5] bažant z. p., chang t.-p.: “instability of nonlocal continuum and strain averaging.” journal of engineering mechanics, asce, vol. 110 (1984), p. 2015–2035.

[6] bažant z. p., jirásek m.: “nonlocal integral formulations of plasticity and damage: survey of progress.” journal of engineering mechanics, asce, vol. 128 (2002), p. 1119–1149.

[7] bažant z. p., belytschko t. b., chang t.-p.: “continuum model for strain softening.” journal of engineering mechanics, asce, vol. 110 (1984), p. 1666–1692.
[8] eringen a. c.: “linear theory of nonlocal elasticity and dispersion of plane waves.” international journal of engineering science, vol. 10 (1972), p. 425–435.

[9] fish j., chen w., nagai g.: “nonlocal dispersive model for wave propagation in heterogeneous media. part 1: one-dimensional case.” international journal for numerical methods in engineering, vol. 54 (2002a), p. 331–346.

[10] fish j., chen w., nagai g.: “nonlocal dispersive model for wave propagation in heterogeneous media. part 2: multi-dimensional case.” international journal for numerical methods in engineering, vol. 54 (2002b), p. 347–363.

[11] fleck n. a., hutchinson j. w.: “a reformulation of strain gradient plasticity.” journal of the mechanics and physics of solids, vol. 49 (2001), p. 2245–2271.

[12] fleck n. a., muller g. m., ashby m. f., hutchinson j. w.: “strain gradient plasticity: theory and experiment.” acta metallurgica et materialia, vol. 42 (1994), p. 475–487.

[13] geers m. g. d., de borst r., brekelmans w. a. m., peerlings r. h. j.: “validation and internal length scale determination for a gradient damage model: application to short glass-fibre-reinforced polypropylene.” international journal of solids and structures, vol. 36 (1999), p. 2557–2583.

[14] geers m. g. d., engelen r. a. b., ubachs r. j. m.: “on the numerical modelling of ductile damage with an implicit gradient-enhanced formulation.” revue européenne des éléments finis, vol. 10 (2001), p. 173–191.

[15] green a. e., rivlin r. s.: “simple force and stress multipoles.” archive for rational mechanics and analysis, vol. 16 (1964), p. 325–353.

[16] jirásek m., bažant z. p.: “inelastic analysis of structures.” john wiley and sons, chichester, u.k., 2002.

[17] jirásek m., rolshoven s., grassl p.: “size effect on fracture energy induced by nonlocality.” international journal for numerical and analytical methods in geomechanics, vol. 28 (2004), in press.

[18] lakes r. s.: “experimental microelasticity of two porous solids.” international journal of solids and structures, vol. 22 (1986), p. 55–63.

[19] ma q., clarke d. r.: “size dependent hardness in silver single crystals.” journal of materials research, vol. 10 (1995), p. 853–863.

[20] metrikine a. v., askes h.: “one-dimensional dynamically consistent gradient elasticity models derived from a discrete microstructure. part 1: generic formulation.” european journal of mechanics a, vol. 21 (2002), p. 555–572.

[21] mindlin r. d.: “micro-structure in linear elasticity.” archive for rational mechanics and analysis, vol. 16 (1964), p. 51–78.

[22] mindlin r. d.: “second gradient of strain and surface tension in linear elasticity.” international journal of solids and structures, vol. 1 (1965), p. 417–438.

[23] morrison j. l. m.: “the yield of mild steel with particular reference to the effect of size of specimen.” proceedings of the institution of mechanical engineers, vol. 142 (1939), p. 193–223.

[24] nagai g., fish j., watanabe k.: “stabilized nonlocal model for wave propagation in heterogeneous media.” computational mechanics, vol. 33 (2004), p. 144–153.

[25] nix w. d.: “mechanical properties of thin films.” metallurgical transactions, vol. 20a (1989), p. 2217–2245.

[26] pijaudier-cabot g., bažant z. p.: “nonlocal damage theory.” journal of engineering mechanics, asce, vol. 113 (1987), p. 1512–1533.

[27] poole w. j., ashby m. f., fleck n. a.: “microhardness of annealed and work-hardened copper polycrystals.” scripta metallurgica et materialia, vol. 34 (1996), p. 559–564.
[28] rogula d.: “introduction to nonlocal theory of material media”, in d. rogula (ed.), nonlocal theory of material media, no. 268 in cism courses and lectures, springer verlag, wien and new york, 1982, p. 125–222.

[29] shu j. y., fleck n. a., van der giessen e., needleman a.: “boundary layers in constrained plastic flow: comparison of nonlocal and discrete dislocation plasticity.” journal of the mechanics and physics of solids, vol. 49 (2001), p. 1361–1395.

[30] stolken j. s., evans a. g.: “a microbend test method for measuring the plasticity length scale.” acta metallurgica et materialia, vol. 46 (1998), p. 5109–5115.

[31] toupin r. a.: “elastic materials with couple-stresses.” archive for rational mechanics and analysis, vol. 11 (1962), p. 385–414.

[32] wittmann f. h., mihashi h., nomura n.: “size effect on fracture energy of concrete.” engineering fracture mechanics, vol. 35 (1990), p. 107–115.

[33] yarnell j. l., warren j. l., koenig s. h.: in lattice dynamics, r. f. wallis (ed.), pergamon press, 1965, p. 57.

milan jirásek
laboratory of structural and continuum mechanics
swiss federal institute of technology at lausanne (epfl)
1015 lausanne, switzerland

acta polytechnica
doi:10.14311/ap.2018.58.0388
acta polytechnica 58(6):388–394, 2018 © czech technical university in prague, 2018

modification of hierarchical agglomerative clustering with complete linkage for optimal usage with passive optical networks

tomáš pehnelt, pavel lafata∗, marek nevosad

department of telecommunication engineering, faculty of electrical engineering, czech technical university in prague, technicka 2, prague 6, czech republic
∗ corresponding author: lafatpav@fel.cvut.cz

abstract. today, passive optical networks (pons) represent a modern solution for high-speed fttx subscriber lines and networks, due to the fact that optical fibres and cables are deployed in last-mile network segments. unfortunately, the costs of trenching and installation of optical cables in cities are still very high today. however, the application of mathematical methods and algorithms for designing optimum network solutions and topologies might significantly decrease the overall capital expenditures (capex) of the whole process. this article introduces an efficient method based on agglomerative clustering for an optimization of the deployment of pons in practice. the paper describes its application and proposes the necessary modifications of the clustering method for the environment of pons. the proposed algorithm uses the technique to cluster and connect the adjacent optical terminals and splitters of pons and to calculate the optimum locations of these units in order to achieve the most optimum network solution and to minimize the summary capital expenditures.

keywords: passive optical network; network optimization; topology optimization; algorithms; hierarchical clustering; optical splitter placement.

1. introduction

nowadays, the demands for the transmission speed and overall transmission capacity of access networks are rapidly increasing [1, 2]. one of the most promising network solutions for building modern fttx lines and networks are passive optical networks (pons) [3].
a typical pon consists of one optical line termination (olt), which acts as a central unit, and optical network terminals (onts) or units (onus), which are located at the end-point users and connect these users to the pon itself [3, 4]. the optical distribution network (odn) is then composed of all optical elements between the olt and all onts, including optical fibres, patch cords, connectors, and especially passive optical splitters [3, 4]. since all these components are typically passive (no power consumption, no management), no optical signal regeneration, routing nor switching is possible between the olt and onts; therefore, the odn must always be properly designed and optimized [3, 4]. however, there are also economical aspects and criteria, which need to be addressed during the process of planning the odn and its topology in practice [5, 6]. today, the costs of deployment of optical fibres and cables, especially the trenching costs and the installation costs, are still very high. the last aspect, which needs to be addressed during the designing and planning of realistic odns and pons in practice, are the locations and placements of all optical components in real networks and situations [7, 8]. generally, the odns are built mainly in cities, suburbs, residential areas, industrial and commercial objects, because the optical fibres are mostly trenched or installed along the roads and pavements [9], in drains and subways, using existing street lights, telephone poles, etc. it is also necessary to adequately place the passive optical splitters and the rest of the network elements. it is evident that the designing of a passive odn in order to meet all the network criteria for a specific pon version is an important process, which needs specific calculations and designing methods [10]. there are various existing techniques described in numerous scientific papers focused on the optimization and planning of pon topologies and networks; the most important ones will be briefly discussed in the following section. this paper, however, is focused especially on the clustering of onts in order to obtain optimum locations of passive optical splitters and an optimum network topology. through performing a correct cluster analysis of ont units and by designing their proper clusters, the optimum placement of passive splitters can be achieved and the overall network topology planned. the solution described in this paper is based on a hierarchical agglomerative clustering algorithm with implemented modifications and innovations. the proposed algorithm is primarily focused on the number of onts in the formed clusters in order to optimize the splitting ratio of splitters. as the passive splitters greatly increase the summary attenuation in pons, the optimization of their positions and splitting ratios also plays an important role. the proposed algorithm was also tested on typical realistic scenarios and the results are included in this paper as well.

2. literature review

this section brings a short review and comparison of existing and proposed techniques and algorithms for designing optimum pon topologies and networks. various existing studies have investigated methods for the optimization of network deployment, including optical network and pon deployment.
one of the very first methods was to apply a heuristic algorithm together with a minimum spanning tree algorithm in order to optimize the fibre length in the pon deployment, as presented in [11] and further improved in [12]. however, this solution does not use real map data and the computation time of the heuristic method is also quite long; therefore, large scenarios must be divided into smaller ones, as illustrated in [8]. another approach is to use ant-colony algorithms, as described in [13], and their combination with a metaheuristic method and updated techniques in [14]. in [14] and [15], an ant-colony method is used to find optimum locations of passive splitters to solve the optimization task of passive optical networks. however, a complex solution of the whole network topology is not provided, and moreover, no further optimization criteria are implemented for a multi-parametric optimization process. similarly, [16] and [17] propose the application of genetic algorithms and genetically oriented techniques; however, the optimization is oriented towards an optimum optical topology without the possibility of a capex minimization option or an attenuation minimization decision. in order to offer a complex pon optimization, [9] and [18] provide a combination of various algorithm techniques and methods to solve this problem. in [9], all end-users in the designed pon topology are first separated into clusters, then the clusters are connected to the olt using splitters. similarly, in [18], a hierarchical cascading technique of passive splitters is presented together with the application of clustering algorithms. in [19] and [20], surveys dealing with the recent pon evolution and innovations can be found. nevertheless, none of the above referenced papers deals with the optimization of splitting ratios and with the clustering of the ont units based on their optimum number in order to obtain an optimized splitting ratio solution. the solution presented here is partially based on initial ideas already presented in [21]; however, this paper is focused exclusively on the optimization of the splitting ratio performed through a hierarchical agglomerative clustering technique and on proposing an innovative algorithm.

3. proposed hierarchical agglomerative clustering algorithm

the main contribution of this paper is the design of an innovative hierarchical agglomerative clustering algorithm with a modified complete linkage. the cluster connections are updated throughout the run of the algorithm, resulting in updated clusters of ont units, in order to optimize the splitting ratios of the passive splitters in these clusters. a passive optical splitter is a key component of pons, providing the splitting of optical signals in the downstream direction towards all its outputs, while it combines the optical signals together in the upstream direction. the internal structure of each passive splitter is composed of basic y-junctions, while each y-junction forms a passive splitter with a ratio 1:2. by cascading these junctions, a resulting splitting ratio can be achieved. due to that, the splitting ratios of passive splitters are typically powers of 2, e.g., 1 : 2, 1 : 4, 1 : 8, 1 : 16, 1 : 32, 1 : 64, 1 : 128 and 1 : 256. due to that, the optimum number of ont units in each cluster should be close to a power of 2 if possible, as sketched in the example below. the following section brings the description and proposal of the innovative algorithm technique.
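as a small illustration of this rule, the following sketch (an illustrative helper of ours, not taken from the paper) returns the splitting ratio whose port count is the nearest greater power of 2 for a given cluster, keeping one port as a reserve:

    def target_split_ratio(n_onts, reserve=1):
        # smallest power of two accommodating the cluster plus the reserve port
        ports = 2
        while ports < n_onts + reserve:
            ports *= 2
        return ports

    for n in (3, 7, 14, 26, 30):
        print(n, "onts -> splitter 1 :", target_split_ratio(n))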
the basic hierarchical agglomerative clustering algorithm is modified and amended with a complete linkage functionality.

3.1. description of the existing hierarchical agglomerative clustering method

the hierarchical agglomerative clustering algorithm is a hierarchical method of a gradual merging of existing clusters until the desired number of clusters is reached. there are several methods for deciding which clusters should be merged; these methods are called linkage criteria. the criterion is used to determine the distance between two clusters. the following list contains some of the best-known criteria; the list is far from complete, as there are many existing types of linkage criteria:

complete linkage. it uses the maximum possible distance between two clusters, i.e., the distance between the two farthest points in the two sets.

single linkage. in contrast to the complete linkage, single linkage uses the smallest distance between two clusters.

centroid. this linkage uses the distance between the centres of the two clusters.

average linkage. it uses the average distance between each point in one cluster and each point in the other cluster.

another important parameter for hierarchical clustering is the metric, i.e., which distance function will be used for evaluating the distance between points; most commonly used are the euclidean or hamming distances. the algorithm for the hierarchical agglomerative clustering starts with each point being a single cluster. in each step, the clusters with the smallest distance according to the selected linkage and metric criteria are merged. this process continues until the initial limit of the number of clusters is reached.

3.2. modified linkage for hierarchical agglomerative clustering algorithm

this section introduces the main contribution of this paper. our goal was to optimize the resulting numbers of ont units in all clusters, so that the number of units would be close to the number of ports of a passive optical splitter (splitting ratio). since the splitting ratio is usually a power of 2, the optimal number of ont units, including a reserve in the form of one port for potential updates and topology modifications in each cluster, should be as close as possible to the nearest higher power of 2. the whole algorithm is described in the flowchart in figure 1.

figure 1. recursive hierarchical agglomerative clustering.

the modified linkage for the hierarchical agglomerative clustering algorithm in figure 1 is recursive, so that it is possible to have a feedback allowing for changing the parameters in the previous step. this includes mainly the parameters s and β, which will be described later. the first step of the algorithm is the decision whether the current number of clusters corresponds to the total number of points to merge. the algorithm starts with the number of clusters corresponding to the number of ont units in the entire pon; the recursive function is called with the current number of clusters plus one. this process stops when the current number of clusters is the same as the number of onu units to merge. when this number of clusters is reached, the main body of the algorithm begins. first, all possible cluster merges are sorted based on the distance between the two farthest points in the clusters to be merged (complete linkage). the list of these merges is further filtered, based on the metrics calculated through the following set of equations.
in the first recursive step, the value β is set to 1 (meaning that the merging probability is 100 %). the threshold value τ is calculated in each recursive step as

$\tau = (1-\beta)\cdot 20$ . (1)

the filtering process returns a boolean value s, which holds the result of the filtering; if no value is left after the filtering, s is false, otherwise it is true. if the filtering was not efficient and s holds false, then the algorithm returns, decrements the value β by t and again checks the current number of clusters. this mechanism returns the algorithm to its beginning until β reaches the threshold value τ or s holds the false value. the decision of merging two clusters (the filtering process) is constrained by the threshold γ calculated as

$\gamma = \mu(\tau+1)$ , (2)

where µ is the minimum distance between two points in the whole dataset. the number of elements in the list is constrained according to the threshold τ as well as according to the number of units in the merged cluster, so that the clusters only contain the allowed amount of ont units plus the reserve given by the following bounds. the upper bound (ub) and lower bound (lb) are calculated as

$ub = 2^{n} - 1$ , (3)

where $2^{n}$ is the nearest greater power of 2 to the number of units in the cluster, and

$lb = \lfloor \beta\,2^{n-1} \rfloor - 1$ , (4)

where β is an acceptance criterion from the input of the process and the part $2^{n-1}$ represents the greatest smaller power of two to the number of ont units in the cluster. finally, the result of the algorithm (number of clusters, number of ont units in clusters, cluster distances and β) is then used to calculate the resulting attenuation of the topology a, the summary length of optical fibres lf and the total trenching distance lt. based on these values, the solutions obtained during the recursive runs of the proposed algorithm can easily be compared and the best one identified. once a solution with the desired values a, lf, lt is reached, or the best solution based on their comparison is achieved, the algorithm ends. both the threshold value τ and the parameter t influence the results of the algorithm. the parameter t represents a granularity of the algorithm steps; through numerous tests and calculations, its value was set to 0.05.

4. results and discussion

the following section contains a discussion of the results and their comparison with the existing hierarchical agglomerative clustering algorithm. for the following comparison, realistic data and maps from openstreetmaps.org were exported and two different examples were selected. the first one, in figure 2, illustrates a typical situation in a historical city centre (urban area), while figure 3 contains a typical scenario for a rural area with 3 villages.

figure 2. an example with the city center.

figure 3. rural area with 3 villages.

in both figures, the position of the optical line termination (olt) is illustrated using a violet square, the positions of passive splitters are marked using black rounds, while ont units are illustrated by red circles. all paths (roads, streets, walkways, etc.) are represented by blue lines, while the resulting optical topology (optical fibres, cables) by red lines. these topologies were obtained by the previously presented methods and algorithms described in a previous article [21] with the implementation of the presented innovative hierarchical agglomerative clustering with a complete linkage.
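for reference, the baseline complete-linkage step of § 3.1 can be reproduced with off-the-shelf tooling, and the bounds (3)-(4) evaluated for the resulting clusters. this is an illustrative sketch under our own reading of eqs. (3)-(4), with made-up coordinates and helper names; it is not the authors' implementation.

    import math
    import numpy as np
    from scipy.cluster.hierarchy import linkage, fcluster

    rng = np.random.default_rng(0)
    onts = rng.uniform(0, 1000, size=(40, 2))        # 40 ont positions [m], made up

    Z = linkage(onts, method="complete", metric="euclidean")
    labels = fcluster(Z, t=3, criterion="maxclust")  # stop at 3 clusters

    beta = 1.0                                       # value of the first recursive step
    for c in np.unique(labels):
        size = int(np.sum(labels == c))
        n = math.ceil(math.log2(size + 1))           # so that 2**n is the nearest greater power of 2
        ub = 2 ** n - 1                              # eq. (3)
        lb = math.floor(beta * 2 ** (n - 1)) - 1     # eq. (4)
        print(f"cluster {c}: {size} onts, splitter 1:{2 ** n}, allowed size range [{lb}, {ub}]")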
the pon topology design presented in [21] is based on innovative techniques using dijkstra's algorithm, metrics and the proposed hierarchical clustering method with the complete optimum linkage described in this paper. in order to provide a comparison with existing methods for different scenarios and topologies, the proposed algorithm and methods were used for several situations including rural, urban and city areas and were compared with the existing methods [5, 21, 22]. based on these results, a statistical processing and comparison was performed, and its results are presented in the following table 1. rural areas represent typical village regions, urban maps illustrate typical city parts containing blocks of flats and houses, while centre areas consist of historical city centres. the existing methods presented in [5, 21, 22] are based on existing agglomerative clustering techniques, while the proposed method presented within this paper is referred to as "proposed" in table 1.

method | area type | summary fiber length lf [m] | number of onts in clusters | maximal attenuation amax [db] | total trenching distance lt [m]
proposed | city center | 13405 | 7, 30, 3 | 23.1118 | 4590
existing [21] | city center | 13410 | 11, 29 | 22.8004 | 4592
existing [5] | city center | 14236 | 10, 30 | 22.8561 | 4604
existing [22] | city center | 13951 | 11, 29 | 22.8004 | 4592
proposed | urban | 10511 | 14, 26 | 21.4530 | 3241
existing [21] | urban | 10512 | 3, 5, 6, 26 | 21.4627 | 3253
existing [5] | urban | 10705 | 4, 4, 7, 25 | 21.8909 | 3372
existing [22] | urban | 10532 | 3, 5, 6, 26 | 21.4627 | 3253
proposed | rural | 22115 | 10, 30 | 22.2105 | 10474
existing [21] | rural | 22115 | 10, 30 | 22.2105 | 10474
existing [5] | rural | 22287 | 11, 29 | 22.3992 | 10532
existing [22] | rural | 23563 | 7, 10, 23 | 23.0035 | 10701
proposed | urban | 14490 | 5, 13, 13, 9 | 24.8365 | 4605
existing [21] | urban | 14270 | 5, 7, 15, 13 | 26.6512 | 4570
existing [5] | urban | 14963 | 6, 6, 12, 10, 6 | 27.3008 | 4638
existing [22] | urban | 14806 | 8, 14, 15, 3 | 27.0850 | 4623
proposed | rural | 18933 | 15, 22, 3 | 24.9055 | 4866
existing [21] | rural | 22808 | 27, 13 | 32.5624 | 4952
existing [5] | rural | 24073 | 14, 5, 11, 10 | 26.0898 | 4913
existing [22] | rural | 23223 | 4, 15, 21 | 25.7300 | 4920
proposed | city center | 11181 | 11, 6, 11, 12 | 21.5039 | 5015
existing [21] | city center | 11207 | 4, 8, 11, 8, 9 | 24.4557 | 5117
existing [5] | city center | 11205 | 10, 7, 12, 11 | 26.0046 | 5095
existing [22] | city center | 11340 | 5, 8, 8, 10, 9 | 24.3980 | 5109

table 1. comparison of existing and proposed methods.

the summary length of optical fibres lf and the total trenching distance lt [m] are obtained directly as the results of the algorithm in figure 1. the maximum attenuation amax [db] is calculated as the maximum of the attenuation a between the central olt unit and the ont units in the designed topology. generally, the attenuation a in the pon can be calculated as

$a = \sum a_s + \sum n_{pe} a_{pe} + \sum \alpha\, l_{total} + a_{res}$ [db], (5)

where the sum $\sum a_s$ [db] represents the sum of the attenuation of all passive splitters located between the olt unit and the selected ont unit, the term $n_{pe} a_{pe}$ [db] represents the summary attenuation of $n_{pe}$ passive elements, each with an attenuation $a_{pe}$ (connectors, splices, passive filters, etc.), the term $\alpha\, l_{total}$ represents the summary attenuation of optical fibres with an attenuation factor α [db/km] and length $l_{total}$ [km], and $a_{res}$ [db] stands for the attenuation reserve, which covers the attenuation compensation of aging, temperature fluctuations, deformations of optical fibres, etc.
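the budget (5) is straightforward to evaluate once the splitter attenuations are known; a minimal sketch with example values follows (the splitter attenuations $a_s$ are taken as given here; their dependence on the splitting ratio is given by eq. (6) below). all numeric inputs are illustrative, not measured data.

    def path_attenuation(a_splitters, n_pe, a_pe, alpha, l_total_km, a_res):
        # eq. (5): summary attenuation between the olt and one ont
        return sum(a_splitters) + n_pe * a_pe + alpha * l_total_km + a_res

    a = path_attenuation(
        a_splitters=[10.5, 7.5],   # two cascaded splitters [db], example values
        n_pe=4, a_pe=0.4,          # four connectors at 0.4 db each
        alpha=0.4,                 # fibre attenuation factor [db/km]
        l_total_km=12.0,           # summary fibre length [km]
        a_res=1.0)                 # attenuation reserve [db]
    print(a, "db")                 # -> 25.4 db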
the attenuation $a_s$ of a symmetrical splitter can be calculated according to

$a_s = 10\log_{10} n + a_u \log_{2} n + a_{pe}$ [db], (6)

where n is the number of output ports of the splitter (1 : n is its splitting ratio), $a_u$ [db] represents the uniformity of one single y-junction cascaded in the internal structure of the splitter and $a_{pe}$ [db] is the additional attenuation of the connectors or splices used to connect the splitter in the pon topology. in all calculations, $a_u$ is considered as 0.3 db. in (5), the attenuation factor α is considered as 0.4 db/km, meeting the itu-t g.652d optical fibre type, $a_{res}$ as 1 db and $a_{pe}$ in (5) and (6) as 0.4 db, which represents the attenuation of one standard optical connector of a type sc/pc or sc/apc. based on the results in table 1, it is evident that the proposed algorithm provided the best solution in almost all scenarios. the only situation, in which the existing method [21] provided a better result, was one of the urban scenarios. evidently, the proposed hierarchical agglomerative clustering method with the modifications presented in § 3.2 achieved the lowest summary fibre length, total trenching distance and maximum attenuation in almost all the situations. due to that, the capex of this solution should be the lowest compared with all existing methods. moreover, the proposed algorithm provided the best solution for all three different scenario types (city centre, urban, rural); therefore, it can be used for any of these situations.

5. conclusion

this paper is focused on planning and designing an optimum network topology of passive optical networks by using advanced algorithms and techniques. the article presents an innovative hierarchical agglomerative clustering algorithm with a modified complete linkage ability. the algorithm is presented by a flowchart in § 3.2 (figure 1), while § 4 contains its comparison with several existing methods and techniques. these results were obtained for various different scenarios and situations based on realistic map data obtained from openstreetmap.org and processed in the matlab environment. the performance of all methods and algorithms is compared through elementary output parameters and results, especially the maximum attenuation, the summary fibre length and the total trenching distance. these parameters form the resulting capital expenditures (capex) when building the network in practice. it is evident that the presented algorithm always provided the best solution, except for a single scenario. the presented algorithm provides the best results for all three typical types of scenarios in practice: the city centre area, the urban area and the rural area. the algorithm will be further evaluated and modified within the following research. for example, the comparison in table 1 was performed for a fixed number of ont units in a network; therefore, the following step is to implement a variable number of units in the network topology.

acknowledgements

this work was supported by the grant agency of the czech technical university in prague, grant sgs18/182/ohk3/3t/13.

references

[1] arévalo, g. v., gaudino, r.: a techno-economic network planning tool for pon deployment including protection strategies. 19th international conference on transparent optical networks (2017). doi:10.1109/icton.2017.8025122

[2] akgun, t., ünverdi, n. o.: fttx analysis and applications. 24th signal processing and communication application conference (2016).
doi:10.1109/siu.2016.7496220

[3] lam c. f.: passive optical networks: principles and practice. academic press of elsevier inc., burlington, usa (2007). isbn 0-12-373853-9.

[4] nesset, d.: pon roadmap. ieee/osa journal of optical communications and networking (2017). doi:10.1364/jocn.9.000a71

[5] papaefthimiou, k., tefera, y., mihaylov, d., gutierrez, j. m. and jensen, m.: algorithmic pon/p2p ftth access network design for capex minimization. 21st telecommunications forum telfor (2013). doi:10.1109/telfor.2013.6716199

[6] davim, j. p., pinto, a. n.: capex model for pon technology. 2nd international conference on evolving internet (2010). doi:10.1109/internet.2010.41

[7] cramer, s. and kampouridis, m.: optimising the deployment of fibre optics using guided local search. ieee congress on evolutionary computation (2015). doi:10.1109/cec.2015.7256973

[8] arévalo, g. v., sierra, j. e., hincapié, r. c., gaudino, r.: a novel algorithm for pon optimal deployment over real city maps and large number of users. 18th italian national conference on photonic (2016).

[9] li, j., and shen, g.: cost minimization planning for greenfield passive optical networks. ieee/osa journal of optical communications and networking (2009). doi:10.1364/jocn.1.000017

[10] mitcsenko, a., paksy, g., cinkler, t.: topology design and capex estimation for passive optical networks. 6th international conference on broadband communications, networks (2009). doi:10.4108/icst.broadnets2009.7245

[11] khan, s. u.: heuristics-based pon deployment. ieee communications letters (2005). doi:10.1109/lcomm.2005.1506723

[12] lakic, b., hajduczenia, m.: on optimized passive optical network (pon) deployment. second international conference on access networks & workshops (2007). doi:10.1109/accessnets.2007.4447124

[13] xiong, w., wu, c., wu, l., guo, x., chen, y. and xie, m.: ant colony optimization for pon network design. ieee 3rd international conference on communication software and networks (2011). doi:10.1109/iccsn.2011.6013738

[14] hu, w., wu, k., ping shum, p., zhedulev, n., soci, c.: using nonlinear optical networks for optimization: primer of the ant colony algorithm. ieee conference on lasers and electro-optics (cleo) (2014).

[15] zu, y., yang, w., wang, y., shao, l.: the routing algorithm of multi-layer multi-domain intelligent optical networks based on ant colony optimization. 15th ieee international conference on communication technology (2013). doi:10.1109/icct.2013.6820399

[16] liu, z., jaekel, a., bandyopadhyay, s.: a genetic algorithm for optimization of logical topologies in optical networks. international parallel and distributed processing symposium (2001). doi:10.1109/ipdps.2002.1016609

[17] villalba, t., rossi, s., mokarzel, m., salvador, m., neto, a., cesar, a., romero, m., rocha, m.: design of passive optical networks using genetic algorithm.
sbmo/ieee mtt-s international microwave and optoelectronics conference (2009). doi:10.1109/imoc.2009.5427496

[18] lin, b., lin, l., and ho, p. h.: cascaded splitter topology optimization in lrpons. ieee international conference on communications (2012). doi:10.1109/icc.2012.6364216

[19] butt, r. a., hasunah, m. s., idrus, s. m., rehman, s. u.: evolution of access network from copper to pon – current status. arpn journal of engineering and applied sciences (2015).

[20] butt, r., idrus, s., zul, n., ashraf, m.: a survey of energy conservation schemes for present and next generation passive optical networks. journal of communications (2018). doi:10.12720/jcm.13.3.129-138

[21] pehnelt, t., and lafata, p.: optimizing of passive optical network deployment using algorithm with metrics. advances in electrical and electronic engineering (2017). doi:10.15598/aeee.v15i5.2285

[22] olsen, d. a.: closing the loop on a complete linkage hierarchical clustering method. 11th international conference on informatics in control, automation and robotics (2014). doi:10.5220/0005058902960303

acta polytechnica
https://doi.org/10.14311/ap.2021.61.0465
acta polytechnica 61(3):465–475, 2021 © 2021 the author(s). licensed under a cc-by 4.0 licence published by the czech technical university in prague

genetic algorithms to determine the optimal parameters of an ensemble local mean decomposition

willian t. f. d. silva∗, filipe d. d. m. borges

western paraná state university, engineering and exact sciences center, av. tancredo neves, 6731, 85867-970, foz do iguaçu/pr, brazil
∗ corresponding author: willian.silva7@unioeste.br

abstract. an optimization method for the selection of the parameters of an ensemble local mean decomposition (elmd) using genetic algorithms is proposed. the performance of this technique depends heavily on the correct choice of the parameters of its model, as pointed out in previous works. the effectiveness of the proposed method was evaluated using synthetic signals discussed by several authors. the resulting algorithm obtained results similar to oelmd, but with an 82 % reduction in processing time. actual vibration signals were also analysed, presenting satisfactory results.

keywords: ensemble local mean decomposition, genetic algorithms, signal processing, optimization.

1. introduction

due to the wide variety of applications, several signal processing techniques have been developed in recent years [1–4]. in 1998, a major advance in the so-called decomposition techniques occurred when [1] introduced the empirical mode decomposition (emd), which was an effective tool for a non-linear and non-stationary signal analysis. in this method, a complex signal could be decomposed into a series of sums of finite functions called intrinsic mode functions (imf) that represent the oscillatory components of the signal. however, one of the main disadvantages of this technique is its susceptibility to the mode mixing phenomenon [4].
mode mixing occurs when multiple modes reside within one imf. to prevent this, in 2005, the local mean decomposition (lmd) [2] was developed in order to mitigate the mode mixing. nevertheless, when tested against highly complex and contaminated signals, such as the vibration of faulty mechanical components, despite showing superiority when compared to emd [5], the lmd still suffers, in a prohibitive way, from the mode mixing problem. therefore, in order to improve its applicability to complex signals, [3] proposed the ensemble local mean decomposition (elmd), which adds white noise to the vibration signals in order to obtain the optimal compositions. meanwhile, according to [4], the effectiveness of elmd in reducing mode mixing is highly influenced by its parameters, such as the white noise amplitude, bandwidth and ensemble number. different drawbacks are also pointed out by other authors, such as the occurrence of pseudo-components [6] and a poor signal reconstruction [7]. regarding the choice of appropriate parameters for the elmd, [4] proposed an optimized ensemble local mean decomposition (oelmd), an optimization of the technique in which the parameters are chosen to satisfy the decomposition performance. however, that technique used a brute-force method to test several values of the parameters, leading to a highly prohibitive computational cost. in this scope of optimization, genetic algorithms (ga) are techniques that search for the best result based on the principles of genetics and natural selection, strongly studied in [8–15] and disseminated since the 1970s. a ga allows a population composed of various individuals to evolve under certain rules that minimize (or maximize) a cost function. this article proposes a new approach to the optimization of elmd parameters to satisfactorily fulfill the decomposition performance. thus, better results than the elmd with regards to mode mixing are expected, as well as better results than oelmd in terms of processing time. although this article is inspired by previous works [4], [16], none of these studies, unlike ours, reported the use of genetic algorithms to reduce the number of iterations necessary in the search for the optimal parameters. furthermore, in order to assess the effectiveness of the proposed method in one of its many applications, this work intends to use the algorithm in the analysis of bearing failure diagnosis. thus, this work comes with the following contributions: (a) development of a new procedure based on genetic algorithms to determine the white noise parameters in an elmd; (b) extension of the work of [4] on the investigation of rrmse and snr in the optimal parameter selection; and (c) evaluation of the effectiveness of time-frequency techniques in the diagnosis of faulty mechanical components. the work is divided into seven sections: section 1 introduces the subject of decompositions and forms of optimization; section 2 introduces the fundamentals of local mean decomposition and its derivations;
section 3 presents the methods of the optimization of parameters based on genetic algorithms; section 4 compares the proposed methodology with that suggested by [4] and other lmd improvements, by means of a synthetic signal; section 5 proposes a methodology for improving the decomposition results by reapplying the technique based on the value of rrmse; section 6 shows the results of a test performed on real vibrational data; and section 7 concludes.

2. local mean decomposition

assuming that a signal can be represented as the sum of a finite set of product functions, from which it is possible to extract an instantaneous frequency imbued with physical meaning, [2] developed a method that basically consists in decomposing the signal into several other functions, obtained from the product between an envelope signal and a frequency modulated signal, from which it is possible to extract the instantaneous frequency; thus the signal can be represented as a function of time and as a function of frequency, giving rise to a time-frequency representation (tfr). in the method, the signal decomposition is performed by a progressive decoupling of the frequency modulated signal from an amplitude modulated envelope, through the following steps:

(1.) obtaining all the local extremes of the signal x(t). the extreme indices are denoted by $e_k$ and the corresponding extremes by $x(e_k)$.

(2.) calculating the smoothed local mean m(t) and the smoothed local amplitude a(t). to acquire these values, two preliminary steps are necessary. the first one, characterized by the calculation of the preprocessed local mean $m_0(n)$ and the local amplitude $a_0(n)$, is obtained by:

$m_0(n) = \dfrac{x(e_k)+x(e_{k+1})}{2}$ for $e_k \le n \le e_{k+1}$ , (1)

$a_0(n) = \dfrac{|x(e_k)-x(e_{k+1})|}{2}$ for $e_k \le n \le e_{k+1}$ . (2)

however, despite the simplicity and consistency of these equations, [17] warn that results cannot be obtained without an extension of the signal, which may introduce disagreements at its ends that gradually influence its middle, disturbing the decomposition performance [18]. thus, the authors proposed a treatment for the extremes, called a boundaries processing method, that calculates the local mean and local amplitude by means of the following equations. for the first extreme:

$m_0(n) = \dfrac{x(e_1)+2x(e_2)+x(e_3)}{4}$ , (3)

$a_0(n) = \dfrac{|x(e_1)-x(e_2)|+|x(e_2)-x(e_3)|}{4}$ ; (4)

for the last one:

$m_0(n) = \dfrac{x(e_{M-2})+2x(e_{M-1})+x(e_M)}{4}$ , (5)

$a_0(n) = \dfrac{|x(e_M)-x(e_{M-1})|+|x(e_{M-1})-x(e_{M-2})|}{4}$ , (6)

where M is the signal length.
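for illustration, the piecewise construction of eqs. (1)-(2) can be sketched as follows, assuming the extrema indices of step 1 have already been found; the crude extrema detection and all names below are ours, and the boundary treatment of eqs. (3)-(6) is omitted for brevity.

    import numpy as np

    def local_mean_amplitude(x, extrema_idx):
        # piecewise-constant m0 and a0 between successive extremes, eqs. (1)-(2)
        m0 = np.zeros(x.size)
        a0 = np.zeros(x.size)
        for ek, ek1 in zip(extrema_idx[:-1], extrema_idx[1:]):
            m0[ek:ek1 + 1] = (x[ek] + x[ek1]) / 2.0
            a0[ek:ek1 + 1] = abs(x[ek] - x[ek1]) / 2.0
        return m0, a0

    t = np.linspace(0, 1, 1000)
    x = np.sin(2 * np.pi * 5 * t) + 0.3 * np.sin(2 * np.pi * 40 * t)
    # step 1: interior local extremes found where the discrete slope changes sign
    interior = np.flatnonzero((x[1:-1] - x[:-2]) * (x[2:] - x[1:-1]) < 0) + 1
    extrema = np.r_[0, interior, x.size - 1]
    m0, a0 = local_mean_amplitude(x, extrema)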
moreover, [19] determined a signal extension algorithm that modifies the extremes by a spline interpolation, which is based on the work of [20]. afterwards, from the variables $m_0(n)$ and $a_0(n)$, the smoothed local mean m(t) and smoothed local amplitude a(t) are obtained. there are differences in how this smoothing is calculated. although the method proposed by [2], using the moving average algorithm (ma), has been studied with a proven efficacy [3, 5, 18, 19, 21], [22] argue that this method could lead the decomposition to incoherent results. thus, [23] proposed a cubic spline interpolation based lmd (slmd), due to its property of a good convergence and high smoothing. however, [24] confirmed that large interpolation errors can occur in the local amplitude calculation. thereby, the authors have proposed a rational hermite interpolation (olmd), replacing the spline interpolation, stating that it could better counteract the waveform of the amplitude. nevertheless, [22] confirm that the hermite interpolation cannot adaptively adjust the shape of the curves with the varying local characteristics of the waveform in the sifting process. therefore, the authors suggest that a rational spline interpolation coupled with an optimization procedure of a tension parameter could control the shape of the cubic spline. according to the authors' studies, their method yields more accurate results of the decomposition as well as a reduction of the total processing time of the technique. some smoothing examples are shown in figure 1.

figure 1. hypothetical signal x(t) together with the local mean, smoothed local mean, local amplitude and local amplitude smoothed by the ma, slmd and olmd methods.

(3.) calculating the estimated zero-mean signal $h_{11}$ and fm signal $s_{11}$ by means of the variables x(n), m(n) and a(n). for that, the equations are defined by:

$h_{11}(n) = x(n) - m_{11}(n)$ , (7)

$s_{11}(n) = \dfrac{h_{11}(n)}{a_{11}(n)}$ . (8)

it is important to make sure that $s_{11}(n)$ is a purely fm signal. otherwise, the function x(n) assumes the value of $s_{11}(n)$ and the steps are repeated until the condition described by equation (9) is satisfied. this condition is the so-called sifting process:

$\lim_{p\to\infty} a_{1p}(n) = 1$ . (9)

due to its notorious importance in the lmd final results [22], [19] proposed a method called sifting stopping, which defines an optimal number of iterations for the decomposition, which consequently brought better results of the method as well as reducing the processing time.

(4.) calculation of the signal $s_1(n)$, the envelope signal $a_1(n)$ and the product function $pf_1(n)$ after the execution of the sifting process. considering a process with p iterations, the values of $s_1(n)$, $a_1(n)$ and $pf_1(n)$ are given by:

$s_1(n) = s_{1p}(n)$ , (10)

$a_1(n) = \prod_{j=1}^{p} a_{1j}(n)$ , (11)

$pf_1(n) = a_1(n)\cdot s_1(n)$ . (12)

(5.) subtracting the product function from the signal x(n). the process must be repeated m times until the entire signal is decomposed, so that:

$x(n) = \sum_{i=1}^{m} pf_i(n)$ . (13)

2.1. ensemble local mean decomposition

components of the product function with different characteristics are obtained by means of the lmd method. however, due to the signal discontinuity, mode mixing still occurs during the lmd process. this condition causes an ambiguity in the physical meaning of the instantaneous frequencies of the product function after the decomposition. from this point of view, [3] demonstrated that the addition of different gaussian white noises to the signal prior to its decomposition by lmd could drastically decrease the mode mixing phenomenon. although it may seem that the addition of the disturbance to the signal could reduce the signal-to-noise ratio (snr) and consequently bring erroneous results to the decomposition, due to the addition of a nonexistent product function, the authors proved that, because there are several independent gaussian white noises, the average of all added noise tends to zero. thus, the technique repeatedly applies the lmd method to the signal along with a gaussian white noise of a finite amplitude. the average of the product functions derived from the various applications is used as the result of the decomposition. since the mean noise is zero, all the added disturbance can be considered as excluded.
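the noise-assisted averaging can be summarized by a short sketch, assuming a black-box lmd routine that returns an array of at least n_pf product functions; the names and default values are illustrative, not the authors' code.

    import numpy as np

    def elmd(x, lmd, n_ensemble=100, noise_std=0.2, n_pf=3, seed=0):
        # add an independent white noise realization, decompose by lmd,
        # and average the product functions over the ensemble
        rng = np.random.default_rng(seed)
        acc = np.zeros((n_pf, x.size))
        for _ in range(n_ensemble):
            y = x + noise_std * rng.standard_normal(x.size)
            acc += lmd(y)[:n_pf]
        return acc / n_ensemble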
this technique was called the ensemble local mean decomposition (elmd) and yields results far superior to those of lmd in the fault diagnosis of rotating machines [3, 21, 25]. according to [26], elmd can be described by the following steps:

(1.) adding white noise to the signal x(t), thus forming y(t).

(2.) applying the lmd to the signal y(t) in order to obtain multiple product functions.

(3.) repeating steps 1 and 2 several times, adding a different noise at each iteration.

(4.) calculating the mean of the pfs obtained and consequently using it as the result of the decomposition.

3. ensemble local mean decomposition based on genetic algorithms

a genetic algorithm (ga) is based on the principles of genetics and natural selection. a ga allows a population composed of many individuals to evolve under some sort of natural selection rule, so that the final population is the one that best fits those conditions. thus, the first step is to define the condition of the environment for the population, in this case, the cost function. in this work, it is essential that the white noise added to the original signal has a maximum value of rrmse, which evaluates the difference between the product functions and the original signal in order to cancel the mode mixing. however, the cost function minimization is used as an optimization notation, thus establishing the cost function as:

$cost = -rrmse$ . (14)

therefore, an initial population is defined, which can be formed by totally random chromosomes or by initial guesses in order to improve the convergence of the algorithm. in the proposed method, the population is defined in a totally random way, but within a range defined on the basis of data taken from the signal to be decomposed:

$population = rand(n_{cro}, n_{var})$ , (15)

where $n_{cro}$ represents the number of chromosomes of a population and $n_{var}$ represents the number of alleles contained in the chromosome. subsequently, the pairing is defined, where the most adapted chromosomes are placed in order to cross. parents are defined randomly, and each pair produces two descendants, which contain traits from each parent. the parents still survive to be a part of the next generation. the more similar the two parents are, the better is the convergence towards a final population. once the pairing is defined, the crossover stage is started. several methods have been developed to optimize the creation of a descendant. the simplest crossover methods are the so-called one-point and multi-point crossovers [11]. in these, one or more chromosome points are selected as crossing points. then, the variables between these points are exchanged between the two parents. the main disadvantage of this technique is that there is no new information in the generation of individuals; they are just replicas of the random values provided by the initial population. therefore, a variation of this method was suggested, the so-called simple crossing [12]. in this method, a descendant comes from a combination of the parents, so the chromosome assumes new values, even though it is still related to its predecessors. the formation of an allele of this chromosome is demonstrated by:

$allele_{new} = \alpha \cdot allele_{mom,n} + (1-\alpha)\cdot allele_{dad,n}$ , (16)

α being a random number between 0 and 1; $allele_{mom,n}$: the nth allele on the mother chromosome, and $allele_{dad,n}$: the nth allele on the father chromosome. note that in a simple crossing, if the value of α = 0.5, the descending chromosome allele becomes a simple average of the parent variables.
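a minimal sketch of the simple crossing (16), with illustrative names and one random α drawn per allele:

    import numpy as np

    def simple_crossover(mom, dad, rng):
        # eq. (16): each allele is a random convex combination of the parents
        alpha = rng.random(mom.size)
        return alpha * mom + (1.0 - alpha) * dad

    rng = np.random.default_rng(1)
    print(simple_crossover(np.array([1.0, 4.0]), np.array([3.0, 2.0]), rng))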
however, even if this method allows new information to be entered by combining information from the parents, it does not allow values outside the parents' extremes. therefore, another approach, proposed by [12], was the heuristic crossover, which again uses a random variable, β, chosen in the interval [0, 1], to define one or more alleles of the descendants. in this work, the use of the heuristic crossover was chosen in the implementation of the algorithm because it yields better results in the search for the global maximum [13]. thus, in the generation of a new population lineage, this crossing is imposed for at least one allele of each descendant. this one is chosen at random and the remaining variable is fairly distributed to the children, so that each parent is always represented:

$allele_{new} = \beta \cdot (allele_{mom,n} - allele_{dad,n}) + allele_{mom,n}$ . (17)

finally, some form of mutation can be defined for the chromosomes in the population. the mutation process is important in some cases where a function can assume several local maximums and the cost function eventually converges to one of these maximums. if there is no preventive measure, the result can be far from the overall maximum cost. in this work, the mutation is defined in a completely random way, where a random value between 0 and 1 is calculated. if it is greater than 0.8 (an arbitrarily chosen value), a chromosome is recalculated in a random way, without presenting any correlations with its parents. the proposed method is exemplified by the flowchart shown in figure 2, which represents an application of genetic algorithms to the method developed by [3].

figure 2. flowchart of the proposed method based on genetic algorithms for parameter selection.

4. simulated signal test

the test of the proposed technique was carried out using a synthetic signal x(t) extracted from [4], obtained by a summation of the three components presented in equations 18, 19 and 20 and shown in figure 3. since it is often impossible to know all the compositions of a real signal, the use of synthetic ones is very useful for the evaluation of a signal processing method.

$x(t) = x_1(t) + x_2(t) + x_3(t)$ , (18)

$x_1(t) = 1.5\cdot e^{-800t'}\cdot\sin(2\pi\cdot 5000t)$ , (19)

$x_2(t) = 0.2\cdot(1+\cos(2\pi\cdot 100t))\cdot\cos(2\pi\cdot 1000t)$ , (20)

where x3(t) is white gaussian noise with a bandwidth from 2 to 4 khz, and t′ is a periodic function with a fundamental period of 1/160 s. according to [4], this frequency was chosen because, when compared with low frequency noises, high frequencies generally present larger contributions to the extremes of the original signal.

figure 3. waveform in the time domain of the signals x(t), x1(t), x2(t) and x3(t).
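for concreteness, the test signal can be generated as follows. the sampling rate and the band-limiting filter for x3(t) are our assumptions (the text only specifies a 2-4 khz noise band), and x2(t) follows the form reconstructed in (20) above; treat the whole block as illustrative.

    import numpy as np
    from scipy.signal import butter, filtfilt

    fs = 20000                                    # sampling rate [hz], assumed
    t = np.arange(0.0, 0.05, 1.0 / fs)
    t_prime = np.mod(t, 1.0 / 160.0)              # periodic time, period 1/160 s
    x1 = 1.5 * np.exp(-800.0 * t_prime) * np.sin(2 * np.pi * 5000.0 * t)              # eq. (19)
    x2 = 0.2 * (1 + np.cos(2 * np.pi * 100.0 * t)) * np.cos(2 * np.pi * 1000.0 * t)   # eq. (20)
    b, a = butter(4, [2000.0, 4000.0], btype="bandpass", fs=fs)
    x3 = filtfilt(b, a, np.random.default_rng(2).standard_normal(t.size))             # 2-4 khz noise
    x = x1 + x2 + x3                              # eq. (18)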
4. simulated signal test

the test of the proposed technique was carried out using a synthetic signal x(t) extracted from [4], obtained by the summation of the three components presented in equations 18-20 and shown in figure 3. since it is often impossible to know all the components of a real signal, synthetic ones are very useful for the evaluation of a signal processing method.

x(t) = x1(t) + x2(t) + x3(t) (18)
x1(t) = 1.5 · e^(−800t′) · sin(2π · 5000t) (19)
x2(t) = 0.2 · (1 + cos(2π · 100t)) · cos(2π · 100t) (20)

where x3(t) is white gaussian noise with a bandwidth from 2 to 4 khz, and t′ is a periodic function with a fundamental period of 1/160 s. according to [4], this frequency was chosen because, when compared with low-frequency noises, high frequencies generally present larger contributions to the extremes of the original signal.

figure 3. waveform in the time domain of the signals x(t), x1(t), x2(t) and x3(t).

in order to compare the performance of the proposed gab-elmd (genetic algorithm based ensemble local mean decomposition) with the oelmd and lmd methods (using the ma and the sifting process for smoothing, respectively), the root-mean-square error (rmse), the number of product functions and the processing time are considered as indicators. the expression for the rmse is given by

rmse = sqrt( Σ_{i=1}^{n} [x_i(t) − pf_i(t)]² / n ) (21)

where x_i(t) and pf_i(t) are the original components of the signal and its decomposed form, respectively. a lower rmse value indicates a better performance. the computer used for the simulation has a 2.4 ghz dual-core i7 processor with 8 gb of ram. the software used is matlab (r2018b). the tests were performed ten times and the results shown in table 1 represent their means.

methods | rmse pf1 | rmse pf2 | rmse pf3 | processing time (s)
oelmd | 0.171 ± 0.017 | 0.125 ± 0.006 | 0.198 ± 0.013 | 166.5
gab | 0.197 ± 0.033 | 0.117 ± 0.020 | 0.190 ± 0.028 | 28.9

table 1. comparison of the performance of oelmd and the proposed method.

the values presented in table 1 show similar results for both methods, making it impossible to point out the best one within the given confidence interval. in terms of the computational cost, oelmd required a longer time due to the number of trial decompositions it performs, while the proposed technique obtained similar results with a shorter average processing time and without significant losses in the quality of the signal, achieving a reduction in processing time of 82.6 %. the processing time is also illustrated in figure 5, which displays the relationship between the number of samples of a discrete signal and the processing time of each of the techniques shown in table 1. the results are shown along with their errors, calculated by a student's t-distribution for a confidence interval of 95 %, with the proposed technique being superior to the oelmd in each instance.

figure 5. signal's processing time per sample size.

as for the decomposition error, although the proposed method had better average values than oelmd, it shows a higher error dispersion due to its inherent characteristic of finding adjustable results not always aligned with the global maximum. this is corroborated by the data presented in figure 6, a heat map containing the number of times that a given solution, arranged in the model of figure 4, was achieved by the technique based on the genetic algorithm in a test performed one hundred times. the figure shows that there were many cases where the overall maximum was not reached.

figure 4. model of the rrmse values for the signal defined by eq. 18, based on the amplitude and bandwidth values of the noise.

figure 6. heat map showing the number of times each solution was obtained by the method based on the genetic algorithm.
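a sketch reconstructing the test signal of eqs. 18-20 and the rmse of eq. 21; the sampling rate, the noise scaling and the exact form of eq. 20 (garbled in the source and rebuilt above) are assumptions.

```python
import numpy as np

fs = 20_000                        # assumed sampling rate; not stated in the paper
t = np.arange(0.0, 0.05, 1 / fs)
rng = np.random.default_rng(0)

# eq. 19: decaying impulses, t' restarting with a fundamental period of 1/160 s
t_prime = np.mod(t, 1 / 160)
x1 = 1.5 * np.exp(-800 * t_prime) * np.sin(2 * np.pi * 5000 * t)

# eq. 20 as reconstructed above (an amplitude-modulated "beating" component)
x2 = 0.2 * (1 + np.cos(2 * np.pi * 100 * t)) * np.cos(2 * np.pi * 100 * t)

# x3: white gaussian noise band-limited to 2-4 khz with a crude fft mask
spec = np.fft.rfft(rng.normal(size=t.size))
f = np.fft.rfftfreq(t.size, 1 / fs)
spec[(f < 2000) | (f > 4000)] = 0
x3 = np.fft.irfft(spec, n=t.size)
x3 *= 0.2 / x3.std()               # the noise amplitude is a guess

x = x1 + x2 + x3                   # eq. 18

def rmse(component, pf):
    # eq. 21, with n the number of samples
    return np.sqrt(np.mean((component - pf) ** 2))
```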
5. proposed algorithm for improvement of decomposition results

after the signal is decomposed by gab-elmd, some product functions still contain mode mixing, as can be observed in the previous section. although this is an intrinsic phenomenon of the elmd method, in this work, the reapplication of the decomposition based on the rrmse is proposed in order to mitigate the mode mixing. recent works, such as [16] and [4], have already investigated the use of the rrmse as a measure of mode mixing between product functions, but none of them used it in order to reapply the decomposition to the product functions with the highest mode mixing. based on the characteristics of this function, the reapplication of the decomposition follows these steps:

(1.) gab-elmd is initially applied to decompose the signal.
(2.) the rrmse matrix is then calculated, making it possible to compare the relative root-mean-square error between the product functions. in this way, it is expected to find the product function with the highest mode mixing (minimum value).

m = [ rrmse(pf_i, pf_i)  · · ·  rrmse(pf_i, pf_j) ]
    [       . . .        · · ·        . . .       ]
    [ rrmse(pf_j, pf_i)  · · ·  rrmse(pf_j, pf_j) ]   (22)

(3.) the pf with the highest mode mixing is selected.
(4.) the lmd is applied to the previously selected pf.
(5.) each new product function is compared, through the rrmse, with the pfs obtained in step 1. the pf of step 1 with the closest resemblance to the new one will be added to it or be replaced by it (if that pf was the one selected in the third step).
(6.) the procedure is repeated from step 2 to step 5 until the lowest value in the rrmse matrix is greater than or equal to 1, or until the maximum desired number of iterations is reached.
(7.) the product functions with the least mode mixing are obtained.
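the following sketch shows one possible reading of eq. 22 and steps 2-6; the rrmse normalisation and the merge rule in step 5 are interpretations, since the paper does not define them precisely here, and `lmd` is again a user-supplied decomposition routine.

```python
import numpy as np

def rrmse(a, b):
    # relative root-mean-square error; this normalisation is an assumption
    return np.sqrt(np.mean((a - b) ** 2)) / np.sqrt(np.mean(a ** 2))

def rrmse_matrix(pfs):
    # eq. 22: pairwise rrmse between all product functions
    n = len(pfs)
    m = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            m[i, j] = rrmse(pfs[i], pfs[j])
    return m

def refine(pfs, lmd, max_iter=10):
    # steps 2-6: while the most similar off-diagonal pair stays below 1,
    # re-decompose the worst-mixed pf with plain lmd and merge the new
    # pieces back into their closest counterparts
    for _ in range(max_iter):
        m = rrmse_matrix(pfs)
        np.fill_diagonal(m, np.inf)        # ignore self-comparisons
        if m.min() >= 1.0:
            break                          # step 6 stopping criterion
        worst = np.unravel_index(m.argmin(), m.shape)[0]
        for new_pf in lmd(pfs[worst]):     # step 4
            closest = int(np.argmin([rrmse(p, new_pf) for p in pfs]))
            if closest == worst:
                pfs[closest] = new_pf                      # replace (step 5)
            else:
                pfs[closest] = pfs[closest] + new_pf       # add (step 5)
    return pfs
```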
in order to compare the effectiveness of the proposed method, it was applied to the hypothetical signal defined by equations 18-20, which was previously decomposed by the methods mentioned in section 2. thus, the pfs obtained (figures 7-11) were again compared to the respective components of the original signal by means of the rmse. the values are presented in table 2, built by running each algorithm 10 times. the data show the rmse average calculated along with its error, using a student's t-distribution for a 95 % confidence level.

method | rmse pf1 | rmse pf2 | rmse pf3 | number of pfs
slmd | 0.331 ± 0.000 | 0.134 ± 0.000 | 0.301 ± 0.000 | 3
olmd | 0.331 ± 0.000 | 0.134 ± 0.000 | 0.302 ± 0.000 | 3
ilmd | 0.211 ± 0.000 | 0.086 ± 0.000 | 0.302 ± 0.000 | 4
oelmd | 0.171 ± 0.017 | 0.125 ± 0.006 | 0.198 ± 0.013 | 3
proposed | 0.154 ± 0.062 | 0.101 ± 0.072 | 0.172 ± 0.0485 | 3

table 2. comparison of performance between the proposed algorithm and the methods discussed in section 2.

figure 7. signal in the time domain (left) and frequency domain (right) by the slmd method.
figure 8. signal in the time domain (left) and frequency domain (right) by the olmd method.
figure 9. signal in the time domain (left) and frequency domain (right) by the ilmd method.
figure 10. signal in the time domain (left) and frequency domain (right) by the oelmd method.
figure 11. signal in the time domain (left) and frequency domain (right) by the proposed method.

the figures show that all the methods discussed in section 2 present a major mode-mixing problem. figures 7 and 8 show that the slmd and olmd methods had almost the same result, being unable to separate the signal into its product functions, with almost the entire signal restricted to the first product function. although the second pf still keeps the signal characteristics, its amplitudes are very low. in the same manner, pf3 shows noise characteristics, resembling an error function more than a component. for ilmd (fig. 9), the best results for pf2 are evident, clearly showing the beating signal with amplitude levels very close to the original; however, the method was unable to separate the impact signal from the noise, basically leaving the two components together in pf1. as for oelmd (fig. 10), even though the optimal result shown by ilmd for pf2 was not reproduced, it was able to separate the noise quite efficiently, leaving the product function representing the impact signal alone in pf1. however, the mode mixing still remained, as much of the beating signal was embedded in the noise signal. finally, the proposed method managed to separate the signal into three product functions very similar to the original composition, with the noise isolated in one function, the beating signal with optimum amplitude levels in another, and the first product function representing only the impact signal.

table 2 shows the same results as figures 7-11, numerically highlighting that the proposed method gives better average results than all the algorithms tested for pfs 1 and 3. for pf2, the ilmd method showed better results, as in some cases the proposed method was not able to improve the decomposition of the beating signal, with results similar to those presented by oelmd. another fact to mention is the high variances presented by oelmd and the proposed method against the null variance of the other algorithms. this is because, as already mentioned, the optimal bandwidth and noise amplitude values to be applied in the decomposition are not always the same, which causes fluctuations in the decomposition results.

6. experimental data analysis

to assess the proposed algorithm in a real case scenario, a set of rolling bearing data obtained from a test rig was used and the results were compared. figure 12 shows the experimental test rig and all the apparatus used in the test, while their specifications and technical and instrumental characteristics are shown in table 3.

figure 12. test rig: 1) base, 2) drive motor, 3) accelerometers, 4) computer, 5) data conditioning, 6) amplifier, 7) drive shaft, 8) optical encoder, 9) bearing, 10) frequency inverter.

equipment | specifications
drive motor | weg w22 tfve; nominal speed: 2 985 rpm; power: 1.0 kw
frequency inverter | weg cfw10; power: 1.5 kw
amplifier | hbm quantum x; sample frequency: 19 200 hz
accelerometer | skf 739l; sensitivity: 500 mv/g ± 5 %; frequency range: ±5 %: 0.6–700 hz, ±10 %: 0.4–1 000 hz, ±3 db: 0.2–2 300 hz

table 3. specifications and technical characteristics of the test rig.

two model 6004-2rs1 (skf) bilaterally shielded rolling bearings were used; their dimensions and construction data are shown in table 4. from the data presented in table 4, the characteristic fault frequencies on the outer and inner races were calculated and are presented in table 5.

construction feature | dimension
outer diameter | 42 mm
inner diameter | 20 mm
rolling element diameter | 6.35 mm
number of rolling elements | 9

table 4. geometry parameters of the experimental bearing.

fault frequency | value (hz)
bpfi | 269.7
bpfo | 178.0

table 5. bearing's fault frequencies.

the damage to the bearings was made by means of a 1 mm diameter diamond-tipped drill mounted on a mini electric bench driller. the localized damage has approximate dimensions of 1.2 mm in diameter and 1 mm in depth.
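as a check, the characteristic frequencies of table 5 can be reproduced from the geometry of table 4 with the standard deep-groove formulas; the pitch diameter (taken as the mean of the outer and inner diameters), the zero contact angle and the shaft speed (the motor's nominal 2 985 rpm from table 3) are assumptions.

```python
from math import cos

n_balls = 9
d_ball = 6.35                      # rolling element diameter, mm (table 4)
d_pitch = (42 + 20) / 2            # assumed pitch diameter, mm
phi = 0.0                          # contact angle, rad (deep-groove assumption)
f_rot = 2985 / 60                  # shaft rotation frequency, hz (table 3)

ratio = d_ball / d_pitch * cos(phi)
bpfo = n_balls / 2 * f_rot * (1 - ratio)   # outer-race fault frequency
bpfi = n_balls / 2 * f_rot * (1 + ratio)   # inner-race fault frequency
print(f"bpfo = {bpfo:.1f} hz, bpfi = {bpfi:.1f} hz")  # ~178.0 and ~269.7
```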
after the defect was imposed, the steel chips from the damaged components were removed, the proper lubricant was reapplied and the shielding plates were reassembled. the vibration signal was then decomposed using the proposed and the oelmd methods, obtaining a series of pfs and a constant residual. an envelope analysis was then applied to the second pf to identify the localized defect on the bearing.

6.1. bearing with outer race fault

figure 13 shows the envelope spectrum of the signal for the outer race defect: (a) without any pre-processing treatment; (b) using oelmd for the pre-processing; (c) using the proposed method for the pre-processing. in (a), although the characteristic bearing failure frequencies are present, they are not evident, being hidden in noise of an order of magnitude similar to that of the excitations. in (b), although the spectrum is different, the noise is still dominant and has even worsened the visualization of the characteristic failure frequencies. thus, only with the proposed method, in (c), is it possible to clearly visualize the bpfo frequency and its harmonics, clearly showing a failure signal.

figure 13. envelope spectra of pf2 from the bearing with an outer race fault: (a) without any pre-processing method; (b) pre-processing with the oelmd method; (c) pre-processing with the proposed method.

6.2. bearing with inner race fault

in a similar way to the outer race signals, the envelope spectra for the inner race are shown in figure 14. unlike the previous case, it was already possible to observe the characteristic fault frequencies without any signal pre-processing. although the decomposition techniques removed a significant amount of noise from the analysis, the proposed algorithm, for instance, could not clearly show the third harmonic.

figure 14. envelope spectra of pf2 from the bearing with an inner race fault: (a) without any pre-processing method; (b) pre-processing with the oelmd method; (c) pre-processing with the proposed method.
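the envelope analysis applied to pf2 is not spelled out in the paper; a common implementation, assumed here, uses the hilbert transform.

```python
import numpy as np
from scipy.signal import hilbert

def envelope_spectrum(pf, fs):
    # magnitude of the analytic signal (hilbert transform), dc removed,
    # then its spectrum, where bpfo/bpfi and harmonics should appear
    envelope = np.abs(hilbert(pf))
    envelope -= envelope.mean()
    spectrum = np.abs(np.fft.rfft(envelope)) / pf.size
    freqs = np.fft.rfftfreq(pf.size, 1 / fs)
    return freqs, spectrum

# usage with the 19 200 hz sample rate from table 3; `pf2` stands for the
# second product function returned by the decomposition
# freqs, amp = envelope_spectrum(pf2, fs=19_200)
```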
7. conclusions

ensemble local mean decomposition is a new method in time-frequency analysis which keeps the main innovation of the lmd, namely the separation of a single-component am-fm signal into a set of product functions of the envelope signal and a purely frequency-modulated signal, with a significant improvement in mode mixing. however, it requires an intensive search for the parameters that actually solve the problem, which can be time and human-resource consuming. in this work, an approach using a well-known optimization method was suggested in order to select the parameters for elmd automatically and with the least time consumption. at this stage, the results demonstrate the superiority of the proposed technique over the oelmd, leading to the following conclusions:

(1.) although oelmd shows a greater selectivity of the parameters, the proposed technique presented similar results when applied to a synthetic signal.
(2.) the proposed method showed a lower processing time, reducing the total time by more than 82 %.
(3.) it was found that the main difficulty in achieving better processing times in the execution of the technique was the smoothing algorithm, discussed in section 2, delineating a new field of research.
(4.) even though the results are promising, they have not been compared to other new techniques that address elmd limitations in other ways, the focus here being a simple and complementary solution to the algorithm proposed by [4].

subsequently, a method for improving the decomposition results was proposed, based on the use of the relative root-mean-square error and the reapplication of the decomposition to the obtained product functions. regarding this part, the analysis of the results leads to the following conclusions:

(1.) the proposed method presented superior results in the mitigation of mode mixing as compared to the oelmd.
(2.) the execution of the method does not significantly increase the computational cost or the processing time, since after the execution of the ensemble local mean decomposition, which is the more time-consuming step, only the lmd, which is faster, is executed.
(3.) the excellent results obtained were achieved on synthetic signals and a small sample of real vibration signals, so the effectiveness of the method was not tested on highly complex signals.

finally, the effectiveness of the method was analysed for only one of its applications, which is why the proposed technique was tested against oelmd. in this respect, the following conclusions can be drawn:

(1.) although defects in bearing races require a simpler analysis, a slight improvement in the envelope spectrum was observed as compared to the ones derived from the ensemble local mean decomposition.
(2.) spectral analysis does not always guarantee the detection of a bearing defect.

this work has shown the effectiveness of a well-known meta-heuristic method in the optimization of an algorithm used for signal processing, proposing a new technique with a notable improvement in the pre-processing of vibration signals and in spectral analysis.

acknowledgements

the authors greatly acknowledge the grant support from the universidade estadual do oeste do paraná.

references

[1] n. e. huang, z. shen, s. r. long, et al. the empirical mode decomposition and the hilbert spectrum for nonlinear and non-stationary time series analysis. royal society of london proceedings series a 454:903–995, 1998. https://doi.org/10.1098/rspa.1998.0193.
[2] j. s. smith. the local mean decomposition and its application to eeg perception data. journal of the royal society interface 2(5):443–454, 2005. https://doi.org/10.1098/rsif.2005.0058.
[3] y. yang, j. cheng, k. zhang. an ensemble local means decomposition method and its application to local rub-impact fault diagnosis of the rotor systems. measurement 45(3):561–570, 2012. https://doi.org/10.1016/j.measurement.2011.10.010.
[4] c. zhang, z. li, c. hu, et al. an optimized ensemble local mean decomposition method for fault detection of mechanical components. measurement science and technology 28(3):035102, 2017. https://doi.org/10.1088/1361-6501/aa56d3.
[5] y. wang, z. he, y. zi. a comparative study on the local mean decomposition and empirical mode decomposition and their applications to rotating machinery health diagnosis. journal of vibration and acoustics 132(2):021010, 2010. https://doi.org/10.1115/1.4000770.
[6] z. wang, j. wang, w. cai, et al. application of an improved ensemble local mean decomposition method for gearbox composite fault diagnosis. complexity 2019, 2019. https://doi.org/10.1155/2019/1564243.
[7] y. cheng, d. zou. complementary ensemble local means decomposition method and its application to rolling element bearings fault diagnosis. proceedings of the institution of mechanical engineers, part o: journal of risk and reliability 233(5):868–880, 2019. https://doi.org/10.1177/1748006x19838129.
[8] i. bruant, l. gallimard, s. nikoukar. optimal piezoelectric actuator and sensor location for active vibration control, using genetic algorithm. journal of sound and vibration 329(10):1615–1635, 2010. https://doi.org/10.1016/j.jsv.2009.12.001.
[9] l. b. jack, a. k. nandi. genetic algorithms for feature selection in machine condition monitoring with vibration signals. iee proceedings - vision, image and signal processing 147(3):205–212, 2000. https://doi.org/10.1049/ip-vis:20000325.
[10] h. hao, y. xia. vibration-based damage detection of structures. journal of computing in civil engineering 16(3):222–229, 2002. https://doi.org/10.1061/(asce)0887-3801(2002)16:3(222).
[11] a. a. adewuya. new methods in genetic search with real-valued chromosomes. master's thesis, massachusetts institute of technology, 1996.
[12] z. michalewicz. genetic algorithms + data structures = evolution programs. springer berlin heidelberg, berlin, heidelberg, 1996. https://doi.org/10.1007/978-3-662-03315-9.
[13] r. l. haupt, s. e. haupt. practical genetic algorithms. john wiley & sons, inc., hoboken, nj, usa, 2nd edn., 2003. https://doi.org/10.1002/0471671746.
[14] j. h. holland. adaptation in natural and artificial systems: an introductory analysis with applications to biology, control and artificial intelligence. mit press, cambridge, ma, 1992. https://doi.org/10.1086/418447.
[15] j. h. holland. genetic algorithms and the optimal allocation of trials. siam journal on computing 2(2):88–105, 1973. https://doi.org/10.1137/0202009.
[16] w. guo, p. w. tse. a novel signal compression method based on optimal ensemble empirical mode decomposition for bearing vibration signals.
journal of sound and vibration 332(2):423–441, 2013. https://doi.org/10.1016/j.jsv.2012.08.017.
[17] y. wang, z. he, y. zi. a demodulation method based on improved local mean decomposition and its application in rub-impact fault diagnosis. measurement science and technology 20(2):025704, 2009. https://doi.org/10.1088/0957-0233/20/2/025704.
[18] f. d. d. m. borges. comparação de métodos de tratamento de sinais aplicáveis ao diagnóstico de defeitos em mancais de rolamento. master's thesis, universidade estadual do oeste do paraná, 2018.
[19] z. liu, m. j. zuo, y. jin, et al. improved local mean decomposition for modulation information mining and its application to machinery fault diagnosis. journal of sound and vibration 397:266–281, 2017. https://doi.org/10.1016/j.jsv.2017.02.055.
[20] g. rilling, p. flandrin, p. gonçalvès. on empirical mode decomposition and its algorithms. in 6th ieee-eurasip workshop on nonlinear signal and image processing. grado, italy, 2003.
[21] l. wang, z. liu, q. miao, x. zhang. time-frequency analysis based on ensemble local mean decomposition and fast kurtogram for rotating machinery fault diagnosis. mechanical systems and signal processing 103:60–75, 2018. https://doi.org/10.1016/j.ymssp.2017.09.042.
[22] y. li, x. liang, y. yang, et al. early fault diagnosis of rotating machinery by combining differential rational spline-based lmd and k-l divergence. ieee transactions on instrumentation and measurement 66(11):3077–3090, 2017. https://doi.org/10.1109/tim.2017.2664599.
[23] l. deng, r. zhao. an improved spline-local mean decomposition and its application to vibration analysis of rotating machinery with rub-impact fault. journal of vibroengineering 16(1):414–433, 2014.
[24] y. li, m. xu, z. haiyang, et al. a new rotating machinery fault diagnosis method based on improved local mean decomposition. digital signal processing 46:201–214, 2015. https://doi.org/10.1016/j.dsp.2015.07.001.
[25] l. wang, z. liu, q. miao, x. zhang. complete ensemble local mean decomposition with adaptive noise and its application to fault diagnosis for rolling bearings. mechanical systems and signal processing 106:24–39, 2018. https://doi.org/10.1016/j.ymssp.2017.12.031.
[26] j. sun, z. peng, j. wen. leakage aperture recognition based on ensemble local mean decomposition and sparse representation for classification of natural gas pipeline. measurement 108:91–100, 2017. https://doi.org/10.1016/j.measurement.2017.05.029.
acta polytechnica 61(3):465–475, 2021

acta polytechnica doi:10.14311/ap.2018.58.0195 acta polytechnica 58(3):195–200, 2018 © czech technical university in prague, 2018 available online at http://ojs.cvut.cz/ojs/index.php/ap

characterization of recycled linear density polyethylene/imperata cylindrica particulate composites

olusola femi olusunmade a,∗, sunday zechariah b, taofeek ayotunde yusuf a
a department of mechanical engineering, university of agriculture, makurdi, pmb 2373, makurdi, nigeria
b department of mechanical engineering, federal polytechnic, mubi, nigeria
∗ corresponding author: olusunmadeolusola@yahoo.com

abstract. water-sachets made from low density polyethylene (ldpe) form the bulk of plastic waste, which creates environmental challenges, while certain species of plants, like imperata cylindrica, constitute a large portion of weeds on farmlands. as a technological approach to the reduction and utilization of these materials, composites of imperata cylindrica (ic) particulate and a synthetic polymer (from recycled waste water-sachets) were produced and evaluated for several mechanical and physical properties. the production of the composites and the testing were done using the standard methods available in the literature. the results showed an increase in the tensile modulus, hardness, impact strength and water absorption of the composite, in comparison with the unreinforced polymer, as the ic particulate loading increased from 5 wt% to 30 wt%. however, there was a decrease in the tensile strength, percentage elongation at break and density of the composite as the particulate loading increased from 5 wt% to 30 wt%. the combination of the recycled waste water-sachets and the ic particulate is really promising for composite development. this creates opportunities to reduce ldpe wastes and add economic importance to an otherwise agricultural menace. it will mean creating an economic value from “wastes”.

keywords: mechanical properties; physical properties; imperata cylindrica (ic); particulate; waste water-sachets; composites; recycled linear low density polyethylene (rldpe).

1. introduction

the development of composite materials, particularly natural fibre composites, has become increasingly popular. the volume and number of applications of composite materials have grown steadily, reaching and conquering new markets. according to a market report published by lucintel [1], the future of the natural fibre composites market looks attractive, with opportunities in the automotive as well as the building and construction industries. modern composite materials constitute a significant proportion of the engineered materials market, ranging from everyday products to sophisticated applications.
the efforts to produce economically attractive composite components have resulted in several innovative manufacturing techniques currently being used in the composites industry [2]. these composite materials are sometimes produced using waste resources. the fact that natural resources are ever depleting calls for a more responsible and efficient use of the available scarce resources. besides, creating products from “waste” will really add value and help with the environmental challenges posed by the enormous waste being generated. in many countries in africa, potable water is packaged in sachets made from low density polyethylene (ldpe). this has really become a big business in many cities. as a result of the increasing demand for the packed water, there has been a rise in the amount of waste water-sachets generated. these empty sachets often end up thrown away on the streets, creating a large amount of non-biodegradable waste. a responsible utilization of these wastes will really be an advantage. the water-sachets clog small waterways, thereby creating a perfect breeding habitat for the mosquitoes that are responsible for spreading the malaria parasite, which is a major cause of illness on the african continent. eventually, the water-sachets find their way to larger bodies of water, such as oceans and seas, and pose a serious threat to aquatic life, as these wastes can persist for many years because of their non-biodegradable nature. however, if the water-sachets are burnt, the emissions from the burning process severely pollute the air and also contribute to the greenhouse effect, which is responsible for the global warming that the earth currently experiences as a result of the depleting ozone layer. there is, therefore, a need for a responsible handling of these wastes. one such effort targeted at utilizing the waste water-sachets is incorporating natural fibres into them to produce usable composite materials. natural fibres could serve as viable and abundant alternatives to the expensive and non-renewable synthetic fibres as reinforcement in thermoplastic composites. these types of fibres present many advantages compared to synthetic fibres, such as low tool wear, low density, cheaper cost, availability and biodegradability [3, 4]. one such natural fibre is imperata cylindrica. it is an aggressive weed that is difficult to control due to its short growth cycle. it is abundant, yet unsuitable for grazing animals, and it lacks a good commercial value [5]. when fully mature, its overall nutrient content declines, and its sharp pointed seeds and tangled awns may injure animals and humans [6]. it also acts as a host for pathogens that affect the yield of some food crops [7]. however, imperata cylindrica possesses good stiffness properties which, if incorporated in a matrix, can enhance the composite rigidity. hence, imperata cylindrica is proposed as a fibre reinforcement for recycled water-sachets to produce a thermoplastic composite, thereby increasing its economic importance and reducing the environmental challenges posed by the improper handling of waste water-sachets. this study, therefore, examined some mechanical and physical properties of a recycled water-sachets/imperata cylindrica particulate composite to determine its viability for engineering applications.
2. materials and method

the part of the imperata cylindrica (ic) used for this study is the stem; the stems were obtained from pilla village in the makurdi area of benue state. the waste water-sachets were gathered from the campus of the university of agriculture makurdi, benue state.

2.1. polymer and fibre processing

the imperata cylindrica stems were harvested and sun-dried for two weeks. subsequently, the finer strands of the stems were handpicked and ground (see figure 1). the ground particles were then filtered through a sieve with a pore size of 300 microns. the waste water-sachets (see figure 2) were thoroughly washed, dried and pulverized at goshen plastics industry, makurdi (see figure 3). the pulverized waste water-sachets will be referred to as recycled low density polyethylene (rldpe) henceforth.

figure 1. ground ic.
figure 2. waste water-sachets.
figure 3. pulverized water-sachets.

2.2. composite preparation

the ic particulate and the rldpe were weighed to the required weight using an electronic weighing balance (ohaus adventurer pro analytical balance). the ic particulate and the recycled polymer were mixed such that the particulate weight ratio in the matrix varied from 5 wt% to 30 wt% in steps of 5 wt%. the mould was preheated at 100 °c. half of the rldpe was added to the mould, because the cavity of the mould could not accommodate the whole mass of the rldpe in an un-melted state. after one minute, the ic particulate was added to the mould, and after that, the other half of the rldpe was also added. the combined ic particulate and rldpe were then heated in the aluminium mould at a temperature of 150 °c for 15 minutes, during which the rldpe showed a reasonable fluidity and the blend was thoroughly mixed to ensure homogeneity. the heating continued for another 5 minutes, after which the mould was removed from the heat source and compressed using a 5-tonne hydraulic jack (see figure 4). it was then allowed to cool at room temperature until the composite took the shape of the mould cavity, after which the composite sheet (295 × 210 × 5 mm) was removed from the mould. heating was carried out using a qasa (qsg-505g) gas cooker.

figure 4. compression of the mould and content.
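for the weighing step, the split between particulate and matrix follows directly from the target weight ratio; the 250 g batch mass below is a hypothetical example, since the paper does not state the batch mass.

```python
def batch_masses(loading_wt_pct, total_mass_g=250.0):
    # mass of ic particulate and rldpe for a given particulate weight ratio
    m_ic = total_mass_g * loading_wt_pct / 100.0
    m_rldpe = total_mass_g - m_ic
    return m_ic, m_rldpe

for loading in range(5, 35, 5):          # 5 wt% to 30 wt% in steps of 5 wt%
    m_ic, m_rldpe = batch_masses(loading)
    print(f"{loading} wt%: ic = {m_ic:.1f} g, rldpe = {m_rldpe:.1f} g")
```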
2.3. composite characterization

the ic particulate-reinforced plastic sheet was retrieved from the mould and cut into test specimens. the characterization of the composites was achieved by mechanical testing. some physical properties of the materials were also examined.

2.3.1. mechanical properties

the tensile test was carried out using the instron 3369 universal testing machine according to astm d 638, to determine the tensile strength, tensile modulus and elongation at break of the materials. the test specimen had a dumb-bell shape with a gauge length of 30 mm, a grip width of 15 mm and a thickness of 5 mm. the specimens were placed in the grips of the universal tester and pulled at a crosshead speed of 5 mm/min until failure. the hardness of the materials was measured using a computerized micro-vickers hardness tester (mv-1 pc) with a load of 300 g according to astm e 384, the standard test method for vickers hardness testing of materials. the vickers indenter produces a geometrically similar indentation at all test forces. the dimension of the hardness specimens is 40 × 40 × 5 mm. the charpy impact test was carried out to determine the impact strength of the composites according to astm d 6110, which is used to determine the resistance of plastics to breakage by a flexural shock produced by a pendulum-type hammer. the dimension of the impact test specimens was 100 × 10 × 5 mm. three specimens from each of the materials were used for each of the tests.

2.3.2. physical properties

a water absorption test according to astm d 570 was also carried out to determine the water absorption characteristic of the composite. three samples from each of the materials, with dimensions of 42 × 12 × 5 mm, were cut, cleaned and weighed before immersion in distilled water at room temperature. the specimens were removed from the water after 24 hours, wiped dry and weighed. the difference between the weight before and after the immersion was noted. the water absorption was then calculated as

a = (m2 − m1) / m1 · 100 % (1)

where m1 is the initial mass in grams and m2 is the final mass in grams. the density of the composite was also determined by comparing the mass of a given specimen with its volume:

density = mass (m) / volume (v) (2)

the dimensions of the specimens used to determine the density were 50 × 50 × 5 mm.

3. results and discussion

3.1. mechanical properties

3.1.1. tensile strength

figure 5 illustrates the average tensile strength of the composite produced at different ic particulate loadings as compared to the rldpe. the rldpe has an average tensile strength of 10.86 mpa. it was observed that there was a decrease of 16.48 % to 34.07 % in the average tensile strength of the composite as the ic particulate loading increased from 5 wt% to 30 wt% compared to the rldpe, although there was an increment of 6.39 % in the average tensile strength of the composite for particulate loadings from 5 wt% to 15 wt%. the maximum average tensile strength of the composite is nevertheless still lower than that of the rldpe. the decrease is due to the poor interfacial adhesion between the hydrophobic rldpe and the hydrophilic ic particulate. the scanning electron microscope micrographs (see figures 6 and 7) show that, while the ic particulates were fairly evenly distributed within the matrix, the observed agglomeration of the particulate indicates a weak interfacial bonding. poor interfacial adhesion acts as a stress concentration point upon the application of external forces, leading to a premature failure due to a poor stress transfer from the matrix to the fibre particulate. the higher tensile strength demonstrated by the neat rldpe is due to the flexibility and plasticity of the rldpe [3].

figure 5. tensile strength at varying ic particulate loading.
figure 6. sem micrograph at 20 wt% ic particulate loading.
figure 7. sem micrographs at 30 wt% ic particulate loading.
3.1.2. tensile modulus

figure 8 illustrates the average tensile modulus of the composite produced at different particulate loadings as compared to the rldpe. the rldpe has an average tensile modulus of 116.44 mpa. it was observed that, as the particulate loading increased from 5 wt% to 30 wt%, there was an increase of 8.78 % to 82.53 % in the average tensile modulus of the composites when compared to the rldpe. as the ic particulate loading increased, the elasticity of the rldpe was suppressed by the presence of the derived cellulose. the increment in the modulus is attributed to the decreased deformability of the interface between the ic particulate and the matrix material, which caused a reduced strain as the particulate loading increased, due to the rigidity of the material [3]. then et al. [8] suggested that the enhancement in the tensile modulus is probably due to the fibres themselves, which have a higher stiffness than the polymer.

figure 8. tensile modulus at varying ic particulate loading.

3.1.3. percentage elongation at break

figure 9 illustrates the average percentage elongation at break of the composite produced at different cellulose loadings as compared to the rldpe. the rldpe has a percentage elongation at break of 54.93 %. it was observed that there was a decrease of 64.73 % to 73.84 % in the average percentage elongation at break of the composite as the ic particulate loading increased from 5 wt% to 30 wt% compared to the rldpe. however, there was an increment of 57.31 % in the average percentage elongation at break of the composite for particulate loadings from 5 wt% to 15 wt%. the maximum average percentage elongation at break of the composite is nonetheless still lower than that of the rldpe. the increment noticed between 5 wt% and 15 wt% particulate loadings may be attributed to a better dispersion of the particles within the matrix; there was less agglomeration of the particles and so slightly more strain in this range of particulate loading. however, as the ic particulate loading increased further, the elasticity of the composite was suppressed by the presence of the increased derived cellulose. the reduction is attributed to the decreased deformability of the rigid interface between the ic particulate and the matrix material [3]. liu et al. [9] reported that the decrease in elongation at break is due to the destruction of the structural integrity of the polymer by the fibres and the rigid structure of the fibres.

figure 9. elongation at break at varying ic particulate loading.
ic particulate loading (wt%) | impact strength (j) | hardness (hv)
0 | 5.03 | 52.63
5 | 2.68 | 50.57
10 | 2.88 | 64.73
15 | 2.98 | 70.63
20 | 3.45 | 91.53
25 | 3.80 | 103.77
30 | 4.40 | 111.33

table 1. impact strength and hardness property in relation to ic particulate loading.

3.1.4. impact strength

table 1 lists the average impact strength of the composite produced at different particulate loadings as compared to the rldpe. the rldpe has an average impact strength of 5.03 j. the higher impact strength demonstrated by the neat rldpe is due to the flexibility, plasticity and lower brittleness of the rldpe, which allow it to absorb and distribute the impact energy efficiently [3]. there was an increase of 7.46 % to 64.18 % in the average impact strength of the composite as the ic particulate loading increased from 5 wt% to 30 wt%. nevertheless, the maximum average impact strength of the composite, at 30 wt% particulate loading, was still 12.52 % lower than that of the rldpe. considering the steady increment observed up to 30 wt% loading of the ic particulate, it is possible that, if the particulate loading increased beyond 30 wt%, the impact strength of the composite would eventually reach or even exceed that of the rldpe at some point. the increment in the average impact strength may be attributed to the rigid interface between the ic particulate and the matrix material as the particulate loading increased.

3.1.5. hardness property

table 1 also lists the average hardness values of the composite produced at different ic particulate loadings as compared to the rldpe. the rldpe has an average hardness value of 52.63. it was observed that, as the particulate loading increased from 5 wt% to 30 wt%, there was an increase of 22.99 % to 111.53 % in the average hardness values of the composites when compared to the rldpe. the increase in the hardness observed for the rldpe/ic particulate composite is a result of the hardness of the ic particulate itself, which has been transmitted to the composite. the reduction in the hardness value at 5 wt% particulate loading may be a result of a void in the composite [10].

3.2. physical properties

3.2.1. water absorption

table 2 lists the average percentage water absorption of the composite produced at different ic particulate loadings as compared to the rldpe. the rldpe has a percentage water absorption of 4.64 %. it was observed that, as the ic particulate loading was increased from 5 wt% to 30 wt%, there was an increase of 46.77 % to 233.41 % in the average percentage water absorption of the composites when compared to the rldpe. this result is in line with expectations, as composites with a natural fibre reinforcement exhibit a higher water absorption due to the inherently hydrophilic nature of the fillers [11, 12].

ic particulate loading (wt%) | water absorption (%) | mass (g) | density (g/cm3)
0 | 4.64 | 10.69 | 0.855
5 | 6.81 | 10.66 | 0.853
10 | 7.11 | 10.34 | 0.827
15 | 7.19 | 10.01 | 0.801
20 | 7.86 | 9.88 | 0.791
25 | 13.66 | 9.86 | 0.789
30 | 15.47 | 9.77 | 0.782

table 2. water absorption and density in relation to ic particulate loading.

3.2.2. density

table 2 also lists the average density of the composite produced at different ic particulate loadings when compared to the rldpe. the rldpe has a density of 0.855 g/cm3. it was observed that, as the ic particulate loading was increased from 5 wt% to 30 wt%, there was a decrease of 0.23 % to 8.53 % in the average density of the composites when compared to the rldpe. the decrease in the density observed for the rldpe/ic particulate composite is a result of the low density of the ic particulate itself, which has been transmitted to the composite. when a larger fraction of the rldpe, which has a higher density, is replaced by lighter particulates, the overall density of the resulting composite is reduced, which is one advantage that natural fibre composites have over synthetic fibre composites [3] and other engineering materials.
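a quick numerical check of eq. 2 against table 2 (the density specimens measure 50 × 50 × 5 mm, i.e. 12.5 cm³) reproduces the tabulated densities; the immersion masses used to illustrate eq. 1 are hypothetical, since the paper reports only the resulting percentages.

```python
volume_cm3 = 5.0 * 5.0 * 0.5            # 50 x 50 x 5 mm specimen
for loading, mass_g, rho_tab in [(0, 10.69, 0.855), (15, 10.01, 0.801), (30, 9.77, 0.782)]:
    rho = mass_g / volume_cm3           # eq. 2
    print(f"{loading} wt%: {rho:.3f} g/cm3 (table: {rho_tab})")

# eq. 1 with hypothetical masses (m1 before, m2 after 24 h immersion)
m1, m2 = 1.50, 1.57
water_absorption = (m2 - m1) / m1 * 100.0   # -> 4.67 %
```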
4. conclusion

in this study, rldpe/ic particulate composites were produced through a form of hand lay-up technique, and their mechanical and physical properties at 5 wt% to 30 wt% particulate loadings were examined. the results from the tests carried out showed that the tensile modulus, hardness, impact strength and water absorption of the composite increased as the ic particulate loading increased from 5 wt% to 30 wt%. although an increasing trend was observed for the impact strength as the particulate loading increased up to 30 wt%, the value was still lower than that of the rldpe; by increasing the ic particulate loading beyond 30 wt%, the impact strength of the composite may eventually reach or even exceed that of the rldpe at some point. however, the tensile strength, percentage elongation at break and density of the composite decreased as the particulate loading increased from 5 wt% to 30 wt%. although an increase in tensile strength and elongation was observed up to 15 wt% loading of the particulate, the maximum tensile strength and elongation of the composite at that loading were still lower than those of the rldpe. the results obtained from the tests conducted showed that the composite can actually be adopted in some engineering applications, particularly because of the positive indications observed regarding the tensile modulus, hardness, impact strength and density. the combination of the recycled waste water-sachets and the ic particulate is really promising for composite development. this creates opportunities to reduce ldpe wastes and add economic importance to an otherwise agricultural menace. it will mean creating an economic value from “wastes”.

references

[1] lucintel: “global natural fiber composite market 2015-2020: trends, forecast, and opportunity analysis,” market research reports, 2015.
[2] celluwood: “technologies and products of natural fibre composites,” cip-eip-eco-innovation-2008: id: eco/10/277331, 2008.
[3] olusunmade, o. f, adetan, d. a and ogunnigbo, c. o: “a study on the mechanical properties of oil palm mesocarp fibre reinforced thermoplastic (opmfrt)”, journal of composites, vol. 2016, article id 3137243, 7 pages, 2016. doi:10.1155/2016/3137243
[4] ogunsile, b. o and oladeji, t. g: “utilization of banana stalk fiber as reinforcement in low density polyethylene composite,” revista materia, artigo 11757, 21(4), pp. 953-963, 2016. doi:10.1590/s1517-707620160004.0088
[5] angzzas, s. m. k, aripin, a. m, ishak, n, hairom, n. h. h, fauzi, n. a, razali, n. f and zainulabidin, m. h: “potential of cogon grass (imperata cylindrica) as an alternative fibre in paper-based industry,” arpn journal of engineering and applied sciences, vol. 11, no. 4, 2016.
[6] soromessa, t: “heteropogon contortus (l.) beauv. ex roem. & schult.,” prota (plant resources of tropical africa), wageningen, netherlands, 2011.
[7] cook, b. g, pengelly, b. c, brown, s. d, donnelly, j. l, eagles, d. a, franco, m. a, hanson, j, mullen, b. f, partridge, i. j, peters, m and schultze-kraft, r: “tropical forages: an interactive selection tool,” csiro, dpi&f (qld), ciat and ilri, brisbane, australia, 2005.
[8] then, y. y, ibrahim, n. a, zainuddin, n, ariffin, h and wan yunus, w. m. z: “oil palm mesocarp fiber as new lignocellulosic material for fabrication of polymer/fiber biocomposites,” international journal of polymer science, vol. 2013, article id 797452, 7 pages, 2013. doi:10.1155/2013/797452
[9] liu, l, yu, j, cheng, l and qu, w: “mechanical properties of poly (butylene succinate) (pbs) biocomposites reinforced with surface modified jute fibre,” composites part a: applied science and manufacturing, vol. 40, no. 5, pp. 669–674, 2009. doi:10.1016/j.compositesa.2009.03.002
[10] kling, v, rana, s and fangueiro, r: “fibre reinforced thermoplastic composite rods,” materials science forum, vol. 730-732, pp. 331–336, 2013. doi:10.4028/www.scientific.net/msf.730-732.331
[11] naghmouchi, i, mutje, p and boufi, s: “olive stones flour as reinforcement in polypropylene composites: a step forward in the valorization of the solid waste from the olive oil industry,” industrial crops and products, 72, 183-191, 2015.
[12] deo, c and acharya, s. k: “effect of moisture absorption on mechanical properties of chopped natural fiber reinforced epoxy composite,” journal of reinforced plastics and composites, vol. 1, pp. 5-15, 2010. doi:10.1177/0731684409353352

acta polytechnica 58(3):195–200, 2018

acta polytechnica vol. 44 no. 2/2004

development of a technique and method of testing aircraft models with turboprop engine simulators in a small-scale wind tunnel – results of tests

a. v. petrov, y. g. stepanov, m. v. shmakov

this report presents the results of experimental investigations into the interaction between the propellers (ps) and the airframe of a twin-engine, twin-boom light transport aircraft with a п-shaped tail. an analysis was performed of the forces and moments acting on the aircraft with rotating ps. the main features of the methodology for windtunnel testing of an aircraft model with running ps in tsagi's t-102 wind tunnel are outlined. the effect of 6-blade ps slipstreams on the longitudinal and lateral aerodynamic characteristics, as well as on the effectiveness of the control surfaces, was studied on the aircraft model in cruise and takeoff/landing configurations. the tests were conducted at flow velocities of v∞ = 20 to 50 m/s in the ranges of angles of attack α = −6 to 20 deg, sideslip angles β = −16 to 16 deg and blade loading coefficients b = 0 to 2.8. for the aircraft of unusual layout studied, an increase in blowing intensity is shown to result in a decreasing longitudinal static stability and a significant asymmetry of the directional stability characteristics, associated with the interaction between the ps slipstreams of the same (left-hand) rotation and the empennage.

keywords: windtunnel testing, propeller slipstream, engine failure, test methodology.

1 introduction

the propeller slipstream and airframe interaction is one of the most important problems in the aerodynamic design of an aircraft, especially in the case of a heavily-loaded ps [1, 2]. the high efficiency of turboprop engines is a reason for their widespread application. however, the effective thrust of turboprop engines depends on their position on an aircraft. the ps slipstream has an essential influence on the lift and on the stability and controllability characteristics of an aircraft due to its interaction with the wing, fuselage and empennage. on the other hand, the nacelles, fuselage, wing and other aircraft components influence the flow velocity distribution over the ps plane of rotation and, as a consequence, alter the aerodynamic loads on the ps blades and their thrust characteristics in comparison with the free-stream flow. modern computation methods are mainly based on ideal fluid theory and are not able to fully reveal and take into account the above-mentioned effects. the use of experimental methods is therefore the main way to study the problems of ps and airframe interaction. this is especially true in the case of aircraft of unusual layout, with an unusual position of the engines and empennage. this report presents the results of experimental investigations of a model of a twin-engine light transport aircraft of unusual twin-boom layout with a п-shaped tail. the main feature of the airframe is the twin fins immersed in the ps slipstreams. in this case, the rudder effectiveness may be increased significantly due to the slipstream flow.
however, the lateral stability and controllability characteristics can be significantly influenced by the engine power setting as well as by the angle of attack and by the deflection of the wing high-lift devices. the tests in tsagi's t-102 subsonic wind tunnel were aimed at studying the peculiarities of these effects on the aerodynamics, stability and controllability of the aircraft model in cruise and takeoff/landing configurations.

2 aircraft model

the general geometry of the light transport aircraft (lta) model is shown in fig. 1. the lta is a twin-boom high-wing monoplane with a п-shaped tail.

fig. 1: model lta

the wing aspect ratio is λ = 10.9, the wing area s = 0.54 m², the span l = 2.43 m and the mean aerodynamic chord c = 0.24 m. the double-slotted and double-section flaps can be deflected to angles of δf = 25 deg (takeoff position) and 50 deg (landing position). the twin-fin vertical tail has a relative area of svt = 0.3 and a relative arm of lvt = 0.35. the corresponding relative parameters of the horizontal tail are sht = 0.25 and lht = 3.8. the model power plant consists of two 6-blade ps of left-hand rotation (as viewed forward) with a diameter of ds = 0.36 m, driven by electric drives, each with a power of n = 5 kw. the blades are manufactured of a moulded reinforced carbon plastic. they can be assembled on the faired hub (spinner) at the desired blade setting angles in the range φb = 13.3–29.44 deg. the ps speed in the range n = 0–6500 1/min was measured by an internal photoelectric transducer.

3 test methodology

a technique was developed for testing the model with operating ps. it consists of two parts: 1) determination of the isolated model power plant thrust; 2) a test methodology for the full aircraft model with two running ps.
3.1 methodology for measuring of ps thrust

the methodology is based on measurements of the forces and moments acting on an isolated nacelle with ps on and off, made with ab-102 mechanical balances, and on the elimination of the influence of the supporting devices and communications by calculation and experimental methods. as a result of the balance measurements, carried out at flow velocities of v∞ = 0–45 m/s, ps speeds n = 0–6500 1/min and blade setting angles of φb = 13.3–29.44 deg, the following were determined:

- the available range of thrust p and thrust coefficient αs = p / (ρ ns² ds⁴),
- the available range of blade loading coefficient b = p / (0.25 π q∞ ds²),
- the possible range of ps advance ratio λs = v∞ / (ns ds),
- the required power Ns and power coefficient βs range, where Ns = Ms ωs and βs = Ns / (ρ ns³ ds⁵) = 2π Ms / (ρ ns² ds⁵),

where v∞, q∞ and ρ are the free-stream velocity, dynamic pressure and density, ns is the propeller speed, ωs = 2π ns is the propeller angular velocity, and Ms is the torque moment on the ps shaft.

in fig. 2, the thrust coefficient αs is plotted as a function of the advance ratio λs at φb = 29.44 deg and n = 5000 1/min. the relationship αs(λs) is corrected for the nacelle and for the interference of the supporting devices, and agrees satisfactorily with the test data of a large-scale model of a similar ps in tsagi's t-104 large wind tunnel.

fig. 2: propeller thrust coefficient αs versus advance ratio λs, propeller cb-34-01 (wind tunnel t-102, ds = 0.36 m; wind tunnel t-104, ds = 0.65 m)

3.2 test methodology for the aircraft model with running ps

the investigations of the aerodynamic characteristics of the aircraft model in cruise and takeoff/landing configurations were conducted in the ranges of angles of attack α = −6 to 20 deg, sideslip angles β = −16 to 16 deg and loading coefficient b = 0–2.8. the value b = 0.3 corresponds to the cruise flight regime, and the range b = 1.0–2.8 corresponds to the takeoff/landing regimes. the values of αs and the corresponding required values of the propeller speed ns were determined according to the needed values of b. the values of v∞ were determined under the conditions of the maximum possible reynolds number for the model and of the electric drive power limitation. when calculating the aerodynamic coefficients, the forces and moments were referenced to the flow dynamic pressure and the base wing area. the pitching moments, in addition, were referenced to the mean aerodynamic chord, and the rolling and yawing moments were referenced to the wing span. the moment values were determined relative to the conditional center of gravity position at a distance of xcg = 0.257 from the mac forward edge.
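the coefficient definitions above were reconstructed from a garbled source, so the sketch below should be read under that caveat; all input values (density, thrust, torque, velocity) are hypothetical examples within the stated test ranges.

```python
from math import pi

rho = 1.225          # air density, kg/m^3 (sea-level value, assumed)
d_s = 0.36           # propeller diameter, m
n_s = 5000 / 60      # propeller speed, rev/s (5 000 1/min)
v_inf = 30.0         # free-stream velocity, m/s (within the 20-50 m/s range)
thrust = 40.0        # hypothetical measured thrust, n
torque = 2.0         # hypothetical measured shaft torque, n*m

q_inf = 0.5 * rho * v_inf**2                        # dynamic pressure, pa
alpha_s = thrust / (rho * n_s**2 * d_s**4)          # thrust coefficient
b = thrust / (0.25 * pi * q_inf * d_s**2)           # blade loading coefficient
lambda_s = v_inf / (n_s * d_s)                      # advance ratio
beta_s = 2 * pi * torque / (rho * n_s**2 * d_s**5)  # power coefficient
```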
this is mainly associated with the action of the transverse force on the ps and with the ps slipstream effect on the flow about the horizontal tail. the test results show that the aerodynamic centre of the model without the horizontal tail shifts to a much lesser degree, and the pitching moment increases in the nose-down direction due to the increased effectiveness of the flaps blown by the ps slipstreams. a substantial feature of an aircraft with two left-hand-rotation ps is the appearance of yawing moments at zero sideslip (see fig. 3). this is mainly associated with the action of the two slipstreams, swirled in the same direction, on the flow about the vertical tail immersed in the wake produced by the ps.

4.2 lateral aerodynamic characteristics

the effect of the running ps on the lateral stability of the aircraft model is comparatively small for the configurations tested. however, the directional stability characteristics vary considerably with varying blowing intensity (fig. 4). the variation of the directional stability characteristics with sideslip, rudder deflected and engines running results mainly from the action of two factors: non-uniform variation of the rudder effectiveness with sideslip angle (due to displacement of the slipstream core in the direction of the ps rotation), and the peculiarities of the interaction between the two left-hand swirled ps slipstreams and the vertical tail. as a consequence, an increase in the loading coefficient $B$ from 0 to 2.8 leads, in addition to the variations of $c_n$ at $\beta = 0$, to "nonsymmetrical" variations in directional stability: it increases with left wing slip ($\beta < 0$) and decreases with right wing slip ($\beta = 0$–$15$ deg).

4.3 effectiveness of control surfaces

fig. 3: longitudinal aerodynamic characteristics ($c_l$, $c_m$ and $c_n$ versus $\alpha$) in the takeoff configuration, $\delta_f = 25$ deg, for $B = 0$, 1.0 and 2.0

fig. 4: lateral aerodynamic characteristics ($c_n$ versus $\beta$), $\delta_f = 25$ deg, $\alpha = 5$ deg, $\delta_r = 0$, for $B = 0$, 1.0, 2.0 and 2.8

the effectiveness of the rudder with the propellers operative increases significantly with increasing blade loading coefficient $B$; in this case the region of maximum increments of the yawing moment shifts towards negative sideslip angles ($\beta = -5$ through $-10$ deg), i.e. in the direction of rotation of the propellers (fig. 5). the effectiveness of the rudder on the model in the takeoff configuration at an angle of attack of $\alpha = 5$ deg and $B = 2$ increases by a factor of 1.6–1.7 at $\beta = 0$, and by a factor of 1.8–2.1 at $\beta = -6$ through $-10$ deg. the effectiveness of the ailerons and of the elevator in the operational angle-of-attack range depends rather insignificantly on the power setting.

5 effect of engine failure on the aerodynamic characteristics

studies of one-engine-inoperative (left- or right-hand) situations were performed on the aircraft model in the takeoff configuration ($\delta_f = 25$ deg, landing gear extended) at a propeller area load factor $B = 2$, corresponding to the maximum takeoff engine power. the propeller blades of the failed engine are in feathered pitch (fp). fig. 6 demonstrates the longitudinal aerodynamic characteristics of the model with both engines operative ($B_r = 2$; $B_l = 2$), with the right engine failed ($B_r =$ fp; $B_l = 2$) and with the left engine failed ($B_r = 2$; $B_l =$ fp), as well as with both ps windmilling (wm). the change-over from wm to $B = 2$ increases the derivative $c_l^\alpha$, the maximum lift coefficient and the critical angle of attack, decreases the degree of static stability, and increases the longitudinal (propulsive) thrust component (shift of the drag polar into the negative $c_d$ domain). failure of the left or the right engine results in approximately the same decrease in the increments of the lift and longitudinal force. besides, with one engine failed, the longitudinal static stability of the aircraft model is restored to a significant degree.

fig. 5: rudder effectiveness ($\Delta c_n$ versus $\beta$), $\delta_f = 25$ deg, $\alpha = 5$ deg, $\delta_r = -20$ deg, for $B = 0$ and $B = 2.0$

fig. 6: longitudinal aerodynamic characteristics of the model ($c_l$ and $c_m$ versus $\alpha$, $c_l$ versus $c_d$), $\delta_f = 25$ deg, for $B_l = B_r = 2.0$; $B_l = 2.0$, $B_r =$ fp; $B_l =$ fp, $B_r = 2.0$; and both ps windmilling (wm)

a special feature of the aircraft layout with the twin-fin tail in the propeller slipstreams is the generation of a significant directional moment of negative sign, which increases with the angle of attack and with the loading coefficient $B$ (see fig. 3). this is attributed to the effect of the flow angularity at the vertical tail created by the swirling slipstreams behind the ps with the same direction of rotation (left-handed). the influence of the operating ps on the rolling moment is insignificant up to $\alpha = 10$ deg. however, the failure of either engine results in the generation of rolling moments opposite in sign and approximately equal in magnitude at identical angles of attack.

6 conclusion

investigations of a model of a twin-engine transport aircraft with an uncommon twin-boom layout and running propellers have shown that an increase in the blowing intensity leads to an enhancement of the lifting ability of the aircraft and to a degradation of its longitudinal static stability. it was found that a twin-fin tail unit in the propeller slipstreams increases the rudder effectiveness by a factor of 1.5–2 with the propellers running in the takeoff regime. however, because of the interaction of the swirling slipstreams behind propellers with the same direction of rotation with the vertical tail, a significant yawing moment appears on an aircraft of the considered layout (even at zero sideslip angle) and its directional stability characteristics vary noticeably. failure of the right (critical) engine leads to an additional significant increase in the yawing moment, which poses the problem of ensuring the aircraft's directional trim.

dr. albert v. petrov, dr. yury g. stepanov, dr. michael v.
shmakov, aerodynamic department, central aerohydrodynamic institute (tsagi), zhukovsky str. 1, zhukovsky, 140180, russia

1 introduction

philosophers and scientists in the 19th century started to investigate many natural and social phenomena. in fact, the 19th century was a revolutionary era during which the first "natural law" of economics [1] – pareto's law – was observed. pareto's law states that the high end of the wealth distribution follows the power law $p(w) \sim w^{-1-\alpha}$, where the exponent $\alpha$ is stable for an investigated country in a given period of time. many scientists have questioned the validity of pareto's law and have made measurements of the distribution, but the main message still remains true – the higher end of the wealth distribution behaves like a power law. experiments performed in the last few years, e.g. [2, 3, 4, 5], have confirmed the validity of pareto's law. the functional form itself is not amazing, but the stability of the law in time and space is remarkable. the value of the exponent $\alpha$ varies slightly from one country to another and there are small fluctuations of $\alpha$ in time, but pareto's law has been found almost everywhere. moreover, the validity of pareto's law can be extended back to ancient egypt, to the times of the pharaohs [6]. this universality of the power-law tail is a surprising phenomenon, and it calls for an explanation. recent studies [7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20] have investigated the multiplicative random process repelled from zero as a mathematical source of power-law distributions.
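the power-law statement can be made concrete with a small numerical check; the sketch below draws synthetic wealth samples with a pareto tail and recovers the exponent with the standard hill estimator (a generic illustration, not a procedure from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
alpha_true = 1.5
# classic pareto sample: density ~ w^(-1-alpha) for w > 1
w = rng.pareto(alpha_true, size=100_000) + 1.0

def hill_estimator(sample, k):
    """hill estimate of the tail exponent alpha from the k largest values."""
    tail = np.sort(sample)[-k:]
    return 1.0 / np.mean(np.log(tail / tail[0]))

print(hill_estimator(w, k=2_000))    # close to 1.5 for this synthetic sample
```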
however, there are a million ways to implement multiplicative random processes, and the most studied implementations are the generalized lotka-volterra equation [10, 11, 12, 13] and the analogy with directed polymers in random media [21, 22, 23]. in these methods, models are formed by a kinetic equation that describes the exchange of wealth in a society of agents and global redistribution which is analogous to repelling from zero in stochastic processes. empirical studies of the lower end of the distribution have shown exponential behavior [3, 24, 25] and this behavior has been interpreted as a conservation law for total wealth, which leads to the robust boltzmann exponential distribution that is analogous to the energy distribution in a gas of elastically scattering molecules. similar studies in [26] with previous notes lead to the view of economic activities as a scattering process, where the agents are analogous to inelastic scattering particles [27, 28, 29, 30, 31, 32, 33]. inelasticity is very important to explain the power-law tail of wealth distribution. the assumption that there is a total wealth increase on average is reasonable for economic reasons (e.g., rising gdp). inelastical scattering of particles has been investigated in the context of granular materials [34] and the maxwell model and its inelastical variants, e.g., [35, 36]. these studies lead to the conclusion that a self-similar solution of kinetic equations exists. this solution is not stationary but assumes a time-independent form after rescaling the energy, and the tail of the scaling function is the power-law when certain conditions are used. a theoretical investigation of inelastical scattering agents on a fully-connected network (mean field solution) is performed in [37] and power-law tails of wealth distribution were found for a large set of parameters �, � of the interaction. it is suggested in [37] that the theoretical solutions do not answer the problem of the robustness of exponent � in different societies and the answer could be given by a sociological ingredient in the model. recent investigations of networks, which has been reviewed in [38], show some remarkable phenomena, and models that agree with the basic experimental measurements have been introduced e.g. in [39, 40]. one possible enhancement of the model could be the use of networks where interaction is allowed only along the edges. this paper deals with simulations of the model in [37] on the wattsstrogatz network [39]. 2 definition of the model let us imagine a society of n agents, where each agent has only one variable which signs his/her wealth ~wi, i n�{ , , , }1 2 � . thus, the state of the system is described by w w w wn� { ~ , ~ , , ~ }1 2 � . the agents are able to interact and the interaction is essentially instantaneous. of course, a real society is more complicated and many economic interactions can take place at the same time, pairwise, although some economic interactions can be taken as multilateral rather than bilateral in a real society, and positive, the interaction has a positive effect on the total wealth of the society of the agents. thus, the interacting agents become, in sum, more wealthy after the interaction than at the beginning of the interaction. when two agents i and j are chosen to interact, the dynamics of the wealth of agents i and j is governed by interactions that can be formalized as follows 28 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 45 no. 
5/2005

a model of the distribution of wealth in society
h. lavička, f. slanina

a model of the distribution of wealth in society will be presented. the model is based on an agent-based monte carlo simulation where interaction (exchange of wealth) is allowed along the edges of a small-world network. the interaction is like inelastic scattering and it is characterized by two constants. simulations of the model show that the distribution behaves as a power law and agrees with the results of pareto.

keywords: pareto's law, economics, scattering.

$$\begin{pmatrix} \tilde w_i(t+1) \\ \tilde w_j(t+1) \end{pmatrix} = \begin{pmatrix} 1+\varepsilon-\beta & \beta \\ \beta & 1+\varepsilon-\beta \end{pmatrix} \begin{pmatrix} \tilde w_i(t) \\ \tilde w_j(t) \end{pmatrix} \qquad (1)$$

and all other agents remain unchanged, so $\tilde w_k(t+1) = \tilde w_k(t)$ for $k \neq i$ and $k \neq j$, where $\beta$ and $\varepsilon$ are parameters of the model. $\beta \in (0,1)$ measures the strength of the exchange and $\varepsilon > 0$ measures the one-step inflow of wealth. interaction is allowed only along the edges of a network, which is represented by the graph $G = (\mathcal{N}, E)$, where $\mathcal{N}$ is the set of all nodes and $E$ is the set of all edges. an edge $e$ is an unordered pair $e = (i,j)$ connecting nodes $i$ and $j$. each node $i \in \mathcal{N}$ has its neighbourhood $\mathcal{O}_i = \{\, j \mid (i,j) \in E \,\}$. each agent $i$ is bound to its own node $i$ and the agent's neighbourhood is $\mathcal{O}_i$. the network is generated by the watts-strogatz algorithm [39], which endows the network with the basic features that have been found to describe human networks. a rewiring algorithm is applied to a totally ordered network, which means that each edge is rewired to a randomly chosen agent with probability $p$.

there are two possible ways to execute one monte carlo step, using
– an agent initiated model,
– an edge initiated model.

2.1 agent initiated model

the updating mechanism of the monte carlo step is based on the choice of agents, i.e., an agent $i \in \mathcal{N}$ is chosen with uniform distribution and a second agent $j$ is chosen with uniform distribution from his/her neighbours $\mathcal{O}_i$. it can be argued that the edges of the graph are only dispositions that can be used by the agents, and that the paired agents that interact are interested in collaboration, the collaboration being useful for them.

2.2 edge initiated model

this model is based on the choice of an edge $e = (i,j)$ with uniform distribution. the interacting agents are denoted $i$ and $j$; the rule of interaction is symmetric with respect to the exchange of $i$ for $j$, so there is no ambiguity. it can be argued that every connection in society is used with the same probability, and highly connected agents will interact very frequently.

3 interesting variables measured

wealth was normalized, $w_i = \tilde w_i / \bar{\tilde w}$. this means that there are $N$ units of wealth in the society after normalization, and they are distributed among the agents. the first interesting variable is the social tension, which measures differences in wealth across the edges of the network,

$$T = \frac{1}{|E|} \sum_{(i,j)\in E} \left( \frac{|w_i - w_j|}{\bar w} \right)^{\gamma}, \qquad (2)$$

where $\bar w = \frac{1}{N}\sum_i w_i$ is the average wealth and $\gamma \in (0,1)$ is a parameter, set to $\gamma = 1/2$ in our simulations. the second interesting variable is the distribution of wealth

$$D(w) = P(w_i > w), \qquad (3)$$

i.e., the probability that a randomly chosen agent's wealth is greater than $w$. the following variable is the correlation between wealth and connectivity,

$$h(c) = \langle w \rangle_{P(w \mid c)}. \qquad (4)$$

the value $h(c)$ is computed as

$$h(c) = \frac{1}{|\{\, i \mid k_i = c \,\}|} \sum_{i:\, k_i = c} w_i, \qquad (5)$$

where $k_i$ is the connectivity of the individual agent $i$ and $c$ is an integer value.
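a minimal sketch of the agent initiated model, eq. (1), on a watts-strogatz network; the interaction constants are the paper's values, but the run length is a short toy run (the paper integrates to $T = 1.5\times 10^9$ and averages over 10 runs), and the networkx generator is assumed as a stand-in for the rewiring procedure of [39]:

```python
import numpy as np
import networkx as nx

N, m, p = 10_000, 4, 5e-5          # agents, mean connectivity, rewiring prob.
beta, eps = 0.01, 2.5e-5           # exchange strength, one-step inflow
steps = 500_000                    # toy run, far shorter than the paper's
rng = np.random.default_rng(1)

G = nx.watts_strogatz_graph(N, m, p, seed=1)
neigh = [list(G.neighbors(i)) for i in range(N)]
w = np.ones(N)                               # every agent starts with wealth 1

for _ in range(steps):
    i = rng.integers(N)                        # agent chosen uniformly ...
    j = neigh[i][rng.integers(len(neigh[i]))]  # ... then a uniform neighbour
    wi, wj = w[i], w[j]
    w[i] = (1.0 + eps - beta) * wi + beta * wj   # interaction rule, eq. (1)
    w[j] = beta * wi + (1.0 + eps - beta) * wj

w *= N / w.sum()                   # normalization: N units of wealth in total

# tail of D(w) = P(w_i > w) on a log-log scale, cf. fig. 2; for a toy run the
# power-law tail is only beginning to form
ws = np.sort(w)
D = 1.0 - np.arange(1, N + 1) / N
rich = slice(int(0.95 * N), int(0.999 * N))      # roughly the top 1-5 %
print(np.polyfit(np.log10(ws[rich]), np.log10(D[rich]), 1)[0])
```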
4 results of simulations

the model was investigated with fixed interaction parameters that were set up to fulfill equation 10 from [37],

$$2\varepsilon = \beta^2 (\alpha - 1), \qquad (6)$$

with $\alpha = 3/2$, i.e., the same interaction for which the power-law exponent of the wealth distribution of the model on the fully-connected network was $\alpha = 3/2$. now there is only one degree of freedom left, which we fix by setting $\beta = 0.01$. the simulations were performed with the following parameters.

general parameters of the monte carlo method:
– number of agents $N = 10000$,
– final time of the simulations $T = 1.5\times 10^9$,
– number of monte carlo runs $R = 10$.

parameters of the interaction:
– $\varepsilon = 2.5\times 10^{-5}$,
– $\beta = 0.01$.

parameters of the network: there are two parameters in the construction of the small-world network using the watts-strogatz algorithm [39]:
– initial number of edges from an agent $m = 4$ (mean connectivity),
– probability $p \in [0,1]$ of rewiring of an edge.

the initial wealth of the agents was set to 1, so the initial wealth dispersed in the society of $N$ agents is $N$.

4.1 agent initiated model

the model is based on a random choice of the agents that will interact according to the equation of motion (1). the time evolution of the social tension (fig. 1) rises and then decreases for all $p$, but in the case of parameter $p < p_a$, $0.00007 < p_a < 0.0001$, the process is slower, and for the subset of cases with $p > 0$ there is a trough or plateau in the time evolution after the peak of social tension. the case with $p > p_a$ behaves differently: there is one peak and then a rapid decrease in social tension. the distribution of wealth (fig. 2) exhibits the power-law tail for $p < p_a$ (the same threshold symbol is used because the power-law behaviour and the distinct social tension evolution occur together) with exponent $-0.96$, which is stable on this interval of $p$ in the thermodynamic limit $N \to \infty$, $T \to \infty$ with $N/T$ constant. the power law is valid for approximately 1–5 % of the population, which is in quite good agreement with the measurements in [25]. the deviation of the data from the power law at the higher end of the distribution behaves as a finite-size effect. if $p > p_a$, the wealth distribution is no longer a power law, and the initial power-law tail is spread out by the dynamics of the model. in fig. 3, the average connectivity of the network was 4, and the connectivity is dispersed around this value by the rewiring process. in the case $p < p_a$ there is a strong correlation between average wealth and connectivity, but the case $p > p_a$ enables less connected agents to outperform agents with average connectivity.

fig. 1: time evolution of social tension in the agent initiated model
fig. 2: distribution of wealth among agents in the agent initiated model
fig. 3: correlation of wealth and connectivity in the agent initiated model

4.2 edge initiated model

the model is based on a random choice of the edges along which the agents interact according to the equation of motion (1). the time evolution of the social tension (fig. 4) is very similar to the previous case. in the case of $p < p_e$, $0.00007 < p_e < 0.0001$, the dynamics is slower than in the opposite case, and for $p > 0$ there is a peak and a plateau, or a twin peak. this is in contrast to the case $p > p_e$, where there is only one peak and then a rapid decrease. the distribution of wealth (fig.
5) allows power-law behavior with exponent � � �0.95 for the case p < pe and the power-law tail is stable at the thermodynamic limit. as in the previous case it is valid for 1–5% of the population and the deviation from the power-law for the higher end of the distribution is a finite-size effect. however, there is no power-law for p > pe. the correlation of wealth (fig. 6) shows that average wealth is a strictly growing function of connectivity c. the average wealth of a player with average connectivity (4) is better for p < pe, which is similar to the previous case. 5 conclusions a model of wealth distribution based on inelastical scattering interaction was simulated on the watts-strogatz network with the aim of obtaining the powerlaw tail in the higher end of the distribution, which corresponds with pareto’s empirical observations. there are intervals p where the model admits the power-law p < pg, g a e�{ , }, which is stable at the thermo© czech technical university publishing house http://ctn.cvut.cz/ap/ 31 czech technical university in prague acta polytechnica vol. 45 no. 5/2005 fig. 4: time evolution of social tension in the edge initiated model fig. 5: distribution of wealth among agents in the edge initiated model dynamic limit and p > pg where there is no longer the power-law. so the model admits the power-law only for a “closed” community without many merchants that trade with distant communities. the exponents, which were measured in the simulations, differ from the mean-field computations in [37] with the exponent � � 3 2. connectivity of the agents was a positive factor for wealth, although there are counter-cases. this is especially true for higher connectivities. 6 acknowledgments the computer simulations in this paper were made on a cluster that is supported by the department of physics of the faculty of nuclear sciences and physical engineering, ctu in prague. this work was supported by grant frvš 2005:3305010. references [1] pareto, v.: cours d’economie politique. lausanne: f. rouge, 1897. [2] levy, m., solomon, s.: physica. a 242, 90 (1997). [3] dragulescu, a., yakovenko, v. m.: physica. a 299, 213 (2001). [4] reed, w. j., hughes, b. d.: phys. rev. e 66, 067103 (2002). [5] aoyama, h., souma, w., fujiwara, y.: physica. a 324, 352 (2003). [6] abul-magd, a. y.: phys. rev. e 66, 057104 (2002). [7] levy, m., solomon, s.: int. j. mod. phys. c 7, 595 (1996). [8] levy, m., solomon, s.: int. j. mod. phys. c 7, 65 (1996). [9] biham, o., malcai, o., levy, m., solomon, s.: phys. rev. e 58, 1352 (1998). [10] solomon, s.: “in decision technologies for computational finance.” ed. a.-p. refenes, a. n.burgess, j. e. moody, kluwer academic publishers, 1998. [11] solomon, s.: “in application of simulation to social sciences.” ed. g. ballot, g. weisbuch, hermes science publications, 2000. [12] huang, z.-f., solomon, s.: eur. phys. j. b 20, 601 (2001). [13] solomon, s., richmond, p.: physica. a 299, 188 (2001). [14] blank, a., solomon, s.: physica. a 287, 279 (2000). [15] solomon, s., levy, m.: cond-mat/0005416. [16] huang, z.-f., solomon, s.: physica. a 294, 503 (2001). [17] sornette, d., cont, r.: j. phys. i france 7, 431 (1997). [18] sornette, d.: physica a 250, 295 (1998). [19] sornette, d.: phys. rev. e 57, 4811 (1998). [20] takayasu, h., sato, a. -h., takayasu, m.: phys. rev. lett. 79, 966 (1997). [21] marsili, m., maslov, s., zhang, y. -c.: physica. a 253, 403 (1998). [22] bouchaud, j. -p., mézard, m.: physica. a 282, 536 (2000). 
[23] burda, z., johnson, d., jurkiewicz, j., kaminski, m., nowak, m. a., papp, g., zahed, i.: cond-mat/0101068. [24] dragulescu, a., yakovenko, v. m.: eur. phys. j. b 17, 723 (2000); eur. phys. j. b 20, 585 (2001); in “modeling of complex systems: seventh granada lectures”, aip conference proceedings 661, 180 new york, 2003. [25] yakovenko, v. m.: cond-mat/0302270. [26] ispolatov, s., krapivsky, p. l., redner, s.: eur. phys. j. b 2, 267 (1998). [27] chakraborti, a., chakrabarti, b. k.: eur. phys. j. b 17, 167 (2000). [28] chakrabarti, b. k., chatterjee, a.: “applications of econophysics.” conference proceedings of second nikkei symposium on econophysics, tokyo, japan, 2002, springer-verlag, 2003. [29] chatterjee, a., chakrabarti, b. k., manna, s. s.: physica. a 335, 155–163 (2004); phys. scripta. t 106, 36–38 (2003). [30] scafetta, n., picozzi, s., west, b. j.: cond-mat/0209373. [31] scafetta, n., west, b. j.: “a trade-investment model for distribution of wealth, nonlinear dynamics and nonextensivity.” conference proceedings of workshop on 32 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 45 no. 5/2005 czech technical university in prague fig. 6: correlation of wealth and connectivity in the agent initiated model anomalous distributions, nonlinear dynamics and nonextensivity, santa fe, usa. [32] gligor, m., ignet, m.: eur. phys. j. b 30, 125 (2002). [33] sinha, s.: physica scripta. t 106, 59-64 (2003). [34] jaeger, h. m., nagel, s. r., behringer, r. p.: rev. mod. phys. 68, 1259 (1996). [35] bobylev, a. v., carillo, j. a., gamba, i. m.: j. stat. phys. 98, 743 (2000). [36] bobylev, a. v., cercignani, c.: j. stat. phys. 106, 547 (2002); j. stat. phys. 110, 333 (2003). [37] slanina, f.: phys. rev. e 69, 046102 (2004). [38] albert, r., barabási, a. -l.: reviews of modern physics. 74, 47 (2002). [39] watts, d. j., strogatz, s. h.: nature. 393, 440 (1998). [40] barabási, a. -l., albert, r.: science. 286, 509 (1999). ing. hynek lavička e-mail: lavicka@fjfi.cvut.cz department of physics czech technical university in prague faculty of nuclear sciences and physical engineering břehová 7 115 19 praha 1, czech republic rndr. františek slanina, csc. tel. +420 266 052 671 slanina@fzu.cz institute of physics academy of sciences of the czech republic na slovance 2 182 21 praha 8, czech republic © czech technical university publishing house http://ctn.cvut.cz/ap/ 33 czech technical university in prague acta polytechnica vol. 45 no. 5/2005 ap04_3web.vp 1 introduction turboprop engines are widely used in a category of commuter aeroplanes. the main considerations for correct function of a turboprop engine, from the aerodynamic point of view, are losses in the air intake section and drag of the engine nacelle. as a basic for further consideration we chose an l-410 uvp-e aeroplane equipped with twin walter m-601e engines and avia v-510 propellers. the general goals were to analyse the flow field around its engine nacelle and in whole intake section using the cfd method, and to provide set of basic aerodynamic characteristics with certain variations of air intake and nacelle geometry. the cfd package cfx 5.5 was used to model this complex problem. the finite volume method was implemented for numerical solution of the navier-stokes equation. it the choice of different models for turbulence. 
a computation run was done at two computers at iae, a silicon graphics origin 2100 workstation with eight processors running under the unix platform and a twin-processor pc running under the windows 2000 platform. 2 data preparation 2.1 geometric simplification one consideration when dealing with such a complex problem is to be as close as possible to the reality. on other hand there are of course certain limits, e.g., when modelling geometric details, allowable solver time, etc. we therefore decided to neglect minor and complicated structural elements in the stilling chamber and in the inlet duct into the axial compressor (rivets, engine mount, blades, etc.) 2.2 geometry the geometry of the nacelle and air intake was created from 2d documentation obtained by letecké závody a.s. the unigraphics (ug) cad system was used for digitising the geometric data. points, curves and surfaces to describe complicated geometric layout. the geometry involves an external part and an internal part. the external parts consist of the nacelle, the torso of the wing with a span of 2 m and appropriate airfoil sections the external part of the exhaust, the propeller hub spinner (see fig. 1). the internal part consists of the nacelle, featuring the air intake, the stilling chamber and the engine protection screen, covering a part of an axial compressor inlet duct (see fig.2). for importing cad data into the cfx preprocessor an iges file was created. 2.3 computational domain for preprocessing , cfx build was used to prepare data for the solver. the iges file was imported into preprocessor, where the geometry was reconstructed and a solid for whole computational domain was created. the domain consists of a semi-spherical front part and a cylindrical rear part. as the creation of the solid was successfully performed surface seeds, 66 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 44 no. 3/2004 aerodynamic analysis of turboprop engine air intake p. chudý, k. fi�akovský, j. friedl the objective of this paper is to present cfd computation of a let l-410 engine nacelle equipped with a walter m-601e turboprop engine. the main purpose is to estimate the air intake fluid characteristics of different air intake geometries. the results of these computations are part of an optimisation process focused on increasing the performance and reducing the losses in the ‘engine – nacelle’ system. a problem with flow separation in the input section was observed. this project is supported by ministry of industry and trade of the czech republic. keywords: aerodynamic, cfd, turboprop engine, air intake. fig. 1: external part of nacelle geometry fig. 2: internal part of nacelle geometry the setting of prismatic elements and volume mesh refinement controls were applied at expected areas of main flow changes and a volume mesh was generated. it proved very difficult to set appropriate values of mesh controls due to the geometric complexity of the model. a number of elements in the unstructured hybrid mesh varied up to 1.3 millions. the surface mesh is shown at fig. 3. 2.4 boundary condition the task was assumed with one plane of symmetry in order to decrease solution time. the boundary conditions (bc) were chosen as follows. for free stream input into a domain, a velocity-inlet (“inlet”) was used, an “outer wall” was defined as a wall with no slip, an “outlet” was defined as a pressure-outlet from the domain. 
for the “nacelle” we used a wall, at the “compressor input” pressure-outlet was defined (from the domain’s point of view it is in fact output of the flow), influence of “exhaust” as pressure-inlet (see fig. 4, 5). a certain pressure loss, computed from the screen geometry [1], was defined at the “screen” (subdomain) as a linear loss coefficient, according to user’s reference guide [2] (see fig. 5). 2.5 monitor surfaces for the computed case, comparison sets of monitor surfaces were defined to determine the flow characteristics in these sections (see fig. 6). 2.6 flow parameters and solver setting all computations were done with the following set of parameters. free stream velocity value v f � 250 kph and the further parameters according to international standard atmosphere at the flight altitude h p �1500 m. the fluid was modelled as incompressible gas due to the relatively low flight speed. the influence of the propeller was modelled as an certain increments of free stream velocity and static pressure defined on the basic of ideal propeller propulsion theory [3, 4], and the propeller stream slip was neglected (this allows a single symmetry plane to be used). a turbulence intensity value 4 % behind the propeller was estimated. all solver runs were realised as parallel on the computers. the tasks were set as an unsteady solution with a small time step, due to complications with convergence during steady runs. 3 computed cases 3.1 computation process the cfx solver was used for all computations. the solution time for one case varied up to few days. the computation process always has the same scheme. firstly, in several steps, the mass flow ratio through the engine was “tuned” by setting the of static pressure value at bc “compressor input” (the mass flow value was given for this work regime of the engine). © czech technical university publishing house http://ctn.cvut.cz/ap/ 67 acta polytechnica vol. 44 no. 3/2004 fig. 3: surface mesh at nacelle and symmetry plane fig. 4: global bc fig. 5: bc at nacelle fig. 6: monitor surfaces in air intake and detail at stilling chamber finally the pressure loss at the subdomain “screen” was set to achieve a drop in total flow pressure through the screen. the allowable differences between the two known and computed values were considered as � 3%. under the computation reached a value for all maximal residuals under 1�10�4 the solution was declared as converged. 3.2 solutions: the main goal in this project was to provide aerodynamic data for different air intake geometries. five geometric variations were computed, as follows: � original air intake, � air intake +10 %, � air intake � 10%, � air intake with prismatic insert, � optimised air intake. in all cases the k-� turbulence model was used, except in the original air intake case where a k-� sst turbulence model was used. a shape of the air intake �10 % was determined from the original air intake with the cross section areas increasing by �10 %. designing the air intake –10 % geometry, we used the same process, and the areas of the cross sections were decreased. the geometry of the air intake with a prismatic insert has a constant area up to a length of 138 mm from the front section, then an expansion of the cross section areas to the original geometry was created. the prismatic length value was determined using laminar and turbulent boundary layer theory simplified for 2 dimensions [5]. 
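the kind of two-dimensional estimate involved can be illustrated with the classical flat-plate boundary-layer relations; a simplified sketch, assuming isa fluid properties at 1500 m and the stated 250 kph (the authors' actual method [5] is more elaborate):

```python
import math

V = 250.0 / 3.6              # free-stream velocity [m/s] from the stated 250 kph
x = 0.138                    # length of the prismatic part [m]
rho, mu = 1.058, 1.74e-5     # isa air density [kg/m^3] and viscosity [Pa*s] at 1500 m

Re_x = rho * V * x / mu
delta_lam = 5.0 * x / math.sqrt(Re_x)    # laminar (blasius) thickness estimate
delta_turb = 0.37 * x / Re_x**0.2        # turbulent 1/7-power-law estimate
print(f"Re_x = {Re_x:.3g}, laminar {delta_lam*1e3:.2f} mm, "
      f"turbulent {delta_turb*1e3:.2f} mm")
```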
the optimised air intake geometry also has a prismatic front part, 138 mm in length, and an expansion of the cross section areas according to the boundary layer theory of flow against a back pressure was used [5, 6, 7]. in all cases the total length of the channel was identical. the geometry of the stilling chamber was also identical in all cases (except the optimised case).

4 results and visualisation

the main flow characteristics were monitored at surfaces s1–s5, and the total pressure losses in the internal sections were determined (see table 1).

fig. 7: typical history of convergence, for the original air intake, k-ε turbulence model

table 1: summary of results

air intake case      pressure loss,   pressure loss,          total pressure   ratio chamber/
                     channel [pa]     stilling chamber [pa]   loss [pa]        channel [1]
original, k-ε        158.13           278.02                  436.15           1.758
original, k-ω sst     53.77           255.79                  309.56           4.757
insert                63.13           240.23                  303.36           3.805
−10 %                 49.97           237.02                  286.99           4.743
+10 %                 45.40           231.08                  276.48           5.090
optimized            127.30           223.53                  350.83           1.756

fig. 8: contours of static pressure (relative to the reference pressure), original air intake, k-ε turbulence model
fig. 9: streamlines in internal sections, original air intake, k-ε turbulence model

5 comparison with experiment

the experiment was performed by walter a.s. in a static engine test room [8]. the experimental set-up was relatively simple, and the main goal was to confirm or refute the appearance of a certain separation area in the original intake geometry. for the analysis we used the method of flow visualisation by cotton fibres, which were stuck to the upper side of the channel in a 50×50 mm grid. a small digital industrial camera, mounted on the operating platform, was used to record the measurements. since it is impossible to see through the nacelle structure, the bottom part of the channel was replaced by appropriately blended acrylic glass, and the external structure part was cut out. the experiment confirmed the appearance of a certain separation area in all regimes from free wheel up to maximum thrust. for illustration purposes, the case with maximum thrust in fig. 15 shows the separation and recirculating zones by large movement of the fibres (unfocused).

fig. 10: streamlines in internal sections, original air intake, k-ω sst turbulence model
fig. 11 and fig. 12: streamlines in internal sections, air intakes +10 % and −10 %, k-ε turbulence model
fig. 13 and fig. 14: streamlines in internal sections, air intake with insert, k-ε turbulence model

6 discussion of results, and conclusions

this work successfully demonstrates a cfd analysis of different turboprop air intake geometries, although the process of creating the computer models was extremely time consuming. some conclusions from the computed results and from the experiment can be presented, but for a detailed analysis more sophisticated experimental work is needed (pressure taps, measurement of turbulence intensity, etc.). the first computed case, the original air intake geometry with the k-ε turbulence model, has a wide flow separation area at the beginning of the channel, which results in a high total pressure loss.
the second analysis, the case of the original air intake with turbulence model k-� sst, shows a smaller separation area, and the total pressure loss is three times smaller than in the previous case. channel geometry variants modified by increments of �10 % and �10 % show a relatively small separation and also a small total pressure loss value. the case of air intake air intake with a prismatic insert has less total pressure loss than the original k-� case, but the advantages of this geometric modification are controversial. optimised geometry has incomparable results for major geometry changes in the channel, bottom part of the stilling chamber and also in the external shape of nacelle. all cases show that an extremely high pressure drop has occurred in the stilling chamber, so in order to further decrease the pressure drop in the whole air intake the complete geometry of the channel and stilling chamber must be modelled as a simple part and modified. this project successfully shows that even quite complex configurations can be simulated via cfd codes, but comparison with experimental values is essential. references [1] rae w. h., pope a.: low-speed wind tunnel testing. john wiley & sons, n. y., 1984. [2] cfx user‘s manuals [3] alexandrov v. l.: letecké vrtule. sntl, praha, 1954. [4] švéda j.: teorie vrtulí a vrtulníků. skripta va brno, 1962. [5] schlichting h.: boundary layers theory. mcgraw-hill, 1979. [6] fi�akovský k.: “graficko-analytická metóda riešenia tmv”. sborník va brno, řada b (1966), č. 6. [7] fi�akovský k., pavelek m.: “2d tmv s vlivem tlakového spádu”. stát. úkol zákl. výzkumu iii-4-2/01-2 zpráva za rok 81/82, 1982. [8] sláčik s.: “optimalizace zástavby turbovrtulového motoru”. výzkumná zpráva. walter a.s., praha, 2003. ing. peter chudý phone: +420 541 143 370 e-mail: chudy@lu.fme.vutbr.cz prof. ing karol fi�akovský, csc. phone: +420 541 142 235 fax: +420 541 142 879 e-mail: fil@lu.fme.vutbr.cz ing. jan friedl phone: +420 541 143 470 e-mail: friedl@lu.fme.vutbr.cz institute of aerospace engineering brno university of technology technická 2 616 69 brno, czech republic 70 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 44 no. 3/2004 fig. 15: flow visualisation in channel ap06_1.vp 1 grinding and polishing processes as part of the manufacturing chain the aim of grinding and polishing processes is to provide greater forming and dimensional accuracy as well as better surface finishing. both processes play an important role, as they are at the end of the net product chain, and processing errors lead to high rates of rejection. in the sanitary fitting industry of today, complex, freely formed work pieces are manufactured by casting. through subsequent grinding and polishing a high-quality shiny surface is produced with the dimensional accuracy of the © czech technical university publishing house http://ctn.cvut.cz/ap/ 3 czech technical university in prague acta polytechnica vol. 46 no. 1/2006 development of a robot system for advanced high quality manufacturing processes b. kuhlenkoetter grinding and polishing are standard operations in material processing which are nowadays automated with the help of industrial robots in order to relieve human labour and optimize the profitability of production. however, it is expensive to adapt present systems to the production of other part geometries and operation cycles, and therefore adaptations are economically applicable only for large batch sizes. 
this project develops an “intelligent” robot system that obtains sensory skills due to the linkage of innovative robot technology and image processing systems via new software. with this system even the smallest error on highly-polished, mirror-like surfaces can be detected objectively and reproducibly. in addition, the system will be capable of establishing an optimum error compensation strategy dependent on the error data, as well as generating and realizing operating programmes. for this purpose it is given a manual-learning skill. a new offline-programming and simulating system for exacting operation processes makes it easier to set up, change and optimize robot programmes, thus making it useful for the operator. keywords: flexible manufacturing systems, materials handling and robotics, quality systems. selected paper from the 4th international conference on advanced engineering design (aed 2004), which was held in glasgow from 5 to 8 september 2004. 1. 2. 3. 4. 5. fig. 1: manufacturing steps in the manufacture of fittings – 1. casting, 2. grinding, 3. polishing, 4. galvanizing and 5. the end product workpiece playing only a secondary role. (fig. 1). the casting process, however, is characterized by high resulting dimensional and form tolerances as well as quality fluctuations such as blowholes and pores. these greatly varying starting conditions lead to unprofitable rejection rates and a very costly manual testing procedure in automated grinding and polishing processing. what is even more difficult for the realization of an automated solution is that errors are only detectable after a part of the fine processing has been done and that sensitive and very shiny surfaces are hard to establish by measuring methods. in addition, visual inspection can strain the operator’s eyesight. 2 the use of robots in grinding and polishing processes the use of modern handling/robot systems for belt grinding and polishing is intended to relieve human workers from physically hard, monotonous and dangerous work (fig. 2) and, on the other hand, to minimize costs while optimizing quality. the robot-aided automated solutions known at present in the fields of grinding and polishing are especially and successfully used in the sanitary fitting industry (fig. 3). whereas in the past these systems were profitable despite high wage costs, they are now challenged by competition from cheaper manual grinding and polishing processes in low-wage countries, due to advancing globalization of the markets. the threat of grinding and polishing processes moving abroad is compounded by the medium-term danger that the subsequent steps of manufacturing will also be shifted abroad. the high time and cost requirements for programming and optimizing have a particularly negative effect on the profitability of industrial robot-aided grinding and polishing cells [4]. compared to conventional robot tasks, these high requirements result from the clearly more complex, comprehensive and more accurate motion programs and the use of “trial and error” in optimizing the process. these requirements have of an even more negative influence if new programming or adaptions frequently become necessary [2]. the two main reasons for this can be an unfavourable ratio of batch size to the variety of modifications,, and also the occurrence of fluctuations in the process due to workpiece tolerances, as well as other errors in the upstream manufacturing process. 
the general aim of the intended r&d cooperation between smes oriented to automation and development, research institutes and manufacturing users is therefore to develop of manual, partly or fully automated procedures based on efficient program optimization of robot-aided grinding and polishing processes (“epo”). the required degree of automation of the procedures depends on how often optimizing work is needed. while manual intervention is sufficient for the initial programming and for occasional process malfunction, more frequent occur4 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 46 no. 1/2006 czech technical university in prague fig. 2: the manual grinding process fig. 3: robot aided grinding and polishing in the sanitary fitting industry rences require full automation.. the techniques to be developed for manual use differ considerably from those for fully automatic use. in manual procedures the focus lies on efficient interaction with the operator, whereas full automation requires the development and integration of a complex measuring method, data processing and process control. 3 development of an offline-programming system for small batch sizes and a wide variety of modifications one aspect of the project deals with process-specific further development of the offline-programming system – as approved in practise – in order to achieve greater efficiency in manual programming and optimization. present day systems are designed for universal use, and are similar to complex 3d-cad-systems in their layout and operation. processes that do not need an extra path or parameter optimization, such as palletizing, assembling or varnishing, can be programmed efficiently using these programs by highly-qualified engineers and technicians in the planning department. for the grinding and polishing processes, however, no appropriate tools are available directly at the robot cell for the optimization phase. the use of a conventional offline programming system in the vicinity of the workshops usually fails, because it is too complex for underqualified operators. there is a lack of process specific functions, and, as a result, there is a need for a suitable system to be developed. the intended system is directed at a target group that, due to small batch sizes and numerous modifications, must often make new programs or adjust their products to changed conditions. moreover, the methods and procedures to be developed can enable future uses beyond grinding and polishing, e.g. robot-aided milling and water torching. another aspect of the project deals with disturbing influences that “frequently” occur and must therefore be detected and compensated for automatically. while in the first aspect of the project the operator of the robot machine is of the centre of the decision-making and should be given pc-based decision guidance for a structured next step, and suitable tools for efficient program optimization, the skills for error detection and classification and also the deduction of parameter optimization strategies (see [1,2,3]) through measuring methods and process control must be performed fully automatically. (fig. 
4) 4 optimized automation through innovative robot systems the developments presented below aim at raising the tolerances of handling systems to changing conditions, and also their flexibility toward frequently changing workpieces in order to increase the reliability of the machines in the end and to expand the use of solutions involving industrial robots. the focus lies here on industrial robot-aided processes like grinding and polishing of complex free forming geometries with high demands on the optical quality of the resulting surface. these processes show a high degree of program generation with several hundred robot targets, and optimizing times in the range of weeks, as well as high sensitivity to differences in the starting qualities of the workpiece. in order to shorten the programming and optimizing times, the operator must have access to modern offline-programming procedures, taking into account the qualifications and experience usually available in industrial production. the acceptance of such systems will be raised by a stronger orientation to the process and greater integration of knowledge. thus, the operator will be able in the future to take over programming and optimizing tasks which until now have been carried out only by highly-specialized staff in the planning department, or which have been given up in favour of manual manufacturing. higher tolerance toward changing the starting qualities of the workpiece will be achieved by combining image processing measuring systems, grinding and polishing process models, adaptive control techniques and intelligent software components. a special challenge is posed in this context by the automation of “seeing and evaluating” processing errors on highly shiny surfaces, which are even difficult for the untrained human eye to detect. however this problem can be resolved with the help of special illumination. furthermore, errors in the workpiece material in the process chain of rough grinding, finish grinding and polishing can often be detected only after a part, or all, of the processing has been done. this results in greater cooperation among what are now single machines, which are only interlinked due to the material flow in order to enable complete or partial reworking of inadequate workpieces. © czech technical university publishing house http://ctn.cvut.cz/ap/ 5 czech technical university in prague acta polytechnica vol. 46 no. 1/2006 fig. 4: central aspects of the project to account for these problems, the following developments have been made: a) the development of a software system in the vicinity of the workshop for demanding robot processing applications such as grinding and polishing. this software system closes the gap between multi-functional, but complex offline-programming systems used in the planning department, on the one hand, and inefficient possibilities of robot control used by the operator for optimizing the program on the other hand. b) the development of a fully automatic working process chain for industrial robot-aided grinding and polishing that, on the basis of the measurements of an image processing system, modifies a given machining course in such 6 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 46 no. 1/2006 czech technical university in prague fig. 5: flow chart of the fully automated process chain fig. 6: user-orientated offline-programming and simulation system [5] a way that an optimum surface quality is achieved despite fluctuating starting conditions. 
if the required standard is not achieved, the component is rejected as scrap. (fig. 5) a software system has been developed for a workshop-lose programming robot systems without a workshop that has an intuitively operable graphic 3d-user surface and provides process-specific optimizing tools. an information-technical combination of interfaces of different offline-programming systems and robot control has already been realised. the software is supplemented by an adaptive consulting centre for the allocation of errors, causes of errors and compensation strategies and an internet connected process-know-how-database. fig. 6 shows how grinding paths/slideways are simply generated on the surface of the workpiece, which is then produced accordingly by the robot system. with the help of an image processing system and an error data base, fully automatic error detection and classification is implemented for geometrically and optically difficult (highly-polished) free forming parts (fig. 7). independent “intelligent” establishment of optimum error compensation is under preparation given the example of the grinding and polishing applications, to enable automatic compensation of detected surface errors specific generation of machining processes. parts of the automatic program selection,or automatic generation of a program for the handling system for reworking the detected errors have already been realized. an important consider is that the target contour and surface must be kept. in the course of the project presented here, a new generation of robot systems is originating that can process sensory feedback to surface errors, and can establish and carry out experience-based optimum error compensation strategies. 5 acknowledgment this research and development project is funded by the “bundesministerium für bildung und forschung“ (bmbf) within the framework of research for the production of tomorrow and supervised by the project supporter of the bmbf for the production and manufacturing technologies (pft) research centre in karlsruhe. references [1] kneupner, k., kuhlenkoetter, b., zhang, x.: “a new force distribution calculation model for high quality production processes.” international journal of advanced manufacturing technology, springer verlag, london (article accepted – to be published soon). [2] čabaravdić, m., kneupner, k., kuhlenkoetter, b.: ”methods for efficient optimisation of robot supported grinding and polishing processes.” international conference on “trends in the development of machinery and associated technology”, barcelona (spain), september 2003. [3] kreis, w., schueppstuhl, t., kneupner, k.: “den bandschleifprozess automatisieren – prozessplanung und – optimierung bei der bearbeitung von freiformflächen.” mo metalloberfläche, s. 12–15, 4/2000. [4] schueppstuhl, t.: beitrag zum bandschleifen komplexer freiformgeometrien mit dem industrieroboter. shaker verlag, aachen 2003. [5] kuhlenkoetter, b., schueppstuhl, t.: “vollautomatisierung durch innovative robotersysteme. in: vdi-berichte 1892.2, mechatronik 2005, innovative produktentwicklung, vdi verlag, düsseldorf 2005. dr.-ing. bernd kuhlenkoetter phone: 0049 231 755 5611 fax: 0049 231 755 5616 e-mail: bernd.kuhlenkoetter@udo.edu www.irf.de university of dortmund robotics research institute otto-hahn-str. 8 44221 dortmund, germany © czech technical university publishing house http://ctn.cvut.cz/ap/ 7 czech technical university in prague acta polytechnica vol. 46 no. 1/2006 fig. 
7: integrated image processing systems and error detection ap05_5.vp 1 introduction the preservation of artefacts, monuments, and archaeological sites and finds frequently requires scientific analysis of cultural materials or testing of specific properties of chemical treatments in order to document their historic evidence, clarify deterioration processes, and specify conservation treatments. analyses of the variety of materials constituting our cultural heritage require the expertise of many various scientific methods. x-ray fluorescence analysis, as one of them, is a non-destructive analytical technique used to determine the elemental composition of a sample. 2 basic pronciple of x-ray fluorescence when a primary � or x-ray excitation beam from a radioactive source or from an x-ray tube strikes a sample, the photons can either be absorbed by the atom or scattered through the material. the process in which a photon is absorbed by the atom by transferring all of its energy to an innermost electron is called the photoeffect. during this process, if the primary photon has sufficient energy, electrons are ejected from the inner shells, creating vacancies. these vacancies present an unstable condition for the atom. when this atom returns to its stable state, an electron from the outer shell is transferred to the inner shell and in this process a characteristic x-ray is emitted, whose energy is equal to the difference between the two binding energies of the electrons in the corresponding shells. each element has a unique set of energy levels each element produces x-rays with a unique set of energies allowing one to measure the elemental composition of the sample. between the proton number z of the atom emitting characteristic radiation and the energy, or the wavelength of the radiation, the following relationship applies e k b� �( )z 2, where e is the energy of the photons corresponding to the transfer between two specific shells, and k and b are constants. the process of emission of the characteristic x-rays is called x-ray fluorescence (xrf), and the analytical method using xrf is called x-ray fluorescence analysis (xrfa) [1]. 3 analysis of fresco pigments the x-ray fluorescence equipment was built and is now operated in the laboratory of quantitative methods in the research of ancient monuments at fnspe. for the in-situ measurements we built an xrf analyser with changeable radionuclide sources in the measuring head and with an si(li) detector (see fig. 1). the radioisotope sources 55fe, 238pu, and 241am, are used. 55fe enables the excitation of elements with low z up to 23, 238pu is used for the excitation of 48 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 45 no. 5/2005 czech technical university in prague application of x-ray fluorescence analysis in investigations of historical monuments t. čechák, l. musílek, t. trojek, i. kopecká nuclear techniques and other techniques using ionising radiation represent a valuable tool in non-destructive diagnostics applied to archaeological finds and objects of arts, namely for determining the composition of materials used in the production of artefacts. x-ray fluorescence analysis, both in its energy form and in its wave dispersive form, is one of the most widespread methods using ionising radiation to study the elemental composition of materials. it is frequently used for studies of various cultural and historic relicts and objects of art. 
3 Analysis of fresco pigments

The X-ray fluorescence equipment was built and is now operated in the Laboratory of Quantitative Methods in the Research of Ancient Monuments at FNSPE. For the in-situ measurements we built an XRF analyser with changeable radionuclide sources in the measuring head and with a Si(Li) detector (see Fig. 1). The radioisotope sources 55Fe, 238Pu and 241Am are used. 55Fe enables the excitation of elements with low Z up to 23, 238Pu is used for the excitation of elements with Z from 20 up to 39, and 241Am is used for the excitation of the K-shell electrons of elements with higher Z, up to 68.

Fig. 1: Si(Li) detection equipment during the measurement of a fresco

The Si(Li) detector is cooled by liquid nitrogen from a 5 l Dewar vessel. For special purposes, a 2 l Dewar vessel is also available. These small Dewar vessels and the portable multichannel analyser enable in-situ measurements. The collimator system of the exciting radiation makes it possible to select the irradiated area. Our spectrometer can be used for area mapping or line scanning.

Pigment compositions on fresco paintings vary with the locality and historic period. White pigments, for example, can be produced with Pb, Zn and Ti oxides, but ZnO was not produced before 1870, and titanium white was not used before 1920 [2]. Therefore, if either Zn or Ti is found in the white areas of fresco paintings supposed to be from the Renaissance period, those paintings were either restored or they are a forgery. A more subtle study of several works of a particular artist, determining the pigments characteristically used by him or by his disciples, would reveal frescos that are particularly typical for the artist and, possibly, for the times. Differences between countries or areas and time periods would most certainly be revealed, and differences between the works of fresco painters in the same place and the same time period might show up, too.

The overall results of XRFA on frescos are multielement spectra indicating the main elements of the pigments that are present. Usually, this step makes it possible to identify the inorganic pigments under consideration. In many cases useful details about minor and trace elements can be obtained [3].

Several measurements have been carried out directly in the field in order to verify the method and obtain information about Gothic fresco paintings. The valuable fresco paintings from Žirovnice Castle were investigated in this way. Surprisingly many pigments were used on these frescos, which was not typical of the Bohemian region at the end of the 15th century. For all basic paints, a few pigments and combinations of pigments were used. Local common pigments, e.g., green earth and yellow ochre, were used together with expensive imported pigments such as vermilion, saturn red and azurite, and with pigments which were used very seldom on fresco paintings, e.g., antimonate yellow and manganese brown.

Our investigation of the fresco paintings at Karlštejn Castle aims to date the particular parts of the frescos restored in the 19th century, during the reconstruction of the castle. Analysis of the XRF spectra of the pigments can give information about the type of pigments. Fourteenth-century painters used other types of pigments than those used in the 19th century, and XRFA enables us to differentiate the mediaeval and the new parts of the fresco.
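The dating argument used above lends itself to a simple automated check. The following sketch is hypothetical — the element-to-date table contains only the two datable pigments mentioned in the text (ZnO after about 1870, titanium white after about 1920 [2]) — but it illustrates how detected elements can flag a restored or forged area.

```python
# Earliest plausible years of use for elements tied to datable pigments;
# only the Zn and Ti entries come from the text, the structure is ours.
EARLIEST_USE = {"Zn": 1870, "Ti": 1920}

def anachronisms(detected_elements, claimed_year):
    """Return the detected elements whose earliest known use postdates
    the year claimed for the painting."""
    return [el for el in sorted(detected_elements)
            if EARLIEST_USE.get(el, 0) > claimed_year]

# A white area of a supposedly Renaissance fresco (claimed ~1500):
print(anachronisms({"Pb", "Ca", "Zn"}, 1500))  # ['Zn'] -> restored or forged
```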
The red and the black pigment were used as markers of the mediaeval part of the fresco, see Fig. 2. The red pigment used in the 14th century was a mixture of vermilion, saturn red and red ochre. Red ochre with Chinese white was used as a red pigment in the 19th century. A mixture of Mars black with saturn red and verdigris was used for tinting the black pigment in the 14th century. This pigment differs from the black pigment used in the 19th century (Mars black with traces of Zn) [4, 5].

Fig. 2: XRF spectra of the red pigments in the fresco at Karlštejn Castle (counts versus energy in keV, logarithmic scale; spectrum KT3 shows the Kα lines of S, Ca and Fe and the Lα lines of Hg and Pb, spectrum KT6 the K lines of Ca, Fe and Zn). The red pigment used in the 14th century was a mixture of vermilion, saturn red and red ochre (KT3). Red ochre with Chinese white was used as a red pigment in the 19th century (KT6).

4 Analysis of layered structures

XRF facilities can also measure the thickness of a coating layer and perform a composition analysis of it. The covering layer on certain areas can in many cases have a more or less constant thickness. The aim of this type of layer analysis is to separate qualitatively the presence of pigments from the covering layer and to identify the pigments and their position in the substrate layer by means of a non-destructive technique [6].

The basic principle of the measurement of the thickness $h$ of a coating deposited on a substrate material is given by the following equation. For a given excitation spectrum, the intensity of the fluorescence radiation emitted from the substrate layer and measured by a detector depends on the angle of excitation $\varphi_i$ and the angle of detection $\varphi_f$, with $h$ as a parameter. In fact, the total fluorescence emitted by the substrate depends on the thickness of the substrate layer; it is now assumed that this thickness is saturated, i.e. it is so large that the signal does not depend on it. The intensity of the measured fluorescence emitted from the substrate can easily be calculated for a simple model defined by the following conditions: a single-element coating deposit and a single-element substrate, monochromatic excitation at energy $E_0$, a flat and smooth surface area, a substrate of infinite thickness, and only primary fluorescence are considered. Under these assumptions, the fluorescence intensity $I_{sf}$ measured by the detector is given by
$$I_{sf} = \frac{I_0\,\Omega}{4\pi}\,\sigma\,\varepsilon(E_s)\,\tau_s(E_0)\, \frac{\exp\left[-\rho_c h \left(\dfrac{\mu_c(E_0)}{\sin\varphi_i} + \dfrac{\mu_c(E_s)}{\sin\varphi_f}\right)\right]} {\dfrac{\mu_s(E_0)}{\sin\varphi_i} + \dfrac{\mu_s(E_s)}{\sin\varphi_f}},$$
where $I_0$ is the intensity of the incident radiation, and $\sigma$ is the excitation factor for the respective X-ray line of the substrate. When considering a K line, $\sigma$ is the product of the absorption jump factor for the creation of a K vacancy, the fluorescence yield for the decay of the K vacancy, and the relative emission rate of the specific K line with respect to the other K lines.
$\tau_s(E_0)$ is the photoelectric absorption coefficient of the substrate at energy $E_0$; $\Omega/4\pi$ is the fraction of the emitted photons collected within the detector solid angle; $\varepsilon(E_s)$ is the detection efficiency at the fluorescence energy of the substrate $E_s$; $\rho_s$ is the density of the substrate; $\rho_c$ is the density of the coating; $\mu_s(E_0)$ and $\mu_s(E_s)$ are the mass absorption coefficients of the substrate at energies $E_0$ and $E_s$, respectively; and $\mu_c(E_0)$ and $\mu_c(E_s)$ are the mass absorption coefficients of the coating at energies $E_0$ and $E_s$, respectively [6]. In principle, knowing the coating density, it is possible for this very simple model to determine analytically also the coating thickness $h$ by using the measured intensities. From the model, we can define the energy of the primary radiation that is necessary for irradiating the atoms in the substrate layer, and the $Z$ of the atoms whose characteristic radiation can be detected.

Transmission of characteristic (Kα and Kβ, Lα and Lβ) radiation through a layer of painting material will result in a modification of the intensity ratio $I_\alpha/I_\beta$. This effect becomes obvious if the attenuation is sufficiently different for the α and β components of the excited radiation. Maximum $I_\alpha/I_\beta$ modification is anticipated for selective absorption, i.e., if the element of strongest absorption in the pigment (Hg in cinnabar and Pb in lead white) has an absorption edge just between the Kα and Kβ or Lα and Lβ energies of the penetrating radiation [7].

In order to search for pigment combinations, i.e. top layer and subjacent layer, where layer-position studies via the Kα/Kβ or Lα/Lβ intensity ratios are practicable, estimations on the basis of single elements are very useful. For this, the modification of the intensity ratio $I_\alpha/I_\beta$ is expressed by a gain factor for the characteristic element of the pigment:
$$g = \frac{I_\alpha(e,a)/I_\beta(e,a)}{I_\alpha(e)/I_\beta(e)},$$
where $(e,a)$ represents the emitter–absorber layer system and $(e)$ denotes the emitter alone, without the attenuating top layer. It is well known that this gain factor,
$$g(s) = \exp(s\,\rho\,d),$$
is a function of the absorption selectivity
$$s = \left(\frac{\mu}{\rho}\right)_\beta - \left(\frac{\mu}{\rho}\right)_\alpha,$$
where $(\mu/\rho)_\alpha$ and $(\mu/\rho)_\beta$ denote the mass attenuation coefficients for the considered system, and $\rho$ and $d$ represent the density and the thickness of the top material, respectively [8].

Fig. 3: Geometry of the experiment

For a realistic situation, the two models mentioned above are too simple. The Monte Carlo method must be used for calculating the intensities of the K and L lines of the characteristic radiation from the subjacent layer. The MCNP-4C code is very useful for this calculation [9], and this code is available and used in our laboratory.

In the Franciscan monastery in Kadaň, an extremely precious Gothic fresco was found. Unfortunately, this fresco was superimposed by a new layer of fresco in the Renaissance period. The Renaissance fresco is also valuable, and removing it would be very hazardous and expensive. XRFA can help here to investigate the subjacent layer of the Gothic fresco.
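For the single-element estimate, the gain-factor relation can be inverted to give the thickness of the covering layer: $d = \ln g /(s\rho)$. The sketch below assumes illustrative placeholder values for the mass attenuation coefficients and the density; it is not a computation with tabulated data.

```python
import math

def layer_thickness_cm(mu_rho_beta, mu_rho_alpha, rho, g_measured):
    """Top-layer thickness d from g(s) = exp(s * rho * d), i.e.
    d = ln(g) / (s * rho), with s = (mu/rho)_beta - (mu/rho)_alpha."""
    s = mu_rho_beta - mu_rho_alpha      # absorption selectivity, cm^2/g
    return math.log(g_measured) / (s * rho)

# Placeholder inputs: (mu/rho)_beta = 60 and (mu/rho)_alpha = 40 cm^2/g,
# rho = 3 g/cm^3, measured gain factor g = 1.25 (all assumed values).
d = layer_thickness_cm(60.0, 40.0, 3.0, 1.25)
print(f"d = {d * 1e4:.0f} um")  # ~37 um
```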
5 Conclusions

For the elemental analysis of the pigments used in fresco paintings, XRFA is the preferred technique for routine work. This is because of the mobility of the equipment, which ensures that the object can be kept in its stationary position during the measurement. Compared to other techniques, XRFA has the advantage of being non-destructive, multi-elemental, fast, and cost-effective. Thanks to advances in the miniaturization of electronics, detectors, cooling equipment and X-ray tubes, it is possible to build and use portable equipment for XRFA. XRFA, along with thermoluminescence dating, will remain an important technique used in the Laboratory of Quantitative Methods in the Research of Ancient Monuments at FNSPE. In future research work, the use of small X-ray tubes is also planned for studies of fresco paintings. The use of X-ray tube beams can increase the area resolution of the method. Better collimation of the beam enables the application of more precise software for quantitative analysis. Thanks to fruitful collaboration with the National Institute for Monument Care, it is possible to compare the XRFA results with the results obtained by other methods, e.g. electron microscopy; and, last but not least, collaboration with the PIXE laboratory of FNSPE also gives some interesting results.

Pigment analysis is also an extremely important aid in restoration, since it can help to distinguish the original sections of a painting from the restored or later added sections. Thus the pigment characterisation may be very important in making a decision whether to remove spurious layers, or for choosing the most closely matching process for restoration. The main reason for analysing pigments in frescos or in manuscripts is for the purposes of conservation. Depending on their nature, pigments may be sensitive to light, humidity, gaseous atmospheric pollutants, or heat, which requires specific storage or display conditions. Additionally, we may want to identify the pigments before applying some chemical or other treatment aimed at reversing, or at least putting a stop to, the deterioration process. Characterisation of the pigments may also help in assigning a probable date to the painting, in reconstructing its restoration and conservation history, and in detecting forgeries.

References
[1] Van Grieken, R. E., Markowicz, A. A. (eds.): Handbook of X-ray Spectrometry. 2nd edition, Marcel Dekker, New York 2002.
[2] Šimunková, E., Bayerová, T.: Pigmenty. Společnost pro technologie ochrany památek, STOP, Praha 1999.
[3] Frankel, R.: "Detection of art forgeries by X-ray-fluorescence spectroscopy." In: Isotopes and Radiation Technology, Vol. 8 (1970), No. 1.
[4] Čechák, T., Gerndt, J., Musílek, L., Kopecká, I.: "Application of X-ray fluorescence for analysis of fresco paintings." In: European Conference on Energy Dispersive X-ray Spectrometry. Krakow: Fast PLK, 2000, Vol. 1, p. 167–169.
[5] Musílek, L., Čechák, T., Kopecká, I.: "X-ray fluorescence in research on the cultural heritage." In: 1st iTRS International Symposium on Radiation Safety and Detection Technology. Seoul: Korean Nuclear Society, 2001, Vol. 1, p. 228–234.
[6] Fiorini, C. et al.: "Determination of the thickness of coatings by means of a new XRF spectrometer." X-Ray Spectrometry, Vol. 31 (2002), p. 92–99.
[7] Mantler, M., Schreiner, M.: "X-ray fluorescence spectrometry in art and archaeology." X-Ray Spectrometry, Vol. 29 (2000), p. 3–17.
[8] Neelmeijer, C. et al.: "Paintings – a challenge for XRF and PIXE analysis." X-Ray Spectrometry, Vol. 29 (2000), p. 101–110.
[9] MCNP-4C Manual, http://www.jlab.org/~semenov/rlinks/soft.html

Prof. Ing. Tomáš Čechák, CSc.
phone: +420 222 314 132
fax: +420 224 811 074
e-mail: cechak@fjfi.cvut.cz
Prof. Ing. Ladislav Musílek, CSc.
phone: +420 224 358 247
fax: +420 224 811 074
e-mail: musilek@fjfi.cvut.cz

Ing. Tomáš Trojek
phone: +420 224 358 242
fax: +420 224 811 074
e-mail: tomas.trojek@fjfi.cvut.cz

Czech Technical University in Prague
Faculty of Nuclear Sciences and Physical Engineering
Břehová 7, 115 19 Prague 1, Czech Republic

Ing. Ivana Kopecká, CSc.
phone: +420 224 213 813
fax: +420 224 232 025
e-mail: kopecka@praha.npu.cz
Národní památkový ústav
Valdštejnské nám. 3, 118 01 Prague 1, Czech Republic

Magnetic technique for nondestructive evaluation of residual stresses

B. A. Belyaev, A. A. Leksikov, S. G. Ovchinnikov, I. Kraus, A. S. Parshin

A technique has been designed for measuring the planar components of stray fields from ferromagnetic samples placed in a constant magnetizing field. The technique is based on recording the field of magnetization reversal of a thin magnetic film with a small coercive force, which serves as the sensing device of a microwave detector. The possibility of measuring the deformation inhomogeneities caused by mechanical treatment when manufacturing products from ferromagnetic materials is demonstrated. The results of the magnetic measurements agree with the data from X-ray diffraction analysis.

Keywords: ferromagnetic resonance, thin magnetic films, microwave scanning spectrometer, residual stress.

1 Introduction

As is known, during the mechanical treatment of materials, elastic and plastic deformations arise, which may have an essential influence on the lifetime of products. In particular, specially created deformations on the surfaces of metal details increase their strength, operational reliability and lifetime [1]. Therefore the development of a technique and apparatus for the nondestructive evaluation of deformations [2] is an important problem at the present time. Measurements of the distribution of near-surface deformations over the area of details, and also of the elastic stresses existing in them, are of great importance, as they significantly simplify the search for optimum conditions of treatment for each specific detail. This paper shows in principle how the inhomogeneous elastic stresses in a sample of ferromagnetic material can be measured. The measurement technique is based on recording the inhomogeneities of stray fields near the surface of a sample placed in a homogeneous magnetic field $H_0$, the magnitude of which is much lower than the field of magnetic saturation. The stray fields are determined by the magnetic domain structure of each local part of the sample, which depends on a set of factors including elastic stresses. In particular, elastic stresses are known [3] to have a dramatic effect on the magnetic anisotropy of a material. Therefore the equilibrium orientation of the magnetic moments in domains on any local part of an investigated sample depends on the magnitude and direction of the elastic stresses in this part.

2 Measurement technique

The stray fields are measured with the help of an original microwave detector [4] whose sensing element is a thin magnetic film (TMF) having uniaxial magnetic anisotropy. The detector (Fig. 1) functions on the basis of the measuring head of a scanning automated spectrometer of ferromagnetic resonance [5]. The transistor microwave generator and the detector are placed in a nonmagnetic metal body. The driver circuit of the generator is a microstrip resonator (MSR). The measuring hole is etched in the screen of the MSR and is the local source of the microwave magnetic field. This hole, with a diameter of ~1 mm, is covered by a magnetic film in such a way that the easy-magnetization axis of the TMF is orthogonal to the direction of the microwave magnetic field. To decrease the noise caused by magnetic inhomogeneities at the edges of the TMF, the dimensions of the film must be a little bigger than the diameter of the measuring hole.

Fig. 1: Layout of the MW detector (TMF, sample, measuring hole, MSR, microwave generator, microwave detector, signal and power leads)

The operation of the detector is based on measuring the magnetization-reversal field of the TMF, which is directed along the easy-magnetization axis. In the absence of a ferromagnetic sample this field will obviously coincide with the coercive force $H_c$ of the magnetic film area that closes the measuring hole.
As the detector film we used a permalloy film with a thickness of ~0.1 µm and with $H_c < 1$ Oe, whose magnetization over the measuring-hole area reverses by a single Barkhausen jump. The inversion of the magnetic moment vector associated with the film magnetization reversal gives a jump change of the FMR signal polarity at the field $H_c$ (Fig. 2). Thus $H_c$ is determined by the position of the signal jump in the recorded spectrum. The scanning spectrometer of the ferromagnetic resonance can measure $H_c$ with high precision, 0.01 Oe [6], due to the digital sweep of the constant magnetic field and to the lock-in detection of a signal at the frequency of the modulating magnetic field. It is very important that the low-frequency modulating magnetic field with a frequency of 1 kHz present in the spectrometer acts as "a magnetic shake-up", due to which the reproducibility of the results of repeated measurements is ensured.

The signal connected to the film magnetization reversal depends essentially on the microwave pumping frequency of the detector [6], and at a fixed pumping frequency it depends on the magnetic parameters of the film, and also on the value of the anisotropy field. The FMR curves of two permalloy films with different uniaxial magnetic anisotropy and coercive force are shown in Fig. 2. The pumping frequency was equal to 721 MHz. As research has shown [7], the greatest signals are observed when the measured field $H_c$ falls in the medial part of the slopes of the ferromagnetic resonance curve of the TMF. The optimum pumping frequencies are easy to determine from the equation
$$\omega_{1,2} = \gamma\sqrt{4\pi M_s\,(H_k \pm H_c \pm \Delta H)},$$
where $\omega$ is the circular frequency of the MW pumping, $\gamma$ the gyromagnetic ratio, $M_s$ the saturation magnetization, $H_k$ the anisotropy field, and $\Delta H$ the ferromagnetic resonance linewidth. For permalloy films the optimum frequencies fall in the 200–1500 MHz band.
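A quick numerical check of the band quoted above can be made with the reconstructed relation, taking typical permalloy figures (4πM_s ≈ 10 kG, γ/2π ≈ 2.8 MHz/Oe) and assumed values of H_k, H_c and ΔH; none of these numbers, apart from the H_k values of the two films in Fig. 2, are specified in the paper.

```python
import math

gamma_over_2pi = 2.8e6      # Hz/Oe, electron gyromagnetic ratio (assumed)
four_pi_Ms = 1.0e4          # G, typical for permalloy (assumed)
Hk, Hc, dH = 4.5, 0.5, 3.0  # Oe; Hk as in Fig. 2, Hc and dH assumed

# Reconstructed relation: omega_{1,2} = gamma*sqrt(4*pi*Ms*(Hk +/- Hc +/- dH))
f_low = gamma_over_2pi * math.sqrt(four_pi_Ms * (Hk - Hc - dH))
f_high = gamma_over_2pi * math.sqrt(four_pi_Ms * (Hk + Hc + dH))
print(f"optimum pumping band: {f_low/1e6:.0f}-{f_high/1e6:.0f} MHz")  # ~280-790
```

The result falls inside the 200–1500 MHz band stated in the text.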
It is natural that the investigated ferromagnetic sample strongly perturbs the field homogeneity by its own field $H_b$ (Fig. 3), which is directed against the homogeneous magnetizing field $H_0$ created by the Helmholtz coils. Therefore, during the sweep of the constant field $H_0$, the TMF reverses its magnetization at some field $H_0 = H_d$ which is much greater than the coercive force of the film $H_c$; at the point of measurement, $H_c = H_d - H_b$. As a result, the dependence of the TMF magnetization-reversal field $H_d$ on the position of the detector above an investigated sample, for example along the line shown by the points in Fig. 3, reflects the corresponding dependence of the stray field $H_b$, increased by the value $H_c$. Taking into account that $H_c$, as a rule, is much lower than $H_d$, the magnetic film in the detector can be considered as a null-indicator showing the condition of equality between the external magnetic field created by the Helmholtz coils and the tangential component of the sample stray field. It should be noted that the magnetic film of the detector reverses its magnetization in a magnetic field equal to $H_c$ averaged over the area of the TMF, and that the film practically does not perturb the stray fields of the sample, due to its small volume.

Fig. 2: FMR curves of permalloy films having different fields of uniaxial magnetic anisotropy: 1) $H_k = 1.5$ Oe, 2) $H_k = 4.5$ Oe. Solid curves correspond to sweeping the magnetic field along the initial magnetizing direction; dots, to the opposite sweeping.

Fig. 3: The investigated sample in a uniform magnetic field

With the help of a special adjusting screw on the device, the spacing between the magnetic film plane of the indicator and the investigated sample surface may be regulated over a wide range, between 0.02 mm and 5 mm. The minimum spacing not only ensures maximum locality of the measurements, but also allows the stray fields to be registered mainly from the thin near-surface layer of the sample. As the spacing increases, the thickness of the near-surface layer forming the integrated stray fields grows, and the locality of the measurements is reduced. Thus all the results shown below were obtained with the minimum spacing.

3 Experimental results

To test the described technique for measuring stray fields, some high-quality steel disks were manufactured, 48 mm in diameter and 4 mm in thickness. After manufacturing, the disks were annealed to eliminate the residual stresses in them, and then, on one side of each disk, half of its thickness was removed by grinding with either one entry or two entries. Samples manufactured in such a way were placed in the centre of the stage of the scanning FMR spectrometer [5] (see Fig. 1). The typical behaviour of the local stray fields when the magnetizing field was increased is shown in Fig. 4. The TMF magnetization-reversal field $H_d$ in this experiment was registered on a demagnetized sample with reverse sweeping of the magnetizing field from the set value $H_0$ down to zero. The FMR spectra were recorded starting from a magnetic field of 1 Oe, and after each pass the field $H_0$ was incremented by 1 Oe. As a result, the investigated sample was gradually magnetized more and more, and, accordingly, its stray field, which, as already mentioned, is directed against the magnetizing field, increased. The coercive force of the magnetic film used in the detector was $H_c = 0.15$ Oe. Therefore the appearance of the TMF magnetization-reversal signal of the detector only from the field $H_0 = 21$ Oe onwards demonstrates that, up to this critical field, the difference between the external field $H_0$ and the sample stray field $H_b$ remained less than 0.15 Oe.
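Since the film acts as a null-indicator, converting a map of measured reversal fields into a stray-field map is a single subtraction, H_b = H_d − H_c. The array values below are invented for illustration; only the relation itself and H_c = 0.15 Oe come from the text.

```python
import numpy as np

H_c = 0.15  # Oe, coercive force of the sensor film (from the text)
# Hypothetical 2x2 map of measured TMF reversal fields, Oe:
H_d = np.array([[21.3, 22.1],
                [20.8, 21.6]])

H_b = H_d - H_c  # Oe, tangential stray-field component at each spot
print(H_b)
```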
The apparent discrete change of $H_d$ with the growth of $H_0$ allows us to assume that the magnetic domains in the demagnetized sample do not penetrate right through it, and that the domain structure is multi-layered. With the growth of the magnetizing field, this domain structure is reconstructed at particular critical fields by jumps, similar to Barkhausen jumps, that become progressively simpler. At a further increase of $H_0$ the amplitude of these jumps diminishes, and the dependence $H_d(H_0)$ gradually becomes monotonic. Experiments on many samples have shown that the number of the jumps, their amplitude $H_d$, and also their position depend not only on the sample material, but also on the selected local area and, mainly, on the direction of magnetization of the sample relative to the direction of its grinding during manufacture. This demonstrates that the set of magnetization curves recorded at local spots across the whole area of the sample contains information about the distribution of the plastic and elastic deformations formed in the sample during its mechanical treatment.

Similar behaviour of $H_d(H_0)$ is observed during a direct magnetic-field sweep, but after first magnetizing the sample by a field larger than the critical one. Depending on the strength of the magnetizing field, the value of the TMF magnetization-reversal field on the direct sweep course, $H_d^{\rightarrow}$, may be either larger or smaller than the magnetization-reversal field of the film on the reverse sweep course, $H_d^{\leftarrow}$ (Fig. 5). This fact is explained as follows. The field $H_d$ measured on the direct sweep saturates, with the increase of the magnetizing field, much earlier than the same field measured on the reverse sweep. As a result, for "small" magnetizing fields the difference $\Delta H_d = H_d^{\rightarrow} - H_d^{\leftarrow}$ is negative, while for "large" fields it is positive; and obviously there exists a magnitude of the magnetizing field at which $\Delta H_d = 0$. The observed difference of the TMF magnetization-reversal fields for the direct and the inverse sweep reflects the real hysteresis of the stray fields in the measured local area. The hysteresis magnitude for a fixed magnetizing field depends not only on the selected local area, but also on the angle between the magnetization direction and the direction of treatment of the sample. Therefore the hysteresis measured on local areas of the sample surface also reflects the distribution of the plastic and elastic deformations acquired by the material during its mechanical treatment. For the rapid diagnostics of samples, the program controlling the FMR spectrometer provides automatic evaluation and records in a file not only the TMF magnetization-reversal fields, but also the hysteresis magnitude.

It is important to note that the pattern of the distribution of the inhomogeneities of the hysteresis $\Delta H_d$ over the sample surface, and also the pattern of the inhomogeneities of the magnetization-reversal field $H_d$, depend on the magnitude of the magnetizing field selected in the experiment. Using a "weak" magnetizing field ($H_0 < 50$ Oe), when $\Delta H_d < 0$, on a background of "long-wave" inhomogeneities with a reference size of about the sample size, we can clearly observe "short-wave" inhomogeneities that are much smaller than the sample. Using "strong" fields ($H_0 > 100$ Oe), when $\Delta H_d > 0$, only the "long-wave" inhomogeneities remain apparent, and the "short-wave" inhomogeneities level out. It is possible that the stray fields of these "short-wave" inhomogeneities are connected to stresses reflecting the "frozen" standing-wave pattern of the elastic oscillations in the sample, produced by blank vibration during manufacture.
In the patterns of the "short-wave" inhomogeneities of the stray-field distribution on the disk samples, we can observe a symmetry that often has a radial structure. Fig. 6 shows the stray-field distribution for a sample manufactured by two-entry grinding along the y axis. The spectra were recorded with a reverse sweep from the field $H_0 = 20$ Oe, directed along the grinding line. Note that the pattern of inhomogeneities remains on the whole the same when the sample is magnetized in the direction perpendicular to the grinding line, but the number of apparent extremes varies.

Fig. 7 shows the stray-field distributions obtained for magnetizing along the grinding line ($H_{dy}$) and perpendicular to the grinding line ($H_{dx}$) for a sample manufactured by one-entry grinding in the direction y. The dependences were obtained for a reverse field sweep from $H_0 = 200$ Oe, so the "short-wave" inhomogeneities show up weakly in them. It can be seen that these distributions differ essentially from each other, but they have good symmetry with respect to the y axis. Fig. 7 also shows the dependence of the difference of these fields, $\Delta H = H_{dy} - H_{dx}$, in which a "dent" is visible on the part where the tool enters the sample as it is being ground. A similar feature is also seen in the hysteresis distribution $\Delta H_d$ (see Fig. 7); it is visible independently of the direction of magnetization of the sample. As expected, the distribution patterns of the stray fields for all samples obtained by two-entry grinding have rather good symmetry not only with respect to the y axis, but also with respect to the x axis.

Fig. 4: Dependence of the TMF magnetization-reversal field on the value of the magnetizing field applied along the direction of the sample treatment

Fig. 5: Typical FMR signals recorded with the opposite course of the magnetic-field sweep from $H_0 = 100$, 200 and 300 Oe. Dots display the signal obtained in the direct course of sweeping.

Fig. 6: Distribution of the magnetization-reversal field of the TMF in the detector across the surface of a sample manufactured by two-entry grinding

Fig. 7: Distributions of the magnetization-reversal field of the TMF in the detector in the central area of a sample manufactured by one-entry grinding; $H_{dy}$ corresponds to magnetizing along the grinding direction and $H_{dx}$ to magnetizing in the perpendicular direction; $\Delta H$ is the distribution of the difference of these fields; $\Delta H_d$ is the distribution of the hysteresis.

For the "magnetic" diagnostics of cold-hardening in ferromagnetic samples, helpful information can be obtained from the angular dependences of the stray fields measured for a sample previously magnetized along any chosen direction. Fig. 8 shows a typical dependence of the TMF magnetization-reversal field on the direction of the sweeping field when the detector was placed above the central segment of the sample. The sample had been magnetized in a field of 100 Oe along the grinding direction. The spectra were recorded in an inverse sweep of the magnetic field from $H_0 = 20$ Oe.
This field was much lower than the field in which the sample had been magnetized and, as the experiment has shown, it does not in practice destroy the sample domain structure formed after the magnetization. Therefore the value of the field $H_d$ after rotating the sample by 360° remains almost the same as the initial value at $\varphi = 0°$. Such an angular dependence of the stray fields demonstrates the presence of unidirectional anisotropy in the sample. Fig. 8 shows that the region of the stray-field maximum occupies a rather wide interval of angles, between 0° and 100°. This indicates that the direction of the resulting magnetic moment in the measured area of the sample deviates strongly from the direction of the initial magnetizing field. This is clearly due to the local magnetic anisotropy of the sample which, as already mentioned, depends directly on the direction and magnitude of the elastic stresses in the given local area.

Fig. 9 shows measurements of the stray fields along two diameters: along (black points) and perpendicular to (white points) the direction of treatment of a sample obtained by two-entry grinding along the y axis. The dependences were measured with a step of 0.5 mm for a reverse sweep of the constant magnetic field from the value $H_0 = 20$ Oe. The constant magnetic field was directed along the line of grinding. For this sample, measurements of the distribution of the elastic stresses were also made using the X-ray diffraction method [8]. This method needs much time, so the measurements were carried out with a larger step of 2 mm. The elastic-stress dependences obtained with the X-ray procedure conform qualitatively to the behaviour of the stray fields. Like $H_d(x)$ and $H_d(y)$ (see Fig. 9), they have a minimum in the centre of the sample and two maxima, one on each side. However, on approaching the edges of the sample, the field $H_d$ diminishes slightly faster than the apparent decrease of the elastic stresses. It is easy to explain such a discrepancy by the influence of the "magnetic charges" existing on the sample edges, which can greatly reduce the measured field of the TMF magnetization reversal. The "magnetic charges" are especially large on the disk edges orthogonal to the direction of the magnetizing field; therefore the dependence of $H_d$ is stronger on the y axis than on the x axis. In principle, the "magnetic charges" can be removed from the measurement area of the sample.

4 Conclusion

It has been shown that cold-hardening diagnostics in ferromagnetic samples can be performed with the help of a scanning ferromagnetic resonance spectrometer. Our original procedure for measuring local stray fields with the use of a thin magnetic film has a high sensitivity. It allows us to record inhomogeneities of the elastic stresses arising in a sample during mechanical treatment, and in this way to record cold-hardening indirectly. At any selected point above the sample, a measurement is made of the external magnetic field whose magnitude compensates the sample stray field at this point. Thus the thin magnetic film included in the measuring microwave head is a null-indicator, and the compensation point can be fixed. Our research has also shown that the distribution of cold-hardening across the sample surface can be registered not only by the inhomogeneities of the stray fields, but also by the hysteresis of these fields. In addition, the local magnetization curves and the angular dependences of the stray fields can be used as signals.
It is possible that, in order to display the most complete and adequate pattern of the plastic and elastic deformations in a sample, the whole complex of all these measurements will be needed. The high speed and low cost of obtaining information about elastic stresses in ferromagnetic materials make our technique promising: it should be useful for rapid analyses aimed at improving the conditions of manufacture of details.

Fig. 8: Angular dependence of the magnetization-reversal field in the central area of the sample magnetized along the line of treatment

Fig. 9: TMF magnetization-reversal fields along two orthogonal directions: along the line of grinding (black points) and transversely to the line of grinding (white points)

References
[1] Odintsov, L. G.: Uprochnenie i otdelka detaley poverkhnostnym plasticheskim deformirovaniem. Moskva, Mashinostroenie, 1987.
[2] Aseyev, N. V., Dudkina, N. G., Parshev, S. N., Fedorov, A. F.: "Metodika opredelenia velichiny uprochnenia materiala pri poverkhnostnom plasticheskom deformirovanii." Zavodskaya Laboratoriya. Diagnostika Materialov, Vol. 61 (1995), No. 7, p. 19.
[3] Vonsovsky, S. V.: Magnetizm. Moskva, Nauka, 1971.
[4] Belyaev, B. A., Leksikov, A. A., Makievskii, I. Ya., Ovchinnikov, S. G.: Russian patent No. 2160441.
[5] Belyaev, B. A., Leksikov, A. A., Makievskii, I. Ya., Tyurnev, V. V.: "Ferromagnetic resonance spectrometer." Instruments and Experimental Technique, Vol. 40 (1997), No. 3, p. 390.
[6] Belyaev, B. A., Izotov, A. V., Leksikov, A. A.: "Skaniruyushchiy spektrometer ferromagnitnogo rezonansa dlya diagnostiki kharakteristik tonkikh magnitnykh plyonok." Zavodskaya Laboratoriya. Diagnostika Materialov, Vol. 67 (2001), No. 9, p. 23.
[7] Belyaev, B. A., Ivanenko, A. A., Leksikov, A. A., Makievskii, I. Ya., Pashkevich, A. Z., Tyurnev, V. V.: Spektrometer ferromagnitnogo rezonansa lokalnykh uchastkov tonkikh magnitnykh plyonok. Preprint No. 761, Institute of Physics, Krasnoyarsk, 1995.
[8] Kraus, I., Ganev, N.: "Residual stress and stress gradients." In: Industrial Applications of X-ray Diffraction. F. H. Chung, D. K. Smith (eds.). New York, Basel, 1999, p. 793.

Prof. Boris A. Belyaev
Prof. Aleksandr A. Leksikov
Sergei G. Ovchinnikov
L. V. Kirensky Institute of Physics RAS, Krasnoyarsk, Russia

Prof. RNDr. Ivo Kraus, DrSc.
phone: +420 221 912 416
e-mail: kraus@troja.fjfi.cvut.cz
Department of Solid State Engineering
Czech Technical University in Prague
Faculty of Nuclear Sciences and Physical Engineering, Trojanova 13, 120 00 Praha 2, Czech Republic

Prof. Anatolyi S. Parshin
Siberian State Aerospace University, Krasnoyarsk, Russia

Acta Polytechnica 59(1):67–76, 2019, doi:10.14311/ap.2019.59.0067
© Czech Technical University in Prague, 2019; available online at http://ojs.cvut.cz/ojs/index.php/ap

Particle motion over the edge of an inclined plane that performs axial movement in a vertical limiting cylinder

Serhii F. Pylypaka (a), Mykola B. Klendii (b), Viktor M. Nesvidomin (a), Viktor I. Trokhaniak (a, ∗)

(a) National University of Life and Environmental Sciences of Ukraine, Heroiv Oborony Str., 15, Kyiv, Ukraine.
(b) Separated Subdivision of National University of Life and Environmental Sciences of Ukraine, Berezhany Agrotechnical Institute, Akademichna Str., 20, Berezhany, Ukraine.
∗ Corresponding author: trohaniak.v@gmail.com

Abstract. Differential equations of the relative motion of a material particle over the edge of an inclined flat ellipse that rotates around the axis of a vertical limiting cylinder have been deduced. The position of the plane relative to the axis of rotation is set by an angle ranging from zero to ninety degrees. If the angle is equal to zero, the plane is perpendicular to the axis of rotation; if the angle is equal to ninety degrees, the plane passes through the axis of rotation. The equations have been solved using numerical methods; for certain angles, an analytical solution has been found. The aim of the research is to investigate the transportability of a technological material in a vertical direction by a cascade operating element that rotates in a cylindrical cover. The working part of the operating element is an inclined rigid plane, which is limited by an ellipse — the line of its contact with the cover. The objective of the research is to describe analytically the movement of a single particle of the technological material on two surfaces, namely, the inclined plane and the vertical cover. The research methodology is based on the methods of differential geometry and the theory of surfaces, on theoretical mechanics, and on numerical methods for solving differential equations. The paper presents a first analytical description of the relative particle motion along an ellipse — the contact line of an inclined plane and the limiting vertical cylinder in which the inclined plane rotates. The kinematic characteristics of such motion have been determined.

Keywords: inclined cylinder; oscillating motion; vertical plane; particle; differential equations; kinematic parameters.

1. Introduction

A particle motion over a horizontal plane in the form of a rigid disk, which rotates around a vertical axis, is the most investigated one. Such disks with blades attached to them are used in scattering centrifugal apparatuses. Operating elements with a horizontal axis of rotation, in the form of a shaft with flat blades attached to it, are used for scattering organic fertilizers. In addition, they can be used for mixing particles and scattering them in a centrifugal direction. The investigation of the patterns of the motion of a material particle over the edge of an inclined flat ellipse that rotates around the axis of a vertical limiting cylinder is interesting in terms of both theory and practical use.

Major works [1, 2] consider the compound particle motion on the rough surfaces of the operating elements of agricultural machinery. The particle motion on a horizontal disk that rotates around a vertical axis, either equipped with blades of the simplest designs or without them, is analysed in these papers. Research [3] considers the particle motion on a flat disk which rotates around an axis that is inclined to the horizon. The patterns of the particle motion on a disk without blades, as well as on one equipped with rectilinear blades located in a radial direction from the axis of rotation, are presented in this research work. Paper [4] presents research which is similar to ours: it considers relative particle motion in a wide range of inclination angles of the plane to the axis of rotation, beginning from a horizontal position and finishing with a vertical one.
The development of a bladed operating element of a conveyer-mixer is considered in papers [5–10].

2. Material and methods

In order to investigate the patterns of the material particle motion over the edge of an inclined flat ellipse which rotates around the axis of a vertical limiting cylinder, let us cut a segment off an inclined plane with the help of a vertical cylinder, so that the segment is limited by an ellipse (Figure 1). Let us rotate this segment around the cylinder axis at a constant angular velocity ω. If a material particle gets onto the moving segment, it will slide on it, that is to say, it will perform a relative motion. Under the action of the centrifugal force, the particle will be thrown out to the inner surface of the limiting cylinder and, from this point on, it will move along the common intersection line of the cylinder and the plane, that is to say, along an ellipse.

Figure 1. An inclined plane, which is limited by a vertical cylinder: (a) axonometric projection; (b) diagram of forces acting on a particle

A particle, which is shown in Figure 1b in the lowest position on the flat segment that is projected into a straight line, is influenced by the following forces: the weight force of the particle $mg$, where $m$ is the mass of the particle and $g = 9.81$ m/s²; the reaction $N$ of the surface; and the reaction $N_R$ of the surface of the cylinder of radius $r$. The two last-mentioned forces cause relative friction forces.

First of all, let us consider the situation when the plane inclination angle is $\beta = 0$. Here the ellipse turns into a circle and the problem becomes planar, which makes solving it much easier. Let us deduce the differential equation of the absolute particle motion in a projection onto a moving axis $s$ — the direction of motion (Figure 2). Let us write the equation in the form $mw = \sum F$, where $w$ is the absolute acceleration of the particle and $\sum F$ is the total force exerted upon the particle.

Figure 2. A diagram of forces acting on the particle A, which is located on a circular disk that rotates inside a cylindrical cover

The circular segment of the plane rotates with the angular velocity ω. During the time $t$ it rotates through the angle $\omega t$. The particle will rotate in the same direction, but its rotation will be slower, since its motion is restrained by the friction force caused by sliding over the cylinder surface. The absolute angle of rotation will be equal to $\omega t - \alpha$, where $\alpha$ is the angle through which the particle turns as a result of sliding on the plane (the disk). The absolute distance $s$, which the particle covers during the time $t$, is determined from the following expression:
$$s = r(\omega t - \alpha). \tag{1}$$
Let us consider the angle $\alpha$ to be a time-varying function, $\alpha = \alpha(t)$. By successive differentiation of the expression (1), let us find the absolute velocity $v_a$ and the absolute acceleration $w$ of the particle:
$$v_a = \frac{ds}{dt} = r(\omega - \alpha'), \tag{2}$$
$$w = \frac{dv_a}{dt} = -r\alpha''. \tag{3}$$
Two friction forces, which act along the moving axis $s$ in opposite directions, are exerted upon the particle. The force $F$ of the particle friction on the disk is directed opposite to its sliding on it, i.e. in the direction of the absolute motion. Its value is the product of the coefficient $f$ of the particle friction on the disk and the disk reaction $N = mg$; thus $F = fmg$. The force $F_R$ of the particle friction on the surface of the cylindrical cover is determined in a similar way.
The cover reaction $N_R$ on the particle balances the centrifugal force $mr(\omega - \alpha')^2$ acting on it; thus $F_R = f_R N_R = f_R m r(\omega - \alpha')^2$. After that, the equation $mw = \sum F$ takes the following form:
$$-mr\alpha'' = fmg - f_R m r(\omega - \alpha')^2. \tag{4}$$
After reduction by the mass $m$ and simplification, (4) is written as
$$-\alpha'' = \frac{fg}{r} - f_R(\omega - \alpha')^2. \tag{5}$$
This equation can be solved analytically. A partial solution can be found which describes the particle motion after its stabilization. In this case, the angular velocity $\alpha'$ of the particle sliding will be constant, and its angular acceleration $\alpha''$ will be equal to zero. Having solved (5) for $\alpha'$ at $\alpha'' = 0$, we obtain:
$$\alpha' = \omega - \sqrt{\frac{fg}{f_R r}}. \tag{6}$$
For example, at $r = 0.1$ m, $f = f_R = 0.3$ and $\omega = 20$ s⁻¹, the formula (6) gives $\alpha' = 10.1$ s⁻¹. Thus the angular velocity of the particle rotation in the absolute motion, $\omega - \alpha' = 9.9$ s⁻¹, is about half the angular velocity of the disk rotation. The minimum value of the angular velocity of the disk rotation at which particle sliding over its surface is possible is determined from the expression (6):
$$\omega > \sqrt{\frac{fg}{f_R r}}. \tag{7}$$
For the above-mentioned parameters, $\omega > 9.9$ s⁻¹. If the angular velocity of the disk rotation is less than this, the particle will rotate together with the disk without sliding.

At $\beta \neq 0$, the absolute particle motion is spatial. In order to find the pattern of the sliding-angle change $\alpha = \alpha(t)$, let us use fixed $Oxyz$ coordinates. The differential equation of motion is written in projections onto the axes of this coordinate system. The parametric equations of the ellipse created by the intersection of the inclined plane and the vertical cylinder of radius $r$ are
$$x = r\cos\alpha, \qquad y = r\sin\alpha, \qquad z = -r\tan\beta\cos\alpha, \tag{8}$$
where $\beta$ is the angle of the plane inclination (Figure 1b). When the plane segment rotates with the angular velocity ω, the particle in the absolute motion has to move along the ellipse (8). During the time $t$, the ellipse rotates through the angle $\omega t$ in the direction opposite to the particle sliding through the angle $\alpha$. Considering the angle $\alpha$ known, let us find the parametric equations of the absolute particle motion. For this purpose, let us turn the ellipse (8) through the angle $-\omega t$. The coordinate $z$ does not depend on this turn, so it remains unchanged. After such a turn, we obtain the parametric equations of the absolute motion of the particle:
$$\begin{aligned} x_a &= r\cos\alpha\cos(-\omega t) - r\sin\alpha\sin(-\omega t) = r\cos(\omega t-\alpha),\\ y_a &= r\cos\alpha\sin(-\omega t) + r\sin\alpha\cos(-\omega t) = -r\sin(\omega t-\alpha),\\ z_a &= -r\tan\beta\cos\alpha. \end{aligned} \tag{9}$$
Having differentiated (9), we obtain the projections of the absolute velocity of the particle:
$$x'_a = -r(\omega-\alpha')\sin(\omega t-\alpha), \qquad y'_a = -r(\omega-\alpha')\cos(\omega t-\alpha), \qquad z'_a = r\alpha'\tan\beta\sin\alpha. \tag{10}$$
Its value is found as the geometric sum of the projections (10):
$$v_a = \sqrt{(x'_a)^2+(y'_a)^2+(z'_a)^2} = r\sqrt{(\omega-\alpha')^2 + (\alpha')^2\tan^2\beta\sin^2\alpha}. \tag{11}$$
Let us find the projections of the unit vector which sets the direction of the absolute velocity of the particle by dividing the expressions (10) by the velocity value (11):
$$t_{vax} = -\frac{(\omega-\alpha')\sin(\omega t-\alpha)}{a}, \qquad t_{vay} = -\frac{(\omega-\alpha')\cos(\omega t-\alpha)}{a}, \qquad t_{vaz} = \frac{\alpha'\tan\beta\sin\alpha}{a}, \tag{12}$$
where $a = \sqrt{(\omega-\alpha')^2 + (\alpha')^2\tan^2\beta\sin^2\alpha}$. As a result of the differentiation of (10), we obtain the projections of the absolute acceleration $w$ of the particle:
$$\begin{aligned} x''_a &= r\alpha''\sin(\omega t-\alpha) - r(\omega-\alpha')^2\cos(\omega t-\alpha),\\ y''_a &= r\alpha''\cos(\omega t-\alpha) + r(\omega-\alpha')^2\sin(\omega t-\alpha),\\ z''_a &= r\alpha''\tan\beta\sin\alpha + r(\alpha')^2\tan\beta\cos\alpha. \end{aligned} \tag{13}$$
Let us write the equation $mw = \sum F$ in projections onto the axes of the $Oxyz$ coordinate system. For this purpose, it is necessary to determine the forces that act on the particle, as well as their directions. Let us find the direction of the reaction force $N$ of the flat segment and of the reaction force $N_R$ of the inner surface of the cylinder (Figure 1b). The vector of the reaction force $N$ is projected onto two axes, namely $Ox$ and $Oz$. Let us write its projections on the coordinate axes through the angle β:
$$N_x = N\sin\beta, \qquad N_y = 0, \qquad N_z = N\cos\beta. \tag{14}$$
The projections (14) are written without taking into consideration the rotating motion of the plane segment. In order to exert the force $N$ at the point of the particle location, the projections (14) must be turned through the angle $-\omega t$ about the $Oz$ axis. After this, they take the following form:
$$N_{\omega x} = N\sin\beta\cos\omega t, \qquad N_{\omega y} = -N\sin\beta\sin\omega t, \qquad N_{\omega z} = N\cos\beta. \tag{15}$$
The reaction $N_R$ is directed towards the centre of the cylinder, perpendicular to its surface, that is to say, perpendicular to the circle set by the first two equations in (8). The projections of this reaction force on the coordinate axes are
$$N_{Rx} = -N_R\cos\alpha, \qquad N_{Ry} = -N_R\sin\alpha, \qquad N_{Rz} = 0. \tag{16}$$
After the projections (16) are turned through the angle $-\omega t$ about the $Oz$ axis, we obtain:
$$N_{R\omega x} = -N_R\cos(\omega t-\alpha), \qquad N_{R\omega y} = N_R\sin(\omega t-\alpha), \qquad N_{R\omega z} = 0. \tag{17}$$
The weight force of the particle $mg$ is directed downwards and does not depend on the rotation angle of the plane segment; its projections on the coordinate axes are
$$(0;\; 0;\; -mg). \tag{18}$$
During the absolute motion, the particle slides both on the flat segment and on the wall of the cylindrical cover. In both cases, the value of the friction force is determined as the product of the reaction force and the corresponding friction coefficient: $F = fN$ and $F_R = f_R N_R$. The force $F$ is directed opposite to the direction of sliding, that is to say, opposite to the vector of the relative velocity of the particle motion. The projections of this vector are determined by differentiation of (8):
$$x' = -r\alpha'\sin\alpha, \qquad y' = r\alpha'\cos\alpha, \qquad z' = r\alpha'\tan\beta\sin\alpha. \tag{19}$$
The value of the relative velocity of the particle motion is
$$v = \sqrt{(x')^2+(y')^2+(z')^2} = r\alpha'\sqrt{1+\tan^2\beta\sin^2\alpha}. \tag{20}$$
The projections of the unit vector directing the relative velocity are determined by dividing the projections (19) of this velocity by its absolute value (20):
$$t_{vx} = -\frac{\sin\alpha}{b}, \qquad t_{vy} = \frac{\cos\alpha}{b}, \qquad t_{vz} = \frac{\tan\beta\sin\alpha}{b}, \tag{21}$$
where $b = \sqrt{1+\tan^2\beta\sin^2\alpha}$. After the vector (21) is turned through the angle $-\omega t$, so that it is exerted at the particle, we obtain:
$$t_{v\omega x} = \frac{\sin(\omega t-\alpha)}{b}, \qquad t_{v\omega y} = \frac{\cos(\omega t-\alpha)}{b}, \qquad t_{v\omega z} = \frac{\tan\beta\sin\alpha}{b}. \tag{22}$$
Here it is possible to write the projections of the friction force $F$, taking into account that it is directed opposite to the vector (22):
$$F_{\omega x} = -\frac{fN}{b}\sin(\omega t-\alpha), \qquad F_{\omega y} = -\frac{fN}{b}\cos(\omega t-\alpha), \qquad F_{\omega z} = -\frac{fN}{b}\tan\beta\sin\alpha. \tag{23}$$
The friction force $F_R$ is directed opposite to the absolute velocity, that is to say, opposite to the vector (12). There is no need to turn the vector (12) through the angle $-\omega t$, since the cylinder is fixed and the position of the particle in the absolute motion is determined relative to the fixed coordinate system as well.
Taking this into account, the projections of the force $F_R$ are written as follows:
$$F_{Rx} = \frac{f_R N_R}{a}(\omega-\alpha')\sin(\omega t-\alpha), \qquad F_{Ry} = \frac{f_R N_R}{a}(\omega-\alpha')\cos(\omega t-\alpha), \qquad F_{Rz} = -\frac{f_R N_R}{a}\alpha'\tan\beta\sin\alpha. \tag{24}$$
Now it is possible to write the equation $mw = \sum F$ in projections on the $Oxyz$ coordinate axes:
$$m x''_a = N_{\omega x} + N_{R\omega x} + F_{\omega x} + F_{Rx}, \qquad m y''_a = N_{\omega y} + N_{R\omega y} + F_{\omega y} + F_{Ry}, \qquad m z''_a = N_{\omega z} + N_{R\omega z} + F_{\omega z} + F_{Rz} - mg. \tag{25}$$
Let us substitute the expressions of the absolute acceleration (13) and the expressions of the exerted forces (15), (17), (18), (23) and (24) into (25). As a result, we obtain a system of three differential second-order equations with three unknown functions $\alpha = \alpha(t)$, $N = N(t)$, $N_R = N_R(t)$:
$$\begin{aligned} m\bigl(r\alpha''\sin(\omega t-\alpha) - r(\omega-\alpha')^2\cos(\omega t-\alpha)\bigr) &= N\sin\beta\cos\omega t - N_R\cos(\omega t-\alpha) - \frac{fN}{b}\sin(\omega t-\alpha) + \frac{f_R N_R}{a}(\omega-\alpha')\sin(\omega t-\alpha),\\ m\bigl(r\alpha''\cos(\omega t-\alpha) + r(\omega-\alpha')^2\sin(\omega t-\alpha)\bigr) &= -N\sin\beta\sin\omega t + N_R\sin(\omega t-\alpha) - \frac{fN}{b}\cos(\omega t-\alpha) + \frac{f_R N_R}{a}(\omega-\alpha')\cos(\omega t-\alpha),\\ m\bigl(r\alpha''\tan\beta\sin\alpha + r(\alpha')^2\tan\beta\cos\alpha\bigr) &= N\cos\beta - \frac{fN}{b}\tan\beta\sin\alpha - \frac{f_R N_R}{a}\alpha'\tan\beta\sin\alpha - mg. \end{aligned} \tag{26}$$
Having solved (26) for $\alpha''$, $N$ and $N_R$, we obtain:
$$\alpha'' = \frac{1}{mr}\left(\frac{f_R N_R}{a}(\omega-\alpha') - \frac{fN}{b} - \bigl(N_R - mr(\omega-\alpha')^2\bigr)\cot(\omega t-\alpha) + \frac{N\sin\beta\cos\omega t}{\sin(\omega t-\alpha)}\right), \tag{27}$$
$$N = d^{-1}\Bigl(-4mga\cos\beta - 4mr\sin\beta\bigl((\alpha')^2 a\cos\alpha + f_R\,\omega(\omega-\alpha')^2\sin\alpha\bigr)\Bigr), \tag{28}$$
$$N_R = d^{-1} m a\Bigl(c\,r\omega(\omega-2\alpha') - r\bigl(3\omega^2 - 6\alpha'\omega + 4(\alpha')^2\bigr) - 2g\sin 2\beta\cos\alpha\Bigr), \tag{29}$$
where $c = \cos 2\alpha - 2\cos 2\beta\cos^2\alpha$ and $d = a(c-3) + 2f_R\,\omega\sin^2\beta\sin 2\alpha$. Having substituted (28) and (29) into (27), the particle mass $m$ cancels, and we obtain a differential second-order equation which can be solved using numerical methods. At $\beta = 0$, the expressions (27)–(29) take the following forms:
$$\alpha'' = \frac{f_R N_R - fN}{mr} - \frac{N_R - mr(\omega-\alpha')^2}{mr}\cot(\omega t-\alpha), \tag{30}$$
$$N = mg, \tag{31}$$
$$N_R = mr(\omega-\alpha')^2. \tag{32}$$
Having substituted (31) and (32) into (30), after a simplification we obtain the differential equation (5). Thus, in the case when $\beta = 0$, all three equations (30)–(32) completely coincide with the expressions obtained earlier, when the differential equation of the absolute particle motion was determined in the projection on a moving axis.
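The system above is solved numerically in the next section; the planar special case (5) already lends itself to a compact check. The sketch below integrates (5) with SciPy's solve_ivp (the paper does not name its solver) and compares the stabilized sliding rate with the analytical value (6).

```python
import numpy as np
from scipy.integrate import solve_ivp

g, r, f, f_R, omega = 9.81, 0.1, 0.3, 0.3, 20.0  # parameter values from the text

def rhs(t, y):
    """State y = (alpha, alpha'); eq. (5) gives
    alpha'' = f_R * (omega - alpha')**2 - f*g/r."""
    alpha, alpha_dot = y
    return [alpha_dot, f_R * (omega - alpha_dot) ** 2 - f * g / r]

sol = solve_ivp(rhs, (0.0, 2.0), [0.0, 0.0], rtol=1e-9, atol=1e-9)
print(f"alpha'(2 s)   = {sol.y[1, -1]:.3f} 1/s")                  # ~10.095
print(f"eq. (6) value = {omega - np.sqrt(f * g / (f_R * r)):.3f} 1/s")
```

Both numbers agree with the 10.1 s⁻¹ quoted in the text.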
ω = 5 s−1, the fall movement of a particle is different. similar curves for such case are presented in figure 5. the curves in figure 5 show that there is more time needed for a particle to stop, here, at certain moments, the direction of the sliding velocity changes for the opposite one. this means that a particle performs oscillations near the low point of an ellipse and this oscillatory motion gradually declines. if the angular velocity of the segment rotation increases, a particle begins to slide in an ellipse passing through the lowest and the highest points in a sequence (figure 6). having analysed the curves with the help of the similar values of the time t, it can be concluded that a particle has its highest sliding velocity near the lowest point and its lowest speed near the highest point. the pattern of the height change (figure 6b) shows that a particle moves upwards slower than it moves downwards. let us increase the angle β of the plane inclination to 35° without changing any other parameters. in this case, a particle in a relative motion will go upwards and will stop near the highest point (figure 7a). after its stop in the relative motion, it will describe a circle in an absolute motion (figure 7b). the fact that a particle stops can be explained by the fact that when going upwards, there are inertial forces that make a particle move further upwards and the weight force of a particle is not enough to overcome them. since a particle cannot move further upwards, it “sticks”. it can be assumed, that such particle “sticking” can be overcome by increasing the angular velocity ?. indeed, when the angular velocity ω is increased to 24 s−1, there will be no “sticking”, but, at certain moments of the particle sliding, the reaction n of the surfaces will become less than zero. in figure 8 there are two curves synchronized in time: the segment surface reaction n at the particle mass m = 0.01 kg and the height z change. straight vertical lines of the curves show the parts where the reaction n is equal to zero or less than zero. this happens at the moment when a particle begins its fall movement. during this time, the particle detaches from a plane. a differential equation sufficiently describes the particle motion only at positive values of the reaction n. the physical essence of such effect can be explained by the fact that, if angular velocity is increased, inertial force increases and it overcomes the particle weight force, but here, it detaches a particle from a plane. let us increase the radius r of a limiting cylinder to 0.25 m without changing any other parameters. the area of the unpredictable behaviour of a particle increases and begins even before the moment when a particle reaches its highest point at its upward movement (figure 9). this implies that the increase of the design parameters (the angle β of the plane inclination, the radius r of a limiting cylinder) or the technological ones (the increase of the angular velocity ω of plane rotation) causes a situation when the particle motion in an ellipse cannot be provided along the whole trajectory. there is either the particle “sticking” or its detachment from a plane. let us determine the role of the coefficient f of the particle friction on the surface of a flat segment and the coefficient fr of the particle friction on the surface of a limiting cylinder. the value of these coefficients plays a part in the particle motion in an ellipse. 
let us determine the role of the coefficient f of the particle friction on the surface of the flat segment and of the coefficient f_R of the particle friction on the surface of the limiting cylinder; the values of these coefficients play a part in the particle motion in an ellipse. let us characterize this motion by the dependence z = z(t), that is to say, by the regularity of the particle upward and fall movements. this regularity is the same for the relative motion and the absolute motion. to put that into context, let us take the curve shown in figure 6b and decrease the friction coefficient f by 0.1. let us construct the analogous curve for a process lasting 5 s. this curve is presented in figure 10 (at the top); it is synchronized in time with the surface reaction curve (at the bottom) in the same way as in the previous examples.

figure 3. (a) sliding angular velocity curve; (b) the curve of the relative velocity of particle sliding in a circle.
figure 4. kinematic parameter curves for particle fall movement in an ellipse of a fixed plane at β = 30°, r = 0.1 m, f = f_R = 0.3, ω = 0: (a) relative velocity curve; (b) the height of particle fall movement curve.
figure 5. kinematic parameter curves for particle fall movement in an ellipse, which rotates with a low angular velocity, at β = 30°, r = 0.1 m, f = f_R = 0.3, ω = 5 s−1: (a) relative velocity curve; (b) the height of particle fall movement curve.
figure 6. kinematic parameter curves for particle sliding in an ellipse, which rotates with angular velocity ω = 20 s−1, at β = 30°, r = 0.1 m, f = f_R = 0.3: (a) relative velocity curve; (b) the height of relative motion curve.
figure 7. kinematic parameter curves for particle sliding in an ellipse, which rotates with angular velocity ω = 20 s−1, at β = 35°, r = 0.1 m, f = f_R = 0.3: (a) the height of relative motion curve; (b) the trajectory of absolute particle motion.
figure 8. the height z curve (at the top) and the surface reaction n curve (at the bottom) at ω = 24 s−1, β = 35°, r = 0.1 m, f = f_R = 0.3.
figure 9. the height z curve (at the top) and the surface reaction n curve (at the bottom) at ω = 24 s−1, β = 35°, r = 0.25 m, f = f_R = 0.3.
figure 10. the height z curve (at the top) and the surface reaction n curve (at the bottom) at ω = 20 s−1, β = 30°, r = 0.1 m, f = 0.2, f_R = 0.3.
figure 11. trajectories of absolute motion over the inner surface of a limiting cylinder of radius r = 0.1 m: (a) ω = 20 s−1, β = 30°, f = 0.2, f_R = 0.3; (b) ω = 18 s−1, β = 45°, f = 0.1, f_R = 0.5.

having compared the curves in figure 6b with those at the top of figure 10, it can be deduced that the number of upward and fall movements of the particle increased in the latter case, that is to say, the decrease of the friction coefficient f resulted in an increase of the particle absolute velocity. this is the opposite of what holds for the plane problem at β = 0. for the plane problem, the friction force f between a particle and a round disk is a driving force, which pulls the particle into a motion in a circle in the direction of the disk rotation; an increase of the friction coefficient f increases the force f and, thus, the absolute velocity of the particle motion in a circle increases as well (figure 2). by contrast, an increase of the friction coefficient f_R decreases the absolute velocity of the particle motion in a circle, since the friction force f_R is a decelerating force, directed opposite to the motion.
in the case of the particle relative motion in an ellipse (at β ≠ 0), the effects are reversed as well: an increase of the friction coefficient f_R increases the absolute velocity of the particle motion. at first glance this is paradoxical, but it has its explanation. for the plane problem, the only driving force is the force f = fn; the reaction force n is constant and is balanced by the particle weight force mg. if β ≠ 0, the upward-directed component of the reaction force n is n cos β, according to the last equation in (26). the cosine of the angle β is a constant value, so this component is directly proportional to the reaction n. at the bottom of figure 10, it can be seen that the force n ranges widely and has its highest value when the particle is located at the low point of the ellipse. it is this force that drives the particle upwards, and a decrease of the friction force, that is to say, of the coefficient f, facilitates this motion. this also explains the fact that the velocity of the particle sliding in an ellipse is the greatest near its low point, which was established in the analysis of the curves presented in figure 6.

figure 11a presents the absolute trajectory of the particle motion. it is characterized by an abrupt change in the direction of the motion in the bottom part of the ellipse and a longer stay in its top part. in addition, it confirms that the velocity of the particle sliding along the ellipse is greater in the bottom part than in the top part. the investigations show that increasing the friction coefficient f_R by 0.1 has almost the same effect as decreasing the coefficient f by the same value. it can be assumed that an increase of the coefficient f_R decreases the relative angular velocity ω − α′, whose square dominates the decelerating force f_R = f_R n_R = f_R m r(ω − α′)²; this force therefore decreases even though the coefficient f_R itself increases. thus, with a decrease of the friction coefficient f and an increase of the friction coefficient f_R, the number of upward and downward movements of the particle in an ellipse is expected to increase, or the angle of the plane inclination can be increased up to the limits at which such a motion is still possible. figure 11b presents the trajectory of an absolute particle motion for β = 45° at f = 0.1 and f_R = 0.5; here, the angular velocity was decreased to ω = 18 s−1 in order to avoid the particle detachment from the plane segment. figure 11b shows that the difference between the particle movement in the bottom and the top parts of the ellipse increases even more. the particle motion resembles a throw upwards from the bottom, with a long stay in the top part of the ellipse before the particle moves downwards. taking such a particle motion into consideration, several recommendations can be formulated for designing a device for lifting a technological material in a vertical pipe: in order to provide a sufficient material elevation, it is necessary to decrease the friction coefficient f and to increase the friction coefficient f_R.

figure 12. a cascade operating element.
in order to provide a continuous material feed to the specified height, a cascade operating element can be designed, as shown in figure 12. plane segments bounded by semi-ellipses follow one another, with their inclination angles alternating in sign. the top part of each segment is cut off, since the material movement decelerates there. in order to conduct the experiments, a cascade operating element for transporting loose materials by a vertical conveyer was made. the efficiency of a vertical conveyer with a screw operating element and the efficiency of a conveyer with a cascade operating element were determined depending on the frequency of rotation of the operating unit when conveying wheat (f = 0.3; f_R = 0.3). the results of the experimental investigation are presented in figure 13.

figure 13. the efficiency of a vertical conveyer depending on the angular velocity of its operating element: 1 – a conveyer with a screw operating element; 2 – a conveyer with a cascade operating element.

it has been determined that the efficiency of a conveyer reaches a maximum and then remains practically unchanged as the angular velocity of the operating unit increases, owing to the spillage of grain through holes and gaps. the efficiency of a conveyer with a cascade operating element is practically the same as the efficiency of a conveyer with a screw operating element (the difference is not more than 5 %). a cascade operating element is therefore appropriate for conveyers, since one of the promising directions in improving the manufacturability of screw conveyer operating elements is the application of flat blades, inclined to the axis of rotation and attached to the cylindrical shaft of a frame, instead of helical spirals. the intention is to make such blades by sheet metal forming and to weld them to a cylindrical shaft, since this is cheaper.

4. conclusions

a particle motion on an inclined plane that rotates around the axis of a vertical limiting cylinder differs from a particle motion on a screw surface rotating in a vertical cylinder. the difference lies in the variable kinematic and dynamic motion patterns. in order to provide particle sliding on the plane and in an ellipse at the same time, the required angular velocity of the flat segment rotation must be provided. the motion pattern is influenced by the coefficients of the particle friction on the surfaces of the plane and of the cylindrical cover, the radius of the cylinder, the angular velocity of the segment rotation and the inclination angle of the plane. at a certain combination of these parameters, a particle may “stick” in an ellipse or detach from the plane in the top position. the number of particle passings along an ellipse per unit time can be increased by decreasing the coefficient of its friction on the inclined plane and by increasing the coefficient of its friction on the inner surface of the limiting cylinder. a cascade operating element, which can compete with a screw operating element in reclaim conveyers due to the ease of its manufacturing, has been designed.
acta polytechnica 61(5):633–643, 2021
https://doi.org/10.14311/ap.2021.61.0633

peculiar aspects of cracking in prestressed reinforced concrete t-beams

vasyl karpiuk^a, yuliia somina^a,∗, fedir karpiuk^a, irina karpiuk^b

a odesa state academy of civil engineering and architecture, faculty of civil engineering, department of reinforced concrete structures and transport facilities, didrihsona street 4, 65029 odesa, ukraine
b odesa state academy of civil engineering and architecture, faculty of civil engineering, department of basements and foundations, didrihsona street 4, 65029 odesa, ukraine
∗ corresponding author: syomina3091@ukr.net

abstract. in order to study the cracking of prestressed reinforced concrete t-shaped beam structures, the authors planned and carried out a full-scale experiment with five variable factors: the relative shear span, the ratio of the table overhang width to the thickness of the beam rib, the ratio of the table overhang thickness to the working height of the beam section, the coefficient of transverse reinforcement, and the level of prestressing in the working reinforcement. the article describes the cracking process and the destruction of the test beams. it was found that the loading level at which inclined cracks open is 53 % higher than the loading level at which normal cracks open. mathematical models of the bending moments and transverse forces of cracking were built using the “compex” software; mathematical models of the crack opening width and of the projection length of a dangerous inclined crack were also obtained. these models are based on the experimental data.
analysing the obtained models, the complex influence of the variable factors on the main parameters of crack formation and crack resistance was established. in particular, it was found that the prestress level in the working reinforcement has the greatest effect on the bending moment of cracking, while the value of the shear force of cracking significantly depends on both the prestressing level in the reinforcement and the relative shear span. on the basis of the experimental data, an empirical expression is obtained for determining the projection of a dangerous inclined crack for prestressed reinforced concrete t-shaped beams. the resulting equation can be used to calculate the shear reinforcement.

keywords: reinforced concrete, prestressing, t-beam, inclined section, normal crack, diagonal crack, cracking, transverse force, bending moment.

1. introduction

solving the problems of capital construction is associated with technical progress in the field of concrete and reinforced concrete, as they are the most common materials for supporting structures in modern construction. reinforced concrete elements with various cross-sectional shapes (rectangular, t-shaped, i-shaped, trapezoidal, etc.) make up a significant part of prefabricated, monolithic and precast-monolithic structures; however, the data on their behaviour under load are rather limited. the data of some researches in this area are presented in works [1–3]. despite its widespread use, one of the disadvantages of reinforced concrete is the early formation of cracks in the tensioned zone and, as a result, the rapid growth of structural deflections to the ultimate value. the problem of crack opening is of considerable importance for ensuring the joint deformation of the reinforcement and the concrete, which determines the durability and rigidity and ensures the full use of the bearing capacity of reinforced concrete structures. each crack appearing in a reinforced concrete unit indicates that there has been a discharge of accumulated stresses in this area of the structure. the cause of cracks is internal tensile stresses, which can arise due to internal processes in the unit (concrete shrinkage, loss (outflow) of hydration heat, reinforcement corrosion, etc.) and due to external actions on the structure (temperature, force, shock actions, etc.). the development of a calculation apparatus for crack formation in reinforced concrete structures is very difficult because the main hypothesis of the mechanics of a solid deformable body (the hypothesis of continuity) [4, 5] is not applicable here: the continuity is violated by the presence of macro-cracks. the use of simplified approaches is also impossible, since the permissible error in this case exceeds the crack opening width measured during the experiments using a microscope. in order to increase the rigidity and crack resistance, prestressing is widely used in reinforced concrete structures; it is used in beams, slabs and trusses. in addition, it reduces the weight of the structure and, as a result, increases the efficiency and the possibility of a rational use of high-strength reinforcement.

figure 1. dimensions and reinforcement of the tested beams with large, medium and small shear span.
there are also some disadvantages: increased labour intensity and the need for special equipment, lower fire resistance; reinforcing prestressed structures is more complicated than reinforcing conventional ones; high-strength reinforcement loses its plastic properties faster under corrosion, and there is a danger of a brittle fracture of structures. following this line of reasoning, it is necessary to determine the appropriateness of applying prestressing in individual cases. an investigation of the crack pattern at failure and of the evolution of the compressive strut inclination and crack inclination at failure, from both the theoretical and the experimental point of view, was reported in [6–8]. based on the foregoing, the accumulation of experimental data, as well as the identification of new patterns of this issue, is an urgent and practically useful scientific task.

2. materials and methods

in order to study this theme, t-shaped beams were manufactured and tested [9]. the width of the tables was taken as a multiple of the rib width: 2b, 3b, 4b. their heights were 30, 45 and 60 mm. the calculated span length was 1975 mm. as a prestressed working reinforcement in the form of a separate rod, we used thermomechanically hardened reinforcement of a deformed (feather) profile, class a500c, ⌀22 mm [10, 11]. in addition to the specified rod, the beams were reinforced with two vertical flat frames, an upper mesh and separate rods. the dimensions of the tested beams and their reinforcement are shown in fig. 1. the following factors were selected as variable factors: x1 is the relative shear span (a/h0); x2 is the ratio of the width of the table overhangs to the rib thickness (b′f/b); x3 is the ratio of the thickness of the table overhangs to the working height of the section (h′f/h0); x4 is the coefficient of the transverse reinforcement (ρsw); x5 is the prestressing level in the working reinforcement (p/(fcd b h0)). in accordance with the accepted test procedure [9], the prototype beams were tested as single-span, freely supported beams on two seats at the age of 100...120 days (fig. 2). 27 test beams were tested.

3. results

during the testing, the prototypes deformed with marked deflections as the load increased [12]. at the same time, during the loading, normal cracks appeared in the simple bending zone and then inclined ones appeared in the support areas (fig. 3, 4).

figure 2. prototype beam during the tests.

the absolute majority (more than 90 %) of the prototype beams fractured in the support areas, and the rest fractured along both the normal and the inclined sections [13, 14]. during the loading, the first normal cracks formed at the bottom of the beam rib in the simple bending zone, and then in the shear span. with the load increasing, they developed as follows: in the simple bending zone – without changing the original direction, and in the shear span – deviating towards the application point of the concentrated force and gradually turning into inclined cracks (the first type of inclined cracks). in addition to the normal cracks developing into inclined ones, separate inclined cracks also formed at the middle height of the section of the beam rib (the second type), both over the existing normal cracks and in the zone where they were absent. with a further increase of the load, the inclined cracks developed towards the compressed and stretched sides of the beam, and one of them became the critical one – it opened up most intensively.
in the experiments, there were three destruction forms of the support areas of the beams [15]: along a dangerous inclined crack under the dominating effect of the shearing force (pattern a, fig. 3, 4); along a dangerous inclined crack under the dominating effect of the bending moment (pattern b, fig. 3, 4); or along a compressed inclined band (pattern c, fig. 3, 4). with the destruction of the beams in the forms a and b, a dangerous inclined crack, having reached the compressed table, penetrated it or developed along its lower side and then penetrated the table. subsequently, the beam destruction occurred from crushing or concrete shearing in the compressed zone, with the possible achievement of the yield point in the prestressed reinforcement at the mouth of the inclined crack. with a relatively small shear span, the beam destruction occurred from a concrete crushing along an inclined compressed band between the inclined cracks in the rib, in the direction from the load to the support. the experimental data on the bending moments and shearing forces of cracking are presented in table 1. the investigated limit here is the moment and the shearing force at the beginning of cracking. it is generally accepted that the first cracks in reinforced concrete beams appear at a level of 20–30 % of the total breaking load, and in prestressed ones – at a level of 30 %. the analysis of the table shows that this hypothesis is not confirmed in all the experiments: in the experiments with a more powerful reinforcement, the cracking level is more than 30 %, and in the experiments with a less powerful reinforcement, it is about 10 %. these results must be taken into account in the design process. the interpretation of the experimental data was carried out by analysing the experimental-statistical dependencies (mathematical models). these models were obtained using the “compex” software, which was developed at the odessa state academy of civil engineering and architecture under the guidance of professor v. a. voznesenskiy in 1991 on the basis of works [16, 17]. analysing these models, it is possible to determine the influence of the design factors on the output parameter, both individually and in interaction with each other. the theoretical background of the software is based on the theory of experiment planning, the theory of mathematical modelling, and also on the methods of statistics.

figure 3. development of normal and inclined cracks and destruction patterns of the test beams: i – transfer of prestressing in reinforcement to concrete; ii – formation of normal cracks in the zone of pure bending; iii – growth of normal cracks and formation of inclined cracks in shear spans; iv – development of normal and inclined cracks with an intersection of the compressed beam flange and formation of new inclined cracks; v.a) – the beam destruction with the displacement of the area near the support and with the reinforcement kink; v.b) – the beam destruction due to the prevailing action of the bending moment; v.c) – the beam destruction due to the crushing of the beam rib concrete.
no. | x1 (a/h0) | x2 (b′f/b) | x3 (h′f/h0) | x4 (ρsw) | x5 (p/(fcd b h0)) | mcrc,⊥ (knm) | facc.crc,⊥ (kn) | fu (kn) | mcrc,/ (knm) | facc.crc,/ (kn)
1 | 3.18 | 4.0 | 0.36 | 0.0056 | 0.584 | 13.37 | 25.5 | 85.50 | 16.49 | 31.4
2 | 1.06 | 2.0 | 0.36 | 0.0056 | 0.584 | 11.90 | 68.0 | 93.00 | 7.63 | 43.6
3 | 1.06 | 4.0 | 0.18 | 0.0020 | 0.0 | 2.84 | 16.20 | 76.00 | 4.94 | 28.2
4 | 3.18 | 2.0 | 0.18 | 0.0020 | 0.0 | 1.99 | 3.80 | 64.00 | 13.65 | 26.0
5 | 1.06 | 4.0 | 0.18 | 0.0056 | 0.584 | 12.67 | 72.4 | 93.20 | 5.74 | 38.2
6 | 3.18 | 2.0 | 0.18 | 0.0056 | 0.584 | 11.83 | 22.5 | 80.30 | 13.34 | 25.4
7 | 3.18 | 4.0 | 0.36 | 0.0020 | 0.0 | 3.54 | 6.70 | 69.30 | 15.08 | 28.7
8 | 1.06 | 2.0 | 0.36 | 0.0020 | 0.0 | 2.07 | 11.8 | 76.80 | 6.42 | 37.6
9 | 1.06 | 4.0 | 0.36 | 0.0056 | 0.0 | 3.57 | 20.4 | 95.00 | 7.77 | 44.4
10 | 3.18 | 2.0 | 0.36 | 0.0056 | 0.0 | 2.03 | 3.90 | 81.00 | 11.13 | 21.2
11 | 3.18 | 4.0 | 0.18 | 0.0020 | 0.584 | 12.63 | 24.1 | 64.00 | 11.87 | 22.6
12 | 1.06 | 4.0 | 0.18 | 0.0020 | 0.584 | 11.87 | 67.8 | 78.30 | 7.14 | 40.8
13 | 1.06 | 4.0 | 0.36 | 0.0020 | 0.584 | 13.47 | 76.6 | 64.80 | 8.82 | 50.4
14 | 3.18 | 2.0 | 0.36 | 0.0020 | 0.584 | 11.86 | 22.6 | 88.80 | 12.92 | 24.6
15 | 3.18 | 4.0 | 0.18 | 0.0056 | 0.0 | 2.80 | 5.3 | 89.80 | 12.08 | 23.0
16 | 1.06 | 2.0 | 0.18 | 0.0056 | 0.0 | 2.03 | 11.6 | 80.00 | 5.53 | 31.6
17 | 3.18 | 3.0 | 0.27 | 0.0036 | 0.292 | 7.51 | 14.3 | 83.00 | 14.65 | 27.9
18 | 1.06 | 3.0 | 0.27 | 0.0036 | 0.292 | 7.54 | 43.1 | 81.00 | 7.54 | 43.1
19 | 2.12 | 4.0 | 0.27 | 0.0036 | 0.292 | 8.10 | 23.1 | 81.00 | 11.66 | 33.3
20 | 2.12 | 2.0 | 0.27 | 0.0036 | 0.292 | 6.95 | 19.9 | 82.00 | 10.99 | 31.4
21 | 2.12 | 3.0 | 0.36 | 0.0036 | 0.292 | 7.72 | 22.1 | 80.00 | 12.29 | 35.1
22 | 2.12 | 3.0 | 0.18 | 0.0036 | 0.292 | 7.33 | 20.9 | 90.00 | 10.32 | 29.5
23 | 2.12 | 3.0 | 0.27 | 0.0056 | 0.292 | 7.52 | 21.5 | 70.00 | 11.55 | 33.0
24 | 2.12 | 3.0 | 0.27 | 0.0020 | 0.292 | 7.52 | 21.5 | 82.00 | 11.55 | 33.0
25 | 2.12 | 3.0 | 0.27 | 0.0020 | 0.584 | 13.00 | 37.1 | 80.00 | 12.11 | 34.6
26 | 2.12 | 3.0 | 0.27 | 0.0020 | 0.0 | 2.45 | 7.0 | 81.00 | 10.47 | 29.9
27 | 2.12 | 3.0 | 0.27 | 0.0020 | 0.292 | 7.53 | 21.5 | 78.30 | 10.85 | 31.0

table 1. experimental data of bending moments and transverse forces.

figure 4. photos of the test specimens with different destruction patterns: test no. 22 – destruction pattern a; test no. 6 – destruction pattern b; test no. 2 – destruction pattern c.

in order to determine the influence coefficients of the examined factors, it was necessary to assign the program matrix of the test plan, which reflected the variation levels of these factors as well as the experimental values of the output parameter. the specified software is used under the assumption that the investigated random value is distributed according to the normal gaussian law and, as a consequence, the least squares method is applied. thus, the experimental-statistical dependences for the bending moment causing normal cracking in the simple bending zone and for the corresponding transverse load have been obtained:

$$\hat{y}(m_{crc,\perp}) = 7.55 + 0.58x_2 + 0.19x_3 + 5.0x_5 + 0.12x_5^2 + 0.17x_2x_3 \ \text{(knm)} \tag{1}$$

$$\hat{y}\big(f^{acc.}_{crc,\perp}\big) = 21.6 - 14.4x_1 + 2.1x_2 + 0.7x_3 + 18.3x_5 + 7.1x_1^2 - 0.4x_5^2 - 1.1x_1x_2 - 0.4x_1x_3 - 9.4x_1x_5 + 0.7x_2x_3 \ \text{(kn)} \tag{2}$$

the value of the coefficient at a factor x_i corresponds to the degree of its influence on the moment of formation of normal cracks in relation to the absolute term b0. the “+” sign indicates that, when this factor increases within its range of variation, the moment value increases. thus, three factors affect the moment of formation of cracks normal to the longitudinal axis of the beam (eq. (1)): the ratio of the width of the table overhangs to the rib thickness b′f/b (x2), the ratio of the thickness of the table overhangs to the working height of the section h′f/h0 (x3) and the stress level in the working reinforcement σsp (x5). the analysis of eq. (1) shows that the factor x5 has the greatest influence on the moment mcrc,⊥.
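before interpreting the coefficients, the coded-factor form of the models can be checked directly against table 1. a minimal sketch, assuming the usual −1/0/+1 coding of the three-level plan (the level values are read off the table; the coding convention itself is an inference, not stated explicitly in the text):

```python
# evaluate the experimental-statistical model (1) on coded factors and compare
# with the measured values from table 1; the natural -> coded mapping is
# inferred from the three levels of each factor in the plan
def coded(value, lo, mid, hi):
    return (value - mid) / (hi - mid)        # lo -> -1, mid -> 0, hi -> +1

def m_crc_perp(x2, x3, x5):
    return (7.55 + 0.58 * x2 + 0.19 * x3 + 5.0 * x5
            + 0.12 * x5 ** 2 + 0.17 * x2 * x3)          # knm, model (1)

# rows 24-26 of table 1 differ only in the prestressing level x5
for x5_nat, measured in [(0.292, 7.52), (0.584, 13.00), (0.0, 2.45)]:
    pred = m_crc_perp(coded(3.0, 2.0, 3.0, 4.0),
                      coded(0.27, 0.18, 0.27, 0.36),
                      coded(x5_nat, 0.0, 0.292, 0.584))
    print(f"x5 = {x5_nat:5.3f}: model {pred:5.2f} knm, test {measured:5.2f} knm")
```

the same helper applies verbatim to the remaining coded-factor models (2)–(6) below.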
thus, when the prestress σsp increases from 0 to 30 kn/cm², the cracking moment in the stretched rib increases, relative to the average value, by 135 % (b0 is the average value of the output parameter over the entire series of the experiments, determined by the standard procedure of the theory of mathematical planning). with increasing factors x2 and x3, it increases by 16 % and 8.5 %, respectively. the positive sign of the quadratic effect x5² indicates that, with a further increase of the prestress level beyond the range of variation of this factor, there will be a significant increase of the crack formation moment. the output parameter also increases with a simultaneous increase of the factors x2 and x3 (interaction). the factor x4 is insignificant: its influence is less than 5 % in relation to the absolute term of the model, and consequently this factor is not reflected in the model. from eq. (2), we can see that the load corresponding to the formation of normal cracks is significantly influenced by both the prestressing level and the relative shear span. increasing x1 from 1.06 to 3.18 decreases the value facc.crc,⊥, in relation to the average value, by 131 %, and increasing x5 increases it by 175 %. there is also an interaction of the factors with each other: with a simultaneous increase of a/h0 and σsp, of a/h0 and b′f/b, or of a/h0 and h′f/h0, the value of facc.crc,⊥ decreases. the factor x4 was insignificant, its influence being less than 5 % in relation to the absolute term of the model; consequently, this factor is not reflected in the model.

in [18–20], one can find information on the crack resistance of normal sections of the investigated elements but, as a rule, there is no comprehensive approach to assessing the influence of the experimental factors on the considered output parameter (crack resistance). the researchers in [21] choose one or two factors for the analysis and study their influence. with such an approach, however, it is impossible to establish the interaction of the factors with each other, while it naturally often exists. before the normal cracks appeared, the investigated reinforced concrete element deformed like an elastic body: the increase in the deformations and deflections of the element was proportional to the increase in the external load. when the first cracks appeared, the stresses in the tensioned concrete decreased to zero, and in the tensioned reinforcement they increased abruptly. by processing the experimental data on the inclined crack formation in the shear span, adequate mathematical models were obtained for the shearing force and the corresponding bending moment at this loading level:

$$\hat{y}\big(f_{crc,/} = q_{crc,/}\big) = 33.0 - 7.0x_1 + 1.0x_2 + 2.8x_3 + 2.3x_5 + 2.3x_1^2 - 0.9x_2^2 - 1.0x_3^2 - 1.0x_5^2 - 1.8x_1x_3 - 1.6x_1x_5 + 2.4x_2x_3 + 1.0x_2x_4 \ \text{(kn)} \tag{3}$$

$$\hat{y}\big(m^{acc.}_{crc,/}\big) = 11.51 + 3.23x_1 + 0.33x_2 + 0.70x_3 + 0.57x_5 - 0.52x_1^2 - 0.29x_2^2 - 0.30x_3^2 - 0.32x_5^2 + 0.17x_1x_2 + 0.84x_2x_3 + 0.34x_2x_4 + 0.18x_3x_5 \ \text{(knm)} \tag{4}$$

the value of the shearing force at which inclined cracks appear is influenced by four factors: the relative shear span, the ratio of the width of the table overhangs to the thickness of the rib, the ratio of the thickness of the table overhangs to the working height of the section, and the prestressing level in the working reinforcement.
the analysis of eq. (3) shows that the factor x1 has the greatest influence on the shearing force: when the relative shear span increases from 1.06 to 3.18, the shearing force of the inclined crack formation decreases, with respect to the average value (b0), by 42 %. increasing the ratio of the width of the table overhangs to the thickness of the rib from 2 to 4 increases the indicated force by 6 %. the factors x3 and x5 have a more significant influence than the factor x2: increasing h′f/h0 from 0.18 to 0.36 increases the shearing force of the formation of inclined cracks by 17 %, and increasing σsp from 0 to 30 kn/cm² – by 14 %. the negative signs of the quadratic effects x2², x3² and x5² indicate that, with a further increase of these factors beyond their ranges of variation, there will be no significant increase of the shearing force at which the inclined cracks appear. there is also a mutual interaction of the factors: if a/h0 and h′f/h0, or a/h0 and σsp, increase simultaneously, the value fcrc,/ decreases; if b′f/b and h′f/h0, or b′f/b and ρsw, increase simultaneously, the value fcrc,/ increases. comparing the experimental data with the calculated ones, it was found that the experimental values of the cracking load are, as a rule, higher than the calculated ones, which should be taken into account in the design process. the obtained results on the opening width of normal cracks correspond to the experimental and calculated results of [22–24]. when processing the experimental data, adequate experimental-statistical (mathematical) models were obtained that characterize the opening width of the cracks normal and inclined to the longitudinal axis of the beam before its destruction, under an external load of 0.95fu:

$$\hat{y}\big(w^{0.95f_u}_{crc,\perp}\big) = 0.157 + 0.011x_2 + 0.011x_3 - 0.032x_5 - 0.002x_5^2 + 0.009x_2x_3 \ \text{(mm)} \tag{5}$$

$$\hat{y}\big(w^{0.95f_u}_{crc,/}\big) = 0.765 + 0.101x_1 - 0.079x_2 - 0.012x_3 - 0.019x_4 - 0.042x_5 - 0.072x_1x_2 - 0.022x_1x_3 - 0.067x_1x_5 + 0.006x_2x_3 + 0.058x_2x_4 + 0.026x_2x_5 + 0.021x_3x_4 + 0.006x_4x_5 \ \text{(mm)} \tag{6}$$

as can be seen from model (5), the prestress level in the working reinforcement σsp (x5), the ratio of the width of the table overhangs to the thickness of the rib b′f/b (x2) and the ratio of the thickness of the table overhangs to the working height of the section h′f/h0 (x3) influence the opening width of normal cracks the most.
no. | x1 (a/h0) | x2 (b′f/b) | x3 (h′f/h0) | x4 (ρsw) | x5 (p/(fcd b h0)) | w0.95fu crc,⊥ (mm) | w0.95fu crc,/ (mm) | c0 (cm)
1 | 3.18 | 4.0 | 0.36 | 0.0056 | 0.584 | 0.16 | 0.66 | 38
2 | 1.06 | 2.0 | 0.36 | 0.0056 | 0.584 | 0.12 | 0.63 | 13
3 | 1.06 | 4.0 | 0.18 | 0.0020 | 0.0 | 0.18 | 0.57 | 16
4 | 3.18 | 2.0 | 0.18 | 0.0020 | 0.0 | 0.18 | 1.30 | 46
5 | 1.06 | 4.0 | 0.18 | 0.0056 | 0.584 | 0.12 | 0.70 | 13
6 | 3.18 | 2.0 | 0.18 | 0.0056 | 0.584 | 0.11 | 0.84 | 38
7 | 3.18 | 4.0 | 0.36 | 0.0020 | 0.0 | 0.22 | 0.70 | 46
8 | 1.06 | 2.0 | 0.36 | 0.0020 | 0.0 | 0.18 | 0.75 | 15
9 | 1.06 | 4.0 | 0.36 | 0.0056 | 0.0 | 0.22 | 0.68 | 15
10 | 3.18 | 2.0 | 0.36 | 0.0056 | 0.0 | 0.18 | 1.04 | 46
11 | 3.18 | 4.0 | 0.18 | 0.0020 | 0.584 | 0.12 | 0.64 | 38
12 | 1.06 | 4.0 | 0.18 | 0.0020 | 0.584 | 0.11 | 0.75 | 13
13 | 1.06 | 4.0 | 0.36 | 0.0020 | 0.584 | 0.16 | 0.67 | 13
14 | 3.18 | 2.0 | 0.36 | 0.0020 | 0.584 | 0.12 | 0.88 | 38
15 | 3.18 | 4.0 | 0.18 | 0.0056 | 0.0 | 0.18 | 0.85 | 46
16 | 1.06 | 2.0 | 0.18 | 0.0056 | 0.0 | 0.18 | 0.55 | 15
17 | 3.18 | 3.0 | 0.27 | 0.0036 | 0.292 | 0.16 | 0.86 | 42
18 | 1.06 | 3.0 | 0.27 | 0.0036 | 0.292 | 0.16 | 0.66 | 14
19 | 2.12 | 4.0 | 0.27 | 0.0036 | 0.292 | 0.17 | 0.68 | 28
20 | 2.12 | 2.0 | 0.27 | 0.0036 | 0.292 | 0.15 | 0.84 | 28
21 | 2.12 | 3.0 | 0.36 | 0.0036 | 0.292 | 0.17 | 0.75 | 28
22 | 2.12 | 3.0 | 0.18 | 0.0036 | 0.292 | 0.15 | 0.78 | 28
23 | 2.12 | 3.0 | 0.27 | 0.0056 | 0.292 | 0.16 | 0.74 | 28
24 | 2.12 | 3.0 | 0.27 | 0.0020 | 0.292 | 0.16 | 0.78 | 28
25 | 2.12 | 3.0 | 0.27 | 0.0020 | 0.584 | 0.12 | 0.72 | 25
26 | 2.12 | 3.0 | 0.27 | 0.0020 | 0.0 | 0.19 | 0.81 | 31
27 | 2.12 | 3.0 | 0.27 | 0.0020 | 0.292 | 0.15 | 0.83 | 28

table 2. experimental data of the normal and inclined crack widths and of the projection length of a dangerous inclined crack.

so, when the prestressing level σsp increases from 0 to 30 kn/cm², the opening width of normal and inclined cracks decreases, relative to the average value (b0), by 41 % and 12 %, respectively. when the ratio of the width of the table overhangs to the thickness of the rib b′f/b increases from 2 to 4 and the ratio of the thickness of the table overhangs to the working height h′f/h0 increases from 0.18 to 0.36, the opening width of normal cracks increases by 14 %, while the opening width of inclined cracks decreases by 21 % and 23 %, respectively. the negative sign of the quadratic effect x5² indicates that, with a further increase of this factor beyond its range of variation, there will be a more significant decrease in the opening width of the cracks normal to the longitudinal axis of the beam. the analysis of the mathematical model (6) shows that the relative shear span a/h0 (x1), the ratio of the width of the table overhangs to the rib thickness b′f/b and the prestressing level in the working reinforcement σsp have the most significant influence on the opening width of the cracks inclined to the longitudinal axis of the beam. so, when the relative shear span increases from 1.06 to 3.18, the opening width of inclined cracks increases, with respect to the average value (b0), by 26.4 %. when the ratio of the width of the table overhangs to the rib thickness increases from 2 to 4 and the prestressing increases from 0 to 30 kn/cm², the opening width of inclined cracks decreases by 20.7 % and 11 %, respectively. there is also a mutual interaction of the factors: with a simultaneous increase of a/h0 and b′f/b, of a/h0 and h′f/h0, or of a/h0 and σsp, the value wcrc,/ decreases; with a simultaneous increase of b′f/b and h′f/h0, of b′f/b and ρsw, of b′f/b and σsp, of h′f/h0 and ρsw, or of ρsw and σsp, the value wcrc,/ increases. it is very important that the average opening width of inclined cracks is almost twice the allowable value.
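the crack-width models can be spot-checked against table 2 in the same coded-factor convention as before. a minimal sketch (the rows are chosen so that all the remaining coded factors are zero; the coding is again an inference from the plan levels):

```python
def w_perp(x2, x3, x5):      # model (5): normal crack width at 0.95 fu, mm
    return 0.157 + 0.011*x2 + 0.011*x3 - 0.032*x5 - 0.002*x5**2 + 0.009*x2*x3

def w_incl_x1_only(x1):      # model (6) with all other coded factors at 0, mm
    return 0.765 + 0.101*x1

print(w_perp(0.0, 0.0, +1.0), w_perp(0.0, 0.0, -1.0))   # ~0.123 / ~0.187 mm
# table 2, rows 25 and 26: measured 0.12 and 0.19 mm
print(w_incl_x1_only(+1.0), w_incl_x1_only(-1.0))       # ~0.866 / ~0.664 mm
# table 2, rows 17 and 18: measured 0.86 and 0.66 mm
```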
since the amount of the cross reinforcement studied in the experiments and used in practice had an insignificant effect (up to 5 %) on the opening width of inclined cracks, it can be assumed that, for these purposes, it is necessary to increase the rib width and the amount of the cross reinforcement. it is particularly remarkable that the width of the overhangs and the thickness of the table play a significant role in the opening width of normal as well as inclined cracks, which must be taken into account in the calculations. also, according to the test results, an adequate experimental-statistical (mathematical) model of the projection length of a dangerous inclined crack was obtained:

$$\hat{y}(c_0) = 28.0 + 14.2x_1 - 2.6x_5 - 1.4x_1x_5 \ \text{(cm)} \tag{7}$$

the analysis of this model shows that the relative shear span and the prestressing level in the working reinforcement have a significant impact on the size of a dangerous inclined crack; the rest of the factors were insignificant. so, when the relative shear span increases from 1.06 to 3.18, the projection length of a dangerous inclined crack increases, in relation to the average value (b0), by 101.4 %; when the prestressing level increases from 0 to 30 kn/cm², it decreases by 18.6 %. the fairly large number of prototypes allowed us to conclude that the size of a dangerous inclined crack ranges within (0.8...2.8)h0, which is somewhat over the recommended limits [25]; moreover, this increase is quite reasonable. after replacing the coded values of the experimental factors with the natural ones, the authors obtained an empirical expression for determining the projection of a dangerous inclined crack for prestressed reinforced concrete t-shaped beams:

$$\hat{c}_0 = \big[0.8690\,(a/h_0) + 0.0014\,\sigma_{sp} - 0.0057\,(a/h_0)\,\sigma_{sp} - 0.040\big]\,h_0, \tag{8}$$

where σsp is in kn/cm² and c0 is in cm. the obtained equation, as well as the experimental data on the projection of a dangerous inclined crack, represents the practical significance of the obtained results. it is known that the shear force arising in an inclined section of a reinforced concrete structure consists of three parts, carried by the concrete of the compressed zone, by the stretched cross rods crossed by a dangerous inclined crack, and by the longitudinal reinforcement in the form of a dowel action. knowing the height and the characteristics of the concrete in the compressed zone, it is easy to determine the sum of the shear forces carried by the concrete. the value of the dowel action of the longitudinal reinforcement, as a rule, does not exceed 10 % and is typically about 5 % of the breaking shear force. taking the tensile stresses in the rods of the cross reinforcement at the level of 80 % of its yield point, it is easy to find the value of the shear force carried by the cross reinforcement (stirrups), i.e. the intensity of the cross reinforcement, if the projection length of a dangerous inclined crack is known. knowing the values of the design cross force and the projection length of a dangerous inclined crack, it is easy to determine the required amount of the cross reinforcement. there is another rather significant result: during the research, it was found that the experimental values of the cracking load exceed the calculated values by 13 % on average. this must be taken into account at the design stage of reinforced concrete beam elements in order to predict the moment of cracking more accurately.
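as a consistency check of (7), the coded model can be evaluated against the measured projections in table 2. a minimal sketch (coded values −1/0/+1 are inferred from the plan levels, as before):

```python
def c0_coded(x1, x5):        # model (7): projection of the dangerous crack, cm
    return 28.0 + 14.2 * x1 - 2.6 * x5 - 1.4 * x1 * x5

# table 2: row 1 (x1 = +1, x5 = +1), row 4 (+1, -1), row 18 (-1, 0)
for (x1, x5), measured in [((1, 1), 38), ((1, -1), 46), ((-1, 0), 14)]:
    print(f"x1 = {x1:+d}, x5 = {x5:+d}: model {c0_coded(x1, x5):.1f} cm, "
          f"test {measured} cm")
```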
based on the obtained data, physical models of the bearing capacity of the beam structures will be built. at the same time, it is very important to correctly take into account all their components, which are, as a rule, stochastic (random) values, and to identify the objective conformities. on the basis of the established physical models, refined design schemes and design models of the shear capacity of such a complex composite material as reinforced concrete will be developed.

4. conclusions

in accordance with the obtained test results, the following was found:

(1.) the support areas of the experimental reinforced concrete elements with small shear spans (a/h0 ≈ 1) are destroyed along an inclined compressed band. the destruction of the support areas of the beam samples with medium (a/h0 ≈ 2) and large (a/h0 ≈ 3) shear spans occurs along a dangerous inclined crack under the predominant action of the shear force or the bending moment, depending on the stress values in the working reinforcement at the beginning of cracking. the projection length of a dangerous inclined crack significantly exceeds the standard values and depends on the size of the shear span as well as on the prestressing of the working reinforcement.

(2.) a comparative analysis of eq. (2) and eq. (3), and of eq. (1) and eq. (4), showed that the qualitative influence of the investigated factors on the cross load and the moments corresponding to the appearance of the first normal and inclined cracks is generally the same. the main difference is that the loading level at which inclined cracks appear is 53 % higher than the loading level at which the first normal cracks appear in the investigated reinforced concrete elements. in this case, as expected, the prestressing level of the main longitudinal reinforcement has a greater influence on the loading corresponding to the appearance of the first normal cracks than on the loading at which the first inclined cracks appear, which corresponds to the test data [26–29]. the appearance of inclined cracks was fast, i.e. there was a non-proportional increase of the deflections of the investigated elements.

(3.) for the specified ratios of the factors, the opening width of inclined cracks is more than twice the width of normal cracks, and at the operational load level it reaches its limiting values. the presence of a beam table in the compressed zone of the samples significantly affects not only the crack resistance and deformability but also the strength.

list of symbols

facc.crc,⊥ – transverse force that causes the appearance of normal cracks [kn]
facc.crc,/ – transverse force that causes the appearance of inclined cracks [kn]
fu – fracture transverse force [kn]
mcrc,⊥ – bending moment that causes the appearance of normal cracks [knm]
mcrc,/ – bending moment that causes the appearance of inclined cracks [knm]
w0.95fu crc,⊥ – normal crack width [mm]
w0.95fu crc,/ – inclined crack width [mm]
c0 – projection length of a dangerous inclined crack [cm]

references

[1] o. harkava, b. barilyak. bearing capacity calculation of reinforced concrete crane beams under biaxial bending (in ukrainian). collected scientific works of ukrainian state university of railway transport (175):77–83, 2018. https://doi.org/10.18664/1994-7852.175.2018.127166.
[2] t. azizov, o. melnyk. experimental studies of rigidity and strength of reinforced concrete elements of box section with normal torsional cracks. resource-economical materials, constructures, buildings and structures (21):82–86, 2011.
https://dspace.udpu.edu.ua/handle/6789/665.
[3] a. deifalla, a. awad, h. seleem, a. abdelrahman. experimental and numerical investigation of the behavior of lwfc l-girders under combined torsion. structures 26:362–377, 2020. https://doi.org/10.1016/j.istruc.2020.03.070.
[4] a. b. golyshev, v. kolchunov. resistance of reinforced concrete. basis, kyiv, 2009.
[5] a. iakovenko, i. kolchunov. the development of fracture mechanics hypotheses applicable to the calculation of reinforced concrete structures for the second group of limit states. istrazivanja i projektovanja za privredu 15(3):367–376, 2017. https://doi.org/10.5937/jaes15-14662.
[6] a. marí, j. bairán, a. cladera, et al. shear-flexural strength mechanical model for the design and assessment of reinforced concrete beams. structure and infrastructure engineering 11(11):1399–1419, 2014. https://doi.org/10.1080/15732479.2014.964735.
[7] d. de domenico. torsional strength of rc members using a plasticity-based variable-angle space truss model accounting for non-uniform longitudinal reinforcement. engineering structures 228:111540, 2021. https://doi.org/10.1016/j.engstruct.2020.111540.
[8] d. de domenico, g. ricciardi. shear strength of rc beams with stirrups using an improved eurocode 2 truss model with two variable-inclination compression struts. engineering structures 198:109359, 2019. https://doi.org/10.1016/j.engstruct.2019.109359.
[9] v. m. karpiuk. calculating models of power resistance of the span reinforced concrete constructions at the general case of stress state. odaba, odesa, 2014.
[10] recommendations on the use of reinforcing steel according to dstu 3760-98 in the design and manufacture of structures without prestressing reinforcement. gosstroy of ukraine, technical committee for standardization “reinforcement for reinforced concrete structures”, kiev, 2002.
[11] dstu 3760-98. reinforcing steel for reinforced concrete structures. gosstandart of ukraine, kiev, 1998.
[12] v. s. dorofeev, v. m. karpiuk, f. r. karpiuk. calculation of deflections of prestressed reinforced concrete t-elements. mechanics and physics of building materials and structures destruction (8):402–415, 2009.
[13] v. s. dorofeev, v. m. karpiuk, f. r. karpiuk. strength calculation of support sections of prestressed reinforced concrete t-elements. diagnosis, durability and reconstruction of bridges and building structures (11):13–26, 2009.
[14] v. s. dorofeev, v. m. karpiuk, f. r. karpiuk. modeling of stress-strain state of prestressed concrete t-beams used in agro-industrial construction. in proceedings of the international scientific and practical forum “ecological, technological and socio-economic aspects of effective use of material and agricultural base of aic”, 1, pp. 522–530, 2008.
voznesenskiy, t. v. lyashenko, b. l. ogarkov. numerical methods for solving construction and technological problems with a computer. high school, kyiv, 1989. [18] p. i. vasiliev, o. a. rochnyak, n. n. yaroshin. influence of the nature of cracking on the resistance of reinforced concrete elements to shear force. improving the methods of calculation and research of types of reinforced concrete structures (1):19–25, 1981. [19] a. s. zalesov, o. f. ilyin. crack resistance of reinforced concrete elements inclined sections. limit state of reinforced concrete structures elements (1):56–68, 1976. [20] y. y. luchko, v. n. chubrikov, v. f. lazar. strength, crack resistance and durability of concrete and reinforced concrete structures on the basis of fracture mechanics. kamenar, lviv, 1999. [21] y. l. izotov. strength of reinforced concrete beams. budivelnik, kiev, 1978. [22] o. v. romashko, v. m. romashko. model of multilevel formation of normal cracks in reinforced concrete elements and structures. iop conference series: materials science and engineering 708:012069, 2019. https://doi.org/10.1088/1757-899x/708/1/012069. [23] v. romashko, o. romashko. calculation of the crack resistance of reinforced concrete elements with allowance for the levels of normal crack formation. matec web of conferences 230:02028, 2018. https://doi.org/10.1051/matecconf/201823002028. [24] v. i. kolchunov, a. i. demianov, i. a. iakovenko, m. o. garba. bringing the experimental data of reinforced concrete structures crack resistance in correspondence with their theoretical values (in russian). science and construction 15(1):42–49, 2018. https://doi.org/10.33644/scienceandconstruction.v0i1(15).7. [25] national building standards of ukraine “concrete and reinforced concrete structures”, minregionbud, kyiv, 2011. [26] f. s. zamaliev. numerical and full-scale experiments of prestressed hybrid reinforced concrete-steel beams (in russian). vestnik mgsu (3):309–321, 2018. https://doi.org/10.22227/1997-0935.2018.3.309-321. [27] i. iakovenko, v. kolchunov, i. lymar. rigidity of reinforced concrete structures in the presence of different cracks. matec web of conferences 116:02016, 2017. https://doi.org/10.1051/matecconf/201711602016. [28] z. blikharskyy, r. vashkevych, p. vegera, y. blikharskyy. crack resistance of rc beams on the shear. in proceedings of cee 2019, pp. 17–24. springer international publishing, 2019. https://doi.org/10.1007/978-3-030-27011-7_3. [29] p. vegera, r. khmil, r. vashkevych, z. blickharskyy. comparison crack resistance of rc beams with and without transverse reinforcement after shear testing. quality production improvement qpi 1(1):342–349, 2019. https://doi.org/10.2478/cqpi-2019-0046. 643 https://doi.org/10.4028/www.scientific.net/msf.968.209 https://doi.org/10.1088/1757-899x/708/1/012069 https://doi.org/10.1051/matecconf/201823002028 https://doi.org/10.33644/scienceandconstruction.v0i1(15).7 https://doi.org/10.22227/1997-0935.2018.3.309-321 https://doi.org/10.1051/matecconf/201711602016 https://doi.org/10.1007/978-3-030-27011-7_3 https://doi.org/10.2478/cqpi-2019-0046 acta polytechnica 61(5):633–643, 2021 1 introduction 2 materials and methods 3 results 4 conclusions list of symbols references acta polytechnica https://doi.org/10.14311/ap.2021.61.0242 acta polytechnica 61(1):242–252, 2021 © 2021 the author(s). 
application of global optimization to predict strains in rc compressed columns

marek lechman^a,∗, andrzej stachurski^b

a building research institute, filtrowa 1, 00-611 warsaw, poland
b warsaw university of technology, institute of control and computation engineering, nowowiejska 15/19, 00-665 warsaw, poland
∗ corresponding author: m.lechman@itb.pl

abstract. in this paper, the results of an application of global and local optimization methods to solve the problem of determining the strains in rc compressed structure members are presented. solutions of appropriate sets of nonlinear equations in the presence of box constraints have to be found. the use of the least squares method leads to finding global solutions of optimization problems with box constraints. numerical examples illustrate the effects of the loading value and the loading eccentricity on the strains in the concrete and in the reinforcing steel in the cross-section. three different minimization methods were applied to compute them: trust region reflective, a genetic algorithm tailored to problems with real double variables, and the particle swarm method. numerical results on practical data are presented. in some cases, several solutions were found. their existence was detected by the local search with multistart, while the genetic and particle swarm methods failed to recognize their presence.

keywords: global optimization, nonlinear equations, least squares method, rc compressed structure members.

1. introduction

our problem is to determine the normal strains in the cross-sections of reinforced concrete structure members subjected to compression. mathematically, it may be formulated as a task of solving sets of equations with box constraints. the unknown variables are: ε′ – the maximum strain in the cross-section, and ξ – the coordinate describing the location of the neutral axis. the presence of the box constraints makes a direct use of numerical methods for solving sets of nonlinear equations impractical. therefore, our task is reformulated by means of the frequently used least squares method. it leads to a nonlinear, nonconvex optimization problem of finding a minimum of a nonlinear function with a restricted range of the variables.

1.1. motivation to study the strains in rc compressed structure members

reinforced concrete structure members subjected to compression are frequently encountered in the engineering practice (columns, pillars, tower-like structures etc.). the determination of strains is very important in the safety assessment of existing rc structures. in order to solve this problem analytically, several physical models of materials and methods were proposed. lechman and lewiński [1] considered a generalized linear section model. a simplified approach based on the rectangular stress distribution for concrete was used by knauff [2] and knauff et al. [3]. nieser and engel [4] and cicind [5] applied the parabola-rectangle diagram for the design of cross-sections. for the reinforcing steel itself, both linear and nonlinear models are used, see for instance lechman and stachurski [6] and lechman [7–10], where the ring sections were investigated. the results of fe (finite element) modelling of the failure behaviour of rc compressed columns were presented by majewski et al. [11] and rodriguez et al. [12]. in kim and lee [13], a numerical method for predicting the behaviour of rc columns subjected to an axial force and biaxial bending is proposed and verified in tests. campione et al. [14] experimentally investigated the behaviour of compressed concrete columns subjected to the overcoring technique, see also campione et al. [15]. the list of researchers working in various directions could be continued; let us mention some of them: lloyd and rangan [16], bonet et al. [17], ye et al. [18], xu et al. [19], trapko and musiał [20], trapko [21], hadi and le [22], el maddawy et al. [23], csuka and kollar [24], elwan and rashed [25], sadeghian et al. [26], eid and paultre [27], wu and jiang [28], quiertant and clement [29], lee et al. [30], kumar and patel [31] and many others. despite the variety of calculation methods and experimental investigations concerning this problem, there are no appropriate analytical solutions, based on nonlinear material laws and considering concrete softening, for determining the strains in rc externally compressed structure members. the aim of our paper is twofold: firstly, to formulate the equilibrium equations allowing to calculate the strains; secondly, to investigate the usefulness of some global optimization methods to solve the problem numerically.
in kim and lee [13], a numerical method for predicting the behaviour of rc columns subjected to axial force and biaxial bending is proposed and verified in tests. campione et al. [14] experimentally investigated the behaviour of compressed concrete columns subjected to the overcoring technique, see also campione et al. [15]. the list of researchers working in various directions could be continued. let’s mention some of them: lloyd and rangan [16], bonet et al. [17], ye et al. [18], xu et al. [19], trapko and musiał [20], trapko [21], hadi and le [22], el maddawy et al. [23], csuka and kollar [24], elwan and rashed [25], sadeghian et al. [26], eid and paultre [27], wu and jiang [28], quiertant and clement [29], lee et al. [30], kumar and patel [31] and many others. of course, the list is not complete. despite the variety of calculation methods and experimental investigations concerning this problem, there are not any appropriate analytical solutions based on the nonlinear material laws for determining the strains in rc externally compressed structure members that considers concrete softening. the aim of our paper is twofold. firstly, to formulate equilibrium equations allowing to calculate the strains. secondly, to investigate the usefulness of some 242 https://doi.org/10.14311/ap.2021.61.0242 https://creativecommons.org/licenses/by/4.0/ https://www.cvut.cz/en vol. 61 no. 1/2021 global optimization to predict strains in rc compressed columns figure 1. distribution of strain �, stresses in concrete σc and stresses in steel σs across the section global optimization methods to solve the problem numerically. 2. formulation of the equilibrium equations to get the required equations, we started with the integral equilibrium equations and integrated them. the rectangular rc cross-section is subjected to the axial force n and the bending moment m (see fig. 1). the content of the current section is an extension of that presented in lechman and stachurski [32]. the detailed way of deriving the formulas for the section wholly in compression is included. in the derivation of the governing equations, the following assumptions are made: • plane cross-sections remain plane, • elasto-plastic stress/strain relationships for concrete and reinforcing steel are used, • the tensile strength of concrete is ignored, • the ultimate strains for concrete are determined as �cu and for reinforcing steel as �su. in fig. 1, the following notation is used: t, b the thickness and the width of the cross-section, respectively, t1, t2 coordinates describing the locations of rebars, x, x ′ coordinates describing the location of the neutral axis and the location of any point of the section, respectively. in accordance with the eurocode 2 [33], the stress-strain relation for concrete σc – �c in compression for a short term uniaxial loading is assumed as σc = kηc −η2c 1 + (k − 2)ηc fcm, (1) where: ηc = �c/�c1, �c1 – the strain at peak stress on the σc – �c diagram, k = 1.05 ecm|�c1|/fcm, fcm – the mean compressive strength of concrete, ecm – secant modulus of elasticity of concrete, �cu (�cu1) – the ultimate strain for concrete. the reinforcing steel is characterized by yield stress fyk, es – modulus of elasticity and eh – coefficient of steel hardening (linear elastic model with hardening). 2.1. equations for strains in the rectangular sections in further considerations, the corresponding dimensionless coordinates are used: ξ = x/t, ξ ′ = x ′ /t, ξ1 = t1/t, ξ2 = t2/t. (2) 2.1.1. 
2.1.1. equations for sections wholly in compression

let us consider the section wholly in compression. the strain distribution can be expressed in the form
\[ \varepsilon = \varepsilon_1 + (\varepsilon_2 - \varepsilon_1)\xi', \tag{3} \]
where: ε1 – the maximum compressive strain in the cross-section, ε2 – the minimum compressive strain in the cross-section. thus, η_c occurring in (1) assumes the form
\[ \eta_c = k_2\xi' + k_1, \tag{4} \]
after including in (3) the following assignments: k_1 = ε_1/ε_{c1} and k_2 = (ε_2 - ε_1)/ε_{c1}. the equilibrium equation of the axial forces in the cross-section takes the following form
\[ \int_{A_c} \sigma_c\, dA_c + \sigma_{s1}F_{a1} + \sigma_{s2}F_{a2} + N = 0, \tag{5} \]
where: dA_c – an element of the concrete area A_c; F_{a1}, F_{a2} – the areas of the steel in compression and in tension, respectively. the sectional equilibrium of the bending moments about the symmetry axis of the rectangle can be expressed in the form
\[ \int_{A_c} \sigma_c (0.5t - x')\, dA_c + \sigma_{s1}F_{a1}(0.5t - t_1) + \sigma_{s2}F_{a2}(0.5t - t_2) - M = 0. \tag{6} \]
in order to obtain the final form of the equilibrium equations, we integrated the formulas in (5) and (6). the most difficult part was to find the antiderivatives of the functions in the integral expressions in (5) and (6). after substituting relations (3) and (4) in relation (1), the function to be integrated in (5) is
\[ f_N(\xi') = \frac{k(k_2\xi' + k_1) - (k_2\xi' + k_1)^2}{1 + (k-2)(k_2\xi' + k_1)}, \tag{7} \]
and in (6) it is
\[ f_M(\xi') = \frac{k(k_2\xi' + k_1) - (k_2\xi' + k_1)^2}{1 + (k-2)(k_2\xi' + k_1)}\,(0.5 - \xi'). \tag{8} \]
finally, the following equilibrium equations for strains in the rectangular sections are found. the first one is the equilibrium equation of the axial forces,
\[ \bar N + \frac{1}{k-2}\Big\{ w_1 + 0.5k_2 + \frac{w_2}{w_3}(\ln w_5 - \ln w_6) \Big\}
+ \mu_1\frac{f_{yk}}{f_{cm}}\Big\{ \delta_{i_1}\Big[-1 + \frac{e_h}{f_{yk}}\big(((\varepsilon_2-\varepsilon_1)\xi_1 + \varepsilon_1) + \varepsilon_{ss}\big)\Big] + \delta_{i_1+1}\frac{(\varepsilon_2-\varepsilon_1)\xi_1 + \varepsilon_1}{\varepsilon_{ss}} \Big\}
+ \mu_2\frac{f_{yk}}{f_{cm}}\Big\{ \delta_{i_2}\Big[1 + \frac{e_h}{f_{yk}}\big(((\varepsilon_2-\varepsilon_1)(1-\xi_2) + \varepsilon_1) - \varepsilon_{ss}\big)\Big] + \delta_{i_2+1}\frac{(\varepsilon_2-\varepsilon_1)(1-\xi_2) + \varepsilon_1}{\varepsilon_{ss}} \Big\} = 0 \tag{9} \]
and the second one represents the sectional equilibrium of the bending moments,
\[ -\bar M + \frac{1}{k-2}\Big\{ -\frac{k_2}{12} + 0.5\,\frac{w_2}{w_6}(\ln w_5 - \ln w_6) - \frac{w_2}{w_3}\Big[1 - \frac{w_6}{w_4}(\ln w_5 - \ln w_6)\Big] \Big\}
+ \mu_1\frac{f_{yk}}{f_{cm}}(0.5-\xi_1)\Big\{ \delta_{i_1}\Big[1 + \frac{e_h}{f_{yk}}\big(((\varepsilon_2-\varepsilon_1)\xi_1 + \varepsilon_1) + \varepsilon_{ss}\big)\Big] + \delta_{i_1+1}\frac{(\varepsilon_2-\varepsilon_1)\xi_1 + \varepsilon_1}{\varepsilon_{ss}} \Big\}
+ \mu_2\frac{f_{yk}}{f_{cm}}(0.5-\xi_2)\Big\{ \delta_{i_2}\Big[1 + \frac{e_h}{f_{yk}}\big(((\varepsilon_2-\varepsilon_1)(1-\xi_2) + \varepsilon_1) - \varepsilon_{ss}\big)\Big] + \delta_{i_2+1}\frac{(\varepsilon_2-\varepsilon_1)(1-\xi_2) + \varepsilon_1}{\varepsilon_{ss}} \Big\} = 0 \tag{10} \]
where:
\[ w_1 = k_1 - \frac{k-1}{k-2}, \quad w_2 = k(k-2) + 1, \quad w_3 = (k-2)^2 k_2, \quad w_4 = (k-2)k_2, \]
\[ w_5 = 1 + (k-2)k_2 + k_1, \quad w_6 = 1 + (k-2)(k_2 + k_1), \quad \bar N = \frac{N}{btf_{cm}}, \quad \bar M = \frac{M}{bt^2 f_{cm}}, \]
\[ \delta_i = 0.5\big((-1)^i + 1\big), \quad i = 1, 2, \]
and: μ1 – the reinforcement ratio of the steel in compression, μ2 – the reinforcement ratio of the steel in tension. the unknown variables are: ε1, ε2.

2.1.2. section under combined compression with bending

let us consider the section under combined compression and bending. due to the bernoulli assumption, one obtains (see fig. 1)
\[ \varepsilon = \Big(1 - \frac{\xi'}{\xi}\Big)\varepsilon', \tag{11} \]
where: ε′ – the maximum compressive strain in concrete. the resulting formulas are given below.
equation (12) (for the axial forces) reads
\[ \bar N + \frac{1}{k-2}\Big\{ w_1\xi + 0.5k_2\xi^2 - \frac{1}{k-2}\Big[\frac{w_2}{w_3}\ln w - \xi\Big] \Big\}
+ \mu_1\frac{f_{yk}}{f_{cm}}\Big\{ \delta_{i_1}\Big[-1 + \frac{e_h}{f_{yk}}\Big(\Big(1-\frac{\xi_1}{\xi}\Big)\varepsilon' + \varepsilon_{ss}\Big)\Big] + \delta_{i_1+1}\frac{\varepsilon'}{\varepsilon_{ss}}\Big(1-\frac{\xi_1}{\xi}\Big) \Big\}
+ \mu_2\frac{f_{yk}}{f_{cm}}\Big\{ \delta_{i_2}\Big[+1 + \frac{e_h}{f_{yk}}\Big(\Big(1-\frac{1-\xi_2}{\xi}\Big)\varepsilon' - \varepsilon_{ss}\Big)\Big] + \delta_{i_2+1}\frac{\varepsilon'}{\varepsilon_{ss}}\Big(1-\frac{1-\xi_2}{\xi}\Big) \Big\} = 0 \tag{12} \]
and equation (13), representing the sectional equilibrium of the bending moments,
\[ -\bar M + \frac{1}{k-2}\Big\{ 0.5\Big(w_1 + \frac{1}{k-2}\Big)\xi + 0.5\Big[-w_1 + 0.5k_2 - \frac{1}{k-2}\Big]\xi^2 - \frac{1}{3}k_2\xi^3 - \frac{w_2}{(k-2)w_3}\Big[0.5\ln w + \xi - \frac{w}{w_3}\ln w\Big] \Big\}
+ \mu_1\frac{f_{yk}}{f_{cm}}(0.5-\xi_1)\Big\{ \delta_{i_1}\Big[1 + \frac{e_h}{f_{yk}}\Big(\Big(1-\frac{\xi_1}{\xi}\Big)\varepsilon' + \varepsilon_{ss}\Big)\Big] + \delta_{i_1+1}\frac{\varepsilon'}{\varepsilon_{ss}}\Big(1-\frac{\xi_1}{\xi}\Big) \Big\}
+ \mu_2\frac{f_{yk}}{f_{cm}}(0.5-\xi_2)\Big\{ \delta_{i_2}\Big[+1 + \frac{e_h}{f_{yk}}\Big(\Big(1-\frac{1-\xi_2}{\xi}\Big)\varepsilon' - \varepsilon_{ss}\Big)\Big] + \delta_{i_2+1}\frac{\varepsilon'}{\varepsilon_{ss}}\Big(1-\frac{1-\xi_2}{\xi}\Big) \Big\} = 0, \tag{13} \]
where:
\[ w_1 = k - k_2\xi, \quad w_2 = k(k-2) + 1, \quad w_3 = (k-2)k_2, \quad w = 1 + (k-2)k_2\xi, \]
\[ \delta_i = 0.5\big((-1)^i + 1\big), \ i = 1, 2, \quad k_2 = \frac{\varepsilon'}{\varepsilon_{c1}\xi}. \]
the unknown variables are:
• ε′ – the maximum strain in the cross-section,
• ξ – the coordinate describing the location of the neutral axis.

3. computational solution and numerical results

this is not our first work with models of the processes in rc structure members; we have already gained some experience with circular rc structure members [6]. this experience suggested that we have to expect many global and local solutions of the least squares problem. therefore, we decided to compare three different algorithms: a local search method (trust region reflective) started many times from all points of a net of points equally distributed on the feasible box, and genetic and particle swarm algorithms designed for searching a global optimum. for the verification of the obtained formulae, two rectangular cross-sections 0.3 m × 0.3 m under compression have been considered: an unreinforced one and a reinforced one (μf_{yk}/f_{cm} = 0.1). both sections had the following characteristics: concrete grade c20/25, yield stress of steel f_{yk} = 500 mpa (reinforced case), reinforcement ratios of the steel in compression and in tension μ1 = μ2 = μ, t1/t = t2/t = 0.1, e_h = 0. it is assumed that the resistance of the cross-section is reached when the compressive strain in concrete reaches ε_{cu} = −3.5 ‰ or the ultimate strain in the reinforcing steel equals ε_{su} = 10 ‰. after some rearrangements and substituting ε′ = x and ξ = y, the set of equations (12–13) takes the forms (14–15) for the unreinforced and (16–17) for the reinforced cross-sections, respectively. due to the appearance of the term y − x in the denominator in some sets of equations, a danger of division by 0 occurs. for this reason, fmincon has finally been applied with the algorithm option set to "interior point method", which allowed us to include special constraints eliminating this danger. moreover, the existence of multiple minima cannot be avoided. the least squares formulation of the problem itself may, in general, introduce extra local solutions (such a counterexample may be found in stachurski [34]). this has been confirmed by our computational results: many local minima, and sometimes several global minima resulting from the numerical properties of the optimization problem, were encountered. therefore, the clusterization idea imported from the methods of global optimization was incorporated (see, for instance, törn and żilinskas [35]). the size of the problem and the computation time were of secondary importance.
we have also tested the genetic and particle swarm algorithms from matlab's global optimization toolbox, comparing them with a local search method started from all points of the net covering the whole set ω of feasible points. for testing purposes, the sets of equations describing the reinforced or unreinforced concrete sections subjected to compression were used. the equations for the concrete without reinforcement – subject to compression with bending – are
\[ r_1(x,y) = -a_1 + (2.25 + 0.5x)y - 0.25xy - 4\Big[-12.5\,\frac{y}{x}\ln(1 - 0.125x) - y\Big] = 0, \tag{14} \]
\[ r_2(x,y) = -a_2 + (3.125 + 0.25x)y + \Big[-3.125 + 0.25x - 0.125\,\frac{x}{y}\Big]y^2 + 0.16667xy^2 + 50\,\frac{y}{x}\Big[0.5\ln(1 - 0.125x) + y + 8(1 - 0.125x)\frac{y}{x}\ln(1 - 0.125x)\Big] = 0, \tag{15} \]
where x is the maximum compressive strain in concrete, x ∈ [−5, −10⁻¹⁰], and y is the coordinate specifying the location of the neutral axis of the cross-section, y ∈ [10⁻¹⁰, 1]. different values of the constants a1 and a2 correspond to different axial forces n and bending moments m; these parameters are collected in table 1. the corresponding equations for the reinforced concrete section subjected to compression with bending are given below:
\[ r_1(x,y) = (2.25 + 0.5x)y - 0.25xy - 4\Big[-12.5\,\frac{y}{x}\ln(1 - 0.125x) - y\Big] + 0.01x\Big(1 - \frac{0.9}{y}\Big) - a_1 = 0, \tag{16} \]
\[ r_2(x,y) = (3.125 + 0.25x)y - \Big[3.125 + 0.25x + 0.125\,\frac{x}{y}\Big]y^2 + 0.16667xy^2 + 50\,\frac{y}{x}\Big[0.5\ln(1 - 0.125x) + y + 8(1 - 0.125x)\frac{y}{x}\ln(1 - 0.125x)\Big] + 0.004x\Big(1 - \frac{0.9}{y}\Big) - a_2 = 0, \tag{17} \]
where x and y have the same meaning and scope as in equations (14) and (15). we used two sets of the constant parameters a1 and a2 for that case, specified below:

set no. | a1 | a2
1 | 0.143445 | 0.0292155
2 | 0.129182 | 0.0348055

we have to solve sets of two nonlinear equations with the two unknowns x and y specified above,
\[ \begin{cases} r_1(x,y) = 0 \\ r_2(x,y) = 0 \end{cases} \quad \text{where } \begin{bmatrix} x \\ y \end{bmatrix} \in \omega, \qquad \omega = \Big\{ \begin{bmatrix} x \\ y \end{bmatrix} \in \mathbb{R}^2 \ \Big|\ x_l \le x \le x_u,\ y_l \le y \le y_u \Big\}, \tag{18} \]
where x_l, x_u are the lower and upper bounds on the variable x and, similarly, y_l, y_u are the lower and upper bounds on the variable y. due to their presence, a direct use of numerical methods for solving sets of nonlinear equations seems to be impractical. therefore, our task was reformulated by means of the frequently used least squares method. it has led to a nonlinear, nonconvex optimization problem
\[ \min_{(x,y)} f(x,y) = \tfrac{1}{2}\big(r_1^2(x,y) + r_2^2(x,y)\big) \quad \text{s.t.} \quad x_l \le x \le x_u, \ \ y_l \le y \le y_u. \tag{19} \]

set no. | a1 | a2
1 | 0.17157 | 0.01717
2 | 0.13345 | 0.02522
3 | 0.10918 | 0.026805
4 | 0.08579 | 0.02574
5 | 0.07065 | 0.02369
6 | 0.04448 | 0.01763
7 | 0.02571 | 0.01138
table 1. sets of parameters for unreinforced concrete subjected to compression with bending.

below, the results of the local search with multiple starting points, the genetic algorithm and the particle swarm optimization method are given. in the first approach, we selected the fmincon function from the matlab optimization toolbox as a tool to solve the least squares problem (19), because it allows the introduction of the box constraints.
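before the step-by-step description, here is a minimal python sketch of the same multistart-with-clusterization idea, assuming scipy is available; l-bfgs-b plays the role of matlab's fmincon, and the callables r1, r2 stand for the residuals of eqs. (14)–(15) or (16)–(17). the tolerances mirror the restol and disttol values quoted below.

```python
# minimal sketch: multistart local search over a net of starting points with
# clusterization of the found minima (cf. the matlab/fmincon procedure below).
import numpy as np
from scipy.optimize import minimize

def f(p, r1, r2):
    """least squares objective (19): f = 0.5 * (r1^2 + r2^2)."""
    x, y = p
    return 0.5 * (r1(x, y) ** 2 + r2(x, y) ** 2)

def multistart(r1, r2, xl, xu, yl, yu, n=20, restol=1e-20, disttol=1e-10):
    clusters = []                                   # list of cluster seeds
    for x0 in np.linspace(xl, xu, n):               # net covering the box omega
        for y0 in np.linspace(yl, yu, n):
            res = minimize(f, [x0, y0], args=(r1, r2),
                           method="L-BFGS-B", bounds=[(xl, xu), (yl, yu)])
            if res.fun >= restol:                   # not a solution of (18)
                continue
            for c in clusters:                      # clusterization step
                if np.linalg.norm(res.x - c["x"]) <= disttol:
                    if res.fun < c["f"]:            # keep the better point
                        c["x"], c["f"] = res.x, res.fun
                    break
            else:                                   # a new cluster seed
                clusters.append({"x": res.x, "f": res.fun})
    return clusters
```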
the steps of the actual procedure may be summarized as follows:
• set s – the set of solution clusters – to be an empty set;
• while there are unused points in the net covering ω:
– take a new point x0 ∈ ω,
– solve the least squares problem by means of the fmincon function from the matlab optimization toolbox, starting from the point x0,
– denote the found solution by x,
– if f(x) < restol: if x belongs to some cluster in s, compare the function value f(x) with the best one in the cluster and save the better of the two points as the seed of the cluster; otherwise, save the current point as the seed of a new cluster.

we assumed the threshold value restol = 1.0e−20. the only exception was the set of sample problems for the reinforced concrete section subjected to compression with bending, where restol = 1.0e−10. in the clusterization, we treated a new point as a member of the cluster if the following inequality was satisfied:
\[ \| \hat x - x_{seed} \| \le disttol, \]
where x_{seed} is the seed point of the current cluster. we assumed disttol = 1.0e−10. the need for the clusterization is fully justified by table 2 presented below; we can evidently observe four different clusters in the table. the first seven examples are connected with the concrete sections without reinforcement, subjected to compression and bending. the parameters a are collected in table 1 and the calculated solutions are put into table 3. table 4 contains the solutions for the two sets associated with the situation when the sections are reinforced. the results of the calculations with the genetic algorithm are summarized in table 5 (for the sections without reinforcement) and in table 6 (for the sections with reinforcement). unfortunately, the implementation of the genetic algorithm from matlab's global optimization toolbox has found only one global solution, even for the sets where the local minimizer detected more global solutions. furthermore, the accuracy of the ga solution is definitely poorer compared with that found by the local minimizer. tables 7 and 8 summarize the results obtained by means of the particle swarm algorithm implementation in matlab's global optimization toolbox. the same comment as for the ga matlab function is valid for the particle swarm one.

4. comparison of experimental and numerical results

in order to verify the calculated results, 175 mm × 175 mm × 1680 mm (height) column specimens under eccentric compression were considered; the results were presented in detail by lloyd and rangan [16]. the longitudinal steel reinforcement of the columns consisted of three rebars φ12 mm, f_{yk} = 430 mpa, e_s = 200 gpa, and the columns were made of concrete with f_{cm} = 44.78 mpa, e_{cm} = 32 gpa. the static diagram and the test specimen are shown in fig. 2. in the above-mentioned tests, the following failure loads and corresponding eccentricities were measured: p1 = 1476 kn, e1 = 15 mm; p2 = 830 kn, e2 = 50 mm; p3 = 660 kn, e3 = 65 mm. the ultimate strain in concrete at failure was assumed in the calculations as −2.4 ‰, which corresponds to the peak stress on the σc – εc diagram (fig. 1). the values collected in table 9 confirm a good conformity between the numerical solution and the experimental data given by lloyd and rangan [16].
x(1) | x(2) | f(x)
-2.7790923390672915e+00 | -3.0371456600817055e+00 | 9.8607613152626476e-32
-2.7790923390376046e+00 | -3.0371456601087723e+00 | 2.2186712959340957e-29
-3.5000157890703703e+00 | -4.9999510298477368e-01 | 1.2325951644078309e-29
-2.7790923390051390e+00 | -3.0371456601383064e+00 | 9.1507865005637369e-29
-2.7790923391531197e+00 | -3.0371456600033153e+00 | 1.9721522630525295e-29
-2.7790923392382747e+00 | -3.0371456599104514e+00 | 8.4692264556708872e-25
-1.2763043306571440e+00 | -1.0595130305186657e+00 | 4.9303806576313238e-31
-1.2763043306472746e+00 | -1.0595130305290790e+00 | 6.9364539396083568e-27
-2.7790923389334106e+00 | -3.0371456602237172e+00 | 5.7569619331116098e-25
-1.2763043307376327e+00 | -1.0595130304452089e+00 | 4.7302072029314920e-28
-1.2763043305691497e+00 | -1.0595130306000431e+00 | 2.0523202525456148e-27
-6.2452876553920056e-01 | -3.6491369858491312e+00 | 3.6484816866471796e-30
table 2. sample table of results without clusterization.

x(1) | x(2) | f
set 1: a(1) = 1.7157000000000000e-01, a(2) = 1.7170000000000001e-02
-3.5001760272980009e+00 | 8.9999746818148496e-01 | 1.6308774769071113e-29
set 2: a(1) = 1.3345000000000001e-01, a(2) = 2.5219999999999999e-02
-3.4992749633586557e+00 | 6.9999666861370502e-01 | 1.9772367181057118e-29
-1.7538910914819157e+00 | 8.3132002867829691e-01 | 4.3364238627823002e-29
set 3: a(1) = 1.0918000000000000e-01, a(2) = 2.6804999999999999e-02
-3.5000264028206489e+00 | 5.7271595174704082e-01 | 8.1807341061747740e-29
-1.7532806514095229e+00 | 6.8025870700809510e-01 | 2.0954117794933126e-29
set 4: a(1) = 8.5790000000000005e-02, a(2) = 2.5739999999999999e-02
-3.4999421856987043e+00 | 4.5001889527019823e-01 | 2.6437933681383566e-28
-1.7533490610068840e+00 | 5.3451336316461950e-01 | 2.9496002284279395e-29
set 5: a(1) = 7.0650000000000004e-02, a(2) = 2.3689999999999999e-02
-3.5002461261714624e+00 | 3.7060720525856033e-01 | 6.1800472650662032e-28
-1.7531021761012342e+00 | 4.4021717424283136e-01 | 4.8915539099524771e-29
set 6: a(1) = 4.4479999999999999e-02, a(2) = 1.7630000000000000e-02
-3.4981410371676795e+00 | 2.3329954055348942e-01 | 3.9372571632525573e-27
-1.7548124428383809e+00 | 2.7700765058540189e-01 | 8.0396019598500773e-29
set 7: a(1) = 2.5710000000000000e-02, a(2) = 1.1379999999999999e-02
-3.2265678368900650e+00 | 1.3352149291135310e-01 | 4.2148274638856933e-26
-1.9823327248432530e+00 | 1.5060554010027433e-01 | 2.5005850693336105e-28
table 3. results for non-reinforced concrete subjected to compression with bending.

x(1) | x(2) | f
set 1: a(1) = 1.4344499999999999e-01, a(2) = 2.9215499999999998e-02
-2.2782425251491150e+00 | 7.7232501367591999e-01 | 2.2709743058168710e-29
-3.4999345396782120e+00 | 6.9999699585013297e-01 | 3.2106687592596898e-29
-3.4997710707333067e+00 | 6.9999415656504049e-01 | 6.6164575189148840e-13
set 2: a(1) = 1.2918199999999999e-01, a(2) = 3.4805500000000003e-02
-2.8855483058111950e+00 | 5.9743506676707159e-01 | 5.7336006495255961e-30
-3.4999204514197872e+00 | 5.7272335751221370e-01 | 3.6503587461192280e-28
-2.8894696731050842e+00 | 5.9713482908132576e-01 | 4.5482057578604896e-11
-2.8850298075074248e+00 | 5.9747474535630740e-01 | 8.0528227557234545e-13
-2.8819290341073800e+00 | 5.9771169963840232e-01 | 3.9597006753181514e-11
table 4.
results for reinforced concrete subjected to compression with bending.

x(1) | x(2) | f
set 1: a(1) = 1.7157000000000000e-01, a(2) = 1.7170000000000001e-02
-2.0055257755736533e+00 | 9.9999993722360547e-01 | 4.9367829047643648e-06
set 2: a(1) = 1.3345000000000001e-01, a(2) = 2.5219999999999999e-02
-1.7533927171625709e+00 | 8.3145002503412480e-01 | 1.1683502858156642e-11
set 3: a(1) = 1.0918000000000000e-01, a(2) = 2.6804999999999999e-02
-3.5000168287943323e+00 | 5.7271604685964617e-01 | 4.2733490746564677e-15
set 4: a(1) = 8.5790000000000005e-02, a(2) = 2.5739999999999999e-02
-3.5000605612775848e+00 | 4.5002358354890987e-01 | 1.0830715796115672e-13
set 5: a(1) = 7.0650000000000004e-02, a(2) = 2.3689999999999999e-02
-1.7531661968763803e+00 | 4.4020811931145998e-01 | 1.4944588897123600e-14
set 6: a(1) = 4.4479999999999999e-02, a(2) = 1.7630000000000000e-02
-3.4974023122097648e+00 | 2.3328811178613765e-01 | 1.9405016478273012e-13
set 7: a(1) = 2.5710000000000000e-02, a(2) = 1.1379999999999999e-02
-3.8846165545827720e+00 | 1.3993640821435832e-01 | 1.8945631674275027e-08
table 5. results for non-reinforced concrete subjected to compression with bending obtained by matlab's genetic algorithm function.

x(1) | x(2) | f
set 1: a(1) = 1.4344499999999999e-01, a(2) = 2.9215499999999998e-02
-2.2782188146226554e+00 | 7.7232945260092078e-01 | 1.9240836428670515e-14
set 2: a(1) = 1.2918199999999999e-01, a(2) = 3.4805500000000003e-02
-2.8837525833446813e+00 | 5.9758208926874357e-01 | 1.0366597163520163e-11
table 6. results for reinforced concrete subjected to compression with bending obtained by matlab's genetic algorithm function.

x(1) | x(2) | f
set 1: a(1) = 1.7157000000000000e-01, a(2) = 1.7170000000000001e-02
-3.5002095622539371e+00 | 8.9992251419894265e-01 | 1.1469551926718323e-10
set 2: a(1) = 1.3345000000000001e-01, a(2) = 2.5219999999999999e-02
-3.4988243017354601e+00 | 6.9999134440783828e-01 | 8.0833419633913090e-12
set 3: a(1) = 1.0918000000000000e-01, a(2) = 2.6804999999999999e-02
-3.5005599263496445e+00 | 5.7271548379815385e-01 | 9.7034824624575985e-12
set 4: a(1) = 8.5790000000000005e-02, a(2) = 2.5739999999999999e-02
-3.4996043294305919e+00 | 4.5005795509525537e-01 | 4.3542554256757773e-11
set 5: a(1) = 7.0650000000000004e-02, a(2) = 2.3689999999999999e-02
-3.5287486517409783e+00 | 3.7098837709011562e-01 | 3.6828714611703899e-09
set 6: a(1) = 4.4479999999999999e-02, a(2) = 1.7630000000000000e-02
-1.8974393623306212e+00 | 2.6652417435303338e-01 | 1.4855280597896296e-08
set 7: a(1) = 2.5710000000000000e-02, a(2) = 1.1379999999999999e-02
-3.2244553544557881e+00 | 1.3346506124150329e-01 | 5.6375712128260839e-11
table 7. results for non-reinforced concrete subjected to compression with bending obtained by matlab's particle swarm function.

x(1) | x(2) | f
set 1: a(1) = 1.4344499999999999e-01, a(2) = 2.9215499999999998e-02
-3.5011205692965111e+00 | 6.9993319470774185e-01 | 1.0180508409446182e-10
set 2: a(1) = 1.2918199999999999e-01, a(2) = 3.4805500000000003e-02
-3.5003741045884293e+00 | 5.7271843442355697e-01 | 8.6633784199345278e-13
table 8. results for reinforced concrete subjected to compression with bending obtained by matlab's particle swarm function.

experimental: failure load p_i [kn]; eccentricity e_i [mm] | numerical: ε_cu = ε_c1 [‰]; ε′ [‰]; ξ; strain in steel ε_s [‰] (first row: ε_1/ε_2 [‰])
p1 = 1476; e1 = 15 | -2.4 | -2.20 | – | -2.20 / -0.35
p2 = 830; e2 = 50 | -2.4 | -2.20 | 0.63 | 0.92
p3 = 660; e3 = 65 | -2.4 | -2.20 | 0.49 | 1.85
table 9.
comparison of the experimental and numerical results – 1.

experimental: failure load p_i [kn]; eccentricity e_i [mm] | numerical: ε_cu = ε_c1 [‰]; ε_1 [‰]; ε_2 [‰]
p1 = 1548; e1 = 0 | -2.1 | -2.1 | -2.1
p2 = 1386; e2 = 16 | -2.1 | -1.95 | -0.31
p3 = 1098; e3 = 32 | -2.1 | -1.95 | -0.12
table 10. comparison of the experimental and numerical results.

figure 2. static diagram and test specimen.

it is worth noting that the theoretical values are lower than those obtained from the experiment due to neglecting the effect of confinement of the column. as the next example, the results of tests conducted by trapko et al. [20] on unstrengthened column specimens 200 mm × 200 mm × 1500 mm (height) under eccentric compression were analyzed. the longitudinal reinforcement of the column consisted of two rebars φ12 mm, steel grade a-iiin, f_{yk} = 608 mpa, e_s = 224 gpa, and the transverse reinforcement consisted of stirrups φ6 mm, steel grade a-i. the columns were made of concrete with f_{cm} = 31.9 mpa, e_{cm} = 31 gpa. the failure loads and the corresponding eccentricities were determined in these tests as: p1 = 1548 kn, e1 = 0 mm; p2 = 1386 kn, e2 = 16 mm; p3 = 1098 kn, e3 = 32 mm. the ultimate strain in concrete at failure was assumed in the calculations as −2.1 ‰. the characteristic failure mechanisms of the tested specimens occurred in the form of crushing of the concrete in the upper part of the structure members and yielding of the longitudinal reinforcing steel. a good conformity between the calculated and experimental results is confirmed by the values collected in table 10. in the authors' opinion, further experimental work is needed concerning the post-critical behaviour of rc columns under eccentric compression.

5. conclusions and comments

our numerical results have confirmed that the elaborated analytical deformation model (taking into account the effect of concrete softening) may be used to determine the strains in rectangular cross-sections of rc compressed structure members. it can be applied to predict the behaviour of such structure members. the current matlab implementations of the global optimization algorithms (ga and particle swarm) do not seem to be suitable for our application. in the genetic algorithm (ga), elitism is used (part of the previous population survives to the next one), but it does not ensure finding the correct solution. the particle swarm procedure also does not guarantee the computation of the correct solution. of course, we may tune some of their parameters, but we do not expect to gain much from that. our experiments with the local search method have frequently shown the existence of several global minima, all of them almost equally good from the point of view of a numerical calculations specialist. we decided to select among them by means of the hamilton minimum energy principle. in our opinion, among the existing global optimization methods, the most promising may be the clusterization methods (see, for instance, törn and żilinskas [35]). the calculated results conform to the experimental ones. the proposed approach makes it possible to evaluate the structural safety of tower-like structures with rectangular sections without testing drilled-out cores taken from the structure. it may also be useful in structural design and maintenance.

references

[1] m. lechman, p. lewiński. generalized section model for analysis of reinforced concrete chimney weakened by openings.
eng trans 49(1):3–28, 2001.
[2] m. knauff. calculations of reinforced concrete structures according to ec 2. scientific publisher pwn, warsaw, poland, 2013.
[3] m. knauff, a. golubiska, p. knyziak. tables and formulae for the design of reinforced concrete structures with calculation examples. scientific publisher pwn, warsaw, poland, 2013.
[4] h. nieser, v. engel. structure of industrial chimneys. commentary on din 1056. german institute for standardization, berlin, germany, 1986.
[5] cicind. model code for chimneys, part a: the shell. sec. ed., rev. 1. cicind, zurich, switzerland, 2001.
[6] m. lechman, a. stachurski. nonlinear section model for analysis of rc circular tower structures weakened by openings. struct eng and mech 20(2):161–172, 2005. doi:10.12989/sem.2005.20.2.161.
[7] m. lechman. load-carrying capacity and dimensioning of ring cross-sections under eccentric compression. scientific papers of the building research institute, dissertations, warsaw, poland, 2006.
[8] m. lechman. resistance of rc annular cross-sections with openings subjected to axial force and bending. eng trans 56(1):43–64, 2008.
[9] m. lechman. dimensioning of sections of concrete members under compression according to eurocode 2. examples of use. building research institute publications, warsaw, poland, 2011.
[10] m. lechman. resistance of reinforced concrete columns subjected to axial force and bending. transportation res procedia 14c:2411–2420, 2016. doi:10.1016/j.trpro.2016.05.283.
[11] t. majewski, j. bobiński, j. tejchman. fe analysis of failure behaviour of reinforced concrete columns under eccentric compression. eng struct 30(2):300–317, 2008. doi:10.1016/j.engstruct.2007.03.024.
[12] e. rodrigues, o. manzoli, l. bitencourt jr., et al. failure behavior modeling of slender reinforced concrete columns subjected to eccentric load. latin amer j of solids and struct 12(3):520–541, 2015. doi:10.1590/1679-78251224.
[13] j. kim, s.-s. lee. the behavior of reinforced concrete columns subjected to axial force and biaxial bending. eng struct 22(11):1518–1528, 2000. doi:10.1016/s0141-0296(99)00090-5.
[14] g. campione, g. minafo. applicability of over-coring technique to loaded rc columns. struct eng and mech 51(1):181–197, 2014. doi:10.12989/sem.2014.51.1.181.
[15] g. campione, m. fossetti, m. papia. behavior of fiber-reinforced concrete columns under axially and eccentrically compressive loads. aci struct j 107(3):272–281, 2010.
[16] n. lloyd, b. rangan. studies on high-strength concrete columns under eccentric compression. aci struct j 93(6):631–638, 1996.
[17] j. bonet, m. romero, p. miguel. effective flexural stiffness of slender reinforced concrete columns under axial forces and biaxial bending. eng struct 33(3):881–893, 2011. doi:10.1016/j.engstruct.2010.12.009.
[18] m. ye, y. pi, m. ren. experimental and analytical investigation on rc columns with distributed-steel bar. struct eng and mech 47(6):741–756, 2013. doi:10.12989/sem.2013.47.6.741.
[19] c. xu, l. jin, z. ding, et al. size effect tests of high-strength rc columns under eccentric loading. eng struct 126:78–91, 2016. doi:10.1016/j.engstruct.2016.07.046.
[20] t. trapko, m. musiał. the effectiveness of cfrp materials strengthening of eccentrically compressed reinforced concrete columns. arch of civil and mech eng 11(1):249–262, 2011. doi:10.1016/s1644-9665(12)60187-3.
[21] t. trapko.
effect of eccentric compression loading on the strains of frcm confined concrete columns. constr and building mat 61:97–105, 2014. doi:10.1016/j.conbuildmat.2014.03.007.
[22] m. hadi, t. le. behaviour of hollow core square reinforced concrete columns wrapped with cfrp with different fibre orientation. constr and building mat 50:62–73, 2014. doi:10.1016/j.conbuildmat.2013.08.080.
[23] t. el maaddawy, m. el sayed, b. abdel-magid. the effects of cross-sectional shape and loading condition on performance of reinforced concrete members confined with carbon fiber-reinforced polymers. mat and design 31(5):2330–2341, 2010. doi:10.1016/j.matdes.2009.12.004.
[24] b. csuka, l. kollar. analysis of frp columns under eccentric loading. comp struct 94(3):1106–1116, 2012. doi:10.1016/j.compstruct.2011.10.012.
[25] s. elwan, a. rashed. experimental behavior of eccentrically loaded rc short columns strengthened using gfrp wrapping. struct eng and mech 39(2):207–221, 2011. doi:10.12989/sem.2011.39.2.207.
[26] p. sadeghian, a. rahai, m. ehsani. experimental study of rectangular rc columns strengthened with cfrp composites under eccentric loading. j for comp for constr 14(4):443–450, 2010. doi:10.1061/(asce)cc.1943-5614.0000100.
[27] r. eid, p. paultre. compressive behaviour of frp-confined reinforced concrete columns. eng struct 132:518–530, 2017. doi:10.1016/j.engstruct.2016.11.052.
[28] y. wu, c. jiang. effect of load eccentricity on the stress-strain relationship of frp-confined concrete columns. comp struct 98:228–241, 2013. doi:10.1016/j.compstruct.2012.11.023.
[29] m. quiertant, j. clement. behavior of rc columns strengthened with different cfrp systems under eccentric loading. constr and building mat 25(2):452–460, 2011. doi:10.1016/j.conbuildmat.2010.07.034.
[30] j. lee, y. kim, s. kim, j. park. structural performance of rectangular section confined by squared spirals with no longitudinal bars influencing the confinement. arch of civil and mech eng 16(4):795–804, 2016. doi:10.1016/j.acme.2016.05.005.
[31] v. kumar, p. v. patel. strengthening of axially loaded columns using stainless steel wire mesh (sswm) – numerical investigations. struct eng and mech 60(6):979–999, 2016. doi:10.1016/j.conbuildmat.2016.06.109.
[32] m. lechman, a. stachurski. determination of stresses in rc eccentrically compressed members using optimization methods.
in computer methods in mechanics (cmm2017): proceedings of the 22nd international conference on computer methods in mechanics, aip conference proceedings, vol. 1922, pp. 1–11. aip publishing, asce, reston, va, 2018. doi:10.1063/1.5019133.
[33] eurocodes. 1992-1-1 eurocode 2: design of concrete structures part 1-1: general rules and rules for buildings. joint research center, eurocodes, european commission, 2010.
[34] a. stachurski. introduction to optimization (in polish: wprowadzenie do optymalizacji). publishing house of the warsaw university of technology, warsaw, poland, 2009.
[35] a. törn, a. żilinskas. global optimization. springer verlag, berlin, heidelberg, germany, 1989.

modified korteweg-de vries equation as a system with benign ghosts

andrei smilga

university of nantes, subatech, 4 rue alfred kastler, bp 20722, nantes 44307, france
correspondence: smilga@subatech.in2p3.fr

abstract. we consider the modified korteweg-de vries equation, u_{xxx} + 6u²u_x + u_t = 0, and explore its dynamics in the spatial direction. higher x derivatives bring about the ghosts. we argue that these ghosts are benign, i.e., the classical dynamics of this system does not involve a blow-up. this probably means that the associated quantum problem is also well defined.

keywords: benign ghosts, kdv equation, integrability.

1. introduction

a system with ghosts is, by definition, a system where the quantum hamiltonian has no ground state, so its spectrum involves states with arbitrarily low and arbitrarily high energies. in particular, all nondegenerate theories with higher derivatives in the lagrangian (but not only them!) involve ghosts. the ghosts show up there already at the classical level: the ostrogradsky hamiltonians of higher derivative systems [1] include terms linear in momenta and are thus not positive definite [2]. this brings about ghosts in the quantum problem [3, 4]. in many cases, ghost-ridden systems are sick – the schrödinger problem is not well posed and unitarity is violated. probably the simplest example of such a system is the system with the hamiltonian describing the 3-dimensional motion of a particle in an attractive 1/r² potential:
\[ h = \frac{\vec p^{\,2}}{2m} - \frac{\kappa}{r^2}. \tag{1} \]
for certain initial conditions, the particle falls to the center in a finite time, as is shown in figure 1. the quantum dynamics of this system depends on the value of κ. if mκ < 1/8, the ground state exists and unitarity is preserved. if mκ > 1/8, the spectrum is not bounded from below and, what is worse, the quantum problem cannot be well posed until the singularity at the origin is smoothed out [5–7]. one can say that for mκ < 1/8, the quantum fluctuations cope successfully with the attractive force of the potential and prevent the system from collapsing.
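the finite-time collapse is easy to reproduce numerically; the following minimal python sketch integrates the planar equations of motion for the hamiltonian (1) with the values quoted in figure 1 (m = 1, κ = 0.05) and an illustrative initial condition of slightly negative energy.

```python
# minimal sketch: a classical trajectory falling to the center in the
# attractive potential v(r) = -kappa/r^2 (hamiltonian (1), m = 1, kappa = 0.05).
import numpy as np
from scipy.integrate import solve_ivp

m, kappa = 1.0, 0.05

def rhs(t, s):
    x, y, px, py = s
    r2 = x * x + y * y
    f = -2.0 * kappa / r2 ** 2        # cartesian force = f*(x, y); attractive
    return [px / m, py / m, f * x, f * y]

s0 = [1.0, 0.0, -0.1, 0.25]           # illustrative data; energy e < 0 here
near_center = lambda t, s: s[0] ** 2 + s[1] ** 2 - 1e-6
near_center.terminal = True           # stop once r drops to 1e-3
sol = solve_ivp(rhs, (0.0, 50.0), s0, events=near_center, rtol=1e-10)
print("fell to r ~ 1e-3 at t =", sol.t[-1])   # finite-time collapse
```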
the latter example suggests that quantum fluctuations can only make a ghost-ridden system better, not worse. we, therefore, conjecture that, if the classical dynamics of the system is benign, i.e., the system does not run into a singularity in finite time,¹ its quantum dynamics will also be benign, irrespective of whether the spectrum has, or does not have, a bottom.

¹ we still call a system benign if it runs into a singularity at t = ∞. such systems have well-defined quantum dynamics. this refers, for example, to the problem of motion in a uniform electric field (see e.g. [8], §24) and also to the inversed oscillator with the hamiltonian h = (p² − x²)/2. in the latter problem, the classical trajectories x(t) grow exponentially with time, but the quantum problem is still benign (see e.g. [9], ch. 3, corollary 13). the spectrum in this case is continuous, as it is for the uniform field problem.

figure 1. falling on the center for the hamiltonian (1) with m = 1 and κ = .05. the energy is slightly negative. the particles with positive energies escape to infinity.

this all refers to ordinary mechanical or field theory systems, where energy is conserved and the notion of a hamiltonian exists. the ghosts in gravity (especially in higher-derivative gravity) are a special issue that we are not discussing here. besides malignant ghost-ridden systems, of which the system (1) with mκ > 1/8 represents an example, there are also many systems with ghosts which are benign – unitarity is preserved and the quantum hamiltonian is self-adjoint with a well-defined real spectrum. to begin with, such is the famous pais-uhlenbeck oscillator [10] – a higher derivative system
according to our conjecture, to study the question of whether the quantum hamiltonian corresponding to the thus rotated lagrangian (4) is hermitian and unitarity is preserved, it is sufficient to study its classical dynamics: if it does not involve a blow-up and all classical trajectories exist at all times t, one can be sure that the quantum system is also benign. note that the question whether or not blowing up trajectories are present is far from being trivial. the ordinary cauchy problem for the equation (4) consists 2exact solvability always makes the behaviour of a system more handy. in particular, many mechanical models including benign ghosts, which were mentioned above, are exactly solvable. in setting the initial value of u(t0,x) at a given time moment, say, t0 = 0. and we are now interested [staying with eq. (4) and not changing the name of the variables according to (5)] in the cauchy problem in x direction. the presence of third spatial derivatives in (4) makes it necessary to define, at the line x = x0, three different functions: u(t,x0),ux(t,x0) and uxx(t,x0). the presence of three arbitrary functions makes the space of solutions to the spatial cauchy problem much larger than for the ordinary cauchy problem. the solutions to the latter represent a subset of measure zero in the set of the solutions in the former, and the fact that the solutions to the ordinary cauchy problem are all benign does not mean that it is also the case for the rotated x-directed problem. and, indeed, for the ordinary kdv equation (4), the problem is not benign. it is best seen if we choose a t-independent ansatz u(t,x) → u(x) and plug it into (3). the equation is reduced to ∂x(uxx + 3u2) = 0 =⇒ uxx + 3u2 = c . (7) this equation describes the motion in the cubic potential v (u) = u3 − cu. it has blow-up solutions. if c = 0, they read u(x) = − 2 (x − x0)2 . (8) however, the situation is completely different for the modified kdv equation,3 uxxx + 6u2ux + ut = 0 . (9) this equation admits an infinite number of integrals of motion, as the ordinary kdv equation does. the first three local conservation laws are ∂tu = −∂x(uxx + 2u3) , (10) ∂tu 2 = −2∂x [ 3 2 u4 + uuxx − 1 2 u2x ] , (11) ∂t ( 1 2 u4 − 1 2 u2x ) = ∂x [ ux(2u2ux + 1 2 uxxx) − 1 2 u2xx − 2u 3uxx − 2u6 ] (12) for the time-independent ansatz, we obtain, instead of (7), ∂x(uxx + 2u3) = 0 =⇒ uxx + 2u3 = c . (13) this describes the motion in a quartic potential v (u) = u4/2 − cu. this motion is bounded, the solutions being elliptic functions. 3in ref. [18], we wrote this equation as uxxx + 12κu2ux + ut = 0 and kept κ in all subsequent formulas. but here, we have chosen, for simplicity, to fix κ = 1/2. 191 andrei smilga acta polytechnica this observation presents an argument that the rotated cauchy problem for the equation (9) with arbitrary initial conditions on the line x = const might be benign. note that this behaviour is specific for the equation (9) with the positive sign of the middle term (the socalled focusing case). plugging the time-independent ansatz in the defocusing mkdv equation,4 uxxx − 6u2ux + ut = 0 , (14) the problem would be reduced to the motion in the potential v (u) = −u4/2 −cu characterized by a blowup. this conforms to the well-known fact that any solution u(t,x) of the ordinary kdv equation is related to a solution v(t,x) of the defocusing mkdv equation by the miura transformation, u = −(v2 + vx) . 
(15) a different (though related) analytic argument indicating the absence of real blow-up solutions for the focusing mkdv equation comes from the analysis of its scaling properties. it is easily seen that eq. (9) is invariant under the rescalings u = λuū, x = λxx̄, t = λtt̄ if λt = λ3x, λu = λ −1 x . (16) the quantities xu and x/t1/3 are invariant under these rescalings. using also the space and time translational invariance of the mkdv equation, we can look for scaling solutions of the type u(t,x) = 1 [3(t − t0)]1/3 w(z) , (17) where z = x − x0 [3(t − t0)]1/3 . (18) inserting the ansatz (17) in eq. (9), one easily verifies that the function w(z) satisfies the equation 0 = w′′′ + (6w2 − z)w′ − w = d dz [ w ′′ + 2w3 − zw ] (19) denoting the constant value of the bracket in the last right-hand side as c, we conclude that w(z) satisfies a second-order equation, w′′ = −2w3 + zw + c . (20) for the equation (14), the same analysis would give the equation w′′ = 2w3 + zw + c . (21) these are painlevé ii equations [19]. in general, painlevé equations have pole singularities. and indeed, a local analysis of eq. (21) (keeping the leadingorder terms w′′ ≈ 2w3) shows that (21) admits simple poles, w(z) ≈ ±1/(z − z0). the existence of 4the coefficient 6 is a convention. it can be changed by rescaling t and x. but the sign stays invariant under rescaling. 1 2 3 4 5 6 t -300 -200 -100 100 200 300 u figure 2. u(t, x = 2.26) for the defocusing mkdv. a real simple pole at z = z0 would then correspond to a singular (blow-up) behavior of u(t,x) of the form u(t,x) ∝ [ x − x0 − z0[3(t − t0)]1/3 ]−1 . but for the equation (20) [and hence for (9)], the singularities are absent. the third argument in favor of the conjecture that the x evolution of sufficiently smooth cauchy data on the line x = const for the mkdv equation (9) does not bring about singularities in u(t,x) comes from numerical simulations. to simplify the numerical analysis, we considered the problem on the band 0 ≤ t ≤ 2π, where we imposed [as is allowed by eq. (9)] periodic boundary conditions: u(t + 2π,x) = u(t,x) . (22) we have chosen the cauchy data u(t, 0) = sin t, ux(t, 0) = uxx(t, 0) = 0 . (23) we first checked that the use of such cauchy data for the defocusing mkdv equation (14) was leading to a blow-up rather fast (at x = 2.2630 . . .). this is illustrated in figure 2, where the function u(t,x) is plotted just before the blow-up, at x = 2.26. by contrast, our numerical simulations of the x evolution of the focusing mkdv equation showed that u(t,x) stayed bounded for all the values of x that we explored. we met, however, another problem associated with the instability of eq. (9) under highfrequency (hf) perturbations. suppressing the nonlinear term in the kdv or mkdv equations, we obtain uxxx + ut = 0 . (24) this equation describes the fluctuations around the solution u(t,x) = 0. its analysis gives us an idea about the behaviour of fluctuations around other solutions. decomposing u(t,x) as a fourier integral, in plane waves ei(ωt+kx), we obtain the dispersion law ω = k3 . (25) 192 vol. 62 no. 1/2022 modified kdv equation as a system with benign ghosts if one poses the conventional cauchy problem with some fourier-transformable initial data u(0,x) = v(x) ≡ ∫ dk 2π v(k)eikx , (26) the time evolution of the initial data v(x) yields the solution u(t,x) = ∫ dk 2π v(k)ei(k 3t+kx) . (27) the important point here is that u(t,x) is obtained from v(k) by a purely oscillatory complex kernel ei(k 3t+kx) of unit modulus. 
it has been shown that this oscillatory kernel has smoothing properties (see, e.g., [20]). this allows one to take the initial data in low-s sobolev spaces hs (describing pretty rough initial data). however, if one considers the x-evolution cauchy problem, one starts from three independent functions of t along the x = 0 axis: u(t, 0) = u0(t), ux(t, 0) = u1(t) and uxx(t, 0) = u2(t), as in (23). assuming that the three cauchy data ua(t), a = 0, 1, 2, are fourier-transformable, we can represent them as ua(t) = ∫ dω 2π ua(ω)eiωt . (28) the three cauchy data determine a unique solution which, when decomposed in plane waves, satisfies the same dispersion law (25) as before. however, the dispersion law (25) must now be solved for k in terms of ω. as it is a cubic equation in k, it has three different roots: ka(ω) = ω 1 3 e2πia/3 a = 0, 1, 2 . (29) this yields a solution for u(t,x) of the form u(t,x) = ∑ a=0,1,2 ∫ dω 2π va(ω)ei(ωt+kax) , (30) where the three coefficients va(ω) are uniquely determined by the three initial conditions at x = 0. the point of this exercise was to exhibit the fact that, when considering the x evolution with arbitrary cauchy data u0(t), u1(t), u2(t), the solution involves exponentially growing modes in the x direction, linked to the imaginary parts of k1(ω) and k2(ω). this can be avoided if the initial data are sufficiently smooth, not involving hf modes. as a minimum condition for a local existence theorem, one should require the fourier transforms va(ω) to decrease like e−α|ω| 1 3 for some positive constant α.5 however, it is difficult to respect these essential smoothness constraints on the behavour of u(t,x) in the numerical calculations. the standard mathematica algorithms do not do so, and that is why we, starting from some values of x, observe the hf noise in our results. 5see ref. [18] for more detailed discussion. 1 2 3 4 5 6 t -2 -1 1 2 u figure 3. u(t, x = 3) for the focusing mkdv. 1 2 3 4 5 6 t -6 -4 -2 2 4 6 du figure 4. ux(t, x = 3) for the focusing mkdv. in figures 3, 4, we present the results of numerical calculations of u(t,x) and ux(t,x) for x = 3. there is no trace of a blow-up. for the plot of u(t,x), one also does not see a hf noise, but it is seen in the plot for ux(t,x). for larger values of x, the noise also shows up in the plot of u(t,x). at x >∼ 3.8, the noise overwhelms the signal. the observed noise is a numerical effect associated with a finite computer accuracy. to confirm this, we performed a different calculation choosing the initial conditions which correspond to the exact solitonic solution to eq. (9). the soliton is a travelling wave, u(t,x) = u(x−ct) ≡ u(x̄). plugging this ansatz into (9), we obtain an ordinary differential equation ∂ ∂x̄ [ ux̄x̄ + 2u3 − cu ] = 0 . (31) denoting the constant quantity within the bracket as c′, we then get the following second-order equation for the function u(x̄): ux̄x̄ = − d du v(u) , (32) 193 andrei smilga acta polytechnica with a potential function v(u) now given by v(u) = u4 − cu2 2 − c′u. (33) as was also the case for the time-independent ansatz, the problem is reduced to the dynamics of a particle moving in the confining quartic potential v(u). the trajectory of the particle depends on three parameters: the celerity c, the constant c′, and the particle energy, e = 1 2 u2x̄ + v(u) . (34) the usually considered solitonic solutions (such that u(x̄) tends to zero when x̄ → ±∞) are obtained by taking c > 0, c′ = 0 (so that the potential represents a symmetric double-well potential) and e = 0. 
the zero-energy trajectory describes a particle starting at “time" x̄ = −∞, at u = 0 with zero “velocity" ux̄, gliding down, say, to the right, reflecting on the right wall of the double well and then turning back to end up, again, at u = 0 when x̄ = +∞. the explicit form of the corresponding solution defined on the infinite (t,x) plane is u(t,x) = √ c cosh[ √ c (x − ct)] . (35) however, to make contact with our numerical calculations, we need a periodic soliton solution. such solutions can be easily constructed by considering bounded mechanical motions in the potential v(u) having a non-zero energy. periodic solutions exist both for positive and negative c. the trajectories are the elliptic functions. it was more convenient for us to assume c = −|c|, in which case we could make contact with ref. [12], where the expressions for the trajectories of motion in the same quartic potential were explicitly written, one only had to rename the parameters. choosing e = 1 and c = −1, we obtain the following solution: u(t,x) = cn [√ 3(x + t),m ] , (36) where cn(z) is the jacobi elliptic cosine function with the elliptic modulus m = 1/3. the function (36) is periodic both in t and x with the period t = l = 4 √ 3 k ( 1 3 ) ≈ 4 . (37) we fixed the initial conditions for x = 0 and periodic conditions in time as is dictated by (36), and then numerically solved (9). the numerical solution should reproduce the exact one, and it does for x <∼ 4. however, at larger values of x, the hf noise appears. the result of the calculation for x = 4.5 is given in figure 5. one can suppress the hf noise by increasing the step size, but then the form of the soliton is distorted. to find a numerical procedure that suppresses the noise and gives correct results for large values of x remains a challenge for future studies. 1 2 3 4 t -1.0 -0.5 0.5 1.0 u figure 5. hf noise for the periodic soliton evolution. x = 4.5. 3. discrete models with benign ghosts one of the possible solutions to this numerical problem could consist in discretizing the model in time direction and assuming that the variable t takes only the discrete values t = h, 2h, · · · ,nh, for some integer n ≥ 3 and by replacing the continuous time derivative ψt by a discrete (symmetric) time derivative [ψ(t + h,x) − ψ(t − h,x)]/(2h). then the lagrangian6 l[ψ(t,x)] = ψ2xx − ψ4x − ψxψt 2 (38) acquires the form ln = n∑ k=1 { [ψxx(kh,x)]2 − [ψx(kh,x)]4 2 − 1 2 ψx(kh,x) ψ[(k + 1)h,x] − ψ[(k − 1)h,x] 2h } , (39) where we impose the periodicity: ψ(0,x) ≡ ψ(nh,x) and ψ[(n + 1)h,x] ≡ ψ(h,x).7 the lagrangian (39) includes a finite number of degrees of freedom and represents a mechanical system. this system involves higher derivatives in x (playing the role of time) and hence involves ghosts. defining the new dynamical variables ak(x) = ψx(kh,x), the equations of motion derived from the lagrangian (39) read akxxx + 6(a k)2akx + ak+1 − ak−1 2h = 0 . (40) there are two integrals of motion: the energy e = n∑ k=1 [ (akx)2 − 3(ak)4 2 − akakxx ] (41) 6it is quite analogous to (4). after variation with respect to ψ(t,x), one gets eq. (9) after posing u(t,x) = ψx(t,x). 7it is also possible to impose the dirichlet-type boundary conditions, ψ(0h,x) = ψ[(n + 1)h,x] = 0. for n = 2, periodicity cannot be imposed and dirichlet conditions are the only option. 194 vol. 62 no. 1/2022 modified kdv equation as a system with benign ghosts 5 10 15 20 x -8 -6 -4 -2 2 4 6 a n figure 6. the solution of the system (40) for an (x) (n = 350, h = t /n ). and q = n∑ k=1 [ akxx + 2(a k)3 ] . 
(42) the expression (41) is the discretized version of the integral ∫ dt [ (ux)2 − 3u4 2 − uuxx ] (43) in the continuous mkdv system, which is conserved during the x evolution, as follows from the local conservation law (11). the second integral of motion is related to the conservation law (10). by contrast, the currents in the higher conservation laws of the mkdv equation, starting with eq.(12), do not translate into integrals of motion of the discrete systems. we have only two integrals of motion and many variables, which means that the equation system (40) is not integrable and exhibits a chaotic behaviour. we fed these equations to mathematica and found out that their solution stays bounded up to x = 10000 and more – the ghosts are benign! this represents a further argument in favour of the conjecture that in the continuous theory the evolution in spatial direction is also benign. indeed, one may expect that taking larger and larger values of n would allow one to simulate better and better the continuous theory (though the presence of chaos might make such a convergence non uniform in x). anyway, we tried the solitonic initial conditions and found out that the discrete system for large n = 350 (the limit of mathematica skills) behaves better than the pde. as is seen from figure 6, the discrete solution stays close to the exact soliton solution up to x ≈ 10, to be compared to x ≈ 4.5, which was the horizon of the numerical procedure of the previous section. hopefully, a clever mathematician, an expert in numerical calculations, would be able to increase the horizon even more... lastly, we note that, irrespectively to the relationship of the systems (39) to the mkdv equation, these systems represent an interest by their own because they provide a set of nontrivial interacting higher derivative systems with benign ghosts. such systems were not known before. acknowledgements i am grateful to the organizers of the aamp conference for the invitation to make a talk there. working on our paper [18], we benefited a lot from illuminating discussions with piotr chrusciel, alberto de sole, victor kac, nader masmoudi, frank merle and laure saint-raymond. references [1] m. ostrogradsky. mémoire sur les équations differéntielles relatives au problème des isopérimètres. mem acad st petersbourg vi(4):385, 1850. [2] r. p. woodard. ostrogradsky’s theorem on hamiltonian instability. scholarpedia 10(8):32243, 2015. https://doi.org/10.4249/scholarpedia.32243. [3] m. raidal, h. veermae. on the quantisation of complex higher derivative theories and avoiding the ostrogradsky ghost. nuclear physics b 916:607–626, 2017. https://doi.org/10.1016/j.nuclphysb.2017.01.024. [4] a. v. smilga. classical and quantum dynamics of higher-derivative systems. international journal of modern physics a 32(33):1730025, 2017. https://doi.org/10.1142/s0217751x17300253. [5] k. m. case. singular potentials. physical review 80(5):797–806, 1950. https://doi.org/10.1103/physrev.80.797. [6] k. meetz. singular potentials in non-relativistic quantum mechanics. il nuovo cimento 34:690–708, 1964. https://doi.org/10.1007/bf02750010. [7] a. m. perelomov, v. s. popov. “fall to the center” in quantum mechanics. theoretical and mathematical physics 4:664–677, 1970. https://doi.org/10.1007/bf01246666. [8] l. d. landau, e. m. lifshitz. quantum mechanics. elsevier, 1981. [9] m. combescure, d. robert. coherent states and applications in mathematical physics. springer, 2012. https://doi.org/10.1007/978-94-007-0196-0. [10] a. pais, g. e. uhlenbeck. 
on field theories with non-localized action. physical review 79(1):145–165, 1950. https://doi.org/10.1103/physrev.79.145. [11] p. d. mannheim, a. davidson. dirac quantization of pais-uhlenbeck fourth order oscillator. physical review a 71:042110, 2005. https://doi.org/10.1103/physreva.71.042110. [12] d. robert, a. v. smilga. supersymmetry versus ghosts. journal of mathematical physics 49(4):042104, 2008. https://doi.org/10.1063/1.2904474. [13] m. pavšič. stable self-interacting pais-uhlenbeck oscillator. modern physics letters a 28(36):1350165, 2013. https://doi.org/10.1142/s0217732313501654. 195 https://doi.org/10.4249/scholarpedia.32243 https://doi.org/10.1016/j.nuclphysb.2017.01.024 https://doi.org/10.1142/s0217751x17300253 https://doi.org/10.1103/physrev.80.797 https://doi.org/10.1007/bf02750010 https://doi.org/10.1007/bf01246666 https://doi.org/10.1007/978-94-007-0196-0 https://doi.org/10.1103/physrev.79.145 https://doi.org/10.1103/physreva.71.042110 https://doi.org/10.1063/1.2904474 https://doi.org/10.1142/s0217732313501654 andrei smilga acta polytechnica [14] i. b. ilhan, a. kovner. some comments on ghosts and unitarity: the pais-uhlenbeck oscillator revisited. physical review d 88(4):044045, 2013. https://doi.org/10.1103/physrevd.88.044045. [15] a. v. smilga. supersymmetric field theory with benign ghosts. journal of physics a: mathematical and theoretical 47(5):052001, 2014. https://doi.org/10.1088/1751-8113/47/5/052001. [16] a. v. smilga. on exactly solvable ghost-ridden systems. physics letters a 389:127104, 2021. https://doi.org/10.1016/j.physleta.2020.127104. [17] c. deffayet, s. mukohyama, a. vikman. ghosts without runaway instabilities. physical review letters 128(4):041301, 2022. https://doi.org/10.1103/physrevlett.128.041301. [18] t. damour, a. v. smilga. dynamical systems with benign ghosts. physical review d (in press), arxiv:2110.11175. [19] p. a. clarkson. painlevé equations – nonlinear special functions. journal of computational and applied mathematics 153(1-2):127–140, 2003. https://doi.org/10.1016/s0377-0427(02)00589-7. [20] c. e. kenig, g. ponce, l. vega. well-posedness of the initial value problem for the korteweg-de vries equation. journal of the american mathematical society 4(2):323–347, 1991. https: //doi.org/10.1090/s0894-0347-1991-1086966-0. 196 https://doi.org/10.1103/physrevd.88.044045 https://doi.org/10.1088/1751-8113/47/5/052001 https://doi.org/10.1016/j.physleta.2020.127104 https://doi.org/10.1103/physrevlett.128.041301 http://arxiv.org/abs/2110.11175 https://doi.org/10.1016/s0377-0427(02)00589-7 https://doi.org/10.1090/s0894-0347-1991-1086966-0 https://doi.org/10.1090/s0894-0347-1991-1086966-0 acta polytechnica 62(1):190–196, 2022 1 introduction 2 spatial dynamics of kdv and mkdv equations 3 discrete models with benign ghosts acknowledgements references acta polytechnica https://doi.org/10.14311/ap.2022.62.0352 acta polytechnica 62(3):352–360, 2022 © 2022 the author(s). licensed under a cc-by 4.0 licence published by the czech technical university in prague theoretical and experimental study of water vapour condensation with high content of non-condensable gas in a vertical tube jakub krempaský∗, jan havlík, tomáš dlouhý czech technical university in prague, faculty of mechanical engineering, department of energy engineering, technická 4, prague 6 16607, czech republic ∗ corresponding author: jakub.krempasky@fs.cvut.cz abstract. 
this article deals with the possibility of separating water vapour from flue gases after oxyfuel combustion using condensation processes. those processes can generally be described as the condensation of water vapour in the presence of non-condensable gases. hence, the effect of non-condensable gas (ncg) on the condensation process has been theoretically and experimentally analysed in this study. the theoretical model was developed on the basis of the heat and mass transfer analogy with respect to the effect of the ncg, the flow mode of the condensate film, the shear stress of the flowing mixture, subcooling and superheating. subsequently, an experimental analysis was carried out on a 1.5 m long vertical pipe with an inner diameter of 23.7 mm. the mixture of vapour and air flowed inside the inner tube with an air mass fraction ranging from 23 % to 62 %. the overall heat transfer coefficients (htc) from the theoretical model and the experimental measurement are significantly lower than the htc obtained according to the nusselt theory for the condensation of pure water vapour. the overall htc decreases along the tube length as the gas concentration increases, which corresponds to a decrease in the local condensation rate. the highest values of the htc are observed in the condenser inlet, although a strong decrease in the htc is also observed there. meanwhile, there is a possibility for an htc enhancement through increasing the turbulence of the condensing mixture near the condenser outlet. the results also showed that the heat resistance of the mixture is several times higher than the heat resistance of the condensate film. the developed theoretical model based on the heat and mass transfer analogy is in good agreement with the experimental results, with the standard deviation within +25 % and −5 %. the model is more accurate for lower ncg concentrations.

keywords: condensation, non-condensable gas, vertical tube, experimental, theoretical.

1. introduction
one of the promising ccs technologies is oxyfuel combustion. the use of co2 captured from oxyfuel combustion technology for other technological purposes and for storage requires a sufficient purity of the product. after emissions cleaning, suitable options for the final drying of flue gases are condensation processes in condensing heat exchangers. the aim is to dry the flue gas formed by oxyfuel combustion, which consists of steam and a high proportion of non-condensing gases (especially co2 and o2). the drying is necessary for the final purification of co2 prior to its further use. this process can generally be described as the condensation of water vapour with a high content of non-condensable gases.

condensation of water vapour in the presence of non-condensable gas (e.g. co2) is a widely studied topic, and its study dates back to 1873 [1]. up to now, the main focus has been on the condensation of water vapour with a small fraction of air in tube condensers [2, 3]. such cases can be found in various technological processes, such as refrigeration, condensers in power plants, geothermal power plants and various processes in the chemical and process industries. this field of study has been investigated by many authors and, today, represents a well-covered topic. however, there are areas where the content of air during the vapour condensation can be higher, or where a gas other than air is present. for instance: ccu/s technologies, water desalination, latent heat recovery from flue gas, or loca accidents [2–4].
therefore, studying condensation in tube condensers with respect to new technological challenges has a potential for commercial applications and system improvement. chantana [5] conducted an experimental and theoretical study of water vapour condensation in the presence of air in a vertical tube. the water vapour content in the gas-vapour mixture was very low. the theoretical model was developed on the basis of the heat and mass transfer analogy. the results showed that the condensation of the vapour and the roughness of the film surface cause a disruption of the gas layer accumulated near the phase interface, which increases the htc. a detailed description of the processes at the vapour-liquid interface is also introduced in [6, 7]. maheswari [8], in his work, studied the influence of the reynolds number of the water vapour and air mixture on the htc in a vertical tube. he concluded that the htc of the film can be lower than the htc of the mixture in the case of a high reynolds number of the mixture. no and park [9] conducted experiments in vertical and horizontal tubes. the most important observation was that the waviness of the film decreases the accumulation of gas near the phase interface, and thereby effectively increases heat transfer. the effect of gas velocity on the flow of the liquid film in a vertical tube is also studied in the work of kracik [10], where the effect of the shear stress of the flowing gas on the liquid film is analysed experimentally and theoretically with three different diameters of the inner tube. kuhn [11] created three theoretical models based on a degradation factor, the heat and mass transfer analogy, and mass transfer modelling, and compared them with results from an experimental measurement. the standard deviations of the htc were 6.4 %, 8.4 % and 17.6 % for water vapour with air, and 3.2 %, 6.1 % and 17.6 % for water vapour with helium, according to the mass transfer modelling, diffusion layer modelling and degradation factor, respectively.

this article focuses on research on the process of condensation of water vapour from oxyfuel combustion flue gases. in this study, the effect of non-condensable gas (ncg) on vapour condensation in a vertical tube condenser is theoretically and experimentally analysed.

2. theoretical modelling
2.1. summary of available theoretical models
there are several methods to calculate heat transfer during the condensation of water vapour in the presence of ncg in a vertical tube.
in general, theoretical models for such phenomena are governed by the film condensation model, which was first described by nusselt in 1916 [12]. theoretical models based on the theory of film-wise condensation are usually divided into two groups, semi-theoretical and theoretical. semi-theoretical models use, to some extent, experimentally established coefficients and data, which are incorporated into the theoretical analysis. the first option is correction and degradation factors, which modify the standard theory for the condensation of pure vapour. the second option is models based on the heat and mass transfer analogy. meanwhile, theoretical models are not based on experimental data, but on the description of the boundary layer near the phase interface. the general summary of the theoretical models is shown in table 1 [3, 11, 13].

table 1. overview of theoretical models for film-wise condensation with non-condensable gas.
• correction and degradation factor (semi-theoretical): empirical equations based on experimental data. low accuracy, simple output.
• heat and mass transfer analogy (semi-theoretical): based on similarities between the momentum, energy, heat and mass transfer equations and empirical coefficients. moderately high accuracy, usually iteration involved.
• diffusion layer model (theoretical): based on fick's law of molecular diffusion; vapour diffuses through the layer of non-condensable gas to the phase interface. moderately high accuracy, usually iteration involved.
• boundary layer model (theoretical): description of the condensate film layer and the boundary layer with initial and boundary conditions. high accuracy, various results, too complex for practical application.

the degradation factor is the simplest method for the determination of the htc; however, it gives less accurate results. the theoretical models, on the other hand, are quite complex and can give very accurate results, but their practical use is quite complicated.

2.2. developed theoretical model
a theoretical model based on the heat and mass transfer analogy was developed for a vertical double-pipe condenser in which the condensing water vapour flows downwards in the inner tube and the cooling water flows in counter-current in the outer tube. detailed descriptions of semi-theoretical models based on the heat and mass transfer analogy can be found in the literature [5, 14, 15]. herein, the condenser was divided into 15 segments where the local heat transfer was determined. the overall htc in the condenser is given by the heat transfer coefficients of several layers, as shown in figure 1 and described by equation (1). one segment is shown in figure 2.

figure 1. heat and mass resistances during condensation [14].
figure 2. analytical model.

the heat transfer of the condenser can be determined by a temperature difference and the corresponding htc of the given layer. the overall condenser power can be calculated from the temperature difference between the bulk mixture temperature $t_{sat}$ and the cooling water temperature $t_k$, with the overall htc of the condenser $k$. the heat transfer from the condensing mixture to the phase interface is given by the temperature difference between the bulk mixture $t_{sat}$ and the phase interface $t_{f,sat}$, and the htc of the condensing mixture $\ddot\alpha_g$. this is equal to the heat transfer from the phase interface with temperature $t_{f,sat}$ to the coolant flow with temperature $t_k$, with the corresponding htc $k'$. the two dots in the htc of the condensing mixture $\ddot\alpha_g$ indicate that heat is transferred not only by conduction, which is given by $\alpha_g$, but also by the mass flow of the condensing vapour, which is given by the mass transfer coefficient $\beta_g$. the heat flux between the gas-vapour mixture and the cooling water is determined by the heat and mass transfer resistances and the temperature difference of the fluids in the condenser.
$$\dot q = \ddot\alpha_g (t_{sat} - t_{f,sat}) = k'(t_{f,sat} - t_k) = k(t_{sat} - t_k) \qquad (1)$$
the fundamental assumption in the model is that the condensate flows on a vertical wall in an annular film, similarly to the nusselt condensation theory, with htc $\alpha_f$. the vapour condenses on the film surface, while the gas accumulates near the phase interface. this forms another layer of non-condensable gas through which the vapour diffuses towards the phase interface.
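as a small illustration of equation (1), the sketch below solves the heat-flux continuity $\ddot\alpha_g (t_{sat} - t_{f,sat}) = k'(t_{f,sat} - t_k)$ for the interface temperature; the numerical values are assumptions chosen only to show the mechanics, not measured data:

```python
# minimal sketch: interface temperature from the heat-flux continuity in
# equation (1). all numbers are illustrative assumptions, not measured data.

alpha_g = 120.0   # htc of the condensing mixture (w/m^2k), assumed
k_prime = 900.0   # htc between phase interface and coolant (w/m^2k), assumed
t_sat = 90.0      # bulk mixture temperature (degc), assumed
t_k = 20.0        # cooling water temperature (degc), assumed

# alpha_g*(t_sat - t_f) = k_prime*(t_f - t_k)  =>  solve for t_f
t_f = (alpha_g * t_sat + k_prime * t_k) / (alpha_g + k_prime)
q = alpha_g * (t_sat - t_f)          # heat flux (w/m^2)
k = q / (t_sat - t_k)                # overall htc consistent with eq. (1)
print(f"t_f,sat = {t_f:.1f} degc, q = {q:.0f} w/m^2, k = {k:.1f} w/m^2k")
```

with these numbers the interface sits close to the coolant temperature and the overall htc is dominated by the weaker, mixture-side coefficient, which is the behaviour the model describes.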
the heat flux transferred from the gas-vapour mixture can be divided into sensible heat, given by the htc $\alpha_{cv}$, and latent heat, given by $\alpha_c$, as described by equation (2). the mass transfer resistance occurs only in the mixture, since the condensate film is formed only by water, with a temperature at the wall $t_{w,f}$.
$$\dot q = \frac{1}{\dfrac{1}{\alpha_f} + \dfrac{1}{\alpha_{cv} + \alpha_c}} \, (t_{sat} - t_{w,f}) \qquad (2)$$
the convective heat flow $\dot q_g$ transferred from the gas to the liquid film is calculated with respect to the mass transfer that occurs together with the heat transfer, according to the ackermann correction factor $e_t$ (equation (4)), with equations (3) and (5), where $\phi_t$ is a non-dimensional mass flow, $\dot n$ the local molar flux, $\tilde c_{pg}$ the molar specific heat capacity and $\alpha_g$ the convective htc of the mixture.
$$\dot q = \alpha_g e_t (t_{sat} - t_{f,sat}) \qquad (3)$$
$$e_t = \frac{\phi_t}{1 - \exp(-\phi_t)} \qquad (4)$$
$$\phi_t = \frac{\dot n \, \tilde c_{pg}}{\alpha_g} \qquad (5)$$
the latent heat corresponds to the vapour transferred through the gas layer to the phase interface. this can be determined by applying the heat and mass transfer analogy. mass transfer is based on the same formal relation as heat transfer, replacing the nusselt number $nu$ with the sherwood number $sh$ and the prandtl number $pr$ with the schmidt number $sc$, as shown in equations (6) and (7). the relation between mass and heat transfer can then be described by the lewis number $le$ according to equation (8), with the molar density of the gas $n_g$. the mass transfer coefficient is then applied to calculate the local condensing flow according to the diffusion of concentration (equation (9)), where $\dot r_v$ is the relative molar flow of the vapour, $\tilde y_{v,f}$ is the molar concentration of vapour at the phase interface, $\tilde y_{v,b}$ is the molar concentration of vapour in the bulk mixture, and $\dot n_v$ is the local vapour molar flux.
$$nu = c \, re^{a} \, pr^{0.6} \qquad (6)$$
$$sh = c \, re^{a} \, sc^{0.6} \qquad (7)$$
$$\alpha_g = n_g \beta_g \tilde c_{pg} \, le^{0.6} \qquad (8)$$
$$\dot n = n_g \beta_g \ln \left( \frac{\dot r_v - \tilde y_{v,f}}{\dot r_v - \tilde y_{v,b}} \right), \qquad (9)$$
where
$$\dot r_v = \frac{\dot n_v}{\dot n}. \qquad (10)$$
during the calculation of the heat balance at certain parts of the condenser, it is necessary to know the condensation temperature of the vapour at the phase interface, $t_{f,sat}$. this temperature is calculated from equation (11), including the mass and energy balance. this equation is solved iteratively with an initial estimate of the film temperature at the phase interface, with $\dot n_f$ being the film molar flux and $t_f$ the film temperature.
$$\dot n_f \tilde c_{pf} \frac{\mathrm{d}t_f}{\mathrm{d}a} + k'(t_{sat} - t_{f,sat}) = \dot n \, \Delta\tilde h_v + \alpha_g e_t (t_{sat} - t_{f,sat}) \qquad (11)$$
the overall results are given by an arithmetic ratio of the results from the given segments. the overall condensation performance of the heat exchanger is calculated according to the steps shown below. two parameters must be estimated at the beginning – the condensation temperature of the vapour on the film surface and the outlet temperature of the cooling water. the temperature of the film surface has to be estimated in each segment, while the temperature of the cooling water has to be estimated only in the outlet segment. the temperature of the cooling water at the other points is given by the energy balance corresponding to the condensation and convection of the air-vapour mixture in the given segment. the htc of the cooling water was experimentally determined by pure water vapour tests. the effects of the film flow mode, the shear stress of the flowing mixture, and the subcooling and superheating are included according to [14].
(1.) input of initial values (mixture and cooling water inlet parameters – temperature, pressure, mass flow).
(2.) estimation of the outlet temperature of the cooling water.
(3.) estimation of the saturation temperature at the phase interface of the vapour.
(4.) calculation of fluid thermodynamic properties according to [15].
(5.) calculation of the heat and mass transfer coefficients $\alpha_g$, $\beta_g$.
(6.) calculation of the local condensation rate and the htc of the film $\alpha_f$.
(7.) calculation of the saturation temperature at the phase interface using the ackermann correction factor.
(8.) check whether the calculated saturation temperature at the phase interface corresponds to the estimated one (if not, back to step 3 – an adjustment of the saturation temperature at the phase interface is necessary).
(9.) calculation of the condenser power, heat transfer coefficient and outlet cooling water temperature for the given segment.
(10.) repeat the calculation from the beginning until the last segment is calculated.
(11.) check whether the calculated outlet temperature of the cooling water corresponds to the estimated temperature (if not, back to step 2 – an adjustment of the cooling water outlet temperature).
(12.) results and average values.
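the two nested iterations of steps (1.)–(12.) can be sketched as below. the property functions here are crude stand-ins (the real model takes them from [14, 15]), and the geometry and coolant parameters are assumptions; only the control flow of the segment loop and the two estimate-and-check loops is the point:

```python
# illustrative skeleton of steps (1.)-(12.) above. property correlations are
# crude stand-ins for those of [14, 15]; geometry and coolant data assumed.

N_SEG = 15              # number of segments, as in the model
AREA = 0.075 / N_SEG    # heat-transfer area per segment (m^2), assumed
K_COOL = 5000.0         # htc of the cooling water (w/m^2k), assumed
M_CP = 90.0             # coolant mass flow * specific heat (w/k), assumed

def sat_temp(y_v):
    """stand-in saturation temperature (degc) vs. vapour molar fraction."""
    return 100.0 * max(y_v, 1e-3) ** 0.1

def mixture_htc(y_v):
    """stand-in for the condensing-mixture htc of eqs. (6)-(8)."""
    return 30.0 + 200.0 * y_v

def solve_segment(y_v, t_cool):
    """steps (3.)-(8.): iterate the interface temperature for one segment."""
    t_bulk = sat_temp(y_v)
    t_f = 0.5 * (t_bulk + t_cool)                  # step (3.): initial estimate
    for _ in range(100):
        a_g = mixture_htc(y_v)                     # step (5.), stand-in
        t_f_new = (a_g * t_bulk + K_COOL * t_cool) / (a_g + K_COOL)
        if abs(t_f_new - t_f) < 1e-6:              # step (8.): converged?
            break
        t_f = t_f_new                              # back to step (3.)
    return a_g * (t_bulk - t_f)                    # heat flux, cf. eq. (3)

def run_condenser(y_v_in, t_cool_in):
    t_cool_out = t_cool_in + 15.0                  # step (2.): initial estimate
    for _ in range(50):
        t_cool, y_v, power = t_cool_out, y_v_in, 0.0
        for _ in range(N_SEG):                     # steps (9.)-(10.)
            q = solve_segment(y_v, t_cool)
            power += q * AREA
            t_cool -= q * AREA / M_CP              # counter-current coolant
            y_v = max(y_v - 2e-5 * q, 0.0)         # crude vapour depletion
        err = t_cool - t_cool_in                   # step (11.): balance check
        if abs(err) < 1e-3:
            break
        t_cool_out -= 0.5 * err                    # adjust, back to step (2.)
    return power, t_cool_out                       # step (12.)

print(run_condenser(y_v_in=0.7, t_cool_in=15.0))
```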
3. experimental apparatus
a schematic diagram of the experimental apparatus used for the theoretical model validation is shown in figure 3. the system is designed as an open loop comprising three parts: the main test section; the water vapour and air supply section; and the cooling water section.

figure 3. schematic diagram of the experimental setup.

the main test section comprises a vertical double-pipe heat exchanger made of two concentric stainless steel tubes. the mixture of water vapour and air enters the heat exchanger at the top and is directed vertically downwards through a calming section before flowing into the inner vertical tube. the cooling water flows upwards in the annulus. the heat exchanger is in a counter-current configuration. the inner tube of the heat exchanger is 2000 mm long with an inner diameter of 23.7 mm and a wall thickness of 1.6 mm. the outer tube is 1500 mm long, with an inner diameter of 29.7 mm and a wall thickness of 2 mm. the tube material is stainless steel 1.4301 (aisi 304). the annulus created by these two concentric tubes is 1.6 mm in width. stainless steel pins are used as spacers at three circumferential positions to keep the annulus concentric. the steam generator was placed on a platform weight scale, which measured the amount of vapour generated. the generator produced steam steadily at a rate controlled by the power input to the electrical immersion heaters. in the first stage of the experiments, air was chosen as the ncg and blown into the mixing point by a compressor. the volume flow rate of the air was measured by a rotameter. the pressure of the air was measured using a u-tube water manometer. the mixing chamber of water and air was designed so that any condensate possibly occurring during the mixing of the vapour and air flowed back to the steam generator. the flow rate of the cooling water was controlled by a regulating valve. non-condensing gases in the water circuit were vented from the system through a manual gate valve at the exit of the condenser. microfiber insulation and rubber foam were wrapped around the heat exchanger to prevent any potential heat loss. four thermocouples were placed in condenser tappings, measuring the inlet and outlet temperature of the cooling water and the inlet and outlet temperature of the water vapour-air mixture. all experiments were performed under atmospheric pressure. measurements were conducted after the steady state of the system was reached.
4. results and discussion
4.1. theoretical analysis
the theoretical model was developed to predict the results from the experimental measurements and to analyse the condensation process. six different air mass concentrations in the mixture were tested in the model, ranging from 23 % to 62 %. the condensation of the vapour-air mixture differs in several parameters compared to the condensation of pure vapour. as the water vapour condenses along the condenser tube, the concentration of vapour in the mixture changes. therefore, the htc, the condensation rate, the saturation temperature, and most of the driving parameters change along the condenser tube.

figure 4. overall htc along the condenser length.
figure 5. air concentration in the mixture along the condenser length.

the local htc of the condenser along the tube length for all tests is shown in figure 4. a rapid decrease in the htc in the condenser inlet was calculated by the model for all cases. as the vapour condenses along the condenser tube, several factors change, which leads to a reduction in vapour condensation and heat transfer; these factors are: an increase of the air concentration; a decrease of the velocity and reynolds number of the mixture; an increase of the film thickness. as the air concentration increases and the local condensation rate decreases, the overall htc decreases rapidly in the condenser inlet. in the condenser outlet, the htc does not change so much, since the local heat transfer is low and the amount of condensed vapour is also very small. correspondingly, the htc in the condenser outlet is very low because of the high concentration of air, which forms a gas layer next to the condensate film and prevents the vapour from reaching the phase interface. this can be seen in figure 5, where the air concentration along the tube length is shown for all measurements. a strong increase in air concentration is observed at the beginning, and the increase slows at the end as a result of the lower local condensation rate.

the overall htc of the condenser is determined by the heat resistances of the individual parts of the condenser, as shown in figure 2 and equation (12), with the inner radius of the inner tube $r_1$, the inner radius of the outer tube $r_2$, the wall thermal conductivity $\lambda_w$, and the htc of the cooling water $\alpha_k$.
$$\frac{1}{r_1 k} = \frac{1}{r_1 \ddot\alpha_g} + \frac{1}{r_1 \alpha_f} + \frac{\ln (r_2 / r_1)}{\lambda_w} + \frac{1}{r_2 \alpha_k} \qquad (12)$$

table 2. measured values for all six tests.

test                                             1        2        3        4        5        6
air mass concentration $x$ [–]                   23 %     26 %     41 %     51 %     60 %     62 %
air flow $\dot m_a$ [kg/h]                       0.58     0.58     1.37     2.04     2.63     2.04
water vapour flow $\dot m_p$ [kg/h]              1.97     1.68     2.01     1.94     1.77     1.24
condensate flow $\dot m_{kon}$ [kg/h]            1.98     1.68     1.87     1.58     1.16     –
cooling water flow $\dot m_k$ [kg/s]             0.0213   0.0213   0.0213   0.0165   0.0213   0.0281
cooling water inlet temperature $t_{k1}$ [°c]    14.8     14.8     14.7     15.0     14.6     16.0
cooling water outlet temperature $t_{k2}$ [°c]   32.0     29.2     30.8     30.4     25.8     22.2
mixture inlet temperature $t_{p1}$ [°c]          94.9     93.9     91.1     82.9     82.1     76.4
mixture outlet temperature $t_{p2}$ [°c]         43.6     40.6     74.6     66.5     64.6     61.3

since the heat resistance of the cooling side and of the tube wall of the condenser is quite low, the overall htc is determined by the heat resistance of the condensing side. during the condensation of water vapour without non-condensing gases, the main heat resistance is usually formed by the condensate film on the cooling surface. this is quite well described by the nusselt theory.
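equation (12) is a plain series of thermal resistances referred to the inner radius; the sketch below evaluates it with the tube geometry from section 3. the htc values themselves are outputs of the model, so the coefficients used here are illustrative placeholders only:

```python
# evaluating the resistance series of eq. (12) for the tube of section 3.
# geometry is from the paper; the htc values are illustrative placeholders.
import math

r1 = 23.7e-3 / 2        # inner radius of the inner tube (m)
r2 = 29.7e-3 / 2        # inner radius of the outer tube (m)
lam_w = 15.0            # wall thermal conductivity of aisi 304 (w/mk), assumed
alpha_g = 100.0         # condensing-side htc (w/m^2k), assumed
alpha_f = 5000.0        # film htc (w/m^2k), assumed
alpha_k = 6000.0        # cooling water htc (w/m^2k), assumed

inv_r1k = (1 / (r1 * alpha_g) + 1 / (r1 * alpha_f)
           + math.log(r2 / r1) / lam_w + 1 / (r2 * alpha_k))
k = 1.0 / (r1 * inv_r1k)
print(f"overall htc k = {k:.1f} w/m^2k")   # ~95; dominated by the mixture term
```

with these placeholder values, the mixture-side term contributes about 95 % of the total resistance, which mirrors the conclusion drawn from figure 6 below.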
in the case of the presence of ncg, the heat resistance is formed not only by the condensate film but also by the heat resistance of the mixture. the magnitude of the heat resistance in the condensate film and in the mixture depends on several parameters, mainly the concentration of air in the mixture, the flow of the condensate film, the flow of the mixture and the subcooling of the film. due to the high amount of ncg in the mixture and the low reynolds number of the mixture, the heat resistance on the condensing side in the presented tests was formed mainly by the heat resistance of the mixture. this can be seen in figure 6, where the htc of the film and of the mixture are compared. the htc of the condensate film is several times higher than the htc of the mixture; therefore, the heat transfer is strongly affected by the heat and mass transfer in the gas-vapour mixture. although the htc of the film decreases along the condenser length as a result of the increase in film thickness, the htc of the mixture is at least ten times lower and, in some cases, lower by more than a factor of two hundred. this supports the claim that the htc of the film can be neglected in some cases, since the overall htc is formed mostly by the htc of the mixture. it can also be seen that it is quite difficult to fully condense the vapour out of the mixture. since the htc decreases significantly at the end of the condenser, a very large condensation surface would be necessary to fully separate the vapour and the non-condensable gas. the initial gas concentration is not so crucial when the vapour must be fully condensed, as the htc approaches similar values at the end of the condenser for all runs.

figure 6. comparison of the htc of the film and of the gas-vapour mixture.

4.2. experimental results
experimental measurements with a mixture of water vapour and air were conducted in the experimental loop with a vertical double-pipe condenser. the tests were carried out for six different air concentrations. the evaluation of the experimental data was determined from the mass and energy conservation equations. the measured parameters and values are shown in table 2. the evaluation of the htc for water vapour condensation is usually done according to nusselt's condensation theory. however, in the case when ncg is present, a deviation occurs between the results from nusselt's theory and the experimental measurements. in figure 7, a comparison between the results from the experimental measurements and the results predicted by nusselt's theory for the mean htc of the film is shown. as can be seen, the nusselt theory cannot be used by itself for result prediction when ncg is present in the condensing mixture. therefore, it might be useful to use one of the theoretical models presented in the previous section.

figure 7. condensing-side htc comparison obtained from the experiments, the developed theoretical model and the nusselt theory.
figure 8. dependence of the condensed vapour on the air concentration in the mixture.

in figure 8, the measured amount of condensate captured during the tests is shown. with increasing air inlet concentration, the ratio of condensed vapour significantly decreases. it was observed that in the case of an inlet air mixture concentration of 20 %, almost all the vapour condensed. this is in accordance with the results from the theoretical model.
4.3. comparison of results from experiments and the theoretical model
the predicted results from the theoretical model were compared with the results from the experimental measurements. the results were compared through the overall condenser power and are shown in figure 9. the heat rejected by the cooling water, determined by the experiments and calculated from the theoretical model, is in good agreement, with the standard deviation within +25 % and −5 %. in the case of a lower ncg concentration, the theoretical model can predict the overall condenser power with a very high accuracy; however, at a higher ncg concentration, the model is less accurate.

figure 9. comparison of the condenser power from the theoretical model and the experiments.

another option to compare the experimental and theoretical data is through the overall htc, which is shown in figure 7. the htcs predicted by the theoretical model for lower values of the air inlet concentration are lower than those measured during the experiment. this comparison might not be so accurate, though, since the overall htc from the model is calculated as an arithmetic ratio of the local values. another reason why the htcs from the theoretical model and the experimental measurements differ might be the disturbance of the ncg layer by other phenomena, which may occur during condensation and were not considered in the presented model. in [16, 17], the suction effect, which occurs during the condensation of vapour in the presence of ncg, was analysed, and it was concluded that it can improve the htc by as much as 20 %. the formation of mist can also improve the htc [18, 19]. this occurs when the mass transfer resistance in the mixture is much higher than the heat transfer resistance. however, the exact influence of the suction effect and of the formation of mist on the htc has not yet been determined. therefore, including these phenomena in the model is rather difficult and unclear, even though it might be viable to do so, since it might improve the accuracy of the results predicted by the theoretical models.

5. conclusions
four theoretical methods for the determination of the htc and heat transfer during the condensation of vapour with gas are usually recommended in the literature: the degradation and correlation factor, the heat and mass transfer analogy, the diffusion layer model and the boundary layer model. for the theoretical model based on the heat and mass transfer analogy, it was observed that the overall htc of the condenser decreases as the air concentration increases, as the reynolds number of the mixture decreases, and/or as the film thickness increases. an increase in air concentration has the strongest influence on the htc. this is observed because of the mass transfer resistance of the vapour due to the layer of ncg near the phase interface. increasing the reynolds number of the mixture or disturbing this layer of ncg near the phase interface might, therefore, increase the htc of the condensing mixture. the htc of the film during the condensation of vapour in the presence of a high content of ncg can be several times higher than the htc of the mixture. neglecting the htc of the film might not produce a significant error in the calculation and can simplify the theoretical modelling. a significant decrease in the film htc is also seen at the beginning of the condenser, which corresponds to the local condensing flow.
a rapid decrease in the htc is observed mainly in the condenser inlet, while in the condenser outlet, the htc is very low and does not change significantly. it was also observed that an effort to condense all the vapour content out of the mixture leads to a much larger condensing surface, since the htc is very low at the end of the condenser. increasing the htc in the condenser outlet can, therefore, rapidly increase the condenser power and might contribute to condensing a larger portion of the vapour out of the mixture.

an experimental measurement of water vapour condensation in the presence of a high concentration of air was carried out in a counter-current vertical double-pipe condenser. the air mass concentration ranged between 23 % and 62 %. a significant decrease in the htc on the condensing side can be seen in the model and in the experiment, as compared to the results from the nusselt theory. therefore, using the nusselt theory for vapour condensation in the presence of ncg is not suitable. the results of the theoretical model and of the experimental measurements were compared with regard to the cooling power of the condenser. the results from the theoretical model and the experiments are in good agreement, with the standard deviation within +25 % and −5 %. therefore, the heat and mass transfer analogy might predict the power of the condenser and the htc of vapour condensation in the presence of a large amount of ncg with a sufficient accuracy. a larger accuracy is observed for a low ncg concentration than for a higher ncg concentration. caution is, therefore, recommended when using the model for a higher ncg concentration. the obtained results are valid for all vertical geometries where the thickness of the condensate film is negligible compared to the cross section of the flow. in addition, studying other factors that might influence the condensation of vapour in the presence of ncg, such as the suction effect and mist formation, can enhance the accuracy of the heat-mass transfer model. experiments to determine the effect of ncg on water vapour condensation have been performed, so far, for an artificially prepared mixture of water vapour and air, also because of the availability of relevant literature references. another goal of the research is to verify the process of water vapour condensation from a mixture with co2, in order to apply this method to drying flue gas from oxyfuel combustion.
list of symbols
a heat exchanger area [m²]
a power (exponent) in the empirical equations (6), (7) [–]
c geometric dimensionless number in the empirical equations [–]
c̃p molar specific heat capacity [j mol⁻¹ k⁻¹]
et ackermann factor [–]
∆h̃v molar heat of condensation [j mol⁻¹]
htc heat transfer coefficient [w m⁻² k⁻¹]
k overall heat transfer coefficient [w m⁻² k⁻¹]
k′ heat transfer coefficient between phase interface and coolant [w m⁻² k⁻¹]
le lewis number [–]
n molar density [mol m⁻³]
ṅ local molar flux [mol m⁻² s⁻¹]
ṅ molar flux [mol s⁻¹]
ncg non-condensable gas
nu nusselt number [–]
pr prandtl number [–]
q̇ heat flux [w m⁻²]
r1 radius of inner tube [m]
r2 radius of outer tube [m]
ṙv relative molar flow [–]
re reynolds number [–]
sc schmidt number [–]
sh sherwood number [–]
ỹ molar concentration [–]

greek symbols
α heat transfer coefficient [w m⁻² k⁻¹]
α̈g heat transfer coefficient of the condensing side [w m⁻² k⁻¹]
β mass transfer coefficient [m s⁻¹]
λ thermal conductivity [w m⁻¹ k⁻¹]
φt non-dimensional mass flow [–]

subscripts
1 inlet
2 outlet
a air
b bulk
c condensation
cv convection
f film
f,sat film saturation
g gas
i inner
k cooling water
kon condensate
p vapour
sat saturation
v vapour
w wall

acknowledgements
this work was supported by the ministry of education, youth and sports under op rde grant number cz.02.1.01/0.0/0.0/16_019/0000753 "research centre for low-carbon energy technologies".

references
[1] o. reynolds, h. e. roscoe. i. on the condensation of a mixture of air and steam upon cold surfaces. proceedings of the royal society of london 21(139-147):274–281, 1873. https://doi.org/10.1098/rspl.1872.0056.
[2] m. ge, s. wang, j. zhao, et al. condensation of steam with high co2 concentration on a vertical plate. experimental thermal and fluid science 75:147–155, 2016. https://doi.org/10.1016/j.expthermflusci.2016.02.008.
[3] j. huang, j. zhang, l. wang. review of vapor condensation heat and mass transfer in the presence of non-condensable gas. applied thermal engineering 89:469–484, 2015. https://doi.org/10.1016/j.applthermaleng.2015.06.040.
[4] j.-d. li, m. saraireh, g. thorpe. condensation of vapor in the presence of non-condensable gas in condensers. international journal of heat and mass transfer 54(17-18):4078–4089, 2011. https://doi.org/10.1016/j.ijheatmasstransfer.2011.04.003.
[5] c. chantana, s. kumar. experimental and theoretical investigation of air-steam condensation in a vertical tube at low inlet steam fractions. applied thermal engineering 54(2):399–412, 2013. https://doi.org/10.1016/j.applthermaleng.2013.02.024.
[6] w. minkowycz, e. sparrow. condensation heat transfer in the presence of noncondensables, interfacial resistance, superheating, variable properties, and diffusion. international journal of heat and mass transfer 9(10):1125–1144, 1966. https://doi.org/10.1016/0017-9310(66)90035-4.
[7] f. toman, p. kracík, j. pospíšil, m. špiláček. comparison of water vapour condensation in vertically oriented pipes of condensers with internal and external heat rejection. energy 208:118388, 2020. https://doi.org/10.1016/j.energy.2020.118388.
[8] n. maheshwari, d. saha, r. sinha, m. aritomi. investigation on condensation in presence of a noncondensable gas for a wide range of reynolds number. nuclear engineering and design 227(2):219–238, 2004. https://doi.org/10.1016/j.nucengdes.2003.10.003.
[9] h. c. no, h. s. park.
non-iterative condensation modeling for steam condensation with non-condensable gas in a vertical tube. international journal of heat and mass transfer 45(4):845–854, 2002. https://doi.org/10.1016/s0017-9310(01)00176-4.
[10] p. kracík, f. toman, j. pospíšil. effect of the flow velocity of gas on liquid film flow in a vertical tube. chemical engineering transactions 81:811–816, 2020. https://doi.org/10.3303/cet2081136.
[11] s. kuhn, v. schrock, p. peterson. an investigation of condensation from steam–gas mixtures flowing downward inside a vertical tube. nuclear engineering and design 177(1-3):53–69, 1997. https://doi.org/10.1016/s0029-5493(97)00185-4.
[12] w. nusselt. die oberflächenkondensation des wasserdampfes, vol. 60. z. vereines deutsch. ing., 1916.
[13] k. karkoszka. theoretical investigation of water vapour condensation in presence of noncondensable gases. licentiate thesis, royal institute of technology, division of nuclear reactor technology, stockholm, sweden, 2005.
[14] vdi e. v. vdi heat atlas. springer berlin, heidelberg, 2010. https://doi.org/10.1007/978-3-540-77877-6.
[15] a. p. colburn, o. a. hougen. design of cooler condensers for mixtures of vapors with noncondensing gases. industrial & engineering chemistry 26(11):1178–1182, 1934. https://doi.org/10.1021/ie50299a011.
[16] l. e. herranz, a. campo. adequacy of the heat-mass transfer analogy to simulate containment atmospheric cooling in the new generation of advanced nuclear reactors: experimental confirmation. nuclear technology 139(3):221–232, 2002. https://doi.org/10.13182/nt02-a3315.
[17] s. oh, s. t. revankar. experimental and theoretical investigation of film condensation with noncondensable gas. international journal of heat and mass transfer 49(15-16):2523–2534, 2006. https://doi.org/10.1016/j.ijheatmasstransfer.2006.01.021.
[18] k. hijikata, y. mori. free convective condensation heat transfer with noncondensable gas on a vertical surface. international journal of heat and mass transfer 16(12):2229–2240, 1973. https://doi.org/10.1016/0017-9310(73)90009-4.
[19] h. c. kang, m. h. kim. characteristics of film condensation of supersaturated steam–air mixture on a flat plate. international journal of multiphase flow 25(8):1601–1618, 1999. https://doi.org/10.1016/s0301-9322(98)00077-9.
acta polytechnica 62(3):352–360, 2022

generic platform for failure recovery in survivable trees
v. dynda
acta polytechnica vol. 45 no. 6/2005

abstract
failure recovery is a fundamental task of the dependable systems needed to achieve fault-tolerant communications, smooth operation of system components and a comfortable user interface. tree topologies are fragile, yet they are quite popular structures in computer systems. the term survivable tree denotes the capability of the tree network to deliver messages even in the presence of failures. in this paper, we analyze the characteristics of large-scale overlay survivable trees and identify the requirements for general-purpose failure recovery mechanisms in such an environment. we outline a generic failure recovery platform for preplanned tree restoration which meets those requirements, and we focus primarily on its completeness and correctness properties. the platform is based on bypass rings; it uses a bypass routing algorithm to ensure completeness, and a specialized leader election to guarantee correctness. the platform supports multiple, on-line and on-the-fly recovery, and provides an optional level of fault-tolerance, protection selectivity and optimization capability. it is independent of the protected tree type (regarding traffic direction, number of sources, etc.) and forms a basis for application-specific fragment reconnection.

keywords: fault tolerance, failure recovery, tree restoration, distributed algorithms.

1 introduction
failure recovery is a fundamental task of the dependable systems needed to achieve fault-tolerant communications, smooth operation of system components and a comfortable user interface. increasingly popular distributed applications providing data sharing, content distribution or stream data delivery services include many different computers, often at distant geographical locations. to communicate between their nodes, these applications build tree-topology overlay structures to connect the nodes and distribute information. the failure recovery schemes for overlay trees use the underlying network either to build a completely new tree or to restore the tree keeping its original structure. while delay-prone creation of a new tree from scratch is usually possible using the same technique as for the creation of the original tree, local tree restoration keeping the rest of the tree intact is a relatively unexplored area of research. moreover, the large-scale and dynamic nature of tree-based structures in the rapidly evolving area of overlay communications requires the recovery mechanisms to exhibit several key properties, such as scalability, independence of the location and number of message sources, an optional level of fault-tolerance and support for application-specific tree optimization requirements. it is becoming increasingly apparent that a generic failure recovery platform providing a fragment-location and reconnection framework with these properties would be beneficial for many emerging applications.

the problem of the failure recovery of overlay trees considers a graph $s = (v, e)$ of arbitrary topology, where $v$ is a finite set of vertices representing nodes and $e$ is a finite set of edges representing links between the nodes. graph $s$ acts as an underlying network for a tree-topology overlay network modeled as a graph $t = (n, l)$, where $n \subseteq v$ represents the tree nodes and $l$ is a finite set of core tree edges representing the overlay communication links connecting the individual nodes of $n$.
the goal of failure recovery is to protect a given tree network $t$ against the failure of a faulty cluster $f \subseteq n$ of one or more adjacent nodes in the tree (see fig. 1). its task is to locate the tree fragments caused by the failure, to restore the distributed knowledge of the topology and to reconnect the tree, omitting the failed nodes, so that communication in the tree can continue.

fig. 1: failure in the tree network (faulty cluster $f = \{n_8, n_9, n_{10}\}$) and its partitioning into fragments $t_1, \ldots, t_7$

in this paper, we analyze the environment of general overlay tree networks and identify the requirements for failure recovery in survivable trees ([1]). we outline a failure recovery platform for preplanned tree restoration based on bypass rings ([2], [3]) – cyclical redundant structures to be used in the event of failure to locate and reconnect the tree fragments. the rest of this paper is structured as follows. in the next section, the related work is summarized. sections 3 and 4 summarize the requirements for generic failure recovery and relate the qualities that the corresponding recovery scheme is expected to possess. section 5 briefly describes the proposed platform, and sections 6 and 7 deal with the two main issues of recovery – completeness and correctness. section 8 outlines the elementary possibilities of application-specific fragment reconnection based on the platform. section 9 discusses the achieved results, and section 10 concludes and sets some future directions.
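to make the problem statement concrete, the sketch below computes the fragments $t_i$ that remain after removing a faulty cluster from a tree like that of fig. 1. it is an illustration of the definitions only; the edge list is a guessed stand-in for fig. 1's topology, and the helper code is hypothetical, not part of the platform:

```python
# illustration of the problem statement: removing the faulty cluster
# f = {8, 9, 10} from a tree like fig. 1 and listing the fragments t_i.
# the edge list is an assumed stand-in for fig. 1's topology.
from collections import defaultdict

edges = [(1, 8), (2, 8), (3, 8), (8, 9), (4, 9), (9, 10),
         (5, 10), (6, 10), (7, 10)]   # assumed tree edges
faulty = {8, 9, 10}

adj = defaultdict(set)
for a, b in edges:
    if a not in faulty and b not in faulty:
        adj[a].add(b)
        adj[b].add(a)

nodes = {v for e in edges for v in e} - faulty
fragments, seen = [], set()
for start in sorted(nodes):
    if start in seen:
        continue
    stack, comp = [start], set()      # dfs over the surviving edges
    while stack:
        v = stack.pop()
        if v in comp:
            continue
        comp.add(v)
        stack.extend(adj[v] - comp)
    seen |= comp
    fragments.append(comp)

print(fragments)   # seven fragments, one per neighbor of the cluster
```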
2 related work
reconstruction of tree-topology graphs without starting from scratch is a relatively new area of research in the field of failure recovery. so far, on-demand restoration schemes building at least the affected subtree anew have been used in a number of applications. several preplanned special-purpose protocols based on pre-computed backup paths have also been proposed, aiming at some of the mentioned properties while neglecting others. although there are many possible applications, nearly all of the previous solutions are designed for specific network-layer or overlay multicast schemes. the important property of local recovery is mostly achieved only in single-source multicast trees, and the scalability of many solutions is limited by the dependence of the control or memory overhead on the group size $k_n$.

there are several basic straightforward preplanned methods for multicast tree recovery (based on the group leave operation in [4]). in the grandfather method, each node maintains a backup link to its grandparent in the rooted tree. when a node (except the root) fails, its child nodes contact the grandparent, which either accepts the connection or redirects them down the tree. subtrees of the affected nodes remain unchanged. in the root method, the children of the failed node try to recover the tree by connecting directly to the root node, which uses the same strategy as in the grandfather approach above. in the grandfather-all and root-all methods, all descendants of the failed node try to recover by contacting the grandparent or root, respectively, in order to build the whole affected subtree of the failed node anew using the requested optimizations. all the nodes maintain the respective knowledge of their ascendants in the multicast tree – the grandparent node in grandfather, the root node in root and root-all, and all ascendants from the grandparent of $n$ to the root node in the grandfather-all method.

except for the grandfather method, these methods do not perform local failure recovery, as the affected nodes contact ascending nodes far up the tree. the scalability of these methods is limited as well – in the grandfather-all method, each node maintains links to a number of ascendants proportional to the size of the group. the root and root-all methods also load the root node with extensive communication proportional to the group size. these two methods also rely on a single root node, whose failure breaks down the whole scheme. the grandfather method is scalable and keeps locality, but it does not cope with multiple adjacent failures in the tree. moreover, all these methods are designed for single-rooted directed multicast trees only. however, these simple methods represent four classes of a number of other approaches based on the same principles. for example, proactive reconstruction [5] belongs to the grandfather-all category, eftmrp [6] for recovery in network-layer cbt multicast uses a principle similar to the grandfather method, and lfr core recovery [7] resembles the root method.

a different approach is chosen in specialized multicast protocols that use administrative control topologies for group management in addition to the data delivery tree. narada [8] is a protocol designed for small multicast groups, where each node keeps a periodically refreshed state about all other group members and uses this information to locate and reconnect the fragments. due to the state exchange inducing a control overhead of $o(k_n^2)$, the narada protocol is effective only when the group is small. however, this is an explicit design choice, where the high overhead is traded off for greater robustness in recovery from node failures.
yoid [9] and hmtp [10] are examples of protocols using a dedicated node called the rendezvous point to arbitrate the failures and locate the fragments. in addition, cached links to several periodically discovered member nodes are used. hmtp nodes also maintain an ancestor list, similarly to the grandfather-all method. these methods are capable of recovery from large failed clusters in their trees; however, the rendezvous point may become a bottleneck and a single point of failure.

another solution is used in the overlay protocols for media streaming nemo [11], nice [12], fatnemo [13] and zigzag [14]. they all first construct an administrative highly connected hierarchy among the nodes, and the data delivery tree is then built using this structure. the hierarchy is organized into layers divided into clusters, where each cluster has a leader node which then also belongs to a cluster on a higher level. when a node fails, the leader of its highest layer or the leader of the cluster of its children (depending on the protocol) is responsible for finding another node to take over the traffic and reconnect the disconnected subtrees. after the recovery, the administrative hierarchy is reorganized to adapt to the topology changes. the failure recovery of these protocols is efficient in heavy-traffic multicasting, where the costs of the highly connected hierarchy are amortized by the huge amount of data. on the other hand, for less loaded trees, the memory and control traffic overhead may be significant. the data delivery trees are source-specific; the control overhead of a node is $o(k \log_k n)$, where $k$ is a constant proportional to the size of the administrative clusters.

a dual-tree network-layer protection scheme [15] constructs a node-disjoint secondary tree connecting the tree leaves in addition to the primary delivery tree. after a failure, the scheme identifies the disconnected subtrees and reconnects them to the rest of the tree using the secondary tree. the construction of redundant trees increasing network-layer multicast reliability is also studied in [16], and it is employed in overlay multicast as well in coopnet [17]. link-protection [18] and path-protection [19] for network-layer multicasting in atm networks propose an individual backup path for each link in the tree and for each source-destination pair, respectively.

3 main issues of overlay failure recovery
from the point of view of the design of failure recovery mechanisms for survivable networks, the characteristics of the environment as well as the properties of the protected tree networks and the group model are essential. recent trends in the networked computing environment point towards large-scale unbounded network infrastructures with a theoretically unlimited number and size of groups interconnected with overlay trees. to design a practical and efficient failure recovery scheme for these systems, the following characteristics of the computing environment must be taken into account:
• characteristics of a distributed system – asynchronous communication, no global clock and autonomous behavior of the nodes.
• no central authority controlling the proper functionality of the system; all nodes are peers.
• no global knowledge of the number of nodes, their identities and the network topology.
• unrestricted failure pattern; nodes fail arbitrarily at any time.
• unlimited size of the system and of the underlying network $s$.
the most significant attributes of the overlay tree structures involve:
• traffic direction. in single-source trees, the message traffic flows in a single direction from the root (source) node to the other member nodes. however, in many emergent applications, the message source may change at runtime, or several nodes may even become message sources simultaneously, disseminating traffic to the tree.
• tree adaptation. as individual nodes may arbitrarily join or leave the group, the respective tree is either expanded or shrunk. moreover, the tree may adapt its topology in order to satisfy potential external optimization requirements.
• no global knowledge and unlimited size of the tree. the size of the group is not limited, and the number and identity of the tree member nodes is not fully known. each tree member keeps only the information about its neighbors in the tree, for routing and possibly for tree adaptation purposes.
• real-time operation. the tree operates in real time, such that the traffic in the tree cannot be suddenly turned off or switched to an off-line or stalled mode.
• self-containment. due to the scale of the system, the possible number of groups and the overlay nature of the trees, the information pertaining to a particular tree is required to be kept solely at the tree member nodes, and no other node is capable of holding even auxiliary information concerning this tree.
it follows that a generic platform for failure recovery available to different applications in this environment would be profitable, perhaps as a part of the middleware architecture, to increase reliability and cut down the costs. the listed attributes represent the restrictive characteristics of the environment and of the protected tree. of course, not all applications employing tree communication structures employ this kind of trees, and the characteristics are somewhat relaxed. however, in many other applications, particularly large-scale data sharing or data storing peer-to-peer systems, the tree networks exhibit all these attributes, which then must be reflected by the respective properties of the platform.

4 survivable trees
a survivable tree is a general tree-topology communication network capable of delivering information to all its correct member nodes even in the presence of failures. consider $t = (n, l)$ to be an overlay tree network, $f \subseteq n$, and the vertex-induced subgraph of $t$ on $f$ to be connected. the failure of the faulty cluster $f$ causes $t$ to be partitioned into fragments $t_i$, $i = 1, 2, \ldots, n = \mathrm{card}(a_t(f))$, where $\mathrm{card}(a_t(f))$ is the cardinality of the neighbor set of $f$ in $t$. if $\mathrm{card}(f) = 1$, then $f$ is called a single failure; if $\mathrm{card}(f) > 1$, then $f$ is a multiple failure of adjacent nodes in $t$. a survivable tree $t$ is required to deliver messages even in the event of several single or multiple failures. a failure recovery is a process of reconnecting the fragments $t_i$ into a single restored network $t' = (n \setminus f, l')$, allowing the traffic to continue.

we focus on two principal properties that each recovery mechanism employed in a survivable tree must have – correctness and completeness. correctness is based on the essential requirement to keep the tree topology of the network even after a failure, since the correctness of most applications depends precisely on the acyclic property of the graph. completeness requires all the fragments of the original tree to be connected in a single restored tree, allowing all correct nodes to participate in $t'$.
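the two properties can be stated operationally: the restored graph $t'$ must be acyclic (correctness) and must connect all surviving nodes in one component (completeness). a small hypothetical checker, not part of the platform, makes this precise:

```python
# hypothetical checker for the two properties of a restored network t':
# correctness (acyclic) and completeness (one component over all correct nodes).
# a graph on m nodes with no cycle and one component is exactly a tree.

def is_valid_restoration(nodes, edges):
    """nodes: set of surviving node ids; edges: set of frozenset pairs."""
    parent = {v: v for v in nodes}

    def find(v):                       # union-find to track components
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for e in edges:
        a, b = tuple(e)
        ra, rb = find(a), find(b)
        if ra == rb:
            return False               # a cycle: correctness violated
        parent[ra] = rb

    roots = {find(v) for v in nodes}
    return len(roots) == 1             # completeness: a single component

# example: fragments {1},...,{7} of fig. 1 reconnected by six new edges
restored = {frozenset(e) for e in [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7)]}
print(is_valid_restoration(set(range(1, 8)), restored))   # True
```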
the following extra qualities of the recovery scheme are needed to address the characteristics of the large-scale unbounded environments of the targeted applications and the properties of the possible protected trees, and they should form the design subject of failure recovery in survivable trees.
• scalability with the size of the protected tree and the underlying network.
• multiple failure recovery. the capability to recover $t$ from multiple failures.
• locality. the recovery affects only the tree nodes in the closest neighborhood of the failed nodes, keeping the rest of the tree in its original structure.
• on-line recovery. the ability to recover from several simultaneous failures in a single tree.
• computational symmetry. there is no arbiter node; all tree member nodes are peers.
• on-the-fly recovery. the failure recovery is performed while the traffic in the tree goes on, even through the nodes performing the recovery.
• optional level of fault-tolerance provided by the scheme for the survivable tree, allowing an optimal trade-off between survivability and costs to be found.
• protection selectivity allows the fault-tolerance level to be chosen individually for each node. the survivable tree may then provide stronger protection against the failure of less reliable or functionally more important nodes in the tree.
• traffic direction independence. the recovery success depends neither on the traffic direction in the tree nor on the link orientation in multi-source multi-rooted trees.
• optimization capability. the scheme takes into account the application-specific requirements regarding the restored tree.
there are only three methods capable of failure recovery in a single multi-source tree – yoid [9], hmtp [10] and network-layer link-protection [18]. however, link-protection recovers the network from link failures only, and yoid and hmtp are not fully distributed, as they need the rp node for their operation. other schemes are designed either for single-source trees or for single-rooted shared trees, and they usually do not provide optional failure recovery and protection selectivity. moreover, the overhead of several recovery methods is proportional to the group size, which degrades their scalability. local recovery is performed only in the grandfather type of restoration and in implicit multicast schemes (e.g., nemo [11], nice [12], zigzag [14]).

5 bypass ring platform
when designing a failure recovery scheme for a survivable tree, we face two main challenges to be solved while keeping in mind the requirements for failure recovery in survivable trees:
• how to locate all the fragments and route messages among them,
• how to avoid creating cycles during fragment reconnection.
our solution is based on bypass rings – virtual cyclic structures appended to the tree and providing alternative paths to eliminate the failed nodes, locate the fragments and reroute the traffic in the tree ([3]). each bypass ring is identified by its center node and diameter; its edges connect the individual tree branches of the center node at a distance proportional to the diameter. several concentric bypass rings of increasing diameter form a bypass framework. it is the responsibility of the bypass routing algorithm to locate all nodes $n \in a_t(f)$ and route among them cyclically in a uniform direction and order, regardless of the source and destination of the messages, using the edges of the bypass frameworks.
this cyclic path bypassing the cluster f is referred to as the bypass cycle bc(f). the bypass cycle connects all n fragments of the tree so that they can communicate and join together to restore the tree. however, it is not possible to sequentially join all the fragments on bc(f), since a cycle would occur in t'. instead, a single bypass edge on bc(f) that does not participate in the reconnection is to be identified in a distributed way. this is the task of the leader link election (lle) process, which is based on comparing the hierarchical identifiers of the fragments ([2]). a hierarchical identifier is a unique compound value inferred from the structure of the failed cluster, found out by the routing algorithm.

the overall operation of the scheme involves several fundamental steps – scheme initialization, failure detection, designated nodes discovery, leader link election, tree fragment reconnection and bypass ring reconfiguration. in the initialization phase, the bypass frameworks are set up, centered at the selected tree nodes against whose failure the tree is to be protected, with the diameter depending on the desired protection level of the particular node. in the event of a failure, the failure detecting nodes initiate the recovery immediately and use the bypass routing to discover the designated nodes dn(f) ⊆ a_t(f) – the bypass cycle nodes with distinct properties related to the type of protected tree, allowing them to coordinate the rest of the recovery process. the bypass cycle edges incident to the designated nodes become candidates for the lle process. as the leader election proceeds along the bypass cycle, pairs of fragments are successively joined together, forming greater connected components until all the fragments are reconnected into the restored tree t'. various fragment reconnection methods respecting the results of lle can be designed to consider application-specific constraints and requirements regarding the tree properties (e.g., degree constraints, weight functions, latency or bandwidth limitations, etc.).

6 bypass routing
bypass routing is one of the key components of the proposed scheme, as it ensures its completeness. clearly, the success of the routing depends on the availability of the respective bypass rings in the event of a failure. to achieve a uniform direction and fragment order of the routing, the bypass rings are to be set up systematically in the tree. for this purpose, we introduce the partial order of the tree, which unambiguously specifies the sequence of neighbors seq_t(n) of each tree member node n (e.g., according to their identifiers).

(fig. 2: bypass ring br_t(n, 3) (a) and framework bf_t(n, 4) (b).)

the partial order defines the arrangement of each bypass ring in the tree. each bypass ring is referred to as br_t(n, d), where n is its center node and d is its diameter. br_t(n, d) consists of deg_t(n) bypass edges connecting each pair of tree branches b_t(n, n_k) and b_t(n, n_{k+1}) neighboring in seq_t(n).
we define the positive and negative ordered rays, r_t^+(n, n_k) and r_t^−(n, n_k), of each branch b_t(n, n_k) according to the partial order as its leftmost and rightmost path, provided that t is drawn as a planar graph where seq_t(n) of each node follows a clockwise direction. each bypass edge of br_t(n, d) connecting b_t(n, n_k) and b_t(n, n_{k+1}) is then initiated on r_t^+(n, n_k) at distance ⌈d/2⌉ from n and terminated on r_t^−(n, n_{k+1}) at distance ⌊d/2⌋, as shown in fig. 2a. the bypass framework is defined as the union of concentric bypass rings (see fig. 2b):

$$bf_t(n, d_{max}) = \bigcup_{d=2}^{d_{max}} br_t(n, d).$$

with this arrangement, all the bypass rings keep the same direction, allowing the bypass routing algorithm to route between branches of a given center node using its rings, and to employ rings of lower diameters centered in particular branches to route through those branches, while preserving the direction. supposing that frameworks bf_t(n, d_max) are set up around each node n ∈ v in t, there are d_max bypass edges initiated at each node and terminating at nodes at increasing distances (up to d_max) on r_t^−(n, n_k) of each of the node's branches b_t(n, n_k), n_k ∈ a_t(n). the routing itself is then based on the fact that each faulty cluster is an intersection of the respective tree branches of the nodes neighboring with the cluster (as illustrated in fig. 3):

$$f = \bigcap_{n_i \in a_t(f)} b_t(n_i, n_j), \quad \text{where } n_j \in a_t(n_i) \cap f.$$

at each node n_i ∈ a_t(f), the routing algorithm systematically browses b_t(n_i, n_j) using the bypass edges initiated at n_i, to find another node neighboring with f, node n_{i+1}, which is the next on bc(f). the branch lookup is performed sequentially by checking the nodes on r_t^+(n_i, n_j) at an increasing distance until the first non-faulty node, n_{i+1}, is found. the sequence of nodes on r_t^+(n_i, n_j) is kept for further use by lle. this process is shown in fig. 4a. fig. 4b demonstrates a practical example of routing in the network from fig. 1.

(fig. 3: relation between the nodes n ∈ a_t(f) and the faulty cluster f. fig. 4: routing from n_i ∈ a_t(f) to n_{i+1} (a) and routing around f = {n_8, n_9, n_10} (b).)

it can be shown [2] that bypass routing is feasible provided that the length of the positive ordered ray between every two bypass cycle neighbors is less than or equal to d_max. the lower bound of the maximum recoverable failure is thus ⌊d_max/2⌋ nodes in arbitrary clusters and d_max − 1 nodes in internal clusters of the tree (i.e., the clusters not containing leaves of the protected tree). the memory needed to keep the routing information at a node is equal to deg_t(n) · d_max.
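the branch lookup at a node n_i can be sketched in a few lines of octave; the ray is assumed to be available as the list of node ids on r_t^+(n_i, n_j) at increasing distance (an illustrative data layout, not prescribed by the scheme):

```octave
% sketch: sequential branch lookup of the bypass routing - return the
% first non-faulty node on the positive ordered ray, i.e. the next node
% on the bypass cycle bc(f), or -1 if the failure exceeds dmax.
function [nxt, seen] = branch_lookup(ray, faulty, dmax)
  seen = [];                              % faulty prefix, kept for lle
  for k = 1:min(numel(ray), dmax)
    if any(faulty == ray(k))
      seen(end+1) = ray(k);
    else
      nxt = ray(k);  return;              % first non-faulty node found
    end
  end
  nxt = -1;                               % protection level dmax exceeded
end
```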
7 leader link election
the task of the leader link election algorithm is to identify a leader – a single edge on the bypass cycle that will not participate in the fragment reconnection – in order to ensure the correctness of the restored tree while keeping it connected. moreover, to support on-line and on-the-fly recovery, we look for a solution guaranteeing that once a link loses the election it remains lost, and that there is no state of the algorithm in which there is no leader (i.e., all the fragments are connected). the completeness property is provided by the bypass routing algorithm; the bypass cycle forms a ring-topology communication structure for the election.

the election is based on the identifiers of the bypass cycle nodes; the basic idea is similar to the chang-roberts leader election [20], where the maximum known id is sent around the cycle by means of election messages and compared with the id of each intermediate node. the lle algorithm exploits the favorable properties of chang-roberts on ordered cycles, where it needs only n = card(bc(f)) messages, and only a single election message to decide whether a given node (the bypass cycle edge incident to it) loses the election or not (see [3] for further details). except for single failures f = {n_f}, where bc(f) = a_t(f) is connected by the ring br_t(n_f, 2), the bypass cycle nodes are not ordered. for this reason, the hierarchical identifiers, which uniquely identify each node relative to another node in the tree, are used. the leaves of an arbitrary partially ordered rooted tree are ordered in parts according to the hierarchical identifiers based on their closest common parent node. applying the principle of chang-roberts for an ordered ring, more than one leader can be elected using n election messages. these leaders (except one) are thereafter eliminated by a recursive sweep process considering the hierarchical identifiers related to the common root node. fig. 5 illustrates the principle of eliminating multiple leaders (only the leaf nodes are members of bc(f)). the common root node for the bypass cycle nodes is a faulty node n_r ∈ f with the minimum identifier, determined together with the respective hierarchical identifiers step by step by the routing algorithm as it browses the relevant ordered rays in the bypass cycle lookup. in this way, the algorithm utilizes a byproduct of the bypass routing – the knowledge of the cluster structure – to achieve o(n log_b n) average message complexity of the election, where b is the average branching factor in f. moreover, the algorithm needs only n messages to elect a leader link in an arbitrary bypass cycle in hierarchically ordered trees (e.g., binary search trees), and also in the bypass cycles of single failures in general partially ordered trees.
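the chang-roberts principle that the election builds on can be illustrated by a compact synchronous simulation in octave (a sketch only – the actual lle elects an edge and compares compound hierarchical identifiers rather than plain numbers):

```octave
% sketch: synchronous simulation of the chang-roberts principle on a
% unidirectional ring - the maximum id circulates, smaller ids are
% swallowed; distinct ids in ring order are assumed.
function leader = chang_roberts(ids)
  n = numel(ids);
  msgs = ids;                         % round 0: every node sends its id
  while true
    nxt = -inf(1, n);
    for k = 1:n
      to = mod(k, n) + 1;             % message travels k -> k+1
      m = msgs(k);
      if m == ids(to)
        leader = to; return;          % own id came back: 'to' is elected
      elseif m > ids(to)
        nxt(to) = m;                  % forward the larger id
      end                             % smaller ids are swallowed
    end
    msgs = nxt;
  end
end
```

e.g., chang_roberts([2 3 1]) returns 2, the ring position of the maximum id; on an ordered ring the circulating maximum costs only n messages, which is the property lle exploits.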
8 fragment reconnection
the platform constituted by the bypass routing and the leader election forms a basis for various fragment reconnection methods, responsible for joining the fragments into a single connected tree t' according to application-specific requirements. fragment reconnection can be performed together with the lle process – as the lle messages travel around the bypass cycle and determine individual nodes not to be leaders, the respective fragments can be joined to the rest of the tree, so that the data traffic can be transmitted immediately. the most straightforward reconnection approach comes directly from the lle process: the fragments on bc(f) are sequentially joined, except for the terminal fragments of the leader link. the drawback is the fact that the diameter of the failure recovery area a_{t'}(f) is always equal to n − 1, which might be a limitation for some applications, because without external tree balancing the tree would degenerate to a linear graph after a certain number of recoveries. this is called the lr reconnection method. two different reconnection methods, called trm and hrm, were proposed in [3]. in the trm method, all the fragments are joined directly to one of the designated nodes, while the hrm method allows the fragments to be joined in longer consecutive sequences. as a generalization of these two approaches, we propose the parameterized hr-x reconnection method, where the value x affects the length of the successively joined fragment sequences along bc(f), and thus it can influence the degree of the nodes and the diameter of a_{t'}(f). the maximum number of new core edges incident to the affected nodes is min(⌈n/x⌉ + 1, ⌈(n − 1)/2⌉ + 1). the diameter is proportional to x as well. hr-1 thus represents the trm, and hr-n is equivalent to the lr method. the possibility to influence the properties of the restored tree may help to balance or optimize it. the particular value of x can even be chosen autonomously by each bypass cycle node, so local requirements may also be supported.

(fig. 5: leader election and the recursive sweep process.)

we note that the hr-x method is only an illustration of the reconnection process. more sophisticated systematic reconnection methods can be applied, provided that they retain the correctness and completeness of the recovery.
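the effect of the parameter x can be sketched as the partitioning of the fragment sequence along bc(f) into runs of length at most x (octave; the indexing is purely illustrative, one reading of the hr-x idea):

```octave
% sketch: hr-x joins the fragments along bc(f) in consecutive runs of at
% most x; hr-1 degenerates to the star-like trm, hr-n to the lr chain.
function runs = hr_x_runs(n, x)
  runs = {};
  for s = 1:x:n
    runs{end+1} = s : min(s + x - 1, n);
  end
end
% hr_x_runs(7, 3) -> {[1 2 3], [4 5 6], [7]}
```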
9 discussion
the characteristics of the computing environment in large-scale distributed systems, as well as the general properties of overlay trees in this environment, place quite specific requirements on tree restoration. the proposed platform for generic failure recovery is designed to meet the mentioned requirements of survivable trees and to form a basis for application-specific fragment reconnection. the node memory complexity of the bypass ring recovery scheme is o(deg_t(n) · d_max) and the average message complexity of the recovery is o(n log_b n), where deg_t(n) is the node degree, n is the number of tree fragments, b is the average branching factor of the failed nodes, and d_max is an optional parameter proportional to the provided fault-tolerance. the lower bound of the maximum size of the recoverable failure is ⌊d_max/2⌋ nodes in arbitrary clusters and d_max − 1 nodes in internal clusters of the tree. the lower bound of the maximum diameter of internal faulty clusters is d_max − 2. the scheme is scalable with the group size, as its overhead depends solely on n and d_max, and it performs local recovery, since only the nodes closest to the failed cluster (on bc(f)) are involved in the recovery. the scheme also supports multiple, on-line and on-the-fly recovery – it is capable of recovering the tree from several multiple failures with respect to the d_max parameter while the communication in the tree continues, and the simultaneous recoveries do not interfere with each other because of the locality property. protection selectivity and an optional fault-tolerance level are provided, so that a trade-off between survivability and costs can be easily chosen. the scheme is independent of the type of protected tree (regarding traffic direction, number of sources, etc.) and forms a basis for application-specific fragment reconnection.

the simulations and measurements verifying the described behavior of the proposed bypass ring scheme were performed using the gfs file system ([21], [22]) as a test bed. gfs is a peer-to-peer large-scale file system providing a fault-tolerant and highly available file service. it extensively employs vast tree communication structures for the replica-based management of its data, and it is a typical application to utilize the bypass ring scheme. gfs has been implemented ([23]) and simulated ([24]) in the network simulator ns2. one of the important results is the fact that protection with d_max = 4 already provides ample fault-tolerance, and it is fully sufficient for the gfs application; the probability of employing rings in the recovery dramatically decreases with their diameter. the simulations also show the real scales of the recoverable failures. the average size of the recovered cluster in trees with an average branching factor of 4 is approx. 1.5 d_max, and the average diameter is approximately d_max − 2 for 2 ≤ d_max ≤ 10. this result confirms the possibility to easily choose a trade-off between survivability and costs.

10 conclusion
in this paper, we summarized the general properties of large-scale environments and identified the requirements for generic failure recovery in survivable trees. the main contribution of the paper is the outline of a scalable platform for local failure recovery that meets all the required properties. the platform is based on bypass rings – the redundant cyclic structures introduced in [3] and specified in detail in [2]. the recovery provided by this platform is generic in the sense that it is independent of specific tree properties and communication patterns, and it enables application-specific tree reconnection to optimize the restored tree according to external constraints and requirements. the performed simulations [24] in the gfs file system confirm the theoretical results. future research in this area may include the specification of particular mechanisms for the (autonomous) management of the fault-tolerance level and protection selectivity with respect to the state of individual nodes, or a proposal of more sophisticated reconnection methods tailored exactly to specific application requirements.

references
[1] dynda, v.: "a concept of survivable trees and its deployment in a fault-tolerant multicast." in: workshop 2004, ctu, prague, 2004, p. 234–235.
[2] dynda, v.: "a bypass-ring scheme for a fault-tolerant multicast." acta polytechnica, vol. 43 (2003), no. 2, p. 18–24.
[3] dynda, v.: "a simple scheme for local failure recovery of multi-directional multicast trees." in: ifip/ieee iscis 2003, lncs, vol. 2869, springer-verlag, germany, 2003, p. 66–74.
[4] deshpande, h., bawa, m., garcia-molina, h.: "streaming live media over a peer-to-peer network." report no. cs-2001-31, cs dept., stanford university, 2001.
[5] yang, m., fei, z.: "a proactive approach to reconstructing overlay multicast trees." in: ieee infocom '04, 2004.
[6] jia, w. et al.: "an efficient fault-tolerant multicast routing protocol with core-based tree techniques." ieee transactions on parallel and distributed systems, vol. 10 (1999), p. 984–999.
[7] manimaran, g., chakrabarti, a.: "a scalable approach for core failure recovery in multicasting." in: advanced computing and communications, 2000, p. 191–196.
[8] chu, y. h., rao, s. g., seshan, s., zhang, h.: "enabling conferencing applications on the internet using an overlay multicast architecture." in: acm sigcomm '01, 2001.
[9] francis, p.: "yoid: extending the multicast internet architecture." 2000, http://www.icir.org/yoid.
[10] zhang, b., jamin, s., zhang, l.: "host multicast: a framework for delivering multicast to end users." in: ieee infocom '02, 2002.
[11] birrer, s., bustamante, f. e.: "nemo: resilient peer-to-peer multicast without the cost." report no. nwu-cs-04-36, northwestern university, 2004.
[12] banerjee, s., bhattacharjee, b., kommaredy, c.: "scalable application layer multicast." in: acm sigcomm '02, 2002.
[13] birrer, s. et al.: "fatnemo: building a resilient multi-source multicast fat-tree." in: wcw '04, lncs, vol. 3293, springer-verlag, germany, 2004, p. 182–196.
[14] tran, d. a., hua, k. a., do, t.: "zigzag: an efficient peer-to-peer scheme for media streaming." in: ieee infocom '03, vol. 2, 2003, p. 1283–1292.
[15] fei, a. et al.: "a dual-tree scheme for fault-tolerant multicast." in: icc '01, vol. 4, 2001.
[16] medard, m. et al.: "redundant trees for preplanned recovery in arbitrary vertex-redundant or edge-redundant graphs." ieee/acm transactions on networking, vol. 7 (1999), no. 5, p. 641–652.
[17] padmanabhan, v. n., wang, h. j., chou, p. a.: "resilient peer-to-peer streaming." in: ieee icnp '03, 2003, p. 16–27.
[18] wu, c. et al.: "a new preplanned self-healing scheme for multicast atm network." in: ieee icct '96, vol. 2, 1996, p. 888–891.
[19] wu, c., lee, w., hou, y.: "back-up vp preplanning strategies for survivable multicast atm networks." in: ieee icc '97, vol. 1, 1997, p. 267–271.
[20] chang, e. g., roberts, r.: "an improved algorithm for decentralized extrema-finding in circular configurations of processors." communications of the acm, vol. 22 (1979), no. 5, p. 281–283.
[21] dynda, v., rydlo, p.: "large-scale distributed file system design and architecture." acta polytechnica, vol. 42 (2002), no. 1, p. 6–11.
[22] dynda, v., rydlo, p.: "fault-tolerant data management in a large-scale file system." in: ieee isads 2002, mexico, 2002, p. 219–235.
[23] zradička, l.: "a distributed file system model." master thesis, department of computer science and engineering, ctu prague, 2003.
[24] řehák, p.: "fault-tolerance in a distributed file system." master thesis, department of computer science and engineering, ctu prague, 2005.

ing. vladimír dynda
phone: +420 224 357 616, fax: +420 224 923 325
e-mail: xdynda@sun.felk.cvut.cz
department of computer science and engineering, czech technical university in prague, faculty of electrical engineering, karlovo náměstí 13, 121 35 praha 2, czech republic

control of systems of reservoirs with the use of risk analysis
p. fošumpaur, l. satrapa
(acta polytechnica vol. 44 no. 2/2004)
a system of reservoirs is usually defined as a system of water management elements that are mutually linked by inner and outer connections in a purpose-built complex. the combined elements consist of reservoirs, river sections, dams, weirs, hydropower plants, water treatment plants and other hydraulic structures. these elements also include the rainfall system, the run-off system, the ground water system, etc. a system of reservoirs serves many purposes, which result from the basic functions of water reservoirs: storage, flood control and environmental functions. most reservoirs serve several purposes at the same time; they are so-called multi-purpose reservoirs. the optimum design and control of a system of reservoirs depends strongly on identifying the particular purposes. in order to assess these purposes and to evaluate the appropriate set of criteria, risk analysis can be used. the design and control of water reservoir functions is consequently solved with the use of multi-objective optimisation. this paper deals with the use of risk analysis to determine criteria for controlling the system. the approach is tested on a case study of the pastviny dam in the czech republic.
keywords: risk analysis, system of reservoirs, multi-objective optimisation, water supply, hydropower plant, fuzzy logic.

1 introduction
in the field of controlling systems of reservoirs, it is often very difficult to formulate a reliable system of criteria. control is usually defined as the optimisation of structures and systems in terms of these appropriate criteria. this paper aims to assess the possibility of using risk analysis to define control criteria both in the phase of design and in the phase of operating hydraulic structures and systems of reservoirs. risk is defined in various ways in the literature. it is mostly explained as a combination of hazards, vulnerability and exposure. vulnerability is usually defined as the aptness of a structure or system to failure as a result of low resistibility.
exposure characterises the time period during which the structure or system is exposed to a hazard. hazards are characterised as the threat of an event that tends to put the system into an undesirable state (mortality, economic losses, infrastructure failure, etc.). hazards can be divided into natural hazards, caused generally by natural disasters (earthquakes, floods, tornados, fire, etc.), and hazards caused by human actions. risk can also be expressed by the theory of reliability: if reliability is defined as the probability of the trouble-free state of the system, then the risk is given by the closeness of this probability to the certain event. risk assessment is generally a very difficult problem. the following relation is often used:

r = p · c ,  (1)

where r is the risk quantifier, p is the probability of the occurrence of losses, and c is the monetary loss. for example, a potential loss of 10⁶ czk with an occurrence probability of 0.01 per year represents a risk of 10⁴ czk per year. in water management, risk analysis is broadly developed in the field of dam construction. during the 20th international congress on large dams in beijing [1], significant attention was given to risk analysis. most of the papers dealt with the analysis of dam failure. the safety of dams using risk analysis was studied by nilkens et al. [2]. the capacity of risk analysis for eia is described by riha [3]. a system of reservoirs is usually defined as a system of water management elements that are mutually linked by inner and outer connections in a purpose-built complex. the combined elements are reservoirs, river sections, dams, weirs, hydropower plants, water treatment plants and other hydraulic structures. these elements also include the rainfall system, the run-off system, the ground water system, etc. reservoirs and reservoir systems usually serve many purposes at the same time. according to their main purpose, reservoirs can be categorised as follows:
• water supply (drinking water, industry),
• flood control,
• irrigation,
• navigation,
• recreation,
• environmental function, etc.

2 methods
two basic phases need to be distinguished when dealing with reservoirs and reservoir systems. the first phase, from the system point of view, is the design, while the second phase is the control and operation of existing systems and structures. both phases involve an optimisation problem for the previously defined set of criteria. the optimisation of the dynamic system is then called control. control is often defined as a systematic action on a control object that satisfies given aims. in the phase of reservoir design we generally talk about strategic control, and we are interested in optimising systems or structures from the long-term point of view. this phase aims at determining particular reservoir volumes in order to guarantee the given reservoir purposes. in the phase of real operation of previously designed reservoirs we usually try to optimise the system during all possible operating situations, namely during extremes such as floods, hydrological droughts, water quality control, etc. this kind of optimisation is called real-time control, and we try to satisfy particular reservoir purposes taking into account given criteria, which can be formulated by minimising the measure of risk. in a period of hydrological drought, the risk of water supply failure rises, and it can cause losses that affect society. there are known approaches for assessing the risk from floods, hydropower production failure, etc. it is evident that
risk analysis is a very powerful tool in the area of controlling reservoir systems. when carrying out a risk assessment, it is very important to make a loss estimation, which can be deterministic or stochastic. the deterministic approach involves a loss estimation for only one previously measured event. by contrast, the stochastic approach includes the probability distribution of the studied events; this approach requires the simulation of numerous scenarios with the use of the monte-carlo method. determination of the risk with respect to a set of qualitative criteria is a very complicated problem in the area of risk assessment. the criteria include the risk of exceeding the value of a minimum permissible maintained discharge downstream of the dam, the risk of deteriorating the environmental conditions in the downstream area, and the risk of a negative impact on the recreation function of a reservoir. in order to optimise a set of criteria which involves a certain number of qualitative requirements, we can use fuzzy set theory and the fuzzy logic theory put forward by zadeh [4]. in our case study we deal with the optimisation of the strategic control of a reservoir with respect to the following criteria:
• maintenance of the minimum discharge downstream of the dam,
• flood control,
• hydropower production,
• recreation.
to quantify the measure of risk of the first three criteria, the time-based reliability (re) of water supply according to duration is used [5]:

$$re = \frac{t - \sum t_f}{t} \cdot 100\ [\%], \qquad (2)$$

where t is the duration of the time series and Σt_f is the sum of the durations of all failures that occurred in the series. the risk is then the complement of this reliability to the certain event (100 %). to quantify the recreation benefits of a reservoir, fuzzy set theory was used.
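a minimal octave sketch of equation (2), assuming a synthetic discharge series and an illustrative minimum maintained discharge (all numbers hypothetical):

```octave
% sketch: time-based reliability re of eq. (2) for a simulated series.
t    = 365*24;                         % duration of the series [h]
qmin = 1.2;                            % minimum maintained discharge [m3/s]
q    = qmin + 0.3 + 0.4*randn(t, 1);   % placeholder discharge series
tf   = sum(q < qmin);                  % total duration of failures [h]
re   = (t - tf) / t * 100;             % reliability [%]
risk = 100 - re;                       % complement to the certain event [%]
```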
3 case study
multi-objective optimisation of strategic control with the use of risk analysis was tested on the pastviny dam on the divoka orlice river (czech republic). this is a stone masonry arch dam built in 1939. the total storage capacity is 11×10⁶ m³, including a flood control capacity of 2×10⁶ m³. there are six spillways consisting of a fixed sill, located at three different elevations. the dam is equipped with two bottom outlets with an inner diameter of 1.4 m. the pastviny dam is a typical multipurpose dam. its primary purpose is flood control, while the secondary purposes are: production of hydroelectric power, discharge regulation in the downstream part of the divoka orlice river, and sport and recreation. the criteria for minimum discharge maintenance, flood control and hydropower production were solved by using their reliability according to relation (2). the criterion for the recreation purpose was accomplished by using four fuzzy sets according to fig. 1. each fuzzy set describes the quality of recreation by the membership degree index µ for the range of water levels in a reservoir. fig. 1 shows that there are excellent conditions for recreation in the area around the full active storage capacity (altitude: 469 m), and as the water level decreases, the recreation conditions become worse.

(fig. 1: fuzzification of the recreation criterion.)

fig. 2 shows selected scenarios of the total reservoir volume allocation to the flood control capacity and the active storage capacity. the maximum water level in the reservoir is 473 m, and different active storage capacities are defined for the summer and winter periods.

(fig. 2: scenarios of the total reservoir volume allocation.)

as not all the particular criteria are mutually scalable, they should be normalised into the interval (0, 1). secondly, preferential weights should be chosen that agree with the order of significance of the particular reservoir purposes. in actual operation, the order of significance of the particular reservoir purposes is established in the operating schedule of each dam. our research also deals with a sensitivity analysis of the weight determination. the simulation of the particular scenarios was performed with the use of stochastic dynamic programming. the probability distribution of the input time series was adopted from the variables measured in the 1938–1995 period. fig. 3 shows the dependence of the mean annual hydropower production e on the choice of the active storage capacity level ma. the hydropower production values are related to the current level of mean annual production (100 %). fig. 4 shows the ratio of the flood control capacity vr and the flood volume wn with a given return period n with respect to the altitude ma of the water level of the active storage capacity. the suitability of the particular scenarios according to the recreation purpose of the reservoir was evaluated with the use of the membership degree of the actual water level in each fuzzy set during the simulation (fig. 1). the optimum scenario was then found by standard multi-objective optimisation methods.

(fig. 3: dependence of mean annual hydropower production on the choice of water level ma of the active storage capacity. fig. 4: dependence of the ratio of the flood control capacity vr and the flood volume wn with a given return period n on the altitude ma of the active storage capacity water level.)
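the membership evaluation used above for the recreation criterion can be sketched with a trapezoidal membership function in octave; the break points below are illustrative, not the actual sets of fig. 1:

```octave
% sketch: membership degree mu of the water level h [m a.s.l.] in a
% trapezoidal fuzzy set (a, b, c, d); mu = 1 between b and c.
function mu = trap_mu(h, a, b, c, d)
  mu = max(0, min([(h - a)/(b - a), 1, (d - h)/(d - c)]));
end
% e.g. an 'excellent recreation' set peaking near the full active storage
% level of 469 m (break points are hypothetical):
mu = trap_mu(468.4, 466.0, 468.0, 469.0, 469.5);
```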
4 conclusions
we studied the use of risk analysis to quantify a suitable set of criteria for the multi-objective optimisation of reservoirs and reservoir systems, which are usually intended to satisfy several purposes at the same time. these purposes can be in contradiction with each other. the described methodological approach of multi-objective optimisation was applied to a case study of the pastviny dam in the czech republic, which serves for flood control, hydropower production, discharge regulation, and recreation purposes. reliability theory was used for the risk assessment of the quantifiable criteria. the research has proved that fuzzy set theory is efficient for quantifying fully qualitative criteria, such as the recreation or environmental function of reservoirs.

5 acknowledgment
this research has been supported by grant no. 103/02/0606 and grant no. 103/02/d049 of the grant agency of the czech republic.

references
[1] 20th international congress on large dams. vol. 1, question 76, beijing, china, paris: icold, 2000, p. 896.
[2] nilkens, b., rettemeier, k.: "risk assessment procedure for german dams." in: workshop on the occasion of the 69th annual meeting of icold in dresden, 2001.
[3] riha, j.: evaluation of investment impacts on environment. multipurpose analysis eia. prague: academia, 1995.
[4] zadeh, l. a.: "fuzzy sets." information and control, vol. 8 (1965), p. 338–353.
[5] votruba, l., broža, v.: water management in reservoirs. prague: elsevier/sntl, 1989.

dr. ing. pavel fošumpaur
phone: +420 224 354 425, e-mail: fosump@fsv.cvut.cz
ass. prof. ing. ladislav satrapa, ph.d.
phone: +420 224 354 618, e-mail: satrapa@fsv.cvut.cz
department of hydrotechnics, czech technical university in prague, faculty of civil engineering, thákurova 7, 166 29 prague 6, czech republic
a mindlin shell finite element for stone masonry bridges with backfill
petr řeřicha
czech technical university in prague, faculty of civil engineering, thákurova 7, prague, czech republic
correspondence: petr.rericha@fsv.cvut.cz
(acta polytechnica 62(2):293–302, 2022, https://doi.org/10.14311/ap.2022.62.0293)

abstract. stone masonry bridges are difficult to analyse with commercial finite element (fe) packages for their specific heterogeneous composition. the stone arch is best modelled as a thick shell where there are predestined directions of tension failure, normal to the bed joints. a dedicated, very simple, mindlin shell finite element is developed with five translational degrees of freedom per node. it features compatibility with linear isoparametric or constant strain elements for the backfill. most bridges can be analysed with a sufficient accuracy assuming plane strain conditions. the element then simplifies to a timoshenko beam element with three translational degrees of freedom per node. an application of the latter one to the bridge at poniklá is presented.
keywords: masonry arches, no tension joints, thick shell elements, octave script.

1. introduction
masonry arch bridges still constitute a considerable part of the bridge stock of the transport infrastructure in czechia. many of them are valuable monuments of the technical expertise and craftsmanship of the past generations as well. design standards for this sort of bridge do not exist worldwide; guides and manuals for load rating are provided instead, like [1] or [2]. they usually contain conservative approximate methods for a routine assessment, and some guidance and recommendations for more complicated individual assessments based on advanced structural analysis methods. a structural analysis of masonry arch bridges is specific in that the ultimate limit state forces and stresses cannot be solved for assuming a homogeneous isotropic linear elastic material. in contrast to that, design standards for reinforced concrete and steel structures admit such solutions. en 1996-1-1 codifies in clause 6.1.1(2) that plane sections remain plane and that the tensile strength of masonry perpendicular to the bed joints is zero in the ultimate limit state. the standard is not compulsory for bridges, but its conditions should be perceived as the minimum for them. cracks in the bed joints of the masonry arches affect the stress distribution in a way that does not admit a linear solution. the simplest material model that can be accepted is the no-tension linear material, where the no-tension condition applies to the normal stress in the planes parallel to the bed joints. this specific behaviour is difficult to simulate by the material models available in general purpose program packages, [3]. commercial packages do not include such material models.
there were attempts to achieve the specific properties of the masonry arches with a trivial model of a heterogeneous material – individual meshing of voussoirs and joints by standard continuum finite elements with homogeneous material models. it is possible in academic works, but not acceptable in design practice for its demands on labour, software and input data on material properties. this experience testifies that shell/beam elements using the rigid normals assumption (timoshenko, mindlin, bernoulli-navier and other assumptions) are indispensable for the solution of masonry arch bridges. several dedicated codes were written around 1990 for the analysis of masonry arches, based on the castigliano principle (ctap [4], [5]), the rigid block assumption (ring [6]), mechanisms (archie [7]) and others. all of them assume plane strain conditions. these models simulate very well the observed behaviour of masonry arches in situ and in large scale model tests. they have been applied worldwide in practice, despite their common drawback that they underestimate the interaction of the barrel, backfill and pavement. two simple finite elements for thick shells are developed here, based on the timoshenko/mindlin kinematic assumptions, one for plane stress/plane strain conditions, the other for 3d shells. in order to facilitate a seamless interaction with the backfill continuum, only translational degrees of freedom (dofs) are employed. the no-tension linear elastic constitutive equations are integrated in a closed form for the normal stress in the planes parallel to the bed joints. the shape functions of the elements are 2d and 3d variants of the same concept. the simpler 2d variant is presented first, with an application to a bridge; a triangle thick shell element follows.

2. the tt element, timoshenko beam with translational dofs
the displacement shape functions are sketched in figure 1 for the axial displacement u_{i,t} of node i, for the lateral displacement v_j of node j and for the axial displacement u_{i,b}.
the element dofs are ordered in the {u} matrix, {u} = { {u}i {u}j } , {u}n =   un,t vn un,b   , n = i,j linear normal and constant shear element strains are specified by top and bottom face normal strains εt, εb and shear strain γ, ordered in the matrix of internal strains ε = {εt,εb,γ}t . shear strain γ and conjugate shear stress τ have opposite signs than usual in elasticity, γ = −(∂ux/∂y+∂uy/∂x)/2. the element geometric matrix [b] reads: [b] = 1 l [ −1 0 0 1 0 0 0 0 −1 0 0 1 l/2/d −1 −l/2/d l/2/d 1 −l/2/d ] conjugate to node displacements {u} are the nodal forces {x}, conjugate to {ε} is the matrix of internal forces {s}. the nodal forces in terms of nodal displacements read {x} = [b]t {s}({ε}) = [b]t {s}([b]{u}) (1) constitutive equations for the cross-section are developed separately for the bending, the first two members of {ε} and {s}int. ε ξ d d ζ t σ: >0 ε σ: εt <0 tz ( ) z ( )ζ b b σ: 0 ε 1 1 ξ 1 1 ε b >0<0 ξ d ξ ζ figure 2. stress/strain diagram, strain shape functions and normal stress diagrams of the tt element 2.1. constitutive equations for bending no-tension linear elastic material is assumed, see the stress-strain diagram in figure 2. the no-tension condition is applied to the normal stress in crosssections and bed joints of the arch. though simple, it makes possible the most frequent type of failure of the stone masonry arch bridges. the normal strain ε is determined by the top and bottom face strains εt, εb through the shape functions shown in the centre of figure 2. ε = zt(ζ)εt + zb(ζ)εb constitutive equations of the cross-sections are defined in terms of dimensionless functions st() and sb() of dimensionless arguments: st(εt,εb) = ∫ 1 0 zt(ζ)s(εtzt(ζ) + εb,xzb(ζ))dζ sb(εt,εb) = ∫ 1 0 zb(ζ)s(εtzt(ζ) + εbzb(ζ))dζ, (2) the cross-section forces are st/b = ebdst/b = east/b (valid for both t and b subscripts, where e is the young modulus, b the cross-section width, d the crosssection depth and a its area. functions st/b() and their derivatives with respect to εt, εb have to be developed separately for four possible strain states: algorithm start when εt > 0 and εb > 0 then all internal forces are 0, the cross-section is totally disintegrated. all stresses and internal forces vanish in the element. when εt > 0 and εb < 0, then (see the left strain diagram in figure 2): ξ = εb εb − εt st = εbξ2/6, sb = (0.5 − ξ/6)ξεb ∂st ∂εt = ξ3/3, ∂st ∂εb = ( 1 − 2ξ εt εb ) ξ2 6 ∂sb ∂εt = (0.5−ξ/3)ξ2, ∂sb ∂εb = (0.5− ξ 6 )ξ−(0.5− ξ 3 )ξ2 εt εb when εt < 0 and εb > 0, then (see the right strain diagram in figure 2): ξ = εt εt − εb 294 vol. 62 no. 2/2022 a mindlin shell finite element for stone masonry bridges with backfill i x =s +vl/(2d)x =−s +vl/(2d) x =−s −vl/(2d)b t l d x =s −vl/(2d) bj tjti bi v y =−vi j y =vj t b figure 3. element internal forces {s} and nodal forces {x} st = (0.5 − ξ 6 )ξεt, sb = εtξ2/6 ∂st ∂εt = (0.5− ξ 6 )ξ−(0.5− ξ 3 )ξ2 εb εt , ∂st ∂εb = (0.5−ξ/3)ξ2 ∂sb ∂εt = ( 1 − 2ξ εb εt ) ξ2 6 , ∂sb ∂εb = ξ3/3 when εt < 0 and εb < 0, then the cross-section is linear elastic and dimensionless generalised forces are st/b = (2εt/b + εb/t)/6 the cross-section stiffness matrix: ∂{sbend} ∂{εbend} = 1 6 [ 2 1 1 2 ] (3) algorithm end the bending stiffness tangent matrix of the crosssection is denoted for a later reference [dbend] = ∂{sbend} ∂{εbend} = ebd [ ∂st ∂εt ∂st ∂εb ∂sb ∂εt ∂sb ∂εb ] (4) 2.2. full cross-section stiffness matrix the shear force v is computed from the shear deformation γ in the timoshenko beam. 
2.2. full cross-section stiffness matrix
the shear force v is computed from the shear deformation γ in the timoshenko beam. linear elastic behaviour, v = g e b d γ, is assumed in the model, independent of the normal stress in the cross-sections. this assumption is mostly sufficient for the cross-sections of stone masonry arch bridges. the full cross-section stiffness matrix is then

$$[D_{cs}] = \begin{bmatrix} [D_{bend}] & \{0\} \\ \{0\}^T & gEbd \end{bmatrix}.$$

the matrix has to be adapted when more involved cross-section constitutive equations are necessary, for instance, for the shear failure of the bed joints.

(figure 3: element internal forces {s} and nodal forces {x}.)

the nodal forces {x} = {x_{i,t}, y_i, x_{i,b}, x_{j,t}, y_j, x_{j,b}}ᵀ of the element can be obtained in terms of the internal forces {s} either by expansion of [b]ᵀ{s} or by the equilibrium conditions of the element in figure 3. the dimensionless cross-section stiffness matrix for bending, [d_bend] = ∂{s_bend}/∂{ε_bend}, is not symmetric when the neutral axis lies inside the cross-section, so the element and structure matrices will not be symmetric either. the element stiffness matrix is

[k]_straight = e b d l [b]ᵀ [d_cs] [b].

it is ready for application in straight beams, but for arches and other curved beams it must be modified.

(figure 4: equilibrium of node i.)

2.3. arch nodes equilibrium
the equilibrium of the nodal forces x⃗_be = x_be x⃗_e, x⃗_bf = x_bf x⃗_f of the adjacent elements e and f at an arch node i simplifies to x_be + x_bf = 0 when the elements' unit direction vectors x⃗_e and x⃗_f coincide, see figure 4. the node index i is omitted for brevity as long as the development concerns just a single node. x_be and x_bf are the magnitudes of the respective forces, with positive senses in the elements' direction vectors. these forces are stored in the third components of the element nodal forces matrices, and in the case of parallel elements the equilibrium equation at a node is {x}_e + {x}_f = {0}, where the indices e and f indicate the left and right element, respectively. when the adjacent elements are not parallel, the two forces add up to a resultant r⃗. the resultant is decomposed into a component r in the direction r and a force t in the t direction. the component r can act anywhere on ray r, since the arch is considered rigid in the transverse direction; it is thus added to the forces acting at node i. the component t must vanish to keep the null moment with respect to joint i:

t = (x_be x⃗_e + x_bf x⃗_f) · t⃗ = (x_be x⃗_e + x_bf x⃗_f) · (x⃗_e + x⃗_f)/|x⃗_e + x⃗_f| = 0,

which implies x_be + x_bf = 0, provided the two elements are not normal to each other. the force r is

r = (x_be x⃗_e − x_bf x⃗_f) · r⃗ = (x_be x⃗_e − x_bf x⃗_f) · (x⃗_e − x⃗_f)/|x⃗_e − x⃗_f| = (x_be − x_bf) √((1 − x⃗_e·x⃗_f)/2)

and its vector is r⃗ = 0.5 (x_be − x_bf)(x⃗_e − x⃗_f). the ray r of the force must connect points i and b_i, the intersection of the rays of the forces x⃗_be and x⃗_bf, otherwise node i would not be in equilibrium. the ray r is approximately radial to the arch. the vector

g⃗ = 0.5 (x⃗_e − x⃗_f)  (5)

is an important property of node i. the matrix expression for the vector r⃗ in terms of g⃗ is

r⃗ = {g⃗, −g⃗} {x_be; x_bf}.

recall that x_be denotes the third element of the nodal forces matrix of element e at node i, and the analogue holds for x_bf. the displacements conjugate to x_be and x_bf are u_be and u_bf; conjugate to r⃗ is the displacement vector u⃗ of node i. the principle of virtual work thus implies

{u_be; u_bf} = {g⃗; −g⃗} · u⃗.

the same result can be obtained when the part of u⃗ in the r⃗ direction, (u⃗·r⃗)r⃗, is projected on the x⃗_e direction to obtain u_be: u_be = (u⃗·r⃗) r⃗·x⃗_e = g⃗·u⃗.
the projection on x⃗_f yields the same expression, but with a negative sign at g⃗. the expression for ε_b becomes

ε_b = (u_{b,j} + g⃗_j·u⃗_j − u_{b,i} + g⃗_i·u⃗_i)/l_{i,j} = (u_{b,j} + {g_j}ᵀ{u_{t,j}; v_j} − u_{b,i} + {g_i}ᵀ{u_{t,i}; v_i})/l_{i,j},  (6)

where the indices i and j denote the nodes of the element, with the element local x axis going from i to j. in the wake of it, the geometric matrix is modified to

$$[B] = \frac{1}{l}\begin{bmatrix} -1 & 0 & 0 & 1 & 0 & 0 \\ g_{i,x} & g_{i,y} & -1 & g_{j,x} & g_{j,y} & 1 \\ \lambda & -1 & -\lambda & \lambda & 1 & -\lambda \end{bmatrix} \qquad (7)$$

with λ = l/(2d) and the subscripts x and y indicating the components of the vectors g⃗ in the global coordinate system. different element lengths have to be used in the geometric equation (7) and in the volume integration implicit in (8): the top face nodes distance l is used in the former case, the reduced length l_red = l r_ax/r_t = l r_ratio in the latter case. r_ax denotes the radius of the arch central axis, r_t the radius of the top face curve. the element stiffness matrix for an arch element is

[k]_e = e b d l_red [b]ᵀ [d_cs] [b].  (8)

(figure 5: bridge at poniklá, compression stresses [kpa] and depths of the cracks in the bed joints.)

3. application to an arch bridge
exclusive translational dofs facilitate the combination of the tt element with simple 2d continuum elements. a dedicated matlab/octave code has been written to utilize this combination for the analysis of masonry arch bridges. the code features the simplest possible way in which the interaction of the masonry arch, backfill and pavement can be assessed, in parallel with the principal pattern of failure of the masonry arch. the arch and pavement are modelled by the tt elements, the backfill by the classic constant strain triangle (cst) elements. the meshing of the backfill is provided by the persson and strang mesh generator, [8]. the no-tension cross-section model defined in section 2.1 supports the cracking of the arch bed joints, which is the dominant pattern of masonry arch bridge failure. the code includes a simple, purely vector graphics output. it consists of 1150 octave command lines, not including the mesh generator. an application to the sandstone arch bridge at poniklá in north bohemia illustrates the code output in figure 5. the span of the arch is 11.4 m. two principal quantities are shown in each arch element: the maximum compression normal stress, its value inserted at the arch face where it occurs, and the depth of the cracks in the bed joint. the load case is the design dead load combined with the tandem axle live loads at twice the design intensity level, i.e., 360 kn per wheel in lane 1 according to en 1991-2. the tt element and the masonry arch fem models based on it claim to be the simplest models available to assess the most frequent failure mode of the stone masonry arches and the interaction of the arch with the backfill and pavement.
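the core element routine of the octave code described above can be sketched as follows; the function combines (7) and (8) with the cross-section algorithm of section 2.1, and it treats g as the shear-to-young modulus ratio so that the common factor of (8) can be pulled out – an assumption of this sketch, not a statement of the published code:

```octave
% sketch: tangent stiffness of an arch tt element, eqs. (5)-(8).
% gi, gj: the node vectors g of eq. (5); u: the six element dofs
% {ui_t, vi, ui_b, uj_t, vj, uj_b}; g: shear/young modulus ratio (assumed).
function k = tt_arch_stiffness(l, lred, d, e, b, g, gi, gj, u)
  lam  = l/(2*d);
  bmat = [ -1,     0,     0,    1,     0,     0;
           gi(1), gi(2), -1,   gj(1), gj(2),  1;
           lam,   -1,    -lam, lam,    1,    -lam ] / l;
  eps  = bmat * u(:);                     % {eps_t; eps_b; gamma}
  [~, dbend] = notension_cs(eps(1), eps(2));
  dcs  = [dbend, [0; 0]; 0, 0, g];        % dimensionless [Dcs]
  k    = e*b*d*lred * bmat' * dcs * bmat; % eq. (8)
end
```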
4. 3d mindlin shell element tm
the 2d models are often not adequate for the solution of masonry arches whose widths are mostly comparable to their spans, which makes the transverse variation of stresses and displacements nonuniform. a 3d analogue to the 2d tt element is, therefore, derived. the mindlin assumptions, [9], and the simplest triangular facet shape are adopted, see figure 6.

(figure 6: geometry of the shell facet triangle element, the element local coordinate system and the dofs at node i.)

the element local coordinate system features the z axis normal to the facet plane. there is always a preference direction in the facet plane, the direction closest to the normals to the bed joint planes of the masonry arch. it is assumed forthwith that the arch is a cylindric segment shell and the bed joint normals are tangents to the directrix of the cylinder, so that their directions lie in a single plane. the global x axis is assumed to lie in that plane, too. in terms of the bridge nomenclature, the global x axis is in the direction of the arch span. the element local x axis is chosen as the direction closest to the global x within the element plane. the no-tension condition is applied to the σ_x (local x) normal stress component. other options are possible and can be rather easily implemented in the code; this may be necessary in the case of other than cylindric shells or of more general failure modes and constitutive equations. the local y axis completes the element local right-handed cartesian coordinate system. all components are in this system up to section 4.4, where two other systems are necessary. five translational dofs per node are shown in figure 6 for node i. they are ordered in the matrix {u} at each node, with the node index omitted:

{u}ᵀ = {u_{t,x}, u_{t,y}, w, u_{b,x}, u_{b,y}}.

the deflection w and the in-plane displacement vector u⃗ are approximated by linear independent functions of the natural triangular coordinates ξ_i,

w = Σ_i w_i ξ_i, u⃗ = Σ_i u⃗_i ξ_i.

in the literature, a common term for this kind of approximation is an isoparametric element, not to be confused with true isoparametric elements for 2d and 3d continua. the derivatives of ξ_i are constants in the element area,

∂ξ_i/∂x = b_i/(2a), ∂ξ_i/∂y = a_i/(2a), or ∂ξ_i/∂x⃗ = −n⃗_i l_i/(2a),

where a is the element area (note that l_i/2a is the slope of the ξ_i surface upon the element area) and n⃗_i = −(1/l_i){b_i, a_i}ᵀ.

(figure 7: triangular coordinates and shear deformation owing to deflection alone.)

chain differentiation then yields

γ_{w,x} = −(1/2a) Σ_i w_i a_i, γ_{w,y} = (1/2a) Σ_i w_i b_i.  (9)

the shear deformation γ⃗_w owing to the deflection alone is

γ_{w,x} = −∂w/∂y, γ_{w,y} = ∂w/∂x.  (10)

note that the positive sense is indicated in the figure for each of the four rotations, and γ⃗ is the rotation of the normal to the element plane embedded in the material before the deformation towards the normal to the element plane after the deformation, with the standard sign convention of its components. in particular, the motion (rotation) shown in the section view right to left in figure 7 goes from the inclined dashed line to the horizontal, and γ_x is thus negative. as in the tt element, the bending and shear deformations are treated separately.
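equation (9) amounts to a one-line computation per component; an octave sketch with an illustrative argument layout:

```octave
% sketch: transverse shear owing to deflection alone, eq. (9).
% xy is the 3x2 matrix of node coordinates, w the nodal deflections.
function gw = shear_from_deflection(xy, w)
  a  = [xy(3,1)-xy(2,1); xy(1,1)-xy(3,1); xy(2,1)-xy(1,1)];  % ai = xk - xj
  b  = [xy(2,2)-xy(3,2); xy(3,2)-xy(1,2); xy(1,2)-xy(2,2)];  % bi = yj - yk
  a2 = xy(:,1)' * b;                                         % 2*(element area)
  gw = [ -(w(:)' * a);  w(:)' * b ] / a2;                    % [gw_x; gw_y]
end
```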
4.1. bending and in-plane deformation
pseudocurvatures define the bending deformation. they are the derivatives of the rotations of the normals to the element plane. the rotations of the normals, in their turn, depend on the in-plane displacements u⃗ of the top and bottom faces of the element. the element in-plane strains are specified in terms of the top and bottom face (extrados and intrados) strain matrices {ε_t} and {ε_b}, in analogy to the top and bottom uniaxial strains ε_t and ε_b in the timoshenko 2d beam element. the element triangle geometry is defined in terms of the distances a_i = x_k − x_j, b_i = y_j − y_k, where the indices i, j, k are subjected to cyclic permutation and x_i, y_i specify node i's position in the local system, see figure 7. the geometric equations of the cst triangle applied to the top face read

$$\{\varepsilon_t\} = \begin{Bmatrix} \varepsilon_{t,x} \\ \varepsilon_{t,y} \\ 2\varepsilon_{t,xy} \end{Bmatrix} = \frac{1}{2a}\sum_i \begin{bmatrix} b_i & 0 \\ 0 & a_i \\ a_i & b_i \end{bmatrix} \begin{Bmatrix} u_{i,t,x} \\ u_{i,t,y} \end{Bmatrix}, \qquad (11)$$

substitute the subscript b for t to get the bottom face strain matrix. the in-plane strains vary linearly across the element depth d,

{ε} = z_t{ε_t} + z_b{ε_b},  (12)

where z_t = (d + z)/d = 1 + ζ and z_b = −z/d = −ζ are linear shape functions of the element in terms of the element local coordinate 0 > z > −d, with z = 0 at the top face of the element, or of the relative dimensionless coordinate 0 > ζ > −1. plane stress conditions with σ_zz = 0 are assumed in the shell; the material stiffness matrix is then

$$[D] = \frac{E}{1-\nu^2}\begin{bmatrix} 1 & \nu & 0 \\ \nu & 1 & 0 \\ 0 & 0 & \frac{1-\nu}{2} \end{bmatrix} \qquad (13)$$

for a linear elastic isotropic material. when cracked bed joints occur in the planes normal to the element's x axis, the conditions become vague. adjacent to the cracked joints, an approximately uniaxial stress prevails. it is thus assumed forthwith that the material stiffness matrix in the cracked elements is

$$[D(\{\varepsilon\})] = E\begin{bmatrix} s(\varepsilon_x) & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \frac{1-\nu}{2} \end{bmatrix} \qquad (14)$$

where s() is the normalized bilinear stress/strain diagram in the left sketch of figure 2. it is obvious that the element local x axis must approximately coincide with the normals to the bed joints of the arch barrel. the virtual work equation for cracked elements,

δw = Σ_i (δ{u_{i,t}}ᵀ{g_{i,t}} + δ{u_{i,b}}ᵀ{g_{i,b}}) = ∫_v δ{ε}ᵀ[d({ε})]{ε} dv = e ∫_v (δε_x s(ε_x) + δε_y ε_y + (1 − ν) δε_{xy} ε_{xy}) dv,  (15)

delivers the expressions for the nodal forces {g_{i,t}}, {g_{i,b}} when the strains are expanded in terms of the virtual nodal displacements using equation (11). the functions s_t(ε_{t,x}, ε_{b,x}) and s_b(), see (2) in section 2.1, return the dimensionless top and bottom cross-section forces conjugate to the top and bottom strains (input parameters) of a unit depth cross-section with e = 1 and the no-tension stress-strain diagram. with the aid of these functions, the integrals in the virtual work expression can be expressed in a closed form:

δw = (ed/2)(δε_{t,x} s_t + δε_{b,x} s_b + δε_{t,y}(ε_{t,y}/3 + ε_{b,y}/6) + δε_{b,y}(ε_{t,y}/6 + ε_{b,y}/3) + (δε_{t,xy}(ε_{t,xy}/3 + ε_{b,xy}/6) + δε_{b,xy}(ε_{t,xy}/6 + ε_{b,xy}/3))(1 − ν)).  (16)

substitution by (11) and the selection of individual nonzero virtual nodal displacements yields the expressions for the element nodal forces:

g_{i,t,x} = (ed/2)(b_i s_t + a_i(ε_{t,xy}/3 + ε_{b,xy}/6)(1 − ν)),
g_{i,t,y} = (ed/2)(a_i(ε_{t,y}/3 + ε_{b,y}/6) + b_i(ε_{t,xy}/3 + ε_{b,xy}/6)(1 − ν)),
g_{i,b,x} = (ed/2)(b_i s_b + a_i(ε_{t,xy}/6 + ε_{b,xy}/3)(1 − ν)),
g_{i,b,y} = (ed/2)(a_i(ε_{t,y}/6 + ε_{b,y}/3) + b_i(ε_{t,xy}/6 + ε_{b,xy}/3)(1 − ν)).  (17)

the derivatives of the nodal forces g_{i,t/b,x/y} with respect to the dofs can be written in terms of the s_{t/b} functions, for instance

∂g_{i,t,x}/∂u_{j,t,x} = (∂g_{i,t,x}/∂ε_{t,x})(∂ε_{t,x}/∂u_{j,t,x}) + (∂g_{i,t,x}/∂ε_{t,xy})(∂ε_{t,xy}/∂u_{j,t,x}) = (ed/4a)(b_i b_j ∂s_t/∂ε_t + a_i a_j (1 − ν)/2 · 1/3).

the functions s_t, s_b depend on the x components of ε_t, ε_b only; the subscript x can thus be omitted in their derivatives. formulas for the derivatives are in section 2.1.
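equation (17) can be sketched in octave, reusing the hypothetical notension_cs() from the sketch in section 2.1 for s_t and s_b:

```octave
% sketch: element nodal forces of eq. (17) at node i; et, eb are the
% top/bottom face strain matrices {eps_x; eps_y; eps_xy} of eq. (11).
function g = node_forces_17(ai, bi, e, d, nu, et, eb)
  [s, ~] = notension_cs(et(1), eb(1));    % st, sb of section 2.1
  m = 1 - nu;
  g = e*d/2 * ...
      [ bi*s(1) + ai*(et(3)/3 + eb(3)/6)*m;                  % g_{i,t,x}
        ai*(et(2)/3 + eb(2)/6) + bi*(et(3)/3 + eb(3)/6)*m;   % g_{i,t,y}
        bi*s(2) + ai*(et(3)/6 + eb(3)/3)*m;                  % g_{i,b,x}
        ai*(et(2)/6 + eb(2)/3) + bi*(et(3)/6 + eb(3)/3)*m ]; % g_{i,b,y}
end
```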
When $\mu = \frac{1-\nu}{2}$ is introduced and the common factor $\frac{Ed}{4A}$ is omitted for brevity, the submatrices $[k_{bend,i,j}]$ of the element stiffness matrix associated with the in-plane DOFs $\{u_{plane}\}^T = \{u_{i,t,x}, u_{i,t,y}, u_{i,b,x}, u_{i,b,y}\}$ consist of four 2×2 blocks coupling the top (t) and bottom (b) faces of nodes i and j:

$$t,t:\ \begin{bmatrix} b_i b_j \frac{\partial s_t}{\partial \varepsilon_t} + a_i a_j \mu/3 & a_i b_j \mu/3 \\ a_j b_i \mu/3 & b_i b_j \mu/3 + a_i a_j/3 \end{bmatrix} \qquad t,b:\ \begin{bmatrix} b_i b_j \frac{\partial s_t}{\partial \varepsilon_b} + a_i a_j \mu/6 & a_i b_j \mu/6 \\ a_j b_i \mu/6 & b_i b_j \mu/6 + a_i a_j/6 \end{bmatrix}$$

$$b,t:\ \begin{bmatrix} b_i b_j \frac{\partial s_b}{\partial \varepsilon_t} + a_i a_j \mu/6 & a_i b_j \mu/6 \\ a_j b_i \mu/6 & b_i b_j \mu/6 + a_i a_j/6 \end{bmatrix} \qquad b,b:\ \begin{bmatrix} b_i b_j \frac{\partial s_b}{\partial \varepsilon_b} + a_i a_j \mu/3 & a_i b_j \mu/3 \\ a_j b_i \mu/3 & b_i b_j \mu/3 + a_i a_j/3 \end{bmatrix} \tag{18}$$

For a linear element (uncracked bed joints), $[k_{i,j}]^T = [k_{j,i}]$, which implies the symmetry of the whole matrix [k]. For cracked bed joints, $h_{t,b} \neq h_{b,t}$ and the symmetry is lost.

4.2. Shear owing to the rotations of normals

The rotation of the normals $\vec{\varphi}$ induces shear deformation, too. It would not induce any deflection in a consistent isoparametric element, and $\vec{\varphi}$ would then simply be added to $\vec{\gamma}$ to obtain the total shear deformation. Such a displacement pattern is known to imply locking in thin elements, as it does in the isoparametric variant of the TT element. In the TM element, $\vec{\varphi}$ induces a quadratic field of deflection $w_\varphi$, which must have zero values at all joints. The deflection should be compatible with the neighbouring elements along the boundaries. The antisymmetric part of the normal component of $\vec{\varphi}$ along an element's side is assigned to deflection, the symmetric part to shear deformation, in analogy to the TT element. Perfect compatibility of deflections along the boundaries with neighbouring elements is attained for elements lying in a plane. The implementation of the outlined displacement pattern is rather lengthy and is skipped here. The resulting constant transverse shear owing to the rotation of the normals is

$$\gamma_u = \sum_i [\psi_i]\left(\{u_{i,t}\} - \{u_{i,b}\}\right), \qquad [\psi_i] = \left(\begin{bmatrix} 0 & -1/3 \\ 1/3 & 0 \end{bmatrix} - \frac{1}{12A}\begin{bmatrix} a_k^2 - a_j^2 & b_j a_j - b_k a_k \\ b_j a_j - b_k a_k & b_j^2 - b_k^2 \end{bmatrix}\right)\frac{1}{d} \tag{19}$$

The full transverse shear of the TM element is then

$$\{\gamma\} = \sum_i \left([\psi_i],\ \frac{1}{2A}\begin{bmatrix} -a_i \\ b_i \end{bmatrix},\ -[\psi_i]\right)\{u\}_i, \tag{20}$$

where the indices i, j, k are subject to cyclic replacement. Formula 20 is necessary with thin shells, where shear locking can occur. Masonry arches and other buried shells are mostly thick shells; the matrices $[\psi_i]$ can be simplified by omitting the second summand in such applications. Experience with tests and applications nevertheless testifies that keeping the second summand improves the element convergence.
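A minimal Octave sketch of the matrices $[\psi_i]$ of Eq. (19), with illustrative triangle distances; the second summand discussed above is the $S/(12A)$ term, which would simply be dropped in the thick-shell simplification:

```octave
% Minimal sketch (GNU Octave) of Eq. (19). a, b are the triangle
% distances, A the area, d the depth; all values are illustrative.
a = [0.4; -0.9; 0.5];  b = [-0.8; 0.3; 0.5];
A = 0.44;  d = 0.6;
j = [2; 3; 1];  k = [3; 1; 2];          % cyclic node permutation
psi = cell(3, 1);
for i = 1:3
  S = [a(k(i))^2 - a(j(i))^2, b(j(i))*a(j(i)) - b(k(i))*a(k(i));
       b(j(i))*a(j(i)) - b(k(i))*a(k(i)), b(j(i))^2 - b(k(i))^2];
  psi{i} = ([0 -1/3; 1/3 0] - S/(12*A)) / d;   % [psi_i] of Eq. (19)
end
```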
4.3. TM flat slab element

Simple linear constitutive equations are assumed for the transverse shear,

$$v_x = G\gamma_x, \qquad v_y = G\gamma_y. \tag{21}$$

Here $v_x$ is the standard shear force in the element local y-z plane when looking along the x axis, and $v_y$ is the standard shear force in the element local x-z plane when looking along the y axis. The contribution of the shear stiffness to the element stiffness matrix is written down in terms of the 5×5 submatrices associated with the translational DOFs of each node: two at the top vertex, one transverse, and two at the bottom vertex. The column matrices $\{\varphi_i\} = \{-a_i, b_i\}^T/2A$ and the 2×2 matrices $[\psi_{ij}] = [\psi_i]^T[\psi_j]$ help to write down the shear contribution to the submatrices:

$$[k_{shear,i,j}] = \frac{GAd}{2}\begin{bmatrix} [\psi_{ij}] & [\psi_i]^T\{\varphi_j\} & -[\psi_{ij}] \\ \{\varphi_i\}^T[\psi_j] & \{\varphi_i\}^T\{\varphi_j\} & -\{\varphi_i\}^T[\psi_j] \\ -[\psi_{ij}] & -[\psi_i]^T\{\varphi_j\} & [\psi_{ij}] \end{bmatrix} \tag{22}$$

The complete 5×5 stiffness submatrices are the sums of the bending stiffness submatrices (18), extended by an empty third column and row inserted between the 2×2 blocks, and the submatrices (22):

$$[k_{i,j}] = [k_{bend,i,j}]_{extended} + [k_{shear,i,j}] \tag{23}$$

The whole element stiffness matrix is the 15×15 composition of the submatrices $[k_{i,j}]$, but the submatrices are assembled directly into the global stiffness matrix, so that the whole matrix is never set up. Cracking in the bed joints followed by the development of virtual hinges in the arch has been considered the dominant failure pattern of masonry bridges since the early attempts [10], [11] to assess the load capacity. Other failure modes, like shear sliding of the bed joints or transverse tension cracking of the arch, require sophisticated material models and properties which are almost impossible to obtain in the design practice. This stiffness matrix can be used for the solution of flat slabs loaded both in and out of plane, but in shells the bottom face DOFs $u_{b,x}$, $u_{b,y}$ require a special treatment, since they are not simply shared by the elements attached to a node.

4.4. TM shell element

The DOFs at the bottom vertices of the elements connecting to a node do not lie in the same plane and are not independent of the DOFs at the top vertices. The equilibrium of the nodal forces at the top vertices is affected by the nodal forces at the bottom vertices. These two conjugate deficiencies must be removed. The normal to the shell surface is needed for a consistent formulation of the TM element. The normal is rigid in the frame of the Timoshenko-Mindlin shell theory, which implies that all points of the normal share the same lateral displacement component. The exact definition of the normal direction $\vec{n}$ would be through the mathematical definition of the shell surface. For practical applications of the TM element, it is sufficient to define the unit normal $\vec{n}_i$ at a node i as the normalized sum of the normals of all elements connecting to the node, the length of each element normal being proportional to the sine of the angle of the two adjacent sides of the respective element. The position vectors of the nodes are denoted $\vec{x}_i$, with the nodes of the connecting elements i, j, k ordered counterclockwise when viewed from the top side of the shell. The sum is

$$\vec{n}_{s,i} = \sum_e \vec{n}_e \sin(\alpha_e) = \sum_e \left(\frac{(\vec{x}_j - \vec{x}_i) \times (\vec{x}_k - \vec{x}_i)}{|\vec{x}_j - \vec{x}_i|\,|\vec{x}_k - \vec{x}_i|}\right)_e, \qquad \vec{n}_i = \frac{\vec{n}_{s,i}}{|\vec{n}_{s,i}|}$$

The positive direction of vector $\vec{n}_i$ is outwards from the top surface of the shell. The vector also defines the tangential plane $\tau_i$ at node i, $\tau_i \perp \vec{n}_i$.
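The sine-weighted nodal normal lends itself to a compact implementation: the cross product of the two element sides adjacent to the node has the magnitude $|\vec{x}_j - \vec{x}_i||\vec{x}_k - \vec{x}_i|\sin(\alpha_e)$, so dividing by the side lengths leaves a vector of length $\sin(\alpha_e)$ along the element normal. A minimal Octave sketch with illustrative coordinates:

```octave
% Minimal sketch (GNU Octave) of the sine-weighted nodal normal of
% Section 4.4. Node coordinates are illustrative; elems lists the
% elements meeting at node 1, nodes ordered counterclockwise when
% viewed from the top surface of the shell.
X = [0 0 0; 1 0 0.1; 0 1 0.1; -1 0.2 0.1];   % node coordinates (rows)
elems = [1 2 3; 1 3 4];                      % elements sharing node i = 1
ns = zeros(1, 3);
for e = 1:rows(elems)
  i = elems(e,1);  j = elems(e,2);  k = elems(e,3);
  sj = X(j,:) - X(i,:);  sk = X(k,:) - X(i,:);
  ns = ns + cross(sj, sk) / (norm(sj)*norm(sk));  % length = sin(alpha_e)
end
n = ns / norm(ns);                           % unit nodal normal
```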
The top vertices of the connecting elements share the displacement $\vec{r}_{t,i}$ of the top end of the rigid normal. The global components of vector $\vec{r}_{t,i}$ constitute the first three DOFs of the node i. The index of the node is omitted forthwith, since a single node is considered. The compatibility of displacements at the bottom facets of the connecting elements demands that the projections on the $\tau$ plane of the bottom vertex displacements of all connecting elements share the same displacement vector $\vec{r}_{b,\tau} \subset \tau$.

A coordinate system is established in the $\tau$ plane such that the axis 1 direction vector $\vec{n}_1$ lies in the intersection of the global coordinate plane x-z and the $\tau$ plane, oriented as the global x axis. The axis 2 direction vector is $\vec{n}_2 = \vec{n} \times \vec{n}_1$ (vector product). The component expansion of vector $\vec{r}_{b,\tau}$ is defined this way. The two components in the $\tau$ plane define the two complementary DOFs of the node. Note that they are not the displacement components in the global coordinate system x-y-z, but in the local 'tangential' plane instead. The transformation matrix from the global to these tangential vector components is

$$[T]_\tau = \begin{bmatrix} \{n_1\}^T \\ \{n_2\}^T \\ \{n\}^T \end{bmatrix} \tag{24}$$

The whole bottom vertex displacement vector is $\vec{r}_b = \vec{r}_{b,\tau} + \vec{n}(\vec{n}\cdot\vec{r}_t) = \vec{r}_{b,\tau} + (\vec{n}\otimes\vec{n})\,\vec{r}_t$, but just the two components of $\vec{r}_{b,\tau} \subset \tau$ constitute the two complementary DOFs of the node. The out-of-$\tau$-plane component depends on the basic three DOFs $\vec{r}_t$ of the node.

Several coordinate systems (all Cartesian) are employed: the global one, common for all; the $\tau$ system, common for a node; and the local e systems of the individual elements. The convention is adopted for the component matrices that a matrix with components in the global system has no subscript, a subscript $\tau$ in the $\tau$ system, and a subscript e in the element systems. Furthermore, the transformation matrix $[T]_e$ from the global to the element system has the subscript e. The matrix form of the expression for the element DOFs is

$$\begin{Bmatrix} \{r_t\}_e \\ \{r_b\}_e \end{Bmatrix} = \begin{bmatrix} [T]_e & 0 \\ [T]_e\{n\}\{n\}^T & [T]_e[T]_\tau^T \end{bmatrix}\begin{Bmatrix} \{r_t\} \\ \{r_b\}_\tau \end{Bmatrix} \tag{25}$$

Recall that both displacement component matrices $\{r_b\}_e$ and $\{r_b\}_\tau$ have zero third components, so that the third column and row of the transformation matrix in (25) can be omitted. The nodal forces conjugate to $\{r_t\}_e$ and $\{r_b\}_e$ are denoted $\{g_t\}_e$ and $\{g_b\}_e$. The rigid normals to the shell surface imply that the equilibrium equations of the top and bottom joints of a node are not independent. Just a single equilibrium equation can be written in the direction $\vec{n}$, and it is assigned to the top end of the rigid normal, the top joint of the node. The two remaining equations at the bottom joint of the node include the components acting in the $\tau$ plane. The component decompositions of the element nodal forces in the coordinate axes tripod $\vec{n}_1$, $\vec{n}_2$, $\vec{n}$ (the node local $\tau$ system) are added in the matrix equilibrium equation of the bottom joint of the node:

$$\sum_e \{g_b\}_{\tau,e} = \sum_e [T]_\tau[T]_e^T\{g_b\}_e$$

The sum includes all elements connecting at the node; only the first two ($\tau$-plane) components of it are retained at the bottom joint. The third (transverse) components, $\{n\}\{n\}^T\{g_b\}_e$, are added to the forces acting at the top vertex of the node to obtain the final nodal forces of the element:

$$\begin{Bmatrix} \{g_t\} \\ \{g_b\}_\tau \end{Bmatrix} = \sum_e \begin{bmatrix} [T]_e^T & \{n\}\{n\}^T[T]_e^T \\ 0 & [T]_\tau[T]_e^T \end{bmatrix}\begin{Bmatrix} \{g_t\}_e \\ \{g_b\}_e \end{Bmatrix} \tag{26}$$

The transformation matrices of the nodal displacements and forces are transposes of each other, which testifies to their correctness. Note that the last scalar equations in (25) and (26) can be omitted, and so can the last columns of the transformation matrices. The transformation matrices are then 5×5 in size. They are specific for each node of an element, since the $\{n\}$ and $[T]_\tau$ matrices are different at each node. For an easy reference, they are denoted $[T_i]$ for node i forthwith. The submatrices $[k_{i,j}]$ are transformed to the global coordinate system,

$$[k_{i,j}]_g = [T_i]^T[k_{i,j}][T_j], \tag{27}$$

and assembled into the system matrix.
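A minimal Octave sketch of the transformation matrix of Eq. (24), built from a unit nodal normal; the normal is an illustrative value, and the sketch assumes $n_3 \neq 0$, so that the construction of the axis 1 vector below does not degenerate:

```octave
% Minimal sketch (GNU Octave) of Eq. (24). n1 lies in the intersection
% of the global x-z plane and the tau plane, oriented as the global
% x axis; n2 = n x n1 completes the right-handed tripod.
n  = [0.3; 0.1; 0.9];  n = n / norm(n);   % illustrative unit normal
n1 = [n(3); 0; -n(1)];                    % in the x-z plane, n1 . n = 0
n1 = n1 / norm(n1) * sign(n1(1));         % unit, oriented as global x
n2 = cross(n, n1);
T_tau = [n1'; n2'; n'];                   % Eq. (24)
```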
4.5. TM element code and test

The TM element has been implemented in a dedicated MATLAB/Octave code including a simple graphic output. Just the shell reference surface is drawn, to keep the view readable. The maximum compression stresses in the bed joints and the relative depths of the cracks are inserted in each element. These values are sufficient to decide on the arch load capacity in the context of the present model. The graphic output uses vector graphics, so that the pictures can be zoomed with a stable resolution. The mesh generator [8] is used for the shell surface meshing. The code is entirely self-contained: no input data file and no pre- or postprocessing is necessary. It contains just about 660 Octave command lines, not including the mesh generator.

The pinched cylinder with rigid end diaphragms in Figure 8 is a popular benchmark to test shell elements. The shell is thin, d/r = 0.01, so the test is a severe one for a thick shell element. The benchmark solution by double Fourier series with 80 terms in both directions was first presented in [12], based on the work [13].

Figure 8. Benchmark shell (pinched cylinder with rigid end diaphragms).

Figure 9. Convergence of the FEM deflections: ratio of the FEM deflection to the benchmark deflection versus the number of elements along the directrix.

The Kirchhoff-Love kinematic assumptions are used in the benchmark, so that a definite deflection is obtained. Differences can be expected between the present and benchmark solutions in the vicinity of the pin force, in particular for fine meshes. The Mindlin assumptions imply an infinite deflection beneath the force in the continuum model, thus making the discretized models mesh dependent and inherently non-convergent. Owing to the three symmetry planes, just 1/8 of the cylinder needs to be considered. Four mesh densities were considered, with 8, 16, 32 and 48 elements along the directrix. The ratio of the loaded node deflection of the TM finite element solutions to the benchmark deflection 1.825 · 10⁻⁵ is shown in Figure 9. In spite of the correction of the transverse shear strain in Equation 20, residual shear locking still persisted and affected the results in this thin shell, in particular for low density meshes. In order to reduce it further, the material shear stiffness in the transverse direction (element local x-z and y-z planes) was selectively lowered eight times. This has a negligible effect on the overall response of the FEM model, since the contribution of the transverse shear compliance to it is small. At the denser mesh side of Figure 9, the curve already tends to approach the infinite deflection of the continuum Mindlin shell and to break the Kirchhoff-Love benchmark limit. The deformed mesh is shown in Figure 10.

Figure 10. Deformed mesh, 48 elements along the directrix.

Figure 11. Radial component of deflection along the directrix, benchmark by the solid line, [12].

The radial component of the deflection along the directrix from the benchmark solution [12] in Figure 11 compares well to the deformed mesh in Figure 10; note the shallow inward deflection in the lower part of the front directrix. The precision and convergence of the deflection beneath the pin force are, as expected, worse than for elements based on the Kirchhoff-Love assumptions. A solution of the Poniklá bridge arch loaded by the characteristic arch selfweight and a standard LM1 tandem axle with wheel forces of 250 kN is provided as an illustration of the code output in Figure 12. The displacements are scaled up 2000 times.
In comparison to the 2D solution of the same bridge arch in Section 3, the interaction and selfweight of the fill and pavement are not accounted for; the tandem axle load and the mesh density are slightly different, too. The tandem axle position in the transverse direction is the extreme eccentric one within the bounds defined by EN 1991-2. The differences between the loaded/unloaded sides of the arch barrel are 1261/405 kPa in the extreme normal stresses in the bed joints and 0.41/0.25 in the relative crack depths. The backfill and pavement stiffness and selfweight reduce the differences in the real bridge, but the example testifies that the 2D models need corrections. The 3D bridge model analogous to the one in Section 3 is currently being worked on. The absence of rotational DOFs improves the convergence of the iterations. The no-tension, history independent material model admits a single load increment strategy.

Figure 12. Bridge at Poniklá, 3D shell: compression stresses [kPa] and relative depths of the cracks in the bed joints.

In the illustration example, the ratio 0.005 of the RMS norms of the imbalance and the load was reached in 3 iterations.

5. Conclusions

The simplest finite elements for 2D and 3D thick shells have been developed for applications in the limit load analysis of masonry arch bridges.
They feature a seamless combination with 2D CST and 3D tetrahedron elements for the bridge's backfill. Exclusively translational DOFs are used, which improves the convergence of the iterations. The no-tension condition is applied to the normal stress in the bed joint planes. The output minimum normal stresses and crack depths in these planes can be used to assess the ultimate limit loads of a bridge. A sample analysis of an arch of the bridge at Poniklá testifies that this material law and these shell elements reproduce a characteristic failure mode of stone masonry arch bridges, the gradual creation of the virtual hinges.

Acknowledgements

The author was supported by the grant NAKI DG20P02OVV001, 'Tools for the preservation of historical values and functions of arch and vaulted bridges', provided by the Ministry of Culture of the Czech Republic.

References

[1] The Highways Agency, London. Design Manual for Roads and Bridges, vol. 3, Highway Structures: Inspection and Maintenance, section 4, Assessment, part 3, The Assessment of Highway Bridges and Structures, 2001.
[2] Ministry of Transport of the Czech Republic. Zatížitelnost zděných klenbových mostů (Load rating of masonry arch bridges), 2008.
[3] T. E. Ford, C. E. Augarde, S. S. Tuxford. Modelling masonry arch bridges using commercial finite element software. In Proceedings of the 9th International Conference on Civil and Structural Engineering Computing. Civil-Comp Press, Stirling, 2001.
[4] Mott-MacDonald, 20-26 Wellesley Road, Croydon, CR9 2UL, UK. CTAP Manual for the Assessment of Masonry Arch Bridges, 1990.
[5] R. Bridle, T. Hughes. An energy method for arch bridge analysis. Proceedings of the Institution of Civil Engineers 89:375–385, 1990. https://doi.org/10.1680/iicep.1990.9397.
[6] M. Gilbert. RING: a 2D rigid block analysis program for masonry arch bridges. In Proceedings of the 3rd International Arch Bridges Conference, pp. 459–464, 2001.
[7] W. Harvey. Application of the mechanism analysis to masonry arches. Structural Engineer 66:77–84, 1988.
[8] P.-O. Persson, G. Strang. A simple mesh generator in MATLAB. SIAM Review 46(2):329–345, 2004. https://doi.org/10.1137/s0036144503429121.
[9] R. D. Mindlin. Influence of rotatory inertia and shear on flexural motions of isotropic, elastic plates. ASME Journal of Applied Mechanics 18(1):31–38, 1951. https://doi.org/10.1115/1.4010217.
[10] A. J. S. Pippard, R. Ashby. An experimental study of a voussoir arch. Proceedings of the Institution of Civil Engineers 10:383, 1938.
[11] A. J. S. Pippard. The Civil Engineer in War, vol. 1, chap. The approximate estimation of safe loads on masonry bridges, pp. 365–372, 1948. https://doi.org/10.1680/ciwv1.45170.0021.
[12] G. Lindberg, M. D. Olson, G. R. Cowper. New developments in the finite element analysis of shells, 1969. https://apps.dtic.mil/sti/pdfs/ad0707780.pdf.
[13] W. Flügge. Stresses in Shells. Springer Verlag, 1962.
Fuzzy dynamic analysis of a 2D frame

P. Štemberk, J. Kruis

This paper deals with the dynamic analysis of a 2D concrete frame with uncertainties which are an integral part of any real structure. The uncertainties can be modeled by a stochastic or a fuzzy approach. The fuzzy approach is used, and the influence of uncertain input data (modulus of elasticity and density) on output data is studied. Fuzzy numbers are represented by α-cuts. In order to reduce the volume of computation in the fuzzy approach, the response surface function concept is applied. In this way, the natural frequencies and mode shapes described by fuzzy numbers are obtained. The results of fuzzy dynamic analysis can be used, e.g., in the seismic design of structures based on the response spectrum.

Keywords: fuzzy numbers, natural frequency, mode shape, response surface function.

1 Introduction

Concrete, as a convenient building material, inherently involves uncertainty about its composition, which is difficult to eliminate completely. However, this uncertainty can be assessed by statistical, fuzzy, or other suitable tools. In the case of concrete structures, such as frames made of reinforced concrete, it is costly to obtain an experimental data set large enough to yield the desired statistical characteristics of the material parameters. Instead, the knowledge gained by practicing engineers can be included as fuzzy numbers in the material modeling. For design purposes, traditionally, we may wish to conduct a statistical analysis using the statistical characteristics of several measured events. In the case of an earthquake, however, the measured data for each site of interest are not particularly dense, leaving the statistical characteristics with little relevance. On the other hand, the expected seismic load at a site can alternatively be expressed by fuzzy sets [1], which take into account the scarcity of seismic stations and the information about the local sub-soil composition.

In this paper, an approach to dynamic analysis based on fuzzy set theory is presented as a germane alternative to classical stochastic dynamic analysis. The material parameters of reinforced concrete are considered to be fuzzy quantities with a given distribution, i.e., fuzzy numbers with a desired shape of the membership function [2]. The dynamic analysis is then performed with the help of fuzzy arithmetic on either α-cuts or computation-efficient (L, R) numbers [3]. The result of such an analysis is in the form of fuzzy numbers, which is less expensive than the stochastic approach in terms of computation time, but still provides an idea of the distribution of the sought quantity. In order to further improve the computational efficiency, inspired by [4], the concept of a surface response function is utilized [5, 6]. This approach is demonstrated on an illustrative example of a 2D frame, where the effect of the uncertain material parameters transpires in the corresponding distributions of the natural mode shapes and natural frequencies of the analyzed two-dimensional frame. A methodology for a possible application to seismic design is also explained. It is believed that this approach enables practicing engineers and other people with knowledge to contribute actively to analyses of seismic-sensitive structures.
2 Dynamic finite element analysis

The finite element method applied to dynamical problems of structures results in a system of the form

$$M\,\frac{\mathrm{d}^2\boldsymbol{r}(t)}{\mathrm{d}t^2} + C\,\frac{\mathrm{d}\boldsymbol{r}(t)}{\mathrm{d}t} + K\,\boldsymbol{r}(t) = \boldsymbol{f}(t), \tag{1}$$

where M denotes the mass matrix, C stands for the damping matrix, K denotes the stiffness matrix, f(t) expresses the load vector, r(t) is the vector of the computed nodal displacements, and t stands for time. Eq. (1) represents a semidiscrete problem, where the spatial coordinates are discretized while the time is still assumed to be continuous [7]. The analysis of the natural frequencies (eigenvalues) and natural mode shapes (eigenvectors) of an undamped structure is based on a simplified form of Eq. (1):

$$\left(K - \omega_0^2\,M\right)\boldsymbol{v} = \boldsymbol{0}. \tag{2}$$

The nonzero vector v is the eigenvector containing the natural mode shapes, and ω₀ stands for the natural frequency. Eq. (2) represents a generalized eigenvalue problem. The most common method for solving such problems is subspace iteration [8].
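For a small system, Eq. (2) can be solved directly with Octave's generalized eigensolver instead of subspace iteration. A minimal sketch with an illustrative 2-DOF stiffness and mass pair (not the frame of Section 4):

```octave
% Minimal sketch (GNU Octave) of Eq. (2): natural frequencies and mode
% shapes from the generalized eigenproblem (K - w0^2 M) v = 0.
% The matrices below are illustrative placeholders.
K = [ 2e7, -1e7;
     -1e7,  1e7];            % stiffness matrix [N/m]
M = diag([1500, 1000]);      % lumped mass matrix [kg]

[V, D] = eig(K, M);          % columns of V are the mode shapes
[omega2, idx] = sort(diag(D));
V = V(:, idx);               % modes ordered by frequency
omega0 = sqrt(omega2);       % natural circular frequencies [rad/s]
f0 = omega0 / (2*pi);        % natural frequencies [Hz]
```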
3 Fuzzification of dynamic finite element analysis

The uncertainty that is present in the input parameters can be tackled with the help of fuzzy set theory [1]. In this theory, uncertain quantities are defined in terms of fuzzy sets. Unlike in the classical set theory, here the membership of an element in a fuzzy set also assumes values between 0 and 1, where 0 means "does not belong" and 1 means "definitely belongs" to a fuzzy set. Usually, fuzzy sets represent a vague verbal evaluation. In cases when a fuzzy set represents a numeral, it is called a fuzzy number.

3.1 Fuzzy numbers

The notion of a fuzzy number arises from the experience of everyday life, where many phenomena which can be quantified are not characterized in terms of absolutely precise numbers. Fuzzy numbers are fuzzy sets which are defined on the set of real numbers. Their membership function assigns the degree of 1 to the central (also called nominal, modal or mean) value and lower degrees to other numbers, which reflect their proximity to the central value according to the used membership function. The membership function should thus decrease from 1 to 0 on both sides of the central value. Such fuzzy sets are called fuzzy numbers. An example of a fuzzy number is shown in Fig. 1, where μ represents the membership function and a1 and a2 stand for two real numbers on the real axis. The intervals defined for a specific value of the membership function, e.g. μ = 0.7, represent the so-called α-cuts.

Fig. 1: A normal fuzzy number and its α-cuts.

A fuzzy number can be equally expressed either by a nominal value and a membership function on each side of the nominal value, or by a set of α-cuts.

3.2 Fuzzy arithmetic

A fuzzy arithmetic operation depends on the definition of a fuzzy number. In the cases when fuzzy numbers are defined by a set of α-cuts, the problem of fuzzy arithmetic is reduced to the well-known arithmetic operations on intervals, which are applied to each α-cut. Implicitly, this means a sequence of binary combinations on each α-cut in order to obtain the minimum and the maximum value for each α-cut. The finite element method converts a problem into a system of linear equations, in this case a system of fuzzy linear equations, which comprises an extensive number of arithmetic operations. This fact makes the formulation in the above terms merely unsolvable, due to the number of all necessary binary operations. To eliminate this drawback of the α-cut formulation, new techniques for solving fuzzy linear equation systems have been developed, e.g. [9]. However, these techniques are not easily applicable to robust problems, such as fuzzy dynamic finite element analysis. Therefore, another technique for reducing the large number of binary combinations has been exploited. This technique was originally developed for other problems, such as statistical analysis.
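A minimal Octave sketch of one such binary interval operation on a single α-cut; the α-cuts of E and ρ below are illustrative (±10 % around the nominal values used later in Section 4):

```octave
% Minimal sketch (GNU Octave) of interval arithmetic on one alpha-cut.
% A binary operation evaluates all four endpoint combinations and
% keeps the extremes; a fuzzy product is handled this way cut by cut.
a = [27e9, 33e9];      % alpha-cut of E, e.g. alpha = 0 [Pa]
b = [2250, 2750];      % alpha-cut of rho [kg/m^3]

c = [a(1)*b(1), a(1)*b(2), a(2)*b(1), a(2)*b(2)];
prod_cut = [min(c), max(c)];   % alpha-cut of the product E*rho
```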
3.3 Surface response function

Fuzzy analyses, as well as stochastic analyses, suffer from the non-occurrence of analytical solutions in the case of non-deterministic input data. This situation can be remedied by the following. Let $\tilde{x} \in \tilde{X}$ denote the vector of input data from the space of input data $\tilde{X}$, and let $\tilde{y} \in \tilde{Y}$ denote the vector of output data from the space of output data $\tilde{Y}$. Both stochastic and fuzzy analyses require knowledge of the response, which can be written in the form

$$\tilde{y} = f(\tilde{x}), \tag{3}$$

where f denotes the response of a system (structure) to the input data collected in the vector $\tilde{x}$. This represents a mapping from the space $\tilde{X}$ to the space $\tilde{Y}$. The non-occurrence of an analytical solution requires the application of a suitable numerical method, which discretizes the problem and solves it numerically. The space $\tilde{X}$ is discretized by an n-dimensional space, X, and similarly the space $\tilde{Y}$ by an m-dimensional space, Y. A stochastic analysis based on simulation methods generates thousands or millions of samples of input data (the vectors x), and then a deterministic computation follows. A fuzzy analysis based on α-cuts requires the computation of all combinations of input data, which also leads to thousands or millions of samples. Both approaches yield the response of a system based on a huge amount of output data (thousands or millions of vectors y) obtained from many executions of standard (deterministic or crisp) solutions. In order to reduce the necessary number of computation runs, the concept of a response surface function has been used many times. The basic idea of the response function is to approximate the operator f by a suitable function, which should be as simple as possible. The function for the k-th output parameter can be written in the form

$$f^{(k)}(\boldsymbol{x}) = a^{(k)} + \sum_{i=1}^{n} b_i^{(k)} x_i + \sum_{i=1}^{n}\sum_{j=1}^{n} c_{ij}^{(k)} x_i x_j, \tag{4}$$

where the superscript identifies an output parameter and n denotes the dimension of the space of the input data, X. The unknown coefficients are obtained from the least square method in the following way. Let the set of input parameters contain s samples. Each sample is located in the vector $x^{[i]}$, where the superscript identifies a sample. The standard computation gives the output data, which are collected in the vectors $y^{[i]}$. The coefficients of the response function minimize the expression

$$F^{(k)}(a^{(k)}, b_i^{(k)}, c_{ij}^{(k)}) = \sum_{i=1}^{s}\left(f^{(k)}(\boldsymbol{x}^{[i]}) - y_k^{[i]}\right)^2. \tag{5}$$

In many cases, it is not necessary to use the quadratic terms. Considering only the linear terms simplifies further computations.
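A minimal Octave sketch of the linear variant of the fit, Eqs. (4)-(5), for two input parameters; the sampled outputs y are synthetic placeholders standing in for the deterministic FE runs:

```octave
% Minimal sketch (GNU Octave) of the least-squares fit of a linear
% response surface in two inputs (E, rho). The outputs y are synthetic
% placeholders for the deterministic FE results.
[E, R] = meshgrid(linspace(27e9, 33e9, 3), linspace(2250, 2750, 3));
X = [E(:), R(:)];                    % 9 samples: the full 3x3 grid
s = rows(X);
y = 3e-10*X(:,1) - 1e-6*X(:,2) + 0.02*randn(s, 1);

A = [ones(s,1), X];                  % design matrix: a + b1*x1 + b2*x2
coef = A \ y;                        % least-squares coefficients
y_hat = A * coef;                    % response surface predictions
```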
for vertical displacements, which do not play an important role in seismic design (at point b, which is the intermediate joint of the first floor), the response function could not fit the proper shape of the membership function, which is evidenced in figs. 9 and 10. 5 possible applications in the design of earthquake resistant structures, it is essential not to neglect any uncertainty as it may lead to an erroneous conclusion due to the dynamic simulation which may amplify such uncertainty beyond all limits. for these reasons, it seems reasonable to express uncertain numerical data in terms of fuzzy numbers and to use them as such in analyses to cover all possible solutions. in the previous section, an approach to natural vibration analysis was shown which provides input data for further analyses considering, e.g. earthquake induced vibrations. spectral-analysis based methods require only the maximum values obtained for each natural mode in order to evaluate the excited vibration. therefore, it is desirable to verify whether these values can be satisfactorily expressed by surface response functions which were obtained only by binary combinations of material parameters with three values (minimum, modal value, maximum). the resulting surface response function was also obtained for five values, corresponding to the �-cut values with a equal to 0, 0.5 and 1. however, this meant 52×4 (� 390625) independent runs of the dynamic finite element analysis. the improvement was negligible, and compared with the computational effort it proved truly unnecessary. 120 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 44 no. 5 – 6/2004 fig. 8: distribution of natural frequencies fig. 9: distribution of horizontal displacement fig. 10: distribution of vertical displacement 6 conclusions the fuzzy approach has been applied to dynamic analysis of a 2d concrete frame with uncertainties. the natural frequencies and the natural mode shapes have been computed and described by fuzzy numbers. the response surface function has been applied in order to reduce the number of required computations. the results have been compared with a full analysis based on an evaluation of all combinations (in the order of thousands) and very good accordance has been obtained. the first lower natural mode shapes are naturally described more precisely than the higher ones. the errors of the obtained results for lower natural mode shapes are less than 5%. there are some quantities belonging to higher modes where the response surface function gives results unacceptable from the point of the fuzzy set theory. these difficulties should be studied in future. 7 acknowledgment this work was supported by project no. 103/04/1320 of the grant agency of the czech republic, which is gratefully acknowledged. references [1] zadeh l. a.: “fuzzy sets.” information control, vol. 8 (1965) no. 3, p. 338–352. [2] valliappan s., pham t. d.: “construction the membership function of a fuzzy set with objective and subjective information.” microcomputers in civil engineering, vol. 8 (1993), p. 75–82. [3] kaufman a., gupta m. m.: “introduction to fuzzy arithmetic: theory and application.” new york, van nostrand reinhold company, inc.1985. [4] akpan u. o., koko t. s., orisamolu i. r., gallant b. k.: “practical fuzzy finite element analysis of structures.” finite elements in analysis and design, vol. 38 (2001), p. 93–111. [5] bucher c. g., chen y. m., schueler g. 
[6] Rajashekhar M. R., Ellingwood B. R.: "A new look at the response surface approach for reliability analysis." Structural Safety, Vol. 12 (1993), p. 205–220.
[7] Bathe K. J.: "Finite Element Procedures." Prentice-Hall, Inc., 1996.
[8] Bittnar Z., Šejnoha J.: "Numerical Methods in Structural Mechanics." ASCE Press, 1996.
[9] Buckley J. J., Qu Y.: "Solving systems of linear fuzzy equations." Fuzzy Sets and Systems, Vol. 39 (1991), p. 33–43.

Ing. Petr Štemberk, Ph.D.
phone: +420 224 354 634, fax: +420 233 335 797, e-mail: stemberk@fsv.cvut.cz
Department of Concrete Structures

Ing. Jaroslav Kruis, Ph.D.
phone: +420 224 354 369, fax: +420 224 310 775, e-mail: jk@cml.fsv.cvut.cz
Department of Structural Mechanics

Czech Technical University in Prague, Faculty of Civil Engineering, Thákurova 7, 166 29 Praha 6, Czech Republic

Process optimization and performance evaluation of a downdraft gasifier for energy generation from wood biomass

Ilesanmi Daniyan (a, *), Felix Ale (b), Ikenna Damian Uchegbu (c), Kazeem Bello (d), Momoh Osazele (c)

a Tshwane University of Technology, Department of Industrial Engineering, Staatsartillerie Road, Private Bag X680, Pretoria 0001, South Africa
b National Space Research & Development Agency, Department of Engineering and Space Systems, Abuja, Nigeria
c Afe Babalola University, Department of Mechanical & Mechatronics Engineering, P. M. B. 5454, Ado Ekiti, Nigeria
d Federal University Oye-Ekiti, Department of Mechanical Engineering, Oye-Are Road, Oye-Ekiti, Nigeria
* corresponding author: afolabiilesanmi@yahoo.com

Abstract. In recent times, due to the increasing demand for energy and the need to address environment-related issues, a great deal of focus has been given to alternative sources of energy, which are green, sustainable and safe.
This work considers the process optimization and performance evaluation of a downdraft gasifier suitable for energy generation using wood biomass. The assessment of the performance of the downdraft gasifier was based on the amount of output energy generated as well as the emission characteristics of the output. The response surface methodology (RSM) was employed for the determination of the optimum range of the process parameters that will yield the optimum conversion of the biomass to energy. The optimum process parameters that produced the highest rate of conversion of biomass to energy (2.55 Nm³/kg) during the physical experiments were: temperature (1000 °C), particle size (6.0 mm) and residence time (35 min). The produced gas indicated an appreciable generation of methane gas (10.04 % vol.), but with a significant amount of CO (19.20 % vol.) and CO₂ (22.68 % vol.). From the numerical results obtained, the gas yield was observed to increase from 1.86908 Nm³/kg to 2.40324 Nm³/kg as the temperature increased from 800 °C to 1200 °C. The obtained results indicate the feasibility of producing combustible gases from the developed system using wood chips. It is envisaged that the findings of this work will assist in the development of an alternative and renewable energy source in an effort to meet the growing energy requirements.

Keywords: downdraft gasifier, energy, RSM, optimization, wood biomass.

1. Introduction

Biomass resources are renewable sources of energy which produce combustible fuels from organic materials such as crop residues, wood chips, food or animal wastes [1–3]. Several researchers have shown that the development of sustainable energy resources is a catalyst for the development of any nation [4, 5]. Renewable sources of energy can be in the form of wind, solar photovoltaics, hydropower, biomass, geothermal energy, etc. [6]. The use of biomass as solid fuel for energy generation is predominant in African countries, especially for domestic applications such as cooking [7, 8]. The increasing global demand for energy can be linked to the increasing population, especially on the African continent, where the population is growing exponentially. Biomass is a major resource for developing countries, but there is an over-reliance on the use of traditional biomass for energy generation on the African continent, which poses a significant health risk [9, 10]. A recent study estimates the number of people who rely on biomass solid fuels at 700 million in sub-Saharan Africa [11]. This can be attributed to the lack of modern energy facilities, the proximity to biomass resources in the rural communities, the cost of the solid biomass fuel, etc. Over the years, biomass resources have been used for energy generation because they are cost effective, readily available, easily accessible, renewable and easy to store [9–11]. Wood is one of the primary biomass resources used for energy generation. Research has shown that, in the traditional process of wood combustion, only a partial utilization of the energy inherent in the wood biomass takes place, with some energy lost into the environment in the form of emissions. The modern process of harnessing energy from wood biomass via gasification involves the collection of the emission and its combustible components, to minimize the energy losses during the conversion process [12–14]. Furthermore, the traditional way of harnessing energy from wood biomass causes environmental and indoor pollution due to the generated emissions, depending on the usage, population density, location
furthermore, the traditional way of harnessing energy from wood biomass causes environmental and indoor pollution due to the generated emissions, depending on the usage, population density, location 601 https://doi.org/10.14311/ap.2021.61.0601 https://creativecommons.org/licenses/by/4.0/ https://www.cvut.cz/en i. daniyan, f. ale, i. d. uchegbu et al. acta polytechnica and the layout of the environment where it is harnessed [15, 16]. this is because the heat energy produced from the incomplete combustion of the solid fuel generate gases due to a limited presence of oxygen during the combustion reaction [17]. if not controlled, such emissions pose a great health risk as studies have revealed that many health challenges and deaths can be attributed to it [18, 19]. a properly designed gasifier can suitably convert solid fuels such as wood chips into safe gaseous fuels. the energy generated via the gasification of wood biomass is suitable for domestic use, powering stationary gasoline engines, such as electric generators, pumps, and industrial equipment of light-to-medium duty. modified gasoline engines can also be powered primarily via the energy from wood biomass [20]. the generation of energy via the use of wood biomass boasts safety, environmental, social and economic benefits, if properly harnessed when compared to an energy generated through the traditional wood combustion process. its environmental benefit stems from the fact the energy usage significantly reduces the net carbon emissions, thereby promoting a significant reduction in air pollution and global warming [21, 22]. by weight, the produced gas contains combustible elements and compounds, approximately 20 % of carbon oxide (co), 20 % of hydrogen (h), a certain amount of methane (ch4), with about 55 % of nitrogen (n), which is not combustible, and other gaseous and solid matters such as moisture, sulphur, and ash [23]. the process of the wood combustion produces carbon dioxide (co2) and water vapour (h2o) as the products of combustion as well as carbon oxide (co), a poisonous gas, as a by-product [24, 25]. the modern process of gasification is developed to ensure a complete transformation of the biomass constituents into clean gaseous fuels. however, in order to address the health risk posed by the continuous use of the traditional biomass for energy generation, there is a need for the development of an efficient system for the conversion of biomass into fuel. many papers have been reported in this regard. for instance, chingunwa et al. [26] developed a wood gasifier for powering an internal combustion engine. the gasifier operates in the combined heat and power modes for providing power and heat to domestic and industrial applications. the residence time and operating temperature are important factors, which determine the conversion rate of wood biomass into energy. li et al. [22] investigated the behaviour of wood biomass under a high temperature application. the findings from this study revealed that there was a significant mass loss of biomass at an elevated temperature at the devolatilization stage while requiring a long residence time to drive the conversion process to completion. in addition, lu et al. [7] investigated the effect of particle sizes and shapes, as well as the combustion characteristics of the wood particles on the combustion characteristics of the biomass using modelling techniques. 
The findings indicate that the particle size and shape significantly affect the dynamics of the biomass particle, as well as the drying, heating and reaction rates. The authors reported an inverse relationship between the particle size and the drying, heating and reaction rates. Ingle and Lakade [27] reported on the design and development of a downdraft gasifier system for the production of producer gas. A comparative analysis of the wood biomass with agricultural feedstock indicated that the wood biomass has a higher carbon monoxide and hydrogen content as well as a higher calorific value than the agricultural biomass briquettes. Masmoudi et al. [28] indicate that the temperature fields and the reactivity of the char affect the loading of the gasification process. In addition, the authors reported that another important factor affecting the yield of the produced hydrogen and carbon monoxide is the particle size of the biomass. Striugas et al. [29] performed an experimental analysis of the differences in the process parameters associated with the gasification of lump and pelletized fuel. The authors found that the major differences between these two types of feedstock include the temperature of the gasification process, the pressure drop and the residual content. The gasification of feedstock of a larger size, such as wood chips, requires an elevated temperature, up to 1100 °C, compared to waste and pellet biomass, which require a maximum reaction temperature between 800–850 °C.

The papers reviewed were able to establish the range of process parameters for the production of biogas. However, there is still a dearth of information regarding the process optimization of gasifiers; hence, it is envisaged that this work will contribute to the existing knowledge on the production of clean producer gas for thermal domestic needs. The novelty of this work includes the process optimization and performance evaluation of energy extraction through gasification using the response surface methodology, as well as the generation of design data for the development of a downdraft gasifier. The developed downdraft gasifier can serve as a template for scaling and future development. Furthermore, the process optimization through the use of numerical experimentation, validated via physical experiments, will assist in the determination of a feasible combination of the process parameters in order to keep them within the optimum range during the conversion of biomass to fuel. In addition, the work provides a development framework for the production of energy from wood biomass. The focus of this study is to evaluate the performance of a wood gasifier, to determine the optimum process parameters that will promote an efficient conversion of the wood chips (biomass) into energy, and to analyse the composition (by volume percentage) of the gases produced by the downdraft gasifier. The following sections present the materials and method employed, the results and discussion, as well as the conclusion and recommendation.

2. Materials and method

The design and construction of the downdraft gasifier had already been done and reported; this work follows the recommendation for a continuous performance evaluation of the system in terms of energy generation for powering small units in order to meet thermal domestic needs. The section is divided into four sub-sections (2.1–2.4).
The first sub-section presents the energy requirements and the thermal efficiency of the developed downdraft gasifier system, while the next sub-sections present the details of the conversion process of the wood biomass to energy. The third sub-section presents the optimization of the conversion process, in order to obtain the most feasible combination of the process parameters that will produce the optimum yield of gas, while the last sub-section presents how the analysis of the composition of the produced gas was achieved.

2.1. Energy requirements and thermal efficiency

The heating value of the gas produced is a function of the moisture content of the biomass. The lower the moisture content, the higher the energy content, and vice versa. The moisture content can be determined on a dry basis as well as on a wet basis; on a dry basis, it is expressed by Equation 1 [30]:

$$M.C._{dry} = \frac{\text{wet weight} - \text{dry weight}}{\text{dry weight}} \times 100, \tag{1}$$

where M.C.dry is the moisture content on a dry basis (%). The thermal efficiency of the system decreases with the amount of moisture in the biomass, and vice versa. This is due to the fact that there will be a heat loss during the drying, and consequently such an energy loss in the form of heat will no longer be available for the reduction reactions in the chemically bound energy of the gas. Therefore, the heating values increase with the reduction in the moisture content, and vice versa. The maximum air/gas intake of the engine, which governs the power output of the producer gas in the engine, is expressed by Equation 2 [31]:

$$\text{Max. air/gas intake} = \frac{1}{2} \times \frac{N \times D}{60 \times 1000} \tag{2}$$

$$\text{Max. air/gas intake} = 0.045\ \text{m}^3/\text{s} \tag{3}$$

where D is the outlet pipe diameter (m) and N is the speed (in rpm). With the stoichiometric air/gas ratio of 1.1 : 1.0,

$$\text{Max. gas intake} = \frac{1.0}{2.1} \times 0.045 = 0.0212\ \text{m}^3/\text{s} \tag{4}$$

The real gas intake is expressed as Equation 5 [31]:

$$\text{Real gas intake} = 0.0212 \times f, \tag{5}$$

where f is the volumetric efficiency of the engine, which is a function of the engine's revolutions per minute as well as the design of the air inlet manifold of the engine. Considering a speed of 1500 rpm and f equal to 0.8 (for a well-designed and clean air inlet manifold), the real gas intake (Rg) is given as

$$R_g = 0.0212 \times 0.8 = 0.017\ \text{m}^3/\text{s} \tag{6}$$

The thermal power (Pg) in the gas is given as Equation 7 [31]:

$$P_g = R_g \times H_v, \tag{7}$$

where Rg is the real gas intake, calculated as 0.017 m³/s, and Hv is the heating value of the gas, equal to 4800 kJ/m³:

$$P_g = 0.017 \times 4800 = 81.6\ \text{kJ/s} = 81.6\ \text{kW} \tag{8}$$

The engine efficiency is partly a function of the compression ratio of the engine. Hence, for a compression ratio of 9.5 : 1, the efficiency (ε) is estimated at 28 %. Equations 9 and 11 express the maximum mechanical output of this engine and the maximum electrical output (cos φ of the generator = 0.80), respectively [31]:

$$P_{m\,max} = P_g \times \varepsilon \tag{9}$$

$$P_{m\,max} = 81.6 \times 0.28 = 22.85\ \text{kW} \tag{10}$$

$$P_{e\,max} = P_{m\,max} \times \cos\varphi \tag{11}$$

$$P_{e\,max} = 22.85 \times 0.80 = 18.30\ \text{kVA} \tag{12}$$

By assumption, the thermal efficiency of the downdraft gasifier is considered to be 70 %; hence, the thermal power consumption (in kilowatts) is obtained from Equation 13:

$$T_p = \frac{P_g}{0.7} \tag{13}$$

Recalling that Pg = 81.6 kW,

$$T_p = \frac{81.6}{0.7} = 116.57\ \text{kW} \tag{14}$$
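The chain of Equations (5)-(14) reduces to a few lines of arithmetic. A minimal Octave sketch using the values quoted above:

```octave
% Minimal sketch (GNU Octave) of the Eq. (5)-(14) power chain, using
% the values quoted in the text.
Rg      = 0.017;            % real gas intake [m^3/s], Eq. (6)
Hv      = 4800;             % heating value of the gas [kJ/m^3]
Pg      = Rg * Hv;          % thermal power in the gas [kW], Eq. (7)
eps_eng = 0.28;             % engine efficiency at compression ratio 9.5:1
Pm_max  = Pg * eps_eng;     % max. mechanical output [kW], Eq. (9)
Pe_max  = Pm_max * 0.80;    % max. electrical output [kVA], Eq. (11)
Tp      = Pg / 0.7;         % thermal power consumption [kW], Eq. (13)
printf('Pg = %.1f kW, Pm = %.2f kW, Pe = %.2f kVA, Tp = %.2f kW\n', ...
       Pg, Pm_max, Pe_max, Tp);
```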
therefore, the biomass consumption of the downdraft gasifier is expressed as equation 15 [31]. gbc = tp bhv (15) gbc = 116.57 17000 = 0.0069 kg/s = = 0.0069 × 3600 = 24.84 kg/h (16) therefore, gbc pe max (17) 24.84 18.30 = 1.36 kg biomass produces 1 kwh of electricity. the yield of the gas produced is defined as the ratio of the flow rate of the produced total inert free gas to the mass flow rate of dry and ash free value of feedstock [32, 33]. equations 18 and 19 express the yield of the producer gas (yg ) and the cold gas efficiency (cge), respectively [31]. yg = (qair × 79) (n2 × bi) (18) cge = synhv × yg bhv (19) where yg is the yield of the producer gas (nm3/kg), qair is the quantity of the input air (nm3/h), bi is the biomass input (kg/h), n2 is the nitrogen mass fraction of the output gas, cge is the cold gas efficiency (%), synhv is the lower heating value of the sync gas on a dry basis (kj/nm3) and bhv is the lower heating value of the biomass (kj/kg). 2.2. conversion of wood biomass to energy the prototype unit in this work operates well on wood chips (minimum size: 0.019 × 0.019 × 0.008 m) and blocks (up to 0.051 m3). the restriction in the size of the wood chips is to prevent a bridging of the wood chips. however, larger wood chip sizes could be used, if the fire tube diameter is increased. the choice of wood chips as a biomass resource stems from the fact it has several advantages. first, wood is suitable for energy generation. in addition, the ash content is usually significantly low, ranging between 0.5 to 2 % by weight when properly harnessed; although this depends on the nature of the wood [34]. furthermore, wood is sulphur free, hence, the use will contribute less to the environmental pollution with a lesser tendency to cause corrosion damage to the engine. in addition, wood is readily available, cost effective and its conversion process to energy is relatively simple. however, one of the major disadvantages for wood as a biomass resource is its moisture content. the moisture content and other volatile matters first have to be reduced significantly in the process of drying before being used for the energy generation. the framework for the wood particle combustion during the conversion of wood biomass to energy is shown in figure 1. the process of converting the biomass, in this case, the wood particles to generate energy (producer gas) is known as gasification. as depicted in figure 1, there are about four major processes involved, namely: drying, pyrolysis, combustion and gasification processes [22, 35]. the process of converting a solid biomass fuel into energy is divided into these phases; (1.) first is the feeding of the wood chips into the drying chamber through the hopper, followed by the drying process. (2.) the process of drying is aimed at removing the moisture content in the biomass fuel up to 10 % at a temperature of 150 °c. 604 vol. 61 no. 5/2021 process optimization and performance evaluation . . . notation independent variables levels -1 0 1 a temperature (°c) 800 1000 1200 b particle size (mm) 2 6 10 c residence time (min) 10 35 60 table 1. summary of the numerical experiments. (3.) the pyrolysis process (devolatilization) usually occurs between 200–300 °c. this is aimed at removing other volatile matters from the already dried biomass to produce the mixture of ash and char. the pyrolysis process is a function of the biomass properties and determines the composition of the char, which subsequently undergoes the combustion reactions. 
the combustion process occurs when the volatile products and part of the produced char react in the presence of oxygen to form primarily oxides of carbon (co and co2). this reaction is oxidative and produces heat (it is exothermic in nature), which is subsequently used for firing the dry mass and the mixture of char and ash during the conversion process. the gasification process occurs when the produced char reacts with steam to produce hydrogen and co. a further reaction of the co with the steam promotes the forward reaction in a gasifier, producing co2 and hydrogen. in addition, the presence of a limited amount of oxygen in the reaction chamber will promote a combustion reaction of some organic material to produce co2 and energy. methane and excess carbon dioxide are produced when co and residual water react in the presence of a catalyst [36] according to equation 20.

4 co + 2 h2o → ch4 + 3 co2 (20)

the ash, containing a small quantity of unreacted carbon, is collected in the gasifier grate. from the design calculations, 2 kg of biomass was fed through the hopper into the downdraft gasifier, which has a nominal thermal capacity of 81.6 kw, along with 0.5 kg of charcoal to activate the process of ignition. the system is periodically refilled before it is completely empty, and occasionally, the ashes are shaken down from the grate. it is necessary to cover the hopper when the downdraft gasifier unit is shut down, to prevent a combustion of the wood due to the entrance of air into the hopper. the downdraft gasifier unit is shut down by turning off the ignition switch, followed by opening the carburettor's air control valve for a few seconds to relieve any pressure from the system. then, the air control valve is completely closed, with the fuel hopper covered tightly. the producer gas passes through a two-stage filtering process for cleaning before it is used to drive the generator which produces the electricity. the generator is a spark-ignited, turbocharged, vee-configuration system which operates at 1500 rpm. the generator is a directly coupled system capable of delivering a 3-phase voltage of 200 v at a frequency of 50 hz. as the engine burns the wood gas, the energy generated in the process is converted into kinetic energy. the generator then converts the kinetic energy, due to the rotation, into electricity.

2.3. optimization of the conversion process

the optimization of the process parameters for the combustion process (char oxidation) was carried out using the response surface methodology (rsm), with the process parameters selected in the following ranges: temperature (800–1200 °c), particle size (2–10 mm), residence time (10–60 min). the ranges of the process parameters were inspired by a similar work carried out by li et al. [22], who reported on the prediction of the high-temperature rapid combustion behaviour of wood biomass particles using a modelling and simulation approach. however, the optimization of the process conditions was not reported there; this is one of the focal points of this study, in order to establish an optimum range of process parameters for the conversion of wood biomass to energy. the choice of the rsm stems from the fact that it is suitable for the investigation of the interactive cross-effects of the process parameters as they affect the measured response (the yield of the gas). it is also suitable for obtaining a predictive mathematical model for determining the yield of the gas as a function of the independent process parameters [38].
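as a rough, non-authoritative illustration of what an rsm fit of this kind involves (the study itself uses the design expert software, described next), the sketch below fits a full quadratic response surface to coded factor settings by ordinary least squares with numpy. the function names and the coding helper are ours, not part of the paper.

```python
import numpy as np

def to_coded(value, center, half_range):
    """map an actual setting to the coded -1..+1 scale of table 1,
    e.g. to_coded(1200, 1000, 200) -> +1 for the temperature."""
    return (value - center) / half_range

def fit_quadratic_surface(x, y):
    """least-squares fit of the full quadratic model
    y = b0 + b1*a + b2*b + b3*c + b4*a*b + b5*a*c + b6*b*c
        + b7*a**2 + b8*b**2 + b9*c**2
    for coded factors x (shape (n, 3)) and responses y (shape (n,))."""
    a, b, c = x[:, 0], x[:, 1], x[:, 2]
    design = np.column_stack([np.ones_like(a), a, b, c,
                              a * b, a * c, b * c,
                              a ** 2, b ** 2, c ** 2])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef
# example: coded settings for trial 4 of table 4 would be
# (to_coded(1000, 1000, 200), to_coded(6, 6, 4), to_coded(35, 35, 25)) -> (0, 0, 0)
```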
the design expert software (version 8) containing the rsm was used for the design of the experiment and the process optimization. the rsm gave feasible combinations of the process parameters, establishing 20 experimental runs, whose response (the yield of the gas) was determined through physical experiments. the summary of the experimental design, involving three factors varied over three levels, namely the high level (+1), the centre point (0) and the low level (−1), is presented in table 1. the validation of the numerical experiments was done via physical experiments as well as the analysis of variance (anova). the indicators for determining the validity of the numerical experiments include: the "p-value prob > f" (which should be less than 0.050), the "lack of fit" (which should be statistically insignificant compared to the pure error) as well as the correlation coefficients, namely the predicted r-squared, the r-squared and the adjusted r-squared (which should be close to 1) for a statistically significant model [39].

figure 2. the developed downdraft gasifier [37].

parameter | specification
diameter of fire tube (m) | 0.152
length of fire tube (m) | 0.483
volume of hopper (m3) | 0.15
height of nozzle (m) | 0.107
diameter of nozzle (m) | 0.0104
thermal consumption (kw) | 116.57
biomass consumption (kg/h) | 24.84
pressure (mpa) | 0.14
table 2. the design specifications of the developed downdraft gasifier.

2.4. analysis of the composition of gas produced

a multi-component gas analyser (ir400) with five components, featuring non-dispersive infrared (ndir), ultraviolet and vis photometer, paramagnetic and electrochemical o2, and thermal conductivity sensors, was employed for measuring the gas components. the analyser, coupled with the non-dispersive infrared sensor and probe, was used for detecting the concentration of the oxides of carbon as well as of methane (hydrocarbon) in the gas sample in parts per million (ppm). the analyser has two separate analysis chambers (central interface), which is common for analyser modules. following the automatic calibration of the analyser via a sample gas probe, the sample gas was fed into the analyser module at room temperature through an inlet valve at a flow rate of 0.0000166 m3/s. the analyser module is a blind analysis unit, which measures the gas concentration, with the results displayed with the aid of the liquid crystal display of the analyser. the results obtained in ppm were divided by 10,000 to obtain the concentration of the gas in volume percentage.

3. results and discussion

this section comprises four sub-sections (3.1–3.4), which present the results obtained from the physical experiments and the numerical experiments, as well as the results obtained from the analysis of the concentration of the gas produced.

3.1. the results obtained from the physical experiments

the developed downdraft gasifier is shown in figure 2. table 2 presents the design specifications of the developed downdraft gasifier. the results obtained from the proximate and ultimate analyses of the wood biomass used for the production of the gas, and the percentage composition of the constituents of the produced gas, are presented in table 3.
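as a plausibility check only (not a procedure used in the paper), the ultimate-analysis values reported in table 3 below can be fed into dulong's classical correlation to estimate the higher heating value of the dry wood. the correlation is quoted here in a commonly used form, and the sulphur term is set to zero because the wood is reported to be sulphur free.

```python
def dulong_hhv(c, h, o, s=0.0):
    """higher heating value in MJ/kg (dulong's formula; c, h, o, s in wt%)."""
    return 0.3383 * c + 1.443 * (h - o / 8.0) + 0.0942 * s

hhv = dulong_hhv(c=52.38, h=6.23, o=21.43)   # ultimate analysis of table 3 (dry basis)
print(f"estimated hhv ~ {hhv:.1f} MJ/kg (dry basis)")
# ~22.8 MJ/kg dry: broadly consistent in magnitude with the 17000 kJ/kg
# (17 MJ/kg) lower heating value quoted in the text at 14 % moisture.
```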
generally, the producer gas obtained from biomass gasifiers contains some amount of tar. the higher the amount of tar present in the producer gas, the lower its suitability for fuel cells, engines and turbines. the tar content in this experiment was measured and found to be within the permissible range, which further confirms that the gasification technology was successfully implemented; the obtained results have been reported in daniyan et al. [37]. the proximate analysis of the wood biomass in terms of the moisture content, volatile matter, ash and fixed carbon, as presented in table 3, is necessary for the determination of the heating value of the biomass. similarly, the ultimate analysis evaluates the chemical composition of the biomass, and this is useful in the process design for an optimum rate of conversion of the biomass to energy.

type of analysis | constituent of biomass/gas | % composition (wet basis)
proximate analysis | moisture | 10.03
 | volatile matter | 48.85
 | ash | 6.23
 | fixed carbon | 38.15

type of analysis | constituent of biomass/gas | % composition (dry basis)
ultimate analysis | carbon | 52.38
 | hydrogen | 6.23
 | oxygen | 21.43
 | nitrogen | 0.60
 | others | 0.40
table 3. the analysis of the wood biomass.

figure 3. the plots of the actual and predicted yields of the gas produced.

3.2. the results obtained from numerical experiments

table 4 presents the feasible combinations of the process parameters as determined by the rsm, as well as the corresponding gas yields determined via the physical experiments. figure 3 then presents the plots of the actual and predicted yields of the produced gas. the essence of the numerical experiments is to search for the optimum process condition that will ensure process efficiency and an increase in the yield of the producer gas. from table 4, the combination of the process parameters that leads to the highest yield of gas (2.55 nm3/kg) can be seen: temperature (1000 °c), particle size (6.00 mm) and residence time (35 min). from figure 3, it is obvious that there is a sound agreement between the yields of gas obtained from the numerical and the physical experiments, as indicated by the similarity of the patterns of the data points. this implies that the developed model is fit for a predictive purpose. the closeness of the values of the actual yields of gas obtained from the physical experiments and the predicted yields of gas from the numerical experiments means that the model is efficient and reliable. under similar process conditions, it could be used for the prediction of the yield of gas without the need for a physical experiment, which could be expensive and time consuming. hence, this will assist the process designers in the determination of the throughput of the gasifier, given a certain input and process conditions. table 5 presents the statistical analysis of the developed quadratic model for predicting the yield of gas as a function of the independent process parameters, namely temperature, particle size and residence time, while table 6 presents the analysis of variance (anova) of the developed model.

trial | factor a: temperature (°c) | factor b: particle size (mm) | factor c: residence time (min) | actual gas yield (nm3/kg) | predicted gas yield (nm3/kg)
1 | 900 | 6.00 | 35.00 | 2.00 | 1.9771
2 | 800 | 10.00 | 60.00 | 1.80 | 1.7812
3 | 800 | 2.00 | 10.00 | 1.20 | 1.2208
4 | 1000 | 6.00 | 35.00 | 2.55 | 2.4804
5 | 1200 | 2.00 | 10.00 | 1.80 | 1.8203
6 | 1000 | 6.00 | 35.00 | 2.55 | 2.5540
7 | 800 | 2.00 | 60.00 | 2.00 | 2.0347
8 | 1000 | 8.00 | 35.00 | 2.43 | 2.4690
9 | 1200 | 10.00 | 10.00 | 1.90 | 1.8560
10 | 663.64 | 6.00 | 35.00 | 2.00 | 1.9899
11 | 800 | 10.00 | 10.00 | 1.52 | 1.5589
12 | 1200 | 10.00 | 60.00 | 1.99 | 2.0563
13 | 1000 | 6.00 | 35.00 | 2.45 | 2.5657
14 | 1200 | 2.00 | 60.00 | 2.22 | 2.3223
15 | 1000 | 6.00 | 35.00 | 2.38 | 2.3789
16 | 1000 | 1.00 | 35.00 | 1.77 | 1.7897
17 | 1000 | 6.00 | 35.00 | 1.89 | 1.9043
18 | 1000 | 6.00 | 55.00 | 2.50 | 2.5543
19 | 1000 | 6.00 | 35.00 | 2.55 | 2.6097
20 | 1000 | 6.00 | 30.00 | 2.47 | 2.5310
table 4. the results obtained from the numerical and physical experiments.

the model "f-value" of 5.68 implies that the model is statistically significant; there is only a 0.60 % chance that an "f-value" this large could occur due to noise. in addition, the "p-value prob > f" of the developed model was 0.0060. the fact that the "p-value prob > f" was less than 0.050 indicates that the model is statistically significant. the significant model terms, which can greatly influence the yield of the gas, are a (temperature), c (residence time) and b² (the square of the particle size). the insignificant "lack of fit" f-value of 0.38 implies that the lack of fit is not statistically significant relative to the pure error; there is an 84.32 % chance that a "lack of fit f-value" this large could occur due to noise. the insignificant "lack of fit" implies that the model is good for a predictive purpose. the values of the adjusted r-squared (0.8892) and the predicted r-squared (0.8840) are in a reasonable agreement with the r-squared (0.8364), and all were close to 1, thus indicating that the model is suitable for correlative and predictive purposes. the results obtained from both the numerical and physical experiments were statistically analysed using the rsm to obtain a predictive model, which correlates the dependent variable (the yield of the gas) with the independent process parameters, namely temperature, particle size and residence time (equation 21).

yield of gas = 2.40 + 0.18 a + 6.984e-3 b + 0.20 c − 0.031 ab + 0.071 ac − 0.11 bc − 0.043 a² − 0.39 b² − 0.15 c² (21)

where a is the temperature (°c), b is the particle size (mm) and c is the residence time (min). figure 4 is the normal plot of residuals for the developed model for the yield of the produced gas. the normal plot of the residuals depicts the degree of the normal distribution of the data set [38]. the closeness of the data set to the diagonal (average) line is an indication of the linearity of the residuals. this further indicates that the data set is approximately linear and normally distributed by approximation, although with an inherent randomness left over within the error portion, as shown in figure 4. the variation of the data points from the diagonal line was marginal, within the permissible range of ±10 % in relation to the average line, without any outlier. this further indicates that the developed model is efficient and suitable for predictive and correlative purposes. this plot shows that there exists a close relationship between the actual and the predicted values of the yield of the gas, as indicated by the closeness of the data points to the diagonal line. figures 5 and 6 show the contour plot and the 3d plot of the effect of the temperature and particle size on the yield of gas, respectively. the operating temperature and particle size heavily influence the yield of the gas during the conversion process. this is because the char as well as the yield of tar and the subsequent gas product are a function of the combustion temperature.
statistical parameter | sum of squares | df | mean square | f value | p-value prob > f | remarks
model | 2.23 | 8 | 0.26 | 5.68 | 0.0060 | significant
a (temperature) | 0.28 | 1 | 0.28 | 6.09 | 0.0332 |
b (particle size) | 4.212e-004 | 1 | 4.212e-004 | 9.230e-03 | 0.9254 |
c (residence time) | 0.34 | 1 | 0.34 | 7.45 | 0.0212 |
ab | 7.813e-003 | 1 | 7.813e-003 | 0.17 | 0.6878 |
ac | 0.041 | 1 | 0.041 | 0.89 | 0.3677 |
bc | 0.090 | 1 | 0.090 | 1.98 | 0.1898 |
a² | 0.011 | 1 | 0.01 | 0.24 | 0.6321 |
b² | 0.36 | 1 | 0.36 | 7.95 | 0.0182 |
c² | 0.038 | 1 | 0.038 | 0.83 | 0.3827 |
residual | 0.032 | 10 | 0.032 | | |
lack of fit | 0.13 | 5 | 0.025 | 0.38 | 0.8432 | not significant
pure error | 0.33 | 5 | 0.066 | | |
corr. total | 2.79 | 19 | | | |
table 5. the statistical analysis of the developed model.

parameter | value | remarks
r-squared | 0.8364 | significant
adjusted r-squared | 0.8892 | significant
predicted r-squared | 0.8840 | significant
adequate precision | 8.2840 | significant
table 6. the analysis of variance (anova) for the developed model.

figure 4. the normal plot of residuals.
figure 5. the contour plot of the effect of the temperature and particle size on the yield of gas.
figure 6. the 3d plot of the effect of the temperature and particle size on the yield of gas.

the rate of combustion of the char is also a function of its size and the operating temperature. from the plots, an increase in the temperature towards the optimum value favours the forward reaction, resulting in a decrease in the tar and char content, and vice versa. this is due to the fact that the tar and char produced are converted to hydrogen gas, oxides of carbon and other light hydrocarbons via the process of thermal cracking as the operating temperature increases, thereby leading to the production of more gas. from figures 5 and 6, as the temperature increases, a further reduction in the amount of tar can be observed, but with an increase in the content of hydrogen gas and a reduction in the amount of the oxides of carbon. this increases the amount of the producer gas, thereby making the gasifier more environmentally friendly. the range of the temperature distribution during the conversion process was also observed to vary from 800–1200 °c. the initial low temperature can be traced to the presence of some volatile matter in the biomass. however, a progressive increase in the magnitude of the temperature was observed as the conversion process proceeded with time. the gas yield was also observed to increase from 1.86908 nm3/kg to 2.40324 nm3/kg as the temperature increased from 800 to 1200 °c at a residence time of 35 min. this finding strongly agrees with the findings of many researchers as reported in the literature [22, 34, 40–42]. li et al. [22] explain the significance of the operating temperature during the conversion of biomass to fuel. in agreement with the findings of this study, the authors reported that an increase in the temperature enhances the biomass combustion, but with an increase in the time required to burn out the char completely.

figure 7. the contour plot of the effect of the temperature and residence time on the yield of gas.
figure 8. the 3d plot of the effect of the temperature and residence time on the yield of gas.

hence, the higher the temperature, the higher the rate of biomass combustion, but with an increase in the residence time. however, an increase in the magnitude of the temperature also increases the energy requirement of the process, which can make the process less sustainable in terms of energy consumption and environmental sustainability.
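the fitted model of equation 21 is easy to evaluate directly. the sketch below assumes, as is usual for rsm software output but not stated explicitly in the text, that a, b and c are the coded factors of table 1 (−1 to +1 over 800–1200 °c, 2–10 mm and 10–60 min). under this assumption the model reproduces the centre-point value of 2.40 nm3/kg and comes close to the optimum of 2.50694 nm3/kg reported in section 3.3.

```python
def coded(value, low, high):
    """map an actual setting onto the coded -1..+1 scale."""
    center, half = (low + high) / 2.0, (high - low) / 2.0
    return (value - center) / half

def predicted_yield(temp_c, size_mm, time_min):
    """equation 21: predicted gas yield (nm3/kg), coded-unit assumption."""
    a = coded(temp_c, 800.0, 1200.0)
    b = coded(size_mm, 2.0, 10.0)
    c = coded(time_min, 10.0, 60.0)
    return (2.40 + 0.18 * a + 6.984e-3 * b + 0.20 * c
            - 0.031 * a * b + 0.071 * a * c - 0.11 * b * c
            - 0.043 * a ** 2 - 0.39 * b ** 2 - 0.15 * c ** 2)

print(predicted_yield(1000, 6.0, 35.0))      # centre point: 2.40 nm3/kg
print(predicted_yield(1082.40, 6.16, 40.87)) # ~2.51, close to the reported 2.50694 nm3/kg
```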
in addition, an increase in the magnitude of the temperature beyond the optimum promotes the possibility of a slag formation, thereby making the conversion process less sustainable. as the temperature increases, the particle size decreases, and vice versa. this is in line with the findings of lu et al. [7], who established that the particle size significantly affects the dynamics of the biomass particle, including the drying, heating and reaction rates. the authors reported an inverse relationship between the particle size and the drying, heating as well as reaction rates, which are a function of the operating temperature. figures 7 and 8 show the contour plot and the interactive 3d plot of the effect of the temperature and residence time on the gas yield, respectively. from figures 7 and 8, an increase in the magnitude of the time and temperature was observed to promote the forward reaction, thereby resulting in a high yield of the producer gas. this also agrees with the finding of li et al. [22] that a direct relationship exists between the temperature and the residence time during the conversion of biomass to fuel. the yield of the producer gas ranges from a minimum value of 1.88269 nm3/kg to a maximum value of 2.4277 nm3/kg as the residence time increases from 10 min to 60 min and the combustion temperature from 800 °c to 1200 °c.

figure 9. the contour plot of the effect of the particle size and residence time on the yield of gas.
figure 10. the 3d plot of the effect of the particle size and residence time on the yield of gas.

the relationship among the tar content, the time and the temperature was observed to be inversely proportional; hence, an increase in the magnitude of the residence time and combustion temperature results in a decreasing content of the tar, and vice versa. this confirms the fact that a sufficient time is needed to drive the conversion process to the required completion. a lower residence time may promote the incomplete combustion of the biomass constituents, thereby resulting in lower yields of the producer gas, and vice versa. the obtained results indicate that an optimum residence time of 35 min was sufficient for the conversion process. a further increase beyond this time may lead to a decrease in the tar content but a low rate of conversion of biomass to gas at a high energy input, thereby making the process less sustainable. figures 9 and 10 show the contour plot and the interactive 3d plot of the effect of the particle size and residence time on the gas yield, respectively. a direct relationship was observed between the yield of the gas and the residence time. at first, an increase in the particle size was observed to increase the yield of the gas, up to the size of 6 mm. a further increase in the size of the particle beyond this value promotes the backward reaction, resulting in a reduction in the yield of the gas produced. this relates to the fact that as the particle size increases beyond the optimum, the rate of combustion will decrease due to a lesser interaction of the biomass constituents with other elements, thereby slowing down the conversion rate, and vice versa. the smaller the particle size, the larger the area-to-volume ratio of the wood particle, and the higher the rate of reaction and the production rate of the producer gas, and vice versa [22, 43, 44].
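before turning to the desirability-based optimization of the next subsection, note that the fitted model itself can be searched by brute force; the sketch below scans the experimental region on a grid, reusing predicted_yield from the previous sketch. this is only an illustration: the unconstrained maximum of the quadratic model (which this scan finds, towards 1200 °c) need not coincide with the desirability-based optimum reported below, which balances additional goals.

```python
import numpy as np

# grid over the actual factor ranges of table 1
temps = np.linspace(800.0, 1200.0, 41)
sizes = np.linspace(2.0, 10.0, 41)
times = np.linspace(10.0, 60.0, 41)

# pick the grid point with the largest model prediction
best = max((predicted_yield(t, s, m), t, s, m)
           for t in temps for s in sizes for m in times)
y, t, s, m = best
print(f"model maximum over the scanned region: "
      f"{y:.3f} nm3/kg at {t:.0f} degC, {s:.1f} mm, {m:.1f} min")
```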
figure 11. the desirability plot for the optimization of the process parameters.
figure 12. comparison of the % volume of the gas concentration.

3.3. numerical optimization of the process parameters

the numerical optimization of the process parameters was carried out using the design expert software. with the goal of maximizing the yield of the producer gas, the optimization produced 30 feasible solutions whose desirability equals 1, as shown in figure 11. the closer the desirability value is to 1, the more desirable the responses obtained are. the fact that the desirability value equals 1 implies that the optimum values of the yield and process parameters obtained are highly desirable. the optimum process parameters that produced the highest rate of conversion of biomass to energy (2.50694 nm3/kg) during the numerical optimization were: temperature (1082.40 °c), particle size (6.16 mm) and residence time (40.87 min). comparing these optimal values to the optimal values obtained via the physical experiments, namely energy yield (2.55 nm3/kg), temperature (1000 °c), particle size (6.0 mm) and residence time (35 min), it is obvious that the ranges of the process parameters for the numerical optimization and the physical experiments were close. this further confirms the fact that the developed model is highly efficient for correlative and predictive purposes.

3.4. results obtained from the analysis of the concentration of the gas produced

table 7 and figure 12 show the volume percentage of the concentration of the gas produced. the combustible gas generated with the present experimental setup contains only methane, carbon monoxide and carbon dioxide as the major constituents; hence, only they are considered in the analysis. figure 12 shows that there is a need for a further removal of tar from the produced gas. the amount of carbon monoxide present in the gas was the highest (19.20 % by volume), which is significant enough to cause health risks and environmental pollution. carbon monoxide is a poisonous gas, which can cause serious health problems when inhaled [23]. in addition, the amount of carbon dioxide was also significant (22.68 % by volume). carbon dioxide is a greenhouse gas capable of causing global warming, thereby making the use of the producer gas less sustainable and less environmentally friendly. although the composition of the methane gas obtained was satisfactory (10.04 % by volume), an even higher yield can be obtained via an adequate process design and an effective reduction of the tar content of the intermediate product during the process of conversion.

constituent | concentration % (vol.)
co | 19.20
co2 | 22.68
ch4 | 10.04
table 7. concentration of the produced gas.

4. conclusion

the process optimization and performance evaluation of a downdraft gasifier suitable for energy generation using wood biomass was carried out in this study, in the effort to develop renewable and alternative sources of a clean and safe energy. the obtained results indicated that the downdraft gasifier can convert wood chips to producer gas for the generation of energy for light to medium duty. the analysis of the gas concentration indicated a satisfactory generation of the methane gas, but with a significant amount of co and co2. this implies the need for an effective process of tar removal in order to ensure that the production of gas is safe and environmentally sustainable.
using the rsm, the optimum process parameters that produced the highest rate of conversion of biomass to energy (2.55 nm3/kg) were found to be: temperature (1000 °c), particle size (6.0 mm) and residence time (35 min). in addition, the developed and statistically validated mathematical model showed a capacity for predicting the yield of the producer gas. the produced gas contained a satisfactory amount of methane gas (10.04 % vol.), but with a significant amount of co (19.20 % vol.) and co2 (22.68 % vol.). from the numerical results obtained, the gas yield was observed to increase from 1.86908 nm3/kg to 2.40324 nm3/kg as the temperature increased from 800 to 1200 °c. the results obtained indicate the feasibility of the production of combustible gas from the developed system using wood chips. the combustible gas generated with the present experimental setup contains only methane, carbon monoxide and carbon dioxide as the major constituents. it is, therefore, recommended that the governments of the many countries that still largely rely on the traditional energy generation via biomass should embrace this innovation and put the appropriate policies in place to encourage its use. future research can consider a deeper performance evaluation of the downdraft gasifier.

references

[1] h. lu, w. robert, g. peirce, et al. comprehensive study of biomass particle combustion. energy fuels 22(4):2826–2839, 2008. https://doi.org/10.1021/ef800006z.
[2] i. a. daniyan, a. k. ahwin, a. a. aderoba, o. l. daniyan. development of a smart digester for the production of biogas. petroleum and coal 60(5):804–821, 2018.
[3] i. a. daniyan, k. a. ahwin, a. a. aderoba, et al. development and optimization of a smart digester for the production of biogas. in proceedings of the international conference on industrial engineering and operations management, pretoria/johannesburg, south africa, october 29 – november 1, pp. 1456–1459. 2018.
[4] m. h. sahir, a. h. qureshi. specific concerns of pakistan in the context of energy security issues and geopolitics of the region. energy policy 35(4):2031–2037, 2007. https://doi.org/10.1016/j.enpol.2006.08.010.
[5] p. s. c. rao, j. b. miller, y. d. wang, j. b. byrne. energy-microfinance intervention for below poverty line households in india. energy policy 37(5):1694–1712, 2009. https://doi.org/10.1016/j.enpol.2008.12.039.
[6] s. gorjian. an introduction to renewable energy technologies. [2021-03-04], https://doi.org/10.13140/rg.2.2.27055.53928.
[7] h. lu, e. ip, j. scott, et al. effects of particle shape and size on devolatilization of biomass particle. fuel 89(5):1156–1168, 2010. https://doi.org/10.1016/j.fuel.2008.10.023.
[8] g. tucho, s. nonhebel. bio-wastes as an alternative household cooking energy source in ethiopia. energies 8(9):9565–9583, 2015. https://doi.org/10.3390/en8099565.
[9] v. tumwesige, g. okello, s. semple, j. smith. impact of partial fuel switch on household air pollutants in sub-sahara africa. environmental pollution 231(1):1021–1029, 2017. https://doi.org/10.1016/j.envpol.2017.08.118.
[10] i. ozturk, f. bilgili. economic growth and biomass consumption nexus: dynamic panel analysis for sub-sahara african countries. applied energy 137:110–116, 2015. https://doi.org/10.1016/j.apenergy.2014.10.017.
[11] k. mortimer, c. b. ndamala, a. w. naunje, et al.
a cleaner burning biomass-fuelled cook stove intervention to prevent pneumonia in children under 5 years old in rural malawi (the cooking and pneumonia study): a cluster randomised controlled trial. lancet 389(10065):167–175, 2017. https://doi.org/10.1016/s0140-6736(16)32507-7.
[12] a. k. chatterjee. state of the art report on pyrolysis of wood and agricultural biomass. u.s. department of agriculture, newark, nj, 2014. pn aak-818, https://www.osti.gov/biblio/6043806.
[13] j. saastamoinen, m. aho, a. moilanen, et al. burnout of pulverized biomass particles in large scale boiler – single particle model approach. biomass bioenergy 34(5):728–736, 2010. https://doi.org/10.1016/j.biombioe.2010.01.015.
[14] j. j. hernández, g. aranda, j. barba, j. m. mendoza. effect of steam content in the air–steam flow on biomass entrained flow gasification. fuel processing technology 99:43–55, 2012. https://doi.org/10.1016/j.fuproc.2012.01.030.
[15] t. song, j. wu, l. shen, j. xiao. experimental investigation on hydrogen production from biomass gasification in interconnected fluidized beds. biomass bioenergy 36:258–267, 2012. https://doi.org/10.1016/j.biombioe.2011.10.021.
[16] p. mondal, g. s. dang, m. o. garg. syngas production through gasification and clean-up for downstream applications – recent developments. fuel processing technology 92(8):1395–1410, 2011. https://doi.org/10.1016/j.fuproc.2011.03.021.
[17] e. e. donath. vehicle gas producers. fuel processing technology 3(2):141–153, 1980. https://doi.org/10.1016/0378-3820(80)90017-x.
[18] s. chakrabarty, f. m. boksh, a. chakraborty. economic viability of biogas and green self-employment opportunities. renewable and sustainable energy reviews 39:757–766, 2013. https://doi.org/10.1016/j.rser.2013.08.002.
[19] m. k. sidhu, k. ravindra, s. mor, s. john. household air pollution from various types of rural kitchens and its exposure assessment. science of the total environment 586:419–429, 2017. https://doi.org/10.1016/j.scitotenv.2017.01.051.
[20] t. r. reed, d. jantzen. generator gas: the swedish experience from 1939–1945 (a translation of the swedish book, gengas). solar energy research institute, golden, co, 2012. seri/sp-33-140.
[21] s. c. saxena, c. k. jotshi. fluidized-bed incineration of waste materials. progress in energy and combustion science 20(4):281–324, 1994. https://doi.org/10.1016/0360-1285(94)90012-4.
[22] j. li, m. c. paul, p. l. younger, et al. prediction of high-temperature rapid combustion behaviour of woody biomass particles. fuel 165:205–214, 2016. https://doi.org/10.1016/j.fuel.2015.10.061.
[23] p. mckendry. energy production from biomass (part 1): overview of biomass. bioresource technology 83(1):37–46, 2002. https://doi.org/10.1016/s0960-8524(01)00118-3.
[24] a. kumaraswamy, b. d. prasad. performance analysis of a dual fuel engine using lpg and diesel with egr system. procedia engineering 38:2784–2792, 2012. https://doi.org/10.1016/j.proeng.2012.06.326.
[25] t. b. reed. a survey of biomass gasification, vol.
i: synopsis and executive summary. solar energy research institute, golden, co, 2011. seri/tr-33-239 (vol. i).
[26] s. chinguwa, w. r. nyemba, t. c. jen. development and fabrication of a wood gasifier to power an internal combustion engine. in proceedings of the international conference on industrial engineering and operations management, pretoria/johannesburg, south africa, october 29 – november 1, pp. 537–547. 2018.
[27] n. a. ingle, s. s. lakade. design and development of downdraft gasifier to generate producer gas. energy procedia 90:423–431, 2016. https://doi.org/10.1016/j.egypro.2016.11.209.
[28] m. a. masmoudi, m. sahraoui, n. grioui, k. halouani. 2-d modeling of thermo-kinetics coupled with heat and mass transfer in the reduction zone of a fixed bed downdraft biomass gasifier. renewable energy 66:288–298, 2014. https://doi.org/10.1016/j.renene.2013.12.016.
[29] n. striugas, k. zakarauskas, a. dziugys, et al. an evaluation of performance of automatically operated multi-fuel downdraft gasifier for energy production. applied thermal engineering 73(1):1151–1159, 2014. https://doi.org/10.1016/j.applthermaleng.2014.09.007.
[30] j. reeb, m. milota. moisture content by the oven-dry method for industrial testing. oregon state university, corvallis, or, pp. 1–9.
[31] manual for design calculation of downdraught gasifier. [2021-03-04], https://www.fao.org/3/t0512e/t0512e1a.htm.
[32] f. pinto, c. franco, r. n. andré, et al. co-gasification study of biomass mixed with plastic wastes. fuel 81(3):291–297, 2002. https://doi.org/10.1016/s0016-2361(01)00164-8.
[33] c. franco, f. pinto, i. gulyurtlu, i. cabrita. the study of reactions influencing the biomass steam gasification process. fuel 82(7):835–842, 2003. https://doi.org/10.1016/s0016-2361(02)00313-7.
[34] l. e. taba, m. f. irfan, w. a. m. wan daud, m. h. chakrabarti. the effect of temperature on various parameters in coal, biomass and co-gasification: a review. renewable and sustainable energy reviews 16(8):5584–5596, 2012. https://doi.org/10.1016/j.rser.2012.06.015.
[35] g. sharma, m. v. s. krishna. performance evaluation of high pressure down draft biomass gasifier for big/gt applications. international journal of engineering research 5(2):499–505, 2016. https://doi.org/10.17950/ijer/v5i2/044.
[36] a. k. chatterjee. state of the art report on pyrolysis of wood and agricultural biomass. u.s. department of agriculture, newark, nj, 2014. pn aak-818.
[37] i. a. daniyan, k. mpofu, a. o. adeodu, o. momoh. development and automation of a 12 kw-capacity gasifier for energy generation. in 2020 southern african universities power engineering conference/robotics and mechatronics/pattern recognition association of south africa (saupec/robmech/prasa), pp. 68–73. 2020. https://doi.org/10.1109/saupec/robmech/prasa48453.2020.9040963.
[38] i. a. daniyan, i. tlhabadira, k. mpofu, a. o. adeodu. development of numerical models for the prediction of temperature and surface roughness during the machining operation of titanium alloy (ti6al4v). acta polytechnica 60(5):369–390, 2020. https://doi.org/10.14311/ap.2020.60.0369.
[39] i. a. daniyan, f. fameso, f. ale, et al. modelling, simulation and experimental validation of the milling operation of titanium alloy (ti6al4v). the international journal of advanced manufacturing technology 109(7):1853–1866, 2020. https://doi.org/10.1007/s00170-020-05714-y.
[40] t. y. ahmed, m. m. ahmad, s. yusup, et al. mathematical and computational approaches for design of biomass gasification for hydrogen production: a review. renewable and sustainable energy reviews 16(4):2304–2315, 2012. https://doi.org/10.1016/j.rser.2012.01.035.
[41] d. a. bulushev, j. r. h. ross. catalysis for conversion of biomass to fuels via pyrolysis and gasification: a review. catalysis today 171(1):1–13, 2011. https://doi.org/10.1016/j.cattod.2011.02.005.
[42] j. p. stratford, t. r. hutchings, f. a. a. m. de leij. intrinsic activation: the relationship between biomass inorganic content and porosity formation during pyrolysis. bioresource technology 159(5):104–111, 2014. https://doi.org/10.1016/j.biortech.2014.02.064.
[43] c. zhang. numerical modeling of coal gasification in an entrained-flow gasifier. in asme 2012 international mechanical engineering congress and exposition, vol. 6, pp. 1193–1203. 2012. https://doi.org/10.1115/imece2012-88481.
[44] j. j. hernández, g. aranda-almansa, a. bula. gasification of biomass wastes in an entrained flow gasifier: effect of the particle size and the residence time. fuel processing technology 91(6):681–692, 2010. https://doi.org/10.1016/j.fuproc.2010.01.018.

acta polytechnica 61(5):601–616, 2021

mixture based outlier filtration

p. pecherková, i. nagy

abstract: success/failure of adaptive control algorithms – especially those designed using the linear quadratic gaussian criterion – depends on the quality of the process data used for model identification. one of the most harmful types of process data corruptions are outliers, i.e. 'wrong data' lying far away from the range of real data. the presence of outliers in the data negatively affects an estimation of the dynamics of the system. this effect is magnified when the outliers are grouped into blocks. in this paper, we propose an algorithm for outlier detection and removal. it is based on modelling the corrupted data by a two-component probabilistic mixture. the first component of the mixture models uncorrupted process data, while the second models outliers. when the outlier component is detected to be active, a prediction from the uncorrupted data component is computed and used as a reconstruction of the observed data. the resulting reconstruction filter is compared to standard methods on simulated and real data. the filter exhibits excellent properties, especially in the case of blocks of outliers.

keywords: data filtration, system modelling, mixture models, bayesian estimation, prediction.

1 introduction

adaptive control systems typically work in a feedback regime; hence, the quality of the control depends heavily on the quality of the measurements of the process data. however, the measured data are often corrupted by various disturbances caused by uncertain elements of the process, such as measurement noise, malfunctions of measuring devices, etc.
these disturbances can negatively influence the performance of the resulting automatic system. therefore, the task of data pre-processing (filtration) is of great importance in adaptive control, e.g. [1] or [2]. one of the most dangerous corruptions of the measured data is due to outliers. an outlier is an incorrect measurement of the process, which is significantly different from the real process data. two types of outliers are distinguished: i) isolated outliers, which are caused by an isolated failure of the measurement; ii) blocks of outliers, which are caused by a temporary breakdown of a measuring device. the former type is relatively easy to detect. however, it is more challenging to detect the latter, since a block of outliers may have some characteristics of uncorrupted data. in this paper, we model corrupted data by a probabilistic mixture of dynamic (i.e. autoregressive) models [3, 4, 5]. the model is identified using a bayesian approach [6, 7, 8, 9, 10].

aim and outline of the solution

the task addressed in this paper is to detect outliers and to reconstruct the measured data. the proposed solution of this task is based on modelling the corrupted data by a mixture model comprising two components. the first component models the uncorrupted data, while the second models the outliers. detected outliers are replaced by predictions from the first component.

2 principle of mixture model estimation

the quasi-bayes algorithm for the recursive estimation of mixture model parameters was developed recently [11]. the algorithm is designed for estimating the parameters of mixture models with components in the form of linear regression models. the mixing weights of the components are considered to be unknown, and their estimates are also provided by the algorithm.

mixture models

the mixture model is described as a conditional probability density

f(d_t | d(t−1), Θ, α) = Σ_{i=1}^{c} α_i f(d_t | ψ_{t−1}, Θ_i), (1)

where f(·|·) denotes a conditional probability density function (pdf), d is the modelled (and filtered) variable, d_t is its actual value at time t, ψ_{t−1} is a vector of historical data on which d_t depends, Θ = [Θ_1, Θ_2, ..., Θ_c] are the parameters of the individual components, α = [α_1, α_2, ..., α_c] is the vector of component weights, and c is the number of components. the main advantage of this model is the ability to describe a system with a finite number of different states, even if the relations between the states are very complex.

bayes rule for mixture models

direct application of the well-known bayes rule

f(Θ, α | d(t)) ∝ f(d_t | d(t−1), Θ, α) f(Θ, α | d(t−1)) (2)

to the mixture model (1) yields an intractable posterior distribution. specifically, the application of the bayes rule (2) to the mixture model (1) yields a posterior distribution in the form of a mixture with c^t components. hence, the complexity of the posterior grows with time t, which is prohibitive for on-line processing.

model approximation

to solve the above problem, an approximate bayesian estimation is used. it is achieved in three steps: (i) introducing a random variable c_t that indicates the active component at time t, (ii) reformulating the model of the active component into a product form:

f(d_t | ψ_{t−1}, Θ, c_t) = Π_{i=1}^{c} f(d_t | ψ_{t−1}, Θ_i)^{δ(i − c_t)}, (3)

and (iii) approximating the kronecker delta function δ(i − c_t) in (3) by its conditional mean value, as given in (4) below.
E[δ(i − c_t) | d(t)] = Σ_{c_t=1}^{c} δ(i − c_t) Pr(c_t | d(t)) = Pr(c_t = i | d(t)) ≡ w_{i,t}, (4)

where Pr(·) denotes probability. an evaluation of the weights for linear regression models is available in [6] or [12].

effect of the approximation

the mean value (4) is a vector of the probabilities w_{i,t} of the individual components. thus, at each time instant, the statistics of all components are updated by the observed data. the contribution of the observed data to each component is given by the estimated weight (4). for components from the exponential family [6], the estimation is equivalent to the weighted least squares technique.

initiation of the estimation

the bayesian estimation (2) updates the parameter description, represented by the conditional pdf f(Θ, α | d(t)), using the observed data d_t for all times t = 1, 2, ..., up to the number of available data items. the recursion starts at t = 1 with the pdf f(Θ, α | d(0)), which is called the prior pdf. this pdf reflects our prior knowledge about the parameters Θ and α. the prior can also be used to ensure that the estimated model has certain advantageous features [13].

approximate estimation algorithm

the algorithm for an approximate estimation of the mixture model parameters with exponential family components is outlined in the following scheme:

a. initial off-line part
- choose the number of components of the mixture model and their structure.
- set the initial statistics of the parameters.

b. on-line time loop
- measure the current data.
- compute the probabilistic weights of all components using (4). the component with the maximum weight is called the active component.
- update the parameter statistics of each component.

c. concluding off-line part
- compute point estimates of the parameters from their statistics (if they are needed).

3 mixture-based outlier filtration

the process of bayesian mixture estimation indicated above is adapted for outlier detection and reconstruction.

idea of the filter

the main idea is to model the observed data by a probabilistic mixture with two components: 1) the data component, which models the uncorrupted data, and 2) the outlier component, which models the outliers.
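a minimal numerical illustration of the weighting step (4) for such a two-component filter is sketched below. the particular models (an ar(1) data component and a broad static outlier component) and all numbers are illustrative assumptions, not the statistics estimated in the paper.

```python
import math

def gauss_pdf(x, mean, var):
    """gaussian density, used as the component model f(d_t | psi, Theta_i)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

def component_weights(d_t, d_prev, alpha=(0.95, 0.05)):
    """weights w_{i,t} of equation (4) for one observation: a narrow ar(1)
    data component against a broad static outlier component."""
    lik_data = gauss_pdf(d_t, 0.9 * d_prev, 0.1)   # illustrative ar(1) data model
    lik_outl = gauss_pdf(d_t, 0.0, 100.0)          # broad static outlier model
    w_data = alpha[0] * lik_data
    w_outl = alpha[1] * lik_outl
    total = w_data + w_outl
    return w_data / total, w_outl / total

print(component_weights(d_t=4.6, d_prev=5.0))    # ordinary point: data component wins
print(component_weights(d_t=50.0, d_prev=5.0))   # far-off point: outlier component wins
```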
initiation of the filter

the initial description of the components is formalized by the prior pdf. the prior variance of the data component is chosen from a prior analysis of the filtered data and, thanks to forgetting [14], this variance is not allowed to change much. the prior variance of the outlier component is chosen significantly larger than that of the data component. moreover, it is left relatively free, in order to be able to "catch" whatever does not belong to the uncorrupted signal, i.e. the outliers. naturally, a better model of the uncorrupted data allows a better separation of the data from the corruptions. dynamic models describe the variable in dependence on its historical values, while a static description does not. our experience with data modelling [15] suggests that even data that are almost static deserve to be described by dynamic models in order to achieve a high quality of the description. therefore, the data component was chosen as a first order regression model. the structure of the outlier component is relatively loose and is chosen as static, i.e. a zero order regression model. its only task is to "cover" all possible errors, mainly outliers.

operation of the filter

as described in the previous paragraph, the estimation of a mixture model is based on weighting the data with respect to the individual components. using the estimated weights, the active component can be detected. this mechanism is used for outlier detection as follows:

1. if the dominant weight belongs to the data component, no action is taken, and
2. if the dominant weight belongs to the outlier component, the current data item is considered to be an outlier and the observed value is substituted by a simulated realization of the data component.

a problem occurs if, in the following step, the current data item is not an outlier. then the dominant weight belongs to the data component, and this component would be influenced by the old data item, which is now the outlier, through its regression vector. this value, entering the regression vector of the data component, must therefore also be substituted by the filtered value.

algorithm of the filtration

the filter can be summarized by the following modification of the above mixture estimation algorithm (a simplified sketch of the resulting loop is given below, after the description of the experimental data).

a. initial off-line part
- set the initial statistics of the parameters of a two-component mixture model:
  - the first component with a small data covariance (the data component),
  - the second component with a large data covariance (the outlier component).
- set the forgetting coefficient.

b. on-line time loop
- measure the current data.
- compute the probabilistic weights of both components with respect to the measured data.
- choose the component with the greater weight.
- if the chosen component is that of the data,
  - re-evaluate the parameter statistics of each component separately.
- if the chosen component is that of the outliers,
  - generate a data prediction using the data component,
  - use the predicted value as a reconstruction of the observed data,
  - use the predicted value in the (future) regression vector of the data component,
  - re-evaluate the parameter statistics of the outlier component only.

4 experiments

in this section, we test the proposed algorithm on real data. a sample of data from a traffic miniregion in the center of prague was chosen for the experiments.
the data are formed by intensities of traffic flow measured at a single point in the miniregion. the noise corrupting the data mainly represents standard traffic irregularities (interactions between neighbouring control lights, accidental accumulations of cars, minor accidents, etc.). the data sample consists of 1000 data items measured with a period of 5 minutes; it covers approximately 3.5 days. the intensity maxima reflect the traffic load during each day. the noise causes dissimilarities between the courses of the individual days. different daily courses (visible at the beginning of the fourth day) are caused by different types of days, such as weekdays and weekends. the outliers are not frequent disturbances, but they are very important due to their devastating effects on model estimation. they are caused either by accidental breakdowns of detectors or by a detector failure lasting several periods of measurement. especially the latter category is very difficult to distinguish automatically from the normal signal. in order to test the filter, the data was artificially corrupted by various types of outliers. basically, single outliers and blocks of outliers are used in all experiments. then, various outlier amplitudes are tested (big, medium, small) as well as a combination of these types in a single data sample. for all experiments, the results of the proposed filter are compared to those obtained using standard filters. these filters are based on a fixed-length window, moving along the current time, and evaluating some data characteristics, which are compared with the current data measurement to detect an outlier. these characteristics are either the mean value or the median computed over the window, computed either equally for all data or via a kind of forgetting algorithm. a description of such filters can be found e.g. in [16, 17, 18, 19]. many preliminary experiments were performed to compare the suggested mixture filter to standard filters. all of them gave comparable results for isolated outliers, but almost all standard filters were quite unsuitable for filtering block outliers. typically, the standard filters failed to detect a block of outliers. among all the standard filters, only two were found to be comparable with the proposed mixture filter. standard filter no. 1 was designed to detect a block of outliers [16]. after detecting the borders of a block outlier, it models the data before and after the outlier with a simple regression model and substitutes the outlying values by a combination of predictions from both of these models. standard filter no. 2 is a median filter with a window size of 200 (time periods) and without forgetting. to demonstrate the filtering results in this paper, only these two standard filters are used. as mentioned above, the components have different covariances. for these experiments, the data component has the covariance cov_d = 0.1 and the outlier component has the covariance cov_o = 100.

fig. 1: uncorrupted transportation data
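a heavily simplified sketch of the filtration loop, with the covariances used in these experiments, is given below (as announced after the algorithm above). the data component is frozen to a fixed ar(1) predictor and the outlier component to a static gaussian around the sample mean, i.e. the recursive statistics updates and the forgetting of the real filter are omitted; the comparison is done in the log domain to avoid numerical underflow for large outliers.

```python
import numpy as np

def filter_outliers(data, a1=0.9, cov_d=0.1, cov_o=100.0, alpha=(0.95, 0.05)):
    """replace detected outliers by the data-component prediction and keep
    using the replacement in the regression vector, as the algorithm requires."""
    raw = np.asarray(data, dtype=float)
    out = raw.copy()
    mean_o = float(raw.mean())          # crude stand-in for the static outlier component
    for t in range(1, len(out)):
        pred = a1 * out[t - 1]          # prediction of the (fixed) ar(1) data component
        log_d = -0.5 * (raw[t] - pred) ** 2 / cov_d - 0.5 * np.log(cov_d)
        log_o = -0.5 * (raw[t] - mean_o) ** 2 / cov_o - 0.5 * np.log(cov_o)
        if np.log(alpha[1]) + log_o > np.log(alpha[0]) + log_d:
            out[t] = pred               # outlier component dominates: reconstruct
    return out
```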
example 1: all big outliers

the first experiment follows a typical scenario, i.e. outliers with big amplitudes. the level of the outliers is about 5000, which is approximately 100 times the level of the uncorrupted data. the mixture filter completely detects all outliers and leaves the normal data without any change. the filtered variable is plotted in fig. 2. the filtering gives practically identical data (cf. fig. 1), up to 20 isolated outliers and two short blocks (the first within items 100–200 and the second within items 650–700) where groups of outliers were located. all substitutions for outliers are in an appropriate range. in order to evaluate the results in a non-visual way, and making use of the fact that the outliers were introduced artificially, the corrupted data is compared to its predictions from a model estimated on the basis of the filtered data sample. this quality evaluation is done through the prediction error (pe) coefficient, which is the square root of the sum of squares of the prediction error, divided by the variance of the data. the results for the suggested mixture filter and the two chosen standard filters are given in table 1.

fig. 2: filtered data

filter | pe coefficient
mixture filter | 0.49
standard filter no. 1 | 0.72
standard filter no. 2 | 4.80
table 1: pe coefficients for all big outliers
remark: the pe coefficients for the other standard filters ranged from 8 to 170. the big difference is caused by the fact that the standard filters are not able to detect the blocks of outliers.

example 2: all small outliers

an outlier is a value lying "far" outside the range of the uncorrupted data. what happens if "far" is not so far as in the previous experiment? now, outliers with an amplitude of 1 to 5 times the uncorrupted data amplitude are tested. the composition of the data and outliers is the same. the results are summarized in table 2:

filter | pe coefficient
mixture filter | 0.50
standard filter no. 1 | 1.52
standard filter no. 2 | 1.36
table 2: pe coefficients for all small outliers

once again, the mixture filter outperforms the others. the absolute values of the differences of the pe are smaller than in the previous experiment because the outliers are smaller and the failure to detect them results in a smaller contribution to the pe.
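for completeness, the pe coefficient used in tables 1–3 can be written down directly; the sketch below follows one literal reading of the description above (the square root of the sum of squared prediction errors, divided by the variance of the data).

```python
import numpy as np

def pe_coefficient(data, predictions):
    """prediction error coefficient as described in the text (one reading)."""
    errors = np.asarray(data, dtype=float) - np.asarray(predictions, dtype=float)
    return float(np.sqrt(np.sum(errors ** 2)) / np.var(data))
```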
typically, standard filters were not able to substitute the whole block of outliers. the best standard filters usually missed several outliers from the block before they “realized” that an outlier had occurred. the proposed mixture-based filter detects the outliers more correctly, and thus outperforms the standard filters in all simulated experiments. the particular disadvantage of the filter is when the blocks of outliers are very long. there can be a problem with the data component because its statistics were not recomputed during failure. future work will be focus on solving this problem. acknowledgment this research was partly supported by grants mšmt čr 1m6798555601, mdčr 1f43a/003/120. references [1] zhao, f., leong, t. y: “a data preprocessing framework for supporting probability-learning in dynamics decision modeling in medicine”. j am med inform assn, vol. suppl. s2000, 2000, p. 933–937. [2] dzerovski, s., gamberger, d., lavrac, n.: “noise detection and elimination in data processing experiments in medical domains.” appl artif intell, vol. 14 (2000), no. 2, p. 205–223. [3] titterington, d. m., smith, a. f. m., makov, u. e.: statistical analysis of finite mixtures. john wiley & sons, chichester, new york, brisbane, toronto, singapore, 1985, isbn 0 471 90763 4. [4] richardson, s., green, p. j.: “on bayesian analysis of mixtures with an unknown number of components, with discussion”, journal of the royal statistical society, series b, vol. 59 (1997), no. 4, p. 731–792. [5] mclachlan, g. j.: finite mixture models. wiley, new york, 1999. [6] kárný, m., nagy, i., novovičová, j.: “mixed-data multi-modelling for fault detection and isolation”, adaptive control and signal processing, 2002, no. 1, p. 61–83. [7] kárný, m.: “probabilistic support of operators”. ercim news, 2000, no. 40, p. 25–26. [8] ettler, p., kárný, m., nagy, i.: “employing information hidden in industrial process data”, in preprints of symposium on intelligent systems for industry, paisley, uk, academic press, 2001, p. 1814–1817. [9] kárný, m., nedoma, p., nagy, i., valečková, m.: “initial description of multi-modal dynamic models”, in artificial neural nets and genetic algorithms. proceedings, eds.: v. kurková, r. netruda, m. kárný, n. c. steele, vienna, springer, april 2001, p. 398–401. [10] nagy, i., nedoma, p., kárný, m.: ”factorized em algorithm for mixture estimation“, in artificial neural nets and genetic algorithm. proceedings, eds.: v. kurková, r. netruda, m. kárný, n. c. steele, vienna, springer, april 2001, p. 402–405. [11] kárný, m., kadlec, j., sutanto, e. l.: “quasi-bayes estimation applied to normal mixture”, in preprints of the 3rd european ieee workshop on computer-intensive methods in control and data processing, eds.: j. rojíček, m. valečková, m. kárný, k. warwick, prague, september 1998, útia av čr, p. 77–82. [12] nagy, i., kárný, m., nedoma, p., voráčová, š.: “bayesian estimation of traffic lane state”, international journal of adaptive control and signal processing, vol. 17 (2003), no. 1, p. 51–65. [13] kárný, m., khailova, n., böhm, j., nedoma, p.: “quantification of prior information revised”, international journal of adaptive control and signal processing, vol. 15 (2001), no. 1, p. 65–84. [14] kulhavý, r., zarrop, m. b.: “on general concept of forgetting”, international journal of control, vol. 58 (1993), no. 4, p. 905–924. [15] nagy, i.: “estimation of real data with dynamic mixtures”, tech. rep., research report no. 2066, útia av čr, prague, 2002. 
ing. pavla pecherková, e-mail: nemcova@utia.cas.cz
doc. ing. ivan nagy, csc., phone: +420 224 890 732, e-mail: nagy@fd.cvut.cz
department of applied mathematics, faculty of transportation sciences, czech technical university in prague, na florenci 25, 110 00 prague 1
institute of information theory and automation av čr, p.o. box 18, 182 08 prague 8, czech republic

extra control coefficient additive ecca-pid for control optimization of electrical and mechanic system
erol can
erzincan binali yıldırım university, school of civil aviation, department of aviation electric-electronics, erzincan, turkey
correspondence: cn_e@hotmail.com

abstract. proportional integral derivative (pid) controllers are frequently used control methods for mechanical and electrical systems. the controller values are chosen either by calculation or by experimentation, to obtain a satisfactory and optimised response of the system. sometimes the controller values do not quite achieve the desired system response, due to incorrect calculations or approximately entered values. in such a case, it is necessary to add a decision-making feature that can make comparisons with the existing traditional system in order to optimise the response. in this article, the decision-making unit created for these control systems to provide a better control response, and the pid system that contributes an extra control coefficient, called ecca-pid, are presented. first, the structure and design of the traditional pid control system and of the ecca-pid control system are presented. after that, the step responses of a quadratic system under the ecca-pid and the traditional pid method are examined. the results obtained show the effectiveness of the proposed control method.
keywords: ecca-pid, decision-making unit, satisfactory response.

1. introduction
the pid (proportional-integral-derivative) controller is a control-loop mechanism with a wide range of uses, such as electronic devices, mechanical devices and pneumatic systems [1–4].
the pid control system compares the reference input signal with the sensed output signal of the controlled plant via the feedback path. the controller then computes the error of the obtained signal. this error is sent to the p, i and d units, and after the controller multiplies the error by the respective coefficients, it sends the newly created signals to the input of the target plant system [5, 6]. this process is repeated until the error reaches a minimum value. while pid control studies generally focus on linear systems, studies on well-performing pid controllers have also been presented for some system groups with uncertainty [7, 8]. the balancing of a first-order time-delayed system using a pid controller with previously given pid values has been investigated [9, 10], while high-order time-delay systems are controlled by pid in [11–13]. some studies rely on testing the negative feedback control system in continuous oscillation with a step input in order to calculate the pid gain values. initially, the integral and derivative terms are disabled by setting their gains to zero, and the controller is operated with only the proportional action. a step input is applied to the input of the system, and the gain kp is increased from zero until a continuous oscillation of constant amplitude is obtained at the output of the system [12, 13]. the gain kp giving the sustained oscillation, together with the period of that oscillation measured in seconds, then determines the tuning. forcing the system to reach the constant-oscillation region in this way may have undesirable results in some applications. under external factors, the process can easily pass into the unstable region, and some physical damage to the equipment may occur. it also takes a lot of experimentation to calculate the values. moreover, in some systems, predetermined controller values may be insufficient to provide the desired stabilisation times. in order to eliminate such situations, an extra control coefficient additive (ecca)-pid control is recommended, which is based on all these principles but which can activate the system faster and stabilise it by providing a shorter settling time. the ideal reference signal is divided into reflection reference values of different magnitudes to form a decision unit, to be compared with the error and the error change rates. extra controller coefficients are thus produced by observing the error and the error rate of change and comparing them with the reflection reference values of different sizes. the aim is to provide a faster, semi-linear optimisation, independently of the controller coefficients previously entered into the system. first, the ecca-pid design and its working logic are given. then, in the implementation phase, the conventional pid and the proposed pid are applied to the transfer function of a second-order system and the step response is examined. the ideal response parts expected from the proposed system are tested at the reflection reference values 0–0.5 and 0–1, and the step responses are measured. considering the results obtained, the proposed method can reach the ideal control point in a very short time, while the traditional system remains far from the desired response.
2. design with pid controller
although the pd controller brings attenuation to the system, it does not affect the steady-state behaviour of the system. the pi controller affects the relative stability as well as the rise time, while it corrects the steady-state errors. these results lead to the use of pid control, combining the pi and pd controllers. kp, ki and kd denote the proportional, integral and derivative gain coefficients, respectively. a pid controller consists of pi and pd parts connected in series. the closed-loop control scheme for pid control is given in figure 1, where e is the error of the output signal and r is the reference value.

figure 1. the closed loop control scheme for pid.

the control law of the pid controller is

u(t) = k_p e(t) + k_d \frac{de(t)}{dt} + k_i \int_0^t e(\tau)\, d\tau, \quad (1)

e(t) = r(t) - y(t). \quad (2)

open-loop techniques rely on the results of a bump or step test, in which the output of the controller is abruptly and manually forced, with the feedback cancelled. the graphical trace of the trajectory of the process variable, known as the reaction curve, is given in figure 2 in [10]. the sloping line drawn tangent to the steepest point of the reaction curve demonstrates how fast the process reacts to the step change of the controller output. the inverse of the slope of this line gives the time constant t of the process, a measure of the severity of the delay. the reaction curve also yields the dead time d, which shows how long it takes for the process to give its initial reaction, and the process gain k, which tells how much the process variable increases relative to the size of the step. ziegler and nichols determined, by trial and error, that the best values of the tuning parameters p, ti and td can be calculated from the t, d and k values as follows [12, 13]: p = 1.2 t/(k d), ti = 2 d, td = 0.5 d.

figure 2. open loop curve.
figure 3. curve for a closed loop.

a closed-loop technique runs the controller in automatic mode, but with the integral and derivative actions turned off. as seen in figure 3, the controller gain is boosted until the smallest error produces a continuous oscillation in the process variable. the smallest controller gain that causes such an oscillation is named the ultimate gain pu, and the period of these oscillations is named the ultimate period tu. appropriate tuning parameters are calculated from the following rules based on these two values [10]: p = 0.6 pu, ti = 0.5 tu, td = 0.125 tu. despite all these separations and arrangements, the gain giving a sustained oscillation has to be found experimentally, and forcing the method to reach the constant-oscillation region may have undesirable results in some applications: the process can move to the unstable region very easily under external factors, and some physical damage to the equipment may occur. the ecca-pid method therefore offers a good alternative for avoiding these complex and inconvenient situations of the traditional methods. in order to provide a more optimal control, the response expected from the system is divided into partial sizes and compared with the error obtained; the error and error-change rates produced for the control are evaluated, and new coefficients are created to be added to the coefficients of the controllers, thus enabling the system to give a better response.
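as a concrete reading of the control law (1)–(2) and of the two ziegler–nichols rule sets quoted above, a discrete sketch can look as follows (python; the class and function names are illustrative assumptions, not part of the paper):

```python
def zn_open_loop(t, d, k):
    """reaction-curve rules quoted in the text:
    p = 1.2*t/(k*d), ti = 2*d, td = 0.5*d."""
    return {"kp": 1.2 * t / (k * d), "ti": 2.0 * d, "td": 0.5 * d}

def zn_closed_loop(pu, tu):
    """ultimate-gain rules quoted in the text:
    p = 0.6*pu, ti = 0.5*tu, td = 0.125*tu."""
    return {"kp": 0.6 * pu, "ti": 0.5 * tu, "td": 0.125 * tu}

class PID:
    """discrete realisation of equation (1):
    u = kp*e + kd*de/dt + ki*integral(e)."""
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, reference, measurement):
        error = reference - measurement              # equation (2)
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```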
let k ∈ ℤ⁺ index the extra coefficients, k = {k1, k2, k3, …, kn}. in order to find the value that will provide the desired control, each value is compared with the error produced by the system; the virtual part reference values are rr ∈ ℝ⁺ → rr = {rr1, rr2, …, rrn}. the k values to be produced can be found as follows:

if e1 > rr1 then k1; if k1 > 0 then kp + k1, ki + k1 and kd + k1
if e2 > rr2 then k2; if k2 > 0 then kp + k2, ki + k2 and kd + k2
…
if en > rrn then kn; if kn > 0 then kp + kn, ki + kn and kd + kn

unlike other swarm-optimization and traditional pid control methods, the proposed method produces linear movements to approach the desired value whenever it is far from that value, and in this case the desired control can be achieved more quickly. the extra control coefficient (ecca)-pid control is given in figure 4a, while figure 4b shows the mesh depicting the interaction of the reference and reflection reference values that will contribute to the extra coefficient. the control gains predicted by the decision-making unit can be expressed by the following equations:

u(t_1) = (k_p + k_1)\, e(t_1) + (k_d + k_1)\, \frac{de(t_1)}{dt} + (k_i + k_1) \int_0^{t_1} e(t_1)\, dt \quad (3)

u(t_2) = (k_p + k_2)\, e(t_2) + (k_d + k_2)\, \frac{de(t_2)}{dt} + (k_i + k_2) \int_0^{t_2} e(t_2)\, dt \quad (4)

u(t_n) = (k_p + k_n)\, e(t_n) + (k_d + k_n)\, \frac{de(t_n)}{dt} + (k_i + k_n) \int_0^{t_n} e(t_n)\, dt \quad (5)

if there is too much overshoot and oscillation in the system, the decision-making order of the proposed method can be arranged as follows:

if e1 > rr1 then k1; if k1 > 0 then kp + k1, ki + k1 and kd + k1
else if e1 < rr1 then k11; if k11 > 0 then kp − k11, ki − k11 and kd − k11
if e2 > rr2 then k2; if k2 > 0 then kp + k2, ki + k2 and kd + k2
else if e2 < rr2 then k22; if k22 > 0 then kp − k22, ki − k22 and kd − k22
if en > rrn then kn; if kn > 0 then kp + kn, ki + kn and kd + kn
else if en < rrn then knn; if knn > 0 then kp − knn, ki − knn and kd − knn

here e is the error and de is the error change, e ∈ ℝ → e = {e1, e2, …, en}, de ∈ ℝ → de = {de1, de2, …, den}, expressed as

e(t_1) = r(t) - y(t_1) \quad (6)
k_1 = e(t_1) - rr_1 \quad (7)
e(t_2) = r(t) - y(t_2) \quad (8)
k_2 = e(t_2) - rr_2 \quad (9)
de_1 = e(t_2) - e(t_1) \quad (10)
e(t_{n-1}) = r(t) - y(t_{n-1}) \quad (11)
k_{n-1} = e(t_{n-1}) - rr_{n-1} \quad (12)
de_{n-1} = e(t_{n-1}) - e(t_{n-2}) \quad (13)
e(t_{n-1}) = r(t) - y(t_{n-1}) \quad (14)
k_{n-1} = e(t_n) - rr_{n-1} \quad (15)
e(t_n) = r(t) - y(t_n) \quad (16)
k_n = e(t_n) - rr_n \quad (17)

considering the error e_c of a conventional pid control and the effect e_k of the proposed method on the error of the conventional method, e(t) can be arranged as

e(t) = e_c + e_k. \quad (18)

the general equation for the pid can then be arranged as

u(t_1) = (k_p + k_1)(e_{c1} + e_{k1})(t_1) + (k_d + k_1)\, \frac{d(e_{c1} + e_{k1})(t_1)}{dt} + (k_i + k_1) \int_0^{t_1} (e_{c1} + e_{k1})(t_1)\, dt \quad (19)

u(t_2) = (k_p + k_2)(e_{c2} + e_{k2})(t_2) + (k_d + k_2)\, \frac{d(e_{c2} + e_{k2})(t_2)}{dt} + (k_i + k_2) \int_0^{t_2} (e_{c2} + e_{k2})(t_2)\, dt \quad (20)

u(t_n) = (k_p + k_n)(e_{cn} + e_{kn})(t_n) + (k_d + k_n)\, \frac{d(e_{cn} + e_{kn})(t_n)}{dt} + (k_i + k_n) \int_0^{t_n} (e_{cn} + e_{kn})(t_n)\, dt \quad (21)

depending on whether the error is positive or negative, the control diagram of the system is as in figure 5, in line with the above explanation of the decision-making unit. the second-order test system is given in equation (22),

\frac{x(s)}{f(s)} = \frac{1}{0.5 s^2 + s + 1}, \quad (22)

and the pid control is applied to the system to be tested as in equation (23),

\frac{x(s)}{f(s)} = \frac{k_d s^2 + k_p s + k_i}{(0.5 + k_d) s^2 + k_p s + k_i}. \quad (23)

figure 4. a) extra control coefficient (ecca)-pid control, b) relation network between r and rr.
figure 5. the control diagram of the system, depending on whether the error is positive or negative.
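the additive rule set above can be condensed into a small routine; the following sketch (python) is one possible reading of the decision-making unit, with the extra coefficient built as in (6)–(17) and all names being illustrative assumptions:

```python
def ecca_gains(error, base_kp, base_ki, base_kd, rr_levels):
    """scan the reflection reference levels rr_1..rr_n and add the extra
    coefficient k_n = e - rr_n (cf. (7), (9), (17)) to kp, ki and kd
    whenever it is positive, following the first rule set above."""
    kp, ki, kd = base_kp, base_ki, base_kd
    for rr in rr_levels:
        k = error - rr                 # k_n = e(t_n) - rr_n
        if k > 0:
            kp += k
            ki += k
            kd += k
        # the overshoot-damping branch (e_n < rr_n) would subtract a
        # coefficient k_nn here, as in the second rule set
    return kp, ki, kd
```

for example, ecca_gains(0.8, 1.0, 0.5, 0.02, (0.0, 0.5)) enlarges each gain by 0.8 + 0.3, mirroring how the extra gains grow with the error in figure 7b.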
equation (24) and equation (25) give the fixed-value contributions to the controller as a result of the comparison, in the decision-making unit of the proposed system, of the reflection reference values with the actual controller coefficient values:

\frac{x(s)}{f(s)} = \frac{(k_d + k_1) s^2 + (k_p + k_1) s + (k_i + k_1)}{(0.5 + (k_d + k_1)) s^2 + (k_p + k_1) s + (k_i + k_1)} \quad (24)

\frac{x(s)}{f(s)} = \frac{(k_d - k_1) s^2 + (k_p - k_1) s + (k_i - k_1)}{(0.5 + (k_d - k_1)) s^2 + (k_p - k_1) s + (k_i - k_1)} \quad (25)

figure 6. the matlab simulink model of the designed system.

3. (ecca)-pid control application
the proposed system is examined through the step response of a second-order system, 1/(0.5 s^2 + s + 1). the matlab simulink model of the designed system is given in figure 6. both the traditional pid control method and the (ecca)-pid control are applied to the second-order system. figure 7a shows the step response values when two rr values, 0 and 0.5, are given to the proposed system. the extra gain values produced by the controller decision unit are given in figure 7b, while the controller output signal and the controller errors can be seen in figure 8. here kp is 1, ki is 0.5, kd is 0.02, and e is the error. while the rise of the system response occurs in as little as 0.1 s for ecca-pid and ecca-p, it takes 1.1 s for the traditional pid control; the rise time of the proposed system thus corresponds to 9 % of that of the traditionally pid-controlled system. settling cannot be achieved within 10 s for ecca-p, but when the i-d controllers are added to the total control system, settling takes place in 5 s for ecca-pid. when the system is controlled with a traditional pid, it is not capable of settling within 10 s. this shows that the desired control can be achieved with the decision-making unit of the proposed system, even if insufficient controller coefficients are selected for the system. the rr values in the range 0–0.5, taken into consideration by the decision-making unit, drive the system towards the desired control point. the gain factor is increased between 0–1 s and 5.4–8.4 s for rr = 0, and between 0–1 s and 2–10.4 s for rr = 0.5. the controller output signal becomes stable in 4 s. for the proposed control system the controller error vanishes in 5 s, while for the ecca-p and the traditional pid method the error does not vanish within 10 s. figure 9a shows the step response values when two rr values, 0 and 1, are given to the proposed system. the extra gain values produced by the controller decision unit are given in figure 9b, while the controller output signal and the controller errors can be seen in figure 10. again kp is 1, ki is 0.5 and kd is 0.02. for rr of 0–1 the rise of the system response again occurs in as little as 0.1 s for ecca-pid and ecca-p, against 1.1 s for the traditional pid control, i.e. 9 % of the rise time of the traditionally pid-controlled system. settling cannot be achieved within 10 s for ecca-p, but when the i-d controllers are added to the total control system, settling takes place in 4 s for ecca-pid. when the system is controlled with a traditional pid, it is not capable of settling within 10 s.
this shows that the desired control can be achieved with the decision-making unit of the proposed system, even if insufficient controller coefficients are selected. the rr values in the range 0–1, taken into consideration by the decision-making unit, drive the system towards the desired control point. the gain factor is increased between 0–1 s and 3.9–7 s for rr = 0, and between 0–10 s for rr = 1. while the controller output signal becomes stable in 4 s, it deviates from the ideal control reference value between 2 and 4 s. for the proposed control system the controller error vanishes in 4 s, while for the ecca-p and the traditional pid method the error does not vanish within 10 s.

figure 7. for rr of 0–0.5 > e: a) the step responses, b) the extra gain values produced by the controller decision unit.
figure 8. a) the controller output signal, b) errors for the controllers.
figure 9. for rr of 0–1 > e: a) the step responses, b) the extra gain values produced by the controller decision unit.
figure 10. a) the controller output signal, b) controller errors.
figure 11. for rr of 0–0.3 > e: a) the step responses, b) the extra gain values produced by the controller decision unit.
figure 12. a) the controller output signal for 0–0.3, b) controller errors.

figure 11 shows the step responses and the extra gain values produced by the controller decision unit for rr of 0–0.3; the corresponding controller output signal and controller errors are shown in figure 12. the rise of the system response for rr of 0–0.3 is again as short as 0.1 s for ecca-pid and ecca-p, while the traditional pid control needs 1.1 s. the 0–0.3 rr range in ecca-p provides an earlier rise compared to the 0–1 range. settling cannot be achieved within 10 s for ecca-p, but when the i-d controllers are added to the total control system, settling takes place in 4 s for ecca-pid. when the system is controlled with a traditional pid, it is not capable of settling within 10 s. this shows that the desired control can be achieved with the decision-making unit of the proposed system, even if insufficient controller coefficients are selected. the rr values in the range 0–0.3, taken into consideration by the decision-making unit, drive the system towards the desired control point. the gain factor is increased between 0–1 s and 5.5–8.7 s for rr = 0, and between 0–1 s and 1.2–10 s for rr = 0.3. after the maximum undershoot, which occurs at 2.2 s, the controller output signal becomes stable in 4 s. for the proposed control system the controller error vanishes in 6 s, while for the ecca-p and the traditional pid method the error does not vanish within 10 s. figure 13 shows the step response of the system controlled with ecca-pid and the controller errors for different rr values; figure 14 shows the step response of the system controlled with ecca-p and the controller errors for different rr values. ecca-pid and ecca-p have thus been tested for the control of a quadratic system. even if the previously determined controller coefficient constants are insufficient, or not entered at all, the system controlled by ecca-pid produces values that contribute to the controller by comparing the actual error of the system with the different reflection rr values in the decision-making unit. thus, unlike traditional pid controllers with a linear response, the controller reacts to the error variation in a semi-linear manner, independently of the controller coefficients previously entered into the system, and brings the control of the system to a satisfactory level.
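for reproducibility, the comparison described above can be approximated numerically; the following sketch (python, explicit euler integration) simulates the closed loop of the plant (22) under a unit-step reference with the gains kp = 1, ki = 0.5, kd = 0.02 quoted in the text. it is an illustrative re-implementation, not the author's simulink model, and the ecca branch follows the additive rule set only:

```python
import numpy as np

def simulate(gains_of, t_end=10.0, dt=1e-3):
    """explicit euler simulation of 0.5*x'' + x' + x = u (plant (22))
    under a unit-step reference; gains_of(e) returns (kp, ki, kd)."""
    n = int(t_end / dt)
    x = dx = 0.0
    integral = prev_e = 0.0
    y = np.empty(n)
    for i in range(n):
        e = 1.0 - x                          # unit-step reference
        integral += e * dt
        deriv = (e - prev_e) / dt if i else 0.0
        prev_e = e
        kp, ki, kd = gains_of(e)
        u = kp * e + ki * integral + kd * deriv
        ddx = (u - dx - x) / 0.5             # plant dynamics
        dx += ddx * dt
        x += dx * dt
        y[i] = x
    return y

BASE = (1.0, 0.5, 0.02)                      # kp, ki, kd from the text

def pid_gains(e):                            # conventional pid: fixed gains
    return BASE

def ecca_gains(e, rr=(0.0, 0.5)):            # extra coefficients from rr levels
    kp, ki, kd = BASE
    for r in rr:
        k = e - r
        if k > 0:
            kp += k; ki += k; kd += k
    return kp, ki, kd

y_pid = simulate(pid_gains)                  # conventional pid response
y_ecca = simulate(ecca_gains)                # ecca-pid response
```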
figure 13. with ecca-pid: a) the step response of the controlled system, b) the controller errors for different rr values.
figure 14. with ecca-p: a) the step response of the controlled system, b) the controller errors for different rr values.

4. conclusions
in this article, a pid control with an extra gain is developed. the structure and design of the traditional pid control system and of the ecca-pid control system are presented. then, the step response of a second-order system under the conventional method is examined. in the control processes for rr of 0–0.5 and rr of 0–1, the proposed system responds in 0.1 s at the moment of rise, while the traditional pid method responds in 1.1 s. again, while ecca-pid provides settling times of 4 s and 5 s, the traditional pid cannot provide settling within 10 s. this shows the effectiveness of the proposed system and its contribution to control systems. it therefore appears to be an ideal method for energy conversion systems and motor control units.

references
[1] h. wang, l. jinbo. research on fractional order fuzzy pid control of the pneumatic-hydraulic upper limb rehabilitation training system based on pso. international journal of control, automation and systems 20(3):210–320, 2022. https://doi.org/10.1007/s12555-020-0847-1
[2] m. n. muftah, a. a. m. faudzi, s. sahlan, m. shouran. modeling and fuzzy fopid controller tuned by pso for pneumatic positioning system. energies 15(10):3757, 2022. https://doi.org/10.3390/en15103757
[3] e. can, h. h. sayan. the performance of the dc motor by the pid controlling pwm dc-dc boost converter. tehnički glasnik 11(4):182–187, 2017.
[4] e. can, m. s. toksoy. a flexible closed-loop (fcl) pid and dynamic fuzzy logic + pid controllers for optimization of dc motor. journal of engineering research, 2021. online first, https://doi.org/10.36909/jer.13813
[5] j. crowe, k. k. tan, t. h. lee, et al. pid control: new identification and design methods. springer-verlag london limited, 2005.
[6] c. knospe. pid control. ieee control systems magazine 26(1):30–31, 2006. https://doi.org/10.1109/mcs.2006.1580151
[7] m.-t. ho, c.-y. lin. pid controller design for robust performance. ieee transactions on automatic control 48(8):1404–1409, 2003. https://doi.org/10.1109/tac.2003.815028
[8] c. zhao, l. guo. towards a theoretical foundation of pid control for uncertain nonlinear systems. automatica 142:110360, 2022. https://doi.org/10.1016/j.automatica.2022.110360
[9] p. patil, s. s. anchan, c. s. rao. improved pid controller design for an unstable second order plus time delay non-minimum phase systems. results in control and optimization 7:100117, 2022. https://doi.org/10.1016/j.rico.2022.100117
[10] e. s. tognetti, g. a. de oliveira. robust state feedback-based design of pid controllers for high-order systems with time-delay and parametric uncertainties. journal of control, automation and electrical systems 33(2):382–392, 2022. https://doi.org/10.1007/s40313-021-00846-2
[11] c. cruz-díaz, b. del muro-cuéllar, g. duchén-sánchez, et al. observer-based pid control strategy for the stabilization of delayed high order systems with up to three unstable poles. mathematics 10(9):1399, 2022. https://doi.org/10.3390/math10091399
[12] j. g. ziegler, n. b. nichols. optimum settings for automatic controllers. journal of dynamic systems, measurement, and control 115(2b):220–222, 1993. https://doi.org/10.1115/1.2899060
[13] s. skogestad. simple analytic rules for model reduction and pid controller tuning. journal of process control 13(4):291–309, 2003. https://doi.org/10.1016/s0959-1524(02)00062-8

model based control of moisture sorption in a historical interior
p. zítek, t. vyhlídal

abstract: this paper deals with a novel scheme for microclimate control in historical exhibition rooms, inhibiting moisture sorption phenomena that are inadmissible from the preventive conservation point of view. the impact of air humidity is the most significant harmful exposure for a great deal of the cultural heritage deposited in remote historical buildings. leaving the interior temperature to run almost its spontaneous yearly cycle, the proposed non-linear model-based control protects exhibits from harmful variations in moisture content by compensating the temperature drifts with an adequate adjustment of the air humidity. implemented in a medieval interior since 1999, the proposed microclimate control has proved capable of permanently maintaining a desirable constant moisture content in organic or porous materials in the interior of a building.
keywords: equilibrium moisture sorption, humidity control, non-linear cascade control.

1 introduction
most materials of sculpture works, paintings, plasters, stuccos, etc., hold an absorbed moisture content that settles at an equilibrium level corresponding to the relative humidity and temperature of the surrounding air (cassar, 1995). wood is one of the most sensitive materials, moreover with considerably different mechanical properties and extensibility in each of its three primary axes. variations in moisture content therefore result in stresses leading to deformations, or even to cracks opening in the exhibits (camuffo, 1998). the internal environment standard in leading galleries and museums involves keeping both the interior temperature and the air humidity in the exhibition rooms constant, but due to impracticable costs, such air-conditioning cannot realistically be implemented as far as remote historical objects are concerned (cassar, 1993). owing to this, an invaluable part of european cultural heritage is more or less exposed to the damaging influence of varying moisture caused by air humidity fluctuations (kotterer, 2002). however, when it is not possible to keep both air temperature and humidity constant, the primary goal is to avoid dimensional extensions or shrinking of the exhibits arising from variations in air humidity and temperature. in principle, it is possible to accept modest temperature changes if they are compensated simultaneously by adequate air humidity adjustments. in this way it is possible to protect the exhibits against the impact of moisture change and to fulfil the crucial conservation requirement, particularly with respect to the usually anisotropic character of these phenomena. apparently, this kind of compensation can meet the preservation demands only if both the temperature and the air humidity changes are smooth and slow enough for the internal atmosphere to remain in an almost steady state and for the absorbed moisture in the exhibit materials to remain constant.

2 air humidity and equilibrium moisture content
in view of the preservation demands it is a hard task to define unambiguously the proper parameters of the desirable environment. first of all, it is not the visitor's comfort but rather the benefit of the exhibits that should be the priority in exhibition rooms (camuffo et al., 2002).
recent research in preventive conservation has introduced the complex concept of the microclimate of a room in which monuments or exhibits are placed (camuffo, 1998). among the numerous parameters of the microclimate, air humidity and temperature are considered primary and the most important attributes of the microclimate (cassar, 1993). particularly in remote deposit sites, where neither heating nor air-handling devices are in operation, the humidity impact is the most dangerous exposure from the preservation point of view. the decisive role of the moisture sorption impact is typical for most of the materials that artistic works are made of, i.e. wood, paper, parchment, leather, ivory, bone, paints, plaster, stucco, or stones containing abundant clay minerals, etc. these materials absorb a specific amount of water and are rather sensitive to variations in it. the steady value of this amount, corresponding to the surrounding air humidity and temperature, is called the equilibrium moisture content (emc), usually expressed as the ratio of the mass of water to the mass of anhydrous material. after the ambient temperature or humidity have changed their values, the absorbed moisture content changes accordingly (massari, 1993). an increase in emc is then followed by swelling of the material and, conversely, a decrease results in contraction. due to the non-isotropic character of these size changes, harmful deformations or destructive cracks appear as a result. the material extensions resulting from growing emc are relatively high; it is important to note that this extension is much higher than the mere thermal extension of the dry wood. in fact the expansion phenomenon is somewhat more complex. for example, a rise in temperature induces a thermal extension of the dry wood alone, but it also results in a drop in relative humidity, and therefore in a drop in emc, which brings about a material contraction, and vice-versa (camuffo, 1998). in this way, thermal expansion and emc contraction are of opposite character, and the shrinkage is partially mitigated by the expansion. however, the dimension change due to relative humidity is largely dominant, since the thermal expansion itself is more than ten times weaker than the humidity-driven one (kowalski, 2003).
3 equilibrium moisture content models
for each of the considered materials the equilibrium moisture content settles at a level appropriate to the ambient air humidity and temperature. although the emc levels are different for various materials, the following properties are common to all of them:
- the emc always increases with growing φ and decreases with growing t,
- the emc value is much more sensitive to changes in air humidity than to variations in temperature.
the relationship between the emc, denoted u, and both air temperature t and relative humidity φ, u = u(φ, t), has been fitted by several formulae developed for various areas of application. usually this relationship is plotted in the coordinates φ and u as so-called sorption isotherms, with temperature considered as a parameter. in particular, the mathematical models by day and nelson and by simpson (ball et al., 2001) were found to fit the experimental data well for professionally evaluating emc. the day and nelson model modified by ball et al. (2001) is of the form

u_{dn} = \left[ \frac{\log(1 - \varphi)}{a(T_K)} \right]^{1/b(T_K)}, \quad (1)

where t_k is the temperature in degrees kelvin, φ is the relative humidity expressed as the mass ratio, and the temperature-dependent coefficients a(t_k) and b(t_k) are exponential functions of t_k/300 with the model parameters k_1, α and k_2, β, respectively. the simpson model, in the version for wood moisture evaluation, is as follows

u_s = \frac{1800}{w} \left[ \frac{k_0 \varphi}{1 - k_0 \varphi} + \frac{k_1 k_0 \varphi + 2 k_1 k_2 k_0^2 \varphi^2}{1 + k_1 k_0 \varphi + k_1 k_2 k_0^2 \varphi^2} \right], \quad (2)

where the parameters w, k_0, k_1, k_2 are given as quadratic polynomial functions of temperature t. using models (1) or (2), high accuracy can be achieved in fitting the experimental data. however, for the microclimate control idea proposed below these models are less suitable, because their derivatives result in fairly complicated formulae. that is why we chose the following henderson model with only three parameters,

u(\varphi, t) = \left[ \frac{-\ln(1 - \varphi)}{a\,(t + b)} \right]^{1/c}, \quad (3)

to derive the model-based controller of moisture sorption presented below. in this model the relative humidity is considered as the mixing ratio φ ∈ ⟨0, 1⟩, the temperature t and the parameter b are in °c, c is a positive dimensionless exponent, a is a parameter in °c⁻¹, and u is the dimensionless ratio of the moisture mass content to the mass of dry material. as regards the function u(φ, t), the mutual dependence between the variables φ and t should be noted. if x is the absolute humidity in kg of water per one kg of dry air, its value corresponding to a certain given φ and t results, e.g., from the magnus relationship (camuffo, 1998), as follows

x = 3.795 \times 10^{-3}\, \varphi \cdot 10^{\,a t/(b + t)}, \quad (4)

where a = 7.5 and the additive temperature constant b = 237.3 °c. only from x can the actual water vapour content be assessed for a given air volume. on the other hand, the relative humidity changes inversely to the temperature change if the absolute humidity x is maintained at a constant level. then, for two states φ_1, t_1 and φ_2, t_2 of humid air with the same x, it holds

\frac{\varphi_2}{\varphi_1} = \frac{10^{\,a t_1/(b + t_1)}}{10^{\,a t_2/(b + t_2)}}. \quad (5)

hence, if the humid air temperature has changed without changing the water vapour content x, its relative humidity moves in the opposite way. for example, a temperature drop under x = const. necessitates the relative humidity increase given by (5).
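the magnus relationship (4) and the constant-x conversion (5) translate directly into code; a minimal sketch (python; the function names are illustrative):

```python
def abs_humidity(phi, t, a=7.5, b=237.3):
    """magnus relationship (4): absolute humidity x in kg of water per kg
    of dry air, for relative humidity phi (0..1) and temperature t in deg c."""
    return 3.795e-3 * phi * 10.0 ** (a * t / (b + t))

def phi_at_new_temp(phi1, t1, t2, a=7.5, b=237.3):
    """equation (5): relative humidity after a temperature change t1 -> t2
    when the absolute humidity x is kept constant."""
    return phi1 * 10.0 ** (a * t1 / (b + t1) - a * t2 / (b + t2))
```

for example, phi_at_new_temp(0.6, 15, 21) returns roughly 0.41, reproducing the observation below that a humidity drop from 60 % to 40 % under constant x corresponds to a temperature increase of only about 6 °c.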
this phenomenon can be observed in simultaneous records of φ and t, where most of the relative humidity fluctuations are usually due to temperature changes, while only a minor part is due to an actual change in the water vapour content. another consequence can be derived from the magnus formula. if a considerable change in relative humidity from φ_0 to φ has occurred under x = const., the corresponding temperature change is relatively small, given as

t - t_0 = \frac{(b + t_0)(\log \varphi_0 - \log \varphi)}{a}, \quad (6)

where, unlike in (3), the decimal logarithm is used. this means that, for example, a drop of relative humidity from 60 to 40 per cent, with x unchanged, corresponds to a temperature increase of only about 6 °c. since the sorption isotherms describing the relationship u = u(φ, t) are rather close to each other, the small temperature increments given by (6) result in the property that the lines of x = const. have a rather similar shape to the isotherms in the diagram of u(φ, t) and are relatively close to them. a comparison of several isotherms with the lines of x = const. can be seen in fig. 2. model (3) is marked out by advantageous analytical properties, particularly with regard to its derivatives. the model parameters a, b, c can be assessed to fit a set of data obtained either from experiments or from a more precise emc model (1) or (2). for a given set of data φ_i, t_i, u_i, i = 1, 2, …, n, the best-fitting parameters a, b, c can be identified by means of the least-square approach. for wood, the parameter c is about 0.65, b is from 260 to 310 °c, and the value of a is about 0.28 °c⁻¹ if u is expressed in kg/kg. in this way either the experimental data or the more complicated formulae (1), (2) can be fitted by the relatively simple model (3) with a good coincidence over sufficiently broad intervals, i.e. for air temperatures from 5 to 40 °c and for air humidity from 10 % to 90 % in the controlled interior. an example of the attainable conformity of the simpson expression of a sorption isotherm with the henderson model is shown in fig. 1. on the other hand, it is obvious from the nature of function (3) that it cannot fit the emc for relative humidity values close to 100 %, since then ln(1 − φ) → −∞. obviously, due to varying sorption properties, the parameters a, b, c result in different values for various materials. in order to fit the three parameters of the henderson model to the more precise simpson model, the iterative least-square method was used. applying this method to fit the points of the characteristics within the intervals t ∈ ⟨0, 25⟩ °c and φ ∈ ⟨10, 90⟩ %, which define the region of interest from a practical point of view, the following parameters result: a = 6.962×10⁻⁵, b = 305.7 °c and c = 0.64. the accordance of the two models can be seen in the comparison of one of the isotherms in fig. 1.

4 equal-sorption principle of air humidity control
let us assume the well-being of the preserved exhibits to be the priority in setting up the parameters of the exhibition room microclimate. the following facts and claims are then to be taken into account in controlling the internal air parameters:
- the impact of air humidity changes is more significant than the impact of usual temperature fluctuations.
- the primary harmful impact on the state of the exhibits is due to the moisture content absorbed in their porous or organic materials.
- changes in moisture content have the crucial harmful impact, bringing about an anisotropic swelling or shrinking that results in deformations or cracks in the exhibits.
- furthermore, a higher level (> 12 %) of moisture content becomes favourable for microbiologically harmful organisms attacking the surface of the exhibits.
hence, it is not the air temperature and humidity but the moisture content in the absorbent materials that is the decisive parameter to be kept as constant as possible in order to avoid any swelling or shrinking of the exhibits. however, this parameter can hardly be considered a usual controlled variable, since its continual measurement is not available. first of all, the harmful impact is brought about by the anisotropic behaviour of the exhibit materials. on the other hand, it is important to realize that an exactly constant emc can be kept only in one sort of exhibit, due to the differences among the sorption isotherms of various materials. moreover, the idea of compensating the spontaneous temperature fluctuations by humidity adjustments is feasible if, and only if, the variations of the interior temperature are smooth and slow enough, i.e. sufficiently close to a steady state. it should be emphasized that the moisture content is only slightly sensitive to temperature variations, but much more dependent on the air humidity. let the following example be mentioned: a great temperature increase, say from 5 °c to 40 °c, causes approximately the same change in moisture content in wood as results from a relative humidity drop as small as about 6 % (camuffo, 1998). although no direct measurement of emc is available, the henderson model allows us to express its value analytically from the air humidity and temperature measurements, provided that the state of the environment is all the time close to a steady state. with the help of this model, a control action can be assessed that protects exhibits made of the selected material from harmful emc changes. let us assume that the temperature change is slow and smooth, and let a simultaneous air humidity correction be provided in order to keep a zero increment of the moisture content. in other words, although both the temperature t and the relative humidity φ of the air change, the moisture content in a specific material can be kept constant. if small increments of the state variables are considered, the compensating Δφ correction results from the derivative of model (3), i.e. from the condition (zítek and němeček, 2004)

\Delta u = \frac{\partial u}{\partial \varphi}\, \Delta \varphi + \frac{\partial u}{\partial t}\, \Delta t = 0. \quad (7)

using the henderson model, the derivatives in this equation can be obtained from (3) in an analytical form. for the air temperature the derivative is always negative,

\frac{\partial u}{\partial t} = -\frac{u(\varphi, t)}{c\,(b + t)} < 0, \quad (8)

(0 < φ < 1) and, conversely, for the humidity the derivative is always positive,

\frac{\partial u}{\partial \varphi} = \frac{u(\varphi, t)}{c\,(1 - \varphi)\,[-\ln(1 - \varphi)]} > 0. \quad (9)

the opposite signs of the two derivatives correspond to the opposite character of the temperature and relative humidity increments in the case of unchanged absolute humidity x, given as the ratio of the water vapour mass to the mass of dry air.
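the derivatives (8)–(9) and the resulting compensation ratio can be verified numerically; the sketch below (python) evaluates them for the henderson model, with the wood parameters quoted in the text passed in explicitly (the text quotes different values of a depending on the units chosen for u, so the parameter values here are illustrative):

```python
import math

def henderson_u(phi, t, a, b, c):
    """henderson model (3); e.g. a = 6.962e-5, b = 305.7, c = 0.64 for wood."""
    return (-math.log(1.0 - phi) / (a * (t + b))) ** (1.0 / c)

def du_dt(phi, t, a, b, c):
    """derivative (8); always negative for 0 < phi < 1."""
    return -henderson_u(phi, t, a, b, c) / (c * (b + t))

def du_dphi(phi, t, a, b, c):
    """derivative (9); always positive for 0 < phi < 1."""
    u = henderson_u(phi, t, a, b, c)
    return u / (c * (1.0 - phi) * (-math.log(1.0 - phi)))

def kc(phi, t, b=305.7):
    """compensation gain (11): the ratio -du_dt/du_dphi; a and c cancel,
    so kc depends on the parameter b only."""
    return -(1.0 - phi) * math.log(1.0 - phi) / (b + t)
```

note that the ratio −(∂u/∂t)/(∂u/∂φ) reduces to the gain of (11), which depends on b only; at φ = 0.5 and t = 15 °c it gives about 0.0011 °c⁻¹, in line with the constant coefficient adopted below.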
following the requirement to keep the moisture content in the selected material constant, it is necessary to provide a compensating air humidity correction Δφ corresponding to the admissible temperature increment Δt, according to the following ratio of the derivatives:

\Delta \varphi = -\frac{\partial u/\partial t}{\partial u/\partial \varphi}\, \Delta t = k_c(\varphi, t)\, \Delta t, \quad (10)

where the compensation gain k_c(φ, t), given by

k_c(\varphi, t) = \frac{-(1 - \varphi)\,\ln(1 - \varphi)}{b + t}, \quad (11)

is a function of both the temperature and the air humidity. however, of the three parameters of the henderson model, k_c(φ, t) depends only on b. obviously, due to the dominant role of b in the denominator value, the relative change of k_c with respect to the temperature variations is fairly low.

fig. 1: comparison of fitting the simpson model by the henderson model, with parameters a = 6.96×10⁻⁵, b = 305.7, c = 0.651; isotherms t = 15 °c (solid – simpson, dashed – henderson)
fig. 2: comparison of the shapes of the isotherms (solid) and the lines of x = const. (dashed) (using the magnus and henderson models)

the increment-proportional character of relationship (11) does not guarantee that a constant desired emc value u = u_d can be maintained by this kind of control. in order to prevent the moisture content from drifting from u_d, an integration action has to be added to the controller, and the following feed-forward-feedback control action results:

\frac{d\varphi_d(t)}{dt} = k_c(\varphi(t), t)\, \frac{dt(t)}{dt} + k_i \left[ u_d - u(\varphi(t), t) \right], \quad (12)

where the feed-forward gain k_c(φ, t) compensating the temperature trend is given by (11), the unavailable emc value is computed from the φ and t measurements by means of model (3), and k_i is an integration gain. this controller is considered the master controller in a cascade scheme, see fig. 3, where the slave is a humidity control with the variable desired value φ_d given by (12). with regard to the relative humidity to be kept in the controlled environment (within the interval φ ∈ ⟨40, 60⟩ %), the weak variability of the feed-forward gain can be neglected and a constant coefficient k_c(φ, t) ≈ 0.0013 °c⁻¹ can be considered. as regards the integration gain k_i, its value should be assigned low enough to prevent the control process from acquiring an oscillatory character. in any case, it is desirable to set k_i so as to achieve a sufficient filtering effect on the day-cycle fluctuations of the relative air humidity. in general, the dynamics of the interior air humidity can be considered subject to the model

v \rho\, \frac{dx_i(t)}{dt} = q_l \rho\, [x_e(t) - x_i(t)] + \sigma_x s_w\, [x_w(t) - x_i(t)] + q_m(t), \quad (13)

where v is the interior volume, ρ the air density, q_l the in-leakage of the ambient air, σ_x the effective diffusion coefficient between the walls and the internal air, s_w the entire wall surface, and q_m the water input provided by the humidifier (q_m > 0) or dehumidifier (q_m < 0). the values of the absolute humidity of the external air, the internal air and the walls, respectively, are x_e, x_i, x_w, where x_i is computed from the measured interior temperature t(t) and the interior relative humidity by means of the magnus formula (4). the in-leakage flow of the outdoor air is usually referred to as being approximately 25 to 30 % of the internal volume per hour (q_l ≈ (0.25–0.3) v h⁻¹).
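a minimal discrete-time sketch of the master control law (12) can look as follows (python): the humidity setpoint φ_d is integrated from the measured temperature trend and the emc deviation; the slave on-off loop and the interior model (13) are left out for brevity, and all numerical values other than k_i and the henderson parameters are illustrative assumptions:

```python
import math

A, B, C = 6.962e-5, 305.7, 0.64   # henderson parameters fitted in the text
KI = 0.007                        # integration gain, 1/hour (section 5 value)

def emc(phi, t):                  # henderson model (3)
    return (-math.log(1.0 - phi) / (A * (t + B))) ** (1.0 / C)

def kc(phi, t):                   # compensation gain (11)
    return -(1.0 - phi) * math.log(1.0 - phi) / (B + t)

def setpoint_trajectory(temps, phi0=0.5, dt=0.5):
    """integrate the master law (12) over interior temperature samples
    taken every dt hours; returns the humidity setpoint phi_d that an
    on-off slave humidifier would then be asked to track."""
    u_d = emc(phi0, temps[0])     # hold the initial moisture content
    phi_d = phi0
    traj = [phi0]
    for k in range(1, len(temps)):
        dtemp = (temps[k] - temps[k - 1]) / dt
        dphi = kc(phi_d, temps[k]) * dtemp + KI * (u_d - emc(phi_d, temps[k]))
        phi_d += dphi * dt
        traj.append(phi_d)
    return traj
```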
the moisture diffusion from the walls is highly variable. according to our measurements, the moisture release from damp walls may represent as significant a water source for the interior as window leakage does. the last term q_m(t) of (13) is the water transfer amount. this control actuating effect has to be powerful enough to counteract even the highest demands of moisture transport. the time constant of the humidity process (13) in the interior is

\tau = \frac{v}{q_l + k_d}, \quad (14)

with k_d denoting the equivalent conductance of the wall moisture exchange. since on-off control is commonly used for the humidifying device (i.e. as the slave controller), this time constant is approximately valid also for the humidity-controlled interior. unlike the master controller output, equation (13) is expressed for the absolute humidity x, but the interior output measurement is again provided for the relative humidity. hence the two conversions φ → x and x → φ cancel each other in the overall plant model. the only conversion left is given by the henderson model. after linearizing this model as in (7), the following characteristic equation of the control loop is obtained:

\tau s^2 + s + k_h k_i = 0, \quad (15)

where k_h denotes the gain of the linearized henderson model. the non-oscillating character of the control process is obtained if the integration gain satisfies the condition k_i ≤ (4 k_h τ)⁻¹. considering the usual parameters for an interior of about 100 m² in size or larger, this involves adjusting the integration gain approximately as low as k_i ≤ 1 h⁻¹.

5 application case study
within the framework of the research project, several historical interiors in the czech republic are being monitored. one of them is st. václav chapel in plasy monastery in west bohemia. as at the other sites, the selected interior microclimate is monitored by dataloggers measuring the temperatures and relative humidities every 30 minutes. in the plasy application, a model of the interior emc process has been identified from the collected data, and an equal-sorption humidity controller has been designed. correlating the interior and exterior absolute humidity (computed by means of the magnus formula) and using the least-square-based identification method, the time constant results in τ = 2.44 hours. the interval in fig. 5 was selected for identification in order to obtain the day-cycle fluctuations of the absolute humidity with the least influence of long-term transient phenomena. due to this selection, the wall moisture release, which has much slower dynamics than the variations caused by infiltration (the dynamics we want to model), has a minimum effect on the system dynamics. a comparison of the interior absolute humidity simulated by model (13) (with the exterior absolute humidity considered as the model input) and the real interior absolute humidity can be seen in fig. 5.

fig. 3: scheme of microclimate control

even though there are considerable discrepancies between the modeled and measured data (the process itself is nonlinear and its dynamics is much more complicated than the dynamics of model (13)), the model fits the character of the dynamics sufficiently. using the model to transfer the measured exterior humidity into the interior humidity, and considering the interior temperature to be the temperature measured on the same time interval as the exterior humidity, see fig. 5, the simulation of the control keeping the emc has been performed. the results of the simulation are shown in fig. 8.
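the identification step mentioned above can be sketched as a least-squares fit of a discrete first-order model to the logged humidities (python; the data arrays and the 30-minute sampling period are assumptions based on the datalogger setup described in the text):

```python
import numpy as np

def identify_first_order(x_int, x_ext, dt_hours=0.5):
    """least-squares fit of x_i[k+1] = p*x_i[k] + q*x_e[k] from logged
    interior/exterior absolute humidities (1-d arrays); returns the
    continuous-time constant tau = -dt/ln(p) and the static gain q/(1-p)."""
    phi = np.column_stack([x_int[:-1], x_ext[:-1]])
    y = x_int[1:]
    (p, q), *_ = np.linalg.lstsq(phi, y, rcond=None)
    tau = -dt_hours / np.log(p)
    gain = q / (1.0 - p)
    return tau, gain
```

applied to the plasy records, a fit of this kind is what yields the quoted time constant of 2.44 hours and the static gain of 0.14 reported with fig. 6.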
the parameters of the master controller have been chosen as k_c = 0.0011 °c⁻¹ and k_i = 0.007 hour⁻¹. the maximum dehumidifier performance needed to counteract the worst moisture sources is approximately 500 g of water per hour.

fig. 4: temperature in st. václav's chapel and ambient temperature measurements
fig. 5: relative humidity in st. václav's chapel and ambient relative humidity measurements
fig. 6: process/model comparison, model time constant τ = 2.44 hours and static gain q_l(q_l + k_d)⁻¹ = 0.14 (absolute humidity mean values: x_i = 5.74 g/kg, x_e = 6.45 g/kg)
fig. 7: the ambient relative humidity and interior temperature used as the model inputs in simulating the control of the emc
fig. 8: results of simulating the control of the emc in st. václav's chapel. upper: u_n – emc if no control is used (computed from the measured data), u_d – desired value of emc, u_c – emc if control is used (resulting from simulation). lower: the amount of water to be extracted

6 conclusions
a novel microclimate control scheme has been proposed for protecting preserved exhibits from changes in the equilibrium moisture content of the materials they are made of. although the emc, as an alternative controlled variable, is unavailable for continual measurement, changes in emc can be estimated by the non-linear model (3) from the humidity and temperature measurements. compensating the natural interior temperature variations simultaneously by an appropriate adjustment of the air humidity proves efficient in maintaining a constant reference emc during the entire annual weather cycle. basically, the non-linear model-based microclimate control is marked out by a feed-forward character, since the changes in the thermodynamic air state variables are much faster than the sorption phenomena, and the humidity corrections therefore act before the sorption process can develop. the proposed moisture content control scheme has already been implemented, and has been in successful operation in the holy cross chapel at karlštejn castle in the czech republic since 1999 (zítek and němeček, 2004). incited by the alarming deterioration of the collection of 29 precious medieval paintings by master theodoricus, the implementation was based on a thorough investigation of the interior microclimate. in heavy inertial structures, like those in this implementation, the temperature course can be allowed to run almost its natural yearly cycle, without considerable heating influence, since the temperature does not drop below six or seven °c. however, in mansions and similar buildings which do not have such thick walls, the course of the interior temperature, unaffected by heating, would drop to zero or even below freezing point during the winter season, and it is essential to maintain the temperature safely above zero. hence, in buildings where such low temperatures occur, the temperature cycle has to be corrected by low-level heating during the winter season. on the other hand, the moisture control is able to protect the interior safely from condensation effects if the interior temperature does not drop below 5 °c. although the proposed control scheme is based on the emc model for a single material, selected as decisive for preservation purposes, the differences between the sorption isotherms of various materials affect the control action only weakly, also with regard to the low level of the humidity corrections needed.
humidity compensation therefore works well not only for the selected material but also for the other materials. although the sorption differences between materials are expressed in the three parameters a, b, c, only b (in °c) influences the main part of the control action (12). only this property of the henderson model turned out to be significant, helping to show that the control actions (12) computed for various materials would not be very distinct. the value of b is relatively high (more than 200 °c), and therefore the compensation control parameter k_c(t, φ) is rather low and makes the humidity correction relatively small. in this way, the sorption differences between various materials have an inconsiderable effect on the resulting control action (12). on the other hand, the following two factors are to be emphasized:
- the equilibrium assumption. it is of primary importance to avoid any rapid air temperature and humidity changes in the interior.
- providing sufficient internal air circulation. the control of the internal air humidity will have little effect without ensuring sufficient air circulation, particularly near the surface of the walls.
it should be emphasized that the most important advantage of equal-sorption control is not the economy of power consumption but, above all, the careful character of its operation from the preventive conservation point of view. due to the lower temperature and the controlled humidity during the winter season, the exhibits do not suffer such exposure to high gradients of both temperature and humidity as they do in the case of usual air-conditioning, or in historical buildings where no technical care of the internal environment is provided.

7 acknowledgement
the presented research has been supported by the grant agency of the czech republic under project no. 101-03-1365 and by the ministry of education of the czech republic under project ln00b096.

references
[1] camuffo, d.: microclimate for cultural heritage. elsevier science ltd., amsterdam, london, 1998.
[2] camuffo, d., bernardi, a., sturaro, g., valentino, a.: "the microclimate inside the pollaiolo and botticelli rooms in the uffizi gallery, florence." journal for cultural heritage, vol. 3 (2002), p. 155–161.
[3] cassar, m.: environmental management. guidelines for museums and galleries. routledge, london, 1995.
[4] cassar, m.: "a pragmatic approach to environmental improvements in the courtauld institute galleries in somerset house." icom committee for conservation, vol. 2 (1993), p. 595–600.
[5] kotterer, m.: "research report of the project eu 1383 prevent." museum ostdeutsche galerie regensburg, 2002.
[6] kowalski, s. j.: thermomechanics of drying processes. springer, berlin, 2003.
[7] massari, g., massari, i.: damp buildings. old and new. iccrom, rome, 1993.
[8] ball, r. d., simpson, i. g., pang, s.: "measurement, modelling and prediction of e.m.c. in pinus radiata heartwood and sapwood." holz als roh- und werkstoff, vol. 59 (2001), p. 457–462.
[9] zítek, p., němeček, m.: "stabilizing microclimate of historical interior by moisture sorption control." ieee conference on mathematical modeling, automation and robotics 2004, szczecin, poland (to appear).

prof. ing. pavel zítek, drsc., e-mail: zitek@fsid.cvut.cz
ing. tomáš vyhlídal, ph.d.
e-mail: vyhlidal@fsid.cvut.cz
centre for applied cybernetics, institute of instrumentation and control engineering, faculty of mechanical engineering, czech technical university in prague, technická 4, 166 07 praha 6, czech republic

foreword
tomáš bodnár, rudolf dvořák

prof. rndr. karel kozel, drsc., born december 24, 1939, died january 23, 2021; long-standing professor at the czech technical university in prague and researcher at the czech academy of sciences.

professor karel kozel is a prominent czech mathematician and pedagogue who devoted his entire life to applied, and especially numerical, mathematics. he was born on december 24, 1939, attended the primary school in pyšely, graduated from the high school in benešov and then from the pedagogical university (specializing in mathematics and physics), graduating in 1960. he then started working as a teacher at the high school in sedlčany and completed his basic military service. in 1964 he moved to the department of mathematics, faculty of mechanical engineering at the czech technical university in prague, as an assistant professor. from 1988 he was an associate professor, and from 1991 a full professor of applied mathematics. he was professionally active from 1970, defending his candidacy in 1977 (under the leadership of prof. polášek) and earning his drsc. in 1990. he led tasks of the state plan of basic research from 1972; from 1990 he led grants and research projects, a total of 10 (gačr, ga avčr, vz mšmt) in the czech republic and 3 (cost, qnet) from the eu. he was the head of the department of technical mathematics for 13 years and twice vice-dean of the faculty of mechanical engineering of the czech technical university in prague. professor karel kozel devoted himself to applied mathematics throughout his life and became probably the most authentic successor of the school founded at ctu by professor polášek, whose goal and purpose was the direct use of mathematics in solving specific problems of technical practice. professor kozel further developed this school and, together with his students, made a significant effort to develop and apply numerical methods in computational fluid mechanics. as part of his professional activity, he cooperated with numerous scientific and research institutions in the czech republic (mff uk, út avčr, mú avčr, fjfi, fel and fsv čvut) and abroad (e.g. the von kármán institute in belgium, the university of toulon in france, th darmstadt, the university of stuttgart and tu chemnitz in germany, ercoftac). he significantly contributed to the support and development of cooperation among mathematicians, industry and industrial research (škoda plzeň, škoda auto, vzlú letňany, svúss běchovice). his professional activity was mainly focused on mathematical models, the numerical solution of partial differential equations and their application to the simulation of flow; first subsonic and transonic flow, then flow in the boundary layer of the atmosphere, flow in biomechanics, and the so-called "fluid structure interaction". he led 13 projects or grants from the czech republic and the eu (cost, qnet-cfd) and worked on at least six other grants.
he was a member of the european professional societies gamm and euromech, as well as of the czech society for mechanics and the union of czech mathematicians and physicists. karel kozel's professional output includes more than 130 lectures, 16 university texts and monographs, 55 research reports and more than 570 publications in journals and proceedings.

the pedagogical activity of karel kozel was also extensive and important. he was a long-term member and later the head of the department of technical mathematics at the faculty of mechanical engineering of the czech technical university in prague. for many years, he also served as the vice-dean of the faculty. his colleagues at the department and the faculty always appreciated his direct and honest conduct. he taught mathematical subjects ranging from basic courses to doctoral ones. he was also pedagogically active at the faculty of nuclear sciences and physical engineering of ctu in prague and at the university of west bohemia in pilsen. he supervised a number of graduates and doctoral students. he was a member of the branch councils for doctoral studies at the faculty of mechanical engineering and the faculty of nuclear sciences and physical engineering of ctu in prague, and at the faculty of applied sciences of the university of west bohemia in pilsen. further, he was a member of the scientific council of the czech technical university and of the institute of thermomechanics of the czech academy of sciences. many of his students have successfully established themselves in the field of applied mathematics and continue his work at universities, research institutes, and in industry.

at the end of 2019, professor rndr. karel kozel, drsc. celebrated an important anniversary. for his extraordinary contribution to the development of technical mathematics, international scientific cooperation and, above all, the education of several generations of scientists, engineers and educators, he received the honorary medal for mathematics from the czech mathematical society.

ad: eightieth birthday of prof. karel kozel

the age of eighty is when everyone, willingly or unwillingly, begins to look back and to take stock of their lifelong journey. happy is the one who can enjoy the feeling of a fully and successfully lived life and the feeling of a job well done. not everyone can savour this feeling to the same extent as our important jubilarian, professor rndr. karel kozel, drsc. a graduate of the faculty of education, he worked as a teacher for the rest of his life. he spent his first four years at the high school in sedlčany; then, from 1964 until his retirement, he taught at the department of mathematics of the faculty of mechanical engineering at the czech technical university in prague. there he received his habilitation in 1988. in 1991 he obtained the degree of doctor of science (drsc.) and in the same year he was appointed full professor and head of the department of technical mathematics. however, many of us knew him more as an ever-present member of the institute of thermomechanics of the czech academy of sciences, with which he had successfully cooperated since 1969. there he had his corner where he could work undisturbed and discuss important topics and issues with staff members, which he then passed on to his students and co-workers at the faculty.
he also significantly participated in the construction of a joint workplace of the institute of thermomechanics of the cas and the faculty of mechanical engineering of the czech technical university. this collaboration led him to two topics that then prevailed in his later work: the numerical simulation of transonic flow in blade cascades and mathematical modelling in fluid dynamics. over time, the cooperation with the staff of the department expanded to the problems of turbulent flow in internal and external aerodynamics, flow in the boundary layer of the atmosphere (e.g. pollution dispersion), and selected problems of fluid-structure interaction and biomechanics (e.g. vocal cord movement). in addition, the institute of thermomechanics had original and detailed experimental results available, which enabled him and the staff of the institute to verify the suitability of mathematical models and to find the correct interpretation of the results of their numerical simulation. it was a collaboration that enriched all involved and showed how the concept of "applied mathematics" can really be fulfilled. in addition, it significantly contributed to the gradual building of a "school" of mathematical modelling in fluid mechanics, which soon became known beyond the borders of the czech republic. his cooperation with foreign universities was extensive, and many of his successful doctoral students also received a phd degree from a foreign university as part of a joint doctoral study. professor kozel himself has been a visiting professor at the university of toulon, france, every year since 1996. in addition to his own professional and pedagogical work, professor kozel also devoted himself to scientific and organizational work. he made a significant contribution to the establishment of the ercoftac czech pilot centre at the institute of thermomechanics. he is a member of the gamm and euromech societies and a czech representative on the board of the von kármán institute for fluid dynamics in rhode-saint-genèse, belgium. on behalf of all his friends and current or former collaborators, i would like to thank professor kozel for all his professional activities and commitments so far and wish him good health and well-being for all the years to come.

the preparation of this special issue of acta polytechnica started already in 2019. although professor karel kozel passed away at the beginning of 2021, before this issue appeared, we have left the forewords in their original form.

authors of the special issue
dependence of the viscosity coefficient of the niosomal dispersion on the temperature and particle size of the dispersed phase

elena igorevna diskaeva a,∗, olga vladimirovna vecher a, igor alexandrovich bazikov b, karine sergeevna elbekyan c, elena nikolaevna diskaeva d

a stavropol state medical university, department of physics and mathematics, mira 310, 355017 stavropol, russia
b stavropol state medical university, department of microbiology, mira 310, 355017 stavropol, russia
c stavropol state medical university, department of general and biological chemistry, mira 310, 355017 stavropol, russia
d branch of the federal state budget educational institution of higher education "mirea – russian technological university" in stavropol, department of industrial technology, kulakov avenue 8, 355035 stavropol, russia
∗ corresponding author: e_diskaeva@mail.ru

abstract. the aim of this study was to experimentally investigate the dependence of the viscosity coefficient of a niosomal dispersion based on peg-12 dimethicone on the temperature and the size of the niosome vesicles. the experiments were carried out with niosomes whose average size varied from 85 to 125 nm. the temperature varied from 20 to 60 °c, and the volume concentration varied from 1 to 10 %. the particle size was determined by scanning electron microscopy (sem) with subsequent statistical data processing. this study showed that the viscosity of niosomal dispersions significantly depends on both the temperature and the size of the niosome vesicles. with increasing temperature, the viscosity of niosomal dispersions decreases, and with increasing particle size, the viscosity increases.

keywords: niosome, nonionic surfactant vesicles, viscosity of niosomal dispersion, vesicle size.

1. introduction

the study of the physicochemical properties of niosomal dispersions is motivated by their wide application in pharmacology, medicine, and cosmetology [1]. the optimal size of niosome vesicles for drug delivery systems is in the range of 10-400 nm. drug delivery systems often need nanoscale features: for example, less than 200 nm for parenteral and local transport into tissues, less than 300 nm for the eye chamber and less than 10 nm for the circulating bloodstream [2-4]. the physicochemical characteristics of carrier particles, such as their size, charge, elastic properties, and ability to aggregate and deform, affect their biodistribution in the tissues, lifespan and elimination rate [5-7]. in most cases, the efficiency of using niosomal dispersions significantly depends on the viscosity coefficient, since this coefficient determines the external energy required for organizing the flow and for selecting the optimal dosage of drugs. almost all classical theories of viscosity assume that the viscosity coefficient depends only on the volume concentration, without taking into account its dependence on temperature, although even for homogeneous liquids this dependence is very significant [8-10]. currently, there are a number of empirical and semi-empirical formulas describing the changes in viscosity caused by temperature, which are applicable to determining the viscosity of dispersed systems. the more exactly these equations describe the viscosity-temperature dependence, the more coefficients they contain. however, the increased number of coefficients complicates the practical application of these formulas due to the need to experimentally determine each constant [11, 12].
the modified formula of batschinsky formed the basis of the slotte formula, which can be used for a quantitative description of the empirical dependence of viscosity on temperature [13]:

$$\eta = \frac{c}{(t+a)^n},\tag{1}$$

where c, a and n are constants, t is the temperature, and η is the coefficient of dynamic viscosity. the bingham and stokei formula is also often used:

$$\frac{1}{\eta} = at + bt^2,\tag{2}$$

where a and b are constants and t is the absolute temperature. there are other semi-empirical dependencies, but qualitatively the dependence of the viscosity coefficient on temperature can be represented, in general form, by the expression

$$\eta = a\cdot\exp\!\left(\frac{b}{t}\right),\tag{3}$$

where a and b are constants. however, formula (3) is only usable over a narrow temperature range. for a sufficiently wide temperature range, formulas of other types are used, the most common of which is vogel's formula:

$$\eta = a\cdot\exp\!\left(\frac{b}{t-c}\right),\tag{4}$$

where a, b and c are constants. none of the considered dependencies are universal: they vary significantly depending on the concentration of the niosomal dispersion, the size of the niosomal vesicles, and the base fluid. the main contradictions arise in studying the dependence of viscosity on particle size. many investigators assert that, as in ordinary coarsely dispersed liquids, this dependence is absent [14]. it is worth noting that there is no common point of view regarding the dependence of the viscosity coefficient on temperature. most researchers note a decrease in viscosity with increasing temperature. the situation is complicated by the fact that this dependence has not been properly studied even for ordinary dispersed liquids. the purpose of this work was to conduct an experimental study of the dependence of the viscosity coefficient of a niosomal dispersion based on peg-12 dimethicone on particle size and temperature. the viscosity of the niosomal dispersion was measured in the temperature interval 20-60 °c.
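since the practical difficulty with formulas (1)-(4) is determining their constants from measurements, a brief sketch of how this is typically done may be useful. the snippet below fits vogel's formula (4) to synthetic viscosity data with scipy; every numerical value and helper name in it is an assumption for illustration, not data from this study.

```python
# least-squares fit of vogel's formula (4), eta = a*exp(b/(t - c)),
# to synthetic viscosity-temperature data; illustrative values only.
import numpy as np
from scipy.optimize import curve_fit

def vogel(t, a, b, c):
    return a * np.exp(b / (t - c))

t = np.linspace(293.0, 333.0, 9)              # absolute temperature, k
eta_true = vogel(t, 0.04, 500.0, 150.0)       # assumed "true" constants
noise = 0.01 * np.random.default_rng(0).standard_normal(t.size)
eta_meas = eta_true * (1.0 + noise)           # simulated measurements

popt, _ = curve_fit(vogel, t, eta_meas, p0=(0.05, 400.0, 120.0))
print("fitted a, b, c:", popt)
```

the same procedure applies to formulas (1)-(3); as the text notes, the more constants the formula has, the more measurements are needed for a stable fit.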
2. materials and methods

in our experiment, we used niosomes that consisted of a shell in the form of a water-insoluble double layer of a nonionic surfactant belonging to the group of dimethicone copolyol substances, which are esters of polyethylene glycol with a polydimethylsiloxane (pdms) backbone [15-17]. physicochemical methods of molecular synthesis were used to obtain the silicone-based capsules. in the hydrophilic part of dimethicone, there are functional groups of silicon oxide. the length of the si-o bond is 1.6 å, which is much longer than the c-c bond of 1.4 å. due to this, the functional groups of the molecules are able to rotate with respect to each other. this provides the niosomes with a greater elasticity than that of liposomes made up of phospholipids. the use of peg-12 dimethicone promoted the formation of vesicles without requiring significant energy. the si-o-si bond angle is 130 degrees, in contrast to the 109-degree c-c-c bond, which increases the elasticity and stability [18].

figure 1. structure of a silicone-based niosome.

the vesicles were formed by intensive mechanical mixing of the mixture using an automatic reclosure homogenizer at room temperature for 5 minutes. the dispersion was then placed in a vessel for ultrasonic treatment, which was carried out at a frequency of 20 khz, a power of 200 w and an exposure of 10 minutes. then, to stabilize the concentration of hydrogen ions (ph) at 6.6-7.0 and to form a homogeneous structure of niosomes, emulsification was carried out on an apv lab series 1000 homogenizer. monolamellar niosomes with a mean vesicle size of 80-150 nm were formed. the prepared niosomes were spherical in shape. the samples were then diluted with ultrapure water. a vpzh-1 capillary viscometer with a capillary diameter of 0.86 mm was used to measure the viscosity. to ensure measurements at a fixed temperature, the experiments were carried out using a thermostat with a temperature stability of 0.1 °c. since one of the tasks was to determine the dependence of the viscosity coefficient on the particle size, a high accuracy in determining the average size of the niosomes and their size distribution was required. the dispersion of nanoparticles was studied using scanning electron microscopy (sem tescan mira 3 im). the sample preparation for sem involved sample drying, mounting, and coating. characteristics of the sample surface are obtained from the electrons emitted from the sample surface after scanning with a focused electron beam. particle sizes were determined with imagej, and the statistical analysis was performed in ms excel [19].

3. results and discussion

niosomal dispersions with volume concentrations of 1, 5 or 10 % were selected for the research. the vesicle size analysis of the niosomal dispersion was carried out using scanning electron microscopy. a typical electron micrograph of niosome vesicles with an average size of 103 nm is shown in fig. 2.

figure 2. micrograph of niosomes obtained by scanning electron microscopy (sem).

the microphotograph shows that the niosomes are, for the most part, spherical particles. this implies that the projective diameter can be used to determine the equivalent diameter of the particles: the projective diameter is the diameter of a circle whose area equals the area of the particle projection image. the projection area of a spherical particle is

$$s_p = \frac{\pi\,\delta_p^2}{4},\tag{5}$$

so the average projection diameter can be presented in the form

$$\delta_p = \sqrt{\frac{4\,s_p}{\pi}}.\tag{6}$$

to determine the average particle size for each volume concentration and to plot the particle size distribution, various fractions with a total number of 500 particles were examined. the probability density functions of the niosome size, obtained by processing an ensemble of electron microphotographs, are shown in fig. 3. the graphs in fig. 3 confirm that, in all cases, the distributions are lognormal. a comparison of these values for different volume concentrations demonstrates homogenization with respect to the particle size of the niosome dispersion. considering the interval of vesicle diameters in the range from 85 nm to 130 nm, it can be seen that a higher volume concentration causes a decrease of the particle size: from 123 nm at 1 % to 92 nm at 10 %. the reason behind this phenomenon is probably the peculiarities of the intermolecular interaction of the vesicle shells and the deformation of the vesicle membranes. it should be noted that the complex nature of the mutual influence of such factors as the zeta potential, the surface energy, and the distance between the shells complicates the construction of a mathematical model in a wide range of volume concentrations.
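to make equations (5) and (6) concrete, the short sketch below converts a set of projection areas into equivalent projective diameters and estimates the parameters of a lognormal size distribution. it is only an illustration of the processing step described above: the synthetic area values and function names are our assumptions, not part of the original study.

```python
# sketch of the size-analysis step: projection areas (nm^2), as would be
# extracted from sem micrographs with imagej, are converted to equivalent
# projective diameters via eq. (6), then a lognormal law is estimated.
import numpy as np

def projective_diameters(projection_areas_nm2):
    """equation (6): diameter of a circle with the same projection area."""
    s = np.asarray(projection_areas_nm2, dtype=float)
    return np.sqrt(4.0 * s / np.pi)

def lognormal_fit(diameters_nm):
    """estimate the mean and std of log(d), i.e. the lognormal parameters."""
    log_d = np.log(diameters_nm)
    return log_d.mean(), log_d.std(ddof=1)

# illustrative ensemble of 500 particles around ~103 nm
rng = np.random.default_rng(0)
areas = np.pi * (rng.lognormal(np.log(103.0), 0.15, 500) / 2.0) ** 2
d = projective_diameters(areas)
mu, sigma = lognormal_fit(d)
print(f"mean size ~ {d.mean():.1f} nm, lognormal mu={mu:.3f}, sigma={sigma:.3f}")
```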
we assume that in more concentrated systems (> 10 %) there will be a more intensive interaction between the particles of the dispersed phase, which may lead to the formation of temporary associates. a further increase of the concentration of the dispersed phase will lead to the formation of stable aggregates consisting of many particles. this will increase the viscosity of the system. fig. 4 describes the character of the dependence of viscosity on temperature for different volume concentrations. the research results show that the dependence of the kinematic viscosity coefficient of the dispersions is close to an exponential law, with the coefficient increasing proportionally with an increase in the specific surface area of the dispersed phase. a careful statistical analysis of the experimental data revealed an exponential correlation given by (7), which fits the data with a correlation factor r² > 0.99:

$$\nu = a\cdot e^{-bt}.\tag{7}$$

here, ν is the kinematic viscosity in mm²/s; t is the temperature in °c; a and b are functions of the particle volume concentration (φ), the values of which are given in table 1. the coefficient a is related to the volume fraction by

$$a = 37.75\,\varphi^2 - 0.8605\,\varphi + 1.3707 \quad (r^2 = 0.995),\tag{8}$$

where φ ranges from 1 to 10 % (expressed as a fraction, 0.01-0.10, as in table 1). the value of b did not change; however, this does not exclude the possibility of its dependence on the volume concentration outside the range from 1 to 10 %. fig. 5 shows the experimental values of viscosity plotted against the size of the niosomes. the viscosity coefficients of dispersions with different volume concentrations differ and grow with an increase in the diameter of the niosomal vesicles. presumably, this may be due to a decrease in the resistance at the "dispersion medium-niosome" phase interface caused by a reduction in surface area [20]. it can be concluded that the dependence of the viscosity coefficient of niosomal dispersions on particle size is rather complex and depends on the volume concentration of the dispersed phase [21-23]. the results obtained showed that not only the temperature but also the structural properties and the nature of the intermolecular interactions have a noticeable effect on the viscosity (fluidity) of niosomal dispersions.

figure 3. probability density function of the particle size distribution in niosomal dispersions for different volume concentrations.

figure 4. change in viscosity with rise in temperature.

volume concentration (φ) | 0.01   | 0.05   | 0.10
a                        | 1.3658 | 1.4215 | 1.6601
b                        | 0.0130 | 0.0130 | 0.0130

table 1. values of a and b, with a correlation factor r² > 0.99.

figure 5. dependence of viscosity plotted against the size of niosome vesicles.
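the fitted correlation (7)-(8) is easy to evaluate numerically. the sketch below is a minimal reading of it, assuming φ is expressed as a volume fraction (0.01-0.10) as in table 1; the function names are ours, not the authors'.

```python
# evaluate the empirical fit nu = a(phi) * exp(-b * t) from eqs. (7)-(8),
# with b = 0.0130 taken from table 1; valid (per the paper) for
# phi in [0.01, 0.10] and t in roughly 20-60 deg c.
import math

B = 0.0130  # 1/deg c, constant over the studied concentration range

def coeff_a(phi):
    """equation (8): coefficient a as a function of volume fraction phi."""
    return 37.75 * phi**2 - 0.8605 * phi + 1.3707

def kinematic_viscosity(t_celsius, phi):
    """equation (7): kinematic viscosity in mm^2/s."""
    return coeff_a(phi) * math.exp(-B * t_celsius)

for phi in (0.01, 0.05, 0.10):
    print(f"phi={phi:.2f}: a={coeff_a(phi):.4f}, "
          f"nu(20 c)={kinematic_viscosity(20.0, phi):.3f} mm^2/s, "
          f"nu(60 c)={kinematic_viscosity(60.0, phi):.3f} mm^2/s")
```

as a quick check, coeff_a(0.01) evaluates to about 1.3659, which matches the first column of table 1 to within rounding.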
4. conclusion

niosomes are versatile drug delivery devices and have numerous therapeutic applications. this versatility calls for proper physicochemical characterization techniques suited to the intended route of administration. the viscosity of niosomes is an important indicator for evaluating the biodistribution in the tissues, the lifespan and the drug elimination rate. this study shows that the viscosity of a niosome dispersion depends on many parameters, such as the base fluid, particle volume fraction, particle size, temperature, particle size distribution and particle aggregation. in this article, we attempted to study and characterise the viscosity of a niosomal dispersion as a function of temperature and vesicle size. the results obtained may give a better understanding of the residence time of drugs in the tissues and the bioavailability of products. drug diffusion out of the formulation into the tissues may also be inhibited by a high product viscosity. finally, the administration of high-viscosity liquid products tends to be more difficult.

references

[1] r. c. dutta. drug carriers in pharmaceutical design: promises and progress. current pharmaceutical design 13(7):761-769, 2007. https://doi.org/10.2174/138161207780249119.
[2] h. s. nalwa (ed.). handbook of nanostructured materials and nanotechnology. academic press, boston, 2000.
[3] s. p. vyas, r. k. khar. targeted & controlled drug delivery: novel carrier systems. cbs publishers & distributors, 2002.
[4] g. p. kumar, p. rajeshwarrao. nonionic surfactant vesicular systems for effective drug delivery: an overview. acta pharmaceutica sinica b 1(4):208-219, 2011. https://doi.org/10.1016/j.apsb.2011.09.002.
[5] a. kapoor. an overview on niosomes: a novel vesicular approach for ophthalmic drug delivery. pharma tutor 4(2):28-33, 2016.
[6] z. s. bayindir, n. yuksel. characterization of niosomes prepared with various nonionic surfactants for paclitaxel oral delivery. journal of pharmaceutical sciences 99(4):2049-2060, 2010. https://doi.org/10.1002/jps.21944.
[7] v. f. naggar, s. s. el gamal, a. n. allam. proniosomes as a stable carrier for oral acyclovir: formulation and physicochemical characterization. journal of american science 8(9):417-428, 2012.
[8] r. pal. modeling the viscosity of concentrated nanoemulsions and nanosuspensions. fluids 1(2), 2016. https://doi.org/10.3390/fluids1020011.
[9] j. happel, h. brenner. the viscosity of particulate systems, pp. 431-473. springer netherlands, dordrecht, 1983. https://doi.org/10.1007/978-94-009-8352-6_9.
[10] j. gonzalez-gutierrez, s. hert, b. von bernstorff, i. emri. prediction of viscosity of pim feedstock materials with different particle size distribution. in 31st danubia-adria symposium on advances in experimental mechanics. kempten university, germany, 2014.
[11] h. j. h. brouwers. viscosity of a concentrated suspension of rigid monosized particles. physical review e 81(5):051402, 2010. https://doi.org/10.1103/physreve.81.051402.
[12] m. ochowiak, j. różański. rheology and structure of emulsions and suspensions. journal of dispersion science and technology 33(2):177-184, 2012. https://doi.org/10.1080/01932691.2010.548694.
[13] a. j. batschinski. untersuchungen über die innere reibung der flüssigkeiten. i. zeitschrift für physikalische chemie 84(1):643-706, 1913. https://doi.org/10.1515/zpch-1913-8442.
[14] s. mueller, e. llewellin, h. mader, et al. the rheology of suspensions of solid particles. proceedings of the royal society a: mathematical, physical and engineering sciences 466(2116):1201-1228, 2009. https://doi.org/10.1098/rspa.2009.0445.
[15] i. a. bazikov, p. a. omelyanchuk. the method of delivery of biologically active substances with the help of niosomes. rf patent 2539396, 2014.
[16] i. a. bazikov. a method for transdermal transfer of active substances using niosomes on the basis of peg-12 dimethicone. rf patent 2320323, 2008.
[17] i. a. bazikov, v. v. lukinova, a. n. maltsev, et al. interaction of niosomal doxorubicin with cell membranes. medical gazette of the north caucasus 11(1):108-110, 2016. https://doi.org/10.14300/mnnc.2016.11011.
[18] i. a. bazikov, v. v. lukinova, n. i. malinina, a. n. maltsev. study of the mechanisms of intercellular interaction of the niosomal form of the antitumor drug doxorubicin with plasma membranes. the eurasian union of scientists 3(24):34, 2016.
[19] e. i. diskaeva, o. v. vecher, i. a. bazikov, d. s. vakalov. particle size analysis of niosomes as a function of temperature. nanosystems: physics, chemistry, mathematics 9(2):290-294, 2018. https://doi.org/10.17586/2220-8054-2018-9-2-290-294.
[20] v. y. rudyak, a. a. belkin, v. v. egorov. effective viscosity coefficient of nanosuspensions. in aip conference proceedings, vol. 1084, pp. 489-494. 2008. https://doi.org/10.1063/1.3076527.
[21] m. n. yasin, s. hussain, f. malik, et al. preparation and characterization of chloramphenicol niosomes and comparison with chloramphenicol eye drops (0.5 % w/v) in experimental conjunctivitis in albino rabbits. pakistan journal of pharmaceutical sciences 25(1):117-121, 2012.
[22] s. b. shirsand, g. r. kumar, g. g. keshavshetti, et al. formulation and evaluation of clotrimazole niosomal gel for topical application. rajiv gandhi university of health sciences journal of pharmaceutical sciences 5(1):32-38, 2015.
[23] b. lu, y. huang, z. chen, et al. niosomal nanocarriers for enhanced skin delivery of quercetin with functions of anti-tyrosinase and antioxidant. molecules 24(12):2322, 2019. https://doi.org/10.3390/molecules24122322.

performance of a hydromagnetic squeeze film on a rough circular step bearing: a comparison of different porous structures

jatinkumar v. adeshara a, hardik p. patel b,∗, gunamani b. deheri c, rakesh m. patel d

a vishwakarma government engineering college, chandkheda, ahmedabad 382424, gujarat state, india
b l. j. institute of engineering and technology, department of humanity and science, ahmedabad, gujarat state, india
c sardar patel university, department of mathematics, vallabh vidyanagar 388 120, gujarat state, india
d gujarat arts and science college, department of mathematics, ahmedabad 380 006, gujarat state, india
∗ corresponding author: hardikanny82@gmail.com

abstract. this investigation deals with a comparative analysis of the impact of a spongy structure based on the models of kozeny-carman and irmay on a hydromagnetic squeeze film in a rough circular step bearing. christensen and tonder's stochastic averaging process has been utilized to determine the role of an arbitrary transverse surface irregularity. the distribution of the pressure in the bearing is obtained by solving the concerned generalised stochastically averaged reynolds equation with appropriate boundary conditions.
the outcomes show that increasing values of magnetization result in an augmented load. the impact of the (transverse) surface irregularity has been found to be adverse. in addition, the negative effect of the surface irregularity and porosity can be minimised by the positive impact of magnetization, at least in the case of the globular sphere model of kozeny-carman. furthermore, a lower strength of the magnetic field results in an approximately similar performance for both models. this study suggests that the kozeny-carman model could be deployed in preference to irmay's model.

keywords: hydromagnetic fluid, squeeze film, circular step bearing, surface irregularity, spongy structure.

1. introduction

with magnetic fluids used as lubricants, the efficiency of bearings was substantially improved as compared to traditional lubricants. the magnetohydrodynamic squeeze film performance between curved annular plates was inspected by lin et al. [1]. patel & deheri [2] examined the influence of a magnetic fluid lubricant on a squeeze film in conical plates. it was found that the overall efficiency improved with this bearing system. of course, the feature of the cone's semi-vertical angle was crucial in enhancing the performance. the approach adopted in the patel & deheri investigation [2] was amended and improved by vadher et al. [3], who calculated the negative impact of the (transverse) surface irregularity on a magnetic-fluid-based squeeze film between rough spongy conical plates. they found that in the case of a negatively skewed surface irregularity, the already-increased load increased even further. andharia & deheri [4] examined the effect of a longitudinal surface irregularity on the magnetic-fluid-based squeeze film between conical plates; in contrast to the transverse surface irregularity, the longitudinal surface irregularity increased the load. for a cylindrical squeeze film, lin [5] developed a ferrofluid lubrication equation that takes into account convective fluid inertia forces for a circular disc application. compared to the non-inertia non-ferrofluid case, a longer elapsed period was found. the performance of a squeeze film based on a magnetic fluid between (longitudinally) rough elliptical plates was considered by andharia & deheri [6]. it was noted that, due to the combination of the squeeze film and the negatively skewed surface irregularity, the load increased significantly due to the magnetisation. a shliomis-model-based ferrofluid lubrication of a squeeze film was discussed by patel & deheri [7] for rotating rough (transversely) curved circular plates. such a type of bearing structure supports a certain amount of load even when there is no flow of a typical lubricant. very recently, patel et al. [8] investigated the squeeze film behaviour of different spongy structures on rough conical plates; the kozeny-carman model is preferred over irmay's model for the spongy structure in the case of a transverse surface irregularity. a hydromagnetic squeeze film in rough truncated conical plates, using the kozeny-carman-model-based spongy structure, was discussed by adeshara et al. [9]. a new kind of bearing was introduced by goraj [10], where the thickness of the lubricant is also used as the air gap of an electromagnetic system. here, under hydrodynamic, electromagnetic, and gravity stresses, a new
governing equation system defining the steady loci of such an electromagnetically supported short hydrodynamic plain journal bearing is obtained and solved. lu et al. [11] developed an analytic model to predict the static properties of a new hydrodynamic-rolling hybrid bearing. the findings demonstrate that the working states of the hydrodynamic-rolling hybrid bearing are split into two separate phases by a transition speed at which the hydrodynamic and contact models are separated. the impact of surface roughness and a micropolar lubricant between two elliptical plates under the application of an external transverse magnetic field was analysed by halambi et al. [12]. menni et al. [13] presented several strategies for increasing the thermal conductivity of fluids by suspending nano/micro-sized particle materials in them. patel and deheri [14] discussed the influence of viscosity variation on a ferrofluid-based long bearing; there it is observed that the increased load carrying capacity due to the magnetization is not significantly affected by the viscosity variation. in this article, it has been sought to study and analyse the performance of a hydromagnetic squeeze film on a rough circular step bearing with the inclusion of two different porous structures: the kozeny-carman and irmay models. furthermore, the effect of transverse roughness on the bearing's performance is also discussed.

2. analysis

the lower plate with a porous facing is assumed to be fixed, while the upper plate moves along its normal towards the lower plate. the plates are considered electrically conductive and the clearance space between them is filled by an electrically conducting lubricant. a uniform transverse magnetic field is applied between the plates. the flow in the porous medium obeys the modified form of darcy's law. as shown in figure 1, the bearings are not in direct contact and the load is applied on the bearings. the load w is carried within the pocket and surface by the fluid, and the fluid flows in a radial direction.

figure 1. the bearing system design and specification.
figure 2. configuration of spongy sheets given by kozeny-carman.
figure 3. configuration of spongy sheets provided by irmay.

following the analyses of majumdar [15] and patel, deheri & vadher [8], one finds that the reynolds-type equation for the pressure-induced flow in a circular step bearing is

$$q = -\frac{2\pi r}{12\mu}\,\frac{dp}{dr}\left[\frac{2a}{m^3}\left(\frac{m}{2}-\tanh\frac{m}{2}\right)+\frac{\psi\,l_1\,a}{c^2}\right]\left[\frac{\phi_0+\phi_1+1}{\phi_0+\phi_1+\tanh(m/2)/(m/2)}\right],\tag{1}$$

where

$$a = h^3 + 3h^2\alpha + 3h\left(\alpha^2+\sigma^2\right) + \varepsilon + 3\sigma^2\alpha + \alpha^3.\tag{2}$$

2.1. case i (a globular sphere model, as displayed in figure 2)

in this model the spongy material is filled with globular spheres of particles with a mean particle size $d_c$. the permeability of the spongy region was found to be

$$\psi = \frac{d_c^2\,e_1^3}{180\,(1-e_1)^2},\tag{3}$$

where $e_1$ is the porosity. equation (1) is integrated with respect to $r$ using the reynolds boundary conditions

$$p = 0 \ \text{at}\ r = r_o, \qquad p = p_s \ \text{at}\ r = r_i.\tag{4}$$

the resulting film pressure $p$ is given by

$$p = p_s\,\frac{\ln(r/r_o)}{\ln(r_i/r_o)},\tag{5}$$

where

$$p_s = \frac{6\ln(r_o/r_i)}{\pi\left[\dfrac{2a}{m^3}\left(\dfrac{m}{2}-\tanh\dfrac{m}{2}\right)+\dfrac{d_c^2\,e_1^3\,l_1\,a}{180\,(1-e_1)^2\,c^2}\right]\left[\dfrac{\phi_0+\phi_1+1}{\phi_0+\phi_1+\tanh(m/2)/(m/2)}\right]}.\tag{6}$$

in non-dimensional form,

$$p_s^{*} = \frac{6\ln(1/k)}{\pi\left[\dfrac{2b}{m^3}\left(\dfrac{m}{2}-\tanh\dfrac{m}{2}\right)+\dfrac{\psi\,b\,e_1^3}{15\,(1-e_1)^2\,c^2}\right]\left[\dfrac{\phi_0+\phi_1+1}{\phi_0+\phi_1+\tanh(m/2)/(m/2)}\right]}.\tag{7}$$
introducing the non-dimensional quantities

$$b = 1 + 3\alpha^{*} + 3\left(\alpha^{*2}+\sigma^{*2}\right) + \varepsilon^{*} + 3\sigma^{*2}\alpha^{*} + \alpha^{*3},\quad \alpha^{*}=\frac{\alpha}{h},\quad \sigma^{*}=\frac{\sigma}{h},\quad \varepsilon^{*}=\frac{\varepsilon}{h^3},\quad k=\frac{r_i}{r_o},\quad \psi=\frac{d_c^2\,l_1}{h^3}.\tag{8}$$

the load $w$ is computed by integrating the pressure and takes the dimensionless form

$$w = \frac{p_s^{*}\left(1-k^2\right)}{2\ln(1/k)}.\tag{9}$$

2.2. case ii (a model with capillary fissures, as revealed in figure 3)

this model deals with three sets of mutually orthogonal fissures (mean solid size $d_s$). irmay [16] assumed no loss of hydraulic gradient at the junctions and derived the expression for the spongy structure parameter as

$$\psi = \frac{(1-m)^{2/3}\,d_s^2}{12\,m},\tag{10}$$

where $m = 1-e_1$, $e_1$ being the porosity (note that in equations (12) and (13) below, the $m$ appearing in $2a/m^3$ and $\tanh(m/2)$ still denotes the hartmann number). the governing equation for the film pressure $p$ is

$$p = p_s\,\frac{\ln(r/r_o)}{\ln(r_i/r_o)},\tag{11}$$

$$p_s = \frac{6\ln(r_o/r_i)}{\pi\left[\dfrac{2a}{m^3}\left(\dfrac{m}{2}-\tanh\dfrac{m}{2}\right)+\dfrac{(1-m)^{2/3}\,d_s^2\,l_1\,a}{12\,m\,c^2}\right]\left[\dfrac{\phi_0+\phi_1+1}{\phi_0+\phi_1+\tanh(m/2)/(m/2)}\right]}.\tag{12}$$

in non-dimensional form,

$$p_s^{*} = \frac{6\ln(1/k)}{\pi\left[\dfrac{2b}{m^3}\left(\dfrac{m}{2}-\tanh\dfrac{m}{2}\right)+\dfrac{\psi\,b\,(1-m)^{2/3}}{m\,c^2}\right]\left[\dfrac{\phi_0+\phi_1+1}{\phi_0+\phi_1+\tanh(m/2)/(m/2)}\right]},\tag{13}$$

with the non-dimensional quantities

$$b = 1 + 3\alpha^{*} + 3\left(\alpha^{*2}+\sigma^{*2}\right) + \varepsilon^{*} + 3\sigma^{*2}\alpha^{*} + \alpha^{*3},\qquad \psi=\frac{d_s^2\,l_1}{h^3}.\tag{14}$$

the load $w$ is calculated by integrating the pressure, which takes the dimensionless form

$$w = \frac{p_s^{*}\left(1-k^2\right)}{2\ln(1/k)}.\tag{15}$$
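for readers who wish to experiment with these formulas, the following sketch evaluates the dimensionless load of equations (9) and (15) from the supply pressures (7) and (13) for both porous structures. it is a minimal reading of the reconstructed equations above; all numerical inputs are illustrative assumptions (not the parameter values used for the figures below), and m is reserved for the hartmann number, with the solid fraction 1 − e₁ substituted directly.

```python
# dimensionless load for the two porous-structure models, following
# eqs. (7)/(9) (kozeny-carman) and (13)/(15) (irmay). all parameter
# values below are illustrative assumptions, not the paper's data.
import math

def conductivity_factor(phi01, m):
    """(phi0 + phi1 + 1) / (phi0 + phi1 + tanh(m/2)/(m/2)); m = hartmann no."""
    return (phi01 + 1.0) / (phi01 + math.tanh(m / 2.0) / (m / 2.0))

def roughness_b(alpha, sigma, eps):
    """b = 1 + 3a* + 3(a*^2 + s*^2) + e* + 3 s*^2 a* + a*^3 (eqs. 8/14)."""
    return (1.0 + 3.0 * alpha + 3.0 * (alpha**2 + sigma**2) + eps
            + 3.0 * sigma**2 * alpha + alpha**3)

def dimensionless_load(k, m, phi01, b, porous_term):
    """p_s* from eq. (7)/(13), then w from eq. (9)/(15)."""
    bracket = (2.0 * b / m**3) * (m / 2.0 - math.tanh(m / 2.0)) + porous_term
    ps = 6.0 * math.log(1.0 / k) / (math.pi * bracket
                                    * conductivity_factor(phi01, m))
    return ps * (1.0 - k**2) / (2.0 * math.log(1.0 / k))

# illustrative inputs (assumed)
k, m, phi01, psi, e1, c = 0.4, 4.0, 0.5, 0.01, 0.4, 0.3
b = roughness_b(alpha=-0.05, sigma=0.05, eps=-0.05)

kc_term = psi * b * e1**3 / (15.0 * (1.0 - e1)**2 * c**2)   # eq. (7)
ir_term = psi * b * e1**(2.0 / 3.0) / ((1.0 - e1) * c**2)   # eq. (13), 1-m = e1

print("kozeny-carman w =", dimensionless_load(k, m, phi01, b, kc_term))
print("irmay         w =", dimensionless_load(k, m, phi01, b, ir_term))
```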
3. results and discussions

in the absence of porous structures, the current study reduces to the deliberation of a hydromagnetic squeeze film in a rough circular step bearing. further, for smooth bearing surfaces, this analysis comes down to the discussion of a circular step bearing (majumdar [15]) when there is no magnetization. however, because of the porous structures, there is an additional degree of freedom from a bearing design point of view. equations (7) and (13) describe the dimensionless pressure profile, while equations (9) and (15) govern the non-dimensional load. these expressions clearly suggest that the load $w \propto p_s^{*}$ and

$$p_s = \frac{q}{\pi\left[\dfrac{2a}{m^3}\left(\dfrac{m}{2}-\tanh\dfrac{m}{2}\right)+\dfrac{\psi\,l_1\,a}{c^2}\right]\left[\dfrac{\phi_0+\phi_1+1}{\phi_0+\phi_1+\tanh(m/2)/(m/2)}\right]}.$$

this means that, for a constant flow rate, the load increases as the stochastically averaged squeeze film thickness decreases. the bearing is thus self-compensating, provided that the flow rate is assumed to be constant. equations (7), (13), (9) and (15) indicate that the effect of the conductivity parameters on the pressure distribution and load is determined by

$$\frac{\phi_0+\phi_1+\tanh(m/2)/(m/2)}{\phi_0+\phi_1+1},$$

which turns to

$$\frac{\phi_0+\phi_1}{\phi_0+\phi_1+1}$$

for big values of $m$, because then $\tanh(m/2)\cong 1$ and $2/m\cong 0$. furthermore, it is observed that the pressure and load rise with growth in $\phi_0+\phi_1$, because both functions are increasing functions of $\phi_0+\phi_1$. figures 4, 5, 8, 9, 12, 13, 16, 17, 20, 21, 24 and 26 deal with the pattern of the load with respect to different parameters for kozeny-carman's globular sphere model; figures 6, 7, 10, 11, 14, 15, 18, 19, 22, 23, 25 and 27 deal with the variation of the load in relation to irmay's model of capillary fissures.

figure 4. profile of load with m & e. figure 5. data of load with m & ψ. figure 6. data of load with m & e. figure 7. change of load with m & ψ.

increased magnetization parameters contribute to an increased load, as can be seen in figures 4 and 5 and figures 6 and 7; it can be noted that the rate of the increase in the load is comparatively greater in kozeny-carman's globular sphere model. however, in the case of the globular sphere model, the impact of the spongy structure parameter and porosity on the variation of the load with regard to magnetization is negligible to some degree. for both models, the load increases sharply.

figure 8. change of load with ϕ0 + ϕ1 & e. figure 9. profile of load with ϕ0 + ϕ1 & ψ. figure 10. data of load with ϕ0 + ϕ1 & e. figure 11. change of load with ϕ0 + ϕ1 & ψ.

the effect of ϕ0 + ϕ1 on the distribution of the load with respect to the kozeny-carman model is shown in figures 8 and 9, while the profile of the load for the irmay model is given in figures 10 and 11. the rate of the increase in the load is comparatively higher in kozeny-carman's globular sphere model than in irmay's capillary fissures model.

figure 12. variation of load with σ∗ & e. figure 13. data of load with σ∗ & ψ. figure 14. data of load with σ∗ & e. figure 15. profile of load with σ∗ & ψ.

figures 12 and 13 present the effect of the surface irregularity's standard deviation σ∗ on the distribution of the load for the kozeny-carman model, and figures 14 and 15 for the irmay model. it can be seen that increasing values of σ∗ adversely affect the squeeze film performance. as can be seen from figures 12 and 13, for the kozeny-carman model, the effect of the spongy structure and porosity on the variation of the load carrying capacity with regard to σ∗ is marginal, while in the irmay model it is negligible.

figure 16. change of load with α∗ & e. figure 17. variation of load with α∗ & ψ. figure 18. trend of load with α∗ & e. figure 19. profile of load with α∗ & ψ.

it is found that the load bearing capacity decreases due to a positive variance; the opposite trend is visible in the case of a negative variance for both models (kozeny-carman and irmay). it is important to note that the influence of the spongy structure parameter on the variation of the load with respect to the variance remains negligible for the kozeny-carman model (figures 16 and 17) and for irmay's model (figures 18 and 19).

figure 20. data of load with ε∗ & e. figure 21. change of load with ε∗ & ψ. figure 22. variation of load with ε∗ & e. figure 23. profile of load with ε∗ & ψ.

the effect of skewness for the kozeny-carman model and irmay's model is presented in figures 20 and 21 and figures 22 and 23, respectively. the load, already increased by a negative variance, increases further as a result of a negatively skewed surface irregularity. here, the effect of the spongy structure parameter and porosity is also negligible in the case of the kozeny-carman model, while a better performance can be seen in the case of irmay's model.

figure 24. profile of load with ψ & k. figure 25. trend of load with ψ & k.

the combined effect of the spongy structure parameter and the radii ratio (k) appears to be adverse, as can be seen from figures 24 and 25. however, at the outset, the decrease in load is more profound in the case of irmay's model. the effect of porosity and the radii ratio is presented in figure 26 for the kozeny-carman model and in figure 27 for irmay's model. for irmay's model, only an increase can be seen; however, that is not the case for the kozeny-carman model. a comparison of both models is presented in table 1.
figure 26. variation of load with e & k. figure 27. change of load with e & k.

from table 1, one can see that the kozeny-carman model provides a better performance as compared to irmay's model with regard to the transverse roughness.

sr. no | graph        | kozeny-carman min | kozeny-carman max | irmay min  | irmay max
1      | m → e        | 1.04178748        | 5.10051255        | 0.99169035 | 4.75745642
2      | m → ψ        | 1.04311200        | 5.10122332        | 0.94204858 | 5.10122332
3      | ϕ0 + ϕ1 → e  | 0.72690126        | 1.90743812        | 0.66815659 | 1.86550029
4      | ϕ0 + ϕ1 → ψ  | 0.72851122        | 1.90752098        | 0.61379908 | 1.90752098
5      | σ∗ → e       | 1.49304632        | 1.72660919        | 1.37238548 | 1.68864716
6      | σ∗ → ψ       | 1.49635317        | 1.72668420        | 1.26073582 | 1.72668420
7      | α∗ → e       | 1.07736948        | 2.03056534        | 0.99030164 | 1.98592038
8      | α∗ → ψ       | 1.07975567        | 2.03065355        | 0.90973620 | 2.03065355
9      | ε∗ → e       | 1.43886631        | 1.82372390        | 1.32258403 | 1.78362666
10     | ε∗ → ψ       | 1.44205315        | 1.82380313        | 1.21498594 | 1.82380313
11     | ψ → k        | 1.07109072        | 2.15361105        | 0.90243565 | 2.15361105
12     | e → k        | 1.06872368        | 2.15351750        | 0.98235456 | 2.10616926

table 1. a comparison of both models: minimum and maximum load for kozeny-carman's globular sphere model and irmay's model of capillary fissures.

4. conclusion

this analysis demonstrates that the kozeny-carman model is a better choice for this type of bearing design. furthermore, this research shows that the surface irregularity aspect must be carefully considered while designing such bearing systems, even if the required magnetic strength is employed. this can play a vital role in improving the overall performance in the case of irmay's model. in addition, in the absence of flow, the bearing system supports some load for these two models, which never occurs with traditional lubricants, and this load is comparatively higher in the case of the kozeny-carman model.

acknowledgements

the authors would like to thank the reviewers for their valuable remarks and suggestions for the overall improvement of the presentation and organization of the manuscript.

list of symbols

r – radial coordinate
ro – outer radius
ri – inner radius
k = ri/ro – radii ratio
h – film thickness of the lubricant
s – lubricant's electrical conductivity
µ – lubricant's viscosity
b0 – uniform transverse magnetic field applied between the plates
m = b0 h (s/µ)^{1/2} – hartmann number
ps – supply pressure
q – flow rate
p*s – dimensionless supply pressure
p – lubricant pressure
p* – non-dimensional pressure
w – dimensionless load carrying capacity
h0 – lower plate's surface width
h1 – upper plate's surface width
s0 – lower surface's electrical conductivity
s1 – upper surface's electrical conductivity
ϕ0(h) = s0 h0′/(s h) – electrical permeability of the lower surface
ϕ1(h) = s1 h1′/(s h) – electrical permeability of the upper surface
σ∗ – non-dimensional standard deviation (of the surface irregularity)
α∗ – dimensionless variance
ε∗ – non-dimensional skewness
ψ – spongy structure parameter of the spongy region
e1 – porosity
l1 – thickness of the spongy facing

references

[1] j.-r. lin, r.-f. lu, w.-h. liao. analysis of magneto-hydrodynamic squeeze film characteristics between curved annular plates. industrial lubrication and tribology 56(5):300-305, 2004. https://doi.org/10.1108/00368790410550714.
[2] r. m. patel, g. deheri. magnetic fluid based squeeze film between porous conical plates. industrial lubrication and tribology 59(3):143-147, 2007. https://doi.org/10.1108/00368790710746110.
[3] p. vadher, g. deheri, r. patel. performance of hydromagnetic squeeze films between conducting porous rough conical plates. meccanica 45(6):767-783, 2010. https://doi.org/10.1007/s11012-010-9279-y.
[4] p. i. andharia, g. deheri.
longitudinal roughness effect on magnetic fluid-based squeeze film between conical plates. industrial lubrication and tribology 62(5):285-291, 2010. https://doi.org/10.1108/00368791011064446.
[5] j.-r. lin. derivation of ferrofluid lubrication equation of cylindrical squeeze films with convective fluid inertia forces and application to circular disks. tribology international 49:110-115, 2012. https://doi.org/10.1016/j.triboint.2011.11.006.
[6] p. andharia, g. deheri. performance of magnetic-fluid-based squeeze film between longitudinally rough elliptical plates. international scholarly research notices 2013:482604, 2013. https://doi.org/10.5402/2013/482604.
[7] j. r. patel, g. deheri. shliomis model based magnetic fluid lubrication of a squeeze film in rotating rough curved circular plates. caribbean journal of sciences and technology (cjst) 1:138-150, 2013.
[8] r. m. patel, g. deheri, p. vadher. hydromagnetic rough porous circular step bearing. eastern academic journal 3:71-87, 2015.
[9] j. adeshara, h. patel, g. deheri. theoretical study of hydromagnetic s.f. rough truncated conical plates with kozeny-carman model based spongy structure. proceedings on engineering sciences 2(4):389-400, 2020. https://doi.org/10.24874/pes0204.006.
[10] r. goraj. theoretical study on a novel electromagnetically supported hydrodynamic bearing under static loads. tribology international 119:775-785, 2018. https://doi.org/10.1016/j.triboint.2017.09.021.
[11] d. lu, w. zhao, b. lu, j. zhang. static characteristics of a new hydrodynamic-rolling hybrid bearing. tribology international 48:87-92, 2012. https://doi.org/10.1016/j.triboint.2011.11.010.
[12] b. halambi, b. n. hanumagowda. micropolar squeeze film lubrication analysis between rough porous elliptical plates and surface roughness effects under the mhd. ilkogretim online 20(4):307-319, 2021. https://doi.org/10.17051/ilkonline.2021.04.33.
[13] y. menni, a. j. chamkha, a. azzi. nanofluid flow in complex geometries – a review. journal of nanofluids 8(5):893-916, 2019. https://doi.org/10.1166/jon.2019.1663.
[14] j. patel, g. deheri. influence of viscosity variation on ferrofluid based long bearing. reports in mechanical engineering 3(1):37-45, 2022. https://doi.org/10.31181/rme200103037j.
[15] b. c. majumdar. introduction to tribology of bearings. ah wheeler & company, india, 1986.
[16] s. irmay. flow of liquid through cracked media. bulletin of the research council of israel 5(1):84, 1955.

a complexity and quality evaluation of block based motion estimation algorithms

s. usama, m. montaser, o. ahmed

motion estimation is a method by which temporal redundancies are reduced, which is an important aspect of video compression algorithms. in this paper we present a comparison among some of the well-known block based motion estimation algorithms. a performance evaluation of these algorithms is proposed to decide the best algorithm from the point of view of complexity and quality, for noise-free video sequences and also for noisy video sequences.

keywords: motion estimation, block matching, quality evaluation, complexity evaluation.

1 introduction

interframe predictive coding is used to eliminate the large amount of temporal and spatial redundancy that exists in video sequences, and helps in compressing them.
in conventional predictive coding the difference between the current frame and the predicted frame (based on the previous frame) is coded and transmitted. the better the prediction, the smaller the error and hence the transmission bit rate. if a scene is still, then a good prediction for a particular pel in the current frame is the same pel in the previous frame, and the error is zero. however, when there is motion in a sequence, then a pel on the same part of the moving object is a better prediction for the current pel. the use of knowledge about the displacement of an object in successive frames is called motion compensation. there are a large number of motion compensation algorithms for interframe predictive coding. in this study, however, we have focused on a single class of such algorithms, called block matching algorithms. these algorithms estimate the amount of motion on a block-by-block basis, i.e. for each block in the current frame a block from the previous frame is found that is said to match this block, based on a certain criterion. there are a number of criteria to evaluate the "goodness" of the match, some of which are:

1. pixel difference classification (pdc),
2. mean absolute difference (mad),
3. mean squared difference (msd).

mean absolute difference (mad) is the most commonly used cost function, since it does not need a multiplication operation. pdc counts the number of matching pixels between two blocks. mathematically, these cost functions can be defined as

$$\mathrm{mad}(i,j) = \frac{1}{nm}\sum_{k=0}^{m-1}\sum_{l=0}^{n-1}\bigl|c(x+k,\,y+l)-r(x+i+k,\,y+j+l)\bigr|,$$

$$\mathrm{msd}(i,j) = \frac{1}{nm}\sum_{k=0}^{m-1}\sum_{l=0}^{n-1}\bigl(c(x+k,\,y+l)-r(x+i+k,\,y+j+l)\bigr)^2,$$

$$\mathrm{pdc}(i,j) = \sum_{k=0}^{m-1}\sum_{l=0}^{n-1} t_{i,j}(k,l),$$

where, for a given threshold t,

$$t_{i,j}(k,l) = \begin{cases}1 & \text{if } \bigl|c(x+k,\,y+l)-r(x+i+k,\,y+j+l)\bigr|\le t,\\[2pt] 0 & \text{otherwise},\end{cases}$$

where $r(x+i+k,\,y+j+l)$ and $c(x+k,\,y+l)$ are pixels of the reference frame's block and the current frame's block, respectively, and the motion vector is defined by (i, j).

2 search algorithms

in this section some well-known algorithms are introduced, including fs, tss, 4ss, osa, ots, cbosa, csa, mrvbs, and multi-resolution algorithms.

2.1 full search algorithm (fs)

the simplest method to find the motion vector for each macro-block is to compute a certain cost function at each location in the search space. this is referred to as the full search algorithm. the cost function used in the full search algorithm is the mean absolute difference (mad). the best matching block is the reference block for which mad(i, j) is minimized; the coordinates (i, j) of this block define the motion vector. the main problem of the full search algorithm is its computational complexity, which can be estimated as follows [1]. for each motion vector there are $(2p+1)^2$ search locations. at each location (i, j), n×m pixels are compared. each pixel comparison requires four operations, namely a subtraction, an absolute-value calculation, one addition, and one division, if the cost of accessing the pixels $c(x+k,\,y+l)$ and $r(x+i+k,\,y+j+l)$ is ignored. thus the total complexity per macro-block is $(2p+1)^2\times nm\times 4$ operations. then, for a frame resolution of i×j and a frame rate of f frames per second, the overall complexity is

$$\text{complexity} = \frac{ijf}{nm}\,(2p+1)^2\,nm\times 4 = 4\,ijf\,(2p+1)^2 \ \text{operations/sec}.$$

for example, for typical values for broadcast tv with n = m = 16, i = 720, j = 480 and f = 30, motion estimation based on the full search algorithm requires 39.85 gops (giga operations per second) for p = 15, and 9.32 gops for p = 7. this example shows that the full search algorithm is computationally expensive, but it guarantees finding the minimum mad value. due to the high computational complexity of the full search, alternative search methods are desirable.
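as an illustration of the mad criterion and the exhaustive search just described, here is a compact sketch. the array layout, helper names and the toy example are our assumptions; a production encoder would handle rectangular blocks and frame borders more carefully.

```python
# minimal full-search block matcher using the mad criterion; frames are
# 2-d numpy arrays, (x, y) is the top-left corner of the current n x n
# block, and p is the maximum displacement. illustrative sketch only.
import numpy as np

def mad(cur_block, ref_block):
    """mean absolute difference between two equally sized blocks."""
    return np.mean(np.abs(cur_block.astype(np.int32)
                          - ref_block.astype(np.int32)))

def full_search(cur, ref, x, y, n=16, p=7):
    """return the motion vector (i, j) minimising mad over [-p, p]^2."""
    block = cur[y:y + n, x:x + n]
    best, best_mv = np.inf, (0, 0)
    for j in range(-p, p + 1):
        for i in range(-p, p + 1):
            yy, xx = y + j, x + i
            if 0 <= yy and yy + n <= ref.shape[0] \
                    and 0 <= xx and xx + n <= ref.shape[1]:
                cost = mad(block, ref[yy:yy + n, xx:xx + n])
                if cost < best:
                    best, best_mv = cost, (i, j)
    return best_mv, best

# toy check: shift a random frame by a known amount and recover it
rng = np.random.default_rng(1)
ref = rng.integers(0, 256, (64, 64), dtype=np.uint8)
cur = np.roll(ref, shift=(2, -3), axis=(0, 1))  # cur(y,x) = ref(y-2, x+3)
print(full_search(cur, ref, x=24, y=24))        # expect ((3, -2), 0.0)
```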
2.2 three step search algorithm (tss)

this algorithm [2] is simple and robust and also provides near-optimal performance, so it has become very popular. it searches for the best motion vector in a coarse-to-fine search pattern. it can compute displacements of up to 7 pixels. the algorithm may be described as follows:

step 1: an initial step size is chosen. eight blocks at a distance of the step size from the center (around the center block) are picked for comparison.
step 2: the step size is halved. the center is moved to the point with the minimum distortion.

steps 1 and 2 are repeated until the step size is equal to 1. one problem that occurs with the three step search is that it uses a uniformly allocated checking point pattern in the first step, which becomes inefficient for small motion estimation. for each motion vector there are (8×3 + 1) = 25 search locations. at each location (i, j), n×m pixels are compared, and each pixel comparison again requires four operations, if the cost of accessing the pixels $c(x+k,\,y+l)$ and $r(x+i+k,\,y+j+l)$ is ignored. thus the total complexity per macro-block is 25×nm×4 operations. then, for a frame resolution of i×j and a frame rate of f frames per second, the overall complexity is

$$\text{complexity} = \frac{ijf}{nm}\times 25\times nm\times 4 = 100\,ijf \ \text{operations/sec}.$$

for example, if i = 720, j = 480 and f = 30, then the overall complexity is equal to 1036.8 mops, which is just 2.6 % of the operations required by the full search.
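the three step search maps directly onto a short routine. the sketch below is a minimal reading of steps 1-2 above (step sizes 4, 2, 1, giving a ±7 search range), with our own helper names rather than the authors' code.

```python
# sketch of the three step search: evaluate the centre and its 8 neighbours
# at the current step size, recentre on the best point, halve the step, and
# stop after the step-size-1 pass. self-contained, illustrative only.
import numpy as np

def mad(a, b):
    return np.mean(np.abs(a.astype(np.int32) - b.astype(np.int32)))

def three_step_search(cur, ref, x, y, n=16):
    block = cur[y:y + n, x:x + n]
    cx, cy, step = 0, 0, 4            # centre offset and initial step size
    while step >= 1:
        best, best_off = np.inf, (cx, cy)
        for dj in (-step, 0, step):
            for di in (-step, 0, step):
                i, j = cx + di, cy + dj
                yy, xx = y + j, x + i
                if 0 <= yy and yy + n <= ref.shape[0] \
                        and 0 <= xx and xx + n <= ref.shape[1]:
                    cost = mad(block, ref[yy:yy + n, xx:xx + n])
                    if cost < best:
                        best, best_off = cost, (i, j)
        cx, cy = best_off
        step //= 2
    return (cx, cy)
```

note that at most 25 offsets are evaluated (9 in the first pass, then 8 new ones in each of the two refinement passes), which is exactly the count used in the complexity estimate above.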
2.3 four step search (4ss)

this algorithm [3] is based on the real-world image sequence's characteristic of center-biased motion. the algorithm starts with a nine-point comparison and then the other points for comparison are selected on the basis of the following algorithm:

step 1: start with a step size of 2. pick nine points around the search window center. calculate the distortion and find the point with the smallest distortion. if this point is found to be the center of the searching area, go to step 4, otherwise go to step 2.
step 2: move the center to the point with the smallest distortion. the step size is maintained at 2. the search pattern, however, depends on the position of the previous minimum distortion.
a) if the previous minimum point is located at the corner of the previous search area, five points are picked.
b) if the previous minimum distortion point is located at the middle of the horizontal or vertical axis of the previous search window, three additional checking points are picked.
locate the point with the minimum distortion. if this is at the center, go to step 4, otherwise go to step 3.
step 3: the search pattern strategy is the same; however, it will finally go to step 4.
step 4: the step size is reduced to 1 and all nine points around the center of the search are examined.

the computational complexity of the four step search is less than that of the three step search, while the performance in terms of quality is as good.

2.4 multi-resolution algorithms

spatial multi-resolution video sequences provide video at multiple frame sizes, allowing extraction of only the resolution or bit rate required by the user. to illustrate the efficiency of multi-resolution based algorithms [1] in comparison with full-frame based algorithms, assume that the current frame and the reference frame are decomposed into two levels by applying a simple averaging filter (2×2) twice, and that the fs algorithm is used in the lowest resolution (level 2). the complexity of the algorithm is then as follows.

level 2: assume the parameters for broadcast tv (720×480 at 30 frames per second). the picture size in level 2 is 180×120, the macroblock size is 4×4, and the number of macroblocks is (180×120)/(4×4) = 1,350 at 30 frames/second. the searching window is rescaled to [−p/4, p/4]. if p = 15, the searching window in level 2 is [−4, 4], thus the number of search locations is (2×4 + 1)² = 81. the complexity for level 2 is 30×180×120×81×4 = 209.952 mops.

level 1: in this level the picture size is 360×240 and the macroblock size is 8×8. the number of macroblocks is 1,350 at 30 frames/second. in this level there is just a search for the best match within the center resulting from level 2 and its eight neighbors, so the searching window is [−1, 1] and the number of search locations is 9. the complexity for level 1 is 30×240×360×9×4 = 93.312 mops.

level 0: the picture size is 720×480 and the macroblock size is 16×16. the number of macroblocks is 1,350 at 30 frames/second. in this level there is again just a search for the best match within the center resulting from level 1 and its eight neighbors, so the searching window is [−1, 1] and the number of search locations is 9. the complexity for level 0 is 30×480×720×9×4 = 373.248 mops.

the total complexity of this algorithm is therefore 209.952 + 93.312 + 373.248 = 676.512 mops, a significant reduction compared with the 39.85 gops needed by the fs algorithm. from the complexity point of view the multi-resolution search algorithm is very efficient; however, such a method requires increased storage due to the need to keep pictures at different resolutions. also, because the search starts at the lowest resolution, small objects may be completely eliminated and thus fail to be tracked. on the other hand, the creation of low-resolution pictures provides some immunity to noise.
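the operation counts quoted in sections 2.1, 2.2 and 2.4 can be verified with a few lines of arithmetic. the helper below is only a convenience for reproducing those numbers, assuming 4 operations per pixel comparison throughout, as in the estimates above.

```python
# back-of-the-envelope operation counts for fs, tss and the two-level
# multi-resolution scheme; reproduces 39.85 gops, 1036.8 mops, 676.512 mops.
def ops_per_second(width, height, fps, locations, ops_per_pixel=4):
    # every pixel of the frame is visited once per candidate location
    return width * height * fps * locations * ops_per_pixel

fs = ops_per_second(720, 480, 30, (2 * 15 + 1) ** 2)
tss = ops_per_second(720, 480, 30, 25)
mr = (ops_per_second(180, 120, 30, (2 * 4 + 1) ** 2)  # level 2, full search
      + ops_per_second(360, 240, 30, 9)               # level 1, refinement
      + ops_per_second(720, 480, 30, 9))              # level 0, refinement

print(f"fs : {fs / 1e9:.2f} gops")
print(f"tss: {tss / 1e6:.1f} mops")
print(f"mr : {mr / 1e6:.3f} mops")
```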
2.5 wavelet based algorithms [5]

an efficient multi-resolution tool is the wavelet transform, so we review a robust algorithm based on the wavelet transformation: the mrvbs (multi-resolution variable block size) algorithm. it is based on a central search process in three layers, namely layer 2, layer 1, and layer 0 (the original frame). mad is used as a cost function. the main steps are described as follows. first, the current frame and the previous frame are decomposed into two layers of the wavelet domain.

step 1: in layer 2, the central search process is applied on the low band, i.e., searching for the best match within the nine neighboring blocks to get an initial motion vector. the block size used in this step is 4×4, and the estimated motion vector is used as the new center for the central search process for the details.
step 2: the estimated motion vector of the previous step is rescaled and used as the new center for the three highest bands in layer 1, with block size 8×8.
step 3: from the estimated motion vectors of step 2, the median values are chosen to be rescaled into layer 0 and then used as a new center to estimate the final motion vector, using block size 16×16.

the computational cost of this algorithm, without the wavelet complexity, is (36·p2 + 27·p1 + 9·p0), where p2, p1 and p0 are the block sizes in layer 2, layer 1 and layer 0, respectively.

2.6 two dimensional logarithmic search (tdl)

this algorithm was introduced by jain & jain [6]. although this algorithm requires more steps than the three step search, it can be more accurate, especially when the search window is large. the algorithm can be described as follows:

step 1: choose an initial step size s = 2^j. look at the block at the center of the search and the four blocks at a distance s from it on the x and y axes (the five positions form a "+" sign).
step 2: if the position of the best match is at the center, halve the step size. if, however, one of the other four points is the best match, then it becomes the new center, and step 1 is repeated.
step 3: when the step size becomes 1, all nine blocks around the center are chosen for the search and the best among them is picked as the required block.

many variations of this algorithm exist, and they differ mainly in the way in which the step size is changed. some argue that the step size should be halved at every stage; some believe that the step size should also be halved if an edge of the search space is reached.

2.7 orthogonal search algorithm (osa)

this algorithm was introduced by puri [7], and it is a hybrid of the three step search and the two dimensional logarithmic search. it has a vertical stage followed by a horizontal stage for the search for the optimal block. the algorithm may be described as follows:

step 1: pick a step size (usually half the maximum displacement in the search window). take two points at a distance of the step size in the horizontal direction from the center of the search window and locate (among these) the point of minimum distortion. move the center to this point.
step 2: take two points at a distance of the step size from the center in the vertical direction and find the point with the minimum distortion.
step 3: if the step size is greater than one, halve it and repeat steps 1 and 2; otherwise, halt.

2.8 center-biased orthogonal search algorithm (cbosa)

the cbosa algorithm [8] for finding small motion is described below. the cbosa algorithm is a modification of the orthogonal search algorithm (osa), which is reviewed in section 2.7. the osa algorithm has faster convergence, fewer checking points and fewer searching steps; however, the performance of osa in terms of mse is much lower than that of 3ss and other fast bmas.
this is because the osa algorithm does not make use of the center-biased motion vector distribution characteristic of real world video sequences. in order to tackle this drawback, the cbosa algorithm uses a smaller step size in the first step so as to increase the probability of catching the global minimum point. for a maximum motion displacement of 7 in both the horizontal and vertical directions, the cbosa algorithm uses three horizontal checking points with a step size of 2 in the first step (step 1–h). if the minimum bdm (block distortion measure) is at the center, it jumps to the vertical step (step 1–v); otherwise, one more checking point is searched in the horizontal direction. this extra step makes sure that the algorithm can cover the whole search window even when using a small step size of 2 in the first step. using the minimum bdm point found in step 1–h, step 1–v uses the same searching strategy as step 1–h to search in the vertical direction. then the algorithm jumps to step 2–h and step 2–v, respectively; these two steps use three checking points, also with a step size of 2, in the horizontal and vertical directions, respectively. after step 2–v, the algorithm jumps to step 3–h and step 3–v, which use three checking points with the step size reduced to 1 in the horizontal and vertical directions, respectively. thus, the number of checking points required by the cbosa algorithm varies from (3+2+2+2) = 9 to (3+1+2+1+2+2+2+2) = 15. the worst case computational requirement is just 2 checking points more than that of osa, but it is 10 checking points fewer than for 3ss.

2.9 one at a time algorithm (ots) [9]

this is a simple but effective way of trying to find the point with the optimal block. during the horizontal stage, the point in the horizontal direction with the minimum distortion is found; then, starting with this point, the minimum distortion in the vertical direction is found. the algorithm may be described as follows:

step 1: pick three points about the center of the search window (horizontal).
step 2: if the smallest distortion is for the center point, start the vertical stage; otherwise look at the next point in the horizontal direction closer to the point with the smallest distortion (from the previous stage). continue looking in that direction till you find the point with the smallest distortion (going in the same direction, the point next to it must have a larger distortion).
step 3: repeat the above, but taking points in the vertical direction about the point that has the smallest distortion in the horizontal direction.

this search algorithm requires very little time; however, the quality of the match is not very good.
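a minimal numpy sketch of the one at a time search follows; the helper names and the explicit window bound p are our own illustrative choices, not code from the paper:

```python
import numpy as np

def one_at_a_time_search(cur, ref, x, y, n=16, p=7):
    """one at a time search: walk along x while the cost keeps dropping,
    then walk along y from the best horizontal point."""
    block = cur[y:y + n, x:x + n].astype(np.int32)

    def cost(u, v):
        h, w = ref.shape
        if 0 <= x + u <= w - n and 0 <= y + v <= h - n:
            return np.mean(np.abs(block - ref[y + v:y + v + n,
                                              x + u:x + u + n].astype(np.int32)))
        return np.inf

    best_u = 0
    for sign in (-1, 1):                 # probe both horizontal neighbours
        u = best_u
        while abs(u + sign) <= p and cost(u + sign, 0) < cost(u, 0):
            u += sign                    # keep moving while the cost decreases
        if cost(u, 0) < cost(best_u, 0):
            best_u = u
    best_v = 0
    for sign in (-1, 1):                 # then repeat vertically about (best_u, 0)
        v = best_v
        while abs(v + sign) <= p and cost(best_u, v + sign) < cost(best_u, v):
            v += sign
        if cost(best_u, v) < cost(best_u, best_v):
            best_v = v
    return best_u, best_v
```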
2.10 cross search algorithm (csa)

this algorithm was introduced by m. ghanbari [10]. the basic idea in this algorithm is still a logarithmic step search; however, the main difference between this and the logarithmic search method presented above is that the search locations picked are the end points of an "x" rather than a "+". the algorithm may be described as follows:

step 1: the center block is compared with the current block, and if the distortion is less than a certain threshold, the algorithm stops.
step 2: pick the first set of points in the shape of an "x" around the center (the step size picked is usually half the maximum displacement). move the center to the point of minimum distortion.
step 3: if the step size is greater than 1, halve it and repeat step 2; otherwise go to step 4.
step 4: if in the final stage the point of minimum distortion is the bottom left or the top right point, then evaluate the distortion at 4 more points around it with a search area of a "+". if, however, the point of minimum distortion is the top left or bottom right point, evaluate the distortion at 4 more points around it in the shape of an "x".

3 performance comparison of the motion estimation algorithms

in this section we will introduce a comparison between some of the most efficient algorithms from different points of view. the well known fs, 4ss, 3ss, osa, cbosa, ots, and csa algorithms are compared from the psnr point of view as well as from the complexity point of view. the comparison is performed for noise-free sequences as well as noisy sequences with different snr.

3.1 complexity point of view

fs algorithm: the fs algorithm searches for the best match within a large window [−p : p]×[−p : p]. this means that it searches for the best match among (2p+1)² blocks. thus, for the simplest cost function, mae, three operations are performed per pixel, namely one addition, one absolute-value computation, and one subtraction; the total operation number for just one block matching is then equal to 4nm(2p+1)², where n and m are the block dimensions, and the total operation number per frame is 4ij(2p+1)², where i and j are the frame dimensions. this is too many operations, and requires very high speed processors.

tss algorithm: the tss algorithm searches for the best match within the [−p : p]×[−p : p] window. here p is equal to 7, but only blocks in this window with a certain step are checked. the total number of checked blocks is 25. this means that the total operation count per frame is 75(ij), so it requires just 2.6 % of the operations required by the fs algorithm (with p = 15). note that data access is not taken into consideration.

4ss algorithm: in 4ss, certain conditions are inserted for jumping between steps to overcome computation overlap. the total number of checked blocks varies between a maximum value of 27 and a minimum value of 17; on average, it requires 22 blocks to be checked. this means that the total operation count per frame is 66(ij), requiring just 2.289 % of the operations of the fs algorithm.

tdl algorithm: in this algorithm, for a maximum displacement of 7, the required number of checking points varies from (5+8) = 13 to (5+4+8) = 17.

osa algorithm: for a maximum displacement of 7, the osa algorithm requires (3+2+2+2+2+2) = 13 checking points.

cbosa algorithm: the number of checking points required by the cbosa algorithm varies from (3+2+2+2) = 9 to (3+1+2+1+2+2+2+2) = 15. it is very fast compared with the fss, fs, or tss algorithms, but it requires 2 more checking points than the osa algorithm.

csa algorithm: for a maximum displacement of 7, the csa algorithm requires (5+4+4) = 13 checking points. this can be formulated in the general form 5 + 4·log₂(w), where w is the initial step size; for example, w is chosen to equal 4 for a maximum displacement of 7.

ots algorithm: the ots algorithm is very attractive from the computation point of view. the number of checking points required by the ots algorithm varies from (3+2) = 5 to (3+1+1+1+1+1+1+2+1+1+1+1+1+1) = 17, and the number of checking points may take the values 5, 6, 7, 8, …, 17. an advantage of this algorithm is that it adds only one checking point at a time till reaching the minimum distortion.
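for reference, an exhaustive full search is only a few lines; the sketch below is our own illustration, and it also reproduces the 2.6 % ratio quoted above when the same per-pixel operation count is charged to both algorithms:

```python
import numpy as np

def full_search(cur, ref, x, y, n=16, p=7):
    """exhaustive search over all (2p+1)^2 displacements in [-p, p]^2;
    guaranteed to find the minimum-mad match inside the window."""
    h, w = ref.shape
    block = cur[y:y + n, x:x + n].astype(np.int32)
    best, best_cost = (0, 0), np.inf
    for v in range(-p, p + 1):
        for u in range(-p, p + 1):
            if 0 <= x + u <= w - n and 0 <= y + v <= h - n:
                cost = np.mean(np.abs(block - ref[y + v:y + v + n,
                                                  x + u:x + u + n].astype(np.int32)))
                if cost < best_cost:
                    best, best_cost = (u, v), cost
    return best, best_cost

# checked-block ratio, with the same per-pixel cost charged to both algorithms:
print(25 / (2 * 15 + 1) ** 2)   # 25 tss blocks vs 961 fs blocks -> ~0.026
```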
3.2 quality point of view

in this section we introduce the simulation results for a comparison between some well known algorithms. in the simulation we used two different techniques to search for the best matching blocks:

1. searching within non-overlapped blocks in the search area, as in fig. 1.
2. searching within overlapped blocks, as in fig. 2.

fig. 1: searching within non-overlapped blocks
fig. 2: searching within overlapped blocks

it is clear from figs. 1 and 2 that the difference between the overlapped and non-overlapped block techniques is that the displacement in the overlapped blocks is in pixels, while in the non-overlapped blocks it is an integer number of block sizes. thus, for the same complexity, the searching window is (2p+1)²·n² pixels for the non-overlapped technique and (2p+1)² pixels for the overlapped technique. these two searching windows give the same searching points, and consequently the same complexity.

the comparison between different algorithms in this section will indicate the effect of three major factors in motion estimation algorithms:
1) the cost function.
2) the block size.
3) the addition of external noise.
the effect of these factors is simulated with almost all the motion types.

3.2.1 the effect of the cost function

the cost function is one of the major factors that affect the complexity of the motion estimation algorithm and consequently its performance. in this section, two well known and widely used cost functions are compared from the point of view of complexity and of their effect on the performance of different algorithms.

mean square difference (msd): to execute an msd equation, four operations have to be performed, namely one subtraction, one addition, one squaring operation (multiplication), and one division. these operations are performed in addition to the data accessing.

mean absolute difference (mad): mad also requires four operations, namely one subtraction, one absolute-value computation, one addition, and one division, plus data accessing.
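in code the two cost functions differ only in one elementwise operation; a small numpy sketch (ours, for illustration) of both:

```python
import numpy as np

def msd(a, b):
    # mean square difference: subtract, square (multiply), accumulate, divide
    d = a.astype(np.int32) - b.astype(np.int32)
    return np.mean(d * d)

def mad(a, b):
    # mean absolute difference: the multiply is replaced by a cheaper abs
    return np.mean(np.abs(a.astype(np.int32) - b.astype(np.int32)))
```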
| cost | algorithm | claire | akiyo | mot_daug | salesman | grandma | suzie | container | mis_am | football | carfone | foreman | silent | news |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| msd | fs | 43.5474 | 44.4123 | 37.2656 | 39.0550 | 44.3305 | 36.8690 | 43.7964 | 37.1411 | 23.5908 | 33.8072 | 32.6524 | 36.3689 | 38.6385 |
| msd | 3ss | 43.5448 | 44.4123 | 37.1765 | 39.1357 | 44.3257 | 36.7711 | 43.7967 | 37.1357 | 22.5886 | 33.6144 | 32.3541 | 36.2465 | 38.5294 |
| msd | 4ss | 43.3002 | 44.2467 | 36.0448 | 38.5753 | 44.2794 | 35.2129 | 43.7950 | 36.9039 | 21.4003 | 31.7651 | 29.3743 | 35.2058 | 38.2778 |
| msd | tdl | 43.5448 | 44.4123 | 37.1115 | 39.1318 | 44.3270 | 36.3258 | 43.7966 | 37.0261 | 21.9381 | 33.5907 | 32.1399 | 36.2873 | 38.6219 |
| msd | osa | 43.5396 | 44.3946 | 37.0775 | 39.0355 | 44.3205 | 36.5008 | 43.7967 | 37.1225 | 21.9206 | 33.4743 | 32.1316 | 36.1810 | 38.6103 |
| msd | cbosa | 43.5394 | 44.3946 | 37.0626 | 39.0436 | 44.3205 | 36.4833 | 43.7967 | 37.1220 | 21.7396 | 33.4637 | 32.1477 | 36.1252 | 38.6236 |
| msd | ots | 43.5368 | 44.3963 | 37.0518 | 39.0167 | 44.3151 | 36.0907 | 43.7965 | 37.0188 | 21.5874 | 33.5764 | 31.9536 | 36.0010 | 38.5794 |
| msd | csa | 43.5214 | 44.4123 | 36.9050 | 38.9088 | 44.3191 | 36.1068 | 43.7966 | 37.0215 | 21.8452 | 33.2189 | 31.7770 | 36.0568 | 38.5700 |
| mad | fs | 43.5145 | 44.5667 | 37.1630 | 39.0550 | 44.3245 | 36.8167 | 43.7964 | 37.1225 | 22.9285 | 33.7505 | 32.6306 | 36.1862 | 38.5197 |
| mad | 3ss | 43.5123 | 44.3828 | 37.0821 | 39.0204 | 44.3198 | 36.6795 | 43.7964 | 37.1067 | 22.5110 | 33.5387 | 32.2222 | 36.1050 | 38.5105 |
| mad | 4ss | 42.8956 | 44.0538 | 36.5073 | 38.6163 | 44.2609 | 35.8949 | 43.7964 | 37.0504 | 21.3811 | 32.5986 | 30.5635 | 35.7104 | 38.0756 |
| mad | tdl | 43.5123 | 44.3828 | 37.0242 | 39.0184 | 44.3169 | 36.2371 | 43.7963 | 37.0004 | 21.8857 | 33.5071 | 32.0957 | 36.0065 | 38.5076 |
| mad | osa | 43.4960 | 44.3812 | 36.9955 | 38.9326 | 44.3095 | 36.4073 | 43.7963 | 37.0951 | 21.8874 | 33.4039 | 31.9828 | 35.9547 | 38.5076 |
| mad | cbosa | 43.4959 | 44.3812 | 36.9720 | 38.9380 | 44.3110 | 36.4230 | 43.7963 | 37.0946 | 21.6624 | 33.3958 | 32.1301 | 35.8209 | 38.5257 |
| mad | ots | 43.4798 | 44.3828 | 36.9568 | 38.9377 | 44.3065 | 36.0126 | 43.7963 | 36.9939 | 21.4116 | 33.5067 | 31.9247 | 35.7376 | 38.4877 |
| mad | csa | 43.4879 | 44.3828 | 36.8263 | 38.8235 | 44.3119 | 36.0175 | 43.7963 | 36.9952 | 21.8158 | 33.1682 | 31.7464 | 35.8129 | 38.4893 |

table 1: the effect of the cost function on different algorithms using the overlapped block technique (psnr, db)

it is clear that mad is simpler than msd, because an absolute-value evaluation is required rather than the squaring operation. mad is therefore preferable to msd from the complexity point of view. tables 1 and 2 present a comparison between the msd and mad cost functions for different algorithms and with different noise-free video sequences from the psnr point of view, using overlapped blocks and non-overlapped blocks, respectively.

| cost | algorithm | claire | akiyo | mot_daug | salesman | grandma | suzie | container | mis_am | football | carfone | foreman | silent | news |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| msd | fs | 42.7768 | 44.0277 | 35.8226 | 38.0863 | 44.2413 | 34.6745 | 43.7960 | 36.9393 | 21.1449 | 31.5224 | 28.7398 | 35.1039 | 37.9532 |
| msd | 3ss | 42.7768 | 44.0277 | 35.7407 | 38.0771 | 44.2413 | 34.6135 | 43.7960 | 36.9288 | 20.5063 | 31.5180 | 28.7127 | 34.9695 | 37.9305 |
| msd | 4ss | 42.7768 | 44.0277 | 35.5863 | 38.0771 | 44.2326 | 34.5888 | 43.7960 | 36.6986 | 20.1529 | 31.5064 | 28.7053 | 34.7968 | 37.9298 |
| msd | tdl | 42.7768 | 44.0277 | 35.6900 | 38.0771 | 44.2413 | 34.5729 | 43.7960 | 36.9077 | 20.2925 | 31.5153 | 28.6511 | 34.8341 | 37.9305 |
| msd | osa | 42.7768 | 44.0277 | 35.7155 | 38.0771 | 44.2413 | 34.6088 | 43.7960 | 36.9194 | 20.2273 | 31.5157 | 28.6499 | 34.8917 | 37.9305 |
| msd | cbosa | 42.7768 | 44.0277 | 35.7126 | 38.0771 | 44.2413 | 34.5949 | 43.7960 | 36.9098 | 20.0801 | 31.5156 | 28.6494 | 34.8287 | 37.9305 |
| msd | ots | 42.7768 | 44.0277 | 35.6906 | 38.0771 | 44.2412 | 34.5775 | 43.7960 | 36.9034 | 19.9627 | 31.5152 | 28.6464 | 34.7360 | 37.9305 |
| msd | csa | 42.7768 | 44.0277 | 35.6892 | 38.0771 | 44.2413 | 34.5731 | 43.7960 | 36.9076 | 20.1619 | 31.5153 | 28.6465 | 34.8313 | 37.9305 |
| mad | fs | 42.7768 | 44.0277 | 35.7964 | 38.0802 | 44.2409 | 34.6410 | 43.7960 | 36.9267 | 20.8890 | 31.5072 | 28.7184 | 34.9766 | 37.9268 |
| mad | 3ss | 42.7768 | 44.0277 | 35.7332 | 38.0771 | 44.2409 | 34.5961 | 43.7960 | 36.9157 | 20.3351 | 31.5032 | 28.7028 | 34.8675 | 37.9268 |
| mad | 4ss | 42.7768 | 44.0277 | 35.7075 | 38.0771 | 44.2402 | 34.5646 | 43.7960 | 36.8977 | 20.0559 | 31.4973 | 28.6327 | 34.7843 | 37.9268 |
| mad | tdl | 42.7768 | 44.0277 | 35.6866 | 38.0771 | 44.2409 | 34.5559 | 43.7960 | 36.9036 | 20.1449 | 31.5023 | 28.6475 | 34.7554 | 37.9268 |
| mad | osa | 42.7768 | 44.0277 | 35.7074 | 38.0771 | 44.2408 | 34.5793 | 43.7960 | 36.9111 | 20.0946 | 31.5026 | 28.6447 | 34.8347 | 37.9268 |
| mad | cbosa | 42.7768 | 44.0277 | 35.7085 | 38.0771 | 44.2408 | 34.5720 | 43.7960 | 36.9003 | 19.9489 | 31.5026 | 28.6447 | 34.7537 | 37.9268 |
| mad | ots | 42.7768 | 44.0277 | 35.6891 | 38.0771 | 44.2408 | 34.5551 | 43.7960 | 36.8982 | 19.8432 | 31.5022 | 28.6431 | 34.6913 | 37.9267 |
| mad | csa | 42.7768 | 44.0277 | 35.6859 | 38.0771 | 44.2409 | 34.5561 | 43.7960 | 36.9036 | 20.0297 | 31.5023 | 28.6431 | 34.7388 | 37.9268 |

table 2: the effect of the cost function on different algorithms using the non-overlapped block technique (psnr, db)

fig. 3: the effect of the cost function

fig. 3 shows an example of the effect of the cost function. in this example the tss algorithm is used, with a constant block size (16×16) for the two cases, using the overlapped blocks technique. this example illustrates that msd achieves better quality than mad (higher psnr), but the improvement in the psnr is small in comparison with the increase in complexity. we can therefore conclude that mad is better than msd when complexity and quality are traded off.

3.2.2 the effect of adding external noise

video sequences are usually not pure; some noise almost always corrupts the sequences. noise may come from the camera (this is called camera noise), or it may come from the transmission lines. the algorithm is therefore required to be robust against the addition of noise. in this section the robustness of some algorithms is tested. here we used white gaussian noise, with snr of 25 db and 20 db; the results are shown in tables 3, 4, 5, and 6. also, an example indicating the relation between noise (snr) and quality (psnr) is shown in fig. 4: fs, tss, tdl, and 4ss are used to compare their robustness to noise, and another example in the same figure indicates the robustness of one algorithm (tss chosen) with different video sequences. these two examples show that as the noise increases the quality decreases, and that the fs algorithm is the best even with the addition of heavy noise. the tss algorithm is the second best algorithm for noisy sequences.
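the noisy test sequences can be produced as follows; the paper does not spell out its exact noise-generation procedure, so this numpy helper is just one standard way to hit a target snr:

```python
import numpy as np

def add_white_gaussian_noise(frame, snr_db):
    """corrupt an 8-bit frame with white gaussian noise at a given snr,
    where snr = 10*log10(signal power / noise power)."""
    f = frame.astype(np.float64)
    noise_power = np.mean(f ** 2) / (10.0 ** (snr_db / 10.0))
    noise = np.random.normal(0.0, np.sqrt(noise_power), f.shape)
    return np.clip(f + noise, 0, 255).astype(np.uint8)
```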
| cost | algorithm | claire | akiyo | mot_daug | salesman | grandma | suzie | container | mis_am | football | carfone | foreman | silent | news |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| msd | fs | 29.4134 | 29.6722 | 27.9724 | 31.3667 | 32.6476 | 28.3379 | 26.5496 | 29.2625 | 22.2051 | 27.5143 | 25.1267 | 26.8225 | 29.9344 |
| msd | 3ss | 29.3249 | 29.6530 | 27.9331 | 31.3790 | 32.6174 | 28.2959 | 26.5087 | 29.2079 | 21.4108 | 27.4111 | 25.0125 | 26.8069 | 29.9308 |
| msd | 4ss | 29.2215 | 29.5168 | 27.6073 | 31.0839 | 32.5855 | 27.7661 | 26.4546 | 29.1125 | 20.4853 | 26.7187 | 24.0649 | 26.5515 | 29.7430 |
| msd | tdl | 29.2646 | 29.6208 | 27.8652 | 31.3600 | 32.5782 | 28.0768 | 26.4690 | 29.1601 | 20.9107 | 27.3920 | 24.9583 | 26.7881 | 29.9217 |
| msd | osa | 29.2977 | 29.2772 | 27.8948 | 31.3532 | 32.5899 | 28.1825 | 26.4952 | 29.1804 | 20.8884 | 27.3524 | 24.9576 | 26.7506 | 29.9178 |
| msd | cbosa | 29.2772 | 29.6383 | 27.8735 | 31.3498 | 32.5825 | 28.1837 | 26.4729 | 29.1624 | 20.7383 | 27.3422 | 24.9522 | 26.7485 | 29.9188 |
| msd | ots | 29.2282 | 29.6234 | 27.8558 | 31.3407 | 32.5574 | 27.9863 | 26.4604 | 29.1183 | 20.5936 | 27.3513 | 24.9028 | 26.7119 | 29.9016 |
| msd | csa | 29.2768 | 29.6400 | 27.8504 | 31.3164 | 32.5828 | 28.0142 | 26.4755 | 29.1508 | 20.8304 | 27.2488 | 24.8637 | 26.7336 | 29.9008 |
| mad | fs | 29.3731 | 29.6440 | 27.9357 | 31.3662 | 32.6318 | 28.3129 | 26.5330 | 29.2356 | 21.7096 | 27.4818 | 25.1092 | 26.8004 | 29.9233 |
| mad | 3ss | 29.2870 | 29.6345 | 27.9031 | 31.3654 | 32.5891 | 28.2548 | 26.4859 | 29.1750 | 21.3939 | 27.3836 | 24.9587 | 26.7773 | 29.9067 |
| mad | 4ss | 29.1674 | 29.5069 | 27.5440 | 31.0566 | 32.5605 | 27.7385 | 26.4079 | 29.0629 | 20.4072 | 26.6927 | 24.0404 | 26.5357 | 29.7871 |
| mad | tdl | 29.2655 | 29.6242 | 27.8732 | 31.3542 | 32.5752 | 28.0672 | 26.4666 | 29.1400 | 20.8855 | 27.3786 | 24.9412 | 26.7557 | 29.8866 |
| mad | osa | 29.2553 | 29.6275 | 27.8682 | 31.3245 | 32.5648 | 28.1567 | 26.4643 | 29.1464 | 20.8823 | 27.3007 | 24.9019 | 26.7331 | 29.8984 |
| mad | cbosa | 29.2482 | 29.6196 | 27.8515 | 31.3331 | 32.5674 | 28.1686 | 26.4541 | 29.1291 | 20.7114 | 27.3091 | 24.9375 | 26.7147 | 29.8993 |
| mad | ots | 29.1998 | 29.6136 | 27.8311 | 31.3347 | 32.5497 | 27.9681 | 26.4319 | 29.0930 | 20.4992 | 27.3310 | 24.8916 | 26.6859 | 29.8788 |
| mad | csa | 29.2475 | 29.6129 | 27.8241 | 31.3043 | 32.5765 | 27.9798 | 26.4536 | 29.1222 | 20.8277 | 27.2225 | 24.8538 | 26.7089 | 29.8867 |

table 3: the effect of adding gaussian noise with snr = 25 db on different algorithms, using the overlapped block technique (psnr, db)

| cost | algorithm | claire | akiyo | mot_daug | salesman | grandma | suzie | container | mis_am | football | carfone | foreman | silent | news |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| msd | fs | 24.6454 | 24.9011 | 23.5174 | 27.0233 | 27.9679 | 23.9352 | 21.7355 | 24.8762 | 20.1625 | 23.4123 | 20.7877 | 22.3660 | 25.6711 |
| msd | 3ss | 24.5211 | 24.8633 | 23.4701 | 27.0220 | 27.9273 | 23.8821 | 21.6488 | 24.7843 | 19.6216 | 23.3344 | 20.7060 | 22.3418 | 25.6469 |
| msd | 4ss | 24.4304 | 24.7429 | 23.2883 | 26.8762 | 27.8972 | 23.5996 | 21.5808 | 24.6799 | 18.9677 | 22.9831 | 20.2567 | 22.1961 | 25.5331 |
| msd | tdl | 24.1756 | 24.6807 | 23.1540 | 26.8660 | 27.7889 | 23.2628 | 21.4326 | 24.5156 | 17.1585 | 22.8037 | 20.0480 | 22.0807 | 25.4713 |
| msd | osa | 24.4690 | 24.8423 | 23.4395 | 27.0024 | 27.9053 | 23.8089 | 21.6154 | 24.7411 | 19.2658 | 23.2817 | 20.6498 | 22.3179 | 25.6331 |
| msd | cbosa | 24.4520 | 24.8295 | 23.4238 | 27.0083 | 27.8949 | 23.8033 | 21.6038 | 24.7352 | 19.1476 | 23.2705 | 20.6522 | 22.3054 | 25.6251 |
| msd | ots | 24.3842 | 24.8056 | 23.3890 | 27.0009 | 27.8650 | 23.6878 | 21.5592 | 24.6723 | 19.0078 | 23.2578 | 20.6144 | 22.2789 | 25.6066 |
| msd | csa | 24.4600 | 24.8181 | 23.4024 | 26.9950 | 27.8948 | 23.7152 | 21.6043 | 24.7266 | 19.2243 | 23.2294 | 20.6068 | 22.3022 | 25.6183 |
| mad | fs | 24.5962 | 24.8687 | 23.4848 | 27.0217 | 27.9538 | 23.9111 | 21.7139 | 24.8364 | 19.8365 | 23.3842 | 20.7733 | 22.3443 | 25.6414 |
| mad | 3ss | 24.4744 | 24.8321 | 23.4439 | 27.0071 | 27.9037 | 23.8458 | 21.6215 | 24.7473 | 19.5976 | 23.2996 | 20.6707 | 22.3190 | 25.6275 |
| mad | 4ss | 24.3854 | 24.7030 | 23.2590 | 26.8540 | 27.8720 | 23.5583 | 21.5363 | 24.6361 | 18.9205 | 22.9048 | 20.2166 | 22.1685 | 25.4961 |
| mad | tdl | 24.4459 | 24.8089 | 23.4104 | 27.0036 | 27.8813 | 23.7428 | 21.5894 | 24.7060 | 19.2689 | 23.2678 | 20.6338 | 22.3028 | 25.6163 |
| mad | osa | 24.4389 | 24.8117 | 23.4037 | 26.9944 | 27.8869 | 23.7785 | 21.5824 | 24.7152 | 19.2500 | 23.2569 | 20.6096 | 22.2989 | 25.6077 |
| mad | cbosa | 24.4205 | 24.8077 | 23.3936 | 26.9983 | 27.8828 | 23.7750 | 21.5729 | 24.7009 | 19.1366 | 23.2532 | 20.6168 | 22.2807 | 25.6074 |
| mad | ots | 24.3610 | 24.7839 | 23.3691 | 26.9929 | 27.8573 | 23.6601 | 21.5393 | 24.6434 | 18.9239 | 23.2314 | 20.5835 | 22.2549 | 25.5929 |
| mad | csa | 24.4250 | 24.7997 | 23.3828 | 26.9821 | 27.8781 | 23.6838 | 21.5767 | 24.6980 | 19.2166 | 23.2077 | 20.5909 | 22.2828 | 25.6116 |

table 4: the effect of adding gaussian noise with snr = 20 db on different algorithms, using the overlapped block technique (psnr, db)
3.2.3 the effect of block size

the choice of macroblock size, or simply block size (n×m), is the result of tradeoffs among three conflicting requirements. specifically:

1. small values for n and m (from four to eight) are preferable, since the smoothness constraint would be easily met at this resolution;
2. small values for n and m reduce the reliability of the motion vector, since few pixels participate in the matching process;
3. fast algorithms for finding motion vectors are more efficient for larger values of n and m.

in this section we will show the effect of the block size on the performance of the algorithms. in the simulation, different block sizes (4×4, 8×8, and 16×16) are compared using both the overlapped block and non-overlapped block techniques. mad is used as a cost function, and the searching window used is 15×15 (searching area parameter p = 7). the simulation results are shown in tables 7 and 8. two different examples indicating the block size effect are shown in fig. 5; these two examples show that the psnr decreases as the block size increases, and the psnr gain from smaller blocks comes at the cost of increased computation time.

| cost | algorithm | claire | akiyo | mot_daug | salesman | grandma | suzie | container | mis_am | football | carfone | foreman | silent | news |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| msd | fs | 29.0930 | 29.5213 | 27.5724 | 31.0762 | 32.5427 | 27.4645 | 26.4322 | 29.0839 | 20.2670 | 26.5460 | 23.8262 | 26.5191 | 29.6984 |
| msd | 3ss | 29.0657 | 29.5130 | 27.5307 | 31.0702 | 32.5258 | 27.4107 | 26.4250 | 29.0498 | 19.7371 | 26.5291 | 23.8154 | 26.4794 | 29.6919 |
| msd | 4ss | 29.0476 | 29.5110 | 27.2323 | 31.0675 | 32.3083 | 27.3863 | 26.0877 | 27.8835 | 19.4417 | 26.0194 | 23.7917 | 26.4159 | 29.6272 |
| msd | tdl | 29.0479 | 29.5177 | 27.5052 | 31.0746 | 32.5254 | 27.3728 | 26.4076 | 29.0410 | 19.5542 | 26.5222 | 23.7758 | 26.4352 | 29.6916 |
| msd | osa | 29.0564 | 29.5128 | 27.5225 | 31.0722 | 32.5274 | 27.4079 | 26.4179 | 29.0408 | 19.5026 | 26.5223 | 23.7677 | 26.4488 | 29.6938 |
| msd | cbosa | 29.0541 | 29.5216 | 27.5128 | 31.0729 | 32.5258 | 27.4006 | 26.4194 | 29.0499 | 19.3747 | 26.5331 | 23.7709 | 26.4401 | 29.6972 |
| msd | ots | 29.0441 | 29.5246 | 27.4973 | 31.0678 | 32.5225 | 27.3789 | 26.4079 | 29.0296 | 19.2696 | 26.5294 | 23.7708 | 26.4047 | 29.6887 |
| msd | csa | 29.0503 | 29.5131 | 27.4994 | 31.0653 | 32.5167 | 27.3825 | 26.4123 | 29.0487 | 19.4416 | 26.5222 | 23.7655 | 26.4304 | 29.6934 |
| mad | fs | 29.0773 | 29.5214 | 27.5656 | 31.0738 | 32.5388 | 27.4433 | 26.4189 | 29.0693 | 20.1119 | 26.5412 | 23.8153 | 26.5097 | 29.6909 |
| mad | 3ss | 29.0538 | 29.5187 | 27.5209 | 31.0715 | 32.5279 | 27.4051 | 26.4028 | 29.0461 | 19.6215 | 26.5306 | 23.8017 | 26.4575 | 29.6833 |
| mad | 4ss | 29.0437 | 29.5177 | 27.2766 | 31.0724 | 32.3003 | 27.4051 | 26.0138 | 28.2736 | 19.4122 | 25.8462 | 23.7570 | 26.4090 | 29.5894 |
| mad | tdl | 29.0399 | 29.5130 | 27.4993 | 31.0715 | 32.5218 | 27.3738 | 26.4098 | 29.0398 | 19.4583 | 26.5241 | 23.7739 | 26.4226 | 29.6978 |
| mad | osa | 29.0470 | 29.5115 | 27.5164 | 31.0717 | 32.5138 | 27.3904 | 26.4032 | 29.0375 | 19.4047 | 26.5297 | 23.7771 | 26.4457 | 29.6932 |
| mad | cbosa | 29.0399 | 29.5137 | 27.5109 | 31.0646 | 32.5212 | 27.3772 | 26.4126 | 29.0339 | 19.2773 | 26.5231 | 23.7681 | 26.4237 | 29.6886 |
| mad | ots | 29.0414 | 29.5129 | 27.4869 | 31.0737 | 32.5192 | 27.3670 | 26.3943 | 29.0230 | 19.1884 | 26.5269 | 23.7751 | 26.3996 | 29.6806 |
| mad | csa | 29.0347 | 29.5138 | 27.5039 | 31.0686 | 32.5153 | 27.3698 | 26.4041 | 29.0275 | 19.3566 | 26.5172 | 23.7731 | 26.4216 | 29.6870 |

table 5: the effect of adding gaussian noise with snr = 25 db on different algorithms, using the non-overlapped block technique (psnr, db)
| cost | algorithm | claire | akiyo | mot_daug | salesman | grandma | suzie | container | mis_am | football | carfone | foreman | silent | news |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| msd | fs | 24.3748 | 24.7095 | 23.2762 | 26.8718 | 27.8667 | 23.4065 | 21.5700 | 24.6805 | 18.8333 | 22.8580 | 20.0903 | 22.1782 | 25.5085 |
| msd | 3ss | 24.3222 | 24.7058 | 23.2421 | 26.8693 | 27.8518 | 23.3709 | 21.5412 | 24.6382 | 18.4370 | 22.8450 | 20.0787 | 22.1503 | 25.5035 |
| msd | 4ss | 24.2718 | 24.6826 | 23.0947 | 26.8716 | 27.7363 | 23.3344 | 21.2836 | 23.6099 | 18.2048 | 22.3892 | 20.0444 | 22.1158 | 25.3588 |
| msd | tdl | 24.2923 | 24.6975 | 23.2296 | 26.8707 | 27.8417 | 23.3347 | 21.5069 | 24.6061 | 18.2926 | 22.8312 | 20.0591 | 22.1394 | 25.4902 |
| msd | osa | 24.2986 | 24.7051 | 23.2385 | 26.8700 | 27.8328 | 23.3491 | 21.5340 | 24.6263 | 18.2527 | 22.8380 | 20.0638 | 22.1335 | 25.5038 |
| msd | cbosa | 24.2952 | 24.7053 | 23.2393 | 26.8694 | 27.8379 | 23.3401 | 21.5068 | 24.6128 | 18.1633 | 22.8321 | 20.0698 | 22.1425 | 25.4971 |
| msd | ots | 24.2769 | 24.7001 | 23.2245 | 26.8663 | 27.8272 | 23.3278 | 21.5048 | 24.5963 | 18.0727 | 22.8316 | 20.0630 | 22.1216 | 25.4871 |
| msd | csa | 24.2765 | 24.6929 | 23.2226 | 26.8746 | 27.8400 | 23.3329 | 21.4946 | 24.6023 | 18.2132 | 22.8342 | 20.0600 | 22.1262 | 25.5020 |
| mad | fs | 24.3433 | 24.6944 | 23.2724 | 26.8684 | 27.8606 | 23.3890 | 21.5432 | 24.6478 | 18.7474 | 22.8499 | 20.0904 | 22.1633 | 25.5024 |
| mad | 3ss | 24.3007 | 24.6900 | 23.2391 | 26.8659 | 27.8464 | 23.3545 | 21.5169 | 24.6232 | 18.3712 | 22.8315 | 20.0823 | 22.1440 | 25.4979 |
| mad | 4ss | 24.2659 | 24.6835 | 23.1040 | 26.8711 | 27.7022 | 23.3391 | 21.1987 | 23.3878 | 18.1935 | 22.3078 | 20.0516 | 22.1182 | 25.3493 |
| mad | tdl | 24.2692 | 24.6956 | 23.2205 | 26.8683 | 27.8424 | 23.3278 | 21.5026 | 24.5962 | 18.2356 | 22.8331 | 20.0582 | 22.1255 | 25.4899 |
| mad | osa | 24.2892 | 24.6942 | 23.2295 | 26.8631 | 27.8355 | 23.3441 | 21.5178 | 24.6081 | 18.1976 | 22.8319 | 20.0643 | 22.1415 | 25.4919 |
| mad | cbosa | 24.2815 | 24.6993 | 23.2415 | 26.8700 | 27.8335 | 23.3378 | 21.5102 | 24.5968 | 18.1053 | 22.8273 | 20.0666 | 22.1297 | 25.4968 |
| mad | ots | 24.2642 | 24.6943 | 23.2235 | 26.8684 | 27.8189 | 23.3240 | 21.4948 | 24.5876 | 18.0279 | 22.8249 | 20.0606 | 22.1158 | 25.4806 |
| mad | csa | 24.2611 | 24.6924 | 23.2219 | 26.8695 | 27.8274 | 23.3234 | 21.5016 | 24.5919 | 18.1615 | 22.8293 | 20.0672 | 22.1322 | 25.4872 |

table 6: the effect of adding gaussian noise with snr = 20 db on different algorithms, using the non-overlapped block technique (psnr, db)

fig. 4: the effect of noise on different algorithms, with different sequences (left panel: psnr vs. snr at 25, 20, 15 and 10 db for the fs, tss, fss, tdl and osa algorithms; right panel: psnr vs. snr of the tss algorithm for the mis_am, football, carfone and foreman sequences)

fig. 5: two examples showing the effect of block size (psnr of the 3ss, 4ss and tdl algorithms at block sizes 4×4, 8×8 and 16×16, for the silent and football sequences)
the cost function is the mean absolute difference (mad); noise-free sequences.

| video sequence | block size | fs | 3ss | 4ss | tdl | osa | cbosa | ots | csa |
|---|---|---|---|---|---|---|---|---|---|
| claire | 16×16 | 42.7768 | 42.7768 | 42.7768 | 42.7768 | 42.7768 | 42.7768 | 42.7768 | 42.7768 |
| claire | 8×8 | 42.7894 | 42.7862 | 42.7770 | 42.7846 | 42.7860 | 42.7839 | 42.7817 | 42.7837 |
| claire | 4×4 | 43.1684 | 43.0100 | 42.9610 | 42.9835 | 42.9487 | 42.9406 | 42.8924 | 42.9281 |
| akiyo | 16×16 | 44.0277 | 44.0277 | 44.0277 | 44.0277 | 44.0277 | 44.0277 | 44.0277 | 44.0277 |
| akiyo | 8×8 | 44.0497 | 44.0407 | 44.0349 | 44.0397 | 44.0397 | 44.0373 | 44.0392 | 44.0407 |
| akiyo | 4×4 | 44.5296 | 44.3550 | 44.2828 | 44.3275 | 44.2827 | 44.2437 | 44.1994 | 44.2406 |
| mot_daug | 16×16 | 35.7964 | 35.7332 | 35.7075 | 35.6866 | 35.7074 | 35.7085 | 35.6891 | 35.6859 |
| mot_daug | 8×8 | 36.2719 | 36.0978 | 35.9943 | 35.9936 | 36.0011 | 35.9716 | 35.8743 | 35.9166 |
| mot_daug | 4×4 | 37.5963 | 36.9544 | 36.6822 | 36.7944 | 36.6327 | 36.5454 | 36.4824 | 36.6330 |
| salesman | 16×16 | 38.0802 | 38.0771 | 38.0771 | 38.0771 | 38.0771 | 38.0771 | 38.0771 | 38.0771 |
| salesman | 8×8 | 38.3956 | 38.2350 | 38.1451 | 38.1958 | 38.1753 | 38.1675 | 38.1539 | 38.1871 |
| salesman | 4×4 | 39.7085 | 39.1389 | 38.9262 | 39.0782 | 38.9602 | 38.8460 | 38.9358 | 38.9625 |
| grandma | 16×16 | 44.2409 | 44.2409 | 44.2402 | 44.2409 | 44.2408 | 44.2408 | 44.2408 | 44.2409 |
| grandma | 8×8 | 44.2705 | 44.2624 | 44.2528 | 44.2613 | 44.2591 | 44.2574 | 44.2568 | 44.2576 |
| grandma | 4×4 | 44.5451 | 44.4325 | 44.3738 | 44.4160 | 44.3980 | 44.3544 | 44.3622 | 44.3844 |
| suzie | 16×16 | 34.6410 | 34.5961 | 34.5646 | 34.5559 | 34.5793 | 34.5720 | 34.5551 | 34.5561 |
| suzie | 8×8 | 35.1784 | 35.0081 | 34.9003 | 34.9428 | 34.9279 | 34.8953 | 34.8585 | 34.9217 |
| suzie | 4×4 | 36.4271 | 35.8496 | 35.6005 | 35.7433 | 35.5897 | 35.4816 | 35.5631 | 35.6111 |
| mis_am | 16×16 | 36.9267 | 36.9157 | 36.8977 | 36.9036 | 36.9111 | 36.9003 | 36.8982 | 36.9036 |
| mis_am | 8×8 | 37.2996 | 37.2038 | 37.1216 | 37.1297 | 37.1608 | 37.1419 | 37.0462 | 37.0897 |
| mis_am | 4×4 | 38.5524 | 38.1191 | 37.8946 | 37.9207 | 37.9483 | 37.8733 | 37.6502 | 37.7892 |
| container | 16×16 | 43.7960 | 43.7960 | 43.7960 | 43.7960 | 43.7960 | 43.7960 | 43.7960 | 43.7960 |
| container | 8×8 | 43.7961 | 43.7961 | 43.7961 | 43.7961 | 43.7961 | 43.7961 | 43.7961 | 43.7961 |
| container | 4×4 | 43.8102 | 43.8064 | 43.8036 | 43.8057 | 38.9602 | 43.8054 | 43.8042 | 43.8049 |
| carfone | 16×16 | 31.5072 | 31.5032 | 31.4973 | 31.5023 | 31.5026 | 31.5026 | 31.5022 | 31.5023 |
| carfone | 8×8 | 31.9189 | 31.8412 | 31.6262 | 31.8163 | 31.8280 | 31.8095 | 31.7954 | 31.7886 |
| carfone | 4×4 | 33.5332 | 32.8910 | 32.4287 | 32.7960 | 32.7217 | 32.6610 | 32.6191 | 32.5616 |
| foreman | 16×16 | 28.7184 | 28.7028 | 28.6327 | 28.6475 | 28.6447 | 28.6447 | 28.6431 | 28.6431 |
| foreman | 8×8 | 29.4459 | 29.0493 | 28.8682 | 28.8742 | 28.7888 | 28.7704 | 28.7340 | 28.7767 |
| foreman | 4×4 | 31.4843 | 30.3725 | 29.7362 | 29.9669 | 29.5708 | 29.4712 | 29.4314 | 29.6674 |
| silent | 16×16 | 34.9766 | 34.8675 | 34.7843 | 34.7554 | 34.8347 | 34.7537 | 34.6913 | 34.7388 |
| silent | 8×8 | 36.5615 | 36.0605 | 35.7335 | 35.8490 | 35.7905 | 35.6099 | 35.5770 | 35.7755 |
| silent | 4×4 | 39.3660 | 38.1995 | 37.7022 | 37.9981 | 37.6317 | .3660 | 37.3521 | 37.6430 |
| news | 16×16 | 37.9268 | 37.9268 | 37.9268 | 37.9268 | 37.9268 | 37.9268 | 37.9267 | 37.9268 |
| news | 8×8 | 38.2139 | 38.1805 | 38.1177 | 38.1706 | 38.1607 | 38.1646 | 38.1432 | 38.1534 |
| news | 4×4 | 39.6129 | 39.1762 | 39.0289 | 39.1427 | 39.0072 | 38.9403 | 38.8082 | 38.8152 |
| football | 16×16 | 20.8890 | 20.3351 | 20.0559 | 20.1449 | 20.0946 | 19.9489 | 19.8432 | 20.0297 |
| football | 8×8 | 23.1485 | 22.1201 | 21.7585 | 21.8682 | 21.6609 | 21.4343 | 21.3743 | 21.6571 |
| football | 4×4 | 26.5200 | 24.6827 | 24.1832 | 24.3096 | 23.7828 | 23.3710 | 23.2316 | 23.8813 |

table 7: the effect of block size on different algorithms using the non-overlapped block technique (psnr, db)
the cost function is the mean absolute difference (mad); noise-free sequences.

| video sequence | block size | fs | 3ss | 4ss | tdl | osa | cbosa | ots | csa |
|---|---|---|---|---|---|---|---|---|---|
| claire | 16×16 | 43.5145 | 43.5123 | 42.8956 | 43.5123 | 43.4960 | 43.4959 | 43.4798 | 43.4879 |
| claire | 8×8 | 43.8594 | 43.7666 | 43.0371 | 43.7712 | 43.4959 | 43.4944 | 43.4718 | 43.6813 |
| claire | 4×4 | 44.5517 | 43.9486 | 43.4459 | 44.0145 | 43.7641 | 43.7465 | 43.7740 | 43.8218 |
| akiyo | 16×16 | 44.5667 | 44.3828 | 44.0538 | 44.3828 | 44.3812 | 44.3812 | 44.3828 | 44.3828 |
| akiyo | 8×8 | 44.8593 | 44.7638 | 44.1545 | 44.7828 | 44.6570 | 44.6578 | 44.6489 | 44.7087 |
| akiyo | 4×4 | 45.6166 | 45.1495 | 44.7576 | 45.2207 | 45.0221 | 45.0259 | 45.1116 | 45.1231 |
| mot_daug | 16×16 | 37.1630 | 37.0821 | 36.5073 | 37.0242 | 36.9955 | 36.9720 | 36.9565 | 36.8263 |
| mot_daug | 8×8 | 38.3390 | 37.7442 | 37.0398 | 37.6741 | 37.4961 | 37.5502 | 37.5607 | 37.3572 |
| mot_daug | 4×4 | 39.2502 | 38.4231 | 37.7508 | 38.4135 | 37.9643 | 38.0876 | 38.2214 | 38.0042 |
| salesman | 16×16 | 39.0550 | 39.0204 | 38.6163 | 39.0184 | 38.9326 | 38.9380 | 38.9377 | 38.8235 |
| salesman | 8×8 | 40.0834 | 39.7301 | 39.2691 | 39.7364 | 39.5043 | 39.5443 | 39.4983 | 39.5222 |
| salesman | 4×4 | 41.3464 | 40.6167 | 40.1940 | 40.6732 | 40.2996 | 40.3844 | 40.4402 | 40.2478 |
| grandma | 16×16 | 44.3245 | 44.3198 | 44.2609 | 44.3169 | 44.3095 | 44.3110 | 44.3076 | 44.3119 |
| grandma | 8×8 | 44.5781 | 44.5037 | 44.3273 | 44.4985 | 44.4782 | 44.4742 | 44.4634 | 44.4731 |
| grandma | 4×4 | 45.0130 | 44.8618 | 44.5958 | 44.8534 | 44.8045 | 44.8041 | 44.7732 | 44.7762 |
| suzie | 16×16 | 36.8167 | 36.6795 | 35.8949 | 36.2371 | 36.4073 | 36.4230 | 36.0087 | 36.0175 |
| suzie | 8×8 | 37.8662 | 37.3184 | 36.3338 | 37.0996 | 36.8292 | 36.8412 | 36.7084 | 36.6589 |
| suzie | 4×4 | 38.9663 | 37.9522 | 37.0740 | 37.8821 | 37.3160 | 37.3391 | 37.4694 | 37.2577 |
| mis_am | 16×16 | 37.1225 | 37.1067 | 37.0504 | 37.0004 | 37.0951 | 37.0946 | 36.9958 | 36.9952 |
| mis_am | 8×8 | 37.8509 | 37.5961 | 37.4185 | 37.4205 | 37.5131 | 37.4977 | 37.3306 | 37.3833 |
| mis_am | 4×4 | 39.0408 | 38.7229 | 38.4166 | 38.4621 | 38.4970 | 38.4544 | 38.1839 | 38.3203 |
| container | 16×16 | 43.7964 | 43.7964 | 43.7964 | 43.7963 | 43.7963 | 43.7963 | 43.7964 | 43.7963 |
| container | 8×8 | 43.8018 | 43.7997 | 43.7995 | 43.7994 | 43.7994 | 43.7995 | 43.7992 | 43.7994 |
| container | 4×4 | 43.8399 | 43.8288 | 43.8164 | 43.8288 | 43.8282 | 43.8285 | 43.8245 | 43.8235 |
| carfone | 16×16 | 33.7505 | 33.5387 | 32.5986 | 33.5071 | 33.4039 | 33.3958 | 33.5072 | 33.1682 |
| carfone | 8×8 | 35.1353 | 34.2742 | 33.2570 | 34.2545 | 34.0278 | 34.0320 | 34.2366 | 33.5551 |
| carfone | 4×4 | 36.3276 | 35.2296 | 34.2488 | 35.2128 | 34.7692 | 34.8781 | 35.1781 | 34.1129 |
| foreman | 16×16 | 32.6306 | 32.2222 | 30.5635 | 32.0957 | 31.9828 | 32.1301 | 31.9249 | 31.7464 |
| foreman | 8×8 | 34.1180 | 32.9344 | 31.4437 | 32.7939 | 32.4283 | 32.6552 | 32.6965 | 32.2877 |
| foreman | 4×4 | 35.4050 | 33.4195 | 32.4833 | 33.3044 | 32.7338 | 33.1495 | 33.4824 | 32.6653 |
| silent | 16×16 | 36.1862 | 36.1050 | 35.7104 | 36.0065 | 35.9547 | 35.8209 | 35.7348 | 35.8129 |
| silent | 8×8 | 38.9408 | 38.0916 | 37.3871 | 37.9064 | 37.5271 | 37.4855 | 37.2798 | 37.5107 |
| silent | 4×4 | 40.9393 | 39.8626 | 38.8732 | 39.6441 | 38.8536 | 38.8269 | 38.6559 | 39.0480 |
| news | 16×16 | 38.5197 | 38.5105 | 38.0756 | 38.5076 | 38.5076 | 38.5257 | 38.4919 | 38.4893 |
| news | 8×8 | 39.7669 | 39.3590 | 38.5155 | 39.3298 | 39.2583 | 39.3243 | 39.3236 | 39.1938 |
| news | 4×4 | 41.2435 | 40.3439 | 39.6528 | 40.3543 | 40.1416 | 40.1696 | 40.4380 | 39.9729 |
| football | 16×16 | 22.9285 | 22.5110 | 22.5599 | 21.8857 | 21.8874 | 21.6624 | 21.6624 | 21.8158 |
| football | 8×8 | 26.3718 | 24.1567 | 22.8188 | 23.3926 | 23.1560 | 22.9056 | 22.7096 | 23.2567 |
| football | 4×4 | 27.7078 | 25.8479 | 23.9034 | 24.9480 | 24.2826 | 23.7844 | 23.6967 | 24.8451 |

table 8: the effect of the block size on different algorithms using the overlapped block technique (psnr, db)
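all of the quality figures above are psnr values; for completeness, here is a small numpy helper (our own, not the paper's code) that computes psnr for 8-bit frames:

```python
import numpy as np

def psnr(original, reconstructed, peak=255.0):
    """peak signal-to-noise ratio in db for 8-bit frames."""
    err = np.mean((original.astype(np.float64)
                   - reconstructed.astype(np.float64)) ** 2)
    return np.inf if err == 0 else 10.0 * np.log10(peak ** 2 / err)
```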
3.3 visual results

in this section the reconstructed frames are presented using the fs and tss algorithms, with msd as a cost function and the overlapped technique. for the comparison we used three video sequences, specifically:

claire video sequence: this represents a head and shoulder sequence, and has just one moving object, with slow motion.
mother & daughter sequence: this also represents a head and shoulder sequence, but it has two moving objects, with slow motion.
football sequence: this represents a multi-object sequence with fast motion.

for these three sequences the reconstructed frame is shown; the comparison is performed by estimating frame number n + k from a reference frame n. for each sequence three cases are performed, with k = 1, k = 4, and k = 7. this is shown in fig. 6 and fig. 7, and the psnr corresponding to these cases is shown in table 9. the simulation results show that the quality of the reconstructed frame decreases as the number of skipped frames (k) increases. the appearance and disappearance of objects during the sequence also decreases the quality of the reconstructed frames.

fig. 6: (a) the original 2nd frame, (b) the original 5th frame, (c) the original 8th frame; (d)–(f) frames 2, 5 and 8 reconstructed using the fs algorithm; (g)–(i) frames 2, 5 and 8 reconstructed using the tss algorithm
fig. 7: (a) the original 2nd frame, (b) the original 5th frame, (c) the original 8th frame; (d)–(f) frames 2, 5 and 8 reconstructed using the fs algorithm; (g)–(i) frames 2, 5 and 8 reconstructed using the tss algorithm

| ref. → cur. | claire 1→2 | claire 1→5 | claire 1→8 | football 1→2 | football 1→5 | football 1→8 |
|---|---|---|---|---|---|---|
| fs | 40.9697 | 33.9463 | 31.5458 | 25.2897 | 20.8944 | 19.7531 |
| tss | 40.9697 | 33.7515 | 31.3894 | 23.7676 | 19.5222 | 18.3951 |

table 9: the psnr performance for the two algorithms (reference frame → current frame)

4 conclusion

from the simulation results we can conclude that:
• there are two techniques for searching for the best match, namely 1) searching within non-overlapped blocks and 2) searching within overlapped blocks.
• a comparison between these two techniques was performed using the same searching algorithm, the same block size, the same cost function, and the same complexity, i.e. the same searching points. the simulation indicates that searching within overlapped blocks is the better technique from the quality point of view.
• the full search algorithm is the best algorithm from the quality point of view, but from the computation time (complexity) point of view it is the worst.
• the tss algorithm is the best algorithm from the quality-complexity trade-off point of view.
• the block size is one of the effective factors in motion estimation algorithms. a small block size (such as 4×4 or 8×8) results in good quality, but reduces the reliability of the motion vector, since few pixels participate in the matching process. on the other hand, large block sizes (such as 16×16) are preferable for fast algorithms.
• the cost function affects the complexity of the searching algorithm. a comparison between the mad and msd cost functions indicates that msd achieves greater quality than mad at the cost of increased complexity. mad is preferable, since the difference in quality is very small.
• the addition of white gaussian noise affects the direction of the motion vectors; consequently the reconstructed frame has lower quality.
references

[1] bhaskaran, v., konstantinides, k.: "image and video compression standards: algorithms and architectures." second edition, kluwer academic publishers.
[2] koga, t., iinuma, k., hirano, a., iijima, y., ishiguro, t.: "motion compensated interframe coding for video conferencing." in proc. nat. telecomm. conf., new orleans, la, p. g5.3.1–g5.3.5, nov. 29–dec. 3, 1981.
[3] po, l.-m., ma, w.-c.: "a novel four-step search algorithm for fast block motion estimation." ieee trans. circuits syst. video technol., vol. 6 (jun. 1996), no. 3, p. 313–317.
[4] rao, k. r., hwang, j. j.: "techniques & standards for image, video & audio coding." rio de janeiro, brazil: prentice hall, 1996.
[5] usama, s., simak, b.: "a low-complexity wavelet based algorithm for inter-frame image prediction." acta polytechnica, vol. 42 (2002), no. 2, p. 30–34. issn 1210-2709.
[6] jain, j. r., jain, a. k.: "displacement measurement and its application in interframe image coding." ieee transactions on communications, vol. com-29 (dec. 1981), no. 12, p. 1799–1808.
[7] puri, a., hang, h.-m., schilling, d. l.: "an efficient block-matching algorithm for motion compensated coding." proceedings ieee icassp (international conference on acoustics, speech and signal processing), 1987, p. 25.4.1–25.4.4.
[8] po, l. m., ma, w. c.: "a new center-biased search algorithm for block motion estimation." proceedings icip 95, vol. i, dec. 1995, p. 410–413.
[9] srinivasan, r., rao, k. r.: "predictive coding based on efficient motion estimation." ieee transactions on communications, vol. com-33 (aug. 1985), no. 8, p. 888–896.
[10] ghanbari, m.: "the cross-search algorithm for motion estimation." ieee transactions on communications, vol. 38 (july 1990), no. 7, p. 950–953.

dr. sayed usama
phone: +20101346626
e-mail: usama@acc.aun.edu.eg, usama_s_1999@yahoo.com
faculty of engineering, assiut university, assiut, egypt

dr. montaser mohamed
faculty of engineering, south valley university, aswan, egypt

eng. osama ahmed
e-mail: omer.osama@gmail.com, osama_12000@yahoo.com
faculty of engineering, south valley university, aswan, egypt

acta polytechnica 62(1):165–189, 2022. https://doi.org/10.14311/ap.2022.62.0165
© 2022 the author(s). licensed under a cc-by 4.0 licence. published by the czech technical university in prague.

on generalized heun equation with some mathematical properties

nasser saad
university of prince edward island, department of mathematics and statistics, 550 university avenue, charlottetown, pei, canada c1a 4p3. correspondence: nsaad@upei.ca

abstract. we study the analytic solutions of the generalized heun equation,
\[
\left(\alpha_0+\alpha_1 r+\alpha_2 r^2+\alpha_3 r^3\right)y'' + \left(\beta_0+\beta_1 r+\beta_2 r^2\right)y' + \left(\varepsilon_0+\varepsilon_1 r\right)y = 0,
\]
where $|\alpha_3|+|\beta_2|\neq 0$, and $\{\alpha_i\}_{i=0}^{3}$, $\{\beta_i\}_{i=0}^{2}$, $\{\varepsilon_i\}_{i=0}^{1}$ are real parameters. the existence conditions for the polynomial solutions are given. a simple procedure based on a recurrence relation is introduced to evaluate these polynomial solutions explicitly. for $\alpha_0=0$, $\alpha_1\neq 0$, we prove that the polynomial solutions of the corresponding differential equation are sources of finite sequences of orthogonal polynomials. several mathematical properties, such as the recurrence relation, christoffel-darboux formulas and the norms of these polynomials, are discussed. we shall also show that they exhibit a factorization property that permits the construction of other infinite sequences of orthogonal polynomials.

keywords: heun equation, confluent forms of heun's equation, polynomial solutions, sequences of orthogonal polynomials.
1. introduction

it seems a simple question to ask: under what conditions does the differential equation
\[
\pi_3(r)\,y'' + \pi_2(r)\,y' + \pi_1(r)\,y = \big(\lambda_n + \mu_n\,\pi_0(r)\big)\,y,
\]
where $\lambda_n$ and $\mu_n$ are constants and $\pi_j(r)$, $j=0,1,2,3$, are polynomials of unknown degree to be found, have $n$-degree monic polynomial solutions $y_n=\sum_{k=0}^{n}c_k r^k$, $c_0\neq 0$, $c_n=1$?

a simple approach to deduce the possible degrees of $\pi_j$, $j=0,1,2,3$, is to examine the degrees of the (possible) polynomial solutions $y_n$. for $n=0$, $y_0(r)=1$, we must have $\pi_1(r)=\lambda_0+\mu_0\pi_0(r)$, and the polynomial $\pi_1(r)$ must have the same degree as $\pi_0(r)$, so we may combine the same-degree polynomial coefficients of $y$ and write the equation as $\pi_3(r)y''+\pi_2(r)y'+\pi_1(r)y=0$. next, for a polynomial solution of degree one, say $y_1(r)=r+\alpha$, the differential equation reduces to $\pi_2(r)+\pi_1(r)(r+\alpha)=0$, and the degree of $\pi_2$ should be the degree of $\pi_1(r)$ plus one. similarly, for a second-degree polynomial solution, say $y(r)=r^2+\alpha r+\beta$, it follows by substitution that $\pi_3(r)+\pi_2(r)(2r+\alpha)+\pi_1(r)(r^2+\alpha r+\beta)=0$, which indicates that the degree of $\pi_3(r)$ should be the degree of $\pi_2$ plus one, which, in turn, is a polynomial of $\pi_1$ degree plus one. this simple argument shows that for the polynomial solutions of a linear second-order differential equation with polynomial coefficients, the polynomial coefficients $\pi_j(r)$, $j=3,2,1$, must be of degree $n$, $n-1$ and $n-2$, respectively. so, without loss of generality, we may direct our attention to the following: under what conditions on the equation parameters $\alpha_k$, $\beta_k$ and $\varepsilon_k$, for $k=0,1,\dots,n$, does the differential equation
\[
\Big(\sum_{k=0}^{n}\alpha_k r^k\Big)y''(r)+\Big(\sum_{k=0}^{n-1}\beta_k r^k\Big)y'(r)+\Big(\sum_{k=0}^{n-2}\varepsilon_k r^k\Big)y(r)=0,\qquad n\ge 2,\tag{1}
\]
have polynomial solutions $y=\sum_{j=0}^{m}c_j r^j$?

a logical approach is to examine the differential equation using the series solution
\[
y=\sum_{j=0}^{\infty}c_j r^j,\qquad y'=\sum_{j=0}^{\infty}j\,c_j r^{j-1},\qquad y''=\sum_{j=0}^{\infty}j(j-1)\,c_j r^{j-2}
\]
in (1) and enforce the coefficients $c_j=0$ for all $j\ge m+1$, $m=0,1,2,\dots$, to find the condition so that $c_m\neq 0$. this approach leads to the conclusion that for equation (1) to have an $m$-degree polynomial solution, it is necessary that
\[
\varepsilon_{n-2}=-m(m-1)\,\alpha_n-m\,\beta_{n-1},\qquad n=2,3,\dots.\tag{2}
\]
did this answer the question? indeed, no. consider, for example, the simple equation $r^3y''+2r^2y'+(-2r+5)y=0$. clearly, the necessary condition (2) is satisfied for $m=1$, and one expects the existence of a first-degree polynomial solution, say $y=r+b$, for an arbitrary value of $b\in\mathbb{R}$; however, $2r^2+(-2r+5)(r+b)\neq 0$ for any real value of $b$. therefore, for $n\ge 3$, the condition (2) is necessary but not sufficient for the existence of polynomial solutions of the differential equation (1).
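the failure of sufficiency in this example is quick to confirm symbolically; a short sympy check (our own illustration, not part of the paper):

```python
import sympy as sp

r, b = sp.symbols('r b')
y = r + b                                     # candidate first-degree solution
lhs = sp.expand(r**3 * y.diff(r, 2) + 2 * r**2 * y.diff(r) + (-2 * r + 5) * y)
print(lhs)                                    # 5*b - 2*b*r + 5*r: never identically zero
print(sp.solve([lhs.coeff(r, 0), lhs.coeff(r, 1)], b))   # [] -> no admissible b
```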
note that, for $n=2$, equation (1) is the classical hypergeometric-type differential equation [1–4]
\[
(\alpha_2 r^2+\alpha_1 r+\alpha_0)\,y''+(\beta_1 r+\beta_0)\,y'+\varepsilon_0\,y=0\tag{3}
\]
with the necessary and sufficient condition [2] for the polynomial solutions
\[
\varepsilon_0=-m(m-1)\,\alpha_2-m\,\beta_1,\qquad m=0,1,2,\dots.
\]
for $n=3$, the differential equation (1) assumes the form
\[
p_3(r)\,y''+p_2(r)\,y'+p_1(r)\,y=0,\qquad
p_3(r)=\sum_{j=0}^{3}\alpha_j r^j,\quad p_2(r)=\sum_{j=0}^{2}\beta_j r^j,\quad p_1(r)=\sum_{j=0}^{1}\varepsilon_j r^j,\quad \alpha_j,\beta_j,\varepsilon_j\in\mathbb{R},\tag{4}
\]
which includes, as a special case or through elementary substitutions, the classical heun differential equation [5, 6]
\[
y''+\Big(\frac{\gamma}{r}+\frac{\delta}{r-1}+\frac{\epsilon}{r-a}\Big)y'+\frac{\alpha\beta\,r-q}{r(r-1)(r-a)}\,y=0,\tag{5}
\]
subject to the regularity (at infinity) condition $\alpha+\beta+1=\gamma+\delta+\epsilon$, and its four confluent forms (confluent, doubly-confluent, biconfluent and triconfluent heun equations). these equations are indispensable from the point of view of mathematical analysis [5–11] and for their valuable applications in many areas of theoretical physics [5, 6, 12–20]. in the present work:

• from equation (4), we will extract the possible differential equations that can be solved using two-term recurrence formulas.
• from equation (4), we will extract all the differential equations whose series solutions can be evaluated with a three-term recurrence formula.
• for $\alpha_0\neq 0$, we shall devise a procedure based on the asymptotic iteration method [21] to find the series and polynomial solutions of the differential equation (4).
• in the neighbourhood of the singular point $r=0$, i.e., with $\alpha_0=0$, we will prove that the series solution can be written as
\[
y(r)=\sum_{k=0}^{\infty}\frac{(-1)^k\,p_{k;s}(\varepsilon_0)}{\alpha_1^{k}\left(\frac{\beta_0}{\alpha_1}+s\right)_{k}(1+s)_{k}}\,r^{k+s},
\]
where $s$ is a root of the indicial equation. also, we show that $\{p_{k;s}(\varepsilon_0)\}_{k=0}^{\infty}$ is an infinite sequence of orthogonal polynomials with several interesting properties.
• by imposing the termination conditions, we study the mathematical properties of the finite sequences of the orthogonal polynomials $\{p_{k;s}(\varepsilon_0)\}_{k=0}^{n}$ and explore the factorization property associated with these polynomials.

2. elementary observations

the classical approach to studying the analytical solutions of equation (4) relies on the nature of the singular points of the leading polynomial coefficient $l\equiv \alpha_0+\alpha_1 r+\alpha_2 r^2+\alpha_3 r^3$, in addition to the point $r=\infty$ in the extended plane. for real coefficients and $\alpha_0\neq 0$, the odd-degree polynomial $l$ factors into either a product of a linear polynomial and an irreducible quadratic polynomial, or a product of three linear factors. in the first case, the polynomial $l$ can be written as $l=\alpha_3(r-\xi)(r^2+br+c)$, where $r^2+br+c$ is an irreducible polynomial. in this case, $\xi$ is a regular, real, singular point and $\infty$ is irregular, for otherwise the differential equation can be solved in terms of elementary functions according to the classical theory of ordinary differential equations. the differential equation can then be written as
\[
\frac{d^2y}{dr^2}+\Big(\frac{\mu_1}{r-\xi}+\frac{\mu_2}{r^2+br+c}\Big)\frac{dy}{dr}+\frac{\varepsilon_1 r+\varepsilon_0}{\alpha_3(r-\xi)(r^2+br+c)}\,y=0.\tag{6}
\]
in the second case, the polynomial $l$ can be written as $l=\alpha_3(r-\xi_1)(r-\xi_2)(r-\xi_3)$, where $\xi_j$, $j=1,2,3$, and $\infty$ are all regular singular points, i.e., the differential equation is of fuchsian type,
\[
\frac{d^2y}{dr^2}+\Big(\sum_{j=1}^{3}\frac{\mu_j}{r-\xi_j}\Big)\frac{dy}{dr}+\frac{\varepsilon_1 r+\varepsilon_0}{\alpha_3(r-\xi_1)(r-\xi_2)(r-\xi_3)}\,y=0,\tag{7}
\]
where the $\mu_j$ are constants depending on the differential equation parameters.
one can then study the series solutions of equations (6) and (7) using the classical frobenius method. another approach, recently adopted to study (4), depends on the possible combinations of the parameters $\alpha_j$, $j=0,1,2,3$, such that the polynomial $l$ does not vanish identically. there are fifteen possible combinations in total. these fifteen combinations can be classified into two main classes: the first class, characterized by $\alpha_0\neq 0$, contains eight equations in total; the second class, characterized by $\alpha_0=0$, includes the remaining seven equations. each of these two classes will be studied in the next sections. first, we consider some elementary observations regarding the differential equation (4).

assuming there is no common factor among the polynomial coefficients $p_j(r)$, $j=1,2,3$, we start our study of equation (4) by asking the following simple question: under what conditions can the series solutions of the differential equation (4) be evaluated using a two-term recurrence relation [22]? for, in this case, the two linearly independent series solutions can be found explicitly.

theorem 2.1. the necessary and sufficient condition for the linear differential equation
\[
p_2(r)\,u''(r)+p_1(r)\,u'(r)+p_0(r)\,u(r)=0\tag{8}
\]
to have a two-term recurrence relationship relating the successive coefficients of its series solution is that, in the neighbourhood of the regular singular point $r_0$ (where $p_2(r_0)=0$), equation (8) can be written as
\[
\underbrace{[q_{2,0}+q_{2,h}(r-r_0)^h]}_{q_2(r)}(r-r_0)^{2-m}\,u''(r)
+\underbrace{[q_{1,0}+q_{1,h}(r-r_0)^h]}_{q_1(r)}(r-r_0)^{1-m}\,u'(r)
+\underbrace{[q_{0,0}+q_{0,h}(r-r_0)^h]}_{q_0(r)}(r-r_0)^{-m}\,u(r)=0,\tag{9}
\]
where, for $m\in\mathbb{Z}$, $h\in\mathbb{Z}^{+}$, $j=0,1,2$,
\[
q_j(r)\equiv\sum_{k=0}^{\infty}q_{j,k}(r-r_0)^k=p_j(r)\,(r-r_0)^{m-j},\tag{10}
\]
with at least one of $q_{j,0}$, $j=0,1,2$, and $q_{j,h}$, $j=0,1,2$, different from zero. in this case, the two-term recurrence formula is given by
\[
\frac{c_k}{c_{k-h}}=-\frac{(k+\lambda-h)\big[q_{2,h}(k+\lambda-h-1)+q_{1,h}\big]+q_{0,h}}{(k+\lambda)\big[q_{2,0}(k+\lambda-1)+q_{1,0}\big]+q_{0,0}},\tag{11}
\]
where $c_0\neq 0$ and $\lambda=\lambda_1,\lambda_2$ are the roots of the indicial equation
\[
q_{2,0}\,\lambda(\lambda-1)+q_{1,0}\,\lambda+q_{0,0}=0.\tag{12}
\]
the closed form of the series solution generated by (11) can be written in terms of the generalized hypergeometric function as
\[
u(r;\lambda)=r^{\lambda}\sum_{k=0}^{\infty}c_{hk}\,r^{hk}
=r^{\lambda}\,{}_3F_2\!\left(1,\ \tfrac{2\lambda-1}{2h}+\tfrac{q_{1,h}}{2hq_{2,h}}-\tfrac{\sqrt{(q_{1,h}-q_{2,h})^2-4q_{0,h}q_{2,h}}}{2hq_{2,h}},\ \tfrac{2\lambda-1}{2h}+\tfrac{q_{1,h}}{2hq_{2,h}}+\tfrac{\sqrt{(q_{1,h}-q_{2,h})^2-4q_{0,h}q_{2,h}}}{2hq_{2,h}};\right.
\]
\[
\left.1+\tfrac{2\lambda-1}{2h}+\tfrac{q_{1,0}}{2hq_{2,0}}-\tfrac{\sqrt{(q_{1,0}-q_{2,0})^2-4q_{0,0}q_{2,0}}}{2hq_{2,0}},\ 1+\tfrac{2\lambda-1}{2h}+\tfrac{q_{1,0}}{2hq_{2,0}}+\tfrac{\sqrt{(q_{1,0}-q_{2,0})^2-4q_{0,0}q_{2,0}}}{2hq_{2,0}};\ -\frac{q_{2,h}}{q_{2,0}}\,r^h\right).\tag{13}
\]

applying this theorem, equation (4) generates the following solvable equations:

• differential equation:
\[
r^2(\alpha_2+\alpha_3 r)\,u''(r)+r(\beta_1+\beta_2 r)\,u'(r)+(\varepsilon_0+\varepsilon_1 r)\,u(r)=0,\qquad \varepsilon_0\neq 0.\tag{14}
\]
recurrence relation: for $k=1,2,\dots$ and $c_0=1$,
\[
\frac{c_k}{c_{k-1}}=-\frac{(k+\lambda-1)[\alpha_3(k+\lambda-2)+\beta_2]+\varepsilon_1}{(k+\lambda)[\alpha_2(k+\lambda-1)+\beta_1]+\varepsilon_0},\tag{15}
\]
where $\lambda=\lambda_+,\lambda_-$ are the roots of the indicial equation $\alpha_2\lambda(\lambda-1)+\beta_1\lambda+\varepsilon_0=0$, namely
\[
\lambda_\pm=\frac{\alpha_2-\beta_1\pm\sqrt{(\alpha_2-\beta_1)^2-4\alpha_2\varepsilon_0}}{2\alpha_2}.
\]
the two linearly independent solutions generated by (15), in terms of the gauss hypergeometric functions, are
\[
u_\pm=r^{\lambda_\pm}\,{}_2F_1\!\left(\frac{\beta_2}{2\alpha_3}-\frac{\beta_1}{2\alpha_2}\pm\frac{\sqrt{(\alpha_2-\beta_1)^2-4\alpha_2\varepsilon_0}}{2\alpha_2}-\frac{\sqrt{(\alpha_3-\beta_2)^2-4\alpha_3\varepsilon_1}}{2\alpha_3},\right.
\]
\[
\left.\frac{\beta_2}{2\alpha_3}-\frac{\beta_1}{2\alpha_2}\pm\frac{\sqrt{(\alpha_2-\beta_1)^2-4\alpha_2\varepsilon_0}}{2\alpha_2}+\frac{\sqrt{(\alpha_3-\beta_2)^2-4\alpha_3\varepsilon_1}}{2\alpha_3};\ 1\pm\frac{\sqrt{(\alpha_2-\beta_1)^2-4\alpha_2\varepsilon_0}}{\alpha_2};\ -\frac{\alpha_3}{\alpha_2}\,r\right).\tag{16}
\]
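the two-term relation (15) is immediate to iterate; below is a small exact-arithmetic sketch (our own illustration, not from the paper) that generates the coefficients $c_k$ for a rational indicial root $\lambda$. note that the denominator in (15) can vanish when the two indicial roots differ by an integer, in which case this direct iteration produces only one frobenius solution:

```python
from fractions import Fraction

def series_coeffs(a2, a3, b1, b2, e0, e1, lam, kmax):
    """coefficients c_k of the frobenius solution r**lam * sum_k c_k r**k
    of equation (14), built from the two-term relation (15); lam must be
    a (rational, for exact arithmetic) root of a2*l*(l-1) + b1*l + e0 = 0."""
    lam = Fraction(lam)
    c = [Fraction(1)]                       # c_0 = 1
    for k in range(1, kmax + 1):
        num = (k + lam - 1) * (a3 * (k + lam - 2) + b2) + e1
        den = (k + lam) * (a2 * (k + lam - 1) + b1) + e0
        c.append(-num / den * c[-1])        # raises ZeroDivisionError if den == 0
    return c
```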
• differential equation:
\[
(\alpha_1 r+\alpha_3 r^3)\,u''(r)+(\beta_0+\beta_2 r^2)\,u'(r)+\varepsilon_1 r\,u(r)=0,\qquad \beta_0\neq 0.\tag{17}
\]
recurrence relation:
\[
\frac{c_k}{c_{k-2}}=-\frac{(k+\lambda-2)[\alpha_3(k+\lambda-3)+\beta_2]+\varepsilon_1}{(k+\lambda)[\alpha_1(k+\lambda-1)+\beta_0]},\qquad c_0,c_1\neq 0,\ k=2,3,\dots,\tag{18}
\]
where $\lambda=\lambda_+,\lambda_-$ are the roots of the indicial equation $\alpha_1\lambda(\lambda-1)+\beta_0\lambda=0$, i.e. $\lambda_+=0$, $\lambda_-=1-\beta_0/\alpha_1$. the two linearly independent series solutions generated by (18) are
\[
u_+(r)={}_2F_1\!\left(\frac{\beta_2}{4\alpha_3}-\frac14-\frac{\sqrt{(\alpha_3-\beta_2)^2-4\alpha_3\varepsilon_1}}{4\alpha_3},\ \frac{\beta_2}{4\alpha_3}-\frac14+\frac{\sqrt{(\alpha_3-\beta_2)^2-4\alpha_3\varepsilon_1}}{4\alpha_3};\ \frac12+\frac{\beta_0}{2\alpha_1};\ -\frac{\alpha_3}{\alpha_1}r^2\right)\tag{19}
\]
and
\[
u_-(r)=r^{1-\frac{\beta_0}{\alpha_1}}\,{}_2F_1\!\left(\frac14+\frac{\beta_2}{4\alpha_3}-\frac{\beta_0}{2\alpha_1}-\frac{\sqrt{(\alpha_3-\beta_2)^2-4\alpha_3\varepsilon_1}}{4\alpha_3},\ \frac14+\frac{\beta_2}{4\alpha_3}-\frac{\beta_0}{2\alpha_1}+\frac{\sqrt{(\alpha_3-\beta_2)^2-4\alpha_3\varepsilon_1}}{4\alpha_3};\ \frac32-\frac{\beta_0}{2\alpha_1};\ -\frac{\alpha_3}{\alpha_1}r^2\right).\tag{20}
\]

• differential equation:
\[
(\alpha_0+\alpha_3 r^3)\,u''(r)+\beta_2 r^2\,u'(r)+\varepsilon_1 r\,u(r)=0,\qquad \alpha_0\neq 0.\tag{21}
\]
recurrence relation:
\[
\frac{c_k}{c_{k-3}}=-\frac{(k+\lambda-3)[\alpha_3(k+\lambda-4)+\beta_2]+\varepsilon_1}{\alpha_0(k+\lambda)(k+\lambda-1)},\qquad c_0\neq 0,\tag{22}
\]
where $\lambda=\lambda_1,\lambda_2$ are the roots of the indicial equation $\alpha_0\lambda(\lambda-1)=0$, namely $\lambda_1=0$, $\lambda_2=1$. the two linearly independent series solutions are
\[
u_1(r)={}_2F_1\!\left(\frac{-\alpha_3+\beta_2-\sqrt{(\alpha_3-\beta_2)^2-4\alpha_3\varepsilon_1}}{6\alpha_3},\ \frac{-\alpha_3+\beta_2+\sqrt{(\alpha_3-\beta_2)^2-4\alpha_3\varepsilon_1}}{6\alpha_3};\ \frac23;\ -\frac{\alpha_3}{\alpha_0}r^3\right)\tag{23}
\]
and
\[
u_2(r)=r\,{}_2F_1\!\left(\frac{\alpha_3+\beta_2-\sqrt{(\alpha_3-\beta_2)^2-4\alpha_3\varepsilon_1}}{6\alpha_3},\ \frac{\alpha_3+\beta_2+\sqrt{(\alpha_3-\beta_2)^2-4\alpha_3\varepsilon_1}}{6\alpha_3};\ \frac43;\ -\frac{\alpha_3}{\alpha_0}r^3\right).\tag{24}
\]

out of the three generic equations (14), (17) and (21), five exactly solvable differential equations (cases 1, 4, 5, 8, and 10) of the type (4) follow, and another five (cases 2, 3, 6, 7, 9) can be derived directly from them by taking limits of the equation parameters. for direct use, the ten equations and their linearly independent solutions are listed in table 1.

1. $\alpha_2 r^2 u''+(\beta_1 r+\beta_2 r^2)u'+(\varepsilon_0+\varepsilon_1 r)u=0$:
\[
u_\pm=r^{\frac12-\frac{\beta_1}{2\alpha_2}\pm\frac{\sqrt{(\alpha_2-\beta_1)^2-4\alpha_2\varepsilon_0}}{2\alpha_2}}\,{}_1F_1\!\left(\frac12-\frac{\beta_1}{2\alpha_2}+\frac{\varepsilon_1}{\beta_2}\pm\frac{\sqrt{(\alpha_2-\beta_1)^2-4\alpha_2\varepsilon_0}}{2\alpha_2};\ 1\pm\frac{\sqrt{(\alpha_2-\beta_1)^2-4\alpha_2\varepsilon_0}}{\alpha_2};\ -\frac{\beta_2}{\alpha_2}r\right).
\]
2. $\alpha_2 r^2 u''+\beta_1 r\,u'+(\varepsilon_0+\varepsilon_1 r)u=0$:
\[
u_\pm=r^{\frac12-\frac{\beta_1}{2\alpha_2}\pm\frac{\sqrt{(\alpha_2-\beta_1)^2-4\alpha_2\varepsilon_0}}{2\alpha_2}}\,{}_0F_1\!\left(-;\ 1\pm\frac{\sqrt{(\alpha_2-\beta_1)^2-4\alpha_2\varepsilon_0}}{\alpha_2};\ -\frac{\varepsilon_1}{\alpha_2}r\right).
\]
3. $\alpha_2 r^2 u''+\beta_2 r^2 u'+(\varepsilon_0+\varepsilon_1 r)u=0$:
\[
u_\pm=r^{\frac12\pm\frac{\sqrt{\alpha_2-4\varepsilon_0}}{2\sqrt{\alpha_2}}}\,{}_1F_1\!\left(\frac12+\frac{\varepsilon_1}{\beta_2}\pm\frac{\sqrt{\alpha_2-4\varepsilon_0}}{2\sqrt{\alpha_2}};\ 1\pm\frac{\sqrt{\alpha_2-4\varepsilon_0}}{\sqrt{\alpha_2}};\ -\frac{\beta_2}{\alpha_2}r\right).
\]
4. $(\alpha_2 r^2+\alpha_3 r^3)u''+\beta_1 r\,u'+(\varepsilon_0+\varepsilon_1 r)u=0$:
\[
u_\pm=r^{\frac12-\frac{\beta_1}{2\alpha_2}\pm\frac{\sqrt{(\alpha_2-\beta_1)^2-4\alpha_2\varepsilon_0}}{2\alpha_2}}\,{}_2F_1\!\left(-\frac{\beta_1}{2\alpha_2}\pm\frac{\sqrt{(\alpha_2-\beta_1)^2-4\alpha_2\varepsilon_0}}{2\alpha_2}+\frac12\sqrt{\frac{\alpha_3-4\varepsilon_1}{\alpha_3}},\ -\frac{\beta_1}{2\alpha_2}\pm\frac{\sqrt{(\alpha_2-\beta_1)^2-4\alpha_2\varepsilon_0}}{2\alpha_2}-\frac12\sqrt{\frac{\alpha_3-4\varepsilon_1}{\alpha_3}};\ 1\pm\frac{\sqrt{(\alpha_2-\beta_1)^2-4\alpha_2\varepsilon_0}}{\alpha_2};\ -\frac{\alpha_3}{\alpha_2}r\right).
\]
5. $(\alpha_2 r^2+\alpha_3 r^3)u''+\beta_2 r^2 u'+(\varepsilon_0+\varepsilon_1 r)u=0$:
\[
u_\pm=r^{\frac12\pm\frac{\sqrt{\alpha_2-4\varepsilon_0}}{2\sqrt{\alpha_2}}}\,{}_2F_1\!\left(\frac{\beta_2}{2\alpha_3}\pm\frac{\sqrt{\alpha_2-4\varepsilon_0}}{2\sqrt{\alpha_2}}-\frac{\sqrt{(\alpha_3-\beta_2)^2-4\alpha_3\varepsilon_1}}{2\alpha_3},\ \frac{\beta_2}{2\alpha_3}\pm\frac{\sqrt{\alpha_2-4\varepsilon_0}}{2\sqrt{\alpha_2}}+\frac{\sqrt{(\alpha_3-\beta_2)^2-4\alpha_3\varepsilon_1}}{2\alpha_3};\ 1\pm\frac{\sqrt{\alpha_2-4\varepsilon_0}}{\sqrt{\alpha_2}};\ -\frac{\alpha_3}{\alpha_2}r\right).
\]
6. $(\alpha_2 r^2+\alpha_3 r^3)u''+(\varepsilon_0+\varepsilon_1 r)u=0$:
\[
u_\pm=r^{\frac12\pm\frac{\sqrt{\alpha_2-4\varepsilon_0}}{2\sqrt{\alpha_2}}}\,{}_2F_1\!\left(\pm\frac{\sqrt{\alpha_2-4\varepsilon_0}}{2\sqrt{\alpha_2}}-\frac{\sqrt{\alpha_3-4\varepsilon_1}}{2\sqrt{\alpha_3}},\ \pm\frac{\sqrt{\alpha_2-4\varepsilon_0}}{2\sqrt{\alpha_2}}+\frac{\sqrt{\alpha_3-4\varepsilon_1}}{2\sqrt{\alpha_3}};\ 1\pm\frac{\sqrt{\alpha_2-4\varepsilon_0}}{\sqrt{\alpha_2}};\ -\frac{\alpha_3}{\alpha_2}r\right).
\]
7. $\alpha_2 r^2 u''+(\varepsilon_0+\varepsilon_1 r)u=0$:
\[
u_\pm=r^{\frac12\pm\frac12\sqrt{1-\frac{4\varepsilon_0}{\alpha_2}}}\,{}_0F_1\!\left(-;\ 1\pm\sqrt{1-\frac{4\varepsilon_0}{\alpha_2}};\ -\frac{\varepsilon_1}{\alpha_2}r\right).
\]
8. $\alpha_1 r\,u''+(\beta_0+\beta_2 r^2)u'+\varepsilon_1 r\,u=0$:
\[
u_1={}_1F_1\!\left(\frac{\varepsilon_1}{2\beta_2};\ \frac12+\frac{\beta_0}{2\alpha_1};\ -\frac{\beta_2}{2\alpha_1}r^2\right),\qquad
u_2=r^{1-\frac{\beta_0}{\alpha_1}}\,{}_1F_1\!\left(\frac12-\frac{\beta_0}{2\alpha_1}+\frac{\varepsilon_1}{2\beta_2};\ \frac32-\frac{\beta_0}{2\alpha_1};\ -\frac{\beta_2}{2\alpha_1}r^2\right).
\]
9. $\alpha_1 r\,u''+\beta_0 u'+\varepsilon_1 r\,u=0$:
\[
u_1={}_0F_1\!\left(-;\ \frac12+\frac{\beta_0}{2\alpha_1};\ -\frac{\varepsilon_1}{4\alpha_1}r^2\right),\qquad
u_2=r^{1-\frac{\beta_0}{\alpha_1}}\,{}_0F_1\!\left(-;\ \frac32-\frac{\beta_0}{2\alpha_1};\ -\frac{\varepsilon_1}{4\alpha_1}r^2\right).
\]
10. $\alpha_0 u''+\beta_2 r^2 u'+\varepsilon_1 r\,u=0$:
\[
u_1={}_1F_1\!\left(\frac{\varepsilon_1}{3\beta_2};\ \frac23;\ -\frac{\beta_2}{3\alpha_0}r^3\right),\qquad
u_2=r\,{}_1F_1\!\left(\frac13+\frac{\varepsilon_1}{3\beta_2};\ \frac43;\ -\frac{\beta_2}{3\alpha_0}r^3\right).
\]

table 1: ten solvable equations of the type (4) that follow from the generic equations (14), (17), and (21).

3. the solutions in the neighbourhood of an ordinary point

3.1. series solutions

in the case $\alpha_0\neq 0$, $r=0$ is an ordinary point of the differential equation (4). the classical theory of differential equations ensures that (4) has two linearly independent power series solutions in the neighbourhood of $r=0$, valid up to the nearest real singular point of the leading polynomial coefficient $l\equiv\alpha_0+\alpha_1 r+\alpha_2 r^2+\alpha_3 r^3=0$. indeed, the polynomial $l$ has the discriminant [23]
\[
\Delta_3=18\,\alpha_3\alpha_2\alpha_1\alpha_0-4\,\alpha_2^3\alpha_0+\alpha_2^2\alpha_1^2-4\,\alpha_3\alpha_1^3-27\,\alpha_3^2\alpha_0^2.\tag{25}
\]
the nature of the roots of $l$, as given by (25), along with the corresponding eight differential equations, is summarized in table 2. in all cases the first- and zeroth-order coefficients are $\beta_0+\beta_1 r+\beta_2 r^2$ and $\varepsilon_0+\varepsilon_1 r$, as in (4):

(i) $(\alpha_0+\alpha_1 r+\alpha_2 r^2+\alpha_3 r^3)y''+\dots$, with $\Delta_3$ as in (25): for $\Delta_3>0$, three distinct real roots $\xi_1\neq\xi_2\neq\xi_3$ and domain $|r|<\min_{i=1,2,3}\xi_i$; for $\Delta_3=0$, $\xi_1=\xi_2=\xi_3=\xi$ and $|r|<\xi$; for $\Delta_3<0$, one real root $\xi$ and $|r|<\xi$.
(ii) $(\alpha_0+\alpha_1 r+\alpha_2 r^2)y''+\dots$, $\Delta_3=\alpha_2^2(\alpha_1^2-4\alpha_2\alpha_0)$: for $\Delta_3>0$, $\xi_1\neq\xi_2$ and $|r|<\min_{i=1,2}\xi_i$; for $\Delta_3=0$, $\xi_1=\xi_2=\xi$ and $|r|<\xi$; for $\Delta_3<0$, no real roots and $|r|<\infty$.
(iii) $(\alpha_0+\alpha_1 r)y''+\dots$, $\Delta_3=0$, single root $r=-\alpha_0/\alpha_1$: for $\alpha_1\alpha_0>0$ the domain is $-\alpha_0/\alpha_1<r<\infty$; for $\alpha_1\alpha_0<0$ it is $-\infty<r<-\alpha_0/\alpha_1$.
(iv) $\alpha_0 y''+\dots$, $\Delta_3=0$, no real roots, domain $-\infty<r<\infty$.
(v) $(\alpha_0+\alpha_1 r+\alpha_3 r^3)y''+\dots$, $\Delta_3=-4\alpha_3\alpha_1^3-27\alpha_3^2\alpha_0^2$: the same three subcases as in (i).
(vi) $(\alpha_0+\alpha_2 r^2+\alpha_3 r^3)y''+\dots$, $\Delta_3=-4\alpha_2^3\alpha_0-27\alpha_3^2\alpha_0^2$: the same three subcases as in (i).
(vii) $(\alpha_0+\alpha_3 r^3)y''+\dots$, $\Delta_3=-27\alpha_3^2\alpha_0^2$: for either sign of $\alpha_0\alpha_3$, one real root $\xi=\sqrt[3]{-\alpha_0/\alpha_3}$ and domain $|r|<|\xi|$.
(viii) $(\alpha_0+\alpha_2 r^2)y''+\dots$, $\Delta_3=-4\alpha_2^3\alpha_0$: for $\alpha_2\alpha_0<0$, roots $r=\pm\sqrt{-\alpha_0/\alpha_2}$ and domain $-\sqrt{-\alpha_0/\alpha_2}<r<\sqrt{-\alpha_0/\alpha_2}$; for $\alpha_2\alpha_0>0$, no real roots and domain $-\infty<r<\infty$.

table 2: the eight types of differential equations (4) with $\alpha_0\neq 0$ to which theorem 3.1 applies.

for these differential equations, the following theorem, which can be easily proved using the frobenius method, holds.

theorem 3.1 (formal series solutions). in the neighbourhood of the ordinary point $r=0$, the coefficients of the series solution $y(r)=\sum_{k=0}^{\infty}c_k r^k$ of the differential equation (4) satisfy the four-term recurrence relation
\[
\big((k-1)((k-2)\alpha_3+\beta_2)+\varepsilon_1\big)c_{k-1}
+\big(k((k-1)\alpha_2+\beta_1)+\varepsilon_0\big)c_k
+(k+1)(k\alpha_1+\beta_0)\,c_{k+1}
+(k+2)(k+1)\,\alpha_0\,c_{k+2}=0,\tag{26}
\]
where $k=0,1,2,\dots$, with $c_{-1}=0$ and arbitrary nonzero constants $c_0$ and $c_1$. the radius of convergence of these series solutions extends from $r=0$ to the nearest singular point of the leading polynomial coefficient $l=0$. the first few terms of the series solution are given explicitly by
\[
c_2=-\frac{\varepsilon_0}{2\alpha_0}c_0-\frac{\beta_0}{2\alpha_0}c_1,\qquad
c_3=\frac{(\alpha_1+\beta_0)\varepsilon_0-\alpha_0\varepsilon_1}{6\alpha_0^2}c_0+\frac{\beta_0(\alpha_1+\beta_0)-\alpha_0(\beta_1+\varepsilon_0)}{6\alpha_0^2}c_1,\ \dots.
\]

for $\alpha_0\neq 0$, using (26), we can extract the following differential equations with series solutions obtainable from (4) using a three-term recurrence relation:

• differential equation:
\[
(\alpha_0+\alpha_1 r+\alpha_3 r^3)u''(r)+(\beta_0+\beta_2 r^2)u'(r)+\varepsilon_1 r\,u(r)=0.\tag{27}
\]
recurrence formula:
\[
c_{k+2}=-\frac{(k+1)(k\alpha_1+\beta_0)}{(k+1)(k+2)\alpha_0}c_{k+1}-\frac{(k-1)((k-2)\alpha_3+\beta_2)+\varepsilon_1}{(k+1)(k+2)\alpha_0}c_{k-1}.\tag{28}
\]

• differential equation:
\[
(\alpha_0+\alpha_2 r^2+\alpha_3 r^3)u''(r)+(\beta_1 r+\beta_2 r^2)u'(r)+\varepsilon_1 r\,u(r)=0.\tag{29}
\]
recurrence formula:
\[
c_{k+2}=-\frac{k(k-1)\alpha_2+k\beta_1}{(k+1)(k+2)\alpha_0}c_k-\frac{(k-1)(k-2)\alpha_3+(k-1)\beta_2+\varepsilon_1}{(k+1)(k+2)\alpha_0}c_{k-1}.\tag{30}
\]

• differential equation:
\[
(\alpha_0+\alpha_2 r^2)u''(r)+(\beta_1 r+\beta_2 r^2)u'(r)+\varepsilon_1 r\,u(r)=0.\tag{31}
\]
recurrence formula:
\[
c_{k+2}=-\frac{k(k-1)\alpha_2+k\beta_1}{(k+1)(k+2)\alpha_0}c_k-\frac{(k-1)\beta_2+\varepsilon_1}{(k+1)(k+2)\alpha_0}c_{k-1}.\tag{32}
\]
(32) 171 nasser saad acta polytechnica de α3 α2 α1 α0 discriminant roots of l domain definition i ∆3 > 0 ξ1 ̸= ξ2 ̸= ξ3 |r| < mini=1,2,3 ξi α3 α2 α1 α0 ∆3 = 0 ξ1 = ξ2 = ξ3 = ξ |r| < ξ ∆3 < 0 ξ ∈ r |r| < ξ differential equation: (α0 + α1 r + α2 r2 + α3 r3) y′′ + (β0 + β1 r + β2 r2) y′ + (ε0 + ε1 r) y = 0 discriminant: ∆3 = 18 α3 α2 α1 α0 − 4 α23 α0 + α22α12 − 4 α3 α13 − 27 α32 α02 ii ∆3 > 0 ξ1 ̸= ξ2 |r| < mini=1,2 ξi 0 α2 α1 α0 ∆3 = 0 ξ1 = ξ2 = ξ |r| < ξ ∆3 < 0 none |r| < ∞ differential equation: (α0 + α1 r + α2 r2) y′′ + (β0 + β1 r + β2 r2) y′ + (ε0 + ε1 r) y = 0 discriminant: ∆3 = α22(−4 α2 α0 + α12) iii 0 0 α1 α0 α1α0 > 0 r = −α0/α1 −∞ < r < −α0/α1 α1α0 < 0 r = −α0/α1 −α0/α1 < r < ∞ differential equation: (α0 + α1 r) y′′ + (β0 + β1 r + β2 r2) y′ + (ε0 + ε1 r) y = 0 discriminant: ∆3 = 0 iv 0 0 0 α0 none none −∞ < r < ∞ differential equation: α0 y′′ + (β0 + β1 r + β2 r2) y′ + (ε0 + ε1 r) y = 0 discriminant: ∆3 = 0 v ∆3 > 0 ξ1 ̸= ξ2 ̸= ξ3 |r| < mini=1,2,3 ξi α3 0 α1 α0 ∆3 = 0 ξ1 = ξ2 = ξ3 = ξ |r| < ξ ∆3 < 0 ξ ∈ r |r| < ξ differential equation: (α0 + α1 r + α3 r3) y′′ + (β0 + β1 r + β2 r2) y′ + (ε0 + ε1 r) y = 0 discriminant: ∆3 = −4 α3 α13 − 27 α32 α02 vi ∆3 > 0 ξ1 ̸= ξ2 ̸= ξ3 |r| < mini=1,2,3 ξi α3 α2 0 α0 ∆3 = 0 ξ1 = ξ2 = ξ3 = ξ |r| < ξ ∆3 < 0 ξ ∈ r |r| < ξ differential equation: (α0 + α2 r2 + α3 r3) y′′ + (β0 + β1 r + β2 r2) y′ + (ε0 + ε1 r) y = 0 discriminant: ∆3 = −4 α23 α0 − 27 α32 α02 vii α3 0 0 α0 α0 α3 < 0 or α0 α3 > 0 ξ = 3 √ −α0/α3 |r| < ξ differential equation: (α0 + α3 r3) y′′ + (β0 + β1 r + β2 r2) y′ + (ε0 + ε1 r) y = 0 discriminant: ∆3 = −27 α32 α02 viii 0 α2 0 α0 α2α0 < 0 r = ± √ − α0 α2 − √ − α0 α2 < r <√ − α0 α2 α2α0 > 0 none −∞ < r < ∞ differential equation: (α0 + α2 r2) y′′ + (β0 + β1 r + β2 r2) y′ + (ε0 + ε1 r) y = 0 discriminant: ∆3 = −4 α23 α0 table 2. tabulating the eight different types of differential equations, which apply to theorem 3.1. 172 vol. 62 no. 1/2022 on generalized heun equation with some mathematical . . . • differential equation: ( α0 + α3 r3 ) u′′(r) + (β1 r + β2 r2) u′(r) + r ε1 u(r) = 0. (33) recurrence formula: ck+2 = − k β1 (k + 1) (k + 2) α0 ck − (k − 1)(k − 2)α3 + (k − 1)β2 + ε1 (k + 1) (k + 2) α0 ck−1. (34) • differential equation: α0 u ′′(r) + (β1 r + β2 r2) u′(r) + ε1 r u(r) = 0. (35) recurrence formula: ck+2 = − k β1 (k + 1) (k + 2) α0 ck − (k − 1) β2 + ε1 (k + 1) (k + 2) α0 ck−1. (36) • differential equation: ( α0 + α2 r2 + α3 r3 ) u′′(r) + β1 r u′(r) + ε1 r u(r) = 0. (37) recurrence formula: ck+2 = − k (k − 1)α2 + k β1 (k + 1) (k + 2) α0 ck − (k − 2)(k − 1)α3 + ε1 (k + 1) (k + 2) α0 ck−1. (38) • differential equation: ( α0 + α3 r3 ) u′′(r) + β1 r u′(r) + ε1 r u(r) = 0. (39) recurrence formula: ck+2 = − k β1 (k + 1)(k + 2) α0 ck − (k − 2) (k − 1) α3 + ε1 (k + 1)(k + 2) α0 ck−1. (40) • differential equation: ( α0 + α2 r2 ) u′′(r) + β1 r u′(r) + ε1 r u(r) = 0. (41) recurrence formula: ck+2 = − k(k − 1)α2 + k β1 (k + 1)(k + 2)α0 ck − ε1 (k + 1)(k + 2) α0 ck−1. (42) • differential equation: ( α0 + α2 r2 ) u′′(r) + (β1 + β2 r2)u′(r) + ε1 r u(r) = 0. (43) recurrence formula: ck+2 = − k(k − 1)α2 + β1 k) (k + 1) (k + 2) α0 ck − (k − 1)β2 + ε1 (k + 1) (k + 2) α0 ck−1. (44) • differential equation: u′′(r) + β1 r u′(r) + ε1 r u(r) = 0, (45) recurrence formula: ck+2 = − k β1 (k + 1) (k + 2) ck − ε1 (k + 1) (k + 2) ck−1. (46) 3.2. polynomial solutions the series solution y(r) = ∑∞ k=0 ck r k terminates to an nth-degree polynomial if cn ̸= 0 and cj = 0 for all j ≥ n + 1. 
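before imposing the termination conditions of this subsection, it is worth noting that the four-term recurrence (26) of theorem 3.1 is immediate to run numerically. the following minimal sketch is ours, not the paper's: it generates the coefficients $c_k$ exactly (as rational numbers) for an arbitrary, purely illustrative choice of the parameters $\alpha_i$, $\beta_i$, $\varepsilon_i$, and then checks the truncated series against equation (4) at a point inside the radius of convergence.

```python
# a minimal sketch (not from the paper): series coefficients c_k of
# y(r) = sum_k c_k r^k about the ordinary point r = 0, generated by the
# four-term recurrence (26), followed by a residual check of equation (4).
from fractions import Fraction as F

def series_coeffs(a0, a1, a2, a3, b0, b1, b2, e0, e1, c0, c1, nmax):
    """exact coefficients c_0 .. c_nmax from (26); requires a0 != 0."""
    c = [F(c0), F(c1)]
    cm1 = F(0)                                   # c_{-1} = 0
    for k in range(nmax - 1):
        A = (k - 1) * ((k - 2) * a3 + b2) + e1   # multiplies c_{k-1}
        B = k * ((k - 1) * a2 + b1) + e0         # multiplies c_k
        C = (k + 1) * (k * a1 + b0)              # multiplies c_{k+1}
        D = (k + 2) * (k + 1) * a0               # multiplies c_{k+2}
        c.append(-(A * cm1 + B * c[k] + C * c[k + 1]) / D)
        cm1 = c[k]                               # shift the window
    return c

# illustrative (assumed) parameter values, chosen only for this demo;
# the leading polynomial 1 + r + r^3 has its nearest real root near -0.69,
# so the series converges for |r| < 0.69.
p = dict(a0=1, a1=1, a2=0, a3=1, b0=1, b1=0, b2=1, e0=2, e1=-1)
c = series_coeffs(**p, c0=1, c1=0, nmax=25)

r = 0.1
y   = sum(float(ck) * r**k for k, ck in enumerate(c))
yp  = sum(k * float(ck) * r**(k - 1) for k, ck in enumerate(c) if k >= 1)
ypp = sum(k * (k - 1) * float(ck) * r**(k - 2) for k, ck in enumerate(c) if k >= 2)
res = (p['a0'] + p['a1']*r + p['a2']*r**2 + p['a3']*r**3) * ypp \
      + (p['b0'] + p['b1']*r + p['b2']*r**2) * yp + (p['e0'] + p['e1']*r) * y
print(c[:5], res)   # first coefficients; the residual should be ~0
```

for the values above, the first computed coefficient $c_2 = -1$ agrees with the closed form $c_2 = -\varepsilon_0 c_0/(2\alpha_0) - \beta_0 c_1/(2\alpha_0)$ quoted after theorem 3.1.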
it is not difficult to show by direct substitution that for polynomial solutions of pn(r) = ∑n k=0 ck r k, it is necessary that ε1 = −n (n − 1) α3 − n β2, n = 0, 1, 2, · · · . (47) 173 nasser saad acta polytechnica furthermore, the polynomial solution coefficients {ck}nk=0 satisfy a four-term recurrence relation, see (26),( (k − 1) ( (k − 2)α3 + β2 ) + ε1;n ) ck−1 + ( k ( (k − 1)α2 + β1 ) + ε0;n ) ck + (k + 1)(kα1 + β0) ck+1 + (k + 1)(k + 2)α0 ck+2 = 0 , k = 0, 1, . . . , n + 1 , (48) that generates a system of (n + 2) linear equations in {ck}nk=0: n−equations︷ ︸︸ ︷︸ ︷︷ ︸ (n+2)−equations the first n equations are  k = 0, → ε0c0 + β0c1 + 2 α0 c2 = 0 k = 1, → ε1c0 + (β1 + ε0)c1 + 2(α1 + β0)c2 + 6 α0 c3 = 0 k = 2, → (β2 + ε1)c1 + (2 α2 + 2 β1 + ε0)c2 + 3(2 α1 + β0)c3 + 12 α0 c4 = 0 k = 3, → (2α3 + 2 β2 + ε1)c2 + (6 α2 + 3 β1 + ε0)c3 + 4(3 α1 + β0)c4 + 20 α0 c5 = 0 ... k = n − 1, → ( (n − 2) ( (n − 3) α3 + β2 ) + ε1;n ) cn−2 + ( (n − 1) ( (n − 2) α2 + β1 ) + ε0;n ) cn−1 +n ( (n − 1) α1 + β0 ) cn = 0. (49) these equations permit the evaluation, using say cramer’s rule, of the coefficients {ck}nk=1 of the polynomial solution in terms of the non-zero constant c0. the (n + 1)th equation( (n − 1) (n − 2) α3 + (n − 1) β2 + ε1 ) cn−1 + ( n(n − 1) α2 + n β1 + ε0 ) cn = 0 , (50) gives our sufficient condition that relates ε0 ≡ ε0;n to the remaining parameters of the differential equation. finally, the (n + 2)th equation ε1;n = −n (n − 1) α3 − n β2 , n = 0, 1, · · · , (51) re-establishes the necessary condition (ε1 ≡ ε1;n) for the existence of the n-degree polynomial solution, see (47). for a non-zero solution, the n + 1 linear equations generated by the recurrence relation (48) require the vanishing of the (n + 1) × (n + 1)-determinant (with four main diagonals and all other entries being zeros) ∆n+1 = ∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣ s0 t1 η1 γ1 s1 t2 η2 γ2 s2 t3 η3 . . . . . . . . . . . . γn−2 sn−2 tn−1 ηn−1 γn−1 sn−1 tn γn sn ∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣ , where sk = ε0;n + k ( (k − 1)α2 + β1 ) , tk = k ( (k − 1)α1 + β0 ) , γk = ε1;n + (k − 1) ( (k − 2)α3 + β2 ) , ηk = k(k + 1)α0 , and for fixed n , ε1;n = −n (n − 1) α3 − n β2 . (52) 174 vol. 62 no. 1/2022 on generalized heun equation with some mathematical . . . a simple relation to evaluate this determinant in terms of lower-degree determinants is given by ∆k+1 = sk ∆k − γk tk ∆k−1 + γk γk−1 ηk−1 ∆k−2 , (∆−2 = ∆−1 = 0, ∆0 = 1, k = 0, 1, . . . , n). (53) although there is a classical theorem [24] that guarantees the simple distinct real roots of the three diagonal matrix, to the best of our knowledge, there is no such theorem available for the matrix-type (52). however, we shall assume, in the following example, that the matrix entries allow for the distinct real roots of the resulting polynomial of ε0;n. illustrative example: • for the zero-degree polynomial solution p0 (r) = 1, i.e., n = 0, the coefficients cj = 0 for all j ≥ 1 and the recurrence relation (28) for k = 0, 1 gives, respectively, the necessary and sufficient conditions ε1;0 = 0 , ε0;0 = 0. (54) • for a first-degree polynomial solution, n = 1, the coefficients cj = 0 for all j ≥ 2 where k = 0, 1, 2 give the following three equations   ε0;1 c0 + β0 c1 = 0, ε1;1 c0 + (β1 + ε0;1) c1 = 0, (β2 + ε1;1) c1 = 0. (55) so, for c0 = 1, it is necessary that ε1;1 = −β2 and therefore, c1 = −ε0;1/β0 where ε0;1 are now the roots of the quadratic equation β0 β2 + β1ε0;1 + ε20;1 = 0. let εℓ0;1, ℓ = 1, 2, denote, if any, the two distinct real roots ε00;1 ̸= ε10;1 of this quadratic equation. 
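a small numerical check of the first-degree construction just described can be made as follows; this is our sketch, not part of the original paper, the parameter values are illustrative assumptions, and the closed form it verifies is exactly the solution (57) given immediately below.

```python
# a check (ours, not the paper's) of the n = 1 case: solve the quadratic
# b0*b2 + b1*e + e^2 = 0 for e = eps_{0;1} and verify that p(r) = 1 - (e/b0) r
# solves the equation with eps_1 = -b2. note p'' = 0, so the cubic leading
# coefficient multiplies zero and drops out of the residual.
import math

b0, b1, b2 = 1.0, 3.0, 2.0                 # assumed demo values
disc = b1**2 - 4.0 * b0 * b2               # discriminant of e^2 + b1*e + b0*b2
assert disc > 0, "two distinct real roots eps_{0;1} are required"

for e in ((-b1 + math.sqrt(disc)) / 2.0, (-b1 - math.sqrt(disc)) / 2.0):
    c1 = -e / b0                           # slope of p(r) = 1 + c1*r
    for r in (0.3, -0.7, 1.9):
        # residual of (b0 + b1 r + b2 r^2) p' + (e - b2 r) p
        res = (b0 + b1*r + b2*r**2) * c1 + (e - b2*r) * (1.0 + c1*r)
        assert abs(res) < 1e-9, res
print("both first-degree polynomial solutions verified")
```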
then, for the two (distinct) differential equations
$$\left( \alpha_0 + \alpha_1 r + \alpha_2 r^2 + \alpha_3 r^3 \right) p''_{1;\ell}(r) + \left( \beta_0 + \beta_1 r + \beta_2 r^2 \right) p'_{1;\ell}(r) + \left( \varepsilon^\ell_{0;1} - \beta_2\, r \right) p_{1;\ell}(r) = 0, \quad \ell = 1, 2, \tag{56}$$
the first-order polynomial solutions are
$$p_{1;\ell}(r) = 1 - \frac{\varepsilon^\ell_{0;1}}{\beta_0}\, r, \qquad \beta_0 \beta_2 + \beta_1 \varepsilon^\ell_{0;1} + \left(\varepsilon^\ell_{0;1}\right)^2 = 0, \quad \ell = 1, 2. \tag{57}$$
• for a second-degree polynomial solution, $n = 2$, the coefficients $c_j = 0$ for all $j \geq 3$, and $k = 0, 1, 2, 3$ give the four linear equations
$$\begin{cases} \varepsilon_{0;2}\, c_0 + \beta_0\, c_1 + 2 \alpha_0\, c_2 = 0,\\ \varepsilon_{1;2}\, c_0 + (\beta_1 + \varepsilon_{0;2})\, c_1 + 2(\alpha_1 + \beta_0)\, c_2 = 0,\\ (\beta_2 + \varepsilon_{1;2})\, c_1 + (2 \alpha_2 + 2 \beta_1 + \varepsilon_{0;2})\, c_2 = 0,\\ (2\alpha_3 + 2 \beta_2 + \varepsilon_{1;2})\, c_2 = 0. \end{cases} \tag{58}$$
the very last equation of (58), corresponding to $k = 3$, gives the necessary condition
$$\varepsilon_{1;2} = -2 \alpha_3 - 2 \beta_2, \tag{59}$$
and, for $k = 0, 1$, the coefficients of the polynomial solution $y(r) = 1 + c_1 r + c_2 r^2$ read
$$c_1 = \frac{\begin{vmatrix} -\varepsilon_{0;2} & 2\alpha_0 \\ 2 \alpha_3 + 2 \beta_2 & 2\alpha_1 + 2\beta_0 \end{vmatrix}}{\begin{vmatrix} \beta_0 & 2\alpha_0 \\ \beta_1 + \varepsilon_{0;2} & 2\alpha_1 + 2\beta_0 \end{vmatrix}}, \qquad c_2 = \frac{\begin{vmatrix} \beta_0 & -\varepsilon_{0;2} \\ \beta_1 + \varepsilon_{0;2} & 2 \alpha_3 + 2 \beta_2 \end{vmatrix}}{\begin{vmatrix} \beta_0 & 2\alpha_0 \\ \beta_1 + \varepsilon_{0;2} & 2\alpha_1 + 2\beta_0 \end{vmatrix}}. \tag{60}$$
the equation corresponding to $k = 2$ and $n = 2$ establishes the sufficient condition
$$\begin{vmatrix} \varepsilon^\ell_{0;2} & \beta_0 & 2\alpha_0 \\ -2 \alpha_3 - 2 \beta_2 & \beta_1 + \varepsilon^\ell_{0;2} & 2\alpha_1 + 2\beta_0 \\ 0 & \beta_2 - 2 \alpha_3 - 2 \beta_2 & 2 \alpha_2 + 2 \beta_1 + \varepsilon^\ell_{0;2} \end{vmatrix} = 0, \tag{61}$$
where $\ell = 1, 2, 3$ refers to the three distinct simple roots $\varepsilon^\ell_{0;2}$, if any, of the polynomial generated by the determinant (61). hence, for each index $\ell = 1, 2, 3$, the differential equation
$$\left( \alpha_0 + \alpha_1 r + \alpha_2 r^2 + \alpha_3 r^3 \right) p''_{2;\ell}(r) + \left( \beta_0 + \beta_1 r + \beta_2 r^2 \right) p'_{2;\ell}(r) + \left( \varepsilon^\ell_{0;2} - (2\alpha_3 + 2\beta_2)\, r \right) p_{2;\ell}(r) = 0, \tag{62}$$
has the polynomial solution (for $\ell = 1, 2, 3$)
$$p_{2;\ell}(r) = 1 + \frac{\begin{vmatrix} -\varepsilon^\ell_{0;2} & 2\alpha_0 \\ 2 \alpha_3 + 2 \beta_2 & 2\alpha_1 + 2\beta_0 \end{vmatrix}}{\begin{vmatrix} \beta_0 & 2\alpha_0 \\ \beta_1 + \varepsilon^\ell_{0;2} & 2\alpha_1 + 2\beta_0 \end{vmatrix}}\, r + \frac{\begin{vmatrix} \beta_0 & -\varepsilon^\ell_{0;2} \\ \beta_1 + \varepsilon^\ell_{0;2} & 2 \alpha_3 + 2 \beta_2 \end{vmatrix}}{\begin{vmatrix} \beta_0 & 2\alpha_0 \\ \beta_1 + \varepsilon^\ell_{0;2} & 2\alpha_1 + 2\beta_0 \end{vmatrix}}\, r^2. \tag{63}$$
the above constructive approach can be continued to generate higher-order polynomial solutions of an arbitrary degree.
theorem 3.2. suppose the polynomial in $\varepsilon^\ell_{0;n}$ generated by the determinant (52) has $n + 1$ distinct real roots arranged in ascending order $\varepsilon^0_{0;n} < \varepsilon^1_{0;n} < \cdots < \varepsilon^n_{0;n}$; then the eigenvalue problem
$$\left( \alpha_0 + \alpha_1 r + \alpha_2 r^2 + \alpha_3 r^3 \right) \frac{d^2 p_{n;\ell}}{dr^2} + \left( \beta_0 + \beta_1 r + \beta_2 r^2 \right) \frac{dp_{n;\ell}}{dr} - n \left( (n - 1)\, \alpha_3 + \beta_2 \right) r\, p_{n;\ell} = -\varepsilon^\ell_{0;n}\, p_{n;\ell}, \tag{64}$$
has a polynomial solution of degree $n$, for $\ell = 1, 2, \ldots, n + 1$.
figure 1. a graphical representation of theorem 3.2: for each degree $n = 0, 1, 2, 3, \ldots$, the $n + 1$ roots $\varepsilon^\ell_{0;n}$ are paired with the corresponding polynomial solutions $p_{n;\ell}(r)$.
open problem: it is an open question to establish the condition(s) on the parameters so that the polynomial generated by the determinant (52) has simple and real distinct roots.
4. the solutions in the neighbourhood of a singular point
4.1. series solution and infinite sequence of orthogonal polynomials $\{p_k(\varepsilon_0)\}_{k=0}^{\infty}$
as mentioned earlier, if $\alpha_0 = 0$, there are seven subclasses characterized by the equation
$$r \left( \alpha_1 + \alpha_2 r + \alpha_3 r^2 \right) y'' + \left( \beta_0 + \beta_1 r + \beta_2 r^2 \right) y' + \left( \varepsilon_0 + \varepsilon_1 r \right) y = 0. \tag{65}$$
the classification of these seven equations, along with their singularities and the associated domains, is summarized in table 3.
from this table, it is noted that if α1 ̸= 0, there are four subclasses where the point r = 0 is a regular singular point, while if α1 = 0, the condition β0 = 0 is necessary to ensure that r = 0 is a regular singular point for two additional subclasses and the last equation is a class where r = 0 is irregular singular point unless we reduce to euler’s type (α1 = α2 = β1 = β0 = ε0 = 0). in the neighbourhood of the regular singular point r = 0, the formal series solution y(r) = rs ∑∞ k=0 ckr k is then valid within the interval (0, ζ) where ζ is the nearest singular point obtained via the roots of the quadratic equation α1 + α2 r + α3r2 = 0. here, s are the roots of the indicial equation α1 s(s − 1) + β0 s = 0, i.e. s1 = 0 and s = 1 − β0/α1. using frobenius method, it is straightforward to show that the coefficients {ck}∞k=0 satisfy the three-term recurrence relation (k + s + 1) ( α1(k + s) + β0 ) ck+1 + ( (k + s)[α2 (k + s − 1) + β1] + ε0 ) ck + ( (k + s − 1)[α3(k + s − 2) + β2] + ε1 ) ck−1 = 0 , (66) where k = 1, 2, . . . . for   c−1 = 0, c0 = 1, c1 = − s(α2 (s − 1) + β1) + ε0 (α1 s + β0)(s + 1) = − p1;s(ε0) α1 ( s + β0 α1 ) (s + 1) , this equation can be written as ck+2 = λ0(k) ck+1 + s0(k) ck, where   λ0(k) = − (α2 (k + s) + β1) (k + s + 1) + ε0 (α1(k + s + 1) + β0) (k + s + 2) , s0(k) = − (α3(k + s − 1) + β2) (k + s) + ε1 (α1(k + s + 1) + β0) (k + s + 2) , from this equation, we note that ck+3 = λ1(k) ck+1 + s1(k) ck,   λ1(k) = λ0(k + 1) λ0(k) + s0(k + 1)s1(k) = λ0(k + 1) s0(k), ck+4 = λ2(k) ck+1 + s2(k) ck,   λ2(k) = λ1(k + 1)λ0(k) + s1(k + 1)s2(k) = λ1(k + 1)s0(k), ck+5 = λ3(k) ck+1 + s3(k) ck,   λ3(k) = λ2(k + 1)λ0(k) + s2(k + 1)s3(k) = λ2(k + 1)s0(k), and in general ck+m = λm−2(k) ck+1 + sm−2(k) ck,   λm(k) = λm−1(k + 1)λ0(k) + sm−1(k + 1)sm(k) = λm−1(k + 1)s0(k), and therefore c2 = (s + 1)(α2 s + β1) + ε0 (α1(s + 1) + β0)(s + 2) ( s(α2 (s − 1) + β1) + ε0 (α1 s + β0)(s + 1) ) − s(α3(s − 1) + β2) + ε1 (α1(s + 1) + β0)(s + 2) = p2;s(ε0) α21 ( s + β0 α1 ) 2 (s + 1)2 . 
(67) 177 nasser saad acta polytechnica de α3 α2 α1 condition roots of lpc domain definition i α3 α2 α1 a22 − 4a1a3 > 0 r = 0, ξ+ ̸= ξ− r ∈ (0, min ξ±) if ξ± > 0 r ∈ (max ξ±, 0) if ξ± < 0 r ∈ (0, ξ+) if ξ− < 0 < ξ+, |ξ−| > ξ+ r ∈ (ξ−, 0) if ξ− < 0 < ξ+, |ξ−| < ξ+ a22 − 4a1a3 = 0 r = 0, ξ+ = ξ− = ξ r ∈ (0, ξ) differential equation: r (ξ1 − r)(ξ2 − r) y′′ + (β0 + β1 r + β2 r2) y′ + (ε0 + ε1 r) y = 0 roots: r = 0; r = ξ± ≡ (−α2 ± √ α22 − 4α1α3)/(2α3) singularity: r = 0, ξ±, ∞: regular differential equation: r (ξ − r)2 y′′ + (β0 + β1 r + β2 r2) y′ + (ε0 + ε1 r) y = 0 roots: r = 0; r = ξ ≡ −α2/(2α3) singularity: r = 0, ξ: regular; r = ∞: irregular ii 0 α2 α1 r = 0, r = −α1/α2 r ∈ (0, −α1/α2) if α1/α2 < 0 r ∈ (−α1/α2, 0) if α1/α2 > 0 differential equation: r(α1 + α2 r) y′′ + (β0 + β1 r + β2 r2) y′ + (ε0 + ε1 r) y = 0 singularity: r = 0, −α1/α2: regular; r = ∞: irregular iii α3 0 α1 a1a3 < 0 r = 0, ± √ −α1/α3 r ∈ (0, √ −α1/α3) α1α3 > 0 r = 0, r ∈ (0, ∞) differential equation: r (α3r2 + α1) y′′ + (β0 + β1 r + β2 r2) y′ + (ε0 + ε1 r) y = 0, α3α1 < 0 singularity: r = 0, ± √ −α1/α3, ∞: regular differential equation: r (α3r2 + α1) y′′ + (β0 + β1 r + β2 r2) y′ + (ε0 + ε1 r) y = 0, α3α1 > 0 singularity: r = 0: regular; r = ∞: irregular iv α3 α2 0 β0 = 0 r = 0, −α2/α3 r ∈ (0, −α2/α3) if α2/α3 < 0 r ∈ (−α2/α3, 0) if α2/α3 > 0 differential equation: r2(α3r + α2) y′′ + r(β1 + β2 r) y′ + (ε0 + ε1 r) y = 0 singularity: r = 0, −α2/α3: regular; r = ∞: irregular v 0 0 α1 r = 0 r ∈ (0, ∞) differential equation: α1 r y′′ + (β0 + β1 r + β2 r2) y′ + (ε0 + ε1 r) y = 0 singularity: r = 0: regular; r = ∞: irregular vi 0 α2 0 β0 = 0 r = 0 r ∈ (0, ∞) differential equation: α2 r2 y′′ + r(β1 + β2 r) y′ + (ε0 + ε1 r) y = 0 singularity: r = 0 : regular; r = ∞: irregular vii α3 0 0 r = 0 r ∈ (0, ∞) differential equation: α3 r3 y′′ + (β0 + β1 r + β2 r2) y′ + (ε0 + ε1 r) y = 0 singulaity: r = 0, ∞: irregular table 3. tabulating the seven different types of differential equations, which apply to theorem 4.1. 178 vol. 62 no. 1/2022 on generalized heun equation with some mathematical . . . initiated with p2,s(ε) = ((s + 1)(α2 s + β1) + ε0)p1;s(ε0) − (α1 s + β0)(s + 1)(s(α3(s − 1) + β2) + ε1) continuing with this process, it is straightforward to conclude that the series solution can be written as y(r) = rs ∞∑ k=0 ck rk = ∞∑ k=0 (−1)k pk;s(ε0) αk1 ( β0 α1 + s ) k (1 + s)k rk+s, (68) where the k-degree polynomials of the parameter ε0, namely {pk;s(ε0)}∞k=0, satisfy the following three-term recurrence relation: pk+1;s(ε0) = ( (k + s) [ (k + s − 1)α2 + β1 ] + ε0 ) pk;s(ε0) − (k + s) ( (k + s − 1)α1 + β0 )( (k + s − 1) × [ (k + s − 2)α3 + β2 ] + ε1 ) pk−1;s(ε0), (69) initiated with p−1;s(ε0) = 0 and p0;s(ε0) = 1. for the classes i-iv in table 3, including, of course, the classical heun equation, r = 0 is a regular singular point with one of the exponents of singularities being s = 0, in which case, the coefficients {ck}∞k=0 of the series solution y(r) = ∑∞ k=0 ckr k satisfy the three-term recurrence relation( (k + 1)(k α1 + β0) ) ck+1 + ( k ( (k − 1)α2 + β1 ) + ε0 ) ck + ( (k − 1) ( (k − 2)α3 + β2 ) + ε1 ) ck−1 = 0, (70) and we have the following general result concerning the series solutions of the equation (65): theorem 4.1. in the neighbourhood of the regular singular point r = 0, the series solution y(r) = ∑∞ k=0 ckr k of the differential equation (65), with α1 ̸= 0, is explicitly given by y(r) = ∞∑ k=0 (−1)k pk(ε0) k! 
αk1 ( β0 α1 ) k rk , (71) where the infinite sequence {pk(ε0)}∞k=0 is evaluated using the three-term recurrence relation pk+1(ε0) = ( k(k − 1)α2 + kβ1 + ε0 ) pk(ε0) − k ( (k − 1)α1 + β0 ) × ( (k − 1)(k − 2)α3 + (k − 1)β2 + ε1 ) pk−1(ε0), (72) where p−1(ε0) = 0, and p0(ε0) = 1. here, (α)n refers to the pochhammer symbol (α)n = α(α + 1) · · · (α − n + 1) = γ(α + n)/γ(α) which is defined in terms of gamma functions and satisfies the identity (−n)k = 0 for any positive integers k ≥ n + 1. equation (72) in theorem follows directly by substituting the coefficients of (71) in the recurrence relation (65) and eliminates the common terms. corollary 4.2. in the neighbourhood of the regular singular point r = 0, the series solution y(r) = ∑∞ k=0 ckr k of the differential equation r (α1 + α3r2) y′′ + (β0 + β1 r + β2 r2) y′ + (ε0 + ε1 r) y = 0, (73) is given, explicitly, by y(r) = ∞∑ k=0 (−1)k pk(ε0) k! αk1 ( β0 α1 ) k rk, (74) where pk+1(ε0) = (k β1 + ε0) pk(ε0) − k ((k − 1)α1 + β0) × ((k − 1)(k − 2)α3 + (k − 1)β2 + ε1) pk−1(ε0), (75) initiated with p−1(ε0) = 0, p0(ε0) = 1. 179 nasser saad acta polytechnica corollary 4.3. in the neighbourhood of the regular singular point r = 0, the series solution y(r) = ∑∞ k=0 ckr k of the differential equation r (α1 + α2 r) y′′ + (β0 + β1 r + β2 r2) y′ + (ε0 + ε1 r) y = 0, (76) is given, explicitly, by y(r) = ∞∑ k=0 (−1)k pk(ε0) k! αk1 ( β0 α1 ) k rk, (77) where pk+1(ε0) = (k(k − 1)α2 + kβ1 + ε0)pk(ε0) − k((k − 1)α1 + β0)((k − 1)β2 + ε1)pk−1(ε0), (78) initiated with p−1(ε0) = 0, p0(ε0) = 1. corollary 4.4. in the neighbourhood of the regular singular point r = 0, the series solution y(x) = ∑∞ k=0 ckr k of the differential equation α1 r y ′′ + (β0 + β1 r + β2 r2) y′ + (ε0 + ε1 r) y = 0, (79) is given, explicitly, by y(r) = ∞∑ k=0 (−1)k pk(ε0) k! αk1 ( β0 α1 ) k rk, (80) where pk+1(ε0) = (kβ1 + ε0)pk(ε0) − k ((k − 1) α1 + β0)((k − 1)β2 + ε1)pk−1(ε0), (81) initiated with p−1(ε0) = 0, p0(ε0) = 0. corollary 4.5. in the neighbourhood of the regular singular point x = 0, the series solution y(x) = ∑∞ k=0 ckx k of the differential equation α1r y ′′ + (β0 + β2 r2) y′ + (ε0 + ε1 r) y = 0, (82) is given, explicitly, by y(r) = ∞∑ k=0 (−1)k pk(ε0) k! αk1 ( β0 α1 ) k rk, (83) where pk+1(ε0) = ε0pk(ε0) − k((k − 1)α1 + β0)((k − 1)β2 + ε1)pk−1(ε0), (84) initiated with p−1(ε0) = 0, p1(ε0) = 1. remark 4.6. if, in addition to α0 = 0, we also have α1 = 0, then r = 0 is a regular singular point only if β0 = 0, in which case the differential equation reduces to an equation that resembles euler’s equation, namely r2 ( α2 + α3r ) y′′ + r ( β1 + β2 r ) y′ + ( ε0 + ε1 r ) y = 0. (85) the exponents of the singularity r = 0 are s± = ( α2 − β1 ± √ (α2 − β1)2 − 4α2ε0 ) /(2α2). from the relation (66), the coefficients of the formal series solution y(r) = rs ∑∞ k=0 ck r k satisfy the two-term recurrence relation (k = 1, 2, . . . , c0 = 1), ck = − (k−1+s±)(k−2+s±)α3+(k−1+s±)β2+ε1 (k+s±)(k−1+s±)α2+(k+s±)β1+ε0 ck−1, = k∏ j=1 (−1)j (j−1+s±)(j−2+s±)α3+(j−1+s±)β2+ε1(j+s±)(j−1+s±)α2+(j+s±)β1+ε0 , (86) that allows to obtain a closed form of the series solution of (71) in terms of the generalized hypergeometric function as y(r) =rs± 3f2 ( 1, s± − 12 + β2 2α3 − √ (α3 −β2 )2 −4α3 ε1 2α3 , s± + 12 + β2 2α3 − √ (α3 −β2 )2 −4α3 ε1 2α3 ; s± + 12 + β1 2α2 − √ (α2 −β1 )2 −4α2 ε0 2α2 , s± + 12 + β1 2α2 + √ (α2 −β1 )2 −4α2 ε0 2α2 ; − α3 α2 r ) . (87) 180 vol. 62 no. 1/2022 on generalized heun equation with some mathematical . . . 4.2. polynomial solution and finite sequence of orthogonal polynomials theorem 4.7. 
the necessary condition for the second-order linear differential equation (65) to have an nthdegree polynomial solution yn(r) = ∑n k=0 ck r k , n = 0, 1, 2, . . ., in the neighbourhood of the regular singular point r = 0 with one of the indicial equation exponents s = 0, is ε1;n = −n (n − 1) α3 − n β2 , n = 0, 1, 2, . . . , (88) along with the sufficient condition, relating the remaining coefficients, given by the vanishing of the tridiagonal (n + 1) × (n + 1)-determinant ∆n+1 ≡ 0 given by ∆n+1 = ∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣ s0 t1 γ1 s1 t2 γ2 s2 t3 . . . . . . . . . γn−2 sn−2 tn−1 γn−1 sn−1 tn γn sn ∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣ , (89) where, for fixed n : ε1;n = −n (n − 1) α3 − n β2,  sk = ε0;n + k ( (k − 1)α2 + β1 ) , tk = −k ( (k − 1)α1 + β0 ) , γk = −ε1;n − (k − 1) ( (k − 2)α3 + β2 ) , and all other entries are zeros. in this case, the polynomial solutions are given explicitly by yn(r) = n∑ k=0 (−1)k p nk (ε0;n) k! αk1 ( β0 α1 ) k rk , (90) where the finite orthogonal sequences {p nk (ε0;n)} n k=0 are evaluated using the three-term recurrence relation p nk+1(ε0;n) = ( sk + ε0;n ) p nk (ε0;n) − γktkp n k−1(ε0;n) , or, more explicitly, p nk+1(ε0;n) = ( k(k − 1)α2 + kβ1 + ε0;n ) p nk (ε0;n) + k(n − k + 1) ( (k − 1)α1 + β0 ) × ( β2 + α3(k + n − 2) ) p nk−1(ε0;n) , (91) where p n−1(ε0;n) = 0, and p n0 (ε0;n) = 1 for the non-negative integer n. expanding ∆k+1 with respect to the last column, it is clear that the determinant (89) satisfies a three-term recurrence relation { ∆k+1 = (sk + ε0;n) ∆k − γk tk ∆k−1, ∆0 = 1, ∆−1 = 0, k = 0, 1, . . . , n , (92) that allow to compute the determinant ∆k recursively in terms of lower-order determinants. we now show, by induction on k, that ∆k+1 = pk+1(ε0;n). (93) for k = 0, we find by (89) that ∆1 = (s0 + ε0;n) where the right hand side equals to p n1 (ε0;n) using (91). next, suppose that ∆j = pj (ε0;n), for j = 0, 1, 2, · · · , k, then from (91) p nk+1(ε0;n) = ( sk + ε0;n ) p nk (ε0;n) − γk tk p n k−1(ε0;n) = ( sk + ε0;n ) ∆k − γk tk ∆k−1 = ∆k+1 and the induction step is reached. these results can be represented by the graphical representation (figure 2). some of the mathematical properties of the finite sequence of polynomials {p nk (ε0;n)} n k=0 will be explored in later sections. 181 nasser saad acta polytechnica ∆k+1 = det s0 t1 0 ... 0 γ1 s1 t2 ... 0 0 γ2 s2 ... 0 · · · · · · · · · . . . ... 0 0 0 · · · sn     p n1 p n2 p n3 p nn figure 2. a demonstration of how the polynomials {p nk (ε0;n)} n k=0 may be obtained from the (k + 1)-determinant ∆k+1 for k = 0, 1, 2, . . . , n . remark 4.8. for α3 + α2 + α1 = 0, the canonical form of heun’s equation can be deduced from (65) by means of the following substitutions: y′′(r) +   β0+β1+β3α3−α1 r − 1 + β0 α1 r + α23β0+α1α3β1+α 2 1β2 α1α3(α1−α3) (r − α1 α3 )   y′(r) + ε1α3 r + ε0α3 r (r − 1) ( r − α1 α3 ) y(r) = 0. (94) or, simply in the standard form as y′′(r) + ( γ r + δ r − 1 + ε r − b ) y′(r) + α β r − q r(r − 1)(r − b) y(r) = 0, (95) where γ δ ε α β q b ⇓ ⇓ ⇓ ⇓ ⇓ ⇓ β0 α1 β2 + β1 + β0 α3 − α1 β2α 2 1 + β1α1α3 + β0α23 α3α1(α1 − α3) β2 + (n − 1)α3 α3 −n − ε0 α3 α1 α3 ⇑ ⇑ ⇑ ⇑ ⇑ ⇑ γ δ ε β α q b where, in either case, it follows γ + δ + ε = α + β + 1 that ensures the regularity of the singular point ∞. with these parameters, the sturm-liouville form of the differential equation (65) is − d dr ( rγ (r − 1)δ (r − b)ε dy dr ) + α β rγ (r − 1)δ−1(r − b)ε−1y = q rγ−1(r − 1)δ−1(r − b)ε−1 y (96) where, for b ≥ 1, γ ≥ 0, δ ≥ 1, r ∈ (0, 1) . corollary 4.9. 
the second-order linear differential equation r2(α3 r + α2)y′′(r) + r (β2 r + β1) y′ + (−(n(n − 1) α3 + n β2) r + ε0) y = 0, (97) where r ∈ (−α2/α3, 0) if α2α3 > 0 or r ∈ (0, α2/α3) if α2α3 < 0, has a polynomial solution of degree n subject to   n∏ k=0 (ε0 + k((k − 1)α2 + β1) = 0 =⇒ ε0 = −n (n − 1) α2 − n β1, n = 0, 1, 2, · · · . (98) 182 vol. 62 no. 1/2022 on generalized heun equation with some mathematical . . . in particular, the differential equation r2(α3r + α2) y′′(r) + r(β2 r + β1) y′(r) − ( (n (n − 1) α3 + n β2) r + n (n − 1) α2 + n β1 ) y(r) = 0, (99) has the polynomial solutions yn(r) = rn, n = 0, 1, 2, . . . . (100) proof. follows immediately from theorem 4.7 with α1 = β0 = 0. 5. mathematical properties of the orthogonal polynomials {pk(ε0)}∞k=0 as pointed out by theorem 4.1, in the neighbourhood of the singular point r = 0 with an indicial exponent root zero, the series solution of the differential equation with four singular points, see (65), r ( α1 + α2 r + α3r2 ) y′′ + ( β0 + β1 r + β2 r2 ) y′ + ( ε0 + ε1 r ) y = 0 . can be written as y(r) = ∞∑ k=0 (−1)k pk(ε0) k! αk1 ( β0 α1 ) k rk , (101) where the infinite sequence of polynomials {pk(ε0)}∞k=0 in the real variable ε0 satisfies the three-term recurrence relation pk+1(ε0) = (ε0 − ak)pk(ε0) − bkpk−1(ε0), (102) initiated with p−1(ε0) = 0, p0(ε0) = 1, k = 1, 2, 3, · · · . where ak = −k(k − 1)α2 − kβ1, bk = k ( (k − 1)α1 + β0 )( (k − 1)((k − 2)α3 + β2) + ε1 ) . for ak, bk ∈ r and if bk > 0, then according to favard theorem [25], see also [26, theorem 2.14], there exists a positive borel measure µ such that {pk}∞k=0 is orthogonal with respect to the inner product ⟨pk, pk′ ⟩ = ∫ r pk(ε0)pp′ (ε0)dµ (103) such that ∫ r pk(ε0)pk′ (ε0)dµ = pkpk′ δkk′ , ∫ r dµ = 1, (104) where δkk′ is the kronecker symbol. in particular,∫ r εk0 pk′ (ε0)dµ = 0 f or all 0 < k < k ′. (105) the norm pk can be found using the recurrence relations (102) by multiplying with εk−10 and taking the integral over ε0 with respect to µ that yields ∫ r εk0 pk(ε0)dµ = bk ∫ r εk−10 pk−1(ε0)dµ = bk bk−1 ∫ r εk−20 pk−2(ε0)dµ = · · · =   k∏ j=2 bj  ∫ r dµ (106) 183 nasser saad acta polytechnica and ∫ r pk(ε0)pk′ (ε0) dµ = k! (α1 α3)k β0ε1 ( β0 α1 ) k × ( −α3 + β2 − √ (α3 − β2)2 − 4α3ε1 2α3 ) k × ( −α3 + β2 + √ (α3 − β2)2 − 4α3ε1 2α3 ) k δkk′ . (107) using the recurrence relation (102), it also follows that ∫ r ε0[pk(ε0)]2dµ = − k((k − 1)α2 + β1)k! (α1 α3)k β0ε1 ( β0 α1 ) k ( −α3 + β2 − √ (α3 − β2)2 − 4α3ε1 2α3 ) k × ( −α3 + β2 + √ (α3 − β2)2 − 4α3ε1 2α3 ) k . (108) further, for k = 0, 1, 2, · · · ,∫ r ε0pk+1(ε0)pk(ε0)dµ = (k + 1)! (α1 α3)k+1 β0ε1 × ( β0 α1 ) k+1 ( −α3 + β2 − √ (α3 − β2)2 − 4α3ε1 2α3 ) k+1 ( −α3 + β2 + √ (α3 − β2)2 − 4α3ε1 2α3 ) k+1 . (109) other integrals can be evaluated similarly, for example ∫ r[ε0pk(ε0)] 2dµ can be evaluated by multiplying (102) by ε0pk(ε0) and integrate with respect to the measure µ using (107), (108), and (109) and we continue similarly for ∫ r εm0 [pk(ε0)] 2dµ, m = 0, 1, 2, · · · . the recurrence relations (102) for x = ε0 and y = ε′0 read pk+1(x) = ( x − ak ) p nk (x) − bkp n k−1(x) , pk+1(y) = ( y − ak ) p nk (y) − bkp n k−1(y), respectively. by multiplying the first by pk(y) and the second by pk(x) and subtracting, the resulting equation becomes (x − y)pk(y)pk(x) = qk+1(x, y) − bk qk(x, y) (110) where qk+1(x, y) = pk+1(x)pk(y) − pk(x)pk+1(y). thus, recursively over k, we have (x − y)pk(x)pk(y) = qk+1(x, y) − bk qk(x, y) (x − y)pk−1(x)pk−1(y) = qk(x, y) − bk−1 qk−1(x, y) ... 
(x − y)p n0 (x)p n 0 (y) = q1(x, y) , from which it is straightforward to obtain (x − y) [ pk(x)pk(y) + bkpk−1(x)pk−1(y) + bk bk−1pk−2(x)pk−2(y) + bk bk−1bk−2pk−3(x)pk−3(y) + · · · + λk+1λkλk−1λk−2 . . . λ2p0(x)p0(y) ] = qk+1(ε0, y) . dividing both sides by (x − y)bk bk−1bk−2 . . . b2 and summing over k results in k∑ j=0 pj (x)pj (y) bj bj−1bj−2 . . . b2 = (bk bk−1bk−2 . . . b2)−1 × pk+1(x)pk(y) − p nk (x)pk+1(y) x − y . 184 vol. 62 no. 1/2022 on generalized heun equation with some mathematical . . . (101) then follows using bk bk−1bk−2 . . . b2 = k∏ i=2 bi = k! (α1 α3)k β0ε1 ( β0 α1 ) k × ( −α3 + β2 − √ (α3 − β2)2 − 4α3ε1 2α3 ) k × ( −α3 + β2 + √ (α3 − β2)2 − 4α3ε1 2α3 ) k and finally, we have, for k ≥ 0, christoffel-darboux identities: k∑ j=0 pj (x)pj (y) j! (α1 α3)j ( β0 α1 ) j (ξ+)j (ξ−)j = pk+1(x)pk(y) − pk(x)pk+1(y) k! (α1 α3)k ( β0 α1 ) k (ξ+)k (ξ−)k (x − y) , (111) where ξ± = −α3 + β2 ± √ (α3 − β2)2 − 4α3ε1 2α3 and by evaluating the limit of both sides as y → x, its confluent form k∑ j=0 [pj (x)]2 j! (α1 α3)j (ξ+)j (ξ−)j = p ′k+1(x)pk(x) − p ′ k(x)pk+1(x) k! (α1 α3)k ( β0 α1 ) k (ξ+)k (ξ−)k (112) follows. here, the prime refers to the derivative with respect to the variable x. as a direct consequence of the christoffel-darboux formula (112), all the zeros of the n-degree polynomial pn(ε) are simple. to prove that they are also real, we note that the recurrence relation (102) can be written in a matrix form as x   p0(x) p1(x) p2(x) ... pk−1(x)   =   a0 1 0 · · · 0 0 b1 a1 1 · · · 0 0 0 b2 a2 · · · 0 0 ... ... ... . . . ... ... 0 0 0 · · · bk−1 ak−1     p0(x) p1(x) p2(x) ... pk−1(x)   + pk(x)   0 0 0 ... 1   (113) thus, if xi is a zero of pk(x), it is an eigenvalue of the given tridiagonal matrix. since, by the hypothesis of (102), bk > 0 for all k ≥ 1, the results of arscott [24] confirm that (i) the zeros of pk−1(x) and pk(x) interlace – that is, between two consecutive zeros of either polynomial lies precisely one zero of the other (ii) at the zeros of pk(x) the values of pk−1(x) are alternately positive and negative, (iii) all the zeros of pk(x) – i.e. all the eigenvalues of tridiagonal matrix are real and different. 6. mathematical properties of the finite orthogonal polynomials {p nk (ε0)}nk=0 in this section, we shall study some of the mathematical properties of the orthogonal polynomials {p nk (ε0;n)} n k=0. first, the zeros of the polynomial generated by the aforementioned determinant are all simple. this fact can be confirmed by establishing the christoffel-darboux formula. denote x = ε0;k and y = ε0;k′ , where k ̸= k′ and k, k′ = 0, 1, 2, · · · , n − 1: for x ̸= y k∑ j=0 p nj (x)p n j (y) j!(α1α3)j (−n)j ( β0 α1 ) j ( β2 α3 + n − 1 ) j = p nk+1(x)p n k (y) − p n k (x)p n k+1(y) k!(α1α3)k(−n)k ( β0 α1 ) k ( β2 α3 + n − 1 ) k (x − y) , (114) while, for the limit y → x, k∑ j=0 ( p nj (x) )2 j!(α1α3)j (−n)j ( β0 α1 ) j ( β2 α3 + n − 1 ) j = [p nk+1(x)] ′p nk (x) − [p n k (x)] ′p nk+1(x) k!(α1α3)k(−n)k ( β0 α1 ) k ( β2 α3 + n − 1 ) k . (115) 185 nasser saad acta polytechnica here, the prime refers to the derivative with respect to the variable x. if x = xk is a zero of the polynomial p nk (x) with multiplicity > 1, then p n k (xk) = 0 and (115) yields the contradiction 0 < k−1∑ j=0 ( p nj (xi) )2 j!(α1α3)j (−n)j ( β0 α1 ) j ( β2 α3 + n − 1 ) j = 0, (116) and the zeros of the polynomial p nk (x), k = 1, 2, · · · n are distinct. 6.1. 
norms of the orthogonal polynomials denote ε0;n = x, the general theory of orthogonal polynomials [27] guarantees that the finite sequence of polynomials {pk(x)}nk=0 form a set of orthogonal polynomials for each n. this implies the existence of a certain weight function, w(x), which can be normalized as∫ dw = 1 , (117) for which ∫ pk(x)pk′ (x)dw = pk pk′ δkk′ , 0 ≤ k, k′ ≤ n , (118) where pk denotes the norms of polynomials pk(x). these norms can be found from the recurrence relations (36) by multiplying with xk−1w(x) and taking the integral over x yields the recurrence formula∫ x k p nk (x) w(x) dx = −k(n − k + 1) ( (k − 1)α1 + β0 ) × ( β2 + α3(k + n − 2) )∫ x k−1p nk−1(x)w(x)dx , (119) and thus ∫ p nk (x) x k w(x) dx = k! (α1α3)k (−n)k ( β0 α1 ) k ( β2 α3 + n − 1 ) k . (120) from which it follows∫ [p nk (x)] 2w(x)dx = p2k = k! (α1α3) k (−n)k ( β0 α1 ) k ( β2 α3 + n − 1 ) k (121) for all 0 ≤ k ≤ n. because of the pochhammer identity (−n)k = 0 for k > n, it follows from (71) that the norms of all polynomials p nk (x) with k ≥ n + 1 vanish. thus pk = 0 , k ≥ n + 1 . (122) we may also note, using the recurrence relation, that∫ x[pk(x)]2w(x)dx = −k((k − 1)α2 + β1) k! (α1α3)k (−n)k ( β0 α1 ) k ( β2 α3 + n − 1 ) k . (123) 6.2. the zeros of the polynomials {p nk (ε0;n)} n k=0 one of the important properties of the polynomials p nn+1(ε0;n) concerns their zeros. an argument provided by arscott [24] proves that if the product (γk · tk) > 0 for all k = 1, 2, . . . , n, then the polynomials that satisfy the tri-diagonal determinant (67) are real and simple. let us denote that the roots of the polynomials p nn+1(ε0;n) = 0 by εℓ0;n, ℓ = 0, 1, . . . , n such that p nn+1(ε ℓ 0;n) = 0 , (124) where ε00;n < ε 1 0;n < · · · < ε n 0;n . 186 vol. 62 no. 1/2022 on generalized heun equation with some mathematical . . . in particular, since p nn+1(ε0;n) is of degree n + 1 and all the roots are simple and different, it follows that p nn+1(ε0;n) = n∏ ℓ=0 (ε0;n − εℓ0;n) . (125) the ‘discrete’ weight function w can be computed numerically [28] using (118), (119) and (125) for the given n. denote pℓ(ε0) = p nℓ (ε0), and let the roots of p n n+1(ε0;n) = 0 be ε j 0;n arranged in ascending order for j = 0, 1, 2, · · · , n. the weights wj , j = 0, 1, · · · , n, for the orthogonal polynomials {p nℓ (ε0;n)} n k=0 can be computed by solving the linear system n∑ j=0 wj p nℓ (ε j 0;n) = 0 (126) for ℓ = 0, 1, · · · , n. 6.3. factorization property another interesting property of the polynomials {p nk (x)} n k=0, aside from being an orthogonal sequence, is that when the parameter n takes positive integer values, the polynomials exhibit a factorization property. clearly, the factorization occurs because the third term in the recursion relation (36) vanishes when k = n + 1, so that all subsequent polynomials have a common factor p nn+1(ζ) called a critical polynomial. indeed, all the polynomials p nk+n+1(x), beyond the critical polynomial p n n+1(x) are factored into the product p nk+n+1(x) = q n k (x) p n n+1(x), k = 0, 1, . . . , (127) where the sequence {qnk (x)} are polynomials of degree k = 0, 1, . . . . interestingly, the quotient polynomials {qnk (x)} ∞ k=0 form an infinite sequence of orthogonal polynomials. to prove this claim, we substitute (128) into (36) and re-index the polynomials to eliminate the common factor p nn+1(ζ) from both sides. 
the recurrence relation (36) then reduces to a three-term recurrence relation for the polynomials {qnk (ζ)}k≥0 that reads qnk (x) = ( (k + n)(k + n − 1)α2 + (k + n)β1 + x ) qnk−1(x) − (k + n)(k − 1) ( (k + n − 1)α1 + β0 ) × ( β2 + α3(k + 2n − 2) ) qnk−2(x) , (128) where qn−1(ζ) = 0, and qn0 (ζ) = 1. hence, the quotient polynomials qnk (ζ) also form a new sequence of orthogonal polynomials for each value of n. for example, if n = 2, the critical polynomial is p 23 (x) = x 3 + (2α2 + 3β1)x2 + 2 ( (3α3 + 2β2)β0 + β1(α2 + β1) + α1(2α3 + β2) ) x + 4β0(α2 + β1)(α3 + β2) . (129) and p 24 (x) = ( x + 6α2 + 3β1 ) p 23 (x) , p 25 (x) = ( x 2 + (18α2 + 7β1)x − 4 ( (3α1 + β0)(4α3 + β2) − 3(2α2 + β1)(3α2 + β1) )) p 23 (x) , p 26 (x) = ( x 3 + (38α2 + 12β1)x2 + ( 432α22 + 290α2β1 + 47β 2 1 − 2(124α1α3 + 33α3β0 + 26α1β2 + 7β0β2) ) x − 10 ( β1(84α1α3 + 23α3β0 − 6β21 + 18α1β2 + 5β0β2) + 2α2 ( 31α3β0 − 27β21 + 7β0β2 + 12α1(9α3 + 2β2) ) − 144α32 − 156α 2 2β1 )) p 23 (x) , ... from which we have q0(x) = 1 , q1(x) = x + 6α2 + 3β1 , q2(x) = x2 + (18α2 + 7β1)x − 4 ( (3α1 + β0)(4α3 + β2) − 3(2α2 + β1)(3α2 + β1) ) , q3(x) = x3 + (38α2 + 12β1)x2 + ( 432α22 + 290α2β1 + 47β 2 1 − 2(124α1α3 + 33α3β0 + 26α1β2 + 7β0β2) ) x − 10 ( β1(84α1α3 + 23α3β0 − 6β21 + 18α1β2 + 5β0β2) + 2α2 ( 31α3β0 − 27β21 + 7β0β2 + 12α1(9α3 + 2β2) ) − 144α32 − 156α 2 2β1 ) , ... 187 nasser saad acta polytechnica and so on. the christoffel-darboux formula for this infinite sequence of orthogonal polynomials reads k∑ j=0 qnj (x)qnj (y) j! (α1α3)j (n + 2)j ( β0 α1 + n + 1 ) j ( β2 α3 + 2n ) j = qnk+1(x)q n k (y) − q n k (x)q n k+1(y) k! (α1α3)k(n + 2)k ( β0 α1 + n + 1 ) k ( β2 α3 + 2n ) k (x − y) , (130) and as y → x k∑ j=0 ( qnj (x) )2 j! (α1α3)j (n + 2)j ( β0 α1 + n + 1 ) j ( β2 α3 + 2n ) j = [qnk+1(x)] ′qnk (x) − [q n k (x)] ′qnk+1(x) k! (α1α3)k(n + 2)k ( β0 α1 + n + 1 ) k ( β2 α3 + 2n ) k . (131) theorem 6.1. the norms of all polynomials qnk (ξ) are given by gqk = k! (α1α3) k(n + 2)k ( β0 α1 + n + 1 ) k ( β2 α3 + 2n ) k . (132) proof. the proof follows by multiplying the recurrence relation (128) by xk−2ρ(x), with the normalized weight function ∫ ρ(x)dx = 1, and integrating over x. this procedure yields a two-term recurrence relation gqk = k (k + n + 1) ((k + n)α1 + β0) (β2 + α3(k + 2n − 1))) g q k−1 , where gqk = ∫ |qnk (x)| 2ρ(z)dz = ∫ xkqnk (x)ρ(x)dx with a solution given by (132). we see that, in general, the norm of the polynomials qnk (x) does not vanish. acknowledgements partial financial support of this work under grant number gp249507 from the natural sciences and engineering research council of canada is gratefully acknowledged. references [1] a. f. nikiforov, v. b. uvarov. special functions of mathematical physics. birkhäuser verlag, basel, 1988. https://doi.org/10.1007/978-1-4757-1595-8. [2] e. j. routh. on some properties of certain solutions of a differential equation of the second order. proceedings of the london mathematical society 16:245–262, 1884/85. https://doi.org/10.1112/plms/s1-16.1.245. [3] n. saad, r. l. hall, h. ciftci. criterion for polynomial solutions to a class of linear differential equations of second order. journal of physics a: mathematical and general 39(43):13445–13454, 2006. https://doi.org/10.1088/0305-4470/39/43/004. [4] n. saad, r. l. hall, v. a. trenton. polynomial solutions for a class of second-order linear differential equations. applied mathematics and computation 226:615–634, 2014. https://doi.org/10.1016/j.amc.2013.10.056. [5] a. ronveaux (ed.). heun’s differential equations. 
oxford university press, new york, 1995. [6] s. y. slavyanov, w. lay. special functions: a unified theory based on singularities. oxford university press, oxford, 2000. isbn 0-19-850573-6. [7] a. decarreau, m.-c. dumont-lepage, p. maroni, et al. formes canoniques des équations confluentes de l’équation de heun. annales de la societé scientifique de bruxelles série i sciences mathématiques, astronomiques et physiques 92(1-2):53–78, 1978. [8] a. decarreau, p. maroni, a. robert. sur les équations confluentes de l’équation de heun. annales de la societé scientifique de bruxelles série i sciences mathématiques, astronomiques et physiques 92(3):151–189, 1978. [9] k. heun. zur theorie der riemann’schen functionen zweiter ordnung mit vier verzweigungspunkten. mathematische annalen 33(2):161–179, 1888. https://doi.org/10.1007/bf01443849. [10] f. beukers, a. van der waall. lamé equations with algebraic solutions. journal of differential equations 197(1):1–25, 2004. https://doi.org/10.1016/j.jde.2003.10.017. [11] a. turbiner. on polynomial solutions of differential equations. journal of mathematical physics 33(12):3989–3993, 1992. https://doi.org/10.1063/1.529848. [12] r. v. craster, v. h. hoàng. applications of fuchsian differential equations to free boundary problems. proceedings of the royal society a mathematical, physical and engineering sciences 454(1972):1241–1252, 1998. https://doi.org/10.1098/rspa.1998.0204. 188 https://doi.org/10.1007/978-1-4757-1595-8 https://doi.org/10.1112/plms/s1-16.1.245 https://doi.org/10.1088/0305-4470/39/43/004 https://doi.org/10.1016/j.amc.2013.10.056 https://doi.org/10.1007/bf01443849 https://doi.org/10.1016/j.jde.2003.10.017 https://doi.org/10.1063/1.529848 https://doi.org/10.1098/rspa.1998.0204 vol. 62 no. 1/2022 on generalized heun equation with some mathematical . . . [13] p. p. fiziev. the heun functions as a modern powerful tool for research in different scientific domains, 2015. arxiv:1512.04025v1. [14] mathematical physics. in u. camcı, i. semiz (eds.), proceedings of the 13th regional conference held in antalya, october 27–31, 2010, pp. 23–39. world scientific publishing co. pte. ltd., hackensack, nj, 2013. https://doi.org/10.1142/8566. [15] h. ciftci, r. l. hall, n. saad, e. dogu. physical applications of second-order linear differential equations that admit polynomial solutions. journal of physics a: mathematical and theoretical 43(41):415206, 14, 2010. https://doi.org/10.1088/1751-8113/43/41/415206. [16] y.-z. zhang. exact polynomial solutions of second order differential equations and their applications. journal of physics a: mathematical and theoretical 45(6):065206, 2012. https://doi.org/10.1088/1751-8113/45/6/065206. [17] b.-h. chen, y. wu, q.-t. xie. heun functions and quasi-exactly solvable double-well potentials. journal of physics a: mathematical and theoretical 46(3):035301, 2013. https://doi.org/10.1088/1751-8113/46/3/035301. [18] f. caruso, j. martins, v. oguri. solving a two-electron quantum dot model in terms of polynomial solutions of a biconfluent heun equation. annals of physics 347:130–140, 2014. https://doi.org/10.1016/j.aop.2014.04.023. [19] a. v. turbiner. one-dimensional quasi-exactly solvable schrödinger equations. physics reports a review section of physics letters 642:1–71, 2016. https://doi.org/10.1016/j.physrep.2016.06.002. [20] h. karayer, d. demirhan, f. büyükkılıç. solution of schrödinger equation for two different potentials using extended nikiforov-uvarov method and polynomial solutions of biconfluent heun equation. 
journal of mathematical physics 59(5):053501, 2018. https://doi.org/10.1063/1.5022008.
[21] h. ciftci, r. l. hall, n. saad. asymptotic iteration method for eigenvalue problems. journal of physics a: mathematical and general 36(47):11807–11816, 2003. https://doi.org/10.1088/0305-4470/36/47/008.
[22] h. scheffé. linear differential equations with two-term recurrence formulas. journal of mathematics and physics 21(1-4):240–249, 1942. https://doi.org/10.1002/sapm1942211240.
[23] r. s. irving. integers, polynomials, and rings. springer-verlag, new york, 2004. https://doi.org/10.1007/b97633.
[24] f. m. arscott. latent roots of tri-diagonal matrices. edinburgh mathematical notes 44:5–7, 1961. https://doi.org/10.1017/s095018430000330x.
[25] j. favard. sur les polynômes de tchebicheff. comptes rendus hebdomadaires des séances de l'académie des sciences, paris 200:2052–2053, 1935.
[26] f. marcellán, r. álvarez nodarse. on the "favard theorem" and its extensions. journal of computational and applied mathematics 127(1-2):231–254, 2001. https://doi.org/10.1016/s0377-0427(00)00497-0.
[27] m. e. h. ismail. classical and quantum orthogonal polynomials in one variable. cambridge university press, cambridge, 2009.
[28] a. krajewska, a. ushveridze, z. walczak. bender-dunne orthogonal polynomials general theory. modern physics letters a 12(16):1131–1144, 1997. https://doi.org/10.1142/s0217732397001163.

acta polytechnica 60(6):455–461, 2020, doi:10.14311/ap.2020.60.0455, © czech technical university in prague, 2020, available online at https://ojs.cvut.cz/ojs/index.php/ap

the development of a new adsorption-desorption device

ľudmila gabrišová a,∗, peter peciar a, oliver macho a, martin juriga a, paulína galbavá b, žofia nižnanská b, róbert kubinec b, ivan valent c, marián peciar a
a slovak university of technology in bratislava, faculty of mechanical engineering, institute of process engineering, námestie slobody 17, 812 31 bratislava, slovakia
b comenius university in bratislava, faculty of natural sciences, institute of chemistry, mlynská dolina, ilkovičova 6, 842 15 bratislava, slovakia
c comenius university in bratislava, faculty of natural sciences, department of physical and theoretical chemistry, mlynská dolina, ilkovičova 6, 842 15 bratislava, slovakia
∗ corresponding author: ludmila.gabrisova@stuba.sk

abstract. the aim of this work was to construct a new adsorption-desorption device based on the principle of the separation of volatile organic compounds, e.g., ethanol. granulated activated carbon (gac) can be used as the adsorbent in the adsorption and desorption process. in this study, two kinds of gac were used, marked gac1 and gac2. the particle size distribution and the water vapour sorption of the selected gacs were measured. an experiment with distilled water was performed as a preliminary study of the new device's functionality. after the determination of the time necessary for the adsorption and desorption, the experiments were carried out with a model mixture (a 5 % v/v ethanol-water mixture), which resulted in a product with an ethanol content of 39.6 %. the main advantage of this device would be its potential to compete with conventional distillation.

keywords: adsorption, desorption, air stripping, activated carbon, ethanol.

1. introduction
in both industry and science, there is an increasing effort to produce simple devices for the separation of volatile organic compounds (voc) [1–3]. in the light of this, devices capable of adsorbing vocs and subsequently desorbing them are being constructed. the desorption process is decisive in many cases, because it is necessary to recover the adsorbed compounds unchanged (the surface of the sorbent cannot react with the adsorbate) [4]. after the desorption, the compounds are analysed and stored for further use. ethanol is the most discussed compound in this context, as it is the main product of many fermentation processes [5–8], produced by yeasts as a metabolic by-product. the yeast transforms saccharides into ethanol and other vocs. the content of vocs and other fermentation products depends on the nature of the raw material (fruit, corn, etc.) [9–13]. generally, distillation is the most common way of separating ethanol from the fermentation broth. this process is energy-demanding, and hence there is a demand for an alternative device [14]. in most production processes, ethanol is present in a liquid matrix (the fermentation broth). there are many ways to separate ethanol from these matrices, for example, by a conventional distillation or by an adsorption onto the adsorbent directly from the liquid [15]. another method of separation is the adsorption of ethanol in a gaseous state [16]. gas containing ethanol is created by gas stripping; this technique often takes place at laboratory temperature. the gas used for the gas stripping should be inert to the compounds (voc) in the liquid matrix; carbon dioxide, nitrogen or air are, therefore, the commonly used gases. moreover, the absence of interaction with the separation adsorbent is another important requirement [1, 6–8, 12, 13, 17, 18]. activated carbon, polymeric resins and zeolites can be used as the adsorbent in this kind of separation method [19–22]. the main advantages of activated carbon are its extensive specific surface area, its price and its availability [23]. desorption, the reverse process of adsorption, is an important part of these separation methods. the desorption of adsorbed compounds is facilitated by a temperature increase or a pressure decrease, which are the two most frequently used techniques [24–28]. in our study, we used granulated activated carbon as the adsorbent and air as the stripping gas; the desorption was performed by increasing the temperature.

2. methods
2.1. experimental material
96 % ethanol purchased from mikrochem s.r.o.
(pezinok, slovakia) was used for the preparation of the model mixture. for the gas adsorption, two kinds of granulated activated carbon were used, gac1 and gac2, which were purchased from sandsystem s.r.o. (klimovice, czech republic) and alchimica s.r.o. (prague, czech republic), respectively.

2.2. analysis of experimental data
the ethanol concentration before and after the adsorption was calculated from the density of the model mixture, which was determined by a dma 48 digital density meter (anton paar, austria) at a constant temperature of 25 °c. the volume of the sample for the density determination was 3 ml. the ethanol concentration in the condensate was determined using the same method. the particle size distribution of gac1 and gac2 was measured by a partan 3d particle size analyzer (microtrac gmbh, germany). the water vapour adsorption was determined by an aquadyne dvs (quantachrome, uk) under isothermal conditions.

2.3. experimental device – construction and process description
our device (fig. 1) works on the principle of the adsorption and desorption of gases or of compounds present in a gaseous state at laboratory temperature. the device works in two phases: firstly, the gas is adsorbed onto the adsorbent in the device at laboratory temperature; secondly, the adsorbed gas is desorbed at a high temperature (150 °c), and this phase also includes the condensation of the desorbed gas. the device is especially designed for the adsorption of vocs from matrices with a low voc content and the subsequent formation of a product with a high voc content by desorption and condensation. the new device consists of the following components: an hg-120 air blower (zhejiang, china), a haake n6 heating circulator (karlsruhe, germany), a haake dc1 refrigerating circulator (karlsruhe, germany), a upls 3 flowmeter (prague, czech republic) and an almemo 5690-1m measuring temperature station (holzkirchen, germany). the glass components of the device (heat exchangers, stock vessel, flask, etc.) were of the brand simax. the new adsorption-desorption device works in a closed cycle, which is achieved by the air blower. the path of the gas flow in the device is indicated by roman numerals (i–ix) (fig. 1). adsorption: the stripping gas (air) flows from the air blower into the stock vessel, which is filled with liquid (path i–iii). this liquid (the model mixture or distilled water) is stripped by the stripping gas (air) in the stock vessel; in this way, the molecules are converted from the liquid to the gaseous state. subsequently, the gas flows through the first heat exchanger (iv) into the second heat exchanger (v), which is filled with the gac. here, the adsorption takes place at laboratory temperature and the gas containing the molecules from the liquid is adsorbed onto the gac. the molecules in the gas which are not adsorbed flow through the condenser back to the air blower (path vi–ix), and the adsorption circuit is repeated until the gac is saturated. it is important to mention that both heat exchangers (iv and v) are interconnected and coupled with a heating circulator. at the beginning of the desorption process, the heating circulator is turned on and set to 150 °c (thermal desorption). the condenser (vii) is connected to the refrigerating circulator, which is set to −25 °c (condensation of the desorbed gas). in each experiment, the stripping gas flow rate was controlled by a flowmeter (f) set at 5 l/min.
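the density-to-concentration step of section 2.2 can be sketched in code. the snippet below is our illustration, not the authors' routine: the function name and the calibration points are assumptions, and the listed densities are rough, illustrative figures only; a standard ethanol-water density table at 25 °c must be substituted for any real use.

```python
# a hedged sketch (ours, not the authors') of converting a measured density
# at 25 degC into an ethanol volume fraction by inverting a calibration table.
import numpy as np

# placeholder calibration (volume fraction %, density g/cm3 at 25 degC);
# the values are approximate and for illustration only.
CAL_PCT = np.array([0.0,    5.0,    10.0,   20.0,   40.0])
CAL_RHO = np.array([0.9970, 0.9896, 0.9831, 0.9706, 0.9423])

def ethanol_pct_from_density(rho):
    """invert the calibration by linear interpolation (density falls with %)."""
    # np.interp needs increasing abscissae, so both arrays are reversed
    return float(np.interp(rho, CAL_RHO[::-1], CAL_PCT[::-1]))

print(ethanol_pct_from_density(0.9896))  # -> ~5.0 % v/v for this demo table
```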
the temperature was measured with six thermocouples by a real-time measuring station; the thermocouples' locations are shown in fig. 1 and marked as t 1 – t 6. desorption: the adsorption is finished after a complete saturation of the gac. the liquid is removed from the stock vessel through a port for liquid removal (fig. 1). the air blower serves as a gas propeller in the desorption process; for this reason, the desorbed molecules (gas) are transported to the condenser (vii), where they condense and accumulate in the flask fixed at the end of the condenser. it is very important that the heating circulator and the refrigerating circulator are turned on only during the desorption.

figure 1. scheme of the new adsorption-desorption device [29].

3. results and discussion
3.1. characterization of gac
the particle size of the adsorbent is one of the parameters that affect the processes of adsorption and desorption; hence, the particle size distribution of the chosen gacs was measured. a small particle size of the adsorbent (powdered activated carbon, pac) negatively affects the gas flow in the packed bed; the gas flow through a gac adsorbent is better than through a pac, thanks to its lower flow resistance [30]. the particle size distribution was measured by a partan 3d, a device which analyzes the size and shape of particles by an integrated high-speed camera system: the falling solid particles are captured by the camera and the acquired data are evaluated by the software. the result of the analysis is the gac particle size distribution shown in fig. 2. it shows that the majority of gac1 and gac2 particles have a diameter in the range of 2–2.75 mm and 1.25–2 mm, respectively. hence, gac2 can create a higher resistance during the adsorption and desorption process; as a consequence, the transfer of the gas flow between the gac2 particles can be worse.

figure 2. the particle size distribution of gac1 and gac2.

the second step in the production of activated carbon is its activation. after this step, activated carbon becomes more hydrophilic [26, 31–33]. due to this property of activated carbon, the maximum possible adsorbed amount of water for gac1 and gac2 was determined by the aquadyne dvs water vapour station. the determination of the water vapour sorption by the aquadyne dvs is important because the water content of the stripped model mixture is significant (fig. 3). the aquadyne dvs is a device that measures the change of the initial sample mass (50 mg) as a function of relative humidity. the gacs' water vapour sorption graph, in which the gacs' weight increases as a function of rising relative humidity and is stable after the saturation, is shown in fig. 3. the saturation point of gac1 was at 95.1 % relative humidity and the weight increased by 40.1 % compared to the initial sample; the gac2 saturation point was at 93.8 % relative humidity and the weight of the sample increased by 32.3 %. the shapes of the adsorption and desorption curves are the same for gac1 and gac2, which suggests the same progress of both processes. gac1, in comparison to gac2, can adsorb a higher amount of water vapour, which is clear from the y-axis expressing the percentage change of mass; it is a consequence of the higher specific surface area of gac1.

figure 3. the water vapour adsorption and desorption graph of gac1 and gac2.
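the reported saturation mass gains can be restated as adsorption capacities per gram of dry sorbent; the small arithmetic aside below is ours, not a figure from the paper itself, and uses only the numbers quoted above (50 mg initial sample mass, the aquadyne dvs mass gains).

```python
# restating the dvs mass gains as water uptake per gram of gac
m0 = 50.0                                   # mg, initial gac sample mass
for name, gain_pct, rh in (("gac1", 40.1, 95.1), ("gac2", 32.3, 93.8)):
    uptake = m0 * gain_pct / 100.0          # mg of water at saturation
    print(f"{name}: {uptake:.1f} mg water on {m0:.0f} mg gac "
          f"= {10.0 * gain_pct:.0f} mg/g at {rh} % rh")
# -> gac1: ~401 mg/g, gac2: ~323 mg/g
```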
the difference in the specific surface area of the gacs is approximately 54 m²/g.

3.2. preliminary study
an experiment with distilled water was performed to determine the necessary adsorption time and to make a basic observation of the temperature in the device. the general conditions and specifications applied in each experiment are described in this subchapter. the volume of the liquid in the stock vessel was 997 ml per experiment (3 ml of the original 1 000 ml was taken for the density determination). the second heat exchanger was filled with 80 g of gac with ≤ 1 % residual moisture in each experiment. the temperature in the device was measured by the six thermocouples (t 1 – t 6); thermocouple t 1 measures the temperature of the liquid (the model mixture or distilled water), the other thermocouples measure the temperature of the gas.

figure 4. the temperature readings of thermocouples t 1 – t 6 during the adsorption process with distilled water.

fig. 4 shows the temperature readings obtained during the adsorption experiment with distilled water. the figure shows that after approximately 12 hours, the temperatures measured by thermocouples t 2 – t 6 are stable. the decrease in the temperature of the liquid is caused by the gas stripping during the adsorption; gas stripping is, in this case, an endothermic desorption process, whereas the adsorption itself is an exothermic process [34–36]. the subsequent increase in the temperature of the distilled water in the stock vessel (the curve of thermocouple t 1) is caused by the saturation of the adsorbent. hence, the time for the adsorption was set to 12 hours. the time required for the desorption, determined by the formation of condensate in the condenser, was 1 hour.
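the endpoint criterion described above (the bed is saturated once t 1 stops falling and begins a sustained rise) could also be automated. the sketch below is one possible criterion of our own suggestion, not the authors' procedure; the function name, the smoothing window and the thresholding are all assumptions.

```python
# a hypothetical automated adsorption-endpoint detector (our suggestion):
# flag the first sustained upturn of a smoothed t1 (liquid) temperature signal.
import numpy as np

def adsorption_endpoint(t_hours, t1_degC, window=30):
    """return the approximate time of the first sustained rise of t1, or None."""
    smooth = np.convolve(t1_degC, np.ones(window) / window, mode="valid")
    rising = np.diff(smooth) > 0
    # first index from which the smoothed signal keeps rising for `window` samples
    for i in range(len(rising) - window):
        if rising[i:i + window].all():
            return t_hours[i + window // 2]   # roughly centre the smoothing lag
    return None                                # no saturation within the record
```

applied to a log like the one in fig. 4, such a criterion should flag the upturn of t 1 near the 12-hour mark that the authors adopted.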
afterwards, the condensate was removed from the device through the sampling point of condensate (fig. 1). the values of the ethanol concentration and the volume of the condensate for each experiment are listed in table 1 and table 2 in the part desorption. after the desorption process, the second heat exchanger is emptied and the gac weighed. from the values listed in table 1 and table 2, it is clear that gac1 can produce a condensate with a higher ethanol concentration and volume than gac2. the condensate volumes correspond to the ∆vmm values, which represent the change of the model mixture volume after the adsorption. the difference between the condensate volume and ∆vmm depends on several factors: the desorption is not complete, a part of the gas volume created by the gas stripping stays in the device's dead volume after the adsorption (saturation of the gac), and some volume of the condensed gas stays on the walls of the condenser after the desorption.

             adsorption                                      desorption
experiment   vmm [ml]   etoha1 [%]   ∆vmm [ml]   etoha2 [%]   vc [ml]   etohc [%]
1.           997        5.0          32.5        3.5          28.5      39.6
2.           997        5.0          31.0        3.6          27.0      38.1
3.           997        5.0          32.0        3.5          28.5      39.2

table 1. the observed values of the adsorption and desorption quantities for gac1.

             adsorption                                      desorption
experiment   vmm [ml]   etoha1 [%]   ∆vmm [ml]   etoha2 [%]   vc [ml]   etohc [%]
1.           997        5.0          31.0        3.7          27.5      33.9
2.           997        5.0          30.5        3.4          27.0      34.1
3.           997        5.0          31.5        3.6          26.5      34.9

table 2. the observed values of the adsorption and desorption quantities for gac2.

a theoretical interpretation of the obtained experimental data will be the subject of a future analysis; at this stage, we merely suggest some possible approaches. it is useful to assess the composition of the gas stream leaving the stock solution. assuming that the partial pressures of ethanol (pe) and water (pw) above the model mixture correspond to the equilibrium vapour pressures, pe and pw are given by henry's and raoult's laws, respectively. however, such an approach provides upper limits only, as the vapour-liquid equilibrium is not precisely established in gas-stripping systems [37]. instead, a two-film mass transfer model [38, 39] for non-equilibrium volatilization processes should be applied. this approach requires a knowledge of the mass transfer coefficient values, but these have not been determined for our system.

the calculated equilibrium partial vapour pressures above the 5 % (v/v) aqueous solution of ethanol at 25 °c are pe = 449 pa and pw = 3119 pa. ignoring the partial pressure of the stripping gas, these values predict a molar fraction of ethanol in the gaseous binary mixture of 12.6 %. as the ethanol volume fraction of 5 % corresponds to a molar fraction of 1.59 %, the stripping increases the molar fraction of ethanol by a theoretical factor of 7.9. this value agrees with the concentration of ethanol in the condensate of gac1. for a comparison, the conventional distillation process yields a factor of 9 at the boiling point (95.5 °c) of the 5 % ethanol [40].
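the enrichment factor quoted above follows directly from the stated partial pressures and the composition of the 5 % v/v solution. a worked sketch in python (the densities and molar masses are standard handbook values, not data from the paper):

# vapour-phase enrichment of ethanol predicted from the equilibrium partial pressures
p_e, p_w = 449.0, 3119.0           # pa, above 5 % v/v ethanol at 25 degrees c (from the text)

# liquid molar fraction of a 5 % v/v ethanol-water mixture
# (0.789 and 0.997 g/ml densities, 46.07 and 18.02 g/mol molar masses
# are standard handbook values, not values given in the paper)
n_ethanol = 5.0 * 0.789 / 46.07    # mol of ethanol per 100 ml of mixture
n_water = 95.0 * 0.997 / 18.02     # mol of water per 100 ml of mixture
x_liquid = n_ethanol / (n_ethanol + n_water)   # ~0.016, i.e. ~1.6 %

# gas-phase molar fraction, ignoring the partial pressure of the stripping gas
y_vapour = p_e / (p_e + p_w)       # ~0.126, i.e. 12.6 %

print(y_vapour / x_liquid)         # ~7.8-7.9, the enrichment factor quoted above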
4. conclusion

in this article, we describe the design and construction of the new adsorption-desorption device. the basic properties (particle size distribution and water vapour sorption) of commercially available gacs were determined using the partan 3d and the aquadyne dvs. based on the data from the particle size distribution analysis, gac1 is assumed to provide a better gas flow through the adsorbent bed. the results from the water vapour sorption measurements show that gac1 can adsorb more mass than gac2. the experiment with distilled water determined the time necessary for the adsorption and desorption. the functionality of the new adsorption-desorption device was confirmed using 5 % ethanol-water mixtures. the ethanol content of 39.6 % (v/v) in the product represents the efficiency of the separation processes in this device; the ethanol concentration in the product was almost eight times higher than in the initial sample (5 % v/v). the main benefit of this device is an innovative approach to removing ethanol from available matrices. this device has a potential to increase the production of ethanol in fermentation processes.

list of symbols

etoha1 ethanol concentration of the model mixture before adsorption [%]
etoha2 ethanol concentration of the model mixture after adsorption [%]
etohc ethanol concentration of the product after the adsorption-desorption process [%]
pe partial pressure of ethanol [pa]
pw partial pressure of water [pa]
vc volume of the product (condensate) after the adsorption-desorption process [ml]
vmm volume of the model mixture in the stock vessel before adsorption [ml]
∆vmm change of the model mixture volume in the stock vessel after adsorption [ml]

acknowledgements

this work was supported by the slovak research and development agency under the contract numbers apvv-15-0466, apvv-18-0282 and apvv-18-0348. this publication was created on the basis of the major project "accord" (itms project code: 313021x329) supported by the operational programme research and development funded by the european regional development fund. the authors wish to acknowledge the ministry of education, science, research and sport of the slovak republic for the financial support of this research by the grant kega 036stu-4/2020. this article was created within the grant project "the research of modern unit operations in production of solid and liquid dosage forms with a focus on continuous granulation and lyophilization" from the grant scheme to support excellent teams of young researchers under the conditions of the slovak university of technology in bratislava. the authors would like to thank mr. john peter blight and m.sc. barbora veselková for the language reviews and editing.

references

[1] s. onuki, j. koziel, w. s. jenks, et al. ethanol purification with ozonation, activated carbon adsorption, and gas stripping. separation and purification technology 151:165–171, 2015. doi:10.1016/j.seppur.2015.07.026.
[2] f. gironi, v. piemonte. vocs removal from dilute vapour streams by adsorption onto activated carbon. chemical engineering journal 172(2-3):671–677, 2011. doi:10.1016/j.cej.2011.06.034.
[3] f. taylor, m. j. kurantz, n. goldberg, j. craig jr. kinetics of continuous fermentation and stripping of ethanol. biotechnology letters 20(1):67–72, 1998. doi:10.1023/a:1005339415979.
[4] a. j. fletcher, y. yüzak, k. m. thomas. adsorption and desorption kinetics for hydrophilic and hydrophobic vapors on activated carbon. carbon 44:989–1004, 2006. doi:10.1016/j.carbon.2005.10.020.
[5] a. silvestre-albero, j. silvestre-albero, a. sepúlveda-escribano, f. rodríguez-reinoso. ethanol removal using activated carbon: effect of porous structure and surface chemistry. microporous and mesoporous materials 120(1-2):62–68, 2009. doi:10.1016/j.micromeso.2008.10.012.
[6] g. ponce, j. miranda, m. alves, et al. simulation, analysis and optimization of an in situ gas stripping fermentation process in a laboratory scale for bioethanol production. chemical engineering transactions 37:295–300, 2014. doi:10.3303/cet1437050.
[7] j. sonego, d. lemos, g. y. rodriguez, et al. extractive batch fermentation with co2 stripping for ethanol production in a bubble column bioreactor: experimental and modeling. energy & fuels 28(12):7552–7559, 2014. doi:10.1021/ef5018797.
[8] c. löser, a. schröder, s. deponte, t. bley. balancing the ethanol formation in continuous bioreactors with ethanol stripping. engineering in life sciences 5(4):325–332, 2005. doi:10.1002/elsc.200520084.
[9] g. m. walker, g. stewart. saccharomyces cerevisiae in the production of fermented beverages. beverages 2(30):1–12, 2016. doi:10.3390/beverages2040030.
[10] c. xue, g.-q. du, j.-x. sun, et al. characterization of gas stripping and its integration with acetone–butanol–ethanol fermentation for high-efficient butanol production and recovery. biochemical engineering journal 83:55–61, 2014. doi:10.1016/j.bej.2013.12.003.
[11] t. ezeji, n. qureshi, h. blaschek. microbial production of a biofuel (acetone-butanol-ethanol) in a continuous bioreactor: impact of bleed and simultaneous product removal. bioprocess and biosystems engineering 36(1):109–116, 2013. doi:10.1007/s00449-012-0766-5.
[12] n. qureshi, h. blaschek. recovery of butanol from fermentation broth by gas stripping. renewable energy 22(4):557–564, 2001. doi:10.1016/s0960-1481(00)00108-7.
[13] g. ponce, j. neto, s. santos de jesus, et al. sugarcane molasses fermentation with in situ gas stripping using low and moderate sugar concentrations for ethanol production: experimental data and modeling. biochemical engineering journal 110:152–161, 2016. doi:10.1016/j.bej.2016.02.007.
[14] l. m. vane, f. r. alvarez. membrane-assisted vapor stripping: energy efficient hybrid distillation-vapor permeation process for alcohol-water separation. journal of chemical technology & biotechnology 83:1275–1287, 2008. doi:10.1002/jctb.1941.
[15] l. li, p. a. quinlivan, d. r. u. knappe. effects of activated carbon surface chemistry and pore structure on the adsorption of organic contaminants from aqueous solution. carbon 40:2085–2100, 2002. doi:10.1016/j.watres.2005.01.029.
[16] h. s. samanta, s. k. ray. separation of ethanol from water by pervaporation using mixed matrix copolymer membranes. separation and purification technology 146:176–186, 2015. doi:10.1016/j.seppur.2015.03.006.
[17] k. n. truong, j. w. blackburn. the stripping of organic chemicals in biological treatment processes. environmental progress 3(3):143–152, 1984. doi:10.1002/ep.670030304.
[18] m. hashi, j. thibault, f. h. tezel. recovery of ethanol from carbon dioxide stripped vapor mixture: adsorption prediction and modeling. industrial & engineering chemistry research 49(18):8733–8740, 2010. doi:10.1021/ie1002608.
[19] r. xiong, s. sandler, d. vlachos. alcohol adsorption onto silicalite from aqueous solution. the journal of physical chemistry c 115(38):18659–18669, 2011. doi:10.1021/jp205312k.
[20] j. delgado, v. águeda, m. uguina, et al. separation of ethanol–water liquid mixtures by adsorption on a polymeric resin sepabeads 207. chemical engineering journal 220:89–97, 2013. doi:10.1016/j.cej.2013.01.057.
[21] j. a. delgado, m. a. uguina, j. l. sotelo, et al. separation of ethanol–water liquid mixtures by adsorption on silicalite. chemical engineering journal 180:137–144, 2012. doi:10.1016/j.cej.2011.11.026.
[22] j. vivo-vilches, a. perez-cadenas, f. carrasco-marín, f. j. maldonado-hódar. about the control of voc's emissions from blended fuels by developing specific adsorbents using agricultural residues. journal of environmental chemical engineering 3:2662–2669, 2015. doi:10.1016/j.jece.2015.09.027.
[23] e. wolak, e. vogt, j. szczurowski. chemical and hydrophobic modification of activated wd-extra carbon. e3s web of conferences 14:02033, 2017. doi:10.1051/e3sconf/20171402033.
[24] j. a. delgado, v. i. águeda, m. a. uguina, et al. separation of ethanol-water mixtures by adsorption on bpl activated carbon with air regeneration. separation and purification technology 149:370–380, 2015. doi:10.1016/j.seppur.2015.06.011.
[25] k. dettmer, w. engewald. adsorbent materials commonly used in air analysis for adsorptive enrichment and thermal desorption of volatile organic compounds. analytical and bioanalytical chemistry 373:490–500, 2002. doi:10.1007/s00216-002-1352-5.
[26] x. zhang, b. gao, a. e. creamer, et al. adsorption of vocs onto engineered carbon material: a review. journal of hazardous materials 338:102–123, 2017. doi:10.1016/j.jhazmat.2017.05.013.
[27] i. k. shah, p. pre, b. j. alappat. effect of thermal regeneration of spent activated carbon on volatile organic compound adsorption performances. journal of the taiwan institute of chemical engineers 45:1733–1738, 2014. doi:10.1016/j.jtice.2014.01.006.
[28] n. qureshi, s. hughes, i. maddox, m. cotta. energy efficient recovery of butanol from fermentation broth by adsorption. bioprocess and biosystems engineering 27(4):215–222, 2005. doi:10.1007/s00449-005-0402-8.
[29] l. gabrišová, p. peciar, r. kubinec. apparatus for concentrating volatile organic compounds by adsorption and desorption; method for concentrating volatile organic compounds by adsorption and desorption. patent application number 50055-2018, industrial property office of the slovak republic, banská bystrica, slovakia, 2018.
[30] m. e. gamal, h. a. mousa, m. h. el-naas, et al. bio-regeneration of activated carbon: a comprehensive review. separation and purification technology 197:345–359, 2018. doi:10.1016/j.seppur.2018.01.015.
[31] m. jeguirim, m. belhachemi, l. limousy, s. bennici. adsorption/reduction of nitrogen dioxide on activated carbons: textural properties versus surface chemistry – a review. chemical engineering journal 347:493–504, 2018. doi:10.1016/j.cej.2018.04.063.
[32] g. f. de oliviera, r. c. de andrade, m. a. c. trindade, et al. thermogravimetric and spectroscopic study (tg–dta/ft–ir) of activated carbon from the renewable biomass source babassu. química nova 40(3):284–292, 2017. doi:10.21577/0100-4042.20160191.
[33] m. baysal, k. bilge, b. yilmaz, et al. preparation of high surface area activated carbon from waste-biomass of sunflower piths: kinetics and equilibrium studies on the dye removal. journal of environmental chemical engineering 6:1702–1713, 2018. doi:10.1016/j.jece.2018.02.020.
[34] a. c. lua, j. guo. preparation and characterization of activated carbons from oil-palm stones for gas-phase adsorption. colloids and surfaces a: physicochemical and engineering aspects 179:151–162, 2001. doi:10.1016/s0927-7757(00)00651-8.
[35] s. niknaddaf, j. d. atkinson, p. shariaty, et al. heel formation during volatile organic compound desorption from activated carbon fiber cloth. carbon 96:131–138, 2016. doi:10.1016/j.carbon.2015.09.049.
[36] j. d. seader, e. j. henley, d. k. roper. separation process principles: chemical and biochemical operations (3rd ed.). john wiley & sons, inc., the united states, 2011.
[37] t. c. ezeji, p. m. karcher, n. qureshi, h. p. blaschek. improving performance of a gas stripping-based recovery system to remove butanol from clostridium beijerinckii fermentation. bioprocess and biosystems engineering 27(3):207–214, 2005. doi:10.1007/s00449-005-0403-7.
[38] p. liss, p. slater. flux of gases across the air-sea interface. nature 247:181–184, 1974. doi:10.1038/247181a0.
[39] d. mackay, p. j. leinonen. rate of evaporation of low-solubility contaminants from water bodies to atmosphere. environmental science & technology 9(13):1178–1180, 1975. doi:10.1021/es60111a012.
[40] c. d. hodgman, r. c. weast, r. s. shankland, et al. crc handbook of chemistry and physics: a ready-reference book of chemical and physical data (44th ed.). the chemical rubber publishing, cleveland, ohio, 1963.

acta polytechnica 59(5):448–457, 2019
doi:10.14311/ap.2019.59.0448
© czech technical university in prague, 2019, available online at https://ojs.cvut.cz/ojs/index.php/ap

automatic reconstruction of roof models from building outlines and aerial image data

vojtěch hron∗, lena halounová
czech technical university in prague, faculty of civil engineering, department of geomatics, thákurova 7, 16629 prague, czech republic
∗ corresponding author: vojtech.hron@fsv.cvut.cz

abstract. the knowledge of roof shapes is essential for the creation of 3d building models.
many experts and researchers use 3d building models for specialized tasks, such as creating noise maps, estimating the solar potential of roof structures, and planning new wireless infrastructures. our aim is to introduce a technique for automating the creation of topologically correct roof building models using outlines and aerial image data. in this study, we used building footprints and vertical aerial survey photographs. the aerial survey photographs enabled us to produce an orthophoto and a digital surface model of the analysed area. the developed technique made it possible to detect roof edges from the orthophoto and to categorize the edges using spatial relationships and height information derived from the digital surface model. this method allows buildings with complicated shapes to be decomposed into simple parts that can be processed separately. in our study, a roof type and model were determined for each building part and tested with multiple datasets with different levels of quality. excellent results were achieved for simple and medium complex roofs. the results for very complex roofs were unsatisfactory; for such structures, we propose using multitemporal images, because these can lead to significant improvements and a better roof edge detection. the method used in this study was shared with the czech national mapping agency and could be used for the creation of new 3d modelling products in the near future.

keywords: building reconstruction, roof model, edge detection, orthophoto, digital surface model, gis.

1. introduction

experts from a wide range of disciplines use complex spatial data to solve specialized tasks, such as creating noise maps, compiling highway inventories [1], modelling air pollution, estimating the solar potential of roof structures, planning new wireless infrastructures, designing houses with natural daylight requirements taken into consideration, and generating virtual environments for flight simulators. these tasks require the use of digital elevation models and 3d building models with generalized roof structures, also referred to as level of detail 2 (lod2) buildings [2]. the wide interest in lod2 building models has led to their possible inclusion in the inspire buildings theme [3]. currently, digital elevation models, such as the digital terrain model (dtm) and the digital surface model (dsm), are typically already available on the market and are widely used. 3d lod2 building models (subsequently referred to as "3d building models") are available only for certain areas, mostly urban areas, in the form of 3d city models. the absence of 3d building models for larger territories or even whole countries makes some specialized tasks very difficult or even impossible to solve. thus, there is an evident demand for more 3d building models that experts and researchers could use.

1.1. data

different data gathering techniques, including terrestrial and aerial laser scanning or photogrammetry, are used to create 3d building models. aerial data gathering techniques must be used because they enable a rapid and non-selective mapping of large-scale areas. airborne laser scanning (als) and aerial photogrammetry (ap) are the conventional methods for a collection of aerial data. the als serves to collect information for dsm and especially dtm creation due to the registration of multiple reflections: an als laser pulse can pass through vegetation (typically tree crowns) and provide ground height information [4].
ap primarily produces seamless orthophoto maps (subsequently: "orthophotos") using vertical images, which capture ground truth from a nadir view. ap is also used to collect oblique aerial photographs to observe the captured scene from multiple viewing angles. oblique imagery can significantly help with the interpretation of ground features in highly occluded areas. in many countries, orthophotos are acquired on a regular basis to update gis and other map products. the update frequency usually depends on the size and complexity of the country and may range from one year (e.g., in the netherlands) [5] to several years (united kingdom) [6]. orthophotos and the source aerial survey photographs (asp) represent a common and up-to-date type of spatial information available for whole countries. their primary advantage is that, when there are sufficient overlaps, they contain not only positional but also height information, and this information is uniform in time. modern image matching algorithms can automatically generate very dense point clouds (pc) from ultra-high resolution stereo images (spatial resolution less than 0.3 m [7]), which are suitable for the creation of detailed dsms. image matching provides height and also colour information for top surfaces, but it does not penetrate the vegetation like the als. 3d building models can be generated successfully with laser-based pcs (lpc) or image-based pcs (ipc). the biggest difference between them is that image matching does not allow the height information to be generated in textureless image parts, such as deep cast shadows and highly reflective materials, or in stereoscopic occlusions. oblique aerial photographs can be used as an alternative or replacement of the vertical imagery to reduce occlusions in cities [8]. als can fail to sense objects that are highly reflective or which absorb laser beams. these weaknesses have to be taken into account, or they could be addressed by a fusion of ipc and lpc into highly reliable dsms [9]. unfortunately, a simultaneous acquisition of laser and image data from the same flight platform is still not very common.

1.2. building outlines

3d building models could be created purely from spatial data as completely separate objects, or their creation could be supported by, or fully linked to, existing 2d building outlines. both approaches have advantages and disadvantages in relation to roof structures. the size and shape of building models made solely from aerial data (typically pc) are defined by the size and shape of the roofs. roofs usually have overhangs over any outer walls, so building outlines derived from roofs could be larger than the real built-up area. potentially undesirable topological conflicts may also occur when merging these building models with other spatial objects such as road networks and facilities. therefore, an alternative approach can be employed: using existing building footprints. building footprints can be obtained from cadastre or national gis databases that produce large-scale maps and plans. creating 3d building models from outlines allows them to be combined with other spatial objects from the resource databases, significantly increasing their value and usefulness. however, combining existing 2d data with point clouds may lead to complications.
depending on the type and quality of the resource gis databases, building shapes may become too general or become shifted. building outlines from cadastral maps are usually very accurate, but they represent the intersection of the outside walls with the ground [3] and may not necessarily match the shape of a roof, which might be simpler or more complex than the building outline. any resultant 3d building models might, therefore, not exactly fit the roofs. the purpose of the 3d building models should be considered when selecting which technique to use: existing 2d outlines should be used for the integration into existing databases; in other cases, 3d building models can be generated from point clouds only.

1.3. 3d building models

many researchers have examined various aspects of 3d building reconstruction, with the first studies dedicated to this topic dating back two decades [10]. our previous publications mentioned some of the early investigations in this area [11, 12]. [13] provides a very comprehensive review of methods and principles for an automatic 3d building reconstruction. this review was followed by another work that presented isprs benchmark results for building detection and 3d building reconstruction [14, 15]. the following text is structured into sections according to the type of spatial data.

1.3.1. laser-based point clouds

many studies deal with the creation of 3d building models exclusively using lpc [5, 16, 17]. algorithms based on roof topology graphs [5, 17, 18] represent well-developed approaches using high-density pc (20 points/m2). global optimization solutions to create roof models from low-density lpc (at least 3 points/m2) have also been introduced [16]. such solutions require pc segmentation into roof planes, the extraction and regularization of building boundaries and step edges, partitioning building bounding boxes into volumetric cells, and categorizing the cells as inside or outside based on a visibility analysis [16]. the faces between the inside and outside cells form the reconstructed 3d building model. the authors of [16] stated that their solution was robust in terms of missing points due to occlusions, but their results were fully dependent on the completeness of the input roof planes. regardless of the lpc approach chosen, it is important to realize that the building boundaries (outlines) are derived from points classified as roofs and thus could be larger than the real built-up area.

1.3.2. image-based point clouds

a recently published study introduced a generation of lod2 building models from photogrammetric point clouds without using any ancillary data such as building footprints [19]. in this work, the building generation was based on pc segmentation using a region-growing algorithm, the extraction of primitives using random sample consensus [20], and the creation of 3d building models with the polyfit software. polygonal surface reconstruction from point clouds (polyfit) is a framework for the generation of simple polygonal surface models from intersecting planes [21]. according to the software authors [21], polyfit handles noise, outliers and missing data in the pc.

1.3.3. point clouds and image data

technically fascinating papers have described the connection of height and image data to create building models [22, 23]. image data can be successfully used to detect roof edges, and the detected roof edges can form the final vertices of a reconstructed roof model.
a solution presented by [22] was based on the extraction of roof vertices from true-orthophotos (0.1 m/pixel) using the canny edge detector and their integration with a dsm (0.1 m/pixel) created from an ipc to form closed cycles of roof planes. detected roof edges can also support an extraction of roof planes from a pc [23]. methods based on the extraction of roof planes using a region-growing algorithm from high-density lpc (35 points/m2), supported by edges extracted from uhr orthoimages (0.05 m/pixel), have been described in detail in [23].

1.3.4. laser-based point clouds and outlines

the fusion of building outlines from a national topographic database with a planimetric accuracy of 1-2 m and laser scanning elevation data to produce a nationwide 3d landscape model was presented in the netherlands [5]. point clouds with the density reduced from 20 to 3 points/m2 were used to create a 3d landscape model of all topographic objects in lod0 and building models in lod1 with flat, horizontal roofs. in this study, fusing the very accurate point clouds with less accurate outlines might have complicated the creation of the resultant building models; this was the reason for choosing low-quality lod1 instead of lod2 building models.

1.3.5. image-based point clouds and outlines

three european national mapping agencies (united kingdom, ireland and spain) tested oblique aerial datasets to generate very dense point clouds, textured polygonal meshes and 3d building models in lod2 from known footprints [8]. city modeller, a module of the tridicon/hexagon software, was used for the creation of the 3d building models. according to the authors, point clouds produced from oblique imagery are cleaner in comparison with the conventional nadir images and additionally contain points on building facades. however, medium format oblique cameras and image overlaps up to 80 % lead to more flight hours and thus to higher aerial survey costs. furthermore, a large number of images affects the processing time and storage requirements.

1.3.6. outlines and attributes

a fast automatic generation of 3d building models from outlines linked to attribute data (the number of stories and the type of roof) is also possible [24]. this approach requires reshaping the building polygons into an orthogonal form and partitioning them into rectangles; a basic 3d building model is then generated for each rectangle according to the linked attributes. an extension of this approach for an automatic generation of buildings with generally-shaped roof models is possible using non-orthogonal footprints and straight skeleton computation [25]. unfortunately, the straight skeleton technique can only produce hipped roof models. a complete elimination of the pc analysis greatly simplifies the problem of creating lod2 building models but reduces the models' exactness.

the rest of this paper is organized as follows. our aims and motivation are presented in section 2. section 3 contains a detailed description of the technique we developed, supported by many sample images. the evaluation of results and discussion are provided in section 4. section 5 summarizes conclusions and possible future plans.

2. aims

this work builds on our previous publications [11, 12], which were focused on the comparison of existing commercial software solutions, using pc (envi lidar) and also existing outlines (inpho building generator), to create 3d building models. the building models created in our previous investigations were not satisfactory.
the inpho building generator models were simple and topologically correct, but a large number of them did not correspond to the real forms. envi lidar could create realistic and even complex 3d building models, but they were composed of many overlapping polyhedrons and thus contained many topological errors due to a missing topology control. because of these imperfections, we decided to develop our own method for an automatic creation of topologically correct building roof models. only such roof models can be used to produce useful 3d building models corresponding to reality. our method uses vector building outlines together with image and height information derived from asp to reconstruct roof models. our approach incorporates the current requirements of the local national mapping agency (nma) of the czech republic, including the use of existing building outlines (positional accuracy up to 2 meters, usually up to 1 m) and currently available ultra-high resolution asp (ground sample distance 0.15-0.25 m) for the creation of 3d building models without the need to collect extra spatial data.

3. description of the roof model construction

the technique developed combines image and height information in raster format derived from standard top-view asp with approximately 55 % forward and 20 % side overlaps. the image data are colour orthophotos (resolution 0.25 m/pixel) from the czech nma (land survey office). the height data are a normalized dsm (ndsm) created by subtracting the dtm from the dsm; all models are in a raster form with a resolution of 1 m/pixel. the dtm used comes from land survey office data obtained by als. the dsm used comes from forest management institute datasets and was generated by a fusion of stereo-image pcs into the final model and a projection of the final model to a plane. the pcs were generated with the enhanced automatic terrain extraction (eate) module for erdas imagine or imagine photogrammetry (formerly leica photogrammetry suite). the eate is a dense image matching algorithm; it uses a pixel-by-pixel correlation technique to generate very dense point clouds from a stereo imagery coverage [26]. in our case, the normalized cross-correlation was used to produce point clouds with a density of 1 point/m2. the generated pc has a lower quality (height deviations up to 1 m) than the results obtained by state-of-the-art image matching algorithms [27, 28], but it was fully sufficient for our needs. by using a dtm generated from an als dataset for the normalization of the dsm, we introduced another height error into the ndsm. in addition, we converted the final ndsm to 8-bit (integers/meters in the 0 to 255 range) for optimization reasons. height errors could theoretically be up to 2 meters using the processing steps we employed. however, this was not a complication, because our approach did not require a high-quality ndsm.

figure 1. orthophoto.

an example of an orthophoto used in our study is shown in figure 1. a small north-west shift of the building roof image in relation to the ground is recognizable (fig. 1). this is a radial image shift of above-terrain objects due to the central projection. the orthophoto used was not a true orthophoto because of the small overlaps between images. building footprints were obtained from the czech nma, land survey office, which administers the national gis database (fundamental base of geographic data of the czech republic) [29]. building outlines were processed one by one.
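a minimal sketch of the ndsm preparation step described above, in python with numpy (the raster variables are synthetic stand-ins; real dsm and dtm grids would be loaded from co-registered rasters, e.g. with a library such as rasterio):

import numpy as np

# stand-in rasters; in practice dsm and dtm are co-registered height grids
# in metres (here 1 m/pixel)
rng = np.random.default_rng(0)
dtm = rng.uniform(200.0, 210.0, size=(100, 100))     # bare-earth heights
dsm = dtm + rng.uniform(0.0, 30.0, size=(100, 100))  # surface heights

ndsm = dsm - dtm                                        # heights above terrain
ndsm = np.clip(np.rint(ndsm), 0, 255).astype(np.uint8)  # 8-bit integer metres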
three approximate roof height values (top, bottom, and overall height) based on the ndsm were calculated for each building footprint. the top height was calculated as the 95th percentile of all ndsm values in the building footprint. the roof bottom height was calculated as the third quartile of all ndsm values within a one-meter distance around (inside and outside) the building outline. the roof overall height was calculated as the top height minus the roof bottom height.

the colour orthophoto was converted to grayscale values (luma) as a weighted sum of the red (r), green (g) and blue (b) components with the formula r · 0.299 + g · 0.587 + b · 0.114. roof edges were detected from the grayscale orthophoto with a line segment detector (lsd). the lsd algorithm detects locally straight contours (also called line segments) on grey-level images without a parameter tuning [30].

figure 2. orthophoto of building with hip roof [top] and edges detected with the line segment detector [bottom].
figure 3. orthophoto of building with hip roof [top], the result of the canny edge detector with automatic parameter tuning [middle], followed by a probabilistic hough transform [bottom].

according to [31], lsd is an automatic image analysis tool working in a manner similar to human perception, because the level of detail depends on the size of the entire image being analysed. we selected this edge detection algorithm because it achieves satisfactory results without any need for parameter adjusting, is very fast, and is a part of the open source computer vision library (opencv) [32]. it is thus an ideal edge detection algorithm for fully automated procedures. its disadvantage is that it works only with grey-level images, so some edges can be lost during the colour-to-grayscale conversion. nonetheless, this is not a disadvantage for the detection of roof edges, which are typically defined by roof planes having different brightness levels depending on their exposure to the sun. the performance of the lsd was compared to the standard edge detection technique based on the canny edge detector [33] with an automatic parameter tuning [34] followed by a probabilistic hough transform [35]. the results of both edge detection approaches for the same image are in figures 2 and 3.

figure 4. orthophoto of building with hip roof and building outline (solid line): edges detected [top], edges merged [middle] and edges filtered [bottom].

both approaches detected edges that represent the main roof edges and other elements, such as dormers and building outlines. edges on the building outlines were detected due to the sudden change of pixel values on the boundaries of the clipped orthophoto. according to a visual inspection, lsd (fig. 2) provided better results: the detected edges matched each other and did not overlap, which was important for a further post-processing. the post-processing consists of merging adjacent edges and filtering them to remove duplicates. the method proposed in our study is based on the roof edge detection and categorization; however, edges detected with any method must represent meaningful roof elements (ridges, hips, valleys, dormers) for a proper categorization. unfortunately, some roof elements were fragmented (the ridge in fig. 2) or duplicated (the hips in fig. 2). that is why we merged the detected edges and removed the duplications prior to the categorization.
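a minimal sketch of the detection step, in python with opencv and numpy (the file name and the ndsm samples are placeholders, and note that opencv's line segment detector may be unavailable in some opencv builds for licensing reasons):

import cv2
import numpy as np

# grayscale conversion and line segment detection
ortho = cv2.imread("building_clip.tif")          # bgr orthophoto clipped to one footprint
gray = cv2.cvtColor(ortho, cv2.COLOR_BGR2GRAY)   # luma = 0.299 r + 0.587 g + 0.114 b
lsd = cv2.createLineSegmentDetector()
segments, _, _, _ = lsd.detect(gray)             # each entry: (x1, y1, x2, y2) in pixels

# approximate roof heights from ndsm samples (synthetic stand-ins here)
rng = np.random.default_rng(0)
ndsm_in_footprint = rng.uniform(3.0, 12.0, 500)  # ndsm pixels inside the outline
ndsm_near_outline = rng.uniform(2.0, 9.0, 200)   # ndsm pixels within 1 m of the outline
top = np.percentile(ndsm_in_footprint, 95)       # roof top height
bottom = np.percentile(ndsm_near_outline, 75)    # roof bottom height (third quartile)
overall = top - bottom                           # roof overall height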
the detected edges were merged based on the following three conditions: a similar angle (anglediff ≤ 9°), adjacency (endpoint distance ≤ 1.25 m), and the new edge formed by joining them must be longer than each of them. subsequently, shorter parallel edges (within 1 m) were filtered out. the results of the merging and filtration are shown in figure 4: the edges representing the ridge have been correctly merged and the shorter duplicate edges have been removed.

having executed the previous steps, it was possible to perform an initial edge categorization based on the analysis of the height information from the ndsm and the spatial relationships between the detected edges and the building footprint. this unique approach was based on the detection of ridges, hips/valleys, and eaves. the roof ridge was recognized using several characteristic features. the main specific features included parallelism with at least one building outline segment (anglediff ≤ 9°), ideally no (or a very small) difference in the height of the endpoints (heightdiff ≤ 2 m), and, for the main ridge, the height of the endpoints should correspond to the primary top height (heightdiff ≤ 2 m). hips and valleys were categorized using different criteria: an edge was marked as a hip/valley if it was non-parallel to the building outline (anglediff > 9°) and farther than 1 m from it (distance > 1 m). for edges 2.5 m and longer, the height difference of the endpoints had to be at least 0.5 m; for edges shorter than 2.5 m, the height difference was not checked, so the first two conditions were sufficient. the remaining edges were classified as eaves/gutters or uncategorized edges. eaves or gutters were parallel (anglediff ≤ 9°) and close (distance < 1 m) to the building outline, ideally had no (or a very small) difference in the height of the endpoints (heightdiff ≤ 2 m), and corresponded to the primary roof bottom height (heightdiff ≤ 2 m). all other edges that did not meet the relevant criteria for specific roof elements were classified as uncategorized.

figure 5. categorized edges: ridges (r), hips/valleys (h), eaves (e), uncategorized (u).

figure 5 shows the results of the categorization; all edges representing the main roof elements were properly categorized.

a multi-plane roof type (hip/half-hip, saddle, dormer, or pyramid) was determined according to the existence and position of the ridge in relation to the building outline and the adjacent hips and valleys. a hip (and half-hip) roof was defined as a single ridge roof with at least one adjacent hip (fig. 5). a saddle roof was defined as a single ridge roof with at least one endpoint near (distance < 1 m) the building outline and with no adjacent hips. a dormer was defined as a single ridge roof with at least one adjacent valley and no hips. a pyramid roof had no ridge but at least one hip. a mono-plane roof type (flat or shed) was determined based on a height analysis of the eaves (outline segments) and the approximate roof height values: a flat roof had similar heights of the outline segments, top, and bottom, while a shed roof had equally tilted parallel eaves. a roof was classified as unknown if the edge configuration or the height analysis did not match the previous roof type definitions. a sketch of the edge categorization rules is given below.
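a minimal sketch of the edge categorization rules described above (python; the edge representation, the helper names, and the simplified geometry are our assumptions — the paper itself publishes no code):

def angle_diff(a1, a2):
    """smallest difference between two undirected line angles [degrees]."""
    d = abs(a1 - a2) % 180.0
    return min(d, 180.0 - d)

def categorize_edge(edge, outline_angles, dist_to_outline, top, bottom):
    """edge: dict with 'angle' [deg], endpoint heights 'h1'/'h2' [m], 'length' [m];
    outline_angles: angles of the building outline segments;
    dist_to_outline: distance of the edge from the outline [m];
    top/bottom: primary roof top and bottom heights from the ndsm [m]."""
    hdiff = abs(edge["h1"] - edge["h2"])
    parallel = any(angle_diff(edge["angle"], a) <= 9.0 for a in outline_angles)

    if parallel and hdiff <= 2.0 and abs(max(edge["h1"], edge["h2"]) - top) <= 2.0:
        return "ridge"
    if not parallel and dist_to_outline > 1.0 and \
            (edge["length"] < 2.5 or hdiff >= 0.5):
        return "hip/valley"
    if parallel and dist_to_outline < 1.0 and hdiff <= 2.0 \
            and abs(min(edge["h1"], edge["h2"]) - bottom) <= 2.0:
        return "eave"
    return "uncategorized"

the roof type then follows from the set of categorized edges, as described in the text.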
ridges and hips/valleys were additionally sorted: the ridges according to height (highest to lowest) and the hips/valleys by length (longest to shortest). the roof model reconstruction started with the highest ridge and its adjacent hips/valleys. hips/valleys were considered adjacent to a ridge if one of their endpoints was close (distance < 1.5 m) to one of the ridge endpoints. hips and valleys were further differentiated from each other according to their orientation to the ridge: hips form an obtuse angle with the ridge, while valleys form an acute angle with the ridge.

the edges detected by our method (ridges, hips, valleys) could have a positional offset from their real position due to the use of a conventional orthophoto, in which the image of the roof elements might be radially shifted. thus, the position of the detected edges could not be used directly for the creation of roof models as in [22]. instead, our method used information about the angles, lengths, and topology of the detected edges for the creation of models according to additional rules (or rather, constraints) for individual roof types. our method created roof models only for building outlines in the shape of a polygon with two parallel sides (subsequently: "polygon"). there was one common constraint for all roof types with a ridge: the ridge was always constructed exactly in the middle of the polygon's parallel sides, and its angle was calculated as the average of the angles of the polygon's parallel sides if they had angles similar to the detected ridge (anglediff ≤ 9°). in the following text, this constraint is referred to as "the rules".

the creation of a hip roof began with the construction of a ridge according to the rules. the length of the constructed ridge corresponded to the length of the detected ridge. the ridge was constructed from the information about the angle and length, starting from the centroid (for a rectangular polygon) or from a fixed ridge endpoint (for a side roof polygon). the hips, line segments that connect the endpoints of the constructed ridge and the nearest vertices of the polygon, were then created, and our algorithm checked the angles of the constructed hips against the detected hips. if at least one angle of a constructed hip corresponded to one angle of a detected hip, the roof type was confirmed. the vector model (skeleton) was created by joining the polygon, the ridge, and the hips.

figure 6. reconstructed roof model.

figure 6 shows the reconstructed hip roof model. however, if the angles of the constructed and detected hips did not match, the algorithm continued by creating a half-hip roof. the half-hip roof model creation was similar to that of the hip roof, and the ridge design was identical. the main difference was in the construction of the hips, which were created based on the detected angles. from the detected hips, those which formed a similar angle to the ridge (anglediff ≤ 9°) were selected and their average angle was calculated; if only one hip was detected, its angle value was used. the hips were rendered as line segments connecting the endpoints of the ridge and the intersections of the hip half-lines with the perpendicular sides of the polygon. unfortunately, there was no control in the rendering of the half-hip roof, because all the available values had already been used to construct the roof model. a skeleton of the half-hip roof was created by joining the polygon, the ridge, and the hips.
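a minimal sketch of the common ridge constraint — the ridge centered between the polygon's two parallel sides, with an averaged angle (python; the data layout and the naive angle averaging are our assumptions):

import numpy as np

def line_angle(p, q):
    """undirected angle of segment pq in degrees, in [0, 180)."""
    return float(np.degrees(np.arctan2(q[1] - p[1], q[0] - p[0])) % 180.0)

def construct_ridge(side_a, side_b):
    """midline between two (nearly) parallel polygon sides.
    each side is ((x1, y1), (x2, y2)); returns ridge endpoints and angle."""
    a1, a2 = np.asarray(side_a, float)
    b1, b2 = np.asarray(side_b, float)
    if np.dot(a2 - a1, b2 - b1) < 0:      # orient both sides the same way
        b1, b2 = b2, b1
    r1, r2 = (a1 + b1) / 2.0, (a2 + b2) / 2.0
    # naive averaging suffices because the sides are nearly parallel
    # (and away from the 0/180 degree wrap-around)
    angle = 0.5 * (line_angle(a1, a2) + line_angle(b1, b2))
    return r1, r2, angle

# a 10 m x 6 m rectangle: the ridge runs along the middle, angle 0 degrees
print(construct_ridge(((0, 0), (10, 0)), ((0, 6), (10, 6))))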
the gable roof model was composed only of the ridge. as in the previous cases, the ridge was designed according to the rules and rendered only using the information about the angle, starting from the centroid or from a fixed ridge endpoint. the algorithm checked the length of the constructed ridge against the length of the detected ridge; if the difference was smaller than the defined threshold (lengthdiff < 2 m), the gable roof model was validated.

a pyramid roof skeleton consisted of 4 hips connecting the centroid and the vertices of the rectangle. it was not appropriate to compare the lengths of the detected hips against the rendered ones, because opposite hips could have been joined into a diagonal during the edge merging process. the algorithm therefore checked the angles of both the detected and the rendered hips; if an angle of at least one rendered hip corresponded to an angle of one detected hip, a pyramid roof type was confirmed.

if the building outline shape was complex, the building was decomposed. the building decomposition consisted of dividing the outline into individual line segments and extending them inward to form inner line segments (intersections). the intersections could cross each other and form smaller inner line segments. our aim was to combine the outline segments and the inner line segments (further "line segments") to find the nearest parallel line segments on both sides of the ridge. the line segments found defined the two parallel sides of a polygon. depending on the existence of valleys, a polygon had to meet certain criteria. if there were no valleys adjacent to the ridge, the shape of the polygon should have corresponded to a rectangle: for a saddle roof, the ridge and the parallel rectangle sides were approximately the same length (lengthdiff < 2 m); for a hip/half-hip roof, the parallel rectangle sides were about the same length as, or longer than, the ridge. if there were valleys adjacent to the ridge, the shape of the polygon was complex and depended on the number of valleys. such a polygon usually represented a side roof rectangle and contained a so-called fixed ridge endpoint, defined as the intersection of a valley half-line and a ridge line following the rules. an example of the processing of a complex building outline is shown in figure 7.

figure 7. orthophoto of complex building with gable roof [top left], building outline (solid line) and intersections (dashed line) [top right], categorized edges (legend as in fig. 5) [middle left], main ridge (r) and polygon (dashed line) [middle right], side ridge (r), valleys (v) and polygon (dashed line) [bottom left], reconstructed roof model with a fixed ridge endpoint (p) [bottom right].

4. results and discussion

the strengths of our method include its robustness and the topological correctness guaranteed by the rules for the reconstruction of individual roof types. figure 8 shows an example of a roof model created for one l-shaped building with a half-hip roof. not all half-hips were detected, and the detected valleys were shorter than in reality. despite the incompleteness of the detection, a satisfactory roof model was reconstructed, demonstrating the robustness of our approach.

figure 8. orthophoto of complex building with half-hip roof and building outline (solid line) [top left], ndsm with 0.5 m/pixel resolution and building outline [top right], categorized edges (legend as in fig. 5) [bottom left], and reconstructed roof model with a fixed ridge endpoint (p) [bottom right].
unfortunately, the side ridge was incorrectly connected directly to the main ridge, but the algorithm prevented the mutual crossing of the main and side ridges that would otherwise occur due to the spacing and angle of the detected valleys. the roof model created was thus topologically correct. in this example (fig. 8), an ndsm with a 0.5 m/pixel resolution created from an ipc was used.

another significant advantage of our solution lies in the possibility of using ndsms with different levels of quality. as noted above, the roof edges were detected using orthophotos, and the ndsm only helped with their categorization. thus, the quality of the orthophotos employed by such a method is important, but the quality of the ndsm is less so.

figure 9. ndsm with resolution 1 m/pixel and categorized edges (legend as in fig. 5) [left], ndsm with resolution 2 m/pixel and categorized edges [right].

figure 9 illustrates the results of the edge categorization using ndsms with different resolutions. using a lower quality ndsm led to several changes in the categorization of the edges at the top of the building. however, these changes did not affect the construction of the roof model, because the categorization of the key edges remained unchanged. this demonstrates that ndsms of varying quality and origin can be used to reconstruct roof models without any significant side effects.

we tested our method on a small dataset containing approximately 30 buildings of various shapes (number of vertices: 4–16) and roof complexity (number of roof types: 1–3). excellent results were achieved for simple and medium complex buildings. examples of the evaluation are shown in figure 10. stereo photogrammetric measurements were used to obtain the reference data in the form of corresponding 3d points (colour dots in fig. 10). the root mean square error (rmse) [36] was calculated separately between the vertices of the building outline (input data) and of the roof skeleton (extracted data) and the reference points for every building. the average rmse was 0.73 m for the building outline and 0.92 m for the roof skeleton.

figure 10. examples of evaluation with reference vertices as eaves (e), ridges/half-hips (r) and shifts between corresponding vertices (short line).

unfortunately, the results for very complex cases were unsatisfactory. the model reconstruction failed due to missing key roof edges. this situation occurred when a roof was inappropriately illuminated, so some edges could not be properly detected. for such situations, we propose using multitemporal images, because these can lead to significant improvements by allowing more roof edges to be detected. figure 11 shows colour orthophotos of a complex building with a mixed roof type in two epochs (2014 and 2016). different roof edges were detected in each epoch, and their combination enabled a reconstruction of the roof model.

figure 11. orthophoto of complex building with mixed roof type and categorized edges (legend as in fig. 5) from epoch 1 (2014) [top], epoch 2 (2016) [middle] and reconstructed roof model [bottom].
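the evaluation above compares corresponding vertices via the rmse; a minimal sketch in python (the exact rmse definition used in [36] may differ in detail, and the coordinates are illustrative, not data from the paper):

import numpy as np

def rmse(extracted, reference):
    """root mean square error between matched vertex pairs (n x 2 or n x 3)."""
    e = np.asarray(extracted, dtype=float)
    r = np.asarray(reference, dtype=float)
    residuals = np.linalg.norm(e - r, axis=1)   # per-vertex displacement
    return float(np.sqrt(np.mean(residuals ** 2)))

# illustrative coordinates only
skeleton_vertices = [(10.2, 4.9), (19.8, 5.3)]
reference_points = [(10.0, 5.0), (20.0, 5.0)]
print(rmse(skeleton_vertices, reference_points))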
the weaknesses of our method include the limitations resulting from the application of the rules for individual roof types. for example, the current implementation does not permit the reconstruction of roof ridges anywhere except in the middle or on the outline of buildings. also, undefined roof types, such as sawtooth, mansard, butterfly, and dome, cannot be created. another drawback is the way in which the heights of the detected edges are determined: the radial shift of the building roof image is not taken into account, so the detected roof edges are slightly horizontally shifted. however, for their categorization, we used the endpoint heights determined from the positionally correct ndsm raster. an inaccurate determination of the heights could, in extreme cases, lead to erroneous edge categorizations and, therefore, make it impossible to reconstruct a roof model. for this reason, we recommend using only the central part of the orthophotos, where the image shift is minimal or very small; this is the standard procedure for creating a seamless orthophoto mosaic, used in our case. it is clear that using a true-orthophoto is ideal for the edge detection, while oblique images are extremely inappropriate. however, oblique images have many benefits for the creation of high-quality dsms using dense image matching. in order to create accurate 3d building models, it is necessary to re-determine all elevations from the ndsm (or dsm) after the roof reconstruction.

5. conclusions

in this paper, we introduced an approach for an automatic reconstruction of roof models using planar/2d building footprints, a nadir orthophoto, and an ndsm. all data types used were typical spatial datasets often managed by national mapping agencies. the solution presented was developed and tested using real standard-resolution production data (0.25 m/pixel for orthophotos and 1 m/pixel for the dsm) and illustrated that it is not necessary to acquire extremely high resolution spatial datasets. because commonly available datasets were employed, the method described here is widely applicable and inexpensive. the roof model reconstruction was based on the extraction of 2d roof edges from an orthophoto, categorized using height and spatial relationship information. the categorized edges were used for determining the type of the roof and its key parameters (especially the hip angles). buildings with a complex footprint were solved in part through a decomposition into simple shapes according to the detected ridges and adjacent valleys. the simple-to-use method described here allows for the creation of visually attractive building roof models composed of gable, hip, half-hip, tent, flat, and shed roof types. the simple parameter tuning consisted of defining several variables with angle and length thresholds. the preliminary results presented in this paper are very promising. future work will focus on the possibility of using fully blocked building footprints and step edge extraction with more accurate ndsms, and on the implementation of other roof types. in addition, future plans include the creation of citygml standard 3d building models and a real-world implementation in cooperation with the land survey office in the czech republic.

acknowledgements

this work was supported by the czech technical university in prague student grant competition (grant number sgs19/048/ohk1/1t/11).

references

[1] m. hůlková, k. pavelka, e. matoušková. automatic classification of point clouds for highway documentation. acta polytechnica 58:165–170, 2018. doi:10.14311/ap.2018.58.0165.
[2] g. gröger, l. plümer. citygml – interoperable semantic 3d city models. isprs journal of photogrammetry and remote sensing 71:12–33, 2012. doi:10.1016/j.isprsjprs.2012.04.004.
[3] m. med, p. souček. analysis and implementation of application schemas for the inspire buildings theme. acta polytechnica 56:291–300, 2016. doi:10.14311/ap.2016.56.0291.
[4] g. vosselman, h.-g. maas (eds.). airborne and terrestrial laser scanning. dunbeath: whittles, crc press, 2010.
[5] g. vosselman, s. o. elberink, m. post, et al. from nationwide point clouds to nationwide 3d landscape models. photogrammetric week, pp. 247–256, 2015.
[6] c. s. gladstone, a. gardiner, d. holland. a semi-automatic method for detecting changes to ordnance survey topographic data in rural environments. in geobia, pp. 396–401. 2012.
[7] r. qin, j. tian, p. reinartz. 3d change detection – approaches and applications. isprs journal of photogrammetry and remote sensing 122:41–56, 2016. doi:10.1016/j.isprsjprs.2016.09.013.
[8] f. remondino, i. toschi, m. gerke, et al. oblique aerial imagery for nma: some best practices. in l. halounova (ed.), proceedings of the xxiii isprs congress from human history to the future with spatial information, vol. iii-2, pp. 639–645. international society for photogrammetry and remote sensing (isprs), 2016. doi:10.5194/isprs-archives-xli-b4-639-2016.
[9] g. mandlburger, k. wenzel, a. spitzer, et al. improved topographic models via concurrent airborne lidar and dense image matching. isprs annals of photogrammetry, remote sensing and spatial information sciences iv-2/w4:259–266, 2017. doi:10.5194/isprs-annals-iv-2-w4-259-2017.
[10] n. haala, c. brenner. extraction of buildings and trees in urban environments. isprs journal of photogrammetry and remote sensing 54:130–137, 1999. doi:10.1016/s0924-2716(99)00010-6.
[11] v. hron, v. kostin, l. halounová. comparison of software solutions for automatic generation of 3d building models. in international multidisciplinary scientific geoconference surveying geology and mining ecology management, vol. 1, pp. 513–520. 2014. doi:10.5593/sgem2014/b21/s8.065.
[12] v. hron, l. halounová. automatic generation of 3d building models from point clouds. in i. ivan, i. benenson, b. jiang, et al. (eds.), geoinformatics for intelligent transportation, pp. 109–119. 2014. doi:10.1007/978-3-319-11463-7_8.
[13] n. haala, m. kada. an update on automatic 3d building reconstruction. isprs journal of photogrammetry and remote sensing 65:570–580, 2010. doi:10.1016/j.isprsjprs.2010.09.006.
[14] f. rottensteiner, g. sohn, j. jung, et al. the isprs benchmark on urban object classification and 3d building reconstruction. isprs annals of photogrammetry, remote sensing and spatial information sciences i-3:293–298, 2012. doi:10.5194/isprsannals-i-3-293-2012.
[15] f. rottensteiner, g. sohn, m. gerke, et al. results of the isprs benchmark on urban object detection and 3d building reconstruction. isprs journal of photogrammetry and remote sensing 93:256–271, 2014. doi:10.1016/j.isprsjprs.2013.10.004.
[16] j. yan, w. jiang, j. shan. a global solution to topological reconstruction of building roof models from airborne lidar point clouds. isprs annals of photogrammetry, remote sensing and spatial information sciences iii-3:379–386, 2016. doi:10.5194/isprs-annals-iii-3-379-2016.
[17] b. xiong, s. oude elberink, g. vosselman. a graph edit dictionary for correcting errors in roof topology graphs reconstructed from point clouds. isprs journal of photogrammetry and remote sensing 93:227–242, 2014. doi:10.1016/j.isprsjprs.2014.01.007.
[18] b. xiong, m. jancosek, s. o. elberink, g. vosselman. flexible building primitives for 3d building modeling. isprs journal of photogrammetry and remote sensing 101:275–290, 2015. doi:10.1016/j.isprsjprs.2015.01.002.
[19] i. pârvu, f. remondino, e. özdemir. lod2 building generation experiences and comparisons. journal of applied engineering sciences 8:59–64, 2018. doi:10.2478/jaes-2018-0019.
dynamic behaviour of the patented kobold tidal current turbine: numerical and experimental aspects
d. p. coiro, a. de marco, f. nicolosi, s. melone, f. montella

abstract: this paper provides a summary of the work done at dpa on numerical and experimental investigations of a novel patented vertical-axis hydro turbine with variable-pitch blades, designed to harness energy from marine tidal currents. ponte di archimede s.p.a., located in messina, italy, owns the patented kobold turbine that is moored in the messina strait, between the mainland and sicily. the turbine has a rotor with a diameter of 6 meters and three vertical blades of 5 meters span with a 0.4 m chord, using an ad hoc designed curved airfoil that produces high lift with no cavitation. the rated power is 160 kw at a 3.5 m/s current speed, which corresponds to a 25 % global system efficiency. the vawt and vawt_dyn computer codes, based on the double multiple streamtube model, have been developed to predict the steady and dynamic performances of a cycloturbine with fixed or self-acting variable-pitch straight blades. a theoretical analysis and a numerical prediction of the turbine performances, as well as experimental test results on both a model and the real-scale turbine, will be presented and discussed.

keywords: vertical-axis hydro turbine, variable pitch, double multiple streamtube, tidal currents, tidal energy.

notation
a: interference factor
c [m]: blade chord length
c_d: airfoil drag coefficient
c_l: airfoil lift coefficient
c_mc/4: airfoil quarter-chord pitching moment coefficient
C_D: blade drag coefficient
C_L: blade lift coefficient
C_P: turbine performance coefficient
C_Q: turbine torque coefficient
D [N]: drag
dF [N]: elementary force acting on the elementary actuator disk
I_P [kg m^2]: blade moment of inertia
I_T [kg m^2]: turbine moment of inertia
L [N]: lift
M [Nm]: instantaneous turbine torque
M_c [Nm]: instantaneous load torque
M_c/4 [Nm]: quarter-chord pitching moment
M_m [Nm]: average turbine torque
N [N]: blade radial force
N_b: number of blades
P [W]: instantaneous turbine mechanical power
P_m [W]: average turbine mechanical power
R [m]: turbine radius
Re: blade reynolds number
S [m^2]: turbine frontal area
T [N]: blade tangential force
V [m/s]: local velocity
V_R [m/s]: tip speed
V_inf [m/s]: asymptotic velocity
x_c/4 [% blade chord]: blade aerodynamic centre position
x_hinge [% blade chord]: floating hinge position
alpha [rad]: blade angle of attack
theta_tan [rad]: angle between the local velocity and the local tangent at the blade
delta_zv [rad]: blade pitch angle
delta_zv'' [rad/s^2]: blade pitch angle acceleration
lambda: tip speed ratio, \lambda = \omega R / V_\infty
rho [kg/m^3]: fluid density
sigma: solidity, \sigma = N_b c / R
vartheta [rad]: blade azimuth angle
omega' [rad/s^2]: turbine angular acceleration
omega [rad/s]: turbine angular velocity
subscripts: d, conditions at the downwind actuator disk; u, conditions at the upwind actuator disk; h, hinge

1 introduction
marine current energy is a type of renewable energy resource that has been exploited less than wind energy. only in recent years have some countries devoted funds to research aimed at developing tidal current power stations. tidal current turbines, as in the wind community, can be divided into vertical-axis and horizontal-axis types. although horizontal-axis turbines have been more widely used than vertical-axis types for wind energy exploitation, vertical-axis turbines could present significant advantages for tidal current exploitation, because they are simple to build and reliable in working conditions. therefore, at the beginning of the studies, vertical-axis wind turbines were taken as models for hydro-turbines. the blades of darrieus-type vertical-axis wind turbines are fixed, and they perform well when the blade solidity is low and the working speed is high. for this reason, the first hydro-turbines were impossible to start. a variable-pitch blade system can be a solution to this problem. some prototypes with different variants of this system have therefore been developed around the world: the kobold turbine in the strait of messina, italy; the cycloidal turbine in guanshan, china; the moment-control turbine at edinburgh university, uk; and the mass-stabilised system turbine by kirke and lazauskas in inman valley, south australia.
the kobold turbine has been under development since 1997: the rotor has a self-acting variable pitch, and the kobold blades have an ad hoc designed airfoil, called hlift, chosen to be cavitation-free and to have a high lift performance. the methods for calculating the hydrodynamic performances of vertical-axis turbines also come from wind turbines: in the 1970s templin developed the single-disk single-tube model, and then strickland put forward the single-disk multi-tube model. in the 1980s paraschivoiu introduced the double-disk multi-tube model. the vawt and vawt_dyn computer codes, based on this theory, have been developed to predict the steady and dynamic performances of a cycloturbine with fixed or self-acting variable-pitch straight blades. the numerical results have been compared with two sets of experimental data: one set is obtained from wind tunnel tests on a scaled model, and the other is based on field data from the kobold prototype.

2 double multiple streamtube
in order to analyze the flow field around a vertical-axis turbine, a dms model was used. the dms model is an evolution of the previous "momentum models": the single streamtube model, the multiple streamtube model and the double streamtube model [1]. the dms model [2] assumes that the flow through the rotor can be modelled by examining the flow through several streamtubes; the flow disturbance produced by the rotor is determined by equating the aerodynamic forces on the turbine rotor to the time rate of change in momentum through the rotor, as depicted in fig. 1. in the dms model, the flow velocities vary in both the upwind and downwind regions of the streamtube, as well as from streamtube to streamtube. dms is therefore able to analyse the interference between the downwind blade and the upwind blade's wake, in order to evaluate more accurately the local value of the velocity and the instantaneous blade load. as shown in fig. 1, the rotor is modelled as a series of elementary streamtubes, and each streamtube is modelled with two actuator disks in series. across each actuator disk the pressure drops, and this drop is equivalent to the streamwise force dF on the actuator disk divided by the actuator disk area dA. the elementary forces dF_u and dF_d, respectively on the upwind and downwind disk, given by the momentum principle, are

dF_u = \rho \, dA_u \, v_u (V_\infty - v_2)   (1)
dF_d = \rho \, dA_d \, v_d (v_2 - v_3)   (2)

where v_2 is the velocity between the two disks and v_3 the velocity in the far wake, so that v_d, the velocity on the downwind actuator disk, is influenced by the velocity v_u on the upwind actuator disk. the elementary forces dF on the actuator disks may also be calculated using blade element theory. if the upwind and downwind interference factors are defined as

a_u = \frac{V_\infty - v_u}{V_\infty}, \qquad a_d = \frac{v_2 - v_d}{v_2}   (3)

the mathematical problem can be reduced to the calculation of a_u and a_d. because of the non-linearity of the equations, the problem must be resolved iteratively.
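as an illustration of this iterative solution, the sketch below drives a fixed-point iteration for the interference factor of a single elementary streamtube. the function g(a), which packs the blade-element side of the balance a(1 - a) = g(a), and the toy numbers are assumptions for illustration, not the actual vawt code.

```python
import numpy as np

def solve_interference(g, a0=0.1, tol=1e-6, max_iter=200):
    # iterate a(1 - a) = g(a) for one elementary streamtube, where g(a)
    # is the blade-element side of the momentum balance evaluated with
    # the disk velocity implied by the current guess of a
    a = a0
    for _ in range(max_iter):
        rhs = g(a)
        # root of a(1 - a) = rhs on the physical branch a < 0.5
        a_new = 0.5 * (1.0 - np.sqrt(max(1.0 - 4.0 * rhs, 0.0)))
        if abs(a_new - a) < tol:
            return a_new
        a = a_new
    return a

# toy loading that weakens as the flow through the disk slows down
a_u = solve_interference(lambda a: 0.05 * (1.0 - a) ** 2)
print(a_u)   # converges to roughly 0.048 for this toy g(a)
```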
if the rotor blades have a fixed pitch angle or an assigned pitch variation (i.e. sinusoidal as in pinson, cycloidal, etc.), the mathematical model reduces, for each elementary streamtube, to one momentum balance equation for the upwind actuator disk and one for the downwind actuator disk:

a_u (1 - a_u) \sin\vartheta = \frac{\sigma}{8} \left(\frac{V_{Ru}}{V_\infty}\right)^2 \left[ C_{Lu} \sin\theta_{\tan,u} - C_{Du} \cos\theta_{\tan,u} \right]
a_d (1 - a_d)(1 - 2a_u)^2 \sin\vartheta = \frac{\sigma}{8} \left(\frac{V_{Rd}}{V_\infty}\right)^2 \left[ C_{Ld} \sin\theta_{\tan,d} - C_{Dd} \cos\theta_{\tan,d} \right]   (4)

[fig. 1: double multiple streamtube model]

if the rotor blades have a self-acting variable pitch angle [3, 4, 5], another equation is also necessary for each actuator disk: the hinge moment equilibrium. in this case, in fact, the blade is partially free to pitch under the action of the aerodynamic and inertia forces, so as to reduce the angle of attack and hence the tendency of the blade to stall. the allowed angular swinging of the blade is limited by the presence of two blocks. in this way the mathematical model is represented by two systems of equations, each consisting of two equations: momentum balance and hinge moment equilibrium. for one blade and for the upwind actuator disk,

a_u (1 - a_u) \sin\vartheta = \frac{\sigma}{8} \left(\frac{V_{Ru}}{V_\infty}\right)^2 \left[ C_{Lu}(\alpha_u, \mathrm{Re}_u, \delta_{zvu}) \sin\theta_{\tan,u} - C_{Du}(\alpha_u, \mathrm{Re}_u, \delta_{zvu}) \cos\theta_{\tan,u} \right]
C_{mc/4,u}(\alpha_u, \mathrm{Re}_u, \delta_{zvu}) + C_{Nu}(\alpha_u, \mathrm{Re}_u, \delta_{zvu})\,(x_{c/4} - x_{hinge}) \cos\delta_{zvu} + C_{Tu}(\alpha_u, \mathrm{Re}_u, \delta_{zvu})\,(x_{c/4} - x_{hinge}) \sin\delta_{zvu} = 0   (5)

and for the downwind actuator disk,

a_d (1 - a_d)(1 - 2a_u)^2 \sin\vartheta = \frac{\sigma}{8} \left(\frac{V_{Rd}}{V_\infty}\right)^2 \left[ C_{Ld}(\alpha_d, \mathrm{Re}_d, \delta_{zvd}) \sin\theta_{\tan,d} - C_{Dd}(\alpha_d, \mathrm{Re}_d, \delta_{zvd}) \cos\theta_{\tan,d} \right]
C_{mc/4,d}(\alpha_d, \mathrm{Re}_d, \delta_{zvd}) + C_{Nd}(\alpha_d, \mathrm{Re}_d, \delta_{zvd})\,(x_{c/4} - x_{hinge}) \cos\delta_{zvd} + C_{Td}(\alpha_d, \mathrm{Re}_d, \delta_{zvd})\,(x_{c/4} - x_{hinge}) \sin\delta_{zvd} = 0   (6)

[fig. 2: hinge moment equilibrium]

the instantaneous torque and power produced by the blade are given by the moment equilibrium around the turbine axis:

M = N\,(x_{c/4} - x_{hinge}) \cos\delta_{zv} + T \left[ R - (x_{c/4} - x_{hinge}) \sin\delta_{zv} \right], \qquad P = M \omega   (7)

to obtain the mean torque and mechanical power produced by N_b blades in a revolution, it is necessary to average the instantaneous values:

M_m = \frac{N_b}{2\pi} \int_0^{2\pi} \left\{ N\,(x_{c/4} - x_{hinge}) \cos\delta_{zv} + T \left[ R - (x_{c/4} - x_{hinge}) \sin\delta_{zv} \right] \right\} d\vartheta   (8)
P_m = \frac{N_b\,\omega}{2\pi} \int_0^{2\pi} \left\{ N\,(x_{c/4} - x_{hinge}) \cos\delta_{zv} + T \left[ R - (x_{c/4} - x_{hinge}) \sin\delta_{zv} \right] \right\} d\vartheta   (9)

to simulate the dynamic performances, we have to resolve only the equation of the moment equilibrium around the turbine axis (10) for fixed blades, or N_b + 1 equations for N_b floating blades, adding the moment equilibria around the hinge axes (11):

I_T \, \dot\omega = \sum_{i=1}^{N_b} M_i - M_c   (10)
I_P \left( \ddot\delta_{zv,i} + \dot\omega \right) = M_{h,i}, \qquad i = 1, \dots, N_b   (11)

[fig. 3: torques on a kobold turbine]
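to show how eq. (10) can be marched in time for a start-up simulation, a minimal sketch follows. the explicit euler scheme and the toy torque and load models are assumptions for illustration; vawt_dyn additionally integrates the N_b hinge equations of eq. (11).

```python
import numpy as np

def simulate_startup(torque, load, inertia, t_end=50.0, dt=1e-3):
    # explicit euler integration of I_T * domega/dt = M(theta, omega) - M_c
    n = int(t_end / dt)
    omega, theta = 0.0, 0.0
    history = np.zeros((n, 2))                 # columns: time, omega
    for i in range(n):
        domega = (torque(theta, omega) - load(omega)) / inertia
        omega += domega * dt
        theta = (theta + omega * dt) % (2.0 * np.pi)
        history[i] = (i * dt, omega)
    return history

# toy models: an azimuth-dependent rotor torque and a linear load torque
hist = simulate_startup(torque=lambda th, om: 120.0 * (1.0 + 0.3 * np.sin(3.0 * th)),
                        load=lambda om: 40.0 * om,
                        inertia=30.0)
print(hist[-1])   # the angular velocity settles near 120/40 = 3 rad/s
```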
3 airfoils and aerodynamic characteristics
the rotor performances were tested using five airfoils: three symmetrical airfoils, naca 0012, 0015 and 0018, and two cambered airfoils, naca 4415 and hlift18. the last was designed at dpa; it is a high-lift [6, 7], non-cavitating airfoil, designed specifically to work in water on the kobold turbine. for the naca airfoils, data was taken from the literature [2, 8], while for the hlift18 airfoil the tbvor codes [9, 10, 11] were used to generate the values of the aerodynamic coefficients.

[fig. 4: airfoils tested and aerodynamic characteristics (re = 10^6)]

the airfoil 2d data is corrected, in the vawt and vawt_dyn codes [12], to take into account the three-dimensional effects due to the finite aspect ratio of the blade. for this purpose, prandtl's lifting line theory, extended to treat high-lift flow, has been used to evaluate the 3d lift curve starting from the 2d data. this theory is valid only in the linear zone of the lift curve, but with care it can also be extended to non-linear conditions. the total blade drag coefficient is the sum of the airfoil drag coefficient (c_d), due to skin friction, and the induced drag coefficient. to take into account the interference between the blade and the support arms, a further drag coefficient increment, \Delta c_d, has been introduced. moreover, 2d post-stall modelling, based on the viterna-corrigan correlation method, has been introduced to extend the 2d aerodynamic coefficients to the post-stall angle-of-attack range.

4 experimental models
the reported experimental data is divided into two parts: experimental data measured in the dpa wind tunnel at the university of naples on a small straight-bladed cycloturbine, which was designed, developed and assembled at dpa [13]; and experimental data measured in water on the "kobold" prototype (real scale) [12, 14]. both the dpa straight-bladed cycloturbine (model a) and the "kobold" prototype (model b) will be described in the following. both turbines have variable-pitch blades with a self-acting system, made up of two balancing masses for each blade. in this way, the centre of gravity of the blade can be moved to its optimal position in order to optimize the global performance of the rotor, and, using two stops, the blade pitch range can be limited, as shown in fig. 5.

[fig. 5: balancing mass and blade stops]
[fig. 6: dpa straight-bladed cycloturbine (model a)]

model a in the dpa wind tunnel is shown in fig. 6. using different stop positions, it was possible to test different pitch angle ranges, and using different numbers of blades it was possible to take into account different solidity (\sigma = N_b c / R) values. model a has the following geometric parameters:

number of blades tested: 2, 3, 4, 6
blade chord: 0.15 m
blade airfoil: naca 0018
blade span: 0.8 m
aspect ratio: 5.33
radius: 1.05 m
number of radial arms: 4, 6, 8, 12
arm chord: 0.05 m
solidity 1: 0.286
solidity 2: 0.428
solidity 3: 0.571
solidity 4: 0.857
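the listed solidities follow directly from the definition \sigma = N_b c / R given in the notation; the check below reproduces them (0.428 versus 0.429 differs only in rounding).

```python
# solidity sigma = N_b * c / R for the four model a blade counts
c, R = 0.15, 1.05
for n_b in (2, 3, 4, 6):
    print(n_b, round(n_b * c / R, 3))   # 0.286, 0.429, 0.571, 0.857
```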
the "kobold" prototype (model b) lies in the strait of messina, close to the sicilian shore, facing a village called ganzirri, near the lake of the same name, as shown in fig. 7. at this site the peak current speed is 2 m/s (4 knots), the sea depth is 20 meters, and the plant has been moored 150 meters offshore. the current changes direction every 6 hours and 12 minutes, and the amplitude period is equal to 14 days. a high-lift airfoil, called h-lift18, is used for the blade sections; it was specially designed at dpa to be cavitation-free and to optimise the turbine performance. two arms sustain each blade, and the arms have been streamlined using another ad hoc designed symmetrical airfoil. the turbine has a very high starting torque, and is thus able to start spontaneously, even with the electrical load connected, without the need for any starting devices.

[fig. 7: the strait of messina, the position of the plant and a picture of the plant]

the enermar plant is composed of the turbine rotor hanging under a floating buoy that contains the remaining mechanical and electrical parts needed to deliver energy to the grid, as shown in fig. 7. the rotor has a diameter of 6 meters, with 6 radial arms holding three blades of five-meter span and 0.4 m chord employing the h-lift18 airfoil, leading to an aspect ratio of 12.5 and to a solidity \sigma = 0.4. the data acquisition system is made up of a torque meter, a tidal current speed meter and an rpm counter, all connected to a plc that converts the analog signals to digital data and transfers them to a pc. data-handling software has been developed to monitor the acquired data in real time. fig. 8 shows some components of the data acquisition system. the plc also acts as an electrical load controller to keep the turbine always working at its maximum efficiency, independently of the current speed.

[fig. 8: the torque-meter, the plc and the pc]

5 experimental tests
fig. 9 and fig. 10 compare the vawt code numerical results with the experimental data measured on model a for different blade numbers and different pitch ranges. the pitch angle is measured between the local tangent to the blade circular path and the blade trailing edge, starting from the tangent in the clockwise direction.

[fig. 9: experimental data and vawt results (model a, pitch range 0°-10°, 2 and 3 blades)]
[fig. 10: experimental data and vawt results (model a, pitch range 0°-10°, 4 and 6 blades)]
[fig. 11: experimental data and vawt results (model b)]
[fig. 12: experimental data and vawt_dyn results (model b)]
the power measured is the net rotor power, and the tests were carried out at 9 m/s air speed in the dpa wind tunnel for model a. a comparison of the numerical results with the experimental data shows good agreement, especially for 3 and 4 blades; in the case of 6 blades, the solidity is very high and there is a strong wake effect on each blade. fig. 11 compares the experimental data measured during field tests in water on the kobold turbine prototype (model b) with the vawt code numerical prediction. the net rotor power coefficient is measured, and the tests were carried out with a presumed current speed of 1.4 m/s; however, there is an uncertainty of around 25 % on the real "undisturbed" current speed value, which is strongly influenced by the location of the current speed meter. this is at present being investigated. fig. 12 and fig. 13 show the experimental data and the vawt_dyn code numerical results for the starting condition of the kobold turbine prototype (model b). the rotor angular velocity variation in time predicted by the code seems to be very accurate, while the rotor torque and power amplitudes are in good agreement only in the first part of the time range; this is probably due to the uncertain value of the numerically predicted losses. the frequency of the torque and power is, however, very well predicted.

[fig. 13: experimental data and vawt_dyn results (model b)]

6 conclusions
the double multiple streamtube model seems to be good at predicting vertical-axis turbine performances, with fixed or floating blades, especially for solidity values less than 0.5. this model has been implemented in the vawt and vawt_dyn codes, which are capable of predicting both the static and the dynamic performances of the turbine with very low computing time. in these codes, blade 3d effects have been included as well as arm losses, while it is difficult to predict the losses due to other effects. in the field tests, the accuracy of the tidal current speed measurement is a problem, because it is difficult to set the current speed meter in the "real" undisturbed flow, so the measurements have an uncertainty level of around 20 %. on the kobold turbine, the hlift18 non-symmetrical, non-cavitating airfoil gives a better performance than a symmetrical airfoil, and the self-acting variable pitch system with balancing masses has proven to be simple to build and more reliable than other, more complex systems.

references
[1] strickland, j. h.: "a review of aerodynamic analysis methods for vertical-axis wind turbine." in: fifth asme wind energy symposium, sed, vol. 2, edited by a. h. p. swift.
[2] paraschivoiu, i.: wind turbine design with emphasis on darrieus concept. ecole polytechnique de montreal, polytechnic international press, 2002.
[3] kentfield, j. a. c.: "cycloturbines with freely hinged blades or freely hinged leading edge slats." in: alternative energy sources v. part c: indirect solar/geothermal (editor: t. n. veziroglu). amsterdam: elsevier science publishers b. v., 1983, p. 71-86.
[4] lazauskas, l.: "three pitch control systems for vertical axis wind turbines compared." wind engineering, vol. 16 (1992), no. 5, p. 269-281.
[5] kirke, b. k., lazauskas, l.: "experimental verification of a mathematical model for predicting the performance of a self-acting variable pitch vertical axis wind turbine." wind engineering, vol. 17 (1993), no. 2, p. 58-66.
[6] healy, j. v.: "the influence of blade camber on the output of vertical axis wind turbine." wind engineering, vol. 2 (1978), no. 3, p. 146-155.
[7] healy, j. v.: "the influence of blade thickness on the output of vertical axis wind turbine." wind engineering, vol. 2 (1978), no. 1, p. 1-9.
[8] reuss, r. l. et al.: effects of surface roughness and vortex generators on the naca 4415 airfoil. report nrel/tp-442-6472. golden, colorado: national renewable energy laboratory, december 1995.
[9] coiro, d. p., de nicola, c.: "prediction of aerodynamic performance of airfoils in low reynolds number flows." in: low reynolds number aerodynamics conference, notre dame, indiana, u.s.a., june 1989.
[10] coiro, d. p., de nicola, c.: "low reynolds number flows: the role of the transition." in: x congresso nazionale aidaa, pisa, ottobre 1989.
[11] coiro, d. p.: "convergence acceleration procedure for a viscous/inviscid coupling approach for airfoil performances prediction." in: xi aidaa national congress, forli', italy, october 1991.
[12] montella, f., melone, s.: "analisi sperimentale e numerica del comportamento statico e dinamico di una cicloturbina ad asse verticale." aerospace engineering bachelor thesis. napoli, italy: dipartimento di progettazione aeronautica, december 2003.
[13] coiro, d. p., nicolosi, f.: "numerical and experimental tests for the kobold turbine." in: sinergy symposium, hangzhou, republic of china, november 1998.
[14] coiro, d. p. et al.: "exploitation of marine tidal currents: design, installation and experimental results for the patented kobold vertical axis hydro turbine." in: poster session, owemes 2003, napoli, italy, 10-12 april 2003.

prof. d. p. coiro, phone: +39 081 7683322, fax: +39 081 624609, e-mail: coiro@unina.it
dr. a. de marco, e-mail: agodemar@unina.it
dr. f. nicolosi, e-mail: fabrnico@unina.it
dr. s. melone, e-mail: stmelone@unina.it
dr. f. montella, e-mail: francmont@genie.it
dipartimento di progettazione aeronautica (dpa), university of naples "federico ii", via claudio 21, 80125 naples, italy

acta polytechnica 60(4):279-287, 2020, doi:10.14311/ap.2020.60.0279

complexity reduction of cyclostationary sensing technique using improved hybrid sensing method
hikmat n. abdullah (a, corresponding author: hikmat_04@yahoo.com), zinah o. dawood (b), ammar e. abdelkareem (a), hadeel s. abed (a)
a: al-nahrain university, college of information engineering, department of information and communication engineering, 10072 jadiria, baghdad, iraq
b: university of baghdad, al-khwarizmi engineering college, department of information and communication engineering, 10070 jadriah, baghdad, iraq

abstract. in a cognitive radio system, spectrum sensing faces a major challenge: it needs a sensing method that has a high detection capability with a reduced complexity. in this paper, a low-cost hybrid spectrum sensing method with an optimized detection performance, based on energy and cyclostationary detectors, is proposed. the method is designed such that at high signal-to-noise ratio (snr) values the energy detector is used alone to perform the detection.
at low snr values, a cyclostationary detector with reduced complexity may be employed to support an accurate detection. the complexity reduction is done in two ways: by reducing the number of sensing samples used in the autocorrelation process in the time domain, and by using the sliding discrete fourier transform (sdft) instead of the fast fourier transform (fft). to evaluate the performance, two versions of the proposed hybrid method are implemented, one with the fft and the other with the sdft. the proposed method is simulated for cooperative and non-cooperative scenarios and investigated under a multipath fading channel. the obtained results are evaluated by comparing them with other methods, including cyclostationary feature detection (cfd), the energy detector and the traditional hybrid method. the simulation results show that the proposed method with the fft and with the sdft successfully reduces the complexity by 20 % and 40 % respectively, when 60 sensing samples are used, with an acceptable degradation in the detection performance. for instance, when e_b/n_0 is 0 db, the probability of detection p_d is decreased by 20 % and 10 % by the proposed method with the fft and with the sdft respectively, as compared with the hybrid method existing in the literature.

keywords: cognitive radio (cr), cyclostationary detector, fast fourier transform (fft), sliding discrete fourier transform (sdft).

1. introduction
due to the expansion of wireless devices and applications, the available electromagnetic radio spectrum is becoming crowded. it has been recognized that, due to the static allotment strategy of the spectrum, the allocated range is under-utilized; "spectrum holes" or "white spaces" result from the unutilized part of the spectrum. because of the limitation of the available spectrum and the inefficiency in its usage, a new communication paradigm is required to exploit the existing wireless spectrum opportunistically. thus, cognitive radio (cr) has been recognized as the key enabling technology to overcome the spectrum under-utilization, in order to supply an extremely reliable communication for all secondary users of the network [1]. a cognitive radio is a wireless technology that can automatically detect the available spectrum and change its parameters accordingly. in this framework, cognitive users, called secondary users (sus), can recognize the spectrum holes and use them to communicate among themselves without causing an interference to the licensed users, called primary users (pus) [2]. spectrum sensing is the main step of cognitive radio; this process is done by checking a spectrum band and finding those channels not used by the pu (licensed) users, which could then be utilized by the sus [3]. energy detection (ed), matched filter detection and cyclostationary detection are the basic types of spectrum sensing techniques. ed-based sensing is the most broadly used, due to its simplicity and low computational complexity; however, at a low snr, the ed has no capability to separate the pu signal from the noise. the matched filter maximizes the received snr, so it could be viewed as an optimal detector, but it has the demerit that it needs information about the pu signal characteristics, i.e., the type of modulation, the packet format and the pulse shaping. if the cr does not have enough information about the pu signal, the performance of the matched filter is degraded. in such a situation, the cyclostationary detector can be used as a suboptimal detector.
the cyclostationary detector can differentiate the pu signal from the noise, since it exploits the periodicity property of the signal in its process. it can work in a low snr condition [4], but it has a high computational complexity, since it requires a long sensing time [5]. in order to improve the detection performance under fading and shadowing environments, cooperative spectrum sensing (css) is utilized. in css, many sus sense the spectrum and each su sends its local decision about the pu activity to the fusion centre (fc) during the transmission stage. in this stage, various types of coding algorithms can be used to guarantee the successful arrival of the decisions. finally, when these local decisions arrive at the fc, a global decision about the status of the pu is taken [4, 6, 7]. many works have addressed the complexity of cyclostationary detection. in [5, 8], the complexity of the cyclostationary detection is reduced based on the selection of optimal parameters. in [9], the computational complexity of cyclostationary detection is addressed by reducing the test statistics shared among multiple antennas. in [10], the computational complexity is improved by a cyclic autocorrelation function (caf) at only one cyclic frequency. in [11], a method is proposed for improving the computational complexity by splitting the autocorrelation into its real and imaginary parts and calculating two modified cafs, before combining them in a final test statistic. although the methods mentioned above reduce the computational complexity, they suffer from a high degradation in the detection performance, especially at low snr values. furthermore, none of them addresses the complexity both during the autocorrelation process and during the conversion to the frequency domain, except [11], which addresses the autocorrelation process only. in this paper, we propose two methods to reduce the complexity of cyclostationary detectors, based on the fft and the sdft, with an acceptable detection performance. the proposed methods include a hybrid detection with a complexity reduction in the autocorrelation process. the hybrid detection combines both the energy detector and the cyclostationary detector. the procedure of the proposed method is based on checking the received samples of the pu signal using the energy detection first. if it can detect the pu properly, there is no need to use a cyclostationary detector; but if the ed has a false detection, then the cyclostationary detector is used to assist the detection. to save the complexity of the cyclostationary detector, only half of the samples of the pu signal are used for the autocorrelation process. to compensate for the minor reduction in the detection performance due to the neglected samples, the sdft is used for the conversion into the frequency domain, since it has a better accuracy than the fft. furthermore, the sdft has a reduced complexity compared to the fft, as will be discussed later. the contributions of this paper are as follows: proposing a hybrid sensing method that can adapt its complexity according to the satisfactory detection level, without the need for an snr estimator circuitry; and reducing the complexity of the cyclostationary method by manipulating the number of sensing samples and using an efficient frequency domain conversion interchangeably.
the equations of the computational complexity of the proposed method are derived, and the simulation results, obtained using matlab in awgn and multipath fading channels in both cooperative and non-cooperative scenarios, are used to evaluate the proposed methods through a comparison with the hybrid method in ref. [4], the traditional cyclostationary detector, and the traditional energy detector.

2. energy detector technique
due to its simplicity and low complexity, energy detection is considered one of the most broadly utilized spectrum sensing techniques. it does not require information about the structure of the pu signals; however, to correctly perform the detection, information about the noise variance is needed, and at a low snr the ed has no ability to separate the pu signal from the noise. the su checks the range allocated to the pu and, when it detects the absence of a pu transmission, it starts the data transmission to its receiver. the received samples at the cu receiver are [4, 12]:

x(n) = h \, \theta \, s(n) + n_o(n)   (1)

where x(n) is the received complex signal of the cognitive radio as a function of the sample time, s(n) is the transmitted signal of the primary user, n_o is the additive white gaussian noise (awgn), h is the complex gain of the ideal channel, and \theta is the activity indicator, which can take one of two values as given in equation (2):

\theta = \begin{cases} 0 & \text{for the } H_0 \text{ hypothesis} \\ 1 & \text{for the } H_1 \text{ hypothesis} \end{cases}   (2)

hypothesis H_1 refers to an active pu, while hypothesis H_0 refers to an inactive pu. by comparing the detector decision metric with a pre-set threshold \lambda, the false alarm and detection probabilities are evaluated. the decision metric E_j is defined as the average accumulated energy, for the j-th su, of the samples captured during the monitoring window w:

E_j = \frac{1}{N_s} \sum_{n=1}^{N_s} |x(n)|^2   (3)

where N_s is the number of samples, N_s = w f_s, and f_s is the sampling frequency. the probability of a false alarm P_f and the probability of a detection P_d are given by equations (4) and (5), respectively:

P_f = \Pr(E_j > \lambda \mid H_0)   (4)
P_d = \Pr(E_j > \lambda \mid H_1)   (5)

mathematically, the common equation for setting the threshold, assuming a constant P_f, is given by equation (6) [13]:

\lambda = \frac{\left( Q^{-1}(P_f) + \sqrt{N_s} \right)^2}{\sqrt{N_s}\,(N_s)^2}   (6)

where Q^{-1} is the inverse of the complementary error function Q(\cdot).
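a minimal sketch of the energy detection stage described by eqs. (1)-(5) follows; the gaussian-approximation threshold used here is a standard closed form assumed for illustration and is not necessarily identical to the paper's eq. (6).

```python
import numpy as np
from scipy.stats import norm

def energy_detect(x, noise_var, pf=0.001):
    # decision metric of eq. (3), normalized by the known noise variance
    ns = len(x)
    ej = np.mean(np.abs(x) ** 2) / noise_var
    # threshold fixing the false-alarm rate pf under a gaussian
    # approximation of the metric under H0 (an assumed closed form)
    lam = 1.0 + norm.isf(pf) * np.sqrt(2.0 / ns)
    return ej > lam   # True: pu declared present

rng = np.random.default_rng(0)
noise = rng.normal(size=4000) + 1j * rng.normal(size=4000)   # H0 samples
print(energy_detect(noise, noise_var=2.0))   # False in most noise draws
```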
3. cyclostationary technique
cyclostationary characteristic identification is a method of discovering primary user transmissions by taking advantage of the cyclostationary features of the received signals [14]. earlier research efforts exploited the cyclostationary merit of a signal as a method for classification, which has been found to be better than matched filtering and energy detection. distinguishable features are the number of signals, their modulation type, the presence of an interferer and the symbol rate [14]. the analysis of the sensing execution is done by the correlation process: the correlation is computed by multiplying the received signal x(n) by its delayed copy. to decide whether the pu is present or absent, the sum of correlations is compared with a predetermined sensing threshold: the pu is assumed to be present if the sum of correlations is greater than the predetermined threshold, otherwise it is absent [4]. because it can distinguish between the signal power and the noise power, this detector performs better than energy detectors. however, it is very complex, in that it requires a long time for processing, which generally degrades the performance of the cognitive radio. a signal is said to be cyclostationary if its autocorrelation is a periodic function with some period; detection based on this property is called second-order cyclostationary detection [14]. a discrete cyclic autocorrelation function of a discrete-time signal x(n) with a fixed lag l is defined as [15]:

R_{xx}^{\alpha}(l) = \lim_{N_s \to \infty} \frac{1}{N_s} \sum_{m=0}^{N_s - 1} x[m]\, x^{*}[m + l]\, e^{-j2\pi\alpha m \Delta m}   (7)

where N_s is the number of samples of the signal x[m] and \Delta m is the sampling interval. by applying the discrete fourier transform to R_{xx}^{\alpha}(l), the cyclic spectrum (cs) is given as:

S_{xx}^{\alpha}(f) = \sum_{l=-\infty}^{\infty} R_{xx}^{\alpha}(l)\, e^{-j2\pi f l \Delta l}   (8)

the detection of the presence or absence of the signal is performed by scanning the cyclic frequencies of its cyclic spectrum or of its cyclic autocorrelation function. the decision is very simple: at a given cyclic frequency, if the cyclic spectrum or the cyclic autocorrelation function (caf) is below the threshold level \lambda as given in equation (6), the signal is absent, otherwise the signal is present [14, 15].

4. hybrid sensing method [4]
to improve the detection probability of the cru, a hybrid detector is suggested in [4]. it consists of energy and cyclostationary detectors. the energy detector is moderate in complexity compared to the cyclostationary detector, so to verify whether the pu is present or not, the received pu signal is first processed by the ed. if the ed is not sure about the presence of the pu, then a cyclostationary detector of the first or the second order is applied [4].
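the two-stage logic shared by [4] and the methods proposed below can be sketched as follows, together with the cyclic-autocorrelation test of eq. (7). the thresholds, the fixed lag, the candidate cyclic frequencies and the halved sample stream used in the fallback stage (anticipating section 5) are illustrative assumptions.

```python
import numpy as np

def caf_peak(x, lag, alphas):
    # cyclic autocorrelation of eq. (7) at a fixed lag, scanned over a
    # set of candidate (normalized) cyclic frequencies alpha
    n = len(x) - lag
    m = np.arange(n)
    prod = x[:n] * np.conj(x[lag:lag + n])
    return max(abs(np.sum(prod * np.exp(-2j * np.pi * a * m)) / n)
               for a in alphas)

def hybrid_detect(ed_metric, ed_threshold, x, caf_threshold,
                  lag=4, alphas=(0.05, 0.1, 0.2)):
    # stage 1: the cheap energy detector decides on its own if it fires
    if ed_metric > ed_threshold:
        return True
    # stage 2: cyclostationary test on every other sample only,
    # i.e. the 50 % sample reduction of the proposed methods
    return caf_peak(x[::2], lag, alphas) > caf_threshold
```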
5. the proposed methods
the proposed sensing methods use hybrid sensing, which consists of an energy detector and an improved cyclostationary detector. the main goal of the proposed methods is to reduce the computational complexity of the cyclostationary technique while maintaining a good probability of detection p_d. with this structure, since the energy detector has a satisfactory p_d at high e_b/n_0 values, it alone is used there, saving the complexity required by the cyclostationary detector. at low e_b/n_0 values, an improved version of the cyclostationary detector is used, with a reduced computational complexity. two methods are proposed to reduce the computational complexity of the cyclostationary spectrum sensing, as shown in the following two sections.

5.1. hybrid proposed method using fft
the procedure of the proposed method is as follows. first, the qpsk signal, which is the pu signal, is passed through one type of channel, awgn or rayleigh multipath fading, and then sensed using the energy detector. the accumulated energy of the pu signal, E_j, is computed by the energy detector and passed to a comparator. if E_j is greater than the threshold \lambda, the local sensing declares the pu present without using the cyclostationary detector. otherwise, the pu signal is sensed using the cyclostationary detector. in this case, an autocorrelation function is first applied to the samples of the pu signal, reducing the number of samples to half by selecting one sample and skipping the next, alternately. hence only half the samples enter the autocorrelation process, which leads to a large reduction in the computational complexity of the cyclostationary technique. the result is then converted to the frequency domain using the traditional fft to compute the cyclic spectrum (cs), which is compared with the pre-defined threshold \lambda. if it is greater than the threshold, the local sensing result is that the pu is present; otherwise the pu is absent. the procedure of the proposed method is shown as a flow chart in figure 1.

[figure 1: procedure of the proposed methods]

5.2. hybrid proposed method using sdft
this method follows the same procedure as the first proposed method, but instead of using the fft for the conversion to the frequency domain, the sliding dft (sdft) is used. the sdft can be greatly helpful in reducing the computational complexity [16]: its computational complexity is O(N_s) for a single update, which is much lower than the O(N_s \log_2 N_s) of the fft [16]. generally, given a discrete-time signal x, at any time index n, the k-th frequency bin of its M-point dft is [17]

X_n^k = \sum_{m=0}^{M-1} W_M^{-km}\, x_{n-M+1+m}, \qquad \forall k \in \{0, 1, \dots, M-1\}   (9)

where W_M = e^{j2\pi/M}. equation (9) can be transformed into its recursive equivalent

X_n^k = W_M^{k} \left( X_{n-1}^k + x_n - x_{n-M} \right)   (10)

the procedure of this proposed method is shown in fig. 1.
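eqs. (9)-(10) can be exercised with a few lines; the single-bin loop below is an illustrative sketch, with the buffer initialisation and the test tone chosen arbitrarily.

```python
import numpy as np

def sliding_dft(x, m, k):
    # recursive update of eq. (10): O(1) work per new sample and bin,
    # hence O(Ns) to track one bin over the whole record
    w = np.exp(2j * np.pi * k / m)        # W_M**k for bin k
    xk = 0.0 + 0.0j
    buf = np.zeros(m, dtype=complex)      # holds the last m samples
    out = np.empty(len(x), dtype=complex)
    for i, xn in enumerate(x):
        xk = w * (xk + xn - buf[i % m])   # drop x_{n-M}, add x_n
        buf[i % m] = xn
        out[i] = xk
    return out

x = np.exp(2j * np.pi * (8 / 64) * np.arange(64))   # tone on bin 8
print(abs(sliding_dft(x, m=64, k=8)[-1]))           # ~64.0, full window
```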
the probability of detection of the proposed hybrid method, P_{d,hybrid}, is given by [4]:

P_{d,hybrid} = 1 - (1 - P_{d,ED})(1 - P_{d,cyclo})   (11)

where P_{d,hybrid} is the probability of detection of the proposed method, P_{d,ED} is the probability of detection in the energy detector stage, and P_{d,cyclo} is the probability of detection in the cyclostationary stage.

6. the computational complexity
the computational complexity of the cyclostationary technique, for the traditional and the proposed methods, is computed as follows:

C_x = C_{x,auto} + C_{x,freq}   (12)

where C_x is the total computational complexity of the cyclostationary technique, C_{x,auto} is the computational complexity of the autocorrelation process, and C_{x,freq} is the computational complexity of the conversion to the frequency domain. for the cyclostationary (traditional) method, the computational complexity C_{x1} is computed as follows: according to [18], C_{x,auto} is the number of real multiplications plus the number of real additions, C_{x,auto} = 4N_s + 4N_s - 2 = 8N_s - 2, and according to [16], C_{x,freq} for the traditional fft is O(N_s \log_2 N_s), so (12) becomes

C_{x1} = 8N_s - 2 + O(N_s \log_2 N_s)   (13)

in the two hybrid proposed methods, with the fft and with the sdft, the computational complexities C_{x2} and C_{x3} are computed as follows. since half the number of samples, N_s/2, enters the autocorrelation process, C_{x,auto} becomes 2N_s + 2N_s - 2 = 4N_s - 2 in both C_{x2} and C_{x3}. in the hybrid proposed method using the fft, C_{x,freq} = O(N_s \log_2 N_s), so

C_{x2} = 4N_s - 2 + O(N_s \log_2 N_s)   (14)

and in the hybrid proposed method using the sdft, C_{x,freq} = O(N_s), so

C_{x3} = 4N_s - 2 + O(N_s)   (15)

the computational complexity ratio of each method is computed as C_{x1,ratio} = C_{x1}/C_{max}, C_{x2,ratio} = C_{x2}/C_{max} and C_{x3,ratio} = C_{x3}/C_{max}, where C_{max} is the maximum computational complexity of the traditional method. the computational complexity of the cyclostationary stage in the hybrid method in [4], C_{x4}, is the same as in the traditional method, C_{x1}, since that method does not include any complexity reduction in the cyclostationary stage. it should be noted that all of the computational complexity equations cover only the cyclostationary stage of the hybrid methods and do not take into consideration the computational complexity of the energy detector stage. table 1 summarizes the computational complexity equations of the traditional and proposed methods.

table 1. computational complexity.
method | computational complexity equation
cyclostationary (traditional method) | C_x1 = 4N_s + 4N_s - 2 + O(N_s log2(N_s))
hybrid proposed method using fft | C_x2 = 2N_s + 2N_s - 2 + O(N_s log2(N_s))
hybrid proposed method using sdft | C_x3 = 2N_s + 2N_s - 2 + O(N_s)
hybrid method in [4] | C_x4 = 4N_s + 4N_s - 2 + O(N_s log2(N_s))

7. simulation results and discussion
this section presents the simulation results of the performance of the proposed methods under cooperative and non-cooperative scenarios. the performance is tested under awgn and rayleigh multipath fading channels. we evaluate the performance of the proposed methods by comparing them with the traditional hybrid method of reference [4], cyclostationary feature detection (the traditional method) and the energy detector method. the procedure followed to produce the results is:
(1.) creating a qpsk-modulated signal as the pu signal.
(2.) adding a channel: either an awgn channel or the itu indoor channel model (a), a rayleigh multipath fading channel.
(3.) applying one of the sensing methods shown in the previous sections.
(4.) there are two scenarios for the sensing process: in the non-cooperative scenario, a single sensing su is used to check the activity of the pu signal; in the cooperative scenario, multiple sus are used to sense.
(5.) testing the sensing performance by computing the probability of detection in equation (11) under various values of e_b/n_0 of the pu signal. in the case of the cooperative scenario, the average probability of detection over all sus is computed.
(6.) computing the computational complexity of each method from the equations shown in table 1, under various numbers of samples of the pu signal.
(7.) plotting the obtained results.
the simulation parameters used are: qpsk modulation of the pu signal with a carrier frequency f_c = 200 hz, a sampling frequency f_s = 4000 hz, and p_f = 0.001. the multipath fading used is the "itu indoor channel model (a)" [19]. the simulation results are divided into two parts: the first presents the results for the non-cooperative scenario, the second for the cooperative scenario.

7.1. non-cooperative scenario
figure 2 shows the performance curves of the number of samples versus the computational complexity ratio of the cyclostationary stage, for the hybrid proposed methods compared with the traditional method. it can be seen that the computational complexity increases as the number of samples increases in all methods, and that the hybrid method using the fft uses less computational power than the traditional method and the hybrid method in [4], since only 50 % of the samples enter the autocorrelation process. it can also be noted that the proposed hybrid method using the sdft uses fewer computations than the fft-based and traditional methods, since the sdft requires O(N_s) rather than the O(N_s log2 N_s) required by the fft, in addition to the 50 % reduction of the sensing samples.
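the complexity ratios plotted in figure 2 can be approximated from table 1; reading the O(·) terms literally as operation counts is an assumption, so the resulting numbers are only indicative.

```python
import numpy as np

def cx(ns):
    # table 1 expressions, with O(ns log2 ns) and O(ns) taken literally
    fft = ns * np.log2(ns)
    return {"traditional / hybrid [4]": 8 * ns - 2 + fft,
            "proposed, fft":            4 * ns - 2 + fft,
            "proposed, sdft":           4 * ns - 2 + ns}

counts = cx(60)
c_max = counts["traditional / hybrid [4]"]
for name, c in counts.items():
    print(f"{name}: ratio {c / c_max:.2f}")
```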
it can be noted that the computational complexity of the hybrid method in [4] is the same as that of the traditional method, since it includes no improvement for reducing the complexity of the cyclostationary process. for example, when the number of samples equals 60, the computational complexity is reduced by 20 % and 40 % in the proposed methods using the fft and the sdft respectively, compared with the traditional methods. figures 3 and 4 show the performance curves of the proposed methods in terms of the probability of detection p_d versus e_b/n_0 in awgn and rayleigh multipath fading channels respectively, compared with the hybrid method in [4], cfd (the traditional method) and the energy detector method. in both figures, p_d increases as e_b/n_0 increases. in figure 3, the traditional method and the hybrid method in [4] give the best results, especially at low values of e_b/n_0; however, they have a high computational complexity, as shown in figure 2, because they require a long processing time. the hybrid method in [4] has a high complexity at low values of e_b/n_0, since it first utilizes the energy detector for sensing and then uses the cyclostationary detector with the full number of samples. the proposed methods, in contrast, reduce the computational complexity and still give a good detection performance. the price we pay for the complexity reduction is a slight loss in p_d as compared with the traditional method. it can also be seen that although the proposed method using the sdft has a greater reduction in the computational complexity than the proposed method that uses the fft, it also gives a good detection performance, especially at low values of e_b/n_0, since the sdft has a higher accuracy than the fft; the performance of the other methods becomes the same as that of the sdft-based method at e_b/n_0 equal to 6 db. as shown in the figure, when e_b/n_0 equals 0 db, p_d is decreased by 20 % and 10 % in the proposed methods using the fft and the sdft respectively, as compared with the hybrid method in [4], and increased by 30 % and 40 % respectively, as compared with the energy detector. at higher values of e_b/n_0, the p_d of the energy detector becomes almost the same as for the other methods. in figure 4, the performance curves behave as in figure 3, but with a degradation in the detection performance because of the multipath fading. the energy detector has the worst results at low values of e_b/n_0, and the proposed methods again give acceptable values of p_d, especially the method that uses the sdft. for example, at e_b/n_0 equal to 3 db, p_d decreases by 28 % and 5 % in the proposed methods using the fft and the sdft respectively, as compared to the traditional and hybrid methods in [4]; also, the p_d of the proposed methods increases by 50 % and 75 % respectively, as compared to the energy detector at the same value of e_b/n_0. the e_b/n_0 at which the methods reach p_d = 1 can be compared for the fading channel in figure 4, which is more realistic than awgn: the e_b/n_0 required by the energy detector (traditional method), the cyclostationary (traditional) method, the hybrid method in reference [4], the proposed method using the fft, and the proposed method using the sdft to reach p_d = 1 is 9.1 db, 5.1 db, 6 db, 6 db, and 6 db respectively. we can further improve the detection performance of all methods under the fading channel by using the cooperative scenario, as shown in the following section.
[figure 2: the performance curves of the number of samples versus the computational complexity ratio]
[figure 3: the performance curves of the traditional and hybrid methods in the awgn channel]
[figure 4: the performance curves of the traditional and hybrid methods in the rayleigh multipath fading channel]
[figure 5: the performance curves of the proposed methods, p_d versus e_b/n_0, in the cooperative scenario compared with the hybrid method in [4]]

7.2. cooperative scenario
in the cooperative scenario, the effect of fading is reduced. in this scenario, we assumed that 3 cus are used to sense the spectrum and that one of them suffers from multipath fading. figure 5 shows the performance curves of the proposed methods in terms of p_d versus e_b/n_0, compared with the hybrid method in [4]. in figure 5, the detection performance of the proposed method using the fft is approximately the same as for the hybrid method in [4]; for example, at e_b/n_0 equal to 0 db, p_d is reduced by 5 % in the proposed method using the fft as compared to the hybrid method [4]. it can be seen that the proposed method using the sdft outperforms the other methods: for example, when e_b/n_0 equals 0 db, p_d in the proposed method using the sdft is increased by 15 % and 20 % compared to the hybrid method in [4] and to the proposed method using the fft, respectively. we conclude that the proposed method operating in the cooperative scenario performs better than in the non-cooperative one, and that it is very efficient, since it significantly reduces the computational complexity.
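step (5) of the simulation procedure averages the local detection probabilities over the sensing sus; a minimal sketch, with made-up local values for the three cus of this scenario (one degraded by fading), is shown below.

```python
def cooperative_pd(local_pds):
    # cooperative performance figure: the average probability of
    # detection over all sensing sus (step (5) of the procedure)
    return sum(local_pds) / len(local_pds)

# three cus, the fading one with a visibly lower local pd (invented values)
print(cooperative_pd([0.95, 0.93, 0.70]))   # -> 0.86
```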
yadav, s. dhar roy, s. kundu. hybrid cooperative spectrum sensing with cyclostationary detection for cognitive radio networks. in 2016 ieee annual india conference (indicon), pp. 1–6. 2016. doi:10.1109/indicon.2016.7839118.
[5] h. arezumand, p. azmi, h. sadeghi. a low-complexity cyclostationary-based detection method for cooperative spectrum sensing in cognitive radio networks. international journal of information & communication technology research 3(3):1–10, 2011. doi:10.1109/iccke.2011.6413370.
[6] k. vasudevan, a. p. k. reddy, g. k. pathak, s. singh. on the probability of erasure for mimo-ofdm. semiconductor science and information devices 2(1), 2020. doi:10.30564/ssid.v2i1.1689.
[7] k. vasudevan. coherent detection of turbo coded ofdm signals transmitted through frequency selective rayleigh fading channels. in 2013 ieee international conference on signal processing, computing and control, ispcc 2013, pp. 1–6. 2013. doi:10.1109/ispcc.2013.6663392.
[8] d. shen, d. he, et al. an improved cyclostationary feature detection based on the selection of optimal parameter in cognitive radios. journal of shanghai jiaotong university (science) 17(1):1–7, 2012. doi:10.1007/s12204-012-1222-z.
[9] d. cho, s. narieda, k. umebayashi. low computational complexity spectrum sensing based on cyclostationarity for multiple receive antennas. ieice communications express 7(2):54–59, 2018. doi:10.1587/comex.2017xbl0167.
[10] s. narieda. computational complexity reduction for signal cyclostationarity detection based spectrum sensing. in proc. ieee int'l symp. on circuits and systems (ieee iscas 2017), pp. 1–4. 2017. doi:10.1109/iscas.2017.8050564.
[11] d. allan, l. crockett, r. stewart. a low complexity cyclostationary detector for ofdm signals. in 2017 new generation of cas (ngcas), pp. 253–256. 2017. doi:10.1109/ngcas.2017.19.
[12] m. emara, h. ali, s. khamis, f. abd el-samie. spectrum sensing optimization and performance enhancement of cognitive radio networks. wireless personal communications 86, 2015. doi:10.1007/s11277-015-2962-5.
[13] s. atapattu, c. tellambura, h. jiang. energy detection for spectrum sensing in cognitive radio. springer-verlag, new york, 2014. doi:10.1007/978-1-4939-0494-5.
[14] w. adigwe, o. r. okonkwo. a review of cyclostationary feature detection based spectrum sensing technique in cognitive radio networks. e3 journal of scientific research 4(3):41–47, 2016.
[15] k. po, j. takada. signal detection based on cyclic spectrum estimation for cognitive radio in ieee 802.22 wran system. technical report of ieice, the institute of electronics, information and communication engineers, 2007.
[16] p. kruczek, j. obuchowski. cyclic modulation spectrum – an online algorithm. in ieee 24th mediterranean conference on control and automation (med), pp. 361–365. 2016. doi:10.1109/med.2016.7535994.
[17] b. han, h. schotten. a cost efficient and flexible cyclostationary feature detector based on sliding discrete fourier transform for cognitive spectrum sensing. in ieee 24th international conference on telecommunications (ict). 2017. doi:10.1109/ict.2017.7998263.
[18] s. narieda.
low complexity cyclic autocorrelation function computation for spectrum sensing. ieice communications express 6(6):387–392, 2017. doi:10.1587/comex.2016xbl0211.
[19] m. pätzold. mobile fading channels. john wiley & sons, ltd, 2000. doi:10.1002/0470847808.

multi-agent contracting and reconfiguration in competitive environments using acquaintance models
j. bíba, j. vokřínek
abstract
cooperation of agents in competitive environments is more complicated than in collaborative environments. both replanning and reconfiguration play a crucial role in cooperation, and introduce a means for implementing system flexibility. the concepts of commitments, decommitments with penalties and subcontracting may facilitate effective reconfiguration and replanning. agents in competitive environments are fully autonomous and self-interested. therefore the setting of penalties and the profit computation cannot be provided centrally. both the costs and the gain differ from agent to agent with respect to the contracts already agreed and the resources load. this paper proposes an acquaintance model for contracting in competitive environments and introduces possibilities of reconfiguration in competitive environments as a means of decommitment optimization with respect to the resources load and profit maximization. the presented algorithm for contract price setting does not use any centralized knowledge and provides results corresponding to a realistic environment. a simple customer-provider scenario proves this algorithm in competitive contracting.
keywords: multi-agent systems, competitive environments, reconfiguration, decommitment, contract price, penalty.

1 introduction
a concept of multi-agent systems (mas) is a widely used paradigm for the modelling, planning and control of various processes. generally, it uses distributed negotiation techniques for achieving particular goals. besides standard centralized planning and optimization mechanisms, mas supports local replanning with the minimal necessary changes to the entire plan. there are several mas implementations for production planning, e.g. [16], and for cooperation across supply chains [18, 20, 13]. modern business speeds up research in the domain of virtual organizations [10] that transform supply chains into dynamic cooperative networks. (cooperation with the other partners in a virtual organization allows the enterprise to react to incoming business opportunities that could not be covered by the enterprise alone.) cooperation in such an environment is based on distributed negotiation among individual partners that leads to the satisfaction of individual or common goals. in the case of internal cooperation (within an enterprise), the goal is to maximize the overall profit of the whole enterprise. changing the scope to external cooperation (across a supply chain), the behaviour of the parties involved is more self-interested, as their goal is to maximize their own profits. standard negotiation protocols and techniques used in mas do not follow this course, so new negotiation principles for such an environment have to be investigated.

2 competitive and collaborative environments
let us introduce a difference between collaborative and competitive multi-agent environments [1]. by a collaborative multi-agent environment we understand an agent community where the agents usually share a common goal that they try to achieve cooperatively. in other cases the agents may have different goals, but their primary motivation is to maximize their social welfare – the total sum of all the individual utilities (profits) of the collaborative agents. conversely, by a competitive multi-agent environment we understand an agent community where the primary motivation of the agents is to maximize their individual utilities, no matter what the social welfare of the community is (such agents are called self-interested). the agents establish cooperation in the process of achieving a common goal only if it maximizes their individual utilities.
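to make the distinction concrete, the following python sketch (our illustration, not code from the paper) contrasts the two decision rules on a candidate joint action: a collaborative agent accepts it when it raises the summed utility of the community, while a self-interested agent accepts it only when its own utility rises.

```python
def collaborative_accepts(utility_deltas):
    """accept a joint action if it increases social welfare,
    i.e. the sum of all agents' utility changes."""
    return sum(utility_deltas.values()) > 0

def competitive_accepts(utility_deltas, me):
    """accept a joint action only if this agent's own utility rises,
    regardless of what happens to the others."""
    return utility_deltas[me] > 0

# a deal that benefits the community but costs agent "b"
deltas = {"a": +5.0, "b": -2.0, "c": +1.0}
assert collaborative_accepts(deltas)          # welfare rises by +4
assert not competitive_accepts(deltas, "b")   # "b" would veto it
```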
as mentioned above, cooperation across a supply chain differs substantially from cooperation within an enterprise. we distinguish collaborative and competitive environments and inspect different aspects of cooperation in these cases. in our work we focus on reconfiguration and replanning as the crucial elements in successful cooperation among agents. this paper attempts to analyze ways of adjusting contracts in a real-world setting. proper algorithms for contracting need to be developed, as they underlie cooperation in a competitive environment.

3 commitments and decommitments
the concept of cooperative problem solving by means of social commitments was introduced by wooldridge and jennings in 1999. dropping a social commitment (decommitment) was either rational and beneficial for all the participants, or did not occur at all. although the authors do not restrict the commitment description to collaborative environments, their agents turn out to be social-welfare maximizers rather than competitors. however, in a competitive environment an agent tends to drop its commitments if this maximizes its individual utility, no matter how it may consequently harm the others. if we want self-interested agents either to fulfil their commitments or to provide compensation for the harm to others in the case of decommitment (i.e. if we want agents to act responsibly), the agents have to commit themselves in this sense as well. for contracting in collaborative environments there is usually no need for any explicit metric of the individual utility or the social welfare gained. (for example, the number of goals successfully achieved suffices as an evaluation of the utility gained; algorithms for contracting in collaborative environments often guarantee maximization of social welfare.) by contrast, in competitive environments an explicit expression of utility is desirable. it facilitates the implementation of rewards and penalties as utilities that the agents gain or lose.
the concept of such an explicit utility evaluation is then a part of the commitments – an agent providing a service (the contractee) commits not only to perform the appropriate actions (in order to gain the promised utility – this is the agent's motivation), but also to provide compensation if it fails (e.g. compensation of the profit lost by the other party). simultaneously, the other party (the contractor) commits not only to pay for the services provided by the first party, but also to provide compensation if it decommits from the contract (the first party suffers a loss of profit and is paid, e.g., the opportunity cost).

3.1 levelled commitment contracts
the most complete approach to commitments in a competitive environment has been presented by sandholm and lesser [19] as levelled commitments. levelled commitments include an explicit utility evaluation in the form of a contract price and penalties, and they facilitate decommitments that were not acceptable under the commonly used full commitments. a full commitment is defined by a contract obligation as a tuple (φ, ρ), where φ is a description of what each of the two parties (the contractor and the contractee) has to perform (handling tasks, contributing goods, lending resources, etc.) and ρ is the contract price that the contractor has to pay to the contractee. neither of the agents may drop the commitment under any circumstances until it is brought to a good end. in contrast, levelled commitments are defined as a tuple (φ, ρ, a, b), where the extending parameters a and b introduce the penalties to be paid when decommitments occur. levelled commitments are based on non-cooperative game theory. a negotiation process consists of two parts: (i) the contracting game, in which the agents agree on a contract, and (ii) the decommitting game, in which they decide whether or not to decommit. various events may occur (resources failing or becoming available, outside offers, etc.) that change the value of the contract independently for either of the two agents, so that keeping the commitment need not be desirable for one or for both of them. both the decommitment decision and the setting of the contract (ρ, a and b) are based on knowledge of the ex ante probability density functions (p.d.f.) of receiving the best outside offers. the p.d.f. are assumed to be common knowledge between the contractor and the contractee. levelled commitments have several limiting assumptions that facilitate the equilibrium calculations of the contract settings, but make the use of levelled commitments more difficult in domains where such assumptions may be neither possible nor even desirable (e.g. logistics, production planning, etc.). the most significant assumptions are: (i) an agent does not want to be involved in more than one contract at a time, (ii) all the contracts available have the same description φ (the only concern is the contract price) and (iii) the p.d.f. of receiving the best outside offers are common knowledge between the agents. the most limiting assumption is (iii) [8]. moreover, the concept of levelled commitments does not state explicitly whether the contract price under consideration represents only the profit or whether it also includes the costs of performing φ. it rather seems that ρ represents the total price of the contract, set only on the basis of the p.d.f. (there is no distinction between the costs and the expected profit, which are both comprised in the real-world contract price).
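as a concrete illustration of the decommitting game, the sketch below (our own myopic simplification, not the equilibrium calculation of [19]) shows the decommitment test for the two parties of a levelled commitment contract (φ, ρ, a, b): each side walks away only when the best outside opportunity, net of its own decommitment penalty, beats the agreed deal.

```python
def contractee_should_decommit(rho, b, best_outside_offer):
    """myopic rule for the contractee of a levelled commitment contract
    (phi, rho, a, b): drop the contract when the outside offer minus the
    decommitment penalty b exceeds the agreed contract price rho."""
    return best_outside_offer - b > rho

def contractor_should_decommit(rho, a, best_outside_price):
    """symmetric rule for the contractor: decommit (and pay penalty a)
    when a cheaper outside provider saves more than the penalty costs."""
    return rho - best_outside_price > a

# the contractee agreed on rho = 100 with penalty b = 30;
# an outside offer of 120 is not worth breaking the contract for,
# an offer of 140 is.
assert not contractee_should_decommit(100, 30, 120)
assert contractee_should_decommit(100, 30, 140)
```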
fixed costs are seemingly also not taken into account, as the price of a null deal (i.e. when the agents do not agree on a contract) is assumed to be equal to the average best outside offer. thus, the agent does not lose anything; it only does not get what it might have got if the best offer had come.

3.2 contract setting
an extension of levelled commitment contracts has been introduced by excelente-toledo et al. in [8]. they provide both an algorithm for calculating the contract setting and an algorithm for considering a decommitment. unfortunately, the assumptions considered (e.g. omitting the fixed costs) need not always be acceptable. thus, algorithms for contract setting in competitive environments need to be developed. the price of a contract in the real world covers at least the following three items: (i) variable costs, which depend on the contract size, feasibility issues, etc. (i.e. specific conditions related to a particular contract), (ii) fixed costs, which are not related to a particular contract but to the overall business and need to be covered (e.g. the rent for an office, payments for energy, employees' wages, etc.), and (iii) the intended profit from the contract (e.g. a profit for the enterprise owner). a penalty in the real world seeks to cover at least a portion of the fixed costs and also the profit lost. while the calculation of the variable and fixed costs, or of the profit lost, is rather pragmatic, the setting of the intended profit is rather strategic or even speculative, and depends on many considerations (e.g. experience with the second party, various social relations, the profit eagerness of the first party or 'good manners'). overall, the setting of a contract price and a penalty may predetermine the acceptability of such a bid for the customer, i.e. the fruitfulness of the contract. let us propose an algorithm for setting a contract price. the scenario is as follows: there are two actors – a customer and a service provider. the customer proposes contracts of different sizes and calls for bids. the service provider calculates the bids and proposes the prices to the customer. let the coefficient of variable costs with respect to the contract size be common for all agents, and let the customer's private preference be to accept bids with a margin up to, e.g., 10 % of the variable costs. (this is inspired by real-world contracting, where a customer is willing to accept a price only up to a certain limit.) the service provider does not try to do more than cover all the costs (variable and fixed). it calculates the variable costs and projects the actual fixed costs to be covered into the margin. (the fixed costs are constant per time unit, but are generally accumulated or reduced depending on whether the provider's business was successful in the past.)
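the pricing step just described can be sketched as follows (an illustrative python fragment of ours, not the authors' implementation; the 10 % acceptance bound and all cost figures are the arbitrary scenario constants named above). the margin cap passed in as limit_margin stands for one of the three limitation strategies described in the list that follows.

```python
def make_bid(contract_size, var_cost_coeff, fixed_costs_due, limit_margin):
    """price a contract as variable costs plus a margin that tries to
    cover the provider's currently uncovered fixed costs; the margin is
    capped by one of the limitation strategies (see the list below)."""
    variable = var_cost_coeff * contract_size
    margin = limit_margin(desired=fixed_costs_due, variable=variable)
    return variable + margin, margin

def customer_accepts(price, variable, margin_bound=0.10):
    """the customer's private rule: accept bids whose margin is at most
    a given fraction of the variable costs."""
    return price - variable <= margin_bound * variable

# e.g. the simple limitation: cap the margin at 50 % of variable costs
simple_limit = lambda desired, variable: min(desired, 0.5 * variable)

price, margin = make_bid(contract_size=40.0, var_cost_coeff=2.0,
                         fixed_costs_due=25.0, limit_margin=simple_limit)
print(customer_accepts(price, variable=80.0))  # false: 25 > 10 % of 80
```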
let the margin finally be limited in three different ways: (i) simple limitation, (ii) learned safe limitation and (iii) learned speculative limitation:
- simple limitation – the margin is not limited until it reaches a certain bound with respect to the variable costs – let us say 50 % of the variable costs;
- learned safe limitation – the margin is limited either to the average value of the previously accepted margins or to a half of the minimum previously rejected margin;
- learned speculative limitation – the margin is limited to midway between the maximal accepted (lower boundary) and the minimal rejected (upper boundary) margin from the past; until both boundaries are obtained, the margins are either increased by 50 % or decreased by 20 %.
obviously, the first approach does not guarantee the provider a profit in the long term. the other two approaches do, because the provider learns from the past (provided the total turnover can cover the fixed costs). the last limitation then promises the maximum possible profit (the experiments are described in section 4).

3.3 negotiation and acquaintance model for competitive contracting
the process of contracting and contract maintenance consists of two fundamental processes: (i) negotiation and (ii) deliberation. negotiation is a process of exchanging messages among the agents in accordance with interaction protocols agreed in advance. deliberation is a complex process of knowledge maintenance, evaluation of different decision alternatives, and choice of the appropriate one (decision making) with respect to the knowledge of the actual situation in the surrounding environment, the decision strategies and the individual goals. many interaction protocols implementing various negotiation methodologies have been developed. a well-known and very popular methodology is the contract-net protocol, which implements a cnp auction that has been standardized by fipa [9]. however, this interaction protocol (and most of the widely used related protocols, some of them also implemented by fipa) was developed rather for collaborative environments [5]. the most interesting multistage negotiation protocol for flexible negotiation in competitive environments was presented by bergenti et al. in [2]. this protocol allows the agents (the initiator and the responder) to propose and counter-propose until they reach an agreement, or until one of them decides to withdraw from the negotiation. if an agreement is reached, either of the parties may still decommit, whereupon the contract becomes void. if the contract is successfully completed, the responder informs the initiator about it. one of the means used for decision making is acquaintance models. an acquaintance model is a computational model of the agents' mutual awareness, stored in the interaction wrapper of each of the agents. the acquaintance model is a collection of the agent's social knowledge [14] available from previous interactions or provided by independent monitoring mechanisms. there are various implementations of acquaintance models, e.g. the tri-base (3ba) acquaintance model [17], the twin-based model [4] or the acquaintance model in archon [21]. the contracting algorithm proposed in section 3.2 uses quite a simple acquaintance model that only enables tracking the history of contracts for each customer the provider interacts with, and the decision making process is rather myopic. moreover, in a competitive environment more considerations need to be employed, because it is necessary to reflect, e.g.,
contract profitability, partners' reputations, actual contract commitments, availability of resources, long-term strategies, etc. therefore, we propose an acquaintance model consisting of three modules:
- profitability module – maintains both the agent's own economy model and the economy models of the partners; the module determines the profitability of incoming contract offers from various points of view – e.g. their contribution to covering the fixed costs, the attainable profit, the increase/decrease of the agent's own reputation, a strategic measure taking into account long-term cooperation with the particular partner, etc.
- reputation module – keeps track of former interactions with the partners. these are used for building reputation models of all partners as well as an estimate of the agent's own reputation with the other partners; the module determines the safety/risk of concluding a contract (for both incoming and outgoing opportunities) with a particular partner, and advises on decisions or adjustments concerning the contract – e.g. whether or not to conclude the contract, and under which conditions to conclude it with respect to possible delays (setting a reserve in the service delivery time), service quality or total failure (setting penalties), payment delays (setting a reserve in the contract price as opportunity cost insurance), etc.
- commitments-resources module – maintains both a record of the actual contracts (i.e. both the commitments of the agent to the partners and the commitments of the partners to the agent) and the agent's own resource availability schedule bound to the contracts; the module supports the scheduling of incoming contracts, provides information about resource availability and facilitates the computation of the feasibility of a contract, the simulation of overbooking of resources with a risk evaluation (i.e. the level of penalties for delays or decommitments in the case of unavailability of resources), the simulation of resource relaxation based on outsourcing, etc.
as the contracting process and contract maintenance in a competitive environment are based on the agent's local knowledge only, the acquaintance model is built iteratively, based on the experience of the agent during interactions with the partners (other agents). thus, model building is an inseparable part of the decision making processes. the acquired knowledge is then used by an inference machine based on a rule system that either provides advice to a human operator (manager) concerning the incoming and ongoing contracts or carries out decisions by means of pre-defined rules (i.e. contract acceptance or rejection, decommitments or outsourcing, etc.). administrative activities, like updating the company erp/crm system or payments to partners, are then carried out automatically.

3.4 reconfiguration in competitive environments
although there are some domain-dependent application implementations of reconfiguration in multi-agent systems (mas), e.g. [6, 11, 3], the first deep study of reconfiguration and its formalization was published only recently by dunin-keplicz in [7]. the multi-agent environment was assumed to be collaborative. thus, if a failure occurred, all the agents involved in achieving their common goal did their best to establish a recovery in order to complete their task successfully (a reconfiguration occurred).
this behaviour is due to their persistent collective intention to achieve their common goal, in accordance with the definition of a joint persistent goal [12]. however, in competitive environments the collective intention does not need to be kept unconditionally by all the agents – any agent may decommit from the actual contract (e.g. on account of a more profitable third-party contract offer). decommitment as a concept was not taken into account in the above-mentioned research, while according to [22] it may become a means for optimizing the agents' individual profits. while the implementation of reconfiguration in collaborative environments was facilitated by the agents' primary motivation (i.e. the maximization of their social welfare), in competitive environments it is even more difficult. in a collaborative environment, decommitment or replanning is driven by a common goal; it is obvious that both partners to the 'contract' have the same motivation to keep it or change it. in a competitive environment, the agents need to be motivated not only to agree on a contract and to keep their commitments, but also to perform a reconfiguration if necessary. a self-interested agent will be reluctant to take on further obligations if they are not rewarded, or if they even entail a loss of profit. one of the available motivations is to use reconfiguration as an alternative to decommitment, if keeping the current contract does not contribute to the maximization of the agent's individual utility (e.g. if a more profitable offer has appeared). for example, the agent may find a subcontract that can be reimbursed from the reward promised by its contractor, and may decide to take advantage of both the current contract and the new contract without decommitting. of course, the idea of decommitment may also result from unanticipated events, e.g. a lack of resources, an unexpected delay that may prevent the contract deadline from being met, etc. thus, reconfiguration may prove useful for optimizing the resource load and also for maximizing the profit by optimizing any possible or necessary decommitments. in a collaborative environment, reconfiguration is usually driven by the individual tasks (it is invoked top-down by decomposition) rather than by the intention of the contracted agents. table 1 shows an overview of the various environment properties. let us provide an example of a reconfiguration and its potential. customer 1 grants a contract contract 1 to a group of agents – a coalition [15] – coordinated by provider leader. while the contract is being executed, a coalition member provider traitor receives a proposal for a more profitable contract contract 2 and decides to participate in it. however, its resources are not sufficient for both contracts, and therefore provider traitor considers a decommitment from contract 1. it may still save the decommitment penalty if it finds a subcontractee provider subcontractee that takes on its obligations. if provider traitor succeeds, and the subcontracting is less costly than fulfilling contract 1 itself, it may benefit from both contracts. the maximal rational cost of the subcontracting is the sum of the profit from fulfilling contract 1 and the decommitment penalty for contract 1. if provider traitor does not succeed in subcontracting, it decommits from contract 1. in this case, although provider leader collects the decommitment penalty from provider traitor, it tends to find subcontracting on its own in order to avoid paying a penalty to customer 1.
if this succeeds, the coalition leader may keep the penalty as a reward, and perhaps also the difference in the prices, provided the subcontracting is cheaper than provider traitor's services. if it does not succeed, it pays the collected penalty to customer 1. there may be other reconfiguration scenarios – e.g. provider leader may receive a more profitable bid from customer 2, or it may find a provider substitutor that may be cheaper than one of the current coalition members. however, reconfiguring and taking advantage of such opportunities is even more difficult than in the above-mentioned example. in any case, reconfiguration strongly depends on the contract setting, and vice versa. this needs to be reflected in the contracting algorithm. extending the basic algorithm proposed in section 3.2 in this sense requires taking into account, e.g., long-term business strategies, reputation, etc., as the basic version is still guesswork rather than a deliberative calculation. the implementation of a more sophisticated contracting algorithm using the proposed acquaintance model, and thus capable of reconfiguration and also of handling, e.g., a customer's adaptive behaviour, is currently under research.

table 1: overview of the various environment properties
collaborative environments – maximized criteria: social welfare; commitments: full; decommitments: common-goal driven; reconfiguration: contract based.
competitive environments – maximized criteria: individual utility; commitments: full, levelled; decommitments: individual-utility driven; reconfiguration: contracted-agent based.

4 experiments
the experiments provided show the properties of our algorithm (in three variants), as proposed in section 3.2. our model implementation defined two types of agents – the customer and the provider. the sizes of the contracts proposed by the customers were arbitrary within a bounded interval, and varied in number from none to several per day. the customer's margin boundary was set to 12 %. the variable-costs coefficient and the provider's daily fixed costs were set to fixed values. for each algorithm variant, 50 runs of the simulation were performed and the results were averaged. (due to both the randomness of the contract sizes and the absence of a contracting history, the mean square deviation on the first day was ca. 0.24; however, it tended to decrease to zero quickly during the simulation.) let us describe the results in detail. fig. 1 introduces a comparison of the margin setting for all three algorithm variants with respect to the customer's margin boundary. the first variant of the algorithm implements the simple limitation of margins. as the agent does not take past experience into account, and its only concern is to cover its current fixed costs, the success ratio (the ratio of the number of successful contracts to the number of all contracts) decreases together with the growth of the debts reflected in the margin set (see fig. 2). although a contract sometimes arrives that is big enough for the fixed costs to be dissolved in the overall contract price and the bid is accepted (the customer's margin boundary is not exceeded, and thus the fixed costs are covered), this happens only seldom, and mostly the agent has to cover its expenses from its reserve. this is the most obvious drawback of this algorithm variant – once the agent gets into debt, there is little probability that it will get a chance to pay it back.
the second variant implements the learned safe limitation of margins. in the first run, the agent sets the margin so that it covers its current fixed costs. if the bid is rejected, the margin is reduced. this is done until a bid is accepted. then the agent sets the margins to correspond to the average accepted margin value. the margin set in a particular contract does not need to cover all the daily fixed costs. however, if several contract proposals arrive in one day, together they can cover the fixed costs, or even build a reserve for the future (the provider benefits from the turnover). such a strategy is safe, as it guarantees the acceptance of all bids (the margins are set to a certain value below the customers' margin boundaries), and the provider may even be better off over a period of time. obviously, the provider does not get as much as it might have got if it had chosen a less safe strategy. moreover, a situation may occur in which the benefit from the turnover does not suffice to cover the agent's fixed costs and the agent may end up in debt. on the other hand, the process of running into debt may be slower than in the first algorithm variant. the third variant implements the learned speculative limitation of margins. at the beginning of the simulation the provider speculates and tries to learn each customer's margin boundary. thus, it is less successful at the beginning. on the other hand, once it approaches the best possible and most reasonable margin, it begins to benefit from the turnover. the learned margin guarantees the acceptance of all bids and also the maximum pay-off from the particular customer. it may also run into debt if the customers' margins are too low and the maximum benefit from the turnover does not cover the provider's fixed costs. on the other hand, the provider cannot defend itself against this situation, and it is obvious that it will run into debt at the minimum possible speed. fig. 2 and fig. 3 introduce a comparison among all three algorithm variants. while in the first variant the provider runs into debt, in the second and the third variant it begins to be better off. although the average success ratio of the third variant grows more slowly at the beginning than the success ratio of the second variant (due to the speculation at the beginning), the account balance shows that the profit gained was worth incurring a loss at the beginning. let it be noted that the success ratio was computed with respect to the whole simulation duration, i.e. the third-variant success ratio would converge to the second-variant success ratio, which converges to 100 % in an infinite time horizon. (this assumption is based on the provider's knowledge of the customer's (stationary) strategy. the margins are then set in such a way that the bids are no longer rejected.)

fig. 1: comparison of all the proposed limitations: average margins and the customer's margin boundary
fig. 2: comparison of all the proposed limitations: average success ratios
fig. 3: comparison of all the proposed limitations: account ratios

5 conclusion
this paper focuses on cooperation in competitive environments and on the means of competitive contracting. the role of reconfiguration in decommitment optimization is introduced, and an algorithm for contracting is proposed.
the motivation of our research is to explore the possibilities of reconfiguration in competitive environments and to use it as a means of decommitment optimization with respect to the resource load and profit maximization. the development of algorithms for contracting (i.e. the setting of the contract price and the decommitment penalties) is crucial for establishing and processing cooperation in a competitive environment. contract setting techniques that assume both a global view of the economy and the possible business opportunities to be common knowledge among the business parties are not applicable, as such information is unavailable in a fully competitive environment. the proposed approach supports individual contract setting for each agent independently (with respect to its current state, resource load and profit), supports full agent autonomy, and corresponds to real environments. the simple customer-provider scenario used in the experiments proves the presented algorithms in competitive contracting. these basic algorithms use stationary models of the customers' business strategies, and therefore they would not be able to handle adaptive behaviours of customers. contract-setting algorithms capable of handling adaptive customer behaviours would require more complex non-stationary models of their business strategies, taking into account both long-term and short-term considerations, and also the use of approximations of the whole market situation – e.g. by building sophisticated acquaintance models of the business community. however, the proposed approach to contracting sets up a solid base for future research on decommitments and reconfiguration.

6 acknowledgments
the research described in the paper has been supervised by doc. dr. ing. michal pěchouček, m.sc., fee ctu in prague, and has been supported by the eu integrated projects – european collaborative networked organizations leadership (ecolead – contract no. 506958) and collaborative process automation support intelligent dynamic agents in sme clusters (panda – contract no. 027169). the research has also been supported by the ministry of education, youth and sports of the czech republic (grant no. msm 840770013).

references
[1] andersson, m. r., sandholm, t. w.: leveled commitment contracts with myopic and strategic agents. in: proceedings of the fifteenth national conference on artificial intelligence, aaai-98, 26–30 july 1998, madison, wi, usa, p. 38–45, menlo park, ca, usa, 1998. aaai press/mit press.
[2] bergenti, f., poggi, a., somacher, m.: a contract decommitment protocol for automated negotiation in time variant environments. in: dagli oggetti agli agenti: tendenze evolutive dei sistemi software (a. omicini, m. viroli, eds.), woa 2001, 4–5 september 2001, modena (italy): pitagora editrice bologna, 2001, p. 56–61.
[3] brennan, r. w., fletcher, m., norrie, d. h.: reconfiguring real-time holonic manufacturing systems. in: proc. of 12th international workshop on database and expert systems applications (a. m. tjoa, r. r. wagner, eds.), 3–7 sept. 2001, munich (germany), los alamitos (ca, usa), ieee comput. soc., p. 611–615.
[4] cao, w., bian, c.-g., hartvigsen, g.: achieving efficient cooperation in a multi-agent system: the twin-base modeling. in: cooperative information agents, number 1202 in lnai (p. kandzia, m. klusch, eds.), springer-verlag, heidelberg, 1997, p. 210–221.
[5] collins, j., youngdahl, b., jamison, s., mobasher, b., gini, m.: a market architecture for multi-agent contracting.
in: proceedings of the second international conference on autonomous agents, 9–13 may 1998, minneapolis (mn, usa), new york (ny, usa), acm, 1998, p. 285–292.
[6] coudert, t., berruet, p., philippe, j.-l.: integration of reconfiguration in transitic systems: an agent-based approach. in: smc '03 conference proceedings, 2003 ieee international conference on systems, man and cybernetics, 5–8 oct. 2003, washington (dc, usa), piscataway (nj, usa), ieee, 2003, vol. 4, p. 4008–4014.
[7] dunin-keplicz, b., verbrugge, r.: evolution of collective commitment during teamwork. fundamenta informaticae, vol. 56, august 2003, no. 4, p. 563–592.
[8] excelente-toledo, c. b., bourne, r. a., jennings, n. r.: reasoning about commitments and penalties for coordination between autonomous agents. in: proceedings of the fifth international conference on autonomous agents, 28 may–1 june 2001, montreal (que., canada), new york (usa), acm, 2001, p. 131–138.
[9] fipa. foundation for intelligent physical agents [online], 3/2006. http://www.fipa.org.
[10] hagel, j. iii, armstrong, a. g.: net gain – expanding markets through virtual communities. harvard business school press, boston (ma, usa), 1997.
[11] inohira, e., konno, a., uchiyama, m.: layered multi-agent architecture with dynamic reconfigurability. in: ieee icra 2003 conference proceedings, 14–19 sept. 2003, taipei (taiwan), piscataway (nj, usa), ieee, 2003, vol. 3, p. 4060–4065.
[12] levesque, h. j., cohen, p. r., nunes, j. h. t.: on acting together. in: aaai-90 proceedings, eighth national conference on artificial intelligence, 29 july–3 august 1990, boston (ma, usa), cambridge (ma, usa), mit press, 1990, vol. 2, p. 94–99.
[13] mařík, v., pěchouček, m., vokřínek, j., říha, a.: application of agent technologies in extended enterprise production planning. in: eurasia-ict 2002: information and communication technology, berlin (germany), springer, 2002, p. 998–1007.
[14] mařík, v., pěchouček, m., štěpánková, o.: social knowledge in multi-agent systems. in: multi-agent systems and applications (m. luck, v. mařík, o. štěpánková, eds.), lnai, springer-verlag, heidelberg, 2001.
[15] pěchouček, m., mařík, v., bárta, j.: a knowledge-based approach to coalition formation. ieee intelligent systems, vol. 17 (2002), no. 3, p. 17–25.
[16] pěchouček, m., vokřínek, j., bečvář, p.: explantech: multi-agent support for manufacturing decision making. ieee intelligent systems, vol. 20 (2005), no. 9, p. 67–74.
[17] mařík, v., pěchouček, m., štěpánková, o.: role of acquaintance models in agent-based production planning systems. in: cooperative information agents iv (m. klusch, l. kerschberg, eds.), lnai no. 1860, heidelberg, july 2000, springer verlag, 2000, p. 179–190.
[18] sadeh, n. m., hildum, d., kjenstad, d., tseng, a.: mascot: an agent-based architecture for dynamic supply chain creation and coordination. production planning and control, vol. 12 (2001), no. 3.
[19] sandholm, t. w., lesser, v. r.: leveled commitment contracts and strategic breach. games and economic behavior, vol. 35, april–may 2003, no. 1–2, p. 212–270.
[20] swaminathan, j. m., smith, s. f., sadeh, n. m.: modeling supply chain dynamics: a multiagent approach. decision sciences, vol. 29 (1998), no. 3, p. 607–632.
[21] wittig, t.: archon: an architecture for multi-agent systems. ellis horwood, chichester, 1992.
[22] 't hoen, p. j., la poutre, j.
a.: a decommitment strategy in a competitive multi-agent transportation setting. in: proceedings of the second international joint conference on autonomous agents and multiagent systems, aamas 03, melbourne (victoria, australia), 2003, p. 1010–1011.

ing. jiří bíba, e-mail: biba@labe.felk.cvut.cz
ing. jiří vokřínek, e-mail: vokrinek@labe.felk.cvut.cz
department of cybernetics, gerstner laboratory for intelligent decision making, czech technical university in prague, faculty of electrical engineering, technická 2, 166 27 praha 6, czech republic

primary response assessment method for concept design of monotonous thin-walled structures
v. zanic, p. prebeg
abstract
a concept design methodology for monotonous, tapered thin-walled structures (wing/fuselage/ship/bridge) is presented, including modules for: model generation; loads; primary (longitudinal) and secondary (transverse) strength calculations; structural feasibility (buckling/fatigue/ultimate strength criteria); design optimization modules based on es/ga/ffe; and graphics. a method for the primary strength calculation is presented in detail. it provides the dominant response field for design feasibility assessment. bending and torsion of the structure are modelled with the accuracy required for concept design. a '2.5d-fem' model is developed by coupling a 1d-fem model along the 'monotonity' axis and 2d-fem model(s) transverse to it. the shear flow and stiffness characteristics of the cross-section for bending and pure/restrained torsion are given, based upon the warping field of the cross-section. examples: an aircraft wing and a ship hull.
keywords: thin-walled structures, shear flow, fem, concept design.

1 introduction
the concept design methodology for monotonous, tapered, thin-walled structures (wing/fuselage/ship/bridge) is presented. the problem solution is based on the octopus program [1, 2]. it contains: (a) response and feasibility analysis modules (fin-crest), (b) decision making-synthesis modules (demak) and (c) interaction/visualization programs (maestro mm/mg and deview) that iterate in the design cycle. the modules are summarized in table 1 as modules 1a–8c.
(a) the analytical (crest) modules and methods are fully described in [2]. module-1a indat is used for data generation, combined with the maestro fem modeler [8]. module-2 load is used for design load generation. module-1b mind is used for determining the minimal scantlings based on prescribed rules (lr, dnv, abs, etc.). module-3a ltor is used for the direct calculation of the primary strength (shear flow and corrected stresses in bending and warping torsion); these are calculated using an original extended beam theory [5, 6]. module-3b tokv is used for the transverse strength calculation (newly developed 8-node stiffened panel macro-elements are used for modeling the transverse structural response). module-4 is the panel library of structural serviceability and ultimate strength criteria for the structural adequacy calculation [4], using the response fields generated in modules 3a and 3b. module-8a is a vb-shell for the designer-model interaction. module-8b maestro graphic [8] is used for presenting the model (loading, response, adequacy, etc.).
(b) synthesis (demak) modules and methods are documented in [1, 3]. local variables for the substructures (s = 1, …, n_s) are denoted $\mathbf{x}^s = \{x_i\}^s = \{t_{\mathrm{plating}}, n_{\mathrm{stiffeners}}, h_{\mathrm{web}}, …\}^s$. the substructure areas $\mathbf{x} = \{x_s\}$ are intermediate (global) variables, where $x_s = x_s(\mathbf{x}^s)$. project k is defined as $p^k = \{\mathbf{x}^1, …, \mathbf{x}^{n_s}, \mathbf{x}_{\mathrm{fixed}}\}^k$. the design criteria (attributes, objectives, constraints) are formulated as a library of mathematical functions/procedures for driving the optimization process or the feasibility check. octopus metamodeling of the failure surfaces is based on the most unsatisfied constraint from each local problem; these are added to the set of global constraints. the value function for the global level is a multicriterion combination of normalized attribute functions. the solution strategy involves the generation of designs using (a) a random number generator in the first cycles of design space exploration, and (b) fractional factorial experiments for the subsequent cycles. coordination is performed by modifying v(x_s) with respect to its divergence from the globally optimal substructure area x_s. special provisions:
- generation of promising designs using the 27 designs obtained from the orthogonal array l27 (see the sketch below);
- extensive usage of tables of optimized profiles to speed up the generation process.
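the design-generation step in the first provision above can be sketched as follows (an illustration of ours, not octopus code): for exactly three design variables at three levels each, the 27 runs of the l27 orthogonal array coincide with the full 3^3 factorial, so the candidate set can be enumerated directly; uniform random sampling for the first exploration cycles is shown alongside. all names and bounds are hypothetical.

```python
import itertools
import random

def factorial_designs(levels_per_var):
    """enumerate all level combinations; for 3 variables at 3 levels
    each this yields the 27 runs used with the l27 orthogonal array."""
    return list(itertools.product(*[range(n) for n in levels_per_var]))

def random_designs(bounds, count, rng=None):
    """uniform random sampling of the design space, as used in the
    first cycles of design space exploration."""
    rng = rng or random.Random(1)
    return [tuple(rng.uniform(lo, hi) for lo, hi in bounds)
            for _ in range(count)]

designs27 = factorial_designs([3, 3, 3])
assert len(designs27) == 27
# hypothetical bounds: plating thickness, stiffener count, web height
explore = random_designs([(4.0, 12.0), (3.0, 9.0), (100.0, 400.0)],
                         count=50)
```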
modules 6 and 7c (gaz) are used for calculating the sensitivity of the structural response with respect to the design variables, based on the global strength module fin/ltor [7]. module-7a glo is used for the global-level modm optimization (level 1) of the cross-section. module-7b loc is used for local coordinated madm decision making via the sequential application of stochastic search methods and the theory of experiments.
(c) module-8c deview is used for the designer-model interaction, with a graphic presentation of designs in the design and attribute spaces. the stratified distances from the target (or the ideal) design, calculated by an lp metric, are used as a means of visualizing the multidimensional space of the design attributes and/or free variables. visualization is the most powerful tool for the designer's understanding of the decision support problem. it generates expert knowledge about the problem for all participants involved, and helps the designer to identify advantageous combinations of variables, feasible options and clusters of non-dominated designs (the pareto frontier), thus enabling realistic decision support to the head and structural designer.

2 modeling philosophy for primary response in concept design
classical fe modeling, giving good insight into stresses and deformations, is not capable of giving the efficient and fast answers regarding the feasibility criteria (buckling, fatigue, yield) required by the rules. however, structural feasibility and compliance with the rule requirements are of primary interest, not stresses or deformations. most of the local failure criteria, e.g. the various buckling failure modes of stiffened panels, require specified force and displacement boundary conditions. these are available only if logical structural parts, such as complete stiffened panels between girders and frames, are modeled (macro-elements). for the concept design structural evaluation of the primary response (longitudinal strength, torsional strength), the beam idealization of a wing/ship/bridge is often used. a primary strength calculation provides the dominant response field (demand) for design feasibility assessment. the evaluation is based on an extended beam theory, which needs cross-sectional characteristics.
these are obtained using analytical methods, which can be very complicated for real combinations of open and closed cross-sections. the application of energy-based numerical methods offers an alternative approach to the given problems. the method is based on decomposing the cross-section into line finite elements between nodes i and j with coordinates (y_i, z_i), (y_j, z_j); element thickness t^e; material characteristics (young's modulus e / shear modulus g); and material efficiency rn and rs (due to cutouts, lightening holes, etc.) with respect to the normal/shear stresses.

table 1: summary of octopus modules
(1a) structural model – maestro files generated by program mm and used in octopus (s/r indat)
(2) load model – rule loads + designer-given loads, generated automatically by octopus s/r load
(1b) minimal dimensions – minimal dimensions by octopus s/r mind
(3a) response calculation, primary strength (displacement u; stresses σ_x, τ) – extended beam theory (cross-section warping fields in bending and torsion, normal stresses and the respective shear flows), program ltor
(3b) response calculation, transverse strength (displacements v, w, θ_x; stresses σ_y) – fem calculation using beam elements with or without rigid ends and stiffened panel macro-elements, program tokv
(4) feasibility calculation – $g_i = (c_i - d_i)/(c_i + d_i)$ (normalized safety factor) – calculation of macro-element feasibility using the library of safety criteria in program panel (c – capability; d – demand from 3a and 3b)
(5) reliability calculation (not used) – form approach to panel reliability; upper ditlevsen bound as design attribute
(6) decision support problem definition (interactive) – constraints: user-given minimal dimensions, library of criteria (see 4); objectives: minimal weight, minimal cost, maximal safety
(7a, b, c) optimization method – decision making procedure using a) global modm optimization program glo, b) local madm optimization module loc, c) coordination module gaz
(8a, b, c) presentation of results – a) vb environment, b) program mg, c) deview graphic tool

fig. 1: transverse strip (s1–s2) with external loading p, warping fields u, and the 1d/2d fem idealization

using the fem approach, a procedure is developed for calculating the set of cross-sectional geometric and stiffness characteristics at position x, denoted g_x, with the following elements:
- cross-section area a;
- center of gravity y_cg, z_cg;
- shear/torsion center y_ct, z_ct;
- moments of inertia with respect to the center of gravity: i_y, i_z, i_yz, i_p; principal: i_1, i_2, and φ_0, the angle of axis 1 w.r.t. the z-axis;
- horizontal and vertical bending: flexural stiffness ei_z, ei_y and shear stiffness ga_v, ga_h;
- cross-section axial stiffness ea;
- torsional stiffness gi_t;
- warping stiffness ei_w.
the standard stiffness matrices (alternatively with geometrical nonlinearity [4]) for the axial (k_u), flexural (k_v, k_w) and torsional response modes are given as functions of the geometric set g_x:

$$\mathbf{k}_u = \frac{EA}{l}\begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix};\qquad
\mathbf{k}_v = \frac{EI_z}{l^3\,(1+\Phi_y)}\begin{bmatrix} 12 & 6l & -12 & 6l \\ 6l & (4+\Phi_y)\,l^2 & -6l & (2-\Phi_y)\,l^2 \\ -12 & -6l & 12 & -6l \\ 6l & (2-\Phi_y)\,l^2 & -6l & (4+\Phi_y)\,l^2 \end{bmatrix},\qquad
\Phi_y = \frac{12\,EI_z}{GA_y\,l^2};$$

k_w is obtained similarly.
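as an illustration of how these closed-form matrices are assembled in practice, the following python sketch (ours, not octopus source code) builds k_u and the shear-flexible k_v for one beam element from the cross-sectional set g_x; all names and numbers are illustrative.

```python
import numpy as np

def k_axial(EA, l):
    """2x2 axial stiffness matrix k_u = (EA/l) [[1,-1],[-1,1]]."""
    return (EA / l) * np.array([[1.0, -1.0], [-1.0, 1.0]])

def k_bending(EI, GA, l):
    """4x4 shear-flexible (timoshenko) bending stiffness matrix k_v;
    phi = 12*EI/(GA*l^2) is the shear-deformation parameter, so
    GA -> infinity recovers the classical euler-bernoulli matrix."""
    phi = 12.0 * EI / (GA * l**2)
    c = EI / (l**3 * (1.0 + phi))
    return c * np.array([
        [12.0,        6.0 * l,         -12.0,       6.0 * l],
        [6.0 * l,  (4.0 + phi) * l**2, -6.0 * l, (2.0 - phi) * l**2],
        [-12.0,      -6.0 * l,          12.0,      -6.0 * l],
        [6.0 * l,  (2.0 - phi) * l**2, -6.0 * l, (4.0 + phi) * l**2],
    ])

k_v = k_bending(EI=2.1e8, GA=5.0e7, l=2.5)
assert np.allclose(k_v, k_v.T)   # stiffness matrices are symmetric
```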
the stiffness matrices for free torsion (k_t) and restrained warping (k_φ) read:

$$\mathbf{k}_t = \frac{GI_t}{l}\begin{bmatrix} 6/5 & l/10 & -6/5 & l/10 \\ l/10 & 2l^2/15 & -l/10 & -l^2/30 \\ -6/5 & -l/10 & 6/5 & -l/10 \\ l/10 & -l^2/30 & -l/10 & 2l^2/15 \end{bmatrix},\qquad
\mathbf{k}_\varphi = \frac{EI_w}{l^3}\begin{bmatrix} 12 & 6l & -12 & 6l \\ 6l & 4l^2 & -6l & 2l^2 \\ -12 & -6l & 12 & -6l \\ 6l & 2l^2 & -6l & 4l^2 \end{bmatrix}.$$

the global stiffness matrix k_1d is obtained by combining the modal stiffnesses, corrected for the centroid and shear-centre positions relative to the origin of the global coordinate system. the local element matrix is block-diagonal, $\mathbf{k}_l^e = \mathrm{diag}\left(\mathbf{k}_u,\ \mathbf{k}_v,\ \mathbf{k}_w,\ \mathbf{k}_t + \mathbf{k}_\varphi\right)$, and the element displacement vector collects the axial, flexural and torsional degrees of freedom (u, v, w, θ_x, …). a kinematic transformation matrix $\mathbf{t}_g$, containing the offsets y_cg, z_cg of the centroid and y_cs, z_cs of the shear centre, maps the global degrees of freedom onto the local (modal) ones, $\mathbf{a}_l = \mathbf{t}_g\,\mathbf{a}$. finally, the global stiffness matrix k_1d is obtained as the sum of the element stiffness matrices $\mathbf{k}_g^e = \mathbf{t}_g^T\,\mathbf{k}_l^e\,\mathbf{t}_g$ with the appropriate node numbering. the system $\mathbf{k}_{1d}\,\mathbf{a} = \mathbf{f}$ can now be solved for the unknown displacements a, which in turn enable the determination of the element parameters a_l. from these parameters the element axial, bending and torsion parameter distributions, based on the applied shape functions, can be derived (e.g. ψ_x(x), ψ_x,x(x), ψ_x,xx(x), ψ_x,xxx(x) for torsion).

fig. 2: octopus/maestro graphic shear stress fields in a wing and in a bulk-carrier transverse strip

the key element for the calculation of the response of a complex thin-walled structure is therefore the determination of the elements of g_x. a simple and elegant fem procedure for such a calculation is presented in the sequel.

3 calculation of the response of a transverse strip with a complex cross-section
the shear flow and the geometric characteristics of the cross-section in bending and torsion are usually calculated using analytical methods. such calculations become rather complicated for multiply-connected cross-section graphs with a combination of open and closed (cell) contours. the application of numerical methods based on the energy approach offers an elegant alternative. the procedure is based on a section decomposition into finite elements, as first introduced by herman and kawai. in the sequel, the method of calculation as described in [5, 6] is presented. it has been used successfully in practical calculations since its development for [10]. the simplest decomposition of a thin-walled cross-section (symmetric or not) into line finite elements (segments) is shown in figs. 1 and 2. these elements form the stiffened panel macro-elements for the feasibility evaluation. the methodology is based on applying the principle of minimum total potential energy (π) with respect to the parameters which define the displacement fields of the structure. the primary displacement field (following classical beam theory) is defined via the displacements and rotations of the cross-section as a whole. the secondary displacement field $u_2(x,y,z)$ represents the warping (deplanation) of the cross-section. for a piecewise-linear fem idealization of the cross-section, divided into n elements, with shape functions $\mathbf{n}$ in the element coordinate system (x, s), the warping field reads:

$$u(s) = \mathbf{n}\,\mathbf{u}^e = \left[\,1-\frac{s}{l^e},\ \ \frac{s}{l^e}\,\right]\begin{Bmatrix} u_i \\ u_j \end{Bmatrix}.$$
the element strain and stress fields γ and τ are obtained from the strain-displacement and stress-strain relations:

$$\gamma_{xs}^e = \frac{\partial u}{\partial s} = \left[-\frac{1}{l^e},\ \frac{1}{l^e}\right]\begin{Bmatrix} u_i \\ u_j \end{Bmatrix} = \mathbf{b}\,\mathbf{u}^e \qquad \text{and} \qquad \tau_{xs}^e = g^e\,\gamma_{xs}^e = g^e\,\mathbf{b}\,\mathbf{u}^e.$$

the total potential energy of the Δx-long transverse strip of the beam, with the cross-section divided into n elements, reads:

$$\pi = \sum_{e=1}^{n}\left[\frac{1}{2}\int_{v^e} \gamma^T g^e\,\gamma\ dv - \int_{s^e} p(x,s)\,u(s)\ ds\right] = \sum_{e=1}^{n}\left[\frac{1}{2}\,\mathbf{u}^{eT}\mathbf{k}^e\,\mathbf{u}^e - \mathbf{u}^{eT}\mathbf{f}^e\right],$$

where p(x, s) is the external loading on the two cross-sections (s1 and s2) of the strip. minimization of π leads to the classical fem matrix relation $\mathbf{k}_{2d}\,\mathbf{u}_{2d} = \mathbf{f}_{2d}$ (shortened to $\mathbf{k}\,\mathbf{u} = \mathbf{f}$). the element stiffness matrix for the proposed linear displacement distribution along the line element (the same for bending and torsion) reads:

$$\mathbf{k}^e = \frac{g^e\,t^e\,rs^e}{l^e}\begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix},$$

where rs is the prescribed shear efficiency.

4 cross-sectional shear stress distribution due to bending
in the case of bending, the net external load (due to the bending moments m(x+Δx) and m(x)) is given by the normal stresses:

$$p(x,s) = \sigma_x(x+\Delta x,\,s) - \sigma_x(x,\,s) = \frac{q(x)\ \zeta_c(s)}{i}\ \Delta x,$$

where ζ_c(s) is the distance of the point from the neutral axis. the load vector for a nonsymmetrical cross-section in, e.g., bending about the z axis reads:

$$\mathbf{f}_z^e = \frac{q_z(x)\ e^e\,t^e\,rn^e}{e\,\left(i_y i_z - i_{yz}^2\right)}\int_0^{l^e}\mathbf{n}^T\left[\,i_z\,y_c(s) - i_{yz}\,z_c(s)\,\right]ds,$$

where $y_c(s) = y_c^i + s\sin\alpha^e$ and $z_c(s) = z_c^i + s\sin\beta^e$ vary linearly along the element (α^e, β^e defining the element inclination), so that the nodal entries evaluate to terms of the form $y_c^i\,l^e/2 + (l^e)^2\sin\alpha^e/6$ and $y_c^i\,l^e/2 + (l^e)^2\sin\alpha^e/3$. for bending about the y and z axes, the matrix relation $\mathbf{k}\,\mathbf{u} = \mathbf{f}$ with $\mathbf{u} = \bar{\mathbf{u}}\,q(x)$ can be converted into expressions for the warping due to a unit load f; for the node warping $u_i(x)$, the unit warping $\bar u_i$ must be multiplied by q(x). this enables the assessment of the shear stresses $\tau_y^e$ or $\tau_z^e$ from the expression

$$\tau^e = g^e\,\frac{u_j - u_i}{l^e}$$

in each element e between nodes i and j. if necessary, it is possible to calculate the shear stress distribution $\tau_{xs}^e(s)$ more accurately from the mean stress $\tau_{xs}^{ke}$ obtained from the fem and the analytically calculated contribution along each element, $\tau_{xs}^e(s)\big|_u = \tau_{xs}^{ke}\big|_u + \tau^{1}(s)\big|_u - \tau^{1,ke}\big|_u$, u = y or z; for a symmetrical section, e.g.,

$$\tau_{xs}^e(s)\big|_z = q_z\left\{ g^e\,\mathbf{b}\,\bar{\mathbf{u}}^e + \frac{e^e\,rn^e}{e\,i_y}\left[\frac{l^e}{6}\left(2 z_c^i + z_c^j\right) - z_c^i\,s - \left(z_c^j - z_c^i\right)\frac{s^2}{2\,l^e}\right]\right\},$$

in which the bracketed term has a zero element mean, so the element average remains the fem value. in this way the sectional characteristics and the shear center are easily obtained. the shear/torsion center position reads:

$$y_{sc} = \sum_{e=1}^{n}\int_0^{l^e}\left(\tau^e\big|_{q_z=1}\right) t^e\,d_c^e\ ds, \qquad z_{sc} = \sum_{e=1}^{n}\int_0^{l^e}\left(\tau^e\big|_{q_y=1}\right) t^e\,d_c^e\ ds,$$

where $d_c^e$ is the normal distance from the centroid to element e. the shear stiffnesses for bending about the y and z axes, ga_v and ga_h, read:

$$ga_v = \left[\sum_e \int_0^{l^e}\left(\tau_{xs}^e\big|_{q_y=1}\right)^2 \frac{t^e}{g^e\,rs^e}\ ds\right]^{-1}; \qquad ga_h = \left[\sum_e \int_0^{l^e}\left(\tau_{xs}^e\big|_{q_z=1}\right)^2 \frac{t^e}{g^e\,rs^e}\ ds\right]^{-1}.$$
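the whole section-4 procedure reduces to assembling the 1d line-element system and solving k u = f. the sketch below (our illustration, with a made-up three-element open section) assembles the element matrices $g\,t\,rs/l\ [[1,-1],[-1,1]]$, pins one node to remove the rigid-body (constant warping) mode, and recovers the constant element shear stresses from the nodal warping; all numbers are hypothetical.

```python
import numpy as np

def solve_warping(nodes, elements, loads):
    """assemble k u = f for the line elements of a thin-walled section
    and solve for the nodal warping field.

    nodes    : number of nodes
    elements : list of (i, j, g, t, rs, l) line elements
    loads    : nodal load vector f (consistent with the bending case)
    """
    k = np.zeros((nodes, nodes))
    for i, j, g, t, rs, l in elements:
        ke = (g * t * rs / l) * np.array([[1.0, -1.0], [-1.0, 1.0]])
        k[np.ix_([i, j], [i, j])] += ke
    # suppress the rigid-body (constant warping) mode by pinning node 0
    u = np.zeros(nodes)
    u[1:] = np.linalg.solve(k[1:, 1:], loads[1:])
    return u

def element_shear(elements, u):
    """constant shear stress per element: tau = g * (u_j - u_i) / l."""
    return [g * (u[j] - u[i]) / l for i, j, g, t, rs, l in elements]

# toy three-element open section (illustrative numbers only)
els = [(0, 1, 80e9, 0.01, 1.0, 0.5),
       (1, 2, 80e9, 0.01, 1.0, 0.5),
       (2, 3, 80e9, 0.01, 1.0, 0.5)]
f = np.array([0.0, 1.0e3, -0.5e3, -0.5e3])
u = solve_warping(4, els, f)
print(element_shear(els, u))
```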
5 Corrected normal stresses due to the influence of shear (shear lag)

The normal stress must be corrected for (a) the stress arising from the longitudinal change of the warping field and (b) the normal stress due to the correcting bending moment (M_c), which compensates for the loss of cross-section equilibrium:

\[ \text{(a)}\quad (\sigma_x^c)_i = E^e\,rn^e\,\frac{\partial u_i}{\partial x}; \qquad \text{(b)}\quad M_{yc} = -\sum_e \int_0^{l^e} \sigma_x^c(s)\,z_c(s)\,t^e\,\mathrm{d}s. \]

The total normal stress correction in node i reads:

\[ (\sigma_x^{ct})_i = (\sigma_x^c)_i + \frac{rn^e E^e\,(I_z\,z_{c,i} - I_{yz}\,y_{c,i})}{E\,(I_y I_z - I_{yz}^2)}\,M_{yc}. \]

The approximate value of the normal stress for simultaneous bending about the axes y and z in node i reads:

\[ \sigma_{x,i} = \frac{rn^e M_z}{E I_z}\,E^e\,y_{c,i} + \frac{rn^e M_y}{E I_y}\,E^e\,z_{c,i} + (\sigma_x^{ct})_i. \]

6 Calculation of warping and primary shear stresses due to pure torsion

A transverse strip of a thin-walled beam of length Δx is subjected to torsional loading. The displacement field of the middle line of the thin-walled elements is expressed using the warping function u⁰(s), the rotation around the centre of twist, the twist rate ψ_x,x and the twist angle ψ_x:

\[ u_t(x,s) = u^0(s)\,\psi_{x,x}(x), \qquad v_t(x,s) = d_t\,\psi_x(x), \]

where d_t is the normal distance from the element to the centre of torsion. The strain and stress fields read:

\[ \begin{Bmatrix} \varepsilon_x \\ \gamma_{xs} \end{Bmatrix} = \begin{Bmatrix} u^0\,\psi_{x,xx} \\ (\partial u^0/\partial s + d_t)\,\psi_{x,x} \end{Bmatrix}, \qquad \begin{Bmatrix} \sigma_x \\ \tau_{xs} \end{Bmatrix} = \begin{Bmatrix} E\,u^0\,\psi_{x,xx} \\ G\,(\partial u^0/\partial s + d_t)\,\psi_{x,x} \end{Bmatrix}. \]

The total potential energy of a section is given by the standard expression:

\[ \Pi = \tfrac{1}{2}\int_V (\sigma\varepsilon + \tau\gamma)\,\mathrm{d}V - W. \]

After summation over all the elements and transformation of the local element displacements u^e = N^T u^e and loads f^e into the global displacements u and loads f, we get (per unit twist rate ψ_x,x):

\[ \Pi = \tfrac{1}{2}\,\psi_{x,x}^2 \sum_e rs^e\,G^e t^e\,(d_t^e)^2\,l^e + \tfrac{1}{2}\,\mathbf{u}^T\mathbf{K}\,\mathbf{u} - \mathbf{u}^T\mathbf{f}, \]

where

\[ \mathbf{k}^e = rs^e\,G^e t^e \int_0^{l^e} \mathbf{B}\,\mathbf{B}^T\,\mathrm{d}s, \qquad \mathbf{f}^e = -\,rs^e\,G^e t^e \int_0^{l^e} \mathbf{B}\,d_t^e\,\mathrm{d}s. \]

Minimization of the total potential energy leads to two sets of equations: (1) ∂Π/∂ψ = 0 (1D beam torsion) and (2) ∂Π/∂u = 0 (2D cross-section warping). The second set of equations,

\[ \partial\Pi/\partial\mathbf{u} = \mathbf{0} \;\Rightarrow\; \mathbf{K}\,\bar{\mathbf{u}} = \mathbf{f}, \]

enables the determination of the unit warping field. The primary shear stresses on the elements which are parts of closed contours (cc) and open sections (os) can now be calculated as functions of the 1D twist rate ψ,x(x) (to be obtained from the first relation, for 1D beam torsion):

\[ \tau_{xs,k}^e\,(\text{cc}) = G^e\,\psi_{x,x}\left( \frac{\bar{u}_j - \bar{u}_i}{l^e} + d_t^e \right) \quad\text{and}\quad \tau_{xs,\max}^e\,(\text{os}) = G^e\,t^e\,\psi_{x,x}. \]

7 Calculation of torsional and warping stiffness of thin-walled structures

To solve the equation for 1D beam free torsion, the torsional stiffness of the elements which are parts of the open parts (o) and closed cells (c) can now be calculated using the known unit warping field ū:

\[ GI_{to} = \sum_e \frac{G^e\,l^e\,(t^e)^3}{3}\,rs^e, \qquad GI_{tc} = \sum_e G^e t^e l^e \left( \frac{\bar{u}_j - \bar{u}_i}{l^e} + d_t^e \right)^2 rs^e, \qquad GI_t = GI_{to} + GI_{tc}. \]

The warping stiffness is calculated using the expression:

\[ EI_w = \sum_e \frac{E^e\,l^e\,t^e}{3}\left( \bar{u}_i^2 + \bar{u}_i\bar{u}_j + \bar{u}_j^2 \right) rn^e. \]

Using GI_t and EI_w, the matrix K_1D for the 1D beam problem can be formed and the relevant parameter distributions ψ(x), ψ,x(x), ψ,xx(x), ψ,xxx(x) can be determined for use in the shear stress calculations.
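The two sums above translate directly into code. The sketch below assumes the unit warping field ū has already been solved for; for open branches the warping solution gives (ū_j − ū_i)/l^e + d_t^e ≈ 0, so the cell term vanishes there automatically. This is our own illustration, not the OCTOPUS source:

```python
def torsion_constants(elements, u_bar):
    """Free-torsion and warping stiffness from the unit warping field
    u_bar (one value per node), per the sums of section 7.
    elements: iterable of dicts with keys i, j, l, t, G, E, dt, rs, rn."""
    GIt, EIw = 0.0, 0.0
    for e in elements:
        ui, uj = u_bar[e["i"]], u_bar[e["j"]]
        # open-strip (St. Venant) contribution, G*l*t^3/3
        GIt += e["G"] * e["l"] * e["t"]**3 / 3.0 * e["rs"]
        # closed-cell contribution from warping gradient + lever arm;
        # ~0 on open branches, where (uj-ui)/l + dt vanishes
        GIt += e["G"] * e["t"] * e["l"] * ((uj - ui) / e["l"] + e["dt"])**2 * e["rs"]
        # warping stiffness, E*l*t/3 * (ui^2 + ui*uj + uj^2)
        EIw += e["E"] * e["l"] * e["t"] / 3.0 * (ui*ui + ui*uj + uj*uj) * e["rn"]
    return GIt, EIw
```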
8 Normal and secondary shear stresses due to restrained warping

Restrained warping of a thin-walled beam will induce (a) normal stresses in the cross-section and (b) secondary shear stresses which balance the longitudinally non-uniform distribution of the normal stresses. This additional mechanism influences the strain energy and the work expression, so an iterative solution may be needed for greater accuracy.

Let u(x, s) = ū(s) ψ,x(x) be the warping field in the cross-section calculated from the case of free torsion. The normal stresses are caused by restraining the warping, and vary along the x axis. They are given by:

\[ \sigma_{xw}^e = E^e\,\bar{u}(s)\,\psi_{x,xx}(x), \quad\text{or}\quad (\sigma_{xw})_i = E^e\,\bar{u}_i\,\psi_{x,xx}\,rn^e. \]

Let u2(x, s) be the secondary displacement field containing the displacement correction due to restrained warping. The total potential energy of a transverse strip consists of the internal energy generated by the fields ε2 and τ2 (based on u2) and the additional work done by the strip axial load Δp_x on the secondary displacements u2. If the change of u2 along the strip length Δx is neglected, the total potential energy reads:

\[ \Pi = \sum_e \left( \tfrac{1}{2}\int_V G^e \left(\frac{\partial u_2}{\partial s}\right)^2 \mathrm{d}V - \int_s \Delta p_x(x,s)\,u_2(x,s)\,\mathrm{d}s \right). \]

The net external load Δp_x due to restrained warping reads:

\[ \Delta p_x(x,s) = \frac{\partial \sigma_{xw}(x,s)}{\partial x} = E^e\,\bar{u}(s)\,\psi_{x,xxx}(x)\,rn^e, \]

and the total potential energy of the element, using the same shape functions as before, reads:

\[ \Pi^e = \tfrac{1}{2}\,\mathbf{u}_2^{eT}\left( G^e t^e\,rs^e \int_0^{l^e} \mathbf{B}\,\mathbf{B}^T\,\mathrm{d}s \right)\mathbf{u}_2^e - \mathbf{u}_2^{eT}\left( E^e t^e\,rn^e\,\psi_{x,xxx} \int_0^{l^e} \mathbf{N}\,\mathbf{N}^T\,\mathrm{d}s \right)\bar{\mathbf{u}}^e. \]

Minimization of the total potential energy with respect to the unknown displacement field u2 leads to:

\[ \partial\Pi/\partial\mathbf{u}_2 = \mathbf{0} \;\Rightarrow\; \mathbf{K}\,\mathbf{u}_2 - \mathbf{f} = \mathbf{0}, \quad\text{i.e.}\quad \mathbf{K}\,\bar{\mathbf{u}}_2 = \bar{\mathbf{f}}, \]

where K is the global stiffness matrix as before, u2 is the global vector of unknown displacements, u2 = ū2 ψ,xxx, and f is the global load vector, f = f̄ ψ,xxx. The element load and the secondary shear stresses (constant on the element) read:

\[ \mathbf{f}^e = rn^e E^e t^e l^e \begin{bmatrix} 1/3 & 1/6 \\ 1/6 & 1/3 \end{bmatrix}\begin{Bmatrix} \bar{u}_i \\ \bar{u}_j \end{Bmatrix}\psi_{x,xxx}; \qquad \tau_{xs,k}^{e(2)} = G^e\,\frac{\partial u_2}{\partial s} = G^e\,\frac{u_{2j} - u_{2i}}{l^e} = G^e\,\frac{\bar{u}_{2j} - \bar{u}_{2i}}{l^e}\,\psi_{x,xxx}. \]

The shear stress distribution can be calculated more accurately along the element (similarly to the bending case) from the known element average stress τ̄_xs,k^{e(2)}, the direction of the shear stress flow, the local element contribution τ2(s) and its average τ̄2,k^e, using the expression

\[ \tau_{xs}^{e(2)}(s) = \bar{\tau}_{xs,k}^{e(2)} + \tau_2(s) - \bar{\tau}_{2,k}^{e}. \]

After rearranging, it reads:

\[ \tau_{xs}^{e(2)}(s) = \left\{ G^e\,\frac{\bar{u}_{2j} - \bar{u}_{2i}}{l^e} + \frac{E^e rn^e}{rs^e}\left[ \bar{u}_i\,s + \frac{\bar{u}_j - \bar{u}_i}{2\,l^e}\,s^2 - l^e\left(\frac{\bar{u}_i}{3} + \frac{\bar{u}_j}{6}\right) \right] \right\}\psi_{x,xxx}. \]
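A short sketch of the element-level quantities of this section, under the same hedges as before (our own names, one material per element):

```python
import numpy as np

def secondary_load_vector(E, t, l, rn, ui, uj, psi_xxx):
    """Consistent element load for restrained warping:
    f_e = rn*E*t*l * [[1/3, 1/6], [1/6, 1/3]] @ [ui, uj] * psi_xxx."""
    M = np.array([[1/3, 1/6], [1/6, 1/3]])
    return rn * E * t * l * (M @ np.array([ui, uj])) * psi_xxx

def secondary_shear(G, l, u2i, u2j, psi_xxx):
    """Element-constant secondary shear stress,
    tau^(2) = G * (u2j - u2i)/l * psi_xxx."""
    return G * (u2j - u2i) / l * psi_xxx
```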
9 Examples

The first example is based on reports from the Advanced Subsonic Technology (AST) program. In the course of this program an experimental model of a composite wing box was made and tested (Fig. 3). [11] gives the loads carried by the hydraulic actuators which simulate the in-flight loading conditions. The example shows a way of rapidly modeling and calculating the overall response of a similar metal wing box with linear behavior during the early stages of wing structural design. The analyzed wing box, with reference to the AST box, was shortened to 9.8 meters and modeled with high-strength aluminum alloy 7075. The loads are decreased for the shortened wing box. The wing box is modeled by the OCTOPUS 1D/2D combination.

Fig. 3: MD-90 airplane and the test model of the wing box

Fig. 4 presents a model of the aluminum wing-box under modified −2.5g loading conditions, and the unit response which needs to be multiplied by the values in parentheses to get the actual values for the considered load. The response components are decoupled to show the influence of each type of load on the response.

Fig. 4: Wing box model and unit responses of 1D FEM element 10 (between ribs 10 and 11): (a) metal wing box with 13 elements, (b) normal stress (bending), (c) normal stress (restrained warping), (d) shear stress (bending), (e) shear stress (torsion), (f) shear stress (restrained warping)

The cross-section geometric characteristics obtained from the 2D FEM and used in the 1D FEM analysis are given in Fig. 5b:

1D element 10: y_cg = −14.5818 mm; z_cg = −337.763 mm; α = 7.14e−02°; y_sc = −16.4946 mm; z_sc = −354.305 mm; A = 66270 mm²; I_y = 2.78e+10 mm⁴; I_z = 1.66e+09 mm⁴; I_yz = 3.26e+07 mm⁴; I_1 = 2.78e+10 mm⁴; I_2 = 1.66e+09 mm⁴; I_t = 3.96e+09 mm⁴; I_w = 2.85e+14 mm⁶; A_v = 2412.777 mm²; A_h = 31405.07 mm².

The accuracy of the method is demonstrated in Figures 5a and 6. Strake 5 of Fig. 5a is located in the middle of the upper skin.

Fig. 5: (a) Stress accuracy across a wing span between OCTOPUS (O) and MAESTRO (M), (b) cross-sectional properties of 1D FEM element 10

Fig. 6 presents the application to the U-channel and two standard ship structures.

Fig. 6: (a) U-beam shear stress τ_w, (b) container ship shear stress τ_w, (c) general cargo ship shear stress τ_b,vert

It can be seen that the accuracy of the shear stress distribution based on the FEM (constant per element) and the analytical formulae (continuous line) in examples 6a and 6c is very good, even without the parabolic correction. The verification examples are taken from [5].

10 Conclusions

A simple and practical method for calculating the primary response of monotonous structures (wings, ships, bridges) has been presented. All cross-section parameters are easily determined for complex stiffened thin-walled structures using a special FEM procedure. It could successfully replace classical, often cumbersome, analytical calculations. The method has been in constant use since 1980, applied to many real structures for concept design (in the OCTOPUS system) or as the generator of the force boundary conditions for partial 3D FEM models.

References

[1] Zanic, V., Das, P. K., Pu, Y., Faulkner, D.: "Multiple criteria synthesis techniques applied to reliability based design of SWATH ship structure" (chapter 18). In: Integrity of Offshore Structures 5 (Faulkner, Das, Incecik, Cowling, eds.), EMAS Scientific Publications, Glasgow, 1993, p. 387–415.
[2] Zanic, V., Rogulj, A., Jancijev, T., Bralic, S., Hozmec, J.: "Methodology for evaluation of ship structural safety." Proc. of 10th Intl. Congress IMAM 2002, Crete, Greece, 2002, p. 54 + CD.
[3] Zanic, V., Andric, J., Frank, D.: "Structural optimization method for the concept design of ship structures." Proceedings of the 8th International Marine Design Conference, Vol. 1 (Papanikolau, A. D., ed.), Athens, Greece, 2003, p. 99–110.
[4] Hughes, O. F.: Ship Structural Design. Wiley, 1983, SNAME 1992.
[5] Zanic, V.: "Calculation of shear flow in cross-section of ship in bending" (in Croatian). Proc. of SORTA Conference: Theory and Practice in Shipbuilding, Split, Croatia, 1982.
[6] Zanic, V.: "Determinazione degli sforzi principali – flessione e torsione – sullo scafo di una nave applicando particolari elementi finiti." Tecnica Italiana (Rivista d'Ingegneria), No. 3, 1985, p. 105–113.
[7] Prebeg, P.: Diploma thesis, University of Zagreb, 2003.
[8] MAESTRO documentation, Proteus Eng., Stevensville, MD, USA, 2003.
[9] CREST documentation, Croatian Register of Shipping, Split, Croatia, 2004.
[10] Hughes, O. F., Mistree, F., Zanic, V.: "A practical method for the rational design of ship structures." Journal of Ship Research, Vol. 24 (1980), No. 2, June 1980, p. 101–113.
[11] Karal, M.: AST Composite Wing Program – Executive Summary. NASA CR-2001-210650, 2001.

Vedran Zanic
phone: +385-1-6168122
fax: +385-1-6168399
e-mail: vedran.zanic@fsb.hr

Pero Prebeg
e-mail: pero.prebeg@fsb.hr

University of Zagreb
Faculty of Mechanical Engineering and Naval Architecture
I. Lucica 5
10000 Zagreb, Croatia

A business process model as a starting point for tight cooperation among organizations

O. Mysliveček

Outsourcing and other kinds of tight cooperation among organizations are more and more necessary for success on all markets (markets of high-technology products are particularly influenced). Thus it is important for companies to be able to effectively set up all kinds of cooperation. A business process model (BPM) is a suitable starting point for this future cooperation. In this paper the process of setting up such cooperation is outlined, as well as why it is important for business success.

Keywords: business process management, business process model, outsourcing, joint venture, poster 2006.

1 Necessity for tight cooperation

In the present turbulent times it is very important for companies to be customer-oriented and thus highly flexible in delivering products and/or services and in supplying new and marketable products. This trend is notable especially on markets of high-technology products; commodities are not influenced so greatly. Companies are forced to concentrate on those activities they are outstanding in, and to drop all other supporting activities. This is the way to achieve a major competitive advantage, and it forces companies to enter into strategic alliances, create joint ventures or simply outsource processes to other companies that are more efficient in performing those processes. What further complicates the situation is that it is now and then vital to set up such cooperation very fast.

2 Business process model (BPM)

A business process model describes the workflow of the business processes that are performed by an organization. An important aspect of BPM is that the customer is involved, because all processes maintained by the company have to add value to the company's products or services from the customers' point of view. The customer need not always be a consumer of the company's products or services: a customer for processes can be an employee, a supplier, etc.

Fig. 1: Example of a simple BPM (t_x – an event that triggers processes, p_x – a process)

The most important elements of BPM are business processes, which are defined by:
- goals – why this particular process is performed (added value),
- activities and subprocesses – how inputs are transformed into outputs,
- resources – inputs of the process that are consumed during the transformation,
- information – inputs that are used to modify the transformations in the process, but are not consumed (e.g. the color of a product),
- owner – the person responsible for the performance of the process,
- trigger – an event that sets the process in motion.

There are several levels of complexity of processes and BPMs. A process in a more complex BPM can be represented by many processes in a more detailed BPM. [4]
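Purely as an illustration of this structure (the type and field names below are ours, not part of any BPM standard or of [4]), the six defining elements can be written down as a simple record, with nesting expressing the levels of detail:

```python
from dataclasses import dataclass, field

@dataclass
class BusinessProcess:
    """One process in a BPM, carrying the six defining elements."""
    name: str
    goal: str                    # why the process is performed (added value)
    activities: list             # how inputs are transformed into outputs
    resources: list              # inputs consumed during the transformation
    information: list            # inputs that modify, but are not consumed
    owner: str                   # person responsible for the process
    trigger: str                 # event that sets the process in motion
    subprocesses: list = field(default_factory=list)  # a more detailed level
```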
2.1 Benefits of BPM

One of the major benefits of implementing business process management, and thus creating the business process model of a company, is that the BPM is a good starting point for setting up cooperation with other companies. The model clearly describes the requirements of each individual process and the added value for the customers (and implicitly for the company itself). Based on this information, it is easier to hand a particular part of the business process workflow over to another company. Further important benefits of introducing business process management into a company structure can be found in the literature [1, 2, 4].

3 Procedure for process outsourcing

If a company has business process management already integrated into its management system, it has a very good starting point for successful outsourcing of its processes. If not, outsourcing can be very difficult – though not impossible – to repeat with different outsourced activities. In this paper it is assumed that the company already has a BPM (in several levels of detail). The procedures for making a contract are also not dealt with here.

Step 1: Identify the need for workflow improvement. Business process management involves activities that should identify problems (or just a need for improvement) in the workflow of processes, for example slow or non-effective marketing research. Based on this finding, managers can decide to find a partner company that will take care of this particular group of problematic processes.

Step 2: Identify the group of all processes to be outsourced. A very important step is to select the processes to be handed over to a new partner. It is sometimes appropriate to outsource even non-problematic processes linked with the problematic process in order to gain other benefits (to reduce the amount and frequency of information or material exchange, to cut the need to employ staff, etc.).

Step 3: Specify the interface for the cooperation. The goal of this step is to identify the inputs of the processes that follow the processes to be outsourced, and also the outputs of the processes that precede the outsourced processes. The first part is more important, because the inputs of the subsequent processes form the requirements for the cooperating company. Based on this data, the search for a cooperating partner should begin.
The outputs of the preceding processes may not be enough for the partner company, and a change in close areas of the BPM can be triggered (or a new search for a cooperating company that will be able to transform the available inputs into the required outputs).

Step 4: Integration of the new partner into the workflow. After a successful search for a cooperating partner, it is essential to integrate its processes tightly into the company's BPMs. The possibility of integration (and maximum automation of information exchange) should also be one of the criteria in the selection process.

Step 5: Evaluation of effectiveness. It is important to evaluate the effectiveness of the outsourced processes. This has to be carried out continuously for the whole period of cooperation.

3.1 Other kinds of cooperation

Joint ventures involving several companies, strategic alliances and other ways of cooperating differ as regards the roles of the participating subjects, as well as in other aspects of the cooperation. However, the significance of the BPM is very similar to the situation of outsourcing: it helps in analyzing the inputs and outputs of the processes that the external subject has to provide.

4 Summary

Business process management and the linked business process model are interesting starting points for making decisions about handing over certain internal business processes to external subjects. The process of identifying the need to establish cooperation, identifying the processes that should be handed over, etc., is very complex, and many factors have to be taken into account at all stages of the process.

References

[1] Truneček, J.: Znalostní podnik ve znalostní společnosti. Professional Publishing, Praha, 2003.
[2] BPM Tutorial (http://bpmtutorial.com).
[3] Smith, H., Fingar, P.: Business Process Management: The Third Wave. 1st edition, Meghan-Kiffer Press, 2003.
[4] UML Tutorials: The Business Process Model. Sparx Systems, 2004 (http://www.sparxsystems.com.au/downloads/whitepapers/the_business_process_model.pdf)

Ing. Ondřej Mysliveček
e-mail: myslivo@fel.cvut.cz
Dept. of Economics, Management and Humanities
Czech Technical University in Prague
Faculty of Electrical Engineering
Technická 2
166 27 Praha, Czech Republic

Deformation stress state of elastic bodies

J. Skokánek

The theory of the deformation stress state is based on the actual corpuscular structure of matter, characterized in terms of mechanics by the fact that an increase in the distance of two adjacent atoms is accompanied by the origin of an attractive force, and a reduction in their distance by the origin of a repulsive force. These forces differ significantly from the classical internal forces, which are the forces of the mechanics of perfectly solid bodies: these express the equilibrium of forces with reference to a given area within the loaded body, and have no direct deformation effect. This paper defines the quantities of the deformation stress state – the deformation force and the deformation stress – the direct manifestation of which is a deformation. The author introduces the term deformation stress state theory (DSS theory) for the field of the theory of elasticity dealing with the stress state of deformable bodies. The quantities and the equations of this theory also form the basis for the formulation of the theory of failure, which makes it possible to determine reliably the safety margin and the strength of a multiaxially loaded body from the stress state described by the static quantities (stress tensor) and the uniaxial strength.

Keywords: deformation force, deformation stress, effective stress, effective strength, proportionality principle.

1 Stress state of elastic bodies

1.1 Theory of elasticity uses forces and axioms of stereomechanics

In the theory of elasticity the stress state of a loaded elastic body is described by the internal force and the stress, i.e. quantities adopted from the mechanics of perfectly solid bodies. While the deformation depends on the physical and mechanical properties of the materials, and changes during any change of the load applied to the body, for a description of the stress state of elastic bodies we have only quantities which do not depend directly either on the material properties or on the magnitude of the deformation of the body. From the mechanics of perfectly solid bodies the theory of elasticity also adopts both fundamental axioms of statics, viz. the axiom on the equilibrium of forces and the axiom on the composition and resolution of forces. According to the axiom on the equilibrium of forces, two forces are in equilibrium only if they act in opposite directions in a single line.
According to the axiom on the composition and resolution of forces (the axiom of the parallelogram of forces), the vector resolution or composition of forces does not change the state of equilibrium of the given system. A necessary prerequisite for the application of both of these axioms is the idealization of the continuously applied forces by a force concentrated in a single line of action. The concentration of the force in a single line and its effect on a geometric point make it possible to consider the force as a vector quantity and to use all vector operations.

According to Newton's third theorem (the axiom on action and reaction), at every point of a contact surface between two bodies the vector of the specific load must equal the vector of the stress of the opposite direction. The effect of the forces on the contact of two bodies does not depend on their properties or on the changes in their volume and shape.

Fig. 1 shows the interaction of the forces of two prismatic bodies of identical transverse dimensions on a contact surface from which friction has been excluded. In the direction of the first principal axis both bodies are affected by the load generating the longitudinal principal stress σ1 in them. The left body is exposed to a uniaxial load, while the right body is loaded additionally in the principal directions x2, x3 by a load generating transverse compressive stresses of equal magnitude, σ2 = σ3 = σ1.

Fig. 1: Classical stress state and deformation state at the contact of two bodies subjected to different loads

The equilibrium of the forces on the contact surface manifests itself by the equal principal stress σ1 acting from both of its sides. The deformations of the two bodies, however, are entirely different. The left body is compressed longitudinally more than the right body. In the transverse direction the left body elongates, while the right body shortens. When the stress attains the strength in uniaxial compression, the left body fails by a crack parallel with x1, while the right body does not fail even under the highest values of the external stresses. The different deformations of the two bodies are shown illustratively by the deformation of a small spherical element
within the right and the left bodies (Fig. 1a). The initial shape of the elements is drawn by a dashed line, and the deformed shape by a solid line. While the stress σ1 has the same value in all cross sections of the two bodies perpendicular to x1, the spherical element within the left body is deformed entirely differently from the spherical element within the right body. The different deformations of the two bodies are illustrated in Fig. 1b, showing the deformation of the two halves of the initially spherical element with the centre on their contact surface.

From the axiom of the composition and resolution of forces it follows that the two vectors of opposite directions of the resulting stress can be resolved into any number of components without disturbing the state of equilibrium or changing the deformation of either body. The resolution of the vectors influences neither the deformation state nor the stress state. Fig. 2 shows a case of resolution of the resulting stress vector σ, acting obliquely on a plane p within the loaded body, into various components. Fig. 2a shows the usual vector resolution into two components – normal σn and tangential τ. Any stress vector resolution that differs on the two sides of the surface changes neither the stress nor the deformation. Fig. 2b shows one of the infinite number of cases of the possible resolution of the resulting stresses in the same stress state as in Fig. 2a. In the resulting stress direction of the right side, the stress component on the other side can have an entirely different magnitude.

Fig. 2: Different cases of resolution of the resulting stress into its components

The resolution of every resulting stress vector acting on three mutually perpendicular planes into three components is useful for the definition of the three-dimensional stress state by means of the stress tensor; however, it is incorrect to assign to the individual stresses, or even to their components, any influence on the deformation (a deformation effect).

1.2 Erroneous opinion of the deformation effect of stress tensor components

The relation between the principal stresses and the principal relative elongations is defined by the extended Hooke's law:

\[ \varepsilon_1 = \frac{\sigma_1 - \mu(\sigma_2 + \sigma_3)}{E}, \quad(1a) \qquad \varepsilon_2 = \frac{\sigma_2 - \mu(\sigma_3 + \sigma_1)}{E}, \quad(1b) \qquad \varepsilon_3 = \frac{\sigma_3 - \mu(\sigma_1 + \sigma_2)}{E}. \quad(1c) \]

If two principal stresses equal zero, we speak about a uniaxial stress state. We obtain the relation between the single nonzero principal stress and the first principal elongation (the basic Hooke's law) by substituting σ2 = σ3 = 0 in eq. (1a):

\[ \varepsilon_1 = \frac{\sigma_1}{E}. \quad(2a) \]

From the two equations (1b) and (1c) we obtain

\[ \varepsilon_2 = \varepsilon_3 = -\mu\,\frac{\sigma_1}{E} = -\mu\,\varepsilon_1. \quad(2b) \]

The general Hooke's law makes it obvious that every principal relative elongation depends on all the principal stresses, and not only on the principal stress of the identical direction. In the preceding paragraph we pointed out that none of the principal stresses will change due to any change in the other principal stresses.
At the same time, every change of one of the principal stresses will produce a change of the relative longitudinal deformation in every direction from the given point. In the case of a multiaxial stress state, the relative elongation in the direction of the given principal stress may attain different magnitudes depending on the other principal stresses. This is shown in Table 1, which gives the relative elongation values (computed by means of the expanded Hooke's relation) in the zero stress direction for various values of the remaining two principal stresses. The selected modulus of elasticity is E = 1, Poisson's coefficient μ = 0.25.

Table 1:
item | σ1   | σ2   | σ3   | Eε1
  1  |   0  | −100 | +100 |   0
  2  |   0  |    0 | −100 | +25
  3  |   0  |    0 | +100 | −25
  4  |   0  | −100 | −100 | +50
  5  |   0  | +100 | +100 | −50
  6  |  −25 | −100 | −100 | +25

Depending on the overall stress state, the relative elongations in the zero stress direction may have positive or negative values. Item 6 even shows a case when the body elongates in the compressive stress direction due to the effect of major transverse compressive stresses. In this case, during increasing stresses the body apparently paradoxically breaks in the compressive stress direction after a certain limit of the transverse compressive stresses has been attained.

These statements are confirmed by the tests made by Bridgman [1] with glass and steel bars installed in a high-pressure chamber in such a way that their ends protruded from the chamber (Fig. 3). The successively increasing pressure of the liquid increased both transverse principal stresses, while the longitudinal stress remained at the zero level. When a certain limit of the transverse compressive stress (the pressure in the liquid) was attained, the bar broke in the zero stress direction. Bridgman was unable to explain the cause of this failure. He merely stated that the origin of the failure could not be explained by any known hypothesis. He considered the hypothesis of maximum relative elongation to be the most likely, but his tests showed that the bars broke under the effect of markedly higher liquid pressures than those corresponding with this hypothesis.

Fig. 3: Bridgman's test, in which a glass bar ruptures under the pressure of the liquid in the zero stress direction

To express the relation between the stress state and the further deformation stress state problems elaborated later in this paper, it is necessary to express the relative longitudinal change in any direction from the given point by a simple equation. The relative elongation in a general direction δ1, δ2, δ3 of any triaxially loaded body (Fig. 4) can be defined by means of the given principal relative elongations ε1, ε2, ε3 by the known equation

\[ \varepsilon_\delta = \varepsilon_1\cos^2\delta_1 + \varepsilon_2\cos^2\delta_2 + \varepsilon_3\cos^2\delta_3. \quad(3) \]

The equation for the relative elongation in a general direction δ of a uniaxially loaded body can be derived from this equation by putting δ1 = δ, ε2 = ε3 = −με1 and cos²δ2 + cos²δ3 = 1 − cos²δ:

\[ \varepsilon_\delta = \varepsilon_1\left[(1+\mu)\cos^2\delta - \mu\right], \quad(4) \]

and by the substitution of ε1 = σ1/E:

\[ \varepsilon_\delta = \frac{\sigma_1}{E}\left[(1+\mu)\cos^2\delta - \mu\right]. \quad(5) \]

In a triaxially loaded body the relative elongation in a general direction expressed by the angles δ1, δ2, δ3 can be determined by the superposition of the relative elongations produced by all the uniaxial principal stresses:

\[ \varepsilon = \frac{\sigma_1\left[(1+\mu)\cos^2\delta_1 - \mu\right] + \sigma_2\left[(1+\mu)\cos^2\delta_2 - \mu\right] + \sigma_3\left[(1+\mu)\cos^2\delta_3 - \mu\right]}{E}. \quad(6) \]
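Eqs. (1) and (6) are easy to check numerically; the snippet below reproduces item 6 of Table 1 (elongation in the direction of a compressive stress). It is a verification sketch only, with names of our own choosing:

```python
import numpy as np

def principal_strains(s1, s2, s3, E=1.0, mu=0.25):
    """Generalized Hooke's law, eqs. (1a)-(1c)."""
    return ((s1 - mu*(s2 + s3)) / E,
            (s2 - mu*(s3 + s1)) / E,
            (s3 - mu*(s1 + s2)) / E)

def elongation(s, d, E=1.0, mu=0.25):
    """Relative elongation in a general direction, eq. (6);
    s = (s1, s2, s3) principal stresses, d = (d1, d2, d3) angles [rad]."""
    return sum(si*((1 + mu)*np.cos(di)**2 - mu) for si, di in zip(s, d)) / E

# Item 6 of Table 1: sigma = (-25, -100, -100) gives E*eps1 = +25,
# i.e. the body elongates in the direction of a compressive stress:
print(principal_strains(-25, -100, -100)[0])                 # -> 25.0
print(elongation((-25, -100, -100), (0, np.pi/2, np.pi/2)))  # -> 25.0
```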
It is more advantageous to express the relative elongation ε in the spherical coordinates φ, λ, for which it holds that (Fig. 4)

\[ \cos^2\delta_1 = \cos^2\varphi, \quad(7) \qquad \cos^2\delta_2 = \sin^2\varphi\,\cos^2\lambda, \quad(8) \qquad \cos^2\delta_3 = \sin^2\varphi\,\sin^2\lambda. \quad(9) \]

Fig. 4: Diagram of the spherical coordinates

Substitution in eq. (6) yields the equation of the relative elongation of a triaxially loaded body in a general direction:

\[ \varepsilon = \frac{\sigma_1\left[(1+\mu)\cos^2\varphi - \mu\right] + \sigma_2\left[(1+\mu)\sin^2\varphi\cos^2\lambda - \mu\right] + \sigma_3\left[(1+\mu)\sin^2\varphi\sin^2\lambda - \mu\right]}{E}. \quad(10) \]

1.3 Actual significance of shear stress

In the case of a general stress state, the vector of the resulting force is applied obliquely to a certain cross-section area. In such a case the resulting internal force is usually resolved into the normal component and two tangential components. It follows from this that the stress expresses the equilibrium of forces with reference to the given cross-section area, but has no direct influence on the longitudinal deformation in its direction. This applies, naturally, also to the orthogonal components into which we resolve the resulting stress vector according to the axiom of the parallelogram of forces. Neither the normal nor the tangential stress components have any deformation effect. Nevertheless, the deformation effect of the tangential stress components is generally considered an indubitable fact, and exceeding the maximum shear stress is still considered a possible cause of failure. In the case of elastoplastic bodies, the origin of failure due to shear is even the most widely recognized failure hypothesis. The terms themselves, such as shear stress, shearing force, shear area, shear failure and shear strength, express the erroneous assumption that such a stress state produces a mutual displacement of two parts of the body.

The shear effect of the shearing force is most frequently explained according to Fig. 5. For instance, "Strength of Materials" [4] states: "If the external forces on the left-hand side of the cross section p are composed into a resultant which has only a tangential component Q acting in the cross-section plane with the point of application in the cross-section centroid (the so-called shearing force), and the external forces on the right-hand side of the infinitely near parallel cross section p' are composed according to the action and reaction theorem into a resultant Q' of the opposite direction, of the same magnitude, with the point of application in the cross-section centroid, then both adjacent cross sections will be displaced mutually. When the ultimate shear strength has been attained, the cross section will be shorn off."

Fig. 5: Erroneous assumption of the shear stress deformation effect according to [4]

This explanation of the shearing effect of the shearing force is based on the idea that the two cross sections are infinitely near, so that it is possible to consider the opposite shearing forces as acting in a single plane, while assuming that their mutual distance enables mutual displacement. Such an assumption is evidently erroneous. If both shearing forces act at a differential distance dx according to Fig. 5, they produce a bending moment dM, so that the equilibrium must be assured by a bending moment of equal magnitude, but of opposite direction.
Thus

\[ \mathrm{d}M = Q\,\mathrm{d}x, \quad\text{hence}\quad Q = \frac{\mathrm{d}M}{\mathrm{d}x}, \quad(11) \]

which is known as Schwedler's theorem, according to which the shearing force is the first derivative of the bending moment.

Also the statement that the cracks originating in the body at the ultimate strength of the material may have the direction of the principal shearing stresses is erroneous. This is shown by Fig. 6. Fig. 6a represents a body of cubic shape loaded by the uniaxial pressure σ2 in the direction of x2. The maximum shear stress τmax acts on the plane deviating by 45° from x2. The shear stress τ'max acts on the same plane in the opposite direction as a reaction. The same maximum shear stress τmax originates in the same plane of the same element during the uniaxial stress state produced by the tensile stress σ1 = −σ2 acting in the direction of x1 (Fig. 6b). Let us add that the tangential stress is not algebraically ambiguous (positive or negative), so that zero is considered its minimum value. Both cases involve a uniaxial load: in the former case in compression, and in the latter case in tension. In either case the failure crack originates not in the shear stress direction, but in the direction perpendicular to x1, which is the axis of maximum elongation in either case. The absolute values of the two uniaxial strengths and the extreme shear values in the two cases differ substantially. The uniaxial compression strength is several times as high as the uniaxial tensile strength, and the values of the ultimate shear stresses in the two stress state cases are in the same ratio.

Fig. 6: The origin of a failure crack does not depend on the magnitude of the shear stress, and its plane does not follow the direction of the maximum shear stress

2 Deformation stress state of elastic bodies

2.1 Influence of atomic structure on the deformation and stress state of solid bodies

Classical stress is a plane force quantity of the mechanics of perfectly solid bodies expressing the equilibrium of the forces applied to a unit surface from both sides. Stress exercises no direct influence on deformation. In the direction of a stress vector of a certain magnitude, the relative elongation may be of a different value and orientation, depending on the overall triaxial stress state (see Table 1). The principal stress at the contact of two bodies (Fig. 1) produces an entirely different three-dimensional deformation in each of them. An examination of the relative elongation in a given direction from a given point requires knowledge of the three stresses at the given point acting on three mutually perpendicular planes passing through the given point. An advantageous expression of the stress state is the stress tensor, the simplest (diagonal) form of which comprises three components – the principal stresses – by means of which it is possible to compute the principal relative elongations according to eq. (1), and the relative elongation in any direction from the given point in a triaxially loaded body according to eq. (10). However, the solution of various problems of the theory of elasticity, particularly the examination of the multiaxial strength of bodies, lacks a force quantity which would be in a direct relation with the relative elongation. It is advisable to base the definition of such a quantity on the corpuscular structure of bodies.

The contemporary discrete theory of elasticity, which takes into account the atomic structure of bodies, is based on the assumption that between two atoms of a loaded body a repulsive force acts in the direction of their decreasing distance,
and an attractive force acts in the direction of their growing distance (Fig. 7). Cottrell [3] has come to the conclusion that the relation between the change u of the distance of two atoms and the force F accompanying this change is linear. The equation expressing this relation can be written in the form

\[ F = c_k\,u, \quad(12) \]

where c_k is a proportionality coefficient.

Fig. 7: The force acting between two atoms is accompanied by a change in their distance

Cottrell considers the proportion between the force F and the longitudinal change u in eq. (12) to be Hooke's law. Actually, however, there is a considerable difference between this equation and Hooke's law. According to Hooke's law, the relative longitudinal change in the direction of the principal axis of a uniaxially loaded body is proportionate to the internal force acting on the given plane surface (surface A in Fig. 7) perpendicular to the same axis. The atoms, however, affect their whole volumes by their forces; in this process their shapes and volumes change, as the picture illustrates. Every atom has a large number of adjacent atoms. For instance, in a crystalline substance with a cubic face-centered structure, every atom has 12 equally distant nearest neighbours (Fig. 8).

Fig. 8: Cubic face-centered crystal structure

The load changes the distances among all the atoms, so that according to eq. (12) every increase in the distance of the given atom from its neighbour is accompanied by an attractive force, and every reduction of their distance by a repulsive force. Cottrell [3] expressed the stress state of the loaded body by the mutual force effect of all its atoms as follows: "The bulk elastic behaviour of large solid bodies is simply the aggregate effect of the individual deformations of the bonds in them, the applied force being transmitted from one loading point to another along the network of bonds running through the material." This statement reveals that the forces produced among the atoms by the stress state are not plane internal forces, but forces with a deformation effect, i.e. deformation forces.

Fig. 9 shows the force effect of the ambient atoms on a given atom and the changes in the distances of the atom centres in a biaxially compressed body. The boundaries of the electronic clouds of the atoms in the unloaded state are idealized by spherical surfaces deformed under the load like microscopic three-dimensional elements. The picture simplifies the actual spatial arrangement of the atoms in the crystal lattice by a two-dimensional arrangement, thus reducing also the actual number of atoms adjacent to the given atom. According to eq. (12), the changes in the distances of the centre of the given atom and all the adjacent atoms are proportionate to the forces originating among them in the process. The biaxially loaded body will become narrower in the direction of the principal axes x2, x3 and will elongate in the direction of x1. The spherical elements idealizing the boundaries of the electronic clouds of the atoms will narrow or elongate equally. In parallel with the deformation, the repulsive forces F2, F3 and the attractive force F1 will originate between the given atom and its neighbours.

Fig. 9: Schematic diagram of the forces produced by biaxial compression between a given atom and its neighbours
When we examine the resulting internal force F1 acting on the cross-section area A of the given atom perpendicular to x1, we must add the vector effect of all the interatomic forces on this surface. Generally speaking, the forces F affect the area A obliquely, only by the reduced values, for which it holds that

\[ F_{red} = F\cos\alpha. \quad(13) \]

The resulting internal force F1 is then the vector sum of all the partial internal forces acting on the given area from one side:

\[ F_1 = \sum_{i=1}^{n} F_{red,i}\cos\alpha_i = \sum_{i=1}^{n} F_i\cos^2\alpha_i \quad\text{for } i = 1 \text{ to } n. \quad(14) \]

The schematic diagram also reveals why in the case of biaxial compression the internal force in the direction x1 equals zero, although the body elongates in the same direction. This is due to the fact that both positive and negative internal forces participate in the magnitude of the internal force F1; the sum of their normal components is in this case the zero internal force.

The idealization of the forces acting on the given atom, accompanied by the changes in its distance from the adjacent atoms, gives only an approximate picture of the force effects among the atoms. This is due to the fact that the forces F_i are discrete quantities, cumulating the deformation force effect on the given three-dimensional element (atom) from a certain spatial angle into one vector, while the relative elongation changes continuously within the same spatial angle.
10c), the deformation force dd acts on the area of the element surface dasf � r 2d (for the radius r � 1 it is dd � dd1 and dasf � d ). for the quota of deformation force dd1 and the spherical area dasf � d affected by it we shall introduce the term deformation stress ( ) � � d d d d n mm sf 1d a d [ ]2 . (15) like the elementary deformation force, the deformation stress must also be positive in the direction of the positive elongation, negative in the direction of relative shortening, and equal to zero in the direction of zero elongation. when taking into account the corpuscular structure of bodies, the shear stress has no real meaning. the definition of deformation stress will reduce the relation (12) to the proportionality between stress and relative longitudinal change � �c � , (16) where c is the coefficient of proportionality between deformation stress and relative elongation. if we substitute �� from eq. (5) in eq. (16), we obtain the equation for deformation stress in a general direction of a uniaxially loaded body � � � � � � � c e �1 21( ) cos . (17) after the substitution of �� from eq. (10) in eq. (16), the deformation stress equation in general direction of a triaxially loaded body will acquire the form �� � � � � � � � � � � � � � � c e � � � 1 2 2 2 2 3 1 1 1 [( ) cos ] [( ) sin cos ] [( � �� � � �) sin sin ] .2 2 (18) the equation expresses the deformation force effect on the spherical element of a triaxially loaded body, for which we shall introduce the term deformation stress state (dss). by putting �� � 0 and � �� 0 2, or , we obtain from this equation the formulas for computation of the angles of zero deformation stresses in the planes x1 x2 and x1 x3 for a triaxially loaded body cos ( ) ( )( )( ) 2 0 12 1 3 2 1 21 � � � � � � � � � � � � � , (19a) 8 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 45 no. 1 /2005 czech technical university in prague fig. 10: strain and deformation stress state in a uniaxially loaded body cos ( ) ( )( )( ) 2 0 13 1 2 3 1 31 � � � � � � � � � � � � � . (19b) in a uniaxially loaded body, the directional cosine of zero elongation is obtained from eq. (5), where we shall substitute �� � 0 cos � � � 0 1 2 1 � � � � � � � . (20) a more adequate graphic representation of the deformation state and the deformation stress state than that given by figs. 10a and 10b is provided by the deformation and the deformation stress state curves (fig. 10d and 10e), which are a graphic representation of eqs. (5) and (17). the curves are geometrically similar. the definition of the continuous deformation force effect enables a refinement of the relation (14) between the internal force and the interatomic forces, based on the assumption of their linear effect. instead of summing up the normal components of the individual reduced interatomic forces acting on the surface perpendicular to the first principal axis, we shall integrate the normal components dn� of the reduced elementary deformation forces df� (fig. 10c) d d d dsfn f a� � � �� � �� � �cos cos cos . 2 2 (21) the normal force n1 identical with the resulting internal force f1, affecting surface a1 in this particular case, is f a1 2 2 2 2 � �� � � � � � � �cos cosd dsf . (22) in the spherical element of unit radius a1 � �, so that the first principal stress is �1 2 2 1 � �� �� � cos d . (23) if we substitute in eq. (23) � from eq. (17) for the uniaxial deformation stress state and put � � 0 25. 
which is in the theory of dss, according to [7] is poisson’s coefficient of linearly elastic materials, the solution yields the proportionality coefficient c � 2e, the substitution of which in eq. (16) yields � 3e� . (24) this equation holds for any, i.e. also for a triaxially loaded linearly elastic body. this is the fundamental relation of the deformation stress state theory, which we call the principle of proportionality: in any loaded linearly elastic body the deformation stress is proportionate to the relative longitudinal change. let us return to the deformation force effect on the spherical element in a uniaxially loaded body (fig. 10). in the space defined by conical surfaces of zero longitudinal deformation (eq. 20), the axis of which coincides with the first principal axis, only positive longitudinal changes take place. this means that within the volume of the element defined by this surface, the element is affected only by positive deformation forces. the remaining part of the element volume is loaded by negative deformation forces. the cross section through the spherical element (a1 � �) perpendicular to the first principal axis is affected on one side by internal force f1, which must be in equilibrium with the resultant of the deformation force effect on the adjacent half of the element volume. let us note that the half of the element above a1 is affected by both positive and negative deformation forces. the internal force f1 (eq. 22), consequently, cumulates not only the positive effect of the forces affecting the elongation of the element, but also the negative deformation force effect producing its compression. however, the elongation of the element and its rupture in the ultimate stress state is influenced only by the positive part of this force effect; the negative force effect produces transverse compression of the element. the principal hooke’s law (eq. 2a), consequently, does not represent the relation between the positive force effect on a1 and the relative elongation �1, but the relation characterizing the summary effect of the positive and negative force effect on a1. the deformation and the deformation force effect on the spherical element are independent of the orientation of the cross section surface. for a uniaxially loaded body this is clearly illustrated by fig. 11. the deformation stress state is not changed by any change of location of the cross section surface. however, every change in the position of the surface does change the magnitude and, usually, also the direction of the internal force. fig. 11a shows the equilibrium of the internal forces with reference to the surface perpendicular to the first principal axis (see also fig. 10b). if the surface is not perpendicular to any of the principal axes, the resulting stress affects it obliquely (fig. 11b). in this case we usually express the equilibrium of the forces with reference to the cross section surface by means of the normal and tangential components (n, t) of the resulting force f. if we select the cross section surface parallel with the first principal axis, the inter© czech technical university publishing house http://ctn.cvut.cz/ap/ 9 czech technical university in prague acta polytechnica vol. 45 no. 1 /2005 fig. 11: the internal force changes with a change in the position of the cross section surface, while the deformation stress does not change nal force f2 referred to it, is equal to zero (fig. 11c). 
this means that the positive part of the internal force f2(+) generated by the positive deformation force effect on the half of the element above the cross section surface is in equilibrium with the negative part of the internal force f2 (�) generated by the negative deformation force effect f f2 2 0( ) ( )� �� � . (25) although the relative longitudinal change in the direction of x2 has a value different from zero (� ��2 1� � ), the stress in the same direction is equal to zero. by analogy it is possible to explain also other stress states that are otherwise difficult to explain, in which the vector of relative elongation in one direction may have even an opposite orientation than the resulting stress vector (see table 1). 3 fields of practical application of deformation stress state theory deformation stress state theory covers an extensive field of the theory of elasticity. the limited scope of this paper can provide only basic information on the deformation stress state and its relation to the stress state expressed by classical stress state quantities stress and its components. it provides a new look at the stress state as a phenomenon accompanying deformation. the application of new deformation stress state quantities will enable the solution of a number of problems in the theory of elasticity for the solution of which the application of existing force quantities of stereomechanics is inadequate. these include, in particular: � analysis of triaxial strength and determination of the safety margin of elastic bodies, � analysis of the multiaxial strength of technical materials, � clarification of the origin of the deformation and failure of triaxially loaded elastoplastic bodies, � analysis of the strength and yield limit of three-dimensionally loaded elastoplastic bodies. of extraordinary significance for practical application is the formulation of the theory of failure of elastic and elastoplastic materials. the principal problem in existing hypotheses of the origin of failure is erroneous selection of the criterial quantity. although it is known that failure originates due to exceeding a certain limit state of a three-dimensional stress state, which can be defined only by a complete stress tensor, i.e. by all its components, most hypotheses use only its individual components as the criterion of failure. the theory of failure derived from deformation stress state theory is based on the following assumptions: � failure can be due only to rupture of the body; the assumption that the origin of failure is due to shearing or crushing, i.e. shear or compressive stress, is erroneous. � the origin of failure is influenced only by the positive part of the deformation stress state accompanied by positive three-dimensional deformation. the negative part of the deformation stress state affecting compression has no direct influence on the origin of failure. � the failure crack is always perpendicular to the vector of the maximum principal stress. for the positive part of the maximum principal stress �1(+) we have introduced the term effective stress �, and for the positive part of the maximum internal force f1(+) the term effective force f�. the effective stress equation is obtained from eq. (23), in which the solid angle of the deformation stresses with reference to the cross section surface (a � �) is limited to the solid angle of the positive deformation force effect (+) (fig. 12). � � � � 1 2 � �� cos ( ) d . (26) the effective stress indicates the magnitude of the stress state. 
Failure originates in any three-dimensional stress state by exceeding the same maximum effective stress value – the effective strength f_ef, i.e. the ultimate positive part of the resulting stress referred to the surface perpendicular to the principal axis of maximum elongation. The criterion of failure based on deformation stress state theory is expressed as follows: failure occurs at that point of the body in which the maximum effective stress σ_ef,max attains the effective strength f_ef. This criterion yields the limit condition

\[ \sigma_{ef,\max} = f_{ef}. \quad(27) \]

The effective strength is obtained from any classical strength value (preferably the strength in uniaxial compression or tension) obtained by tests and expressed by the classical stress.

Fig. 12 shows the deformation force effect under various loads applied to a spherical element. Fig. 12a shows schematically the deformation stress state of a biaxially compressed element. The positive part of the deformation stress state is limited by the conical surface of zero deformation stress σ_Ω = 0, which is also the surface of zero relative elongation ε = 0. The picture also explains why in the case of biaxial compression the body ruptures in the direction of zero stress. The rupture is due to the fact that the value of the positive part of the internal force is equal to the absolute value of its negative part (eq. 25). Although the internal force is equal to zero, the body ruptures as soon as the effective stress attains the value of the effective strength.

Fig. 12: Schematic diagram of the deformation force effect on the spherical element in the classical stress state

Figs. 12b and 12c show schematically the deformation stress state of the element subjected to uniaxial tension and of the element subjected to uniform all-direction tension, respectively. The small diagrams attached to the pictures show that for the maximum deformation of the same magnitude Δ1 (and, consequently, also the maximum relative elongation ε1), the effective force F_ef is smallest under biaxial compression and greatest under uniform all-direction tension. This explains why, when the effective strength f_ef has been attained, the relative elongation is highest under biaxial compression and lowest under all-directional tension.

References

[1] Bridgman, P. W.: The Physics of High Pressure, chapter 4. London, G. Bell and Sons Ltd., 1954.
[2] Bridgman, P. W.: Collected Experimental Papers, papers 1–14. Cambridge, Massachusetts, Harvard University Press, 1964.
[3] Cottrell, A. H.: The Mechanical Properties of Matter. New York, London, Sydney, J. Wiley and Sons, Inc., 1964.
[4] Novák, O. et al.: "Strength of Materials in Building." Technical Guidebook Vol. 3 (in Czech). Prague, SNTL Publishers, 1963.
[5] Servít, R.: "Statics of Building Structures." Technical Guidebook (in Czech). Prague, SNTL Publishers, 1973.
[6] Servít, R.: Strength of Materials in Construction, Vol. II – Theory of Failure (in Czech). Prague, SNTL Publishers, 1966.
[7] Skokánek, J.: "Deformation Stress State Theory." (in Czech). Prague, 2004, unpublished.

Ing. Jiří Skokánek, CSc.
Autorizovaný inženýr pro statiku a dynamiku staveb
phone: +420 241 727 378
Korandova 37
147 00 Praha 4, Czech Republic
signal-to-noise ratio improvement based on the discrete wavelet transform in ultrasonic defectoscopy

v. matz, m. kreidl, r. šmíd

abstract: in ultrasonic testing it is very important to recognize the fault echoes buried in a noisy signal. the fault echo characterizes a flaw in the material. an important requirement on ultrasonic signal filtering is zero time shift, because the position of the ultrasonic echoes is essential. this requirement is accomplished using the discrete wavelet transform (dwt), which is used for improving the signal-to-noise ratio. this paper evaluates the quality of filtering using the discrete wavelet transform. additional computer simulations of the proposed algorithms are presented.

keywords: ultrasonic testing, discrete wavelet transform, de-noising algorithms.

1 introduction

ultrasonic non-destructive testing is used for detecting flaws in materials. ultrasound uses the transmission of high-frequency sound waves in a material to detect a discontinuity or to locate changes in material properties. the most commonly used ultrasonic testing technique is pulse echo, where sound is introduced into a test object and the reflections (echoes) from internal imperfections or from the geometrical surfaces of the part are returned to a receiver. the optimum frequency of the acoustic wave, appropriate for detecting a specific discontinuity, provides the highest signal-to-noise ratio (snr). there are several sources of noise that can hide a fault. a common source of noise is the electronic circuitry used for processing the ultrasonic signal, together with scattering at the inhomogeneities in the structure of a grainy material. the amplitude of the fault echoes can be smaller than the amplitude of the noise, and the noise can totally mask the echoes characterizing faults. this case is undesirable, because we cannot correctly identify flaws in the material. the most frequent use of ultrasonic testing is for weld inspection. in welds there is a high probability of cracking. the places where the flaws are located have to be uniquely determined. for this determination we have to use a method for reducing the ultrasonic signal noise. the best method for reducing noise while ensuring zero time shift of the ultrasonic echoes is the discrete wavelet transform (dwt) [1].

2 filtering method based on the discrete wavelet transform

the wavelet transform is a multiresolution analysis technique that can be used to obtain the time-frequency representation of the ultrasonic signal. the continuous wavelet transform (cwt) is computed by changing the scale of the analysis window, shifting the window in time, multiplying by the signal, and integrating over all times. the continuous wavelet transform is defined by:

$$\mathrm{CWT}_x^{\psi}(\tau,s) = \frac{1}{\sqrt{|s|}}\int_{-\infty}^{\infty} x(t)\,\psi^{*}\!\left(\frac{t-\tau}{s}\right)\mathrm{d}t ,$$  (1)

where x(t) is the input signal, τ is the translation, s is the scale and ψ(t) is the transforming function called the mother wavelet. the scaled and translated mother wavelet is given by:

$$\psi_{\tau,s}(t) = \frac{1}{\sqrt{|s|}}\,\psi\!\left(\frac{t-\tau}{s}\right) .$$  (2)

dwt coefficients are usually sampled from the cwt on a dyadic grid; choosing the translation parameter $\tau = n\,2^{m}$ and scale $s = 2^{m}$, it is possible to define the mother wavelet in the dwt as:

$$\psi_{m,n}(t) = \frac{1}{\sqrt{2^{m}}}\,\psi\!\left(\frac{t - n\,2^{m}}{2^{m}}\right) .$$  (3)

the dwt [2, 3] analyzes the signal by decomposing it into its coarse and detail information, which is accomplished by using successive high-pass and low-pass filtering operations, on the basis of the following equations:

$$y_{\mathrm{high}}(k) = \sum_{n} x(n)\,g(2k-n) , \qquad y_{\mathrm{low}}(k) = \sum_{n} x(n)\,h(2k-n) ,$$  (4)

where $y_{\mathrm{high}}(k)$ and $y_{\mathrm{low}}(k)$ are the outputs of the high-pass and low-pass filters with impulse responses g and h, respectively, after subsampling by 2. this procedure is repeated for further decomposition of the low-pass filtered signals. starting from the approximation and detailed coefficients, the inverse discrete wavelet transform reconstructs the signal, inverting the decomposition step by inserting zeros and convolving the results with the reconstruction filters. the discrete wavelet transform [4, 5] can be used as an efficient filtering method for families of signals that have only a few nonzero wavelet coefficients for a given wavelet family. this is fulfilled for most ultrasonic signals.
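a single analysis step of eq. (4) is just filtering followed by decimation. the sketch below shows that step in plain numpy; the filter pair g, h is a placeholder that would normally come from a wavelet family, and the boundary handling is simplified compared to a production implementation.

    import numpy as np

    # one dwt analysis step as in eq. (4): convolve with the high-pass (g)
    # and low-pass (h) analysis filters, then keep every second sample.

    def dwt_step(x, g, h):
        detail = np.convolve(x, g)[1::2]   # y_high: detail coefficients
        approx = np.convolve(x, h)[1::2]   # y_low: approximation coefficients
        return detail, approx

    # repeated application to the approximation branch yields the n-level
    # decomposition used by the de-noising procedure described below.
    def decompose(x, g, h, levels):
        details = []
        for _ in range(levels):
            d, x = dwt_step(x, g, h)
            details.append(d)
        return details, x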
the standard filtering (also called de-noising) procedure affects the signal in both frequency and amplitude, and involves three steps. the basic version of the procedure consists of:
a) decomposition of the signal using the dwt into n levels, using bandpass filtering and decimation to obtain the approximation and detailed coefficients,
b) thresholding of the detailed coefficients (see fig. 1),
c) reconstruction of the signal from the detailed and approximation coefficients using the inverse transform (idwt).

for decomposition of the signal it is very important to choose a suitable mother wavelet. the shape of the mother wavelet has to be very similar to the ultrasonic echo. it has to fulfill the following properties: symmetry, orthogonality and feasibility for the dwt. a group of mother wavelets was tested: haar's wavelet, the discrete meyer wavelet, daubechie's wavelet and coiflet's wavelet. the best results were obtained with the discrete meyer wavelet. in the following study, only this mother wavelet was used. in the proposed procedure, local thresholding of the detailed coefficients was used [6, 7]. we computed the threshold at each level of decomposition from the detailed coefficients, and this value was used for thresholding at the same level. we evaluated the common thresholding methods implemented in the matlab wavelet toolbox [8] (rigrsure, sqtwolog, heursure, minimaxi) and, due to the unsatisfactory results, we proposed a new method based on the standard deviation. the local threshold at every level of decomposition is given by

$$\mathrm{threshold} = k\,\sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(\mathrm{dc}_i - \overline{\mathrm{dc}}\right)^{2}} ,$$  (5)

where k is a coefficient related to the crest factor of the filtered signal, dc is the vector of detailed coefficients at each level, and n is the length of each set of detailed coefficients.

fig. 1: decomposition and thresholding

the main idea of dwt filtering involves replacing small wavelet coefficients by zero, and keeping the coefficients with an absolute value above the threshold. this type of thresholding is called hard thresholding [7], and is used in our study.
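steps a)–c) together with the local threshold of eq. (5) can be sketched compactly with the pywavelets library; 'dmey' is pywavelets' discrete meyer wavelet, and the coefficient k is left as a free parameter exactly as in eq. (5). this is our illustration of the procedure, not the authors' implementation.

    import numpy as np
    import pywt  # pywavelets

    # de-noising steps a)-c): decompose into n levels, hard-threshold each
    # detail level with k * standard deviation of that level (eq. (5)),
    # then reconstruct with the inverse transform.

    def denoise(x, k=3.0, levels=5, wavelet="dmey"):
        coeffs = pywt.wavedec(x, wavelet, level=levels)
        approx, details = coeffs[0], coeffs[1:]
        thresholded = [
            pywt.threshold(d, k * np.std(d, ddof=1), mode="hard")  # eq. (5)
            for d in details
        ]
        return pywt.waverec([approx] + thresholded, wavelet)

np.std(d, ddof=1) uses the (n − 1) denominator of eq. (5), and mode="hard" zeroes coefficients below the threshold while leaving the rest untouched, which preserves the echo amplitude and, together with the dwt structure, the zero-time-shift property required here.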
to evaluate the noise reduction we used a signal measured on the grainy material used for constructing airplane engines (see fig. 2). the ultrasonic signal from this material is very noisy. the noise is partially caused by scattering at the grains in the structure of the material. the arrow in fig. 2 shows the place where the measurement was conducted. fig. 3 shows the raw signal at the place where crack no. 1 was located. this crack was artificially created. fig. 3 shows the noise reduction, but the sources of this noise are not fully known. to determine the filtering quality, a standard k1 calibration gauge was used. the gauge is made of a homogeneous material, so the noise can be estimated and fully described. we made a measurement of the ultrasonic signal which took into consideration only the back-wall echo. for a comparison with the previous signal, we composed an artificial fault echo from a properly scaled back-wall echo. in our study the amplitude of the fault echo was changed from 5 % to 100 %. to evaluate the filtration quality, the signal-to-noise improvement ratio is used:

$$\Delta_{\mathrm{fnr}} = 20\,\log\frac{n_{\mathrm{ef}}}{f_{\mathrm{ef}}}\ \ [\mathrm{dB}] ,$$  (6)

where $n_{\mathrm{ef}}$ is the root mean square value of the noisy part of the raw signal and $f_{\mathrm{ef}}$ is the root mean square value of an adequate part of the filtered signal.

fig. 2: material used for constructing airplane engines (a – drawing of the gauge with holes 1–7, section a–a'; b – x-ray image)
fig. 3: filtering of an ultrasonic signal using the dwt with the threshold based on standard deviation
fig. 4: filtering of an ultrasonic signal without fault echo
fig. 5: filtering of an ultrasonic signal with 100 % fault echo
fig. 6: filtering of an ultrasonic signal – fault echo with amplitude 132.6 % of back-wall echo
fig. 7: filtering of an ultrasonic signal – fault echo with relative amplitude of 142.8 % of back-wall echo

3 experimental results

for filtering the ultrasonic signal measured on the k1 calibration gauge we used the same filtering technique based on the dwt as in the previous case. the following figures show the filtered signal without fault echo (see fig. 4) and with a fault echo (see fig. 5) which has the same amplitude (100 %) as the back-wall echo. a value of 1020 % of the effective noise value corresponds to 100 % of the back-wall echo. the results presented in table 1 show that the noise reduction value varies from 17 to 20 db. for a relative amplitude of 132.6 % no fault echo can be identified, but for relative amplitudes higher than 132.6 % the fault echo can be recognized. the results for relative amplitudes of 132.6 % and 142.8 % are depicted in fig. 6 and fig. 7. the arrows indicate the fault echo that can be hidden by noise.

table 1: noise reduction for different amplitudes of fault echo
amplitude [%]   Δfnr [db]      amplitude [%]   Δfnr [db]
51              17.7           357             18.4
102             17.7           408             18.4
112.2           17.7           459             18.3
122.4           17.7           510             18.3
132.6           17.7           561             18.3
142.8           17.7           612             18.3
153             17.7           663             18.2
163.1           17.7           714             18.2
173.4           17.7           765             19.0
183.5           17.7           816             19.0
194             17.7           867             18.9
204             17.6           918             19.7
255             17.6           968             19.7
306             17.9           1020            19.6

the proposed algorithm based on filtering using the discrete wavelet transform was tested on data measured on two materials: a k1 calibration gauge and a construction material used in airplane engines. a simulated fault echo, created as an artificially scaled-down copy of the back-wall echo, was inserted in the raw signal. the results of the measurements are shown in fig. 6 and fig. 7.
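eq. (6) is a direct ratio of two rms values and is trivial to compute once the noisy and filtered segments are identified; the short sketch below does exactly that (variable names are ours).

    import numpy as np

    # signal-to-noise improvement ratio of eq. (6): the arguments are the
    # noisy part of the raw signal and the corresponding part of the
    # filtered signal.

    def rms(x):
        return np.sqrt(np.mean(np.square(x)))

    def snr_improvement_db(raw_noise_segment, filtered_segment):
        return 20.0 * np.log10(rms(raw_noise_segment) / rms(filtered_segment))

with the values reported in table 1, a noise-only segment whose rms drops by a factor of about 7.7 after filtering corresponds to roughly 17.7 db.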
4 conclusion

this paper describes a method for filtering an ultrasonic signal using the discrete wavelet transform. for thresholding we used a novel thresholding technique based on the standard deviation of the dwt coefficients. this method provides the best results for the filtering of simulated and real ultrasonic signals. the noise reduction for a signal without fault echo is 18.56 db. for signals with a simulated fault echo the noise reduction ratio was from 17.65 db to 19.72 db. we also investigated improvements in the sensitivity of fault detection. our method allows identification of faults with relative amplitude higher than 132.6 % of the effective noise value.

5 acknowledgment

this research work has received support from research program no. msm210000015 "research of new methods for physical quantities measurement and their application in instrumentation" of the czech technical university in prague (sponsored by the ministry of education, youth and sports of the czech republic).

references
[1] šmíd, r., matz, v., kreidl, m.: "ultrasonic signal filtering." defektoskopie 2003. praha: česká společnost pro nedestruktivní testování, 2003, isbn 80-214-2475-3, p. 259–262.
[2] edwards, t.: discrete wavelet transforms: theory and implementation. technical report, stanford university, september 1992.
[3] louis, a. k., maaß, p., rieder, a.: wavelets: theory and applications. john wiley and sons ltd., england, 1997.
[4] mallat, s.: a wavelet tour of signal processing. academic press, 1999.
[5] kreidl, m. et al.: diagnostic systems (in czech). czech technical university in prague, prague, 2001. 352 p. isbn 80-01-02349-4.
[6] polikar, r. et al.: "frequency invariant classification of ultrasonic weld inspection signals." ieee trans. on ultrasonics, ferroelectrics, and frequency control, vol. 45, no. 3, may 1998.
[7] polikar, r.: the engineer's ultimate guide to wavelet analysis. iowa state university of science and technology, 1999, http://users.rowan.edu/~polikar/wavelets/.
[8] misiti, m., misiti, y., oppenheim, g., poggi, j.-m.: wavelet toolbox for use with matlab, user's guide, version 2. the mathworks, inc., 2002.

ing. václav matz
phone: +420 224 352 346
e-mail: vmatz@email.cz
doc. ing. marcel kreidl, csc.
phone: +420 224 352 117
e-mail: kreidl@feld.cvut.cz
ing. radislav šmíd, ph.d.
phone: +420 224 352 131
e-mail: smid@feld.cvut.cz
department of measurement
czech technical university in prague
faculty of electrical engineering
technická 2
166 27 praha 6, czech republic
determination of rheological parameters from measurements on a viscometer with coaxial cylinders – choice of the reference radius

f. rieger

abstract: knowledge of rheological behavior is necessary in engineering calculations for equipment used for processing concentrated suspensions and polymers. power-law and bingham models are often used for evaluating the experimental data. this paper proposes the reference radius to which experimental results obtained by measurements on a rotational viscometer with coaxial cylinders should be related.

keywords: viscometer with coaxial cylinders, power-law fluids, bingham plastics.

1 introduction

in a recent paper [1], the procedure for determining the rheological parameters from measurements on a viscometer with coaxial cylinders (see fig. 1) was proposed on the basis of flow analysis. the relations for calculating the consistency coefficient k for power-law fluids and the yield stress $\tau_0$ for bingham plastics were reported. these relations were derived for three reference radii – the inner radius, the mean radius, and the radius presented by klein [2]. however, it is possible to find a radius $r_r$ at which the newtonian and non-newtonian shear rates are the same. if the experimental data are related to this radius, k and $\tau_0$ can be obtained directly as the intercept of the straight line fitted to the measured data $\tau = f(\dot\gamma)$.

fig. 1: viscometer with coaxial cylinders

2 solution

a) power-law fluids

the following equation was derived for the shear rate (eq. (9) in [1]):

$$\dot\gamma(r) = \frac{2\omega}{n\left(1-\beta^{2/n}\right)}\left(\frac{r_1}{r}\right)^{2/n} .$$  (1)

inserting n = 1, the following equation can be obtained for the shear rate in newtonian fluids:

$$\dot\gamma_N(r) = \frac{2\omega}{1-\beta^{2}}\left(\frac{r_1}{r}\right)^{2} .$$  (2)

the two values are the same at $r = r_r$, and using eqs. (1) and (2) we get

$$\frac{r_1}{r_r} = \left[\frac{n\left(1-\beta^{2/n}\right)}{1-\beta^{2}}\right]^{\frac{n}{2(1-n)}} .$$  (3)

from this equation it can be seen that the $r_1/r_r$ ratio depends on n and β. the dependence of $r_1/r_r$ on n for selected values of the ratio β is shown in fig. 2.

fig. 2: dependence of r1/rr on n for selected values of ratio β (β = 0.2, 0.5, 0.8)

comparison of $r_1/r_r$ with the ratio of $r_1$ to the mean radius presented by klein [2],

$$r_k = \sqrt{\frac{2\,r_1^{2}\,r_2^{2}}{r_1^{2}+r_2^{2}}} ,$$  (4)

for β = 0.8 is shown in fig. 3. this figure shows that $r_k$ represents the mean value of $r_r$ over the presented interval of n, and for this reason the deviation of the ratio $k/k_k$ is relatively small, as was shown in [1] (see figs. 12 and 13 in [1]).

fig. 3: dependence of r1/rk resp. r1/rr on n

b) bingham plastics

combining eqs. (3), (11) and (13) presented in [1], the following equation for the shear rate can be obtained:

$$\dot\gamma(r) = \left(\omega + \frac{\tau_0}{\mu_p}\,\ln\frac{1}{\beta}\right)\frac{2}{1-\beta^{2}}\left(\frac{r_1}{r}\right)^{2} - \frac{\tau_0}{\mu_p} .$$  (5)

again we can find the radius $r_r$ at which the newtonian and bingham shear rates are the same by comparing equation (5) with the corresponding relation for a newtonian fluid (2), and we get

$$\frac{r_1}{r_r} = \sqrt{\frac{1-\beta^{2}}{2\,\ln(1/\beta)}} .$$  (6)

from this equation it can be seen that the ratio $r_1/r_r$ depends only on β. the graphical form of this dependence is shown in fig. 4.
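both reference-radius ratios are closed-form expressions and easy to evaluate directly from eqs. (3) and (6); the sketch below does so (function names are ours, and eq. (3) is written for n ≠ 1, since at n = 1 the power-law and newtonian shear rates coincide at every radius and $r_r$ is not defined).

    import numpy as np

    # reference radius ratios r1/rr; beta = r1/r2 is the radius ratio.

    def r1_over_rr_power_law(n, beta):
        # eq. (3); requires n != 1
        rhs = n * (1.0 - beta**(2.0 / n)) / (1.0 - beta**2)
        return rhs**(n / (2.0 * (1.0 - n)))

    def r1_over_rr_bingham(beta):
        # eq. (6)
        return np.sqrt((1.0 - beta**2) / (2.0 * np.log(1.0 / beta)))

as a consistency check, r1_over_rr_power_law(0.5, 0.8) gives about 0.906, which lies in the range plotted in figs. 2 and 3 for β = 0.8.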
fig. 4: dependence of r1/rr on β

3 conclusion

on the basis of the above paragraphs, the following procedure for evaluating the experimental data can be recommended:
1) if the logarithmic plot of the shear stress $\tau_1$ against the newtonian shear rate $\dot\gamma_{1N}$ is linear (the slope is equal to the flow index n), the power-law model can be used and we can calculate $r_1/r_r$ from eq. (3). if the plot of the shear stress $\tau_1$ against the newtonian shear rate $\dot\gamma_{1N}$ is linear (in linear coordinates), the bingham model can be used and we can calculate $r_1/r_r$ from eq. (6).
2) the shear stresses and shear rates related to the radius $r_r$ can be calculated from the experimental values $\tau_1$ and $\dot\gamma_{1N}$ using the following relations:

$$\tau_r = \tau_1\left(\frac{r_1}{r_r}\right)^{2} ,$$  (7)

$$\dot\gamma_r = \dot\gamma_{1N}\left(\frac{r_1}{r_r}\right)^{2} .$$  (8)

3) if the logarithmic plot of the shear stress $\tau_r$ against the shear rate $\dot\gamma_r$ is linear, the consistency coefficient k is the intercept and the flow index n is the slope of the straight line. if the plot of the shear stress $\tau_r$ against the shear rate $\dot\gamma_r$ is linear (in linear coordinates), the yield stress $\tau_0$ is the intercept and the plastic viscosity $\mu_p$ is the slope of the straight line.

4 nomenclature
k – coefficient of consistency
l – length of cylinder
n – flow index
r – radial coordinate
r1 – inner (rotating) cylinder radius
r2 – outer (stationary) cylinder radius
rk – radius presented by klein
rr – radius at which the newtonian and non-newtonian shear rates are the same
γ̇ – shear rate
β – ratio r1/r2
μp – plastic viscosity
ω – angular velocity
τ – shear stress
τ0 – yield stress
subscripts: 1 – at radius r1; r – at radius rr; n – newtonian

references
[1] rieger, f.: determination of rheological parameters from measurements on a viscometer with coaxial cylinders. acta polytechnica, vol. 46 (2006), p. 42–51.
[2] klein, g.: basic principles of rheology and the application of rheological measurement methods for evaluating ceramic suspensions. in: ceramic forum international yearbook 2005 (edited by h. reh). baden-baden: göller verlag, 2004, p. 31–42.

prof. ing. františek rieger, drsc.
phone: +420 224 352 548
e-mail: frantisek.rieger@fs.cvut.cz
czech technical university in prague
faculty of mechanical engineering
technická 4
166 07 praha 6, czech republic

robust watermarking of video streams

t. polyák

abstract: in the past few years there has been an explosion in the use of digital video data. many people have personal computers at home, and with the help of the internet users can easily share video files on their computers. this makes possible the unauthorized use of digital media, and without adequate protection systems the authors and distributors have no means to prevent it. digital watermarking techniques can help these systems to be more effective by embedding secret data right into the video stream. this makes minor changes in the frames of the video, but these changes are almost imperceptible to the human visual system. the embedded information can involve copyright data, access control etc. a robust watermark is resistant to various distortions of the video, so it cannot be removed without affecting the quality of the host medium. in this paper i propose a video watermarking scheme that fulfills the requirements of a robust watermark.

keywords: robust watermarking, video streaming.

1 about watermarking

watermarking is a type of steganography, with the help of which we can embed information into a host medium, in our case into a video stream. its main task is to protect the medium. to embed the information we use a key (stego key), and the same key is used during the watermark detection process [1]. the watermarking process consists of the following steps (fig. 1):
– creation of the watermark (w): coding the data to embed (d) into a format that is acceptable as input for the embedding algorithm.
– using the stego key (s), we embed the watermark into the host medium (m). the medium which contains the watermark is called a stego medium (mw).
– the medium passes through a transmission channel, where it can suffer various distortions (intended or unintended). we will refer to this as the distorted medium (mw').
– using the stego key again, we search for the hidden watermark in the medium. the watermark has probably suffered distortions too, and we have to watch for this during retrieval. some algorithms use the original medium, but this is not necessary. if we detect the watermark without the original medium, we speak about blind watermarking.
– we decode the retrieved watermark, and get the embedded data.
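the role of the stego key in this pipeline is that embedder and detector must derive exactly the same hiding positions from it. a minimal sketch of that idea, assuming the key seeds a pseudo-random selection of frame blocks (the seeding scheme and names are ours, not the paper's):

    import numpy as np

    # the stego key seeds a prng; running the same call with the same key
    # on the detector side yields identical block positions, which is what
    # makes blind detection possible without the original medium.

    def select_blocks(stego_key: int, total_blocks: int, needed: int):
        rng = np.random.default_rng(stego_key)
        return rng.choice(total_blocks, size=needed, replace=False)

a wrong key yields an uncorrelated selection, so the watermark stays hidden from anyone without the key.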
2 requirements of watermarking

if we want to design a watermarking system to protect our medium, we must take numerous requirements into account [2].

fig. 1: block scheme of watermarking

– robustness: this refers to the resistance of the watermark to various distortions and/or attacks. there are many types of attacks that can impact a watermark: clipping, filtering, noise adding, etc. the watermark should not be removable without significant loss of quality.
– imperceptibility: the watermark cannot be seen. this means that it can only make changes that are almost invisible to the human visual system, and it should not affect the quality of the video very much.
– payload: different applications need watermarks of different sizes. for copy protection, one bit is usually enough, but if we want to store copyright information, we need more bits.

the requirements mentioned above are slightly conflicting: if we maximize robustness, it will worsen imperceptibility; if we maximize imperceptibility, we can only hide a few bits. so we have to find the correct relationship between them (fig. 2).

3 overview of the algorithm

the algorithm is designed to embed a watermark in small-resolution (176×144 pixels) video files that can be played on mobile devices. these devices have limited computational capacities, so the algorithm should not be too complex. the algorithm is based on the dittmann algorithm, which is designed to watermark mpeg video files [3]. following the mpeg dct coding, dittmann uses 8×8 pixel blocks to hide information. our algorithm better fits the needs of mobile devices. to improve robustness, the algorithm uses larger blocks than dittmann's algorithm. according to the key, different pattern blocks are added to or subtracted from the selected blocks of the video frames. these pattern blocks are not computed in real time (to reduce the calculation requirements); they are read from a file. the embedding algorithm obtains the following data as input; these are the free parameters of the algorithm:
– the stego key.
– the information to embed.
– the strength of the watermark (n): this shows the amount by which the pixel luminance values are modified. obviously, if we select a high value we improve robustness but worsen quality, and vice versa.
– pattern blocks: these are 0–1 blocks, and we hide the information with the help of these.
first they are filled with random 0–1 values, then we eliminate the high frequencies. after this, there will be large 0 and 1 islands, so that during watermark detection the patterns can be more easily recognized, and the watermark will be more resistant to attacks.

3.1 embedding process

each frame of the video is watermarked. when we embed the watermark in a frame, first we select the blocks of the frame which will contain the information bits (each block holds one bit of information) and the pattern blocks used for them. the information is hidden in the luminance values of the pixels. the method is as follows:
– scaling the values of the pattern block: where the value of the chosen pattern is 1, we select n (n is the strength of the watermark), and where it is 0, the selected value is –n.
– the scaled patterns are added to the frame blocks: if we want to hide 1 in the block, then we add the pattern to it. if we want to hide 0, we subtract it.

the characteristics of the embedded watermark are shown in fig. 3. there are eight watermarked blocks in the frame. a high strength (n) is used for better visibility. to improve robustness, the same data is hidden in five frame blocks. the algorithm also places synchronization information, so it can resist attacks in the time domain, e.g., frame dropping.

3.2 detection process

during detection, the same blocks and pattern blocks are chosen as in the embedding process. then the algorithm reads the bits of the blocks. it sums the pixel values according to the pattern (if the pattern value is 1, the pixel value is added to the sum; if it is 0, it is subtracted from it). due to the correlation of the bits, the sum should be around 2n or –2n. if the sum is positive, the hidden bit was 1, otherwise it was 0.
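the embedding and detection rules of sections 3.1–3.2 reduce to a few lines of array arithmetic. the sketch below is our paraphrase of them, with the host-image content assumed to roughly cancel under the pattern correlation; names are ours.

    import numpy as np

    # embed one bit into a luminance block by adding/subtracting the
    # pattern scaled to +/-n (section 3.1), and detect it by correlating
    # the block with the same pattern (section 3.2).

    def embed_bit(block, pattern, bit, n):
        delta = np.where(pattern == 1, n, -n)
        signed = delta if bit == 1 else -delta
        # clip to stay inside the 8-bit luminance range
        return np.clip(block.astype(np.int32) + signed, 0, 255)

    def detect_bit(block, pattern):
        # pixels under pattern value 1 are added, pixels under 0 subtracted;
        # the host content roughly cancels, the embedded +/-n accumulates
        b = block.astype(np.int64)
        corr = np.where(pattern == 1, b, -b).sum()
        return 1 if corr > 0 else 0

repeating the same bit over five blocks, as the paper does, then amounts to a majority vote over the detect_bit outputs.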
4 test results

the tests of the watermark were of two main types: imperceptibility tests and robustness tests. in the imperceptibility tests i studied the effect of watermarks of various strengths on the quality of the video. i put the watermark into the video stream, compared it to the original, and calculated the psnr value caused by the watermark. sample frames can be seen in figs. 4–7, and the test results in fig. 8. in the robustness tests i submitted the watermarked videos to various attacks and distortions (cropping, filtering, noise adding, format conversions, etc.). for the strength of the watermarks i used the values 1, 3, 5, 8 and 10. after attacking the images i examined how many bits were read correctly by the detection algorithm. some test results are shown in fig. 9.

fig. 2: blockscheme of watermarking
fig. 3: effect of the watermark on a grey frame
fig. 4: original video
fig. 5: watermarked video (n=3)
fig. 6: watermarked video (n=5)
fig. 7: watermarked video (n=10)
fig. 8: psnr values of video streams watermarked with different strengths
fig. 9: correct bit ratio after attacks (adaptive median filtering, downsampling, format conversion)

5 summary

the test results show that there are appropriate settings of the algorithm which can ensure robustness of the watermark and low loss of the original quality of the video. the algorithm is quite simple, it does not use complex calculations, so it can be used on mobile devices.

references
[1] furht, b., kirovski, d.: multimedia security handbook. crc press, 2005.
[2] hartung, f., kutter, m.: multimedia watermarking techniques. proceedings of the ieee, vol. 87 (1999), no. 7, p. 1079–1107.
[3] dittmann, j., stabenau, m., steinmetz, r.: robust mpeg video watermarking technologies, 1998.

tamás polyák
e-mail: polyak.tamas@tmit.bme.hu
dept. of telecommunication and mediainformatics
budapest university of technology and economics
műegyetem rkp. 3–9
budapest, hungary

high-speed real-time simulators for engineering design

r. e. crosbie, n. g. hingorani

abstract: the use of computer simulations is now an established technique in engineering design. many of these simulations are used to predict the expected behavior of systems that are not yet built, or of existing systems in modes of operation, such as catastrophic failure, in which it is not feasible to test the real system. another use of computer simulations is for training and testing purposes, in which the simulation is interfaced to real hardware, software and/or a human operator and is required to operate in real time. examples are plant simulators for operator training or simulated environments for testing hardware or software components. the primary requirement of a real-time simulation is that it must complete all the calculations necessary to update the simulator outputs, as well as all the necessary data i/o, within the allotted frame time. many real-time simulations use frame times in the range of a few milliseconds and greater. there is an increasing number of applications, for example in power electronics and automotive systems, in which much shorter frame times are required. this paper reviews some of these applications and the approaches to real-time simulation that can achieve frame times in the range 5 to 100 microseconds.

keywords: computer simulation, real-time, engineering design, digital signal processors.

1 introduction

the use of computer simulations is now an established technique in engineering design. developed originally for military applications during and immediately after the second world war, the use of computers (analog and digital) to simulate both existing and proposed systems spread to many fields of application, including engineering design, in the immediate post-war years. in the first two or three decades after the war, analog computers were favored for simulations applied to engineering design. their ability to solve differential equations rapidly and cheaply ensured their dominance over the digital computers of the time. as the power and cost-effectiveness of digital computers quickly increased, they eventually replaced the analog computer, which was handicapped by low accuracy, difficult programming and setup, and high maintenance demands. the development of the personal computer put the final nail in the coffin of analog computers, as software was developed that allowed the personal computer to be used as a low-cost, flexible, accurate simulation tool. in recent years, all kinds of computers, from laptops to supercomputers, have been used to support an ever increasing variety of computer simulations, including many applications to engineering design. many software packages have been developed to support these simulations. there are several reasons for using simulations to improve the design process, affecting different phases of the engineering life cycle from conceptual design, through detailed design, testing, training and diagnosis of faulty operation, to decommissioning. in many cases, the simulation is required to produce time-domain or frequency-domain data representing the behavior of the system under different operating conditions. these can be used, for example, to confirm that a proposed design is likely to perform within specifications. this is particularly useful for evaluating systems operating under extreme fault conditions, such as the emergency shutdown of a plant after failure of a coolant pump, or the handling of an aircraft following an engine failure. simulation can also be used to try to reproduce and hence diagnose faulty system operation. none of these applications requires the simulation to execute in exactly the same time as that in which the system operates. it is often convenient if they do not. a simulation of a slowly varying phenomenon that takes minutes or hours to complete, such as the slow build-up of fission products in a nuclear reactor, benefits from simulations that execute much faster than real time. a simulation of a very rapid phenomenon, such as the operation of electronic switches, will execute more slowly than real time to allow observation of the changes taking place. it is not even necessary for the simulation to execute in a strictly time-scaled fashion, i.e. n times faster or slower than real time. some sections of a simulation may run more slowly than others, depending on the amount of computation required to complete a given time period. this is normally quite acceptable.
there are, however, applications of simulation in which strict time synchronism with the operation of the simulated system is required. these are known as real-time simulations. they are used whenever the simulation is combined with real system components or human operators. there are significant differences between non-real-time and real-time simulations that go beyond the need for real-time synchronization. a real-time simulation is always interfaced to external hardware, software, human operators or a combination of all three. the nature of the interface depends, of course, on the nature of the external subsystem. it may be digital or analog and, for human operators in training simulators, it may involve the construction of a realistic system control center, such as an aircraft flight deck or a plant control room. this means that in addition to performing the computations that solve the equations describing the behavior of the model, the simulation program must also handle the time-synchronized input and output of data via the real-time interfaces. it is very common for a real-time simulation of this kind to be built upon a real-time operating system (rtos) that is capable of timing the execution of the simulation program appropriately. one of the key parameters of a real-time simulator is the frame rate at which its outputs are updated. many real-time simulators, including those used for operator training, perform satisfactorily with frame times in the range 10 to 100 ms. all the calculations needed to advance the simulation by one frame must be completed, along with all necessary data transfers, within one frame. in some applications real-time simulation is used as a test environment for real hardware or embedded software, referred to as the system under test (sut). in some cases much shorter frame times (<10 µs) are required because of the high-frequency dynamics of both the simulated system and the sut.
such applications are found, for example, in aerospace, automotive and power electronic systems. we consider approaches to producing real-time simulations in cases such as these. to distinguish these higher-speed examples of real-time simulation, we will apply the term high-speed real-time (hsrt) simulation to applications requiring frame times of less than 100 µs.

2 the need for hsrt simulation

the maximum acceptable frame time for a real-time simulation is determined by the dynamics of the simulated system and the hardware to which it is interfaced. training simulators that are interfaced to instruments and controls used by human operators would not normally benefit from hsrt operation, because humans cannot absorb information or react at the corresponding data rates. hsrt simulation is confined to situations in which a simulated system with a wide dynamic range is interfaced to equipment that is sensitive to the high-speed dynamics of the simulation. a common application occurs in the design of embedded controllers for high-speed systems. two examples that have received a lot of attention are automotive systems and power electronic systems. embedded computers are now widely used as automobile engine electronic control units (ecus). testing of these embedded controllers on real engines is expensive and time-consuming, and simulation is increasingly being used instead. these simulations often require frame times of 100 µs or less, especially for high-performance engines such as are found, for example, in formula 1 racing cars. power electronic systems are used to convert electrical power from alternating to direct current and vice versa, and for conditioning and stabilizing the resulting power outputs. they are widely used for producing power in the form required by an electrical load, ac or dc, with appropriate current and voltage ratings, and with the necessary stability and reliability. they range from encapsulated low-wattage power supplies for laptops and domestic electronic devices, through industrial electric drives rated at kilowatts to megawatts, to power distribution and transmission components and systems that convert and control hundreds of megawatts of electrical power. converters, which convert ac to dc or dc to ac, are key components of these systems. the converters consist of configurations of switches that can be turned on and off via a control signal. the timing of this switching determines the form of the converter output. the feedback controllers that govern this switching are often based on pulse-width modulation (pwm) techniques with pwm frequencies of tens of kilohertz. the testing of controllers for this type of power electronic system using real-time simulations can require frame times of less than 10 µs. two trends are increasing the need for hsrt simulation. as the power and cost-effectiveness of simulation technology is more widely recognized, it continues to penetrate new fields of application. furthermore, within particular fields in which real-time simulation is already established, advances in technology that reduce response times and increase frequencies demand shorter frame times from the corresponding real-time simulations. these trends are likely to persist, causing an increasing need for hsrt simulation.
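the hard constraint just described – finish the model update and the i/o before the frame boundary, every frame – can be pictured with a toy fixed-frame loop. this is only an illustration of the scheduling discipline (an interpreted language cannot actually hold microsecond frames; real hsrt systems rely on an rtos or bare-metal dsp code, as discussed below):

    import time

    FRAME_TIME = 10e-6  # 10 microsecond frame, as in the power-electronics case

    def run_frames(step_model, do_io, n_frames):
        """advance the model frame by frame; count deadline overruns."""
        overruns = 0
        deadline = time.perf_counter() + FRAME_TIME
        for _ in range(n_frames):
            step_model(FRAME_TIME)   # solve the model equations for one frame
            do_io()                  # time-synchronized input/output
            if time.perf_counter() > deadline:
                overruns += 1        # real-time contract violated this frame
            while time.perf_counter() < deadline:
                pass                 # spin until the frame boundary
            deadline += FRAME_TIME
        return overruns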
we will discuss some of the commercially available solutions for implementing hsrt simulations, and will also present a new approach adopted by a team at california state university, chico to a particularly demanding application requiring frame times beyond the reach of currently available commercial systems.

3 available systems for hsrt simulation

special techniques are required to achieve such short frame times. high-speed real-time (hsrt) simulations of this kind represent one kind of hard real-time embedded system, and can be based on the same real-time operating systems (rtos) used for many high-performance embedded systems. a number of commercial systems that can support hsrt simulation are available from companies such as dspace, ref [1], opal-rt, ref [2], adi, ref [3], and rtds, ref [4]. the dspace ds1006 simulator (ref [1]) is based on an amd opteron processor 248, a high-performance server processor. multiprocessor systems are available that use several processor boards connected via optical fiber. opal-rt (ref [2]) uses clusters of low-cost pcs – incorporating off-the-shelf technologies and fpga-based reconfigurable i/o – that can simulate electric power converters and drives at time steps down to 10 µs. applied dynamics international (adi) offers an hsrt simulator (ref [3]) based on the motorola mvme5500 powerpc. one processor acts as a user interface processor; a second processor runs the models under rt-exec, a real-time operating system that "guarantees microsecond level determinism". multiple processors can be used for more demanding applications. rtds technologies (ref [4]) specialize in providing simulation systems customized to specific, large-scale power system simulations. their systems consist of racks of up to 20 rail-mounted cards. two types of processor card are available, one based on dual ibm ppc750cxe risc processors, the other containing three analog devices adsp21062 sharc dsps. other types of card are available for digital and analog i/o, user interface, inter-rack communication, etc.

4 hsrt power electronic simulation

the real-time power system (rtps) simulations developed at the mcleod institute of simulation sciences at csu, chico concentrate on simulations of systems based on three-phase bridge converters using electronic switches with turn-on and turn-off control. typically, one side of the system is connected to three-phase a.c. and the other to a d.c. system. the direction of power flow, i.e. whether the converter is acting as a rectifier or as an inverter, is determined by the timing of the operation of the switches. a typical system (fig. 1) consists of two such converters connected by a dc line, with a three-phase transformer and three-phase ac system at the remote ends of both converters. the dc line is often represented by a large capacitance, but in some cases involving long lines a distributed line model of some kind is required. the converter switches are switched on and off by control pulses from a feedback controller that controls the timing of switch operation in order to maintain the required power specifications.

fig. 1: switch connections for a 3-phase voltage-sourced converter (three-phase ac side, timed switch inputs, dc connection). each switch consists of a turn-off device and a diode in parallel. for high-voltage/high-power converters each switch may contain many devices in series and/or parallel.
pulse-width modulation controllers are frequently used for this purpose. switching frequencies can range from a few hundred hz to tens of khz. frame times of 10 µs or less are necessary for accurate simulation in these cases. for real-time operation this implies that the simulation must re-compute the state of the system within a 10 µs frame time. systems that use higher pwm frequencies require even shorter frame times. simulation studies suggest that 20 khz pwm controllers may require frame times as low as 2 µs.

5 using dsp arrays for hsrt simulation

the miss center at chico was asked to develop low-cost hsrt simulations of typical power electronic systems, using off-the-shelf components, that can perform with frame times shorter than those available from currently available commercial systems. a future need for frame times shorter than 5 µs was projected, although the initial goal was to implement typical generic models at 10 µs frame times. a further requirement was that the technology should be scalable, so that more complex models could be accommodated with minimal impact on achievable frame times. the approach selected was to use pci boards containing small arrays of digital signal processors (dsps) inserted into a conventional desktop computer. the real-time simulation executes on the dsps and the user interface runs under the windows operating system on the host computer. rather than develop a custom user interface, an existing simulation system was selected to provide this feature. the virtual test bed (vtb), developed at the university of south carolina (ref [5]), was chosen for this purpose. the vtb supports user entry of model parameters and simulation control parameters, controls the simulation, and provides powerful graphical display capabilities through its vxe graphics system. use of the vxe permits detailed examination of complex waveforms generated during the real-time simulation runs. the dsp boards used are based on analog devices sharc (super harvard architecture) processors (ref [6]). boards containing 4 processors are available from several manufacturers; the boards used at chico are from bittware inc. (ref [7]). this board contains four analog devices tigersharc processors with a 250 mhz clock. clock speeds for these processors appear to be quite slow compared to, say, pentiums, but this can be misleading. the processors have pipelined floating-point adders and multipliers that can, under suitable conditions, produce several floating-point results in each clock cycle.

6 performance issues

several simulations have been developed, of varying complexity. the mathematical models on which they are based consist of up to 40 differential equations. the three most important factors affecting the execution speed of the array of dsps are model partitioning, data transfer and code efficiency. the models that have been implemented so far lend themselves to a simple partitioning between processors. the systems split naturally into two similar halves, each of which is allocated to one of the four dsps on a single pci board. the controllers for both sides are assigned to a third processor, and the remaining dsp is used to handle synchronization and communication with the host processor. several considerations influence the way in which the model is partitioned. ideally this should be done in a natural fashion, in which each processor represents a natural subdivision of the complete system.
this is particularly true of the controller model, bearing in mind that one application is to combine the simulation of the actual power system with a real controller. separation of the controller model into a separate processor dedicated to that purpose allows easier replacement with the real controller. at the same time it is important to try to equalize the computing loads, so that no single processor unduly delays the completion of each frame. current models run with frame times between 4.5 µs, for a simple model such as is shown in fig. 1 (with feedback control at each end), and 11.9 µs for a more complex model. the first of these models runs on 4 processors, and the ratio of the longest to the shortest execution times for the individual processor tasks is less than 1.1:1. for the more complex model, almost all the extra load is taken by the two processors that simulate the power system, and the ratio in this case is almost 3:1. clearly, distributing the load differently, or using more processors, could improve performance significantly. ultimately it is a matter of judgment how to make the trade-off between convenient and natural partitions on the one hand and minimization of execution time by equalizing the frame times of the different processors on the other. the dsp board provides several ways of performing processor-to-processor and board-to-host data transfers. initially, common memory was used as a convenient method of inter-processor communication, but accessing common memory proved to be much slower than expected. the use of link ports proved more efficient. the board supports a number of ways in which dma transfers can be used. care has to be taken to use the dma features of the board to maximum effect. much of the programming of the models is carried out using the c++ language, but the compiler does not make optimum use of the capabilities of the processor, and time-critical code needs to be efficiently hand coded. this is not a trivial task, but fortunately libraries are available containing efficiently coded routines for common operations. the rtps models use efficient matrix-multiply routines based on modifications of standard routines found in the libraries that are provided with the dsp software. work continues on optimizing code for the current models and on investigating scalability using larger numbers of processors in more complex simulations. automated methods of model development that rely less on manual processes are under development. further speed-up should also be possible using the latest 500-mhz tigersharc processors.

7 conclusions

computer simulation is now used in almost all types of engineering design. in some cases simulations are required to work in synchronism with the execution of the system being simulated. this is required, for example, when the simulated system must be connected to real hardware, to embedded software or to human operators. a number of commercial systems are available that support the development of real-time simulations. increasingly, applications are emerging that require frame times of 100 µs or less, and in some cases, such as power electronic systems with high-frequency pwm controllers, frame times of less than 10 µs may be required. in these extreme cases special care is needed to achieve the required short frame times.
one approach, used for high-speed power electronic simulations at california state university, chico, is to use arrays of digital signal processors with custom-designed software. these simulations have so far achieved execution times as low as 4.5 µs for models consisting of approximately 20 differential equations.

8 acknowledgments

the work described in this paper is the result of a team effort. the authors wish to thank professors john zenor, richard bednar, dale word and ralph hilzer and numerous graduate students for their contributions, and dean kenneth derucher and dr. larry wear for their support and encouragement. the research described in this paper is supported by the us office of naval research (onr) through grants n00014-01-1-0394 and n00014-03-1-0950.

references
[1] dspace inc.: "dspace sets the pace in hil simulation again", feb. 2004, www.dspaceinc.com/ww/en/inc/company/press/pr_0402001.htm.
[2] opal-rt technologies inc.: "rt-lab electric drive simulator". http://www.opal-rt.com/catalog/products/2_es/eds/description.html.
[3] applied dynamics international: "the rts ultra-high performance real-time simulator". http://www.adi.com/pdfs/product/datasheet_rts.pdf.
[4] rtds technologies inc.: information available at http://www.rtds.com/hardware.htm.
[5] dougal, r. a., liu, s., gao, l., blackwelder, m.: "virtual test bed for advanced power sources." j. of power sources, vol. 110 (09/02), no. 2, p. 285–294.
[6] analog devices inc.: "sharc processors." http://www.analog.com/processors/processors/sharc/.
[7] bittware inc.: "embedded sharc dsp solutions." http://www.bittware.com/.

roy e. crosbie
e-mail: rcrosbie@csuchico.edu
mcleod institute of simulation science
california state university, chico
chico, ca 95929-0003, usa

narain g. hingorani
consultant
los altos hills, ca, usa

embedded operating system for microblaze

m. šimek

abstract: this paper presents a work in progress on experiments with embedded operating systems for the microblaze processor. modern embedded systems based on a configurable platform incorporating a similar processor core are gaining importance with the ongoing effort to minimize cost and development time. after an overview of the configurable platform based on this processor core, we devote our attention to the uclinux os. this os has been successfully ported to the microblaze processor, and we present our current experience with it. at the end of the paper we discuss several possible booting strategies and recommend further development of u-boot.

keywords: embedded systems, operating systems, configurable hw, microblaze.

1 introduction

customers have quickly become accustomed to the high standard of today's information technology, and their demands continue to grow. for this reason, modern embedded systems frequently resemble small computer systems. just a few examples are pdas, smart and cell phones, set-top boxes, network and telecom systems. systems like these require some degree of configurability, so that new functionality may be added at any time, even after production. this process is usually known as a firmware update. platforms based on fpgas have partially managed to address these issues. now is the right time to consider another degree of flexibility – it is time to consider the need to use an embedded operating system (os). one of the key advantages of any os is easy software integration. in the case of embedded systems, this allows an increasing amount of their functionality to be moved to software, instead of designing it as a pure hardware solution. the flexibility of software solutions and the lower design times help on the one hand to reduce the economically crucial time-to-market factor, and on the other hand to tune the most time-consuming tasks in hardware. hardware acceleration and reconfiguration for time-consuming and computationally heavy applications is much more applicable in a fully configurable embedded system with the help of an os.

2 motivation

aiming to follow modern trends and to evaluate for ourselves the real advantages of an embedded os, we started a project on experiments with such an os.
because of our interest in configurable hardware, we chose the microblaze processor [1] as the basis of the project. first, there was a very important and difficult decision on which operating system would be the best for our platforms. the fundamental criteria were source code availability and support for a wide range of hardware devices (drivers) and standard software components (file systems, networking, etc.). easy adaptability to a changing hardware platform was also crucial for the configurable systems that we considered. among the oses that supported microblaze we finally chose a linux-based system. firstly, it matched our requirements, and secondly our familiarity with linux systems was a decisive factor. specifically, we used a distribution called uclinux [2, 3] (pronounced "you see linux").

3 target system

3.1 microblaze processor

microblaze is a 32-bit embedded soft-core processor with a reduced instruction set computer (risc) architecture. it is highly configurable and specifically optimized for synthesis into xilinx field programmable gate arrays (fpgas). the microblaze configurability enables embedded developers to tune its performance to match the requirements of target applications. for example, microblaze may be configured to use a hardware multiplier or a dedicated barrel shifter. the current version even features an optional floating-point unit that can accelerate system performance by as much as 120 times over software emulation. xilinx claims that the core can operate at frequencies up to 200 mhz.

3.2 hardware platforms

currently, we are using two development kits by xilinx – ml401 and ml310 [1]. the former is based on a virtex-4 fpga, while the latter uses virtex-ii pro. both platforms offer a wide range of industry-standard peripherals – e.g. an ethernet controller, a compact flash card controller, a usb controller, an ac97 audio codec, etc. all of these may optionally connect to microblaze through configurable interfaces realized in the fpga. an example of what the final system may look like is shown in fig. 1 (fig. 1: configurable system architecture). when using an embedded os, the ethernet interface becomes highly important, either as a standardized communication interface or for better support of application development (i.e. debugging). the networking interface is famous especially in unix-based operating systems, among which uclinux may be classified. in addition, we use ethernet for downloading new drivers to a board via the ftp protocol.

3.3 uclinux operating system

uclinux has features similar to those of standard linux, but its advantage is the optimization for embedded devices and applications.
it is especially optimized to minimize the size of the code necessary for both applications and the kernel. unlike "standard" linux, uclinux may be used even for embedded systems that have a main memory as big as an l2 cache in an ordinary personal computer. like any other linux system, uclinux is composed of a kernel and a distribution. its kernel is derived from a standard linux kernel v2.0 with memory management left out. today's kernel version is 2.4.32-uc0, and version 2.6 is planned for the near future. the kernel supports many processor families, such as alpha, arm, blackfin, i386, m68k, microblaze, mips, ppc, sh, sparc, etc. the main purpose of a distribution is, in the first step, creating the root file system and adding applications. the type of root file system is elective for almost any available storage device. we use two types of root file systems – romfs (rom file system) and cramfs (compressed rom file system) [2]. cramfs is 20 % smaller than romfs. it is also possible to use nfs (the network file system), which has not been fully tested yet. the distribution extends the kernel with a number of programs. these include core applications (init, agetty, cron, at), flash tools (netflash, mtd-utils), file system applications (flatfsd), network applications (dhcpd, ftpd, inetd, ping, telnetd, thttpd, tftp, ifconfig, route), miscellaneous applications (cat, cmp, cp, ln, ls, mkdir, mv, rm, ps), microwindows (still not tested), etc. a very useful tool is the busybox package [4], which contains programs for managing kernel modules (e.g. mount, umount, insmod, lsmod, rmmod, modprobe).

4 project status

4.1 completed work

operating systems such as linux are very extensive, and it is hardly possible for an individual to fully comprehend them. however, a detailed knowledge of system internals is crucial in the embedded field, where each system is specific in some way. therefore, one of the most important objectives of the project was to gain enough experience with uclinux to be able to deploy the system on any hw platform. it was necessary to understand the kernel source tree structure, to go through the kernel sources and to discover all kinds of dependencies. alongside this tedious work, kernel configuration and building seemed easy – though some problems had to be solved, too. in this phase of the project, we took advantage of the fact that a working uclinux demo was available for the ml401 platform [3]. this was especially helpful at the beginning for purposes of testing the kernel and distribution configuration. with increasing experience, we used our own derived variants of this reference platform. finally, to demonstrate our full mastery, we successfully ported uclinux to the ml310 development kit [5].

4.2 full-platform support

uclinux is well known for offering broad support for various hardware devices. however, one cannot expect this to be true for specifically designed components, such as configurable systems built within fpgas. therefore, to get full support for our hw platforms, we need to write our own device drivers. a vga controller driver is currently under development, and a driver for the usb controller is planned for the future.

4.3 booting strategy

so far, we have been using a simple boot loader created as a part of the project. its only purpose is to initialize the necessary peripherals (serial line, lcd display), perform a memory test and, if successful, copy the kernel binary image into ram and execute it. this procedure is illustrated in fig. 2 (fig. 2: simple booting strategy).
although this simple boot loader works satisfactorily, it does not fully cover our needs. for higher configurability and better debugging facilities, it would be better to have a more sophisticated boot loader. this might, for example, allow us to pass boot arguments to the uclinux kernel, or to choose between several pre-configured kernel images. with ethernet support, it would even enable remote kernel updates. all these features, and more, can be found in u-boot [6] (das u-boot – universal boot loader).

fig. 2: simple booting strategy

u-boot is an open source project and has been designed mainly for high flexibility. its support for many processor families is also advantageous – e.g. arm, i386, mips, nios, powerpc, xscale, etc. microblaze is supported as well, but the recent u-boot port for this processor has very limited capabilities: it implements only a serial line interface and allows us to work with ram (listing, writing and modifying memory) [7]. therefore, our current plan is to focus on implementing the remaining functions, e.g. remote file download, access to flash memory, support for boot arguments, etc. with the help of u-boot, we can easily implement our sophisticated booting strategy (see fig. 3). this enables us to have three kinds of root file systems – a standard romfs in ram, a read/write file system on an external storage device partition, and finally an alternative jffs (journalling flash file system) stored in flash memory. it would then be possible to access both the flash and the external storage memory, and to choose the right root file system.

5 summary

we have given an overview of a project in progress that concentrates on adapting the uclinux embedded os for the microblaze soft-core processor, together with a brief overview of both uclinux and microblaze. we have also presented the hw platforms on which we have been carrying out our experiments. although our position was simplified by the fact that a uclinux port for microblaze already existed, it was still quite a tedious task to gain sufficient mastery of the system. this work has paid off, because we are now able to adapt uclinux to our needs. this ability is essential for configurable embedded systems, which are our main concern. we are currently developing device drivers for unsupported peripherals and designing a sophisticated booting strategy offering a convenient system boot procedure with good debugging properties. our final objective is to provide a preconfigured uclinux distribution for the microblaze processor that will be complete and flexible in its support of our development platforms. such a uclinux package would form the basis for further projects, which could then concentrate on specific problems rather than dealing with the entire complexity of the operating system.

6 acknowledgments

i would like to thank tomáš brabec for his help in completing this paper.

references

[1] xilinx: the programmable logic company, [online] http://www.xilinx.com, 2006.
[2] dionne, d. j., albanowski, k., durrant, m.: uclinux – embedded linux/microcontroller project, [online] http://www.uclinux.org, 2006.
[3] williams, j.: microblaze uclinux project home page, [online] http://www.itee.uq.edu.au/~jwilliams/mblaze-uclinux, 2006.
[4] landley, r.: busybox: the swiss army knife of embedded linux, [online] http://www.busybox.net, 2006.
[5] šimek, m.: embedded operating systems for microblaze, [online] http://cs.felk.cvut.cz/~simekm2/uclinux, 2006.
[6] u-boot: das u-boot – universal bootloader, [online] http://sourceforge.net/projects/u-boot.
[7] shoji yasushi: suzaku: series of embedded devices based on the combination of fpga and linux, [online] http://suzaku-en.atmark-techno.com.

michal šimek
e-mail: simekm2@fel.cvut.cz
dept. of computer science and engineering
czech technical university in prague
faculty of electrical engineering
karlovo náměstí 13
121 35 praha 2, czech republic

fig. 3: sophisticated booting strategy

oofem – an object oriented framework for finite element analysis

b. patzák, z. bittnar

this paper presents the design principles and structure of the object-oriented finite element software oofem, which has been under active development for several years. the main advantages of the presented framework include modular design, extensibility, and robustness. the code itself is freely available and is distributed under the gnu public license. it provides tools for linear and nonlinear analysis of mechanical and transport problems on sequential and parallel computers.

keywords: finite element software; object oriented fem design.

1 introduction

in recent years, the research community has faced growing demands to merge the power of current computer technology with the extensive knowledge acquired in engineering, mathematics, physics and other disciplines, and to exploit them effectively in the design of new multi-functional materials and structures. instead of sticking to traditional (usually very conservative) design code formulas and provisions, modern design procedures should focus on optimizing the actual performance, taking into account multiple objectives and criteria, such as a sufficiently low probability of failure, low cost, long-term durability, proper functionality and high utility, compatibility with the environment, aesthetic quality, etc. this is not possible without powerful modeling, simulation and optimization tools supporting the designer in the decision-making process.

the behavior of structures, solids, and fluids is governed by complicated systems of partial differential equations with appropriate boundary conditions. a reliable and accurate analysis of this behavior, necessary for practical decisions regarding design, optimization, risk assessment, etc., is frequently based on numerical simulation and requires the development of efficient computational tools. the growing demands for realistic modeling, which typically includes state-of-the-art constitutive models and adaptive and multi-level solution techniques, bring in new software issues. a very important feature of any modern computational code is its open nature: it should allow straightforward and efficient implementation of new solution methods, algorithms, material models, etc. an analyst or researcher naturally wants to work with a code which is easily extensible towards future demands and easily maintainable, but still efficient and portable across many platforms.

object-oriented modelling is a tool that has been successfully used to design and implement complex software systems meeting the above criteria. it is based on the uniform application of the principles for managing complexity – abstraction, inheritance, association, and communication using messages. the design of an object-oriented application consists in finding classes and objects, identifying structures and attributes, and defining the required services. in recent years, a number of articles on applying an object-oriented approach to finite element analysis have been published. in 1990 fenves [1] described the advantages of an object-oriented approach for the development of engineering software. forde et al. [2] presented one of the first applications of object-oriented programming to finite elements.
many authors have presented complete architectures of oo finite element codes, notably a coordinate-free approach by miller [3], a non-anticipation principle by zimmermann et al. [4], dubois-pelerin et al. [5, 6, 7], and commend [8]. recent contributions include the work of mackie [9, 10, 11], archer et al. [12, 13], and menetrey et al. [14].

this paper presents the design principles and structure of the object-oriented finite element code oofem [15]. this code has been actively developed for several years and is distributed as free software under the gnu public license. the basic intentions of the oofem design include modularity, open nature, extensibility, maintainability, portability, and, last but not least, computational performance. although the primary focus has been on research applications, the code has been used several times for solving industrial problems. in the next section, the general structure of the code is presented using the coad-yourdon methodology [16]. such a representation shows the class hierarchy as well as the mutual relations between the classes, representing the generalization/specialization, whole/part, or association relations. all the fundamental abstract base classes, representing the basic building blocks of a finite element code, will be introduced and their roles will be discussed. finally, the oofem features and future development directions will be presented.

2 design principles

the overall structure consists of several modules. the core module is called oofemlib. it contains the definition of fundamental top-level fe classes that represent, for example, degrees of freedom, nodes, elements, integration points, boundary and initial conditions, constitutive models, numerical solvers, sparse matrices, and the problems under consideration. it also contains some utility classes that are of general use and that can facilitate development, like representations of vectors and matrices, etc. this module introduces the fundamental class hierarchy, which is intended to be general enough to incorporate any fe problem and which is, at the same time, problem independent. the primary role of these core classes is to specify a general interface that defines the services provided by each derived class. these services are typically abstract ones; they are implemented by inherited classes, which implement particular objects. the role of abstract services is very important, since they declare the general interface, which is implemented by derived classes. any derived class is thus forced to implement this interface, which allows a high level of abstraction. an important consequence of the abstract interface concept is that it allows some general services to be implemented already at the abstract level.
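this idea can be illustrated by a short c++ sketch of the stiffness-matrix example discussed next. the class and method names are illustrative only, not the actual oofem api; the base class implements the integration loop once, in terms of abstract services supplied by each particular element.

#include <cstddef>
#include <vector>

// minimal dense matrix, just enough for the sketch
struct Matrix {
    std::size_t rows = 0, cols = 0;
    std::vector<double> a;            // row-major storage
    Matrix() = default;
    Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), a(r * c, 0.0) {}
    double& operator()(std::size_t i, std::size_t j) { return a[i * cols + j]; }
    double  operator()(std::size_t i, std::size_t j) const { return a[i * cols + j]; }
};

Matrix transpose(const Matrix& m) {
    Matrix t(m.cols, m.rows);
    for (std::size_t i = 0; i < m.rows; ++i)
        for (std::size_t j = 0; j < m.cols; ++j) t(j, i) = m(i, j);
    return t;
}

Matrix mul(const Matrix& x, const Matrix& y) {
    Matrix z(x.rows, y.cols);
    for (std::size_t i = 0; i < x.rows; ++i)
        for (std::size_t k = 0; k < x.cols; ++k)
            for (std::size_t j = 0; j < y.cols; ++j)
                z(i, j) += x(i, k) * y(k, j);
    return z;
}

struct GaussPoint { double weight = 1.0; };

class Element {
public:
    virtual ~Element() = default;
    // abstract services declared here, implemented by particular elements
    virtual Matrix geometricalMatrix(const GaussPoint&) const = 0;   // B
    virtual Matrix materialStiffness(const GaussPoint&) const = 0;   // D
    virtual double jacobianDet(const GaussPoint&) const = 0;         // |J|
    virtual const std::vector<GaussPoint>& integrationPoints() const = 0;

    // general service implemented once at the abstract level:
    //   K = sum over integration points of  B^T D B * weight * |J|
    Matrix stiffnessMatrix() const {
        Matrix K;
        for (const GaussPoint& gp : integrationPoints()) {
            Matrix B = geometricalMatrix(gp);
            Matrix BtDB = mul(transpose(B), mul(materialStiffness(gp), B));
            const double s = gp.weight * jacobianDet(gp);
            if (K.a.empty()) K = Matrix(BtDB.rows, BtDB.cols);
            for (std::size_t i = 0; i < K.a.size(); ++i) K.a[i] += s * BtDB.a[i];
        }
        return K;
    }
};

adding a new element type then amounts to implementing the abstract services; the general integration loop is inherited unchanged.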
a typical example is stiffness matrix computation, which can be done already at the abstract level, provided that methods for computing the geometrical matrix and the material stiffness are declared in the interface specification – they are only declared as abstract (virtual), and their implementation is left to the derived classes representing particular finite elements. a typical implementation of this general service then consists of a loop over the element integration points, computation of the products of these matrices, and summation of the contributions. the implementation of such general services can significantly facilitate the development of new elements. at the same time, such a default service can be overloaded (specialized) by a particular element implementation to reflect the specific needs of a particular element formulation, if necessary.

abstract interfaces allow a developer to implement high-level functionality using the general interface, without regard to the details of each derived class. on the other hand, they allow a particular class to be implemented without deep knowledge of the whole code structure; it is only necessary to implement the required services that constitute the general interface. such an approach makes it possible to write high-level procedures that will work even with classes added in the future. moreover, such a concept leads to a maintainable and extensible code structure which supports efficient teamwork. however, it is necessary to carefully design the abstract interfaces declared by the top-level classes to be general enough to incorporate future demands.

on top of the core oofemlib module, specialized modules are built (see figs. 1 and 2). these modules contain application-specific classes that implement the required functionality. they typically contain implementation classes representing problem-specific finite elements, constitutive models, boundary conditions, and solution algorithms. a typical example is the structural analysis module (sm) or the transport-problem module (tm). modules may also represent an interface to external libraries. such a module then provides “shell” classes that implement the required interface and translate the messages to external library procedures. the petsc module, providing an interface to the portable, extensible toolkit for scientific computation (petsc) [17], is a typical example.

fig. 1: oofem modules

fig. 2: oofem modules – problem independent core module oofemlib and problem specific sm module

3 general structure

the general structure of oofem is shown in fig. 3, using the coad-yourdon representation. in short, abstract classes are represented by single-framed rectangles; classes which have instances (so-called class&objects) are represented by double-framed rectangles. the lines with a semi-circle mark represent a generalization/specialization relation (inheritance), where the line from the semi-circle midpoint points to the parent class. the lines with a triangle mark represent a whole/part relation, where the line starting from the triangle vertex points to the “whole” class possessing the “part” class. an association is represented by a solid line drawn between the classes. bold lines represent communication using messages. the details can be found in [16].

class dof represents a single degree of freedom (dof).
it maintains its physical meaning, an associated equation number, and a reference to the applied boundary and initial conditions. the base class dofmanager represents an abstraction for an entity possessing some dofs. it manages its dof collection, a list of applied loadings and, optionally, its local coordinate system. its general services include methods for gathering localization numbers from the maintained dofs, computing the applied load vector, and computing the transformation to its local coordinate system. derived classes typically represent a finite element node or an element side possessing some dofs. boundary and initial conditions are represented by corresponding classes. classes derived from the base boundarycondition class, representing particular boundary conditions, can be applied to dofs (primary bc), to dof managers (typically a nodal load), or to elements (surface loads, neumann or newton boundary conditions, etc.).

3.1 problem representation

the problem under consideration is represented by a class derived from the engngmodel class. its role is to assemble the governing equations and to use a suitable numerical method (represented by a class derived from the numericalmethod class) to solve the system of equations. the discretization of the problem domain is represented by the class domain, which maintains lists of objects representing nodes, elements, material models, boundary conditions, etc. the domain class is an attribute of the engngmodel class and, in general, it provides services for accessing the particular components. for each solution step, the engngmodel instance assembles the governing equations by summing up the contributions from the domain components. since the governing equations are typically represented numerically in matrix form, the implementation is based on vector and sparse matrix representations to store the components of these equations efficiently. a suitable numerical method, represented by an instance of a class derived from the numericalmethod class, is then used to solve the problem. an important consequence of abstract interfaces is that the problem formulation can use any sparse matrix representation and any suitable numerical method, even ones added in the future, because they all implement the same common interface.

fig. 3: general structure of oofem

an abstraction for a general field is also provided. fields have the capability to represent any global field, like displacement or temperature fields, described using nodal values, and to evaluate the field values at any valid point of the problem domain. a particular problem implementation can store its solution in the form of field(s). this can help significantly when implementing adaptive or staggered solution techniques, since transfers of solution fields between several grids are provided by the field implementation.

high-level numerical methods are represented as a hierarchy of classes derived from the base numericalmethod class. classes directly derived from this base class define the problem-specific interface for particular numerical problems (for example, interfaces specific to an eigenvalue problem or to a linear system of equations). the derived classes then implement particular algorithms. the methods forming the problem-specific interfaces accept parameters in the form of abstract classes representing sparse matrices or vectors. thus there are no assumptions about a particular type of data representation.
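the concept can be sketched in simplified c++ (illustrative names, not the actual oofem interfaces): an iterative solver for a linear system needs nothing from the matrix but the product y = a·x, so any storage scheme implementing that one service will do. the conjugate gradient method below assumes a symmetric positive definite system.

#include <cstddef>
#include <vector>

using Vector = std::vector<double>;

// abstract sparse matrix: solvers may rely only on this interface
class SparseMatrix {
public:
    virtual ~SparseMatrix() = default;
    virtual std::size_t size() const = 0;
    virtual Vector times(const Vector& x) const = 0;   // y = A x
};

// problem-specific interface for linear systems: solve A x = b
class LinearSolver {
public:
    virtual ~LinearSolver() = default;
    virtual Vector solve(const SparseMatrix& A, const Vector& b) = 0;
};

// one possible implementation: plain conjugate gradients (for symmetric
// positive definite systems), using only the matrix-vector product
class CGSolver : public LinearSolver {
public:
    Vector solve(const SparseMatrix& A, const Vector& b) override {
        Vector x(A.size(), 0.0), r = b, p = r;
        double rr = dot(r, r);
        for (std::size_t it = 0; it < A.size() && rr > 1e-20; ++it) {
            Vector Ap = A.times(p);
            const double alpha = rr / dot(p, Ap);
            for (std::size_t i = 0; i < x.size(); ++i) {
                x[i] += alpha * p[i];
                r[i] -= alpha * Ap[i];
            }
            const double rrNew = dot(r, r);
            for (std::size_t i = 0; i < p.size(); ++i)
                p[i] = r[i] + (rrNew / rr) * p[i];
            rr = rrNew;
        }
        return x;
    }
private:
    static double dot(const Vector& a, const Vector& b) {
        double s = 0.0;
        for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
        return s;
    }
};

a skyline or compressed-row matrix class derived from sparsematrix works with this solver without any change to the solver code; conversely, a direct solver implementing the same solve interface can replace cgsolver in an analysis.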
as a consequence, a numerical method implementation can work in principle with any sparse matrix, provided that it uses only operations available in the general interface of the basic sparsematrix class. this is illustrated in fig. 4, where a linear static analysis can use different solution algorithms for a linear system of equations, since they all implement the same interface (here represented by the method “solve”). at the same time, the iterative solver can work with different sparse matrix representations, since in principle the only required operation is the multiplication of the matrix by a vector, which is a part of the general sparse matrix interface (in reality, suitable preconditioning should also be applied, but this is omitted here for the sake of brevity). the independence of the problem formulation from the numerical solution, and of the numerical algorithm from the data storage representation, are the key features that characterize the design and structure of this framework.

fig. 4: independence of the problem formulation from the data storage and the numerical algorithm

3.2 material-element frame

in this section, the structure of the material-element frame will be described. the primary goal during the design was to achieve straightforward extensibility and a high level of modularity. to achieve these requirements in the context of this frame, the following set of fundamental abstract classes is introduced to represent finite elements (element class), cross section models (crosssection class), constitutive models (material class), integration rules (integrationrule class), integration points (integrationpoint class), interpolation functions (interpolation class), and material-mode-specific containers for storing history variables in integration points (materialstatus class). to reflect the needs of the specific problems under consideration, specialized abstract interfaces for particular problems are needed. one should have, for example, a different material model interface for structural mechanics problems and for heat and mass transport problems. these problem-specific interfaces are declared by the corresponding problem-specific classes, derived from the base classes representing a finite element, a cross section, or a material model. specific finite elements, cross section models, and constitutive models are then derived from these problem-specific base classes within the corresponding module.

particular finite element implementations are represented by classes derived from the corresponding problem-specific base class. each finite element can have one or more integration rules, which are abstractions for a set of integration points used to evaluate numerical integrals over the element volume. an integration point maintains its local coordinates and integration weights. each integration point can maintain one or more instances of the materialstatus class (the purpose of this feature will be explained later). for convenience, a hierarchy of classes derived from the base interpolation class, representing fe interpolation, is provided. the derived classes implement many interpolation schemes and can be used to evaluate shape functions and their respective derivatives. the crosssection class represents a geometrical model of a cross section. classes representing finite elements do not communicate directly with a constitutive model.
instead, they always use the crosssection class interface, which performs the necessary integration over the cross section and invokes the corresponding material model services. a cross section model can introduce special integration points to account for a layered description, for example. in such a case, these additional integration points (slaves) are created and stored at every element integration point, but are hidden from the element formulation.

the material class represents the base abstract class for all constitutive models. an associated materialstatus class is introduced in order to account for extensibility and efficiency requirements. in general, every material model must store its unique history parameters at every integration point. the amount, type, and meaning of these history variables vary from one material model to another. therefore, it is not possible to efficiently match all needs and reflect them in the integration point data structure. the suggested remedy uses the associated material status class, related to the corresponding material model, to store the necessary history variables. a unique copy of the corresponding material model status is created and associated with every integration point by the particular constitutive model. the developer of a new constitutive model defines and implements the material class representing the model, and must also define the associated material status class (derived from the base materialstatus), which contains the history variables related to the model and the corresponding services. because the integration point is a compulsory parameter of all messages sent to the material model, the model can in turn access its related material status from the given integration point, and therefore has access to the corresponding history variables. there are typically two sets of history variables: one related to the previous equilibrium state (needed to correctly evaluate the evolving character of constitutive relations), and a working set, which changes during the equilibrium iterations (see fig. 5). once equilibrium is reached, the working set is copied into the equilibrium set. on the other hand, when equilibrium is not reached, the solution step can be restarted, and in this case the working set is initialized from the set related to the previous equilibrium.

fig. 5: constitutive model and its history variables

recalling the concept of abstract interfaces, the introduction of independent representations for a finite element, a cross section description, and a constitutive model allows a particular finite element representation to be combined with different cross section and material models, see fig. 6. this is fully transparent, since all cross section and material models implement the same interface, declared by the corresponding abstract classes.

4 oofem features

oofem is an open source, free software finite element system with an object oriented architecture. it is distributed under the gnu public license. it is written in the c++ programming language and operates on various platforms, including unix (linux) and microsoft windows. a graphical post-processor is available in x-windows (unix). the general features include staggered solution procedures, a multiple domain concept, full restart support from any saved state, and built-in support for parallel processing (message passing).
many sparse matrix storage schemes are available, as well as the corresponding iterative and direct solvers. the structural analysis module (sm) includes many analysis procedures, including serial and parallel nonlinear static analyses with direct and indirect control, parallel nonlinear explicit dynamics, and linear dynamics (eigenvalue analysis, implicit and explicit integration methods). a large material library, including state-of-the-art models for the nonlinear fracture mechanics of quasi-brittle materials, and a rich element library are provided. the transport problem module (tm) is capable of solving stationary and transient (linear and nonlinear) heat transfer and coupled heat and mass transfer problems. the element library includes axisymmetric, two- and three-dimensional elements. a staggered analysis coupling heat transfer and mechanical analysis can be performed, where the temperature field generated by the heat transfer analysis is used in the mechanical analysis as temperature loading. oofem interfaces to the following external software: iml++ (a template library for numerical iterative methods [18]), petsc – portable, extensible toolkit for scientific computation [17], and vtk (the visualisation toolkit [19]).

5 conclusion

to summarize, a general object oriented environment for finite element computations has been developed. the described structure leads to a modular and extensible code design. special attention has been focused on the important aspects of the material-element and analysis frame design. a successful implementation in the c++ language verifies the designed program structure and provides a robust computational tool for finite element modeling.

6 acknowledgment

this work was supported by the grant agency of the czech republic, under project no. 103/04/1394.

references

[1] fenves g. l.: “object-oriented programming for engineering software development.” engineering with computers, vol. 6 (1990), p. 1–15.

fig. 6: cross section and material models interface concept

[2] forde b. w. r., foschi r. o., stiemer s. f.: “object-oriented finite element analysis.” computers and structures, vol. 6 (1990), p. 1–15.
[3] miller g. r.: “an object-oriented approach to structural analysis and design.” computers and structures, vol. 40 (1991), p. 75–82.
[4] zimmermann t., dubois-pelerin y., bomme p.: “object-oriented finite element programming part i. governing principles.” comp. meth. in appl. mech. engng., vol. 98 (1992), no. 3, p. 291–303.
[5] dubois-pelerin y., zimmermann t., bomme p.: “object-oriented finite element programming part ii: a prototype program in smalltalk.” comp. meth. in appl. mech. engng., vol. 98 (1992), no. 3, p. 261–397.
[6] dubois-pelerin y., zimmermann t.: “object-oriented finite element programming part iii: an efficient implementation in c++.” comp. meth. in appl. mech. engng., vol. 108 (1993), p. 165–183.
[7] dubois-pelerin y., pegon p.: “object-oriented programming in nonlinear finite element analysis.” computers and structures, vol. 67 (1998), p. 225–241.
[8] commend s., zimmermann t.: “object-oriented nonlinear finite element programming: a primer.” advances in engineering software, vol. 32 (2001), p. 611–628.
[9] mackie r. i.: “object-oriented programming of the finite element method.” international journal for numerical methods in engineering, vol. 35 (1992), p. 425–436.
[10] mackie r. i.: “using objects to handle calculation control in finite element modelling.” computers & structures, vol. 80 (2002), p. 2001–2009.
[11] mackie r. i.: “object oriented methods and finite element analysis.” saxe-coburg publications, stirling, uk, 2001.
[12] archer g. c.: “object-oriented finite element analysis.” phd thesis, university of california at berkeley, apr. 1996.
[13] archer g. c., fenves g., thewalt c.: “a new object-oriented finite element analysis program architecture.” computers and structures, vol. 70 (1999), p. 63–75.
[14] menetrey p., zimmermann t.: “object-oriented non-linear finite element analysis – application to j2 plasticity.” computers and structures, vol. 49 (1993), p. 767–777.
[15] patzák b.: oofem home page, http://www.oofem.org, 2004.
[16] coad p., yourdon e.: “object-oriented analysis.” prentice-hall, 1991.
[17] balay s., buschelman k., gropp w. d., kaushik d., knepley m., mcinnes l. c., smith b. f., zhang h.: petsc home page, http://www.mcs.anl.gov/petsc, 2001.
[18] iterative methods library, http://math.nist.gov/iml++/.
[19] schroeder w., martin k., lorensen b.: “the visualization toolkit: an object-oriented approach to 3d graphics.” 3rd edition, kitware, inc. publishers, 2003.

doc. dr. ing. bořek patzák
phone: +420 224 354 369
e-mail: borek.patzak@fsv.cvut.cz

prof. ing. zdeněk bittnar, drsc.
phone: +420 224 354 493
e-mail: bittnar@fsv.cvut.cz

czech technical university in prague
faculty of civil engineering
thákurova 7
166 29 prague, czech republic

design of carbon composite driveshaft for ultralight aircraft propulsion system

r. poul, p. růžička, d. hanus, k. blahouš

this paper deals with the design of a carbon fibre composite driveshaft. this driveshaft will be used for the connection between a piston engine and a propulsor of the axial-flow fan type. three different versions of the driveshaft were designed and produced. version 1 is completely made of al alloy. version 2 is of hybrid design, where the central part is made of high-strength carbon composite and the flanges are made of al alloy. an adhesive bond is used for the connection between the flanges and the central cfrp tube. version 3 differs from version 2 by the application of ultrahigh-strength carbon fibre in the central part. dimensions and design conditions are equal for all three versions, to obtain directly comparable results. the calculations of the driveshafts are described in the paper.

keywords: composite driveshaft, adhesive bond, filament winding, carbon composite, cfrp driveshaft.

1 introduction

in our work we focused on the design of a lightweight drive shaft for the connection of a piston engine with an axial-flow fan. this shaft has the following design criteria. an obvious requirement is sufficient strength and wall stability to withstand a torsional moment of 500 nm, which is estimated to be the peak value. low weight is important for the propulsion system and the overall aircraft weight. a bending stiffness high enough to allow the shaft to work in the supercritical regime, but a torsional stiffness low enough to allow the use of lighter shaft couplings, are required. all these requirements can be fulfilled when a composite material is used. for our application a hybrid construction was chosen, where the flanges are made of al alloy and the central cylindrical part is made of carbon fibre reinforced epoxy resin. the connections between the composite and metallic parts are accomplished using an epoxy adhesive system. two different versions of composite shafts were produced so that their properties could be compared. the analytical solution of both shafts was done using classical lamination theory with puck's failure criterion. a finite element analysis of the cylindrical shell and of the adhesive bond was also done. both analytical methods were compared with experimental results. this shaft is expected to be more useful for the propulsion system than the previously designed metallic version.

2 shaft description

the driveshaft connects the fan rotor with the driving engine. one shaft coupling is inserted at each end of the shaft to minimize the transfer of torque and rotation irregularities from the engine to the rotor. our goal is to design a driveshaft with sufficiently low torsional stiffness to allow us to use lighter shaft couplings. this is possible when the orthotropic mechanical properties of a fibre reinforced composite are exploited. three different versions of the driveshaft were designed. the first one is completely made of aluminium alloy.
this shaft is used to compare the “classical” design with the composite solution. the greatest disadvantages of this design are complicated manufacturing and limited freedom to tailor the shaft properties. the cylindrical part of shaft version two is made of high-strength carbon fibres with epoxy resin. flanges made of aluminium alloy are adhesively bonded to the central composite part.

fig. 1: driveshaft version 1 (left) and version 2 (center). detail of the connection of the flange with the central section of versions 2 and 3 (right)

this solution provides a technological simplification, although more manufacturing steps are needed. shaft version two did not take the need for low torsional stiffness into account. driveshaft version three combines high-modulus and standard-modulus fibre reinforced plastics in its cylindrical part. with this design we obtained a high bending stiffness combined with a low torsional stiffness. the technological demands are the same as for version two. all three versions are of approximately the same dimensions, to make them comparable. all shafts have a length of 953 mm. the internal diameter is 115 mm and the wall thickness is approximately 1 mm. the tolerance of the wall thickness of the composite versions is higher because of the filament winding technology used.

3 design of composite sections of shafts

the central, composite sections of the driveshafts were designed by an iterative method with reference to the technological, dimensional and mechanical limitations given for the shaft. filament winding technology was chosen for the production of the shaft. this technology allows us to specify the angle of winding in the range from 0 to 90 degrees, where 0 degrees represents the axial orientation of the reinforcement. for the estimation of the mechanical properties we used classical lamination theory (clt), which is expected to be sufficient for such a thin-walled shell. clt omits off-plane deformations and treats the shell as a kirchhoff plate. micromechanical models were used to calculate the composite properties from the properties of the matrix and the reinforcement. the following formulas demonstrate the dependence of the mechanical properties of the composite on the orientation of the reinforcement:
\frac{1}{E(\theta)} = \frac{\cos^4\theta}{E_L} + \frac{\sin^4\theta}{E_T} + \left(\frac{1}{G_{LT}} - \frac{2\nu_{LT}}{E_L}\right)\frac{\sin^2 2\theta}{4},

\frac{1}{G(\theta)} = \frac{1}{G_{LT}} + \left(\frac{1+\nu_{LT}}{E_L} + \frac{1+\nu_{TL}}{E_T} - \frac{1}{G_{LT}}\right)\sin^2 2\theta,

where e(θ) is the young's modulus of the composite, g(θ) is the shear modulus of elasticity, ν_lt is the poisson's ratio, θ is the angle between the reinforcement direction and the chosen mean axis of the material, subscript l represents the direction along the reinforcement, and subscript t represents the direction perpendicular to the reinforcement orientation.

the following layups were designed for shaft versions 2 and 3. both versions are expected to be loaded by a torsional moment of 500 nm; this value is given as the maximal one. no bending moment is applied. the torsional moment can cause three types of failure. the first is shear failure of the composite shell, which is predicted by puck's failure criterion; this criterion is applied to the results of the clt. the second is loss of wall stability, which can occur if the wall bending stiffness is not sufficient. the third case of failure is failure of the adhesive bond between the metallic flanges and the composite section; this is analysed separately. the loss of wall stability caused by the torsional moment is analysed by the following donnell-type formula for a thin orthotropic cylinder,

t = 0.747\,\frac{\left(E_x E_y^3\right)^{1/4} e^{9/4}}{l^{1/2}\, r_0^{3/4}\left(1 - \nu_{xy}\nu_{yx}\right)^{5/8}},

where e_x and e_y are the young's moduli, e is the tube wall thickness, r_0 is the radius of the tube midplane, l is the length of the tube, ν is the poisson's ratio, and t is the unit shear force calculated from the torsional moment and the tube dimensions; subscript x represents the tube axial direction and subscript y represents the tube tangential direction.

fig. 2: layups of the carbon composite sections of shaft versions 2 and 3; hsc means high-strength carbon (also called standard-modulus carbon) and hmc means high-modulus carbon

fig. 3: coordinate system used for stress analysis

the critical torsional moment calculated from this formula for tube versions 2 and 3 is 429 nm and 1605 nm, respectively. the shear strength was analysed by the clt software called lamiex v3, which calculates the stresses and deformations of a laminate element. puck's failure criterion was applied to the calculated stresses to check whether failure occurs in the laminate when the maximal torsional moment is applied. this software also calculates the effective moduli of elasticity, which can then be used for simplified calculations. we used these values to estimate the bending and torsional stiffnesses of all the shafts. the stress distribution in the laminates is presented in figs. 4 and 5.

fig. 4: stress distribution (axial stress σ_x, tangential stress σ_y and shear stress τ_xy through the wall thickness h) in the composite part of shaft version 2. a torsional moment of 500 nm is applied; subscript x means the axial direction, subscript y the tangential direction

fig. 5: stress distribution in the composite part of shaft version 3. a torsional moment of 500 nm is applied; subscript x means the axial direction, subscript y the tangential direction
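as a numerical illustration of the two formulas above, the off-axis moduli can be tabulated over the winding angle. the ply constants in the sketch below are generic high-strength carbon/epoxy values chosen for illustration, not the data of the shafts themselves.

#include <cmath>
#include <cstdio>

const double PI = 3.14159265358979323846;

// off-axis engineering constants of a unidirectional ply (clt)
struct Ply { double EL, ET, GLT, nuLT; };          // pa, pa, pa, -

double E_theta(const Ply& p, double th) {
    const double c = std::cos(th), s = std::sin(th);
    const double s2t = std::sin(2.0 * th);
    const double inv = std::pow(c, 4) / p.EL + std::pow(s, 4) / p.ET
                     + (1.0 / p.GLT - 2.0 * p.nuLT / p.EL) * s2t * s2t / 4.0;
    return 1.0 / inv;
}

double G_theta(const Ply& p, double th) {
    const double s2t = std::sin(2.0 * th);
    const double nuTL = p.nuLT * p.ET / p.EL;      // reciprocity relation
    const double inv = 1.0 / p.GLT
                     + ((1.0 + p.nuLT) / p.EL + (1.0 + nuTL) / p.ET
                        - 1.0 / p.GLT) * s2t * s2t;
    return 1.0 / inv;
}

int main() {
    // assumed generic high-strength carbon/epoxy ply, not the shaft data
    const Ply hs{130e9, 9e9, 5e9, 0.3};
    for (int deg = 0; deg <= 90; deg += 15) {
        const double th = deg * PI / 180.0;
        std::printf("theta = %2d deg: E = %6.1f gpa, G = %5.1f gpa\n",
                    deg, E_theta(hs, th) / 1e9, G_theta(hs, th) / 1e9);
    }
    return 0;
}

the run shows the trade-off exploited in the shaft design: the axial modulus drops quickly with the winding angle, while the shear modulus peaks near 45 degrees.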
puck's criterion is one of the most comprehensive strength criteria for the analysis of composite materials. this criterion distinguishes between the failure of fibres (ff) and interfibre failure (iff) of the matrix or of the matrix-fibre interface. puck's criterion passes when its values for all layers, both for ff and for iff, are less than or equal to 1. details about puck's criterion can be found in the literature [2, 3].

the third failure case is failure of the adhesive bond between the composite and the metallic part. this bond was accomplished with the epoxy adhesive spabond 345. we used an fem analysis to verify that no failure occurs in the adhesive. the value of 8 mpa is taken as the allowable shear stress in the adhesive; the producer specifies a maximal stress of 35 mpa, and this value is not exceeded even in the stress peaks. the fea was accomplished only for version 3, because steeper peaks are expected for this torsionally less stiff shaft. fig. 6 represents a section of the fe mesh. the model of the metallic flange was simplified.

fig. 6: model of the adhesive bond

fig. 7: distribution of the shear stress in the z-x direction

fig. 8: distribution of the shear stress in the y-z direction

4 conclusion

table 1 summarizes the results for all three versions of the driveshaft. the first, all-metal shaft shows a medium longitudinal stiffness, the highest torsional stiffness, and furthermore the highest weight of all the shafts. these results are disadvantageous. shaft version 2 has a relatively low axial stiffness and a still too high torsional stiffness. shaft version 3 fulfills all the conditions and was chosen for future tests.

table 1: comparison of main properties of the designed shafts

                                        version 1   version 2   version 3
                                        al alloy    hsc/epoxy   hsc+hmc/epoxy
longitudinal young's modulus [gpa]      73.0        63.8        182.7
transversal young's modulus [gpa]       73.0        9.8         39.0
shear modulus of elasticity [gpa]       28.0        21.5        5.1
longitudinal stiffness (ei) [n·mm²]     4.4×10¹⁰    4.3×10¹⁰    1.1×10¹¹
torsional stiffness (gj) [n·mm²]        3.3×10¹⁰    2.9×10¹⁰    5.9×10⁹
mass [g]                                1564        923         916

references

[1] rastogi, n.: design of composite driveshafts for automotive applications. 2004-01-0485, sae technical paper series.
[2] cuntze, r. g., freund, a.: “the predictive capability of failure mode concept-based strength criteria for multidirectional laminates.” composites science and technology, vol. 64 (2004), p. 343–377.
[3] puck, a., schürmann, h.: “failure analysis of frp laminates by means of physically based phenomenological models.” composites science and technology, vol. 58 (1998), p. 1045–1067.
[4] puck, a., kopp, j., knops, m.: “guidelines for the determination of the parameters in puck's action plane strength criterion.” composites science and technology, vol. 62 (2002), p. 371–378.
[5] agarwal, b. d., broutman, l. j.: vláknové kompozity, sntl, 1987.
[6] gay, d.: matériaux composites, hermes, 1997.
[8] hoskins, b. c., baker, a. a.: composite materials for aircraft structures, american institute of aeronautics and astronautics, inc., 1986.
[9] kolář, v., němec, i., kanický, v.: fem – principy a praxe metody konečných prvků, computer press, 1997.
[10] uher, o.: mathematical modeling of behavior of filament wound composite structures, čvut, 2002.

ing. robin poul
e-mail: poulrobin@seznam.cz

ing. pavel růžička

doc. ing. d. hanus, csc.
department of automotive and aerospace engineering
ing. karel blahouš
department of mechanics

czech technical university in prague
faculty of mechanical engineering
technická 4
166 07 prague 6, czech republic

fractality of electrostatic microdischarges on the surface of polymers

t. ficker, v. kapička, j. macur, p. slavíček, p. benešovský

ramified lichtenberg figures caused by electrostatic microdischarges on the surface of polymeric polyethylene terephthalate have been studied. they occurred as a consequence of the previous electret forming of the polymeric sheets and were initiated in the air gap between the grounded electrode and the polymeric sheets. multifractal image analysis was used to determine the fractal dimensions of the lichtenberg patterns in dependence on the loading voltages used for electret forming.

keywords: electrostatic microdischarges, lichtenberg figures, multifractal formalism, fractal dimension, electret saturated states.

1 introduction

electrostatic erosive microdischarges accompany the triboelectric charging of various objects during their industrial production as well as during their use in everyday life. it is especially polymeric solids that are liable to surface charge accumulation, to such an extent that they can even cause electric breakdown of the adjacent gas environment. usually, this phenomenon is undesirable because of its disruptive effects in industrial production, laboratory research, and civil working activities. the study of these surface microdischarges is therefore useful from various points of view.

surface microdischarges often assume the form of branched channel structures whose particular channels represent streamer discharges propagating along the surface of the object. such branched patterns can be visualised by using photographic or powder techniques. these visible tracks are called lichtenberg figures and have often been used to study gaseous discharges on the surface of dielectrics [1–6]. to visualise lichtenberg figures, it was necessary to choose a transparent, highly resistive material in the form of thin layers (foils). polyethylene terephthalate (pet) seems to be a convenient material for such a purpose. it is in broad use in industrial, academic, as well as civil applications. pet is a hard, stiff, strong, dimensionally stable material that absorbs very little water. it has good gas barrier properties and good chemical resistance except to alkalis. its crystallinity varies from amorphous to fairly highly crystalline; it can be highly transparent and colourless, but thicker sections are usually opaque. it is widely known as a biaxially oriented and thermally stabilised film. it is produced in various modifications according to its destination: mylar® films are used for capacitors, graphics, photographic film base, recording tapes, etc. other modifications are used as fibres in a wide range of textiles (dacron®, trevira®, tyrelene®), in the food industry (pet bottles) and in the electrical industry (for insulation). nevertheless, its probably most important feature, investigated intensively in academic research, is the ability to conserve an electric charge for a very long time. this feature ranks it among the high-quality electrets.

the charging of electrets (forming electret states) is performed in various ways, the most frequent of which involves inserting an electret foil between two parallel electrodes loaded by a sufficiently high electric field. the process of electret forming may be facilitated by increasing the ambient temperature up to the point of glass transition of the polymer (~85 °c for pet). however, a good electret state can be reached at normal room temperature, too. as soon as the charged electret foil is separated from the electrode system, electrostatic microdischarges may cross the air gap between the charged foil and the grounded electrode to neutralise the electret surface charge. in this way the latent lichtenberg figures are drawn into the background electret charge. these structures may be visualised by using the powder technique and digital projection. after digital filtering of the background electret charge, only the branched patterns of the streamer channels remain on the surface (figs. 1–3), and these filtered structures (true lichtenberg figures) may be studied from the morphological point of view. for example, by seeking the relation between the fractal dimension of the lichtenberg figures and the stage of the initial electret state, it is possible to answer the question whether the morphology of the lichtenberg figures is a direct consequence of the initial electret state or whether they are caused by a purely stochastic process.

2 experimental arrangement

highly resistive polymeric sheets of amorphous polyethylene terephthalate were carefully cleaned with ethanol and then inserted between two short-circuited copper plates for 24 hours to eliminate surface charges. the pet sheets, 0.180 mm in thickness, were pressed between flat bronze electrodes of diameters d1 = 20 mm and d2 = 40 mm. the smaller electrode was loaded with a negative electric potential of −8.5 kv while the larger electrode was grounded. the application time of the hv was chosen in intervals from several minutes up to hundreds of hours.

3 multifractal analysis

although the method of multifractal analysis has been described in our preceding paper in this issue, we would like to repeat the main features of the method. multifractal calculations [7–11] have represented a fundamental aid for geometrical investigations of complex disordered systems since the early eighties. multifractal analysis – in the form of the box-counting method – assumes the studied complex system to be embedded in e-dimensional euclidean space.
1–3) and these filtered structures (true lichtenberg figures) may by studied from the morphological point of view. for example, seeking for the relation between the fractal dimension of lichtenberg figures and the stage of initial electret state, it is possible to answer the question whether the morphology of the lichtenberg figures is a direct consequence of the initial electret state or whether they are caused by a purely stochastic process. 2 experimental arrangement highly resistive polymeric sheets of amorphous polyethylene terephthalate were carefully cleaned with ethanol and then inserted between two short-circuited copper plates for 24 hours to eliminate surface charges. the pet sheets 0.180 mm in thickness were pressed between flat bronze electrodes of diameters �1 � 20 mm and �2 � 40 mm. the smaller electrode was loaded with the negative electric potential of �8.5 kv while the larger electrode was grounded. the time application of hv was chosen in the intervals of several minutes up to hundreds of hours. 3 multifractal analysis although the method of multifractal analysis has been described in our preceding paper in this issue, yet, we would like to repeat main features of the method. multifractal calculations [7–11] have represented a fundamental aid for geometrical investigations of complex disordered systems since the early eighties. multifractal analysis – in the form of box-counting method – assumes the studied complex system to be embedded in e-dimensional euclidean space. this euclidean space is partitioned into a grid with the 22 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 44 no. 4/2004 fractality of electrostatic microdischarges on the surface of polymers t. ficker, v. kapička, j. macur, p. slavíček, p. benešovský ramified lichtenberg figures caused by electrostatic microdischarges on the surface of polymeric polyethylene terephthalate have been studied. they occurred in consequence of the previous electret forming of the polymeric sheets and were initiated in the air gap between the grounded electrode and the polymeric sheets. multifractal image analysis was used to determine the fractal dimensions of the lichtenberg patterns in dependence on the loading voltages used for electret forming. keywords: electrostatic microdischarges, lichtenberg figures, multifractal formalism, fractal dimension, electret saturated states. same dimension e. the basic cell of the grid is of a linear size �. using probability moments pi a partition sum mq is defined on the grid m q pi q i i ( , ) ( )� �� � � 1 , p n n i i( )� � � � � � � , pi i i ( )� � � � 1 1 , q �� �( , ) (1) where ni is a number of points in the i-th cell and n represents a number of all points of the complex system. the spectrum of generalised dimensions dq is usually used for the purpose of the analysis d m q q q q q � � lim ln ( , ) ( * ) ln* � � � �1 . (2) the complex systems under investigation are in the form of graphical bitmap files that were created by means of a scanner. in this way the object is converted into a set of graphical pixels that are covered with a two-dimensional grid with the basic cell �. on this grid the partition sum (1) is determined. to avoid a dependence on the orientation of the grid, several positions of the grid in the euclidean space are realised and an average partition sum m qq � ( , )� is calculated m m qq j j ( ) ( , )� � � � � � � 1 2 1 2 . 
the averages \bar{M}_q(\varepsilon) are determined for a series of \varepsilon-grids and the slopes within the coordinate system (\ln \varepsilon, \ln \bar{M}_q(\varepsilon)) are estimated using linear regression. the slopes divided by (q - 1) give the generalised dimensions D_q. an object which produces identical D_q for all q is called a ‘monofractal’, while different D_q values indicate a ‘multifractal’ object.

4 results and discussion

in this study 150 pet samples were investigated. three different forming voltages (8 kv, 4 kv, 2 kv) and ten different forming intervals (480 min., 240 min., 120 min., 60 min., 30 min., 15 min., 5 min., 1 min., 0.5 min., 0.25 min.) were used to create the corresponding electret states, each of which was represented mostly by five pet samples. however, some of the samples were removed from the statistical ensemble because their lichtenberg structures were not developed, or because of other experimental problems. the remaining 110 samples were analysed and their fractal dimensions determined.

table 1: results of experiments (fractal dimensions d)

t [min.]     8 kv      4 kv      2 kv
480          –         1.384     1.253
240          –         1.387     1.404
120          1.404     1.296     1.205
60           1.505     –         1.260
30           1.522     1.214     1.133
15           1.613     1.175     1.214
5            1.514     1.267     1.098
1            1.494     1.268     discharges disappear
0.5          1.290     1.204     –
0.25         –         1.085     –

average dimensions d_s of the saturated states and their probability errors: 1.477 ± 0.026 (8 kv), 1.253 ± 0.022 (4 kv), 1.224 ± 0.025 (2 kv)

fig. 1: lichtenberg figures generated at 8 kv and 30 min. (d = 1.486, sample no. 27)

the experiments were performed at normal atmospheric conditions with small variations (less than 5 %) around the average values: p_0 = 102.19 kpa, t_0 = 24.31 °c, rh_0 = 69.7 %. the processing of one typical sample (cleaning, conditioning, forming, separating, developing, digitising, filtering, analysing) took about 30 hours.

comparing our experimental conditions (forming electric fields and times) with those normally used [12], it is possible to assume that for all our combinations of voltages and times (table 1) the saturated electret states were reached. these saturated electret states are dependent only on the forming voltages u and not on the times t, since the chosen times (t ≥ 0.25 min.) are sufficiently long for reaching electret saturation with pet samples at 2 kv and higher voltages (u ≥ 2 kv). therefore, if the fractal dimension d of the lichtenberg figures arising from the saturated electret states is dependent on the stage of saturation, then the fractal dimension d should be dependent on the forming voltages u and has to show an increasing dependence d(u). this hypothesis has proved to be sound and is illustrated by figs. 4–6, in which the increasing linear dependence d = a·u + b is well recognisable. in addition, if our forming times are sufficiently long for reaching the saturation states, the average dimensions d_s calculated from the values belonging to the same columns of table 1, i.e. for the voltages of 8 kv, 4 kv, and 2 kv (dimensions 1.477, 1.253 and 1.224), also have to be an increasing function of voltage, d_s(u), which is documented in fig. 7.

fig. 2: lichtenberg figures generated at 4 kv and 30 min. (d = 1.246, sample no. 86)

fig. 3: lichtenberg figures generated at 2 kv and 60 min. (d = 1.225, sample no. 131)
in addition, this trend of the branching structures with increasing voltage is visible in figs. 1–3, which present three different lichtenberg figures in dependence on voltage (the forming times were chosen sufficiently long to reach the saturated electret states for the respective voltages). all these experimental findings lead us to the opinion that the morphology of the lichtenberg figures (i.e. their fractal dimension) is not a consequence of purely stochastic processes, but rather is dependent on the stage of the electret saturation, which is determined solely by the forming voltages u, provided sufficiently long forming times have been used.

5 conclusions

the performed experiments have shown that the branching of lichtenberg figures caused by electrostatic microdischarges on the surface of polyethylene terephthalate is not quite a stochastic phenomenon, but rather a process that is governed by the stage of electret forming. since the attainability of a saturated electret state is dependent on the forming voltage u and time t, the fractal dimension d of the corresponding lichtenberg figures should be a function of these parameters, i.e. d(u, t). nevertheless, for sufficiently long times t > t_min the dimension d is a function of voltage only, i.e. d(u). it seems natural that the d(u) function cannot increase its values to infinity, but rather should be restricted by some maximum d_max and minimum d_min values characterising the particular electret (0 ≤ d_min < d_max ≤ 2). however, a verification of this idea would require further experiments.

acknowledgment

this work was supported by the grant agency of the czech republic under grant no. 202/03/0011.

fig. 4: the fractal dimension d in dependence on the forming voltage u for forming time t = 5 min.

fig. 5: the fractal dimension d in dependence on the forming voltage u for forming time t = 15 min.

fig. 6: the fractal dimension d in dependence on the forming voltage u for forming time t = 30 min.

fig. 7: the average fractal dimension d_s in dependence on the forming voltage u

references

[1] morris a. t.: “heat developed and powder lichtenberg figures and the ionization of dielectric surfaces produced by electrical impulses.” br. j. appl. phys., vol. 2 (1951), p. 98–109.
[2] bertein h.: “charges on insulators generated by breakdown of gas.” j. phys. d, vol. 6 (1973), p. 1910–1916.
[3] murooka y., koyama s.: “a nanosecond surface discharge study in low pressures.” j. appl. phys., vol. 50 (1979), p. 6200–6206.
[4] takahashi y., fujii h., wakabayashi s., hirano t., kobayashi s.: “discharges due to separation of a corona-charged insulating sheet from a grounded metal cylinder.” ieee trans. el. insul., vol. 24 (1989), p. 573–580.
[5] niemeyer l.: a fractal lichtenberg figure. 7th international symp. on high voltage engineering, dresden, 1991, p. 937–938.
[6] femia n., lupo g., tucci v.: fractal characterization of lichtenberg figures: a numerical approach. 7th international symp. on high voltage engineering, dresden, 1991, p. 921–923.
[7] kudo k.: “fractal analysis of electrical trees.” ieee trans. diel. el. insul., vol. 5 (1998), p. 713–727.
[8] feder j.: fractals. plenum, new york, 1988.
[9] ficker t., druckműller m., martišek d.: “unconventional multifractal formalism and image analysis of natural fractals.” czech. j. phys., vol. 49 (1999), p. 1445–1459.
[10] ficker t.: “normalized multifractal spectra within the box-counting method.” czech. j. phys., vol. 50 (2000), p. 389–403.
[11] losa g. a., nonnenmacher t. f., weibel e. r.: fractals in biology and medicine. birkhäuser, basel, 1993.
[12] belana j., mudarra m., calaf j., canadas j. c., menéndez e.: “tsc study of the polar and free charge peaks of amorphous polymers.” ieee trans. on el. insul., vol. 28 (1993), p. 287–293.

assoc. prof. rndr. tomáš ficker, drsc.
phone: +420 541 147 661
e-mail: fyfic@fce.vutbr.cz
department of physics

assoc. prof. rndr. jiří macur, csc.
phone: +420 541 147 249
e-mail: macur.j@fce.vutbr.cz
department of informatics

faculty of civil engineering
university of technology
žižkova 17
662 37 brno, czech republic

prof. rndr. vratislav kapička, drsc.

rndr. pavel slavíček, ph.d.
phone: +420 549 496 830

department of physical electronics
masaryk university of brno
faculty of science
kotlářská 2
616 00 brno, czech republic

mgr. petr benešovský
department of physics
faculty of civil engineering
university of technology
žižkova 17
662 37 brno, czech republic

evaluation of audio compression artifacts

m. herrera martinez

this paper deals with the subjective evaluation of audio-coding systems. from this evaluation, it is found that, depending on the type of signal and the algorithm of the audio-coding system, different types of audible errors arise. these errors are called coding artifacts. although three kinds of artifacts are perceivable in the auditory domain, the author proposes that in the coding domain there is only one common cause for the appearance of the artifacts: inefficient tracking of transient-stochastic signals. for this purpose, state-of-the-art audio coding systems use a wide range of signal processing techniques, including the application of the wavelet transform, which is described here.

keywords: audio-coding, artifacts, wavelet transform, psychoacoustics, orthonormal transforms.

this text was a part of the international conference poster 2006, which was held at the faculty of electrical engineering, ctu in prague.

1 introduction

information technology has seen big advances in the audio data storage and transmission field. in the early 1980s the cd (compact disc) was developed by philips in the netherlands, implementing a storage solution based on an optical laser and a digital representation. however, data transmission bounds have demanded lower transmission rates, and therefore compression algorithms for reducing the data stream without significantly distorting the signal. in 1987, the fraunhofer institute began developing a perceptual compression algorithm, later standardized as mpeg audio, based on perceptual models of the hearing system using masking phenomena. the quantization noise introduced by these coding systems, especially when coding at low bit rates, gives rise to audible distortion errors, known as artifacts. subjective evaluation has led to a somewhat blurred classification of these artifacts. one type of artifact – pre-echo – is dealt with in this paper. pre-echo cancellation is discussed, followed by a wavelet transform technique for this purpose, as well as some mathematical considerations about the transform. hybrid coders, making use of the fft or dct for the quasi-periodic components of the signal and the dwt for the transient attacks of the signal, seem to be, in the author's opinion, the right direction for further research.

2 two psychometric methods for evaluating coding systems

dbts and sr are two psychometric methods that have been tested for the subjective evaluation of audio codecs. the results from these tests have been published in [1][2], together with a description of the tests, the excerpts and the results. dbts is a psychometric method which introduces the reference signal: the listener compares the coded signal with the reference, while in sr the reference is not introduced. here we show the anova tables, which show that the dbts method is stricter than sr.

table 1: anova results for the dbts methodology

source of variation   deg. of freedom   sum of squares   mean square   variance ratio (f)   probability
factor a              5                 109.8867         21.9773       59.0312              p < 0.05
factor b              6                 12.0919          2.0153        5.4131               p < 0.05
factor a×b            30                47.5625          1.5854        4.2584               p < 0.05
error                 840               312.7176         0.3723
total                 881               482.2587

table 2: anova results for the sr methodology

source of variation   deg. of freedom   sum of squares   mean square   variance ratio (f)   probability
factor a              5                 7.1488           1.4298        5.7146               p < 0.05
factor b              6                 9.6904           1.6151        6.4552               p < 0.05
factor a×b            30                7.4262           0.2475        1.12                 p < 0.05
error                 966               241.6617         0.2502
total                 1007              265.9271
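the degrees of freedom in both tables correspond to a balanced two-factor design: factor a with 6 levels, factor b with 7 levels, and 21 grades per cell for dbts (6 · 7 · 21 − 1 = 881) or 24 grades per cell for sr (6 · 7 · 24 − 1 = 1007). a minimal sketch of how such an anova table is computed follows; the data layout is an assumption for illustration, the grades themselves being those of the listening tests.

#include <vector>

// balanced two-way anova with interaction, as in tables 1 and 2.
// y[i][j][k]: grade k for level i of factor a and level j of factor b.
struct AnovaRow { double ss, ms, f; long df; };

void anova2(const std::vector<std::vector<std::vector<double>>>& y,
            AnovaRow& A, AnovaRow& B, AnovaRow& AB, AnovaRow& E) {
    const long a = long(y.size()), b = long(y[0].size()),
               r = long(y[0][0].size());
    std::vector<std::vector<double>> cell(a, std::vector<double>(b, 0.0));
    std::vector<double> ma(a, 0.0), mb(b, 0.0);
    double grand = 0.0;
    for (long i = 0; i < a; ++i)
        for (long j = 0; j < b; ++j) {
            for (long k = 0; k < r; ++k) cell[i][j] += y[i][j][k] / r;
            ma[i] += cell[i][j] / b;               // level means of factor a
            mb[j] += cell[i][j] / a;               // level means of factor b
            grand += cell[i][j] / (a * b);         // grand mean
        }
    A  = {0.0, 0.0, 0.0, a - 1};
    B  = {0.0, 0.0, 0.0, b - 1};
    AB = {0.0, 0.0, 0.0, (a - 1) * (b - 1)};
    E  = {0.0, 0.0, 0.0, a * b * (r - 1)};
    for (long i = 0; i < a; ++i)
        A.ss += double(b * r) * (ma[i] - grand) * (ma[i] - grand);
    for (long j = 0; j < b; ++j)
        B.ss += double(a * r) * (mb[j] - grand) * (mb[j] - grand);
    for (long i = 0; i < a; ++i)
        for (long j = 0; j < b; ++j) {
            const double dev = cell[i][j] - ma[i] - mb[j] + grand;
            AB.ss += double(r) * dev * dev;
            for (long k = 0; k < r; ++k) {
                const double e = y[i][j][k] - cell[i][j];
                E.ss += e * e;                     // within-cell (error) ss
            }
        }
    E.ms = E.ss / E.df;
    A.ms  = A.ss / A.df;   A.f  = A.ms / E.ms;     // variance ratios (f)
    B.ms  = B.ss / B.df;   B.f  = B.ms / E.ms;
    AB.ms = AB.ss / AB.df; AB.f = AB.ms / E.ms;
}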
table 1: anova results for dbts methodology

source of variation | degrees of freedom | sum of squares | mean square | variance ratio (f) | probability
factor a            | 5                  | 109.8867       | 21.9773     | 59.0312            | p < 0.05
factor b            | 6                  | 12.0919        | 2.0153      | 5.4131             | p < 0.05
factor a×b          | 30                 | 47.5625        | 1.5854      | 4.2584             | p < 0.05
error               | 840                | 312.7176       | 0.3723      |                    |
total               | 881                | 482.2587       |             |                    |

table 2: anova results for sr methodology

source of variation | degrees of freedom | sum of squares | mean square | variance ratio (f) | probability
factor a            | 5                  | 7.1488         | 1.4298      | 5.7146             | p < 0.05
factor b            | 6                  | 9.6904         | 1.6151      | 6.4552             | p < 0.05
factor a×b          | 30                 | 7.4262         | 0.2475      | 1.12               | p < 0.05
error               | 966                | 241.6617       | 0.2502      |                    |
total               | 1007               | 265.9271       |             |                    |

3 artifacts from audio compression

subjective tests performed on coded-audio signals show that individual codecs vary considerably in performance (this is validated by the anova method), and also differ in performance depending on the type of signal that is used for the test. coding signals with a strongly aperiodic character, called "attack signals" or "signals with transient behaviour", leads to an artifact known as preecho. similarly, speech signal coding introduces to the signal an artifact known as reverberation. sometimes, when coding at low bit rates, variations in the masking threshold from one frame to the next may lead to different bit assignments, and as a result some groups of spectral coefficients can appear or disappear [3]. preecho is analyzed here, and some techniques for canceling it are described. when describing artifact generation, researchers explain that a given artifact originates because of incorrect bit assignment from frame to frame, due to dispersion of the signal energy, which spreads out to neighbouring frames and even subbands. the relations between these dispersion lengths give rise to various perceptual artifacts. in the time domain, it is signals with a transient-stochastic character that are affected. percussive signals such as castanets, cymbals, clicks, claps, drums, etc. give rise to preecho when coded. plosive phonemes are stochastic speech signals with a noisy character arising from turbulent air streaming in the formation of some consonants. when coding these signals, which of course occur together with vowel sounds of quasi-periodic character, reverberation is perceived. when coding a signal which consists not only of the components explained above, but which has a frequency representation that gives strong variations of the masking threshold from one frame to the next, the birdies artifact is perceived.

3.1 origin of compression artifacts

the general structure of an audio-coder is given in [4]. there are three types of audio-coding systems, which differ according to the way they feed the input signal into the psychoacoustic model. the first type are transform coders, where samples from the input signal are transformed to the frequency domain.
the second type are subband coders, where the transformation is performed, and then the masking thresholds are calculated for each subband. the third type are so-called parametric coders, in which a definite type of parametrization is observed. some authors observe that subband coders give better results when tracking transient signals, but the fixed window length that they apply does not track these signals accurately. for this purpose a wide range of techniques have been implemented, as will be described below.

4 audio critical material selection

during this work, the author designed a program in the matlab environment to describe the energy of the signal in each of the subbands that subband coders use. signals with a transient character show dispersion of their energy through the neighbouring subbands. therefore, for selecting audio material suitable for the subjective assessment of audio codecs, the program provides an estimation of which signals will behave critically and which will not. further research needs to be done to determine the relations between the subband representation of a particular signal and the artifact produced while it is compressed with a definite algorithm for transient tracking. stating the relations between the power spectrum levels inside each subband should give a cue to further research. figure 2 shows the energy allocation of the power spectrum density of the castanets signal.

5 current state-of-the-art for transient audio signal detection

digital signal processing clearly has some potential for transient detection. this includes modifications of the discrete cosine transform, dct, with variable block lengths, which track transient signals more accurately. the discrete wavelet transform, dwt, is also a powerful tool for transient tracking. some implementations use a hybrid dwt/dct. other approaches combine non-linear transform coding and structured approximation techniques, together with hybrid modeling of the signal class under consideration. techniques with non-uniform lapped transforms are also used; here, a non-uniform filter bank is obtained by joining uniform cosine modulated filter banks using a transition filter. audio watermarking, in which a watermark signal modifies the statistical characteristics of audio signals, in particular their stationarity, is also used [5].

fig. 1: a pre-echo artifact in a castanet excerpt [3]
fig. 2: power spectrum density of a castanet audio signal

5.1 application of the wavelet transform while tracking transients

the representation of the signal in the frequency domain in earlier coders, such as mpeg-1 layer iii, ogg vorbis and others, was based on the fft or dct. nowadays, applications aimed at transient tracking use hybrid dct/dwt schemes, among others. discarding the noise component, an audio signal can be represented in the following way [6],

x_{\mathrm{ton}} = \sum_{\lambda \in \Lambda} \alpha_\lambda \phi_\lambda , \qquad x_{\mathrm{tran}} = \sum_{\mu \in M} \beta_\mu \psi_\mu , \qquad (1)

where \{\psi_n,\; n = 0, 1, \dots, N-1\} is a wavelet basis and \{\phi_m,\; m = 0, 1, \dots, N-1\} is an mdct basis. the resulting signal is

x = x_{\mathrm{tran}} + x_{\mathrm{ton}} + r . \qquad (2)

daudet et al. [6] describe \Lambda and M as subsets of the index sets, termed significance maps. the residual signal r is not sparse with respect to the two bases considered here. the main idea is that the dct, fft and the other algorithms usually implemented in audio compression are very suitable for analysing and tracking the sinusoids, i.e. the quasi-stationary components of the signal; a toy numerical illustration of this two-part split is sketched below.
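as a rough illustration of the split in eqs. (1) and (2) – not the scheme of [6] itself – one can keep the largest coefficients in a cosine basis as the tonal significance map and the largest wavelet coefficients of the remainder as the transient one. in the sketch below, the plain dct (standing in for a true mdct), the db4 wavelet, the 1 % keep-ratio and the synthetic test signal are all arbitrary choices made for this demonstration:

```python
import numpy as np
from scipy.fft import dct, idct
import pywt

rng = np.random.default_rng(0)
n = 4096
t = np.arange(n) / n
x = np.sin(2 * np.pi * 440 * t)            # tonal, quasi-stationary part
x[2000:2040] += rng.standard_normal(40)    # short transient "attack"

# tonal estimate: the largest dct coefficients form the significance map Lambda
c = dct(x, norm='ortho')
keep = np.abs(c) >= np.quantile(np.abs(c), 0.99)
x_ton = idct(np.where(keep, c, 0.0), norm='ortho')

# transient estimate: the largest wavelet coefficients of the remainder (map M)
arr, sl = pywt.coeffs_to_array(pywt.wavedec(x - x_ton, 'db4', level=5))
mask = np.abs(arr) >= np.quantile(np.abs(arr), 0.99)
x_tran = pywt.waverec(
    pywt.array_to_coeffs(np.where(mask, arr, 0.0), sl, output_format='wavedec'),
    'db4')[:n]

r = x - x_ton - x_tran                     # residual of eq. (2)
```

the residual r collects whatever is sparse in neither basis, in the spirit of the hybrid model quoted above.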
transient tracking is more convenient with the dwt. the dwt, and its ability to localize sharp attacks in time, is connected with the fourier-plancherel transform and the uncertainty principle. further work is being done to apply these algorithms in improving codec performance.

5.2 demonstration of the wavelet transform when solving a transient signal

when castanets, one of the critical material excerpts, is processed by the fft or dct with a fixed window length, the spectrum disperses in such a manner that the bit-assignment derived from the psychoacoustic model is non-efficient, and therefore an audible artifact known as preecho originates. the following figure shows the original castanet signal. the dwt, fft, dct and the other orthonormal transforms perform decomposition of the signal into the decomposition basis. in the case of the fft, the decomposition orthogonal basis is the set of all functions

\phi_n(t) = \frac{1}{\sqrt{N}}\, e^{\,j 2\pi t n / N} , \qquad t = 0, 1, 2, \dots, N-1 , \quad n = 0, 1, 2, \dots, N-1 . \qquad (3)

in the fourier basis, frequency localization is precise, but time localization is poor. the euclidean orthonormal basis, which has the form

e_0 = (1, 0, 0, \dots, 0) , \quad e_1 = (0, 1, 0, \dots, 0) , \quad \dots , \quad e_{N-1} = (0, \dots, 0, 1) , \qquad (4)

unlike the fft, performs precise localization in time, but is poor in frequency. the stft represented a possible solution to the problem. it windows the signal, and therefore gives the possibility to separate the signal into frames and get the frequency representation of these frames separately. however, it still faced the problem that, because of the fixed window length, transient attack signals were tracked inefficiently. the dwt represents a compromise between these two limit representations, and performs good localization both in frequency and in time. signal decomposition into a particular basis can be viewed as a scalar product of the signal with the corresponding element of the basis. mathematically,

(f, g) = \int f(x)\, g(x)\, \mathrm{d}x , \qquad (5)

representing how similar the function f is to the corresponding element g of the orthonormal basis. signal decomposition, mathematically expressed, is a mapping from the space of complex-valued signals to the set of coefficients in which the decomposition is described,

z \in \mathbb{C}^N \;\mapsto\; \left( z(0), z(1), \dots, z(N-1) \right) . \qquad (6)

fig. 3: signal decomposition used in state-of-the-art codecs (signal → noisy, transient and quasi-stationary components)
fig. 4: original castanet signal, critical material excerpt
fig. 5: 1-step decomposition of the signal using the wavelet transform

let us perform a one-step decomposition of a castanet signal with the dwt. after one-step decomposition we achieve two signal components, depicted in fig. 5. let us reconstruct the signal with the coefficients that arose after the one-step decomposition; fig. 6 gives the reconstructed signal. higher levels of signal decomposition, of course, will give more accurate representations of the audio signal, in a similar manner as higher frequency resolution improves the accuracy of the frequency representation of the signal in the fft. the dwt, then, has a hierarchical structure in which the higher the level that the decomposition affords, the longer the hierarchical dwt tree. comparing figures 4 and 6, we see that the reconstruction was successfully performed. now, let us perform a 3-step decomposition. a finite set of coefficients is obtained. coefficient extraction is then performed, and this is presented in fig. 7.
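the decomposition and reconstruction steps just described (and continued below) can be reproduced with, e.g., the pywavelets package; the db2 wavelet and the synthetic click used here are stand-ins for the castanets excerpt, which is not reproduced in this text:

```python
import numpy as np
import pywt

x = np.zeros(1024)
x[500:508] = np.hanning(8)          # a short castanet-like "attack"

# 1-step decomposition into approximation and detail (cf. fig. 5)
ca1, cd1 = pywt.dwt(x, 'db2')

# exact reconstruction from the two components (cf. fig. 6)
assert np.allclose(x, pywt.idwt(ca1, cd1, 'db2')[:len(x)])

# 3-step decomposition: coefficients ca3, cd3, cd2, cd1 (cf. fig. 7)
parts = pywt.wavedec(x, 'db2', level=3)
ca3, cd3, cd2, cd1 = parts

# reconstruct the level-3 approximation and each detail branch (cf. fig. 8)
branches = []
for k in range(len(parts)):
    kept = [p if i == k else np.zeros_like(p) for i, p in enumerate(parts)]
    branches.append(pywt.waverec(kept, 'db2')[:len(x)])

# the branches sum back to the original signal (cf. fig. 9)
assert np.allclose(x, sum(branches))
```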
finally we reconstruct an approximation at level 3 from the wavelet decomposition structure. we perform reconstructions of the detailed coefficients at levels 1, 2 and 3 from the wavelet decomposition structure (fig. 8).

fig. 6: inverse of the direct decomposition of the signal using the coefficients
fig. 7: detailed coefficients at levels 1, 2 and 3 from the wavelet decomposition structure; original signal, ca3, cd3, cd2 and cd1
fig. 8: reconstructed detailed coefficients at levels 1, 2 and 3 from the wavelet decomposition structure; the upper figure is the original signal, followed by the reconstructed signal, and then the coefficients
fig. 9: original and reconstructed signal

the last step is signal reconstruction from the wavelet decomposition structure (fig. 9). transient signal reconstruction shows that the dwt is a suitable method for decomposing transient signals, even performing just a 3-level decomposition. this result shows that a hybrid codec implementing the fft for extracting and processing quasi-stationary signals and the dwt for extracting and processing transient signals is a more suitable algorithm for sound-coding than formerly-used codecs, which tracked signals with fixed-window-length dct or fft transforms.

6 conclusions

psychometric methods were used to evaluate audio-coding systems. dbts and sr were the methods chosen to perform the evaluation. from these tests, the anova validation of the results shows that not only the codec performance but also the characteristics of the signal have a strong impact on the evaluation. signals with a percussive character, such as castanets, cymbals, claps and others, when coded by algorithms which implement the dct and fft for the frequency representation of the signal, show preecho as an auditory artifact produced due to compression. the two other artifacts, while appearing to differ from preecho in the auditory domain, have, in the author's opinion, the same origin: the incorrect bit-allocation of the masking coefficients. this is because the critical signal has a power spectrum which spreads out not only to the two neighbouring frames, but also to the neighbouring bands. the signal criticality can be checked by the program. finally, some state-of-the-art techniques are discussed in order to efficiently track these critical audio signals, giving special attention to the wavelet transform.

acknowledgments

this work has been supported by research project msm 6840770014 "research in the area of prospective information and communication technologies" and by national science foundation grant no. 102/05/2054 "qualitative aspects of audiovisual information processing in multimedia systems".

references
[1] herrera, m.: summary of the subjective evaluation of audio-coding testing at the cvut during the period 2003–2005. in: xi. international symposium of audio and video, krakov (poland), 2005.
[2] husnik, l., herrera, m.: comparison of two methods used for the subjective evaluation of compressed sound signals. in: forum acusticum, budapest, 2005.
[3] aes. tutorial cd-rom, perceptual audio coders, what to listen for. new york, 2002.
[4] herrera, m., dolejsi, p.: subjective evaluation of audio-coding systems. in: internoise 2004, prague, 2004.
[5] larbi, s., jaidane, m.: audio watermarking: a way to stationnarize audio signals. ieee transactions on signal processing, vol. 53 (2005), no. 2, february 2005.
[6] daudet, l., molla, s., torresani, b.: towards a hybrid audio coder. in: proceedings of the international conference on wavelet analysis and applications, february 2004.
[7] http://www.mathworks.com/access/helpdesk/help/toolbox/wavelet/wavelet.htm

marcelo herrera martinez, e-mail: herrerm@feld.cvut.cz, department of radioelectronics, czech technical university in prague, technická 2, 166 27 prague, czech republic

the metric operator and the functional integral formulation of pseudo-hermitian quantum mechanics
h. f. jones

pseudo-hermitian quantum theories are those in which the hamiltonian H satisfies H^\dagger = \eta H \eta^{-1}, where \eta = e^{-Q} is a positive-definite hermitian operator, rather than the usual H^\dagger = H. in the operator formulation of such theories the standard hilbert-space metric must be modified by the inclusion of \eta in order to ensure their probabilistic interpretation. with possible generalizations to quantum field theory in mind, it is important to ask how the functional integral formalism for pseudo-hermitian theories differs from that of standard theories. it turns out that here Q plays quite a different role, serving primarily to implement a canonical transformation of the variables. it does not appear explicitly in the expression for the vacuum generating functional. instead, the relation to the hermitian theory is encoded via the dependence of Z on the external source j(t). these points are illustrated and amplified in various versions of the swanson model, a non-hermitian transform of the simple harmonic oscillator.

keywords: quantum mechanics, functional integral, non-hermitian.

1 introduction
1.1 the metric operator in the schrödinger formulation of pseudo-hermitian quantum mechanics

the recent interest in non-hermitian hamiltonians stems from the work of bender and boettcher [1], who showed numerically that the class of hamiltonians

H = \tfrac{1}{2} p^2 - g\, (ix)^N \qquad (1)

has a completely real spectrum for N \ge 2. they attributed this property to an unbroken pt-symmetry, whereby

x \to -x , \qquad i \to -i . \qquad (2)

a rigorous proof [2] of the reality came a few years later by exploiting the ode-im correspondence, i.e. the correspondence between ordinary differential equations in their different stokes sectors and integrable models. in such cases there exists a similarity transformation from the non-hermitian H to a hermitian h:

h = \rho H \rho^{-1} . \qquad (3)

here \rho is a positive-definite hermitian operator (re)introduced by mostafazadeh [3]. it is related to the Q operator [4], which provides a positive-definite metric for the quantum mechanics governed by H, according to

\rho = e^{-Q/2} . \qquad (4)

it is also useful to introduce its square

\eta = \rho^2 = e^{-Q} . \qquad (5)

from eq. (3), \rho H \rho^{-1} = h = h^\dagger = \rho^{-1} H^\dagger \rho, so

H^\dagger = \rho^2 H \rho^{-2} = \eta H \eta^{-1} . \qquad (6)

this replaces the usual hermiticity requirement on the hamiltonian. H is said to be quasi-hermitian [5], or pseudo-hermitian(1), with respect to \eta. the operator \eta = e^{-Q} is in fact precisely the metric operator occurring in

\langle \psi, A\varphi \rangle_\eta \equiv \langle \psi, \eta A \varphi \rangle , \qquad (7)

because the similarity transformation \tilde{A} = \rho A \rho^{-1}, \tilde\psi = \rho\psi gives \langle \tilde\psi, \tilde{A}\tilde\varphi \rangle = \langle \psi, \rho^\dagger \rho A \rho^{-1} \rho\, \varphi \rangle. here \rho^\dagger \rho = \rho^2, rather than 1, as would be the case if \rho were unitary rather than hermitian, so

\langle \tilde\psi, \tilde{A}\tilde\varphi \rangle = \langle \psi, \eta A \varphi \rangle . \qquad (8)

if the operator \tilde{A} is hermitian then A is pseudo-hermitian, A^\dagger = \eta A \eta^{-1}, and is an observable, with real eigenvalues (the same as those of \tilde{A}).

1.2 functional integral formalism of quantum mechanics

in the functional integral formulation of standard hermitian quantum mechanics, the basic object of interest is the vacuum generating functional

Z[j] = \int [\mathrm{d}\varphi]\, \exp\left\{ -\int \mathrm{d}t \left( \tfrac{1}{2}\dot\varphi^2 + V(\varphi) - j(t)\varphi(t) \right) \right\} , \qquad (9)

from which green functions can be obtained by functional differentiation with respect to j(t). the fundamental question we are asking here is, what is the corresponding expression in pseudo-hermitian quantum mechanics? one might perhaps expect something like

Z[j] = \int [\mathrm{d}\varphi]\, \eta\, \exp\left\{ -\int \mathrm{d}t \left( \tfrac{1}{2}\dot\varphi^2 + V(\varphi) - j(t)\varphi(t) \right) \right\} , \qquad (10)

but this is not what happens. in fact \eta does not appear explicitly in the expression for Z[j]. rather, depending on the
metric chosen in the operator formalism, j will appear differently in the lagrangian. the method we will use to find the correct expression for Z is to use the similarity transformation between our pseudo-hermitian theory and the equivalent hermitian theory, where we know how Z should be written. in this paper we will limit ourselves to a particular soluble model, the swanson model, which can be formulated as a viable quantum theory in a variety of ways (in fact there is a one-parameter family [6] of \eta s), of which we will pick the three simplest. in [7] we treated the two cases Q = Q(x) and Q = Q(p), and in addition the pseudo-hermitian "wrong-sign" quartic oscillator, i.e. eq. (1) for N = 4.

2 Z for various versions of the swanson model

the hamiltonian for this model, first introduced in [8], is

H = \omega\, a^\dagger a + \alpha\, a^2 + \beta\, a^{\dagger 2} ,

where a and a^\dagger are standard lowering and raising operators for a simple harmonic oscillator with unit frequency, and \omega, \alpha and \beta are real parameters. H is non-hermitian for \alpha \ne \beta. in terms of x and p,

H = a\, x^2 + b\, p^2 + c\, \{x, p\} , \qquad (11)

where a = \tfrac{1}{2}(\omega + \alpha + \beta), b = \tfrac{1}{2}(\omega - \alpha - \beta), c = \tfrac{1}{2} i (\alpha - \beta).

2.1 Q = Q(x)

H can be written as

H(x, p) = \left( a - \frac{c^2}{b} \right) x^2 + b \left( p + \frac{c}{b}\, x \right)^2 = \tilde{a}\, x^2 + b\, \tilde{p}^2 \equiv \tilde{h}(x, \tilde{p}) . \qquad (12)

this is a simple harmonic oscillator with frequency \Omega = 2\sqrt{\tilde{a} b}. recall that H(x, p) = e^{Q/2}\, \tilde{h}(x, p)\, e^{-Q/2}. so Q has to satisfy

\tilde{x} = e^{Q/2}\, x\, e^{-Q/2} = x , \qquad \tilde{p} = e^{Q/2}\, p\, e^{-Q/2} = p + \frac{c}{b}\, x , \qquad (13)

which can be achieved by

Q = -\,\frac{i c}{b}\, x^2 . \qquad (14)

note that eq. (13) represents a (complex) canonical transformation between the pairs (x, p) and (\tilde{x}, \tilde{p}). classically Q appears as the active part of the generator F_2(x, \tilde{p}) = x\tilde{p} - \tfrac{1}{2} i Q(x) of this canonical transformation, according to which

\tilde{x} = \frac{\partial F_2}{\partial \tilde{p}} = x , \qquad p = \frac{\partial F_2}{\partial x} = \tilde{p} - \tfrac{1}{2} i\, Q'(x) . \qquad (15)

it is also worth noting that to construct the classical lagrangian corresponding to eq. (11) we have

L = \dot{x} p - H = \frac{\dot{x}^2}{4b} - \tilde{a}\, x^2 - \frac{c}{b}\, x \dot{x} , \qquad (16)

which differs from a normal (scaled) lagrangian for the harmonic oscillator with frequency \Omega only by the total derivative (c/b)\, x\dot{x} = \tfrac{1}{2} i \dot{Q}. our approach will be to start with the naive form for the euclidean Z[0] corresponding to H, verify that this is correct by transforming to its hermitian equivalent, then insert the external source j(t) coupled to the hermitian observable, and finally transform back to obtain the form of Z[j] for the non-hermitian hamiltonian H. in this spirit we suppose that

Z[0] = \int [\mathrm{d}\varphi][\mathrm{d}\pi]\, \exp\left\{ -\int \mathrm{d}t \left( -i\pi\dot\varphi + a\varphi^2 + b\pi^2 + 2c\varphi\pi \right) \right\} , \qquad (17)

in which we have H written in terms of \varphi and \pi. we then complete the square in exactly the same way as in eq. (12), to obtain

Z[0] = \int [\mathrm{d}\varphi][\mathrm{d}\tilde\pi]\, \exp\left\{ -\int \mathrm{d}t \left( -i\tilde\pi\dot\varphi + i\frac{c}{b}\varphi\dot\varphi + \tilde{a}\varphi^2 + b\tilde\pi^2 \right) \right\} , \qquad (18)

with \tilde\pi = \pi + (c/b)\varphi. here the term -i(c/b)\varphi\dot\varphi in the exponent is precisely \tfrac{1}{2}\dot{Q}, which can be neglected under the t integration. this is the only place that Q makes its appearance in this procedure. so

Z[0] = \int [\mathrm{d}\varphi][\mathrm{d}\tilde\pi]\, \exp\left\{ -\int \mathrm{d}t \left( -i\tilde\pi\dot\varphi + \tilde{a}\varphi^2 + b\tilde\pi^2 \right) \right\} = \int [\mathrm{d}\varphi]\, \exp\left\{ -\int \mathrm{d}t \left( \frac{\dot\varphi^2}{4b} + \tilde{a}\varphi^2 \right) \right\} , \qquad (19)

(to be compared with the non-euclidean eq. (16)). now let us couple j to the observable \varphi in eq. (19) and work backwards, to obtain

Z[j] = \int [\mathrm{d}\varphi]\, \exp\left\{ -\int \mathrm{d}t \left( \frac{\dot\varphi^2}{4b} + \tilde{a}\varphi^2 - j\varphi \right) \right\} = \int [\mathrm{d}\varphi][\mathrm{d}\pi]\, \exp\left\{ -\int \mathrm{d}t \left( -i\pi\dot\varphi + a\varphi^2 + b\pi^2 + 2c\varphi\pi - j\varphi \right) \right\} . \qquad (20)

thus functional derivatives \delta/\delta j bring down factors of the observable \varphi, and Q does not appear at all!

2.2 Q = Q(p)

H can equally be written as

H(x, p) = a \left( x + \frac{c}{a}\, p \right)^2 + \left( b - \frac{c^2}{a} \right) p^2 = a\, \tilde{x}^2 + \tilde{b}\, p^2 \equiv \tilde{h}(\tilde{x}, p) . \qquad (21)

this is again a simple harmonic oscillator with the same frequency \Omega, since \tilde{a} b = a \tilde{b}. in this case Q has to satisfy

\tilde{p} = e^{Q/2}\, p\, e^{-Q/2} = p , \qquad \tilde{x} = e^{Q/2}\, x\, e^{-Q/2} = x + \frac{c}{a}\, p , \qquad (22)

which can be achieved by
we then complete the square in exactly the same way as in eq. (12), to obtain z d d t i c b a b[ ] [ ][ ] exp � ~0 2 2� � � � � � � � � � � � ��� � �� �� � �d �� , (18) with � � �� �c b . here the term �i c b ��� in the exponent is precisely 1 2 �q, which can be neglected under the t integration. this is the only place that q makes its appearance in this procedure. so � �� z d d t i a b d t [ ] [ ][ ]exp � ~ [ ]exp � 0 4 2 2 2 � � � � � � � �� � � � � � � � �d d b a� � � � � � � � ! �! � � ! �! �� ~ ,�2 (19) (to be compared with the non-euclidean eq. (16)). now let us couple j to the observable � � � in eq. (19) and work backwards, to obtain z j d t b a j d [ ] [ ]exp � ~ [ � � � � � � � � � ! �! � � ! �! � �� � � � � � d 2 2 4 ][ ]exp [ � ] .d t i a b c j� �� � � �� �� � � � ��� � � � ��� d 2 2 2 (20) thus functional derivatives � �j bring down factors of the observable �, and q does not appear at all! 2.2 q q p� ( ) h can equally be written as h x p a x c a p b c a p ax b p h x ( , ) ~ ( , � � � � � � � � � � � � � � � � � � � � � 2 2 2 2 2 p). (21) this is again a simple harmonic oscillator with the same frequency , since a b a b ~ ~� . in this case q has to satisfy p p e p e x x c a p e x e q q q q � � � � � � � 1 2 1 2 1 2 1 2 , (22) which can be achieved by © czech technical university publishing house http://ctn.cvut.cz/ap/ 37 acta polytechnica vol. 47 no. 2–3/2007 q ic a p� 2. (23) the corresponding classical generating function is f x p xp iq p2 1 2( , ) ( )� � , giving p f x p x f p x iq� � � � � � � � � � 2 2 1 2 . (24) we now mimic this procedure in the functional integral, starting again with � �z d d t i a b c[ ] [ ][ ] exp �0 22 2� � � � ���� � � ��� � � �� � � ��d , (25) and completing the square in the manner of eq. (21) rather than (12). then z d d t i c a a b[ ] [ ][ ]exp ( � � ~ 0 2 2� � � � � � � � � � � �� � � � ��� � �d ��� , (26) where � � �� �c a . again the term i c a ��� in the exponent is just 1 2 �q, and can be dropped under the t integration. so � �� z d d t i a b d t [ ] [ ][ ]exp � ~ [ ]exp � ~ 0 4 2 2 2 � � � � � � �� � � � �� � � � d d b a� � � � � � � � ! �! � � ! �! �� �2 . (27) now we restore j, coupled to the observable � in this hermitian version and work backwards: � �z j d d t i a b j[ ] [ ][ ]exp � ~� � � � ���� � � ��� � �� �� � �d 2 2 . (28) rewriting this in terms of the original field � �� �� c a , the square bracket in the exponent becomes (up to total derivatives) � �� � � � � � � � � � � � � � � � i b a c j j c a b b i c c a j � � �� � � �� � � � � � 2 2 2 1 2 2 � � � � � � � � 2 2 2 2 2 2 2 1 4 4 2 4 b a ab j ic ab j c a b j � � . � � � � (29) integrating over � and rescaling � �� 2b, the final result corresponding to the operator theory with metric given by q(p) is (note that had we coupled j to � in (28), we would have simply reproduced (20)) z j d t a j b ic a j b c j [ ] [ ]exp � � � � � � � � � � � � � d 1 2 1 2 2 2 2 2 2 2 2 2 2 4 2a b � � � � � � � � � � � ! ! � ! ! � � ! ! � ! ! �� (30) again q does not appear explicitly, but now the source j appears in an unfamiliar way, with terms in j�� and j2. as a check of these results let us calculate the expectation values � and � �1 2 from the expression (30). the first is rather trivial: � � � � � � 1 1 2 2 0 0 2 z z j b a ic a j � � � �� , (31) as expected. however, the second check is more interesting: � � 1 2 2 1 2 0 2 1 1 2 2 1 1 2 2 2 � � � � � � � � � � � � z z j j b a ic a a j � � � � � �� � � � � � � � � � � � ic a c a b t t � ). 
� � 2 2 2 1 22 (32) but � � � � � � � � 1 2 1 2 1 2 2 1 1 1 2 1 2 1 2 1 2 � � � � � � � � � e t t e t t t t � � ( ) � � ( ),� �2 1 22 1 2� � � �� � e t tt t (33) giving � � 1 2 4 1 2 1 2� �� � � � a e b et t t t ~ , (34) which is indeed the result to be expected from eq. (27). 2.3 q q x p� �( )2 2 this was in fact the original similarity transformation found by geyer et al. [9], according to which q x p� � ��( )2 2 , with � � � ln . in this case the result of the similarity transformation e e q q� 1 2 1 2� is x x ip p p ix � � � � cosh sinh cosh sinh � � � �� (35) resulting in h x p h x p x p( , ) ( , )� � �� �2 2 , where � � � � � � � � � � 1 2 2 1 2 2 ( ), ( ). (36) 38 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 47 no. 2–3/2007 in the functional integral formalism we start from the hermitian form, and couple j to �, to obtain � �� z j d d t i j d d [ ] [ ][ ]exp � [ ][ ]exp � � � � � � � �� � � � d d �� �� �� � � 2 2 � �� t i a b c j i d t � ( cosh sinh ) [ ]exp � �� � � �� � � � � � � � � � � � � �� 2 2 2 d � � � � � � � 4 2 2 2 2 b a j ic b j b j � � � � � � � � � � �~ sinh cosh � sinh sinh � 4 b � � � � � � � ! �! � � ! �! �� . (37) in this case, the total derivative dropped under dt� was not a multiple of �q. it turns out that this was a special case when q was a quadratic in either x or p. the more general result is that the two lagrangians differ by the time derivative of 1 2 x q x p x p q x p p � � � � ( , ) ( , ) � � � � � � � . we have again checked that we correctly obtain � and � �1 2 from functional derivatives of � �z j . 3 discussion the essential formula is � �� z d d t i h d d t i [ ] [ ][ ]exp � ( , ) [ ][ ]exp � 0 � � � � � �� � � �� � � � � d d� � �� � ���� h( , ) .� (38) here we must write i��� in terms of � and �. then, if possible, we write z[ ]0 in lagrangian form, either in � or in �: � � z d t z d t [ ] [ ]exp ( ) [ ] [ ]exp ( ) . 0 0 � � � � �� �� � � � � d or d � � (39) finally we add �j� or �j� to � and try to work backwards. in this paper we have only used the first option in eq. (39), thinking of quantum fields rather than their conjugate momenta as the relevant objects. however, we could have used the second option in section 2.2, in which case we would have finished with a simple formula like eq. (20) for � �z j , but with the roles of � and � reversed. for the case of the wrong-sign quartic oscillator treated in [7], an equivalent conventional hermitian theory, with a standard kinetic term, is only possible if � �z 0 is expressed in terms of �. in any case we have shown that q does not appear explicitly in the functional integral formalism, on the lines of eq. (10), as might have been naively supposed. instead the choice of metric in the operator formalism is reflected in the j dependence of � �z j . it is interesting to note that in their work on the v x igx� �1 2 2 3 model [10], bender et al. implicitly made the assumption that q does not appear in the functional integral formalism, since their feynman rules, for both the original theory and its hermitian equivalent, were effectively derived from standard functional integrals. in that case q is only known perturbatively, and the series for � involves complicated derivative couplings. the successful construction of the equivalent hermitian theory to that with a �gz 4 potential raises hopes that a similar construction, within the functional integral framework, might be possible for the corresponding �g�4 field theory. 
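as an aside, the reality of the swanson spectrum can be checked numerically by diagonalising H = \omega a^\dagger a + \alpha a^2 + \beta a^{\dagger 2} in a truncated fock basis. the sketch below is not part of the paper; the parameter values are arbitrary illustrative choices with \alpha \ne \beta, and only the low-lying levels (well below the truncation) are meaningful:

```python
import numpy as np

w, alpha, beta = 2.0, 0.4, 0.1      # illustrative values only, alpha != beta
nmax = 200                           # fock-space truncation

n = np.arange(nmax)
a = np.diag(np.sqrt(n[1:]), k=1)     # lowering operator in the fock basis
H = w * a.T @ a + alpha * a @ a + beta * a.T @ a.T

ev = np.sort(np.linalg.eigvals(H))   # complex sort: real part, then imaginary
omega = np.sqrt(w**2 - 4 * alpha * beta)

print(np.max(np.abs(ev[:20].imag)))       # ~ 0: low-lying spectrum is real
print(np.diff(ev[:20].real)[:5], omega)   # equal spacing ~ sqrt(w^2 - 4ab)
```

the observed level spacing agrees with the oscillator frequency \Omega = 2\sqrt{\tilde{a}b} = \sqrt{\omega^2 - 4\alpha\beta} obtained from the completion of the square in section 2.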
some tentative steps were made in this direction in [11], but the generalization seems far from straightforward.

remarks

(1) in this context, where \eta is a positive-definite operator, the first term may be preferable. the pt invariance of the original class of hamiltonians (1) can be expressed as pseudo-hermiticity with respect to the indefinite operator P.

references
[1] bender, c., boettcher, s.: real spectra in non-hermitian hamiltonians having pt symmetry. phys. rev. lett., vol. 80 (1998), p. 5243–5246.
[2] dorey, p., dunning, c., tateo, r.: spectral equivalences, bethe ansatz equations, and reality properties in pt-symmetric quantum mechanics. j. phys., vol. a34 (2001), p. 5679–5704.
[3] mostafazadeh, a.: pseudo-hermiticity versus pt symmetry: the necessary condition for the reality of the spectrum of a non-hermitian hamiltonian. j. math. phys., vol. 43 (2002), p. 205–214; exact pt-symmetry is equivalent to hermiticity. j. phys., vol. a36 (2003), p. 7081–7091.
[4] bender, c., brody, d., jones, h.: complex extension of quantum mechanics. phys. rev. lett., vol. 89 (2002), 270401(1–4); ibid. vol. 92 (2004), 119902.
[5] scholtz, f., geyer, h., hahne, f.: quasi-hermitian operators in quantum mechanics and the variational principle. ann. phys., vol. 213 (1992), p. 74–101.
[6] musumbu, d., geyer, h., heiss, w.: choice of a metric for the non-hermitian oscillator. arxiv: quant-ph/0611150, to be published in j. phys. a.
[7] jones, h., rivers, r.: disappearing operator. phys. rev., vol. d75 (2007), 025023 (p. 1–7).
[8] swanson, m.: transition elements for a non-hermitian quadratic hamiltonian. j. math. phys., vol. 45 (2004), p. 585–601.
[9] geyer, h., scholtz, f., snyman, i.: quasi-hermiticity and the role of a metric in some boson hamiltonians. czech. j. phys., vol. 54 (2004), p. 1069–1073.
[10] bender, c., chen, j., milton, k.: pt-symmetric versus hermitian formulations of quantum mechanics. j. phys., vol. a39 (2006), p. 1657–1668.
[11] bender, c., brody, d., chen, j., jones, h., milton, k., ogilvie, m.: equivalence of a complex pt-symmetric quartic hamiltonian and a hermitian quartic hamiltonian with an anomaly. phys. rev., vol. d74 (2006), 025016 (p. 1–10).

dr. hugh f. jones, phone: +44 (0)20 7594 7830, email: h.f.jones@imperial.ac.uk, physics department, faculty of natural sciences, imperial college london, south kensington campus, london sw7 2az, united kingdom

compensation of springback in large sheet metal forming
tomáš pačák*, františek tatíček, michal valeš
czech technical university in prague, faculty of mechanical engineering, department of manufacturing technology, technická 1902/4, 166 07 prague, czech republic
* corresponding author: tomas.pacak@fs.cvut.cz

abstract. a precise production of sheet metal parts has always been a main goal in press shops. the highest quality demands are required especially in automotive production. unfortunately, even today, the production is not optimal, due to an ineffective approach to springback compensation. springback results in geometrical shape inaccuracies of the obtained product. based on the current approach, excessive time and financial costs emerge due to corrections on the press tools.
however, these corrections do not always lead to a better accuracy of the stampings. the main objective of the research is to design a modified solution of the current approach. the modified solution is designed as a methodology with a focus on the analysis and compensation of the springback with the help of a numerical simulation. to achieve the main goal, smaller sub-goals are employed. these sub-goals, or rather experiments, mainly focus on parameters which, more or less, influence the springback phenomenon. the designed methodology is verified with real car body parts and is carried out with the help of the department of press tools design in škoda auto, a.s. there, the methodology is used for improving the accuracy of the stamping process of the car body parts and for improving the quality of the final product.

keywords: springback, sheet metal forming, compensation, numerical simulation, autoform.

1. introduction

at the final stage of forming, as soon as the load is removed and the stamping tools released, the product springs out. this is caused by internal stresses (elastic deformation). this shift in the material results in a change of geometry; this phenomenon is called springback. springback is a significant problem in the process of sheet metal forming, since it results in geometrical shape inaccuracies of the final product. in order to compensate springback, it is important to carefully consider all factors prior to the stamping process, otherwise a reject product occurs. in most cases, springback is solved in the post-production stage of the tools production. that includes intervention into the tools geometry in the form of corrections. this leads to an increase in the costs of the whole project. ideally, springback should be solved in the pre-production stage of the project. in this stage, various numerical simulations and special computational modules can be used. tool design and construction are the most time consuming steps in the new car body developing process. therefore, it is very important to find an effective and reliable methodology for springback prediction and its compensation. [1] over the last decades, many researchers have investigated the springback phenomenon in the process of sheet metal forming. in particular, yoshida and uemori have improved the model of large strain plasticity. with the help of this model, it is possible to predict the springback in sheet metal forming more precisely. nowadays, numerical methods for the process simulation and evaluation are commonly used. software like autoform, pam-stamp, dynaform, ls-dyna or others are powerful tools in the pre-production phase. the problem is in the accuracy of these methods, which is still far from exact. if the experimental trial and error process is replaced by a reliable numerical procedure, the pre-production time and costs can be decreased significantly. [2–4]

2. description of the springback behavior

the dimensional instability caused by springback is highly influenced by the material and process variables. in terms of material variables, it helps to use a material with a greater sheet thickness, lower material strength and higher young's modulus. for example, hss, ahss and aluminium alloys are predisposed to have a poor dimensional accuracy in comparison with low carbon steels. each stamping process is unique, due to the variation of each product. a small change in the product geometry can influence the resulting springback significantly. the same applies even to the employed press,
where the inaccuracy of a part can differ between the main press and a replacement press. springback can be divided into categories according to the changes in the shape and the action of forces. there are a few major categories, e.g. angle deviation, sidewall curl, twist, distortion and global shape change. in the process of pure bending, only one springback type at a time takes place. when we look at the more complex deep drawing process, more types of springback take place simultaneously. during this process, the various types of springback influence each other, which makes the description of the problem even more complicated. the springback behaviour can be described with the help of many approaches. each approach uses a different theory for the description and results in differences in the final results. various limitations of the process conditions cause differences in the final results. one of the less accurate approaches (the rigid – ideally plastic approach) is listed below (eqs. 1 and 2). this approach can be applied only to simple bending processes. a more accurate approach (the elastic – plastic approach), which can be closely compared to fem methods, is described with the help of eq. 3, a description for pure bending. [5–7]

rigid – ideally plastic approach – an ideally plastic material without elastic strain and without strain hardening [6]:

M_w = 2 \int_0^{t/2} \sigma_0'\, z \,\mathrm{d}z = \frac{\sigma_0'\, t^2}{4} \qquad (1)

\frac{R}{R'} = 1 - \frac{3 \sigma_0' R}{E' t} \qquad (2)

elastic – plastic approach – an elastoplastic relationship in the bending process, elastic – plastic power hardening [6]:

\Delta M_w = \frac{2K}{R^n (n+2)} \left( \frac{t}{2} - z^* \right)^{n+2} + \frac{2K z^*}{R^n (n+1)} \left( \frac{t}{2} - z^* \right)^{n+1} \qquad (3)

for monitoring and analysis of the stamping process, numerical simulations are very useful. one of the commonly used numerical simulation packages is autoform. autoform uses a static implicit time integration scheme. in every time step, starting from the previous one, the mesh is regenerated using local refinement. this solving process is iterated until the estimated error is within the bounds of the required precision. if the time step between iterations is not too large, the solving time is usually very short. even with the help of numerical simulation, it is still very complicated to predict the springback behaviour accurately, especially in automotive production, where car body parts have very complex shapes due to the design and body stiffness. [6] mechanical properties also play a significant role in the springback phenomenon. the dependency on the mechanical properties is, for example, mentioned in the articles [8] or [9]. during our own experiment of simple bending over a radius, it has been verified that the higher the strength of the steel, the greater the springback. for example, in the case of 90° bending of a 1.0 mm sheet over the radius r10, material hx340 lad (rm = 441 mpa) showed 6.8° springback, hx220 bd (rm = 352 mpa) 6.1° and dx54d (rm = 168 mpa) 5.0°. this theory does not contribute to the current trend in the automotive industry. with more pressure on lightweight car bodies, high strength steels and smaller thicknesses are applied. that results in even more issues with the total inaccuracy of the shape.

3. methodology of the springback analysis and compensation

in order to resolve the springback phenomenon properly, a stable and accurate analysis of the springback must be carried out.
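as a quick aside, the rigid – ideally plastic estimate of eq. (2) above can be evaluated directly. a minimal python sketch follows; the material and geometry values are illustrative assumptions, not measured data from this work:

```python
# direct evaluation of the rigid-ideally-plastic springback estimate, eq. (2)
def springback_ratio(sigma0, r, e, t):
    """R/R' = 1 - 3*sigma0'*R/(E'*t) for bending to radius R (SI units)."""
    return 1.0 - 3.0 * sigma0 * r / (e * t)

# mild-steel-like sheet: sigma0' = 200 MPa, E' = 210 GPa, R = 10 mm, t = 1 mm
print(springback_ratio(200e6, 0.010, 210e9, 0.001))   # ~ 0.971
```

the ratio moves further from 1 for stronger materials, larger bend radii and thinner sheets, in line with the trends discussed above.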
with the use of numerical simulations, the possibilities in the springback analysis are enormous. the use of numerical simulations for the springback analysis has mostly advantages; however, some drawbacks are also present (besides the persisting lower accuracy). the advantages are mainly in the analysis itself, where all kinds of comparisons can be used, such as an evaluation in various directions, a comparison with the reference geometry, and more. on the contrary, when it comes to the springback analysis, the conditions of the analysis play a great role in the final accuracy. for example, a part can be evaluated with no gravity taken into account (free springback), with gravity (constrained springback) or with gravity and with progressive clamping (real measurement). the results with the application of each analysis always vary. furthermore, the settings of the initial numerical simulation also influence the final results of the springback analysis – for instance, the settings of the finite element method (type and size of elements, nodes, number of iterations, etc.) and the process parameters (pressure of binder, drawbead type, trimming with or without tools, pressure and velocity dependency, etc.). the possibilities in the combination of various conditions and settings in the numerical simulation are very comprehensive. the aim is to reach accurate results, which can be achieved through a unified methodology. such a methodology is shown in fig. 1. this methodology also represents the modified approach to the springback phenomenon. [5] at the moment when analysis and compensation of springback is expected, the numerical simulation must be carefully designed from the very beginning. as mentioned, the overall accuracy of the springback analysis is highly influenced by the process variables and the initial settings of the numerical simulation. for the purpose of creating a unified methodology, a checklist for the numerical simulation had to be designed foremost. optimal options for the settings of numerical simulations were obtained through various tests.

figure 1. schema of the modified solution of the approach to the springback phenomenon with greater focus on the pre-production, virtual part of the project.

below are listed only the key steps and major parts of the checklist: [7]
also this approach is time consuming due to the manual surface modelling. the more effective approach is the use of a special computational modules, e.g. autoform compensator or pamstamp die compensation module. these modules focus on the geometry correction after the springback analysis from the previous iteration. the principle is similar to the manual correction to create a compensated tool geometry which will help to achieve a better dimensional accuracy. however, to get such results from the compensation, a very consistent fundamental simulation with appropriate compensation strategy must be used. [6] results from the compensation can differ because of various possibilities in the approach to the compensation. compensation strategy focuses on the: • selection of the stamping operation for the following compensation. • selection of the tools geometry (fixed, compensated or transitional). • definition of a compensation ratio. above, a compensation ratio was mentioned. with the help of the ratio, the intensity of the compensation can be influenced. with a smaller ratio, tool’s geometry is compensated only slightly. this results into smaller changes in the tools geometry and thus improved feasibility in the tools try-out. on the contrary, a higher ratio results in a better elimination of springback. based on the experiments, an optimal ratio was found to be in the interval between 0.2 to 0.4. when a higher ratio is applied, the continuity of the tool’s geometry is fragmented. this will lead to a poor tool’s feasibility. [7] 4. verification of the springback compensation methodology the main motivation behind the verification was to discover if the modified approach is feasible and bene485 t. pačák, f. tatíček, m. valeš acta polytechnica figure 2. schema of the methodology of the virtual springback analysis [5]. ficial for future projects. the described methodology has been carried out on car body parts with a various geometry complexity. namely: • seat ateca outer bottom panel from fifth doors dc06+ze • seat ateca – inner fifth door panel dx57d • škoda superb fender hx220 bd the methodology was verified through a process of a numerical simulation, springback analysis and comparison with the reference geometry. finally, a compensation according to the methodology has been carried out. car body parts with a less complex geometry underwent the verification relatively well. this applies to the outer and inner panel of fifth doors from seat ateca. on these parts, the verification of springback analysis was successful (maximum deviation from real-life scan circa 0.5 mm). also the springback compensation was successful, where accuracy of the shape was improved. maximum dimensional inaccuracy was circa 0.5 mm on the outer fifth door panel and circa 0.6 mm on the inner fifth door panel. as mentioned, each compensation strategy results into different results. specifically, on the outer fifth door panel, total of 5 strategies have been applied. results from each strategy and its iterations are shown in fig. 3. the best results have been achieved with the 1st strategy where tools are compensated in every stamping operation. fig. 4 displays the springback analysis of the initial numerical simulation and analysis after the final iteration of the springback compensation (1st strategy). in terms of more complex geometry and forming processes, the methodology was not successful. in the case of the fender from škoda superb, both the springback analysis and compensation did not achieve satisfying results. 
thanks to the compensation, springback has been improved from the initial maximum value of 7.5 mm to the final 3 mm. even though the dimensional accuracy has been improved significantly, it still did not meet the requirements (0.8 mm). a total of 4 strategies have been applied, with similar or worse results.

figure 3. graphical representation of springback results after the application of various compensation strategies (each number represents a magnitude of the springback defined by the springback coefficient si) [7].
figure 4. results from compensation strategy no. 1 – comparison between the initial springback analysis and the last iteration of the compensation (the sb compensation consists of several iterations in total) [7].

based on the application of the designed methodology and its verification, car body parts have been distributed into categories of complexity (figure 5). each category describes its complexity from the point of view of the forming process and the geometry of the part. also, a status has been added to describe if it is currently possible to successfully use the methodology on the parts from each category.

figure 5. categories of car body parts based on the complexity of the forming process and of the geometry. each category is evaluated from the point of view of the springback prediction and its compensation [7].

5. comparison of the results

the evaluations presented in this paper show a similar tendency as the current literature. this paper corresponds with the statement that the springback phenomenon can currently be accurately analysed and compensated only on parts with a less complex shape. throughout the literature, most papers agree that various compensation strategies lead to an improvement in the inaccuracy by up to 70–80 % [10–12]. similar but slightly better results have been achieved in the presented experiment: with the designed strategy, the improvement of the inaccuracy reached up to 80–90 % in the case of the seat ateca fifth door parts. some researchers rate the accuracy of the springback phenomenon even more critically. since the calculation of the springback prediction is not able to reach 100 % of the accuracy, the following compensation should be even more inexact (the paper mentions an overall accuracy of 56 % even with remodelling of the tools geometry) [13]. a similar trend has been achieved with the experiment on the škoda octavia fender, where the accuracy has not been acceptable.
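to make the role of the compensation ratio concrete, the following schematic loop mimics the iterative procedure of section 3 with a toy linear "virtual press" standing in for a real forming simulation (autoform etc.); the 20 % relaxation, the ratio of 0.3 and the geometry values are invented for illustration only:

```python
import numpy as np

def virtual_press(tool):
    """toy springback: the formed part relaxes to 80 % of the tool shape."""
    return 0.8 * tool

target = np.array([5.0, 7.5, 3.0])   # desired part geometry (mm, toy values)
tool = target.copy()                  # start with uncompensated tools
ratio = 0.3                           # compensation ratio in the 0.2-0.4 band

for it in range(6):
    part = virtual_press(tool)
    deviation = part - target         # springback analysis vs. reference
    tool = tool - ratio * deviation   # over-bend the tool against the error
    print(it, np.max(np.abs(deviation)))
```

the maximum deviation shrinks by a constant factor per iteration; a larger ratio converges faster but, as noted above, fragments the continuity of a real tool surface.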
6. conclusion

the existence of springback in stamping processes is very common. in terms of bending operations, springback has already been described by many authors. the description of the springback behaviour is more complicated in the case of consecutive stamping operations in the press line. in the process of forming, springback is influenced by a number of process variables. some of the conditions influence springback extensively (material properties, material thickness, friction, plastic strain, etc.). the initial settings of the numerical simulation also have an additional impact. a major condition is that the numerical simulation has to be carried out under the same conditions as in the real stamping process. the research is based on the modified solution in the virtual approach to the springback analysis and compensation. an integral part of the modified solution is the methodology. the methodology consists of three main parts: correct settings of the virtual forming process (numerical simulation), springback analysis and springback compensation. the methodology has been employed on car body parts with various geometries and various forming processes (chapter 4). results from the numerical simulation have been compared to digital scans of produced parts (based on a real stamping process in a press shop). the verification proved that an accurate springback analysis and its compensation is challenging. figure 5 summarizes the possible application on various types of car body parts. with a more complex part geometry and with the virtual solution, it is difficult to accomplish accurate results. in addition, accurate virtual solving of compound parts with hemming is nearly impossible. the reason is the description of springback when assembly parts are hemmed together. another problematic step after the virtual compensation is a curvature analysis of the compensated geometry. the curvature of the surface is mostly uneven and wavy due to the local geometry adjustment. it is complicated to mill such a surface and later to fit the bottom and upper tools together in the try-out press. therefore, after the local springback compensation, the surface has to be smoothened with special software. the modified virtual solution is beneficial for car body parts, but with the condition of application on less complex geometry, e.g. fifth door outer panels, inner and outer doors or inner and outer bonnets. the same applies to smaller structural car body parts. here, the designed solution can highly improve the final accuracy and decrease the time and financial costs during the pre- and post-production [7].

acknowledgements

the research was financed by sgs16/217/ohk2/3t/12, sustainable research and development in the field of manufacturing technology.

references
[1] a. soualem. a detailed experimental study and evaluation of springback under stretch bending process. international journal of aerospace and mechanical engineering 8(6):1128–1131, 2014. doi:10.5281/zenodo.1093219.
[2] a. h. alghtani, p. c. brooks, d. c. barton, v. v. toropov. springback analysis and optimization in sheet metal forming. in 9th european ls-dyna conference 2013. 2013.
[3] department of ferrous metallurgy. springback of high strength automotive steels. http://www.iehk.rwth-aachen.de/index.php?id=503&l=2.
[4] t. yoshida, e. isogai, s. yonemura, et al. material modeling for accuracy improvement of the springback prediction of high-strength steel sheets. technical report 102, nippon steel, 2013.
[5] t. pačák, et al. methodology of the springback compensation in sheet metal stamping processes. in metal 2017 conference proceedings, pp. 502–507. brno, 2018.
[6] asm handbook. metalworking: sheet forming, vol. 14b. asm international, ohio, 2006.
[7] t. pačák, et al. compensation of the springback behavior in large sheet metal stamping. in book of proceedings from conference technological forum 2018. jaroměř, 2018.
[8] s. benson. bending basics: the hows and whys of springback and springforward, 2014. https://www.thefabricator.com/thefabricator/article/bending/bending-basics-the-hows-and-whys-of-springback-and-springforward.
springback prediction of the vee bending process for high-strength steel sheets. journal of mechanical science and technology 30(3):1077 – 1084, 2016. doi:10.1007/s12206-016-0212-8. [10] j. weiher, b. rietman, k. kose, et al. controlling springback with compensation strategies. aip conference proceedings 712(1):1011 – 1015, 2004. doi:10.1063/1.1766660. [11] w. a. siswanto, a. d. anggono, b. omar, k. jusoff. an alternate method to springback compensation for sheet metal forming. the scientific world journal 2014:1 – 13, 2014. doi:10.1155/2014/301271. [12] s. xu, k. zhao, t. lanker, et al. springback prediction, compensation and correlation for automotive stamping. aip conference proceedings 778(1):345 – 352, 2005. doi:10.1063/1.2011244. [13] r. lingbeek. aspects of a design tool for springback compensation. master’s thesis, university of twente / inpro, 2003. 489 https://www.thefabricator.com/thefabricator/article/bending/bending-basics-the-hows-and-whys-of-springback-and-springforward https://www.thefabricator.com/thefabricator/article/bending/bending-basics-the-hows-and-whys-of-springback-and-springforward https://www.thefabricator.com/thefabricator/article/bending/bending-basics-the-hows-and-whys-of-springback-and-springforward http://dx.doi.org/10.1007/s12206-016-0212-8 http://dx.doi.org/10.1063/1.1766660 http://dx.doi.org/10.1155/2014/301271 http://dx.doi.org/10.1063/1.2011244 acta polytechnica 59(5):483–489, 2019 1 introduction 2 description of the springback behavior 3 methodology of the springback analysis and compensation 4 verification of the springback compensation methodology 5 comparison of the results 6 conclusion acknowledgements references ap06_1.vp 1. introduction 1.1 similarity solutions of shear flows fluid flows are governed by the navier-stokes partial differential equation, for which there are (apart from a handful of cases of trivial simplicity) no practically useful analytical solutions. the equation is therefore usually solved by numerical procedures over a domain discretised into finite elements or volumes. the numerical solutions do not provide a global view of the problem, because – in contrast to the general character of analytical solutions – each computed case is valid only for a particular set of parameters and boundary condition values, not showing a relation with other cases. this is acceptable for solving a particular engineering task, but not helpful for educational purposes or for general investigations. sometimes it is possible to obtain solutions having the desirable general character by utilising a similarity property [9] of the flowfield. there are not many flows possessing this property. fortunately, many basic cases of shear flows do: the spatial distributions of their flow parameters – e.g., the velocity profiles – at different streamwise locations are mutually similar so that the introduction of suitably transformed co-ordinates can make them identical. the resultant universal velocity profile in transformed co-ordinates then represents all profiles at all streamwise locations in the flowfield. the similarity transformation reduces the number of independent variables in the problem. the governing partial differential equations are reduced to ordinary differential equations in the transformed co-ordinates. this transformation approach was first used by g. stokes who obtained analytical solutions for unsteady boundary layers developing in time-dependent motion of flat walls [2]. 
later this use of the symmetry properties of governing equations became the standard tool for studies of laminar shear flows by l. prandtl and his göttingen school. to this day, standard teaching of laminar boundary layer theory is based on this idea, using the solution [2] obtained under prandtl’s guidance by blasius in 1908. another researcher influenced by prandtl, schlichting in 1933, used this approach for laminar submerged jets [2] and glauert in 1956 applied it successfully to the laminar wall jet problem [2]. modern approach to the similarity transformations are based on the ideas of e. noether [16] who proved that each conservation law of a physical problem is associated with symmetry of the governing equations. the treatment of the relationship between physical invariants and lie-bäcklund operators, which are noether symmetries as shown by kara and mahomed [17], was influenced by the group theory ideas of ibragimov [18]. application to turbulent shear flows has been slowed by problems of modelling turbulence. in 1926, w. tollmien [1] applied the similarity approach to submerged turbulent jets using prandtl’s 1925 algebraic model of turbulence [2]. this model requires independently input information about the size of the turbulent vortices. tollmien supplied this in the form of a very simple assumption: the turbulence length scale was assumed to be constant at each cross section and proportional to the local width of the shear flow. his results were, unfortunately, in only very rough agreement with experimental data. the reason for the disagreement, the non-local character of turbulence transported by advection as well as diffusion, was initially not recognised and the remedy was sought in introduction of several other physically not substantiated algebraic turbulence models. in particular, current textbooks still often discuss turbulent submerged jets on the basis of görtler’s 1942 [8] solution based on prandtl’s conceptually wrong 1942 “neues modell”. the popularity of this solution is due – besides its simplicity – to purely fortuitous better agreement with experimental data than obtained with tollimen’s solution. only at the end of the last century did similarity solutions of fully developed turbulent jets using advanced models, taking into account turbulence transport, become available – an example are the solutions of turbulent jets by tesař 1996 [3], 1995 [5], 1997 [9], 2001 [6]. essential problem to be overcome is the complexity of modern turbulence models. to incorporate the transport effects the solved equation for fluid momentum needs simultaneous solution of additional transport equations for parameters of turbulence. the less complex one-equation turbulence model used in [3] 40 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 46 no. 1/2006 czech technical university in prague similarity solutions of jet development mixing layers using algebraic and 1-equation turbulence models v. tesař mixing layers are formed between two parallel fluid streams having different velocities. one of the velocities may be zero, as is the usual case of the mixing layer that surrounds, immediately downstream from the nozzle, the core of a developing jet issuing into stagnant surroundings. earlier – but so far not properly published – experimental evidence shows a remarkably weak effect of transversal curvature, making the present solution applicable with acceptable precision to description of developing round jets. 
this paper presents solutions of a planar mixing layer by a similarity transformation, which reduces the problem to solving ordinary differential equations. two solutions are investigated: one based on an algebraic model and the other using the 1-equation model of turbulence. they are compared with recent results of piv measurements of a developing jet. keywords: similarity of shear flows, mixing layer, turbulence models, algebraic model, one-equation model, submerged jets. relies on the original hypothesis about the size of the momentum transporting turbulent vortices as originally introduced by tollmien [1]. the more sophisticated two-equation model used in [5] does not require any such a priori assumption. the solutions in [3] and [5] assume the turbulence to be isotropic, which is not satisfied exactly but fortunately the anisotropy in the jet flows is not very large. agreement with experiments is excellent for both models, which shows the tollmien’s simple length scale hypothesis to be remarkably successful. both solutions [3] and [5] assume the jet to be fully developed. experiments show this causes more problems than generally believed. according to many standard textbooks, jets are said to be fully developed at a downstream distance as short as 8 to 10 nozzle exit widths (or diameters in the axisymmetric case). this is based on the character of velocity profiles, which at these locations indeed agree with the similarity predictions reasonably well. however, parameters of turbulence require a much longer distance to develop. profiles of fluctuation energy were found by tesař and střílka in [6] to be not fully developed at a downstream distance as large as 60 diameters – so large that the jet may cease there to be useful for practical applications (e.g. due to the velocity of an air jet decreasing to a level comparable with room draft motions). 1.2 mixing layer elimination of the streamwise distance variable in shear flows by the similarity transformation is achieved by dividing the transverse distances by the shear layer thickness d and dividing the velocity by the difference between the highest and lowest velocity. in practical situations it is extremely rare for the transversal dimensions of solid walls defining the flow geometry to vary in the streamwise direction in the same way as the layer thickness. this means that practical condition for the existence of similarity solutions is the absence of scale defining solid walls. the mixing layer, fig. 1, between two parallel flows having different velocities wea and web, away from the bounding walls, is an example of such a scale-less geometry. of particular importance are the cases with web � 0, i.e. the mixing layer between a single stream and stagnant surroundings. an important fact in fig. 1 is the planar character of the geometry. the related axisymmetric case shown in fig. 2 is not scale-less. there is the transversal dimension – the radius of transversal curvature r0 of the edge of the separating wall. since the transversal curvature does not vary equally as the layer thickness with the streamwise distances, this flow ceases to possess exact similarity. the wakes immediately downstream from the separation wall edge are a complicating factor. fortunately, they tend to disappear rather fast with increasing streamwise distance. 
the influence of the wake usually becomes negligible – at least as reflected in the shape of the velocity profiles – at downstream distances equal to a few multiples of the thickness of the boundary layer formed on the separation wall. the similarity approach to the mixing layer is then applicable in the fully developed layer sufficiently far downstream.

the standard solution of the plane mixing layers presented in most textbooks is due to görtler (1942), which, like görtler's solution of the submerged jet mentioned above, is unfortunately based on prandtl's physically ungrounded "neues modell". it is the similarity solution in the relative co-ordinates: the transverse co-ordinate

$\eta_g = \sigma_g \,\frac{x_2}{x_1}$ (1)

and the relative velocity

$u = \frac{w_1 - w_{eb}}{w_{ea} - w_{eb}}$ (2)

where the transverse distances x_2 are measured from the location where u = 0.5, x_1 is the streamwise distance measured from an extrapolated virtual origin, w_1 is the streamwise component of the time-mean velocity, and σ_g is görtler's proportionality constant in the assumed linear streamwise growth of the layer thickness

$\delta = \frac{x_1}{\sigma_g}$ (3)

so that η_g = x_2/δ. the similarity transformed momentum transport equation with prandtl's 1942 model of turbulence is an ordinary differential equation, which görtler solved by series expansion. the first term in the power series is dominant. görtler neglected all remaining terms and found an analytic solution [2] for the first term:

$u = \frac{1}{2}\left[1 + \operatorname{erf}\left(\sigma_g \frac{x_2}{x_1}\right)\right]$ (4)

this first-term solution possesses a central symmetry with respect to the point x_2 = 0, u = 0.5, so that u at −x_2 equals 1 − u at x_2. the usual slight deviation from this symmetry, discernible though not prominent in experimental results, is ascribed to the neglected terms.

fig. 1: mixing layer between parallel flows is quite thin, because momentum diffusion from the flow of higher velocity w_ea towards the lower velocity w_eb is slower than the streamwise motion. the character of the flow is here represented by the time-mean velocity profile.

fig. 2: a transversally curved mixing layer forms when the flow of higher velocity w_ea leaves the central nozzle and mixes with the outer flow of lower velocity w_eb. the transversal curvature radius varies with the axial distance differently than the layer thickness. this renders an exact similarity of the profiles impossible.

1.3 approximate application to axisymmetric jet flows

submerged jets, flows of high importance in engineering applications, gradually develop from the usually nearly uniform velocity profile flow in the nozzle exit [4]. in the earliest stages of development, the flowfield is dominated by the mixing layers between the core flow and the stagnant outer fluid, fig. 3. since the developed jets lose their velocity very fast with the distance travelled, and hence gradually cease to be able to generate a useful effect, many applications – in particular in fluidics, the technique of controlling fluid flows without the use of moving components in the devices – tend to use only the developing part of the jet. the general rule in fluidic valves, e.g. [19], is to capture the jet while it still contains a significant jet core. as a result, it is of high practical importance to be able to analyse the properties of the mixing layers surrounding the core. of course, in the case of a round jet the mixing layer is subject to quite strong transversal curvature.
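as a quick illustration of the planar reference solution against which such curved layers are judged, the first-term görtler profile of eq. (4) is straightforward to evaluate numerically. the following is a minimal sketch (ours, not part of the original paper); it assumes scipy, and the default value of görtler's constant is illustrative only:

```python
# sketch: first-term görtler profile, eq. (4) -- illustration, not the paper's code
import numpy as np
from scipy.special import erf

def gortler_u(x2, x1, sigma_g=12.0):
    """relative velocity u = (w1 - w_eb)/(w_ea - w_eb) of eq. (4).

    sigma_g is görtler's spreading constant; the default is an assumed,
    illustrative value (figures around 11-13.5 are commonly quoted for
    single-stream layers) and must in practice be fitted to measured data.
    """
    eta_g = sigma_g * x2 / x1           # transformed co-ordinate, eq. (1)
    return 0.5 * (1.0 + erf(eta_g))     # centrally symmetric about u = 0.5

# example: profile across the layer, one nozzle diameter downstream
x2 = np.linspace(-0.02, 0.02, 9)        # transverse positions [m]
print(gortler_u(x2, x1=0.032))
```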
theoretically, the planar (i.e. infinite curvature radius) conditions are approximated at small downstream distances, where the curvature radius is much larger than the very small layer thickness. unfortunately, practical considerations limit the applicability of this theoretically sound assumption:
a) at small downstream distances the conditions are complicated by the wake of the separation wall edge, which needs a considerable streamwise distance to disappear.
b) in analogy with the conditions in jets [6], the turbulence structure of the mixing layer may be reasonably expected to need a considerable streamwise distance from the separating edge before it develops self-similarity.
in principle, therefore, no similarity solution for the mixing layer at the outer edge of the developing axisymmetric jet should be possible. fortunately, the effect of curvature is found to be rather weak, and experimental evidence shows that the flow may be well approximated by the plane mixing layer solution, indeed with a precision sufficient for most engineering applications.

in 1978–79 the present author investigated the mixing layer formed between concurrent flows in a pipe, as shown in fig. 4, with the faster central flow w_ea > w_eb injected axially through the central nozzle. diagrams of the results were used as teaching examples in textbook [2], and the velocity profile diagram was also shown in [15], but because of the classified character of the application in the nuclear industry, the complete results were never properly published. they are of importance in the present context, since they provide an experimental demonstration of several important facts:
• the first of them is the proof of the linear dependence eq. (3). it is a consequence of a very general property of turbulent shear flows: their thickness grows in convected co-ordinates in proportion to the local velocity scale of turbulence w_t (cf. eq. (5)). for the 1978 experiments this is demonstrated in fig. 6 by the plot of the convention thickness δ_0.8 dependence on the streamwise distance x_1* from the nozzle exit. the convention definition of the thickness δ_0.8 in fig. 5 was chosen for convenience of processing the experimental data: δ_0.8 is the transversal distance between the locations in which the time-mean axial velocity differs from the outer velocity on each side by 0.1 Δw_e, one tenth of the difference between the outer velocities, Δw_e = w_ea − w_eb. it should be noted in fig. 6 that the thickness was not evaluated in the not fully developed flows at small x_1*, influenced by the wake downstream from the edge of the nozzle exit. the complicated development necessitated the use of the shifted axial co-ordinate x_1 measured from the virtual origin. this shift is necessary for changing the axial δ_0.8 dependence indicated in fig. 6 into the linear homogeneous proportionality.

fig. 3: the mixing layers surrounding the jet core dominate the character of the fluid flow in the developing submerged jet at small downstream distances from the nozzle from which the jet is issuing.

fig. 4: the mixing layer in a round pipe: faster flow a injected into parallel flow b of lower velocity w_eb. investigations of this layer were performed by the present author in 1978 at ctu in prague as a part of a classified study of radioactive contaminant diffusion towards the pipe wall.
• the second fact of importance is the very small effect of the transversal curvature of the axisymmetric flow, indeed negligible for engineering purposes – despite the curvature radius being comparable here with the layer thickness. this is demonstrated in fig. 8, which shows four measured velocity profiles, similar to the top example in fig. 7, transformed into the similarity co-ordinates eqs. (1) and (2). the transformation was based on locating the convention edges of the layer according to fig. 5, as presented in fig. 9, where it is obvious how near the edges are to the axis of the mixing pipe. the next figure, fig. 10, shows how short the radius r of the transversal curvature is relative to the layer thickness δ; the table in fig. 10 shows how the ratio r/δ decreases with the downstream distance x_1. despite this fact, the transformed profiles measured at different x_1 are practically identical in fig. 8. their mutual differences are smaller than the experimental scatter (which is remarkably small, considering the stochastic nature of the turbulent flow and the small dimensions of the experiment – pipe internal diameter a mere 30.11 mm). as was usual at that time, the present author compared the measured profiles with the görtler solution, eq. (4). the good agreement provides an explanation for the continuing popularity of this solution, despite its lack of a sound theoretical foundation.

fig. 5: unpublished 1978 measurements by a miniature pitot probe of axial time-mean velocity profiles in the mixing layer inside the pipe of fig. 4. the convention thickness δ_0.8 is defined as the distance between the black symbols, where the velocity differs by 10 % from the velocity outside the layer.

fig. 6: streamwise dependence of the convention thickness δ_0.8.

fig. 7: the character of measured velocity profiles in the pipe. if there was a prominent wake (as in the example at downstream distance 8.3 mm), the data were not included in the comparison with the similarity solution, fig. 8.

fig. 8: velocity profiles of the mixing layer, figs. 3 and 4, plotted in the similarity co-ordinates and compared with the first-term görtler solution eq. (4). no significant systematic effect of the transverse curvature is discernible.

2 hierarchy of three turbulence models

2.1 isotropic models

the models discussed here assume isotropic turbulence. of course, large eddies in a mixing layer generally exhibit a preferred spatial orientation (with axes parallel to the edge of the dividing wall). their anisotropy, however, does not warrant the use of more complex anisotropic models. most influential in the turbulent momentum transport are smaller vortices, of characteristic size roughly twenty times smaller than the local layer thickness. their orientation is near to chaotic, and there is hardly any orientation preference. isotropic turbulence is fully specified by a single parameter – the turbulent viscosity ν_t. this is a product of two factors (using the same symbols as in earlier publications [2, 3, 5, and 6]):

$\nu_t = \Lambda\, w_t$ (5)

where Λ [m] is the turbulence length scale (the size of the vortices most effective in turbulent momentum transport) and w_t [m/s] is the velocity scale of turbulence.

algebraic model: the turbulent viscosity ν_t is evaluated by solving algebraic equations based on two hypotheses:
1) prandtl hypothesis:

$w_t = \Lambda\, \frac{\partial w_1}{\partial x_2}$ (6)

turbulent eddies are set into rotation by the velocity difference between their extreme ends, due to the local transverse gradient ∂w_1/∂x_2 of the streamwise velocity w_1.
2) tollmien hypothesis:

$\Lambda = k\,\delta$ (7)

the eddy size is constant across each transversal section of the layer; its magnitude increases with the thickness δ of the layer.

1-equation model: the turbulent viscosity ν_t is evaluated by solving the transport equation for the specific energy of turbulent fluctuation e_f and making use of two relations:
1) prandtl-kolmogorov expression:

$w_t = c_\nu \sqrt{e_f}$ (8)

2) tollmien hypothesis:

$\Lambda = k\,\delta$ (9)

2-equation model: the turbulent viscosity ν_t is evaluated by solving two transport equations – one for the specific energy of turbulent fluctuation e_f and the other for the turbulence dissipation rate ε. the resultant spatial distributions are used in the relations:
1) prandtl-kolmogorov expression:

$w_t = c_\nu \sqrt{e_f}$ (10)

2) the length-scale expression

$\Lambda = c_z\, \frac{e_f^{3/2}}{\varepsilon}$ (11)

2.2 similarity transformation with the algebraic model

using the notation from textbook [2], the mixing layer flowfield is described by prandtl's thin shear flow equation for the spatial distribution of the time-mean axial velocity w_1:

$w_1 \frac{\partial w_1}{\partial x_1} + w_2 \frac{\partial w_1}{\partial x_2} - \frac{\partial \nu_t}{\partial x_2}\,\frac{\partial w_1}{\partial x_2} - \nu_t \frac{\partial^2 w_1}{\partial x_2^2} = 0$ (12)

the algebraic model – eqs. (5), (6) and the length scale eq. (17) below – expresses the turbulent viscosity ν_t as

$\nu_t = \frac{\partial w_1}{\partial x_2}\,(k\,s\,x_1)^2$ (13)

and its transverse gradient as

$\frac{\partial \nu_t}{\partial x_2} = \frac{\partial^2 w_1}{\partial x_2^2}\,(k\,s\,x_1)^2$ (14)

following closely the transformation procedure described in [2] and [3], where there are more detailed explanations, eq. (12) with the inserted expressions eqs. (13) and (14) is transformed using the definition of the relative transverse co-ordinate according to [3]:

$\eta_{alg} = \frac{x_2}{(2\,k^2 s^2)^{1/3}\, x_1}$ (15)

where k is the proportionality constant from eq. (7) and s is the proportionality factor in the linear streamwise growth relation for the layer thickness δ:

$\delta = s\, x_1$ (16)

the definition eq. (15) differs from the original one used in [3] by the added index alg, to remove possible confusion with several other similarity co-ordinate definitions in the present paper. it should also be noted that the thickness eq. (16) is different from görtler's δ in eq. (3) because of the different definitions in eqs. (1) and (15). from eqs. (7) and (16),

$\Lambda = k\, s\, x_1$ (17)

fig. 9: definition of the convention edges of the mixing layer. the layer is quite thin and long (note the compression of the axial scale); its radius of transversal curvature is quite small.

fig. 10: the negligible effect, in the flow of fig. 4, of the transversal curvature is quite surprising, considering the fact that the curvature radius is comparable with the layer thickness δ.
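before transforming further, it may help to see eq. (13) in action. the following sketch (ours, not the paper's) evaluates the algebraic-model eddy viscosity across a discretised velocity profile; the default values of k and s are assumptions, chosen close to the values evaluated later in the paper:

```python
# sketch: algebraic-model turbulent viscosity of eq. (13) on a measured profile
import numpy as np

def nu_t_algebraic(w1, x2, x1, k=0.50, s=0.12):
    """eddy viscosity nu_t = |dw1/dx2| * (k*s*x1)**2, from eqs. (5)-(7), (17).

    w1 : time-mean axial velocities [m/s] sampled at transverse positions x2 [m]
    x1 : downstream distance from the virtual origin [m]
    k, s : tollmien's eddy-size constant and the thickness growth factor
           (illustrative defaults, close to the paper's evaluated values)
    """
    dw1_dx2 = np.gradient(w1, x2)       # transverse gradient, prandtl hypothesis eq. (6)
    lam = k * s * x1                    # turbulence length scale, eq. (17)
    return np.abs(dw1_dx2) * lam**2     # eq. (13)
```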
to evade repeated checking of the conformity of solutions with the mass conservation condition, in the present two-dimensional case it is useful to re-write eq. (12) in terms of the stream function ψ, related by means of eq. (15) to the transformed relative stream function f:

$\psi = f\,(2\,k^2 s^2)^{1/3}\, x_1\, \Delta w_e$ (18)

the transformation into the similarity form follows closely the analogous approach for submerged plane jets in [3] – the dissimilarity in the present case is in the constant velocity difference

$\Delta w_e = w_{ea} - w_{eb}$ (19)

replacing the maximum velocity in the velocity profiles of the jets in [3], which varies in the streamwise direction. the transformed velocities and their derivatives are:

$w_1 = \Delta w_e\,\frac{df}{d\eta}$, $\quad w_2 = (2\,k^2 s^2)^{1/3}\,\Delta w_e \left(\eta\,\frac{df}{d\eta} - f\right)$,

$\frac{\partial w_1}{\partial x_1} = -\frac{x_2\,\Delta w_e}{(2\,k^2 s^2)^{1/3}\,x_1^2}\,\frac{d^2 f}{d\eta^2}$, $\quad \frac{\partial w_1}{\partial x_2} = \frac{\Delta w_e}{(2\,k^2 s^2)^{1/3}\,x_1}\,\frac{d^2 f}{d\eta^2}$, $\quad \frac{\partial^2 w_1}{\partial x_2^2} = \frac{\Delta w_e}{(2\,k^2 s^2)^{2/3}\,x_1^2}\,\frac{d^3 f}{d\eta^3}$.

inserting these expressions into the momentum transport equation (12), mutual cancellation of the first and second terms and division by Δw_e²/x_1 results in the final form of the similarity transformed governing equations

$\frac{d^3 f}{d\eta^3} + f = 0 \quad\text{or}\quad \frac{d^2 f}{d\eta^2} = 0$ (20)

2.3 similarity transformation with the 1-equation model

again following closely the details of the derivation for the jet in [3], the prandtl equation (12) is transformed using the definition of the relative transverse co-ordinate

$\eta_{1eq} = \frac{x_2}{\sqrt{k\,s\,c_\nu}\; x_1}$ (21)

where c_ν is the proportionality constant from eq. (8), known in equilibrium turbulence to possess [2] the value c_ν = 0.548, while k and s are as in eqs. (16) and (17). again, this definition differs from the original in [3] by an added index – here the index 1eq – to discriminate the variable introduced in eq. (21) from the different definitions eqs. (1) and (15). this turbulence model requires that, simultaneously with the momentum transport equation, the transport equation for the specific energy of turbulent fluctuations e_f is solved as well:

$w_1\,\frac{\partial e_f}{\partial x_1} + w_2\,\frac{\partial e_f}{\partial x_2} - \frac{\partial \nu_t}{\partial x_2}\,\frac{\partial e_f}{\partial x_2} - \nu_t\,\frac{\partial^2 e_f}{\partial x_2^2} = P - \varepsilon$ (22)

where P is the turbulence production rate [2]

$P = \nu_t \left(\frac{\partial w_1}{\partial x_2}\right)^2$ (23)

and ε is the dissipation rate

$\varepsilon = c_z\,\frac{e_f^{3/2}}{k\,s\,x_1}$ (24)

the coefficient of the gradient transport rate in eq. (22) is ν_t – the same as in eq. (12), i.e. with the prandtl number for the gradient transport of fluctuations pr_t,ef = 1.0. this is a standard assumption, equivalent to the expectation of all quantities being transported by turbulent motions equally. as presented in [2], the standard value of the model constant in eq. (24) is c_z = 0.164.

the transformation into the similarity form, again following closely the analogous approach for submerged plane jets in [3], differs from the jet case by the absence of a streamwise variation of the velocity reference. the reference value used is the constant value from eq. (19). the transformation of eq. (22) requires a suitably defined relative value of the specific energy of turbulent fluctuations; using the velocity reference (19), it is

$\tilde e = \frac{e_f}{\Delta w_e^2}$ (25)

for the same reasons as in the algebraic model case, it is useful to re-write eq. (12) in terms of the stream function ψ, which is now related by the definition eq. (21) to the transformed relative stream function f:

$\psi = f\,\sqrt{k\,s\,c_\nu}\; x_1\, \Delta w_e$ (26)

the 1-equation model, eqs. (8) and (9), together with the usual eqs. (5) and (16), leads to the following expression for the turbulent viscosity, using the relative value eq. (25) of the fluctuation energy:

$\nu_t = k\,s\,c_\nu\,x_1\,\Delta w_e\,\sqrt{\tilde e}$ (27)

the transverse gradient of the turbulent viscosity is expressed by means of

$\frac{\partial \nu_t}{\partial x_2} = \sqrt{k\,s\,c_\nu}\;\frac{\Delta w_e}{2\sqrt{\tilde e}}\,\frac{d\tilde e}{d\eta}$ (28)

the transformed velocities and their derivatives are

$w_1 = \Delta w_e\,\frac{df}{d\eta}$, $\quad w_2 = \sqrt{k\,s\,c_\nu}\;\Delta w_e\left(\eta\,\frac{df}{d\eta} - f\right)$,

$\frac{\partial w_1}{\partial x_1} = -\frac{x_2\,\Delta w_e}{\sqrt{k\,s\,c_\nu}\;x_1^2}\,\frac{d^2 f}{d\eta^2}$, $\quad \frac{\partial w_1}{\partial x_2} = \frac{\Delta w_e}{\sqrt{k\,s\,c_\nu}\;x_1}\,\frac{d^2 f}{d\eta^2}$, $\quad \frac{\partial^2 w_1}{\partial x_2^2} = \frac{\Delta w_e}{k\,s\,c_\nu\,x_1^2}\,\frac{d^3 f}{d\eta^3}$,

and the transformed terms in eq. (22) are

$\frac{\partial e_f}{\partial x_1} = -\frac{x_2\,\Delta w_e^2}{\sqrt{k\,s\,c_\nu}\;x_1^2}\,\frac{d\tilde e}{d\eta}$, $\quad \frac{\partial e_f}{\partial x_2} = \frac{\Delta w_e^2}{\sqrt{k\,s\,c_\nu}\;x_1}\,\frac{d\tilde e}{d\eta}$, $\quad \frac{\partial^2 e_f}{\partial x_2^2} = \frac{\Delta w_e^2}{k\,s\,c_\nu\,x_1^2}\,\frac{d^2\tilde e}{d\eta^2}$.

inserting these expressions into the momentum transport eq. (12) converts it, after mutual cancellation of the first and second terms and division by Δw_e²/x_1, into the final form

$2\sqrt{\tilde e}\,\frac{d^3 f}{d\eta^3} + \left(2 f + \frac{1}{\sqrt{\tilde e}}\,\frac{d\tilde e}{d\eta}\right)\frac{d^2 f}{d\eta^2} = 0$ (29)

an analogous procedure applied to eq. (22) results in

$2\sqrt{\tilde e}\,\frac{d^2\tilde e}{d\eta^2} + \frac{1}{\sqrt{\tilde e}}\left(\frac{d\tilde e}{d\eta}\right)^2 + 2 f\,\frac{d\tilde e}{d\eta} + 2\sqrt{\tilde e}\left(\frac{d^2 f}{d\eta^2}\right)^2 - 2\,\frac{c_z}{k\,s}\,\tilde e^{3/2} = 0$ (30)

the solution of these simultaneous equations is complicated by the unknown parameter c_z/(k s). an analogous problem arose in [3], where it was solved by using experimental data about the magnitude of the thickness growth factor s.

3 experiment: piv measurements in a developing jet

the experimental facility used for the generation of the investigated developing jets was designed by the present author for investigations of helicity transport in swirling jets [15]. it was tested in the initial stages of the project without imparting the swirl to the jet [7], and it is the data from these preliminary tests which were used for the present investigations. the round nozzle exit was of d = 32 mm diameter. the working fluid was air at atmospheric conditions, seeded upstream from the nozzle with water particles of 0.19 mm mean diameter. the reynolds number evaluated for the nozzle exit conditions,

$Re_d = 4.54 \cdot 10^3$ (31)

though not very high, was sufficient for reasonably well developed turbulence in the mixing regions of the jet. a schematic representation of the investigated flow and the definition of the used co-ordinate axes is presented in fig. 11. the axial component of the external velocity was practically zero, w_eb = 0. the particle image velocimetry (piv) system, supplied by oxford lasers inc., was used for the velocity measurement. the flowfield was illuminated by a twin-pulse nd:yag laser with a generated-light wavelength of 532 nm. the laser emits two pulses of 5 ns duration, with an energy of 50 mj per pulse. in the present case, the time delay between the two laser pulses was set to 18 μs–40 μs, to ensure that the particle-image displacement is about one-quarter of the interrogation domain. the light sheet, positioned in the meridian plane (passing through the jet axis), was less than 2 mm in thickness.

fig. 11: schematic representation of the investigated mixing layers surrounding the core of a round jet with a zero axial component of the outer velocity w_eb.

fig. 12: velocity profiles obtained by piv measurements in an air jet (data obtained in collaboration with g. regunath). only the three profiles nearest to the exit were used for the comparison with the similarity solutions. the slight asymmetry of the profiles is hardly avoidable in practical flows.
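as a plausibility check on eq. (31) – our own back-of-envelope sketch, not the paper's – the nozzle exit velocity implied by Re_d can be recovered from the definition Re_d = w_e d/ν, assuming a textbook value for the kinematic viscosity of air:

```python
# sketch: exit velocity implied by eq. (31); nu_air is an assumed textbook value
nu_air = 1.5e-5     # kinematic viscosity of air at ~20 degC [m^2/s] (assumption)
d = 0.032           # nozzle exit diameter [m]
re_d = 4.54e3       # reynolds number of eq. (31)

w_e = re_d * nu_air / d
print(f"implied nozzle exit velocity: {w_e:.2f} m/s")   # roughly 2.1 m/s
```

the resulting velocity of about 2 m/s is consistent with the remark that the reynolds number was not very high.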
the intensity of the light reflected from the tracer particles is recorded separately by two synchronised digital ccd cameras (pco sensicam). the data were processed by the vidpiv software package, also supplied by oxford lasers inc. the data were collected under the present author's supervision by gavita regunath. they consist of sets of time-mean velocity values obtained at constant 1 mm steps along transverse lines perpendicular to the nozzle axis. of them, three velocity profiles, located at 32 mm (one nozzle diameter), 64 mm (two diameters) and 96 mm (3 d), as shown in fig. 12, were chosen as potentially suitable for validation of the similarity solutions. previous positive experience with an analogous evaluation of the axisymmetric mixing layer suggested that a validation based on comparison with planar mixing layer solutions is worthwhile. the similarity co-ordinate for the velocities used for the presentation of the data in fig. 13 is defined by eq. (2) – it is the same as for the görtler solution. however, the transformed distance co-ordinate is neither of those defined above – neither görtler's eq. (1) nor eqs. (15) and (21) of the better models – because of the absence, at the initial stage of data processing, of knowledge about the quantities used in these later definitions. the problem was circumvented by using yet another dimensionless co-ordinate, η_0.6, related to the convention thickness δ_0.6, analogous to δ_0.8 specified in figs. 5 and 6 – here, however, δ_0.6 is the distance between the locations in which the velocity w_1 differs from the outer velocity on each side by 0.2 Δw_e. fig. 14 presents an evaluation of the growth factor s_0.6 in δ_0.6 = s_0.6 x_1 (a version of eq. (16)) from the experimental data.

4 solutions of the transformed equations

4.1 algebraic model

it is useful to introduce auxiliary variables, identical to those used in [3] and [5]:

$u = \frac{w_1}{\Delta w_e} = \frac{df}{d\eta}$

the dimensionless axial time-mean velocity, and g, the dimensionless transversal gradient of the axial time-mean velocity.

fig. 13: velocity data evaluated from fig. 12 and re-plotted in similarity variables. no significant effect of transversal curvature (which increases with x_1*) was found. the different osculation radii r_a and r_b indicate a lack of central symmetry, a feature of the usually presented görtler solution in its first-term form, eq. (4).

fig. 14: convention thickness δ_0.6 dependence on the downstream distance x_1*. the resultant least-squares expression is fitted through all experimental data as a mean between the not perfectly symmetric top and bottom sides of the jet.

the two ordinary differential equations, eq. (20), describing the mixing layer flow in the transformed co-ordinates are then decomposed into the set of three first-order equations

$\frac{df}{d\eta} = u, \qquad \frac{du}{d\eta} = g, \qquad \frac{dg}{d\eta} = -f \quad\text{or}\quad \frac{dg}{d\eta} = 0$ (32)

of which the last one has two alternatives. the meaning and necessity of this somewhat unusual feature is discussed below. the next part of the problem is to determine the set of boundary conditions at a point which is chosen as the starting point of the integration. there is the trivial set of zero boundary conditions f = u = g = 0 outside the layer, but trying to start the integration from this boundary is useless – the values would simply remain zero. there is, unfortunately, no other point across the layer in which all three values are known.
the best choice seemed to be to select as the starting point the location at which dg/dη = 0, since for the algebraic model this also means f = 0. it is then necessary to find there the proper starting values for u and g. the location must be near to the centre-of-symmetry point u = 0.5 of the görtler solution, because the velocity profile u(η) there may be roughly approximated by a sloping straight line, for which the distribution of the gradient g(η) is a second-order parabola with its vertex, dg/dη = 0, just at this location. nevertheless, the obvious asymmetry, as seen in the experimental data in fig. 13, requires the location of the maximum gradient g – and the corresponding starting point η_alg = 0 of the integration – to be slightly to the right-hand side, at a slightly higher value of u. cut-and-try manipulation of the starting conditions finally led to satisfactory starting values found at g = 0.523 and u = 0.5834. some exact optimisation algorithm could be applied, but this was considered unnecessary, since the algebraic model cannot anyway be expected to provide a really good correspondence with real flows; it is discussed here as a mere starting stage for later, more sophisticated solutions. with the above starting values, the velocity profile obtained by the runge-kutta integration procedure consists of two parabolic arcs. fig. 15 presents the integral curves at positive η_alg values for several starting values of the dimensionless velocity gradient g. with the proper values for both u and g there is a maximum u = 1 at η_alg = 1.166581. on the negative η_alg side there is a minimum u = 0 at η_alg = −1.846694. the explanation for the perhaps somewhat strange existence of two alternatives for the last equation in eq. (32) becomes apparent in fig. 16: the alternative g = 0 (and hence du/dη = 0) provides the smooth continuation beyond the extremes, where dg/dη = −f would lead to a physically wrong decrease in the velocity. qualitatively similar, though numerically different, is the situation on the left-hand side of negative distances η_alg. the complete solution, not only for the velocity profile u(η) but also for the profiles of the other transformed quantities of interest for the algebraic model, is shown in fig. 17. the algebraic solution for relative velocity u shown in fig. 17 reaches the value u = 0.8 at point a, where η_alg = 0.4304798. on the other side, the value u = 0.2 at point b is reached at η_alg = −0.8160231. these two values provide the information needed for conversions between the co-ordinate η_0.6, used to process the experimental data, and the present co-ordinate η_alg. note that, because of the choice of the starting point for the integration, the positions have to be shifted horizontally by −0.192772.

fig. 15: solutions for the dimensionless velocity u obtained by numerical integration of the set of equations df/dη = u, du/dη = g and dg/dη = −f in the direction of the positive transversal similarity co-ordinate η_alg.

fig. 16: with g = 0.523 the solution curve just reaches u = 1 at the vertex of the parabola. thereafter, the physically unacceptable decrease is avoided by switching from the solved equation dg/dη = −f to the other alternative, g = 0.
when expressed in terms of η_0.6, the values indicating the position are larger and have to be multiplied by 0.62325145 (= one half of the distance between points a and b) to obtain the distance in terms of η_alg. this conversion provides an opportunity for comparing the similarity solution using the algebraic model with the experimental data. it should perhaps be stressed that these data are by no means an infallible reference. besides the obvious possibility of the influence of transverse curvature, it should also be kept in mind that they were taken at a reynolds number (cf. eq. (31)) too small to secure the fully developed turbulence which is assumed in the model. also, the development distance downstream from the nozzle exit edge is here almost certainly too short for the turbulence to achieve full similarity. there were also inevitable inaccuracies in the experiment, as is visible from e.g. the imperfect symmetry in fig. 12. considering all such negative factors, the agreement of the dimensionless velocities obtained from the similarity solution and from the experiment, as presented in fig. 18, must be described as very good. indeed, the solution is perhaps sufficiently accurate for most practical engineering applications. despite the extreme simplicity of the solved equations (cf. eq. (32)), this similarity solution – in contrast to the central symmetry of the correlations used earlier, as shown e.g. in fig. 8 – even results in profiles which are asymmetric, exhibiting better agreement with reality. though it must be admitted that the radii of the osculation circles (cf. fig. 13) are not rendered particularly well, they are at least not the same for positive and for negative η_alg, and reflect the fact that the radius r_b on the negative η_alg side is larger. as could be expected, the agreement is less perfect for the values of the first derivative in fig. 19, and it is even worse for the second derivative of the velocity in fig. 20. the derivatives were evaluated for the experimental data using a 5-point numerical differentiation scheme (which provides some smoothing effect). of particular importance is the comparison presented in fig. 20. considering all the possible sources of inaccuracies and discrepancies, there is actually a very good agreement between the similarity solution and the experiment in the central part of the diagram, roughly between η_alg = −0.5 and η_alg = 0.5. beyond these limits, the solution and data diverge quite widely. since the algebraic model is a reasonable description of transport-less turbulence, the conclusion from fig. 20 is that the turbulence is very probably in near equilibrium in the central part of the mixing layer. this means its local production rate is practically equal to the local destruction rate and does not generate any excess which has to be transported away.

fig. 17: complete solution of the planar mixing layer profiles – of relative velocity u, its gradient g, and the relative stream function f – using the algebraic model of turbulence. eq. (32) integrated using the runge-kutta algorithm and the indicated boundary condition values at η_alg = 0.

fig. 18: comparison of the profiles of relative velocity u obtained by the present similarity solution using the algebraic model of turbulence with data points obtained as averages of the values at regular intervals in fig. 13.
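the integration just described is easy to reproduce. the following is a minimal sketch (ours; the paper's own code is not published) of a fixed-step fourth-order runge-kutta march of eq. (32) from the starting values f = 0, u = 0.5834, g = 0.523 at η_alg = 0, with the switch to the alternative g = 0 once the gradient falls to zero at the profile extremes:

```python
# sketch: runge-kutta integration of eq. (32) with the two-alternative closure
import numpy as np

def rhs(y, switched):
    f, u, g = y
    dg = 0.0 if switched else -f        # dg/deta = -f, or the alternative dg/deta = 0
    return np.array([u, g, dg])         # (df/deta, du/deta, dg/deta)

def integrate(direction=+1.0, h=1e-4, eta_max=3.0):
    y = np.array([0.0, 0.5834, 0.523])  # starting values at eta_alg = 0
    switched = False
    eta = 0.0
    while abs(eta) < eta_max:
        # classical rk4 step with a signed step size
        k1 = rhs(y, switched)
        k2 = rhs(y + 0.5 * direction * h * k1, switched)
        k3 = rhs(y + 0.5 * direction * h * k2, switched)
        k4 = rhs(y + direction * h * k3, switched)
        y = y + direction * h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        eta += direction * h
        # switch to g = 0 where the gradient reaches zero (the profile extreme)
        if not switched and y[2] <= 0.0:
            y[2] = 0.0
            y[1] = float(round(y[1]))   # snap u to its extreme: 1 (+side), 0 (-side)
            switched = True
            print(f"switch at eta_alg = {eta:+.5f}, u = {y[1]:.0f}")
    return y

integrate(+1.0)   # per the paper, the switch occurs near eta_alg = +1.1666 (u = 1)
integrate(-1.0)   # per the paper, near eta_alg = -1.8467 (u = 0)
```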
in the outer parts of the layer this ceases to be the case: the turbulence there is substantially influenced by transport (advective transport and/or gradient transport).

4.2 1-equation model

this model of turbulence, based on the simultaneous solution of an additional transport equation for one turbulence parameter, can perform what the algebraic model fails to do – take into account the spatial transport of the turbulence. it should provide a better approximation to reality in the outer parts of the mixing layer, where the turbulence ceases to be in local equilibrium. again, it is useful to introduce (by analogy with [3, 5], and also with the already discussed algebraic model) the auxiliary variables. apart from the dimensionless axial time-mean velocity u and the dimensionless transversal gradient g, the similarity solution with the 1-equation model also operates with ẽ – the dimensionless specific energy of turbulent fluctuation, already defined by eq. (25) – and also uses its derivative n = dẽ/dη. the derivatives, of course, are now with respect to another independent variable, the transversal co-ordinate defined in eq. (21); it should be noted that its definition operates with the constants c_ν, k, and s. using the auxiliary variables, the equations derived for this model in part 2.3, eqs. (29) and (30), are re-written as

$\frac{df}{d\eta} = u, \quad \frac{du}{d\eta} = g, \quad \frac{d\tilde e}{d\eta} = n, \quad \frac{dg}{d\eta} = -\frac{f\,g}{\sqrt{\tilde e}} - \frac{g\,n}{2\,\tilde e}, \quad \frac{dn}{d\eta} = \frac{c_z}{k\,s}\,\tilde e - \frac{f\,n}{\sqrt{\tilde e}} - \frac{n^2}{2\,\tilde e} - g^2$ (33)

the problem now involves the integration of these five simultaneous first-order ordinary differential equations, one of them containing the numerical parameter c_z/(k s). the solution requires knowledge of five boundary conditions at a point, which is then used as the starting point of the integration. also to be determined is the numerical value of the parameter c_z/(k s). it may be useful to note that the same parameter was encountered in similarity solutions of a plane submerged jet in [3]. its value evaluated there, from considering known jet spreading rates, was

$\frac{c_z}{k\,s} = 6.40$ (34)

unfortunately, the only points at which all five values f, u, g, ẽ, n are known are the two extremes at the theoretical boundaries η_1eq → −∞ and η_1eq → +∞. these values (with zero gradients g as well as n) are useless for starting the integration, since they lead to the trivial case of both dg/dη and dn/dη remaining zero everywhere. investigation of similar problems with a hierarchy of models of progressively increasing complexity brings an important advantage, as already noted in similar circumstances in [2] and [5]: although the lower members of the hierarchy describe the reality less well, they operate with parameters that are easier to determine and evaluate, so that they provide useful starting points for the more sophisticated approaches of the higher members of the hierarchy. in the present case, the simple solution with the algebraic model – a model which may, indeed, be characterised as rather primitive, resulting in profiles with a discontinuity – provides useful information which helps to solve the boundary conditions problem. it resulted above in the interesting conclusion that in the central part of the mixing layer the turbulence tends to be in local equilibrium, its local production rate being equal to the dissipation rate. the effect of the turbulence dissipation rate ε in the transport equation for fluctuation energy – the equation for dn/dη in eq. (33) – is represented by the first term on the right-hand side.
the last term on the right-hand side there represents the effect of the turbulence production rate in the original transport eq. (22). if the investigated turbulence were in equilibrium, there would be

$g^2 = \frac{c_z}{k\,s}\,\tilde e$

(production equals dissipation), which may be differentiated to

$2\,g\,\frac{dg}{d\eta} = \frac{c_z}{k\,s}\,n$ (35)

inserting these two expressions into the equation for dg/dη – eq. (33) – results in

$2\,\frac{dg}{d\eta} = -f\,\sqrt{\frac{c_z}{k\,s}}$ (36)

which differs only by the constants (due to the different definitions of the similarity transformed transverse variable η) from the equation dg/dη = −f derived (eq. (32)) for the algebraic model. it is a reasonable choice – at least as an initial step to be improved upon later – to select again as the starting point of the integration, as in the algebraic model case, the central location at which dg/dη = 0, and to use there as the boundary values for u and g the values which were successful with the algebraic model. the two remaining boundary conditions may then be, as the first approximation, evaluated for the equilibrium turbulence conditions:

$f = 0, \quad u = 0.5834, \quad g = 0.523, \quad \tilde e = \frac{k\,s}{c_z}\,g^2, \quad n = \frac{2\,k\,s}{c_z}\,g\,\frac{dg}{d\eta} = 0$ (37)

also the constants of the problem can be reasonably identified using the previous, algebraic-model solution. considering the definition of η_alg, eq. (15), and the value of the experimentally evaluated growth factor s_0.6 = 0.1219 in fig. 14, the correspondence between the two similarity co-ordinates leads to the conclusion that the size Λ of the vortices most effectively transporting momentum across the mixing layer is practically one half of the local convention thickness δ_0.6 (cf. eq. (7)):

$k = 0.50175$ (38)

this, together with c_z = 0.164 from [2], indicates

$\frac{c_z}{k\,s} = 2.6813$ (39)

– a value surprisingly smaller than the 6.4 found for the jet flows, as mentioned above. a partial explanation for the difference is that the jet actually contains two shear layers, one on each side. very rough reasoning would therefore suggest the parameter c_z/(k s), reflecting the size of the turbulent eddies, to be 6.4/2 = 3.2 – smaller than, but comparable to, 2.6813. finally, with the value of the prandtl-kolmogorov coefficient c_ν = 0.5477 in the definition eq. (26), as derived e.g. in [2], it is possible to evaluate the coefficient in the present similarity transverse co-ordinate η_1eq (which is needed e.g. for re-plotting the experimental data from figs. 18 to 20 so that they may be used for comparison in fig. 21):

$\eta_{1eq} = 1.068593\,\eta_{alg}$ (40)

already at the very first attempt, numerical integration of eqs. (33) using the conditions eqs. (37) resulted in a surprisingly better approximation to the experimental data than the solution with the algebraic model, as shown for the quantities of the transformed momentum transport equation in fig. 21.

fig. 19: comparison of the profiles of the transversal gradient g = du/dη of the relative velocity u obtained by the similarity solution and by the experiment.

fig. 20: comparison of the profiles of dg/dη = d²u/dη² obtained by the similarity solution and by the experiment. according to the primary alternative of eq. (32), the values are also the negative values of the transformed stream function f.
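this first attempt is again easy to sketch in code. in the following (ours; the right-hand sides are our reading of the reconstructed eq. (33), not verified against the original), the starting conditions of eq. (37) are evaluated and the five-equation set is marched with scipy's solve_ivp; with g = 0.523 and c_z/(k s) = 2.6813, the starting level of the relative fluctuation energy comes out as ẽ = g²·k s/c_z ≈ 0.102:

```python
# sketch: first-attempt integration of the 1-equation set, eq. (33)
import numpy as np
from scipy.integrate import solve_ivp

CZ_KS = 2.6813      # c_z/(k s), eq. (39)

def rhs(eta, y):
    f, u, g, e, n = y
    e = max(e, 1e-12)           # guard: the model is only meaningful for e~ > 0
    se = np.sqrt(e)
    dg = -f * g / se - g * n / (2.0 * e)                        # from eq. (29)
    dn = CZ_KS * e - f * n / se - n**2 / (2.0 * e) - g**2       # from eq. (30)
    return [u, g, dg, n, dn]

g0 = 0.523
y0 = [0.0, 0.5834, g0, g0**2 / CZ_KS, 0.0]        # eq. (37)
print(f"starting fluctuation energy e~ = {y0[3]:.4f}")  # about 0.102

sol = solve_ivp(rhs, (0.0, 2.5), y0, max_step=1e-3)     # march towards the +edge
```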
the transport effects now taken into account improve the character of the solution at and beyond the limits of the equilibrium zone, where the solution with the algebraic model resulted in the discontinuity and a substantial divergence between the experiment and the theory – as shown e.g. in fig. 20. unfortunately, the integration run with the results shown in fig. 21 did not meet the theoretical zero boundary conditions, in what is assumed to be a non-turbulent external flow outside the layer, for the fluctuation energy transport equation. since the experimental data used for the comparison did not incorporate information about the fluctuation energy, the success of the individual solution computations of the fluctuation transport equation was judged on the basis of whether or not they met the requirement of zero values at the boundaries η_1eq → −∞ and η_1eq → +∞. computation of the relative fluctuation energy profile showed that in the case plotted in fig. 21, which will be called case a, the boundary conditions were not met, and the fluctuation energy did not decrease to zero at the layer boundaries. of course, there is no reason why the conditions in the central part of the mixing layer, in the vicinity of the starting point of the integration, should be exactly in equilibrium. using the equilibrium conditions eq. (37) as the departure point for testing small deviations from them, a solution compliant with the theoretical zero-turbulence condition at the boundaries was obtained. it was arrived at after rather laborious cut-and-try adjustments of the starting conditions as well as of the parameter c_z/(k s) of the solution. this theoretically proper solution is here described as case c. fig. 22 presents for this case the computed profiles of the fluctuation energy as well as of its transverse gradient in the similarity co-ordinates. the values of the starting conditions in what may be described as the "middle" of the mixing layer, at η_1eq = 0, are indicated both in fig. 21 and in fig. 22. while the changes of the other values are small and may be called insignificant, an important fact is that to obtain the theoretical boundary condition in case c a much smaller specific energy of fluctuations, non-dimensionalised to ẽ, was required: it had to be decreased to nearly one half of the original value from fig. 21 (a comparison of figs. 21 and 22 shows that the starting value in case c had to be 0.545-times smaller than in fig. 21). the general conclusion from the results in case c is that the distributions of the turbulence parameters – as well as, e.g., of the transverse velocity gradient – exhibit much more abrupt ends at both boundaries than the experimental data, which tend to approach the zero values asymptotically. indeed, an important property of case c is the fact that, instead of such an asymptotic character, the distributions of the variables end at quite well-defined boundary points – determined in fig. 22 by the intersections of the slope curve with the horizontal axis. further away, beyond these ends, the values cease to vary.

fig. 21: initial attempt at a solution using the 1-equation model, compared with the experiment. profiles computed from the starting conditions at η_1eq = 0 with the solution parameter c_z/(k s) exactly as used earlier with the algebraic model for equilibrium turbulence – eq. (37).
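in modern terms, the laborious cut-and-try adjustment is a shooting problem, and could be automated. a possible approach (our own simplification, not the procedure actually used in the paper) bisects on the starting energy ẽ₀ alone, classifying each trial run by the sign of ẽ at the layer edge; integrate_to_edge is a hypothetical wrapper around the solve_ivp call sketched earlier:

```python
# sketch: automating the case-c adjustment as a one-parameter shooting problem
# (simplified: bisects only on the starting energy e0; the paper also adjusted
#  the starting gradient n and the parameter c_z/(k s) by hand)

def overshoots(e0):
    """integrate eq. (33) from eta = 0 with starting energy e0 and report
    whether e~ is still above zero once the velocity profile has flattened.
    True means e0 was too large. integrate_to_edge is hypothetical."""
    f, u, g, e, n = integrate_to_edge(e0)
    return e > 0.0

def find_e0(lo, hi, tol=1e-6):
    # classical bisection between a collapsing and an overshooting start
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (lo, mid) if overshoots(mid) else (mid, hi)
    return 0.5 * (lo + hi)
```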
the slope curves (not only that for the fluctuation energy in fig. 22) exhibit a marked discontinuity at these well-defined end points of the solution. this non-asymptotic character, with the rather abrupt ends of the solution, is not the only suspect feature of the 1-equation model solution with tollmien's constant-Λ hypothesis applied to the characteristic scale of turbulence. a closer study of the modelled turbulence reveals another strange aspect. plotted in fig. 23 are the distributions of the individual terms in the transport equation for the fluctuation energy e_f across the mixing layer, for the 1-equation solution with the starting values corresponding to case c. it is immediately apparent there that, with the decreased starting value ẽ, the curve representing the turbulence dissipation term is very much lower than the curve of the production term. contrary to what was deduced from the previous analysis of the experimental results and the 1-equation solution, the central part of the mixing layer is here not in local equilibrium. in fact, the imbalance between the production and the dissipation is largest in the centre of the layer, near η_1eq = 0. the excess of the produced and not dissipated turbulence is removed by the two transport mechanisms, gradient diffusion and advection. the advective transport is necessarily zero near the centre of the layer, where there is no transversal fluid movement. fig. 23 shows that this is indeed the location of the extreme diffusion transport rate, the magnitude of which is exactly equal to the difference between the production and dissipation rates. in fact, there are two distinguishable components of the diffusive transport. the second, perhaps less obvious, component is due to the transversal variations of the turbulent viscosity. the present solution enables us to evaluate them independently – though, to simplify the diagram, fig. 23 plots the sum of the two components as the overall diffusive transport (but note the indicated expression as the sum of the two terms). as the advection increases with increasing distance η_1eq from the centre, the absolute magnitude of the diffusion decreases. this trend does not end on reaching the zero value and continues, so that the diffusion becomes positive at locations more distant from the centre. as the edges of the layer are approached, the diffusive transport is approximately equal in magnitude, but of opposite sign, to the advection. this mutual balancing is necessary, since the difference between production and dissipation decreases, approaching zero at the layer edges. the strange fact is that in fig. 23 there are actually no locations to which the turbulence should be transported. the solution in this case c does not predict any part of the layer cross section in which production is lower than dissipation. the quite massive diffusion and advection transports simply oppose each other.

fig. 22: similarity-transformed fluctuation energy profiles in the mixing layer, obtained with the 1-equation model solution using the conditions at η_1eq = 0 adjusted so that the integration proceeded to the ideal zero-value boundary conditions on both sides of the layer.

fig. 23: individual similarity-transformed terms in the transport equation for fluctuation energy in the 1-equation model solution from fig. 21.
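in the similarity variables, the four budget terms plotted in figs. 23 and 24 are simple algebraic combinations of the solution profiles. the following sketch (our reading of the transformed eqs. (29)–(30), with the same hedges as above) shows how they could be evaluated:

```python
# sketch: similarity-transformed terms of the fluctuation-energy budget
# (our reconstruction; all terms share a common dimensional factor dwe^3/x1)
import numpy as np

def budget_terms(f, g, e, n, dn, cz_ks=2.6813):
    se = np.sqrt(e)
    production  = se * g**2                    # p = nu_t (dw1/dx2)^2, eq. (23)
    dissipation = cz_ks * e**1.5               # eq. (24)
    advection   = -f * n                       # transport by the mean flow
    diffusion   = se * dn + n**2 / (2.0 * se)  # gradient transport, two components
    # for a valid solution the residual
    #   advection - diffusion - production + dissipation
    # should vanish identically across the layer
    return production, dissipation, advection, diffusion
```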
this strange character is removed if we allow the turbulence energy to be non-zero on at least one of the layer edges, as in case a described above. to find out more about these aspects of the solution, yet another, intermediate case b was evaluated. essentially, it differed from case c by a higher starting value of the transformed fluctuation energy ẽ, increased in case b to 0.6-times the case a value. the only other change in the starting conditions in case b was a decrease in the transformed gradient n. this change of the central slope was adjusted to obtain a slightly higher turbulent transport towards the right-hand, higher-velocity side. this corresponds to the idea of there being higher turbulence inside the jet. plotted in fig. 24 are the distributions of the individual terms in the transport equation for the fluctuation energy e_f across the mixing layer profiles, for the 1-equation solution with the starting values corresponding to case b. in fact, the overall character of the curves does not differ very much from that seen in fig. 23. again, there is a difference – shown as the vertical distance – between the curve of the turbulence dissipation term and that of the production term. this difference, however, is now smaller. evidently, the increase of the starting value of the transformed fluctuation energy shifts the conditions in the central zone of the mixing layer towards the local equilibrium. the important fact in fig. 24 is the existence of the two intersection points – marked in the diagram by two black dots. the turbulence production is higher than the dissipation in the central zone, up to these intersection points of the two curves. further away, it is the dissipation that becomes dominant. this gives sense to the idea of turbulence transport: turbulent fluctuations are transported from the central zone, where they are produced, to the outer parts of the layer, where they are dissipated. it may be interesting to note that the transport into the outer parts – especially clearly seen on the right-hand side of the diagram in fig. 24 – is mainly the advective component of the transport. as a result of the presence of the turbulence in the outer parts, the abrupt ends at the layer edges seen in fig. 23 now disappear, and the distributions of the flow parameters obtain an asymptotic character. unfortunately, as already mentioned, the experimental data collected by g. regunath contain only the values of the time-mean velocities. there is no direct information about the turbulence which would prove (or disprove) the present conclusions about the distributions of the turbulence equation terms. however, assuming the validity of the 1-equation approach with the tollmien hypothesis (the latter may be invalidated if it could be demonstrated that the turbulence length scale is not constant across the profiles), the non-zero outer turbulence level in the discussed cases a and b provides the only reasonable explanation for the experimental data. this is demonstrated in fig. 25, again (similarly to fig. 20 for the algebraic model) providing a comparison of the second-derivative profiles dg/dη = d²u/dη² obtained by the experiment and by the similarity solution, here of course with the 1-equation model. superimposed on the experimental data points (evaluated from g. regunath's data by repeated numerical differentiation) are three curves, computed with three different sets of the starting values, corresponding to case a, case b and case c.
it is immediately apparent that the theoretically "best" solution, case c, meeting the condition of no turbulent fluctuation outside the mixing layer, fails to agree with the data points. in fig. 25 it is seen to almost coincide with the unrealistic end effects of the algebraic-model solution in fig. 20. the other computed cases in fig. 25 form a succession of shapes progressing from the near-algebraic distribution to the distribution found in the experiment. the inescapable evidence provided by these solutions is that the theoretically ideal 1-equation solution incorporating the tollmien hypothesis Λ = const is inadequate when confronted with the available experimental data. either it is to be admitted that turbulence is propagated away from the layer beyond its nominal boundaries, or the size scale of the turbulence is distributed across the layer profiles in a more complex manner (which would be surprising, considering the almost perfect success of the hypothesis in the related solution of submerged turbulent jets, e.g. in [2, 5 and 6]). information about the distributions of the turbulence parameters across the mixing layer seems to be essential for deciding which is the correct road to follow. the author found some information in the experimental data of liepmann & laufer [10] and chow & korst [11], but they are insufficient – and, indeed, in some aspects mutually contradictory. an answer to the surprising findings about the inadequacy of what was considered the theoretically correct zero boundary conditions may be provided by the most complicated similarity solution, based on solving two simultaneous transport equations for parameters characterising turbulence. this approach to the problem is the subject of another planned paper.

5 conclusions

several important results of the present investigations may be summarised as follows:
• the similarity transformation, converting the solution of two-dimensional flows into a problem described by a set of first-order ordinary differential equations, is a powerful tool deserving more attention.
• its application to the present case of the mixing layer surrounding a round jet is successful. it proves the earlier conclusions from this author's 1978 experiments that the effects of transversal curvature may be neglected for engineering purposes.
• it is useful to use a hierarchy of progressively more complex turbulence models. lower members of the hierarchy provide helpful departure points for solutions with more advanced models, providing insight and useful numerical values.
• it is useful to start the integration from an internal location where there is at least one zero gradient inside the layer.
• the algebraic as well as the 1-equation model solutions described here compare favourably with the available experimental data and may suffice for practical applications. they are not perfect, in particular exhibiting physically unrealistic discontinuities at the edges and leaving a number of open questions concerning the turbulence structure. a solution with a more advanced, 2-equation turbulence model is being prepared for a forthcoming publication.

fig. 24: another 1-equation model solution, computed for a slightly increased starting value of the fluctuation energy ẽ, between the values of case a and case c. the diagram, by analogy with fig. 23, again shows the individual terms in the turbulent energy transport equation.
fig. 25: comparison of the profiles of dg/dη = d²u/dη², evaluated (by differentiation) from the experimental velocity data and computed by the 1-equation model solution, with the three cases differing in the numerical values of the conditions inserted at η_1eq = 0 at the start of the integration.

references
[1] tollmien, w.: "berechnung turbulenter ausbreitungsvorgänge." (computation of turbulent propagation processes – in german), zeitschrift für angewandte mathematik und mechanik, vol. 6 (1926), p. 468.
[2] tesař, v.: "mezní vrstvy a turbulence." (shear layers and turbulence – in czech), publishing house of ctu prague, czech republic, various editions from 1984 to 1996, isbn 80-01-00675-1.
[3] tesař, v.: "the solution of the plane turbulent jet." acta polytechnica, vol. 36 (1996), no. 3, p. 15, issn 1210-2709.
[4] tesař, v.: "nozzle characteristics – the boundary layer model." hydraulika i pneumatika, vol. xxiv (2004), no. 3, poland, p. 35, issn 1505-3954.
[5] tesař, v.: "two-equation turbulence model solution of the plane turbulent jet." acta polytechnica, vol. 35 (1995), no. 2, p. 19, issn 1210-2709.
[6] tesař, v.: "two-equation turbulence model similarity solution of the axisymmetric fluid jet." acta polytechnica, vol. 41 (2001), no. 2, p. 26, issn 1210-2709.
[7] regunath, g., tesař, v., zimmerman, w. j. b., russell, n.: "novel merging swirl burner design controlled by helical mixing." proceedings of the 7th world congress of chemical engineering, glasgow, 2005.
[8] görtler, h.: "berechnung von aufgaben der freien turbulenz auf grund eines neuen näherungsansatzes." (computation of free turbulence problems on the basis of a new approximation theorem – in german), zeitschrift für angewandte mathematik und mechanik, vol. 22 (1942), p. 244.
[9] tesař, v.: "similarity solutions of basic turbulent shear flows with one- and two-equation models of turbulence." zeitschrift für angewandte mathematik und mechanik, vol. 77 (1997), sup. 1, p. 333, issn 0946-8463.
[10] liepmann, h. w., laufer, j.: "investigation of free turbulent mixing." naca technical note tn 1257, 1947.
[11] chow, w. l., korst, h. h.: "on the flow structure within a constant pressure compressible turbulent jet mixing region." nasa technical note d-1894, ames research center, 1963.
[12] cheng, b. l., glimm, j., jin, h. s., sharp, d.: "theoretical methods for the determination of mixing." laser and particle beams, vol. 21 (2003), issue 03, p. 429.
[13] olsen, m. g., dutton, j. c.: "planar velocity measurements in a weakly compressible mixing layer." journal of fluid mechanics, vol. 486 (2003), p. 51.
[14] tesař, v.: "směšování koaxiálních průtoků a fluidická čerpadla." (mixing of concurrent flows and fluidic pumps – in czech), acta polytechnica (1982), no. 7, ii, 1.
[15] tesař, v.: "time-mean helicity distribution in turbulent swirling jets." accepted for publication in acta polytechnica, 2005, issn 1210-2709.
[16] noether, e.: "invariante variationsprobleme." nachr. könig. gessell. wissen. göttingen, math-phys. kl., 1918, p. 235.
[17] kara, a. h., mahomed, f. m.: "relation between symmetries and conservation laws." internat. journ. theoret. phys., vol. 39 (2000), p. 23.
[18] ibragimov, n. kh.: elementary lie group analysis and ordinary differential equations. chichester, new york: wiley, 1999, isbn 0471974307.
[19] tesař, v.: “valvole fluidiche senza parti mobili.” (no-moving-part fluidic valves – in italian), oleodinamica – pneumatica, vol. 39 (1998), no. 3, p. 216, issn 1122-5017.

prof. ing. václav tesař, csc.
e-mail: v.tesar@sheffield.ac.uk
university of sheffield, mappin street, s1 3jd sheffield, uk
av čr, dolejškova 5, 182 00 prague 8, czech republic

acta polytechnica 60(3):243–251, 2020, doi:10.14311/ap.2020.60.0243

testing the suitability of the extruded polystyrene (styrodur) application in the track substructure

libor ižvolt, peter dobeš∗, martin mečár
university of žilina, faculty of civil engineering, department of railway engineering and track management, univerzitná 8215/1, 010 26 žilina, slovakia
∗ corresponding author: peter.dobes@uniza.sk

abstract. extruded polystyrene (xps) and its excellent thermal insulation properties have been known for over 60 years. due to its thermal, mechanical and deformation properties, xps has a universal application, not only in the construction industry. this paper presents the results of the first series of experimental measurements of the deformation resistance of sub-ballast layers with a built-in xps thermal insulation layer and of sub-ballast layers with a standard structure (crushed aggregate sub-ballast layer). the aim of the first series of experimental measurements was to determine the impact of placing the xps layer at the subgrade surface level (deformation resistance of the subgrade surface e0 = approx. 10 mpa or 30 mpa) on the deformation resistance of the sub-ballast layers and then to determine the necessary thickness of the sub-ballast layer in relation to the required deformation resistance at the sub-ballast upper surface. the experimental measurements carried out so far show that the application of xps boards in the sub-ballast layers has almost no or minimal effect on its deformation resistance. since xps boards have significantly better thermal technical properties compared to crushed aggregate, considerable savings of this material can be achieved in areas with unfavorable climatic conditions (high values of the air frost index).

keywords: railway track, track substructure, extruded polystyrene, static load tests, deformation resistance.

1. introduction

an important prerequisite for a quality design of the railway track structure, determined by the requirements for high-quality track layout and geometry, is its sufficient resistance to traffic and non-traffic loads. the track substructure must be able to receive a defined long-term traffic load from the track skeleton and a non-traffic load (effects of climatic influences) without harmful deformations during the whole year. it can exhibit these properties only if it is designed with sufficient dimensions and built from quality building materials of the required characteristics, which guarantee adequate resistance to the individual impacts of the traffic (static and dynamic) and non-traffic (water, snow and frost) load [1]. at present, the design of the railway track construction must consider not only high-quality building materials of the required physical-mechanical and thermal-insulating properties, but also the economic costs and environmental impact of these building materials.
partial or complete replacement of conventional building materials by materials with better thermal-technical properties (low water absorption and hence low thermal conductivity) and suitable physical-mechanical properties (high permeability and low humidity, suitable grain size and high deformation resistance) results in a thickness reduction of the structural sub-ballast layers. in this way, it also decreases the amount of standard building materials needed for the establishment of the sub-ballast layer (currently mostly crushed aggregate), and it can possibly save financial resources for the installation of a sufficiently deformation-resistant and heat-insulating track substructure. one of the thermal insulating materials that can be applied in the structural composition of the sub-ballast layers is extruded polystyrene (xps). the first polystyrene was made from natural resin by the german physician eduard simon in 1839, but almost a century later the german organic chemist hermann staudinger realized that the isolated chemical was, in fact, a plastic polymer composed of long chains of styrene molecules, from which the name polystyrene was born. in 1937, polystyrene was introduced by dow chemical co. to the us market for commercial purposes under the styrofoam (xps) brand. in 1953, hermann staudinger was awarded the nobel prize for chemistry, specifically for his research of polymers. the best-known form of expanded polystyrene (eps) packaging was launched one year later (1954) [2]. in europe, specifically in austria, production of eps under the styropor trademark began in 1953. in 1990, the production was expanded to include xps; until then, xps had been imported into austria [3]. abroad, thermal insulation materials such as eps or xps are applied in the railway track with the conventional type of railway superstructure (gravel superstructure) mostly at the sub-ballast upper surface, i.e. directly below the track ballast layer [4–6], or below the ballastless track structure [7, 8] in the case of an unconventional railway superstructure. placing the xps at the sub-ballast upper surface, at the bottom of the ballast bed, has several drawbacks, namely:
• it must withstand significant traffic loads,
• it is worn due to a considerable contact stress with the ballast bed material,
• the track substructure may freeze from the side, especially in the case of a railway line on an embankment.
the authors’ workplace, the department of railway engineering and track management (dretm), has long been involved in monitoring the impact of various building materials (gravel, liapor-concrete, foam concrete, styrodur) built into the sub-ballast layers to improve their thermal-technical [9] and deformation characteristics [10–12]. due to the aforementioned disadvantages of placing the xps at the top of the sub-ballast layers, the experimental workplace at dretm started to test placing the xps at the subgrade surface level. this level is particularly weak in the case of a subgrade surface consisting of fine-grained soils; it is the weakest link in the entire track substructure with the conventional track superstructure, not only in relation to the traffic but also to the non-traffic load. the aim of the first series of experimental measurements was to determine the impact of placing the xps layer at the subgrade surface level (deformation resistance of the subgrade surface e0 = approx.
10 mpa or 30 mpa) on the deformation resistance of the sub-ballast layers and then to determine the necessary thickness of the sub-ballast layer in relation to the required deformation resistance at the sub-ballast upper surface.

2. characteristics of structures and methodology of the experimental measurements

this part of the paper presents the properties of the tested structures and the test methods for the verification of the deformation resistance of the sub-ballast layers with a built-in layer of extruded polystyrene (xps), or without the xps layer.

2.1. characteristics of the tested structures

extruded polystyrene (xps) is produced from materials similar to eps, but by a process known as extrusion. this is a manufacturing process in which a melt of crystalline polystyrene is extruded while being saturated with a foaming agent. by releasing the pressure at the end of the extrusion tube, the material is foamed and then forms the insulating boards. the xps manufacturing process thus has a major influence on its properties, which are relatively different from those of other types of polystyrene. extruded polystyrene is also produced with a modified edge, when a half-groove is applied on the edge of the board. thanks to this, the adjoining boards can be connected more easily and the heat loss of the structure in which they are built can be reduced. in the construction industry, extruded polystyrene is mostly used in the form of thermal insulation boards. compared to other types of polystyrene boards, the mechanical resistance of xps is many times higher; it can withstand loads of 300 kpa or more. because xps exhibits a high impact and pressure resistance, it is used to insulate areas that must withstand high compressive as well as mechanical loads. it is also important that, in addition to carrying the conventional load, xps is resistant to water, mold, rodents, etc. since xps is also resistant to rotting and, compared to conventional eps, has completely closed pores [13], its application can be very suitable at points of contact with soil thanks to its good thermal insulation. for the purpose of monitoring the impact of various thermal insulation materials in the track structure on its thermal regime and also on its deformation resistance, the dretm experimental workplace [9] was built on the campus of the university of žilina (uniza). one part of this experimental workplace is an experimental field (fig. 1), which serves to test the deformation characteristics of the structural sub-ballast layers. the experimental field consists of two segments (a, b), where segment a represents the standard track substructure (0/31.5 mm crushed aggregate sub-ballast layer) and segment b represents the modified track substructure (crushed aggregate sub-ballast layer, fr. 0/31.5 mm + extruded polystyrene boards placed on the subgrade surface on a leveling sand layer). the experimental measurements of the deformation resistance of the sub-ballast layers were performed in two stages. in the first stage of measurements, both segments of the experimental field were characterized by a lower deformation resistance of the subgrade surface, whose static deformation modulus was e0 = 10 ± 2 mpa. in the second stage of measurements, the deformation resistance of the subgrade surface was increased by its modification (grain-size change), i.e. by mechanical stabilisation.
the deformation resistance of the modified subgrade surface reached the value of the static modulus of deformation e0,mod = 30 ± 2 mpa in both segments of the experimental field, which, coincidentally, is the required value of the deformation resistance of the subgrade surface on existing lines for the sz4 speed zone (120 km·h−1 < v ≤ 160 km·h−1) according to [14]. in both construction cases, the geofiltex 63/20 t separating geotextile was placed on the subgrade surface or on the modified subgrade surface. in segment b, 30 mm thick boards of extruded polystyrene (styrodur 2800 c, 8 pieces with dimensions of 1250 × 600 mm), laid on a 50 mm thick sand leveling layer, were applied. in both phases of the experimental measurements and for both segments, the sub-ballast layer was established from crushed aggregate fr. 0/31.5 mm, which was compacted with the aid of a vibration plate in such a technological way that construction thicknesses of 150 mm, 300 mm and 450 mm were gradually achieved. the structural composition of the experimental field with segments a and b, with the location of the performed static load tests, is demonstrated in fig. 1 and fig. 2.

figure 1. experimental field – 1st stage of experimental measurements.
figure 2. experimental field – 2nd stage of experimental measurements.
figure 3. static plate load test equipment [15].

2.2. characteristics of determining the deformation resistance of sub-ballast layers

the determination of the deformation resistance (static modulus of deformation) of the tested structural compositions of the sub-ballast layers was conducted within the experimental measurements using the static plate load test equipment at the dretm experimental field (fig. 3). the static plate load tests (plt) at the sub-ballast upper surface of the experimental field (segment a and segment b) were performed according to the methodology outlined in the directive [16]. in the process of performing the individual static load tests, a rigid circular plate of 300 mm diameter was pressed in two load cycles. the maximum contact stress used was 0.20 mpa for both load cycles (the standard stress used on the slovak railways lines for the sub-ballast layers). the measured variable was the static modulus of deformation esi, which was determined separately from the first and the second load cycle. according to the methodology presented in [16], the value of the modulus of deformation obtained from the second load cycle is always decisive. the compaction quality of the material incorporated into the structural layer is expressed, according to [17], by the ratio of the deformation moduli determined in the first and second load cycles (edef2/edef1). the static modulus of deformation esi was determined according to the relation

esi = (1.5 · p · r) / y, (1)

where esi is the static modulus of deformation of the i-th structural layer (mpa), p is the specific pressure applied to the plate (mpa), r is the radius of the load plate (m), and y is the total average settlement of the load plate determined in the second cycle (m). the measurement of the values of the static modulus of deformation esi at the level of the individual structural layers of the tested structures was carried out in both stages of the experimental measurements at the subgrade surface level (or the modified subgrade surface level), at the level of the xps boards, and subsequently at the level of each partial structural layer of the sub-ballast layers.
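relation (1) and the compaction check based on the edef2/edef1 ratio are simple enough to sketch in code. the following is a minimal illustration; only the plate diameter (300 mm) and the contact stress (0.20 mpa) are taken from the text above, while the settlement values and the function name are illustrative assumptions.

```python
# a minimal sketch of relation (1) and of the compaction-quality ratio;
# the settlement values below are illustrative, not measured data.

def static_deformation_modulus(p, r, y):
    """esi = 1.5 * p * r / y, with p in mpa and r, y in m (result in mpa)."""
    return 1.5 * p * r / y

p = 0.20                # contact stress used in both load cycles (mpa)
r = 0.30 / 2            # radius of the 300 mm load plate (m)

e1 = static_deformation_modulus(p, r, 0.0009)  # 1st load cycle, assumed settlement
e2 = static_deformation_modulus(p, r, 0.0006)  # 2nd load cycle (decisive), assumed

print(f"esi (2nd cycle): {e2:.1f} mpa")
print(f"edef2/edef1 = {e2 / e1:.2f}")  # compared against the limits given in [17]
```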
for each construction layer of segment a or segment b of the experimental field, a series of 4 static plate load tests was performed, always at the same location, indicated as the measurement point mi. measurement points m 1 to m 4 (see fig. 1) are located in segment a of the experimental field (the segment without xps in the sub-ballast layers) and measurement points m 5 to m 8 are located in segment b (the segment with xps embedded in the sub-ballast layers) (see fig. 2).

3. results of experimental measurements

fig. 4 shows the measured values of the static modulus of deformation esi at the level of the individual sub-ballast layers of the experimental field for both segments within the 1st stage (lower deformation resistance/bearing capacity of the subgrade surface). fig. 4 demonstrates a gradual increase in the deformation resistance of the sub-ballast layers as the thickness of the sub-ballast layer increases. it is also evident that the xps boards placed on a leveling layer of sand and a low-deformation-resistance subgrade surface (e0 = 10 ± 2 mpa) had a minimal effect on the resulting (equivalent) deformation resistance of the sub-ballast layers of the tested structure in the experimental field.

figure 4. measured values of the static modulus of deformation esi on the surface of the individual structural sub-ballast layers – 1st stage (lower deformation resistance of the subgrade surface).

fig. 5 shows the measured values of the static modulus of deformation esi at the level of the individual sub-ballast layers of the experimental field for both segments within the 2nd stage. in this case, a higher value of the deformation resistance (e0,mod = 30 ± 2 mpa) was achieved at the level of the modified subgrade surface. fig. 5 also demonstrates that the extruded polystyrene boards placed on the leveling layer of sand caused a decrease in the deformation resistance of about 7 to 8 mpa, as compared to the values obtained for the modified subgrade surface. consequently, this decrease of the deformation resistance also had an impact on the achieved partial or total deformation resistance of the tested sub-ballast layers. since the decrease of the deformation resistance determined on the surface of the extruded polystyrene boards was observed only in the case of the modified subgrade surface, it is possible to assume that, in this case, the extruded polystyrene board material occurred between two solid surfaces (a deformation-resistant subgrade surface and the circular plate of the static load equipment) and became more compressed. (in the case of the low deformation resistance of the subgrade surface, the difference between the static moduli of deformation determined on the subgrade surface and on the extruded polystyrene was minimal.) in the case of a low deformation-resistant subgrade surface, the boards of extruded polystyrene were likely deformed together with the subsoil on which they were laid, and therefore there was no significant compression.
it can be assumed that, once the maximum compression of the extruded polystyrene boards in the sub-ballast layers due to the operational load of the track (the weight of the overlaying construction layers including the track skeleton and trains) has been reached, the impact of the xps boards on the deformation resistance of the sub-ballast layers will also be minimal. the compaction of the individual crushed aggregate layers was performed with a vibration plate, and the achieved values of compaction (the ratio between the values of the static modulus of deformation achieved in the 1st and 2nd load cycles) ranged from 1.4 to 1.8. these values are satisfactory, since the maximum permissible value for coarse-grained material according to [17] is 2.6. the compaction values at the subgrade surface level formed by fine-grained material were in the range of 2.0 to 2.3. these values are also satisfactory, since the maximum permissible value for fine-grained material according to [17] is 2.5.

figure 5. measured values of the static modulus of deformation esi on the surface of the individual structural sub-ballast layers – 2nd stage (higher deformation resistance of the subgrade surface).

fig. 6 shows the dependence of the sub-ballast layer thickness tsbl on the required value of the static deformation modulus esi of the sub-ballast upper surface (esbl) for the standard sub-ballast layers (sub-ballast layer formed only by crushed aggregate) with a low deformation-resistant subgrade surface, e0 = approx. 10 mpa. in the case of an existing line located in the sz4 speed zone (120 km·h−1 < v ≤ 160 km·h−1), in order to achieve the required static modulus of deformation of the sub-ballast upper surface esbl ≥ 50 mpa, it would be necessary to design the sub-ballast layer 400 mm thick. however, the same thickness of the sub-ballast layer would also have to be designed in the case of the modified structural composition of the sub-ballast layers, where, in addition to the sub-ballast layer of crushed aggregate, the tested structure contains the xps boards placed on the sand leveling layer (fig. 7). fig. 8 shows the dependence of the thickness of the crushed aggregate sub-ballast layer on the required value of the static deformation modulus esi of the sub-ballast upper surface (esbl) for the standard structure of the sub-ballast layers (sub-ballast layer of crushed aggregate only) and a higher deformation resistance of the subgrade surface, e0 = approx. 30 mpa. in the case of an existing line located in the sz4 speed zone (120 km·h−1 < v ≤ 160 km·h−1), in order to achieve the required static modulus of deformation of the sub-ballast upper surface esbl ≥ 50 mpa, it would be necessary to design the sub-ballast layer only 200 mm thick. in the case of designing the modified structural composition of the sub-ballast layers (sub-ballast layer consisting of crushed aggregate and xps boards), it would be desirable to design the sub-ballast layer with a construction thickness greater than 100 mm (fig. 9).
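the design readings discussed above (figs. 6 to 9) amount to inverting a measured thickness–modulus relationship. the sketch below illustrates this with placeholder (thickness, modulus) pairs – the numbers are assumptions, not values taken from the figures – and a simple linear interpolation of the chart.

```python
# a minimal sketch of reading a design chart such as figs. 6-9; the
# (thickness, modulus) pairs below are placeholders, not values from the paper.
import numpy as np

t_sbl = np.array([150.0, 300.0, 450.0])  # tested sub-ballast thicknesses (mm)
e_sbl = np.array([20.0, 35.0, 60.0])     # assumed moduli at the upper surface (mpa)

def required_thickness(e_required):
    """thickness (mm) at which the interpolated chart reaches e_required (mpa)."""
    return float(np.interp(e_required, e_sbl, t_sbl))

print(required_thickness(50.0))          # e.g. the sz4 requirement esbl >= 50 mpa
```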
4. conclusions

within the research of the department of railway engineering and track management at the university of žilina, a series of experimental measurements related to the assessment of the suitability of the application of xps boards in the track substructure with the conventional superstructure (gravel superstructure), coacting with the sub-ballast layer of crushed aggregate fr. 0/31.5 mm, was conducted. the experimental test field was divided into segment a (standard structure of sub-ballast layers, i.e. the sub-ballast layer of crushed aggregate) and segment b (modified structure of sub-ballast layers, i.e. the sub-ballast layer of crushed aggregate and xps boards). the impact of the built-in xps boards on the deformation resistance of the sub-ballast layers is apparent from fig. 4 to fig. 9.

figure 6. dependence of the sub-ballast layer thickness tsbl on the required value of the static modulus of deformation es (standard structure of sub-ballast layers, low deformation resistance of the subgrade surface – approx. 10 mpa).
figure 7. dependence of the sub-ballast layer thickness tsbl on the required value of the static modulus of deformation es (structure of sub-ballast layers with built-in xps boards, low deformation resistance of the subgrade surface – approx. 10 mpa).
figure 8. dependence of the sub-ballast layer thickness tsbl on the required value of the static modulus of deformation es (standard structure of sub-ballast layers, higher deformation resistance of the subgrade surface – approx. 30 mpa).
figure 9. dependence of the sub-ballast layer thickness tsbl on the required value of the static modulus of deformation es (structure of sub-ballast layers with built-in xps boards, higher deformation resistance of the subgrade surface – approx. 30 mpa).

fig. 4, or figs. 6 and 7, demonstrate that the application of the xps boards in the sub-ballast layers has almost no effect on its deformation resistance in the case of a low deformation-resistant subgrade surface (e0 = 10 ± 2 mpa). in the case of a more deformation-resistant subgrade surface (e0,mod = 30 ± 2 mpa), the xps boards in the sub-ballast layers cause a decrease in its deformation resistance, but it is assumed that, after reaching the maximum compression of the xps boards caused by the operational load on the track (the weight of the overlaying structural layers including the track skeleton and the trains), their impact on the deformation resistance of the sub-ballast layers will be negligible (fig. 5, fig. 8 and fig. 9). since the xps boards have significantly better thermal technical properties compared to crushed aggregate (especially the thermal conductivity value, which is 50 times lower [18]), considerable savings of this material can be achieved in areas with unfavorable climatic conditions (high values of the air frost index). a comparison of the standard and modified structure of the sub-ballast layer’s design for other cases of the deformation resistance of the subgrade surface (e0 = 20 mpa or 40 mpa – the second series of experimental measurements), for cyclic loading of the sub-ballast layers, and from the point of view of the non-traffic load will be the subject of further experimental research at dretm.
acknowledgements

the presented results are partial results of solving the vega grant project 1/0084/20 “numerical and experimental analysis of transition areas of objects of structures of railway superstructures and objects of formation substructure”.

references
[1] l. ižvolt. railway substructure – stress, diagnostics, design and implementation of body construction layers of railway subgrade (scientific monograph). edis, 2008. slovak.
[2] isowall group. a brief history of polystyrene. https://www.isowall.co.za/a-brief-history-of-polystyrene/, 2019. accessed: 3 october 2019.
[3] austrotherm: history. http://en.austrotherm.com/front_content.php?idcat=155, 2019. accessed: 3 october 2019.
[4] p. addison, p. lautala, t. oommen, z. vallos. embankment stabilization techniques for railroads on permafrost. in proceedings of the 2016 joint rail conference, pp. 1–9, 2016. doi:10.1115/jrc2016-5731.
[5] a. nurmikolu, p. kolisoja. extruded polystyrene (xps) foam frost insulation boards in railway structures. in proceedings of the 16th international conference on soil mechanics and geotechnical engineering, pp. 1761–1764, 2006.
[6] styrodur: load-bearing and floor insulation. https://static1.squarespace.com/static/53f9d90ee4b0572e560769f8/t/54bbe72be4b0567044c061c0/1421600555419/styrodur+floor+application+brochure.pdf, 2019. accessed: 3 october 2019.
[7] c. esveld, v. markine. use of expanded polystyrene (eps) sub-base in railway track design. iabse symposium report, pp. 33–38, 2003. doi:10.2749/222137803796329952.
[8] m. madhkhan, m. entezam, m. torki. mechanical properties of precast reinforced concrete slab tracks on non-ballasted foundations. scientia iranica 19(1):20–26, 2012. doi:10.1016/j.scient.2011.11.037.
[9] p. dobeš, l. ižvolt, s. hodás. examining the influence of railway track routing on the thermal regime of the track substructure – experimental monitoring. in proceedings of the 16th scientific and technical conference transport systems theory and practice, pp. 201–209, 2019.
[10] l. izvolt, p. dobes, m. mecar. testing the suitability of the reinforced foam concrete layer application in the track bed structure. iop conference series: materials science and engineering 661:012014, 2019. doi:10.1088/1757-899x/661/1/012014.
[11] j. vlcek, m. drusa, w. scherfel, b. sedlar. experimental investigation of properties of foam concrete for industrial floors in testing field. iop conference series: earth and environmental science 95:022049, 2017. doi:10.1088/1755-1315/95/2/022049.
[12] v. valašková, j. vlček, m. drusa. experimental and computational dynamic analysis of the foam concrete as a sub-base layer of the pavement structure. matec web of conferences 211:13002, 2018. doi:10.1051/matecconf/201821113002.
[13] ursa xps: insulation for a better tomorrow. https://cmt.ee/wp-content/uploads/2018/10/ursa_xps_intro_eng.pdf, 2019. accessed: 3 october 2019.
[14] tnž 73 6312 the design of structural layers of subgrade structures. standard, directorate general of railways of the slovak republic, slovakia, 2005.
[15] geolab: in-situ and lab testing equipment. https://geolab.com.pl/en/oferta/grunty-sprzet-terenowy/plyta-statyczna-vss-jednopunktowa/plyta-statyczna-vss-1f, 2019. accessed: 3 october 2019.
[16] slovak railway regulation ts4 “track substructure”, appendix 6. standard, directorate general of railways of the slovak republic, slovakia, 2018.
[17] stn 73 6133 – road building. roads embankments and subgrades. standard, slovak office of standards, metrology and testing, slovakia, 2017.
[18] styrodur: technical data. https://www2.basf.de/basf2/img/produkte/kunststoffe/styrodur/downloads2/en/styrodurtechnicaldata.pdf, 2019. accessed: 3 october 2019.

acta polytechnica 61(si):148–154, 2021, doi:10.14311/ap.2021.61.0148

multivariate interpolation using polyharmonic splines

karel segeth
czech academy of sciences, institute of mathematics, žitná 25, 115 67 praha 1, czech republic
correspondence: segeth@math.cas.cz

abstract. data measuring and further processing is the fundamental activity in all branches of science and technology. data interpolation has been an important part of computational mathematics for a long time. in the paper, we are concerned with the interpolation by polyharmonic splines in an arbitrary dimension. we show the connection of this interpolation with the interpolation by radial basis functions and the smooth interpolation by generating functions, which provide means for minimizing the l2 norm of chosen derivatives of the interpolant.
this can be useful in 2d and 3d, e.g., in the construction of geographic information systems or in computer aided geometric design. we prove the properties of the piecewise polyharmonic spline interpolant and present a simple 1d example to illustrate them.

keywords: data interpolation, smooth interpolation, polyharmonic spline, fourier transform.

1. introduction

measuring data of all different types and formats is the basic means of research in all branches of science and technology. it is a discrete process providing a finite number of numerical values over some domain. the first stage of data processing usually consists in its approximation, i.e., computing reliable data values at an arbitrary point in the domain of interest. in this paper, we are concerned with the problem of data interpolation in an arbitrary dimension. in particular, we consider interpolation with radial basis functions, which is reasonable if we assume that the value at a point x depends in some way on the euclidean distance r(x, x_j) between x and the nodes x_j where the values have been measured. the background of the paper is the so-called smooth interpolation [1], [2], allowing for the minimization of some functionals applied to the interpolation formula. choosing particular basis functions in the minimization space, we can obtain an interpolation formula whose principal part is a linear combination of polyharmonic splines of fixed order that are, at the same time, radial functions. we construct such a radial basis, i.e. polyharmonic splines, and show its properties. among other things, we prove that the interpolant is piecewise polyharmonic. we present a 1d example that shows the result of interpolation if different derivatives of the interpolant are minimized in the l2 norm. the example shows that the respective interpolations give the expected results. interpolation of this nature is often employed in signal processing, the construction of geographic information systems, or computer aided geometric design. moreover, if the field measured is known to be polyharmonic, it is worth interpolating it by a formula preserving the polyharmonicity. frequent citations of the author’s paper [2] are used to introduce the notation and basic properties of the notions used. the conclusion of the present paper is more advanced: it shows that for interpolation it is possible to use polyharmonic functions of different orders m that minimize different norms of the interpolant u, i.e. the l2 norm of the (m + l)th derivative of u for any positive l. we state the problem of data interpolation in sec. 2, introduce radial functions in sec. 3, and polyharmonic splines in sec. 4. further, we define the spaces w_l where we are going to carry out the minimization and present a general form of the interpolation formula in sec. 5. moreover, we quote a theorem from [2] that states the existence and uniqueness of the solution of the interpolation problem. in sec. 6, we choose exponential functions of a pure imaginary argument as the basis functions in w_l. for this choice of basis functions, we obtain a radial basis function interpolation formula where the radial functions are polyharmonic functions, which can be seen in sec. 7. we present some properties (polyharmonicity) of such an interpolant in sec. 8 and show a simple computational example in sec. 9.

2. problem of data interpolation

fundamental notation and basic statements are taken mostly from [2]. consider a finite number N of (complex, in general) measured (sampled) values f_1, . . . , f_N ∈ c obtained at N given nodes x_1, . . . , x_N ∈ ω, x_j = (x_j1, . . . , x_jn), that are mutually distinct, where n is a positive integer and ω ⊂ r^n is a cube. usually, we also need the values corresponding to other points in ω, which are not known. let f_j = f(x_j) be measured values of a
,fn ∈ c obtained at n given nodes x1, . . . ,xn ∈ ω, xj = (xj1, . . . ,xjn), that are mutually distinct, where n is a positive integer and ω ∈ rn is a cube. usually, we need also the values corresponding to other points in ω that are not known. let fj = f(xj) be measured values of a 148 https://doi.org/10.14311/ap.2021.61.0148 https://creativecommons.org/licenses/by/4.0/ https://www.cvut.cz/en vol. 61 special issue/2021 multivariate interpolation using polyharmonic splines complex-valued function f continuous in ω and z is an approximating function to be constructed. definition 1. the interpolating function (interpolant) z is constructed to fulfill the interpolation conditions z(xj) = fj, j = 1, . . . ,n, (1) cf. definition 1 of [2]. various additional conditions can be considered, e.g., minimization of some functionals applied to z. the problem of data interpolation does not have a unique solution. the property (1) of the interpolant is clearly formulated by mathematical means but the behavior of the interpolating curve or the surface between nodes can differ case from case and cannot be formalized easily. the problem of least squares data approximation is more general than the problem of data interpolation. no explicit interpolation conditions of the form (1) are to be satisfied, but the approximating function z is constructed to minimize the least squares functional n∑ j=1 wj(z(xj) −fj)(z(xj) −fj)∗, where wj, j = 1, . . . ,n, are positive weights and ∗ denotes the complex conjugate, cf., e.g., [1]. various additional conditions can be considered, for example, the minimization of some further functionals applied to z. in some branches of science, the terminology may differ. the terms exact and inexact interpolation are also used if the interpolant or approximant satisfies the conditions (1) or not. we are not concerned with the general data approximation in the paper. 3. interpolation with radial basis functions let x,y ∈ rn and r(x,y) = ‖x−y‖e = √√√√ n∑ s=1 (xs −ys)2 (2) be the euclidean norm of the vector x − y ∈ rn. the dimension n of the independent variable can be arbitrary. definition 2 (radial function). we say that the function f(x,y) = f̂(r(x,y)) depending only on r from (2) is a radial function. radial functions are radially symmetric. they are often used as basis functions for interpolation as well as approximation. it is assumed that every item fj of the measured data at the node xj influences the result of interpolation or approximation at a point x in the vicinity of xj proportionally, in some sense, to its distance r(x,xj) from xj if this vicinity can be considered “homogeneous”. the vector α = (α1, . . . ,αn), where αs, s = 1, . . . ,n, are integers, is called a multiindex. denote the length of a multiindex α by |α| = n∑ s=1 |αs|, (3) where |αs| means the absolute value of the component αs. we say that α is a nonnegative multiindex if αs ≥ 0 holds for all s = 1, . . . ,n. choose a nonnegative integer l and consider the interpolant z(x) = n∑ j=1 λjf(x,xj) + ∑ |α|≤l−1 aαϕα(x), (4) where α is a nonnegative multiindex, f(x,y) = f̂(r(x,y)) is a radial basis function (e.g. a proper polyharmonic spline, see sec. 4), ϕα are all the monomials of the form ϕα(x) = xα11 . . .x αn n (5) of degree |α| ≤ l − 1 called trend functions, and λj, j = 1, . . . ,n, and aα, |α| ≤ l−1, are coefficients to be found; the second sum in the formula (4) is empty if l = 0 (cf. [2]). the interpolation with radial basis functions is widely used in computational practice. 
for practical reasons, the basis functions are usually taken only from a very small set of functions, e.g., (r² + s²)^{1/2}, 1/(r² + s²)^{1/2}, exp(−s r²), or r² ln(r/s), where s > 0 is a constant.

4. polyharmonic splines

definition 3 (polyharmonic spline). let r(x, y) be the euclidean norm (2) of the vector x − y ∈ r^n. the functions

r^q, q = 1, 3, . . . , (6)
r^q ln r, q = 2, 4, . . . , (7)

are called polyharmonic splines. the equation

Δ^m u(x_1, . . . , x_n) = 0, (8)

where Δ = ∂²/∂x_1² + · · · + ∂²/∂x_n² is the laplace operator, is called the polyharmonic equation of order m, cf. [2].

apparently, all the derivatives in the equation (8) are of order 2m. polyharmonic splines solve the respective polyharmonic equation. the next theorem presents the exact statement.

theorem 1. fix the vector y ∈ r^n. then the polyharmonic spline r^q (or r^q ln r) solves the polyharmonic equation in the variable x of order m = (q + n)/2 in r^n_y = r^n \ {x = y} for n odd (or for n even).

proof. it is easy to prove the statement by direct computation.

note that the term spline is used here also for a nonpolynomial function. another (weak) definition of the polyharmonic spline can be given with the help of the dirac function. apparently, r(x, y) is a real radial basis function, and all the polyharmonic splines (6), (7) possess the same property. for practical use in approximation, the polyharmonic splines are combined with lower order polynomial terms (trends) to form an interpolation or approximation formula as in (4).

5. smooth interpolation

let us briefly present some properties of polyharmonic splines in the smooth approximation or variational spline theory. these properties show the place of splines in the context of radial basis function interpolation. alternatively, the splines can be derived with the help of the algebraic spline theory, cf. an example of the 1d cubic spline in [3]. we employ the usual lebesgue space l2(ω) of generalized complex-valued functions with the norm

‖g‖²_{l2} = ∫_ω |g(x)|² dx.

we follow [1] and [4], and formulate and solve the problem of smooth interpolation [2]. choose a set {b_α} of nonnegative numbers, where α is a nonnegative multiindex. let l be the smallest nonnegative integer such that b_α > 0 for at least one α, |α| = l, while b_α = 0 for all α, |α| < l. recall that the nodes x_1, . . . , x_N ∈ ω are supposed to be mutually distinct. let w̃ be a linear vector space of complex-valued functions g continuous together with all their partial derivatives of all orders in ω. for g, h ∈ w̃ we put

(g, h)_l = Σ_{l≤|α|} b_α ∫_ω (∂^{|α|} g(x) / ∂x_1^{α_1} . . . ∂x_n^{α_n}) (∂^{|α|} h(x) / ∂x_1^{α_1} . . . ∂x_n^{α_n})* dx (9)

and similarly

|g|²_l = Σ_{l≤|α|} b_α ∫_ω |∂^{|α|} g(x) / ∂x_1^{α_1} . . . ∂x_n^{α_n}|² dx (10)

if the values of |g|_l and |h|_l exist and are finite. if l = 0 (i.e. b_α > 0 for |α| = 0), consider functions g, h ∈ w̃ such that the values of |g|_0 and |h|_0 exist and are finite. then (g, h)_0 has the properties of an inner product and the expression ‖g‖_0 = |g|_0 is a norm in a normed space w_0 = w̃. let l > 0. consider again functions g, h ∈ w̃ such that the values of |g|_l and |h|_l exist and are finite. let p_{l−1} ⊂ w̃ be the subspace whose basis {φ_α}, where α is a nonnegative multiindex, |α| ≤ l − 1, consists of all the trend functions (5) of degree l − 1 at most. then, for a nonnegative multiindex β,

(φ_α, φ_β)_l = 0 and |φ_α|_l = 0 for |α| ≤ l − 1 and |β| ≤ l − 1. (11)

using (9) and (10), we construct the quotient space w̃/p_{l−1} whose zero class is the subspace p_{l−1}.
finally, considering (·, ·)_l and |·|_l in every equivalence class, we see that they represent the inner product and the norm ‖g‖_l in the normed space w_l = w̃/p_{l−1}. w_l is the normed space where we minimize functionals and measure the smoothness of the interpolation as prescribed by the choice of {b_α}. we complete the space w_l in the norm ‖·‖_l and denote the completed space again w_l. for an arbitrary l ≥ 0, choose a basis system of functions {g_κ} ⊂ w_l that is complete and orthogonal (in the inner product in w_l), i.e., if κ = (κ_1, . . . , κ_n) and µ = (µ_1, . . . , µ_n) are nonnegative multiindices, then

(g_κ, g_µ)_l = 0 for κ ≠ µ. (12)

if l > 0 then, moreover,

(φ_α, g_κ)_l = 0 for a nonnegative multiindex α, |α| ≤ l − 1. (13)

the set {φ_α} of trend functions is empty for l = 0.

definition 4 (smooth interpolation). the problem of smooth interpolation [1] consists in finding the complex coefficients a_κ and a_α of the interpolant

z(x) = Σ_κ a_κ g_κ(x) + Σ_{|α|≤l−1} a_α φ_α(x) (14)

with nonnegative multiindices κ and α such that

z(x_j) = f_j, j = 1, . . . , N, (15)

and the quantity

‖z‖²_l attains its minimum on w_l. (16)

the second sum in the interpolant (14) is empty for l = 0. according to (10), the quantity ‖z‖²_l is the weighted sum of the squares of the l2 norms of the derivatives of z of all orders |α|, with weights b_α. putting b_α > 0 for some set of multiindices α, we can specify the partial derivatives of z whose l2 norms are to be minimized, i.e., the smoothness of the interpolant z. for example, if n = 1 we put b_k = 0, except for b_2 = 1 (i.e. l = 2), and minimize the l2 norm of the second derivative of z, which corresponds to minimizing its curvature. apparently,

‖z‖²_l = Σ_κ a_κ a_κ* ‖g_κ‖²_l

due to (11), (12), (13), and (14).

remark 1. with a fixed n, it is easy to employ the multinomial theorem to find out (see [2]) that there are

π(n, |α|) = ( |α| + n − 1 choose n − 1 )

mutually different nonnegative multiindices α of n components with |α| fixed. the same is the number of the trend functions φ_α with |α| fixed, and

t(n, l) = Σ_{|α|≤l−1} ( |α| + n − 1 choose n − 1 ) = ( l − 1 + n choose n )

is the total number of the trend functions φ_α, |α| ≤ l − 1.

to remove the inconvenient infinite sum from (14), we introduce the generating function [1].

definition 5 (generating function). let the basis system of functions {g_κ} ⊂ w_l, where κ is a nonnegative multiindex, be complete and orthogonal in w_l. if the series

r(x, y) = Σ_κ g_κ(x) g_κ*(y) / ‖g_κ‖²_l (17)

converges for all x, y ∈ ω and is continuous in ω, we call the function r(x, y) the generating function.

if l > 0, introduce an N × t(n, l) matrix φ with the entries φ_{jα} = φ_α(x_j), j = 1, . . . , N, |α| ≤ l − 1. the matrix φ is, in general, rectangular. we state in the following theorem 2 that a finite linear combination of the values of the generating function r(x, y) at the nodes is used for the practical interpolation instead of the infinite linear combination of the values of the basis functions in (14).

theorem 2. let x_i ≠ x_j for all i ≠ j. assume that the series (17) converges for all x, y ∈ ω and that the generating function r(x, y) is continuous in ω. moreover, let rank φ = t(n, l). then the problem (14), (15), and (16) of smooth interpolation has the unique solution

z(x) = Σ_{j=1}^{N} λ_j r(x, x_j) + Σ_{|α|≤l−1} a_α φ_α(x), (18)

where the complex, in general, coefficients λ_j, j = 1, . . . , N, and a_α, |α| ≤ l − 1, are the unique solution of the linear algebraic system

Σ_{j=1}^{N} λ_j r(x_i, x_j) + Σ_{|α|≤l−1} a_α φ_α(x_i) = f_i, i = 1, . . . , N, (19)

Σ_{j=1}^{N} λ_j φ_α*(x_j) = 0, |α| ≤ l − 1. (20)
proof. the proof is given in [2].

note that we have to solve the linear algebraic system (19), (20) for N + t(n, l) unknowns. the number of unknowns (and equations) depends on n only through t(n, l), the number of trend functions. the smooth interpolant z given by (14) can now be rewritten for the generating function r(x, y) in the form (18).

6. a periodic basis function system of w_l

let the continuous function f(x) = f(x_1, . . . , x_n), to be interpolated, be 2π-periodic in each independent variable x_s, s = 1, . . . , n. periodic functions with other periods in the individual variables can be formally transformed to the period 2π. let us consider f in the cube ω̃ = [0, 2π]^n. write x · y = x_1 y_1 + · · · + x_n y_n for the r^n inner product of the vectors x and y. we choose exponential functions of a pure imaginary argument for the periodic basis system {g_ρ} in w_l, where g_ρ(x) = exp(−iρ · x). we have to change the notation properly with respect to the fact that the integer components of the multiindex ρ can also be negative. the definition (3) of the length |ρ| of the multiindex ρ remains unchanged. in the definition (17) of the generating function r(x, y), we sum over all multiindices ρ (not only over those with nonnegative components). the following theorem shows important properties of the system {g_ρ}.

theorem 3. let there be an integer u, u ≥ l, such that b_α = 0 for all |α| > u in w_l. the system of periodic exponential functions of a pure imaginary argument

g_ρ(x) = exp(−iρ · x), x ∈ ω̃, (21)

ρ being a multiindex with integer components ρ_s = 0, ±1, ±2, . . . , s = 1, . . . , n, is complete and orthogonal in w_l.

proof. the proof is given in [2].

remark 2. note that, on the assumption of theorem 3 that there is an integer u with the required properties, b_α > 0 can occur only for l ≤ |α| ≤ u. we will keep this assumption in the rest of the paper.

we further follow [2]. for the basis system (21), notice that ρ is not nonnegative and that the generating function

r(x, y) = Σ_ρ g_ρ(x) g_ρ*(y) / ‖g_ρ‖²_l = Σ_ρ exp(−iρ · (x − y)) / ‖g_ρ‖²_l (22)

is the n-dimensional fourier series in l2(ω̃) with the coefficients ‖g_ρ‖^{−2}_l, where

‖g_ρ‖²_l = (2π)^n Σ_{l≤|α|≤u} b_α ρ_1^{2α_1} . . . ρ_n^{2α_n}

according to (10). let now the complex-valued function f, to be interpolated, be nonperiodic in r^n. redefine the generating function

r(x, y) = ∫_{r^n} exp(−iρ · (x − y)) / ‖g_ρ‖²_l dρ = f(1/‖g_ρ‖²_l) (23)

as the n-dimensional fourier transform f of the function ‖g_ρ‖^{−2}_l of the n continuous variables ρ_1, ρ_2, . . . , ρ_n, if the integral exists [4]. employing the transition from the fourier series (22) with the coefficients ‖g_ρ‖^{−2}_l to the fourier transform (23) of the function ‖g_ρ‖^{−2}_l of the continuous variable ρ ∈ r^n (cf., e.g., [5]), we have transformed the basis functions, enriched their spectrum, and released the requirement of periodicity of f. moreover, if the integral (23) does not exist in the usual sense, in many instances we can calculate r(x, y) as the fourier transform f of the generalized function ‖g_ρ‖^{−2}_l of ρ. the generating function r(x, y) given by (23) depends on x and y only through the distance r(x, y).

7. polyharmonic spline interpolation

in the notation introduced above, we continue in deriving the polyharmonic spline interpolation according to [2]. put

k(α) = |α|! / (α_1! . . . α_n!) (24)

for a nonnegative multiindex α. recall that n is the dimension of the problem, fix l > 0, and put b_α = 0 for all α, |α| ≠ l, and b_α = k(α) for |α| = l. then

‖g_ρ‖²_l = (2π)^n ( Σ_{s=1}^{n} ρ_s² )^l

according to the multinomial theorem. in tables (e.g.
[6]), we easily find

r(x, y) = f( ( Σ_{s=1}^{n} ρ_s² )^{−l} ) = c_1 r^{2l−n} for n odd, and = c_21 r^{2l−n} ln r + c_22 r^{2l−n} for n even, (25)

where r = r(x, y) is given by (2) and c_1, c_21, and c_22 are quantities depending only on n and l. then the generating function r(x, y), for 2l − n > 0, is a radial basis function. note that the function (25) has the form of the polyharmonic function (6) only if the dimension n is odd, and it is the sum of the polyharmonic function (7) and c_22 r^{2l−n} if n is even. using lemmas 2 and 3 of [2], we remove the term c_22 r^{2l−n} from the formula (25) for the generating function in the case of n even. we obtain

r(x, y) = r^{2l−n} for n odd, (26)
r(x, y) = r^{2l−n} ln r for n even. (27)

for 2l − n > 0, the generating function r(x, y) is the polyharmonic spline (6) or (7), i.e., a radial basis function.

8. some properties of the polyharmonic interpolant

consider now the interpolant z given by (18), where the generating function r(x, y) is the polyharmonic spline for n odd (26) or for n even (27), n fixed. its properties are characterized by the following theorem and lemma.

theorem 4. choose l such that 2l − n > 0. let the interpolant z(x) be given by (18). then it solves the polyharmonic equation (8) of order m = l in the set r^n_x = r^n \ ∪_{j=1}^{N} {x = x_j}.

proof. according to theorem 1, the generating function r(x, x_j) = r^{2l−n}(x, x_j) for n odd, or r(x, x_j) = r^{2l−n}(x, x_j) ln r(x, x_j) for n even, is the solution of the polyharmonic equation of order m = (2l − n + n)/2 = l in r^n_{x_j}, j = 1, . . . , N. moreover, the trend functions φ_α(x) given by (5) are monomials of a degree at most l − 1. they satisfy the polyharmonic equation of order m = l in r^n, as the operator Δ^l is a linear combination of the derivatives of order 2l and, according to the multinomial theorem, each of these derivatives includes a derivative of order l with respect to some particular variable x_s. the coefficients λ_j and a_α are complex constants. therefore, the interpolant (18) satisfies the polyharmonic equation (8) in r^n_x.

figure 1. n = 5. the horizontal axis: independent variable; the vertical axis: the true function (29) (solid line); the interpolant with b_1 = 1 (dashed line, piecewise linear), b_2 = 1 (dotted line, cubic spline), and b_3 = 1 (dash-dot line, quintic spline). the scales on the x- and y-axes are different.

we have just proven that the interpolant (18) with the generating function (26) or (27) is polyharmonic in r^n_x. moreover, in the example we will use another, trivial property of it, stated in the following lemma.

lemma 1. let the function u of the variable x = (x_1, x_2, . . . , x_n) satisfy the polyharmonic equation of order m in the set ψ ⊂ r^n. then it satisfies the polyharmonic equation of order m + l for any positive l in the same set.

9. example

in fig. 1, we show the results of a simple computation: the polyharmonic spline interpolation for n = 1, i.e. the modification

z(x) = Σ_{j=1}^{N} λ_j r(x, x_j) + Σ_{k=0}^{l−1} a_k φ_k(x) (28)

of the formula (18) with r(x, y) given by (26). note that α = k is now a simple index. we consider three cases: the minimization of the l2 norm of the 1st, the 2nd, or the 3rd derivative of the interpolant. we interpolate the third degree polynomial

f(x) = 8x³ + 6x² + 2x − 1 (29)

on ω = [−1, 1] (solid line in fig. 1) with N = 3, i.e. using the nodes x_1 = −1, x_2 = 0, and x_3 = 1.
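before discussing the three cases in detail, it may help to see how the system (19), (20) is assembled and solved for this example. the following is a minimal numerical sketch for n = 1 with the kernel r(x, y) = |x − y|^(2l−1); the nodes and the data follow (29), while the function and variable names are illustrative.

```python
# a minimal numerical sketch of the interpolant (28) for n = 1;
# variable and function names are illustrative.
import numpy as np

def polyharmonic_interpolant(nodes, values, l):
    """solve (19)-(20) with the kernel r(x, y) = |x - y|**(2l - 1)."""
    n_nodes = len(nodes)
    q = 2 * l - 1                                        # spline exponent, n = 1 (odd)
    kmat = np.abs(nodes[:, None] - nodes[None, :]) ** q  # kernel block r(x_i, x_j)
    pmat = nodes[:, None] ** np.arange(l)[None, :]       # trend block x**k, k < l
    amat = np.block([[kmat, pmat],
                     [pmat.T, np.zeros((l, l))]])
    rhs = np.concatenate([values, np.zeros(l)])
    sol = np.linalg.solve(amat, rhs)
    lam, coef = sol[:n_nodes], sol[n_nodes:]
    def z(x):
        x = np.asarray(x, dtype=float)
        return (np.abs(x[..., None] - nodes) ** q) @ lam \
             + (x[..., None] ** np.arange(l)) @ coef
    return z

nodes = np.array([-1.0, 0.0, 1.0])
values = 8 * nodes**3 + 6 * nodes**2 + 2 * nodes - 1     # f from (29)
for l in (1, 2, 3):               # piecewise linear, cubic and quintic spline
    z = polyharmonic_interpolant(nodes, values, l)
    assert np.allclose(z(nodes), values)                 # conditions (1) hold
```

note that with only three nodes the case l = 3 degenerates in this sketch: the side conditions (20) force λ_j = 0, so the interpolant reduces to the quadratic trend through the data, whose third derivative vanishes, consistent with the minimization (16).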
we employ the formula (24) for k(α) in the case n = 1, i.e., k(l) = 1, and put b_k = 0 for all k, k ≠ l, and b_l = 1. if we put l = 1, b_1 = 1, and b_k = 0 otherwise, to minimize the l2 norm of the 1st derivative of the interpolant according to (16), then the generating function r(x, y) = r^{2l−n} = r (a piecewise linear function given by (2)) and the trend function is a polynomial of degree l − 1 = 0, i.e. a constant. the interpolant (28) solves the equation (8), which is now harmonic (m = l = 1), according to theorem 4, everywhere in r¹ except for the points x = x_j, j = 1, 2, 3 (cf. sec. 8). the constant trend function satisfies the equation (8) everywhere in r¹. from the form of the interpolation formula (28) we see that the interpolant z(x) (dashed line in fig. 1) does not satisfy the equation (8) at the three nodes x_1, x_2, x_3. this is not important at the first and last node, where the value prescribed can be understood as a boundary condition. we can thus claim that the interpolant (28) satisfies the harmonic equation (8) on (−1, 1) \ {0}. moreover, according to lemma 1, the interpolant z(x) given by (28) also satisfies the polyharmonic equation (8) of any order m > 1 in the same set as the equation of order 1, i.e. on (−1, 1) \ {0}. if we further put l = 2, b_2 = 1, and b_k = 0 otherwise, to minimize the l2 norm of the 2nd derivative of the interpolant, then r(x, y) = r^{2l−n} = r³ (the well-known cubic spline), and the trend functions are a constant and a linear function. the interpolant (28) (dotted line in fig. 1) solves the biharmonic equation (8), as m = l = 2 according to theorem 4, everywhere in r¹ except for the points x = x_j, j = 1, 2, 3. the constant and linear trend functions satisfy the equation (8) everywhere in r¹. as in the previous case, we see that the interpolant (28) satisfies the biharmonic equation (8) on (−1, 1) \ {0}. again, according to lemma 1, the interpolant z(x) also satisfies the polyharmonic equation (8) of any order m > 2 in the same set. if we finally put l = 3, b_3 = 1, and b_k = 0 otherwise, to minimize the l2 norm of the 3rd derivative of the interpolant, then r(x, y) = r^{2l−n} = r⁵ (quintic spline), and the trend functions are a constant, a linear function, and a quadratic function. the interpolant (28) (dash-dot line in fig. 1) solves the triharmonic equation (8), as m = l = 3 by theorem 4, everywhere in r¹ except for the points x = x_j, j = 1, 2, 3. all the trend functions satisfy the equation (8) everywhere in r¹. as in the previous case, we see that the interpolant (28) satisfies the triharmonic equation (8) on (−1, 1) \ {0}. according to lemma 1, the interpolant z(x) satisfies the polyharmonic equation (8) of any order m > 3 in the same set. the results agree with general expectations. the interpolation conditions (1) are satisfied. for n = 1, the formula (2) gives r(x, y) = |x − y|, i.e. the absolute value of the difference x − y. naturally, minimizing the l2 norm of the first derivative of the interpolant (b_1 = 1) gives a broken line. minimizing the same norm of the second derivative of the interpolant (its curvature) using b_2 = 1 leads to a cubic spline. if we put b_3 = 1, we minimize the l2 norm of the third derivative of the interpolant by a quintic spline. apparently, it is not possible to draw principal conclusions from a single 1d example. 2d and 3d cases are more interesting and can be applied to many problems of practice.
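theorem 1 (and, by lemma 1, the higher orders) can also be verified symbolically in a particular case. the sketch below is an independent check, not part of the paper's computation: it takes n = 3 and q = 1, so the spline r should solve the polyharmonic equation of order m = (q + n)/2 = 2, i.e. the biharmonic equation, away from the node.

```python
# a small symbolic check of theorem 1 for n = 3, q = 1; names are illustrative.
import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3', real=True)
r = sp.sqrt(x1**2 + x2**2 + x3**2)   # distance to a node fixed at the origin

def laplacian(u):
    return sum(sp.diff(u, v, 2) for v in (x1, x2, x3))

u = r
for _ in range(2):                   # apply the laplacian m = 2 times
    u = sp.simplify(laplacian(u))
print(u)                             # prints 0: delta**2 r = 0 away from the node
```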
10. conclusion

using the general theory of smooth interpolation, we constructed a radial basis interpolant as a linear combination of the values of a polyharmonic spline of fixed order and a linear combination of the trend functions. the construction shows how to choose the functional applied to the formula in order to minimize particular derivatives of the interpolant, i.e., to obtain the smoothness of these derivatives. moreover, the interpolant is proven to be piecewise polyharmonic, which can be considered advantageous in some cases. note that the problem considered is n-dimensional and that the number of equations of the linear algebraic system to be solved is the number N of nodes of the measurement plus the number of trends (which depends on the dimension).

acknowledgements
the author was supported by rvo 67985840 and by czech science foundation grant 18-09628s.

references
[1] a. talmi, g. gilat. method for smooth approximation of data. j comput phys 23:93–123, 1977. doi:10.1016/0021-9991(77)90115-2.
[2] k. segeth. polyharmonic splines generated by multivariate smooth interpolation. comput math appl 78:3067–3076, 2019. doi:10.1016/j.camwa.2019.04.018.
[3] k. segeth. some splines produced by smooth interpolation. appl math comput 319:387–394, 2018. doi:10.1016/j.amc.2017.04.022.
[4] l. mitáš, h. mitášová. general variational approach to the interpolation problem. comput math appl 16:983–992, 1988. doi:10.1016/0898-1221(88)90255-6.
[5] k. segeth. a periodic basis system of the smooth approximation space. appl math comput 267:436–444, 2015. doi:10.1016/j.amc.2015.01.120.
[6] s. g. krĕın (ed.). functional analysis (russian). 1st edition. nauka, moskva, 1964.

reliability risk evaluation during the conceptual design phase

g. mamtani, g. green

systematic evaluations of concept designs involve considering a range of criteria. interaction with industry supports the view that reliability is a major criterion among those considered in product design. although there are few methods to predict reliability in the initial phases of design, most of them are only applicable to adaptive designs. in this paper, we introduce the concept of relative reliability risk assessment for original designs, where information availability is low, to assess reliability. we consider the function structures of the product under consideration and apply the analytic hierarchy process using verbal assessments for relative measurements. the weight assigning technique used is the entropy method. a final value of the r3i (relative reliability risk index) is calculated and the idea of concept functionality graphs is presented. this method is applied on the example of seat suspensions for an off-highway vehicle and the results are discussed. the findings help to sort out the concepts that are relatively strong in terms of reliability.

keywords: reliability, concept design.

selected paper from the 4th international conference on advanced engineering design (aed 2004), which was held in glasgow from 5 to 8 september 2004.

1 frame of reference

interaction with industry supports the view that reliability is a major criterion [1] under consideration while evaluating concepts in the initial phases of design. although many multi-criteria decision making methods are available to select the final concept(s) from the available candidates [2], reliability, like any other criterion, is normally given some weight and becomes one of the given criteria during the selection of concepts. since reliability is a very important criterion in product design, we propose here to obtain ordinal ranks using subjective inputs on the basis of functionality fulfilment in the initial phases of design. also, it is amply clear and argued that the data needed to calculate reliability is not available in the conceptual design phase of an original design [3]. we therefore propose to assess the reliability of concepts relatively and then screen out those that seem to have an unacceptable rank. the overview of this paper is as follows. in section 2, we review reliability and its definition. section 3 aims to explain the proposed model of comparing concepts with respect to reliability. in the same section, we present an overview of the tools we intend to apply for the calculations.
they are the analytic hierarchy process (ahp) [4] and the entropy method [2]. the idea of concept functionality graph is introduced to enable designers to look at the final outcome and to provide ease of decision making. the example problem is introduced in section 4. it consists of seat suspensions for an off highway vehicle taken from [5]. in section 5, we apply the method proposed (section 3) on the example problem (section 4) and the results are then discussed. we conclude the paper in section 6 with a brief note on our future work. 2 reliability: review and definition to explore and understand the method proposed, let us first understand the grounds of this research. reliability has been defined as “the probability that an item will perform a required function, under stated conditions, for a stated period of time.” [6]. normally, while speaking of reliability, we speak in terms of break down of the product. but then this “breakdown” concept does not fully conform to the above and accepted standard definitions. it is also performance that dictates reliability. for example, if a missile is intended to cover a range of 1000 km under stated conditions and stated period of time, and if it is unable to reach that range, its reliability is lower. it is generally argued that conventional reliability calculations in the conceptual design phase are of limited use [7]. we argue that there are various types of designs that industries undertake and the definition of conceptual design differs from company to company. for example, a company may wish to utilize the available components in the market for a new product. the product is definitely new but the conceptual design phase of such a product would entail selection of available components to make an “ideal” fit that the industry wishes to go ahead with. predicting and calculating reliability in such cases is possible using the techniques available. also, in case of original designs, a relative measure can be obtained which is explained in the section that follows. cooper & thompson [8] have listed all the valuable reliability prediction tools in their paper. qualitative methods have been suggested in the conceptual design phase and quantitative methods in the latter phases of design. still, most of the techniques applied are meant for adaptive designs [3] or “proprietary products” and not for original designs. as regards original design [3], absolute reliability calculations in the conceptual design phase are not possible but a relative reliability indicator may be calculated in order to rate the generated design options and get the ordinal rankings for them. we propose a method to utilize functionalities for calculating the index, what we call the relative reliability risk index (rrri or r3i). the argument that functionality has less to do with reliability seems invalid here because, as we stated earlier, performance is a measure of reliability and the functionality indication refers to the performance of the product considered during the conceptual design phase. henceforth, 8 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 46 no. 1/2006 czech technical university in prague reliability risk evaluation during the conceptual design phase g. mamtani, g. green systematic evaluations of concept designs involve considering a range of criteria. interaction with industry supports the view that reliability is a major criterion among those considered in product design. 
although there are few methods to predict reliability in the initial phases of design, most of them are only applicable to adaptive designs. in this paper, we introduce the concept of relative reliability risk assessment for original designs, where information availability is less, to calculate reliability. we consider the function structures of the product under consideration and apply the analytic hierarchy process using verbal assessments for relative measurements. the weight assigning technique used is the entropy method. a final value of r3i (relative reliability risk index) is calculated and the idea of concept functionality graphs is presented. this method is applied on the example of seat suspensions for an off highway vehicle and the results are discussed. the findings help to sort out the concepts that are relatively strong in terms of reliability. keywords: reliability, concept design. selected paper from the 4th international conference on advanced engineering design (aed 2004), which was held in glasgow from 5 to 8 september 2004. we follow a relative approach in calculating r3i using the analytic hierarchy process (ahp). the method is proposed in the next section. 3 functionality as an indicator of risk and proposed method to calculate the relative reliability risk index (r3i), we propose a four step methodology (fig. 1). to begin with, we consider the established function structure of the product. we deal with this in more detail in the next sub-section. after consideration of function structure, the analytic hierarchy process (ahp) [4] is applied so as to relatively rate the main functions of the function structure. we consider the main functions (section 3.1), functions that are fundamental to the system [3] and compare them with respect to the alternatives. after the comparisons have been made, we obtain the priorities. application of ahp is done using the commercially available decision support software by expert choice. the software is interactive with the required number crunching and provides a measure of inconsistencies during the comparison. this inconsistency gives a good measure of the relative ratings, and provides a check whether the comparisons should be performed again. using these priorities, we draw concept functionality graphs (cfgs). cfg indicates the relative measure of functionality fulfilment with respect to each of the available concepts. the example problem (section 4) and the application of this methodology on this example shall clarify the steps of this methodology in due course. step four includes assigning weights to the functions. we do this using the entropy method [2]. this method has been adopted because it does not require the designer to indicate the weight. instead, weights are calculated using the information obtained from the decision matrix. additionally, this helps to rule out any chance of prejudice or manipulation to assign weights by the decision-maker. even if the weights have already been assigned by the decision-maker, they can be combined with the weights obtained using this method (section 3.3). now, this decision matrix is arrived at in step 2 using ahp. the application of ahp leads to the normalized priorities, which are used to extract information for input to the entropy method in step 4. 3.1 function structures as a means of modelling concepts establishing function structures in the conceptual phase of design helps to pursue design in a systematic manner. there have been many approaches towards developing function models. 
for brevity, we do not discuss all of them here but follow the approach proposed by pahl and beitz [3]. in the initial stages of design, the technical systems are represented using function structures before their solution principles have been proposed. initially a “black box” approach towards the system is established, representing the overall system goal with the inputs and outputs. the inputs and outputs are in the form of energy, matter and signals. then subfunctions are added to this system and each of them is usually represented as a verb-noun pair. the detail of the structure depends on the level of abstraction one wants to achieve. there are two types of functions, main functions and auxiliary functions. main functions are the those directly help achieve the overall goal, and auxiliary functions indirectly help in achieving the overall function. to understand this better, let us take an example of a common 3-axes horizontal lathe machine. the function structure of such a lathe is shown in fig. 2 and fig. 3 at different levels of abstraction. initially the overall function is laid down in which the main task of the lathe is considered i.e. machining work piece (w/p) (as shown in fig. 2). to understand this, refer to the symbols for the conversion of matter, energy and signals as shown below fig. 3. when considered at a detailed level of abstraction, the structure as shown in fig. 3 is arrived at. © czech technical university publishing house http://ctn.cvut.cz/ap/ 9 czech technical university in prague acta polytechnica vol. 46 no. 1/2006 apply ahp establish function structure organise concept functionality graphs apply entropy method and calculate r 3 i fig. 1: steps to calculate r3i w/p s eauxilary erotation c/t w/pdeformed emachining c/tdeformed overall function fig. 2: “black box” approach 3.2 analytic hierarchy process the analytic hierarchy process, developed by saaty, is one of the available mutli attribute decision making tools. the strength of this tool lies in utilising insight based soft information from the decision makers in the form of relative values. a hierarchy is developed in which the main objective forms the highest level. the next lower level is occupied by the criteria, and so on. the bottom most hierarchy is occupied by the alternatives available. one such hierarchy is shown in fig. 4. once the hierarchy has been established, comparison matrices are formulated and comparisons of lower level criteria are made with respect to the property at the upper level. much literature is available on ahp that deals with the mathematics of the method, one of them being [9]. the example problem we undertake to illustrate the ahp method is that of selecting a temperature sensor. a university thermodynamics laboratory wants to purchase a temperature sensor for temperature measurements. the alternatives available in the market are thermistors, platinum resistance thermometers and thermocouples. we would like to mention here that this is a hypothetical situation where we limit out alternatives to three only for ease of explanation. the criteria on which the selection depends are accuracy, temperature range measured, price and reliability. the hierarchy is shown in fig. 4. if we apply the top-down approach here, we would first compare all the criteria, i.e. accuracy, temperature range, price and reliability using a pairwise comparison matrix with respect to the objective i.e. selecting a temperature sensor. such a pairwise comparison matrix is shown in table 1. 
next, we compare all the three alternatives with respect to each property at the level above it. there would be four comparison matrices for these comparisons that are shown as table 2, 3, 4 and 5. comparisons are made using a scale that involves integers from 1 to 9 and their reciprocals to represent relative importance. if a numeric scale cannot be used, verbal assessment is then preferred. we shall be using verbal assessment for calculating r3i. 10 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 46 no. 1/2006 czech technical university in prague hold w/p rotate w/p machine w/p convert energy into force and motion hold c/t move c/t s erotation w/p c/t emachining c/tdeformed change speed w/pdeformed eauxilary eauxilary where, energy signals c/t main function cutting tool materials w/p system boundary auxiliary function work piece fig. 3: function structure of a lathe selecting temperature sensor temperature range price reliability platinum resistance thermometerthermistor thermocouple accuracy fig. 4: hierarchy for a temperature sensor selection problem here, a = accuracy, tr = temperature range measured, r = reliability, p = price, prt = platinum resistance thermometer, t = thermistor and tc = thermocouple. the priorities calculated are shown in the comparison matrices. these matrices are used to calculate the final priorities for the available alternatives. with each matrix, there is associated a consistency ratio (cr) which gives the measure of consistency in the comparisons made. we use the expert choice software for calculating cr. usually, cr should be under 10 % for the results to be acceptable; else the comparison should be undertaken again. in the method we have proposed to calculate r3i, we shall calculate the priorities of the alternatives with respect to criteria, but we do not compare the criteria with respect to the objective. this is because the criteria that are available with us are functions (main functions) from the function structures. it would be inadvisable to compare the functions that are basic or fundamental to the system using the pairwaise comparison matrix, because all the main functions may seem to be equally important to the designer. instead, we use the entropy method [2] to calculate the weights of the functions with us. sub section 3.3 explains the entropy method. 3.3 entropy method to calculate weights the entropy method [2] is an madm method to calculate the weights of the attributes that have been considered during the decision-making process. it utilizes the information content of the decision matrix to calculate the weights of the attributes. this method has been adopted as a part of calculating r3i because it may be inappropriate for a designer to compare functions relatively from the function structure. the information contents of the normalized values of the attributes can be measured using entropy values. the entropy vj of the set of normalized outcomes of attribute j is given by v l l j j kj ij ij i n � � � � � �� ln , ( 1 1for all to represents attribute and to represents alternativei n� 1 ) (1) where � is constant which defined as � � 1 ln( )n and lij is a normalized element of the decision matrix. if there are no preferences available, the weights are calculated using the equation w e e e vj j j j k j j � � � � � 1 1and . (2) if the decision maker has the weights available beforehand we, it can be combined with the weights calculated above, resulting in new weights that are wnew. 
w w w w w new e j e j j k � � � � � 1 . (3) 3.4 concept functionality graphs concept functionality graphs depict the strengths and weaknesses of the concepts generated in the conceptual design phase. they are the graphs between the functional priorities obtained from ahp and the concepts. ulrich & eppinger [10] have proposed a five-step method for generating solution concepts using function diagrams. this strategi c approach towards generating concepts helps identify the strengths and weaknesses of all concepts functionwise. unfortunately, systematic methods are not always used in industry [11]. also, a large number of concepts generated produce a complex situation to recognise the strengths and weaknesses as regards each function in the concepts. henceforth concept © czech technical university publishing house http://ctn.cvut.cz/ap/ 11 czech technical university in prague acta polytechnica vol. 46 no. 1/2006 sensor a tr p r priorities a 1 3 1/3 2 .257 tr 1/3 1 1/3 2 .147 p 3 3 1 3 .483 r 1/2 3 1/3 1 .113 incon: 0.08 table 1: comparison matrix for criteria a prt t tc priorities prt 1 5 5 .709 t 1/5 1 2 .179 tc 1/5 1/2 1 .113 incon: 0.05 table 2: comparison matrix wrt accuracy tr prt t tc priorities prt 1 5 6 .726 t 1/5 1 2 .172 tc 1/6 1/2 1 .102 incon: 0.03 table 3: comparison matrix wrt temp range r prt t tc priorities prt 1 3 5 .163 t 1/3 1 3 .540 tc 1/5 1/3 1 .297 incon: 0.01 table 4: comparison matrix wrt reliability p prt t tc priorities prt 1 1/3 1/2 .637 t 3 1 2 .258 tc 2 1/2 1 .105 incon: 0.04 table 5: comparison matrix wrt price functionality graphs are thought of as a means to represent the strengths and weaknesses of concepts after the comparison using ahp has been performed. 4 example under consideration: seat suspensions the example we use here to illustrate the application of this methodology is the seat suspension mechanism for off-highway vehicles. it has been taken from [5]. hurst had considered this example to illustrate the effectiveness of using spreadsheets for concept selection. the method applied is similar to the weighting & rating method and the ratings provided to the concepts with respect to criteria are in terms of satisfaction of criteria. all the six concepts are shown in fig. 5. 5 application of the proposed method on example and results 5.1 establishing function structures the function structure established for seat suspensions is shown in fig. 6. essentially, 3 main functions are considered in the structure. they are hold seat, dampening vibrations and adjusting seat height. the flow of matter, energy and signals are shown. 5.2 applying ahp for comparing concepts with respect to the main functions ahp is applied to the functions considered here and the comparison matrices are shown in tables 6, 7 and 8. in tables 6, 7 and 8, a, b etc refer to concept a, b etc. the inconsisten12 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 46 no. 1/2006 czech technical university in prague concept a concept b concept c concept d concept e concept f fig. 5: concepts for seat suspensions for off-highway vehicles (after hurst) cies are also laid down with each matrix considered. the inconsistencies are all less than 0.1 and are acceptable. after the application of ahp, a priority matrix is obtained (table 9). this will be treated as our decision matrix. 5.3 concept functionality graphs for the example considered the cfg for this example is shown in fig. 7. the integers 1 – 6 on the x-axis in fig. 
7 represent concept a – concept f respectively. the figure is meant to depict a clear picture of the strengths and weaknesses of different concepts with respect to the functions considered. 5.4 application of the entropy method to calculate weights the weights for the three functions considered have been calculated using the information from the matrix and the entropy method (explained in section 3.3) is utilized to calculate © czech technical university publishing house http://ctn.cvut.cz/ap/ 13 czech technical university in prague acta polytechnica vol. 46 no. 1/2006 hold seat dampen vibrations adjust seat height seat seat evibration eauxiliary edampening eadjust fig. 6: function structure of seat suspension mechanism hold seat a b c d e f priorities a 1 5 3 3 2 1/3 0.233 b 1/5 1 1/3 2 1/3 1/4 0.061 c 1/3 3 1 3 1/3 1/4 0.103 d 1/3 1/2 1/3 1 1/5 1/5 0.047 e 1/2 3 3 5 1 1/2 0.190 f 3 4 4 5 2 1 0.365 incon.: 0.06 table 6: comparison matrix wrt to hold seat dampen vibrations a b c d e f priorities a 1 5 3 5 1/2 3 0.0271 b 1/5 1 1/3 1 1/5 1/2 0.053 c 1/3 3 1 3 1/4 3 0.145 d 1/5 1 1/3 1 1/5 1/2 0.053 e 2 5 4 5 1 5 0.396 f 1/3 2 1/3 2 1/5 1 0.082 incon.: 0.03 table 7: comparison matrix wrt to dampen vibrations adjust seat height a b c d e f priorities a 1 1/3 1/2 1 1/2 1/2 0.082 b 3 1 3 5 2 3 0.352 c 2 1/3 1 3 1/2 2 0.157 d 1 1/5 1/3 1 1/3 1/3 0.061 e 2 1/2 2 3 1 3 0.229 f 2 1/3 1/2 3 1/3 1 0.119 incon.: 0.04 table 8: comparison matrix wrt to adjust seat height concept a b c d e f hold seat 0.233 0.061 0.103 0.047 0.19 0.365 dampen vibrations 0.271 0.053 0.145 0.053 0.396 0.082 adjust seat height 0.082 0.352 0.157 0.061 0.229 0.119 table 9: priority matrix for seat suspension concepts concept functionality graph seat suspensions 0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 1 2 3 4 5 6 concepts hold seat dampen vibrations adjust seat height fig. 7: cfg for the seat suspension example the same. the weights obtained after the application of the method are shown in table 10. normalisation of the decision matrix is not required since the sum of priorities for any attribute j is 1 in table 9. having calculated the weights and priorities, we obtain r3i (table 11) using eq. (4). r i for all3 1 i ij j j k l w i� � � � . (4) we can see from table 11 that concept e has the best r3i among all the available concepts. also the concepts that may be screened out are those that have low r3i value, which are b and d. the ordinal ranks are also shown in table 11. 6 conclusion in this paper, we reviewed reliability and proposed a method for calculating a relative index to compare concepts in the initial phases of design. the method helps to obtain ordinal rankings of the available concepts and is applied on the example of seat suspensions for off highway vehicles. the methodology involves application of the analytic hierarchy process to relatively compare concepts and the entropy method for obtaining the weights of the functions considered. the idea of concept functionality graphs is introduced and the results of application on the example are discussed. future work includes validation of this methodology using other examples from student projects and from industry. references [1] mamtani, g., green, g.: “evolution of a computer evaluation tool in context with scottish industries”, proceedings of design 2004 conference, dubrovnik, 2004. [2] sen, p., yang, j.: multiple criteria decision support in engineering design. london, 1998. [3] pahl, g., beitz, w.: engineering design – a systematic approach. 
ken wallace, london, 1996. [4] saaty, thomas l.: decision making for leaders. pittsburgh, 2001. [5] hurst, k.: “spreadsheet analysis applied to the concept selection phase of engineering design.” proceedings of international conference on engineering design, dubrovnik, 1990. [6] smith, d.: reliability and maintainability in perspective. london, 1988. [7] smith, j., clarkson, p. j.: “improving reliability during conceptual design.” proceedings of international conference on engineering design, glasgow, 2001. [8] cooper, g., thompson, g.: “concept design and reliability.” advanced engineering design, glasgow, 2001. [9] saaty, thomas l.: the fundamentals of decision making and priority theory with the analytic hierarchy process. pittsburgh, 2000. [10] ulrich, k., eppinger, s.: product design and development. boston, 2000. [11] taylor, ben: “enhancement of design evaluation during concept development.” proceedings of international conference on engineering design, the hague netherlands, 1993. girish mamtani e-mail: g.mamtani@mech.gla.ac.uk dr. graham green phone: +44(0) 141 330 4071 e-mail: g.green@mech.gla.ac.uk mechanical engineering department james watt building university of glasgow glasgow, g12 8qq scotland, uk 14 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 46 no. 1/2006 czech technical university in prague functions weight (wj) hold seat 0.33 dampen vibrations 0.413 adjust seat height 0.252 table 10: weights obtained after application of the entropy method a b c d e f r3i 0.209 0.13 0.133 0.052 0.283 0.184 rank 2 5 4 6 1 3 table 11: r3i and ranks for concepts – seat suspensions ap07_2-3.vp 1 introduction and summary a more or less satisfactory explanation of the numerous experimental observations that the states of molecules, atoms and atomic nuclei are quantized belongs to the most important achievements of physics made, predominantly, during the first few decades of the twentieth century [1]. it is, therefore, slightly surprising that the situation is still not yet entirely satisfactory at present. in particular, one often has to rely upon phenomenological models in nuclear physics where our uncertainties concerning the “correct form” of the interactions between individual nucleons are combined with the enormous mathematical difficulties arising in connection with a sufficiently reliable numerical solution of the underlying quantum-mechanical many-nucleon problem. one of the ways out of the latter theoretical as well as practical trap has been found in the elimination of as many irrelevant degrees of freedom as possible. perceivable success has been encountered in the so called interacting boson models, where practical solvability of the complicated (i.e., partial differential or integro-differential) linear schrödinger equations for bound states, h e nn n n� �� �, , ,0 1 � has been achieved via their reduction to the “effective”, simplified form h e n neff n eff n eff n eff( ) ( ) ( ) ( ) max, , , ,� �� � 0 1 � , where h(eff ) is a finite-dimensional and, often, real and symmetric matrix. in the latter context as briefly reviewed, e.g., in [2], a conflict survives between the numerical reliability and practical tractability of the effective models. 
in the context of the so called dyson-mapping approximation technique, for example, it was originally felt as an unpleasant surprise that the requirement of the smallness of the dimension of the matrix h(eff ) emerged in an apparently inseparable combination with the necessity of moving to a less standard hilbert space h(physical) of states where the definition of the inner (“scalar”) product between elements � � h(physical) and � � h(physical) had to be modified, � � � �� � �� � �, † 0 (1) this trick very easily extends the applicability of the formalism of standard textbook quantum mechanics by the transition to the various nontrivial, “non-dirac” metric operators. of course, all the observable quantities must be then represented by the operators �(physical) which are self-adjoint in �(physical). whenever one chooses a nontrivial metric �(physical) � � in � (physical), all the operators �(physical) which are self-adjoint in �(physical) must obey the consistency condition � �� �( ) † ( )physical physical� �� � 1 . (2) unfortunately, confusion may (as it often does [3]) immediately arise in all the models where one employs, in parallel, another, auxiliary but much more easily tractable hilbert space �(unphysical), the scalar product in which is specified by the unit metric �(unphysical) � i. for this reason, it has been recommended [2] to call the operators �(physical) [of eq. (2), with � � i] “quasi-hermitian”, while reserving the name “hermitian” solely for the subset of operators � for which eq. (2) holds at the traditional “dirac” special metric � � i. the latter convention is believed to minimize the possible misunderstandings, in spite of the apparently counterintuitive fact that all the operators of observables �(physical) may be called, strictly speaking, “non-hermitian”, at least from the point of view of the more conventional, albeit auxiliary, hilbert space �(unphysical). the latter, apparently innocent-looking paradox demonstrated its full strength and impact when bender and boettcher published their apparently surprising observation [4] that certain remarkably elementary (viz. ordinary differential) hamiltonians h(bb) possess, in spite of their manifest non-hermiticity (with respect to the most common hilbert space l2(� , )), real and discrete (i.e., bound-state-like) spectra. it took a few years before the bender’s and boettcher’s apparent puzzle was resolved. independently, several groups of people realized that in our present language and for all the models in question we have l2(� , ) � (unphysical) [5, 6, 7, 8, 9, 10, 11, 12, 13, 14]. this means that inside a certain quasi-hermiticity domain � of parameters where the energies remain real [4, 7], all the models h(bb) � [h(bb)]† do satisfy the necessary quasi-hermiticity condition (2). one even © czech technical university publishing house http://ctn.cvut.cz/ap/ 9 acta polytechnica vol. 47 no. 2–3/2007 ��-symmetric quantum chain models m. znojil a review is given of certain tridiagonal n-dimensional non-hermitian j-parametric real-matrix quantum hamiltonians h(n). the domains �(n) of reality of their spectra of energies are studied, with particular attention paid to their exceptional-point boundaries ��( )n . the strongest admissible couplings are specified in closed form for all n. keywords: quantum mechanics, pseudohermitian hamiltonians, ��-symmetric nearest-neighbor interactions, exactly solvable finite-dimensional models, domains of quasihermiticity, exceptional points. 
realizes [2], [10] that there exist quite a few different metrics � � �(physical) � i which can all be assigned to the “candidate-for-the-hamiltonian” operator h � h†. in what follows we intend to return to the problem of the ambiguity of the assignment of an “optimal” �(physical) to a given h � h†. for the sake of clarity we shall restrict our attention to the effective-hamiltonian models described by the following real matrices of the chain-model form, h n g g n g g n g g n n( ) � � � � � � � � 1 0 0 0 3 0 0 0 5 0 0 0 1 1 2 2 3 2 � � � � � � � � � � 3 0 0 0 1 1 1 g g n� � � � � � � � � � � � � � � � � � � � . (3) complicated as the problem may look for the general n, we shall recollect and summarize some results of refs. [15]–[20] and show that, and why, these models remain tractable, up to a large extent, by certain analytic, perturbative or algebraic non-numerical techniques. after a compact and more or less self-contained review of the underlying physics in section 2 we shall formulate our project in section 3. at the first few lowest dimensions n, we shall then derive some consequences of the implicit secular-equation definitions of the energy spectra in the respective sections 4–9, with particular emphasis on non-perturbative, strong-coupling results. sections 10–12 will finally summarize our observations and, in a climax of our present message, they enable us to conjecture an extrapolation of some of the formulae to all the finite dimensions. 2 ��-symmetry of the chain model (3) once we assume that all the matrix elements of h(n)remain real we observe that � �h hn n( ) † ( ),� � �� � � � � � � � � � � � � � � � � 1 0 0 1 0 0 1 0 0 1 � � � � � � � � � � � . (4) in the literature the latter formula is usually called �-pseudo-hermiticity [10]–[13] (or, in the context of physics [4, 21], ��-symmetry) of h(n). due to the exceptional simplicity of its “parity-reversal” matrix factor �, the validity of eq. (4) may also significantly simplify an explicit re-construction of �(physical), proceeding in three steps. in the first step we imagine that h(n) is non-hermitian so that we may and have to solve not only the standard schrödinger’s linear-algebraic eigenvalue problem h n q n n nn n ( ) , , , ,� � 1 2 � (giving all the right eigenvectors n of h(n) as its result) but also the parallel, “left-eigenvector” linear algebraic problem at the same eigenvalue, m h m q m nn m ( ) , , , ,� � 1 2 � (5) (note that n n� for the generic h hn n( ) ( ) † � ). in the second step we assume that the eigenvalues remain real and non-degenerate (thus, in our notation, all our coupling constants gj stay inside a real quasi-hermiticity domain �) and recollect the following explicit general formula for the metric, � � � � � n nn n n n� � 1 0, (6) (cf., e.g., [10]–[13]) where any choice of the n-plet of the real parameters �n defines an eligible hilbert space � (physical) equipped with the inner product (1). in the final step we notice that the “additional” problem (5) can be re-written in the equivalent form h n q n n nn n ( )† * , , , ,� � 1 2 � . (7) obviously, its solution [i.e., a key assumption of the feasibility of an evaluation of the sum (6)] can be circumvented because in the light of eq. (4) the “unknown” left eigen-ketkets are all proportional to the “known” right eigen-kets multiplied by a diagonal matrix, n q nn� � . here the coefficients qn are arbitrary. 
their variability represents in fact the freedom in the normalization of our (as we can prove, biorthogonal [7]) basis (composed of the left and right eigenvectors of h(n)). in the case of the real spectrum qn one can easily fix their choice in such a way that the inner product becomes positive definite. whenever necessary, we may even re-scale their values to �1, – that’s why we called these coefficients “quasiparities” in [7]. 3 the extreme exceptional points secular-equation definitions � �det ( )h e in � � 0 of the energies degenerate to the polynomial equations in e2 � s, s p a b s p a b sj j j j j� � � �� � � � 1 1 2 2 0( , , ) ( , , )� � � . (8) here we abbreviated a g j� 2 , b g j� �1 2 , …, z g� 1 2, with � �j entier n� 2 . in the j-dimensional space of our new, non-negative coupling parameters a, b, …, z, all the spectrum of the energies e s� � remains real only inside a certain compact, hedgehog-shaped domain �. our main attention will be paid to the “spikes of the hedgehog”, i.e., to the 2 j points where the boundary ��( )n forms certain protruded spikes. in the context of physics these spikes represent the strong-coupling extremes in the set �� ( )n tractable as kato’s exceptional points occurring in the real domain [22]. one can notice that at all these extreme exceptional points (eeps) [with coordinates a(eep), b(eep), …, z (eep)], all the energy levels degenerate to the single value of e (eep) � 0. this follows from the up-down symmetry of the unperturbed levels as well as of their perturbations. as a consequence, the eep secular equation acquires the form (e � e (eep)�n � 0 so that in the light of eq. (8), all of the eep coupling strengths will have to satisfy the set of the following j-plet of polynomial equations, 10 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 47 no. 2–3/2007 p a b p a b p j j � � � � 1 2 0 0 ( , , ) , ( , , ) , ( ) ( ) ( ) ( ) eep eep eep eep � � � 0 0( , , ) . ( ) ( )a beep eep � � (9) although the simplicity of their solutions is amazing, their derivation is from easy. let us now study this problem with a step-by-step increase of the dimension. 4 two-dimensional model of paper [15] in ref. [15] we paid attention to all the ��-symmetric real matrices h c a a c [15] � � � � � � � �� � � �� 1 1 . (10) all of their eigenvalues (i.e., eigenenergies) are known in the closed form, e c a� � � �( )1 2 2 . once we decided to ignore the “pathological” case with c � cpath � �1 (giving complex energies), an elementary re-scaling enabled us to put c � 0 and get the first nontrivial one-parametric version of our present class of models h(n). our main result was that one can guarantee the reality of the spectrum in an interval �(2) � (�1, 1) of our couplings a � cos with, say, �( , )0 . we also discussed several eligible ways of suppressing the ambiguity of the metric. this task proved easily achieved: we simply extracted the general metric from eq. (2) via a real, hermitian ansatz q t t t t � � � �� � � �� 1 2 2 3 . this enabled us to reduce all the construction to the single condition 2 03 1 2t t t� � �( ) cos . two free parameters survived and the latter relation defined the value of t3. once we set t z1 1� �( )� and t z2 1� �( )� we have t z3 � cos as well as the overall-scaling interpretation of z [16]. once we had constructed the metric � compatible with eq. (2), it remained for us to guarantee that our � was positive definite. 
fortunately, both the eigenvalues of � are available in closed form, � � � � � �z ( cos1 2 2 so that the derivation of the final constraint � 2 2� sin is trivial. 5 three-dimensional model of paper [17] out of the generic three-by-three model of ref. [17] with two free parameters, h a a a a [17] � � � � � � � � � � � � � � � � 2 � � � � � � � � the present one-parametric chain model h(3) is obtained in the limit � � 0. thus, the determination of the interval of the quasi-hermiticity �( ) ( , )3 2 2� � is trivial since the secular equation � � � �e a e3 24 2 0( ) is exactly solvable in closed form. 6 four-dimensional model of paper [18] the four-dimensional model of paper [18] contains four free real parameters, but it contains our present two-parametric chain model with n � 3 as a special case. such a reduction simplifies the n � 4 secular equation det 3 0 0 1 0 0 1 0 0 3 0 � � � � � � � � � � � � � � � � � � � � � e b b e a a e b b e and makes it equivalent to the quadratic equation for s � e2, s b a s b a b2 2 2 2 2 410 2 9 6 9 0� � � � � � � � �( ) solvable in closed form, s s b a b a b a a� � � � � � � � �� 5 1 2 1 2 64 64 16 42 2 2 2 2 2 4 . (11) the discussion of the two-dimensional domain �(4) of the reality of these energies is entirely analogous to the previous case. also the set of the polynomial eep equations remains elementary, a b b a� � � �2 10 3 92, ( ) . the elimination of a leads to a quadratic equation for b � 3 giving a spurious solution a � 64 and b � �27 (which would imply an imaginary coupling b) and the unique correct solution a(eep) � 4 and b(eep) � 3. 7 five-dimensional case the model h b b a a a a b b ( )5 4 0 0 0 2 0 0 0 0 0 0 0 2 0 0 0 4 � � � � � � � � � � � � � � � � � � � � � � gives the central constant energy e0 � 0 so that its secular equation � � � � � � � � � �s b a s b a b a b2 2 2 2 2 4 2 220 2 2 64 16 32 2 0( ) (12) is solvable in closed and compact form again, e a b a a b� � � � � � � � �1 2 2 2 4 210 36 12 36 e a b a a b� � � � � � � � �2 2 2 2 4 210 36 12 36 © czech technical university publishing house http://ctn.cvut.cz/ap/ 11 acta polytechnica vol. 47 no. 2–3/2007 thus, the triplet of the necessary inequalitites comprises the trivial simplex condition 10 � a � b, condition 36 � 12a � a2 � 36b [showing that b must lie below a parabola bmax � bmax(a)] and condition ( ) ( )8 32 2 2� � �b b a [giving the upper bound for a a a b� �max max( )]. in the eep context, two coupled conditions degenerate to the single quadratic equation with the unique non-spurious solution a(eep) � 6 and b(eep) � 4. 8 six-dimensional case the secular equation at n � 6, det 5 0 0 0 0 3 0 0 0 0 1 0 0 0 0 1 0 0 0 0 3 0 � � � � � � � � � � � e c c e b b e a a e b b e c 0 0 0 5 0 � � � � � � � � � � � � � � � � � � � � � c e in its polynomial form for s � e2, s b c a s b c a b c a c 3 2 2 2 2 4 2 2 2 2 2 4 2 35 2 2 44 28 34 2 � � � � � � � � � � � ( ) ( 59 2 10 30 225 30 25 2 2 2 2 4 2 2 2 2 2 2 4 4 � � � � � � � � � b c s a c b c c a a c c b ) 25 150 02� �b is still solvable in closed form. in the eep extreme the full solution of the triplet of eqs. (9) ceases to be easy but it still remains feasible, giving a b c n( ) ( ) ( ), , ,eep eep eep� � � �9 8 5 6. (13) 9 seven-dimensional case by the same gröbner-basis method as above we derive the result a b c n( ) ( ) ( ), , ,eep eep eep� � � �12 10 6 7. 
(14) it is again unique because one of the two roots c� � �27 9 21 of the “first alternative” gröbnerian “effective” equation c c2 54 972� � and both the roots � �354 60 34 of the “second alternative” equation c c2 708 2916 0� � � are negative, while the only remaining positive root c� � 68 24318125. gives the negative b c� �28 3 . 10 extrapolation formulae of paper [19] by construction, quasi-hermiticity domain � must lie inside a simplex s, a b c z g g k k n k j k k j � � � � � � � � � � � �2 2 4 3 2 2 2 1 1 3 ( ) ,� even (15) and a b c d z m m m n m � � � � � � � � � � � � 2 3 3 2 1 3 2 , odd . (16) this provides important information about the shape of the domain �, obtained fairly easily by the extrapolation technique (cf. [19]). in the light of the simplicity of all our previous eep formulae the extrapolation trick can be applied to them as well. at the even n � 2k such an approach leads to the extrapolation conjecture a k b k c k d k ( ) ( ) ( ) ( ) , , , , eep eep eep eep � � � � � � � 2 2 2 2 2 2 2 1 2 3 � (17) the validity of which we tested up to k � 6. in parallel one arrives at the following odd-dimensional formula a m m b m m m m c m ( ) ( ) ( ) ( ), ( ) ( ) , ( eep eep eep � � � � � � � � � � 1 1 1 2 1 2 m d m m � � � � � � � 1 2 3 1 3 4 ) , ( ) ,( )eep � (18) when n m� �2 1. 11 verifications: eight-dimensional case even dimensions n � 2k are “anomalous” in having the coupling a in the matrix h k z z b b a a b b k( )2 2 1 0 0 3 0 1 0 0 1 0 0 3 � � � � � � � � � � � � � � � � � � � � � � � � � � � z z k0 1 2� � � � � � � � � � � � � � � � � � � � � � � � � just once. this is reflected by the specific, “anisotropic” form of the simplex (15). once we had revealed the general n-dependence of the similar formulae, it was necessary to test these conjectures. the eight by eight model with k � 4 played a key role in it since the complexity of its non-numerical description is already quite perceivable. the situation is still not bad when the circumscribed simplex with the definition a b c d� � � �2 2 2 84 is sought. the subsequent eep construction is much more difficult. it is based on simultaneous solution of the quadratic, cubic and quartic polynomial equations p a b c d2 0( , , , ) � , p a b c d1 0( , , , ) � , and p a b c d0 0( , , , ) � containing 13, 19 and 20 individual terms, respectively. just a marginal simplification exists, e.g., in the p2 – case reducible to the 9-term equation 1974 2 2 2 83 142 70 50 2� � � � � � � � � � ( )b c d ad bd ac a b c d still, a decisive formal merit of our model is reflected by the survival of the simplicity of the final formula a b c d n( ) ( ) ( ) ( ), , , ,eep eep eep eep� � � � �16 15 12 7 8. (19) 12 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 47 no. 2–3/2007 in a test of the uniqueness of solution (19) one finds out that it possesses seven real and positive roots d. out of these, the following three of them are negative and, hence, manifestly spurious,�203.9747095, �156.6667001, �55.49992441. the proof of the spuriosity for the remaining four roots 0.4192854385, 5.354156128, 1354.675195 and, however straightforward, becomes unpleasant and clumsy. for example, the values of a are given by the rule ×a � (a polynomial in d of 16th degree) where the number of digits in the auxiliary integer constant exceeds one hundred. 
12 odd dimensions and the test at n � 9 it remains for us to discuss the models with odd dimensions n � 2m �1, h m z z a a a a m( )2 1 2 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 � � � � � � � � � � 0 0 0 0 0 0 0 2 � � z z m� � � � � � � � � � � � � � � � � � � � � � � and to test and verify the validity of the extrapolated formulae at n � 9. at m � 4 we were still able to evaluate the explicit form of the secular equation, 14745600 7372800 2 220 2 2 2 04 5 � � � � � � � � � � a c b a d s s � ( ) and to re-derive the expected m � 4 eep values by its direct solution, a b c d n ( ) ( ) ( ) ( ) , , , , . eep eep eep eep � � � � � 20 18 14 8 9 (20) at a few higher m > 4 we just re-confirmed the validity of the extrapolated formulae (18) by their insertion in secular equations. acknowledgment this work has been supported by the mšmt doppler institute project nr. lc06002, by the institutional research plan av0z10480505 and by gačr grant nr. 202/07/1307. references [1] styer, d. f. et al.: nine formulations of quantum mechanics, amer. j. phys. vol. 70 (2002), p. 288. [2] scholtz, f. g., geyer, h. b., hahne, f. j. w.: quasi-hermitian operators in quantum mechanics and the variational principle, ann. phys. (ny), vol. 213 (1992) p. 74–101. [3] cf. http://www.mth.kcl.ac.uk/~streater/lostcauses.html#xiii [4] bender, c. m., boettcher, s.: real spectra in non-hermitian hamiltonians having ��-symmetry, phys. rev. lett. vol. 80 (1998), p. 5243–5246. [5] bender, c. m., turbiner, a.: analytic continuation of eigenvalue problems, phys. lett. a vol. 173 (1993), p. 442; a. turbiner, private communication (april/may, 2000). [6] buslaev, v., grecchi, v.: equivalence of unstable anharmonic oscillators and double wells, j. phys. a: math. gen. vol. 26 (1993), p. 5541–5549; v. grecchi, private communication (february, 2000). [7] znojil, m.: should ��-symmetric quantum mechanics be interpreted as nonlinear?, j. nonlin. math. phys. vol. 9, suppl. 2 (2002), p. 122–133 (quant-ph/0103054). [8] znojil, m.: conservation of pseudo-norm in ��-symmetric quantum mechanics, rendiconti del circ. mat. di palermo, ser. ii, suppl. 72 (2004), p. 211–218, (math-ph/0104012). [9] bagchi, b., quesne, c., znojil, m.: generalized continuity equation and modified normalization in ��-symmetric quantum mechanics, mod. phys. lett. a, vol. 16 (2001), p. 2047–2057. [10] mostafazadeh, a.: pseudo-hermiticity versus �� symmetry: the necessary condition for the reality of the spectrum of a non-hermitian hamiltonian, j. math. phys. vol. 43 (2002), p. 205–214. [11] mostafazadeh, a.: pseudo-hermiticity versus ��-symmetry ii. a complete characterization of non-hermitian hamiltonians with a real spectrum, j. math. phys. vol. 43 (2002), p. 2814–2816; [12] mostafazadeh, a.: pseudo-hermiticity versus ��-symmetry iii: equivalence of pseudo-hermiticity and the presence of antilinear symmetries, j. math. phys. vol. 43 (2002), p. 3944–3951; [13] langer, h., tretter, ch.: a krein space approach to ��-symmetry, czech. j. phys. vol. 54 (2004), p. 1113–1120. © czech technical university publishing house http://ctn.cvut.cz/ap/ 13 acta polytechnica vol. 47 no. 
2–3/2007 one can only appreciate the latter observation when one sees the final gröbner-basis element which defines d( )eep � 7 as a root of the following seventeenth-degree polynomial 314432 d17� 5932158016 d16� 4574211144896 d15� 3133529909492864 d14� 917318495163561932 d13 �167556261648918275684 d12� 14670346929744822064505 d11� 720991093724510065469933 d10 � 62429137451114251409236415d9� 676326278232758784369966787d8� 40525434802944282153115803370d7 � 2361976444746440513605248930610 d6� 145759836636885012145070948315366 d5 � 8129925258122948689157916436170874 d4� 68875673245487669398850290405642067 d3 � 2 35326754101824439936800228806905073 d2� 453762279414621179815552897029039797 d � 153712881941946532798614648361265167 � 0 [14] bender, c. m., brody, d. c., jones, h. f.: complex extension of quantum mechanics, phys. rev. lett. vol. 89 (2002), 0270401; erratum-ibid. 92 (2004), 119902. [15] znojil, m., geyer, h. b.: construction of a unique metric in quasi-hermitian quantum mechanics: nonexistence of the charge operator in a 2×2 matrix model, phys. lett. b, vol. 640 (2006) 52–56; erratum-ibid, phys. lett. b 649 (2007) – see [16] below. [16] in ref. [15] we accepted a specific overall scale z � 1in �. in the resulting one-parametric subfamily � of the factorization postulate � � �� of ref. [14] led to the charge factor c, which was not involutive. in order to avoid possible misunderstandings we would like to emphasize here that neither the involutivity �2 � i nor the preference of z � 1 are based on deeper physical grounds. in this sense, the statements formulated in the last four lines of paragraph 3.2 of ref. [15] (plus their several citations throughout the text) should be interpreted with due care. it is obvious that, whenever necessary, one can always return to the full, two-parametric family of � and achieve the involutivity of charge � via an elementary hamiltonian-dependent adaptation of the scale z z h� ( ). [17] znojil, m.: a return of observability near exceptional points in a schematic ��-symmetric model, phys. lett. b, vol. 647 (2007), p. 225–230. [18] znojil, m.: determination of the domain of the admissible matrix elements in the four-dimensional ��-symmetric anharmonic model, phys. lett. a, vol. 367 (2007), p. 300–306. [19] znojil, m.: maximal couplings in ��-symmetric chain-models with the real spectrum of energies, j. phys. a: math. theor. vol. 40 (2007), p. 4863–4875, (math-ph/0703070v1) [20] znojil, m.: conditional observability, phys. lett. b, vol. 650 (2007), p. 440–446 (arxiv:0704.3812v1 [math-ph]). [21] bender, c. m.: making sense of non-hermitian hamiltonians, rep. prog. phys., submitted (hep-th/0703096 and preprint la-ur-07-1254). [22] kato, t.: perturbation theory for linear operators berlin: springer, 1966, p. 64. miloslav znojil, drsc. phone: +420 266 173 286 e-mail: znojil@ujf.cas.cz academy of science of the czech republic nuclear physics institute 250 68 řež, czech republic 14 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 47 no. 2–3/2007 ap05_3.vp 1 introduction a major tool used to measure and assure quality is quality inspection. this inspection can assure that the products being produced meet the standards of quality and describe the quality levels. the objective of methodically planned inspection is to ensure regular quality inspection and its optimum integration into the production sequence. 
quality inspection can be a check made on each piece produced (100 % inspection) or a check made on a statistical sample of the lot. the inspection can be a mechanical or electronic measurement or a visual inspection. it can also be performed by the operator or worker making the part or component, by a second person who is responsible for measuring only, or performed entirely by computer-controlled measurement. these matters are combined into an inspection strategy. each inspection strategy has its own pros and cons. three important criteria are needed to evaluate the inspection strategies: quality, cost and time. the best inspection strategy should consider not only one criterion but all three of them. however, to find this best integration of inspection into production is not an easy task. the cycle times, quality and manufacturing costs – depending on lot sizes or on work-in-process – are difficult to estimate and a process improvement is hard to identify without support. therefore, simulation has become a powerful tool to solve this problem. there have been some researches on simulation in the quality inspection area. tannock [1] developed a simulation model in order to evaluate inspection strategies according to process capability and cost of quality (coq). the quality costs and the taguchi-based measure (qmp) are then evaluated, according to the inspection strategy selected. he also used the simulation method [2] to prove the capability of providing an insight into the comparative patterns of cost associated with control charting for variables and alternative inspection strategies. the simulation results confirm that control charting of variables is much more efficient than 100 percent inspection at reducing losses caused by process trends or changes in the variability when known assignable causes are applied to the data. although many researches have been done on quality inspection, no exhaustive investigation of the inspection strategy in different planning factors with respect to quality, costs and time has been made. this report therefore presents the developed simulator named “quinte” [3, 4], which can be used to investigate and evaluate the inspection strategies with respect to quality, cost, and time in the manufacturing process. the paper also introduces the application of this simulator in industry and the integration to commercial simulation software. before describing the simulator in detail, a section focusing on the fundamental concept of inspection strategies is presented. 2 inspection strategies the inspection strategy should be planned at the beginning of the product development phase together with process planning, in order to allow integration of the inspection planning into the quality planning. there are five main aspects that should be considered in inspection strategy planning [5]. the quality characteristics of the product that need to be inspected should also be defined and classified in advance. 2.1 inspection method the method used for inspection can be either an attribute method or a variable method. an attribute inspection checks whether the product is good or bad, while variable inspection obtains the quantitative value of the quality characteristic. to select the method used in the inspection, cost, time and 10 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 45 no. 3/2005 czech technical university in prague simulation in quality management – an approach to improve inspection planning h.-a. crostack, m. 
höfling, j. liangsiri production is a multi-step process involving many different articles produced in different jobs by various machining stations. quality inspection has to be integrated in the production sequence in order to ensure the conformance of the products. the interactions between manufacturing processes and inspections are very complex since three aspects (quality, cost, and time) should all be considered at the same time while determining the suitable inspection strategy. therefore, a simulation approach was introduced to solve this problem. the simulator called quinte [the quinte simulator has been developed at the university of dortmund in the course of two research projects funded by the german federal ministry of economics and labour (bmwa: bundesministerium für wirtschaft und arbeit), the arbeitsgemeinschaft industrieller forschungsvereinigungen (aif), cologne/germany and the forschungsgemeinschaft qualität, frankfurt a.m./germany] was developed to simulate the machining as well as the inspection. it can be used to investigate and evaluate the inspection strategies in manufacturing processes. the investigation into the application of quinte simulator in industry was carried out at two pilot companies. the results show the validity of this simulator. an attempt to run quinte in a user-friendly environment, i.e., the commercial simulation software – arena® is also described in this paper. keywords: simulation, quality, inspection strategies, manufacturing process. notation: quinte qualität in der teilefertigung (quality in the manufacturing process) application should be taken into account. while the attribute method is usually simple and inexpensive, the variable method gives adequate information for the purposes of process control. 2.2 inspection point the quality characteristics of products need to be inspected in order to prove the conformity of the parts with the demands. the earliest possible inspection point is located right after the production of the characteristic. if an inspection is performed after every process, the scrap and rework costs are at a minimum because faulty items are identified before adding more costs to already defective material. however, it is more expensive to conduct the inspection in this way than to combine the inspections of many quality characteristics at a single inspection point. the reason for this is that the inspection time and cost, for example setup, queuing, and buffer, are high. if many characteristics are inspected together later in the process flow, the inspection time and cost will be lower. then again, the scrap and rework costs will be higher. therefore, if this intermediate inspection is done either too often or too late, unnecessary costs will occur. the choice of the inspection point is based on a number of criteria such as inspection costs, damage risk, accessibility of the characteristic, increase in the product value, etc. also the fact that some parts cannot be inspected when they are already assembled must be considered. 2.3 inspection extent a decision about the extent of the measurements must be carried out. this aspect of inspection strategy directly influences inspection and failure cost. it ranges from no test to random and intermittent tests, and all the way to a 100% test. the 100 % inspection is usually used for the final test of critical or complex products. it may also be used when the process capability is inherently too poor to meet product specification. 
to conduct 100 % inspection is very costly and time consuming, even though the entire products are sorted. sample inspection is carried out according to externally valid standards or company internal regulations. when choosing the inspection extent, prior knowledge of the product is important, e. g., the importance of the feature for product quality, and process capability. no inspection is used when there is already adequate evidence that the product conforms, and hence no further inspection is needed. while no inspection or sample inspection gives benefits in inspection cost and time, the company should bear in mind that it includes the risk of declaring the lot good even if it might contain defects in the lot. 2.4 inspection location and personnel inspection can be performed either at the machine or at a special inspection location. in some cases, the operator may be the only person who should make the inspection. in other cases, the product might pass through an inspection or test station, where inspectors or testers make further inspections. or such inspections might be made by automatic quality-control equipment and the data is automatically processed and used for adjustment of the process. if the inspection is performed at the machine, the advantages are that no transportation is necessary and the feedback on error can be done quickly. however, the worker might spend more time on inspection and less time on production. therefore, machine utilisation can decrease and the cycle time and manufacturing cost can increase. the accuracy of inspection might be low since the worker has to both operate the machine and inspect the parts. moreover, it is not economical to have testing equipment at every production process or line, if the testing equipment is expensive. the other hand, if the inspection is conducted at a special inspection location, the accuracy and capability of the inspection process are higher, as the inspection environment can be controlled. however, the cons of this alternative are higher transportation costs and higher cycle time, since the parts have to be transported and wait for inspection. 2.5 inspection equipment the capability of the inspection device (measurement accuracy) and the tolerance of the inspected characteristics are the main selection criteria. furthermore, acquisition cost, capacity and other elements are taken into account. most of the time, high capability devices are expensive and difficult to handle. as mentioned above, the inspection strategies have drastic effects on the performance of production processes in terms of production cost, cycle time, and product quality. these impacts vary from one inspection strategy to another. therefore, choosing a good inspection strategy can be a complicated decision. the inspection of a single process can influence the other processes in the production in such a way that would be hardly possible to predict the effects of different inspection strategies analytically. thus, simulation can be a powerful tool to evaluate various inspection strategies. 3 the “quinte” simulator most general-purpose simulators focus on production and material handling systems. however, they are not directly aimed at the quality area, so some features which are valuable to quality engineers are lacking. therefore, a simulator called quinte has been developed at the university of dortmund and rif e.v., germany. the simulator focuses on the inspection strategy and its interaction with the production process. 
it is designed to investigate the impact of different inspection strategies on manufacturing cost, cycle time, and product quality. the flow chart of the quinte components is shown in fig. 1.
fig. 1: components of quinte (manufacturing: make material available, waiting, setup, transport, machining, maintenance, failure; inspection: waiting, setup, inspection, calibration, failure; rework; scrap; start job; finish job)
initially, the model of the machining process is characterized by its own statistical distribution, for example by a normal distribution with a certain standard deviation σ and mean μ, depending on the current process capability. then, according to the statistical model, quinte randomly simulates the quality characteristic value of the given process. quinte models the distributions dynamically, since the process capability is not constant over time: the expected value can drift from its original value, or the deviation can increase because of failure, wear of the tool, etc. this changed distribution can be restored or improved by setting up and maintenance. the disturbance of the machining process is modelled in two ways: failure and maintenance. the obtained characteristic value denotes the actual value of a characteristic of a manufactured part. once this characteristic value is acquired, it is stored in the database and is used for the inspection simulation. the inspection process is simulated in a similar way as the manufacturing process. due to bias and precision, the value given by the inspection tool may differ from the true value. the capability of the inspection process is described by a statistical distribution, for example a normal distribution: a standard deviation σinsp and a mean μinsp are assigned to each inspection process. the mean μinsp is not a fixed value, because the inspection is used to find out what value was actually produced; thus, the machined characteristic value is used as the mean of the inspection process. the capability of the inspection process changes over time in the same way as in the manufacturing process: the expected value can drift from its original value, or the deviation can increase. the distribution can be restored or improved by setting up and calibration. fig. 2 shows a clear example of the inspection of a characteristic value xi.
fig. 2: inspection of characteristic value xi (lsl = lower specification limit, usl = upper specification limit)
quinte randomly generates the inspected value from the specified distribution. the inspected value is compared with the specification limits, and thus the decision is made whether or not the part conforms. furthermore, the occurrence of decision errors (type i and type ii errors) can also be simulated by quinte: because of the variation in inspection processes, there is a possibility that a wrong decision can be made. a type i error occurs when a good part is wrongly declared to be a bad part; a type ii error occurs when a bad part is declared to be a good part. after the decision is made, a part that is declared to be a conforming part continues on its production sequence. scrap must be sorted out, and a new job must be started to replace the scrap if needed. rework parts can be handled in two ways: they can be sent back to the preceding process or processes and the operation repeated, or the part can be repaired in a separate rework area. at the end of the simulation runs, the simulation output is used in the evaluation of the inspection strategies.
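this generate-and-inspect loop is easy to prototype outside the simulator. the following python sketch is our own illustration, not the quinte code itself; all parameter names and values (mu, sigma, drift, sigma_insp, lsl, usl) are invented for the example. it draws a drifting characteristic value, measures it with a noisy inspection, and counts the two decision-error types:

import random

def simulate(n_parts=10_000, mu=10.0, sigma=0.05, drift=1e-5,
             sigma_insp=0.02, lsl=9.9, usl=10.1):
    """minimal sketch of a quinte-like run: machining with a slowly
    drifting mean (tool wear), inspection with measurement noise, and
    a count of type i errors (good part rejected) and type ii errors
    (bad part accepted). all parameters are illustrative."""
    type_i = type_ii = 0
    for i in range(n_parts):
        # machining: the true characteristic value of the part
        true_value = random.gauss(mu + drift * i, sigma)
        conforming = lsl <= true_value <= usl
        # inspection: the measured value scatters around the true value
        measured = random.gauss(true_value, sigma_insp)
        accepted = lsl <= measured <= usl
        if conforming and not accepted:
            type_i += 1          # good part declared bad
        elif not conforming and accepted:
            type_ii += 1         # bad part declared good
    return type_i, type_ii

print(simulate())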
4 industrial application the application of quinte in industry was tested in collaboration with two pilot companies [6]. the results from the simulator are presented in this section. however, due to the companies’ confidentiality policies, the names of the companies are not given, and the results presented here are not absolute values but only relative values. 4.1 example 1 this pilot company is a manufacturer of mobile hydraulic products. nine different products from 19 manufacturing stations, 8 inspection stations, and 1 manufacturing/inspection tool store were chosen for the simulation experiments. each product had different production sequences and different quality characteristics. in the simulation modelling, 268 inspection tools were considered. six sampling inspections were defined in the modelling: five of them were based on din en iso 2859, and one sampling inspection was done on every 50th part. the transportation time between stations was assumed to be 100 seconds. combinations of these inspection aspects were formed into inspection strategies, and ten of these strategies were chosen for the simulation experiments. the results of the simulation experiments are shown in figs. 3 and 4. however, these results cannot be compared with the current situation at the company, since the company does not implement any solid inspection strategy; the inspection is done from the operators’ experience. the results show that there is no significant difference between the strategies in terms of manufacturing cost and rework cost.
fig. 3: simulation results of company 1 (1): manufacturing cost, inspection cost and rework cost, and the amounts of good parts, scrap and outgoing defects (in relative %) for products 1–9 under inspection strategies 1–10
however, clear differences appear in quality (amount of scrap) and cycle time. the best inspection strategy is alternative no. 4, since it gives the best quality and cycle time. from this result, the company can see where the potential improvements are. 4.2 example 2 the manufacture of stub shafts in the second company was chosen as the area of study. three types of products (products 10, 19, and 20) were taken into consideration. the chosen part of the factory consisted of six manufacturing stations, one inspection room, and two manufacturing/inspection tool stores. the transportation time between manufacturing stations was assumed to be 20 seconds, while the transportation time between a manufacturing station and an inspection station or the tool store was 300 seconds. in the simulation modelling, 155 inspection tools were considered. they were located at one of the following places: the manufacturing station, the inspection room, or the tool store. six sampling inspections were defined in the modelling: four of them were based on din en iso 2859, and two sampling inspections were done on every 50th and every 150th part, respectively. with the same procedure as for example 1, ten inspection strategies were selected for the simulation experiments, plus three further inspection strategies, namely: 1) the current strategy of the company, 2) no inspection at all, and 3) 100% inspection after each process. the result for cycle time is shown in fig. 5. it is obvious that the no-inspection alternative (no. 11) gives the lowest cycle time, while 100% inspection gives the highest cycle time.
the cycle time for the 100% inspection alternative (no. 13) is not presented in the figure, due to the extreme bar height in the diagram. as far as costs are concerned, there are no apparent differences in manufacturing cost and inspection cost. however, the current strategy gives a higher rework cost and a higher amount of scrap than the ten other selected strategies. the reason is that in the current strategy three characteristics were 100% inspected and the accuracy of the chosen inspection tool was not adequate, while in the other alternatives sampling inspections with the same inspection tool were done. for this reason, there are more inspection errors in the current strategy than in the others. aside from this, the currently used strategy at this company is still among the best strategies in terms of overall performance. the current strategy can be improved by replacing the bad parameters with good parameters from other analysed strategies, e.g. the inspections of some characteristics can be changed to sampling inspection.
fig. 4: simulation results of company 1 (2): cycle time (in relative %) for products 1–9 under inspection strategies 1–10
fig. 5: simulation results of company 2: cycle time (in relative %) for products 10, 19 and 20 under inspection strategies 1–12 (strategies 1–10: the selected strategies; strategy 11: no inspection; strategy 12: the actual strategy)
5 quinte in the new environment the quinte simulator was originally built in the c/c++ language, and it was successfully tested in industry. however, there are some minor drawbacks in quinte; for example, there is no animation of the simulation, only a limited set of process distributions can be modelled, and it is difficult to modify or enhance components or functions in quinte without a strong knowledge of c/c++. in order to overcome these shortcomings, quinte was transferred into a commercial simulation software package called arena®. with this new quinte environment, the user can create and animate the processes, use arena’s statistical analyzer, and use other user-friendly arena functions. moreover, quinte can easily be modified or enhanced by the developer without a need for programming, since arena is very easy to use with its point-and-click interface and fill-in-the-blank dialog boxes. most of quinte’s functions are now placed in an arena template, as shown in fig. 6. the quinte template consists of 9 modules. the first module is the “general process” module. this module is designed for any process in production, such as a manufacturing or inspection process. the user can assign the process name, processing time, cost allocation, and information about failure and maintenance. there are two modules for the quality characteristic assignment: the “attribute characteristic” and “variable characteristic” modules. the information about a characteristic, its distribution or conformity rate, and the specifications can be defined here. as described earlier, quinte can model these distributions dynamically; the deviation of the distribution can be assigned under the “distribution change…” dialogue box (fig. 6). the information about inspection equipment can be added in the “inspection – attribute” or “inspection – variable” module, according to the type of quality characteristic that the equipment can measure. in these modules, the accuracy of the inspection equipment must be defined. the user also has the possibility to add a change of the inspection accuracy over time. the last four “sampling” modules are used to insert a sampling inspection into the model. the two options for sampling inspection are continuous and lot-by-lot. the continuous sampling inspection module is based on dodge’s aoql plan for continuous production (csp-1) [7]. this plan is aimed at continuous production, where the formation of inspection lots for lot-by-lot acceptance would be impractical and costly. the lot-by-lot sampling inspection module was built according to a single sampling scheme.
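the csp-1 logic can be sketched compactly. the python fragment below is only our own illustration of dodge’s rule from [7] (100% inspection until i consecutive good units are found, then random inspection of a fraction f of the units, falling back to 100% inspection as soon as a defect is found); the parameter names i and f follow the plan, while the stream of parts is simulated here:

import random

def csp1(parts, i=50, f=0.1):
    """sketch of dodge's csp-1 continuous sampling plan [7]:
    'parts' is an iterable of booleans (True = conforming part).
    returns the fraction of units that were inspected."""
    screening = True      # start with 100% inspection
    run = 0               # consecutive conforming units found so far
    inspected = total = 0
    for ok in parts:
        total += 1
        if screening:
            inspected += 1
            run = run + 1 if ok else 0
            if run >= i:
                screening = False          # switch to sampling
        elif random.random() < f:          # inspect a fraction f
            inspected += 1
            if not ok:
                screening, run = True, 0   # defect found: back to 100%
    return inspected / total

stream = [random.random() > 0.02 for _ in range(100_000)]  # 2% defectives
print(csp1(stream))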
the simulation model can be easily built with this quinte template and other arena templates. the user can build the model by dragging and dropping the required modules into the modelling space, filling in the necessary information, and connecting the modules together in the same order as the process flow. a simple example of how to model in arena is illustrated in fig. 7. the example shows the manufacture and inspection of a shaft. shafts enter the simulation model through the “create” module and pass through a turning machine, which produces the diameter as a quality characteristic. after the characteristic is produced, the shafts are sampled. those which are not to be inspected go to the next production step, which in this example is the disposal step. those which must be inspected are sent to the diameter inspection process and then go on to the next step. the usability of the quinte template has been preliminarily tested on some examples, and the results show the validity of this new version of quinte.
fig. 6: quinte template in arena®
fig. 7: modelling example
6 conclusion the quinte simulator, which simulates the manufacturing process and its inspection, was successfully developed and validated. the application in two pilot companies proves that this simulator can be used to investigate and evaluate different inspection strategies. the results from quinte enable the company to choose the most suitable inspection strategy according to the company’s goals. the simulator can also support the management in justifying an investment in inspection equipment or in a manufacturing process, for example by illustrating the consequences of changes in the uncertainty of inspection equipment. furthermore, the simulator was reconstructed in the arena simulation software. this new environment makes it easier for users to build the model and gain additional advantages from arena’s functions. although quinte is a beneficial simulator, some potential improvements can still be made.
this tool can be applied extensively in other areas, such as assembly processes and commissioning processes, although the main concentration of quinte is currently only on manufacturing processes. references [1] tannock, j. d. t.: “choice of inspection strategy using quality simulation.” international journal of quality & reliability management, vol. 12 (1995), no. 5, p. 75–84. [2] tannock, j. d. t.: “an economic comparison of inspection and control charting using simulation.” international journal of quality & reliability management, vol. 14 (1997), no. 7, p. 687–699. [3] crostack, h.-a., heinz, k., nürnberg, m., nusswald, m.: “evaluating inspection strategies by simulation.” manufacturing systems, vol. 29 (1999), no. 5, p. 421–425. [4] crostack, h.-a., mayer, m., höfling, m.: “optimization of inspection and test planning – report on a german research project.” 17th international conference on production research, no. 67, blacksburg, virginia, usa, 2003. [5] pfeifer, t.: production metrology. oldenbourg verlag, münchen/wien, 2002. [6] crostack, h.-a., hermes, a., höfling, m., zielke, r., heinz, k., grünz, l., mayer, m.: “quinte+ – optimierung der prüfplanung nach kosten und durchlaufzeit mit hilfe der simulation.” fqs-dgq-band nr. 84-04. frankfurt: fqs – forschungsgemeinschaft qualität e.v., 2004. [7] dodge, h. f.: “a sampling inspection plan for continuous production.” the annals of mathematical statistics, vol. 14, september 1943, p. 264–279. prof. dr.-ing. h.-a. crostack phone: +492 319 700 101 fax: +492 319 700 460 email: sekretariat@rif.fuedo.de dipl.-ing. m. höfling phone: +492 319 700 126 fax: +492 319 700 460 email: mhoef@rif.fuedo.de j. liangsiri, m.sc. phone: +492 319 700 107 fax: +492 319 700 460 email: jliangsi@rif.fuedo.de dortmunder initiative zur rechnerintegrierten fertigung e.v. (rif) joseph-von-fraunhoferstr. 20 44227 dortmund, germany
trivial logic arrays j. bokr, v. jáneš this paper deals with matrix modelling of trivial logic arrays (pla, pal, rom) and with the design of the above arrays as structural models of static and dynamic logic objects. keywords: cartesian product of matrices, pla, pal, rom, canonical decomposition, state coding, state coding of miller or liu, substitute input variable. 1 introduction the trivial logic arrays dealt with here [1, 2] (fig. 1) are:
– pla (programmable logic array), with programmable input and output matrices,
– pal (programmable array logic), with a programmable input matrix and a given output matrix,
– rom (read only memory), with a given input matrix (given through an address decoder) and a programmable output matrix.
fig. 1: trivial logic arrays: a) block diagram, b) labelling; examples of logic arrays: c) pla, d) pal, e) rom, where × and ● mean the programmable and given positions, respectively (the output functions realised in the example arrays c), d) and e) are given in the figure)
the input matrix consists of either conjunctors (& – and) or inverse disjunctors (nand); the output matrix consists of either disjunctors (1 – or) or inverse disjunctors (nor). the paper deals with matrix analysis and synthesis of trivial logic arrays.
2 cartesian product of boolean matrices
the cartesian product m (op)(op) m′ of boolean matrices
m: {1, 2, …, q} × {1, 2, …, h} → {0, 1}: (i, j) ↦ m_ij,
m′: {1, 2, …, h} × {1, 2, …, p} → {0, 1}: (j, k) ↦ m′_jk
denotes the boolean matrix
m (op)(op) m′: {1, 2, …, q} × {1, 2, …, p} → {0, 1}: (i, k) ↦ μ_ik,
where μ_ik = (op)_{j=1..h} (m_ij (op) m′_jk), and (op) stands for a boolean operator.
example 1: let a system {y1, y2} of boolean functions y1, y2: {0, 1}⁴ → {0, 1} be given by its truth table; write the system {y1, y2} as matrices {cndf (y1), cndf (y2)}, where cndf denotes a canonical normal disjunctive formula. note that (x ≡ 0) = x̄ and (x ≡ 1) = x. taking the cartesian (&, ∨)-product of the matrix of the minterms x1^σ1 x2^σ2 x3^σ3 x4^σ4 with the 0/1 truth-table columns of y1 and y2 reproduces each yj as the disjunction of exactly those minterms at which yj = 1.
3 structural model of a static logic object
let φj: {0, 1}^m → {0, 1}: (x1, x2, …, xm) ↦ yj be a boolean function, and let the literal x^σ, σ ∈ {0, 1}, be defined by x⁰ = x̄ and x¹ = x. note that the function φj can be expressed as
cndf (φj) = ∨_{(σ1, …, σm)} φj(σ1, σ2, …, σm) x1^σ1 x2^σ2 … xm^σm,
or as an ndf (a normal disjunctive formula)
ndf (φj) = ∨_i k_ij,
where k_ij is a normal conjunct on {x1, x̄1, x2, x̄2, …, xm, x̄m}, or finally expressed as mndf (φj) (a minimal ndf). let the symbol k denote the set of conjuncts k_ij from the ndf (φj).
let ({0, 1}^m, {0, 1}^n, φ) be a finite-automaton model of a binary static logic object, where φ is the vector output function
φ: {0, 1}^m → {0, 1}^n: (x1, x2, …, xm) ↦ (y1, y2, …, yn),
represented by the system {φj}_{j=1..n} of component output functions φj: {0, 1}^m → {0, 1}: (x1, …, xm) ↦ yj.
let min (mout) be the input (output) matrix of the given static object, i.e.
min: {1, 2, …, 2m} × k → {0, 1}, with the entry for (s, k_ij) equal to 0 if the literal x_s^σ does not occur in the conjunct k_ij, and equal to 1 if it does;
mout: k × {1, 2, …, n} → {0, 1}, with the entry for (k_ij, j) equal to 0 if k_ij ∉ ndf (φj), and equal to 1 if k_ij ∈ ndf (φj).
if x = [x1, x̄1, x2, x̄2, …, xm, x̄m]^t denotes the vector of literals, then for the trivial logic array m the following is valid:
[y1, y2, …, yn] = (min^t (& &) x)^t (∨ &) mout,
where the inner product evaluates each conjunct on the literal vector x (the r-th component equals the conjunction of the literals x_s with entry 1 in column r of min), and the outer product collects the conjuncts of each output; a dual form with inverse disjunctors (nand, nor) holds as well.
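as an illustration of these matrix products, the following python fragment (our own sketch, not taken from [1, 2]; all names are illustrative) evaluates a pla: the columns of m_in select the literals of each conjunct (and-plane), and the columns of m_out collect the conjuncts of each output (or-plane):

def pla(m_in, m_out, xs):
    """evaluate a pla given as boolean matrices.
    m_in : 2m x k matrix; m_in[s][r] = 1 iff literal s occurs in conjunct r
           (literals ordered x1, not-x1, x2, not-x2, ...).
    m_out: k x n matrix; m_out[r][j] = 1 iff conjunct r occurs in output j.
    xs   : the m input values (0/1)."""
    # the literal vector x = [x1, not-x1, x2, not-x2, ...]
    lits = [v for x in xs for v in (x, 1 - x)]
    k = len(m_out)
    # and-plane: conjunct r is 1 iff every selected literal is 1
    terms = [all(lits[s] for s in range(len(lits)) if m_in[s][r])
             for r in range(k)]
    # or-plane: output j is 1 iff some selected conjunct is 1
    return [int(any(terms[r] for r in range(k) if m_out[r][j]))
            for j in range(len(m_out[0]))]

# y1 = x1 x2 v not-x1 not-x2 (equivalence), a small two-variable example
m_in = [[1, 0],   # x1 in conjunct 0
        [0, 1],   # not-x1 in conjunct 1
        [1, 0],   # x2 in conjunct 0
        [0, 1]]   # not-x2 in conjunct 1
m_out = [[1], [1]]
print([pla(m_in, m_out, (a, b)) for a in (0, 1) for b in (0, 1)])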
the total capacity (area) c(m) (bit) of a trivial logic array m can be written as
c(m) = c(min) + c(mout) = (2m + n) |k|.
example 2: design a pla (min, mout) modelling a given system of output functions y1(x1, x2, x3) and y2(x1, x2, x3), and determine its capacity. the conjuncts of their ndf’s form the columns of the input matrix min and the rows of the output matrix mout; fig. 2 can then be constructed, and the evaluation (min^t (& &) x)^t (∨ &) mout reproduces [y1, y2]. the capacity is c(m) = 2·4·6 + 3·6 = 66 (bit).
fig. 2: pla from example 2
the starting point for designing the min and mout matrices of a trivial logic array is a system of tables {φj}_{j=1..n} (rom) or a compressed form [2] of it (pla, pal), including the corresponding record of the cndf ({φj}) of the group function or the ndf ({φj}) of the component functions φj.
example 3: the following system of functions, given by the full table
x1 x2 x3 | y1 y2 y3
0 0 0 | 1 1 0
0 0 1 | 1 0 0
0 1 0 | 1 1 0
0 1 1 | 1 0 0
1 0 0 | – – –
1 0 1 | – 0 1
1 1 0 | 0 0 0
1 1 1 | – 0 1
or by the compressed table
x1 x2 x3 | y1 y2 y3
0 – 0 | 1 1 0
0 – 1 | 1 0 0
1 1 0 | 0 0 0
1 – 1 | – 0 1
can also be written)* with the group cndf or ndf:
cndf (y1, y2, y3) = x̄1x̄2x̄3 (y1y2) ∨ x̄1x̄2x3 (y1) ∨ x̄1x2x̄3 (y1y2) ∨ x̄1x2x3 (y1) ∨ x1x̄2x3 (y3) ∨ x1x2x3 (y3)
or
ndf (y1, y2, y3) = x̄1x̄3 (y1y2) ∨ x̄1x3 (y1) ∨ x1x3 (y3).
)* note that the undetermined values of function arguments need to be interpreted as both zeroes and ones, whereas the undetermined values of functions are very difficult, uninteresting, or even impossible to determine, and their definition is motivated by the maximal rate of their utility.
if the system {φj}_{j=1..n} is also given by the system {cndf (yj)}_{j=1..n}, then it has to be decided whether to realise the rom by means of the cndf (yj) array or the cndf (ȳj) array. if, therefore, for the hamming weight wh(yj) of the function φj, wh(yj) ≤ 2^m/2 holds, then we naturally work with cndf (yj), and otherwise with cndf (ȳj). since cndf (yj) or cndf (ȳj) are very complicated systems in practice – the systems {φj} consist of tens or hundreds of functions depending on tens or hundreds of arguments – the classic minimisation procedures are not applicable to cndf (yj) or cndf (ȳj). with advantage, however, the quine–mccluskey method of minimising the systems {cndf (yj)}_{j=1..n} can be used, under the condition that the undetermined values of the functions φj of the system {φj}_{j=1..n} are suitably defined and the covering table [2] is not constructed. in this way we obtain a subminimal group ndf (y1, y2, …, yn) – see example 4.
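the merging step of the quine–mccluskey method can be sketched in a few lines. the fragment below is only our own illustration (implicant strings over '0', '1', '-' are our own encoding), showing how two implicants that differ in exactly one defined position are combined:

def merge(t1, t2):
    """combine two implicants written as strings over '0', '1', '-'
    (e.g. '1-0'); they merge iff they agree everywhere except in
    exactly one position where both are defined."""
    diff = [i for i, (a, b) in enumerate(zip(t1, t2)) if a != b]
    if len(diff) == 1 and '-' not in (t1[diff[0]] + t2[diff[0]]):
        i = diff[0]
        return t1[:i] + '-' + t1[i + 1:]
    return None

print(merge('000', '001'))  # '00-'
print(merge('00-', '01-'))  # '0--'
print(merge('000', '011'))  # None (two positions differ)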
the so-called ‘harvard’ approach to the systems {φj}_{j=1..n} [3, 4] can be applied with advantage in designing logic arrays. let us deal, without loss of generality, only with the system {φ1(x1, …, xm), φ2(x1, …, xm)} and regard one of the functions as an argument of the other function; this, of course, affects the complexity of the conceived logic array. for instance, let us write
y2 = φ́2(y1, x1, x2, …, xm) = φ́2(φ1(x1, …, xm), x1, …, xm).
example 4: let y1 = 00010111 and y2 = 00010110 (the value vectors over the eight input combinations of x1x2x3), that is
y1 = x1(x2 ∨ x3) ∨ x2x3 and y2 = x̄1x2x3 ∨ x1x̄2x3 ∨ x1x2x̄3,
or also y2 = y1 · (x1x2x3)‾, since y2 agrees with y1 everywhere except on the combination x1x2x3 = 111.
example 5: let us consider the system {φj}_{j=1..3} from example 3. then wh(y1) = 4 and wh(y2) = wh(y3) = 2, i.e.
y1 = x̄1x̄2x̄3 ∨ x̄1x̄2x3 ∨ x̄1x2x̄3 ∨ x̄1x2x3,
y2 = x̄1x̄2x̄3 ∨ x̄1x2x̄3,
y3 = x1x̄2x3 ∨ x1x2x3,
since 2 ≤ 2³/2, and so y1 = x̄1, y2 = y1 x̄3, y3 = ȳ1 x3.
example 6: let a static logic object be given by means of a product (input) matrix min over the literals x1, x̄1, …, x5, x̄5 and eleven conjuncts, and a sum (output) matrix mout over the outputs y1, …, y7; the columns of the two matrices are summarised row by row in the following table of stimuli and responses:
row | stimuli | responses
1 | {x1, x2} | {y2}
2 | {x1} | {y3}
3 | {x4, x5} | {y3, y4}
4 | {x1} | {y1}
5 | {x1, x3} | {y1, y2, y3}
6 | {x5} | {y5, y7}
7 | {x1, x3} | {y2}
8 | {x1, x2} | {y1, y3}
9 | {x5} | {y4, y5}
10 | {x3, x4, x5} | {y1, y3, y4}
11 | {x4} | {y4, y6}
let the rows of the input (output) matrix be compatible if they contain at least one common input literal (one common output function). since the relation of compatibility of rows is a relation of tolerance, it defines on the set of rows of the matrix the covering ⟨c_x⟩max(i) ∪ ⟨c_y⟩max(i) with all the maximal classes:
⟨c_x⟩max(i) = {{1, 2, 4, 5, 7, 8}, {3, 6, 9, 10}, {3, 10, 11}, {5, 7, 10}},
⟨c_y⟩max(i) = {{2, 3, 5, 8, 10}, {4, 5, 8, 10}, {3, 9, 10, 11}, {5, 7}, {6, 9}},
since the rows of the input (output) matrix correspond to the sets of input literals (output functions):
according to ⟨c_x⟩max(i) ∪ ⟨c_y⟩max(i) we get the covering on the set of input variables (output functions) with all the maximal classes:
⟨c_x⟩max = {{x1, x2, x3}, {x3, x4, x5}} on {xj}_{j=1..5},
⟨c_y⟩max = {{y1′, y2, y3′}, {y1″, y3″, y4, y5, y6, y7}} on {yk}_{k=1..7},
i.e. we obviously obtain sparse matrices min (with the conjunct columns reordered as 2, 4, 7, 5, 1, 8, 10, 3, 11, 9, 6) and mout, represented by the submatrices min1 (over the literals x1, x̄1, x2, x̄2, x3, x̄3 and the conjuncts 2, 4, 7, 5, 1, 8) with mout1 (over y1′, y2, y3′), and min2 (over the literals x3, x̄3, x4, x̄4, x5, x̄5 and the conjuncts 10, 3, 11, 9, 6) with mout2 (over y1″, y3″, y4, y5, y6, y7), where y1 = y1′ ∨ y1″ and y3 = y3′ ∨ y3″. this leads to a saving in the capacity of the matrices min1, min2 (mout1, mout2) compared to the capacity of min (mout).
4 structural model of a dynamic logic object
let us consider the trivial block diagram of the canonical composition of a dynamic logic object in fig. 4, where the substitutor [5] is a parallel register consisting of “memory” modules mk (fig. 3), modelled by finite automata ({0, 1}, {0, 1}², δk), where the δk are the transition functions
δk: {0, 1} × {0, 1}² → {0, 1}: (qk, (e1k, e2k)) ↦ q́k,
qk and q́k being the affiliate state and the substitute state, respectively. let m denote a set and m ∈ m its element.
fig. 3: “memory” module mk
the finite-automaton model of the given binary dynamic object is the ordered tuple ({0, 1}^m, {0, 1}^p, {0, 1}^n, δ, λ), where the vector transition function δ and the vector output function λ are
δ: {0, 1}^p × {0, 1}^m → {0, 1}^p: ((q1, …, qp), (x1, …, xm)) ↦ (q́1, …, q́p),
λ: {0, 1}^p × {0, 1}^m → {0, 1}^n: ((q1, …, qp), (x1, …, xm)) ↦ (y1, …, yn),
having in mind that the vector exciting function e sought for,
e: {0, 1}^p × {0, 1}^m → {0, 1}^{2p}: ((q1, …, qp), (x1, …, xm)) ↦ (e11, e21, e12, e22, …, e1p, e2p),
is represented by the system {e1k, e2k}_{k=1..p} of component exciting (driving) functions, found according to the functions δ and δk, while λ is represented by the system {ψj}_{j=1..n} of component output functions ψj: {0, 1}^p × {0, 1}^m → {0, 1}.
let (min, mout = (mout ext, mout int)) be the input and output (external and internal) matrices of the given dynamic object: min is defined over the 2(m + p) literals x1, x̄1, …, xm, x̄m, q1, q̄1, …, qp, q̄p and the conjuncts k_ij, with an entry 1 iff the literal occurs in the conjunct; mout ext has an entry 1 iff k_ij ∈ ndf (ψj); and mout int has an entry 1 iff k_ij occurs in the ndf of the corresponding exciting function e1k or e2k.
if x = [x1, x̄1, …, xm, x̄m]^t and q = [q1, q̄1, …, qp, q̄p]^t denote the vectors of input and state literals, then for the logic array m of the given dynamic object there holds
[y1, y2, …, yn] = (min^t (& &) (x, q))^t (∨ &) mout ext,
[e11, e21, e12, e22, …, e1p, e2p] = (min^t (& &) (x, q))^t (∨ &) mout int,
together with the dual nand/nor forms.
it can be recommended to use the reduced exciting matrices of the “memory” modules [2, 5], i.e. matrices with the actual state transitions only (except for the delay element and the d flip-flop circuit):
qk → q́k | rk sk | dk | jk kk | tk
0 → 0 | – 0 | 0 | 0 – | 0
0 → 1 | 0 1 | 1 | 1 – | 1
1 → 0 | 1 0 | 0 | – 1 | 1
1 → 1 | 0 – | 1 | – 0 | 0
fig. 4: block diagram of the canonical composition on a logic array
the total capacity (area) c(m) (bit) of the logic array m is expressed as
c(m) = c(min) + c(mout ext) + c(mout int) = (2(m + p) + n + 2p) |k|.
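the reduced excitation tables above can be expressed compactly in code. the sketch below is our own illustration (with '-' standing for a don't-care position); it returns the rs, jk, d and t excitations for a given state transition:

def excitation(q, q_next):
    """map a state transition q -> q_next (bits) to the standard
    flip-flop excitations; '-' marks a don't-care."""
    table = {
        (0, 0): {'r': '-', 's': '0', 'j': '0', 'k': '-', 'd': '0', 't': '0'},
        (0, 1): {'r': '0', 's': '1', 'j': '1', 'k': '-', 'd': '1', 't': '1'},
        (1, 0): {'r': '1', 's': '0', 'j': '-', 'k': '1', 'd': '0', 't': '1'},
        (1, 1): {'r': '0', 's': '-', 'j': '-', 'k': '0', 'd': '1', 't': '0'},
    }
    return table[(q, q_next)]

print(excitation(0, 1))  # set: s=1, r=0; j=1; d=1; t=1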
example 7: design a pla by means of an operating table (table 1a), using a synchronous rs flip-flop as the binary substitutor. it is suitable to modify the operating table and to add the excitations of the rs flip-flops (table 1b).
table 1: a) the operating table, b) the modified operating table from example 7
a) (entries are q́1q́2/y)
q1q2 \ x | 0 | 1
0 0 | 01/1 | 00/0
0 1 | 01/1 | 10/0
1 0 | 00/0 | 01/1
b)
q1 q2 x | q́1 q́2 | y | rk, sk
0 0 0 | 0 1 | 1 | s2
0 0 1 | 0 0 | 0 | –
0 1 0 | 0 1 | 1 | –
0 1 1 | 1 0 | 0 | s1, r2
1 0 0 | 0 0 | 0 | r1
1 0 1 | 0 1 | 1 | r1, s2
according to table 1b) we obtain
y = q̄1q̄2x̄ ∨ q̄1q2x̄ ∨ q1q̄2x
as well as
r1 = q1q̄2, r2 = q̄1q2x, s1 = q̄1q2x, s2 = q̄1q̄2x̄ ∨ q1q̄2x.
hence min is built over the literals x, x̄, q1, q̄1, q2, q̄2 and the five conjuncts q̄1q̄2x̄, q̄1q2x̄, q̄1q2x, q1q̄2x, q1q̄2; mout ext is the column y = (1, 1, 0, 1, 0)^t over these conjuncts, and mout int collects the columns r1, s1, r2, s2 in the same way. determine the capacity of the pla: c(m) = (6 + 1 + 4)·5 = 55 bits.
since the number of logic-array input ports is usually markedly higher than the number of rows of the input and exciting columns, the area of the input matrix min is unnecessarily large. therefore, we replace the input variables by suitable substitutes.
example 8: given a synchronous logic object by a flow table (table 2a) and an operating table (table 2b), create a table with a minimal number of columns, denoted by the substitution variables si (i = 1, 2, 3) – table 2c [6]. hence
s1 = a(p1 ∨ p2) ∨ d(p3 ∨ p5), s2 = c(p4 ∨ p6) ∨ e p1 ∨ b p2, s3 = c,
where pq: {1, …, 6} → {0, 1}: pq = 0 if the object is not in the state q, and pq = 1 in the opposite case. the “substitution” variables define on the state alphabet {q}_{q=1..6} the partition {{1, 2}, {3, 5}, {4, 6}} (permutations of the row elements of table 2c can also be helpful).
table 2: a) flow table, b) operating table, c) table of substitution variables from example 8
a) (next states: 1: a → 2, c → 5, e → 6; 2: a → 2, b → 4, d → –; 3: b → –, d → 5; 4: c → 3; 5: a → –, d → 4; 6: c → 5, e → –)
b) (the operating table lists, for each state code q1q2q3 and substitute stimulus si, the outputs y1y2, the next-state code q́1q́2q́3 and the d excitations dk)
c)
q | s1 | s2 | s3
1 | a | e | c
2 | a | b | –
3 | d | – | –
4 | – | c | –
5 | d | – | –
6 | – | c | –
if miller’s state coding [7], transferred to synchronous logic objects [2], is used, and if the elementary substitutor is a d flip-flop, we obtain for the state weights w(1) = 0, w(3) = w(6) = 1, w(2) = w(4) = 2, w(5) = 3 (w(q) being determined by the characteristic function of the set of transitions q → q́) the state binary code
1 ↦ 101, 2 ↦ 001, 3 ↦ 100, 4 ↦ 010, 5 ↦ 000, 6 ↦ 011,
which is a reasonable compromise. note, in addition, that the states of one and the same class {1, 2}, {3, 5}, {4, 6} of the partition are to be coded also with respect to neighbouring coding, where a heuristic procedure leads to a reduction of the number of rows of the array matrices, i.e.:
s1 = q̄2q3 a ∨ q̄2q̄3 d, s3 = c, s2 = q̄1q2 c ∨ q1q̄2q3 e ∨ q̄1q̄2q3 b.
let the input binary code be a ↦ 000, b ↦ 001, c ↦ 010, d ↦ 011, e ↦ 100 ({a, b, c, d, e} → {0, 1}³), i.e.
s1 = x̄1x̄2x̄3 q̄2q3 ∨ x̄1x2x3 q̄2q̄3,
s2 = x̄1x2x̄3 q̄1q2 ∨ x1x̄2x̄3 q1q̄2q3 ∨ x̄1x̄2x3 q̄1q̄2q3,
and hence the pair (s1, s2) is realised, as above, by the input matrix min over the literals x1, x̄1, …, x3, x̄3, q1, q̄1, …, q3, q̄3 and these five conjuncts.
example 9: given an asynchronous logic object by a transition table (table 3a) and an operating table (table 3b), create a table with a minimal number of columns, denoted by the “substitute” variables τi (i = 1, 2, 3) – table 3c. since τ1 = a, τ2 = b and τ3 = c, let us discuss just the parallel coding of the states {q}_{q=1..6} in {0, 1}⁵ according to liu [2]. in the case of the single stimuli a, b, c, the noncritical pairs of state transitions on the state alphabet {q}_{q=1..6} define the corresponding coverings of the state set. hence, for the separation of the state components [2], we obtain a state code (table 3d), again with an obvious effort to achieve the least possible number (table 3e) of state components qj (j = 1, 2, 3) and the least possible exciting weights wh(dj).
let us code the responses according to miller again; this can also be used with advantage in coding the reactions of asynchronous objects. the four responses α, β, γ, δ are coded by the pairs {0, 1}²: 00, 01, 10, 11, since the response weights are w(α) = 5, w(β) = 4, w(γ) = 3 and w(δ) = 1. for the excitations and the reactions we can then write d1, d2, d3 and the coded responses y1, y2 as ndf’s over the state literals q1, q̄1, …, q3, q̄3 and the substitute stimuli s1, s2.
hence [y1, y2, d1, d2, d3] is realised, as before, by a logic array,
[y1, y2, d1, d2, d3] = (min^t (& &) (q, s))^t (∨ &) (mout ext, mout int),
with min built over the literals q1, q̄1, q2, q̄2, q3, q̄3 and s1, s̄1, s2, s̄2, mout ext collecting the columns y1, y2 and mout int the columns d1, d2, d3.
table 3: a) transition table, b) operating table, c) table of input “substitute” variables, d) state code table, e) state code table with a minimum of state code words from example 9
a)
q | a | b | c
1 | 1 | 4 | 2
2 | 1 | 3 | 2
3 | 1 | 3 | 3
4 | 4 | 4 | 3
5 | 4 | 6 | 5
b) (the operating table lists, for each state code and input code x1x2, the next-state code and the d excitations dk)
c) (each state row lists the substitute stimuli τ1 = a, τ2 = b, τ3 = c)
d)
q | q1 q2 q3 q4 q5
1 | 0 0 0 0 0
2 | 0 0 1 0 0
3 | 0 0 1 0 1
4 | 0 0 0 0 1
5 | 1 1 – 1 –
6 | 1 1 – 1 –
e)
q | q1 q2 q3
1 | 0 0 0
2 | 0 1 0
3 | 0 1 1
4 | 0 0 1
5 | 1 0 0
6 | 1 1 0
5 conclusions
in [8] we find exact, theoretically demanding and quite elaborate optimization algorithms, both for the state coding (miller’s economic coding and liu’s parallel coding) and for the number of rows of the matrices of the matrix structural model of a dynamic logic object. if the number of states of the dynamic object is large, it is suitable to apply a simple block decomposition [2, 9] and to design the structural model as a composition of the matrix structural models of the individual blocks.
references
[1] liebig, h., thome, s.: logischer entwurf digitaler systeme.
berlin–tokyo: springer, 1996. [2] bokr, j., jáneš, v.: logické systémy. praha: vydavatelství čvut, 1999. [3] pospelov, d. a.: logičeskie metody analiza schem. moskva: energija, 1974. [4] šalyto, a. a.: logičeskoe upravlenie. metody apparatnoj i programmnoj realizaciji algoritmov. sankt peterburg, 2000. [5] bokr, j.: kánonická dekompozice. acta electrotechnica et informatica, vol. 4 (2004), no. 1, p. 60–65. [6] baranov, s. i., sinev, v. n.: avtomaty i programmiruemye matricy. minsk: vyšejšaja škola, 1980. [7] miller, r. e.: “switching theory”. sequential circuits and machines, vol. ii (translated into russian). moskva: nauka, 1971. [8] ačasova, s. m.: algoritmy sinteza avtomatov na programmirujemych matricach. moskva: radio i svjaz, 1987. [9] bokr, j.: “prostá bloková dekompozice”. slaboproudý obzor, sv. 52 (1991), č. 3–4, p. 80–83. doc. ing. josef bokr, csc. e-mail: bokr@kiv.zcu.cz department of information and computer science university of west bohemia faculty of applied sciences universitní 22 306 14 pilsen, czech republic doc. ing. vlastimil jáneš, csc. phone: +420 224 357 289 e-mail: janes@fel.cvut.cz department of computer science and engineering czech technical university in prague faculty of electrical engineering karlovo nám. 13 121 35 praha 2, czech republic
acta polytechnica https://doi.org/10.14311/ap.2022.62.0085 acta polytechnica 62(1):85–89, 2022 © 2022 the author(s). licensed under a cc-by 4.0 licence published by the czech technical university in prague maxwell-chern-simons-higgs theory usha kulshreshthaa,∗, daya shankar kulshreshthab, bheemraj sihagb a university of delhi, kirori mal college, department of physics, delhi-110007, india b university of delhi, department of physics and astrophysics, delhi-110007, india ∗ corresponding author: ushakulsh@gmail.com abstract. we consider the three-dimensional electrodynamics described by a complex scalar field coupled with the u(1) gauge field in the presence of a maxwell term, a chern-simons term and the higgs potential. the chern-simons term provides a velocity-dependent gauge potential, and the presence of the maxwell term makes the u(1) gauge field dynamical. we study the hamiltonian formulation of this maxwell-chern-simons-higgs theory under appropriate gauge-fixing conditions. keywords: electrodynamics, higgs theories, chern-simons-higgs theories, hamiltonian formulations, gauge theories. 1. introduction we study the hamiltonian formulation [1] of the three-dimensional (3d) electrodynamics [2–22] involving a maxwell term [20], a chern-simons (cs) term [19, 21, 22], and a term that describes a coupling of the u(1) gauge field with a complex scalar field in the presence of a higgs potential [22]. such theories in two-space one-time dimensions ((2+1)d) can describe particles that satisfy fractional statistics, and they are referred to as the relativistic field-theoretic models of anyons and of anyonic superconductivity [21, 22]. a remarkable property of the cs action [21, 22] is that it depends only on the antisymmetric tensor ϵµνλ and not on the metric tensor gµν. as a result, the cs action in flat spacetime and in curved spacetime remains the same [21, 22]. hence the cs action, in both the abelian and the non-abelian cases, represents an example of a topological field theory [21, 22].
the systems in two-space one-time dimensions ((2+1)d), i.e. the planar systems, display a variety of peculiar quantum mechanical phenomena, ranging from massive gauge fields to soluble gravity [19–22]. these are linked to the peculiar structure of the rotation group and of the lorentz and poincare groups in (2+1)d. the 3d electrodynamics models with a higgs potential, namely the abelian higgs models involving the vector gauge field aµ, with and without the topological cs term in (2+1)d, have been of wide interest [19–22]. when these models are considered without a cs term, but only with a maxwell term accounting for the kinetic energy of the vector gauge field, they represent field-theoretical models which could be considered as effective theories of the ginzburg-landau type [22] for superconductivity. these models in (2+1)d or in (3+1)d are known as the nielsen-olesen (vortex) models (nom) [20]. they are the relativistic generalizations of the well-known ginzburg-landau phenomenological field theory models of superconductivity [2, 20, 22]. the effective theories with excitations with fractional statistics are supposed to be described by gauge theories with cs terms in (2+1)d, and the study of these gauge field theories and of the models of quantum electrodynamics involving the cs term represents a broad and important area of investigation [21, 22]. the cs term provides a velocity-dependent gauge potential [21, 22], and the presence of the maxwell term in the action makes the gauge field dynamical [20]. we study the hamiltonian formulation [1] of this maxwell-chern-simons-higgs theory under appropriate gauge-fixing conditions [20, 22]. the quantization of field theory models with constraints has always been a challenging problem [1]. in fact, any complete physical theory is a quantum theory, and the only way of defining a quantum theory is to start with a classical theory and then to quantize it [1]. the theory presently under consideration is also a constrained system. in the present work, we quantize this theory using dirac’s hamiltonian formulation [1] in the usual instant-form (if) of dynamics (on the hyperplanes defined by x0 = t = constant) under appropriate gauge-fixing conditions (gfc’s) [1, 19–22].
2. hamiltonian formulation
the maxwell-chern-simons-higgs theory in two-space one-time dimensions is defined by the action:
s = ∫ l(φ, φ∗, aµ) d³x, (1)
where the lagrangian density l (with κ = θ/2π², θ being the cs parameter) is given by:
l = −(1/4) fµν f^µν + (d̃µφ∗)(d^µφ) − v(|φ|²) + (κ/2) ϵ^µνλ aµ ∂ν aλ, (2)
v(|φ|²) = γ + β|φ|² + α|φ|⁴ = λ(|φ|² − φ0²)², (φ0 ≠ 0), (3)
where the covariant derivatives and conventions are:
dµ = ∂µ + ieaµ, d̃µ = ∂µ − ieaµ, gµν = diag(+1, −1, −1), ϵ⁰¹² = ϵ₀₁₂ = +1, µ, ν = 0, 1, 2. (4)
in the above lagrangian density, the first term is the kinetic-energy term of the u(1) gauge field, and the second term represents the coupling of the u(1) gauge field with the complex scalar field as well as the kinetic energy of the complex scalar field. the third term describes the higgs potential, and the last term is the cs term. the model without the cs term describes an abelian higgs model, defined by the lagrangian density l = l(φ, φ∗, aµ), a function of a complex scalar field φ and an abelian vector gauge field aµ(x), as given by the above lagrangian density.
in (2+1)d this theory is called the nielsen-olesen (vortex) model (nom). these models possess stable, time-independent (i.e. static) classical solutions (which could be called 2d solitons); in fact, the model admits the so-called topological solitons of the vortex type [4]. further, in this model, if we choose the parameters of the higgs potential such that the scalar and vector masses become equal, i.e. if we set the higgs boson and the vector boson (photon) masses equal, m_higgs = m_photon = eφ0, then
v(|φ|²) = (1/2) e²(|φ|² − φ0²)². (5)
the above model then reduces to the so-called bogomol’nyi model, which describes a system on the boundary between type-i and type-ii superconductivity [4]. in component form, the above lagrangian density can be written as:
l = (κ/2)[a0f12 + a1(∂2a0) − a2(∂1a0) + a2(∂0a1) − a1(∂0a2)] − (1/2)f12² + (1/2)(∂1a0 − ∂0a1)² + (1/2)(∂0a2 − ∂2a0)² + [(∂0φ∗)(∂0φ) + ie(∂0φ∗)a0φ − ie(∂0φ)a0φ∗ + e²a0²φ∗φ] − [(∂1φ∗)(∂1φ) + ie(∂1φ∗)a1φ − ie(∂1φ)a1φ∗ + e²a1²φ∗φ] − [(∂2φ∗)(∂2φ) + ie(∂2φ∗)a2φ − ie(∂2φ)a2φ∗ + e²a2²φ∗φ] − v(|φ|²). (6)
the canonical momenta obtained from the above lagrangian density are:
π = ∂l/∂(∂0φ) = ∂0φ∗ − iea0φ∗, π∗ = ∂l/∂(∂0φ∗) = ∂0φ + iea0φ, π0 = ∂l/∂(∂0a0) = 0, (7)
e1 (:= π¹) = ∂l/∂(∂0a1) = −(∂1a0 − ∂0a1) + (κ/2)a2, e2 (:= π²) = ∂l/∂(∂0a2) = (∂0a2 − ∂2a0) − (κ/2)a1. (8)
here π, π∗, π0, e1, e2 are the momenta canonically conjugate to φ, φ∗, a0, a1, a2, respectively. the theory is thus seen to possess only one primary constraint (pc):
χ1 = π0 ≈ 0. (9)
the canonical hamiltonian density ℋc of the theory is obtained from the lagrangian density by the legendre transformation in the usual manner; every term in the lagrangian density (including the cs term) is equally important, and the calculational details are omitted here for the sake of brevity. the total hamiltonian density of the theory is then obtained from the canonical hamiltonian density by including in it the primary constraint of the theory with the help of the lagrange multiplier field u ≡ u(xµ) (which is dynamical), as follows:
ℋt = π0u + ππ∗ − iea0(πφ − π∗φ∗) + (1/2)(e1² + e2²) + (1/2)f12² + [e1(∂1a0) + e2(∂2a0)] + (1/2)(κ/2)²(a1² + a2²) − (κ/2)[a2e1 − a1e2 + a0f12] + [(∂1φ∗)(∂1φ) + ie(∂1φ∗)a1φ − ie(∂1φ)a1φ∗ + e²a1²φ∗φ] + [(∂2φ∗)(∂2φ) + ie(∂2φ∗)a2φ − ie(∂2φ)a2φ∗ + e²a2²φ∗φ], (10)
where
ht = ∫ ℋt d²x, (11)
with the total hamiltonian density given by
ℋt = ℋc + π0u. (12)
it is to be noted here that, in the construction of the canonical hamiltonian density of the theory, all the fields of the theory play an equally important role, through the legendre transformation and through the lagrangian density that defines the theory. it is also worth mentioning that the hamilton’s equations of motion of the theory (omitted here for the sake of brevity), obtained from the total hamiltonian density, preserve the constraints of the theory for all time. after preserving the primary constraint χ1 in the course of time, one obtains a secondary constraint
χ2 = [ie(πφ − π∗φ∗) + (∂1e1 + ∂2e2) + (κ/2)(∂1a2 − ∂2a1)] ≈ 0. (13)
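the way this secondary constraint arises can be sketched in one line of the dirac consistency procedure (our own check, written in latex notation; the functional derivative acts on the total hamiltonian of eqs. (10)-(12)):

\dot{\chi}_1 = \{\pi^0, H_T\}_P = -\frac{\delta H_T}{\delta A_0}
             = ie(\Pi\phi - \Pi^{*}\phi^{*}) + (\partial_1 E^1 + \partial_2 E^2)
               + \frac{\kappa}{2}(\partial_1 A_2 - \partial_2 A_1) \approx 0 ,

which is precisely the gauss-law constraint χ2 of eq. (13).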
the matrix of the poisson brackets (pb’s) among the constraints χi is a null matrix, and thereby the theory is a gauge-invariant theory, invariant under the following local vector gauge transformations:
δφ = iβφ, δφ∗ = −iβφ∗, δπ0 = 0,
δa0 = −∂0β, δa1 = −∂1β, δa2 = −∂2β,
δπ = −iβ(∂0φ∗) − eβa0φ∗ + i(e − 1)(∂0β)φ∗,
δπ∗ = iβ(∂0φ) − eβa0φ − i(e − 1)(∂0β)φ,
δe1 = −(κ/2) ∂2β, δe2 = (κ/2) ∂1β, δu = −∂0∂0β. (14)
here β ≡ β(xµ) is the gauge parameter, and the vector gauge current satisfies ∂µj^µ = 0. the components of j^µ are:
j^0 = j0 = (iβφ)[∂0φ∗ − iea0φ∗] − (iβφ∗)[∂0φ + iea0φ] − (∂1β)f01 − (∂2β)f02 − (κ/2)[(∂1β)a2 − (∂2β)a1],
j^1 = −j1 = (iβφ)[−∂1φ∗ + iea1φ∗] − (iβφ∗)[−∂1φ − iea1φ] − (∂0β)f10 − (∂2β)f21 + (κ/2)[(∂0β)a2 − (∂2β)a0],
j^2 = −j2 = (iβφ)[−∂2φ∗ + iea2φ∗] − (iβφ∗)[−∂2φ − iea2φ] − (∂0β)f20 − (∂1β)f12 − (κ/2)[(∂0β)a1 − (∂1β)a0]. (15)
for quantizing the theory using dirac’s procedure, we choose the following two gauge-fixing conditions (gfc’s):
ξ1 = π ≈ 0, ξ2 = a0 ≈ 0. (16)
here the gauge a0 ≈ 0 represents the time-axial or temporal gauge, and the gauge π ≈ 0 represents the coulomb gauge. these gauges are acceptable and consistent with our quantization procedure, and they are also physically more interesting. corresponding to this set of gauge-fixing conditions, the total set of constraints now becomes:
χ1 = π0 ≈ 0,
χ2 = [ie(πφ − π∗φ∗) + (∂1e1 + ∂2e2) + (κ/2)(∂1a2 − ∂2a1)] ≈ 0,
χ3 = ξ1 = π ≈ 0,
χ4 = ξ2 = a0 ≈ 0. (17)
the non-vanishing matrix elements of the matrix rαβ (:= {χα, χβ}p) of the equal-time poisson brackets of the above constraints are:
r14 = −r41 = −δ(x1 − y1)δ(x2 − y2),
r23 = −r32 = ieπ δ(x1 − y1)δ(x2 − y2). (18)
the above matrix is nonsingular, the set of constraints χi, i = 1, 2, 3, 4, is now second class, and the theory is a gauge non-invariant theory. the non-vanishing matrix elements of the matrix r−1αβ (the inverse of the matrix rαβ) are given by:
r−114 = −r−141 = δ(x1 − y1)δ(x2 − y2),
(eπ) r−123 = −(eπ) r−132 = i δ(x1 − y1)δ(x2 − y2). (19)
following the standard dirac quantization procedure, with {a, b}d = {a, b}p − {a, χα}p (r−1)αβ {χβ, b}p, the non-vanishing equal-time dirac brackets (db’s) of the theory are obtained as:
(π) {π∗(x0, x1, x2), φ(x0, y1, y2)}d = (−π∗) δ(x1 − y1)δ(x2 − y2),
{π∗(x0, x1, x2), φ∗(x0, y1, y2)}d = {π∗(x0, x1, x2), φ∗(x0, y1, y2)}p = −δ(x1 − y1)δ(x2 − y2),
(ieπ) {e1(x0, x1, x2), φ(x0, y1, y2)}d = (κ/2) δ(x1 − y1) ∂2δ(x2 − y2),
{e1(x0, x1, x2), a1(x0, y1, y2)}d = {e1(x0, x1, x2), a1(x0, y1, y2)}p = −δ(x1 − y1)δ(x2 − y2),
(ieπ) {e2(x0, x1, x2), φ(x0, y1, y2)}d = −(κ/2) ∂1δ(x1 − y1) δ(x2 − y2),
{e2(x0, x1, x2), a2(x0, y1, y2)}d = {e2(x0, x1, x2), a2(x0, y1, y2)}p = −δ(x1 − y1)δ(x2 − y2),
(π) {φ(x0, x1, x2), φ∗(x0, y1, y2)}d = (−φ∗) δ(x1 − y1)δ(x2 − y2),
(π) {φ(x0, x1, x2), a0(x0, y1, y2)}d = (φ) δ(x1 − y1)δ(x2 − y2),
(ieπ) {φ(x0, x1, x2), a1(x0, y1, y2)}d = ∂1δ(x1 − y1) δ(x2 − y2),
(ieπ) {φ(x0, x1, x2), a2(x0, y1, y2)}d = δ(x1 − y1) ∂2δ(x2 − y2).
here one finds that products of the canonical variables appear in the expressions of the constraints as well as in the expressions of the db's; therefore, in achieving the canonical quantization of the theory, one encounters the problem of operator ordering while going from the db's to the commutation relations. this problem can, however, be resolved by demanding that all the fields and field momenta become hermitian operators after quantization, and that all the canonical commutation relations be consistent with the hermiticity of these operators. this completes the hamiltonian formulation of the theory under the chosen gauge-fixing conditions. it may be worthwhile to mention here that our choice of gfc's is by no means unique. in principle, one can choose any set of gfc's that converts the set of first-class constraints of the theory into a set of second-class constraints. however, it is better to choose gfc's that are physically more meaningful and relevant, like the ones we have chosen: in our case the gauge A_0 \approx 0 represents a time-axial or temporal gauge, the gauge \pi \approx 0 represents a coulomb gauge, and both of them are physically important gfc's. another important point is that one cannot choose covariant gfc's here, simply because the constraints of the theory are not covariant, and therefore it would not work. in path integral quantization (piq) [23], the transition to quantum theory is made by writing the vacuum-to-vacuum transition amplitude of the theory, called the generating functional Z[J_k], which for the present theory, in the presence of the external sources J_k, reads [23]:

Z[J_k] = \int [d\mu] \exp\left[ i \int d^3x \left( J_k \Phi^k + \pi\,\partial_0\phi + \pi^*\,\partial_0\phi^* + \pi^0\,\partial_0 A_0 + E^1\,\partial_0 A_1 + E^2\,\partial_0 A_2 + \pi_u\,\partial_0 u - \mathcal{H}_T \right) \right]. (21)

here \Phi^k \equiv (\phi, \phi^*, A_0, A_1, A_2, u) are the phase space variables of the theory, with the corresponding canonically conjugate momenta \Pi_k \equiv (\pi, \pi^*, \pi^0, E^1, E^2, \pi_u). the functional measure [d\mu] of the theory (with the above generating functional Z[J_k]) is:

[d\mu] = (ie\pi)\,\delta(x^1-y^1)\,\delta(x^2-y^2)\; [d\phi][d\phi^*][dA_0][dA_1][dA_2][du]\, [d\pi][d\pi^*][d\pi^0][dE^1][dE^2][d\pi_u]\; \delta[\pi^0 \approx 0]\; \delta\!\left[ ie(\pi\phi - \pi^*\phi^*) + (\partial_1 E^1 + \partial_2 E^2) + \frac{\kappa}{2}(\partial_1 A_2 - \partial_2 A_1) \approx 0 \right] \delta[\pi \approx 0]\;\delta[A_0 \approx 0]. (22)

acknowledgements

we thank the organizers of the international conference on analytic and algebraic methods in physics xviii (aamp xviii) 2021 at prague, czechia (prof. vít jakubský, prof. vladimir lotoreichik, prof. matej tusek, prof. miloslav znojil and the entire team) for a wonderful organization of the conference. one of us (bheemraj sihag) thanks the university of delhi for the award of a research fellowship.

references

[1] p. a. m. dirac. generalized hamiltonian dynamics. canadian journal of mathematics 2:129–148, 1950. https://doi.org/10.4153/cjm-1950-012-1.
[2] v. l. ginzburg, l. d. landau. on the theory of superconductivity (in russian). zhurnal eksperimental'noi i teoreticheskoi fiziki 20:1064–1082, 1950.
[3] a. a. abrikosov. on the magnetic properties of superconductors of the second group. soviet physics jetp 5:1174–1182, 1957.
[4] h. b. nielsen, p. olesen. vortex-line models for dual strings. nuclear physics b 61:45–61, 1973.
[5] c. becchi, a. rouet, r. stora. the abelian higgs kibble model, unitarity of the s-operator. physics letters b 52(3):344–346, 1974. https://doi.org/10.1016/0370-2693(74)90058-6.
[6] e. b. bogomol'nyi. the stability of classical solutions. soviet journal of nuclear physics 24(4):449–458, 1976.
[7] s. deser, r. jackiw, s.
templeton. three-dimensional massive gauge theories. physical review letters 48:975–978, 1982; annals of physics 140:372, 1982. https://doi.org/10.1103/physrevlett.48.975.
[8] f. wilczek. quantum mechanics of fractional-spin particles. physical review letters 49:957–959, 1982. https://doi.org/10.1103/physrevlett.49.957.
[9] a. j. niemi, g. w. semenoff. axial-anomaly-induced fermion fractionization and effective gauge-theory actions in odd-dimensional space-times. physical review letters 51:2077–2080, 1983. https://doi.org/10.1103/physrevlett.51.2077.
[10] a. n. redlich. gauge noninvariance and parity nonconservation of three-dimensional fermions. physical review letters 52:18–21, 1984. https://doi.org/10.1103/physrevlett.52.18.
[11] k. ishikawa. chiral anomaly and quantized hall effect. physical review letters 53:1615–1618, 1984. https://doi.org/10.1103/physrevlett.53.1615.
[12] g. w. semenoff, p. sodano. non-abelian adiabatic phases and the fractional quantum hall effect. physical review letters 57:1195–1198, 1986. https://doi.org/10.1103/physrevlett.57.1195.
[13] l. jacobs, c. rebbi. interaction energy of superconducting vortices. physical review b 19:4486–4494, 1979. https://doi.org/10.1103/physrevb.19.4486.
[14] i. v. krive, a. s. rozhavskĭı. fractional charge in quantum field theory and solid-state physics. soviet physics uspekhi 30(5):370, 1987. https://doi.org/10.1070/pu1987v030n05abeh002884.
[15] a. l. fetter, c. b. hanna, r. b. laughlin. random-phase approximation in the fractional-statistics gas. physical review b 39:9679–9681, 1989. https://doi.org/10.1103/physrevb.39.9679.
[16] t. banks, j. d. lykken. landau-ginzburg description of anyonic superconductors. nuclear physics b 336(3):500–516, 1990. https://doi.org/10.1016/0550-3213(90)90439-k.
[17] g. v. dunne, c. a. trugenberger. self-duality and nonrelativistic maxwell-chern-simons solitons. physical review d 43:1323–1331, 1991. https://doi.org/10.1103/physrevd.43.1323.
[18] s. forte. quantum mechanics and field theory with fractional spin and statistics. reviews of modern physics 64:193–236, 1992. https://doi.org/10.1103/revmodphys.64.193.
[19] u. kulshreshtha. hamiltonian and brst formulations of the two-dimensional abelian higgs model. canadian journal of physics 78(1):21–31, 2000. https://doi.org/10.1139/p00-002.
[20] u. kulshreshtha. hamiltonian and brst formulations of the nielsen-olesen model. international journal of theoretical physics 41(2):273–291, 2002. https://doi.org/10.1023/a:1014058806710.
[21] u. kulshreshtha, d. s. kulshreshtha. hamiltonian, path integral, and brst formulations of the chern-simons theory under appropriate gauge-fixing. canadian journal of physics 86(2):401–407, 2008. https://doi.org/10.1139/p07-176.
[22] u. kulshreshtha, d. s. kulshreshtha, h. j. w. mueller-kirsten, j. p. vary. hamiltonian, path integral and brst formulations of the chern-simons-higgs theory under appropriate gauge fixing. physica scripta 79(4):045001, 2009. https://doi.org/10.1088/0031-8949/79/04/045001.
[23] h. j. w. mueller-kirsten. introduction to quantum mechanics: schrödinger equation and path integral. world scientific, singapore, 2006. isbn 9789814397735.
multi-particle universal processes

j. novotný, m. štefaňák, v. košťák

we generalize bipartite universal processes to the subclass of multi-particle universal processes from one to n particles. we show how the general statement for a multi-particle universal process can be constructed. the one-parameter family of processes generating totally anti-symmetric states has been generalized to a multi-particle regime and its entanglement properties have been studied. a view is given on the complete positivity and the possible physical realization of universal processes.

keywords: universal processes, quantum entanglement, complete positivity.

1 introduction

in recent years much effort has been invested in the investigation of how to employ quantum systems as parts of a computer. it has been demonstrated that quantum information is different from classical information and that the essence of this difference lies in the entanglement of quantum systems. entanglement is a simple consequence of the linearity of quantum mechanics, hence it does not have a classical counterpart. this effect equips a quantum computer with massive parallelism and hence could be used to speed up computation; quantum information processing could therefore be more efficient than classical information processing. on the other hand, it has become clear that the linear character of quantum theory imposes severe restrictions on the character of elementary tasks of quantum information processing. for example, it is impossible to clone an arbitrary quantum state perfectly [1]; nevertheless, imperfect copies can be made [2, 3]. this particular process of cloning belongs to the class of so-called universal processes. these processes act on all input states of a quantum system in a 'similar' way. for universal processes working with one quantum system in a pure state and finishing in an n-particle state, this property is mathematically described by the so-called covariance condition. two-particle processes fulfilling this condition were analyzed in [4]. in that paper a theoretical framework was developed within which all possible two-particle universal processes can be described, and those compatible with the linear character of quantum theory were determined. of special interest were universal processes generating entangled two-particle output states which do not contain any separable components. it has been shown that this particular subclass forms a one-parameter family of totally anti-symmetric states with respect to permutations of the two particles. the aim of this paper is to generalize the results obtained for two-particle universal processes to the subclass of multi-particle universal processes from one to n particles. for this purpose we use the theoretical framework developed in [4]. we show how the general statement for a multi-particle universal process can be constructed. the one-parameter family of processes generating totally anti-symmetric states is generalized to a multi-particle regime and its properties, mainly bipartite entanglement, are studied. for this purpose we use the concept of negativity [5]. finally, we briefly review the question of the complete positivity of the obtained universal processes.
2 general structure of a universal process

we will now derive the general structure of the n-particle covariant mapping using the statement for the two-particle case [4]. in the following we assume that the d-dimensional one-particle hilbert space \mathcal{H} is the same for all n particles. an arbitrary one-particle pure state can be described by a d-dimensional generalized bloch vector p; such a state will be denoted simply by |p\rangle. consider a linear map \Phi from 1 to n particles, i.e.

\Phi: \rho_{in}(p) \mapsto \rho_{out}(p), \quad \rho_{in}(p) \in \mathcal{B}(\mathcal{H}), \quad \rho_{out}(p) \in \mathcal{B}(\mathcal{H}^{\otimes n}), (1)

where the density matrix \rho_{in}(p) has the form

\rho_{in}(p) = \frac{1}{d^{\,n-1}}\, |p\rangle\langle p| \otimes \mathbb{1}^{\otimes(n-1)}, (2)

i.e. the first particle is in the pure state |p\rangle and all others are in the state of a complete mixture. we will call this map universal (or covariant) if it possesses the following covariance property

\rho_{out}(p) = \rho_{out}(U(p)\, p_0) = U(p)^{\otimes n}\, \rho_{out}(p_0)\, \left( U(p)^{\otimes n} \right)^\dagger, (3)

where the one-particle unitary transformation U(p) maps the state |p_0\rangle to the state |p\rangle, i.e.

|p\rangle = U(p)\,|p_0\rangle. (4)

to construct the general statement for the covariant mapping with n particles we will use the results for the two-particle case [4], whose main results we now summarize. an arbitrary density operator of a d-dimensional quantum system can be represented in terms of some basis of the su(d) algebra. in order to implement the covariance condition (3) we use the basis \{A_{ij}\}, (i, j = 1, \ldots, d), which fulfills the following commutation relations

[A_{ij}, A_{mn}] = \delta_{jm} A_{in} - \delta_{in} A_{mj} (5)

(we use the einstein summation convention, in which one sums over all indices appearing twice in an expression from one to d). a representation of these generators is given by the d \times d matrices

(A_{ij})_{kl} = \delta_{ik}\delta_{jl} - \frac{1}{d}\,\delta_{ij}\delta_{kl}. (6)

the density matrix \rho_{in}(p) can be written in the form

\rho_{in}(p) = \frac{1}{d}\left( \mathbb{1} + p_{ij} A_{ij} \right). (7)

in terms of the matrices (6), the most general two-particle output state is represented by the density matrix

\rho_{out}(p) = \frac{1}{d^2}\left[ \mathbb{1} \otimes \mathbb{1} + \alpha^{(1)}_{ij}(p)\, A_{ij} \otimes \mathbb{1} + \alpha^{(2)}_{ij}(p)\, \mathbb{1} \otimes A_{ij} + \alpha_{ijkl}(p)\, A_{ij} \otimes A_{kl} \right], (8)

where the linearity requirement of quantum mechanics implies that \alpha^{(1)}_{ij}(p), \alpha^{(2)}_{ij}(p) and \alpha_{ijkl}(p) have to be linear in the parameters of the bloch vector p. to fulfill the covariance condition (3), the matrix (8) may involve only terms invariant under transformations of the form U \otimes U, i.e. the scalar part, or terms which transform like the generators A_{ij} of the su(d) group, i.e. the vector part. it was shown in [4] that the nontrivial scalar is A_{ij} \otimes A_{ji} and the nontrivial vectors have the form A_{ki} \otimes A_{kj} and A_{ik} \otimes A_{kj}.
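as a concrete illustration of this algebra (our own sketch, not part of the original computation), the following python/numpy fragment builds the generators (6) for d = 3 and verifies the commutation relation (5) numerically:

import itertools
import numpy as np

d = 3

def A(i, j):
    # (A_ij)_kl = delta_ik delta_jl - (1/d) delta_ij delta_kl, cf. eq. (6)
    m = np.zeros((d, d))
    m[i, j] = 1.0
    if i == j:
        m -= np.eye(d) / d
    return m

# check [A_ij, A_mn] = delta_jm A_in - delta_in A_mj, cf. eq. (5)
for i, j, m, n in itertools.product(range(d), repeat=4):
    comm = A(i, j) @ A(m, n) - A(m, n) @ A(i, j)
    assert np.allclose(comm, (j == m) * A(i, n) - (i == n) * A(m, j))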
from these facts, the most general output matrix that fulfills the covariance condition (3) and depends linearly on the input must have the form

\rho_{out}(p) = \frac{1}{d^2}\left[ \mathbb{1} \otimes \mathbb{1} + \alpha^{(1)} p_{ij}\, A_{ij} \otimes \mathbb{1} + \alpha^{(2)} p_{ij}\, \mathbb{1} \otimes A_{ij} + c\, A_{ij} \otimes A_{ji} + \mu\, p_{ij}\, A_{ik} \otimes A_{kj} + \mu^*\, p_{ij}\, A_{ki} \otimes A_{jk} \right], (9)

where \alpha^{(1)}, \alpha^{(2)} and c are real parameters and \mu is complex. the ranges of these parameters are restricted by the fact that \rho_{out}(p) must be a density matrix, i.e. a positive operator with unit trace. this statement can easily be generalized to the n-particle case. the output density matrix may involve only scalar terms (multiplied by an arbitrary constant) and vector terms (multiplied by a constant times the parameters p_{ij} of the initial state). these terms can be constructed from tensor products of the one-particle scalar \mathbb{1} and vector A_{ij}, and of the two-particle scalar and vector terms. to obtain a scalar term we have to sum over all free indices, while for a vector term two indices must remain free (these are later contracted with the parameters p_{ij}). the summation is done in such a way that one summation index is in the first position while the second is in the second position of the generators (6). as an example we show the explicit form of the scalar and vector terms for the case n = 3 (for more information on three-particle universal processes see [6]):

scalar terms:
- A_{ij} \otimes A_{ji} \otimes \mathbb{1}, A_{ij} \otimes \mathbb{1} \otimes A_{ji}, \mathbb{1} \otimes A_{ij} \otimes A_{ji},
- A_{ij} \otimes A_{jk} \otimes A_{ki} + hermitian conjugate

vector terms:
- A_{ij} \otimes \mathbb{1} \otimes \mathbb{1}, \mathbb{1} \otimes A_{ij} \otimes \mathbb{1}, \mathbb{1} \otimes \mathbb{1} \otimes A_{ij},
- A_{ik} \otimes A_{kj} \otimes \mathbb{1}, A_{ik} \otimes \mathbb{1} \otimes A_{kj}, \mathbb{1} \otimes A_{ik} \otimes A_{kj} + hermitian conjugate,
- A_{ij} \otimes A_{kl} \otimes A_{lk}, A_{kl} \otimes A_{ij} \otimes A_{lk}, A_{kl} \otimes A_{lk} \otimes A_{ij},
- A_{ik} \otimes A_{kl} \otimes A_{lj}, A_{kl} \otimes A_{ik} \otimes A_{lj}, A_{ik} \otimes A_{lj} \otimes A_{kl} + hermitian conjugate

in this case the output density matrix \rho_{out}(p) depends on 23 real parameters, which are restricted by the positivity of this operator. compared with just five real parameters in the two-particle case, we see that the complexity of the n-particle covariant mapping grows rapidly with the number of particles n. if we want to study the properties of the covariant mappings in more detail, we can simply study the resulting density matrix for an arbitrary input state. due to the covariance (3), functions like the entropy or the entanglement measures of a particular universal process have the same value for all possible input states, because these functions are invariant under local transformations. the universal processes generating entangled two-particle output states which do not contain any separable components were studied in [4]. it has been shown that this particular subclass forms a one-parameter family of totally anti-symmetric states with respect to permutations of the two particles. we will now give a generalization of this class for the case of n-particle processes.
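the invariance of the scalar term can likewise be checked numerically. the following sketch (ours; it reuses the generators of eq. (6)) verifies that the sum over A_{ij} \otimes A_{ji} commutes with conjugation by U \otimes U for a random unitary U:

import numpy as np

d = 3

def A(i, j):
    m = np.zeros((d, d))
    m[i, j] = 1.0
    if i == j:
        m -= np.eye(d) / d
    return m

# the nontrivial two-particle scalar: sum_ij A_ij (x) A_ji
S = sum(np.kron(A(i, j), A(j, i)) for i in range(d) for j in range(d))

rng = np.random.default_rng(1)
U, _ = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))
UU = np.kron(U, U)
assert np.allclose(UU @ S @ UU.conj().T, S)  # invariant under U (x) U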
3 generalization of the universal process generating anti-symmetric states to n particles

in this section we propose a generalization of the two-particle universal process generating totally anti-symmetric states to an arbitrary number of particles. the initial state of the first qudit is assumed to be |1\rangle without loss of generality, due to the covariance condition (3). we will use the following notation:

A = \{ \vec{j} \in \mathbb{Z}^n \mid \vec{j} = (1, j_2, \ldots, j_n),\; 2 \le j_2 < \cdots < j_n \le d \},
B = \{ \vec{j} \in \mathbb{Z}^n \mid \vec{j} = (j_1, \ldots, j_n),\; 2 \le j_1 < \cdots < j_n \le d \},
|\{\vec{j}\}\rangle = \frac{1}{\sqrt{n!}} \sum_{\sigma \in P_n} \mathrm{sgn}(\sigma)\, |j_{\sigma(1)}\rangle \cdots |j_{\sigma(n)}\rangle, (10)

where P_n is the group of permutations of n elements. the output states of the n-particle universal process generating totally anti-symmetric states form a one-parameter family and can be written in the form

\rho_n^{anti}(p^{(n)}) = p^{(n)}\, \mathcal{S}_n^{anti} + (1 - p^{(n)})\, \mathcal{R}_n^{anti}, \quad p^{(n)} \in [0, 1], (11)

where the matrices \mathcal{R}_n^{anti} and \mathcal{S}_n^{anti} are given by

\mathcal{R}_n^{anti} = \frac{(n-1)!\,(d-n)!}{(d-1)!} \sum_{\vec{j} \in A} |\{\vec{j}\}\rangle\langle\{\vec{j}\}|, \quad \mathcal{S}_n^{anti} = \frac{n!\,(d-n-1)!}{(d-1)!} \sum_{\vec{j} \in B} |\{\vec{j}\}\rangle\langle\{\vec{j}\}|, (12)

i.e. \mathcal{R}_n^{anti} is the uniform mixture of the anti-symmetric basis states involving the initial state |1\rangle, and \mathcal{S}_n^{anti} is the uniform mixture of the complementary ones. in the case n = d, i.e. when the number of particles is equal to the dimension of the one-particle hilbert space, the set B is empty, we have to put p^{(n)} = 0, and the one-parameter family collapses to a single process generating an n-particle singlet state. this is the only case in which the one-parameter family produces a pure state, while in all other cases the output state \rho_n^{anti} is always mixed. the maximally mixed state is described by the parameter p^{(n)} = (d-n)/d, for which the output density matrix involves all \binom{d}{n} n-particle anti-symmetric states with the same probability. in this case the output density matrix has the form

\rho_n^{anti}\!\left( p^{(n)} = \frac{d-n}{d} \right) = \binom{d}{n}^{-1} \sum_{\vec{j} \in C_d} |\{\vec{j}\}\rangle\langle\{\vec{j}\}|, (13)

where the set C_d is defined as

C_d = \{ \vec{j} \in \mathbb{Z}^n \mid 1 \le j_1 < j_2 < \cdots < j_n \le d \}. (14)

this state is invariant under transformations of the type U^{\otimes n}, and therefore this particular process involves only the scalar part. the reduced states of k particles are the same for all k-particle subsystems and have the same form as the output state of the k-particle universal process. the only difference is that the parameter p^{(k)} has to be replaced by p^{(k)}_{(n)}, which is given by

p^{(k)}_{(n)} = \frac{n - k + k\, p^{(n)}}{n}. (15)

the properties of the two-particle subsystems are described by the parameter p^{(2)}_{(n)}. the entanglement of the two-particle subsystems is quantified by the negativity [5]. for the two-particle one-parameter family this function is given by (cf. [4])

N(p^{(2)}) = \frac{ p^{(2)} + \sqrt{ (p^{(2)})^2 + (1 - p^{(2)})^2 (d-1) } }{ 2(d-1) }. (16)

the negativity of the two-particle subsystems of the n-particle universal process generating totally anti-symmetric states is then given by

N(p^{(n)}) = \frac{ (n - 2 + 2p^{(n)}) + \sqrt{ (n - 2 + 2p^{(n)})^2 + 4(1 - p^{(n)})^2 (d-1) } }{ 2n(d-1) }. (17)

this function has a minimum for

p^{(n)} = \frac{d-n}{d}, (18)

i.e. the scalar process that produces the most disordered state also produces the state with the least entangled two-particle subsystems. the maximal value of the negativity is obtained for p^{(n)} = 1 or p^{(n)} = 0, depending on the relation between n and d. for d > 2n + 1 the maximum is at the point p^{(n)} = 0 and has the value

N_{max} = \frac{ (n-2) + \sqrt{ (n-2)^2 + 4(d-1) } }{ 2n(d-1) }. (19)

for d < 2n + 1 the maximum is at the point p^{(n)} = 1 and is given by

N_{max} = \frac{1}{d-1}. (20)

in the case d = 2n + 1 the values of the negativity at the points p^{(n)} = 0 and p^{(n)} = 1 are equal, so the function N(p^{(n)}) has two equal peaks. the output density matrix (11) has the form of a convex sum of two density matrices: \mathcal{R}_n^{anti} is the uniform sum of the projectors onto the anti-symmetric basis states involving the initial state |1\rangle, while \mathcal{S}_n^{anti} involves the projectors complementary to those in \mathcal{R}_n^{anti} on the n-particle anti-symmetric subspace.
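the family (10)-(12) is easy to realize numerically for small n and d. the sketch below (our illustration; modes are zero-indexed, with mode 0 playing the role of |1\rangle) builds the anti-symmetric basis states, assembles the output state, and for n = 2 evaluates the negativity directly from the partial transpose, comparing it with the closed form (16):

import math
import numpy as np
from itertools import combinations, permutations

def sgn(perm):
    s = 1
    for a in range(len(perm)):
        for b in range(a + 1, len(perm)):
            if perm[a] > perm[b]:
                s = -s
    return s

def antisym(modes, d):
    # |{j_1...j_n}> of eq. (10); modes is a strictly increasing tuple
    n = len(modes)
    psi = np.zeros(d ** n)
    for sigma in permutations(range(n)):
        idx = 0
        for pos in range(n):
            idx = idx * d + modes[sigma[pos]]
        psi[idx] += sgn(sigma)
    return psi / math.sqrt(math.factorial(n))

def rho_family(p, n, d):
    # rho of eq. (11); mode 0 plays the role of the initial state |1>
    proj = lambda c: np.outer(antisym(c, d), antisym(c, d))
    states_A = [c for c in combinations(range(d), n) if 0 in c]
    states_B = [c for c in combinations(range(d), n) if 0 not in c]
    R = sum(map(proj, states_A)) / len(states_A)
    S = sum(map(proj, states_B)) / len(states_B)
    return p * S + (1 - p) * R

# negativity of the two-particle family via the partial transpose
p, d = 0.3, 4
rho = rho_family(p, 2, d).reshape(d, d, d, d)
pt = rho.transpose(0, 3, 2, 1).reshape(d * d, d * d)
ev = np.linalg.eigvalsh(pt)
print(-ev[ev < 0].sum())                                   # numerical negativity
print((p + math.sqrt(p**2 + (1 - p)**2 * (d - 1))) / (2 * (d - 1)))  # eq. (16)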
for the special case p^{(n)} = 0 only the matrix \mathcal{R}_n^{anti} is involved, and the output density matrix can be written as

\rho_n^{anti} = \frac{n!\,(d-n)!\, d^{\,n-1}}{(d-1)!}\; \mathcal{A}_n\, \rho_{in}\, \mathcal{A}_n, (21)

where \rho_{in} is given by (2) and \mathcal{A}_n is the projection operator onto the anti-symmetric subspace of the n-particle hilbert space. hence this particular process can be realized by a certain projector acting on the initial state. universal processes with this property will be studied in the following section.

4 realization of universal processes

as was shown above, some important universal processes can be realized by the action of certain projectors on the input state. besides the example in the previous section, where the projector onto the anti-symmetric subspace of the n-particle hilbert space was used, we can mention the optimal universal copying process from one to n particles, which can be implemented using the projector onto the symmetric subspace of the n-particle hilbert space [2]. these projectors play the role of kraus operators for the given processes and can be used to construct a unitary evolution operator in the canonical way [7], as will be shown below. the natural question arises which class of projectors has this property of realizing covariant processes. let T be a projector onto some subspace of the n-particle hilbert space. the mapping \rho_{in} \mapsto \rho_{out} = \lambda\, T \rho_{in} T, where \lambda is a normalizing factor making \rho_{out} a density matrix, fulfills the covariance condition if the following equality is satisfied for all one-particle unitary transformations U and for all one-particle pure input states |p\rangle\langle p|:

T \left( U|p\rangle\langle p|U^\dagger \otimes \mathbb{1}^{\otimes(n-1)} \right) T = U^{\otimes n}\, T \left( |p\rangle\langle p| \otimes \mathbb{1}^{\otimes(n-1)} \right) T\, \left( U^{\otimes n} \right)^\dagger. (22)

a sufficient condition for T to satisfy (22) is

U^{\otimes n}\, T = T\, U^{\otimes n}, (23)

i.e. T must commute with all unitary transformations of the form U^{\otimes n}. all n-particle transformations U^{\otimes n} form an n-particle representation of the one-particle unitary group. this representation is in general reducible for n \ge 2. from schur's lemma, which states that a representation is irreducible if and only if the only matrix commuting with all members of the representation is proportional to the identity, we deduce that if T is proportional to the projector onto an invariant subspace of the reducible representation, then condition (23) is satisfied. more generally, let \{P_\alpha\} be the set of all projectors onto the minimal invariant subspaces; then the operator T defined as

T = \sum_\alpha \lambda_\alpha P_\alpha (24)

commutes with all unitary operators of the form U^{\otimes n} for arbitrary \lambda_\alpha \in \mathbb{C}. it is now obvious that the mapping \tilde{T},

\rho_{out} = \tilde{T}(|p\rangle\langle p|) = \frac{ T \left( |p\rangle\langle p| \otimes \mathbb{1}^{\otimes(n-1)} \right) T^\dagger }{ \mathrm{tr}\left[ T \left( |p\rangle\langle p| \otimes \mathbb{1}^{\otimes(n-1)} \right) T^\dagger \right] }, (25)

where T is of the form (24), from the pure one-particle input states to the n-particle output density operators, satisfies condition (22). a simple consequence of the presented construction is that any convex combination of transformations of the form (25) also satisfies (22). even though transformation (25) fulfils the covariance condition (22), it need not be completely positive, which is a necessary condition for the physical acceptability of this mapping.
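for the anti-symmetric projector, the sufficient condition (23) and the induced map (25) can be verified directly. a small sketch of ours, for two qudits with d = 3:

import numpy as np

d = 3
# anti-symmetric projector on two qudits: A_2 = (1 - SWAP)/2
swap = np.zeros((d * d, d * d))
for i in range(d):
    for j in range(d):
        swap[i * d + j, j * d + i] = 1.0
T = (np.eye(d * d) - swap) / 2

rng = np.random.default_rng(2)
U, _ = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))
assert np.allclose(np.kron(U, U) @ T, T @ np.kron(U, U))  # condition (23)

# the induced map (25) applied to a random pure input state
p = rng.normal(size=d) + 1j * rng.normal(size=d)
p /= np.linalg.norm(p)
rho_in = np.kron(np.outer(p, p.conj()), np.eye(d) / d)    # eq. (2) with n = 2
out = T @ rho_in @ T
rho_out = out / np.trace(out)
print(np.trace(rho_out).real)  # -> 1.0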
from the kraus theorem we have the following criterion [8]: a linear mapping P from the density matrices on the input hilbert space to the density matrices on the output hilbert space is completely positive if and only if there exists a so-called kraus decomposition of the given transformation, i.e. if there exists a set of linear mappings (kraus operators) \{E_k\} satisfying the following two conditions for any input density operator \rho:

P(\rho) = \frac{ \sum_k E_k\, \rho\, E_k^\dagger }{ \mathrm{tr}\left[ \sum_k E_k\, \rho\, E_k^\dagger \right] }, (26)

\sum_k E_k^\dagger E_k \le \mathbb{1}. (27)

the definition of the transformation \tilde{T} guarantees the fulfillment of the first condition (26). to satisfy (27) we must restrict the range of the parameters \lambda_\alpha in (24) by the inequality |\lambda_\alpha| \le 1 for all \alpha. the construction of completely positive universal processes described above can be generalized further. the transformation of the form (25) describes a universal process from one to n particles. by applying the partial trace operation over arbitrary m particles, m < n, to the n-particle output state of transformation (25), we get a transformation from one-particle pure states to (n-m)-particle states. it is easy to see that in this case the covariance condition remains satisfied, and we get a completely positive universal process on n - m particles. the final form of a universal process on n particles built up of projectors is therefore the following:

\rho_{out} = \frac{ \mathrm{tr}_{i_1, \ldots, i_m}\left[ T^{(l)} \left( |p\rangle\langle p| \otimes \mathbb{1}^{\otimes(l-1)} \right) T^{(l)\dagger} \right] }{ \mathrm{tr}\left[ T^{(l)} \left( |p\rangle\langle p| \otimes \mathbb{1}^{\otimes(l-1)} \right) T^{(l)\dagger} \right] }, (28)

where m \ge 0, l = n + m, \{i_1, \ldots, i_m\} \subset \{1, \ldots, l\}, and T^{(l)} is a transformation of the form (24) on the l-particle hilbert space, with the condition |\lambda_\alpha| \le 1 required for all \alpha. note that by tracing out different particles we can in general get different classes of processes [9]. a simple consequence of the linearity of quantum theory is that a convex combination of processes of the form (28) is also a completely positive universal process. in the case of formula (28) the mapping T^{(l)} does not play the role of a kraus operator of the universal process (except for the case m = 0), but (28) can be rewritten in the following way:

\rho_{out} = \frac{ \sum_{j_1, \ldots, j_m = 1}^{d}\; \sum_{k_1, \ldots, k_m = 1}^{d} E^{i_1, \ldots, i_m}_{j_1 \ldots j_m,\, k_1 \ldots k_m} \left( |p\rangle\langle p| \otimes \mathbb{1}^{\otimes(n-1)} \right) \left( E^{i_1, \ldots, i_m}_{j_1 \ldots j_m,\, k_1 \ldots k_m} \right)^\dagger }{ \mathrm{tr}\left[ T^{(l)} \left( |p\rangle\langle p| \otimes \mathbb{1}^{\otimes(l-1)} \right) T^{(l)\dagger} \right] }, (29)

where d is the dimension of the one-particle hilbert space and the operators E are defined by

E^{i_1 \ldots i_m}_{j_1 \ldots j_m,\, k_1 \ldots k_m} = \langle j_1|_{i_1} \cdots \langle j_m|_{i_m}\; T^{(l)}\; |k_1\rangle_{i_1} \cdots |k_m\rangle_{i_m}. (30)

the subscript on a bra-vector \langle j|_i means that it acts on the i-th component of the tensor product, i.e. on the i-th particle. the operators defined by equation (30) form the set of kraus operators of the universal process (28).
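the kraus construction (30) can also be made explicit in a few lines. the sketch below (ours) takes T^{(l)} to be the anti-symmetric projector on l = 2 qutrits, traces out m = 1 particle, and checks condition (27) for the resulting kraus operators:

import numpy as np

d = 3
swap = np.zeros((d * d, d * d))
for i in range(d):
    for j in range(d):
        swap[i * d + j, j * d + i] = 1.0
T = ((np.eye(d * d) - swap) / 2).reshape(d, d, d, d)  # axes: out1, out2, in1, in2

# kraus operators E_jk = <j|_2 T |k>_2 of eq. (30) with l = 2, m = 1, i_1 = 2
E = [T[:, j, :, k] for j in range(d) for k in range(d)]
S = sum(e.conj().T @ e for e in E)
print(np.allclose(S, np.eye(d)))  # sum_k E_k^dagger E_k equals 1 here, so (27) holds

for other dimensions the sum comes out proportional to the identity, so a suitable rescaling of T^{(l)} restores condition (27).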
5 conclusions

universal processes may play an important role in various branches of quantum information processing, e.g. in the preparation of entangled states or of copies of the input state. in addition to operations with two particles, multi-particle operations are also of interest. for this purpose we have generalized the two-particle universal processes to a multi-particle regime and shown how the general statement for the multi-particle universal process can be constructed using the results for two-particle universal processes. for the preparation of multi-particle entangled states we generalized the one-parameter family of processes generating totally anti-symmetric states to a multi-particle regime and studied its bipartite entanglement properties. this one-parameter class generates entangled states with equal reduced states; thus the entanglement is shared uniformly between all pairs of particles. a particular process of this one-parameter family can be realized by a simple action of a projection operator. for processes of this kind we can construct a unitary evolution operator in a canonical way, thus they are completely positive. an open question is whether other universal processes can be performed by the action of a certain projector, possibly on a larger hilbert space (i.e. with more particles), followed by tracing over some of the particles. such processes would also be completely positive, and therefore physically feasible, which makes them very interesting for a possible future realization of a quantum computer.

6 acknowledgment

the authors would like to thank prof. igor jex from ctu in prague and prof. gernot alber from tu darmstadt for many stimulating discussions.

references

[1] wootters, w. k., zurek, w. h.: "a single quantum cannot be cloned." nature 299 (1982) 802.
[2] werner, r. f.: "optimal cloning of pure states." phys. rev. a 58 (1998) 1827–1832.
[3] novotný, j., alber, g., jex, i.: "optimal copying of entangled two-qubit states." phys. rev. a 71 (2005) 042332.
[4] alber, g., delgado, d., jex, i.: "optimal universal two-particle processes in arbitrary dimensional hilbert spaces." qic 1 (2001) 33.
[5] vidal, g., werner, r. f.: "computable measure of entanglement." phys. rev. a 65 (2002) 032314.
[6] štefaňák, m.: "many particle universal processes." diploma thesis, fjfi, ctu in prague, 2003.
[7] nielsen, m. a., chuang, i. l.: quantum computation and quantum information. cambridge university press, cambridge, 2000.
[8] kraus, k.: states, effects, operations: fundamental notions of quantum theory. springer-verlag, berlin, 1983.
[9] košťák, v.: "vlastnosti a implementace univerzálních procesů pro více částic." (properties and implementation of universal processes for several particles; in czech) diploma thesis, fjfi, ctu in prague, 2005.

ing. jaroslav novotný, e-mail: novotny.jaroslav@seznam.cz
ing. martin štefaňák, e-mail: stefanak.m@seznam.cz
vojtěch košťák, e-mail: vojtech.kostak@centrum.cz
department of physics, czech technical university in prague, faculty of nuclear sciences and physical engineering, břehová 7, 115 19 prague 1, czech republic

valuating the investment efficiency of distribution companies

m. karajica

the task of this study is to valuate the investment efficiency of distribution companies. although a series of publications and studies has been dedicated to this topic, it is difficult to find a general consensus in defining the investment efficiency of a company. nevertheless, if we simplify an imaginary company as a production unit in which a series of actions transforms inputs to outputs, efficiency can be understood as an effort to achieve maximum value of the outputs together with minimum usage of inputs, where the inputs constitute investments by a company. the investment efficiency of a company can be measured by expressing the absolute values of selected inputs and outputs, a relative expression of inputs and outputs, and perhaps an expression of the difference between them. however, an examination of the efficiency of a certain company is impossible without a valuation of other companies. in view of the breadth of benchmarking, it should be emphasized that this study is dedicated to a certain category of benchmarking, which we may term investment benchmarking. this benchmarking can be defined as a comparison of companies in terms of investment efficiency. the purpose of this comparison is not only to investigate levels of investment efficiency and to relate them to other companies from the same branch, but also to locate the greatest efficiency and indicate potential improvement.

keywords: benchmarking, investment efficiency, multi criteria decision making methods, modified score method, data envelopment analysis, returns to scale, ccr model, distribution companies, valuation criteria.

1 methodology

the implementation of investment benchmarking requires methods that are not categorized in the special literature. however, for the purposes of this study it is appropriate to divide the methods applicable for investment benchmarking into one-dimensional and multi-dimensional methods.
the simplest way to establish benchmarks is to compare companies according to only one indicator. through such a comparison we obtain a set of companies ordered according to the chosen indicator. the question then concerns the benefits or disadvantages of using an absolute or a relative indicator. when we valuate according to an absolute indicator we obtain a concrete conception, expressed in units, of the selected criterion. a disadvantage of this classification, however, is the absence of information about the input or about the source of the value of the chosen indicator; this absence is removed by assessment according to relative indicators. however, it is also impossible to rely purely on relative indicators: their difficulty rests in the presumption of linearity, which may in reality be misleading. a decision on the manner of valuation will therefore be troublesome, as both solutions imply a certain view of the problem under investigation. assessing the investment efficiency of a company at a given time using several indicators can be difficult when the company achieves different results according to particular indicators. in this case we require methods that can help us make a total assessment of efficiency. this task can be performed by using multi-dimensional methods. the segmentation of these methods is not generally defined; however, the following methods are applicable for presentation purposes:

- multi criteria decision making methods,
- data envelopment analysis.

these two methods enable the investment efficiency of companies to be measured through elective indicators that influence the total investment efficiency of a company. indicators representing efficiency can be of a financial or non-financial character, depending on the branch in which the company operates. it should be emphasized that multi criteria decision making methods are common tools for multi-dimensional valuation, while data envelopment analysis is an alternative approach to valuation. the following sections offer a brief description of multi criteria decision making methods and data envelopment analysis.

1.1 multi criteria decision making methods

a range of multi criteria decision making (mcdm) methods is presented in the literature. they can be categorized into methods based on: determining the order in the particular criteria, determining the base in the particular criteria, valuating the distance from an imaginary object, and pairwise comparison, see dudorkin [1] and fiala [2]. however, for the purposes of this study it is sufficient to choose a method providing reliable and transparent results. a method that provides the required characteristics is the modified score method (msm). this method belongs to the set of methods founded on determining the base in the particular criteria, which is usually the most favorable value, the arithmetic mean or the median. all the values of the given indicator are related to the selected base, so that methods of this kind provide objective results. another advantage of these methods is the easy interpretation of the results and the relatively low demands on calculation. these methods are criticized for giving a positive assessment of companies even for the worst value achieved for a given indicator, which means that the total valuations of the compared companies are not very far apart. the modified score method deals with this problem.
1.1.1 modified score method

the principle of this method is that, for each indicator, we find the company for which the appropriate indicator achieves the maximum value (if growth of the indicator is desirable) or the minimum value (if a decline of the indicator is desirable). such a company receives 100 points for this indicator. the other companies obtain a proportion of the points b_{ij}, valuated by the following formulas:

b_{ij} = \frac{ x_{ij} - x_{j,min} }{ x_{j,max} - x_{j,min} } \cdot 100 (desirable growth of the indicator),

b_{ij} = \frac{ x_{j,max} - x_{ij} }{ x_{j,max} - x_{j,min} } \cdot 100 (desirable decline of the indicator),

where x_{ij} is the value of the j-th indicator of the i-th company, x_{j,max} is the highest value of the j-th indicator and x_{j,min} is the lowest value of the j-th indicator, while i = 1, \ldots, m and j = 1, \ldots, n. the total investment efficiency valuation is equal to the average value of the points awarded, b_i, obtained according to the following formula:

b_i = \frac{1}{n} \sum_{j=1}^{n} b_{ij}, \quad i = 1, \ldots, m.

the total valuation b_i, when using different weights of the indicators, is given by the weighted average

b_i = \frac{ \sum_{j=1}^{n} b_{ij}\, w_j }{ \sum_{j=1}^{n} w_j }, \quad i = 1, \ldots, m.

the highest attainable value of the total assessment is 100 points. this value can be understood as the percentage of the achieved investment efficiency of a company as represented by the selected indicators. it is evident that the overall assessment of a given company will reach the limit of 100 points only if the most favorable values have been achieved for all indicators. since this case is rare in practice, it is appropriate to normalize the total assessment b_i to the interval 0–1.
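a compact implementation of the method (our sketch, with illustrative data only; weights default to equal, as used later in this study, and the final line performs the normalization described in the next paragraph):

import numpy as np

def modified_score(x, maximize, weights=None):
    # x[i, j]: value of indicator j for company i
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(axis=0), x.max(axis=0)
    b = np.where(maximize, (x - lo) / (hi - lo), (hi - x) / (hi - lo)) * 100
    w = np.ones(x.shape[1]) if weights is None else np.asarray(weights, float)
    return b @ w / w.sum()          # total valuation b_i of each company

scores = modified_score([[3.0, 8.0],
                         [5.0, 2.0],
                         [4.0, 4.0]],
                        maximize=[True, False])
relative = scores / scores.max()    # relative valuation on the interval 0-1
print(scores, relative)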
normalization is achieved by relating the total assessment b_i of the appropriate company to the most favorable total assessment b_i across all compared units. this normalization is denoted below as the relative valuation of the modified score method.

1.2 data envelopment analysis

the minimizing criteria in the data envelopment analysis (dea) models are denoted as inputs, and the maximizing criteria as outputs. the weights acknowledged for the particular indicators are analogous to those in multi criteria decision making methods. however, a significant difference between these approaches to valuation is that the weights are not determined by a value-maker but by calculation. in this way less data is required, which is an advantage over multi criteria decision making methods. data envelopment analysis models also have a disadvantage in comparison with multi criteria decision making methods: the high demands on calculation and the limited number of elective inputs and outputs depending on the number of units of comparison. no precise relationship between the number of selected criteria and the number of units has been introduced in the literature. data envelopment analysis includes a range of models varying above all in the presumption about returns to scale. fig. 1 illustrates the acceptable types of returns to scale for only one input and one output.

fig. 1: shape of the efficiency line for various returns to scale (constant, variable, non-increasing and non-decreasing returns to scale; output plotted against input)

fig. 1 shows that, for constant returns to scale, only one unit (company) is efficient, as it lies on the efficiency line. in the case of constant returns to scale, this is a straight line from the origin of coordinates whose slope equals the highest value of the output-to-input ratio over the particular units. all units below the efficiency line are inefficient, and their rate of inefficiency is directly proportional to their distance from the line. it is difficult to determine the character of the returns to scale, which depends on a wide range of factors, so this study applies the ccr model of data envelopment analysis, supposing constant returns to scale. the principle and a mathematical description of this model are stated in the following section.

1.2.1 ccr model

the first data envelopment analysis model was proposed by charnes, cooper and rhodes [3]; it is denoted as the ccr model. the efficiency of the units is valuated by the expression

e_k = \frac{ u_1 y_{1k} + u_2 y_{2k} + \cdots + u_s y_{sk} }{ v_1 x_{1k} + v_2 x_{2k} + \cdots + v_r x_{rk} } = \frac{ \sum_{j=1}^{s} u_j y_{jk} }{ \sum_{i=1}^{r} v_i x_{ik} }, \quad k = 1, \ldots, n,

where e_k is the efficiency of the k-th unit, x_{ik} is the i-th input of the k-th unit, y_{jk} is the j-th output of the k-th unit, v_i is the weight of the i-th input, u_j is the weight of the j-th output, r is the number of inputs, and s is the number of outputs. for each unit, different values of the weights can be set. the purpose is to find the optimal set of weights for which a given unit achieves maximum efficiency. the condition of the calculation is that, for a given set of weights, the efficiency value of all the participating units must be lower than or equal to the upper efficiency limit, which is equal to 1. the optimal set of weights is calculated for each unit.
the mathematical formulation of this exercise is:

maximize h_{k_0} = \frac{ \sum_{j=1}^{s} u_j y_{j k_0} }{ \sum_{i=1}^{r} v_i x_{i k_0} }

subject to

\frac{ \sum_{j=1}^{s} u_j y_{jk} }{ \sum_{i=1}^{r} v_i x_{ik} } \le 1, \quad k = 1, \ldots, n,
u_j \ge 0, \quad j = 1, \ldots, s,
v_i \ge 0, \quad i = 1, \ldots, r,

where h_{k_0} is the efficiency rate of unit k_0, x_{ik} is the i-th input of the k-th unit, y_{jk} is the j-th output of the k-th unit, v_i are the weights assigned to the i-th input and u_j are the weights assigned to the j-th output, while i = 1, \ldots, r, j = 1, \ldots, s and k_0 \in \{1, \ldots, n\}. for the practical application it is necessary to transform this exercise into a standard linear programming exercise. the exercise is transformed using the charnes-cooper transformation, which keeps the weighted sum of the inputs equal to a constant. while calculating the efficiency of a given unit, the maximum efficiency rate could be achieved by entirely suppressing some of the inputs or outputs. to prevent this, we impose the infinitesimal constant \varepsilon, with the help of which we establish the lowest limit of the input and output weights. the value of the constant \varepsilon depends on the values of the applied inputs and outputs with regard to the charnes-cooper transformation. supplementing the exercise presented here with this transformation and the constant \varepsilon, we obtain the primal ccr model oriented on inputs. the model is formulated as follows:

maximize z_{k_0} = \sum_{j=1}^{s} u_j y_{j k_0}

subject to

\sum_{j=1}^{s} u_j y_{jk} - \sum_{i=1}^{r} v_i x_{ik} \le 0, \quad k = 1, \ldots, n,
\sum_{i=1}^{r} v_i x_{i k_0} = 1,
u_j \ge \varepsilon, \quad j = 1, \ldots, s,
v_i \ge \varepsilon, \quad i = 1, \ldots, r,

where z_{k_0} is the efficiency of unit k_0; the other symbols are as in the previous model.
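the input-oriented ccr model above is an ordinary linear program and can be solved with any lp solver. the following sketch is ours (independent of the ms excel solver implementation used in this study) and uses scipy on a toy single-input, single-output example:

import numpy as np
from scipy.optimize import linprog

def ccr_efficiency(X, Y, k0, eps=1e-6):
    # input-oriented ccr model for unit k0; X: (r, n) inputs, Y: (s, n) outputs
    (r, n), s = X.shape, Y.shape[0]
    c = np.concatenate([-Y[:, k0], np.zeros(r)])             # maximize u . y_k0
    A_ub = np.hstack([Y.T, -X.T])                            # u.y_k - v.x_k <= 0
    A_eq = np.concatenate([np.zeros(s), X[:, k0]])[None, :]  # v . x_k0 = 1
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(n),
                  A_eq=A_eq, b_eq=[1.0],
                  bounds=[(eps, None)] * (s + r))
    return -res.fun

X = np.array([[2.0, 4.0, 3.0]])   # one input for three units
Y = np.array([[1.0, 3.0, 2.0]])   # one output
print([round(ccr_efficiency(X, Y, k), 3) for k in range(3)])
# -> [0.667, 1.0, 0.889]; only the second unit lies on the efficiency line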
2 comparative analysis of companies

the previous section presented a theoretical description of the multi criteria decision making method and data envelopment analysis, which provide ways to evaluate the investment efficiency of companies. however, these tools for supporting a valuation of investment efficiency require relevant assessment criteria.

2.1 selection of subjects for comparison

emphasis was placed on selecting comparable companies on the basis of functioning in the same branch and having concurrent activities. the comparative analysis comprised 47 companies dealing with power distribution. the set comprises one company originating in the united states of america and 46 european companies, 35 of which come from the european community. the analysis includes 3 majority distribution companies acting in the czech republic: čez distribuce, a. s., e.on distribuce a. s., and pražská energetika a. s.

2.2 criteria selection

both the ccr model and the modified score method enable criteria to be selected. due to this common character, the same criteria could be assessed for the ccr model and for the modified score method. table 1 contains the set of selected criteria and shows the use of financial and non-financial indicators. this selection aims to raise the credibility of the valuation, which can be distorted when financial indicators are used, due to incompatible accounting practices, see sůvová et al. [4]. since the basic models of data envelopment analysis require value comparability of the selected inputs and outputs, absolute indicator units are applied; these express the numerator or denominator of the selected criteria. it should be noted, however, that the values of these absolute indicators were assessed as arithmetical averages of the values for the period 2003–2005. all the data needed for evaluating the selected criteria were ascertained from the annual reports of the companies.

table 1: valuation criteria for investment efficiency

inputs (minimization criteria):
x_1 = investments [mil. czk] / assets [mil. czk]
x_2 = investments [mil. czk] / grid extended length [km]
x_3 = operating expenses [mil. czk] / electricity distribution [gwh]
x_4 = electricity losses [gwh] / electricity distribution [gwh]

outputs (maximization criteria):
y_1 (= x_5) = operating profit [mil. czk] / investments [mil. czk]
y_2 (= x_6) = electricity distribution [gwh] / investments [mil. czk]
y_3 (= x_7) = number of connections [-] / assets [mil. czk]

primal models are rarely found in applications of data envelopment analysis models. dual models are frequently used, because they contain a smaller number of constraints as well as additional variables that correct the calculation. this study therefore uses the dual ccr model, which is defined, e.g., in dlouhý, jablonský [5]. due to the use of the charnes-cooper transformation and the relatively low input and output values, the infinitesimal constant \varepsilon was set to a value of 10^{-3}. for the solution of the exercises, the ccr model was created in ms excel using the solver tool. the weights applied in the modified score method are equal, due to the difficulty of deciding which indicators are more important.

3 conclusions

the modified score method and the ccr model of data envelopment analysis were used to obtain the results presented in appendix i of this paper. the companies receive an assessment in the range from 0 to 1, where the best result is 1. the results of the ccr model show that eight companies achieve the best valuation. the efficiency line is set by the following companies: egl, eidsiva energinett, electrabel, hafslund nett, latvenergo, statkraft, vattenfall and verbund. the least favorable valuation of the ccr model is given to esb. the modified score method selects hafslund nett as the best of all; hafslund nett is also in the group of the best-rated companies by the ccr model. the least favorable valuation on the basis of the modified score method was given to nuon. the arithmetical average of the relative valuation by the modified score method is 0.695, while in the ccr model the average value is 0.535. a comparison of the results of the two methods on the basis of relative valuation shows that there are considerable differences. for lucidity and for the purposes of analysis it is more suitable to put the companies into an order within the appropriate method. appendix ii of this paper is a graph that illustrates the results of both methods in terms of the achieved order. it should be emphasized that the more favorable the relative valuation is, the lower the position of the company in the ranking order. in order to compare the results of the methods based on an assessment order it was necessary to use a so-called associated order, which uses the arithmetical mean of the order numbers; this means that the companies that occupy the same position are placed in an associated order. all eight best-assessed companies according to the ccr model are given a score of 4.5, since (1+2+3+4+5+6+7+8)/8 gives this order value. a comparison of the results introduced here shows that the two methods provide valuations that are not very significantly different, because most of the companies that are differently ordered are low in both rankings. when assessing the results presented here, it is necessary to note a potential threat to the companies that have achieved a favorable valuation.
this result could be due, e.g., to an unreasonably low level of investment in a given period, which can reduce the competitive advantage of that company in the future. the best situation is if a company has slightly above-average values of investment efficiency. when this average value is included in the total assessment order list, we find that the average relative valuation using the modified score method corresponds to the 22nd position (blue line in the figure presented in appendix ii), while the average relative valuation in the ccr model corresponds to the 23rd position (red line in the figure presented in appendix ii). as far as distribution companies operating in the czech republic are concerned, the figure in appendix ii shows that these companies are placed slightly above the average for both methods. we can therefore consider the investment efficiency of most distribution companies operating in the czech republic to be satisfactory.

acknowledgments

the author would like to express his gratitude to his supervisor, prof. ing. oldřich starý, csc., for valuable comments and contributions.

references

[1] dudorkin, j.: systémové inženýrství a rozhodování (systems engineering and decision making; in czech). čvut, praha, 2003.
[2] fiala, p.: modely a metody rozhodování (models and methods of decision making; in czech). oeconomia, praha, 2003.
[3] charnes, a., cooper, w. w., rhodes, e.: measuring the efficiency of decision making units. european journal of operational research, 1978, p. 429–444.
[4] sůvová, h. et al.: finanční analýza v řízení podniku, v bance a na počítači (financial analysis in company management, in a bank and on a computer; in czech). bankovní institut, praha, 2000.
[5] dlouhý, m., jablonský, j.: modely hodnocení efektivnosti produkčních jednotek (models for evaluating the efficiency of production units; in czech). professional publishing, praha, 2004.

ing. mirza karajica, karajm1@fel.cvut.cz, dept. of economics, management and humanities, czech technical university in prague, faculty of electrical engineering, technická 2, 166 27 praha, czech republic

appendix i: msm & dea ccr comparison by relative valuation (bar chart of the relative valuations, 0.0–1.0, of all 47 companies under the msm and the dea ccr model)
appendix ii: msm & dea ccr comparison by ranking (chart of the msm and dea ccr ranking positions of all 47 companies)

design optimization of moveable moment stabilization system for access crane platforms

kemal ermis (a, *), mehmet caliskan (a), muammer tanriverdi (b)

a) sakarya university of applied science, technology faculty, department of mechanical engineering, esentepe campus, 54187 serdivan-sakarya, turkey
b) yilmaz machine company, imes-5 avenue, no: 17, 41455, dilovası-kocaeli, turkey
*) corresponding author: ermis@subu.edu.tr

acta polytechnica 61(1):219–229, 2021. https://doi.org/10.14311/ap.2021.61.0219

abstract. the popularity of aerial work platforms is rapidly increasing in the mechanization industry. as a result, the safety and structural strength of aerial work platforms should be prioritized. in this study, the mathematical model of a reconstructed aerial work platform was developed and a 3d model was created using the solidworks software. a dynamic analysis was then performed to improve various structural parameters of the aerial work platform. the analysis was carried out using solid modelling, finite elements, and dynamic transient analysis. in compliance with international structural standards, the weight distribution was reconstructed after placing a mass behind the turret. the results of the dynamic transient analysis were compared with the mathematical model and validated. then, the effect of the mass placed behind the turret on the machine was examined. the lateral tipping distance of the static work platform was found to have increased from 15.9 m to 17.08 m. the structure of the aerial work platform was improved using a structural and dynamic analysis approach. it was also discovered that the machine efficiency could be further increased by ensuring that the balancing weight is moved further away from the tower centre by a hydraulic-based system and controller.

keywords: dynamic analysis, structural analysis, mechanics, aerial work platform.

1. introduction

nowadays, aerial work platforms are designed to provide people with a means to reach heights not feasible otherwise and are rapidly rising in popularity in the mechanization sector [1, 2]. previously, this task was carried out using attachments placed on mobile cranes [3]. however, following the ban of this practice, the use of aerial work platforms has become mandatory and has led to the increased production of aerial work platform machines [4]. initially, aerial work platforms were put into production to replace scissor lifts, previously used to reach fruit trees when the users of these lifts needed a higher reach. as the use of aerial work platforms has become more widespread, their areas of use have also expanded.
nowadays, aerial work platforms are found in various industrial sectors and are used to carry out jobs such as cleaning the exterior surfaces of high-rise buildings, assembling roof systems, connecting electric poles, surface treatments in shipyards, and firefighting operations [2]. variations in the construction of aerial work platforms include articulated, scissor-lift, and telescopic platforms [5–7]. these aerial work platforms have become increasingly popular in the mechanization sector, which means that these machines must provide the elevation required by consumers' height demands while remaining cost-competitive in narrow market conditions. there are many producers of small and medium-sized aerial work platforms around the world; however, with the expansion of the international market, the pressure on the producers of aerial work platforms has increased rapidly.

2. the general structure of aerial work platforms

aerial work platforms generally consist of a carrier vehicle, chassis, turret, booms, and basket sections [8, 9]. nowadays, when manufacturing aerial work platforms, the goal is to reach a vertical height of 26 meters, as shown in figure 1.

figure 1. a typical telescopic aerial work platform that has a 26-meter extension.

damage to aerial work platforms mainly results from pushing the machine past its limits along the horizontal axis [10]. when users first select the machine they need, they consider the total working height capacity, the lateral extension limit, and the total weight [11, 12]. the two parameters that limit the lateral extension capacity of the machine, affect the stability at the outrigger tipping points and act as counterweights are the platform chassis and the mass of the turret [10, 13]. the total weight of the aerial work platform produced determines the specification of the vehicle under it [14, 15]. the limitations in the carrying capacity of the carrier vehicles, as well as the total cost of these machines, affect the users of these vehicles. increasing the weight capacity and the cost of the carrier vehicle leads to an increase in the travel costs of these aerial work platforms, which are used as mobile vehicles. in general, the design of the machine determines the cost, and users select these carrier vehicles based on their design. in the aerial work platform manufacturing industry, manufacturers often avoid compromising safety [1]; they are especially reluctant when it comes to making changes to the boom structure. however, with the machinability of high-strength steel sheets and the improved bending ability and controlled brittleness of high-yield-strength hard materials, it is possible to produce low-weight booms and thus more effective machines [7, 16]. it is obvious that there should not be any restrictions placed on the improvement and development of each of the elements that constitute these machines, and this should be an important factor considered by internationally prominent manufacturers. in addition, the structural strengths of the machines have been examined in international studies [4, 17, 18], and in one of these studies improvements were made for the ht26 [5].
in some studies, structural improvements have been achieved by examining the effects of bearing loads obtained directly from dynamic modelling of the machines [11]. in this study, building on previous studies and experiments [19, 20], improvements will be made through realistic systems constructed using advanced multi-body dynamic system modelling (msc adams®). all standards, except for the wind and hand forces, will be applied according to the en 280 [8] regulations. all the behaviours of the constructed structure will be examined. in this study, the working capacity of aerial work platforms, improvements to their working capacity, and suggestions for how these improvements could be further amplified are examined in detail.

3. work method

the work method consists of: the outrigger contact force function formula with physical properties, the improvement of the aerial work platform structure, the development of the mathematical model of the aerial work platform, and the process of structural and dynamic structural analysis. the methodology of this study is shown in figure 2 (flow chart of this study). the 3d modelling of the aerial work platform was performed using the solidworks software, based on the mathematical model results for the initial and boundary conditions of the geometry. the completed model was then exported to a parasolid file. the cad data were used to prepare the mesh structure using the msc apex software, and a nastran document file was then created to be used for the finite element analysis. in the finite element analysis section, the defined starting and boundary conditions were applied in the msc simxpert software, and a modal analysis was performed to create a data file that was used for the dynamic transient analysis performed using the msc adams software. the results of the finite element analysis were compared with the theoretical maximum stress and deflection.

3.1. outrigger contact force function formula

the impact function has seven variables, which all correspond to properties of the physical world:

$$\mathrm{impact}(x,\, \dot{x},\, x_1,\, k,\, e,\, c_{\max},\, d) \qquad (1)$$

where, with the source values, k is the stiffness (10e+8 n/m), e is the force exponent (2.2 for steel), c_max is the maximum damping (10e+4 n·s/m) and d is the penetration depth (10e−2 mm). the resulting one-sided contact force is

$$f = \begin{cases} 0, & x > x_1 \\ k\,(x_1 - x)^{e} - c_{\max}\,\dot{x}\;\mathrm{step}\big(x,\; x_1 - d,\; 1,\; x_1,\; 0\big), & x \le x_1 \end{cases} \qquad (2)$$

(a numerical sketch of this force law follows the assumptions list below.) the assumptions taken into account in the dynamic analysis process are as follows:
• all construction elements consist of rigid elements.
• the effect of the wind load on the basket platform is not taken into account.
• the machine is able to move with a certain number of degrees of freedom, by defining a "spherical joint" on the foot shoes of the platform with the basket towards the tipping direction.
• the chains on the machine booms are eliminated; the motion equations that model these chains are applied to the booms, and the operation of the machine is modelled.
• a gravitational acceleration of 9.81 m/s² is applied on the structure vertically, along the y-axis.
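as a rough illustration of the contact force law in equation (2), the following python sketch evaluates the impact function with the parameter values quoted above; the smooth step routine is a cubic approximation of the msc adams step() function, and the reading of the quoted values in si units is our assumption, not part of the source.

# minimal sketch of the one-sided impact contact force of eq. (2);
# parameter values follow the paper, si interpretation is our assumption.

def smooth_step(x, x0, h0, x1, h1):
    """cubic smoothing between (x0, h0) and (x1, h1), similar to adams step()."""
    if x <= x0:
        return h0
    if x >= x1:
        return h1
    t = (x - x0) / (x1 - x0)
    return h0 + (h1 - h0) * t * t * (3.0 - 2.0 * t)

def impact_force(x, x_dot, x1=0.0, k=10e8, e=2.2, c_max=10e4, d=1.0e-4):
    """contact force: zero above the trigger distance x1, hertz-like stiffness
    plus ramped damping below it (x in m, x_dot in m/s, d = 0.1 mm = 1e-4 m)."""
    if x > x1:
        return 0.0
    stiffness_term = k * (x1 - x) ** e
    # damping is phased in over the penetration depth d to avoid a force jump
    damping_term = c_max * x_dot * smooth_step(x, x1 - d, 1.0, x1, 0.0)
    return stiffness_term - damping_term

if __name__ == "__main__":
    for penetration_mm in (0.0, 0.05, 0.1, 0.2):
        x = -penetration_mm * 1e-3  # negative x = penetration past the trigger
        print(f"penetration {penetration_mm} mm -> force {impact_force(x, 0.0):.2f} n")

the force exponent e = 2.2 makes the stiffness term hertz-like rather than linear, which is why the force grows faster than the penetration depth.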
3.2. improvement of aerial work platform structure

the aim of this study is to increase the working efficiency of aerial work platforms through lateral extension. figure 3 (statically calculated lateral working capacity of the aerial work platform, msc adams®) shows the lateral extension of the 26-meter platform, and figure 4 (statically limited working displacement of the aerial work platform's lateral extension, msc adams®) the resulting working limits. with the contact points of the vertical outriggers taken as the axis of the tipping moment, the working limits of the machine can be determined by calculating the moment based on the 250 kg basket load. the weights of the platform components, with the moment arm distances given in figure 3, are as follows: basket and load (250 + 120 kg), booms (1 380 kg), turret (430 kg), machine chassis (1 120 kg), and vehicle mass (2 830 kg). in a conventional design process, the tipping moment and the working diagram of the machine are determined from basic static calculations. according to these static loads, the construction tipping limit was calculated to be 19.1 m (msc adams®, fig. 3). however, these limits are also constrained by the structural analysis previously carried out on the booms, where the safety factor was limited to a maximum of 2 and the boom could be extended up to 15 meters. based on this limit value, a safety coefficient of 1.2 was used so that the structure could statically remain under control. after production, the limitations on lateral extension are generated electronically by controlling the pistons, as shown in figure 4.

3.3. mathematical model of aerial work platform

the mathematical model of the aerial work platform was used to define the initial and boundary conditions for the numerical analysis. the mathematical model results also provide limit values for the numerical analysis. these limits are important in ensuring safety and control at every step of the numerical analysis. the extendable boom's free body diagram is shown in figure 5, where $l_1$ is the extendable boom length, $l_2$ is the length of the basket arm, $\theta$ is the angle between the boom and the horizontal, $m_1$ is the total mass of the booms, $m_2$ is the mass of the basket arm, $m_3$ is the load in the basket, $i_1$ is the mass moment of inertia of the main boom and $i_2$ is the mass moment of inertia of the basket arm.

instead of forces, lagrangian mechanics uses the energies, described by the difference between the kinetic energy (ke) and the potential energy (pe):

$$l = ke - pe \qquad (3)$$

the non-relativistic lagrangian for a system of particles can be defined by

$$l = t - v \qquad (4)$$

where t is the total kinetic energy of the system's motion and v is the potential energy of the system. the kinetic energy depends on time. using the generalized coordinates $(\theta, l_1)$, the following equations can be derived:

$$\frac{d}{dt}\left(\frac{\partial t}{\partial \dot{\theta}}\right) - \frac{\partial v}{\partial \theta} = m_3 g \qquad (5)$$

$$\frac{d}{dt}\left(\frac{\partial t}{\partial \dot{l}_1}\right) - \frac{\partial v}{\partial l_1} = m_3 g \qquad (6)$$

the coordinates can be calculated as follows:

$$x = l_1 \cos(\theta) \qquad (7)$$

$$y = l_1 \sin(\theta) \qquad (8)$$

the kinetic energy of the system is given by

$$t = \frac{1}{2} m_1 \left(\dot{x}^2 + \dot{y}^2\right) + \frac{1}{2} i_1 \dot{\theta}^2 + \frac{1}{2} m_1 \dot{l}_1^2 + \frac{1}{2} m_2 \dot{y}^2 + \frac{1}{2} m_3 \dot{y}^2 + \frac{1}{2} i_2 \dot{y}^2 \qquad (9)$$

and the potential energy of the system by

$$v = m_1 g \frac{l_1}{3} \sin(\theta) + m_2 g \left(l_1 \sin(\theta) - l_2\right) + m_3 g \left(l_1 \sin(\theta) - l_2\right) \qquad (10)$$

$$t - v = m_3 g \qquad (11)$$

for $\theta$:

$$\frac{d}{dt}\left(\frac{\partial t}{\partial \dot{\theta}}\right) - \frac{\partial v}{\partial \theta} = m_3 g \qquad (12)$$

$$i_1 \ddot{\theta} - \left(m_1 g \frac{l_1}{3}\right)\cos(\theta) + m_2 g\, l_1 \cos(\theta) + m_3 g \sin(\theta) = m_3 g \qquad (13)$$

for $l_1$:

$$\frac{d}{dt}\left(\frac{\partial t}{\partial \dot{l}_1}\right) - \frac{\partial v}{\partial l_1} = m_3 g \qquad (14)$$

$$m_1 \ddot{l}_1 - \big(m_1 g \sin(\theta) + m_2 g \sin(\theta) + m_3 g \sin(\theta)\big) = m_3 g \qquad (15)$$

using the above equations, the following equations of motion are found:

$$i_1 \ddot{\theta} - g \cos(\theta)\left\{\frac{l_1}{3} m_1 + m_2 l_2 + m_3 l_1\right\} = m_3 g \qquad (16)$$

$$m_1 \ddot{l}_1 - g \sin(\theta)\left\{\frac{1}{3} m_1 - m_2 - m_3\right\} = m_3 g \qquad (17)$$
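the symbolic manipulation behind equations (9)–(17) can be reproduced with a computer algebra system; the following sympy sketch is a minimal illustration built on the paper's definitions of t and v (including the i2·ẏ² term and the m3·g right-hand side exactly as printed), and is not the authors' actual derivation tool.

# minimal sympy sketch of the euler-lagrange expressions for the boom model,
# using t and v exactly as defined in eqs. (9) and (10); illustrative only.
import sympy as sp

tt = sp.symbols('t')  # time; named tt because t denotes kinetic energy here
m1, m2, m3, i1, i2, l2, g = sp.symbols('m1 m2 m3 i1 i2 l2 g', positive=True)
theta = sp.Function('theta')(tt)
l1 = sp.Function('l1')(tt)

x = l1 * sp.cos(theta)   # eq. (7)
y = l1 * sp.sin(theta)   # eq. (8)

# kinetic energy, eq. (9); the i2*ydot^2 term is kept as printed in the paper
t_kin = (sp.Rational(1, 2) * m1 * (x.diff(tt)**2 + y.diff(tt)**2)
         + sp.Rational(1, 2) * i1 * theta.diff(tt)**2
         + sp.Rational(1, 2) * m1 * l1.diff(tt)**2
         + sp.Rational(1, 2) * (m2 + m3 + i2) * y.diff(tt)**2)

# potential energy, eq. (10)
v_pot = (m1 * g * l1 / 3 * sp.sin(theta)
         + (m2 + m3) * g * (l1 * sp.sin(theta) - l2))

# left-hand sides of eqs. (5) and (6), to be compared with eqs. (16) and (17)
lhs_theta = sp.diff(t_kin.diff(theta.diff(tt)), tt) - v_pot.diff(theta)
lhs_l1 = sp.diff(t_kin.diff(l1.diff(tt)), tt) - v_pot.diff(l1)

print(sp.simplify(lhs_theta))
print(sp.simplify(lhs_l1))

printing and simplifying the two left-hand sides allows a term-by-term comparison with the closed forms in equations (16) and (17).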
3.4. the process of structural and dynamic structural analysis

the msc adams® multi-body dynamic system modelling program was used to construct the structure. in this dynamic model, the structure was created with rigid connectors; instead of the balancing piston, which allows the basket to run parallel to the ground plane during machine operation, a primitive joint was attached to the basket, thus allowing the basket to always move perpendicular to the ground plane. the lateral working capacity of the aerial work platform was calculated using the model prepared in the msc adams dynamic system modelling program (fig. 3). in accordance with the method applied, the lateral working capacity of the aerial work platform was calculated to be 19.1 meters. as shown in figure 3, the outrigger contact force was determined to be zero at the time of 160 s. if the analysis were to be continued past this point, the outrigger contact force would become negative, which would indicate that the outrigger is no longer in contact with the ground. the value observed on the graph at the 160 s mark is 1.918e+4 mm, i.e. 19 180 mm, which corresponds to approximately 19.1 m. during the machine manufacturing process, this value is divided by a safety coefficient of 1.2, as a more reliable static working capacity is preferred. the final lateral working capacity was consequently determined to be 15.9 meters.

during the creation of the dynamic model, a 250 kg basket load and a 2 780 kg vehicle load were applied to the model. the vehicle load was placed at the vehicle's centre of gravity, which is defined as the centre of mass based on the connection point between the machine and the vehicle, and it was connected to the structure with an appropriate connection element (spherical joint), as shown in figure 6 (definition of the vehicle weight by a point mass using rbe2 rigid connectors, msc adams®). chassis-vehicle connection plates were modelled as welded on the machine chassis before proceeding to the structural analysis process, so that the prepared mesh elements could be used in msc adams®, as shown in figure 7 (the mesh design of the chassis structure with 2d elements, created using msc apex®).

the results showed that the maximum lateral extension distance of the machine under rigid conditions was 15.9 meters. as further improvements are made to the structure of the machine, it is assumed that the booms are composed of rigid elements. however, the effect of this distance on the vehicle chassis varies according to the behaviour of the material and structure used. the elements of the machine chassis are composed of sheet metal elements; using the mid-surface feature, 2-dimensional elements were formed. the msc apex® commercial software was used to model the entire structure in detail during the fem geometry creation phase. welded regions were rearranged by creating new mesh elements. the process of creating an mnf document after the modal analysis includes the preparation of the load transfer nodes in the msc simxpert® software (aset), the creation of a new connector node suitable for these nodes, and the material definition, as shown in figure 8 (creation of the aset locations on the flexible model using msc simxpert®).
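to make the static capacity calculation above concrete, the following sketch balances overturning and restoring moments about the outrigger tipping line, using the component masses quoted in section 3.2; the moment arm distances are hypothetical placeholders (the real values are read from figure 3), so the printed limit will not reproduce the 19.1 m figure exactly.

# minimal static tipping balance about the outrigger contact line; masses follow
# section 3.2, moment arms are hypothetical stand-ins for the figure 3 values.
G = 9.81  # m/s^2

# restoring side: (component, mass in kg, assumed horizontal arm in m)
restoring = [
    ("machine chassis", 1120.0, 1.0),   # arm values are assumptions
    ("turret", 430.0, 1.2),
    ("vehicle", 2830.0, 1.8),
]

def max_basket_reach(basket_and_load_kg=370.0, boom_kg=1380.0):
    """largest horizontal reach r (m) of the basket beyond the tipping line for
    which the net moment stays restoring; the boom mass is lumped at r/2."""
    m_restore = sum(m * G * arm for _, m, arm in restoring)
    # overturning moment at reach r: basket at r, boom centroid assumed at r/2;
    # solve m_restore = basket*g*r + boom*g*r/2 for r
    return m_restore / (G * (basket_and_load_kg + boom_kg / 2.0))

if __name__ == "__main__":
    r_tip = max_basket_reach()
    print(f"tipping limit  : {r_tip:.2f} m")
    print(f"with 1.2 factor: {r_tip / 1.2:.2f} m")  # cf. 19.1 m -> 15.9 m in the text

dividing the computed limit by the same 1.2 safety coefficient mirrors the step used in the text to arrive at the 15.9 m working capacity.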
"structural steel" was defined as the material used for the construction of the structure, and the st52 steel parameter was taken into consideration for the yield strength (353 mpa), as shown in figure 9 (definition of the aset, which provides the transfer of forces between the rigid and flexible parts, created using msc simxpert®). structural steel specifications: young's modulus: 210 000 mpa, poisson's ratio: 0.3, and density: 7.85×10⁻⁶ kg/mm³. the obtained flexible chassis model was applied to the previously prepared dynamic model. the chassis model, which had previously been solved dynamically, was solved again with the flexible chassis structure. the lateral working limit of the machine was calculated by considering the yield strength and the safety factor of the material used, as shown in figure 10 (rigid dynamic model of the aerial work platform created using msc adams®).

as another parameter, the effect that the balance weight (500 kg), placed behind the turret to increase the machine's lateral extension, had on the working efficiency of the machine was examined. the balancing weight, defined as a point mass, was placed 1 125 mm away from the centre of the turret so that it did not interfere with the rotation of the machine behind the turret. the balancing weight was positioned at the minimum distance from the centre of the turret, after taking into consideration the volume to be covered with steel (which has a weight of 500 kg and a density of 7.85 g/cm³), as shown in figure 11.

in this model, constructed using dynamic structural modelling, the working limit of the machine was determined by taking the chassis strength, the material used, and the safety coefficient into consideration, as shown in figure 12 (the structural effect of the lateral extension of the platform on the chassis, msc adams). in the model, the weight of the vehicle is defined as a point mass with all the structure weights and inertial behaviours. welded connection plates are used on the chassis for the distribution of the vehicle weight, as well as for more realistic load distributions. these connections are assembled around a single node with rbe2 elements (fig. 7). the effect of the balancer weight placed behind the turret on the working efficiency of the machine is observed for the sample shown in figure 13 (sectorial application of the balancing weight). using a 500 kg load as a balancer weight placed behind the turret, positioned 1 125 mm away from the centre of the turret, increased the aerial work platform's lateral working capacity to 20.5 meters.

4. results and discussion

as a result of this analysis and other working scenarios, low load distributions were observed on the machine chassis in the section extending to the rear outriggers. it was determined that the machine work efficiency could be increased by distributing this load over the whole machine and moving the centre of gravity closer to the turret rotation centre. as a result, the lateral working capacity of the statically improved aerial work platform was determined to be 17.08 meters when a safety factor coefficient of 1.2 was used in the calculations. the effects of the elongation and shortening of the boom of the crane were analysed. the contact force graph was created to determine the lateral tipping limit of the vehicle in line with the data obtained from the dynamic model, as shown in figure 14 (lateral (elongation) overturning moment calculation). according to the results, the lateral distance between the tip of the machine basket and the tower-boom rotation axis was determined to be approximately 18.5 meters.
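the contribution of the 500 kg counterweight described above can be estimated with a simple moment increment; the sketch below treats it as a point mass at the quoted 1 125 mm offset and, as a purely illustrative simplification of our own, converts the added restoring moment into extra basket reach for the 370 kg basket-and-load case.

# rough estimate of the extra reach bought by the turret counterweight;
# mass and offset follow the paper, the conversion to reach is our simplification.
G = 9.81             # m/s^2
M_BALANCE = 500.0    # kg, counterweight behind the turret
ARM_BALANCE = 1.125  # m, offset from the turret centre (1 125 mm)
M_BASKET = 370.0     # kg, basket + 250 kg rated load

added_restoring_moment = M_BALANCE * G * ARM_BALANCE   # n.m about the tipping line
extra_reach = added_restoring_moment / (M_BASKET * G)  # m, basket load only

print(f"added restoring moment: {added_restoring_moment / 1000:.2f} kn.m")
print(f"extra basket reach    : {extra_reach:.2f} m")
# doubling the offset with a movable table would double both figures, which is
# the motivation for the hydraulic table discussed later in the text.

with these numbers the estimate comes out at roughly 1.5 m of extra reach, the same order of magnitude as the 15.9 m to 17.08 m improvement reported in the text, though the two calculations are not directly comparable.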
however, this distance was further limited by approximately 0.3 percent, providing a safer operation and better structural protection of the machine.

figure 15: mode shape of the booms at a frequency of 12 hz.
figure 16: the effect of the hydraulic table stroke distance on the boom extension.

with the results obtained, the maximum lateral extension distance of the machine under rigid conditions was determined to be 18.5 meters. the effect of a counterbalance weight (500 kg) behind the tower, which can be used to increase the efficiency of the machine's lateral extension distance, on the machine work efficiency has been investigated. the effect of the tower-back balancer weight on the machine's lateral work efficiency is observed in figure 14. accordingly, the machine shows a greater tipping limit distance, namely 20.5 meters, thanks to the 500 kg load used as a balancing weight. according to this result, with a safety ratio of 0.3, the lateral working capacity of the machine can be increased up to 15 meters. this value of the lateral working capacity was obtained with the balancer weight behind the tower stationary. it can be increased by using a movable hydraulic table placed on the balancing weight base. the effect of the 1 000 mm hydraulic moving table stroke on the platform's lateral capacity can be seen in figure 15.

the natural frequencies of the booms are critical in generating resonances, which can damage the machine. the truck was placed under the aerial work platform and ran at approximately idle speed; this idle speed range is between 700 and 1 000 rpm. the effect of the balancing weight on the displacement of the boom extension with the use of the movable table is observed in figure 16. the use of a counterweight on the back of the turret helps to improve the rigidity of the turret. when the working platform is at the top, the dynamic load will affect the turret more aggressively. therefore, along with the structural improvements and weight optimizations that can be made on aerial work platforms, balancing weights can also be used to optimize the results.
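as a quick plausibility check on the resonance remark above, the following sketch converts the quoted idle speed range to an excitation frequency and compares it with the 12 hz boom mode from figure 15; the once-per-revolution excitation and the 10 % proximity margin are illustrative assumptions, not values from the paper.

# idle-speed excitation vs. the 12 hz boom mode; the 10 % margin is illustrative.
BOOM_MODE_HZ = 12.0          # first boom mode, figure 15
IDLE_RPM = (700.0, 1000.0)   # truck idle speed range quoted in the text
MARGIN = 0.10                # assumed proximity threshold

for rpm in IDLE_RPM:
    f_excitation = rpm / 60.0  # one excitation per revolution (assumption)
    close = abs(f_excitation - BOOM_MODE_HZ) / BOOM_MODE_HZ < MARGIN
    print(f"{rpm:6.0f} rpm -> {f_excitation:5.2f} hz"
          f" {'(near the 12 hz boom mode)' if close else ''}")

the lower end of the idle range (700 rpm, about 11.7 hz) falls very close to the 12 hz boom mode, which is exactly why the natural frequencies of the booms matter for this machine.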
from the results obtained using the flow diagram seen in figure 16, the lateral tipping distance of the aerial work platform, statically calculated to be 15.9 meters, was extended to reach up to 17.08 meters. it was observed that the machine efficiency could be further increased by ensuring that the balancing weight is moved further away from the tower centre by a hydraulic-based system and controller (the structure of the boom has been considered rigid). from these results, the effect of the weight optimization of the machine chassis, tower, elevation box, and outriggers, by means of weight and structure improvements of the machine, has been determined. by taking these results into consideration, it will be possible to provide a more secure and more efficient operation of aerial work platforms, which are used to complete many tasks. future studies can examine the aerodynamic effects of different air velocities and directions on aerial work platforms.

acknowledgements
the authors wish to thank bias engineering for their substantial help with the dynamic system modelling.

references
[1] h. hu, e. li, x. zhao, et al. modeling and simulation of folding-boom aerial platform vehicle based on the flexible multi-body dynamics. in international conference on intelligent control and information processing, pp. 798–802. ieee, dalian, china, 2010. doi:10.1109/icicip.2010.5565257.
[2] m. sezer, m. kalyoncu. cage levelling control of truck-mounted hydraulic aerial work platforms. selcuk university journal of engineering, science and technology 2(2):1–9, 2014. doi:10.15317/scitech.201426889.
[3] r. g. dong, c. s. pan, j. j. hartsell, et al. an investigation on the dynamic stability of scissor lift. open journal of safety science and technology 2(1):8–15, 2012. doi:10.4236/ojsst.2012.21002.
[4] a. yucel, a. arpaci. analytical and experimental vibration analysis of telescopic platforms. journal of theoretical and applied mechanics 54(1):41–52, 2015. doi:10.15632/jtam-pl.54.1.41.
[5] m. karahan. design and finite element analysis of two levels telescopic crane. master's thesis, atatürk university, erzurum, turkey, 2007.
[6] h. marjamäki, j. mäkinen. modelling a telescopic boom, the 3d case: part ii. computers & structures 84(29–30):2001–2015, 2006. doi:10.1016/j.compstruc.2006.08.010.
[7] a. trabka. dynamics of telescopic cranes with flexible structural components. international journal of mechanical sciences 88:162–174, 2014. doi:10.1016/j.ijmecsci.2014.07.009.
[8] cen en 280: mobile elevating work platforms, design calculations, stability criteria, construction, safety, examinations and tests. standard, european committee for standardization, 2013.
[9] m. a. m. nor, h. rashid, w. m. f. w. mahyuddin, et al. stress analysis of a low loader chassis. procedia engineering 41:995–1001, 2012. doi:10.1016/j.proeng.2012.07.274.
[10] s. m. bošnjak, n. b. gnjatović, d. b. momčilović, et al. failure analysis of the mobile elevating work platform. case studies in engineering failure analysis 3:80–87, 2015. doi:10.1016/j.csefa.2015.03.005.
[11] j. guo, h. he, c. sun. analysis of the performance of aerial work platform working device based on virtual prototype and finite element method. energy procedia 104:568–573, 2016. doi:10.1016/j.egypro.2016.12.096.
[12] e. maleki, b. pridgen, j. q. xiong, w. singhose. dynamic analysis and control of a portable cherrypicker. in asme 2010 dynamic systems and control conference, pp. 477–482. 2010. doi:10.1115/dscc2010-4241.
[13] s. he, m. ouyang, j. gong, g. liu. mechanical simulation and installation position optimisation of a lifting cylinder of a scissors aerial work platform. the journal of engineering 2019(13):74–78, 2019. doi:10.1049/joe.2018.8961.
[14] l. shanzeng, z. lianjie. kinematics and force analysis of lifting mechanism of detachable container garbage truck. the open mechanical engineering journal 8(1):219–233, 2014. doi:10.2174/1874155x01408010219.
[15] t. erdol. design, analysis of the gantry crane with finite element method and box girder optimization. master's thesis, gebze institute of technology, gebze, turkey, 2007.
[16] b. posiadala, d. cekus. discrete model of vibration of truck crane telescopic boom with consideration of the hydraulic cylinder of crane radius change in the rotary plane. automation in construction 17(3):245–250, 2008. doi:10.1016/j.autcon.2007.05.004.
[17] f. yao, w. meng, j. zhao, et al. analytical method comparison on critical force of the stepped column model of telescopic crane. advances in mechanical engineering 10(10):1–13, 2018. doi:10.1177/1687814018808697.
[18] j. yao, f. xing, y. fu, et al. failure analysis of torsional buckling of all-terrain crane telescopic boom section. engineering failure analysis 73:72–84, 2017. doi:10.1016/j.engfailanal.2016.12.006.
[19] r. mijailovic. modelling the dynamic behaviour of the truck-crane. transport 26(4):410–417, 2011. doi:10.3846/16484142.2011.642946.
[20] p. jia, e. li, z. liang, y. qiang. dynamic stability of the aerial work platform based on zmp. in third international conference on intelligent control and information processing, pp. 422–425. ieee, dalian, china, 2012. doi:10.1109/icicip.2012.6391539.

acta polytechnica vol. 45 no. 3/2005
czech technical university in prague

the design and development of enhanced thermal desorption products

r. humble, h. millward, a. mumby, r. price

this research study is based on a knowledge-transfer collaboration between the national centre for product design and development research (pdr) and markes international ltd. the aim of the two-year collaboration has been to implement design tools and techniques for the development of enhanced thermal desorption products. thermal desorption is a highly-specialised technique for the analysis of trace-level volatile organic compounds. this technique allows minute quantities of these compounds to be measured; however, there is an increasing demand from customers for greater sensitivity over a wider range of applications, which means new design methodologies need to be evaluated. the thermal desorption process combines a number of disparate chemical, thermal and mechanical disciplines, and the major design constraints arise from the need to cycle the sample through extremes in temperature. following the implementation of a comprehensive product design specification, detailed design solutions have been developed using the latest 3d cad techniques. the impact of the advanced design techniques is assessed in terms of improved product performance and reduced development times, and the wider implications of new product development within small companies are highlighted.

keywords: thermal desorption, product design, sme.

1 introduction

thermal desorption is a highly-specialised technique for the capture, concentration and analysis of trace-level volatile organic compounds (vocs) in real-world samples. a typical application is the measurement of airborne voc pollutants in an industrial setting, such as a chemical processing plant [1]. the thermal desorption technique, in conjunction with gas chromatography and mass spectrometry, allows minute quantities of these substances to be detected and measured [2]. however, the demand from customers for greater sensitivity over a wider range of applications is increasing. the recent security concerns over chemical and biological weapons within the uk and the usa provide a vivid example of the need for an accurate and sensitive technique to monitor air quality. a collaborative research programme was established in this area to identify optimal product design and development techniques for the novel thermal desorption process.

the thermal desorption process combines analytical chemistry, materials science, transient heat transfer, mechanical drive systems and electronic control. thermal desorption products are traditionally lab-based capital equipment, and the disparate areas of expertise have meant that product development has tended to be treated as somewhat of a 'black art'. the development of these products has been driven by the application of analytical science, whereby thermal desorption is now a proven, reliable concept, but it has been reliant on 'traditional' design and manufacturing techniques.
new market opportunities, such as portable instruments and embedded on-line products, have highlighted the need to investigate a wider range of product design and development techniques in order to progress the technology. this paper reports the application of new design and development tools within a small company specialising in thermal desorption equipment. the context of this study is important because new product development activities within small and medium-sized enterprises provide higher financial returns than practically any other type of similar investment [3]. smes are often in a prime position to identify innovative new products as a consequence of their close working relationships with customers and suppliers; however, the majority of the product development literature focuses on design tools and activities within large, well-established companies. the literature on design and development within smes is more limited in scope [4]. this is surprising given the fact that smes play a key role in most european economies. it has been estimated, for example, that 95 % of the three million businesses in the uk employ fewer than 20 people [5]. thus their performance as product developers is a matter of no small concern. smes typically operate in a resource-constrained environment, and there can be a tendency for small companies to conduct product development in an ad hoc manner [6]. in order for smes to maintain their competitive advantage in an increasingly harsh international market, they need to adopt best-practice design and development techniques.

advanced design tools, such as computer-aided design (cad) software, represent an important route for improving competitiveness through enhanced quality and productivity. integrated cad-based systems have made a major impact across a range of industries. cad speeds up design and engineering activities by rapidly capturing design intent and reducing the errors between development stages [7]. a number of researchers have highlighted the benefits of integrating cad systems within the overall product development process [8, 9]; however, it has also been noted that cad systems can inhibit innovation if implemented inappropriately and used ineffectively [10]. therefore, the methods by which design and development techniques are implemented are an important consideration. this paper will assess the impact of a cad-based structured design process, which has been applied to the novel design challenges of thermal desorption products. the specific requirements of thermal desorption are presented through a product design specification, and the core component of the thermal desorption instruments is highlighted for re-design and prototype development. the resulting design improvements are assessed through an experimental test programme. the overall impact of the design and development techniques is reported in terms of development times and project costs, and the wider implications for new product development within small companies are discussed.
2 case-study methodology

markes international ltd was established in 1997 and now operates with a staff of approximately 20. they specialise in the design and development of thermal desorption capital equipment, and all the key development activities are conducted in-house. this encompasses early concept design, experimental testing and final assembly. the company has successfully developed a range of state-of-the-art laboratory-based products, and their core products are based on the 'unity' and 'ultra' thermal desorption instruments, an example of which is shown in fig. 1 (example of a 'unity' thermal desorption laboratory-based instrument). a typical selling price for this type of unit is approximately £20k. the company's client base includes the process industry, key regulatory agencies and the service laboratory sector, with applications ranging from environmental health to materials testing. the company has established an international reputation as a leading supplier of quality thermal desorption equipment, accounting for a 12 % market share across the world and 60 % of the market within the uk.

as a result of emerging markets (e.g. building materials emissions and contamination in food and beverages), markes international has identified new products that will drive significant business growth. in order to bring these new products to market, markes international formed a partnership with pdr, and this identified the need to introduce a 3d cad-based design capability to add value through enhanced product performance in combination with a more effective development process. a collaborative ktp (knowledge transfer partnership) programme was established in march 2003, and this paper reports the preliminary results during the first 12 months of the programme.

pdr have employed the ktp model as an effective mechanism for partnership and collaboration with a wide range of smes, predominantly in wales. the ktp strategy has been in operation for over 20 years, and is a government-backed knowledge transfer scheme. the aim of the scheme is to strengthen the competitiveness and wealth creation of the uk by stimulating innovation in industry through structured collaborations with universities and research organisations. ktp is run for the government by technology transfer and innovation limited (tti). a typical ktp programme is a two-year partnership between one company, one university and tti.
each individual ktp programme is designed to address the key elements central to the successful development of the specific company. the two-year project provides employment for a well-qualified graduate ktp associate for the duration of the programme. it should be noted that ktp was formerly known as the tcs (teaching company scheme) prior to 2003. all the pdr-based ktp programmes are, or have been, focused on product design, and the numbers reflect the uk trend in that the majority have been based in smes. pdr have successfully completed 12 ktp programmes since 1995, seven of which have been with small companies. a typical pdr-based ktp programme implements a new design capability within a 'traditional' manufacturing company. in line with other researchers [11], pdr have found that the ktp model is an ideal vehicle through which to analyse the design-to-manufacturing interface and the associated elements that impact upon new product development within smes.

the well-defined management and structure of the ktp process promote a detailed analysis of the company from the university partner's perspective. a close working relationship is developed with the company in the early stages, during the drafting of the ktp grant proposal. this is written by the university in collaboration with the company, describing the company and the financial benefits, and quantifying the aims and objectives of the programme. a key feature of the proposal is a detailed 104-week gantt chart, which defines a programme of work to address the strategic needs of the company. following the approval process, a grant is awarded and pdr employs an associate to work full time at the company for two years to meet the project objectives. the associate is assigned at least one pdr-based supervisor and at least one supervisor from the company. the regular contact with the company fosters a level of trust and co-operation that generates an in-depth understanding of the subtle issues and problems inherent in any small company. ktp programmes are characterised by a commitment to disciplined, effective project management through mandatory monthly and quarterly meetings. the monthly meetings between the supervisors and the associate not only focus on the technical issues within the programme, but also address training and personal development requirements. the quarterly meetings are designated local management committee (lmc) meetings, and are underpinned by support from a tti consultant. lmcs act as the programme's steering group to ensure that the longer-term objectives for the company and the associate are met. the documentation (technical reports, executive summaries and presentation material) arising from the structured ktp meetings, in parallel with the weekly informal contact with the company personnel, results in a comprehensive portfolio of case-study material, and this provides the foundation of this research study.

3 product design specification

the generic sampling and analysis process for the lab-based thermal desorption instruments can be summarised as follows:
a) the vocs are 'captured' in sample tubes that contain an absorbent matrix. the sample tubes are located at key points in the area to be analysed (two sample tubes are shown in fig. 1).
b) the sample tubes (typical volume 100–200 ml) are sealed and transported to the lab-based equipment.
c) the sample tubes are fed into the thermal desorption equipment and an inert gas stream (usually helium) extracts the vocs from the sample tubes.
d) the gas stream refocuses the vocs into the cold trap module, which is electronically cooled and contains an absorbent with a strong affinity for the vocs.
e) the cold trap is then rapidly heated, which desorbs the vocs and concentrates the sample into a smaller volume (typically 100–200 µl).
f) the concentrated sample is then introduced into a gas chromatograph or mass spectrometer for analysis.
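the point of steps d) and e) is pre-concentration: refocusing a 100–200 ml tube sample into a 100–200 µl desorbed volume. the following sketch simply computes the resulting volumetric concentration factor from the figures quoted above; it is an illustrative back-of-the-envelope calculation, not part of the instrument's software.

# volumetric concentration factor of the cold trap refocusing step;
# volumes follow the text (sample tube 100-200 ml, desorbed plug 100-200 ul).
TUBE_VOLUME_ML = (100.0, 200.0)
DESORBED_VOLUME_UL = (100.0, 200.0)

for v_tube in TUBE_VOLUME_ML:
    for v_out in DESORBED_VOLUME_UL:
        factor = (v_tube * 1000.0) / v_out  # ml -> ul before dividing
        print(f"{v_tube:.0f} ml -> {v_out:.0f} ul : concentration x{factor:.0f}")
# the factor spans roughly x500 to x2000, i.e. about three orders of magnitude,
# which is what makes trace-level voc detection practical.

this three-orders-of-magnitude gain is the reason the cold trap module is described below as the concentration engine of the instrument.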
it can be seen that the cold trap module acts as the concentration engine at the heart of all the thermal desorption instruments. the cold trap technology has been developed for the unity range, and this provides the basis for further design improvements that can be incorporated into replacement products and new products for emerging markets. the first stage of the re-design process was establishing a product design specification (pds). the pds is a central element within the various design documentation, and needs to be reviewed and updated in response to other design output (e.g. test reports and risk analysis). the implementation and maintenance of an appropriate pds for assemblies and critical components is the first step toward a systematic design process, although this is not always acknowledged by small companies [12]. the cold trap module pds can be used as an exemplar, and key sections are highlighted in table 1 in order to define the main design challenges.

table 1: key sections of the pds for the cold trap module

performance
• constant cold trap cooling to give a minimum temperature of −15 °c.
• intermittent 'firing' to heat the trap and achieve desorption at a maximum temperature of 300 °c.
• materials compatible with temperature extremes and absorbent matrices.
• electrical supply to power both the 'cooling' and 'heating' elements.
• the product will run for ten years, and service stock will be required for a further five years.
• the product will be usable for up to 24 hours per day, operating a seven-day week.
• the product must avoid ice formation.

interfaces
• upstream connection and input from the sample tube.
• downstream connection to the heated valve (constant operating temperature of 200 °c), and output to the analyser.
• the critical connection between the cold trap module and the heated valve must avoid condensation, and there should be no interconnecting tubing.
• the cold trap module will need to operate as a key sub-assembly within a range of new products.

environment
• upstream ambient conditions assumed to be 30 °c.
• relative humidity in the range 5 %–95 % (non-condensing).
• the module must withstand transit vibrations and shock loading up to an acceleration of 10×9.8 m/s².
• noise levels must adhere to laboratory regulations.

quality
• no failure should cause any hazard for any operator.
• the functional failure rate should be greater than 1 in 50, based on primary testing.
• the product must comply with all relevant ec legislation and bs regulations.

materials & manufacture
• materials and manufacturing techniques must conform to in-house assembly and approved sub-contracted manufacturing processes.
• products will be built individually to order, with approximately 40 cold trap modules assembled per year.
• manufacturing techniques must be appropriate for low-volume, batch fabrication.
• the competitive driver is the delivery of the required performance at a reduced cost.

maintenance
• the cold trap module must be maintenance-free.
• the product must be designed to minimise servicing requirements.
• absorbent traps must be changeable without the risk of breaking the traps.

4 advanced design techniques

the selection and implementation of a 3d cad system was a key aspect of the first year of the ktp collaboration.
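several of the pds entries in table 1 are numeric limits, so they lend themselves to a machine-checkable representation; the sketch below is a hypothetical encoding of a few of them (the field names, structure and check routine are ours, not the company's), of the kind that could drive automated design-verification checks.

# hypothetical machine-readable subset of the table 1 pds limits;
# field names and the check routine are illustrative, not from the source.
from dataclasses import dataclass

@dataclass(frozen=True)
class ColdTrapPds:
    min_trap_temp_c: float = -15.0       # constant cooling target
    max_trap_temp_c: float = 300.0       # desorption 'firing' peak
    valve_temp_c: float = 200.0          # heated valve operating temperature
    ambient_temp_c: float = 30.0         # assumed upstream ambient
    humidity_range_pct: tuple = (5.0, 95.0)
    max_shock_g: float = 10.0            # transit shock, multiples of 9.8 m/s^2

def check_measurement(pds: ColdTrapPds, trap_temp_c: float, phase: str) -> bool:
    """true if a measured trap temperature satisfies the pds for the given phase."""
    if phase == "cooling":
        return trap_temp_c <= pds.min_trap_temp_c
    if phase == "desorption":
        return pds.valve_temp_c <= trap_temp_c <= pds.max_trap_temp_c
    raise ValueError(f"unknown phase: {phase!r}")

if __name__ == "__main__":
    pds = ColdTrapPds()
    print(check_measurement(pds, -16.2, "cooling"))     # True
    print(check_measurement(pds, 198.0, "desorption"))  # False: below 200 degc

the desorption check anticipates the experimental finding reported in section 5, where part of the trap drops just below the 200 °c requirement ahead of the heated valve.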
prior to this partnership, the company were reliant on a basic 2d cad package that was used predominantly for final manufacturing drawings. with the focus on new product challenges and inherently complex assemblies, there was a clear need to make the transition to a solids- and surface-modelling package. early 3d systems required costly unix-based hardware, but with the advent of cost-effective, mid-range packages, full 3d cad functionality has become a viable option for smes. during the early stages of the ktp programme, the company's current and future design requirements were reviewed and documented. a number of mid-range cad packages were short-listed and tested against the company's requirements. the system selected was the autodesk inventor cad package. once the first phase of training had been completed, the 3d cad software was employed on all new projects, and the old 2d system was kept running in the background for comparison. the preliminary indications are that 3d solid modelling is a more efficient and productive approach to the design of thermal desorption products. the main benefits of the new 3d cad system can be summarised as follows:
• the ability to clearly communicate the design intent of a new product concept is a central feature of 3d cad systems. different drawing views can be created instantly from a solid model, and this removes the ambiguity associated with 2d drawings. this allows non-technical personnel to provide meaningful feedback much earlier in the design cycle. furthermore, at the end of the development process the 3d cad images significantly enhance brochures and scientific presentations.
• thermal desorption products are essentially large, complex assemblies comprising hundreds of mechanical parts and precise interfaces; the 3d environment efficiently manages these assemblies. furthermore, the 3d cad can assess fit and tolerance problems early in the design process.
• design lead-times have been reduced by approximately 15 % because it is much simpler to make quick design changes, and error-checking time has been reduced. this, in turn, has reduced the number of engineering change notes. changes to a 3d model automatically propagate through to all relevant drawings, and this is further enhanced through bi-directional associativity and parametric design.

the immediate impact of the 3d cad has been to speed up the design and development cycle; however, the new design software provides the foundation for strategic changes in the way thermal desorption products are designed and developed. the 3d design data can integrate with downstream systems to facilitate analysis and verification, and to expand manufacturing options. analysis techniques, notably finite element analysis, have been applied to a wide range of industries. the main design challenges within thermal desorption products are driven by the extremes in temperature; therefore, thermal analysis software to model the transient heat transfer between critical components would link well with the 3d solid models. this form of thermal analysis should provide a route for evaluating and optimising cold trap performance prior to full experiments and verification testing within the laboratory. it has been highlighted that low-volume manufacturing techniques are needed for the specialised thermal desorption market; therefore, zero-tooling rapid manufacture directly from the 3d cad data may be commercially viable, particularly for bespoke products.
the design team have already started to consider stereolithography parts direct from cad, and this could be extended to sintered metal parts for functional components.

5 prototype development

the development process established a new design concept for the cold trap module, based on the criteria set out in the pds. taking into consideration factors such as materials selection and design for manufacture, a detailed design evolved, and the cad model is shown in fig. 2 (cad model of the cold trap module). the key temperature-dependent functionality is driven by the absorption/cooling and the desorption/heating components. the absorbent matrices are contained within the central quartz cold trap tube, and this is surrounded by a ceramic cylindrical channel. continuous cooling is provided by two peltier coolers located beneath this cylindrical trap. the trap box is positioned on top of a heat sink that helps to regulate heat conduction away from the trap channel. the desorption/heating stage is driven by a metallic-strip heater located on the inner circumference of the ceramic channel. both the peltier coolers and the trap heater are powered through the electrical supply.
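the axial temperature profile along the trap is set by the competition between the peltier-cooled section, the ambient upstream junction and the 200 °c heated valve downstream. the following sketch is a crude one-dimensional steady-state conduction model of that competition; the boundary temperatures follow the text, but the node spacing, coupling coefficient and geometry are invented for illustration and are not the module's real parameters.

# crude 1d steady-state model of the axial trap temperature profile;
# boundary temperatures follow the text, all other parameters are invented.
AMBIENT_C = 30.0      # upstream junction (pds ambient assumption)
VALVE_C = 200.0       # downstream heated valve
PELTIER_C = -15.0     # cooled-section target between x = 40 and 80 mm
N = 23                # nodes from x = 10 mm to 120 mm, 5 mm spacing
COUPLING = 0.2        # assumed peltier coupling per relaxation step

x = [10.0 + 5.0 * i for i in range(N)]
temp = [AMBIENT_C] * N
for _ in range(20000):            # gauss-seidel style relaxation to steady state
    temp[0], temp[-1] = AMBIENT_C, VALVE_C
    for i in range(1, N - 1):
        t_new = 0.5 * (temp[i - 1] + temp[i + 1])  # pure axial conduction
        if 40.0 <= x[i] <= 80.0:                   # peltier-cooled zone
            t_new += COUPLING * (PELTIER_C - t_new)
        temp[i] = t_new

for xi, ti in zip(x, temp):
    print(f"x = {xi:5.1f} mm  t = {ti:6.1f} degc")
# the profile shows the gradual 's curve' between the cooled zone and the
# heated valve that the measured traverse discussed below also exhibits.

even this toy model reproduces the qualitative difficulty described next: pure conduction between a cold zone and a hot boundary cannot produce the abrupt thermal junctions the design calls for.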
the objective for this phase is to maintain the core temperature above 200 °c from the matrices (x � 40mm to x � 80mm) through to the heated valve (x � 120mm). it can been seen that the peak temperature of 300 °c occurs within the quartz trap (x � 70mm), but the temperature drops just below 200 °c in the zone ahead of the heated valve. the reason for this is that the close proximity of the heated valve to the cold trap module creates conflicting design constraints. the step change in temperature indicated for the absorption phase acts against the need for uniform temperature during the desorption phase. the peltier coolers remain on when the trap is ‘fired’, and this generates a cold pocket between the trap and the heated valve. however, the peltier coolers must remain on in order to produce the abrupt temperature change when the absorption/cooling phase re-commences. one option would be to increase the peak temperature of the trap heaters, but this is an inefficient use of power and may increase the price of the overall unit. these preliminary results provide a sound basis for further development. it is anticipated that innovative solutions will lie in the design of the cold trap module in terms of refined materials properties and configuring precise thermal communication between the key interfaces. 6 conclusions & future work the implementation of structured design documentation and advanced design tools has been shown to add significant benefits to the design and development of a specific thermal desorption product. the first phase of a two-year knowledge-transfer collaboration has defined the design challenges through a detailed pds, and has developed functional prototypes by employing a 3d cad system. the preliminary results indicate that design-cycle lead times have reduced as a consequence of the new design tools, and product performance is in line with the requirements of the specification, which provides the basis for further design improvements. these findings support the view that, if properly implemented, design tools and techniques can have a significant operational impact within smes. © czech technical university publishing house http://ctn.cvut.cz/ap/ 41 czech technical university in prague acta polytechnica vol. 45 no. 3/2005 fig. 2: cad model of the cold trap module fig. 3: absorption/cooling temperature profile within the cold trap module future work in this area will expand on the research questions highlighted in this study, and evaluate techniques for optimising the thermal desorption products. the thermal characteristics and constraints within the cold trap module have now been defined; therefore, the next phase of research will employ specific analysis tools to refine the thermal desorption process. for example, the 3d cad data could link directly to thermal analysis software in order to model the transient heat transfer at precise locations within the cold trap module. in order to verify any theoretical design results, a thermal imaging system could also be employed to support the experimental test programme. cad-based design optimisation should provide a mechanism for reducing project costs, as well as lead times, and thereby deliver tangible commercial benefits directly linked to the new product development process. references [1] woolfenden, e., broadway, g.: “an overview of sampling strategies for organic pollutants in ambient air.” liquid and gas chromatography international, vol. 5 (1986), p. 166–171. 
[2] zha, q., oppenheimer, j., weisel, c.: “the automated analysis of ambient air samples by a thermal desorption-gc/ms system.” proceedings of the 38th asms conference on mass spectrometry and allied topics, tucson, arizona, 1990, p. 617–625. [3] berliner, c., brimson, j.: cost management for today’s advanced manufacturing: the cam-i conceptual design. boston: harvard business school press, 1988. [4] brown, r., lewis, a., mumby, a.: enhancing design effectiveness in very small companies. cardiff (uk): design engineering research centre, 1996. [5] daly, m., mccann, a.: “how many small firms?” employment gazette, vol. 100 (1990), p. 14–19. [6] millward, h., lewis, a.: “product development limitations within smes: the drive to manufacture in the absence of an understanding of design.” proceeding of the 10th international product development management conference, brussels, belgium, 2003, p. 713–723. [7] robertson, d., allen, t.: “cad systems use and engineering performance.” ieee transactions on engineering management, vol. 40 (1993), p. 274–282. [8] droge, c., jayaram, j., vickery, s.: “the ability to minimise the timing of new product development and introduction.” journal of product innovation management, vol. 17 (2000), p. 24–40. [9] sanchez, a., perez m.: “cooperation and the ability to minimise the time and cost of new product development within the spanish automotive supplier industry.” journal of product innovation management, vol. 20 (2003), p. 57–69. [10] kessler, e., chakrabarti, a.: “speeding up the pace of new product development.” journal of product innovation management, vol. 16 (1999), p. 231–247. [11] lipscomb, m., mcewan, a.: the tcs model: an effective method of technology transfer at kingston university. uk: industry and higher education, december, 2001, p. 393-401. [12] lewis, a., walters, a.: “implications of inadequate specification procedures on new product development in small to medium-sized enterprises.” proceedings of the 9th international product development management conference, sophia antipolis, france, 2002, p. 541–550. robert humble dr. huw millward, ph.d. e-mail: hmillward-pdr@uwic.ac.uk alan mumby the national centre for product design & development research (pdr) university of wales institute cardiff (uwic) cardiff cf5 2yb wales, uk ryan price markes international limited, llantrisant business park pontyclun cf72 8yw wales, uk 42 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 45 no. 3/2005 czech technical university in prague fig. 4: desorption/heating temperature profile within the cold trap module ap04-bittnar2.vp 1 introduction cellular solids can either be found in nature or manufactured by foaming of polymers, metals and ceramics, or by other technologies, e.g. by cvd-chemical vapor deposition, or dmls-direct metal laser sintering. they have a wide range of applications namely in absorbing the kinetic energy from impacts, or as thermal and electrical insulators. to exploit these properties fully and efficiently, suitable methodologies allowing a detailed characterization of the behavior of the cellular solids are needed. in this paper we will examine the upper bounds on homogenized linear elastic properties. a cellular solid (a foam) is composed of an interconnected network of solid beams and shell parts, which can be assigned to cells that are repeated in the medium. 
two essential features characterize cellular media:
• the size of the voids is very small compared to the size of the full medium, and thus homogenization techniques (see duvaut (1976), bensoussan et al. (1978), suquet (1985), bakhvalov and panasenko (1989), nemat-nasser and hori (1993)) can be used in determining the effective properties.
• the relative density is low, usually below 0.3 (gibson and ashby (1988)). as a consequence, at least one dimension of the solid phase (thickness) at the cell level is small compared to the characteristic cell size. this condition justifies the use of structural theories in homogenization calculations instead of the full 3d elasticity model.

cellular solids may be classified as closed-cell, partly open-cell and open-cell foams. in this work we will restrict our analysis to open-cell foams, which consist solely of solid beams; the name repetitive lattice structures can then also be adopted. several works deal with the effective elastic properties of open-cell foams or repetitive lattice structures, but the upper bounds on them are rarely analyzed. the main monograph on cellular solids was published by gibson and ashby (1988). extensive work by christensen has been dedicated to the characterization of effectively isotropic open-cell microstructures, where the response is governed by bending or direct (axial) resistance (christensen (1994, 1995)). in christensen (1995) the values of the upper bound on the effective bulk and shear modulus are presented. the value of the bulk modulus bound has also been addressed in several other works, but only in the sense of an effective property of some particular microstructure (see e.g. warren and kraynik (1988, 1997), kraynik and warren (1994), zhu et al. (1997)).

methodologies for determining effective properties can be discrete or continuous. discrete approaches are usually based on micromechanics. they exploit either the periodicity or the regularity of the medium under consideration. in the former case, the calculations are performed on a unit or a basic cell, while in the latter case either a representative volume element or a typical joint is used. for instance, in kraynik and warren (1994) and warren and kraynik (1997) the effective properties are determined by considering a tetrahedral joint (kelvin foam) under the assumption of affine displacements. application of this methodology to a medium with randomly placed basic cells of the regular cubic lattice also yields the maximum shear response. in this context, we can also mention the work of dimitrovová (1999), where there is a detailed discussion of the applicability of orientational averaging to periodic cells. among other works, grenestedt (1998) and li, gao and roy (2003) should also be mentioned. continuum modeling of repetitive lattice structures is reviewed by noor (1988). the literature review in this paragraph is far from complete, because it is not the aim of this paper to determine homogenized properties, but their upper bounds.

the inverse problem of identifying microstructures that achieve prescribed effective properties has also been extensively studied (see, e.g. sigmund (1994), neves et al. (2000), gibiansky and sigmund (2000), guedes et al. (2003)). these methods exploit homogenization techniques, starting with a basic cell, whose shape must be specified in advance, and then the available material is optimally distributed within it.
cellular solids can be viewed as two-phase composites with void and solid (generally non-homogeneous) phases. determining the bounds on composite effective properties has been the subject of considerable research for many years. it may be argued that there is no need for a new methodology, since the bounds for foams can be obtained from the composite two-phase bounds, just by introducing zero void properties. this is true in 2d, but in 3d the optimal foams must contain shell parts in some regimes of optimality (allaire and kohn (1993)); therefore, upper bounds on the homogenized properties of open-cell foams are strictly lower than those for general foams, and the development of a new methodology addressing this issue is fully justified. only the upper bounds on effective elastic properties will be examined, because the lower bounds for media with one void phase are zero. without loss of generality, only open-cell foams with a periodic microstructure will be considered, because in a medium with a random microstructure a representative volume element can be chosen so that a medium created by its periodic repetition will have the same effective properties as the original random one.

the contribution of this paper is that it extends the methodology proposed by dimitrovová and faria (1999) from 2d to 3d. the methodology is based on homogenization theory and does not require any restriction on the basic cell shape or arrangement. the influence of the boundary layer is not accounted for, and it is assumed that the basic cell contains a finite number of structural members, i.e. beams or bars. the upper bounds are derived by a bounding procedure using results from linear algebra and the voigt bound basic assumption (hill (1963)). the main advantage of the new methodology is that the necessary and sufficient conditions characterizing the optimal media follow immediately from the bounding procedure. these conditions are written in terms of generalized internal forces and geometrical parameters. the proposed methodology recovers the well-known bounds for effectively isotropic open-cell foams, though with a different proof. the main contribution lies in the identification of new bounds on the effective shear moduli of open-cell microstructures with effective cubic symmetry.
in such cases, dependence on internal forces in maximality conditions can be replaced by geometrical expressions, implying that the optimality of the medium under consideration can be verified directly from the microstructure, without any additional calculation. the approximations inherent to the methodology are within the structural simplifications commonly used. the limitations are based on the assumption of a finite number of structural members in the basic cell, allowing only the identification of single scale microstructures, which implicitly excludes multiple rank laminates (see e.g. allaire and aubry (1999)) and the hashin spheres medium (hashin (1962)). the paper is organized as follows. the methodology is reviewed in section 2, namely in section 2.1 simplified assumptions and basic relations are introduced, in section 2.2 it is shown that the optimal media can be initially searched within a specific class of micro-trusses (this term will be explained later on), and in section 2.3 the methodology is reviewed within this restricted class. the bounds are proven in section 3, along with a specification of the optimal media microstructures. the paper is concluded in section 4 with a discussion and an analysis of the developments. 2 review of the new methodology 2. 1 simplifying assumptions and basic relations the basic cell, �, defined as the (smallest) region of a periodic medium that can compose the full one by periodic repetition, will be conveniently rescaled to v, where the spatial microvariable y is introduced. it is assumed that v contains a finite number of beams and that the solid phase is homogeneous and isotropic. therefore the term material volume fraction can be used instead of relative density. there are two extreme possibilities for the structural model of a joint between the beams composing the foam: (i) a pin joint and (ii) a rigid joint. a pin joint cannot transmit bending moments, and therefore it allows rotations of the structural members connected to it. consequently, a non-loaded structural member with two pin joints can only support the internal forces acting in the direction of the line connecting the joints. on the other hand, a rigid joint preserves the angles between the beams connected to it. if all joints are rigid, the term micro-frame medium can be used; on the other hand not necessarily straight structural members connected by pin joints will be named as micro-truss media. therefore any micro-frame medium has its related micro-truss, which is obtained by switching the behavior of rigid joints to pin joints. in reality, joint behavior is somewhere between these two extreme cases and should be represented by a flexible joint. pin joint behavior can be achieved either by special construction allowing for rotations of the connected members or as a limit case: if the beams connected to a given joint have uniform cross sectional areas and the material volume fraction tends to zero, then the flexible joint approaches pin joint behavior. in structural theories, beams are defined by their middle axes and joints can be replaced by single points (joint “centers”) located in the intersection of the middle axes. the term theoretical length will be used to identify the middle axis length between the joint centers, and active length will be usedto identify the same length shortened by the parts inside the joints (fig. 1). small discrepancies when middle axes do not intersect exactly at a single point will not be considered. 
172 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 44 no.5–6/2004 active length fig. 1: introduction of theoretical and active lengths we will address only open-cell foams with effective isotropy or cubic symmetry. the tensor of the effective elastic constants can thus be written in dimensionless matrix form as: c c 0 0 c * * *� � � � � � � � � 1 2 , (1a) where c i2 2 * *� g and c1 1 1 1 1 1 4 3 2 3 2 3 4 3 2 3* * * * * * * * * * * . � k g k g k g k g k g symm k * * � � � � � � � � � �4 31g . (1b) here i stands for the unit 3 by 3 matrix and 0 for the zero 3 by 3 matrix. effective engineering constants k, g1 and g2 are the homogenized bulk and two shear moduli, respectively. their dimensionless values with respect to the solid phase young’s modulus es are identified as: k k es * � , g g e1 s1 * � and g g e2 s2 * � . a medium is effectively isotropic when g g g *1 2 * *� � . the above matrix form of the fourth order tensor of elastic constants (lekhnitskii (1981)) in terms of the engineering constants k *, g1 * and g2 * is presented in hashin and shtrikman (1962). at first, the aim is to determine each of the macroscopic engineering constants in terms of the generalized internal forces, which will form the initial relation for the bounding procedure. the global strain energy density w can be expressed for isotropic media as: w e k gs m d d� � � � � � � 1 2 2 2 � * * :� � (2) and for media with effective cubic symmetry as: w e k g ( s m d,12 d,13 d,23 d,11 d � � � 1 2 2 2 2 2 2 � � � � � � * * ,22 d,11 d,33 d,22 d,33( ( 6g ) ) ) , * 2 2 2 1 � � � � � � � � (3) where �m and �d are the volumetric and deviatoric parts of the global stress tensor � and � �d d d,ij d,ij: � � � (the summation convention is adopted). the test macroloads to be applied on the medium and consequently on the basic cell can be chosen so that only one effective engineering constant will be left in (2) or (3), and can thus be expressed independently of the others and in terms of � components and w. examples of these macroloads are specified in table 1. it is seen that the corresponding macrostrain e must fulfill similar conditions. macrostrain e is connected to macrostress � by � � �esc e * , where � �� � � � � � � �11 22 33 23 31 12, , , , , t, � �e � e e e e e e11 2 2 2, , , , ,22 33 23 31 12 t and “ � ” stands for matrix multiplication. � and w can be expressed with the help of an averaging operator applied on the local characteristics, � and w, (suquet (1985)): � � � �jk jk jk i jk i i v d v d� � �� �� � 1 1 y y v v ii# # , (4) w v w d v w d wi i i � � �� �� � 1 1 y y v v ii# # , (5) where � i and w i are the local stress and the local strain energy density corresponding to the ith beam (i-beam). the volume of the full cell is v while the volume of the i-beam is vi # . vi # is composed of the volume corresponding to the active length plus the corresponding volume in the connected parts within the joints, so that v v i ji # j #� � � �0 and v vi # i #� � . v# is the volume of the material part in the cell. due to the periodic repetition it is not necessary to treat separately the case when the i-beam is cut by the boundary of the cell. next, it is necessary to express the contributions of each i-beam, �i and wi , in terms of generalized internal forces. 
looking at �i , the formula from nemat-nasser and hori (1993) � �mj i i # jk i k m v i # j i m v v n b d v t b d i # i # � �� � 1 1 s s � � (6) © czech technical university publishing house http://ctn.cvut.cz/ap/ 173 acta polytechnica vol. 44 no.5–6/2004 macroload property specification of � specification of e �k k * � � � �� � � �11 22 33 0, �ij i j� � �0 e e e e� � � �11 22 33 0, e i j, 3k e eij * s� � � �0 � ( ) �1g g1 * � � � �11 22 33 0 0 � � �, ,k; kk �ij i j� � �0 e e e e i jij11 22 33 0 0 � � � �, 21g e e kkk kk s #* ( )� �� �2g g2 * � � � �11 22 33 0 0� � � � � �, i j; ij e e e11 22 33 0� � � , 22g e e i jij ij s #* ( )� � �� �g g * � � � �11 22 33 0 0 � � �, ,k; kk � � �i j; ij� 0 e e e11 22 33 0� � � , 2g e e i jij ij s #* ( ) ,� �� (# if the macrostress component is different from zero) table 1: test macroloads and the corresponding specification of the macrostress and the macrostrain can be exploited. in (6), b is the position vector of the points on �vi # and t is the boundary traction. if t is self-equilibrated, then �i is symmetric and the integral in (6) does not depend on the origin of the coordinate system for b. the expression for wi can be, as usual, simplified by considering that generalized internal forces act over theoretical lengths and that the contribution of the joints is negligible. 2.2 micro-trusses with straight bars of constant cross sectional area versus micro-frames optimal low-density micro-frame open-cell foams will be defined as those for which the related micro-truss is optimal. justification of this definition and more details on optimal micro-trusses are presented in this subsection, namely it will be proven that optimal micro-trusses can only be composed of straight bars with a constant cross section. in order to justify the definition stated above, it is necessary to verify that a curved beam cannot from part of the optimal low-density media. let us suppose that the i-beam of a micro-frame basic cell is curved. then a local coordinate system (z1, z2) can be introduced so that z1 connects the centers of the joints (fig. 2). the middle axis of the beam is given by z 2 =a(z1 ) and r designates the curved coordinate. let us separate the beam of active length from the joints by the cuts shown in fig. 2. it is assumed that there exists a plane containing the i-beam middle curve and that the macroload acts in such a way that the generalized internal forces in the beam cuts are also contained in this plane. the geometrical parameters �(r), �0 k , �0 m , hk, hm, vk, vm, l, p, the generalized internal forces in the beam cuts f, b and d and other local auxiliary coordinate systems (~ , ~z z1 2) and (� , �z z1 2) are specified in fig. 2. l and p are projections of the theoretical and the active lengths on z1 and the bending moment along the beam is separated into its (average) constant (d) and “antisymmetric” parts. for the i-beam let us express the average quantities �i and wi in terms of the generalized internal forces and discuss the possibility of its position in an optimal medium. superscript “i” will be omitted for the sake of simplicity, whenever no confusion is possible. it must be pointed out that local stress averaging cannot be performed over the theoretical length, because this would cause overlapping of the joints. 
thus the i-beam average stress � must be expressed as � � � �� � �bm jk jm , where the contribution with subscript “bm” relates to the beam with active length and thoses with “jk” and “jm” subscripts relate to the left (k) and right (m) adjacent joint parts, respectively. strict application of (6), in the previous expression, would imply integration over the internal faces of the joints, which is complicated. to overcome these difficulties we can define ~ ~ ~ � � � �� � �bm jk jm , where ~� jk and ~ � jm stand only for the contribution of the faces where the beam was cut. then ~� jk and ~ � jm are coordinate system dependent and therefore their coordinate 174 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 44 no.5–6/2004 r r l p k v mv 1 z 2 z k0 m0 d d dd 2/bp 2/bp 2/bp 2/bp k h mh f f f f b b b b 1 z~ 2 z~ 1 z 2 z 12 zaz m k m v k v fig. 2: specification of the curved i-beam origins of the coordinate systems (z1, z2) and ( ~ , ~z z1 2) are coincident, therefore only the face of joint (m) and the corresponding face of the beam can be considered to obtain (10). it is necessary to point out that the reason for non-symmetry of ~ � is the omission of the contributions of the internal faces of the joints in (8–9), as explained above. this does not mean any inaccuracy, because after rotation of the contributions of all beams to the cell coordinate system and after summation over all the beams, the final expression for � will be complete and symmetric. when a general curved beam under a general macroload is considered, the local coordinate system (z1, z2) connecting the centers of the jointscan also be introduced. then it is necessary to replace internal force b by b1 and b2, bending moment d by d1 and d2, and to introduce torsion moment t. following the same procedure as above, we obtain: ~ � � � � � � � � � � � � l v f b b 0 0 0 0 0 0 1 2 . (11) it is important to realize that (11) has the same form as it would have for a related straight beam of theoretical length l, arbitrary cross sectional area variation and with the same generalized internal forces in the cuts. therefore there is no distinction between the ��� contribution of a straight or a curved beam to �. moreover, (11) includes neither include the constant part of bending moments d1 and d2 nor the torsion moment t. if the i-were to have pin joints, then from equilibrium b1 � b2 � 0. we point outd that in order to express �, (11) should be rotated to the basic cell coordinates and summed over all the beams. now the average �w� will be determined. for the sake of simplicity it is again firstly assumed that a curved beam and the generalized forces are contained in a plane. as usual, the strain energy density corresponding to the shear forces can be omitted. then we can write: w v e n r a r dr + m r i r dr s 2 2 aa � � � � � � � ��� 1 2 ( ) ( ) ( ) ( ) ( )( ) , (12) © czech technical university publishing house http://ctn.cvut.cz/ap/ 175 acta polytechnica vol. 44 no.5–6/2004 systems must be uniquely defined in a way applicable to any beam from the basic cell. coordinate systems (~ , ~z z1 2) and (z z1 2� , � ) are introduced as specified in fig. 2. 
with respect to (z1, z2) this yields from (6): v fp 1 2 bp d k k m m k �bm � (sin cos sin cos ) (sin co � � � � � 0 0 0 0 0 s sin cos ) ) � � � � � � 0 0 0 0 0 0 k m m 2 k 2 m 2 k bp(cos cos d(cos � cos f(v v bp(cos cos d(cos 2 m k m 2 k 2 m 2 k � � � � 0 0 0 0 ) ) ) � � cos 1 2 bp b(v v d2 m k k m m k m � � � � � 0 0 0 0 0 ) (sin cos sin cos ) ) (sin cos sin cos )� � � �0 0 0 0k k m m � � � � � � � � � � � � � � � � � � � � � � � � � � � � . (7) with respect to (~ , ~z z1 2) and (� , �z z1 2) we can obtain: v � � � � � � � � ~ � jk k k k kfh bv fv bh bp 2 d 0 and v � � � � � � � � ~ � jm m m m mfh bv fv bh bp 2 d 0 , (8) which after rotation to (z1, z2) yields: v k k k k� � ~ cos sin cos sin s � jk k kfh bp d bh bp1 2 1 20 0 0 0 � � � � in sin cos cos cos 2 0 2 0 2 0 2 0 1 2 1 2 � � � � k k k k d fv bp d bv bpk k � � � � 0 0 0 0 k k k k sin cos sin � � � � � � � � � � � � � � � � � �d and (9a) v � � ~ cos sin cos sin s � jm m m m m m mfh bp d bh bp1 2 1 20 0 0 0 � � � � in sin cos cos c 2 0 2 0 2 0 2 0 1 2 1 2 � � � � m m m m m m d fv bp +d bv bp os sin cos sin � � � � 0 0 0 0 m m m md � � � � � � � � � � � � � � � � � � , (9b) which finally gives ~ � � � � � � � � l v f b 0 0 . (10) where a(r) and i(r) stand for cross sectional area and moment of inertia, respectively, n(r) and m(r) are normal forces and bending moments and (a) stands for the integration along the curved theoretical length of the beam. it may be pointed out that using theoretical lengths and overlapping in joints is an allowable and common simplification in strain energy. in accordance with this approximation, the originally introduced generalized forces f, b and d in the beam cuts do not have to change. a cross sectional area a0 of a related straight beam with constant cross section and the same volume as the original curved beam, can be introduced by (overlapping in the junctions can also be neglected here): a a z a z d0 1 1 2 1 0 1l z l � �� ( ) ( ( )) , (13) where � �a z da z dz ( ) ( ) 1 1 1 . because there is no distinction between the � contribution of a straight or a curved beam to �, let us minimize w in order to discuss the position of the i-beam in an optimal medium. this minimization must be performed over all possible shapes a(z1) and volume distributions along the middle curve: 2e v w n r a s a(z ); a(r); i(r) a(z ); a(r); i(r) 2 1 1 � � � min min ( ) ( ) ( ) ( ) min ( )( ) r dr + m r i r dr 2 aa a(z ); a(r)1 �� � � � � � � � � � n r a r dr + m r i r dr 2 a(z ); i(r) 2 aa 1 ( ) ( ) min ( ) ( ) . ( )( ) �� (14) equality in (14) can be achieved, if the minimizing shape and volume distribution are the same for both terms in the last part of (14). the distribution of the normal forces can be written as: n r f r b r( ) cos ( ) sin ( )� � � . (15) therefore: n r a r dr = (f + ba (z a(z a (z dz 2 1 1 1 1 0a ( ) ( ) )) ) ( )) ( ) � ��� 2 21 l . (16) using the schwarz inequality in the form of: f (f + ba (z dz f + ba (z a(z 1 1 1 1 2 2 0 2 1 l l � � � � � � � � � � � � � )) ) ) � � �� � � � � � � � � � �� ( )) ) ( )) a (z a(z a (z dz 1 1 1 1 0 24 24 1 l � � � � � � � � � � � � �� 2 2 21 (f + ba (z a(z a (z dz a1 1 1 1 0 )) ) ( )) l (z a (z dz a (f + ba (z a( 1 1 1 0 0 1 ) ( )) )) 1 2 2 � � � � � � � � � � � � l l z a (z dz 1 1 1 0 ) ( ))1 2 �� l (17) gives the following inequality: (f + ba (z a(z a (z dz f a 1 1 1 1 0 0 � � �� )) ) ( )) 2 2 2 1 l l . 
(18) equality in (18 or 17) can only be achieved if f + ba (z a(z a (z 1 1 1 � � ) ) ( ))1 2 is constant with respect to z1, which implies that the beam must be straight and with a constant cross sectional area. then contribution of the normal forces to w does not include b. the distribution of the bending moments can be expressed as: m z d fa z z( ) ( )1 1 12 � � � � � �b l . (19) when minimizing conditions for normal forces contribution to w are used, it is sufficient to look at d b z dz d + b1 � � � � � � � � � � �� l l l l 2 121 2 0 2 2 3 . the optimal media require d � 0, because d does not appear in (10). if moreover b � 0, as a consequence of constant cross sectional area and the material volume fraction going to zero, then the contribution of the bending moments is zero and the last term in (14) reaches its trivial minimum. extension of this statement to a general curved beam under a general macroload is clear, there would only be one more integral in the form of (19) and a separate t contribution, which can be required to be zero, because t does not appear in (11). this justifies the definition of optimal media stated at the beginning of this subsection, and proves that optimal micro-trusses must be composed of straight bars with constant cross sectional areas. nevertheless the contribution of bending is not excluded from the optimal media, when behaving as micro-frames. in summary, optimal open-cell foams can be searched within the class of micro-trusses with straight bars of constant cross section. in this class the bound can be expressed as a linear function of the material volume fraction, s, as shown in dimitrovová and faria (1999) and as clarified in section 3. related optimal micro-frames can develop non-zero bending moments, but only in their antisymmetric form (in terms of b1 and/or b2). if bending moments are presented, the corresponding effective engineering constant, written as a taylor’s expansion in s, contains a quadratic term (a detailed discussion is provided in dimitrovová and faria (1999)). the tangent at s � 0, i.e. the linearized bound, relates to the same property of the corresponding micro-truss. please note, however, (see fig. 3) that for a particularly high material volume fraction, s0, there can exist a micro-frame with a higher elastic property than that which is obtained from the optimal micro-truss. these cases are of no interest here since for low-density media only the initial slope (linearized property) matters. for the same reasons media with only a bending response are strictly excluded from the class of optimal micro-frames, because the corresponding micro-truss is a kinematic mechanism and the linearized bound is zero. 176 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 44 no.5–6/2004 it was shown in dimitrovová and faria (1999) that if the bulk modulus is under consideration, then the macrostress components that are necessary to express this property do not contain a contribution of b. then bending moments are excluded from optimal media, not only in the limit at s � 0, but in the full range of low-density s values. this result is readily extendable to 3d. in section 3.2, optimal micro-trusses for shear modulus g1 * of media with effective cubic symmetry will be fully geometrically specified. in this case it will be seen that switching to micro-frames will not develop bending moments. so also here the upper bound is linear within the validity of structural theories. 
the bending contribution is present only in isotropic shear g* and in g2 *. 2.3 review of the methodology in the class of micro-truss media with straight bars of constant cross section in the class of micro-trusses, the normal force is the only generalized internal force in the medium. let an arbitrary basic cell consisting of n bars be assumed. the contributions � i and wi of each i-bar of theoretical length li , cross sectional area ai and normal force ni can be specified in the following way (compare with (11)): � i i i i i i i i i i n v � � � l cos sin ; sin cos sin ; cos sin2 2 2 � � � cos sin sin ; sin sin cos cos � � � � � � i i i i i i isymm 2 2 2 � � � � � � � � � � , (2 0) where the two spherical angles �� �i � 0, and � �i � 0 2, specify the i-bar position with respect to the cell coordinates yj , j � 1, 2, 3, (fig. 4); and (see (14) and (18)) w v e n a i s i 2 i i � � � � � � � 1 2 l . (21) let the following designation be introduced: � � � � 1 2 2 2 2 2 3 2 1 , , , cos sin , sin sin , cos ; i i i i i i i i � � � � � � , , , sin sin cos , cos sin cos , sin i i i i i i i i i i � � � � � � � � � 2 3 cos sin ; sin sin cos , cos c , , � � � � i i i i i i i i 2 1 2 2 2 2 2 � � � � os sin ; cos( ) sin ;, 2 2 3 22 � � i i i i i� � (22) then the vectors n, r, q and l (compare with dimitrovová and faria (1999)) can be defined as: n r � � ! " # $ � n a n a n a l a l a 1 1 1 2 2 2 n n n j j 1 1 j 2 l l l , , , , ,, , � � �1 2� �2 j n n n j j 1 1 j 2 2 j n n l a j l a l a l , , , , , , , , , , , , , � � � � � � � � 1 2 3 1 2q � � � � a j a a a n 1 1 2 2 n n , , , , , , , � � 1 2 3 l l l l� (23) in addition, let us denote: p r r1 2 3� , p r r2 3 1� , p r r3 1 2� . (24) thus: � �p j j,1 1 1 j,2 2 2 j,n n nl a l a l a j � � � � �, , , , , , � 1 2 3 , (25) and it holds: p p p 0 r r r l p p p q q q l 1 2 3 1 2 3 1 2 2 2 3 2 1 2 2 2 3 2 22 � � � , , , (26) where is the euclidean norm. the material volume fraction, s, can be approximated neglecting the higher order terms as: s v � l 2 . (27) © czech technical university publishing house http://ctn.cvut.cz/ap/ 177 acta polytechnica vol. 44 no.5–6/2004 s0 s elastic property under consideration optimal micro-truss micro-frame corresponding to optimal micro-truss micro-frame corresponding to non-optimal micro-truss non-optimal micro-truss fig. 3: specification of optimal media response i 1 y 2 y 3 y i n i n i � i l i a bari fig. 4: specification of the i-bar within the basic cell taking into account (22–23), (20) can be substituted into (4) giving: � �� � � � � 1 1 1 2 3 1 2 3v v t t ts n r r r q q q n, , , , , , (28) and (21) into (5) as: w v es � n 2 2 , (29) where s will be named as the modified static matrix. as written in section 2.1, a particular engineering constant can be expressed, from (2) or (3), independently of the others, if the corresponding macroload from table 1 is applied. then expressions (28–29) can be introduced and the initial expression for the bounding procedure, in terms of normal forces and geometrical parameters, is obtained. the bounding procedure is performed using basic knowledge from linear algebra and the voigt assumption for the upper bound derivation (uniform local strain), and the bound is finally expressed as a linear function of the material volume fraction. the maximality conditions on possible normal forces are then obtained as conditions that ensure equality with the bound. the specifications in tab. 1 provide the additional constraints on the possible normal forces that can be developed in an optimal medium. 
using the maximality conditions, these additional constraints can be written in terms of microstructure geometrical parameters, as will be seen in section 3. for more details on the voigt assumption and bound see e.g. hill (1963). we only remark that, when the local strains are uniform throughout the medium, then they are equal to the macroscopic strain and the global engineering constant corresponding to such a macroload reaches its maximum. since micro-truss media are characterized by the middle axes of the bars, which (except for the joints) correspond to the ”direction” of the local strain, voigt assumption implies that the local displacements of middle axes of the bars, u, coincide with the linear part of the displacements, i.e. u e yi ij j� (the summation convention is adopted). this requirement states the necessary maximality conditions on possible normal forces, which can be written as: s e nt t se � � . (30) maximality conditions (30) are not sufficient, because the requirement of uniform strain does not exclude bars with zero normal force (zero bars). more facts about the relation between optimal micro-frames and the voigt bound are given in the appendix. obviously, upper bounds determined in the way described in this subsection can be extremely large and unrealistic, because none of the restrictions, e.g., topological connectivity or equilibrium of the joints, were considered. however, if a physical medium saturating the bound can be found, the bound would be proven as optimal. this is actually achieved in all the cases considered in this paper. 3 linearized bounds on effective properties 3.1 bulk modulus k* (for effective isotropy or cubic symmetry) if the macroload �k (table 1) is imposed, then starting with (2) and introducing (28–29), (26) and (27), the bulk modulus k* can be expressed as: k e w v ( v v m 2 s t t * ) ( ) � � �� � � � � � � � � � � 2 3 1 9 2 1 2 3 2 n r r r n l n 2 2 9n � s , , (31) providing the maximality condition n // l , (32) (i.e. the local stresses are required to be constant all ever the bars) and the bound k s � * 9. using (32), additional constraints from table 1 can be written in terms of geometrical parameters as: r l r l r l q l1 2 3 1 2 3� � � � � % � � t t t j j , ,& . (33) equation (30), which should also be implemented, does not in this case bring anything new. it is seen that it corresponds directly to (32), after conditions from table 1 have been implemented, n l� � 3k * . we can check that in this case (30) ensures not only a necessary but also a sufficient maximality condition, because zero bars are excluded as n a ii i~ � �0 , where “~” means proportionality. the conditions stated in (32 – 33) are the necessary and sufficient conditions on k*optimal media. (32) cannot be expressed only in terms of geometrical parameters, and therefore verification of the k*optimality of some medium requires the determination of the normal forces in n. the bound is optimal, because several known media saturate it. the simplest k*optimal medium is a regular cubic lattice (fig. 5) (see warren and kraynik (1988), dimitrovová (1999)); where it is easy to verify conditions (32 – 33). the class of periodic k*optimal media can be extended by the class of media with a random microstructure, where a basic cell of some k*optimal medium appears in the representative volume 178 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 44 no.5–6/2004 1 y 2 y 3 y g1 a fig. 
5: the regular cubic lattice element with all possible rotations with the same probability. because the bulk modulus is invariant under orientational averaging, the bulk modulus of the new random medium will be the same as for the corresponding periodic medium, dimitrovová (1999). 3.2 shear modulus g1 * (for effective cubic symmetry) if macroload �1g (see table 1) is imposed, then one has: g v s t t t 1 3 2 2 2 1 2 2 1 2 2 1 6 6 * ( ) ( ) ( ) cos ( � � � � � � � � p n p n p n n p n, ) cos ( , ) cos ( , ) p p n p l p n p l 1 2 2 2 2 2 3 2 2 3 2 3 � � � � � � � � � s p n p p n p p p p q q q 1 2 2 1 2 2 2 2 1 2 2 2 3 2 1 2 2 2 3 2 cos ( , ) cos ( , ) � � � � � � p n p p p p q q q 3 2 2 3 1 2 2 2 3 2 1 2 2 2 3 2 cos ( , ) . (34) according to table 1, additional constraints on possible n are: n l n q% % � �& jj 1 2 3, , . (35) if q 0j j� � �1 2 3, , and n p/ / , ,j j� �1 2 3, then the maximum in (34) would be s/3. however none physical medium could fulfill all these conditions, as will be shown in the following. in order to determine the real maximum, it is necessary to realize that any � 1 g can be written as a linear combination of three basic cases � �22 33 1� � , � �33 11 1� � and � �11 22 1� � . in each of them local strains must be uniform according to (30) and the value of the corresponding g1 * must be the same, as specified in table 2. using superposition, the necessary maximality condition from table 2 reads as: n ( p p p p p p� � 1 2 1 1 1 2 2 3 3 1 1 2 2 3 3 g * ) ,� � � � � � (36) where the coefficients �i are expressing the particular basic cases combination, corresponding to the imposed macroload. additional constraints from table 2 must be satisfied simultaneously, giving: q p r r r r r r r j k j, k% � � � � � � 1 2 3 1 2 3 1 2 2 3 , , , , cos( , ) cos( , ) cos( , )r r3 1 (37) and p p p1 2 3 12� � � g v * . (38) (38) could be obtained directly as the condition ensuring the same g1 * in all basic cases. if (38) were to be derived first, then using some statements about finite dimensional spaces, condition (36) is the maximality condition for the sum: cos ( , ) cos ( , ) cos ( , )2 1 2 2 2 3n p n p n p . here it holds: cos ( , ) cos ( , ) cos ( , )2 1 2 2 2 3 3 2 n p n p n p � . (39) then the bound g s1 6, * � can be obtained from (34) if q 0j j� � �1 2 3, , . thus the proof of g s1 6, * � would be completed if at least one optimal medium can be found, i.e. if there exists a medium in which q 0j j� � �1 2 3, , , expressions (36–38) hold and no zero bars are contained in it. in order to justify the existence of such a medium, first of all, the spherical angles that will ensure q 0j j� � �1 2 3, , must be found. this requirement is equivalent to the condition under which & �max , , ,� � �1 2 3i i i (40) is obtained for each i. solution of problem (40) results in three groups of angles, which predict the bars directions of the bars in an optimal medium, as stated in table 3. it is therefore convenient to choose a rectangular basic cell with faces perpendicular to the directions of the bars. due to the equilibrium in the joints, only continuous bars passing through the cell can be present. from table 3 it follows immediately that r r r1 2 3% % , but in order to ensure r r r1 2 3� � , the following condition must be satisfied: l l li i group j j group k k group a a a 1 2 3. . . � � �� � , (41) i.e. in each of the three perpendicular directions in the cell, the volume of the bars must be the same. it remains to ensure (36) and impose conditions to eliminate zero bars. 
let us take for example one continuous bar from the first group. from (36) it follows that all over the bar n ai i � � �3 2 holds. therefore the normal forces between the respective joints must be proportional to the cross sectional areas with the © czech technical university publishing house http://ctn.cvut.cz/ap/ 179 acta polytechnica vol. 44 no.5–6/2004 basic case 1. 2. 3. macrostress � �22 33 1� � � �33 11 1� � � �11 22 1� � macrostrain e e g es 22 33 11 2 � � � ( )* e e g es 33 11 11 2 � � � ( )* e e g es 11 22 11 2 � � � ( )* maximality condition from eq. (30) n p� 1 2 1 1 g * n p� 1 2 1 2 g * n p� 1 2 1 3 g * additional constraints r r r p2 3 1 1� %, , q pj 1 j% � �1 2 3, , r r r p3 1 2 2� %, , q pj 2 j% � �1 2 3, , r r r p1 2 3 3� %, , q pj 3 j% � �1 2 3, , table 2: basic load cases in � 1 g same coefficient of proportionality in each group. due to the equilibrium in the joints, the normal forces must be the same within each continuous bar, which implies that the cross sectional areas are also constant within the continuous bar, as well. let us now summarize the results. g s1 6, * � and all g1 *optimal media can be fully geometrically specified in the following way: g1 *optimal media are continuous lattices for which: � a rectangular basic cell (with dimensions li in yi -directions, i � 1, 2, 3) can be found, consisting only of continuous orthogonal bars in yi -directions, � each bar has a constant cross sectional area within the basic cell and the condition l a l a l ai i n j j n k k n 1 1 2 1 3 1 1 2 3 � � � � � �� � is satisfied (ni is the number of bars in the yi -direction, i � 1, 2, 3). the group of media specified above is the only group of g1 *-optimal media. they are in fact a 3d extension of upl (the uniform perpendicular lattices) introduced in dimitrovová and faria (1999). the simplest example from this group is the regular cubic lattice (fig. 5). the value of its g1 * (not the proof of maximality) can be obtained directly from g1 * of its 2d analog: the regular square lattice. if we denote by s2d and s3d the material volume fractions of 2d and 3d regular lattices, respectively, it holds s sd d2 32 3� , and consequently g s sd d1 2 3 1 4 1 6 * � � . (42) 3.3 shear modulus g2 * (for effective cubic symmetry) first of all, we point out that in 3d there exists no such rotation of the global coordinates that would interchange the positions of g1 * and g2 * in c*, as it does in 2d (see dimitrovová and faria (1999)). thus g2 *optimal media cannot be derived from g1 *optimal media. for macroload �2g we can obtain: g v s t t t 2 1 2 2 2 3 2 2 1 2 2 1 6 3 * ( ) ( ) ( ) cos ( � � � � � � � � q n q n q n n q n, ) cos ( , )q q n q p p p q q q q 1 2 2 2 2 1 2 2 2 3 2 1 2 2 2 3 2 3 2 � � cos ( , ) . 2 3 1 2 2 2 3 2 1 2 2 2 3 2 n q p p p q q q � � � � (43) additional constraints on possible n are: n r% � �j j 1 2 3, , . (44) the obvious maximum s/3 cannot be achieved by any medium, similarly as in section 3.2. also, combining the 2d results (unlike to (42)) would lead to a wrong conclusion, as can be demonstrated: let only �23 0� in � 2g, then an optimal medium should have bars in the directions of the unit square diagonals in (2, 3)-planes, according to dimitrovová and faria (1999). by analogy, the other load cases �12 0� and �13 0� imply directions of the bars in (1, 2) and (1, 3)-planes, respectively. 
the 2d result g s d2 2 4, * � and the fact that s sd d2 3 3� thus yield g s 3d2 12 * � , because the directions of 180 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 44 no.5–6/2004 group 1. 2. 3. spherical angles � � 1 12 0� �, � � �2 22 2� �, �3 0� values of �1, �2, �3 0, 1, 1 1, 0, 1 1, 1, 0 values of �1, �2, �3 1, 0, 0 0, 1, 0 0, 0, 1 table 3: characterization of g1 *optimal media basic case 1. 2. 3. macrostress �23 1� �13 1� �12 1� macrostrain �23 21 2� ( ) *g es �13 21 2� ( ) *g es �12 21 2� ( ) *g es maximality condition from eq. (30) n q� 1 2 1g * n q� 1 2 2g * n q� 1 2 3g * additional constraints 1 2 1 3 1 1 2 2 q q q q q r q % % % � � & * i i g v 2 1 2 3 2 2 2 2 q q q q q r q % % % � � & * i i g v 3 1 3 2 3 3 2 2 q q q q q r q % % % � � & * i i g v table 4: basic load cases in �2g the bars stated previously do not coincide. however, it will be proven that g s2 9, * � . an arbitrary �2g can be expressed as a linear combination of three basic cases (table 4). using superposition, the necessary maximality condition reads as: n � � 1 2 1 1 2 2 3 3 1 1 2 2 3 3 g * )( q q q q q q� � � � � � , (45) where the coefficients �i express a combination of the specific basic cases, corresponding to the imposed macroload. additional constraints are: q r q q qj k 1 3j, k &% � � % %1 2 3 2, , (46) and q q q1 2 3 2� � � g v * (47) if (47) were to be derived first, then using some statements about finite dimensional spaces, (45) is the maximality condition for the sum of cosines from (43), similarly as in section 3.2. now, due to the orthogonality of qj, the sum of the cosines is equal to 1, therefore g s2 9, * � , if at least one optimal medium exists, i.e. if there can be found a medium in which p 0j j� � �1 2 3, , , (45–47) hold and no zero bars are contained in it. the requirement p 0j j� � �1 2 3, , is equivalent to the condition under which & �max � � �1,i 2,i 3,i (48) is obtained for each i. solutions of (48) yield four groups of angles, as specified in table 5, corresponding to the main diagonals of the unit cube. it is not convenient to choose a basic cell with eight faces (perpendicular to the directions of the bars), because a regular octahedron does not fill the space. it is better to assume a rectangular cell according to fig. 6. conditions q rj k j, k 1, 2, 3% � � imply, again, the same volume constraint of the bars within each group: l l l li i 1.group j j 2.group k k 3.group r r 4.group a a a a� � �� � � � , (49) which, in a sequel, guarantees the mutual perpendicularity of q j j 1,2,3, � , while (47) is satisfied directly. it remains to ensure (45) and impose conditions to eliminate zero bars. let us take one bar from the first group. from (45) it directly follows that in each part between the joints of this bar n ai i � ( )� � �1 2 3 3 holds. thus the normal forces must be proportional to the cross sectional areas with the same coefficient of proportionality in each group, in other words, in the bars of each group the local stresses must be the same: �1, �2, �3, �4, respectively. because four possible directions of the bars exist, it cannot be directly concluded that due to the equilibrium in the joints only continuous bars are included in the cell. however, this statement can be justified in the following way. obviously � � � � � � � � � � � � � 1 1 2 3 2 1 2 3 3 1 2 3 3 3 3 � � � ( ) , ( ) , ( ) and 4 1 2 3 3� ( )� � � (50) hold. let us take a joint and suppose that a member from each group is presented there as continuous. 
contributions to the cell coordinate directions are given in table 6. consequently, the equilibrium in the joint reads as: � � � � 1 2 3 ( ) ( ) ( ) a a a a a a 1,m 1,m 1 2,n 2,n 1 3,r 3,r 1 4 1 2 0( ) , ( ) ( ) a a a a a a 4,s 4,s 1 1,m 1,m 1 2,n 2,n 1 � � � � � � 3 4 1 0( ) ( ) , ( ) a a a a a a 3,r 3,r 1 4,s 4,s 1 1,m 1,m 1 � � � � 2 3 4 0 ( ) ( ) ( ) a a a a a a 2,n 2,n 1 3,r 3,r 1 4,s 4,s 1 � , (51) © czech technical university publishing house http://ctn.cvut.cz/ap/ 181 acta polytechnica vol. 44 no.5–6/2004 group 1. 2. 3. 4. spherical angles cos ; sin ; cos ; sin � � 1 1 1 1 1 2 1 2 1 3 2 3 � � � � cos ; sin ; cos ; sin � � 2 2 2 2 1 2 1 2 1 3 2 3 � � � � cos ; sin ; cos ; sin � � 3 3 3 3 1 2 1 2 1 3 2 3 � � � � cos ; sin ; cos ; sin � � 4 4 4 4 1 2 1 2 1 3 2 3 � � � � values of �1, �2, �3 1/3, 1/3, 1/3 1/3, 1/3, 1/3 1/3, 1/3, 1/3 1/3, 1/3, 1/3 values of �1, �2, �3 1/3, 1/3, 1/3 1/3, 1/3, 1/3 1/3, 1/3, 1/3 1/3, 1/3, 1/3 table 5: characterization of g2 * optimal media 1 y 2 y 3 y groups.2and.1 ofdirections groups.4and.3 ofdirections fig. 6: rectangular cell for g2 * optimal media, top view where the first subscript at the cross sectional areas denotes the group and the second one expresses the order number within the group. (51) must be satisfied for any �i i, , , ,�1 2 3 4, consequently the cross sectional areas must be either the same (resulting in a continuous bar with a constant cross sectional area) or zero (the group is not contained in the joint), which completes the justification. in summary, g s2 9, * � and all g2 *optimal media can be fully geometrically specified in the following way: g2 *optimal media are continuous lattices for which: � a rectangular basic cell, according to fig. 6, can be found, where only continuous bars in the four directions specified by table 5 are present, � each continuous bar has a constant cross sectional area and (49) holds. the group of media described above is the only group of g2 *-optimal media. the name for such media was introduced in dimitrovová and faria (1999) as udl (uniform diagonal lattices). the simplest example from this group is the regular cube-diagonal lattice in fig. 7. 3.4 shear modulus g* (for effective isotropy) in this case, the conclusions from the two previous sections can be exploited. let us assume that we already have a g*-optimal medium. macroloads �1g and �2g can be imposed separately on it and the same bounding procedure as in sections 3.3–4 can be performed. it is only necessary to prevent a geometrical specification which would enter in contradiction with the possibility of effective isotropy of the medium. thus: g s* � � 4 2 2 p l and g s* � � 6 2 2 q l , (52) where the subscripts in p and q are omitted for the sake of simplicity. since the maximum in both relations of (52) must be the same, p q2 2 2 3� and taking into account the last expression of (26), g* � s /15 can finally be obtained. therefore g s � * 15, if at least one optimal medium exists. the necessary maximality and additional constraints can be expressed by analogy with sections 3.2–3 as: n p q� �� �j j k k j k, , , ,1 2 3 (53) and r r r r r r r r r q r 1 2 3 1 2 2 3 3 1 � � � � % � , cos( , ) cos( , ) cos( , ) ; j k j, k 1 3 � % % � � 1 2 3 2 1 2 3 , , ; ; .q q q q q q (54) unfortunately, no full geometrical characterization of g*optimal media is possible. 
the existence of at least one optimal medium can be proven by superposition of the results, namely by combining of the cells of the simplest g1 *and g2 *optimal media, see dimitrovová and faria (1999) for the conditions under which such a superposition can be performed. let us denote the material volume fractions of the simplest g1 *and g2 *optimal media as s1g and s2g, respectively. it can be written: g s s 9 1g 2g* � � 6 , (55) yielding s s1g 2g� 2 3 and consequently (s = s s1g 2g because the bars of the original media do not coincide) g s* � 15 for the combined medium. as a consequence, the relation between the cross sectional areas can be derived as a a 91g 2g� 8 3 , where, a1g and a2g stand for the cross sections of the original g1 *and g2 *-optimal media, respectively (figs. 5 and 7). it can be verified that also in this case the bending effect can be superposed directly, as in the 2d analog, as shown in dimitrovová and faria (1999). 4 concluding remarks it was proven that g1 *and g2 *-optimal media can be geometrically fully specified. they are upl and udl, respectively. neither for k*optimal nor for g*optimal media can a full geometrical specification of their microstructure be given. 182 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 44 no.5–6/2004 1 y 2 y 3 y g2 a fig. 7: the regular cube-diagonal lattice �1 �2 �3 �4 y1 1 3 1 3 1 3 1 3 y2 1 3 1 3 1 3 1 3 y3 1 3 1 3 1 3 1 3 table 6: contributions of the local stresses to the coordinate directions it is easy to verify that g1 *and k*optimal media, assumed either as micro-trusses or as micro-frames, respond only axially, while in g2 *and g*optimal micro-frames, a bending response is always presented. the bending contribution is different for different g2 *and g*optimal media, therefore the non-linear part is rather difficult to define. however, it can be stated that for low-density media this non-linear part is not important. the bending contribution can be increased by putting more material close to the joints, because the bending moment distribution is antisymmetric within each beam (section 2.2). however, this change would decrease the axial contribution (17–18), and the corresponding linearized bound would decrease. then the medium would no longer be optimal according to the definition from section 2.2. it is useful to remark that the directions of the bars in optimal micro-trusses should be related to the principal directions of the applied macroload according to the theory of michell trusses. this is directly related to the impossibility of full geometrical characterization of k*and g*optimal media. for � k each direction is a principal direction. �g can be determined by five non-zero and independent parameters, and therefore each direction can also be assumed as a principal one. the directions of the bars are precisely specified in g1 *and g2 *optimal media. in the former case they coincide with the principal directions of the macroload, which are in this case unique for any �1g, but in the latter case the directions of the bars can hardly be determined in such a way. another remark yields from a comparison of the additional constraints (geometrical requirements) for k*and g*optimal media, (33) and (54), respectively. it can be shown, with the help of (24) and the second relation in (26), that the group of media satisfying (54) forms a subgroup of microstructures for which (33) holds. 
therefore each k*optimal medium already fulfills additional constraints for g*optimal media, so that it is hard to find a g*optimal medium which is not k*optimal. it is straightforward to derive bounds for young’s modulus for media with effective isotropy and cubic symmetry, respectively, in the forms of e k g k g s s s s s is, � � � �* * * * * , 4 4 9 15 9 15 6 (56a) e k g k g s s s s 4 s ,cs, � � � �* * , * * , * 4 4 9 6 9 6 15 1 1 (56b) but no conclusions can be reached on the upper bounds on effective poisson’s ratios. it is only easy to verify that upl have zero effective poisson’s ratio. k*& g*optimal micro-trusses have an effective poisson’s ratio equal to 1/4, as shown by bakhvalov and panasenko (1989). in optimal micro-trusses it is interesting to see what the other elastic properties are. a summary is given in table 7. furthermore, in table 8 the bounds for open-cell foams proven in this paper are compared with the composite ones. the composite bounds for effectively isotropic media are taken from hashin and shtrikman (1963) and hashin (1970, 1983) and for media with effective cubic symmetry they are taken from avellaneda (1987). they are specified to one void phase and linearized with respect to the material volume fraction. it can be seen that the solid phase poisson’s ratio �s naturally appears in the linearized composite bounds (unlike the 2d case shown in dimitrovová and faria (1999)). this is because shell or plate parts must be included in optimal 3d media. � and � stand for coefficients of the bending contribution. finally, let us make some remarks about the simplified assumptions adopted for the strain energy contribution. it is known that assuming the micro-frame medium with theoretical lengths makes the medium softer than it really is. it is thus better to use active lengths of the beam and include the deformation of the joints. moreover, the strain energy density corresponding to the shear forces can be included in w. obviously, such improvements do not change the linearized bounds since they do not influence expressions for the axial response of the media. if, e.g., the strain energy density corresponding to the shear forces were to be included, parameters � and � from tab. 8 would decrease. in this case the solid phase poisson’s ratio would appear in the final result. appendix admitting a more general solid material behavior it can be shown that open-cell foam bounds coincide with the voigt bound. for the sake of simplicity let us assume a 2d medium, the regular lattice, which is k*& g1 *optimal (dimitrovová and faria (1999)). k* stands for 2d bulk modulus and has the © czech technical university publishing house http://ctn.cvut.cz/ap/ 183 acta polytechnica vol. 44 no.5–6/2004 macro-load optimal micro-trusses other elastic properties k* 1g* 2g* g* �k upl, udl, other media, which cannot be fully geometrically specified k * not uniquely defined not uniquely defined not uniquely defined �1g only upl k * 1g * 0 – �2g only udl k * 0 2g * – �g media, which cannot be fully geometrically specified, but neither upl nor udl not uniquely defined g * g * g * table 7: other elastic properties in optimal micro-trusses same meaning as before. voigt bounds kv * and g1,v * for one void phase are (see hill (1963)): k s g s v * s 1,v * s � � 2 1 2 1( ) , ( ) , � � where �s is the solid phase poisson’s ratio. now deformation of joints cannot be neglected, due to the presence of �s, which is restricted to the interval [ 1,1]. 
it is obvious that the strain field inside the cell of the regular lattice would be fully uniform for �k macroload only if �s � 1 giving k sv * � 4 and for �1g macroload only if �s �1 yielding g s1,v * � 4, which are the upper bounds on the properties of 2d cellular media. references [1] allaire g., aubry s.: “on optimal microstructures for a plane shape optimization problem.” struct. opt. vol. 17 (1999), p. 86–94. [2] allaire g., kohn r. v.: “optimal design for minimum weight and compliance in plane stress using extremal microstructures.” eur. j. mech., a/solids, vol. 12 (1993), p. 839–878. [3] avellaneda m.: “optimal bounds and microgeometries for elastic two-phase composites.” j. appl. math., siam, vol. 47 (1987), p. 1216–1228. [4] bakhvalov n., panasenko g.: “homogenization: averaging processes in periodic media.” dordrecht, boston, london: kluwer academic publishers, 1989. [5] bensoussan a., lions j. l., papanicolau g.: “asymptotic analysis for periodic structures.” north holland, amsterdam, 1978. [6] christensen r. m.: ”heterogeneous material mechanics at various scales.” appl. mech. rev., vol. 47 (1994), s20–s33. [7] christensen r. m.: “the hierarchy of microstructures for low density materials.” z. angew. math. phys., vol. 46 (1995), p. s507–s521. [8] dimitrovová z.: “effective constitutive properties of linear elastic cellular solids with randomly oriented cells.” j. appl. mech., asme, vol. 66 (1999), p. 918–915. [9] dimitrovová z., faria l.: “new methodology to establish bounds on effective properties of cellular solids.” j. mech. compos. mater. struct., vol. 6 (1999), p. 331–346. [10] duvaut g.: “homogenisation et materiaux composites.” in: “theoretical and applied mechanics.” ed. p. ciarlet; m. rouseau, amsterdam, north-holland, 1976. [11] gibson l. j., ashby m. f.: “cellular solids. structure and properties.” pergamon press, oxford, 1988. [12] gibianski l. v., sigmund o.: “multiphase elastic composites with extremal bulk modulus.” j. mech. phys. solids, vol. 48 (2000), p. 461–498. [13] grenestedt j. l.: “effective elastic behavior of some models for ‘perfect’ cellular solids.” int. j. solids struct., vol. 36 (1999), p. 1471–1501. [14] guedes j. m., rodrigues h. c., bendsoe m. p.: “a material optimization model to approximate energy bounds for cellular materials under multiload conditions.” struct. opt., vol. 25 (2003), p. 446–452. [15] hashin z.: “the elastic moduli of heterogeneous materials.” j. appl. mech., asme, vol. 29 (1962), p. 143–150. [16] hashin z.: “theory of composite materials.” in: ”mechanics of composite materials.” ed. f. w. wendt, h. liebowitz, n. perrone. pergamon press, oxford, (1970). [17] hashin z.: “analysis of composite materials-a survey.” j. appl. mech., asme, vol. 50 (1983), p. 481–505. [18] hashin z., shtrikman s.: “a variational approach to the theory of the elastic behavior of polycrystals.” j. mech. phys. solids, vol. 10 (1962), p. 343–352. 184 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 
44 no.5–6/2004 bounds comparison composite bounds k � * ( ( ) ( )) 2 3 3 1 1 s ss s� � g � * ) ( )( ( ) ( )) s(7 5 s s s s s � � � �2 1 15 1 2 4 5 1g � � * ) ( ) ( ( ) ( )) s(2 s s s s s � � � �2 1 1 3 1 1 2 2g � * ) ( )( ( ) ( )) s(5 3 s s s s s � � � �2 1 9 1 2 2 3 k * 1g * 2g * g * linearized form 2 s s9 1( ) � 2 ss s � �6 1 2 ( ) 5 ss s 3 18 1 2 � �( ) 7 ss s 5 30 1 2 � �( ) previous form with �s � 0 2s 9 s 3 5s 18 7s 30 bound for open-cell foams from this article s 9 s 6 s s2 9 � s s2 15 � optimal media are … determined by necessary and sufficient conditions fully geometrically specified fully geometrically specified determined by necessary and sufficient conditions table 8: : composite bounds, bounds for open-cell foams and characterization of the optimal media [19] hashin z., shtrikman s.: “a variational approach to the theory of the elastic behavior of multiphase materials.” j. mech. phys. solids, vol. 11 (1963), p. 127–140. [20] hill r.: “elastic properties of reinforced solids: some theoretical principles.” j. mech. phys. solids, vol. 11 (1963), p. 357–372. [21] kraynik a. m., warren w. e.: “the elastic behavior of low-density cellular plastics.” in: “low density cellular plastics: physical basis of behavior.” ed. n. c. hilyard, a. cunningham. chapman & hall, london, 1994, p. 217. [22] lekhnitskii s. g.: “theory of elasticity of an anisotropic body.” mir publishers, moscow, 1981. [23] li k., gao x.-l., roy a. k.: “micromechanics model for three-dimensional open-cell foams using a tetrakaidecahedral unit cell and castigliano’s second theorem.” compos. sci. techn., vol. 63 (2003), p.1769–1781. [24] nemat-nasser s., hori m.: ”micromechanics: overall properties of heterogeneous materials.” north-holland series in applied mathematics and mechanics, vol. 37, ed. j. achenbach, b. budiansky, h.a. lauwerier, p.g. saffman, l. van wijngaarden, j.r. willis, north-holland-amsterdam, london, new york, tokyo, 1993. [25] neves m. m, guedes j. m., rodrigues h.: “optimal design of periodic linear elastic microstructures.” comput. & struct., vol. 76 (2000), p. 421–429. [26] noor a. k.: “continuum modeling for repetitive lattice structures.” appl. mech. rev., vol. 41 (1988), p. 285–296. [27] sigmund o.: “materials with prescribed constitutive parameters: an inverse homogenization problem.” int. j. solids struct., vol. 31 (1994), p. 2313–2329. [28] suquet p. m.: “elements of homogenization for inelastic solid mechanics.” in: “homogenization techniques for composite media.” ed. e. sanchez-palencia, a. zaoui, lecture notes in physics, 272, springer-verlag, 1985, p. 193-278. [29] warren w. e., kraynik a. m.: “the linear elastic properties of open-cell foams.” j. appl. mech., asme, vol. 55 (1988), p. 341–346. [30] warren w. e., kraynik a. m.: “linear elastic behavior of a low-density kelvin foam with open cells.” j. appl. mech., asme, vol. 64 (1997), p. 787–794. [31] zhu h. x., knott j. f., mills n. j.: “analysis of the elastic properties of open-cell foams with tetrakaidecahedral cells.” j. mech. phys. solids, vol. 45 (1997), p. 319–343. zuzana dimitrovová phone: +351 218419462 fax: +351 218417915 e-mail: zdimitro@dem.ist.utl.pt / zuzana@dem.isel.ipl.pt dem / isel idmec / ist instituto superior técnico av. rovisco pais, 1 1049-001 lisbon, portugal © czech technical university publishing house http://ctn.cvut.cz/ap/ 185 acta polytechnica vol. 
44 no.5–6/2004 ap07_2-3.vp 1 introduction and motivation a detailed understanding and description of quantal and classical phenomena has attracted the attention of many mathematicians and physicists for a long time. quantum and classical mechanics are the best elaborated, understood and examined parts of physics. their mathematical setting is concentrated around powerful artillery, which includes differential geometry, spectral calculus, functional analyses, group and representation theory, (co)homology techniques, and so on. the problem of how to get directly from classical dynamics, represented by the system of second-order differential equations of the newton-lagrange type, �� ( , �, )x x x ti i� � to the corresponding quantum dynamics was articulated by feynman, see freeman dyson’s editorial comment [1]. standard approaches are based on canonical quantization (heisenberg-like and/or schrödinger-like equations) or on the feynman path integral technique. all of these procedures require in some sense lagrangian � � �� � and/or hamiltonian � � �� � functions, such that �� ( , �, )x x x t x x i i i i � � � �� � �� � � � 0 0and / or . here we have imprecisely denoted the functional derivatives of the underlying classical variational principles by � � � x and/or � � � x . the fact that kinetic energy � is a time-independent quadratic form in velocities and/or the corresponding momenta is absolutely crucial for the classical and quantal descriptions. it provides an intimate connection with the universal mechanical property known as inertia. the question of, whether there exists a lagrangian and/or hamiltonian for the given set of forces for the initial newton-lagrange system has been studied by many authors, for more detail see for example [2–4]. this problem is known as the inverse problem of the calculus of variation. in many applications, requirements such as the quadraticity and time-independence of � are suppressed. for example, for a one-dimensional free particle driven by friction proportional to the actual velocity, one can assign to any real number � the lagrangian function �� as follows: � � � � � � �� � ( , �, ) , , � ln � � , � ( ) x x t r x x x x x t � � � � � � � 1 1 1 0e for for e for � � � � � � �� � � � � 0 1t xln �, then � � �� x x x� � � �0 �� �. no two of the lagrangians listed above are equivalent, i.e. � �� �� �const d dt f x t( , ). this simple observation in the simplest example under consideration has very strong physical consequences in general. the non-equivalent lagrangians lead to non-equivalent quantum mechanics (this is unrelated to the problem of ordering), i.e. the transition amplitudes computed according to them are different, though their classical limit is the same. this problem is called quantization ambiguity. it is possible to say which lagrangian provides the “genuine“ quantum mechanics only after performing a suitable experiment, but the role of quadraticity and time independency of � will surely be essential there. the aim of this paper is to try to find an answer to the feynman problem, and, at the same time, to provide a geometrical picture of classical and quantum mechanics for physical systems, where a proper lagrangian and/or hamiltonian description is missing. the central object in our approach is a certain canonical two-form �, which is defined in an extended tangent bundle. its main properties are narrowly studied in section 2. the differential two-form � serves as a guide for a new type of variational principle. 
in section 3 we introduce the notion of an “umbilical world-sheet.” it generalizes the concept of the history of the system and there60 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 47 no. 2–3/2007 quantization of equations of motion d. kochan the classical newton-lagrange equations of motion represent the fundamental physical law of mechanics. their traditional lagrangian and/or hamiltonian precursors when available are essential in the context of quantization. however, there are situations that lack lagrangian and/or hamiltonian settings. this paper discusses a description of classical dynamics and presents some irresponsible speculations about its quantization by introducing a certain canonical two-form �. by its construction � embodies kinetic energy and forces acting within the system (not their potential). a new type of variational principle employing differential two-form � is introduced. variation is performed over “umbilical surfaces“ instead of system histories. it provides correct newton-lagrange equations of motion. the quantization is inspired by the feynman path integral approach. the quintessence is to rearrange it into an “umbilical world-sheet“ functional integral in accordance with the proposed variational principle. in the case of potential-generated forces, the new approach reduces to the standard quantum mechanics. as an example, quantum mechanics with friction is analyzed in detail. keywords: quantization of dissipative systems, umbilical strings, path vs. surface integral. dedicated to tulka. fore it becomes important in the context of quantization. variation uncovers the desired classical trajectory and, as a bonus, also some kind of minimal surface. in section 4 we will see how “umbilical strings” can be used to rearrange the feynman integral over the histories of the system to the surface functional integral. string formulation has the big advantage that it concerns components of the forces rather than their potential. in section 5 we are able to compute the transition probability amplitude for a quantum system with friction explicitly performing the surface functional integration. for potential-generated forces, the “umbilical world-sheet” approach reduces to standard quantum mechanics. this paper follows up ideas presented in my previous work. many facts briefly mentioned here can be consulted in detail in [5, 6]. 2 lagrangian mechanics and the two-form � the physical content of classical mechanics is represented by newton’s dynamical law. its formulation in general curvilinear coordinates (after resolving e.g. the initial holonomic constraints) coincides with the lagrange equations [7–10]: d dt q q i n i i i � � � � � � � � , , , ( � � � � � � � � � � � �1 � number of degrees of freedom). (1) here, �( , �, ) ( , ) � �q q t t q t q qab a b� 1 2 is the kinetic energy of the system and � i iq q t r q ( , �, ) � f� �� � is the i-th component of a generalized force. in the special case when forces are potential-generated � � � i i iq t q � � � � � � � � � � � � � � � d d � , one can introduce the lagrangian function � � �� � and write down the celebrated euler-lagrange equations. generalized coordinates { }qi cover some open patch of the configuration space (n-dimensional manifold) m. let us trace out the importance of kinetic energy � in a geometrical description of mechanics. components tab can be interpreted as some riemannian metric on m. 
here the fact that � is the quadratic function in the generalized velocities is absolutely crucial. after introducing a rlc-connection � for such a metric (kinetic energy), one can rewrite (1) in the closed form: � � p t i id � , where p q i i : � � � � � . (2) we immediately realize that for the free case the system evolves along a geodetic specified by the riemannian connection �. this, hopefully, sheds some light on the phenomenon called inertia. in the lagrangian picture, the space of all physical states is the set of all admissible initial conditions for the differential system (1). the initial condition specified at time t0 by the generalized position q t q( )0 0� and velocity � ( )q t v0 0� , defines a point (q0, v0, t0) in an extended tangent bundle tm � �. for the extended tangent bundle coordinates we will use the ( )2 1n � -tuple� �q q v v tn n1 1, , , , , ,� � . let us express from (1) generalized accelerations as functions of the remaining entries: �� ( , �, , ) � � ( , �, ) q f q q t q q q q t i i i a a � � � � � � � � � � � � � � � � � � 2 1 � � � � � � � � � � � q q q q q ta a b b a � � � �� � � � �� 2 2 � � � . when identifying �qi with vi, we get instead of (1) the system of (2n�1) first-order differential equations in the extended tangent bundle: �q vi i� , � ( , , , )v f q v ti i� � , �t �1. (3) the system above can be interpreted as a coordinate expression of a vector field on tm � �; down-to-earth, according to (3) one can assign to any physical state (q, v, t) a tangent vector � ( , , ) ( , , ) ( , , ) ( , ( , , , ) q v t t q v t i q q v t i v q v v f q v t i i � � � � � �� , ) ( , , ) ( ). t q v t t tm� � � (4) the time evolution is represented by a curve in the extended tangent bundle (see fig. 1) � : [ ]� �� �tm , � � � ( )� �q q , v v� ( ) , such that d dt � � � ( ) ( ) ( ) � � . we have just observed that classical dynamics is determined by the extended tangent bundle vector field �. having the function �( , , )q v t and the components of the generalized force �( , , )q v t we can establish the two-form � � � �� �� : ( )� � � � �� � �i i v i iq t t q v tid d d d d d� . (5) © czech technical university publishing house http://ctn.cvut.cz/ap/ 61 acta polytechnica vol. 47 no. 2–3/2007 fig. 1: time evolution in the lagrangian picture is bedded in the extended tangent bundle. at each physical state (q, v, t) there is a uniquely prescribed vector � ( , , )q v t , which defines the dynamics. following its integral curves, the complete time evolution is recovered its main properties can be summarized as follows: � it is a differential two-form on the extended tangent bundle � for any point (q, v, t) , it provides the linear map � � � � � � � � : ( ) ( ), : ( , ) ( , , ) ( , , ) *t tm t tmq v t q v t� � w w w� � � � � if � is nonsingular � �� �v vi a 2 ( )� is invertible , then the kernel � of the above contraction is one-dimensional and it is spanned by the vector � ( , , )q v t ; the subspace � is called the null-space of � � whenever � is potential-generated � � � � � � � � � � � �� � � � �� � � � i i iq t v � � � � d d , then � is exact, i.e. � � d�� , where � �� �� �: ( )� � �d d dt q v tv i ii is the lepage one-form on the extended tangent bundle that is associated to the lagrangian � � �� � . 
� the two-form � and also the one-form �� are invariant with respect to the group of diffeomorphisms of the “space-time” m � �, which are of the form: � ( , ) ( , ),q t q q q t t ti i i� � � so we can claim: lagrangian mechanics is determined by the null-spaces of the distinguished two-form �. finding them, it is enough to pick up at each null subspace � a vector w for which w� �dt 1 . doing this, we are point-wisely reconstructing the dynamical vector field (4). its integral curves are solutions of the lagrange equations. 3 variational principle and “umbilical strings” suppose there are given newton-lagrange type equations of motion with forces of any origin: � , � ( , , ) ker ( q v v f q v ti i i i� � � nel of made of the ingredient � s and� �� � 1 2 �ab a b i iv v f ) (6) in what follows we will provide a variational principle for the above set of differential equations just in terms of the distinguished two-form � introduced above. variation will be carried over bit peculiar objects, namely the surfaces in the extended tangent bundle. down-to-earth, let us fix � two points in the extended configuration space m � �, initial and final events (q0, t0) and (q1, t1) � any extended tangent bundle curve � � ref : , ( ), ( ),� � � � � �t t q q v v t tm0 1 � �, such that q t q( )0 0� and q t q( )1 1� the space of admissible “umbilical surfaces” of the reference curve � ref is defined as follows: �umb ( ): :( , ) , , ( , ) ( , ), ( � ref � � � � � � � �t t q q v v 0 1 01 � � , ), , : t tm q � � � � such that for all values of the parameter ( , ) , ( , ) : ( , ) � � � � � � � t q q t q0 0 1 1 0of the parameter � �ref ( ) . here � t t0 1, is the time parameter, is some “worldsheet” distance coordinate from the unit interval and the second edge curve � ( ): ( , )� �� 1 , see fig. 2. 62 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 47 no. 2–3/2007 fig. 2: oriented “umbilical surface” � connects the reference curve �ref with the “history” � ( ) ( , )� �� 1 . sideways boundary curves � 0 0( ) ( , )� � t and � 1 1( ) ( , )� � t are located within the n-dimensional submanifolds ( )q t0 0, � fixed and ( , )q t1 1 � fixed of the extended tangent bundle. in the figure, these edge submanifolds are schematically represented by two-dimensional “d-branes.” in the above class of all admissible “umbilical world-sheets” a stationary surface of the action s s: ( ):� � � � � � ! (7) satisfies the following two equations: boundary term: 0 � � d d � � , (8) bulk term: 0 � �� � � � � �d� � � � � � � , , . (9) the first equation states that tangent vector d d � lies in the kernel of �. this means that the second edge �( , ) �1 of the sought stationary surface satisfies the initial newton-lagrange differential system (6). by its definition, it should be a classical trajectory �class that connects the space-time events (q0, t0) and (q1, t1). the genuineness of � class � ��( , )1 is obvious, and it does not depend on the chosen auxiliary reference curve � ref . the complete solution of the variational problem provides as a bonus also some stationary (also called minimal) surface �min. it is anchored to the curves � ref and � class and is trapped in between the “d-brane type” submanifolds ( , )q t0 0 � fixed and ( , )q t1 1 � fixed in the extended tangent bundle. whether such a stationary “umbilical” surface exists depends on properties of the physical system under consideration. 
4 quantization: path versus surface integral in the case when classical dynamics (6) is “derivable” from the lepage one-form ��, i.e. � � d�� , one can use for quantization the feynman prescription [11, 12]. according to feynman, the probability amplitude of the transition of the system from the space-time configuration (q0, t0) to (q1, t1) is given as follows: a( , , , ) [ ]expq t q t i 0 0 1 1 " � � � � � � � � !! � � � � � � � . (10) the “path-summation” here is taken over the class � of all admissible curves in tm � �, as drawn in fig. 3. the exponent in (10) is the standard curve integral of the one-form �� over �. the questions of the measure [ ]�� and the proper normalization of the probability amplitude are discussed in the next section. we have already noted that classical mechanics is only � sensitive. on the other hand the sensitivity of quantum mechanics on its one-form potential precursor �� is ultimately evident from the feynman prescription. in what follows, we propose some modifications to (10), leading to the replacement of �� by the two-form �. this will enable us to “quantize” dissipative systems as well. our main trick is a simple rearrangement based on the stokes theorem. down-to-earth, in the class � that enters the “path-summation” in (10), there is one specially distinguished curve, the classical trajectory. using it, for any other � within this class we get an oriented loop (cycle): � � � � �� : � � � �1 0class . here �0 and �1 are arbitrarily chosen curves within the “d-branes“ ( , )q t0 0 � fixed and ( , )q t1 1 � fixed, see fig. 2. the restriction of �� to any of these edge submanifolds is trivial, therefore we can write: � � � � � � � � � � � � � � � � �! ! ! ! ! !� � � � � class zero d 1 0� � � � , where umb class� � ( ).� (11) © czech technical university publishing house http://ctn.cvut.cz/ap/ 63 acta polytechnica vol. 47 no. 2–3/2007 fig. 3: schematic picture of the class � � � � � �� : ( ( ), ( ), ), ( ) (� q q v v t q t q q tsuch that and0 0 1� �) � q1 let us remind the reader that � the integral of �� over the curve � class gives the value of the classical action sclass � the existence of the “umbilical“ string � that connects � and � class is determined by the topological (homological) properties of the extended tangent bundle � define vol� as the “number“ of surfaces in umb class�� ) containing � and � class as the subboundaries, assuming that all elements of � are homotopically equivalent, vol� becomes �-independent motivated by the trick (11) and assuming no topological obstructions on the side of tm � � we can slightly rearrange the transition amplitude (10) as follows: # $ a( , ; , ) exp exp q t q t i s i 0 0 1 1 " � � � � �! � � class vol vol � � � � � � � class !! � � � � � � � � � � � � � � � � � � � � (12) # $� � " � � � � � �� � � � �� ! 1 vol vol class � � � � � exp exp i s i � � � � � � ! # $ " � � � � � �� � � � �� !! exp exp , i s i � � class � � � d umb �� (13) here in the last line we have included the numerical prefactor vol� �1 to normalization and, because the integrand is only boundary dependent, the path integration was extended to the surface integration over the umbilical class umb umb class� ( )� . from formulas (12) and (13) we get the transition probability amplitude in the product form of the classic phase and quantum corrections. 
in general, the role of d�� is played by the distinguished two-form � and therefore we can presume to express the contribution of the quantum corrections in any case (including dissipativity) as follows: # $q.c umb . exp" � �� � � � �� !! �� � � i � . (14) if we also have a suitable candidate for the classical phase, we will be able to write: # $ a( , ; , ) . . exp exp q t q t i s i 0 0 1 1 " � � � � � � c.p q.c class � � � �� � � !! � �� � � � ��umb . (15) in my two previous papers [5, 6] i proposed for c.p. the following procedure: first find the classical solution �class of the problem (6), then split the forces into the potential-generated and the non-potential-generated parts, i.e. � � � �i q v ii it � � � � � � � �� � d d rest( ) , at third introduce subsidiary-like lagrangian � �� using the potential � and express the c.p. term in the form: c.p. i s i � � � � � � � � � � � � � � �!exp exp� �class class � � � � � . (16) this procedure, however, seems not to be absolutely correct. lack of the classical limit becomes more-or-less apparent here (see the results in the next section). when varying of the above term we do not get back the initial differential system (6). the classical limit, or equivalently, the classical time evolution of the expectation values of the corresponding physical operators (ehrenfest theorem) is achieved only in the regime where � i rest can be treated as small perturbations to potential-generated forces. thus, (15) with the c.p. term of the form (16) can serve as a type of effective perturbation theory when the perturbations are not potential-generated. 5 an example: quantum mechanics with friction let us focus on the perturbation-like quantization of the dynamics of a unit mass particle moving in m x� �[ ], which is driven by the conservative force � �� � d d x x( ) along with the friction � rest � �� v. the extended tangent bundle t x v t� �[ , ] [ ]� corresponds to an ordinary three-dimensional cartesian space. the distinguished two-form takes the simple form � � � � � � � � � � � � � � � � v x t v x v x td d d d d 1 2 2 �( ) . our aim is to evaluate the transition amplitude as a function of the initial and final events. suppose we have chosen a solution � class class class class( ) ( ( ), ( ) � ( ), )� � �x v x t of the newton-lagrange equation of motion: �� ( ) �x x x� �� � , which respect the initial and final conditions: x t qclass ( )0 0� and x t qclass ( )1 1� . direct application of formula (15) then leads to the following expression: a( , ; , ) exp exp q t q t i s i v x 0 0 1 1 1 " � � � � �! � � class umb [ ]�� d 2 2v x t v x t� � � � � � � � � � � � �� � � � �� � �� !! �( ) d d d� � �� � � � �� . (17) 64 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 47 no. 2–3/2007 © czech technical university publishing house http://ctn.cvut.cz/ap/ 65 acta polytechnica vol. 47 no. 2–3/2007 fig. 4: schematic picture of the nodal grid, points marked with empty circles are constrained by (18). to evaluate the world-sheet functional integral entering the above formula, let us introduce a grid in the underlying parametric space t t0 1 20 1, , [ , ]� % � : � � �( , ) : ( , ) [ , ];a b t a b a nodes time index runs fro � � �0 2� � � �m to and distance index runs from to0 0k b l we get rectangular tiles, each of which encloses the infinitesimal area � � � � � �t t k l 1 0 1 0. at the end the numbers k and l will be sent to infinity and the volume � � � d d . 
after this splitting has been performed, any “umbilical“ string � �: [ , ] [ , , ], ( , ( , ( , , ( , (� �2 3 � �� �x v t x v t� , � can be discretized by evaluating its coordinate functions at the nodes of the considered grid � �( , )a b nodes , i.e. � � � �� � �: ( , ) ( , ) ( , , )( , ) ( , )a b a b x v t aa b a bnodes � � �0 . here, to keep the ensemble � ( , ) [ , , ]a b x v t% �3 within the considered “umbilical“ class umb umb( class� � ), we must impose: & � � � � �a k x x t a x x t aa a0 0 0 0 0, , ( ), (( , ) ( , )� class class� � � � � � � � ) ( , ) ( ) , , , (( , ) ( , ) � � � & � � � � � � � 0 1 0 0 1 class b l x q x qb k b� � � � � % � � �� t t0 1, ) ( , ) “ ”and d branes� (18) therefore, formally, the functional integral over all possible string configurations is a formal limit of the ordinary integrals taking them over all unconstrained variables in the ensemble � �� ( , )a b , i.e.: [ ] d d d d umb ��! !� �' �' �' ' : lim ( , ) ( , ) ( , ) ( , k l a b a b b kx v v v� 0 b b l a k ) �� � �' ' ((! � �� � � � ��11 1 when step-wisely discretizing the integrals in the exponent of (17), taking into account the constraints (18), we get for the bulk term (everything is done with respect to the chosen orientation of the “umbilical“ world-sheet �): v x t v x t t k l d d d d� � � � � � �! !! �' � � � ( , ) ( , ) lim 0 1 0 1 �' � � � � � � � � � � � )) � � � v x x a b a b a b b l a k ( , ) ( , ) ( , )1 0 1 0 1 � �lim ( , ) ( , ) ( , ) ( , ) ( , ) k l a b a b a b a a a v x x v x �' �' � � � �� � 1 0 1 0 � � k b l a k t v x � � � � � ))) � �� � � � �� � 1 1 1 1 1 d class class( ) ( ) 0 1t ! and similarly for the boundary term: v x v x t v x v xd d d� � � � � � � � � � � � � � � � �! 1 2 1 2 2 2 � �( ) ( ) �� d d dt v x v x t � � � � � � � � � � � � � � �! �� � � �� � 1 21 2 �( ) � � � � 0 1 21 2 ! � � � � �' �lim (( , ) ( , ) ( , ) ( , ) ( k a l a l a l a l av x x v x � , )) ( ( )) ( (l a k v x � � � � � � � � � � � � ) 0 1 21 2 d class class � )) . � � � �! t t 0 1 5.1 free particle with damping putting together all fragments that enter formula (19) and taking into account the required normalization conditions, we arrive at the following probability amplitude: a( , ; , ) exp ( ) ,q t q t i i q q0 0 1 1 1 0 21 2 2 � � � � � � � � � � � � � where 2 2 1 0� � tanh ( ) .t t� � � � � (20) a short inspection of (20) discloses that if � � 0, then � � �( )t t1 0 and the above amplitude a( , ; , )q t q t0 0 1 1 coincides with the ordinary quantum propagator for a free particle. let us perform an analysis of the time evolution in terms of the transition probability amplitude (20) from the point of view of quantum mechanics, the best fit of a unit mass particle with the classical initial condition (q v v t0 0 0 00 0� � �, , ) is the gaussian wave-packet �( ) expx x i x v" � � � �� � � � �� 2 2 02� � with some initial width �. at a later time t, the system under consideration will be characterized by the convoluted wave-packet distribution � �( , ) ( ) ( , ; , )x t q q q x t" �' ' ! d a 0 . it preserves its gaussian shape, and its main characteristics, the mean value of the position x and the actual width of the wave-packet �2, vary with time according to x v� 0 � and � 2 2 2 2 2� �� � � � . the velocity of the center of the wave-packet d d e e e t x v v t t t� � " � � �4 1 0 2 0 � � � ( ) , i.e. it decreases for t�1 exponentially, as one would predict on classical intuition. 
5.2 damped harmonic oscillator the probability amplitude for a damped harmonic oscillator with unit mass requires a solution of newton’s equation �� �x x x� � �� �2 : � �x a bi iclass e e e( ) � � � � �2 � � , where the new frequency � � �� �2 2 4 . when substituting the general �(x) in (19) by the oscillator potential 1 2 2 2w x and doing simple algebraic manipulations, we arrive at: 66 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 47 no. 2–3/2007 putting everything together, integrating over all variables apart subensemble� �x a l( , ) and returning back to the continuum limit, we get the following expression for the transition amplitude: � �a( , ; , ) expq t q t i v x t t 0 0 1 1 0 1 " � � � � � � � � !� d class class � [ ( )]exp � ( )�x i x x x v t t � � d class 1 2 2 0 1 � � � � � � � � � � ! � � � � � ! . (19) the phase factor in front of (19) comes from the definition of the classical action sclass � and from the world-sheet functional integration. the second term is the standard feynman path integral, which is taken over the histories { ( ( ): ( , ), )} � x x t� � �1 in the extended configuration space �[x]×�[t]. however, in comparison with the standard formula, a new term appears here. it is an external source generated by the classical velocity vclass . let us remind the reader how to treat ugly infinite constants emerging in the functional integration. if the entering infinities are functionally independent of the coordinates of space-time events, then one can easily neglect them. the only important term inside the functional integral is the phase factor, which depends on the coordinates of ( , )q t0 0 and ( , )q t1 1 , i.e. we need to seize the following quantity: a( , ; , ) exp ( , ; , )q t q t i q t q t0 0 1 1 0 0 1 1" � � � �� phase , anything else is just an inherited rudiment. the proper normalization of the amplitude a( , ; , )q t q t0 0 1 1 is dictated by its physical meaning. the square of its absolute value answers the question about the probability density of observing a particle in a sufficiently small neighborhood of the configuration ( , )q t1 1 , when before it was observed in a neighborhood of the space-time position ( , )q t0 0 . this implies the desired normalization conditions (since we are dealing with the space-time continuum, the normalization to a �-function should be employed): � �t t q t q t q q t1 0 0 0 1 1 1 0 0� * � � �a( , ; , ) ( )� at time the system occ� �upies definite position d q q q t q t q t 0 1 0 0 1 1 0 0a a *( , ; , ) ( , ;� q t q q1 1 0 0, ) ( ) �' ' ! � � � �� conservation of the total probabi� �lity having everything at hand, let us compute the normalized probability amplitude with the presence of friction in the cases when �( )x � 0 (free particle) and �( )x x� 1 2 2� (linear harmonic oscillator). � � a( , ; , ) ( ) exp sin ( ) ( ) q t q t t t i t t q q 0 0 1 1 1 0 1 0 1 2 0 2 2 � � � � � � � � �# � �$ cos ( ) ( ) ( ) , � t t q q t t 1 0 1 0 1 02 2 � � � � � � � ch � (21) where the normalization factor � � � � ( ) ( ) sin ( ) .t t t t i t t1 0 2 1 0 1 0 � � � � � �� � ch � � (22) it is clear that � taking the limit � � 0; the above amplitude reproduces the free particle result (20), � in the limit � � 0; the schrödinger propagator for a harmonic oscillator with frequency � is recovered. 
acknowledgment many thanks (in alphabetical order) go to vladimír balek, pavel bóna, marián fecko, tamás fülöp, peter prešnajder, artur sergyeyev and pavol ševera for their interest, criticism, fruitful discussions and many useful comments. this research was supported in part by comenius university grant uk/359/2006, vega grant 1/3042/06 and esf project jpd3 ba-2005/1-034. references [1] dyson, f. j.: feynman’s proof of the maxwell equations, am. j. phys. vol. 58 (1990), p. 209–211. [2] henneaux, m.: equations of motion, commutation relations and ambiguities in the lagrangian formalism, annals phys. vol. 140 (1982), p. 45–64. [3] sarlet, w.: the helmholtz condition revisited. a new approach to the inverse problem of lagrangian dynamics, j. phys. a: math. gen. vol. 15 (1982), p. 1503–1517. [4] henneaux, m.: on the inverse problem of the calculus of variations, j. phys. a: math. gen. vol. 15 (1982), p. l93–l96. [5] kochan, d.: how to quantize forces (?): an academic essay how the strings could enter classical mechanics, submitted to jmp, hep-th/0612115. [6] kochan, d.: direct quantization of equations of motion: from classical dynamics to transition amplitudes via strings, submitted to cmp, hep-th/0703073. [7] arnol’d, v. i.: mathematical methods of classical mechanics (2nd ed.), springer-verlag, 1989. [8] abraham, r., marsden, j. e.: foundation of mechanics, addison-wesley, 1978. [9] burke, w. l.: applied differential geometry, cambridge university press, 1985. [10] fecko, m.: differential geometry and lie groups for physicists, cambridge university press, 2006. [11] feynman, r. p., hibbs, a. r.: quantum mechanics and path integral, mcgraw-hill, 1965. [12] faddeev, l. d., slavnov, a. a.: gauge fields: an introduction to quantum theory (2nd ed.), addison-wesley, 1991. mgr. denis kochan, ph.d. phone: +421 02 602 95 460 e-mail: kochan@fmph.uniba.sk department of theoretical physics and didactics of physics comenius university of bratislava faculty of matematics physics and iinformatics mlynská dolina f2 842 48 bratislava, slovakia © czech technical university publishing house http://ctn.cvut.cz/ap/ 67 acta polytechnica vol. 47 no. 2–3/2007 acta polytechnica https://doi.org/10.14311/ap.2022.62.0090 acta polytechnica 62(1):90–99, 2022 © 2022 the author(s). licensed under a cc-by 4.0 licence published by the czech technical university in prague rational extension of many particle systems bhabani prasad mandal banaras hindu university, institute of science, physics department, lanka, 221005–varanasi, uttar pradesh india correspondence: bhabani.mandal@gmail.com abstract. in this talk, we briefly review the rational extension of many particle systems, and is based on a couple of our recent works. in the first model, the rational extension of the truncated calogerosutherland (tcs) model is discussed analytically. the spectrum is isospectral to the original system and the eigenfunctions are completely expressed in terms of exceptional orthogonal polynomials (eops). in the second model, we discuss the rational extension of a quasi exactly solvable (qes) n-particle calogero model with harmonic confining interaction. new long-range interaction to the rational calogero model is included to construct this qes many particle system using the technique of supersymmetric quantum mechanics (susyqm). under a specific condition, infinite number of bound states are obtained for this system, and corresponding bound state wave functions are written in terms of eops. 
keywords: exceptional orthogonal polynomials, rational extensions, many particle systems, susyqm. 1. introduction orthogonal polynomials play very useful and important roles in studying physics, particularly in electrostatics and in quantum mechanics. in quantum mechanics, only a few of the commonly occuring bound states problems, which have a wide range of applications and/or extensions, are solvable. such systems generally bring into physics a class of orthogonal polynomials.these classical orthogonal polynomials have many properties common, such as (i) each constitutes orthogonal polynomials of successive increasing degree starting from m = 0, (ii) each satisfy a second order homogeneous differential equations, (iii) they satisfy orthogonality over a certain interval and with a certain non-negative weight function, etc. in 2009, new families of orthogonal polynomials (known as exceptional orthogonal polynomials (eop)) related to some of the old classical orthogonal polynomials were discovered [1–3]. unlike the usual classical orthogonal polynomials, these eops start with degree m = 1 or higher integer values and still form a complete orthonormal set with respect to a positive definite inner product defined over a compact interval. two of the well known classical orthogonal polynomials, namely laguerre orthogonal polynomials and jacobi orthogonal polynomials, have been extended to eops category. xm laguerre (jocobi) eop means the complete set of laguerre (jacobi) orthogonal polynomials with degree ≥ m. m is positive integer and can have values of 1, 2, 3, . . . attempts were made to also extend the classical hermite polynomials [4]. soon after this remarkable discovery, the connection of eops with the translationally shape invariant potential were established [5–9]. the list of exactly solvable quantum mechanical systems is enlarged and the wave functions for the newly obtained exactly solvable systems are written in terms of eops. such systems are known as rational extension of the original systems. the study for the exactly solvable potentials has been boosted greatly due to this discovery of eops over the past decade [10–37]. there are several commonly used approaches to build the rationally extended models, such as susyqm approach [38, 39], point canonical transformation approach [40, 41], darboux-crum transformation approach [42, 43], group theoretical approach [44], etc. these approaches have been used to study different problems in this field leading to a discovery of a large number of new exactly solvable systems, which are isospectral to the original system and the eigenfunctions are written in terms of eops. further, quasi-exactly solvable (qes) systems [45–49] and conditionally exactly solvable (ces) systems [50, 51] attracted attention in literature due to the lack of many exactly solvable systems. several works have been devoted to the rational extension of these qes/ ces systems [22, 24, 37]. nowadays, the parity time reversal (pt) symmetric non-hermitian systems [52–62] are among the exciting frontier research areas. rational extensions have also been carried out for non-hermitian systems [6, 19, 29–32]. even though most of the rational extensions are for the one dimensional and/or one particle exactly solvable systems, the research in this field has also been extended to many particle systems [24, 25, 27]. we have done several works on rational extensions for many particle systems. 
in one of the works, the well known calogero-wolfes type 3-body problem on a line was extended rationally to show that exactly solvable wave functions are written in terms of xm laguerre and xm jacobi eops [26]. however, this article is based on two of our earlier works on rational extension of many particle systems [24, 25], which 90 https://doi.org/10.14311/ap.2022.62.0090 https://creativecommons.org/licenses/by/4.0/ https://www.cvut.cz/en vol. 62 no. 1/2022 rational extension of many particle systems were central to the talk presented during the aamp meeting. in the first work [25], we discuss the rational extension of the truncated calogero-sutherland model using a pct approach. we indicate how to obtain rationally extended solutions, which are isospectral to the original system in terms of xm laguerre eops. in the second model [24], we discuss the rational extension of a qes n-particle calogero model with a harmonic confining interaction. new long-range interactions to the rational calogero model are included to construct this qes many particle system using susyqm. the wavefunctions are expressed, again, in terms of exceptional orthogonal laguerre polynomials. now, we present the organisation of the article. in the next section, we present the tcs model and its solutions in brief to set the things for the section 3, where we consider the rational extension of the tcs model. in section 4, the qes solutions for the rationally extended calogero type many particle system are presented. section 5 is reserved for conclusions. 2. tcs model in his work, jain-khare (jk) [63] exactly solved some variant of calogero-sutherland model (csm) on the full line by taking only the nearest and next-to-nearest neighbor interactions through 2-body and 3-body interactions. later, pittman et al. [64] generalized this model by considering an n-body problem on a line with harmonic confinement with tunable inverse square as well as the three-body interaction extends over a finite number of neighbors and were able to solve it exactly. this model is known as truncated calogero-sutherland model (tcs). n-body tcs model [64], where particles are interacting through 2-body and 3-body potentials, is given by h = n∑ i=1 [ − 1 2 ∂2 ∂x2i + 1 2 ω2x2i ] + ∑ i 0 for any l < l (31) 103 gregory natanson acta polytechnica must be also even which implies that the set of seed solutions ±,m(∆ ′ l→1) is composed of l segments of even lengths [11, 19] or in other words is formed by ‘juxtaposed’ [36–38] pairs of seed solutions ±,m′, ±,m′ + 1. similarly if the set of seed solutions, ±,m(∆ ′ )l→1 is formed by ‘juxtaposed’ pairs of seed solutions ±,m, ±,m+ 1 then the conjugated set is formed by seed solutions ∓,m′ with only even gap lengths, again starting from an even number. we refer the reader to subsection 3.4 below for a scrupulous analysis of this issue in connection with juxtaposed pairs of eigenfunctions of the schrödinger equation with the morse potential in the bref representation [19]. 3. quantization of rationally deformed morse potentials by wronskian transforms of r-bessel polynomials 3.1. schrödinger equation with morse potential in bessel form in this paper we focus solely on the tfi csle{ d2 dy2 + i0[y; a] + ε∞ρ⋄[y] } ∞ φ[y; a; ε] = 0 (32) with the refpf i0[y; a] = 2ay−3 − y−4 + 1/4y−2 (33) and the density function ∞ρ⋄[y] ≡ ∞σ−1[y] = y−2 (34) one can directly verify that csle (32) has a pair of ’basic’ solutions ∞ϕ±,0(y; a) = y1±ae±1/y (y > 0) (35) at the energies ∞ε±,0(a) = −(a ± 1/2)2. 
(36) examination of solutions (35) shows that they obey the following symmetry relations ∞ϕ±,0[y; a + k] = y±k ∞ϕ±,0[y; a] (37) for any integer k and ∞ϕ+,0(y; a)∞ϕ−,0(y; a) = y2 (38) whereas the function f±,0[ξ; ⇀ a,b] ≡ ϕ∓,0[ξ; ⇀ a,b]/ϕ±,0[ξ; ⇀ a,b] (39) takes form ∞f±,0[ξ; a] ≡ y∓2ae∓2/y. (40) we thus proved that the pair of basic solutions in question satisfy the tfi condition [23] ∞ϕ∓,0[y; a ± 1] = ∞ρ −1/2 ⋄ [y]/∞ϕ±,0(y; a). (41) one can directly verify that ∞ε∓,0(a ± 1) = ∞ε±,0(a) (42) and thereby ∞e±1(a) ≡ ∞ε∓,0(a ± 1) − ∞ε±,0(a) = 0 (43) so the symmetry condition [23] e∓1(a ± 1) = −e±1(a). (44) trivially holds. the gauge transformations ∞φ[y; a; ε] = εϕ±[y; a]∞f±[y; a; ε] (45) convert csle (32) to a pair of bochner-type eigenequations{ y2 d2 dy2 + ∞τ±[y; a] d dy + [ε −∞ ε±,0(a)] } ∞f±[y; a; ε] = 0, (46) with ∞τ ±[y; a] = 2(1 ± a)y ∓ 2. (47) 104 vol. 62 no. 1/2022 quantization of rationally deformed morse potentials. . . we define generalized bessel polynomials as y (α,β)n (y) ≡ y (α) n (y/β), (48) where the polynomial y (α)n (x) is given by (2) in [4] and thereby coincides with polynomial (9.13.1) in [6] y (α)n (x) ≡ yn(x; α). (49) note that chihara’s relation (4.3) in [5] is apparently based on brafman’s definition [39] for the polynomial yn(x; α,β) such that yn(x; α + 2, 2) = yn(x; α). adding the second index to the conventional notation [4, 5] allows us to avoid uncertainties in the definition of the variable used to differentiate a polynomial in the reflected argument, keeping in mind that y (α)n (−y) ≡ y (α,−2) n (y). (50) eq. (37) for the bessel dps in [40] thus corresponds to the polynomials y (α−2,β)n (y) in our terms. (we prefer to preserve symbol ‘b’ for their orthogonal subset composed of r-bessel polynomials [12, 13].) it is also worth mentioning that alhaidari [1] introduced a slightly modified notation for generalized bessel polynomials: jan (1/2y) ≡ y (2a) n (y) = (2n + 2a)n(y/2) n 1 f1(−n; −2a − n; 2/y), (51) with the pochhammer symbol (a)n standing for the falling factorial. and indeed it would be possibly more convenient to use the parameter a as the polynomial index keeping in mind that the forward and backward shift relations change the polynomial index by 1. however we prefer to stick to the more conventional notation. the basic solution ∞ϕ±,0[y; a] is thus nothing but a constant solution of eigenequation (46) converted back by gauge transformation (45). similarly the reverse gauge transformation of each of the dpss composed of polynomials y (±2a,∓2)m (y) results in pairs of infinite sequences of q-rss of csle (32): ∞ϕ±,m[y; a] = ∞c±,m(a)∞ϕ±,0[y; a]y (±2a,∓2)m (y). (52) the multiplier lc±,m will be chosen below in such a way that q-rss (52) satisfy recurrence relations (15). the crucial advantage of expressing q-rss in terms of generalized bessel polynomials, instead of laguerre polynomials [7–11], is that the weight function ∞ϕ±,0[y; a] in the right-hand side of (52) does not depend on the polynomial degree – the direct consequence of the fact that the given tfi csle belongs to group a [18, 19, 23], in contrast with the conventional representation of eigenfunctions of the schrödinger equation with the morse potential in terms of classical laguerre polynomials [22]. according to the general theory of bochner-type eigenequations [41] differential equation (46) has a polynomial solution of degree m at ε = ∞ε±,m(a) = ∞ε±,0(a) − m[2(1 ± a) + m − 1], (53) which, coupled with (36), gives ∞ε±,m(a) = −(m + 1/2 ± a)2. 
(54) this brings us to the simplified version of the raising ladder relations [23] for the energies of q-rss (15): ∞ε±,m+1(a) = ∞ε±,m(a ± 1) (55) with e±1(a) ≡ 0 . to be historically accurate, it is worth mentioning that cotfas’ eq. (10) in [16] with the leading coefficient σ(s) = s2 does list al-salam’s [4] formula y (α)n (y) = n! (−y/2) nl(−α−2n−1)n (2/y) (56) for the generalized bessel polynomials in terms of laguerre polynomials in the reciprocal argument 2/y (though without mentioning the former polynomials by name). actually cotfas discusses only eigenfunctions of the corresponding sturm-liouville problem so the cited formula specifies r-bessel polynomials expressed in terms of classical laguerre polynomials in 2/y: b(a)n (y) ≡ y (−2a−1) n (y) = n! (−y/2) nl(2a−2n)n (2/y) for n < a, (57) with cotfas’ parameter α standing for 1 − 2a here. the remarkable feature of this finite subsequence of generalized bessel polynomials is that the polynomials in question are orthogonal on the positive semi-axis as prescribed by orthonormality relations (9.13.2) in [6]:∫ ∞ 0 ∞ρ⋄[y]∞ϕ2−,0[y; a + 1/2]b (a) n (y)b (a) ˜ n (y)dy ≡ ∫ ∞ 0 y−2a−1e−2/yb(a)n (y)b (a) ˜ n (y)dy = n! γ(2a + 1 − n) 2a − 2n − 1 δn ˜ n. (58) 105 gregory natanson acta polytechnica making use of (39) we can represent backward shift relation (9.13.8) in [6] as d dy î ∞f+,0[ ξ; a] y (−2a,2)m (y) ó = 2∞f+,0[ξ; a + 1]y (−2a−2,2) m+1 (y) (59) so the functions ∞f+,m[ξ; a] = ∞c−,m(a)∞f∞,0[ξ; a]y (−2a)m (y) (60) satisfy raising relation (17) provided we choose ∞c−,m+1(a) = 2∞c−,m(a − 1) ≡ 2m+1 (61) keeping in mind that ∞c−,0(a) ≡ 1. substituting (54) into (20) gives ∞e−,m−1(a − 1) = −m(m + 1 − 2a) (62) so recurrence relation (19) can be re-written as 2myẏ (−2a,2)m (y) = m(m + 1 − 2a)∞ϕ−,m−1[y; a − 1]/∞ϕ−,0[y; a]. (63) combining (52), (61), and (37) with k = 1 brings us to ’forward shift operator’ (9.13.6) in (6) ẏ (−2a,2)m (y) = 0.5m(m + 1 − 2a)y (2−2a,2) m−1 (y). (64) to formulate the sturm-liouville problem of our interest it is worthy to convert csle (32) to its ‘prime’ [42] form at ∞ using the gauge transformation ∞ ̸ψ [y; a; ε] = y−1/2∞φ[y; a; ε] (65) and then to solve the resultant rsle{ d dy y d dy − y−3 + 2ay−2 + εy−1 } ∞ ̸ψ [y; a; ε] = 0 (66) under the dirichlet boundary conditions (dbcs): lim y→0 ∞ ̸ψ [y; a; εn] = lim y→∞ ∞ ̸ψ [y; a; εn] = 0. (67) the main advantage of converting csle (32) to its prime form with respect to the regular singular point at infinity comes from our observation [42] that the characteristic exponents for this singular end have opposite signs and therefore the corresponding principal frobenius solution is unambiguously selected by the dbc. prime rsle (66) can be also re-written in the form of the ‘algebraic’ [42] schrödinger equation{ y d dy y d dy − y−2 + 2ay−1 + ε } ∞ ̸ψ [y; a; ε] = 0. (68) (as discussed in the following subsections this is the common remarkable feature of rcsles with density function (34) assuming that the singular point at infinity is regular.) reformulating the given spectral problem in such a way allows us to take advantage of powerful theorems proven in [43] for zeros of principal solutions of sles solved under the dbcs at singular ends. the eigenfunctions of rsle (66) thus take form ∞ ̸ψ−,n [y; a] = y −1/2 ∞ϕ−,n[y; a] = 2ny1/2−ae−1/yb(a−1/2)n (y) for n = 0, . . . ,n(a). (69) one can then directly verify that each eigenfunction obeys the dbc at both singular ends. 
since r-bessel polynomials (57) form an orthogonal sequence the eigenfunction ∞ ̸ψ−,n [y; a] must have exactly n nodes and therefore [43] the sequence of eigenfunctions (69) corresponds to ⌈a⌉ = n(a) + 1 lowest eigenvalues of rsle (66) with n(a) = ⌊a − 1/2⌋ ≡ ⌊a⌋. (70) note also that eigenfunctions (69) are orthogonal with the weight y−1 and that any solution normalizable with this weight must vanish at infinity. 106 vol. 62 no. 1/2022 quantization of rationally deformed morse potentials. . . the presented argumentation does not exclude existence of eigenfunctions with the number of nodes larger than n(a) − 1. to confirm that the problem in question is indeed exactly solvable one can simply take advantage of the conventional analysis of the schrödinger equation with the morse potential [22] in the l ref representation. the reader can argue that the problem must be exactly solvable since the morse potential is tsi. however the author [44] has an issue with this assertion. though the gendenshtein’s claim [34] concerning the exact solvability of shape-invariant potentials is most likely correct it has been never accurately proven to our knowledge. the catch is that gendenshtein’s arguments decreasing the translational parameter a one by one bring us to the sturm-liouville problem with | a |< 1/2 and then we still need to prove that the resultant sle has no discrete energy spectrum. the change of variable y(x) = ex converts bref csle (32) into the schrödinger equation with the morse potential ∞v [y(x); a], where ∞v [y; a] = −y2i0[y; a] + 1/4 (71) = −2ay−1 + y−2. (72) comparing (72) with (1) in [10] shows that ∞v [y(x); a + 1/2] = va,1(x) in quesne’s notation. according to the general theorem presented in [43] for singular sles solved under the dbcs any principal solution ∞ ̸ψ−,m [y; a] near the singular end point y = 0 has nodes at the positive semi-axis iff it lies above the ground energy level. examination of the inequality ∞ε−,m(a) < ∞ε−,0(a) (73) thus shows that the q-rs ∞ ̸ψ−,m [y; a] with m ̸= 0 preserves its sign on the positive semi-axis iff m > 2a − 1 = 2a (74) (cf.(12) in [10]). it will be proven in next subsection that one can use any combination of admissible q-rss ∞ ̸ψ−,m [y; a] as seed functions to construct an exactly solvable rdct of the bref csle. according to (9.13.1) in [6] y (−2a,+2)m (y) = 2 −m(2m − 2a)mŷ (−2a,+2)m (y) (75) where, in following [5], we use hut to indicate that the polynomial in question is written in its monic form. it is essential that the multiplier (2m − 2a)m = m−1∏ l=0 (2m − 2a − l) = m∏ l′=1 (m − 2a + l′) (76) necessarily differs from 0 if either 2m − 2a < −1 (r-bessel polynomials) or m = m > 2a − 1 (generalized bessel polynomials with no positive zeros) so the polynomial degree is equal to m in both cases of our primary interest. 3.2. rdct s of principal solutions near singular end points using an arbitrary set mp = m1, . . . ,mp of seed functions ∞ϕ±,mk [y; a] of the same type (0 < mk < mk+1 for k = 1, . . . ,p − 1) we can represent the corresponding rdct of bref csle (32) as{ d2 dy2 + ∞i0[y; a | ± ...mp] + εy−2 } ∞φ[y; a; ε | ± ...mp] = 0, (77) where ∞i 0[y; a | ...mp] = ∞i0[y; a] + 2 y d dy (y ld∞w[y; a | ± ...mp]) (78) with ∞w[y; a | ± ...m1] ≡ ∞ϕ±,m1 [y; a], (79) ∞w[y; a | ± ...mp] ≡ w { ∞ϕ±,m1 [y; a], . . . , ∞ϕ±,mp [y; a] } for p > 1, (80) and the symbolic expression ld standing for the logarithmic derivative. 
when deriving (78) we also took into account that the so-called [42] ‘universal correction’ ∆i {ρ(y)} ≡ 0.5 » ρ(y) d dy ld ρ(y)√ ρ(y) (81) 107 gregory natanson acta polytechnica in schulze-halberg’s [45] generic formula for zero-energy free term of the transformed csle vanishes in the case of our current interest: ρ(y) = y−2. the common remarkable feature of wronskians (80) for tfi csles from group a (originally noticed by odake and sasaki [19] in their scrupulous study on rdct s of the corresponding tsi potentials) is that each can be represented as the weighted polynomial wronskian ∞w[y; a | ± ...mp] = ∞ϕp±,0[y; a]∞wnm p [y; a | ± ...mp], (82) where the wronskian ∞wnm p [y; a | ± ...mp] ≡ w { y (±2a,∓2)m1 (y), . . . ,y (±2a,∓2) mp (y) } (83) is a polynomial of degree n m p =| mp | −0.5p(p − 1) (84) (see (61) in [19]). when it seems appropriate we will drop the index specifying the degree of polynomial wronskians in question. substituting (82) into (78), coupled with (33) and (35), one finds ∞i 0[ y; a | ± ...mp] = 2(a ± p)y−3 − y−4 + 1/4y−2 + 2 y d dy å y ld∞w[y; a | ± ...mp] ã . (85) each rcsle under consideration can be alternatively obtained via sequential rdts with the ffs ∞φ±,m ˜ p [y; a | ± ...m ˜ p−1] = yp−1 ∞w[y; a | ± ...m ˜ p] ∞w[y; a | ± ...m ˜ p−1] ( ˜ p = 1, . . . ,p) (86) so refpfs (85) can be determined via the following sequence of recurrence relations ∞i 0[y; a | ± ...mp] = ∞i0[y; a | ± ...mp−1] + 2 y d dy å y ld∞φ±,mp [y; a | ± ...mp−1] ã (87) (a natural extension of the renown crum formulas [21] to the csles). for an arbitrary choice of the partition mp refpf (85) generally has poles on the positive semi-axis and therefore rcsle (77) cannot be quantized analytically. so let us choose a set m ± p = m ± 1 , . . . , m ± p of seed solutions of the sane type, ∞ϕ±,mk [y; a] (0 < mk = m ± k < mk+1 = m ± k+1 for k = 1, . . . ,p − 1), in such a way that the seed function ∞ϕ±,m1 [y; a] and all wronskians ∞w[y; a | ± ...m ± p ] for ˜ p = 2, . . . ,p preserve their sign on the positive semi-axis. in particular odake and sasaki [19] and nearly the same time gomez-ullate et al [11] constructed the subnet of rationally deformed morse potentials ∞v [y; a | + ...m + p ] = ∞v [y; a] + y 2 { ∞i 0[y; a] − ∞i0[y; a | + ...m + p ] } (88) using seed solutions infinite at both quantization ends. in next subsection we will introduce another subnet of rationally deformed morse potentials ∞v [y; a | − ...m _ p ] = ∞v [y; a] + y 2 { ∞i 0[y; a] − ∞i0[y; a | − ...m _ p ] } (89) constructed by means of ffs vanishing at the origin. the subnet starts from the potential ∞v [y; a | − ...m] with a positive integer m > 2a − 1 – potential function (16) in [10] with a = a − 1/2,b = 1. substituting (82) into (86) and also making use of (37) with k = p, shows that rcsle (77) has an infinite set of q-rss ∞φ±,m[y; a | ± ...mp] = ∞ϕ±,0[y; a ± p] ∞w[y; a | ± ...mp,m] ∞w[y; a | ± ...mp] . (90) apparently q-rs (90) with the label ‘−’ represents the principal solution approaching 0 as yδ−(m p)e−1/y in the limit y → +0. on other hand q-rs (90) labelled by ‘+’ infinitely grows as yδ+(m p)e1/y in this limit. in both cases ld∞φ±,m[y; a | ± ...mp] ≈ ∓y−2 (91) 108 vol. 62 no. 1/2022 quantization of rationally deformed morse potentials. . . and consequently ld ∗φ±,m[y; a | ± ...mp] ≡ ld y − ld∞φ±,m[y; a | ± ...mp] ≈ ±y−2 (92) for 0 < y << 1, where we dropped subscript ∞ in the notation of the ff for the reverse rdt: ∗φ±,m[y; a | ± ...mp] ≡ y/∞φ±,m[y; a | ± ...mp] . 
(93) note that the last summand in sum (85) has a simple pole at y = 0 so an arbitrary principal solution of rcsle (77) near its irregular singular point at y = 0 can be approximated as ∞φ0[y; a; ε | ± ...mp] ∝ y∆±(a;m p)e−1/y for y << 1, (94) where ∆±(a; mp) stands for a finite power exponent which particular value is non-essential for our discussion. examination of the quasi-rational function ∞φ0[y; a; ε | ± ...mp+1] = y w { ∞φ±,mp+1 [y; a | ± ...mp] , ∞φ0[y; a; ε | ± ...mp] } ∞φ±,mp+1 [y; a | ± ...mp] = y ∞φ̇0[y; a; ε | ± ...mp] − y ld∞φ±,mp+1 [y; a; | ...mp] ∞φ0[y; a; ε | ± ...mp] (95) representing the rdt of the principal solution of rcsle (77) near its irregular singular point at y = 0 confirms that it is a principal solution of the transformed rcsle near the singular point in question. vice versa the quasi-rational function y w { ∗φ±,mp [y; a | ± ...mp] , ∞φ0[y; a; ε | ± ...mp+1] } ∗φ±,mp+1 [y; a | ± ...mp] = y ∞φ̇0[y; a; ε | ± ...mp+1] − y ld∗φ±,mp+1 [y; a | ± ...mp+1] ∞φ0[y; a; ε | ± ...mp+1] (96) representing the reverse rdt of the principal solution (95) is the principal solution of rcsle (77) near its irregular singular point at y = 0. to study a behavior of frobenius solutions near a regular singular point of rcsle (77) at infinity it is convenient to convert this equation to its ‘prime’ form [42] using the gauge transformation ∞ ̸ψ [y; a; ε ± ...mp] = y−1/2∞φ[y; a; ε | ...mp] (97) which gives{ d dy y d dy − y−3 + 2(a ± 1)y−2 + 2 d dy (y ld∞w[y; a | ± ...mp] ) + εy−1 } ∞ ̸ψ [y; a; ε | ± ...mp] = 0 (98) as explained above the main advantage of this representation comes from the fact that the characteristic exponents of two frobenius solutions of rsle (98) near this singular end have opposite signs, with the principal frobenius solution decaying as y− √ −ε when y → ∞. again rsle (98) is nothing but the ‘algebraic’ [42] form of the schrödinger equation with the rationally deformed morse potentials (88) or (89) accordingly – the common feature of rcsles with density function (34) as far as the given sle has a regular singular point at infinity. apparently ∞ ̸ψ [y; a; ε | ± ...mp+1] ≡ y−1/2∞φ[y; a; ε | ± ...mp+1] = y w { ∞ ̸ψ [y; a; ε±,mp+1 (a) | ± ...mp], ∞ ̸ψ [y; a; ε | ± ...mp] } ∞ ̸ψ [y; a; ε±,mp+1 (a) | ± ...mp] (99) here we are only interested in cases when the ff appearing in the denominator of pf (99) is the non-principal frobenius solution of rsle (98) near the singular point at infinity so ∞ ̸ψ [y; a; ε | ± ...mp+1] ≈ − [ √ −ε + » −ε±,mp+1 (a)]y − √ −ε for y >> 1 (100) 109 gregory natanson acta polytechnica if ∞ ̸ψ [y; a; ε | ± ...mp] is an arbitrary principal frobenius solution of this rsle near the singular end in question. we thus proved that the rdt of any principal frobenius solution for each of the singular end points is itself the principal frobenius solution of the transformed rsle near the singular point in question. suppose that rsle (98) with mp replaced for m ± p+1 has an additional eigenfunction ∞ ̸ψ [y; a; ε∗(a) | ± ...m ± p+1] at the energy ε∗(a) < 0. applying the reverse rdt with the ff y1/2/∞φ[y; a; ε±,mp+1 (a) | ± ...m ± p ] = ∞ ̸ψ −1 [y; a; ε±,mp+1 (a) | ± ...m ± p ] (101) to the new eigenfunction we would come to the solution which obeys the dbc at infinity: w { ∞ ̸ψ−1 [y; a; ε±,mp+1 (a) | ± ...m ± p ], ∞ ̸ψ [y; a; ε∗(a) | ± ...m ± p+1] } ∞ ̸ψ−1 [y; a; ε±,mp+1 (a) | ± ...m ± p ] ≈ [ » −ε±,mp+1 (a) − » −ε∗(a)]y− √ −ε±,mp+1 (a) for y >> 1 (102) assuming that ε∗(a) ̸= ε±,mp+1 (a). 
on other hand, the quasi-rational function on the left is related to principal solution (96) via gauge transformation (97) with ε = ε∗(a) and therefore the solution in question would obey both dbcs which contradicts the assumption that ε∗(a) is a new eigenvalue. the only exception corresponds to the case ε∗(a) = ε±,mp+1 (a), when the rdt with ff (100) insert the new bound energy state below the ground energy level of rationally deformed morse potential (88) or (89) accordingly. 3.3. isospectral family of rationally deformed morse potentials with a regular spectrum let us prove that any set m _ p of seed solutions ϕ−,mk [y; a] (0 < m1 < mk < mk+1 ≤ p) is admissible if the generalized bessel polynomial y (−2a)mk (y) does not have positive zeros so each seed function ∞ϕ−,mk [y; a] preserves its sign on the positive semi-axis. according to (73), this is possible if m > 2a − 1 for any m ∈ m _ p . in other words we have to prove that polynomial wronskian (83) does not have positive zeros if this is true for each polynomial y (−2a)mk (y). this assertion is obviously trivial for ˜ p = 1. it also directly follows from the arguments presented in previous subsection that the rdt of bref csle (32) with the ff ϕ−,m1 [y; a] preserves the discrete energy spectrum so the prime rsle{ d dy y d dy + y∞i0[y; a | − ...m1] + (ε + 1/2)y−1 } ∞ ̸ψ [y; a; ε | − ...m1] = 0 (103) solved under the dbcs lim y→0 ∞ ̸ψ [y; a; εn(a) | − ...m1] = lim y→∞ ∞ ̸ψ [y; a; εn(a) | − ...m1] = 0 (104) has exactly n(a) eigenfunctions ∞ ̸ψ [y; a; εn(a) | − ...m1] ≡ ∞ ̸ψ−,n [y; a | − ...m1] = y−1/2∞φ−,n[y; a | − ...m1] (105) at the energies ∞ε−,n(a) with n varying from 0 to n(a) − 1. making use of (90) with p = 1 and m = n, they can be also re-written in the quasi-rational form ∞ ̸ψ−,n [y; a | − ...m1] = ∞ ̸ψ−,0 [y; a − 1] ∞w[y; a | − ...m1,n] y (−2a) m1 (y) (106) keeping in mind that the pf in the right-hand side of the latter expression is proportional to yn−1 for y >> 1 one can immediately confirm that eigenfunctions (105) vanish in the limit y → ∞ for any n < a − 1/2. let us now use the mathematical induction to prove that the polynomial ∞w[y; a | − ...m _ ˜ p+1] does not have positive zeros if this assertion holds for the polynomial ∞w[y; a | − ...m _ ˜ p ]. again it is suitable to convert rcsle (77) to its prime form{ d dy y d dy + y∞i0[y; a | − ...m _ ˜ p ] + (ε + 1/2)y −1 } ∞ ̸ψ [y; a; ε | − ...m _ ˜ p ] = 0 (107) 110 vol. 62 no. 1/2022 quantization of rationally deformed morse potentials. . . solved under the dbcs lim y→0 ∞ ̸ψ [y; a; εn(a) | − ...m _ ˜ p ] = lim y→∞ ∞ ̸ψ [y; a; εn(a) | − ...m _ ˜ p ] = 0. (108) making use of (90) with p = ˜ p we can again re-write the eigenfunctions ∞ ̸ψ−,n [y; a | − ...m _ ˜ p ] ≡ ∞ ̸ψ [y; a; εn(a) | − ...m _ ˜ p ] = y −1/2 ∞φ−,n[y; a | − ...m _ ˜ p ] (109) in the quasi-rational form ∞ ̸ψ [y; a | − ...m _ ˜ p ] = ∞ ̸ψ−,0 [y; a − ˜ p] ∞w[y; a | − ...m _ ˜ p+1] ∞w[y; a | − ...m _ ˜ p ] (110) examination of q-rs (110) reveals that it vanishes at the origin and therefore represents a principal solution of prime sle (103) near its irregular singular point. since this solution lies below the lowest eigenvalue it must be nodeless [43] and therefore no wronskian ∞w[y; a | − ...m _ p ] has positive zeros. all the q-rss ∞ ̸ψ−,n [y; a | − ...m _ p ] = ∞ ̸ψ−,0 [y; a − p] ∞w[y; a | − ...m _ p ,n] ∞w[y; a | − ...m _ p ] (111) vanish at infinity for n < n(a) = ⌊a⌋ since the power exponent of the pf in the right-hand side of (111) is equal to n − p in the limit y → ∞. 
this confirms that the direchlet problem for sle (103) has exactly n(a) eigenfunctions defined via (111) with n < n(a). since these eigenfunctions must be orthogonal [43] with the weight y−1 the polynomial wronskians ∞w[y; a | − ...m _ p ,n] with n varying from 0 to n(a) − 1 are orthogonal with the positive weight ∞w [y; a | − ...m _ p ] = ∞ ̸ψ2−,0 [y; a − p] y ∞w2[y; a | − ...m _ p ] . (112) if the morse potential has at least 2 energy levels the sequence starts from a polynomial of degree | m _ p | −0.5 p(p + 1) ≥ 2p. (113) keeping in mind | m _ p |> (2a − 1)p + 0.5p(p + 1) > 2p + 0.5p(p + 1) (114) in this case. the finite eop sequence in question thus starts from a polynomial of at least second degree and therefore [46] does not obey the bochner theorem [47]. re-writing (85) with mp = m _ p as ∞i 0[y; a | − ...m _ p ] = ∞i 0[y; a − p] + 2 y d dy å y ld∞w[y; a | − ...m _ p ] ã (115) we can then explicitly express corresponding liouville potential (89) in terms of the admissible wronskian ∞w[y; a | − ...m _ p ] as follows ∞v [y; a | − ...m _ ˜ p ] = ∞v [y; a − p] − 2y d dy å y ld∞w[y; a | − ...m _ p ] ã (116) as mentioned in previous subsection this net of isospectral rational potentials starts from potential function (16) in [10] with a = a − 1/2 , b = 1, after the latter is expressed in terms of the variable y = ex. 111 gregory natanson acta polytechnica 3.4. subnet of rationally deformed morse potentials quantized via wronskians of r-bessel polynomials another family of solvable rdct s of csle (32) can be constructed using juxtaposed pairs of eigenfunctions ∞ϕ−,nk [y; a], ∞ϕ−,nk +1[y; a] (0 < nk < nk+1 − 1 < n(a) for k = 1, . . . ,j). the simplest double-step representative of this finite family of rationally deformed morse potentials with n1 = 1, j = 2 was constructed by bagrov and samsonov [38, 48] in the late nineties based on the conventional l ref representation of the schrödinger equation with the morse potential. the extensions of their works to an arbitrary number of juxtaposed pairs of eigenfunctions in both l ref and bref representations were performed more recently in [11] and [19] accordingly. for any tfi rcsle from group a one can by-pass an analysis of the pre-requisites for the krein-adler theorem [49, 50] by taking advantage of the fact that the wronskians of eigenfunctions are composed of weighted orthogonal polynomials with the common degree-independent weight and therefore the numbers of their positive zeros are controlled by the general conjectures proven in [51] for wronskians of positive definite orthogonal polynomials. in particular we conclude that any wronskian formed by juxtaposed pairs of r-bessel polynomials of non-zero degrees may not have positive zeros. let n2j be a set of r-bessel polynomials of degrees n2j = m(∆′l→1) = n1 : n1 + 2j1 − 1,n2j1+1 : n2j1+1 + 2j2 − 1, . . . ,n2j −2jl+1 : n2j (n1 > 0,n2j < n) (117) with even δ′l = 2jl (l = 1, . . . ,l). (118) examination of the q-rs functions ∞ ̸ψ−,n [y; a | − ...n2j ] = y−1/2∞φ−,n[y; a | − ...n2j ] = ∞ ̸ψ−,0 [y; a − 2j] ∞w[y; a | − ...n2j,n] ∞w[y; a | − ...n2j ] (n /∈ n2j ) (119) shows that they all represent principal solutions near the irregular singular point of the prime rsle{ d dy y d dy + y∞i0[y; a | − ...n2j ] + (ε + 1/2)y−1 } ∞ ̸ψ [y; a; ε | − ...n2j ] = 0 (120) assuming again that the latter equation is solved under dbcs lim y→0 ∞ ̸ψ [y; a; εn | − ...n2j ] = lim y→∞ ∞ ̸ψ [y; a; εn | − ...n2j ] = 0. 
(121) note that the pf in the right-hand side of (119) is proportional to yn−2j for y >> 1 so each solution with n /∈ n2j < n(a) represents an eigenfunction of rsle (120). again these eigenfunctions must be orthogonal with the weight y−1 and therefore n(a) − 2j wronskians ∞w[y; a | − ...n2j,n] with n /∈ n2j < n(a) form a polynomial set orthogonal with the positive weight ∞w[y; a | − ...n2j ] = ∞ ̸ψ2−,0 [y; a − 2j] y∞w2[y; a | − ...n2j ] (122) if sequence (117) starts from n1 = 1 then the finite eop sequence in question lacks the first-degree polynomial. otherwise it always starts from a polynomial of non-zero degree | n2j | −j(2j + 1) > (n1 − 1)(δ′1 − 1) ≥ 1. (123) in both cases the pre-requisites of the bochner theorem are invalid as expected [46]. the liouville potentials in question can be thus expressed in terms of the admissible wronskians ∞w[y; a | − ...n2j ] as follows ∞v [y; a | − ...n2j ] = ∞v [y; a − 2j] − 2y d dy (y ld∞w[y; a | − ...n2j ]). (124) we refer the reader to conjectures in [51] to verify that the number of zeros of each wronskian in the constructed orthogonal polynomial set changes exactly by 1 even if a jump in the polynomial degree is larger than 1. however 112 vol. 62 no. 1/2022 quantization of rationally deformed morse potentials. . . even if we take advantage of these elegant results we still need to prove that there are no additional eigenfunctions with a number of nodes larger than n(a) − 2j − 1. in contrast with the analysis presented in the previous section, this proof is complicated by the fact that the rdt at each odd step results in a non-solvable rsle with singularities on the positive semi-axis. luckily we deal with the tfi csle so its rdct using juxtaposed pairs of eigenfunctions can be alternatively obtained via sequential rdts with seed solutions from the second sequence +,m [11, 19]. namely, as already mentioned in the end of section 2 the conjugated partition m + |δ1→l| = m(∆1→l) (125) is formed by alternating even and odd integers starting from an even integer δ′1. the reverse is also true: if the partition m + p = m( pδ1→lp ; pδ′1→lp ) (126) is composed of alternating even and odd integers starting from an even integer δ′1 then each segment of the conjugated partition pn2jp = m( pδ ′ lp→1; pδlp→1) (127) must have an even length, with the largest element m+ |pδ′ 1→lp | =| p∆1→lp | −1 = m|pδlp →1| ∈ pn2jp, (128) where p∆1→lp ≡ pδ1→lp ; pδ′1→lp . making use of (37) one can verify that quasi-rational functions (30) can be decomposed as ∞χ∓n [y; a] = y1/2n (n −1)∓n δ [ξ]∞ϕn±,0[y; a (δ)] (129) and therefore the denominators of the fractions in equivalence relations (29) take form y−1/2|δl|(|δl|−1) l∏ l=1 ∞χ−δl [y; a (|∆′ l→1|−δ)] = yσl ∞ϕ |δl| −,0 [y; a] (130) y−1/2|δ ′ l|(|δ′ l|−1) l∏ l=1 ∞χ−δ′ l [y; a (|∆′ l→1|−δ′)] = yσl ∞ϕ |δ′ l| −,0 [y; a (|∆1→l|)] (131) accordingly, where | ∆1→l |=| ∆′l→1 | and σl = l∑ l=1 δ′l(δl + l∑ ˜ l=l+1 δ ˜ l) = l∑ l=1 δl(δ′l + l−1∑ ˜ l=1 δ′ ˜ l). (132) we thus come to the following equivalence theorem for the wronskians of generalized bessel polynomials ∞ŵ[y; a | + ...m(∆1→l)] = ∞ŵ[y; a(|∆1→l|) | − ...m(∆′l→1)]. (133) note that decomposition (129) holds for any tfi csle of group a provided that we replace y2 for the leading coefficient lσ[y] of the corresponding counter-parts of differential eigenequations (46). this brings us to the equivalence relations for polynomial wronskians discovered by odake and sasaki [19] in their pioneering analysis of tsi potentials from group a. 
if a > 1/2 then, according to (128), the largest element of the partition pn2jp is smaller than a+ | p∆l→lp | −1/2 and therefore the wronskian in the right-hand side of (133) with ∆′l→1 replaced for p∆′lp→1 is formed by juxtaposed pairs of r-bessel polynomials. this confirms that none of the polynomial wronskians ∞ŵ[y; a | + ...m + p ] has zeros on the positive semi-axis and therefore each partition m + p specifies an admissible sequence of seed solutions ∞ϕ+,mk [y; a] (mk ∈ m + p for k = 1, . . . ,p). based on the arguments presented in subsection 3.2 we thus assert that the rdts in question may insert only one bound energy level at the energy ∞ε+,mp+1 (a) which by definition lies below the ground energy level ∞ε+,mp (a) of the liouville potential ∞v [a | m + p ]. on other hand all the existent energy levels remain unchanged. as the simplest example we can cite the partition 1, 2, . . . , 2j = m(2j, 1) = m†(1, 2j) for 2j ≤ ⌊a⌋ (134) 113 gregory natanson acta polytechnica as a direct consequence of the equivalence theorem we find that ŷ (2a−2j −1,−2) 2j (y) = ∞ŵ[y; a | − ...1 : 2j] (2j ≤ ⌊a⌋), (135) where the wronskian on the right is formed by 2j sequential r-bessel polynomials of non-zero degrees smaller than a and therefore may not have positive zeros for a > −1/2 [51]. as initially proven in [7] and then illuminated in more details in [8] using the so-called ‘kienast-lawton-hahn’s theorem’ [52–54] the latter assertion holds for any positive j despite the fact that the seed functions ∞ ̸ψ−,m [y; a(2j +1)] have nodes on the positive semi-axis for a(2j +1) + 1/2 < m < 2a. (136) indeed, representing (56) as y (2a,−2)m (y) ≡ y (2a) m (−y/2) = m! (−y/2) ml(−2a−2m−1)m (−2/y) (137) shows that the absolute value of the negative m-dependent laguerre index αm = −2a − 2m − 1 < 0 (138) is larger than the polynomial degree and therefore the polynomial in question may not have zeros at negative values of its argument. 3.5. isospectral rational extensions of krein-adler susy partners of morse potential since any rdct of the morse potentials using pairs of juxtaposed eigenfunctions n2j keeps unchanged the ground-energy level a set of seed functions ∞ϕ+,m[y; a] is admissible iff all m ∈ n2j, m _ p , where m _ p is an admissible set of seed polynomials specified in subsection 3.3. we can then use the same arguments as in subsection 3.3 to prove that any liouville potential ∞v [y; a | − ...n2j, m _ p ] = ∞v [y; a − 2j − p] − 2y d dy å y ld∞w[y; a | − ...n2j, m _ p ] ã (139) has exactly the same discrete energy spectrum as rationally deformed morse potential (124) constructed by means of juxtaposed pairs of r-bessel polynomials of non-zero degrees. its eigenfunctions expressed in terms of the variable y = ex can be represented as ∞ ̸ψ−,n [y; a | − ...n2j, m _ p ] = ∞ ̸ψ−,0 [y; a − 2j − p] ∞w[y; a | − ...n2j,n, m _ p ] ∞w[y; a | − ...n2j, m _ p ] for n /∈ n2j < n(a) (140) keeping in mind that the corresponding prime rsle is nothing but the schrödinger equation re-written in its algebraic form. 4. conclusions the presented analysis illuminates the non-conventional approach [19] to the family of rationally deformed morse potentials using seed solutions expressed in terms of wronskians of generalized bessel polynomials in the variable y = ex. 
as a new achievement compared with odake and saski’s [19] study on rdct s of the morse potential (see also [11] where a similar analysis was performed within the conventional l ref framework) we constructed a new rdc net of isospectral potentials by expressing them in terms of the logarithmic derivative of wronskians of generalized bessel polynomials with no positive zeros. the constructed isospectral family of rationally deformed morse potentials represents a natural extension of the isospectral rdt s of the morse potential discovered by quesne [10]. an important element of our analysis often overlooked in the literature is the proof that the sequential rdts in question do not insert new bound energy states. the widespread argumentation in support of this (usually taken-for-granted) presumption is based on the speculation that the theorems of the regular sturm-liouville theory [55] are automatically applied to singular sles. we can refer the reader to the scrupolous analysis performed in [43] for sles solved under the dbcs as an illustration that this is by no means a trivial issue. to be able to prove the aforementioned assertion we converted the given rcsle to its prime form such that the characteristic exponents of frobenius solutions for the regular singular point at ∞ have opposite signs and therefore the principal frobenius solution near this singular end is unambiguously selected by the corresponding dbc. (in the particular case under consideration the prime rsle accidently coincides with the schrödinger equation re-written in the ‘algebraic’ [42] form but this is 114 vol. 62 no. 1/2022 quantization of rationally deformed morse potentials. . . not true in general.) re-formulating the given spectral problem in such a very specific way allowed us to take advantage of powerful theorems proven in [43] for zeros of principal solutions of sles solved under the dbcs at singular ends. we [42] also used this simplified version of the conventional spectral theory to prove that any rdt of a principal (non-principal) frobenius solution near the regular singular point at ∞ is itself a principal (non-principal) frobenius solution of the transformed rsle. this assertion plays a crucial role in our proof of the exact solvability of the constructed dc net of isospectral rational potentials. it is commonly presumed that the krein-adler theorem [49, 50] is applied to an arbitrary potential regardless its behavior near the singular end points. in [42] we examined this presumption more carefully for the dirichlet problems of our interest again taking advantage of the theorems proven in [43] for zeros of juxtaposed eigenfunctions. however one can by-pass this analysis for any tfi rsle from group a keeping in mind that the wronskians in questions are formed by orthogonal polynomials with degree-independent indexes and therefore the numbers of their positive zeros are controlled by the general conjectures proven in [51]. in particular this implies that any wronskian formed by juxtaposed pairs of r-bessel polynomials of non-zero degrees may not have positive zeros. acknowledgements i am grateful to a. d. alhaidari for bringing my attention to the alternative representation of eigenfunctions of the schrödinger equation with the morse potential in terms of r-bessel polynomials with degree-independent indexes. this alteration helped me to fully comprehend odake and sasaki’s suggestion to place the morse oscillator into group a of rational tsi potentials. references [1] a. d. alhaidari. 
exponentially confining potential well. theoretical and mathematical physics volume 206:84–96, 2021. https://doi.org/10.1134/s0040577921010050. [2] j. gibbons, a. p. veselov. on the rational monodromy-free potentials with sextic growth. journal of mathematical physics 50(1):013513, 2009. https://doi.org/10.1063/1.3001604. [3] h. l. krall, o. frink. a new class of orthogonal polynomials: the bessel polynomials. transactions of the american mathematical society 65:100–105, 1949. https://doi.org/10.1090/s0002-9947-1949-0028473-1. [4] w. a. al-salam. the bessel polynomials. duke mathematical journal 24(4):529–545, 1957. https://doi.org/10.1215/s0012-7094-57-02460-2. [5] t. s. chihara. an introduction to orthogonal polynomials. gordon and breach, new york, 1978. [6] r. koekoek, p. a. lesky, r. f. swarttouw. hypergeometric orthogonal polynomials and their q-analogues. springer, heidelberg, 2010. [7] d. gomez-ullate, n. kamran, r. milson. the darboux transformation and algebraic deformations of shape-invariant potentials. journal of physics a: mathematical and general 37(5):1789–1804, 2004. https://doi.org/10.1088/0305-4470/37/5/022. [8] y. grandati. solvable rational extensions of the morse and kepler-coulomb potentials. journal of mathematical physics 52(10):103505, 2011. https://doi.org/10.1063/1.3651222. [9] c.-l. ho. prepotential approach to solvable rational potentials and exceptional orthogonal polynomials. progress of theoretical physics 126(2):185–201, 2011. https://doi.org/10.1143/ptp.126.185. [10] c. quesne. revisiting (quasi-)exactly solvable rational extensions of the morse potential. international journal of modern physics a 27(13):1250073, 2012. https://doi.org/10.1142/s0217751x1250073x. [11] d. gomez-ullate, y. grandati, r. milson. extended krein-adler theorem for the translationally shape invariant potentials. journal of mathematical physics 55(4):043510, 2014. https://doi.org/10.1063/1.4871443. [12] v. i. romanovski. sur quelques classes nouvelles de polynomes orthogonaux. comptes rendus de l’académie des sciences 188:1023–1025, 1929. [13] p. a. lesky. einordnung der polynome von romanovski-bessel in das askey-tableau. zeitschrift für angewandte mathematik und mechanik 78(9):646–648, 1998. https://doi.org/10.1002/(sici)1521-4001(199809)78:9<646::aid-zamm646>3.0.co;2-w. [14] c. quesne. extending romanovski polynomials in quantum mechanics. journal of mathematical physics 54(12):122103, 2013. https://doi.org/10.1063/1.4835555. [15] n. cotfas. systems of orthogonal polynomials defined by hypergeometric type equations with application to quantum mechanics. central european journal of physics 2(3):456–466, 2004. https://doi.org/10.2478/bf02476425. [16] n. cotfas. shape-invariant hypergeometric type operators with application to quantum mechanics. central european journal of physics 4(3):318–330, 2006. https://doi.org/10.2478/s11534-006-0023-0. [17] m. a. jafarizadeh, h. fakhri. parasupersymmetry and shape invariance in differential equations of mathematical physics and quantum mechanics. annals of physics 262(2):260–276, 1998. https://doi.org/10.1006/aphy.1997.5745. 
115 https://doi.org/10.1134/s0040577921010050 https://doi.org/10.1063/1.3001604 https://doi.org/10.1090/s0002-9947-1949-0028473-1 https://doi.org/10.1215/s0012-7094-57-02460-2 https://doi.org/10.1088/0305-4470/37/5/022 https://doi.org/10.1063/1.3651222 https://doi.org/10.1143/ptp.126.185 https://doi.org/10.1142/s0217751x1250073x https://doi.org/10.1063/1.4871443 https://doi.org/10.1002/(sici)1521-4001(199809)78:9<646::aid-zamm646>3.0.co;2-w https://doi.org/10.1063/1.4835555 https://doi.org/10.2478/bf02476425 https://doi.org/10.2478/s11534-006-0023-0 https://doi.org/10.1006/aphy.1997.5745 gregory natanson acta polytechnica [18] s. odake, r. sasaki. extensions of solvable potentials with finitely many discrete eigenstates. journal of physics a: mathematical and theoretical 46(23):235205, 2013. https://doi.org/10.1088/1751-8113/46/23/235205. [19] s. odake. krein–adler transformations for shape-invariant potentials and pseudo virtual states. journal of physics a: mathematical and theoretical 46(24):245201, 2013. https://doi.org/10.1088/1751-8113/46/24/245201. [20] g. darboux. leçons sur la théorie générale des surfaces et les applications géométriques du calcul infinitésimal. gauthier-villars, paris, 1915. [21] m. m. crum. associated sturm-liouville systems. the quarterly journal of mathematics 6(1):121–127, 1955. https://doi.org/10.1093/qmath/6.1.121. [22] l. d. landau, e. m. lifshity. quantum mechanics (non-relativistic theory). 3rd ed. butterworth-heinemann, 1977. [23] g. natanson. equivalence relations for darboux-crum transforms of translationally form-invariant sturm-liouville equations, 2021. https://www.researchgate.net/publication/353131294. [24] d. gómez-ullate, y. grandati, r. milson. durfee rectangles and pseudo-wronskian equivalences for hermite polynomials. studies in applied mathematics 141(4):596–625, 2018. https://doi.org/10.1111/sapm.12225. [25] w. n. everitt, l. l. littlejohn. orthogonal polynomials and spectral theory: a survey. in c. brezinski, l. gori, a. ronveaux (eds.), orthogonal polynomials and their applications, vol. 9, pp. 21–55. 1991. imacs annals on computing and applied mathematics. [26] w. n. everitt, k. h. kwon, l. l. littlejohn, r. wellman. orthogonal polynomial solutions of linear ordinary differential equations. journal of computational and applied mathematics 133(1-2):85–109, 2001. https://doi.org/10.1016/s0377-0427(00)00636-1. [27] a. k. bose. a class of solvable potentials. nuovo cimento 32:679–688, 1964. https://doi.org/10.1007/bf02735890. [28] g. a. natanzon. study of the one-dimensional schrödinger equation generated from the hypergeometric equation. vestnik leningradskogo universiteta 10:22–28, 1971. english translation https://arxiv.org/ps_cache/physics/pdf/9907/9907032v1.pdf. [29] b. v. rudyak, b. n. zakhariev. new exactly solvable models for schrödinger equation. inverse problems 3(1):125–133, 1987. https://doi.org/10.1088/0266-5611/3/1/014. [30] l. d. fadeev, b. seckler. the inverse problem in the quantum theory of scattering. journal of mathematical physics 4(1):72–104, 1963. https://doi.org/10.1063/1.1703891. [31] w. a. schnizer, h. leeb. exactly solvable models for the schrödinger equation from generalized darboux transformations. journal of physics a: mathematical and general 26(19):5145–5156, 1993. https://doi.org/10.1088/0305-4470/26/19/041. [32] w. a. schnizer, h. leeb. generalized darboux transformations: classification of inverse scattering methods for the radial schrödinger equation. 
journal of physics a: mathematical and general 27(7):2605–2614, 1994. https://doi.org/10.1088/0305-4470/27/7/035. [33] d. gómez-ullate, y. grandati, r. milson. shape invariance and equivalence relations for pseudo-wronskians of laguerre and jacobi polynomials. journal of physics a: mathematical and theoretical 51(34):345201, 2018. https://doi.org/10.1088/1751-8121/aace4b. [34] l. e. gendenshtein. derivation of exact spectra of the schrödinger equation by means of supersymmetry. journal of experimental and theoretical physics letters 38:356–359, 1983. [35] l. e. gendenshtein, i. v. krive. supersymmetry in quantum mechanics. soviet physics uspekhi 28(8):645–666, 1985. https://doi.org/10.1070/pu1985v028n08abeh003882. [36] b. f. samsonov. on the equivalence of the integral and the differential exact solution generation methods for the one-dimensional schrodinger equation. journal of physics a: mathematical and general 28(23):6989–6998, 1995. https://doi.org/10.1088/0305-4470/28/23/036. [37] b. f. samsonov. new features in supersymmetry breakdown in quantum mechanics. modern physics letters a 11(19):1563–1567, 1996. https://doi.org/10.1142/s0217732396001557. [38] v. g. bagrov, b. f. samsonov. darboux transformation and elementary exact solutions of the schrödinger equation. pramana 49:563–580, 1997. https://doi.org/10.1007/bf02848330. [39] f. brafman. a set of generating functions for bessel polynomials. proceedings of the american mathematical society 4:275–277, 1953. https://doi.org/10.1090/s0002-9939-1953-0054100-x. [40] k. h. kwon, l. l. littlejohn. classification of classical orthogonal polynomials. journal of the korean mathematical society 34(4):973–1008, 1997. [41] a. f. nikiforov, v. b. uvarov. special functions of mathematical physics. birkhauser, basel, 1988. https://doi.org/10.1007/978-1-4757-1595-8. [42] g. natanson. darboux-crum nets of sturm-liouville problems solvable by quasi-rational functions i. general theory, 2018. https://doi.org/10.13140/rg.2.2.31016.06405/1. 116 https://doi.org/10.1088/1751-8113/46/23/235205 https://doi.org/10.1088/1751-8113/46/24/245201 https://doi.org/10.1093/qmath/6.1.121 https://www.researchgate.net/publication/353131294 https://doi.org/10.1111/sapm.12225 https://doi.org/10.1016/s0377-0427(00)00636-1 https://doi.org/10.1007/bf02735890 https://arxiv.org/ps_cache/physics/pdf/9907/9907032v1.pdf https://doi.org/10.1088/0266-5611/3/1/014 https://doi.org/10.1063/1.1703891 https://doi.org/10.1088/0305-4470/26/19/041 https://doi.org/10.1088/0305-4470/27/7/035 https://doi.org/10.1088/1751-8121/aace4b https://doi.org/10.1070/pu1985v028n08abeh003882 https://doi.org/10.1088/0305-4470/28/23/036 https://doi.org/10.1142/s0217732396001557 https://doi.org/10.1007/bf02848330 https://doi.org/10.1090/s0002-9939-1953-0054100-x https://doi.org/10.1007/978-1-4757-1595-8 https://doi.org/10.13140/rg.2.2.31016.06405/1 vol. 62 no. 1/2022 quantization of rationally deformed morse potentials. . . [43] f. gesztesy, b. simon, g. teschl. zeros of the wronskian and renormalized oscillation theory. american journal of mathematics 118(3):571–594, 1996. https://doi.org/10.1353/ajm.1996.0024. [44] g. natanson. exact quantization of the milson potential via romanovski-routh polynomials, 2015. https://doi.org/10.13140/rg.2.2.24354.09928. [45] a. schulze-halberg. higher-order darboux transformations with foreign auxiliary equations and equivalence with generalized darboux transformations. applied mathematics letters 25(10):1520–1527, 2012. https://doi.org/10.1016/j.aml.2012.01.008. [46] d. 
gomez-ullate, n. kamran, r. milson. an extension of bochner’s problem: exceptional invariant subspaces. journal of approximation theory 162(5):987–1006, 2010. https://doi.org/10.1016/j.jat.2009.11.002. [47] s. bochner. über sturm-liouvillesche polynomsysteme. mathematische zeitschrift 29:730–736, 1929. https://doi.org/10.1007/bf01180560. [48] v. g. bagrov, b. f. samsonov. darboux transformation of the schrödinger equation. physics of particles and nuclei 28(4):374–397, 1997. https://doi.org/10.1134/1.953045. [49] m. g. krein. on a continuous analogue of the christoffel formula from the theory of orthogonal polynomials. doklady akademii nauk sssr 113(5):970–973, 1957. [50] v. e. adler. a modification of crum’s method. theoretical and mathematical physics 101:1381–1386, 1994. https://doi.org/10.1007/bf01035458. [51] a. j. durán, m. pérez, j. l. varona. some conjecture on wronskian and casorati determinants of orthogonal polynomials. experimental mathematics 24(1):123–132, 2015. https://doi.org/10.1080/10586458.2014.958786. [52] a. kienast. untersuchungen über die lösungen der differentialgleichung xy ′′ + (γ − x)y ′ − βy. denkschriften der schweizerischen naturforschenden gesellschaft 57:247, 79 pages, 1921. [53] w. lawton. on the zeros of certain polynomials related to jacobi and laguerre polynomials. bulletin of the american mathematical society 38:442–448, 1932. https://doi.org/10.1090/s0002-9904-1932-05418-0. [54] w. hahn. bericht über die nullstellen der laguerreschen und der hermiteschen polynome. jahresbericht der deutschen mathematiker-vereinigung 1:215–236, 1933. [55] r. courant, d. hilbert. methods of mathematical physics, vol. 1,. interscience, new york, 1953. 117 https://doi.org/10.1353/ajm.1996.0024 https://doi.org/10.13140/rg.2.2.24354.09928 https://doi.org/10.1016/j.aml.2012.01.008 https://doi.org/10.1016/j.jat.2009.11.002 https://doi.org/10.1007/bf01180560 https://doi.org/10.1134/1.953045 https://doi.org/10.1007/bf01035458 https://doi.org/10.1080/10586458.2014.958786 https://doi.org/10.1090/s0002-9904-1932-05418-0 acta polytechnica 62(1):100–117, 2022 1 introduction 2 tfi sturm-liouville equations 2.1 liouville-darboux transformations 2.2 translational from-invariance of sturm-liouville equation 2.3 equivalence theorem for darboux-crum transforms of a tfi csle with two basic solutions 3 quantization of rationally deformed morse potentials by wronskian transforms of r-bessel polynomials 3.1 schrödinger equation with morse potential in bessel form 3.2 rdcts of principal solutions near singular end points 3.3 isospectral family of rationally deformed morse potentials with a regular spectrum 3.4 subnet of rationally deformed morse potentials quantized via wronskians of r-bessel polynomials 3.5 isospectral rational extensions of krein-adler susy partners of morse potential 4 conclusions acknowledgements references acta polytechnica https://doi.org/10.14311/ap.2021.61.0003 acta polytechnica 61(si):3–4, 2021 © 2021 the author(s). licensed under a cc-by 4.0 licence published by the czech technical university in prague předmluva tomáš bodnár, rudolf dvořák prof. rndr. karel kozel, drsc. narozen 24. prosince 1939 zemřel 23. ledna 2021 dlouholetý profesor čvut v praze a vědecký pracovník út avčr profesor karel kozel je významný český matematik a pedagog, který se po celý svůj život věnoval aplikované a zvláště numerické matematice. narodil se 24. 12. 
1939, studoval na základní škole v pyšelích, vystudoval gymnasium v benešově a následně pak vysokou školu pedagogickou (zaměření matematika – fyzika), kterou skončil v r. 1960. pak nastoupil jako učitel na gymnasium v sedlčanech a absolvoval základní vojenskou službu. v roce 1964 přešel na katedru matematiky fakulty strojní čvut v praze jako odborný asistent. od r. 1988 je docentem, od r. 1991 pak profesorem aplikované matematiky. odborně pracoval od roku 1970, obhájil kandidaturu v roce 1977 (pod vedením prof. poláška), drsc. získal v roce 1990. vedl úkoly státního plánu základního výzkumu od roku 1972, od r. 1990 pak vedl granty a výzkmnné projekty, celkem 10 (gačr, ga avčr, vz mšmt) v čr a 3 (cost, qnet) z eu. byl 13 let vedoucím ústavu technické matematiky, 2x proděkanem fs čvut v praze. profesor karel kozel se po celý svůj život věnoval aplikované matematice a stal se pravděpodobně nejautentičtějším pokračovatelem školy, založené na čvut profesorem poláškem, jejímž cílem a smyslem bylo přímé využití matematiky při řešení konkrétních problémů technické praxe. profesor kozel tuto školu dále rozvinul a spolu se svými žáky se významně zasadil o rozvoj numerických metod ve výpočtové mechanice tekutin. v rámci této své odborné činnosti spolupracoval s četnými vědeckými a výzkumnými institucemi u nás (mff uk, út avčr, mú avčr, fjfi, fel a fsv čvut) i v zahraničí (např. von kármán institute v belgii, universita toulon ve francii, th darmstadt, universita stuttgart, tu chemnitz v německu a ercoftac). významně se zasloužil o podporu a rozvoj spolupráce matematiků s průmyslem a průmyslovým výzkumem (škoda plzeň, škoda auto, vzlú letňany, svúss běchovice). jeho stěžejní odborná činnost byla zaměřena především na matematické modely, numerické řešení parciálních diferenciálních rovnic a jejich aplikaci při simulaci modelů proudění; nejdříve proudění subsonického a transsonického, pak proudění v mezní vrstvě atmosféry, proudění v biomechanice a tzv. „fluid structure interaction“. vedl 13 projektů či grantů čr a eu (cost, qnet-cfd), dále pak pracoval na dalších nejméně šesti grantech. byl členem evropských odborných společností gamm a euromech i české společnosti pro mechaniku a jednoty českých matematiků a fyziků. dlouholetá odborná přednášková i publikační činnost karla kozla zahrnuje mimo jiné více jak 130 přednášek, 16 skript a monografií, 55 výzkumných zpráv a více než 570 publikací v časopisech a sbornících. rozsáhlá a významná je i pedagogická činnost karla kozla. byl dlouholetým členem a později i vedoucím ústavu technické matematiky na fakultě strojní čvut v praze. řadu let působil i ve funkci proděkana fakulty. jeho kolegové na katedře i ve vedení fakulty vždy oceňovali jeho přímé a čestné jednání. podílel se na výuce matematických předmětů v celém rozsahu od základního studia až po doktorské. byl pedagogicky aktivní i na fjfi čvut v praze a zu v plzni. byl školitelem celé dlouhé řady diplomantů a doktorandů. byl členem oborových rad pro doktorandské studium na fs a fjfi čvut v praze a fav zu v plzni, dále členem vědecké rady čvut a út avčr. mnozí z jeho studentů se úspěšně etablovali v oboru aplikované matematiky a dále na jeho působení navazují a to jak na vysokých školách a ve vědeckých ústavech, tak i v průmyslu. profesor rndr. karel kozel, drsc. se v závěru roku 2019 dožívá významného životního jubilea. 
za svůj mimořádný přínos k rozvoji technické matematiky, mezinárodní vědecké spolupráce a především výchově několika generací vědců, inženýrů a pedagogů obdržel ocenění ve formě matematické oborové medaile jčmf. 3 https://doi.org/10.14311/ap.2021.61.0003 https://creativecommons.org/licenses/by/4.0/ https://www.cvut.cz/en tomáš bodnár, rudolf dvořák acta polytechnica ad: osmdesátiny prof. karla kozla osmdesátiny; to je věk, kdy každý – chtě nechtě se začne ohlížet zpět a začne bilancovat svoji celoživotní pouť. šťastný je ten, kdo se může těšit z pocitu plně a úspěšně prožitého života a z pocitu dobře vykonané práce. ne každý může okusit tento pocit do té míry, jako náš významný jubilant – profesor rndr. karel kozel, drsc. vyškolen na pedagogické fakultě jako kantor, působil i celý život jako kantor. nejdříve jen čtyři roky na gymnáziu v sedlčanech, od roku 1964 až do důchodu na katedře matematiky fakulty strojní čvut. zde se habilitoval v roce 1988, v roce 1991 získal hodnost drsc. a v témže roce byl jmenován profesorem a vedoucím ústavu technické matematiky. mnozí z nás ho však znali spíše jako nepřehlédnutelného pracovníka ústavu termomechaniky av čr, s nímž úspěšně spolupracoval od roku 1969. zde měl svůj kout, kde mohl nejen nerušeně pracovat, ale i setkávat se s pracovníky ústavu a v úzkém kontaktu s nimi získávat náměty aplikačně významných témat a problémů. ty pak přenášel dál na svoje žáky a spolupracovníky na fakultě. významně se podílel i na vybudování společného pracoviště ústavu termomechaniky av čr a fakulty strojní čvut. tato spolupráce ho přivedla ke dvojí problematice, která pak převládala v jeho další práci, a to k numerické simulaci transsonického proudění v lopatkových mřížích a k matematickému modelování v dynamice tekutin. postupem času se spolupráce s pracovníky ústavu rozšířila na problematiku turbulentního proudění ve vnitřní i vnější aerodynamice, na proudění v mezní vrstvě atmosféry (např. šíření exhalací) i na vybrané problémy interakce proudící tekutiny s obtékaným tělesem i biomechaniky (např. na pohyb hlasivek). ke všem těmto problémům byly v ústavu termomechaniky k dispozici původní a detailní experimentální výsledky, na nichž bylo možno ověřit vhodnost matematických modelů a v diskuzi s pracovníky ústavu nalézt i správnou interpretaci výsledků jejich numerické simulace. byla to spolupráce, která obohacovala všechny zúčastněné a ukázala, jak lze skutečně naplnit pojem „aplikovaná matematika“. navíc významně přispěla k postupnému vybudování „školy“ matematického modelování v mechanice tekutin, která se stala brzy známou i za hranicemi naší republiky. jeho spolupráce se zahraničními univerzitami byla rozsáhlá a řada jeho úspěšných doktorandů získala současně i titul ph.d. na zahraniční univerzitě v rámci společného doktorského studia. profesor kozel sám působil od roku 1996 každoročně jako hostující profesor na univerzitě ve francouzském toulonu. kromě vlastní odborné a pedagogické práce se profesor kozel věnoval i vědecko-organizační práci. významně přispěl ke zřízení ercoftac czech pilot centre v ústavu termomechaniky. je členem gamm a euromech society a českým zástupcem ve správní radě von kármán institute for fluid dynamics v rhode-saint-genèse v belgii. jménem všech jeho přátel a současných, či bývalých spolupracovníků bych rád poděkoval profesoru kozlovi za celou jeho dosavadní odbornou činnost i angažovanost a popřál pevné zdraví a dobrou pohodu do všech dalších let. 
přípravy k vydání tohoto zvláštního čísla časopisu acta polytechnica začaly už v roce 2019. přestože profesor karel kozel zemřel na začátku roku 2021, ponechali jsme předmluvy v jejich původní podobě. autoři speciálního čísla 4 acta polytechnica 61(si):3–4, 2021 ap08_2.vp 1 lecture 1. these lectures are based on a series of papers written in collaboration with h. boos, m. jimbo, t. miwa, y. takeyama [1, 2, 3, 4, 5, 6, 7, 8]. consider the infinite xxz spin chain with the hamiltonian h k k k k k k k xxz � � � �� � � ��� � �12 1 1 1 1 2 1 2 3 1 3( ( ))� � � � � �� , where � a a( , , )�1 2 3 are pauli matrices and � � cos ��. i shall consider the case � �1, so, � is real: 0 1� �� . i also use the notation q e i� � � the ground state of the hamiltonian will be denoted by vac . i shall later briefly explain how this vector is found, but now let me formulate the main problem. introduce s k j j k ( ) � ��� �12 3� and let � be a parameter. we consider the normalized vacuum expectation values: vac vac vac vac q q s s 2 0 2 0 � � ( ) ( ) � , where � is a local operator. the locality of � implies that the operator q s2 0� ( )� stabilizes: there exist integers k, l such that for all j l (resp. j k� ) this operator acts on the j-th lattice site as 1 (resp. q�� 3 ). if k (resp. l) is the maximal (resp. minimal) integer with this property, l k� � 1 will be called the length of the operator q s2 0� ( )� and denoted by lenght �q s2 0� ( )� . let me start with the free fermion case � � 0 (q i� ). in this case we have hxx k k k k k � �� � ��� � �12 1 1 1 2 1 2( )� � � � . in this case the model can be easily solved by introducing the fermions: � � �k k is ke� � � �� ( )1 , which satisfy canonical commutation relations: [ , ] , ,� � � � k k k l 1 2 1 2 0 � �� . the hamiltonian takes the form: h ixx k k k k k � �� � � � � � ��� � �( )� � � �1 1 . consider a finite chain of length n with periodical boundary conditions (we assume that n is even): h ixx n k k k k k n n ( ) ( )� �� � � � � � � � � � � � �1 1 2 2 1 , � � � n s ne i 2 22 � � �� � , s is the total spin. introduce the fourier transform: � � � � � � � � � � � � � � � � ��( ) 1 1 1 2 2 2 2 1 n k k k n n . i apologize for the strange parametrization of the momentum, but it will be useful in the generic case. we have [ ( ), ( )] ,� � � � � �1 2 0 [ ( ), ( )]� � � � � � � � � � � � � � � � � 1 2 1 2 1 2 2 2 2 2 2 1 1 1 1 1n n � � � � � � � � � � � � � � � � � � � � � � � � � � � n n n 2 1 2 1 2 2 2 2 2 1 1 1 1 1 1 1 1 1 11 2 1 2 2 2 2 2 � � � � � � � � � � � � � � � � � � � . so, introducing �j as solutions to 1 1 2 2 2� � � � � � j j i n j e , j n n� � �2 2 1, , ,� we have: [ ( ), ( )] ,� � � � � � � � �j j i j . the hamiltonian is easily expressed in terms of the fourier transform: h n jxx n j j k n n ( ) ) ) sin� � � � � � � � � � � � �� � �� �2 2 2 1 . © czech technical university publishing house http://ctn.cvut.cz/ap/ 29 acta polytechnica vol. 48 no. 2/2008 correlation functions for lattice integrable models f. smirnov in this lectures i consider the problem of calculating the correlation functions for xxz spin chain. first, i explain in details the free fermion case. then i show that for generic coupling constant the fermionic operators acting on the space of quasi-local fields can be introduced. in the basis generated by these fermionic operators the correlation functions are given by determinants as in the free fermion case. keywords: quantum integrable models, correlation functions, exactly solvable models of statistical physics. 
as usual in fermionic models, there are negative energies which are taken care of by the dirac trick: rewrite the hamiltonian as h n jxx n j j k n j ( ) ) ) sin ) � � � � � � � � � � � � � � � �� � �� � � �� � � 2 0 2 1 � � j k n n j const) sin 2 2 1 � � � � � � �� � � which means that the vacuum must satisfy � �� � �� � � � � � � � � � j j j n j n ) , , , ) , , , . vac vac 0 2 1 0 0 2 1 � � so, if we start with a ferromagnetic vacuum 0 in which all spins are down, the real vacuum vac is obtained by filling the dirac sea: vac 0� � �� � � � ��j j n ) 2 1 . i have considered this simple example in order to explain that even in the free fermion case the vacuum is a rather complicated state if we present it in terms of original spin variables. now we have to take the limit n � �. for the fourier transform we have � � k d� � � � � � � � � � � � � �� 1 1 2 1 2 2 2 4 12 ( ) , and the vacuum satisfies: � � � �) vac 0, phase 1 1 0 2 2 � � � � � � � � � � , � � ) vac � 0, phase 1 1 0 2 2 � � � � � � � � � � . let me give some explanation about the case of generic q. it is well known nowadays that the integrability of the xxz model is due to its relation to the trigonometric r-matrix. the r-matrix belongs to the tensor product of the two algebras of the 2 matrices: r mat mat( ) ( , ) ( , ) � �2 2� � : r q q q q q q q q ( ) � � � � � � � � � � � � � � � 1 0 0 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 0 11 q q� � � � � � � � � � � � � � �� (1.1) the r-matrix satisfies the yang-baxter equation: r r r r12 1 2 13 1 3 23 2 3 23 � �� � � �� � �� � � �� � �� � � �� � 2 3 13 1 3 12 1 2 � �� � � �� � �� � � �� � �� � � ��r r , where as usual i shall denote by ri j, ( ) the r-matrix acting in the tensor product of two copies of mat( , )2 � . i shall also use the notion of an l-operator. in this particular case there is no difference between these two things, and i shall explain later what i mean by an l-operator in general. here i consider two copies of mat( , )2 � one of which is called auxiliary (carrying the index a), and the other is called quantum (carrying the index j). so, la j, ( ) is r( ) as an element of the tensor product of these two algebras. the most important object of the theory of integrable models is the transfer-matrix: t l ln a a n a n( ) ( ( ) ( )), , � � �tr 2 1 2� . the yang-baxter equation implies commutativity: [ ( ), ( )]t tn n 1 2 0� . it is well known that the hamiltonian of a periodic xxz chain on n sites is contained in this one-parametric family of operators, but since i shall need some knowledge about higher local integrals of motion let me repeat the derivation of this fact. it is convenient to introduce ~ ( ) ~ ( ), ,l la j a j j j � � � � 3 3 2 2 . note that we still have t l ln a a n a n( ) ( ~ ( ) ~ ( )), , � � �tr 2 1 2� because tn ( ) commutes with the total spin. introduce further � � � � � � � l l p q q ha j a j a j a j, , , ,( ) ~ ( ) 1 12 2 1 , where h q q q q a j a j a j a j a , ( ) ( � � � � � � � � � � � � � � � � � � � � � � � 1 3 3 1 3 4 1 4 � � � � � � j 3) . note that the operator ha j, (the density of the hamiltonian) is a projector, so it is easy to see that � � � � � � � � � � � � � � � � � l q q q q q q ha j a j, ,( ) exp log 1 1 2 1 2 1� � � � � . now we calculate t l l ln n n n n n n( ) ( ) ( ) ( ), , , � � � � � � � � � � � � � �2 2 1 2 1 2 2 2 1 2 p p e un n n n ip p � � � � � � �2 2 1 2 2 1 1 , , ,� where �� � � l n n2 2 1, means that the operators from the �n 2 ( )n 2 1� tensor component are put to the right (left) of the product of the l-operators. 
it is easy to see that i hp n� xxz ( ) , which is the hamiltonian of the periodic xxz chain. moreover, from the campbell-hausdorf formula one concludes that i dp p j j � � , 30 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 48 no. 2/2008 where dp, j acts nontrivially from the j-th to the (j�p)-th site. these operators are called local integrals of motion: they commute with the hamiltonian, and they are composed of local densities. finally, i would like to say what the ground state of the hamiltonian of the periodic chain looks like. denote l l a b c da n a n n n n n a , ,( ) ( ) ( ) ( ) ( ) ( )2 1 2� � � � �� � � �� � . the eigenvectors of the transfer matrix with total spin 0 (a vacuum is among them) are constructed as c cn n n( ) ( )� �1 2 0� where �j solve the b the equations: 1 1 1 2 2 2 2 2 2 2 2 2 � � � � � � � � � � � � � � � j j n j k k jk jq q q j, ,�, n 2. a vacuum corresponds to a particular solution to these equations which is a continuous deformation from the free fermion case. so, we see that the vacuum is a complicated state which hardly allows a mathematically satisfactory description in the case n � �. 2 lecture 2. let us return to our main problem. we want to calculate vac vac vac vac q q s s 2 0 2 0 � � ( ) ( ) � . consider the free fermion case. one easily calculates: s( ) : ( ) (0 1 1 12 1 2 1 2 2 2 2 2 1 1 1 2 2 1 2 � � � � � � � ���� � � d d 2 1 2 2 2 1 0 ) : � � , where : : stands for normal ordering with respect to the creation-annihilation parts of fermions discussed before. for further computation i need the formula: e es z2 0� �( ) : :� . this is a simple exercise on normal reordering of an exponent of quadratic form. one finds: z� �� � � � � � � � � � �� 4 2 1 13 1 2 1 2 2 2 2 2 1 1 1 2 2 1 2 sin : ( ) d d � � � � � � � ( ) : ,2 1 2 1 2 2 2 0 from this formula one easily derives: vac vac vac q q s k k s 2 0 1 1 2 � � � � � � ( ) ( ) ( ) ( ) ( )� � � � � � � �� � ( ) det ( , ) 0 vac � � �� i j where � ( , )1 2 is the two-point green function: � � ( , ) ( )( ) ( ) 1 2 2 2 1 2 2 2 1 2 2 2 2 1 1 1 1 2 2 1 � � � � � � � � y y y y � 2 1 2 2 2 � � , where y e i � � � 2 . i want to rewrite this answer differently, but let me first explain my logic. i appeal to an analogy with conformal field theory (cft). it is well-known that on long distances the xxz model is described by cft with c �1. moreover, the xxz model can be considered as a perturbation of this cft by an infinite number of irrelevant operators. here, contrary to the perturbation of relativistic quantum field theory the flow of the renormalization group goes from infrared to finite distances. the main object for cft is not the space of states, but the space operators. following this analogy let me denote by w�,0 the space of the spinless operators for the xxz model of the form q s2 0� ( )�. in the conformal limit this space goes to the space of descendants of the primary field �� � � � � � � � � � e i ( ) ( ) 1 2 here ( )� �� , are two chiral bosonic fields. the descendants are created by the action of two chiral virasoro algebras: [ , ] ( ) ( ) ,l l m n l n nm n m n m n� � � �� � 1 12 12 � , [ , ] ( ) ( ) ,l l m n l n nm n m n m n� � � �� � 1 12 12 � . we have: l l mm m( ) ( ) , ,� �� �� � 0 0 l l0 0( ) ( ) ,� � �� � � �� � � where � � � � � �� � �( ( ))1 2 8 is the anomalous dimension. the space of spinless descendants is spun by the vectors l l l lk k l lp p� � � �1 1� � ( )�� . after perturbation, the descendants acquire the vacuum expectation values. 
the question which a. zamolodchikov asked me long ago is whether it is possible to write find a covector 0 1 2 1 2f l l l l( , , , , , )� � (here 0 has nothing to do with what was previously used, it is the virasoro left vacuum 0 0l lk k� �� for � 0) such that the vacuum expectation values are given by the scalar products of this covector with descendants. at that time i was unable to answer this question, but in these lectures i shall explain a construction which is very much in the spirit of this point of view. so, our goal is to start working in the space of the operators and to introduce the operators acting in this space. let me give some formal definitions. i have already defined the space w�,0. similarly i define the spaces w s�, of operators q s2 0� ( )� of spin s, and � �� �� � ��� � s ,s . also, we shall shift � by integers, so i introduce the space: © czech technical university publishing house http://ctn.cvut.cz/ap/ 31 acta polytechnica vol. 48 no. 2/2008 � �[ ]� �� � ��� � � k k . we shall need an analogue of the left virasoro vacuum, i.e. a linear functional on w� . first, introduce also the linear functional on end( )�2 : tr tr� � � �� ( )x q q q x� � � � � � � � �� �1 2 2 1 2 3 (2.1) with the obvious properties: tr tr� � ��( ) ( )1 1 3 � �q . this gives rise to a linear functional on �� tr � � � �( ) ( )x x�� �tr tr tr1 2 3 obviously, this functional is well-defined, since any operator from w� stabilizes at �� (��) as i e( ) �� 3 . introduce the operators �k k f x kx x x � � �� � � �( ) ( ) ( )� �1 �k k f x kx y x y x� � �� � � �( ) ( ( ) )( ) 1 1 12 2 � �� � , where f(x) is the fermionic number of operator x. these operators annihilate respectively the right or left tail of elements of w� . they satisfy the canonical commutation relations: [ , ] [ , ], ,� � � �k l k l � � � � � �� � 0 [ , ],� �k l k j � �� �� � � � � �� . introduce the fourier transform: � �� � ��� � � � � � � � � � � ��( ) j j j 1 1 2 2 , � �� � ��� � � � � � � � � � � ��( ) j j j 1 1 2 2 . we want to use �k � (�k �) for k 0 as annihilation (creation). obviously, �k sq k� � ( ) ,( )2 0 0 0� . there is a problem with the creation operators and the functional tr tr� �: ( ( ))�k x � does not always vanish. however, one easily derives that tr tr� �( ( )) ( ( ))� �k kx y y x� �� � �2 1 , which allows us to perform all necessary calculations. without going into further details i rewrite the formula for vacuum expecation values in a new form: vac vac vac vac y y e s s 2 0 2 0 ( ) ( ) ( ( )) � � � � tr � � � � � � � � � �� i sin ( ) ( ) �� � � � � 2 1 1 1 2 1 2 2 2 2 1 1 2 1 2 2d d 2 2 2 11 1 2 2 1 2 � � � � � � � � �� �� , where � denote local operators living on a positive axis only. still i am not quite happy with this formula, because there is a problem with translational invariance. i would like to avoid using � . i introduce the following operators: b( ) ( ) sin ( ) ( , ) , �� w s s s a s i y e s � � � � � � � � � �2 1 1 1 2 12 g � 1 2� � � ! " # , c( ) sin ( ) ( , ) , � � w s a s y e s� � � � � ! " # � � � �2 1 1 2 1 2 g � , where � e n i m i m � � � � � � � ( , exp [ log( ) log( ) ] . �� � �� � � � 2 2� � in the last formula we consider ��, j � (resp. � j �) as components of a row (resp. column) vector, m u u u j j� � � � � � � �( )( ) , ( )1 1 1 1� � , and log( )i m� 2 are understood as taylor series in u. � [ ]� stands for the normal ordering, which applies only to operators acting at the same site. for them we set � [ ] ( ), ( ). 
, , , � � � � � � � � � j j j j j j j j � � � � � � � $ % & '& 0 0 these operators satisfy a number of remarkable properties. first, the vacuum expectation values can be presented for any local � as vac vac vac vac y y e s s 2 0 2 0 ( ) ( ) ( ( )) � �� tr � � � � �� �� i sin ( , ) ( ) ( ) �� � � 2 1 2 1 2 1 2 2 2 11 2 2 1 2 b c d d , (2.2) where � � �( , ) ( )� � � � �8 1 1 1 1 2 2 2 4 y y . let me formulate the rest of properties of these operators in the most general form because formula (2.2) has a direct analogue in the general case. 1. operators b( ) , c( ) have the following block structure: b( ): , , � �w ws s� � �1 1, c( ): , , � �w ws s� � �1 1. hence operator b c( ) ( ) 1 2 acts from w�,0 to itself. 2. we have complete anti-commutativity: [ [ [b b b c c c( ) ( )] ( ) ( )] ( ) ( )] 1 2 1 2 1 2 0� � �� � � . 3. we have lenght ( lenghtb( )( )) ( )( ) ( ) � �q qs s2 0 2 0� �� , lenght ( lenghtb( )( )) ( )( ) ( ) � �q qs s2 0 2 0� �� . 32 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 48 no. 2/2008 moreover, by definition b( ) ( ) �� � � � � � � 1 121 p p s p b , c( ) ( ) �� � � � � � 1 121 p p s p c , then b p s sq p q( ) , ( )( ) ( )2 0 2 00� �� �� lenght , c p s sq p q( ) , ( )( ) ( )2 0 2 00� �� �� lenght . this implies that for any � only a finite number of terms in the series for e� count. another corollary is that b c( )( ) ( )( )( ) ( ) � �q qs k s k2 2 0� � , so we have translational invariance as was announced. 4. recall that the local integrals of motion have the form: i dp j p� � , . so, obviously their adjoint action is well defined on w� . denote � p px i x( ) [ , ]� . we have: [ , ( )] [ , ( )]� �p pb c � � 0. this property is very important for self-consistency of our main formula, because the vacuum expectation value of � p x( ) must vanish. for the moment, all this was for the free fermion case. however, the main result of our research is that one can give an algebraic definition of operators b( ) , c( ) in the case of generic q �1. they possess the same properties, and the formula for the vacuum expectation values is exactly the same, with the only difference that the function � �( , ) should be replaced by: � � � � � � ( , ) ( ) ( ) � � � � � � � � � � � � � � � 4 1 1 12 2 2 2 2 q q q q q q u� � �� �� � � �� � � � �� � � � sin ( ( )) sin cos ( ) .2 2 20 0 u u u u u i i d i shall define the operators b( ) , c( ) in the last lecture, but i would like to finish the present lecture by completing the set of operators. namely, we are able to define the creation operators, b *( ) , s*( ) and an additional operator t*( ) . for the free fermion case one can write explicit formulae for these operators. in the generic case they are constructed similarly to what i shall explain in the next lecture. their properties are as follows. 1. the block structure of b *( ) , s*( ) , t*( ) is as follows: b * s s( ): , , � �� �� � �1 1 s* s s( ): , , � �� �� � �1 1 t* s s( ): , , � �� �� 2. these operators have the following commutation relations with b( ) , c( ) : [ ( ), ( )] [ ( ), ( )] [ ( ), ( )] [ b c c b c t b 1 2 1 2 1 2 * * * � � � � � � ( ), ( )] , 1 2 0t* � � [ ( ), ( )]b b � 1 2 1 2 1 2 2 2 1* � � � �� � � �� � , [ ( ), ( )] .c c � 1 2 2 1 1 2 2 2 1* � � � �� � � �� � 3. as functions of the operators b *( ) , c*( ) , t*( ) are: b b* p p* s p ( ) ( ) �� � � � � � 2 1 1 1 , c c* p p* s p ( ) ( ) �� � � � � � 2 1 1 1 , t t* p p* p ( ) ( ) � � � � � 2 1 1 1 . 4. 
the operators b *( ) , c*( ) , t*( ) acting on q s k2� ( ) create the space w�. the locality is respected due to the properties: lenght ( lenghtb p* s sq q p( )) ( )( ) ( )2 0 2 0� �� �� � , lenght ( lenghtc p* s sq q p( )) ( )( ) ( )2 0 2 0� �� �� � , lenght ( lenghtt p* s sq q p( )) ( )( ) ( )2 0 2 0� �� �� � . 5. the operators b *( ) , c*( ) , t*( ) respect tr � : tr b tr c tr t tr � � � � ( ( )( )) ( ( )( )) , ( ( )( )) ( ) * * * x x x x � � � 0 . (2.3) for the vacuum expectation value we have: vac b b c c t t* * p * * p * * q q( ) ( ) ( ) ( ) ( ) ( ) 1 1 1 0 0 2� � � � � � � � � �� s s j j q w ( ) ( ) det ( , 0 2 0 vac vac vac � � � � i think this is a good point to finish this lecture. 3 lecture 3. in the previous lecture i claimed that the space of operators w� can be organized in such a way that the vacuum expectation values are easy to calculate. namely, i claimed that there are operators b( ) , c( ) , b *( ) , c*( ) , t*( ) which can be constructed algebraically, and which provide this organization of w�. but for the moment i described some of these operators for the case of free ferminos only. now i shall explain the general construction, but i think i shall not be able to do this for all the operators. i shall therefore restrict myself to the operator c( ) . if the construction of these operators is clear to you at the end of this lecture i shall be quite happy. other operators are constructed using similar means. first, let me prepare our notation for the l-operators. consider the quantum affine algebra uq( )sl2 . the universal r-matrix of this algebra belongs to the tensor product b b� �� of its two borel subalgebras. by an l-operator we mean its image under an algebra map b b� �� � �n n1 2, where n1 , n2 are some algebras. i shall always take n2 to be the algebra © czech technical university publishing house http://ctn.cvut.cz/ap/ 33 acta polytechnica vol. 48 no. 2/2008 � m mat� ( , )2 � of 2×2 matrices. as for n1 i make several choices: uq( )sl2 , m, the q-oscillator algebra osc (see below) or osc m� �, where m m� ( are the subalgebras of upper and lower triangular matrices. for economy of symbols, i use the same letter l to designate these various l-operators. i shall put indices, indicating to which tensor product of the algebras they belong. we use j, k, … as labels for the lattice sites, and a, b, … as labels for the ‘auxiliary’ two-dimensional space. accordingly i write the matrix algebra as mj or ma. capital letters a, b, … will indicate the q-oscillator algebra osc. finally, for osc m� � i use pairs of indices such as {a, a}. we already had the l-operator la j, ( ) , which is essentially the image of the universal r-matrix under the map of both b�, b� to mat( , )� 2 . the next case is due originally to bazhanov, lukyanov and zamolodchikov [9, 10, 11]. let us consider the q-oscillators a, a * satisfying a a a a* *q q� � �2 21 . it is convenient to introduce one more element qd such that q q q a a qd * * d d da a� �� �1 1, , a a a a* d * dq q� � � � �1 12 2 2, . denote by osc the algebra generated by a, a *, q d� with the above relations. we consider the following two representations of osc, w k c k� � � � � 0 , a * k k� �1 , d k k k� , a 0 0� ; w k c k� � ��� � � 1 , a * k k� �1 , d k k k� , a * � �1 0; in the root of the unity case, if r is the smallest positive integer such that q r2 1� , we consider the r-dimensional quotient of w � generated by 0 or �1 . my goal is not to give a complete description of the construction, but rather to consider an example. 
so, i shall explain how to construct the operator c( ) . this operator has the block structure explained before. let me denote by c( ) its block acting from w� to w� �1. returning to the l-operators, we introduce the following one, which is the image of the universal r-matrix for n osc1 � , n mat2 2� ( , )� : l i q q q a j d a a j d a a , * ( ) � � � � � � � � � � � � � � � 1 2 1 4 2 2 21 1 a a 0 0 q osc md j a j a � � � � � � � � � . (3.1) this l-operator satisfies the crossing symmetry relation: l la j a j, ,( ) ( ) � � �1 1 1 , where we have set l l qa j j a j t j j , ,( ) ( ) � �� �2 1 2 , and tj stands for the transposition in mj. consider the product l la j a j, ,( ) ( ) . it is well known that this product can be brought to a triangular form, giving rise in particular to baxter’s ‘tq-equation’ for transfer matrices. introduce fa a a a, � � �1 a � , then we have the fusion relation: l f l l f q q a a j a a a j a j a a{ , }, , , , ,( ) ( ) ( ) ( ) � � � � � � � 1 1 1 0 � � � j a a a q q l q q l q q j j � � � � � � � � � � � � � � � 1 1 2 2 3 3 0 0 ( ) ( ) � � � � � a . (3.2) we shall also need l q q q q q l a a j a j { , }, ( ) ( )( )( ) � � � � � � � � � � � 1 1 1 1 1 2 1 3 ( ) ( ) ( ) � � q q l q q q q q j a a j 0 0 0 3 2 1 1 1 1 � � � � � � � � � � � � � � � � � � � � � � � ��1 a . (3.3) let us consider the finite chain with sites from k to l. by m k l[ , ] i shall denote mat l k( , ) ( )2 1� � � � . as usual, the main object is the monodromy matrix: t l lx k l x l x k,[ , ] , ,( ) ( ) ( ) � � �� � , where x stands for any auxiliary algebra. however, contrary to the usual situation i want to act on the operators, so, i introduce the adjoint action: � x k l x k l x k lt t,[ , ] ,[ , ] ,[ , ]( )( ) ( ) ( ) � � � � � �1. consider x mk l� , and define � � � a k l a k l a k l a ,[ , ] ,[ , ] ,[ , ] ( , ) ( , ) ( , ) ( � � � 0 � �� � � �� x q q x a a k l d sa a k l ) ( ) ( ), { , },[ , ] ( )( ) [ , ] � � � � � � �1 2 23 where s k l j j k j [ , ] � � �12 3� . from the fusion relation we obtain: � � � a k l a k l dx q q q q qa,[ , ] ,[ , ] ( )( , )( ) ( , ) ( � � �� � � � �1 2 1 2s k l x[ , ] ), � � � a k l a k l dx q q q q a,[ , ] ,[ , ] ( )( , )( ) ( , ) ( � � �� � � � � �1 1 2 1 q x s k l�2 [ , ] ), where � stands for the adjoint action of total spin. now i introduce the most important object: c [ , ] (0) trk l a a k l sx x( , )( ) ( ( , ) ( ),[ , ] � � �� � �� , 34 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 48 no. 2/2008 the trace is taken with respect to w+, it converges for q� �1, and continues analytically to the other �. there is one obvious property of c [ , ] (0) k l ( , ) � : c c[ , ] (0) , [ , ] ( k l k l k lq x qk k( , ) [ ] ( ) � �� � �3 3 1 1 � � � � � � � � 0) ( , )( )[ , ] � x k l�1 . it follows from the definition of the adjoint and u(1)-symmetry of the l-operator. we call this property the first reduction relation. in addition there is another property which is a result of non-trivial calculation: � �c c[ , ](0) , [ , ](0) ,k l k l l k l k lx i x i( , ) ( , )[ ] [ ] � �� � ��1 1 1 l f� �( ( )) , where �( ( )) ( ) ( )f f q f q � � �1 , the explicit expression for f( ) is irrelevant here. so, the reduction relation from the right is satisfied up to the “q-exact form”. in classical mathematics the additional term is usually eliminated by integrating over the closed cycle. here it is similar. obviously, c [ , ] (0) k l ( , ) � is singular at 2 2 21� �, ,q q . 
then the above relation implies that for c (c c[ , ]( , ) sin ( , ) (k l k l k la q q � � � �g 2 =1 [ , ] (0) [ , ] (0) 1, ))� we have c c[ , ] [ , ] [ , ] [ , ]( , )( ) ( , )( )k l k l l k l k l la x i a x i � � ��1 1 1 . this is the second reduction relation. now we are able to define the operator c( , ) a acting from w� to w� �1 taking the inductive limit. indeed consider x w� � , denote by x k l[ , ] its restriction to the interval [ , ]k l . then c c( , )( ) lim ( , )( ) , [ , ] [ , ] a x a x k l k l k l� �� �� due to the reduction relation the limit is well-defined since for a large enough interval [ , ]k l the sequence stabilizes! the commutation relations are proved similarly. i do not go into details, but using the r-matrix one can show that c c c[ , ] (0) [ , ] (0) [ , ] (0) k l k l k l( , ) ( , ) ( , ) � � �1 2 21 1� � � c [ , ] (0) k l f f ( , ) ( ( , )) ( ( , )), � 1 1 2 2 11 2 � �� � so, there is anticommutativity up to the “q-exact 2-form”. again, “integrating over the closed cycle” (passing to c( �) one obtains anticommutativity. let me finish these lecture with some general remarks. the construction of other operators is not very different from c, but for b * and c* some additional work has to be done. the commutation relations are not always easy to prove because the r-matrices are not always applicable. i did not say anything about the derivation of our formula for the vacuum expectation values for generic q. actually, it was obtained as a result of a long transformation of the integral formula by jimbo and miwa [12]. however, i find quite unsatisfactory that we have such a complicated derivation of such a simple result. i hope to find a more direct proof. references [1] boos, h., jimbo, m., miwa, t., smirnov, f., takeyama, y.: a recursion formula for the correlation functions of an inhomogeneous xxx model. algebra and analysis, (2005), p. 115–159. [2] boos, h., jimbo, m., miwa, t., smirnov, f., takeyama, y.: reduced qkz equation and correlation functions of the xxz model. commun. math. phys. (2006), p. 245–276. [3] boos, h., jimbo, m., miwa, t., smirnov, f., takeyama, y.: traces on the sklyanin algebra and correlation functions of the eight-vertex model. j. phys. a: math. gen. (2005), p. 7629–7659. [4] boos, h., jimbo, m., miwa, t., smirnov, f., takeyama, y.: density matrix of a finite sub-chain of the heisenberg anti-ferromagnet, lett. math. phys. (2006), p. 201–208. [5] boos, h., jimbo, m., miwa, t., smirnov, f., takeyama, y.: algebraic representation of correlation functions in integrable spin chains. ann. henri poincaré (2006), p. 1395–1428. [6] boos, h., jimbo, m., miwa, t., smirnov, f., takeyama, y.: hidden grassmann structure in the xxz model. commun. math. phys., (2007), p. 263–281. [7] boos, h., jimbo, m., miwa, t., smirnov, f., takeyama, y.: fermionic basis for space of operators in the xxz model. hep-th/0702086 [8] boos, h., jimbo, m., miwa, t., smirnov, f., takeyama, y.: hidden grassmann structure in the xxz model ii. creation operators. in preparation. [9] bazhanov, v., lukyanov, s., zamolodchikov, a.: integrable structure of conformal field theory, quantum kdv theory and thermodynamic bethe ansatz, commun. math. phys., (1996), p. 381–398. [10] bazhanov, v., lukyanov, s., zamolodchikov, a.: integrable structure of conformal field theory ii. q-operator and ddv equation, commun. math. phys. (1997), p. 247–278. [11] bazhanov, v., lukyanov, s., zamolodchikov, a.: integrable structure of conformal field theory iii. the yang-baxter relation, commun. math. phys. 
(1999), p. 297–324. [12] jimbo, m., miwa, t.: algebraic analysis of solvable lattice models, reg. conf. ser. in math., ams,. vol. 85 (1995). fedor smirnov e-mail: smirnov@lpthe.jussieu.fr fs (membre du cnrs): laboratoire de physique théorique et hautes energies, université pierre et marie curie, tour 16 1 étage 4 place jussieu 75252 paris cedex 05, france © czech technical university publishing house http://ctn.cvut.cz/ap/ 35 acta polytechnica vol. 48 no. 2/2008 << /ascii85encodepages false /allowtransparency false /autopositionepsfiles true /autorotatepages /none /binding /left /calgrayprofile (dot gain 20%) /calrgbprofile (srgb iec61966-2.1) /calcmykprofile (u.s. web coated \050swop\051 v2) /srgbprofile (srgb iec61966-2.1) /cannotembedfontpolicy /error /compatibilitylevel 1.4 /compressobjects /tags /compresspages true /convertimagestoindexed true /passthroughjpegimages true /createjobticket false /defaultrenderingintent /default /detectblends false /detectcurves 0.0000 /colorconversionstrategy /cmyk /dothumbnails false /embedallfonts true /embedopentype false /parseiccprofilesincomments true /embedjoboptions true /dscreportinglevel 0 /emitdscwarnings false /endpage -1 /imagememory 1048576 /lockdistillerparams false /maxsubsetpct 100 /optimize true /opm 1 /parsedsccomments true /parsedsccommentsfordocinfo true /preservecopypage true /preservedicmykvalues true /preserveepsinfo true /preserveflatness true /preservehalftoneinfo false /preserveopicomments true /preserveoverprintsettings true /startpage 1 /subsetfonts true /transferfunctioninfo /apply /ucrandbginfo /preserve /useprologue false /colorsettingsfile () /alwaysembed [ true ] /neverembed [ true ] /antialiascolorimages false /cropcolorimages true /colorimageminresolution 300 /colorimageminresolutionpolicy /ok /downsamplecolorimages true /colorimagedownsampletype /bicubic /colorimageresolution 300 /colorimagedepth -1 /colorimagemindownsampledepth 1 /colorimagedownsamplethreshold 1.50000 /encodecolorimages true /colorimagefilter /dctencode /autofiltercolorimages true /colorimageautofilterstrategy /jpeg /coloracsimagedict << /qfactor 0.15 /hsamples [1 1 1 1] /vsamples [1 1 1 1] >> /colorimagedict << /qfactor 0.15 /hsamples [1 1 1 1] /vsamples [1 1 1 1] >> /jpeg2000coloracsimagedict << /tilewidth 256 /tileheight 256 /quality 30 >> /jpeg2000colorimagedict << /tilewidth 256 /tileheight 256 /quality 30 >> /antialiasgrayimages false /cropgrayimages true /grayimageminresolution 300 /grayimageminresolutionpolicy /ok /downsamplegrayimages true /grayimagedownsampletype /bicubic /grayimageresolution 300 /grayimagedepth -1 /grayimagemindownsampledepth 2 /grayimagedownsamplethreshold 1.50000 /encodegrayimages true /grayimagefilter /dctencode /autofiltergrayimages true /grayimageautofilterstrategy /jpeg /grayacsimagedict << /qfactor 0.15 /hsamples [1 1 1 1] /vsamples [1 1 1 1] >> /grayimagedict << /qfactor 0.15 /hsamples [1 1 1 1] /vsamples [1 1 1 1] >> /jpeg2000grayacsimagedict << /tilewidth 256 /tileheight 256 /quality 30 >> /jpeg2000grayimagedict << /tilewidth 256 /tileheight 256 /quality 30 >> /antialiasmonoimages false /cropmonoimages true /monoimageminresolution 1200 /monoimageminresolutionpolicy /ok /downsamplemonoimages true /monoimagedownsampletype /bicubic /monoimageresolution 1200 /monoimagedepth -1 /monoimagedownsamplethreshold 1.50000 /encodemonoimages true /monoimagefilter /ccittfaxencode /monoimagedict << /k -1 >> /allowpsxobjects false /checkcompliance [ /none ] /pdfx1acheck false /pdfx3check false /pdfxcompliantpdfonly false 
/pdfxnotrimboxerror true /pdfxtrimboxtomediaboxoffset [ 0.00000 0.00000 0.00000 0.00000 ] /pdfxsetbleedboxtomediabox true /pdfxbleedboxtotrimboxoffset [ 0.00000 0.00000 0.00000 0.00000 ] /pdfxoutputintentprofile (none) /pdfxoutputconditionidentifier () /pdfxoutputcondition () /pdfxregistryname () /pdfxtrapped /false /createjdffile false /description << /ara /bgr /chs /cht /dan /deu /esp /eti /fra /gre /heb /hrv (za stvaranje adobe pdf dokumenata najpogodnijih za visokokvalitetni ispis prije tiskanja koristite ove postavke. stvoreni pdf dokumenti mogu se otvoriti acrobat i adobe reader 5.0 i kasnijim verzijama.) /hun /ita /jpn /kor /lth /lvi /nld (gebruik deze instellingen om adobe pdf-documenten te maken die zijn geoptimaliseerd voor prepress-afdrukken van hoge kwaliteit. de gemaakte pdf-documenten kunnen worden geopend met acrobat en adobe reader 5.0 en hoger.) /nor /pol /ptb /rum /rus /sky /slv /suo /sve /tur /ukr /enu (use these settings to create adobe pdf documents best suited for high-quality prepress printing. created pdf documents can be opened with acrobat and adobe reader 5.0 and later.) /cze >> /namespace [ (adobe) (common) (1.0) ] /othernamespaces [ << /asreaderspreads false /cropimagestoframes true /errorcontrol /warnandcontinue /flattenerignorespreadoverrides false /includeguidesgrids false /includenonprinting false /includeslug false /namespace [ (adobe) (indesign) (4.0) ] /omitplacedbitmaps false /omitplacedeps false /omitplacedpdf false /simulateoverprint /legacy >> << /addbleedmarks false /addcolorbars false /addcropmarks false /addpageinfo false /addregmarks false /convertcolors /converttocmyk /destinationprofilename () /destinationprofileselector /documentcmyk /downsample16bitimages true /flattenerpreset << /presetselector /mediumresolution >> /formelements false /generatestructure false /includebookmarks false /includehyperlinks false /includeinteractive false /includelayers false /includeprofiles false /multimediahandling /useobjectsettings /namespace [ (adobe) (creativesuite) (2.0) ] /pdfxoutputintentprofileselector /documentcmyk /preserveediting true /untaggedcmykhandling /leaveuntagged /untaggedrgbhandling /usedocumentprofile /usedocumentbleed false >> ] >> setdistillerparams << /hwresolution [2400 2400] /pagesize [612.000 792.000] >> setpagedevice ap05_2.vp 1 introduction nowadays, arithmetic operations are among the most common operations in digital integrated circuits. even the simplest circuit adds, subtracts or multiplies something. therefore there are high requirements on arithmetic operations. the computation should be fast and the area consumed by the arithmetic units should be small. these are two basic requirements, which are unfortunately contradictory. numbers in digital integrated circuits can be represented in various ways. the most widely known and most frequently used is fixed-point addition and multiplication representation. as mentioned above, the question of efficient fixed-point arithmetic operations is interesting when implementing a digital circuit. this is not a new question. many studies have been done and many articles and books have been published on the topic of arithmetic. however, these have only rarely considered this question in conjunction with fpgas (field programmable integrated circuits). in the design of an arithmetic unit in fpga, it was common to use the corresponding operator of an hdl (hardware description language) and rely on a synthesis tool. 
no thorough study and comparison of different structures of basic fixed-point arithmetic units has yet been published. this paper shows the results of our study [7], based on implementing and then comparing of fixed-point arithmetic units. virtex-ii fpga was chosen for implementation since it possesses arithmetic-support features common for contemporary fpgas. the paper is divided into five sections. after the introduction, the second section describes the selected implementation platform. the third section outlines the implantation of arithmetic units. the fourth section consists of a discussion and a comparison of the results of the measurements. the last section concludes the text with a summary of the results. 2 platform as shown in table 1, contemporary fpgas possess special features to support of arithmetic operations. the support features differ from one fpga vendor to another. it was not the goal of the study to compare the different fpgas. we decided to use xilinx [8] virtex-ii fpgas as the implementation platform. this is a very popular implementation platform in contemporary designs, and it has special support features for both addition and multiplication. cad tools are another important part of the implementation platform. the choice of tools affects the results of the implementation. again, a comparison of cad (computer aided design) tools was not the goal of our study. the tools were chosen according to their availability in our department. the following tools were used in the relevant implementation steps: � synthesis: leonardo spectrum 2001, mentor graphics [4] � p&r, timing analysis: xilinx ise 5.2i foundation, xilinx [8] � simulation: modelsim 5.5f, mentor graphics [4] 3 implementation this section lists the implemented adders and multipliers. they were chosen from different sources [1-3][5-6] to meet the objectives of the study: to explore the characteristics of arithmetic unit structures when implemented in fpga, and to examine the fpga architecture itself including the features that support arithmetic operations. © czech technical university publishing house http://ctn.cvut.cz/ap/ 67 czech technical university in prague acta polytechnica vol. 45 no. 2/2005 fixed-point arithmetic in fpga m. bečvář, p. štukjunger arithmetic operations are among the most frequently-used operations in contemporary digital integrated circuits. various structures have been designed, utilizing different features of ic architectures. nevertheless, there are very few studies that consider the design of arithmetic operations in field programmable gate arrays (fpgas), a re-programmable type of digital integrated circuit. this text compares the results achieved when implementation of basic fixed-point arithmetic units in fpga. keywords: arithmetic, fixed-point, fpga. vendor chip features altera stratix ii adder logic, carry chain, dsp support. atmel at40kal vector multipliers. the at40kal’s patented 8-sided core cell with direct horizontal, vertical and diagonal cell-to-cell connections implement ultra fast array multipliers without using any busing resources. lattice ispxpga dedicated carry chain and booth multiplication logic (extra two-input and). quicklogic eclipse-ii ecu blocks (up to 8-bit mac functions) for dsp (8b multiplier, 16b adder), 12 ecus in the largest device. xilinx virtex-ii dedicated fast carry chain and carry propagation logic, embedded 18x18 bit signed multipliers with associated ram blocks. 
table 1: arithmetic support features in fpgas classes of implemented fixed-point adders: � carry-ripple adder (cra): a basic adder structure based on full adders connected into a chain. this is the simplest structure for implementation and also the slowest for operation. nevertheless, it is important for comparison with the other structures. � generated adder (ga): an adder generated by a synthesis tool. in contrast to cra it utilises the dedicated carry chains of xilinx virtex ii fpgas. � carry-lookahead adder (cla): the carry-lookahead adder uses two special signals g(enerate) and p(ropagate) to predict the propagation of a carry signal without using a carry chain. several clas were implemented with different building-block types and sizes. � carry-skip adder: based on cla carry-skip, this adder uses only the p(ropagate) signal, and for this reason, it is much easier to implement. it can be seen as a compromise between the complexity of clas and the long delay of cras. � carry-select adder: two sets of adders are used, one with the carry-in signal driven to logic 0 and the second with the carry-in signal driven to 1. the result is selected by multiplexers controlled by the actual carry-in signal. this structure is suspected to have the largest consumption, which should however compensate its performance. fixed-point multipliers can be divided into two groups by the type by their operation: serial mode multipliers and parallel mode multipliers. a serial mode multiplier computes bits of the result one by one in each computation cycle. on the other hand, a parallel mode multiplier computes all bits of the result at once. several multiplier structures from the two groups were selected for implementation. in the case of the parallel mode, not only conventional versions but also pipelined versions of multipliers were implemented. as they were implemented, the multipliers take n-bit operands and produce a 2n-bit result. no adjustments of the result, such as radix-point position or length truncation, were considered in the implementation. classes of implemented fixed-point multipliers: serial mode multipliers: � classical multiplier: the most basic serial mode multiplier structure that computes multiplication in a human-like way, i.e., going through the bits of one operand and simultaneously accumulating the second operand and shifting the sum. � booth multiplier: a serial mode multiplier that uses booth recoding. this enables it to compute signed number multiplication and, for radices greater than two, also increases its speed by computing several bits of the result in one computational step. � csa multiplier: a serial mode multiplier that uses carry-save adders, adders that have 3n inputs and 2n outputs. carry-save adders have a shorter delay than a normal full adder, which shortens the length of a single step of the multiplier. � booth + csa multiplier: combining the concepts of booth and csa multipliers, this multiplier uses radix-4 booth recoding to reduce the number of computational steps and csa adders to shorten the duration of a single computational step. parallel mode multipliers: � array multiplier: a parallel mode multipliers with a structure based on the basic formula for computation of a multiplication. it consists of n2 cells connected into an array-like structure. � wallace multiplier: a parallel mode multiplier that uses a tree of csa adders. thanks to the csas at each level of the tree, the number of operands is reduced by a factor of 3:2. 
� generated multiplier (generated hw): a parallel mode multiplier generated by a synthesis tool. it utilises 18-bit hardware multipliers embedded in xilinx virtex ii fpgas. � combinative generated multiplier (generated c): a parallel mode multiplier also generated by a synthesis tool, but with disabled usage of the embedded multipliers. this multiplier was implemented for the purposes of comparison with other structures and also for investigating the pipelining abilities of the synthesis tool. vhdl hardware description language was used for implementation; however there is no reason it should have any impact on the results of the experiments. different arithmetic unit structures were implemented with a common interface, so that they can be used interchangeably (i.e. an adder can be replaced by another adder, and a multiplier by another multiplier). the units are parametrized, which allows the user to choose the width of the operands (and consequently of the result). however, this it can be done only statically before the start of the implementation process. for more details on implementation, see [7]. 4 results the results of the experiments with implemented arithmetic structures are shown below. the target fpga chip was xilinx virtex-ii, specifically xc2v500 in the fg456-6 package. there was no particular reason for choosing exactly this part rather than some other member of the family. it was discovered later that some of the 64-bit multiplier structures did not fit in this chip, so a larger (xc2v1500) chip had to be used instead. place and route reports were used as a source for the evaluation. 4.1 fixed-point addition tab. 2 shows the results of measurements carried out on the implemented adder structures of 16, 32 and 64 bit-width. the results show that the best adder structure in terms of both time and area is the one that uses dedicated carry chains as depicted in figure fig. 1. none of the techniques that try to speed up the addition by cutting the carry chain, applied to this structure, brought any improvement in speed, but led to an increase in the occupied area. therefore, the results of such structures are not shown. as mentioned above, the speed-up techniques cripple the performance of the generated adder, and the following observations are only valid for adders based on the cra adder. 68 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 45 no. 2/2005 czech technical university in prague cla: cla occupies the largest area among adders with a speed-up technique applied, without bringing better speed improvement than the other techniques. the regularity of cla with blocks of size 2 proved to have an advantage over other clas in smaller area consumption. carry-skip adders: no extra area was occupied by the additional carry propagation logic – a very favourable result. with no need for extra area, the carry-skip adder accelerates cra to less than half delay at 64 bits. carry-select adders: the carry-select adder requires extra area. in addition, it does not shorten the delay of computation as much as the carry-skip adder. 4.2 fixed-point multiplication table 3 shows the results of measurements carried out on different multiplier structures. 16, 32 and 64-bit multipliers were implemented. as mentioned above, the implemented multipliers can be divided according to their operation mode into two groups: serial and parallel mode multipliers. 
the multipliers within each group have similar characteristics, and the results can often be generalized to the whole group. as in the case of adders, two parameters were examined: area and delay. before dealing with the delay of the multiplier, it is necessary to state clearly what it means. two measures have to be distinguished: delay and response time. response time refers to the total time required for computation of the result. on the other hand, the delay of a sequential circuit is defined as the reverse value of its clock frequency. note that the delay of the parallel mode multipliers is equal to their response time, but they response times of serial mode multipliers and pipelined multipliers are multiples of their delays. these time measures are both important, and were therefore the object of our examination. 4.2.1 area fig. 2 depicts the area consumption of selected multiplier structures with 32-bit operand width. this figure shows, and it is also valid in general, that serial mode multipliers consume considerably a smaller area than parallel mode multipliers. the implementation of wide (larger than 64 bits) parallel mode multipliers approaches the limits of technology – not only the limits of fpga, but also the capacity of the synthesis tool. the results are much better when embedded hardware multipliers are used, but the number on one chip is limited [8] and consumption grows exponentionally with operand width. note that the area consumption depicted by the figure does not include the area of the embedded multipliers. 4.2.2 delay two time measures were observed on the multiplier structures: delay and response time. the response time gives an overview of the overall performance of the unit. the delay limits the maximum clock frequency of the system on which the given multiplier structure will be used. parallel mode structures have a shorter response time, while serial mode © czech technical university publishing house http://ctn.cvut.cz/ap/ 69 czech technical university in prague acta polytechnica vol. 45 no. 2/2005 n � 16 n � 32 n � 64 area (slices) delay (ns) area (slices) delay (ns) area (slices) delay (ns) cra 23 13.52 47 23.99 95 40.27 generated 8 4.05 16 5.09 32 8.38 cla 29 7.97 61 19.06 127 24.84 carry-skip 23 11.54 47 15.67 95 18.02 carry-select 27 13.27 67 16.47 135 23.88 table 2: results of fixed-point adders fig. 1: fixed-point adders: comparison of the delay and area consumption of an adder with dedicated carry chains vs. an adder utilising classical routing paths (an example for 32-bit data width) multipliers have a better delay criterion. this is an expected result, thanks to the philosophy of the operation mode. in parallel mode the computation is done in a single step. the step is long, longer than one step of the serial mode multiplier, but shorter than the sequence of steps that has to be carried out by a serial mode multiplier to compute the result. the pipelining technique was applied on parallel mode multipliers to see if it brings any improvements. this technique could not be applied when embedded multipliers were used, since the multipliers cannot be divided to allow the insertion of registers for the pipeline. the advantages and disadvantages of pipelining were observed on the implemented 70 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 45 no. 
2/2005 czech technical university in prague n � 16 n � 32 n � 64 area (slices) delay (ns) area (slices) delay (ns) area (slices) delay (ns) serial mode multipliers: classical 40 4.60 71 4.94 136 8.75 booth-2 (radix 2) 41 4.63 73 4.89 138 7.99 booth-4 (radix 4) 55 4.93 104 5.37 202 9.09 csa 62 3.46 119 3.67 234 4.54 booth-4 + csa 101 4.62 191 4.86 370 5.45 parallel mode multipliers: array * 364 28.96 1536 53.45 4187 99.97 wallace * 390 19.81 1542 27.18 6164 37.00 generated 0 14.87 49 18.54 295 28.94 generated c 134 20.32 533 26.27 2109 29.97 pipelined multipliers: array [p2] * 274 14.95 1059 29.98 4193 59.92 wallace [p] 446 4.64 1654 4.95 6367 9.9 generated c [p4] * 188 4.95 635 9.94 3317 11.61 note1: [pn] denotes a pipelined multiplier with a given number of stages, where appropriate. note2: 64-bit versions of multipliers marked * did not fit in an xc2v500 chip. an xc2v1500 was used instead. note3: area consumption does not include the area of embedded multipliers. generated multipliers of 16, 32 and 64 bit-width utilize 1, 4, and 16 embedded multipliers, respectively. table 3: results of fixed-point multipliers fig. 2: area of fixed-point multipliers: selected types of fixed-point multipliers are shown, two parallel mode multipliers on the left and two serial mode multipliers on the right (all with 32-bit data width) structures. however, pipelining did not show any considerable improvements. when a parallel mode multiplier is to be implemented, embedded multipliers are the best option. 5 conclusion arithmetic operations in contemporary digital integrated circuits have an important role. our study [7] provides a comparison of the structures of fixed-point adders and multipliers implemented in fpga. from a number of well-known [1–3], [5–6] implementation architectures, several structures were chosen to observe their characteristics when they are implemented in fpga and, at the same time, to investigate the possibilities of the fpga architecture itself. the xilinx virtex-ii fpga family was chosen as the implementation platform, because it provides special features to support arithmetic operations, a trend that seems to be common at the present time for fpga chips. nine fixed-point adder structures were implemented in the study. the results show that a structure using virtex’s dedicated fast carry chains has the best results in terms of both area and delay criteria. none of structures utilizing a speed-up technique showed better results. if no dedicated carry chain is available, the carry-skip adder is the best option, because it doubles the speed of an adder with a normal carry chain without any extra area consumption. ten fixed-point multipliers were implemented. they can be divided according to their operation mode into two groups: serial and parallel mode multipliers, respectively. the parallel mode structures provide a shorter response time, © czech technical university publishing house http://ctn.cvut.cz/ap/ 71 czech technical university in prague acta polytechnica vol. 45 no. 2/2005 fig. 3: response time of fixed-point multipliers: selected types of fixed-point multipliers are shown, two parallel mode multipliers on the left and two serial mode multipliers on the right (all with 32-bit data width) fig. 4: delay of fixed-point multipliers: selected types of fixed-point multipliers are shown, two parallel mode multipliers on the left and two serial mode multipliers on the right (all with 32-bit data width) but they also require a significantly larger area in terms of clb slices. 
a parallel multiplier structure that utilizes embedded hardware multipliers proved to have the shortest response time. however, the embedded hardware multipliers are consumed very rapidly for bit-widths larger than 18 and, furthermore, the number on a chip is limited [8]. serial mode structures proved to have advantages for applications that require small area usage with a short clock cycle. they provide a reasonable response time even for relatively large bit-widths (32 bits and more), on a very small area. this gives the opportunity to exploit the parallelism between a large number of serial-mode multipliers in a single fpga. references [1] ercegovac, m. d., lang, t.: digital arithmetic, morgan kaufmann publishers, san francisco, 2003. [2] hennessy, j. l., patterson, d. a., goldberg, d.: computer architecture: a quantitative approach: appendix h computer arithmetic, elsevier science, 1995. [3] koren, i.: computer arithmetic algorithms, prentice-hall, englewood cliffs, new jersey, 1993. [4] mentor graphics, web page, http://www.mentor.com/. [5] omondi, a. r.: computer arithmetic systems (algorithms, architectures and implementations), prentice-hall international (uk) limited, 1994. [6] pluháček, a.: projektování logiky počítačů (designing computer logic), ctu publishing house, prague, 1992. [7] štukjunger, p.: arithmetic in fpga, diploma thesis, ctu in prague, 2004. [8] xilinx, web page, http://www.xilinx.com. ing. miloš bečvář e-mail: becvarm@fel.cvut.cz ing. petr štukjunger e-mail: stukjup@fel.cvut.cz department of computer science & engineering, czech technical university in prague faculty of electrical engineering karlovo nám. 13 121 35 prague, czech republic 72 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 45 no. 2/2005 czech technical university in prague acta polytechnica https://doi.org/10.14311/ap.2022.62.0445 acta polytechnica 62(4):445–450, 2022 © 2022 the author(s). licensed under a cc-by 4.0 licence published by the czech technical university in prague measurement of a quantum particle position at two distant locations: a model jaroslav dittrich czech academy of sciences, nuclear physics institute, 250 68 řež, czech republic correspondence: dittrich@ujf.cas.cz abstract. a simplified one-dimensional model of measurement of the position of a quantum particle by two distant detectors is considered. detectors are modelled by quantum particles bounded in potential wells with just two bound states, prepared in the excited states. their de-excitation due to the short range interaction with the measured particle is the signal for the detection. in the approximations of short time or small coupling between the particle and the measuring apparatuses, the simultaneous detection of the particle by both detectors is suppressed. the results extend to other models with two-level detectors. keywords: position measurement, epr paradox, quantum measurement. 1. introduction the theory of measurement is one of the topics in quantum mechanics from its early days [1]. the development can be found, for example, in reviews [2], [3], [4], [5], [6]. recently, a comparison of an experimental measurement on a trapped 3-level ion by an interaction with a photon environment with the ideal quantum theoretical measurement was done in [7]. measurements in quantum field theory with non-relativistic quantum-mechanical detectors coupled to the quantum field were studied in [8]. 
the famous einstein-podolsky-rosen (epr) [9] paradox concerns the relation of the measurement results performed on distant parts of a quantum system. later, the subject evolved to a whole branch of the quantum physics, studies of the quantum entanglements and their applications in quantum informatics (see, e.g., chapter 16 in [10] for the overview, also extracted in [11], review [12] with an extensive list of references, chapters 4 and 6 in [13], chapter 9 in [14], or [15]). for a brief review of the physical background up to 2005 see [16]. consequences for the black-hole physics are discussed in [17]. effects in several microscopic and statistical systems were recently studied in [18], [19], [20], [21], [22]. further possible applications in quantum information are proposed in [23], [24]. the collapse of quantum state due to the measurement was, again, considered in [25], the entanglement in a system chaotic in the classical limit was studied in [26]. although mostly entanglement of spin states of the two particles is considered, the general idea concerns any simultaneous measurements at distant locations with correlated results. the epr paradox in the case of a particle position measurement means a simultaneous measurement of the particle presence at two distant places. as the particle can be detected at one place only, a sort of superluminal interaction between the two places seems to be necessary. another possible interpretation could be that the simultaneous detection at the two places is excluded (or perhaps only strongly suppressed) by the quantum evolution of the complete system, consisting of the measured particle and both detectors (and perhaps even their environment), starting from a given initial state. we try to support such an idea by a simple, perhaps not very realistic, model. we obtain suppression of the simultaneous detection at the two places in an approximation but not exactly. the description of the quantum measurement as the unitary evolution of the complete system was considered already by von neumann in his classical book [1] (called as process 2 in section v.1). there exist a number of attempts to model macroscopic measuring apparatuses as quantum objects, let us mention only a sample here. in [1] (sect. vi.3), [27] (sect. 139 and appendix xiv), [28, 29] and essentially also in [30] (chapter 22), the apparatus is modelled by one very massive body. more close to the quantum description of a macroscopic system are the models of the measuring apparatus as a many-body system, composed of infinite or very large finite number of particles, typically a spin chain see [31], [32], [33] (chapter 7). we consider a rather simple model described in the next section. it consists of a particle interacting with two mutually distant detectors. however, we do not attempt to describe the detector as a macroscopic apparatus. we model it by a single quantum particle bounded in a potential well only. 445 https://doi.org/10.14311/ap.2022.62.0445 https://creativecommons.org/licenses/by/4.0/ https://www.cvut.cz/en jaroslav dittrich acta polytechnica 2. the model we consider one light particle and two detectors on a line. each detector consists of a very massive particle bounded in the square potential well of a width 2a and depth v0 tuned to the existence of just two bound states of energies e1 > e0. the detector is prepared for the measurement in the excited state e1. 
the detection of the light particle at the detector is modelled by the passing of the detector to the ground state e0 due to the interaction with the light particle. this is the determination of the particle position before the measurement as it gains energy e1 − e0 and is kicked away. the two detectors are located with centres of their potential wells at points −r, r with r ≫ a (see figure 1). e0 e1 e0 e1 −v0 −v0 −r − a −r + a r − a r + a figure 1. potential wells of the two detectors. remember that they bind two different massive particles (it is not a double-well potential but two different one-well potentials). let us write the hamiltonian of the whole three-particle system. let us denote the potential well v (x) = −v0χ(−a,a)(x) = { −v0 for − a < x < a 0 for x ≤ −a or x ≥ a , (1) where χ(−a,a) is the characteristic function of the interval (−a,a) for definiteness although the detailed shape of the potential v is not really used. denoting r ∈ r the coordinate of the measured light particle (mass m), x ∈ r the coordinate of the heavy particle in the left detector (mass m) and y ∈ r that in the right detector (mass m), the free hamiltonian (without an interaction of the measured particle and detectors) reads h0 = − ℏ2 2m ∂2r − ℏ2 2m ∂2x + v (x + r) − ℏ2 2m ∂2y + v (y − r) , (2) understood as a self-adjoint operator in the state space l2(r3) with the domain h2(r3). let us assume that the measured particle and the particle of the detector interact when their distance is shorter than a with the model interaction hamiltonian (g is an interaction constant) hi = gχ(−a,a)(r − x) + gχ(−a,a)(r − y) . (3) for simplicity, the interaction potential between the measured particle and the heavy particle in each detector is chosen, again, in the form of a square well (or barrier). its width 2a is chosen the same as for the potentials binding heavy particles in the detectors, which can be interpreted as the detector size (in the order of magnitude, at the least). the complete hamiltonian h = h0 + hi . (4) we assume that the detector with hamiltonian − ℏ2 2m ∂2x + v (x) (5) has just two bound states ψ0, ψ1 with energies e0 < e1. the detectors are prepared for measurement in the excited states ψ1, and the measured particle in a state φ0 so the initial wave function of the complete system is ψ(0,r,x,y) = φ0(r)ψ1(x + r)ψ1(y − r) . (6) let us assume that the energy of the light particle is insufficient to release the detector particles from their potential wells and that their states from the continuous spectrum can be neglected in the time evolution. in other words, we assume the wave function of the complete system in the form ψ(t,r,x,y) = c00(t,r)ψ0(x + r)ψ0(y − r) + c01(t,r)ψ0(x + r)ψ1(y − r) +c10(t,r)ψ1(x + r)ψ0(y − r) + c11(t,r)ψ1(x + r)ψ1(y − r) . (7) 446 vol. 62 no. 4/2022 measurement of a quantum particle position at two distant locations. . . inserting into the schrödinger equation, iℏ∂tψ = hψ , (8) and projecting onto ψj (x + r)ψk(y − r) (j,k = 0, 1), we obtain iℏ∂tcjk(t,r) = − ℏ2 2m ∂2r cjk(t,r) + (ej + ek)cjk(t,r) +g 1∑ l=0 (fjl(r + r)clk(t,r) + fkl(r − r)cjl(t,r)) , (9) where fjk(z) = ∫ z+a z−a ψj (x)ψk(x) dx (10) come from the matrix elements of interaction term hi (3) and the finite integration range here is a consequence of the finite range of interaction. as eigenfunctions ψj are exponentially decaying for |x| > a, the same is true for fjk at |z| > 2a. 
in the matrix notation c(t,r) = ( c00(t,r) c01(t,r) c10(t,r) c11(t,r) ) , e = ( e0 0 0 e1 ) , (11) f(r) = ( f00(r + r) f01(r + r) f10(r + r) f11(r + r) ) , (12) g(r) = ( f00(r − r) f10(r − r) f01(r − r) f11(r − r) ) , (13) the equations (9) read iℏ∂tc(t,r) = − ℏ2 2m ∂2r c(t,r) + (e + gf(r))c(t,r) + c(t,r)(e + gg(r)) . (14) the initial condition (6) gives c(0,r) = ( 0 0 0 φ0(r) ) . (15) matrices f(r) and g(r) are hermitian and bounded in r ∈ r according to their construction. so it is easily seen that the operator on the right-hand side of (14) is self-adjoint in l2(r,c4) with the domain h2(r,c4) and the solution c(t,r) exists there for every c(0, ·) ∈ h2(r,c4). the probability of finding the left detector in the state ψj and the right detector in the state ψk at the time t is calculated by (7) pjk(t) = ∫ r |cjk(t,r)|2 dr , (16) i.e., p11 is the probability that no detector reacts (p11(0) = 1 as detectors are initially prepared in the excited state ψ1), p01 the probability of detection by the left detector only, p10 the probability of detection be the right detector only, and p00 the probability of simultaneous detection by both detectors. 3. short-time evolution in the approximation c(t,r) = c(0,r) + t∂tc(0,r) + o(t2) , (17) c(t,r) =( 0 −igtℏ f01(r)φ0(r) −igtℏ g10(r)φ0(r) i ℏt 2m∂ 2 r φ0(r) + (1 − i t ℏ (2e1 + gf11(r) + gg11(r)))φ0(r) ) , (18) 447 jaroslav dittrich acta polytechnica assuming at the least that φ0 belongs to the domain of ∂2r . therefore, the probability p00(t) of the simultaneous detection by the both detectors, indicated by their de-excitations to the ground states, remains zero in the first nontrivial approximation as t → 0, p00(t) = ∫ r |c00(t,r)|2 dr = o(t4) , (19) in comparison with the probabilities of detection by just one detector p01(t), p10(t), which are of the order of o(t2). this is seen from (18), looking by which powers of t the expansions of cjk(t,r) start. this result is an indication that quantum mechanics inherently prefers the detection of a particle at one place only. however, one should not take it more seriously than the model can approximate macroscopic measuring apparatuses. for the larger times, p00(t) surely is not zero, which also corresponds to the physical picture that the particle interacts with one detector and is bounced to the other. 4. weak coupling approximation in this section, we derive a result similar to the one of the previous section but in the first perturbation approximation according to the coupling constant g between the light particle and the measuring instruments. the procedure follows the standard approach of passing to the interaction picture and then iteration of the integral equivalent of the schrödinger equation. let us start from equation (14) and denote as hm the free hamiltonian of the light particle, i.e., hm = − ℏ2 2m ∂2r , (20) where the right hand side is understood as the corresponding self-adjoint operator in l2(r). we transform c(t,r) = ( e− i ℏ ete− i ℏ hmtc1(t, ·)e− i ℏ et ) (r) . (21) here c1(t, ·) denotes the matrix function of r with the value c1(t,r) at the point r (t is just a parameter). the operator e− i ℏ hmt transforms it to another function c2(t, ·), which is then multiplied by two exponentials depending on e and t only. finally, the value of the resulting function at the point r, i.e., e− i ℏ etc2(t,r)e− i ℏ et, is taken. a similar notation is used in a few following formulas. 
inserting (21) into (14), we obtain ∂tc1(t,r) = − ig ℏ (a(t)c1(t, ·))(r) , (22) where the operator a(t) acts as a(t)t = e i ℏ hmt ( f̃(t)(e− i ℏ hmtt) + (e− i ℏ hmtt)g̃(t) ) (23) on a matrix function t ∈ l2(r,c4) and f̃(t), g̃(t) are multiplications by the matrix functions f̃(t,r) = e i ℏ etf(r)e− i ℏ et , g̃(t,r) = e− i ℏ etg(r)e i ℏ et . (24) the integral form of (22) reads c1(t,r) = c1(0,r) − ig ℏ ∫ t 0 (a(τ)c1(τ, ·)) (r) dτ . (25) let us iterate this equation starting with c1(0,r) as the first approximation. since the operator a(τ) is uniformly bounded in τ, the resulting perturbation series is convergent to the unique solution. keeping only terms linear in g, c1(t,r) = c1(0,r) − ig ℏ ∫ t 0 (a(τ)c1(0, ·)) (r) dτ + o(g2) . (26) the initial value c1(0,r) = c(0,r) is given in (15) and (26) reads as c1(t,r) = ( 0 gu1(t,r) gu2(t,r) φ0(r) + gu3(t,r) ) + o(g2) , (27) 448 vol. 62 no. 4/2022 measurement of a quantum particle position at two distant locations. . . where u1(t,r) = − i ℏ ∫ t 0 ( e i ℏ (e0−e1)τ e i ℏ hmτ f (+) 01 e − iℏ hmτ φ0 ) (r) dτ , (28) u2(t,r) = − i ℏ ∫ t 0 ( e i ℏ (e0−e1)τ e i ℏ hmτ f (−) 01 e − iℏ hmτ φ0 ) (r) dτ , (29) u3(t,r) = − i ℏ ∫ t 0 ( e i ℏ hmτ (f(+)11 + f (−) 11 )e − iℏ hmτ φ0 ) (r) dτ , (30) denoting f(±)jk (r) = fjk(r ± r) for j,k = 0, 1. the free propagator e − iℏ hmτ can be expressed by the wellknown integral formula here (e.g., eq. (ix.31) in [34]). passing back to c(t,r), c(t,r) = ( 0 ge− i ℏ (e0+e1)te− i ℏ hmtu1 ge− i ℏ (e0+e1)te− i ℏ hmtu2 e −2 iℏ e1te− i ℏ hmt(φ0 + gu3) ) + o(g2), (31) and the probability of the simultaneous detection by both detectors remains zero in the first nontrivial approximation as g → 0, p00(t) = o(g4) , (32) in comparison with p01(t) and p10(t) which are of the order of o(g2). 5. concluding remarks the obtained results may be interpreted as a support of the idea that the detection of a particle at one place only is a consequence of the initial conditions for the complete system (particle plus detectors) without any superluminal interaction between two distant places. however, as they are obtained in the approximation of a short time or weak coupling only, it may be questioned whether they do not represent just a kind of initial state stability. a possible objection against this explanation is that c00 is more stable than c01 and c10. another discussible point is the modelling of a macroscopic detector by a single quantum particle. we tacitly assume that the particles in detectors are heavy and thus have a large action. it was used only in heuristic justification of the assumption that their states remains in the two-dimensional subspace spanned by the vectors ψ0 and ψ1, in other words, the detectors are essentially two-level systems. the calculation for a more realistic models of the detectors and without approximations of small times or weak coupling would be very desirable but seems to be extremely difficult. a very specific model was considered for the definiteness but the main results (19) and (32) hold for any detectors modelled as two-level systems and any interaction separated to the sum of two parts corresponding to the particle interaction with each detector instead of (3). the form of the initial particle state φ0 is also not important here. the specific form of the model is not essentially used in the above calculations. however, the probability p00 of a simultaneous de-excitation of both detectors is already nonzero in the next iterations to (19) and (32). 
so the detection by both detectors is not excluded for large values of t or g in our model. acknowledgements the author is indebted to p. exner for the comments on the manuscript. the work is supported by the czech science foundation project 21-07129s and npi cas institutional support rvo 61389005. references [1] j. von neumann. mathematical foundations of quantum mechanics. princeton university press, 2018. [2] a. a. clerk, m. h. devoret, s. m. girvin, et al. introduction to quantum noise, measurement, and amplification. reviews of modern physics 82(2):1155–1208, 2010. https://doi.org/10.1103/revmodphys.82.1155. [3] a. e. allahverdyan, r. balian, t. m. nieuwenhuizen. understanding quantum measurement from the solution of dynamical models. physics reports 525(1):1–166, 2013. https://doi.org/10.1016/j.physrep.2012.11.001. [4] j. zhang, y. xi liu, r.-b. wu, et al. quantum feedback: theory, experiments, and applications. physics reports 679:1–60, 2017. https://doi.org/10.1016/j.physrep.2017.02.003. [5] b. d’espagnat. conceptual foundations of quantum mechanics. crc press, boca raton, 2nd edn., 2019. [6] c. beck. local quantum measurement and relativity. springer, cham, 2021. https://doi.org/10.1007/978-3-030-67533-2. [7] f. pokorny, c. zhang, g. higgins, et al. tracking the dynamics of an ideal quantum measurement. physical review letters 124(8-28):080401, 2020. https://doi.org/10.1103/physrevlett.124.080401. 449 https://doi.org/10.1103/revmodphys.82.1155 https://doi.org/10.1016/j.physrep.2012.11.001 https://doi.org/10.1016/j.physrep.2017.02.003 https://doi.org/10.1007/978-3-030-67533-2 https://doi.org/10.1103/physrevlett.124.080401 jaroslav dittrich acta polytechnica [8] j. polo-gómez, l. j. garay, e. martín-martínez. a detector-based measurement theory for quantum field theory. physical review d 105:065003, 2022. https://doi.org/10.1103/physrevd.105.065003. [9] a. einstein, b. podolsky, n. rosen. can quantum-mechanical description of physical reality be considered complete? physical review 47:777–780, 1935. https://doi.org/10.1103/physrev.47.777. [10] i. bengtsson, k. życzkowski. geometry of quantum states: an introduction to quantum entanglement. cambridge university press, 2017. https://doi.org/10.1017/9781139207010. [11] k. zyczkowski, i. bengtsson. an introduction to quantum entanglement: a geometric approach, 2006. https://doi.org/10.48550/arxiv.quant-ph/0606228. [12] n. friis, g. vitagliano, m. malik, m. huber. entanglement certification from theory to experiment. nature reviews physics 1:72–87, 2019. https://doi.org/10.1038/s42254-018-0003-5. [13] p. meystre. quantum optics: taming the quantum. springer, cham, 2021. https://doi.org/10.1007/978-3-030-76183-7. [14] d. d’alessandro. introduction to quantum control and dynamics. crc press, boca raton, 2021. https://doi.org/10.1201/9781003051268. [15] m. fadel. many-particle entanglement, einstein-podolsky-rosen steering and bell correlations in bose-einstein condensates. springer, cham, 2021. https://doi.org/10.1007/978-3-030-85472-0. [16] m. kupczynski. seventy years of the epr paradox. aip conference proceedings 861:516–523, 2006. https://doi.org/10.1063/1.2399618. [17] l. susskind. er=epr, ghz, and the consistency of quantum measurements. fortschritte der physik 64:72–83, 2016. https://doi.org/10.1002/prop.201500094. [18] w. zhang, m.-x. dong, d.-s. ding, et al. einstein-podolsky-rosen entanglement between separated atomic ensembles. physical review a 100(1):012347, 2019. https://doi.org/10.1103/physreva.100.012347. [19] m. fadel, a. 
usui, m. huber, et al. entanglement quantification in atomic ensembles. physical review letters 127(1):010401, 2021. https://doi.org/10.1103/physrevlett.127.010401. [20] p. marian, t. a. marian. einstein-podolsky-rosen uncertainty limits for bipartite multimode states. physical review a 103(6):062224, 2021. https://doi.org/10.1103/physreva.103.062224. [21] w. zhong, d. zhao, g. cheng, a. chen. one-way einstein–podolsky–rosen steering of macroscopic magnons with squeezed light. optics communications 497:127138, 2021. https://doi.org/10.1016/j.optcom.2021.127138. [22] k. berrada, h. eleuch. einstein-podolsky-rosen steering and nonlocality in quantum dot systems. physica e: lowdimensional systems and nanostructures 126:114412, 2021. https://doi.org/10.1016/j.physe.2020.114412. [23] y. xiang, x. su, l. mišta, et al. multipartite einstein-podolsky-rosen steering sharing with separable states. physical review a 99(1):010104, 2019. https://doi.org/10.1103/physreva.99.010104. [24] n. b. an. quantum dialogue mediated by epr-type entangled coherent states. quantum information processing 20:100, 2021. https://doi.org/10.1007/s11128-021-03007-1. [25] o. v. gritsenko. quantum collapse as reduction from a continuum of conditional amplitudes in an entangled state to a single actualized amplitude in the collapsed state. physical review a 101(1):012106, 2020. https://doi.org/10.1103/physreva.101.012106. [26] k. m. frahm, d. l. shepelyansky. chaotic einstein-podolsky-rosen pairs, measurements and time reversal. the european physical journal d 75:277, 2021. https://doi.org/10.1140/epjd/s10053-021-00274-6. [27] d. i. blokhintsev. fundamentals of quantum mechanics. nauka, moscow, 1976. in russian. [28] s. machida, m. namiki. theory of measurement in quantum mechanics i. progress of theoretical physics 63(5):1457–1473, 1980. https://doi.org/10.1143/ptp.63.1457. [29] s. machida, m. namiki. theory of measurement in quantum mechanics ii. progress of theoretical physics 63(6):1833–1847, 1980. https://doi.org/10.1143/ptp.63.1833. [30] d. bohm. quantum theory. prentice-hall, new york, 1952. [31] k. hepp. quantum theory of mesurement and macroscopic observables. helvetica physica acta 45:237–248, 1972. https://doi.org/10.5169/seals-114381. [32] m. cini, m. de maria, g. mattioli, f. nicolò. wave packet reduction in quantum mechanics: a model of a measuring apparatus. foundations of physics 9:479–500, 1979. https://doi.org/10.1007/bf00708363. [33] p. bóna. classical systems in quantum mechanics. springer, cham, 2020. https://doi.org/10.1007/978-3-030-45070-0. [34] m. reed, b. simon. methods of modern mathematical physics. ii. fourier analysis. self-adjointness. academic press, san diego, 1975. 
450 https://doi.org/10.1103/physrevd.105.065003 https://doi.org/10.1103/physrev.47.777 https://doi.org/10.1017/9781139207010 https://doi.org/10.48550/arxiv.quant-ph/0606228 https://doi.org/10.1038/s42254-018-0003-5 https://doi.org/10.1007/978-3-030-76183-7 https://doi.org/10.1201/9781003051268 https://doi.org/10.1007/978-3-030-85472-0 https://doi.org/10.1063/1.2399618 https://doi.org/10.1002/prop.201500094 https://doi.org/10.1103/physreva.100.012347 https://doi.org/10.1103/physrevlett.127.010401 https://doi.org/10.1103/physreva.103.062224 https://doi.org/10.1016/j.optcom.2021.127138 https://doi.org/10.1016/j.physe.2020.114412 https://doi.org/10.1103/physreva.99.010104 https://doi.org/10.1007/s11128-021-03007-1 https://doi.org/10.1103/physreva.101.012106 https://doi.org/10.1140/epjd/s10053-021-00274-6 https://doi.org/10.1143/ptp.63.1457 https://doi.org/10.1143/ptp.63.1833 https://doi.org/10.5169/seals-114381 https://doi.org/10.1007/bf00708363 https://doi.org/10.1007/978-3-030-45070-0 acta polytechnica 62(4):445–450, 2022 1 introduction 2 the model 3 short-time evolution 4 weak coupling approximation 5 concluding remarks acknowledgements references ap06_3.vp 1 introduction a major problem confronting the designer is to choose the correct materials for the design of an individual component or structure; the material properties have to ensure that the component performs the work for which it was designed without malfunctioning throughout its guarranted life, and can be produced for a price acceptable to the customer. to enable this, the designer needs to know the loads that his component or structure will be subjected to the environment in which it will act, the service life expected, and the production cost. these factors limit the range of materials for application in the distinctive design. to finalize the selection of materials, he needs to be aware of their characteristics under various loadings and in various environments (i.e. the material properties). knowing these properties, he must be able to correlate them with the bearing capacity of his proposed component or structure. fracture mechanics deals with these questions. conventional design assumes that the material is a flawless continuum. however, we know that rational design and material evaluation require knowledge of flaws in materials. many materials, components and structures either have inherent defects as part of the production routine, or develop them at some phase of their life. comprehension of how a cracked body conducts itself under loading is fundamental to the explanation of any fatigue problem. since much of the information on fatigue from both laboratory experiments and service characteristics was obtained and expounded before the study of cracked-material behaviour was implemented, no established field of study correlated the particulars. isolated sets of data have practically become fatigue folklore, and their relation to other groups of data has not been clarified. 2 crack initiation significant publications on fatigue research, in the 20th century include forsyth [1], on extrusions and intrusions in slip bands, see fig. 1. three basic observations are: the importance of the free material surface, the irreversibility of cyclic slip, and environmental influences on microcrack initiation. for the most part, microcracks begin at the free material surface, and in unnotched specimens possessing a nominally homogeneous stress distribution loaded with cyclic tension. 
there is less prevention of cyclic slip than inside the material for the free surface at one side of the surface material. microcracks also start more easily in slip bands with slip displacements normal to the material surface [2]. there remain questions about why cyclic slip is not reversible. as far back as the 1950s, it was understood that there are two reasons for non-reversibility. one argument is that (cyclic) strain hardening occurs, which implies that not all dislocations return to their original position. another important factor is the interaction with the environment. a slip step at the free surface implies that fresh material is exposed to the environment. in a non-inert environment, most technical materials are rapidly covered with a thin oxide layer, or some chemisorption of foreign atoms of the environment occurs. exact reversibility of slip is then obviated. fatigue initiation is a surface effect. in the mid 20th century, microscopic investigations were still made with an optical microscope. this implies that crack nucleation was observed on the surface, where it indeed occurs. as soon as cracks grow into the material away from the free surface, only the ends of the crack front can be observed on that free surface. it is questionable whether this information is representative of the growth process inside the material, a problem that is sometimes overlooked. microscopic observations on crack growth inside the material require cross-sections of the specimen are made. several investiga34 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 46 no. 3/2006 czech technical university in prague metal fatigue features p. brož this paper presents a summary of fatigue, crack initiation and growth, and fractographic findings for metal materials. the purpose of this paper is to consolidate and summarize some aspects of the fatigue of materials and structures. keywords: fracture surface, fracture toughness, macro-and microscopic appearance, microcrack, slip band, small cracks, striation, subgrain, transcrystalline. fig. 1: slip geometry at the material surface according to forsyth [1] tions employing sectioning were made in the 1950s and earlier. these showed that in most materials fatigue cracks grow transcrystalline. although the fatigue fractures appeared rather flat to the unaided eye, it turned out that the crack growth path under the microscope could be rather irregular, depending on the type of material. in materials with low stacking fault energy (e.g. cuand ni-alloys), cross slip is difficult and as a result the cyclic slip bands are narrow and straight. crack growth on a microscale occurs in straight segments along these bands. in materials with high stacking fault energy (e.g. al-alloys) cross slip is easy. moreover, in the al crystal lattice there are many slip systems which can easily be activated. as a consequence, the slip lines are wider and can be rather wavy. crack growth on a micro scale does not suggest that it occurs along crystallographic planes. as a result, fatigue on a microscale can be significantly different for different materials. the behavior is structure-sensitive, depending on the crystal structure (fcc, bcc, or hexagonal), the elastic anisotropy of the crystalline structure, grain size, texture, and dislocation obstacles (e.g. pearlite bands in steel, precipitated zones in al-alloys, twins, and so on). the outset of the damage in a cyclically stressed metal is tied in with a free surface. 
there is the following evidence that damage in a polycrystalline ductile metal is connected with grains having a free surface rather than those within the body: (i) surface grains are in intimate touch with the atmosphere; thus if the environment is a factor in the damage process, they are apparently more receptive. a surface grain is the only part of a polycrystal not fully supported by adjoining grains. because the slip systems in neighbouring grains of a polycrystal are not related to each other, a grain having a free surface will be able to deform plastically more easily than a grain in the body of the metal that is surrounded by other grains. (ii) it has been shown that if a fatigue test is stopped after some fraction (perhaps 20 per cent) of the expected life of the specimen, a thin layer of metal is removed from the test section, and the test is continued at the same stress level, the total life of the specimen is longer than the expected life of the original specimen. if a surface layer is removed at frequent intervals throughout a test, the expected life may be exceeded many times; in fact, provided that the stress amplitude is maintained constant and the frequency of removal and the depth of the removed layer are sufficient, the life will be limited only by the initial cross-sectional area of the specimen. (iii) the fatigue strength of small specimens cut from the interior of the test-section of a larger specimen broken in reversed direct stress (that is, cut from material which has been subjected to a stress level greater than the plain fatigue limit) is not lower than that of the virgin material. (iv) if the surface of a specimen is hardened, either metallurgically or by surface working, the fatigue strength of the specimen en bloc may be increased. similarly, any procedure that softens the surface decreases the fatigue strength of the specimen. (v) metallurgical examination of broken fatigue specimens of nominally homogeneous metallic specimens which have been subjected to a uniform stress distribution over their cross-section does not reveal cracks in the body of the specimen. in certain circumstances, however, cracks may form in the interior of a specimen at inclusions or flaws or below hardened surface layers. the onset of damage and cracking is thus associated with the surface grains, only those grains in the body of a specimen through which a crack, formed in a surface grain, passes as it grows across the specimen being damaged. this means that it is relatively simple to make a direct observation of the progressive development of cracking during a fatigue test. in general, only one crack penetrates into the metal to any considerable depth, but many additional cracks may be visible to the naked eye on the surface of soft metals (for example, copper, mild steel), especially when tested at stress levels giving failure after relatively short endurance (say, less than 105 cycles). on the other hand, no cracks, with the exception of the crack leading to complete failure, may be visible to the naked eye on the surface of specimens of hard metallic alloys such as high strength steels and aluminium alloys, which are light and hardenable. 3 fractographic features a milestone in experimental research was the introduction of the electron microscope (em), originally the transmission electron microscope (tem) in the 1950s, and later the scanning electron microscope (sem) in the 1970s. 
microscopic investigations in the tem are more laborious than in the sem, because it is necessary to make either a replica of the fracture surface, or a thin foil of the material. the thin foil technique is destructive and does not show the fatigue fracture surface. however, information on the material structure can be obtained, such as the formation of subgrains under cyclic loading. investigations of fatigue fracture surfaces in the sem are now a rather well standardized experimental alternative which can indicate where the fatigue fracture started, and in which directions it was growing [2].

a fundamental observation was made with the electron microscope around 1960. fractographic pictures revealed striations which could be correlated with individual load cycles. by mixing small and large load cycles in a fatigue test, the occurrence of one striation per load cycle was proven by ryder [5]. the striations are assumed to be remainders of microplastic deformations at the crack tip, but the mechanism can be different for different materials. because of microplasticity at the crack tip and the crack extension mechanism in a cycle, it should be expected that the profile of the striations depends on the type of material. terms such as ductile and brittle striations were adopted. striations could not be observed in all materials, at least not equally clearly. moreover, the visibility of striations also depends on the severity of the load cycles. at very low stress amplitudes it may be difficult to see striations, although fractographic indications were obtained which showed that crack growth still occurred in a kind of cycle-by-cycle sequence. striations have also shown that the crack front is not simply a single straight line, as usually assumed in fracture mechanics analysis. noteworthy observations on this problem were made by bowles in the late 1970s [6]. bowles developed a vacuum infiltration method to obtain a plastic casting of the entire crack. the casting could then be studied in the electron microscope.

macroscopic shear lips, see fig. 2, were well known for aluminium alloys from the early 1960s, but they were also observed on fatigue cracks in other materials. the width of the shear lips increased for faster fatigue crack growth, and finally a full transition from a tensile mode fatigue crack to a shear mode fatigue crack can occur. the shear lips are a surface phenomenon because crack growth in the shear mode is not so constrained in the thickness direction. shear lips are macroscopic deviations from the mode i crack assumed in a fracture mechanics analysis. fatigue cracks in thick sections can be found largely in the tensile mode (mode i) because shear lips are then relatively small. however, the topography of the tensile mode area observed in the electron microscope indicates a more or less uneven surface, although it appears rather flat if viewed with the unaided eye. large magnifications clearly show that the fracture surface on a microlevel is not at all a nicely flat area. it is a rather irregular surface going up and down in some random way depending on the microstructure of the material. an inert environment increases the surface roughness, whereas an aggressive environment (salt water) promotes a smoother fracture surface. similarly, the shear lips are narrower in an aggressive environment and wider in an inert environment.
these trends are associated with the idea that an aggressive environment stimulates tensile decohesion at the crack tip, whereas an inert environment promotes shear decohesion. it should be understood that the crack extension in a cycle (i.e. the crack growth rate) depends on the crack growth resistance of the material, and also on the crack driving force, which is different if deviations from the pure mode i crack geometry are present, e.g. shear lips and a curved crack front.

since the mid 20th century, much work has been done on investigating the microscopic appearance of fatigue fracture surfaces, using the electron microscope. various collections of electron micrographs demonstrating the particulars of the fracture faces created by a growing fatigue crack have been published. the most dominant features of fatigue fracture surfaces (especially those produced by cracks growing on 90° planes) are distinct line markings, parallel to each other and normal to the crack growth direction. these are known as striations; each striation corresponds to one load cycle. as a rule, striations are more clearly defined in ductile than in brittle materials; e.g., in higher-strength steels, the striations are short and discontinuous and their successive positions are not explicitly defined. the presence of striations on a fracture surface is proof that the failure was caused by fatigue [7], but they cannot always be found on all fatigue fracture surfaces, often because the microscope used has insufficient resolution. striations varying in spacing from about 2.5 mm [8] to less than 2.5 × 10⁻⁵ mm [9] have been observed on various materials. at high crack growth rates they tend to give way to ductile dimples [10].

on a microscopic scale, fatigue crack growth is often an irregular process. a study [11] of the shape of the front of a crack growing in a 3.2 mm thick mild steel sheet was made by examining sections on planes normal to the direction of crack growth so as to eventually intersect the crack front. this showed that the crack front bows forward slightly, so that a section can be made with the leading part of the crack in the middle of the sheet thickness. in the region of the crack front, numerous apparently independent cracks were found (see fig. 3).

fig. 2: fatigue crack growth with a transition from tensile mode to shear mode (taken from [2])

the one-to-one correspondence between striations and applied loads was first proved by forsyth and ryder [7]. microhardness tests taken on the mild steel specimens indicated that a small region at the apex of the rounded-off fatigue crack was as hard as the cup part of a static tensile fracture, just as in the case of the tips of cracks that had been sharpened by compression. the deformation produced at the crack tip during each cycle resulted in an extension of the crack, crack propagation being a continuous process resulting from a miniature double-cup plastic separation repeated each cycle. fig. 4 (a) shows an example of a blunted tip having two ears, found in a mild steel specimen in which the crack was filled with plastic while the maximum load was held constant. however, other sections revealed that the crack tip could be either blunted with only one ear or no ears, or it could still retain a sharp profile as shown in fig. 4 (b).
on mild steel and aluminium alloys, very large striations having a wave-shaped section sometimes occur.

4 the growth of fatigue cracks

when the level of cyclic stress is sufficiently high, a microcrack will spread over the surface and into the body by continuing to-and-fro slip processes until it reaches a size at which it is capable of propagating as a macrocrack. from then on, its growth characteristics will depend on how much it opens and closes under the normal cyclic stress acting over its faces. to obtain a macrocrack in a specimen at a stress level less than the plain fatigue limit of a material, some form of notch must be introduced into the specimen so that the effective length of the crack formed at the notch root is increased to a value sufficient for it to grow directly as a macrocrack at the applied nominal stress level. unless stated otherwise, any reference to fatigue crack growth below implies that the crack has reached the macrocrack stage.

no accurate quantitative experimental data concerning the rate of fatigue crack growth had been published prior to 1953, when head [12] published his theoretically derived relationship between crack length and number of stress cycles. this may be owing to the fact that assessing design stresses, with respect to the fatigue properties of a material, had practically always been based on the plain fatigue limit or strength of the material obtained from smooth laboratory specimens. of course, the object of these design stresses was to obviate the initiation of any cracks under the working loads by keeping all cyclic stresses below some critical value. the requirement to produce components or structural members of complex form that are economic, and that perform under service conditions which are not accurately defined, resulted in the possibility of cracks developing even at relatively low nominal cyclic loads, for the most part as a result of fretting, or in locally highly stressed material round some discontinuity. this indicates that some components and structural members, especially those designed to have a limited life, have to function to good effect even though they may contain fatigue cracks. forward-looking inspection procedures enabled small cracks to be detected in certain components at an early period in their expected life. however, it was assumed that a crack approximately 5–15 mm long was the smallest flaw that could be detected in the course of a routine service inspection.

fig. 3: section ahead of main crack front in mild steel sheet (taken from [10])

fig. 4: profile of crack tip in a loaded mild steel sheet

the need to assume the existence of cracked members in engineering construction, despite the efforts of designers to create fatigue-resistant structures, is now universally accepted. the growth of these cracks will depend on the material and the values of the applied nominal mean and alternating stresses. choosing, from those materials which fulfil the other necessary design factors, the one yielding the slowest rate of crack growth for an assigned external loading will result in an increased margin of safety between routine inspections. knowledge of the growth rate behaviour of a material, together with regular inspections, may enable a cracked component to have a long service life before having to be replaced, as is shown below.
fail-safe design (that is, a structure designed in such a way that should cracks form they will not cause catastrophic failure under the working loads until one of them reaches a known length) implies that a limiting crack length can be established which must be detected by inspection. accordingly, designers should use all possible means to achieve the ideals of a low rate of crack propagation and high residual static strength in the presence of a crack. probably the best way of assessing the virtues of fail-safe design is by the length of the inspection periods which it allows in relation to the importance of the design feature.

numerous, seemingly different "laws" of fatigue crack growth have been described, and by making various acceptable assumptions some of them can be derived theoretically. all the laws can be regarded as valid in the sense that they describe a particular set of fatigue crack growth data, and they can be used to predict crack growth rates in situations similar to those used to collect the data. it is sometimes possible to fit the same set of data to apparently contradictory laws.

since the 1940s, the problem of brittle fracture has been extensively studied. it has been found that such low-stress (compared to the yield stress of the material) fractures always originate at flaws or cracks of various types. the fracture-mechanics approach to residual static strength in the presence of a crack makes use of the stress intensity factor $k_i$ concept to describe the stress field at a crack tip; when $k_i$ reaches a critical value $k_c$ the crack extends, usually catastrophically. values of $k_i$ are known for a wide range of crack configurations, and the fracture-mechanics approach has proved useful in problems of material development, design, and failure analysis. in view of its success in dealing with static fracture problems, it is logical to use a similar general approach to analyze fatigue crack growth data.

in the mid 20th century, many researchers reported how early in the fatigue life they could observe microcracks. since then it has been clear that the fatigue life under cyclic loading consists of two phases: the crack initiation life, followed by the crack growth period until failure. this can be demonstrated in a block diagram, see fig. 5. the crack initiation period may cover a large percentage of the fatigue life under high-cycle fatigue, i.e. under stress amplitudes just above the fatigue limit. however, for larger stress amplitudes the crack growth period can be a substantial part of the fatigue life. an implicit special problem is how to define the transition from the initiation period to the crack growth stage.

the stress-field component $\sigma_{ij}$ at the point $(r, \theta)$ near the crack tip is given by

$$\sigma_{ij}(r, \theta) = \frac{k}{\sqrt{2\pi r}} f_{ij}(\theta) + \text{other terms}, \qquad (1)$$

where the origin of the polar coordinates $(r, \theta)$ is at the crack tip and $f_{ij}(\theta)$ contains trigonometric functions. as the coordinate $r$ approaches zero, the leading term in equation (1) dominates; the other terms are constant or tend to zero. the constant $k$ in the first term is known as the stress intensity factor. results for the stress concentrations of notches of very small flank angle and very small root radius $\rho$ may be used to obtain theoretical expressions for stress intensity factors. consider a notch which, in the limit of zero root radius $\rho$, tends to a crack along the $y = 0$ axis: if $\sigma_{\max}$ is the maximum value of $\sigma_{yy}$ at the tip, then

$$k_i = \lim_{\rho \to 0} \frac{\sigma_{\max}}{2} \sqrt{\pi\rho}. \qquad (2)$$
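the limiting process in eq. (2) can be checked numerically. a minimal sketch, assuming the inglis stress concentration factor k_t = 1 + 2*sqrt(c/rho) for an elliptical notch in an infinite plate (this infinite-plate factor is an assumption for illustration; the semi-infinite sheet treated in eqs. (3)–(4) below adds a free-surface factor of about 1.13):

```python
import math

def k_from_notch(sigma, c, rho):
    """evaluate (sigma_max / 2) * sqrt(pi * rho) from eq. (2) for an
    elliptical notch of depth c and root radius rho, using the inglis
    factor k_t = 1 + 2*sqrt(c/rho) (infinite-plate assumption)."""
    sigma_max = (1.0 + 2.0 * math.sqrt(c / rho)) * sigma
    return 0.5 * sigma_max * math.sqrt(math.pi * rho)

sigma, c = 100.0, 0.01                     # remote stress [mpa], notch depth [m]
k_crack = sigma * math.sqrt(math.pi * c)   # k_i of a crack of length c

for rho in (1e-3, 1e-4, 1e-5, 1e-6):
    print(f"rho = {rho:.0e} m -> {k_from_notch(sigma, c, rho):.3f} "
          f"(crack value {k_crack:.3f})")
# as rho -> 0 the notch expression converges to sigma*sqrt(pi*c)
```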
although the relationship between $k$ and $\sigma_{\max}$ is exact, the actual expression for the maximum stresses may be known only approximately. as an example of this approach, consider a semi-elliptical edge notch of depth $c$ in a semi-infinite sheet subjected to a remote uniaxial tensile stress $\sigma$. equation (2) can be written in terms of $k_t$, the stress concentration factor (the ratio of maximum stress to applied stress), as follows:

$$k_i = \lim_{\rho \to 0} \frac{k_t \sigma}{2}\sqrt{\pi\rho} = \sigma\sqrt{\pi l}\,\lim_{\rho \to 0}\left[\frac{k_t}{2}\left(\frac{\rho}{c}\right)^{1/2}\right], \qquad (3)$$

where $l$ is the crack length (i.e. $c = l$ at $\rho = 0$). the stress concentration factor $k_t$ has been obtained for this configuration as a function of $\rho/c$. from this result for $k_t$ and a plot of $k_t(\rho/c)^{1/2}/2$ vs $\rho/c$ (as shown in fig. 6), $k_i$ can be determined from the limit as $\rho \to 0$ and $c \to l$, that is,

$$k_i = 1.13\,\sigma\sqrt{\pi l}. \qquad (4)$$

fig. 5: various periods of fatigue life (cyclic slip, crack nucleation, microcrack growth, macrocrack growth, final failure) and the applicable design considerations (stress concentration factor $k_t$, stress intensity factor $k$, fracture toughness $k_c$, $k_{ic}$)

in conjunction with the well-known paris equation, it has to be recognized that fatigue crack growth is subject to physical laws. generally, something drives the crack extension mechanism, and this is called the crack-driving force. this force is associated with the $\delta k$-value. the stress intensity factor is related to the strain energy release rate, i.e. the strain energy in the material which is available for producing crack extension, in compliance with the expression

$$\frac{\mathrm{d}u}{\mathrm{d}a} = \frac{k^2}{e^*}, \qquad (5)$$

where $e^* = e$ (young's modulus) for plane stress, and $e^* = e/(1-\nu^2)$ for plane strain ($\nu$ = poisson's ratio). the strain energy thus appears as a characteristic variable for energy balances. the experimental constants $c$ and $m$ in the paris equation are not easily associated with physical properties of the material. however, the crack growth rate obtained represents the crack growth resistance of the material. for the range of the elastic stress intensity factor, $\delta k$, alternative parameters were developed to correlate crack propagation rates under conditions of elastoplastic crack growth, as follows: (i) crack tip plastic range, (ii) change in crack tip opening displacement, and (iii) cyclic j-integral.

as early as the 1960s it was known that the correlation of $\mathrm{d}a/\mathrm{d}n$ and $\delta k$ depends on the stress ratio $r$. this was to be expected, because an increased mean stress for a constant $\delta s$ should give faster crack growth while the $r$-value is also increased. furthermore, the results of crack growth tests indicated systematic deviations from the paris equation at relatively high and low $\delta k$ values. this led to the definition of three regions in $\mathrm{d}a/\mathrm{d}n$–$\delta k$ diagrams, namely regions i, ii, and iii, see fig. 7. evident questions are connected with the vertical asymptotes at the lower $\delta k$ boundary of region i and the upper $\delta k$ boundary of region iii. the latter boundary seems to be justified, since if $k_{\max}$ exceeds the fracture toughness (either $k_c$ or $k_{ic}$), a quasi-static failure will occur and fatigue crack growth is no longer feasible. further, it should be recognized that the $k_{\max}$ value causing specimen failure in the last cycle of a fatigue crack growth test may well differ from the $k_c$ or $k_{ic}$ measured in a fracture toughness experiment. from the standpoint of fracture mechanics, the presence of a lower boundary in region i is not so obvious.
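the role of the paris equation, in its well-known form da/dn = c(Δk)^m, as a crack growth law can be made concrete by integrating it numerically between an initial and a final crack length. a minimal sketch; the values of c_paris, m, the geometry factor and the stresses below are illustrative assumptions, not data from this paper:

```python
import math

def cycles_to_failure(a0, ac, delta_s, c_paris, m, beta=1.0, steps=100_000):
    """integrate the paris law da/dn = c_paris * (delta_k)**m from crack
    length a0 to ac, with delta_k = beta * delta_s * sqrt(pi * a)."""
    n, a = 0.0, a0
    da = (ac - a0) / steps
    for _ in range(steps):
        delta_k = beta * delta_s * math.sqrt(math.pi * a)
        n += da / (c_paris * delta_k ** m)       # dn = da / (da/dn)
        a += da
    return n

# illustrative constants only (si-based units: m, mpa, mpa*sqrt(m))
n = cycles_to_failure(a0=1e-3, ac=25e-3, delta_s=100.0, c_paris=1e-11, m=3.0)
print(f"estimated crack growth life: {n:.3e} cycles")
```

because most of the life is spent while the crack is still small (Δk grows with √a), the result is dominated by the choice of a0, which is one reason why the small-crack behaviour discussed next matters.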
if a $k$-value can be defined for the tip of a crack, a singular stress field should be present and micro-plasticity at the tip of the crack should abound. why, then, should the crack not propagate any more; for which physical reason should there be a threshold $\delta k$-value ($\delta k_{th}$)?

new insights into $\delta k_{th}$ were connected with observations on so-called small cracks. these cracks occur as microcracks at the beginning of the fatigue life, starting at the material surface or, more exactly, in the subsurface. the first paper on this topic was published by pearson in 1975. he observed that small surface cracks grew more rapidly than large macrocracks at nominally similar $\delta k$ values. several works corroborated that microcracks could grow at low $\delta k$-values, while macrocracks did not propagate at these low $\delta k$-values, where $\delta k < \delta k_{th}$. exemplary data from wanhill are shown in fig. 8.

fig. 6: stress concentration factor as a function of notch radius

fig. 7: three regions of crack growth (taken from [2])

the small-crack problem became a theme for further research. various crack growth obstacles presented by the material structure (e.g. grain boundaries, pearlite in steel, phase boundaries in general) could be essential for microcracks, whilst they were less important for macrocrack propagation. as an outcome, considerable scatter was observed in microcrack growth rates. besides, the barriers influencing microcrack propagation could be entirely different for different materials. although suggestions for fracture mechanics predictions of the growth of microcracks were presented, the published findings were not always convincing. indeed, it should be noted that the k-concept for such small cracks in a crystalline material is problematic: the slip band is a plastic region and its size is not small in comparison with the microcrack length.

another issue concerning the $\delta k_{th}$ concept involves macrocracks. why do large cracks stop growing if $\delta k < \delta k_{th}$? a formal answer to this question is that the crack driving force does not exceed the crack growth resistance of the material. at low $k$-values the crack driving force is low, which affects the crack front microgeometry. the crack front becomes more tortuous, and the crack closure mechanism also changes. the crack driving force may just no longer be capable of producing further crack growth.

a concept to be discussed is the occurrence of crack closure, and more specifically plasticity induced crack closure. in the 1960s, elber observed that the tip of a growing fatigue crack in an al-alloy sheet specimen (2024-t3) could be closed at a tensile stress [13]. crack opening proved to be a non-linear function of the applied stress, see fig. 9. during loading from $s = 0$ to $s = s_{op}$ the crack opening displacement is a non-linear function of the applied stress. the same non-linear response was observed in unloading. in the course of the non-linear behaviour the crack is partly or fully closed due to plastic deformation left in the wake of the growing crack. elber argued that a load cycle is only effective in driving the growth of a fatigue crack if the crack tip is fully open. accordingly, the effective $\delta s$ and $\delta k$ are expressed in the form

$$\delta s_{eff} = s_{\max} - s_{op} \quad \text{and} \quad \delta k_{eff} = \beta\,\delta s_{eff}\sqrt{\pi a} \qquad (6)$$

($\beta$ is the geometry factor). elber supposed that the crack growth rate is a function of $\delta k_{eff}$ only:

$$\frac{\mathrm{d}a}{\mathrm{d}n} = f(\delta k_{eff}). \qquad (7)$$

he derived that the crack opening stress level depends on the stress ratio, for which elber presented the relation

$$u = \frac{\delta s_{eff}}{\delta s} = f(r) = 0.5 + 0.4r \quad \text{(for 2024-t3 al-alloy)}. \qquad (8)$$

fig. 8: wanhill's results for large cracks and small microcracks (taken from [2])

in addition, he proposed that the relation should be independent of the crack length. the elber approach continued in later investigations, partly because it was attractive to present the crack growth data of a material for various $r$-values by just a single curve according to eq. (7). it turned out that the relation in eq. (8) could be markedly different for other materials, which is not surprising, as the cyclic plastic behavior depends on the type of material. in the 1980s, the crack closure approach was widely embraced by researchers working on crack growth models for fatigue under variable amplitude loading.
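elber's closure correction, eqs. (6)–(8), is straightforward to apply once the stress ratio r is known. a minimal sketch using his 2024-t3 relation u = 0.5 + 0.4r; the geometry factor, crack length and load level below are assumptions for illustration:

```python
import math

def delta_k_eff(s_max, r, a, beta=1.0):
    """effective stress intensity range after elber, eqs. (6)-(8):
    u = delta_s_eff / delta_s = 0.5 + 0.4*r   (2024-t3 al-alloy),
    delta_k_eff = beta * u * delta_s * sqrt(pi * a)."""
    delta_s = s_max * (1.0 - r)
    u = 0.5 + 0.4 * r
    return beta * u * delta_s * math.sqrt(math.pi * a)

a = 5e-3                         # crack length in m, an assumed value
for r in (0.0, 0.3, 0.5):
    print(f"r = {r}: delta_k_eff = {delta_k_eff(120.0, r, a):.2f} mpa*sqrt(m)")
# at a higher stress ratio the crack is open during a larger part of the
# cycle, so a larger fraction of the nominal delta_k is effective
```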
5 conclusion

the major conclusions are:

(i) fatigue failure is a product of the initiation of a crack and the subsequent propagation of this crack;

(ii) in homogeneous metals, cracks initiate at a free surface, and no damage is produced by cyclic stressing in metal far away from this surface;

(iii) the initiation of a slip-band crack is possible only in ductile metals;

(iv) other materials can demonstrate fatigue behaviour, though this is owing to the growth of a crack from some initial flaw or fault.

acknowledgments

this work was supported by the grant agency of the czech republic, under project no. 103/06/1382.

references

[1] forsyth, p. j. e.: "the application of fractography to fatigue failure investigations." roy aircraft est, tech note met, (1957), p. 257.
[2] schijve, j.: "fatigue of structures and materials in the 20th century." international journal of fatigue, vol. 25 (2003), p. 679–702.
[3] young, j. m., greenough, a. p.: j. inst. metals, vol. 89 (1960–61), p. 241.
[4] modlen, g. f., smith, g. c.: j. iron steel inst., vol. 194 (1960), p. 459.
[5] ryder, d. a.: "some quantitative information obtained from the examination of fatigue fracture surfaces." roy aircraft est, tech note met, (1958), p. 288.
[6] bowles, c. q., schijve, j.: "crack tip geometry for fatigue cracks grown in air and vacuum." in: astm stp 811. philadelphia, pa: american society for testing and materials, 1983, p. 400–426.
[7] forsyth, p. j. e., ryder, d. a.: aircraft engineering, vol. 32 (1960), p. 96.
[8] pook, l. p.: hawker siddeley aviation ltd. unpublished report, 1962.
[9] mc millan, j. c., pelloux, r. m. n.: boeing scientific research lab. document, d1-82-0558, 1966.
[10] plumbridge, w. j., ryder, d. a.: acta metall, vol. 17 (1969), p. 1449.
[11] frost, n. e., holden, j., phillips, c. e.: crack propagation symposium, cranfield, 1961, p. 166.
[12] head, a. k.: phil. mag., vol. 44 (1953), p. 925.
[13] elber, w.: "fatigue crack propagation." phd thesis, university of new south wales, australia, 1968.

doc. ing. petr brož, drsc.
e-mail: broz.petr@tiscali.cz
czech institution of structural and civil engineers
sokolská 15
120 00 prague 2, czech republic

fig. 9: measurement of the crack opening displacement demonstrating the occurrence of plasticity induced crack closure at a positive stress (taken from [13])
acta polytechnica 62(5):567–573, 2022

reduction of pavement thickness using a subgrade layer treated by different techniques

raquim n. zehawi∗, yassir a. kareem, emad y. khudhair

university of diyala, college of engineering, highway and airport engineering department, baquba 32001, iraq
∗ corresponding author: raquim_zehawi@uodiyala.edu.iq

abstract. a range of stabilisers for poor quality subgrade soils have been developed to promote road constructions. many of them are becoming more popular depending on their effectiveness. the purpose behind this research is to identify the relative efficacy of several physical and chemical stabilisation techniques for enhancing the properties of three types of local iraqi subgrade soils. the comparison of the samples is based on the cbr tests. the aashto (1993) flexible pavement design was used to compute the pavement thickness requirements. the soil samples a, b and c have natural cbr values of 3.8, 3.9 and 4, respectively, on which the physical stabilisers of powdered rock (pr), grained recycled concrete (grc), and recycled crumb rubber grains (cr) were employed, while quicklime (ql) and activated fly ash (afa) were both utilised as chemical stabilisers. the stabilisation with 15 % of afa proved to be the most applicable method for soil types a and b, reducing the pavement thickness requirements by 51 % and 32 %, respectively, with a reasonable financial feasibility for both. the same feasibility is proven when stabilising soil type c with 15 % of grc, which reduces the pavement thickness by 25.7 %.

keywords: flexible pavement, aashto flexible design method, cbr, physical stabilizers, chemical stabilizers.

1. introduction

there are several highway pavement distresses, some of which can be attributed to the inadequate support of subgrade soils. such cases often result from the sensitivity to a high water content, low specific gravity, and low shear strength, along with many undesirable characteristics of highway pavement subgrade soils. subgrade strength is often assessed by many kinds of tests conducted either in the field or in the lab, such as the field density test and the california bearing ratio (cbr) value [1]. to achieve an optimal performance of a flexible pavement, the design method must depend on cost effective, proper, and readily available subgrade layer components. soft soil in a subgrade layer, for example, needs very special improvements to ensure the suitability for constructing a supporting layer for flexible pavement layers. the process of stabilising subgrade soil is both efficient and cost effective in most cases, because road paving materials are generally less expensive than replacing the existing subgrade with stronger materials [2]. the process of stabilisation could be mechanical, chemical, or a combination of both. the mechanical stabilisation is usually performed by guaranteeing the proper arrangement of soil particles, either by compacting the soil layer by vibrations to rearrange the soil particles, or by adopting some advanced methods such as soil nailing with the use of barriers.
the chemical methods are usually achieved through the use of a stabilising agent, such as cementitious substances, causing a chemical reaction with the soil particles. in the case of soft natural soil, such as clayey peat, silt or organic soil, the majority of stabilising mechanisms may be used [3]. fine-grained granular soils are the best for stabilisation. this is due to the huge ratio of the surface area to the diameter of the particles. during such a stabilisation process for soils with a swelling potential, the stabilising agent alters the physico-chemical conditions within and around the clay particles. as a result, the treated cl soil showed an increase in the cbr value, which may signify the enhancement of the quality of the subgrade and, consequently, an increase in the carrying capacity of the pavement [4]. over the last few decades, non-traditional soil stabilising additives have been introduced at a rapid rate. due to their affordability, quick treatment times, and ease of use, these stabilisers are becoming more popular [5–7].

the aim of this research is to study the effect of soil stabilisation on road subsoil, whose resistance was measured by the cbr experiment, and to study the effect of this resistance index on the design of pavement thickness by the aashto method. five physical and chemical additives were used – quicklime, activated fly ash, powdered rock, grained recycled concrete, and crumb rubber.

2. literature review

the subgrade's quality has a significant impact on the design of the pavement as well as its lifetime and performance. highway pavements which are constructed on problematic soil usually demonstrate poor performance and unpredictable behaviour due to the influence of the subgrade soil type [8]. shrinkage-driven fissures may be in-filled with sediment over a geological time scale, resulting in subgrade irregularities. it is worth noting that in the field of geotechnical soil stabilisation, the substructure of the paved highway is subject to all the usual soil stabilisation laws.

according to asad et al. [9], who studied the stabilisation of a subgrade consisting of low plastic clayey soil (cl), when lime additive is used in a percentage ranging from 0 % to 6 %, the unconfined compressive strength (ucs) of the stabilised soil increases from 46.08 psi to 103.27 psi. increasing the percentage of the additive above 6 % decreases the ucs [9]. the ucs increase of the soil is explained by the increased flocculation induced by lime in the treated clayey soil. many properties of the lime-treated soil were observed to be improved: the plasticity index of the soil decreased and the soil type transformed from clay with a low plasticity to silty soil. the swelling of the treated soil is eliminated and the soil becomes non-expansive. the cbr of the lime-treated soil is increased. this increase in cbr value has the potential to reduce both the cost and total thickness of the multi-layer pavement.

karim et al. [10] studied the effect of fly ash addition on the geotechnical properties of soft clay soil. the addition of fly ash lowers the specific gravity of the treated soil, because the specific gravity of fly ash is lower than that of the soil. the plasticity of the treated soil is also reduced.
the maximum dry density (mdd) of the soil is reduced while the optimum water content (owc) of the treated soil is increased, which solves many of the issues with the untreated soil [10]. the ucs of the soil increased with increasing fly ash percentages. the cbr of the soil increased from 90.1 % at 5 % fly ash to 538.3 % at 20 % fly ash; above 20 % fly ash, the cbr decreases. also, the compressibility of the soil decreases. in this study, it was found that 20 % of fly ash is the optimum percentage. many other studies reached similar conclusions with the addition of comparable percentages of fly ash [11–13].

kumar & biradar [14] utilised quarry dust as a physical additive, collected from a quarry at srikakulam in india. the soil samples were blended with this waste material at different percentages; its sg is 2.68, its omc is 9.3 % and its mdd is 17.02 kn/m³. it was found that the plasticity of the soil treated by this substance was reduced, because quarry dust is a non-plastic material. the mdd of the modified soil was increased by 5.88 % when adding 40 % of quarry dust, but beyond 40 % the mdd of the soil began to decrease. the cbr of the soil also increased, and above 40 % it, too, started to decrease. this study found that 40 % of quarry dust is the optimum percentage [14].

the use of grained recycled concrete (grc) as an additive has been proven to enhance the properties of soft soils [15, 16]. saeed and rashed experimentally assessed the ability to use demolished waste concrete (dwc) as a mechanical (physical) stabiliser added for the geotechnical treatment of expansive soil. the plasticity of the treated soil was reduced, as dwc is a non-cohesive material. the swelling potential (which includes both the swelling percentage and the swelling pressure) decreased with the increase in dwc up to 12 %, but above this percentage, no significant decrease in swelling potential was noticed. in their view, both the mdd and the omc of the soil were reduced because of the presence of fine sand in the substance. the strength of the soil, represented by the ucs, increased up to 12 % of dwc cured for 28 days, and the behaviour of the soil changed from flexible to brittle with the increment of this substance. the cbr of the soil treated by 12 % of dwc increased from 4.27 % to 24.14 %. it was concluded by this study that grained recycled concrete (grc) is economical, environment-friendly, and effective for treating the adverse properties of expansive soils [17].

many researches have dealt with the addition of crumb rubber to improve the properties of soft soils in terms of cbr and mr values and to enhance their support of flexible pavements [5, 18]. ravichandran et al. [19] tested the use of crumb rubber grains from waste tires in the stabilisation of weak soils. it was found that the cbr value of the treated soil increased up to 10 % of tire rubber grains, and above this percentage, the cbr decreased. an increased cbr value of stabilised soil can greatly lower the overall pavement thickness and, as a result, the entire cost of road construction. the permeability of the treated soil, denoted by the coefficient of permeability, was increased. it was concluded at the end of this study that the use of rubber grains as a stabiliser presents a low-cost stabilisation technology that considerably decreases the current waste tire disposal problem [19].
in this research, five additives are used: quicklime, activated fly ash, powdered rock, grained recycled concrete, and crumb rubber. all these additives are mixed with three soil samples representing the subgrades of three main highways connecting baquba city. the experimental work is conducted on these mixtures to determine the effect of these additives on increasing the strength of the soils in terms of cbr values and, consequently, the expected reduction in highway pavements.

3. materials used

3.1. soil

three samples of soil were used in this study, all of them extracted from flexible pavement road soil subgrades. they were brought from three different locations in diyala governorate in the middle of iraq. the first soil sample (denoted soil a) was brought from the subgrade of the baquba–khalis highway (latitude 33° 48′ 27.43″ n, longitude 44° 35′ 0.32″ e). the second soil sample (soil b) was brought from the subgrade of the baquba–al sabtiya highway (latitude 33° 47′ 7.59″ n, longitude 44° 37′ 22.36″ e). the third soil sample (soil c) was brought from the university of diyala–khan bani saad highway (latitude 33° 40′ 29.32″ n, longitude 44° 35′ 7.66″ e). table 1 shows some index properties of the collected soil samples.

table 1. properties of the collected soil samples.

property                     | soil a     | soil b   | soil c
natural water content [%]    | 40         | 32       | 28
liquid limit [%]             | 48         | 35       | 34
plastic limit [%]            | 15         | 15       | 20
plasticity index [%]         | 33         | 20       | 14
gravel [%]                   | 0          | 0        | 0
sand [%]                     | 0.7        | 5        | 15
silt [%]                     | 38.3       | 36       | 40
clay [%]                     | 61         | 59       | 45
specific gravity (gs)        | 2.67       | 2.71     | 2.75
uscs soil classification     | cl         | cl       | cl
aashto soil classification   | a-7-6 (35) | a-6 (29) | a-6 (11)
maximum dry density [kn/m³]  | 17.5       | 18.6     | 19.1
optimum moisture content [%] | 17.7       | 16.5     | 16

3.2. additives

3.2.1. quicklime (ql)

the type of lime employed in this study was un-slaked lime, often known as quicklime, which is obtained from limestone. this material was manufactured by the azerbaijan lime chemical company in iran. its particle size, determined by a sieve analysis, is 850 microns (sieve no. 20).

3.2.2. activated fly ash (afa)

the type f fly ash employed in this study is a soil stabiliser, but it lacks cementation and pozzolanic qualities, which were compensated for by adding sulfate-resistant portland cement, manufactured by the al-geser manufacturing company in the governorate of kerbela in southern iraq. the fly ash used in this investigation was produced in india.

3.2.3. powdered rock (pr)

this material was obtained from a local quarry in diyala province by grinding the local sandstone according to the iraqi specification (2715) [20]. its particle size, determined by a sieve analysis, is 0.075 mm.

3.2.4. grained recycled concrete (grc)

this material was made from the leftovers of concrete cubes that had been used for research engineering purposes in the structural testing laboratory of the university of diyala's college of engineering, where it was crushed in a special local mill for this purpose. its particle size, determined by a sieve test, is 0.45 mm. the tested specific gravity of the material was found to be 2.7.

3.2.5. crumb rubber grains (cr)

to obtain this material, worn tires were cut into small pieces with a grain size of one to two millimeters in diameter at maximum. the specific gravity is 0.91, the compacted void ratio ranges between 0.9–1.3, the uncompacted void ratio ranges between 1.2–2.4, and poisson's ratio is 0.5.

4. experimental work
4.1. preparation of treated soil samples

in this research, the five above-mentioned additives were added to the soil samples (a, b, and c). an optimum ratio of each additive was selected depending on previous researches on similar types of soils, in order to enhance their properties in terms of the cbr value [2, 5, 16]. the percentages of these stabilisers were: 9 % of quicklime, 15 % of activated fly ash, 25 % of powdered rock, 15 % of grained recycled concrete, and 4 % of crumb rubber grains [6, 10, 19]. these percentages were added to each soil sample, and the samples were subjected to the cbr test before and after the addition.

4.2. cbr test

the cbr test is one of the important empirical tests for evaluating the strength of the subgrade soil, and it provides one of the input parameters in determining the flexible pavement thickness according to the aashto design method. this test was conducted according to astm d1883-21. the test included the preparation and compaction of the test sample at the maximum dry density, after which the sample was immersed in a water bath for 4 days, as recommended in the specifications. then, the samples were extracted from the water and allowed to drain, and then tested using a cbr loading machine with a penetration rate of 1.25 mm/min [21].

5. pavement design

in order to find out the optimal subgrade soil improvement strategy, a unified set of parameters for the flexible pavement was adopted according to the aashto design method [22] and applied to all soil types included in this study. the design parameters, as adopted by the local directorate of highways and bridges of diyala governorate, are detailed in table 2. table 3 shows the compositions and coefficients of each pavement layer according to its properties.

table 2. the adopted unified design parameters.

parameter                    | value
pavement lifetime            | 15 years
traffic esal (80 kn)         | 2 × 10⁶
reliability (r)              | 99 %
standard deviation (s₀)      | 0.49
initial serviceability (pᵢ)  | 4.5
terminal serviceability (pₜ) | 2.5
state of water drainage      | poor

table 3. the compositions of pavement layers and coefficients.

parameter              | subbase       | base          | surface
materials              | granular soil | crushed stone | asphalt concrete
structural coefficient | a3 = 0.1      | a2 = 0.14     | a1 = 0.44
drainage coefficient   | m3 = 0.8      | m2 = 0.8      | –

the chosen flexible pavement is a three-layered system, which is composed of a wearing layer on the top, a base layer underneath it, and a sub-base layer at the bottom. this three-layered system is supported by the subgrade soil. this system is assumed to be fixed, while the impact of the soil improvements on the total thickness of the pavement is investigated. the results of the calculation of each pavement layer thickness using the aashto method can be seen in table 4, which shows the required thicknesses above each layer for the natural soils before the improvements.

table 4. design thicknesses above each layer on natural soils.

parameter            | soil a | soil b | soil c | subbase | base  | asphalt
mr [psi]             | 5760   | 5925   | 6000   | 13500   | 31800 | 450000
structural number sn | 6.4    | 6.2    | 6      | 3.8     | 2.6   | –
total thickness [in] | 39     | 37     | 35     | 18      | 6     | –

many relations that correlate the mr value to the cbr value have been developed due to the importance of this topic [23, 24]. the following equation is used in this method for the calculation of the resilient modulus (mr) of the subgrade, which enters the computation of the structural number (sn) of the pavement layers, according to the aashto method [25, 26]:

mr (psi) = 1500 × cbr, for cbr ≤ 10 %. (1)
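the design computation behind tables 2–4 can be sketched in a few lines. the code below solves the aashto (1993) flexible-pavement design equation for the required structural number sn by bisection, using the parameters of table 2 and eq. (1) for the subgrade. it is illustrative only; the authors' values in table 4 come from their own design computation and rounding, so some difference is to be expected. the value z_r ≈ −2.327 for 99 % reliability is an assumption taken from standard normal tables:

```python
import math

def aashto_log_w18(sn, m_r, z_r=-2.327, s0=0.49, p_i=4.5, p_t=2.5):
    """predicted log10(allowable 18-kip esals) from the aashto 1993
    flexible pavement design equation, for structural number sn and
    subgrade resilient modulus m_r [psi]."""
    dpsi = p_i - p_t
    psi_term = (math.log10(dpsi / (4.2 - 1.5))
                / (0.40 + 1094.0 / (sn + 1.0) ** 5.19))
    return (z_r * s0 + 9.36 * math.log10(sn + 1.0) - 0.20
            + psi_term + 2.32 * math.log10(m_r) - 8.07)

def required_sn(w18, m_r, lo=0.1, hi=15.0):
    """smallest sn carrying w18 esals, found by bisection
    (aashto_log_w18 increases monotonically with sn)."""
    target = math.log10(w18)
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if aashto_log_w18(mid, m_r) < target:
            lo = mid
        else:
            hi = mid
    return hi

m_r = 1500.0 * 3.8              # eq. (1) for natural soil a, in psi
print(f"required sn for natural soil a: {required_sn(2e6, m_r):.2f}")
```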
the subgrade soil strength, measured by the cbr value before and after treatment, changes the bearing capacity of the subgrade, and a new flexible pavement thickness is calculated to investigate the impact of the five improvement techniques on the flexible pavement's total depth before and after the treatment, together with the percentage reduction for each treatment.

6. results and discussion

the results indicated that all three low plasticity soil samples a, b, and c, having cbr values of 3.8, 3.9, and 4, respectively, responded to the stabilisation process. despite the difference between their cbr values being small, these samples have highly different characteristics, for they were brought from different locations, as stated earlier. these differences could be noticed in their responses to the stabilisation processes in terms of the increase in cbr values. this increment would result in reducing the road pavement's thickness requirements and, consequently, in a financial profit, due to the low cost of the stabilisation process as compared to the high cost of pavement construction.

the tests revealed that soil sample a responded very well to the stabilisation processes: the cbr value increased from 3.8 % to 30 % by adding the optimum percentages of afa, ql, and pr, while the addition of grc and cr increased the cbr value to 20 %. soil sample b also showed an enhanced cbr value, which increased from 3.9 % to about 19 % by the addition of pr and grc; a lesser increment in cbr, to about 14 %, was observed for the other additives. as for soil sample c, for which the smallest increase was observed, the cbr value increased from 4 % to 11 % by adding grc and cr, while the other additives did not increase the cbr to more than 7.5 %.

in order to determine the effect of these enhancements in the stabilised soils on the pavement thickness, the calculations according to the aashto flexible design method were repeated to find the required total pavement thickness for all treatments of each soil type. the results are shown in figure 1. at the same time, the financial impact has been determined: the net profit is calculated by subtracting the cost of the additive's application process from the amount of costs saved due to the reduction of the pavement layer thickness. the financial details are shown in table 5.

figure 1. effect of subgrade improvement on the pavement thickness.

table 5. financial feasibility analyses.

additive | soil | stabilization cost [$/m²] | pavement reduction [cm] | reduction cost [$/m²] | benefit [$/m²] | b/c ratio
9% ql    | a    | 3.64 | 50.8  | 15.24  | 11.6   | 3.19
9% ql    | b    | 3.64 | 33.02 | 9.906  | 6.266  | 1.72
9% ql    | c    | 3.64 | 12.7  | 3.81   | 0.17   | 0.05
15% afa  | a    | 2.26 | 50.8  | 15.24  | 12.98  | 5.74
15% afa  | b    | 2.26 | 30.48 | 9.144  | 6.884  | 3.05
15% afa  | c    | 2.26 | 15.24 | 4.572  | 2.312  | 1.02
25% pr   | a    | 7.5  | 48.26 | 14.478 | 6.978  | 0.93
25% pr   | b    | 7.5  | 38.1  | 11.43  | 3.93   | 0.52
25% pr   | c    | 7.5  | 10.16 | 3.048  | -4.452 | –
15% grc  | a    | 2.72 | 45.72 | 13.716 | 10.996 | 4.04
15% grc  | b    | 2.72 | 35.56 | 10.668 | 7.948  | 2.92
15% grc  | c    | 2.72 | 22.86 | 6.858  | 4.138  | 1.52
4% cr    | a    | 6.03 | 45.72 | 13.716 | 7.686  | 1.27
4% cr    | b    | 6.03 | 33.02 | 9.906  | 3.876  | 0.64
4% cr    | c    | 6.03 | 20.32 | 6.096  | 0.066  | 0.01

the results reveal that the biggest reductions were found for soil a, in which the total pavement thickness requirement is reduced from 39 in to 19 in, i.e. by 51 %.
a lesser reduction can be observed for treated soil sample b, in which the total pavement thickness is reduced by approximately 40 %, from 37 in to 22 in. soil sample type c has the worst results in this regard, as the pavement thickness was reduced by no more than 25.7 %, from 35 in to 26 in. these results are very close, if not better, compared to the improvements achieved by shubber & saeed and amakye et al. [27, 28]; the variations in results are mostly due to the type of the treated soil.

the financial analyses showed another sequence of preferences among the additives in terms of the benefit to cost ratio. for soil type a, although the addition of ql and afa resulted in an identical pavement thickness reduction, they returned b/c ratios of 3.19 and 5.74, respectively; similarly, the addition of grc and cr resulted in an identical reduction in pavement thickness, yet their b/c ratios were 4.04 and 1.27, respectively. adding pr resulted in a 0.93 b/c ratio, which is unacceptable despite its relatively high pavement reduction. likewise, in soil type b, the highest reduction in pavement thickness was produced by the addition of pr, yet, financially, it is unacceptable due to the b/c ratio of 0.52; the same result was observed for cr, which returned a b/c ratio of 0.64. in this type of soil, the most financially efficient additives are afa and grc, yielding b/c ratios of 3.05 and 2.92, respectively. for soil type c, most of these additives are not financially efficient, producing b/c ratios lower than 1, except for afa and grc, which yield b/c ratios of 1.02 and 1.52, respectively.
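the feasibility figures in table 5 follow a simple rule: the benefit is the saving from the thinner pavement minus the cost of the treatment, and the b/c ratio divides that benefit by the treatment cost. a minimal sketch reproducing one row (the two cost figures are taken from table 5):

```python
def feasibility(stabilisation_cost, reduction_saving):
    """benefit [$/m2] and b/c ratio as used in table 5."""
    benefit = reduction_saving - stabilisation_cost
    return benefit, benefit / stabilisation_cost

# row "9% ql, soil a" of table 5
benefit, bc = feasibility(stabilisation_cost=3.64, reduction_saving=15.24)
print(f"benefit = {benefit:.2f} $/m2, b/c = {bc:.2f}")   # 11.60, 3.19
```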
7. conclusions and recommendations

in this paper, three iraqi local subgrade soil samples were stabilised using five different additives. these samples were tested for the cbr value both before and after the treatment, in order to examine the extent to which these additives could enhance the soil support capability for the flexible pavement, utilising the aashto flexible pavement design guide. the most important conclusions drawn from the results of this research are as follows:

(1.) all three subgrade soil samples responded well to the stabilisation process, but to a varying extent.

(2.) the best reduction in the pavement thickness was about 51 %, achieved for soil (a) by a stabilisation with 15 % of afa. at the same time, this process returns the highest financial feasibility.

(3.) the highest reduction in pavement thickness in soil b is approximately 47 %, achieved by a stabilisation with 25 % of pr, but it was found financially unacceptable for yielding a b/c ratio lower than 1, while the best financial return in this soil is achieved by using 15 % of afa, which returns a 3.05 b/c ratio.

(4.) the smallest response to the stabilisation was observed for soil (c), in which the highest reduction in the pavement thickness is achieved by stabilisation with 15 % of grc, resulting in a pavement thickness reduction of 26 %, and it proved to have the highest financial return with a 1.52 b/c ratio.

(5.) it is recommended to study the environmental impacts on the stabilised subgrade soils, especially water infiltration, and their effect on road pavements.

list of symbols

gs specific gravity
cl low plasticity clayey soil
mr resilient modulus
ucs unconfined compressive strength
mdd maximum dry density
owc optimum water content
grc grained recycled concrete
dwc demolished waste concrete
astm american society for testing and materials
sn structural number

references

[1] h. m. park, m. k. chung, y. a. lee, b. i. kim. a study on the correlation between soil properties and subgrade stiffness using the long-term pavement performance data. international journal of pavement engineering 14(2):146–153, 2013. https://doi.org/10.1080/10298436.2011.633167.
[2] o. a. abd-allah. the effect of soil improvements on pavement failures along diyala governorate highways. master's thesis, 2021.
[3] n. a. patel, c. b. mishra, d. k. parmar, s. b. gautam. subgrade soil stabilization using chemical additives. international research journal of engineering and technology 02(04):1089–1095, 2015.
[4] e. segun, o. moses, s. akinlabi. geotechnical and microstructural properties of cement-treated laterites stabilized with rice husk ash and bamboo leaf ash. acta polytechnica 61(06):722–732, 2021. https://doi.org/10.14311/ap.2021.61.0722.
[5] o. a. abd-allah, s. h. a. awn, r. n. zehawi. improvement of soft clay soil using different types of additives. iop conference series: earth and environmental science 856(1):012010, 2021. https://doi.org/10.1088/1755-1315/856/1/012010.
[6] z. al-khafaji, h. al-naely, a. al-najar. a review applying industrial waste materials in stabilisation of soft soil. electronic journal of structural engineering 18(2):16–23, 2018. https://doi.org/10.56748/ejse.182602.
[7] j. vesely. numerical modelling of the soil behaviour by using newly developed advanced material model. acta polytechnica 57(1):58–70, 2017. https://doi.org/10.14311/ap.2017.57.0058.
[8] r. baadiga, u. balunaini, s. saride, m. r. madhav. effect of geogrid type and subgrade strength on the traffic benefit ratio of flexible pavements. transportation infrastructure geotechnology pp. 1–31, 2021. https://doi.org/10.1007/s40515-021-00203-5.
[9] a. asad, a. hussain, a. farhan, et al. influence of lime on low plastic clay soil used as subgrade. journal of mechanics of continua and mathematical sciences 14(1):69–77, 2019. https://doi.org/10.26782/jmcms.2019.02.00005.
[10] h. h. karim, z. w. samueel, a. h. jassem. influence of fly ash addition on behavior of soft clayey soil. engineering and technology journal 38(5):698–706, 2020. https://doi.org/10.30684/etj.v38i5a.426.
[11] r. kishor, v. p. singh. evaluation of expansive soil amended with fly ash and liquid alkaline activator. transportation infrastructure geotechnology pp. 1–22, 2022. https://doi.org/10.1007/s40515-022-00240-8.
[12] a. thomas, k. kumar, l. tandon, o. prakash. effect of fly ash on the engineering properties of soil. international journal of advances in mechanical and civil engineering 2(3):16–18, 2015.
[13] b. lekha, g. sarang, a. shankar. effect of electrolyte lignin and fly ash in stabilizing black cotton soil. transportation infrastructure geotechnology 2(2):87–101, 2015. https://doi.org/10.1007/s40515-015-0020-0.
[14] u. a. kumar, k. b. biradar. soft subgrade stabilization with quarry dust – an industrial waste. international journal of research in engineering and technology 03(08):409–412, 2014. https://doi.org/10.15623/ijret.2014.0308063.
[15] m. a. al-obaydi, m. d. abdulnafaa, o. a. atasoy, a. f. cabalar. improvement in field cbr values of subgrade soil using construction-demolition materials. transportation infrastructure geotechnology 9(2):185–205, 2022. https://doi.org/10.1007/s40515-021-00170-x.
[16] o. ahmed, s. h. a. awn, r. n. zehawi. improving california bearing ratio (cbr) using chemical and physical treatments for diyala soils. diyala journal of engineering sciences 14(2):42–51, 2021. https://doi.org/10.24237/djes.2021.14204.
[17] s. b. saeed, k. a. rashed. evaluating the uses of concrete demolishing waste in improving the geotechnical properties of expansive soil. journal of engineering 26(7):158–174, 2020. https://doi.org/10.31026/j.eng.2020.07.11.
[18] h. a. hasan, l. h. a. mohammed, l. g. g. masood. effect of rubber tire on behaviour of subgrade expansive iraqi soils. iop conference series: materials science and engineering 870(1):012066, 2020. https://doi.org/10.1088/1757-899x/870/1/012066.
[19] p. t. ravichandran, a. s. prasad, k. d. krishnan, p. r. k. rajkumar. effect of addition of waste tyre crumb rubber on weak soil stabilisation. indian journal of science and technology 9(5):1–5, 2016. https://doi.org/10.17485/ijst/2016/v9i5/87259.
[20] central organization for standardization and quality control. technical specifications for civil work, ministry of planning/republic of iraq, 2017.
[21] american association of state highway and transportation officials. standard method of test for the california bearing ratio (aashto standard no. t193), 2013.
[22] american association of state highway and transportation officials. aashto guide for design of pavement structures. american association of state highway and transportation officials, washington dc, 1993. isbn 1-56051-055-2.
[23] m. arshad. development of a correlation between the resilient modulus and cbr value for granular blends containing natural aggregates and rap/rca materials. advances in materials science and engineering 2019:8238904, 2019. https://doi.org/10.1155/2019/8238904.
[24] h. r. heinimann. pavement engineering for forest roads: development and opportunities. croatian journal of forest engineering 42(1):91–106, 2021. https://doi.org/10.5552/crojfe.2021.860.
[25] a. a. hussein, y. m. alshkane. prediction of cbr and mr of fine-grained soil using dcpi. in proceedings of the 4th international engineering conference on developments in civil & computer engineering, pp. 268–282. 2018. https://doi.org/10.23918/iec2018.20.
[26] n. j. garber, l. a. hoel. traffic and highway engineering. cengage learning, 2009. isbn 0495082503.
[27] k. h. h. shubber, a. a. saad. subgrade stabilization strategies effect on pavement thickness according to aashto pavement design method (review). iop conference series: materials science and engineering 737(01):012145, 2020. https://doi.org/10.1088/1757-899x/737/1/012145.
[28] s. y. o. amakye, s. j. abbey, c. a. booth, j. oti. road pavement thickness and construction depth optimization using treated and untreated artificially-synthesized expansive road subgrade materials with varying plasticity index. materials 15(08):2773, 2022. https://doi.org/10.3390/ma15082773.
acta polytechnica vol. 45 no. 1/2005

comparison of the capacitance method and the microwave impulse method for determination of moisture profiles in building materials

p. tesárek, j. pavlík, r. černý

a comparison of the capacitance method and the microwave impulse method for the determination of moisture profiles in three typical porous building materials is presented in this paper. the basic principles of the measuring methods are given. the calibration process is described in detail. on the basis of the measured results, it can be concluded that the capacitance method provides better accuracy in the range of lower moisture content than the microwave impulse method, which is more accurate for the highest values of moisture content.

keywords: moisture profiles, capacitance method, microwave impulse method, building materials.

1 introduction

measuring transient moisture profiles is considered a common and effective tool for determination of the liquid moisture diffusivity of porous building materials. in classical experimental setups, rod specimens that are water- and vapor-proof insulated on all lateral sides (parallel to the direction of moisture transport) are placed in either horizontal or vertical position. then, they are brought into contact with the water by one of their face sides. beginning from this time, the moisture contents are measured at different positions and time intervals during the experiment, producing moisture content profiles versus time. finally, the measured moisture profiles are analyzed using the methods of inverse analysis, and the moisture diffusivity vs. moisture content function is determined.

there are a variety of methods for determination of moisture content. the most frequently used in building physics are the γ-ray attenuation technique [1–3] and the nmr technique [4–5]. other commonly used techniques include the capacitance method [6], positron emission tomography [7], neutron radiography [8] and the microwave method [9]. the tdr technique [10], originally used in soil science, and x-ray radiography, which evolved from the well-known medical technique [11–12], can also be applied to building materials. in this paper, we use the capacitance method and the microwave impulse method to determine the moisture profiles in porous building materials.

2 materials and samples

three typical building materials, namely cement paste, ceramic brick and autoclaved aerated concrete, were tested. the cement paste was prepared using portland cement cem i 32.5 r (env 197-1) (horní srní, cz) and water. the water to cement ratio w = 0.3 was chosen in our experiments. the bulk density of the cement paste was 1910 kg m⁻³. the ceramic brick was produced by the brick kiln at nebužely, cz. the bulk density of the ceramic brick was 1720 kg m⁻³. the autoclaved aerated concrete (aac) was produced by ytong (laussig, germany). its bulk density was 650 kg m⁻³.

for the measurements of the moisture profiles we used the following samples: 6 specimens of 20×40×280–300 mm for each measured material and each method. the samples were insulated on all lateral sides by water- and vapor-proof plastic foil. they were stored in laboratory conditions at a temperature of 25 °c and relative humidity of 50 %.
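the inverse analysis mentioned in the introduction is commonly based on the boltzmann transformation η = x/√t. a minimal sketch, assuming a single measured profile at one time t, a diffusivity that depends on moisture content only, and a semi-infinite specimen; the function and variable names are ours, not the paper's:

```python
import numpy as np

def boltzmann_diffusivity(x, w, t):
    """estimate the moisture diffusivity d(w) from one moisture profile
    w(x) measured at time t, via the boltzmann transformation
    eta = x / sqrt(t):  d(w) = -0.5 * (d eta / d w) * int_{w_dry}^{w} eta dw'.
    assumes a strictly monotonic profile (no duplicate w values).
    x in m, t in s, w in kg/kg."""
    eta = np.asarray(x, dtype=float) / np.sqrt(t)
    w = np.asarray(w, dtype=float)
    order = np.argsort(w)            # reorder from the dry end to the wet end
    w, eta = w[order], eta[order]
    # trapezoidal cumulative integral of eta with respect to w
    integral = np.concatenate(
        ([0.0], np.cumsum(0.5 * (eta[1:] + eta[:-1]) * np.diff(w))))
    detadw = np.gradient(eta, w)     # d eta / d w along the profile
    return w, -0.5 * detadw * integral

# example with a made-up profile at t = 3600 s (illustrative numbers only)
x = np.linspace(0.0, 0.12, 13)
w_profile = 0.15 * np.exp(-60.0 * x)          # decreasing from the wet end
w_sorted, d = boltzmann_diffusivity(x, w_profile, 3600.0)
```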
for the measurements of the moisture profiles we used the following samples: 6 specimens 20×40×280–300 mm for each measured material and each method. the samples were insulated on all lateral sides by water- and vapor-proof plastic foil. they were stored in laboratory conditions at a temperature of 25 °c and a relative humidity of 50 %.

fig. 1: block diagram of the capacitance device

3 experimental methods

3.1 capacitance method

the capacitance device designed in [6] was used for the measurements. a block diagram of the device is shown in fig. 1. the low-voltage supply drives an oscillator with a 400 khz working frequency, which has a constant output voltage feeding a circuit where the measuring capacitor (with the analyzed sample as a dielectric) is connected in series with a resistance. the voltage on this resistance, determined after rectifying, depends significantly on the moisture content of the dielectric. the relation between the moisture content and the voltage measured on the resistance can be determined by a calibration. the measured voltage increases with increasing capacity. a proper choice of the resistance can ensure that the dependence of the measured voltage on the capacity is linear in a range of approximately one or two orders of magnitude of the capacity. the voltage is recorded at specified time intervals by a data logger. the described capacitance moisture meter was equipped with electrodes in the form of parallel plates with dimensions of 20×40 mm. moisture meter readings along the rod specimen were taken every 5 mm in order to achieve a certain space averaging of the results and to reduce the effect of inhomogeneities of the material. the experimental setup is shown in fig. 2. the specimen is fixed in a horizontal position in order to eliminate the effect of gravity on the moisture transport. the lateral sides of the specimen are water- and vapor-proof insulated in order to simulate 1-d water transport. a viscous sponge ensuring a good contact of the surface of the specimen with the water is put into a perspex water-filling chamber and applied to one face side of the specimen. the sponge sucks water from a free surface, which is about 1 cm below the lower side of the specimen. the water in the chamber is maintained at a constant level using a float. if the water level in the filling chamber decreases due to water suction by the specimen, the water level in the float chamber decreases in the same way.
the needle of the float opens the hole in the cover of the float chamber and water from a burette flows through the hole into the float chamber until the needle closes the hole again due to the increase in the water level. in this way, a continuous water supply to the measured specimen is achieved.

for a particular material, the calibration curve of the capacitance moisture meter is usually determined in advance using the gravimetric method. in this case, we chose another method. the calibration was done after the last moisture meter reading, when the moisture penetration front was at about one half of the length of the specimen, using this last reading and the results of the standard gravimetric method measurements after cutting the specimen into pieces 1 cm in width. the final calibration curve for each material was constructed using the data for 3–6 samples. in the regression analysis, a logarithmic function was found to be the best approximation to the measured data. an example of the calibration curve is presented in fig. 3.

fig. 2: experimental setup of the capacitance method

fig. 3: calibration curve of the capacitance method for cement paste (fitted function y = 0.0752 ln(x) − 0.229, r² = 0.9904; moisture content [kg·kg−1] vs. moisture meter reading)

3.2 microwave impulse method

the measuring system designed in [13] was used for the experiments. it is relatively compact and consists of three basic components (see fig. 4), namely an impulse generator, an applicator and a sampling oscilloscope. the gpsi-1a generator (radan, ltd.) produces triangular impulses 250 ps in width and with an amplitude of 2 v. the apparatus consists of the impulse generator itself, its feed circuits, and controlling, auxiliary and protecting circuits. the energy output is realized by three sma coaxial connectors. these signals make it possible to determine the reference and measured position of the impulse and to synchronize the sampling oscilloscope. the applicator connected to the generator output ensures the necessary exposure of both measured and reference specimens. it consists of two pairs of transmitting and receiving antennas formed by coaxial/waveguide reducers and horns. the pairs of antennas are fixed parallel in separate holders ensuring a defined position, and therefore also stability and reproducibility of the measurements. the specimens of the tested materials are put into the applicator between the measuring antennas. the sample thickness is limited mechanically to about 100 mm; from the electric point of view it is limited by the attenuation in the measured material and the sensitivity of the oscilloscope. the dynamics of the signal is over 20 db. the tektronix 7603 sampling oscilloscope analyses the impulse signals. it has a 7t11a sampling sweep unit and two 7s11 sampling units with an s-4 sampling head. the time resolution of the oscilloscope is about 10 ps, and the sensitivity is 2 mv. the frequency range is up to 14 ghz. the signal from the oscilloscope display is recorded by a digital camera and analyzed by a pc. the experimental setup was very similar to that for the capacitance method. the details of the setup are shown in fig. 5. scanning by the microwave impulse moisture meter along the specimen was done every 10 mm.

similarly as with the capacitance method, the calibration curve was determined using the results of the last scan and the data obtained by the standard gravimetric method after cutting the specimen into pieces 10 mm in width. the calibration curve was constructed as the dependence of the moisture content on the permittivity of the measured material. the permittivity of the material was calculated on the basis of measuring the time difference Δt21 = t2 − t1, where t2 is the travel time of the impulse through the thickness of a measured specimen and t1 is the respective travel time in the air (see [13] for details of the calculation procedure). as in the case of the capacitance method, here too a logarithmic function was found to be the best approximation of the measured data. an example of the calibration curve is shown in fig. 6.

fig. 4: block diagram of the microwave impulse method

fig. 5: experimental setup of the microwave impulse method

fig. 6: calibration curve of the microwave impulse method for cement paste (fitted function y = 0.1251 ln(x) − 0.1929, r² = 0.945; moisture content [kg·kg−1] vs. permittivity)
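both calibration curves (figs. 3 and 6) are logarithmic functions of the form y = a·ln(x) + b obtained by regression analysis. a minimal sketch of such a fit using numpy; the data points below are invented for illustration and are not the measured calibration data:

```python
import numpy as np

def fit_log_calibration(x, y):
    """least-squares fit of y = a*ln(x) + b (returns a, b in that order)."""
    a, b = np.polyfit(np.log(x), y, deg=1)
    return a, b

# hypothetical points: meter reading vs. gravimetric moisture content [kg/kg]
reading = np.array([25.0, 50.0, 100.0, 150.0, 200.0])
moisture = np.array([0.013, 0.065, 0.117, 0.148, 0.169])

a, b = fit_log_calibration(reading, moisture)
print(f"y = {a:.4f} ln(x) {b:+.4f}")
```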
4 experimental results and discussion

fig. 7 presents typical moisture profiles in cement paste specimens measured by the capacitance method and the microwave impulse method.

fig. 7: typical moisture profiles in cement paste specimens measured by a) the capacitance method (profiles at times from t = 3600 s to t = 263400 s), b) the microwave impulse method (profiles at times from t = 76920 s to t = 261420 s); moisture content [kg·kg−1] vs. position [m]

a comparison of the two sets of curves shows three basic features. first, the agreement of the results obtained by the two methods is reasonably good for longer times, but for short times from the beginning of the experiment it is less good. second, the agreement is better for high moisture content than in the low moisture range. third, the data scattering seems to be higher for the microwave impulse method than for the capacitance method. however, an exact direct comparison of the measured moisture profiles is difficult in general. due to the characteristic features of each technique, the experimental data may be obtained for different positions and different time steps. therefore, making an exact comparison of a profile at a certain time step would only be possible by an interpolation in the time domain. to overcome this problem, the boltzmann transformation was applied to the experimental data. for each technique, the obtained moisture content versus distance profiles were replotted as moisture content versus η profiles, with η = x·t−1/2. if the boltzmann conditions are fulfilled (a constant boundary condition applied to a semi-infinite homogeneous medium that is initially at a uniform moisture content), all the measured moisture profiles should fall on a single η-profile.
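a minimal sketch of the boltzmann transformation applied to one measured profile; the positions and time below are illustrative placeholders:

```python
import numpy as np

def boltzmann_transform(position, time):
    """map a profile measured at a fixed time to the boltzmann
    variable eta = x * t**(-1/2); profiles that fulfil the boltzmann
    conditions collapse onto a single eta-profile."""
    return np.asarray(position, dtype=float) / np.sqrt(time)

# illustrative: positions [m] of one profile measured at t = 76800 s
eta = boltzmann_transform([0.00, 0.02, 0.04, 0.06, 0.08, 0.10], 76800.0)
```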
fig. 8: boltzmann transformed experimental data of the capacitance method, the microwave impulse method and the gravimetric method for cement paste (moisture content [kg·kg−1] vs. x·t−1/2 [m·s−1/2])

fig. 9: boltzmann transformed experimental data of the capacitance method, the microwave impulse method and the gravimetric method for ceramic brick

fig. 10: boltzmann transformed experimental data of the capacitance method, the microwave impulse method and the gravimetric method for autoclaved aerated concrete

figures 8–10 compare the boltzmann transformed experimental data of the capacitance method, the microwave impulse method and the gravimetric method for all three materials. in all cases, the boltzmann transformation seems to hold very well. however, a systematic difference between the capacitance method and the microwave impulse method in comparison with the gravimetric technique can be observed. the capacitance method shows a better agreement with the gravimetric method for lower values of moisture content, where the samples are almost dry (naturally wet). on the other hand, the microwave impulse method gives results closer to the gravimetric measurements in the range of the highest moisture content. the reasons for these differences lie in the physical background of the two methods. the accuracy of the capacitance method depends mainly on achieving an ideal contact of the sample surface with the probe, because any appearance of air bubbles on the surface can damage the accuracy substantially (two capacitors in series, one of them with a very low permittivity of the dielectric). once a good contact is achieved, the accuracy of the method depends just on the accuracy of the voltage measurement, which is only seldom a limiting factor. therefore, the accuracy should be comparable for both high and low moisture content. the microwave impulse method in the presented setup is a transmission-based technique. its accuracy depends mainly on the precision of measuring the time difference Δt21 = t2 − t1 (see above). the permittivity of bound water, which prevails in the material in the range of low moisture content, is about 20 times lower than the permittivity of free water. therefore, for low moisture content the time difference is much lower than for high moisture content, and the sensitivity and accuracy of the method is also substantially lower.

5 conclusions

both the capacitance method and the microwave impulse method were shown to be well applicable for the regular determination of moisture profiles in porous building materials. however, taking into account the particular advantages of each method, in order to achieve the highest possible accuracy the capacitance method can be recommended for measurements where lower moisture content is expected, while the microwave impulse technique is better for higher moisture content measurements.

acknowledgments

this research has been supported by the czech science foundation, under grant no. 103/03/0006.

references

[1] nielsen, a.: “gamma-ray attenuation used for measuring the moisture content and homogeneity of porous concrete.” building science, vol. 7 (1972), p. 257–263.
[2] kumaran, m. k., bomberg, m.: “a gamma-spectrometer for determination of density distribution and moisture distribution in building materials.” proceedings of the international symposium on moisture and humidity, washington dc, 1985, p. 485–490.
[3] descamps, f.: “continuum and discrete modelling of isothermal water and air transfer in porous media.” phd thesis, ku leuven, 1997, leuven, belgium.
[4] gummerson, f. j., hall, c., hoff, w. d., hawkes, r., holland, g. n., moore, w. s.: “unsaturated water flow
within porous materials observed by nmr imaging.” nature, vol. 281 (1979), p. 56–57.
[5] pel, l.: “moisture transport in porous building materials.” phd thesis, tu/e, eindhoven, 1995, netherlands.
[6] semerák, p., černý, r.: “a capacitance method for measuring the moisture content of building materials.” stavební obzor, vol. 6 (1997), p. 102–103 (in czech).
[7] hoff, w. d., wilson, m. a., benton, d. m., hawkesworth, m. r., parker, d., fowles, p.: “the use of positron emission tomography to monitor unsaturated flow within porous construction materials.” journal of materials science letters, vol. 5 (1996), p. 1101–1104.
[8] pel, l., ketelaars, a. a. j., odan, o. c. g., well, a. a.: “determination of moisture diffusivity in porous media using scanning neutron radiography.” international journal of heat and mass transfer, vol. 36 (1993), p. 1261–1267.
[9] hasted, j. b., shah, m. a.: “microwave absorption by water in building materials.” brit. j. appl. phys., vol. 15 (1964), p. 825–836.
[10] plagge, r., grunewald, j., haüpl, p.: “application of time domain reflectometry to determine water content and electrical conductivity of capillary porous media.” in: proceedings of the 5th symposium on building physics in the nordic countries, göteborg, sweden, 1999, p. 337–344.
[11] van besien, t.: “experimentele bepaling van de vochtdiffusiviteit met behulp van x-straal radiografie.” (experimental determination of moisture diffusivity using x-ray radiography). msc thesis, ku leuven, 2001, belgium.
[12] roels, s., van besien, t., carmeliet, j., wevers, m.: “x-ray attenuation technique for the analysis of moisture flow in porous building materials.” in: second international conference on research in building physics, j. carmeliet, h. hens, g. vermeir (eds.), a. a. balkema publishers, lisse, 2003, the netherlands, p. 151–157.
[13] pavlík, j., tydlitát, v., černý, r., klečka, t., bouška, p., rovnaníková, p.: “application of a microwave impulse technique to the measurement of free water content in early hydration stages of cement paste.” cement and concrete research, vol. 33 (2003), p. 93–102.

ing. pavel tesárek, phone: +420 2 2435 5436, e-mail: tesarek@fsv.cvut.cz
ing. jaroslav pavlík, phone: +420 2 2435 5436, e-mail: pavlikj@fsv.cvut.cz
prof. ing. robert černý, drsc., phone: +420 2 2435 4429, e-mail: cernyr@fsv.cvut.cz
department of structural mechanics, czech technical university in prague, faculty of civil engineering, thákurova 7, 166 29 prague, czech republic
condition indicators for gearbox condition monitoring systems

p. večeř, m. kreidl, r. šmíd

condition monitoring systems for manual transmissions based on vibration diagnostics are widely applied in industry. the systems deal with various condition indicators, most of which are focused on a specific type of gearbox fault. frequently used condition indicators (cis) are described in this paper. the ability of a selected condition indicator to describe the degree of gearing wear was tested using vibration signals acquired during durability testing of a manual transmission with helical gears.

keywords: damage detection, condition monitoring, condition indicators, transmissions, vibration.

1 introduction

condition monitoring systems are very important for researchers in gearbox development. they enable the detection of gear cracks during testing, and stop the test before the gear crack progresses. the researchers are then able to recognize where the crack began and to decide about the reason for the gearbox fault. consequently, the designers can take appropriate steps in gearbox design to improve gearbox performance. condition monitoring systems deal with various types of input data, for instance vibration, acoustic emission, temperature, oil debris analysis etc. systems based on vibration analysis, acoustic emission and oil debris are the most common and are very well established in industry. systems based on acoustic emission have a more obvious application to bearing monitoring than to gearing monitoring; however, some applications to gearbox condition monitoring have been introduced.

acoustic emission (ae) is usually defined as transient elastic waves generated from a rapid release of strain energy caused by a deformation or by damage within or on the surface of the material [1]. successful applications of ae to bearing condition monitoring have been presented in many papers. roger [2] describes the application to the monitoring of a slowly rotating anti-friction slew bearing mounted in cranes for gas production. al-ghamdi and his colleagues describe in [3] an experiment in which, with the use of basic cis (rms and max. amplitude), they try to identify the type and size of a bearing defect. they claim that ae is more sensitive to defect identification than vibration analysis and that the ae burst duration may indicate the size of the bearing defect. some authors describe the potential of ae for gearbox condition monitoring. singh [4], tandom [5] and siores [6] use simulated gearbox defects and sensors placed on bearings or on a gearbox housing. toutountzakis [7], sentoku [8] and miyachika [9] present applications to natural defects, and they use a slip ring to transfer the data from a rotating sensor. this type of sensor mounting ensures a direct transfer path for the ae signal. tan [10] deals with the sources of ae during the meshing of spur gears. he offers three possible sources of ae during the mesh: tooth resonance, the secondary pressure peak in lubricated gears, and asperity contacts. he considers the asperity contact to be the most important source of ae.

oil debris analysis is a very reliable method for detecting gearing damage in its early stages and allows an estimation of the wear level. during gearbox operation, the contacting surfaces of gearwheels and bearings are gradually abraded. small pieces of material break away from the contact surfaces and are carried off by the oil lubricating the gearwheels and bearings. by detecting the number and size of the particles in the oil, we can identify gear-pitting damage at an early stage, which is unidentifiable by vibration analysis. oil debris sensors are usually based on a magnetic or an optical principle. magnetic sensors measure the change in the magnetic field caused by metal particles in a monitored sample of oil. the oil debris monitoring system can be on-line or off-line. oil debris monitoring systems are usually more reliable than vibration based systems for early pitting failure detection [11]. a disadvantage of oil debris analysis is that it does not localize the failure in complicated gearboxes.
the oil used in the oil debris monitoring system should not dissolve the metal particles and spread a metal film onto the gearbox housing.

condition monitoring systems based on vibration analysis can monitor all parts of gearboxes, for example gearing, bearings and shafts. a typical condition monitoring system based on vibration monitoring is depicted in fig. 1. for a better signal to noise ratio, the raw vibration signal is filtered and pre-amplified. subsequently, the signal is processed in two different ways. the overall vibration level is monitored by an analog rms detector. if the vibrations exceed the selected level, the system stops the gearbox testing before the gearbox is destroyed. to make the system more sensitive to incipient gearbox failures, the analog signal is digitized. the time domain signal is synchronously averaged and consequently filtered to focus only on the important part of the vibration signal. then some condition indicators (cis) are computed. finally, the computed cis are compared in the decision making unit. if any of the indicators exceeds the limit, an alarm signal is generated. sometimes single condition indicators are not used and complete fourier spectra are compared instead.

since 1977, when stewart introduced the fm0 and fm4 condition indicators and some other cis [12], research in gearing condition indicators has progressed. zakrajsek [13] introduced na4, which is sensitive to damage growth. na4 was further improved in trending by decker [14] and also by dempsey [15] to decrease the na4 dependency on torque changes. zakrajsek [13] also presented the application in gearbox diagnostics of the cis m6a and m8a, originally developed by martin [16] for surface damage detection.

fig. 1: condition monitoring system based on vibration analysis
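the synchronous averaging step in the chain of fig. 1 can be sketched as follows, assuming the signal has already been re-sampled to a fixed number of samples per shaft revolution (the function name and parameters are illustrative, not the authors' code):

```python
import numpy as np

def synchronous_average(signal, samples_per_rev):
    """time-synchronous average: cut the re-sampled signal into
    one-revolution segments and average them, suppressing noise that
    is not synchronous with the shaft rotation."""
    n_rev = len(signal) // samples_per_rev
    segments = np.reshape(signal[:n_rev * samples_per_rev],
                          (n_rev, samples_per_rev))
    return segments.mean(axis=0)
```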
2 description of gearbox condition indicators

traditional cis deal with the data distribution. the main differences between these cis are in the signal from which the computations are made. generally, three types of signal are used: a raw, a residual and a difference signal. a residual signal is defined as a synchronously averaged signal without the gear mesh frequency, its harmonics, the driveshaft frequency and its second harmonics. if the first order sidebands about the gear mesh frequencies are also filtered out, a difference signal is created. however, these definitions are not strict. some authors leave the second harmonics of the driveshaft frequency in the residual signal or remove the second order sidebands from the difference signal.

2.1 root mean square value

the root mean square value (rms) for the velocity vibration signal is defined in eq. (1). in comparison with the definition of the kinetic energy $e_k$ (eq. (2)), it is obvious that the rms value computed from the velocity of the vibration signal describes the energy content of the signal.

$$v_{rms} = \sqrt{\frac{1}{t}\int_{0}^{t} v^{2}(t)\,\mathrm{d}t},$$ (1)

where $v_{rms}$ is the root mean square value of the velocity of the vibration signal, $t$ is the integration time and $v$ is the velocity of the moving object.

$$e_k = \frac{1}{2}\, m\, v^{2}(t),$$ (2)

where $e_k$ is the kinetic energy and $m$ is the weight of the moving object.

nowadays, however, digital signals are used more often than analog signals. the rms definition for a discrete signal is given in eq. (3):

$$s_{rms} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} s_{i}^{2}},$$ (3)

where $s_{rms}$ is the root mean square value of dataset $s$, $s_i$ is the $i$-th member of dataset $s$ and $n$ is the number of points in dataset $s$.

from the definition of rms it is obvious that the rms value does not increase with isolated peaks in the signal; consequently, this parameter is not sensitive to incipient tooth failure. its value increases as the tooth failure progresses. generally, the rms value of the vibration signal is a very good descriptor of the overall condition of the tested gearbox. this parameter is sensitive to gearbox load and speed changes. the main usage of this parameter is to monitor the overall vibration level. the test can then be stopped when the vibration energy reaches a critical value, and the destruction of the gearbox can be prevented. a typical time history of the rms value of the overall vibration signal during a gearbox durability test is depicted in fig. 2. as the gearbox gradually wears out, the vibration level increases. when pitting damage occurs, the vibration level increases; as the pitting damage progresses, the overall vibration level increases rapidly.

2.2 delta rms

this parameter is the difference between two consecutive rms values. it focuses on the trend of the vibration and is sensitive to vibration signal changes. theoretically, it allows the selection of an alarm level which is not sensitive to load; in practice, however, the parameter is sensitive to load changes. the theory behind this parameter states that if gear damage occurs, the vibration level will increase more rapidly than in a normal case without gear damage.

2.3 peak value

this value is the maximum value of the signal in a selected time interval. this parameter is usually not used alone.

2.4 crest factor

this parameter indicates damage in an early stage. it is defined as the peak value of the signal divided by the rms value of the signal:

$$cf = \frac{s_{peak-peak}}{s_{rms}},$$ (4)

where $cf$ is the crest factor, $s_{peak-peak}$ is the peak to peak value of the signal and $s_{rms}$ is the root mean square value of the vibration signal.

when only one tooth is damaged, there is no change in the rms value of the vibration signal during one rotation of the drive shaft where the damaged gear is located, while the peak value increases; therefore the crest factor increases. as the damage progresses, the root mean square value of the vibration signal increases and the crest factor decreases. as experiments show, this parameter enables very tiny surface damage to be discovered. the crest factor is often used in gearbox quality monitoring devices.
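a minimal numpy sketch of eqs. (3) and (4) for a discrete signal:

```python
import numpy as np

def rms(signal):
    """root mean square value of a discrete signal, eq. (3)."""
    s = np.asarray(signal, dtype=float)
    return np.sqrt(np.mean(s ** 2))

def crest_factor(signal):
    """peak-to-peak value divided by the rms value, eq. (4)."""
    s = np.asarray(signal, dtype=float)
    return np.ptp(s) / rms(s)
```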
2.5 energy operator

the energy operator (eo) is computed as the normalized kurtosis of a signal in which each point is the difference of two squared neighbouring points of the original signal [18]:

$$eo = \frac{n \sum_{i=1}^{n} (\Delta x_i - \overline{\Delta x})^{4}}{\left[\sum_{i=1}^{n} (\Delta x_i - \overline{\Delta x})^{2}\right]^{2}},$$ (5)

where $eo$ is the energy operator, $\overline{\Delta x}$ is the mean value of the signal $\Delta x$, $\Delta x_i = s_{i+1}^{2} - s_i^{2}$ and $n$ is the number of points in dataset $x$.

2.6 kurtosis

the shape of the amplitude distribution is often used as a data descriptor. kurtosis describes how peaked or flat the distribution is. if a vibration signal contains sharp peaks with a higher value, then its distribution function will be sharper. we can assume that these types of signals will be produced by a damaged gearbox. therefore the kurtosis value will be higher for a damaged gearbox than for a gearbox in good condition. a mathematical definition of kurtosis is given by eq. (6):

$$kurt = \frac{n \sum_{i=1}^{n} (s_i - \bar{s})^{4}}{\left[\sum_{i=1}^{n} (s_i - \bar{s})^{2}\right]^{2}},$$ (6)

where $kurt$ is the kurtosis, $n$ is the number of points in the time history of signal $s$ and $s_i$ is the $i$-th point in the time history of signal $s$. thus kurtosis is the fourth centralized moment of the signal, normalized by the square of the variance.

2.7 energy ratio

the energy ratio (er) is defined as the ratio between the energy of the difference signal and the energy of the regular meshing component [19]:

$$er = \frac{\sigma(d)}{\sigma(r)},$$ (7)

where $er$ is the energy ratio, $\sigma(d)$ is the standard deviation of the difference signal and $\sigma(r)$ is the standard deviation of the regular signal.

not all authors use the same definition for these signals. the regular meshing components are usually defined as the mesh frequency and its harmonics. consequently, the difference signal is defined as the remainder of the vibration signal after the regular meshing components are removed. the basic idea of this indicator is that energy is transferred from the regular meshing component to the rest of the signal as the wear progresses. this parameter is a very good indicator for heavy wear, where more than one tooth on the gearing is damaged.

fig. 2: time history of the vibration signal during a durability test of the gearing (l is the duration of the test normalized by the total test duration)

2.8 sideband level factor

the sideband level factor is defined as the sum of the first order sidebands about the fundamental gear mesh frequency divided by the standard deviation of the time signal average [20]. this parameter is based on the idea that tooth damage will produce an amplitude modulation of the vibration signal. for a gearbox in good condition this factor is near zero.

2.9 sideband index

the sideband index is defined as the average amplitude of the sidebands of the fundamental gear mesh frequency [21].

2.10 zero-order figure of merit

the zero-order figure of merit (fm0) parameter is defined as the quotient of the peak-to-peak value of the signal divided by the energy of the mesh frequency and its harmonics [21]. it is obvious that this parameter is similar to the crest factor. fm0 compares the peak value of the synchronously averaged signal to the energy of the regular signal, whereas the crest factor compares the peak value of the synchronously averaged signal to the energy of the synchronously averaged signal. therefore fm0 is more focused on the damage to the tested gearing.
$$fm0 = \frac{s_{peak-peak}}{\sum_{i=1}^{n} a(i)},$$ (8)

where $fm0$ is the zero-order figure of merit, $s_{peak-peak}$ is the peak to peak value of the vibration signal in the time domain and $a(i)$ is the amplitude of the $i$-th mesh frequency harmonic.

let us assume that one tooth on the meshing gear is slightly damaged. the gearing then produces a vibration signal with a significantly increased peak to peak value; however, the sum of the root mean square values remains approximately the same.

2.11 fm4 parameter

the fm4 parameter [18] is a simple measure of whether the amplitude distribution of the difference signal is peaked or flat. the parameter assumes that a gearbox in good condition has a difference signal with a gaussian amplitude distribution, whereas a gearbox with a defective tooth produces a difference signal with a major peak or a series of major peaks, resulting in a more peaked amplitude distribution. if more than one tooth is defective, the data distribution becomes flat and the kurtosis value decreases.

$$fm4 = \frac{n \sum_{i=1}^{n} (d_i - \bar{d})^{4}}{\left[\sum_{i=1}^{n} (d_i - \bar{d})^{2}\right]^{2}},$$ (9)

where $d_i$ is the $i$-th point of the difference signal in the time record and $n$ is the total number of points in the time record.

2.12 na4 parameter

the na4 parameter [18] was developed to improve the behavior of the fm4 parameter when more than one tooth is damaged. the first difference between na4 and fm4 is that na4 uses the residual signal to compute the kurtosis. the second difference is that an average value of the variance is used. thus, if the gear damage spreads from one tooth to another, the value of the average variance increases slowly and allows the na4 parameter to grow. the second reason why the na4 parameter increases its value is that the residual signal contains the first order sidebands, which increase when tooth damage occurs. the na4 parameter is defined by eq. (10):

$$na4 = \frac{n \sum_{i=1}^{n} (r_i - \bar{r})^{4}}{\left[\frac{1}{m}\sum_{j=1}^{m}\sum_{i=1}^{n} (r_{ij} - \bar{r}_j)^{2}\right]^{2}},$$ (10)

where $r_i$ is the $i$-th point in the time record of the residual signal, $r_{ij}$ is the $i$-th point in the $j$-th time record of the residual signal, $j$ is the current time record, $i$ is the data point number per reading, $m$ is the current time record in the run ensemble and $n$ is the number of points in the time record.

when the gear damage progresses, the averaged variance value increases rapidly, which results in a decrease of the na4 parameter. to overcome this problem, the na4* parameter was introduced: the fourth centralized moment of the residual signal is normalized by the average variance for a gearbox in good condition. this allows the na4* parameter to grow as the damage progresses.

$$na4^{*} = \frac{n \sum_{i=1}^{n} (r_i - \bar{r})^{4}}{\left[\mathrm{var}(r_{ok})\right]^{2}},$$ (11)

where $\mathrm{var}(r_{ok})$ is the variance for a gearbox in good condition, usually taken from the variance of a well-functioning gearbox.
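a minimal sketch of the na4 computation of eq. (10), keeping the run-ensemble variance history so that the averaged variance grows slowly as damage spreads (class and method names are illustrative, not the authors' code):

```python
import numpy as np

class NA4:
    """na4, eq. (10): fourth centralized moment of the current residual
    signal normalized by the squared average of the residual variances
    accumulated over the run ensemble."""

    def __init__(self):
        self.variances = []  # variance history over the run ensemble

    def update(self, residual):
        d = np.asarray(residual, dtype=float)
        d = d - d.mean()
        self.variances.append(np.mean(d ** 2))
        avg_var = np.mean(self.variances)
        return np.mean(d ** 4) / avg_var ** 2
```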
2.13 nb4 parameter

the nb4 parameter [22] is derived from the na4 parameter. na4 is computed from the residual signal, whilst nb4 is computed from the envelope signal. the computation procedure follows. a raw vibration signal is bandpass filtered about the gear meshing frequency. recommendations for the bandwidth differ: some authors suggest using a band pass filter with the bandwidth giving the maximum number of sidebands, whilst others use a filter with the bandwidth limited by the first harmonic different from the gear mesh frequency. after the unwanted part of the signal has been filtered out, the hilbert transform (eq. (12)) is used to create an analytic signal $\hat{a}$:

$$\hat{a}(t) = a(t) + j\,\tilde{a}(t), \qquad \tilde{a}(t) = \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{a(\tau)}{t-\tau}\,\mathrm{d}\tau,$$ (12)

where $\tilde{a}(t)$ is the hilbert transform and $a(\tau)$ is the input real analog signal. from the analytic signal, the envelope is simply calculated according to eq. (13); the result is the input for the na4-type computation:

$$|\hat{a}(t)| = \sqrt{a^{2}(t) + \tilde{a}^{2}(t)},$$ (13)

where $|\hat{a}(t)|$ is the envelope of the analytic signal, $a(t)$ is the input analog signal and $\tilde{a}(t)$ is the hilbert transform of the input signal.
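the envelope signal of eqs. (12)–(13) can be obtained with scipy's hilbert transform; a minimal sketch (the preceding band-pass filtering step is omitted here):

```python
import numpy as np
from scipy.signal import hilbert

def envelope(signal):
    """envelope of the analytic signal, eq. (13):
    |a_hat(t)| = sqrt(a(t)**2 + a_tilde(t)**2)."""
    return np.abs(hilbert(signal))
```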
3 experiments with selected cis

selected cis were used to monitor the condition of a gearbox during its durability test. the idea behind the test is to assess which ci best follows the condition of the gearbox. the cis were computed for the same input conditions, because of their dependency on torque and rotational speed.

3.1 data acquisition

the vibrations of the gearbox housing were measured during a simulated test drive on the test bench. one piezoelectric accelerometer was used for vibration signal acquisition. the transducer was placed near the differential gearing. the data from the accelerometer were recorded directly onto the pc hard disk for off-line analysis by the b&k multi-analyzer system type 3560. the b&k mm0024 photoelectric tachometer probe captured the rotational speed of the drive shaft during the test procedure.

3.2 data processing in the digital domain

because of slight speed changes, the raw vibration signal is re-sampled to obtain the same number of samples in each basic time interval. the basic interval matches one rotation of the drive shaft. the re-sampled signal is then divided into sections; the section lengths correspond to 10 rotations of the output shaft. after that, the sections are averaged to decrease the noise. the signal is then filtered to obtain the residual, difference and regular signals, and subsequently the cis are computed. the mean values of the cis are used as representative for each ci for a selected length of the input interval. the data processing is depicted in fig. 3.

fig. 3: signal processing during the test

3.3 experimental results

the main idea of most cis is as follows: the amplitude distribution of the vibration signal without the gear mesh frequencies differs more and more from a gaussian distribution as the gearbox wears. the changes in the amplitude distribution during the test are depicted in fig. 4. these changes are numerically expressed by the cis in figs. 5 and 6. the condition of the tested gearbox is very well represented by the rms value of the vibration signal. the trends of the rms, peak and crest factor values are depicted in fig. 5. the energy ratio has a steady trend except at the start of the test. the skewness and kurtosis do not reflect any explicit trend. the rms slightly increases during the test. the peak value has a similar trend to the rms value. the trend of the crest factor depends on the trends of the rms and peak values. fm0 follows the trend of the crest factor. the trends of fm4, na4 and nb4 are similar, and reflect the very fine wear of the teeth. at about 60 % of the test, fm4, na4 and nb4 rapidly increase in value, but the gearing does not show abnormal wear.

fig. 4: histograms of the synchronously averaged signal during the test

fig. 5: trend of rms, peak, crest factor, energy ratio, skewness and kurtosis (l is the actual duration of the test normalized by the total test duration)

fig. 6: trend of the fm0, fm4, na4 and nb4 parameters

4 conclusion

condition indicators are based on detecting differences between the amplitude distribution of a vibration signal with the gear mesh frequencies filtered out and the distribution of the signal for a gearbox in good condition. the rest of the vibration signal, without the gear mesh frequency, its harmonics and the shaft frequency, has an amplitude distribution close to a normal distribution for a gearbox in good condition. therefore many cis give a zero value for a gearbox in good condition. cis differ from each other in the signal that they use and in how they use the time history of the vibration signal. some of them are improved for torque independency. the experimental results indicate that condition indicators describing the overall vibration level (rms, peak) track the condition of the tested gearbox very well. typical gearbox monitoring systems are therefore based on order analysis and condition indicators describing the overall vibration value (rms, crest factor etc.). condition indicators provide the information that something has happened, and order analysis provides the information about what has happened. a condition indicator that indicates a gearbox fault at its beginning does not necessarily track the condition of the gearbox well during the test, e.g., the crest factor. the author ascribes this to the fact that the condition indicator was not computed continuously during the test; it was computed only from selected datasets of the vibration signal acquired during the test. the experimental results indicate that the monitoring of fine wear of gearing places big demands on precise data acquisition. the part of the vibration signal created by gearing wear can be masked by noise. in these cases, an additional method such as oil debris analysis should be used for precise condition monitoring.

acknowledgments

the research was supported by the research program no. msm6840770015 “research of methods and systems for measurement of physical quantities and measured data processing” of the ctu in prague, sponsored by the ministry of education, youth and sports of the czech republic.

references

[1] mathews, j. r.: acoustic emission. gordon and breach science publishers inc., new york, 1983.
[2] roger, l. m.: “the application of vibration analysis and acoustic emission source location to on-line condition monitoring of anti-friction bearings.” tribology international, 1979, p. 51–59.
[3] al-ghamdi, a. m., zhechkov, d., mba, d.: “the use of acoustic emission for bearing defect identification and estimation of defect size.” dgzfp-proceedings bb 90, 2004.
[4] singh, a., houser, d. r., vijayakar, s.: “detection of gear pitting.” power transmission and gearing conference, 1996, asme, de-vol. 88, p. 673–678.
[5] tandom, n., mata, s.: “detection of defects in gears by acoustic emission measurements.” journal of acoustic emission, vol.
17 (1999), issue 1–2, p. 23–27.
[6] siores, e., negro, a. a.: “condition monitoring of a gear box using acoustic emission testing.” material evaluation, 1997, p. 183–187.
[7] toutountzakis, t., mba, d.: “observation of acoustic emission activity during gear defect diagnosis.” ndt and e international, vol. 26 (2003), p. 471–477.
[8] sentoku, h.: “ae in tooth surface failure process of spur gears.” journal of acoustic emission, vol. 16 (1998), issue 1–4, s19–s24.
[9] miyachika, k., oda, s., koide, t.: “acoustic emission of bending fatigue process of spur gear teeth.” journal of acoustic emission, vol. 13 (1995), issue 1–2, s47–s53.
[10] tan, c. k., mba, d.: “the source of acoustic emission during meshing of spur gears.” ewgae 2004.
[11] dempsey, p. j., afjeh, a. a.: “integrating oil debris and vibration gear damage detection technologies using fuzzy logic.” international 58th annual forum and technology display, quebec (canada), june 11–13, 2002.
[12] stewart, r. m.: “some useful data analysis techniques for gearbox diagnostics.” report mhm/r/10/77, machine health monitoring group, institute of sound and vibration research, university of southampton, july 1977.
[13] zakrajsek, j. j., townsend, d. p., decker, h. j.: “an analysis of gear fault detection methods as applied to pitting fatigue failure data.” nasa tm-105950, presented at the 47th meeting of the society for machinery failure prevention technology, april 1993.
[14] decker, h. j., handschuh, r. f., zakrajsek, j. j.: “an enhancement to the na4 gear vibration diagnostic parameter.” 18th annual meeting, vibration institute, hershey, pa, june 20–23, 1994.
[15] dempsey, p. j., zakrajsek, j. j.: “minimizing load effects on na4 gear vibration diagnostic parameter.” nasa tm-2001-210671, glenn research center, cleveland, oh, feb. 2001.
[16] martin, h. r.: “statistical moment analysis as a means of surface damage detection.” proc. of the 7th international modal analysis conference, society for experimental mechanics, schenectady, n.y., 1989.
[17] mosher, m., pryor, a. h., huff, e. m.: “evaluation of standard gear metrics in helicopter flight operation.” 56th mechanical failure prevention technology conference, virginia beach, va, april 15–19, 2000.
[18] decker, h. j.: “crack detection for aerospace quality spur gears.” 58th international annual forum and technology display sponsored by the american helicopter society, montreal, quebec, canada, june 11–13, 2002.
[19] keller, j. a., grabill, p.: “vibration monitoring of uh-60a main transmission planetary carrier fault.” american helicopter society 59th annual forum, phoenix, arizona, may 6–8, 2003.
[20] szczepanik, a.: “time synchronous averaging of ball mill vibration.” mechanical systems and signal processing, vol. 3 (1989), no. 1, p. 99–107.
[21] stewart, r. m.: “some useful data analysis techniques for gearbox diagnostics.” report mhm/r/10/77, machine health monitoring group, institute of sound and vibration research, university of southampton, july 1977.
[22] lebold, m., mcclintic, k., cambell, r., byington, c., maynard, k.: “review of vibration analysis methods for gearbox diagnostics and prognostics.” proceedings of the 54th meeting of the society for machinery failure prevention technology, virginia beach, va, may 1–4, 2000, p. 623–624.
[23] dempsey, p.
j.: “a comparison of vibration and oil debris gear damage detection methods applied to pitting damage.” 13th international congress on condition monitoring and diagnostic engineering management sponsored by the society for machinery failure prevention technology, houston, texas, december 3–8, 2000.
[24] dempsey, p. j., zakrajsek, j. j.: “minimizing load effects on na4 gear vibration diagnostic parameter.” 55th meeting sponsored by the society for machinery failure prevention technology, virginia beach, virginia, april 2–5, 2001.

ing. petr večeř, e-mail: vecerp@fel.cvut.cz
doc. ing. marcel kreidl, csc., phone: +420 224 352 117, e-mail: kreidl@fel.cvut.cz
doc. ing. radislav šmíd, ph.d., phone: +420 224 352 131, e-mail: smid@fel.cvut.cz
department of measurements, czech technical university in prague, faculty of electrical engineering, technická 2, 166 27 prague 6, czech republic
mathematical modeling of a cs(i) – sr(ii) – bentonite – magnetite sorption system, simulating the processes taking place in a deep geological repository

h. filipská, k. štamberg

the derivation of mathematical models of systems consisting of cs(i) or sr(ii) and of bentonite (b), magnetite (m) or their mixtures (b+m) is described. the paper deals especially with the modeling of the protonation and sorption processes occurring on the functional groups of the solid phase, namely on so-called edge sites and layer sites. the two types of sites have different properties and, as a result, three types of surface complexation models (scm) are used for edge sites, viz. two electrostatic scms: the constant capacitance model (ccm) and the diffusion double layer model (dlm), and one without electrostatic correction: the chemical model (cem). the processes taking place on the layer sites are described by means of the ion exchange model (iexm). in the course of modeling, the speciation of the given metal in the liquid (aqueous) phase has to be taken into account. in principle, the model of protonation or sorption processes is based on the reactions occurring in the aqueous phase and on the surface of the solid phase, and comprises not only the equations of the equilibrium constants of the individual reactions, but also the mass and charge balance equations. the algorithm of the numerical solution is compatible with famulus 3.5 (a czech software product quite extensively used at czech universities in the last decade, the bookcase codes of which are utilized).

keywords: cesium, strontium, bentonite, magnetite, surface sorption, ion exchange, protonation, titration, mathematical modeling.

1 introduction

in our deep geological repository, we plan to use canisters made of stainless and carbon steel, with a compacted bentonite barrier. the use of bentonite as a backfill barrier in repositories for nuclear waste is based mainly on its low permeability, swelling properties and capability to significantly retard the migration of most radionuclides (rn) [1,2,3]. as a result of corrosion processes, the main corrosion product of steel canisters – magnetite – will also form an important part of the barrier. to be able to predict the fate of rn in repositories, the retardation processes, i.e.
mainly the sorption processes, have to be studied, and mathematical sorption models of the ‘rn – bentonite – magnetite’ system have to be developed.

the bentonite (clay) surface contains at least two types of surface groups. the first type are permanently charged functional groups created by ionic substitution within the crystal structure. isomorphic substitution of, e.g., al3+ for si4+ within the tetrahedral layer creates a permanent negative charge on the mineral surface, which is compensated externally by cations. these inter-structural-charge surface sites are denoted as layer sites. on the edges of the surface structure, there are (≡soh) sites with a ph-dependent charge. this is due to the “adsorption” of h+ ions (then the so-called protonation proceeds: ≡soh → ≡soh2+) or the “desorption” of h+ ions (then deprotonation proceeds: ≡soh → ≡so−), depending on the ph of the solution. these variable-charge surface sites are designated as edge sites. the two surface site types are responsible for two uptake processes. the first, on layer sites, tends to be dominant at low ph and/or high sorbate concentrations; the mechanism of this process is cation exchange. the second process occurs on edge sites, depending on ph; its mechanism is surface complexation. as for magnetite, both types of surface sites were also found [4].

the aim of this work was to derive the mathematical models of the protonation and sorption processes occurring in the ‘rn – bentonite – magnetite’ system using the famulus software product, including the newton-raphson nonlinear regression method [5], and to prepare codes for evaluating the experimental data.

2 derivation of models of protonation and sorption processes

2.1 types of models and basic modeling approaches

in principle, four types of sorption models can be used. the simplest is the kd model, characterizing the linear equilibrium isotherm. the second, e.g. the langmuir equation, describes a non-linear sorption isotherm. the third is based on the application of a classical chemical-equilibrium equation or equations, e.g., the ion exchange model (iexm). the fourth type is represented by surface complexation models (scms) [6, 7, 8]. as was mentioned above, iexm and scms describe the processes occurring on layer sites and edge sites, respectively, and as a result, they can be designated as the most important models. at least three types of surface complexation models (scms), namely two electrostatic models, the constant capacitance model (ccm) and the diffuse double layer model (dlm), and one chemical equilibrium, non-electrostatic
model (cem), were employed to simulate the amphoteric property of solid phases of the bentonite, goethite and montmorillonite type, and to describe the sorption of different metal ions and their complex compounds from an aqueous solution as a function of ph, ionic strength, solution concentration, etc. well-developed scms enable the behaviour of systems of the rn-bentonite-magnetite type to be quantitatively described and the sorption processes to be simulated. scms and iexms are based on the following suppositions:

• no mutual interactions exist between particles adsorbed on the surface of the solid phase.
• the protonation and deprotonation processes depend on ph.
• on the surface of the solid phase (e.g. bentonite), as was mentioned above, there are two types of functional groups, i.e. so-called edge sites and layer sites.
• the functionality of the adsorption sites, i.e. edge sites or layer sites, is identical.
• the concentration of the i-th component in the aqueous phase near the surface (in the aqueous layer adhering to the surface) is given by the boltzmann equation (1):

$$(c_i)_s = c_i \exp\left(-\frac{z\,f\,\psi}{r\,t}\right),$$ (1)

where $(c_i)_s$ is the concentration near the surface, $c_i$ is the so-called bulk concentration of the $i$-th component, $z$ is the charge of the $i$-th component, $\psi$ is the electrostatic potential, $f$ is the faraday constant, $r$ is the gas constant and $t$ is the absolute temperature.

• between the surface charge, $\sigma$, and the electrostatic potential, $\psi$, the following dependencies hold (2, 3, 4):

$$\sigma = g\,\psi \quad \text{(in the case of ccm)},$$ (2)

$$\sigma = 0.1174\,\sqrt{i}\,\sinh\left(\frac{z\,f\,\psi}{2\,r\,t}\right) \quad \text{(in the case of dlm)},$$ (3)

$$\sigma = 0 \quad \text{(in the case of cem)},$$ (4)

where $g$ is the so-called helmholtz capacitance and $i$ is the ionic strength.

in order to apply scms for describing the sorption processes occurring in a system of the type rn-bentonite-magnetite, we need the surface area of each solid phase component, the total surface site concentrations (edge and layer sites), the protonation constants (for edge sites) and the ion exchange constant (for layer sites). these can be obtained by evaluating the acid-base titration data (in our case of bentonite, magnetite and their mixtures) and by measuring the surface area. in the course of evaluating the sorption experimental data, specifically the sorption dependencies on ph, the values of the equilibrium constants of the surface complexation reactions are found.
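a minimal numerical sketch of eqs. (1) and (3): for the dlm, the potential is obtained by inverting the sinh term, and the boltzmann factor then gives the surface-to-bulk concentration ratio. the 0.1174 prefactor is taken over from eq. (3) as printed (it presumes 25 °c and the units used there); all sample values are illustrative:

```python
import numpy as np

F = 96485.0   # faraday constant [C/mol]
R = 8.314     # gas constant [J/(mol K)]
T = 298.15    # absolute temperature [K]

def dlm_potential(sigma, ionic_strength, z=1):
    """invert eq. (3), sigma = 0.1174*sqrt(I)*sinh(z*F*psi/(2*R*T)),
    for the electrostatic potential psi."""
    return 2.0 * R * T / (z * F) * np.arcsinh(sigma / (0.1174 * np.sqrt(ionic_strength)))

def boltzmann_factor(psi, z=1):
    """surface-to-bulk concentration ratio, eq. (1)."""
    return np.exp(-z * F * psi / (R * T))

# illustrative: sigma = 0.05 C/m^2 at ionic strength 0.1 mol/dm^3
psi = dlm_potential(0.05, 0.1)
ratio = boltzmann_factor(psi)
```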
as for the modeling approaches used in the case of rn-bentonite-magnetite type system, two different procedures can be tested: � the generalized composite approach (gc), where the given mixture of solid phase components, e.g. bentonite � magnetite, is considered as a compact sorbent characterized by a single set of titration and sorption parameters, which are sought by direct fitting of the experimental data. � the component additivity approach (ca), composed of a weighted combination of models describing the protonation � ion exchange (on layer sites) and the sorption on individual solid phase components, e.g. on bentonite and magnetite. the individual parameters have to be obtained by fitting the appropriate experimental data that are valid for the solid component. it is evident that the gc approach demands less experimental time than the ca approach because we need only one titration curve and one sorption ph-dependency for the given mixture of solid components. on the other hand, if the protonation � ion exchange and sorption quantities characterizing the individual solid phase (mineralogical) components are determined, the ca approach enables the sorption behaviour of selected component mixtures to be simulated. it is also evident that the codes corresponding to both the gc and the ca approaches are in principle different: whereas the gc approach code has to be based on a non-linear regression procedure, the ca approach code is relatively simple, even if internal iterative loops also have to be used. 2.2 modeling of titration curves description of the system so h soh0� �� � (5) soh h soh0 2� � � � (6) x xna h h na� � �� � (7) where so�, soh and soh2 � are symbols for edge sites, and x� is the symbol for layer sites. in the course of titration, protonation reactions (5) and (6) on the edge sites, and the ion exchange reaction (7) on the layer sites are under way in the system. titration occurs under an inert atmosphere (e.g. n2), and as a result, the presence of atmospheric co2 need not be taken into account. the determination of the titration curve starts at approx. ph 7, namely the titration of an aqueous suspension of the solid phase, having a given ionic strength, with both hcl and naoh solutions such that the total titration curve consists of two parts – two sets of experimental points obtained with both hcl and naoh. derivation of cem, ccm and dlm titration models it is supposed that the electroneutrality represented by eq. (8) exists between positive and negative charged surface groups and negative and positive charged species in solution in the course of titration. � � � � � �� � � � � �� � � � � �� � m v c m x v c b a � � � � � � � � � � � � � � soh h so oh 2 + � � (8) by rearrangement of eq. (8) we can obtain eq. (9). it describes the course of titration relating the surface charge, q [mol/kg], and the experimentally measurable parameters: � � � � � � � � � � � � � �� � q x v m c ca b � � � � � � � � � � � soh so oh h 2 +� (9) � �c v c va � �h h � (10) 12 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 45 no. 5/2005 czech technical university in prague � �c v c vb � �oh oh � , (11) where v� ( � v0 + vh + voh) is the total volume and v0 is the starting volume of the aqueous phase [dm3]; m is the mass of the solid phase [kg]; [ca] and [cb] are the bulk concentrations of acid (i.e. of [cl�]) and caustic soda (i.e. 
of [na�]), respectively, in solution [mol�dm�3]; ch and coh are the concentrations, [mol�dm�3], of hydrochloric acid, hcl, and caustic soda, naoh, respectively, used in titration; vh and voh are the consumptions, [dm3], of acid and caustic soda, respectively, in the course of titration. eq. (9) consists of two parts: the right-hand side can be designated as the experimentally determined values of the surface charge, qexp, and the left-hand side expresses the sum of the two values of the surface charges, qcal (� qes � qls), namely the charge of the edge sites, qes (� [soh2 �] � [so�]) and the charge of the layer sites, qls (� �[x �]). the function (qexp)i � f(phi) or (qexp)i � f([h �]i), i � 1, 2, …, np, describes the experimental titration curve, consisting of np experimental points. the goal of modeling the titration curve is to construct the function qcal � f([h �]) applicable for fitting of experimental data and for determining of the values of the protonation and ion exchange constants, and the total concentrations of the edge sites and layer sites in the given solid phase. as for the numerical method, the acid-base titration data are fitted by the newton-raphson multidimensional non-linear regression method and the quantity wsos/df (see eqs. (83) and (84)) is used as the criterion for the goodness-of-fit. derivation of q cal � f([h �]) using cem (chemical equilibrium model) as was mentioned above, it is assumed that in the case of cem � � 0 and as a result, the concentrations of the ith-components existing near the surface, (ci)s, equals the bulk concentrations, ci (see eq. (1)). the protonation constants of the edge sites, ks1 and ks2, are given by eqs. (12) and (13), respectively. now, using the balance equation (14) for the total concentration of the edge sites, �soh, together with eqs. (12) and (13), we derive eq. (15) corresponding to the function qes � f([h �]): so h soh� �� � 0, � � � � � � ks1 soh so h � �� � 0 , (12) soh h soh2 +0 � �� , � � � � � � ks2 soh soh h 2 + � � �0 , (13) � � � � � ��soh soh soh so2+� � � �0 , (14) � � � � � � q ks ks ks ks ks es soh 1 2 h 1 h 1 2 h � � � �� � � � � � � �� � � � � 2 2 1 1 � � [mol�kg�1]. (15) the surface charge can also be expressed in coulomb per m2, �es [c�m �2]: � es es� q f sp [c�m�2]. (16) as regards layer sites, the ion exchange reaction takes place on these sites, and the equilibrium constant, ks5, is given by eq. (17). furthermore, the dissociation of xh and xna according to equations (18) and (19), respectively, needs to be taken into consideration. on the basis of the literature data, it is supposed that for the corresponding values of the dissociation constants, it holds: ks5b � ks5a and ks5a � 10�2. it follows from this that the dissociation of xh can be neglected and that the concentration of xna is very low and therefore xna is practically dissociated. x xna h h na� � �� �, � � � � � � � � ks x x 5 � � � � � h na na h , (17) x xh h� �� �, � � � � � � ks b x x 5 � �� �h h , (18) x xna na� �� �, � � � � � � ks b x x 5 � �� �na na . (19) if the function qls � f([h �]) is to be derived, two balance equations are needed, viz., the first balances the layer sites, cf. eq. (20) for total concentration of layer sites �x, and the second balances the sodium ions, cf. eq. (21) from which eq. (22) can be obtained. � � � � � � � ��x x x x x� � � ��na h h , (20) � � � � � � � � � � na na na na na 0 0 0 oh oh� � � � � � � � � m x v v c m x v , (21) � � � � � � � �� �na na na na0 0 oh oh 0� � � � � �v v c m x x v� . 
(22) in our case, the starting concentration [na0] equals the starting value of the ionic strength achieved by adding nano3, or naclo4 and as for [xna0], if the starting ph of titration is approx. 7, it is assumed to be approximately equal to the value of �x (cf. eq. (20)). � � � � � � � � � � q x x x ks ls na na na h � � � � � � � � � � 3 [mol�kg�1] (23) also in this case, the surface charge can be expressed in coulomb per m2, �ls: � ls ls� �q f sp [c�m�2]. (24) it is evident that the function qls � f([h �]) is given by the combination of eqs. (23) and (22). altogether, the cem function of the titration curve, namely qcal � qes � qls � f([h �]), which can be used for evaluating the experimental data, consists of equations (15), (22) and (23). formally, the total surface charge in coulomb per m2, ��, can be given by eq. (25): � cal cal� �q f sp [c�m2]. (25) derivation of q cal � f([h �]) using ccm (constant capacitance model) in principle, the construction of this model function is congruent with the construction of the cem model. however, the value of electrostatic potential � does not equal zero and © czech technical university publishing house http://ctn.cvut.cz/ap/ 13 czech technical university in prague acta polytechnica vol. 45 no. 5/2005 the boltzman equation, eq. (1), has to be used for calculating the component concentrations existing near the surface, (ci)s. these are then inserted into the model equations (15), (22) and (23) instead of the bulk concentrations. first, quantity � must be calculated, namely by means of equations (2) and (25). after the rearrangement procedure, equation (26) is obtained, for the solution of which a suitable interpolation method has to be used. q f sp gcal � � � �� 0 (26) � � � � � � q ks ks ks ks ks cal s s s soh 1 2 h h 1 2 h � � � �� � � � � � � � � � � � 2 1 1 1 � � � � � � 2 1 3 � � � � � � � � � � � � �x ks na h na mol kg s s s [ ] , (27) � � � �h hs� �� � �� � � exp �f rt (28a) � � � �na nas� �� � �� � � exp �f rt (28b) where the symbols with subscript “s” mean the concentrations near the surface. the proper ccm model function consists of four equations, namely (26), (27), (28a) and (28b). the values of the quantities searched, viz. ks1, ks2, ks3, �soh, �x and g, are determined in the course of simultaneous solution of the above mentioned equations using a non-linear regression procedure. derivation of qcal � f([h �]) using dlm (diffuse layer model) similar to ccm, quantity � must first be calculated using eqs. (3) and (25) rearranged into the form of eq. (29): q f sp i f rt cal � � � � � � � �� � �� �01174 2 0. sinh � (29) the proper dlm model function consists of four equations, namely (29), (27), (28a) and (28b), which can subsequently be used for determining ks1, ks2, ks3, �soh and �x by the procedure described above. 2.3 modeling of ph dependences of sorption description of the system the derivation of scms (cem, ccm, dlm) for sorption of strontium on bentonite or magnetite will be demonstrated, as an example. the sorption experiments are carried out in a mixed batch reactor under given conditions, i.e., under given values of the starting volume, v0, and composition of the aqueous phase (as a starting solution, synthetic granitic water is used) and mass of solid phase, m. the value of the given ionic strength, i, is adjusted using nano3. the reaction time (time of contact of the phases, e.g. approx. 30 days) must be sufficient for equilibrium to be reached. 
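Before moving on to the sorption models, the titration functions of section 2.2 can be illustrated numerically. The sketch below is ours (the paper's codes were written in FAMULUS with Newton-Raphson regression): it implements the CEM edge-site charge of eq. (15) and solves the CCM self-consistency condition of eqs. (26)-(28) for the potential ψ by root bracketing. All parameter values are invented for demonstration, and the layer-site term is omitted for brevity.

```python
import math
from scipy.optimize import brentq

F, R, T = 96485.0, 8.314, 298.15   # C/mol, J/(mol K), K

def q_edge(h, ks1, ks2, gamma_soh):
    """Eq. (15): edge-site charge [mol/kg] at proton concentration h;
    ks1, ks2 are the protonation constants, gamma_soh the total
    edge-site concentration [mol/kg]."""
    num = ks1 * ks2 * h**2 - 1.0
    den = 1.0 + ks1 * h + ks1 * ks2 * h**2
    return gamma_soh * num / den

def solve_psi_ccm(h_bulk, ks1, ks2, gamma_soh, sp, g):
    """Eqs. (26)-(28): find psi such that q_cal([H+]_s)*F/sp - g*psi = 0,
    with [H+]_s = [H+]*exp(-F*psi/(R*T)); sp = specific surface [m^2/kg],
    g = Helmholtz capacitance [F/m^2]. Layer sites omitted for brevity."""
    def residual(psi):
        h_s = h_bulk * math.exp(-F * psi / (R * T))
        return q_edge(h_s, ks1, ks2, gamma_soh) * F / sp - g * psi
    return brentq(residual, -0.5, 0.5)   # the residual changes sign on this bracket

ks1, ks2 = 10**7.5, 10**4.0            # invented protonation constants
gamma_soh, sp, g = 0.05, 3.0e4, 1.2    # mol/kg, m^2/kg, F/m^2 (invented)
for ph in (4, 6, 8, 10):
    psi = solve_psi_ccm(10.0**(-ph), ks1, ks2, gamma_soh, sp, g)
    print(f"pH {ph}: psi = {1e3 * psi:6.1f} mV")
```

The DLM variant of eq. (29) only replaces the g·ψ term by the sinh expression of eq. (3); the same bracketing solve applies.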
because the system is open to the atmosphere, the influence of atmospheric partial pressure of co2 is taken into consideration, especially if ph is greater than approx. 8.2. in the course of the sorption experiment, the influence of ph is observed, namely in such a way that each experimental point has a given ph, the value of which is adjusted by means of 0.1 m hcl or 0.1 m naoh. the sorption phenomenon is observed to proceed with the formation of a surface complex, or complexes, including ion exchange on the layer sites, as described below, between the surface groups and various species of strontium (sr2�, srco3 0, srno3 �) present in the experimental solution. these species compete with each other to form a surface complex with the solid phase, and the values of the corresponding equilibrium constants quantify this competition. the input data include among others the protonation and ion exchange constants and the total concentrations of the edge and layer sites, or the helmholtz capacitance, determined in the course of evaluating the titration curve (ks1, ks2, ks5, �soh, �x, or g). derivation of cem, ccm and dlm sorption models the experimental sorption data are in the form: kdexp � f(phexp) or %sorptionexp � f(phexp), where kdexp is the distribution coefficient of sorption of sr(ii) and %sorptionexp expresses the sorption of sr(ii) in percentage units. as a result, the analogous model function, namely kdcal � f(ph) or %sorptioncal � f(ph), has to be derived using the following procedure. firstly, let the reactions occurring on the solid phase (e.g. bentonite) and in the aqueous solution be formulated (the symbols for the corresponding equilibrium constants are given in parenthesis): reactions on edge-sites so h soh0� �� � (ks1) see eq. (5) soh h soh0 2� � � � (ks2) see eq. (6) soh co soh co2 3 2 2 3 � � �� � (ks3) (30) soh hco soh hco2 3 2 3 � �� � 0 (ks4) (31) so sr sosr� � �� �2 (ksr1) (32) 2so sr so) sr2 � �� �2 0( (ksr2) (33) so srno sosrno3 3 0� �� � (ksr3) (34) so srco sosrco3 3 � �� �0 (ksr4) (35) reactions on layer-sites x xh na na h� � �� � (ks5) see eq. (7) 2 h sr sr 2h2x x� � � � �2 (ksr5) (36) x xh srno srno h3 3� � � � � (ksr6) (37) reactions in an aqueous solution sr co srco3 3 2 2 0� �� � (kl1) (38) sr no srno3 3 2� � �� � (kl2) (39) sr so srso4 4 2 2 0� �� � (kl3) (40) sr co srco3 3 2 2� �� � . solid (sr) (41) h co hco3 3 � � �� �2 (k8) (42) 2h co h co3 2 3 � �� �2 (k9) (43) 2h co co h o)3 2 2 � �� � �2 ( (kp) (44) 14 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 45 no. 
5/2005 czech technical university in prague h o h oh2 � � � � (kw) (45) secondly, the equilibrium constants of the reactions mentioned above have to be established: constants holding for edge-sites (using the boltzman equation (1)): � � � � � � � � � � � � ks f rt 1 � � � � � � � � � � � � � � � soh so h soh so h 0 s 0 exp � (46) � � � � � � ks f rt 2 0 � � � � � � � � � � � soh soh h 2 exp � (47) � � � � � � ks f rt 3 2 � � � � � � �� � � � � � soh co soh co 2 3 2 3 exp � (48) � � � � � � ks f rt 4 0 � � � � � � �� � � � � soh hco soh hco 2 3 2 3 exp � (49) � � � � � � ksr f rt 1 2 2 � � � � � � � � � � � � sosr so sr exp � (50) � � � � � � ksr f rt 2 2 0 2 2 � � � � � � � � � � � (so ) sr so sr 2 2 exp � (51) � � � � � � ksr f rt 3 0 � � � � � � � � � � � sosrno so srno 3 3 exp � (52) � � � � � � ksr 4 0 � � � � � � � sosrco so srco 3 3 (53) constants holding for layer sites: � � � � � � � � ksr x x 5 2 2 � � � � � � � � 2 2 sr h h sr (54) � � � � � � � � ksr x x 6 2 � � � � � � � � srno h h srno 3 3 (55) constants of reactions occurring in an aqueous solution (for i � 0): � � � � � � kl1 � � � � � � � srco sr co 3 0 2 3 2 (56) � � � � � � kl2 � � � � � � � srno sr no 3 + 2 3 (57) � � � � � � kl3 � � � � � � � srso sr so 4 0 2 4 2 (58) � � � �sr � �� �sr co322 (59) � � � �kw � �� �h oh (60) � � � � � � k 8 � � � � � � � � hco h co 3 3 2 (61) � � � � � � k 9 0 2� � � � � � � h co h co 2 3 3 2 (62) � � � � kp p � �� � co co h 2 3 2 2 (63) where sr is the solubility product of srco3. solid and pco2 is the partial pressure of co2 (in our case, it deals with atmospheric partial pressure). the davies equation (64) for the i-th ionic species and the hegelson equation (65) for the i-th neutral species are used for calculating the activity coefficients by means of which the values of the equilibrium constants are corrected for the given ionic strength, i. log .f a z i i ii i� � � � � � � � 2 1 0 24 , (64) where a � 0.509 (holds for aqueous solutions and ambient temperature), log f b ii � � (where b � 0.01–0.10). (65) thirdly, the balance equations are formulated: balance equation of the surface charge, �: � � � � � �� � � � �� � � � � � � � � � � � � � � soh sosr so sosrco soh co c m 2 3 2 3 f sp [ ]2 (66) it is evident that only charged surface species are taken into consideration. as was described above in connection with the derivation of the ccm and dlm models, it is necessary the quantity � needs to be determined, namely by solving eqs. (67) and (68) originating from combination of eq. (66) with eq. (3) or (2), respectively. depending on the type of surface complexation model, the obtained value of � is then inserted into eqs. (46)–(52). dlm i f rt : . sinh� � � � � �01174 2 0 (67) ccm g: � �� � � 0 (68) cem: � �� �0 0i.e. (69) balance equation of total concentration of the edge sites, �soh: � � � � � � � � � � � � �soh soh soh so sosr (so) sr sosrno sosrco 2 2 3 � � � � � � � � � � 2 0 0 � � � � � � 3 2 3 2 3soh co soh hco � �� � 0 . (70) © czech technical university publishing house http://ctn.cvut.cz/ap/ 15 czech technical university in prague acta polytechnica vol. 45 no. 5/2005 balance equation of concentration of sr(ii) in the aqueous phase, [sr]solution: � � � � � � � � � �sr sr srco srno srsosolution 2 3 3 4� � � �� �0 0 . (71) balance equation of the total concentration of sr(ii) in the solution, [�sr]solution, including the precipitate: � � � � � � � � � � � � �sr sr srco srno srso srco solution 2 3 3 4 3 � � � � � � �0 0 . 
solid (72) balance equation of the total concentration of sr(ii): � � � � � � � �� � � � �� v v solid m � � � � � � � � � � �sr sr srco srno srso srco so 2 3 3 4 3 0 0 0 . � � � � � �� � � � � � �� sr (so) sr sosrno sosrco sr srno 2 3 3 2 3 � � � � � � � 0 0 x x . (73) balance equation of nitrates: � � � � � �� � � � � �� � v v m x � � � � � � � � �no no srno sosrno srno 3 3 3 3 3 0 0 0 . (74) balance equation of sulphates: � � � � � �� �so so srso4 4 42 0 2 0� �� � . (75) it is assumed that sulphates are present only in solution. balance equation of carbonates: � � � � � �� � � � �� � � v v m � � � � � � � � � � � � � co co hco h co srco sosrco soh 3 3 3 2 3 3 3 2 0 2 0 � �� � �� 2 3 2 3 co soh hco � � 0 . (76) where v is the volume of the aqueous phase, m is the mass of the solid phase, e.g. bentonite, and [sr]0, [no3 �]0, [so4 2�]0 and [co3 2�]0 are the starting concentrations in the aqueous phase. if the partial pressure of co2, pco2, is taken into consideration then it holds (kpi is the constant of equation (63) corrected for the given ionic strength, i): � � � � co co h 2 3 2 2 � � � �� � � p kpi (77) and consequently, the concentrations of � �hco3� and � �h co2 3 are calculated by means of eqs. (61) and (62), respectively. balance equation of the total concentration of the layer sites, �x: � � � � � � � ��x x x x x� � � �2 2 3sr srno na h (78) the solution of the set of equations mentioned above, namely for the given interval of ph, lies in combinating the balance equations with the equilibrium constant equations, for example, the concentration [sosr�] derived from eq. (50) is inserted into eqs. (66), (70) and (73), [xsrno3] from eq. (55) into eq. (73) and (78), etc. the group of thus modified balance equations creates the regression function, the solution algorithm of which is depicted in fig. 1. it is evident that the algorithm consists of two loops: external and inner. in the external loop, the newton-raphson multidimensional non-linear regression procedure is used for fitting the experimentally determined data, and in the inner loop, the proper regression function is solved. in the last step, the functions kdcal � f(ph), eq. (79), and %sorption � f(ph), eq. (80), are evaluated. � � � � � � � � � � � � kdcal 2 3 2 3 3 sosr (so) sr sosrno sr srco srno sr � � � � � � � � � 0 0 0 � � � � � � � � � � � � � � � � so sosrco sr srno sr srco srno srso 4 3 2 3 2 3 3 4 0 0 0 � � � � � � � � � x x , (79) � � � � � � � � %sorption m v � � �� � �sosr (so) sr sosrno sr sosrco 2 3 3 0 0 0 � � � � � � � � � � � � � � x x2 3sr srno sr 0 100 . (80) the sorption on the edge sites, eq. (81), and layer sites eq. (82), can also be calculated � � � � � � %sorption m vedge sites � � �� � �sosr (so) sr sr sosrno 2 0 0 � � � � � � 3 3sosrco sr 0 0 100 � � � � � , (81) � � � � � � % .sorption m v x x layer sites � � � �2 3 sr srno sr 0 100 (82) the goodness-of-fit is evaluated by the �2-test, which is based on calculating the quantity �2 according to eq. (83): � 2 2 1 � � � ( )( ) ssx s i q ii n . (83) where (ssx)i is the ith-square of the deviation of the experimental value from the calculated value, (sq)i is the relative standard deviation of the i-th experimental point, n (� np) is the number of experimental points. 
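A compact sketch of the goodness-of-fit computation of eq. (83) and the WSOS/DF criterion defined just below in eq. (84) may be helpful (illustrative Python, not the FITEQL or FAMULUS implementation). Interpreting the relative deviation (s_q)_i as scaling the measured value is our assumption, and the data are invented.

```python
def wsos_df(y_exp, y_cal, rel_sigma, n_params):
    """Eqs. (83)-(84): chi^2 weighted by the point uncertainties, divided by
    the degrees of freedom n_i = n_p - n_s. The absolute uncertainty of each
    point is taken as rel_sigma * measured value (our assumption)."""
    chi2 = sum(((ye - yc) / (rel_sigma * ye)) ** 2
               for ye, yc in zip(y_exp, y_cal))
    return chi2 / (len(y_exp) - n_params)

# invented data: 6 experimental points, 2 fitted parameters
y_exp = [1.00, 1.30, 1.55, 1.90, 2.40, 2.80]
y_cal = [1.05, 1.24, 1.60, 1.88, 2.52, 2.70]
print(wsos_df(y_exp, y_cal, rel_sigma=0.1, n_params=2))   # good fit if <= 20
```

With the constant relative deviations used in this work (0.1 for the titration data, 0.05 for the sorption data), WSOS/DF ≤ 20 is read as a good fit.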
the value of �2 is used for calculating the criterion wsos/df (weighted sum of squares divided by degrees of freedom) [9]: wsos df n n n n i i p/ ,� � � � 2 (84) where ni is the number of degrees of freedom, np is the number of experimental points and n is the number of model parameters sought during the regression procedure. if wsos/df � 20, then there is a good agreement between the experimental and calculated data, while, in the other case the fitting is regarded as unsatisfactory. as for (sq)i, on the basis of the corresponding experiments, the constant value for each experimental data point was assumed to be equal to 0.1 (in the case of the titration experiments) or 0.05 (in the case of the sorption experiments). 16 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 45 no. 5/2005 czech technical university in prague © czech technical university publishing house http://ctn.cvut.cz/ap/ 17 czech technical university in prague acta polytechnica vol. 45 no. 5/2005 f ig . 1: a lg or it h m of th e p sr b en (m ag ). fm p ro gr am 3 conclusions the constructed and debugged codes are collected in the stamb-2003 package, namely the codes: (a) psrspec.fm determining the solution speciation, (b) p46dnlrg.fm for evaluating the titration curves, (d) pcsben(mag).fm and psrben(mag).fm for evaluating the experimental sorption data by the generalized composite (gc) approach, (e) bemagcs(sr).fm assigned for the component additivity (ca) modeling approach. the codes were successfully verified using the two systems, namely sr-bentonite-magnetite and cs-bentonite-magnetite, and the results were published at the international conference migration'03 [10], then at two home conferences [11, 12]. the final paper has been prepared for migration'05 [13]. in principle, the same procedure and algorithm as was described above can be used for modeling uptake processes of the scms � iexm type that occur on hydrated metal oxides, clay materials, etc. references [1] wanner, h. et al.: “the acid/base chemistry of montmorillonite.” radiochimica acta, vol. 66/67 (1994), p. 157–162. [2] baeyens, b., bradbury, m. h.: a quantitative mechanistic description of ni, zn and ca sorption on na-montmorillonite, part i: physico-chemical characterisation and titration measurements, report psi bericht nr. 95–10, villigen: paul scherrer institut, 1995. [3] hurel, c. et al.: “sorption behaviour of caesium on bentonite sample.” radiochimica acta, vol. 90 (2002), p. 695–698. [4] kroupová, h.: „studium sorpčních interakcí v systému: bentonit – vybrané radionuklidy a produkty koroze kontejneru – podzemní voda.“ doctoral thesis. prague: czech technical university, 2004, p. 132. [5] ebert, k., ederer, h.: computeranwendungen in der chemie, weinheim: vch verlags gmbh, brd, 1985. [6] štamberg, k. et al.: “modelling and simulation of sorption selectivity in the bentonite-hco3 �-co3 2�-233u(vi) species system.” journal of radioanalytical and nuclear chemistry, vol. 241 (1999), no. 3, p. 487–492. [7] bradbury, m. h., baeyens, b.: “sorption of eu on naand ca-montmorillonites: experimental investigations and modelling with cation exchange and surface complexation.” geochimica et cosmochimica acta, vol. 66 (2002), no. 13, p. 2325–2334. [8] missana, t., garcía-gutiérres m., fernńdez v.: uranium (vi) sorption on colloidal magnetite under anoxic environment: experimental study and surface complexation modelling. geochimica et cosmochimica acta, vol. 67 (2003), no. 14, p. 2543–2550. [9] herbelin, a. 
l., westall, j. c.: “fiteql – a computer program for determination of chemical equilibrium constants from experimental data, version 3.2.” report 96-01. corvallis, oregon: department of chemistry, oregon state university, 1996. [10] kroupová, h., štamberg, k.: “application of the generalized composite (gc) and component additivity (ca) approaches for modeling of cs(i) and sr(ii) sorption on bentonite in the presence of corrosion products using three types of surface complexation model.” in: migration '03 proceedings. gyeongju, korea: 9th international conference on chemistry and migration behaviour of actinides and fission products in the geosphere, 2003, p. 100. [11] kroupová, h., štamberg, k.: “experimental study and mathematical modeling of cs(i) and sr(i) sorption on bentonite as barrier material in deep geological repository.” in: xviith conference on clay mineralogy and petrology proceedings, 13–17 september 2004. (št’astný, m., ed.) prague: czech national clay group, 2004. [12] kroupová, h., štamberg, k.: „experimentální studium a povrchově-komplexační modelování sorpce cs a sr na bentonitu a magnetitu.“ chemické listy, vol. 98 (2004), no. 8, p. 570. [13] filipská, h., štamberg, k.: “parametric study of the sorption of cs(i) and sr(ii) on mixture of bentonite and magnetite using scm � iexm.” migration '05 10th international conference on chemistry and migration behavior of actinides and fission products in the geosphere, 18–23 september, avignon, france, to be published in the proceeding of migration '05. ing. helena filipská, ph.d. phone: +420 224 358 225 fax: +420 224 358 202 e-mail: filipska@fjfi.cvut.cz centre for radiochemistry and radiation chemistry doc. ing. karel štamberg, csc., ph.d. phone: +420 224 358 205 fax: +420 222 320 861 e-mail: stamberg@fjfi.cvut.cz department of nuclear chemistry czech technical university in prague faculty of nuclear sciences and physical engineering břehová 7 115 19 praha 1, czech republic 18 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 45 no. 5/2005 czech technical university in prague ap08_3.vp 1 introduction scheduling theory plays an important role in optimization of resources, and is used in many manufacturing and service industries. in the last fifty years, many optimal and heuristic algorithms have been proposed, but there is growing demand for transparent and realistic representation of results in scheduling. the objective of visualization and simulation is to make these theoretical results accessible to non-experts in scheduling theory. especially production scheduling and planning needs to be represented in a transparent form. the goal of this work is therefore to extend the existing torsche scheduling toolbox for matlab [1] by a new tool for graphic visualization of time schedules. scheduling optimizes the utilization of resources with given constraints in time. in other words, scheduling solves the problem how to assign given resources to given tasks in time [2]. production scheduling is a branch of scheduling mostly aimed at automated production lines and industrial production in general [3]. in this paper, visualization means graphic representation of a schedule in time, whereas simulation monitores the influence of some parameters on a whole system. this work emerges from the torsche scheduling toolbox for matlab [http://rtime.felk.cvut.cz/scheduling-toolbox/], which provides data structures and algorithms for time scheduling. 
therefore, the whole project is realized in the matlab programming environment [http://www.mathworks.com/]. one of the related tools is truetime [4] – a matlab based tool for real-time simulation for wide spectrum of problems, e.g. digital filters, embedded systems and wireless networks. for visualization, opengl (open graphics library) [http://www.opengl.org/] is a standard specification defining a cross-platform api for writing applications that produce 2d and 3d computer graphics. this means that visualization can be realized by opengl at any operation system. on the other hand, matlab includes the virtual reality toolbox, which is also a sufficient tool for visualization of scheduling results. from the scheduling area there are also some closely related works. optimization using simulation was described by fishman [5], and the use of simulation based optimization in real production was briefly described by manlig and sramek [6]. visualization in scheduling has been studied at karlsruhe university [7] and an application for visualization of process scheduling has been developed there. the main goal of this paper is to present the application for visualization and simulation of scheduling results in the matlab environment – visis (visualization and simulation in scheduling). this application uses the matlab-based simulation environment simulink and the virtual reality toolbox for graphic visualization. two areas of usage are considered: simulation for monitoring the influence of scheduling on the system function (e.g. for digital filters), and time visualization (e.g. graphic represpentation of execution on a production line in time). this tool will be freely available in the next version of torsche. to the best of our knowledge, there is no such a tool providing visualization of scheduled processes in this range. this paper is organized as follows: section 2 provides the basic notation used in scheduling theory. section 3 describes the implementation of visis. in section 4, examples of simulation and visualization are provided and a comparison with truetime is shown. the last section concludes the work. 2 representation of results in scheduling scheduling problems can be divided into three categories: resources (processors or machines), constraints and criterions. generally accepted notation of the problem has the form �|�|�, where � stands for types of processors used, � represents tasks and characteristics of resources, and � denotes the optimality criterion [2]. for example, 1|rj, ~dj|cmax represents the problem with one processor (resource), given release date and deadline for each task, while the objective is to minimize the latest completion time. the cmax value is the most frequently used criterion, because it represents the throughput of the system [8]. generally, scheduling is np-hard problem. polynomial algorithms are therefore known only for the limited number of problems. this leads to an exponential rise in the time needed to find the optimal solution in dependence on the number of input tasks. the most common graphic representation of scheduling results is the gantt chart [http://www.ganttchart.com/], first established by henry gantt in 1910. the gantt chart has discrete time values on the x axis and processors on the y axis (see fig. 1). tasks are represented by a rectangle area on the intersection of the appropriate processor and the assigned time. 
the form of the gantt chart is identical for all time 12 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 48 no. 3/2008 visualization and simulation in scheduling r. čapek this paper deals with the representation of scheduling results and it introduces a new tool for visualization and simulation in time scheduling called visis. the purpose of this tool is to provide an environment for visualization, e.g. in production line scheduling. the simulation also proposes a way to simulate the influence of a schedule on a user defined system, e.g. for designing filters in digital signal processing. visis arises from representing scheduling results using the well-known gantt chart. the application is implemented in the matlab programming environment using simulink and the virtual reality toolbox. keywords: scheduling, visualization, simulation, matlab, visis. schedules represented by the start times and processing times of the tasks. for the hoist scheduling problem [9], there is another way to display the results (see fig. 2). it is a chart with discrete time values on the x axis again and on the y axis there are tanks where the material is processed. the time schedule is then represented by lines denoting the moves of the hoists, one type for loaded hoists, one for empty hoists, and one for material storage in tanks. this type of chart is special for the hoist scheduling problem, and it gives a better idea of the final result, although understanding is at first quite difficult. visualization should arise from the individual problem definition instead of one general form. 3 implementation as mentioned above, this project is realized completely in the matlab environment, and the output of the application is a simulink scheme. graphical objects for visualization are created in vredit (part of the virtual reality toolbox for matlab) and the final visualization is displayed using the virtual reality toolbox. the vredit environment allows us to define basic geometrical objects, text, background, textures and complex predefined objects. each object in virtual reality has its own set of variable parameters. the numerical value of any parameter can be directly changed from simulink. the visis implementation provides several functions available for users. for maximum simplicity of usage, the simulink scheme is generated automatically. this output scheme contains one masked subsystem representing the control system. in the case of visualization, there is another block referencing the predefined virtual reality world. the mask of the control subsystem has inputs and outputs with user-defined names and sizes. the core of the control subsystem is the s-function block, which contains the main control function. this function updates the outputs according to the given schedule and the actual values of the inputs. this control function is also generated automatically, and all needed external data is created in the matlab workspace before the simulation begins. the s-function block has only one input and output port as default so the in/out signals are integrated/divided to reach user defined number of inputs and outputs. this subsystem is then masked as one block with appropriate ports. the simulink scheme and code for the s-function block are both generated as text files from the prepared templates. the control function is called for each sample of simulink, and the outputs are updated according to the schedule and the actual simulink time. 
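In other words, the generated control function is essentially a table lookup over the schedule. A minimal, language-neutral illustration in Python follows (the actual tool generates MATLAB S-function code from text templates; the schedule below is invented for demonstration):

```python
# A finished schedule: task name, assigned processor, start time, duration
schedule = [
    ("T1", "P1", 0.0, 2.0),
    ("T2", "P2", 0.0, 3.0),
    ("T3", "P1", 2.0, 1.5),
]

def active_tasks(t):
    """The lookup the generated control function performs at every sample:
    which tasks are executing at simulation time t."""
    return [(name, proc) for name, proc, start, dur in schedule
            if start <= t < start + dur]

for t in (0.5, 2.5, 4.0):
    print(t, active_tasks(t))
```

This is also exactly the information a Gantt chart encodes: one rectangle per (processor, start, duration) triple.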
all implemented functions can be called as standard matlab functions. users of visis are expected to have basic knowledge of using the torsche toolbox, and to be able to create their own project in the virtual reality toolbox. the first step in simulating or visualizing is to create a set of tasks (function taskset). then the code of the operations has to be assigned to the tasks by the adduserparam function. this function reads data from the given text file, and the user-defined code is assigned to appropriate tasks in the taskset according to the format of the text file. the next step is to define the input and output ports for the control block and inputs of the virtual reality block, if needed. then the set of tasks has to be scheduled using an appropriate scheduling algorithm and all created structures are passed to the main function taskset2simulink, which creates the simulink scheme and the main control function. any other block can be added to the scheme before simulation begins. the application checks the control function after it has been created, and if there is any structural error, a warning is displayed. then a visualization can be seenin the virtual reality world, with the standard possibility to save any frame or video stream during the simulation. 4 examples and experimental results 4.1 hoist scheduling visualization the hoist scheduling problem is chosen as an example for visualization. the classic representation of one period by the gantt chart is shown in fig. 1, where each task represents one move of a loaded hoist. this representation does not reflect empty hoist moves. temporary stays of the material in the tanks are also not clearly visible.a schedule of material moves is displayed in fig. 2. this representation gives a better idea about the realization of the schedule, and it also displays a sequence of © czech technical university publishing house http://ctn.cvut.cz/ap/ 13 acta polytechnica vol. 48 no. 3/2008 fig. 1: gantt chart for hoist scheduling fig. 2: special chart for hoist scheduling fig. 3: frame of visualization several periods. fig. 3 shows one frame of visualization by visis. 4.2 digital filter simulation simulation of the dsvf-digital state variable filter [10] is taken as an example of the simulation capabilities. this filter is formed by a set of arithmetic equations, which are repeated in a never-ending loop. each elementary arithmetic operation has to be assigned to one task in accordance with the requirements of the scheduling algorithm requirements. then the problem with the precedence constraints has to be scheduled and the resulting schedule can be passed to the taskset2simulink function. after the simulink scheme has been generated, an appropriate signal generator and some display unit can be added. then simulation is ready to start. the corresponding scheme is shown in fig. 4, and the input and output signals of the modeled filter are displayed in fig. 5. 4.3 experimental results as mentioned above, generation of both the simulink scheme and the control function code is text-based, so the time complexity of generating in dependence on the number of input tasks or on code length is approximately linear. simulation by visis needs approximately 80 % of the time needed for the same example realized using the truetime library. in addition, the time needed for one second of simulation with 220000 samples per second is approximately 32 seconds in truetime and 26 seconds in visis. 
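Section 4.2 above decomposes a digital state variable filter into elementary arithmetic tasks, one per operation in the loop. For readers unfamiliar with the filter, the sketch below shows the classic Chamberlin formulation in Python; the exact equations of the filter modeled in [10] may differ, and the signal values are invented.

```python
import math

def dsvf_lowpass(x, fc, fs, q=1.0):
    """Classic Chamberlin digital state variable filter; each line of the
    loop body is the kind of elementary arithmetic task scheduled in Sec. 4.2.
    Returns the low-pass output for the input sequence x."""
    f = 2.0 * math.sin(math.pi * fc / fs)   # tuning coefficient
    yl = yb = 0.0                           # low-pass and band-pass states
    out = []
    for xn in x:
        yl = yl + f * yb                    # low-pass update
        yh = xn - yl - q * yb               # high-pass
        yb = yb + f * yh                    # band-pass update
        out.append(yl)
    return out

# 100 samples of a 50 Hz sine at the 220000 samples/s rate quoted in Sec. 4.3
x = [math.sin(2 * math.pi * 50 * n / 220000) for n in range(100)]
print(dsvf_lowpass(x, fc=1000, fs=220000)[:5])
```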
5 conclusions this work has two areas of use: in discrete simulation (e.g. in digital signal processing) and in visualization of scheduled problems. visis is planed as an extension of a future version of the torsche scheduling toolbox for matlab, which can be freely downloaded. the application can be used for presentations, for educational purposes or as an optimization tool when clear representation of the results is needed as a feedback for scheduling. the simulation of the digital filter is faster than in the truetime library, since visis is optimized for simulations of time schedules. the main advantage of visis is easier problem definition and simple usage. acknowledgments this work was supported by the ministry of education of the czech republic under research programme msm6840770038. references [1] šůcha, p., kutil, m., sojka, m., hanzálek, z.: torsche scheduling toolbox for matlab. in ieee international symposium on computer-aided control systems design. munich, germany, 2006. [2] blazewicz, j. et al.: scheduling in computer and manufacturing systems. springer, 1993. [3] herrmann, j. w.: handbook of production scheduling. springer, 2006. [4] ohlin, m., henriksson, d., cervin, a.: truetime 1.4 – – reference manual. department of automatic control, lund university. 2006. [5] fishman, g. s.: discrete-event simulation: modeling, programming, and analysis. springer, 2001. [6] manlig, f., šrámek, m.: řízení výrobních zakázek s podporou počítačové simulace. in: průmyslové inženýrství, 2003. [7] wittstein, h., zoller, h., lieflander, g.: visualization of process scheduling. universität karlsruhe, department of computer science. http://i30www.ira.uka.de/teaching/co urse documents/processscheduling/ [8] crama, y., kats, v., van de klundert, j., levner, e.: cyclic scheduling in robotic flowshops. in: annals of operations research, 2004. [9] manier, m. a., bloch, ch.: a classification for hoist scheduling problems. in: international journal of flexible manufacturing systems, 2003. [10] matějíček, d.: optimalizace algoritmů pro fpga. diploma thesis, ctu prague, 2007. ing. roman čapek phone: +420 776 716 588 e-mail: capekr1@fel.cvut.cz department of control engineering czech technical university in prague faculty of electrical engineering technická 2 166 27 praha 6, czech republic 14 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 48 no. 3/2008 fig. 4: output simulink scheme fig. 5: input and output signal ap07_4-5.vp 1 introduction one of the present-day tasks for power engineers is to generate electricity from renewable energy sources (res). increasing energy consumption and diminishing conventional sources have made people think of utilizing renewable sources in conventional power systems. the use of renewable sources of energy promotes sustainable living and, with the exception of biomass combustion, it is virtually pollution free. res are economically feasible in small-scale applications in remote locations (away from the grid areas) or in large-scale applications in areas where the resource is abundant and can be harnessed by giant conversion systems. distributed generators can be connected to medium voltage (mv) or low voltage (lv) distribution systems. two basic methods are used to supply electricity to a certain area. the first is to transfer the required energy using conventional power systems, and the second one is to produce it directly at the location where it will be consumed. a combination of the two methods can be used. 
the first method is frequently used, but small power generators located close to energy consumers are very interesting. this paper will draw attention to the impacts of connecting wind power plants to the existing system. 2 wind power plants wind is formed by changes in pressure between differently heated areas of air in the earth’s atmosphere. the velocity of the wind, which is the most important parameter for exploitation of wind energy, is proportional to the amount of pressure difference. the minimum economical limit for exploitation of the wind energy all over the world is 5 m/s. the upper wind speed limit is considered to be about 25 m/s. higher wind speeds are dangerous because they may damage the equipment of the power plant. 3 power system disturbances electrical energy is a commodity in the market, and it is necessary to set certain rules for rating its quality. in earlier times, electricity was rated only according to voltage and frequency stability; however the use of more advanced appliances and devices with non-linear or changing characteristics has had an adverse impact on electricity distribution systems. the adverse impacts can cause disturbances to other appliances and devices in that system. the disturbances are categorized according to the frequency (power systems, acoustic, radio). this paper deals with power system disturbances due to higher harmonics, interharmonics, voltage fluctuation, flicker and voltage imbalance. 3.1 higher harmonics higher harmonics are sinusoidal waves of voltages and currents with frequency integer multiples of 50 hz. the most common sources of higher harmonics are devices with a non-sinusoidal current. if we want to maintain a voltage balance between a system and a consumer that affects the system adversely with higher harmonics, a higher harmonic current must flow through a part of the circuit. this current creates voltage drops on the inductances and resistances of the circuit. these higher harmonic voltages are superimposed on the sinusoidal voltage of the system frequency. these currents and voltages cause additional stress to other devices connected to the system. if the stress exceeds the resistance level, the following may occur: � reduction of lifetime, � early failures of capacitors and motors because of overloading, � function failures of electrical devices, � wrong protection function, � malfunction of ripple control receivers, � negative effect on arc extinction. the higher harmonic voltage should not exceed certain limits, in order to avoid the phenomena mentioned above. higher harmonic currents are influenced by electrodynamic phenomena, e.g., skin effect and proximity effect. these phenomena cause an increase in resistances and active losses. higher harmonic currents and voltages form additional losses, which are included in the “deformational power”. the apparent power s can be expressed in terms of active power p, reactive power q and distortive power d: s p q d2 2 2 2� � � . (1) 96 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 47 no. 4–5/2007 the impact of connecting distributed generation to the distribution system e. v. mgaya, z. müller this paper deals with the general problem of utilizing of renewable energy sources to generate electric energy. recent advances in renewable energy power generation technologies, e.g., wind and photovoltaic (pv) technologies, have led to increased interest in the application of these generation devices as distributed generation (dg) units. 
this paper presents the results of an investigation into possible improvements in the system voltage profile and reduction of system losses when adding wind power dg (wind-dg) to a distribution system. simulation results are given for a case study, and these show that properly sized wind dgs, placed at carefully selected sites near key distribution substations, could be very effective in improving the distribution system voltage profile and reducing power losses, and hence could improve the effective capacity of the system. keywords: renewable energy sources, distributed generators, wind power plant, power systems disturbances, flicker and steady-state stability. deformational power influences the “true power factor” cos �. an electronic energy meter can measure this true power factor, while classical inductive energy meters measure only the first harmonic. fig. 1 shows the geometrical relationship between apparent, active, reactive and distortive power. 3.2 interharmonics interharmonics are all sinusoidal waves of voltage and current having a frequency that is a non-integer multiple of the system frequency. interharmonic voltages create an additional distortion of the voltage waves and are not periodic with the system frequency. interharmonic sources are converters with direct inter-circuit, straight frequency converters, rectifying cascades and electronic devices working in cycles (heat devices, machines with changing load torque). interharmonics have a negative influence on the ripple control signal. interharmonics are the main cause of flicker and disturbance of ripple control signal receivers. they can be reduced by smoothing the direct inter-circuit currents at frequency converters or by choosing a suitable point of common coupling (pcc) at places with higher short-circuit power. 3.3 voltage fluctuation voltage fluctuation is a disturbance phenomenon that arises during the operation of electrical devices with a varying load. these changes cause different voltage drops on impedances, resulting in voltage changes at places of consumption. the changes in voltage cause changes in light flux (flicker) at light sources, mainly bulbs. this voltage fluctuation can be periodic or stochastic. the sources of voltage fluctuations can be controlled converters, welding machine sets, electrical resistors and arc furnaces, all consumers with pulse consumption, large load switching or asynchronous generators in wind power plants. voltage fluctuation can be reduced by connecting a large consumer to a system with a higher short-circuit power, or by strengthening the system. in order to achieve higher short-circuit power (mva) the following can be done: � increase the installed power of the supply transformer, � connect a new generator, a synchronous compensator, � decrease the supply line impedance by means of serial compensation, � connect to a higher voltage level. 4 conditions for connecting distributed generators to a distribution system the possibility of connecting distributed generators to a distribution system is judged according to their negative impacts on the distribution system. our judgment is based on the system impedance at the point of common coupling (pcc), short-circuit power, resonance, connected power and the type of distributed generators (wind, small hydro, etc.) the following subsections refer to the requirements of the czech (european) technical standards for connecting distributed generators to the distribution network. 
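As a small numeric illustration of the power relations of section 3.1, the sketch below evaluates eq. (1), S² = P² + Q² + D², and compares the true power factor with the fundamental-only value a classical inductive meter would report (all values invented):

```python
import math

P, Q, D = 80.0, 35.0, 20.0            # kW, kvar, kVA of distortion (invented)
S = math.sqrt(P**2 + Q**2 + D**2)     # eq. (1): S^2 = P^2 + Q^2 + D^2
pf_true = P / S                       # true power factor incl. distortion power
pf_fund = P / math.sqrt(P**2 + Q**2)  # what a classical inductive meter sees
print(f"S = {S:.1f} kVA, true PF = {pf_true:.3f}, fundamental PF = {pf_fund:.3f}")
```

The gap between the two factors is exactly the contribution of the distortion power D.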
4.1 Size of the connected power
The size of the power to be connected to the existing distribution system depends upon the voltage level of that system. If the power is to be connected to LV systems, the total rated power of the distributed generators to be connected should not exceed 10 % of the rated power of the distribution transformer. If the power is to be connected to the MV system, the total rated power of the distributed generators to be connected should not be higher than 10 % of the rated power of the 110/22 kV supply transformer.

4.2 Increase in voltage
The voltage increase in MV systems after connecting the distributed generators at the point of common coupling (PCC) should, in the most unfavorable case, not exceed a value of 2 % in comparison with the voltage before their connection; distributed generators connected to LV systems should not increase the voltage by more than 3 %.

4.3 Voltage changes during switching
Connecting and disconnecting individual generators causes voltage changes at the PCC. In the MV system, these changes do not cause undue negative impacts provided that the highest voltage change at the PCC does not exceed 2 %; for LV systems the voltage change limit is 3 %.

4.4 Long-time flicker
When analysing one or many distributed generators at the PCC, it is important to consider the voltage fluctuation that causes flicker. At the PCC it is therefore necessary to keep to the limits for the long-time flicker perception rate $P_{lt}$ and for the long-time flicker factor $A_{lt}$ as follows:

$$P_{lt} \le 0.46, \quad (2)$$

$$A_{lt} \le 0.1. \quad (3)$$

The long-time flicker perception rate $P_{lt}$ is determined using the following equation:

$$P_{lt} = \frac{c \cdot S_{nE}}{S_{kV}}, \quad (4)$$

where $c$ is the flicker factor [-], $S_{nE}$ is the load rated power [MVA] and $S_{kV}$ is the system short-circuit power [MVA].

Fig. 1: Geometrical relationship of powers

If the producer includes more than one generating unit, $P_{lt}$ must be calculated separately for each device, and the resulting flicker value $P_{lt\,res}$ at the PCC is then determined:

$$P_{lt\,res} = \sqrt{\sum_i P_{lt\,i}^2}. \quad (5)$$

4.5 Higher harmonics
In MV systems, the allowable higher harmonic currents of the individual equipment connected at the PCC can be computed using the following equation:

$$I_{\nu\,perm\,i} = i_{\nu\,perm} \cdot S_{kV} \cdot \frac{S_A}{S_{AV}}, \quad (6)$$

where $S_A$ is the apparent power of the single device [MVA], $S_{AV}$ is the total connected power [MVA], $i_{\nu\,perm}$ is the relative allowable current [A/MVA], $S_{kV}$ is the distribution system short-circuit power [MVA] and $I_{\nu\,perm\,i}$ is the allowable higher harmonic current of the device [A]. If the allowable harmonic currents are exceeded, then connection at the PCC is not possible.

5 A study of the wind power plant connection
Before the wind power plant was connected, the distribution network was fed at two points from the 110/22 kV system, through transformer T1 with 25 MVA and transformer T2 with 25 MVA. The short-circuit power at the first and second point is 1200 MVA and 1000 MVA, respectively. Now another supply from the wind power plant is connected to the distribution network through transformer T3 rated 8 MVA, 22/0.7 kV. The 22 kV distribution network consists of an overhead 95AlFe6 line with a maximum current carrying capacity of 289 A; the total length of the line is about 40 km. The distribution network is connected to the wind power plant through a 70AlFe6 line with a maximum current carrying capacity of 225 A and a length of about 2 km.
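Before continuing with the case study, the flicker assessment of section 4.4 can be illustrated with a short sketch implementing eqs. (4)-(5) and the limit of eq. (2). The flicker factor and the short-circuit power below are invented; only the 1.8 MVA unit size is taken from the case study.

```python
import math

def plt_single(c, s_ne, s_kv):
    """Eq. (4): long-time flicker perception rate of one unit.
    c = flicker factor [-], s_ne = unit rated power [MVA],
    s_kv = short-circuit power at the PCC [MVA]."""
    return c * s_ne / s_kv

def plt_resulting(plts):
    """Eq. (5): resulting flicker of several units at one PCC."""
    return math.sqrt(sum(p * p for p in plts))

# four identical 1.8 MVA units; flicker factor and S_kV are invented
units = [plt_single(c=8.0, s_ne=1.8, s_kv=180.0) for _ in range(4)]
p_res = plt_resulting(units)
print(f"P_lt,res = {p_res:.3f} ->", "OK" if p_res <= 0.46 else "limit exceeded")
```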
The wind power plant consists of four synchronous Vestas V80 – 1800 kW type generators with a nominal voltage of 690 V, a nominal current of 1390 A and a power factor range from 0.90 (inductive) to 0.98 (capacitive). The system steady-state calculations were performed using the EGC Vlivy 4.2 and DAISY PAS Off-line v 3.5 programmes. The calculations were performed for all generator combinations (from no generator connected up to the sequential connection of all four generators) and further for different power factors in the range 0.98 cap – 0.90 ind with a step of 0.02. The calculation was performed for a minimum line loading of 66 A and a maximum line loading of 108 A.

5.1 Line losses
The calculations of the line losses were performed during the network steady-state calculation. The losses in Fig. 2 were calculated for cases when all loads are supplied from substations T1, T2 and the wind power plant contributes only 60 % of its rated power. The analysis shows that the line losses are lower when the generators do not contribute to the network and it is fed from both substations than when the generators are connected sequentially to the system. Generally, the number of connected generators influences whether the losses are higher at minimum or at maximum line loading. Relatively high line losses occur when the generators work in an overexcited state with a power factor of cos φ = 0.98 cap. So, in order to operate this network with minimum losses, we have to regulate the power of the wind power plant according to the immediate consumption in the network.

Fig. 2: Line losses when the network is supplied by substations T1, T2 and at 60 % generator loading

5.2 Voltage profiles
The voltage profiles were constructed at 60 % generator loading. The profiles are shown in Fig. 3. Only a small change in voltages occurs at substations T1 and T2 after connecting the generators to the system. This is due to the higher short-circuit power of the system where the substations are connected than that of the system where the wind power plant works. In MV systems, according to the technical standards of the distribution utility, the maximum voltage difference must not exceed 2 % of the original voltage profile. This criterion is fulfilled only for some cases of generator connection; the size of the voltage difference depends on the number of generators connected and on the power factor.

5.3 Ripple control signal (RCS) damping
Since electrical generators can unfavorably influence the level of the ripple control signal (RCS), this level must not drop by more than 10 % to 20 % below the required level. When the four synchronous Vestas V80 – 1800 kW machines are connected at the 22 kV point of common coupling (PCC), the RCS damping is lowered by an additional supporting impedance.

5.4 Flicker
Another qualitative parameter for judging the connection of a wind power plant is the flicker value caused by voltage variation. The flicker coefficient Plt at the point of common coupling (PCC) must not exceed a value of 0.46. The calculation proved that the flicker coefficient Plt remained below this limit.

5.5 Contributions of harmonic currents
The checks on the allowable size of the harmonic and interharmonic currents injected into the network by the wind power plant proved that the allowable limits were not exceeded.

6 Stability of the system with the wind power plant
The existing technical rules commonly used by distributors for connecting generators to the network cover the following:
- increase in voltage,
- islanded operation (and anti-islanding protection),
- increase in short-circuit level,
- power quality (harmonics, voltage drops, unbalanced voltage, etc.).

In some countries (France, Italy) simple rules are applied which define the voltage level to which a distributed generator may be connected, depending on its rated power output. It appears from this study that the line voltage profile requirement (which is a basic criterion for connecting a unit in the Czech Republic) is not a suitable criterion for all situations, especially in weakly supplied parts of the network. An earlier study proved that losses can be reduced thanks to the improved voltage profile after connecting the distributed generators. However, new problems with stability arise.

Simulations were performed in Simulink in order to analyze the voltage stability of the synchronous generators during the mechanical power oscillations that represent wind speed changes. These oscillations are simulated by a sudden step of the synchronous generator torque, as shown in Fig. 4. The study was made for several cases of compensation: compensation of the reactive power of both the load and the generator, of the generator only, or no compensation. In each case the torque step causes a voltage drop, and the resulting behaviour is shown in Figs. 5-9. The first case (load and generator reactive power compensation) is stable, and the voltage drop is negligible. The second case (only generator reactive power compensation) is quasi-stable: a high level of torque causes instability, but when the torque drops, the parameters of the system return to their original state; the stability depends on the time period of the power oscillations. The third case (no compensation) causes high-level instability for the selected duration of the torque.

Fig. 3: Voltage profile after connecting the generators at 60 % loading
Fig. 4: Wind generator mechanical torque
Fig. 5: Voltage: load and generator reactive power compensation – stable case
Fig. 6: Rotor speed: load and generator reactive power compensation – stable case
Fig. 7: Voltage: only generator reactive power compensation – unstable case
Fig. 8: Rotor speed: only generator reactive power compensation – unstable case
Fig. 9: Rotor speed: no reactive power compensation – unstable case

This study has proved the existence of new criteria for connecting distributed generator units into a distribution network. Improvements have to be made on both sides, e.g., unit type, construction and regulation, and distribution network architecture, control and regulation.
7 conclusion

on the basis of a discussion of the general problem of electricity generation from renewable energy sources, more specifically wind power plants, this paper has described the conditions for connecting these sources to the distribution network. the important part when assessing these plants is the conditions for connection and their negative impacts (voltage change, harmonics, flicker, negative impacts on the ripple control signal) in the distribution system. the paper included a study of connecting a wind power plant to the 22 kv distribution network. the parameters were calculated according to the prescribed rules for connection. the calculation results were analyzed and compared with the conditions for connecting distributed generation. the study also dealt with steady-state stability and the impact of the wind power plant on the distribution network.

acknowledgments

the research described in this paper was supervised by prof. j. tlustý of the electrical power engineering department, fee ctu in prague.

ing. erick vincent mgaya
e-mail: mgayae1@fel.cvut.cz
ing. zdeněk müller
phone: +420 2243352372
fax: +420 233 337 556
e-mail: mullez1@fel.cvut.cz
department of electrical power engineering
czech technical university in prague
faculty of electrical engineering
technická 2
166 27 prague 6, czech republic

acta polytechnica 61(2):313–323, 2021
https://doi.org/10.14311/ap.2021.61.0313
© 2021 the author(s). licensed under a cc-by 4.0 licence. published by the czech technical university in prague.

strategic modulation of thermal to electrical energy ratio produced from pv/t module

anges a. aminou moussavou∗, atanda k. raji, marco adonis

center for distributed power and electronics systems, cape peninsula university of technology (bellville campus), department of electrical engineering, symphony way, po box 1906, bellville 7535, south africa
∗ corresponding author: akdech80@yahoo.fr

abstract. several strategies have been developed to enhance the performance of a solar photovoltaic-thermal (pv/t) system in buildings. however, these systems are limited by their cost, complex structure and the power consumed by the pump. this paper proposes an optimisation method, a conversion strategy that modulates the ratio of thermal to electrical energy from the photovoltaic (pv) cell, to increase the pv/t system's performance. the design and modelling of a pv cell were developed in matlab/simulink to validate the heat transfer occurring in the pv cell model, which converts solar radiation into heat and electricity. a linear regression equation curve was used to define the thermal-to-electrical energy ratio technique and the behavioural patterns of various types of power (thermal and electrical) as a function of the extrinsic cell resistance (rse). the simulation results show an effective balance of the thermal and electrical power when adjusting rse. the strategy of modulating the ratio of thermal to electrical energy from the pv cell may optimise the pv/t system's performance.
a change of rse might be an effective method of controlling the amount of thermal and electrical energy from the pv cell to support the pv/t system temporally, based on the energy need. the optimisation technique of the pv/t system using the pv cell is particularly useful for households, since they require electricity, heating and cooling. applying this technique demonstrates the ability of the pv/t system to balance the energy (thermal and electrical) produced based on the weather conditions and the user's energy demands.

keywords: cell efficiency, photovoltaic systems, solar photovoltaic-thermal (pv/t) system, modelling and simulation, power production.

1. introduction

renewable energy (re) originates from natural processes, which are constantly replenished [1]. re has been widely promoted in many countries to mitigate the use of electricity from the main grid [2, 3]. re prevails over fossil fuels because of the high price of oil; furthermore, it is less harmful to the environment than a traditional power plant [4, 5]. among the different forms of renewable energy, solar radiation can be used to generate electricity and heat. it offers a sustainable energy supply to the domestic and industrial sectors and has demonstrated a promising energy-economic development [6, 7]. the combination of photovoltaic and thermal (pv/t) systems is used to generate electricity and thermal energy. the inclusion of the pv/t system in buildings can achieve substantial energy savings. studies on the efficiency of domestic hot water (dhw) distribution systems in buildings have shown that innovative circulation pipes improve the dhw efficiency by reducing the losses by 40 % [8]. however, it has been acknowledged that the action of cooling and reheating water in pipes may lead to thermal fatigue of fixtures and reduce their life cycle [9, 10]. an improvement of the pv/t system composed of a pv laminate and an absorber with two water channels, in which water flows through the upper channel and returns through the lower channel, was reported [11, 12]. this system presents a high thermal efficiency; however, its geometric complexity makes it difficult to manufacture. a conceptual nanofluid-based pv/t system was developed to improve the thermal and electrical efficiency of the system. it was noted that at a temperature of about 62 °c, the controlled flow rate of the nanofluid yielded a total efficiency of 70 %, while the electrical and the thermal efficiency were 11 % and 59 %, respectively [13]. however, this method is costly, suffers from a high pressure drop, and it is difficult to keep the nanoparticles suspended in the base liquid [14, 15]. an environmentally friendly pv/t system was proposed using a glazed solar collector composed of a pv panel bonded to a metal absorber [16]. the experimental results obtained from the proposed pv/t system show that the pv panel temperature was 45 °c even in summer, while the temperature of the water circulating within the pv/t reached 60 °c with flow rate control [16]. a feasibility study of using the pv/t optimisation as a heat source and sink for a reversible heat pump to cool and heat a standard building in three distinct climate zones was evaluated. this pv/t system proved to be technically feasible, and its yearly costs are relatively similar to the traditional
solar cooling systems that use a reversible air-to-water heat pump as the heat and cold source [17]. in view of these findings, it is obvious that an improvement of the pv/t system's performance is needed. this paper proposes an optimisation technique of the pv/t system's performance using the heat flow from the pv cell. therefore, a controllable self-heating (useful heat) pv cell model using an external parameter is developed to support the pv/t system. the pv cell is partially turned into a useful heat source. the modelling and analysis of the pv cell, as well as the thermal power, electrical power and energy efficiency, were evaluated.

2. theoretical analysis of photovoltaic modules

solar photovoltaic technology is highly appreciated due to its abundance and environmental friendliness as compared to other sources. the pv module performance characteristics mainly depend on the ambient temperature and solar radiation. they also depend on parameters such as the local wind speed and the material and structure of the photovoltaic module, such as the glazing-cover transmittance and absorbance [18, 19]. these parameters have an impact on the low energy-conversion efficiency.

2.1. influence of solar radiation

the overall photovoltaic module performances are typically defined by the standard test conditions (stc): radiation of 1000 w/m², ambient temperature of 25 °c and air mass of 1.5, with no air velocity near the pv module. however, these performances are completely different once the module operates in real-world conditions; this difference is due to the perpetual change of the conditions. the pv module performance is associated with the absorption of the solar radiation, the position of the sun through each day and the apparent movement of the sun during the year [20]. solar radiation does not reach the earth's surface intact, because it passes through the earth's atmosphere. the luminous intensity and its spectrum depend not only on the composition of the atmospheric particles and gases but also on clouds [21–23]. the impact of irradiance on the pv module is given in equation 1:

$$I_{sc}(T) = I_{sc,ref}\left[1 + \alpha\,(T - 25)\right]\frac{G}{1000~\mathrm{W/m^2}} \qquad (1)$$

where $I_{sc,ref}$, $G$, $\alpha$ and $T$ represent the reference short-circuit current at 25 °c, the global solar radiation on the photovoltaic module surface (w/m²), a constant temperature coefficient of the module, and the temperature of the photovoltaic module in kelvin (k), respectively. photovoltaic modules are made to convert solar radiation into electrical energy. figure 1 illustrates the influence of the variation of the irradiation intensity on the pv module.

figure 1. the pv cell's characteristics under various solar radiation [24].
figure 2. the pv module characteristics under various temperatures and an irradiation intensity of 1000 w/m² [25].

figure 1 shows that when the solar radiation increases from 233 to 1000 w/m², the maximum power increases from 30 to 120 mw. the open-circuit voltage of the pv module increases by 0.05 v, while the current stays constant [21, 24].
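as a quick numerical illustration of equation 1, the following python snippet evaluates the short-circuit current over a small grid of irradiance and temperature values. the reference current and temperature coefficient are borrowed from table 1 and table 2 of this paper purely for illustration; the function is a sketch of the relation, not the simulink model.

```python
def short_circuit_current(g_wm2, t_c, isc_ref=14.64, alpha=0.000905):
    """equation 1: isc scaled by irradiance with a linear temperature term.

    isc_ref [a] is taken from table 1 and alpha [1/°c] from table 2;
    both are assumptions made only to produce concrete numbers."""
    return isc_ref * (1.0 + alpha * (t_c - 25.0)) * g_wm2 / 1000.0

for g in (233, 500, 1000):
    for t in (25, 50, 75):
        print(f"g = {g:4d} w/m2, t = {t:2d} °c -> isc = {short_circuit_current(g, t):.2f} a")
```

the output shows the near-proportional scaling with irradiance and the comparatively small temperature effect on isc that the paper notes below.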
2.2. influence of the operating temperature of the pv module

the temperature rise of the photovoltaic (pv) module reduces its open-circuit voltage (voc) and decreases the maximum power (pmp). at high temperatures, the formation of electron-hole pairs and the band gap of the photovoltaic module decrease, while the dark saturation current increases [27–29]. figure 2 illustrates the i–v characteristics curve of the photovoltaic performance. the voc dependence on t is given by the equation below:

$$V_{oc}(T) = V_{oc,ref}\left[1 + \beta\,(T_c - T_{ref})\right] \qquad (2)$$

where $V_{oc,ref}$, $\beta$, $T_c$ and $T_{ref}$ represent the reference open-circuit voltage, the temperature coefficient, the operating temperature of the module and the reference temperature at 25 °c, respectively. the derivative of voc with respect to the temperature, expressed using the energy gap of the semiconductor, is given in equation 3:

$$\frac{dV_{oc}}{dT} = \frac{V_{oc}}{T} - \frac{\gamma k}{q} - \frac{E_{g0}}{qT} = \frac{1}{T}\left(V_{oc} - \frac{E_{g0}}{q}\right) - \frac{\gamma k}{q} \qquad (3)$$

where $\gamma$, $k$, $E_{g0}$ and $q$ represent the specific temperature coefficient, the boltzmann constant, the band gap of the material and the electron charge (c), respectively. the defined photovoltaic module parameters are the maximum voltage, open-circuit voltage, maximum current, short-circuit current, maximum power, fill factor and efficiency. in figure 2, it is denoted that when the temperature increases from 0 to 75 °c, as an immediate consequence, the open-circuit voltage of the photovoltaic module decreases from 40 to 31 v, the maximum power point declines by 55 w and the short-circuit current increases slightly by 0.3 a [30–32]. it is also observed that temperature variations have a marginal effect on isc, while having a substantial impact on voc [28, 33]. the characteristics curve changes when photovoltaic modules are exposed to cell damage, radiation change, temperature inequality, local shading and dust, which considerably decrease the output power [30–32].

2.3. losses due to extrinsic and intrinsic effects in a solar cell

different power losses occur in the pv cell and can be categorised as extrinsic and intrinsic losses, and optical and electrical losses [26, 34], as shown in figure 3.

figure 3. different sources of losses [26].

extrinsic losses: this type of loss is caused by reflection, cell damage, shading, series resistance, radiation change, incomplete collection of the generated photocarriers, absorption in the window layer and non-radiative recombination. if the pv module operates under partial shading, the shaded cell is reversely polarised and amplified in the opposite direction; this produces high temperatures because the cell is loaded [35–37].

intrinsic losses: this type of loss is caused by two factors, both related to the inability of a single-junction solar cell to respond adequately to all wavelengths of the spectrum. the solar cell is transparent to photons whose energy (eph) is less than the band gap energy (eg) of the semiconductor (eph < eg). however, when the photon energy is higher than the band gap energy of the semiconductor (eph > eg), the extra energy is dissipated in the form of heat. a further loss is due to radiative recombination in the solar cell. the common semiconductor materials used for solar cells are monocrystalline, polycrystalline and amorphous silicon, with efficiencies of 20 %, 12 % and 7 %, respectively [38]. the solar cell heating is inversely proportional to the efficiency [29].

3. method and simulation set up

the simulation predicts the thermal behaviour patterns, the total power dissipated (pd), the power generated and the effectiveness of the pv cell model. this pv model comprises a diode (made of the semiconductor material of the photovoltaic cell), an internal series resistance and an internal parallel resistance. it should be noted that for a simulation of a physical phenomenon, like the heat transfer problem here, in simulink/simscape, there is a need to establish the calculations of the heat transfer occurring in this study.
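before looking at the simulink model, the heat transfer bookkeeping can be illustrated with a minimal steady-state energy balance: the absorbed radiation splits into electrical output, convection and radiation (conduction into the fluid is neglected in this toy version). the sketch below solves such a balance for the cell temperature by fixed-point iteration; the unit area, the constant electrical efficiency and the coefficient values (convective coefficient 20 w/(m²·k) from the simulation set-up, emissivity 0.75 and absorbance 0.8 from table 2) are simplifying assumptions, not the simscape model itself.

```python
SIGMA = 5.670e-8   # stefan-boltzmann constant [w/(m2*k4)]

def cell_temperature(g=1000.0, t_amb=293.15, area=1.0,
                     absorb=0.8, emis=0.75, h_conv=20.0, eta_el=0.16):
    """solve absorb*g*a - p_el = q_conv + q_rad for the cell temperature.

    fixed-point iteration: at each step the convective term is inverted
    for t_c while the radiative term uses the previous estimate."""
    t_c = t_amb + 20.0                     # initial guess [k]
    for _ in range(200):
        q_in = absorb * g * area - eta_el * g * area
        q_rad = emis * SIGMA * area * (t_c**4 - t_amb**4)
        t_c = t_amb + (q_in - q_rad) / (h_conv * area)
    return t_c

tc = cell_temperature()
print(f"steady-state cell temperature ~ {tc - 273.15:.1f} °c")
```

with these assumed numbers the balance settles around 45 °c, which is of the same order as the temperatures reported for the model later in the paper.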
figure 4. heat transfer characteristics of the pv system.
figure 5. evaluation of pv performance under extrinsic cell resistance.

figure 4 depicts the heat transfer characterisation of the pv system. the following section evaluates the distribution of the power dissipated in the pv cells, triggered by the extrinsic cell resistance (rse). for this, the electro-thermo-radiative behaviour pattern of the pv cell was simulated for various values of rse ranging from 0 to 100 ω, while maintaining the other parameters: solar radiation at 1000 w/m², ambient temperature at 20 °c and convective heat transfer at 20 w/(m²·k). the extrinsic cell resistance (rse) is illustrated in figure 5. the value of rse can be obtained analogically with a variable resistance. it can also be obtained electronically by applying a voltage to the fet's gate pin: the channel resistance of the fet is a function of the gate-source voltage, and by increasing the reverse biasing, the resistance increases. the pv module parameters used in this study are listed in table 1 and table 2. the entire pv system consists of two pv arrays assumed to perform identically and connected in a parallel configuration; the system has a capacity of 3.24 kwp at 1000 w/m².

4. simulation results and discussion

the pv cell model is analysed and discussed to better appreciate the optimisation technique of the pv/t system using the pv cell. the simulation is performed under stable conditions.

4.1. pv cell power dissipation as a function of rse

the parameters representing the pv cell's internal properties comprise a diode, a series resistance and a parallel resistance. the model is assessed based on the extrinsic cell resistance. figure 6 shows an increment in the total pd curve, from 990 to 3490 w, as rse increases. the series resistance losses increase marginally, while the parallel resistance losses remain virtually the same. a considerable amount of the total power dissipation is attributed to the diode (because of the recombination current of the semiconductor material used to make the pv cell model), ranging from 750 to 3480 w. the resistivity losses in the series and parallel resistances decrease as less current flows through them. a substantial reverse current occurs in the pv cells in the form of heat; this reverse current leads to a pd, then to a local overheating, and turns into heat by conduction. the pv thermal resistance varies based on the width of the material and its thermal resistivity. the increase in rse reduces the fill factor and thus decreases the maximum power point of the pv cells. the graph in figure 6 is consistent with those obtained in previous studies [39–41]; here, however, rse represents the restricted conductivity of the terminal material used. the following trend can be elucidated: rse forces a partial conversion of the pv output into useful thermal energy. this study focuses on the internal heat generation and the electrical power generation of the pv cell based on rse. the technique relies on a linear regression equation curve to model the behaviour of the different types of power as a function of rse in the pv cell being studied.
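the sweep described above can be emulated with a textbook single-diode pv model in series with an external resistance, computing at each rse value the power delivered past the external resistance and the power dissipated in it. the diode parameters below are generic illustrative values built around table 1 (the internal resistances are the per-cell values), the rse grid is scaled down to per-cell magnitudes, and the maximum power point is found by brute-force search; none of this reproduces the simscape model exactly.

```python
import math

Q, K = 1.602e-19, 1.381e-23   # electron charge [c], boltzmann constant [j/k]

def cell_current(v, i_ph=14.64, i_0=1e-9, n=1.3, t=298.15,
                 r_s=0.0042, r_p=10.1, iters=80):
    """single-diode model solved for the current at cell voltage v
    by a damped fixed-point iteration (r_s, r_p per cell, table 1)."""
    vt = n * K * t / Q
    i = i_ph
    for _ in range(iters):
        vd = v + i * r_s
        i_new = i_ph - i_0 * (math.exp(vd / vt) - 1.0) - vd / r_p
        i = 0.5 * i + 0.5 * i_new      # damping for numerical stability
    return max(i, 0.0)

def sweep_rse(rse_values, v_max=0.75, steps=400):
    for rse in rse_values:
        best = (0.0, 0.0)              # (delivered power, power dissipated in rse)
        for k in range(1, steps):
            v_cell = v_max * k / steps
            i = cell_current(v_cell)
            v_load = v_cell - rse * i  # external series drop
            if v_load > 0.0 and v_load * i > best[0]:
                best = (v_load * i, rse * i * i)
        print(f"rse = {rse:6.3f} ohm -> p_load ~ {best[0]:5.2f} w/cell, "
              f"p_rse ~ {best[1]:5.2f} w/cell")

sweep_rse([0.0, 0.005, 0.01, 0.02, 0.05])
```

the qualitative behaviour matches figure 6 and figure 7: as the external resistance grows, the deliverable electrical power collapses while the power converted to heat in the resistance rises.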
table 1. pv module parameters:

  cell type                        mono-crystalline
  packing factor                   0.91
  conversion efficiency            16 %
  module peak power                3.25 kw
  maximum voltage, vm              255 v
  maximum current, im              12.4 a
  open circuit voltage, voc        310 v
  short circuit current, isc       14.64 a
  series resistance rsi / cell     0.0042 ω
  parallel resistance rpi / cell   10.1 ω

table 2. optical parameters of pv cells:

  absorbance α              0.8
  emissivity ε              0.75
  thermal conductivity      840
  thickness δ               0.003
  temperature coefficient   0.000905
  energy gap eg             1.11

figure 6. dissipated power by pv cell versus rse.
figure 7. rse according to the power generated and the heat by conduction of the pv module.

4.2. estimation of heat transfer by conduction and generated pv power as a function of rse

the synthesis of the results, illustrated in figure 7 in 3d, shows a standardised map of rse as a function of the heat transfer by conduction and the generated pv power. the normalised yields are plotted on this map, which includes the polynomial surface of the model. by adjusting rse, the equivalent values of the electrical power generated and the electrical power dissipated by heat conduction are determined. figure 7 presents the heat generated by conduction (qcond) within the pv cell. qcond rises from 425 to 1715 w (in magnitude) as rse moves from 0 to 100 ω. this heat is ascribed to the electrical power dissipated in the pv cell, and part of the dissipated power turns into useful energy within the pv cell. the temperature difference is the main impetus behind the conductive heat flow in a material with a given thermal resistance, and the transfer is governed by fourier's law. it can be seen in figure 7 that the generated pv power decreases as rse increases. the power falls rapidly (exponentially) from 2800 w to 260 w when rse increases from 0 to 50 ω, and beyond 50 ω the power decreases more slowly, from 255 w to 110 w. the power degradation of the pv cell is due to recombination as rse varies, leading to an electrical power dissipation in the form of heat by conduction. these outcomes are in concurrence with those acquired by other authors, where the rise of rse is attributed to dust particles on the pv model [42–44]. contrary to other studies, rse is used here to proportionally influence the electrical power and the power dissipation of the pv cell. a polynomial model appropriately represents the graphical model of the results; it can be used to predict and interpret the pv cell's performance. the confidence intervals and the means of the linear regression equation for the graphical model were derived. the estimation surface is expressed by equation 4:

$$R_{se}(Q_{cond}, P_{PV}) = p_{00} + p_{10}\,Q_{cond} + p_{01}\,P_{PV} + p_{20}\,Q_{cond}^2 + p_{11}\,Q_{cond}\,P_{PV} + p_{02}\,P_{PV}^2 \qquad (4)$$

where $p_{00}$, $p_{10}$, $p_{01}$, $p_{20}$, $p_{11}$ and $p_{02}$ are coefficients, $Q_{cond}$ is the heat transfer by conduction (w), $P_{PV}$ is the generated pv power and $R_{se}$ is the external series resistance (ω). table 3 describes the polynomial interpretation of the surface plot of the heat conduction and the pv power as a function of rse in figure 7; the fitted surface is given by equation 4. for finding the optimal power (electrical or thermal) at any selected value of rse, the fit has a coefficient of determination (r²) of 0.9998 and an rmse of 0.4791.

table 3. linear model poly22 of figure 7 (fit of equation 4; x normalised by mean −1385 and std 328.5, y normalised by mean 561.3 and std 662.5):

  goodness of fit:  sse 12.85, r-square 0.9998, adjusted r-square 0.9997, rmse 0.4791
  coefficients (with 95 % confidence bounds):
    p00 = −829 (−982.3, −675.8)
    p10 = 3100 (2542, 3658)
    p01 = −4467 (−5261, −3673)
    p20 = 965.3 (808.8, 1122)
    p11 = −223.2 (−242.9, −203.5)
    p02 = 145.4 (127.7, 163.1)
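the poly22 surface of equation 4 is an ordinary linear least-squares problem once the monomial terms are assembled. the following sketch fits such a surface with numpy on synthetic (qcond, ppv, rse) triples; the synthetic data generator only mimics the trends quoted above and, unlike the paper's fit, no normalisation of the inputs is applied.

```python
import numpy as np

def fit_poly22(q_cond, p_pv, r_se):
    """least-squares fit of r_se = p00 + p10*q + p01*p + p20*q^2 + p11*q*p + p02*p^2."""
    A = np.column_stack([np.ones_like(q_cond), q_cond, p_pv,
                         q_cond**2, q_cond * p_pv, p_pv**2])
    coef, *_ = np.linalg.lstsq(A, r_se, rcond=None)
    resid = r_se - A @ coef
    r2 = 1.0 - np.sum(resid**2) / np.sum((r_se - r_se.mean())**2)
    rmse = np.sqrt(np.mean(resid**2))
    return coef, r2, rmse

# illustrative synthetic data standing in for the simulated surface
rng = np.random.default_rng(0)
rse = np.linspace(0.0, 100.0, 60)
q_cond = -(425.0 + 12.9 * rse) + rng.normal(0, 5, rse.size)        # conduction grows in magnitude
p_pv = 2800.0 * np.exp(-rse / 18.0) + 110.0 + rng.normal(0, 5, rse.size)

coef, r2, rmse = fit_poly22(q_cond, p_pv, rse)
print("coefficients p00..p02:", np.round(coef, 3))
print(f"r2 = {r2:.4f}, rmse = {rmse:.3f}")
```

once the coefficients are known, the surface can be evaluated in reverse, as the paper does, to pick the rse that yields a desired split between thermal and electrical power.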
4.3. convection and radiation heat generated by the pv cell

figure 8 illustrates a steady increase in the convection (qconv) from 3100 w to 5300 w when the rse value increases from 0 to 100 ω, as the heat is carried away to the atmosphere. the heating of the pv cell is caused by the high electrical power dissipation and the heat loss by conduction happening in the pv cell. the thermal loss by qconv increases faster when rse is in the range between 0 and 50 ω; beyond 50 ω, qconv slows down and approaches saturation. figure 9 presents the incremental change of the radiation (qrad) from 350 w to 660 w when rse increases from 0 to 100 ω. the pv cell emits radiation based on its temperature; the losses also depend on the absorptivity of the covering glass. the outcomes shown in figure 8 and figure 9 demonstrate that the growth of the heat loss by qconv is higher than that by qrad. both were assessed as positive values, which shows that they are carried away into the ambient environment, while qcond is measured as a negative value in figure 7; this negative value indicates that qcond is directed mostly inside the pv cell. comparing the heat transfers happening in the pv cell, qconv, qcond and qrad varied by 2200 w, 1290 w and 310 w, respectively, as rse varied by 100 ω. other authors have discussed the convection, conduction and radiation heat transfer occurring in the pv module [45–48]; here, the effect of rse is included.

figure 8. convection heat transfer versus rse.
figure 9. radiation heat transfer versus rse.
figure 10. pv cell temperature versus rse.
figure 11. pv cell electrical efficiency dependence on rse.

4.4. the pv cell temperature under rse variation

figure 10 illustrates the logarithmic growth of tc as a function of rse. as rse varies from 0 to 50 ω, the temperature tc rises from 45 to 59 °c, and as rse increases from 50 to 100 ω, tc increases slowly from 59 to 62 °c. the rise in tc reflects the build-up of electrical pd in the form of heat seen in figure 6. most of the previous works show that rse grows with the rise of the pv cell temperature [41]; inversely, here, the pv cell temperature is controlled by rse, to improve the pv/t system's thermal efficiency.

4.5. pv cell electrical efficiency under rse variation

figure 11 presents the pv cell electrical efficiency dependence on rse. the pv cell electrical efficiency falls quickly (exponentially) from 14.2 % to 2.5 % when rse increases from 0 to 50 ω, and above 50 ω, the efficiency decreases slowly from 2.5 % to 1.5 %. this was observed to be in agreement with the results reported by similar studies [41, 49]. the degradation of the pv cell power is due to the power dissipation in the form of heat shown in figure 6.
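the signs and relative magnitudes discussed in section 4.3 follow from the standard newton-cooling and stefan-boltzmann expressions. the snippet below evaluates qconv and qrad for the cell temperatures reported in figure 10, using the convective coefficient from the simulation set-up and the emissivity from table 2; the effective heat-exchange area is an illustrative assumption chosen only to make the numbers concrete, so the results are of the right order but do not match the simscape model exactly.

```python
SIGMA = 5.670e-8            # stefan-boltzmann constant [w/(m2*k4)]
H_CONV, EMIS = 20.0, 0.75   # from the simulation set-up and table 2
AREA = 7.0                  # assumed effective heat-exchange area [m2] (illustrative)
T_AMB = 293.15              # ambient temperature, 20 °c [k]

def q_conv(t_cell_k):
    """newton cooling: heat carried to the atmosphere."""
    return H_CONV * AREA * (t_cell_k - T_AMB)

def q_rad(t_cell_k):
    """net radiative exchange with the surroundings."""
    return EMIS * SIGMA * AREA * (t_cell_k**4 - T_AMB**4)

for rse, t_c in [(0, 45.0), (50, 59.0), (100, 62.0)]:   # temperatures from figure 10
    tk = t_c + 273.15
    print(f"rse = {rse:3d} ohm, tc = {t_c:.0f} °c -> "
          f"qconv ~ {q_conv(tk):6.0f} w, qrad ~ {q_rad(tk):5.0f} w")
```

both terms are positive for any cell temperature above ambient, which is exactly why the paper reads them as heat leaving the cell, in contrast to the negative (inward) conduction term.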
4.6. generated power through time

here, the heat by conduction corresponds to the useful thermal energy. as shown in figure 7, the electrical power and the heat by conduction in the pv module vary based on rse, mainly due to the power dissipation. this indicates that, once rse is selected, the pv will deliver a given split of electrical power and thermal power under a given condition. for example, it is observed in figure 12 that when rse is 0 ω, the electrical and thermal power at the steady state are 2835 w and 450 w, respectively; the electrical power is prioritised. in figure 13, when rse is set to 20 ω, the electrical and thermal power at the steady state are 831 w and 1150 w, respectively; the electrical power is degraded to prioritise the useful thermal energy. rse is thus used to control the energy split of the pv module. these findings are consistent with similar previous studies [42, 43].

figure 12. generated powers from the pv cell when rse is 0 ω.
figure 13. generated powers from the pv cell when rse is 20 ω.

5. conclusion

the design and modelling of a pv cell system were carried out in matlab/simulink to validate the heat transfer occurring in the pv cell model. the pv cell's output is partially converted into useful thermal energy (internal heat generation) for domestic hot water supply and space heating. a change of rse might be an effective method of controlling the amount of thermal and electrical energy from the pv cell. the technique is determined by a linear regression equation curve modelling the behavioural patterns of various types of power (thermal and electrical) as a function of rse. these findings are particularly useful for household water-heating systems: rse may be adjusted to produce supplementary heat while the fluid carries the produced heat to the load. further research will develop a model that incorporates an absorber pipe affixed at the rear of the pv cell model, all together linked to a hydraulic pump and a storage device. the optimisation technique that modulates the ratio of thermal to electrical energy generated from the pv cell may be used to optimise the combined pv/t system's performance.

references

[1] n. el bassam, p. maegaard, m. l. schlichting. chapter six: energy basics, resources, global contribution and applications. in distributed renewable energies for off-grid communities, pp. 85–90. elsevier, 2013. https://doi.org/10.1016/b978-0-12-397178-4.00006-2
[2] w.-c. lu. greenhouse gas emissions, energy consumption and economic growth: a panel cointegration analysis for 16 asian countries. international journal of environmental research and public health 14(11):1436, 2017. https://doi.org/10.3390/ijerph14111436
[3] a. a. a. moussavou, m. adonis, a. k. raji. microgrid energy management system control strategy. in 2015 international conference on the industrial and commercial use of energy (icue), pp. 147–154. 2015. https://doi.org/10.1109/icue.2015.7280261
[4] a. n. nunes. energy changes in portugal. an overview of the last century. méditerranée: revue géographique des pays méditerranéens / journal of mediterranean geography (130), 2018. https://doi.org/10.4000/mediterranee.10113
[5] a. stocker, a. großmann, r. madlener, m. i. wolter. sustainable energy development in austria until 2020: insights from applying the integrated model "e3.at". energy policy 39(10):6082–6099, 2011. https://doi.org/10.1016/j.enpol.2011.07.009
[6] d. banks, j. schäffler. the potential contribution of renewable energy in south africa. sustainable energy & climate change project, 2006.
[7] a. chel, g. kaushik. renewable energy technologies for sustainable development of energy efficient building. alexandria engineering journal 57(2):655–669, 2018. https://doi.org/10.1016/j.aej.2017.02.027
[8] b. bøhm. production and distribution of domestic hot water in selected danish apartment buildings and institutions. analysis of consumption, energy efficiency and the significance for energy design requirements of buildings. energy conversion and management 67:152–159, 2013. https://doi.org/10.1016/j.enconman.2012.11.002
[9] g. y. chuang, y. m. ferng. experimentally investigating the thermal mixing and thermal stripping characteristics in a t-junction. applied thermal engineering 113:1585–1595, 2017. https://doi.org/10.1016/j.applthermaleng.2016.10.157
[10] r. tunstall, d. laurence, r. prosser, a. skillen. large eddy simulation of a t-junction with upstream elbow: the role of dean vortices in thermal fatigue. applied thermal engineering 107:672–680, 2016. https://doi.org/10.1016/j.applthermaleng.2016.07.011
[11] j.-h. kim, j.-t. kim. the experimental performance of an unglazed pvt collector with two different absorber types. international journal of photoenergy 2012, 2012. https://doi.org/10.1155/2012/312168
[12] h. a. zondag. flat-plate pv-thermal collectors and systems: a review. renewable and sustainable energy reviews 12(4):891–959, 2008. https://doi.org/10.1016/j.rser.2005.12.012
[13] z. xu, c. kleinstreuer. concentration photovoltaic–thermal energy co-generation system using nanofluids for cooling and heating. energy conversion and management 87:504–512, 2014. https://doi.org/10.1016/j.enconman.2014.07.047
[14] a. h. a. al-waeli, m. t. chaichan, h. a. kazem, et al. numerical study on the effect of operating nanofluids of photovoltaic thermal system (pv/t) on the convective heat transfer. case studies in thermal engineering 12:405–413, 2018. https://doi.org/10.1016/j.csite.2018.05.011
[15] p. k. nagarajan, j. subramani, s. suyambazhahan, r. sathyamurthy. nanofluids for solar collector applications: a review. energy procedia 61:2416–2434, 2014. https://doi.org/10.1016/j.egypro.2014.12.017
[16] k. terashima, h. sato, t. ikaga. development of an environmentally friendly pv/t solar panel. solar energy 199:510–520, 2020. https://doi.org/10.1016/j.solener.2020.02.051
[17] r. braun, m. haag, j. stave, et al. system design and feasibility of trigeneration systems with hybrid photovoltaic-thermal (pvt) collectors for zero energy office buildings in different climates. solar energy 196:39–48, 2020. https://doi.org/10.1016/j.solener.2019.12.005
[18] o. dupre, b. niesen, s. de wolf, c. ballif. field performance versus standard test condition efficiency of tandem solar cells and the singular case of perovskites/silicon devices. the journal of physical chemistry letters 9(2):446–458, 2018. https://doi.org/10.1021/acs.jpclett.7b02277
[19] l. hernández-callejo, s. gallardo-saavedra, v. alonso-gómez. a review of photovoltaic systems: design, operation and maintenance. solar energy 188:426–440, 2019. https://doi.org/10.1016/j.solener.2019.06.017
[20] v. perraki, p. kounavis. effect of temperature and radiation on the parameters of photovoltaic modules. journal of renewable and sustainable energy 8(1):013102, 2016. https://doi.org/10.1063/1.4939561
[21] m. r. maghami, h. hizam, c. gomes, et al. power loss due to soiling on solar panel: a review. renewable and sustainable energy reviews 59:1307–1316, 2016. https://doi.org/10.1016/j.rser.2016.01.044
[22] j. page. chapter iia-1: the role of solar-radiation climatology in the design of photovoltaic systems. in practical handbook of photovoltaics, pp. 573–643. academic press, boston, second edition, 2012. https://doi.org/10.1016/b978-0-12-385934-1.00017-9
[23] t. markvart, l. castañer (eds.). practical handbook of photovoltaics: fundamentals and applications. elsevier, 2013. https://doi.org/10.1016/b978-1-85617-390-2.x5000-4
[24] c. xiao, x. yu, d. yang, d. que. impact of solar irradiance intensity and temperature on the performance of compensated crystalline silicon solar cells. solar energy materials and solar cells 128:427–434, 2014. https://doi.org/10.1016/j.solmat.2014.06.018
[25] j.-c. wang, y.-l. su, j.-c. shieh, j.-a. jiang. high-accuracy maximum power point estimation for photovoltaic arrays. solar energy materials and solar cells 95(3):843–851, 2011. https://doi.org/10.1016/j.solmat.2010.10.032
[26] a. r. jha. solar cell technology and applications. auerbach publications, 2009. https://doi.org/10.1201/9781420081787
[27] f. fertig, s. rein, m. schubert, w. warta. impact of junction breakdown in multi-crystalline silicon solar cells on hot spot formation and module performance. in 26th european photovoltaic solar energy conference and exhibition, pp. 1168–1178. 2011. https://doi.org/10.4229/26theupvsec2011-2do.3.1
[28] p. singh, n. m. ravindra. temperature dependence of solar cell performance: an analysis. solar energy materials and solar cells 101:36–45, 2012. https://doi.org/10.1016/j.solmat.2012.02.019
[29] j. zaraket, t. khalil, m. aillerie, et al. the effect of electrical stress under temperature in the characteristics of pv solar modules. energy procedia 119:579–601, 2017. https://doi.org/10.1016/j.egypro.2017.07.083
[30] j. c. teo, r. h. g. tan, v. h. mok, et al. impact of partial shading on the pv characteristics and the maximum power of a photovoltaic string. energies 11(7):1860, 2018. https://doi.org/10.3390/en11071860
[31] a. j. swart, p. e. hertzog. varying percentages of full uniform shading of a pv module in a controlled environment yields linear power reduction. journal of energy in southern africa 27(3):28–38, 2016.
[32] p. arjyadhara, s. m. ali, j. chitralekha. analysis of solar pv cell performance with changing irradiance and temperature. international journal of engineering and computer science 2(1):214–220, 2013.
[33] p. löper, d. pysch, a. richter, et al. analysis of the temperature dependence of the open-circuit voltage. energy procedia 27:135–142, 2012.
[34] c. h. henry. limiting efficiencies of ideal single and multiple energy gap terrestrial solar cells. journal of applied physics 51(8):4494–4500, 1980. https://doi.org/10.1063/1.328272
[35] g. trzmiel, d. głuchy, d. kurz. the impact of shading on the exploitation of photovoltaic installations. renewable energy 153:480–498, 2020. https://doi.org/10.1016/j.renene.2020.02.010
[36] a. m. humada, f. b. samsuri, m. hojabria, et al. modeling of photovoltaic solar array under different levels of partial shadow conditions. in 16th international power electronics and motion control conference and exposition, pp. 461–465. 2014. https://doi.org/10.1109/epepemc.2014.6980535
[37] f. lu, s. guo, t. m. walsh, a. g. aberle. improved pv module performance under partial shading conditions. energy procedia 33:248–255, 2013. https://doi.org/10.1016/j.egypro.2013.05.065
[38] l. a. kosyachenko. solar cells: thin-film technologies, chap. thin-film photovoltaics as a mainstream of solar power engineering, pp. 1–40. intechopen limited, london, 2011. https://doi.org/10.5772/39070
[39] d. kiermasch, l. gil-escrig, h. j. bolink, k. tvingstedt. effects of masking on open-circuit voltage and fill factor in solar cells. joule 3(1):16–26, 2019. https://doi.org/10.1016/j.joule.2018.10.016
[40] h. a. koffi, a. a. yankson, a. f. hughes, et al. determination of the series resistance of a solar cell through its maximum power point. african journal of science, technology, innovation and development 12(6):699–702, 2020. https://doi.org/10.1080/20421338.2020.1731073
[41] m. wolf, h. rauschenbach. series resistance effects on solar cell measurements. advanced energy conversion 3(2):455–479, 1963. https://doi.org/10.1016/0365-1789(63)90063-8
[42] p. g. kale, k. k. singh, c. seth. modeling effect of dust particles on performance parameters of the solar pv module. in 2019 fifth international conference on electrical energy systems, pp. 1–5. 2019. https://doi.org/10.1109/icees.2019.8719298
[43] a. hussain, a. batra, r. pachauri. an experimental study on effect of dust on power loss in solar photovoltaic module. renewables: wind, water, and solar 4(1):9, 2017. https://doi.org/10.1186/s40807-017-0043-y
[44] k. dastoori, g. al-shabaan, m. kolhe, et al. charge measurement of dust particles on photovoltaic module. in 8th international symposium on advanced topics in electrical engineering, pp. 1–4. 2013. https://doi.org/10.1109/atee.2013.6563411
[45] r. vaillon, o. dupré, r. b. cal, m. calaf. pathways for mitigating thermal losses in solar photovoltaics. scientific reports 8:13163, 2018.
[46] m. hammami, s. torretti, f. grimaccia, g. grandi. thermal and performance analysis of a photovoltaic module with an integrated energy storage system. applied sciences 7(11):1107, 2017. https://doi.org/10.3390/app7111107
[47] r. masoudi nejad. a survey on performance of photovoltaic systems in iran. iranian (iranica) journal of energy & environment 6(2):77–85, 2015. https://doi.org/10.5829/idosi.ijee.2015.06.02.01
[48] j. a. duffie, w. a. beckman. solar engineering of thermal processes. wiley, new york, 1991.
[49] p. singh, n. ravindra. analysis of series and shunt resistance in silicon solar cells using single and double exponential models. emerging materials research 1:33–38, 2012. https://doi.org/10.1680/emr.11.00008

a digital loudspeaker: experimental construction

p. valoušek

to verify the principal functionality of a digital loudspeaker, an experimental construction of a digital loudspeaker was designed. the system consists of a signal processing unit and a transducer array. both parts offer 8-bit precision and a wide spectrum of sampling frequencies. the signal processing unit is based on a programmable logic device, which provides a flexible system for preparing the driving signal. the transducer array is formed by conventional dynamic transducers arranged in a circle. the initial measurements and listening tests provide acceptable results that are valuable for further digital loudspeaker developments.

keywords: digital loudspeaker, signal processing, transducer array.

1 introduction

1.1 principle of a digital loudspeaker

the fundamental idea of the digital loudspeaker is to shift the d/a conversion process to the very end of the audio chain. a digital loudspeaker is a single-device system which uses direct d/a conversion of digital audio signals. this conversion can be performed in various ways, and more than one implementation of the system is possible. the conversion process itself is based on the same principle of summing the partial weighted signals of the input digital signal as in the case of electric converters. the driving signals correspond to the single bits of the input signal and their binary weights. acoustic pressures, electromagnetic, electrostatic and other forces can be summed, depending on the construction and the principle of the electro-acoustic conversion of the system. the principle of a digital loudspeaker is shown in fig. 1.

fig. 1: principle of a digital loudspeaker with summation of acoustic pressures

1.2 driving a digital loudspeaker

a tri-state driving signal can be prepared by multiplying the single data bit streams by the signum bit of the direct digital code of the pcm signal. the resulting bit streams can be directly used to drive the transducers of a digital loudspeaker. the driving bit streams created from a digital representation of 2 periods of a 1 khz sine signal at a 44.1 khz sampling frequency are shown in fig. 2.

fig. 2: driving bit streams

2 experimental construction

2.1 signal processing unit

a serial digital interface, spdif, was chosen as the master input. this digital format is widely used in consumer audio, and all digital audio equipment uses this interface.
therefore, the signal source for a digital loudspeaker can be connected to any cd/dvd player or a properly equipped personal computer, which can also act as an a/d converter and a powerful signal processing unit. the serial spdif signal needs to be decoded into a pure audio data stream and converted to parallel form. then any signal processing, such as binary code conversion, dithering and decimation, can be applied. the parallel data bit streams must be multiplied by the signum bit stream to achieve the tri-state character of the driving signal, and must be amplified according to the binary weight of the corresponding bit. a scheme of the experimental signal source chain is shown in fig. 3. the input digital spdif signal is decoded by a texas instruments dir1703 digital receiver integrated circuit to an i2s industry-standard serial audio bus. the receiver also provides the sample rate detection and master processing clock recovery. the serial audio bus is received by the xilinx coolrunner-ii programmable logic circuit, which performs the main processing: the serial-to-parallel conversion, binary code conversion and, optionally, decimation. the programmable logic circuit is designed to process two channels with 16-bit data. the parallel data streams are then tri-state coded by an hc4051 multiplexer switching matrix with 8 single multiplexer circuits. currently the switching matrix is able to process 8-bit data in one channel and can easily be upgraded to 16 bits. in the amplifier section, the data streams can be electrically weighted by resistor dividers on the input side of the tda2822n power operating amplifier. the amplifier also adjusts the output parameters to conform to the transducer characteristics. the operational amplifiers have a very wide frequency characteristic, therefore they can process a digital rectangular signal without distortion. the amplifier is designed for 8 bits and can also easily be upgraded to 16 bits. the experimental signal source chain is shown in fig. 4.

fig. 3: signal source chain scheme
fig. 4: signal processing unit chain

2.2 transducer array

for the experimental construction of a digital loudspeaker, an array of identical dynamic transducers was created. arz 6604 wide-range electro-dynamic transducers were chosen because of their wide frequency response and relatively small dimensions. the impulse and frequency response of an arz 6604 transducer are shown in fig. 5.

fig. 5: impulse and frequency response of an arz 6604 wide-range dynamic transducer

the array consists of 7 transducers attached to a fiberglass base plate. the transducers are arranged in a circle. with tri-state coding, the 7-transducer array can be driven by an 8-bit digital signal.
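the tri-state coding described in sections 1.2 and 2.1 (splitting a signed pcm sample into magnitude bit streams and multiplying each bit by the signum) can be sketched in a few lines of python. the sketch below generates the per-bit driving streams for a sine input; it illustrates the coding principle only and is not the logic running in the coolrunner-ii.

```python
import math

BITS = 7      # magnitude bit streams; with the signum this forms the
              # 8-bit driving signal matching the 7-transducer array
FS = 44100    # sampling frequency [hz]

def tri_state_streams(samples, bits=BITS):
    """split each signed pcm sample into sign-magnitude bit streams.

    returns one stream per magnitude bit (msb first); each element is
    -1, 0 or +1, i.e. the magnitude bit multiplied by the signum bit."""
    streams = [[] for _ in range(bits)]
    full_scale = 2 ** bits - 1
    for x in samples:                    # x in [-1.0, 1.0]
        sign = 1 if x >= 0 else -1
        mag = min(int(round(abs(x) * full_scale)), full_scale)
        for b in range(bits):
            bit = (mag >> (bits - 1 - b)) & 1
            streams[b].append(sign * bit)
    return streams

# two periods of a 1 khz sine, as in fig. 2
n = 2 * FS // 1000
sine = [math.sin(2 * math.pi * 1000 * i / FS) for i in range(n)]
for b, s in enumerate(tri_state_streams(sine)):
    print(f"bit {b} (weight {2 ** (BITS - 1 - b):2d}):",
          "".join("+0-"[1 - v] for v in s[:44]))
```

printed as +/0/- symbols, the streams reproduce the pattern of fig. 2: the msb stream switches with the signal polarity, while the lower-weight streams toggle increasingly often.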
the experimental array is shown in fig. 6.

fig. 6: transducer array

3 experimental results

the experimental transducer array was measured at various sampling frequencies from 22 khz to 96 khz with a harmonic signal, a harmonic sweep and music signals, with strong dithering applied. although it suffered from high harmonic distortion, its performance was as expected. the harmonic and other types of distortion are caused by distance delays due to the large spacing of the transducers in the array. the array was driven by the digital representation of a 1 khz sine signal at a sampling frequency of 44.1 khz. the signal produced by a single wide-range transducer at the msb bit level is shown in fig. 7 (compare with the msb bit stream in fig. 2). the signal produced by the whole array is shown in fig. 8. listening tests were also performed with the receiving point placed on the axis of the transducer array, because this provides identical distances between the individual transducers and the receiving point.

fig. 7: signal at msb level
fig. 8: signal produced by the whole array

4 conclusion

the principle of the digital loudspeaker has been briefly described, and the experimental construction of a driving signal source and an 8-bit transducer array has been introduced. this design is based on conventional dynamic transducers, which have a limited frequency response and quite a long impulse response. although they are not suitable for digital loudspeaker design, they can be used for experimental purposes to verify the principal functionality of a digital loudspeaker.

5 acknowledgments

this work has been supported by research project msm 6840770014 "research in the area of prospective information and communication technologies".

ing. pavel valoušek
e-mail: valousp@feld.cvut.cz
department of radioelectronics
czech technical university in prague
faculty of electrical engineering
technická 2
166 27 praha, czech republic
supply chain in electro-energetics and real options

o. drahovzal

this paper deals with determining the value of a connection in the supply chain in electro-energetics. first, the useful connection options are described. the principle of deviations in the energy system is then briefly described, and examples of their histories are shown in a chart. next, methods for evaluating the acquisition of a connection with a producer of electricity are explained: the simple classical approach to the problem and the real options approach are described. the real options approach is considered to be the most correct, providing the most realistic results. then there is an example of a calculation for a company, and certain values for the prices of deviations are calculated and shown. finally, a comparison of the results is made, and their consequences for the decision process are explained.

keywords: supply chain, deviations, real options.

1 connections in supply chains (vertical diversifications)

the connection options are not numerous, because the supply chain in electro-energetics is relatively short: producer, (transmission system), (distribution system), customer. the systems in parentheses can be bypassed. this paper takes the view of an electricity distribution company (this does not refer to the distribution of electricity, but to the part of the company that trades with electricity). the only real way to form a connection is with the producer. in the calculations below, a simplification will be made, and only a short-term diagram of supply will be considered. this refers to the differences between the main part of the diagram, covered by standard purchases made in advance on the standard market for electricity, and the prediction of consumption, which is made not more than one day in advance. in the conditions of the czech republic it is not possible to buy a producer who is able to cover the whole diagram of all the company's customers. a connection with a producer could be made using the following options:
• buy a production capacity (an existing facility or a newly-built facility). this is barely feasible (if the producer or facility is big enough, it will be very expensive, and if the producer is small, the impact will be minimal; the purchase depends very much on the price).
• connect with an existing producer through a contract (i.e., buy or rent an exact part of its production capacity, or pay for the rights to use this capacity freely).

the second variant was chosen for the explanation. before attempting to determine the value of the connection, it is important to take into account what might occur during the cooperation with the producer, and the principle underlying the cooperation:
1) the prediction of consumption is higher than the amount of electricity purchased: the production of the source can be increased, or production can be started (substitution of a negative deviation by the costs of the electricity produced).
2) the prediction of consumption is lower than the amount of electricity purchased: the production of the source can be decreased or stopped (substitution of a positive deviation by the costs of the non-produced electricity).

it is clear that the prediction of the price of the deviations will play a very important role in the evaluation.

2 deviations

the market operator in the czech republic performs the so-called settlement of deviations. the operator determines the balance between the purchased / sold and the actually consumed / supplied electricity for each participant in the market and for the system as a whole. the price of the deviations depends on the state of the system at the given moment and also on the price of the ancillary services that had to be activated. a negative deviation from the distribution system point of view means that more electricity was taken from the transmission system than was purchased, while a positive deviation means that less electricity was taken from the transmission system than was purchased. from the producer's point of view, a negative deviation means that less electricity was supplied to the transmission system than was sold, and a positive deviation means that more electricity was supplied to the transmission system than was sold.
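the sign conventions above can be made concrete with a toy settlement calculation. the following snippet computes the hourly deviation of a distribution company and a simplified cash flow; the figures are placeholders and the cash-flow rule is a deliberate simplification of the actual ote settlement rules.

```python
def deviation_mwh(purchased, consumed):
    """negative = more electricity taken than purchased (shortfall)."""
    return purchased - consumed

# placeholder hourly data: (purchased, consumed) [mwh] and a deviation price [czk/mwh]
hours = [(100.0, 104.0, 900.0),
         (100.0, 97.0, -300.0)]
for purchased, consumed, price in hours:
    dev = deviation_mwh(purchased, consumed)
    # toy rule: the company pays |dev| * price; a negative price means
    # the company is paid, because its deviation helped the system
    cash_flow = -abs(dev) * price
    print(f"deviation = {dev:+5.1f} mwh, price = {price:+7.1f} czk/mwh "
          f"-> cash flow = {cash_flow:+9.1f} czk")
```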
the price of the deviations does not always have to be positive (the company has to pay), but can be negative as well (the company will be paid) in cases when the deviation of the company is opposite to the deviation of the whole system. this means that the deviation of the system is being improved by the deviation of the company. the following graphs show big changes in the deviation prices and in their distribution.

fig. 1: prices of negative deviations

3 determining the value of a connection

the principle is to determine the acquisition (revenues – costs) of one megawatt-hour obtained by the connection with the producer; only the covering of negative deviations will be calculated. for simplification, the selling price of one megawatt-hour was set as fixed and known in advance. the two main variables entering the equation are the price of the deviation and the price of the electricity from the producer (only a unary single tariff for electricity, i.e. payments for the energy only and not for the capacity, is assumed in this paper).

3.1 evaluation by standard methods

the principle is to calculate the difference between revenue and loss. assuming that we know or can easily predict the price of the supplied electricity, the only problem is to determine the price of the deviation. the first option is to predict the price as the average of the historical values (which can pertinently be raised by a trend):

$$P = \frac{1}{n}\sum_{t=1}^{n} c_t^0 - c_e \qquad (1)$$

where $c_t^0$ is the price of the deviation in a given period of time, $n$ is the number of time sections, and $c_e$ is the price of the supplied or non-supplied electricity. this approach is very simple, but also provides the most inaccurate results. the second option is to predict the price of the deviations as the median of the historical values:

$$P = c_m^0 - c_e \qquad (2)$$

where $c_m^0$ is the median of the price of the deviations. this approach is also very simple, and its results are very dependent on the distribution of the prices of the deviations.
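equations 1 and 2 translate directly into code. the sketch below evaluates both estimators on a synthetic history of deviation prices; the lognormal price series (standing in for the heavy-tailed, volatile series shown in the paper's figures) and the electricity price c_e are illustrative placeholders.

```python
import random
import statistics

random.seed(1)
# synthetic history of deviation prices [czk/mwh] with a heavy right tail
history = [random.lognormvariate(6.5, 0.9) for _ in range(8760)]
c_e = 1200.0   # assumed price of the electricity from the producer [czk/mwh]

p_avg = statistics.fmean(history) - c_e    # equation (1)
p_med = statistics.median(history) - c_e   # equation (2)
print(f"mean-based acquisition   p = {p_avg:8.1f} czk/mwh")
print(f"median-based acquisition p = {p_med:8.1f} czk/mwh")
```

with a skewed price distribution the two estimators differ considerably, which is exactly the sensitivity to the distribution noted above.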
3.2 evaluation by real options

the principle of the evaluation uses the option to switch, which is computed as the sum of two options. the first is a put option with the following parameters: the current price is the present value of the future cash flows with input #1, the strike price is the variable costs with input #1, the time to expiration is the lifetime of the project, and the volatility of the asset price is the volatility of the future cash flows with input #1. the second option is a call option with the same parameters as the put option, except that input #1 is changed to input #2. the black-scholes formulas for standard european call and put options have the following form:

$$c = S\,N(d_1) - X\,e^{-rT}\,N(d_2) \qquad (3)$$

$$p = -S\,N(-d_1) + X\,e^{-rT}\,N(-d_2) \qquad (4)$$

$$d_1 = \frac{\ln\left(\frac{S}{X}\right) + \left(r + \frac{\sigma^2}{2}\right)T}{\sigma\sqrt{T}} \qquad (5)$$

$$d_2 = d_1 - \sigma\sqrt{T} = \frac{\ln\left(\frac{S}{X}\right) + \left(r - \frac{\sigma^2}{2}\right)T}{\sigma\sqrt{T}} \qquad (6)$$

where $S$ is the current value of the asset, $X$ is the strike price of the asset, $N(\cdot)$ is the standard normal cumulative distribution function, $r$ is the risk-free interest rate, $T$ is the time to expiration of the option, and $\sigma$ is the volatility of the price of the asset. the formula for calculating the value of the option to switch then has the following form:

$$V = p + c \qquad (7)$$

and the formula for calculating the total value of a connection has the following form:

$$V_t = V + \frac{1}{n}\sum_{t=1}^{n} c_t^0 - c_e \qquad (8)$$

this approach is the most correct: it includes the great volatility and instability of the price of the deviations in the calculation and gives the best and most accurate results.

fig. 2: prices of positive deviations
fig. 3: distribution of deviations
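a minimal implementation of equations 3 to 8 is shown below, using only the standard library. all input values (present values, strikes, volatility, rate, and the producer price c_e) are illustrative placeholders; the paper's actual inputs for the pre case are not reproduced here, and the mean deviation price is borrowed from table 1 below.

```python
import math

def norm_cdf(x):
    """standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call_put(s, x, r, t, sigma):
    """black-scholes european call and put, equations (3)-(6)."""
    d1 = (math.log(s / x) + (r + 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    call = s * norm_cdf(d1) - x * math.exp(-r * t) * norm_cdf(d2)
    put = -s * norm_cdf(-d1) + x * math.exp(-r * t) * norm_cdf(-d2)
    return call, put

def switch_option_value(s1, x1, sig1, s2, x2, sig2, r, t):
    """equation (7): option to switch = put on input #1 + call on input #2."""
    _, put1 = bs_call_put(s1, x1, r, t, sig1)
    call2, _ = bs_call_put(s2, x2, r, t, sig2)
    return put1 + call2

# illustrative one-hour example (all numbers are placeholders)
v = switch_option_value(s1=1000.0, x1=950.0, sig1=0.8,
                        s2=1000.0, x2=1050.0, sig2=0.8, r=0.03, t=1.0)
mean_dev_price, c_e = 959.06, 1200.0   # mean from table 1 below; c_e assumed
print(f"option to switch v ~ {v:.1f} czk")
print(f"total value per mwh, eq. (8): {v + mean_dev_price - c_e:.1f} czk")
```

the high assumed volatility drives most of the option value, which is why the real options result can turn a negative classical acquisition into a positive total value.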
[3] eru (energy regulatory office), [online] www.eru.cz [4] ceps (company operating transmission system), [online] www.ceps.cz. [5] pre (prague distribution company), [online] www.pre.cz ing. ota drahovzal e-mail: drahovo@feld.cvut.cz department of economics, management and humanities czech technical university in prague faculty of electrical engineering technická 2 166 27 praha 6, czech republic 12 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 46 no. 4/2006 fig. 4: value of the connection average 959.06 median 1103.98 option 1918.12 table 1: numerical results ap08_3.vp 1 introduction to assess the performance of trading in electricity, it is advisable to measure the profitability of electricity trading position will be measured. the profitability of an electricity trading position expresses the realized profit in relation to the expression of the closed position values and the expected profit with respect to the market value of the open position. it consists of the following two basic components: � closed position profitability (“realized profit”) and � market value of the open position (“unrealized profit”). 2 definition of a trading position closed position profitability is calculated using the gm indicator equal to the difference between the sales revenues and the costs of acquisition and generation. open position profitability indicates what “profit might be expected from an open position, if it were closed today”. in other words, it is the current market valuation of contracts constituting the open position using the mark to market method. gm is to be used for the open position valuation and for the purposes of determining its profitability. a closed position consists of the closed historical position for the period from the beginning of the current year to the present date and the closed future position for the period from the next day within the selected future time window. the total value of gm from the closed position is calculated by adding up the closed historical positions and the closed future positions for a time window of one year. 3 closed historical position a closed historical position is a position for the period from the beginning of the current year to the present date, and is calculated as the sum of the concluded contract volumes per hour and day of the relevant period, while the following applies: qs qg qptd td td td td td � � �� � , (1) where qstd is the volume of sold electricity for hour t of day d [mwh], qgtd is the generated volume of electricity for hour t of day d [mwh] and qptd is the volume of electricity purchased for hour t of day d [mwh]. the gross margin is used for determining the profitability of a closed historical position. gm from the closed historical position is to be calculated as the difference between the sales revenues (multiplied by the sold volume) and the production costs (the weighted value of the production prices multiplied by the produced volume) and the purchase costs (the weighted value of the purchase price multiplied by the purchase volume). gm chp gm chp qs ps qg pg qp td td td td td td td td t _ _� � � � � � � � �� � d td td pp�� � , (2) where �pstd is the weighted value of the sales prices [kč], �pgtd is the weighted value of the production prices [kč], and �pptd is the weighted value of the purchased prices [kč]. 
4 closed future position a closed future position is a position for the period from the next day within the selected future time window, and is calculated as the sum of concluded contract volumes per hour © czech technical university publishing house http://ctn.cvut.cz/ap/ 35 acta polytechnica vol. 48 no. 3/2008 power producer production valuation m. kněžek the ongoing developments in the electricity market, in particular the establishment of the prague energy exchange (pxe) and the associated transfer from campaign-driven sale to continuous trading, represent a significant change for power companies. power producing companies can now optimize the sale of their production capacities with the objective of maximizing profit from wholesale electricity and supporting services. the trading departments measure the success rate of trading activities by the gross margin (gm), calculated by subtracting the realized sales prices from the realized purchase prices and the production cost, and indicate the profit & loss (p&l) to be subsequently calculated by the control department. the risk management process is set up on the basis of a business strategy defining the volumes of electricity that have to be sold one year and one month before the commencement of delivery. at the same time, this process defines the volume of electricity to remain available for spot trading (trading limits). keywords: closed position, difference from plan fulfillment, forward price curve, future position, gross margin, historical position, internal regulation, mark to market, open position, power producer, planned production, profit & loss, risk exposure, risk management committee, trading limits. 1 2 3 4 5 6 7 8 9 10 11 12 open future position closed future position closed historical position 1 2 3 4 5 6 7 8 9 10 11 12 gross margin open future position closed future position closed historical position production plan fig. 1: an overview of the trading positions of a power producer and day of the relevant time window, while the following applies: qs qpg qptd td td td td td � � �� � , (3) where qstd is the volume of sold electricity for hour t of day d [mwh], qpgtd is the planned electricity generation for hour t of day d [mwh], qptd is the volume of electricity purchased for hour t of day d [mwh]. the gross margin is used for determining the profitability of a closed future position. gm from the closed future position is to be calculated as the difference between the sales revenues (multiplied by the sold volume) and the planned production costs (the planned value of the production prices multiplied by the produced volume) and the purchase costs (the weighted value of the purchase price multiplied by the purchase volume). gm cfp gm cfp qs ps qpg ppg qp td td td td td td td td _ _� � � � � � � � �� td td td pp�� � , (4) where �pstd is the weighted value of the sales prices [kč], ppgtd is the planned value of the production prices [kč] and �pptd is the weighted value of the purchase prices [kč]. 
5 open future position an open future position is a position for the period from the next day within the selected future time window and is calculated as the difference between the planned volume of sold electricity, the planned volume of generated electricity and the planned volume of purchased electricity per hour and day of the relevant time window, while the following applies: qps qpg qpptd td td td td td � � �� � , (5) where qpstd is the planned volume of sold electricity for hour t of day d [mwh], qpgtd is the planned volume of generated electricity for hour t of day d [mwh] and qpptd is the planned volume of electricity purchased for hour t of day d [mwh]. the gross margin is used for determining the profitability of an open future position. gm from the open position is to be calculated as the difference between the planned sales revenues (the planned sales volume multiplied by the planned sales price – forward price curve) and the planned production costs (the planned generation volume multiplied by the planned production cost) and the purchase costs (the planned purchased volume multiplied by the relevant planned purchase price). gm ofp gm ofp qps pps qpg ppg td td td td td td td td td _ _� � � � � � � � � �� qpp ppptd td td , (6) where ppstd is the planned value of the sales prices [kč], ppgtd is the planned value of the production prices [kč] and ppptd is the planned value of the purchased prices [kč]. for the purposes of an open position mark-to-market valuation, the open position per hour has to be observed in order to be valued by the hourly forward price curve, which differentiates the peak and off-peak prices for sales and purchase. 6 total gross margin and difference from plan fulfillment the total gm value is calculated as the sum of the gross margins of the closed historical and future position and open future position, as follows: gm gm chp gm cfp gm ofp� � �_ _ _ . (7) the total difference from plan fulfillment will be given as the sum of the actual historical variance and the expected future variance in a one-year time window: � � �� �� �act futtd td td td . (8) the actual difference from plan fulfillment represents the profit/loss realized from plan fulfillment for the period from the beginning of the current year to the present date. it is to be calculated as the difference between the planned value of the gm of the closed historical position and the actual value of the gm of the closed historical position to the present date: �act gm chppl gm chpacttd � �_ _ . (9) the expected difference from plan fulfillment represents the projected unrealized profit/loss from plan fulfillment for the period from the next day within the selected future time window. it will be calculated as the difference between the planned value of the gm of the future position and the sum of the actual value of the gm of the closed future position and the gm of the open future position to the present date: �fut gm fppl gm cfpact gm ofpacttd � � �_ _ _ . (10) 7 risk exposure of open position power producers, unlike pure electricity traders (involved in pure electricity purchase/sale and managing their position based on trading only), provide their sold electricity primarily by controlled generation, with purchases being of a complementary nature only. 
to avoid the risk exposure of an open position, their internal regulation states that: � producers are only allowed to purchase electricity for a price that is lower than the cost price by a certain coefficient � producers are allowed to sell electricity for a price that is higher than the cost price by a certain coefficient � producers are allowed to sell electricity up to the volume that they are able to generate. for the above reasons, power producers are usually not exposed to risks implied by the fluctuating market price of electricity from an open position and, therefore, the calculation of value at risk has no rationale. 8 conclusion power producers are exposed to energy market risk through trading in power and power related products as well as all components associated with electricity production. the purpose of this paper is to manage power producers’ trading activities and their exposure to energy market risk, in particular the risks associated with the volatile and unpredictable 36 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 48 no. 3/2008 markets in electricity, ancillary services and co2 and the associated transactions of over-the-counter and exchange-traded derivative contracts, both to hedge their position and to optimize their position. acknowledgments the research described in this paper was supervised by prof. ing. o. starý, csc., fee ctu in prague and also by doc. ing. jaroslav knápek, csc., fee ctu in prague. references [1] kněžek, m.: managing electricity market risk. modern trends of industrial enterprises management conference, department of economics, management and humanities, czech technical university in prague. [2] kněžek, m.: electricity market risk management. poster 2007, dept. of economics, management and humanities, czech technical university in prague. [3] kněžek, m.: risk management in the electricity market. diploma thesis, 2007, dept. of economics, management and humanities, czech technical university in prague. [4] kněžek, m.: electricity risk management. student’s scientific and technical project, 2006, dept. of economics, management and humanities, czech technical university in prague. [5] kněžek, m.: financial derivatives in power engineering. semester project, 2006, dept. of economics, management and humanities, czech technical university in prague. [6] briys, e., mai, h., bellalah, m., varenne, f.: options, futures and exotic derivatives. wiley 1998. isbn 0417969095 [7] holton, g. a.: value at risk. elsevier academic press 2003. isbn 0123540100. [8] leppard, s.: energy risk management. risk books 2005. isbn 1904339743. [9] mcclave, j. t., benson, p. g., sincich, t.: statistics for business and economics (7th edition), prentice hall 1998. isbn 0139505458. [10] wengler, j.: managing energy risk. pennwell books 2001. isbn 0878147942. ing. marian kněžek e-mail: knezem1@fel.cvut.cz department of economics, management and humanities czech technical university faculty of electrical engineering technická 2 166 27 praha, czech republic © czech technical university publishing house http://ctn.cvut.cz/ap/ 37 acta polytechnica vol. 48 no. 3/2008 acta polytechnica acta polytechnica 62(1):1–2, 2022 © 2022 the author(s). 
licensed under a cc-by 4.0 licence published by the czech technical university in prague preface to the special issue of acta polytechnica “analytic and algebraic methods in physics” andrii khrabustovskii, miloslav znojil in september 2021, the xviiith continuation of the series of the international, mathematically oriented conferences “analytic and algebraic methods in physics” (aamp) had to be organized, for well-known reasons, online. fortunately, every cloud has a silver lining: the related reduction of the capacity of the scientific communication channels led to the willingness of the participants to return to the recently almost abandoned tradition of complementing the zoom-mediated meeting by a subsequent preparation of an aamp-oriented special issue (si) of acta polytechnica. the main purpose of this si is twofold. firstly, it is intended to offer, in written form, a sufficiently representative sample of what has been presented online. this means that in the form of the standard refereed papers, the readers of this si will be rewarded by the up-to-the-minute information about the current state of art. secondly, in an ambition which reaches behind the meeting itself, the contributing authors felt motivated by the idea that a compact and comprehensible presentation of their results might find a broader readership among people who would not normally participate in the conference but who could still find at least some of the presented results potentially relevant for their own field of research. in comparison with the aamp meeting itself (where the separate subjects covered by 36 talks have been subdivided into 12 sections), a minor disadvantage of our present si lies, from the point of view of its readers at least, in the (traditional) alphabetical ordering of the contributions by their first authors. fortunately, interested readers might get more info about the subdivisions of the subjects via the webpage of the conference [1]. another weakness of the si collection could be seen, mainly by the 75 aamp participants themselves, in an incomplete coverage of the talks. indeed, roughly one third of them was not eligible for our si because the material was based on the recently published papers. again, the related complementary information is available via the aamp homepage [1]. this being said, the readers of this si are expected to make their own selection of the consumption out of the menu. all of the papers belong to the aamp framework, but even such a restriction admitted the inclusion of a broad spectrum of subfields, which are all bridging the gaps between the existing abstract mathematical structures (ranging from our understanding of ordinary differential equations up to the applications of the various forms of symmetries, antilinear symmetries, supersymmetries and nonlinearities) and their possible practical implementations (ranging again from multiple elementary models and methodical considerations up to certain fairly complicated phenomenological questions as encountered, say, in the relativistic quantum field theory). in the aamp context, we could speak about the tradition of the search for a deeper understanding of the connection between mathematics and physics. this led, in 2007, to the formulation of the project and to the organization of the series of the dedicated international conferences. at that time, indeed, the analytic and algebraic methods were particularly actively developed by the founding fathers from the nuclear physics institute of the cas in řež. 
in this sense, the mathematical side of the bridge to physics has been (and, in fact, it is still being) restricted to the analytic and algebraic methods. in parallel, the physics side of the same bridge proved quickly growing with time. at present, its scope covers so many parts of physics that even the originally tacitly assumed specification “quantum physics” would and could be considered over-restrictive. one can only conclude that the interaction between mathematics and physics remains enormously productive. we believe that our si will contribute to this productivity, counteracting the extent of damages caused to the scientific world by the coronavirus. one of its most damaging effects was, indeed, the interruption of many regular series of international conferences, of which the series “analytic and algebraic methods in physics” (aamp), regularly taking place in prague every year, is just one of many examples. in fact, the original hopes that the interruption might only last one year were not fulfilled. equally disappointing proved to be our slow but definite empirical discovery that the success and efficiency of the transformation of these conferences into virtual meetings (mediated, say, by zoom) remains limited. what was saved was only a form, not the full contents; not the essence. we all revealed that there exists no real substitute for the face-to-face meetings, converting the hours of isolated research performed by individuals into an exchange of ideas and providing a platform for their critical re-evaluation. creating a genuine living science which can acquire its final, collective and truly creative character only after multiple informal debates and only after multiple active personal interactions. for all of these reasons, the organizers of the aamp series came to the conclusion that one of the possible reactions to the unpleasant current circumstances would be an enrichment of the internet-mediated standard 1 https://creativecommons.org/licenses/by/4.0/ https://www.cvut.cz/en andrii khrabustovskii, miloslav znojil acta polytechnica form of the meetings (in which one listens to talks for a few days, without having a real opportunity of discussing the subjects in the couloirs) via a return to an apparently obsolete practice of a subsequent preparation and publication of at least some of the talks in their written, more lasting and better accessible form, better suitable for the subsequent critical re-evaluation. in this special issue of acta polytechnica, the readers will have the opportunity of seeing and, perhaps, appreciating the result. surprisingly, many speakers decided to contribute. for us, this is a proof that the production of special issues characterized by a well-defined and not-too-broad range of subjects still makes sense. on behalf of organizers, the guest editors of the special issue, andrii khrabustovskii and miloslav znojil, university of hradec králové references [1] http://www.ujf.cas.cz/en/departments/department-of-theoretical-physics/events/conferencies/aamp/ index.html 2 http://www.ujf.cas.cz/en/departments/department-of-theoretical-physics/events/conferencies/aamp/index.html http://www.ujf.cas.cz/en/departments/department-of-theoretical-physics/events/conferencies/aamp/index.html acta polytechnica 62(1):1–2, 2022 references acta polytechnica https://doi.org/10.14311/ap.2021.61.0762 acta polytechnica 61(6):762–767, 2021 © 2021 the author(s). 
licensed under a cc-by 4.0 licence published by the czech technical university in prague cavitation wear of eurofer 97, cr18ni10ti and 42hnm alloys hanna rostovaa, ∗, victor voyevodina, b, ruslan vasilenkoa, igor kolodiya, vladimir kovalenkoa, vladimir marinina, valeriy zuyoka, alexander kuprina a national science center kharkiv institute of physics and technology, institute of solid state physics, materials science and technologies nas of ukraine, 1 akademichna str., 61108 kharkiv, ukraine b v.n. karazin national university, physics and technology faculty, department of reactor materials and physical technologies, 4 svobody sq., 61022 kharkiv, ukraine ∗ corresponding author: rostova@kipt.kharkov.ua abstract. the microstructure, hardness and cavitation wear of eurofer 97, cr18ni10ti and 42hnm have been investigated. it was revealed that the cavitation resistance of the 42hnm alloy is by an order of magnitude higher than that of the cr18ni10ti steel and 16 times higher than that of the eurofer 97 steel. alloy 42hnm has the highest microhardness (249 kg/mm2) of all the investigated materials, which explains its high cavitation resistance. the microhardness values of the cr18ni10ti steel and the eurofer 97 were 196.2 kg/mm2 and 207.2 kg/mm2, respectively. the rate of cavitation wear of the austenitic steel cr18ni10ti is 2.6 times lower than that of the martensitic eurofer 97. keywords: cavitation erosion, wear, steel, hardness, structure, resistance. 1. introduction realization of ambitious programs of development and construction of nuclear power plants of a new generation (gen iv, terra power wave reactor etc.) will be possible only after solutions of problems of nuclear material science are found. promising materials for future generations of reactors, in addition to high radiation and corrosion resistance, high mechanical characteristics, should also have an increased cavitation resistance to the coolant (supercritical water or liquid metals) [1]. among the main promising materials for future generations of reactors, the ferrite-martensitic steel eurofer 97 and the cr-ni-mo alloy 42hnm stand out. the eurofer 97 is a european reference material within the framework of the european fusion development agreement (efda)–structural materials. in europe, eurofer97 has been recognized as a prospective material [2] for first walls, divertors, blanket and vessels of fast breeder reactors [3–8]. one of the main reason for its selection are the high mechanical properties at service temperatures coupled with the low or reduced activation characteristic under radiation with the result of a very low loss of mechanical properties of the eurofer 97 steel. this material behaviour has been reported in many studies and important initiatives are still ongoing [9–11]. nickel superalloys are selected for their usage in nuclear reactor core systems [12–17], in particular, nuclear power plants with a molten salt coolant [18– 20] and advanced ultra super critical (ausc) power plants [21, 22]. it is due to their advantage over the austenitic steels in terms of radiation and corrosion resistance (including molten salts) [12, 13] at relatively low neutron irradiation temperatures. in particular, the 42hnm alloy is considered as a candidate material for accident-resistant fuel (atf) claddings [23]. in a moving fluid flow under certain hydrodynamic conditions, the continuity of the flow is disrupted, and cavities, caverns and bubbles are formed, which then collapse [24]. 
this phenomenon, occurring in the liquid flow, causes a cavitation erosion of the material [25]. depending on the intensity of the cavitation and the time of exposure, the destruction of the metal surface can be fractions of a square millimetre, and sometimes even several square meters. the depth of the destruction of materials and products made from them is also different – up to a complete destruction. cavitation erosion can carry away an amount of metal no lesser than corrosion; hence, the importance of studies on cavitation resistance, which will reduce metal losses and increase the durability and reliability of parts and devices, is obvious. it is known that the cavitation resistance of a material is determined by its composition and structure [26]. in this regard, in this work, we studied the cavitation wear of promising reactor materials with different crystal structures – eurofer 97 and 42hnm. the cr18ni10ti steel, widely used in nuclear power engineering, was chosen for a comparison. 2. materials and methods of investigation the chemical composition of the materials under study (wt. %): eurofer 97 (c – 0.11, w – 1.4, mn – 0.6, v – 0.25, cr – 9.7, ta – 0.3, fe – balanced), 42hnm (cr 762 https://doi.org/10.14311/ap.2021.61.0762 https://creativecommons.org/licenses/by/4.0/ https://www.cvut.cz/en vol. 61 no. 6/2021 cavitation wear of eurofer 97, cr18ni10ti and . . . (a). (b). (c). (d). (e). (f). figure 1. microstructure of investigated materials: eurofer 97 – (a, b); cr18ni10ti – (c, d); 42hnm – (e, f). optical microscopy – (a, c, e) and sem – (b, d, f) images. – 42, mo – 1.4, ni – balanced) and cr18ni10ti (cr –18.7, mn – 1.1, ni – 10.5, ti – 0.6, fe – balanced). the size of the samples was 20 × 20 × 0.5 mm. the investigated materials are reactor-grade and were studied in the initial state; heat treatment modes: eurofer97 – normalization (980 °c/27’) plus tempering (760 °c/90’/air-cooled), cr18ni10ti – water quenching at 1050 °c, 42hnm – austenisation at 1130 °c. microstructural studies were carried out on metallographic inversion microscope olympus gx51 and on scanning electron microscope jeol 7001-f. the specimens for metallographic studies were preliminary encapsulated into bakelite and then grinded on sic paper (graininess from p120 to p1200) and polished on diamond suspensions with a fraction size of 1 and 0.05 µm. etching of all samples was carried out on a tenupol 5 setup with a reagent of 88 % ethanol + 6 % perchloric acid + 6 % glycerol at a voltage of 39 v at a room temperature. the xrd analysis was performed on dron-2.0 xray diffractometer in cobalt co-kα radiation, using fe selectively-absorbing filter. the diffracted radiation was detected by a scintillation detector. microhardness of the materials was measured on a lm 700 at tester with a vickers diamond indenter at a load of 2 n, with a holding time – 14 s. studies of the cavitation wear of the samples were carried out on a facility described in detail in the work [27, 28]. the cavitation zone was created by ultrasonic waves under the end face of the concentrator installed in a vessel with distilled water. the oscillation amplitude of the end face of the concentrator was 30 ± 2 µm at a frequency of 19.5 khz [29]. the sample was mounted at a distance of 0.50 mm from the concentrator surface. the erosion of the samples was measured gravimetrically with an accuracy of ± 0.015 mg. 
the dependence of the weight loss on the time of exposure to the cavitation was measured, and from these data, kinetic curves of destruction of the samples were plotted. the average cavitation wear rate of the materials was determined in the quasilinear sections of the cavitation wear rate curves. 3. results and discussion the general view of the microstructure of the materials is shown in fig. 1. the initial structure of eurofer 97 is tempered martensite, with prior austenite grain boundaries presence, with an average size of 6 µm (carbon mainly in m23c6 and mx precipitates). the microstructure of cr18ni10ti steel is austenitic with the presence of twins with an average grain size of 7.5 µm. 42hnm 763 h. rostova, v. voyevodin, r. vasilenko et al. acta polytechnica figure 2. diffraction patterns of the investigated samples: a) eurofer 97; b) cr18ni10ti; c) 42hnm. (a). (b). figure 3. cavitation wear mass loss (a) and cavitation wear rate (b) for eurofer 97, cr18ni10ti and 42hnm alloys. alloy has fcc structure and an average grain size ∼ 25 µm. diffraction studies have shown that all samples are single-phase, the diffraction lines in the diffractograms are narrow (fig. 2), meaning that the samples are in a coarse-crystalline state (grain size ≥ 1 µm). the sample eurofer 97 consists of fe-α ferrite/martensite with a lattice parameter a = 2.8726 å. the line intensity distribution corresponds to the (110) texture. the cr18ni10ti steel consists of fe-γ austenite with a lattice parameter a = 3.5894 å. the intensity distribution of the austenite lines corresponds to the (220) texture. the 42hnm alloy is also singlephase and consists of an fcc phase (solid solution based on nickel and chromium) with a lattice parameter a = 3.5903 å. in the diffractogram of the sample, the intensity of the lines (200) and (220) are overestimated, which indicates a more complex texture as compared to the previous samples. the results of the cavitation erosion experiments are shown in the form of curves of a sample mass loss depending on the test time (fig. 3a) and curves of the rate of cavitation erosion (fig. 3b). from the obtained data, it can be seen that the 42hnm alloy has the highest resistance to cavitation wear of the studied materials, and the eurofer 97 steel has the lowest one (fig. 3a). the cavitation wear rate curves are characterized by the presence of an initial section, when the destruction is low, so-called incubation period, and a section with a maximum quasi-constant rate. the cavitation wear rate becomes constant after 3 hours of testing in the case of the investigated materials (fig. 3b). mechanical properties and structural characteristics of the investigated materials are given in table 1. alloy 42hnm has the highest microhardness of the investigated materials, which explains its high cavitation resistance. despite the close values of microhardness, the rate of cavitation wear of the austenitic steel cr18ni10ti is 2.6 times lower than that of the eurofer 97 (table 1). scanning electron microscopy (sem) was used to observe the eroded surfaces of the samples after the 764 vol. 61 no. 6/2021 cavitation wear of eurofer 97, cr18ni10ti and . . . alloy characteristics crystal structure d, µm a, å hv , kg/mm2 vc, cm3/min eurofer 97 bct 6.0 2.8726 207.2±6.0 2.6 × 10−6 cr18ni10ti fcc 7.5 3.5894 196.2±6.1 1 × 10−6 42hnm fcc 25.0 3.5903 249.0±8.5 1.6 × 10−7 table 1. crystal structure, average grain size d, lattice parameter a, microhardness hv , cavitation wear rate vc of investigated materials. (a). (b). 
(c). (d). (e). (f). figure 4. sem images of the eroded surfaces for the investigated materials: eurofer 97 – (a, b); cr18ni10ti – (c, d) and 42hnm – (e, f) under different magnification (a, c, e – 500; b, d, f – 3500). 765 h. rostova, v. voyevodin, r. vasilenko et al. acta polytechnica 4 hours of the cavitation tests. it was found that the morphologies of the eroded surfaces are different to each other in the case of eurofer 97, cr18ni10ti and 42hnm alloy (fig. 4). it was found that the cavitation damage for steel samples eurofer 97 and cr18ni10ti is similar in shape and is characterized by the formation of craters, pits, cracks and protruding steps on the surface of the samples. however, the difference in the degree of deformation and the size of defects is obvious for the two materials under study. the surface of eurofer 97 steel is covered with large craters and many deep pits (fig. 4a, 4b). the dimensions of craters and cracks are ∼ 10 µm. in the case of the cr18ni10ti steel, pits and cracks were significantly smaller in size (fig. 4c, 4d). the defect sizes are at the level of ∼ 5 µm. the comparison of the sem images clearly indicates a significantly smoother surface of the 42hnm alloy (fig. 4e) as compared to the steel samples. in addition, the large craters observed for eurofer 97 and cr18ni10ti were not found in the case of 42hnm (fig. 4f). the presence of small pits and cracks (< 5 µm) for the 42hnm alloy can be associated with its high workhardening characteristics as well as high corrosion resistance. it is known that nickel alloys with chromium and molybdenum are highly resistant to cavitation wear [30]. usually, the resistance to cavitation erosion of martensitic and austenitic stainless steels is higher than that of ferritic stainless steels [31]. the excellent erosion resistance of martensitic stainless steels can be attributed to the uniform strain distribution and shorter effective average free martensite laths [32]. in this case, the low cavitation resistance of eurofer 97 steel can be caused by the presence of ferrite in the steel structure. the use of various thermomechanical treatments can significantly improve the mechanical properties of eurofer 97 steel [33]. however, the effect of such treatments on the cavitation resistance of this steel requires further research. 4. conclusions the present work investigated the cavitation resistance of materials with different crystal structures: eurofer 97 (bct) and cr18ni10ti, 42hnm (fcc). it was shown that the cavitation wear rate in distilled water for the 42hnm alloy is 1.6×10−7 cm3/min, ∼ 1 × 10−6 cm3/min for cr18ni10ti and 2.6 × 10−6 cm3/min for eurofer 97. it was found that after the cavitation tests, the morphology of eroded surfaces differs from each other for the alloy eurofer 97, cr18ni10ti and 42hnm and is in good agreement with the rate of cavitation wear. the surface of eurofer 97 steel is covered with large craters and a large number of deep pits; for cr18ni10ti steel, the size of these defects is 2 times smaller. 42hnm alloy has the smallest size of erosion defects. further studies are required to determine the effect of various thermomechanical treatments on the structure and cavitation resistance of the eurofer 97 steel. acknowledgements this work was prepared within the project № 6541230, implemented with the financial support of the national academy of science of ukraine. references [1] m. lee, y. kim, y. oh, et al. study on the cavitation erosion behavior of hardfacing alloys for nuclear power industry. 
wear 255(1-6):157–161, 2003. https://doi.org/10.1016/s0043-1648(03)00144-3. [2] m. rieth, b. dafferner, h. d. rohrig, c. wassilew. charpy impact properties of martensitic 10.6 % cr steel (manet-i) before and after neutron exposure. fusion engineering and design 29:365–370, 1995. https://doi.org/10.1016/0920-3796(95)80043-w. [3] k. d. zilnyk, v. b. oliveira, h. r. z. sandim, et al. martensitic transformation in eurofer-97 and ods-eurofer steels: a comparative study. journal of nuclear materials 462:360–367, 2015. https://doi.org/10.1016/j.jnucmat.2014.12.112. [4] a. möslang. ifmif: the intense neutron source to qualify materials for fusion reactors. comptes rendus physique 9(3-4):457–468, 2008. https://doi.org/10.1016/j.crhy.2007.10.018. [5] s. j. zinkle, g. s. was. materials challenges in nuclear energy. acta materialia 61(3):735–758, 2013. https://doi.org/10.1016/j.actamat.2012.11.004. [6] y. guerin, g. s. was, s. j. zinkle. materials challenges for advanced nuclear energy systems. mrs bulettin 34(1):10–19, 2009. https://doi.org/10.1017/s0883769400100028. [7] g. h. marcus. innovative nuclear energy systems and the future of nuclear power. progress in nuclear energy 50(2-6):92–96, 2008. https://doi.org/10.1016/j.pnucene.2007.10.009. [8] k. ehrlich, j. konys, l. heikinheimo. materials for high performance light water reactors. journal of nuclear materials 327(2-3):140–147, 2004. https://doi.org/10.1016/j.jnucmat.2004.01.020. [9] r. l. klueh, d. r. harries. high-chromium ferritic and martensitic steels for nuclear applications. west conshohocken, pa: astm international, 2001. https://doi.org/10.1520/mono3-eb. [10] l. tan, d. t. hoelzer, j. t. busby, et al. microstructure control for high strength 9cr ferritic-martensitic steels. journal of nuclear materials 422(1-3):45–50, 2012. https://doi.org/10.1016/j.jnucmat.2011.12.011. [11] l. tan, x. ren, t. r. allen. corrosion behaviour of 9-12 % cr ferritic-martensitic steels in supercritical water. corrosion science 52(4):1520–1528, 2010. https://doi.org/10.1016/j.corsci.2009.12.032. [12] m. i. solonin, a. b. alekseev, s. a. averin, et al. cr-ni alloys for fusion reactors. journal of nuclear materials 258-263(2):1762–1766, 1998. https://doi.org/10.1016/s0022-3115(98)00406-1. 766 https://doi.org/10.1016/s0043-1648(03)00144-3 https://doi.org/10.1016/0920-3796(95)80043-w https://doi.org/10.1016/j.jnucmat.2014.12.112 https://doi.org/10.1016/j.crhy.2007.10.018 https://doi.org/10.1016/j.actamat.2012.11.004 https://doi.org/10.1017/s0883769400100028 https://doi.org/10.1016/j.pnucene.2007.10.009 https://doi.org/10.1016/j.jnucmat.2004.01.020 https://doi.org/10.1520/mono3-eb https://doi.org/10.1016/j.jnucmat.2011.12.011 https://doi.org/10.1016/j.corsci.2009.12.032 https://doi.org/10.1016/s0022-3115(98)00406-1 vol. 61 no. 6/2021 cavitation wear of eurofer 97, cr18ni10ti and . . . [13] a. v. vatulin, v. p. kondrat’ev, v. n. rechitskii, m. i. solonin. corrosion and radiation resistance of “bochvaloy” nickel-chromium alloy. metal science and heat treatment 46(11-12):469–473, 2004. https://doi.org/10.1007/s11041-005-0004-8. [14] a. f. rowcliffe, l. k. mansur, d. t. hoelzer, r. k. nanstad. perspectives on radiation effects in nickel-base alloys for applications in advanced reactors. journal of nuclear materials 392(2):341–352, 2009. https://doi.org/10.1016/j.jnucmat.2009.03.023. [15] m. stopher. the effects of neutron radiation on nickel-based alloys. materials science and technology 33(5):518–536, 2017. https://doi.org/10.1080/02670836.2016.1187334. [16] m. i. 
solonin, a. b. alekseev, y. i. kazennov, et al. xhm-1 alloy as a promising structural material for water-cooled fusion reactor components. journal of nuclear materials 233-237(1):586–591, 1996. https://doi.org/10.1016/s0022-3115(96)00297-8. [17] m. i. solonin. radiation-resistant alloys of the nickel-chromium system. metal science and heat treatment 47(7-8):328–332, 2005. https://doi.org/10.1007/s11041-005-0074-7. [18] m. de los reyes, l. edwards, m. a. kirk, et al. microstructural evolution of an ion irradiated ni-mo-cr-fe alloy at elevated temperatures. materials transactions 55(3):428–433, 2014. https://doi.org/10.2320/matertrans.md201311. [19] c. le brun. molten salts and nuclear energy production. journal of nuclear materials 360(1):1–5, 2007. https://doi.org/10.1016/j.jnucmat.2006.08.017. [20] s. delpech, c. cabet, c. slim, g. s. picard. molten fluorides for nuclear applications. materialstoday 13(12):34–41, 2010. https://doi.org/10.1016/s1369-7021(10)70222-4. [21] a. h. v. pavan, r. l. narayan, m. swamy, et al. stress rupture embrittlement in cast ni-based superalloy 625. materials science and engineering: a 793(139811), 2020. article number 139811, https://doi.org/10.1016/j.msea.2020.139811. [22] a. h. v. pavan, r. l. narayan, k. singh, u. ramamurty. effect of ageing on microstructure, mechanical properties and creep behavior of alloy 740h. metallurgical and materials transactions a 51:5169–5179, 2020. https://doi.org/10.1007/s11661-020-05951-6. [23] b. gurovich, a. frolov, d. maltsev, et al. phase transformations in irradiated 42crnimo alloy after annealing at elevated temperatures, and also after rapid annealing, simulating the maximum design basis accident. in xi conference on reactor materials science, pp. 30–33. ao gnts niiar, dimitrovgrad, 2019. [24] b. sreedhar, s. albert, a. pandit. cavitation damage: theory and measurements – a review. wear 372-373:177–196, 2017. https://doi.org/10.1016/j.wear.2016.12.009. [25] r. h. richman, w. p. mcnaughton. a metallurgical approach to improved cavitation-erosion resistance. journal of materials engineering and performance 6(5):633–641, 1997. https://doi.org/10.1007/s11665-997-0057-5. [26] d. e. zakrzewska, a. k. krella. cavitation erosion resistance influence of material properties. advances in materials science 19(4):18–34, 2019. https://doi.org/10.2478/adms-2019-0019. [27] v. i. kovalenko, v. g. marinin. research of fracture of doped titanium alloys under cavitation (in russian). eastern-european journal of enterprise technologies 6(11):4–8, 2015. https://doi.org/10.15587/1729-4061.2015.54118. [28] v. g. marinin, v. i. kovalenko, n. s. lomino, et al. cavitation erosion of ti coatings produced by the vacuum arc method. in proceedings isdeiv. 19th international symposium on discharges and electrical insulation in vacuum (cat. no.00ch37041), vol. 2, pp. 567–570. ieee, xi’an, china, 2000. https://doi.org/10.1109/deiv.2000.879052. [29] astm g32-16, standard test method for cavitation erosion using vibratory apparatus, 2016. https://doi.org/10.1520/g0032-16. [30] h. g. feller, y. kharrazi. cavitation erosion of metals and alloys. wear 93(3):249–260, 1984. https://doi.org/10.1016/0043-1648(84)90199-6. [31] s. hattori, r. ishikura. revision of cavitation erosion database and analysis of stainless steel data. wear 268(1-2):109–116, 2010. https://doi.org/10.1016/j.wear.2009.07.005. [32] w. liu, y. g. zheng, c. s. liu, et al. cavitation erosion behavior of cr-mn-n stainless steels in comparison with 0cr13ni5mo stainless steel. 
wear 254(7-8):713–722, 2003. https://doi.org/10.1016/s0043-1648(03)00128-5. [33] j. hoffmann, m. rieth, l. commin, et al. improvement of reduced activation 9 %cr steels by ausforming. nuclear materials and energy 6:12–17, 2016. https://doi.org/10.1016/j.nme.2015.12.001. 767 https://doi.org/10.1007/s11041-005-0004-8 https://doi.org/10.1016/j.jnucmat.2009.03.023 https://doi.org/10.1080/02670836.2016.1187334 https://doi.org/10.1016/s0022-3115(96)00297-8 https://doi.org/10.1007/s11041-005-0074-7 https://doi.org/10.2320/matertrans.md201311 https://doi.org/10.1016/j.jnucmat.2006.08.017 https://doi.org/10.1016/s1369-7021(10)70222-4 https://doi.org/10.1016/j.msea.2020.139811 https://doi.org/10.1007/s11661-020-05951-6 https://doi.org/10.1016/j.wear.2016.12.009 https://doi.org/10.1007/s11665-997-0057-5 https://doi.org/10.2478/adms-2019-0019 https://doi.org/10.15587/1729-4061.2015.54118 https://doi.org/10.1109/deiv.2000.879052 https://doi.org/10.1520/g0032-16 https://doi.org/10.1016/0043-1648(84)90199-6 https://doi.org/10.1016/j.wear.2009.07.005 https://doi.org/10.1016/s0043-1648(03)00128-5 https://doi.org/10.1016/j.nme.2015.12.001 acta polytechnica 61(6):762–767, 2021 1 introduction 2 materials and methods of investigation 3 results and discussion 4 conclusions acknowledgements references ap06_6.vp 1 introduction the development of digital imaging arrays such as charge coupled devices (ccds) leads to large amounts of data. a single 2048×2048 ccd frame with 16 bit/pixel, for example, generates 8 mbytes of data. such large format ccds are currently used when focusing telescopes, and in many other fields. scanning astronomical schmidt plates with automatic plate scanning machines provides a tremendous quantity of data. a single schmidt plate produces about 2 gbytes of digital information at 10ěm resolution. these images are not only stored but are also currently transmitted over networks for remote observing, remote browsing or data processing. a particular solution to the transfer problem can be found with high bandwidth transmission links, which are expensive to install and operate. moreover the problem of image archiving is partially solved with optical disks or very high-density tapes, which do not allow easy access for remote users. a solution can be found by using both data compression and high digital density supports such as cd-roms [1]. many techniques have been developed for compressing astronomical images. a review of these techniques can be found in [2, 3]. different applications require different data rates for compressed images and different visual qualities for decompressed images. in some applications, when browsing is required or transmission bandwidth is limited, progressive transmission is used to send images in such a way that a low quality version of the image is transmitted first at a low data rate [4]. gradually, additional information is transmitted to progressively refine the image. a progressive image transmission technique allows a full-sized image to be a visualized at any moment during the receiving time. this transmission method is more efficient than a non-progressive image transmission scheme, where every element of the transmitted data contains information about only a small piece of the image, and it is displayed by rows or by columns. 
progressive image transmission is also a desirable feature because it provides the capability to interrupt transmission when the quality of the image has reached an acceptable level or when the user decides that the received image is not interesting. similarly, the user at the receiver site can make a decision based on a rough reproduction of the image and can interact with the remote device to obtain a new image, or to recover a higher quality (or even an exact) replica of only a part of the image. a specific coding strategy known as embedded rate scalable coding is well suited for progressive transmission. in embedded coding, all the compressed data is embedded in a single bit stream. the decompression algorithm starts from the beginning of the bit stream and can terminate at any data rate. a decompressed image at that data rate can then be reconstructed. in embedded coding, any visual quality requirement can be fulfilled by transmitting the truncated portion of the bit stream. to achieve the best performance the bits that convey the most important information need to be embedded at the beginning of the compressed bit stream [5]. rpsws is an embedded rate scalable wavelet-based image coder. it produces a fully embedded bit stream and encodes the image to exactly the desired bit rate (precise rate control). this paper is investigates the applicability and the reliability of the rpsws coder for coding astronomical images. rpsws is well suited for progressive transmission of astronomical images. the remainder of this paper is organized as follows: section 2 presents a brief description of the rpsws coder. section 3 discusses experimental results for various rates and for various astronomical images. conclusions are presented in section 4. 2 rpsws algorithm description rpsws is an embedded rate scalable wavelet-based image coder which can be used to code grayscale images [6] as well as color images [7] in an embedded fashion. for completeness sake, a brief description of rpsws is presented below. 2.1 general structure of rpsws the rpsws algorithm uses the discrete wavelet transform to decompose the original image into sub-bands, where the sub-bands are logarithmically spaced in frequency and represent octave-band decomposition. each sub-band is encoded and decoded separately in a predefined order known by the encoder and the decoder. the sub-bands are scanned from the lowest frequency to the highest frequency sub-band, as shown in fig. 1. for an n-scale transform, the scan begins at the lowest frequency sub-band, denoted as lln, and scans sub-bands hln, lhn, and hhn, at which point it moves on to scale n�1, etc. © czech technical university publishing house http://ctn.cvut.cz/ap/ 25 acta polytechnica vol. 46 no. 6/2006 embedded coding of astronomical images f. i. y. elnagahy, a. a. haroon, y. a. azzam, a. el-bassuny alawy,h. k. elminir, b. šimák recursive partitioning of significant wavelet sub-bands (rpsws) is an embedded rate scalable wavelet-based image coding algorithm. the rpsws coder produces a fully embedded bit stream and encodes the image to exactly the desired bit rate (precise rate control). this paper investigates the applicability and the reliability of the rpsws coder for coding astronomical images. various types of astronomical images were compressed at different bit-rates using the rpsws coder. the results were compared to the set partitioning in hierarchal trees coder (spiht), a well known embedded image corder. 
the comparison of the performance of the two algorithms was based on quantitative rate-distortion evaluations and subjective assessments of image quality. keywords: image compression, color rpsws, wavelet sub-band partitioning, embedded coding, spiht, astronomical images. in order to perform embedded coding in general, multiple cycles (iterations) through the sub-bands are adopted. to reduce distortion of the decompressed image, the wavelet coefficients with the largest magnitude should be first encoded and refined to an additional bit of precision. to achieve this goal of distortion reduction, an initial threshold must first be determined. in the first cycle, the sub-bands are scanned as described above, and the wavelet coefficients that have magnitudes larger than the threshold are coded. in the next cycle, the threshold is halved and the coefficients that were coded in the previous cycle are refined to an additional bit of precision. meanwhile, the transformed coefficients that have magnitudes larger than the threshold and have not been coded in the previous cycle are coded. this process is repeated until the desired bit rate is achieved. in each cycle, two passes are used to encode/decode and refine the wavelet coefficients of each sub-band. a significance map pass is used to code the position, the sign, and/or the initial reconstructed magnitude value of the significant wavelet coefficients. in the refinement pass some refinement bits are generated to refine the magnitude of the previously encoded coefficients. 2.2 rpsws code stream a wavelet coefficient that has an absolute value above or equal to the threshold is called a significant coefficient; otherwise it is an insignificant coefficient. an insignificant wavelet coefficient is coded with one bit “0” while a significant wavelet coefficient is coded with two bits. the first bit “1” indicates the significance of the wavelet coefficient and the second bit indicates its sign (“0” for positive and “1” for negative coefficient). a sub-band that has no significant wavelet coefficient is called an insignificant sub-band. as the insignificant sub-band is entirely coded with one bit, a large number of insignificant coefficients are coded with one symbol (one bit). coding a large number of insignificant wavelet coefficients with one bit is the main idea of the proposed coding algorithm. the encoder sends one bit to inform the decoder about the significance of the sub-band. this bit is called the sub-band significance flag bit (ssf) which is equal to zero for insignificance sub-band ssf. when the decoder receives ssf � 0 it knows that the sub-band is insignificant and hence all coefficients in that band are insignificant. the decoder sets all reconstructed coefficient values in that sub-band to zero. in other words all wavelet coefficients in that band are quantized to zero. for a significant sub-band, the encoder outputs ssf � 1 and the decoder knows that this sub-band contains significant coefficient(s). in this case both the encoder and the decoder partition the sub-band into four blocks and the significance test is then applied to each block. an insignificant block is also coded with one bit “0”. this bit is called the block significance flag bit (bsf), where the significant block is coded with bsf � 1 and this block is again sub-divided into four new blocks. this block division process continues until the last block contains four coefficients or less, depending on the dimensions of the sub-band. 
the encoder creates a list to store information about the significant wavelet coefficients. this list is called the coefficients significance list (csl). each time the encoder finds a significant wavelet coefficient it appends its coefficient value to the coefficients significance list. the encoder then sets the value of the significant wavelet coefficient to zero in the wavelet coefficients matrix. the coefficients significance list is used during the magnitude refinement pass. the sub-band significance map coding process mentioned above is implemented in two steps. the first step is called sub-band or-tree significance map assignment and the second is sub-band or-tree significance map coding. 2.3 rpsws color image coding the general scheme that will be used to code color images in an embedded fashion is described as follows: the rgb color space is first converted to yuv color space. color space conversion is used to reduce the correlation between the color components. the discrete wavelet transform is then applied to individual yuv components. the transformed wavelet coefficients of each component are then coded with the rpsws coding algorithm. color space conversion and the coding process are described below. 2.3.1 color space conversion the rpsws coding system accepts the input image in 24-bit rgb color space. each pixel is represented by the three bytes (red (r), blue (b), and green (g)). a conversion to the yuv color space [8] is performed to reduce the correlation between the color components as follows: y � 0.3 r�0.6 g�0.1b u � b�y v � r�y where y is the luminance component, and u and v are two chrominance components. the luminance component provides a grayscale version of the image, while the two chrominance components give additional information that converts the gray image to a color image. the yuv representation is more natural for image and video compression. the corresponding inverse transform is as follows: r � y�v g � y�v/2�u/6 b � y�u 26 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 46 no. 6/2006 fig. 1: sub-bands scanning order 2.3.2 color components coding the discrete wavelet transform is applied to individual yuv components. the resulting wavelet coefficients for y, u, and v components are respectively yc, uc, and vc. in order to perform embedded color image coding, an initial threshold t is first determined. the initial threshold t is chosen so that t � 2i, where i is computed in such a way that 2i�1>x>2i and x is the maximum absolute value of the yc-component wavelet coefficients. the significance pass information of the yc-component (sub-band by sub-band) is encoded, followed by magnitude refinement pass information of the all yc-component sub-bands. in the same manner, the uc-component and the vc-component are encoded. the threshold t is halved and the process continues until the desired bit rate is achieved. 3 experimental results various types of astronomical images were used to evaluate the performance of the spiht [9] and rpsws coders. the images used in this study had been selected from the hubble space telescope archive (http://heritage.stsci.edu/gallery/galindex.html). in the selection we included a variety of types of objects e.g., galaxy, nebula, open cluster, globular cluster and planets. 
the color images (512×512– 24bit/pixel) used in the presented results are cluster (ngc 6397) which is a globular cluster, bright, very large, galaxy (m51, whirlpool galaxy) which is a magnificent (or otherwise interesting) object and great spiral nebula and mars, as shown on the left of fig. 2, respectively. the gray scale images (512×512 – 8bit/pixel) are cluster (m103), which is an open cluster, quite © czech technical university publishing house http://ctn.cvut.cz/ap/ 27 acta polytechnica vol. 46 no. 6/2006 fig. 2: original 512×512 test images: color images at 24 bpp (left): cluster (ngc 6397), galaxy (m51), and mars images, respectively. grayscale images at 8 bpp (right) cluster (m103), galaxy (m101), and nebula (b33) images, respectively large, bright, round and rich in stars of magnitude 10 to 11, galaxy (m101), which is quite bright, very large, irregularly round, gradually and a very abruptly much brighter middle bright small nucleus and nebula (b33, horsehead nebula), which is a very faint and dark nebula, as shown on the right of fig. 2, respectively. the spiht color results were obtained by applying the spiht software (decdcolr/codecolr) to the color images and using the adaptive arithmetic coding algorithm of [10] with several adaptive models. the spiht gray scale results were obtained by applying the spiht software (fastcode/fastdecd) to the gray scale images, and these algorithms do not use an adaptive arithmetic coding algorithm. all spiht algorithms were obtained from (http://www.cipr.rpi.edu/research/spiht/spiht3 .html). in our results, the wavelet transform is implemented with 6-level pyramids constructed with 9/7-tap filters of [11] and using a reflection extension at the image edges. the resulting bit stream of the rpsws was not entropy coded. table 1 shows the objective coding results (psnr in db) of the spiht and rpsws algorithms for different color images and bit rates. table 2 shows the results of the gray scale images. some reconstructed images for both algorithms at different bit rates are shown in fig. 3, 4, 5, 6, 7, and 8. from out results we can conclude that: � the spiht objective color coding results (psnr) are superior to higher than the rpsws results, for all test images and at all bit rates, as shown in table 1. this increase in performance is expected, and it is mainly due to the use of arithmetic coding in the spiht algorithm. the subjective quality tests showed that there are no significance differences between the reconstructed images of the two algorithms at higher bit rates. at lower bit rates, the quality of our reconstructed images for cluster (ngc 6397) and galaxy (m51) is better than the spiht reconstructed images. as shown in fig. 3, bright stars are more defined and have more details for rpsws than in the case of spiht. the fine details of the arms of the galaxy (m51) are more defined for rpsws than in the case of spiht, as shown in fig. 4. for the mars reconstructed images shown in fig. 5, the details of mars are better in the case of spiht than in the case of rpsws at lower bit rates. � the rpsws objective gray scale coding results (psnr) are superior to the spiht results, for most test images and at most bit rates, as shown in table 2. the subjective quality tests show that rpsws reconstructed images are better than spiht reconstructed images at lower bit rates, as shown in fig. 6, 7, and 8. at higher bit rates there are no significance differences in the visual qualities of the spiht and rpsws reconstructed images. 
4 conclusion

this paper presents objective coding results of both spiht and rpsws for coding color as well as gray scale astronomical images. from the results presented here we conclude that the quality of the rpsws reconstructed images is better than that of the spiht reconstructed images at lower bit rates, for both color and gray scale images. at higher bit rates there are no significant differences in the visual qualities of the spiht and rpsws reconstructed images. both spiht and rpsws are embedded rate-scalable wavelet-based image coders, and they are well suited for astronomical images that are to be progressively transmitted over networks for remote observing, remote browsing and data processing.

fig. 3: reconstructed cluster (ngc 6397) images at bit rates 0.03125, 0.25, and 1 bits/pixel, respectively. spiht (left) and rpsws (right)
fig. 4: reconstructed galaxy (m51) images at bit rates 0.03125, 0.25, and 1 bits/pixel, respectively. spiht (left) and rpsws (right)
fig. 5: reconstructed mars images at bit rates 0.03125, 0.25, and 1 bits/pixel, respectively. spiht (left) and rpsws (right)
fig. 6: reconstructed cluster (m103) images at bit rates 0.03125, 0.25, and 1 bits/pixel, respectively. spiht (left) and rpsws (right)
fig. 7: reconstructed galaxy (m101) images at bit rates 0.03125, 0.25, and 1 bits/pixel, respectively: spiht (left) and rpsws (right)
fig. 8: reconstructed nebula (b33) images at bit rates 0.03125, 0.25, and 1 bits/pixel, respectively: spiht (left) and rpsws (right)

table 1: spiht and rpsws psnr color results, measured in db, for various images and bit rates

bit rate    compression   cluster (ngc 6397)   galaxy (m51)      mars
                          spiht    rpsws       spiht    rpsws    spiht    rpsws
1           24:1          29.25    28.07       35.97    35.32    46.94    46.53
0.5         48:1          24.75    24.10       32.51    31.95    43.37    42.94
0.25        96:1          21.54    21.04       29.91    29.15    39.99    39.26
0.125       192:1         19.28    18.62       27.59    27.17    36.15    35.71
0.0625      384:1         17.30    17.13       25.94    25.61    32.19    32.38
0.03125     768:1         16.40    16.24       24.47    24.35    28.78    28.39
0.015625    1536:1        15.62    15.62       23.25    23.09    25.59    25.38
0.0078125   3072:1        15.18    15.15       22.40    22.11    23.89    23.38

table 2: spiht and rpsws psnr gray scale results, measured in db, for various images and bit rates

bit rate    compression   cluster (m103)       galaxy (m101)     nebula (b33)
                          spiht    rpsws       spiht    rpsws    spiht    rpsws
1           8:1           31.79    31.75       39.53    39.52    40.79    40.70
0.5         16:1          27.59    27.82       36.10    36.05    37.39    37.41
0.25        32:1          24.16    24.38       33.83    33.67    35.31    35.21
0.125       64:1          21.53    21.69       31.85    31.83    33.97    33.95
0.0625      128:1         19.70    19.83       29.90    29.87    32.36    32.41
0.03125     256:1         18.56    18.70       28.22    28.28    30.68    30.90
0.015625    512:1         17.70    17.84       26.74    26.88    29.74    29.89
0.0078125   1024:1        17.32    17.36       25.73    25.84    28.82    28.96

references

[1] bobichon, y., bijaoui, a.: a regularized image restoration algorithm for lossy compression in astronomy. experimental astronomy, vol. 7 (1997), p. 239–255.
[2] elnagahy, f., šimák, b.: wavelet-based astronomical digital image compression. proceedings of workshop 2004, prague: ctu in prague, ctu reports, part a, special issue, 8, march 2004, p. 208–209.
[3] louys, m., starck, j. l., mei, s., bonnarel, f., murtagh, f.: astronomical image compression. astronomy and astrophysics supplement series, vol. 136 (1999), p. 579–590.
[4] rabbani, m., jones, p. w.: digital image compression techniques. bellingham, washington: spie opt. eng. press, 1991.
[5] shapiro, j. m.: embedded image coding using zerotrees of wavelet coefficients. ieee transactions on signal processing, vol. 41 (1993), no. 12, p. 3445–3462.
[6] elnagahy, f., šimák, b.: embedded rate scalable wavelet-based image coding algorithm with rpsws. international journal of wscg, vol. 12 (2004), no. 1, p. 97–104.
[7] elnagahy, f.: embedded rate scalable wavelet-based color image coding. in proceedings of the 8th international student conference on electrical engineering, poster 2004, may 2004, prague, p. ic12.
[8] westwater, r., furht, b.: real-time video compression: techniques and algorithms. kluwer academic publishers, 1997.
[9] said, a., pearlman, w. a.: a new, fast, and efficient image codec based on set partitioning in hierarchical trees. ieee transactions on circuits and systems for video technology, vol. 6 (1996), p. 243–250.
[10] witten, i. h., neal, r. m., cleary, j. g.: arithmetic coding for data compression. communications of the acm, vol. 30 (1987), no. 6, p. 520–540.
[11] antonini, m., barlaud, m., mathieu, p., daubechies, i.: image coding using wavelet transform. ieee transactions on image processing, vol. 1 (1992), no. 2, p. 205–220.

dr. eng. farag ibrahim younis elnagahy, e-mail: faragelnagahy@hotmail.com
dr. aly a. haroon, e-mail: alyharoon@hotmail.com
dr. eng. yosry ahmed azam, e-mail: yosryahmed@hotmail.com
assistant prof. ahmed el-bassuny alawy, e-mail: abalawy@hotmail.com
astronomy department
dr. eng. hamdy k. elminir, e-mail: hamdy_elminir@hotmail.com
solar and space research department
national research institute of astronomy and geophysics (nriag), 11421 helwan, cairo, egypt
doc. ing. boris šimák, csc., e-mail: k332@fel.cvut.cz
department of telecommunications engineering, czech technical university in prague, faculty of electrical engineering, technická 2, 166 27 praha 6, czech republic

microwave drying of textile materials and optimization of a resonant applicator

m. pourová, j. vrba

the principal aim of this work was to design and optimize an applicator for microwave drying. our applicator is derived from the fabry-perot resonator, which is an open type of resonator. the whole system works at a frequency of 2.45 ghz, and the magnetron that we used delivers a power of 800 w. this machine is intended for use in drying in the factory production of fabrics. after identifying the basic arrangement of the microwave drying machine, the next step in the design was the use of an electromagnetic field simulator. we determined the position of the magnetron, and in this way we found the distribution of the electric field strength in the drying textile. in parallel, we analyzed the drying system with analytical calculations. we created a diagram of the em waves inside this structure and reached the resulting expression for calculating the strength of the electric field in the plane of the drying textile. this quantity depends on the electrical characteristics of the wet textile, e.g. the permittivity and the loss factor. measurement of these dielectric properties for the coburg fabric is complicated, and this method makes it possible to solve our problem with the dielectric parameters. we present sar distribution results (by simulation and also by measurement) and results of measurements of the moisture content in the dried textile with respect to time. these results are important for subsequent optimization of the efficiency of the whole machine.

keywords: microwave drying, open resonator, drying of textile, sar distribution.

notation
sar: specific absorption rate
$\Gamma_1$: reflection coefficient of the metallic plate
$\Gamma_2$: reflection coefficient of the textile
$\tau_2^2 = 1 - \Gamma_2^2$: transmission factor
$e^{-\alpha_{tex} t}$: absorption in the textile
$\alpha_{tex}$: attenuation factor of the textile
t: thickness of the textile
$\beta$: phase constant of free space
$\delta$: distance between the reflective plate and the textile
$\varepsilon_{tex}$: relative permittivity
$\operatorname{tg}\delta_{tex}$: loss factor

1 introduction

microwave heating is a very promising technology which has been finding new applications in industry; it can replace conventional heating. we describe our new microwave industrial applicator used for drying textiles in manufacturing. in this drying process, a very thin layer of textile material does not have a very well defined position in the applicator. in addition, the complex permittivity of the dried textile is not constant during the procedure: its value changes in time with the decreasing moisture content [1]. we designed and analyzed an open-resonator applicator. in this paper we present the theoretical model, the results of em field and sar numerical modeling, and an experimental evaluation. the prototype of the microwave drying machine, which works at a frequency of 2.45 ghz and is built from 17 cells, is also described.

2 design of the applicator

our new microwave industrial applicator for drying textiles is based on the principle of the fabry-perot resonator, which is an open type of resonator.
this applicator has a magnetron, placed in a waveguide holder, as the source of high electromagnetic power. the power of the magnetron is 800 w and its working frequency is 2.45 ghz. the dried textile material is located in the middle plane between the parallel conductive plates, and the distance between these plates is equal to $(3/2)\lambda$ [2] (a numerical check of this value is given below). fig. 1 shows the scheme of this applicator.

fig. 1: scheme of the open-resonator type applicator

in the first part of the design we used a simulator of the electromagnetic field. we determined the position of the magnetron, and in this way we found the distribution of the electric field strength in the drying textile. fig. 2a shows the impedance matching of the system obtained by simulation and by measurement. the resonant frequency changes if we simulate the applicator with the textile (fig. 2b); however, the applicator can be tuned back to resonance by changing the position of the parallel plates. fig. 3a shows the simulated distribution of the electric field strength in the applicator for a textile with relative permittivity 8 and loss factor 0.566. the thickness of the textile used here is 0.3 mm and its weight is 210 g/m2. fig. 4a shows the sar characteristic calculated with the simulation software. figs. 3b and 4b show examples of the drying experiments. the two processes are different, but they give very comparable results.

fig. 2: impedance matching of the applicator
fig. 3: distribution of electric field strength (a) and an example of a drying experiment (b)
fig. 4: sar distribution in textile (a) and an example of a drying experiment (b)

3 optimization of the resonant applicator

our drying resonant system is optimized by the criterion of creating the maximum electric field strength in the plane of the drying textile.
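returning briefly to the geometry of section 2: the plate spacing follows directly from the working frequency. a short python check of the $(3/2)\lambda$ rule at 2.45 ghz (the numerical values are ours, computed from the stated frequency, not quoted from the paper):

```python
c = 299_792_458.0               # speed of light in vacuum, m/s
f = 2.45e9                      # magnetron working frequency, hz
wavelength = c / f              # free-space wavelength, ~122.4 mm
spacing = 1.5 * wavelength      # plate distance (3/2)*lambda, ~183.5 mm
print(f"lambda = {wavelength * 1e3:.1f} mm, plate spacing = {spacing * 1e3:.1f} mm")
```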
we can describe this structure by means of an oriented graph, and we can also create a diagram of the em waves inside it (fig. 5). by modifying the diagrams on the right side of fig. 5, we arrive at the resulting expression for calculating the e-field strength in the textile plane:

$$E(\delta, \Gamma_2, \alpha_{tex}) = E_0\,(1+\Gamma_2)\, e^{-j\beta\delta} \sum_{n=0}^{\infty} \left( \Gamma_1 \Gamma_2\, \tau_2^2\, e^{-2\alpha_{tex} t}\, e^{-j 2\beta\delta} \right)^{n} . \qquad (1)$$

the parameters $\Gamma_2$ and $\alpha_{tex}$ are given by the dielectric properties of the textile, so by summing the geometric series we can write the electric field strength as a function of the relative permittivity $\varepsilon_{tex}$ and the loss factor $\operatorname{tg}\delta_{tex}$:

$$E(\delta, \varepsilon_{tex}, \operatorname{tg}\delta_{tex}) = \frac{E_0\,(1+\Gamma_2)\, e^{-j\beta\delta}}{1 - \Gamma_1 \Gamma_2\,(1-\Gamma_2^2)\, e^{-2\alpha_{tex} t}\, e^{-j 2\beta\delta}} . \qquad (2)$$

fig. 5: tools for optimizing the microwave dryer

some examples of the application of this equation are given in fig. 6, which shows the electric field strength with respect to the distance $\delta$ and the relative permittivity $\varepsilon_{tex}$ (tg $\delta_{tex}$ = 0.566). as shown in fig. 6, with decreasing permittivity the electric field strength in the textile increases, and the position of its maximum does not change. this finding is very important: during the drying process the permittivity of the textile decreases, but due to the increase in the electric field strength the resulting efficiency of the drying process does not decrease so quickly.

fig. 6: dependence of electric field strength on distance $\delta$ and relative permittivity $\varepsilon_{tex}$ (tg $\delta_{tex}$ = 0.566), 3d (a) and 2d (b) display of the quantities

fig. 7 shows the dependence of the electric field strength on the distance $\delta$ and the loss factor tg $\delta_{tex}$ ($\varepsilon_{tex}$ is constant). we can see that due to tg $\delta_{tex}$ the electric field strength does not decrease significantly, but the position of its maximum value changes. during the drying process the quantity of water in the textile changes. we measured the complex permittivity of the textile with an agilent e4991a material analyzer (fig. 9). the relative permittivity and the loss factor decrease with decreasing water content, as can be seen in fig. 8.

fig. 7: dependence of electric field strength on distance $\delta$ and loss factor tg $\delta_{tex}$ ($\varepsilon_{tex}$ = 8)
fig. 8: relative permittivity (a) and loss factor (b) decreasing with frequency (horizontal axis) and with decreasing water content (set of parametric curves)
fig. 9: measurement system – the agilent e4991a material analyzer

the next part presents the analysis in which the measured values of the permittivity and the loss factor are used in the calculation. fig. 10 is a chart of the dependence of the electric field strength on the distance for the case of a resonator without a textile (high peaks) and with a sample of wet textile (attenuated peaks). fig. 11 shows a chart of the dependence of the electric field strength on the distance for four different moisture contents. we can see that the distance between the parallel plate and the textile changes in dependence on the moisture content.

fig. 10: electric field strength in dependence on distance $\delta$ for the case of a resonator without textile (high peaks) and with a sample of wet textile

4 realization of the applicator

for our first experiments we built a single workable cell (fig. 12), which was used for the first drying experiments (fig. 3b, fig. 4b) and for measurements of the drying curve. fig. 13 shows the results of measurements of the moisture content in the drying textile with respect to time.

fig. 11: electric field strength in dependence on distance $\delta$ and moisture content (set of parametric curves)
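a quick numerical sanity check of the multiple-reflection sum (1) against its closed form (2): because the round-trip factor has magnitude below one, the partial sums converge to the closed-form value. the python sketch below assumes the reconstructed forms of eqs. (1) and (2) as written here, together with illustrative parameter values of our own (ideal plate, normal-incidence reflection from $\varepsilon_{tex} = 8$); none of the numbers are taken from the paper's measurements.

```python
import numpy as np

E0 = 1.0
gamma1 = -1.0                                       # ideal metallic plate (assumption)
eps = 8.0                                           # relative permittivity of the wet textile
gamma2 = (1 - np.sqrt(eps)) / (1 + np.sqrt(eps))    # normal-incidence reflection (assumption)
tau2_sq = 1 - gamma2 ** 2                           # transmission factor
att = np.exp(-2 * 0.1)                              # e^(-2*alpha_tex*t), illustrative value
beta = 2 * np.pi * 2.45e9 / 299_792_458.0           # free-space phase constant at 2.45 GHz
delta = 0.06                                        # plate-textile distance, m (illustrative)

q = gamma1 * gamma2 * tau2_sq * att * np.exp(-2j * beta * delta)   # round-trip factor
series = E0 * (1 + gamma2) * np.exp(-1j * beta * delta) * sum(q ** n for n in range(200))
closed = E0 * (1 + gamma2) * np.exp(-1j * beta * delta) / (1 - q)
print(abs(series), abs(closed))                     # the two magnitudes agree, since |q| < 1
```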
based on theoretical considerations developed at the czech technical university in prague, a prototype of the microwave drying machine has been built in cooperation with the research institute of textile machines and the technical university in liberec, see fig. 14a. the inner microwave part of the system is shown in fig. 14b; the waveguide horn apertures of several cells can be seen here, together with the tubes used for withdrawing the moisture. the apparatus was created with 17 heating cells: a matrix of 6 cells in the first row, 5 cells in the second row, and again 6 cells in the third row. the magnetron in each of the 17 cells delivers 1 kw at 2.45 ghz [3].

5 conclusion

the novel results of our work are a description and a basic evaluation of a microwave open-resonant applicator for drying textile materials. an optimization of this applicator based on an analytical model is reported. both the whole system and a single workable cell operating at a frequency of 2.45 ghz have been optimized.

acknowledgment

this work has been supported by the czech grant agency under grant 102/03/h086 "novel approach and coordination of doctoral education in radioelectronics and related disciplines".

references
[1] vrba, j.: applications of microwave techniques. prague: ctu press, 2001 (in czech).
[2] stejskal, m., vrba, j., klepl, r.: microwave drier for fabrics. european patent number ep1319914 a2, 18-06-2003.
[3] vrba, j.: microwave drying of textile materials. prague: series of research reports, 2001–2004 (in czech).

ing. marika pourová, e-mail: pourovm@fel.cvut.cz
prof. ing. jan vrba, csc., e-mail: vrba@fel.cvut.cz
department of electromagnetic field, czech technical university in prague, faculty of electrical engineering, technická 2, 166 27 prague 6, czech republic

fig. 12: realization of a workable drying cell
fig. 13: typical drying characteristic of this type of applicator
fig. 14: prototype of the drying machine (a) and the inner microwave part of the system with the waveguide horn apertures (b)

logical structural models with multiplexors

j. bokr, v. jáneš

the paper deals with the use of multiplexors in designing logical structural models. the applications can preferably be used in designing morphology on fpga chips and other programmable structures. illustrative examples are included.

keywords: boolean function, artjuchov-shalyto extension, shannon extension, boolean function decomposition, multiplexor.

1 introduction

the procedure for the design of logical structural models with multiplexors might seem to be complete. it appears, however, that the artjuchov-shalyto extension of the boolean function, which models the performance of a multiplexor, leads to its mere "setting", and the generalised model of the multiplexor performance makes it possible to design structural models with multiplexors according to the disjoint decomposition of the given boolean function.

2 boolean function

let a boolean function $f: \{0,1\}^m \to \{0,1\}$, $(x_1, x_2, \dots, x_m) \mapsto y$, be given. if we denote the set $\{x_i\}_{i=1}^{m}$ of the arguments of f by the symbol x, we can write f(x) instead of f(x1, x2, …, xm). let us also write $f(x_i = \sigma_i)$ instead of $f(x_1, x_2, \dots, x_{i-1}, \sigma_i, x_{i+1}, \dots, x_m)$, where $\sigma_i \in \{0, 1\}$. we require the function f(x) to be minimal with respect to the number of arguments, i.e., not to contain fictive arguments; the argument $x_i$ is called fictive if $f(x_i = 0) = f(x_i = 1)$. the term hamming weight $w_h f$ of the function f denotes the value of the arithmetic expression

$$w_h f(x) = \sum_{(\sigma_1, \sigma_2, \dots, \sigma_m) \in \{0,1\}^m} f(\sigma_1, \sigma_2, \dots, \sigma_m).$$

let $x^{\sigma} = \bar{x}$ for $\sigma = 0$ and $x^{\sigma} = x$ for $\sigma = 1$; each boolean function f(x) can then be expressed by means of a canonic normal disjunctive formula – cndf f(x):

$$f(x) = \bigvee_{(\sigma_1, \sigma_2, \dots, \sigma_m) \in \{0,1\}^m} x_1^{\sigma_1} x_2^{\sigma_2} \cdots x_m^{\sigma_m}\, f(\sigma_1, \sigma_2, \dots, \sigma_m).$$
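the definitions above translate directly into a brute-force check over all $2^m$ input vectors. a small python sketch (the function and the helper names are illustrative, not from the paper):

```python
from itertools import product

def hamming_weight(f, m):
    # w_h(f): number of argument vectors on which f evaluates to 1
    return sum(f(*sigma) for sigma in product((0, 1), repeat=m))

def is_fictive(f, m, i):
    # x_i (0-based index i) is fictive iff f(x_i=0) == f(x_i=1) for all other arguments
    return all(
        f(*rest[:i], 0, *rest[i:]) == f(*rest[:i], 1, *rest[i:])
        for rest in product((0, 1), repeat=m - 1)
    )

f = lambda x1, x2, x3: (x1 & x2) | x3   # an illustrative function
print(hamming_weight(f, 3))             # 5
print(is_fictive(f, 3, 0))              # False: x1 is essential
```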
if $w_h f < 2^m/2$ or $w_h f > 2^m/2$, it is preferable to write down the cndf of f(x) or of $\bar{f}(x)$, respectively; if $w_h f = 2^m/2$, it is preferable to apply the artjuchov-shalyto extension of the function f(x) [1]:

$$f(x) = x_i \oplus \left( \bar{x}_i f(x_i{=}0) \vee x_i \bar{f}(x_i{=}1) \right) = \bar{x}_i \oplus \left( \bar{x}_i \bar{f}(x_i{=}0) \vee x_i f(x_i{=}1) \right),$$

the validity of which can easily be confirmed by substituting 0 or 1 for $x_i$.

let a dichotomy $\{x^1, x^0\}$ be given on the set x of arguments x1, x2, …, xm, without loss of generality such that $x^1 = \{x_1, x_2, \dots, x_n\}$ and $x^0 = \{x_{n+1}, x_{n+2}, \dots, x_m\}$, where n < m. let the composition

$$f(x) = \varphi\left[ \psi_1(x^1), \psi_2(x^1), \dots, \psi_k(x^1), x^0 \right]$$

be called a simple k-multiple (k < n) disjoint decomposition of the function f(x). the construction of the simple k-multiple disjoint decomposition of the function f(x) can easily be done by means of a decomposition map [2, 3].

3 multiplexor

the term multiplexor (mx) [2, 4, 5] denotes a logical object modeled both in a parametrical and in an algebraic way – see fig. 1, where $a_r$ (r = 1, 2, …, k) and $d_j$ (j = 0, 1, …, $2^k - 1$) are the adjustable address and data input ports, respectively:

$$y = MX\left[ \psi_1(x^1), \dots, \psi_k(x^1); g_0(x^0), g_1(x^0), \dots, g_{2^k-1}(x^0) \right] = \bigvee_{(\sigma_1, \dots, \sigma_k) \in \{0,1\}^k} \psi_1^{\sigma_1} \psi_2^{\sigma_2} \cdots \psi_k^{\sigma_k}\, g_j(x^0),$$

where $j = \sum_{r=1}^{k} \sigma_r 2^{k-r}$.

fig. 1: a) schematic diagram of a multiplexor, b) structural model of a multiplexor

4 multiplexor and the boolean function

let us provide the output port of mx with an element of anticoincidence such that $y' = y \oplus z$, where $z \in \{0, 1, x_i, \bar{x}_i\}$ – see fig. 2. let us design a multiplexor modelled by the function f(x):

$$y \oplus z = MX\left[ x_1, x_2, \dots, x_m; g_0, g_1, \dots, g_{2^m-1} \right] = \bigvee_{(\sigma_1, \dots, \sigma_m) \in \{0,1\}^m} x_1^{\sigma_1} x_2^{\sigma_2} \cdots x_m^{\sigma_m}\, f(\sigma_1, \sigma_2, \dots, \sigma_m).$$

hence $g_j = f(\sigma_1, \sigma_2, \dots, \sigma_m) \oplus z(\sigma_1, \dots, \sigma_m)$, where $j = \sum_{i=1}^{m} \sigma_i 2^{m-i}$.

fig. 2: multiplexor and the element of the sum modulo 2 – m2

if it is more suitable to construct y = cndf f(x) or y = cndf $\bar{f}(x)$ – see par. 2 – then z = 0 or z = 1 respectively, since $f(x) = y \oplus 0$ or $f(x) = y \oplus 1$. if one cannot decide whether to construct f(x) or $\bar{f}(x)$, then if $w_h f(x_1{=}0) < w_h f(x_1{=}1)$ we take $z = x_1$ and $y = \bar{x}_1 f(x_1{=}0) \vee x_1 \bar{f}(x_1{=}1)$, and if $w_h f(x_1{=}0) > w_h f(x_1{=}1)$ we take $z = \bar{x}_1$ and $y = \bar{x}_1 \bar{f}(x_1{=}0) \vee x_1 f(x_1{=}1)$.

example 1: construct a multiplexor realizing the function f(x1, x2, x3) = 0001 0111. since $w_h f = 4 = 2^3/2$ and $w_h f(x_1{=}0) = 1 < w_h f(x_1{=}1) = 3$, we obtain $z = x_1$, and since

$$f(x_1, x_2, x_3) = x_1 \oplus MX\left[ x_1, x_2, x_3; d_0, d_1, d_2, d_3, d_4, d_5, d_6, d_7 \right],$$

we obtain $d_0 = d_1 = d_2 = d_5 = d_6 = d_7 = 0$ and $d_3 = d_4 = 1$, i.e. $y = MX[x_1, x_2, x_3; 0, 0, 0, 1, 1, 0, 0, 0]$, which is certainly a simpler setting of mx than $d_0 = d_1 = d_2 = d_4 = 0$ and $d_3 = d_5 = d_6 = d_7 = 1$.
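example 1 can be verified mechanically: with the data settings found above, the mx output xored with z = x1 must reproduce f on all eight input vectors. a short python check, written by us for illustration:

```python
def mx(addr, data):
    # y = d_j, where j is the binary number formed by the address bits (a1 = msb)
    j = 0
    for a in addr:
        j = (j << 1) | a
    return data[j]

f = [0, 0, 0, 1, 0, 1, 1, 1]      # truth vector 0001 0111, index = (x1 x2 x3) as binary
d = [0, 0, 0, 1, 1, 0, 0, 0]      # data settings found in example 1
for j in range(8):
    x1, x2, x3 = (j >> 2) & 1, (j >> 1) & 1, j & 1
    assert mx((x1, x2, x3), d) ^ x1 == f[j]   # y xor z reproduces f, with z = x1
print("example 1 verified")
```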
5 multiplexor and the simple k-multiple disjoint decomposition

let $f(x_1, x_2, \dots, x_m) = \varphi[\psi_1(x^1), \psi_2(x^1), \dots, \psi_k(x^1), x^0]$, where $x^1 = \{x_1, x_2, \dots, x_n\}$ and $x^0 = \{x_{n+1}, x_{n+2}, \dots, x_m\}$, be a simple k-multiple (k < n) disjoint decomposition of the function $f(x_1, x_2, \dots, x_m)$. let $\psi_r = \psi_r(x^1)$ (r = 1, 2, …, k) with k ≤ n, and let us construct the shannon extension of the given function according to the arguments x1, x2, …, xn, without loss of generality:

$$f(x_1, \dots, x_m) = \bigvee_{(\sigma_1, \dots, \sigma_n) \in \{0,1\}^n} x_1^{\sigma_1} x_2^{\sigma_2} \cdots x_n^{\sigma_n}\, f(\sigma_1, \dots, \sigma_n, x_{n+1}, \dots, x_m),$$

and further let

$$f(x_1, \dots, x_m) = MX\left[ x_1, \dots, x_n; g_0(x^0), \dots, g_{2^n-1}(x^0) \right] = \bigvee_{(\sigma_1, \dots, \sigma_n)} x_1^{\sigma_1} \cdots x_n^{\sigma_n}\, g_j;$$

hence $g_j = f(\sigma_1, \dots, \sigma_n, x_{n+1}, \dots, x_m)$, where $j = \sum_{i=1}^{n} \sigma_i 2^{n-i}$. note that the selection of the arguments according to which the shannon extension of the given function is constructed depends completely on the view of the designer, and there is no reason to distinguish the extension qualitatively according to the "left-side" arguments x1, x2, …, xn or the "right-side" arguments $x_{n+1}, x_{n+2}, \dots, x_m$ of the function f, as stated in [1].

example 2: let the function y = Σ(5, 6, 7, 10, 11, 19, 21, 23, 26, 27, 30, 31) be given; design a structural model with mx according to the shannon extension of the given function both according to the arguments x1, x2, x3 and according to the arguments x4, x5, i.e., according to

$$y = MX[x_1, x_2, x_3; h_0, h_1, \dots, h_7] = MX[x_4, x_5; g_0, g_1, g_2, g_3].$$

thus, let us construct decomposition maps (fig. 3). hence

$$y = \bar{x}_1 \bar{x}_2 x_3 (x_4 \vee x_5) \vee \bar{x}_1 x_2 \bar{x}_3\, x_4 \vee x_1 \bar{x}_2 \bar{x}_3\, x_4 x_5 \vee x_1 \bar{x}_2 x_3\, x_5 \vee x_1 x_2 \bar{x}_3\, x_4 \vee x_1 x_2 x_3\, x_4,$$

as well as

$$y = \bar{x}_4 \bar{x}_5 \cdot 0 \vee \bar{x}_4 x_5\, (\bar{x}_2 x_3) \vee x_4 \bar{x}_5\, (\bar{x}_1 \bar{x}_2 x_3 \vee x_2 \bar{x}_3 \vee x_1 x_2) \vee x_4 x_5\, (x_1 \vee x_2 \bar{x}_3 \vee \bar{x}_2 x_3).$$

fig. 3: decomposition maps of the function from example 2

hence the structural models from fig. 4. note that in fig. 4b) a rom module is suggested, and in fig. 4c) the structure is realized only with multiplexer modules.

let a simple k-multiple disjoint decomposition $f(x_1, \dots, x_m) = \varphi[\psi_1(x^1), \psi_2(x^1), \dots, \psi_k(x^1), x^0]$ be given, where $\varphi$ will be termed the outer function and the functions $\psi_i$ (i = 1, 2, …, k) will be denoted the inner functions. and, further, let

$$f(x_1, \dots, x_m) = MX\left[ \psi_1(x^1), \psi_2(x^1), \dots, \psi_k(x^1); g_0, g_1, \dots, g_{2^k-1} \right];$$

hence $g_j = \varphi(\sigma_1, \sigma_2, \dots, \sigma_k, x^0)$, where $j = \sum_{r=1}^{k} \sigma_r 2^{k-r}$.

example 3: construct a structural model with mx according to the decomposition $y = \varphi[\psi_1(x_1, x_2, x_3), \psi_2(x_1, x_2, x_3), x_4, x_5]$ of the function y from example 2. according to the decomposition map (fig. 5) we obtain the inner functions $\psi_1(x_1, x_2, x_3)$ and $\psi_2(x_1, x_2, x_3)$; since $y = MX[\psi_1, \psi_2; g_0, g_1, g_2, g_3]$, we obtain the data settings $g_j(x_4, x_5)$ read off the map, and hence the structural model in fig. 6.
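the residual functions $h_j$ and $g_j$ in example 2 can be read off mechanically from the minterm list by fixing the address variables and tabulating the rest. a python sketch of this shannon-extension bookkeeping (our own helper, with x1 taken as the most significant bit of the minterm index):

```python
from itertools import product

minterms = {5, 6, 7, 10, 11, 19, 21, 23, 26, 27, 30, 31}   # the function y of example 2
f = lambda x1, x2, x3, x4, x5: int(
    x1 * 16 + x2 * 8 + x3 * 4 + x4 * 2 + x5 in minterms)

# shannon extension over x1, x2, x3: residue h_j(x4, x5) feeds data input d_j
for j, (s1, s2, s3) in enumerate(product((0, 1), repeat=3)):
    h = [f(s1, s2, s3, x4, x5) for x4, x5 in product((0, 1), repeat=2)]
    print(f"h{j}(x4,x5) = {h}")   # e.g. h1 = [0, 1, 1, 1], i.e. x4 OR x5
```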
fig. 4: structural model with mx from example 2
fig. 5: decomposition map of the function from example 3
fig. 6: structural model with mx prescribed by the decomposition from example 3

6 conclusions

the multiplexer appears to be a very helpful msi module. the design of structural models is sufficiently simple, and it is also suitable for the implementation of logical functions on chips provided with fpga or fpd.

references
[1] šalyto, a. a.: metody apparatnoj i programmnoj realizacii algoritmov. sankt-peterburg: nauka, 2000.
[2] bokr, j., jáneš, v.: logické systémy. praha: vydavatelství čvut, 1999.
[3] bokr, j.: sovmestnaja dekompozicija sistem bulevych funkcij. avtomatika i vyčislitelnaja technika, 2000, no. 2, p. 38–44.
[4] frištacký, n. et al.: logické systémy. bratislava/praha: alfa/sntl, 1986.
[5] liebig, h., thome, s.: logischer entwurf digitaler systeme. berlin: springer, 1996.

doc. ing. josef bokr, csc., e-mail: bokr@kiv.zcu.cz
department of information and computer science, university of west bohemia, faculty of applied sciences, universitní 22, 306 14 pilsen, czech republic
doc. ing. vlastimil jáneš, csc., e-mail: janes@fd.cvut.cz
department of control and telematics, czech technical university in prague, faculty of transportation sciences, konviktská 20, 110 00 prague 1, czech republic

indirect determination of the risk of transportation projects

o. pastor

this paper deals with the indirect determination of the risk of transportation projects. it focuses on cases when the project has certain characteristics that might lead to conclusions concerning the degree of risk. the author proposes a multi-criterion evaluation of the territory with respect to its time evolution in order to determine these characteristics. the proposed method enables a simple graphical representation of the achievement of the territorial evolution in each time section, and also in the entire period under consideration. the method presents an inverse characteristic of the risk of the project.

keywords: risk, indirect determination, transportation projects.

1 introduction

the financial risk of a project may be determined in numerical form, where the starting point is the determination of the probabilistic distribution of the criteria that evaluate the financial efficiency of the project. when the risk of the project is determined indirectly, unlike in direct (numerical) determination, the probabilistic distribution of the evaluating criteria does not have to be constructed; instead, a certain characterization of the project must be laid down, on the basis of which the degree of risk can be estimated indirectly.

generally, a project can be defined as a set of reciprocally adherent liaisons, actions and proceedings oriented towards achieving specific project objectives in a given area and within a certain period. from the point of view of the development of a certain territory (locality), the desired target is the gradual progress of the development in different time periods; globally expressed, the locality should improve from its initial state in each time period. the targets give each system a purpose and a direction of motion. a target is well formulated if it includes the interest that the project pursues and if the desired orientation of the progress of this interest is marked. a global formulation of the development of a certain locality is also taken into account. this development is viewed in terms of time as well as in terms of the effects and requirements of the assessed project throughout its life (the development period of the territory); thus, in a certain sense, there is a positive or negative influence on the environmental capacity of the locality. the area of the assessment of a project represents a rational distribution of the elements that are the object of the evaluation. for practical reasons these elements are regrouped into certain blocks – the above-mentioned targets of the project – in which savings of resources appear as a benefit and the expense of resources as a cost.
the basic blocks for assessing transportation projects can comprise:

direct demand of users: this block analyses the benefit of the implemented project for the user. the benefits may be quantified according to subjects (individuals, transporters), the areas in which the benefits emerge, or the type of journey (to work, for shopping). alternatively, the quantification can refer to the conditions of future traffic flow (security, extent of awareness of users). the analysed benefits should also consider the possibility that the implemented project will "attract" additional transportation flow (generated by the new project or diverted from another route). this involves the quantification of time savings, fuel savings, the reduction of traffic incidents, and the benefits arising from the higher quality of the transportation path.

direct and indirect demands of the transportation system: the investment costs and also the costs of the functional provision and the efficiency of the transportation path are dealt with in this block.

indirect external influences: this block contains external effects and the events that accompany them. these influences reach beyond the area of the infrastructure; above all, this refers to the impact on the environment, on economic activities in the affected area, and on recreation areas. we need to quantify air pollution, noise and vibrations, together with limitations on recreation functions and regional development.

2 model of multi-criterion assessment of variants

the set of targets must also be considered in relation to the assessed project and the criteria for the efficiency of the adopted decision in time. each project is characterized by its demands and impacts. if a global indicator of the evaluation of decision efficiency is to be constructed, it must be taken into account that this is a multidimensional category. the indicator then represents whether the selected project leads to an improvement or a deterioration of the locality. the starting point is the matrix of a multi-criterion assessment of the variants (a commonly used term in multi-criterion decision theory) – see table 1. the variables used in table 1 represent:

x (x1, x2, …, xm) the set of possible variants,
r (r1, r2, …, rn) the set of the common characteristics of the variants (criteria, aspects),
uj(xi) the evaluation of the i-th variant according to the j-th characteristic.

the process of variant assessment related to the selected set of criteria determines the preference ordering of the variants, i.e. a sequence of the variants by convenience. this process usually requires complementary information about the weight of each criterion.
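for illustration only, the simplest way to turn such a matrix and a set of criterion weights into a preference ordering is a weighted sum; the paper itself later relies on electre iv, which orders variants without explicit weights, so the python sketch below (with invented numbers) is merely a stand-in for the general idea:

```python
import numpy as np

# u_j(x_i): rows are variants x_i, columns are criteria r_j (illustrative values)
U = np.array([[0.7, 0.2, 0.9],
              [0.4, 0.8, 0.5],
              [0.9, 0.6, 0.1]])
w = np.array([0.5, 0.3, 0.2])        # assumed criterion weights, summing to 1

scores = U @ w                        # weighted-sum evaluation of each variant
order = np.argsort(-scores) + 1       # preference order of variants, best first
print(scores, order)                  # here: variant 3, then 1, then 2
```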
3 construction of a global expression for the development of a locality as a characteristic of the risk of the project

from the point of view of constructing a global expression for the development of a locality, the characteristics r1, r2, …, rn are assumed to be expert expressions of the evaluating aspects of exogenously submitted targets. in other words, the characteristics are aspects (indicators) that characterize the demands and effects resulting from the implementation of a certain variant of the project that influences the development of the given locality. let us attach to the term "project" a set of evaluating aspects of the locality r1, r2, …, rn. the impact of the efficiency of the implemented solution must be shown in the assessment of the global efficiency over the whole development period ro. the submitted periods (time sections) are indicated by 1, 2, …, t. the degree of fulfilment of each evaluating aspect is known for each of these periods and is expressed as x1(1), x2(1), …, xn(t), measured on a cardinal or an ordinal scale (some of the data may be missing); see table 2. the variable xj(i) in table 2 indicates the degree of fulfilment of the j-th evaluating criterion in the i-th period. from the value of xj(i) we may judge whether or not the development of the locality is successful in relation to the chosen project (in terms of improving the locality). this involves a multi-criterion assessment of variants, where the variants represent the time periods in evaluating the locality, and the characteristics constitute the evaluating aspects of the development of the locality.

table 1: multi-criterion assessment

variants    r1         r2         …    rj         …    rn
x1          u1(x1)     u2(x1)     …    uj(x1)     …    un(x1)
x2          u1(x2)     u2(x2)     …    uj(x2)     …    un(x2)
…
xm          u1(xm)     u2(xm)     …    uj(xm)     …    un(xm)

table 2: evaluation periods

period        r1        r2        …    rn
1st period    x1(1)     x2(1)     …    xn(1)
…
t-th period   x1(t)     x2(t)     …    xn(t)

the output of some methods of multi-criterion evaluation (electre iv) is an ordered list of variants, from the best to the worst. the order is represented graphically, and a global indicator of the development of the locality is obtained. four different courses of the order of the periods can be anticipated, as described below; from them we obtain a global (graphical) expression of the efficiency of the development of the locality influenced in time by the evaluated project.

a) monotonous growth in the efficiency of the development of a locality

for each pair of periods t1, t2 ∈ ro, t1 < t2, it holds that order(t1) ≥ order(t2): the last observed period is considered the best, and the worst period is the first in the progressive series. hence we can conclude that the observed project has a positive impact on the development of the locality (territory), and the risk is very low, arising from the positive influence on the environmental capacity of the locality (fig. 1).

fig. 1: monotonous growth in efficiency

the function of "efficient" development of a locality f(t), t = 1, 2, …, t, can in this and the other cases be expressed, for example, by newton's interpolation polynomial:

$$f(t) = y_1 + a_1(t-1) + a_2(t-1)(t-2) + a_3(t-1)(t-2)(t-3) + \dots + a_{T-1}(t-1)(t-2)\cdots(t-T+1),$$

where y1 is the order of the first period and the $a_i$, i = 1, 2, …, t−1, are calculated constants.
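the constants $a_i$ are the usual newton divided-difference coefficients for the nodes t = 1, 2, …, t, with the period orders as values. a small python sketch, using our own implementation of the standard scheme and an invented ranking:

```python
import numpy as np

def newton_coeffs(ts, ys):
    # divided-difference coefficients a_0 = y_1, a_1, a_2, ... of the newton form
    a = np.array(ys, dtype=float)
    ts = np.array(ts, dtype=float)
    for k in range(1, len(ts)):
        a[k:] = (a[k:] - a[k - 1:-1]) / (ts[k:] - ts[:-k])
    return a

def f(t, ts, a):
    # f(t) = a0 + a1 (t-1) + a2 (t-1)(t-2) + ...
    value, factor = 0.0, 1.0
    for tk, ak in zip(ts, a):
        value += ak * factor
        factor *= t - tk
    return value

orders = [4, 3, 1, 2]                   # illustrative ranking of four periods (1 = best)
ts = list(range(1, len(orders) + 1))
a = newton_coeffs(ts, orders)
print([f(t, ts, a) for t in ts])        # reproduces the orders at the nodes
```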
b) monotonous decrease in the efficiency of the development of a locality

for each pair of periods t1, t2 ∈ ro, t1 < t2, it holds that order(t1) ≤ order(t2): the first period is considered the best, and the worst period is the last in the sequence. hence a negative influence on the development of the locality can be assumed, and the project is characterized by a high risk level, arising from the negative influence on the environmental capacity of the locality (fig. 2).

c) extreme courses of efficiency in the development of a locality

if there is a minimum in the course, it refers to a situation of monotonous growth in the efficiency of the development of the locality after the extreme. if there is a maximum in the course, it refers to a situation of monotonous growth in efficiency until this extreme, followed by a monotonous decrease in the efficiency of the development of the locality after the extreme; it can be anticipated that the worsening situation will cause problems similar to b). similarly for the minimum (see fig. 3).

d) non-systematic course of the efficiency of the development of a locality

no concrete conclusion can be reached for this situation without a deep analysis and experiments with the set of evaluating criteria and a simulation of combinations of resolutions, since the locality can hold more than one extreme (fig. 4).

the courses that are obtained may serve as a starting point for an ex ante analysis of the deviations of the development of the locality from the impact targets simulated when the project was selected. the analysis should show how these deviations emerge, how the development would proceed without an intervention, how to correct the deviations, etc. the proposed method enables the user to express in a simple graphical way the "efficiency" of the development of the locality in each period and throughout the observed period, as an inverse expression of the risk of the project: as the efficiency grows, the risk decreases.

fig. 2: monotonous decrease in efficiency
fig. 3: extreme courses of efficiency

4 conclusions

the evaluating criteria characterize the demands in each period and the effects that will result from implementing a project that influences a given locality in the course of its development. the targets and evaluating criteria are expressed in the "language of the problems" from which a decision is to emerge. different problems (types of infrastructure problems) have different languages; however, the method is formally identical. in conclusion it should be pointed out that when the risk of a project is quantified, or when a characterization for the indirect determination of the risk of a project is constructed, we cannot speak of the real risk, but of the risk as it appears to the team working on the problem. the analysis on its own is not a technique that can compensate for defects in the input data and in the team's informed guesswork. nevertheless, applying our method together with a financial analysis brings transparency to the information concerning the project.
doc. dr. ing. otto pastor, csc., e-mail: pastor@fd.cvut.cz
department of logistic and transportation processes, czech technical university in prague, faculty of transportation science, horská 3, 128 03 prague 2, czech republic

fig. 4: non-systematic course of efficiency

acta polytechnica 61(5):624–632, 2021, https://doi.org/10.14311/ap.2021.61.0624
© 2021 the author(s). licensed under a cc-by 4.0 licence. published by the czech technical university in prague.

design development and study of an elastic sectional screw operating tool

roman hevko (a), sergii zalutskyi (a), ihor tkachenko (b), oleg lyashuk (c), oleksandra trokhaniak (d, *)

(a) ternopil ivan puluj national technical university, engineering mechanics and agricultural machines department, ruska str. 56, ternopil, 46001 ukraine
(b) ternopil ivan puluj national technical university, manufacturing engineering department, ruska str. 56, ternopil, 46001 ukraine
(c) ternopil ivan puluj national technical university, automobiles department, ruska str. 56, ternopil, 46001 ukraine
(d) national university of life and environmental sciences of ukraine, department of technical reliability, heroiv oborony str. 15, kyiv, 03040 ukraine
* corresponding author: sashaklendii@gmail.com

abstract. the article presents the results of the development of an elastic sectional screw operating tool and its production technique. the operating tool is designed with attached elastic sections that transport bulk materials of agricultural production with minimal damage to the material and minimal power consumption of the process. the article presents the constructed regression dependencies and response surfaces for the effects of the design, kinematic and technological parameters of the sectional screw operating tool on the power consumption and on the damage rate of the grain material in the process of its transportation. as a result of the conducted experimental research, the authors conclude that arranging the elastic auger without a gap between its peripheral part and the inner surface of the guiding tube significantly reduces vibrations in the process of conveying bulk material.

keywords: manufacturing method, elastic screw, process of coiling the spiral, screw making, efficiency, grain material damage.

1. introduction

the transportation of bulk material by screw conveyors has a broad range of applications in different fields of production. however, conventional screw operating tools have a number of disadvantages, namely an increased material damage rate due to the gaps existing between the rotating part of the auger and the inner surface of the tube. this also leads to an increased power consumption of conveying the material and to significant vibrational oscillations that reduce the operational life cycle of the employed mechanisms. to increase the conveyor productivity, pneumatic devices that improve material flow modes are used.
developments found in the works [1–3] are dedicated to increasing the operational efficiency of pneumatic auger conveyors, and in [4–7] the results of research on pneumatic conveyors with simultaneous transportation and mixing of bulk material in the main characteristic areas of their routes are presented. these conveyors provide the transportation of bulk materials along technological paths of various spatial configurations, employing a mechanical feed of the material by a screw discharger and an additional intensification of the process with the help of pneumatic devices. we also carried out an analysis of research on the improvement of the operational and functional characteristics of screw and tubular conveyors and on the reliable safety protection of their drive members, which is described in the works [8, 9]. these conveyors are designed to simultaneously transport and mix bulk feed mixtures along multicurved paths with lower energy inputs of the technological process. the works [10, 11] are dedicated to the development of screw operating tools with an elastic surface and to their theoretical and experimental research. the works [12–15] provide research findings on the flow patterns of bulk materials depending on the constructional and kinematic characteristics of screw operating tools, the bunker type and the solid particles, as well as on frictional forces. the papers [16, 17] also present results from an experimental screw conveyor, where it is shown that the analytical predictions correlate very closely with the measured results of the transportation of bulk materials. operating conditions that affect the performance of screw conveyors are considered in [18–20] by using the discrete element method (dem) to simulate single-flight screw conveyors with periodic boundary conditions. in the works [21–23], mathematical models of the process of feeding bulk materials into the bunkers of screw conveyors are proposed and the operating modes of inclined screw conveyors are considered, allowing new designs to be developed with a justification of their rational parameters. the analysis of the known works shows that a considerable reduction in the damage caused to agriculturally produced materials can be achieved by employing elastic surfaces attached to the screw blade. it is therefore efficient to develop new designs of elastic screw operating tools and to determine their optimal structural and kinematic parameters, which will improve their operational performance in the processes of bulk material transportation. the purpose of the conducted research is to increase the operational performance of elastic conveyor screw operating tools by developing new designs and determining their rational parameters, as well as by applying the proposed manufacturing method. the novelty of this work is in determining the influence of the constructional and kinematic parameters of an elastic screw operating tool on the damage rate of the grain material, the conveyor output and the energy consumption of the transportation.
2. material and methods

according to the main objectives, a constructional design of an elastic sectional screw operating tool and its manufacturing method have been developed. an experimental installation has been designed to determine the productivity of the auger conveyor as well as to establish the influence of the technological and kinematic parameters of a screw conveyor with hard and elastic operating surfaces on the grain degradation rate. a statistical processing of the experimental research findings has been performed in order to construct and subsequently analyse regression equations. to meet the objectives, an elastic sectional screw operating tool has been designed [24]. the design scheme and general view of the sectional screw operating tool with overlapping adjacent elastic sections are shown in figure 1. it consists of a central shaft 1 (diameter ∅38) with a spiral body 2, made of material s12, to which elastic sections 3 are attached with the help of sectional screw plates 4 and bolted joints with semi-circular heads 5 and nuts 6. during the transportation of the material in the tubular casing 7, when grain is stuck between the stationary surface of the casing and the rotating surface of the sectional elastic screw, the sections bend to avoid damaging the grain.

figure 1. design scheme (a) and general view (b) of the screw operating tool with overlapping elastic sections.

the manufacturing method of the elastic sectional screw operating tool is as follows: the ribbon is coiled onto the screw stack without performing a pitch calibration. different types of coiling methods may be applied [25]. the process of coiling helices out of ribbons with their continuous escape from the framework differs in the fact that the acting forces, which capture and carry the ribbons, are frictional forces (figure 2). for the coiling we used the installation shown in figure 2, which was mounted onto a metalworking lathe.

figure 2. installation diagram for coiling helices onto the stack (a) and general view of the coiled helix stack (b).

the installation is made in the form of a step-like cylindrical mandrel 1, which, in its wider part, has an axial slot 2. the bent end 3 of the ribbon is inserted into the slot and is fixed with a bushing 4. at the end of the mandrel there is a single helical coil 5 with a pitch that equals the helix width. clamping 6 and guide 7 rollers, with the possibility of free rotation, are mounted on the saddle, perpendicularly to the axis of the mandrel. the clamping roller has a step-like design: its cylindrical surface presses the ribbon to the end of coil 5, and the wider part of the roller comes into contact with the rib of the ribbon being coiled. the roller is rigidly mounted on the body frame 8, attached on the axle with the help of a bushing, and is set onto the saddle by means of a thrust bearing. to improve the operating conditions, the axle of the thrust bearing is displaced relative to the mandrel by the value e, sideways to the feed of the ribbon. the installation operates in the following way: the end of the ribbon, bent at an angle of 90°, is inserted into the axial slot 2 of the mandrel and fixed with the bushing. the pressing roller 6 is brought to the ribbon in such a manner that the narrower cylindrical side of the roller presses the ribbon to the mandrel's end, and the wider end surface of the roller presses down the ribbon along its rib. in this case, the end surface should be positioned from the mandrel at a distance that corresponds to the ribbon's width in the screw stack.
the loose end of the ribbon is bent along the surface of the pressing roller and is inserted into the gap formed by the smaller sides of the pressing 6 and guiding 7 rollers. after the mandrel is set in rotation, the ribbon, under the action of the end surface of the pressing roller, is coiled onto the smaller part of the mandrel. the feeding of the ribbon towards its bent side is performed by the working surfaces of the pressing and guiding rollers. the rotation of the mandrel is synchronized with the feed mechanism 9, on which the pressing roller is mounted. the feed is determined by the maximum thickness of the inner diameter of the helix screw. after the coiling is completed, the pressing roller is drawn aside and the coiled screw stack is removed from the mandrel. the investigations proved the possibility of coiling the ribbon onto the mandrel with a ratio of ribbon width to thickness of up to 15–20, owing to the advantageous bend pattern and the improved metal deformation conditions.

a diagram of the mount holes and a general view of the screw stack with holes are shown in figure 3. the screw stack 1 was attached to the fixture, which has a special bushing 2 in its lower part; the end of the bushing has a screw pattern with a pitch equal to the screw thickness. a similar bushing 3 is placed in the upper part of the fixture. the coils of the workpiece passing through the fixture plate 4 are maximally constricted against each other with the help of a central screw 5, which is screwed into the fixture's body. the fixture plate 4 has through holes for attaching the fixture bushings 7, and guiding columns 8 ensure the necessary position of the plate 4. the holes in the coils are drilled by bit 9 at a spindle speed of 1000 rpm and a feed rate of 0.2 mm per revolution.

figure 3. fixture for drilling the screw stack (a) and its general view (b, c).

the next technological step is the pitch calibration of the ribbon screw, which is afterwards rigidly jointed onto its tubular base (figure 4). then an elastic operating screw, or its sections, depending on the geometric and rheological parameters of the transported material, is mechanically jointed onto the hard supporting screw (figure 5).

figure 4. attaching the calibrated screw onto the tubular base.
figure 5. solid rubber elastic screw (a) and sectional screw (b, c) attached to the base.

to evaluate the transportation productivity and the bulk material degradation rate, which depend on the design, kinematic and technological characteristics of the screw conveyor with an elastic auger, an experimental installation was designed (figure 6). it consists of a frame 11 and an auxiliary frame 10, which is attached to the frame by pivot joints 1, 12, 14 and a mounting bracket 13 with holes. an electric motor 15 with a belt-driven 2 operating tool is mounted on the frame. the operating tool itself consists of a shaft 6 and a supporting ribbon screw 5 with elastic sections 4 attached to the screw edge. the operating tool is mounted inside the guiding tube 7, which has a bunker 3 in the feed area, a branch tube 8 in the discharge area and a bulk container 9. the width and rigidity of the elastic sections were selected according to the stress-strain behaviour of the material being conveyed. the experimental research was conducted using polyurethane pu-60; the elastic sections were made with a thickness of 5 mm [26].

figure 6. design scheme (a) and general view (b, c) of the experimental installation.
investigations of the bulk material damage during the transportation process, depending on the variations of the design and kinematic parameters of the screw operating tool, were conducted using the following method. the overall percentage of damaged grain was determined from three samples of grain before its transportation. the grain material was then transported along the technological path of the screw conveyor of the experimental installation, after which three samples of grain were examined and the percentage of damaged grain was determined. the difference in the amount of damaged grain material before and after the transportation indicated the damage rate [27]. the experiment was carried out ten times, which corresponded to a total transportation distance of 10 m. in the process of the investigation, the variation ranges of the factors were the following: the rotational frequency of the screw operating tool n = 200–500 rpm, the elevation angle α = 0–40°, and the clearance value between the auger and the casing ∆ = 0–7 mm. to start the 2.2 kw three-phase asynchronous motor and adjust its rotational speed, we used an altivar 71 frequency inverter and power suite v.2.5.0 software. the data obtained on the motor torque and power were displayed in the power suite application window on a computer screen. the electric motor drive power and torque values were recorded in percentage terms; the motor power was determined by multiplying its nominal output (2.2 kw) by the peak value (expressed as a percentage) for the given operating mode. the evaluation method for the screw conveyor production output per second consisted in collecting grain samples within 5 seconds during the given transportation mode. for the auger with the elastic sectional surface, the gap value was ∆ = 0 mm, and the same value was used to determine the grain damage rate of winter wheat with a bulk density of 720 kg/m3 and a moisture content of w = 12–15 %. for the rigid auger, the gap value was ∆ = 4 mm. guiding jackets (tubes) with inner diameters of d = 100 and 120 mm were used to determine the screw conveyor production capacity q. for both variants of the screw, the screw surface pitch was t1 = 70 mm. the photoscript of the process of conveying the grain material at α = 10° and n = 450 rpm is shown in figure 7.

figure 7. photoscript of the grain material transportation process (frames at 1, 3, 5, 7, 9 and 11 s).

having studied the photographs of the grain material transportation process, it was determined that the peak production output of the screw conveyor, i.e. with its bunker filled with grain, occurs between 5 and 10 s at α = 10° and n = 450 rpm. within this timespan the grain material was collected and weighed in order to evaluate the production output of the screw conveyor per second.

3. results and discussion

based on the experimental research on the evaluation of the power consumption of the elastic screw conveyor during the transportation of grain material, we constructed the following regression equation:

$$P = 0.055 + 0.11 \cdot 10^{-2}\,n - 0.06 \cdot 10^{-4}\,\alpha - 0.014\,\Delta + 0.21 \cdot 10^{-5}\,n\alpha + 40.84 \cdot 10^{-4}\,n\Delta + 0.75 \cdot 10^{-4}\,\alpha\Delta - 0.33 \cdot 10^{-6}\,n^2 + 0.21 \cdot 10^{-6}\,\alpha^2 - 0.5 \cdot 10^{-4}\,\Delta^2 \qquad (1)$$

the factorial field was determined by the following ranges of parameter variation: 200 ≤ n ≤ 500 (rpm); 0 ≤ α ≤ 40 (degrees); 0 ≤ ∆ ≤ 4 (mm).
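regression (1) can be evaluated directly inside the stated factorial field; the python helper below is our own wrapper around the published coefficients, taken exactly as printed, and computes p at the centre of the field:

```python
def drive_power(n, alpha, delta):
    # regression (1) with the published coefficients, valid for
    # 200 <= n <= 500 rpm, 0 <= alpha <= 40 deg, 0 <= delta <= 4 mm
    return (0.055 + 0.11e-2 * n - 0.06e-4 * alpha - 0.014 * delta
            + 0.21e-5 * n * alpha + 40.84e-4 * n * delta + 0.75e-4 * alpha * delta
            - 0.33e-6 * n ** 2 + 0.21e-6 * alpha ** 2 - 0.5e-4 * delta ** 2)

print(drive_power(350, 20, 2))   # centre point of the factorial field
```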
response surfaces constructed according to regression equation (1) are shown in figure 8. an analysis of the response surfaces shows that the dominating factor influencing the value of p is the rotational speed n; the next factor in order of impact is the angle α, and the least influential factor is the gap value ∆.

figure 8. response surfaces for the power p of the conveyor drive: (a) p = f(n, α) at ∆ = 2 mm; (b) p = f(∆, n) at α = 20°; (c) p = f(∆, α) at n = 350 rpm.

the investigation findings on the production output q per second during the transportation of grain material by the experimental installation, with inner diameters of the guiding tubes of d = 120 and 100 mm at a screw rotational speed of n = 450 rpm and a horizontal configuration of the screw operating tool, are shown in figure 9. it was determined that the maximal screw conveyor output occurred within 5 and 10 seconds after it had been set in motion and the bunker had been filled with grain material; within this timespan the material was collected and weighed. the general tendency of the screw conveyor output q per second, depending on the inclination angle of the operating tool relative to the horizon α = 0–40° at n = 450 rpm, shows that the value of q decreases when the inclination angle α increases, whereby the rate of decrease of q grows considerably at angle values of α = 30° and more. this can be explained by the fact that at steep inclination angles of the operating tool relative to the horizon, the feeding of grain into the guiding pipe becomes more complicated: since the bunker is rigid and vertically fixed to the tube (figure 6), additional frictional forces develop along the inside surface of the bunker, and this slows down the feeding of material into the technological transportation zone.

figure 9. characteristic curves of the screw production output per second: (a) dependency on n = 200–500 rpm at α = 0; (b) dependency on α at n = 450 rpm; 1, 2 – d = 120 mm; 3, 4 – d = 100 mm; 1, 3 – screw with elastic surface (∆ = 0 mm); 2, 4 – hard screw (∆ = 4 mm).

the analysis of the screw conveyor production output showed that the transportation output increased 1.28–1.29 times in the case of the elastic blade auger (with ∆ = 0 mm) when the inside diameter of the tube was increased from 100 to 120 mm and the operating tool inclination angle was within the range α = 0–40°; in the case of the hard screw (with ∆ = 4 mm), the transportation output increased 1.28–1.33 times.

investigations were conducted to determine the grain degradation rate caused by the hard auger (damage rate $T_h$) and by the elastic auger ($T_e$). the regression equation for the dependence of the grain degradation rate on the values α, n and ∆ for the rigid auger is as follows:

$$T_h = 0.0108 + 0.0046\,\alpha + 0.0005\,n + 0.053\,\Delta \qquad (2)$$

the factorial field had the following ranges of parameter variation: 0° ≤ α ≤ 40°; 200 ≤ n ≤ 500 (rpm); 2 ≤ ∆ ≤ 7 (mm). the response surfaces of the hard-screw damage rate $T_h$ depending on variations of two factors are shown in figure 10: (a) $T_h$ = f(∆, α); (b) $T_h$ = f(n, α); (c) $T_h$ = f(n, ∆).

the regression equation for the dependence of the grain damage on the values α, n and ∆ for the elastic screw is as follows:

$$T_e = 0.0011 + 0.0012\,\alpha + 0.0002\,n + 0.051\,\Delta \qquad (3)$$

the factorial field had the following ranges of parameter variation: 0° ≤ α ≤ 40°; 200 ≤ n ≤ 500 (rpm); 0 ≤ ∆ ≤ 4 (mm).
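regressions (2) and (3) make the comparison between the two screws explicit. the sketch below (our own helper functions) evaluates them at the gap values actually used in the tests (∆ = 4 mm for the rigid auger, ∆ = 0 mm for the elastic one), purely as an illustration of how the equations are applied:

```python
def damage_hard(alpha, n, delta):
    # regression (2): grain damage rate for the rigid auger
    return 0.0108 + 0.0046 * alpha + 0.0005 * n + 0.053 * delta

def damage_elastic(alpha, n, delta):
    # regression (3): grain damage rate for the elastic auger
    return 0.0011 + 0.0012 * alpha + 0.0002 * n + 0.051 * delta

th = damage_hard(20, 350, 4)      # rigid auger at its working gap
te = damage_elastic(20, 350, 0)   # elastic auger without a gap
print(th, te, th / te)            # the elastic screw damages the grain several times less
```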
figure 11 shows the response surfaces te depending on variations of two factors: 11a te = f(∆, α); 11b te = f(n, α); 11c te = f(n, ∆). the analysis of the response surfaces shows that the dominating factor that influences the value th is the gap width ∆. the next impact factor is angle α, and very close to it is the rotational speed of the operating tool n.

figure 10. response surfaces for grain damage th by hard screw depending on variations of two factors: (a) th = f(∆, α); (b) th = f(n, α); (c) th = f(n, ∆).

figure 11. response surfaces for grain damage rate te by elastic auger depending on variations of two factors: (a) te = f(∆, α); (b) te = f(n, α); (c) te = f(n, ∆).

the methodology of conducting the investigations on the level of the grain material damage and on the transportation-related energy consumption applying a multifactorial experiment has been suggested. the analysis of the investigation results has made it possible to assess the influence of all the above-mentioned system parameters on the behaviour of the bulk material when passing the area of adjacent overlapping sections. the presented method of making a screw operating element, which is adjusted to its fixture with elastic straps, involves a pre-coiling of a strip onto a rib with a further drilling of fixing holes in the developed conductor and a calibration of a screw base with its further fixation onto a shaft base. the ways of mounting an elastic spiral and its sections onto the base of a screw spiral rib have been suggested.

4. conclusions

a completely new design of a screw operating tool with a sectional elastic surface has been developed to study the technological processes of transporting grain materials, and a set of experiments has been made. in the course of running the experimental research, the following variable values were considered: rotational speed of the operating tool (n, rpm); its inclination angle towards the horizon (α, degrees); gap value between the auger and the tube (∆, mm). when determining the grain damage rate t, it has been established that when we used elastic sections, as compared to rigid blades, at n = 100–400 rpm the t value decreased 1.55–3.0 times; when the angle was changing within the range of α = 0–40°, the t value decreased 1.63–4.0 times in the case of the auger with elastic blades. based on a multi-factor experimental research, we obtained the regression dependency for the determination of power p of a screw conveyor operation. the dominating factor, which influences the power value of a screw conveyor operation, is the rotational speed of the operating tool n. the next influential factor is the auger angle α towards the horizon. the gap value ∆ between the elastic auger and the tube had the lowest impact on the power change p of the conveyor drive operation. an analysis of the screw conveyor productive output per second showed that the output of an elastic auger (provided ∆ = 0 mm) with inside diameters of tubes increased from 100 to 120 mm, at a rotational speed of the operating tool within n = 300–450 rpm, increases 1.25–1.27 times, and productivity increases 1.27–1.31 times in the case of a rigid auger (provided ∆ = 4 mm). the general tendency of variation of the screw conveyor production output per second q, depending on angles α = 0–40° at n = 450 rpm, shows that the q value decreases when angle α increases,
whereby the decline in flow rate q becomes significantly steeper when the angle value reaches α = 30° and more. transportation output increases 1.28–1.29 times in the case of the elastic blade augers (∆ = 0 mm) with an increased tube inner diameter from 100 to 120 mm and the operating tool being inclined by α = 0–40° towards the horizon; in the case of the rigid augers (∆ = 4 mm), it increases 1.28–1.33 times. a completely new technique for producing a competitive elastic sectional screw operating tool has been developed on the basis of the conducted set of theoretical and experimental studies. the technical innovation of the developed design is protected by declarative utility model patents of ukraine [24], and the research results have been partially implemented.

references

[1] v. m. baranovsky, r. b. hevko, v. o. dzyura, et al. justification of rational parameters of a pneumoconveyor screw feeder. inmateh: agricultural engineering 54(1):15–24, 2018.
[2] r. b. hevko, o. m. strishenets, o. l. lyashuk, et al. development of a pneumatic screw conveyor design and substantiation of its parameters. inmateh: agricultural engineering 54(1):153–160, 2018.
[3] r. b. hevko, v. m. baranovsky, o. l. lyashuk, et al. the influence of bulk material flow on technical and economical performance of a screw conveyor. inmateh: agricultural engineering 56(3):175–184, 2018.
[4] m. lech. mass flow rate measurement in vertical pneumatic conveying of solid. powder technology 114(1-3):55–58, 2001. https://doi.org/10.1016/s0032-5910(00)00263-1.
[5] y. li, y. z. li. new equipment for bulk cargo conveying screw-gas bulk sucking and taking equipment. in 6th international conference on material handling (icmh 2008), pp. 243–247. 2008.
[6] e. v. p. j. manjula, w. k. hiromi ariyaratne, c. ratnayake, m. c. melaaen. a review of cfd modelling studies on pneumatic conveying and challenges in modelling offshore drill cuttings transport. powder technology 305:782–793, 2017. https://doi.org/10.1016/j.powtec.2016.10.026.
[7] n. tripathi, a. sharma, s. s. mallick, p. w. wypych. energy loss at bends in the pneumatic conveying of fly ash. particuology 21:65–73, 2015. https://doi.org/10.1016/j.partic.2014.09.003.
[8] r. b. hevko, b. o. yazlyuk, m. v. liubin, et al. feasibility study of mixture transportation and stirring process in continuous-flow conveyors. inmateh: agricultural engineering 51(1):49–59, 2017.
[9] r. b. hevko, m. v. liubin, o. a. tokarchuk, et al. determination of the parameters of transporting and mixing feed mixtures along the curvilinear paths of tubular conveyors. inmateh: agricultural engineering 55(2):97–104, 2018.
[10] r. b. hevko, s. z. zalutskyi, i. g. tkachenko, o. m. klendii. development and investigation of reciprocating screw with flexible helical surface. inmateh: agricultural engineering 46(2):133–138, 2015.
[11] y. tian, p. yuan, f. yang, et al. research on the principle of a new flexible screw conveyor and its power consumption. applied sciences 8(7):1038, 2018. https://doi.org/10.3390/app8071038.
[12] o. l. lyashuk, o. r. rogatynska, d. l. serilko. modelling of the vertical screw conveyer loading. inmateh: agricultural engineering 45(1):87–94, 2015.
[13] j. w. fernandez, p. w. cleary, w. mcbride. effect of screw design on hopper draw down by a horizontal screw feeder. in seventh international conference on cfd in the minerals and process industries, csiro, melbourne, australia, 9–11 december, pp. 1–6. 2011.
[14] a. w. roberts.
the influence of granular vortex motion on the volumetric performance of enclosed screw conveyors. powder technology 104(1):56–67, 1999. https://doi.org/10.1016/s0032-5910(99)00039-x.
[15] x. x. sun, w. j. meng, y. yuan. design method of a vertical screw conveyor based on taylor-couette-poiseuille stable helical vortex. advances in mechanical engineering 9(7), 2017. https://doi.org/10.1177/1687814017714984.
[16] d. schlesinger, a. papkov. screw conveyor calculation based on actual material properties. powder handling and processing 9(4), 1997.
[17] h. zareiforoush, m. h. komarizadeh, m. r. alizadeh. effect of crop-screw parameters on rough rice grain damage in handling with a horizontal screw conveyor. journal of food, agriculture and environment 8(3-4):494–499, 2010.
[18] p. j. owen, p. w. cleary. prediction of screw conveyor performance using the discrete element method (dem). powder technology 193(3):274–288, 2009. https://doi.org/10.1016/j.powtec.2009.03.012.
[19] p. j. owen, p. w. cleary. screw conveyor performance: comparison of discrete element modelling with laboratory experiments. progress in computational fluid dynamics 10(5/6):327–333, 2010. https://doi.org/10.1504/pcfd.2010.035366.
[20] a. w. roberts, s. bulk. optimizing screw conveyors. chemical engineering 122(2):62–67, 2015.
[21] a. s. merritt. mechanics of tunneling machine screw conveyors: a theoretical model. geotechnique 58(2):79–94, 2008. https://doi.org/10.1680/geot.2008.58.2.79.
[22] o. rogatynska, o. liashuk, t. peleshok, r. liubachivskyi. investigation of the process of loose material transportation by means of inclined screw conveyers. bulletin of ternopil ivan puluj national technical university 79:137–143, 2015.
[23] r. m. rohatynskyi, a. i. diachun, a. r. varian. investigation of kinematics of grain material in a screw conveyor with a rotating casing. bulletin of kharkiv petro vasylenko national technical university of agriculture 168:24–31, 2016.
[24] r. b. hevko, i. g. tkachenko, s. z. zalutskyi, v. v. khradovyi. screw with sectional elastic screw surface. patent of ukraine no. 119856, 2017.
[25] b. m. hevko. manufacturing method of screw spirals. higher school, lviv, 1986.
[26] poliuretan. [2018-03-25], https://electroplast.company/poliuretan.
[27] m. r. hevko. substantiation of the parameters of sectional screw conveyers for transporting loose agriculture materials. national technical university, ternopil, 2013.

a progress report on numerical solutions of least squares adjustment in gnu project gama

a. čepek, j. pytel

abstract: gnu project gama for adjustment of geodetic networks is presented. numerical solution of least squares adjustment in the project is based on singular value decomposition (svd) and the general orthogonalization algorithm (gso). both algorithms enable the solution of singular systems resulting from the adjustment of free geodetic networks.

keywords: geodesy, least squares adjustment, local geodetic network.

1 introduction

project gama for adjustment of geodetic networks was started at the department of mapping and cartography, faculty of civil engineering, czech tu prague, in 1998. at first it was planned to be only a local project with the main goal of demonstrating to students the power of object programming and at the same time being a free independent tool for comparing adjustment results from various sources.
the gama project received the official status of gnu software in 2001, and now consists of a c++ library (including the small c++ matrix/vector template library gmatvec) and two programs, gama-local and gama-g3, which correspond to the two development branches of the project. the stable branch of the gama project consists of the command line program gama-local for adjustment of three-dimensional geodetic networks in a local coordinate system (a platform independent qt based gui roci-local is also available). the new development branch of the project (gama-g3) aims to adjust geodetic networks in a global geocentric system. the stable branch (gama-local) enables common adjustment of possibly correlated horizontal directions and distances, horizontal angles, slope distances and zenith angles, height differences, observed coordinates (used in sequential adjustment, etc.) and observed coordinate differences (vectors). although such an adjustment model has now been made obsolete by global positioning systems, it can still serve as an educational tool for demonstrating adjustment procedures to students and as a starting platform for developing the new branch of the project (gama-g3). numerical solution of least squares adjustment in geodesy is most commonly based on the solution of normal equations. as the gama project was also meant to be a comparison tool, it was desirable to use a different method, and singular value decomposition (svd) was implemented as the main numerical algorithm. as a testing alternative, gama implements another algorithm from the family of orthogonal decompositions, based on gram-schmidt orthogonalization (gso). practical experience with both algorithms is discussed. in the gama project, the geodetic input data are described in extensible markup language (xml). the primary motivation for using xml was to define structured input data for adjustment of a local geodetic network. the most important feature of xml is probably the ease of defining a grammar for user data (a class of xml documents) that consequently can be validated even independently of our applications. one of the goals of the gama project is to build a collection of model geodetic networks described in xml. the lack of reliable testing data was one of the major obstacles when testing the implementation of the numerical solution of the geodetic network adjustment.

2 adjustment and analysis of observations

geodesy as a scientific discipline studies the geometry of the earth or, from the practical point of view, the positioning of objects located on the earth's surface or in zones relatively close to it. the input information consists of geodetic observations. the spectrum of observation types dealt with by geodesy is very wide, and ranges from classical astro-geodetic observations (astronomical longitude and latitude, variations and position of the earth's pole), measurements of geophysical quantities (gravity acceleration and its local anomalies), through traditional geometric observables like directions, angles and distances, to photogrammetric measurements of historical monuments. however, the main importance in geodesy today is given to satellite global positioning systems (first of all navstar gps and other complementary systems like doris or glonass). the key role in processing geodetic data belongs to the sphere of applied statistics in geodesy, traditionally called adjustment of observations.
the processing of geodetic observations is determined by the choice of an appropriate mathematical model, which can be symbolically expressed as

f(c, x, l) = 0, (1)

where f is a vector of functions describing the relations between constants c, unknown parameters x and observed quantities l. corresponding to the three components of this model are three mathematical spaces: parameter, observation and model space [1]. the three basic components of the mathematical model (1) are depicted in fig. 1, where a, b, g and h are the matrices of the corresponding linearized relations (values of constants c are not estimated in geodesy and we can consider them to be a part of model space). models can be direct, indirect or implicit; linear or nonlinear; and can occur individually or in combinations:

model explicit in x: x = g(l), x = gl + v,
model explicit in l: l = h(x), l = hx + v,
implicit model: f(x, l) = 0, ax + gl + v = 0.

fig. 1: linear relations between parameter, observation and model spaces

3 least squares and singular systems

when adjusting geodetic observations we are relatively often faced with models leading to singular sets of linear equations. typically these are models without fixed points, i.e., no points with fixed coordinates are given, or the number of fixed points is not sufficient (free networks, see [6] for more information). let us take as an example the local network with observed directions and distances from fig. 2. the relationship between the unknown adjusted coordinates and observations can be expressed after linearization as the project equations

ax = l + v, (2)

where a is the design matrix, x the vector of unknowns, l the vector of reduced observations and v the vector of residuals (misclosure vector). in geodesy, the number of observations is always higher than the number of unknowns. project equations (2) thus represent an overdetermined system, and matrix a has more rows than columns. least squares is the basic method used in geodesy for observation adjustment. it gives us the unique solution x of system (2) that minimizes the euclidean norm of the residual vector

min vᵀv. (3)

fig. 2: example of a local geodetic free network

a method commonly used for solving project equations (2) (model explicit in observations) is based on normal equations

n = aᵀa, nx = aᵀl, x = n⁻¹aᵀl. (4)

apart from the unknown vector x (and residuals v), in geodesy we are always interested in estimates of the precision of the adjusted quantities, in geodetic practice represented by the variance-covariance matrix of the adjusted unknowns c_xx and adjusted observations c_ll

c_xx = m₀² n⁻¹, (5)
c_ll = a c_xx aᵀ. (6)
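the pipeline in equations (2)–(6) is easy to mirror in a few lines of numpy; the sketch below is our illustration (not gama code): it forms the normal equations for a small well-conditioned design matrix, solves them, and evaluates the cofactor matrices of (5) and (6) with m₀ set to 1.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((8, 3))   # design matrix, more rows than columns
l = rng.standard_normal(8)        # reduced observations

n = a.T @ a                        # normal equations matrix, eq. (4)
x = np.linalg.solve(n, a.T @ l)    # adjusted unknowns
v = a @ x - l                      # residuals, eq. (2)

q_xx = np.linalg.inv(n)            # cofactors of unknowns, eq. (5) with m0 = 1
q_ll = a @ q_xx @ a.T              # cofactors of adjusted observations, eq. (6)

# least squares property: a'v = 0 (residuals orthogonal to the column space)
print(np.allclose(a.T @ v, 0))
```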
the geometric shape of our adjusted network is defined by the observed directions (or angles) and distances. if we fixed the coordinates of two or more points, the network shape would necessarily be distorted. normal equations would lead to an adjustment solution in which the residuals would be dependent on the coordinates of the fixed points. this way we would degrade our observations in cases when the coordinates of the network points are either unknown or known with lower precision. on the other hand, if we consider the coordinates of all points to be free, the corresponding matrix n is inevitably singular; the columns of matrix a are linearly dependent (the network can float freely in the coordinate system) and the normal equation matrix n is positive-semidefinite

pᵀnp ≥ 0 for all p ≠ 0.

to get a unique solution we have to define additional constraints regularizing the system, preferably without deforming the network shape. in geodetic practice we most often meet the following approaches:

- a singular system is regularized by introducing pseudo-observations, typically with huge weights, that play a similar role as a set of constraint equations.
- an explicit system of constraint equations is defined to make the given system regular

cx = c. (7)

normal equations then become (rows of block matrices separated by semicolons)

[n, cᵀ; c, 0] [x; λ] = [aᵀl; c], (8)

where λ is the vector of lagrange multipliers. in this case matrix c is problem dependent and needs to be known explicitly in advance.
- the euclidean norm of a certain subset of the unknown parameters vector x is minimized

min Σ_{i∈I} xᵢ². (9)

the set of indices I can contain all elements, but more often only selected elements of x.

in the case of a plane geodetic free network we can geometrically interpret the last constraint (9) as follows. by minimizing the euclidean norm of the residual vector (3), the shape and scale (if at least one distance is available) of the adjusted network, together with the covariances of the adjusted observations, are uniquely defined. the second additional constraint (9) then defines the localization of the network in the coordinate system. apart from the adjusted network shape we simultaneously define its shift and rotation in the coordinate system. another equivalent interpretation is that constraint (9) defines the particular solution of (2) in which the trace of the variance-covariance submatrix corresponding to indices i ∈ I is minimal.

4 normal equations and numerical stability

the numerical solution of the adjustment of observed quantities based on normal equations can be numerically unstable, and in certain cases we should prefer other numerical algorithms that directly solve the project equations (2). a possible source of trouble are the normal equations themselves, or more precisely the condition number of the normal equations. let us restrict our discussion here to the simple case when matrix a does not contain linearly dependent columns and matrix n is positive-definite. the condition number of matrix a is defined as

κ(a) = sqrt( λ_max(aᵀa) / λ_min(aᵀa) ), (10)

where λ_max(aᵀa) and λ_min(aᵀa) denote the maximal and minimal eigenvalue of matrix aᵀa. if we solve a linear set of equations, then its condition number represents the minimal upper estimate of the ratio of the relative error of x to the relative error of the right hand side l. from equation (10) it directly follows that the condition number of the normal equation matrix n is the square of the condition number of the project equation matrix a

κ(n) = κ(a)². (11)

we can say that when solving poorly conditioned normal equations we lose twice as many correct decimal digits in the solution x as in any direct solution of the project equations.
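relation (11) can be observed numerically; the following sketch (ours, using numpy's 2-norm condition number) builds a mildly ill-conditioned design matrix and shows that cond(aᵀa) ≈ cond(a)².

```python
import numpy as np

rng = np.random.default_rng(1)
# design matrix with two nearly collinear columns -> large condition number
a = rng.standard_normal((50, 3))
a[:, 2] = a[:, 1] + 1e-4 * rng.standard_normal(50)

kappa_a = np.linalg.cond(a)          # 2-norm condition number of a
kappa_n = np.linalg.cond(a.T @ a)    # condition number of the normal matrix n

print(f"cond(a)   = {kappa_a:.3e}")
print(f"cond(a'a) = {kappa_n:.3e}")
print(f"cond(a)^2 = {kappa_a**2:.3e}")   # matches cond(a'a), eq. (11)
```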
probably the most important class of algorithms for the direct solution of project equations (2) is the family of orthogonal decomposition algorithms. apart from other goals, gnu project gama has been planned to be a kind of benchmark, i.e., a tool for checking adjustment results from other software products. for this reason it was desirable to base the adjustment on a numerical method different from the traditional solution of normal equations, and singular value decomposition (svd) was implemented as the main numerical algorithm. as an alternative, another orthogonal decomposition adjustment algorithm, gso (based on gram-schmidt orthogonalization), is also available. we give a brief description of both algorithms in the following section.

5 gram-schmidt orthogonalization

the gram-schmidt orthogonal decomposition is an algorithm for computing the factorization

a = qr, qᵀq = 1, (12)

where q is an orthogonal matrix and r is an upper triangular matrix. matrix r here is identical to the upper triangular matrix of the cholesky decomposition of the normal equations

n = aᵀa = rᵀqᵀqr = rᵀr. (13)

gram-schmidt orthogonalization is a very straightforward and relatively simple algorithm that can be implemented in several variants differing in the order in which the vectors are orthogonalized. the following three algorithms are adopted from [2, p. 300–301].

algorithm 1.1 [modified gram-schmidt (mgs), row version]

for k = 1, 2, ..., n
  q̂_k := a_k^(k); r_kk := (q̂_kᵀ q̂_k)^(1/2); q_k := q̂_k / r_kk;
  for i = k+1, ..., n
    r_ki := q_kᵀ a_i^(k); a_i^(k+1) := a_i^(k) − r_ki q_k;
  end
end

algorithm 1.2 [modified gram-schmidt (mgs), column version]

for k = 1, 2, ..., n
  for i = 1, 2, ..., k−1
    r_ik := q_iᵀ a_k^(i); a_k^(i+1) := a_k^(i) − r_ik q_i;
  end
  q̂_k := a_k^(k); r_kk := (q̂_kᵀ q̂_k)^(1/2); q_k := q̂_k / r_kk;
end

algorithm 1.3 [classical gram-schmidt (cgs)]

for k = 1, 2, ..., n
  for i = 1, 2, ..., k−1
    r_ik := q_iᵀ a_k;
  end
  q̂_k := a_k − Σ_{i=1}^{k−1} r_ik q_i;
  r_kk := (q̂_kᵀ q̂_k)^(1/2); q_k := q̂_k / r_kk;
end

it should not be forgotten that the variant known as classical gram-schmidt has very poor numerical properties in that there is typically a severe loss of orthogonality among the computed q_i. a rearrangement of the calculation, known as modified gram-schmidt, yields a much sounder computational procedure [3, p. 230–232].

5.1 generalized orthogonalization algorithm (gso)

the generalized orthogonalization algorithm (gso), a method based on gram-schmidt orthogonalization for the numerical solution of various adjustment models in geodesy, was elaborated by františek charamza [4, 5]. gso was implemented in gnu gama to conserve this rarely used but interesting method, and to offer an alternative numerical algorithm to svd (which we expected to give better numerical results for numerically unstable systems). algorithm gso operates on a block matrix structure

m = [m1, m2; m3, m4] → q = [q1, q2; q3, q4], (14)

where the transition from m to q is defined by the equations

q1ᵀ q1 = 1, (15)
m1 = q1 r, (16)
q1 = m1 r⁻¹, q2 = m2 − q1 q1ᵀ m2, (17)
q3 = m3 r⁻¹, q4 = m4 − q3 q1ᵀ m2, (18)

and r is the upper triangular matrix.
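before walking through how mgs acts on the block matrix m, here is a small python transcription of the modified gram-schmidt row version (our sketch, not the gama c++ code); it also verifies the cholesky relation (13).

```python
import numpy as np

def mgs_qr(a):
    """qr factorization by modified gram-schmidt (row version, alg. 1.1)."""
    a = a.astype(float).copy()
    m, n = a.shape
    q = np.zeros((m, n))
    r = np.zeros((n, n))
    for k in range(n):
        r[k, k] = np.linalg.norm(a[:, k])
        q[:, k] = a[:, k] / r[k, k]
        for i in range(k + 1, n):
            r[k, i] = q[:, k] @ a[:, i]
            a[:, i] -= r[k, i] * q[:, k]   # immediate update: the mgs trick
    return q, r

rng = np.random.default_rng(2)
a = rng.standard_normal((10, 4))
q, r = mgs_qr(a)
print(np.allclose(q @ r, a))              # a = qr, eq. (12)
print(np.allclose(r.T @ r, a.T @ a))      # r'r = a'a, eq. (13)
print(np.allclose(q.T @ q, np.eye(4)))    # orthogonality of q
```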
algorithm mgs is applied to the block matrix m so that the column dot products are computed only within the submatrices (m1, m2), the projections r_ki q_k are computed for full columns of m, and the whole process is terminated after all columns of the submatrix (m1ᵀ, m3ᵀ)ᵀ have been processed. this step is called the first orthogonalization in algorithm gso. let us take as an example the linear system of project equations (2), ax = l + v, and apply algorithm gso to the block matrix

[a, l; −1, 0] → [q, v; −r⁻¹, x].

the result is directly the vector of unknown parameters x and the vector of residuals v. the cofactors (weight coefficients) of the adjusted parameters q_{x_i x_j} are available as the dot products of rows i and j of submatrix r⁻¹, the cofactors of the adjusted observations q_{l_m l_n} are computed as the dot products of rows m and n of submatrix q, and the mixed cofactors q_{x_i l_n} similarly as the dot products of the i-th row of r⁻¹ and the n-th row of matrix q.

5.2 algorithm gso and singular systems

let us suppose now that the project equations matrix a contains r linearly independent columns and the remaining d linearly dependent columns. without loss of generality we can assume that the linearly dependent columns are located in the right part of matrix a. we denote the linearly independent columns a1, the linearly dependent columns a2 and the matrix of their linear combinations α

a = (a1, a2), a2 = a1 α, x = (x1ᵀ, x2ᵀ)ᵀ. (19)

now we can rewrite the project equations as

v = a1 x1 + a2 x2 − l = a1 (x1 + α x2) − l = a1 x̃ − l. (20)

as matrix a1 does not contain linearly dependent columns, a unique solution x̃ of (20) exists that minimizes the euclidean norm of v. if we know matrix α and vector x̃, then any solution x of

x̃ = x1 + α x2, x = (x1ᵀ, x2ᵀ)ᵀ (21)

is at the same time the least squares solution of (20) with the same vector of residuals v. if we apply algorithm gso to the matrix

mᴵ = [m1ᴵ, m2ᴵ; m3ᴵ, m4ᴵ] = [a1, a2, l; −1, 0, 0; 0, −1, 0] (22)

we receive a block matrix

qᴵ = [q, 0, v; −r⁻¹, α, x̃; 0, −1, 0]. (23)

in the case of singular systems in gso we have the first orthogonalization, which defines a particular solution in which the unknown parameters corresponding to the linearly dependent columns of a are set to zero. from cgs it emerges directly that matrix α is the matrix of the linear combinations from (19). the cofactors are computed the same way as in the case of regular systems. when computing gso numerically we naturally do not obtain exactly zero vectors in the positions of the (almost) linearly dependent columns. we declare to be linearly dependent those columns of a whose norms drop below a given tolerance. during the first orthogonalization we set to zero the corresponding subvectors in the area of a2. these values can be considered as random noise that adds no information to the whole solution. the results of the first orthogonalization are first of all the vector of residuals and the cofactors of the adjusted observations. it now remains to determine the vector of unknown parameters x that satisfies condition (9), and its cofactors (weight coefficients). this step of gso is called the second orthogonalization. by solving the system of linear equations

[α; 1] x2 = [x̃; 0] (24)

we get, according to (21), a vector x with the minimal norm.
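the first-orthogonalization bookkeeping for the regular-case scheme [a, l; −1, 0] described above can be imitated with ordinary modified gram-schmidt; the numpy sketch below is our illustration of the idea (not charamza's original procedure, and sign conventions may differ): mgs runs over the columns of a, each column update is applied to the full stacked column, and x and v are then read directly off the transformed blocks.

```python
import numpy as np

def gso_first(a, l):
    """first gso orthogonalization on the block matrix [a l; -i 0]:
    mgs over the columns of a, with every update applied to the full
    stacked column, yields [q v; -r^-1 x] in the regular case."""
    m_rows, n = a.shape
    top = np.hstack([a.astype(float), l.reshape(-1, 1).astype(float)])
    bottom = np.hstack([-np.eye(n), np.zeros((n, 1))])
    full = np.vstack([top, bottom])
    for k in range(n):
        norm = np.linalg.norm(full[:m_rows, k])  # dot products: top block only
        full[:, k] /= norm
        for i in range(k + 1, n + 1):
            r_ki = full[:m_rows, k] @ full[:m_rows, i]
            full[:, i] -= r_ki * full[:, k]      # projection on the full column
    v = full[:m_rows, n]       # residual vector (sign per convention)
    x = full[m_rows:, n]       # adjusted unknowns
    r_inv = full[m_rows:, :n]  # -r^-1 block; its rows give the cofactors q_xx
    return x, v, r_inv

rng = np.random.default_rng(3)
a = rng.standard_normal((8, 3))
l = rng.standard_normal(8)
x, v, r_inv = gso_first(a, l)
print(np.allclose(x, np.linalg.lstsq(a, l, rcond=None)[0]))  # True
```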
if we select from (24) only certain rows, we obtain the solution minimizing the corresponding subvector. this system can naturally be solved using gso. if we need the cofactors of the adjusted unknowns, as is the standard case in geodetic applications, we have to process during the second orthogonalization the whole lower submatrix that resulted from the first orthogonalization step

mᴵᴵ = (m1ᴵᴵ, m2ᴵᴵ) = [−r⁻¹, α, x̃; 0, −1, 0]. (25)

during the first orthogonalization, the linearly dependent columns in m1 are identified and are explicitly zeroed. the result of the first orthogonalization is a particular solution in which the unknowns corresponding to the linearly dependent columns are all set to zero. naturally, their cofactors are zero as well. during the second orthogonalization step only the submatrix (q3ᴵ, q4ᴵ) is influenced, and the orthogonalization process is carried out as follows:

- gram-schmidt orthogonalization runs only through the columns corresponding to the linearly dependent columns of m1, as if they were numbered 1, 2, ..., d, where d is the nullity of m1;
- dot products are computed only for the indices i ∈ I from the regularization condition min Σ_{i∈I} xᵢ².

linearly dependent columns are zeroed during the second orthogonalization even in the region of submatrix (q3, q4). the cofactors after the second orthogonalization are computed in the same way as in the case of regular systems.

6 singular value decomposition (svd)

for any real m × n matrix a, m ≥ n, there exists the singular value decomposition

a = u w vᵀ, uᵀu = 1, vᵀv = v vᵀ = 1, (26)

where u is an m × n matrix with orthogonal columns, w is a diagonal n × n matrix with nonnegative elements, and v is a square orthogonal n × n matrix (this variant is referred to as the thin svd [3]). matrix w is uniquely determined up to the permutation of its diagonal elements. the diagonal elements wᵢ are called the singular values of matrix a. their squares are the eigenvalues of the n × n matrix aᵀa. thus, the condition number of matrix a can be computed as the ratio of the maximal and minimal singular value

κ(a) = w_max / w_min. (27)

with the singular decomposition we can directly express the vector of unknown parameters x of the project equations

ax = l + v, x = v w⁻¹ uᵀ l, w⁻¹ = diag(1/wᵢ). (28)

if matrix a has more rows than columns (overdetermined system), then the euclidean norm of the residual vector v = ax − l is minimal and vector x is the least squares solution to the project equations (2). for a matrix a with linearly dependent columns, d singular values are zero (d is the dimension of the null space of a). singular value decomposition explicitly constructs the orthonormal vector bases of the null space and the range of a. the columns of matrix u corresponding to nonzero singular values wᵢ form the orthonormal basis of the range of a; similarly, the columns of matrix v corresponding to zero singular values form the orthonormal basis of the null space of a

null(a) = { x : ax = 0 }, range(a) = { y : y = ax }.

in the case of rank deficient systems, we set into the diagonal of the inverse matrix w⁻¹ zeros instead of reciprocals for the elements corresponding to linearly dependent columns

w⁻¹ = diag(w̄ᵢ), w̄ᵢ = 1/wᵢ for wᵢ ≠ 0, w̄ᵢ = 0 for wᵢ = 0. (29)

the resulting particular solution x minimizes both the euclidean norm of the residuals and, at the same time, the norm of the unknown parameters x. the rather surprising replacement of the reciprocal 1/0 → ∞ by zero can be explained as follows.
the solution vector x of the overdetermined system ax = l can be expressed as a linear combination of the columns of matrix v

x = Σᵢ₌₁ⁿ (1/wᵢ) (u⁽ⁱ⁾ᵀ l) v⁽ⁱ⁾. (30)

the coefficients in parentheses are the dot products of the columns of u and the right hand side l, multiplied by the reciprocal value of the singular value. the zero singular values correspond to the linearly dependent columns of matrix a that add no further information to the given system. setting the corresponding diagonal elements of matrix w⁻¹ to zeros is equivalent to eliminating the linearly dependent columns from matrix a. with matrix w⁻¹ defined according to (29), the cofactors are computed the same way for regular and singular systems

q_xx = n⁻¹ = (aᵀa)⁻¹ = v w⁻¹ w⁻¹ vᵀ, (31)
q_ll = a q_xx aᵀ = u uᵀ, (32)
q_lx = a q_xx = u w⁻¹ vᵀ. (33)

the cofactors (weight coefficients) for the adjusted parameters, observations and the mixed cofactors are computed, similarly as in the case of gso, as the dot products of the rows of matrices u and v, multiplied by the diagonal elements of w⁻¹ in the case of the cofactors of x.

6.1 algorithm svd and singular systems

what now remains is to show how to compute the particular solution that minimizes only a given subset of the subvector x according to the second regularization condition (9). we compose an overdetermined system of linear equations

γ c + x = x̄, (34)

where the columns of matrix γ are the vectors of the null space basis

γ = ( v⁽ⁱ¹⁾, v⁽ⁱ²⁾, ..., v⁽ⁱᵈ⁾ ), wᵢ = 0,

and c is the vector of coefficients of the linear combination of the null space basis vectors that, when added to vector x, minimizes the selected subvector of the unknown parameters x̄ (here the entries of x̄ act as residuals). from a comparison of (34) with equations (24) and (25) it is obvious that for computing x̄ we can use the second orthogonalization of algorithm gso. if the gso second orthogonalization is applied to the matrix v from the singular decomposition

mᴵᴵ = (m1ᴵᴵ, m2ᴵᴵ) = (γ, v), (35)

we obtain matrix v̄. if we now replace the singular value decomposition matrix v by matrix v̄, we can compute vector x̄ and all cofactors according to the same formulas (30–33) as in the case of the standard svd solution x.
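the pseudoinverse recipe (28)–(30) and the null-space shift of (34) can both be tried out in a few lines; the sketch below is our numpy illustration (not gama code): it builds a rank-deficient design matrix, forms the minimum-norm least squares solution by zeroing the reciprocal of the zero singular value as in (29), and then adds a null-space correction that minimizes a chosen subvector of x.

```python
import numpy as np

rng = np.random.default_rng(4)
a = rng.standard_normal((12, 4))
a[:, 3] = a[:, 0] + a[:, 1]          # one linearly dependent column, d = 1
l = rng.standard_normal(12)

u, w, vt = np.linalg.svd(a, full_matrices=False)
tol = max(a.shape) * np.finfo(float).eps * w[0]
w_inv = np.where(w > tol, 1.0 / w, 0.0)        # eq. (29): 1/0 replaced by 0
x = vt.T @ (w_inv * (u.T @ l))                 # minimum-norm solution, eq. (28)

# null-space basis: columns of v with (numerically) zero singular values
g = vt.T[:, w <= tol]                          # matrix gamma of eq. (34)

# shift x along the null space so that the subvector x[idx] is minimized
idx = [0, 1]                                   # indices i in the set I of eq. (9)
c, *_ = np.linalg.lstsq(g[idx, :], -x[idx], rcond=None)
x_bar = x + g @ c

print(np.allclose(a @ x, a @ x_bar))           # residuals unchanged
print(np.linalg.norm(x_bar[idx]) <= np.linalg.norm(x[idx]))  # True
```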
7 network adjustment in gnu gama

gama was started in 1998 as a local educational project, mainly to demonstrate to our students the power and capability of object programming (the project is written in c++), and at the same time to show some alternatives to the traditional approaches to numerical solutions of least squares adjustments based on normal equations. project gama was released under the terms of the gnu general public license, and in 2001 received the official status of gnu software. the numerical solution of geodetic network adjustment in gama is based on an abstract c++ class, and currently two derived classes are available, implementing the algorithms svd and gso. svd is the primary algorithm used in gama (one of our long term goals is to add more numerical solutions, namely solutions exploiting the sparse structure of the project equations). from this perspective, algorithm gso was implemented in gama only as a testing alternative, both for comparing numerical results and for testing the hierarchy of the adjustment classes in practice.

it is generally agreed that a bad implementation of gso can produce disastrous results. for example, during the first orthogonalization step of gso we set to zero the unknown parameters corresponding to linearly dependent columns. in the case of a free geodetic network adjustment these are the coordinates of certain points – the whole network is pinned on these points and clearly, if close points are selected, the regularization is unstable. the order of the columns in the orthogonalization is important. from practical experience we know that the vector norms in the gso orthogonalization process generally tend to decrease. as gso is just an alternative algorithm in gama and its performance is not a crucial point, we implemented it with full pivoting, i.e., in each orthogonalization cycle the vector with the maximal norm is selected as the pivot (with this modification gso is about twice as slow as svd for large networks).

singular value decomposition is a very robust method for dealing with systems that are either singular or numerically close to singular. even with full pivoting we had expected gso to prove inferior to svd, at least in cases with ill-conditioned matrices. surprisingly, with all the real geodetic networks that we have available, this was not the case. apart from real data, we used series of randomly generated three-dimensional networks for testing. our implementation of svd is based on a classical algorithm published by golub and reinsch [7] (the algol procedure svd). the decomposition is constructed in two phases. it starts with the householder reduction to bidiagonal form, followed by diagonalization. contrary to our expectations, svd as used in gama has not proved to give numerically better results, and in some cases it has even lost convergence in the diagonalization phase. a simple and tempting explanation that comes first to mind would be that the svd implementation in gama is somehow wrong. after all testing and revisions this does not seem to be the point. a possible explanation might be given by the following quotation from [3]:

… finally, we mention jacobi's method … for the svd. this transformation method repeatedly multiplies a on the right by elementary orthogonal matrices (jacobi rotations) until a converges to uΣ; the product of the jacobi rotations is v. jacobi is slower than any of the above transformation methods (it can be made to run within about twice the time of qr …) but has the useful property that for certain a it can deliver the tiny singular values, and their singular vectors, much more accurately than any of the above methods provided that it is properly implemented…

surely to have more numerical methods implemented in gama would be helpful, for example the above mentioned jacobi method for svd. a practical problem during the testing of the adjustment methods in gama was the relative shortage of reliable observation data and their adjustment results. to enable easy comparison with other software, we made a description of geodetic networks in xml (we use a dtd for the definition of the formal syntax of our structured data). conversion from a well defined data format into xml is a relatively simple task, but processing of xml is not trivial and cannot be done without an xml parser. in the gnu gama project we use the xml parser expat by james clark, see http://expat.sourceforge.net/.
we believe that xml is the best data format for the description and exchange of structured data in the gama project. one of the goals of our project is to compile a free collection of geodetic networks described in xml.

references

[1] vaníček, p., krakiwsky, e. j.: "geodesy: the concepts", north-holland, 2nd ed. (1986), isbn 0-444-87775-4.
[2] björck, å.: "numerics of gram-schmidt orthogonalization". linear algebra and its applications, vol. 197–198, (1994), p. 297–316.
[3] golub, g. h., van loan, c. f.: "matrix computations", 3rd edition, the johns hopkins university press, 1996, isbn 0-8018-5413-x.
[4] charamza, f.: "gso – an algorithm for solving least-squares problems with possibly rank deficient matrices". optimization of design and computation of control networks, budapest, akadémiai kiadó, 1979.
[5] charamza, f.: "an algorithm for the minimum-length least-squares solution of a set of observation equations". studia geoph. et geod., vol. 22, (1978), p. 129–139.
[6] koch, k. r.: "parameter estimation and hypothesis testing in linear models". 2nd ed., springer-verlag (1999), isbn 3-540-65257-4.
[7] golub, g. h., reinsch, c.: "singular value decomposition and least squares solutions". numer. math. 14, (1970), p. 403–420; handbook for auto. comp., vol. ii – linear algebra, (1971), p. 134–151.
[8] demmel, j. w.: "singular value decomposition". in z. bai, j. demmel, j. dongarra, a. ruhe, and h. van der vorst, editors, templates for the solution of algebraic eigenvalue problems: a practical guide. siam, philadelphia, 2000, http://www.cs.utk.edu/~dongarra/etemplates/book.html

prof. ing. aleš čepek, csc., phone: +420 224 354 647, e-mail: cepek@fsv.cvut.cz, department of mapping and cartography
ing. jan pytel, phone: +420 224 354 644, e-mail: pytel@gama.fsv.cvut.cz
czech technical university in prague, faculty of civil engineering, thákurova 7, 166 29 prague, czech republic

cfd simulation of partial channel blockage on plate-type fuel of triga-2000 conversion reactor core

sukmanto dibyo a,∗, wahid lufthi a, surian pinem a, ign djoko irianto a, veronica indriati sriwardhani b

a center for nuclear reactor technology and safety – batan, puspiptek, building no. 80, tangerang selatan, indonesia
b center for applied nuclear science and technology – batan, jln. tamansari, bandung, indonesia
∗ corresponding author: sukdibyo@batan.go.id

abstract. a nuclear reactor cooling system that has been operating for a long time can carry some debris into a fuel coolant channel, which can result in a blockage. an in-depth two-dimensional simulation of a partial channel blockage can be carried out using the fluent code. in this study, a channel blockage simulation is employed to perform a safety analysis for the triga-2000 reactor, which is converted to plate-type fuel. heat generation on the fuel plate takes place along its axial axis. the fuel plate is modelled in the form of a rectangular sub-channel with an inlet coolant temperature of 308 k and a low coolant velocity of 0.69 m/s. it is assumed that the blockage has the form of a thin plate, with the blockage area being assumed to be 60 %, 70 %, and 80 % of the sub-channel inlet flow.
an unblocked condition is also compared with a steady-state calculation performed with the coolod-n2 code. the results show that a partial blockage has a significant impact on the coolant velocity. when a blockage of 80 % occurs, the maximum coolant temperature locally reaches 413 k, while the saturation temperature is 386 k. from the point of view of the safety aspect, the blockage simulation result for the triga-2000 thermal-hydraulic core design using plate-type fuel shows that nucleate boiling occurs, which could cause damage to the fuel plate.

keywords: blockage, fluent, plate-type fuel, triga-2000, reactor safety, low coolant velocity.

1. introduction

triga-2000 is an indonesian pool-type nuclear research reactor that has been operating for a long time. this reactor uses rod-type fuel produced by general atomic usa, and nowadays this rod-type fuel is no longer being produced. to maintain the reactor operation, a modification of the triga-2000 reactor core using plate-type fuel will be carried out [1–3]. u3si2-al is used as the fuel material, which is produced domestically, except for the uranium. this fuel has also been used in the indonesian rsg-gas reactor. for this reason, it is presumed that there will not be many changes in the core parameters as compared to the rod-type fuel. based on the neutronic calculation, the conversion from rod-type to plate-type shows good results from the safety aspect of the reactor operation. furthermore, several calculations related to the reactor core thermal-hydraulic design have been carried out previously [4–6]. calculations related to reactor safety analysis, simulations of loss of flow accident (lofa) and reactivity insertion accident (ria), have also been carried out [7, 8].

the plate-type fuel element is an arrangement of several fuel plates, which then become a bundle of fuel elements. the fuel plates are arranged in such a way that the gap between each plate can be used as a coolant channel. this plate-type fuel has several advantages over the rod-type fuel, namely its compact structure and high power density [9]. however, the coolant channel between these plates may experience a blockage (become clogged) because the coolant channel is quite narrow. a channel blockage causes the heat transfer process to be disrupted, so that it has an impact on the fuel integrity and reactor safety. channel blockage conditions can occur due to bending or swelling of the fuel plates, caused by other material falling into the reactor pool, or by debris carried by the coolant flow [10, 11]. thus, an analysis of flow channel blockage becomes important for the conversion from rod-type fuel elements to a plate-type fuel assembly, although the assumed blockage scenario occurs seldomly during the reactor operation. simulated blockage scenarios on plate-type fuel have been carried out in previous work with the iaea generic 10 mw pool-type benchmark of a material test reactor (mtr) [12]. the iaea research reactor is a pool-type mtr, 10 mw. the core is cooled by light water in a forced circulation mode with an average coolant inlet velocity of 3.55 m/s, an operating pressure of 1.7 bar, an inlet coolant temperature of 311 k, and a downward core coolant flow direction [13].
simulated flow blockage observations using the reactor data have been carried out for one sub-channel with an 80 % blockage ratio and several blockage positions, including the inlet, middle, and outlet, in which no boiling occurred in any of the blockage cases [9]. the main objective of the following study is to perform a safety analysis of a coolant channel partially blocked by debris under steady-state operation, which can cause local heat peaks and, ultimately, a loss of fuel integrity. the channel blockage simulation is carried out using the fluent code. this analysis is necessary because the thermal-hydraulic design for the fuel-plate type of the triga-2000 reactor has a low coolant velocity of 0.69 m/s for cooling the reactor core. the simulation is carried out with the blockage area assumed to be 60 %, 70 %, and 80 % at the inlet of the coolant sub-channel. channel blockage analysis using computational fluid dynamics (cfd) simulations has been widely used to investigate fluid dynamics and heat transfer [14, 15]. meanwhile, one of the main issues in the context of the safety assessment of a research reactor with plate-type fuel is flow blockage. in a reactor cooling system analysis, the flow phenomena in coolant channels are mostly modelled in two-dimensional space (2d), including the possibility of local eddies, which are impossible to observe using one-dimensional simulations. so, cfd modelling is useful to predict the coolant flow temperature and velocity profiles inside a fuel assembly in a blockage state. it can be used to determine the steady-state behaviour of the most critical coolant channels. meanwhile, there are no published experimental data on coolant channel blockage accidents considering heat transfer from the fuel plate to the coolant, which is the subject of this simulation.

2. description of reactor core

triga-2000 is a pool-type research reactor, with a conversion core design consisting of 16 standard fuel elements and 4 control fuel elements, as shown in figure 1. the reactor core is cooled by light water, in forced circulation mode with downward flow. there are 21 fuel plates in each standard fuel assembly. the coolant sub-channel dimensions of a standard fuel assembly are 67.10 mm in length, 2.557 mm in width, and 625 mm in height. other parameters related to the fuel elements are given in table 1. meanwhile, table 2 summarizes the basic thermal-hydraulic design information of the reactor, in which the reactor power is 1 mw thermal. the power distribution of the fuel elements in the axial direction was obtained from the batan-fuel and batan-3diff neutronic codes, as previously published [3]. a visual representation of the fuel element and its fuel plate arrangement is depicted in figure 2. horizontal and cross-section views of the fuel elements are shown in figure 3. in the fuel element, each plate contains u3si2-al with 19.7 % enriched uranium. currently, these fuel elements are used in the rsg-gas reactor at serpong, indonesia.

figure 1. core configuration [2].

fuel element parameter – design value:
- number of plates in a standard fuel element: 21
- fuel plate active length, mm: 600
- type of fuel element: u3si2-al
- width of cooling channel, mm: 67.10
- sub-channel gap, mm: 2.557
- thickness of cladding (average), mm: 0.38
- cladding material: almg2
- fuel plate thickness, mm: 1.30
- width of fuel plates, mm: 70.75
- length of fuel plates, mm: 625.00

table 1. fuel element specification.
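the sub-channel numbers in table 1 can be cross-checked against the nominal flow conditions quoted in the abstract; the sketch below is our illustration (assuming a water density of roughly 994 kg/m³ at the 308 k inlet temperature): it computes the rectangular flow area, the hydraulic diameter, and the per-channel mass flow implied by the 0.69 m/s average velocity.

```python
# sub-channel geometry from table 1 (values in mm converted to m)
width = 67.10e-3        # width of the cooling channel
gap = 2.557e-3          # sub-channel gap between plates
velocity = 0.69         # average coolant velocity in the core, m/s
rho = 994.0             # assumed water density at ~308 k, kg/m^3

area = width * gap                       # rectangular flow area, m^2
perimeter = 2 * (width + gap)            # wetted perimeter, m
d_hyd = 4 * area / perimeter             # hydraulic diameter, m
m_dot = rho * area * velocity            # nominal per-channel mass flow, kg/s

print(f"flow area          = {area:.3e} m^2")
print(f"hydraulic diameter = {d_hyd * 1e3:.2f} mm")
print(f"mass flow/channel  = {m_dot:.3f} kg/s")   # ~0.118 kg/s
```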
operating parameter – value:
- fluid material: light water
- coolant mass flow rate to the core, kg/s: 50.0
- mass flow rate per fuel element, kg/s: 2.10
- average coolant velocity in the core, m/s: 0.69
- inlet coolant temperature to the core, k: 308.0
- inlet pressure to the core, bar: 1.583

table 2. input data of coolant operating parameters.

figure 2. visual representation of the fuel element.

3. methodology

simplification of the flow blockage case is done by assuming that the flow distribution can affect the surrounding channels only, by simulating a few sub-channels. this is because the parameters of the blocked channel and the adjacent channel will interact and influence each other, especially because the geometry of each fuel plate and coolant channel is similar [9, 10]. debris can take many forms, either due to material damage or other elements that can be carried away by the coolant flow. using the cfd method of the fluent code, any blockage shape can be modelled. in this study, there was no buckling of the fuel plate, and the blockage was only caused by the debris. another assumption is that the blockage only occurs at the inlet of the channel. during the blockage scenario, heat generation on the fuel plate takes place along its axial axis with an effective length of 600 mm and a total channel length of 625 mm. the heat received by the coolant flow follows equation (1) below:

tcool(n + 1) = tcool(n) + q(n + 1) / (mcool · cpcool), (1)

with tcool(n) being the coolant temperature on the nth segment, q(n + 1) the heat generated by the fuel element on the (n + 1)th segment, and mcool and cpcool the coolant mass flow rate and specific heat, respectively. in this simulation, the solution of the turbulent flow problem rests on the assumptions and simplifications that the flow is steady-state and adiabatic, and the standard k-epsilon (k-ε) turbulence model is set as a boundary condition.

figure 3. vertical and horizontal cross-sections of the fuel assembly [16, 17].

4. blockage scenarios

in this study, the fuel plate is modelled in the form of a rectangular sub-channel. due to the presence of a blockage, the coolant flow in the blocked sub-channel will be decreased because of the obstruction. figure 4 shows a cross-section image of the simulated coolant sub-channel. it is assumed that one of the coolant sub-channels will be blocked by debris. the blockage was simulated as a plate of very small thickness instead of the actual debris. this blockage causes a reduction in the cross-sectional area of the coolant flow at the inlet of the coolant sub-channel. sub-channel no. 2 is the sub-channel that will be partially blocked at the top side, or inlet stream. in this case, scenarios for the size of the blocked channel area of 60 %, 70 %, and 80 % are determined. these values are based on the studies of guo et al., salama et al. and fan et al. [10, 11, 18]. the simulation was performed at a low coolant flow velocity of 0.69 m/s, as shown in table 2. the coolant velocity is an important variable that affects the fuel plate surface temperature. in addition, calculations for the sub-channel in normal conditions, i.e. a 0 % blocked channel, were also carried out. figure 5 shows the schematic steps of the activities, which include:

(a) data preparation: the design data of the triga-2000 conversion using plate-type fuel are used,
(b) steady-state calculation without blockage: the coolod-n2 code is used to validate the steady-state calculation,
(c) the partial blockage simulation: 60 %, 70 %, and 80 % of the flow area.

figure 4. cross-section view of the sub-channel blockage simulation.

figure 5. schematic diagram of the simulation.

in this analysis, a cfd method is used; this program helps to solve the mathematical equations that formulate the process of fluid dynamics, describing the phenomenon of fluid flow that occurs by means of a modelled geometry that matches the actual state, in both shape and dimensions, of the simulated system. the results obtained in this simulation are in the form of data, images, and curves that show predictions of the system reliability.

5. results and discussions

the gambit program is used to create the geometry, grid, and mesh based on the modelled parts, namely the inlet, outlet, and wall as part of the fuel plate. figure 6 shows a visualization of the mesh from gambit that will be executed with the fluent code. one fuel plate with 2 sub-channels is divided into 21 faces in the axial direction, which is the hot surface with the axial heat generation flux distribution available from the previous study.

figure 6. meshing of the channel blockage model.

there are no experimental data for the triga conversion core with plate-type fuel that could be used to validate the results of the steady-state calculation. therefore, the one-dimensional coolod-n2 code is used to calculate the steady state at unblocked conditions. coolod-n2 is a computer code for the analyses of steady-state thermal-hydraulics of both rod-type and plate-type fuels [19, 20]. in this study, it is assumed that the blocked sub-channel was the hottest channel in the core. as part of a conservative approach, if the hottest channel remains safe, then the other channels will also likely be safe. figure 7 shows the calculated results for the fuel plate surface temperature and the coolant temperature by the coolod-n2 and fluent codes under normal conditions before the blockage occurs; there are no significant differences, which gives confidence in the cfd simulation. in this steady-state calculation, a cosine-shaped power distribution is utilized in the axial direction. the maximum fuel temperature is around 337 k for these two codes, while the maximum coolant temperature at the outlet channel is around 320 k. the maximum bulk temperature (tcool) occurs at the outlet channel, as written in equation (1).

figure 7. one-dimensional fuel plate temperature and coolant temperature for the steady state without blockage.

furthermore, the fluent model used above can be used for the blocked case to find the temperature profile. figure 8 shows the calculated results for the profile of the coolant velocity for blockage areas of 0 % and 60 %. as shown in figure 8(a), in the case of no blocked channel area, the coolant velocity is 0.69 m/s and the mass flow rate is 0.118 kg/s, which is the uniform velocity distribution at normal conditions. then, figure 8(b) shows the profile of the flow velocity in the case of a 60 % blockage. when the blockage occurs, the partial blockage in this channel has a significant impact on the flow velocity. because the inlet area of the coolant sub-channel is reduced to 40 %, the coolant velocity in this inlet sub-channel increases. it can be seen that there is a jet flow with a velocity of up to 1.60 m/s and a mass flow rate of 0.109 kg/s. it has an impact on the coolant velocity along the vertical (axial) direction, and there is a localized eddy beneath the blockage plate. furthermore, the coolant flows vertically, downstream to the outlet channel.

figure 8. profile of flow velocity for blockage areas of 0 % and 60 %.
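the segment-wise energy balance of equation (1) can be reproduced with a short script; the sketch below is our illustration (not the coolod-n2 or fluent input): it marches the coolant temperature over the 21 axial faces with an assumed cosine-shaped power profile, scaled so that the channel picks up the roughly 12 k rise (308 k to about 320 k) seen in the steady-state validation of figure 7.

```python
import math

n_seg = 21            # axial faces of one fuel plate (see the meshing above)
t_in = 308.0          # inlet coolant temperature, k (table 2)
m_dot = 0.118         # per-channel mass flow, kg/s (unblocked case)
cp = 4180.0           # assumed specific heat of water, j/(kg k)

# assumed cosine-shaped axial power profile, normalized so the total channel
# power gives a ~12 k coolant rise: q_total = m_dot * cp * 12
q_total = m_dot * cp * 12.0
shape = [math.cos(math.pi * ((i + 0.5) / n_seg - 0.5)) for i in range(n_seg)]
q_seg = [q_total * s / sum(shape) for s in shape]

t = t_in
for n, q in enumerate(q_seg, start=1):
    t = t + q / (m_dot * cp)          # equation (1), segment by segment
    print(f"segment {n:2d}: tcool = {t:6.2f} k")
# the outlet value lands near the ~320 k reported for the unblocked channel
```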
figure 9(a) and figure 9(b) show the simulation results for the coolant velocity profile for the blockage areas of 70 % and 80 %. as shown in figure 8(b), the coolant velocity profile in the coolant channel follows a similar trend. however, the average flow velocity through the narrow channel for the 70 % and 80 % cases is 1.92 m/s and 2.34 m/s, respectively. this flow velocity gradually decreases along the channel's edge. furthermore, the flow in the middle region appears to be non-homogeneous, and the flow velocity profile is quite complex. the coolant velocity on a fuel plate surface is higher than the velocity on the surface in front of it. the coolant velocity is a variable that significantly influences the fuel plate surface temperature.

figure 9. profile of flow velocity for blockage areas of 70 % and 80 %.

figure 10(a) and figure 10(b) depict the static coolant temperature profile of the unblocked case (0 %) and the 60 % blockage. a blockage here means a reduction of the flow area in the obstructed channel, which results in a reduction of the flow rate. a non-homogeneous flow prevents the heat transfer process from occurring properly; the coolant temperature becomes higher than in an unblocked channel. this hot spot temperature also affects the temperature on the opposite plate surface. this effect occurs by the conduction of heat through the thin plate. as depicted in figure 10(b), when a channel is obstructed, the coolant temperature profile changes significantly. this simulation indicates that the maximum coolant temperature for the 60 % blockage is 377 k. this temperature is located at 0.52 m from the inlet flow direction. however, it is still below the limit of the coolant saturation temperature, 386 k.

figure 10. profile of static temperature for blockage areas of 0 % and 60 %.

figure 11(a) and figure 11(b) show that the coolant temperature profile for the 70 % blockage case was similar to the 80 % case. the maximum temperature for the 70 % blockage reached 380 k, which is slightly lower than the saturation temperature of 386 k. in the case of the 80 % blockage, as shown in figure 11(b), the maximum temperature locally reaches 413 k. this means that nucleate boiling may occur in the sub-channel coolant while the bulk fluid flow is still sub-cooled, which could cause damage to the fuel plates. furthermore, damage to the plate (almg2 cladding material) causes a release of fission products from the fuel element into the coolant through the damaged cladding.

figure 11. profile of static temperature for blockage areas of 70 % and 80 %.

to ensure reactor safety, nucleate boiling should be avoided in the core, and the coolant should always remain in the sub-cooled state; otherwise, it can cause damage to the fuel plate. from the point of view of the safety aspect, the blockage simulation result for the triga-2000 thermal-hydraulic core design using plate-type fuel shows nucleate boiling occurring. this differs from the channel blockage results for the iaea generic mtr-10mw reactor, which has a high coolant flow velocity and for which no boiling occurs in any of the blockage cases, as can be found in the articles published by q. lu et al. [12] and s. xia et al. [9].
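a simple continuity check makes the reported jet velocity plausible; the sketch below is ours (reusing the assumed 994 kg/m³ density and the table 1 geometry): it converts the 0.109 kg/s flow that remains in the 60 % blocked channel into the average velocity through the remaining 40 % of the inlet area.

```python
width, gap = 67.10e-3, 2.557e-3    # sub-channel cross-section, m (table 1)
rho = 994.0                        # assumed water density at ~308 k, kg/m^3
area = width * gap                 # full flow area, m^2

blockage = 0.60                    # blocked fraction of the inlet area
m_dot_blocked = 0.109              # reported flow in the blocked channel, kg/s

v_jet = m_dot_blocked / (rho * area * (1.0 - blockage))
print(f"estimated inlet jet velocity = {v_jet:.2f} m/s")   # ~1.6 m/s
```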
6. conclusions
a channel blockage simulation has been carried out using the thermal-hydraulic design for the plate-type fuel of the triga-2000 reactor, which has a low coolant velocity. under the condition of a 70 % channel blockage, the coolant temperature is still below the nucleate boiling temperature. in the case of the 80 % blockage, nucleate boiling may occur, since the maximum local temperature reaches 413 k while the saturation temperature is 386 k. this leaves an insufficient safety margin and could cause damage to the fuel plate.

acknowledgements
we would like to acknowledge the insinas program of research and technology brin 2021 for funding this research, the members of the "triga plate type fuel design team", and all staff of the center for applied nuclear science and technology – batan for supporting these activities.

references
[1] r. nazar, j. s. pane, k. kamajaya. heat transfer analysis of plate type fuel element of reactor core. aip conference proceedings 1984:020031, 2018. https://doi.org/10.1063/1.5046615.
[2] p. basuki, p. i. yazid, z. suud. neutronic design of plate type fuel conversion for bandung triga-2000 reactor. indonesian journal of nuclear science and technology 15(2):69–80, 2014.
[3] s. pinem, t. m. sembiring, t. surbakti. core conversion design study of triga mark 2000 bandung using mtr plate type fuel element. international journal of nuclear energy science and technology 12(3):222–238, 2018. https://doi.org/10.1504/ijnest.2018.095689.
[4] k. a. sudjatmi, e. p. hastuti, s. widodo, n. reinaldy. analysis of natural convection in triga reactor core plate types fueled using coolod-n2. journal of nuclear reactor technology tri dasa mega 17(2):67–68, 2015. https://doi.org/10.17146/tdm.2015.17.2.2317.
[5] a. i. ramadhan, a. suwono, e. umar, n. p. tandian. preliminary study for design core of nuclear research reactor of triga bandung using fuel element plate mtr. engineering journal 21(3):173–181, 2017. https://doi.org/10.4186/ej.2017.21.3.173.
[6] v. i. s. wardhani, j. s. pane, s. dibyo. analysis of coolant flow distribution to the reactor core of modified triga bandung with plate-type fuel. journal of physics: conference series 1436:012098, 2020. https://doi.org/10.1088/1742-6596/1436/1/012098.
[7] s. dibyo, k. s. sudjatmi, s. sihana, i. d. irianto. simulation of modified triga-2000 with plate-type fuel under lofa using eureka2/rr-code. atom indonesia 44(1):31–6, 2018. https://doi.org/10.17146/aij.2018.541.
[8] s. pinem, t. surbakti, i. kuntoro. analysis of uncontrolled reactivity insertion transient of triga mark 2000 bandung using mtr plate type fuel element. ganendra majalah iptek nuklir 23(2), 2020. https://doi.org/10.17146/gnd.2020.23.2.5876.
[9] s. xia, x. zhou, g. hu, et al. cfd analysis of the flow blockage in a rectangular fuel assembly of the iaea 10 mw mtr research reactor. nuclear engineering and technology 53(9):2847–2858, 2021. https://doi.org/10.1016/j.net.2021.03.028.
[10] y. guo, g. wang, d. qian, et al. accident safety analysis of flow blockage in an assembly in the jrr-3m research reactor using system code relap5 and cfd code fluent. annals of nuclear energy 122:125–136, 2018. https://doi.org/10.1016/j.anucene.2018.08.031.
[11] a. salama, s. e.-d. el-morshedy. cfd simulation of flow blockage through a coolant channel of a typical material testing reactor core. annals of nuclear energy 41:26–39, 2012. https://doi.org/10.1016/j.anucene.2011.09.005.
[12] q. lu, s. qiu, g. su. development of a thermal–hydraulic analysis code for research reactors with plate fuels. annals of nuclear energy 36(4):433–447, 2009. https://doi.org/10.1016/j.anucene.2008.11.038.
[13] o. s. al-yahia, m. a. albati, j. park, et al. transient thermal hydraulic analysis of the iaea 10 mw mtr reactor during loss of flow accident to investigate the flow inversion. annals of nuclear energy 62:144–152, 2013. https://doi.org/10.1016/j.anucene.2013.06.010.
[14] m. adorni, a. bousbia-salah, t. hamidouche, et al. analysis of partial and total flow blockage of a single fuel assembly of an mtr research reactor core. annals of nuclear energy 32(15):1679–1692, 2005. https://doi.org/10.1016/j.anucene.2005.06.001.
[15] d. gong, s. huang, g. wang, et al. heat transfer calculation on plate-type fuel assembly of high flux research reactor. science and technology of nuclear installations 2015, 2015. https://doi.org/10.1155/2015/198654.
[16] prsg. safety analysis report (sar) of rsg-gas. batan-indonesia, rev. 9, chapter 5, 2002.
[17] m. subekti, d. isnaini, e. p. hastuti. the analysis of coolant-velocity distribution in plate-type fuel element using cfd method for rsg-gas research reactor. journal of nuclear reactor technology tri dasa mega 15(2):67–76, 2013.
[18] w. fan, c. peng, y. chen, y. guo. a new cfd modeling method for flow blockage accident investigations. nuclear engineering and design 303:31–41, 2016. https://doi.org/10.1016/j.nucengdes.2016.04.006.
[19] k. tiyapun, s. wetchagarun. neutronics and thermal hydraulic analysis of triga mark ii reactor using mcnpx and coolod-n2 computer code. journal of physics: conference series 860:012035, 2016. https://doi.org/10.1088/1742-6596/860/1/012035.
[20] s. widodo, e. p. hastuti, k. a. sudjatmi, r. nazar. steady-state thermal-hydraulic analysis of the triga plate core design by using coolod-n2 and relap5 codes. aip conference proceedings 2180:020017, 2019. https://doi.org/10.1063/1.5135526.

physics based design, the future of modeling and simulation
t. s. ericsen
abstract
this paper discusses the expanding role of modeling and simulation in the design and development of electrical power systems. the concepts of physics-based design and building blocks are introduced to show how complex systems may be simplified. however, the detail and complexity of tomorrow's systems are beyond today's tools. computing power has increased to the point where physics-based design is possible. the aim of this paper is to discuss the issues and opportunities for modeling and simulation in advanced system design.
keywords: modeling, simulation, power electronics, power electronic building blocks, pebb, electric power systems, advanced marine electric power systems, naval electric power systems.

1 introduction
today, modeling and simulation are mostly used as analysis tools. however, systems with many power electronic components are emerging – driven by the need for power quality, availability, security, and efficiency.
the detail and complexity of these “system-of-systems” exceed the capability of today's rule-based design methods, as reported by ericsen [1]. tomorrow's systems will require a relational and rational design process using modeling and simulation. the model becomes the specification. paper documents cannot address the complexity of the next generation of power electronic systems. computer-aided design must become computer-based design. furthermore, two types of models are envisioned: requirement models and product models. requirement models can be behavioral, empirical, and relational. product models must be physics-based or functional to include the nature of the constituent materials and the methods of manufacture. physics-based models enable one to predict the physics of failure, quantify risk as a function of known and unknown physics, and quantify cost as a function of materials and manufacturing. key parts of physics-based design are validation and incremental prototyping. today's power electronic systems require a completed and commissioned system to validate the design. new power electronic concepts such as power electronic building blocks (pebb) enable designers to avoid re-commissioning elements that have been proven in previous designs. a new power electronic system design that uses the same pebb elements as a previous design need only validate the new application stresses and design elements. a physics-based design process can provide confidence levels and quantified risk to any degree of certainty. however, there is still a great cost to build and test the new elements and to characterize the system under new application stresses. if most of the elements in a system are new, the cost of validation can be great. real-time simulation, capable of running with real hardware as a hardware-in-the-loop (hil) simulation, can reduce the cost of validating new complex system designs. hil simulation enables parts of a converter (switches, phase-legs, and bridges) to be run with the rest of the converter emulated in a software simulation. therefore, the whole converter does not need to be built to validate the design, only the new elements. in the same way, new converters could be simulated with an entire system emulated in a hil simulation. a process of incremental prototyping can be developed that proceeds by calculating a minimum significant hil experiment for a given design problem. based on the results obtained, a new hil experiment would be configured for the next minimum significant hardware validation. the process continues until an acceptable level of confidence is obtained from the incremental prototyping steps performed. in the case of completely new systems, these steps can be used to build one-of-a-kind prototypes that can be validated based on physics, in a process where the building steps are quantified and confidence is established prior to each prototyping step.

2 complexity and detail
complex systems have the following attributes, as shown by [2]:
1. the more identical a model must be to the actual system to yield predictable results, the more complex the system is.
2. complex systems “…have emergence … the behavior of a system is different from the aggregate behavior of the parts and knowledge of the behavior of the parts will not allow us to predict the behavior of the whole system.”
3. “in systems that are ‘complex’, structure and control emanate or grow from the bottom up.”
4. a system may have an enormous number of parts, but if these parts “interact only in a known, designed, and structured fashion, the system is not complex, although it may be big.”
5. although a physical system may not be complex, if humans are a part of the system, it becomes complex.

as our physical understanding increases, the details and the burdens on design increase. as the power of systems increases, lower-order effects carry substantial amounts of energy and cannot be ignored. converters in the tens-of-megawatts range can produce hundreds of kilowatts of losses in the form of heat, electromagnetic interference (emi), and mechanical vibration. these machines can make great heaters, radio transmitters, and noise amplifiers. electric motors are also transducers and they act as speakers for noise. electromagnetic interference can interact with communications and other electronic machines and create system malfunctions and instabilities. today's designer must take into account these interactions, as well as predict many other effects such as system reliability, cost, environmental effects, health effects, and so on. these predictions span temporal ranges from microseconds, through days, to years. integration has become the workhorse of affordability and increased performance. with increased integration comes increased coupling and detail. for example, today's systems require designers to meet thermal, mechanical, electrical, and chemical requirements synergistically, rather than as independent design threads woven together at the end of the project. digitally-controlled systems are multifunctional, and the implications of functionality on the totality of application and environment must be understood before commissioning.

3 design cycle & hierarchy
in the classic design cycle, a requirement is given. a prototype must be synthesized. the prototype is then analyzed and the results compared to the requirement. if the requirement is satisfied, then the design cycle is complete and the product is produced. if the requirement is not satisfied, then the prototype is modified and analyzed. the results are again compared to the requirement. the cycle is repeated until the requirement is satisfied. analysis is well understood and the analysis problem is well posed: a solution exists, the solution is unique and stable, and the solution depends continuously on the data. synthesis is ill-posed: there are many potential solutions, so the solution is not unique, and the solution does not depend continuously on the requirement. often, the requirement is ill-defined. synthesis is creative and uniquely human.
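for readers who want the well-posedness notion pinned down, these are hadamard's classical conditions, stated here in our own notation for a mapping f from designs x to requirements y; this formalisation is an editorial addition, not from the original paper.

```latex
% hadamard well-posedness of the problem "given y, find x with f(x) = y":
% analysis satisfies all three conditions; synthesis typically violates
% uniqueness and stability, which is why it is called ill-posed
\begin{enumerate}
  \item existence: for every $y$ there is at least one $x$ such that $f(x) = y$;
  \item uniqueness: there is at most one such $x$;
  \item stability: the solution $x$ depends continuously on the data $y$.
\end{enumerate}
```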
a larger design cycle continues in time, where requirements drive new products and new products drive new requirements, see fig. 1. moreover, today's products must be able to meet complex cost, performance, and life tradeoffs. it is not enough to make something new and revolutionary. products must meet market cost goals and not last longer than required.

fig. 1: design cycle (customer and supplier designers exchanging requirements and products, driven by the mission: performance, life, and cost)

as noted above, modern integrated system design involves many sub-system designs and many engineering disciplines. this requires the talents of many designers from many different disciplines working collaboratively to design a system. extending this analogy a bit, let us say customers generate requirements and vendors create products in a design cycle. the vendor's prototype helps refine the customer's requirement. the customer's requirement drives the vendor's prototype. the customer's requirement is also a result of his own synthesis process to produce a prototype for a higher-level customer. one can envision a chain of customer/vendor design cycles extending from the basic materials, up through components, to systems. the product at the highest system level may be a ship, a utility, or a city. there are always two designers at every level – customer and vendor designers. finally, we need to take into account progressive integration processes that continually increase the coupling of thermal, mechanical, electrical, and chemical design threads. one can also imagine interlocking design cycles widening the design process to include these multi-discipline design teams. at least two types of models are needed for this process – models for the requirements and models for the products. the requirement model represents the top-down design point of view. the product model represents the bottom-up perspective. since a requirement model needs to convey only performance and not specific solutions, it can be general or behavioral in nature. the product model conveys the manufacturer's specific solution to the requirement given; therefore, it must be physics-based or functional. there will be many possible product models for a single requirement model. if a vendor supplies a behavioral model in response to a requirement, the vendor is not clearly showing how the material selection and the manufacturing processes employed yield a product which will satisfy the necessary and sufficient conditions of the requirement.

4 simplification
up to this point, factors contributing to increased complexity and detail have been discussed. however, complex systems can be simplified by applying advanced technology. using intelligent controllers and partitioning the system based on the physics of the materials, components, and methods of manufacture can produce building blocks which allow systems to be designed, built, and operated in a rational, predictable manner. for example, traditional electrical power systems are complex. power is produced synchronously; 60 hz (or 50 hz) sources supply power to 60 hz loads. in equilibrium, the system is well understood. however, when these systems are disturbed, the behavior becomes unpredictable and phenomena such as bifurcation can occur. in contrast, power electronic systems are not complex – theoretically. power is produced, distributed, and consumed asynchronously. power electronic machines such as converters, inverters, rectifiers, and motor controllers are active devices. when used intelligently in systems, they create known, controlled, and predictable states.
power electronic machines can have thousands of parts, and many of these machines would be needed to simplify a system. furthermore, power electronics add greatly to system size, weight, and cost. in this case, reducing complexity increases detail.

5 power electronic building blocks
power electronics building blocks (pebb), described in [3] and [4], exemplify the process of simplification. in 1994, the office of naval research initiated the pebb program to reduce the size, weight, and cost of power electronics so as to enable advanced electrical power systems, which enable electric warships with reduced manning, increased survivability, and increased power for electric propulsion, high-power sensors, and high-power electric weapons, as reported in [5] and by ericsen [6]. the pebb concept has transcended these initial objectives and is being refined by the ieee power engineering society, working group wgi8. pebb-based systems can be found in many utility, marine, and industrial applications – today. every power electronic building block has input/output filters, power switching, control, and thermal management, as shown in fig. 2. ideally, a pebb knows what it is plugged into and what is plugged into it, and makes the appropriate connections. pebb is a broad concept that incorporates progressive integration of power devices, gate drives, and other components into building blocks with defined functionality and interfaces – serving multiple applications; resulting in reduced cost, losses, weight, size, and engineering. the pebb designer addresses details such as device stresses, stray inductances, switching speed, losses, thermal management, and protection. the system engineer may apply pebb in many diverse system applications without having to understand these details. adoption of building blocks that can be used for multiple applications results in high-volume production and in reductions to engineering effort, design testing, onsite installation, and maintenance work. the value of integration can be enhanced with the standardization of interfaces. physics-based relationships are essential for the design of power electronics building blocks. high energy levels in power applications require that all the natural conservation laws be obeyed – conservation of energy, voltage, current, torque, force, etc. partitions based on physics are of primary importance. since the pebb is an active device, the temporal partitions and control interface definitions are as important as the pebb blocks. the universal controller architecture, another thrust of the ieee wgi8 working group, partitions the control for a system built using pebb. standard control interfaces enable system control to be implemented top down. furthermore, standard control interfaces enable pebbs to be electronically tuned to meet custom performance requirements and to adapt to changing system applications, environmental, and mission operating conditions. power electronics is akin to microelectronics, part of silicon science, sand-based technology.
trends in microelectronics applications, i.e., computers, servers, controls, etc., have resulted in their assembly from functional building blocks, with incredible reductions in cost and increases in performance. while very small power supplies have followed these trends, power electronics for higher power applications are just beginning this revolution. modular and hierarchical design principles are the cornerstones of the building block concept. the idea of an open plug-and-play architecture is to build power systems in much the same way as personal computers. ideally, pebbs would be plugged into power electronics systems and the operational settings would be made automatically. the system knows the pebb capabilities, its manufacturer, and its operational requirements. each pebb maintains its own safe operating limits.

fig. 2: power electronic building blocks (a pebb senses what it is plugged into and what is plugged into it, controls its i/o, thermal management, filters, and power switching, and makes the electrical conversion needed, inverter, breakers, frequency converter, motor controller, power supply, or actuator controller, via software programming; pebb is defined by the ieee power engineering society, wg i8, tf2, pebb technologies)

fig. 3: 9 mva pebb, cabinets housing pebbs, and systems using pebbs, courtesy of abb (the abb acs6000 base modules shown include active rectifier, line supply, excitation, capacitor bank, water cooling, control, and inverter units with pre-defined interfaces for power, cooling, and control connections)

the u.s. office of naval research (onr) has funded several manufacturers to develop pebbs for a broad range of applications. some of these designs are now commercially available, although they may not be called pebb products. one of these designs, the 9 mva pebb, is shown in the upper-left area of fig. 3 [7]. this 9 mva standard pebb design is used in several marine, industrial, and utility applications. a smaller, 1 to 5 mva pebb has been commercialized for transportation, storage, marine, and renewable energy source applications. in all cases, the pebb is configured into standard cabinets; cabinets are selected and arranged as required; software programming is implemented for the specific application; and electronic tuning completes the system. reductions in cost have been found in the range of 5×–10×, and reductions in size and weight in the range of 2×–5×.
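the plug-and-play behaviour described earlier in this section can be pictured as a self-describing handshake. the python sketch below is purely illustrative: the class, field names, and negotiation step are invented for this example and are not part of the ieee wgi8 pebb definitions.

```python
from dataclasses import dataclass

# illustrative sketch of a self-describing pebb handshake; all names and
# fields here are hypothetical, not taken from any pebb standard
@dataclass
class PebbDescriptor:
    manufacturer: str
    max_voltage_v: float   # safe operating limit, enforced by the pebb itself
    max_current_a: float
    functions: tuple       # e.g. ("inverter", "rectifier", "motor_controller")

def negotiate(system_request: dict, pebb: PebbDescriptor) -> bool:
    """check a system's request against the pebb's self-reported limits."""
    return (system_request["voltage_v"] <= pebb.max_voltage_v
            and system_request["current_a"] <= pebb.max_current_a
            and system_request["function"] in pebb.functions)

pebb = PebbDescriptor("abb", 6000.0, 1500.0, ("inverter", "rectifier"))
print(negotiate({"voltage_v": 4160.0, "current_a": 1200.0,
                 "function": "inverter"}, pebb))  # True: request within limits
```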
progressive integration leading to reduced engineering effort and manufacturing cost has been the key. pebb concepts allow systems to be built rationally: based on pre-engineered building blocks, modeled and simulated with reduced detail, and controlled from the top down; the behavior of the system can be predicted based on the behavior of the building blocks that comprise it. the engineering effort and cost needed to produce a pebb product is a tradeoff with the number of systems in which the pebb can be applied. therefore, the pebb must be as generally applicable as possible to create the greatest return on investment. paradoxically, systems designed with pebbs are potentially not complex, but the pebb itself is most likely complex. ideally, a pebb can be designed based on several more layers of building blocks cascading down to the materials themselves. however, these building blocks have not been defined. several projects have tried and failed to produce an accurate pebb model before building the pebb. so far, designing and building the hardware pebb takes much less time and cost than building a pebb software model. in the future, it is reasonable to believe that these issues will be resolved. additionally, the building block concept can be extended to other disciplines as well. actuator building blocks, mechanical building blocks, and structural building blocks have all been proposed and studies initiated.

6 model is the specification, model is control, model is machine
as can be seen from the previous discussions, the complexity and detail of modern systems are beyond paper specifications. even when systems are simplified, detail increases dramatically. the detail is multi-disciplined and highly coupled. performance specifications are not specific enough to convey the engineering criteria of modern systems. furthermore, performance specifications are a one-way street where the vendor is the only design agent. the model is the only vehicle capable of conveying the engineering details needed and flexible enough to be used in a true engineering design cycle. moreover, the model is the only vehicle that has the potential for multi-physics relationships supporting integrated multi-discipline design. thus, the model must become the specification, and simulation the design medium for future systems. the ability to program the control of a power electronic machine is a direct result of onr's pebb and universal controller architecture programs. many of today's controls are developed in software and implemented in microprocessors or programmable gate arrays. most recently, algorithms developed in commercial software packages can be programmed into the targeted processors and tested directly from the software package, eliminating machine-language programming and bench-top testing. in effect, the control developed in the desktop computer simulation software is the machine control. the model is the control. fig. 4 shows the universal controller architecture that describes the general partitioning for microprocessor controller platforms now used in many different industry applications. analog-to-digital and digital-to-analog conversion occurs very close to the switches. the rest of the system is digital. the manufacturer can increase the production and reduce the cost of the hardware manufacture while tailoring the functionality to meet custom application requirements.
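the temporal partitioning of fig. 4, with fast loops close to the switches and slower loops at the system level, can be sketched as two nested control tasks running at different rates. the rates, gains, and the first-order "plant" below are invented for illustration and do not come from the wgi8 architecture itself.

```python
# schematic of nested-rate control: a fast hardware-level loop and a slower
# system-level loop, as in the universal controller architecture's temporal
# partitioning; all numbers here are invented purely for illustration
inner_dt = 1e-4                      # s: 10 kHz inner (switch-level) loop
outer_every = 100                    # outer (system-level) loop runs at 100 Hz
setpoint, reference, output = 1.0, 0.0, 0.0

for step in range(2000):             # 0.2 s of simulated time
    if step % outer_every == 0:      # slow loop: adjust the inner reference
        reference += 0.5 * (setpoint - output)
    # fast loop: plant output tracks the reference with a 5 ms time constant
    output += (reference - output) * (inner_dt / 5e-3)

print(f"output after 0.2 s: {output:.3f}")   # approaches the setpoint
```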
it is not a big step to envision a future where different system functions can be implemented at different times by changing the equipment's software programming – the model becomes the machine.

fig. 4: universal controller architecture, power engineering society wgi8 task force 1, based on original work by and with permission from deepak divan (the figure shows the temporal partitioning of the control: smart gate drives, smart sensors, and a modulator under a hardware manager near the power stage, with inner-loop, load, and system-level controllers at progressively slower time scales, connected by serial buses)

if the model is to be the specification, then institutional support will be needed for activities such as software certification, standard models for calibration, standards for interfaces and protocols, and public libraries of physics-based models. public libraries of standard physics-based models are essential to reduce duplication of non-proprietary effort and to gain the most from industry experience. the specification will be part of a legal agreement between the customer and vendor. programming decisions, assumptions, and techniques can have a profound effect on numerical accuracy and stability. one cannot qualify and quantify risk and the reliability of a prediction if the analysis cannot be certified. the pebb approach seeks to satisfy the necessary and sufficient conditions for the customer to analyze performance, while the relationships within the model can trace their roots directly to the underlying physics – as detailed as required for the questions asked. more work will be needed to achieve this goal. yet, it is not clear whether the model is one large multifaceted structure where the appropriate face is presented for the question asked, or a series of connected models where the right model is inserted in the simulation based on the question asked. in any case, the simulation environment must be able to change the models as needed without experts or additional expertise on the part of the inquirer. if the model is the controller, then the simulation environment must be able to perform many forms of numerical analyses, such as signal flow, conservation of energy, stability, and stress. different solution methods can be connected by many means, such as co-simulation, computational wrappers, and as a compiled model within another simulation. all of today's methods have different effects on computation speed and accuracy. software designers need to rethink solvers and numerical processes to produce more robust simulation environments.

7 hardware in the loop
if the model is to be the machine, the interface between hardware and software must be capable of real-time simulation. simulations that interact with high-frequency switching machines, such as inverters and converters, must be real-time and high-speed. for example, a hil simulation could have the controller from a motor controller in a loop where the computer is supplying real-time signals to the controller to emulate the motor, the power section of the motor controller, and the power source.
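in outline, such a controller-in-the-loop exchange looks like the toy sketch below; a real hil rig would replace the controller function with i/o transactions to the physical device and run the emulation in hard real time. every model and number here is an invented stand-in.

```python
# minimal sketch of a hil signal exchange: the controller under test receives
# emulated sensor signals each step and returns a command; the motor and power
# stage are a crude first-order stand-in, purely illustrative
dt = 1e-4                       # s, step of the real-time emulator
speed, speed_ref = 0.0, 100.0   # rad/s

def controller_under_test(measured_speed):
    """stands in for the real hardware controller; in a true hil rig this
    call would be an i/o transaction with the physical device."""
    return 0.05 * (speed_ref - measured_speed)   # torque command

for _ in range(5000):           # 0.5 s of emulated operation
    torque = controller_under_test(speed)        # read sensors -> command
    speed += (torque / 0.01) * dt                # emulated motor, J = 0.01 kg m^2
    speed -= 0.02 * speed * dt                   # emulated friction losses

print(f"speed after 0.5 s: {speed:.1f} rad/s")   # approaches the 100 rad/s reference
```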
the performance of the controller can be analyzed over many different operating conditions – some of which may be too dangerous or costly to test otherwise. this is called controller-in-the-loop, cil. a digital semiconductor curve tracer is an example of a device-in-the-loop, dil. the curve tracer applies a preprogrammed stress to the device and the response of the device is recorded. many different stresses can be applied, and the device's characteristics can be mapped as a result. in this case, the dil has everything the cil has, plus programmable power supplies and possibly a heater, cooler, and other environmental stress controllers. the next generation of hil is the digital power laboratory (dpl). if pebbs can be configured and programmed to be motor controllers, propulsion drives, windmill controllers, and energy storage controllers, then one could use these pebbs in a laboratory to emulate these loads. likewise, power sources can be emulated with pebbs. furthermore, one does not need to duplicate every machine in a system to emulate the system. if a simulation of an entire system can be made, then an emulation of the system can be created at any point that replicates all of the artifacts of the energy and power at that point. thermal, mechanical, and chemical aspects of the system at the point of interest can be included as well. real power in real time is required at the interface between software simulation and hardware. scaling methods can be used if the underlying physics justify it. the primary limitations are the bandwidth and power levels of the programmable power supplies.

8 the future system design process
it is time for system design, particularly naval ship design, to evolve. the pharmaceutical, microelectronics, and bio-medical industries, to name a few, have progressed to computer-based analysis and experimentation. a new computer is needed to combine human and machine intelligence and to fill the gap in advanced system design. the scope of the design problem is daunting, bordering on unfathomable; but complex systems can be simplified by physics-based partitioning and the use of intelligent active devices. detail is increasing dramatically, but computational capability continues to grow dramatically as well. notionally, the system design environment of the future will enable:
1. human creativity – capable of synthesis as well as analysis, which will enable software experimentation.
2. customer and vendor design processes – models for requirements (top down) and products (bottom up).
3. collaborative multidiscipline design – where many designers can work synergistically and simultaneously.
4. a relational design process – multi-discipline numerical processes allowing concurrent experimentation.
5. building blocks – standard interfaces, model libraries, cellular and hierarchical design.
6. more robust and creative numerical solvers and environments.
7. hil experiments.
8. human and machine synergistic solvers – machines crunching millions of equations; humans watching for trends, cutting off unproductive threads, and creating leaps in solution space as inspiration and experience lead.

in the future, a new system project would begin with multi-discipline teams at their desktops logging into the design environment. customer requirement models are uploaded. libraries of physics-based models will be accessed to obtain the needed elements.
these models would be able to determine any new analysis conditions that go beyond the model's relevancy and ensure that the analysis limitations are established and known. the physics of the proposed system would first be divided into domains of known and unknown physics. if there are no new physics, no new application stresses, materials, or components, then the design can be analyzed to assure compliance with customer requirements. if there is new physics, then risk and cost are directly related to the unknown or new physics. one will be able to quantify and qualify the risk, time, and cost of the proposed system development – complete with confidence levels. computer experimentation begins with an analysis of the domain of unknown physics. polynomial neural nets, fuzzy logic, chaos, and other possibly new methods will enable researchers to understand the implications of the assumptions they make. it is very possible that these tools will be incorporated in the models themselves. these methods will enable researchers to find the details that humans tend to overlook, test the consistency of boundaries between the known and unknown, and understand the logical extensions of physical assumptions. at some point, the analytical experiments are complete and a system hypothesis is formed. hardware experiments are extracted from this hypothetical system by statistical analysis. everything needed for the experiments is generated – test devices or machines, the test conditions, the order of tests, the relevant observations, etc. a digital power laboratory (dpl) is configured directly from the results. after the dpl tests are performed, the problem is analyzed to determine the next steps – more unknowns or a completed system design. if needed, a new hypothetical system is calculated based on these results. a new experimental series is executed. these cycles are continued until the customer and vendor are satisfied. although the above discussion is notional, two main points are clear. first, physical knowledge must be exploited to the fullest extent possible. if the components, materials, and algorithms have been applied previously, then these results can be used in the new design. the new design needs only to focus efforts on new elements and application conditions – not on redeveloping what is already known. second, computer-based design and experimentation can achieve incremental prototyping. physics-based modeling and simulation enable the design to be edited and validated without building and rebuilding hardware. with the concept of the dpl, new components and machines can be tested as if they were in the real system. as each unknown is validated, the system becomes incrementally validated.
9 conclusion
in summary, the complexity and detail of modern systems are beyond today's modeling and simulation tools in six ways: 1) machine detail is beyond existing tools when design is taken in a conventional sense; 2) complexity is beyond the tools when new and growing customer expectations are considered; 3) complexity and detail exceed existing tools when the entirety of the "system of systems" hierarchy is considered; 4) today's modeling and simulation tools are primarily analysis tools – not designed for synthesis; 5) today's tools assume there is only one designer, when there are many designers; 6) tomorrow's electronically reconfigurable systems will need a new set of tools for design, control, and operation. furthermore, we are not producing enough engineers and scientists, worldwide, to address this complexity and increased detail. human intelligence cannot expand to meet these challenges. machine intelligence must be harnessed into a design environment with human intelligence if we are to meet these future challenges. finally, some of the key issues are: multidiscipline modeling and simulations; handling model order across vast temporal, spatial, and application ranges; manufacturing and material-based models; designing with uncertainty; the digital power laboratory for emulation and validation; and incremental prototyping.

10 acknowledgments
i would like to acknowledge the extremely helpful discussions i have had with roy crosby, narian hingorani, and albert tucker during the writing of this paper.

references
[1] ericsen, t.: "future navy application of wide bandgap power semiconductor devices." proceedings of the ieee, vol. 90 (2002), no. 6, p. 1077–1082.
[2] sage, a., olson, s.: "modeling and simulation in systems engineering: whither simulation based acquisition?" sim 2001 journal, vol. 76 (2001), no. 2, p. 90–92.
[3] panel session on "power electronics building block concepts" at the ieee pes general meeting in toronto, july 2003.
[4] document: power electronics building block concepts. under preparation by ieee power engineering society task force 2 of working group i8 of the power electronics equipment subcommittee of the substation committee. this publication will be available soon.
[5] ericsen, t., tucker, a.: "power electronics building blocks and potential power modulator application." proc. 23rd int. power modulator symposium, rancho mirage, ca, june 1998, p. 12–15.
[6] ericsen, t.: "power electronic building blocks – a systematic approach to power electronics." proc. ieee power engineering society summer meeting, vol. 2 (2000), p. 1216–1218.

terry sayre ericsen
e-mail: ericset@onr.navy.mil
onr334 ship hull, mechanical, and electrical systems science and technology division
office of naval research
arlington, virginia, usa

qualitative sign stability of linear time invariant descriptor systems
madhusmita chand a, mamoni paitandi a, mahendra kumar gupta a,b,∗
a national institute of technology jamshedpur, department of mathematics, 831014 jamshedpur, india
b indian institute of technology bhubaneswar, school of basic sciences, khordha, 752050 odisha, india
∗ corresponding author: mkgupta@iitbbs.ac.in
abstract.
this article discusses assessing the instability of a continuous linear homogeneous time-invariant descriptor system. some necessary conditions and sufficient conditions are derived to establish the stability of a matrix pair by the fundamentals of qualitative ecological principles. the proposed conditions are derived using only the qualitative (sign) information of the matrix pair elements. based on these conditions, the instability of a matrix pair can easily be determined, without any magnitude information of the matrix pair elements and without numerical eigenvalue calculations. with the proposed theory, magnitude dependent stable, magnitude dependent unstable, and qualitative sign stable matrix pairs can be distinguished. the consequences of the proposed conditions and some illustrative examples are discussed.
keywords: descriptor systems, stability of a matrix pair, qualitative sign instability, interactions and interconnections, characteristic polynomial.

1. introduction
the concept of stability of a matrix and a matrix pair is very fundamental to control theory, and it is an important property to be analysed for all practical control systems. a continuous homogeneous linear time-invariant descriptor system, i.e. differential algebraic equations (daes), can be written as:

eẋ(t) = ax(t) , (1)

where x(t) ∈ rⁿ is the state vector and e, a ∈ rⁿ×ⁿ are constant matrices [1]. when e = i (identity matrix), system (1) is well known as a state space system. system (1) is called regular if det(λe − a) is not identically zero as a polynomial of λ [2, 3]. a regular system (1) is said to be stable if and only if the matrix pair (e, a) is a stable matrix pair, i.e. all of its eigenvalues have negative real parts. in order to find the eigenvalues of the matrix pair (e, a), we have to determine the roots of the characteristic equation det(λe − a) = 0. it is remarkable that when the matrix e is singular, the number of eigenvalues of the matrix pair (e, a) is less than n. this numerical eigenvalue calculation for a matrix pair of a higher order is a computationally intensive effort. to overcome this drawback, economists have introduced the concept of 'qualitative stability', and ecologists have derived some necessary and sufficient conditions for the stability of a matrix using only the sign information of the matrix elements. nonetheless, in the literature, this problem is addressed only for state space systems, i.e. when e = i, where the eigenvalues of only the matrix a are checked. this paper extends these results to checking the eigenvalues of the matrix pair (e, a). in this paper, the word 'quantitative' is used for both magnitude and sign information, and the word 'qualitative' strictly for the sign information with no magnitude information of the matrix elements. matrix pairs which are stable independent of their magnitudes, with only sign information, are denoted as qualitative sign stable (qlss) matrix pairs, and qualitative sign unstable ones are denoted as 'qlsu' matrix pairs. matrix pairs whose stability/instability depends upon the magnitude information of the matrix pair elements are denoted as magnitude dependent stable/unstable (mds/u) matrix pairs. with the knowledge of the qualitative sign structure, we can now discuss the stability of a matrix pair. the analysis of stability of matrices has evoked various research directions. the way non-engineers, such as ecologists and economists, have tackled this problem without even having any magnitude information is fascinating.
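for contrast with the qualitative route developed below, the quantitative test just described can be performed for small systems with a generalized eigenvalue solver; the 3 × 3 pair in this sketch is an invented illustration, not an example from the paper.

```python
import numpy as np
from scipy.linalg import eigvals

# generalized eigenvalues of the pencil det(lambda*E - A) = 0; the singular
# third row of E makes this a true descriptor system with fewer than n
# finite eigenvalues (the remaining ones come out as inf)
E = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0]])
A = np.array([[-1.0, 0.5, 0.0],
              [0.0, -2.0, 1.0],
              [1.0, 0.0, -1.0]])

w = eigvals(A, E)                 # solves A v = lambda E v
finite = w[np.isfinite(w)]        # drop the infinite eigenvalues
print("finite eigenvalues:", finite)          # two, both with negative real part here
print("stable pair:", bool(np.all(finite.real < 0)))
```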
in [4], the stability problem of a matrix is studied in a purely qualitative environment, assuming that quantitative information is unavailable. article [5] provides some sufficient conditions for the qualitative stability of an ecosystem by simply concluding the mutual qualitative effects on member species via signed digraphs, whereas necessary conditions for qualitative stability are presented in [6]. in [7], linear systems are studied based on the qualitative theory. some conditions concerning the structural qualitative stability of a system are proposed in [8]. a graph-theoretic analysis based on sign patterns of a real square matrix is used to conclude the stability of a linear system in [9]. in [10], it has been shown that in a complex ecological system, when species interact as predator-prey, the system can still be stable. in [11, 12], ecological sign-stability principles of a matrix are transformed into mathematical principles to address stability problems in engineering control systems. the qualitative analysis of control systems is explained in [13]. the stability of the continuous-time linear state space system is explained in [14] and the stability of the discrete-time system is explained in [15]. the series of papers [16], [17], and [18] were attempts to find conditions for the stability/instability of real matrices using qualitative reasoning. in [16], a few conditions for the qualitative sign instability of a matrix are derived in terms of the nature of interactions and interconnections, taken from ecological principles. the stability analysis of a matrix using these conditions requires only the qualitative (sign) information of the matrix elements (no need for any quantitative information). in [17], an alternative sufficient condition is proposed by combining the concepts of both the quantitative (magnitude and sign) as well as the qualitative (only sign) information of the matrix elements. this condition possesses a convexity promotion property with respect to stability. a new necessary and sufficient condition is proposed in [18] for the stability of any real matrix that does not need the information of the characteristic polynomial and is based on the matrix entries' sign information only. article [19] studies asymptotic stability criteria for time-delayed systems. remarkable works have been done in [20, 21] on matrices with stable sign patterns. however, all the existing research is focused on the stability of a matrix, confining its utility to only linear normal state space systems. improving on these papers, this paper generalises some of these conditions for the qualitative stability/instability of a matrix pair, which has a relatively broader scope in the analysis of linear square descriptor systems in control engineering. to the best of our knowledge, this is the first work discussing qualitative sign stability for a matrix pair.
for a proper perspective, let us consider the following matrix pairs (ei, ai), i = 1, 2:

e1 =
[ −3    −2    0.1   0.4 ]
[ −1    0.9   −0.3  −8  ]
[ −0.7  5     3     1   ]
[ 0.1   2     1.4   5   ]

a1 =
[ 2     −1    −3    0.5 ]
[ 0.3   −1    0.4   −2  ]
[ −0.1  3     −0.5  4   ]
[ −2    −0.2  1     0.7 ]

e2 =
[ 1     −0.2  0.8   −3.5  −4.1 ]
[ 1.9   −2.7  −3.3  4     −0.1 ]
[ −1.2  −0.4  2     −0.5  −4.8 ]
[ −0.3  −1.7  −4.3  −3.7  0.2  ]
[ 0.6   −2.5  0.9   −5.2  −2.3 ]

a2 =
[ −1    0.2   −3    −0.7  5    ]
[ −0.9  1.3   2.4   −1.2  −3.1 ]
[ 4.8   −1.1  −2    −0.4  4.3  ]
[ 2.7   1     −1.8  1.7   −0.7 ]
[ −5    −3.2  0.3   −2.9  −1.5 ]

it is very difficult to decide the stability/instability of the above matrix pairs by numerical eigenvalue calculation. but with the necessary and sufficient conditions presented in this paper, we can conclude that the matrix pair (e1, a1) is a qualitative sign unstable (qlsu) matrix pair and the matrix pair (e2, a2) is an mds/u matrix pair. this is concluded just by a simple visualisation of the nature of the interactions and interconnections of the matrix pair. note that matrix pairs of order 2 do not have any interconnection terms and are thus trivial for our studies. hence, we focus on matrix pairs of order 3 and higher. in the next section, the matrix pair elemental sign structures are briefly reviewed and a few basic 'qualitative sign matrix pair indices' are developed for formulating the conditions for qualitative sign instability. in section 3, the necessary and sufficient conditions for qualitative instability are proposed, which is the main result of this paper. section 4 discusses the implications of these conditions and illustrates a few examples for a clear visualisation of the importance of qualitative stability. in section 5, we discuss the conclusions drawn from this paper.

2. qualitative sign matrix pair indices
for assessing the stability, we first have to visualise an n × n matrix pair in the following structured way. just for a simplified view, let us illustrate the structure using a 4 × 4 matrix pair:

e =
[ e11  e12  e13  e14 ]
[ e21  e22  e23  e24 ]
[ e31  e32  e33  e34 ]
[ e41  e42  e43  e44 ]

and a =
[ a11  a12  a13  a14 ]
[ a21  a22  a23  a24 ]
[ a31  a32  a33  a34 ]
[ a41  a42  a43  a44 ]

the entire matrix pair sign structure is completely specified by the diagonal elements and the off-diagonal link structures (interactions and interconnections). this matrix pair consists of:
• diagonal elements: eii and aii,
• interactions of the form eij eji and aij aji,
• interactions of the form eij aji,
• interconnections of the form bij bjk . . . bmi, where bij = eij or aij.

now, we look at the signs of the entries and use the following sign convention in the rest of this paper. we use the letter 'p' for the '+' (positive) sign, the letter 'n' for the '−' (negative) sign, and '0' for a zero entry. we label the interactions of the matrix pair using this sign convention. the possible off-diagonal links or interactions of a matrix pair are:
• mutualism link: (pp) link,
• competition link: (nn) link,
• predation-prey, prey-predation links: (pn) link and (np) link,
• ammensalism link: (n0) link and (0n) link,
• commensalism link: (p0) link and (0p) link,
• null link: (00) link.

we further categorise these links in the following way.
all the mutualism (pp) links and the competition (nn) links are collectively labeled as 'same sign (ss) links'. all the predation-prey (pn) links and the prey-predation (np) links are collectively labeled as 'opposite sign (os) links'. similarly, all the ammensalism (n0 and 0n) links and the commensalism (p0 and 0p) links are collectively labeled as 'zero sign (zs) links'. finally, the null (00) links are labeled as 'zero zero (zz) links'. structural zero links are labeled as szz links and elemental zero links as ezz links.
• same sign (ss) links: pp (++) links and nn (−−) links,
• opposite sign (os) links: pn (+−) links and np (−+) links,
• zero sign (zs) links: n0 links, 0n links, p0 links, and 0p links,
• zero zero (zz) links: 00 links.

based on this sign convention, the elemental sign structure of an n × n matrix pair is elaborated in the following ways:

2.1. diagonal elements eii and aii
the number of diagonal elements of different signs is denoted as in table 1.

np: number of positive diagonal elements
nz: number of zero diagonal elements
nng: number of negative diagonal elements
npz: number of non-negative diagonal elements
table 1. notations for the number of diagonal elements.

the total number of diagonal elements of the matrix pair is 2n. the number of non-negative diagonal elements can be written as npz = np + nz. let us assume that npz is not zero and define:

ηpz = npz/(2n) , ηng = nng/(2n) ,

∴ ηpz + ηng = 1 .

from an ecological perspective, the information about how a species affects itself is provided by the diagonal elements. a positive sign signifies that the species helps to increase its own population, zero signifies that the species has no effect on itself, and a negative sign signifies that it is self-regulatory. that is why elemental + (positive) and 0 (zero) signs are considered 'bad' signs and elemental − (negative) signs are considered 'good' signs in a row or column of a matrix pair [11].

2.2. interactions of the form eij eji and aij aji
all the products of off-diagonal elements of the form eij eji connecting only two distinct nodes (indices) of matrix e are known as interactions of matrix e. similarly, all the products of off-diagonal elements of the form aij aji connecting only two distinct nodes (indices) of matrix a are known as interactions of matrix a. table 2 includes information on the number of different links of matrices e and a.

ntl: total number of links of e and a
nss: total number of ss links of e and a
nzs: total number of zs links of e and a
nos: total number of os links of e and a
nszz: total number of szz links of e and a
nezz: total number of ezz links of e and a
nlc: total number of active links of e and a
ngood: number of 'good' links of e and a
nbad: number of 'bad' links of e and a
table 2. notations for the number of links of matrices e and a.

the total number of links (interactions) of this form is:

ntl = [1 + 2 + 3 + . . . + (n − 1)] × 2 = n(n − 1)

and it can be expressed as:

ntl = nss + nzs + nos + nszz + nezz . (2)

we now take the structural zero links out of any further discussion [16] and denote the total number of 'active links' as nlc. thus:

nlc = nss + nzs + nos + nezz . (3)

from an ecological viewpoint, it is noted that same sign (ss) links (i.e. pp and nn links) of this form are highly detrimental to stability, whereas opposite sign (os) links (i.e. pn and np links) of this form are conducive to stability [10, 22].
so, from the stability point of view, ss links, zs links, and zz links of this form are considered 'bad' links, and os links of this form are considered 'good' links. hence:

ngood = nos , (4)
nbad = nss + nzs + nezz . (5)

let us define:

ηbad = nbad/nlc , ηgood = ngood/nlc ,

∴ ηbad + ηgood = 1 .

remark 1. the above indices are already defined for matrices. now we are, for the first time, going to define another form of interaction, and a chain, for matrix pairs.

2.3. interactions of the form eij aji
all the products of off-diagonal elements of the form eij aji, connecting one node of matrix e and the corresponding node of matrix a, are known as interactions of the matrix pair (e, a). table 3 lists the number of different links of the matrix pair (e, a).

n′tl: total number of links of (e, a)
n′ss: total number of ss links of (e, a)
n′zs: total number of zs links of (e, a)
n′os: total number of os links of (e, a)
n′szz: total number of szz links of (e, a)
n′ezz: total number of ezz links of (e, a)
n′lc: total number of active links of (e, a)
n′good: number of 'good' links of (e, a)
n′bad: number of 'bad' links of (e, a)
table 3. notations for the number of links of the matrix pair (e, a).

the total number of links (interactions) of this form is:

n′tl = n(n − 1)

and it can be expressed as:

n′tl = n′ss + n′zs + n′os + n′szz + n′ezz . (6)

we now take the structural zero links out of any further discussion and denote the total number of 'active links' as n′lc. thus:

n′lc = n′ss + n′zs + n′os + n′ezz . (7)

in the ecological literature, it is realised that same sign (ss) links (i.e. pp and nn links) of this form are conducive to stability, whereas opposite sign (os) links (i.e. pn and np links) of this form are detrimental to stability. so, ss links and zs links of this form are considered 'good' links, and os links of this form are considered 'bad' links. hence:

n′good = n′ss + n′zs + n′ezz , (8)
n′bad = n′os . (9)

let us define:

η′bad = n′bad/n′lc , η′good = n′good/n′lc ,

∴ η′bad + η′good = 1 .

let us define ζbad as the potentially destabilising sign matrix pair index and ζgood as the potentially stabilising sign matrix pair index:

ζgood = ηng + ηgood + η′good , (10)
ζbad = ηpz + ηbad + η′bad . (11)

also define ζnet as the net matrix pair stabilisation index, given by

ζnet = ζgood − ζbad . (12)

let us also define the index known as a 'chain'. the elemental structure of the form eij eji aij aji is called a 'chain'. a chain containing at least three '+' signs is known as a '+ chain' and a chain containing at least three '−' signs is known as a '− chain'. using these 'qualitative sign matrix pair indices', we discuss the qualitative stability of a matrix pair and derive a few conditions for qualitative sign instability.

3. conditions for qualitative sign stability and instability
in this section, the main results are presented. here we focus on the case of matrix pairs whose diagonal elements contain only a mixture of positive and negative elements, i.e. with npz = np. also, we assume that nbad = nss and n′good = n′ss. a series of necessary and sufficient conditions for the stability/instability of a matrix pair are presented here.
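before stating the conditions, the bookkeeping of section 2 can be summarised computationally. the sketch below is an editorial illustration of definitions (2)–(12) for the simplified setting of this section (no zero entries, so zs, zz, and structural-zero links do not arise); it is not code from the paper.

```python
import numpy as np

# illustrative implementation of the indices of section 2, for pairs with
# no zero entries (so every off-diagonal link is either ss or os)
def zeta_net(E, A):
    E, A = np.sign(np.asarray(E, float)), np.sign(np.asarray(A, float))
    n = E.shape[0]
    # diagonal indices: negative signs are 'good', non-negative are 'bad'
    diag = np.concatenate([np.diag(E), np.diag(A)])
    eta_ng = np.sum(diag < 0) / (2 * n)
    eta_pz = 1.0 - eta_ng
    # links e_ij*e_ji and a_ij*a_ji: same sign (product > 0) is 'bad'
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    bad = sum(M[i, j] * M[j, i] > 0 for M in (E, A) for (i, j) in pairs)
    n_lc = 2 * len(pairs)                       # n(n-1) active links
    # links e_ij*a_ji: opposite sign (product < 0) is 'bad'
    bad_p = sum(E[i, j] * A[j, i] < 0
                for i in range(n) for j in range(n) if i != j)
    n_lc_p = n * (n - 1)
    zeta_good = eta_ng + (1 - bad / n_lc) + (1 - bad_p / n_lc_p)
    zeta_bad = eta_pz + bad / n_lc + bad_p / n_lc_p
    return zeta_good - zeta_bad
```

by our count, applying this routine to the pair (e1, a1) displayed in the introduction gives ζnet = −0.25 < 0, consistent with the necessary condition for qlsu stated below.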
but it is not possible to find the number and nature of all the interconnection terms of a matrix pair. so, for qualitative sign stability, all the interconnection terms of a matrix pair need to be zero. furthermore, any matrix pair (e, a) with det(a) = 0 always has at least one zero eigenvalue, which makes the matrix pair unstable. hence, nonsingularity of the matrix a is also necessary for a qlss matrix pair. let us consider a matrix pair (e, a) with entries eij and aij, respectively. based on the above discussion, we list the two important 'necessary' conditions for 'qualitative sign stability':
• c1: bij bjk . . . bqr bri = 0, where bij = eij or aij, for any sequence of three or more distinct indices i, j, k, . . . , q, r.
• c2: det(a) ≠ 0.
here, c1 is the condition necessarily required for qualitative sign stability, i.e. the condition which makes the matrix pair stable independently of magnitudes, and c2 is the condition necessarily required for the stability of the matrix pair (e, a), independent of any qualitative or quantitative information about the matrix pair elements. more details about the concept of qualitative sign stability are discussed in [11, 14]. the net matrix pair stabilisation index ζnet serves as an indicator of how likely the matrix pair is to be stable or unstable in a qualitative sense: negative values indicate that the matrix pair is more likely to be unstable, positive values that it is more likely to be stable, and the larger the value, the higher the probability, see [16]. the qualitative sign stability of a matrix pair depends on its stabilising strength: when the matrix pair is more potentially stabilised, ζnet is non-negative, and when it is more potentially destabilised, ζnet is negative. so qualitative stability is connected with the value of ζnet. with this observation, we have the following conditions:
• ζnet varies in the interval −3 ≤ ζnet ≤ 3.
• ζnet < 0 specifies that the matrix pair is mdu (magnitude-dependent unstable): for such matrix pairs there always exist magnitudes that make the pair unstable.
• ζnet ≥ 0 specifies that the matrix pair is mds (magnitude-dependent stable): for such matrix pairs there always exist magnitudes that make the pair stable.
• a given (non-qlss) matrix pair is qlsu only if −3 ≤ ζnet < 0 (necessary condition for qlsu).
a matrix pair which is not qualitative sign stable (qlss) is said to be a non-qlss matrix pair.

3.2. a necessary condition for instability of a matrix
if, in a matrix pair (e, a), we substitute e by the identity matrix i, then the above necessary condition for qlsu reduces to that of the matrix a alone. with the fundamentals of ecological principles, the matrix pair (i, a) is visualised in the following structured way; let us illustrate the structure using a 4 × 4 matrix pair:
i = [1 0 0 0; 0 1 0 0; 0 0 1 0; 0 0 0 1],  a = [a11 a12 a13 a14; a21 a22 a23 a24; a31 a32 a33 a34; a41 a42 a43 a44].
according to the qualitative stability concept for a matrix pair,
ηpz = (n + npz(a)) / (2n) = 1/2 + npz(a)/(2n),
ηng = nng(a) / (2n),
ηbad = (nlc(i) + nbad(a)) / nlc = 1/2 + nbad(a)/nlc (every off-diagonal link of i is an elemental zero link and hence 'bad'),
ηgood = ngood(a) / nlc,
η′bad = 0,  η′good = 1 (each eij aji link contains a zero element of i and is therefore a 'good' zs or zz link),
∴ ζnet = ζgood − ζbad = (1/2) [nng(a)/n + ngood(a)/nlc(a) − npz(a)/n − nbad(a)/nlc(a)] = (1/2) [ζgood(a) − ζbad(a)] = (1/2) ζnet(a).
since the matrix pair (i, a) is qlsu only if ζnet < 0, the matrix a is qlsu only if ζnet(a) < 0.
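both necessary conditions admit a direct mechanical test. as a hedged sketch (again ours, not the paper's code): c1 is equivalent to the absence of cycles through three or more distinct nodes in the combined digraph, where an arc i → j exists whenever eij or aij is nonzero, and c2 can only be tested on a numeric realisation of a, since nonsingularity in general depends on the chosen magnitudes.

import itertools
import numpy as np

def satisfies_c1(e, a):
    # c1: no product b_ij b_jk ... b_ri over three or more distinct indices
    # may be nonzero, i.e. the combined digraph has no cycle of length >= 3
    n = len(e)
    arc = [[i != j and (e[i][j] != '0' or a[i][j] != '0') for j in range(n)]
           for i in range(n)]
    for k in range(3, n + 1):
        for cyc in itertools.permutations(range(n), k):   # brute force, small n
            if all(arc[cyc[t]][cyc[(t + 1) % k]] for t in range(k)):
                return False
    return True

def satisfies_c2(a_numeric, tol=1e-12):
    # c2: det(a) != 0, checked on one numeric realisation of the sign pattern
    return abs(np.linalg.det(np.asarray(a_numeric, dtype=float))) > tol

the brute-force enumeration of candidate cycles is adequate for the small dimensions used in the examples below.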
the relation ζnet(a) < 0 is thus the necessary condition for a matrix to be qlsu, obtained here as a particular case of a matrix pair. the necessary and sufficient conditions for a matrix to be qlsu are discussed extensively in [16]. in this paper, we generalise the identity matrix i to an arbitrary matrix e and propose a few necessary and sufficient conditions for the qualitative sign instability of a matrix pair.

3.3. a necessary condition for a matrix pair to be qlsu
we know that ζnet is a real function taking discrete values. so there may be situations in which not all the diagonal elements have to be positive and not all the links have to be 'bad' links for the matrix pair to be qlsu. that means, for a matrix pair to be qlsu, the number of positive diagonal elements can be less than 2n, and the number of bad links in the matrix pair can be less than nlc, not necessarily equal to nlc. we now calculate the minimum number of bad links needed to make a matrix pair unstable and denote it n∗u. for a matrix pair with np ≠ 0, let us define n∗u as the smallest integer strictly greater than (ηng + 1/2) · nlc, i.e. the next higher integer, also when the product itself is an integer. let us state a theorem.

theorem. a matrix pair is qlsu only if the total number of bad links in it satisfies nbad + n′bad ≥ n∗u.

proof. for a qlsu matrix pair, ζnet < 0, i.e.
ηng + ηgood + η′good < ηpz + ηbad + η′bad.
substituting ηpz = 1 − ηng, ηgood = 1 − ηbad and η′good = 1 − η′bad gives
ηng + 1 − ηbad + 1 − η′bad < 1 − ηng + ηbad + η′bad,
hence ηng + 1/2 < ηbad + η′bad = (nbad + n′bad) / nlc
(recall that nlc = n′lc here, since the structural zero links have been excluded and ntl = n′tl = n(n − 1)),
and therefore (ηng + 1/2) · nlc < nbad + n′bad.
thus, the total number of bad links needed to make a matrix pair qlsu is greater than or equal to n∗u. this is a necessary condition for a matrix pair to be qlsu.

3.4. sufficiency guidelines for qlsu matrix pair
when inspecting the expression for the determinant of a matrix pair, we find that a diagonal element is always multiplied by the link elements surrounding it. that means the row and column elements associated with a diagonal element play a vital role in assessing stability/instability. these observations provide some 'guidelines' for sufficiency for qlsu [16].
• guideline 1: a matrix pair with ζnet < 0 is likely qlsu if a positive diagonal element eii or aii is surrounded by + chains. (this guideline results from the idea that positive diagonal elements along with + chains promote instability.)
• guideline 2: a matrix pair with ζnet < 0 is likely qlsu if a negative diagonal element eii or aii is surrounded by − chains. (this guideline results from the idea that negative diagonal elements along with − chains promote instability.)

4. illustrative examples for instability of a matrix pair
by now we have a few necessary conditions for qlsu and a few sufficiency guidelines for qlsu. once the necessary condition ζnet < 0 is satisfied, we can construct a qlsu matrix pair by an appropriate placement of the 'bad' and 'good' links in it. let us consider some examples.

example 1. consider a 4 × 4 matrix pair with diagonal elements as shown below:
e = [− ∗ ∗ ∗; ∗ + ∗ ∗; ∗ ∗ + ∗; ∗ ∗ ∗ +]  and  a = [+ ∗ ∗ ∗; ∗ − ∗ ∗; ∗ ∗ − ∗; ∗ ∗ ∗ +]
suppose there is no zz link in the matrix pair. for the given matrix pair (e, a),
ηng = 3/8, ηpz = 5/8, nlc = 12, n∗u = 11.
thus nbad + n′bad ≥ 11. any matrix pair with the given diagonal element structure satisfying this condition has ζnet < 0, and is therefore an mdu matrix pair.

example 2.
let us consider a matrix pair with the conditions given in example 1:
e = [− − + +; − + − −; − + + +; + + + +]  and  a = [+ − − +; + − + −; − + − +; − − + +]
here nbad = 7, ngood = 5, n′bad = 5, n′good = 7,
∴ nbad + n′bad = 7 + 5 = 12 > 11 = n∗u.
hence, the necessary condition is satisfied, and we now check the sufficiency guidelines required for qlsu. here, all the negative diagonal elements are surrounded by − chains. therefore, this is a qlsu matrix pair. it is noted that once there is at least one positive diagonal element, we can assess the qualitative sign instability by computing the relative distribution of the bad links together with the good links.

example 3. let us discuss the stability of the matrix pair given below:
e = [− − − + −; − + + + +; + + + + −; − − + + +; + + + − −]  and  a = [− + + − +; + + + + +; + − + + +; − + − + −; + + + − −]
for the given matrix pair (e, a),
ηpz = 6/10, ηng = 4/10, ηbad = 3/5, ηgood = 2/5, η′bad = 1/2, η′good = 1/2,
∴ ζnet = −2/5 < 0.
here all the positive diagonal elements are surrounded by + chains. hence, by the sufficiency guidelines for qlsu, the given matrix pair is qlsu. thus, any quantitative realisation of this matrix pair is unstable, without any need for eigenvalue calculations.

example 4. consider the sign pattern of the matrix pair given below:
e = [− + −; − + +; + − +]  and  a = [+ + −; − + +; + − −]
this matrix pair (e, a) has
ηpz = 4/6, ηng = 2/6, ηbad = 0, ηgood = 1, η′bad = 1, η′good = 0,
making ζnet = −1/3 < 0. since ζnet < 0, the necessary condition is satisfied, but the sufficiency guidelines are not. hence it is an mdu (not a qlsu) matrix pair. it should be noted that we are not stating that a matrix pair with this elemental sign structure is always unstable. we are simply stating that the elemental sign structure of the above matrix pair guarantees that there exist magnitudes which would definitely make this matrix pair unstable. for example, the following matrix pair (e1, a1) with the above sign structure is unstable,
e1 = [−1.6132 2.0118 −1.6806; −0.0021 0.5791 0.2139; 0.2017 −2.1852 1.7419],
a1 = [0.2462 1.8614 −0.7201; −1.8764 1.4531 1.9056; 3.4318 −0.1567 −1.7543],
while the following matrix pair (e2, a2) with the same sign structure is stable:
e2 = [−1.0132 0.0118 −1.0006; −1.0021 0.4001 0.2110; 0.0213 −0.0411 0.2015],
a2 = [1.8112 21.5624 −1.0207; −0.9016 1.0172 1.0061; 9.6512 −0.0600 −5.0234].

example 5. consider the matrix pair (e, a) with the sign structure given by:
e = [+ − + − −; + − − + −; − − + − −; − − − − +; + − + − −]  and  a = [− + − − +; − + + − −; + − − − +; + + − + −; − − + − −]
this matrix pair (e, a) has
ηpz = 2/5, ηng = 3/5, ηbad = 2/5, ηgood = 3/5, η′bad = 1/5, η′good = 4/5,
∴ ζnet = 1 > 0.
here ζnet > 0, but the necessary condition c1 for qualitative sign stability is not satisfied. hence it is an mds/u matrix pair and a non-qlss matrix pair.

5. conclusion
this paper addresses the issue of determining the stability/instability of a matrix pair arising in a continuous linear homogeneous time-invariant system. conditions for a matrix to be qlsu have already been discussed in earlier works. in this research, we generalise the identity matrix i to an arbitrary matrix e and propose a few necessary conditions and sufficient conditions for the qualitative sign instability of a matrix pair.
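as a numerical aside to example 4 above: the claim that (e1, a1) is unstable while (e2, a2) is stable can be checked via the finite generalised eigenvalues of the pencil, the roots of det(λe − a) = 0, all of which must lie in the open left half-plane for stability. the sketch below, using scipy, is a verification aid added here, not part of the proposed qualitative machinery.

import numpy as np
from scipy.linalg import eigvals

def is_stable_pair(e, a):
    # generalised eigenvalues lam solve a v = lam * e v, i.e. det(a - lam*e) = 0;
    # the descriptor pair is stable when every finite eigenvalue has re(lam) < 0
    lam = eigvals(a, e)
    finite = lam[np.isfinite(lam)]
    return bool(np.all(finite.real < 0.0))

e1 = np.array([[-1.6132,  2.0118, -1.6806],
               [-0.0021,  0.5791,  0.2139],
               [ 0.2017, -2.1852,  1.7419]])
a1 = np.array([[ 0.2462,  1.8614, -0.7201],
               [-1.8764,  1.4531,  1.9056],
               [ 3.4318, -0.1567, -1.7543]])
e2 = np.array([[-1.0132,  0.0118, -1.0006],
               [-1.0021,  0.4001,  0.2110],
               [ 0.0213, -0.0411,  0.2015]])
a2 = np.array([[ 1.8112, 21.5624, -1.0207],
               [-0.9016,  1.0172,  1.0061],
               [ 9.6512, -0.0600, -5.0234]])

print(is_stable_pair(e1, a1))   # expected: False, per the example
print(is_stable_pair(e2, a2))   # expected: True, per the example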
the proposed conditions are very simple and are based on the number and nature of the diagonal elements and the number and nature of the off-diagonal element pairs (links). this reflects that the elemental sign structure of a matrix pair is an important contributor to the stability/instability. these conditions are extremely helpful for engineers and ecologists in solving stability-related problems.

acknowledgements
the research work is supported by serb, dst under grant no. srg/2019/000451.

references
[1] v. k. mishra, n. k. tomar, m. k. gupta. regularization and index reduction for linear differential–algebraic systems. computational and applied mathematics 37(4):4587–4598, 2018. https://doi.org/10.1007/s40314-018-0589-3
[2] m. k. gupta, n. k. tomar, s. bhaumik. on detectability and observer design for rectangular linear descriptor system. international journal of dynamics and control 4(4):438–446, 2016. https://doi.org/10.1007/s40435-014-0146-x
[3] m. k. gupta, n. k. tomar, s. bhaumik. on observability of irregular descriptor systems. international conference on advances in control and optimization of dynamical systems 3(1):376–379, 2014. https://doi.org/10.3182/20140313-3-in-3024.00089
[4] j. quirk, r. ruppert. qualitative economics and the stability of equilibrium. the review of economic studies 32(4):311–326, 1965. https://doi.org/10.2307/2295838
[5] c. jeffries. qualitative stability and digraphs in model ecosystems. ecology 55(6):1415–1419, 1974. https://doi.org/10.2307/1935470
[6] r. m. may. stability and complexity in model ecosystems. princeton university press, new jersey, 1st edn., 2019. https://doi.org/10.1515/9780691206912
[7] k. lancaster. the theory of qualitative linear systems. econometrica: journal of the econometric society 33(2):395–408, 1965. https://doi.org/10.2307/1909797
[8] g. lunghini. qualitative analysis, determinacy and stability. quality and quantity 4(2):299–324, 1970. https://doi.org/10.1007/bf00199567
[9] c. jeffries, v. klee, p. van den driessche. qualitative stability of linear systems. linear algebra and its applications 87:1–48, 1987. https://doi.org/10.1016/0024-3795(87)90156-x
[10] s. allesina, m. pascual. network structure, predator–prey modules, and stability in large food webs. theoretical ecology 1:55–64, 2008. https://doi.org/10.1007/s12080-007-0007-8
[11] r. k. yedavalli, n. devarakonda. sign-stability concept of ecology for control design with aerospace applications. journal of guidance, control, and dynamics 33(2):333–346, 2010. https://doi.org/10.2514/1.46196
[12] n. devarakonda, r. k. yedavalli. engineering perspective of ecological sign stability and its application in control design. in proceedings of the 2010 american control conference, pp. 5062–5067. 2010. https://doi.org/10.1109/acc.2010.5530716
[13] b. buonomo, d. lacitignola, c. vargas-de-león. qualitative analysis and optimal control of an epidemic model with vaccination and treatment. mathematics and computers in simulation 100:88–102, 2014. https://doi.org/10.1016/j.matcom.2013.11.005
[14] e. kaszkurewicz, a. bhaya.
matrix diagonal and d-stability. in matrix diagonal stability in systems and computation, pp. 25–89. 2000. https://doi.org/10.1007/978-1-4612-1346-8_2
[15] e. kaszkurewicz, a. bhaya. qualitative stability of discrete-time systems. linear algebra and its applications 117:65–71, 1989. https://doi.org/10.1016/0024-3795(89)90547-8
[16] r. k. yedavalli, n. devarakonda. qualitative sign instability of linear state space systems via ecological principles. in proceedings of the indian control conference, pp. 85–90. 2015.
[17] r. k. yedavalli. a convexity promoting sufficient condition for testing the stability of a matrix via qualitative ecological principles. in proceedings of the indian control conference, pp. 500–505. 2015.
[18] r. k. yedavalli. a new, necessary and sufficient condition for hurwitz stability of a real matrix without characteristic polynomial, using qualitative reasoning. in 2018 annual american control conference (acc), pp. 2851–2856. 2018. https://doi.org/10.23919/acc.2018.8431691
[19] s. arunagirinathan, p. muthukumar. new asymptotic stability criteria for time-delayed dynamical systems with applications in control models. results in control and optimization 3:100014, 2021. https://doi.org/10.1016/j.rico.2021.100014
[20] d. grundy, d. olesky, p. van den driessche. constructions for potentially stable sign patterns. linear algebra and its applications 436(12):4473–4488, 2012. https://doi.org/10.1016/j.laa.2011.08.011
[21] a. berliner, d. d. olesky, p. van den driessche. relations between classes of potentially stable sign patterns. the electronic journal of linear algebra 36:561–569, 2020. https://doi.org/10.13001/ela.2020.4929
[22] s. allesina, s. tang. stability criteria for complex ecosystems. nature 483(7388):205–208, 2012. https://doi.org/10.1038/nature10832

1 introduction
the need for term analysis in product development documents arises from the problem of termini in this knowledge domain [14], [15]. the problem of term non-homogeneity also occurs in other knowledge domains, but in product development many knowledge domains come together and should work together. for this reason, product development uses termini from other domains with a new or changed meaning. because of the non-homogeneity of termini in documents concerning product development, learning and teaching problems ensue.
but the problem of terminology is not only an issue in education; it is also an obstacle to introducing product development knowledge into industry and other knowledge domains.

2 the pinngate approach
pinngate stands for product and process innovation gate, and is a current project of the department of product development and machine elements (pmd) at darmstadt university of technology. pinngate is a teaching, learning and application environment. the main aim is to support different users with high-quality information. the content is saved in a central database. the level of content is separated from the level of application by the so-called navigator, which intervenes between these two levels and the user via a front-end (see fig. 1) [10] [11] [12]. based on this general concept, pinngate contains a number of tools that provide a range of support for different users. one factor that this paper focuses on is quality, and one aspect of quality is the homogeneity of the termini.

term analysis – improving the quality of learning and application documents in engineering design
s. weiss, j. jänsch, h. birkhofer
conceptual homogeneity is one determinant of the quality of text documents. a concept remains the same if the words used (termini) change [1, 2]. in other words, termini can vary while the concept retains the same meaning. human beings are able to handle concepts and termini because of their semantic network, which is able to connect termini to the actual context and thus identify the adequate meaning of the termini. problems can arise when humans have to learn new content and, correspondingly, new concepts. since the content is basically imparted by text via particular termini, it is a challenge to establish the right concept from the text with the termini. a term might be known, but have a different meaning [3, 4]. therefore, it is very important to build up the correct understanding of concepts within a text. this is only possible when concepts are explained by the right termini, within an adequate context, and above all, homogeneously. so, when setting up or using text documents for teaching or application, it is essential to provide concept homogeneity. understandably, the quality of documents is, ceteris paribus, reciprocally proportional to the variation of termini. therefore, an analysis of variations of termini can form a basis for specific improvement of conceptual homogeneity. consequently, an exposition of variations of termini as control and improvement parameters is carried out in this investigation. this paper describes the functionality and the benefit of a tool called termanalysis. termanalysis is a software tool developed within the pinngate project [5] by the authors of the paper at the department of product development and machine elements (pmd) at darmstadt university of technology. this tool is able to analyze arbitrary electronically represented text documents concerning the variation of termini.
the similarity of termini is identified by using the levenshtein distance [6]. identified variations are clustered and presented to the user of the tool. the number of variations provides the basis for identifying potentials for improvement with regard to conceptual homogeneity. the use of termanalysis leads to the discovery of variations of termini and so generates awareness of this problem. homogenization improves the document quality and reduces the uncontrolled growth of concepts. this has a positive effect on the reader/learner and his/her comprehension of the content [7]. by analyzing documents by various authors, a surprisingly high number of variations per document has been revealed. the investigations have identified three main scenarios, which are fully described in this paper.
keywords: learning documents, product development knowledge, concepts.

fig. 1: the pinngate approach
thus, the understanding of a concept depends strongly on the experience made with the concept, the situation, problems and conditions. looking at different knowledge domains, one and the same terminus can belong to different concepts and have different meanings (see fig. 2). for this reason, it is necessary to teach concepts adequately in the relevant knowledge domain under realistic conditions, situations, problems, etc. further, it is absolutely necessary always to use the same terminus for one and the same concept. if one uses different termini to describe the same concept, the learner starts to look for differences in the properties and tries to set up a second category or concept. this leads to confusion and misunderstanding [16]. therefore, it is absolutely necessary to retain a high homogeneity of concepts within learning documents. the quality of learning documents is strongly diminished if the homogeneity of termini is not considered. homogeneity of termini is not the only prerequisite for good documents: termini and their corresponding concepts must also be properly introduced, with a sensible number of adequate examples and instructions.

fig. 2: terminus vs. concept

4 term analysis
the quality q of documents may be understood as a function of different parameters xi influencing the quality; one of these parameters is the homogeneity of termini b. this can be written as q = f(x1, x2, . . . , xi, . . .). now we hold every parameter other than xi = b constant, so that homogeneity is, ceteris paribus, the only determinant in the following argumentation.
pinngate deals with various documents, which can generally be understood as objects. each content item is represented in a modular way. a central modularization approach gives us a strategy for dealing with content divided into smaller modular constituents. however, content can be modularized or unmodularized; modularized content can always be transformed into unmodularized content through reconstruction according to the modularization approach. moreover, content can be newly created or it can already be present in the system's database. within the argumentation of homogeneous termini, it is necessary to check new content before it is saved in the database. it must also be possible to check already existing content for homogeneity of termini. thus, the task is defined: create a draft that fulfills the following requirements:
• identification of variations of termini
• structuring of identified variations
• applicable to both modularized and unmodularized content
• applicable to new content
• applicable to already existing content
• applicable to any electronic text
• compatible with pinngate
based on these requirements, an approach should be developed that analyzes documents and identifies variations of termini, so that new documents are homogeneous from the very beginning. existing documents can also be analyzed and improved using this approach. the basic strategy of termanalysis is summarized in fig. 3. the draft requires a textual document to be analyzed in five steps.

fig. 3: basic strategy of termanalysis

step 1: creating the potential term list. an algorithm analyzes the input file and identifies all the termini. rules have to be defined on how to process symbols and special characters, such as & or -. here it is important to isolate each term exactly once, so that there will be no redundancies later in the process. this matters because the whole process runs much faster on a smaller potential term list; one has to think ahead to the comparisons of the identified termini. the result is the so-called potential term list: a list in which each term used in the original file is represented exactly once.

step 2: applying the not list and the thesaurus. termanalysis uses two additional techniques to deal with the potential term list: the not list and a thesaurus. the not list contains a collection of words which are not technical termini, such as articles and prepositions, as well as termini that should not be processed further; it is a predefined filter mechanism to remove irrelevant termini from the potential term list. the not list is preconfigured and can be modified by the user. the application of the not list reduces the potential term list. the thesaurus maps termini and their synonyms, so different termini can be treated as one and the potential term list is reduced again. the thesaurus is predefined but can also be modified by the user.

step 3: the term list. the term list represents all the remaining termini. it is the basis for identifying the variations of termini. the smaller the term list, the faster the variations can be determined. the term list gives the user an overview of all the important words used in the original file. it is recommended to sort the term list alphabetically and evaluate it manually to get an impression of the words that are used. this allows one to draw a first conclusion about the quality of the original document.

step 4: the key term list. the key term list contains very important termini that should be at the center of the subsequent analysis. it is the basis for the algorithm applied in the next step. the main idea is to gain speed: the algorithm does not compare each term of the term list with each other term; rather, it compares each term of the term list with each key term of the key term list. the key term list is predefined but can be – indeed, must be – modified by the user. the definition of the key term list sets the focus on the really important termini that the user wants to analyze. moreover, the key term list can be created automatically, by an algorithm identifying the most frequent strings or substrings.

step 5: creating the key term structure. the creation of the key term structure is the final step on the way from the original file to the variations of termini. each term in the term list is compared with each key term of the key term list. this is done by calculating the weighted levenshtein distance. 'although there are many models for similarity among words, the most generally accepted in text retrieval is the levenshtein distance, or simply, edit distance. the edit distance between two strings is the minimum number of character insertions, deletions, and replacements needed to make them equal.' [6] the algorithm used in termanalysis uses a weighted levenshtein distance, i.e. different weights are considered for insertions, deletions and replacements. the result of the comparison is a tree based on the key terms and their variations obtained from the term list. these five steps result in different information concerning the homogeneity of termini.
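the weighted edit distance of step 5 is straightforward to state in code. since the actual weights used by termanalysis are not given here, the following python sketch uses illustrative weight parameters, and the simple grouping of term-list entries under their closest key term is our own reading of the 'key term structure', not the tool's code.

def weighted_levenshtein(s, t, w_ins=1.0, w_del=1.0, w_sub=1.0):
    # dynamic-programming edit distance with separate operation weights
    m, n = len(s), len(t)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = i * w_del
    for j in range(1, n + 1):
        d[0][j] = j * w_ins
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = 0.0 if s[i - 1] == t[j - 1] else w_sub
            d[i][j] = min(d[i - 1][j] + w_del,      # deletion
                          d[i][j - 1] + w_ins,      # insertion
                          d[i - 1][j - 1] + sub)    # substitution / match
    return d[m][n]

# e.g. weighted_levenshtein('terminus', 'termini') == 2.0 with unit weights

def key_term_structure(terms, key_terms, max_dist=2.0):
    # group each term under its closest key term, within a distance threshold
    tree = {k: [] for k in key_terms}
    for t in terms:
        best = min(key_terms, key=lambda k: weighted_levenshtein(t, k))
        if weighted_levenshtein(t, best) <= max_dist:
            tree[best].append(t)
    return tree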
the following section gives an overview of the results that can be achieved by applying term analysis to documents.

5 results
a first result is the impression gained by manually analyzing the alphabetically sorted term list. mostly, it can be determined that particular termini have been used very often in different phenotypes. moreover, it is possible to identify first key terms manually. this impression is a first sign of how consistent the choice of words really is. however, it is more impartial to derive a statistical overview of the results, so that it is transparent how often each term has really been used and which variations of it have been built. these results are a good platform for discussing the author's original document, and also a good basis for improving the document. especially in the case of learning documents, variations of termini should be minimized, because such variations may confuse the students. these statistics can be generated for the whole document or chapter by chapter. so one gains an overview of peaks of variations depending on the chapter that one looks at. this may indicate key terms, too, because each chapter deals with specific problems and the termini used depend on the problem. thus, a peak of variations identified within a specific chapter allows one to conclude that the key terms are critical: either there is no clear definition of the concept or the authors have used it sloppily. especially in the context of learning documents, it is very important to use concepts well. each key concept has to be used very carefully, because this has an impact on the students. the students have no chance to determine whether one term is synonymous with another or not, and therefore cannot distinguish different termini representing the same concept. moreover, it could happen that a student recognizes different individual concepts represented by different termini and memorizes them. this can be dealt with very easily within the pinngate project. within pinngate, content is saved and processed modularly; thus, the definition of each concept is given modularly, too [9]. each document is also represented modularly, so it is easy to determine two important positions: first, the position of the first occurrence of any concept and of its associated termini, and second, the position of the modular definition of the concept. it must be ensured that the modular definition of any concept occurs at an earlier position than its variations of termini. then there is a good chance that the students will not be confused, even if there are still variations of termini present. termanalysis supports the author in analyzing his/her work and minimizing variations of termini. it facilitates the writing and reworking of documents. it helps to identify inconsistencies of termini and their definitions. with these advantages, termanalysis contributes to the improvement of product development knowledge and supports the transfer of knowledge to students, industry and other domains.

6 example and consequences
an example of a tool to support the consistency of terms within a text, after termanalysis has identified different termini for one concept, is a concept map. fig.
4 shows a concept map (also called a mind map) that gives recommendations on how to integrate termini, especially technical termini, in a document. concept mapping makes it possible to emphasize relevant properties of concepts and to distinguish them from each other.

fig. 4: recommendations for introducing and using termini in documents

7 conclusions
to use termanalysis properly, the document has to be available in electronic form. it is sensible to have a well-structured file system with various documents. this paper shows that the levenshtein algorithm is suitable for checking the termini consistency of documents. the termanalysis tool checks on the order of 100 words within seconds (depending on the hardware). the tool only examines the consistency, not the quality of the content. termanalysis is very useful for authors of learning and teaching documents. in most cases, the authors of such documents are experts, and therefore very familiar with concepts and termini. but they also make use of 'internal' (insider) termini or use different termini for one concept without realizing it. thus, termanalysis can also be seen as a tool of knowledge engineering that helps to externalize experts' knowledge properly.

references
[1] specht, g.: einführung in die betriebswirtschaftslehre. stuttgart: poeschel, 1990, p. 14.
[2] seiffert, h.: einführung in die wissenschaftstheorie 1. münchen: beck, 1969, p. 37, 41.
[3] strube, g.: wörterbuch der kognitionswissenschaft. stuttgart: klett-cotta, 1996, p. 58.
[4] seel, n. m.: psychologie des lernens. münchen: ernst reinhardt verlag, 2003, p. 166.
[5] www.pinngate.de
[6] baeza-yates, r., ribeiro-neto, b.: modern information retrieval. acm press, 1999, p. 105.
[7] anderson, j.: kognitive psychologie. heidelberg, berlin, oxford: spektrum akademischer verlag, 1996.
[8] weiß, s.: konzept und umsetzung eines navigators für wissen in der produktentwicklung. düsseldorf: vdi-verlag, 2006.
[9] birkhofer, h., weiß, s., berger, b.: modularized learning documents for product development in education at the darmstadt university of technology. in: proceedings of design 2004, dubrovnik, 2004, p. 599–604.
[10] weiß, s., berger, b., jänsch, j., birkhofer, h.: coseco (context-sensitive-connector) – a logical component for a user- and usage-related dosage of knowledge. in: proceedings of iced 03, stockholm, 2003.
[11] weiß, s., berger, b., birkhofer, h.: topology of modular knowledge structures in product development. in: proceedings of design 2004, dubrovnik, 2004.
[12] jänsch, j., sauer, t., walter, s., birkhofer, h.: user-suitable transfer of design methods. in: proceedings of iced 2003, stockholm, 2003.
[13] mietzel, g.: pädagogische psychologie des lernens und lehrens. göttingen: hogrefe, 2003.
[14] hansen, f.: konstruktionssystematik. berlin: veb verlag technik, 1965.
[15] birkhofer, h., kloberdanz, h., berger, b., sauer, t.: cleaning up design methods – describing methods completely and standardised. in: marjanovic, d. (ed.), design 2002, vol. 1. faculty of mechanical engineering and naval architecture, zagreb; the design society, glasgow: dubrovnik, croatia, p. 17–22.
[16] jänsch, j.: akzeptanz und anwendung von konstruktionsmethoden im industriellen einsatz – analyse und empfehlungen aus kognitionswissenschaftlicher sicht. dissertation, fortschritt-berichte vdi, reihe 1, nr. 396, technische universität darmstadt, vdi-verlag, düsseldorf, 2007.

dipl.-wirtsch.-ing.
sascha weiß
phone: + 49 (0) 6151 – 16 2666
fax: + 49 (0) 6151 – 16 3355
e-mail: weiss@pmd-tu-darmstadt.de
dipl.-wirtsch.-ing. judith jänsch
phone: + 49 (0) 6151 – 16 3055
fax: + 49 (0) 6151 – 16 3355
e-mail: jaensch@pmd-tu-darmstadt.de
dept. of product development and machine elements (pmd)
darmstadt university of technology
magdalenenstraße 4
64289 darmstadt, germany

1 introduction
data mining can be defined as the non-trivial extraction of implicit, previously unknown, yet potentially useful information from data, and may be described as the science of extracting useful information from large data sets or databases. with the help of data mining, derived knowledge, relationships and conclusions are often represented as models or patterns: for example, a data cluster, a tree structure, or a set of rules can form a model or a pattern. the whole process is sometimes referred to as knowledge discovery in databases (kdd); in this sense, data mining denotes only the modeling or analytical method, and is considered to be a part of the kdd process [1]. one standard, named crisp-dm (cross-industry standard process for data mining), describes this process step by step; it develops each phase of the kdd process and, in addition, helps to avoid common mistakes. the particular phases of the crisp-dm methodology are [1]:
business understanding – the first objective is to thoroughly understand what is really to be accomplished. we have to begin by uncovering the important factors that can influence the outcome of the project. this task also involves more detailed fact-finding about all of the resources, assumptions and other factors that should be considered in determining the data analysis goal.
data understanding – orientation in the data. this step usually investigates a variety of descriptive data characteristics (counts of entities in tables, frequencies of attribute values, average values, etc.).
data preparation – this is the most difficult and most time-consuming element of all kdd processes. the goal of data preparation is to choose (or create) relevant data from the available data, and to represent it in a form which is suitable for the analytical methods that are applied (the data often needs to be in a single data table with all the other values of the object attributes). data preparation includes activities like data selection, filtering, transformation, creation, integration and formatting.
modeling – modeling is the use of analytical methods (algorithms), sometimes referred to as data mining itself. there are many different methods, and the most suitable one must be chosen to solve a given task; efficient settings of the parameters must also be found. this phase includes verifying the quality of the model (e.g. testing on an independent data matrix, cross validation, and so on).
evaluation – interpretation and evaluation of the discovered knowledge. the main aspects are the novelty, interest, utility and comprehensibility of the descriptive tasks. the derived knowledge is divided into the following categories:
• evident knowledge, which is comparable to 'horse sense' or to the common knowledge of an expert. even if such knowledge does not offer anything new, it can show us that the method works well and that it is able to discover knowledge.
• interesting knowledge that yields a new point of view. this is the main aim of kdd.
• knowledge that seems to be unclear or is at variance with expert knowledge. this knowledge may have been created by coincidence and should be ruled out. nevertheless, it can expose a new point of view that applies to the whole problem, and this must be taken into consideration.
deployment – the acquired knowledge should be modified into applicable forms, which can involve simply writing a final report or taking specific actions.

2 analytic methods
the core of all kdd processes is the use of analytic methods. the input to the analytic procedures is the prepared data, and the output is discovered knowledge. analytic methods include regression analysis, discriminatory analysis, cluster analysis, decision trees and association rules, among others. it is possible to use standard methods and also modern methods, like neural networks. the data analysis here is based on association rules, so they will be described in more detail.

2.1 association rules
this method is based on determining connections (associations) among attributes; in particular, it is used to discover common combinations of attributes that occur most frequently within a given data set. association rules, which originally came from the czech guha method (general unary hypotheses automaton), can be divided into two categories.

effective data mining for a transportation information system
p. haluzová
this paper describes the application of data mining methods in the database of the doris transportation information system, currently used by the prague public transit company. the goal is to create knowledge about the behavior of objects within this information system. data is analyzed partly with the help of descriptive statistical methods, and partly with the help of association rules, which may discover common combinations of attributes that occur most frequently within a given data set. two types of quantifiers were used when creating the association rules, namely 'founded implication' and 'above average'. the results of the analysis are presented in the form of graphs and hypotheses.
keywords: data mining, knowledge discovery in databases, association rule.

the rules fall into the ant ~ suc form in the first class, and the so-called contingent rules fall into the ant ~ suc / cond form in the second class, where ant (antecedent or precondition), suc (succedent or conclusion, consequent) and cond (condition) are logical conjunctions of literals, and the symbol ~ denotes a generalized quantifier indicating the type of relationship between ant and suc. a literal is defined as an attribute (positive literal) or its negation (negative literal). a generalized quantifier can be in the form of an implication, an equivalence or a statistical test, etc. regarding contingent association rules, only objects that satisfy the given condition may be included in the hypothesis [1]. a contingent table for the ant ~ suc rule and for n instances can be made [1], where a is the number of rows in the analyzed data matrix that satisfy both the precondition and the conclusion, b is the number of instances that satisfy the precondition and not the conclusion, c is the number of instances that do not satisfy the precondition but satisfy the conclusion, and d is the number of instances that satisfy neither the precondition nor the conclusion.
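the four frequencies a, b, c, d can be obtained mechanically once antecedent and succedent are evaluated on each row of the data matrix. the sketch below assumes a representation by boolean sequences, which is our simplification, not the lisp-miner interface.

def fourfold(ant, suc):
    # counts of the four-fold (contingent) table for a rule ant ~ suc
    a = sum(x and y for x, y in zip(ant, suc))
    b = sum(x and not y for x, y in zip(ant, suc))
    c = sum((not x) and y for x, y in zip(ant, suc))
    d = sum((not x) and (not y) for x, y in zip(ant, suc))
    return a, b, c, d

# e.g. fourfold([1, 1, 0, 0], [1, 0, 1, 0]) == (1, 1, 1, 1)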
we can determine various rule characteristics from these numbers, and with this we can quantitatively evaluate the knowledge that has been found. the basis of all searching algorithms in association rules is the generation of combinations (conjunctions) of attribute values. they work only with categorical data, and therefore it is necessary to separate the numerical attributes into intervals.

quantifiers
the quantifier (generalized quantifier) characterizes the type of relationship between antecedent and succedent. there are many types of quantifiers, e.g. lower critical implication, upper critical implication, above average, below average, founded equivalence, fisher quantifier, etc. two types of quantifiers were used to create the hypotheses with the lisp-miner program, namely 'founded implication' and the relationship 'above average'.

founded implication
this quantifier has two parameters p and b; p is called confidence and b is base. always p ∈ (0, 1] and b > 0. we say that antecedent and succedent have a relation of founded implication with minimum values of the parameters p and b in the data matrix if the following holds [1]:
a / (a + b) ≥ p  and  a ≥ base
(for a, b see the contingent table – table 1). this can be written formally as ant ⇒p, b suc and can be interpreted as: 'at least p·100 % of objects satisfying the antecedent also satisfy the succedent, and at least b objects satisfy both antecedent and succedent', or 'antecedent implies succedent with probability p·100 %'.

aa quantifier (above average)
this quantifier also has two parameters: confidence p and base b. always p > 0 and b > 0. we say that antecedent and succedent have a relation of above average with minimum values of the parameters p and b in the data matrix if the following holds [1]:
a(a + b + c + d) / ((a + b)(a + c)) ≥ 1 + p  and  a ≥ base
(for a, b, c, d see the contingent table – table 1). this can be written formally as ant ∼p, b suc and can be interpreted as: 'among objects satisfying the antecedent, the relative frequency of objects that satisfy the succedent is at least p·100 % higher than the relative frequency of objects satisfying the succedent among all the objects in the whole data matrix; and at least b objects satisfy both antecedent and succedent.' another way to express this rule is: 'if we add one to p, we discover how many times the probability of the succedent increases when the antecedent is satisfied, as compared to a case where we don't know whether the antecedent is or is not satisfied.' [2]
the values of the confidence and base parameters are very important for the resultant hypotheses, because the higher the parameter, the more gravity will be assigned to these hypotheses. if we set the values of the parameters too high, only the strongest hypotheses will be found; their number is mostly low, and can even be zero. consequently, it is necessary to set the parameters repeatedly and to try various values.

2.2 software for knowledge discovery in databases
systems for kdd can be divided into the research sphere and the commercial sphere. most of them include components for modeling, data preparation, visualization and interpretation. some examples of commercial systems are clementine, enterprise miner, intelligent miner, knowledge studio, and the statistical data miner. lisp-miner and weka are examples of non-commercial tools.
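before turning to the tool itself, note that the two quantifiers defined above reduce to simple checks on the four-fold table counts. the following python functions are plain re-implementations of those formulas, our sketch rather than lisp-miner's api.

def founded_implication(a, b, p, base):
    # at least p*100 % of objects satisfying ant also satisfy suc, and a >= base
    return (a + b) > 0 and a / (a + b) >= p and a >= base

def above_average(a, b, c, d, p, base):
    # the relative frequency of suc among ant-objects is at least (1+p) times
    # its relative frequency in the whole data matrix, and a >= base
    n = a + b + c + d
    return (a + b) > 0 and (a + c) > 0 and \
           a * n / ((a + b) * (a + c)) >= 1 + p and a >= base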
the lisp-miner system was used to generate the association rules; it has been under development for research and education purposes since 1996 at the faculty of informatics and statistics of the university of economics in prague. this system patterns itself on the guha method, and it is possible to closely specify the character of each association rule. in this way, only really interesting hypotheses are found, and this brevity of the selection process hastens the generation of hypotheses. lisp-miner works with the ms access database, which is appropriate, since the data used to be available only in this database system.

table 1: contingent table
         suc    ¬suc   Σ
 ant      a      b     r
 ¬ant     c      d     s
 Σ        k      l     n

3 data understanding and preparation
3.1 the doris system
the analyzed data, created between april and december 2003, was taken from the doris tram dispatch control system. this system observes tram-cars in the electric tramway
lnno number line number porno number line order evcislo text tram-car license number vozovna number number of the depot (1–8), from which the tram departed a number id of tram stop b number number of the stop post c number delay (overtake recorded as minus) d number reserve column, without data table 2: attributes of daytime traffic tables name of attribute meaning name of category values of attributes weekday days in a week monday to sunday 1 to 7 hour hour interval (e.g. 7 means an interval from 7:00 to 7:59) morning 5, 6 morning rush hour 7, 8 before noon 9, 10 noon 11, 12 afternoon 13, 14 afternoon rush hour 15, 16 evening 17, 18, 19, 20 late evening 21, 22, 23 average delay average delay [s] under 1 min �0, 59� from 1 to 2 min �60, 119� from 2 to 3 min �120, 179� over 3 min > 179 table 3: meaning and categories of attributes for each month, generates about 20 000 duplicate records, this corresponds to 1.3 % of the total. if a dispatcher diverts a tram from its route, there is no applicable delay/overtake attribute for this tram, because there is no defined timetable on its “new” route, and therefore, it is not possible to determine a departure. so a ‘null’ value arises in these attributes; records with such values were also deleted in the data preparation phase. missing values can, in general, be handled in various ways. because attributes like ‘day of the week’ or ‘hour intervals’ were used, it was necessary to create these from the ‘date’ attribute. 3.3 creation of attributes for generating of association rules association rules are generated and hypotheses are created with groups of attributes, and must therefore be collected in a simple data table, mostly with the help of the sql language aggregation function. relevant categories must be created for the values of each attribute. an authentic group of attributes is offered here as an example. the attribute categories: ‘weekday’, ‘hour’, ‘line’, ‘stop’ and ‘average delay’ are shown in the data table for the generation of hypotheses. the meanings of these attributes and their division into categories in table 3. the tram direction should be reflected in the survey of the delays at the stops; whether or not it is inbound towards, or outbound from the city center (this can be distinguished with the help of the ‘stop post’ attribute in the database). the attributes ‘line’ and ‘stop’ have one category for each value. the resultant data table for generating the association rules includes about 30 000 rows. 4 results we will now give some real examples and show how they have been modeled. 4.1 description we can obtain information from the database, which allows us to follow any leg of a selected line, either in the long term, or in one day, or in hourly segments. the “long term” pattern has a higher deposed value, and can be used for modifying timetables for adjusting the driving time between stops or as a basis for drafting new timetables. the statistics of accidents divulge much interesting statistical information. about 800 accidents were recorded in the database between july and december 2003. with the help of simple a sql query it was possible to determine that the most common type of accident was between a tram and a car, without resultant injury. injury occurred in only 4.4 % of all such accidents. fig. 1 shows that the greatest number of accidents occurred on monday and on wednesday. the greatest number of accidents during the day may be assumed to have occurred during the rush hour. this is confirmed by the second graph (fig. 2). 
the greatest number of accidents occurred between 3 and 4 p.m. on the other hand, the total number of accidents in the evening, and at night, is below twenty per hour, appreciably fewer than in other hours of the day. another example that illustrates the use of data mining in the doris database is the influence of one accident on increasing the delays to other trams in the affected area. an accident has been chosen from the database that happened to the second tram in the line order for line #18, between chotkovy sady and malostranská tram stops. the tram struck a pedestrian, and the tram was then taken to the depot. this accident influenced the progress of trams not only on line 18, but also on lines 22 and 23. fig. 3 describes the trams operating on lines 22 and 23, and the increase in delays to them at the time of the accident. each tram is affixed with a line number and an ordinal number. at first sight, the ordinal numbers of both lines appear to be numbered similarly in the graph, but in reality they are slightly different. © czech technical university publishing house http://ctn.cvut.cz/ap/ 27 acta polytechnica vol. 48 no. 1/2008 number of accidents in the period under consideration 162 123 156 137 147 46 25 monday tuesday wednesday thursday friday saturday sunday day of a week n u m b e r o f a c c id e n ts fig. 1: number of accidents in the study period as shown in fig. 3, the first trams on both lines ran without any major delay. the accident occured between the passage of trams 23/1 and 22/2. the delays increased rapidly after this event. the trams most impacted were 22/2, 23/2, 22/3 and 23/3, and the delay reached as much as 700 s. afterwards, the flow of traffic slowly returned to the normal state, but several following trams were still delayed by about 200 s. 4.2 searching for association rules the attributes ‘weekday’, ‘hour’ and ‘number of accidents’ were placed as a precondition to an association rule; and the attributes ‘percentage of delayed trams’, ‘average delay’, ‘average overtake’ were assigned to a succedent. the founded implication with the parameters b � 15 and p � 0,700 was used as a quantifier. these hypotheses were then searched: among those that have the relationship of the founded implication between ant and suc, at least 15 objects satisfy both ant and suc and confidence of the implication is at least 0,700. with this setting of parameters, for example, a strong hypothesis appeared as follows: number of accidents (high) �0.89, 23 percentage of delayed trams (high), where the confidence is 89 % and the 28 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 48 no. 1/2008 number of accidents during the day 0 10 20 30 40 50 60 70 80 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 hour n u m b e r o f a c c id e n ts fig. 2: the number of accidents during the day in the study period m a lo v a n k a b ru s n ic e k rá l. le to h rá d e k m a lo s tr a n s k á ú je z d n á ro d n í d iv a d lo 22/1 23/1 22/2 23/2 22/3 23/3 22/4 23/4 22/5 23/5 22/6 23/6 22/7 23/7 0 100 200 300 400 500 600 700 800 delay [s] stop line and order influence of an accident on the delay increase of other trams fig. 3: the influence of an accident on increasing the delays on of lines 22 and 23 support a � 23 (number of objects which satisfy the concurrent precondition and conclusion). 
the hypothesis can be interpreted as: if a high number of accidents occurs in a given day and hour, then a high percentage of trams delayed by more than 180 seconds will appear, with a probability of 89 %, in the same day and hour. this is confirmed by 23 objects in the data matrix. examples of other hypotheses:

number of accidents (zero) ⇒(0.79; 15) percentage of delayed trams (low) & average delay (low)
number of accidents (low) ⇒(0.82; 44) average delay (low)

from the hypotheses mentioned above it can be unambiguously deduced that the number of accidents influences the delay rate. such a conclusion can be assumed intuitively, and we could draw it without the use of association rules. no new elements appear, but we know that the method used for generating association rules works well. other searches for contingent rules followed. we used the attributes ‘line’, ‘weekday’ and ‘hour’ in the antecedent, and ‘average delay’ in the succedent. the ‘stop’ attribute (the direction towards the city center) was used as a condition. an “above average” quantifier with the parameters p ≥ 2 and b ≥ 20 was used; thirty hypotheses satisfied the given conditions. the attribute ‘stop’ constrains the validity of the found rules to the individual stops. one of these hypotheses reads:

line (25) ⇒(3; 20) average delay (over 3 min) / stop (vltavská)

this hypothesis holds true for the vltavská tram stop, in the direction towards the city center, for 20 instances in the data matrix, and p is approximately 3. it is interpreted as: a delay of over 3 minutes occurs on line 25 at the vltavská tram stop four times more often than for trams of all the other lines running through this stop. the rules can be interpreted in various ways. while intelligibility and simplicity should be the most important elements of the explication, it is sometimes difficult to explain a hypothesis simply and realistically at the same time. the hypothesis above can be interpreted simply as: trams of line 25 are the trams most often delayed at the vltavská tram stop. as in the case of generating association rules from the previous group of attributes, evident or already known conclusions appeared. an example:

hour (late evening) ⇒(2; 64) average delay (under 1 min) / stop (štěpánská)

simply stated: in comparison with any other hour of the day, the smallest number of delayed trams at the štěpánská tram stop occurs between 21:00 and 24:00.

5 conclusions
the results of the data analysis can serve for modifying the running times between stops, or as a basis for drafting new timetables. they also draw attention to the existence of black spots with a high rate of accidents or a high percentage of delayed trams (or of tram overtake incidence). the conclusions drawn from such analyses provide an exact tool for recognizing black spots, so that suitable measures can be taken, e.g., decisions about tram preference at traffic lights or about constructing suitable passive elements. as a result of applying the association rules, hypotheses appeared which in most cases could have been assumed intuitively. basically, it can be inferred that the method used here for searching for association rules in the lisp-miner program works well. no surprising or new hypotheses were found, above all due to the interdependence among the attributes that were used (for example, there is a correlation coefficient of 0.89 for the pair of attributes ‘number of accidents’ and ‘average delay’).
it would be necessary to enrich the database with other independent attributes in order to find more interesting hypotheses. for example, the vehicle occupancy rate could be recorded at the stops as a new attribute, if the required equipment for reading the weight on an axle were available. if the vehicle occupancy were known, it would be possible to optimize the scheduled intervals between trams, and to reduce or increase the frequency of connections. the most interesting hypotheses are those predicting the probability of a delay on a line at a selected stop. no such statistics had been created previously at the prague public transit company. when considering transport options, the reliability of transfer connections and information about the probability of a delay are decisive elements in choosing a mode of transport.

references
[1] berka, p.: dobývání znalostí z databází. praha: academia, 2003.
[2] kejkula, m.: 4ft-miner pro začátečníky. získávání znalostí z databází. praha: vše, 2004.
[3] internal materials of the prague public transit company: dispečerský řídicí systém doris.
[4] the official site of the lisp-miner project, http://lispminer.vse.cz
[5] hájek, p., havránek, t.: metoda guha. automatická tvorba hypotéz. praha: academia, 1983.
[6] ducháček, m.: nástroj pro správu databází s využitím pro multi-relační data mining. diploma thesis, mff uk, praha, 2005.
[7] burian, j.: datamining a aa (above average) kvantifikátor. sborník 2. ročníku konference znalosti. ostrava, 2003. isbn 80-248-0229-5.
[8] rauch, j., šimůnek, m.: systém lisp-miner. sborník 2. ročníku konference znalosti. ostrava, 2003. isbn 80-248-0229-5.

ing. petra haluzová
email: p.haluzova@seznam.cz
department of informatics and telecommunications
czech technical university in prague, faculty of transportation sciences
konviktská 20, 110 00 prague 1, czech republic

1 introduction
the study of the factor complexity, palindromic complexity, balance property, return words, and recurrence function of infinite aperiodic words is an interesting combinatorial problem. moreover, the investigation of infinite words coding β-integers $\mathbb{Z}_\beta$, for β a pisot number, can be interpreted as an investigation of one-dimensional quasicrystals. in this paper, new results concerning return words and the recurrence function will be presented. in general, little is known about return words. a return word of a factor w of an infinite word u is any word which starts at some occurrence of w in u and ends just before the next occurrence of w. vuillon [1] has shown that an infinite word over a binary alphabet is sturmian if and only if each of its factors has exactly two return words. justin and vuillon [2] proved that each factor of an arnoux-rauzy word of order m has m return words. moreover, vuillon proved that each factor of a word coding a k-interval exchange also has k return words. return words in the thue-morse sequence and return words in the infinite words associated with simple parry numbers have been studied in [4]. even less is known about the recurrence function, which, for an infinite uniformly recurrent word u, determines the minimal length $R_u(n)$ such that any subword of u of length $R_u(n)$ contains all the factors of u of length n. (a small computational illustration of return words is given below.)
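as a small computational illustration of the notion just defined, the return words of a factor can be read off a sufficiently long prefix of an infinite word. the sketch below is illustrative only; since it works on a finite prefix, the listed return words are reliable only for factors that occur often enough in that prefix.

```python
def occurrences(prefix: str, w: str) -> list[int]:
    """indices j such that prefix[j:j+len(w)] == w."""
    return [j for j in range(len(prefix) - len(w) + 1)
            if prefix[j:j + len(w)] == w]

def return_words(prefix: str, w: str) -> set[str]:
    """return words of w: the pieces between consecutive occurrences of w."""
    occ = occurrences(prefix, w)
    return {prefix[j:k] for j, k in zip(occ, occ[1:])}

# fibonacci word, a sturmian word: fixed point of 0 -> 01, 1 -> 0
s = "0"
for _ in range(15):
    s = s.replace("0", "0_").replace("1", "0").replace("_", "1")

# by vuillon's theorem, every factor has exactly two return words
for w in ["0", "01", "010"]:
    print(w, return_words(s, w))
```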
cassaigne [5] determined this function for sturmian words, taking into account the continued fractions of their slope, and in [6] he gave a general algorithm describing how to determine the recurrence function if we know the return words. this algorithm will constitute the cornerstone of our further results. we will describe the cardinality of the set of return words for any factor w of the infinite word $u_\beta$ associated with a quadratic non-simple parry (and thus pisot) number β and, as a consequence, we will be able to deduce the exact formula for the recurrence function. these results will complete the list of properties already studied: the factor complexity has been studied in [7], the palindromic complexity is described in [8], and results on balances can be found in [9].

2 preliminaries
first, let us introduce our “language”, which will be used throughout this paper. an alphabet $\mathcal{A}$ is a finite set of symbols called letters. a concatenation of letters is a word. the length of a word w is denoted by $|w|$. the set $\mathcal{A}^*$ of all finite words (including the empty word ε), equipped with the operation of concatenation, is a free monoid. we will also deal with right-sided infinite words $u = u_0 u_1 u_2 \cdots$. a finite word w is called a factor of the word u (finite or infinite) if there exist a finite word $w^{(1)}$ and a word $w^{(2)}$ (finite or infinite) such that $u = w^{(1)} w\, w^{(2)}$. the word w is a prefix of u if $w^{(1)} = \varepsilon$, and it is a suffix of u if $w^{(2)} = \varepsilon$. a concatenation of k letters a (or words a) will be denoted by $a^k$. the language of u, $\mathcal{L}(u)$, is the set of all factors of u, and $\mathcal{L}_n(u)$ is the set of all factors of u of length n. we say that a letter $a \in \mathcal{A}$ is a left extension of a factor $w \in \mathcal{L}(u)$ if the factor aw belongs to the language $\mathcal{L}(u)$; w is left special if it has at least two left extensions. right extensions and right special factors are defined by analogy. we call a factor $w \in \mathcal{L}(u)$ bispecial if it is both left special and right special. a mapping φ on the free monoid $\mathcal{A}^*$ is called a morphism if $\varphi(vw) = \varphi(v)\varphi(w)$ for all $v, w \in \mathcal{A}^*$. obviously, for determining a morphism it suffices to give $\varphi(a)$ for all $a \in \mathcal{A}$. the action of a morphism can be naturally extended to right-sided infinite words by the prescription $\varphi(u) := \varphi(u_0)\varphi(u_1)\varphi(u_2)\cdots$. a non-erasing morphism φ, for which there exists a letter $a \in \mathcal{A}$ such that $\varphi(a) = aw$ for some non-empty word $w \in \mathcal{A}^*$, is called a substitution. an infinite word u such that $\varphi(u) = u$ is called a fixed point of the substitution φ. obviously, every substitution has at least one fixed point, namely $\lim_{n\to\infty}\varphi^n(a)$. in everything that follows, we will focus on the infinite word $u_\beta$, the only fixed point of the substitution φ given by

$\varphi(0) = 0^a 1, \quad \varphi(1) = 0^b 1, \qquad a > b \ge 1. \qquad (1)$

let us only briefly mention that this infinite word codes the set of β-integers $\mathbb{Z}_\beta$, where β is a quadratic non-simple parry number; all details can be found, for example, in [8]. since our main interest is in studying return words and the recurrence function, let us give the corresponding definitions.

definition 1. let w be a factor of an infinite word $u = u_0 u_1 \cdots$ (with $u_j \in \mathcal{A}$), $w \ne \varepsilon$. an integer j is an occurrence of w in u if $u_j u_{j+1} \cdots u_{j+|w|-1} = w$. let j, k, j < k, be two successive occurrences of w; then the word $u_j u_{j+1} \cdots u_{k-1}$ is a return word of w. the set of all return words of w is denoted by ret(w). an infinite word u is said to be uniformly recurrent if for every $n \in \mathbb{N}$ there exists $R(n) > 0$ such that any segment of u of length $\ge R(n)$ contains all words in $\mathcal{L}_n(u)$. as we deal with the infinite word $u_\beta = \lim_{n\to\infty}\varphi^n(0)$, where $\varphi(0) = 0^a 1$, $\varphi(1) = 0^b 1$, $a > b \ge 1$ (a short sketch generating a prefix of this fixed point is given below), let us also recall what is known about the uniform recurrence of words that are fixed points of a substitution.
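a prefix of the fixed point $u_\beta$ is easy to generate by iterating the substitution (1). in the sketch below the parameters a = 2, b = 1 are chosen only as an example; this choice reproduces the sturmian word of example 7 further on.

```python
def u_beta_prefix(a: int, b: int, iterations: int = 10) -> str:
    """prefix of the fixed point of phi(0) = 0^a 1, phi(1) = 0^b 1."""
    assert a > b >= 1, "the paper assumes a > b >= 1"
    phi = {"0": "0" * a + "1", "1": "0" * b + "1"}
    word = "0"
    for _ in range(iterations):
        word = "".join(phi[letter] for letter in word)
    return word

print(u_beta_prefix(2, 1)[:21])   # 001001010010010100101
```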
definition 2. a substitution φ over an alphabet $\mathcal{A} = \{a_1, a_2, \ldots, a_k\}$ is called primitive if there exists $k \in \mathbb{N}$ such that for any letter $a_i \in \mathcal{A}$, the word $\varphi^k(a_i)$ contains all letters of $\mathcal{A}$. queffélec [10] has shown that any fixed point of a primitive substitution φ is a uniformly recurrent word. thus, since the infinite word $u_\beta$ is a fixed point of a primitive substitution, $u_\beta$ is uniformly recurrent. it is not difficult to see that the set of return words of w is finite for any factor $w \in \mathcal{L}(u)$ if u is a uniformly recurrent word.

definition 3. the recurrence function of an infinite word u is the function $R_u : \mathbb{N} \to \mathbb{N} \cup \{+\infty\}$ defined by

$R_u(n) := \inf\{ N \in \mathbb{N} \mid \text{every } v \in \mathcal{L}_N(u) \text{ contains all elements of } \mathcal{L}_n(u) \}.$

in other words, $R_u(n)$ is the smallest length such that any segment of u of length $R_u(n)$ contains all the factors of u of length n. clearly, u is uniformly recurrent if and only if $R_u(n)$ is finite for every $n \in \mathbb{N}$. to get another expression for $R_u(n)$ that is convenient to work with, let us introduce some more terms.

definition 4. let u be an infinite uniformly recurrent word.
– for $w \in \mathcal{L}(u)$, $l_u(w) := \max\{ |v| \mid v \in \mathrm{ret}(w) \}$ is the maximal return time of w in u.
– for all $n \in \mathbb{N}$, we define $l_u(n) := \max\{ l_u(w) \mid w \in \mathcal{L}_n(u) \}$.

once we have determined the lengths of the return words, the following proposition from [5] will allow us to calculate the recurrence function $R_u(n)$.

proposition 5. for any infinite uniformly recurrent word u and for any $n \in \mathbb{N}$, one has $R_u(n) = l_u(n) + n - 1$.

3 return words
the aim of this section is to determine the return words of the infinite word $u_\beta$, the fixed point of the substitution φ introduced in (1). vuillon [1] has shown the following result for sturmian words.

proposition 6. let u be an infinite word over a two-letter alphabet. then u is sturmian if and only if $\#\mathrm{ret}(w) = 2$ for every factor w of u.

let us mention that $u_\beta$ is sturmian for $a = b + 1$, as follows from [7].

example 7. let $u = 001001010010010100101\cdots$ be the fixed point of the substitution $\varphi(0) = 001$, $\varphi(1) = 01$ (a sturmian case). let us show examples of return words:

ret(0) = {0, 01}
ret(00) = {001, 00101}
ret(001) = {001, 00101}
ret(0010) = {001, 00101}

throughout this section, we will use methods analogous to those introduced in [4]. in order to study the return words ret(w) of factors w of an infinite uniformly recurrent word u, it is possible to limit our considerations to bispecial factors. namely, if a factor w is not right special, i.e., if it has a unique right extension $a \in \mathcal{A}$, then the sets of occurrences of w and wa coincide, and

$\mathrm{ret}(w) = \mathrm{ret}(wa). \qquad (2)$

if a factor w has a unique left extension $b \in \mathcal{A}$, then $j \ge 1$ is an occurrence of w in the infinite word u if and only if $j - 1$ is an occurrence of bw. this statement does not hold for $j = 0$. nevertheless, if u is a uniformly recurrent infinite word, then the set ret(w) of return words of w stays the same no matter whether we include the return word corresponding to the prefix w of u or not. consequently, we have

$\mathrm{ret}(bw) = b\,\mathrm{ret}(w)\,b^{-1} = \{\, b v b^{-1} \mid v \in \mathrm{ret}(w) \,\}, \qquad (3)$

where $bvb^{-1}$ means that the word v is extended to the left by the letter b and the letter b at its right end is erased (note that b is always a suffix of v for $v \in \mathrm{ret}(w)$).

observation 8. note that relation (2) implies that if $w \in \mathcal{L}(u)$ is not right special, then the maximal return times from definition 4 satisfy $l_u(w) = l_u(wa)$, where $a \in \mathcal{A}$ is the only right extension of w.
similarly, relation (3) implies that if w is not left special, then l w l bwu u( ) ( )� , where b �� is the only left extension of w. for an aperiodic uniformly recurrent infinite word u, each factor w can be extended to the left and to the right to a bispecial factor. to describe the cardinality of ret( )w , it suffices therefore to consider bispecial factors w. observation 9 the only bispecial factors of u� that do not contain the letter 1 are 0 r, r a� � 1. observation 10 every bispecial factor w of u� containing at least one letter 1 has the prefix 0b1 and the suffix 1 0b. consequently, there exists a bispecial factor v such that w vb b� 0 1 0�( ) . (the empty word � is bispecial, too.) let us summarize the previous observations to get the description of all bispecial factors. corollary 11 the set of all the bispecial factors of u� is given by b r a nr n( ) , , ,� � �0 1� � , where b br n b r n b( ) ( )( )� �1 0 1 0� for all n � �, b0 1( ) � �, and br r( )1 0� for r a� �0 1, ,� . in order to find return words of bispecial factors, let us recall a lemma from [4]. to formulate it, we also need to recall a definition. definition 12 let w be a factor of a fixed point u of a substitution �. we say that a word v v v v un n n0 1 1 1� � �� � ( ) is an ancestor of w if � w is a factor of �� v v v vn n0 1 1� � ), 16 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 47 no. 2–3/2007 � w is neither a factor of �� v v vn n1 1� � ) nor of �� v v v vn n0 1 1� � ). clearly, any factor of the form �( )w has at least one ancestor, namely the factor w. lemma 13 let an infinite word u be a fixed point of a substitution � and let be a factor of u. if the only ancestor of �(w) is the factor w, then ret ret( ( )) ( ( ))� �w w� . proposition 14 let w be a non-empty factor of u�, then �(w) has a unique ancestor, namely w. proof. it is a direct consequence of the form of the substitution defined by (1) that any factor v having suffix 1 occurs only as suffix of �(w) for some factor w u� �( )� . moreover, any factor �v which has only 1 as its left extension, occurs only as prefix of �( � )w for some � ( )w u� � � . the statement follows by injectivity of �. now we are able to explain how to get return words of all bispecial factors. corollary 15 let br n( ) be a bispecial factor of u�. then # ( ) # ( )( ) ( )ret retb br n r n� �1 for every n � 2 and r a� �0 1, ,� . proof. it follows from proposition 14 that # ( ) # ( ( ))( ) ( )ret retb br n r n� ��1 1� . then, relations (2) and (3) say that # ( ( )) # ( ( ) ) # ( )( ) ( ) ( )ret ret ret� �b b br n b r n b r n� �� �1 10 1 0 . let us sum up the results to get the main theorem about the cardinality of the set of return words of any factor of u�. theorem 16 let w be a factor of u�. then 2 3� �# ( )ret w . proof. using corollary 15, it suffices to consider only bispecial factors of the form 0r, r a� � 1, and � to obtain all possible cardinalities of the sets of return words of factors of u�. it is not difficult to see that the return words of the simplest bispecial factors are the following: � let r b� then # ( ) ,ret 0 0 0 1r r� . � let b r a� � � 1 then # ( ) , ,ret 0 0 0 1 0 10r r r b� . � since # ( ( ) ) ( )( ) , ( )( )ret 0 1 0 0 1 0 0 1 0 1 1 0 1 0 10 1 1b b b b b b b � � � �� � � � a b b� , ,0 1 it is useful to put ret( ) ,� � 0 1 . observation 17 from the proof of theorem 16, we can observe that in the case b a� � 1, # ( )ret w � 2 for all the factors of u�. 
thanks to proposition 6, we have another confirmation of the fact that u� for b a� � 1 is a sturmian word. 4 how to compute the recurrence function let us show how to apply the knowledge of return words to describe the recurrence function (definition 3) of the infinite word u�. the methods used here follow the ideas from [6]. as the reader will have anticipated, we want to compute lu(n) in order to get the recurrence function ru(n) for every n � �. this task can be simplified using the notion of singular factors. a factor w u� �( ) is called singular if w � 1 or if there exist a word v u� �( ) and letters x x y y, , ,� � �� such that w xvy� , x x� �, y y� �, and � � �x vy xvy u, ( )� . obviously, v is a bispecial factor. proposition 18 let u be a uniformly recurrent word and n � 1. if l n l nu u( ) ( )� �1 , then there exists a singular factor w of length n such that l w l nu u( ) ( )� . a singular factor w is said to be essential if l w l w l wu u u( ) ( ) ( )� � � 1 . it follows that to calculate l nu( ), it is sufficient to consider singular, or, even, only essential singular factors. theorem 19 let u be a uniformly recurrent word and n � 1. l n l w w n and w l w w n and w essen u u u ( ) max ( ) max ( ) � � � � singular tial singular . now we are able to give an algorithm for computing the recurrence function of an infinite uniformly recurrent word u: 1. determine bispecial factors. 2. deduce the form of singular factors and compute their lengths. 3. for every singular factor, determine the associated return words and compute their lengths. 4. compute the function lu(n) to get the recurrence function ru(n) for every n � � using proposition 5. 5 computation of the recurrence function of u� let us apply the above described algorithm for computing the recurrence function of the infinite word u� being the only fixed point of the substitution �( )0 0 1� a , �( )1 0 1� b , a b� �1 . 1. the bispecial factors of u� are described in corollary 11. 2. let us show how the task of describing the singular factors can be simplified using the relation between a factor and its image under the substitution �. observation 20 let n � 2 and r a� �0 1, ,� . then, s x y x b yr n r n( ) ( )( , ) � is a singular factor if and only if s x y x b yr n r n( ) ( )( , )� ��1 1 is a singular factor. if we describe all the simplest non-trivial singular factors, i.e., those factors of the form s x y x b yr r ( ) ( )( , )1 1� , x y, ,� 0 1 , observation 20 will allow us to get the set of all singular factors of u�. © czech technical university publishing house http://ctn.cvut.cz/ap/ 17 acta polytechnica vol. 47 no. 2–3/2007 proposition 21 the set of all singular factors of u� is given by 0 1 0 1, ( , ) , , ,( )� � � �s x y r a nr n � � where s x y x b yr n r n( ) ( )( , ) � for all n � � and x, y come from the form of the simplest non-trivial singular factors: s r ar r( )( , ) ,1 0 0 00 0 2� � � sb b( )( , )1 1 0 10 0� , sb b( )( , )1 0 1 00 1� , sb b( )( , )1 1 1 10 1� , s0 1 0 0 00( )( , ) � . to compute the length of singular factors, it is, of course, enough to compute the length of bispecial factors since s x y br n r n( ) ( )( , ) � � 2 . for the lengths of bispecial factors, we have b b br n r n r n( ) ( ) ( )� � 0 1 where denotes the number of 0’s in br n( ) and br n( ) 1 denotes the number of 1’s in br n( ). then b b a b b b r n r n r n r n ( ) ( ) ( ) ( ) 0 1 1 0 11 1 � � � � � � � � � � � � � �� � � �� � � 1 � � � � � � � � � � for all n � 2 . 3. 
to describe the return words of singular factors, we can again make use of a relation between return words of a singular factor and return words of its image under �. proposition 22 let n � 2 , r a� �0 1, ,� . the following sets are equal w w s x y v v s x yrn rn� � � �ret ret( ( , )) ( ) ( ( , ))� 1 . proof. we will distinguish two situations. a) for x � 0 and y � 0 1, , applying observation 8 on return words of factors which are not right or left special, it follows that the set of lengths of return words of s x y x b yr n b r n b( ) ( )( , ) ( )� �0 1 01� is the same as that of �� � � �� �x b yr n( )( )�1 . thus, we get w w s x y w w s x yrn rn� � � �ret ret( ( , )) ( ( ( , )))� 1 . b) the singular factor s br n b r n b( ) ( )( , ) ( )11 10 1 0 11� �� can be extended to the left without changing the set of lengths of return words to 0 10 1 0 1 01 11 1a b r n b r nb b� � � �( ) ( ) ( ) ( )( ) ( )� �� . thus, we get w w s x y w w srn rn� � � �ret ret( ( , )) ( ( ( , )))� 0 1 11 . the statement then follows using lemma 12 and proposition 14, and, in the case of (b), also using relation (3) for return words of 0 1 11sr n � ( , ) and sr n �1 1 1( , ). now it suffices to determine return words and their lengths for the simplest singular factors s x yr ( )( , )1 . proposition 22 implies that the lengths of return words of all the other singular factors will be obtained by calculating the lengths of images of the simplest return words. here are the return words of the simplest singular factors. a) for the trivial singular factors 0, 1, one gets ret( ) ,0 0 01� and ret( ) ,1 10 10� a b . (4) b) for sr r( )( , )1 0 0 00 0� , using the proof of theorem 16, we have ret( ) , , ,00 0 0 1 0 10 1 2 0 0 1 0 10 1 22 2r a a b r r b r a b r� � � � �� � if if � � � � � � � �� � � � � a r br 2 0 0 1 1 22, if (5) c) for sb b( )( , )1 1 0 10 0� , one can easily find ret( ) ,10 0 10 10 10b a a b� . (6) d) for sb b( )( , )1 0 1 00 1� , one has ret( ) ,( ) ( )00 1 0 10 0 10 101 1 1 1b b a b b b a b� � � � � � � . (7) e) since sb b( )( , ) ( )1 1 1 10 1 1 1� � � , using relation (3), we have ret ret( ) ( ( )) ( ) , ( ) . 10 1 1 1 1 1 10 1 1 10 1 1 1 1 b a b v v� � � � � � � � � (8) f) for s0 1 0 0 00( )( , ) � , one has to consider more cases: � let b � 1 and a � 2 (a sturmian case), it holds ret( ) ,00 001 00101� . � let b � 1 and a � 2, then ret( ) ,00 0 00101� . � let b � 2, then ret( ) ,00 0 001� . 4. the last step is to compute l k ku � ( ), � �. for simplicity, let us write l(k) instead of l ku � ( ) in the sequel. before starting, let us exclude singular factors which are not essential. (let us recall that a singular factor w is essential if l w l w l w( ) ( ) ( )� � � 1 .) naturally, we will apply proposition 22 to determine the maximal return times l(w) of singular factor w. a) note that s s sb n b n b n( ) ( ) ( )( , ) ( , ) ( , )1 0 0 1 1 1� � , but from relations (6), (7), and (8), it is clear that for all n � �, one gets l s l s l sb n b n b n( ( , )) ( ( , )) ( ( , ))( ) ( ) ( )1 0 0 1 1 1� � . thus, sb n( )( , )1 0 and sb n( )( , )0 1 are not essential singular factors and one does not have to consider them in calculation of l(k), k � �. b) by analogy, if b � 2, we have s sn b n 0 1 0 0 1 1( ) ( )( , ) ( , )� � , while l s l sn b n( ( , )) ( ( , ))( ) ( )0 1 0 0 1 1� � for all n � �. hence, the singular factors s n0 1 0 0( )( , )� are not essential. c) next, if a r b� � �2 , we have s sr n b n( ) ( )( , ) ( , )0 0 1 1� , nevertheless, l s l sr n b n( ( , )) ( ( , ))( ) ( )0 0 1 1� for all n � �. 
therefore, sr n( )( , )0 0 , r b� , are not essential. 18 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 47 no. 2–3/2007 d) if 1 2� � �r b , then we obtain s sr n b n( ) ( )( , ) ( , )� �1 0 0 1 1 , but l s l sr n b n( ( , )) ( ( , ))( ) ( )� �1 0 0 1 1 for all n � �, thus, the singular factors sr n( )( , )�1 0 0 , 1 2� � �r b , are not essential. e) the last remark is that for the trivial singular factor w � 1, we have w sr� ( )( , )1 0 0 , 1 2� � �r b , but l s l wr( ( , )) ( ) ( )1 0 0 � , hence, sr ( )( , )1 0 0 , 1 2� � �r b , are not essential. the previous facts imply that in order to calculate l(k) we have to take into account only the trivial singular factors, the non-trivial singular factors of the form sb n( )( , )1 1 , and, eventually, s n0 0 0 ( )( , ) and sb n �1 0 0 ( )( , ). the formulae for l(k), k � �, will split into more cases, according to the values of a and b in the substitution (1). a) for b � 2, combining the previous facts and the description of the simplest singular factors, we obtain the following formula for l(k), k � �, � if 2 1b a� � , then l l b a( ) ( )1 1� � � �� , l b b( )� � �1 2 3 , l k n a( ) ( )� � 10 , for s k sb n b n( ) ( )( , ) ( , )1 1 0 01 1� � � � , n � �, l k n b b( ) ( )� �� 0 10 11 , for s k sb n b n � � �� �1 1 10 0 1 1( ) ( )( , ) ( , ) , n � �. � if 2b+1 < a, then l l b a( ) ( )1 1 1� � � � �� , l k n a( ) ( )� � 10 , for s k sb n b n( ) ( )( , ) ( , )1 1 1 11� � � , n � �. b) for b � 1 and 2 3� �a , then it holds for l(k), k � �, l a( )1 1� � , l( )2 5� , l k n a( ) ( )� � 10 , for s k sn n1 0 11 1 0 0( ) ( )( , ) ( , )� � � , n � �, l k n( ) ( )� � 00101 for s k sn n0 1 1 10 0 1 1( ) ( )( , ) ( , )� �� � , n � �. c) for b � 1 and a � 3, we get l l a( ) ( )1 2 1� � � , l k n a( ) ( )� � 10 , for s k sn n1 1 11 1 1 1( ) ( )( , ) ( , )� � � , n � �. having calculated the formula for l(k), k � �, we have also computed the recurrence function, since r k l k k( ) ( )� � � 1. references [1] vuillon, l.: a characterization of sturmian words by return words. eur. j. comb., vol. 22 (2001), no. 2, p. 263–275. [2] justin, j., vuillon, l.: return words in sturmian and episturmian words. theor. inform. appl., vol. 34 (2000), p. 343–356. [3] vuillon, l.: on the number of return words in infinite words with complexity. liafa research report 2000/15. [4] balková, l’., pelantová, e., steiner, w.: return words in the thue-morse and other sequences. arxiv:math/0608 603 sci., 2006. [5] cassaigne, j.: limit values of the recurrence quotient of sturmian sequences. theor. comput. sci., vol. 218 (1999), no. 1, p. 3–12. [6] cassaigne, j.: recurrence in infinite words. lecture notes in computer science, springer verlag, vol. 2010, 2001, p. 1–11. [7] balková, l’.: complexity for infinite words associated with quadratic non-simple parry numbers. journal of geometry and symmetry in physics (wgmp 2005 proceedings) vol. 7 (2006), p. 1–11. [8] balková, l’., masáková, z.: palindromic complexity of infinite words associated with non-simple parry numbers. submitted to rairo theor. inform. appl., 2006, 16 p. [9] balková, l’., pelantová, e., turek, o.: combinatorial and arithmetical properties of infinite words associated with non-simple quadratic parry numbers. to appear in rairo theor. inform. appl., 2006, 17 p. [10] queffélec, m.: substitution dynamical systems – spectral analysis. lecture notes in math., springer berlin vol. 1294, 1987. ing. 
l’ubomíra balková
phone: +420 224 358 560, e-mail: l.balkova@centrum.cz
doppler institute for mathematical physics and applied mathematics & department of mathematics
czech technical university in prague, faculty of nuclear sciences and physical engineering
trojanova 13, 120 00 praha 2, czech republic

the table of the first 20 values of l(k) and r(k) for the simplest case a = 2 and b = 1:

k:    1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
l(k): 3  5  8  8 13 13 13 21 21 21 21 21 34 34 34 34 34 34 34 34
r(k): 3  6 10 11 17 18 19 28 29 30 31 32 46 47 48 49 50 51 52 53

computer controlled switching device for deep brain stimulation
j. tauchmanová

this paper has two goals. the practical part deals with the design of a computer controlled switching device for an external stimulator for deep brain stimulation. the switching device is used during investigations with functional magnetic resonance for controlling signals leading to the deep brain stimulation (dbs) electrode in the patient’s brain. the motivation for designing this device was to improve the quality of the measured data and to enable new types of experiments. the theoretical part reports on early attempts to approach the problem of modeling and localizing the neural response of the human brain as a system identification and estimation task. a parametric identification method and real fmri data are used for modeling the hemodynamic response. the project is in cooperation with the 1st faculty of medicine, charles university in prague, and na homolce hospital in prague.

keywords: deep brain stimulation, hemodynamic response, functional magnetic resonance, identification.

1 design of the switching device
1.1 deep brain stimulation
deep brain stimulation is a curative method used for patients suffering from extrapyramidal disorders; parkinson’s disease, essential tremor and dystonia fall into this disorder category. the method influences the electrical activity of the brain and thus also the symptoms of these disorders [1]. the symptoms are shakes, spasms and stiffness. no method is able to cure extrapyramidal disorders; treatment can only suppress the symptoms and improve the quality of life. most patients use drugs, but drug treatment is unsuitable for some of them, and deep brain stimulation provides an opportunity for these patients. the brain areas that provoke the symptoms are stimulated by means of special electrodes, and the rectangular stimulation signals suppress the symptoms. the patient has to undergo surgery twice. during the first surgery, electrodes are implanted into the patient’s brain; the target of the stimulation is chosen according to the symptoms. during the second surgery, a miniature neurostimulator is implanted below the collarbone. the parameters of the rectangular stimulation signals are tuned for a period of roughly one year. the ranges of the parameters are:
– amplitude 1–4 v,
– pulse width 60–90 μs,
– frequency 130–160 hz.

fig. 1: deep brain stimulation components

1.2 switching device
between the two operations, the patient is investigated with functional magnetic resonance. during the investigation, an external neurostimulator is used, and the stimulation signals are controlled from the control computer with the help of a switching device. the block diagram in fig. 2 shows the situation during this investigation. the switching device is controlled via an optical fibre from the lpt (line printer terminal) port of the control computer. it is battery operated for the patient’s safety. it switches eight stimulation signals independently of each other, and it can also be operated manually without a control computer. the device consists of two parts. the first is placed next to the control computer; it codes the control signals from lpt to rs232 and sends the information along the optical fibre to the next room.

fig. 2: block diagram of the experiment with fmri and a dbs patient

the second part of the device is in the room with the magnetic tomograph, and switches relays on the basis of the control signals. these signals are generated according to the evseng program (electrical and visual stimulation engine), in which simple scripts can be written. this part of the device is battery operated, so its power dissipation must be low. a typical experiment with functional magnetic resonance alternates a calm period and a stimulation period. the use of the switching device will lead to new types of experiments and to better data quality. the device has been designed, tested and used at na homolce hospital. two patients have been investigated with fmri using the switching device: the first was suffering from cervical dystonia (an unnatural coiling of the neck), and the second was suffering from parkinson’s disease.

fig. 3: the switching device

2 survey of hemodynamic models
the best-known authors of articles about hemodynamics are karl friston and richard buxton, and most of their articles are available on the spm web. there are some simple linear models and also some nonlinear models, for example the balloon model and the hemodynamic model; these models are presented in [4] and [2]. the linear convolution models are given in [5] and [6]. a quite new area in hemodynamics is dynamic causal modelling (dcm). the fundamental feature of dcm is the creation of models of brain areas which have some cross interaction. karl friston has also written about dcm, for example in [7].

3 localization and modelling of hemodynamic response
3.1 the data
we acquired a great deal of data from the investigations with fmri, and matlab was used for the processing. the structure of the data is described in fig. 4. during the investigation, the patient alternated calm periods and left hand motion periods, when motion stimulation was used. the time sequence of the images is stored in the analyze format. each image maps the whole brain at an instant in time; thus, the sequence of images represents the activity of the particular brain areas as a function of time. the spm (statistical parametric mapping) toolbox was used for preprocessing: spm corrected the undesirable motion artefact (realign), interpolated the waveforms (slice timing) and suppressed noise (smoothing). each scan is characterized by a cubic matrix in the workspace. the matrices were obtained by means of two functions of the spm toolbox called spm_vol and spm_read_vols.

fig. 4: the fmri data structure

3.2 localization of the hemodynamic response
the hemodynamic response is the response to neural activity; our real fmri data show the response to left hand motion. localizing the hemodynamic response means finding the signals which agree with the typical shape, and marking the brain areas where these signals appear. the typical shape of the hemodynamic response is shown in fig. 5, and it is very similar for all people and all brain areas.

fig. 5: typical shape of the hemodynamic response
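the assumed response used in the comparison can be built by convolving the experiment’s stimulation timing with a model of the shape in fig. 5. the sketch below uses the common double-gamma approximation of the hemodynamic response; the double-gamma parameters, the repetition time and the block layout are illustrative assumptions, not values taken from this experiment.

```python
import numpy as np
from scipy.stats import gamma

TR = 2.0  # repetition time in seconds (assumed)

def hrf(t: np.ndarray) -> np.ndarray:
    """double-gamma approximation of the canonical hemodynamic response."""
    peak = gamma.pdf(t, 6)           # positive lobe peaking around 5 s
    undershoot = gamma.pdf(t, 16)    # later undershoot
    h = peak - 0.35 * undershoot
    return h / h.max()

# block design: alternating calm and motion periods (10 scans each, 64 scans)
n_scans = 64
design = np.zeros(n_scans)
for start in range(10, n_scans, 20):
    design[start:start + 10] = 1.0

# assumed response = stimulation timing convolved with the model hrf
t = np.arange(0, 32, TR)
expected = np.convolve(design, hrf(t))[:n_scans]

def deviation(voxel_ts: np.ndarray) -> float:
    """deviation score used to rank voxel time series against `expected`."""
    v = (voxel_ts - voxel_ts.mean()) / voxel_ts.std()
    e = (expected - expected.mean()) / expected.std()
    return float(np.sum((v - e) ** 2))
```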
the hemodynamic response was localized by comparing the signals from many areas of the brain with the assumed hemodynamic response signal. this signal was produced by convolving the time course of the investigation with the typical shape of the hemodynamic response. then the one hundred signals with the minimal deviation were found and marked. the localization is shown in fig. 7. the validity of the signals is verified in fig. 8 and fig. 9 on the individual slices; the localization hits slices 18, 19, 20 and 21. the real fmri data and the results of the spm toolbox processing were provided by mudr. robert jech, na homolce hospital, see fig. 9.

fig. 6: the hemodynamic response from real data
fig. 7: localization of the likeliest signals
fig. 8: the localization in the individual slices
fig. 9: localization using the spm toolbox

3.3 identification of hemodynamic response
the reason for using system theory for the processing of fmri data is that it describes a brain area as a system (for example, by a transfer function). we consider the localized hemodynamic response to be a pulse response, and we can use it for identification. the input signal was represented by the time course of the fmri experiment, and the output signal was the localized hemodynamic response. the system identification toolbox was used for the identification, and two types of models were created: the first was an arx (autoregressive model with external input) model, eq. (1), and the second was an oe (output error) model, eq. (2) [3]:

$A(d)\,y(t) = B(d)\,u(t) + e(t), \qquad (1)$
$A(d)\,y(t) = B(d)\,u(t) + A(d)\,e(t). \qquad (2)$

the results from the system identification toolbox are in the form of transfer functions. the arx model is a fourth-order system and the oe model, eq. (2), is a second-order system. the oe model describes the hemodynamic response better, and it is of a lower order. the result of the identification toolbox for the oe model is the discrete-time model

$y(t) = \frac{B(q)}{F(q)}\,u(t) + e(t),$
$B(q) = 0.6487\,q^{-1} - 0.1469\,q^{-2},$
$F(q) = 1 - 0.05194\,q^{-1} - 0.08011\,q^{-2}.$

the identification results show that the methods of system theory can be used for fmri data processing. however, for a more accurate identification we would need a higher sampling frequency; in our experiment we had only 64 samples available for the identification. to acquire more samples, we would need to increase the sampling frequency or use a more time-optimal experiment.

4 the next stage of the project
i hope to be able to continue cooperating with the 1st faculty of medicine. the first goal of this cooperation will be to provide technical support during investigations with fmri on patients with dbs. it would be interesting to attempt online data processing during an investigation with functional magnetic resonance. it would also be interesting to design measurements which would be time-optimal, so that the measured data would have a better time resolution. a further goal will be to create a model of the whole brain, including the interrelationships among the brain areas; this is known as dynamic causal modeling.
5 conclusion
a computer-controlled switching device for deep brain stimulation was described in this paper. the curative method of deep brain stimulation was introduced, together with its use for extrapyramidal disorders. the second part of the paper dealt with localizing and modeling the hemodynamic response using the methods of system theory. finally, some future extensions of the research collaboration were suggested.

acknowledgment
this work has been supported by the ministry of education of the czech republic under research program no. msm6840770038.

references
[1] jech, r., růžička, e., urgošík, d.: stereotaktická funkční neurochirurgie u extrapyramidových pohybových poruch. časopis sanquis, 37/2005.
[2] friston, k. j., glaser, d. e., mechelli, a., turner, r., price, c. j.: hemodynamic modeling. human brain function, 1998.
[3] havlena, v.: odhadování a filtrace. praha: čvut, 2002.
[4] buxton, r. b., wong, e. g., frank, l. r.: dynamics of blood flow and oxygenation changes during brain activation: the balloon model. magnetic resonance in medicine, 1998.
[5] boynton, g. m., engel, s. a., glover, g. h., heeger, d. j.: linear systems analysis of functional magnetic resonance imaging in human v1. the journal of neuroscience, 1996.
[6] kiebel, s., holmes, a. m.: the general linear model. human brain function, 1998.
[7] friston, k.: dynamic causal models. human brain function, 1998.

ing. jana tauchmanová
e-mail: tauchj1@fel.cvut.cz
department of control engineering
czech technical university in prague, faculty of electrical engineering
karlovo náměstí 13, 121 35 praha 2, czech republic

fig. 10: the output error model of the hemodynamic response

in [1], the authors study matrices of morphisms preserving the family of words coding 3-interval exchange transformations. it is well known [2–4] that matrices of morphisms preserving sturmian words (i.e. words coding 2-interval exchange transformations with the maximal possible subword complexity) form the monoid

$\{ M \in \mathbb{Z}^{2\times2} \mid M E_2 M^T = \pm E_2 \} = \{ M \in \mathbb{Z}^{2\times2} \mid \det M = \pm 1 \}, \quad \text{where } E_2 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}.$

the result of [1] states that, in the case of 3-interval exchange transformations, the matrices preserving words coding these transformations and having the maximal possible subword complexity belong to the monoid

$\{ M \in \mathbb{Z}^{3\times3} \mid M E M^T = \pm E \text{ and } \det M = \pm 1 \}, \quad \text{where } E = \begin{pmatrix} 0 & 1 & 1 \\ -1 & 0 & 1 \\ -1 & -1 & 0 \end{pmatrix}.$

we say that a matrix fulfilling the first condition has the so-called meme property.

definition 1. let $M \in \mathbb{Z}^{3\times3}$. m is said to have the meme property if $M E M^T = \pm E$, where e is the matrix above.

the aim of this paper is to provide a stand-alone result connected with this property; strictly speaking, we prove another algebraic characterization of matrices having this property. before we give the above-mentioned characterization, we need to prove the following technical lemma.

lemma 2. let $M = (m_{ij})_{1 \le i, j \le 3} \in \mathbb{Z}^{3\times3}$ be a matrix. then m has the meme property if and only if there exists $\varepsilon \in \{-1, 1\}$ such that

$\det\begin{pmatrix} 1 & -1 & 1 \\ m_{21} & m_{22} & m_{23} \\ m_{31} & m_{32} & m_{33} \end{pmatrix} = \varepsilon, \qquad (1)$

$\det\begin{pmatrix} m_{11} & m_{12} & m_{13} \\ 1 & -1 & 1 \\ m_{31} & m_{32} & m_{33} \end{pmatrix} = -\varepsilon, \qquad (2)$

$\det\begin{pmatrix} m_{11} & m_{12} & m_{13} \\ m_{21} & m_{22} & m_{23} \\ 1 & -1 & 1 \end{pmatrix} = \varepsilon. \qquad (3)$

proof. let us denote $K := M E M^T$. the transpose of k is $K^T = M E^T M^T = -K$, and hence k is an anti-symmetric matrix. to investigate equalities of anti-symmetric matrices, it suffices to consider the elements $k_{12}$, $k_{13}$, $k_{23}$.
let us compute these relevant elements of k. since k is anti-symmetric,

$K = \begin{pmatrix} 0 & k_{12} & k_{13} \\ -k_{12} & 0 & k_{23} \\ -k_{13} & -k_{23} & 0 \end{pmatrix} = M E M^T,$

and a direct computation gives

$k_{12} = -(m_{12}+m_{13})m_{21} + (m_{11}-m_{13})m_{22} + (m_{11}+m_{12})m_{23},$
$k_{13} = -(m_{12}+m_{13})m_{31} + (m_{11}-m_{13})m_{32} + (m_{11}+m_{12})m_{33},$
$k_{23} = -(m_{22}+m_{23})m_{31} + (m_{21}-m_{23})m_{32} + (m_{21}+m_{22})m_{33}.$

by the definition of the determinant, $k_{23}$ is equal to the left-hand side of (1), $-k_{13}$ is equal to the left-hand side of (2), and $k_{12}$ is equal to the left-hand side of (3). this implies $K = \varepsilon E$ if and only if equations (1)–(3) hold. □

now we can prove the two main theorems – one providing the desired characterization in the regular case and the other in the singular case.

matrices associated to 3-interval exchange transformation and their spectra
p. ambrož

a three by three integer matrix m is said to have the meme property if $M E M^T = \pm E$, where

$E = \begin{pmatrix} 0 & 1 & 1 \\ -1 & 0 & 1 \\ -1 & -1 & 0 \end{pmatrix}.$

we characterize such matrices in terms of their spectra.

keywords: integer matrix, meme property, spectrum.
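before turning to the two theorems, here is a small numeric sanity check of the lemma, using the matrix e reconstructed above. the singular example matrix is ours, chosen only to illustrate the statements that follow.

```python
import numpy as np

E = np.array([[0, 1, 1],
              [-1, 0, 1],
              [-1, -1, 0]])

def has_meme(M: np.ndarray) -> bool:
    """check the meme property M E M^T = +E or -E."""
    K = M @ E @ M.T
    return np.array_equal(K, E) or np.array_equal(K, -E)

# a singular integer matrix with meme: its rows satisfy m2 = m1 + m3,
# i.e. (1, -1, 1) M = 0
M = np.array([[1, 0, 0],
              [1, 0, 1],
              [0, 0, 1]])

print(has_meme(M))                    # True
print(round(np.linalg.det(M), 10))    # 0.0 -> the singular case
print(np.linalg.eigvals(M))           # 0, 1, 1: the nonzero eigenvalues multiply to +-1
print(np.array([1, -1, 1]) @ M)       # [0 0 0]: (1, -1, 1) is a left eigenvector to 0
```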
obviously, x � �( , , )1 1 1 . concerning the other eigenvalues of m, since the equality (4) for a singular m has the following form ( , , )1 1 1 01 2 3� � � � �� � �m m m m (7) the characteristic polynomial � �m( ) will be given by the matrix ( )m i� � � � � � �� � � m m m m m m m m m m m m 11 12 13 11 31 12 32 13 33 31 32 33 � � � � � � � �� . computing its determinant, the linear component of � �m( ) is � � � � � � � � � � � � m m m m m m m m m m m m 33 12 11 33 11 32 13 32 13 31 12 31, which is exactly the left hand side of (2), and hence it is equal to �1. on the other hand, since the characteristic polynomial can be written in the form � � � � � � � � � � � � � � � � � m( ) ( )( )( ) ( ) ( � � � � � � � � � � 1 2 3 3 1 2 3 2 1 2 2 3 1 3 1 2 3� �� � � � �) ,x and, moreover, �1 0� �det m , the linear component of � �m( ) is also equal to ( )� �2 3 x . this implies � �2 3 1� � . �: due to the row dependency (7) we have for the left hand side of (1) det det 1 1 1 1 1 1 21 22 23 31 32 33 11 31 � � � � � � � � � � � m m m m m m m m m m m m m m m m m m 12 32 13 33 31 32 33 11 12 13 1 1 1 � � � � � � � � � � � det m m m m m m m m m31 32 33 11 12 13 31 32 33 1 1 1 � � � � � � � � � � � �det � � � � � , and similarly for (3) det det m m m m m m m m m m 11 12 13 21 22 23 11 12 13 1 1 1 1� � � � � � � � � 1 31 12 32 13 33 11 12 13 31 1 1 1 � � � � � � � � � � � � m m m m m m m m m mdet 32 33 11 12 13 31 32 331 1 1 1 1 1m m m m m m m� � � � � � � � � � � � �det � � � � � . therefore the conditions (1)–(3) are equivalent for singular matrices. using the same argument concerning the linear component of � �m( ) as in the previous part of the proof one can show that the conditions � �� � � � �( ) , , ,m � � �0 12 3 2 3and imply the equality (2) and hence m has meme. � in addition to the above stated algebraic characterizations of matrices having the meme property, there is also a nice relation between the singular and the non-singular case. let © czech technical university publishing house http://ctn.cvut.cz/ap/ 69 acta polytechnica vol. 47 no. 2–3/2007 m � ��3 3 be a singular matrix having meme and let us consider a matrix m, given by m m: ,� � � � � � � � � � � k 0 0 0 1 d 1 1 0 0 0 � ��� ��� where k � �. since de ed� �t 0 , we have mem m d e m d mem dem med ded mem t t t t t t t t k k k k k � � � � � � � � � � ( ) ( ) e , hence also the regular matrix m has meme. equivalently, one can say that for each singular matrix m having meme there exists a regular matrix m having meme such that their first and third row coincide, that is, m m� � � � � � � � 1 0 0 1 0 1 0 0 1 . acknowledgment the author acknowledges financial support from the grant agency of the czech republic gačr 201/05/0169, and from ministry of education, youth, and sports of the czech republic grant lc 06002. references [1] ambrož, p., masáková, z., pelantová, e.: matrices of 3iet preserving morphisms. submitted to theor. comp. sci., 2007. [2] berstel, j., séébold, p.: morphismes de sturm. bull. belg. math. soc. simon stevin, vol. 1 (1994), p. 175–189. [3] mignosi, f., séébold, p.: morphismes sturmiens et r gles de rauzy. j. théor. nombres bordeaux, vol. 5 (1993), p. 221–233. [4] séébold, p.: fibonacci morphisms and sturmian words. theoret. comput. sci., vol. 88 (1991), p. 365–384. ing. petr ambrož, ph.d. 
petr.ambroz@fjfi.cvut.cz department of mathematics faculty of nuclear sciences and physical engineering trojanova 13 120 00 praha 2, czech republic 70 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 47 no. 2–3/2007 ap05_5.vp 1 introduction numeration systems with an irrational base � give us an opportunity to perform exact arithmetic operations with irrational numbers – a tool necessary for the new methods of building aperiodic random number generators, for new cryptographic methods and also for the mathematical modelling of recently discovered materials with a long range order – so called quasicrystals. the whole field of irrational numeration originates from the article of a. rényi [1], in which he proved that for each real base �>1 and for every positive real number x, there exists a unique representation of the number x in the numeration system with the base �. however, contrary to the usual numeration system with an integer base (such as binary or decimal numeration systems), there are several strange and unsteady (i.e. depending on the nature of the base �) phenomena, which have to be examined and described before we are able to employ these systems. the work is organized as follows. in the first part, there is a survey of the already classical �-expansions of real numbers. we recall basic definitions and fundamentals; then we discuss two key problems of these numeration systems – the so-called finiteness property and the problem of fractional parts arising under addition and multiplication of integers. in the second part, we study a slightly different approach to irrational representations of real numbers, which is called �-adic expansions. since this concept has been much less studied in the past, we focus on one particular irrationality (namely the golden mean �), rather than trying to explain everything in general. after giving necessary definitions we discuss the relations of these �-adic expansions with integers and rationals, and we also study an analogue to the finiteness problem for �-expansions. 2 beta-numeration systems let �>1 be a fixed real number. a representation in base � (or simply �-representation) of a real number x � �r is an infinite sequence ( )xi n i� ���, xi � z such that x x x x xn n n n� � � � � �� � � �� � � �1 1 0 0 1 1 � � for some n � z. a particular �-representation with xi i i n n� � ��� � 1 for all �� n n is called �-expansion, see [1]. the greedy algorithm computes the digits of the �-expansions of a real number x: let [ ]x and { }x denote the integer and the fractional part of x, respectively. find n � z such that � �n nx� �1. put x xn n� [ ]� and r xn n� [ ]� . for i n n� � �1 2, ,� put x ri i� �[ ]� 1 and r ri i� �[ ]� 1 . the �-expansion can be seen as an analogue to the ordinary expansion in a system with an integer base (such as a decimal or binary system), we usually use the natural notation x x x x xn n� � �� �1 0 1� � and we say that the coefficients x xn� 0 form the integer part, whereas the coefficients x x� �1 2� form the fractional part of the �-expansion of a number x. a sequence of integer coefficients that corresponds to some �-expansion is sometimes called admissible in the �-numeration system. a sequence (or string) of integer coefficients that is not admissible is called forbidden. for the characterization of an admissible sequence one needs to introduce the so-called rényi expansion of 1, d t t t�( ):1 1 2 3� �, where t1 � [ ]� and tn n n � � � 2 is the �-expansion of1 1� t �. 
the �-expansions (or �-admissible sequences) are then characterized by the parry condition. theorem 1.1: (parry [2]). let �>1 be a real number. let d t t t�( ):1 1 2 3� � be the rényi expansion of one. the sequence ( )xi n i m� � with xi �{ , , , [ ]}0 1 � � is a �-expansion of some x � 0 if and only if x x xn p n p m� � �1� is lexicographically strictly smaller (denoted by the symbol �) than d�( )1 for all 0 � � �p n m. the set of those x �r for which the �-expansion of x has only a finite number of non-zero fractional coefficients is denoted by fin( )� . if the �-expansion of a real number x is of the form x xi i i n � � �0 , i.e. there is no fractional part, the number x is said to be a �-integer. the set of all �-integers is denoted by z�. 2.1 finiteness condition in general, the set fin( )� does not have to be closed under arithmetic operations, and there is no criterion known which would decide whether fin( )� is a ring. the �-numeration systems for which fin( )� is a ring are equally characterized by the so-called property (f) (f) z [ ] ( )� �� �1 fin , 24 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 45 no. 5/2005 czech technical university in prague non-standard numeration systems p. ambrož we study some properties of non-standard numeration systems with an irrational base � >1, based on the so-called beta-expansions of real numbers [1]. we discuss two important properties of these systems, namely the finiteness property, stating whether the set of finite expansions in a given system forms a ring, and then the problem of fractional digits arising under arithmetic operations with integers in a given system. then we introduce another way of irrational representation of numbers, slightly different from classical beta-expansions. here we restrict ourselves to one irrational base – the golden mean � – and we study the finiteness property again. keywords: numeration system, beta expansion, tau-adic expansion. where z [ ]� denotes the ring of polynomials with integer coefficients in �. it is known [3] that the condition (f) implies that � is a pisot number (an algebraic integer with all its algebraic conjugates in modulus smaller than one). several authors have found partial answers to the finiteness question by giving some sufficient conditions on the minimal polynomial of �. theorem 2.1: (frougny, solomyak [3]). let � be a positive root of the polynomial m x x a x am m m( ) � � � � � 1 1 � , ai � z and a a am1 2 0� � � �� . then � is a pisot number and property (f) holds for �. theorem 2.2: (hollander [4]). let � be an algebraic integer with the minimal polynomial m x x a x am m m( ) � � � � � 1 1 � , ai � z, ai � 0 and a a a am1 2 3� � � �� . then property (f) holds for �. akiyama [5] proved a necessary and sufficient condition, however it is unfortunately quite vague. theorem 2.3: (akiyama [5]). let � be a pisot number of the degree m. then � has property (f) if and only if every element of x x x x j mj j � � � � � � � �� z[ ]| ,| [ ] , , ,( ) ( ) ( ) � � � 0 1 1 2 31 for � � � � �� has a finite �-expansion in base �. here x i( ) with i m�1 2, , ,� are the conjugates of x �q( )� in the algebraic field. ambrož et al. [6] gave a partial characterization of numbers � fulfilling property (f) in terms of minimal forbidden strings. definition 2.4: let � > 1. a forbidden string ukuk�1…u0 of non-negative integers is called minimal, if � uk�1…u0 and uk…u1 are admissible, � ui �1 implies uk…ui�1(ui�1) ui�1…u0 is admissible, for all i k� 0 1, , ,� . 
the conditions based on the above-defined notion of minimal forbidden strings were given by the following two propositions. proposition 2.5: (property (t)). if fin( )� is closed under addition of two positive numbers, then � must satisfy the following property: for every minimal forbidden string ukuk�1…u0 there exists a finite sequence vnvn�1…v of non-negative integers, such that � k n, � � , � v v u u un n k k� � �� � � � � �� �� 1 0 , � v v v u un n k n k � � 1 000 0� � ��� �� ( )times . theorem 2.6: (ambrož et al. [6]). let � > 1 satisfy property (t), and suppose that for every minimal forbidden string ukuk�1…u0 we have the following condition: if vnvn�1…v is the lexicographically greater string of (t) corresponding to ukuk�1…u0 then v v v u u un n k k� � � � � � �� �1 1 0� �� . then fin( )� is closed under addition of positive elements. moreover, for every positive x y, ( )�fin � , the �-expansion of x y� can be obtained from any �-representation of x y� using finitely many transcriptions. finally, akiyama gave an algebraic characterization of cubic pisot numbers satisfying property (f). theorem 2.7: (akiyama [7]) let � be a cubic pisot number. � has property (f) if and only if � is a root of the following polynomial with integer coefficients x ax bx3 2 1� � � where a � 0 and � � � �1 1b a . 2.2 number of fractional digits the second question concerning arithmetics in �-numeration systems is connected to the fact that for non-integer � the set z � is not closed under arithmetic operations. however, sometimes it is true that the result of addition or multiplication of two �-integers has only a finite fractional part. then it is interesting to try to estimate the maximal length of such fractional part. more precisely, the task is as follows. for given � find the value, or at least some good estimate, of the quantities l n x y x y x y n� �� � � � � � � � �( ): min{ | , , ( ) },� � �� �n z zfin l n x y xy xy n� �� � � � � � �( ): min{ | , , ( ) }� � �� �n z zfin . two methods for estimation of l�( )� , l�( )� are known. the first of them uses the so-called meyer property of the set of �-integers for � pisot number, namely that z z z� � �� � � f where f is a finite set, and z z z� � �� � g where g is a finite set. this method is used in [6] to find values of l�( )� , l�( )� for � the pisot number, solution of the equation x x x3 225 15 2� � � . the second, much more widely used method for estimation of l�( )� , l�( )� is based on the following theorem. several version of this method are employed in [6, 8, 9, 10, 11, 12]. theorem 2.8: (guimond et al. [10]). let � > 1 be an algebraic number, and let �� be its algebraic conjugate. for z �q( )� we denote by �z the image of z under the field isomorphism � �: ( ) ( )q q� � . if h z z z: sup{ | }� � � ��� , k z z z z: inf{ | \ }� � � �� �� 0, then 1 2 � � � � � � � � � � � l h k and 1 2 � � � � � � � � � � � l h k . some known results on the values of l�( )� , l�( )� � knuth [13]: l� �( )� 2, � root of x x 2 1� � . � burdík et al. [14] l l� �� �( ) ( )� � 1 for � root of equation x mx 2 1� � , l l� �� �( ) ( )� � 2 for � root of equation x mx 2 1� � . � guimond et al. [10] for � root the equation x mx n2 � � , m n� © czech technical university publishing house http://ctn.cvut.cz/ap/ 25 czech technical university in prague acta polytechnica vol. 45 no. 5/2005 2 1 1 2 1 m m n l m m n � � � ! " #" � � � � $ % "" �( )� , l l m� �� �( ) ( ) log ( )� �4 22 . 
• messaoudi [15]: $L_\otimes(\tau) \le 6$ for τ the root of the equation $x^3 = x^2 + x + 1$, the so-called tribonacci number.
• ambrož et al. [6]: for τ the tribonacci number, $5 \le L_\oplus(\tau) \le 6$ and $4 \le L_\otimes(\tau) \le 6$; and $L_\oplus(\beta) = 5$, $L_\otimes(\beta) = 7$ for β the root of the equation $x^3 = 25x^2 + 15x + 2$.
• bernat [9] gave an exact value for the tribonacci addition, $L_\oplus(\tau) = 5$.
• ambrož et al. [8]: for β the root of the equation $x^3 = m x^2 + x + 1$, $5 \le L_\oplus(\beta) \le 6$ for $m = 2$, $4 \le L_\oplus(\beta) \le 5$ for $m \ge 3$, and $4 \le L_\otimes(\beta) \le 6$ for $m \ge 2$.

3 tau-adic numeration system
let τ be the golden mean, i.e. the pisot number which is the root of the equation $x^2 = x + 1$, and let τ′ be its algebraic conjugate. a τ-adic representation of a real number $x \in \mathbb{R}$ is a left-infinite sequence $(d_i)_{i \ge -n}$, $d_i \in \mathbb{N}$, $n \in \mathbb{Z}$, such that
$$x = \sum_{i=-n}^{\infty} d_i (\tau')^i.$$
it is denoted $\langle x \rangle_{\tau'} = \cdots d_3 d_2 d_1 d_0 \bullet d_{-1} \cdots d_{-n}$. the value of a τ-adic representation is obtained by the function π given by
$$\pi\big((d_i)_{i \ge -n}\big) := \sum_{i=-n}^{\infty} d_i (\tau')^i.$$
if all finite factors of the sequence $(d_i)_{i \ge -n}$ are admissible in the τ-numeration system, the sequence $(d_i)_{i \ge -n}$ is said to be the τ-adic expansion of the number x.

3.1 basic properties of τ-adic expansions
we know [3] that $\mathrm{fin}(\tau) = \mathbb{Z}[\tau] \cap [0, \infty)$, where $\mathbb{Z}[\tau] = \{a + b\tau \mid a, b \in \mathbb{Z}\}$. hence for any non-negative integer z there exist $m, n \in \mathbb{Z}$, $m \le n$, such that $z = \sum_{i=m}^{n} z_i \tau^i$, and thus, applying the field conjugation, also $z = \sum_{i=m}^{n} z_i (\tau')^i$; i.e. for each non-negative integer z the τ-adic expansion $\langle z \rangle_{\tau'}$ is a finite word with some possible fractional part. for negative integers we have the following proposition.

proposition 3.1 (ambrož [16]). let $z \in \mathbb{Z}$ be a negative integer. its τ-adic expansion $\langle z \rangle_{\tau'}$ is a left-infinite, eventually periodic word of the form $\langle z \rangle_{\tau'} = (10)^\omega v$, where $v = \langle x \rangle_\tau$ for some $x \in \mathrm{fin}(\tau)$.

moreover, in [16] an algorithm is given for computing the τ-adic expansions of negative integers. since most of the τ-adic expansions that we are dealing with are left-infinite, eventually periodic, we define the following two sets of numbers:
$$\mathrm{iep}(\tau') := \{ x \in \mathbb{R} \mid \langle x \rangle_{\tau'} = (d_{k+p} \cdots d_{k+1})^\omega\, d_k \cdots d_1 d_0 \},$$
$$\mathrm{fep}(\tau') := \{ x \in \mathbb{R} \mid \langle x \rangle_{\tau'} = (d_{k+p} \cdots d_{k+1})^\omega\, d_k \cdots d_1 d_0 \bullet d_{-1} \cdots d_{-m} \};$$
indeed, $\mathbb{Z} \subset \mathrm{fep}(\tau')$. concerning rational numbers, there is also an algorithm for computing their τ-adic expansions [16], which was used to prove that every rational number has a τ-adic expansion, eventually periodic to the left; more precisely, $\mathbb{Q} \cap (-1, 1] \subset \mathrm{iep}(\tau')$ and $\mathbb{Q} \subset \mathrm{fep}(\tau')$.

3.2 finiteness condition
in accordance with the usual β-expansions, it is interesting to inspect the properties of the set $\mathrm{fep}(\tau')$, which can be seen as a τ-adic analogue of the set $\mathrm{fin}(\beta)$. in [17] there is a proof that the set $\mathrm{fep}(\tau')$ is closed under addition. the proof is of a constructive type, consisting of building a right-to-left transducer performing the addition which preserves the periodicity of its input. once we know that $\mathrm{fep}(\tau')$ is closed under addition of positive elements, it is quite easy to prove that it is a ring. finally, a relation has been shown between eventually periodic τ-adic expansions and the elements of the field $\mathbb{Q}(\tau)$.

theorem 3.2 (ambrož [17]). the field $\mathbb{Q}(\tau')$ (and therefore also the field $\mathbb{Q}(\tau)$) is equal to the set $\mathrm{fep}(\tau')$ of all real numbers having an eventually periodic τ-adic expansion with a finite fractional part.
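as a quick illustration of the conjugation trick behind these finite τ-adic expansions of non-negative integers, here is a small python sketch of our own (it is not the algorithm of [16]): it computes the greedy τ-expansion of an integer and re-evaluates the same digit string at the conjugate τ′; the float tolerance is an implementation assumption.

```python
# sketch: a non-negative integer z has a finite tau-expansion
# z = sum z_i tau^i (property (f) holds for the golden mean); the galois
# conjugation tau -> tau' fixes z, so the same digits evaluated at tau'
# give the finite tau-adic expansion of z.
TAU = (1 + 5 ** 0.5) / 2
TAU_C = (1 - 5 ** 0.5) / 2          # conjugate tau', |tau'| < 1

def greedy_tau_digits(z, lo=-16, hi=16):
    """digits z_i, i = hi..lo, of the greedy tau-expansion of z >= 0."""
    digits, x = {}, float(z)
    for i in range(hi, lo - 1, -1):
        d = int(x / TAU ** i + 1e-9)    # float guard
        if d:
            digits[i] = d
            x -= d * TAU ** i
    return digits

z = 7
digs = greedy_tau_digits(z)
value_at_conjugate = sum(d * TAU_C ** i for i, d in digs.items())
print(sorted(digs.items(), reverse=True))   # 7 = tau^4 + tau^-4
print(round(value_at_conjugate, 9))          # 7.0 -- same digits work at tau'
```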
4 future perspectives
there are several objectives that can be pursued in the future. concerning the already classical β-expansions, there is always the challenging problem of the algebraic characterization of the numbers β satisfying property (f). moreover, there is a lot of work to be done in the search for the number of fractional digits arising under arithmetic operations. finally, there is the open question [6] of finding arithmetic algorithms working with infinite, but periodic, β-expansions. a partial answer to the last question was recently given by the author in [18].

concerning the second notion of irrational representation – the τ-adic expansions – there are also several possible directions of future research. it seems quite obvious that broadening the τ-adic expansions to β-adic expansions, for some class of β, will only be a question of solving some technical obstacles; indeed, the classes of β fulfilling property (f) are suitable candidates. on the other hand, the proof of the finiteness property for the τ-adic expansions is deeply connected with the nature of the system itself, and its generalization to some other irrationalities will be difficult or even impossible to perform without changing the approach.

acknowledgment
the author acknowledges financial support from the grant agency of the czech republic gačr 201/05/0169.

references
[1] rényi, a.: "representation for real numbers and their ergodic properties." acta math. acad. sci. hungar., vol. 8 (1957), p. 477–493.
[2] parry, w.: "on the β-expansions of real numbers." acta math. acad. sci. hungar., vol. 11 (1960), p. 401–416.
[3] frougny, c., solomyak, b.: "finite beta expansions." ergod. th. and dynam. sys., vol. 12 (1992), p. 713–723.
[4] hollander, m.: "linear numeration systems, finite beta-expansions, and discrete spectrum of substitution dynamical systems." phd thesis, washington university, 1996.
[5] akiyama, s.: "pisot numbers and greedy algorithm." in: number theory (eger, 1996). berlin: de gruyter, 1998, p. 9–21.
[6] ambrož, p., frougny, c., masáková, z., pelantová, e.: "arithmetics on number systems with irrational bases." bull. belg. math. soc. simon stevin, vol. 10 (2003), p. 641–659.
[7] akiyama, s.: "cubic pisot units with finite beta expansions." in: algebraic number theory and diophantine analysis (graz, 1998). berlin: de gruyter, 2000, p. 11–26.
[8] ambrož, p., masáková, z., pelantová, e.: "addition and multiplication of beta-integers in generalized tribonacci base." to appear in j. autom. lang. comb.
[9] bernat, j.: computation of $L_\oplus$ for pisot numbers. preprint, 2004.
[10] guimond, l.-s., masáková, z., pelantová, e.: "arithmetics on beta-expansions." acta arith., vol. 112 (2004), p. 23–40.
[11] messaoudi, a.: "généralisation de la multiplication de fibonacci." math. slovaca, vol. 50 (2000), p. 135–148.
[12] messaoudi, a.: "frontière du fractal de rauzy et système de numération complexe." acta arith., vol. 95 (2000), p. 195–224.
[13] knuth, d. e.: "fibonacci multiplication." appl. math. lett., vol. 1 (1988), p. 57–60.
[14] burdík, č., frougny, c., gazeau, j.-p., krejcar, r.: "beta-integers as natural counting system for quasicrystals." j. phys. a, vol. 31 (1998), p. 6449–6472.
[15] messaoudi, a.: "tribonacci multiplication." appl. math. lett., vol. 15 (2002), p. 981–985.
[16] ambrož, p.: "on the tau-adic expansions of real numbers." in: words 2005, 5th international conference on words, actes, s. brlek and c. reutenauer (eds.). publications du lacim, vol. 36 (2005), p. 79–89.
[17] ambrož, p.: addition of eventually periodic tau-adic expansions. preprint, 2005.
[18] ambrož, p.: addition in pisot numeration systems. preprint, 2004.

ing. petr ambrož
phone: +420 224 358 564
email: ampy@linux.fjfi.cvut.cz
department of mathematics
faculty of nuclear sciences and physical engineering
czech technical university in prague
trojanova 13, 120 00 praha 2, czech republic

acta polytechnica vol. 46 no. 2/2006
system for 3d visualization of flaws for eddy current inspection
a. dočekal, m. kreidl, r. šmíd

this paper presents a novel method for 3d visualization of flaws detected during eddy current (ec) inspection. the ec data was acquired using an automated scanning system equipped with precise eddy current probe positioning. the method was tested on a single frequency instrument with an absolute probe. the ec inspection procedure is implemented statically, by registering the operating point of the instrument at each equidistant point on a tested object. the paper describes a data processing method based on the fourier transform enabling 3d visualization of flaws. this three-dimensional image of the result of a scan enables the position of flaws to be determined, and the size and bevel (angle to the surface) of each detected flaw to be estimated. this research investigated flaws rising from the surface of the tested object, and flaw depth was not evaluated in this work. this method of visualization is simple to implement and is currently targeted for application in ec scanning devices.

keywords: eddy current (ec), visualization, modified fourier descriptors.

1 introduction
the eddy current (ec) method is a well-established technique for flaw detection and sizing in many application areas of non-destructive testing (ndt), e.g. in aircraft maintenance. a typical task is the inspection for cracks close to the surface of the tested material. an automated scanning system using motorized eddy current (ec) probe positioning is often applied for testing objects with large surfaces. there are several important benefits of enabling users to see 3d images of a surface containing detected flaws (an ec scan). in particular, a three-dimensional image of the result of a scan enables personnel to determine the approximate position of flaws and to estimate the size (length) and bevel of each detected flaw immediately (flaw depth is not evaluated here).

eddy current phenomena can be modeled by nonlinear three-dimensional partial differential equations with complicated boundary conditions. however, modeling-based analysis methods are difficult to apply for test data evaluation, so in automated analysis systems ec signatures, i.e. the trajectory of the instrument-produced signal in the complex plane, are usually considered [1]. the complex signal s(n) consists of an in-phase (horizontal) component h(n) and a quadrature (vertical) component v(n). modeling of flaws and 3d visualization can be done faithfully by simulating the electromagnetic field with a 3d finite element method [2]. since 3d finite element calculations are very time consuming and this kind of method requires high-powered equipment for calculation and calibration, it may be interesting to introduce a less sophisticated but accurate method that requires fewer computer resources.

this paper is organized as shown in the block diagram of the ec scanning system, see fig. 1. section 2 describes the data acquisition stage and the appropriate equipment configuration. section 3 focuses on the main data processing methods (pre-processing and feature extraction). section 4 outlines the visualization that is used. finally, section 5 contains conclusions.

fig. 1: block diagram of an automated ec scanning system (ec probe and probe positioning, data acquisition, pre-processing, feature extraction, visualization; results: flaw position, size and bevel)
2 data acquisition
the ec scanning system consists of two independent components. a single frequency instrument with an absolute probe forms the main component. an absolutely shielded ec probe with an active diameter of approximately 2 mm was used. the probe was operated at a frequency of 200 khz to achieve a sufficient penetration depth and sufficient resolution. the other part was formed by a precise ec probe positioning system. the positioning system achieved a resolution of 0.125 mm in the probe movement. a probe shift step of 0.3 mm was used for material testing.

the two instruments did not enable on-the-fly scanning. hence the scanning process had to be synchronized by the data acquisition software, and the scan procedure was implemented statically. the surface of the scanned object was tested at equidistant points in order to ensure data synchronization. the ec signal value — more precisely, the mean of the signal samples of the ec tester — was registered at each equidistant point. a matrix formed of ec signal values resulted from the data acquisition process, see fig. 2. each row and each column of this matrix represents one ec signature describing an ec inspection for an assigned direction and probe shift, as shown in fig. 2.

fig. 2: matrix representing the data acquired during the test, with the corresponding ec signatures h(n), v(n) along the two scan directions
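the static, point-by-point scan procedure just described can be summarized by a short python sketch; it is only an illustration of the acquisition loop, and the driver calls move_probe() and read_complex_samples() are hypothetical stand-ins for the positioning system and the ec tester.

```python
# illustrative acquisition loop (not the authors' software): scan a grid
# with a fixed probe step and store the mean complex ec signal per point.
import numpy as np

def move_probe(x_mm, y_mm):          # hypothetical positioning-system call
    pass

def read_complex_samples(n=64):      # hypothetical ec-tester call, h + j*v
    return np.zeros(n, dtype=complex)

def scan_surface(nx, ny, step_mm=0.3):
    """return an (ny, nx) matrix of mean operating points s = h + j*v."""
    s = np.zeros((ny, nx), dtype=complex)
    for iy in range(ny):
        for ix in range(nx):
            move_probe(ix * step_mm, iy * step_mm)      # static positioning
            s[iy, ix] = read_complex_samples().mean()   # registered value
    return s

matrix = scan_surface(nx=40, ny=40)   # rows/columns are ec signatures
```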
3 signal processing
the main aim of the visualization method is to provide personnel with information about the occurrence of a flaw on the surface, and to estimate the position, surface orientation, size (depth) and bevel (angle to the surface) of each detected flaw. information about the position and surface orientation of a flaw can be obtained approximately by thresholding the absolute value of the complex ec signal, or more precisely the absolute difference from the reference value. in this way, the orientation of the flaw surface convolved with the spatial response of the used ec probe is obtained. the position and surface orientation of a flaw can be given more accurately by applying an inverse filtering procedure, as described in [4]. to estimate the uncertainty carried in the simple signal thresholding, the ec probe response to a reference orthogonal flaw with sufficient surface orientation and size 1.5 mm on the tested material was measured, see fig. 3. in the measured case the uncertainty was estimated at 0.9 mm (k = 2).

fig. 3: spatial response of the used ec probe to the reference flaw: measured data and fitted gauss curve, |s(n)| [a.u.] vs. ec probe shift [mm] (a.u. – arbitrary unit)

the shape (size and bevel) of a flaw is determined from the shapes of the appropriate ec signatures of the flaw in each related row and column of the matrix. these ec signatures correspond to standard testing in the relevant direction of the probe movement (horizontal or vertical) and the probe offset to the initial position, shown in fig. 2. the ec signature shape is described using the fourier transform, as described under feature extraction, see section 3.2. the description of a flaw at each tested point calculated from the ec signatures can be represented as a 3d vector. the flaws are then visualized by forming a flaw surface as an approximation of the 3d vector endpoints.

3.1 pre-processing
a number of pre-processing steps were carried out to reduce the impact of disturbances, especially outliers caused by noise. the ec signature and the corresponding signal were filtered by a non-linear filtration method called median signature tracking. this method is based on median filtering of an ec signature in the complex plane. the curvature of the ec signature is tracked using the median computed from the angles of the corresponding samples s(n). the median signature tracking procedure has been described in [5].

3.2 feature extraction
to extract shape-describing features from the ec signatures, we used a method based on the fourier transform. standard fourier descriptors are based on expanding the signature shape in a fourier series:
$$F(p) = \frac{1}{N} \sum_{n=0}^{N-1} s(n)\, e^{-j \frac{2\pi n p}{N}}, \quad \text{for } p = 0, 1, 2, \dots, (N-1), \qquad (1)$$
where s(n) is an ec signature of length N. for feature extraction a limited number of coefficients is selected. we normalize the descriptors with respect to signature translations (especially offset) [5]:
$$f_{N1}(r) = |F(r)| + |F(N - r)|, \qquad (2)$$
$$f_{N2}(r) = \frac{\big|\, |F(r)| - |F(N - r)| \,\big|}{\max\{ |F(r)|,\ |F(N - r)| \}}, \qquad (3)$$
$$f_{N3}(r) = \frac{\arg F(r) + \arg F(N - r)}{2}, \qquad (4)$$
where r is the descriptor order, $r \ne 0$. the modified fourier descriptors (mfd) $f_{N1}$, $f_{N2}$ and $f_{N3}$ correspond to the overall size, ellipticity and angle of the signature description by harmonic r, respectively. from them we can estimate the flaw size (length), bevel and depth (for flaws drowned in the material). the orientation of a flaw can be simply recognized from the direction of the movement of the ec value along an ec signature (a complex curve) through the row or column of the matrix. flaw depth was not assessed in our study, which deals only with flaws rising from the surface of a tested object.
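a compact python sketch of the descriptor computation may be helpful; it follows eqs. (1)–(4) as reconstructed above (the dft normalization and the sign convention in eq. (4) are our assumptions) and is not the authors' implementation.

```python
# sketch: modified fourier descriptors of a complex ec signature s(n).
import numpy as np

def mfd(s, r=1):
    """size f1, ellipticity f2 and angle f3 of harmonic r, eqs. (1)-(4)."""
    n = len(s)
    F = np.fft.fft(s) / n                  # eq. (1), 1/N-normalized dft
    a, b = abs(F[r]), abs(F[n - r])        # |F(r)| and |F(N-r)| = |F(-r)|
    f1 = a + b                             # eq. (2): overall size
    f2 = abs(a - b) / max(a, b)            # eq. (3): ellipticity
    f3 = 0.5 * (np.angle(F[r]) + np.angle(F[n - r]))  # eq. (4): angle
    return f1, f2, f3

# toy signature: an ellipse traced in the complex plane
t = np.linspace(0, 2 * np.pi, 128, endpoint=False)
s = 3 * np.cos(t) + 1j * np.sin(t)
print(mfd(s, r=1))   # f1 ~ 3 (size), f2 ~ 0.5 (ellipticity), f3 ~ 0
```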
3.3 visualization
the position and surface orientation of each flaw is obtained by processing the absolute value of the ec signal, see fig. 2. the first step in the visualization process is to assign a three-dimensional vector to the given features. these vectors are found for each point at which the flaw and the surface intersect. we used two ec signatures that intersected at a certain point on the surface, and these were measured for two orthogonal directions (horizontal and vertical) of the ec probe movement, see fig. 2.

this stage can be split into two parts. first, we calculate a vector that represents the information found from the shape and size of one ec signature (in one direction). the calculation uses the modified fourier descriptors; only first-order mfd descriptors were used. second, we construct the three-dimensional vector representing information about the flaw at the examined point in both orthogonal directions on the surface. this vector collects information from both ec signatures measured in orthogonal directions. the first vector, representing the horizontal part, gives the values of $d_f$ and $h_f$, while the second vector, representing the vertical part, gives the values of $d_f$ and $v_f$ (see fig. 4). the final 3d vector is given as the maximum of each component. finally, a surface is formed from these vectors. the two operations can be expressed by equations (5) to (8), illustrated in fig. 4:
$$f_{N2}(1) = \sin \varphi, \qquad (5)$$
$$d_f = f_{N1}(1)\, f_{N2}(1), \qquad (6)$$
whereas equation (7) is used for assigning the vector representing the horizontal part, and equation (8) assigns the vector representing the vertical part:
$$h_f = f_{N1}(1) \sqrt{1 - f_{N2}^2(1)} \quad \text{and} \quad v_f = 0, \qquad (7)$$
$$v_f = f_{N1}(1) \sqrt{1 - f_{N2}^2(1)} \quad \text{and} \quad h_f = 0. \qquad (8)$$

fig. 4: construction of a 3d representation of an ec signature: depth component $d_f$ and horizontal ($h_f$) or vertical ($v_f$) component derived from $f_{N1}(1)$ and the angle φ with $f_{N2}(1) = \sin\varphi$

this result requires calibration, especially due to the non-linear principle of ec inspection. this requires calibration measurements on a sample made of the tested material. the sample should make it possible to calibrate each considered feature, such as size and bevel. only after calibration can unknown samples be tested; this is a major disadvantage of the method. polynomial curve fitting was applied to calibrate all considered features. we used a polynomial p(x) of degree 2 that fits the data in a least squares sense,
$$p(x) = p_2 x^2 + p_1 x + p_0. \qquad (9)$$
calibration of the size descriptor $f_{N1}(1)$ for an aluminium sample is illustrated in fig. 5.

fig. 5: calibration data of the size acquired on the first sample: measured values of $f_{N1}(1)$ [a.u.] and polynomial fitting vs. size [mm]
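the degree-2 least-squares calibration of eq. (9) can be sketched in a few lines of python; inverting the fitted polynomial to read a flaw size from a measured descriptor is our illustrative addition, not a step spelled out in the paper, and the data points are made up.

```python
# sketch: fit p(x) = p2*x^2 + p1*x + p0 to calibration pairs
# (known flaw size x, measured descriptor f_n1(1)), then invert it
# to estimate the size of an unknown flaw.
import numpy as np

size_mm = np.array([0.4, 0.7, 1.0, 1.5])       # reference notch sizes
f_n1 = np.array([1.7, 2.1, 2.6, 3.4])          # hypothetical descriptor values

p2, p1, p0 = np.polyfit(size_mm, f_n1, 2)       # eq. (9), least squares

def size_from_descriptor(f_meas, lo=0.0, hi=2.0):
    """smallest real root of p(x) = f_meas inside the calibrated range."""
    roots = np.roots([p2, p1, p0 - f_meas])
    real = [r.real for r in roots if abs(r.imag) < 1e-9 and lo <= r.real <= hi]
    return min(real) if real else None

print(round(size_from_descriptor(2.3), 2))      # size estimate in mm
```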
4 experimental evaluation
this visualization method was evaluated in tests of two aluminium samples. the first sample contained notches of 0.3 mm width: sizes 0.4, 0.7, 1 and 1.5 mm perpendicular to the surface, depths 0.4, 0.7, 1 and 1.5 mm with an angle of 30°, 0.7, 1 and 1.5 mm with an angle of 60°, and 1.5 mm with an angle of 45°. this sample was made of the same material as the second sample and was also used for the test calibration. the second sample was made of a block with drilled holes at angles of 30°, 45° and 90° in both orthogonal directions. an example of the visualization of a drilled hole is shown in fig. 6. this hole was drilled at an angle of 45 degrees to a depth of 20 mm. each drilled hole on the sample was tested as described in section 2.

fig. 6: visualization of a hole drilled at an angle of 45 degrees on the second sample (a – absolute value of the tester response |s(n)| [a.u.], b – calculated visualization of the flaw; axes x, y, z in mm)

5 conclusions
a simple novel method for 3d visualization of flaws during eddy current testing, based on the fourier transform, was presented in this paper. the described visualization method is simple and uses the well-known fft (fast fourier transform); fft computation libraries are available for every currently widely-used family of microcontrollers. the method is limited to flaws rising from plain surfaces of a tested object during inspection by an automated scanning system using motorized eddy current probe positioning. flaw depth was not assessed. this method of visualization is simple in principle and is currently targeted for use in ec scanning devices as the first approximation of an ec scan result. each detected flaw should be diagnosed and classified by further data processing methods, e.g. using pattern recognition.

acknowledgments
this study was supported by research program no. msm 6840770015 "research of methods and systems for measurement of physical quantities and measured data processing" of the ctu in prague, sponsored by the ministry of education, youth and sports of the czech republic.

references
[1] kreidl, m. et al.: diagnostic systems. 1st ed. (in czech). praha: ctu, 2001. 352 p. isbn 80-01-02349-4.
[2] magele, c., renhart, w., brandstätter, b.: "identification of hidden ferrous 3d objects using finite elements." compel: int. j. for computation and maths. in electrical and electronic eng., vol. 3 (2001), p. 689–698. emerald group publishing limited, 2001.
[3] prémel, d., pichenot, g., sollier, t.: "development of a 3d electromagnetic model for eddy current tubing inspection." international journal of applied electromagnetics and mechanics, vol. 18 (2003), p. 1–5. ios press, 2003.
[4] xiang, p., ramakrishnan, s., cai, x., ramuhalli, p., polikar, r., udpa, s. s., udpa, l.: "automated analysis of rotating probe multi-frequency eddy current data from steam generator tubes." international journal of applied electromagnetics and mechanics, vol. 12 (2001), p. 151–164.
[5] šmíd, r., dočekal, a., kreidl, m.: "automated classification of eddy current signatures during manual inspection." ndt & e international, vol. 38 (2005), p. 462–470. elsevier science, 2005.

ing. adam dočekal
phone: +420 224 352 346
e-mail: docekaa@fel.cvut.cz
doc. ing. marcel kreidl, csc.
phone: +420 224 352 117
e-mail: kreidl@fel.cvut.cz
doc. ing. radislav šmíd, ph.d.
phone: +420 224 352 131
e-mail: smid@fel.cvut.cz
department of measurement
faculty of electrical engineering
czech technical university in prague
technická 2, 166 27 prague 6, czech republic

acta polytechnica 62(3):394–399, 2022. https://doi.org/10.14311/ap.2022.62.0394
© 2022 the author(s), licensed under a cc-by 4.0 licence, published by the czech technical university in prague.
effect of membrane separation process conditions on the recovery of syngas components
petr seghman∗, lukáš krátký, tomáš jirout
czech technical university, faculty of mechanical engineering, department of process engineering, technická 4, prague 6, 160 00, czech republic
∗ corresponding author: petr.seghman@fs.cvut.cz

abstract. the presented study focuses on inspecting the dependency between process conditions, especially permeate and retentate pressure, and the component recovery of h2, co, and co2 during membrane separation of a model syngas. experiments with both pure components and a model mixture were performed using a laboratory membrane unit ralex gsu-lab-200 with a polyimide hollow fibre module with 3000 hollow fibres. permeability values were established at 1380 barrer for h2, 23 barrer for co, and 343 barrer for co2. the measured selectivities differ from the ideal ones: the ideal h2/co2 selectivity is 3.21, while the experimental values range from over 4 to as low as 1.2 (this implies that an interaction between the components occurs). then, the model syngas, comprised of 16 % h2, 34 % co, and 50 % co2, was tested. the recovery of each component decreases with an increasing permeate pressure.
at a pressure difference of 2 bar, the recovery rate for h2 is around 68 % for a permeate pressure of 1.2 bar; for 2.5 bar, the values drop to 51 %, and for 4 bar, the values reach only 40 %. a similar trend was observed for co2, with recovery values of 59 %, 47 % and 37 % for permeate pressures of 1.2 bar, 2.5 bar and 4 bar, respectively.

keywords: membrane separation, syngas improvement, components recovery, hollow fibre module.

1. introduction
one of the main challenges for scientific teams in the past years has been finding a solution to mitigate climate change and decrease the production of co2 and greenhouse gases. in addition to other approaches, waste utilization is one of the most promising ways. specifically for biomass, gasification offers a suitable solution for the biomass-to-fuels and biomass-to-chemicals conversion. many studies have shown that the product of gasification can be used as a feed for various downstream technologies, including fischer-tropsch synthesis, methanol production, and other processes that have been used for coal-gasification-produced syngas [1]. taking the environment into account, many scientific teams have published innovative approaches, including syngas fermentation using specific bacteria [2]. as biosyngas (biomass-gasification-produced syngas) contains h2, co, co2, and minor amounts of ch4 and other components, it is necessary to adjust its composition and eventually remove the impurities before using it as a feedstock for the mentioned technologies. membrane operations are one of the possible solutions for such adjustments.

to implement membrane operations in the technology process, it is necessary to describe the processes. currently, the focus of scientific teams researching membrane operations is on the separation of two components, and several studies have been published describing two-component separation. choi et al. [3] studied h2/co separation – the effect of the operating pressure was described for h2:co ratios of 3:1, 5:1 and 7:1, and it was shown that permeance increased with a higher h2 concentration; the effects of an increasing flow rate and operating pressure on separation factors and permeance were also described. a study presented by huang et al. [4] focuses on h2 and co2 recovery and describes the dependency between the recovery of components and the area of the module. the article shows that it is necessary to increase the area near-exponentially to achieve a lower co2 concentration in the retentate.

besides the two-component separation, numerical simulation and the solution of multicomponent separation are also of scientific interest. a study published by lee et al. [5] proposes a numerical model of multicomponent membrane separation for co2-containing mixtures in counter-current hollow fibre modules based on the newton-raphson method. the numerical solution was compared with experimental data using a gaseous mixture consisting of 14 % co2, 6 % o2, and 80 % n2. another approach to the numerical modelling of membrane separation was presented by qadir et al. [6] and involved fluid dynamics within cfd simulations. the study reflects different process parameters in the simulation; however, mainly binary mixtures were studied. another similar paper on numerical modelling was presented by alkhamis et al. [7], who proposed the dependence of the reynolds (re) and sherwood (sh) numbers on the separation parameters during co2 and ch4 separation.
based on the simulations, the application of spacers in the inter-fibre space was recommended to increase the co2 separation efficiency. however, neither of the mentioned approaches offers an effective enough description of the processes. also, there is not much data on the membrane separation of h2-co-co2 (and possibly ch4) mixtures, i.e. syngas, and not many studies have been published. therefore, our studies are focused on syngas membrane separation and a further description of the processes. the study's primary goal is to inspect the dependency between the process conditions (specifically the permeate and retentate pressure combination) and the separation process results, represented by the component recovery and/or the permeate and retentate concentrations. experimental data for the h2-co-co2 mixture are published along with several observations and dependency descriptions for the permeate composition dependency on the stage cut and the component recovery dependency on the pressure conditions.

2. materials and methods
all experiments were performed using the experimental setup, the defined model syngas mixture and the equations for computation described below.

2.1. experimental equipment
measurements were made using a laboratory membrane unit ralex gsu-lab-200, manufactured by membrain, that allows a module exchange. the unit operates with pressures ranging from 1 to 10 bar in the retentate (and feed) branch and from 1 to 5 bar in the permeate branch, and is equipped with a temperature-regulating circuit that can maintain the module temperature between room temperature and 60 °c. the measured values are the pressure pi, temperature ti, and mass flow mi in each branch (feed – f, retentate – r, permeate – p), and the temperature of the module coating ts. the composition of each flow is measured using a gas analyser that switches between the flows following a user-defined scheme. figure 1 shows a simple scheme of the measurement.

figure 1. scheme of measurement with indicated measured values. "gas ans." stands for the gas analyser.

the module used in the study is a polyimide-polyetherimide hollow fibre module (manufactured by membrain, the exact mixture being kept secret) consisting of 3000 hollow fibres with a diameter de = 0.3 mm, a wall thickness w = 6 µm and an active length l = 290 mm, defining the total area of the module s = 0.514 m2. the feed enters the fibres; the retentate is collected from the end of the fibres, and the permeate from the inter-fibre space. the whole bundle of fibres is covered in a tube and equipped with flanges on both ends. the laboratory unit is connected to a supply of pure gases, including h2, co, co2, and n2 (used for washing before and after each measurement to maximize the measurement reliability).

2.2. model gas and measurement conditions
for the experiments with pure gases, gas sources for hydrogen h2 of 5.0 gas purity, carbon monoxide co of 4.7 gas purity, and carbon dioxide co2 of 4.0 gas purity were used. the model mixture representing the biomass-gasification-produced syngas was defined to contain 16 %mol h2, 34 %mol co and 50 %mol co2. the concentration of each component in the feed flow is defined by the feed molar flows for each gas directly using the membrane unit interface.
based on an extensive literature study, the composition was chosen so that the values represent oxygen (or air) gasification of wood biomass: biomass gasification using air as the agent and wood as the feedstock produces syngas with a composition varying within 10–20 %mol for h2, 30–45 %mol for co and 35–55 %mol for co2. the model mixture was tested at an approximately constant temperature tm = 22 °c (with a deviation smaller than 1 °c). the feed gas flow rate was maintained at 100 nl h−1 (4.464 mol h−1). the pressure conditions were defined by the permeate pressure varying at the levels pp = 1 bar, pp = 2.5 bar and pp = 4 bar. the total pressure difference ranged from 0.45 to 8 bar (the lowest pressure difference was defined by the limits of the unit and the module, and was different for each pressure level). the pressure conditions (permeate and retentate pressure) are defined within the unit interface and are maintained constant using an integrated regulation system.

2.3. methods of data evaluation
ideal gas behaviour was considered for all measurements (incl. the pure component measurements). the deviation from ideal behaviour was estimated in a previous study and reached values smaller than 2 %, with an average below 1 %, for all three components (h2, co and co2) and their mixtures. the values acquired from the experiments consist of the pressure in each branch (permeate pP, retentate pR, feed pF), the mass flows of each branch, the composition of the flows (represented by the concentration of the components in %mol) and the temperatures of the flows.

the value used to describe the ability to let the gases go through the membrane is the permeability Pi of each gas. the permeability is defined as the molar flow rate of the gas permeating through a unit of area of the module per second, caused by a unit of partial pressure difference across the module, and can be described by the following equation:
$$P_i = \frac{n_i}{S \cdot \Delta p_i}\, w, \qquad (1)$$
where Pi is the permeability of the component in [mol m−1 s−1 bar−1], ni is the molar flow in [mol s−1], w is the thickness of the module wall in [m], S is the total surface of the fibres of the module in [m2], and ∆pi is the partial pressure difference in [bar] (for pure components, the partial pressure difference is equal to the total pressure difference). the pressure difference is defined as the difference between the retentate and permeate pressure (resp. the i-th component's partial pressures). the commonly used unit for permeability is 1 barrer, which is defined as follows:
$$1\ \text{barrer} = 3.35 \times 10^{-11}\ \text{mol m}^{-1}\,\text{s}^{-1}\,\text{bar}^{-1}. \qquad (2)$$
to calculate the molar flow of component i, the total molar flow in a stream must be calculated from the mass flow and the concentrations. taking the ideal gas behaviour into consideration, the following equation can be used:
$$n_x = \frac{m_x}{\sum_i c_i \cdot M_i}, \qquad (3)$$
where nx is the total molar flow in branch x in [mol s−1], mx is the mass flow measured in branch x in [g s−1], ci is the molar fraction of component i [–], and Mi is the molar weight of component i in [g mol−1]. to obtain the amount of the i-th component in a given flow, the following equation can be used:
$$n_{i,x} = c_i \cdot n_x, \qquad (4)$$
where ci is the measured (molar) fraction of component i and nx is the total molar flow (where x can be P for permeate and R for retentate) in [mol s−1]. for a better comparison with the literature, the stage cut θ is defined to describe the module's properties and the process. the stage cut is defined as the ratio between the molar flows in the permeate and feed flows.
in some papers, a mass-based stage cut can also be defined, but since most scientific papers use the molar version, so does this paper. the definition of the stage cut is as follows:
$$\theta = \frac{n_P}{n_F} = \frac{n_P}{n_P + n_R}, \qquad (5)$$
where nx is the molar flow in a given flow (index P for permeate, F for feed, R for retentate) in [mol s−1]. to inspect the interaction between the components, the ideal and actual selectivities are compared. the selectivity is expressed as
$$\alpha_{i,j} = \frac{P_i}{P_j}, \qquad (6)$$
where αi,j is the selectivity of component i over j (further in the text labelled α(i, j), to improve readability) and Pi and Pj are the permeabilities of the components i, j. the ideal selectivity is computed from the pure component permeabilities, and the actual selectivity from the measurements with mixtures.

the primary quantity used in the study is the component recovery Ri. recovery was selected because it can be compared between different module types and process conditions and provides useful information for a possible implementation. the component recovery is defined as
$$R_i = \frac{n_{i,P}}{n_{i,P} + n_{i,R}}, \qquad (7)$$
where ni,P is the molar flow of component i in the permeate in [mol s−1] and ni,R is the molar flow of component i in the retentate in [mol s−1].

3. results and discussion
first, the permeabilities of the pure components were obtained to describe the properties of the module for the separation of h2, co and co2.

3.1. pure component permeability
as mentioned above, the permeabilities Pi of the pure components were measured. the values were averaged across all process conditions involved in the study (molar flow equal to 4.464 mol h−1, pressure differences ranging from 1 to 10 bar, permeate pressure from 1 to 4 bar, and temperature around 20–22 °c). table 1 shows the measured permeability values for h2, co and co2.

component              h2          co       co2
permeability (barrer)  1380 ± 62   23 ± 1   343 ± 11

table 1. permeabilities of pure components h2, co and co2 for the given polyimide module.

to compare the values with similar modules in the literature, the permeance (P/w)i must be evaluated. the permeance is obtained by dividing the permeability by the thickness of the wall. the values from two different studies with hollow polyimide fibre modules, published by sharifian et al. [8] and huang et al. [4], along with the permeance values of our study, are shown in table 2.

component   (P/w)i measured   (P/w)i [4]   (P/w)i [8]
h2          61.40 ± 2.80      241.0        97.10
co           1.00 ± 0.03        8.7         1.28
co2         15.20 ± 0.50       67.0        31.10

table 2. permeance values obtained in this study compared to the values published by huang et al. [4] for a polyimide membrane, temperature between 25–75 °c, and by sharifian et al. [8] for similar conditions. values in [nmol s−1 m−2 pa−1].

two main observations can be made from the two tables above. first, table 1 shows that polyimide membranes can be suitable for separating h2 from the mixture and for adjusting the ratio (increasing the co concentration in the retentate). second, table 2 shows that the results presented in this article are consistent with the data available in the literature; the differences can be caused by the exact composition of the membranes – the module used for this study contains polyetherimide in addition to pure polyimide fibres, and the structure of the polymers can also vary between modules.
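putting the relations of section 2.3 together, the following python sketch evaluates one hypothetical measurement the way eqs. (1)–(7) prescribe; every number below is invented for illustration, the feed-side estimate of the partial pressure difference is our simplification, and this is not the authors' processing code.

```python
# sketch: evaluation of one membrane measurement per eqs. (1)-(7).
BARRER = 3.35e-11                               # mol m^-1 s^-1 bar^-1, eq. (2)
M = {"h2": 2.016, "co": 28.01, "co2": 44.01}    # molar weights [g/mol]
AREA, WALL = 0.514, 6e-6                        # module area [m2], wall [m]

def molar_flow(mass_g_s, fractions):
    """eq. (3): total molar flow of one branch [mol/s]."""
    return mass_g_s / sum(c * M[g] for g, c in fractions.items())

feed_c = {"h2": 0.16, "co": 0.34, "co2": 0.50}
perm_c = {"h2": 0.28, "co": 0.12, "co2": 0.60}; p_perm = 1.2   # [bar]
ret_c  = {"h2": 0.08, "co": 0.49, "co2": 0.43}; p_ret = 3.2    # [bar]
n_p = molar_flow(0.012, perm_c)
n_r = molar_flow(0.018, ret_c)

theta = n_p / (n_p + n_r)                       # eq. (5): stage cut
perms = {}
for g in M:
    n_ip, n_ir = perm_c[g] * n_p, ret_c[g] * n_r    # eq. (4)
    recovery = n_ip / (n_ip + n_ir)                 # eq. (7)
    # crude feed-side partial pressure difference [bar] (our assumption):
    dp_i = feed_c[g] * p_ret - perm_c[g] * p_perm
    perms[g] = n_ip * WALL / (AREA * dp_i) / BARRER  # eq. (1), in barrer
    print(g, "recovery:", round(recovery, 3))
print("stage cut:", round(theta, 3))
print("h2/co2 selectivity, eq. (6):", round(perms["h2"] / perms["co2"], 2))
```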
3.2. gas mixture
to demonstrate the interaction of the components during the membrane separation, we compared the ideal selectivity with the measured selectivity in each case. then, the composition of the permeate flow and the component recovery were studied.

3.2.1. selectivity comparison
to prove the mutual interaction of components during a multicomponent membrane separation, the ideal and actual selectivities were compared. figures 2 and 3 show the h2/co2 and co2/co selectivities. as seen in the figures, with an increasing stage cut the measured selectivities decrease, both for h2/co2 and for co2/co.

figure 2. ideal and measured selectivity for h2/co2.
figure 3. ideal and measured selectivity for co2/co.

this phenomenon was also reported by z. he and k. wang [9], who tested the ideal and "true" selectivity for a mixture of he and co2. this case can be compared well with our case. the mentioned paper states that the true selectivity drops from 3.14 to 1.64 for the 1:1 mixture, from 3.35 to 0.94 for the 2:1 (co2:he) mixture and from 3.58 to 0.49 for the 3:1 (co2:he) mixture. this decrease is similar to the decreases observed in our study. a similarity can also be found in the size and type of the involved molecules – the ratios of the ideal and measured (or "true") selectivities for h2:co2 in our study (ratio 16:50 ∼ 1:3) correspond very well to the data presented for he:co2 in a ratio of 1:3.
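for orientation, the ideal selectivities implied by table 1 via eq. (6) can be written out explicitly (the arithmetic is ours; note that it gives a somewhat higher ideal h2/co2 value than the 3.21 quoted in the abstract, which was presumably obtained under a different averaging of the measured permeabilities):

$$\alpha(\mathrm{h_2}, \mathrm{co_2}) = \frac{1380}{343} \approx 4.0, \qquad \alpha(\mathrm{co_2}, \mathrm{co}) = \frac{343}{23} \approx 14.9, \qquad \alpha(\mathrm{h_2}, \mathrm{co}) = \frac{1380}{23} \approx 60.$$

the drops to roughly 1/3 and 1/10 of the ideal values reported in the conclusions then correspond to measured selectivities of about 1.3 for h2/co2 and about 1.5 for co2/co at high stage cut.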
3.2.2. permeate composition
as can be seen in figures 4 and 5, with an increasing stage cut the concentrations of the highly permeable components (h2, co2) decrease. however, for co2 a slight maximum can be seen: around the stage cut θ = 0.40 for pp = 1.2 bar, of approximately 66 %mol; around the stage cut θ = 0.45 for pp = 2.5 bar, of approximately 62.5 %mol; and around the stage cut θ = 0.5 for pp = 4 bar, of approximately 60.5 %mol. this observation implies that the co2 concentration in the permeate flow decreases with an increasing permeate pressure.

figure 4. h2 concentration in the permeate flow vs. stage cut.
figure 5. co2 concentration in the permeate flow vs. stage cut.

a similar trend of decreasing h2 and co2 concentrations when separating a ternary gas mixture containing 45 %mol h2, 40 %mol co2 and 15 %mol ch4 using two different modules (a dual membrane module and a polyimide module) was reported by w. xiao et al. [9]. their experiments were performed with the stage cut ranging from 0.1 to over 0.4, and the concentration (mole fraction in the original paper) decreased from 68 %mol to 60 %mol for co2 and from 41 %mol to 32 %mol for h2 when using the polyimide hollow-fibre module.

3.2.3. components recovery
one of the dependencies that appears when inspecting multicomponent gas membrane separation is that the recovery of a component achieved at a certain pressure depends on the permeability of the component. the component with the highest permeability (h2) reaches the highest recovery among the components at any pressure difference. however, the recovery does not increase proportionally with the permeability of the pure component – the permeability of h2 is four times higher than the permeability of co2. figure 6 shows the data for all three components (h2, co and co2) for the permeate pressure pp = 2.5 bar; the described dependencies can be observed.

figure 6. component recovery for h2, co and co2 for the permeate pressure pp = 2.5 bar.

another observed effect is the effect of the permeate pressure. the two more permeable components, h2 and co2, reach lower values of component recovery with an increasing permeate pressure. the recovery of co increases with the pressure difference; however, it does not depend on the permeate pressure within the range of the statistical uncertainty. this implies that the differences between the recoveries for different permeate pressures increase with an increasing permeability of the pure component. figures 7–9 show the recoveries for the three components. w. xiao et al. [9] have reported similar trends for the component recovery concerning co2 in the ternary mixture of h2:co2:ch4 (in the ratio 45:40:15, respectively), as the published data follow the trend. however, the data for h2 seem to differ, as the recovery seems to reach its limit below 0.4. this difference can be caused by the nature of the module. to describe the dependency of the recovery on the total pressure drop and other process parameters, it is necessary to test model mixtures of different compositions (the same components, different concentrations).

figure 7. hydrogen recovery vs. total pressure drop.
figure 8. carbon dioxide co2 recovery vs. total pressure drop.
figure 9. carbon monoxide co recovery vs. total pressure drop.

4. conclusions
several conclusions can be made based on the presented data. first, the tested hollow fibre module (polyetherimide-polyimide fibres manufactured by membrain) is suitable for h2 and co2 separation, as the permeabilities of the pure components reach 1380 ± 62 barrer for h2 and 343 ± 11 barrer for co2; the permeability of co reached 23 ± 1 barrer. the ideal selectivity (computed as the ratio of the pure component permeabilities) for h2/co2 and for co2/co differs from the measured selectivities – the measured selectivities αh2/co2 and αco2/co decrease with an increasing stage cut and drop to 1/3 of the ideal selectivity for αh2/co2 and to 1/10 of the ideal selectivity for αco2/co (both at the stage cut θ ≈ 0.9).

regarding the concentrations of h2 and co2 in the permeate flow, both values decrease with the stage cut θ approaching 1. for the concentration of co2, maximum values of cP(co2) can be observed: 66 %mol at the stage cut θ = 0.40 (pp = 1.2 bar), 62.5 %mol at the stage cut θ = 0.45 (pp = 2.5 bar) and 60.5 %mol at the stage cut θ = 0.5 (pp = 4 bar).

the dependency of the component recovery on the permeate pressure drop has been studied. an observed trend is that the permeability of a component affects its recovery, so that the components with a higher permeability (when processed in pure form) reach higher recoveries at a given pressure difference. however, the increase in recovery is not directly proportional to the permeability. also, a dependency between the component recovery and the permeate pressure has been revealed, showing that increasing the permeate pressure results in lower recoveries of the components at a given pressure drop. this can be caused by multiple reasons that have not been specified; the potential causes are a decrease in the sorption and diffusion coefficients with increasing pressure and/or fibre compression resulting in a decrease in its permeability.

this study shows that the component recovery of h2, co2, and co can be affected by the process conditions. therefore, for a successful industrial application of membrane separation within the field of biomass gasification, a wider sample of process conditions must be studied to develop a reliable model for describing the process.
after that, membrane operations could be used for adjusting the ratio of the components by changing the pressure conditions, which would compensate for the variance in the biomass gasification product composition (caused by unstable feed composition due to biomass nature) and allow a better optimization of the technology.

list of symbols
ci  concentration of component i [%mol]
l   length of the module [m]
mF, mP, mR  mass flow of gas in feed, permeate, and retentate, respectively [g s−1]
Mi  molar weight of component i [g mol−1]
ni  molar flow of component i [mol s−1]
pF, pP, pR  pressure in the feed, permeate, and retentate branch, respectively [bar]
Pi  permeability of component i [barrer]
Ri  i-component recovery [–]
S   total area of the module [m2]
tF, tP, tR  temperature in feed, permeate, and retentate, respectively [°c]
tm  mean measurement temperature [°c]
w   thickness of the wall of the fibres [µm]
∆p  pressure difference [bar]
θ   stage cut [–]

acknowledgements
this work was supported by the ministry of education, youth and sports of the czech republic under op rde grant number cz.02.1.01/0.0/0.0/16_019/0000753 "research centre for low-carbon energy technologies" and by the student grant competition of ctu as part of grant no. sgs20/118/ohk2/2t/12.

references
[1] a. y. krylova. products of the fischer-tropsch synthesis (a review). solid fuel chemistry 48(22–35), 2014. https://doi.org/10.3103/s0361521914010030.
[2] s. de tissera, m. köpke, s. d. simpson, et al. syngas biorefinery and syngas utilization. in k. wagemann, n. tippkötter (eds.), biorefineries. advances in biochemical engineering/biotechnology, vol. 166. springer, cham, 2017. https://doi.org/10.1007/10_2017_5.
[3] w. choi, p. g. ingole, j.-s. park, et al. h2/co mixture gas separation using composite hollow fiber membranes prepared by interfacial polymerization method. chemical engineering research and design 102:297–306, 2015. https://doi.org/10.1016/j.cherd.2015.06.037.
[4] w. huang, x. jiang, g. he, et al. a novel process of h2/co2 membrane separation of shifted syngas coupled with gasoil hydrogenation. processes 8(5):590, 2020. https://doi.org/10.3390/pr8050590.
[5] s. lee, m. binns, j. h. lee, et al. membrane separation process for co2 capture from mixed gases using tr and xtr hollow fiber membranes: process modeling and experiments. journal of membrane science 541:224–234, 2017. https://doi.org/10.1016/j.memsci.2017.07.003.
[6] s. qadir, a. hussain, m. ahsan. a computational fluid dynamics approach for the modeling of gas separation in membrane modules. processes 7(7):420, 2019. https://doi.org/10.3390/pr7070420.
[7] n. alkhamis, d. e. oztekin, a. e. anqi, et al. numerical study of gas separation using a membrane. international journal of heat and mass transfer 80:835–843, 2015. https://doi.org/10.1016/j.ijheatmasstransfer.2014.09.072.
[8] s. sharifian, n. asasian-kolur, m. harasek. process simulation of syngas purification by gas permeation application. chemical engineering transactions 76:829–834, 2019. https://doi.org/10.3303/cet1976139.
[9] w. xiao, p. gao, y. dai, et al. efficiency separation process of h2/co2/ch4 mixtures by a hollow fiber dual membrane separator. processes 8(5):560, 2020. https://doi.org/10.3390/pr8050560.
acta polytechnica vol. 47 no. 4–5/2007
refactorisation methods for ttcn-3
l. eros, f. bozoki

in this paper we introduce automatic methods for restructuring source codes written in test description languages. we modify the structure of these sources without making any changes to their behavior. this technique is called refactorisation. there are many approaches to refactorisation; the goal of our refactorisation methods is to increase the maintainability of source codes. we focus on ttcn-3 (testing and test control notation), which is a rapidly spreading test description language nowadays. a ttcn-3 source consists of a data description (static) part and a test execution (dynamic) part. we have developed models and refactorisation methods based on these models, separately for the two parts. the static part is mapped into a layered graph structure, while the dynamic part is mapped to a cefsm (communicating extended finite state machine) – based model.

keywords: ttcn-3, formal methods, refactorisation, automatic.

1 introduction
testing is becoming an increasingly important phase in the development process. the sooner a fault is found in the source code, the fewer resources it takes to correct it. automating test cases significantly improves the efficiency and reduces the duration of testing. many tools have been applied for testing purposes, for example ttcn-3 for automating whole test cases [1]. refactorisation is a commonly used technique for changing the syntax of program codes without making any changes to their behavior [2, 3]. we have concentrated on refactoring ttcn-3 source codes.

1.1 related work
many tools have already been developed for refactoring sources written in different languages, such as c++ and java [4, 5], and even for ttcn-3 [6]. these tools are all semi-automatic, which means that the developer has to interact during the refactorisation process, and they aim at easier readability of the source. we have concentrated on achieving easier maintainability, scalability, and a compact source. this kind of refactorisation can be carried out automatically, without human interaction. there have not yet been any automatic tools for refactoring ttcn-3 sources, so our goal was to develop data structures and automatic algorithms for refactoring ttcn-3 sources.

1.2 introduction to ttcn-3
ttcn-3 is a test description language that was standardized by etsi in 2000 [1]. a test written in ttcn-3 runs on a test executor. the executor is connected to the sut (system under test). from the viewpoint of ttcn-3, the sut is a black box; that is, ttcn-3 determines whether the sut works as it should by examining the responses given by the sut to certain inputs.

a ttcn-3 source consists of modules on the topmost level. each module has two parts, namely the module definitions part and the module control part [1]. the module definitions part includes declarations of data types, module-level variables and ports, and definitions of templates. most of the generally used simple data types (integer, char, charstring) can be found in ttcn-3, but it has structured data types as well (record, set) [1]. templates are used for defining the structure of messages to be sent or received. the module control part coordinates the test execution; it contains function calls, message sending and receiving instructions, and value notations [1].

2 refactoring the static part
in this section we introduce a data model and an algorithm for refactoring the module definitions part. the data model consists of graphs to which the data declarations and definitions of the source can easily be transformed.
the algorithm seeks redundancy in this model and reduces it using inheritance (modified templates in ttcn-3) and references. we will concentrate on refactoring the definitions of record-typed templates.

2.1 data model for the static part
first of all, the original ttcn-3 source has to be transformed into a data model by which the refactoring steps can be carried out efficiently. this model consists of two layers (fig. 1). the lower layer is a directed graph called the type graph, while the upper one consists of directed trees called value trees.

the type graph consists of two kinds of nodes: t-nodes and f-nodes (fig. 1). each t-node represents a data type in the source. it stores the name of the data type that it represents, as well as pointers to its parent node and child nodes. just as the structured data types in the source, each t-node has its fields in the data model, which are represented by the f-nodes, the child nodes of the t-node. an f-node stores the name of the field it represents, as well as pointers to its parent node and its child node, which is a t-node representing the data type of the field.

the value trees are built from the template definitions of the source. they consist of v-nodes that store the values defined in the templates of the source. a v-node also contains pointers to its parent node and child nodes, which are all v-nodes, and a pointer, the modifies-pointer, which is only set if the template represented by the tree inherits the values of some of its fields from another template.

once the type graph and the value trees have been created, the value trees have to be connected to the type graph in the following way: each v-node is connected to the t-node that represents the type of the value stored in the v-node, and to the previously mentioned t-node's parent node (if one exists), which is an f-node (fig. 1).

fig. 1: transforming the source into the data model (f-nodes: light grey, t-nodes: dark grey, v-nodes: white)
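the two-layer model can be captured by a few small classes; this python sketch is our own illustration of the described pointers (class names like TNode are ours, not the paper's notation).

```python
# sketch of the static-part data model: a type graph (t-nodes, f-nodes)
# and value trees (v-nodes) hooked onto it.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class TNode:                            # a data type
    name: str
    parent: Optional["FNode"] = None
    fields: list = field(default_factory=list)     # child f-nodes

@dataclass
class FNode:                            # a field of a structured type
    name: str
    parent: Optional[TNode] = None
    child: Optional[TNode] = None       # the field's own type

@dataclass
class VNode:                            # a value inside a template definition
    value: object
    type_node: Optional[TNode] = None   # link into the type graph
    parent: Optional["VNode"] = None
    children: list = field(default_factory=list)
    modifies: Optional["VNode"] = None  # set when inheriting from a template

# tiny example: record MsgType { charstring body } and one template value
msg_t = TNode("MsgType")
body_f = FNode("body", parent=msg_t, child=TNode("charstring"))
msg_t.fields.append(body_f)
tmpl = VNode(value=None, type_node=msg_t)
tmpl.children.append(VNode(value="ping", type_node=body_f.child, parent=tmpl))
```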
2.2 ways of refactorisation
we have concentrated on two types of redundancy. in the following, we describe these two kinds of redundancy and the ways in which our algorithm reduces them.

the first type of redundancy is caused by equal templates, in other words, templates of the same type with all the corresponding fields having equal values. in this case, the algorithm uses references to reduce the redundancy. there are two cases of this type of redundancy. in one case, a separately defined template appears as a part of another template. handling this case is simple: the repetitive sub-template has to be replaced by a reference to the separately defined template. in the other case, a template that was not defined separately appears several times as a sub-template of other templates. when handling this kind of repetition, the repetitive sub-template first has to be defined separately, and then all of its occurrences have to be replaced with a reference to this newly defined template. an example of this kind of refactorisation in the source and in the model can be seen in fig. 2 (the original structures are on the left, while the refactorised ones are on the right).

the second type of redundancy is caused by similar templates of the same type, in other words, templates that have relatively many identical fields. this kind of redundancy is handled by modified templates (inheritance) in the following way: one of the similar templates is left just as it was before, and the other is turned into a modified template that only redefines the non-equal fields and inherits the rest from the other template (in the data model, the modifies-pointer has to be used). fig. 3 shows how this kind of refactorisation works in the source and in the data model.

fig. 2: refactorisation by reference
fig. 3: refactorisation by inheritance

to be able to manage the level of similarity between two value trees of the same type, we defined the ros (rate of similarity):
$$\mathrm{ROS} = \frac{\text{number of equal leaves}}{\text{number of leaves of the value tree having more leaves}}. \qquad (1)$$
if the ros of two value trees exceeds a limiting value, these two value trees should be refactorised.

2.3 refactoring algorithm of the static part
in this section, we introduce the refactorisation algorithm of the static part. the algorithm has four steps, as follows.

in the first step of the algorithm, the n-matrices are created for each t-node. these matrices are used for determining whether two value (sub-)trees of the same type are equal. an element of the matrix is 1 if the corresponding two value trees are equal; if not, it is 0.

in the second step, the equal value (sub-)trees of the same type are handled by traversing the n-matrix of the t-node. the n-matrix is traversed twice. during the first traverse, the algorithm only handles the repetitions where one of the value trees is a standalone tree, and during the second traverse it handles all the remaining repetitions (the repetitions where both of the value trees are sub-trees of other value trees). this ensures that as many repetitions as possible are handled by references to originally defined value trees.

in the third step, the d-matrices are created for each t-node. these matrices store the ros for each pair of value trees. after creating the matrices, the values below the limiting value of the t-node are cleared.

in the fourth step, the d-matrices are used to create maximal-weight spanning trees, with the value trees as their nodes, for each t-node, using prim's algorithm [2]. then two value trees are refactored by inheritance if they are connected by an edge in the spanning tree. the reason for building a maximal spanning tree is that in this way a maximal number of fields can be defined by inheritance, so a minimal number of fields have to be redefined.
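the fourth step can be illustrated with a short python sketch — our own reading of the algorithm, with hypothetical helper names: value trees are reduced to their leaf lists, the ros of eq. (1) fills a similarity matrix, and a prim-style maximal spanning tree selects the pairs to link by inheritance.

```python
# sketch: ros similarity and a maximal-weight spanning tree (prim).
def ros(leaves_a, leaves_b):
    """eq. (1): equal corresponding leaves / leaf count of the bigger tree."""
    equal = sum(1 for a, b in zip(leaves_a, leaves_b) if a == b)
    return equal / max(len(leaves_a), len(leaves_b))

def maximal_spanning_tree(weights):
    """prim's algorithm on a symmetric weight matrix; returns edge list."""
    n = len(weights)
    in_tree, edges = {0}, []
    while len(in_tree) < n:
        w, i, j = max((weights[i][j], i, j)
                      for i in in_tree for j in range(n) if j not in in_tree)
        if w > 0:                       # below-limit pairs were cleared to 0
            edges.append((i, j))
        in_tree.add(j)
    return edges

# three same-type templates given by their leaf values
trees = [["a", "b", "c"], ["a", "b", "x"], ["a", "y", "x"]]
limit = 0.5
d = [[ros(s, t) if i != j and ros(s, t) > limit else 0
      for j, t in enumerate(trees)] for i, s in enumerate(trees)]
print(maximal_spanning_tree(d))   # [(0, 1), (1, 2)]: chains of 'modifies'
```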
3 refactoring the dynamic part

in this section we introduce a data model and an algorithm for refactoring the module control part. the main point of this algorithm is to find repetitive sequences of instructions and turn them into bodies of functions or altsteps (special functions in ttcn-3), depending on their structure [1]. the original occurrences of the repetitions are replaced by calls of the corresponding functions.

3.1 data model for the dynamic part

the module control part is transformed into a cefsm-based model. a cefsm is represented by a directed graph. in our approach, a cefsm state consists of a node and a directed edge in this graph. the states have several attributes: the guard is a condition that enables the transition to the state, the attribute event is the event that leads to the transition to the state, the action list is the sequence of instructions to be executed during the transition between two states, while the attribute called parameters contains the parameters of the action list. when mapping the source into the cefsm-model, the receive and timeout instructions of the source will be the events of the cefsm graph and the instructions between two receive instructions will be instructions of action lists. however, some ttcn-3 structures need to be handled in different ways. an if-else structure is mapped into a two-armed branch. one of the arms gets the if-condition as its guard and the instructions between the if- and the else-statements, while the other arm gets the negation of the if-condition as its guard and the instructions after the else-statement. both of the events are empty. an example of this kind of transformation is shown in fig. 4.

alt structures describe alternative behavior [1]. these statements are also mapped into branches. an alt-branch has the same number of arms in the model as the alt statement has in the source. the guards and events of the arms in the source will be the guards and events of the arms in the cefsm. the action lists will contain the instructions of the corresponding arms in the source code (fig. 5).

fig. 4: transforming an if-else structure into the cefsm-model
fig. 5: transforming an alt structure into the cefsm-model

the states labeled "t" in the above figures are called terminating states. they are used for collecting the arms of the branches. all their attributes are empty. functions, for-cycles and while-cycles also have to be handled in special ways, since – when turning a sequence of instructions into the body of a function or an altstep – instructions from inside the body of a function or a cycle must not be handled together with instructions from outside it. to avoid these kinds of cases, we have introduced hyper states. a hyper state looks like a simple state from the outside, but it contains the cefsm-model of the body of the function or cycle inside. the event of a hyper state is also special. it can be call_for, call_while, or call_function, depending on the kind of structure that it represents. thus, functions and cycles are transformed into hyper states, and function calls are transformed into states referencing the corresponding hyper state (fig. 6).

fig. 6: hyper state representing a for-cycle in the cefsm-model (g=guard, e=event, al=action list, p=parameters)

3.2 pre-arranging the action lists

after creating the cefsm-model, the action lists have to be pre-arranged into a uniform order in order to create potentially longer repetitive sequences of instructions in the action lists.
this pre-arrangement is possible because the order of some instructions can be changed without changing the behavior of the system, and it is useful because handling these longer repetitive sequences eliminates more redundancy. the rules of this pre-arrangement are as follows: changing the order of message sending instructions is not allowed; if a variable is referenced, the last value assignment of the variable has to be kept before the referencing instruction; and the declaration of a variable has to come before the first reference to the variable and before all the instructions that change its value. to achieve this uniform order, a dependency graph is generated from the instructions of the action list. each instruction is mapped to a node. if a directed edge points from node a to node b, then the originating instruction (the one corresponding to node a) must be kept before the terminating instruction (the one corresponding to node b). fig. 7 shows an original source code, its dependency graph, and the source code after the pre-arrangement.

fig. 7: original code, dependency graph and rearranged code

3.3 handling sequential repetitions

this step of the algorithm searches for repetitive sequences of instructions within the action lists. before starting to search for these repetitions, list al has to be created with the action lists as its elements, beginning with the longest one. after creating al, the search for repetitions works as follows: the algorithm selects a sequence, called bs (base sequence), that all the sequences with the same length are compared to. at the beginning, the length of bs is equal to the length of the longest action list; it is then decreased by one in each iteration. since the same sequence is not likely to be found in the same action list, once the algorithm has found a sequence that matches bs, it jumps to the next action list (the next element of al). when all the sequences that match bs are found, a whole repetition is explored. if sequences of the newly found repetition overlap with sequences from repetitions that were found earlier, they are thrown away. if the size of the repetition (the product of the length of bs and the number of its occurrences) is below a limiting value, the repetition is thrown away. finally, the repetition is stored in list rl, which contains some information about the repetitions found, and a new bs is selected. if the repetition contains equal action lists, this is indicated in matrix rm, which is used in the next step of the algorithm. when rl is complete, the repetitive sequences are turned into functions.

3.4 handling structural repetitions

after handling the sequential repetitions, the algorithm searches for repetitions having repetitive structures that cover more states. two states are equal if their events and action lists are equal. the input of this part of the algorithm is rm. in each iteration of this step, the algorithm chooses a pair of states having action lists that were equal according to rm. if the events of these two states are equal (the states are equal), then their sibling nodes in the cefsm-graph are compared. if all the corresponding siblings are equal, then their parent nodes are examined in the same way, recursively, until the whole repetition is revealed.
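to make the search procedure of section 3.3 concrete, the following python sketch mines repeated instruction sequences from a set of action lists. it deliberately simplifies the algorithm above: action lists are plain tuples of instruction strings, overlapping occurrences are not discarded, the rm matrix is not built, the jump-to-next-action-list shortcut is omitted, and the names (find_repetitions, bs, al) are illustrative, not taken from the paper.

def find_repetitions(action_lists, size_limit):
    """scan base sequences (bs) from the longest action list length
    downwards; keep a repetition when its size (length of bs times the
    number of occurrences) reaches size_limit."""
    al = sorted(action_lists, key=len, reverse=True)   # list 'al'
    found = []
    for length in range(len(al[0]), 1, -1):
        seen = {}
        for idx, lst in enumerate(al):
            for start in range(len(lst) - length + 1):
                bs = tuple(lst[start:start + length])
                seen.setdefault(bs, []).append((idx, start))
        for bs, occurrences in seen.items():
            if len(occurrences) > 1 and length * len(occurrences) >= size_limit:
                found.append((bs, occurrences))        # candidate for list 'rl'
    return found

a = ("send(m1)", "log()", "t.start", "send(m2)")
b = ("setverdict(pass)", "send(m1)", "log()", "t.start")
for bs, occurrences in find_repetitions([a, b], size_limit=6):
    print(bs, "occurs", len(occurrences), "times")   # the 3-long sequence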
then the whole repetition is turned into the ttcn-3 structure (function or altstep) that best fits the structure of the repetition [1].

4 case study

in this section we demonstrate our algorithm on a ttcn-3 source code. the original and refactorised source codes are shown in fig. 8.

fig. 8: original and refactorised ttcn-3 source codes

5 comparison with other solutions

in this part we focus on a refactorisation tool for ttcn-3 named t-rex [6]. the basic concept of this solution is to search the source for typical patterns called smells, indicating that the quality of the code can be increased (for example, a smell can be an unused parameter). the list below shows some of the smells that are implemented in t-rex (in practice, many smells are not implemented):
• constant actual parameter value smells for templates,
• duplicate alt branches,
• fully-parametrised templates,
• singular component variable/constant/timer reference,
• singular template reference.
the method searches for these kinds of smells and indicates them to the user, who can decide to ignore or correct the indicated smells. unlike our method, t-rex is a semi-automatic approach, as user interaction is needed for the refactorisation. this solution rather focuses on the readability of the code, while our goal is to decrease its redundancy and increase its maintainability.

6 summary

in our paper we have introduced data models and algorithms for refactoring source codes written in ttcn-3. the data model of the module definitions part is a layered model. it consists of the type graph and the value trees. the algorithm uses references and inheritance in order to reduce the redundancy in the module definitions part. the module control part is transformed into a cefsm-model. in this model the algorithm searches for sequential and structural repetitions and turns them into functions or altsteps. in this way, the source becomes more compact and more easily maintainable and scalable.

acknowledgments

first of all, we would like to thank our supervisors, gyula csopaki, ph.d. and antal wu-hen-chang, for their direction and for their advice during our research. we would also like to thank our department for providing the necessary equipment to us.

references
[1] etsi es 201 873-1 v3.1.1: methods for testing and specification (mts); the testing and test control notation language, version 3; part 1: ttcn-3 core language. etsi, 2005.
[2] weiss, m. a.: data structures and algorithm analysis in c++. addison-wesley, 2006, p. 373–376.
[3] mens, t., tourwé, t.: survey of software refactoring. ieee transactions on software engineering, vol. 30 (2004), no. 2, p. 126–139.
[4] ref++ for refactoring c++ sources, http://www.refpp.com
[5] transmogrify for refactoring java sources, http://transmogrify.sourceforge.net
[6] neukirchen, h., bisanz, m.: utilising code smells to detect quality problems in ttcn-3 test suites. testcom/fates 2007 conference.

levente eros, e-mail: el492@hszk.bme.hu
ferenc bozoki, e-mail: bf490@hszk.bme.hu
dept. of telecommunications and media informatics, budapest university of technology and economics, magyar tudosok korutja 2, 1117 budapest, hungary

inductive contactless distance measurement intended for a gastric electrical implant
j. tomek

for a gastric electrical stimulation project we are developing a system for on-demand switching according to the volume or elongation of the stomach wall. the system is to be implanted into the human abdomen, which limits the utilization of many possible solutions and types of sensors. magnetic induction has been agreed as the most suitable principle, despite its direction dependency and the need for multi-axial and multiple probes for precision measurements. possible configurations are discussed, as well as the complexity of the necessary electronics and the implantation itself. for detecting food consumption, perfect precision is fortunately not necessary, but a certain compromise will still be necessary for the final system. a simple two-coil system – a transmitter and receiver – and a system with a three-axial coil have already been realized. the first system has already been successfully tested in-vivo on dogs by our us colleagues. however, if the implantation is badly performed, and the coils are completely out of axis, the system cannot sense relative changes in volume properly. the three-axial sensor presented here eliminates these problems. more complex arrangements emerging from magnetic tracking are discussed, because laboratory studies of stomach movements may require them.
keywords: contactless distance measurements, magnetic induction, magnetic tracking, on-demand gastric electrical stimulation, obesity, implantable devices.

1 introduction

a novel obesity treatment method [1–3] requires a system for measuring stomach volume intended to control on-demand switching of a special gastric pacemaker.
the crucial purpose of this is to reduce the power consumption of the originally tried, continuously operating implant. for various technological, physiological and medical reasons, many common methods of distance or extension measurement cannot be applied. the main limitations of this task are power consumption and the conductive, chemically aggressive working medium on the stomach wall. mechanical stresses should also be considered. the use of strain gauges is almost impossible, because large planar structures would need to be safely sewn to a certain area of the stomach wall. this is rejected by surgeons as physiologically impossible. furthermore, relative elongations of the stomach can be as large as 100 %, depending on location [4]. ultrasound had been tested before our attempts, because some standard medical probes were available, but the changing speed of sound with the changing composition of the stomach content or of other things in the environment called for a technique that was less dependent on the actual parameters of the medium, and less power demanding [4]. as a promising solution, the induction principle was adopted [5]. low power can be achieved by intermittent operation, and at low frequencies (of the order of several khz) even the conductive medium has proved not to bias the measurement by means of eddy currents. the presence of ferromagnetic materials can cause problems, but such materials are not normally eaten, and when placed outside the body they should not influence the measurement significantly. at least they should not influence switching the stimulation on and off. magnetic measurements are unfortunately directionally dependent, so only the use of multi-axial probes and transmitters can ensure precise position evaluation. this is a similar question to the case of magnetic tracking [6], but what we need here is just information about the distance between the transmitter(s) and receiver(s). the direction is useless for this application, though for certain configurations it has to be determined anyway. though the data for tracking in certain configurations should be enough from the mathematical point of view, there are regions of low resolution in which the error is high (fig. 5).

the magnetic dipole by which we can represent the induction coil is described by its magnitude m, its position in the cartesian coordinate system x, y, z and then by its orientation given by two angles – the azimuth and the elevation (fig. 1). all these parameters should be calculated from the field measurements in order to determine the distance of the dipole (transmitter) from the receivers. this obviously needs multiple sensors, if we do not have any information about the orientation of the dipole with regard to the sensors. we usually know the size of the moment, but five parameters still remain.

fig. 1: description of a magnetic dipole (its orientation is given by two angles – the azimuth and the elevation)
the magnetic induction created by the dipole is described by the biot-savart law – eq. (1). this can be written in the form of eq. (2) when we want the cartesian coordinate interpretation:

$$\mathbf{b}(\mathbf{r}) = \frac{\mu_0}{4\pi}\left(\frac{3\,(\mathbf{m}\cdot\mathbf{r})\,\mathbf{r}}{r^5} - \frac{\mathbf{m}}{r^3}\right), \quad (1)$$

$$\mathbf{b} = \frac{\mu_0\,m}{4\pi}\left(\frac{3zx}{r^5};\ \frac{3zy}{r^5};\ \frac{3z^2}{r^5} - \frac{1}{r^3}\right). \quad (2)$$

or we can describe it by the components of the vector of magnetic induction b in the spherical coordinate system – eqs. (3) and (4):

$$b_r = 2\cdot 10^{-7}\,\frac{m}{r^3}\cos\vartheta, \quad (3)$$

$$b_s = 10^{-7}\,\frac{m}{r^3}\sin\vartheta, \quad (4)$$

where ϑ is the angle between the dipole axis and the position vector r.

the fact that the magnetic induction decreases with the third power of distance usually limits these measurements to relatively small ranges, because of the need for a wide dynamic range of the sensing electronics. however, this decreases the errors in distance estimation caused by misalignment errors. we need at least five independent measurements of the vector of magnetic induction b to calculate the five unknowns, and usually more to decrease the measurement errors. these measurements are usually made using three-axial sensors, thus at least two of them at known locations are needed. practical experience shows that a much higher number of receivers is needed. usually even multi-axial transmitters are used to multiply the acquired data when successively driving the individual transmitting coils whose mutual orientation is known.

2 methods

the most basic distance measurement solution using the mutual induction principle is to have just two single-axial probes, one transmitter and one receiver. the distance can be determined only if their orientation is fixed. for maximum signals the coils must be aligned coaxially. if they can move, and if there are misalignments either in angle or in the position of both the transmitter and the receiver, significant errors bias the measurements (see figs. 8 and 9). a more sophisticated method is to use a three-axial sensor measuring the field of a simple coil. by the simple operation of calculating the magnitude (mag) of the total vector from the measured magnetic induction components in the x, y and z axes – symbolized by x, y and z, see eq. (5) – we get a good estimate of the distance (fig. 4). up to 45° angular deviation of the transmitter coil (when the sensor is in the first gaussian position) the measured signal does not decrease below 79 % (fig. 8) and the distance determination error is not greater than 10 % [8].
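a small numerical illustration of eqs. (1)–(5) may be helpful here; the moment, distance and function names below are invented for the example, and the sensor is assumed ideal and placed on the dipole axis (1st gaussian position).

import numpy as np

MU0_4PI = 1e-7   # mu_0 / (4*pi)  [t*m/a]

def dipole_b(m_vec, r_vec):
    """magnetic induction of a dipole m_vec at position r_vec, eq. (1)."""
    r = np.linalg.norm(r_vec)
    return MU0_4PI * (3.0 * np.dot(m_vec, r_vec) * r_vec / r**5
                      - m_vec / r**3)

def distance_from_mag(mag, m, axial=True):
    """invert |b| = c*m/r^3; c = 2e-7 on the dipole axis (eq. 3),
    1e-7 in the equatorial plane (eq. 4)."""
    c = 2e-7 if axial else 1e-7
    return (c * m / mag) ** (1.0 / 3.0)

m_vec = np.array([0.0, 0.0, 0.01])            # transmitter moment, a*m^2
r_vec = np.array([0.0, 0.0, 0.05])            # sensor 5 cm away, on axis
mag = np.linalg.norm(dipole_b(m_vec, r_vec))  # total vector, eq. (5)
print(distance_from_mag(mag, 0.01))           # ~0.05 m recovered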
angular misalignments of the receiver should ideally cause no difference in the total vector measurement. therefore, if the precision of the measurements is not so crucial, this is a very good solution for the distance measurement. the electronics stays almost the same as in the basic case; there only need to be three channels, or a multiplexer if the signals are strong enough and the application enables it:

$$mag = \sqrt{x^2 + y^2 + z^2}. \quad (5)$$

fig. 2: magnetic induction of a dipole at a given point described by vector r
fig. 3: schematic picture of the components of the magnetic induction vector for a magnetic dipole [2]
fig. 4: graph showing regions of the same magnitudes of the total vector of a dipole, i.e., isolines of the same magnitude of the total vector of the induction. the same magnitude as in the 1st gaussian position is at half distance in the 2nd gaussian position, in accordance with eqs. (3) and (4)

for our application three separate channels are necessary, because the signals are extremely low and need to be detected by a synchronous detector and greatly amplified. switching between the channels would result in significant errors. this application can still be made relatively small and dependable. to achieve far more precise distance measurements, even the transmitter should be multi-axial. this is good when we need at most two compact structures that should be stitched to the stomach wall, between which we measure the distance. unfortunately we cannot use multiple sensors or transmitters for more precise distance determination, as is done in magnetic tracking [6], because their mutual positions cannot be ensured. however, we can use them to measure more distances over some area, which would improve the volume estimate. mathematically, the two-axial transmitter should be satisfactory. for each coil, driven successively or at two frequencies, we measure a vector of magnetic induction and we can calculate the mutual position of the structures. in practice there are certain regions of low resolution (fig. 5), but as we do not need to measure with extreme accuracy, this arrangement should be excellent for an evaluation of the elongation of the stomach and for precise sensing of its motility. the application range (mainly in research) of such an implantable system would be broad, especially if the probes were small. an evaluation of the distance would require the use of some processor and appropriate firmware, or relatively complex hardware that would realize the methods described in [6]. these are non-iterative analytical methods; however, it will be difficult to make the necessary electronics implantable. there is one more suitable solution, which is the use of even a six-axial transmitter [7]. this method uses the principle of calculating the magnitude of the total vector measured by a three-axial sensor and calculating the average of the six successive measurements. the estimated error is far below 2 % in distance determination [7], which is excellent, and the processing should be relatively simple.

3 the experiments

the coils that were used had the parameters described in table 1. to increase the inductance, a ferromagnetic core (see fig. 6) was inserted into each sensing coil. in order to measure the elongation of the stomach wall during food consumption or to detect gastric wall movement, just the simple two-coil system was built at first.
this was done to simplify the system and was agreed as suitable for early testing on the stomachs of laboratory dogs. the processing electronics was made as an external unit with a usb interface. it was calibrated with coils placed coaxially to each other, and therefore when the coils were implanted not perfectly axially, only relative distance changes could be sensed. we unfortunately could not study the typical range of misalignments of the coils sewn to the stomach wall, so it is difficult to say whether calibration in-vivo with given amounts of water or food would ensure reliable long-term function, but some successful measurements have nevertheless been conducted, see [6]. however, bad implantation could result in the need for a second operation in order to re-adjust the coils to become as close as possible to the coaxial orientation. the errors that can occur when the transmitter or receiver is misaligned are shown in figures 8 and 9. parallel orientations of the coils are also depicted, and these are even worse. the three-axial sensor was tested just with the laboratory equipment, because we still do not have the electronics for in-vivo testing. the coils are arranged as shown in fig. 7. it is therefore not an ideal sensor with concentric coils, but such a probe would be difficult to manufacture and would not have such high inductances, because ferromagnetic cores could not be used. deflections from orthogonality were measured in helmholtz coils and are below 15°, which is acceptable. there are also certain differences in the coil inductances given by slightly different areas etc. these imperfections result in slightly different measurement errors than there would be in the case of an ideal three-axial sensor (see curves in fig. 8).

table 1: parameters of the coils
probe | simple | three-axial
turns (∅ 34 mm) | 2000 | 1000
length | 8.5 mm | 5 mm
diameter | 2 mm | 2 mm
l with h11 core / without | 13 mh / 1 mh | 3 mh / 0.6 mh
core ∅ | 1 mm | 1 mm

fig. 5: low resolution regions of a two-axial transmitter [6]
fig. 6: examples of bare coils made of self-bonding copper wire and a ferromagnetic core 8 mm in length. from the right: 2000 turn coil, 1000 turn coil and 500 turn coil

4 conclusion

possible configurations of transmitter(s) and receiver(s) applicable for contactless magnetic measurement of distance were discussed with reference to an application in an implantable gastric elongation measurement system intended for on-demand stimulation of the stomach wall. the stimulation is for use in obesity treatment. in order to decrease power consumption, and for other reasons like tissue adaptation to continuous stimulation, this control is required. for initial testing, we developed the simplest system with just two coils, which could even be satisfactory for detecting food consumption. this system has been proved by in-vivo measurements. however, in the event of really bad alignment after implantation it may fail to operate. thus a three-axial probe was built to eliminate such eventualities. it should make implantation easy, as there should be a good signal that is not dependent on the mutual orientation of the implanted probes. it should provide good enough precision even for gastric motility sensing and other possible medical research purposes. in case there is a need for higher-precision measurement of distance by the induction method, configurations with multi-axial transmitters have been proposed.
acknowledgments

the research described in this paper was supervised by prof. p. ripka, fee ctu in prague, and prof. j. chen, vrf, va medical center, oklahoma city, usa. it has been supported by an international grant of the ministry of education, youth and sports of the czech republic – programs of international cooperation (kontakt) no. 1p05me756.

references
[1] mccallum, r. w., sarosiel, lin, z., monocure, m., usa study group: preliminary results of gastric electrical stimulation on weight loss and gastric emptying in morbidly obese patients – randomized double blinded trial. neurogastroenterology and motility, 2002;14:422.
[2] ouyang, h., yin, j., chen, j.: therapeutic potential of gastric electrical stimulation for obesity and its possible mechanisms: a preliminary canine study. dig dis sci, vol. 48 (2003), no. 4, p. 698–705.
[3] ouyang, h., yin, j. y., chen, j. d. z.: inhibitory effects of chronic gastric electrical stimulation on food intake and weight and their possible mechanisms. dig dis sci, vol. 48 (2003), p. 698–705.
[4] tomek, j., mlejnek, p., janasek, v., ripka, p., chen, j. z., zhu, h.: gastric distention sensing for implantable gastric pacemaker. submitted to ieee trans. on biomedical physics (2007).
[5] tomek, j., mlejnek, p., janasek, v., ripka, p., kaspar, p., chen, j. z.: gastric motility and volume sensing by implanted magnetic sensors. sensor letters, vol. 5 (2007), no. 1 – special issue on emsa 2006, issn 1546-198x, p. 267–268.
[6] paperno, e., keisar, p.: three-dimensional tracking of biaxial sensors. ieee transactions on magnetics, vol. 40 (2004), no. 3, p. 1530–1536.
[7] majer, a.: distanční měření s využitím totálního vektoru magnetické indukce prostorově zkřížených cívek. in dějiny staveb (2005), p. 196–198.
[8] tomek, j., mlejnek, p., janasek, v., ripka, p., kaspar, p., chen, j. z.: the precision of gastric motility and volume sensing by implanted magnetic sensors. in: eurosensors xx proc., göteborg, sweden, vol. 2 – t3c-o3, september 2006, isbn 978-91-631-9280-7, p. 253–254.

ing. jiří tomek, e-mail: tomekj1@fel.cvut.cz
department of measurement, czech technical university in prague, technická 2, 166 27 praha, czech republic

fig. 7: arrangement of the coils in the three-axial sensor. under the coating it forms an oval shape of acceptable dimensions
fig. 8: angular misalignments of the transmitter, signal measured for rotation from the 1st gaussian position to the 2nd gaussian position [8]
fig. 9: rotation of the receiver coil. the three-axial coil is almost rotation independent – not included

acta polytechnica 61(4):504–510, 2021, https://doi.org/10.14311/ap.2021.61.0504

co2 power cycle chemistry in the cv řež experimental loop
jan berka (a, b, *), jakub vojtěch ballek (a, b), ladislav velebil (a), eliška purkarová (b), alice vagenknechtová (b), tomáš hlinčík (b)
a centrum výzkumu řež s.r.o., husinec-řež, hlavní 130, řež, czech republic
b university of chemistry and technology prague, faculty of environmental technology, department of gaseous and solid fuels and air protection, technická 1905, prague 6, czech republic
* corresponding author: jan.berka@cvrez.cz

abstract. power cycles using carbon dioxide in a supercritical state (sc-co2) can be used in both the nuclear and non-nuclear power industry.
these systems are characterized by their advantages over steam power cycles, e.g., the sc-co2 turbine is more compact than a steam turbine of similar performance. the parameters and lifespan of the system are influenced by the purity of the co2 in the circuit; in particular, admixtures such as o2, h2o, etc. cause enhanced degradation of the structural materials. therefore, gas purification and purity control systems for the sc-co2 power cycles should be proposed and developed. the inspiration for the proposal of these systems could stem from the operation of gas-cooled, especially co2-cooled, nuclear reactors. the first information concerning the co2 and sc-co2 power cycle chemistry was gathered in the first period of the project and is summarized in this paper.

keywords: supercritical carbon dioxide, sc-co2, power cycle chemistry, materials, purification, purity control.

1. introduction

the increase in electric power consumption and the requirements for co2 emissions reduction demand new and effective energy sources. one way of increasing the conversion of heat to electric power is by using carbon dioxide as a working medium in the power cycle. currently, the power cycles that are based on supercritical co2 (sc-co2) have been investigated [1, 2]. carbon dioxide becomes supercritical above a critical temperature of 30.98 °c (304.13 k) and above a pressure of 7.38 mpa [2]. apart from the power industry, sc-co2 has been used in other technologies; as examples, extraction [3, 4], dissolving [5, 6], nanoparticle preparation [7] and impregnation [8] could be mentioned. the sc-co2 power cycle's efficiency of the conversion of heat to electric power may exceed 50 %, as compared to the conventional steam power cycle, which has a maximum efficiency of approximately 40 % [9, 10]. the efficiency of sc-co2 increases within the temperature range of 500–950 °c. therefore, this technology is suitable for power conversion in both high temperature non-nuclear and nuclear technologies, including the generation iv nuclear reactors. the technologies with a possible use of sc-co2 cycles are listed in table 1 [11]. another advantage of the sc-co2 cycles is that they use more compact turbines as compared to the turbines for the steam power cycle [12].

2. r&d in sc-co2 technologies

for sc-co2 power cycles, several experimental devices as well as pilot plants exist in which the technology can be investigated and verified. one of the larger devices is the unit eps100 for exhaust heat utilization by echogen, usa. the power output of this unit reaches seven to eight mw. the optimum temperature of the source is 500-550 °c, and the minimum source temperature is 85 °c, which allows the low potential exhaust heat to be used [9]. examples of smaller scale experimental units are as follows:
• supercritical co2 brayton cycle integral experiment loop (sciel), south korea
• experimental loop knolls atomic power laboratory (kapl), usa
• experimental loop institute of applied energy (iae), japan
• sc-co2 loop scarlett (ike), university of stuttgart, germany
see ref. [9] for details. in centrum výzkumu řež s.r.o. (czech republic), the supercritical carbon dioxide experimental loop has recently been built (see the scheme of the loop in figure 1). the purpose of the loop is to measure the thermohydraulic performance and physical parameters of the sc-co2 circuits. because the loop is equipped with a test section, the loop can also be used for testing materials.
table 1. possible use of sc-co2 power cycles [5].
application | cycle type | advantages | output (mwe) | temperature (°c) | pressure (mpa)
nuclear | indirect heating | efficiency, compactness, reduced water consumption | 10-300 | 350-700 | 20-35
fossil power stations | indirect heating | efficiency, reduced water consumption | 300-600 | 550-900 | 15-35
fossil fuels (synthesis and natural gas) | direct combustion | efficiency, reduced water consumption, carbon capture storage | 300-600 | 1100-1500 | 35
thermal solar power stations | indirect heating | efficiency, compactness, reduced water consumption | 10-100 | 500-1000 | 35
marine propulsion | indirect heating | efficiency, compactness, exhaust heat usage | < 10 | 200-300 | 15-25
– | indirect heating | efficiency, compactness, simplicity | 1-10 | < 230-650 | 15-35
geothermal | indirect heating | efficiency | 1-50 | 100-300 | 15

figure 1. scheme of the supercritical carbon dioxide experimental loop at cv řež. 1: low temperature heat exchanger, 2: preheater, 3: main circulation pump, 4: cooler, 5: co2 dosing system, 6: sampling system, 7: high temperature heat exchanger, 8, 9: heaters, 10: test section, 11: reduction valve, 12: cooler.

table 2. the main parameters of the supercritical carbon dioxide experimental loop at centrum výzkumu řež s.r.o.
maximum medium temperature | 550 °c
maximum pressure in the high-pressure section | 25 mpa
maximum pressure in the low-pressure section | 12.5 mpa
maximum flow rate | 0.4 kg·s−1
the loop volume | 0.08 m3

table 3. examples of the medium composition in the turbine inlet for the direct combustion cycle, according to the used fuel [13, 14].
component | natural gas (vol. %) | synthesis gas (vol. %)
co2 | 91.80 | 95.61
h2o | 6.36 | 2.68
o2 | 0.20 | 0.57
n2 | 1.11 | 0.66
ar | 0.53 | 0.47

for the main parameters of the loop, see table 2. the function of the loop could be described as follows: after the passage through the heat exchanger in the high-pressure section (7), co2 flows to two parallel (8) and one serial (9) heater branches. after heating, the medium flows to the test section (10). a reduction valve (11), which reduces the medium pressure to 12.5 mpa, is placed after the test section. to achieve an accurate temperature reduction, a part of the flow is passed through the oil cooler (12), while the second part flows through the bypass. following that, the medium reaches the low-pressure part of the high temperature heat exchanger (7), which has a maximum allowed temperature of 450 °c. it then flows to the low temperature heat exchanger (1). next, it flows through the cooler (4), which is situated before the entrance to the main circulator (3).

3. sc-co2 power cycles chemistry

in contrast to other power cycles, such as steam and helium, the data on the chemical composition of the sc-co2 medium, purification, and purity control are rather scarce. some knowledge can be drawn from experience with nuclear reactors, which use carbon dioxide (not in a supercritical state) as the primary coolant. these reactors have been operated in great britain (magnox and advanced gas cooled reactors) [15], and the first nuclear power plant in former czechoslovakia, a1, also used carbon dioxide as a primary coolant [16].

3.1. impurities in the co2 medium

impurities at the level of units of volume percent in the co2 medium influence the thermodynamic properties of the medium.
for example, the power consumption of a compressor working with a medium near the critical point increases by six percent when the medium purity decreases by 4.4 %. with a medium of 90.9 % purity, the compressor power consumption increases by 34 % as compared to 100 % pure co2. this increase in power consumption is caused by a decrease in the medium density, which is due to the impurities [17]. this phenomenon is mainly significant for the systems with a direct combustion. examples of the medium composition in the direct combustion power cycles are listed in table 3. furthermore, the impurities cause the materials and components to corrode and degrade, and they may be the source of undesirable physicochemical processes in the power cycle. the sources of the impurities are as follows:
• impurities in the source gas
• residual air or gases present in the system prior to the operation
• corrosion and chemical reactions in the system
• residual impurities on the component's surfaces
• lubricant leakage
• penetration from the outside environment due to the lack of tightness
• desorption from the structural materials
typical admixtures in the co2 medium are as follows: o2, h2o, h2, co, ch4, n2 [11, 18]. in the direct combustion cycles that use synthesis gas as a fuel, so2, so3, no, no2, and halogen compounds may be present in the medium. compounds of halogens and sulphur usually cause and accelerate the material's degradation by corrosion. if the medium does not contain these compounds, the corrosion is influenced by the water and oxygen content. when using compressors and other devices, oils and other lubricants may be released into the circuit. these compounds are soluble in sc-co2. if the oil content in sc-co2 is higher than one percent by weight, the oil coats the internal surfaces and negatively affects the heat transfer properties [19]. as an example of co2 medium composition, the data for the co2 primary coolant purity in the a1 nuclear power plant are listed in table 4 [16].

table 4. the impurities in the primary co2 coolant and supply gas in the a1 nuclear power plant [16]. all values in mg·kg−1.
compound | average value in primary co2 coolant | limit value for supply gas | average value in supply gas
h2o | 700-1200 | 20 | 15
oil | 1–5 | 5 | 1
h2 | 2 | – | < 2
h2s, nh3 and others | 1 | – | < 1

3.2. purification and purity control methods

in some devices, oil separators are used. for example, in the scarlett loop, the oil separator is located after the compressor. the separator separates 99 % of the oil that is contained in the medium in the loop [20]. to limit the damage to the turbines and other parts of the devices, particle separators are inserted into the circuit. for lower flow rates, the "y filters" (figure 2) can be used, through which the medium passes.

figure 2. the y-filter [21, 22].

for higher flow rates, other types of particle filters should be used, as shown in the experimental loop in sunshot in swri [11]. other gaseous impurities can be separated by adsorption. the adsorption separation methods for medium purification in the supercritical carbon dioxide experimental loop in centrum výzkumu řež s.r.o. will be tested over the next few years. in ref. [23], the primary co2 purification method was based on condensation and a subsequent vaporization; part of the gas flow was injected into the rectification column to separate the fission products (xe, kr, i).
3.3. methods of analytical purity control

for the impurity content control, the analytical methods were based on gas chromatography (gc), gas chromatography with mass spectrometry (gc-ms), infrared spectroscopy, etc. the analytical system for the supercritical carbon dioxide experimental loop will also be developed over the next few years within the special project. currently, a combination of gas chromatography with a helium ionization detector (gc-hid) and a 1-channel moisture analyser based on changes of the infrared light wavelength is proposed. the experience with these technologies was obtained in connection with another technology: the high temperature helium experimental loop. the details were published in ref. [24]. the gc-hid method is sensitive when determining the content of h2, co, ch4, o2, and n2 (contents of a 10−5 volume percentage can be detected). one disadvantage of this method is that the hid detector requires helium as a carrier gas. helium must be added separately for the analysis, because the sampled loop gas is co2 (in contrast to the helium loop), and the chromatographic method should be adjusted to these conditions. additionally, the chromatographic analysis is relatively slow, as it lasts approximately 20 minutes. the probe of the moisture analyser will be placed directly into the medium in the low-pressure part of the loop. the maximum gas pressure outside of the moisture analyser probe is 20 mpa. the data on the moisture content in the sc-co2 medium in the loop will be measured continually.

3.4. organic impurities measured during the first long-term operation of the supercritical carbon dioxide experimental loop at cv řež

during the first long-term (1,000 hours) operation of the loop, the sampling of the co2 medium in the loop was carried out on the seventh, 15th, and 34th day of operation. the purpose of the sampling was to determine the amount of undesired minor organic impurities in the medium in the loop. the loop was operated at a temperature of 550 °c in the test section and a pressure of 20 mpa in the high-pressure section. during the operation, an average of approximately 40 kg of co2 per day was drained from the loop and replaced to remove possible impurities from the medium in the loop. the samples were taken by passing the gas through sampling tubes with active carbon. the volume of the samples was 10-170 l, at a pressure of 100 kpa and a temperature of 25 °c. in the next step, the compounds that were adsorbed by the active carbon were desorbed by carbon disulfide and subsequently determined by the gc-ms technique. the amount of the organic compounds in the loop's medium determined in the samples is recorded in the chart in figure 3. the contents of the organic impurities in the medium decreased from ca. 1800 to 5 ng·l−1.

figure 3. the sum of the organic compounds in the co2 medium samples.

the sample from the seventh day of operation contained several organic compounds, with a large amount of benzene. the relative distribution of the organic compounds in the seventh operation day sample is shown in figure 4.

figure 4. the relative distribution of organic compounds in the sample from the seventh loop operation day.

in the other samples, only benzene was detected.
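the reported ng·l−1 levels follow directly from the sampling arithmetic: the mass of organics recovered from a charcoal tube divided by the gas volume drawn through it. a minimal python illustration, with invented recovered masses chosen only to reproduce the two endpoints quoted above:

def concentration_ng_per_l(mass_ng, sample_volume_l):
    """organic impurity concentration in the sampled medium."""
    return mass_ng / sample_volume_l

# e.g. 18 ug recovered from a 10 l sample vs. 0.85 ug from a 170 l sample
print(concentration_ng_per_l(18000.0, 10.0))   # -> 1800.0 ng/l
print(concentration_ng_per_l(850.0, 170.0))    # -> 5.0 ng/l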
the source of the organic compounds in the loop medium could be the residual organics from the loop production, such as lubricants, degreasers, and solvents. the amount of organic impurities significantly decreased during the loop operation, which is likely due to the continuous replacement of co2 in the loop.

4. conclusion

the sc-co2 power cycles are a perspective technology due to the high efficiency of the conversion of heat to electric power, their compactness, etc. centrum výzkumu řež s.r.o. (czech republic), along with several other partners, has investigated this field. this research institution also operates a semi-industrial facility, the supercritical carbon dioxide experimental loop. the research aims to test materials for sc-co2 technologies, the physical and thermohydraulic properties of the sc-co2 medium, as well as the coolant chemistry and the related purification & purity control methods. as typical impurities in sc-co2 cycles with indirect heating, o2, h2o, h2, co, ch4, n2 and, in some cases, also organic compounds (oil, etc.) were identified. as analytical methods suitable for co2 purity control, gas chromatography with a helium ionization detector (gc-hid) connected directly to the sampling point of the circuit and an optical hygrometer with a probe placed directly next to the sc-co2 circuit were proposed. the methods will be tested. for the determination of trace concentrations of organic compounds, gas chromatography with a mass spectrometry detector (gc-ms) can be used. this method was verified by an analysis of real samples from the sc-co2 loop operation. for the separation of impurities from the co2, processes based on adsorption will be tested within the continuation of the research program.

acknowledgements

the presented work was financially supported by the ministry of education, youth and sport of the czech republic, project lq1603 "research for susen". this work has been performed within the susen project and co-financed within the framework of the european regional development fund (erdf) in project cz.1.05/2.1.00/03.0108, and the european structural and investment funds (esif) in the project cz.02.1.01/0.0/0.0/15_008/0000293. some results that have been presented in the article were achieved within the project no. tk02030023 supported by the technology agency of the czech republic (tacr).

references
[1] department of energy. sco2 power cycles for fossil fuels. [2020-04-03], https://www.energy.gov/sco2-power-cycles/sco2-power-cycles-fossil-fuels.
[2] e. g. feher. the supercritical thermodynamic power cycle. energy conversion 8(2):85–90, 1968. https://doi.org/10.1016/0013-7480(68)90105-8.
[3] g. sodeifian, n. s. ardestani, s. a. sajadian, s. ghorbandoost. application of supercritical carbon dioxide to extract essential oil from cleome coluteoides boiss: experimental, response surface and grey wolf optimization methodology. the journal of supercritical fluids 114:55–63, 2016. https://doi.org/10.1016/j.supflu.2016.04.006.
[4] g. sodeifian, s. a. sajadian, n. s. ardestani. optimization of essential oil extraction from launaea acanthodes boiss: utilization of supercritical carbon dioxide and cosolvent. the journal of supercritical fluids 116:46–56, 2016. https://doi.org/10.1016/j.supflu.2016.05.015.
[5] g. sodeifian, s. a. sajadian, n. s. ardestani.
determination of solubility of aprepitant (an antiemetic drug for chemotherapy) in supercritical carbon dioxide: empirical and thermodynamic models. the journal of supercritical fluids 128:102–111, 2017. https://doi.org/10.1016/j.supflu.2017.05.019.
[6] g. sodeifian, f. razmimanesh, s. a. sajadian. solubility measurement of a chemotherapeutic agent (imatinib mesylate) in supercritical carbon dioxide: assessment of new empirical model. the journal of supercritical fluids 146:89–99, 2019. https://doi.org/10.1016/j.supflu.2019.01.006.
[7] g. sodeifian, s. a. sajadian, s. daneshyan. preparation of aprepitant nanoparticles (efficient drug for coping with the effects of cancer treatment) by rapid expansion of supercritical solution with solid cosolvent (ress-sc). the journal of supercritical fluids 140:72–84, 2018. https://doi.org/10.1016/j.supflu.2018.06.009.
[8] a. ameri, g. sodeifian, s. a. sajadian. lansoprazole loading of polymers by supercritical carbon dioxide impregnation: impacts of process parameters. the journal of supercritical fluids 164:104892, 2020. https://doi.org/10.1016/j.supflu.2020.104892.
[9] v. dostal, p. hejzlar, m. j. driscoll. the supercritical carbon dioxide power cycle: comparison to other advanced power cycles. nuclear technology 154(3):283–301, 2006. https://doi.org/10.13182/nt06-a3734.
[10] k. brun, p. friedman, r. dennis. fundamentals and applications of supercritical carbon dioxide (sco2) based power cycles, chap. 15 research and development: essentials, efforts, and future trends. woodhead publishing, duxford, 2017.
[11] m. li, et al. the development technology and applications of supercritical co2 power cycle in nuclear energy, solar energy and other energy industries. applied thermal engineering 126:255–275, 2017. https://doi.org/10.1016/j.applthermaleng.2017.07.173.
[12] k. brun, p. friedman, r. dennis. fundamentals and applications of supercritical carbon dioxide (sco2) based power cycles, chap. 7 turbomachinery. woodhead publishing, duxford, 2017.
[13] r. allam, et al. the oxy-fuel, supercritical co2 allam cycle: new cycle developments to produce even lower-cost electricity from fossil fuels without atmospheric emissions. in proceedings of the asme turbo expo 2014: turbine technical conference and exposition. volume 3b: oil and gas applications; organic rankine cycle power systems; supercritical co2 power cycles; wind energy. 2014. v03bt36a016, https://doi.org/10.1115/gt2014-26952.
[14] environmental protection agency (epa). standards of performance for greenhouse gas emissions from new, modified, and reconstructed stationary sources: electric utility generating units, 2015.
[15] m. perschilli, et al. supercritical co2 power cycle developments and commercialization: why sco2 can displace steam. power-gen india & central asia, 2012. [2020-04-03], http://www.echogen.com/documents/why-sco2-can-displace-steam.pdf.
[16] pris – power reactor information system. operational & long-term shutdown reactors. [2020-04-03], https://pris.iaea.org/pris/worldstatistics/operationalreactorsbytype.aspx.
[17] k. feik, j. kmošena. jadrová elektráreň a1 v kocke. slovakia, bratislava, 2010.
[18] electric power research institute (epri). performance and economic evaluation of supercritical co2 power cycle coal gasification plant, 2014.
[19] t. hudský. special purification and purity control methods for advanced nuclear reactors. diploma thesis, uct prague, 2015.
[20] c. dang, k. hoshika, e. hihara. effect of lubricating oil on the flow and heat-transfer characteristics of supercritical carbon dioxide. international journal of refrigeration 35(5):1410–1417, 2012. https://doi.org/10.1016/j.ijrefrig.2012.03.015.
[21] w. flaig, r. mertz, j. starflinger. setup of the supercritical co2 test facility "scarlett" for basic experimental investigations of a compact heat exchanger for an innovative decay heat removal system. journal of nuclear engineering and radiation science 4(3):031004, 2018. https://doi.org/10.1115/1.4039595.
[22] cast steel y type strainer. [2020-04-03], http://www.pmtengineers.com/images/product/cast_steel_y_type_strainer.jpg.
[23] b. hatala. katalog je a1 (chemia). internal vuje report. slovakia, bratislava, 2019.
[24] j. berka, et al. new experimental device for vhtr structural material testing and helium coolant chemistry investigation – high temperature helium loop in nri řež. nuclear engineering and design 251:203–207, 2012. https://doi.org/10.1016/j.nucengdes.2011.10.045.

acta polytechnica 61(3):428–440, 2021, https://doi.org/10.14311/ap.2021.61.0428

marine diesel engines operating cycle simulation for diagnostics issues
dmytro s. minchev (a, *), roman a. varbanets (b), nadiya i. alexandrovskaya (b), ludmila v. pisintsaly (b)
a national university of shipbuilding, internal combustion engines, plants and technical maintenance department, heroyiv ukrayiny ave. 9, 54025 mykolaiv, ukraine
b odessa national maritime university, marine engineering department, 34 mechnikov str., 65029 odessa, ukraine
* corresponding author: misaidima@gmail.com

abstract. the ongoing monitoring of marine diesel engines helps to detect deviations of their parameters early and prevent major failures. but the experimental diagnostics data are generally limited, so frequently it isn't possible to get all the necessary information to make a clear decision. mathematical simulation could be used to clarify the experimental data and to provide a deeper understanding of the engine conditions. in this paper, the diagnostics issues of the man 6l80mce marine diesel engine of the "father s" bulk carrier are considered. the diagnostics data were collected with depas handy equipment and present the information about the indicated processes of every engine cylinder. the on-line resource blitz-pro was used for the simulation of the engine operation and helped to prove that the variation in the exhaust valves' closing timing is responsible for the observed compression pressure difference, while the irregularity in fuel injection causes the considerable difference in the maximum pressure.

keywords: mathematical simulation, valve train, fuel injection, mathematical-based diagnostics.

1. introduction

marine diesel engine diagnostics aims to determine the actual conditions of the engine in terms of fuel efficiency, reliability and the correct operation of its subsystems, such as the fuel injection system, supercharging system, valve train, etc. however, in the case of quick diagnostics, especially during normal vessel operation, there are only limited possibilities for correct measurements of all engine parameters. mathematical simulation of the engine's operating cycle could be used for these cases to clarify the possible issues and to refer particular engine operation conditions to a benchmark. mathematical-based fault diagnostics are becoming a very promising approach. the idea is to define the difference between the actual value and the modelled value of a signal being measured [1–3]; a toy illustration of this residual check is given in the sketch after this section. different types of mathematical models are used, depending on the diagnostic tasks [4, 5]. neural network based models, which can generate a decision during the engine operation, could also be applied [6].

depas handy is a set of diagnostics tools developed for main and auxiliary marine diesel engines. it allows measuring the in-cylinder pressure diagram together with vibrodiagrams of fuel injection and intake and exhaust valve closing [7–9]. in practice, due to the complex interrelations of gas-exchange, fuel-injection and supercharging processes, environmental parameters, warm-up conditions of an engine, etc., it is rather difficult to precisely define the real issue during an engine diagnostic procedure. a mathematical simulation may help to make the final decision. blitz-pro is an online internal combustion engine operating cycle simulation tool [10]. it provides static and transient compression-ignition, spark-ignition and dual-fuel engine operating cycle simulation from any device which has access to the internet, and does not require installation. blitz-pro can be used to accumulate statistics for the given engine during its operation time, which could be useful for analyses of the current engine conditions.
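the residual idea can be illustrated with a few lines of python. the sketch below is not part of depas handy or blitz-pro: the benchmark values and the tolerance are invented for the example, while the measured maximum pressures are the 75 rpm values from table 1 further below.

def flag_deviations(measured, modelled, rel_tol=0.05):
    """return the numbers of cylinders whose measured value departs from
    the modelled benchmark by more than rel_tol (relative)."""
    return [cyl for cyl, (meas, mod) in enumerate(zip(measured, modelled), 1)
            if abs(meas - mod) > rel_tol * mod]

# maximum pressures (bar) per cylinder at 75 rpm vs an assumed benchmark
p_max_measured = [89.0, 80.3, 84.5, 79.2, 76.2, 75.2]
p_max_modelled = [84.0] * 6
print(flag_deviations(p_max_measured, p_max_modelled))   # -> [1, 4, 5, 6]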
in this paper, blitz-pro is used to analyse and clarify the diagnostics data from depas handy for the man 6l80mce main marine diesel engine.

2. mathematical model development

blitz-pro offers operating cycle synthesis for various configurations of internal combustion engines. nevertheless, for any engine configuration, the basic approach remains the same: the engine is divided into a set of open thermodynamic systems (ots), which interact with each other by energy and mass exchange processes. three types of ots are applied, depending on the computational tasks:
(1.) single-zone 0-dimensional (0-d) quasi-steady model;
(2.) two-zone 0-d quasi-steady model;
(3.) 1-d unsteady model.

for 0-d ots, a universal set of equations is used. these equations are based on the concepts of the first law of thermodynamics, equality of the working gases' properties for each point of the volume, the gas-state law and mass balance equations. the first law of thermodynamics is expressed as:

$$\frac{di_{fuel}}{d\varphi} + \sum_{j=1}^{n_1}\frac{di_j}{d\varphi} + \frac{\delta q_{comb}}{d\varphi} + \sum_{i=1}^{n_2}\frac{\delta q_{wall.i}}{d\varphi} = c_{vm}\,t\left(\sum_{j=1}^{n_1}\frac{dm_j}{d\varphi} + \frac{dm_{fuel}}{d\varphi}\right) + c_v\,m\,\frac{dt}{d\varphi} + m\,t\,\frac{d(c_{vm})}{d\varphi} + p\,\frac{dv}{d\varphi}, \quad (1)$$

where $di_{fuel}/d\varphi$, $di_j/d\varphi$ – the rates of enthalpy change due to fuel evaporation and due to mass exchange processes, respectively; $\delta q_{comb}/d\varphi$ – the heat release rate due to fuel combustion; $\delta q_{wall.i}/d\varphi$ – the heat transfer rate to the walls of the system; $dm_{fuel}/d\varphi$, $dm_j/d\varphi$ – the fuel mass flow and gases mass flow, respectively; $n_1$ – the number of the interacting thermodynamic systems involved in the mass exchange process; $n_2$ – the number of walls involved in the heat transfer process; p, t, v, m – the pressure, temperature, volume and mass of the gases mixture in the ots, respectively; $c_v$, $c_{vm}$ – the actual and average isochoric specific heat capacity, respectively.

the first-law equation is completed with the mass balance equation and with the gas-state equation:

$$dm = \sum_{j=1}^{n_1} dm_j + dm_{fuel}; \quad (2)$$

$$pv = z\,\frac{m}{\mu}\,r\,t, \quad (3)$$

where z is the compression factor, calculated by the berthelot equation:

$$z = 1 + \frac{9}{128}\,\frac{\pi}{\theta}\left(1 - \frac{6}{\theta^2}\right), \quad (4)$$

where $\pi = p/p_{crit}$ – the pressure relative to the critical pressure, $\theta = t/t_{crit}$ – the temperature relative to the critical temperature.

generally, for each open thermodynamic system, the single-zone model is applied; this means that the whole volume of the system is considered as a homogeneous mixture of gases. however, for several cases, the two-zone model is also implemented:
(1.) during the combustion period, to predict the burned gases and fresh charge temperatures for nox and co formation calculation;
(2.) during the scavenging period, for two-stroke engines, to correctly predict gas exchange processes;
(3.) to consider burned gases reflux from the cylinder into the intake receiver.

in the two-zone model, the thermodynamic system is virtually divided into two interacting thermodynamic systems. the general concept for the two-zone model is the equivalence of pressure for both zones and an impenetrable flexible boundary surface between them.
The basic equations are the application of the first law of thermodynamics:

\[
\begin{cases}
\dfrac{dI_{fuel}}{d\varphi} + \displaystyle\sum_{j=1}^{n_1}\dfrac{dI^{I}_{j}}{d\varphi} + \dfrac{dI^{I-II}}{d\varphi} + \dfrac{\delta Q_{comb}}{d\varphi} + \displaystyle\sum_{i=1}^{n_2}\dfrac{\delta Q^{I}_{wall.i}}{d\varphi} + \dfrac{\delta Q^{I-II}}{d\varphi} = \dfrac{dI^{I}}{d\varphi} + V^{I}\dfrac{dp}{d\varphi};\\[3mm]
\displaystyle\sum_{j=1}^{n_3}\dfrac{dI^{II}_{j}}{d\varphi} - \dfrac{dI^{I-II}}{d\varphi} + \displaystyle\sum_{i=1}^{n_2}\dfrac{\delta Q^{II}_{wall.i}}{d\varphi} - \dfrac{\delta Q^{I-II}}{d\varphi} = \dfrac{dI^{II}}{d\varphi} + V^{II}\dfrac{dp}{d\varphi},
\end{cases} \quad (5)
\]

where the index "I" refers to the burned gas zone and the index "II" to the fresh mixture zone, δQ^{I-II}/dφ is the heat transfer rate between the zones, dI^{I-II}/dφ is the enthalpy transfer rate between the zones, and dI^{I}/dφ, dI^{II}/dφ are the corresponding enthalpy change rates for zones I and II. These equations are used together with the general single-zone equation, which is used to find the pressure on the next time layer.

The mass flow between the interacting 0-D thermodynamic systems is calculated with a quasi-steady adiabatic nozzle concept, coupled with Prof. Orlin's approach to consider unsteady effects. The quasi-steady equations for the gas flow velocity w_static from the volume with the higher pressure "1" to the volume with the lower pressure "2" are:

\[
w_{static} = \begin{cases}
\sqrt{\dfrac{2k_1}{k_1+1}\,R_\mu T_1^{*}}, & \text{if } \dfrac{p_1}{p_2}\ge\left(\dfrac{2}{k_1+1}\right)^{\frac{k_1}{1-k_1}};\\[3mm]
\sqrt{\dfrac{2k_1}{k_1-1}\,R_\mu T_1^{*}\left[1-\left(\dfrac{p_2}{p_1^{*}}\right)^{\frac{k_1-1}{k_1}}\right]}, & \text{otherwise},
\end{cases} \quad (6)
\]

where k_1 is the adiabatic exponent and "*" indicates total (stagnation) parameters. To consider the unsteady phenomena, the pulse conservation equation is used:

\[ w\frac{\partial w}{\partial x} + \frac{\partial w}{\partial t} = -\frac{1}{\rho}\frac{\partial p}{\partial x}. \quad (7) \]

It is converted to express the gas flow acceleration:

\[ \frac{dw}{d\tau} = \frac{w_{static}\,|w_{static}| - w\,|w|}{2l}, \quad (8) \]

where l is the "active" pipe length. The active pipe length has to be set by the user for the intake and exhaust valves/ports, but for many other cases (turbine, compressor, intercooler, etc.) it is assumed automatically (usually proportional to the channel's equivalent flow diameter).

The 1-D unsteady models are used to consider the unsteady effects of the gas flow in the intake and exhaust systems. The following set of equations is applied:

\[
\begin{cases}
\dfrac{\partial\rho}{\partial\tau} + w\dfrac{\partial\rho}{\partial x} + \rho\dfrac{\partial w}{\partial x} = -w\rho\,\dfrac{d\ln f}{dx};\\[2mm]
\dfrac{\partial w}{\partial\tau} + w\dfrac{\partial w}{\partial x} + \dfrac{1}{\rho}\dfrac{\partial p}{\partial x} = -\lambda_{fric}\,\dfrac{w|w|}{2d};\\[2mm]
\dfrac{\partial s}{\partial\tau} + w\dfrac{\partial s}{\partial x} = \dfrac{1}{T}\left(\lambda_{fric}\,\dfrac{|w^{3}|}{2d} - \dfrac{4\alpha}{\rho d}\left(T_{wall}-T\right)\right);\\[2mm]
s - s_{no} = \dfrac{R_\mu}{k-1}\,\ln\dfrac{p/p_{no}}{\left(\rho/\rho_{no}\right)^{k}},
\end{cases} \quad (9)
\]

where λ_fric is the coefficient of friction, α the heat transfer coefficient from the gas to the pipe wall, T_wall the pipe wall temperature, T the gas temperature, s the entropy, the subscript "no" refers to normal conditions, d is the pipe diameter, f the pipe cross-section area, and k the adiabatic exponent.

The size h of the computational cells for the 1-D model is calculated according to the CFL condition:

\[ h = \left(a + |w_{max}|\right)\frac{\Delta\varphi_{max}}{6\,n_{crank}}, \quad (10) \]

where a is the local sonic speed, w_max the maximum gas velocity, Δφ_max the maximum calculation time step in crank angle degrees (variable time-step mesh generation is available in Blitz-PRO [11]), and n_crank the crankshaft speed; the runner of length l_pipe is then divided into cells of size h.

The heat transfer processes and the wall temperatures are considered using a quasi-steady approach. The Woschni equation [12] is used to find the heat transfer coefficient from the gases to the cylinder walls:

\[ \alpha_{gas} = A\,\frac{(10p)^{0.8}}{T^{0.53}\,d_{cyl}^{0.2}}\left[c_1 c_m + c_2\,\frac{(p-p_{mot})\,V_d\,T}{1000\,pV}\right]^{0.8}, \quad (11) \]

where A, c_1, c_2 are coefficients and p_mot is the pressure in the cylinder under motored conditions.
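A short sketch of how a correlation of the form (11) can be evaluated in code (a direct transcription of the equation; the coefficient values below are common placeholders from the Woschni literature, not the constants used in Blitz-PRO, and the example arguments are illustrative):

```python
# Transcription of the Woschni-type correlation, Eq. (11).
# A, c1, c2 are model coefficients; the values below are placeholders.

def alpha_gas(p, T, d_cyl, c_m, p_mot, V_d, V, A=128.0, c1=2.28, c2=3.24e-3):
    """Gas-to-wall heat transfer coefficient per Eq. (11)."""
    # characteristic gas velocity: piston-motion term plus combustion term
    w_char = c1 * c_m + c2 * (p - p_mot) * V_d * T / (1000.0 * p * V)
    return A * (10.0 * p) ** 0.8 / (T ** 0.53 * d_cyl ** 0.2) * w_char ** 0.8

# Illustrative full-load values for a large two-stroke engine:
print(alpha_gas(p=75.0, T=1600.0, d_cyl=0.8, c_m=6.0, p_mot=55.0,
                V_d=1.16, V=0.3))
```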
Cyl.  rpm  pmax [bar]  pcompr [bar]  αinj [c.a.d.]  φinj [c.a.d.]
1     66   75.1        61.3          −2.7           12.0
2     66   68.5        57.1          −3.2           9.6
3     66   72.6        55.5          −2.6           9.2
4     66   66.6        55.9          −4.9           10.4
5     66   66.4        52.3          −3.9           9.6
6     66   64.7        54.8          −4.2           9.5
1     75   89.0        74.2          −1.3           14.1
2     75   80.3        69.6          −3.1           12.0
3     75   84.5        67.4          −3.0           12.6
4     75   79.2        68.5          −4.9           12.5
5     75   76.2        64.6          −4.5           13.3
6     75   75.2        65.9          −4.4           11.2

Table 1. Results of the MAN 6L80MCE engine diagnostics with DEPAS Handy.

The Razleitzev fuel injection-evaporation-combustion model is applied to define the heat release rate [13]:

\[
\frac{dx}{d\varphi} =
\begin{cases}
\dfrac{1}{6n_{crank}}\,\dfrac{P_0 + 6n_{crank}\,d\sigma_{ev}/d\varphi}{1 + a_1\left(P_0 + 6n_{crank}\,d\sigma_{ev}/d\varphi\right)}, & x = 0\ \text{to}\ x = \sigma_i;\\[3mm]
\dfrac{1}{6n_{crank}}\,\dfrac{P_2 + 6n_{crank}\,d\sigma_{ev}/d\varphi}{1 + a_1\,6n_{crank}\,d\sigma_{ev}/d\varphi}, & \sigma = \sigma_i\ \text{to}\ \varphi = \varphi_{inj.end}+\Delta\varphi_k;\\[3mm]
\dfrac{a_3}{6n_{crank}}\,\xi_{a.c}\,\alpha_x\,(1 - \Delta_{u.f} - x)\,x, & \varphi = \varphi_{inj.end}+\Delta\varphi_k\ \text{to}\ \varphi_{comb.end},
\end{cases} \quad (12)
\]

where x is the burned fuel fraction, σ_i represents the amount of fuel injected during the ignition delay, σ_ev the evaporated fuel fraction, φ_inj.end the moment of the injection end, Δφ_k the extension of the second-equation usage period, ξ_a.c the function of air usage, Δ_u.f the unburned fuel fraction, and P_0, P_2, a_1, a_3 are functions.

The second-order implicit Runge-Kutta method is applied to numerically solve the sets of equations for the interacting OTS. The relative accuracy of the working medium density for all open thermodynamic systems is used as the condition of a correct solution. An adaptive fixed-point numerical method is used to avoid looping of the calculations.

3. Mathematical simulation for diagnostics issues

The MAN 6L80MCE main diesel engine is installed on the "Father S" bulk carrier. It is equipped with two ABB VTR564 turbochargers and directly drives a fixed pitch propeller. The measurements made with the DEPAS Handy equipment include two sets of indicated diagrams and vibrodiagrams of the fuel injection and the exhaust valve closing, for the engine running at 66 and 75 rpm.

The diagrams in Figure 1 and Table 1 show a considerable difference in the compression and maximum pressures between the cylinders. The injection advance is late, by about αinj = −2 to −5 c.a.d. before TDC, and considerably uneven, as is the injection duration φinj. The conclusion of the measurements is that the engine operates normally, but the true reason for the large difference in the maximum and compression pressures between the cylinders, as well as its influence on the overall engine efficiency, needs to be discovered.

We used mathematical simulation of the MAN 6L80MCE operating cycle to clarify the experimental results. To set up and calibrate the Blitz-PRO mathematical model, we used the official sea trials results of the "Father S" bulk carrier combined with information from the manufacturer's manuals. The main task of the initial set-up of the mathematical model was to obtain the engine operation parameters which are referred to as its normal condition. The important issues for this case of model set-up were the turbocharger performance maps and the actual exhaust valve diagram.

The turbocharger performance maps used for the ABB VTR564 turbocharger emulation are based on the information from [15] (see Figure 2). These maps were extrapolated and interpolated according to [16, 17] and are presented in Figure 3. A further identification of the turbocharger performance was made by comparing the sea trial data, the DEPAS Handy experimental data and the Blitz-PRO simulation results.

The relative exhaust valve lift diagram used for the simulation is shown in Figure 5b and is based on the MAN market update note [14]. It is clear that the exhaust lobe profile aims to decrease the speed of the valve seating.
Figure 1. DEPAS Handy report summary. "Injector vibro" is the signal of the vibration sensor, which shows the fuel injection valve's opening and closing timing: a) ncrank = 66 rpm, b) ncrank = 75 rpm.

Figure 2. Manufacturer's performance maps of the ABB VTR 564 turbocharger [14].

So, in terms of gas dynamics, the valve closing occurs about 25 c.a.d. in advance of the mechanical closing (which is detected by DEPAS Handy). To consider unsteady gas flow effects, the 1-D sub-models were applied to the intake ports' channel and to the exhaust runner.

Figure 4 demonstrates the comparison between the simulated and the experimental engine parameters. The standard deviation is 2 % for the maximum pressure, 4.2 % for the compression pressure, 4 % for the scavenging pressure, 3.5 % for the turbocharger speed (8.9 % if the measurement error is included), 3.2 % for the brake specific fuel consumption and 2.5 % for the fuel pump index. The relatively large deviation in the brake specific fuel oil consumption can be explained by the assumed lower calorific value of the fuel oil.

The value of the turbocharger speed at 75 rpm engine speed from the sea trial is disputable and appears to be a measurement error. The diagrams of the compression pressure and the turbocharger speed in relation to the level of the scavenging pressure (Figure 4, bottom) help to prove it. The level of the scavenge air pressure, in the case of a correct turbocharger adjustment, is determined by the turbocharger speed. In addition, if the actual compression ratio is constant, the level of pressure at the end of compression is determined directly by the level of the scavenge air pressure; this helps to confirm the correct turbocharger map setup. It is clear that the level of the scavenging pressure for the engine running at 75 rpm during the sea trials lies closely on the curve, while the reported turbocharger speed is far too small. The comparison with the experimental DEPAS Handy results confirms this conclusion, so the sea-trial value is probably due to some mistake during the measurements.

After the mathematical model setup, we used the model to answer two questions: 1) what is the reason for the compression pressure difference between the engine's cylinders; 2) what is the reason for the maximum pressure variation by cylinder.

The typical reasons for a compression pressure variation from cylinder to cylinder are: 1) uneven exhaust valve timing and 2) gas leakage through the compression rings and/or the exhaust valve seat. DEPAS Handy, as was mentioned, allows defining the actual exhaust valve closing timing with the aid of the vibration sensor.

Figure 3. Extrapolated ABB VTR 564 performance maps, used in Blitz-PRO.

Figure 4. MAN 6L80MCE diesel engine sea trials comparison.

Figure 5. Influence of the exhaust valve closing timing on the compression pressure at 66 rpm (a) and the relative valve lift diagram [15] used for the simulation (b).

Figure 5a shows the averaged results of the measurements for the six engine cylinders, for the engine running at 66 rpm. There is a good correlation between the compression pressure and the moment of valve closing.
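That correlation can be quantified in a few lines. A sketch follows; the pcompr values are taken from Table 1 (66 rpm), while the valve closing angles are hypothetical placeholders, since the averaged vibro-detected values are only shown graphically in Figure 5a:

```python
from statistics import mean

# pcompr from Table 1 at 66 rpm; closing angles are hypothetical placeholders.
p_compr     = [61.3, 57.1, 55.5, 55.9, 52.3, 54.8]          # bar
closing_cad = [253.0, 255.5, 256.0, 255.8, 257.5, 256.4]    # c.a.d., assumed

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# A strongly negative value supports "later closing -> lower p_compr".
print(f"r = {pearson(closing_cad, p_compr):+.2f}")
```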
When comparing the experimental and the simulated data, we can definitely say that the variation in pcompr is caused by an uneven exhaust valve closing, and that the condition of the valve seats and the compression rings is normal. The simulation results also give a wider picture of the influence of the exhaust valve closing timing on the engine operation (Figure 6). A late exhaust valve closing causes a decrease in the compression temperature and, thus, a longer ignition delay period. The combustion therefore starts later, decreasing the efficiency and raising the temperatures. The late exhaust valve closing also results in a decreased volumetric efficiency and a smaller air excess ratio, which is again unfavourable in terms of engine part temperatures and indicated efficiency.

The difference in the maximum pressure cannot be explained by the variation of the exhaust valve closing timing alone. It is also influenced by the uneven fuel injection from cylinder to cylinder. The mathematical simulation gives the estimated values of the injected fuel for every cylinder. The example for the engine running at 66 rpm is presented in Figure 7. The simulated levels of the compression pressure and the maximum pressure by cylinder are close to the DEPAS Handy measurement results. The simulation shows a large irregularity in the fuel injection, of about 20 %. This also causes a different output by engine cylinder (about 17 %) and a different brake specific fuel oil consumption (about 2.7 %). Obviously, the fuel injection system requires a better adjustment to achieve the same level of cylinder wear, similar engine part temperatures and a better fuel economy.

The toughest operating conditions were discovered for engine cylinder number 4. The combination of the late exhaust valve closing and the maximum fuel injection gives the lowest value of the air excess ratio, α = 2.38, while the average is 2.6. Together with the late fuel injection and the long ignition delay (caused by the late exhaust valve closing), this explains the highest in-cylinder temperature at the exhaust valve opening, tb = 1099 °C, while the average is 1030 °C. This could cause an increased wear rate of the exhaust valve and seat for this cylinder.

Figure 6. Calculated effects of the exhaust valve closing timing at 66 rpm.

Figure 7. Simulated fuel injection variation by cylinder at ncrank = 66 rpm.

4. Discussion

Mathematical simulation can be a powerful tool for analysing the experimental data from marine diesel engine diagnostic procedures. It helps to clarify the true reasons and causes of observed deviations in the operation of engine systems, such as the fuel-injection and turbocharging systems. For two-stroke marine engine diagnostic issues, the mathematical simulation provides (item (1.) is illustrated by the sketch after this list):
(1.) an estimation of the fuel-injection and power load irregularity by engine cylinder;
(2.) a detection of the most loaded cylinder in terms of the greatest in-cylinder temperature at the exhaust valve opening;
(3.) an explanation of the compression pressure variation by engine cylinder, which can be caused by differences in the exhaust valve closing between the cylinders, or by piston ring or exhaust valve leakages;
(4.) revealing the influence of the exhaust valve closing on the in-cylinder parameters, such as the gas exchange processes, the fuel combustion and the exhaust gas temperatures;
(5.) an ongoing diagnostics of the turbocharger operation.
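As a minimal sketch for item (1.) (the irregularity measure and any thresholds are illustrative, not the metric used by Blitz-PRO), the per-cylinder spread of the Table 1 measurements can be computed directly:

```python
from statistics import mean

# A simple irregularity measure over the cylinders: (max - min) / mean,
# applied to the Table 1 measurements at 66 rpm. Illustrative only.

def irregularity(values):
    """Relative spread of a per-cylinder quantity."""
    return (max(values) - min(values)) / mean(values)

p_max   = [75.1, 68.5, 72.6, 66.6, 66.4, 64.7]  # bar
p_compr = [61.3, 57.1, 55.5, 55.9, 52.3, 54.8]  # bar
phi_inj = [12.0, 9.6, 9.2, 10.4, 9.6, 9.5]      # c.a.d.

for name, vals in [("p_max", p_max), ("p_compr", p_compr), ("phi_inj", phi_inj)]:
    print(f"{name}: {100 * irregularity(vals):.1f} % spread")
```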
This approach can be used not only for older engines, which are not equipped with self-diagnostic systems, but also for modern engines, as it helps to reduce the number of necessary sensors and provides flexibility in solving issues. The mathematical simulation also helps to forecast the engine operation at different loads and speeds, since the experimental measurements generally have limited possibilities because of the conditions of the ship operation. However, the mathematical simulation needs calibration procedures based on the experimental measurements to provide accurate calculations. It also demands an interpretation of the simulation results by a highly trained specialist.

5. Conclusions

The combination of a mathematical simulation with direct measurements provides more possibilities in the field of marine diesel engine diagnostics. It helps to provide accurate diagnostics during vessel operation, even in the challenging conditions of variable environmental parameters and a lack of experimental measurements.

Blitz-PRO provides an easy-to-use ICE operating process simulation product, available online and accessible from a personal computer, tablet or smartphone. It can be useful for ICE diagnostic issues, combining fast calculations for all types of internal combustion engines without using device storage. The mechanical engineer can monitor the engine during its lifetime, using mathematical simulations to define possible deviations in the operating process.

List of symbols
α  air excess ratio
TDC  top dead centre
BDC  bottom dead centre
c.a.d.  crank angle degree
αinj  fuel injection advance [c.a.d. before TDC]
φinj  fuel injection duration [c.a.d.]
ncrank  crankshaft speed [rpm]
pmax  maximum in-cylinder pressure [bar]
pcompr  compression pressure [bar]
tb  in-cylinder temperature at exhaust valve opening [°C]

References
[1] A. Agarwal, J. Gupta, N. Sharma, A. Singh. Model-based fault detection on modern automotive engines. In: Advanced Engine Diagnostics. Energy, Environment, and Sustainability. Springer, Singapore, 2019. https://doi.org/10.1007/978-981-13-3275-3_9.
[2] S. Simani, C. Fantuzzi, R. Patton. Model-based fault diagnosis techniques. In: Model-Based Fault Diagnosis in Dynamic Systems Using Identification Techniques. Advances in Industrial Control. Springer, London, 2003. https://doi.org/10.1007/978-1-4471-3829-7_2.
[3] P. Kučera, V. Píštěk, A. Prokop, K. Řehák. Measurement of the powertrain torque. Engineering Mechanics Proceedings 24:449–452, 2018. https://doi.org/10.21495/91-8-449.
[4] R. Varbanets, O. Fomin, V. Píštěk, et al. Acoustic method for estimation of marine low-speed engine turbocharger parameters. Journal of Marine Science and Engineering 9(3), 2021. https://doi.org/10.3390/jmse9030321.
[5] P. Novotny, V. Pistek, L. Drapal, et al. Efficient approach for solution of the mechanical losses of the piston ring pack. Proceedings of the Institution of Mechanical Engineers, Part D: Journal of Automobile Engineering 227(10):1377–1388, 2013. https://doi.org/10.1177/0954407013495187.
[6] J. Desantes, J. Lopez, J. Garcia-Oliver, L. Hernández. Application of neural networks for prediction and optimization of exhaust emissions in a H.D. diesel engine. Springer, London, 2002. https://doi.org/10.4271/2002-01-1144.
[7] R. A. Varbanets, S. A. Karianskiy. Analyse of marine diesel engine performance. Journal of Polish CIMAC, Gdansk, pp. 269–275, 2012.
[8] S. Neumann, R. Varbanets, O. Kyrylash, et al. Marine diesels working cycle monitoring on the base of IMES GmbH pressure sensors data. Diagnostyka 20(2):19–26, 2019. https://doi.org/10.29354/diag/104516.
[9] R. A. Varbanets, V. I. Zalozh, A. V. Shakhov, et al. Determination of top dead centre location based on the marine diesel engine indicator diagram analysis. Diagnostyka 21(1):51–60, 2020. https://doi.org/10.29354/diag/116585.
[10] D. S. Minchev. Blitz-PRO. User's manual, 2018. http://blitzpro.zeddmalam.com/application/index.
[11] D. S. Minchev, A. V. Nagirnyi. Application of the computational mesh with variable time step for ICE operating cycle synthesis. Herald of Aeroenginebuilding (1):32–38, 2017. https://doi.org/10.15588/1727-0219-2017-1-6.
[12] G. Woschni. A universally applicable equation for the instantaneous heat transfer coefficient in the internal combustion engine. SAE Technical Paper, pp. 3065–3083, 1967. https://doi.org/10.4271/670931.
[13] N. F. Razleitsev. Modeling and optimization of combustion procedure in diesel engines. Kharkov University Publishers, 1980.
[14] MAN Diesel & Turbo. ME-B.3 engines with variable exhaust valve timing. 2013. https://marine.man-es.com/docs/librariesprovider6/mun/me-b-3-engines-with-variable-exhaust-valve-timing.pdf?sfvrsn=c2ddeaa2_13.
[15] D. Woodyard (ed.). Pounder's Marine Diesel Engines and Gas Turbines. Eighth edition. Elsevier Butterworth-Heinemann, 2004.
[16] D. S. Minchev, Y. L. Moshentsev, A. V. Nagirnyi. Extrapolation of turbocharger radial turbine characteristics. Aerospace Engineering and Technology: National Aerospace University "Kharkiv Aviation Institute" (10(87)):173–133, 2011. http://nti.khai.edu:57772/csp/nauchportal/arhiv/aktt/2011/aktt1011/minchev.pdf.
[17] D. S. Minchev, Y. L. Moshentsev, A. V. Nagirnyi. Extrapolation of experimental centrifugal compressors maps. Proceedings of National University of Shipbuilding (4):89–98, 2011. https://docplayer.ru/28442566-ekstrapolyaciya-eksperimentalnyh-harakteristik-centrobezhnyh-kompressorov.html.

Determination of rheological parameters from measurements on a viscometer with coaxial cylinders

F. Rieger

The paper deals with measurements of non-Newtonian fluids on a viscometer with coaxial cylinders. A procedure for determining the rheological model parameters is recommended for power-law fluids and Bingham plastics.

Keywords: viscometer with coaxial cylinders, power-law fluids, Bingham plastics.

1 Introduction

A rotational viscometer with coaxial cylinders is widely used in rheological measurements.
Its common configuration consists of an inner rotating cylinder with radius R1 and length L, and an outer stationary cylinder with radius R2 – see Fig. 1. The dependence of the shear stress τ on the Newtonian shear rate γ̇N at a specified radius is usually obtained from the measurements. The power law and the Bingham model are the simplest rheological models. The aim of this paper is to show a way to calculate the parameters of these models from the experimental values τ and γ̇N. For this purpose, the flow in the viscometer must be analysed.

2 Theory

If the influence of the bottom and the interface is neglected, the only non-zero velocity component u_θ depends, in cylindrical coordinates, on the radius r only. The component of Cauchy's equation of motion for this case takes the form (see e.g. [1])

\[ \frac{d\left(\tau r^{2}\right)}{dr} = 0. \quad (1) \]

Integrating this equation, we obtain the following relation for the shear stress:

\[ \tau = \frac{C_1}{r^{2}}. \quad (2) \]

The shear rate for this type of flow is given by the relation (see e.g. [1])

\[ \dot\gamma = r\,\frac{d}{dr}\left(\frac{u}{r}\right). \quad (3) \]

2.1 Power law fluids

The power law is the simplest model that is widely used for describing the rheological behaviour of non-Newtonian fluids. Using this model, the dependence of the shear stress on the shear rate can be expressed by the following relation:

\[ \tau = K\,\dot\gamma\,|\dot\gamma|^{\,n-1}, \quad (4) \]

where K is the coefficient of consistency and n stands for the flow behaviour index. Inserting (2) and (3) into Eq. (4) and taking into consideration that the shear rate in the gap is negative, we obtain

\[ K\left(-r\,\frac{d}{dr}\frac{u}{r}\right)^{n} = \frac{C_1}{r^{2}}. \quad (5) \]

Integrating the above equation, we obtain

\[ \frac{u}{r} = \frac{n}{2}\left(\frac{C_1}{K}\right)^{1/n} r^{-2/n} + C_2. \quad (6) \]

For determining the integration constants, the following boundary conditions are necessary:

\[ r = R_1:\quad \frac{u}{r} = \omega, \quad (7a) \]
\[ r = R_2:\quad \frac{u}{r} = 0. \quad (7b) \]

Fig. 1: Viscometer with coaxial cylinders

Using them, the following expression for the integration constant C1 is obtained:

\[ C_1 = K\left(\frac{2\omega}{n}\right)^{n}\frac{1}{\left(R_1^{-2/n}-R_2^{-2/n}\right)^{n}}. \quad (8) \]

Inserting (8) into (5), the following equation for the shear rate can be obtained after rearrangement:

\[ \dot\gamma(r) = \frac{2\omega}{n}\,\frac{\left(R_1/r\right)^{2/n}}{1-\kappa^{2/n}}, \quad (9) \]

where κ = R1/R2. The dependence of the dimensionless shear rate γ̇* = γ̇/ω on the dimensionless coordinate defined by the relation y* = y/(R2 − R1) (where y is the radial distance from the rotating cylinder) for the ratio κ = 0.5 and several n values is shown in Fig. 2. The same dependence for the ratio κ = 0.9 is shown in Fig. 3. From Figs. 2 and 3 it is obvious that the maximum shear rate is at the inner cylinder (y* = 0) and the minimum is at the outer cylinder (y* = 1). From the above-mentioned figures for both κ values it can also be seen that the effect of the flow behaviour index n on the shape of the shear rate profiles is more pronounced at the smaller κ value.
Fig. 2: Dependence of the dimensionless shear rate values γ̇* on the dimensionless distance y* for κ = 0.5 and selected flow behaviour index values n

Fig. 3: Dependence of the dimensionless shear rate values γ̇* on the dimensionless distance y* for κ = 0.9 and selected flow behaviour index values n

At κ = 1 (parallel plate asymptote) the flow behaviour index n has no effect on the shear rate profiles – the shear rate is constant.

2.2 Bingham plastics

The simplest model for viscoplastic behaviour is the Bingham model:

\[ \dot\gamma = 0 \quad \text{for } \tau \le \tau_0, \quad (10a) \]
\[ \tau = \tau_0 + \eta_p\,\dot\gamma \quad \text{for } \tau > \tau_0, \quad (10b) \]

where ηp is the plastic viscosity and τ0 stands for the yield stress. Inserting (2) and (3) into Eq. (10b) and taking into consideration that the shear rate in the gap is negative, we obtain

\[ \tau_0 - \eta_p\,r\,\frac{d}{dr}\frac{u}{r} = \frac{C_1}{r^{2}}, \quad (11) \]

and after integration

\[ \frac{u}{r} = \frac{\tau_0}{\eta_p}\ln r + \frac{C_1}{2\eta_p r^{2}} + C_2. \quad (12) \]

Using the boundary conditions (7), we get after some manipulation

\[ C_1 = \frac{2R_1^{2}R_2^{2}}{R_2^{2}-R_1^{2}}\left(\eta_p\omega + \tau_0\ln\frac{R_2}{R_1}\right). \quad (13) \]

Combining Eqs. (3), (11) and (13), the equation for the shear rate distribution can be obtained, and the shear rate profiles shown in Figs. 4 and 5 can be depicted.

Fig. 4: Dependence of γ̇* on y* for κ = 0.5 and selected values β* (β*c = 1.24)

Fig. 5: Dependence of γ̇* on y* for κ = 0.9 and selected values β* (β*c = 83.9)

From both figures it can be seen that the γ̇* distribution depends on κ and on the dimensionless parameter

\[ \beta^{*} = \frac{\tau_0}{\eta_p\,\omega}. \quad (14) \]

The distribution is more pronounced for greater β* and lower κ values. However, β* must be lower than the critical value at which the shear stress at the outer cylinder is equal to the yield stress, τ2 = τ0. At greater values of β*, equation (10b) does not hold in the whole gap between the cylinders. The critical value β*c can be calculated from the equation

\[ \beta^{*}_{c} = \frac{2\kappa^{2}}{1-\kappa^{2}-2\kappa^{2}\ln(1/\kappa)}, \quad (15) \]

obtained from the condition τ(r) = τ0 at r = R2. The curves in Figs. 4 and 5 for β* = 0 correspond to Newtonian fluids.

3 Evaluation of rheological measurements

The evaluation procedure is influenced by the radius to which the primary measured data (the shear stress τ and the Newtonian shear rate γ̇N) are related.

3.1 Power law fluids

a) Data related to the inner radius R1

Equation (4) at the inner cylinder surface takes the form

\[ \tau_1 = K\,\dot\gamma_1^{\,n}, \quad (16) \]

where τ1 and γ̇1 are the positive values of the shear stress and the shear rate at the inner cylinder surface. Using (9) we can obtain

\[ \dot\gamma_1 = \frac{2\omega}{n\left(1-\kappa^{2/n}\right)}. \quad (17) \]

However, it is the dependence of the shear stress τ1 on the Newtonian shear rate γ̇1N at the surface of the inner cylinder that is obtained from the measurements. For this reason, equation (16) can be rewritten in the form

\[ \tau_1 = K\left(\frac{\dot\gamma_1}{\dot\gamma_{1N}}\right)^{n}\dot\gamma_{1N}^{\,n} = K_1\,\dot\gamma_{1N}^{\,n}, \quad (18) \]

where γ̇1N is expressed from (17) for n = 1:

\[ \dot\gamma_{1N} = \frac{2\omega}{1-\kappa^{2}}. \quad (19) \]

The dependences of the ratio γ̇1/γ̇1N on the flow behaviour index n for the cylinder ratio values κ = 0.5 and 0.9 are depicted in Fig. 6.
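The profile (9) and the ratio of Eqs. (17) and (19) are straightforward to evaluate numerically. A brief sketch (the parameter values are illustrative):

```python
# Evaluate the power-law shear rate profile (Eq. 9) and the correction
# ratio gamma1/gamma1N of Eqs. (17) and (19). n, kappa, omega illustrative.

def shear_rate(r_rel, n, kappa, omega=1.0):
    """Shear rate at r = r_rel * R2 inside the gap, Eq. (9), with
    R1/r written as kappa / r_rel."""
    return (2.0 * omega / n) * (kappa / r_rel) ** (2.0 / n) / (1.0 - kappa ** (2.0 / n))

def ratio_inner(n, kappa):
    """gamma1 / gamma1N at the inner cylinder, Eqs. (17) and (19)."""
    return (1.0 - kappa ** 2) / (n * (1.0 - kappa ** (2.0 / n)))

# The ratio grows as n decreases, and much faster for the wider gap:
for n in (1.0, 0.5, 0.2):
    print(f"n={n}: {ratio_inner(n, 0.5):.2f} (kappa=0.5), "
          f"{ratio_inner(n, 0.9):.2f} (kappa=0.9)")
```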
This figure shows that the values of this ratio (necessary for the Newtonian shear rate correction) increase with a decreasing flow behaviour index n and are significantly greater at κ = 0.5 than at κ = 0.9. From (18) it can be seen that the dependence of τ1 on γ̇1N is a straight line with slope n in logarithmic coordinates, and the coefficient of consistency K can be calculated by the equation

\[ K = K_1\left(\frac{\dot\gamma_{1N}}{\dot\gamma_1}\right)^{n} = K_1\left[\frac{n\left(1-\kappa^{2/n}\right)}{1-\kappa^{2}}\right]^{n}. \quad (20) \]

Fig. 6: Dependence of the ratio γ̇1/γ̇1N on the flow behaviour index n for selected κ values

The dependence of the ratio K/K1 on n for several values of κ is shown in Fig. 7. From this figure it can be seen that it exhibits a minimum that depends on κ (for κ = 0, the minimum K/K1 = 0.692 at n = 1/e = 0.368).

b) Data related to the mean radius Rm = (R1 + R2)/2

Equation (4) written for the mean radius takes the form

\[ \tau_m = K\,\dot\gamma_m^{\,n}, \quad (21) \]

where τm and γ̇m are the positive values of the shear stress and the shear rate at the mean radius. Using (9) we can obtain

\[ \dot\gamma_m = \frac{2\omega}{n}\,\frac{\left(\dfrac{2\kappa}{1+\kappa}\right)^{2/n}}{1-\kappa^{2/n}}. \quad (22) \]

However, it is the dependence of the shear stress τm on the Newtonian shear rate γ̇mN at the mean radius that is obtained from the measurements. For this reason, equation (21) can be rewritten in the form

\[ \tau_m = K\left(\frac{\dot\gamma_m}{\dot\gamma_{mN}}\right)^{n}\dot\gamma_{mN}^{\,n} = K_m\,\dot\gamma_{mN}^{\,n}, \quad (23) \]

where γ̇mN can be obtained from (22) for n = 1:

\[ \dot\gamma_{mN} = \frac{2\omega\left(\dfrac{2\kappa}{1+\kappa}\right)^{2}}{1-\kappa^{2}}. \quad (24) \]

The dependences of the ratio γ̇m/γ̇mN on the flow behaviour index n for the cylinder ratio values κ = 0.5 and 0.9 are depicted in Fig. 8. This figure shows that the values of this ratio are less than 1, decrease with a decreasing flow behaviour index n, and are significantly lower at κ = 0.5 than at κ = 0.9.

Fig. 7: Dependence of the ratio K/K1 on the flow behaviour index n for selected κ values

Fig. 8: Dependence of the ratio γ̇m/γ̇mN on the flow behaviour index n for selected κ values

From (23) it can be seen that the dependence of τm on γ̇mN is a straight line with slope n in logarithmic coordinates, and the coefficient of consistency K can be calculated by the equation

\[ K = K_m\left(\frac{\dot\gamma_{mN}}{\dot\gamma_m}\right)^{n}. \quad (25) \]

The dependence of the ratio K/Km on n for κ = 0.5 and 0.9 is shown in Fig. 9. This figure shows that the ratio K/Km is greater than 1 and that it increases with decreasing n.

c) Data related to the mean radius presented by Klein [2],

\[ R_k = R_1R_2\sqrt{\frac{2}{R_1^{2}+R_2^{2}}} \]

Equation (4) written for this mean radius takes the form

\[ \tau_k = K\,\dot\gamma_k^{\,n}, \quad (26) \]

where τk and γ̇k are the positive values of the shear stress and the shear rate at the mean radius Rk. Using (9) we can obtain

\[ \dot\gamma_k = \frac{2\omega}{n}\,\frac{\left(\dfrac{1+\kappa^{2}}{2}\right)^{1/n}}{1-\kappa^{2/n}}. \quad (27) \]
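A small sketch that evaluates the consistency-coefficient corrections of Eqs. (20) and (25), together with the analogous correction for the Klein radius (cf. Eq. (30) below); the parameter values are illustrative:

```python
# Consistency-coefficient corrections K/K_ref = (gammaN_ref / gamma_ref)^n.
# Both rates at a reference radius r carry the factor (R1/r)^(2/n);
# see Eqs. (17)-(19), (22), (24) and (27).

def k_over_kref(n, kappa, r1_over_r):
    gamma  = (r1_over_r ** (2.0 / n)) / (n * (1.0 - kappa ** (2.0 / n)))
    gammaN = (r1_over_r ** 2.0) / (1.0 - kappa ** 2.0)
    return (gammaN / gamma) ** n

def corrections(n, kappa):
    return {
        "K/K1": k_over_kref(n, kappa, 1.0),                             # r = R1
        "K/Km": k_over_kref(n, kappa, 2.0 * kappa / (1.0 + kappa)),     # r = Rm
        "K/Kk": k_over_kref(n, kappa, ((1.0 + kappa ** 2) / 2.0) ** 0.5),  # r = Rk
    }

# K/K1 exhibits its minimum near n = 1/e for small kappa:
print(corrections(n=0.368, kappa=0.5))
print(corrections(n=0.368, kappa=0.9))
```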
Fig. 9: Dependence of the ratio K/Km on the flow behaviour index n for selected κ values

Fig. 10: Dependence of the ratio γ̇k/γ̇kN on the flow behaviour index n for selected κ values

However, it is the dependence of the shear stress τk on the Newtonian shear rate γ̇kN at the mean radius Rk that is obtained from the measurements. For this reason, equation (26) can be rewritten in the form

\[ \tau_k = K\left(\frac{\dot\gamma_k}{\dot\gamma_{kN}}\right)^{n}\dot\gamma_{kN}^{\,n} = K_k\,\dot\gamma_{kN}^{\,n}, \quad (28) \]

where γ̇kN can be obtained from (27) for n = 1:

\[ \dot\gamma_{kN} = \frac{\omega\left(1+\kappa^{2}\right)}{1-\kappa^{2}}. \quad (29) \]

The dependences of the ratio γ̇k/γ̇kN on the flow behaviour index n for the cylinder ratio values κ = 0.5 and 0.9 are depicted in Fig. 10. This figure shows that the values of this ratio are approximately 1 for average values of n and that they decrease rapidly with a decreasing flow behaviour index at low n values, especially at κ = 0.5. From (28) it can be seen that the dependence of τk on γ̇kN is a straight line with slope n in logarithmic coordinates, and the coefficient of consistency K can be calculated by the equation

\[ K = K_k\left(\frac{\dot\gamma_{kN}}{\dot\gamma_k}\right)^{n}. \quad (30) \]

The dependence of the ratio K/Kk on n for κ = 0.5 and 0.9 is shown in Fig. 11. This figure shows that the ratio K/Kk is approximately 1 for the usual values of n and that it increases rapidly with a decreasing flow behaviour index at low n values, especially at κ = 0.5.

Fig. 11: Dependence of the ratio K/Kk on the flow behaviour index n for selected κ values

Fig. 12: Dependence of the coefficient of consistency ratios on the flow index n for κ = 0.9

The ratios of the real and apparent consistency coefficients related to each of the radii mentioned above are compared in Figs. 12 and 13 for κ = 0.9 and 0.5. Figs. 12 and 13 show that for κ = 0.9, and for κ = 0.5 with n ≥ 0.36, practically no correction of K is necessary (the error is smaller than 3 %) when the data are related to the radius defined according to Klein.

3.2 Bingham plastics

The dependences of the ratio of the shear rate to the Newtonian shear rate (related to the reference radii) on β* for κ = 0.5 and 0.9 are shown in Figs. 14 and 15. These figures show that the shear rate values related to the radii R1 and Rm are greater than the Newtonian values, while the shear rate values related to the radius Rk are smaller than the Newtonian values. Inserting (13) into (2), it can be derived that the dependence of the shear stress on the Newtonian shear rate values (related to the reference radii) can be expressed in the form

\[ \tau = \tau_{0ref} + \eta_p\,\dot\gamma_N. \quad (31) \]

The ratio τ0ref/τ0 depends on κ and on the reference radius to which the measured data are related, and the following formulas can be derived:

a) data related to the inner radius R1:

\[ \frac{\tau_{01}}{\tau_0} = \frac{2\ln(1/\kappa)}{1-\kappa^{2}}; \quad (32) \]

Fig. 13: Dependence of the coefficient of consistency ratios on the flow index n for κ = 0.5
Fig. 14: Dependences of the shear rate to Newtonian shear rate ratio (related to the reference radii) on β* for κ = 0.5

b) data related to the mean radius Rm = (R1 + R2)/2:

\[ \frac{\tau_{0m}}{\tau_0} = \frac{8\kappa^{2}\ln(1/\kappa)}{(1+\kappa)^{2}\left(1-\kappa^{2}\right)}; \quad (33) \]

c) data related to the mean radius presented by Klein [2], R_k = R_1R_2\sqrt{2/(R_1^2+R_2^2)}:

\[ \frac{\tau_{0k}}{\tau_0} = \frac{\left(1+\kappa^{2}\right)\ln(1/\kappa)}{1-\kappa^{2}}. \quad (34) \]

The dependences of the ratio τ0ref/τ0 on κ for the different reference radii are depicted in Fig. 16. This figure shows that practically no correction is necessary for κ ≥ 0.74 when the data are related to the radius defined according to Klein (the error is smaller than 3 %).

4 Conclusion

The following procedure can be recommended for determining the rheological model parameters:

1) Comparing the values of τ and γ̇N reported by the manufacturer of the viscometer, we determine the reference radius to which the measured data are related (Eqs. (19), (24), (29)).

2) From the measured shear stress and Newtonian shear rate values, the values of Kref and n are obtained for power-law fluids (Eqs. (18), (23), (28)), or τ0ref and ηp are obtained for Bingham plastics (Eq. (31)).

3) The values of the rheological model parameters, i.e. the consistency coefficient K (Eqs. (20), (25), (30)) or the yield stress τ0 (Eqs. (32), (33), (34)), must be determined.

Fig. 15: Dependences of the shear rate to Newtonian shear rate ratio (related to the reference radii) on β* for κ = 0.9

Fig. 16: Dependences of the ratio τ0ref/τ0 on κ for the different reference radii

List of symbols
K  coefficient of consistency
L  length of cylinder
n  flow index
r  radial coordinate
R1  inner rotating cylinder radius
R2  outer stationary cylinder radius
u  velocity
γ̇  shear rate
θ  tangential coordinate
κ = R1/R2  radius ratio
ηp  plastic viscosity
ω  angular velocity
τ  shear stress
τ0  yield stress

References
[1] Middleman, S.: The Flow of High Polymers. New York: Interscience Publishers, 1968.
[2] Klein, G.: Basic principles of rheology and the application of rheological measurement methods for evaluating ceramic suspensions. In: Ceramic Forum International Yearbook 2005 (edited by H. Reh). Baden-Baden: Göller Verlag, 2004, p. 31–42.

Prof. Ing. František Rieger, DrSc.
phone: +420 224 352 548
email: frantisek.rieger@fs.cvut.cz
Czech Technical University in Prague, Faculty of Mechanical Engineering
Technická 4, 166 07 Praha 6, Czech Republic

A note on entanglement classification for tripartite mixed states

Hui Zhao^a, Yu-Qiu Liu^a, Zhi-Xi Wang^b, Shao-Ming Fei^{b,∗}

^a Beijing University of Technology, Faculty of Science, Beijing 100124, China
^b Capital Normal University, School of Mathematical Sciences, Beijing 100037, China
∗ corresponding author: feishm@cnu.edu.cn

Abstract. We study the classification of entanglement in tripartite systems by using Bell-type inequalities and the principal basis.
By using Bell functions and the generalized three-dimensional Pauli operators, we present a set of Bell inequalities which classifies the entanglement of triqutrit fully separable and bi-separable mixed states. By using the correlation tensors in the principal basis representation of density matrices, we obtain separability criteria for fully separable and bi-separable 2 ⊗ 2 ⊗ 3 quantum mixed states. A detailed example is given to illustrate our criteria in classifying the tripartite entanglement.

Keywords: Bell inequalities, separability, principal basis.

1. Introduction

One of the most remarkable features that distinguishes quantum mechanics from classical mechanics is quantum entanglement. Entanglement was first recognized by EPR [1], with significant progress made by Bell [2] toward the resolution of the EPR problem. Since Bell's work, the derivation of new Bell-like inequalities has been one of the important and challenging subjects. CHSH generalized the original Bell inequalities to a more general case for two observers [3]. In [4] the authors proposed an estimation of quantum entanglement by measuring the maximum violation of the Bell inequality without information on the reduced density matrices. In [5] a series of Bell inequalities for multipartite states was presented, with sufficient and necessary conditions to detect certain entanglement. There have been many important generalizations and interesting applications of Bell inequalities [6–8]. By calculating the measures of entanglement and the quantum violation of a Bell-type inequality, a relationship between the entanglement measure and the amount of quantum violation was derived in [9]. However, for high-dimensional multipartite quantum systems, the results on such relationships between entanglement and nonlocal violation are still far from satisfactory.

In [10], an upper bound on the fully entangled fraction for arbitrary dimensional states was derived by using the principal basis representation of density matrices. Based on the norms of correlation vectors, the authors of [11] presented an approach to detect entanglement in arbitrary dimensional quantum systems. Separability criteria for both bipartite and multipartite quantum states were also derived in terms of the correlation matrices [12].

In this paper, by using the Bell function and the generalized three-dimensional Pauli operators, we derive a quantum upper bound for 3 ⊗ 3 ⊗ 3 quantum systems. We present a classification of entanglement for triqutrit mixed states by a set of Bell inequalities. These inequalities distinguish fully separable and bi-separable states. Moreover, we propose criteria to detect the classification of entanglement for 2 ⊗ 2 ⊗ 3 mixed states, with correlation tensor matrices in the principal basis representation of density matrices.

2. Entanglement identification with Bell inequalities

We first consider relations between entanglement and non-locality for 3 ⊗ 3 ⊗ 3 quantum systems. Consider three observers who may choose independently between two dichotomic observables, denoted by a_i and b_i for the i-th observer, i = 1, 2, 3. Let V̂_i denote the measurement operator associated with the variable v_i ∈ {a_i, b_i} of the i-th observer. We choose a complete set of orthonormal basis vectors |k⟩ to describe an orthogonal measurement of a given variable v_i. The measurement outcomes are indicated by the set of eigenvalues 1, λ, λ², where λ = exp(2πi/3) is a primitive third root of unity.
Therefore the measurement operator can be represented by V̂_i = Σ_{k=0}^{2} λ^k |k⟩⟨k|. Inspired by the Bell function (the expected value of the Bell operator) constructed in [13], we introduce the following Bell operator:

\[ B = \sum_{j=1}^{2}\frac{1}{4}\left(\hat a_1^{\,j}\otimes\hat a_2^{\,j}\otimes\hat a_3^{\,j} + \lambda^j\,\hat a_1^{\,j}\otimes\hat b_2^{\,j}\otimes\hat b_3^{\,j} + \lambda^j\,\hat b_1^{\,j}\otimes\hat a_2^{\,j}\otimes\hat b_3^{\,j} + \lambda^j\,\hat b_1^{\,j}\otimes\hat b_2^{\,j}\otimes\hat a_3^{\,j}\right), \quad (1) \]

where â_i^j (b̂_i^j) denotes the j-th power of â_i (b̂_i). Next we construct three Bell operators in terms of Eq. (1). Consider the three-dimensional Pauli operators [14] X̂ and Ẑ, which satisfy X̂|k⟩ = |k+1⟩, Ẑ|k⟩ = λ^k|k⟩, X̂³ = I, Ẑ³ = I, where I denotes the identity operator. If we replace â_i and b̂_i with the unitary operators â_1 = Ẑ, â_2 = λ²X̂Ẑ, â_3 = X̂Ẑ², b̂_1 = Ẑ, b̂_2 = X̂Ẑ² and b̂_3 = λ²X̂Ẑ, we obtain

\[ B_1 = \sum_{j=1}^{2}\frac{1}{4}\left[\hat Z^{j}\otimes(\lambda^2\hat X\hat Z)^{j}\otimes(\hat X\hat Z^2)^{j} + \lambda^j\,\hat Z^{j}\otimes(\hat X\hat Z^2)^{j}\otimes(\lambda^2\hat X\hat Z)^{j} + \lambda^j\,\hat Z^{j}\otimes(\lambda^2\hat X\hat Z)^{j}\otimes(\lambda^2\hat X\hat Z)^{j} + \lambda^j\,\hat Z^{j}\otimes(\hat X\hat Z^2)^{j}\otimes(\hat X\hat Z^2)^{j}\right]. \quad (2) \]

If we choose the unitary operators â_1 = λ²X̂Ẑ, â_2 = X̂Ẑ², â_3 = Ẑ, b̂_1 = X̂Ẑ², b̂_2 = λ²X̂Ẑ and b̂_3 = Ẑ, we have

\[ B_2 = \sum_{j=1}^{2}\frac{1}{4}\left[(\lambda^2\hat X\hat Z)^{j}\otimes(\hat X\hat Z^2)^{j}\otimes\hat Z^{j} + \lambda^j\,(\lambda^2\hat X\hat Z)^{j}\otimes(\lambda^2\hat X\hat Z)^{j}\otimes\hat Z^{j} + \lambda^j\,(\hat X\hat Z^2)^{j}\otimes(\hat X\hat Z^2)^{j}\otimes\hat Z^{j} + \lambda^j\,(\hat X\hat Z^2)^{j}\otimes(\lambda^2\hat X\hat Z)^{j}\otimes\hat Z^{j}\right]. \quad (3) \]

Taking â_1 = λ²X̂Ẑ, â_2 = Ẑ, â_3 = X̂Ẑ², b̂_1 = X̂Ẑ², b̂_2 = Ẑ and b̂_3 = λ²X̂Ẑ, we have

\[ B_3 = \sum_{j=1}^{2}\frac{1}{4}\left[(\lambda^2\hat X\hat Z)^{j}\otimes\hat Z^{j}\otimes(\hat X\hat Z^2)^{j} + \lambda^j\,(\lambda^2\hat X\hat Z)^{j}\otimes\hat Z^{j}\otimes(\lambda^2\hat X\hat Z)^{j} + \lambda^j\,(\hat X\hat Z^2)^{j}\otimes\hat Z^{j}\otimes(\lambda^2\hat X\hat Z)^{j} + \lambda^j\,(\hat X\hat Z^2)^{j}\otimes\hat Z^{j}\otimes(\hat X\hat Z^2)^{j}\right]. \quad (4) \]

Concerning the bounds on the mean values |⟨B_i⟩| of the operators B_i, i = 1, 2, 3, we have the following conclusions.

Theorem 1. For 3 ⊗ 3 ⊗ 3 mixed states, we have the inequality |⟨B_i⟩| ≤ 5/4, i = 1, 2, 3.

Proof. Due to the linearity of the average values, it is sufficient to consider pure states. Any triqutrit pure state can be written as

\[ |\psi\rangle = c_1|000\rangle + c_2|011\rangle + c_3|012\rangle + c_4|021\rangle + c_5|022\rangle + c_6|101\rangle + c_7|102\rangle + c_8|110\rangle + c_9|111\rangle + c_{10}|120\rangle + c_{11}|122\rangle + c_{12}|201\rangle + c_{13}|202\rangle + c_{14}|210\rangle + c_{15}|212\rangle + c_{16}|220\rangle + c_{17}|221\rangle + c_{18}|222\rangle, \quad (5) \]

where c_5, c_{11}, c_{13}, c_{15}, c_{16}, c_{17} and c_{18} are real and nonnegative, |c_1| ≥ |c_i| for i = 1, 2, ..., 18, |c_9| ≥ |c_{18}|, and Σ_{i=1}^{18}|c_i|² = 1. Therefore,

\[ |\langle B_1\rangle| = \left|\tfrac{1}{4}\left(-c_1c_2 + 5c_1c_5 - c_2c_5 - c_6c_{10} + 2c_7c_8 + 2c_9c_{11} + 2c_{12}c_{15} + 5c_{12}c_{16} - c_{13}c_{14} - c_{13}c_{17} - 4c_{14}c_{17} + 5c_{15}c_{16}\right)\right| \le \frac{1}{8}\times 10\times\sum_{i=1}^{18}c_i^{2} = \frac{5}{4}. \quad (6) \]

Similarly one can prove that |⟨B_i⟩| ≤ 5/4 for i = 2, 3. □

Theorem 2. If a triqutrit mixed state ρ is fully separable, then |⟨B_i⟩| = 0, i = 1, 2, 3.

The proof is straightforward. Due to the linearity of the average values, it is sufficient to consider pure states. Under suitable bases, a fully separable pure state can be written as |ψ⟩ = |0⟩ ⊗ |0⟩ ⊗ |0⟩. Therefore |⟨B_i⟩| = |tr(ρB_i)| = 0.

Theorem 3. For bi-separable states ρ_{i|jk} under the bipartition i and jk, i ≠ j ≠ k ∈ {1, 2, 3}, we have

|⟨B_1⟩| ≤ 3/4, |⟨B_2⟩| = 0, |⟨B_3⟩| = 0 for ρ_{1|23};
|⟨B_1⟩| = 0, |⟨B_2⟩| ≤ 3/4, |⟨B_3⟩| = 0 for ρ_{3|12};
|⟨B_1⟩| = 0, |⟨B_2⟩| = 0, |⟨B_3⟩| ≤ 3/4 for ρ_{2|13}.

Proof. It is sufficient to consider pure states only. Every bi-separable pure state ρ_{1|23} can be written, via a suitable choice of bases [15], as |ψ⟩ = |0⟩ ⊗ (c_0|00⟩ + c_1|11⟩ + c_2|22⟩), where |c_0| ≥ |c_1| ≥ |c_2| and Σ_{i=0}^{2}|c_i|² = 1. Therefore we have, by direct calculation,

\[ |\langle B_1\rangle| = \left|\tfrac{1}{4}\left(5c_2c_0 - c_0c_1 - c_1c_2\right)\right| \le \tfrac{1}{8}\left(5\left(c_2^{2}+c_0^{2}\right) + \left(c_0^{2}+c_1^{2}\right) + \left(c_1^{2}+c_2^{2}\right)\right) \le \tfrac{3}{4}. \]

It is straightforward to prove similarly that |⟨B_2⟩| = 0 and |⟨B_3⟩| = 0. For the bi-separable states ρ_{3|12} and ρ_{2|13}, the results can be proved in a similar way. □
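A small numeric sketch (not from the paper) that assembles B1 of Eq. (2) and probes Theorems 1 and 2 on sampled states; the numpy conventions and sample sizes are assumptions of the sketch:

```python
import numpy as np

# Generalized 3-dim Pauli operators: X|k> = |k+1 mod 3>, Z|k> = w^k |k>.
w = np.exp(2j * np.pi / 3)
X = np.roll(np.eye(3), 1, axis=0)
Z = np.diag([1, w, w ** 2])

def kron3(a, b, c):
    return np.kron(np.kron(a, b), c)

def mp(m, j):
    return np.linalg.matrix_power(m, j)

a2, a3 = w ** 2 * X @ Z, X @ Z @ Z   # slot-2 and slot-3 "a" settings of B1
b2, b3 = X @ Z @ Z, w ** 2 * X @ Z   # slot-2 and slot-3 "b" settings of B1

# Assemble B1 exactly as in Eq. (2); the first slot always carries Z^j.
B1 = sum(0.25 * (kron3(mp(Z, j), mp(a2, j), mp(a3, j))
                 + w ** j * kron3(mp(Z, j), mp(b2, j), mp(b3, j))
                 + w ** j * kron3(mp(Z, j), mp(a2, j), mp(b3, j))
                 + w ** j * kron3(mp(Z, j), mp(b2, j), mp(a3, j)))
          for j in (1, 2))

rng = np.random.default_rng(1)

def rand_state(d):
    v = rng.normal(size=d) + 1j * rng.normal(size=d)
    return v / np.linalg.norm(v)

# Theorem 1 asserts |<B1>| <= 5/4 = 1.25 for all states; sample pure states:
print(max(abs(np.vdot(s, B1 @ s)) for s in (rand_state(27) for _ in range(5000))))

# Theorem 2: for the fully separable representative |000>, <B1> vanishes:
e000 = np.zeros(27)
e000[0] = 1.0
print(abs(np.vdot(e000, B1 @ e000)))  # prints 0.0 up to float rounding
```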
The above relations given in Theorems 1–3 give rise to a characterization of quantum entanglement based on the Bell-type violations. If we consider |⟨B_i⟩|, i = 1, 2, 3, to be three coordinates, then all triqutrit states are confined in a cube of size 5/4 × 5/4 × 5/4, while the bi-separable states are confined in a cube of size 3/4 × 3/4 × 3/4; see Figure 1.

Figure 1. All states lie in the yellow cube, while the green cube contains the bi-separable states.

3. Entanglement classification under principal basis

Consider the principal basis of the d-dimensional Hilbert space H with computational basis |i⟩, i = 1, 2, ..., d. Let E_{ij} be the d×d unit matrix with the only nonzero entry 1 at the position (i, j). Let ω be a fixed d-th primitive root of unity. The principal basis is given by

\[ A_{ij} = \sum_{m\in\mathbb{Z}_d}\omega^{im}E_{m,m+j}, \quad (7) \]

where ω^d = 1, i, j ∈ Z_d and Z_d is Z modulo d. The set {A_{ij}} spans the principal Cartan subalgebra of gl(d). Under the standard inner product (X|Y) = tr(XY) of matrices X and Y, the dual basis of the principal basis {A_{ij}} is {(ω^{ij}/d)A_{-i,-j}}, which follows from the algebraic property of the principal matrices, A_{ij}A_{kl} = ω^{jk}A_{i+k,j+l}. Namely, A†_{ij} = ω^{ij}A_{-i,-j}, and thus tr(A_{ij}A†_{kl}) = δ_{ik}δ_{jl}d [10].

Next we consider the entanglement of 2 ⊗ 2 ⊗ 3 systems. Let {A_{ij}} and {B_{ij}} be the principal bases of the 2-dimensional and the 3-dimensional Hilbert spaces, respectively. Any quantum state ρ ∈ H²_1 ⊗ H²_2 ⊗ H³_3 has the principal basis representation:

\[
\rho = \frac{1}{12}\Big(I_2\otimes I_2\otimes I_3 + \sum_{(i,j)\neq(0,0)}u_{ij}A_{ij}\otimes I_2\otimes I_3 + \sum_{(k,l)\neq(0,0)}v_{kl}\,I_2\otimes A_{kl}\otimes I_3 + \sum_{(s,t)\neq(0,0)}w_{st}\,I_2\otimes I_2\otimes B_{st} + \sum_{(i,j),(k,l)\neq(0,0)}x_{ij,kl}\,A_{ij}\otimes A_{kl}\otimes I_3 + \sum_{(i,j),(s,t)\neq(0,0)}y_{ij,st}\,A_{ij}\otimes I_2\otimes B_{st} + \sum_{(k,l),(s,t)\neq(0,0)}z_{kl,st}\,I_2\otimes A_{kl}\otimes B_{st} + \sum_{(i,j),(k,l),(s,t)\neq(0,0)}r_{ij,kl,st}\,A_{ij}\otimes A_{kl}\otimes B_{st}\Big), \quad (8)
\]

where I_2 (I_3) denotes the two- (three-) dimensional identity matrix, u_{ij} = tr(ρA†_{ij}⊗I_2⊗I_3), v_{kl} = tr(ρI_2⊗A†_{kl}⊗I_3), w_{st} = tr(ρI_2⊗I_2⊗B†_{st}), x_{ij,kl} = tr(ρA†_{ij}⊗A†_{kl}⊗I_3), y_{ij,st} = tr(ρA†_{ij}⊗I_2⊗B†_{st}), z_{kl,st} = tr(ρI_2⊗A†_{kl}⊗B†_{st}) and r_{ij,kl,st} = tr(ρA†_{ij}⊗A†_{kl}⊗B†_{st}).

Denote by T^{1|23}_1, T^{1|23}_2, T^{2|13}_1, T^{2|13}_2, T^{3|12}_1 and T^{3|12}_2 the matrices with entries given by r_{01,kl,st}, r_{11,kl,st}, r_{ij,01,st}, r_{ij,11,st}, r_{ij,kl,10} and r_{ij,kl,20} (i, j, k, l ∈ Z_2, s, t ∈ Z_3), respectively. Let ∥A∥_tr = Σσ_i = tr√(AA†) be the trace norm of a matrix A ∈ R^{m×n}, where the σ_i are the singular values of the matrix A.

First we note that ∥T^{1|23}_1 − T^{1|23}_2∥_tr is invariant under local unitary transformations. Denote UAU† by A^U. Suppose ρ′ = ρ^{(I⊗U_2⊗U_3)} with U_2 ∈ U(2) and U_3 ∈ U(3), and A^{U_2}_{ij} = Σ_{(i′,j′)≠(0,0)} m_{ij,i′j′}A_{i′j′}, B^{U_3}_{ij} = Σ_{(i′,j′)≠(0,0)} n_{ij,i′j′}B_{i′j′} for some coefficients m_{ij,i′j′} and n_{ij,i′j′}. The orthogonality of {A^{U_2}_{ij}} and {B^{U_3}_{ij}} requires that

\[ \mathrm{tr}\big(A^{U_2}_{ij}(A^{U_2}_{kl})^{\dagger}\big) = \mathrm{tr}\big(U_2A_{ij}A^{\dagger}_{kl}U^{\dagger}_2\big) = \mathrm{tr}\big(A_{ij}A^{\dagger}_{kl}\big) = 2\delta_{ik}\delta_{jl}; \]
\[ \mathrm{tr}\big(B^{U_3}_{ij}(B^{U_3}_{kl})^{\dagger}\big) = \mathrm{tr}\big(U_3B_{ij}B^{\dagger}_{kl}U^{\dagger}_3\big) = \mathrm{tr}\big(B_{ij}B^{\dagger}_{kl}\big) = 3\delta_{ik}\delta_{jl}. \]

Hence we have M = (m_{ij,i′j′}) ∈ SU(3) and N = (n_{ij,i′j′}) ∈ SU(8), since any two orthogonal bases are transformed into each other by a unitary matrix. One sees that

\[ \sum_{(i,j),(k,l),(s,t)\neq(0,0)} r_{ij,kl,st}\,A_{ij}\otimes A^{U_2}_{kl}\otimes B^{U_3}_{st} = \sum_{(i,j),(k,l),(s,t)\neq(0,0)}\Big(\sum_{(k',l'),(s',t')\neq(0,0)} m_{k'l',kl}\,r_{ij,k'l',s't'}\,n_{s't',st}\Big)\,A_{ij}\otimes A_{kl}\otimes B_{st}. \]
We have T^{1|23}_1(ρ′) = M^T T^{1|23}_1(ρ) N and T^{1|23}_2(ρ′) = M^T T^{1|23}_2(ρ) N. Therefore,

\[ \|T^{1|23}_1(\rho') - T^{1|23}_2(\rho')\|_{tr} = \|T^{1|23}_1(\rho) - T^{1|23}_2(\rho)\|_{tr}, \quad (9) \]

since the singular values of a matrix T are the same as those of M^T T N when M and N are unitary matrices.

Theorem 4. If a mixed state ρ is fully separable, then ∥T^{1|23}_1 − T^{1|23}_2∥_tr ≤ √3.

Proof. If ρ = |φ⟩⟨φ| is fully separable, we have |φ_{1|2|3}⟩ = |φ_1⟩ ⊗ |φ_{23}⟩ ∈ H²_1 ⊗ H⁶_{23}, where |φ_{23}⟩ = |φ_2⟩ ⊗ |φ_3⟩ ∈ H²_2 ⊗ H³_3. By the Schmidt decomposition, |φ_{1|2|3}⟩ = t_0|0α⟩ + t_1|1β⟩, where t_0² + t_1² = 1. Taking into account the local unitary equivalence in H²_2 ⊗ H³_3 and using (9), we only need to consider {|α⟩, |β⟩} = {|00⟩, |01⟩}. Then |φ_{1|2|3}⟩ = t_0|000⟩ + t_1|101⟩, and T^{1|23}_1 and T^{1|23}_2 are given by

\[ T^{1|23}_1 = \begin{pmatrix} 0 & t_0t_1 & 0 & 0 & t_0t_1 & 0 & 0 & 0\\ 0 & 0 & t_0t_1 & 0 & 0 & t_0t_1\omega^2 & 0 & 0\\ 0 & 0 & 0 & t_0t_1 & 0 & 0 & t_0t_1\omega & 0 \end{pmatrix}^{T}, \quad (10) \]

\[ T^{1|23}_2 = \begin{pmatrix} 0 & t_0t_1 & 0 & 0 & -t_0t_1 & 0 & 0 & 0\\ 0 & 0 & t_0t_1 & 0 & 0 & -t_0t_1\omega^2 & 0 & 0\\ 0 & 0 & 0 & t_0t_1 & 0 & 0 & -t_0t_1\omega & 0 \end{pmatrix}^{T}, \quad (11) \]

with ω³ = 1. Therefore we have

\[ \|T^{1|23}_1 - T^{1|23}_2\|_{tr} = \mathrm{tr}\sqrt{\left(T^{1|23}_1 - T^{1|23}_2\right)\left(T^{1|23}_1 - T^{1|23}_2\right)^{\dagger}} = \sqrt{12\,t_0^{2}t_1^{2}} \le \sqrt{3}. \]

For a fully separable mixed state ρ = Σp_i|φ_i⟩⟨φ_i|, we get

\[ \|T^{1|23}_1(\rho) - T^{1|23}_2(\rho)\|_{tr} \le \sum p_i\,\|T^{1|23}_1(|\varphi_i\rangle\langle\varphi_i|) - T^{1|23}_2(|\varphi_i\rangle\langle\varphi_i|)\|_{tr} \le \sqrt{3}, \]

which proves the theorem. □

Theorem 5. For any mixed state ρ = Σp_i|φ_i⟩⟨φ_i| ∈ H²_1 ⊗ H²_2 ⊗ H³_3, Σp_i = 1, 0 < p_i ≤ 1, we have:
(1) if ρ is 1|23 separable, then ∥T^{1|23}_1 − T^{1|23}_2∥_tr ≤ √6;
(2) if ρ is 2|13 separable, then ∥T^{2|13}_1 − T^{2|13}_2∥_tr ≤ √6;
(3) if ρ is 3|12 separable, then ∥T^{3|12}_1 − T^{3|12}_2∥_tr ≤ √3.

Proof. (1) If ρ = |φ⟩⟨φ| is 1|23 separable, we have |φ_{1|23}⟩ = |φ_1⟩ ⊗ |φ_{23}⟩ ∈ H²_1 ⊗ H⁶_{23}, where H⁶_{23} = H²_2 ⊗ H³_3. By the Schmidt decomposition, |φ_{1|23}⟩ = t_0|0α⟩ + t_1|1β⟩ with t_0² + t_1² = 1. Taking into account the local unitary equivalence in H²_2 ⊗ H³_3 and using (9), we only need to consider the two cases (i) {|α⟩, |β⟩} = {|00⟩, |01⟩} and (ii) {|00⟩, |11⟩}. For the first case we have ∥T^{1|23}_1 − T^{1|23}_2∥_tr ≤ √3 by Theorem 4. For the second case we have |φ_{1|23}⟩ = t_0|000⟩ + t_1|111⟩, where T^{1|23}_1 and T^{1|23}_2 are given by

\[ T^{1|23}_1 = \begin{pmatrix} t_0t_1 & 0 & t_0t_1 & t_0t_1 & 0 & -t_0t_1 & 0 & 0\\ 0 & t_0t_1 & 0 & t_0t_1 & t_0t_1\omega^2 & 0 & -t_0t_1\omega^2 & 0\\ 0 & 0 & t_0t_1 & 0 & t_0t_1 & t_0t_1\omega & 0 & -t_0t_1\omega \end{pmatrix}^{T}, \quad (12) \]

\[ T^{1|23}_2 = \begin{pmatrix} t_0t_1 & 0 & t_0t_1 & -t_0t_1 & 0 & t_0t_1 & 0 & 0\\ 0 & t_0t_1 & 0 & t_0t_1 & -t_0t_1\omega^2 & 0 & t_0t_1\omega^2 & 0\\ 0 & 0 & t_0t_1 & 0 & t_0t_1 & -t_0t_1\omega & 0 & t_0t_1\omega \end{pmatrix}^{T}. \quad (13) \]

Then we have

\[ \|T^{1|23}_1 - T^{1|23}_2\|_{tr} = \mathrm{tr}\sqrt{\left(T^{1|23}_1 - T^{1|23}_2\right)\left(T^{1|23}_1 - T^{1|23}_2\right)^{\dagger}} = \sqrt{24\,t_0^{2}t_1^{2}} \le \sqrt{6}. \]

For a mixed state ρ = Σp_i|φ_i⟩⟨φ_i| we obtain

\[ \|T^{1|23}_1(\rho) - T^{1|23}_2(\rho)\|_{tr} \le \sum p_i\,\|T^{1|23}_1(|\varphi_i\rangle\langle\varphi_i|) - T^{1|23}_2(|\varphi_i\rangle\langle\varphi_i|)\|_{tr} \le \sqrt{6}. \]

(2) If ρ = |φ⟩⟨φ| is 2|13 separable, we have |φ_{2|13}⟩ = |φ_2⟩ ⊗ |φ_{13}⟩ ∈ H²_2 ⊗ H⁶_{13}, where H⁶_{13} = H²_1 ⊗ H³_3. By the Schmidt decomposition, |φ_{2|13}⟩ = t_0|0α⟩ + t_1|1β⟩ with t_0² + t_1² = 1. Taking into account the local unitary equivalence in H²_1 ⊗ H³_3, we obtain an equation similar to (9). Thus we only need to consider again the two cases (i) {|α⟩, |β⟩} = {|00⟩, |01⟩} and (ii) {|00⟩, |11⟩}. In the first case, |φ_{2|13}⟩ = t_0|000⟩ + t_1|101⟩, and T^{2|13}_1 and T^{2|13}_2 are zero matrices.
In the second case, |φ_{2|13}⟩ = t_0|000⟩ + t_1|111⟩, with T^{2|13}_1 and T^{2|13}_2 given by

\[ T^{2|13}_1 = \begin{pmatrix} t_0t_1 & 0 & t_0t_1 & t_0t_1 & 0 & -t_0t_1 & 0 & 0\\ 0 & t_0t_1 & 0 & t_0t_1 & t_0t_1\omega^2 & 0 & -t_0t_1\omega^2 & 0\\ 0 & 0 & t_0t_1 & 0 & t_0t_1 & t_0t_1\omega & 0 & -t_0t_1\omega \end{pmatrix}^{T}, \quad (14) \]

\[ T^{2|13}_2 = \begin{pmatrix} t_0t_1 & 0 & t_0t_1 & -t_0t_1 & 0 & t_0t_1 & 0 & 0\\ 0 & t_0t_1 & 0 & t_0t_1 & -t_0t_1\omega^2 & 0 & t_0t_1\omega^2 & 0\\ 0 & 0 & t_0t_1 & 0 & t_0t_1 & -t_0t_1\omega & 0 & t_0t_1\omega \end{pmatrix}^{T}. \quad (15) \]

Then we have

\[ \|T^{2|13}_1 - T^{2|13}_2\|_{tr} = \mathrm{tr}\sqrt{\left(T^{2|13}_1 - T^{2|13}_2\right)\left(T^{2|13}_1 - T^{2|13}_2\right)^{\dagger}} = \sqrt{24\,t_0^{2}t_1^{2}} \le \sqrt{6}. \]

For the mixed state ρ = Σp_i|φ_i⟩⟨φ_i|, we have ∥T^{2|13}_1(ρ) − T^{2|13}_2(ρ)∥_tr ≤ Σp_i∥T^{2|13}_1(|φ_i⟩⟨φ_i|) − T^{2|13}_2(|φ_i⟩⟨φ_i|)∥_tr ≤ √6.

(3) If ρ = |φ⟩⟨φ| is 3|12 separable, we have |φ_{3|12}⟩ = |φ_3⟩ ⊗ |φ_{12}⟩ ∈ H³_3 ⊗ H⁴_{12}, where H⁴_{12} = H²_1 ⊗ H²_2. By the Schmidt decomposition, |φ_{3|12}⟩ = t_0|0α_0⟩ + t_1|1α_1⟩ + t_2|2α_2⟩, where t_0² + t_1² + t_2² = 1. Taking into account the local unitary equivalence in H²_1 ⊗ H²_2, we obtain an equation similar to (9). We only need to consider the case |φ_{3|12}⟩ = t_0|000⟩ + t_1|101⟩ + t_2|210⟩. We have

\[ T^{3|12}_1 = \begin{pmatrix} 0 & 0 & 0\\ 0 & t_0^{2}-\omega t_1^{2}+\omega^{2}t_2^{2} & 0\\ 0 & 0 & 0 \end{pmatrix}, \qquad T^{3|12}_2 = \begin{pmatrix} 0 & 0 & 0\\ 0 & t_0^{2}-\omega^{2}t_1^{2}+\omega t_2^{2} & 0\\ 0 & 0 & 0 \end{pmatrix}. \quad (16) \]

Using 1 + ω + ω² = 0, we have

\[ \|T^{3|12}_1 - T^{3|12}_2\|_{tr} = \mathrm{tr}\sqrt{\left(T^{3|12}_1 - T^{3|12}_2\right)\left(T^{3|12}_1 - T^{3|12}_2\right)^{\dagger}} = \sqrt{3\left(t_1^{2}+t_2^{2}\right)^{2}} \le \sqrt{3}. \]

For the mixed state ρ = Σp_i|φ_i⟩⟨φ_i|, we get ∥T^{3|12}_1(ρ) − T^{3|12}_2(ρ)∥_tr ≤ Σp_i∥T^{3|12}_1(|φ_i⟩⟨φ_i|) − T^{3|12}_2(|φ_i⟩⟨φ_i|)∥_tr ≤ √3. □

As an example, let us consider the 2 ⊗ 2 ⊗ 3 state ρ = x|GHZ′⟩⟨GHZ′| + (1−x)I/12, 0 ≤ x ≤ 1, where |GHZ′⟩ = ½(|000⟩ + |101⟩ + |011⟩ + |112⟩). By Theorem 4, when ∥T^{1|23}_1 − T^{1|23}_2∥ = (2√(3/2)+1)x > √3, i.e. 0.5021 < x ≤ 1, ρ is not fully separable. By Theorem 5, when ∥T^{1|23}_1 − T^{1|23}_2∥ = ∥T^{2|13}_1 − T^{2|13}_2∥ = (2√(3/2)+1)x > √6, i.e. 0.7101 < x ≤ 1, ρ is not separable under the bipartition 1|23 or 2|13. When ∥T^{3|12}_1 − T^{3|12}_2∥ = (7√3/4)x > √3, i.e. 0.5714 < x ≤ 1, ρ is not separable under the bipartition 3|12.
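The three thresholds of this example follow from simple arithmetic on the norm expressions quoted above; a minimal sketch (interpreting the printed slope "(2√3 2 + 1)x" as (2√(3/2)+1)x, which reproduces the stated 0.5021 and 0.7101):

```python
from math import sqrt

# Recover the example's thresholds from the quoted norm slopes, compared
# against the bounds of Theorems 4 and 5.
slope_1_23 = 2 * sqrt(3 / 2) + 1   # ||T1 - T2|| / x for the 1|23 and 2|13 cuts
slope_3_12 = 7 * sqrt(3) / 4       # ||T1 - T2|| / x for the 3|12 cut

print(f"not fully separable for x > {sqrt(3) / slope_1_23:.4f}")          # 0.5021
print(f"not 1|23 or 2|13 separable for x > {sqrt(6) / slope_1_23:.4f}")   # 0.7101
print(f"not 3|12 separable for x > {sqrt(3) / slope_3_12:.4f}")           # 0.5714
```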
4. Conclusions

We have presented quantum upper bounds for triqutrit mixed states by using the generalized Bell functions and the generalized three-dimensional Pauli operators, from which triqutrit entanglement can be identified. Our inequalities distinguish fully separable states and three types of bi-separable states for triqutrit states. Moreover, all triqutrit states are confined in a cube of size 5/4 × 5/4 × 5/4, and the bi-separable states in a cube of size 3/4 × 3/4 × 3/4. We have also studied the classification of quantum entanglement for 2 ⊗ 2 ⊗ 3 systems by using the correlation tensors in the principal basis representation of density matrices. By considering upper bounds on certain trace norms, we have obtained criteria which detect fully separable and bi-separable 2 ⊗ 2 ⊗ 3 quantum mixed states. A detailed example has been given to show the classification of tripartite entanglement using our criteria.

Acknowledgements

This work is supported by the National Natural Science Foundation of China under grant nos. 11101017, 11531004, 11726016, 12075159 and 12171044, the Simons Foundation under grant no. 523868, the Beijing Natural Science Foundation (Z190005), the Academy for Multidisciplinary Studies, Capital Normal University, the Shenzhen Institute for Quantum Science and Engineering, Southern University of Science and Technology (SIQSE202001), and the Academician Innovation Platform of Hainan Province.

References
[1] A. Einstein, B. Podolsky, N. Rosen. Can quantum-mechanical description of physical reality be considered complete? Physical Review 47(10):777–780, 1935. https://doi.org/10.1103/physrev.47.777.
[2] J. S. Bell. On the Einstein Podolsky Rosen paradox. Physics 1(3):195–200, 1964. https://doi.org/10.1103/physicsphysiquefizika.1.195.
[3] J. F. Clauser, M. A. Horne, A. Shimony, R. A. Holt. Proposed experiment to test local hidden-variable theories. Physical Review Letters 23(15):880–884, 1969. https://doi.org/10.1103/physrevlett.23.880.
[4] P. Y. Chang, S. K. Chu, C. T. Ma. Bell's inequality and entanglement in qubits. Journal of High Energy Physics 2017(9):100, 2017. https://doi.org/10.1007/jhep09(2017)100.
[5] M. Li, S. M. Fei. Bell inequalities for multipartite qubit quantum systems and their maximal violation. Physical Review A 86(5):052119, 2012. https://doi.org/10.1103/physreva.86.052119.
[6] D. Collins, N. Gisin, S. Popescu, et al. Bell-type inequalities to detect true n-body nonseparability. Physical Review Letters 88(17):170405, 2002. https://doi.org/10.1103/physrevlett.88.170405.
[7] S. W. Ji, J. Lee, J. Lim, et al. Multi-setting Bell inequality for qudits. Physical Review A 78(5):052103, 2008. https://doi.org/10.1103/physreva.78.052103.
[8] H. Zhao. Entanglement of Bell diagonal mixed states. Physics Letters A 373(43):3924–3930, 2009. https://doi.org/10.1016/j.physleta.2009.08.048.
[9] D. Ding, Y. Q. He, F. L. Yan, T. Gao. Entanglement measure and quantum violation of Bell-type inequality. International Journal of Theoretical Physics 55(10):4231–4237, 2016. https://doi.org/10.1007/s10773-016-3048-1.
[10] X. F. Huang, N. H. Jing, T. G. Zhang. An upper bound of fully entangled fraction of mixed states. Communications in Theoretical Physics 65(6):701–704, 2016. https://doi.org/10.1088/0253-6102/65/6/701.
[11] J. I. de Vicente, M. Huber. Multipartite entanglement detection from correlation tensors. Physical Review A 84(6):242–245, 2011. https://doi.org/10.1103/physreva.84.062306.
[12] M. Li, J. Wang, S. M. Fei, X. Li-Jost. Quantum separability criteria for arbitrary dimensional multipartite states. Physical Review A 89(2):767–771, 2014. https://doi.org/10.1103/physreva.89.022325.
[13] W. Son, J. Lee, M. S. Kim. Generic Bell inequalities for multipartite arbitrary dimensional systems. Physical Review Letters 96(6):060406, 2006. https://doi.org/10.1103/physrevlett.96.060406.
[14] D. Gottesman. Fault-tolerant quantum computation with higher-dimensional systems. Chaos, Solitons & Fractals 10(10):1749–1758, 1999. https://doi.org/10.1016/s0960-0779(98)00218-5.
[15] H. A. Carteret, A. Higuchi, A. Sudbery. Multipartite generalisation of the Schmidt decomposition. Journal of Mathematical Physics 41(12):7932–7939, 2000. https://doi.org/10.1063/1.1319516.
modelling a new product model on the basis of an existing step application protocol
b.-r. hoehn, k. steingroever, m. jaros

abstract: during the last years a great range of computer-aided tools has been generated to support the development process of various products. the goal of a continuous data flow, needed for high efficiency, requires powerful standards for data exchange. at the fzg (gear research centre) of the technical university of munich there was a need for a common gear data format for data exchange between gear calculation programs. the step standard iso 10303 was developed for this type of purpose, but a suitable definition of gear data was still missing, even in the application protocol ap 214, developed for the design process in the automotive industry. the creation of a new step application protocol, or the extension of an existing protocol, would be a very time-consuming normative process. so a new method was introduced by fzg: some very general definitions of an application protocol (here ap 214) were used to determine rules for an exact specification of the required kind of data. in this way a product model for gear units was defined based on elements of ap 214, so no change of the application protocol is necessary. meanwhile, the product model for gear units has been published as a vdma paper and successfully introduced for data exchange within the german gear industry associated with fva (german research organisation for gears and transmissions). this method can also be adopted for other applications not yet sufficiently defined by step.
keywords: data exchange, product model, gear data, step.

1 introduction
the use of computers has become more and more common for the development of all kinds of products. cad systems have been introduced successfully for the design process, and there are many calculation and simulation tools to help the designer. however, keeping data digitized is not limited to the design process: modern plm (product lifecycle management) systems are able to cover all data occurring during the whole lifecycle of a product. powerful interfaces are therefore required to handle the numerous different native data formats of the different applications; the goal is an unrestricted data flow from the start to the end of any product.

for the development process of gear units, the gear calculation programs of fva (forschungsvereinigung antriebstechnik e.v., german research association for gears and transmissions) [1] are very widely used tools. these programs were developed by several gear research institutes at the german universities. they range from the calculation of the geometry, load capacity and efficiency of the different kinds of gears to programs simulating shafts, bearings and housings; nearly all aspects of a gear unit can be calculated and optimized. although fva has standards for the design of a new program, no data exchange among these programs was provided: even basic data like the number of teeth or the face width of a gear had to be transferred manually by the user, and no common format for gear data existed for use by all fva programs. if some gear unit data is to be changed, it has to be changed in the input files of all participating programs, an unwanted process with a high risk of errors.

these were the initial conditions that, some years ago, led to a project at fzg (forschungsstelle fuer zahnraeder und getriebebau, gear research centre of the technical university of munich) to improve the data exchange between the gear calculation programs of fva. the main goal was to realize a product model for gear units for use as a central database; for the data exchange itself, a converter program was to be developed. with such a standardized format, only one interface per program is needed. the powerful step standard iso 10303 [2] was to be the basis for the product model for gear units. a second goal was to introduce data exchange between the gear calculation programs and cad systems; step is the only existing standard that can cover both kinds of data. sufficient definitions for gear data do not exist within the step standard, and the typical route for defining data in step, a so-called application protocol, would be too time-consuming a normative process. so the challenge was to create a new way to define gear data in step.
2 contents of the product model for gear units
the first and main purpose of the product model for gear units is to realize the data exchange between the gear calculating programs of the german research association for gears and transmissions (fva). these programs are commonly used for the design process of gears and gear units within the german gear industry. therefore, all data of machine elements occurring during the development process of a gear unit are covered by the product model. the following elements are included in the product model for gear units:
• gears: spur and helical gears, bevel gears, hypoid gears, crossed helical gears and worm gears
• shafts
• bearings: plain bearings and rolling element bearings
• sealings
• housings and basement
• clutches
• brakes
• couplings
further elements not directly related to a gear unit are also included in the product model:
• materials: materials used for the above listed elements
• lubricants
• tools: the tools are limited to tools used for gears
all of these elements can be calculated by a wide range of fva programs. the product model for gear units is to be used as a common format for central storage of all data belonging to these elements. the kinds of data are manifold: data related to geometry, load capacity, specification, life time, application, safety, efficiency, environment, manufacturing, etc. are contained in the product model. not included are the electrical, electronic, hydraulic and pneumatic aspects of a gear unit.

3 iso 10303 step
in order to develop a product model for gear units, a suitable format is required. a very powerful format that includes not only geometrical data but also consists of rules for modelling any other product data is iso 10303 step [2].
step means "standard for the exchange of product data". it was developed to replace existing standards like iges, dxf, vda-fs and set. the purpose of the older standards is to exchange geometric data between cad systems. step also covers geometric data, but its power consists in the ability to include much more: manufacturing data, tolerances, qualities, material data, working schedules and cost structures. all data occurring during the life cycle of a product can be represented by step. step enables the creation of a modern, object-oriented and system-independent database. a wide range of data is already defined in step, and much more will be added in the near future.

3.1 the structure of step
iso 10303 step is organized into different parts (see fig. 1). parts 1–14 include the description methods; the express description language is specified there. express is not a programming language, but a language specially created to represent data. it has an object-oriented structure with the basic element 'entity': an entity describes a class of objects with the same attributes and properties. as in other object-oriented languages, subtypes and supertypes are provided, as well as the inheritance of attributes from the supertype to the subtype. for the visualization of express, the graphical representation express-g is included in the step standard; express-g represents the structure of the objects and attributes graphically within diagrams.

the basic parts are the integrated generic resources (parts 41–53). they contain the elementary definitions for geometry, topology, materials, tolerances, etc. application-related definitions are described in the integrated application resources (parts 101–110); these are definitions for draughting, kinematics, finite element analysis, etc. the application protocols are defined for special applications. they contain the determination of the boundary conditions, the scope of the application protocol, and the usage and interpretation of the elements of the integrated generic resources and the integrated application resources required for that application. the application protocols are assigned to a specific industrial sector; the application protocol ap 214 [3] for the automotive industry is one of the application protocols with a publication stage, and about 40 application protocols already exist or are under development. other parts of step are the implementation methods (parts 21–29), the conformance testing methods and framework (parts 31–35) and the application interpreted constructs (parts 501–522).

fig. 1: structure of step [2]: description methods (parts 1–14, e.g. the express language), integrated generic resources (parts 41–53: geometry, topology, materials, tolerances, representation, …), integrated application resources (parts 101–110: draughting, kinematics, finite element analysis, …) and application protocols (parts 201–240, e.g. ap 212 electronic design and installation, ap 214 core data for automotive mechanical design process, ap 226 ship mechanical systems)

3.2 gear data inside iso 10303 step
the only gear data in the scope of iso 10303 step is included in the application protocol ap 214. there, the entity gear_pair is defined. this gear_pair is a subtype of a kinematic_pair and owns five attributes (see fig. 2): two radii, one bevel angle, one helical angle, and the gear ratio.

    entity gear_pair
      subtype of (kinematic_pair);
      radius_first_link  : length_measure;
      radius_second_link : length_measure;
      bevel              : plane_angle_measure;
      helical_angle      : plane_angle_measure;
      gear_ratio         : real;
    end_entity;

fig. 2: entity gear_pair
these properties are of course not sufficient for a detailed description of a whole gear unit: they only give information about the mechanical coupling of two objects, and no further data about the gear unit is provided inside the scope of the application protocol ap 214 of step. the idea of simply adding more attributes to the gear_pair is not in correspondence with ap 214; any conformity with the existing standard would be lost. an extension of the application protocol ap 214, which was originally created to describe the design process in the automotive industry, with more gear data would result in a very time-consuming normative process. moreover, gear unit data is not limited to gears and gear pairs: shafts, bearings, sealings and couplings also belong to a complete gear unit. therefore an extension based on the gear_pair alone is not a sufficient solution.

3.3 methods to create a new step product model
there are two ways to generate a new product model in step (see fig. 3). the first possibility would be to create a new application protocol similar to ap 214 for the automotive industry. a complete definition especially for gear data could be created this way; the problem is that the realization of a new application protocol is a very laborious and time-consuming procedure taking years to complete. the second way is to extend an existing application protocol. for the product model for gear units, the application protocol for the automotive industry is the most suitable one, and it already includes the entity gear_pair (see chapter 3.2). however, other objects of a gear unit (e.g. shafts, bearings) are difficult to integrate into this gear_pair, so some new entities (and the surrounding structure) would have to be created. a change in the structure of this application protocol would cause a loss of compatibility with the existing standard, and a difficult normative process would have to be undertaken to integrate the new extensions.

fig. 3: methods for creating a product model:
• new application protocol. advantage: complete definition especially for gear data. disadvantage: very laborious and time-consuming.
• extension of an existing application protocol. advantage: usage of an existing structure. disadvantage: loss of compatibility, difficult integration into the existing standard.

4 integration of the product model for gear units into iso 10303 step
for the demands of the product model for gear units, a new method had to be developed by the gear research centre (fzg) of the technical university of munich. this method uses the structure of the existing ap 214 but is completely independent: a mechanism ensures total compatibility with ap 214 although new objects are included in the model, and the method does not need the complicated normative process of a new application protocol.

4.1 gear data definitions using existing ap 214 definitions
ap 214 contains many general definitions that can be specified more exactly by the user himself. e.g. there is the entity (or object) called an 'item': "an item is either a single object or a unit in a group of objects. it collects the information that is common to all versions of the object." ([3], chapter 4.2.267). this 'item' can be used for any purpose. the entity 'specific_item_classification' is linked to that 'item' and provides some classification attributes: "a specific_item_classification is a classification of an item with respect to specific criteria. the specific criteria are covered in the 'classification_name' attribute." ([3], chapter 4.2.465). there are several predefined values for the 'classification_name' within ap 214, namely 'part', 'prototype', 'assembly', 'collection', 'detail', 'raw material', etc.; if applicable, the predefined values shall be used.
in the case of requiring definitions not included in the proposed values, it is allowed to introduce additional ones. this means that it is possible to define gear data within the scope of the already existing ap 214 by oneself. for the application to gear data, the 'classification_name' of the entity 'specific_item_classification' can be set to names like 'gear', 'helical gear', 'spur gear', 'bevel gear', 'shaft', 'bearing', 'sealing', 'housing', etc. this way the classified entity 'item' is clearly defined as an element of a gear unit (see fig. 4).

fig. 4: self-defined classification_name: an 'item' is classified by a 'specific_item_classification' whose classification_name is either one of the predefined values ('part', 'prototype', 'assembly', …) or a self-defined value for gear objects ('helical gear', 'shaft', 'bearing', …)

all required elements of a gear unit can be covered by the entity 'item' using self-defined names. to determine the relations between two gear elements, e.g. the linking of two gears within a gear pair, additional objects from ap 214 are needed (see fig. 5). the entity 'general_item_definition_relationship' defines the relation between exactly two 'items'. it contains an attribute 'relation_type' that can also be used for self-defined values. this way the relationships between the gear elements can be introduced: the different kinds of gear pairs like spur gear pair, helical gear pair, bevel gear pair and worm gearing, and also relations between two shafts, two bearings, or between shafts and bearings, etc.

fig. 5: self-defined relationships: a 'general_item_definition_relationship' links a relating and a related 'item', with self-defined values for its relation_type ('gear pair', 'helical gear pair', 'shaft bearing', …)

assemblies, such as the complete gear unit or a planetary gear unit inside the gear unit, can also be defined by the 'item' and the 'specific_item_classification'. the predefined values contain an 'assembly', which can also be used for gear units; for a specific assembly of a gear unit, a second 'specific_item_classification' with a self-defined name is to be added. fig. 6 shows an example of an assembly 'planetary gear unit' with the gears and the relationships between the gears, using self-defined values for the ap 214 objects.

fig. 6: planetary gear unit: an item (assembly) 'planetary_gear_unit' containing three items (part) 'cylindrical_gear', linked by two 'general_item_definition_relationship' objects with relation_type 'cylindrical_gear_pair'
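as an illustration of this mapping (a sketch of ours, not part of ap 214 or the vdma paper; the class layout mirrors the figures above and the individual gear names are invented), the planetary gear unit of fig. 6 can be modelled with plain data structures: each 'item' carries its classification names, and each relationship links exactly two items via a relation_type.

    from dataclasses import dataclass, field

    @dataclass
    class Item:
        """simplified stand-in for the ap 214 entity 'item'."""
        id: str
        name: str
        classifications: list = field(default_factory=list)  # classification_name values

    @dataclass
    class Relationship:
        """stand-in for 'general_item_definition_relationship'."""
        relation_type: str
        relating: Item
        related: Item

    # the planetary gear unit of fig. 6: one assembly, three gears, two gear pairs
    sun    = Item("g1", "sun gear",    ["part", "cylindrical_gear"])
    planet = Item("g2", "planet gear", ["part", "cylindrical_gear"])
    ring   = Item("g3", "ring gear",   ["part", "cylindrical_gear"])
    unit   = Item("a1", "planetary gear unit", ["assembly", "planetary_gear_unit"])

    pairs = [
        Relationship("cylindrical_gear_pair", sun, planet),
        Relationship("cylindrical_gear_pair", planet, ring),
    ]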
an 'item' can belong to several relationships and assemblies. for the definition of the data of these objects, step ap 214 provides a structure called a 'property' (see fig. 7). this 'property' consists of several entities that indicate information belonging to an object. for the application to gear data, the 'property' can indicate the value itself (a string or numerical value), the unit of the value, the calculation method used for determining the value, and the kind of the data (geometry, load capacity, life time, safety, etc.). the property is linked to the related object, which can be an 'item' as well as a 'general_item_definition_relationship'. in principle, all data belonging to the gear elements can be covered by the 'property'. the figures show simplified extracts of the ap 214 structure; the complete structure includes some more elements and relations not shown in the diagrams.
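the 'property' structure can be sketched in the same spirit (again purely illustrative; the attribute names follow the description above, and the example values are invented):

    from dataclasses import dataclass

    @dataclass
    class Property:
        """stand-in for the ap 214 'property' structure (see fig. 7 below)."""
        name: str
        value: object   # string or numerical value
        unit: str
        method: str     # calculation method used to determine the value
        kind: str       # geometry, load capacity, life time, safety, ...
        of: str         # id of the described item or relationship

    # e.g. a geometry value attached to the gear item "g1" of the previous sketch
    face_width = Property("face width", 40.0, "mm", "manual input", "geometry", "g1")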
fig. 7: data covered by a 'property': the property indicates the string or numerical value, the unit, the name, the calculation method and the kind of data, and is linked to the described object (an 'item' or a 'general_item_definition_relationship')

problems: why can step ap 214 not be used for gear data in the way shown above? the main problem is that this is only one way to define data by oneself; there are also other elements within ap 214 that are suitable for determining gear data, so even usage of the proposed objects does not mean a standardized solution. different names could be set for the self-defined values: e.g. the classification name indicating a gear could be called 'gear', 'cylindrical gear', 'helical gear' or 'spur gear'. there are no standardized names determined within ap 214, and even foreign-language names can be set. this problem occurs not only for the objects but also for the property names. if not clearly defined, the data cannot be recognized by any program, yet the complex structure of step requires electronic data processing; detailed explanations would have to be added for the interpretation of gear data in each case. the definition of gear data using self-defined values can therefore only work internally within a company; this method is not suitable for general application, because there is no clear standardization.

4.2 standardized specification for the product model for gear units
the main problem in using step ap 214 for gear data is the lack of standardization. therefore the objects and structures needed from ap 214, and the names to be set for the gear data, have to be determined in a standardized way. the product model for gear units developed by fzg contains clear definitions of how to use ap 214 for defining gear data: all required elements of ap 214 are described exactly, and the names outside the scope of ap 214, which have to be set for the gear data, are predefined in the product model for gear units. for spur and helical gears, the common name cylindrical gear was introduced; the definition of a cylindrical gear object gives clear information about the application of the objects of ap 214. translation tables for each object defined in the product model for gear units give complete information about the ap 214 elements and relations that are required for the gear data object. table 1 shows the translation table for the object 'cylindrical gear': there are two links to the ap 214 object 'specific_item_classification', the first using the self-defined value 'cylindrical_gear', the second set with the predefined classification name 'part'. the principle of the table was adopted from the ap 214 mapping tables.
support for expert estimations in transportation projects
o. pastor, p. novotný, j. melechovský

abstract: this paper deals with risk analysis as a part of the financial assessment of transportation projects. two approaches to risk assessment are discussed: a risk can be evaluated either directly, in terms of the probabilistic distribution of the assessment criterion, or indirectly, without constructing the probability distribution, by determining the characteristic features of the project.
keywords: decision-making, economic risk, direct determination of the project's risk, subjective probabilities.

investment decision-making in a market economy is related to the uncertainty that ensues from the rapid and uncertain development of many factors. this uncertainty can also significantly influence the profitability of transportation projects and often leads to unexpected or undesirable results. financial risk can generally be seen as the degree of risk that, if things go wrong, the financial outcome of a business activity will be worse than the anticipated outcome. hence, risk analysis must form an integral part of any financial assessment of a transportation project. it forms part of a feasibility study of regional projects, and this is the basic document when the project is being defined. risk analysis aims at finding the factors that most affect the risk of a given project and the importance of each particular risk. subjective probabilities have to be applied at this preparatory stage of a transportation project. a subjective probability is an expression of the degree of belief that an expert has in an uncertain proposition; it transforms his opinion and his subjective judgement into a quantitative expression. any risk assessment of a transportation project involves determining (quantifying) the risk of the investigated variants of the project and preparing corrective measures that lead to an increased probability of success. risk analysis can be schematically divided into the following stages: determining the functional dependency of a decision-making criterion on the influencing quantities, determining the risk factors and their relevance, determining the probabilities of the risk factors, constructing the probability distribution of the considered decision-making criterion, and then assessing the risk of the project.
we can apply so-called subjective (bayesian) probabilities to the probability assessment of risk situations at different stages of the risk analysis (a risk situation is a potential future situation having an impact on the consequences of the considered decision-making variants, known as "risk variants"). these probabilities are based on the presumption that every expert has a certain degree of faith in the occurrence of some phenomenon; in determining these probabilities, the expert applies his or her knowledge, intuition, experience and various types of information. for example, the probability distribution of the risk factors is most often constructed on the basis of an expert analysis. in the case of discrete risk factors, the expert may either assess the probabilities of the individual values directly, or assess the parameters of some theoretical discrete probability distribution. however, risk factors with a discrete probability distribution occur relatively rarely; risk factors with a continuous probability distribution are more frequent. in the latter case, it is suitable to use a simulation assessment by the monte carlo method to determine the probability distribution of a chosen criterion. even in this scheme of application of the monte carlo method, subjective probabilities have to be used.
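the monte carlo step mentioned above can be sketched as follows (our illustration; the criterion function, the risk-factor distributions and all numerical values are invented for the example): each run draws the risk factors from their, here subjectively assessed, distributions and evaluates the decision-making criterion, and the accumulated runs approximate its probability distribution.

    import numpy as np

    rng = np.random.default_rng(0)

    def criterion(cost, demand, price):
        """decision-making criterion as a function of the influencing
        quantities (a toy net-benefit formula, invented for this sketch)."""
        return demand * price - cost

    # subjective (expert-assessed) distributions of the risk factors
    runs = 100_000
    cost   = rng.triangular(80, 100, 140, runs)   # pessimistic/likely/optimistic view
    demand = rng.normal(1_000, 150, runs)
    price  = rng.uniform(0.09, 0.13, runs)

    outcomes = criterion(cost, demand, price)

    # empirical distribution of the criterion and a simple risk measure
    print("mean outcome:", outcomes.mean())
    print("probability of a loss:", (outcomes < 0).mean())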
therefore, within the development of tools for project risk analysis focused on practical application, the department of logistics and transportation processes at the faculty of transportation sciences, czech technical university in prague, also addresses the problem of constructing subjective probabilities, with the aim of creating and using a knowledge fund related to subjective probabilities. an analysis of both direct and indirect methods for constructing subjective probability distributions is being carried out, for an individual decision-maker and also for a group of decision-makers. in the case of direct methods, experts express their persuasion numerically in the form of probability values, whereas in the case of indirect methods subjective probabilities are obtained on the basis of an expert's behaviour in differentiating among multiple variants. a dialogue-type computer program has been developed for the method of quantification of order; however, tests showed that it was user-unfriendly. certain inadequacies of experts in determining subjective probabilities are examined, and the characteristics of the quality of subjective probabilities, and the factors that affect this quality, are sought; the aim is to exert a favourable influence on the quality of project preparation. it was confirmed that experts tend to assess distributions in the form of symmetrical distributions, and that the basic factors influencing the quality of subjective information are the expert himself (his or her theoretical and practical knowledge and experience) and the methods applied in quantifying the subjective information. any piece of subjective information is always based on the information that is accessible to the expert at a given moment. different experts usually have different information sources, and therefore their probability assessments of a particular phenomenon may differ. expressed as a mathematical model, let p(e, i) be the subjective probability that an event e happens, where i is the set of information on factors influencing the occurrence of the event. if the information bases of different subjects are equalized, the subjective probabilities of the event e should converge to a similar value.

in conclusion, we would like to highlight that if subjective probabilities are used to quantify the risk of a project, we are not speaking about the real risk, but about the risk as it appears to the team of examiners. risk analysis is not a technique that can by itself compensate for inadequacies in the documents and in the judgement of the authors of a project; its distinctive feature is that it attempts to combine exact methods and model tools with the knowledge and experience of researchers.

doc. dr. ing. otto pastor, csc. (e-mail: pastor@fd.cvut.cz), ing. petr novotný (e-mail: petr.novotny@fd.cvut.cz), ing. jan melechovský (e-mail: jan.melechov@seznam.cz), department of logistics and transportation processes, czech technical university in prague, faculty of transportation sciences, horská 3, 128 03 praha 2, czech republic

self tuning filter for a three-level four-leg shunt active power filter with fuzzy logic controller
dahmane djendaoui (university of biskra, electrical engineering department, bp 145 rp, 07000 biskra, algeria; corresponding author: d.djendaoui@mail.univ-djelfa.dz), amar benaissa (university of djelfa, laadi laboratory, pb 3117, 17000 djelfa, algeria), boualaga rabhi (university of biskra, lmse laboratory, bp 145 rp, 07000 biskra, algeria), laid zellouma (university of el-oued, levres laboratory, bp 789, 39000 el-oued, algeria)

abstract. low harmonic distortion and reduced switching losses are the advantages of using a multilevel inverter. for this purpose, the three-level inverter is used in this paper as a three-phase four-leg shunt active power filter (sapf). the sapf is used to eliminate the harmonic currents, to compensate the reactive power, and to balance the load currents under an unbalanced non-linear load. a fuzzy logic controller and self-tuning filters (stf) are used to control the active power filter (apf) and to generate the reference current.
to demonstrate the validity of the proposed control strategy, we compare it with the conventional p−q theory under distorted voltage conditions and an unbalanced non-linear load. the matlab-simulink toolbox is used to implement the fuzzy logic control algorithm. the performance of the sapf controller is found to be very effective and adequate compared with the p−q theory.
keywords: 4-leg shunt active filter, harmonics isolator, distorted voltage conditions, self-tuning filter, fuzzy logic control.

1. introduction
the connection of non-linear loads to the network, such as the power electronics equipment used in industrial activities and by public consumers, causes an increase of current and voltage harmonics and leads to a lower power factor. this has adverse effects in power systems, such as equipment overheating, motor vibration and excessive neutral currents. two-level apfs are widely used to decrease harmonics and compensate the reactive power [1–3], but these active power filters are limited to medium-power domains. for high-power applications, the multilevel power converter structure was introduced as an alternative, owing to the difficulty of connecting only one power semiconductor switch directly; hybrid architectures have also been proposed to achieve high-power filters [4–6]. a neutral-point-clamped inverter allows equal voltage sharing among the series-connected semiconductors of each phase.

the instantaneous reactive power (irp) theory, presented by akagi et al. [7, 8], has been applied successfully to control three-phase apfs; it was later extended by aredes et al. [9] to applications in 3-phase 4-wire systems. the irp theory is mostly applied to calculate the compensating currents assuming that the mains voltage is ideal, but in industrial systems the mains voltage may be distorted, and the four-leg apf control then does not yield good results with the p−q theory. the generation of the reference current therefore relies on self-tuning filters (stf) to improve the performance of the apf under non-ideal voltage conditions: the stf extracts the fundamental component directly from the distorted electrical signals (voltage and current) in the α−β reference frame [10]. the apf control process has been the target of much recent research, often using the traditional pi controller; that controller, however, requires a precise mathematical model of the system, which is difficult to obtain in the presence of an unbalanced non-linear load and parameter variations, so we deal with this by using fuzzy logic [11, 12]. the proposed fuzzy logic control scheme controls the harmonic current and the inverter dc voltage in order to improve the performance of the four-leg three-level shunt apf. a comparison between the proposed scheme and the conventional p−q theory is carried out by computer simulation under an unbalanced non-linear load and distorted voltage conditions; the simulation results demonstrate that the proposed scheme is practical and effective.

2. system configuration
figure 1 shows a three-phase ac source connected in parallel to an active filter through four inductors; the filter is a three-phase voltage inverter using igbt switches. capacitors are used to smooth the dc terminal voltage of the inverter. diode-rectifier non-linear loads produce unbalanced, harmonic and reactive components in the phase currents. an active filter inverter of n levels requires n − 1 dc capacitors; here, with three levels, each capacitor carries a voltage equal to vdc/2, where vdc is the total voltage of the dc source.
figure 1. power system configuration.

each switch pair (t11, t13) represents a commutation cell; the two switches are operated in a complementary manner. the inverter provides three voltage levels according to equation (1):

$$v_{i0} = k_i \frac{v_{dc}}{2} \tag{1}$$

where v_{i0} is the phase-to-fictive-middle-point voltage, k_i is the switching state variable (k_i = 1, 0, −1), v_{dc} is the dc source voltage, and i is the phase index (i = a, b, c). the three voltage values (v_{dc}/2, 0, −v_{dc}/2) are shown in table 1.

table 1. voltage states of one three-level phase leg.

    ki | ti1 ti2 ti3 ti4 | vi0
     1 |  1   1   0   0  | vdc/2
     0 |  0   1   1   0  | 0
    −1 |  0   0   1   1  | −vdc/2

3. reference current calculation
3.1. the p−q theory
the theory of instantaneous power [2, 13], briefly called the p−q theory, depends on the instantaneous values in three-phase power systems with or without a neutral wire; it is valid in steady-state and transient processes and for generic voltage and current waveforms. it works on the principle of an algebraic (clarke) transformation of voltage and current from the a, b, c coordinates to the α−β−0 coordinates, followed by the calculation of the power components p and q:

$$\begin{pmatrix} v_0\\ v_\alpha\\ v_\beta \end{pmatrix} = \sqrt{\frac{2}{3}}\begin{pmatrix} 1/\sqrt{2} & 1/\sqrt{2} & 1/\sqrt{2}\\ 1 & -1/2 & -1/2\\ 0 & \sqrt{3}/2 & -\sqrt{3}/2 \end{pmatrix}\begin{pmatrix} v_{sa}\\ v_{sb}\\ v_{sc} \end{pmatrix} \tag{2}$$

$$\begin{pmatrix} i_0\\ i_\alpha\\ i_\beta \end{pmatrix} = \sqrt{\frac{2}{3}}\begin{pmatrix} 1/\sqrt{2} & 1/\sqrt{2} & 1/\sqrt{2}\\ 1 & -1/2 & -1/2\\ 0 & \sqrt{3}/2 & -\sqrt{3}/2 \end{pmatrix}\begin{pmatrix} i_{a}\\ i_{b}\\ i_{c} \end{pmatrix} \tag{3}$$

$$p_0 = v_0 i_0 \tag{4}$$

$$p = v_\alpha i_\alpha + v_\beta i_\beta \tag{5}$$

$$q = v_\alpha i_\beta - v_\beta i_\alpha \tag{6}$$

where p_0, p and q are the instantaneous zero-sequence power, the instantaneous active power and the instantaneous reactive power, respectively. the power components p and q are related to the α−β voltages and currents and can be written together as

$$\begin{pmatrix} p\\ q \end{pmatrix} = \begin{pmatrix} v_\alpha & v_\beta\\ -v_\beta & v_\alpha \end{pmatrix}\begin{pmatrix} i_\alpha\\ i_\beta \end{pmatrix}. \tag{7}$$

by inverting equation (7), we can calculate the reference compensation currents in the α−β coordinates:

$$\begin{pmatrix} i^*_\alpha\\ i^*_\beta \end{pmatrix} = \frac{1}{v_\alpha^2 + v_\beta^2}\begin{pmatrix} v_\alpha & -v_\beta\\ v_\beta & v_\alpha \end{pmatrix}\begin{pmatrix} \tilde{p} - \hat{p}_0\\ q \end{pmatrix} \tag{8}$$

with p̂₀ the fundamental component and p̃ the alternating component. the representation of the reference compensation currents in the a, b, c coordinates requires the transformation inverse to (3):

$$\begin{pmatrix} i^*_a\\ i^*_b\\ i^*_c \end{pmatrix} = \sqrt{\frac{2}{3}}\begin{pmatrix} 1/\sqrt{2} & 1 & 0\\ 1/\sqrt{2} & -1/2 & \sqrt{3}/2\\ 1/\sqrt{2} & -1/2 & -\sqrt{3}/2 \end{pmatrix}\begin{pmatrix} i^*_0\\ i^*_\alpha\\ i^*_\beta \end{pmatrix} \tag{9}$$

$$i^*_n = -(i^*_a + i^*_b + i^*_c). \tag{10}$$

figure 2 gives the control block scheme of the p−q theory for the apf with four legs.

figure 2. the p−q theory control block diagram for the 4-leg apf.
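the p−q computation is straightforward to express in code. the sketch below (ours, single-sample only) applies the clarke transformation of equations (2)–(3) and evaluates p₀, p and q per equations (4)–(6); a full controller would additionally separate p̂₀ and p̃ by low-pass filtering before applying equation (8).

    import numpy as np

    # clarke transformation matrix of equations (2)-(3)
    C = np.sqrt(2.0 / 3.0) * np.array([
        [1 / np.sqrt(2), 1 / np.sqrt(2), 1 / np.sqrt(2)],
        [1.0,            -0.5,           -0.5],
        [0.0,  np.sqrt(3) / 2, -np.sqrt(3) / 2],
    ])

    def pq(v_abc, i_abc):
        """instantaneous powers from one sample of phase voltages and currents;
        the transformed components are ordered (0, alpha, beta)."""
        v0, valpha, vbeta = C @ v_abc
        i0, ialpha, ibeta = C @ i_abc
        p0 = v0 * i0                          # zero-sequence power, eq. (4)
        p = valpha * ialpha + vbeta * ibeta   # instantaneous real power, eq. (5)
        q = valpha * ibeta - vbeta * ialpha   # instantaneous imaginary power, eq. (6)
        return p0, p, q

    # example sample: balanced 60 hz voltages and an unbalanced current snapshot
    v = 230 * np.sqrt(2) * np.cos(2 * np.pi * 60 * 0.001 - np.array([0, 2, 4]) * np.pi / 3)
    i = np.array([10.0, -4.0, -6.0])
    print(pq(v, i))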
3.2. proposed method
3.2.1. self-tuning filter
to obtain the equivalent of an integration in the synchronous reference frame (srf), hong-seok song gave the following expression [14]:

$$v_{xy}(t) = e^{j\omega t}\int e^{-j\omega t}\, u_{xy}(t)\, dt \tag{11}$$

where u_{xy} and v_{xy} are the instantaneous signals before and after the integration. we can express this equation by a laplace transformation as follows:

$$h(s) = \frac{v_{xy}(s)}{u_{xy}(s)} = \frac{s + j\omega}{s^2 + \omega^2}. \tag{12}$$

to obtain the stf with a cut-off frequency ω_c, a constant k is introduced into the transfer function h(s), which becomes

$$h(s) = \frac{v_{xy}(s)}{u_{xy}(s)} = \frac{(s + k) + j\omega_c}{(s + k)^2 + \omega_c^2}. \tag{13}$$

by substituting the input signals u_{xy}(s) by x_{αβ}(s) and the output signals v_{xy}(s) by x̂_{αβ}(s), the following equations can be written:

$$\hat{x}_\alpha(s) = \frac{k}{s}\left[x_\alpha(s) - \hat{x}_\alpha(s)\right] - \frac{\omega_c}{s}\,\hat{x}_\beta(s) \tag{14}$$

$$\hat{x}_\beta(s) = \frac{k}{s}\left[x_\beta(s) - \hat{x}_\beta(s)\right] + \frac{\omega_c}{s}\,\hat{x}_\alpha(s) \tag{15}$$

the goal of the stf is to extract the fundamental component from distorted electrical signals, such as currents and voltages, without changing the amplitude and without introducing a phase shift. the selectivity of the filter can be increased by changing the value of k. figure 3 shows the stf block, and figure 4 illustrates the bode diagram for different values of k at fc = 60 hz: the phase delay at the desired frequency fc is equal to zero.

figure 3. self-tuning filter tuned to the pulsation ωc.
figure 4. bode diagram (magnitude and phase) of the stf versus frequency for different values of the parameter k (k = 20, 40, 60, 80; fc = 60 hz).
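equations (14)–(15) translate directly into a discrete-time filter: each output is an integrator driven by k times the input error plus a cross-coupling term at ωc. the sketch below is ours; the forward-euler discretization, the sampling rate and the test signal are assumptions.

    import numpy as np

    def stf(x_alpha, x_beta, k=40.0, fc=60.0, fs=20_000.0):
        """self-tuning filter, a forward-euler discretization of eqs. (14)-(15)."""
        wc, dt = 2 * np.pi * fc, 1.0 / fs
        xa_hat = np.zeros_like(x_alpha)
        xb_hat = np.zeros_like(x_beta)
        for n in range(1, len(x_alpha)):
            dxa = k * (x_alpha[n-1] - xa_hat[n-1]) - wc * xb_hat[n-1]  # eq. (14)
            dxb = k * (x_beta[n-1]  - xb_hat[n-1]) + wc * xa_hat[n-1]  # eq. (15)
            xa_hat[n] = xa_hat[n-1] + dt * dxa
            xb_hat[n] = xb_hat[n-1] + dt * dxb
        return xa_hat, xb_hat

    t = np.arange(0.0, 0.1, 1.0 / 20_000.0)
    w = 2 * np.pi * 60
    xa = np.cos(w * t) + 0.2 * np.cos(5 * w * t)  # fundamental plus 5th harmonic
    xb = np.sin(w * t) + 0.2 * np.sin(5 * w * t)
    fa, fb = stf(xa, xb)  # fa, fb converge to the fundamental component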
3.2.2. harmonic isolator
the load currents i_{la}, i_{lb} and i_{lc} are converted to the α−β axes according to the following relationship (fig. 5):

$$\begin{pmatrix} i_0\\ i_\alpha\\ i_\beta \end{pmatrix} = \sqrt{\frac{2}{3}}\begin{pmatrix} 1/\sqrt{2} & 1/\sqrt{2} & 1/\sqrt{2}\\ 1 & -1/2 & -1/2\\ 0 & \sqrt{3}/2 & -\sqrt{3}/2 \end{pmatrix}\begin{pmatrix} i_{la}\\ i_{lb}\\ i_{lc} \end{pmatrix} \tag{16}$$

the currents i_α and i_β can be separated into dc and ac components:

$$i_\alpha = \hat{i}_\alpha + \tilde{i}_\alpha \tag{17}$$

$$i_\beta = \hat{i}_\beta + \tilde{i}_\beta \tag{18}$$

the stf extracts the fundamental components (î_α, î_β) directly from the currents i_α, i_β at the pulsation ω_c; the harmonic components of the load are then computed by subtracting the stf input signals from the corresponding outputs (fig. 5). this yields the ac components (ĩ_α, ĩ_β), which correspond to the harmonic content of the load currents (i_{la}, i_{lb}, i_{lc}) in the stationary reference frame. the source voltages (v_{sa}, v_{sb}, v_{sc}) are converted in the same way:

$$\begin{pmatrix} v_0\\ v_\alpha\\ v_\beta \end{pmatrix} = \sqrt{\frac{2}{3}}\begin{pmatrix} 1/\sqrt{2} & 1/\sqrt{2} & 1/\sqrt{2}\\ 1 & -1/2 & -1/2\\ 0 & \sqrt{3}/2 & -\sqrt{3}/2 \end{pmatrix}\begin{pmatrix} v_{sa}\\ v_{sb}\\ v_{sc} \end{pmatrix} \tag{19}$$

with the same steps by which we obtained the harmonic components of the current, we obtain the harmonic components of the voltage. then we calculate the powers p, q and p_0 as follows:

$$p = \hat{v}_\alpha i_\alpha + \hat{v}_\beta i_\beta \tag{20}$$

$$q = \hat{v}_\alpha i_\beta - \hat{v}_\beta i_\alpha \tag{21}$$

$$p_0 = v_0 i_0 \tag{22}$$

where

$$p = \hat{p} + \tilde{p}, \qquad q = \hat{q} + \tilde{q}, \qquad p_0 = \hat{p}_0 + \tilde{p}_0 \tag{23}$$

with p̂, q̂ and p̂₀ the direct components and p̃, q̃ and p̃₀ the alternating components. with the α−β voltages and currents, the power components p̃ and q̃ can be written as

$$\begin{pmatrix} \tilde{p}\\ \tilde{q} \end{pmatrix} = \begin{pmatrix} v_\alpha & v_\beta\\ -v_\beta & v_\alpha \end{pmatrix}\begin{pmatrix} \tilde{i}_\alpha\\ \tilde{i}_\beta \end{pmatrix}. \tag{24}$$

after inverting expression (24) and inserting the powers to be compensated ((p̃ − p̂₀) and q), we get the reference compensating currents in the α−β coordinates. for regulating the dc bus voltage, the required active power p_c is added to the alternating component of the instantaneous real power p̃ (figure 5). the reference currents i*_{αβ0} in the α−β coordinates are calculated as follows:

$$i^*_\alpha = \frac{\hat{v}_\alpha}{v_\alpha^2 + v_\beta^2}\,(\tilde{p} + p_c - \hat{p}_0) - \frac{\hat{v}_\beta}{v_\alpha^2 + v_\beta^2}\, q \tag{25}$$

$$i^*_\beta = \frac{\hat{v}_\beta}{v_\alpha^2 + v_\beta^2}\,(\tilde{p} + p_c - \hat{p}_0) + \frac{\hat{v}_\alpha}{v_\alpha^2 + v_\beta^2}\, q \tag{26}$$

$$i^*_0 = i_0 \tag{27}$$

then the filter reference currents in the a−b−c coordinates are defined by

$$\begin{pmatrix} i^*_{fa}\\ i^*_{fb}\\ i^*_{fc} \end{pmatrix} = \sqrt{\frac{2}{3}}\begin{pmatrix} 1/\sqrt{2} & 1 & 0\\ 1/\sqrt{2} & -1/2 & \sqrt{3}/2\\ 1/\sqrt{2} & -1/2 & -\sqrt{3}/2 \end{pmatrix}\begin{pmatrix} i^*_0\\ i^*_\alpha\\ i^*_\beta \end{pmatrix} \tag{28}$$

$$i^*_{fn} = -(i^*_{fa} + i^*_{fb} + i^*_{fc}). \tag{29}$$

figure 5 shows the block diagram of the stf with the harmonic isolator.

figure 5. stf block diagram with harmonic isolator.
figure 6. pwm block diagram of the current control.

4. inverter control using pwm
a fuzzy logic controller is applied whose input is the difference between the injected current and the reference current and whose outputs are the inverter reference voltages, which are compared with two triangular carrier signals of the same frequency and phase [15, 16]. table 2 summarizes the switching states for one phase leg of the three-level inverter; figure 6 shows the current control block diagram.

table 2. the switching states.

    output voltage | ti1 ti2 ti3 ti4
    vdc/2          |  1   1   0   0
    0              |  0   1   1   0
    −vdc/2         |  0   0   1   1

5. fuzzy logic control application
figure 7 shows the diagram of the fuzzy logic controller with two inputs and one output. the inputs are the difference between the reference current and the injected current (the error, e = iref − if) and its derivative (de), respectively; the output is the command (cde).

figure 7. fuzzy controller synoptic diagram.

6. active power filter current control
the goal is to replace the conventional controllers with fuzzy logic controllers in order to obtain a sinusoidal source current in phase with the source voltage [17]. the properties of the fuzzy control are as follows:
• three fuzzy sets for each input (positive, zero, negative);
• five fuzzy sets for the output (positive large, positive, zero, negative, negative large);
• use of the 'minimum' operator and the 'centroid' defuzzification method;
• the fuzzy rules are based on the sign of the error e: if the derivative (de) becomes positive, the error increases (iref > if); if it becomes negative, the error decreases (iref < if); if the derivative becomes equal to zero, the error is fixed (iref = if).

table 3. fuzzy control rules (output as a function of the error e, columns, and its derivative δe, rows).

    δe \ e | nl  nm  ns  ze  ps  pm  pl
    nl     | nl  nl  nl  nl  nm  ns  ze
    nm     | nl  nl  nl  nm  ns  ze  ps
    ns     | nl  nl  nm  ns  ze  ps  pm
    ze     | nl  nm  ns  ze  ps  pm  pl
    ps     | nm  ns  ze  ps  pm  pl  pl
    pm     | ns  ze  ps  pm  pl  pl  pl
    pl     | nl  nm  ns  ze  ps  pm  pl

7. dc capacitor voltage stabilization
to balance the two input voltages, improve the performance of the active power filter and avoid a drift of the neutral point (np), the stabilization bridge shown in fig. 8 is inserted. if a capacitor voltage u_cx becomes higher than the reference voltage u_cref (400 v), the transistor t_x is switched on to slow the charging of the capacitor c_x. the transistor is controlled as follows: with ∆x = u_cx − u_cref and x = 1, 2, if ∆x > 0 then t_x = 1, with i_rx = t_x · u_cx / r_x; else t_x = 0.

figure 8. structure of the balancing bridge.
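the stabilization rule of section 7 amounts to a per-capacitor comparator; a minimal sketch (ours, with an assumed bleed-resistor value) is:

    def balance_bridge(u_cx, u_cref=400.0, r_x=10.0):
        """dc-bus stabilization rule of section 7: switch the bleed transistor
        on when the capacitor voltage exceeds the reference; returns (tx, i_rx)."""
        t_x = 1 if (u_cx - u_cref) > 0 else 0   # delta_x = u_cx - u_cref
        i_rx = t_x * u_cx / r_x                 # i_rx = t_x * u_cx / r_x
        return t_x, i_rx

    for u in (395.0, 400.0, 407.5):
        print(u, balance_bridge(u))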
8. dc capacitor voltage control

to reduce the harmonic current deformation and improve the performance of the active filter, we use a fuzzy logic control unit, relying on the error of the voltage and its derivative, to control the inverter voltage of the capacitor. here, we use 49 rules, as shown in table 3.

9. simulation parameters

we use the parameters summarized in table 4 for the simulation.

figure 9. switching pulses of apf arms (t11, t13).

10. results and discussions

the harmonic current compensation of the 4-leg 3-level apf was examined, as well as the reactive power compensation, load current balancing and neutral current elimination, by the proposed method and the p−q method under deformed voltage and non-linear load conditions. two methods are applied to calculate the reference current: the p−q theory and the stf-based method, using the topology shown in figure 1. the sequence of connection of the nonlinear loads is as follows: at 0 s, a single-phase nonlinear load is connected in phase (c) to create an unbalanced nonlinear load; at 0.2 s, a three-phase nonlinear load is connected to increase the load current, and a single-phase nonlinear load is also connected in phase (b) to create an unbalanced nonlinear load. all these connection sequences aim to evaluate the dynamic performance and to show the extent of the system's interaction. the advantages of a multi-level inverter are evident from the switching signal presented in figure 9, where the low-frequency commutation process is shown. note that the supply voltage is not sinusoidal and includes a 5th harmonic component (thd = 11.10 %), as shown in figure 10, and the total harmonic distortion of the source current isa before filtering is equal to 22.30 %, see figure 11. by applying the p−q theory, the harmonic distortion coefficient of isa is reduced to 8.62 % after filtering, as shown in figure 12. in figure 13, we present the value of the harmonic distortion coefficient of isa obtained by the proposed method after filtering; it is a very acceptable value according to the ieee 519-2014 and iec 61000-3 harmonic standard limits. as for the neutral current, we eliminate it in two different ways, the p−q theory and the proposed method, as can be seen in figure 14. the harmonic isolation and the fuzzy logic controller allow a simultaneous compensation of the reactive power and the harmonic current. the current and voltage signals obtained are in phase under the distorted voltage and non-linear and unbalanced load conditions, as illustrated in figure 15. figure 16 shows the harmonic currents obtained using the self-tuning filters, which improved the quality of the reference currents. figure 17 shows the line-to-line output voltage and illustrates the three levels with the pdpwm. figure 18 shows the shape of the voltage between the two ends of the stabilization bridge, which is free from ripples, and the two potentials (uc1; uc2) stabilized around the value of 400 volts. load current balancing and harmonic current suppression results are shown in figure 19 (a, b, c, d) for the p−q theory and the proposed method under distorted voltages and unbalanced non-linear load conditions for the 4-leg 3-level apf. figure 19a shows the distorted voltage (vabc) and figure 19b shows the load currents (ilabc); it is clear that the load current (ilc) of phase (c) is affected by the single nonlinear load connected at that phase.
at 0.2 s, the load currents (ilabc) are affected by the three-phase nonlinear loads connected at that moment. in figure 19c, the p−q theory was applied; however, the source currents are still slightly distorted (thd = 8.62 %) despite the response time being only 0.2 s. as for the proposed method (fig. 19d), we notice a good shape (thd = 1.89 %) of the source currents and an appropriate response at the time of the load change (0.2 s), which gives it a slight edge over the p−q theory.

11. conclusion

this article used a fuzzy logic controller to improve the control and performance of an active power filter under deformed voltage conditions and an unbalanced non-linear load. we adopted a modified version of pdpwm and the p−q theory to generate the switching signals and to improve the creation of the reference current. the simulation results demonstrate that all of the following points have been effectively accomplished, even under conditions of deformed voltage and an unbalanced and non-linear load:
• filtering of the current harmonics;
• reactive power compensation;
• load current balancing;
• elimination of excessive neutral current;
• excellent results under both static and dynamic operating conditions.
the use of a self-tuning filter leads to good improvements, as it completely extracts the harmonic currents under distorted conditions.

figure 10. waveform of supply voltage vsa.
figure 11. waveform of supply current isa without filtering.
figure 12. waveform of supply current isa after filtering using p−q theory.
figure 13. waveform of supply current isa with filter using proposed method.
figure 14. neutral current elimination: (a) load neutral current iln, (b) source neutral current with p−q theory isn, (c) source neutral current with the proposed method isn.
figure 15. power factor correction (vsa and isa in phase).
figure 16. active filter current ifa.
figure 17. apf output voltage vab line-to-line.
figure 18. dc voltage: (top) vdc, (middle) uc1, (bottom) uc2.
figure 19. harmonic currents filtering under unbalanced non-linear load and distortion voltage conditions: (a) distorted voltage vsabc, (b) load current ilabc, (c) source current with p−q theory isabc, (d) source current with the proposed method isabc.
in conclusion, the results showed the important advantages of using the stf and fuzzy logic in controlling the filter, such as improving the waveform and reducing the harmonic distortion, as well as the possibility of using it in high-power applications.

references

[1] a. benaissa, b. rabhi, a. moussi. power quality improvement using fuzzy logic controller for five-level shunt active power filter under distorted voltage conditions. springer journal 8(2):212–220, 2014. https://doi.org/10.1007/s11708-013-0284-4.
[2] a. benaissa, b. rabhi, a. moussi, m. benkhoris. fuzzy logic controller for three phase four-leg five-level shunt active power filter under unbalanced non-linear load and distorted voltage conditions. springer journal 5(3):361–370, 2014. https://doi.org/10.1007/s13198-013-0176-3.
[3] h. akagi. trend in active power line conditioners. ieee transactions on power electronics 9(3):263–268, 1994. https://doi.org/10.1109/63.311258.
[4] h. k. chiang, b. r. lin, k. t. yang, k. w. wu. hybrid active power filter for power quality compensation. international conference on power electronics and drive systems (peds), kuala lumpur, malaysia, 28 nov–1 dec 2005, pp. 949–954. https://doi.org/10.1109/peds.2005.1619824.
[5] x. wanfang, l. an, w. lina. development of hybrid active power filter using intelligent controller. automation of electric power systems 26(10):49–52, 2003.
[6] o. vodyakho, t. kim, s. kwak. three-level inverter based active power filter for the three-phase, four-wire system. ieee power electronics specialists conference (pesc 2008), rhodes, greece, 15–19 june 2008, pp. 1874–1880. https://doi.org/10.1109/pesc.2008.4592217.
[7] g. w. chang, c. m. yeh. optimization-based strategy for shunt active power filter control under non-ideal supply voltages. iee proceedings – electric power applications 152(2):182–190, 2005. https://doi.org/10.1049/ip-epa:20045017.
[8] m. i. m. montero, e. r. cadaval, f. b. gonzalez. comparison of control strategies for shunt active power filters in three-phase four-wire systems. ieee transactions on power electronics 22(1):229–236, 2007. https://doi.org/10.1109/tpel.2006.886616.
[9] t. c. green, j. h. marks. control techniques for active power filters. iee proceedings – electric power applications 152(2):369–381, 2005. https://doi.org/10.1049/ip-epa:20040759.
[10] m. abdusalam, p. poure, s. karimi, s. saadate. new digital reference current generation for shunt active power filter under distorted voltage conditions. electric power systems research 79(5):759–765, 2009. https://doi.org/10.1016/j.epsr.2008.10.009.
[11] a. hamadi, k. al-haddad, s. rahmani, h. kanaan. comparison of fuzzy logic and proportional integral controller of voltage source active filter compensating current harmonics and power factor. ieee international conference on industrial technology (icit), hammamet, tunisia, 8–10 dec 2004, pp. 645–650. https://doi.org/10.1109/icit.2004.1490149.
[12] a. h. bhat, p. agarwal. a fuzzy logic controlled three-phase neutral point clamped bidirectional pfc rectifier. ieee international conference on information and communication technology in electrical sciences (ictes 2007), tamil nadu, india, 20–22 dec 2007, pp. 238–244. https://doi.org/10.1049/ic:20070617.
[13] j. afonso, c. couto, j. martins. active filter with control based on the p-q theory. ieee industrial electronics society newsletter 47(3):5–10, 2000.
[14] s. hong-seok. control scheme for pwm converter and phase angle estimation algorithm under voltage unbalanced and/or sag condition. phd thesis, postech university, republic of korea (south), 2000.
[15] b. p. mcgrath, d. g. holmes. multicarrier pwm strategies for multilevel inverters. ieee transactions on industrial electronics 49(4):858–867, 2002. https://doi.org/10.1109/tie.2002.801073.
[16] a. von jouanne, s. dai, h. zhang. a multilevel inverter approach providing dc-link balancing, ride-through enhancement, and common mode voltage elimination. ieee transactions on industrial electronics 49(4):739–745, 2002. https://doi.org/10.1109/tie.2002.801233.
[17] s. saad, l. zellouma, l. herous. comparison of fuzzy logic and proportional controller of shunt active filter compensating current harmonics and power factor. 2nd international conference on electrical engineering design and technology (iceedt08), hammamet, tunisia, 3–5 nov 2008, pp. 8–10.

ap07_4-5.vp

improved evaluation of planar calibration standards using the tdr preselection method

j. vancl

calibration and correction methods for the vector network analyzer (vna) are based on the fundamental assumption of a constant error model, which is independent of the connected calibration standards and/or devices under test (dut). unfortunately, this assumption is not satisfied well for planar calibration standards fabricated by etching technology on soft substrates. an evaluation of the error model is affected especially by variations in the manufacturing process and also by the reproducibility of an assembly. in this paper, we propose error minimization by selecting the best combination of available calibration standards based on time domain reflection (tdr) measurement, which can also be obtained by the fourier transformation from the measured s-parameters. the proposed method was verified experimentally using short, open, load and thru (solt) standards fabricated on an fr4 laminate substrate, achieving an essential reduction of the measurement error in the frequency range up to 15 ghz.

keywords: calibration standards, equivalent circuits, error correction, scattering parameters, vector network analyzer.

1 introduction

many papers on improving the vna calibration procedure have been published recently. the accuracy of the error model is increased by using a cad-based evaluation of the calibration standards [1] and statistical processing of measured data of an over-determined set of calibration standards [2]–[5]. another improved calibration procedure optimizes parameterized models of calibration standards to minimize nonreciprocity in a known asymmetrical reciprocal two-port device [6]. these techniques increase the accuracy of the measured s-parameters while maintaining the requirements on the hardware quality of the calibration standards at the same level. however, the measurement accuracy is substantially worse for standards fabricated on a soft planar substrate that is much longer than a quarter wavelength. such an effect can be seen clearly in [2], where the 95 % confidence limit of the measured s-parameters is approximately ten times greater for a rogers 4350b soft substrate than for a measurement carried out using htcc material and a wafer-probe connection. the problem of measuring the s-parameters on a planar lossy medium can be overcome by using the tdr preselection method proposed in this paper. an origin of the errors is also explained, assuming the solt calibration and correction method.

2 error sources

a typical s-parameter measurement arrangement is depicted in fig. 1. an intrinsic calibration standard and a device under test (dut) are connected to the vna ports through feeding lines and connectors. the intrinsic calibration standard is assumed here as a lumped circuit.
the connectors are either soldered to the printed circuit board (pcb) or they just touch the strip when a test fixture is used. the reference planes are usually considered at the center of the pcb. the overall error of the vna calibration is composed of several particular errors, which are described in this section. each of the particular errors can be assigned to a certain region of the pcb shown in fig. 1.

fig. 1: important regions of calibration and/or measurement error sources

2.1 variation of the manufacturing process

the limited accuracy of the manufacturing process affects mainly the cross-section of the feeding lines (see fig. 1). the impedance of the feeding lines differs between items in the calibration kit. since the feeding lines are naturally distributed circuits, a small impedance difference of 1–2 ohms leads to an error of hundredths in the measured s-parameters. a symptom of this error is the "fast rippled" reflection coefficient of a unitary reflecting standard that is shifted from the reference plane of the calibration standards. fig. 2 shows an example of a typical tolerance zone for a shifted-short verification standard. it should be emphasized that it is not possible to correct this error with a different (better) characterization of the calibration standards.
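as a rough check of the order of magnitude quoted above (a worked sketch with an assumed 1.5 ω deviation, not a value from the paper), the standard mismatch formula gives:

\[ |\gamma| = \left| \frac{z - z_0}{z + z_0} \right| = \frac{51.5 - 50}{51.5 + 50} \approx 0.015 \]

so an impedance deviation of 1–2 ohms on a nominally 50 ω feeding line indeed produces a reflection-coefficient error of a few hundredths.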
2.2 reproducibility of the assembly

non-reproducibility of the assembly is minimized by averaging over several assemblies. an example of the standard deviation of measured data for coplanar and microstrip lines is shown in fig. 3. the contribution to the overall error can be expected at a level of several thousandths.

2.3 uncertainty of calibration standards

a basic set of calibration standards, such as short, open and match, is usually modeled using a full-wave electromagnetic field simulator, taking into account the real geometrical dimensions and the frequency dispersion of the material parameters. a wrong characterization of the intrinsic calibration standards leads to a spurious "slow ripple" of the measured s-parameters. this error is shown in fig. 4, where the reflection coefficient of the shifted calibration standards short2 and short3 is depicted. however, the uncertainty of the intrinsic calibration standards can be essentially reduced by using an over-determined calibration set, which improves the calibration accuracy by measuring more distinct calibration standards than the required minimum [5].

2.4 noise

the smallest error is due to the noise generated by the vna. averaging applied to the measured data can reduce the noise. however, this source of random error is in most cases negligible and is thus skipped in this paper.

3 tdr preselection

time-domain reflection measurement can provide information about the impedance profile of the transmission line, when the step pulse has a steep enough rise time. therefore, the differences between the impedance profiles of the feeding lines can be obtained. an example of the tdr response of the calibration standards is shown in fig. 5, where three regions can be distinguished. the response has to be the same in the time window of the cable and feeding line in order to keep the error model constant. obviously, it is different in the region of the intrinsic calibration standards. a zoomed region of the feeding line is shown in fig. 6. usually, several instances of each calibration standard are used for vna calibration in order to minimize the influence of the variation in the manufacturing process [2]. an alternative approach is to select the best combination of calibration standards with minimum differences of impedance profiles within the feeding-line time window.

fig. 2: typical tolerance zone of the reflection coefficient of the shifted-short verification standard due to variable impedance of the feeding lines
fig. 3: typical standard deviations of measured data due to imperfect reproducibility of the assembly (fr4 substrate inserted into the test fixture)
fig. 4: reflection coefficient of shifted calibration standards short2 and short3
fig. 5: time-domain reflection response measurement of calibration standards (fr4 substrate inserted into the test-fixture)
fig. 6: a detailed time-domain reflection measurement corresponding to the feeding-line response
fig. 7: a detailed feeding-line response obtained by fourier transformation of the measured s-parameters of the calibration standards

the feeding-line response of the calibration standards can also be obtained by a fourier transformation of the measured s-parameters. since we measure the s-parameters only up to 15 ghz and the feeding lines are only 25 mm long, the resolution of the feeding-line response is too small for an accurate selection of the best combination of calibration standards. however, the differences between the impedance profiles (fig. 7) are distinguishable and correspond to the tdr measurement (fig. 6). a suitable criterion for selecting the best combination \(c_{best}\) is based on the standard deviation. the criterion formula is

\[ c_{best} = \arg\min \sum_{j=1}^{n_t} \frac{1}{n_s} \sum_{k=1}^{n_s} \left( x_{j,k} - \bar{x}_j \right)^2 \tag{1} \]

where

\[ \bar{x}_j = \frac{1}{n_s} \sum_{k=1}^{n_s} x_{j,k} \tag{2} \]

is the average value of the voltage response in time step j. \(n_s\) and \(n_t\) are the number of calibration standards and the number of time steps, respectively. the total number of possible combinations n is \(n_i\) raised to the power \(n_s\), where \(n_i\) is the number of instances of each calibration standard. an advantage of the tdr preselection method is that for a given number \(n_i\) it is possible to achieve the same or a smaller variance of the feeding-line impedance profile in comparison with statistical methods. this is because statistical methods take into account all instances, while the tdr preselection method selects the combination with the lowest possible variance.
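a compact sketch of the selection step follows; the data layout, the time-window indices and the use of a plain inverse fft are illustrative assumptions, not the authors' matlab implementation:

```python
import numpy as np
from itertools import product

def feeding_line_response(s11, window):
    """crude time-domain response of one standard from its reflection data.

    s11: complex reflection coefficient on a uniform frequency grid;
    window: slice covering the feeding-line part of the time axis.
    """
    h = np.fft.irfft(s11)          # impulse response
    step = np.cumsum(h)            # step (tdr-like) response
    return step[window]

def best_combination(responses, n_standards, n_instances):
    """brute-force search of eq. (1): responses[s][i] is the windowed
    response of instance i of standard s; all n_i**n_s combinations
    are scored by the summed per-time-step variance of eqs. (1)-(2)."""
    best, best_cost = None, np.inf
    for combo in product(range(n_instances), repeat=n_standards):
        x = np.stack([responses[s][i] for s, i in enumerate(combo)])
        xbar = x.mean(axis=0)                        # eq. (2)
        cost = ((x - xbar) ** 2).mean(axis=0).sum()  # eq. (1)
        if cost < best_cost:
            best, best_cost = combo, cost
    return best, best_cost
```

for the kit used in the experiment below (a basic set of four standards, each in six instances), the search space is 6^4 = 1296 combinations, so exhaustive enumeration is cheap.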
4 calibration scheme

trl and solt are the most frequently used calibration methods. trl uses an 8-term error model and therefore does not enable crosstalk to be corrected. moreover, the trl method cannot be used in conjunction with a fixed-length test fixture. on the other hand, calibration of the vna using trl is much less demanding in terms of requirements on manufacturing and evaluation of the calibration standards. the solt method enables correction of the crosstalk, but the realization of a broadband load is difficult. this problem can be overcome by using the sliding-load principle, but it is not suitable for calibration standards manufactured on soft substrates because of the poor reproducibility of the reflection coefficient when the load is slid along the transmission line. another solution is to combine the two methods, where the broadband fixed load is characterized using the trl method and a dc measurement of the fixed load resistance [7]. thus, the solt calibration kit containing the broadband fixed load is the optimal solution for the calibration procedure on soft substrates. the calibration procedure proposed in this paper is similar to the procedure used in [6], except for the criterion function for the optimization step. the flowchart of the new procedure is depicted in fig. 8 and is described in the following steps:
1) short, open and load calibration standards are designed using the em simulator, and their equivalent circuits are extracted, see fig. 9a)–c). the thru standard is considered as ideal.
2) full two-port correction coefficients are computed using the measured and modeled s-parameters of the s, o, l and t standards. the 12-term error model is considered.
3) measured data of the shifted short s2 and the shifted open o2 are corrected using the correction coefficients computed in the previous step.
agilent e8364a pna was used for measuring the s parameters, and the calibration procedure was carried out according to the flowchart shown in fig. 8. pcbs of the calibration standards were inserted during measurement into the test fixture attached with sma connectors. each calibration standard was measured five times and the data was averaged in order to suppress the influence of the assembly. the experimental results for the best and worst combination of calibration standards are shown in fig. 11 and © czech technical university publishing house http://ctn.cvut.cz/ap/ 105 acta polytechnica vol. 47 no. 4–5/2007 fig. 8: flowchart of the process to optimize cal kit parameters. fig. 9: equivalent circuits of calibration standards; (a) fixed load, (b) open, (c) short, (d) shifted open, (e) shifted short fig. 10: cpwg calibration standards manufactured on an fr4 substrate (50×25×0.8 mm, �r � 4.7) fig. 11: reflection coefficient of shifted calibration/verification standards – the best combination fig. 12, respectively. it can be clearly seen that the preselection method essentially reduces the “fast ripple” of the measured reflection coefficients. we estimate that the final uncertainty of measurement is �0.02 in magnitude and �2° in angle. 6 conclusion a improved algorithm for evaluating of calibration standards using the new tdr preselection method was carried out and experimentally verified. the experiment showed that the measurement error of s parameters can be reduced even by a factor of two when the worst and best cases are compared. in contrast to statistical methods, only one set of calibration standards is necessary for recalibring of the vna. acknowledgment the research described in this paper was supervised by ing. v. sokol, ph.d., fee ctu in prague and supported by research program msm6840770015 “research of methods and systems for measurement of physical quantities and measured data processing” of ctu in prague, sponsored by the ministry of education and sports of the czech republic, and by grants 102/04/1079 and 13/03014/13117 of the grant agency of the czech republic. references [1] deal, w. r., farkas, d. s.: cad-based method to develop and evaluate calibration standards. ieee microwave magazine, june 2006, p. 70–84. [2] chen, x.: statistical analysis of random errors from calibration standards. ieee mtt-s, long beach, ca, june 2005. [3] wong, k.: uncertainty analysis of the weighted least squares vna. 64th arftg conference, orlando, 2004. [4] stumper, u.: uncertainty of vna s-parameter measurement due to nonideal trl calibration items, ieee transaction on instrumentation and measurement, vol. 54, april 2005, p. 676–679. [5] satler, m. j., ridler, n. m., harris, p. m.: over-determined calibration schemes for rf network analysers employing generalized distance regression, 62th arftg conference, boulder, 2003, p. 127–140. [6] scott, j. b.: investigation of a method to improve vna calibration in planar dispersive media through adding an asymmetrical reciprocal device, ieee mtt , vol. 53, september 2005, p. 3007–3013. [7] padmanabhan, s., dunleavy, l., daniel, j. e., rodrígues, a., kirby, p. l.: broadband space conservative on-wafer network analyzer calibration with more complex load and thru models. ieee mtt, vol. 54, september 2006, p. 3583–3593. [8] vancl, j., sokol, v., hoffmann, k., škvor, z.: improved evaluation of planar calibration standard using tdr preselection method, 68th arftg conference, broomfield, 2006. ing. jan. 
ing. jan vancl
e-mail: vanclj1@fel.cvut.cz
department of electromagnetic field
czech technical university in prague
faculty of electrical engineering
technická 2
166 27 prague, czech republic

fig. 12: reflection coefficient of shifted calibration/verification standards – the worst combination

ap07_2-3.vp

flat coordinates and hidden symmetry for superintegrable benenti systems

a. sergyeyev

in this talk i present the results from my paper exact solvability of superintegrable benenti systems, j. math. phys. 48 (2007), 052114.

keywords: superintegrability, benenti system, schrödinger equation, hamilton-jacobi equation, separation of variables.

in [1] we deal with a class of superintegrable hamiltonian systems on a 2n-dimensional phase space with arbitrary natural n. this class is a natural generalization of the class presented in [2]. using the results of [3] on flat coordinates for the metric tensors associated with the kinetic-energy parts of the hamiltonians under study, we show that for the two subclasses of the class in question there exist additional integrals of motion linear in momenta. in turn, the presence of these additional integrals enables us to solve the hamilton-jacobi and schrödinger equations for the systems in question for arbitrary sufficiently large n. the general theory is illustrated by an explicit example of an n-dimensional hamiltonian of this class, given in [1].

acknowledgment

it is my great pleasure to thank miloslav znojil for the warm hospitality and the stimulating atmosphere of the microconferences. the author would like to thank the ministry of education, youth and sports of the czech republic for supporting his participation at the microconference under grant msm 4781305904.

references

[1] sergyeyev, a.: exact solvability of superintegrable benenti systems. j. math. phys. vol. 48 (2007), 052114; arxiv: nlin.si/0701015.
[2] błaszak, m., sergyeyev, a.: maximal superintegrability of benenti systems. j. phys. a: math. gen. vol. 38 (2005), l1–l5; arxiv: nlin.si/0412018.
[3] błaszak, m., sergyeyev, a.: natural coordinates for a class of benenti systems. phys. lett. a vol. 365 (2007), p. 28–33; arxiv: nlin.si/0604022.

doc. rndr. artur sergyeyev, ph.d.
e-mail: artur.sergyeyev@math.slu.cz
silesian university in opava
mathematical institute
na rybníčku 1
746 01 opava, czech republic

acta polytechnica 62(3):386–393, 2022. https://doi.org/10.14311/ap.2022.62.0386
© 2022 the author(s). licensed under a cc-by 4.0 licence. published by the czech technical university in prague.

analysis of parameters important for indirect drying of biomass fuel

michel sabatini*, jan havlík, tomáš dlouhý

czech technical university in prague, faculty of mechanical engineering, department of energy engineering, technická 4, 166 07 prague 6, czech republic
* corresponding author: michel.sabatini@fs.cvut.cz

abstract. this paper focuses on biomass drying for the design and operation of an indirect dryer used in a biomass power plant. indirect biomass drying is not as well described a process as direct drying, especially when used for the preparation of biomass in energy processes, such as combustion or gasification.
therefore, it is necessary to choose a suitable model describing the drying process and to evaluate its applicability for this purpose. the aim of this paper is to identify the parameters that most significantly affect the indirect drying process of biomass, for a precise targeting of future experiments. for this purpose, the penetration model was chosen. the penetration model describes indirect drying through 21 parameters. to run a series of experiments focused on all parameters would be time consuming. therefore, the easier way is to select the most important parameters through a sensitivity analysis, and then perform experiments focused only on the significant parameters. the parameters evaluated as significant are the temperature of the heated wall, operating pressure in the drying chamber, surface coverage factor, emissivity of the heated wall, emissivity of the bed, diameter of the particle, and particle surface roughness. due to the presumption of perfect mixing of the material being dried, stirrer speed is added to the important parameters. based on these findings, it will be possible to reduce the scope of the experiments necessary to verify the applicability of the penetration model for the description of indirect biomass drying and the design of dryers for practical use.

keywords: contact drying, indirect drying, penetration model, drying rate.

1. introduction

biomass as an energy source is an important component of the energy mix with respect to the departure from coal combustion and the established trend of using renewable energy sources. however, many kinds of biomass, e.g., wood chips, bark, or some agricultural residues, have a high moisture content that affects their energy use. the energy consumption for drying the biomass is significant. therefore, it is important to dry biomass with the lowest possible energy intensity. biomass drying for power generation is commonly done in convective dryers, which are less energy-effective than indirect dryers. therefore, replacing a convective dryer with an indirect dryer should result in energy savings. the difference between indirect (contact, conductive) and convective dryers is in the way the heating medium supplies its heat to the material. in the case of direct dryers, the material being dried comes into direct contact with the flow of the heating medium, which is most often air or flue gas. in the case of indirect dryers, the heating medium does not come into contact with the material being dried and the heat is transferred to the material through a heated wall of the dryer [1]. indirect dryers are more energy-effective because a lower heat loss in the heating medium can be achieved. moreover, any form of heat (e.g. waste heat) can be used for drying, which is particularly noticeable when steam heating is used. in addition, the heat in the water vapour produced during the drying can be recovered, for example, in the previous operation for pre-drying, and thus reduce the overall energy consumption of drying. the average energy consumption of indirect dryers is reported to be in the range of 2800–3600 kj/kg, while the consumption of direct dryers is in the range of 4000–6000 kj/kg [2, 3]. many papers have been written on the subject of indirect drying for very specific materials and dryers. based on these papers, there are two main models used for the description of indirect drying. the first of them is based on simultaneous heat, mass and momentum transfer in porous media, given by whitaker in [4, 5].
this model was used to investigate pharmaceutical materials dried in a laboratory vacuum dryer or in the nutsche filter dryer in [6–11]. the second model is called the penetration model; its heat transfer part was described in [12] and then extended to be applicable to indirect drying in a pure vapour atmosphere in [13]. the following authors improved the penetration model for both the specific properties of the material and the specific drying conditions: for multigranular beds [14], for materials with hygroscopic behaviour [15–17], for granular beds wetted with a binary mixture [18], and for indirect drying in the presence of an inert gas [17, 19]. in the paper [20], the authors took into account the local kinetics of grain dehydration or the diffusion of vapour inside the bed to improve the model. most of the previous articles were focused on materials with a spherical shape. there are a few other studies focused on different materials. paste materials dried in a list-type kneader dryer were examined in [21]. drying of sewage sludge was studied in [22, 23]. crystalline powders were the focus of studies [24, 25], and other types of powders in [26]. a comparison of the penetration model with discrete modelling was done in [27]. recent articles on indirect drying of biomass can be found in [28–30]. none of these studies aimed to describe the drying of biomass fuel in an indirect dryer using the penetration model. the novelty of the research is the modification and subsequent use of this model, because the model is derived and verified for materials with a uniform shape and size and for conditions that are different from biomass, which has very inhomogeneous physical properties. the aim of this paper is to analyse the theoretical description of the biomass indirect drying process and to evaluate its applicability in the design of an indirect dryer for a biomass power plant. for a theoretical description of the indirect drying process, the penetration model can be used. to experimentally verify the model for the material and conditions corresponding to the indirect drying of biomass, many experiments would have to be done to determine the impact of each parameter on the drying process, due to the large number of parameters that can influence the drying. therefore, to reduce the number of necessary experiments, the sensitivity of the drying process to changes in individual parameters was evaluated using the penetration model. the parameters with the greatest influence on the process were determined.

2. materials and methods

2.1. drying kinetics

the drying process consists of three main periods, shown in figure 1. in the initial period, both the temperature of the material and the drying rate rise rapidly. for the second period, a constant drying rate is typical; the temperature of the material rises very slowly and the moisture content decreases linearly. in the third period, the drying rate steadily decreases and the temperature of the material begins to increase quickly. biomass, as a fuel, is usually dried from a high moisture content to the level optimal for combustion, which is around 0.4 kg_w/kg_dry.

figure 1. schematic illustration of moisture content, drying rate, and material temperature.
drying of fuel takes place primarily in the initial and constant rate periods. the initial period is usually short compared to the constant rate period, and it can be neglected. the moisture content is defined by equation (1):

\[ x = \frac{m_w}{m_{dry}} \tag{1} \]

where \(m_w\) [kg] is the weight of water and \(m_{dry}\) [kg] is the weight of dry matter.

2.2. specifics of biomass fuel drying

typical types of waste biomass used for combustion are fresh wood chips and bark. drying of wood chips or bark has some specifics. the material is usually very moist, and the optimal target remaining moisture content after drying is about 0.4 kg_w/kg_dry. therefore, drying mostly takes place in the constant drying rate period. in this period, some of the parameters of the penetration model do not significantly affect its results; their influence increases only in the falling rate period. the identification of these negligible parameters can be done by a sensitivity analysis. excluding them will reduce the scope of the experiments required for the verification of the penetration model and for evaluating its applicability for the description of contact drying of biomass.

2.3. penetration model

the penetration model describes the heat and mass transfer during indirect drying. thus, for specific conditions of the drying process, it is possible to theoretically calculate the heat transfer coefficient between the heated surface of the dryer and the material being dried, and subsequently determine the drying rate. the drying model was proposed by schlünder and mollekopf in [13]. the model is developed for drying mechanically agitated particulate materials in a pure vapour atmosphere. the steady mixing process is substituted by a sequence of steps in which the material is stagnant for a fictitious period of time \(t_R\), and at the end of this period the material is instantaneously and perfectly mixed. in the rest of the chapter, the model is briefly explained. a detailed explanation of the model can be found in the listed citations [12, 13]. the contact heat transfer coefficient can be calculated according to schlünder [12]:

\[ \alpha_{ws} = \varphi\,\alpha_{wp} + \alpha_{rad} \tag{2} \]

where \(\varphi\) is the surface coverage factor (the fraction of the heated surface which is in direct contact with the material), \(\alpha_{wp}\) is the wall-particle heat transfer coefficient, and \(\alpha_{rad}\) is the heat transfer coefficient by radiation.

\[ \alpha_{wp} = \frac{4\lambda_g}{d_{equiv}} \left[ \left( 1 + \frac{2(l + \delta)}{d_{equiv}} \right) \ln\left( 1 + \frac{d_{equiv}}{2(l + \delta)} \right) - 1 \right] \tag{3} \]

where \(d_{equiv}\) [m] is the equivalent diameter of the particle:

\[ d_{equiv} = \sqrt[3]{\frac{6v}{\pi}} \tag{4} \]

where v [m³] is the volume of the particle, \(\lambda_g\) is the thermal conductivity of the gas, and \(\delta\) is the roughness of the particle surface.

\[ \alpha_{rad} = 4\,c_{w,bed}\,t^{3} \tag{5} \]

where t is the mean temperature and \(c_{w,bed}\) is the overall radiation coefficient calculated by:

\[ c_{w,bed} = \frac{\sigma}{\dfrac{1}{\varepsilon_w} + \dfrac{1}{\varepsilon_{bed}} - 1} \tag{6} \]

where \(\sigma\) is the stefan–boltzmann constant, \(\varepsilon_w\) is the emissivity of the heated wall, and \(\varepsilon_{bed}\) is the emissivity of the bed. the modified mean free path of the gas molecules l is defined as follows:

\[ l = 2\,\frac{2 - \gamma}{\gamma} \sqrt{\frac{2\pi \tilde{r}\,t}{\tilde{m}}}\; \frac{\lambda_g}{p \left( 2 c_{p,g} - \dfrac{\tilde{r}}{\tilde{m}} \right)} \tag{7} \]

where \(\gamma\) is the accommodation coefficient, \(\tilde{r}\) is the ideal gas constant, \(\tilde{m}\) is the molecular weight of the gas, p is the operating pressure, and \(c_{p,g}\) is the specific heat of the gas.
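to make the structure of eqs. (2)–(7) concrete, a small sketch follows; the steam properties passed in the example call are rough illustrative assumptions, not values from the paper, and all quantities are in si units:

```python
import numpy as np

SIGMA = 5.670e-8  # stefan-boltzmann constant [w m-2 k-4]
R = 8.314         # ideal gas constant [j mol-1 k-1]

def contact_htc(phi, d_equiv, delta, eps_w, eps_bed, t_mean,
                lambda_g, m_molar, p, cp_g, gamma):
    """wall-bed contact heat transfer coefficient alpha_ws, eqs. (2)-(7)."""
    # modified mean free path of the gas molecules, eq. (7)
    l = (2.0 * (2.0 - gamma) / gamma
         * np.sqrt(2.0 * np.pi * R * t_mean / m_molar)
         * lambda_g / (p * (2.0 * cp_g - R / m_molar)))
    # wall-particle coefficient, eq. (3); a = 2(l + delta)/d_equiv
    a = 2.0 * (l + delta) / d_equiv
    alpha_wp = 4.0 * lambda_g / d_equiv * ((1.0 + a) * np.log(1.0 + 1.0 / a) - 1.0)
    # radiation part, eqs. (5)-(6)
    c_wbed = SIGMA / (1.0 / eps_w + 1.0 / eps_bed - 1.0)
    alpha_rad = 4.0 * c_wbed * t_mean**3
    return phi * alpha_wp + alpha_rad          # eq. (2)

# reference-case-like call; steam properties here are rough assumptions
print(contact_htc(phi=0.3, d_equiv=0.010, delta=500e-6,
                  eps_w=0.8, eps_bed=0.94, t_mean=380.0,
                  lambda_g=0.025, m_molar=0.018, p=101325.0,
                  cp_g=2000.0, gamma=0.8))
```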
the penetration heat transfer coefficient of a dry bed can be expressed as:

\[ \alpha_{bed,dry} = \frac{2}{\sqrt{\pi}} \frac{\sqrt{(\rho \lambda c)_{bed,dry}}}{\sqrt{t_R}} \tag{8} \]

and of a wet bed as:

\[ \alpha_{bed,wet} = \frac{\alpha_{bed,dry}}{\operatorname{erf} \zeta} = \frac{2}{\sqrt{\pi}} \frac{\sqrt{(\rho \lambda c)_{bed,dry}}}{\sqrt{t_R}} \frac{1}{\operatorname{erf} \zeta} \tag{9} \]

where:

\[ t_R = \frac{n_{mix}}{n} \tag{10} \]

where, for a paddle dryer, \(n_{mix}\) is calculated as:

\[ n_{mix} = 9 \cdot fr^{0.05} \tag{11} \]

with the froude number fr obtained from:

\[ fr = \frac{(2\pi n)^{2} \cdot d}{2g} \tag{12} \]

where n is the stirrer speed, d is the diameter of the vessel, and g is the gravitational acceleration. the reduced instantaneous position of the drying front \(\zeta\) can be calculated from:

\[ \zeta = \frac{z_T}{2\sqrt{\kappa_{bed,dry}\, t_R}} \tag{13} \]

where \(\kappa_{bed,dry}\) is the overall thermal diffusivity of the dry bed:

\[ \kappa_{bed,dry} = \frac{\lambda_{bed,dry}}{(\rho c)_{bed,dry}} \tag{14} \]

and \(\zeta\) is determined from the relationship:

\[ \sqrt{\pi} \cdot \zeta \cdot \exp\left(\zeta^{2}\right) \left[ \left( \frac{\alpha_{ws}}{\alpha_{dry}} - 1 \right) \operatorname{erf} \zeta + 1 \right] = \frac{1}{\xi} \left( \frac{\alpha_{ws}}{\alpha_{dry}} - 1 \right) \tag{15} \]

where \(\xi\) is the reduced average moisture content of the bed:

\[ \xi = \frac{x\,\Delta h_{v}}{c_{bed,dry} \cdot (t_w - t_{bed})} \tag{16} \]

the overall heat transfer coefficient of a dry bed can be determined from the relation:

\[ \frac{1}{\alpha_{dry}} = \frac{1}{\alpha_{ws}} + \frac{1}{\alpha_{bed,dry}} \tag{17} \]

and of a wet bed from:

\[ \frac{1}{\alpha} = \frac{1}{\alpha_{ws}} + \frac{1}{\alpha_{bed,wet}} \tag{18} \]

the overall heat transfer coefficient \(\alpha\) is expressed as follows:

\[ \alpha = \frac{\alpha_{ws}}{1 + \left( \dfrac{\alpha_{ws}}{\alpha_{dry}} - 1 \right) \cdot \operatorname{erf} \zeta} \tag{19} \]

the heat flux at the hot surface is:

\[ \dot{q}_{0} = \alpha\,(t_w - t_{bed}) \tag{20} \]

and the heat flux at the drying front:

\[ \dot{q}_{lat} = \alpha\,(t_w - t_{bed}) \cdot \exp\left(-\zeta^{2}\right) \tag{21} \]

the drying rate is obtained from:

\[ \dot{m} = \frac{\dot{q}_{lat}}{\Delta h_{v}} \tag{22} \]

the results of this step are the differences in both the moisture content and the bed temperature (equations (23) and (24)), which serve to determine the new moisture content and the new bed temperature (equations (25) and (26)), which are the inputs for the next step.

\[ \Delta x = \frac{\dot{m}\,t_R\,a}{m_{dry}} \tag{23} \]

where a is the covered surface of the heating wall, depending mainly on the geometry of the dryer and the filling ratio.

\[ \Delta t_{bed} = \frac{\Delta x\,\Delta h_{v}}{c_{bed,dry} + c_{l}\,x} \cdot \left[ \exp\left(\zeta^{2}\right) - 1 \right] \tag{24} \]

\[ x_{i+1} = x_{i} + \Delta x_{i} \tag{25} \]

\[ t_{bed,i+1} = t_{bed,i} + \Delta t_{bed,i} \tag{26} \]

the penetration model allows us to calculate the drying rate and the heat transfer coefficient, which are the main parameters needed for the design of an indirect dryer. furthermore, the penetration model can predict the temperature of the material, which is important for drying temperature-sensitive materials.
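the stepping scheme of eqs. (8)–(26) can be outlined as a short function; the sketch below is not the authors' implementation: the root-finding method for the transcendental eq. (15), the treatment of the contact coefficient as an input, and the sign convention that applies the moisture difference as a loss are all assumptions made here for illustration:

```python
import numpy as np
from scipy.special import erf
from scipy.optimize import brentq

def drying_step(x, t_bed, alpha_ws, t_wall, t_r, a_wall, m_dry,
                rho, lam, c_dry, c_l, dh_v):
    """one penetration-model step, eqs. (8)-(26); returns (x, t_bed, m_dot)."""
    alpha_bed_dry = 2.0 / np.sqrt(np.pi) * np.sqrt(rho * lam * c_dry / t_r)  # eq. (8)
    alpha_dry = 1.0 / (1.0 / alpha_ws + 1.0 / alpha_bed_dry)                 # eq. (17)
    xi = x * dh_v / (c_dry * (t_wall - t_bed))                               # eq. (16)
    r = alpha_ws / alpha_dry - 1.0   # always > 0, since alpha_dry < alpha_ws
    # reduced drying-front position from the transcendental eq. (15)
    f = lambda z: np.sqrt(np.pi) * z * np.exp(z**2) * (r * erf(z) + 1.0) - r / xi
    zeta = brentq(f, 1e-9, 10.0)
    alpha = alpha_ws / (1.0 + r * erf(zeta))                                 # eq. (19)
    q_lat = alpha * (t_wall - t_bed) * np.exp(-zeta**2)                      # eq. (21)
    m_dot = q_lat / dh_v                                                     # eq. (22)
    dx = m_dot * t_r * a_wall / m_dry                                        # eq. (23)
    dt = dx * dh_v / (c_dry + c_l * x) * (np.exp(zeta**2) - 1.0)             # eq. (24)
    # eqs. (25)-(26); dx is applied here as a moisture loss (sign assumption)
    return x - dx, t_bed + dt, m_dot
```

iterating this step, with perfect mixing assumed between the steps, traces the falling moisture content and rising bed temperature over the drying time.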
2.4. sensitivity analysis using the penetration model for indirect drying of biomass

the aim of the sensitivity analysis of the penetration model is to determine the effect of a change in the value of individual input parameters on the overall heat transfer coefficient and the drying rate. a large number of parameters enter into the calculation. for practical use, under specific conditions, these parameters must be determined experimentally, often by very complex methods. therefore, it is desirable to select the parameters whose change has little effect on the results of the model, as their value can be determined in a simplified way or by estimation. high-sensitivity parameters can cause large variations in the results even with a small change, so great care must be taken to determine their exact values. based on the conclusions of the sensitivity analysis, it is possible to propose a precise plan of experiments for the model validation. all parameters used in the penetration model for drying in a pure steam atmosphere were identified. in general, these parameters can be divided into three basic categories that describe the properties of the material, the evaporated substance, and the equipment.

material properties: bed temperature, dry bed density, dry bed thermal conductivity, dry bed specific heat capacity, diameter of the particle, surface roughness of particles, bed emissivity, and moisture content.

properties of the evaporated substance: thermal conductivity, molar mass, operating pressure, specific heat capacity, latent heat, and accommodation coefficient.

equipment properties: heated wall temperature, heated wall emissivity, surface coverage factor, stirrer speed, dryer diameter, and constants c and x.

selection of analysed parameters: some of the above parameters are defined for a specific material or device and usually cannot be changed, or their change depends on a change of other parameters, such as temperature or operating pressure. these parameters include the density of the dry bed, thermal conductivity of the dry bed, specific heat capacity of the dry bed, surface roughness of particles, thermal conductivity of the evaporated substance, molar mass of the evaporated substance, specific heat capacity of the evaporated substance and accommodation coefficient, emissivity of the heated wall, emissivity of the bed, surface coverage factor, diameter and length of the vessel, and moisture content. the moisture content of the material is usually specified, and for this reason it is also considered as an input parameter that cannot be changed. the parameters of the evaporated substance (in this case water) are determined from the steam tables. however, the temperature of the heated wall, the stirrer speed, the operating pressure (if vacuum drying is applied) and the associated temperature of the bed, and in some cases also the diameter of the particle, can be changed within a certain range according to the technological requirements. the constants c and x were determined by the authors of the model; they are empirically determined parameters and, for the purpose of this analysis, they are considered constant. however, the literature review showed that even these values are modified by some authors for more accurate results, and therefore it is possible to adjust them for specific equipment and materials [22]. although it is not possible to change some of the parameters of the model, it is important to identify their influence and thus the importance of their exact determination. these parameters include the density of the dry bed, thermal conductivity of the dry bed, specific heat capacity of the dry bed, surface roughness of particles, emissivities of the heated wall and bed, surface coverage factor, and accommodation coefficient. the sensitivity analysis is performed for the expected reference value of each parameter, and its value is usually changed in the range of ±50 %. in some cases, when an assumed change of a given parameter could make a difference, the range of the analysis is adjusted to cover the probable conditions. a value of 100 % represents the reference case.

2.5. reference case of indirect drying of biomass

the reference values of the parameters used in the sensitivity analysis are given in table 1. a horizontal paddle dryer is used, and the substance being evaporated is water. the chosen conditions represent both the material and the dryer which will be used in future experiments and for validating the model for the purpose of biomass fuel drying. the drying of biomass fuel usually takes place under nearly atmospheric conditions.
parameter                          | symbol  | unit           | value
diameter of the vessel             | d       | mm             | 256
length of the vessel               | l       | mm             | 1000
constant c                         | c       | –              | 9
constant x                         | x       | –              | 0.05
emissivity of the heated wall      | εw      | –              | 0.8
temperature of the heated wall     | tw      | °c             | 130
operating pressure                 | p       | bar            | 1.01325
stirrer speed                      | n       | rpm            | 17
equivalent particle diameter       | dequiv  | mm             | 10
density of the bed                 | ρbed    | kg m−3         | 450
specific heat capacity of the bed  | cbed    | j kg−1 k−1     | 1500
thermal conductivity of the bed    | λbed    | w m−1 k−1      | 0.18
emissivity of the bed              | εbed    | –              | 0.94
moisture content                   | x0      | kg_w kg_dry−1  | 1
accommodation coefficient          | γ       | –              | 0.8
surface coverage factor            | φ       | –              | 0.3
amount of dry matter               | mdry    | kg             | 2.19
surface roughness of particles     | δ       | µm             | 500

table 1. reference case.

for the purpose of the analysis, drying at a lower pressure was also considered, which would increase the drying rate. vacuum conditions allow using low-potential waste heat for drying, which would otherwise be lost. the benefit of using waste heat may overcome the energy required to run the vacuum pump.

3. results and discussion

the results of the sensitivity analysis allow us to divide the input parameters of the model into significant and negligible ones, according to how strongly they affect the drying process. the effect of changing each parameter on the heat transfer coefficient from the heated wall to the material being dried and on the drying rate, which are the two most important parameters for the dryer design, was evaluated. a parameter was classified as significant if the change in the heat transfer coefficient or the drying rate was greater than 3 % over the analysed range.

low impact parameters. according to figure 2 and figure 3, the parameters with little influence on both the heat transfer coefficient and the drying rate are usually material properties, such as the thermal conductivity of the material bed, the heat capacity of the material bed, the density of the material bed, and the accommodation coefficient, i.e., the parameter related to the properties of the evaporated substance. changing these parameters has a negligible influence on the result of drying in the constant drying rate period. the last parameter is the stirrer speed; this parameter still needs to be verified experimentally, because one of the assumptions of the model is the achievement of perfect mixing of the material, and therefore it is possible that the allowed speed range will be limited.

strong impact parameters. the following two figures show the strong influence of the other parameters on the change of the heat transfer coefficient α and the drying rate. these parameters include the temperature, operating pressure, diameter of the particle, emissivity of the heated wall, emissivity of the bed, surface roughness of particles, and surface coverage factor of the dryer. the greater the slope of the curve, the stronger the influence of the parameter. from the graph in figure 4, it can be seen that the value of the heat transfer coefficient is most influenced by the surface coverage factor, temperature, emissivity, diameter of the particle, and surface roughness of particles. all of these parameters have a similar effect on the heat transfer coefficient. the influence of the parameters on the change in the drying rate is slightly different. figure 5 shows the dominant effect of the temperature. a smaller effect can be observed for the coverage and emissivity coefficients and the operating pressure. lowering the pressure has a negative impact on the heat transfer coefficient, but a positive impact on the drying rate.
the effect of roughness and particle diameter on the drying rate is weaker.

figure 2. parameters with low impact on the heat transfer coefficient.
figure 3. parameters with low impact on the drying rate.
figure 4. parameters with strong impact on the heat transfer coefficient.
figure 5. parameters with strong impact on the drying rate.

overall evaluation. the penetration model used to describe the heat transfer coefficient and the drying rate of contact drying works with 21 parameters. ten of these parameters are related to the type of dryer and the evaporated substance; their value is determined by the specific circumstances of the case and cannot be easily changed. the remaining 11 parameters determine the properties of the material being dried and the conditions of the process that may be variable or might be important for the drying. according to the results of the sensitivity analysis, only 7 of these parameters can significantly affect the drying process in the constant drying rate period. these parameters are the temperature of the heated wall, the operating pressure and the diameter of the particle, together with further parameters that are important and must be determined accurately: the emissivity of the heated wall, the emissivity of the bed of the material, the surface coverage factor, and the surface roughness of the particles. the penetration model assumes perfect mixing of the bed of material; therefore, it is desirable to also include the stirrer speed among the important parameters and in the scheme of verification experiments. the influence of the stirrer speed will probably be noticeable only at very low speeds.

4. conclusion

a sensitivity analysis of the penetration model describing the heat transfer process in indirect dryers for the purpose of fuel drying has been done. for the drying process in the constant rate period, the most important parameters were identified. changing them will be reflected in a significant change in the results of the model. in the case of using the penetration model to design a real biomass dryer, these parameters must be precisely determined within specific ranges that correspond to the properties of the fuel and the required drying conditions. these parameters are the temperature of the heated wall, operating pressure, surface coverage factor, diameter of the particle, emissivity of the heated wall, emissivity of the bed, and surface roughness of particles. moreover, it is recommended to include the stirrer speed in the set of important parameters, especially at low stirrer speeds. future research will focus on the experimental verification of the use of the penetration model to describe the drying of biomass in an indirect dryer, in order to validate its applicability for the design of real equipment.
list of symbols

a        covered surface of the heating wall [m²]
c        constant [–]
c        specific heat capacity [j kg⁻¹ k⁻¹]
c_w,bed  overall radiation coefficient [–]
c_p      specific heat capacity at constant pressure [j kg⁻¹ k⁻¹]
d        diameter of the vessel [m]
d_equiv  equivalent particle diameter [m]
fr       froude number [–]
g        gravitational acceleration [m s⁻²]
∆h_v     latent heat of evaporation [j kg⁻¹]
l        length of the vessel [m]
m        weight [kg]
ṁ        drying rate [kg m⁻² s⁻¹]
m̃        molar mass [kg mol⁻¹]
n        stirrer speed [rpm]
n_mix    mixing number [–]
p        operating pressure [pa]
q̇        heat flux [w m⁻²]
r̃        universal gas constant [j mol⁻¹ k⁻¹]
t        temperature [k]
t_R      contact time [s]
v        volume of the particle [m³]
x        moisture content [kg_w kg_dry⁻¹]
x        constant [–]
z_T      position of the drying front [m]

greek symbols

α        heat transfer coefficient [w m⁻² k⁻¹]
γ        accommodation coefficient [–]
δ        surface roughness of particles [m]
ε        emissivity [–]
ζ        dimensionless position of the phase change front [–]
l        modified mean free path of gas molecules [m]
κ        thermal diffusivity [m² s⁻¹]
λ        thermal conductivity [w m⁻¹ k⁻¹]
ξ        reduced average moisture content [–]
ρ        density [kg m⁻³]
σ        stefan–boltzmann constant [w m⁻² k⁻⁴]
φ        surface coverage factor [–]

subscripts

0        initial
bed      bed of the material
dry      dry matter
g        gas
l        liquid
rad      radiation
w        heated wall
w        water
wet      wet matter
wp       wall–particle
ws       contact

acknowledgements

this work was supported by the project from research center for low carbon energy technologies, cz.02.1.01/0.0/0.0/16_019/0000753. we gratefully acknowledge support from this grant.

references

[1] c. w. hall. dictionary of drying. marcel dekker, 1979. isbn 9780824766528.
[2] a. s. mujumdar. handbook of industrial drying. crc press, 4th edn., 2014. https://doi.org/10.1201/b17208.
[3] m. lattman, r. laible. batch drying: the "indirect" solution to sensitive drying problems. chemical engineering (new york) 112(11):34–39, 2005.
[4] s. whitaker. simultaneous heat, mass, and momentum transfer in porous media: a theory of drying. in j. p. hartnett, t. f. irvine (eds.), advances in heat transfer, vol. 13, pp. 119–203. elsevier, 1977. https://doi.org/10.1016/s0065-2717(08)70223-5.
[5] s. whitaker. heat and mass transfer in granular porous media. in a. s. mujumdar (ed.), advances in drying, vol. 1, pp. 23–61. hemisphere publishing, new york, 1980.
[6] m. kohout, a. p. collier, f. stepanek. vacuum contact drying of crystals: multi-scale modelling and experiments. in a. barbosa-póvoa, h. matos (eds.), computer aided chemical engineering, vol. 18, pp. 1075–1080. elsevier, 2004. https://doi.org/10.1016/s1570-7946(04)80245-1.
[7] m. kohout, a. p. collier, f. stepanek. vacuum contact drying kinetics: an experimental parametric study. drying technology 23(9-11):1825–1839, 2005. https://doi.org/10.1080/07373930500209954.
[8] m. kohout, a. collier, f. štěpánek. microstructure and transport properties of wet poly-disperse particle assemblies. powder technology 156(2-3):120–128, 2005. https://doi.org/10.1016/j.powtec.2005.04.007.
[9] m. kohout, a. p. collier, f. štěpánek. mathematical modelling of solvent drying from a static particle bed. chemical engineering science 61(11):3674–3685, 2006. https://doi.org/10.1016/j.ces.2005.12.036.
[10] m. kohout, f. stepanek. multi-scale analysis of vacuum contact drying. drying technology 25(7-8):1265–1273, 2007. https://doi.org/10.1080/07373930701438741.
influence of different speech representations and hmm training strategies on asr performance

h. bořil, p. fousek

this work studies the influence of various speech signal representations and speaking styles on the performance of automatic speech recognition (asr). the efficiencies of two approaches to hidden markov model (hmm) training are compared. common mfcc and plp features were exposed to two sources of disturbance applied to the original wide-band speech: (i) stress (the lombard effect) and (ii) transfer channel distortion (a simulated telephone line). subsequently, the efficiencies of the two training strategies were evaluated. finally, a study of the optimal number of training iterations is introduced.

keywords: plp, mfcc, lombard effect, clsd'05.

this text was a part of the international conference poster 2006, which was held at the faculty of electrical engineering, ctu in prague.

1 introduction

the recognition of clean speech recorded in quiet conditions can be addressed quite successfully with the widely used mel-frequency cepstral coefficients (mfcc) [1] and perceptual linear predictive coefficients (plp) [2]. in this work, the behavior of these features is examined under changes in talking style (neutral speech, speech under the lombard effect, le) and under the speech bandwidth limitation introduced by a telephone filter. le introduces changes in speech production due to the speaker's effort to increase communication intelligibility in noise [3]. the bandwidth limitation of the telephone filter changes the spectral content of the signal, which may lead to the loss of the fourth speech formant and may thus degrade recognizer performance. in addition, two hmm training strategies were compared with respect to convergence speed and best achievable performance.
the strategies differed only in the way in which the initial hmms, containing one gaussian mixture component per hmm state, were enhanced to contain the final 32 mixtures per hmm state. this was done either by directly splitting and cloning the single mixture into 32 mixtures and reestimating them many times (the one-shot approach), or by gradually doubling the number of mixtures and reestimating after each split until there were 32 mixtures (progressive propagation). the recognition experiments presented in this paper were carried out on the czech speecon [4] and clsd'05 [5] databases. czech speecon comprises recordings in public, office, car and entertainment scenarios. for the purpose of hmm training, office data representing neutral speech with a high snr was chosen. for the tests, the clsd'05 database was used. it consists of neutral speech and speech uttered in various types of simulated noisy backgrounds (car2e car noise [6] and artificial band-noises). since the noises were reproduced to the speakers through closed headphones, only clean lombard speech was captured, benefiting from an snr similar to that of the speecon office recordings.

2 feature extraction techniques

two widely used feature extraction methods were examined in this work, mfcc and plp. both methods use mel-frequency warping with the same number of frequency subbands, and the estimates of the spectral envelopes were described by the same number of cepstral coefficients, so that the two methods were comparable. the most important difference between the otherwise rather similar approaches is the way in which the cepstral coefficients are obtained from the spectra: through the dct transform in the case of mfcc, and through linear prediction in the case of plp. the settings were as follows:

- preemphasis with α = 0.97
- 100 hz frame rate, frame length 25 ms
- 26 mel-frequency bands
- lpc of order 12 (for plp)
- energy normalization per utterance
- cepstral liftering
- 39 features per frame (12 cepstral coeffs + frame energy, Δ and ΔΔ coeffs)

the input speech data was either sampled at 16 khz or resampled at 8 khz. as the feature extraction settings were not dependent on the sampling rate, the frequency subbands were effectively wider for 16 khz than for 8 khz.
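a 39-dimensional front-end of this kind can be approximated with off-the-shelf tools. the following is a minimal sketch using librosa for the mfcc path only (plp is not available there); the frame rate, frame length, mel-band count and liftering match the settings above, while the per-utterance energy normalization is omitted and c0 stands in for the frame energy term, which is a simplification. the file name is a hypothetical placeholder.

```python
import numpy as np
import librosa

# hypothetical input file; 16 khz wide-band case
y, sr = librosa.load("utterance.wav", sr=16000)
y = librosa.effects.preemphasis(y, coef=0.97)

# 25 ms frames at a 100 hz frame rate, 26 mel bands, 12 cepstra + c0
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13,
                            n_fft=int(0.025 * sr),
                            hop_length=int(0.010 * sr),
                            n_mels=26, lifter=22)

# static + delta + delta-delta coefficients stacked per frame
feats = np.vstack([mfcc,
                   librosa.feature.delta(mfcc),
                   librosa.feature.delta(mfcc, order=2)])
print(feats.shape)   # (39, number_of_frames)
```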
3 telephone channel simulation

one of the main goals of the study was to investigate the influence of limiting the bandwidth of the input speech, particularly by simulating a standard telephone channel. the original 16 khz, 16-bit pcm speech was processed in two steps. first, the signal was resampled to 8 khz using the sox tool with a polyphase filter [7]. then the telephone channel was emulated by applying a 4th-order iir filter according to the g.712 standard [8], using the fant tool [9]. the superposition of the processing steps described above leads to an effective bandwidth of 300 hz–3400 hz (see fig. 1).

fig. 1: transfer function of the simulated telephone channel
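the same two-step processing can be approximated with scipy: a polyphase resampler followed by a band-limiting filter. since the exact g.712 coefficients are not reproduced in the text, a generic butterworth bandpass stands in for the g.712 filter in this sketch, so the transfer function only roughly matches fig. 1.

```python
import numpy as np
from scipy import signal

def telephone_channel(x, fs_in=16000):
    """resample to 8 khz and band-limit to ~300-3400 hz.

    a generic butterworth bandpass stands in for the g.712 filter,
    whose exact coefficients are not given here.
    """
    y = signal.resample_poly(x, up=1, down=2)      # 16 khz -> 8 khz
    fs = fs_in // 2
    sos = signal.butter(4, [300, 3400], btype="bandpass",
                        fs=fs, output="sos")
    return signal.sosfilt(sos, y), fs

# example: one second of white noise through the simulated channel
x = np.random.randn(16000)
y, fs = telephone_channel(x)
print(len(y), fs)   # 8000 samples at 8000 hz
```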
4 recognition experiments

4.1 experimental setup

all experiments were carried out on the czech speecon and clsd'05 databases. from both databases only the close-talk microphone channel was used. the training set consisted of speecon office recordings, which represent neutral speech in a quiet environment. this set contained general speech pronounced by both genders, and comprised about 15 hours of speech. there were four independent test sets covering examples of gender-dependent neutral or lombard speech: neutral-male (1423 words), neutral-female (4930 words), le-male (6303 words) and le-female (5360 words), containing continuously pronounced digits from "nula" to "devět". though the neutral and lombard utterances differed in the prompt texts, the speakers were the same for both sets. the recognizer was a gender-independent htk-based hmm system with 43 context-independent phoneme models plus 2 silence models, each with 3 emitting states and 32 gaussian mixtures per state. the task was to recognize 10 czech digits in 16 pronunciation variants.

4.2 effect of lombard speech and resampling on mfcc and plp features

the aim of this experiment was to show how the lombard effect and a narrow bandwidth can affect recognition performance. in all cases, the training and testing conditions were the same. first, the baseline performance of mfcc and plp features on neutral wide-band speech was evaluated in terms of word error rate (wer), see rows 1–2, columns 1–2 in table 1. also, similar narrow-band systems were tested (rows 3–4, columns 1–2 of table 1). then all four systems were exposed to lombard speech (columns 3–4 of table 1). the observations are:

- mfcc and plp features display comparable performance in all conditions.
- the lombard effect leads to a severe but consistent degradation: the relative increase in the error rate from neutral to lombard speech is comparable for male and female speakers (about 800 %) and almost independent of bandwidth and features. however, the absolute errors indicate that for female speakers the recognizer is almost useless.
- narrowing the bandwidth to the telephone channel introduces a degradation which is consistent over gender, features and speaking style. on average there is a relative drop of around 30 %. note the special case of plp features and neutral female speech, where the relative drop is only 12 %.

table 1: gender-dependent recognition results with neutral and lombard speech at different bandwidths (word error rate in %)

features   bandwidth    neutral male   neutral female   lombard male   lombard female
mfcc       wide             2.4             4.9             18.8            43.9
plp        wide             2.5             5.0             18.6            44.9
mfcc       telephone        3.0             6.2             24.2            62.2
plp        telephone        3.2             5.6             23.1            62.5

to help in interpreting the above observations, two phenomena should be mentioned. first, a known property of the lombard effect is a significant shift of the first two formant frequencies [10], see fig. 2. this may cause an inability of the gaussian mixtures to match the testing data and thus a failure of the system. this can be mitigated by appropriate front-end processing (equalization of le, robust features), multi-style training (including lombard speech in the training data) or back-end processing (changes of the hmm structure) [3]. second, the formants carry important information for monophone identification [11]. narrowing the bandwidth to the telephone channel causes a loss of the 4th formant, which is close to 4 khz (see table 2). this can contribute to the performance drop in narrow-band systems.

fig. 2: clsd'05 – vowel formant shifts (female vowel formants in the f1–f2 plane, neutral vs. le)

table 2: clsd'05 – average positions of the 4th formants in neutral and lombard speech

vowel   neutral f4,male (hz)   neutral f4,female (hz)   lombard f4,male (hz)   lombard f4,female (hz)
/a/            3834                   3934                    3713                   4012
/e/            3696                   4181                    3728                   4196
/i/            3661                   4170                    3683                   4218
/o/            3916                   3880                    3711                   4042
/u/            3738                   3939                    3661                   4001

4.3 comparing training strategies

all the recognizers mentioned up to now were trained using the progressive propagation method: initial hmms containing one gaussian mixture (gm) per state were reestimated using the baum-welch procedure, and then each gm was split into two gms and reestimated. after 5 such cycles there were 32 gms, which were further trained. this experiment compares this approach with the one-shot strategy, where the initial mixture was cloned 32 times in each hmm to create 32 gms directly; the hmms were then reestimated. the performance of both strategies was tested on a set comprising 8279 digits from the speecon office and clsd'05 neutral sessions, see fig. 3. to complete the picture of the training process, the evolution of insertions and deletions is shown in fig. 4. no word insertion penalty was used.

fig. 3: comparing training strategies – progressive propagation (blue) vs. one-shot (red); the top axis shows the number of mixtures in an epoch
fig. 4: comparing training strategies – convergence of word insertions and deletions
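the mixture-doubling step of progressive propagation can be sketched in a few lines of numpy. each component is cloned, the two copies are nudged apart by a fraction of the standard deviation, and the weight is halved; this is a common heuristic (htk's mixture splitting works along similar lines), not necessarily the exact procedure used here, and the 0.2-sigma offset is an assumed value.

```python
import numpy as np

def split_mixtures(means, variances, weights):
    """double the number of diagonal-covariance gaussians in one hmm state."""
    offset = 0.2 * np.sqrt(variances)            # nudge clones apart
    new_means = np.concatenate([means - offset, means + offset])
    new_vars = np.concatenate([variances, variances])
    new_weights = np.concatenate([weights, weights]) / 2.0
    return new_means, new_vars, new_weights

# start from a single-component state (39-dim features) and double up to 32
means, variances, weights = np.zeros((1, 39)), np.ones((1, 39)), np.ones(1)
while len(weights) < 32:
    means, variances, weights = split_mixtures(means, variances, weights)
    # ... baum-welch reestimation of the enlarged model would run here ...
print(len(weights))   # 32
```

the one-shot strategy would instead call the cloning step once with a 32-way copy and rely entirely on the subsequent reestimations to spread the components.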
4.4 when to stop training

the last experiment attempts to answer the following questions: how many training iterations should be performed in order to get the best models? are the best models for neutral speech also the best for lombard speech? the wide-band mfcc system trained with progressive propagation was used to recognize neutral and lombard speech at each training epoch. fig. 5 shows the performance evolution. hmms tested with neutral speech appear to converge much earlier than with lombard speech. excessive reestimations improve the performance on lombard speech and do not seem to harm neutral speech. this suggests that many iterations do not lead to a loss of the essential generalization properties of hmms.

fig. 5: progressive propagation training – evolution of wer on male/female lombard speech and both-genders neutral speech

5 conclusions

the aim of the paper was to study the effect of narrowing the speech bandwidth and the effect of stressed speech on the performance of an hmm recognizer based on mfcc and plp features. experiments were carried out with the czech speecon and clsd'05 corpora. mfcc and plp features displayed similar behavior in all conditions; no fundamental differences were observed. narrowing the bandwidth to the telephone channel brought a performance deterioration which was consistent over gender, features and speaking style. a possible explanation is the loss of the 4th speech formant. a consequence of the lombard effect was a severe drop in performance, common to both features. though the relative drop was comparable for both genders and bandwidths, in the female case it led to a failure of the recognizer. without appropriate modifications, an hmm recognizer is almost useless when exposed to lombard speech. a comparison of the two training strategies showed their similar behavior, and thus there is no need for further exploration. an experiment with a higher number of hmm training iterations indicated that in order to achieve a better recognition accuracy on stressed speech, more training epochs are needed. fortunately, these iterations do not damage the necessary generalization properties of hmms.

acknowledgments

this work was supported by gačr 102/05/0278 "new trends in research and application of voice technology", gačr 102/03/h085 "biological and speech signals modeling", and research activity msm 6840770014 "research in the area of prospective information and navigation technologies".

references

[1] young, s. et al.: the htk book ver. 2.2. entropic ltd, 1999.
[2] hermansky, h.: perceptual linear predictive (plp) analysis of speech. j. acoust. soc. am., vol. 87, no. 4, april 1990, p. 1738–1752.
[3] hansen, j. h. l.: analysis and compensation of speech under stress and noise for environmental robustness in speech recognition. speech communications, special issue on speech under stress, vol. 20 (1996), no. 2, p. 151–170.
[4] speecon, http://www.speechdat.org/speecon.
[5] bořil, h., pollák, p.: design and collection of czech lombard speech database. in: proc. interspeech '05. lisboa (portugal), 2005, p. 1577–1580.
[6] pollák, p., vopička, j., sovka, p.: czech language database of car speech and environmental noise. in: proc. eurospeech '99. budapest (hungary), 1999, vol. 5, p. 2263–2266.
[7] sox – sound exchange tool manual, http://sox.sourceforge.net.
[8] the international telegraph and telephone consultative committee (ccitt), international telecommunication union (itu). ccitt g.712: general aspects of digital transmission systems; terminal equipments. transmission performance characteristics of pulse code modulation, 1992.
[9] fant – filtering and noise adding tool. http://dnt.kr.hsnr.de/download.html.
[10] bořil, h., pollák, p.: comparison of three czech speech databases from the standpoint of lombard effect appearance. in: aside 2005 – applied spoken language interaction in distributed environments. aalborg (denmark), 2005. international speech communication association. book of abstracts [cd-rom].
[11] rabiner, l. r., schafer, r. w.: digital processing of speech signals. prentice hall, new jersey, 1978.
ing. hynek bořil
e-mail: borilh@gmail.com

petr fousek
e-mail: p.fousek@gmail.com

department of circuit theory
czech technical university
faculty of electrical engineering
technická 2
166 27 prague, czech republic

dynamics of the flow pattern in a baffled mixing vessel with an axial impeller

o. brůha, t. brůha, i.
fořt, m. jahoda

this paper deals with the primary circulation of an agitated liquid in a flat-bottomed cylindrical stirred tank. the study is based on experiments, and the results of the experiments are followed by a theoretical evaluation. the vessel was equipped with four radial baffles and was stirred with a six pitched blade impeller pumping downwards. the experiments were concentrated on the lower part of the vessel, where the space pulsations of the primary loop originate due to the pumping action of the impeller. this area is considered to be the birthplace of the flow macroinstabilities in the system – a phenomenon which has been studied and described by several authors. the flow was observed in a vertical plane passing through the axis of the vessel. the flow patterns of the agitated liquid were visualized by means of al micro particles illuminated by a vertical light knife and scanned by a digital camera. the experimental conditions corresponded to the turbulent regime of agitated liquid flow. it was found that the primary circulation loop is elliptical in shape. the main diameter of the primary loop is not constant: it increases in time, and after reaching a certain value the loop disintegrates and collapses. this process is characterized by a certain periodicity, and its period proved to be correlated with the occurrence of flow macroinstability. the instability of the loop can be explained by a dissipated energy balance. when the primary loop reaches the level of disintegration, the whole impeller power output is dissipated, and under this condition any flow alteration requiring additional energy, even a very small vortex separation, causes the loop to collapse.

keywords: mixing, axial impeller, primary circulation loop, oscillation, macroinstability.

1 introduction

a mechanically agitated system under a turbulent regime of flow, with or without internals (radial baffles, draft tubes, coils, etc.), consists of a broad spectrum of eddies, from the size of the main (primary) circulation loop (pcl) of the agitated batch down to the dissipative vortices corresponding to micro-scale eddies. this study deals with an experimental and theoretical analysis of the behaviour of the flow pattern, and mainly of the pcl, in an agitated liquid in a system with an axial flow impeller and radial baffles. some characteristics of the investigated behaviour are considered to be in correlation with the occurrence of flow macroinstabilities (fmis), i.e. the flow macro-formations (vortices) appearing periodically in various parts of a stirred liquid. the flow macroinstabilities in a mechanically agitated system are large-scale variations of the mean flow that may affect the structural integrity of the vessel internals and can strongly affect both the mixing process and the measurement of turbulence in a stirred vessel. their space and especially time scales considerably exceed those of the turbulent eddies that are a well-known feature of mixing systems. the fmis occur on time scales from several up to tens of seconds, depending on the scale of the agitated system. this low-frequency phenomenon is therefore quite different from the main frequency of an incompressible agitated liquid, which corresponds to the frequency of revolution of the impeller. generally, the experimental detection of fmis is based on a frequency analysis of the oscillating signal (velocity, pressure, force) in long time series and frequency spectra, or more sophisticated procedures (proper orthogonal decomposition of the oscillating signal, the lomb periodogram or the velocity decomposition technique) are used to determine the fmi frequencies. a theoretical method for finding fmis could contribute significantly to a deeper understanding of fluid flow behaviour in stirred vessels, e.g. to a description of the circulation patterns of an agitated liquid, the application of the theory of deterministic chaos, or the knowledge of turbulent coherent structures [1–10]. this study investigates the oscillations of the primary circulation loop (the source of fmis) in a cylindrical system with an axial flow impeller and radial baffles, aiming at a theoretical description of the hydrodynamic stability of the loop. however, no integrated theoretical study of the fmi phenomenon has been presented up to now.

2 experimental

the experiments were performed in a flat-bottomed cylindrical stirred tank of inner diameter t = 0.29 m, filled with water at room temperature to a height equal to the tank diameter, h = t. the vessel was equipped with four radial baffles (baffle width b = 0.1t) and stirred with a six pitched blade impeller (pitch angle 45°, diameter d = t/3, blade width w = d/5), pumping downwards. the impeller speed was set to n = 400 rpm = 6.67 s⁻¹, and the off-bottom clearance c was t/3 (see fig. 1).

fig. 1: pilot plant experimental equipment

the flow under the parameters mentioned above was turbulent, with an impeller reynolds number rem = 6.22·10⁴. the flow was observed in a vertical plane passing through the axis of the vessel (a vertical section of the vessel) in front of the adjacent baffles. the flow patterns of the agitated liquid were visualized by means of al micro particles 0.05 mm in diameter spread in the water and illuminated by a vertical light knife 5 mm in width, see fig. 2.

fig. 2: experimental technique of flow visualization
fig. 3: visualization of the pcl
fig. 4: visualization of the flow macro formation

the visualized flow was scanned by a digital camera; series of shots were generated with a time step of 0.16 s and were analyzed by appropriate graphic software. the total length of the analyzed record was 30.24 s. the characteristics observed and analyzed were: a) the shape and sizes of the pcl, b) the size of the pcl core, and c) the position of the top of the flow macro formation. the corresponding functions were obtained: a) the height of the pcl, hci = hci(t) (see fig. 3), b) the width of the pcl, sci = sci(t) (see fig. 3), c) the equivalent mean diameter of the pcl core, rc,av i = rc,av i(t) (see fig. 3), and d) the height of the top of the flow macro formation, hfmi,i = hfmi,i(t) (see fig. 4).
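before moving on to the results, the flow regime quoted above can be re-derived from the experimental parameters. a minimal check, assuming water at room temperature (ρ = 1000 kg/m³, μ = 1 mpa·s):

```python
# impeller reynolds number: re_m = n * d**2 * rho / mu
rho = 1000.0          # density of water [kg/m3]
mu = 1.0e-3           # dynamic viscosity of water [Pa.s]
n = 400.0 / 60.0      # impeller speed: 400 rpm -> ~6.67 1/s
T = 0.29              # tank inner diameter [m]
d = T / 3.0           # impeller diameter [m], ~0.0967 m

re_m = n * d**2 * rho / mu
print(f"re_m = {re_m:.3g}")   # ~6.2e4, i.e. fully turbulent (re_m > 1e4)
```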
3 results of experiments

the analysis of the experiments provided the following findings. the pcl can be described as a closed stream tube with a vertical section elliptical in shape, with a core. the flow in the elliptical annular area is intensive and streamlined, while the core is chaotic and has no apparent streamline characteristics. the flow in the remaining upper part is markedly steadier. this is in agreement with the earlier observations of some authors [6, 7]. the flow process is characterized by three stages. in the first stage, the pcl grows to a certain size (its shape can be approximated as elliptical). then a quasi-steady stage follows, when the pcl remains at a constant size for a short time. in the next stage the pcl collapses into very small flow formations (vortices) or disintegrates into chaotic flow, see fig. 5a–c.

fig. 5: a) pcl growing, b) pcl at maximum height, c) pcl after its collapse

the process (the oscillation of the pcl) is apparently characterized by a certain periodicity. the incidence of the pcl as a function of time is illustrated in fig. 6, which shows that the pcl incidence ratio approaches 80 % of the considered time. the graph in fig. 7 (the vertical dimension of the pcl as a function of time) illustrates this process for the whole analysed time of 30.24 s.

fig. 6: dependence of the pcl incidence on time for the whole analyzed time
fig. 7: dependence of the vertical dimension of the pcl hci on time for the whole analyzed time
fig. 8: time evolution of the height of the pcl (average cycle time tc,av = 1.59 s, maximum height hc,max = 0.181 m, hc,max/h = 0.62) for one cycle of the pcl oscillation
fig. 9: dependence of the mean rising velocity ur of the pcl on time (ur = dhc/dt = 0.042 m/s ≈ const. for t > 0.77 s)

the calculated average time of one cycle (the time between the origin of the pcl and its collapse) is tc,av = 1.59 s. this means that the frequency of the pcl oscillations is fosc = 1/tc,av = 0.63 hz, and the dimensionless frequency is f*osc = fosc/n = 0.094. this dimensionless frequency is markedly higher than the values detected earlier, which lie in the interval 0.02–0.06 [8–10]. this corresponds to our finding that only some of the flow formations generated by one pcl cycle result in a macro-flow formation causing a surface level eruption, detected as a "macroinstability". the function hc = hc(t) for one cycle is shown in fig. 8. this function was obtained by regression of the experimental data hci(t), where the individual cycle intervals were rescaled to the average cycle time tc,av = 1.59 s. fig. 8 shows that the mean maximum height of the pcl, hc,max, reaches a value of 0.181 m, which is 62 % of the surface level height h. this agrees with earlier findings [11] that hc,max ≈ 2/3 h. the function hc = hc(t) was used for calculating the pcl rising velocity, see fig. 9. finally, the average rising time of the flow macro formation generated by the pcl was calculated from the time function hfmi = hfmi(t) (the time between generation and disintegration), and the value obtained was tav,fmi = 1.0 s. the regression curve hfmi,i = hfmi,i(t) corresponding to the experimental data for one cycle (the development of one macro-formation, with time rescaled to the average rising time tav,fmi) is illustrated in fig. 10. to specify the pcl characteristics more deeply, the volume of the pcl (the active volume of the primary circulation) was calculated.
this was assumed as a toroidal volume around its elliptical projection in the r-z plane of the mixing vessel. the function of the ratio of the pcl volume to the vessel volume, vc/v = (vc/v)(t), for one cycle is illustrated in fig. 11, where

vc(t) = (π² t / 2) [hc(t) sc(t) / 4 − rc,av(t)²]   (1)

(pappus' theorem applied to a torus of mean radius t/4, whose cross section is the elliptical projection of the loop minus its circular core) is calculated from the corresponding regression curves hc = hc(t), sc = sc(t) and rc,av = rc,av(t).

fig. 10: dependence of the height of the top of the flow macro formation hfmi,i on time for one cycle
fig. 11: dependence of the ratio of the pcl volume to the vessel volume vc/v on time for one cycle of the pcl oscillation

fig. 11 shows that the mean maximum ratio of the pcl volume to the vessel volume, vc,max/v, reaches a value of 0.326.

4 theoretical calculation of energy balance

an energy balance for the pcl was carried out with a view to explaining the reasons for the flow field behaviour (predominantly the pcl oscillations). as mentioned above, there are three different stages in the flow process. the energy balance was carried out for the quasi-steady stage, under conditions when the pcl reaches its top position, i.e. hc = hc,max, see fig. 12. this stage directly precedes the collapse and is assumed to have a critical influence on the flow disintegration. the impeller power output is dissipated in the pcl by the following components:

i) mechanical energy losses:
1) nturn – the lower turn of the primary circulation loop, about 180° from downwards to upwards.
2) nwall – the friction of the primary circulation loop along the vessel wall.

ii) turbulence energy dissipation:
1) ndisch – dissipation in the impeller discharge stream just below the impeller.
2) nup – dissipation in the whole volume of the agitated liquid above the impeller rotor region, i.e. in the space occupied by both the primary and the secondary flow.

it is expected that under quasi-steady state conditions (just before the pcl collapses) these components are in balance with the impeller power output. this means that no spare power is available, and that no alternative steady state formation is possible. the individual power components and quantities can be calculated when the validity of the following simplifying assumptions for the flow in the pcl is considered:

1) the system is axially symmetrical around the axis of symmetry of the vessel and impeller.
2) the primary circulation loop can be considered as a closed circuit.
3) the cross section of the primary circulation loop (stream tube) is constant.
4) the conditions in the primary circulation loop are isobaric.
5) the liquid flow regime in the whole agitated system is fully turbulent.
6) the character of the turbulence in the space above the impeller rotor region is homogeneous and isotropic.

4.1 calculation of basic quantities

the impeller power input is

p = po ρ n³ d⁵,   (2)

where the impeller power number is po = 1.7 for rem > 10⁴ [12], n = 6.67 s⁻¹, d = 0.0967 m and ρ = 1000 kg/m³. then p = 4.25 w.

the impeller power output can be calculated from the known impeller hydraulic efficiency ηh (ηh = 0.48 for a four 45° pitched blade impeller [14]):

n = ηh p = 2.04 w.   (3)

the impeller pumping capacity qp can be calculated from the known flow rate number nqp (nqp = 0.94 for rem > 10⁴ [13]):

qp = nqp n d³ = 5.66·10⁻³ m³ s⁻¹ = 5.66 l s⁻¹.   (4)

the mass flow rate is

ṁp = ρ qp = 5.66 kg s⁻¹ (ρ = 1 kg l⁻¹).   (5)
the axial impeller discharge velocity is equal to the average circulation velocity of the pcl (assuming a constant average circulation velocity in the loop):

wc,av = qp / (π d² / 4) = 0.77 m s⁻¹,   (6)

where π d²/4 is the cross-sectional area of the impeller rotor region, i.e. of the cylinder circumscribed by the rotary mixer. the primary circulation loop is then a closed stream tube consisting of the set of streamlines passing through the impeller rotor region.

4.2 calculation of turbulent dissipation below the impeller rotor region, ndisch

the energy dissipation rate per unit mass is [14]

ε = a q^(3/2) / l,   (7)

where

a = (2/3)^(3/2)   (8)

and the integral length scale of turbulence [14] is

l = d / 10.   (9)

fig. 12: schematic view of the pcl at its top position (hc = hc,max)

the kinetic energy of turbulence per unit mass is

q = (1/2) (w′z² + w′r² + w′φ²).   (10)

according to [14], the average value of q in the impeller discharge stream is q0 = 0.226 m² s⁻² for the conditions described in [14]: a four pitched blade impeller with a pitch angle of 45°, d0 = 0.12 m, t0 = 0.24 m, n0 = 6.67 s⁻¹, po0 = 1.4, rem,0 = 4.4·10⁴, p0 = 10.3 w, l0 = 0.012 m. we then have from eq. (7) ε0 = 6.029 m² s⁻³. the energy dissipation per unit mass ε related to the experimental system used in this study is

ε = ε0 (p/m) / (p0/m0) = ε0 (p/ρv) / (p0/ρv0),   (11)

and after substituting v = 0.0192 m³, p0 = 10.3 w and v0 = 0.0109 m³ we obtain ε = 1.384 m² s⁻³. using

ndisch = ε ρ vdisch,   (12)

where the dissipation volume below the impeller rotor region is vdisch = 0.000330 m³ (see fig. 13), we obtain ndisch = 0.46 w.

4.3 calculation of dissipation in the whole volume above the impeller region, nup

according to [15],

nup = ηup p = 0.60 w,   (13)

where the portion of the impeller power input dissipated above the impeller rotor region is ηup = 0.14 for a six 45° pitched blade impeller (d/t = c/t = 1/3, h = t) and rem > 10⁴. when calculating the quantity nup, the validity of the introduced assumption no. 6 is considered. moreover, because of the momentum transfer between the primary flow and the secondary flow outside the pcl, the rate of energy dissipation above the impeller rotor region is related to the whole volume, consisting of both the primary and the secondary flows.

4.4 calculation of the dissipation in the lower turns of the pcl, nturn

according to [16],

nturn = (Σj ξj) (wc,av² / 2) ṁp.   (14)

substituting the sum of the loss coefficients Σj ξj = 0.50 for the bend of the turn of the pcl, α = 180°, as well as the values of the average circulation velocity of the pcl wc,av and the mass flow rate ṁp, we obtain nturn = 0.84 w.

4.5 calculation of the dissipation by the wall friction along the pcl, nwall

using the relation for the mechanical energy loss due to friction along the wall, from [17],

nwall = λ (l / de) (wc,av² / 2) ṁp,   (15)

where the equivalent diameter is defined as

de = 4 s / o = 4 (π d² / 4) / (π t) = d² / t   (16)

and the friction factor for a smooth wall is

λ = 0.3164 / re^0.25,   (17)

where

re = wc,av d ρ / μ.   (18)

we can then calculate the rate of energy dissipation in the flow along the smooth vessel wall after substituting: μ = 1 mpa·s; the length of the pcl along the wall l = hc,max − hturn, where hc,max = 0.181 m (see fig. 8) and hturn = 0.051 m, so l = 0.13 m; and de = 0.032 m. with re = 7.45·10⁴ and λ = 0.020 obtained from eqs. (16)–(18) and substituted into eq. (15), we get nwall = 0.14 w.
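since the whole chain of substitutions in eqs. (2)–(18) is numeric, it can be replayed in a few lines; the sketch below uses only the constants quoted in the text and reproduces the quoted values (including the sum of losses presented in the following section) to within rounding.

```python
import math

# constants given in the text
rho, mu = 1000.0, 1.0e-3            # water [kg/m3], [Pa.s]
n, T = 6.67, 0.29                   # impeller speed [1/s], tank diameter [m]
d = T / 3.0                         # impeller diameter [m]
po, n_qp, eta_h = 1.7, 0.94, 0.48   # power number, flow-rate number, hydr. eff.

p = po * rho * n**3 * d**5          # eq. (2): power input, ~4.25 W
n_out = eta_h * p                   # eq. (3): power output, ~2.04 W
q_p = n_qp * n * d**3               # eq. (4): pumping capacity, ~5.66e-3 m3/s
m_p = rho * q_p                     # eq. (5): mass flow rate, ~5.66 kg/s
w_c = q_p / (math.pi * d**2 / 4)    # eq. (6): circulation velocity, ~0.77 m/s

eps = 1.384                          # eq. (11): scaled dissipation rate [m2/s3]
n_disch = eps * rho * 0.000330       # eq. (12): ~0.46 W
n_up = 0.14 * p                      # eq. (13): ~0.60 W
n_turn = 0.50 * w_c**2 / 2 * m_p     # eq. (14): ~0.84 W

d_e = d**2 / T                       # eq. (16): equivalent diameter, ~0.032 m
re = w_c * d * rho / mu              # eq. (18): ~7.45e4
lam = 0.3164 / re**0.25              # eq. (17): Blasius, ~0.019-0.020
l = 0.181 - 0.051                    # wall-adjacent loop length [m]
n_wall = lam * l / d_e * w_c**2 / 2 * m_p   # eq. (15): ~0.14 W in the text

total = n_turn + n_wall + n_disch + n_up    # eq. (19): ~2.04 W
print(f"P = {p:.2f} W, N = {n_out:.2f} W, sum of losses = {total:.2f} W")
```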
fig. 13: dissipation volume below the impeller rotor region vdisch before the first turn of the loop (d = t/3 = 0.0967 m, z = 0.045 m)

4.6 energy balance in the quasi-steady stage of the pcl

the sum of all the dissipated power components considered here is

Σ ni = nturn + nwall + ndisch + nup,   (19)

Σ ni = 0.84 + 0.14 + 0.46 + 0.60 = 2.04 w.   (19a)

the value in eq. (19a) is in good agreement with the impeller power output from eq. (3), n = 2.04 w, even though the data used for the calculations come from six independent literature sources. this result means that in the quasi-steady phase of the pcl oscillation cycle, all the impeller power output is consumed by dissipation. no power is available, either for increasing the pcl kinetic energy or for any changes in flow formation resulting in a higher energy level. however, it is well known that vortex disintegration into smaller formations, or even a very small vortex separation from a primary flow, is a process that consumes energy (according to the law of angular momentum conservation). the experiments proved that vortex separations (small and large) occur in all the stages, thus also in the quasi-steady stage. this seems to provide an explanation for the pcl collapse: in the quasi-steady stage, the energy necessary for vortex separation or for other changes in flow formation cannot be supplied by the impeller, but has to be extracted from the ambient flow field. and then even a very small energy deficit can result in a qualitative flow field change, appearing as the pcl collapse. it should be noted that the pcl collapse can be followed by a noticeable rise in the surface level, classified as a macroinstability. however, not every collapse reaches the level and is observed as a macroinstability. this corresponds to the disagreement between the pcl oscillation frequency determined experimentally in this study and the macroinstability frequency presented in the literature. the phenomenon observed here affects the processes taking place in an agitated charge, especially on the macro level, i.e., when miscible liquids blend and when solids are suspended in liquids. oscillations of the macroflow contribute to attaining a so-called "macroequilibrium" in an agitated batch, i.e., better homogeneity both in a pure liquid (blending) and in a solid-liquid suspension (distribution of solid particles in a liquid). these processes can be observed predominantly in the subregion of low liquid velocity above the impeller rotor region, corresponding to approx. one third of the volume of the agitated charge. oscillations of the primary circulation of an agitated liquid can contribute to the forces affecting the body of the impeller as well as the mixing vessel and its internals. this low-frequency phenomenon probably need not have fatal consequences, because the standard design of industrial mixing equipment should have sufficient margins for unexpected events during processes running in an industrial unit.

5 conclusions

a) the average circulation velocity in the primary circulation loop is more than one order of magnitude higher than the rising velocity of the loop.
b) the frequency of the primary loop oscillations is about one order of magnitude lower than the revolution frequency of the impeller.
c) the top of the disintegrating primary circulation loop is a birthplace of macroinstabilities in the region of secondary flow.
d) the primary circulation loop collapses owing to a disequilibrium between the impeller power output and the rate of dissipation of mechanical energy in the loop. a small change (e.g. a small turbulent vortex or a small increase in the primary circulation loop) can have a great effect.

acknowledgments

the authors of this paper are grateful for financial support from the following czech grant agencies:
1. czech grant agency, grant no. 104/05/2500.
2. czech ministry of education, grant no. 1p05 la 250.
3. czech ministry of education, grant no. msm6046137306.

list of symbols

b width of baffle, m
c off-bottom clearance, m
de equivalent diameter, m
d impeller diameter, m
fosc frequency of pcl oscillations, hz
f*osc dimensionless frequency of oscillations
hc value of the height of the pcl obtained by regression, m
hci experimental value of the height of the pcl, m
hc,max maximal height of the pcl, m
hc,min minimal height of the pcl, m
hfmi value of the height of the top of the macro formation obtained by regression, m
hfmi,i experimental value of the height of the top of the flow macro formation, m
hturn height of the turn of the pcl, m
h height of water level, m
l length of the pcl adjacent to the wall, m
l integral length scale of turbulence, m
m mass of liquid in the stirred tank, kg
ṁp mass flow rate, kg s⁻¹
n impeller speed, s⁻¹
n impeller power output, w
ndisch turbulent dissipation below the impeller rotor region, w
nqp impeller pumping number
nturn dissipation in the lower turns of the pcl, w
nup dissipation in the whole volume above the impeller rotor region, w
nwall dissipation by the wall friction of the pcl, w
p impeller power input, w
po impeller power number
q kinetic energy of turbulence per unit of mass, m² s⁻²
qp impeller pumping capacity, m³ s⁻¹
rc,av equivalent mean diameter of the pcl core, obtained by regression, m
rc,av i experimental value of the equivalent mean diameter of the pcl core, m
re reynolds number
rem impeller reynolds number
s cross section, m²
sc value of the width of the pcl, obtained by regression, m
sci experimental value of the width of the pcl, m
t diameter of the stirred tank, m
tc,av average time of the pcl cycle, s
tav,fmi average rising time of the flow macro formation, s
ur rising velocity of the pcl, m s⁻¹
ur,av mean rising velocity of the pcl, m s⁻¹
v volume of the stirred tank, m³
vc calculated volume of the pcl, m³
vc,max maximum value of the calculated volume of the pcl, m³
vdisch volume of dissipation below the impeller rotor region, m³
w width of blade, m
wc,av axial impeller discharge velocity, m s⁻¹
w′z, w′r, w′φ turbulent velocity fluctuations in the individual axis directions, m s⁻¹
z height of the cylindrical volume of dissipation below the impeller rotor region, m
α bend of the turn of the pcl
ε energy dissipation rate per unit mass, m² s⁻³
μ dynamic viscosity, pa·s
ηh impeller hydraulic efficiency
ηup portion of the impeller power input dissipated above the impeller rotor region
λ friction factor
ρ density of liquid, kg m⁻³
ξj loss coefficient

references

[1] roussinova, v. t., kresta, s. m., weetman, r.: low frequency macroinstabilities in a stirred tank: scale-up and prediction based on large eddy simulations. chemical engineering science, vol. 58 (2003), p. 2297–2311.
[2] roussinova, v. t., kresta, s. m., weetman, r.: resonant geometries for circulation pattern macroinstabilities in a stirred tank. aiche journal, vol. 50 (2004), no. 12, p. 2986–3005.
[3] roussinova, v. t., grcic, b., kresta, s.
m.: study of macro-instabilities in stirred tanks using a velocity decomposition technique. trans icheme, vol. 78, part a (2000), p. 1040–1052.
[4] fan, j., rao, q., wang, y., fei, w.: spatio-temporal analysis of macro-instability in a stirred vessel via digital particle image velocimetry (dpiv). chemical engineering science, vol. 59 (2004), p. 1863–1873.
[5] ducci, a., yianneskis, m.: vortex tracking and mixing enhancement in stirred processes. aiche journal, vol. 53 (2007), p. 305–315.
[6] fořt, i., gračková, z., koza, v.: flow pattern in a system with axial mixer and radial baffles. collection of czechoslovak chemical communications, vol. 37 (1972), p. 2371–2385.
[7] kresta, s. m., wood, p. e.: the mean flow field produced by a 45° pitched blade turbine: changes in the circulation pattern due to off bottom clearance. the canadian journal of chemical engineering, vol. 71 (1993), p. 42–52.
[8] brůha, o., fořt, i., smolka, p.: phenomenon of turbulent macro-instabilities in agitated systems. collection of czechoslovak chemical communications, vol. 60 (1995), p. 85–94.
[9] hasal, p., montes, j.-l., boisson, h. c., fořt, i.: macro-instabilities of velocity field in stirred vessel: detection and analysis. chemical engineering science, vol. 55 (2000), p. 391–401.
[10] paglianti, a., montante, g., magelli, f.: novel experiments and mechanistic model for macroinstabilities in stirred tanks. aiche journal, vol. 52 (2006), p. 426–437.
[11] bittorf, k. v., kresta, s. m.: active volume of mean circulation for stirred tanks agitated with axial impellers. chemical engineering science, vol. 55 (2000), p. 1325–1335.
[12] medek, j.: power characteristics of agitators with flat inclined blades. international chemical engineering, vol. 20 (1985), p. 664–672.
[13] brůha, o., fořt, i., smolka, p., jahoda, m.: experimental study of turbulent macroinstabilities in an agitated system with axial high-speed impeller and radial baffles. collection of czechoslovak chemical communications, vol. 61 (1996), p. 856–867.
[14] zhou, g., kresta, s. m.: distribution of energy between convective and turbulent flow for three frequently used impellers. trans icheme, vol. 74, part a (1996), p. 379–389.
[15] jaworski, z., fořt, i.: energy dissipation rate in a baffled vessel with pitched blade turbine impeller. collection of czechoslovak chemical communications, vol. 56 (1991), p. 1856–1867.
[16] perry, j. h.: chemical engineer's handbook (fourth edition). new york: mcgraw-hill book comp., 1963.
[17] brodkey, r. s.: the phenomena of fluid motions. reading: addison-wesley publishing comp., 1967.

doc. ing. oldřich brůha, csc.
phone, fax: +420 251 560 040
mobile: +420 777 895 766
e-mail: enex@atlas.cz
department of physics
czech technical university in prague
faculty of mechanical engineering
technická 4
166 07 prague 6, czech republic

ing. tomáš brůha
phone, fax: +420 251 560 040
mobile: +420 777 895 762
e-mail: enex@volny.cz
department of chemical and process engineering
institute of chemical technology, prague
technická 5
166 28 prague 6, czech republic

doc. ing. ivan fořt, drsc.
phone: +420 224 352 713
fax: +420 224 310 292
e-mail: ivan.fort@fs.cvut.cz
department of process engineering
czech technical university in prague
faculty of mechanical engineering
technická 4
166 07 prague 6, czech republic
doc. dr. ing. milan jahoda
phone: +420 220 443 223
fax: +420 220 444 320
e-mail: milan.jahoda@vscht.cz
department of chemical and process engineering
institute of chemical technology, prague
technická 5
166 28 prague 6, czech republic

kbe application for the design and manufacture of hsm fixtures

j. ríos, j. v. jiménez, j. pérez, a. vizán, j. l. menéndez, f. más

the design of machining fixtures for aeronautical parts is strongly based on the knowledge of the fixture designer, and it comprises certain repetitive tasks. an analysis of the design process allows us to state its suitability for developing knowledge based engineering (kbe) applications in order to capture the knowledge, and to systematize and automate the designs.
this work justifies the importance of fixtures for high speed milling (hsm) and explains the development of a kbe application to automate the design and manufacture of such elements. the application is the outcome of a project carried out in collaboration with the company eads. in the development process, a specific methodology was used in order to represent the knowledge in a semi-structured way and to document the information needed to define the system. the developed kbe application is independent of the part design system. this makes it necessary to use an interface to input the part geometry into the kbe application, where it is analyzed in order to extract the relevant information for the fixture design process. the results obtained from the application come in three different forms: raw material drawings, fixture 3d solid models, and text files (bill of materials, bom, and numerical control, nc, programs). all the results are exported to other applications for use in other tasks. the designer interacts with the application through an ad hoc interface, where he is asked to select or input some data and where the results are also visualized. the prototype kbe application has been carried out in the icad development environment, and the main interface is with the cad/cam system catia v4.

keywords: kbe, icad, hsm fixtures design.

1 introduction

high speed milling (hsm) has been widely used in the aerospace sector for several years. the benefits of using hsm in the manufacturing of components made of aluminium, titanium or hardened steel alloys have been clearly identified. it is also recognized that hsm allows us to mill complex structures in aluminium that previously were neither practical nor possible to achieve. mainly, it allows us to achieve thinner walls to reduce the weight, and to machine monolithic structures that save on assembly. frames and components of different sizes and with different degrees of geometrical complexity, made of aluminium and titanium, are very common in the aeronautical sector. in general, the benefits come from the use of higher feed rates and cutting speeds, and lighter depths of cut, making this combination cost-effective. however, it is also widely admitted that the use of hsm demands a change in the philosophy of and the approach to machining. in addition to the change in the cutting parameters, both in roughing and in finishing operations, dedicated tooling and production equipment incorporating the necessary design features is needed. in this sense, rigidity to avoid vibrations is a key factor, set-up stability is part of the solution, and fixtures are the elements responsible for part of this increased stability. in addition to the stability requirement, some other factors affect the design of fixtures for hsm, mainly the thinner part walls, the new machining strategies, the reduction in the number of set-ups, and the increased amount of material to be turned into chips. erdel [1] provides a comprehensive view of the hsm process, and more specific information can be obtained from tool manufacturers such as sandvik and iscar. as mentioned in the previous paragraph, work holding is a key element in the machining process. some of the factors to be considered when designing jigs and fixtures are: quantity of work, production rate, machine capacity, sequence of operations, tolerances, interferences, cutting forces, chip evacuation, part dimensions and shape, etc. at the same time, the requirements to consider are: simplicity, rigidity, accuracy, durability, set-up times, and economy, just to name some of the most relevant factors [2]. in the case of hsm for aeronautical parts, production is frequently carried out in a machining cell made up of several hsm machines. precision pallets, hole-based two- or four-sided tooling blocks and base plates, together with vacuum/non-vacuum specific purpose fixtures, magnetic clamping elements (when ferrous materials are machined), and modular fixture elements are the main components of the system.

fig. 1: fixture assembly for hsm

fig. 1 shows a machining fixture with a vacuum circuit for cnc machining in a typical assembly for hsm, made up of the following elements: pallet, tooling block, base plate, and vacuum specific fixture. vacuum specific fixtures are frequently used for machining parts that require a single set-up, which cannot be achieved with any other kind of locating and clamping elements. as an example, in the aerospace sector fixtures of this kind are used for parts that are machined from a block of raw material and on which nc planing, pocketing, contouring and drilling operations are performed (fig. 2), and for skin panels on which nc trimming and drilling operations are performed (fig. 3).

fig. 2: example of a vacuum specific fixture
fig. 3: example of a vacuum specific fixture

cutting strategies are also designed to achieve maximum stability and rigidity during the machining process, in order to avoid deformation of the part and the generation of vibrations, as one of the requirements of hsm is to be chatter-free. for example, during the contouring operation a thin foil web of about 0.01 mm of material is left to keep the part attached to the vacuum fixture. once the machining is completed and the pallet is out of the machining stations, the vacuum is released and the part can be snapped out of the surrounding material. to help in achieving a chatter-free machining environment, vacuum fixtures also provide a very fast clamping and un-clamping cycle time. looking at the design process of fixtures, there is wide recognition of the extensive use of heuristic knowledge during such a process, as well as of the dependencies between some of the pieces of information used. as an example, when defining the machining operations, the fixture solution should be kept in mind, and vice versa.
experience and skills, gathered and kept by designers in the form of 'explicit' and 'tacit' knowledge over several years, are a key factor in achieving a good fixture design. this fact, together with: 1. the extensive information needed during the design process, mainly related to the part, the machining process, resources, and production; and 2. the complexity of the design itself, which implies that we must determine the locating, supporting and clamping positions and the corresponding physical fixture elements, considering mainly the requirements of stability, rigidity, deformability, accuracy, accessibility, interference, availability and cost; makes it extremely difficult to automate the design process of fixtures completely. this is the main reason why much research work focuses on specific issues of fixture design (locating and clamping layout, fixture force analysis, fixture tolerance analysis, etc.), addresses a specific kind of fixture, e.g., modular fixtures, or considers only parts with a specific kind of geometry, e.g., prismatic forms. the first attempts to develop a computer aided design application for machining fixtures date back around twenty years [3]. with the improvements in feature based design, geometry analysis algorithms, knowledge capture and representation, and artificial intelligence techniques, the development of such applications has been facilitated, but the extensive expertise needed during the process makes this area of research still extremely challenging. a comprehensive study of different systems can be found in [4] and [5], and in the work carried out by hou and trappey on modular fixtures [6], one of the latest studies in this area, which also provides an interesting literature survey. this paper presents the development of a kbe system applied to the design of fixtures for hsm of aeronautical components. due to the special characteristics of the fixture design process and the complexity of the knowledge involved, the application integrates knowledge represented in the form of design rules with knowledge provided by the designer in the form of input parameters. because of the specific requirements, the development framework encompasses two different systems: catia v4 (cad/cam system) and icad (kbe development software).

2 automating the design of fixtures

as stated previously, the automation of fixture design has been pursued for several years. lately, the topic seems to have returned, basically due to the application of various techniques to the reasoning process needed to provide a possible solution to the problem. genetic algorithms [7], agent based systems [8], and machine-learning techniques [9] are being applied to the automation of the fixture design process. the use of these techniques, framed under the general discipline of artificial intelligence, aims to facilitate the capture of the 'what', the 'how' and the 'why' needed to carry out the design reasoning.
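one lightweight way to make such captured design knowledge executable is a rule base fired by forward chaining, the reasoning style discussed again in section 3. the sketch below is hypothetical: the facts and the two rules are invented placeholders for illustration, not rules from the application described in this paper.

```python
# a minimal forward-chaining sketch of rule-based fixture design knowledge;
# the facts and rule bodies are hypothetical examples only.
facts = {"part_type": "skin_panel", "operations": {"trimming", "drilling"}}

def needs_vacuum(f):
    # e.g. thin skin panels machined in a single set-up call for vacuum clamping
    if f["part_type"] == "skin_panel":
        f["fixture_kind"] = "vacuum"

def needs_vacuum_circuit(f):
    if f.get("fixture_kind") == "vacuum":
        f["add_vacuum_circuit"] = True

rules = [needs_vacuum, needs_vacuum_circuit]

# keep firing rules until no rule adds a new fact (forward chaining)
changed = True
while changed:
    before = dict(facts)
    for rule in rules:
        rule(facts)
    changed = facts != before

print(facts)
```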
in this sense, one of the main aims of any kbe system is to automate routine designs that require a significant amount of time when performed manually [10]. one of the main questions, when addressing the automation of any design process, is what degree of automation should be achieved. answering this question is not an easy task, since many human, technical and economic factors have to be taken into account. in particular, there is one key factor to consider, which is the amount of creative or routine work involved in the process. however, it should be considered that the automated design process should help the designer to carry out more creative work but at the same time make use of part of his 'knowledge'. above all, this concerns knowledge that the designer can input easily, while obtaining it without the designer would involve much extra effort. in this sense, when addressing the automation of fixture design, one of the main tasks is to analyse the geometry of the part. it is essential to know the raw or initial material and the final part that is to be obtained, in order to determine the volumes of material to remove. these volumes will help to determine the necessary machining operations, but the order in which they will be performed cannot be defined without taking into consideration many other factors like the tolerances, the machine tool and the fixture solution that will be used. when dealing with aeronautical parts, the analysis of the part geometry turns out to be really complicated, because the parts themselves are geometrically complex. in this particular case, this complexity does not allow the wide use of design features. the general practice for many aerospace components is to use complex curves and surfaces as the starting point to design a 3d solid model. fig. 4 shows three examples of parts considered in the project. the method used to design the part, in particular the kind of geometric primitives and functions used, has a tremendous influence on the later analysis of the component. this consideration is even more relevant when a translation of the geometric model between different systems is needed. the problem that usually arises is the loss of the geometry modelling history of the part. this implies an additional degree of complexity when analyzing the received geometric model in the system where the automated design will be carried out. in particular, prior to the design of the fixture solution, we need to identify, dimension and locate the supporting face, the supporting face contour, the wall thicknesses, holes, pockets, and possible internal and external remnants of the part. in hsm, the possible remnants are particularly important because they need to be clamped to avoid vibrations or snapping during the machining process. the supporting face contour is also relevant, for two main reasons: the first is to dimension the specific support fixture to the minimum size needed to support the part properly, in order to reduce the length of the tool, and in consequence the possibility of chatter, when machining the part with a tool whose axis is parallel to the supporting plane; the second reason is that the contour is used as an input to determine the vacuum system geometry, if needed.
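to make the kind of geometric analysis mentioned above more concrete, the sketch below finds the area of a planar supporting face in a triangulated model. the mesh representation and the function names are hypothetical simplifications; the real application analyses catia/icad solid models, not raw triangles.

```python
# minimal sketch of one step of the geometric analysis: finding the
# triangles that form a planar supporting face at z = 0 and measuring
# its area. the mesh format is a hypothetical simplification.
import numpy as np

def supporting_face_area(vertices: np.ndarray, triangles: np.ndarray,
                         z_plane: float = 0.0, tol: float = 1e-6) -> float:
    """sum the areas of all triangles whose vertices lie on z = z_plane."""
    area = 0.0
    for tri in triangles:
        p = vertices[tri]                   # 3 x 3 array of corner points
        if np.all(np.abs(p[:, 2] - z_plane) < tol):
            area += 0.5 * np.linalg.norm(np.cross(p[1] - p[0], p[2] - p[0]))
    return area

# toy model: a unit square at z = 0 split into two triangles,
# plus one inclined triangle that must be ignored
verts = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0],
                  [0, 1, 0], [0.5, 0.5, 1.0]], dtype=float)
tris = np.array([[0, 1, 2], [0, 2, 3], [0, 1, 4]])
print(supporting_face_area(verts, tris))    # -> 1.0
```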
in order to minimize part of the complexity of the geometrical analysis to be carried out by the automated application, two different kinds of actions can be performed by the fixture designer during preparation of the geometry of the part: geometry generation practices and addition of information. in terms of geometry generation, one of the practices is to make sure that there are neither gaps nor duplications between the curves and surfaces that define the faces of the part. in terms of adding information to the model, one example is to locate the part with the intended supporting face at the zero value of one of its coordinates. the selection of the coordinate should be based on the configuration of the nc machine tool to be used.

3 kbe development environment

the development of kbe applications implies the use of a particular environment providing a programming language that allows us to represent product and design process knowledge, and a reasoning process. the latter is referred to in general as the inference engine, and its basic function is to derive answers from the knowledge applied to the initial data. in the engineering design context, the relation between cad systems and what used to be called 'expert systems' several years ago, currently referred to as kbe systems, dates back around twenty years. the reason for this relation lies in the need of any kbe system used in the design environment for a link with a geometric modelling kernel.

fig. 4: examples of aeronautical parts used in the project

in this sense, parasolid, acis, and open cascade are some of the geometric modellers currently used by kbe systems. in some way, the application program interface provided by some cad systems in the form of programming subroutines, which can be called from programs developed mainly in fortran, may be considered as the first version of a kbe programming environment. in fact, rule-based systems, one of the typical artificial intelligence (ai) methods and the basis for several of the current kbe systems, such as icad and intent!, can be implemented using the construction 'if … then' in the fortran language. however, with this approach it is necessary to define the reasoning process as well, which in fact would be a forward reasoning method based on the arrival of new facts represented by variable values; the reasoning process would have to be defined in the flow of the program. in this sense, it is important to point out that, currently, object oriented languages like c++ allow the development of 'expert systems' even though their development environments do not provide any reasoning process or inference engine; this has to be implemented basically through the definition of methods and messages between objects. callan [11] provides a comprehensive introduction to ai techniques. coming back to kbe applications, it can be considered that there are currently five main commercial kbe development environments:

- icad (intelligent computer aided design), based on the lisp language and using mainly parasolid as the geometric modelling kernel.
- intent!, based on the lisp language and using acis as the geometric modelling kernel.
- pacelab, based on the c++ language and using open cascade as the geometric modelling kernel.
- unigraphics knowledge fusion, based on intent! as the modelling language and using parasolid as the geometric modelling kernel.
- catia v5 component application architecture, based on c++ and having its own geometric modelling kernel.

in the particular case of our project, catia v4 and icad constituted the development environment. catia v4 was used as the geometry modeller of the aeronautical part, and icad was used for the development of the kbe application. the geometry of the part was imported into icad via a catia model processor, and later the resulting fixture design, generated by icad, was exported into catia. as has been previously mentioned, icad is based on the lisp language. it provides what is called the icad design language (idl) and a forward/backward inference engine, which means that the reasoning process can be done from the start state to the goal state, and vice versa. for the geometric modelling functions, icad provides two options, the parasolid kernel and a surface based kernel. the development of the kbe application user interface is done independently of the knowledge modelling. the basic components of an icad application are the idl code elements named defpart and defun. the first element can be used to define the geometric and non-geometric entities, and it is structured in a hierarchical tree. the highest level of the tree is a root defpart that encompasses the root defpart of the user interface and the root defpart of the geometrical components. a defpart has a basic type and several sections, the main ones being: part-documentation, inputs, optional-inputs, modifiable-inputs, query-attributes, attributes, attribute-types, descendant-attributes, pseudo-parts, and parts. the latter element, defun, allows us to code traditional functions, which take parameters as input, apply an algorithm, and return a result. in addition to the defpart files, an icad application may include catalog files. these are text files where the parameters defined for the application are stored together with their possible values. in particular, this kind of file can be used to define the parameters needed for the decision rules and for the generation of library components. a minimal structural analogy of this mechanism is sketched after the application requirements in section 5.

4 capturing and representing product and design process knowledge

the capture and representation of knowledge is a discipline that has attracted considerable attention from the research perspective in recent years, mainly due to its key role in the area of knowledge management. in particular, in the subarea related to kbe systems there are two main initiatives that must be considered: commonkads [12] and moka [10]. commonkads is a methodology for the development of knowledge-based systems (kbs). it considers the use of the pc-pack tool set as a knowledge elicitation tool, and in different parts it leans on the use of the unified modelling language (uml). it proposes three main modelling steps:

- context modelling. this encompasses three different models: organization, task and agent models.
- knowledge modelling. this also includes three different models: domain knowledge (static view), inference knowledge (reasoning process), and task knowledge (application goals).
- communication modelling. this encompasses the definition of the information exchange procedures to perform the knowledge transfer between agents.

moka is another methodology influenced by commonkads, amongst other techniques; it focuses on the development of kbe applications. the main objective of moka is to help to reduce the effort needed and the risk associated with the development of kbe applications.
it defines two levels of knowledge representation: an informal level and a formal level. the informal level is based on the use of specific forms named icare forms: illustrations, constraints, activities, rules and entities. the formal level comprises the transformation of the knowledge defined in the icare forms into uml based diagrams. the idea is to use these object oriented models as an input to the coding of the kbe application. in this sense, it is relevant to note that the company that commercializes the icad software was one of the major partners in the development of the moka methodology. currently, the pc-pack tool supports the generation of icare forms. in the development of this project, in addition to the documents defined by the industrial partner of the project to represent the hsm fixture requirements, only the moka informal level was considered. the icare forms were mainly used to represent constraints and rules associated with the fixture design process.

5 a knowledge based application for the design and manufacture of fixtures

the design of the fixture solution for a particular aeronautical component to be machined by hsm implies the work of various technicians and the development of a considerable number of steps. most of these steps are carried out making use of a cad/cam tool. as stated previously, information about the part, the possible machining process, and the available resources has to be gathered and put together. a detailed analysis of the whole process reveals that its automation should be addressed partially, with the main objective of providing the fixture designer with an application that will help him in his most routine tasks. a fundamental element to be considered is the fixture philosophy, which in fact depends on the configuration of the production resources and the family of parts to be machined. this determines the kind of fixture elements that need to be considered. in this particular case, and as previously stated, the hsm of aeronautical parts is frequently carried out in a machining cell made up of several hsm machines. this context implies the use of precision pallets, hole-based two or four-sided tooling blocks and base plates, together with vacuum/non-vacuum specific purpose fixtures, and some modular fixture elements. in relation to the family of parts, the project initially addressed parts that have a planar supporting face, which can be defined by one or more geometric surfaces (fig. 4).

fig. 5: general view of the developed icad application

fig. 5 presents a general view of the developed application context and of its outputs. the knowledge needed for its operation is defined and codified in the form of two main kinds of files: code files, based mainly on the defpart and defun constructions, and parameter files for decision rules, catalogs, and the generation of standard components. the general requirements specified for the application development are summarized in the following terms:

- the geometry of the aeronautical part will be generated in catia v4. an output model in catia format will be created and it will be imported into icad via a direct translator.
- unless specified otherwise, the icad application will calculate and generate all the geometry needed based on: geometric analysis, inputs from the application interface and parameter files, and design rules.
- the icad application will generate the raw material of the part. this implies the following outputs:
  - a 3d solid model of the raw material needed for the part, including all the holes for fixing it and for handling it on the specific base fixture. the holes will be designed according to the company standards. the kinds of holes to be considered are as follows: holes for guiding/locating pins; holes for fixing screws; standard fixing holes; non-standard fixing holes; and holes for lifting screws. the non-standard fixing holes can be of two types: those set by the designer, which will be situated close to the location points specified by the designer in the catia model; and those needed for fixing the pocket remnants, external remnants and narrow strips or inlets of material, whose positions will be calculated by the application based on a geometric analysis and the design rules. this icad 3d solid model will be exported to a catia v4 model.
  - a drawing of the raw material for the part, according to the company standards and including all the annotations, dimensions and details needed for its machining. the drawing will be exported into a postscript format file.
  - a text file with the nc program in iso format for the stock material drilling operations.
- the icad application will generate the specific base fixture for machining the raw material of the part. this implies the following outputs:
  - a 3d solid model of the specific base fixture, including a base plate and a fixture block with the following kinds of holes, and the corresponding threaded inserts, guide bushes and screws: holes for fixing screws (to fix the raw material of the part to the fixture, and to fix the base fixture to a base plate); holes for guiding pins (to guide the assembly of the raw material of the part on the base fixture, and to guide the assembly of the base fixture on the base plate); and holes for lifting screws (to handle the base fixture during the assembly process). when requested by the designer, the fixture block will have the following additional elements: a vacuum system, including vacuum nozzle, vacuum channels, vacuum grid, connecting holes, connecting grooves, and sealing grooves; and a clamping system, for which three kinds of clamps are considered (plain clamps, adjustable clamps and bridge clamps) and which includes all the needed fixing holes, inserts, and components of the clamps. this icad 3d solid model will be exported to a catia v4 model.
  - a bill of materials (bom): a text file including all the elements that make up the fixture solution.
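as announced in section 3, the defpart tree and the catalog files can be mimicked structurally in a few lines. the sketch below is only an analogy in python with invented names and parameters, not idl code, but it shows the pattern of a hierarchical product model whose leaf components take their values from a catalog consulted by a decision rule.

```python
# structural analogy (python, invented names) of the icad idl mechanism:
# a hierarchical defpart-like tree whose leaf components take their
# parameters from a catalog of decision-rule data.
from dataclasses import dataclass, field

# plays the role of a "catalog file": parameter table for decision rules
HOLE_CATALOG = {
    "guiding_pin":   {"diameter": 12.0, "tolerance": "h7"},
    "fixing_screw":  {"diameter": 10.0, "tolerance": "h12"},
    "lifting_screw": {"diameter": 16.0, "tolerance": "h12"},
}

@dataclass
class Hole:                      # plays the role of a leaf defpart
    kind: str
    x: float
    y: float

    @property
    def diameter(self) -> float:
        return HOLE_CATALOG[self.kind]["diameter"]

@dataclass
class FixtureBlock:              # plays the role of a root defpart
    length: float
    width: float
    holes: list = field(default_factory=list)

    def add_corner_guiding_pins(self, margin: float = 20.0) -> None:
        """decision rule: one guiding pin near each corner of the block."""
        for x in (margin, self.length - margin):
            for y in (margin, self.width - margin):
                self.holes.append(Hole("guiding_pin", x, y))

block = FixtureBlock(length=400.0, width=300.0)
block.add_corner_guiding_pins()
print(len(block.holes), block.holes[0].diameter)   # -> 4 12.0
```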
due to the complexity of the application, the development was divided into six main areas: analysis of the imported geometry, calculation and generation of the raw material of the part, calculation and generation of the fixture solution, generation of drawings, generation of text files, and the user interface. in each of these areas, specific functions were developed to resolve the different geometric analyses needed and to overcome the problems caused by the complexity of the geometry of the parts. in this sense, one of the most challenging tasks was to develop a method to locate, without interferences and in conformance with the standards of the industrial partner, all the elements to be included in the drawing of the raw material of the part. fig. 6 shows an example of a created drawing. to address this issue a methodology based on the concept of a 'quality engine' was developed and implemented in the form of decision rules. this concept of quality, evaluated on the basis of quantifiable parameters, was applied to each element of the drawing and to the whole drawing as a single element. additionally, the elements were classified into three groups: with a fixed location, with a restricted location, and with a free location. the generation of the drawing was defined in five main phases:

- phase 0. identification of the geometry of the part.
- phase 1. drawing preparation; locating the fixed elements.
- phase 2. determining the number and kind of views and sections.
- phase 3. generating the fixed and mobile elements in pseudo-drawings (virtual intermediate drawings).
- phase 4. locating all elements in the drawing; final configuration.

phase 3 is where the interference analysis and the quality quantification are performed. in the fixture solution, in addition to the development of decision rules, the generation of the curves for the vacuum system needed a specific development. the generation of these curves is based on the inner offsetting of the external contour of the supporting face of the aeronautical part and the outer offsetting of the contours of the through pockets and holes. the family of parts considered allows a supporting face geometrically defined by more than one geometric surface or geometric face, which adds another element of complexity to the problem. although icad provides a function to generate such curves, due to the special requirements imposed by the geometric contour of the part the offsetting of some curves resulted in curves with self-intersected loops, or in curves that were just one of the loops of the whole offset curve. this kind of result was totally invalid for generating the vacuum grooves and the vacuum grid. it is relevant to point out that the problem of self-intersected loops in offset curves has been studied by various authors, due to its importance in the generation of surfaces starting from such curves and in the generation of tool paths for nc machining. in this particular case, the solution adopted was based on discretizing the curves by points and applying the offset distance to each point in the direction of its normal to the curve. the pitch between points was defined as a parameter, allowing a finer or rougher value depending on the geometric characteristics of the contour. once the offset points are generated, a list of segments is created. the evaluation of the intersections between segments allows us to identify all the possible self-loops. an algorithm was developed to determine and eliminate the points that are included in the loops. as a result, an approximated curve is generated from the remaining points. this process was applied first to the most critical offset curve, which is the offset curve used to define the connection of the vacuum grid, since it is the innermost offset curve with respect to the external contour of the part.
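the loop-removal procedure just described can be sketched in a few dozen lines. the version below works on a plain 2d polyline and uses the textbook segment-intersection test; it illustrates the idea of the algorithm, while the production code operated on icad curves and company-specific tolerances.

```python
# sketch of the offset-curve clean-up described above: discretize the
# contour, offset each point along its normal, build the segment list,
# and drop the points enclosed by self-intersection loops.
import numpy as np

def offset_points(pts: np.ndarray, d: float) -> np.ndarray:
    """offset a closed polyline by distance d along per-point normals."""
    out = []
    n = len(pts)
    for i in range(n):
        t = pts[(i + 1) % n] - pts[(i - 1) % n]   # central-difference tangent
        t = t / np.linalg.norm(t)
        normal = np.array([-t[1], t[0]])          # tangent rotated by 90 degrees
        out.append(pts[i] + d * normal)
    return np.array(out)

def segments_intersect(p1, p2, q1, q2) -> bool:
    """proper intersection test for segments p1-p2 and q1-q2."""
    def cross(o, a, b):
        return (a[0]-o[0])*(b[1]-o[1]) - (a[1]-o[1])*(b[0]-o[0])
    d1, d2 = cross(q1, q2, p1), cross(q1, q2, p2)
    d3, d4 = cross(p1, p2, q1), cross(p1, p2, q2)
    return d1 * d2 < 0 and d3 * d4 < 0

def remove_loops(pts: np.ndarray) -> np.ndarray:
    """delete the points lying between any pair of intersecting segments."""
    keep = np.ones(len(pts), dtype=bool)
    n = len(pts)
    for i in range(n):
        for j in range(i + 2, n):
            if (j + 1) % n == i:
                continue                          # skip adjacent segments
            if segments_intersect(pts[i], pts[(i+1) % n],
                                  pts[j], pts[(j+1) % n]):
                keep[i+1:j+1] = False             # drop the loop between them
    return pts[keep]
```

the pitch of the discretization plays the role of the parameter mentioned above: a finer pitch follows tight contour features better, at the cost of more segment tests.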
fig. 7 shows two examples of a vacuum fixture solution for two of the parts depicted in fig. 4.

fig. 6: example of a drawing of the raw material for a part generated with the developed application
fig. 7: examples of 3d models of specific vacuum fixtures obtained with the developed icad application

6 conclusions

several findings and conclusions can be extracted from the project presented in this paper:
1. due to the complexity of the fixture design process, its automation should be addressed in partial stages, combining designer decisions with automatic knowledge-based expert judgments.
2. the knowledge elicitation process plays a key role in the success of the development process. the use of a formalized technique like moka, or even a company specific one, results in better and faster application development, maintenance and possible future extension.
3. the translation of complex geometric models of parts between dissimilar systems is still a problematic issue when a deep analysis of the geometry is necessary in the receiving system.
4. the correct decision between using forward reasoning, which implies the application of rules from the start conditions followed by the necessary calculations and reasoning until the goal is achieved, and using backward reasoning, which implies the application of rules to a calculated general final state until a compliant solution is achieved, is another key factor to consider when developing a kbe application.
5. the benefits obtained from the development of a kbe application go beyond the use of the application itself, since a deep rationalization and systematization of the design process helps to improve it and to capture tacit knowledge that in regular circumstances is not formally documented in the company.
6. finally, in order to facilitate the application of kbe techniques, there is a need to develop knowledge-based tools which do not demand strong programming skills from the designers and to integrate them into general purpose cad systems.

acknowledgments

the authors want to express their most sincere gratitude to the fixture designers from mtad eads-casa at the tablada factory, who kindly collaborated in this project.

references

[1] erdel, b.: high speed machining. usa: sme, 2003.
[2] boyes, w. e.: handbook of jig and fixture design. 2nd edition. usa: sme, 1989.
[3] miller, a. s., hannam, r. g.: computer aided design using a knowledge based approach and its application to the design of jigs and fixtures. proc. i.mech.e., 1985.
[4] nee, a. y. c. et al.: advanced fixture design for fms. london: springer-verlag, 1995.
[5] rong, y., zhu, y.: computer aided fixture design. marcel dekker inc., 1999.
[6] hou, j.-l., trappey, a. j. c.: "computer-aided fixture design system for comprehensive modular fixtures." int. journal of production research, vol. 39 (2001), no. 16, p. 3703–3725.
[7] senthil, a. et al.: "conceptual design of fixtures using genetic algorithms." int. j. of advanced manufacturing technology, vol. 15 (1999), p. 79–84.
[8] senthil, a. et al.: "a multi-agent approach to fixture design." j. of intelligent manufacturing, vol. 12 (2001), p. 31–42.
[9] senthil, a. et al.: "conceptual design of fixtures using machine-learning techniques." int. j. of advanced manufacturing technology, 2000, p. 176–181.
[10] stokes, m. (ed.) et al.: managing engineering knowledge: moka methodology for knowledge based engineering applications.
asme press, 2001.
[11] callan, r.: artificial intelligence. uk: palgrave macmillan, 2003.
[12] schreiber, g. et al.: knowledge engineering and management: the commonkads methodology. mit press, 1999.

dr. josé ríos, e-mail: j.rios@cranfield.ac.uk
eng. juan v. jiménez, dr. jesús pérez, dr. antonio vizán
dept. of mechanical engineering and manufacturing, industrial engineering superior college, polytechnic university of madrid, josé gutiérrez abascal 2, madrid, 28006, spain
eng. josé l. menéndez, eng. fernando más
dept. of cad/cam-pdm systems, mtad eads-casa, factory of tablada, av. garcía morato s/n, sevilla, 41011, spain

design of a predictive hierarchical controller using femlab

b. rehák

a hierarchical controller with two levels is proposed. one level is based on dynamic optimization while the second is responsible for tracking the optimal trajectory and rejecting disturbances. its implementation using the femlab system is described. some simulations are presented at the end of the paper, together with an evaluation of the performance.

keywords: hierarchical control, boundary-value problem, dynamical optimization, femlab.

1 introduction

the theory of dynamic optimization problems has been developed quite well. however, it is still not much used in applications. there are several reasons for this – the first is that solving a general dynamic optimization problem requires solving a general boundary problem for a system of ordinary differential equations. this problem is difficult, due to the need for proper discretization and also to the huge time consumption. another drawback is that the control law cannot be formulated in the feedback form. that is why this theory was almost abandoned, despite its successful applications to the optimization of, e.g., space flights in the 1960's. this theory, with some applications to this area, has been introduced in many publications, e.g., [2]. the boundary problem arising from dynamic optimization can be solved using the finite element method (fem), similarly as it is used for solving partial differential equations. hence it is possible to use pde solving tools for dynamic optimization. thanks to the vast progress made in computer equipment in the last twenty years, powerful and user-friendly software packages for solving boundary problems for ordinary differential equations have appeared. one such software system is femlab. its close relationship to the well-known matlab system makes its application in complex control problems straightforward. even if modern computer equipment is used for solving the boundary problem, computations can take too long to be used directly for designing a control law. hence we have designed a hierarchical controller. the theory of hierarchical control systems was developed in the early 1980's in the works of singh, siljak, findeisen and others. a good summary of this theory can be found in [4]. in this article we design a control algorithm based on general dynamic optimization in the upper level. the lower level was designed via the theory of lq control. we use the femlab 2.3 package for solving the optimization problem, and the implementation is also described. simulations showing the advantages of the hierarchical control approach are presented. this approach can handle non-quadratic cost functionals, e.g. functionals containing a barrier function. this paper summarizes and extends the results from [6].

2 hierarchical control law

the real-time optimal control problem was divided into two subproblems. the first is to compute the optimal trajectory prediction; this involves dynamic optimization, and the femlab system was used to carry out computations at this level. the second subproblem is to track this optimal trajectory; this is the task of the lq controller. these two subproblems are solved in different levels.
the first is referred to as the upper level while the latter is the lower level of the controller. the tracking in the lower level is independent of the computations of the upper level, thus both tasks may run in parallel. the scheme of the hierarchical control system is shown in fig. 1 [4, 7]. we will now outline the main features of both levels.

the upper level:
- determines the optimal trajectory for a long period ahead,
- carries out the time-consuming calculations,
- takes the whole structure of the system into account, including nonlinearities etc.,
- considers only the slower dynamics.

the lower level:
- is responsible for tracking the optimal trajectory,
- compensates the disturbances,
- uses a control law that should be as simple and as fast as possible and should have the feedback form,
- takes only the most important connections into account when the control law is designed, while the nonlinearities can be replaced by their linearizations,
- should also take the influence of the fast modes into consideration.

fig. 1: scheme of a hierarchical control system

the delay caused by the computation of the optimal control law in the upper level should also be taken into account. while this computation is being carried out, the lower level uses the trajectory computed in the previous step. the time during which the trajectory computed in one computational step of the upper level is used is denoted by $t_v$. this period must be longer than the longest computation in the upper level.

3 control algorithm design of the upper level

in this paragraph we will introduce the optimal control problem, which makes up a central part of the upper-level design. then we will focus on how the results are applied in order to complete the upper-level design. let $t > 0$. we consider the system described by the state equation

$\dot{x} = f(x(t), u(t))$

together with the initial condition $x(t) = x_0$. our aim is to design a control $u(t)$ such that the cost functional

$j(t) = \int_t^{t + t_h} j(x(\tau), u(\tau)) \, \mathrm{d}\tau$

is minimized. in the definition of the cost functional, $t_h$ stands for the prediction horizon and $j$ denotes a continuously differentiable function such that the integral exists. the horizon $t_h$ must be longer than the longest time necessary for the calculations described below. it can, however, be significantly greater, as mentioned in [5]. the optimization is performed under the condition that the state equation is satisfied at every time between $t$ and $t + t_h$. the problem described above will be referred to as the upper-level problem at time $t$ on the interval $(t, t + t_h)$ with the initial condition $x(t) = x_0$ and denoted by $(p[t, x_0])$. first, we introduce the solution of the problem $(p[t, x_0])$. then we will demonstrate how it can be used for the upper-level definition.
proceeding as in [2], we infer that the problem $(p[t, x_0])$ is equivalent to the problem of unconstrained minimization of the lagrange functional $l$, defined as follows:

$l(t) = \int_t^{t + t_h} \left[ j(x(\tau), u(\tau)) + \lambda^T(\tau) \left( \dot{x}(\tau) - f(x(\tau), u(\tau)) \right) \right] \mathrm{d}\tau .$

the vector-valued function $\lambda$ (having the same size as the state vector) is the lagrange multiplier, sometimes also called the co-state or the adjoint state. the functions $u$, $x$, $\lambda$ that solve the minimization problem also satisfy the following set of equations on the interval $(t, t + t_h)$ ($d_x f$ and $d_u f$ denote the derivatives of the function $f$ with respect to the variables $x$ and $u$):

$\dot{x} = f(x(t), u(t)),$
$\dot{\lambda}(t) + d_x j(x(t), u(t)) + d_x f(x(t), u(t))^T \lambda(t) = 0,$
$d_u j(x(t), u(t)) + d_u f(x(t), u(t))^T \lambda(t) = 0,$

with boundary conditions defined as follows:

$x(t) = x_0, \qquad \lambda(t + t_h) = 0,$

while the value of the state at the end of the interval $(t, t + t_h)$ and the value of the co-state at time $t$ are not prescribed. then the boundary value problem for this set of equations is correctly defined; see [2, 9] for details. we will now describe the functionality of the upper level. since the solution of the optimization problem takes a considerable time, the algorithm for the solution is started with period $t_v$, as described above. the period must be longer than the time necessary for carrying out the computations, but it must also be shorter than the prediction horizon $t_h$. during that period, the following actions take place (let us assume that the current time is $k t_v$):

1. the prediction of the state variables at time $(k+1)t_v$ is evaluated with the help of the feedforward computed at time $(k-1)t_v$. we denote this prediction by $\xi((k+1)t_v)$.
2. the optimization problem $p[(k+1)t_v, \xi((k+1)t_v)]$ is solved.
3. the optimal feedforward for the lower-level control loops on the interval $[(k+1)t_v, (k+2)t_v]$ is evaluated. we will clarify this step in the following paragraph.

at the time slot $(k+1)t_v$ the optimal feedforward is saved into the buffer, and these steps are repeated with new values of the initial and terminal time of the optimization problem. this is a kind of 'receding horizon' method, which constitutes an essential part of predictive control theory [5].
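to make the structure of the upper-level computation concrete, the boundary-value problem above can be handed to any generic bvp solver. the sketch below uses scipy instead of femlab and a scalar example (xdot = -x + u with integrand j = x**2 + u**2) chosen only for illustration; it is not the authors' implementation.

```python
# sketch of the upper-level boundary-value problem for a scalar example:
# dynamics xdot = -x + u, cost integrand j = x**2 + u**2 (assumed here
# for illustration; the paper solves the general problem in femlab).
# stationarity gives u = -lam/2; the co-state obeys lamdot = -2*x + lam.
import numpy as np
from scipy.integrate import solve_bvp

T_H, X0 = 3.0, 1.0    # prediction horizon and initial state

def rhs(t, y):
    x, lam = y
    return np.vstack([-x - 0.5 * lam,      # xdot with u = -lam/2 substituted
                      -2.0 * x + lam])     # co-state equation

def bc(ya, yb):
    # state fixed at the start, co-state zero at the end of the horizon
    return np.array([ya[0] - X0, yb[1]])

t = np.linspace(0.0, T_H, 50)
sol = solve_bvp(rhs, bc, t, np.zeros((2, t.size)))
u_opt = -0.5 * sol.sol(t)[1]               # optimal feedforward over the horizon
print(sol.status, u_opt[0])                # status 0 means the solver converged
```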
4 design of the lower level

we will now describe the functionality and the design of the lower-level controllers. first we mention that we will make use of the possibility to decompose the system; let us briefly describe this decomposition. the most important condition is that the subsystems must be almost autonomous, their mutual interaction being only weak. in what follows we assume that the system can be divided into $n$ subsystems. the system matrix (or, if the original system was nonlinear, its linearization) can be written as

$a = \begin{pmatrix} a^1 & & 0 \\ & \ddots & \\ 0 & & a^n \end{pmatrix} + a_{\mathrm{connection}} .$

the term $a_{\mathrm{connection}}$ contains all the interactions that are not taken into account in the lower-level design. we also assume that the matrices $b$, $q$, $r$ can be decomposed similarly, with compatible dimensions, so that the $i$-th subsystem (with neglected interactions) is described by the equation

$\dot{x}^i = a^i x^i + b^i u^i .$

since high speed is the main required feature, we decided to implement this level using time-invariant lq controllers. lq controllers also offer the advantage of a straightforward analysis of the cost of the control. see, e.g., [1, 9] for general lq-control theory and [7, 8] for more results about decentralized lq control and about constraints imposed by the prescribed structure of the controller. we assume that the lower-level controller of the $i$-th subsystem should track the optimal trajectory $x^i_{opt}(t)$ and the optimal control $u^i_{opt}(t)$, which were evaluated in the upper level. the tracking should be such that the cost

$j^i = \int_t^{t + t_h} \left[ \left( x^i - x^i_{opt} \right)^T q^i \left( x^i - x^i_{opt} \right) + \left( u^i - u^i_{opt} \right)^T r^i \left( u^i - u^i_{opt} \right) \right] \mathrm{d}\tau$

is minimal. here, the symbol $u^i(t)$ denotes the control which is actually applied to the system, and $x^i(t)$ stands for the real trajectory. we assume that the weighting matrices $q^i$ and $r^i$ are positive semidefinite and positive definite, respectively. using the dynamic programming approach (see e.g. [1]), we can infer that the optimal control is given by the formula

$u^i(t) = u^i_{\mathrm{feedback}}(t) + u^i_{\mathrm{feedforward}}(t) .$

the control consists of a feedback term and a feedforward term. for the feedback term it holds that

$u^i_{\mathrm{feedback}}(t) = -(r^i)^{-1} (b^i)^T P^i(t) \, x^i(t),$

while the feedforward is defined by

$u^i_{\mathrm{feedforward}}(t) = -(r^i)^{-1} (b^i)^T p^i(t) + u^i_{opt}(t),$

where the matrix function $P^i(t)$ is the solution of the continuous riccati equation with the terminal condition $P^i(t + t_h) = 0$ for the $i$-th system, and the vector function $p^i(t)$ solves the differential equation

$\dot{p}^i(t) = -\left( a^i - b^i (r^i)^{-1} (b^i)^T P^i(t) \right)^T p^i(t) + q^i x^i_{opt}(t) - P^i(t) b^i u^i_{opt}(t)$

with the terminal condition $p^i(t + t_h) = 0$. the last term in the equation is nonstandard; it is due to the existence of the desired control, which should also be tracked. note that this equation is solved in the backward direction. strictly speaking, the lq control on the horizon equal to the optimization horizon in the upper level (denoted by $t_h$, see above) should be applied. nonetheless, we assume that the horizon is long enough to replace the time-variant control by a time-invariant control without a significant loss of accuracy. another advantage is that we obtain a time-invariant gain in the control loop, which simplifies the implementation significantly. hence we replace the matrix function $P^i(t)$ by its limit value $P^i$ in the differential equation for the function $p^i(t)$. the limit solution $P^i$ solves the algebraic riccati equation

$0 = (a^i)^T P^i + P^i a^i - P^i b^i (r^i)^{-1} (b^i)^T P^i + q^i .$

this trick simplifies the implementation of this level significantly.
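a minimal sketch of the time-invariant lower-level design follows: solve the algebraic riccati equation and form the static gain. the numerical values reuse the simulation example from section 7 (a = -1 per state, b = identity, q1 = q2 = 10, r1 = r2 = 0.2, coupling neglected); the helper names are our own.

```python
# minimal sketch of the lower-level design: solve the algebraic riccati
# equation for one subsystem and form the static feedback gain.
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[-1.0, 0.0], [0.0, -1.0]])   # subsystem dynamics, coupling neglected
B = np.eye(2)
Q = 10.0 * np.eye(2)                        # tracking-error weight
R = 0.2 * np.eye(2)                         # control-effort weight

P = solve_continuous_are(A, B, Q, R)        # limit solution of the riccati equation
K = np.linalg.solve(R, B.T @ P)             # static gain; feedback term -K x

print(np.round(K, 3))
# closed-loop check: all eigenvalues of a - b k must have negative real part
print(np.linalg.eigvals(A - B @ K))
```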
5 analysis of the influence of the decomposition on the total cost

in this section we will analyze the increase in the cost when the control algorithm is designed using the hierarchical approach. we will make some assumptions that simplify the analysis. first we assume that the total cost is quadratic, i.e., that there exist a symmetric positive semidefinite matrix $q$ and a symmetric positive definite matrix $r$ such that

$j(t) = \int_t^{t + t_h} \left[ (x - x_w)^T q (x - x_w) + u^T r u \right] \mathrm{d}\tau,$

where $x_w$ denotes the desired trajectory that is to be tracked by the controlled system. moreover, we assume that the weighting matrices can be decomposed as described above. the dynamic optimization yields an optimal trajectory that should be tracked, together with the optimal control that should be applied. this trajectory, or control, is denoted by $x_{opt}$, $u_{opt}$ (we will omit the explicit writing of the time argument $t$ in what follows). we then have

$j(t) = \int_t^{t + t_h} \left[ \left( (x_w - x_{opt}) + (x_{opt} - x) \right)^T q \left( (x_w - x_{opt}) + (x_{opt} - x) \right) + \left( u_{opt} + (u - u_{opt}) \right)^T r \left( u_{opt} + (u - u_{opt}) \right) \right] \mathrm{d}\tau$

$= j_{ul} + j_{ll} + 2 \int_t^{t + t_h} \left[ (x_w - x_{opt})^T q (x_{opt} - x) + u_{opt}^T r (u - u_{opt}) \right] \mathrm{d}\tau .$

here, $j_{ul}$ denotes the optimal cost achievable by the optimization in the upper level and $j_{ll}$ stands for the lower-level optimal cost. the following holds for them:

$j_{ul} = \int_t^{t + t_h} \left[ (x_w - x_{opt})^T q (x_w - x_{opt}) + u_{opt}^T r \, u_{opt} \right] \mathrm{d}\tau$

and

$j_{ll} = \int_t^{t + t_h} \left[ (x_{opt} - x)^T q (x_{opt} - x) + (u - u_{opt})^T r (u - u_{opt}) \right] \mathrm{d}\tau .$

nonlinearities would make this reasoning much more complicated. the same holds if the cost is non-quadratic.

6 upper level implementation

we mention how the control scheme described above can be implemented using the femlab software package. the implementation is fairly straightforward since this software enables easy cooperation with matlab. we will describe briefly how the code was generated and what changes are to be made. we used the femlab 2.1 system (see [3]) as well as its graphical user interface (gui). generating the code using the gui saves a lot of effort. for simplicity we assume the system to be of order two; the implementation of the control algorithm for a higher-order system is a straightforward extension of our approach. we chose the general form, 6 one-dimensional equations on the interval $(0, t_h)$, in the initial menu. the names of the functions are $x_1$, $x_2$, $\lambda_1$, $\lambda_2$, $u_1$, $u_2$. the first two stand for the state, the pair $\lambda_1$, $\lambda_2$ represents the co-states, and the last two variables are used for control.
the time parameter contains the actual time, and the parameters x1time, x2time contain the values of the state variables. � the interval where the boundary problem was solved was changed to (time, time � th). � at the beginning of the function, the value of the state variables is evaluated at the time slot time � 1. this is possible since the control on this interval is known from the previous step. � then the code generated by the gui follows. some modifications were made: the time interval where the problem is solved was changed to time � 1, time � 1 � th. the boundary conditions on the state variables are their value at the time time � 1, where the constant th contains the length of the prediction horizon. � then the feedforward functions were computed and saved. the simulink system was used for modeling the system. to simulate both levels properly it would be necessary to run the calculations in two different threads. we did not perform this. the scheme is shown in fig. 2. the gains gain and gain1 together with the system build up the lower level. the functions firstreference and secondreference select the appropriate feedforward computed in the upper level. the connection from the lower into the upper level is realized by the function matlab fcn, which activates the upper level always after period 1. 7 simulation results a simulation example was performed with the system � .x x x u1 1 2 10 01� � � � .x x x u2 1 2 20 01� � � together with the following quadratic cost functional j. j x x q q x x u u t � � � � � �� � � � � �� � � � � �� � � � 1 2 1 2 1 2 1 2 0 0 � � �� � � � � �� � � � � �� � � � � � � � � � � t t t r r u u t1 2 1 2 3 0 0 d . the matrices q, r in the upper-level cost functional were chosen such that q1 � q2 � 10, r1 � r2 � 0.2. the lower-level compensators have gains optimal for controlling the system, with the interconnections between these subsystems neglected. the matrices in the quadratic cost functional are chosen as q1, r1 resp. q2, r2. we aim to design the control so that the state x1 will track the reference x1w � sin 4 t and the state x2 will track the reference x2w � cos 5 t. the equations for the derivatives of the co-states attain the following form 18 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 46 no. 1/2006 czech technical university in prague fig. 2: the simulation scheme � ( ) . ( , ) � � � � � 1 1 1 1 1 2 1 2 1 0 01� � � �q x x b x x xopt w opt opt � ( ) . ( , ) � � � � � 2 2 2 2 1 2 1 2 2 0 01� � � �q x x b x x xopt w opt opt with zero terminal condition at t � 3 again. the state is shown in fig. 3a without the barrier function and without the presence of the additive noise. (in this and in the following figures, the solid line represents the state x1 of the system while the dashed line represents the reference.) fig. 3b shows the state x1 if the additive noise acts upon the system while the barrier function is not present. the next two figures show the state x1 in the situation after augmentation of the cost functional by adding the barrier function. the state x1 is shown in the fig. 4a without the presence of the additive noise, and in fig. 4b in the presence of this noise. the state space is shown in fig. 5a, respectively 5b (with, respectively without, the barrier function activated). here we can easily see the effect of penalizing the states that are close to the point xp. the dashed circle joins the points where the barrier function (defined above) attains the value 1. 
finally, if the system is controlled directly by the optimal control computed in the upper level, the state $x_1$ behaves as depicted in fig. 6 (solid line), the reference again being represented by the dashed line. the loss of reasonable performance is due to the fact that the system remains virtually uncontrolled during the computations in the upper level while the noise still acts. the simulations carried out show that the behavior of the states $x_1$ and $x_2$ is very similar; therefore the state $x_2$ is not shown.

fig. 3: state x1 a) without the presence of the noise, no barrier function, b) with noise added, no barrier function
fig. 4: state x1 a) without the presence of the noise, with barrier function, b) with noise and barrier function added
fig. 5: state space a) no barrier active, b) with the barrier active
fig. 6: state x1, directly controlled by the upper level

8 remarks

if the penalty on the tracking error is too great ($q_i = 1000$), certain problems with satisfying the boundary conditions occur – these conditions are not satisfied even after a great number of iterations. using such an inaccurate approximation of the optimal trajectory significantly decreases the quality of the tracking. other problems occur if the barrier function is too steep – the computations often end with an error message. it is necessary to consider the huge time consumption in the upper level. this is due to the feedforward computation: at this stage a differential equation is solved, and the optimal trajectory $x$ is on the right-hand side of this equation. this results in the need to call the femlab function postinterp whenever the right-hand side is evaluated. the postprocessing seems to be rather time-consuming, and in this case its time consumption considerably exceeds the time necessary for solving the boundary-value problem. this imbalance is strengthened by the fact that the optimal trajectory need not be evaluated with high precision.

9 conclusion

we have designed and simulated a hierarchical controller where the optimization is based on the solution of a boundary value problem. this problem is solved in femlab 2.1. the solution of this problem represents the reference trajectory for the lower level of the hierarchical controller.

acknowledgment

this work was partially supported by research program no j04/98:212300013 'decision making and control for manufacturing' of the czech technical university in prague (sponsored by the ministry of education, youth and sports of the czech republic), and by grant no 102/01/1347 of the grant agency of the czech republic.

references

[1] bertsekas, d. p.: dynamic programming and optimal control. belmont: athena scientific, 1995.
[2] bryson, a. e., ho, y.: applied optimal control. new york: j. wiley, 1975.
[3] femlab reference manual v. 2.0. stockholm: comsol ab, 2000.
[4] jamshidi, m.: large scale systems: modeling, control and fuzzy logic. new jersey: prentice hall, 1997.
[5] maciejowski, j. m.: predictive control with constraints. harlow: prentice hall, 2002.
[6] rehák, b.: "design of a hierarchical controller using femlab." in: proceedings of the 22nd iasted conference on modelling, identification and control (editor: m. h. hamza), anaheim: acta press, p. 543–547.
[7] singh, m. g.: decentralised control. amsterdam: north holland, 1981.
[8] trave, l., titli, a., tarras, a.: large scale systems: decentralization, structure constraints and fixed modes. berlin: springer, 1989.
[9] vincent, t. l., grantham, j.: nonlinear and optimal control systems. new york: j. wiley, 1997.

ing. mgr. branislav rehák, phone: +420 2 2435 7336, fax: +420 2 2491 864, e-mail: rehakb@control.felk.cvut.cz
department of control engineering, czech technical university in prague, faculty of electrical engineering, technická 2, 166 27 prague 6, czech republic

measurement of solar cell parameters with dark forward i-v characteristics

j. salinger

the grade of a solar cell depends mainly on the quality of the starting material. during the production of this material, many impurities are left in the bulk material and form defect levels in the band-gap, which act as generation-recombination centers or charge carrier traps. these levels influence the efficiency of solar cells. therefore knowledge of the parameters of these levels, e.g., energy position, capture cross section and concentration, is very useful for solar cell engineering. in this paper emphasis is placed on a simple and fast method for obtaining these parameters, namely measurements of dark characteristics. preliminary results are introduced, together with the difficulties and limits of this method.

keywords: solar cell, lattice imperfections, lifetime, dark i-v characteristics.

1 solar cells

a solar cell, or photovoltaic cell, is a semiconductor device consisting of a large-area p-n junction diode which, in the presence of sunlight, is capable of generating usable electrical energy. this conversion is called the photovoltaic effect. when light strikes the p-n junction of a semiconductor, the absorbed photon energy releases an electron from the p-type region and moves it to the n-type region, creating a hole in the valence band and producing a current. the main criteria for the selected solar cell are efficiency and cost; these define the performance and availability, and they can vary greatly. this paper deals with problems of efficiency influenced by imperfections of crystal lattices, and with the applicability of the basic diagnostic method used for determining such imperfections. free charge carriers generated by the impacting light (known as excess carriers) move in all directions from their place of origin. an important quantity defining the range of a generated charge carrier is its lifetime, i.e., the time that passes before an electron meets a hole and recombines. however, when electrons and holes reach the boundary of the p-n junction, they are rapidly swept by the electric field of this junction, either to the p-side (in the case of holes) or to the n-side (in the case of electrons), generating voltage on the outer electrodes. the basic means of lowering carrier lifetimes are lattice vibrations (phonons) and impurities or, generally speaking, lattice imperfections. as lattice vibrations depend only on the crystal structure and the temperature, which are fixed for a specific semiconductor material and usage, we will be concerned here with lattice impurities and their effect on the lifetime of charge carriers.

2 problems of carrier trapping and recombination

every physical system attempts to achieve so-called thermal equilibrium as soon as possible, as do excess carriers.
in the case of silicon, which is widely used in photovoltaic applications, the way to achieve thermal equilibrium is either by auger recombination or by the capture of free charge carriers on energy levels that lie in the band that separates the conduction and valence bands (the forbidden band, or the band gap). these energy levels can originate either from an imperfection of the crystal lattice (e.g., a dislocation), from foreign atoms in some positions of the lattice, or even from complex crystal defects induced, for example, by radiation. these imperfections strongly influence the electron and hole transport through the bulk of the semiconductor device. they can act either as a trap, where an electron or a hole is trapped on this level for a certain time, or as a generation-recombination (g-r) center, where one charge carrier annihilates with a carrier of the opposite charge.

3 dark forward i-v characteristics of solar cells

when conducted at different temperatures, this method provides many parameters of a solar cell, e.g., the temperature dependence of the shunt resistance and the diode factor, the energy and concentration of the dominant recombination center, and the lifetime of the charge carriers. the measuring apparatus works with a current source with a range of 0 to 100 ma. the temperature range is from approximately 20 °c to 150 °c. the process itself is controlled by a computer via a serial bus rs232, and the data (values of voltage and current) are stored on the hard drive. the dark current of a forward biased solar cell $i_{df}$ can be expressed by the formula

$i_{df} = a j_{01} \left[ \exp\left( \frac{e (v - r_s i)}{n_1 k t} \right) - 1 \right] + a j_{02} \left[ \exp\left( \frac{e (v - r_s i)}{n_2 k t} \right) - 1 \right] + \frac{v - r_s i}{r_p}, \qquad (1)$

where $a$ stands for the area of the sample, $j_{01}$ is the diffusion current density, $j_{02}$ is the generation-recombination current density, $n_1$ and $n_2$ are the diode factors, $r_s$ is the series resistance, $r_p$ is the shunt resistance, $e$ is the elementary charge, and $k$ is the boltzmann constant. the series resistance of large-area solar cells is small and can be neglected. the plotted graph of the i-v characteristics is divided into three regions:

1. in the range 0–40 mv, the influence of the shunt resistance dominates and can be calculated; the current through the cell can be expressed by

$i_{df} = \frac{v}{r_p} . \qquad (2)$
above 300 mv, the first term in the expression of the total current (diffusion compound) is dominant, so, by the curve fitting method, the diffusion saturation current and the diffusion diode factor can be extracted. 4. extracting the cell parameters the generation-recombination current density j02 can be expressed by: j e n di 02 � �sc . (4) this means that the density j02 is inversely proportional to the lifetime of the charge carriers in the space charge region, for which in the case of a single trapping level the following formula can be obtained: � � �sc p0 n0� � �� � � � � � �� � � � � exp exp w w kt w w kt t i t i . (5) here, �p0 and �n0 stand for the lifetime of the minority carriers in an n-type semiconductor or a p-type semi-conductor, respectively, wt is the energy level of the g-r center (or trap), and wi is the intrinsic fermi level [1]. with the knowledge of these lifetimes and the capture cross sections the g-r center concentration nt can also be extracted. thus, to obtain the maximum parameters of the solar cell band gap structure, we are interested in the second region of the plotted graph. problems can arise from the fact that only a single recombination level can be extracted from this measurement. if there are, for example, two deep levels of approximately the same concentration, however, the standard extracting technique will lead to incorrect values of deep level energy. to evaluate a large number of parameters, curve fitting is used. while linear dependence can be fitted without problems, fitting exponential dependence can be difficult, and the results may vary strongly with different initial conditions. these complex conditions may cause errors when simple fitting techniques are applied to them. for example, non-linear dependence of the diode factor on temperature is observed. 5 parameters of g-r centers from i-v measurement monocrystalline silicon samples fabricated by the czochralsky grown method were used in the measurements. the dimensions were 102×102 mm, and each sample represented one batch. figs. 1, 2 and 3 show the temperature dependence of shunt resistance rp, diode factor �2 and g-r current i02, respectively. for maximum efficiency of a solar cell, the highest shunt resistance is needed. the measured resistances are shown in fig. 1. the values of the shunt resistance at the highest temperatures were always lower than at room temperature (rt), but some differences were found: 26 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 46 no. 4/2006 p fig: 1: temperature dependence of shunt resistance rp of samples #1942-02 (�), #20 (�) and le2 (�) fig. 2: temperature dependence of diode factor �2 of the samples #1942-02, #20 and le2 fig. 3: temperature dependence of g-r current i02 of samples #1942-02, #20 and le2 1. variation of more than one order between samples (e.g. #1942-02 and le2) at rt. 2. samples such as #1942-02 showed a small initial increase in the shunt resistance before a final decrease. diode factor �2 showed a decrease with temperature growth (fig. 2). in some cases, however, this dependence was not very clearly confirmed (e.g., sample #20). diode factors �2 have to be extracted, so that the g-r current density i02 can be evaluated more precisely. the hyperbolic logarithm of the g-r current density as a function of temperature is shown in fig. 3. the dependence is almost linear. for the lowest trapping/generation effect, this dependence should be weak. this is shown in fig. 
this is shown in fig. 3 for sample le2, thus confirming the quality already indicated by its shunt resistance measurement. the extracted parameters and the obtained dependences were used for evaluating the energy levels of the g-r centers, their densities, and the lifetime of the excess charge carriers in the space charge region. these parameters are shown in tables 1 and 2 for each sample.

table 1: possible energy levels found in selected samples

sample#    Δw_t [ev]   w_t1 [ev]   w_t2 [ev]
10         0.438       0.122       0.998
20         0.373       0.187       0.933
22         0.341       0.219       0.901
1942-02    0.351       0.209       0.911
1940-24    0.360       0.200       0.920
1x-0883    0.201       0.359       0.761
le1        0.248       0.312       0.808
le2        0.212       0.348       0.772
le5        0.224       0.336       0.784

table 2: lifetime of minority charge carriers in the space charge region and the solar cell efficiencies of the measured samples

sample#    τ_sc (295 k) [s]   efficiency [%]
10         2.05×10⁻⁸          14.29
20         8.00×10⁻⁸          13.43
22         1.83×10⁻⁸          14.11
1942-02    1.53×10⁻⁸          14.27
1940-24    1.63×10⁻⁸          14.15
1x-0883    2.90×10⁻⁸          7.44
le1        1.50×10⁻⁸          14.58
le2        1.10×10⁻⁸          14.57
le5        1.50×10⁻⁸          15.03

comparing the values of the deep energy levels with the measured efficiencies, we can evaluate the influence of these levels. the samples of the le# series have almost the same levels and lifetimes, and their efficiencies are also similar. the same can be said about samples 1942-02 and 1940-24: although the deep energy levels in these two batches are different, their influences are nearly identical. on the other hand, samples #10, 20 and 22 show some inhomogeneities in the deep level energy and in the lifetime in the space-charge region. in sample 1x-0883 a very deep level was found and the lowest efficiency was measured. other parameters of the measured solar cells, e.g., the surface recombination velocity and the series resistance, need to be determined for a more precise evaluation of the influence of the deep levels on efficiency.

as to the origin of the deep levels, the most probable lattice imperfections creating deep levels in the range from 0.2 to 0.37 ev below the conduction band (or above the valence band) are connected with boron, carbon and oxygen atoms, e.g. bics at 0.29 ev, bi at 0.37 ev and bioi at 0.20 ev [2], each level lying either above the valence band or below the conduction band. the deep levels found in samples #10, 20 and 22 may have been caused by some special treatment of these samples, e.g. electron bombardment, which induces vacancy-related defects such as vo at 0.18 ev, or carbon-related defects such as cics at 0.11 ev [1, 3]. of course, the electrical behaviour of each sample may also have another explanation, namely the presence of two or more deep energy levels within the band gap. this simple technique cannot give the exact parameters of such centers, for the reasons mentioned above.

6 conclusion

a proper characterization of the charge carrier lifetime and the extraction of the parameters of g-r centers is very useful for solar cell utilization and will play a key role in their future development. although the method of dark forward characteristics has some limitations, as mentioned in the text, it is very fast, non-destructive and simple, and it can be used together with other methods as a diagnostic tool in the development and production of solar cells.

7 acknowledgments

the research described in the paper was supervised by prof. v. benda, fee ctu in prague, department of electrical technology.

references

[1] tran hung quan: diagnostics of large area crystalline solar cells. 2003.
[2] adey, j., jones, r., briddon, p. r., goss, j. p.: optical and electrical activity of boron interstitial defects in si. j. phys.: condens. matter, vol. 15 (2003), s2851–s2858.
[3] schroder, d. k.: semiconductor material and device characterization. john wiley & sons, inc., new york, 1990.

ing. jan salinger
e-mail: salinj1@feld.cvut.cz
department of electrical technology
czech technical university in prague
faculty of electrical engineering
technická 2
166 27 praha, czech republic
quality control in automated manufacturing processes – combined features for image processing

b. kuhlenkötter, x. zhang, c. krewet

in production processes the use of image processing systems is widespread, and hardware solutions and cameras are available for nearly every application. one important challenge of image processing systems is the development and selection of appropriate algorithms and software solutions in order to realise ambitious quality control for production processes. this article characterises the development of innovative software by combining features for an automatic defect classification on product surfaces. the artificial intelligence method support vector machine (svm) is used to execute the classification task according to the combined features. this software is one crucial element for the automation of a manually operated production process.

keywords: quality control, image processing, defect classification, support vector machine.

1 introduction

vision and image processing systems are widely employed in current manufacturing processes. these kinds of systems provide measures to inspect and control the manufacturing process at the production site. in addition, vision systems make automatic inspection and process examination possible with the help of "intelligent" software components. the project on which this paper is based was initiated in such a circumstance, in order to assist and simplify the quality control of free-form surface manufacturing, especially of shiny water taps in the sanitary industry.

the first stage in the fabrication of water taps is the casting of the rough part; the material mainly used is brass. after the casting process, some other machining processes, like drilling, milling and threading, are carried out. then the water tap is ground and polished sequentially in order to obtain high surface quality. finally, the end product is finished by electroplating. it is crucial to find potential defects existing on the workpiece surface after grinding and polishing, before the final electroplating. if a defect is found on a part after electroplating, the manufacturer has to discard this part or remove the electroplated coating. removing the coating is a very expensive process because it is harmful to the environment. therefore, cost will be saved if the defects can be detected at an early stage. in addition, it is always useful to know which type of defect has been identified. once the type of a defect is known, the decision can be made whether the part is discarded or can be retouched with proper compensatory engineering processes, and whether an adjustment of a previous machining process is necessary.

so far, the tasks of defect inspection and categorization have been performed by human operators in a traditional "see and evaluate" way. it is a labour-intensive and therefore also cost-intensive job. this process needs to be automated in order to improve the efficiency of defect inspection, to relieve workers from an unpleasant working environment, and finally to reduce the overall manufacturing cost, especially in countries where wages are high.

to automate the defect inspection and classification, a vision system is installed and integrated into the manufacturing chain. in our project, the inspection process takes place after the end of the grinding and polishing processes. if no defects are found on the surface, the workpiece is accepted for the next processing step. otherwise, it is classified automatically in order to determine whether a removal of the defect is possible or the workpiece should be rejected directly. the vision system consists of a carrier, a camera system, a lighting system, other accessories and the software.
the system hardware is responsible for providing a constant lighting environment and obtaining the digital images under this constant circumstance. the software provides the solution for examining the images from the camera system, locating and classifying the defects on the workpiece surfaces. the challenge in our project is to characterise and determine the type of defects from predefined categories, as well as additional foulings (called pseudo-defects in our project) on the surface. to this end, this paper presents measures and considerations, regarding both theoretical and practical aspects, to efficiently classify all defects and to separate real-defects from pseudo-defects.

2 automatic classification system

2.1 defects definition

the defects have been divided into 15 classes according to their physical attributes and the consequent handling operations. fig. 1 gives samples for all defect categories. all defects can be split up into two major categories: real-defects and pseudo-defects, the latter being actual foulings, like dust or oil, on the surface, or shades caused by uneven lighting on the free-form surfaces. the first ten defects in fig. 1 are real-defects and the last five are pseudo-defects. the pseudo-defects cause no quality problem, while the real-defects must be reliably picked out, because they not only spoil the aesthetic aspect but sometimes also result in malfunction of the final products. the vision system considers both kinds as failures at first and distinguishes them in the classification phase. therefore, two indices have to be taken into account: the overall right classification ratio and the wrong classification ratio between the real-defects and the pseudo-defects. in addition, the wrong classification ratio from real- to pseudo-defects is more crucial than that from pseudo- to real-defects. in the former case, a product with real-defects will be accepted as qualified and put to the next processing step, electroplating. this product cannot be sold to the customer after electroplating, and all manufacturing costs, including the expensive electroplating, are wasted. even if the part is remediable, the electroplated coating has to be removed first, and this coating removal process is also expensive. in the latter case, the product just runs through one additional round of grinding or polishing, although this would not be necessary.
2.2 classification framework

four basic steps are necessary for the classification to work: defect location, defect segmentation, feature extraction and classification. these four steps are executed sequentially in the system. fig. 2 shows the framework of the whole system. at the beginning, the system detects and locates the flaw in the greyscale image obtained from the camera system. before encoding the bitmap into meaningful feature values, the area of the defect in the image is pre-processed and segmented. in this phase we get defect image units that are ready for the subsequent feature extraction. this paper will mainly focus on the latter two parts: feature extraction and classification.

the feature extraction is the most important part of the system. it defines the rules to describe and express the defects inside an image in a form that the classifier can understand and utilize to distinguish one class from another. the features are not limited to shape features, but can also be texture features and statistical features of the segmented images. after that, the features are applied as training data to the classifier. many artificial intelligence techniques have the capability of multi-class classification, e.g. the multi-layer perceptron (mlp) [1], the support vector machine (svm) [2], learning vector quantization (lvq), etc. the svm is used in this paper.

fig. 1: samples of the 15 defect categories: mechanical damage, casting peel, pore and embedding solids, fine lined mark, porosity, crack, swirl, vapour lock, swirl fragment, turning defect, fluff, dust, burned residues, grease residues, polishing shade

fig. 2: the framework of the automatic identification and classification system (defect location → defect segmentation → feature extraction, features 1 … n → classifier)

3 feature extraction and classification

3.1 review of previous work

the theoretical methodology and practical considerations of this automatic classification system were presented in our previous article [3]. the article discussed various feature extraction methods and their mathematical bases, including shape features, features obtained through different filter banks, and statistical features based on the co-occurrence matrix [4]. in general, feature extraction digitizes the defect images in a way that enlarges the distinctions among categories and discards the similarities at the same time. the filter bank technology is a widely used method for pattern recognition. a group of filters, called a filter bank, is employed, and each filter in the bank captures intensity variations over a narrow range of frequency or orientation, specifying the regularity, coarseness and directionality of the original image [5]. the energy of each filtered image is used as a feature. the statistical features are, in comparison, extracted from the co-occurrence matrix of the original image. fig. 3 and fig. 4 illustrate images filtered by a gabor filter bank [6] and co-occurrence matrices, respectively.

fig. 3: the gabor filter bank. the left image is the original image of a defect "vapour lock" and the images on the right side are the filtered images at different central frequencies; the orientation is 0°

fig. 4: the co-occurrence matrices of a pore, a vapour lock, a polishing shade and a casting peel (from left to right)

in addition, the classification structure was also introduced to combine these various features into a complete system, see fig. 5.

fig. 5: the structure of the classification system (566 training pairs, 1820 testing pairs; shape, laws, 3×3 dct, 5×5 dct, gabor and statistical features with 8, 25, 9, 25, 16 and 15 features respectively; feature selection followed by m = k(k−1)/2 pairwise svms with voting over the classes c1 … c15)

first, the classification task was conducted using a single kind of features, i.e. only one switch is closed in fig. 5. the experimental results are shown in table 1.

table 1: training and testing classification ratios of the various features

feature        number of features   training ratio (%)   testing ratio (%)
shape          8                    46.6                 61.8
laws           25                   88.7                 70.1
dct 3×3        9                    98.9                 62.3
dct 5×5        25                   97.9                 64.9
gabor          16                   98.9                 74.4
statistical    15                   76.5                 72.1

it can be concluded from table 1 that the shape feature is not suitable for this application. there are two reasons for that.
the first one is that there are no clear differences in shape between some defects, for example fluff and pore, or fine lined mark and polishing shade. the second one is that the geometric information of some kinds of defects cannot be exactly defined; for example, it is not easy to describe and differentiate the shape of a swirl and a polishing shade. the pattern information is more effective than the simple geometric information in this sense. the gabor features excel all other features, obtaining a 74.4 % overall classification ratio. the results of the statistical parameters and the laws filters are also very impressive and competitive.

in the second step, we combined features of different technologies, i.e. more than one switch in fig. 5 is closed. by combining gabor and statistical features, an 82.3 % classification ratio can be obtained. the real-to-pseudo wrong classification ratio of this model is about 2.5 %; the wrong classification ratio from pseudo- to real-defects is about 6.6 %. table 2 shows the combined matrix of the defect classification using gabor and statistical features. one item in the combined matrix denotes the number of defects that are classified from one class (indicated by the row) to another class (indicated by the column). for example, we can conclude from the combined matrix in table 2 that 545 of 573 pores (label 3) are correctly classified, 4 of them are recognized as vapour lock (label 8) and 11 are identified as fluff (label 101). in comparison, 26 defects that are recognized as pore (label 3) are actually fluffs (label 101). the combined matrix of an idealistic classification system has non-zero values only on the diagonal, meaning that no defects are wrongly classified. the last row of table 2 gives the reliability of the prediction: for instance, if the reliability of the prediction of pore (label 3) is 73.0 %, a prediction of pore is correct with a probability of 73.0 %. the last column is the right classification ratio of each class. refer to the original paper [3] for more details about the feature extraction technologies, the classifier design, the numerical results and the analyses.

table 2: classification result obtained by combining gabor and statistical features (rows: true class, columns: predicted class)

labels    1    2    3   4   5   6   7   8   9   10  101  102  103  104  105      n   ratio (%)
1        19    0    4   0   0   0   0   1   0    0    0    0    0    0    2     26   73.1
2         0   21    7   0   0   0   0   0   0    0    0    0    0    0    2     30   70.0
3         0    1  545   1   1   0   0   4   0    0   11    4    3    0    3    573   95.1
4         0    0    2  40   0   0   0   2   0    1    0    0    0    0    3     48   83.3
5         0    0   19   0  37   0   0   1   0    0    0    0    0    0    0     57   64.9
6         0    0    8   1   1  17   0   0   0    0    3    0    0    0    2     32   53.1
7         0    0   20   0   0   0  29   0   0    1    3    0    1    0    3     57   50.9
8         0    0   26   0   0   0   0  58   0    0    0    0    1    0    2     87   66.7
9         0    0    3   0   0   0   0   0   8    0    0    0    0    0    0     11   72.7
10        0    0    3   0   0   0   0   0   0  178    0    0    0    0    3    184   96.7
101       0    0   26   1   0   0   0   0   0    0   62    0    1    0   11    101   61.4
102       0    0   18   0   0   0   0   0   0    0    0   24    0    0    7     49   49.0
103       0    0   29   0   0   0   0   4   0    1    1    0   38    0    6     79   48.1
104       0    0    6   0   0   0   0   0   0    0    0    0    1   29    5     41   70.7
105       0    0   31   4   0   0   0   0   0    1    5    0    7    4  393    445   88.3
reliability (%): 100, 95.5, 73.0, 85.1, 94.9, 100, 100, 82.9, 100, 97.8, 72.9, 85.7, 73.1, 87.9, 88.9; overall ratio 82.3
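all summary quantities of table 2 follow directly from the matrix itself. as a cross-check (not part of the authors' toolchain), the following numpy sketch recomputes the per-class ratios, the prediction reliabilities, the overall ratio and the real-/pseudo-defect confusion rates from the matrix transcribed above; labels 1–10 are the real defects, 101–105 the pseudo-defects, as stated in section 2.1.

import numpy as np

labels = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 101, 102, 103, 104, 105])
cm = np.array([  # rows: true class, columns: predicted class (table 2)
    [19, 0,   4, 0, 0, 0, 0, 1, 0,   0,  0, 0, 0, 0,   2],
    [ 0, 21,  7, 0, 0, 0, 0, 0, 0,   0,  0, 0, 0, 0,   2],
    [ 0, 1, 545, 1, 1, 0, 0, 4, 0,   0, 11, 4, 3, 0,   3],
    [ 0, 0,   2, 40, 0, 0, 0, 2, 0,  1,  0, 0, 0, 0,   3],
    [ 0, 0,  19, 0, 37, 0, 0, 1, 0,  0,  0, 0, 0, 0,   0],
    [ 0, 0,   8, 1, 1, 17, 0, 0, 0,  0,  3, 0, 0, 0,   2],
    [ 0, 0,  20, 0, 0, 0, 29, 0, 0,  1,  3, 0, 1, 0,   3],
    [ 0, 0,  26, 0, 0, 0, 0, 58, 0,  0,  0, 0, 1, 0,   2],
    [ 0, 0,   3, 0, 0, 0, 0, 0, 8,   0,  0, 0, 0, 0,   0],
    [ 0, 0,   3, 0, 0, 0, 0, 0, 0, 178,  0, 0, 0, 0,   3],
    [ 0, 0,  26, 1, 0, 0, 0, 0, 0,   0, 62, 0, 1, 0,  11],
    [ 0, 0,  18, 0, 0, 0, 0, 0, 0,   0,  0, 24, 0, 0,  7],
    [ 0, 0,  29, 0, 0, 0, 0, 4, 0,   1,  1, 0, 38, 0,  6],
    [ 0, 0,   6, 0, 0, 0, 0, 0, 0,   0,  0, 0, 1, 29,  5],
    [ 0, 0,  31, 4, 0, 0, 0, 0, 0,   1,  5, 0, 7, 4, 393],
])

class_ratio = np.diag(cm) / cm.sum(axis=1) * 100   # last column of table 2
reliability = np.diag(cm) / cm.sum(axis=0) * 100   # last row of table 2
overall = np.trace(cm) / cm.sum() * 100             # 82.3 %

is_real = labels <= 10                               # labels 1-10 real, 101-105 pseudo
real_to_pseudo = cm[np.ix_(is_real, ~is_real)].sum() / cm.sum() * 100  # ~2.5 %
pseudo_to_real = cm[np.ix_(~is_real, is_real)].sum() / cm.sum() * 100  # ~6.6 %
print(f"overall {overall:.1f} %, real->pseudo {real_to_pseudo:.1f} %, "
      f"pseudo->real {pseudo_to_real:.1f} %")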
3.2 further work

from table 2, we can see that the four most common defects are pore (label 3), polishing shade (label 105), turning defect (label 10) and fluff (label 101). the pore and the polishing shade constitute the majority of all defects. some defects, e.g. crack and casting peel, have only a few samples in the defect database. based on this database, the classifier was inclined to classify the major categories correctly and to neglect the wrong classification of the minor categories, because the right classification of the major categories contributes much more to the overall classification ratio than that of the minor categories. the samples of the database should be enriched in order to enable the classifier to identify the minority defects as well.

another fact is that some kinds of defects are no longer part of the new database, due to improvements of both the manufacturing process and the vision system. for example, mechanical damage occurs rarely in the new manufacturing process in our project, and fluffs are nearly eliminated when the workpieces are blown with high-pressure air before they are fed to the vision system. therefore, the number of classification target categories is reduced. a new database of defects was built by collecting samples from the current manufacturing process. in this new database, the manual classification of pseudo-defects was reorganized with more consideration of their appearance.

besides the features shown in table 1, we additionally use the average and the standard deviation of the greyscale value of the defect image as greyscale features. the greyscale features should be localized, considering the different sizes of the various defects. the greyscale information is therefore obtained in four areas of the defect image, see fig. 6. in this case, the number of greyscale features is eight, two for each area.

fig. 6: greyscale information in four areas

table 3 shows the classification results obtained by combining gabor features, statistical features and the additional greyscale features.

table 3: classification result obtained by combining gabor features, statistical features and greyscale information; 1572 defects were used for training and 3868 for testing, of which 3138 were classified correctly (rows: true class, columns: predicted class)

labels    2    3    4   6    7    8   10  101  102  103  104  105      n   ratio (%)
2        95    4    0   0    4    0    0    0    1    0    0    4    108   88.0
3         0  453    8   0    0   20    0   42   10    1    2    4    540   83.9
4         0   20  217   4    4    8    0    0    2    1    0    0    256   84.8
6         0    0    2  87    2    3    2    0    0    3    1    0    100   87.0
7         4    7    6   0  137    0    0    1    4    1    0    0    160   85.6
8         0    1    7   0   11  197    0    0    8    0    8    0    232   84.9
10        0    0    0   0    0    0  568    0    0    0    0    8    576   98.6
101       2   23    0   0    0    0    0  129   10    0    0    0    164   78.7
102       7   57    9  15   23    9    3   14  460   63   26   70    756   60.8
103       0    2    1   2    0    0   10    0   16  268    0    1    300   89.3
104       4    0    0   0    3    8    0    0    7    5  254   19    300   84.7
105       1    3    4   0    7    0    3    6   61    5   13  273    376   72.6
reliability (%): 84.1, 79.5, 85.4, 80.6, 71.7, 80.4, 96.9, 67.2, 79.4, 77.2, 83.6, 72.0; overall ratio 81.1

the overall classification ratio is 81.1 %, even though all defect types are balanced in the batch, compared to the result based on the old database. the real-to-pseudo wrong classification ratio is about 2.6 %, while the pseudo-to-real ratio is about 5.1 %. trained human operators achieve a similar rate.
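since fig. 6 is not reproduced here, the exact layout of the four areas is an assumption; the sketch below uses nested, centred windows covering 100/75/50/25 % of the defect image and produces the eight localized greyscale features (mean and standard deviation per area) described above.

import numpy as np

def greyscale_features(img):
    # mean and std of the grey value in four nested, centred areas -> 8 features;
    # the window fractions are assumed, the actual areas of fig. 6 may differ
    h, w = img.shape
    feats = []
    for frac in (1.0, 0.75, 0.5, 0.25):
        dh, dw = int(h * (1 - frac) / 2), int(w * (1 - frac) / 2)
        area = img[dh:h - dh, dw:w - dw]
        feats += [area.mean(), area.std()]
    return np.asarray(feats)

patch = np.random.default_rng(1).integers(0, 256, (64, 64)).astype(float)  # dummy defect image
print(greyscale_features(patch).round(1))  # eight localized greyscale features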
4 software implementation

fig. 7 shows the architecture of the implemented software system. the software system provides a platform to manage the defect data and to evaluate the feature extraction technologies and the classifier design. the defect database contains information on all collected defects, including the manual classification results, the defect images and other necessary entries; each defect corresponds to one record in the database. the feature extraction reads the information from the database and calculates the features of the defect. the features and the manual classification results are used for training and testing the svm models. the models can then be used by the classifier to perform the identification task. the feature extraction and the classifier design comprise the core component of the software, which is called opticlass.

fig. 7: software architecture (defect database management and a graphical user interface (gui) around the opticlass core: feature extraction, model training and the classifier)

the software system has the capability of self-learning. if the manufacturing process or the vision system is altered, the only part that has to be changed is the classification model. no modification of the software code is needed; it is only necessary to build new models using the newly collected defect database. first, the training and testing files are generated by the software system based on the new database. after that, the svm models are produced from the training and testing files. finally, the newly created models replace the old models in use.

opticlass is a relatively independent software component. it can be one part of the off-line program, and it can also be directly integrated into the software of the vision system to classify occurring defects online. moreover, opticlass is extendable. with the enrichment of feature extraction methods and various implementations of classifiers, opticlass is not strictly limited to our project of classifying defects on water taps in the sanitary industry, but is also applicable to other digital-image-based classification tasks. fig. 8 shows the user interface of the software system.

fig. 8: user interface of the software system
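the classification stage of fig. 5 combines m = k(k−1)/2 pairwise svms by voting. the following minimal sketch of that one-vs-one scheme uses scikit-learn's SVC as a stand-in for the svm implementation of the paper; the toy features, labels, kernel choice and class centres are invented only for the demonstration.

import numpy as np
from itertools import combinations
from collections import Counter
from sklearn.svm import SVC

def train_ovo(x, y):
    # one binary svm per class pair: m = k*(k-1)/2 models, as in fig. 5
    return {(a, b): SVC(kernel="rbf").fit(x[(y == a) | (y == b)],
                                          y[(y == a) | (y == b)])
            for a, b in combinations(np.unique(y), 2)}

def predict_ovo(models, x):
    # every pairwise svm casts one vote per sample; the majority wins
    preds = []
    for sample in x:
        votes = Counter(int(m.predict(sample.reshape(1, -1))[0]) for m in models.values())
        preds.append(votes.most_common(1)[0][0])
    return np.array(preds)

# toy demo: three classes, two features per defect (e.g. one gabor energy, one grey mean)
rng = np.random.default_rng(0)
x = np.vstack([rng.normal(c, 0.5, (20, 2)) for c in (0.0, 2.0, 4.0)])
y = np.repeat([3, 10, 105], 20)          # pore, turning defect, polishing shade
models = train_ovo(x, y)                 # 3 classes -> 3 pairwise svms
print(len(models), (predict_ovo(models, x) == y).mean())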
5 summary

this article introduces a vision system that inspects and identifies defects on water taps in the sanitary industry. with this vision system, the quality control process is fully automated without intervention of human operators. the vision system is divided into two parts: the hardware part and the software part. the hardware part cleans the product surfaces and then takes photos under a constant illuminating condition. the software part does the image processing and realizes the automatic identification. this paper presents the two major software components, feature extraction and classifier design. a large range of methods is introduced to represent the defect images. it is possible to obtain an overall classification ratio of about 81.1 % for 13 predefined defect classes by combining the gabor filter features, the statistical features and the greyscale features.

the software structure and implementation are also described. the collected defects are stored in a uniform database; therefore, the defect data can be easily handled by the software. the core components, feature extraction and classifier design, are implemented independently. they serve not only our offline simulation software platform, but can also be integrated into the vision system and work online. the modularity of the software ensures that the vision system has the capability of self-learning and extendibility to other applications.

acknowledgment

this research and development project is funded by the "bundesministerium für bildung und forschung" (bmbf) within the framework of "research for the production of tomorrow" and supervised by the project supporter of the bmbf for production and manufacturing technologies (pft), research centre in karlsruhe.

references

[1] haykin, s.: neural networks – a comprehensive foundation. macmillan college publishing co., 1995.
[2] vapnik, v.: the nature of statistical learning theory. new york: springer, 2000.
[3] zhang, x., krewet, c., kuhlenkötter, b.: automatic classification of defects on the product surface in grinding and polishing. international journal of machine tools and manufacture, vol. 46 (2006), no. 1, p. 59–69.
[4] haralick, r. m.: statistical and structural approaches to texture. proceedings of the ieee, vol. 67 (1979), p. 786–804.
[5] laws, k.: textured image segmentation. phd dissertation, university of southern california, 1980.
[6] daugman, j. g.: complete discrete 2d gabor transformation by neural networks for image analysis and compression. ieee transactions on acoustics, speech and signal processing, vol. 36 (1988), no. 7, p. 1169–1179.

dr.-ing. bernd kuhlenkötter
phone: +490 231 755 5615
e-mail: bernd.kuhlenkoetter@udo.edu
m.-eng. xiang zhang
dipl.-ing. carsten krewet
industrial robotics and handling systems
robotics research institute
dortmund university
otto-hahn-strasse 8
44221 dortmund, germany

acta polytechnica 63(2):103–110, 2023
https://doi.org/10.14311/ap.2023.63.0103

derivation of entropy production in a fluid flow in a general curvilinear coordinate system

erik flídr
czech technical university in prague, faculty of mechanical engineering, technická 4, 160 00 prague 6 – dejvice, czech republic
correspondence: flidreri@gmail.com

abstract. the paper deals with the derivation of the entropy production in a fluid flow, performed in a general curvilinear coordinate system. the derivation of the entropy production is based on the laws of thermodynamics as well as on the balances of mass, momentum, and energy. a brief description of the differential geometry used in general curvilinear coordinates is presented as well, to define the notation. the application of this approach is then shown in the evaluation of the entropy production along the suction side of a blade, where the calculation was performed using available experimental data.

keywords: entropy production, fluid flow, general curvilinear coordinates, linear blade cascade, experimental data.
1. introduction

a general curvilinear description of a flow is useful for describing the flow in highly curved channels, e.g. turbine blade passages. it is better for understanding the processes that occur in the flow field, because the researcher can make some predictions about the flow without solving the equations at all: it can be deduced from the form of the equations which terms will affect the results the most and, further, if the geometry of the investigated problem is varied, how this geometry variation can affect the obtained data.

the derivation of the equations of a fluid flow in a general curvilinear coordinate system has been performed several times. for the first time, it was performed in [1], where curvilinear components of the velocity vector were introduced. a more detailed description can be found in the book [2]. the derivation of the momentum equation in general curvilinear coordinates, as well as a description of differential geometry with applications to physics, was introduced in [3]. there are, of course, papers focused on the application of this approach to numerical simulations, see e.g. [4] or [5]. these papers, however, consider only an incompressible fluid flow. the aim of this paper is to go a little bit further and not only derive the equations of the fluid flow, but to unite fluid mechanics and thermodynamics to obtain the entropy production in the flowing fluid in a general coordinate system. according to the author's knowledge, this result has not been published so far. the entropy production in a flowing fluid was recently derived in [6]; however, this derivation was performed in vector form.

2. basics of differential geometry

the definition of a tensor has to be mentioned first (vectors and scalars are included in this definition as tensors of the first and zeroth order, respectively). a tensor is a quantity that exists independently of whether some observer is present or not. a tensor is given by its components multiplied by basis vectors, and the sum of these components multiplied by the basis vectors has to be invariant under a basis transformation (equation (1)), i.e. tensors of the first order (vectors) obey the transformation rule (equation (2)):

$\mathbf{v} = v^i \mathbf{g}_i = \hat{v}^j \hat{\mathbf{g}}_j$ , (1)

$v^i = a^i_j \hat{v}^j$ , (2)

where $\mathbf{v}$ is the vector, $v^i$ are the components of the vector in the new basis, $\mathbf{g}_i$ are the transformed basis vectors, $\hat{\mathbf{g}}_j$ are the original basis vectors, $\hat{v}^j$ are the components of the vector $\mathbf{v}$ in the original basis and $a^i_j$ is the transformation matrix. the basis vectors $\mathbf{g}_i$ can be defined as (see any textbook on differential geometry, e.g. [3]):

$\mathbf{g}_i = \frac{\partial \mathbf{x}}{\partial \eta^i}$ , (3)

where $\eta^i$ are the coordinates and $\mathbf{x}$ is the position vector. similarly, the co-basis vectors can be defined as:

$\mathbf{g}^j = \frac{\partial \eta^j}{\partial \mathbf{x}}$ . (4)

the covariant and contravariant metric coefficients and the kronecker symbol can be obtained as:

$g_{ij} = \mathbf{g}_i \cdot \mathbf{g}_j$ , (5)

$g^{ij} = \mathbf{g}^i \cdot \mathbf{g}^j$ , (6)

$\mathbf{g}^i \cdot \mathbf{g}_j = g^{il} g_{lj} = \delta^i_j$ . (7)

now, the derivatives of tensors in general curvilinear coordinates have to be determined. the gradient of a vector function in general curvilinear coordinates has the form:

$\frac{\mathrm{d}\mathbf{v}}{\mathrm{d}\mathbf{x}} = \frac{\partial v^i}{\partial \eta^j} \mathbf{g}_i \mathbf{g}^j + v^i \frac{\partial \mathbf{g}_i}{\partial \eta^j} \mathbf{g}^j$ . (8)

the christoffel symbols of the second kind can be defined based on this relationship as:

$\Gamma^k_{ij} = \frac{\partial \mathbf{g}_i}{\partial \eta^j} \cdot \mathbf{g}^k = \frac{1}{2} g^{kl} \left( g_{li,j} + g_{lj,i} - g_{ij,l} \right)$ . (9)
substituting this result into equation (8) gives:

$\frac{\mathrm{d}\mathbf{v}}{\mathrm{d}\mathbf{x}} = \frac{\partial v^i}{\partial \eta^j} \mathbf{g}_i \mathbf{g}^j + v^i \Gamma^k_{ij}\, \mathbf{g}_k \mathbf{g}^j$ . (10)

finally, the covariant and contravariant derivatives of a vector and of a second order tensor, which will be frequently used in the following text, are defined:

$v^i_{;j} = \frac{\partial v^i}{\partial \eta^j} + v^k \Gamma^i_{kj}$ , (11)

$v_{i;j} = \frac{\partial v_i}{\partial \eta^j} - v_k \Gamma^k_{ij}$ , (12)

$T^{ij}_{;k} = T^{ij}_{,k} + T^{lj} \Gamma^i_{lk} + T^{il} \Gamma^j_{lk}$ , (13)

$T_{ij;k} = T_{ij,k} - T_{lj} \Gamma^l_{ik} - T_{il} \Gamma^l_{jk}$ , (14)

$T^i_{j;k} = T^i_{j,k} + T^l_j \Gamma^i_{lk} - T^i_l \Gamma^l_{jk}$ , (15)

where the partial derivatives are denoted as:

$v_{i,j} = \frac{\partial v_i}{\partial \eta^j}$ . (16)

all of the operations, such as divergence, gradient, curl etc., can be obtained from these relationships, see again e.g. [3].

3. laws of thermodynamics

thermodynamics is a scientific discipline concerning the transformation of thermal energy into its other forms. it is based on three laws that were obtained thanks to a combination of theoretical and experimental research.

3.1. the first law of thermodynamics

the formulation of the first law was described in detail by kvasnica [7]. if the system is free of chemical reactions, the first law can be written in the form:

$\mathrm{d}e_{in} = \mathrm{d}q + \mathrm{d}w$ , (17)

where $e_{in}$ is the internal energy, $q$ is the heat and $w$ is the work. in other words, the heat given to the system can be transformed into the system's internal energy and into work. note that the number of particles in the system has to stay the same throughout the process. it must also be noted that the heat as well as the work are not total differentials, and they are therefore dependent on the integration path between the starting and the ending state. it is useful to write down another formulation of the first law, where the enthalpy is defined as:

$\mathrm{d}h = \mathrm{d}e_{in} + p\,\mathrm{d}v + v\,\mathrm{d}p$ , (18)

where $p$ is the thermodynamic pressure and $v$ is the volume.

3.2. the second law of thermodynamics

the formulation of the second law was motivated by the research of r. clausius, who followed the work of carnot and described the real processes in nature by the quantity named entropy. this quantity is defined as:

$\mathrm{d}s = \frac{\mathrm{d}q}{T}$ , (19)

where $T$ is the thermodynamic temperature. this definition is valid for reversible processes. entropy is then a total differential; therefore, for a reversible process between states 1 and 2, the following equation holds:

$s_2 - s_1 = \int_{(1)}^{(2)} \frac{\mathrm{d}q}{T}$ . (20)

in the case of a reversible cycle, the integral on the r.h.s. of this equation is equal to zero. note that at this moment the integration constant has to be determined. this can be done thanks to the nernst theorem:

$\lim_{T \to 0} \Delta s = 0$ . (21)

the real processes are, however, irreversible, and therefore equations (19) and (20) become inequalities:

$\mathrm{d}s \geq \frac{\mathrm{d}q}{T}$ . (22)

now, the entropy increase can be easily obtained (e.g. bejan [8]):

$s_{gen} = s_2 - s_1 - \int_{(1)}^{(2)} \frac{\mathrm{d}q}{T}$ . (23)
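as a small numerical illustration of eq. (23) (an added example, not taken from the paper): heating an incompressible body of heat capacity c from t1 to t2 by direct contact with a reservoir at t2 generates entropy, because the entropy change of the body exceeds the transferred dq/t; all values below are assumed.

import numpy as np

c, t1, t2 = 4186.0, 300.0, 350.0     # heat capacity [j/k] and temperatures [k], assumed
ds_body = c * np.log(t2 / t1)        # entropy change of the body, integral of c dt/t
q = c * (t2 - t1)                    # heat drawn from the reservoir at t2
s_gen = ds_body - q / t2             # eq. (23) with dq/t evaluated at the reservoir temperature
print(s_gen > 0, round(s_gen, 1))    # True, ~47.3 j/k: the process is irreversible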
changing the sequence of the derivatives and integration (thanks to the leibniz rule for integration), the following equation is obtained: ∫ v dϱ dt dv = 0 . (26) this relationship has to hold in any control volume dv ̸= 0, therefore: dϱ dt = ∂tϱ+uiϱi; + ϱu i ;i = 0 , (27) where ui are contravariant components of the velocity vector. 4.2. momentum conservation law the momentum conservation law describes in newtonian mechanics that the time derivative of the momentum has to be equal to all of the forces acting on the control volume: dmi dt = f i + s i , (28) where f i are components of the volume forces and s i are components of the surface forces, respectively, that are defined as: 2note about the notations of derivatives: in literature, there usually is a difference in the notation of derivative. in lagrangian description material derivative of quantity, ψ is dψ/dt = ∂ψ/∂t, in euler description, the material derivative is given as dψ/dt = ∂ψ/∂t + uj∂ψ/∂xj . here, the material derivative in the sense of euler description will be noted as dψ/dt to emphasise the fact that the derivations are performed in general curvilinear coordinates, and therefore the convective term in the material derivative is covariant derivative, that contains christoffel symbols of the second kind. mi = ∫ v ϱuidv , (29) f i = ∫ v ϱaidv , (30) s i = ∮ a σij daj , (31) where ai are the components of the acceleration vector and σij are the components of the stress tensor. if the control volume bonded by the surface a is free of discontinuity or singular points, then integral (28) can be transformed by the gauss divergence theorem into: ∮ a σij daj = ∫ v σ ij ;j dv . (32) substituting equations (29) – (32) into equation (28), the balance of momentum is given by: d dt ∫ v ϱuidv = ∫ v ϱaidv + ∫ v σ ij ;j dv . (33) another change in the sequence of derivation and integration transforms this relationship into the form: ∫ v [ ϱai + σij;j − d ( ϱui ) dt ] dv . (34) this again has to be true in any control volume dv ̸= 0, therefore: d ( ϱui ) dt = ϱ ( ∂tu i + ujui;j ) = ϱai + σij;j , (35) no special assumptions about fluid flow were made to this point. the stress tensor can be decomposed into two parts: σij = −gijp + τij , (36) where p is the thermodynamic pressure, gij is the inverse of the metric tensor, and τij are the components of the viscous tensor. the negative sign in the thermodynamic pressure means that the force caused by this pressure acts in the opposite direction of the outer normal. if the fluid is at rest, a viscous tensor is zero and the thermodynamics pressure is equal to the hydrostatic pressure. at last, the momentum balance can be written in the form of: ϱ ( ∂tu i + ujui;j ) = ϱai − gijp;j + τ ij ;j . (37) 4.3. energy conservation law energy cannot be created nor destroyed by any known mechanism, the law of balance of energy tells us, that the energy is transforming between its different forms. the change of total energy of the flowing fluid is given by: 105 erik flídr acta polytechnica d dt ∫ v ϱetotdv = d dt ∫ v ϱ (ein + ekin)dv = (38) = − ∮ a gjlqldaj + ∮ a gilu lσij daj + ∫ v ϱgiju iaj dv , where total energy etot is a sum of internal energy ein and kinetic energy ekin. next, ql is the covector of the heat flux and gil is the metric tensor. on the l.h.s. of the equation (38), the sequence of the derivation and integration will be switched and on the r.h.s. 
on the l.h.s. of equation (38), the sequence of derivation and integration will be switched, and on the r.h.s. of equation (38), the gauss divergence theorem will be used to convert the surface integrals into volume integrals, resulting in:

$\int_V \frac{\mathrm{d}}{\mathrm{d}t} \left[ \varrho (e_{in} + e_{kin}) \right] \mathrm{d}V = \int_V \left[ -\left( g^{il} q_l \right)_{;i} + \left( g_{jl} u^l \sigma^{ij} \right)_{;i} + \varrho g_{ij} u^i a^j \right] \mathrm{d}V$ . (39)

again, this has to hold in any control volume, therefore:

$\frac{\mathrm{d}}{\mathrm{d}t} \left[ \varrho (e_{in} + e_{kin}) \right] = -\left( g^{il} q_l \right)_{;i} + \left( g_{jl} u^l \sigma^{ij} \right)_{;i} + \varrho g_{ij} u^i a^j$ . (40)

4.4. entropy balance

the entropy balance can be written in the form of the balance equation (24) as:

$\frac{\mathrm{d}s}{\mathrm{d}t} = J(s) + P(s) = 0$ , (41)

from where the entropy production can be obtained as:

$P(s) = \frac{\mathrm{d}s}{\mathrm{d}t} - J(s) \geq 0$ . (42)

the entropy contained in the control volume can be calculated as:

$S = \int_V \varrho s\, \mathrm{d}V$ , (43)

where $s$ is the specific entropy. using the clausius-duhem inequality and performing the time derivative of equation (43), the integral value of the entropy production in the control volume can be obtained as:

$P = \int_V p\, \mathrm{d}V = \int_V \varrho \frac{\mathrm{d}s}{\mathrm{d}t}\, \mathrm{d}V + \oint_A \frac{g^{ij} q_j}{T}\, \mathrm{d}a_i = \int_V \left[ \varrho \frac{\mathrm{d}s}{\mathrm{d}t} + \frac{g^{ij} q_{j;i}}{T} \right] \mathrm{d}V$ , (44)

where the flux through the boundary is given by the covector of the heat flux $q_j$. equation (44) has to be true in any control volume, therefore:

$p = \varrho \frac{\mathrm{d}s}{\mathrm{d}t} + \frac{g^{ij} q_{j;i}}{T}$ , (45)

where the continuity equation was taken into account.

5. entropy production in flowing fluid

the relationships from section 3 are used for the determination of the entropy production in the flowing fluid. applying the time derivative to equations (17)–(19), the following relationships are obtained:

$\frac{\mathrm{d}e_{in}}{\mathrm{d}t} = \frac{\mathrm{d}q}{\mathrm{d}t} + \frac{p}{\varrho^2} \frac{\mathrm{d}\varrho}{\mathrm{d}t}$ , (46)

$\frac{\mathrm{d}h}{\mathrm{d}t} = \frac{\mathrm{d}e_{in}}{\mathrm{d}t} + \frac{1}{\varrho} \frac{\mathrm{d}p}{\mathrm{d}t} - \frac{p}{\varrho^2} \frac{\mathrm{d}\varrho}{\mathrm{d}t}$ , (47)

$\frac{\mathrm{d}s}{\mathrm{d}t} = \frac{1}{T} \frac{\mathrm{d}q}{\mathrm{d}t}$ , (48)

where, for the work done on the fluid, the following relationship holds:

$\mathrm{d}w = -p\, \mathrm{d}\!\left( \frac{1}{\varrho} \right)$ . (49)

the time change of entropy can be obtained after some elementary manipulations as:

$\varrho T \frac{\mathrm{d}s}{\mathrm{d}t} = \varrho \frac{\mathrm{d}h}{\mathrm{d}t} - \frac{\mathrm{d}p}{\mathrm{d}t}$ . (50)

the kinetic energy of the flowing fluid can be obtained by taking the dot product of the momentum equation with the velocity, in the form:

$\varrho g_{il} u^l \frac{\mathrm{d}u^i}{\mathrm{d}t} = \varrho \frac{\mathrm{d}e_{kin}}{\mathrm{d}t} = g_{il} u^l \sigma^{ij}_{;j} + \varrho g_{il} u^l a^i$ . (51)

subtracting equation (51) from equation (40), the internal energy is obtained:

$\varrho \frac{\mathrm{d}e_{in}}{\mathrm{d}t} = -\left( g^{ij} q_j \right)_{;i} + \left( g_{jl} u^l \sigma^{ij} \right)_{;i} - g_{jl} u^l \sigma^{ij}_{;i} = g_{jl} u^l_{;i} \sigma^{ij} - \left( g^{ij} q_j \right)_{;i}$ . (52)

substituting equation (52) into equation (47), and subsequently substituting this into equation (50), the following relation for the time change of entropy is obtained:

$\varrho T \frac{\mathrm{d}s}{\mathrm{d}t} = g_{jl} u^l_{;i} \sigma^{ij} - \left( g^{ij} q_j \right)_{;i}$ . (53)

comparing the relationships (45) and (53), the local entropy production can be obtained as:

$p = g_{jl} u^l_{;i} \sigma^{ij}$ . (54)

this means that the irreversible processes in the fluid flow are connected with the stress tensor.
6. newtonian fluid

no special assumption has been made up to this point about the stress tensor $\sigma^{ij}$, except that if the fluid is at rest, the stress tensor reduces to the thermodynamic pressure, and that if this fluid is in an external force field acting on it, the thermodynamic pressure has to be equal to the hydrostatic pressure. this transforms the problem of establishing the stress tensor into the problem of determining the viscous tensor $\tau^{ij}$.

let us suppose that this viscous tensor is a linear combination of the strain rate tensor $e^l_m = g^{kl} e_{mk}$, where the mixed form of the tensor is given by its definition through the covariant derivatives of the velocity field. then, in general, the relationship between these two tensors can be written as:

$\tau^{ij} = a\, g^{ij} g^{kl} e_{kl} + b \left( g^i_k g^j_l + g^i_l g^j_k \right) e^{kl} + c \left( g^i_k g^j_l + g^i_l g^j_k \right) e^k_m e^{ml}$ , (55)

where the non-linear third term on the r.h.s. of this equation will be neglected in the following text, therefore:

$\tau^{ij} = a\, g^{ij} g^{kl} e_{kl} + b \left( g^i_k g^j_l + g^i_l g^j_k \right) e^{kl}$ . (56)

this relationship, however, tells nothing about the properties of the tensor $\tau^{ij}$, mainly about its symmetry. the coefficients are functions of the invariants of the tensor of deformation, see e.g. [2]. if the fluid is homogeneous and isotropic, and if the viscous tensor is a continuous function of the strain rate tensor, then:

$\tau^{ij} = a\, g^{ij} + \lambda g^{ij} e^k_k + 2\mu e^{ij}$ . (57)

this tensor has to be symmetric due to the law of conservation of the moment of momentum. now the strain rates can be substituted into equation (57):

$e^k_k = u^k_{;k}$ , (58)

$e^{ij} = \frac{1}{2} \left( g^{jk} u^i_{;k} + g^{il} u^j_{;l} \right)$ . (59)

the stress tensor can then be written in the form:

$\sigma^{ij} = -g^{ij} p + \lambda g^{ij} u^k_{;k} + \mu g^{jk} u^i_{;k} + \mu g^{il} u^j_{;l}$ , (60)

where $\lambda$ is the second viscosity and $\mu$ is the dynamic viscosity. substituting equation (60) into (37), the navier-stokes equation is obtained:

$\varrho \left( \partial_t u^i + u^j u^i_{;j} \right) = -g^{ij} p_{;j} + \left( \lambda g^{ij} u^k_{;k} + \mu g^{jk} u^i_{;k} + \mu g^{il} u^j_{;l} \right)_{;j} + \varrho a^i$ . (61)

the coefficients of viscosity are, in general, functions of temperature and pressure; however, in most cases their variations in the flowing fluid are minimal (see e.g. [10]), and they can therefore be considered constant. finally, the navier-stokes equation in a general curvilinear coordinate system is obtained in the form:

$\varrho \left( \partial_t u^i + u^j u^i_{;j} \right) = -g^{ij} p_{;j} + (\lambda + \mu)\, g^{ij} u^k_{;kj} + \mu g^{jk} u^i_{;kj} + \varrho a^i$ . (62)

6.1. entropy production in newtonian fluid

the entropy production in a newtonian fluid can be easily obtained by substituting the stress tensor into equation (54). prior to this step, however, the heat flux vector is obtained from the constitutive relation:

$q^i = -k\, g^{ij} T_{;j}$ . (63)

substituting equations (63) and (60) into (53) gives:

$\varrho T \frac{\mathrm{d}s}{\mathrm{d}t} = k\, g^{ij} T_{;ji} + g_{il} u^l_{;j} \left( -g^{ij} p + \lambda g^{ij} u^k_{;k} + \mu \left[ g^{jk} u^i_{;k} + g^{im} u^j_{;m} \right] \right)$ . (64)

using relationship (7) and after some manipulations, the time change of entropy can be determined as:

$\varrho T \frac{\mathrm{d}s}{\mathrm{d}t} = k\, g^{ij} T_{;ji} - p u^i_{;i} + \lambda u^i_{;i} u^j_{;j} + 2\mu u^i_{;j} u^j_{;i}$ . (65)

by comparing equations (65) and (45), the local entropy production is obtained as:

$p = -p u^i_{;i} + \lambda \left( u^i_{;i} \right)^2 + 2\mu u^i_{;j} u^j_{;i}$ . (66)

it can be seen that the entropy production in the flowing fluid depends only on the stress tensor. moreover, if the fluid is incompressible, the entropy production depends only on the last term on the r.h.s. of equation (66), as the other terms vanish due to the continuity equation, where the divergence of the velocity vector is zero.
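as a quick worked check of eq. (66) (an added illustration, not part of the original derivation), consider the incompressible planar extensional flow $u^1 = a x^1$, $u^2 = -a x^2$ in cartesian coordinates, where the covariant derivatives reduce to partial ones and the velocity gradient is symmetric:

\begin{align*}
u^i_{;i} &= a - a = 0 \qquad \text{(incompressible: the first two terms of eq.~(66) vanish)} \\
p &= 2\mu\, u^i_{;j} u^j_{;i}
   = 2\mu \left[ (u^1_{;1})^2 + 2\, u^1_{;2} u^2_{;1} + (u^2_{;2})^2 \right]
   = 2\mu \left( a^2 + 0 + a^2 \right) = 4\mu a^2 > 0 .
\end{align*}

the production is positive for any nonzero strain rate $a$, as the second law requires.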
7. application to linear blade cascade

the application of curvilinear coordinates to the calculation of the entropy production is demonstrated on a linear blade cascade, which was experimentally investigated by perdichizzi & dossena [11]. a two-dimensional flow is considered here to demonstrate the usability of the curvilinear coordinates. the blade under investigation is depicted in figure 1, where the suction side is approximated by a fourth-order polynomial function and the positions of the pressure taps are highlighted.

figure 1. blade under investigation: suction and pressure surfaces with the pressure tap positions marked; the suction surface is approximated by the polynomial $y = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4$

the coefficients of the polynomial function have the values $a_0 = 1.64$, $a_1 = 1.16$, $a_2 = 7.14 \times 10^{-2}$, $a_3 = 1.56 \times 10^{-3}$ and $a_4 = 4.55 \times 10^{-5}$. once the blade geometry is known, the curvilinear coordinate system of the blade can be introduced. this system is shown in figure 2.

figure 2. blade curvilinear coordinates

the relations between the cartesian and the curvilinear coordinates are given by:

$x^1 = \zeta^1$ , (67)

$x^2 = \zeta^2 + \sum_{i=1}^{4} a_i \left( \zeta^1 \right)^i$ , (68)

$\zeta^1 = x^1$ , (69)

$\zeta^2 = x^2 - \sum_{i=1}^{4} a_i \left( x^1 \right)^i$ . (70)

by introducing the substitution

$\sigma = \sum_{i=1}^{4} i\, a_i \left( x^1 \right)^{i-1}$ , (71)

i.e. the slope of the suction-side polynomial, the metric tensor as well as its inverse can be calculated as:

$g_{ij} = \begin{pmatrix} 1 + \sigma^2 & \sigma \\ \sigma & 1 \end{pmatrix}$ , (72)

$g^{ij} = \begin{pmatrix} 1 & -\sigma \\ -\sigma & 1 + \sigma^2 \end{pmatrix}$ . (73)

the christoffel symbols of the second kind can then be calculated using equation (9). first, the derivatives of the metric components with respect to the coordinate $x^i$ are calculated as:

$\partial_{x^1} g_{11} = 2\sigma\beta$ , (74)
$\partial_{x^1} g_{12} = \partial_{x^1} g_{21} = \beta$ , (75)
$\partial_{x^1} g_{22} = 0$ , (76)
$\partial_{x^2} g_{11} = 0$ , (77)
$\partial_{x^2} g_{12} = \partial_{x^2} g_{21} = 0$ , (78)
$\partial_{x^2} g_{22} = 0$ , (79)

where:

$\beta = \sum_{i=2}^{4} i (i-1)\, a_i \left( x^1 \right)^{i-2}$ . (80)

then all of the christoffel symbols are equal to zero except $\Gamma^2_{11} = \beta$. having all of these pieces of information, the calculation of the entropy production can be performed directly. this calculation was performed in the region near the blade surface (out of the boundary layer), where only the gradients of the parameters along the coordinate $\zeta^2$ were considered (in the other directions, their values can be neglected). the boundary conditions, i.e. the stagnation temperature and the pressure, were not specified in the paper [11]; however, based on the wind tunnel type, their values can be estimated as $t_0 = 297.15$ k and $p_0 = 98000$ pa. the pressure distribution on the blade surface (denoted by the coordinate $s$) can be calculated from the mach number distribution. the pressure, temperature and velocity distributions are shown in figure 3.

figure 3. distribution of the flow parameters along the blade surface: (a) pressure, (b) temperature, (c) velocity, each plotted against the normalized surface coordinate $s_2/s_{2,max}$

the distribution of the entropy production $\varrho\, \mathrm{d}s/\mathrm{d}t$ along the blade surface is shown in figure 4. note that the bulk viscosity $\lambda$ was obtained from the stokes hypothesis, $\lambda + \frac{2}{3}\mu = 0$. it is obvious that most of the entropy production occurred in the leading edge region, where the flow acceleration was the highest and therefore the velocity gradients were the largest. the overall entropy production along the coordinate $\zeta^2$ was calculated as an integral along this curve and was $P = 0.0327$ j·k⁻¹·m⁻²·s⁻¹.

figure 4. entropy production $\varrho\, \mathrm{d}s/\mathrm{d}t$ [j·k⁻¹·m⁻³·s⁻¹] along the normalized surface coordinate
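the metric and the christoffel symbols above can be verified symbolically. the following sympy sketch (an added check, not part of the paper) rebuilds $g_{ij}$ from the coordinate transformation with the figure 1 coefficients and confirms that the only nonzero christoffel symbol is $\Gamma^2_{11} = \beta$:

import sympy as sp

x1, x2 = sp.symbols("x1 x2")
a = [1.64, 1.16, 7.14e-2, 1.56e-3, 4.55e-5]        # a0..a4 from figure 1
f = sum(a[i] * x1**i for i in range(5))             # suction-side polynomial

# covariant basis of eqs. (67)-(68): g_1 = (1, f'), g_2 = (0, 1)
sigma = sp.diff(f, x1)                              # slope, eq. (71)
g = sp.Matrix([[1 + sigma**2, sigma], [sigma, 1]])  # g_ij, eq. (72)
ginv = sp.simplify(g.inv())                         # g^ij, eq. (73)

eta = [x1, x2]
gamma = [[[sp.simplify(sp.Rational(1, 2) * sum(
    ginv[k, l] * (sp.diff(g[l, i], eta[j]) + sp.diff(g[l, j], eta[i])
                  - sp.diff(g[i, j], eta[l])) for l in range(2)))
    for j in range(2)] for i in range(2)] for k in range(2)]   # eq. (9)

beta = sp.diff(sigma, x1)                           # eq. (80)
nonzero = {(k + 1, i + 1, j + 1) for k in range(2) for i in range(2)
           for j in range(2) if gamma[k][i][j] != 0}
print(nonzero, sp.simplify(gamma[1][0][0] - beta))  # {(2, 1, 1)} and 0: gamma^2_11 = beta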
8. conclusion

a derivation of the entropy production in a flowing fluid in a general curvilinear coordinate system was performed in this paper. the basic relationships from differential geometry used during the derivation were presented as well, to clarify the used notation. curvilinear coordinates were chosen to describe this phenomenon in a more general way than is usually used in the literature. the results obtained here were used to describe the flow within a linear blade cascade in a simplified manner, and the distribution of the entropy production along one coordinate near the blade suction surface was obtained. the integral value of the entropy production was then evaluated from this distribution.

the aim of this paper was to prove that this approach is capable of evaluating experimental data along the suction side of the blade. this approach can be used in close proximity to the blade. the polynomial function will change with increasing distance from the blade surface, as the opposite surface of the passage is given by a polynomial with different coefficients. future work will therefore deal with the generalisation of this approach to a more complicated situation, where the polynomial coefficients $a_i$ are not constants.

list of symbols

$a^i$ — components of the acceleration vector [m·s⁻²]
$e$ — strain rate tensor [s⁻¹]
$F$ — volume force [n]
$\mathbf{g}_i$, $\mathbf{g}^i$ — basis and co-basis vectors [1]
$g_{ij}$, $g^{ij}$ — components of the metric tensor [1]
$h$ — specific enthalpy [j·kg⁻¹]
$J(s)$ — entropy flux [j·kg⁻¹·k⁻¹·s⁻¹]
$m$ — mass [kg]
$M$ — momentum [kg·m·s⁻¹]
$p$ — thermodynamic pressure [pa]
$p$ — local entropy production [j·k⁻¹·m⁻³·s⁻¹]
$P$ — integral entropy production [j·k⁻¹·m⁻²·s⁻¹]
$q^i$ — components of the heat flux vector [j·m⁻²·s⁻¹]
$s$ — specific entropy [j·kg⁻¹·k⁻¹]
$s$ — blade surface coordinate [m]
$t$ — time [s]
$T$ — thermodynamic temperature [k]
$u^i$ — components of the velocity vector [m·s⁻¹]
$V$ — control volume [m³]
$x, y, z = x^i$ — cartesian coordinates [m]
$\Gamma$ — christoffel symbols [1]
$\zeta^i$ — curvilinear blade coordinates [m]
$\mu$ — dynamic viscosity [pa·s]
$\lambda$ — bulk viscosity [pa·s]
$\varrho$ — density [kg·m⁻³]

references

[1] c. truesdell. the physical components of vectors and tensors. journal of applied mathematics and mechanics 33(10–11):345–356, 1953. https://doi.org/10.1002/zamm.19530331005
[2] r. aris. vectors, tensors and the basic equations of fluid mechanics. dover publications, 1989.
[3] v. ivancevic, t. ivancevic. applied differential geometry: a modern introduction. world scientific, 2007. https://doi.org/10.1142/6420
[4] s. surattana. transformation of the navier-stokes equations in curvilinear coordinate system with maple. global journal of pure and applied mathematics 12(4):3315–3325, 2016.
[5] l. swungoho, s. bharat. governing equations of fluid mechanics in physical curvilinear coordinate system. in third mississippi state conference on difference equations and computational simulations, pp. 149–157. 1997.
[6] p. asinari, e. chiavazzo. overview of the entropy production of incompressible and compressible fluid dynamics. meccanica 51:1–10, 2015. https://doi.org/10.1007/s11012-015-0284-z
[7] j. kvasnica. termodynamika. sntl, 1965.
[8] a. bejan. entropy generation minimization. crc press llc, 2013.
[9] f. maršík. termodynamika kontinua. academia, 1999.
[10] l. landau, e. lifschitz. fluid mechanics. pergamon press, 1987.
[11] a. perdichizzi, v. dossena. incidence angle and pitch–chord effects on secondary flows downstream of a turbine cascade. journal of turbomachinery – transactions of the asme 115(3):383–391, 1993. https://doi.org/10.1115/1.2929265
design and optimization of a turbine intake structure

p. fošumpaur, f. čihák

the appropriate design of the turbine intake structure of a hydropower plant is based on assumptions about its suitable function, and the design will increase the total efficiency of operation. this paper deals with the optimal design of the turbine intake structure of run-of-river hydropower plants. the study focuses mainly on the optimization of the hydropower plant location with respect to the original river banks, and on the optimal design of a separating pier between the weir and the power plant. the optimal design of the turbine intake was determined with the use of 2-d mathematical modelling. a case study is performed for the optimal design of a turbine intake structure on the nemen river in belarus.

keywords: run-of-river hydropower plant, turbine intake, 2-d modelling, optimization.

1 introduction

a suitable design of the water power plant inlet is a basic condition for the proper functioning of the plant, and it increases the overall efficiency and optimizes the operation of the plant. the hydraulic design of the separating pier, and also of the other components of the inlet, usually employs mathematical modelling. the objective of this study is to optimize the hydrodynamic conditions in the area of the inlet of a run-of-river power plant using 2-d mathematical modelling of the flow. the main emphasis is placed on the design of the separating pier between the weir and the power plant, since a wrong width and shape may cause detachment of the current from the face side of the pier, impairing the flow. this disturbance then propagates to the inlet of the nearest aggregate. measurements conducted on completed water works have demonstrated a decrease in the efficiency of such an aggregate by as much as 30 % [3]. this methodology can also be used for the optimization of the shapes of the piers that direct the inlet flow towards the individual turbine blocks; these piers also serve as supports for the coarse screens and for the footbridge.

2 methods

the modelling was carried out under the assumption of a steady two-dimensional flow of an incompressible fluid. the viscosity and the density were considered constant within the whole domain. the basic description of the fluid flow is given by the navier-stokes equations [5]

$\frac{\partial u_i}{\partial t} + u_j \frac{\partial u_i}{\partial x_j} = -\frac{1}{\varrho} \frac{\partial p}{\partial x_i} + \frac{\mu}{\varrho} \nabla^2 u_i$ (1)

and the continuity equation

$\nabla \cdot \mathbf{u} = 0$ , (2)

where $u_i$ is the velocity vector component in the direction of the $x_i$ axis, $\varrho$ is the density [kg·m⁻³], $p$ is the hydrostatic pressure [pa] and $\mu$ is the dynamic viscosity [pa·s]. the above set of equations is used for calculating the laminar flow of a real fluid. real flows, however, must be considered turbulent. the mathematical modelling of turbulent flow uses the turbulent k-ε model [6], comprising the modified (time-averaged) navier-stokes equations, the continuity equation, an equation for the transport and diffusion of the turbulent kinetic energy, and an equation describing the transport and diffusion of the dissipation rate of the turbulent kinetic energy. the equations describing the turbulent flow are written for the time-averaged velocity. the navier-stokes equations then take the following form:

$\frac{\partial \mathbf{u}}{\partial t} + (\mathbf{u} \cdot \nabla)\mathbf{u} = -\frac{1}{\varrho} \nabla p + \nabla \cdot \left[ c_\mu \frac{k^2}{\varepsilon} \left( \nabla \mathbf{u} + (\nabla \mathbf{u})^{\mathrm{T}} \right) \right]$ . (3)
the equation of the turbulent kinetic energy $k$ is

$\frac{\partial k}{\partial t} + (\mathbf{u} \cdot \nabla) k = \nabla \cdot \left[ \frac{c_\mu}{\sigma_k} \frac{k^2}{\varepsilon} \nabla k \right] + \frac{c_\mu}{2} \frac{k^2}{\varepsilon} \left( \nabla \mathbf{u} + (\nabla \mathbf{u})^{\mathrm{T}} \right)^2 - \varepsilon$ (4)

and the equation of the turbulent energy dissipation $\varepsilon$:

$\frac{\partial \varepsilon}{\partial t} + (\mathbf{u} \cdot \nabla) \varepsilon = \nabla \cdot \left[ \frac{c_\mu}{\sigma_\varepsilon} \frac{k^2}{\varepsilon} \nabla \varepsilon \right] + \frac{c_{\varepsilon 1}}{2}\, c_\mu k \left( \nabla \mathbf{u} + (\nabla \mathbf{u})^{\mathrm{T}} \right)^2 - c_{\varepsilon 2} \frac{\varepsilon^2}{k}$ . (5)

the k-ε model contains the following constants, whose values are based on hydraulic model research:

c_μ = 0.09,  c_ε1 = 0.1256,  c_ε2 = 1.92,  σ_k = 1.0,  σ_ε = 1.3

the above-specified equations for 2-d turbulent flow were approximated by the finite element method in the femlab computing environment. the geometric boundary conditions are given by the shape of the flow area, and they are based on the layout of the inlet to the water power plant. the geometry of the inlet was estimated first, with subsequent optimization of the geometrical solution, with the aim of achieving the most uniform possible velocity profile between the individual turbine blocks. the boundary conditions of the flow were selected as follows:

– zero velocity is assumed at the walls,
– a uniform velocity profile is assumed at the inlet boundary,
– zero pressure is assumed at the outflow boundary.

3 case study

the case study of the inlet optimization upstream of the turbine blocks is calculated for a power plant project on the nemen river in belarus. fig. 1 shows the layout.

fig. 1: layout of the inlet solution of a water power plant on the nemen river (belarus); b₁ = 120 m, r = 340 m, b₂ = 40 m, b₃ = 20 m, central angle 58°

the flow area was derived from the layout of the power plant and the weir. the positioning of the power plant, the weir fields and the piers between the individual turbine blocks was retained. the water power plant comprises a total of five turbine blocks, separated at the inlet by piers with elliptical pier heads. the number of finite elements in the area was 5,000 to 10,000, depending on the scenario and the type of flow. a reliable description of the inlet flow to the power plant required a sufficiently large solution area. the power plant is located on the concave side of a curve with a radius of approximately 340 m. the solution area starts at the beginning of this curve in the vicinity of the upper dock of the lock chamber, where a uniform distribution of velocities is assumed. a total of five basic scenarios of the power plant inlet were examined: four of them focused on the shape of the separating pier, and one examined the effects of the shape of the side pier between the power plant and the riverbank.

the flow around the separating pier between the power plant and the weir is unsymmetrical; therefore, the shape of the pier should be equally unsymmetrical and streamlined. a model study [1, 2] demonstrated that the width of the separating pier is a function of the turbine flow rate $q_t$. the minimum width of the separating pier is given by the formula

$d = f(q_t) = c\, q_t^{2/5}$ , (6)

where the factor $c$ depends on the conditions of the flow around the pier, and its values fall within the interval from 0.7 to 1.4 according to the study [2]. the specific value of the factor $c$ depends on the layout of the water power plant and its positioning in relation to the river flow and the bank, as shown in fig. 1. in terms of the design of the separating pier, the following features are fundamental:

r — radius of the river course curve above the weir and the power plant,
b₁ — width of the riverbed above the weir and the power plant,
b₂ — width of the power plant,
l₂ — length of the separating pier above the weir,
α — central angle of the riverbed, positive for a power plant on the concave bank and negative if the power plant is located on the convex bank.

the factor $c$ can then be read from the nomogram according to [2], or it can be determined from the empirical formula according to holata [3], which expresses $c$ through the ratios $l_2/b_2$, $r/b_1$ and $b_2/b_1$ and the central angle $\alpha$ (with the fitted constants 0.5, 0.8, 0.619, 0.0455, 0.05 and 1.53):

$c = f\!\left( \frac{l_2}{b_2},\ \frac{r}{b_1},\ \frac{\alpha}{90°},\ \frac{b_2}{b_1} \right)$ . (7)
the circumfluence around the separating pier between the power plant and the weir is unsymmetrical; therefore the shape of the pier should be equally unsymmetrical and streamlined. a model study [1, 2] demonstrated that the width of the separating pier is a function of the turbine flow rate q_t. the minimum width of the separating pier is established by the formula

$$d = f(q_t) = c\,q_t^{2/5}, \tag{6}$$

where the factor c depends on the conditions of the pier circumfluence; its values fall within the interval from 0.7 to 1.4 according to study [2]. the specific value of the factor c depends on the layout of the water power plant and its positioning in relation to the river flow and the bank, as shown in fig. 1. in terms of the design of the separating pier, the following features are fundamental:

• r – radius of the river course curve above the weir and the power plant,
• b1 – width of the riverbed above the weir and the power plant,
• b2 – width of the power plant,
• l2 – length of the separating pier above the weir,
• α – central angle of the riverbed, positive for a power plant on the concave bank and negative if the power plant is located on the convex bank.

fig. 1: layout of the inlet solution of a water power plant on the nemen river (belarus); the principal dimensions are b1 = 120 m, b2 = 40 m, b3 = 20 m, r = 340 m, and the central angle is 58°

the factor c can then be read from the nomogram according to [2], or can be determined from the empirical formula according to holata [3], which expresses c in terms of the ratios l2/b2, r/b1 and b3/b2 and of the central angle α. (7)

the design of the separating pier of width d employed an elliptical layout according to fig. 2 [2]. the pier head is composed of an ellipse (a = 0.834 d; b = 0.166 d) and two circular curves of radii 0.24 d and 0.12 d, respectively. the angle β of the central circle curve is

$$\beta = \arcsin\left(\frac{2\cdot 0.166\,d - 0.24\,d}{0.24\,d}\right) = 22.54°. \tag{8}$$

fig. 2: elliptical shape of the separating pier

the logarithmic spiral function for shaping the separating pier for scenario 5 was used in the form

$$\rho = d\cdot 10^{\,2.344\,\vartheta/180°}, \tag{9}$$

where ρ is the polar vector [m], ϑ denotes the polar angle [°] and d is the pier width [m]. fig. 6 shows a chart of the pier layout in comparison to scenario 3. the drawing also indicates the beginning of the logarithmic spiral.
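for orientation, the pier-head dimensions implied by (6) and (8) are easy to tabulate. the sketch below is a small python illustration; the flow rate used in the example is hypothetical, and the factor c must be supplied from the nomogram [2] or from holata's formula.

```python
import math

def pier_width(q_t, c):
    """minimum separating-pier width d = c * q_t**(2/5) from eq. (6);
    q_t is the turbine flow rate [m3/s], c the empirical factor (0.7-1.4)."""
    if not 0.7 <= c <= 1.4:
        raise ValueError("factor c outside the interval reported in [2]")
    return c * q_t ** 0.4

def pier_head(d):
    """elliptical pier-head geometry of fig. 2 for a pier of width d [m]."""
    beta = math.degrees(math.asin((2 * 0.166 * d - 0.24 * d) / (0.24 * d)))
    return {"a": 0.834 * d, "b": 0.166 * d,
            "r1": 0.24 * d, "r2": 0.12 * d, "beta_deg": beta}

d = pier_width(q_t=250.0, c=1.0)   # hypothetical turbine flow rate
print(round(d, 2), pier_head(d))   # beta_deg evaluates to 22.54 as in eq. (8)
```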
scenario 1: this scenario is based on the length of the separating pier above the weir l2 = b2/2. the bank-side pier creates an angle of 45° with the axis of the river flow within the weir profile.

scenario 2: since calculations showed that the 45° angle of the bank-side pier negatively affects the field of flow above the aggregate near the riverbank, this scenario used an angle of 30°. the layout of all the other elements remains unchanged from scenario 1. all subsequent scenarios have a bank-side pier angle of 30°. a chart of scenario 2 is given in fig. 3.

scenario 3: this scenario uses the length of the face side of the pier above the weir l2 = b2/4.

scenario 4: this alternative represents the greatest protrusion of the separating pier above the weir examined here. the protrusion is based on the maximum allowable width of the separating pier, determined according to the width of the weir field, as stated in the input data (12 m). this is the upper limit of the pier protrusion above the weir.

scenario 5: the last investigated scenario is based on scenario 3. the width of the separating pier in both cases is 10.2 m. the scenarios differ in the shape of the pier head area, which in this scenario is based on a logarithmic spiral layout.

fig. 3: scenario 2; fig. 4: scenario 3; fig. 5: scenario 4; fig. 6: scenario 5

fig. 7 and fig. 8 show isosurfaces of the total velocities and streamlines for scenarios 1 and 5 under laminar flow for the reynolds number re = 2000. it holds that the streamlines follow the optimum (streamlined) path under potential laminar flow. any differences between this ideal shape determined by the streamlines and the actual shape of the structures subject to circumfluence thus indicate shortcomings in the shape design. this analysis was performed for the whole investigated area; however, for the sake of greater clarity only a detail of the separating pier is shown. the analysis made it clear that the most suitable solutions are offered by scenarios 3 and 5 (the least protrusion of the pier above the weir), where the streamlines and the contour of the separating pier are virtually parallel, making it safe to assume no detachment of the flow from the face side of the structure. conversely, the worst results were obtained for the pier shapes in scenarios 1 and 2 (these scenarios differ in the shape of the side pier only). nor is the pier according to scenario 4 (the greatest length protruding above the weir) ideal in terms of the streamlines.

fig. 7: scenario 1, laminar flow; fig. 8: scenario 5, laminar flow

the following analysis compares the individual scenario solutions for actual flow rate values under turbulent flow. the results of this investigation enable the direct selection of the optimum scenario based on the characteristics of the velocity field at the inlet of the individual turbine blocks. figs. 9 and 10 show a comparison between the velocity fields for scenarios 3 and 4. the charts clearly demonstrate that in the case of scenario 3 the flow field above the turbines is more uniform than in the case of scenario 4, where the velocities in the vicinity of the bank-side pier are significantly higher than behind the separating pier between the power plant and the weir.

fig. 9: scenario 3, total velocity [m·s⁻¹] and streamlines; fig. 10: scenario 4, total velocity [m·s⁻¹] and streamlines

a comparison of the scenarios of the inlet solution is shown in fig. 11. the graph shows the velocity profile in a cross section above the turbines (on the outflow limit of the investigated area). scenarios with differing shapes of the separating pier are monitored. scenario 1 with the 45° side pier was replaced by the more suitable scenario 2 with the 30° pier angle. based on the analysis, scenario 3 was recommended as the design solution for the shape of the separating pier.

fig. 11: distribution of velocities above the turbines for individual scenarios
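the visual comparison in fig. 11 can be condensed into a single number, e.g. the coefficient of variation of the velocity profile sampled above the turbine blocks. the velocity samples in the sketch below are invented for illustration; only the comparison logic reflects the procedure described above.

```python
import numpy as np

def uniformity(v_profile):
    """coefficient of variation of the outflow velocity profile;
    smaller values indicate a more uniform inflow to the turbines."""
    v = np.asarray(v_profile, dtype=float)
    return v.std() / v.mean()

# hypothetical velocities [m/s] sampled above the five turbine blocks
profile_scenario_3 = [0.96, 1.00, 1.02, 1.01, 0.98]
profile_scenario_4 = [0.70, 0.85, 1.05, 1.20, 1.25]
print(uniformity(profile_scenario_3) < uniformity(profile_scenario_4))  # True
```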
4 conclusion

there are numerous empirical relations and recommendations for the optimum design of the hydrodynamic shapes of elements subject to circumfluence in the area of the inlets of water power plants, usually based on physical research in laboratories. this paper utilises these recommendations in the formulation of the initial design scenarios, which are subsequently optimised using 2-d mathematical modelling of the laminar and turbulent flow. the advantage of this solution lies above all in the low time and financial demands of the optimization of the inlet elements of water power plants.

5 acknowledgments

this research has been supported by grant no. 103/02/d049 of the grant agency of the czech republic.

references

[1] čihák, f., medřický, v.: weir design. praha: vydavatelství čvut, 1991.
[2] gabriel, p., grandtner, t., průcha, m., výbora, p.: weirs. praha: sntl, 1989.
[3] holata, m.: small hydro-power plants. ed.: pavel gabriel, praha: academia, 2002.
[4] hunter, s. c.: mechanics of continuous media. chichester (england): ellis horwood, 1976.
[5] kolář, v., patočka, c., bém, j.: hydraulics. praha: sntl, 1983.
[6] wilcox, d.: turbulence modeling for cfd. 2nd edition, dcw industries inc., 1998.

dr. ing. pavel fošumpaur, phone: +420 224 354 425, e-mail: fosump@fsv.cvut.cz
prof. ing. františek čihák, drsc., phone: +420 224 354 611, e-mail: cihak@fsv.cvut.cz
department of hydrotechnics, czech technical university in prague, faculty of civil engineering, thákurova 7, 166 29 prague 6, czech republic

1 introduction

the most widely studied quantum mechanical potentials are formulated as one-dimensional problems. these include potentials defined on a finite domain (e.g. the infinite square well) or on the full x axis (e.g. the pöschl-teller potential). potentials defined on the positive semi-axis also occur as radial problems obtained after the separation of the angular variables in centrally symmetric potentials. potentials in higher dimensions are discussed less frequently, and mainly in cases when they can be reduced to one-dimensional problems by the separation of the variables in some coordinates (cartesian, polar, etc.). these potentials differ from the one-dimensional ones in several respects: their spectrum can be richer due to the larger number of degrees of freedom, and this can be manifested in the occurrence of degeneracies, for example. an interesting recent development in quantum mechanics was the introduction of 𝒫𝒯 symmetry [1]. quantum systems with this symmetry are invariant under the simultaneous action of the 𝒫 space and 𝒯 time inversion operations, where the latter is represented by complex conjugation. it has been found that although these 𝒫𝒯-symmetric problems are manifestly non-hermitian, as they possess an imaginary potential component too, they have several features in common with traditional self-adjoint systems. the most striking of these is the presence of real energy eigenvalues in the spectrum, but the orthogonality of the energy eigenstates and the time-independence of their norm are also non-typical for complex potentials. there are, however, important differences too, with respect to conventional problems. the energy spectrum can turn into complex conjugate pairs as the non-hermiticity increases, and this can be interpreted as the spontaneous breakdown of 𝒫𝒯 symmetry, in that the energy eigenstates then cease to be eigenstates of the 𝒫𝒯 operator. also, the pseudo-norm defined by the modified inner product ⟨ψ|𝒫𝒯ψ⟩ turned out to have an indefinite sign.
𝒫𝒯 symmetry was later identified as a special case of pseudo-hermiticity, and this explained much of the unusual results. the proceedings volumes of recent workshops [2], [3] give a comprehensive account of the status of 𝒫𝒯-symmetric quantum mechanics and related fields. with only a few exceptions, the study of 𝒫𝒯-symmetric systems has been restricted to the bound states of one-dimensional non-relativistic problems, where 𝒫𝒯 symmetry amounts to the requirement v*(−x) = v(x). here we extend the scope of these investigations by considering 𝒫𝒯-symmetric problems in higher spatial dimensions. in particular, we employ a simple method of generating solvable non-central potentials by the separation of the variables, and combine it with the requirements of 𝒫𝒯 symmetry [4].

2 non-central potentials in polar coordinates

let us consider the schrödinger equation with constant mass

$$\left[\frac{\mathbf p^{2}}{2m} + v(\mathbf r)\right]\psi(\mathbf r) = e\,\psi(\mathbf r), \tag{1}$$

where the potential v(r) is a general function of the position r. although in this section we implicitly assume that v(r) is real, so that the hamiltonian describing the quantum system is self-adjoint, the procedure we follow here can be applied to complex potentials too. in what follows we choose the units as 2m = ħ = 1.

𝒫𝒯-symmetry and non-central potentials

g. lévai

we present a general procedure by which solvable non-central potentials can be obtained in 2 and 3 dimensions by the separation of the angular and radial variables. the method is applied to generate solvable non-central 𝒫𝒯-symmetric potentials in polar coordinates. general considerations are presented concerning the 𝒫𝒯 transformation properties of the eigenfunctions, their pseudo-norm and the nature of the energy eigenvalues. it is shown that within the present framework the spontaneous breakdown of 𝒫𝒯 symmetry can be implemented only in two dimensions.

keywords: 𝒫𝒯 symmetry, angular and radial variables, non-central potentials

specifying (1) for d = 3 dimensions and using polar coordinates we obtain

$$\frac{1}{r^{2}}\frac{\partial}{\partial r}\left(r^{2}\frac{\partial\psi}{\partial r}\right) + \frac{1}{r^{2}}\frac{\partial^{2}\psi}{\partial\theta^{2}} + \frac{\cot\theta}{r^{2}}\frac{\partial\psi}{\partial\theta} + \frac{1}{r^{2}\sin^{2}\theta}\frac{\partial^{2}\psi}{\partial\phi^{2}} + \left[e - v(r,\theta,\phi)\right]\psi = 0. \tag{2}$$

assuming that the separation of the variables is possible, we search for the solution as

$$\psi(r,\theta,\phi) = r^{-1}\,\chi(r)\,\Theta(\theta)\,\Phi(\phi), \tag{3}$$

where r ∈ [0, ∞), θ ∈ [0, π] and φ ∈ [0, 2π). then (2) turns into

$$\frac{\chi''}{\chi} + \frac{1}{r^{2}}\left[\frac{\Theta'' + \cot\theta\,\Theta'}{\Theta} + \frac{1}{\sin^{2}\theta}\,\frac{\Phi''}{\Phi}\right] - v(r,\theta,\phi) + e = 0, \tag{4}$$

where the prime denotes the derivative with respect to the appropriate variable. next we assume that Θ(θ) and Φ(φ) satisfy the second-order differential equations

$$\Theta'' + \cot\theta\,\Theta' + \left[\lambda_{q} - q(\theta)\right]\Theta = 0, \tag{5}$$

$$\Phi'' + \left[\lambda_{k} - k(\phi)\right]\Phi = 0. \tag{6}$$

it is seen that (6) can be considered a one-dimensional schrödinger equation defined in the finite domain [0, 2π] with the periodic boundary conditions Φ(0) = Φ(2π) and Φ'(0) = Φ'(2π). note that in the case of one-dimensional potentials defined within a finite domain the wavefunction is usually required to vanish at the boundaries; however, with periodic boundary conditions this is not a necessary requirement: it can also be finite there. equation (5) is solvable for the choice

$$q(\theta) = \frac{\mu^{2}}{\sin^{2}\theta}, \qquad \lambda_{q} = \lambda(\lambda+1), \tag{7}$$

when the solutions are given by the associated legendre functions p_λ^μ(cos θ) [5]. normalizability requires λ and μ to be non-negative integers such that λ = l and μ = |m| ≤ l. then

$$\Theta_{lm}(\theta) = (-1)^{m}\left[\frac{2l+1}{2}\,\frac{(l-m)!}{(l+m)!}\right]^{1/2} p_{l}^{m}(\cos\theta). \tag{8}$$
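the normalization in (8) is easy to verify numerically; the following sketch (an illustration with scipy, not part of the original paper) checks that Θ_lm is normalized to unity with the weight sin θ.

```python
import numpy as np
from math import factorial
from scipy.special import lpmv

def theta_lm(theta, l, m):
    """polar function of eq. (8), built from the associated legendre
    function p_l^m(cos theta), normalized on [0, pi] with weight sin(theta)."""
    norm = (-1.0) ** m * np.sqrt((2 * l + 1) / 2.0
                                 * factorial(l - m) / factorial(l + m))
    return norm * lpmv(m, l, np.cos(theta))

th = np.linspace(0.0, np.pi, 20001)
f = theta_lm(th, l=2, m=1)
print(np.trapz(f * f * np.sin(th), th))   # ~1.0
```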
substituting (6), (5) and (7) in (4), the angular part can be separated, and a radial schrödinger equation is obtained,

$$-\chi'' + \left[v_{0}(r) + \frac{l(l+1)}{r^{2}}\right]\chi = e\,\chi, \tag{9}$$

where the central potential v₀(r) is related to v(r, θ, φ) as

$$v(r,\theta,\phi) = v_{0}(r) + \frac{1}{r^{2}\sin^{2}\theta}\left[k(\phi) - \lambda_{k} + m^{2}\right]. \tag{10}$$

in its most general form, (10) is a non-central potential that depends on the states through λ_k and m. in order to eliminate the state-dependence one can apply the prescription λ_k = m² + a, where a is a constant. since m has to be an integer, this prescription represents a restriction on the solutions of equation (6). a special case occurs for k(φ) = a, i.e. for the free motion on a circle (or an infinite square well with periodic boundary conditions), which reduces (10) to a central potential, and takes the angular wavefunctions Θ(θ)Φ(φ) into the spherical harmonics y_lm(θ, φ) [5]. exact solutions of the radial schrödinger equation (9) are known for the harmonic oscillator, coulomb and square well potentials for an arbitrary value of l, while for l = 0 (i.e. for s waves) it is solvable for many more potentials. some solutions can also be obtained for arbitrary l for quasi-exactly solvable (qes) potentials [6], in the sense that the first few solutions (up to a given principal quantum number) can then be determined exactly.
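for potentials without a closed-form solution, the spectrum of (9) can still be obtained numerically. the sketch below (a plain finite-difference diagonalization, not taken from the paper) reproduces the exact levels 4n + 2l + 3 of the radial harmonic oscillator v₀(r) = r² in the units 2m = ħ = 1 used here.

```python
import numpy as np

def radial_levels(v0, l, r_max=20.0, n=2000, n_levels=3):
    """lowest eigenvalues of  -chi'' + (v0(r) + l(l+1)/r**2) chi = e chi
    (eq. 9) on a uniform grid with chi(0) = chi(r_max) = 0."""
    r = np.linspace(r_max / n, r_max, n)
    h = r[1] - r[0]
    diag = 2.0 / h ** 2 + v0(r) + l * (l + 1) / r ** 2
    off = -np.ones(n - 1) / h ** 2
    hmat = np.diag(diag) + np.diag(off, 1) + np.diag(off, -1)
    return np.linalg.eigvalsh(hmat)[:n_levels]

print(radial_levels(lambda r: r ** 2, l=1))   # ~ [5., 9., 13.]
```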
specifying (1) to d = 2 dimensions, the whole procedure can essentially be repeated. the equivalents of equations (2) and (3) are then

$$\frac{1}{\varrho}\frac{\partial}{\partial\varrho}\left(\varrho\frac{\partial\psi}{\partial\varrho}\right) + \frac{1}{\varrho^{2}}\frac{\partial^{2}\psi}{\partial\phi^{2}} + \left[e - v(\varrho,\phi)\right]\psi = 0 \tag{11}$$

and

$$\psi(\varrho,\phi) = \varrho^{-1/2}\,\chi(\varrho)\,\Phi(\phi). \tag{12}$$

the separation of the angular variable φ is again possible if (6) holds, and the solutions are required to satisfy periodic boundary conditions. the radial schrödinger equation is now

$$-\chi'' + \left[v_{0}(\varrho) + \frac{1}{\varrho^{2}}\left(\lambda_{k} - \frac{1}{4}\right)\right]\chi = e\,\chi, \tag{13}$$

where

$$v(\varrho,\phi) = v_{0}(\varrho) + \frac{k(\phi)}{\varrho^{2}}. \tag{14}$$

equation (13) can be solved exactly for the same potentials as in three dimensions.

3 non-central 𝒫𝒯-symmetric potentials

let us now specify the procedure outlined in the previous section to 𝒫𝒯-symmetric potentials. since the kinetic term in (1) is 𝒫𝒯-symmetric, we have to take care separately only of the 𝒫𝒯 symmetry of the potential term. the effect of the 𝒫 operation is 𝒫: r → −r, so the condition for the 𝒫𝒯 symmetry of a general potential in d = 3 dimensions is

$$v^{*}(r,\,\pi-\theta,\,\pi+\phi) = v(r,\theta,\phi). \tag{15}$$

it is obvious that central potentials v(r) = v(r) can be 𝒫𝒯-symmetric only if they are real, v*(r) = v(r), so the angular variables play an essential role in introducing an imaginary potential component. applying condition (15) to the general potential form (10), the prescriptions

$$v_{0}^{*}(r) = v_{0}(r), \qquad k^{*}(\pi+\phi) = k(\phi), \qquad \lambda_{k}^{*} = \lambda_{k} \tag{16}$$

are obtained, i.e. v₀(r) is real, k(φ) is 𝒫𝒯-symmetric and the eigenvalue of equation (6) is real. note that the reality of the potential v₀(r) implies that (9) has the same form as the radial schrödinger equation of a centrally symmetric self-adjoint quantum system; therefore the eigenvalues e also have to come out real. this means that the spontaneous breakdown of 𝒫𝒯 symmetry cannot be implemented in the present approach for non-central potentials in d = 3 dimensions.

according to (15) the 𝒫 operator can be factorized as 𝒫 = 𝒫_r 𝒫_θ 𝒫_φ (where, obviously, 𝒫_r = 1), so the 𝒫𝒯 transformation properties of the functions χ(r), Θ(θ) and Φ(φ) can also be studied. due to the arguments concerning (9) above, χ(r) can be chosen real, and in this case it obviously satisfies 𝒫_r 𝒯 χ(r) = χ(r). introducing an extra phase factor i^l in (8), it is possible to make Θ(θ) an eigenfunction of the 𝒫_θ 𝒯 operator with eigenvalue 1. a similar procedure also has to be applied to the Φ(φ) functions and the 𝒫_φ 𝒯 operator, but this can be done only with the exact knowledge of the k(φ) function. this guarantees that the full wavefunction ψ(r, θ, φ) in (3) is also an eigenfunction of the 𝒫𝒯 operator with unit eigenvalue. these phase choices are also reflected in the sign of the pseudo-norm of the eigenstates ψ(r, θ, φ), since ⟨ψ|𝒫𝒯ψ⟩ can be determined from the inner products calculated with the constituent functions χ(r), Θ(θ) and Φ(φ), using the appropriate 𝒫_i 𝒯 (i = r, θ, φ) operators. obviously, the contribution of the radial component will be 1, while it can be shown that with the phase convention described above, ⟨Θ|𝒫_θ𝒯Θ⟩ = (−1)^{l+m} holds. although the corresponding inner product for the Φ(φ) functions can be evaluated only with the knowledge of k(φ), similar inner products are known to exhibit oscillatory behaviour (−1)^n with respect to the principal quantum number for wide classes of 𝒫𝒯-symmetric potentials with an infinite number of eigenstates [7], [8], [9]. (note that for some potentials with a finite number of eigenstates this is not necessarily the case [10].) the sign of the pseudo-norm is thus indefinite for three-dimensional non-central 𝒫𝒯-symmetric potentials too, and it depends on the quantum numbers associated with the angular component of the eigenfunctions.

let us now discuss the conditions under which non-central potentials can be 𝒫𝒯-symmetric in d = 2 dimensions. the equivalent of (15) is now

$$v^{*}(\varrho,\,\pi+\phi) = v(\varrho,\phi), \tag{17}$$

and from (14) the conditions

$$v_{0}^{*}(\varrho) = v_{0}(\varrho), \qquad k^{*}(\pi+\phi) = k(\phi) \tag{18}$$

follow from the 𝒫𝒯 symmetry of v(ϱ, φ). the arguments on the 𝒫𝒯 symmetry of the wavefunction and the constituent functions are the same as in the three-dimensional case, as are those concerning the sign of the pseudo-norm. a major difference with respect to the three-dimensional case is that now λ_k can be complex too. since λ_k is the eigenvalue of (6), which itself can be considered a schrödinger equation with a 𝒫𝒯-symmetric potential k(φ), its complex eigenvalues occur in complex conjugate pairs. substituting λ_k and λ_k* into the radial schrödinger equation (13), one finds that the two equations are each other's complex conjugates, so their energy eigenvalues will also appear as each other's complex conjugates. this indicates that, similarly to the one-dimensional case, the spontaneous breakdown of 𝒫𝒯 symmetry leads to complex conjugate energy eigenvalues for d = 2 too.

4 summary

the most important results of this work are summarized in the table below.

                                 d = 2                              d = 3
state-independent potential      always                             λ_k = m² + a
central potential                k(φ) = const.                      k(φ) = a
𝒫𝒯-symmetric potential          v₀*(ϱ) = v₀(ϱ), k*(π+φ) = k(φ)     v₀*(r) = v₀(r), λ_k = λ_k*, k*(π+φ) = k(φ)
energy eigenvalues               real or complex conjugate pairs    real
sign of pseudo-norm              indefinite                         indefinite

5 acknowledgment

this work was supported by the otka grant no. t49646 (hungary).

references

[1] bender, c. m., boettcher, s.: real spectra in non-hermitian hamiltonians having 𝒫𝒯 symmetry. phys. rev. lett., vol. 80 (1998), no. 24, p. 5243–5246.
[2] j. phys. a: math. gen., vol. 39 (2006), no. 32.
[3] czech. j. phys., vol. 56 (2006), no. 9.
[4] lévai, g.: solvable 𝒫𝒯-symmetric potentials in higher dimensions. j. phys. a: math. theor., vol. 40 (2007), no. 15, p. f273–f280.
[5] edmonds, a. r.: angular momentum in quantum mechanics. princeton university press, princeton, 1957.
[6] ushveridze, a. g.: quasi-exactly solvable models in quantum mechanics. institute of physics publishing, bristol, 1994.
[7] trinh, t. h.: remarks on the 𝒫𝒯-pseudo-norm in 𝒫𝒯-symmetric quantum mechanics. j. phys. a: math. gen., vol. 38 (2005), no. 16, p. 3665–3678.
[8] bender, c. m., tan, b.: calculation of the hidden symmetry operator for a 𝒫𝒯-symmetric square well. j. phys. a: math. gen., vol. 39 (2006), no. 8, p. 1945–1954.
[9] lévai, g.: on the pseudo-norm and admissible solutions of the 𝒫𝒯-symmetric scarf i potential. j. phys. a: math. gen., vol. 39 (2006), no. 32, p. 10161–10170.
[10] lévai, g., cannata, f., ventura, a.: 𝒫𝒯 symmetry breaking and explicit expressions for the pseudo-norm in the scarf ii potential. phys. lett. a, vol. 300 (2002), no. 2–3, p. 271–281.

dr. géza lévai, phone: +36 52 509 298, fax: +36 52 416 181, e-mail: levai@namafia.atomki.hu
institute of nuclear research of the hungarian academy of sciences (atomki), bem ter 18/c, debrecen, 4026 hungary

notation and units

ε — complex permittivity (–)
ε₀ — permittivity of free space (ε₀ = 8.854×10⁻¹² f·m⁻¹)
εr — relative permittivity (–)
tg δ — loss (dissipation) factor (–)
ε∞ — optical permittivity (–)
εs — static permittivity (–)
ω — angular frequency (rad·s⁻¹)
τ — relaxation time (ps)
f — frequency (hz)
j — imaginary unit (j² = −1)
α — distribution parameter (–)
σi — static ionic conductivity (s·m⁻¹)
c — capacitance (f)
g — conductance (s·m⁻¹)
s-parameter — scattering parameter
s11 — reflection coefficient
mut — material under test
fit — finite integration technique
fft — fast fourier transformation
tem — transversal electric and magnetic wave

1 introduction

the dielectric properties of biological tissues are determining factors for the dissipation of electromagnetic energy in the human body, and therefore they are important parameters in hyperthermia treatment, in microwave detection of tumors and in the assessment of exposure doses in basic research on interactions between electromagnetic fields and biological tissues [1]. measurement of the dielectric parameters of biological tissue is a promising method in medical imaging and diagnostics. knowledge of the complex permittivity in an area under treatment, i.e. knowledge of the complex permittivity of healthy and tumor tissues, is very important, for example, in the diagnosis of tumor cell-nests in medical diagnostics or for engineers in the design of thermo-therapeutic applicators. other interesting applications are 3d reconstruction methods for various biological tissues based on the layered uniform tissue model (skin, fat and muscle).

evaluation of a reflection method on an open-ended coaxial line and its use in dielectric measurements

r. zajíček, j. vrba, k. novotný
this paper describes a method for determining the dielectric constant of a biological tissue. a suitable way to make a dielectric measurement that is nondestructive and noninvasive for the biological substance and broadband over the frequency range of the network analyzer is to use a reflection method on an open-ended coaxial line. a coaxial probe in the network analyzer frequency range from 17 mhz to 2 ghz is under investigation, and also a calibration technique and the behavior of the discrete elements in an equivalent circuit of an open-ended coaxial line. information about the magnitude and phase of the reflection coefficient at the interface between a biological tissue sample and a measurement probe is modeled with the aid of an electromagnetic field simulator. the numerical modeling is compared with real measurements, and a comparison is presented.

keywords: complex permittivity, reflection method, coaxial probe.

fig. 1: principle of the reflection method

2 materials and method

there are several methods for measuring dielectric properties [2]. if we want to use a broadband measurement method that is nondestructive, noninvasive and offers possibilities for in vivo as well as in vitro measurements, we should choose a reflection method on an open-ended coaxial line. the objective of the research reported here is to analyze an open-ended coaxial line sensor for in vivo and nondestructive measurements of complex permittivity, and to develop a precision measurement system. the interface between the measurement probe and the sample of biological tissue presents an impedance jump (fig. 1). biological tissue has extremely high permittivity values: at low frequencies, its relative permittivity is more than 100 and the value of the loss factor is more than 0.1. an exact evaluation is very difficult because the reflection coefficient r is close to 1. this means that only a very small part of the incident energy penetrates into the sample, and for this reason the obtainable information is very poor. the reflection method on the open end of a coaxial line is a well-known method for determining these dielectric parameters [3]. this method is based on the fact that the reflection coefficient of an open-ended coaxial line depends on the dielectric parameters of the material which is attached to it. to calculate the dielectric parameters from the measured reflection coefficient it is necessary to use an equivalent circuit of an open-ended coaxial line. to determine the values of the elements in this equivalent circuit we make a calibration using materials with known dielectric properties. a typical measurement system using a coaxial probe method consists of a network or impedance analyzer, a coaxial probe, and software. our measurements (fig. 2) were done with the aid of a sixport-type network analyzer in the frequency range from 17 mhz to 2 ghz.

fig. 2: measurement system

2.1 dielectric theory

the dielectric constant – the relative permittivity εr – in our case describes the interaction of a biological tissue with an electric field, and because of the lossy character of biological tissue it is a complex quantity:

$$\varepsilon_{r} = \frac{\varepsilon}{\varepsilon_{0}} = \varepsilon' - j\varepsilon'', \tag{1}$$

where εr is the relative permittivity and ε₀ is the permittivity of free space. the real part of the permittivity is a measure of how much energy from an external electric field is stored in a material. the imaginary part of the permittivity is a measure of how dissipative or lossy a material is to an external electric field. the dissipation factor is defined by

$$\operatorname{tg}\delta = \frac{\varepsilon''}{\varepsilon'}. \tag{2}$$

it is important to note that the complex permittivity is not constant; it changes with frequency, temperature, etc.

2.2 frequency dependence of complex permittivity

the dielectric spectrum of a tissue is characterized by three main relaxation regions α, β and γ at low, medium and high frequencies, and other minor dispersions such as the often reported δ dispersion. in its simplest form, each of these relaxation regions is the manifestation of a polarization mechanism characterized by a single time constant τ (the relaxation time τ is a measure of the mobility of the molecules and dipoles that exist in a material), which to a first-order approximation gives the following expression for the complex relative permittivity εr as a function of the angular frequency ω:

$$\varepsilon_{r}(\omega) = \varepsilon_{\infty} + \frac{\varepsilon_{s} - \varepsilon_{\infty}}{1 + j\omega\tau}. \tag{3}$$

this is the well-known debye expression, in which ε∞ is the optical permittivity at high field frequencies (where ωτ ≫ 1), εs is the static permittivity (at ωτ ≪ 1) and j² = −1. the dispersion magnitude is described as Δε = εs − ε∞. however, the complexity of both the structure and the composition of biological material is such that each dispersion region may be broadened by multiple contributions to it. the broadening of the dispersion can be empirically accounted for by introducing a distribution parameter, thus giving an alternative to the debye equation known as the cole-cole equation

$$\varepsilon_{r}(\omega) = \varepsilon_{\infty} + \frac{\Delta\varepsilon}{1 + (j\omega\tau)^{1-\alpha}}, \tag{4}$$

where the distribution parameter α is a measure of the broadening of the dispersion. the spectrum of a tissue may therefore be more appropriately described in terms of the multiple cole-cole dispersion

$$\varepsilon_{r}(\omega) = \varepsilon_{\infty} + \sum_{n}\frac{\Delta\varepsilon_{n}}{1 + (j\omega\tau_{n})^{1-\alpha_{n}}} + \frac{\sigma_{i}}{j\omega\varepsilon_{0}}, \tag{5}$$

which, with a choice of parameters appropriate to each tissue, can be used to predict the dielectric behavior over the desired frequency range. σi is the static ionic conductivity and ε₀ is the permittivity of free space. for distilled water, the dielectric constants are εs = 78.3, ε∞ = 4.6, τ = 8.07 ps, σi = 2 s·m⁻¹ and α = 0.014.

fig. 3: dielectric behaviour of distilled water at 30 °c
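equation (5) is straightforward to evaluate numerically. the following sketch implements the multi-term cole-cole model in python and checks it against the distilled-water constants quoted above; the code is an illustration, not the software used with the measurement system.

```python
import numpy as np

EPS0 = 8.854e-12  # permittivity of free space [f/m]

def cole_cole(freq, eps_inf, terms, sigma_i=0.0):
    """complex relative permittivity from the multiple cole-cole
    dispersion (eq. 5); terms is a list of (delta_eps, tau, alpha)."""
    w = 2.0 * np.pi * np.asarray(freq, dtype=float)
    eps = np.full(np.shape(w), eps_inf, dtype=complex)
    for d_eps, tau, alpha in terms:
        eps = eps + d_eps / (1.0 + (1j * w * tau) ** (1.0 - alpha))
    return eps + sigma_i / (1j * w * EPS0) if sigma_i else eps

# single-term check with the distilled-water constants
water = cole_cole(1.0e9, eps_inf=4.6,
                  terms=[(78.3 - 4.6, 8.07e-12, 0.014)])
print(water.real, -water.imag)   # ~78 and a small loss part at 1 ghz
```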
2.3 measurement probe

a coaxial probe is created by the open end of a transmission line. the material can be measured by touching this probe to the flat face of the material and determining the reflection coefficient. for this measurement method we have developed a new type of coaxial measurement probe. this probe was created by adapting a standard n-connector (fig. 4), from which the parts for connecting to a panel were removed. the reflected signal s11 can be measured and related (eq. 6) to the complex permittivity εr using an equivalent circuit of an open-ended coaxial line:

$$y = j\omega\varepsilon_{r}c + g\,\varepsilon_{r}^{5/2}. \tag{6}$$

the equivalent circuit consists of two elements:

• c is the capacitance between the internal and external conductors of the coaxial structure,
• g is the conductance, which represents the radiation (propagation) losses.

the measurement probe radiates at higher frequencies (fig. 6), so not only the fringing capacitance c must be taken into account.

fig. 4: n-connector and probe model
fig. 5: equivalent circuit of a measurement probe
fig. 6: frequency behaviour of the equivalent circuit elements
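given the calibrated values of c and g, eq. (6) can be inverted for εr from a measured reflection coefficient. a minimal sketch follows; the aperture admittance is obtained from s11 through the usual transmission-line relation y = (1 − s11)/((1 + s11) z₀), the exponent 5/2 follows the reconstruction of eq. (6) above, and the numerical values of the permittivity and of the calibration constants in the example are hypothetical.

```python
import numpy as np

def s11_from_eps(eps, freq, c_probe, g_probe, z0=50.0):
    """forward model: reflection coefficient of the probe terminated by eps."""
    w = 2.0 * np.pi * freq
    y = 1j * w * c_probe * eps + g_probe * eps ** 2.5
    return (1.0 - z0 * y) / (1.0 + z0 * y)

def permittivity_from_s11(s11, freq, c_probe, g_probe, z0=50.0):
    """invert  y = j*w*eps*c + g*eps**2.5  (eq. 6) for the complex
    relative permittivity eps by newton iteration."""
    w = 2.0 * np.pi * freq
    y = (1.0 - s11) / ((1.0 + s11) * z0)   # measured aperture admittance
    eps = 50.0 + 0.0j                       # starting guess for a tissue
    for _ in range(100):
        f = 1j * w * c_probe * eps + g_probe * eps ** 2.5 - y
        df = 1j * w * c_probe + 2.5 * g_probe * eps ** 1.5
        eps = eps - f / df
    return eps

# round-trip check with a hypothetical tissue value and calibration constants
s11 = s11_from_eps(60.0 - 15.0j, 1.0e9, c_probe=0.03e-12, g_probe=2.0e-6)
print(permittivity_from_s11(s11, 1.0e9, c_probe=0.03e-12, g_probe=2.0e-6))
# ~ (60-15j)
```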
3 numerical modeling

a feasibility study of this measurement method involves numerical calculations and modeling (fig. 7). the system that we modeled consisted of two parts, i.e. the sensor and the biological tissue. in order to model the n-connector, its dimensions were measured and the available catalogue data were studied. the biological tissue sample was modeled on the basis of available published data on the relative permittivity εr, the loss factor tg δ and the conductivity σ. a numerical simulation based on the finite integration technique (fit) [5] is used to calculate the reflection coefficient at the interface between the coaxial probe and a sample of biological tissue. fit is a discretization method which transforms maxwell's equations in their integral form onto a dual cell grid complex, resulting in a set of discrete matrix equations. the structure is excited in the time domain with a gaussian pulse. by performing the fourier transformation (fft) and division we can obtain the response of the structure in the frequency domain; a small illustration of this step is given below. the properties and duration of this time signal are taken into account in order to obtain accurate and correct results. the waveguide port is used to feed the calculation with power and to absorb the returning power. for the waveguide port, the time signals and s-parameters (s11) are recorded during a solver run. the numerical results are then compared with real measurements. let us summarize the input parameters of the simulator:

• the frequency range of the numerical calculation is given by the network analyzer frequency range from 17 mhz to 2 ghz,
• the wave port is used as the feeding element because of the coaxial structure of the probe and the excitation of the transversal electric and magnetic (tem) mode,
• the outputs of the simulation are the s11-parameter (reflection coefficient) and a visualization of the magnitude of the electric field,
• the model of the mut (biological tissue) is defined by the dielectric parameters found in the literature (table 1).

fig. 7: model of the measurement system
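the fft post-processing step can be illustrated with a few lines of numpy: the recorded incident and reflected time signals are transformed and divided to give s11(f). the gaussian pulse and the "reflected" echo below are synthetic stand-ins for the solver signals.

```python
import numpy as np

def s11_from_time_signals(incident, reflected, dt):
    """s11(f) by fourier-transforming the recorded port signals and
    dividing the reflected spectrum by the incident one."""
    freq = np.fft.rfftfreq(len(incident), dt)
    return freq, np.fft.rfft(reflected) / np.fft.rfft(incident)

dt, n = 5.0e-12, 4096
t = np.arange(n) * dt
incident = np.exp(-((t - 2.0e-10) / 5.0e-11) ** 2)          # gaussian excitation
reflected = 0.8 * np.exp(-((t - 3.0e-10) / 5.0e-11) ** 2)   # toy echo
f, s11 = s11_from_time_signals(incident, reflected, dt)
print(abs(s11[np.searchsorted(f, 1.0e9)]))                  # ~0.8
```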
4 measurements

the dielectric measurements are very fast and proceed in three steps. first the vector analyzer is calibrated. then the calibration is checked using a reference material with a known dielectric constant ε (e.g. distilled water, fig. 3), and last but not least the reflection coefficient of the biological tissue is measured. the complex permittivity of the biological tissue is evaluated using a personal computer and mathematical software (e.g. matlab) to solve equation (6). we measured two samples of biological tissue:

• sample a: a biological tissue of beef,
• sample b: values measured on the author's arm.

5 results

a comparison of the tabulated, modeled and calculated values follows.

5.1 comparison of relative permittivity

the relative permittivity is a heavily frequency-dependent quantity. because of the decreasing ability of the particles to follow rapid changes of the electric field, the relative permittivity decreases with increasing frequency.

table 1: tabulated dielectric values of muscle biological tissue

f (mhz) | εr (–) | tg δ (–) | σ (s·m⁻¹)
10      | 170.7  | 6.5      | 0.62
20      | 110.6  | 5.2      | 0.64
50      | 77.1   | 3.2      | 0.68
70      | 70.8   | 2.5      | 0.69
100     | 65.9   | 1.9      | 0.71
200     | 60.2   | 1.1      | 0.74
415     | 57.0   | 0.61     | 0.80
915     | 55.0   | 0.34     | 0.95
1500    | 54.0   | 0.26     | 1.18
2000    | 53.0   | 0.25     | 1.45

fig. 8: comparison of relative permittivity for muscle tissue; at low frequencies the real part of the relative permittivity has a value of about 200, and with increasing frequency this goes down to a value of 50

5.2 comparison of the loss factor

the loss factor tg δ is also a frequency-dependent parameter. at low frequencies the loss factor of muscle tissue has a value of about 5, and with increasing frequency this value goes down to 0.35.

fig. 9: comparison of the loss factor

6 discussion

the measurement method reported on here has an inconvenient calibration step, needing materials with known dielectric parameters. the total accuracy of the final determination of the complex permittivity depends on accurate knowledge of these parameters. we very often use alcohols as calibration materials. alcohols are hygroscopic liquids, so their properties drift in time and they are unstable calibration standards. in addition, their parameters are tabulated only for 100 % pure alcohols. their use in calibration therefore makes the measurements imprecise. we will go on to study the relation between the dielectric constant of biological tissue and the reflection coefficient. by exploring the mathematical and physical model of the reflection coefficient measurement at the interface between the biological tissue and the measurement sensor we can make the measurements more accurate. we will construct and compare probes for various frequency bands. this will ultimately lead to better accuracy in the determination of the relative permittivity. the measurement will thus be more sensitive and therefore more precise.

7 conclusion

measuring the complex permittivity is a promising method for medical diagnostics and for preparing treatment with the use of an electromagnetic field. it is important to know the complex permittivity in the treatment area in order to design and match applicators for microwave thermotherapy. the reflection method on an open-ended coaxial line is suitable for determining the dielectric parameters of biological tissue.

acknowledgments

this work has been conducted at the department of electromagnetic field at the czech technical university in prague and has been supported by ctu grant no. ctu0607013.

references

[1] vrba, j.: medical applications of microwave techniques. (in czech) prague: ctu in prague publishing house, 2003.
[2] oppl, l.: measurement of dielectric properties. (in czech) prague: dissertation thesis, dept. of em field, ctu in prague, 2001.
[3] zajíček, r.: dielectric parameter measurements of biological tissue. (in czech) prague: diploma thesis, dept. of em field, ctu in prague, 2005.
[4] gabriel, s., lau, r. w., gabriel, c.: the dielectric properties of biological tissues: iii. parametric models for the dielectric spectrum of tissues. phys. med. biol., vol. 41 (1996), p. 2271–2293.
[5] hazdra, p., hudlička, m.: finite integration technique, modeling of fields. prague: ieee czechoslovakia section, 2006, p. 41–48.

ing. radim zajíček, e-mail: zajicer1@fel.cvut.cz
prof. ing. jan vrba, csc.
doc. ing. karel novotný, csc.
department of electromagnetic field, http://www.elmag.org, http://www.hypertermie.cz
czech technical university in prague, faculty of electrical engineering, technická 2, 166 27 prague 6, czech republic

acta polytechnica 62(1):16–22, 2022 · https://doi.org/10.14311/ap.2022.62.0016
© 2022 the author(s). licensed under a cc-by 4.0 licence. published by the czech technical university in prague.

on some algebraic formulations within universal enveloping algebras related to superintegrability

rutwig campoamor-stursberg

universidad complutense de madrid, instituto de matemática interdisciplinar–ucm, plaza de ciencias 3, e-28040 madrid, spain
correspondence: rutwig@ucm.es

abstract:
we report on some recent purely algebraic approaches to superintegrable systems from the perspective of subspaces of commuting polynomials in the enveloping algebras of lie algebras that generate quadratic (and eventually higher-order) algebras. in this context, two algebraic formulations are possible: a first one strongly dependent on representation theory, as well as a second formal approach that focuses on the explicit construction, within commutants, of algebraic integrals for appropriate algebraic hamiltonians defined in terms of suitable subalgebras. the potential use in this context of the notion of virtual copies of lie algebras is briefly commented on.

keywords: enveloping algebras, commutants, quadratic algebras, superintegrability.

1. introduction

both the study of (quasi-)exactly solvable systems and that of superintegrable systems make extensive use of the universal enveloping algebras of lie algebras, either in the context of the so-called hidden algebras or as symmetry algebras of the system. of particular interest are those systems that, beyond superintegrability properties, also belong to the class of (quasi-)exactly solvable systems [1–5]. in particular, quadratic subalgebras have been shown to be a powerful tool for classifying and comparing superintegrable systems, as shown in [6], where the scheme of superintegrable systems on a two-dimensional conformally flat space has been characterized in terms of contractions. additional examples in higher dimensions [7] lead us to suspect that n-dimensional superintegrable systems are somehow associated to (higher rank) polynomials in a suitable enveloping algebra [8], further stimulating the search for alternative algebraic approaches based on the structural properties of enveloping algebras. although the precise fundamental properties of enveloping algebras of generic semidirect sums of simple and solvable lie algebras are still far from being completely understood, a purely formal ansatz applied to the case of the schrödinger algebras ŝ(n) has recently been shown to provide some interesting features [9]. in this work we comment on some purely algebraic approaches, formulated in the enveloping algebras of lie algebras, for the identification or construction of quadratic algebras that may lead to superintegrable systems once a suitable realization of the enveloping algebra by first-order differential operators has been chosen. the motivation for this analysis lies primarily in the inspection of superintegrable systems from the point of view of the algebraic properties of first integrals seen as elements of an enveloping algebra, as well as in an attempt to determine to which extent these integrals are characterized algebraically by the hidden algebra [10]. this moreover suggests a realization-free description of systems in terms of commutants of algebraic hamiltonians in enveloping algebras [11], in which elements of the coadjoint representation of lie algebras may be useful to simplify computations.

2. first algebraic reformulation

in the context of (quasi-)exactly solvable problems, the hamiltonians are described as differential operators in p variables that admit an expression as elements in the enveloping algebra of a lie algebra g, commonly known as the hidden algebra, not necessarily associated to any symmetry algebra of the system.
the main requirement is the existence of a representation of g that is invariant for the hamiltonian, a constraint that allows us to determine its spectrum (either partially or completely) using algebraic methods [12]. so, for example, the universal enveloping algebra of the simple lie algebra sl(2, r) and its realization as first-order differential operators on the real line provide a characterization of quasi-exactly solvable one-dimensional systems [13]. a second type of systems that uses the structural properties of enveloping algebras is given by superintegrable systems, where both the hamiltonian and the constants of the motion are interpreted in the enveloping algebra of some lie algebra g. merely integrable n-dimensional systems can be interpreted as the image, via a realization φ by first-order differential operators, of an abelian subalgebra a of u(g), while superintegrable systems would correspond to nonabelian extensions of a. the problem under what conditions a system exhibits both superintegrability and (quasi-)exact solvability has been analyzed in detail, and large classes of superintegrable systems that are exactly solvable have been found (see [3, 14, 15] and references therein).

a first algebraic formulation, as developed in [10], is motivated by the use of quadratic algebras in the context of superintegrable (and exactly solvable) systems with a given hidden algebra g [3]. to this extent, we consider a hamiltonian h expressed in terms of a subalgebra m ⊂ g via a realization φ by differential operators of the lie algebra g:

$$h = \sum_{i,j=1}^{\dim\mathfrak m} \alpha_{ij}\,\varphi(x_i)\varphi(x_j) + \sum_{k=1}^{\dim\mathfrak m} \beta_k\,\varphi(x_k) + \gamma_0, \tag{1}$$

where α_ij, β_k, γ_0 are constants and {x_1, ..., x_{dim m}} is a basis of m. in this context, the hamiltonian h is obtained as the image of a quadratic element h_a in the universal enveloping algebra u(m) ⊂ u(g). similarly, the (independent) constants of the motion φ_1, ..., φ_s can also be rewritten as the images of elements in the enveloping algebra u(g). as differential operators they satisfy the commutators

$$[h, \phi_j] = 0, \qquad 1 \le j \le s. \tag{2}$$

the commutators [φ_i, φ_j] provide additional (dependent) higher-order constants of the motion. a specially interesting case is given whenever the first integrals generate a quadratic algebra. abstracting from the specific realization φ, and focusing merely on the underlying algebraic formulation, the formal polynomial

$$h_a = \sum_{i,j=1}^{\dim\mathfrak m} \alpha_{ij}\,x_i x_j + \sum_{k=1}^{\dim\mathfrak m} \beta_k\,x_k + \gamma_0$$

in the enveloping algebra u(m) of m allows us to recover the hamiltonian h of the system once the generators are realized by the differential operators. in an analogous form, we can find elements j_1, ..., j_s in u(g) that correspond, via the realization φ, to the first integrals φ_1, ..., φ_s of the system. while for the initial system the relations [h, φ_k] = 0, 1 ≤ k ≤ s, are ensured, there is no necessity that the polynomials j_k commute with h_a in u(g), although the relation

$$[h_a, j_k] = 0 \pmod{\varphi} \tag{3}$$

is satisfied. similarly, for the polynomial relations $[\phi_i, \phi_j] = \alpha_{ij}^{k\ell}\,\phi_k \phi_\ell + \beta_{ij}^{k}\,\phi_k$ of the first integrals, the commutators in u(g) lead to the relation

$$[j_i, j_j] = \alpha_{ij}^{k\ell}\,j_k j_\ell + \beta_{ij}^{k}\,j_k \pmod{\varphi}. \tag{4}$$

if equations (3) and (4) are satisfied for any realization φ, then the problem is entirely characterized algebraically by the reduction chain m ⊂ g.
it should be observed that this situation is rather exceptional, as the analysis of the exactly solvable systems described in [3] from the point of view of the first algebraic formulation indicates that, in general, the first integrals of the system do not correspond, at the level of the enveloping algebra of the hidden algebra, to polynomials that commute with the algebraic hamiltonian, showing that the commutativity properties are a consequence of the realization by differential operators. using the correspondence existing between the representations of g and those of its enveloping algebra u(g) (see e.g. [11]) and identifying a lie algebra g with the first-order (left-invariant) differential operators on a lie group g admitting g as its lie algebra, it follows that the universal enveloping algebra can be seen as the set of (left-invariant) differential operators on g of arbitrary order. therefore, if φ : g → x(rⁿ) is some realization of the lie algebra by first-order differential operators, it can be uniquely extended to a realization φ̂ : u(g) → x(rⁿ). in this context, this first algebraic reformulation of the system is still strongly related to the representation theory of lie algebras. more precisely, suppose that h_a ∈ u(m) is an algebraic hamiltonian defined in the enveloping algebra of some subalgebra m ⊂ g and that the (independent) polynomials j_1, ..., j_s generate a quadratic algebra, that is, satisfy the conditions

$$[j_i, j_j] = \alpha_{ij}^{k\ell}\,j_k j_\ell + \beta_{ij}^{k}\,j_k. \tag{5}$$

we consider the (two-sided) ideal i in u(g) generated by the polynomials q_i := [h_a, j_i], 1 ≤ i ≤ s. the problem is now to analyze whether there exists an equivalence class of (faithful) representations φ : g → x(rⁿ) such that for the corresponding extension φ̂ : u(g) → x(rⁿ), the image of the ideal i is contained in the kernel ker φ̂, ensuring that the realized polynomials φ̂(q_i) correspond to first integrals of the hamiltonian in the given realization. in some sense, this is a special case of an important and still unsolved problem, namely the embedding of a lie algebra g into the enveloping algebra u(k) of another lie algebra k, for which currently only the case of embeddings ι : g → u(g) for g semisimple has been completely solved [16], using techniques of deformation theory [17]. we illustrate the preceding procedure considering the six-dimensional non-solvable lie algebra r ⊂ sl(3, r) with basis {x_1, ..., x_6} and the nonvanishing commutators

[x_1, x_2] = x_1, [x_1, x_5] = x_4, [x_2, x_5] = x_5, [x_2, x_6] = −x_6, [x_3, x_4] = −x_4, [x_3, x_5] = −x_5, [x_3, x_6] = x_6, [x_4, x_6] = x_1, [x_5, x_6] = x_2 − x_3.

superintegrable systems based on this hidden algebra r and the vector field realization

$$x_1 = \partial_t, \quad x_2 = t\partial_t - \tfrac{n}{3}, \quad x_3 = s\,u\,\partial_u - \tfrac{n}{3}, \quad x_4 = \partial_u, \quad x_5 = t\,\partial_u, \quad x_6 = u\,\partial_t \tag{6}$$
therefore, the origin of the quadratic integrals of system (7) is not algebraic, but a consequence of the specific realization (6). if we maintain the algebraic hamiltonian as given above and search for quadratic polynomials in u (r) commuting with it, we find that only two such operators exist (see [10] for the general case), given by a1 = x4 − x3 − x6 + x1(1 + x3 + x6) + (x3 + x6)x4, b1 = −4x1 − x2 + x6 + x1x2 + x1x3 − x1x6 − x6x4. these polynomials are not independent, as they satisfy the relation a1 +b1 + 14 h1 = 0. now, if we extend the analysis to cubic polynomials in u (r), we find the following operator c1 that commutes with h1: c1 = 3x1 − 2x3 − x5 − 4x6 + 2x1x3 + 4x1x6 + x 23 + x2x4 + x 23 − x3x5 + x3x6 + x6x4 − x6x5 − x1x 23 − x1x3x6 + x2x3x4 + x2x6x4. the operators a1 and c1 generate a finitedimensional polynomial algebra in u (r), with explicit nonvanishing commutators [a1, c1] = d1, [a1, d1] = d1, [b1, c1] = −d1, [c1, d1] = 1 2 {b1, d1} − 1 2 {a1, d1} − 12a1 − 12a1 + 4b1 + 4c1 − 2{a1, b1}, where {◦, ◦} is the anticommutator. now, as the operators h1, a1, c1 commute at the algebraic level, for any realization of r by vector fields they give rise to a hamiltonian system possessing a quadratic and a cubic integral, respectively.1 for the particular realization (6), it follows that the resulting system is actually equivalent to the initial one (7), as the image of the ideal j generated by a1, b1, c1, d1 is properly contained in the ideal spanned by φ1 and φ2, thus being functionally dependent on these integrals. 1provided that the transformed operators are independent. 3. commutants in enveloping algebras and coadjoint representations a second algebraic approach, of a more general nature, can be proposed considering chain reductions g′ ⊂ g of (reductive) lie algebras, and analyzing the structure of the commutant of g′ in the enveloping algebra u (g), in order to identify polynomial (in particular, quadratic) subalgebras [9]. in the generic analysis of commutants, elements of the theory the coadjoint representation of lie algebras can be used, in order to simplify some of the computations in enveloping algebras. if g is a lie algebra with generators {x1, . . . , xn} and commutators [xi, xj ] = c kij xk, the xi’s are realized in the space c ∞ (g∗) by means of the first-order differential operators: x̂i = c kij xk ∂ ∂xj , (8) where {x1, . . . , xn} are the coordinates of a covector in a dual basis of r {x1, . . . , xn}. the invariants of g (in particular, the casimir operators) correspond to the solutions of the following system of partial differential equations: x̂if = 0, 1 ≤ i ≤ n. (9) for an embedding of lie algebras f : g′ → g, a basis {x1, . . . , xr } of the subalgebra can be extended to a basis {x1, . . . , xn} of g. therefore, we can consider the subsystem formed by the first r equations of (9), corresponding to the generators of the subalgebra g′. the solutions of this subsystem, that in particular encompass the invariants of g′, are usually called subgroup scalars [18]. by means of the standard symmetrization map λ ( xi1 . . . xip ) = 1 p! ∑ σ∈sp xσ(i1) . . . xσ(ip) (10) polynomial solutions of the subsystem correspond to elements in the enveloping algebra u (g) of g that commute with the subalgebra g′. if we now define an algebraic hamiltonian h = h (x1, . . . , xr ) ∈ u (g′), (11) in terms of the subalgebra generators, the commutant cu (g)(h) = {u ∈ u (g) | [h, u ] = 0} certainly includes the solutions of (9) common to the g′-generators, i.e. 
3. commutants in enveloping algebras and coadjoint representations

a second algebraic approach, of a more general nature, can be proposed by considering reduction chains g′ ⊂ g of (reductive) lie algebras and analyzing the structure of the commutant of g′ in the enveloping algebra u(g), in order to identify polynomial (in particular, quadratic) subalgebras [9]. in the generic analysis of commutants, elements of the theory of the coadjoint representation of lie algebras can be used in order to simplify some of the computations in enveloping algebras. if g is a lie algebra with generators {x_1, ..., x_n} and commutators [x_i, x_j] = c_{ij}^k x_k, the x_i's are realized in the space c^∞(g*) by means of the first-order differential operators

$$\hat x_i = c_{ij}^{k}\,x_k\,\frac{\partial}{\partial x_j}, \tag{8}$$

where {x_1, ..., x_n} are the coordinates of a covector in a dual basis of {x_1, ..., x_n}. the invariants of g (in particular, the casimir operators) correspond to the solutions of the following system of partial differential equations:

$$\hat x_i f = 0, \qquad 1 \le i \le n. \tag{9}$$

for an embedding of lie algebras f : g′ → g, a basis {x_1, ..., x_r} of the subalgebra can be extended to a basis {x_1, ..., x_n} of g. therefore, we can consider the subsystem formed by the first r equations of (9), corresponding to the generators of the subalgebra g′. the solutions of this subsystem, which in particular encompass the invariants of g′, are usually called subgroup scalars [18]. by means of the standard symmetrization map

$$\lambda\left(x_{i_1}\cdots x_{i_p}\right) = \frac{1}{p!}\sum_{\sigma\in s_p} x_{\sigma(i_1)}\cdots x_{\sigma(i_p)}, \tag{10}$$

polynomial solutions of the subsystem correspond to elements in the enveloping algebra u(g) of g that commute with the subalgebra g′. if we now define an algebraic hamiltonian

$$h = h(x_1, \ldots, x_r) \in u(\mathfrak g'), \tag{11}$$

in terms of the subalgebra generators, the commutant c_{u(g)}(h) = {u ∈ u(g) | [h, u] = 0} certainly includes the solutions of (9) common to the g′-generators, i.e.

$$c_{u(\mathfrak g)}(h) \supset \left\{\lambda(\phi)\;\middle|\;\hat x_1(\phi) = \cdots = \hat x_r(\phi) = 0\right\},$$

where φ(x_1, ..., x_n) ∈ c^∞(g*). depending on the structure of g and the subalgebra g′, as well as on the choice of h, two possible cases arise for a polynomial p ∈ c_{u(g)}(h):

(1.) p commutes with all x_1, ..., x_r;
(2.) there is an index k_0 with [p, x_{k_0}] ≠ 0.

polynomials p in the first case actually commute with the hamiltonian h, and thus belong to the two-sided ideal ⟨i⟩ generated by the set i = {j_1, ..., j_s} of elements corresponding to the symmetrization of independent polynomials satisfying the subsystem of (9) corresponding to g′. for these elements, it follows at once that [j_k, j_ℓ] belongs to i. in the general case, the hamiltonian h does not commute with all x_j-generators, and in order to find the commutant c_{u(g)}(h) we can restrict the analysis to the determination of a basis of the factor module c_{u(g)}(h)/⟨i⟩. although the problem is computationally cumbersome, certain algorithms in terms of gröbner bases have been developed that allow its precise determination [19]. a (restricted) systematic procedure that circumvents the above-mentioned obstruction and allows us to analyze polynomial algebras with respect to a reduction chain g′ ⊂ g can be proposed starting from the polynomials in u(g) that commute with all the generators intervening in the expression of the algebraic hamiltonian h ∈ u(g′). more precisely, if the hamiltonian h is given as a polynomial p(x_{i_1}, ..., x_{i_s}) in terms of the generators of the subalgebra g′ with basis {x_1, ..., x_r}, we consider the subsystem of (9) given by

$$\hat x_{i_j} f(x_1, \ldots, x_n) = 0, \qquad 1 \le j \le s.$$

we then extract a maximal set of independent polynomial solutions {q_1, ..., q_p} of this subsystem, which in the reductive case forms an integrity basis for the solutions. symmetrizing these functions, we obtain elements m_j in the commutant c_{u(g)}(h). starting from the set of polynomials s = {h, m_1, ..., m_p}, we inspect their commutators and determine whether, either adjoining new (dependent) elements to s or discarding some elements of s, a finite-dimensional quadratic algebra a can be found. although there is some ambiguity in the construction, as there is no quadratic algebra "canonically" associated to the reduction chain g′ ⊂ g, it provides an alternative method that does not require a specific realization by vector fields, as the integrability condition is guaranteed by the commutant. this ansatz has been successfully applied in [9] to the enveloping algebra of the schrödinger algebras ŝ(n) for arbitrary values of n ≥ 1 and various choices of the algebraic hamiltonian, showing that the construction is formally of use for the analysis of hidden algebras that are not reductive.
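the operators (8) are straightforward to build from a structure-constant table. the sketch below (sympy, for illustration) constructs them for sl(2, r) and verifies that the quadratic casimir element, written in the dual coordinates, is annihilated by all three operators, i.e. that it solves the system (9).

```python
import sympy as sp

def coadjoint_fields(struct, coords):
    """first-order operators  x_i -> c_ij^k x_k d/dx_j  (eq. 8), where
    struct[i][j][k] = c_ij^k are the structure constants."""
    n = len(coords)
    ops = []
    for i in range(n):
        ops.append(lambda F, i=i: sum(struct[i][j][k] * coords[k]
                                      * sp.diff(F, coords[j])
                                      for j in range(n) for k in range(n)))
    return ops

# sl(2, r) with [e, f] = h, [h, e] = 2e, [h, f] = -2f  (basis order e, f, h)
e, f, h = sp.symbols('e f h')
c = [[[0] * 3 for _ in range(3)] for _ in range(3)]
c[0][1][2], c[1][0][2] = 1, -1          # [e, f] = h
c[2][0][0], c[0][2][0] = 2, -2          # [h, e] = 2e
c[2][1][1], c[1][2][1] = -2, 2          # [h, f] = -2f
ops = coadjoint_fields(c, (e, f, h))
casimir = h**2 + 4*e*f                  # quadratic casimir in dual coordinates
print([sp.simplify(op(casimir)) for op in ops])   # [0, 0, 0]
```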
4. virtual copies in enveloping algebras

in the solution of the embedding problem into enveloping algebras for semisimple algebras, the vanishing of the first cohomology group with values in u(g) plays an important role, as it allows one to provide a general solution for the perturbation problem [16]. for nonsemisimple lie algebras, the application of the procedure is quite complicated, for both computational reasons and the currently incomplete understanding of the precise structure of the corresponding enveloping algebras. however, for certain types of semidirect sums of simple and solvable lie algebras, some analogous statements may be proposed, providing copies of semisimple lie algebras in the enveloping algebra of a semidirect sum, up to a polynomial factor. supposing that s is the levi subalgebra of a semidirect sum g = s ⊕_Γ r, we seek elements of degree d ≥ 2 in the generators in u(g) that transform according to the structure tensor of s, up to a (polynomial) factor. the procedure can be summarized as follows: consider a basis {x_1, ..., x_n} of s with commutators

$$[x_i, x_j] = c_{ij}^{k}\,x_k, \tag{12}$$

and extend it to a basis {x_1, ..., x_n, y_1, ..., y_m} of the semidirect sum g. we now define operators

$$x_i' = x_i\,r(y_1, \ldots, y_m) + p_i(y_1, \ldots, y_m) \tag{13}$$

in u(g), where p_i and r are still undetermined polynomials. in order to simplify the computations, they can be considered as homogeneous polynomials of degrees k and k − 1, respectively, so that x_i' is homogeneous of degree k. we require that these operators commute with the generators y_k of the radical r, so that the identity [x_i', y_j] = 0, 1 ≤ i ≤ n, 1 ≤ j ≤ m, is satisfied for all indices. expanding the latter leads to the expression

$$[x_i', y_j] = [x_i r, y_j] + [p_i, y_j] = x_i\,[r, y_j] + [x_i, y_j]\,r + [p_i, y_j].$$

taking into account the homogeneity degree of the terms with respect to the generators of s and the representation space, it follows that x_i [r, y_j] can be seen as a polynomial of degree (k − 1) in the variables {y_1, ..., y_m}, while the terms of [x_i, y_j] r + [p_i, y_j] have degree k, allowing us to further separate the commutator as

$$[r, y_j] = 0, \qquad [x_i, y_j]\,r + [p_i, y_j] = 0. \tag{14}$$

from the first equation we conclude that the factor r commutes with all generators y_i, and thus defines an invariant of the solvable lie algebra r. we further require that the operators x_i' transform under the action of s as the generators of the latter algebra, i.e.

$$[x_i', x_j] = [x_i, x_j]' := c_{ij}^{k}\left(x_k r + p_k\right). \tag{15}$$

as this relation must hold for all the generators of the semidirect sum g, further structural constraints on the polynomials r and p_i are obtained. expanding the left-hand term of condition (15) yields

$$[x_i', x_j] = [x_i, x_j]\,r - x_i\,[x_j, r] + [p_i, x_j].$$

as the y_j are the generators of the representation space Γ, it follows that the term [x_i, x_j] r − x_i [x_j, r] is linear in the generators of s and of degree (k − 1) in the y_j's, while [p_i, x_j] does not involve generators of s. comparing now with the right-hand side of (15), the condition again separates into two parts:

$$[x_i, x_j]\,r - x_i\,[x_j, r] = c_{ij}^{k}\,x_k r, \qquad [p_i, x_j] = c_{ij}^{k}\,p_k. \tag{16}$$

simplifying the first equation shows that x_i [x_j, r] = 0, hence implying that r also commutes with the generators of the lie algebra. as r corresponds simultaneously to an invariant polynomial of the radical, it must correspond to an invariant of g that depends only on the generators of its maximal solvable ideal (this fact actually provides information concerning the dimension of the characteristic representation Γ in the semidirect sum). the second equation shows that the polynomials p_i transform according to the adjoint representation of the semisimple lie algebra s. supposing that all the conditions are satisfied, we obtain the commutators of the operators x_i' in the enveloping algebra u(g) as

$$[x_i', x_j'] = [x_i r + p_i,\, x_j r + p_j] = [x_i r + p_i,\, x_j r] + [x_i r + p_i,\, p_j] = c_{ij}^{k}\,x_k r^2 + c_{ij}^{k}\,p_k r + [x_i', p_j]. \tag{17}$$
(17) as the x ′i commute with the yj , it follows from equation (17) that [x ′i , pj ] = 0 and therefore that[ x ′i , x ′ j ] = [xi, xj ] ′ r, showing that the operators reproduce the commutators of s, up to the invariant factor r. it should be emphasized that r is not necessarily a central element, but an invariant of g that solely depends on the generators of the characteristic representation γ. it follows in particular from this construction that the operators {r, x ′1, . . . , x ′n} generate a finite dimensional quadratic algebra a in the enveloping algebra u (g), with commutators [r, x ′i ] = 0, [ x ′i , x ′ j ] = c kij x ′ kr, 1 ≤ i, j, k ≤ n. under some specific conditions, these so-called virtual copies of semisimple lie algebras in enveloping algebras can be used to construct (formal) hamiltonians with first integrals given by some of the operators x ′i . let us outline one possibility, based on the branching rules of representations of semisimple lie algebras. to this extent, we fix a semisimple subalgebra s′ of the levi factor s of the semidirect sum g. further suppose that the adjoint representation ad(s) decomposes, as a representation of s′, as follows ad(s) ↓ ad(s′) + γ1 + · · · + γs, (18) 2this fact actually provides information concerning the dimension of the characteristic representation γ in the semidirect sum. where γ = γ1 + · · · + γs is the so-called characteristic representation [20]. suppose that the trivial representation γ0 of s′ has multiplicity k > 0 in the decomposition (18). this means specifically that we can find k generators { x̃1, . . . , x̃k } of s that commute with the subalgebra s′. now, by condition (15), for the corresponding operators x̃s (1 ≤ s ≤ k) we have that [ x̃ ′i , z ] = [ x̃i, z ]′ = 0, z ∈ s′, (19) from which it follows that for any algebraic hamiltonian h ∈ u (s′) the integrability condition[ x̃ ′i , h ] = [r, h] = 0, 1 ≤ i ≤ k (20) is satisfied. on the other hand, by condition (17), it is straightforward to verify that[ h, [ x̃ ′i , x̃ ′ j ]] = 0. (21) this last identity implies that the terms appearing in the commutator [ x̃ ′i , x̃ ′ j ] also transform according to the trivial representation of the subalgebra s′. we conclude that the set { r, x̃ ′1, . . . , x̃ ′ k } generates a finite-dimensional quadratic algebra in the enveloping algebra u (g) that are (formal) first integrals for the hamiltonian h. whether or not these integrals are sufficient for guaranteeing (super-)integrability, essentially depends on the subalgebra s′ and the associated branching rule. in any case, the preceding construction determines the maximal number of operators x ′i that commute with the hamiltonian h, independently of any realization of the hidden algebra g by first-order differential operators. for the case where the characteristic representation γ does not contain the trivial representation of the subalgebra s′, i.e., when no generators of s simultaneously commute with the elements of s′, the integrability condition for the operators would not be a consequence of the structure of the enveloping algebra, but the specific consequence of a realization of g, relating this approach with the first algebraic formulation. 
we finally observe that the construction presented here, that depends essentially on the homogeneity of the operators x ′i , is specially suitable for semidirect sums admitting a nonvanishing centre and the class of one-dimensional non-central extensions of double inhomogeneous lie algebras [21, 22], while the argument is not valid whenever the levi factor s and the radical do not have nonconstant invariants in common. due to this obstruction, it is formally conceivable to propose a generalized construction by skipping the homogeneity assumption. it should however be taken into account that using operators of different degrees in (13) may lead to incompatibilities in the commutators, as equations (14)-(16) cease to hold, and more general constraints depending on the particular degrees of each pi would be required. if and under what specific assumptions a solution can be found for a generalized inhomogeneous set of generators (13), is still an unanswered question that is currently being studied in detail. 20 vol. 62 no. 1/2022 on some algebraic formulations . . . 5. conclusions two possible approaches to the problem of determining quadratic algebras as subalgebras of the enveloping algebra of a lie algebra have been commented. the first approach corresponds to an algebraic abstraction of already known systems, which are analyzed purely from the perspective of the hamiltonian and the integrals as the image by a realization of differential operators of elements in some enveloping algebra, trying to determine to which extent such integrals are realization-dependent [10]. in a the second algebraic formulation, commutants of subalgebras g′ ⊂ g in the enveloping algebra of g are considered, from which quadratic algebras formed by polynomials that commute with a given algebraic hamiltonian defined in u (g′) are deduced. in order to simplify the computations in the enveloping algebra, distinguished elements in the commutant can be deduced from the coadjoint representation. for the subalgebras found with this method, a realization by vector fields of an appropriate number of variables automatically provides a (super-)integrable system for the given hamiltonian [9]. the method of virtual copies, initially introduced in the context of invariant theory, provides an additional approach that combines elements of the two algebraic formulations, and refers to a number of still open problems, such as the general solution of the embedding problem of lie algebras into enveloping algebras [16], as well the classification problem of realizations of lie algebras in terms of differential operators [23]. whether these approaches are compatible or can be combined with other procedures like the quadratic deformations of lie algebras or the formalism of racah algebras (see e.g. [8, 24, 25] and references therein) is a problem worthy to be inspected. we hope to report on some progress in these directions in a near future. acknowledgements the author is indebted to alexander v. turbiner and ian marquette for valuable critical comments, to artur sergyeyev for providing reference [15], as well as to the referees for several remarks that have greatly improved the presentation. during the preparation of this work, the author was financially supported by the research grant pid2019-106802gbi00/aei/10.13039/501100011033 (aei/ feder, ue). references [1] a. m. perelomov. integrable systems of classical mechanics and lie algebras. birkhäuser verlag, basel, 1990. [2] l. freidel, j. m. maillet. 
quadratic algebras and integrable systems. physics letters b 262(2-3):278–284, 1991. https://doi.org/10.1016/0370-2693(91)91566-e. [3] p. tempesta, a. v. turbiner, p. winternitz. exact solvability of superintegrable systems. journal of mathematical physics 42(9):4248, 2001. https://doi.org/10.1063/1.1386927. [4] f. tremblay, a. turbiner, p. winternitz. an infinite family of solvable and integrable quantum systems on a plane. journal of physics a: mathematical and theoretical 42(24):242001, 2009. https://doi.org/10.1088/1751-8113/42/24/242001. [5] p. letourneau, l. vinet. superintegrable systems: polynomial algebras and quasi-exactly solvable hamiltonian. annals of physics 243(1):144–168, 1995. https://doi.org/10.1006/aphy.1995.1094. [6] w. miller, s. post, p. winternitz. classical and quantum superintegrability with applications. journal of physics a: mathematical and theoretical 46(42):423001, 2013. https://doi.org/10.1088/1751-8113/46/42/423001. [7] c. daskaloyannis, y. tanoudis. quadratic algebras for three-dimensional superintegrable systems. physics of atomic nuclei 73:214–221, 2010. https://doi.org/10.1134/s106377881002002x. [8] d. latini, i. marquette, y.-z. zhang. embedding of the racah algebra r(n) and superintegrability. annals of physics 426:168397, 2021. https://doi.org/10.1016/j.aop.2021.168397. [9] r. campoamor-stursberg, i. marquette. quadratic algebras as commutants of algebraic hamiltonians in the enveloping algebra of schrödinger algebras. annals of physics 437:168694, 2022. https://doi.org/10.1016/j.aop.2021.168694. [10] r. campoamor-stursberg, i. marquette. hidden symmetry algebra and construction of quadratic algebras of superintegrable systems. annals of physics 424:168378, 2021. https://doi.org/10.1016/j.aop.2020.168378. [11] j. dixmier. algèbres enveloppantes. hermann, paris, 1974. [12] m. a. shifman, a. v. turbiner. quantal problems with partial algebraization of the spectrum. communications in mathematical physics 126:347–365, 1989. https://doi.org/10.1007/bf02125129. [13] a. v. turbiner, a. g. ushveridze. spectral singularities and quasi-exactly solvable quantal problem. physics letters a 126(3):181–183, 1987. https://doi.org/10.1016/0375-9601(87)90456-7. [14] m. a. rodríguez, p. winternitz. quantum superintegrability and exact solvability in n dimensions. journal of mathematical physics 43:1309–1322, 2002. https://doi.org/10.1063/1.1435077. [15] a. sergyeyev. exact solvability of superintegrable benenti systems. journal of mathematical physics 48(5):052114, 2007. https://doi.org/10.1063/1.2738829. [16] v. ovsienko, a. v. turbiner. plongements d’une algèbre de lie dans son algèbre enveloppante. comptes rendus de l’académie des sciences paris 314:13–16, 1992. [17] a. verona. introducere in coomologia algebrelor lie. editura academiei rep. soc. românia, bucuresti, 1974. [18] r. t. sharp, c. s. lam. internal-labeling problem. journal of mathematical physics 10(11):2033–2038, 1969. https://doi.org/10.1063/1.1664799. 
21 https://doi.org/10.1016/0370-2693(91)91566-e https://doi.org/10.1063/1.1386927 https://doi.org/10.1088/1751-8113/42/24/242001 https://doi.org/10.1006/aphy.1995.1094 https://doi.org/10.1088/1751-8113/46/42/423001 https://doi.org/10.1134/s106377881002002x https://doi.org/10.1016/j.aop.2021.168397 https://doi.org/10.1016/j.aop.2021.168694 https://doi.org/10.1016/j.aop.2020.168378 https://doi.org/10.1007/bf02125129 https://doi.org/10.1016/0375-9601(87)90456-7 https://doi.org/10.1063/1.1435077 https://doi.org/10.1063/1.2738829 https://doi.org/10.1063/1.1664799 rutwig campoamor-stursberg acta polytechnica [19] j. apel, w. lassner. an extension of buchberger’s algorithm and calculations in enveloping fields of lie algebras. journal of symbolic computation 6(2-3):361–370, 1988. https://doi.org/10.1016/s0747-7171(88)80053-1. [20] w. g. mckay, j. patera. tables of dimensions, indices and branching rules for representations of simple lie algebras. marcel dekker, new york, 1981. [21] r. campoamor-stursberg. intrinsic formulae for the casimir operators of semidirect products of the exceptional lie algebra g2 and a heisenberg lie algebra. journal of physics a: mathematical and general 37(40):9451–9466, 2004. https://doi.org/10.1088/0305-4470/37/40/009. [22] r. campoamor-stursberg, i. marquette. generalized conformal pseudo-galilean algebras and their casimir operators. journal of physics a: mathematical and theoretical 52(47):475202, 2019. https://doi.org/10.1088/1751-8121/ab4c81. [23] a. gonzález lópez, n. kamran, p. j. olver. lie algebras of vector fields in the real plane. proceedings of the london mathematical society 64(2):339–368, 1992. https://doi.org/10.1112/plms/s3-64.2.339. [24] l. a. yates, p. d. jarvis. hidden supersymmetry and quadratic deformations of the space-time conformal superalgebra. journal of physics a: mathematical and theoretical 51(14):145203, 2018. https://doi.org/10.1088/1751-8121/aab215. [25] p. gaboriaud, l. vinet, s. vinet, a. zhedanov. the generalized racah algebra as a commutant. journal of physics: conference series 1194:012034, 2019. https://doi.org/10.1088/1742-6596/1194/1/012034. 22 https://doi.org/10.1016/s0747-7171(88)80053-1 https://doi.org/10.1088/0305-4470/37/40/009 https://doi.org/10.1088/1751-8121/ab4c81 https://doi.org/10.1112/plms/s3-64.2.339 https://doi.org/10.1088/1751-8121/aab215 https://doi.org/10.1088/1742-6596/1194/1/012034 acta polytechnica 62(1):16–22, 2022 1 introduction 2 first algebraic reformulation 3 commutants in enveloping algebras and coadjoint representations 4 virtual copies in enveloping algebras 5 conclusions acknowledgements references ap08_3.vp 1 matrix converter a matrix converter is a direct frequency changer. this converter consists of an array of n×m bidirectional switches arranged so that any of the output lines of the converter can be connected to any of the input lines. the bidirectional switch is realized by using some semiconductor devices. they can be either discrete or integrated to the module. the bidirectional switch can be implemented in various ways. for the matrix converter, we chose modules which include 3 bidirectional switches in common emitter a configuration. the modulator is thus realized for these switchers. 2 switching two basic conditions must be kept during switching: � the converter is supplied by three-phase system voltage sources. the input must therefore not be short-circuited, which means that every output phase is connected with not more then one input phase. � an inductive load is premised. 
disconnection would lead to overvoltage, and for this reason the output circuit cannot be interrupted during routine running. it is important to choose an efficient switching algorithm. for the modulator we chose four-step switching driven by the input voltage. for this method it is necessary to know the polarity of the voltage between the input lines. the advantage of this algorithm is its simplicity and the fact that it can be driven even by low current values. the disadvantage is longer commutation time. the processes of four-step switching driven by input voltage are insinuated in tables table 1. and table 2. we can see that it is possible to use the same switching algorithm for different current direction. the modulation algorithms can therefore be driven only by the input voltage. 3 modulation strategy indirect space vector modulation (isvm) is used in the matrix converter. we can imagine the matrix converter as an indirect converter with a virtual dc link. we can therefore use some processes well known from classical indirect frequency converters. it is necessary to ensure the right timing for command switching and to generate the guard delay and then the switching at the right moments. we achieve this by adding or subtracting the given times of the switching combinations and comparing them with the values of saw courses, fig. 2. using the proper switching combinations, the necessary rate of switching igbt during one switching period can be reduced. the process of achieving a different order of switching in even and odd periods is presented on fig. 2 and fig. 3. 4 communication switching commands and times of switching combinations are sent from the superset regulator per pc 104 bus. all pc 104 bus signals are identical in definition and function to their isa counterparts [4]. signals are assigned in the same order as on the edgecard connectors of isa, but transformed to the connector pins. the matrix converter modulator was programmed in vhdl language, and consists of several parts. first part provides the right switching signals for igbts, and puts the guard delay between the separate switching steps. the second part generates switching commands at the right time. other © czech technical university publishing house http://ctn.cvut.cz/ap/ 3 acta polytechnica vol. 48 no. 3/2008 communication between a matrix converter modulator and a superset regulator p. pošta this work deals with the modulator of a matrix converter and its communication with the superset regulator. a switching algorithm is briefly introduced. the input voltage measurement method is presented. in the last part of the paper, the testing of communication between the superset regulator and the modulator in fpga technology are also presented. keywords: matrix converter, fpga, vhdl, power electronics. fig. 1: matrix converter 3×3 4 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 48 no. 3/2008 fo u rt h st ep t h ir d st ep se co n d st ep fi rs t st ep in it ia l st at e h ar d sw it ch in g table 1: algorithm for four-step switching driven by the input voltage for a bidirectitional switch with a common emitter for urs �0 and i a �0 fo u rt h st ep t h ir d st ep se co n d st ep fi rs t st ep in it ia l st at e so ft sw it ch in g table 2: algorithm for four-step switching driven by the input voltage for a bidirectitional switch with a common emitter for urs �0 and i a � 0 parts provide communication. several registers are realized in fpga and each of them has its own address. 
some registers are “read only”, some are “write only”, and some are “read and write”. there is one state register which is cleared after it is read by the regulator. the information about the state of hhe modulator and the values from the a/d converters can be read due to the superset regulator. the regulator can write to the modulator switching the command and times of switching combination. the modulator has a mode register which changes its function. this is necessary for testing this modulator and for the possibility of changing the hw of superset regulator. 5 regulator the superset regulator is a one-desk pc from rtd or kontron with the pc 104 bus. during the test of communication test, we found that the one-desk kontron pc read from the pc 104 bus in 16-bit mode, in a way different from that described in the universal bus specification. when we want to read 16-bit from pc 104 bus, we have to read only the even addresses. because a one-desk kontron pc reads first from the even address and than from address �1, which is an odd address. special bus handling has to be implemented for this onedesk pc. the setting of this bus handling is enabled by the mode register. the functions of a one-desk pc are the same as those of other pcs. the difference between these two kinds of pcs is in their size. a one desk pc, as the name indicates, is realized on a single pcb (90.17 mm×95.89 mm). as a matter interest, a one-desk rtd pc has a standard operating temperature from �40 °c to �85 °c. an advantage of using a pc as a superset regulator is its high-computing power in floating-point arithmetic and the possibility of using of high-level language. a disadvantage of this solution is the need for real-time programming on a device which is not primary specified for this. another problem is with high-speed input and output. this problem must be solved in each case, because dsp (standard used as a regulator) has only 12 pwm outputs and a 3×3 matrix converter needs 18 outputs. when we realize the modulator in fpga, each high speed input and output and some algorithms can by replaced for this technology. this means that the superset regulator has more time for regulation algorithms and for communicating with the user. this is an important advantage for development, especially at a university, where students do not have a lot of time for a detailed study of dsp architecture. 6 analog-digital converters fpga can control analog-digital converters (adc), evaluate signals from igbts drivers and detect errors and control relays in power circuits. all acquisition of these values is very simple for a superset regulator, because it is only a read or a write operation to a specific address in memory via the pc104 bus. there are 4 sigma-delta adcs and 4 voltage frequency adcs on the board with the fpga circuit. these converters are served in fpga, using ram memory. the outputs from these two kinds of adc are serial data. the number of ones in all the data sent per measured period is equal to the average value. the data is put into the shift register and the number of ones can easily be determine by adding the input data and subtracting the output data. then appropriate register directly has an average value with an accurately defined delay. a disadvantage is the relatively long delay. another feature of this evaluation is that the adc values are filtered. this is an advantage when we pass thorough zero. however higher frequencies cannot be measured. 
© czech technical university publishing house http://ctn.cvut.cz/ap/ 5 acta polytechnica vol. 48 no. 3/2008 fig. 2: acquiring of the right timing for a switching command (odd period) fig. 3: acquiring of the points of optimized switching (even period) 7 communication testing a special program has been developed in c language for testing communication. this program writes values to the registers realized in fpga. these values are changed in the fpga circuit by a defined process, and then they are read from the superset regulator. a program placed in the superset regulator controls the values from the registers, and if it finds a mistake it notifies the error. this program was later modified for testing analogdigital converters and igbt drivers. the testing program is not in real-time, but it speed is sufficient for measurements on the fpga board. the testing program allows the measured value be saved to the ram superset regulator and, after this program is closed, to the flash memory. the measured values can be processed in ms excel or matlab to graphic form. fig. 4 presents the input signal to the adc and the most significant bit from the fpga register which contains the measured value. this bit represents the polarity. the delay measure is equal to the shift between this signal and the input to the adc. fig. 5 presents the values measured by adc. we can see that the measured value corresponds to the input signal. acknowledgments the research described in this paper was supervised by doc. ing. jiří lettl csc., fel ctu in prague, and was supported by research program msm6840770017. references [1] lettl, j., flígl, s.: electromagnetic compatibility of a matrix converter system. radioengineering, 2006, p. 58–64. [2] pošta, p.: stavový automat modulátoru pro kompaktní maticový měnič. diplomová práce, k13114, čvut, 2007. [3] arbona, c.: matrix converter igbt driver unit development and activation. bachelor project, k13114 fee ctu in prague, 2006. [4] techfest, isa bus technical summary, http://www.techfest.com/hardware/bus/isa.htm [5] rtd, cpumodule™ & controller selection guide: http://www.rtd.com/pc104/pc104_cpumodule.htm ing. pavel pošta e-mail: postap1@fel.cvut.cz department of electric drives and traction czech technical university in prague faculty of electrical engineering technická 2 166 27 praha 6, czech republic 6 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 48 no. 3/2008 fig. 4: input signal to the adc and evaluation polarity fig. 5. values measured by adc on fpga board ap06_1.vp 1 introduction the state of a (logical) object is generally not specified and is intuitively regarded as sufficiently obvious. often, the state is conceived in connection with the input history of the object. the state of an entity, let us say the determination of its state alphabet, identification of the object is fatally dependent. moreover, if the identification of an entity follows one and the same objective even several different final automaton models of the object, which satisfy the given objective, can be constructed [1]. the observer who identifies the object may be satisfied with the recurrent definition of the state presented in this article and, as we hope, could be persuaded that a stimulus affecting the object only initiates (provokes) the transition between the states of the object whereas it is the initial state of the transition which effectuates the transition. 
it is to be believed that (the physical untenability of) the so called determinization of a nondeterministic final automaton model will be shown to be physically untenable as it initially respects the randomness, which it later formally ignores. 2 state of a logical object a logical object receives and sends quantum signals. letters from the input x and output y alphabets denote individual input and output quanta, respectively. by analogy, the quanta of the anticipated inner signals (states) in the object are denoted by letters of the state alphabet s. the sampling of the input signals (stimuli) is random, and is determined by the neighbourhood of the object. the sampling of state and output signals (responses or reactions) in an asynchronous object is derived state from the sampling of output signals. in a synchronous object the sampling is done indirectly in a tenacious way so that the subject has delegated the sampling to the generator of the synchronizing pulses. if the object synchronises itself by means of a synchroniser, the situation can be denoted as an autosynchronous object, or the object of state sampling and the respective responses ‘forces’ the neighbourhood to accept a speed independent object. let t be a set of sampling moments and the binary relation � : : ,t t t2 � be a relationship of a strong arrangement on t, t t� �, meaning that moment t was initiated before moment �t . the monomorphism from t,� to n, � is said to hold if there exists a simple function f t t v: :� n � such that t t f t f t� � � � �( ) ( ), where n is a set of integers including zero. if f t v( ) � and f t v( )� � � 1, moment t is said to have initiated immediately before moment �t . in addition, let there be an, esp. a(�) or a�, where a is one of the alphabets and � is the time moment �� n), set {n � a}, esp.{{ }� � a}, or {� � a}of all representations n a a a a ai i� �: , , ( )0 10 1 0 1 � � � � , spec. { } : ( )� � �� �a a ai� , or � � �a a ei: ( )and . let us therefore, denote �a n , esp . �a { }� , either an or a� (�{e}), esp. either a{�}, or a� (�{e}) and also �a a 0 1 � , esp. �a � , either a0 a1 … or e, esp. either a� or e. the performance of a deterministic logical object with a given input x, state s, and output y alphabets can be modeled by a system of functions consisting of the � transition function � � � � � � �: : ,{ } { } { }s x s s x s� � � �1 1� � output function � �� � � � � � �: : ,{ } { } { }s x y s x y� � �� 1 i.e. either s x y� � �, � �1 or s y� �� of the mealy, or moore, type respectively, where � is the current moment of sampling and x�, s�, y� are the respective current stimulus, state, response; moment � � 1 is the immediate follower of moment � and s� � 1 is the follower state of state s� . however, since the future state s� � 1, is not yet at our disposal (!), the value of the transition function is given by the current predication of state s pred s� � �� �1 1– ( ) , i.e. � �� � � � � � � � �: : , ( ) { } s x s s x pred s� � � �1 1� . being aware of the fact above, we will keep to the traditional notation of function �. note also that the transition function offers, in fact, two possibilities causing the transition from state s� to that of s� � 1: the starting state of transition s� and stimulus x�; it remains only to decide which cause is the dominant one. 
since the postdiction post� (s��1) of predecessor states s��1 to the current state is not generally unambiguous, i.e., for � � � �� �� � � � �s x s x si j� � � �� �1 1 1 1, , we can admit s s i ji j � �� �� �1 1( ) we will introduce a generalized transition � function � � � � � � � � �� � �: : ,{ } { , , , } { }s x s s x x x s� �� � �1 1 1� � � recurrently so that � � � � �� �� � � ( ( , ), )s x x x x s� � � . © czech technical university publishing house http://ctn.cvut.cz/ap/ 61 czech technical university in prague acta polytechnica vol. 46 no. 1/2006 state of a logical object j. bokr, v. jáneš the paper deals with the state and performance of a logical object on a state-differentiating level and with the so called determinization of a finite – automaton model. keywords: logical object, state, state transition, finite automaton. it can be intuitively stated that the state of an object is defined by the entire input cumulated prehistory h of the object. the response y� is a value of the output function: � � � � � � �: : ,{ } { } { }[ ] [ ]h x y h x y� � � of an output block o of the object, where h� is the instantaneous input prehistory. therefore, we expect the given object to be divided into two components: the fixator f of the input prehistory h and the output block o (fig. 1). fixator f records and issues the input prehistory; since, however, f does not delete (!) every recorded prehistory h, the prehistory is kept in a cumulated form (let us imagine, e.g., the whole bible recorded on one current page of paper). a performance model of fixator f is a transition function � � � � � � �: : ,{ } { } { }h x h h x h� � � �1 1� . first let there be h x n� and h x x x x x x x n� � �0 1 1 0 1 1� �� �( ) [2] and consider an initially finite automaton model of the fixator [3] f � x x en, , ,�1 where �1 is the transition function � � � � � � 1 0 1 1 0 1 1 : : [ ], [ ] [ ] { }x x x x x x x x x x x n n� � � � � � � � and n is a set of integer numbers including zero. for the transition function �1 there holds (i j� ): � x x x x x x x x x x x x xi i i j j j i i i j j j 0 1 1 0 1 1 0 1 1 0 1 � � � � � � � �� � �� � � � ��1x � x x x x x x x x x x x x xi i i j j j i i i j j j 0 1 1 0 1 1 0 1 1 0 1 � � � � � � � �� � �� � � � ��1x � (fig. 2a) and not x x x x x x x x x x x x xi i i j j j i i i j j j 0 1 1 0 1 1 0 1 1 0 1 � � � � � � � �� � �� � � � ��1x , (where �ţis the statement of an implication) which can also be rightly required (fig. 2b), � it can be legitimately required that � � � �( )x x x x x x x0 1 1 0 1 1� �� �� , esp. � �( , )e x e� , this is, however, contradictory (fig. 3). 
let us write more cunningly h x n� �( / ) and h x x x x x x x n� �{ } ({ } ( / )0 1 0 1� �� � [3, 4, 5], where ( / )x n � is the finite partition on xn defined on xn through the nerode equivalence the class of partition ( / )x n � is denoted and we can consider the fixator model f � �x x en, ( / ), , { }�2 , where �2 is the transition function � � � � � 2 0 1 1 0 1 1 : ( / ) ( / ) : , { } [{ }] {[ x x x x x x x x x x n n� � � � � � � � � ] }x .� we have still to define the right-side congruence nerode equivalence � on x n; the input words x x xi i i 0 1 � � and x x xj j j 0 1 � � (i j� ) are said to be right-side congruent, if there holds � � �� �2 0 1 2 0 1( ) ( ){ }, { },e x x x e x x xi i i j j j� �� � � � � � �2 0 1 1 2( ){ },e x x x x x xi i i� �� � � � � �� � � � �2 0 1 1 2( ){ },e x x x x x xj j j� � for the transition function �2 there is (i j� ): � { } { }x x x x x xi i i j j j 0 1 1 0 1 1 � � � �� �� � � �� �{ } { }x x x x x x x xi i i j j j 0 1 1 0 1 1 � � � � � � � { } { }x x x x x xi i i j j j 0 1 1 0 1 1 � � � �� �� � � � � � �{ } { }x x x x x x x xi i i j j j 0 1 1 0 1 1 � � � � � � where � � means either = or � , � { , } { }a x e� � is legitimate. if we write both s x n� , s0 � e, s� � x0 x1… x��1 and s x x x x� � �� ��1 0 1 1� and then s x n� �( / ) , s0 � {e}, s� � {x0 x1… x��1} and s x x x x� � �� ��1 0 1 1{ }� , then, obviously, the first conception is equivalent to the transition: � s s s x s xi j i j � � � � � �� �� � �1 1( , ) ( , ), but this is not so with the transitions: � s s s x s xi j i j � � � � � �� �� � �1 1( , ) ( , ) 62 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 46 no. 1/2006 czech technical university in prague x h y o f fig. 1: division of the object into fixator f and output block o a) b) fig. 2: pairs of legitimized transitions of fixator f: a) correct, b) contradictory fig. 3: justified, but contradictory transitions of fixator f � � � � �1( , )s x s� , esp. �1 0 0 0( , )s x s� , whereas, even under the condition that the second conception will cope with the transitions: � s s s x s xi j i j � � � � � �� �� � �2 2( , ) ( , ) � s s s x s xi j i j � � � � � �� �� � �2 2( , ) ( , ) � � � � �2( , )s x s� , esp. �2 0 0 0( , )s x s� , it is still closely connected with the problematic acceptance of the empty word e by the object. example 1: let the input alphabet {a, b, c} be given, and let at: 1) s e c bc c bc b c bc ba c bc a� � � � �{ ,( ) ,( ) ,( ) ,( ) }* * * * , where *, � are respective iterations, disjunctions and concatenation of the kleene algebra of regular expressions, a transition function �1 being defined as: � �1( , )e e e� , �1( ,( ) ) ( ) * *e c bc c bc� � � , � �1( ,( ) ) ( ) * *e c bc b c bc b� � � , � �1( ,( ) ) ( ) * *e c bc ba c bc ba� � � , �1( ,( ) ) ( ) * *e c bc a c bc a� � � . if we make an unwilling and unreal compromise that 1 � � �e c bc( )*, 2 � �( )*c bc b and 3 � � � �( ) ( )* *c bc ba c bc a, we obtain the transition diagram from fig. 4. 2) s e c bc c bc b c bc ba c bc a� � � � � �{ }{ ,( ) },{( ) },{( ) ,( ) }* * * * � { , , }1 2 3 , where 1 � �{ ,( ) }*e c bc , 2 � �{( ) }*c bc b and 3 � � �{( ) ( ) }* *c bc a bc , the transition function �2 being defined: � � �2 21 1 1( , ) ( , ( ) ) *e c bc� � � , � �2 1 2( , ( ) ) *c bc b� � , � � �2 21 1 3( , ( ) ) ( , ( ) ) * *c bc a c bc ba� � � � , we obtain fig. 4. 
not defining the state as a cumulated input history or as a block of the final decomposition (nerode), we can now recurrently define the state of a logical object: � the initial state s0, i.e., � ( ,|)s s0 0� , where | is a separator, is a state, � s �1 is a state if there holds � ( , , , , )s x x x0 0 1 1� � . let us consider the transition function � � � �( , )s x s� �1 and ask whether according to the transition function the cause of the transition from initiating state s� to follower state s� � 1 can be assumed. a general opinion is that the cause of the state transition is merely the stimulus x [6]. by analogy, the only cause of the state transition can be assumed to be its initial state s�. both strict conceptions do not seem to be very convincing, since the cause of transition of an object to the follower state s� � 1 can be regarded both as the state s�, and the stimulus x�. let us traditionally characterize the causes s� or x� as necessary or sufficient, respectively. (let us admit that a sufficient cause always initiates the transition of an object to s� � 1, but need not always result in attaining the object s� � 1; if the object has attained state s� � 1, then it was certainly due to the necessary cause.) since both � �� � � � �( , ) ( , )s x s x si j� � �1 and � �� � � � �( , ) ( , )s x s x si j� � �1 can be admitted for s si j � �� or x xi j � �� , respectively, a decision cannot be made which of the causes, s� or x�, is necessary and which is sufficient. thus, let us characterize causes s� and x� as executing and initiating causes and decide which of them is the executing cause, and which is the initiating cause. let us, therefore, assume the following logical objects: a) a frog sitting on a water lily leaf in a pond, b) loose material in a discharging hopper and a truck, c) a logical object. ad a) a frog sitting on a water lily leaf – initial transition state (s�) – spots an insect – stimulus x� – jumps on to another leaf to catch the insect – follower state (s� � 1). the stimulus is obviously a mere initiator of the jump, whereas the initiating state is its executor of the jump. ad b) opening the discharging valve – stimulus x� – produces the pouring of sand from the hopper – initiating transition state s� – into the truck – follower state (s � � 1). again, the stimulus only initiates the pouring of sand, whereas the sand in the discharging hopper is the executor of the pouring. ad c) since dynamic logical objects are designed through canonical decomposition and since a substitute of the given object is usually a parallel register of flip flop circuits or unit delay circuits, and since each flip flop circuit is again designed through canonical decomposition (without respect to the intuitively designed rs-flip flop circuit), and since the substitute of the selected flip flop circuit in canonical decomposition is a unit delay circuit, let us examine its action. the unit delay circuit is the only dynamic element, since its finite – automaton model is an ordered trio �, ,q d� where �, q is the respective input state alphabet and � is the transition function � � � � � � �d q d pred q: ( ) : ( ) { } { } { } � � �� �1 1 proving the indivisibility of the unit delay circuit. 
across the pulse front, or the fall time of stimulus d�, when identifying the imperceptible intermediate state with the initial state of transition q� (otherwise a delay cannot be mentioned), the delay circuit, after the given time has elapsed, passes from initial state q� to follower state q� � 1 immediately and spontaneously. spontaneous realization of delay circuit state transitions after some time has elapsed from the instant of acceptance of the change of stimulus d� by the delay circuit is, however, a © czech technical university publishing house http://ctn.cvut.cz/ap/ 63 czech technical university in prague acta polytechnica vol. 46 no. 1/2006 a b c 21 3 a c fig. 4: transition diagram of the automaton from example 1 fiction. in which way, then, should the realization of the state transition from q� to q� � 1 through an uninvited change of stimulus d� be explained; we have to the transition from q� to q� � 1 of the respective delay circuits was usually effected by the initial state of the transition q�. let us then follow the successive action, without loss of generality, of a binary symmetrical delay circuit with the delay � according to fig. 5 a), b). it is necessary to explain what happens to the object during the transition from state s� to state s� � 1. without any doubt the object is in some uninteresting, imperceptible intermediate state, which is not taken into account by the model. since an object is determined by its model there is no other way than to identify the intermediate state either with initiating state s� or with follower state s� � 1 of the transition; the intermediate state is, in fact, made respected by the model. since the above mentioned identification appears not to be effected it is sometimes stated that the mealy automaton produces a response y� during the transition � � � �( , )s x s� �1. if the intermediate state of the state transition is identified either with the initiating state or with the state of the end transition, the state transition of the finite – automaton model is instantaneous. in this way, space of a state transition in a logical object itself can be taken into account. let us mention the acceptance of an empty word e by an object. since the empty word e (as an empty set �) is an empty concept, i.e., a concept with a controversial content and an empty extent (an empty word does not contain any word of length 1, whereas a word is determined by the sequence of its words of length 1) the acceptance of e by a logical object can be only virtual. we say that an object with the same initial and end state s0 accepts an empty word e s e s� �� ( , )0 0 – (fig. 6); if the reader does not agree with the above statement, we will introduce an unempty word x� on the object, which will be transferred by the object to a stable, for x� retaining state z s x z� � ��� �( , )0 , � � � �( , )z x z� , and we will state that prior to the introduction of the word x� the object accepted empty word e. thus, the transitions � � �( , )s e s� are imaginary since the acceptance of e by the object is virtual. only � � � �( ,| )s s� is offered, which is a virtual state transition initiated by separating word |�, where | is the separator (usually a space) and is a letter, though unpublished, of the input alphabet x. 
3 ‘determinization’ of a nondeterministic automaton let, without generality loss, a final-automaton model of a nondetermistic logical object be a semiautomaton a a s� x, , � where � is the transition relation � � � � � � �: : , ,{ } { } { }s x s s x s� � � �1 1 and s� � 1 is one of the possible states of the followers of initial transition state s�, i.e., s s� �� � 1 1, � �s proj s x s s � � � � � � �� � 1 3 1 1 , , . the transition from state s� to just one of the possible states of the followers from s��1 along with stimulus x� initiated by an implicit (inaccessible for the observer – immeasurable) random fault; the fault is not meant only as a failure of the object but also as an action effect of the neighborhood or of the object itself. the so called determinization of a nondeterministic automaton transforms a nondeterministic automaton to a deterministic one [7]. to all possible followers of each initial state of the transition of the given nondeterministic automaton we will find all of their possible followers, and then, purely in a speculative way, we will specify them as the initial states of transitions and also find of their possible followers, etc., as long as the procedure renders the sets of possible follower states that have not occurred so far. then we will substitute each set of possible states s, formed in this way, by the only certain state of the ‘determinizated’ nondeterministic automaton so that the mutually different sets are assigned different states. hence the transition function �d of the ‘determinizated’ nondeterministic automaton: 64 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 46 no. 1/2006 czech technical university in prague d q �a) t b) � d t � t q 0 1 2 3 4 c) d � q � q �+1 0 1 1 11 0 0 0 fig. 5: a) graphical symbol, b) action, c) delay circuit transition table s 0 z � x � x � fig. 6: virtual acceptance of an empty word � � �� � � � d s sx s x s: : , { } { }{ }2 2 1 1� � � � � where 2s is the potency of the state alphabet s. example 2: construct a deterministic automaton (table 1 b)) to a given nondeterministic automaton (table 1 a)). after a state assignment (two left columns in table 1 b)) we obtain a transition table of the ‘determinizated’ nondeterministic automaton (table 1 c)). note a formal peculiarity of determinization: the states of an immediately constructed deterministic automaton are sets of states of the given nondeterministic automaton. it is astonishing how the mere replacement of the sets of possible follower states by a certain follower eliminates a random an action of implicit (immeasurable) faults. this peculiarity can be explained by stating that either the possible followers or the substituting certain follower are not simply states, otherwise the nondeterminism of an object is a fiction. if we were succeeded in identifying the faults, only one physically acceptable adequate nondeterministic automaton with a so called fault input could be constructed to the given nondeterministic automaton. thus, if z is a fault alphabet (the alphabet of explicit measurable faults), then for the transition function �e of a semiautomaton x z s e� , , � with a fault input, there holds � � � � � � � � �� e s x z s s x z s: : , , { } { } { } { }� � � �1 � example 3: assume that a fault alphabet {z1, z2, z3} of the automaton from example 2 is given such that, e.g., the table 2 holds. 
4 conclusions as a satisfactory definition of the state of a logical object and its recurrent introduction can thus be regarded, the adequacy of which is proved particularly in identification of logical objects on the state distinguishing level. the ‘determinization’ of a nondeterministic finite automaton is undoubtedly a myth. references [1] bokr, j., jáneš, v.: logické systémy. vydavatelství čvut, praha 1999 (in czech). [2] brunovský, p., černý, j.: základy matematickej teórie systémov. veda, bratislava 1980 (in slovak). [3] harrison, m. a.: introduction to switching and automata theory. mc graw hill book co., new york – sydney 1965. © czech technical university publishing house http://ctn.cvut.cz/ap/ 65 czech technical university in prague acta polytechnica vol. 46 no. 1/2006 a) s� � 1 x� s� a b c 1 1,2 2 4 2 3 3,4 4 3 4 3 4 4 4 2 4 b) s ��1 � � x � s � a b c 1 {1} {1, 2} {2} {4} 2 {1,2} {1,2,3} {2,3,4} {4} 3 {1,2,3} {1,2,3,4} {2,3,4} {4} 4 {2,3,4} {3,4} {2,3,4} {4} 5 {1,2,3,4} {1,2,3,4} {2,3,4} {4} 6 {3,4} {4} {2,3} {4} 7 {2,3} {3,4} {3,4} {4} 8 {2} {3} {3,4} {4} 9 {3} {4} {3} {4} 10 {4} {4} {4} {4} c) � � � 1 x� � � a b c 1 2 8 10 2 3 4 10 3 5 4 10 4 6 4 10 5 5 4 10 6 10 7 10 7 6 6 10 8 9 6 10 9 10 9 10 10 10 8 10 table 1: transition table of the a) nondeterministic automaton, b), c) deterministic automaton from example 2 s��1 x�z� s� a z1 a z z( )2 3� b z z( )1 2� b z3 c z z z( )1 2 3� � 1 1 2 2 2 4 2 3 3 3 4 4 3 4 4 3 3 4 4 4 4 2 2 4 table 2: transition table of the deterministic automaton of the respective nondeterministic automaton from example 2 [4] chytil, m.: automaty a gramatiky. sntl, praha 1984 (in czech). [5] kalman, r. e., falb, p. l., arbib, m. a.: topics in mathematical system theory. mc graw-hill book co, new york – sydney 1969. [6] gluškov, v. m.: sintez cifrovych avtomatov. gifml, moskva 1962 (in russian). [7] hopcroft, j. e., ullman, j. d.: formal languages and their relation to automata. slovak translation: alfa, bratislava 1978. doc. ing. josef bokr, csc. e-mail: bokr@kiv.zcu.cz department of informatics and computer sciences university of west bohemia in pilsen faculty of applied sciences universitní 22 306 14 pilsen, czech republic doc. ing. vlastimil jáneš, csc. e–mail: janes@fd.cvut.cz department of control and telematics czech technical university in prague faculty of transportation sciences konviktská 20 110 00 prague 1, czech republic 66 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 46 no. 1/2006 czech technical university in prague ap05_4.vp 1 description of afm atomic force microscopy (afm) is an aspect of scanning force microscopy, and is capable of measuring the interaction force between the sample and a sharp tip mounted on the end of a weak cantilever, see fig. 1. the cantilever is made of silicon and usually has a rectangular or triangular shape. its length does not exceed 600 �m. two main afm operation modes are used: contact mode and dynamic mode. in contact mode the cantilever is in full contact with the surface, and the interaction force (displacement of the lever) is determined by measuring the static deflection of the cantilever. this technique is capable of detecting displacements on atomic scale resolution ( 0 . 1 – 200 nm). the dynamic detection mode can be divided into techniques called the tapping mode and the non(pseudo)contact mode [1]. in these detection modes, the cantilever is driven, usually at its resonant frequency, with an amplitude typically less than 10 nm. 
the cantilever driver is usually a piezo-electric element, but many experiments have been performed with electrostatic, magnetic, thermo-optic and acoustic coupling drivers. the driver is attached to the head of the microscope and the chip with the cantilever and tip is assembled on top of the driver. in the tapping mode, the tip is touches the surface in each period of cantilever movement at its maximum deflection to© czech technical university publishing house http://ctn.cvut.cz/ap/ 65 czech technical university in prague acta polytechnica vol. 45 no. 4/2005 feedback control in an atomic force microscope used as a nano-manipulator m. hrouzek this paper offers a concise survey of the most commonly used feedback loops for atomic force microscopes. in addition it proposes feedback control loops in order to minimize the effect of thermal noise on measurements of weak forces, and to improve the manipulability of the afm. growing requirements to study and fabricate systems of ever-shrinking size mean that ever-increasing performance of instruments like atomic force microscopes (afm) is needed. a typical afm consists of a micro-cantilever with a sharp tip, a sample positioning system, a detection system and a control system. present day commercial afms use a standard pi controller to position the micro-cantilever tip at a desired distance from the sample. there is still a need for studies showing the optimal way to tune these controllers in order to achieve high closed-loop positioning performance. the choice of other controller structures, more suitable for dealing with the robustness/performance compromise can also be a solution. keywords: automatic control, atomic force microscopy, thermal noise. fig. 1: schema of an atomic force microscope. stage positioning is controlled by two loops (x-y) displayed in the lower part of the picture. the driving and head positioning loops are at the top. their only input is a signal from the photo detector that measures the deflection of the cantilever. wards the surface. there is direct mechanical contact between the tip and the surface. another way to scan soft samples is with the use of a non-contact technique. the cantilever vibrates very close to the surface, but does not come into direct contact with it. attractive long distance interaction forces affect the effective spring constant of the cantilever, and the cantilever shifts its resonance frequency. this technique is used for measuring weak forces, such as the van der waals force, and for scanning soft samples. accuracy in measuring the cantilever displacement is very important in order to achieve good afm resolution. many techniques have been developed for this measurement, based on various principles, including capacitive and optical techniques as well as interferometry, optical beam deflection and laser diode detection. each technique has specific advantages and disadvantages. a method widely used by many manufacturers of afms is the optical beam deflection technique, shown in fig. 1. its function is based on a sensitive photodetector which is able to detect very small laser beam displacements. the photodiode receiving the reflected laser beam from the cantilever is divided into two sections, a and b. due to the macroscopic length of the reflected light path, any deflection causes a magnified displacement of the reflected laser spot on the photodiode. the relative amplitudes of the signals from the two segments of the photodiode change in response to the motion of the spot. 
the difference signal (a � b) is very sensitive to cantilever deflection. 2 feedback control in the afm modern afms include many feedback loops to control different blocks of this complicated instrument. some of the most exciting and still widely open areas in afm feedback control are: 2.1 loop controlling position of the afm stage on the x-y axis many experiments have been performed with various techniques for moving the stage with the scanned sample with high precision and speed. over many years, most afm manufacturers and scientists have preferred piezo-electric stack actuators. these devices offer fine position capabilities achieving sub-angstrom resolution and high-speed manipulation. piezo electric actuators have disadvantages in nonlinear properties such as hysteresis and creep, which become more significant with the increasing need for fast movement given by the actuators, see fig. 2. this illustrates the need for a better and more accurate description of the piezo elements and for improving the feedback control systems by robust design techniques. surface physicists have used afm mainly for scanning purposes in the past. recently, afm has been used in many different fields of biology, medicine, material science, nanomanipulation and nano-lithography. these extended uses involve the need for an improvement in the accuracy and repeatability of movements with the tip. in addition, the possibilities of using afms as manipulators on an atomic scale rather than as classical scanners are being explored. the difference between these two approaches is displayed in fig 3. the part on the left is a simple movement across the sample in straight lines, scanning. on the other side, we see the movement that is required to manipulate a small object on the surface. the cantilever has to reach the desired position, then approach the surface and make a movement which is not necessarily straight but can be of any shape. while manipulating an particle, it is very important to control the applied force by the tip on to the manipulated object. currently used regulators usually have a simple proportional-integral structure, which offers zero steady state regulation error but is not able to achieve the desired output in a short period of time. these regulators usually experience tracking problems that become significant with rising speed. the bandwidth limitations and non robust control are not satisfactory. present-day feedback control loops for the x and y 66 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 45 no. 4/2005 czech technical university in prague fig. 2: sketch of the hysteresis of a piezo stack actuator for slow and fast movement with the stage. increasing speed of manipulation usually leads to problems with accuracy of positioning, due to the rising influence of nonlinearities such as hysteresis. fig. 3: scanning approach on the left, manipulation approach on the right axis have to be able to perform very complex movements at very high speed, rather than simply dragging the cantilever across the sample in straight lines with a frequency of a few hertz. this area is now open to new techniques and new ideas on how to control the nonlinearities of systems coming from piezo hysteresis in order to achieve the desired performance, see [2]. fig. 4 displays the feedback control loops that are employed in the stage positioning system. 
© czech technical university publishing house http://ctn.cvut.cz/ap/ 67 czech technical university in prague acta polytechnica vol. 45 no. 4/2005 fig. 4: feedback control loop that can be used to control the position of the sample mounted on the stage of the afm fig. 5: frequency spectrum of a freely vibrating cantilever driven by thermal noise (brownian motion). the spectrum displays the first two resonance frequencies. measurements on the nanotec afm regulators have to be capable of treating nonlinearities of the controlled systems by using internal inverse models. with increasing requirements for accuracy of stage positioning by the piezo actuator, the problem of measuring distances on a smaller scale is becoming apparent. most applications use a linear variable differential transformer (lvdt), which offers the best sensitivity. position sensors experience problems with exceedingly high levels of electric and thermal noise, which lower the resolution. this problem is another field in which of feedback control might be applied. an observer based regulator could achieve better signal noise separation than current systems without any feedback. 2.2 loop controlling the position of the afm head on the z axis and cantilever excitation the description given below focuses on the non-contact operating mode of the afm. the non-contact dynamic technique is very sensitive to thermal noise, but is frequently used for nondestructive measurement of soft biological samples. the cantilevers are designed and manufactured small and soft enough not to destroy the surface of the measured sample. a very soft cantilever is sensitive to thermal excitation, which appears in the system as the largest source of noise [3]. thermal noise significantly affects sensitivity, and limits possible improvements in resolution. fig. 5 displays the amplitude spectrum measured on the cantilever without any artificial excitation. the spectrum shows little excitation across all frequencies, and large noise excitations are visible at the resonance frequencies of the cantilever. some techniques already exist for improving sensitivity by eliminating the thermal noise of the cantilever. an interesting technique for improve is noise squeezing [4]. the cantilever is of course exposed to many other noises, such as electrostatic noise, electromagnetic noise and air turbulence. the influence of these noises can be eliminated by appropriate mechanical and electrical construction of the afm. only thermal noise cannot be attenuated by changes in the construction of the microscope. some applications enable the sample and microscope to be put in a thermal bath at a very low temperature, which reduces the thermal excitation. however, there are many applications that have to be at room temperature. by applying a certain feedback control loop, the noise can be eliminated and the sensitivity of the instrument can be increased. scanning force microscopes are usually controlled by simple proportional-integral regulators which are not able to give the best possible results. fig. 6 shows control loops affecting the sensitivity of the afm. the regulators used in these loops need to be designed by optimal control techniques to achieve better sensitivity. the control system consists of two main parts: the positioning loop and the driving loop. the positioning loop works at lower frequencies, and its task is to keep the head in a position that ensures a constant signal from the tip-surface interaction. 
scanning force microscopes are usually controlled by simple proportional-integral regulators, which are not able to give the best possible results. fig. 6 shows the control loops affecting the sensitivity of the afm. the regulators used in these loops need to be designed by optimal control techniques to achieve better sensitivity.

fig. 6: feedback control loops of the head positioning and the cantilever drive

the control system consists of two main parts: the positioning loop and the driving loop. the positioning loop works at lower frequencies, and its task is to keep the head in a position that ensures a constant signal from the tip-surface interaction. it simultaneously measures the surface without losing any information. in scanning force microscopy this constant signal is known as the "set point", and it ensures that the cantilever vibrates at its maximum harmonic deflection without hard tapping on the surface. the driving loop operates at the higher frequencies and keeps the cantilever excited at its resonant frequency and at a constant amplitude, despite the contact (van der waals attractive and mechanical repulsive interactions) with the surface of the sample. this is usually done by a piezoelectric element (driver) vibrating harmonically at the desired frequency (slightly off the resonance frequency) with a very small amplitude.

the third, additional regulator is responsible for thermal noise attenuation. application of this loop is very complicated, because the controller can easily destroy all measured signals by applying excessively large attenuating signals. its function can be similar to cold damping techniques, known from the cooling of nano-mechanical resonators or cavities [5]. the main goal of this technique is to stabilize the position of the mechanical part so that its displacement is minimized and it appears virtually colder. in our case this is not fully true: the displacement caused by thermal noise needs to be eliminated, but the displacement caused by the driver or by interaction forces needs to remain undisturbed, or else the measured signal will be changed. another possible design is an observer based regulator, which would be able to approximate the influence of the thermal noise and the interaction forces [6]. both of these approaches face problems with determining which part of the signal is unwanted noise.

3 conclusion

atomic force microscopy is a new field for applying advanced control systems capable of improving the properties of the instrument. presently used control systems are very simple, but ensure functionality. more sophisticated control systems will enable atomic force microscopes to be applied as nano-manipulators on an atomic scale, with improved sensitivity.

4 acknowledgments

this work was supported by research project j22/98:26 22000 12 "research in informatics and control systems".

references

[1] stark, m., stark, r. w., heckl, w. m., guckenberger, r.: "spectroscopy of the anharmonic cantilever oscillations in tapping-mode atomic-force microscopy." applied physics letters, november 2000.
[2] salapaka, s., sebastian, a., cleveland, j. p., salapaka, m. v.: "high bandwidth nanopositioner: a robust control approach." review of scientific instruments, september 2002.
[3] cleland, a. n., roukes, m. l.: "noise processes in nanomechanical resonators." journal of applied physics, 2002.
[4] rugar, d., grutter, p.: "mechanical parametric amplification and thermomechanical noise squeezing." physical review letters, august 1991.
[5] pinard, m., cohadon, p. f., briant, t., heidmann, a.: "full mechanical characterization of a cold damped mirror." physical review a, december 2000.
[6] sebastian, a., sahoo, d. r., salapaka, m. v.: "an observer based sample detection scheme for atomic force microscopy." in: proceedings of the 42nd ieee conference on decision and control, maui, hawaii, usa, december 2003.
michal hrouzek
e-mail: mhx@seznam.cz
laboratoire d'automatique de grenoble, inp/ujf grenoble, france
and dept. of control and instrumentation, feec, but brno, czech republic

from quartic anharmonic oscillator to double well potential

alexander v. turbiner∗, juan carlos del valle

instituto de ciencias nucleares, universidad nacional autónoma de méxico, apartado postal 70-543, 04510 méxico-city, mexico
∗ corresponding author: turbiner@nucleares.unam.mx

acta polytechnica 62(1):208–210, 2022. https://doi.org/10.14311/ap.2022.62.0208
© 2022 the author(s). licensed under a cc-by 4.0 licence. published by the czech technical university in prague.

abstract. quantum quartic single-well anharmonic oscillator $v_{ao}(x) = x^2 + g^2 x^4$ and double-well anharmonic oscillator $v_{dw}(x) = x^2 (1 - gx)^2$ are essentially one-parametric: they depend on the combination $(g^2 \hbar)$. hence, these problems are reduced to the study of the potentials $v_{ao} = u^2 + u^4$ and $v_{dw} = u^2 (1 - u)^2$, respectively. it is shown that taking the uniformly-accurate approximation for the anharmonic oscillator eigenfunction $\psi_{ao}(u)$, obtained recently, see jpa 54 (2021) 295204 [1] and arxiv:2102.04623 [2], and then forming the function $\psi_{dw}(u) = \psi_{ao}(u) \pm \psi_{ao}(u - 1)$ allows one to get a highly accurate approximation for both the eigenfunctions of the double-well potential and its eigenvalues.

keywords: anharmonic oscillator, double-well potential, perturbation theory, semiclassical expansion.

1. introduction

it is already known that for the one-dimensional quantum quartic single-well anharmonic oscillator $v_{ao}(x) = x^2 + g^2 x^4$ and the double-well anharmonic oscillator with potential $v_{dw}(x) = x^2 (1 - gx)^2$, the (trans)series in $g$ (which is the perturbation theory in powers of $g$ (the taylor expansion) in the former case $v_{ao}(x)$, supplemented by exponentially-small terms in $g$ in the latter case $v_{dw}(x)$) and the semiclassical expansion in $\hbar$ (the taylor expansion for $v_{ao}(x)$ supplemented by the exponentially small terms in $\hbar$ for $v_{dw}(x)$) for the energies coincide [3]. this property plays a crucially important role in our consideration.

both the quartic anharmonic oscillator

$$v = x^2 + g^2 x^4\ , \qquad (1)$$

with a single harmonic well at $x = 0$, and the double-well potential

$$v = x^2 (1 - gx)^2\ , \qquad (2)$$

with two symmetric harmonic wells at $x = 0$ and $x = 1/g$, respectively, are two particular cases of the quartic polynomial potential

$$v = x^2 + a\,g\,x^3 + g^2 x^4\ , \qquad (3)$$

where $g$ is the coupling constant and $a$ is a parameter. interestingly, the potential (3) is symmetric for three particular values of the parameter $a$: $a = 0$ and $a = \pm 2$. all three potentials (1), (2), (3) belong to the family of potentials of the form

$$v = \frac{1}{g^2}\,\tilde v(g x)\ ,$$

for which there exists a remarkable property: the schrödinger equation becomes one-parametric, since both the planck constant $\hbar$ and the coupling constant $g$ appear in the combination $(\hbar g^2)$, see [2]. this can be immediately seen if, instead of the coordinate $x$, the so-called classical coordinate $u = (g\,x)$ is introduced. this property implies that the action $s$ in the path integral formalism becomes $g$-independent and the factor $\frac{1}{\hbar}$ in the exponent becomes $\frac{1}{\hbar g^2}$ [4]. formally, the potentials (1)-(2), which enter the action, appear at $g = 1$, hence, in the form

$$v = u^2 + u^4\ , \qquad (4)$$

$$v = u^2 (1 - u)^2\ , \qquad (5)$$

respectively. both potentials (4), (5) are symmetric with respect to $u = 0$ and $u = 1/2$, respectively. namely, this form of the potentials will be used in this short note.
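purely as a reading aid (not part of the paper), the two scaled potentials and the symmetry properties just stated can be checked in a few lines; the barrier value 1/16 at the midpoint u = 1/2 follows directly from (5).

```python
import numpy as np

# scaled potentials (4) and (5) in the classical coordinate u = g*x
v_ao = lambda u: u**2 + u**4
v_dw = lambda u: u**2 * (1.0 - u)**2

y = np.linspace(-2.0, 2.0, 81)
assert np.allclose(v_ao(y), v_ao(-y))             # (4) is symmetric about u = 0
assert np.allclose(v_dw(0.5 + y), v_dw(0.5 - y))  # (5) is symmetric about u = 1/2
print(v_dw(0.0), v_dw(1.0), v_dw(0.5))            # wells at u = 0, 1; barrier 1/16
```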
this note is the extended version of a part of the presentation at aamp-18 given by the first author [5].

2. single-well potential

in [1], for the potential (4), by matching the small-distance $u \to 0$ expansion and the large-distance $u \to \infty$ expansion (in the form of a semiclassical expansion) for the phase $\phi$ in the representation

$$\psi = P(u)\, e^{-\phi(u)}$$

of the wave function, where $P$ is a polynomial, the following function was constructed for the $(2n + p)$-excited state with quantum numbers $(n, p)$, $n = 0, 1, 2, \ldots$, $p = 0, 1$:

$$\psi^{(n,p)}_{\rm approximation} \;=\; \frac{u^{p}\,P_{n,p}(u^{2})}{(b^{2} + u^{2})^{1/4}\,\big(b + \sqrt{b^{2} + u^{2}}\big)^{2n + p + \frac{1}{2}}}\; \exp\!\left(-\,\frac{a + (b^{2} + 3)\,u^{2}/6 + u^{4}/3}{\sqrt{b^{2} + u^{2}}} \;+\; \frac{a}{b}\right)\ , \qquad (6)$$

where $P_{n,p}$ is some polynomial of degree $n$ in $u^{2}$ with positive roots. here $a = a_{n,p}$, $b = b_{n,p}$ are two parameters of interpolation. these parameters $(-a)$, $b$ are slowly growing with the quantum number $n$ at fixed $p$, taking, in particular, the values

$$a_{0,0} = -0.6244\ , \quad b_{0,0} = 2.3667\ , \qquad (7)$$

$$a_{0,1} = -1.9289\ , \quad b_{0,1} = 2.5598\ , \qquad (8)$$

for the ground state and the first excited state, respectively. this remarkably simple function (6), see figure 1 (top), provides 10-11 exact figures in the energies of the first 100 eigenstates. furthermore, the function (6) deviates uniformly for $u \in (-\infty, +\infty)$ from the exact function by $\sim 10^{-6}$.

figure 1. two lowest, normalized-to-one eigenfunctions of positive/negative parity: for the single-well potential (4), see (6) (top), and for the double-well potential (5), see (9) (bottom). potentials shown by black lines.

3. double-well potential: wavefunctions

following the prescription, usually assigned in folklore to e. m. lifschitz – one of the authors of the famous course on theoretical physics by l. d. landau and e. m. lifschitz – when a wavefunction $\psi(u)$ for a single-well potential with minimum at $u = 0$ is known, the wavefunction for a double-well potential with minima at $u = 0, 1$ can be written as $\psi(u) \pm \psi(u - 1)$. this prescription was already checked successfully for the double-well potential (2) in [6], for a somewhat simplified version of (6), based on matching the small-distance $u \to 0$ expansion and the large-distance $u \to \infty$ expansion for the phase $\phi$, but ignoring subtleties emerging in the semiclassical expansion. taking the wavefunction (6) one can construct

$$\psi^{(n,p)}_{\rm approximation} \;=\; \frac{P_{n,p}(\tilde u^{2})}{(b^{2} + \tilde u^{2})^{1/4}\,\big(\alpha b + \sqrt{b^{2} + \tilde u^{2}}\big)^{2n + \frac{1}{2}}}\; \exp\!\left(-\,\frac{a + (b^{2} + 3)\,\tilde u^{2}/6 + \tilde u^{4}/3}{\sqrt{b^{2} + \tilde u^{2}}} \;+\; \frac{a}{b}\right) D^{(p)}\ , \qquad (9)$$

where $p = 0, 1$ and

$$D^{(0)} = \cosh\!\left(\frac{a_{0}\tilde u + b_{0}\tilde u^{3}}{\sqrt{b^{2} + \tilde u^{2}}}\right)\ , \qquad D^{(1)} = \sinh\!\left(\frac{a_{1}\tilde u + b_{1}\tilde u^{3}}{\sqrt{b^{2} + \tilde u^{2}}}\right)\ .$$

here

$$\tilde u = u - \frac{1}{2}\ , \qquad (10)$$

$\alpha = 1$, and $a, b, a_{0,1}, b_{0,1}$ are variational parameters. if $\alpha = 0$ as well as $b_{0,1} = 0$, the function (9) is reduced to the ones which were explored in [6], see eqs. (10)-(11) therein. the polynomial $P_{n,p}$ is found unambiguously after imposing the orthogonality conditions of $\psi^{(n,p)}_{\rm approximation}$ to $\psi^{(k,p)}_{\rm approximation}$ at $k = 0, 1, 2, \ldots, (n-1)$; here it is assumed that the polynomials $P_{k,p}$ at $k = 0, 1, 2, \ldots, (n-1)$ are found beforehand.

4. double-well potential: results

in this section we present concrete results for the energies of the ground state (0, 0) and of the first excited state (0, 1), obtained with the function (9) at $p = 0, 1$, respectively, see figure 1 (bottom). the results are compared with the lagrange-mesh method (lmm) [7].
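the reference values quoted below can be reproduced independently. the sketch that follows is not from the paper: it assumes the scaled schrödinger operator is $h = -\,d^2/du^2 + v(u)$ with $v$ from (5) and $\hbar = 1$ (consistent with the $(\hbar g^2)$ scaling at $g = 1$), discretizes it by second-order finite differences, and prints the two lowest eigenvalues; under this assumed normalization they should agree with the variational/lmm energies of the next subsections to roughly five digits at this grid resolution.

```python
import numpy as np
from scipy.linalg import eigh_tridiagonal

# double-well potential (5); wells at u = 0 and u = 1, symmetric about u = 1/2
n = 6000
u = np.linspace(-5.5, 6.5, n)
h = u[1] - u[0]
v = u**2 * (1.0 - u)**2

# h = -d^2/du^2 + v(u) on a uniform grid (second-order finite differences)
diag = 2.0 / h**2 + v
off = np.full(n - 1, -1.0 / h**2)

evals = eigh_tridiagonal(diag, off, eigvals_only=True,
                         select='i', select_range=(0, 1))
print(evals)   # compare with e(0,0) and e(0,1) quoted below
```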
4.1. ground state (0,0)

the ground state energy for (5) obtained variationally using the function (9) at $p = 0$, compared with the lmm result [7], where all printed digits (in the second line) are correct:

$$e^{(0,0)}_{\rm var} = 0.932\,517\,518\,401\ , \qquad e^{(0,0)}_{\rm mesh} = 0.932\,517\,518\,372\ .$$

note that ten decimal digits in $e^{(0,0)}_{\rm var}$ coincide with the ones in $e^{(0,0)}_{\rm mesh}$ (after rounding). the variational parameters in (9) take the values

$$a = 2.3237\ , \quad b = 3.2734\ , \quad a_{0} = 2.3839\ , \quad b_{0} = 0.0605\ ,$$

cf. (7). note that $b_{0}$ takes a very small value.

4.2. first excited state (0,1)

the first excited state energy for (5) obtained variationally using the function (9) at $p = 1$, compared with the lmm result [7], where all printed digits (in the second line) are correct:

$$e^{(0,1)}_{\rm var} = 3.396\,279\,329\,936\ , \qquad e^{(0,1)}_{\rm mesh} = 3.396\,279\,329\,887\ .$$

note that ten decimal digits in $e^{(0,1)}_{\rm var}$ coincide with the ones in $e^{(0,1)}_{\rm mesh}$. the variational parameters in (9) take the values

$$a = -2.2957\ , \quad b = 3.6991\ , \quad a_{1} = 4.7096\ , \quad b_{1} = 0.0590\ ,$$

cf. (8). note that $b_{1}$ takes a very small value, similar to $b_{0}$.

5. conclusions

the approximate expression (9) for the eigenfunctions in the double-well potential (5) is presented. in the nonlinearization procedure [8] the first correction (the first order deviation) to the function (9) can be calculated. it can be shown that for any $u \in (-\infty, +\infty)$ the functions (9) deviate uniformly from the exact eigenfunctions beyond the sixth significant figure, similarly to the function (6) for the single-well case. this increases the accuracy of the simplified function proposed in [5], with $\alpha = 0$ and $b_{0,1} = 0$, in the domain under the barrier, $u \in (0.25, 0.75)$, from 4 to 6 significant figures, leaving the accuracy outside of this domain practically unchanged.

acknowledgements

this work is partially supported by conacyt grant a1s-17364 and dgapa grants in113819, in113022 (mexico). avt thanks the paspa-unam program for support during his sabbatical leave.

references

[1] a. v. turbiner, j. c. del valle. anharmonic oscillator: a solution. journal of physics a: mathematical and theoretical 54(29):295204, 2021. https://doi.org/10.1088/1751-8121/ac0733.
[2] a. v. turbiner, e. shuryak. on connection between perturbation theory and semiclassical expansion in quantum mechanics, 2021. arxiv:2102.04623.
[3] e. shuryak, a. v. turbiner. transseries for the ground state density and generalized bloch equation: double-well potential case. physical review d 98:105007, 2018. https://doi.org/10.1103/physrevd.98.105007.
[4] m. a. escobar-ruiz, e. shuryak, a. v. turbiner. quantum and thermal fluctuations in quantum mechanics and field theories from a new version of semiclassical theory. physical review d 93:105039, 2016. https://doi.org/10.1103/physrevd.93.105039.
[5] a. v. turbiner, j. c. del valle. anharmonic oscillator: almost analytic solution, 2021. talk presented by avt at aamp-18 (sept. 1-3, 2021), prague, czech republic.
[6] a. v. turbiner. double well potential: perturbation theory, tunneling, wkb (beyond instantons). international journal of modern physics a 25(02n03):647–658, 2010. https://doi.org/10.1142/s0217751x10048937.
[7] a. v. turbiner, j. c. del valle. comment on: uncommonly accurate energies for the general quartic oscillator. international journal of quantum chemistry 121(19):e26766, 2021. https://doi.org/10.1002/qua.26766.
[8] a. v. turbiner. the eigenvalue spectrum in quantum mechanics and the nonlinearization procedure. soviet physics uspekhi 27(9):668–694, 1984.
english translation: https://doi.org/10.1070/pu1984v027n09abeh004155.

properties of erbium doped hydrogenated amorphous carbon layers fabricated by sputtering and plasma assisted chemical vapor deposition

v. prajzler, z. burian, v. jeřábek, i. hüttel, j. špirková, j. gurovič, j. oswald, j. zavadil, v. peřina

we report about the properties of carbon layers doped with er3+ ions fabricated by plasma assisted chemical vapor deposition (pacvd) and by sputtering on silicon or glass substrates. the structure of the samples was characterized by x-ray diffraction, and their composition was determined by rutherford backscattering spectroscopy and elastic recoil detection analysis. the absorbance spectrum was taken in the spectral range from 400 nm to 600 nm. photoluminescence spectra were obtained using two types of ar laser (λex = 514.5 nm, λex = 488 nm) and also using a semiconductor laser (λex = 980 nm). samples fabricated by magnetron sputtering exhibited the typical emission at 1530 nm when pumped at 514.5 nm.

keywords: carbon layers, erbium, pacvd, sputtering, photoluminescence.

1 introduction

carbon thin films have attracted the attention of various investigators due to their unique optical, electrical and mechanical properties [1]. various deposition techniques and processes, such as plasma enhanced chemical vapor deposition (pecvd), sputter deposition etc., have been employed to deposit carbon films [2, 3]. research on the deposition of carbon layers is mainly focused on the fabrication of diamond-like carbon (dlc) or carbon nanotubes. our study deals with the development of carbon layers which should have good optical properties (low optical losses, suitable refractive indices, high photoluminescence efficiency) for potential application in integrated optics. [4] and [5] have shown that it is possible to prepare carbon planar waveguides with attenuation lower than 1 db·cm−1. low optical losses are one of the most important preconditions for using a material in integrated optics and for doping it with active ions such as erbium. er3+-doped optical materials are candidates for fabricating optical amplifiers or lasers operating at 1530 nm [6, 7, 8], due to the er3+ intra-4f emission, which corresponds to the 4i13/2 → 4i15/2 transition. this wavelength is commonly used in telecommunication systems, due to the fact that it corresponds to the low-loss window of silica-based optical fibers. the only papers analyzing carbon layers containing erbium ions are by hüttel et al. [9, 10, 11], by baranov [12] and by speranza [13]. the articles report on fabricating erbium doped carbon layers by magnetron sputtering on a silicon substrate. baranov and speranza used a graphite target partially covered with pieces of metallic erbium, and deposition was performed in a gas mixture of argon and c6h12 at a pressure of about 10−3 pa. the carbon layers had an amorphous structure with sp2 arrangement, in which the er concentration varied from 0.15 % to 0.8 %. they observed photoluminescence due to the er3+ transition 2h11/2 → 4i15/2 only in the sample with the highest amount of erbium. in this paper we report on our attempts to fabricate carbon layers with an erbium content higher than in the papers mentioned above.

2 measurements

the structure of the deposited carbon layers was studied by xrd (x-ray diffraction). the compositions of the fabricated samples were determined using nuclear chemical analyses, namely rutherford backscattering spectroscopy (rbs) and elastic recoil detection analysis (erda). the amounts of erbium ions were checked by rbs using both 2.4 mev protons and 2.2 mev alpha particles. the rbs and erda spectra were evaluated by the gisa3 [14] and simnra [15] codes, respectively. the absorbance spectra of the samples were taken in the spectral region from 400 nm to 600 nm at room temperature. a tungsten lamp and an mdr 23 monochromator were used as the light source, and the light transmitted through the samples was detected by a pyro-detector.
the photoluminescence measurements were carried out at three excitation wavelengths:

- an ila-120 ar laser operating at λex = 488 nm, eex = 100 mw (300 k),
- an ar laser operating at λex = 514.5 nm, eex = 300 mw (4 k),
- a semiconductor laser p4300 operating at λex = 980 nm, eex = 500 mw (300 k).

a feu62 photocell was used for detecting wavelengths from 500 nm to 1000 nm, while a ge detector was used for wavelengths from 1000 nm to 1600 nm.

3 sample fabrication and results

erbium containing carbon thin films were deposited using three different procedures:

- sputtering from carbon targets with built-in metallic erbium,
- pacvd growth of carbon thin films from a gas mixture (h2 + ar + erbium isopropoxide, or ch4 + erbium tris),
- magnetron sputtering, placing metallic erbium on the carbon target.

the structures of the deposited carbon layers were then investigated by xrd (x-ray diffraction), and it was found that all of them were amorphous.

3.1 growth of erbium doped carbon thin films using sputtering

the first procedure for depositing erbium doped carbon thin films was sputtering using a carbon (99.8 %) target that contained pieces of metallic erbium. the target was made by pressing milled graphite and erbium metal (7 g c + 0.8 g er), applying different pressures (100 mpa·cm−2 and 200 mpa·cm−2). the final targets were 5 cm in diameter. deposition was performed in an ar atmosphere at a pressure of 12 pa for 20 min. the voltage on the grounded electrodes was −75 v and the distance between the electrodes was 3 cm. a schema of the experimental set-up is shown in fig. 1. the composition of the carbon layers determined by nuclear chemical analysis (rbs and erda) is given in table 1. the actual content of erbium ions depends on the fabrication parameters, and the maximum amount obtained after optimization was 0.02 at %. however, this content was still too low for photoluminescence at 1530 nm.

3.2 growth of erbium doped carbon thin films using pacvd

here "erbium isopropoxide" (c9h21ero3) was used for the first time as the erbium source. the fabrication process was based on a modification of the reactor, as shown in fig. 2 (compare with fig. 1). the deposition of erbium doped carbon layers was carried out using methane (flow of 50 cm3/min) and argon (flow of 30 cm3/min).
the "erbium isopropoxide" (sigma-aldrich) vapor was added into this gas mixture at a temperature from 240 °c to 260 °c and injected into the reactor by the flow of methane and argon. the bias applied at one of the electrodes was −100 v and the power density was from 0.31 w·cm−2 to 0.35 w·cm−2. the deposition rate was in the range from 20 nm/min to 22 nm/min. the temperature of the silicon substrates varied from 50 °c to 100 °c.

fig. 1: schema of the sputtering set-up

table 1: composition of erbium-doped carbon samples as determined by rutherford backscattering analysis and elastic recoil detection analysis

    method used for fabrication of carbon layers         c (at %)  h (at %)  o (at %)  er (at %)
    sputtering (7 g c + 0.8 g er)                            52        45        3       0.02
    pacvd (ch4 + ar + "erbium isopropoxide")                 65        32        3       0
    pacvd (ch4 + "erbium tris") – position 1*                51        27       15       8.7
    pacvd (ch4 + "erbium tris") – position 2*                61        40        5       0.012
    magnetron sputtering (erbium and ytterbium doped)        73        23        4       0.2–20**

    * for more details see fig. 3
    ** depending on the er or er + yb powder laid onto the top of the carbon target (see table 2)

because the deposited samples did not contain erbium even after optimization (for more details see table 1), another erbium source had to be sought. chryssou and pitt [16] reported the fabrication of erbium doped al2o3 layers using pacvd. they used "erbium tris" (er(thd)3) as the erbium source, and their samples showed photoluminescence at 1530 nm even at room temperature. therefore it was decided to fabricate carbon layers doped with erbium using "erbium tris" ([tris(2,2,6,6-tetramethyl-3,5-heptanedionate)] er(+iii)) (sigma-aldrich) as the source. deposition was carried out using methane (flow of 50 cm3/min). "erbium tris" vapor was produced at temperatures from 130 °c to 160 °c, and it was always mixed with methane. there were two ways to insert the "erbium tris" into the vacuum chamber (fig. 3). when the "erbium tris" was in position (1), the concentration of erbium ions in the deposited carbon layers was as high as 9 at %. when the "erbium tris" was in position (2), the erbium concentration was always lower, reaching a maximum of 0.012 at % (see table 1).

fig. 2: schema of the pacvd set-up used for depositing carbon layers from methane and "erbium isopropoxide"

fig. 3: schema of the pacvd set-up used for depositing carbon layers from methane and "erbium tris". for the erbium precursor, two positions (1. and 2.) were experimentally evaluated.

fig. 4 shows the photoluminescence spectrum of the carbon layer fabricated by pacvd. the bands at wavelengths 582 nm and 645 nm come from the carbon layers themselves (similar photoluminescence had already been reported in [17]).
obviously, photoluminescence at 1530 nm (the pl band of the 4i13/2 → 4i15/2 transition) did not occur. the probable reason is that the samples contained oh groups (see table 1) due to the gas sources applied in the growing process (methane as the hydrogen source, residual oxygen contamination). oh impurities are known to quench er luminescence.

fig. 4: photoluminescence spectrum of a carbon layer fabricated by pacvd (λex = 488 nm, t = 300 k); bands at 582 nm and 645 nm

3.3 growth of erbium doped carbon thin films using magnetron sputtering

the carbon layers were deposited by rf magnetron sputtering (balzers pfeiffer pls 160) using a carbon target of 99.9999 % purity. the rf magnetron sputtering set-up is depicted in fig. 5. the typical growth parameters were: deposition temperature 300 k, ar or ar + ch4 gas mixture, total gas pressure from 1 pa to 4 pa; the growing time ranged from 30 min to 4 hrs, with power from 50 w to 150 w (13.56 mhz). for er3+ doping of the carbon samples, er powder (99.9 % purity) was placed on top of the carbon target.

fig. 5: schema of the magnetron sputtering set-up

the rbs analyses and also the erda analyses of the deposited layers proved that the samples contained carbon, argon, oxygen, hydrogen and erbium (see table 1). the list of carbon samples and the relevant content of erbium ions is given in table 2. the amount of incorporated erbium ions differed depending on the area of the target covered by the erbium powder placed on the target, and also depending on the erosion area, represented by the part of the surface covered by the erbium ions. the samples had a total erbium content ranging from 1 at % to 17.1 at %. the highest concentration of erbium (17.1 at %) was found in the sample doped using 1.0004 g of erbium powder.

the absorbance spectrum of the erbium doped carbon layers fabricated by magnetron sputtering is given in fig. 6. this figure shows the typically strong 2h11/2 (523 nm) transition, which corresponds to optical materials doped with erbium ions. the other er3+ transitions, 4f11/2 (451 nm) and 4f7/2 (490 nm), are not so strong, and the transitions 2h9/2 (405 nm), 4f3/2 (441 nm) and 4s3/2 (539 nm) are almost hidden in the background.

photoluminescence spectra of the erbium doped carbon layers fabricated by magnetron sputtering are shown in fig. 7 and fig. 8. the spectrum excited by an ar laser at very low temperature (λex = 514.5 nm, temperature 4 k) is given in fig. 7 and shows quite well the typical photoluminescence bands attributed to the erbium transition 4i13/2 → 4i15/2. fig. 8 shows the spectrum obtained by optical pumping at 980 nm at room temperature. obviously, the peak coming from the erbium 4i13/2 → 4i15/2 transition is very weak, but it is still there, as can be seen from the comparison with the pl spectrum of the reference sample given in the same figure. the intensities of the emission bands in the 1530 nm region did not show any evidence of being connected with the actual concentration of er3+ in the samples (see table 2). the intensities are very low, so a small increase would be difficult to recognize.
however, our opinion is that the higher concentration of er3+ resulted in the formation of clusters which, due to pair interactions, would in fact have a counterproductive concentration quenching effect.

4 conclusion

we report on carbon layers doped with erbium ions deposited by sputtering, by pacvd and by magnetron sputtering onto silicon or glass substrates. the fabricated samples had an amorphous structure. when the carbon samples were fabricated by sputtering, the content of erbium ions was too low to reveal photoluminescence at 1530 nm. carbon layers fabricated by pacvd using a ch4 + ar + "erbium isopropoxide" gas mixture did not even contain erbium. erbium doped carbon samples fabricated by pacvd using "erbium tris" as the erbium source contained erbium ions, but no photoluminescence at 1530 nm was observed. the main reason could be that the carbon samples contained oxygen and hydrogen; oh impurities are known to be a strong quencher of er luminescence. the most successful method for fabricating erbium doped carbon layers was magnetron sputtering. the content of erbium ions in the carbon samples was controlled by the amount of erbium powder placed on the carbon target. under optical pumping at 514.5 nm (4 k), the typical photoluminescence emission due to the 4i13/2 → 4i15/2 transition was observed. this emission, though rather weak, was observed even at room temperature when generated by optical pumping at 980 nm.

table 2: composition of erbium-doped carbon samples fabricated by magnetron sputtering, showing the relations between the amounts of erbium used for doping and the actual concentrations of erbium found in the doped samples

    sample   doping                    er (at %)
    #1       reference sample *        …
    #2       er, m_er = 0.0011 g **     1.2
    #3       er, m_er = 0.0012 g **     1.7
    #4       er, m_er = 0.0106 g **     4.8
    #5       er, m_er = 0.0104 g **     6.0
    #6       er, m_er = 0.5004 g **     8.4
    #7       er, m_er = 0.5004 g **     9.0
    #8       er, m_er = 0.5004 g **    12.5
    #9       er, m_er = 1.0004 g **    15.7
    #10      er, m_er = 1.0004 g **    17.1

    * sample without er doping
    ** weight of er powder placed on top of the carbon target

fig. 6: absorbance spectra of er3+ doped carbon layers

acknowledgments

our research is supported by the grant agency of the czech republic under grant number 102/06/0424 and by research program msm6840770014 of the czech technical university in prague.

references

[1] mandel, t., frischholz, m., helbig, r., birkle, s., hammerschmidt, a.: electrical and optical properties of heterostructures made from diamond-like carbon layers on crystalline silicon. applied surface science, vol. 65 (1993), no. 6, p. 795–799.
[2] smith, s. m., voight, s. a., tompkins, h., hooper, a., talin, a. a., vella, j.: nitrogen-doped plasma enhanced chemical vapor deposited (pecvd) amorphous carbon: processes and properties. thin solid films, vol. 398 (2001), p. 163–169.
[3] kusano, y., evetts, j. e., somekh, r. e., hutchings, i. m.: properties of carbon nitride films deposited by magnetron sputtering. thin solid films, vol. 332 (1998), no. 1–2, p. 56–61.
[4] hüttel, i., gurovič, j., černý, f., pospíšil, j.: carbon and carbon nitride planar waveguides on silicon substrates. diamond and related materials, vol. 8 (1999), no. 2–5, p. 628–630.
[5] hüttel, i., černý, f., gurovič, j., chomát, m., matějec, v.: thin carbon nitride films for integrated optical chemical sensors. spie vol. 3858 (1999), p. 210–217.
[6] polman, a.: erbium implanted thin film photonic materials. journal of applied physics, vol. 82 (1997), p. 1–39.
[7] kenyon, a. j.: recent developments in rare-earth doped materials for optoelectronics. progress in quantum electronics, vol. 26 (2002), no. 4–5, p. 225–284.
[8] van den hoven, g. n., snoeks, e., polman, a., van uffelen, j. w. m., oei, y. s., smit, m. k.: photoluminescence characterization of er-implanted al2o3 films. applied physics letters, vol. 62 (1993), no. 24, p. 3065–3067.
[9] prajzler, v., hüttel, i., nekvindová, p., schröfel, j., macková, a., gurovič, j.: erbium doping into thin carbon optical layers. thin solid films, vol. 433 (2003), no. 1–2, p. 363–366.
[10] prajzler, v., hüttel, i., schröfel, j., nekvindová, p., gurovič, j., macková, a.: carbon layers for integrated optics. photonics, devices, and systems ii, spie (2002), p. 342–347.
[11] prajzler, v., hüttel, i., špirková, j., oswald, j., peřina, v., machovič, v., jeřábek, v.: properties of sputtered carbon layers containing erbium and ytterbium ions. electronic devices and systems eds2006, imaps cs (2006), p. 403–408.
[12] baranov, a. m., sleptsov, v. v., nefedov, a. a., varfolomeev, a. e., fanchenko, s. s., calliari, l., speranza, g., ferrari, m., chiasera, a.: erbium photoluminescence in hydrogenated amorphous carbon. physica status solidi b, vol. 234 (2002), no. 2, p. r1–r3.
[13] speranza, g., calliari, l., ferrari, m., chiasera, a., ngoc, k. t., baranov, a. m., sleptsov, v. v., nefedov, a. a., varfolomeev, a. e., fanchenko, s. s.: erbium doped thin amorphous carbon films prepared by mixed cvd sputtering. applied surface science, vol. 238 (2004), p. 117–120.
[14] saarilahti, j., rauhala, e.: interactive personal-computer data-analysis of ion backscattering spectra. nuclear instruments & methods in physics research section b: beam interactions with materials and atoms, vol. 64 (1992), no. 1–4, p. 734–738.
[15] mayer, m.: simnra user's guide, institut für plasmaphysik, 1998.
[16] chryssou, c. e., pitt, c. w.: er(3+)-doped al2o3 thin films by plasma-enhanced chemical vapor deposition (pecvd) exhibiting a 55 nm optical bandwidth. ieee journal of quantum electronics, vol. 34 (1998), no. 2, p. 282–285.
[17] rusli, amaratunga, g. a. j., silva, s. r. p.: photoluminescence in amorphous carbon thin films and its relation to the microscopic properties. thin solid films, vol. 270 (1995), no. 1–2, p. 160–164.

fig. 7: photoluminescence spectrum of erbium doped carbon layers (λex = 514.5 nm, t = 4 k); the band at 1530 nm is indicated

fig. 8: photoluminescence spectra of erbium doped carbon layers

ing. václav prajzler, ph.d.
e-mail: xprajzlv@feld.cvut.cz

doc. ing. zdeněk burian, csc.
e-mail: burian@fel.cvut.cz

ing. vitězslav jeřábek, csc.
e-mail: jerabek@fel.cvut.cz

department of microelectronics
czech technical university in prague, faculty of electrical engineering
technická 2, 166 27 prague, czech republic

doc. ing. ivan hüttel, drsc.
e-mail: ivan.huttel@vscht.cz

rndr. jarmila špirková, csc.
e-mail: jarmila.spirkova@vscht.cz

institute of chemical technology
technická 5, 166 27 prague, czech republic

ing. ján gurovič
e-mail: jan.gurovic@fs.cvut.cz

department of physics
czech technical university in prague, faculty of mechanical engineering
technická 4, 166 07 prague, czech republic

ing. jiří oswald, csc.
e-mail: oswald@fzu.cz

institute of physics
czech academy of sciences
cukrovarnická 10, 162 53 prague, czech republic

rndr. jiří zavadil, csc.
e-mail: zavadil@ure.cas.cz

institute of radio engineering and electronics
academy of sciences
chaberská 57, 182 51 prague, czech republic

vratislav peřina
e-mail: perina@ujf.cas.cz

institute of nuclear physics
czech academy of sciences
250 68 řež near prague, czech republic

study of concepts of parallel kinematics machines for advanced manufacturing

m. valášek, v. bauma, z. šika

this paper deals with possible new concepts for machine tools based on parallel kinematics for advanced manufacturing. parallel kinematics machines (pkm) enable the mechanical properties of manufacturing machines to be improved. this has been proven by several new machine tool concepts. however, this potential can be and must be increased by applying the principle of redundant actuation. this paper deals with the extension of the concepts of redundantly actuated parallel kinematics structures for five-sided five-axis machine tools and for a free-forming sheet metal forming machine. the design principles of previous successful pkms are summarized and new concepts are proposed. the most important requirement criteria are summarized. the proposed concepts are qualitatively and initially quantitatively evaluated according to these criteria.

keywords: parallel kinematics, machine tools, metal forming machines, conceptual design.

1 introduction

parallel kinematics machines (pkm) represent a new concept in the design of machine tools [6, 7]. this concept has been investigated since the early 1990s. parallel kinematics machines enable the mechanical properties of manufacturing machines, especially the dynamics, to be improved. this has been proven by several new machine tool concepts. however, few pkms have been successful on the market, due to new design problems resulting from the parallel kinematics concept. this leads to reduced workspace and stiffness. however, these problems can be removed and the potential of pkm can be increased by applying the principle of redundant actuation [1]. this paper deals with the extension of the concepts of redundantly actuated parallel kinematics structures for five-sided five-axis machine tools and for a free-forming sheet metal forming machine. these concepts can be divided into full, hybrid and modular pkms.

2 full parallel kinematics machines

in full parallel kinematics machines, all dofs are realized by the motion of the platform, and the platform is suspended by parallel links with drives on the frame. an example of spatial redundant full parallel kinematics is octapod [2] (fig. 1), corresponding to the traditional hexapod. eight links were selected for octapod, instead of the six links of hexapod. in the initial concept, both the frame and the platform were cubes (fig. 2a). the links connecting the vertices of the cubes are translational actuators, which are connected to the frame and the platform by spherical joints. the links have variable lengths. the first design problem was that the initial concept with a cubic platform was singular in the whole workspace. it was necessary to generate a modified platform concept. the modified platform consists of skew, mutually rotated rectangles (fig. 2b), and parameter optimisation of its dimensions was provided. this pkm is non-singular, and has good dexterity in the whole workspace. the workspace is in principle equal to the whole cube of the frame without the boundary layer of the platform thickness. the orientation capability of the platform is good (more than +90 degrees has been achieved). if the platform is just a fraction of the frame cube (e.g. 1:5), then the ratio between the workspace and the machine space is much better than for other parallel kinematics concepts. however, these properties pose quite large demands on the angular extent of the spherical joints, and these requirements have led to a special design of new spherical joints [3]. this involves the possibility of five-sided machining. five-sided machining means that a machine part (e.g. a cube) is fixed within the workspace and all five free sides of this cube are machined without any other fixing. severe problems are posed by collisions. in our case this is influenced by the relative size of the platform, the position of the fixing table within the workspace and the resulting cube being five-side machinable. these parameters were thoroughly optimised, with large computational demands. the result is that a cube with a size of 25 % of the octapod frame can really be five-side machined.
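the recurring claim of this section, that actuators beyond the number of dofs remove singular poses and improve dexterity, can be illustrated on a toy model. the sketch below is purely illustrative (a planar point platform with invented anchor geometry, not the octapod): it computes the worst-case condition number of the kinematic jacobian over a grid of platform positions, for a minimal two-link design and for a design with one redundant link.

```python
import numpy as np

def worst_condition(anchors, points):
    """worst-case condition number of the n x 2 jacobian whose rows are unit
    vectors from the link anchors to the platform point (planar point platform)."""
    worst = 0.0
    for p in points:
        rows = np.array([(p - a) / np.linalg.norm(p - a) for a in anchors])
        s = np.linalg.svd(rows, compute_uv=False)
        worst = max(worst, s[0] / s[-1] if s[-1] > 1e-12 else np.inf)
    return worst

pts = [np.array([x, y]) for x in np.linspace(0.1, 0.9, 17)
                        for y in np.linspace(0.05, 0.9, 18)]
minimal   = [np.array([0.0, 0.0]), np.array([1.0, 0.0])]   # 2 links for 2 dofs
redundant = minimal + [np.array([0.5, 1.0])]               # one extra actuator
print(worst_condition(minimal, pts), worst_condition(redundant, pts))
```

the minimal design degrades badly near the line through its two anchors, while the redundant design stays well conditioned everywhere on the grid; the same mechanism, on a much larger scale, is behind the non-singularity and stiffness gains reported for octapod and octaslide.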
fig. 1: laboratory model of octapod

fig. 2: the initial (a) and final (b) structure of the octapod spatial redundant parallel kinematics

fig. 3: octapod variant for five-sided machining

another example of a spatial full parallel kinematics machine is octaslide [4]. it is a redundant version of hexaslide. it is a parallel kinematics machine whose links have sliding actuators. the basic concept of such a structure is shown in fig. 4. the platform is suspended on 8 links with actuators, whereas hexaslide or pentaslide are suspended on only 6 or 5 links. the concept of redundant actuation completely removes the singularities from the workspace. the stiffness in the whole workspace was increased in maximum values by 65–74 %, and in average values by 43–54 %, compared to hexaslide. the octaslide concept was then optimised as a machine tool for five-axis machining. the hexaslide and octaslide concepts were simultaneously intensively optimized. the result is that octaslide is superior in all mechanical properties. the resulting concept of octaslide is an asymmetric structure (fig. 5), both in the connections of the links with the frame and in the conical form of the platform. the increased orientation angle is ±33°.

fig. 4: kinematical concept of octaslide

3 hybrid parallel kinematics machines

in hybrid parallel kinematics machines, the dofs of the machine are split into two parts, at least one of which is realized by the parallel kinematics concept. there are two groups of concepts for five-axis / five-sided machining. they are based on dividing the required 5 dofs into two parts, with 3+2 dofs or 4+1 dofs. one group of hybrid pkms is based on a planar parallel kinematics mechanism that realizes 3 dofs in a large range, and this mechanism is combined with a translational table that can also rotate (two further dofs). an example is shown in fig. 6. an important feature is redundant actuation, which enables such a large range of motions to be realized.
the other group of hybrid pkms is based on a parallel kinematics mechanism that realizes the spatial translational cartesian motions and one rotation of the spindle (4 dofs), and this mechanism is combined with a rotary table (1 dof). an example is shown in fig. 7. it uses modifications of the advantageous module of trijoint [5] (fig. 8) as the portal horizontal mechanisms for the motion of the quill carrier, for quill travelling and even for spindle rotation.

fig. 5: resulting variant of octaslide with a conical platform

fig. 6: an example of a pkm based on a redundant planar parallel mechanism with a large range of motions

fig. 7: an example of a pkm based on a portal mechanism with redundant modules

4 modular parallel kinematics machines

modular parallel kinematics machines form a special group of hybrid pkms that are based on a pkm for spatial translational cartesian motions combined with a swivel head for additional orientation of the spindle. they use the increased stiffness and dynamics of both parts – the basic pkm for translational motions and the swivel head for orientation. if the swivel head is realized by a pkm, it solves the great problem of traditional swivel heads on composed rotational axes – that the head cannot always move directly to the required orientation. the basic pkm can implement all three cartesian motions (3 dofs) and the swivel head then has only 2 dofs (2 orientation angles), or the basic pkm can implement only two cartesian motions (2 dofs) and the swivel head has 3 dofs (2 orientation angles and a translation, as in the case of current parallel swivel heads). this modular solution has many possible variants. the basic pkm can be based on sliding delta [2] (redundant uran) with three cartesian motions (fig. 9), or on trijoint (fig. 8) or sliding star [2], with two cartesian motions (fig. 10). the swivel head is mounted on the platform of the basic pkm. the swivel head can be based on a traditional cardan mechanism or on parallel mechanisms. a redundantly actuated parallel swivel head with 3 dofs (2 rotations, 1 translation) with increased stiffness and dynamics is shown in fig. 11.

fig. 8: kinematical concept and design of trijoint

fig. 9: kinematical concept and laboratory model of sliding delta (redundant uran)

fig. 10: kinematical concept and laboratory model of sliding star

5 parallel kinematics metal forming machines

there is only one example of the application of parallel kinematics to metal forming machines. this is hexabend, produced by iwu fhg chemnitz – a pkm for free forming of tubes and similar rod profiles. hexabend consists of a traditional feed mechanism for travelling the tube, together with a hexapod for bending the tube in any direction. this concept can be extended to metal forming of sheets (fig. 12). one parallel kinematics mechanism acts as the holder and manipulator, and the other acts as the universal forming tool. it can have planar and spatial variants.

6 conclusion

this paper describes different ways of extending pkms for five-axis / five-sided machining and for metal forming machines.
the initial concept of parallel kinematics was oriented towards such applications, especially five-sided five-axis machining, but design problems leading to a limited workspace prevented the development of pkms for these applications. current progress in designing pkms without these design problems enables pkm to be proposed again for such demanding applications. the paper summarizes several ways of designing such machines.

acknowledgment

the authors appreciate the kind support by msmt grant j04/98:212200008 "development of methods and tools of integrated mechanical engineering".

references

[1] valasek, m., sika, z.: "redundantly actuated parallel kinematics – new concept for machine tools." in: proc. of 1st ifac-conference on mechatronic systems, darmstadt 2000, p. 241–246.
[2] valasek, m., bauma, v., sika, z., vampola, t.: "redundantly actuated parallel structures – principle, examples, advantages." in: neugebauer, r. (ed.): development methods and application experience of parallel kinematics, pks 2002, iwu fhg, chemnitz 2002, p. 993–1009.
[3] valasek, m., sulamanidze, d., bauma, v.: "spherical joint with increased mobility for octapod." in: neugebauer, r. (ed.): development methods and application experience of parallel kinematics, pks 2002, iwu fhg, chemnitz 2002, p. 285–294.
[4] bauma, v., valasek, m., sika, z.: "design and properties of octaslide redundant parallel kinematics." in: proc. of international conference on advanced engineering design aed 03, ctu, prague 2003, p. c3.6/1–8.
[5] petru, f., valasek, m.: "concept, design and evaluated properties of trijoint 900h." in: neugebauer, r. (ed.): proc. of pks 2004 parallel kinematics seminar, chemnitz 2004.
[6] neugebauer, r. (ed.): "development methods and application experience of parallel kinematics." pks 2002, iwu fhg, chemnitz 2002.
[7] neugebauer, r. (ed.): "parallel kinematic seminar pks 2004." iwu fhg, chemnitz 2004.

prof. michael valášek, drsc.
valasek@fsik.cvut.cz

ing. václav bauma, csc.
steinb@fsik.cvut.cz

ing. zbyněk šika, phd.
sika@fsik.cvut.cz

department of mechanics
czech technical university in prague, faculty of mechanical engineering
karlovo nám. 13, 121 35 praha 2, czech republic

fig. 11: redundantly actuated swivel head

fig. 12: sheet metal forming by pkm

analysis of mismatched first order a priori information in iterative source-channel decoding

b. schotsch, p. vary, t. clevorn

due to complexity and delay constraints, a usually significant amount of residual redundancy remains in the source samples after source coding. this residual redundancy can be exploited by iterative source-channel decoding for error concealment and quality improvements. one key design issue in joint source-channel (de-)coding is the index assignment. besides conventional index assignments, optimized index assignments have been developed, e.g., considering zeroth or first order a priori information of the source samples. however, in real-world scenarios it is unlikely that the amount of residual redundancy is constant over time, and thus it may occur that the currently deployed index assignment is suboptimal at times when the residual redundancy differs too much from the amount that it is optimized for. in this paper the performance of optimized index assignments that consider first order a priori knowledge is examined under such suboptimal conditions.

keywords: iterative source-channel decoding, optimized index assignments, first order a priori knowledge, exit charts

1 introduction

since the discovery of turbo codes [1], which allow for channel coding close to the shannon limit with moderate complexity, the turbo principle of exchanging extrinsic information has been extended to various components of the receiver chain. iterative source-channel decoding (iscd) [2, 3] is such an extension. instead of a concatenation of two or more channel decoders, a channel decoder and a soft decision source decoder (sdsd) [4, 5] are iteratively combined, exchanging extrinsic information. unlike turbo channel decoding, which aims at minimizing the bit error rate, iscd mainly aims at error concealment and signal restoration, which is not necessarily connected to a lower bit error rate, but to a higher parameter snr. iscd exploits the a priori knowledge on the residual redundancy of the source codec parameters that remains after imperfect source coding. the a priori knowledge can be a nonuniform probability distribution, an autocorrelation or a cross-correlation.
the source codec parameters can be, e.g., scale factors or predictor coefficients for speech, audio and video signals. delay or complexity constraints prevent a complete removal of the residual redundancy, and therefore, in practice, a quite large amount of residual redundancy remains in the source codec parameters, which can be exploited by iscd. one major design issue in iscd systems is the index assignment. besides the traditional index assignments for noniterative systems, such as natural binary or gray, optimized index assignments have been developed that take into account the possible feedback due to the turbo loop between the channel decoder and the sdsd. in [6, 7] index assignments have been introduced that are optimized considering a nonuniform probability distribution, i.e., the zeroth order a priori information, of the source samples. further enhanced index assignments have been presented in [8], and the corresponding optimization process even takes the first order a priori information, e.g., the autocorrelation, of the source samples into account. this paper focuses on the latter type of index assignments.

in general, an index assignment is chosen in advance and is not exchanged during a transmission, otherwise side information would have to be transmitted to notify the receiver of the change. however, in this paper the index assignment is assumed to be constant during a transmission. if it is a priori known that the source codec parameters bear a specific overall autocorrelation, an appropriate index assignment can be applied exploiting this autocorrelation. but since most signals have a time-varying autocorrelation, it has to be examined how much the performance degrades if the signal correlation does not match the correlation that the index assignment is optimized for.

first, the underlying iscd transmission scheme is described in section 2. since in this paper the performance of index assignments that consider first order a priori knowledge is under examination, more details on this topic are given in section 3. in section 4 the performance (degradation) of the iscd system is shown by means of simulation results utilizing the index assignments under both optimal and suboptimal conditions. in section 5 the simulated scenarios are then analyzed and confirmed by so-called exit (extrinsic information transfer) charts [9].

2 iterative source-channel decoding

the baseband model of the utilized iscd transmission scheme is depicted in fig. 1. source codec parameters u are generated by a gauss-markov source with an inherent autocorrelation ρ, in order to obtain comparable and reproducible results. at time instant τ, K source codec parameters u_{k,τ} are assigned to one frame u_τ, with k = 0, 1, ..., K−1 denoting the position in the frame. in this paper the autocorrelation ρ is constant, in order to simulate a fixed mismatch between the correlation ρ_t of the transmitted source parameters and the assumed correlation ρ_r at the receiver. the autocorrelation takes on values from a finite set, e.g., ρ ∈ {0.0, 0.1, ..., 0.9}. the value-continuous and time-discrete source samples u_{k,τ} are each quantized to a quantizer reproduction level ū_{k,τ} ∈ Ū, where Ū is the quantizer codebook. to each ū_{k,τ} a unique bit pattern x_{k,τ} of M bits is assigned according to the utilized index assignment.
the single bits of a bit pattern x_{k,τ} are denoted by x^{(m)}_{k,τ}, with m = 0, 1, ..., M−1, and the frame of bit patterns is denoted as x_τ. three parameter-snr optimized index assignments considering first order a priori knowledge (soak1) are used, and the natural binary (nb) index assignment serves as a reference. the soak1 index assignments are optimized for different source correlations, thus they are referred to as soak1(ρ). the bit interleaver π scrambles the incoming bits x of the frame x_τ of bit patterns to x̃ in a deterministic manner. to simplify the notation, we restrict the interleaving to a single time frame with index τ and omit the time frame index τ in the following, where appropriate.

for the channel encoding of a frame x̃ of interleaved bits x̃ we utilize ldpc codes, which were first proposed by gallager [10] and rediscovered by mackay [11]. ldpc codes have a very high error correction capability with iterative decoding that is very close to the shannon limit. their performance is comparable or even superior to that of convolutional turbo codes. in this paper we use a modification of short ldpc codes as presented in [12]. identical instances of a short ldpc code are combined into a long ldpc code, whose frame size is flexible in multiples of a subframe size, i.e., the frame size of the short ldpc code. by serially concatenating the subframes with a bit-interleaver and a second component that provides extrinsic information according to the turbo principle (e.g., a soft decision source decoder (sdsd), as in this paper), extrinsic information can also be exchanged between subframes. such concatenated ldpc codes approach very well the performance of long monolithic ldpc codes of the same frame size [12]. the performance of the concatenated ldpc code strongly depends on the performance of the short code. therefore, the short code has to be chosen carefully. as the short ldpc code, a (21,11) difference set cyclic (dsc) code [13] is used. dsc codes feature a high minimum hamming distance, and especially at short block lengths they can outperform comparable pseudo-random ldpc codes [14]. the resulting codeword is denoted as y with bits y, which are mapped to bipolar bits ȳ ∈ {±1} for bpsk transmission with symbol energy e_s = 1. we choose the simple bpsk modulation scheme, since modulation is no design issue in this paper.
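the transmitter chain just described ends at the bpsk mapping; the awgn channel and the channel reliabilities follow in the next paragraph. as a minimal sketch (illustrative only; the bit-to-symbol convention 0 → +1 and the scaling 4·(e_s/n_0)·z are the standard bpsk channel l-values, not spelled out in the text), one frame of coded bits can be transmitted and converted into decoder soft inputs as follows:

```python
import numpy as np

rng = np.random.default_rng(42)
es, rate = 1.0, 11.0 / 21.0            # symbol energy; overall rate of the dsc code
ebn0_db = 2.0
n0 = es / (rate * 10.0**(ebn0_db / 10.0))   # e_s/n_0 = rate * e_b/n_0

y = rng.integers(0, 2, size=1890)      # one frame of coded bits (cf. section 4)
y_bar = 1.0 - 2.0 * y                  # bpsk mapping: 0 -> +1, 1 -> -1 (assumed)
z = y_bar + np.sqrt(n0 / 2.0) * rng.standard_normal(y.size)  # awgn, sigma_n^2 = n0/2
llr = 4.0 * es / n0 * z                # channel l-values fed to the ldpc decoder
print(llr[:5])
```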
on the channel, the bipolar signal ȳ is superposed with additive white gaussian noise (awgn) n with known power spectral density σ_n² = n_0/2, i.e., z = ȳ + n. the received symbols z are evaluated in a turbo process, in which extrinsic reliabilities are exchanged between the ldpc decoder and the sdsd. utilizing ldpc codes results in an additional iterative loop in the ldpc decoder, in which extrinsic information is exchanged between the variable nodes and the check nodes. these iterations are denoted as ldpc-iterations. details about the iscd receiver can be found in [2, 3, 15].

fig. 1: baseband model of iscd with ldpc codes

the ldpc decoder uses the belief propagation algorithm [16, 11] to generate extrinsic information. the sdsd determines the extrinsic information mainly from the natural residual source redundancy which generally remains in the bit patterns x_k after source encoding. such residual redundancy appears on the parameter level, e.g., as a nonuniform distribution p(ū_k), in terms of a correlation, or as other possible time-dependencies. the latter terms of residual redundancy are generally approximated by a first order markov chain, i.e., by exploiting the conditional probabilities p(x_k | x_{k−1}). these conditional probabilities heavily depend on the source correlation. for specific source correlations, e.g., ρ ∈ {0.0, 0.1, ..., 0.9}, they can be calculated once in advance. the technique for combining this a priori information p(x_k | x_{k−1}) on the parameter level with the soft input values p^{[ext]}_{ldpc}(x) on the bit level is also well known in the literature. the algorithm for the computation of the extrinsic p^{[ext]}_{sdsd}(x) has been detailed, e.g., in [2, 3, 15]. as a quality measure we consider the parameter signal-to-noise ratio (snr)

$$\mathrm{snr}_{p} = 10\,\log_{10}\frac{E\{u^{2}\}}{E\{(u - \hat u)^{2}\}}\ . \qquad (1)$$

3 index assignments

the index assignment is a major design factor influencing the performance of iscd transmission systems. several conventional index assignments already exist, such as natural binary (nb) or gray [17]. they are well-suited for noniterative systems [15, 7, 6], but they exhibit only a suboptimal performance in iscd. in this paper we consider only bit patterns consisting of M = 3 bits that are assigned to Q = 2^M = 8 quantizer levels. the utilized index assignments are listed in table 1, but first, the optimization algorithm will be briefly explained. it was introduced in [8] and can be found in more detail there.

table 1: index assignments γ: Ū → x from Q = 8 quantizer levels ū^{(0)}, ..., ū^{(7)} to bit patterns x with M = 3 bits

    index assignment                  [x]_10 = γ(ū^{(0)}), ..., γ(ū^{(7)})
    natural binary (nb)               0, 1, 2, 3, 4, 5, 6, 7
    soak1 (ρ = 0.0)                   0, 1, 3, 2, 7, 6, 4, 5
    soak1 (ρ = 0.7)                   0, 1, 3, 2, 6, 4, 5, 7
    soak1 (ρ = 0.9)                   0, 6, 5, 1, 3, 2, 4, 7

according to the notation in [15], the index assignment γ: Ū → x is given in the corresponding decimal representation [x]_10 for an increasing quantizer level ū. the quantizer levels are consecutively numbered, i.e., ū^{(0)}, ū^{(1)}, ..., ū^{(7)} for Q = 8. thus, the decimal notation of the index assignment, e.g., soak1 for ρ = 0.9, reads (cf. table 1)

γ(ū^{(0)}) = x: [x]_10 = 0 = [000]_2,  γ(ū^{(1)}) = x: [x]_10 = 6 = [110]_2,  ...,  γ(ū^{(7)}) = x: [x]_10 = 7 = [111]_2.

parameter-snr optimized index assignments that take into account first order a priori knowledge are generated by minimizing the overall noise energy function [8]

$$d^{[\mathrm{soak1}]} = \frac{1}{MQ}\sum_{\bar u_{\tau-1} \in \bar{\mathcal u}}\;\sum_{\bar u_{\tau} \in \bar{\mathcal u}}\;\sum_{m=0}^{M-1} p(\bar u_{\tau}, \hat u_{\tau}, \bar u_{\tau-1})\cdot\lVert \bar u_{\tau} - \hat u_{\tau}\rVert^{2} \;\rightarrow\; \min \qquad (2)$$

with

$$p(\bar u_{\tau}, \hat u_{\tau}, \bar u_{\tau-1}) = p(\bar u_{\tau} \,|\, \bar u_{\tau-1})\cdot p(\hat u_{\tau} \,|\, \bar u_{\tau-1})\cdot p(\bar u_{\tau-1})\ .$$

the term ‖ū_τ − û_τ‖² corresponds to the noise energy that originates from estimating the quantizer level û_τ instead of the correct quantizer level ū_τ at time instant τ. here û_τ corresponds to the bit pattern that differs from the bit pattern of ū_τ only at position m. both ū_τ and û_τ are assumed to have the same predecessor ū_{τ−1}. also, only single bit errors are taken into account, since the system is generally supposed to operate on good channels, in which the probability of two or more bit errors occurring in one bit pattern is negligible. in order to determine the overall noise energy, each term ‖ū_τ − û_τ‖² has to be weighted by the probability of its occurrence, p(ū_τ, û_τ, ū_{τ−1}), and summed up. finally, the noise energy function has to be minimized, either by an exhaustive search for small values of M (M ≤ 4), which yields a global optimum, or by the binary switching algorithm [18, 6, 15], which may lead to a local optimum only.
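for M = 3 the exhaustive search is easy to reproduce. the sketch below is an illustration, not the authors' code: the level statistics of the gauss-markov source are estimated by monte carlo instead of being computed exactly, lloyd's algorithm stands in for the exact lloyd-max design, the 1/(MQ) normalization follows the reconstruction of eq. (2) above, and the search may return an assignment that differs from table 1 by a cost-preserving symmetry (e.g., a global bit flip).

```python
import numpy as np
from itertools import permutations

rng = np.random.default_rng(0)
rho, q, m_bits = 0.9, 8, 3

# ar(1) gauss-markov source with unit variance (assumed source model)
n = 200_000
w = rng.standard_normal(n)
u = np.empty(n)
u[0] = w[0]
for t in range(1, n):
    u[t] = rho * u[t - 1] + np.sqrt(1.0 - rho**2) * w[t]

# 8-level quantizer via lloyd's algorithm (stand-in for the lloyd-max design)
levels = np.quantile(u, (np.arange(q) + 0.5) / q)
for _ in range(100):
    edges = (levels[:-1] + levels[1:]) / 2.0
    cells = np.digitize(u, edges)
    levels = np.array([u[cells == i].mean() for i in range(q)])

# empirical level statistics p(prev) and p(cur | prev)
p_prev = np.bincount(cells, minlength=q) / n
trans = np.zeros((q, q))
np.add.at(trans, (cells[:-1], cells[1:]), 1.0)
p_cond = trans / trans.sum(axis=1, keepdims=True)

def noise_energy(gamma):
    """eq. (2) for the index assignment gamma: level index -> decimal bit pattern."""
    inv = np.argsort(gamma)                       # bit pattern -> level index
    d = 0.0
    for prev in range(q):
        for cur in range(q):
            acc = 0.0
            for m in range(m_bits):
                hat = inv[gamma[cur] ^ (1 << m)]  # single bit error at position m
                acc += p_cond[prev, hat] * (levels[cur] - levels[hat])**2
            d += p_prev[prev] * p_cond[prev, cur] * acc
    return d / (m_bits * q)

# brute force over all 8! assignments; feasible for m = 3, takes a short while
best = min((np.array(g) for g in permutations(range(q))), key=noise_energy)
print("nb:", noise_energy(np.arange(q)), " best:", noise_energy(best), best)
```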
3 index assignments

the index assignment is a major design factor influencing the performance of iscd transmission systems. several conventional index assignments already exist, such as natural binary (nb) or gray [17]. they are well-suited for noniterative systems [15, 7, 6], but they exhibit only a suboptimal performance in iscd. in this paper we consider only bit patterns consisting of M = 3 bits that are assigned to Q = 2^M = 8 quantizer levels. the utilized index assignments are listed in table 1, but first the optimization algorithm will be briefly explained; it was introduced in [8] and can be found there in more detail. according to the notation in [15], the index assignment γ: u → x is given in the corresponding decimal representation [x]_10 for increasing quantizer level u. the quantizer levels are consecutively numbered, i.e., u ∈ {u^(0), u^(1), …, u^(7)} for Q = 8. thus, the decimal notation of the index assignment, e.g., of soak1 for ρ = 0.9 (cf. table 1), reads

$$u^{(0)} \mapsto x = (000)_2 = [0]_{10}, \quad u^{(1)} \mapsto x = (110)_2 = [6]_{10}, \;\ldots,\; u^{(7)} \mapsto x = (111)_2 = [7]_{10}.$$

parameter snr optimized index assignments that take into account first order a priori knowledge are generated by minimizing the overall noise energy function [8]

$$d_{M,Q}^{[\text{soak1}]} = \sum_{u_{\lambda-1}} \sum_{u_\lambda} \sum_{m=0}^{M-1} p(\bar{u}_\lambda, u_\lambda, u_{\lambda-1})\,\big(u_\lambda - \bar{u}_\lambda\big)^2 \;\to\; \min, \qquad (2)$$

with $p(\bar{u}_\lambda, u_\lambda, u_{\lambda-1}) = p(\bar{u}_\lambda \mid u_\lambda)\, p(u_\lambda \mid u_{\lambda-1})\, p(u_{\lambda-1})$.

the term (u_λ − ū_λ)² corresponds to the noise energy that originates from estimating the quantizer level ū_λ instead of the correct quantizer level u_λ at time instant λ. ū_λ corresponds to the bit pattern that differs from the bit pattern of u_λ only at position m. both u_λ and ū_λ are assumed to have the same predecessor u_{λ−1}. furthermore, only single bit errors are taken into account, since the system is generally supposed to operate in good channels, in which the probability of two or more bit errors occurring in one bit pattern is negligible. in order to determine the overall noise energy, each term (u_λ − ū_λ)² is weighted by the probability of its occurrence p(ū_λ, u_λ, u_{λ−1}) and summed up. finally, the noise energy function is minimized, either by an exhaustive search for small values of M (M ≤ 4), which yields the global optimum, or by the binary switching algorithm [18, 6, 15], which may lead to a local optimum only.

index assignment                  [x]_10 = γ(u)
natural binary (nb)               0, 1, 2, 3, 4, 5, 6, 7
soak1, ρ = 0.0                    0, 1, 3, 2, 7, 6, 4, 5
soak1, ρ = 0.7                    0, 1, 3, 2, 6, 4, 5, 7
soak1, ρ = 0.9                    0, 6, 5, 1, 3, 2, 4, 7

table 1: index assignments γ: u → x from Q = 8 quantizer levels u ∈ {0, 1, …, Q−1} to bit patterns x with M = 3 bits

4 simulation results

in fig. 2 the parameter snr performance of iscd utilizing the optimized index assignments is compared to the performance with the conventional natural binary index assignment. the parameter snr is shown for various combinations of the source parameter correlation at the transmitter, ρ_t, and the correlation assumed at the receiver, ρ_r. to exploit the residual redundancy at the receiver, the source parameter correlation has to be known there; to that end, it either has to be transmitted as side information or estimated at the receiver. the latter approach has turned out to be very precise and easy to implement. in this paper, however, the mismatch between ρ_t and ρ_r is preset.

fig. 2: parameter snr performance of the source correlation optimized index assignments soak1(ρ) in combination with an autocorrelation mismatch; frame size 330, 8-level lmq, (3_L, 6_S) iterations; curves for nb (reference), soak1(ρ = 0.0), soak1(ρ = 0.7) and soak1(ρ = 0.9)

a reference parameter snr of 13 db is assumed, at which the parameter snr performances will be compared. the parameters emitted by the gauss-markov source (σ² = 1) exhibiting a source correlation ρ_t are grouped into frames of size K = 330 and quantized by an 8-level lloyd-max quantizer (lmq), resulting in 990 uncoded bits per frame. after channel encoding this yields 1890 coded bits per frame. on the receiver side, (3_L, 6_S) iterations are performed, i.e., during each of the six iterations between the ldpc decoder and the sdsd, three ldpc-iterations between the variable nodes and the check nodes of the ldpc decoder are carried out. the set of curves in fig. 2 labeled (ρ_t, ρ_r) = (ρ_t, 0.0) shows the scenario in which no correlation is exploited at the receiver, independently of the actual source correlation. this is also the current state of today's transmission systems, where the available correlation is not utilized at the receiver to enhance the signal quality. in this case the three curves for nb, soak1(ρ = 0.0) and soak1(ρ = 0.7) show about the same performance, while for soak1(ρ = 0.9) a degradation of Δ(E_b/N_0) ≈ 0.6 db can be observed.
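the minimization of eq. (2) can be illustrated with a small sketch; for M = 3 an exhaustive search over all 8! permutations is still feasible (it takes on the order of a minute in pure python). the quantizer levels and transition probabilities below are simplified stand-ins (uniform levels instead of a lloyd-max codebook, a crude gauss-markov transition model), and the constant single-bit-error probability p(ū_λ | u_λ) is dropped since it scales all terms equally:

```python
import itertools
import numpy as np

M, Q, rho = 3, 8, 0.9
levels = np.linspace(-2.0, 2.0, Q)          # stand-in quantizer levels

# crude stand-in for p(u | u_prev) of a gauss-markov source
trans = np.exp(-(levels[:, None] - rho * levels[None, :]) ** 2)
trans /= trans.sum(axis=0, keepdims=True)   # columns sum to 1
prior = np.full(Q, 1.0 / Q)                 # stand-in p(u_prev)

def cost(gamma):
    """overall noise energy of eq. (2) for an index assignment gamma: u -> [x]_10."""
    d = 0.0
    for u_prev in range(Q):
        for u in range(Q):
            for m in range(M):               # single bit errors only
                u_bar = gamma.index(gamma[u] ^ (1 << m))
                d += trans[u, u_prev] * prior[u_prev] * (levels[u] - levels[u_bar]) ** 2
    return d

# exhaustive search, feasible for M <= 4; the binary switching algorithm
# would be used instead for larger M
best = min(itertools.permutations(range(Q)), key=lambda g: cost(list(g)))
print("optimized index assignment [x]_10:", best)
```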
the leftmost set of curves shows the case in which ρ_r = ρ_t = 0.9, i.e., matching correlations. such high values of ρ_t are not unusual for several source codec parameters in current communication systems like gsm or umts. the parameter snr optimized index assignment soak1(ρ = 0.9) yields the highest gain of Δ(E_b/N_0) ≈ 0.6 db compared to the reference index assignment nb. the gain of soak1(ρ = 0.7) is already negligibly small, and the performances of soak1(ρ = 0.0) and nb are almost the same. this shows that if the source parameters exhibit a certain amount of correlation, it is definitely expedient to exploit it. the rightmost set of curves, labeled (ρ_t, ρ_r) = (0.0, 0.9), displays the performance of the scenario in which the source parameters are uncorrelated but the receiver assumes a high correlation (ρ_r = 0.9). when ρ_r ≫ ρ_t, a high performance degradation occurs for all index assignments, the least for nb and the highest for soak1(ρ = 0.9). however, this scenario is very unlikely, since the correlation can be estimated very accurately at the receiver, so that this high mismatch is temporally limited to the instances of abrupt changes of the source parameter correlation. in systems with a high and slowly varying source correlation a reliable estimation of the source correlation is possible, and it is therefore reasonable and feasible to exploit the performance gain of an index assignment optimized for high source correlations. in systems with fast and large source correlation fluctuations a more conservative choice of the index assignment is recommended.

5 exit chart analysis

in this section the system is analyzed by exit (extrinsic information transfer) charts [9], which are a powerful tool for analyzing and optimizing the convergence behavior of iterative systems utilizing the turbo principle, i.e., systems exchanging and refining extrinsic information. the capabilities of the components, in our case the ldpc decoder and the soft decision source decoder (sdsd), are analyzed separately. the extrinsic mutual information i^[ext] obtained by each component is determined for a certain a priori mutual information i^[apri]. both i^[ext] and i^[apri] are calculated on the basis of the actual data and the available extrinsic or a priori information of the data. as a basis for this calculation, histograms of the respective l-values are usually used, e.g., l_ldpc^[ext] for i_ldpc^[ext]. l-values are the log-likelihood ratios of the corresponding probabilities p [19]. for the exit characteristics, the a priori l-values are simulated as uncorrelated gaussian distributed random variables with variance σ_a² and mean ±σ_a²/2 (the sign given by the transmitted bit). in most cases, e.g., for a classical turbo code [1], this is a good assumption [9]. the applicability of exit charts to iscd is demonstrated, e.g., in [15]. since the extrinsic information of one component serves as input a priori information for the other component, the two resulting exit characteristics are plotted in a single graph with swapped axes. the exit characteristic of the ldpc decoder depends on the E_b/N_0 of the channel, while the exit characteristic of the sdsd is independent of the E_b/N_0, since the sdsd has no access to the received channel symbols z. the decoding trajectory (step curve) shows the mutual information i for each iteration in a simulation of the complete system.
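how one measurement point of an exit characteristic can be produced is sketched below: gaussian a priori l-values with mean ±σ_a²/2 and variance σ_a² are generated for known bipolar bits, and the mutual information is then estimated with the common time-average formula i ≈ 1 − E{log₂(1 + e^(−b·L))}, a frequently used shortcut rather than the histogram method described above:

```python
import numpy as np

rng = np.random.default_rng(1)

def apriori_llrs(bits_bipolar, sigma_a):
    """gaussian a priori l-values: mean +/- sigma_a^2/2, variance sigma_a^2."""
    return (sigma_a ** 2 / 2) * bits_bipolar + sigma_a * rng.standard_normal(bits_bipolar.size)

def mutual_information(bits_bipolar, llrs):
    """time-average estimate i = 1 - E{log2(1 + exp(-b*L))}."""
    return 1.0 - np.mean(np.log2(1.0 + np.exp(-bits_bipolar * llrs)))

bits = rng.choice([-1.0, 1.0], size=100_000)
for sigma_a in (0.5, 1.0, 2.0, 4.0):
    L = apriori_llrs(bits, sigma_a)
    print(f"sigma_a = {sigma_a}: i_apri ~ {mutual_information(bits, L):.3f}")
```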
in the optimization process of the system design, the components are chosen such that the (first) intersection of their exit characteristics moves towards the point (i_ldpc^[ext], i_sdsd^[ext]) = (1, 1). in the design of the system, i.e., the design of the characteristics, we always have ρ_t = ρ_r, while in the actual simulation settings ρ_t = ρ_r is not necessarily fulfilled. thus, when the system design matches the actual settings, the decoding trajectory approaches this intersection, indicating that the best possible performance is obtained. the number of steps of the decoding trajectory corresponds to the number of useful iterations. fig. 3(a) depicts the exit charts corresponding to the results with the soak1(ρ = 0.9) index assignment in fig. 2 at E_b/N_0 = 2 db. we can observe that when the system design with the index assignment soak1(ρ = 0.9) matches the simulated settings with (ρ_t, ρ_r) = (0.9, 0.9), the decoding trajectory reaches the intersection of the ldpc exit characteristic and the soak1(ρ = 0.9) exit characteristic, confirming the excellent performance of this case in fig. 2. in the other cases the decoding trajectory ends very early at low amounts of mutual information, especially at low values of i_sdsd^[ext]; here, only values in the range of the soak1(ρ = 0.0) exit characteristic are obtained. with (ρ_t, ρ_r) = (0.0, 0.9), even a significant decrease in mutual information occurs with an increasing number of iterations. this yields the poor results of this case in fig. 2. fig. 3(b) depicts the exit charts of the results with the soak1(ρ = 0.0) index assignment in fig. 2, again at E_b/N_0 = 2 db. as expected, with the simulation settings matching the system design, i.e., (ρ_t, ρ_r) = (0.0, 0.0) in this case, we reach the intersection of the soak1(ρ = 0.0) exit characteristic with the ldpc exit characteristic. however, this intersection occurs at much lower values of mutual information than the intersection with the soak1(ρ = 0.9) exit characteristic that is relevant in fig. 3(a). this relation can also be observed in the corresponding parameter snr performance in fig. 2. the two cases with a mismatch differ noticeably in the soak1(ρ = 0.0) exit chart. for (ρ_t, ρ_r) = (0.0, 0.9), which corresponds to a significant overestimation at the receiver, we get a decreasing decoding trajectory, quite similar to the respective case in fig. 2. but when (ρ_t, ρ_r) = (0.9, 0.9), the iscd receiver correctly uses the high residual redundancy due to the high autocorrelation and significantly outperforms the (ρ_t, ρ_r) = (0.0, 0.0) case. only the wrong index assignment, soak1(ρ = 0.0) instead of soak1(ρ = 0.9), lets the (ρ_t, ρ_r) = (0.9, 0.9) decoding trajectory fall short of the soak1(ρ = 0.9) exit characteristic. taking fig. 3(a) and fig. 3(b) into account, we can observe that by means of exit charts we can accurately analyze, explain and confirm the parameter snr results.
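the bookkeeping of such a decoding trajectory is simple; given the two exit characteristics as functions (the shapes below are purely illustrative, not the measured ldpc/sdsd curves), the zig-zag stepping up to the first intersection can be sketched as:

```python
def decoding_trajectory(exit_ldpc, exit_sdsd, n_iter=20, tol=1e-4):
    """alternate between the two characteristics; stop at the (first) intersection."""
    i_apri_sdsd, points = 0.0, [(0.0, 0.0)]
    for _ in range(n_iter):
        i_ext_sdsd = exit_sdsd(i_apri_sdsd)      # sdsd step
        i_ext_ldpc = exit_ldpc(i_ext_sdsd)       # ldpc step
        points.append((i_ext_sdsd, i_ext_ldpc))
        if abs(i_ext_ldpc - i_apri_sdsd) < tol:  # no more progress
            break
        i_apri_sdsd = i_ext_ldpc
    return points

# purely illustrative characteristic shapes
traj = decoding_trajectory(lambda i: 0.5 + 0.5 * i ** 2, lambda i: 0.3 + 0.6 * i)
for p in traj:
    print(f"({p[0]:.3f}, {p[1]:.3f})")
```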
6 conclusion

in this paper the performance of iterative source-channel decoding utilizing optimized index assignments has been analyzed. the studied index assignments are optimized with respect to the parameter snr and by considering first order a priori knowledge. the simulation results show that high gains are achievable if the source parameter correlation is high and if the correlation assumed at the receiver matches. these findings have been confirmed by an exit chart analysis. with optimized index assignments, additional gains can be achieved compared to conventional index assignments. in case the correlation assumed at the receiver is higher than the actual source parameter correlation, severe deteriorations can occur, depending on how mismatched the correlations are. thus, depending on the dynamics and the amount of correlation of the source codec parameters, either the optimized or the conventional index assignments are better suited. the index assignments optimized for high correlations perform better in systems with a high and slowly varying source correlation, while the conventional index assignments have an advantage in systems with fast and large source correlation fluctuations.

fig. 3: exit charts for soak1(ρ = 0.0) and soak1(ρ = 0.9); a) exit trajectories for soak1(ρ = 0.9), b) exit trajectories for soak1(ρ = 0.0)

7 references

[1] berrou, c., glavieux, a., thitimajshima, p.: near shannon limit error-correcting coding and decoding, in ieee international conference on communications (icc), geneva, switzerland, may 1993.
[2] adrat, m., vary, p., spittka, j.: iterative source-channel decoder using extrinsic information from softbit-source decoding, in ieee international conference on acoustics, speech, and signal processing (icassp), salt lake city, ut, usa, may 2001.
[3] görtz, n.: on the iterative approximation of optimal joint source-channel decoding, ieee journal on selected areas in communications, sept. 2001, p. 1662–1670.
[4] fingscheidt, t., vary, p.: softbit speech decoding: a new approach to error concealment, ieee transactions on speech and audio processing, mar. 2001, p. 240–251.
[5] vary, p., martin, r.: digital speech transmission: enhancement, coding and error concealment, john wiley & sons ltd, 2006.
[6] görtz, n.: optimization of bit mappings for iterative source-channel decoding, in 3rd international symposium on turbo codes & related topics, brest, france, sept. 2003.
[7] hagenauer, j., görtz, n.: the turbo principle in joint source-channel coding, in ieee information theory workshop (itw), paris, france, apr. 2003.
[8] clevorn, t., vary, p., adrat, m.: parameter snr optimized index assignments and quantizers based on first order a priori knowledge for iterative source-channel decoding, in conference on information sciences and systems (ciss), princeton, nj, usa, mar. 2006.
[9] ten brink, s.: convergence behavior of iteratively decoded parallel concatenated codes, ieee transactions on communications, oct. 2001, p. 1727–1737.
[10] gallager, r. g.: low-density parity-check codes, ire transactions on information theory, jan. 1962, p. 21–28.
[11] mackay, d. j. c., neal, r. m.: near shannon limit performance of low density parity check codes, iee electronics letters, aug. 1996, p. 1645–1646.
[12] clevorn, t., oldewurtel, f., vary, p.: combined iterative demodulation and decoding using very short ldpc codes and rate-1 convolutional codes, in conference on information sciences and systems (ciss), baltimore, md, usa, mar. 2005.
[13] weldon, e. j.: difference-set cyclic codes, the bell system technical journal, sept. 1966, p. 1045–1055.
[14] lucas, r., fossorier, m., kou, y., lin, s.: iterative decoding of one-step majority logic decodable codes based on belief propagation, ieee transactions on communications, june 2000, p. 931–937.
[15] adrat, m., vary, p.: iterative source-channel decoding: improved system design using exit charts, eurasip journal on applied signal processing (special issue: turbo processing), may 2005.
[16] pearl, j.: probabilistic reasoning in intelligent systems: networks of plausible inference, morgan kaufmann, 1988.
[17] gray, f.: pulse code communication, united states patent office, us patent 2,632,058, mar. 1953.
[18] zeger, k., gersho, a.: pseudo-gray coding, ieee transactions on communications, dec. 1990, p. 2147–2158.
[19] hagenauer, j., offer, e., papke, l.: iterative decoding of binary block and convolutional codes, ieee transactions on information theory, mar. 1996, p. 429–445.

birgit schotsch, e-mail: schotsch@ind.rwth-aachen.de
peter vary, e-mail: vary@ind.rwth-aachen.de
institute of communication systems and data processing, rwth aachen university, templergraben 55, 52056 aachen, germany

thorsten clevorn, e-mail: thorsten.clevorn@infineon.com
com development center nrw, infineon technologies ag, düsseldorfer landstrasse 401, 47259 duisburg, germany

acta polytechnica 62(2):283–292, 2022, https://doi.org/10.14311/ap.2022.62.0283
© 2022 the author(s). licensed under a cc-by 4.0 licence. published by the czech technical university in prague.

analysis of factors affecting the efficiency of jatropha curcas oil as an asphaltene stabiliser

tomás darío marín velásquez (a, ∗), dany day josefina arriojas tocuyo (b)
a universidad de oriente, petroleum engineering department, av. universitaria, maturín 6201, monagas, venezuela
b petróleos de venezuela, data analysis management, campo rojo, punta de mata 6217, monagas, venezuela
∗ corresponding author: tmarin@gmx.es

abstract. the effect of the temperature and the applied dose on the efficiency of jatropha curcas seed oil as an asphaltene stabiliser was studied. two crude oil samples (light and medium) were used. j. curcas oil was subjected to heating at 100, 150 and 170 °c for 24 h, with an unheated sample (25 °c) as reference, and applied at doses of 2, 4, 6, and 8 µl in 10 ml of sample. the asphaltene instability index (aii) was determined as the ratio between the amount in ml of xylene needed to disperse the flocs and the amount in ml of n-heptane needed to flocculate the asphaltenes. the experimental design was a taguchi factorial with a response surface, for one response variable (aii) and two experimental factors (the applied dose and the heating temperature). for the light crude oil, the optimum conditions were 8 µl and t = 127 °c with an 85.3 % efficiency, and for the medium crude oil, 2 µl and t = 25 °c with a 94.3 % efficiency. the efficiency of j. curcas oil and the influence of the type of crude oil on the results obtained were demonstrated.

keywords: stability, asphaltenes, flocculation, dispersion, jatropha curcas.

1. introduction

asphaltenes are a heavy fraction of petroleum that shows a capacity for self-association and aggregate formation, a phenomenon that can occur in any of the different stages of production and processing due to variations in pressure, temperature, composition, and shear, among others [1]. although the definition of asphaltenes has been the subject of discussion over the years, most researchers agree in defining them as the heavy organic components present in crude oil that are soluble in toluene and insoluble in heptane/pentane [2].
according to the colloidal theory, asphaltenes are surrounded by molecules of a similar structure, called resins, which interact with asphaltenes to improve their solubility in aliphatic media and stabilise them in crude oil [3]. the aforementioned authors also consider that the asphaltenes-crude oil system is in many cases in a thermodynamically unstable state, which causes crude oils to be produced in which the asphaltenes are unstable; that is, when changes occur in the system conditions, they tend to separate from the liquid phase, producing the phenomenon known as asphaltene precipitation. the stability of asphaltenes depends largely on their structure and their interaction with the rest of the oil components. however, the detailed molecular composition of asphaltenes remains unknown in many cases, because crude oil is made up of millions of different organic molecules and asphaltenes are the most complex of all [4]. unstable asphaltenes form aggregates that precipitate and deposit in pipelines and process equipment, causing plugging and loss of productivity, which has generated a wide field of study and research on asphaltene stability and the mechanisms governing it [5–9]. the determination of stability not only leads to defining the tendency of a crude oil to produce asphaltene precipitation but also lays the foundation for the application of methods to prevent the phenomenon [10]. the use of asphaltene stabilising chemicals is the most widely used method for the prevention of asphaltene precipitation due to its effectiveness and low cost [11]. the study of the efficiency of asphaltene stabilising chemical compounds is of vital importance for the oil industry, which has led to investigating compounds such as ethoxylated nonylphenol and hexadecyl-trimethylammonium bromide, which have shown positive performance [11]; likewise, the use of solvents such as toluene as a stabiliser has been studied for its effect as an asphaltene solvent [12]. alkylphenols have also been studied as asphaltene stabilisers, also demonstrating positive effects [13]. the use of aromatic polyisobutylene as an asphaltene stabiliser has been reported as well, with equally positive effects [14]. the use of synthetic chemical compounds as asphaltene stabilisers generates expenses and environmental risks that have led to the search for alternatives, among which are vegetable oils, such as coconut oil, sweet almond oil, andiroba oil, and sandalwood oil, which have shown a certain degree of efficiency in asphaltene stabilisation [15]. coconut oil was also evaluated with positive results, indicating that such an oil can achieve efficiencies even higher than those of synthetic commercial products [16, 17]. another vegetable oil that has been investigated is that of jatropha curcas [18], with results showing that it can be applied as an asphaltene stabiliser.

property                          crude oil a    crude oil b    standard
api gravity                       30.8           25.5           astm d287
viscosity [cp at 40 °c]           5.6            32.7           astm d2196
asphaltenes [%]                   1.5            6.7            astm d6560
water and sediment [%]            0.5            0.5            astm d4007
viscosity-gravity constant (vgc)  0.878          0.854          astm d2501

table 1. properties of the crude oil samples.
similarly, oils such as turnip, rosemary, sesame, chamomile, and olive oils have been evaluated and have shown asphaltene stabilising efficacy [19], as have hazelnut and walnut oils, also with positive results as asphaltene stabilisers [20]. the objective of the present research was to determine the effect of the heating of j. curcas oil and of the applied dosage on the asphaltene stability in two crude oil samples, to achieve a better understanding of the parameters that may influence the performance of this vegetable oil as an asphaltene stabiliser and as an alternative for oil treatment.

2. materials and methods

2.1. crude oil samples

two crude oil samples were used, donated by personnel from the production management of petróleos de venezuela (pdvsa); they came from the producing fields of el furrial and punta de mata in the north of monagas state, venezuela. the properties of the crude oil samples are detailed in table 1.

2.2. jatropha curcas oil

the seeds of j. curcas were collected in the town of el furrial in monagas state, venezuela. mature fruits were collected when the drupe capsule had a dark brown or black coloration. the fruits were transferred to the hydrocarbon processing laboratory of the universidad de oriente, monagas nucleus, venezuela, where the seeds were manually extracted; the shell was removed and the seeds were dried in the sun for 4 days. the seeds were then crushed using a laboratory mixer and the oil was extracted by a solid-liquid extraction procedure, using soxhlet extraction equipment with n-hexane as the extraction solvent. the extract was concentrated in a rotary evaporator and stored in a glass vial, according to the procedure established in previous research [18]. the extraction was performed at a ratio of 70 g of seeds per 250 ml of n-hexane, with an extraction time of four hours. several extractions were performed until 100 ml of oil was obtained. the oil was divided into four parts of 20 ml each, stored in glass bottles, and numbered consecutively from one to four. oil one was not subjected to heating and was kept at laboratory temperature (25 °c); the other oils were subjected to heating in a laboratory oven at different temperatures for 24 hours, as shown in table 2.

oil sample    heating temperature [°c]
1             25
2             100
3             150
4             170

table 2. heat treatment of the j. curcas oil samples.

the temperatures were set at the researchers' discretion, taking into account previous research and the capacity of the laboratory equipment. each heated oil sample was allowed to cool to laboratory temperature (25 °c) and was then characterized by standardized density [21] and viscosity [22] tests.

2.3. asphaltenes instability index (aii) calculation

the asphaltene instability index (aii) was determined for the crude oil samples; it is defined for this research as the ratio between the amount in millilitres of xylene needed to redissolve the asphaltene aggregates visible under an optical microscope (the dispersion point, dp) and the amount in millilitres of n-heptane needed to obtain asphaltene aggregates visible under an optical microscope (the asphaltene flocculation onset, fo), according to equation (1):

$$\text{aii} = \frac{\text{dp}}{\text{fo}}, \qquad (1)$$

where aii = asphaltene instability index [-], fo = flocculation onset [ml], dp = dispersion point [ml]. the asphaltene instability index thus measures the amount of dispersant (in this case xylene) used to stabilise the asphaltenes, in millilitres per millilitre of flocculant used to form the aggregates (n-heptane). the higher the aii value, the more unstable the asphaltenes are.
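as a quick illustration of the index just defined (hypothetical titration volumes, not measured values):

```python
def aii(dp_ml: float, fo_ml: float) -> float:
    """asphaltene instability index of eq. (1): ml of xylene needed to
    redisperse the flocs (dp) per ml of n-heptane needed to reach the
    flocculation onset (fo); higher values mean less stable asphaltenes."""
    return dp_ml / fo_ml

# hypothetical titration of a 10 ml crude oil sample
print(aii(dp_ml=3.0, fo_ml=2.0))   # -> 1.5, relatively unstable asphaltenes
```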
the procedure performed to determine the original aii of the crude oil samples is shown in figure 1; figure 2 shows examples of microphotographs of asphaltene aggregates obtained by the addition of n-heptane and of asphaltenes dissolved upon the application of xylene.

figure 1. flowchart of the procedure to obtain the aii.
figure 2. micro-photographs taken for an oil sample. a: aggregates formed at the fo and b: asphaltenes solubilized at the dp.

j. curcas oil samples were applied to each crude oil sample at doses of 2, 4, 6, and 8 µl in 10 ml and, after mixing for 5 min with magnetic stirring, the aii was determined according to the procedure described in figure 1. the efficiency of each oil sample at each applied dose was calculated by equation (2):

$$\%\text{ef} = \frac{\text{aii}_{\text{original}} - \text{aii}_{\text{dosed}}}{\text{aii}_{\text{original}}} \times 100, \qquad (2)$$

where %ef = percentage efficiency of j. curcas oil, aii_original = original asphaltene instability index of the crude oil, aii_dosed = asphaltene instability index of the crude oil dosed with j. curcas oil.

2.3.1. experimental design

the experimental design evaluated was factorial, taguchi type, with one response variable (aii) and two experimental factors (the oil heating temperature and the applied oil dose). the selected design has 16 runs, with one sample taken in each run. the model allowed estimating the effects of the 2 control factors on the response variable. the experimental analysis included an analysis of variance (anova), a pareto plot, a main effects plot, a response surface, and a response optimization. the statistical analyses were performed with the statistical package statgraphics centurion xvii.

3. results and discussion

according to the properties of the two crude oil samples shown in table 1, there are differences between the two, especially in the viscosity and the percentage of asphaltenes. crude oil b presents properties that characterise it as a heavier and denser fluid. the vgc is a constant with which the composition of a crude oil can be estimated according to the main hydrocarbon groups it contains: crude oils with vgc values between 0.74 and 0.75 are considered paraffinic, those between 0.89 and 0.94 naphthenic, and values between 0.95 and 1.13 are typical of aromatic crudes, while vgc values between 0.76 and 0.88 are typical of crudes of mixed composition [23]. from the above, both samples are considered to be of a mixed base.

oil sample    heating temperature [°c]    density at 25 °c [g/ml]    viscosity at 25 °c [cp]
1             25                          0.917                      34.47
2             100                         0.919                      75.14
3             150                         0.922                      94.13
4             170                         0.910                      117.61

table 3. density and viscosity measured for the j. curcas oil samples.

it is observed that the density remains almost constant, since its variations are low, which is corroborated by the variation coefficient (vc) of 0.49 %. in contrast, the viscosity showed a higher vc of 38.98 %, which is indicative of a relationship between the oil heating temperature and the viscosity. to analyse the relationship between the two properties and the temperature to which the oil was subjected, a multivariate correlation test was performed: between the temperature and the density, the correlation coefficient r = −0.1265 with p = 0.8394 shows that the relationship is low, negative [24] and not significant (p > 0.05), which corroborates the observation made from table 3; on the contrary, the correlation coefficient between the viscosity and the temperature was r = 0.9341 with p = 0.0201, indicating a very strong and significant relationship (p < 0.05).
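a minimal sketch of how such a correlation test can be run on the table 3 values, using scipy's pearson correlation (the paper's exact coefficients may rest on more underlying measurements than the four averaged rows, so small differences are to be expected):

```python
from scipy.stats import pearsonr

temperature = [25, 100, 150, 170]          # °c, table 3
density = [0.917, 0.919, 0.922, 0.910]     # g/ml
viscosity = [34.47, 75.14, 94.13, 117.61]  # cp

r_d, p_d = pearsonr(temperature, density)
r_v, p_v = pearsonr(temperature, viscosity)
print(f"temperature vs density:   r = {r_d:+.4f}, p = {p_d:.4f}")
print(f"temperature vs viscosity: r = {r_v:+.4f}, p = {p_v:.4f}")
```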
the effect of heating on the oil viscosity can be accounted for by changes in composition that occur due to alterations in the fatty acid components of the oil on exposure to heat and oxygen, resulting in isomerization, polymerization, and oxidation reactions [25, 26]. in addition to the decomposition products mentioned above, the heating of fats results in the formation of compounds of relatively high molecular weight [27], which contributes to the increase in viscosity. this behaviour was also reported when analysing the density and viscosity of coconut oil (cocos nucifera) subjected to temperatures between 25 and 200 °c, where the viscosity likewise presented a coefficient of variation greater than 5 %, but without a significant correlation with the temperature [17]. when comparing the average density of 0.917 g/ml with those reported by other investigations, it is consistent with the values of 0.92 g/ml and 0.91 g/ml obtained in previous studies [18, 28]. another study reported a j. curcas oil density of 0.938 g/ml, which differs from the one obtained in our work [29]. the properties of j. curcas oil may vary depending on both climatic and agronomic factors [30], so the results obtained are consistent with those of other investigations.

3.1. results for crude oil a sample

the data obtained after applying the experimental process to crude oil sample a are shown in table 4.

run    dose [µl]    temperature [°c]    aii     %ef
1      2            25                  1.50    0
2      2            100                 0.60    60.0
3      2            150                 0.50    66.7
4      2            170                 0.77    48.7
5      4            25                  1.52    0
6      4            100                 0.52    65.3
7      4            150                 0.50    66.7
8      4            170                 1.06    29.3
9      6            25                  1.52    0
10     6            100                 0.43    71.3
11     6            150                 0.50    66.7
12     6            170                 0.80    46.7
13     8            25                  1.43    4.7
14     8            100                 0.48    68.0
15     8            150                 0.24    84.0
16     8            170                 0.38    74.7

table 4. aii and efficiency results for crude oil a (original aii = 1.50).

the results indicate that the j. curcas oil samples that were subjected to heating showed a promising asphaltene stabilising activity, since efficiencies between 48.7 and 84.0 % were obtained, while the efficiency of the unheated sample, at a dose of 8 µl, was 4.7 %. on average, the efficiency obtained with the 8 µl dose was the highest, at 75.6 %, so the efficiency depends on the applied dose in addition to the heating temperature. the taguchi model used was fitted to a quadratic trend, resulting in the anova analysis shown in table 5.

source           sum of squares    df    mean square    f-ratio    p-value
a: dose          0.0479            1     0.0479         2.32       0.1585
b: temperature   1.8223            1     1.8223         88.42      0.0000
aa               0.1024            1     0.1024         4.97       0.0499
ab               0.0571            1     0.0571         2.77       0.1271
bb               1.1729            1     1.1729         56.92      0.0000
total error      0.2061            10    0.0206
total (corr.)    3.6891            15

table 5. anova results for crude oil sample a.

figure 3. standardized pareto diagram for aii of crude oil sample a.
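the quadratic response-surface model summarized in table 5 can be re-fitted from the raw runs of table 4; a minimal least-squares sketch (numpy only; it reproduces the model form and the overall fit, not necessarily statgraphics' exact sums of squares):

```python
import numpy as np

# table 4: dose [µl], temperature [°c], measured aii
dose = np.repeat([2.0, 4.0, 6.0, 8.0], 4)
temp = np.tile([25.0, 100.0, 150.0, 170.0], 4)
aii = np.array([1.50, 0.60, 0.50, 0.77, 1.52, 0.52, 0.50, 1.06,
                1.52, 0.43, 0.50, 0.80, 1.43, 0.48, 0.24, 0.38])

# full quadratic model: 1, A, B, A^2, A*B, B^2
X = np.column_stack([np.ones(16), dose, temp, dose**2, dose*temp, temp**2])
coef, *_ = np.linalg.lstsq(X, aii, rcond=None)

pred = X @ coef
ss_res = np.sum((aii - pred) ** 2)
ss_tot = np.sum((aii - aii.mean()) ** 2)
print("coefficients (b0, bA, bB, bAA, bAB, bBB):", np.round(coef, 6))
print("r^2 =", round(1 - ss_res / ss_tot, 3))   # the paper reports r^2 = 0.944
```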
table 5 shows that the factor with a significant influence (p-value < 0.05) was the temperature at which the j. curcas oil was heated; likewise, in the applied quadratic model, the quadratic terms of the dose (aa) and of the temperature (bb) also had a significant influence. on the contrary, the applied dose and the dose-temperature interaction did not show a significant influence on the aii values (p-value > 0.05). this result agrees with those obtained by other authors [15, 17, 18]. although the efficiencies varied with the applied oil dose, the differences were not statistically significant at the 5 % significance level, which contrasts with another study in which j. curcas oil was also applied to a medium crude oil and the applied dose was significant [18]; it is agreed, however, that the best dose was 8 µl, with the difference that in the cited research the oil was mixed with diesel oil and not applied pure as in the present research. for c. nucifera oil, the 8 µl dose was likewise found to be the most efficient [16]. the standardized pareto diagram (figure 3) shows that the factor with the most important effect was the temperature, followed by the quadratic factor of the temperature; this is consistent with the analysis of variance. the effect of the temperature is negative, which means that the aii of the crude oil sample varied inversely with the temperature, while the quadratic factor of the temperature was the most important positive effect in the model. the response optimization of the experimental design, to find the dose and temperature values that give the lowest aii, was performed using a response surface, whose model is shown in figure 4. the optimal values were found to be a dose of 8 µl and a temperature of approximately 127 °c, for which a minimum aii value of 0.22 is obtained; the estimated maximum efficiency under the experimental conditions is thus 85.3 %. this result is superior to that reported when applying c. nucifera oil to a crude oil of the same api gravity (30.8), which showed a maximum efficiency as an asphaltene stabiliser of 78.6 % when heated up to 130 °c [17], indicating that j. curcas oil can be a more efficient alternative for asphaltene treatment in light crude samples than c. nucifera oil.

figure 4. estimated response surface for aii of crude oil sample a.

the quadratic mathematical model used for the response surface showed a coefficient of determination r² of 0.944, which indicates that the model predicts 94.4 % of the variability of the aii, a good approximation for an optimization. it is thus demonstrated for crude oil sample a that the decisive factor for the efficiency of j. curcas oil as an asphaltene stabiliser is the temperature to which the oil sample is subjected prior to its application.
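the reported optimum (8 µl, t ≈ 127 °c, aii ≈ 0.22) can be approximated by minimizing that fitted surface inside the experimental region; a self-contained sketch with scipy, refitting the same quadratic model:

```python
import numpy as np
from scipy.optimize import minimize

# refit the quadratic surface of the previous sketch (table 4 data)
dose = np.repeat([2.0, 4.0, 6.0, 8.0], 4)
temp = np.tile([25.0, 100.0, 150.0, 170.0], 4)
aii = np.array([1.50, 0.60, 0.50, 0.77, 1.52, 0.52, 0.50, 1.06,
                1.52, 0.43, 0.50, 0.80, 1.43, 0.48, 0.24, 0.38])
X = np.column_stack([np.ones(16), dose, temp, dose**2, dose*temp, temp**2])
coef, *_ = np.linalg.lstsq(X, aii, rcond=None)

def surface(x):
    """fitted quadratic response surface aii(dose, temperature)."""
    a, b = x
    return coef @ np.array([1.0, a, b, a**2, a*b, b**2])

# search within the experimental region: dose 2-8 µl, temperature 25-170 °c
res = minimize(surface, x0=[5.0, 100.0], bounds=[(2, 8), (25, 170)])
print("optimum dose [µl], temperature [°c]:", np.round(res.x, 1))
print("predicted minimum aii:", round(float(res.fun), 3))  # paper: 0.22 at (8, ~127)
```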
the effect of the heating temperature of j. curcas oil on the asphaltenes is mainly due to the compositional change that the oil undergoes on heating, as reported in previous research [25–27]. the formation of oxides from the fatty acids of j. curcas oil by heating can induce the creation of a surfactant layer that acts as a stabilising agent, so it is to be expected that the oil samples subjected to heating show a higher efficiency. this indicates that the temperature to which the oil is subjected can create a more efficient surfactant system, since it has been demonstrated that the quality of the surfactant used is fundamental for achieving the stability of the asphaltenes in crude oil when a chemical treatment is applied, because the interaction between the asphaltenes and the surfactant molecules is promoted [31].

3.2. results for crude oil b sample

the data obtained after applying the experimental process to crude oil sample b are shown in table 6.

run    dose [µl]    temperature [°c]    aii     %ef
1      2            25                  0.36    74.3
2      2            100                 1.16    17.1
3      2            150                 1.10    21.4
4      2            170                 0.35    75.0
5      4            25                  0.37    73.6
6      4            100                 0.88    37.1
7      4            150                 1.83    0
8      4            170                 0.96    31.4
9      6            25                  1.91    0
10     6            100                 1.54    0
11     6            150                 1.78    0
12     6            170                 0.74    47.1
13     8            25                  1.95    0
14     8            100                 1.71    0
15     8            150                 1.40    0
16     8            170                 0.15    89.3

table 6. aii and efficiency results for crude oil b (original aii = 1.40).

for crude oil sample b, the efficiencies obtained when applying the heated j. curcas oil ranged from 17.1 to 89.3 %. the only dose at which moderate efficiency values were obtained at all temperatures was 2 µl, which on average showed an efficiency of 46.9 %. at the 4 µl dose, an efficiency was obtained at temperatures of 25, 100, and 170 °c, but not at 150 °c. the higher doses (6 and 8 µl) only showed potential with the j. curcas oil sample heated to 170 °c. the taguchi model used was fitted to a quadratic trend, resulting in the anova analysis of table 7.

source           sum of squares    df    mean square    f-ratio    p-value
a: dose          1.4409            1     1.4409         7.10       0.0237
b: temperature   0.2548            1     0.2548         1.26       0.2886
aa               0.1661            1     0.1661         0.82       0.3869
ab               1.4055            1     1.4055         6.93       0.0251
bb               0.8533            1     0.8533         4.21       0.0674
total error      2.0290            10    0.2029
total (corr.)    5.5348            15

table 7. anova results for crude oil sample b.

as can be seen in table 7, the effect of the factors on the aii of crude oil sample b differs from that obtained for crude oil sample a. in this case, the applied dose significantly influenced the aii values (p-value < 0.05), with a non-significant effect of the temperature (p-value > 0.05). among the interactions defined in the quadratic model, only the interaction between the two factors (ab) is significant (p-value < 0.05). the influence of the applied dose coincides with that reported when applying j. curcas oil to a medium crude oil [18], where the dose was significant even though the types of crude oil used in the two investigations differed in composition.

figure 5. standardized pareto diagram for aii of crude oil sample b.

the analysis of the pareto diagram shown in figure 5 reveals that the dose factor not only has the greatest influence on the aii but also a positive one, while the dose-temperature interaction is the factor with the greatest negative influence. it is also important to highlight that the other factors have a negative influence on the aii, although they were not significant, which may have an important effect on the shape of the response surface. the optimization of the aii by response surface was carried out considering a quadratic interaction model, and the resulting graph can be seen in figure 6. figure 6 shows that the lowest aii values were obtained for the dose of 2 µl and at the lowest temperature, which was confirmed when obtaining the dose and temperature that optimise the aii: 2 µl and 25 °c, where the response surface predicts a value of aii = 0.08, corresponding to a maximum predicted efficiency of 94.3 %.
figure 6. estimated response surface for aii of crude oil sample b.

when comparing the estimated optimum efficiency with the maximum efficiency reported for a j. curcas oil heated to 150 °c and applied to a medium crude oil sample of 28.8 °api, which was 88.33 % [18], it follows that the oil was more efficient for the 25.5 °api crude oil sample, although it should be taken into account that samples with different characteristics were used. with the above, it is demonstrated that j. curcas oil is most efficient for the stabilisation of asphaltenes in crude oil sample b without being subjected to heating, as obtained statistically. the quadratic mathematical model used for the response surface analysis, with a coefficient of determination r² = 0.633, predicts the optimal combination of the dose and the heating temperature of j. curcas oil for crude oil sample b with 63.3 % of the variability explained. the results showed a higher efficiency of j. curcas oil compared to synthetic resins, with which values of 55.56 % were obtained [32]; however, as was also observed in other research, the stabilisation of asphaltenes does not depend only on the products used but also on the characteristics of the oils to which they are applied. in the case of crude oil b, the stability behaviour differs from that observed in crude oil a: the applied dose has a significant effect, in addition to its interaction with the temperature. the lowest doses performed best, and among them the 2 µl dose was shown to be efficient both for the oil sample not subjected to heating and for the one heated to 170 °c. from this result it can be said that, for this particular crude oil, the composition of its asphaltenes seems to interact better with the oil in its original state (not subjected to heating) and at lower doses, which was corroborated by the response surface; the structural changes that the oil underwent when heated therefore do not favour the stability of the asphaltenes in this sample. the stability trend of asphaltenes with respect to different chemicals is in most cases not continuous, because the interactions between asphaltenes and surfactants depend on the complexity of the former [33, 34]. according to these results, the characteristics of the crude oil and its composition can determine the optimum dose of an asphaltene stabiliser, as well as the most suitable product for the treatment. these results are consistent with those reported for the stability of two samples of medium crude oil after applying c. nucifera oil, where it was concluded that the results depended on the crude oil sample used [16]. according to previous research, it is accepted that the composition influences the stability of asphaltenes in crude oil: a stable crude oil has higher values of the polar fractions, i.e., asphaltenes, resins, and aromatics. therefore, when crude oil samples with different properties are used, different asphaltene stability results and behaviours can be obtained [35], which justifies the differences observed in the present investigation when using two different crude oil samples.
one of the determining factors in the stability of asphaltenes that can influence the efficiency of stabilising products is the presence of resins, which in the crude oil are considered a key factor as molecules that stabilise the colloidal particles of asphaltenes against aggregation, their mechanism of action being a combined repulsion/adhesion process [36, 37]. likewise, other elements present in the crude oil composition, such as paraffin waxes, water, and organometallic compounds, act as destabilisers of asphaltenes, while the presence of fine solids in the oil can stabilise asphaltenes [38, 39]. the use of j. curcas oil as an asphaltene stabiliser is shown to be feasible according to the results obtained; this is possibly due to its composition, formed by fatty acids, mostly linoleic acid [40, 41], which has been shown to have asphaltene-stabilising properties, as well as other acids such as palmitic acid, which is also present in the composition of the evaluated oil [15]. the polar nature of the fatty acid components of vegetable oils, such as that of j. curcas, gives them surfactant properties, making them potential asphaltene stabilisers. this is supported by the following considerations: it is of utmost importance to find new substances with a higher solubility, given the good results obtained with organic acids and taking into account that vegetable oils are mixtures rich in free or glyceride-forming organic acids; vegetable oils are easy to obtain and cheaper than most of the polymeric dispersants used commercially; and, provided the state of conservation of such substances is sufficient, their use could have positive social and economic consequences [19].

4. conclusion

it is concluded that j. curcas oil is a viable alternative for the stabilisation of asphaltenes; however, its maximum efficiency depends on the applied dose, on the temperature to which it is subjected in a heating process before its application, and on the characteristics of the crude oil to which it is applied. the compositional changes that occur in j. curcas oil when it is heated are a determining factor in its efficiency as a stabiliser of asphaltenes in crude oil, showing that when the oil is heated, a higher stabilisation efficiency can be obtained. the taguchi factorial experimental design with a response surface demonstrated that the optimum dosage and type of oil (based on the heating temperature) to be applied to a particular crude oil can be obtained, an approach that can also be applied to the selection of other products used for asphaltene stabilisation at the laboratory level.

references

[1] l. gonçalves. precipitation of asphaltenes in petroleums induced by n-alkanes in the presence or absence of carbon dioxide (co2) [in portuguese]. ph.d. thesis, universidade estadual de campinas, brasil, 2015.
[2] h. belhaj, h. a. khalifeh, n. al-huraibi. asphaltene stability in crude oil during production process. journal of petroleum & environmental biotechnology 4(3):1000142, 2013. https://doi.org/10.4172/2157-7463.1000142.
[3] e. buenrostro-gonzález, c. lira-galeana, a. gil-villegas, j. wu. asphaltene precipitation in crude oils: theory and experiments. aiche journal 50(10):2552–2570, 2004. https://doi.org/10.1002/aic.10243.
[4] f. j. martín-martínez, e. h. fini, m. j. buehler. molecular asphaltene models based on clar sextet theory. rsc advances 5(1):753–759, 2015. https://doi.org/10.1039/c4ra05694a.
[5] a. chamkalani. correlations between sara fractions, density, and ri to investigate the stability of asphaltene. international scholarly research notices 2012:219276, 2012. https://doi.org/10.5402/2012/219276.
[6] a. chamkalani, a. h. mohammadi, a. eslamimanesh, et al. diagnosis of asphaltene stability in crude oil through "two parameters" svm model. chemical engineering science 81:202–208, 2012. https://doi.org/10.1016/j.ces.2012.06.060.
[7] a. a. gabrienko, v. subramani, o. n. martyanov, s. g. kazarian. correlation between asphaltene stability in n-heptane and crude oil composition revealed with in-situ chemical imaging. adsorption science & technology 32(4):243–255, 2014. https://doi.org/10.1260/0263-6174.32.4.243.
[8] r. guzmán, j. ancheyta, f. trejo, s. rodríguez. methods for determining asphaltene stability in crude oils. fuel 188:530–543, 2017. https://doi.org/10.1016/j.fuel.2016.10.012.
[9] s. fakher, m. ahdaya, m. elturki, a. imqam. an experimental investigation of asphaltene stability in heavy crude oil during carbon dioxide injection. journal of petroleum exploration and production technology 10:919–931, 2020. https://doi.org/10.1007/s13202-019-00782-7.
[10] m. z. hasanvand, m. montazeri, m. salehzadeh, et al. a literature review of asphaltene entity, precipitation, and deposition: introducing recent models of deposition in the well column. journal of oil, gas and petrochemical sciences 1(3):83–89, 2018. https://doi.org/10.30881/jogps.00016.
[11] r. g. martins, l. s. martins, r. g. santos. effects of short-chain n-alcohols on the properties of asphaltenes at toluene/air and toluene/water interfaces. colloids and interfaces 2(2):13–22, 2018. https://doi.org/10.3390/colloids2020013.
[12] a. natarajan, n. kuznicki, d. harbottle, et al. understanding mechanisms of asphaltene adsorption from organic solvent on mica. langmuir 30(31):9370–9377, 2014. https://doi.org/10.1021/la500864h.
[13] l. goual, m. sedghi, x. wang, z. zhu. asphaltene aggregation and impact of alkylphenols. langmuir 30(19):5394–5403, 2014. https://doi.org/10.1021/la500615k.
[14] t. e. chávez-miyauchi, l. s. zamudio-rivera, v. barba-lópez. correction to aromatic polyisobutylene succinimides as viscosity reducers with asphaltene dispersion capability for heavy and extra-heavy crude oils. energy & fuels 30(1):758, 2016. https://doi.org/10.1021/acs.energyfuels.5b02945.
[15] l. c. rocha, m. silva ferreira, a. c. da silva ramos. inhibition of asphaltene precipitation in brazilian crude oils using new oil soluble amphiphiles. journal of petroleum science and engineering 51(1-2):26–36, 2006. https://doi.org/10.1016/j.petrol.2005.11.006.
[16] y. b. bello, j. r. manzano, t. d. marín. comparative analysis of the dispersing efficiency of asphaltenes of products based on coconut oil (cocos nucifera) as active component and commercial dispersants applied to oil samples from the furrial field, monagas state, venezuela [in spanish]. revista tecnológica espol – rte 28(2):51–61, 2015.
[17] t. d. marín. coconut oil (cocos nucifera) as an asphaltene stabilizer in a crude oil from monagas state, venezuela: effect of temperature [in spanish]. ingeniería y desarrollo 37(2):289–305, 2019. https://doi.org/10.14482/inde.37.2.1627.
[18] t. marín, s. marcano, m. febres. evaluation of jatropha curcas oil as an asphaltene dispersant additive in a crude oil from the furrial field, venezuela [in spanish]. ingeniería 20(2):98–107, 2016.
[19] e. mardani, b. mokhtari, b. soltani soulgani. comparison of the inhibitory capacity of vegetable oils and their nonionic surfactants on iran crude oil asphaltene precipitation using quartz crystal microbalance. petroleum science and technology 36(11):744–749, 2018. https://doi.org/10.1080/10916466.2018.1445103.
[20] n. mohamadshahi, a. r. solaimany nazar. experimental evaluation of the inhibitors performance on the kinetics of asphaltene flocculation. journal of dispersion science and technology 34(4):590–595, 2013. https://doi.org/10.1080/01932691.2012.681608.
[21] astm d1298. standard test method for density, relative density, or api gravity of crude petroleum and liquid petroleum products by hydrometer method. astm international, west conshohocken, 2017. https://doi.org/10.1520/d1298-12br17.
[22] astm d2196. standard test methods for rheological properties of non-newtonian materials by rotational viscometer. astm international, west conshohocken, 2018. https://doi.org/10.1520/d2196-18e01.
[23] a. bded, t. hameed khlaif. evaluation properties and pna analysis for different types of lubricants oils. iraqi journal of chemical and petroleum engineering 20(3):15–21, 2019. https://doi.org/10.31699/ijcpe.2019.3.3.
[24] w. hopkins. a new view of statistics, 2014. https://complementarytraining.net/wp-content/uploads/2013/10/will-hopkins-a-new-view-of-statistics.pdf.
[25] l. brühl. fatty acid alterations in oils and fats during heating and frying. european journal of lipid science and technology 116(6):707–715, 2014. https://doi.org/10.1002/ejlt.201300273.
[26] d. n. raba, d. r. chambre, d.-m. copolovici, et al. the influence of high-temperature heating on composition and thermo-oxidative stability of the oil extracted from arabica coffee beans. plos one 13(7):e0200314, 2018. https://doi.org/10.1371/journal.pone.0200314.
[27] w. w. nawar. thermal degradation of lipids. a review. journal of agricultural and food chemistry 17(1):18–21, 1969. https://doi.org/10.1021/jf60161a012.
[28] n. araiza, l. alcaraz-meléndez, m. a. angulo, et al. physicochemical properties of jatropha curcas seed oil from wild populations in mexico [in spanish]. revista de la facultad de ciencias agrarias uncuyo 47(1):127–137, 2015.
[29] s. a. garcía-muentes, f. lafargue-pérez, b. labrada-vázquez, et al. physicochemical properties of oil and biodiesel produced from jatropha curcas l. in the province of manabí, ecuador [in spanish]. revista cubana de química 30(1):142–158, 2018.
[30] a. k. m. aminul islam, z. yaakob, n. anuar, et al. physiochemical properties of jatropha curcas seed oil from different origins and candidate plus plants (cpps). journal of the american oil chemists' society 89(2):293–300, 2012. https://doi.org/10.1007/s11746-011-1908-7.
[31] m. ahmadi, z. chen. molecular interactions between asphaltene and surfactants in a hydrocarbon solvent: application to asphaltene dispersion. symmetry 12(11):1767–1785, 2020. https://doi.org/10.3390/sym12111767.
[32] b. gutiérrez. evaluation of the dispersant properties of modified (i) resins from hydrotreated blackberry crude oil on asphaltenes at laboratory level [in spanish]. master's thesis, universidad de carabobo, valencia, venezuela, 2017.
[33] z. rashid, c. d. wilfred, t. murugesan. effect of hydrophobic ionic liquids on petroleum asphaltene dispersion and determination using uv-visible spectroscopy. aip conference proceedings 1891(1):020118, 2017. https://doi.org/10.1063/1.5005451.
[34] y. v. larichev, a. v. nartova, o. n. martyanov. the influence of different organic solvents on the size and shape of asphaltene aggregates studied via small-angle x-ray scattering and scanning tunneling microscopy. adsorption science & technology 34(2-3):244–257, 2016. https://doi.org/10.1177/0263617415623440.
[35] s. ashoori, m. sharifi, m. masoumi, m. m. salehi. the relationship between sara fractions and crude oil stability. egyptian journal of petroleum 26(1):209–213, 2017. https://doi.org/10.1016/j.ejpe.2016.04.002.
[36] j. c. pereira, i. lópez, r. salas, et al. resins: the molecules responsible for the stability/instability phenomena of asphaltenes. energy & fuels 21(3):1317–1321, 2007. https://doi.org/10.1021/ef0603333.
[37] c. garcía-james, f. pino, t. marín, u. maharaj. influence of resin/asphaltene ratio on paraffin wax deposition in crude oils from barrackpore oilfield in trinidad. in spett 2012 energy conference and exhibition, port-of-spain, trinidad, 2012. https://doi.org/10.2118/158106-ms.
[38] a. prakoso, a. punase, k. klock, et al. determination of the stability of asphaltenes through physicochemical characterization of asphaltenes. in spe western regional meeting, anchorage, alaska, usa, 2016. https://doi.org/10.2118/180422-ms.
[39] a. prakoso, a. punase, e. rogel, et al. effect of asphaltene characteristics on its solubility and overall stability. energy & fuels 32(6):6482–6487, 2018. https://doi.org/10.1021/acs.energyfuels.8b00324.
[40] l. f. campuzano-duque, l. a. ríos, f. cardeño-lópez. compositional characterization of the fruit of 15 varieties of jatropha curcas l. in the department of tolima, colombia [in spanish]. corpoica ciencia y tecnología agropecuaria 17(3):379–390, 2016. https://doi.org/10.21930/rcta.vol17_num3_art:514.
[41] p. guevara-fefer, n. niño-garcía, y. de-jesús-romero, g. sánchez-ramos. jatropha sotoinunyezii and jatropha curcas, species from tamaulipas: a comparison from a bio-fuels perspective [in spanish]. cienciauat 11(1):91–100, 2016. https://doi.org/10.29059/cienciauat.v11i1.769.
acta polytechnica 62(2):313–321, 2022, https://doi.org/10.14311/ap.2022.62.0313
© 2022 the author(s). licensed under a cc-by 4.0 licence. published by the czech technical university in prague.

cfd modelling of secondary settling tanks: generalization based on database relations

ondřej švanda∗, jaroslav pollert
czech technical university in prague, faculty of civil engineering, department of sanitary and ecological engineering, thákurova 2077/7, prague 6, czech republic
∗ corresponding author: ondrej.svanda@cvut.cz

abstract. the area of secondary settling tank modelling using numerical methods has been quite extensively explored and researched by numerous authors and papers. the existing models utilize different approaches, from efforts to create solely deterministic models to attempts at generalized or calibrated empirical models. nevertheless, the processes are not easy to simulate due to the high complexity of the physics, which involves multiple phases, bio-chemical reactions and non-newtonian fluids. therefore, additional effort should be focused on improving these models and on validating them against experimental measurements. this article is focused on creating a numerical model for settling tank optimization which builds on the previous works and is extended with newly obtained relations from extensive experimental measurements, using a database approach.

keywords: cfd modelling, secondary settling tank, sludge database.

1. introduction

the field of numerical solutions of secondary settling tanks (sst) has been under the scope of many researchers, and over the years numerous different models have been developed in order to describe the driving physics. a comprehensive summary of the early works in sst modelling was done by ekama [1] and was later extended by samstag [2]. he mentions the historically first attempts to use cfd for sedimentation purposes, done by mccorquodale and his students using the methods of roache [3] and patankar [4]. later, zhou & mccorquodale [5] used a standard k-ε turbulence model with the incorporation of solids transport and a settling model using the double exponential equation of takács [6]. they concluded that the velocity pattern of the water-only flow is significantly different from the one containing solids. a more advanced model was introduced by griborio [7], who developed a model that also included flocculation and used the vorticity/stream function formulation to model the fluid pressure correctly. the impact of flocculation in a centre-well design tank was studied by griborio & mccorquodale [8]; they stated that the influence of flocculation on the hydraulic performance is low.
a recognizable author is de clercq [9], who introduced a 2d model based on a commercial solver that took into account flocculation, solids transport, and density coupling with the herschel–bulkley rheological model. the possibilities of using a mixture model are well described in the phd thesis by burt [10], where the author extensively validates and verifies an extended drift-flux model to be used in clarifier process modelling. as a result, he points out that improved models are required for flocculent and discrete settling, since those cannot be captured by the standard takács settling function. the general problem is to incorporate all 5 regimes of sedimentation into one framework, which leads to the need for a generalized sedimentation model. in recent years, there have been several attempts to do so, e.g. morse, sickza & nielsen [10], or ramin et al. [11], who introduced an extension of takács' model for hindered settling to account for the compressive settling region. one of the most recent models is from wimshurst & burt [12], who modified the standard takács settling equation to account for lower-velocity compression settling, but, as he points out in the paper, it does not properly account for the flocculant and discrete settling phases. he also demonstrates the use of a response surface method to predict the behaviour under different conditions without the need to use a cfd model. this response surface is created by 64 cfd simulations of different conditions, but it is not compared to any measured data to validate its accuracy outside the initial data. the developed models differ in the dimensionality of the approach, in the complexity of the physics taken into account and in the approach chosen. one type of model is focused on discrete particle settling, using different particle classes and modelling their kinematics; this does not capture the hindered phase correctly and does not seem to be the correct path towards a generalized sedimentation model, for several reasons. the second type of model considers the sludge phase as monodisperse, which is beneficial for the hindered phase but struggles to correctly account for the flocculation and discrete particle sedimentation phases. recent research nevertheless focuses on these models, as they are continuously being improved. what all these models share in common is that their performance decreases when they are used outside the sludge parameters they were calibrated on, or they require obtaining the sludge parameters for every settling tank they are trying to assess. table 1. sludge sample properties and external properties recorded during the experimental campaign. properties recorded for every sample: sampling – depth in the tank; sampling – radial coordinate of the tank; time and date; temperature; hindered sedimentation velocity; viscosity; sludge volume index; density; suspended solids concentration; flocs size distribution (small, medium, large); microflocs; core consistency (compactness); filament index; fragmentation; buoyancy; turbidity. external properties: sst inlet flow rate; sst outlet flow rate; sst sludge removal flow rate; sst inlet suspended solids concentration; sst outlet suspended solids concentration; sst flocculant dosage; weather (dry/rain); sst suspended solids concentration profile.
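the per-sample record listed in table 1 above later becomes one row of the sludge database described in section 2.2. purely as an illustration, a minimal sketch of such a record in python; the field names are our own hypothetical choices, since the paper's actual microsoft access schema is not published:

from dataclasses import dataclass
from datetime import datetime

@dataclass
class SludgeSample:
    """one complex sampling from the sst (field names are illustrative)."""
    sampling_id: int            # key linking all tests of one sampling
    depth_m: float              # sampling depth in the tank
    radius_m: float             # radial coordinate of the sampling point
    taken_at: datetime
    temperature_c: float
    hsv_m_per_h: float          # hindered sedimentation velocity
    svi_ml_per_g: float         # sludge volume index
    density_kg_m3: float
    ss_conc_g_l: float          # suspended solids concentration
    weather: str                # "dry" or "rain"
    inlet_flow_m3_s: float      # external property: sst inlet flow rate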
also, up to this point, the cfd models have not been validated against the real values in settling tanks during operation, but rather against laboratory-measured data, which is not optimal given the number of factors influencing the sedimentation. this paper describes an attempt to create a generalized sedimentation model with a different approach from all previous works. it utilizes the takács sedimentation equation for the hindered settling, as it has been verified many times before to provide good results for the hindered settling phase, but modifies it by additional sub-models to account for the flocculation phase and floc breakups and, more importantly, adds a modifying coefficient to the takács sedimentation curve to allow for an adjustment to different flow and sludge conditions, as the sedimentation is influenced by numerous biological and chemical aspects that change the settleability of the sludge and cannot be described by a single parameter. the novelty of this approach lies in the utilization of a big set of experimental data obtained over a long period of time. these data were put into a database in order to find relations between different influences, which allows us to create a cfd model that can account for the sedimentation changes under different tank conditions. to be able to validate the model, a screening method for monitoring the suspended solids concentration distribution inside the settling tank was developed. 2. materials and methods the methodology consisted of several subsequent steps. first, the experimental data of the sludge properties of interest were gathered at the central waste water treatment plant in prague (cwwtp) from two different ssts. subsequently, the measured data were evaluated partly in situ and partly in the lab for the more complex sampling. for each sample, a report including all measured and calculated properties was made. due to the extensive amount of data gathered over a period of two years, a database framework was created to process the data and to find sludge sedimentation dependencies, which were then used as an input for the cfd model. for the purposes of running the tests in situ, a temporary field lab was built next to the ssts. it houses two settling columns for sedimentation tests, a viscosimeter, a sludge pump and other accessories. there was a total of 9 data-gathering campaigns, from april to october 2017 at sst dn1 and then from april to september 2018 at tank dn3. specific data for the needs of the numerical model were also measured in 2019. a total of 136 complex samplings were conducted. the extent of the data analysed for each sample is summarized in table 1. one of the main tests was the settling column test conducted at the site. two three-meter-high cylindrical columns with a diameter of 0.3 m were used. the sludge from different locations in the sst was pumped into the columns up to a height of 2.8 m using a standard submerged sludge pump. the height of the interface between the water and the sludge was recorded every 5 minutes for the entire length of the test, taking 1 or 2 hours. the outcome of this test was the hindered settling velocity (hsv), taken as the slope of the linear part of the curve (m/h). later, in the ctu lab, spectrophotometry was used to obtain the concentration of extracellular polymers (carbohydrates, proteins and humic substances). the suspended solids concentration was measured gravimetrically. rheological properties of the samples were measured using the rotary viscometer (rheometer rc20).
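the hsv extraction described above is a straight-line fit to the linear part of the interface-height record. a minimal sketch follows; the 5-minute record and the choice of the linear window are illustrative and not the authors' actual processing script:

import numpy as np

# interface height [m] recorded every 5 min during a column test (illustrative data)
t_min = np.arange(0, 65, 5)
h_m = np.array([2.80, 2.79, 2.72, 2.60, 2.48, 2.36, 2.25, 2.13,
                2.02, 1.93, 1.87, 1.83, 1.81])

# fit only the linear (hindered settling) part of the curve, here points 2..9
lin = slice(2, 10)
slope_m_per_min, _ = np.polyfit(t_min[lin], h_m[lin], 1)

hsv_m_per_h = -slope_m_per_min * 60.0   # hsv is the magnitude of the slope, in m/h
print(f"hsv = {hsv_m_per_h:.2f} m/h")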
figure 1. distribution of suspended solids concentration in the sst (mg/l) – rainless flow. the viscometer tests were conducted in a cylinder/cylinder setup suitable for non-newtonian fluids. the strain rate range was chosen to be 0–1000 s−1 during the tests in 2017 and was then changed to 3.5–500 s−1 during 2018. the overall time of the test was 300 s, with a resolution of 50 values per test. the postprocessing of the data was done in rheotec 3000, v2.0. the dynamic viscosity was calculated as: µ = τ / (∂u/∂y), (1) where µ is the dynamic viscosity (pa·s), τ is the shear stress and ∂u/∂y is the strain rate. the maximum strain rate in the sst obtained from a cfd simulation was around 20 s−1, meaning that the resolution of the rheometer rc20 was not sufficient for analysing the sludge in the sst, as there was only 1 value between 0 and 20 s−1. changing the maximum test strain rate and/or the sample resolution did not lead to usable results with the rheometer rc20. consequently, in 2019, a new viscometer, brookfield dv2tlv, was obtained in order to measure the viscosity in the range of interest of 0–20 s−1. the density of the sludge was measured using a 100 ml pycnometer. the weight of a dry pycnometer was recorded and then it was filled with the sample and closed. the redundant sample overflowed through a capillary and its weight was measured. the density was then calculated from the weight difference of the dry and full pycnometer. also, the temperature of the sample was recorded using a wtw multi 3430 multimeter. the sludge volume index (svi) was determined as the specific volume of activated sludge after 30 minutes of settling in a 1 l container, related to the suspended solids concentration: svi = v30 / x, (2) where v30 is the settled sludge volume and x is the suspended solids concentration. the concentration of suspended solids was measured gravimetrically according to horáková et al. [13]. as a filter, pragopor nitrocellulose 0.4 µm was used, pre-dried at 105 ◦c. the filtered volume was chosen based on the suspended solids concentration; it ranged between 5–20 ml for sludge samples, 50 ml for supernatant and 100 ml for the water outflow. 2.1. sludge concentration profile for the numerical model validation, it is important to capture the distribution and depth of the sludge blanket. for that purpose, an innovative approach was developed [14]. it is based on measuring the suspended solids concentration using the cerlic multitracker and then postprocessing the data in matlab to visualize the concentrations in the sst. the handheld multitracker consists of a probe connected to the device through a several-meter-long cable. as the probe is submerged to the bottom of the tank, it continuously records the suspended solids concentration, creating a vertical concentration profile. this profile was measured at the sst at a radial distance of 3 meters from the tank's centre and then at each 2 m increment up to the outer wall of the sst, with the last profile being taken at a radial distance of 21 m. these data were then assembled in matlab to create a 2d concentration map (figure 1). it vividly displays the position of the sludge blanket and provides more insight into the state of the sludge in the tank during different events, such as rain flow. it serves as the main validation tool to compare the cfd model results with, as these distributions can be taken at any moment to validate different flow scenarios and conditions.
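the authors assemble the vertical profiles in matlab; below is a minimal python equivalent of that postprocessing step. the profile spacing follows the text, while the data values and the grid resolution are illustrative:

import numpy as np
import matplotlib.pyplot as plt

radii_m = np.arange(3, 22, 2)            # profiles at 3, 5, ..., 21 m from the centre
depth_m = np.linspace(0.0, 5.0, 50)      # sampled depths of each vertical profile

# conc[i, j]: suspended solids concentration [mg/l] at depth i, radius j
# (synthetic stand-in for the multitracker records)
conc = 50 + 4000 / (1 + np.exp(-(depth_m[:, None] - 3.0 - 0.05 * radii_m[None, :])))

r, d = np.meshgrid(radii_m, depth_m)
plt.contourf(r, d, conc, levels=20)
plt.gca().invert_yaxis()                 # depth grows downwards
plt.xlabel("radial distance [m]"); plt.ylabel("depth [m]")
plt.colorbar(label="suspended solids [mg/l]")
plt.title("2d concentration map of the sst")
plt.show()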
2.2. sludge properties database to consolidate the large amount of data obtained during the campaigns, a database system was developed. for that purpose, the commercial software microsoft access was used as a platform. developing the database system was both beneficial and necessary from several aspects: • the ability to sort and show all properties and values from a certain sample, • the possibility to easily compare samples and to find relations between properties, • the ability to categorize and to create sludge groups based on the settling properties. each sample underwent several different tests. the settling column test and the viscosity measuring were done in situ, whereas the microscopic test, sludge volume index, ecps and densities were analysed in the ctu lab. on top of that, external parameters such as sst inlet and outlet flows and concentrations, flocculant dosage and weather information needed to be included as well. for that purpose, each sampling was given an id through which all the tests can be connected together in the database. choosing a certain sampling id brings up all the parameters associated with that sampling in a well-arranged manner. plotting the properties from all samplings at once made it possible to identify wrong data and erroneous measurements and to exclude them from the database in order not to influence the relations. creating the database required an extensive amount of man-hours and was thus developed in cooperation with a small team, who went through the input data and cleared them of any corrupted measurements, wrong readings and other misplaced data. 2.3. cfd model development the framework on which the numerical model is built is the commercial cfd software ansys fluent. it is not the goal of this work to develop a new cfd code from scratch, but to extend the ability of a widely used cfd code to simulate the specific behaviour of sludge in ssts. that enables an easy deployment of the sludge settling model to any user with the ansys software package. the ansys software package provides all the necessary tools for creating geometry, meshing and post-processing, and it already implements a transient implicit solver, multi-phase models and common turbulence models. the cfd sedimentation model is implemented through the utilization of user defined functions (udfs) to handle the flocculation, sedimentation and rheology of the sludge. this model can be easily adjusted through parameters to respect different sludge types and behaviour, which extends its usage so that it can be applied to any settling tank beyond the experimental one. the developed numerical model consists of several sub-models, each handling a different part of the sludge behaviour: • the flocculation sub-model handles the initial phase of the settling process, the flocculation and particle breakup, • the sedimentation sub-model is responsible for the hindered-zone and compression-zone sedimentation, • the rheology sub-model is based on the non-newtonian characteristics of the sludge. 2.3.1. rheology the purpose of the rheology sub-model is to find a relation between the suspended solids concentration and the strain rate. in the cfd model, both the strain rate and the concentration are known, so a viscosity can be calculated and assigned accordingly to each cell. a total of 41 samples were analysed using the viscometer, which outputs the relation between strain rate and shear stress. using the well-known equation, the apparent viscosity was calculated: µm = τ / γ̇, (3)
the data showed a good correlation with the casson sludge type, which is described by the following equation: τ^(1/2) = τ0^(1/2) + η∞^(1/2) · γ̇^(1/2), (4) where τ0 is the casson yield stress that needs to be overcome at zero shear rate and η∞ is the casson plastic viscosity. these parameters differ for each sample based on the solids' concentration, so we can obtain the function from a regression analysis of the data. the data selected to extract the dependency are the curves with c = 4.9 g/l and c = 13.5 g/l, to capture both the low and high concentration profiles. from the regression of the data, the τ0 parameter shows a linear dependency on the solids concentration that can be described as: τ0 = 6.35 · 10−2 · c − 1.58 · 10−1. (5) also, η∞ can be described using a linear function: η∞ = 7.51 · 10−5 · c + 1.85 · 10−4. (6) eventually, we can create a viscosity function based on the solids concentration: τ^(1/2) = (6.35 · 10−2 · c − 1.58 · 10−1)^(1/2) + (7.51 · 10−5 · c + 1.85 · 10−4)^(1/2) · γ̇^(1/2). (7) in figure 2, the aforementioned function is plotted against the experimental data. one curve is constructed for c = 4.9 g/l with τ0 = 0.15 and η∞ = 5.5 · 10−4 to show the fit in the low solids concentration region, and one for c = 13.5 g/l with τ0 = 0.7 and η∞ = 1.2 · 10−3 to show the fit at high solids concentrations. only some of the sampling data are shown for better clarity. 2.3.2. sedimentation overall, 108 samples were measured using the settling columns. however, 20 of the samples did not create a sludge-water interface and were therefore omitted from the data; this is usually caused by a suspended solids concentration over 14 g/l. figure 2. fitted casson sludge type rheology model (shear stress vs. shear rate for several concentrations). figure 3. dependency of settling velocity on solids concentration. table 2. obtained coefficients for the takács–vesilind settling curve: v0 = 10.08 m/h, rh = 0.35 m3/kg, xmin = 0.008 kg/m3, rp = 3.5 m3/kg. the sedimentation sub-model originates from the well-known takács–vesilind model: vs = v0 · e^(−rh·(x−xmin)) − v0 · e^(−rp·(x−xmin)), (8) where v0 is the maximum settling velocity, xmin is the minimum solids concentration at which settling occurs, rh is a parameter describing the hindered zone and rp is a parameter characterizing the low-concentration settling. these parameters can be deduced from the batch column test data by linear regression, the same as the vesilind parameters. the settling velocity was plotted against the suspended solids concentration on a natural-log to linear scale. the slope of the line and the intercept of the curve give the rh and v0 coefficients, respectively, as shown in figure 3. from the regression, v0 = 10.08 m/h and rh = 0.35 m3/kg. the xmin parameter was measured using decantation and resulted in 8 · 10−3 kg/m3. the last parameter, rp, is generally considered to be one order of magnitude larger than rh, thus rp = 3.5 m3/kg. the summary of the coefficients is presented in table 2. it is apparent from the plot that the curve does not perfectly copy the shape of the source data. the settling velocity of the samples with a low suspended solids concentration of x < 2 g/l is undervalued, whereas the velocity of the samples with a higher concentration of x > 3 g/l is slightly overvalued.
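a compact sketch of the two fitted sub-models above, with the regression constants from equations (5)–(7) and table 2; python is used here purely for illustration, since in the actual model these relations live in ansys fluent udfs written in c:

import numpy as np

def casson_viscosity(c_g_l, gamma_dot):
    """apparent viscosity [pa.s] from eq. (7): casson model with
    concentration-dependent yield stress and plastic viscosity."""
    tau0 = 6.35e-2 * c_g_l - 1.58e-1         # eq. (5), casson yield stress [pa]
    eta_inf = 7.51e-5 * c_g_l + 1.85e-4      # eq. (6), casson plastic viscosity [pa.s]
    tau = (np.sqrt(tau0) + np.sqrt(eta_inf * gamma_dot)) ** 2   # eq. (4), squared
    return tau / gamma_dot                    # eq. (3)

def takacs_vesilind(x_g_l, v0=10.08, rh=0.35, rp=3.5, xmin=0.008):
    """settling velocity [m/h] from eq. (8) with the table 2 coefficients."""
    vs = v0 * np.exp(-rh * (x_g_l - xmin)) - v0 * np.exp(-rp * (x_g_l - xmin))
    return np.maximum(vs, 0.0)                # no negative settling below xmin

print(casson_viscosity(4.9, 10.0))   # low-concentration sludge at 10 1/s
print(takacs_vesilind(3.0))          # roughly 3.5 m/h in the hindered region

note that at c = 4.9 g/l the fitted constants reproduce the paper's own values τ0 ≈ 0.15 and η∞ ≈ 5.5 · 10−4; the misfit at low and high concentrations discussed in the text continues below.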
the main reason for this is the fact that the samples were taken over a long period of time (almost 2 years) and, although they all come from a single wwtp, the properties of the sludge, and especially its settleability, may vary depending on the actual conditions under which the samples were taken, thus creating a significant variance. this is very important to notice, as it is actually the stumbling block of sludge settling models – they are fitted on a very limited set of data, usually representing only one flow condition. it becomes apparent from figure 4 that a single averaged settling curve cannot enclose all the different sludge conditions and differentiate between well-settling and badly settling sludge relative to the suspended solids concentration. figure 4. envelope of the sludge settling curves (experimental data with the maximum, minimum and average settling curves). figure 5. relation between the sludge settling ability and svi (dry and rain samples). in order to be able to compensate the settling curves for different conditions without the need to rerun the expensive batch settling measurement every time, an envelope is created to mark the maximum and minimum boundaries. that produces two new sets of settling curves, as can be seen in figure 4. it is apparent from the range of the min and max curves that the settling velocity for the same sludge solids concentration may differ significantly. that corresponds to the fact that there are other factors with a strong influence on the settleability of the sludge. the settling curves are based on the zsv (zone settling velocity), which is considered to be a lumped parameter that inherently embeds sludge morphological, physical and chemical factors. given the fact that a sludge property database was created during the campaigns, it is possible to try to find other relations between the sludge settleability and other factors, such as svi, rain conditions, filament index, flocculant and coagulant dosage or retention time. after investigating these relations, it turned out that, in the gathered data, there is no statistical dependency for rain, flocculant dosage and coagulant dosage, and there is not enough data to assess the filament index. on the other hand, the svi shows a logarithmic correlation with the data: a low svi results in a better settling performance and vice versa, which corresponds to the general experience [15]. this correlation is valid for both dry and rain samples. now we can transform the y-axis into a v0 correction coefficient and add another parameter called the rh correction coefficient. these coefficients will serve as modifiers to the original vesilind–takács exponential function to adjust the settling curve, and we can rewrite the equation as follows: vs = 10.08 · v0c · e^(−0.35·rhc·(x−0.008)) − 10.08 · v0c · e^(−3.5·rhc·(x−0.008)). (9) the dependency of the coefficients can be seen in figure 6. the ultimate benefit of this modified equation is that we can now construct a custom settling curve based only on the suspended solids concentration, flow rate and svi for any sample.
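a minimal sketch of the corrected settling curve of equation (9); the correction coefficients for a given svi are read off the figure-6 relation, so here they are passed in directly, using the two pairs the paper itself reports (svi = 55 ml/g → v0c = 1.31, rhc = 1.18; svi = 270 ml/g → v0c = 0.90, rhc = 0.89):

import numpy as np

def corrected_settling_velocity(x_g_l, v0c, rhc):
    """eq. (9): takács–vesilind curve scaled by the svi-based
    correction coefficients v0c and rhc."""
    vs = (10.08 * v0c * np.exp(-0.35 * rhc * (x_g_l - 0.008))
          - 10.08 * v0c * np.exp(-3.5 * rhc * (x_g_l - 0.008)))
    return np.maximum(vs, 0.0)

x = 3.0  # suspended solids concentration [g/l]
print(corrected_settling_velocity(x, v0c=1.31, rhc=1.18))  # well-settling sludge (svi 55)
print(corrected_settling_velocity(x, v0c=0.90, rhc=0.89))  # poorly settling sludge (svi 270)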
in practice, this means that the numerous laboratory batch settling tests otherwise required every time we want to run a numerical model simulation can now be completely avoided. that significantly simplifies the preparation work for running a cfd simulation of the sst and, more importantly, it expands the usage of the cfd settling model beyond the wwtp for which the batch tests were made. the aforementioned models were converted into c language code and were implemented into the ansys fluent cfd solver using user defined functions. the simulation was run in 3d, using 1/4 of the tank and periodicity. figure 6. v0c and rhc correction coefficient dependency on svi. figure 7. results of the sludge-water interface evolution (experiment vs. cfd model). 3. results and discussion the first validation was done using the recorded batch settling column test, where the water-sludge interface is of interest. the column has a height of 3 m and a diameter of 0.3 m. the initial suspended solids concentration was 5.7 g/l. as can be seen from figure 7, the evolution of the interface is similar between the cfd and the experiment. the 3d tank validation was done at settling tank dn3, located at the prague wwtp within the old treatment plant. the radius of the tank is 21 m and the depths are 5 m at the sludge removal pit and 2.1 m at the outer rim. the influent is a pipe located traditionally in the centre area. the inlet zone is bounded by 8 pillars supporting metal plates. the outlet area is located 17 m from the centre and consists of two circular weirs. the geometry of the tank has a cylindrical periodicity and therefore only 1/4 of the geometry was modelled. the inlet is considered to be a mass flow inlet, and atmospheric pressure is set up at the outlet. sludge removal is modelled as a mass flow outlet. the side walls of the model are modelled as periodic to capture the symmetry. the top boundary, which represents the water-air interface, is modelled as a symmetry boundary condition – it ensures a non-zero velocity at the boundary. for the tank dn3, a dry-conditions scenario was simulated and compared to the experimental measurements. the flow rate of 0.635 m3/s represents the standard flow at the tank during normal conditions and was measured on 16th june 2016. the svi at the tank inlet was 55 ml/g, which corresponds to v0c = 1.31 and rhc = 1.18. the comparison of the suspended solids concentration between the cfd and the experiment can be seen in figure 8. it is clear that the cfd model shows a good match with the experiment. right after the inlet zone, there is a rising sludge eddy, which is well captured by the model. also, the sludge blanket height matches the experiment. another validation was done on tank dn1, which has a different inlet zone. the rain flow from 16th april 2018 was chosen; it is represented by a 0.87 m3/s flow rate and a concentration of suspended solids c = 3.3 g/l. the svi was measured at 270 ml/g, which corresponds to the correction coefficients v0c = 0.9 and rhc = 0.89. in the results of the rain event, the experiment shows an area of an increased blanket height after the inlet zone. the same phenomenon can be seen in the cfd results, even though the peak is more apparent.
also, the overall sludge blanket height matches well between the cfd and the experimental data. the concentration of suspended solids is slightly overpredicted by the model, which might be caused by the fact that, during rain events, the sludge properties vary quickly and the svi measured at that moment might not have corresponded to the svi of the sludge already in the tank, because of the retention time. that leads to the question of when to measure the svi during rain events to realistically capture the tank average; more effort should be put into this matter. figure 8. comparison of the sludge blanket in dn3 between the cfd model (left) and the experiment (right). figure 9. comparison of the sludge blanket in dn1 between the cfd model (left) and the experiment (right). the aim of this paper was an attempt to create a cfd model for secondary settling tanks which would be calibrated on data obtained through a measuring campaign, but also to try to generalize the model enough so that it would be possible to use it on different tanks with a different type of sludge. based on the database data processing, a new coefficient was used to extend the takács settling curve in order to compensate for better or worse settling sludges based on their svi. that way, it is possible to adjust the settling model based on an inlet flow rate, suspended solids concentration and svi. further work should aim at testing the model against different settling tanks and comparing its performance. also, additional work should be done to improve the compression settling model, which might lead to a more accurate suspended solids distribution at the tank bottom. acknowledgements the published results were achieved under the grant tačr te02000077 smart regions – buildings and settlements information modelling, technology and infrastructure for sustainable development. references [1] g. a. ekama. secondary settling tanks: theory, modelling, design and operation. international association on water quality, london, 1997. [2] r. w. samstag, j. j. ducoste, a. griborio, et al. cfd for wastewater treatment: an overview. water science and technology 74(3):549–563, 2016. https://doi.org/10.2166/wst.2016.249. [3] p. roache. computational fluid dynamics. hermosa publishers, albuquerque, usa, 1982. [4] s. patankar. numerical heat transfer and fluid flow. hemisphere publishing corporation, taylor & francis, new york, usa, 1980. [5] s. zhou, j. a. mccorquodale. mathematical modelling of a circular clarifier. canadian journal of civil engineering 19(3):365–374, 1992. https://doi.org/10.1139/l92-044. [6] i. takács, g. patry, d. nolasco. a dynamic model of the clarification-thickening process. water research 25(10):1263–1271, 1991. https://doi.org/10.1016/0043-1354(91)90066-y. [7] a. griborio. secondary clarifier modeling: a multi-process approach. ph.d. thesis, university of new orleans, new orleans, louisiana, 2004. [8] a. griborio, j. mccorquodale. optimum design of your center well: use of a cfd model to understand the balance between flocculation and improved hydrodynamics. proceedings of the water environment federation 2006(13):263–280, 2006. https://doi.org/10.2175/193864706783710587. [9] b. de clercq. computational fluid dynamics of settling tanks. ph.d. thesis, department of applied mathematics, biometrics and process control, ghent university, ghent, belgium, 2003. [10] d. morse, j. sickza, k. nielsen. extending the mccorquodale model for 3d cfd of settling tanks.
in water environment federation's technical exhibition and conference (weftec), pp. 3673–3690. water environment federation, new orleans, usa, 2016. https://doi.org/10.2175/193864716819707544. [11] e. ramin, d. s. wágner, l. yde, et al. a new settling velocity model to describe secondary sedimentation. water research 66:447–458, 2014. https://doi.org/10.1016/j.watres.2014.08.034. [12] a. wimshurst, d. burt, s. jarvis. enhanced process models for final settlement tanks. in 13th european waste water management conference. birmingham, 2019. [13] m. horáková, v. janda, j. koller, et al. analytika vody [water analytics, in czech]. všcht, praha, 2nd edn., 2005. [14] o. švanda, j. pollert, i. johanidesová. development of screening methods for secondary settling tanks monitoring and optimization. in new trends in urban drainage modelling, pp. 242–245. springer international publishing, 2018. https://doi.org/10.1007/978-3-319-99867-1_40. [15] b. jin, b.-m. wilén, p. lant. a comprehensive insight into floc characteristics and their impact on compressibility and settleability of activated sludge. chemical engineering journal 95(1-3):221–234, 2003. https://doi.org/10.1016/s1385-8947(03)00108-6. acta polytechnica vol. 46 no. 1/2006 company value, real options and financial leverage o. drahovzal this paper deals with determining the value of companies and with financial leverage. the author tries to find the optimum debt ratio for selected companies in the czech republic. the method of yield option extension is used for evaluating a company. the dfcf method was selected as the yield method, due to its simplicity. the dynamic model used allows us to make changes in the debt ratio, with recalculations of all parameters that depend on it. the assessment is made from two points of view: firstly, the maximum of the total amount of financial resources, and, secondly, the maximum of the inverse sums of the roe index and the ratio of equity to the value of the company. the values of the total debt ratio and the long-term debt ratio are shown as results. keywords: evaluation, real options, financial leverage. 1 company value the value of a company whose stocks are traded can be determined from its present market value. when there is no trading, plenty of methods exist, which can be split into six groups: balance sheet, income statement, mixed methods (goodwill), cash flow discounting, value creation, and options. a method that assesses the value of the assets with a forecast of the future incomes of the company, recalculating them to the present value, was chosen for the evaluation below. 1.1 evaluation by the dfcf method in the discounted free cash flow method, the value of a company is defined as the sum of the discounted free cash flows for the next five years, and the perpetuity of the free cash flow in the sixth year. free cash flow is given by the following equation: fcf = ebit · (1 − t) + ad − inv, (1) where ebit is the earnings before interest and taxes, t is the tax rate, ad is the amount of amortization and depreciation, and inv is the value of the investments. the value of the company is calculated with the following formula, which uses a six-year prediction. five years are estimated separately and the sixth is used for the perpetuity calculation. this structure was used because of the difficulties in making predictions over a longer period of time.
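a minimal sketch of this dfcf valuation; it uses the present-value and wacc formulas given in the next paragraph, and all the numbers are purely illustrative:

def wacc(re, rd, e, d, t):
    """weighted average cost of capital, eq. (3); the debt cost carries the tax shield."""
    return re * e / (d + e) + rd * (1 - t) * d / (d + e)

def dfcf_value(fcf, g, w):
    """eq. (2): five discounted yearly fcfs plus the perpetuity of the 6th-year fcf.
    fcf is a list of six yearly free cash flows, g the expected growth rate."""
    pv_years = sum(fcf[i] / (1 + w) ** (i + 1) for i in range(5))
    perpetuity = fcf[5] / ((w - g) * (1 + w) ** 5)
    return pv_years + perpetuity

# illustrative numbers (mil. czk); each fcf = ebit*(1-t) + ad - inv per eq. (1)
fcf = [120, 130, 135, 140, 150, 155]
w = wacc(re=0.12, rd=0.06, e=800, d=400, t=0.24)
print(dfcf_value(fcf, g=0.02, w=w))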
the formula for the company value calculation is: v = Σ (t = 1..5) fcf_t / (1 + wacc)^t + fcf_6 / ((wacc − g) · (1 + wacc)^5), (2) where fcf_t is the free cash flow for the given year, g is the expected growth rate of future cash flows, and wacc is the weighted average cost of capital: wacc = r_e · e / (d + e) + r_d · (1 − t) · d / (d + e), (3) where r_e, r_d are the relative costs of equity and debt, e, d are the amounts of equity and debt, and t is the tax rate – the cost of the debt is lowered by the tax shield. 1.2 evaluation by real options this method of evaluation uses the option extension of the yield method. the value of a company can be assessed as a call option on the company assets with the strike price equal to the value of the debt. the black–scholes formula for the call option is: c = s · n(d1) − x · e^(−rt) · n(d2), (4) d1 = (ln(s/x) + (r + σ²/2) · t) / (σ · √t), (5) d2 = d1 − σ · √t, (6) where s is the current price of the asset, x is the strike price of the asset, r is the risk-free interest rate, t is the time to expiration of the option, n() is the standard normal cumulative distribution function and σ is the volatility of the price of the asset. the formula for calculating the value of the company then has the following shape: v = a · n(d1) − d · e^(−rt) · n(d2), (7) where a is the value of the company's assets (determined by the dfcf method), d is the value of the debt, and t is the time to the debt's maturity. the other parameters have the same meaning as in the standard call option. the value of the volatility σ can be determined directly from the company's stock history, or it can be specified by the following formula: σ = ((1 − p_d)² · σ_a² + p_d² · σ_0² + 2 · p_d · (1 − p_d) · ρ · σ_a · σ_0)^(1/2), (8) where σ_a, σ_0 are the standard deviations of the prices of shares and obligations, ρ is the correlation coefficient between them, and p_d is the ratio of debt to total capital. option values calculated by the binomial and trinomial tree methods were also used to form a basis for comparison. all three results are very similar. the only slight difference is when the option value approaches zero: the value calculated by the binomial tree is the fastest method in converging to zero, and the black–scholes formula is the slowest. 2 financial leverage as the company debt increases, the equity decreases, as it is replaced by the debt (the same total amount of liabilities and equity is assumed). it is generally true that outside capital is less expensive than own capital. so, with the presumption of maintaining the same profit as the company had before the change in its capital structure, the cost-effectiveness of the equity (the profit divided by the equity) logically increases.
if the original cost-effectiveness is high enough for the company owner, the profit can be lowered by reducing prices, which can result in increased competitiveness in the market. the debt volume cannot, of course, be increased to infinity. 3 calculation model because the parameters used for company valuation by the dfcf method vary during changes of the debt value, a dynamic model had to be used. it recalculates all values – loan interest, wacc and even the values of the predicted profits within the years (interest and tax shields influence the value of the profit). the final value of the company and the value of the debt are used as input for the real option. the amount of equity and the value of the roe index (profitability of the equity) are recalculated, also depending on the debt ratio. for completeness, the long-term debt ratio is evaluated over the range from 0 % to 100 %. the total amount of resources is the sum of the value of the option and the opportunity income of the capital not embedded in the equity of the company. 4 solution and results the following companies were selected for evaluation: pražská energetika, a.s. (pre), skupina čez, a.s. (cez), pražská plynárenská, a.s. (ppas), pražská teplárenská, a.s. (ptas), severočeské doly, a.s. (sdas), and české aerolinie, a.s. (csa). in the first step, for each company, the amount of own capital costs, the actual level of the debt ratio, interest and other entries were found in public annual reports downloaded from the internet. in the second step, the volatility of the assets for each company in the list was calculated from the four-year daily history of stock prices. in the third step, the free cash flows for the next six years were roughly predicted by estimating the profit, the amortization and depreciation, and the investment value. the estimates were made by processing the six-year history. regression functions (polynomial, logarithmic or exponential) or coefficients and ratios (e.g. the ratio of investments and depreciations) resulted from the estimates and were used for the prediction. because of the simplification, the expected growth rate of the future cash flow g was estimated at the same value for each company. for determining the optimum debt ratio, two evaluation methods were used. the first uses the maximum of the total amount of financial resources, while the second uses the maximum of the inverse sum of the roe index and the ratio of equity to the company's value; this index evaluates the debt ratio also from the point of view of the profitability of the equity. the formula for the optimum debt ratio from the view of the total value of resources is: d_v = max over dr ∈ ⟨0; 1⟩ of {v_dr + or_dr}, (9) where dr is the debt ratio, v is the option value and or is the opportunity revenue. the formula for the optimum debt ratio for profitability is: d_e = max over dr ∈ ⟨0; 1⟩ of {(e_dr / v_dr + 1 / roe_dr)^(−1)}, (10) where dr is the debt ratio, e is the value of the equity, v is the value of the option and roe is the profitability of the equity.
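a minimal sketch of the valuation core used by such a model: the black–scholes company value of equations (4)–(7) and a grid search over the debt ratio in the spirit of equation (9). the opportunity-revenue term is reduced to a flat rate on the capital freed from equity, which is our own simplifying assumption, not the paper's exact definition:

from math import log, sqrt, exp, erf

def norm_cdf(x):
    """standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def company_option_value(a, d, r, sigma, t):
    """eq. (7): the company equity as a call option on assets a with strike d."""
    d1 = (log(a / d) + (r + sigma**2 / 2) * t) / (sigma * sqrt(t))   # eq. (5)
    d2 = d1 - sigma * sqrt(t)                                        # eq. (6)
    return a * norm_cdf(d1) - d * exp(-r * t) * norm_cdf(d2)

def optimum_debt_ratio(a, r, sigma, t, opp_rate=0.04, steps=100):
    """grid search over dr in <0;1>, eq. (9): v_dr + or_dr -> max."""
    best = max(
        (company_option_value(a, max(dr * a, 1e-9), r, sigma, t)
         + opp_rate * dr * a,                # crude opportunity income of freed capital
         dr)
        for dr in (i / steps for i in range(steps + 1))
    )
    return best[1]

print(optimum_debt_ratio(a=1000.0, r=0.03, sigma=0.25, t=5.0))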
fig. 1: structure of the calculation model. the long-term debt ratio (the ratio of the long-term loans to the sum of the long-term loans and the equity) is presented as the main result. the total debt ratio is shown in parentheses (the ratio of the outside capital to the total amount of capital). 5 conclusion the calculations show that the energy companies in the czech republic, with a few exceptions, are using a very low level of outside capital. the managers are thus neither lowering their profitability indices (e.g. by reducing prices to gain competitiveness), nor increasing the yield from the equity to the level which could be achieved with a better capital structure in the companies. table 1: results of calculations – long-term debt ratio, with the total debt ratio in parentheses. pre: actual 0 % (27 %), optimum (v) 36 % (49 %), optimum (e) 44 % (54 %); cez: actual 6 % (36 %), optimum (v) 32 % (53 %), optimum (e) 40 % (59 %); ppas: actual 0 % (57 %), optimum (v) 34 % (71 %), optimum (e) 45 % (76 %); ptas: actual 33 % (40 %), optimum (v) 34 % (41 %), optimum (e) 47 % (53 %); sdas: actual 0 % (26 %), optimum (v) 32 % (50 %), optimum (e) 37 % (53 %); csa: actual 34 % (57 %), optimum (v) 35 % (57 %), optimum (e) 56 % (63 %). fig. 2: sample of the resultant graph for company pre. acknowledgments this research was financially supported by research project msm6840770017 of the czech ministry of education, youth and sports. references: [1] scholleová, h.: "real options." dissertation thesis, prague, 2004. [2] sedláček, j.: accounting data in manager's hand. prague: computer press 2001. [3] fernández, p.: valuation methods and shareholder value creation. san diego: academic press 2002. [4] sůvová, h.: financial analysis. prague: bank institut 2000. [5] annual reports and stock price histories of the investigated companies. ing. ota drahovzal, drahovo@fel.cvut.cz, department of economics, management and humanities, czech technical university in prague, faculty of electrical engineering, technická 2, 166 27 prague 6, czech republic. acta polytechnica vol. 48 no. 3/2008 analysis of the fuel efficiency of a hybrid electric drive with an electric power splitter d. čundev this paper presents the results of an analysis of the fuel efficiency of a hybrid electric car drive, with an electric power splitter based on a double rotor synchronous permanent magnet generator. the results have been obtained through a precisely determined mathematical model and by simulating the characteristics of all essential values for the entire drive. this work is related to the experimental working stand for electric and hybrid car drive research, which has been developed at the faculty of electrical engineering (fee) at ctu in prague. keywords: hybrid drive, fuel efficiency, european driving schedule, electric power splitter, electric traction motor, super-capacitor. 1 introduction improving the fuel efficiency of passenger vehicles is the ultimate goal of a large scientific and technical community. the main purpose of a hybrid electric drive is to achieve a higher efficiency in the energy transmission from the internal combustion engine (ice) to the traction wheels of the vehicle. the hybrid electric vehicle (hev) is a transitional solution from standard internal-combustion-engine-driven cars to all-electric vehicles (ev). some car manufacturing companies already have successful commercial models, but all are based on significantly different technological solutions. in the josef bozek research center for engine and automotive engineering (rcjb) at ctu in prague, there is an ongoing project to develop a hybrid electric drive. for this purpose, an experimental working stand has been set up at the department of electric drives and traction at the faculty of electrical engineering (fee). the main innovative features of this hybrid drive are the use of a super-capacitor (sc) as an accumulation unit, and an electric power splitter (eps). commercial hybrid electric cars use a planetary gear for splitting the energy from the ice and a separate electrical generator for the electrical power supply of the traction motor and for charging the battery. in the hybrid electric system developed at ctu in prague, the power splitting is performed entirely electrically, with the use of the eps. also, instead of a chemical battery for accumulating the braking kinetic energy, a super-capacitor is used in this stand as a new technological element for electrical energy storage.
this enables energy to be stored without transformation from electrical to chemical and back, which leads to a higher efficiency of the energy-form transformation. 2 hev with electric power splitter fig. 1: schematic representation of the hev. a schematic representation of the working concept of the hev with an eps and a super-capacitor is shown in fig. 1. the internal combustion engine is the main and only power source of the vehicle and produces the mechanical power pice. the eps is a special type of synchronous generator with two rotating parts (a classic permanent magnet rotor and a rotating stator). the rotor is firmly coupled to the drive-shaft of the ice, and the stator of the eps is firmly coupled to the transmission that leads to the car wheels and rotates at a speed proportional to the velocity of the vehicle (speed v). this technical solution enables the ice to operate at the optimal revolutions during the entire driving schedule. the mechanical power pice is divided into the electrical power pepsel and the mechanical power pepsmh. the induction traction motor (tm) is inserted on the shaft of the eps rotating stator and is the main electric propulsion of the vehicle. the eps and the tm are electrically connected through two traction ac/dc and dc/ac power converters, with an intermediate dc link. the sc is connected to the dc link via a charging and discharging dc-dc converter. the tm is powered by pel, which is generated in the eps (pepsel) and supplemented by additional power from the sc (psc): pel = pepsel + psc. (1) the traction motor tm produces the mechanical power ptm which, together with the mechanical power pepsmh added from the eps, is transmitted to the car wheels. the total power pcar propelling the car is expressed by the formula: pcar = pepsmh + ptm. (2)
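a minimal sketch of this power bookkeeping, equations (1) and (2); converter and machine losses are ignored here, which is a simplification of ours (the paper's model tracks the individual efficiencies later on), and the split ratio is an assumed input:

def traction_power(p_ice, split_ratio, p_sc):
    """split the ice power electrically/mechanically and add the sc contribution.
    split_ratio is the electrical share of p_ice taken by the eps (assumed input)."""
    p_epsel = split_ratio * p_ice        # electrical branch of the eps
    p_epsmh = (1 - split_ratio) * p_ice  # mechanical branch of the eps
    p_el = p_epsel + p_sc                # eq. (1): power feeding the traction motor
    p_tm = p_el                          # lossless tm assumed in this sketch
    return p_epsmh + p_tm                # eq. (2): total power propelling the car

print(traction_power(p_ice=25e3, split_ratio=0.4, p_sc=5e3))  # 30 kw to the wheels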
3), five semiconductor power control units, a super-capacitor and all the necessary instrumentation, control, power supply and protection equipment. 4 vehicle fuel efficiency the fuel efficiency of passenger vehicles is defined as the amount of fuel required (lcar [liter]) for a driven car trajectory distance (scar [km]). in europe, this value is measured according to a predetermined working regime defined as the european driving schedule (eds). eds consists of a 1200 sec. driving schedule, which combines 800 seconds of urban and 400 seconds of highway driving (fig. 4). the urban driving consists of four repeated cycles, each with three different acceleration, idle and deceleration regimes. highway driving is continuous driving with predetermined changes in speed v. this is the standardized european driving schedule for calculating the consumption and vehicle efficiency. the vehicle drives the first 800 sec. (urban driving), and the total driven distance is measured or calculated (scar). this value is divided by the amount of fuel consumed (lcar) and the consumption qurb of the vehicle for urban driving is calculated: q s lcar car car � (3) the same approach is used for calculating the consumption qhgw for highway driving (eds cycle from 800 to 1200 sec.). to calculate the combined consumption qcom (urban and highway) for the entire eds (1200 sec.), car trajectory scar and fuel consumed lcar are calculated. 5 simulation of the eds a simulation has been made by means of the matlab programming interface. the main approach in this simulation is to determine the energy fluctuations in the hybrid drive during the driving schedule. for this purpose, the function and behavior of each component of the system is determined and taken into account, e.g., aerodynamic resistance, rolling resistance between the tires and the road surface, density of the ambient air, cross-sectional area of the vehicle, coefficient of drag, etc. the kinematic model has been mathematically created for the predetermined car specifications, e.g., car weight, efficiency of transmission, ice fuel consumption and output power, number of accumulative units – sc, and the efficiency of the electric power converters. for example, for each time unit, (�tn � 10 �3 [sec]) of the drive it has been calculated the acceleration an, car trajectory distance sn, needed acceleration force fa, and corresponding energy wn. by means of these calculated values, acceleration power pa can be calculated for each subinterval �tn. the aerodynamic and rolling resistance between the tires and the road surface has been compensated by calculating the additional power pv, which depends on speed v(t). the total power that a hybrid vehicle needs to provide is the sum of acceleration power pa and speed power pv: p p pa vcar � � (4) the program calculates all these values according to eds. the data is presented in characteristics which are functions of time t. the power on the drive-shaft of the internal combus© czech technical university publishing house http://ctn.cvut.cz/ap/ 27 acta polytechnica vol. 48 no. 3/2008 fig. 2: experimental working stand for hev fig. 3: electric machines of the laboratory working stand fig. 4. european driving schedule eds tion engine pice, follows the required driving power pcar, which is shown in fig. 5. this is the main advantage of a hybrid electric drive that can not be performed in normal cars, where the ice must instantly and rapidly change the working regime according to the actual power demand. 
such rapid regime changes lead to a high fuel consumption and a low efficiency. in an hev, the ice works only when the working regime demands significant power, which is not the case with conventional cars, where the ice works all the time, even when the car is not moving, e.g., in normal urban driving. the ice provides power for driving the vehicle and also for charging the sc and keeping usc above the critical minimal level uscmin. therefore, pice depends on the actual energy volume wsc accumulated in the super-capacitor, which is determined from the actual sc voltage usc: wsc = c · usc² / 2. (5) 6 simulation results the simulation results are shown in graphs representing the calculated values as functions of the time t during the entire driving regime of the eds. fig. 5 shows the output power pice of the internal combustion engine. fig. 6 presents the actual voltage of the sc. according to equation (5), by measuring usc, precise data on the energy accumulated in the super-capacitor can be obtained during the entire working regime. the working regime is determined in such a way that usc is maintained between the two critical values uscmin and uscmax. this is important, because the sc must not be overcharged (usc < uscmax) and it must be kept over the minimal level (usc > uscmin) to retain an energy reserve. a low voltage of the sc is undesirable, because the current level (isc) of the sc is increased and, with that, the efficiency of the sc rapidly decreases. the efficiency (fig. 8) of the recuperative circuits of the hev (the dc-dc power control unit and the sc) decreases with the current flow in those circuits. when simulating a power drive system, it is essential to precisely calculate the efficiency of each component where there is an energy transformation. we have calculated the efficiency of the ice (the fuel consumption according to the power demand pice and the revolutions of the drive-shaft), the efficiency of the eps (the electrical pepsel and mechanical pepsmh power transformation), the efficiency of the tm, and the efficiency of the transmission and the sc (fig. 8). fig. 5: pcar and the power produced on the drive shaft of the ice. fig. 6: voltage usc as a function of time t. fig. 7: current isc as a function of time t. fig. 8: efficiency of the transmission and the sc. the fuel consumption is measured in each time interval. knowing the total fuel consumed at the end of the drive regime (in the eds, t = 800 s for the urban regime and t = 1200 s for the combined drive regime) and the total distance driven, the consumption of the hybrid-electric drive can be calculated. for urban driving, the consumption is 25.6 km/l; for the highway, 21.4 km/l; and for combined driving it is 23.2 km/l. table 1 compares these values with those for standard-drive cars with similar characteristics.
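a minimal sketch of the super-capacitor energy bookkeeping of equation (5); the capacitance and the voltage window are illustrative values of ours, not the stand's actual parameters:

def sc_energy(c_f, u_v):
    """eq. (5): energy stored in the super-capacitor, w = c*u^2/2 [j]."""
    return 0.5 * c_f * u_v**2

C = 63.0                      # assumed sc capacitance [f]
u_min, u_max = 200.0, 400.0   # assumed critical voltage window [v]

usable_j = sc_energy(C, u_max) - sc_energy(C, u_min)
print(f"usable energy window: {usable_j / 3.6e6:.3f} kwh")

def soc_voltage(u_v):
    """share of the usable window still available at voltage u_v."""
    return (sc_energy(C, u_v) - sc_energy(C, u_min)) / usable_j

print(f"at 320 v: {soc_voltage(320.0):.0%} of the window left")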
the final numerical results for fuel consumption show the significance of this new technological approach. enabling vehicles to drive a greater distance using the same amount of fuel is the main task in increasing the efficiency of passenger cars. 8 acknowledgement the research described in this paper was supervised by doc. ing. pavel mindl, csc, department of electrical drives and traction, fee ctu in prague. it was supported by josef božek center of engine and automotive engineering. the project has been supported by the czech ministry of education grant. references [1] čundev, d., mindl, p.: driving characteristics for hybrid electric drive with super-capacitor as energy storage unit. progress in electromagnetics research symposium piers2007, prague, 2007. [2] čeřovský, z., mindl, p.: hybrid drive with super-capacitor energy storage, fisita conference, barcelona, f193m, 2004. [3] čeřovský, z., mindl, p., flígl, v, halámka, z., hanuš, p., pavelka, v.: power electronics in automotive hybrid drives, epe-pemc2002, dubrovnik, croatia conference, september 2002. [4] čeřovský, z., mindl, p.: double rotor synchronous generator used as power splitting device in hybrid vehicles, 31st fisita world automotive congress, yokohama 2006. ing. dobri čundev e-mail: cunded1@fel.cvut.cz department of electric drives and traction czech technical university in prague faculty of electrical engineering technická 2 166 27 prague, czech republic © czech technical university publishing house http://ctn.cvut.cz/ap/ 29 acta polytechnica vol. 48 no. 3/2008 [km/l] urban highway combined hybrid-electric vehicle 25.6 21.4 23.2 standard ice vehicle 13.7 19.2 17.2 table 1: fuel consumption acta polytechnica https://doi.org/10.14311/ap.2022.62.0337 acta polytechnica 62(3):337–340, 2022 © 2022 the author(s). licensed under a cc-by 4.0 licence published by the czech technical university in prague measurement of solid particle emissions from oxy-fuel combustion of biomass in fluidized bed ondřej červený∗, pavel vybíral, jiří hemerka, luděk mareš czech technical university in prague, faculty of mechanical engineering, department of environmental engineering, technická 4, 166 07 prague, czech republic ∗ corresponding author: ondrej.cerveny@fs.cvut.cz abstract. the presented work summarizes the results of the measurement of solid particle emissions from an experimental 30 kw combustion unit. this unit has been used for the research of oxy-fuel combustion of biomass in a fluidized bed, which, when accompanied with carbon capture technologies, is one of the promising ways for decreasing the amount of co2 in the atmosphere. to implement a carbon capture system, it is first needed to separate various impurities from the flue gas. therefore, the goal of this work was to identify solid particles contained in the flue gas. part of the apparatus used for this measurement was an ejector dilutor. to evaluate the results of this measurement, it is key to know the dilution ratio of the dilutor. for this purpose, a method for determining the dilution ratio of an ejector dilutor was proposed and experimentally verified. keywords: oxy-fuel biomass combustion, solid particle emission, ejector dilutor, dilution ratio. 1. introduction as global warming is becoming an increasingly pressing issue, people are starting to look for new ways of decreasing carbon dioxide emissions. according to epa [1], co2 is widely considered the most important anthropogenic greenhouse gas. 
still, coal is one of the main energy sources and it contributes approximately 40 % of the total energy production [2]. replacing or cofiring coal with biomass fuels and the utilization of carbon capture systems is one of the ways to reduce co2 emissions. we recognize two carbon capture systems: carbon capture and storage (ccs) and carbon capture and use/utilization (ccu). by combusting biomass with a ccs or ccu system implemented, we can achieve negative or neutral co2 emissions, respectively [3]. the goal of this work was to identify the solid particle emissions from the process of oxy-fuel biomass combustion in a fluidized bed. as a part of the research center for low-carbon energy technologies project, this measurement was performed to get data for an efficient solid particle separation. the measurement took place on an experimental 30 kw combustion unit, which is shown in figure 1. figure 1. scheme of the 30 kwth bfb experimental facility: 1) fluidized bed region, 2) distributor of the fluidizing gas, 3) gas burner mount, 4) fluidized bed spill way, 5) fuel feeder, 6) cyclone separator, 7) flue gas fan, 8) flue gas vent, 9) and 10) water coolers, 11) condensate drain, 12) primary fan, 13) air-suck pipe, 14) vessels with oxygen [4]. the fuel used for the experiment was spruce wooden pellets with a 6 mm diameter. a part of the apparatus is an ejector dilutor. to get correct data from such a measurement, it is necessary to know the actual dilution ratio of the dilutor. although this ratio is provided by the manufacturer of the dilutor (in our case dekati®), the real dilution ratio may differ. as stated in [5], the dilution ratio is influenced by the composition of the gas. thus, when the nominal values of the dilution ratio are obtained using air, it is clear that these values will not be appropriate when diluting flue gas, as in our case. the composition of the gas may change the dilution ratio by 20 %, which would cause a quite significant error in the results. specifically, the composition of the flue gas in our case was ca. 90 % co2, 6 % o2 and some minor gases in the dry flue gas. the wet flue gas had around 40 % vol. water vapour content. this composition can be calculated using equations from [6], where the authors described the oxy-fuel combustion specifics. the secondary goal of this work was, therefore, to set up and verify a method for obtaining a more accurate dilution ratio for gases different from the calibration gas. figure 2. operating principle of the dekati® dilutor [7]. figure 3. scheme of the experimental set-up of the solid particle emission measurement. the operating principle of the ejector dilutor is shown in figure 2. pressurized dilution air enters the dilutor and is accelerated in the orifice ring, which causes a vacuum in front of the sample nozzle, and the sample gas is sucked in. these two gases are mixed and then leave the dilutor. 2. method 2.1. solid particle emission measurement to determine the properties of the solid particles contained in the flue gas, we used gravimetric analysis. the apparatus used was a three-stage impactor with an ejector dilutor, both from the dekati® company. the scheme of the experimental set-up can be seen in figure 3. the impactor classifies the solid particles into three fractions: pm10, pm2,5 and pm1. the use of the dilutor provides a longer running time for each measurement.
this is desirable, because the impaction plates of the impactor have a limited mass capacity, about 1 mg each. additionally, it helps to avoid the condensation of water vapour, which is useful here, because the flue gas from this combustion technology is specific for its high water vapour content (around 40 % vol.). the dilutor was heated up to the temperature of the flue gas in the vent (200 °c). the dilution air had an ambient temperature and was dried before being fed into the dilutor.

2.2. correction of the dilution ratio for a gas of general composition
the following method is based on the theory of gas flow through a nozzle. to use it, we have to know the nominal dilution ratio (obtained with air) and the composition of the gas we are diluting. the method consists in comparing the air flow rate and the desired gas flow rate through the entry nozzle. the procedure is as follows: on the basis of the theoretically calculated air and gas flow rates, the air flow rate measured during the calibration is corrected to correspond to the case when the different gas enters the nozzle under the same conditions (temperature and pressure). to calculate the mass flow rate through the nozzle, we use the saint-venant–wantzel equation [8] in the following form:
\[
\dot{m}_s = K_s\,\rho_{s,1}\,A_{s,2}\left(\frac{p_{s,2}}{p_{s,1}}\right)^{\frac{1}{\kappa_s}}\left[\frac{2\kappa_s}{\kappa_s-1}\,\frac{p_{s,1}}{\rho_{s,1}}\left(1-\left(\frac{p_{s,2}}{p_{s,1}}\right)^{\frac{\kappa_s-1}{\kappa_s}}\right)\right]^{\frac{1}{2}} \tag{1}
\]
index 1 in equation (1) indicates the point of entry into the dilutor, index 2 the narrowest point of the nozzle – its outlet. \(\rho_s\) is the density, \(A_s\) the flow cross-section, \(p_s\) the static pressure, \(\kappa_s\) the poisson constant, and \(K_s\) a coefficient that takes into account the internal friction and the contraction of the flow. it follows from equation (1) that the flow rate depends on the composition, here expressed by the density and the poisson constant, and on the state of the gas. the mass flow rates obtained using equation (1) are first converted to volume flow rates, and then their ratio is expressed:
\[
VR = \frac{\dot{V}_{air,teor}}{\dot{V}_{gas,teor}}. \tag{2}
\]
this calculated \(VR\) ratio is then used to convert the measured air flow rate to the desired gas flow rate:
\[
\dot{V}_{gas} = \frac{\dot{V}_{air}}{VR}. \tag{3}
\]
now, with the knowledge of the gas flow \(\dot{V}_{gas}\) and the dilution air flow \(\dot{V}_{da}\), it is possible to determine the dilution ratio as given by equation (4):
\[
n_{gas} = \frac{\dot{V}_{da} + \dot{V}_{gas}}{\dot{V}_{gas}}. \tag{4}
\]
by adjusting equations (3) and (4), an equation for a direct conversion between the dilution ratio with air \(n_{air}\) (nominal) and with gas \(n_{gas}\) can be obtained:
\[
n_{gas} = (n_{air} - 1)\,VR + 1. \tag{5}
\]

2.2.1. experimental verification
to verify the method described in the previous section, an experimental measurement was performed using carbon dioxide as the diluted gas. the dekati® dilutor shown in figure 2 was used. the overpressure of the dilution air was set to the nominal 2 × 10^5 pa. the flow rate and pressure were measured at the sample inlet and at both dilutor outlets. a heater preceded the sample inlet, and the measurement was performed in the range from room temperature to about 200 °c. this temperature was measured using a thermocouple placed in the gas stream before the entrance to the dilutor. the dilution air had ambient temperature during the whole experiment. the experimental set-up is shown in figure 4.

figure 4. scheme of the experimental set-up.

to calculate the flow rate according to equation (1), it is necessary to determine the static pressure \(p_{s,2}\), which causes the suction of the sample gas.
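the correction procedure of eqs. (1)–(5) is compact enough to be captured in a short script. the sketch below estimates the density and poisson constant of the sample gas from its composition, assuming ideal-gas behaviour, and then applies eqs. (1)–(5); the species heat-capacity data, the nozzle geometry, and the pressures are illustrative assumptions on our part, not values taken from the paper. the determination of \(p_{s,2}\), which this calculation needs, is described in the text that follows.

```python
import math

R = 8.314  # universal gas constant [J/(mol K)]

# species data: molar mass [kg/mol], molar isobaric heat capacity [J/(mol K)];
# standard room-temperature ideal-gas values, not taken from the paper
SPECIES = {"co2": (0.04401, 37.1), "o2": (0.03200, 29.4),
           "n2": (0.02802, 29.1), "h2o": (0.01802, 33.6)}

def mixture_properties(mole_fractions, t_k, p_pa):
    """density [kg/m3] and poisson constant kappa of an ideal-gas mixture."""
    m = sum(x * SPECIES[s][0] for s, x in mole_fractions.items())
    cp = sum(x * SPECIES[s][1] for s, x in mole_fractions.items())
    return p_pa * m / (R * t_k), cp / (cp - R)

def nozzle_mass_flow(k_s, rho1, a2, p1, p2, kappa):
    """saint-venant-wantzel mass flow through the entry nozzle, eq. (1)."""
    pr = p2 / p1
    return (k_s * rho1 * a2 * pr ** (1.0 / kappa)
            * math.sqrt(2.0 * kappa / (kappa - 1.0) * p1 / rho1
                        * (1.0 - pr ** ((kappa - 1.0) / kappa))))

def corrected_dilution_ratio(n_air, air, gas, k_s, a2, p1, p2):
    """eqs. (2)-(5): vr = v_air/v_gas, n_gas = (n_air - 1)*vr + 1.
    'air' and 'gas' are (density, kappa) tuples at the nozzle inlet state."""
    v_air = nozzle_mass_flow(k_s, air[0], a2, p1, p2, air[1]) / air[0]
    v_gas = nozzle_mass_flow(k_s, gas[0], a2, p1, p2, gas[1]) / gas[0]
    return (n_air - 1.0) * (v_air / v_gas) + 1.0

# dry flue gas from the text: ~90 % co2, 6 % o2, remainder taken as n2 (assumption)
t, p = 473.15, 101325.0
air = mixture_properties({"n2": 0.79, "o2": 0.21}, t, p)
gas = mixture_properties({"co2": 0.90, "o2": 0.06, "n2": 0.04}, t, p)

# the nozzle coefficient, area, and pressures below are placeholders only
n_gas = corrected_dilution_ratio(n_air=8.0, air=air, gas=gas,
                                 k_s=0.9, a2=1.0e-6, p1=p, p2=0.95 * p)
print(f"nominal n_air = 8.0 -> corrected n_gas = {n_gas:.2f}")
```

note that the coefficient \(K_s\) and the area \(A_{s,2}\) cancel in the ratio \(VR\), so only the state and composition of the two gases matter for the correction — consistent with the closed form of eq. (5).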
the pressure \(p_{s,2}\) can be determined by blocking the sample inlet of the dilutor and measuring the static pressure there without the sample gas flowing in. similarly, the dilution air flow rate \(\dot{V}_{da}\) can be determined, again by blocking the sample inlet and measuring the two outlet flow rates of the dilutor.

3. results and discussion
3.1. solid particle emission characteristics
with the apparatus described, we performed five separate measurements and determined the concentration and particle size distribution in the flue gas. the cut-off diameters of the impactor were corrected according to the temperature. the acquired particle size distribution is shown in figure 5, with a curve plotted through the average values. our results showed a relatively small mass median aerodynamic diameter of 1.6 µm. the solid particle concentration in the flue gas was found to be ca. 30 mg/mn³. it is important to add that these results were influenced by the fact that the flue gas passes through a cyclone before exhausting to the vent, as can be seen in figure 1. according to different theories, the cut-off diameter of this cyclone was estimated to lie between 3 and 5 µm. that explains the relatively small median and concentration found.

figure 5. particle size distribution curve.
figure 6. pm2.5 particles captured in the impactor.

images of the particles captured in the impactor were taken with an electron microscope. figure 6 shows an image of particles captured in the second stage of the impactor. this picture shows many particles larger than 2.5 µm, which should be the cut-off diameter of this stage. these particles correspond to particles penetrating from the previous stage.

3.2. evaluation of the proposed method for dilution ratio correction
experimental measurements with co2, the results of which can be seen in figure 7, showed a very good agreement between the presented method and the experiment. the deviation between the measured and calculated values of the dilution ratio is, on average, about 1 % over the entire temperature course. the largest deviation was found at room temperature, where it reaches about 3 %. figure 7 also demonstrates how significant an error can be caused by using the air dilution ratio \(n_{air}\) for a different gas: in the case of co2, the error is almost 20 %. the same figure shows that the dilution ratio also varies with the sample gas temperature, which is another factor that should be kept in mind during field measurements.

figure 7. values of the dilution ratio obtained using the given method and experimentally by measurement, as a function of sample gas temperature.

4. conclusions
a measurement on the experimental 30 kw boiler during biomass combustion in the oxy-fuel regime was carried out to specify the solid particle emissions present in the flue gas. we found a relatively small mass median aerodynamic diameter of 1.6 µm and a concentration of 30 mg/mn³. both these values were affected by the presence of a cyclone at the outlet of the boiler. the results of this measurement will be used for the design of an appropriate precipitator for this combustion technology. to correctly interpret the data acquired during the solid particle emission measurement, there arose a need to specify the dilution ratio of the ejector dilutor we used for the measurement.
therefore, in this paper, we also proposed and experimentally verified a method for correcting the nominal dilution ratio to the actual conditions of a field measurement. the experimentally obtained data differ from the proposed method by an average of about 1 %. this method could find good use wherever the sample gas composition differs from air (the calibration gas) and the knowledge of accurate sample concentrations is important. such cases might be, for example, boiler flue gases or car exhaust gases.

acknowledgements
this work was supported by the projects research center for low-carbon energy technologies cz.02.1.01/0.0/0.0/16_019/0000753 and optimization of hvac systems sgs22/154/ohk2/3t/12. we gratefully acknowledge the support from these grants. we would also like to thank ing. stanislav šlang, ph.d. for providing us with the electron microscope photos of the impactor stages.

references
[1] epa. greenhouse gases. [2021-10-18], https://www.epa.gov/report-environment/greenhouse-gases.
[2] s. black, j. szuhánszki, a. pranzitelli, et al. effects of firing coal and biomass under oxy-fuel conditions in a power plant boiler using cfd modelling. fuel 113:780–786, 2013. https://doi.org/10.1016/j.fuel.2013.03.075.
[3] t. mendiara, f. garcía-labiano, a. abad, et al. negative co2 emissions through the use of biofuels in chemical looping technology: a review. applied energy 232:657–684, 2018. https://doi.org/10.1016/j.apenergy.2018.09.201.
[4] m. vodička, n. e. haugen, a. gruber, j. hrdlička. nox formation in oxy-fuel combustion of lignite in a bubbling fluidized bed – modelling and experimental verification. international journal of greenhouse gas control 76:208–214, 2018. https://doi.org/10.1016/j.ijggc.2018.07.007.
[5] b. giechaskiel, l. ntziachristos, z. samaras. calibration and modelling of ejector dilutors for automotive exhaust sampling. measurement science and technology 15(11):2199–2206, 2004. https://doi.org/10.1088/0957-0233/15/11/004.
[6] p. skopec, j. hrdlička. specific features of the oxyfuel combustion conditions in a bubbling fluidized bed. acta polytechnica 56(4):312–318, 2016. https://doi.org/10.14311/ap.2016.56.0312.
[7] dekati ltd. ejector diluter in exhaust measurements: technical note, 2001.
[8] a. j. c. barré de saint-venant, p. l. wantzel. mémoire et expériences sur l'écoulement de l'air, déterminé par des différences de pression considérables. journal de l'école polytechnique 27:85–122, 1839.

acta polytechnica vol. 45 no. 3/2005

structural stress analysis of an engine cylinder head

r. tichánek, m. španiel, m. diviš

this paper deals with a structural stress analysis of the cylinder head assembly of the c/28 series engine. a detailed fe model was created for this purpose. the fe model consists of the main parts of the cylinder head assembly, and it includes a description of the thermal and mechanical loads and the contact interaction between their parts. the model considers the dependency of the heat-transfer coefficient on the wall temperature in the cooling passages. the paper presents a comparison of computed and measured temperatures. the analysis was carried out using the fe program abaqus.

keywords: structural stress analysis, fem, internal-combustion engine.

notation
q̇   heat flux [w·m^-2]
t0  bulk temperature [k]
t   face temperature [k]
k   contact heat-transfer coefficient [w·m^-2·k^-1]
h   heat-transfer coefficient [w·m^-2·k^-1]

1 introduction
the cylinder head is one of the most complicated parts of an internal combustion engine.
it has to accommodate a combustion chamber, intake and exhaust valve ports, valves with valve seats and guides, a fuel injector, and a complex of cooling passages. in the combustion chamber, the combustion pressure and temperature peak at the order of 15 mpa and 2500 k. the heat fluxes and temperature nonuniformities lead to thermal stress, which further escalates the mechanical loading from the combustion pressure. the maximum temperature that the head material can sustain is much lower, and the regions around the combustion chamber need to be safely cooled to prevent overheating. placing the cooling passages very close to the most exposed regions is not always possible because of space demands, which results in limited cooling in these regions. the parts of the engine head assembly are usually made of different materials with varying thermal expansion. these facts lead to many compromises in the design, which can be sources of failures in operation.

avoiding the risk of failure in operation is one of the targets of engine designers. the design of the engine head must be tested under operational conditions. this procedure is necessary, but expensive. fe modeling of the cylinder head assembly under operational conditions is an appropriate complement to operational testing. a detailed fe strength analysis can provide valuable information about the temperature distribution and the mechanical stresses in the overall assembly of the cylinder head. this information is especially useful in regions where experimental data is barely obtainable. the temperature and mechanical stresses are analyzed using the temperature field, the combustion pressure in the combustion chamber, and other mechanical loads, i.e. bolt pre-stress, moulded seats and valve guides, etc. the resulting displacement/stress fields may be utilized for evaluating the operational conditions, i.e. the contact pressure uniformity between the valves and valve seats, as well as the strength and failure resistance of the assembly. such information contributes to a detailed understanding of the thermal and mechanical processes in the cylinder-head assembly under engine operation, which is a prerequisite for further optimization of the engine design.

in this study, we emphasize the problematic regions where proper cooling is limited. the regions around the valve seats experience thermal loading from in-cylinder burning gases during the combustion period and also during the exhaust phase – from burned gases flowing through the exhaust valve and along the exhaust-port walls. although the temperatures of the exhaust gases are significantly lower than the peak in-cylinder temperatures, the rapid movement of the flowing gases and the duration of the exhaust period expose the parts around the exhaust valves to considerable heat. the main portion of the heat accumulated in the valve is conducted through the contact surface of the valve seat. deformations of these parts, accompanied by improper contact and the occurrence of leakage on the conical valve contact face, dramatically increase the thermal loading of the valves and may lead to their destruction.

the modeling of the operating conditions of the combustion engine needs to include a model of cooling with the possibility of local boiling. a simplified model is used that increases the heat-transfer coefficient depending on the surface temperature in the cooling passages, which simulates local boiling. this model is implemented in the heat-transfer analysis.
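to make the local-boiling idea concrete, the following sketch implements a boiling-enhanced heat-transfer coefficient as a piecewise-linear function of the surface overheat. the curve points are invented placeholders — the measured dependency used in this paper is the one shown later in fig. 5 — so only the mechanism, not the numbers, should be read from this example.

```python
import bisect

# placeholder (overheat [K], h [W/(m2 K)]) pairs standing in for the measured
# boiling curve of fig. 5; the real values are not reproduced in the text
BOILING_CURVE = [(0.0, 5000.0), (5.0, 8000.0), (10.0, 14000.0), (20.0, 30000.0)]

def htc_cooling_passage(t_wall, t_sat):
    """piecewise-linear heat-transfer coefficient vs. surface overheat
    (t_wall - t_sat), mimicking the local-boiling enhancement applied to
    the cooling-passage boundary condition."""
    overheat = max(0.0, t_wall - t_sat)
    xs = [p[0] for p in BOILING_CURVE]
    ys = [p[1] for p in BOILING_CURVE]
    if overheat >= xs[-1]:
        return ys[-1]
    i = bisect.bisect_right(xs, overheat)
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (overheat - x0) / (x1 - x0)

# below saturation the base (convective) value applies; above it, h grows
print(htc_cooling_passage(t_wall=370.0, t_sat=378.0))  # no boiling
print(htc_cooling_passage(t_wall=385.0, t_sat=378.0))  # 7 K overheat
```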
2 cylinder head assembly
in this study, the cylinder head of a large turbo-charged direct-injection diesel engine is analyzed. the engine is used in power generators. the basic parameters of the engine are: bore 275 mm, stroke 330 mm, maximum brake mean effective pressure 1.96 mpa, nominal speed 750 rpm. the cylinder head (fig. 1, link 1) is made of cast iron. the cylinder head assembly contains two intake valves (fig. 1, link 6) and two exhaust valves (fig. 1, link 7), which are made from forged alloy steel. the valve guides (2, 3) and also the valve seats (4, 5) are pressed into the head. the exhaust-valve seats are cooled by cooling water flowing through the annular cavities around the seats. the fuel injector is situated in the center of the cylinder, and it is held in place with pre-pressed bolt connections. the bottom face of the cylinder head, which is directly exposed to the in-cylinder gases, is cooled by special bores, which represents a complication in the design of this mechanically highly loaded region of the cylinder head. the cylinder head assembly lies on the cylinder, and it is fixed with six pre-pressed bolt connections.

fig. 1: cylinder head assembly

3 the fe model
the fe model includes all the components mentioned above. the real design of the cylinder head was slightly modified in details to enable manageable meshing. the model of the cylinder head block was created using pro/engineer 3d product development software and was imported as a cad model, unlike the models of the other components (valves, seats, valve guides and fuel injector), which were developed directly in abaqus cae. some parts of the valves and fuel injector were considerably simplified or completely left out, as they were considered to have a negligible influence on the results. the mesh geometry of the basic parts is shown in fig. 2. it consists mainly of tetrahedron dc3d4 (158 586) and brick dc3d8 (41 844) elements. the bolts are modeled as b31 beams.

fig. 2: mesh geometry

4 interactions and boundary conditions
although the thermal loadings of engine parts vary considerably in time due to the cyclical nature of engine operation, the computations were performed assuming steady-state heat fluxes evaluated on the basis of time-averaged values. taking into account the speed of the periodic changes and the thermal inertia of the components of the cylinder head,
the temperature variations are damped out within a small distance from the wall surface (~1 mm), and this simplification is therefore acceptable.

the thermal contact interactions between the individual parts of the cylinder head assembly are described by the heat flux \(\dot{q}_{ab}\) from solid face a to face b, which is related to the difference of their surface temperatures \(t_a\), \(t_b\) according to
\[
\dot{q}_{ab} = k\,(t_b - t_a),
\]
where k is the contact heat-transfer coefficient. the values of the coefficient used in the present analysis are summarized in table 1; they follow the values reported in [3]. the value k = 6000 w·m^-2·k^-1 was used for all the metal contacts (fig. 3, fig. 4; links 1–4, 6–11), except that of the valves vs. their guides (fig. 3, fig. 4; links 5, 7), where the value is k = 600 w·m^-2·k^-1.

the boundary conditions on surfaces in contact with flowing gases are described as a steady-flow convective heat-transfer problem, where the heat flux \(\dot{q}\) transferred from a solid surface at temperature t to a fluid at bulk temperature t0 is determined from the relation
\[
\dot{q} = h\,(t - t_0),
\]
where h denotes the heat-transfer coefficient. it depends on the flow properties of the fluid and on the geometry of the surfaces. the functional forms of these relationships are usually developed with the aid of dimensional analysis. in the present study, the values of the gas-side heat-transfer coefficients and bulk gas temperatures (i.e. for the in-cylinder surfaces and the intake and exhaust port walls) were obtained from a detailed thermodynamic analysis of the engine operating cycle performed using the 0-d thermodynamic model obeh (see [4]). the analysis uses eichelberg's well-known empirical heat-transfer coefficient correlation.

fig. 3: interactions and boundary conditions on the cylinder-head block
fig. 4: interactions and boundary conditions on other parts of the cylinder-head assembly

the remaining boundary conditions on the outside surfaces, mostly exposed to the ambient air temperature, are described using estimated values of the heat-transfer coefficient; in special cases, the heat transfer is neglected. more detailed information on the used values is provided in table 1 in conjunction with fig. 3 and fig. 4. the possibility of the cooling water exceeding the boiling point was anticipated (see [1]). the dependency of the heat-transfer coefficient on the surface temperature is shown in fig. 5. this figure presents experimental data on the increase of the heat-transfer coefficient when pure water boils under flow conditions (see [2]).

fig. 5: dependency of the heat-transfer coefficient of the cooling passages on the surface overheat

table 1: description of the boundary conditions
boundary condition description                       link   heat-transfer coefficient [w·m^-2·k^-1]   bulk temp. [k]
insulated surfaces (negligible heat-transfer rate)    20     0 (adiabatic)                              –
free surfaces (contact with ambient air)              29     5                                          320
cooling passages                                      30     (see fig. 5)                               350
in-cylinder surfaces                                  21     450                                        1120
intake-port surfaces                                  22     800                                        330
exhaust-port surfaces                                 23     800                                        700
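the two flux relations above, together with the tabulated coefficients, translate directly into code; a minimal sketch (the temperatures passed in are arbitrary examples, while the coefficients are those quoted in the text and in table 1):

```python
def contact_heat_flux(k, t_a, t_b):
    """contact heat flux from face a to face b, q_ab = k*(t_b - t_a) [W/m2]."""
    return k * (t_b - t_a)

def convective_heat_flux(h, t, t0):
    """convective heat flux from a wall at t to a fluid at bulk t0, q = h*(t - t0)."""
    return h * (t - t0)

K_METAL = 6000.0   # metal-metal contacts [W/(m2 K)], from the text
K_GUIDE = 600.0    # valves vs. their guides [W/(m2 K)], from the text

# arbitrary example temperatures [K]; a negative convective flux means the
# gas heats the wall (in-cylinder surface: h = 450 W/(m2 K), t0 = 1120 K)
print(contact_heat_flux(K_METAL, t_a=520.0, t_b=540.0))
print(convective_heat_flux(450.0, t=550.0, t0=1120.0))
```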
the contact interactions between the head and the valve guides/ports, the valves and the guides/ports, the head and the gasket ring, the pre-stressed bolts and the valve springs are included in the structural analysis of the head. five basic states were studied.
1) assembly: the gasket ring is constrained on the cylinder side. the head is bolted onto the cylinder gasket ring with six pre-stressed bolts fully constrained on the cylinder side. the valve seats/guides are pressed into the head using contact constraints. the valves interact with the guides by special mpc constraints, and with the seats by contact constraints. the pre-stressed valve springs are inserted between the valves and the head. the fuel injector is constrained on the head bottom inner surface by contact and pressed onto it by two pre-stressed bolts.
2) average pressure load: the assembly is loaded by the average in-cylinder pressure p = 1.96 mpa on the head bottom outer surface and the valve bottoms.
3) maximum pressure load: the head bottom outer surface and the valve bottoms are loaded by the maximum in-cylinder pressure p = 12 mpa.
4) maximum pressure and temperature load: the assembly is loaded by the maximum pressure and the temperature field from the previous steady-state heat-transfer analysis.
5) average pressure and temperature load: the assembly is loaded by the average pressure and the temperature field from the previous steady-state heat-transfer analysis.

5 results
the experimentally determined temperatures provided by the engine manufacturer were compared with the computed results. the thermocouples were placed in special bores. all the bores were situated at a distance of 18 mm from the bottom margin of the cylinder head, fig. 6. despite a lack of further detailed information on the conditions of the experiment (errors caused by the measuring equipment, the influence of the location and fixation of the thermocouples in the bores, etc.), the authors found the data provided to be a usable and useful resource for the verification of the presented model. a comparison of the computed and measured temperatures is presented in table 2.

fig. 6: distribution of measured points p1–p13 over the head

table 2: comparison between the computed and measured temperatures
measured point   measured [k]   computed [k]
1                425            551.8
2                509            533.1
3                442            424.1
4                0              430
5                412            422
6                448            432
7                415            425
8                468            478
9                394            427
10               430            490.6
11               400            432.6
12               361            437.9
13               414            523.8

fig. 7: contact pressure on the intake valve seat

as an example of the structural analysis, the contact pressure distribution between the intake valve and the seat is shown in fig. 7. the idealized contact surface is conical. two edge circles of this surface establish the inner and outer path along which the contact pressure is mapped. the position on both the inner and the outer circle is measured as an angular coordinate in the cylindrical system associated with the axis of the valve. the curves document a strong dependence of the valve/seat contact on the temperature loading. in the cold state, the inner edge transfers more load, whereas in the hot state the outer edge is simply overloaded.

6 conclusion
the experimental data provided by the engine manufacturer was compared with the computed results. the thermocouples were placed in special bores at a distance of 18 mm from the bottom margin of the cylinder head. the heat-transfer analysis confirmed the importance of including the assumption of local boiling in the analysis. the structural analysis results have not been fully evaluated yet.
the influence of valve seat deformation due to assembly, pressure, and thermal loading on the contact pressure distribution between the valves and seats is significant.

fig. 8: contact pressure distribution on the intake valve seat

7 acknowledgments
this research was conducted in the josef božek research center of engine and automotive engineering, supported by the ministry of education of the czech republic, project no. ln00b073.

references
[1] španiel, m., macek, j., diviš, m., tichánek, r.: "diesel engine head steady state analysis", mecca – journal of middle european construction and design of cars, vol. 2 (2003), no. 3, p. 34–41, issn 1214-0821.
[2] mcassey, e. v., kandlikar, s. g.: convective heat transfer of binary mixtures under flow boiling conditions. villanova university, villanova, pa, usa.
[3] horák, f., macek, j.: "use of predicted fields in main parts of supercharged diesel engine." proceedings of xix. conference of international centre of mass and heat transfer. new york: pergamon press, 1987.
[4] macek, j., vávra, j., tichánek, r., diviš, m.: výpočet oběhu motoru 6c28 a stanovení okrajových podmínek pro pevnostní a deformační výpočet dna hlavy válce. čvut v praze, fakulta strojní, vcjb, 2001 (in czech).
[5] macek, j., vítek, o., vávra, j.: kogenerační jednotka s plynovým motorem o výkonu větším než 3 mw – ii. čvut v praze, fakulta strojní, 2000 (in czech).

ing. radek tichánek, phone: +420 2 2435 2507, tichanek@fsid.cvut.cz, department of automotive and aerospace engineering
ing. miroslav španiel, csc., phone: +420 2 2435 2561, spaniel@lin.fsid.cvut.cz, department of mechanics
ing. marcel diviš, phone: +420 2 2435 1827, divis@student.fsid.cvut.cz, department of automotive and aerospace engineering
josef božek research center, czech technical university in prague, technická 4, 166 07 praha 6, czech republic

acta polytechnica 62(4):479–487, 2022
https://doi.org/10.14311/ap.2022.62.0479

active disturbance rejection control-based anti-coupling method for conical magnetic bearings

danh huy nguyen^a, minh le vu^a, hieu do trong^a, danh giang nguyen^b, tung lam nguyen^a,∗
a hanoi university of science and technology, school of electrical engineering, 1 dai co viet st, hanoi 100000, vietnam
b national university of civil engineering, faculty of mechanical engineering, 55 giai phong st, hanoi 100000, vietnam
∗ corresponding author: lam.nguyentung@hust.edu.vn

abstract. conical-shaped magnetic bearings are currently a potential candidate for various magnetic-force-supported applications due to their unique geometric nature, which reduces the number of required active magnets. however, the bearing structure poses control-engineering problems in view of the underactuated and coupling phenomena. the paper proposes an active disturbance rejection control (adrc) for solving the above-mentioned problems in the conical magnetic bearing. at first, virtual current controls are identified to decouple the electrical sub-system; then, the active disturbance rejection control is employed to eliminate the coupling effects owing to rotational motions. comprehensive simulations are provided to illustrate the control ability.

keywords: conical active magnetic bearings, over-actuated systems, adrc, coupling mechanism, linearization.
1. introduction
recently, the active magnetic bearing (amb) has been of increasing interest to the manufacturing industry due to its contactless operation, freedom from lubrication and mechanical wear, and high-speed capability [1–3]. these characteristics enable ambs to be employed in a variety of applications, including artificial hearts [4], vacuum pumps [5], and flywheel energy storage systems [6] and [7], etc. due to the non-contact nature of a magnetic suspension, the motion resolution of the suspended object in translation or high-speed rotation is restricted solely by the actuators, sensors, and the servo system utilised. as a result, magnetic bearings can be utilised in almost any environment as long as the electromagnetic parts are suitably shielded, for example, in open air at temperatures ranging from 235 °c to 450 °c [8]. many researchers, in particular, have endeavoured to design a range of ambs that are compact and simple-structured while still performing well. because of the advantages of a cone-shaped active magnetic bearing (amb) system, such as its simple structure, low heating, and high dependability, there is an increasing number of studies on it [9, 10]. the structure of a conical magnetic bearing is identical to that of a regular radial magnetic bearing, with the exception that both the stator and rotor working surfaces are conical, allowing force to be applied in both the axial and radial directions [11, 12]. the conical form saves axial space, which can be used to install gears and other components for an added mechanical benefit; it also conserves energy for an optimal load support. compared to ordinary radial electromagnetic bearings, conical electromagnetic bearings feature two coupled properties – current-coupled and geometry-coupled effects – making the dynamic modelling and control of these systems particularly difficult. the current-coupled effect exists because the axial and radial control currents flow in the bearing coils simultaneously. furthermore, the inclined angle of the magnet core causes a geometry-coupled effect. the coupled dynamic characteristics of the rotor–conical-magnetic-bearing system arise from the existence of these two coupled effects.

so far, several researchers have discussed the modelling and control of cone-shaped ambs [2, 13, 14]. lee cw and jeong hs presented a control method for conical magnetic bearings in [12], which allows the rotor to float in the air stably. they proposed a fully coupled linearised dynamic model for the cone-shaped magnet coil that covers the relationships between the input voltage and the output current. the coupled controller uses a linear quadratic regulator with integral action to stabilise the amb system, while the decoupled controller is used to stabilise the five single-dof systems. abdelfatah m. mohamed et al. [11] proposed the q-parameterization method for designing the system stabilisation in terms of two free parameters. the proposed technique is validated using a digital simulation; as a result, plant characteristics such as the transient and forced responses are good, and the stiffness characteristics are obtained with a small oscillation. recently, in [15], e. e. ovsyannikova and a. m. gus'kov created a mathematical model of a rigid rotor suspended in a blood flow and supported by conical active magnetic bearings.
they used proportional–integral–derivative (pid) control, which takes into account the influence of the hydrodynamic moments that affect the rotor from the side of the blood flow, as well as external influences on the person. the experimental findings are reported, with a rotor speed range of 5000 to 12000 rpm and a placement error of less than 0.2 mm. nguyen et al. introduced a control approach considering the input and output constraints of the magnetic bearing system in [16] and [17]; the control restricts the rotor displacement to a certain range according to the system structure. in [18], the modelling of a conical amb structure for the complete support of a 5-dof rotor system was reported by a. katyayn and p. k. agarwal, who improved the system's performance by creating an interval type-2 fuzzy logic controller (it2flc) with an uncertain bound algorithm. this controller reduces the need for precise system modelling while also allowing for the handling of parameter uncertainty. the simulation results show that the proposed controller outperforms the type-1 fuzzy logic controller in terms of transient responses.

in this paper, we examine the concept of conical magnetic bearings for both radial and axial displacement control. the governing equations characterising the relationship between the magnetic forces, air gaps, gyroscopic force, and control currents are used to build the nonlinear model of the conical magnetic bearing. the main contribution of the paper is that the rotational motions are treated as disturbances and are handled by the active disturbance rejection control (adrc) [19, 20] to stabilise the cone-shaped amb system. adrc was developed as an option that combines the easy applicability of conventional pid-type control methods with the strength of modern model-based approaches. the core of adrc is an extended observer that treats actual disturbances and modelling uncertainty together, using only a very coarse process model to create a control loop. owing to the excellent abilities of adrc, the paper also tackles the unwanted dynamics due to rotational motions, which are normally neglected in other related works; neglecting them might lead to system degradation at high operating speeds, where strong coupling effects result. the effectiveness of the proposed control structure for stabilising the rotor position and rejecting the coupling-phenomenon-induced disturbance is numerically evaluated through comprehensive scenarios.

2. dynamic modelling of conical magnetic bearings
consider the simplified model of a conical magnetic bearing system as shown in fig. 1. it is assumed that the rotor is rigid and that its centre of mass and geometric centre coincide. furthermore, the assumptions of a non-saturated magnetic circuit and negligible flux linkage between the magnetic coils are made.

figure 1. model of a cone-shaped active magnetic bearing system.

\(r_m\) and \(\beta\) are the effective radius and inclined angle of the magnetic core, \(b_1\) and \(b_2\) are the distances between the two radial magnetic bearings and the centre of gravity of the rotor; \(f_j\) (j = 1 to 8) are the magnetic forces produced by the stator and exerted on the rotor; (x, y, z) and (\(\theta_x, \theta_y, \theta_z\)) are the displacement and angular coordinates defined with respect to the centre of mass. the cone-shaped active magnetic bearing system can be modelled as follows:
\[
m\ddot{x} = (f_1 + f_2 + f_5 + f_6)\sin\beta - (f_3 + f_4 + f_7 + f_8)\sin\beta - mg,
\]
\[
m\ddot{y} = (f_1 - f_2 + f_3 - f_4)\cos\beta,
\]
\[
m\ddot{z} = (f_5 - f_6 + f_7 - f_8)\cos\beta,
\]
\[
j_d\ddot{\theta}_y = \left[(f_6 - f_5)b_1 + (f_7 - f_8)b_2\right]\cos\beta + (f_5 - f_6 + f_8 - f_7)\,r_m\sin\beta + j\dot{\theta}_x\dot{\theta}_z,
\]
\[
j_d\ddot{\theta}_z = \left[(f_1 - f_2)b_1 + (f_4 - f_3)b_2\right]\cos\beta + (f_2 - f_1 + f_3 - f_4)\,r_m\sin\beta + j\dot{\theta}_x\dot{\theta}_y, \tag{1}
\]
where j is the moment of inertia of the rotor about the axis of rotation, and m and \(j_d\) are the mass and the moment of inertia of the rotor, respectively. we also consider the effect of the x-axis rotation on the other two axes. the first three equations in eq. (1) describe the rotor's translational motion, while the last two equations represent the rotor's rotational dynamics. in the two rotational equations there is an additional feedback-torque component: suppose the rotor rotates rapidly and a force applied along the y-axis (z-axis) is sufficiently large to deflect the rotor from its axis of motion by a small angle; the rotor then reacts back with a torque of the corresponding magnitude \(j\dot{\theta}_x\dot{\theta}_z\). the component of the gyroscopic torque along the z-axis is computed similarly.

figure 2. simplified model of the cone-shaped amb system.

in order to linearise the dynamic equation (1), small motions of the rotor are considered. fig. 2 shows the change of the air gap of the cone-shaped magnet, which is written as:
\[
g_{y1,2} = g_o - x\sin\beta \pm (y + b_1\theta_z)\cos\beta, \qquad
g_{y3,4} = g_o + x\sin\beta \pm (y - b_2\theta_z)\cos\beta,
\]
\[
g_{z1,2} = g_o - x\sin\beta \pm (z + b_1\theta_y)\cos\beta, \qquad
g_{z3,4} = g_o + x\sin\beta \pm (z + b_2\theta_y)\cos\beta, \tag{2}
\]
where \(g_o\) is the steady-state nominal air gap. the magnetic force can be written with respect to the actual air gap and the current as:
\[
f_{1,2} = \frac{\mu_o a_p n^2 (i_{o1} + i_{y1,2})^2}{4 g_{y1,2}^2}, \qquad
f_{3,4} = \frac{\mu_o a_p n^2 (i_{o2} + i_{y3,4})^2}{4 g_{y3,4}^2},
\]
\[
f_{5,6} = \frac{\mu_o a_p n^2 (i_{o1} + i_{z1,2})^2}{4 g_{z1,2}^2}, \qquad
f_{7,8} = \frac{\mu_o a_p n^2 (i_{o2} + i_{z3,4})^2}{4 g_{z3,4}^2}, \tag{3}
\]
where \(\mu_o\) (= 4π × 10^-7 h/m) is the permeability of free space; \(a_p = a/\cos\beta\), with a the cross-sectional area; n is the number of coil turns; \(i_{qj}\) (j = 1 to 4, q = y, z) is the control current of each magnet; and \(i_{o1}\) and \(i_{o2}\) are the bias currents in the upper and lower bearing. assume that the current change and the displacement of the rotor are small relative to the bias current and the nominal air gap. applying eq. (2) to eq. (3) and using a taylor expansion, the magnetic force is linearised as:
\[
f_{1,2} = f_{o1} + k_{i1} i_{y1,2} + k_{q1} x\sin\beta \pm k_{q1}(y + b_1\theta_z)\cos\beta,
\]
\[
f_{3,4} = f_{o2} + k_{i2} i_{y3,4} + k_{q2} x\sin\beta \pm k_{q2}(y - b_2\theta_z)\cos\beta,
\]
\[
f_{5,6} = f_{o1} + k_{i1} i_{z1,2} + k_{q1} x\sin\beta \pm k_{q1}(z - b_1\theta_y)\cos\beta,
\]
\[
f_{7,8} = f_{o2} + k_{i2} i_{z3,4} + k_{q2} x\sin\beta \pm k_{q2}(z + b_2\theta_y)\cos\beta, \tag{4}
\]
where \(f_{oj} = \mu_o a_p n^2 i_{oj}^2/(4 g_o^2)\), j = 1, 2, are the steady-state magnetic forces, and \(k_{qj} = 2 f_{oj}/g_o\), \(k_{ij} = 2 f_{oj}/i_{oj}\), j = 1, 2, are the position and current stiffnesses, respectively. combining eqs. (1) and (4), the linear differential equation describing the kinematics of the 5-degree-of-freedom conical-amb drive system can be written as:
\[
m_b\ddot{q}_b + k_b q_b = k_{ibm} i_m + g\dot{q}_b, \tag{5}
\]
where
\[
q_b = \{x, y, z, \theta_y, \theta_z\}^T, \qquad
i_m = \{i_{y1}, i_{y2}, i_{y3}, i_{y4}, i_{z1}, i_{z2}, i_{z3}, i_{z4}\}^T,
\]
\[
k_b = \begin{pmatrix}
-k_{xx} & 0 & 0 & 0 & 0 \\
0 & -k_{yy} & 0 & 0 & -k_{y\theta_z} \\
0 & 0 & -k_{zz} & -k_{z\theta_y} & 0 \\
0 & 0 & -k_{\theta_y z} & -k_{\theta_y\theta_y} & 0 \\
0 & -k_{\theta_z y} & 0 & 0 & -k_{\theta_z\theta_z}
\end{pmatrix}, \qquad
g = \begin{pmatrix}
0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & j\dot{\theta}_x \\
0 & 0 & 0 & j\dot{\theta}_x & 0
\end{pmatrix},
\]
\[
k_{ibm}^{T} = \begin{pmatrix}
k_{i1} s_\beta & k_{i1} c_\beta & 0 & 0 & k_{i1}\sigma \\
k_{i1} s_\beta & -k_{i1} c_\beta & 0 & 0 & -k_{i1}\sigma \\
-k_{i2} s_\beta & k_{i2} c_\beta & 0 & 0 & k_{i2}\gamma \\
-k_{i2} s_\beta & -k_{i2} c_\beta & 0 & 0 & -k_{i2}\gamma \\
k_{i1} s_\beta & 0 & k_{i1} c_\beta & k_{i1}\alpha & 0 \\
k_{i1} s_\beta & 0 & -k_{i1} c_\beta & -k_{i1}\alpha & 0 \\
-k_{i2} s_\beta & 0 & k_{i2} c_\beta & k_{i2}\gamma & 0 \\
-k_{i2} s_\beta & 0 & -k_{i2} c_\beta & -k_{i2}\gamma & 0
\end{pmatrix}, \qquad
m_b = \mathrm{diag}(m, m, m, j_d, j_d),
\]
with \(s_\beta = \sin\beta\), \(c_\beta = \cos\beta\), and
\[
k_{xx} = 4(k_{q1} + k_{q2})\sin^2\beta, \qquad
k_{yy} = k_{zz} = 2\cos^2\beta\,(k_{q1} + k_{q2}),
\]
\[
k_{y\theta_z} = k_{z\theta_y} = 2\cos^2\beta\,(k_{q1} b_1 + k_{q2} b_2),
\]
(5) can be rewritten as: mbq̈b − gq̇b + kbqb = kibmhir (7) since the control is performed in the bearing coordinates, rewriting the equations of motion in bearing coordinates utilising the relationship between the mass-centre coordinates (x, y, z, θy , θz ) and the bearing coordinates (x, y1, y2, z1, z2), given by: qb = { x, y, z, θy , θz } t and qse = { x, y1, y2, z1, z2} t qse = tqb with t is the coordinate transfer matrix t =   1 0 0 0 0 0 1 0 0 b1 0 1 0 0 −b2 0 0 1 −b1 0 0 0 1 b2 0   eq. (1) shows that the inter-channel effect occurs at kb and kibmh because the major non-diagonal components are not zero. the kb and kibmh are invertible. the following control structure is used to eliminate the interstitial component: ir = (kibmh)−1(v + kbt−1qse) (8) 482 vol. 62 no. 4/2022 active disturbance rejection control-based . . . figure 4. control loop structure with active disturbance rejection control (adrc). eq. (7) can be rewritten as: mbq̈b − gq̇b = v (9) where v is the new control signal’s vector. the interstitial component has been removed in the control channels (x, y, z) leaving just the interstitial component in the control channel (θy , θz ) owing to the gyroscope force. the original model of the magnetic bearing is a complex, multivariable nonlinear system, through the process of linearity and decoupling, we have the linear form of the system shown in eq. (9) with 5 inputs and 5 outputs. the full form of eq. (9) is shown as follows:   m .. x = v1 m .. y = v2 m .. z = v3 jd .. θy +j . θx . θz = v4 jd .. θz +j . θx . θy = v5 (10) remark: it is noted that the first three equations of eq. (10) characterise the transnational motions and can be readily stabilized using v1, v2, and v3. the last two equation indicate coupling mechanism related to the rotational motion of the rotor that is normally ignored. in practice, the magnetic bearing is normally employed to operate in high speed range, hence rotational motion effects can not be neglected. in this section, the adrc controller will be used to remove the remaining coupling components as well as to stabilise the control object. an adrc controller is used for each input and output pair (x, v1), (y, v2), (z, v3), (θy , v4), (θz , v5). in adrc design, f (t) is unknown and is considered as a “generalized disturbance”, and b0 is the available information concerning the model. the according structure of the control loop with adrc is presented in fig. 4. the fundamental idea of adrc is to implement an extended state observer (eso) that provides an estimate, f̂ (t) , such that we can compensate the impact of f (t) on our system. the equation for the extended state observer is given as:  ˙̂x1 (t)˙̂x2 (t) ˙̂x3 (t)   =   0 1 00 0 1 0 0 0     x̂1 (t)x̂2 (t) x̂3 (t)   +   0b0 0   u (t) +   l1l2 l3   (y (t) − x̂1 (t)) =   −l1 1 0−l2 0 1 −l3 0 0   ︸ ︷︷ ︸ a−lc   x̂1 (t)x̂2 (t) x̂3 (t)   +   0b0 0   ︸ ︷︷ ︸ b u (t) +   l1l2 l3   y (t) (11) where x̂1(t) = ŷ(t); x̂2(t) = ˙̂y(t); x̂3(t) = f̂ (t). removing the unknown components is done through the following control law: ÿ(t) = (f (t) − f̂ (t)) + u0(t) ≈ u0(t) ≈ kp .((r(t) − y(t)) − kd .ẏ(t)) (12) where r is the setpoint. in order to work properly, observer parameters, l1, l2, l3, in eq. (11) still have to be determined. according to [20], the adrc’s parameters can be chosen to tune the closed-loop to a critically damped behaviour and a desired 2% settling time tsettle. 
the tuning procedure is summarised as follows: kp = ( scl )2 , kd = −2.scl l1 = −3.seso, l2 = 3. ( seso )2 , l3 = ( seso )3 (13) with scl = − 6 tsettle being the negative-real double closed-loop pole. seso ≈ (3...10).scl is the observer pole. using the adrc controller to calculate the variable x, y and z are calculated similarly: ẍ = ( 1 m .d(t) + ∆b.u(t))︸ ︷︷ ︸ f (t) +b01v1 = f (t) + b01.v1(t) v1(t) = kp 1.((r(t) − x̂(t)) − kd1. . x̂(t)) (14) for equations containing the two variables (θy and θz ), which have an interleaved component between the two equations. because the interleaved component is unknown, the extended observer can be used to estimate and analyse it, using the adrc controller with variable θy , as follows: θ̈y = ( 1 j d(t) + ∆b.v4(t) + j θxθz) + b04v4 = f (t) + b04.v4(t) b04 = 1 j v4(t) = kp 4.((r(t) − θ̂y (t)) − kd4. ˙̂ θy (t)) (15) 483 huy, minh, hieu et al. acta polytechnica name symbol b01 = b02 = b03 1/m b04 = b05 1/j tsettle 0.1 (s) scl -60 kp i (i = 1, . . . , 5) 3600 kdi (i = 1, . . . , 5) 120 seso -420 l1i (i = 1, . . . , 5) 1260 l2i (i = 1, . . . , 5) 529200 l3i (i = 1, . . . , 5) 74088000 table 1. controller parameters. 4. numerical simulations in this section, we consider two scenarios to evaluate the effectiveness of using the adrc controller in the case of variable speed rotation and rotor load disturbance. bearing design parameters value radial air gap g0 0.5 mm cross-sectional area a 18*10 mm inclined angle β 10o magnetic coils n 300 turns resistance r 2 ω inductance of wire l0 20 mh rotor mass m 1.86 kg moment of inertia jd 0.00647 kgm2 moment of inertia jp 0.00121 kgm2 bias current i01, i02 1.6 a,1 a bearing span b1, b2 81.7 mm,71.6 mm table 2. system parameters. 4.1. simulation scenario 1: we design an adrc controller with a rotor rotation speed of 3000 rpm. the initial values of the rotor’s centre of mass position are: x0 = 0.25.10−3; y0 = 0.2.10−3; z0 = 0.125.10−3; θy = 0.1.10−3; θz = 0.2.10−3. select the coefficients of the adrc as follows scl = − 60.1 , s eso = 7scl, kp = (scl)2, kd = −2scl, l1 = −3seso; l2 = 3 ( seso )2 , l3 = −(seso)3. the position of the centre of mass and the deflection angle of the rotor return to the equilibrium position after a time interval of 0.1 seconds and there is no overshoot in fig. 5 and fig. 6. from fig. 7, initially, when the rotor position deviates from the equilibrium position, a control current is generated to bring the rotor back to the equilibrium position. after the rotor is in the equilibrium position, the control current is zero so that the bias currents i01 and i02 keep the rotor in this equilibrium state. the impact force of the magnet is shown in fig. 8 as having a significant value figure 5. response to the position of the x, y, z axes. figure 6. the position of the axis angle θy , θz . figure 7. control current response. at first to bring the rotor to equilibrium, but once the rotor returns to equilibrium, the force is kept stable at the values f01 and f02. from the above results, it can be concluded that the controller is designed to completely satisfy the requirements. 484 vol. 62 no. 4/2022 active disturbance rejection control-based . . . figure 8. impact force of electromagnets. figure 9. velocity deviation of x, y, z axes according to observer. figure 10. velocity deviation of θy , θz axes according to observer. based on fig. 9 and fig. 10, the observer satisfied the requirements, and the estimated velocity values were near to the real velocity value after 0.1 s. 4.2. 
simulation scenario 2: the rotor speed will be changed to 12000 rpm to evaluate the controllability of the controller when the rotor is in the high-speed region, the initial value of the rotor’s centre of mass is: x0 = 0.25.10−3; y0 = 0.2.10−3; z0 = 0.125.10−3; θy = 0.1.10−3; θz = 0.2.10−3. the simulation results on the x, y, and z axes are identical to the first simulation scenario, as shown in fig. 12, where the angular position responses of the axes θy , θz have an undershoot and the response time has been increased to 0.2 seconds. only the θy and θz axes are impacted when the rotor rotates at high speeds, but it soon returns to equilibrium. the suggested controller takes into account the rotor speed factor and demonstrates its capacity to function well in the high-speed region. figure 11. response to the position of the x, y, z axes. figure 12. the position of the axis angle θy , θz . 485 huy, minh, hieu et al. acta polytechnica figure 13. control current response. figure 14. impact force of electromagnets. 5. conclusions in the paper, we consider the cone-shaped magnetic bearing, which is characterised as a class of underactuated and strongly coupled systems. based on control current distribution, the coupling mechanism in electrical sub-system is solved. subsequently, an active disturbance control is adopted to tackle the rotational-motion-induced disturbance acting on the system. the simulations are carried out proving that the proposed control can effectively bring the the rotor to equilibrium. the results also indicate that the coupling effects from low to high rotational speeds do not have a noticeable impact on the transnational motions of the rotor. in the future, experimental study will be carried out. acknowledgements this research is funded by the hanoi university of science and technology (hust) under project number t2021-pc001. figure 15. velocity deviation of x, y, z axes according to observer. figure 16. velocity deviation of θy , θz axes according to observer. references [1] c. r. knospe. active magnetic bearings for machining applications. control engineering practice 15(3):307–313, 2007. https://doi.org/10.1016/j.conengprac.2005.12.002. [2] w. ding, l. liu, j. lou. design and control of a high-speed switched reluctance machine with conical magnetic bearings for aircraft application. iet electric power applications 7(3):179–190, 2013. https://doi.org/10.1049/iet-epa.2012.0319. [3] p. imoberdorf, c. zwyssig, s. d. round, j. kolar. combined radial-axial magnetic bearing for a 1 kw, 500,000 rpm permanent magnet machine. in apec 07 twenty-second annual ieee applied power electronics conference and exposition, pp. 1434–1440. 2007. https://doi.org/10.1109/apex.2007.357705. [4] j. x. shen, k. j. tseng, d. m. vilathgamuwa. a novel compact pmsm with magnetic bearing for artificial heart application. ieee transactions on industry applications 36(4):1061–1068, 2000. https://doi.org/10.1109/28.855961. [5] m. d. noh, s. r. cho, j. h. kyung, et al. design and implementation of a fault-tolerant magnetic bearing 486 https://doi.org/10.1016/j.conengprac.2005.12.002 https://doi.org/10.1049/iet-epa.2012.0319 https://doi.org/10.1109/apex.2007.357705 https://doi.org/10.1109/28.855961 vol. 62 no. 4/2022 active disturbance rejection control-based . . . system for turbo-molecular vacuum pump. ieee/asme transactions on mechatronics 10(6):626–631, 2005. https://doi.org/10.1109/tmech.2005.859830. [6] b. han, s. zheng, y. le, s. xu. 
modeling and analysis of coupling performance between passive magnetic bearing and hybrid magnetic radial bearing for magnetically suspended flywheel. ieee transactions on magnetics 49(10):5356–5370, 2013. https://doi.org/10.1109/tmag.2013.2263284. [7] d. h. nguyen, m. l. nguyen, t. l. nguyen. decoupling control of a disc-type rotor magnetic bearing. the international journal of integrated engineering 13(5):247–261, 2021. https://doi.org/10.30880/ijie.2021.13.05.026. [8] a. h. slocum. introduction to precision machine design. precision machine design pp. 390–399, 1992. [9] h. s. jeong, c. s. kim. modeling and control of active magnetic bearings, 1994. [10] s. xu, j. fang. a novel conical active magnetic bearing with claw structure. ieee transactions on magnetics 50(5), 2014. https://doi.org/10.1109/tmag.2013.2295060. [11] a. mohamed, f. emad. conical magnetic bearings with radial and thrust control. in proceedings of the 28th ieee conference on decision and control, pp. 554–561. 1989. https://doi.org/10.1109/cdc.1989.70176. [12] c. w. lee, h. s. jeong. dynamic modeling and optimal control of cone-shaped active magnetic bearing systems. control engineering practice 4(10):1393–1403, 1996. https://doi.org/10.1016/0967-0661(96)00149-9. [13] j. fang, c. wang, j. tang. modeling and analysis of a novel conical magnetic bearing for vernier-gimballing magnetically suspended flywheel. proceedings of the institution of mechanical engineers, part c: journal of mechanical engineering science 228(13):2416–2425, 2014. https://doi.org/10.1177/0954406213517488. [14] s. j. huang, l. c. lin. fuzzy modeling and control for conical magnetic bearings using linear matrix inequality. journal of intelligent and robotic systems: theory and applications 37(2):209–232, 2003. https://doi.org/10.1023/a:1024137007918. [15] e. e. ovsyannikova, a. m. gus’kov. stabilization of a rigid rotor in conical magnetic bearings. journal of machinery manufacture and reliability 49(1):8–15, 2020. https://doi.org/10.3103/s1052618820010100. [16] d. h. nguyen, t. l. nguyen, m. l. nguyen, h. p. nguyen. nonlinear control of an active magnetic bearing with output constraint. international journal of electrical and computer engineering (ijece) 8(5):3666, 2019. https://doi.org/10.11591/ijece.v8i5.pp3666-3677. [17] d. h. nguyen, t. l. nguyen, d. c. hoang. a non-linear control method for active magnetic bearings with bounded input and output. international journal of power electronics and drive systems 11(4):2154–2163, 2020. https: //doi.org/10.11591/ijpeds.v11.i4.pp2154-2163. [18] a. katyayn, p. k. agarwal. type-2 fuzzy logic controller for conical amb-rotor system. in 2017 4th international conference on power, control & embedded systems (icpces), pp. 1–6. 2017. https://doi.org/10.1109/icpces.2017.8117616. [19] j. han. from pid to active disturbance rejection control. ieee transactions on industrial electronics 56(3):900–906, 2009. https://doi.org/10.1109/tie.2008.2011621. [20] g. herbst. a simulative study on active disturbance rejection control (adrc) as a control tool for practitioners. electronics 2(3):246–279, 2013. https://doi.org/10.3390/electronics2030246. 
487 https://doi.org/10.1109/tmech.2005.859830 https://doi.org/10.1109/tmag.2013.2263284 https://doi.org/10.30880/ijie.2021.13.05.026 https://doi.org/10.1109/tmag.2013.2295060 https://doi.org/10.1109/cdc.1989.70176 https://doi.org/10.1016/0967-0661(96)00149-9 https://doi.org/10.1177/0954406213517488 https://doi.org/10.1023/a:1024137007918 https://doi.org/10.3103/s1052618820010100 https://doi.org/10.11591/ijece.v8i5.pp3666-3677 https://doi.org/10.11591/ijpeds.v11.i4.pp2154-2163 https://doi.org/10.11591/ijpeds.v11.i4.pp2154-2163 https://doi.org/10.1109/icpces.2017.8117616 https://doi.org/10.1109/tie.2008.2011621 https://doi.org/10.3390/electronics2030246 acta polytechnica 62(4):479–487, 2022 1 introduction 2 dynamic modelling of conical magnetic bearings 3 control system design 4 numerical simulations 4.1 simulation scenario 1: 4.2 simulation scenario 2: 5 conclusions acknowledgements references acta polytechnica https://doi.org/10.14311/ap.2021.61.0749 acta polytechnica 61(6):749–761, 2021 © 2021 the author(s). licensed under a cc-by 4.0 licence published by the czech technical university in prague hydrothermally-calcined waste paper ash nanomaterial as an alternative to cement for clay soil modification for building purposes ubong williams roberta, sunday edet etukb, okechukwu ebuka agbasic, ∗, grace peter umorena, samuel sunday akpana, lebe agwu nnannac a akwa ibom state university, faculty of physical sciences, department of physics, p.m.b. 1167, ikot akpaden, mkpat enin, akwa ibom state, nigeria b university of uyo, faculty of science, department of physics, p.m.b. 1017, uyo, akwa ibom state, nigeria c michael okpara university of agriculture, college of physical science, department of physics, p.m.b. 7267, umudike, abia state, nigeria ∗ corresponding author: agbasi.okechukwu@gmail.com abstract. it has been observed that clay soil cannot be used for building design, unless it is modified by firing or with cement. either method of stabilization can adversely affect the environment and public health just like indiscriminate dumping or open burning adopted in developing countries as the prevalent disposal technique for waste papers. this paper sought to examine the feasibility of using assorted waste papers to derive an alternative stabilizer to portland limestone cement for modification of clay soil into composite materials suitable for building design. specifically, clay-based composites were fabricated at 0 %, 5 %, 10 %, 15 %, and 20 % replacement levels by weight with cement, and then hydrothermally-calcined waste paper ash nanomaterial (hcwpan). water absorption, sorptivity, bulk density, thermal conductivity, specific heat capacity, thermal diffusivity, flaking concentration, flexural strength, and compressive strength were investigated for each of the fabricated samples. irrespective of the stabilizing agent utilized, 10 % loading level was found to be the optimum for possession of maximum mechanical strength by the samples. only samples with the hcwpan content were found to be capable of reducing building dead loads and improving thermal insulation efficiency over un-stabilized clay material, if applied as walling elements in buildings. generally, it was revealed that the cement and hcwpan have comparable influences on the properties of clay soil, thus indicating that hcwpan could be utilized as an alternative stabilizer to cement. in addition, the preparation of hcwpan was found to be more energy-saving than that of the cement. 
keywords: bulk density, building design, compressive strength, sorptivity, thermal conductivity.

1. introduction
in recent decades, the impacts of technology on society have been pronounced, especially in terms of changes relating to survival needs such as food and shelter, as well as aspirations such as knowledge. being a powerful tool in the development of civilization, technology may be regarded as a complex social enterprise that includes design, manufacturing, research, management, finance, marketing, etc. it therefore suffices to assert that technology and human life cannot be separated: as long as there have been people, there has been technology. the utilization of natural resources like clay as a building material is clear evidence that society has a cyclical dependence on technology. traditionally, clay is used in the form of bricks, plasters or mortars for shelter construction. as posited by bredenoord [1], homes made with clay bricks have better moisture regulation and are more comfortable than those built with hollow concrete blocks. from an environmental standpoint, this points to the fact that the application of clay bricks as walling elements in buildings can reduce the impact of natural resource consumption. through technological innovation, the strength and durability properties of clay bricks can be improved. one way of achieving such an improvement is by thermal treatment, a technique which involves firing clay bricks at high temperatures in a kiln [2–4]. another effective method of clay modification is chemical treatment using portland cement [5–7]. however, both methods pose serious dangers. in the case of thermal treatment, for instance, wood is used to fuel the kilns, a practice which leads to deforestation in addition to a large-scale emission of carbon dioxide, among other things. it is also possible to produce uneven bricks, as those closest to the heat source may be over-dried. similarly, cement production involves high energy usage followed by the emission of a large amount of carbon dioxide capable of causing serious environmental problems. as observed by several researchers, including chen et al. [8], zhang et al. [9], and shen et al. [10], the cement industry accounts for 5–7 % of the global anthropogenic carbon dioxide. though the amount emitted depends on the kiln type, clinker-to-cement ratio, fuel used, efficiency of energy utilization, and the demand for cement, it is well known that global warming is a serious threat facing our planet daily, and carbon dioxide is a vital factor in it. some other studies have revealed that clay can be chemically modified using ashes derived from solid plant-based agro-wastes like sugarcane bagasse [11, 12], rice husk [13–15], corn cob [16], groundnut shell [17], palm oil fuel and palm kernel shell [18]. in all these reported cases, the stabilizing agents exhibit good properties compared to cement in terms of the improvement in the properties of the resulting clay-based composites. that notwithstanding, their utilization for the stabilization of clay is associated with major drawbacks. observably, the agro-wastes used in preparing the ashes are generated only after harvesting.
as such, since harvesting depends on maturity and on economic issues/demand in relation to individual crops, the availability of the wastes in question may be negatively affected by seasonal influences on crops. moreover, several researchers have also reported that fly ash and slag [19–23], nano iron oxide [24], nano aluminium [25], nano silica [26, 27], and nano magnesium oxide [28] are useful stabilizers for the improvement of the mechanical and resistance properties of clay. our society rests on the 3e (energy, economy, and environment): while society needs to keep all energy options open to satisfy the growing demand, the economy of a country directly relies on the transformation and utilization of resources. in this regard, there is a need to recycle a readily available waste in order to ensure sustainable building construction. a typical waste that fits into the sustainability context is unused paper. paper is simply a cellulose-based material that is used daily for a number of different applications (such as packaging, writing, drawing, books, industrial purposes, etc.). a 2017 world facts report has shown that 215,125,083 tons of paper are produced by 10 countries worldwide. this agrees with the report of o’mara [29] that 300 million tons of paper are produced yearly throughout the world. being a necessity of civilization, there can be no halt in the production of paper. as in other cases, the technological revolution with respect to paper manufacturing also has negative impacts on society. paper considered to be of no further use is usually treated as a waste material. mckinney [30] noted that papers and paper products constitute approximately 25–40 % of globally generated solid wastes. in developing countries like nigeria, solid waste management is ineffective, leading to open burning as a prevalent way of getting rid of waste paper materials. this practice has the potential to increase the carbon footprint in the environment. according to a february 2020 report released by the international energy agency, global energy-related carbon dioxide emission plateaued in 2019 at 33 gt. though waste paper and its ashes are found to be useful for the production of thermal insulation panel products/ceilings [31–36] and for the preparation of concrete [37, 38] as well, there is a need to further devise a scientifically safe way of minimizing the huge quantity of such waste materials for beneficial purposes. hence, by carefully considering the aforementioned situations in the light of a sustainable building design, this study is designed to assess the feasibility of utilizing hydrothermally-calcined waste paper ash nanomaterial (hcwpan) as an alternative stabilizer to cement for clay modification for building purposes. specifically, a comparison will be made between the use of hcwpan and cement in terms of the thermophysical and strength properties of the resulting clay-based composites, and also of the energy involved in the production processes.

2. experimental perspective
2.1. materials collection and description
portland limestone cement (cem ii b-l 32.5r), pink clay soil, and assorted waste papers were the main materials utilized in this study. the clay soil, as collected, was wet and plastic in nature. all the materials were gathered in large quantities within uyo metropolis in akwa ibom state, nigeria.

2.2. processing of the clay and preparation of the hcwpan
the clay soil was sun-dried to a constant weight and then crushed by means of a hammer.
using a 2 mm sieve, the crushed soil material was screened and the quantity that passed through the openings was used in this work. also, the surfaces of the waste papers were cleaned before the papers were shredded with the aid of a pair of scissors. by means of an incinerator (model i8-20s), the paper pieces were burnt to ash for 1 hour at 850 °c. this type of incinerator is capable of operating without odours, smoke, or harmful emissions. after that, the ash was sieved to remove any accompanying impurity and also to obtain the fraction passing through 250 µm openings. then, two stages of heat treatment were adopted to prepare the hcwpan: a mixture of the sieved ash and tap water was stirred for 0.5 hour using a magnetic stirrer and then heated for 12 hours in a teflon-lined autoclave at 200 °c. using an electric furnace, the preheated precursor was calcined at 750 °c for 4 hours and, after cooling to 35 °c, it was pulverized in an agate mortar and then ball-milled by means of a high-energy ball milling machine (emax, manufactured by retsch gmbh) at 500 rpm for 6 hours. this ball miller is capable of reducing the particle sizes of a material feed from about 5 mm to as fine as less than 80 nm. the clay soil, cement, and hcwpan used in this work are shown in figure 1.

figure 1. used forms of the (a) cement (b) hcwpan.

weight fraction of clay soil (%):                  100   95   90   85   80
weight fraction of the hcwpan or cement used (%):    0    5   10   15   20
table 1. proportioning of the sample mixes.

2.3. analysis of the materials
the clay soil, cement, and hcwpan were each divided into portions. using one portion of each material, the static angle of repose was determined by a fixed funnel method [39]. also, by employing a hydrometer method of sedimentation [40], the particle size analysis of the clay soil was performed, after which the percentage passing was plotted against the particle diameters on a logarithmic scale. with the aid of an x-ray fluorescence analyser (spectro x-lab 2000), the chemical composition as well as the loss on ignition were determined for the materials as per the technique used in [41, 42].

2.4. fabrication of test samples
using the remaining portion of each material, various similar weight fractions of cement and hcwpan were thoroughly and separately mixed with the clay soil as presented in table 1. after moistening with water (ranging in quantity from 23.8 % to 34.1 % by dry weight of the material mix), the resulting mixture was compacted with a load of 50 kg for 12 hours. whereas test samples meant for the investigation of thermophysical properties were prepared to be 9.0 mm in thickness and 110 mm in diameter, those for the strength characterization were developed in a mould measuring 210 × 36 × 12 mm. the samples were fabricated in triplicate per formulation. prior to the conduction of the intended tests, the samples were oven-dried until they became moisture-free.

2.5. properties investigation
each sample developed for the assessment of thermophysical properties was first subjected to a thermal conductivity test using the modified lee–charlton’s disc apparatus technique [43]. then, the samples were carefully cut into the sizes needed for the water absorption, bulk density, sorptivity, and specific heat capacity tests. the method used by shrestha [44] was adopted for the evaluation of the water absorption, wa, of each sample.
by means of a digital balance (s. mettler – 600 g), the mass of each sample was measured, and their bulk volume was found by using the modified water displacement method [45]. from the data obtained, the corresponding bulk density, ρ, was determined as

ρ = m/v, (1)

where m = mass and v = bulk volume of the sample. the samples for the sorptivity test were made rectangular in shape. following the method used in previous work [46], with slight modifications, a sample was suspended from a digital scale by means of a strong and light inextensible string. after that, the reading on the scale was noted and the area of the sample’s lower surface was determined. also, a glass vessel containing water-saturated foam was placed directly under the sample. then, the position of the sample was adjusted until its lower end rested on the foam (as illustrated in figure 2). at that instant, a digital stopwatch was turned on to measure the infiltration time of the water. after every minute, the reading shown on the scale was noted and the initial reading was subtracted from it in order to obtain the mass of the water that had ingressed into the sample. the temperature of the water at that moment was determined with the aid of a digital thermometer equipped with a type-k probe, and the water density at that temperature was noted as provided in [47].

figure 2. schematic diagram for the sorptivity test.

when 50 minutes had elapsed, the water infiltration depth at each interval of time was determined using the formula

d = wa / (dw a), (2)

where d = cumulative water infiltration depth, wa = mass of water ingressed into the sample, dw = density of the water, and a = area of the sample’s surface in contact with the foam. the infiltration depth of the water was plotted against the square root of the time, and the sorptivity value of the sample was deduced graphically based on the relation

d = sp √t, (3)

where sp = sorptivity of the sample and t = infiltration time.
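as a minimal illustration of equations (2) and (3), the sketch below (our own, not part of the original procedure; the scale readings and contact area are invented) converts readings to infiltration depths and extracts the sorptivity as the least-squares slope of d against √t:

```python
import numpy as np

# hypothetical readings from the suspended-sample test (illustrative only)
t_min = np.arange(1, 11)                        # elapsed time, minutes
w_a = np.array([0.21, 0.30, 0.37, 0.42, 0.47,
                0.52, 0.56, 0.60, 0.63, 0.67])  # ingressed water mass, g

rho_w = 997.0    # water density near 25 degrees c, kg m^-3
area = 12.0e-4   # assumed wetted contact area, m^2

# eq. (2): cumulative infiltration depth d = w_a / (d_w * a)
d = (w_a * 1e-3) / (rho_w * area)               # metres

# eq. (3): d = s_p * sqrt(t)  ->  fit the slope through the origin
sqrt_t = np.sqrt(t_min * 60.0)                  # s^(1/2)
s_p = np.sum(sqrt_t * d) / np.sum(sqrt_t ** 2)  # least-squares slope
print(f"sorptivity s_p = {s_p:.2e} m s^-1/2")
```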
the measurement of the specific heat capacity of each sample was carried out using seur’s apparatus, as described elsewhere [48]. in this case, the other heat exchange accessories used were plates of aluminium and plywood (each having a thickness of 10 mm). the values of bulk density, thermal conductivity (k), and specific heat capacity (c) obtained for each sample were then applied to calculate the corresponding thermal diffusivity, λ, as

λ = k / (ρc). (4)

the abrasion resistance was assessed by a flaking concentration test for each of the samples developed for the appraisal of strength. in this case, the initial mass of each test sample was measured, after which a hard shoe brush was rubbed against their two surfaces until 70 strokes of forward and backward movement were made. for the purpose of ensuring that the pressure applied was uniform, a 0.5 kg weight was firmly attached to the top of the brush, and the movement of the brush was activated by pushing and pulling it slowly over the sample under test. at the end of the process, the flaked samples were weighed and the percentage decrease in the mass of each tested sample was determined:

fc = (∆m / mo) × 100 %, (5)

where ∆m = decrease in the mass of the sample after being flaked, and mo = mass of the sample before being flaked. the flexural strength of the samples was determined by means of a computerised electromechanical universal testing machine (wdw-10) based on a three-point bending technique, as stated in [49]. during each test schedule, the sample under test was suspended as a single beam supported at two points and loaded at its mid-point until it fractured. with the data obtained, the value of the flexural strength, fs, was computed based on the relation

fs = 3pl / (2bx²), (6)

where p = maximum load applied at the fracture, l = span length, b = width, and x = thickness of the sample under test. in the case of compressive strength, the values of the crushing force, f, and cross-sectional area, a, of each sample were used for the computation as [50]

cs = f/a. (7)

all the tests were performed at room temperature with ±2 °c variations. for each test, the mean and standard error values of the results obtained were tabulated.
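since equations (1) and (4)–(7) are simple ratios, the post-processing of the raw measurements is direct; the sketch below (ours, with invented specimen readings chosen to be of the same order as the tabulated results) shows the chain from raw data to the reported quantities:

```python
def bulk_density(m, v):              # eq. (1): rho = m / v
    return m / v

def thermal_diffusivity(k, rho, c):  # eq. (4): lambda = k / (rho * c)
    return k / (rho * c)

def flaking_concentration(dm, m0):   # eq. (5): f_c = (dm / m0) * 100 %
    return 100.0 * dm / m0

def flexural_strength(p, l, b, x):   # eq. (6): f_s = 3 p l / (2 b x^2)
    return 3.0 * p * l / (2.0 * b * x ** 2)

def compressive_strength(f, a):      # eq. (7): c_s = f / a
    return f / a

# invented readings for one control-like specimen (si units)
rho = bulk_density(m=46.0e-3, v=21.4e-6)               # ~2.15e3 kg m^-3
lam = thermal_diffusivity(k=0.274, rho=rho, c=1671.0)  # ~7.6e-8 m^2 s^-1
print(f"rho = {rho:.0f} kg/m^3, lambda = {lam:.2e} m^2/s")
```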
3. results and discussion
the index properties of the clay soil, hcwpan, and cement utilized in fabricating the samples are presented in table 2.

particulars               clay soil       hcwpan          cement
used form                 dry             dry/powdery     dry/powdery
static angle of repose    (29.4 ± 0.1)°   (41.4 ± 0.3)°   (39.8 ± 0.1)°
plasticity index          23.4 %          –               –
table 2. description of the materials used.

there is a slight difference between the static angles of repose of hcwpan and cement, both of which are greater than the value obtained for the clay soil, by about 40.8 % and 35.4 % respectively. since the angle of repose is inversely proportional to the particle size of a material [51, 52], it can be claimed that the clay soil contains the largest particles and that those in the cement are larger as compared to the size of the particles in the hcwpan.

figure 3. particle size distribution of the clay soil used.

from figure 3, it can be observed that the clay-sized particles (< 2 µm) outnumber the silt particles (0.06 mm – 2 µm), while sand particles (2 mm – 0.06 mm) constitute the least fraction. also, by having nano-sized particles, hcwpan is finer than cement, and its fineness is typical enough to enable it to function as a binder as well as a filler in the clay matrix to form a clay-based composite. based on the interpretation that an angle of repose of 25° to 30° indicates excellent flowability, whereas a value in the range of 41° to 45° indicates a passable flow property [53], it can be adjudged that the materials utilized in this study possess acceptable flow characteristics for manufacturing purposes.

table 3 shows the chemical compositions of the clay soil, hcwpan, and cement. among the main oxides in the clay soil, sio2 is of the highest proportion, followed by al2o3, fe2o3, and then cao. it is noteworthy that while cao manages the structure and rupture threshold of the soil, the alumina (al2o3) fraction promotes the increase in mechanical resistance. the percentage of silica (sio2) in the clay is approximately 50 %, agreeing well with the report of velasco et al. [54] that clay used in brick factories contains 50 % to 60 % of sio2. this means that the clay soil used in this work is a suitable raw material for utilization in factories for brick making. in the case of hcwpan, the proportions of sio2, al2o3, and fe2o3 sum up to 30.7 %. this value is less than 70 %, the minimum requirement specified for a material to be a pozzolan [55], signifying that the hcwpan cannot be regarded as a pozzolanic material in this case. however, the hcwpan contains cao (56.76 %), sio2 (28.04 %), al2o3 (1.61 %), and mgo (1.30 %) as the prominent constituents, and the proportions of these oxides are similar in ranking to those in the cement used in this study. also, the fraction of cao in hcwpan is very close to that of the cement.

chemical constituent (formula)         clay soil   hcwpan   cement (plc)
silicon dioxide, silica (sio2)         49.79       28.04    16.32
aluminium trioxide, alumina (al2o3)    30.68       1.61     3.62
ferric oxide (fe2o3)                   6.90        1.05     2.81
calcium oxide, lime (cao)              0.74        56.76    58.23
magnesium oxide (mgo)                  0.21        1.30     0.91
sulphur trioxide (so3)                 –           0.38     3.18
potassium oxide (k2o)                  1.87        0.04     –
loss on ignition (loi)                 8.61        10.29    4.46
table 3. chemical composition of the materials (proportions in % wt).

chemical phases formed from the calcium, silicate, and aluminate contents of the admixtures in the cement are essential for strength development through hydration. specifically, sio2 aids in strength development. also, in the presence of water, cao first forms ca(oh)2 and, with the availability of carbon dioxide, yields caco3 for hardening and strength development. it is, therefore, obvious that the hcwpan is cementitious enough to bring about rapid hardening as well as an early attainment of strength as a hydraulic binder of the resulting clay-based components containing it. apart from the fact that the particles of hcwpan are finer than those of cement, thus ensuring a faster decomposition when heated, their values of loss on ignition indicate that the hcwpan contains more compounds that decompose at high temperatures.
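as a quick arithmetic cross-check of the astm c618 pozzolan criterion [55] applied above, the table 3 oxide sums can be recomputed; this throwaway sketch is our own illustration, not part of the original analysis:

```python
# oxide fractions (% wt) taken from table 3
oxides = {
    "clay soil": {"sio2": 49.79, "al2o3": 30.68, "fe2o3": 6.90},
    "hcwpan":    {"sio2": 28.04, "al2o3": 1.61,  "fe2o3": 1.05},
    "cement":    {"sio2": 16.32, "al2o3": 3.62,  "fe2o3": 2.81},
}

# astm c618 [55]: sio2 + al2o3 + fe2o3 must reach 70 % for a pozzolan
for name, comp in oxides.items():
    total = sum(comp.values())
    verdict = "meets" if total >= 70.0 else "fails"
    print(f"{name}: {total:.2f} % -> {verdict} the 70 % minimum")
# hcwpan: 30.70 % -> fails, in agreement with the text
```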
the results of the experiments performed on the developed samples are presented in table 4.

table 4. results of the property tests performed on the developed samples (mean ± standard error).

hcwpan-stabilized samples:
level (% wt)   wa (%)         sp (10^-4 m s^-1/2)   ρ (10^3 kg m^-3)   k (w m^-1 k^-1)   c (10^3 j kg^-1 k^-1)   λ (10^-8 m^2 s^-1)   fc (%)          fs (mpa)        cs (mpa)
0.0            11.49 ± 0.02   6.82 ± 0.01           2.151 ± 0.002      0.274 ± 0.003     1.671 ± 0.016           7.62 ± 0.03          1.016 ± 0.002   1.778 ± 0.003   2.866 ± 0.004
5.0            8.53 ± 0.02    6.85 ± 0.01           1.912 ± 0.002      0.248 ± 0.002     1.706 ± 0.009           7.60 ± 0.02          1.022 ± 0.003   1.882 ± 0.002   2.874 ± 0.002
10.0           9.99 ± 0.01    6.89 ± 0.02           1.873 ± 0.003      0.231 ± 0.003     1.749 ± 0.012           7.05 ± 0.02          1.034 ± 0.002   1.976 ± 0.003   2.880 ± 0.002
15.0           11.51 ± 0.03   6.95 ± 0.02           1.752 ± 0.002      0.218 ± 0.003     1.783 ± 0.013           6.98 ± 0.03          1.089 ± 0.002   1.769 ± 0.003   2.861 ± 0.003
20.0           11.68 ± 0.03   7.07 ± 0.04           1.630 ± 0.003      0.204 ± 0.002     1.816 ± 0.018           6.89 ± 0.03          1.106 ± 0.003   1.746 ± 0.003   2.839 ± 0.003

cement-stabilized samples:
0.0            11.49 ± 0.02   6.82 ± 0.01           2.151 ± 0.002      0.274 ± 0.003     1.671 ± 0.016           7.62 ± 0.03          1.016 ± 0.002   1.778 ± 0.003   2.866 ± 0.004
5.0            11.04 ± 0.01   6.71 ± 0.02           2.157 ± 0.001      0.279 ± 0.001     1.643 ± 0.010           7.87 ± 0.04          1.011 ± 0.001   1.862 ± 0.003   2.877 ± 0.002
10.0           10.96 ± 0.02   6.64 ± 0.01           2.161 ± 0.001      0.282 ± 0.001     1.569 ± 0.012           8.32 ± 0.02          1.008 ± 0.002   2.639 ± 0.002   2.893 ± 0.003
15.0           10.91 ± 0.02   6.51 ± 0.01           2.168 ± 0.002      0.286 ± 0.001     1.532 ± 0.014           8.61 ± 0.03          1.004 ± 0.001   2.102 ± 0.002   2.881 ± 0.002
20.0           11.33 ± 0.03   6.43 ± 0.02           2.174 ± 0.002      0.292 ± 0.002     1.407 ± 0.011           8.97 ± 0.02          1.001 ± 0.001   2.118 ± 0.002   2.889 ± 0.003

in the case of water absorption, the replacement of clay with up to 10 % of hcwpan improves the water-resistive ability of the clay-based composites over the blank/control sample. this indicates that the lime, alumina, and silica in the hcwpan react optimally at that level to enhance inter-particle packing and micro-aggregation, thereby reducing the inter-particle voids and decreasing the water uptake of the resulting composites. beyond the 10 % level of hcwpan incorporation into the clay matrix, the water absorption of the composites increases. this may be due to the influence of the numerous pores in the hcwpan resulting from the tininess of its particles. in addition, hcwpan is the most porous material in this case, and its hydrophilicity is higher than that of the clay soil; thus, an excess fraction of it in the resulting composite samples exercises a remarkable influence in promoting water absorption. it can, therefore, be posited that no greater than a 10 % hcwpan content is capable of improving the durability of the developed samples. the use of cement as a stabilizing agent reduces the water absorption at all levels considered in this study, though the water absorption obtained at a 20 % content of cement is greater than the value recorded at the 15 % level of loading. it is also observed that the reduction in the case of using hcwpan up to 10 % is greater than when incorporating cement at similar levels. this may be attributed to the fact that cement is made of calcium silicates and aluminates and has a high affinity for water. however, a one-way analysis of variance test yields a calculated f-value of 0.69 at p < 0.05, revealing that the observed differences are not significant. whereas an increase in hcwpan content creates more pores in the clay matrix, the pores are reduced when increasing the cement content. in other words, by increasing the cement content, the clay soil particles are covered with cement particles and bonding is enhanced among the particles, thereby making the resulting composite samples more solid in nature. this is possible because the hydrated components formed (such as calcium silicate hydrates) reduce the pore size (pore network) of the sample as the content of cement increases. the case with hcwpan is different. thus, since sorptivity is based on the capillarity network, it increases steadily with increasing proportions of hcwpan. this, plausibly, is due to the degree of fineness of hcwpan, which enables it to penetrate the clay matrix significantly and create a network that promotes links for the enhancement of water infiltration. that is why, while the sorptivity increases with the increasing hcwpan content, a decrease is observed with the increasing proportion of cement. there is no doubt that the variations in the stabilizing mechanisms of hcwpan and cement can make the hcwpan-modified clay soil and the cement-modified clay soil differ in their properties. samples with hcwpan have a lower bulk density than their counterparts made with cement: at the 5 %, 10 %, 15 %, and 20 % loading levels, the values obtained with hcwpan and with cement differ by 12.8 %, 15.4 %, 23.7 %, and 33.4 %, respectively. this may be attributed to the lightness of hcwpan as compared to the clay soil. also, the samples containing hcwpan have a lower thermal conductivity than the ones containing cement fractions at similar loading levels. this is possible because hcwpan is refractory and more porous than clay even in the presence of water, whereas under such a condition cement produces a more cohesive mortar than clay. thus, with cement, a significant increase in inter-particle packing is ensured, and that reduces the air volume by minimizing pore spaces. on the contrary, though hcwpan has finer particles than the clay, its cohesiveness is lower, thereby leading to an increase in air space.

figure 4. variations of bulk density and thermal conductivity with proportions of the stabilizer.

figure 4 reveals that an increase in the content of hcwpan causes a decrease in both the bulk density and the thermal conductivity of the resulting clay composites, whereas cement increases them. since the samples are porous and also completely dry, this could be due to the fact that as the hcwpan fraction increases, more interstices/pores exist, leading to an increase in air volume, whereas a drastic decrease in pore size, and hence a minimization of air volume, is caused by the increasing content of cement. consequently, the bulk density of the developed composites decreases with the increase in hcwpan, but increases with the incorporation of cement. the implication is that, unlike in the case of cement, increasing the proportion of hcwpan produces lightweight composites. solihu [56] and srivastava et al. [57] observed similar effects on density when using cement for soil stabilization. more so, it should be understood that air is a good thermal insulant (having a thermal conductivity of about 0.025 wm−1k−1) and that it acts within the pores as the heat transfer medium that controls and resists the flow of heat through the samples. thus, an increase in hcwpan loading causes a decrease in the thermal conductivity because of the creation of more air-filled pores in the developed samples, while an increase in cement content brings about an increase in thermal conductivity because of the improvement in packing behaviour, which reduces the air volume in the clay composites. in this case, it could be posited that the thermal conductivity is a function of the bulk density, which is in turn a function of porosity; as such, samples with hcwpan content are more capable of ensuring a better thermal insulation performance than those made with cement at a similar content level. however, all the thermal conductivity values obtained for the studied samples fall within 0.023 wm−1k−1 to 2.900 wm−1k−1, which is the range recommended for heat-insulating and construction materials [58]. the results of the specific heat capacity test in this work substantiate the assertion that hcwpan-modified samples can exhibit a better thermal insulation performance than their counterparts with a cement content. as can be seen, the specific heat capacity increases with the increasing content of hcwpan, but decreases as the fraction of cement increases. this implies that the incorporation of hcwpan enhances the heat-storing ability of the developed composites, whereas the utilization of cement impedes such a tendency. as such, for a unit temperature change, more energy will be required by a unit mass of a sample containing hcwpan as compared to a sample produced with cement at a similar level.
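the hcwpan-versus-cement contrasts quoted in this discussion (12.8 %, 15.4 %, 23.7 %, and 33.4 % for bulk density) follow directly from table 4; a small sketch of that arithmetic (ours, added for illustration):

```python
# bulk density (10^3 kg m^-3) for the stabilized samples, from table 4
levels = [5, 10, 15, 20]                 # replacement level, % wt
rho_hcw = [1.912, 1.873, 1.752, 1.630]   # hcwpan-stabilized
rho_cem = [2.157, 2.161, 2.168, 2.174]   # cement-stabilized

for lvl, rh, rc in zip(levels, rho_hcw, rho_cem):
    diff = 100.0 * (rc - rh) / rh        # relative to the hcwpan value
    print(f"{lvl:>2} % level: bulk densities differ by {diff:.1f} %")
# -> 12.8, 15.4, 23.7, 33.4 %, as quoted in the text
```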
alongside the thermal conductivity and bulk density (as expressed in eq. (4) of this study), the specific heat capacity influences the thermal diffusivity of the samples. observably, the thermal capacity (the product of bulk density and specific heat capacity) shows a positive correlation with the thermal conductivity in the case of samples with hcwpan content, whereas a negative relationship is exhibited in the case of samples containing cement. thus, with respect to the proportion of the stabilizing agent utilized, the thermal diffusivity decreases with the increasing content of hcwpan, but increases progressively with the increase in the percentage of cement. at the 20 % level of hcwpan, the thermal diffusivity decreases by about 9.6 %, while it increases by 17.7 % with the utilization of cement up to a similar level. thermal diffusivity may be regarded as a transport property that shows how a material undergoes temperature changes. since a larger value of thermal diffusivity implies a faster rate of heat diffusion through a material, the results obtained in this study indicate that temperature propagation is slower in hcwpan-modified samples than in those modified with cement, even at similar content levels. this further substantiates that hcwpan is more effective than cement in improving the thermal insulation efficiency of the clay soil for building purposes.

the extent to which a material is cohesive exercises a great deal of influence on the bonding strength of the material, which, in turn, characterizes how strongly or weakly particles are held together in the material. between the stabilizing agents used in this work, cement exhibits a greater cohesivity than hcwpan in the presence of water. this possibly contributes to the greater flaking concentration observed in the case of samples containing hcwpan as compared to their counterparts with a cement content. it is noticed that a sample made with up to 5 % of cement content is as flaky as the control one, showing that more than such a proportion of cement is needed to reduce the flakiness of the developed composite. it is also observed that while the samples become flakier as the hcwpan fraction in them increases, their flaking concentration lessens with increasing proportions of cement. the implication in this case is that, at similar levels of usage, samples with a cement content are less susceptible to mechanical wear than those fabricated with an hcwpan fraction. in the cases of flexural strength and compressive strength, the values obtained when using cement as the stabilizing agent are greater than the results for the use of hcwpan in modifying the clay soil. as a matter of fact, cement is mainly composed of lime and silica, which react with each other and with other components in the mix when water is added. this reaction forms a combination of tri-calcium silicate and di-calcium silicate, referred to as c3s and c2s in the cement literature. such mineral phases play a critical role in the chemical reaction and eventually generate a matrix of interlocking crystals that cover any inert filler and provide a high strength. the results simply reveal that samples developed with a cement content are stronger and stiffer than those produced similarly but with hcwpan as a component.

figure 5. variations of flexural strength and compressive strength with proportions of the stabilizer.
in other words, the modification of clay soil with cement yields clay-based composites with a greater ability to withstand rupture and shattering under loading situations as compared to those stabilized with hcwpan. at 5 %, 10 %, 15 %, and 20 % proportions, the flexural strength of the studied samples differs by 0.3 %, 33.6 %, 18.8 %, and 21.3 %, respectively, while the compressive strength differs by 0.1 %, 0.5 %, 0.7 %, and 1.8 %, respectively, between the use of hcwpan and cement for the modification of the clay soil. among other ways, the development of higher strength and stiffness is achieved by reducing voids, binding particles and aggregates together, and maintaining flocculation. additionally, the ca(oh)2 crystals generated by cement as a by-product of the hydration of calcium silicates are pure and fine and, as such, more reactive; this provides the calcium necessary for the ion exchange during the physicochemical reaction between the cement and the clay soil. the illustration in figure 5 depicts the trends of the flexural strength and compressive strength with the proportions of the stabilizers used. it can be observed that the strength increases up to the 10 % incorporation level, beyond which it declines in both cases. this signifies that 10 % is the optimum level for achieving the maximum strength; beyond this level, a sharp decline is revealed. as for the hcwpan, this may be because the silica content can react optimally only when at most 10 % of the stabilizing agent is utilized. that is to say, if a subsequent increase in its proportion is introduced into the clay matrix, the excess of it (the hcwpan) may not have the opportunity to be utilized, resulting in increased void fractions and pores that weaken the inter-particle bonding in the developed composites. with the use of cement as a stabilizing agent, the maximum strength values are obtained at similar loading levels, showing that a further increment in the cement fraction may not necessarily lead to an increased strength, but to an impairment of the expected performance, as noted by mahedi et al. [59]. notably, all the studied samples have a compressive strength value that compares favourably with the 2.9 mpa reported in [60] for cement-stabilized clay soil containing 10 % by volume of cement and cured for 28 days. also, the studied samples possess a compressive strength that meets the minimum requirement, stipulated to be 2.5 mpa, for non-load-bearing walling elements.

table 5 shows the analysis of the energy consumed during the hcwpan preparation in this study.

equipment     nominal power rating (w)   energy source   application      duration (h)   energy consumed (gj)   quantity of material processed (ton)
incinerator   –                          –               paper ashing     1              –                      0.134
autoclave     3120                       electricity     heat treatment   12             0.135                  –
furnace       1800                       electricity     calcination      4              0.026                  –
ball miller   3100                       electricity     ball milling     6              0.067                  –
table 5. analysis of energy consumption during hcwpan preparation.

it can be inferred from the data presented that the total energy consumption is 0.228 gj for processing 0.134 tons of the material. this means that, on average, the consumption is 1.701 gj per ton. in the case of cement production, approximately 3.2 to 6.3 gj of energy and about 1.7 tons of raw material (mainly limestone) are required per ton of clinker [61]; in other words, the energy consumption is 1.882 to 3.706 gj per ton of cement. comparatively, it can be deduced that the processes involved in preparing a ton of hcwpan are about 9.6 % to 54.1 % more energy-saving than those used for cement manufacturing.
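table 5's electrical entries are simply rated power multiplied by run time; the sketch below (our own arithmetic on the published numbers) reproduces the totals and the per-ton comparison with cement:

```python
# electrically powered stages of hcwpan preparation: (rated power w, duration h);
# the incinerator stage has no power rating in table 5 and is excluded
stages = {"autoclave": (3120, 12), "furnace": (1800, 4), "ball miller": (3100, 6)}

total_gj = 0.0
for name, (watts, hours) in stages.items():
    gj = watts * hours * 3600 / 1e9          # w x h -> j -> gj
    total_gj += gj
    print(f"{name}: {gj:.3f} gj")

per_ton = total_gj / 0.134                   # 0.134 tons of ash processed
print(f"total {total_gj:.3f} gj -> {per_ton:.3f} gj/ton")  # ~0.228 gj, ~1.70 gj/ton

for cem in (1.882, 3.706):                   # cement reference, gj per ton
    print(f"saving vs {cem} gj/ton of cement: {100 * (1 - per_ton / cem):.1f} %")
# -> about 9.6 % and 54.1 %, as stated in the text
```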
4. conclusion
the following conclusions were drawn based on the results of the investigation carried out in this study:
• the ranking of the oxide proportions in hcwpan is similar to that of the portland limestone cement utilized in this study.
• replacement of the clay matrix with up to 20 % of hcwpan caused a decrease in bulk density, thermal conductivity, and thermal diffusivity by 24.2 %, 25.5 %, and 9.6 %, respectively, whereas with the use of the cement, increments of 1.1 %, 6.6 %, and 17.7 %, respectively, were observed.
• compared with the cement used in this work, the preparation of a kilogram of hcwpan is more energy-saving, and the clay-based composites developed with it are capable of ensuring a better thermal comfort as building materials, especially in tropical zones.
• at a 10 % content level of either stabilizing agent, the fabricated composites possessed the maximum flexural strength and compressive strength, though with a variation of about 33.6 % and 0.5 %, respectively, in favour of the cement over hcwpan.
• since waste papers are cheaply and constantly available in large quantities, the use of hcwpan to modify clay soil for building purposes could serve as a promising way of managing the waste while ensuring a sustainable building construction.
• the cement and hcwpan have comparable influences on durability behaviour, such as the water absorption and sorptivity of the modified clay samples, signifying that hcwpan could be utilized as an alternative stabilizer to cement for the modification of clay soil for building purposes.
• the valorisation of waste papers, as described in this study, can serve as a safe disposal technique for the waste while improving the economy of the building sector.

references
[1] j. bredenoord. sustainable building materials for low-cost housing and the challenges facing their technological developments: examples and lessons regarding bamboo, earth-block technologies, building blocks of recycled materials, and improved concrete panels. journal of architectural engineering technology 6:187, 2017. https://doi.org/10.4172/2168-9717.1000187.
[2] e. tsega, a. mosisa, f. fuga. effects of firing time and temperature on physical properties of fired clay bricks. american journal of civil engineering 5(1):21–26, 2017. https://doi.org/10.11648/j.ajce.20170501.14.
[3] s. karaman, h. gunal, s. ersahin. assessment of clay bricks compressive strength using quantitative values of colour components. construction and building materials 20(5):348–354, 2006. https://doi.org/10.1016/j.conbuildmat.2004.11.003.
[4] h. t. rondonane, j. a. mbeny, e. c. bayiga, p. d. ndjigui. characterization and application tests of kaolinite clays from aboudeia (southeastern chad) in fired bricks making. scientific african 7:e00294, 2020. https://doi.org/10.1016/j.sciaf.2020.e00294.
[5] e. garzón, m. cano, b. c. o’kelly, p. j. sánchez-soto. phyllite clay-cement composites having improved engineering properties and material applications. applied clay sciences 114:229–233, 2015. https://doi.org/10.1016/j.clay.2015.06.006.
[6] m. s. el-mahllawy, a.
m. kandeel. engineering and mineralogical characteristics of stabilised unfired montmorillonite clay bricks. hbrc journal 10(1):82–91, 2014. https://doi.org/10.1016/j.hbrcj.2013.08.009.
[7] l. m. a. santos, j. a. s. neto, a. f. n. azerêdo. soil characterisation for adobe mixtures containing portland cement as stabiliser. matéria 25(1):1–10, 2020. https://doi.org/10.1590/s1517-707620200001.0890.
[8] c. chen, g. habert, y. bouzidi, a. jullien. environmental impact of cement production: detail of the different processes and cement plant variability evaluation. journal of cleaner production 18(5):478–485, 2010. https://doi.org/10.1016/j.jclepro.2009.12.014.
[9] s. zhang, e. worrell, w. crijns-graus. evaluating co-benefits of energy efficiency and air pollution abatement in china’s cement industry. applied energy 147:192–213, 2015. https://doi.org/10.1016/j.apenergy.2015.02.081.
[10] w. shen, l. cao, q. li, et al. quantifying co2 emissions from china’s cement industry. renewable and sustainable energy reviews 50:1004–1012, 2015. https://doi.org/10.1016/j.rser.2015.05.031.
[11] k. s. reddy, p. s. vivek, k. s. chambrelin. stabilization of expansive soil using bagasse ash. international journal of civil engineering and technology 8(4):1730–1736, 2017.
[12] k. c. p. faria, r. f. gurgel, j. n. f. holanda. recycling of sugarcane bagasse ash waste in the production of clay bricks. journal of environmental management 101:7–12, 2012. https://doi.org/10.1016/j.jenvman.2012.01.032.
[13] s. zahan, s. akter, r. ahsan. effects of rice husk in clay bricks. in 5th international conference on civil engineering for sustainable development. khulna, bangladesh.
[14] o. agbede, m. joel. effect of rice husk ash (rha) on the properties of ibaji burnt clay bricks. american journal of scientific and industrial research 2(4):674–677, 2011. https://doi.org/10.5251/ajsir.2011.2.4.674.677.
[15] y. c. khoo, i. johari, z. a. ahmad. influence of rice husk ash on the engineering properties of fired-clay brick. advanced materials research 795:14–18, 2013. https://doi.org/10.4028/www.scientific.net/amr.795.14.
[16] v. s. sankar, p. d. a. raj, s. j. raman. stabilization of expansive soil by using agricultural waste. international journal of engineering and advanced technology 8(3s):154–157, 2019. https://www.ijeat.org/wp-content/uploads/papers/v8i3s/c10310283s19.pdf.
[17] s. s. shinde, g. k. patil. study on utilization of agricultural waste as soil stabilizer. international journal of latest trends in engineering and technology 7(1):227–230, 2016. https://doi.org/10.21172/1.71.032.
[18] o. d. afolayan, o. m. olofinade, i. i. akinwumi. use of some agricultural wastes to modify the engineering properties of subgrade soils: a review. journal of physics: conference series 1378:022050, 2019. https://doi.org/10.1088/1742-6596/1378/2/022050.
[19] s. mandal, j. p. singh. stabilization of soil using ground granulated blast furnace slag and fly ash. international journal of innovative research in science, engineering and technology 5(12):21121–21126, 2016.
[20] j. dayalan. comparative study on stabilization of soil with ground granulated blast furnace slag (ggbs) and fly ash. international research journal of engineering and technology 3(5):2198–2204, 2016. https://www.irjet.net/archives/v3/i5/irjet-v3i5465.pdf.
[21] l. yadu, r. k. tripathi. effects of granulated blast furnace slag in the engineering behavior of stabilized soft soil. procedia engineering 51:125–131, 2013. https://doi.org/10.1016/j.proeng.2013.01.019.
[22] b. d.
nath, m. k. a. molla, g. sarkar. study on strength behavior of organic soil stabilized with fly ash. international scholarly research notices 2017:5786541, 2017. https://doi.org/10.1155/2017/5786541.
[23] h. singh, g. s. brar, g. s. mudahar. evaluation of characteristics of fly ash-reinforced clay bricks as building materials. journal of building physics 40(6):530–543, 2017. https://doi.org/10.1177/1744259116659662.
[24] f. changizi, a. haddad. strength properties of soft clay treated with mixture of nano-sio2 and recycled polyester fiber. journal of rock mechanics and geotechnical engineering 7(4):367–378, 2015. https://doi.org/10.1016/j.jrmge.2015.03.013.
[25] s. g. jahromi, h. zahedi. investigating the effect of nano aluminum on mechanical and volumetric properties of clay. amirkabir journal of civil engineering 50(3):597–606, 2018. https://doi.org/10.22060/ceej.2017.12241.5157.
[26] f. changizi, a. haddad. improving the geotechnical properties of soft clay with nano-silica particles. proceedings of the institution of civil engineers – ground improvement 170(2):62–71, 2018. https://doi.org/10.1680/jgrim.15.00026.
[27] n. ghasahkolaei, a. janalizadeh, m. jahanshahi, et al. physical and geotechnical properties of cement-treated clayey soil using silica nanoparticles: an experimental study. the european physical journal plus 131:134, 2016. https://doi.org/10.1140/epjp/i2016-16134-3.
[28] z. h. majed, m. r. taha. effect of nanomaterial treatment on geotechnical properties of a penang soil. asian journal of scientific research 2(11):587–592, 2012.
[29] m. o’mara. how much paper is used in one day? record nations, last updated: january 3, 2020, https://www.recordnations.com/.
[30] r. w. j. mckinney. technology of paper recycling. blackie-academic and professional, chapman and hall, new york, 1995.
[31] o. p. folorunso, b. u. anyata. potential use of waste paper/sludge as a ceiling board material. advanced materials research 18-19:49–53, 2007. https://doi.org/10.4028/www.scientific.net/amr.18-19.49.
[32] u. w. robert, s. e. etuk, g. p. umoren, o. e. agbasi. assessment of thermal and mechanical properties of composite board produced from coconut (cocos nucifera) husks, waste newspapers and cassava starch. international journal of thermophysics 40(9):83, 2019.
https://doi.org/10.1007/s10765-019-2547-8.
[33] p. s. e. ang, a. h. i. ibrahim, m. s. abdullah. preliminary study of ceiling board from composite material of rice husk, rice husk ash and waste paper. progress in engineering application and technology 1(1):104–115, 2020. https://publisher.uthm.edu.my/periodicals/index.php/peat/article/view/241.
[34] e. u. nathaniel, u. w. robert, m. e. asuquo. evaluation of properties of composite panels fabricated from waste newspaper and wood dust for structural application. journal of energy research and reviews 5(1):8–15, 2020. https://doi.org/10.9734/jenrr/2020/v5i130138.
[35] u. w. robert, s. e. etuk, o. e. agbasi, et al. investigation of thermal and strength properties of composite panels fabricated with plaster of paris for insulation in buildings. international journal of thermophysics 42(2):25, 2021. https://doi.org/10.1007/s10765-020-02780-y.
[36] s. o. amiandamhen, s. o. osadolor. recycled waste paper-cement composite panels reinforced with kenaf fibres: durability and mechanical properties. journal of material cycles and waste management 22:1492–1500, 2020. https://doi.org/10.1007/s10163-020-01041-2.
[37] j. p. azar, m. najarchi, b. sanaati, et al. the experimental assessment of the effect of paper waste ash and silica fume on improvement of concrete behaviour. ksce journal of civil engineering 23:4503–4515, 2019. https://doi.org/10.1007/s12205-019-0678-x.
[38] b. m. kejela. waste paper ash as partial replacement of cement in concrete. american journal of construction and building materials 4(1):8–13, 2020. https://doi.org/10.11648/j.ajcbm.20200401.12.
[39] h. m. b. al-hashemi, o. s. b. al-amoudi. a review on the angle of repose of granular materials. powder technology 330:397–417, 2018. https://doi.org/10.1016/j.powtec.2018.02.003.
[40] astm d7928, standard test method for particle-size distribution (gradation) of fine-grained soils using the sedimentation (hydrometer) analysis. astm international, west conshohocken, 2017.
[41] m. bediako, e. o. amankwah. analysis of chemical composition of cement in ghana: a key to understand the behaviour of cement. advances in materials science and engineering 2015:349401, 2015. https://doi.org/10.1155/2015/349401.
[42] a. i. inegbenebor, a. o. inegbenebor, r. c. mordi, et al. determination of the chemical compositions of clay deposits from some part of south west nigeria for industrial applications. international journal of applied sciences and biotechnology 4(1):21–26, 2016. https://doi.org/10.3126/ijasbt.v4i1.14214.
[43] u. w. robert, s. e. etuk, o. e. agbasi, u. s. okorie. quick determination of thermal conductivity of thermal insulators using a modified lee–charlton’s disc apparatus technique. international journal of thermophysics 42(8):113, 2021. https://doi.org/10.1007/s10765-021-02864-3.
[44] s. shrestha. a case study of brick properties manufactured in bhaktapur. journal of science and engineering 7:27–33, 2019. https://doi.org/10.3126/jsce.v7i0.26786.
[45] u. w. robert, s. e. etuk, o. e. agbasi. bulk volume determination by modified water displacement method. iraqi journal of science 60(8):1704–1710, 2019. https://doi.org/10.24996/ijs.2019.60.8.7.
[46] u. w. robert, s. e. etuk, o. e. agbasi, et al. on the hygrothermal properties of sandcrete blocks produced with sawdust as partial replacement of sand. journal of the mechanical behavior of materials 30(1):144–155, 2021. https://doi.org/10.1515/jmbm-2021-0015.
[47] d. r. lide. crc handbook of chemistry and physics. 85th ed.
crc press, boca raton, 2005.
[48] s. e. etuk, u. w. robert, o. e. agbasi. design and performance evaluation of a device for determination of specific heat capacity of thermal insulators. beni-suef university journal of basic and applied sciences 9(1):34, 2020. https://doi.org/10.1186/s43088-020-00062-y.
[49] astm d790, standard test methods for flexural properties of unreinforced and reinforced plastics and electrical insulating materials. astm international, west conshohocken, 2017.
[50] u. w. robert, s. e. etuk, o. e. agbasi, s. a. ekong. properties of sandcrete block produced with coconut husk as partial replacement of sand. journal of building materials and structures 7:95–104, 2020. https://doi.org/10.5281/zenodo.3993274.
[51] h. lu, x. guo, y. liu, x. gong. effects of particle size on flow mode and flow characteristics of pulverised coal. kona powder and particle journal 32:143–153, 2015. https://doi.org/10.14356/kona.2015002.
[52] x. guiling, c. xiaoping, l. cai, et al. experimental investigation on the flowability properties of cohesive carbonaceous powders. journal of particulate science and technology 35(3):322–329, 2016. https://doi.org/10.1080/02726351.2016.1154910.
[53] usp, powder flow. in the united states pharmacopeia 30-national formulary 25 convention. rockville, 2007.
[54] p. m. velasco, m. p. m. ortíz, m. a. m. giró, l. m. velasco. fired clay bricks manufactured by adding wastes as sustainable construction material – a review. construction and building materials 63:97–107, 2014. https://doi.org/10.1016/j.conbuildmat.2014.03.045.
[55] astm c618, standard specification for coal fly ash and raw or calcined natural pozzolan for use in concrete. astm international, west conshohocken, 2019.
[56] h. solihu. cement soil stabilization as an improvement technique for soil track subgrade, and highway subbase and base courses: a review. journal of civil and environmental engineering 10(3):1–6, 2020. https://doi.org/10.37421/jcde.2020.10.344.
[57] s. srivastava, j. yadav, r. pandey. analysis of stabilization of soil cement for base of railway track & subgrade. international journal of engineering development and research 6(1):263–265, 2018.
[58] e. r. e. rajput. heat and mass transfer. 6th revised ed. s. chand & company pvt ltd, new delhi, 2015.
[59] m. mahedi, b. cetin, d. j. white. performance evaluation of cement and slag stabilized expansive soils.
transportation research record: journal of the transportation research board 2672(52):164–173, 2018. https://doi.org/10.1177/0361198118757439.
[60] m. g. hiwot, e. t. quezon, g. kebede. comparative study on compressive strength of locally produced fired clay bricks and stabilized clay bricks with cement and lime. global scientific journal 5(12):147–157, 2017.
[61] a. rahman, m. g. rasul, m. m. k. khan, s. sharma. recent development on the uses of alternative fuels in cement manufacturing process. fuel 145:84–99, 2015. https://doi.org/10.1016/j.fuel.2014.12.029.

https://doi.org/10.14311/ap.2023.63.0132 acta polytechnica 63(2):132–139, 2023. © 2023 the author(s), licensed under a cc-by 4.0 licence. published by the czech technical university in prague.

exact solutions for time-dependent complex symmetric potential well

boubakeur khantoul (a, b), abdelhafid bounames (a, ∗)

(a) university of jijel, department of physics, laboratory of theoretical physics, bp 98 ouled aissa, 18000 jijel, algeria
(b) university of constantine 3 – salah boubnider university, department of process engineering, bp b72 ali mendjeli, 25000 constantine, algeria
∗ corresponding author: bounames@univ-jijel.dz

abstract. using the pseudo-invariant operator method, we investigate the model of a particle with a time-dependent mass in a complex time-dependent symmetric potential well v(x,t) = if(t)|x|. the problem is exactly solvable and the analytic expressions of the schrödinger wavefunctions are given in terms of the airy function. indeed, with an appropriate choice of the time-dependent metric operators and the unitary transformations, for each region the two corresponding pseudo-hermitian invariants transform into a well-known time-independent hermitian invariant, which is the hamiltonian of a particle confined in a symmetric linear potential well. the eigenfunctions of the latter invariant are the airy functions. the phases obtained are real for both regions, and the general solution to the problem is deduced.

keywords: non-hermitian hamiltonian, time-dependent hamiltonian, pseudo-invariant method, pt-symmetry, pseudo-hermiticity.

1. introduction
the discovery of a class of non-hermitian hamiltonians that may have a real spectrum has prompted a revival of theoretical and applied research in quantum physics. in fact, in 1998, c. m. bender and s. boettcher showed that any non-hermitian hamiltonian invariant under the unbroken space-time reflection, or pt-symmetry, has real eigenvalues and satisfies all the physical axioms of quantum mechanics [1–3]. in 2002, a. mostafazadeh presented a more extended version of non-hermitian hamiltonians having a real spectrum, proving that the hermiticity of the hamiltonian with respect to a positive-definite inner product, ⟨., .⟩η = ⟨.|η|.⟩, is a necessary and sufficient condition for the reality of the spectrum, where η is the metric operator, which is linear, hermitian, invertible, and positive. this condition requires that the hamiltonian H satisfies the pseudo-hermitian relation [4–6]:

H† = η H η⁻¹. (1)
moreover, in recent years, significant progress has been achieved in the study of time-dependent (td) non-hermitian quantum systems in several branches of physics. finding exact solutions of the td schrödinger equation, which cannot be reduced to an eigenvalue equation in general, is a problem of intriguing difficulty. different methods are used to obtain solutions of schrödinger’s equation for explicitly td systems, such as unitary and non-unitary transformations, the pseudo-invariant method, dyson maps, point transformations, darboux transformations, perturbation theory, and the adiabatic approximation [7–31]. however, the emergence of a non-linear ermakov-type auxiliary equation for several td systems, which is difficult to solve, constitutes an additional constraint on obtaining exact analytical solutions [32, 33]. this greatly reduces the number of exactly solvable time-dependent non-hermitian systems [34–38]. in particular, other works have been concerned with studying exact solutions of td hamiltonians with a specific td mass in the non-hermitian case [39, 40] and also in the hermitian case [41–45]. in the present work, we use the pseudo-invariant method [17] to obtain the exact solutions of the schrödinger equation for a particle with a td mass moving in a td complex symmetric potential well:

v(x,t) = if(t)|x|, (2)

where f(t) is an arbitrary real td function. the manuscript is organised as follows: in section 2, we introduce some of the basic equations of td non-hermitian hamiltonians and their time-dependent schrödinger equation (tdse) with a td metric. in section 3, we discuss the use of the lewis–riesenfeld invariant method to address the schrödinger equation for an explicitly td non-hermitian hamiltonian. in section 4, we use the lewis–riesenfeld method to solve the td schrödinger equation for a particle with a td mass in a td complex symmetric potential well. finally, in section 5, we conclude with a brief review of the obtained results.

2. td non-hermitian hamiltonian with td metric
let H(t) be a non-hermitian td hamiltonian and h(t) its associated td hermitian hamiltonian. the two corresponding td schrödinger equations describing the quantum evolution are

H(t)|φ^H(t)⟩ = iℏ ∂t|φ^H(t)⟩, (3)
h(t)|ψ^h(t)⟩ = iℏ ∂t|ψ^h(t)⟩, (4)

where the two hamiltonians are related by the dyson map ρ(t) as

H(t) = ρ⁻¹(t) h(t) ρ(t) − iℏ ρ⁻¹(t) ρ̇(t), (5)

and their wavefunctions |φ^H(t)⟩ and |ψ^h(t)⟩ as

|ψ^h(t)⟩ = ρ(t)|φ^H(t)⟩. (6)

the hermiticity of h(t) allows us to establish the connection between the hamiltonian H(t) and its hermitian conjugate H†(t) as

H†(t) = η(t) H(t) η⁻¹(t) + iℏ η̇(t) η⁻¹(t), (7)

which is a generalisation of the well-known conventional quasi-hermiticity equation (1), and the td metric operator is hermitian and defined as η(t) = ρ†(t) ρ(t).
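relations (5) and (7) can be sanity-checked in a finite-dimensional toy setting: build a non-hermitian H(t) from a random hermitian h(t) and a smooth invertible ρ(t), then verify the td quasi-hermiticity relation (7) with finite-difference time derivatives. this is our own illustrative check (with ℏ = 1 and an arbitrary toy dyson map), not code from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def rand_c(n=4):
    return rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

a_mat, b_mat, c_mat = rand_c(), rand_c(), rand_c()

def rho(t):        # toy smooth, invertible dyson map
    return np.eye(4) + 0.1 * np.cos(t) * a_mat + 0.1j * np.sin(t) * b_mat

def h_herm(t):     # toy hermitian hamiltonian h(t)
    m = np.cos(t) * c_mat
    return m + m.conj().T

def big_h(t, dt=1e-6):  # eq. (5): H = rho^-1 h rho - i rho^-1 rho_dot
    r, r_inv = rho(t), np.linalg.inv(rho(t))
    r_dot = (rho(t + dt) - rho(t - dt)) / (2 * dt)
    return r_inv @ h_herm(t) @ r - 1j * r_inv @ r_dot

def eta(t):        # metric eta = rho^dagger rho
    return rho(t).conj().T @ rho(t)

t, dt = 0.7, 1e-6
eta_inv = np.linalg.inv(eta(t))
eta_dot = (eta(t + dt) - eta(t - dt)) / (2 * dt)
lhs = big_h(t).conj().T                                     # H^dagger
rhs = eta(t) @ big_h(t) @ eta_inv + 1j * eta_dot @ eta_inv  # eq. (7)
print("eq. (7) residual:", np.max(np.abs(lhs - rhs)))       # small (finite differences)
```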
3. pseudo-invariant operator method

let us start with the description of the lewis-riesenfeld theory [46] for a td hermitian hamiltonian $h(t)$ with a hermitian td invariant $I^{h}(t)$. the dynamic invariant $I^{h}(t)$ satisfies: $\frac{d I^{h}(t)}{dt} = \frac{\partial I^{h}(t)}{\partial t} - \frac{i}{\hbar}\,[\,I^{h}(t),\,h(t)\,] = 0$. (8)

the eigenvalue equation for $I^{h}(t)$ is: $I^{h}(t)\,|\psi_n^{h}(t)\rangle = \lambda_n\,|\psi_n^{h}(t)\rangle$, (9) where the eigenvalues $\lambda_n$ of $I^{h}(t)$ are real and time-independent. the lewis-riesenfeld phase is defined as: $\hbar\,\frac{d}{dt}\,\varepsilon_n(t) = \langle\psi_n^{h}(t)|\,\big(i\hbar\,\frac{\partial}{\partial t} - h(t)\big)\,|\psi_n^{h}(t)\rangle$, (10) and the solution of the tdse of $h(t)$ is given as $|\psi^{h}(t)\rangle = \exp[i\,\varepsilon_n(t)]\,|\psi_n^{h}(t)\rangle$. (11)

in the paper [17], we showed that any td hamiltonian $H(t)$ satisfying the td quasi-hermiticity equation (7) admits a pseudo-hermitian invariant $I^{PH}(t)$ such that: $I^{PH\dagger}(t) = \eta(t)\,I^{PH}(t)\,\eta^{-1}(t) \;\Leftrightarrow\; I^{h}(t) = \rho(t)\,I^{PH}(t)\,\rho^{-1}(t) = I^{h\dagger}(t)$. (12)

since the hermitian invariant $I^{h}(t)$ satisfies the eigenvalue equation (9), equation (12) ensures that the pseudo-hermitian invariant's spectrum is real, with the same eigenvalues $\lambda_n$ as $I^{h}(t)$: $I^{h}(t)\,|\psi_n^{h}(t)\rangle = \lambda_n\,|\psi_n^{h}(t)\rangle$, (13) and $I^{PH}(t)\,|\phi_n^{PH}(t)\rangle = \lambda_n\,|\phi_n^{PH}(t)\rangle$, (14) where the eigenfunctions $|\psi_n^{h}(t)\rangle$ and $|\phi_n^{PH}(t)\rangle$ of $I^{h}(t)$ and $I^{PH}(t)$, respectively, are related as $|\psi_n^{h}(t)\rangle = \rho(t)\,|\phi_n^{PH}(t)\rangle$. (15)

the inner products of the eigenfunctions associated with the non-hermitian invariant $I^{PH}(t)$ can now be written as $\langle\phi_m^{PH}(t)|\phi_n^{PH}(t)\rangle_\eta = \langle\phi_m^{PH}(t)|\,\eta\,|\phi_n^{PH}(t)\rangle = \delta_{mn}$, (16) and this corresponds to the conventional inner product associated with the hermitian invariant $I^{h}(t)$. it is easy to verify, by a direct substitution of the hermitian hamiltonian $h(t)$ and the hermitian invariant $I^{h}(t)$ by their equivalents in the expressions (5) and (12), respectively, that the pseudo-hermitian invariant $I^{PH}(t)$ satisfies: $\frac{\partial I^{PH}(t)}{\partial t} = \frac{i}{\hbar}\,[\,I^{PH}(t),\,H(t)\,]$. (17)

we should remark that the invariant operator's eigenstates and eigenvalues can be computed using the same procedure as in the hermitian case. the solution $|\phi^{H}(t)\rangle$ of the schrödinger equation (3) differs from $|\phi_n^{PH}(t)\rangle$ in equation (14) only by the factor $e^{i\,\varepsilon_n^{PH}(t)}$, where $\varepsilon_n^{PH}(t)$ is a real phase given by: $\hbar\,\frac{d}{dt}\,\varepsilon_n^{PH}(t) = \langle\phi_n^{PH}(t)|\,\eta(t)\,\big[\,i\hbar\,\frac{\partial}{\partial t} - H(t)\,\big]\,|\phi_n^{PH}(t)\rangle$. (18)

4. particle in td complex symmetric potential well

let us consider a particle with a td mass $m(t)$ in the presence of the pure imaginary td symmetric potential well of equation (2); its hamiltonian can be written as: $H(t) = \frac{p^2}{2m(t)} + i f(t)\,x$ for $x \ge 0$, and $H(t) = \frac{p^2}{2m(t)} - i f(t)\,x$ for $x \le 0$. (19)

the associated tdse of the system is: $\big[\,\frac{p^2}{2m(t)} + i f(t)\,|x|\,\big]\,\psi(x,t) = i\,\frac{\partial}{\partial t}\,\psi(x,t)$, (20) where $m(t)$ is the particle td mass, $f(t)$ is an arbitrary real td function, and units with $\hbar = 1$ are used. this model can be considered as the complex version of the hermitian case of a particle, with td mass and charge $q$, moving under the action of a td electric field $e(t)$ and confined in a pure imaginary symmetric linear potential well: $i f(t)\,x$ for $x \ge 0$ and $-i f(t)\,x$ for $x \le 0$, where $f(t) = -q\,e(t)$.

according to the results in [17], the solution to the td schrödinger equation with a td non-hermitian hamiltonian is easily found if a nontrivial td pseudo-hermitian invariant $I^{PH}(t)$ exists and satisfies the von-neumann equation (17). in the current problem, in order to solve the td schrödinger equation (20), we assume that the hamiltonian $H(t)$ admits an invariant in each region: $I_1^{PH}(t)$ for $x \ge 0$ and $I_2^{PH}(t)$ for $x \le 0$. for the region $x \ge 0$, let us look for a non-hermitian td invariant in the following quadratic form: $I_1^{PH}(t) = \beta_1(t)\,p^2 + \beta_2(t)\,x + \beta_3(t)\,p + \beta_4(t)$, (21) where $\beta_i(t)$ are arbitrary complex functions to be determined.
by inserting the expressions (19) and (21) in equation (17), the following system of equations is found: $\dot\beta_1(t) = 0$, $\dot\beta_2(t) = 0$, $\dot\beta_3(t) = -\frac{\beta_2(t)}{m(t)} + 2\,i f(t)\,\beta_1(t)$, $\dot\beta_4(t) = i f(t)\,\beta_3(t)$. (22)

to simplify the calculations, we take $\beta_1(t) = 1$ and $\beta_2(t) = 1$, so $\beta_3(t)$ and $\beta_4(t)$ are given by: $\beta_3(t) = g(t) + i\,k(t)$, (23) and $\beta_4(t) = s(t) + i\,w(t)$, (24) where $g(t) = -\int \frac{dt}{m(t)}$, $k(t) = 2\int f(t)\,dt$, $s(t) = -\int f(t)\,k(t)\,dt$ and $w(t) = \int f(t)\,g(t)\,dt$. substituting the expressions (23) and (24) in equation (21), we find: $I_1^{PH}(t) = p^2 + x + [\,g(t) + i\,k(t)\,]\,p + s(t) + i\,w(t)$. (25)

its eigenvalue equation is as follows: $I_1^{PH}(t)\,|\psi(t)\rangle = \lambda_1\,|\psi(t)\rangle$. (26) in order to show that the spectrum of $I_1^{PH}(t)$ is real, we search for a metric operator that fulfils the pseudo-hermiticity relation: $I_1^{PH\dagger}(t) = \eta_1(t)\,I_1^{PH}(t)\,\eta_1^{-1}(t)$, (27) and we make the following choice for the metric: $\eta_1(t) = \exp[\,-\alpha(t)\,x - \beta(t)\,p\,]$, (28) where $\alpha(t)$ and $\beta(t)$ are chosen as real functions in order that the metric operator $\eta_1(t)$ is hermitian. the position and momentum operators transform under $\eta_1(t)$ as: $\eta_1(t)\,x\,\eta_1^{-1}(t) = x + i\,\beta(t)$, (29) and $\eta_1(t)\,p\,\eta_1^{-1}(t) = p - i\,\alpha(t)$. (30) incorporating these relationships into equation (27), we find: $\alpha(t) = k(t)$, (31) and $\beta(t) = g(t)\,k(t) - 2\,w(t)$, (32) so the td metric operator $\eta_1(t)$ is given by: $\eta_1(t) = \exp[\,-k(t)\,x - (g(t)\,k(t) - 2\,w(t))\,p\,]$. (33)

according to the relation $\eta_1(t) = \rho_1^{\dagger}(t)\,\rho_1(t)$, and since $\rho_1(t)$ is not unique, we can take it as a hermitian operator in order to simplify the calculations: $\rho_1(t) = \exp\big[-\frac{k(t)}{2}\,x - \big(\frac{g(t)\,k(t)}{2} - w(t)\big)\,p\big]$. (34) the hermitian invariant $I_1^{h}(t)$ associated with the pseudo-hermitian invariant $I_1^{PH}(t)$ is given by: $I_1^{h}(t) = \rho_1(t)\,I_1^{PH}(t)\,\rho_1^{-1}(t) = p^2 + x + g(t)\,p + \frac{k^2(t)}{4} + s(t)$. (35)

for the region $x \le 0$, we take the non-hermitian invariant $I_2^{PH}(t)$ as $I_2^{PH}(t) = \alpha_1(t)\,p^2 + \alpha_2(t)\,x + \alpha_3(t)\,p + \alpha_4(t)$, (36) where $\alpha_i(t)$ are arbitrary complex functions to be determined. in the same way as in the preceding case, inserting the expressions (19) and (36) in equation (17), and taking $\alpha_1(t) = 1$ and $\alpha_2(t) = -1$, the functions $\alpha_3(t)$ and $\alpha_4(t)$ are given by: $\alpha_3(t) = -g(t) - i\,k(t)$, (37) and $\alpha_4(t) = s(t) + i\,w(t)$. (38)

then, the final results for $I_2^{PH}(t)$ and $\eta_2(t)$ are: $I_2^{PH}(t) = p^2 - x - [\,g(t) + i\,k(t)\,]\,p + s(t) + i\,w(t)$, (39) and $\eta_2(t) = \exp[\,k(t)\,x - (2\,w(t) - g(t)\,k(t))\,p\,]$. (40) we take $\rho_2(t)$ as a hermitian operator, then $\eta_2(t) = \rho_2^2(t)$ with $\rho_2(t) = \exp\big[\frac{k(t)}{2}\,x + \big(\frac{k(t)\,g(t)}{2} - w(t)\big)\,p\big]$, (41) and the related hermitian invariant $I_2^{h}(t)$ is: $I_2^{h}(t) = p^2 - x - g(t)\,p + \frac{k^2(t)}{4} + s(t)$. (42)

to derive the eigenvalue equations of the invariants $I_j^{h}(t)$ for the two regions ($j = 1, 2$), we introduce the unitary transformations $U_j(t)$: $|\phi_{n,j}(t)\rangle = U_j(t)\,|\varphi_n\rangle$, $j = 1, 2$, (43) where $\varphi_n$ will be determined later, and $U_1(t) = \exp\big[-i\,\frac{g(t)}{2}\,x + \frac{i}{4}\,\big(k^2(t) - g^2(t) + 4\,s(t)\big)\,p\big]$, (44) and $U_2(t) = \exp\big[\,i\,\frac{g(t)}{2}\,x - \frac{i}{4}\,\big(k^2(t) - g^2(t) + 4\,s(t)\big)\,p\big]$. (45)

under these transformations, the invariants $I_1^{h}(t)$ and $I_2^{h}(t)$ turn into: $I_1 = U_1^{\dagger}(t)\,I_1^{h}(t)\,U_1(t) = p^2 + x$, (46) and $I_2 = U_2^{\dagger}(t)\,I_2^{h}(t)\,U_2(t) = p^2 - x$, (47) and they can be written in the following combined form: $I = p^2 + |x|$. (48)

we note here that $I$ can be considered as the hamiltonian of a particle of mass $m_0 = 1/2$ confined in the linear symmetric potential well $|x|$. therefore, the eigenvalue equation of the invariant $I$, $\big[\frac{d^2}{dx^2} + (\lambda_n - |x|)\big]\,\varphi_n(x) = 0$, (49) is a well-known problem in quantum mechanics.
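before moving to the airy solutions, the quadratures $g$, $k$, $s$, $w$ entering eqs. (23)–(24) can be sanity-checked symbolically; a small sketch with illustrative choices of $m(t)$ and $f(t)$ (any positive mass and real $f$ would do; unevaluated integrals are differentiated symbolically, so the check goes through regardless):

```python
import sympy as sp

t = sp.symbols('t', real=True, positive=True)
m = 1 + t                     # illustrative time-dependent mass
f = sp.cos(t)                 # illustrative real function f(t)

g = -sp.integrate(1 / m, t)   # g(t) = -int dt/m(t)
k = 2 * sp.integrate(f, t)    # k(t) = 2 int f dt
s = -sp.integrate(f * k, t)   # s(t) = -int f k dt
w = sp.integrate(f * g, t)    # w(t) =  int f g dt

beta3 = g + sp.I * k          # eq. (23)
beta4 = s + sp.I * w          # eq. (24)

# system (22) with beta1 = beta2 = 1:
print(sp.simplify(sp.diff(beta3, t) - (-1 / m + 2 * sp.I * f)))  # 0
print(sp.simplify(sp.diff(beta4, t) - sp.I * f * beta3))         # 0
```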
the bound states $\varphi_n(x)$ are given in terms of the airy functions ai and bi [47, 48]: $\varphi_n(x) = N_n\,\mathrm{ai}(|x| - \lambda_n) + N'_n\,\mathrm{bi}(|x| - \lambda_n)$. (50) the bi part of this solution is not relevant because $\mathrm{bi}(|x| - \lambda_n)$ tends to infinity for $(|x| - \lambda_n) > 0$. thus, we take $N'_n = 0$ and the above solution reduces to: $\varphi_n(x) = N_n\,\mathrm{ai}(|x| - \lambda_n)$. (51)

the eigenvalues $\lambda_n$ are determined by matching the functions $\varphi_n(x)$ and their derivatives in the two regions at the point $x = 0$: $\varphi_n^{(1)}(0) = \varphi_n^{(2)}(0)$, (52) and $\varphi_n^{\prime(1)}(0) = \pm\,\varphi_n^{\prime(2)}(0)$, (53) from which there are two possibilities for $\lambda_n$ and the normalisation constant $N_n$, depending on whether $n$ is even or odd:

• if $n$ is even: $\lambda_n = -a'_{\frac{n}{2}+1}$, (54) where $a'_k$ is the $k$th zero of the derivative $\mathrm{ai}'$ of the airy function, and all values of $a'_k$ are negative numbers [49]. the normalisation constant is: $N_n = \frac{1}{\sqrt{-2\,a'_{\frac{n}{2}+1}}\;\mathrm{ai}(a'_{\frac{n}{2}+1})}$, (55) and the corresponding eigenfunction of $I$ is: $\varphi_n(x) = \frac{1}{\sqrt{-2\,a'_{\frac{n}{2}+1}}\;\mathrm{ai}(a'_{\frac{n}{2}+1})}\;\mathrm{ai}\big(|x| + a'_{\frac{n}{2}+1}\big)$. (56)

• if $n$ is odd: $\lambda_n = -a_{\frac{n+1}{2}}$, (57) where $a_k$ is the $k$th zero of the airy function ai, and all values of $a_k$ are negative numbers [49]. the normalisation constant is: $N_n = \frac{1}{\sqrt{2}\,\mathrm{ai}'(a_{\frac{n+1}{2}})}$, (58) and the corresponding eigenfunction of $I$ is: $\varphi_n(x) = \mathrm{sgn}(x)\,\frac{1}{\sqrt{2}\,\mathrm{ai}'(a_{\frac{n+1}{2}})}\,\mathrm{ai}\big(|x| + a_{\frac{n+1}{2}}\big)$. (59)

the eigenfunctions of the hermitian invariants $I_j^{h}(t)$ are written for each region as: $|\phi_{n,j}(t)\rangle = U_j(t)\,|\varphi_n\rangle$, (60) and the eigenfunctions of the pseudo-hermitian invariants $I_j^{PH}(t)$ are then given by: $|\psi_{n,j}(t)\rangle = \rho_j^{-1}(t)\,U_j(t)\,|\varphi_n\rangle$. (61)

thus, the solutions of the time-dependent schrödinger equation (20) take the form: $|\Psi_{n,j}(t)\rangle = e^{i\,\epsilon_n^{j}(t)}\,|\psi_{n,j}(t)\rangle$, (62) where $\epsilon_n^{j}(t)$ is the phase ($\epsilon_n^{1}(t)$ for $x \ge 0$ and $\epsilon_n^{2}(t)$ for $x \le 0$), obtained from the following relation: $\dot\epsilon_n^{j}(t) = \langle\psi_{n,j}(t)|\,\eta_j(t)\,\big[\,i\,\frac{\partial}{\partial t} - H(t)\,\big]\,|\psi_{n,j}(t)\rangle = \langle\phi_{n,j}(t)|\,i\,\rho_j(t)\,\dot\rho_j^{-1}(t)\,|\phi_{n,j}(t)\rangle - \langle\phi_{n,j}(t)|\,\rho_j(t)\,H(t)\,\rho_j^{-1}(t)\,|\phi_{n,j}(t)\rangle + \langle\phi_{n,j}(t)|\,i\,\frac{\partial}{\partial t}\,|\phi_{n,j}(t)\rangle = \theta(t) - \langle\phi_{n,j}(t)|\,\frac{p^2}{2m(t)}\,|\phi_{n,j}(t)\rangle + \langle\phi_{n,j}(t)|\,i\,\frac{\partial}{\partial t}\,|\phi_{n,j}(t)\rangle$, (63) where $\theta(t) = \frac{1}{2}\,f(t)\,\big[\frac{k(t)}{2}\,g(t) - w(t)\big]$. (64)

using the unitary transformations $U_j(t)$, we find: $\dot\epsilon_n^{j}(t) = \chi^{j}(t) - \frac{1}{2m(t)}\,\langle\varphi_n(t)|\,(p^2 \pm x)\,|\varphi_n(t)\rangle$, (65) where $\chi^{1}(t) = \theta(t) - \frac{1}{16\,m(t)}\,\big[k^2(t) + 3\,g^2(t) + 4\,s(t)\big]$, (66) and $\chi^{2}(t) = \theta(t) + \frac{1}{16\,m(t)}\,\big[k^2(t) - g^2(t) + 4\,s(t)\big]$. (67)

from the eigenvalue equation of the invariant $I$, we have: $(p^2 \pm x)\,|\varphi_n(t)\rangle = \lambda_n\,|\varphi_n(t)\rangle$, (68) so the phases $\epsilon_n^{j}(t)$ take the form: $\epsilon_n^{j}(t) = \int\big(\chi^{j}(t) - \frac{\lambda_n}{2m(t)}\big)\,dt$, (69) and the solution of the td schrödinger equation (20) is given by: $|\Psi_{n,j}(t)\rangle = \exp\big[i\,\epsilon_n^{j}(t)\big]\,\rho_j^{-1}(t)\,|\phi_{n,j}(t)\rangle$. (70)

in the position representation we have: $\langle x|\,\rho_j^{-1}(t)\,|\phi_j(t)\rangle = \exp[i\,\zeta(t)]\,\exp\big[\pm\frac{k(t)}{2}\,x\big]\,\phi_j\big(x \pm i\,(\frac{g(t)\,k(t)}{2} - w(t)),\,t\big)$, (71) where $(+)$ is for the positive region while $(-)$ is for the negative region, and $\zeta(t) = -\frac{k(t)}{4}\,\big(\frac{g(t)\,k(t)}{2} - w(t)\big)$. (72)

then, the solution of the schrödinger equation for each region, eq. (70), can be written as: $\Psi_{n,j}(x,t) = \exp\big[i\,(\epsilon_n^{j}(t) + \zeta(t))\big]\,\exp\big[\pm\frac{k(t)}{2}\,x\big]\,\phi_{n,j}\big(x \pm i\,(\frac{g(t)\,k(t)}{2} - w(t)),\,t\big)$, (73) and the general solution of the schrödinger equation (20) is given by: $\Psi(x,t) = \Psi_{n,1}(x,t)$ for $x \ge 0$, and $\Psi(x,t) = \Psi_{n,2}(x,t)$ for $x \le 0$. (74)
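the spectrum (54)–(59) is easy to reproduce numerically, and the same sketch already checks the half-line probabilities discussed in the next paragraphs; scipy's airy-zero conventions are assumed:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import airy, ai_zeros

# ai_zeros returns: zeros of ai, zeros of ai', ai at the ai' zeros,
# and ai' at the ai zeros.
a, ap, ai_at_ap, aip_at_a = ai_zeros(5)

def eigenvalue(n):
    return -ap[n // 2] if n % 2 == 0 else -a[(n - 1) // 2]

def phi(n, x):
    x = np.asarray(x, dtype=float)
    if n % 2 == 0:                                   # eqs. (54)-(56)
        z = ap[n // 2]
        norm = 1.0 / (np.sqrt(-2.0 * z) * ai_at_ap[n // 2])
        return norm * airy(np.abs(x) + z)[0]
    z = a[(n - 1) // 2]                              # eqs. (57)-(59)
    norm = 1.0 / (np.sqrt(2.0) * aip_at_a[(n - 1) // 2])
    return np.sign(x) * norm * airy(np.abs(x) + z)[0]

for n in range(4):
    p_neg = quad(lambda u: phi(n, u) ** 2, -np.inf, 0.0)[0]
    p_pos = quad(lambda u: phi(n, u) ** 2, 0.0, np.inf)[0]
    print(n, round(eigenvalue(n), 4), round(p_neg, 4), round(p_pos, 4))
# -> lambda_n = 1.0188, 2.3381, 3.2482, 4.0879, with 0.5 + 0.5 per half-line
```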
according to equations (51), (60), (61) and (62), the probability density function is given by: $|\rho_1(t)\,\Psi_{n,1}|^2 + |\rho_2(t)\,\Psi_{n,2}|^2 = |\phi_{n,1}|^2 + |\phi_{n,2}|^2 = |\varphi_n|^2$, (75) and because $\varphi_n(x)$ is determined in terms of the airy function $\mathrm{ai}(x)$, which is a real function, and according to equations (56) and (59), the probability density can be written as:

• for $n$ even: $|\varphi_n(x)|^2 = \frac{1}{(-2\,a'_{\frac{n}{2}+1})\,[\mathrm{ai}(a'_{\frac{n}{2}+1})]^2}\,\big[\mathrm{ai}(|x| + a'_{\frac{n}{2}+1})\big]^2$, (76) which is represented in figure 1 for the first three even states ($n = 0, 2, 4$);

• for $n$ odd: $|\varphi_n(x)|^2 = \frac{1}{2\,[\mathrm{ai}'(a_{\frac{n+1}{2}})]^2}\,\big[\mathrm{ai}(|x| + a_{\frac{n+1}{2}})\big]^2$, (77) which is represented in figure 2 for the first three odd states ($n = 1, 3, 5$).

we note here that the probability in the region $x \le 0$ is $\langle\Psi_{n,2}(t)|\,\eta_2(t)\,|\Psi_{n,2}(t)\rangle = \langle\varphi_n|\varphi_n\rangle_{x\le 0} = \int_{-\infty}^{0}\varphi_n^{*}(x)\,\varphi_n(x)\,dx = \frac{1}{2}$, (78) and the probability in the region $x \ge 0$ is $\langle\Psi_{n,1}(t)|\,\eta_1(t)\,|\Psi_{n,1}(t)\rangle = \langle\varphi_n|\varphi_n\rangle_{x\ge 0} = \int_{0}^{\infty}\varphi_n^{*}(x)\,\varphi_n(x)\,dx = \frac{1}{2}$. (79)

figure 1. probability density of equation (76) for even values of $n = 0, 2, 4$.

figure 2. probability density of equation (77) for odd values of $n = 1, 3, 5$.

so the two regions are equiprobable and the probability in all space is equal to one: $\langle\Psi(t),\Psi(t)\rangle_\eta = \langle\Psi_{n,1}|\,\eta_1(t)\,|\Psi_{n,1}\rangle + \langle\Psi_{n,2}|\,\eta_2(t)\,|\Psi_{n,2}\rangle = \int_{-\infty}^{\infty}\varphi_n^{*}(x)\,\varphi_n(x)\,dx = 1$. (80)

5. conclusion

the pseudo-invariant method has been used to obtain the exact analytical solutions of the time-dependent schrödinger equation for a particle with time-dependent mass moving in a complex time-dependent symmetric potential well. we have shown that the problem can be reduced to solving a well-known eigenvalue equation for a time-independent hermitian invariant. in fact, with a specific choice of the td metric operators, $\eta_1(t)$ and $\eta_2(t)$, and the dyson maps, $\rho_1(t)$ and $\rho_2(t)$, and using unitary transformations, the pseudo-invariant operators ($I_1^{PH}(t)$ for $x \ge 0$ and $I_2^{PH}(t)$ for $x \le 0$) are mapped to two time-independent hermitian invariants $I_1^{h}(t)$ and $I_2^{h}(t)$, which can be combined in the unique form $I = p^2 + |x|$. the latter can be considered as the hamiltonian of a particle confined in a linear time-independent symmetric potential well, whose eigenfunctions are given in terms of the airy function ai. the phases have been calculated for the two regions and are real. thus, the exact analytical solution of the problem has been deduced. finally, let us highlight the fact that the probability density associated with the model in question is time-independent.

references

[1] c. bender, s. boettcher. real spectra in non-hermitian hamiltonians having pt symmetry. physical review letters 80(24):5243–5246, 1998. https://doi.org/10.1103/physrevlett.80.5243
[2] f. bagarello, j. gazeau, f. szaraniec, m. znojil. non-self adjoint operators in quantum physics: mathematical aspects. john wiley, 2015.
[3] c. bender. pt symmetry: in quantum and classical physics. world scientific, 2019.
[4] a. mostafazadeh. pseudo-hermiticity versus pt symmetry: the necessary condition for the reality of the spectrum of a non-hermitian hamiltonian. journal of mathematical physics 43(1):205–214, 2002. https://doi.org/10.1063/1.1418246
[5] a. mostafazadeh. pseudo-hermiticity versus pt-symmetry ii: a complete characterization of non-hermitian hamiltonians with a real spectrum. journal of mathematical physics 43(5):2814, 2002.
https://doi.org/10.1063/1.1461427
[6] a. mostafazadeh. pseudo-hermiticity versus pt-symmetry iii: equivalence of pseudo-hermiticity and the presence of antilinear symmetries. journal of mathematical physics 43(8):3944–3951, 2002. https://doi.org/10.1063/1.1489072
[7] h. choutri, m. maamache, s. menouar. geometric phase for a periodic non-hermitian hamiltonian. journal-korean physical society 40(2):358–360, 2002.
[8] c. yuce. time-dependent pt symmetric problems. physics letters a 336(4-5):290–294, 2005. https://doi.org/10.1016/j.physleta.2004.12.043
[9] a. dutra, m. hott, v. dos santos. time-dependent non-hermitian hamiltonians with real energies. europhysics letters (epl) 71(2):166–171, 2005. https://doi.org/10.1209/epl/i2005-10073-7
[10] c. faria, a. fring. time evolution of non-hermitian hamiltonian systems. journal of physics a: mathematical and general 39(29):9269–9289, 2006. https://doi.org/10.1088/0305-4470/39/29/018
[11] a. mostafazadeh. time-dependent pseudo-hermitian hamiltonians defining a unitary quantum system and uniqueness of the metric operator. physics letters b 650(2-3):208–212, 2007. https://doi.org/10.1016/j.physletb.2007.04.064
[12] m. znojil. time-dependent quasi-hermitian hamiltonians and the unitarity of quantum evolution, 2007. arxiv:0710.5653
[13] m. znojil. time-dependent version of crypto-hermitian quantum theory. physical review d 78(8):085003, 2008. https://doi.org/10.1103/physrevd.78.085003
[14] j. gong, q. wang. time-dependent pt-symmetric quantum mechanics. journal of physics a: mathematical and theoretical 46(48):485302, 2013. https://doi.org/10.1088/1751-8113/46/48/485302
[15] a. fring, m. moussa. unitary quantum evolution for time-dependent quasi-hermitian systems with nonobservable hamiltonians. physical review a 93(4):042114, 2016. https://doi.org/10.1103/physreva.93.042114
[16] a. fring, m. moussa. non-hermitian swanson model with a time-dependent metric. physical review a 94(4):042128, 2016. https://doi.org/10.1103/physreva.94.042128
[17] b. khantoul, a. bounames, m. maamache. on the invariant method for the time-dependent non-hermitian hamiltonians. the european physical journal plus 132(6):258, 2017. https://doi.org/10.1140/epjp/i2017-11524-7
[18] m. maamache, o. djeghiour, n. mana, w. koussa. pseudo-invariants theory and real phases for systems with non-hermitian time-dependent hamiltonians. the european physical journal plus 132(9):383, 2017. https://doi.org/10.1140/epjp/i2017-11678-2
[19] m. maamache. non-unitary transformation of quantum time-dependent non-hermitian systems. acta polytechnica 57(6):424, 2017. https://doi.org/10.14311/ap.2017.57.0424
[20] b. bagchi. evolution operator for time-dependent non-hermitian hamiltonians. letters in high energy physics 1(3):4–8, 2018. https://doi.org/10.31526/lhep.3.2018.02
[21] b. ramos, i. pedrosa, a. lopes de lima. lewis and riesenfeld approach to time-dependent non-hermitian hamiltonians having pt symmetry. the european physical journal plus 133(11):449, 2018. https://doi.org/10.1140/epjp/i2018-12251-3
[22] w. koussa, n. mana, o. djeghiour, m. maamache.
the pseudo hermitian invariant operator and time-dependent non-hermitian hamiltonian exhibiting a su(1,1) and su(2) dynamical symmetry. journal of mathematical physics 59(7):072103, 2018. https://doi.org/10.1063/1.5041718
[23] h. wang, l. lang, y. chong. non-hermitian dynamics of slowly varying hamiltonians. physical review a 98(1):012119, 2018. https://doi.org/10.1103/physreva.98.012119
[24] s. cheniti, w. koussa, a. medjber, m. maamache. adiabatic theorem and generalized geometrical phase in the case of pseudo-hermitian systems. journal of physics a: mathematical and theoretical 53(40):405302, 2020. https://doi.org/10.1088/1751-8121/abad79
[25] r. bishop, m. znojil. non-hermitian coupled cluster method for non-stationary systems and its interaction-picture reinterpretation. the european physical journal plus 135(4):374, 2020. https://doi.org/10.1140/epjp/s13360-020-00374-z
[26] m. zenad, f. ighezou, o. cherbal, m. maamache. ladder invariants and coherent states for time-dependent non-hermitian hamiltonians. international journal of theoretical physics 59(4):1214–1226, 2020. https://doi.org/10.1007/s10773-020-04401-8
[27] j. choi. perturbation theory for time-dependent quantum systems involving complex potentials. frontiers in physics 8, 2020. https://doi.org/10.3389/fphy.2020.00189
[28] f. luiz, m. de ponte, m. moussa. unitarity of the time-evolution and observability of non-hermitian hamiltonians for time-dependent dyson maps. physica scripta 95(6):065211, 2020. https://doi.org/10.1088/1402-4896/ab80e5
[29] a. mostafazadeh. time-dependent pseudo-hermitian hamiltonians and a hidden geometric aspect of quantum mechanics. entropy 22(4):471, 2020. https://doi.org/10.3390/e22040471
[30] l. alves da silva, r. dourado, m. moussa. beyond pt-symmetry: towards a symmetry-metric relation for time-dependent non-hermitian hamiltonians. scipost physics core 5(1):012, 2022. https://doi.org/10.21468/scipostphyscore.5.1.012
[31] y. gu, x. bai, x. hao, j. liang. pt-symmetric non-hermitian hamiltonian and invariant operator in periodically driven su(1,1) system. results in physics 38:105561, 2022. https://doi.org/10.1016/j.rinp.2022.105561
[32] p. ermakov. second-order differential equations: conditions of complete integrability. applicable analysis and discrete mathematics 2(2):123–145, 2008. https://doi.org/10.2298/aadm0802123e
[33] d. schuch. quantum theory from a nonlinear perspective. springer, 2018.
[34] a. fring, t. frith. exact analytical solutions for time-dependent hermitian hamiltonian systems from static unobservable non-hermitian hamiltonians. physical review a 95(1):010102, 2017. https://doi.org/10.1103/physreva.95.010102
[35] w. koussa, m. maamache. pseudo-invariant approach for a particle in a complex time-dependent linear potential. international journal of theoretical physics 59(5):1490–1503, 2020. https://doi.org/10.1007/s10773-020-04417-0
[36] a. fring, r. tenney. exactly solvable time-dependent non-hermitian quantum systems from point transformations. physics letters a 410:127548, 2021. https://doi.org/10.1016/j.physleta.2021.127548
[37] k. zelaya, o. rosas-ortiz. exact solutions for time-dependent non-hermitian oscillators: classical and quantum pictures. quantum reports 3(3):458–472, 2021. https://doi.org/10.3390/quantum3030030
[38] m. huang, r. lee, q. wang, et al. solvable dilation model of time-dependent pt-symmetric systems. physical review a 105(6):062205, 2022. https://doi.org/10.1103/physreva.105.062205
[39] f. kecita, a. bounames, m. maamache. a real expectation value of the time-dependent non-hermitian hamiltonians. physica scripta 96(12):125265, 2021. https://doi.org/10.1088/1402-4896/ac3dbd
[40] b. villegas-martinez, h. moya-cessa, f. soto-eguibar. exact solution for the time dependent non-hermitian generalized swanson oscillator, 2022. arxiv:2205.05741
[41] p. caldirola. forze non conservative nella meccanica quantistica [non-conservative forces in quantum mechanics]. il nuovo cimento 18(9):393–400, 1941. https://doi.org/10.1007/bf02960144
[42] e. kanai. on the quantization of the dissipative systems. progress of theoretical physics 3(4):440–442, 1948. https://doi.org/10.1143/ptp/3.4.440
[43] m. abdalla. canonical treatment of harmonic oscillator with variable mass. physical review a 33(5):2870–2876, 1986. https://doi.org/10.1103/physreva.33.2870
[44] i. ramos-prieto, a. espinosa-zuñiga, m. fernández-guasti, h. moya-cessa. quantum harmonic oscillator with time-dependent mass. modern physics letters b 32(20):1850235, 2018. https://doi.org/10.1142/s0217984918502354
[45] k. zelaya. time-dependent mass oscillators: constants of motion and semiclassical states. acta polytechnica 62(1):211–221, 2022. https://doi.org/10.14311/ap.2022.62.0211
[46] h. lewis, w. riesenfeld. an exact quantum theory of the time-dependent harmonic oscillator and of a charged particle in a time-dependent electromagnetic field. journal of mathematical physics 10(8):1458–1473, 1969. https://doi.org/10.1063/1.1664991
[47] j. schwinger. symbolism of atomic measurements. springer, 2001.
[48] o. vallée, m. soares. airy functions and applications to physics. imperial college press, 2004.
[49] f. olver, d. lozier, r. boisvert, c. clark. nist handbook of mathematical functions, chap. 9. cambridge university press, 2010. http://dlmf.nist.gov/9
acta polytechnica vol. 46 no. 4/2006

central decoding for multiple description codes based on domain partitioning

m. spiertz, t. rusert

multiple description codes (mdc) can be used to trade redundancy against packet loss resistance for transmitting data over lossy diversity networks. in this work we focus on md transform coding based on domain partitioning. compared to vaishampayan's quantizer based mdc, domain based md coding is a simple approach for generating different descriptions, by using different quantizers for each description. commonly, only the highest rate quantizer is used for reconstruction. in this paper we investigate the benefit of using the lower rate quantizers to enhance the reconstruction quality at the decoder side. the comparison is done on artificial source data and on image data.

keywords: multiple description coding, domain based, scalar quantization, reconstruction, central decoder.

1 introduction

multiple description coding (mdc) is a source coding technique which can be used for transmitting data over lossy diversity networks. the mdc generates two or more different descriptions, which are sent over different channels of the network. each of these descriptions can be decoded independently. the reconstruction quality at the receiver increases with the number of received descriptions. decoding one description is usually called side decoding. decoding more than one description is usually called central decoding. if all descriptions need the same bandwidth and all side-decoder outputs are of the same quality, the descriptions are called balanced. for the scenario displayed in fig. 1, the theoretical limits for a gaussian source with zero mean and unit variance are derived in [3].

a popular approach for mdc uses the indices of an arbitrary quantizer for a mapping procedure called index assignment [8]. for this approach it is difficult to allocate redundancy for three or more descriptions in an optimal way, as mentioned in [6]. partitioning based mdc avoids this problem by using quantizers with different rates for generating the side descriptions [1]. as an additional benefit, such approaches may also generate standard conform descriptions [7]. for such multiple description schemes there are several ways for central decoding, which are compared in this paper.

the paper is structured as follows: in section two, three different ways for reconstruction at the central decoder are introduced. in section three, the test scenario is explained and experimental results are shown. in section four, the results are summarized.

2 mdc based on domain partitioning

in domain partitioning based mdc, the md encoders are scalar quantizers with different quantization intervals. this results in a different quantization error for each description, as shown in fig. 2: $y_i = x + q_i$. this simple approach can easily be generalized to $n$ descriptions with $n$ uniform scalar quantizers. the proportion between the different quantization intervals adjusts the redundancy: high redundancy corresponds to nearly equal sized quantization intervals, $\Delta_i/\Delta_j \approx 1$, while low redundancy corresponds to large differences between the quantization intervals, $\Delta_i/\Delta_j \ll 1$. these $n$ quantizers generate $n$ sets of indices that describe the source data. balanced descriptions are achieved by switching these indices by a scheme known to both encoder and decoder; a sketch of such an encoder is given below.
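a minimal python sketch of this encoder structure (the mid-tread uniform quantizer and the step sizes are our illustrative assumptions, not the exact codec of the experiments in section 3):

```python
import numpy as np

def encode(x, delta):
    """uniform mid-tread scalar quantizer: map samples to integer indices."""
    return np.round(x / delta).astype(int)

def decode_side(indices, delta):
    """side decoder: mid-point reconstruction, y_i = x + q_i."""
    return indices * delta

rng = np.random.default_rng(0)
x = rng.standard_normal(10_000)      # gaussian source, zero mean, unit variance

# delta1/delta2 close to 1 -> high redundancy; far from 1 -> low redundancy
delta1, delta2 = 0.25, 0.35
y1 = decode_side(encode(x, delta1), delta1)   # description 1
y2 = decode_side(encode(x, delta2), delta2)   # description 2
print("side mse:", np.mean((y1 - x) ** 2), np.mean((y2 - x) ** 2))
```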
fig. 1: a multiple description scenario with two senders and three receivers

fig. 2: domain based mdc; $q_i$ is the quantization error corresponding to a quantization interval $\Delta_i$

2.1 highest rate reconstruction

the simplest and most common way for central decoding uses only the description with the highest rate for each quantization index. ignoring the lower rate descriptions leads to easily predictable central distortion and low complexity.

2.2 reconstruction by linear superposition

it is possible to reduce the quantization error of the central decoder by using more descriptions than only the highest rate description. for this we introduce $n$ weighting factors $\alpha_i$, and construct the central reconstruction by weighted superposition of the received side reconstructions. with $\sum_{i=1}^{n}\alpha_i = 1 \;\Leftrightarrow\; \alpha_n = 1 - \sum_{i=1}^{n-1}\alpha_i$, (1) the central decoder can be written as: $y = \sum_{i=1}^{n}\alpha_i\,y_i = x + \sum_{i=1}^{n}\alpha_i\,q_i$.

to maximize the reconstruction quality, we minimize the term $E\{(y - x)^2\} = E\big\{\big(\sum_{i=1}^{n}\alpha_i\,q_i\big)^2\big\}$, where $E\{\cdot\}$ denotes the statistical expectation. as a first approximation, we assume that $q_i$ and $q_j$ are uncorrelated for $i \ne j$. as a matter of fact this is not true, especially for $\Delta_i/\Delta_j = k$, $k \in \mathbb{N}$. in section 3 we will show that even with this rough assumption an enhancement for central decoding is possible for high redundancy. under this assumption, $E\{(y - x)^2\} = \sum_{i=1}^{n}\alpha_i^2\,E\{q_i^2\}$, and condition (1) reduces the dimension of this problem by one: $E\{(y - x)^2\} = \sum_{i=1}^{n-1}\alpha_i^2\,E\{q_i^2\} + \big(1 - \sum_{i=1}^{n-1}\alpha_i\big)^2\,E\{q_n^2\}$. the quantization error is minimized by setting the derivative with respect to each $\alpha_i$ to zero: $\frac{\partial}{\partial\alpha_i}\,E\{(y - x)^2\} = 2\,\alpha_i\,E\{q_i^2\} - 2\,\big(1 - \sum_{j=1}^{n-1}\alpha_j\big)\,E\{q_n^2\} \overset{!}{=} 0$. (2) the solution of these $n - 1$ equations minimizes the quantization error. as an example, for two descriptions, (1) and (2) lead to: $\alpha_1 = \frac{E\{q_2^2\}}{E\{q_1^2\} + E\{q_2^2\}}$ and $\alpha_2 = \frac{E\{q_1^2\}}{E\{q_1^2\} + E\{q_2^2\}}$.

2.3 intersection reconstruction

a more deterministic way of using the lower rate descriptions to enhance the quality of the central decoder output is shown in [4] for dpcm systems. each received quantization index belongs to one quantization interval with a lower limit $l_i$ and an upper limit $u_i$. for each quantizer, the following applies: $x \in (l_i,\,u_i)$. (3) by applying more than one quantizer interval, formula (3) becomes: $x \in \big(\max_i(l_i),\,\min_i(u_i)\big)$. by reducing the width of the reconstruction interval, the distortion at the decoder decreases and $y$ approximates the source sample more accurately. this decoding approach results in no quality improvement if all limits of the higher rate quantizer are also limits of the lower rate quantizer; this may happen in the case of $\Delta_i/\Delta_j = k$, $k \in \mathbb{N}$, depending on the width of the quantizer deadzone. in all other cases, every received quantizer index may reduce the width of the reconstruction interval for the central decoder.
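the three central decoders of sections 2.1–2.3 can be sketched for $n = 2$ as follows (again with illustrative mid-tread uniform quantizers; the uniform-error variances $\Delta_i^2/12$ anticipate the assumption made in section 3):

```python
import numpy as np

def central_highest_rate(y1, y2, delta1, delta2):
    # section 2.1: keep only the description from the finer quantizer
    return y1 if delta1 <= delta2 else y2

def central_linear(y1, y2, eq1, eq2):
    # section 2.2: two-description weights from eqs. (1)-(2),
    # alpha_1 = E{q2^2}/(E{q1^2}+E{q2^2}), alpha_2 = 1 - alpha_1
    a1 = eq2 / (eq1 + eq2)
    return a1 * y1 + (1.0 - a1) * y2

def central_intersection(i1, i2, delta1, delta2):
    # section 2.3: x lies in both intervals (l_i, u_i); intersect them
    # and reconstruct at the mid-point of the intersection
    lo = np.maximum((i1 - 0.5) * delta1, (i2 - 0.5) * delta2)
    hi = np.minimum((i1 + 0.5) * delta1, (i2 + 0.5) * delta2)
    return 0.5 * (lo + hi)

rng = np.random.default_rng(1)
x = rng.standard_normal(100_000)
d1, d2 = 0.25, 0.35
i1, i2 = np.round(x / d1).astype(int), np.round(x / d2).astype(int)
y1, y2 = i1 * d1, i2 * d2

mse = lambda y: np.mean((y - x) ** 2)
print("highest rate        :", mse(central_highest_rate(y1, y2, d1, d2)))
print("linear superposition:", mse(central_linear(y1, y2, d1**2 / 12, d2**2 / 12)))
print("intersection        :", mse(central_intersection(i1, i2, d1, d2)))
```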
3 experimental results

for low complexity, all simulations are limited to two descriptions and three decoders, as shown in fig. 1. first, a gaussian source with zero mean and unit variance is used as source data for the comparison of the three decoders. the two encoders are uniform scalar quantizers with different quantization intervals $\Delta_i$. for balanced descriptions, the two sets of quantization indices are mixed by a scheme known to both encoder and decoder, e.g. the scheme used in [6]. the rate is approximated by the entropy of the indices (see the sketch after this section). because of the high rate of 2 bpss, we assume a uniform distribution of the quantization error. results are shown in fig. 5 and fig. 6, along with the theoretical limit for multiple description coding of a gaussian source as derived in [3]. fig. 5 shows that in the case of $\Delta_2/\Delta_1 = 2k + 1$, $k \in \mathbb{N}$, the linear superposition method gets worse than the highest rate method. in these cases, the assumption of no cross-correlation between quantization errors of different descriptions does not apply. for these cases, fig. 6 shows the same quality for highest rate and intersection reconstruction. this is because the limits of the lower rate quantization intervals are also limits of the higher rate quantization intervals when using uniform scalar quantizers without a wider deadzone.

fig. 3: source sample x with two corresponding quantization intervals $\Delta_1$, $\Delta_2$; quantization interval $\Delta_3$ results from highest rate decoding; reconstruction values $y_1$, $y_2$ and $y_3$ are chosen for a uniformly distributed source

fig. 4: source sample x with two corresponding quantization intervals $\Delta_1$, $\Delta_2$; quantization interval $\Delta_3$ results from intersection decoding; reconstruction values $y_1$, $y_2$ and $y_3$ are chosen for a uniformly distributed source

second, the wavelet coefficients of some commonly used test images are used as source data for the mdc. generating the two descriptions is done similarly as for the gaussian source. entropy coding is performed by the spiht algorithm [5]. for visualization and comparison of the efficiency of mdcs, redundancy rate distortion plots (rrd-plots), introduced by [9], are used. the experimental results for the image lena 512×512 are shown in figs. 7 and 8. with other test images, comparable results are achieved. as for the gaussian source, fig. 7 shows that the linear superposition method improves the highest rate reconstruction only for high redundancy. for a redundancy of 0.6 or less, the assumption of uncorrelated quantization errors $q_i$ seems wrong. in fig. 8, no such drawback can be seen. the intersection reconstruction improves every domain based mdc, and may be even more effective for more than two descriptions, because every additional quantizer results in additional limits of quantization intervals, which can be interpreted at the central decoder.
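since the rate above is approximated by the entropy of the quantizer indices, a small estimator sketch (gaussian source; the step sizes are illustrative assumptions, and a step size near 1 lands roughly at the 2 bpss operating point):

```python
import numpy as np

def index_entropy_bits(indices):
    """empirical entropy of the quantizer indices, in bits per source sample."""
    _, counts = np.unique(indices, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

rng = np.random.default_rng(2)
x = rng.standard_normal(1_000_000)
for delta in (0.5, 1.0, 1.5):
    idx = np.round(x / delta).astype(int)
    print(delta, round(index_entropy_bits(idx), 2), "bpss")
```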
fig. 5: highest rate decoder (solid line) vs. linear superposition decoder (dotted line); gaussian source with unit variance and zero mean; rate: 2 bpss

fig. 6: highest rate decoder (solid line) vs. intersection decoder (dotted line); gaussian source with unit variance and zero mean; rate: 2 bpss

fig. 7: lena 512×512, highest rate decoder (solid line) vs. linear superposition decoder (dotted line); psnr for central decoder: 30 db

fig. 8: lena 512×512, highest rate decoder (solid line) vs. intersection decoder (dotted line); psnr for central decoder: 30 db

4 conclusions

in this paper it is shown how to utilize the lower rate quantizers for reducing the distortion at the central decoder in a domain partitioning based mdc. the linear superposition and the intersection method are described for $n$ possible descriptions, so they can be used in domain based partitioning mdc systems with an arbitrary number of descriptions. for the first approach, called linear superposition, lower complexity is traded for the possibility of drawbacks: for lower redundancy, the assumption of negligible cross correlation between the quantization errors of the different channels may not apply. the second, more complex approach is at least as good as the highest rate reconstruction, and by properly choosing the quantization intervals, a significant reduction of the distortion at the central decoder is possible. although the intersection method is better than the linear superposition method, there may be applications where the quantization intervals are not known at the decoder, for example [2]. for such applications the linear superposition may be an improvement in an environment with a need for high redundancy. further investigations may examine the cross correlation underlying the linear superposition method, or study the benefits of these central decoders for more than two channels.

5 acknowledgments

the research described in the paper was supervised by prof. j. r. ohm, ient, rwth aachen university. the authors wish to thank the team of the ient, rwth aachen university, for assistance and help.

references

[1] bajic, i. v., woods, j. w.: domain-based multiple description coding of images and video. ieee transactions on image processing, vol. 12, october 2003, no. 10, p. 1211–1225.
[2] malvar, h. s., hallapuro, a., karczewicz, m., kerofsky, l.: low-complexity transform and quantization in h.264/avc. ieee trans. on circuits and systems for video technology, vol. 13, july 2003, no. 7, p. 598–603.
[3] ozarow, l.: on a source-coding problem with two channels and three receivers. the bell systems technical journal, vol. 59, december 1980, no. 10.
[4] regunathan, l., rose, k.: efficient prediction in multiple description video coding. in: proc. ieee int. conf. image processing, vol. 1 (2000), p. 1020–1023.
[5] said, a., pearlman, w. a.: a new fast and efficient image codec based on set partitioning in hierarchical trees. ieee trans. on circuits and systems for video technology, vol. 6, june 1996.
[6] tillo, t., baccaglini, e., olmo, g.: a flexible multi-rate allocation scheme for balanced multiple description coding application. in: ieee international workshop on multimedia signal processing, 2005, p. 081–084.
[7] tillo, t., olmo, g.: a novel multiple description coding scheme compatible with the jpeg 2000 decoder. ieee signal processing letters, vol. 11, november 2004, p. 908–911.
[8] vaishampayan, v. a.: design of multiple description scalar quantizers. ieee trans. on information theory, vol. 39, may 1993, no. 3, p. 821–834.
[9] wang, y., orchard, m. t., vaishampayan, v. a., reibman, a. r.: multiple description coding using pairwise correlating transforms. ieee transactions on image processing, vol. 10, march 2001, p. 351–366.

martin spiertz
spiertz@ient.rwth-aachen.de

dipl.-ing. thomas rusert
rusert@ient.rwth-aachen.de

institute of communications engineering
rwth aachen university
52056 aachen, germany
acta polytechnica vol. 46 no. 2/2006, czech technical university in prague

fiber optic detection of ammonia gas

l. kalvoda, j. aubrecht, r. klepáček

bathochromic shifts accompanying the formation of several bivalent metallic complexes containing 5-(4′-dimethylaminophenylimino) quinolin-8-one (l1) and 7-chloro-5-(4′-diethylamino-2-methylphenylimino) quinolin-8-one (l2) ligands in ethanol solutions were evaluated by vis-nir spectroscopy. the [l1-cu-l1] sulphate complex was selected as a reagent for further tests on optical fibres. samples of multimode siloxane-clad fused-silica fibre were sensitized by diffusing an ethanol/chloroform solution of the dye into the cladding polymer, and tested by vis-nir optical spectroscopy (12 cm long fibre sections) and optical time domain reflectometry (otdr; 20 ns laser pulses, wavelength 850 nm, 120 m long fibre sensitized within the interval 104–110 m). a well-resolved absorption band of the reagent could be identified in the absorption spectra of the fibres. after exposure to dry ammonia/nitrogen gas with increasing ammonia concentration (0–4000 ppm), the short fibre samples showed a subsequent decay of nir optical absorption; saturation was observed for higher ammonia levels. a concentration resolution $r \approx 50$ ppm and a forward response time $\tau_{90} \approx 30$ s were obtained within the interval 0–1000 ppm. the otdr courses showed an enhancement of the back-scattered light intensity coming from the sensitized region after diffusion of the initial reagent, and a decay after exposure to concentrated ammonia/nitrogen gas (10000 ppm).

keywords: chemical sensors, ammonia sensors, optical fibres.

1 introduction

rapid detection and location of ammonia gas leaks poses a serious problem at all facilities utilising almost pure ammonia gas, such as large-scale refrigerating systems (breweries, dairies, abattoirs, logistic centres, ice-rinks, etc.), chemical plants producing ammonia by the haber-bosch reaction, and fertilizer plants. every year, several accidents caused by ammonia are reported, resulting in severe damage to health and general distress. the concentration limit of human ammonia perception is about 50 ppm, but even lower concentrations are harmful for the human respiratory system. the long-term allowed concentration is defined as 20 ppm. higher concentrations (500–1000 ppm) lead to a serious attack on the respiratory system. the lethal concentration limit is estimated as 5000–10000 ppm [1]. such a level can be readily reached in the event of a highly concentrated ammonia gas leak.

the standard architecture of an ammonia detection system used at present is based on a network of individual conductometric sensing heads equipped with a metal-oxide film [1]. clearly, both the cost and the electric power consumption of such a sensing network rise rapidly with the number of sensors installed. low gas selectivity and insufficient long-term stability of the sensing elements are also serious problems [1]. numerous sensing schemes and optical systems have been tested to overcome these drawbacks (e.g. [1, 3–8]). one approach employs a suitably sensitized optical fibre. the two most extensively tested principles of this type are based on absorption- and fluorescence-based intrinsic sensing fibres. a selected reagent is embedded in the fibre cladding; ammonia gas then induces spectral variations of the fibre optical absorbance or fluorescence through its chemical reaction with the reagent. short sections of such a fibre can be employed in fabricating the individual sensing heads, whereas various modifications of the optical reflectometry method (optical time domain reflectometry, otdr; optical frequency domain reflectometry, ofdr; optical low coherence reflectometry, olcr; optical time-of-flight chemical detection, otof-cd) provide a basis for constructing distributed sensing systems [9–11].

the crucial part of any intrinsic fibre optic sensor design is the selection of a reaction mechanism transforming an analyte exposure event into variations of the optical properties of the fibre. for instance, a ph-sensitive dye can be embedded into the fibre cladding. when exposed to alkaline ammonia gas, the optical absorption band of the dye is shifted; the change in cladding absorption modifies the light spectrum (guided in the fibre core as individual waveguide modes) through evanescent electromagnetic field components. unfortunately, such utilization of an acid-base reaction suffers from its strong dependence on the humidity level (the presence of hydronium ions is necessary for the creation of $\mathrm{NH_4^+}$ ions) and the obvious low sensing selectivity. the latter drawback can be partly compensated by using a proper gas-selective membrane [1]. the ammonia detection approach employed here uses the ligand exchange reaction proposed in [12]. it is in principle capable of reducing the humidity dependence and enhancing the selectivity in comparison with the acid-base reaction (see the experimental section for more details).
in the research stage described here, the following objectives were adopted: (i) to choose reagent(s) suitable for further studies on optical fibres by evaluating the optical absorption spectra of two suitable organic dyes and their metallic complexes in solution, (ii) to prepare sensing fibre samples by diffusion of the reagent into a plastic clad silica (pcs) fibre cladding using the frequently applied procedure elaborated by our research group in the frame of the cec copernicus programme [13, 9, 14], (iii) to test the concentration and temporal response to ammonia gas exposure by vis-nir absorption spectroscopy measurements on short fibre sections, and (iv) to pre-evaluate the qualitative features of an otdr signal by measurements on a longer fibre, only partially sensitized with the selected reagent. preparation, optimization, and quantitative characterization of a real distributed fibre optic ammonia sensor will be carried out in the next step, and the results will be published in a forthcoming paper.
fresh solutions of the complex dyes were used in subsequent preparation of the sensing fibres. the following complexes were prepared and tested: [l-fe-l] br2, [l-co-l] br2, [l-ni-l] cl2, [l-cu-l] cl2, [l-zn-l] so4, [l-cd-l] (no3)2, [l-hg-l] (no3)2, [l-pb-l] (no3)2, (l stands for l1 or l2). nitrogen gas (99.996 % wt.) was obtained from messer technogas, cr. a dry ammonia/nitrogen mixture (10000 ppm nh3) was prepared by evaporating a known amount of ammonia gas from ammonia liquor, drying it in a column filled with naoh, and mixing it with nitrogen gas. 2.2 preparation of tested fibres a custom-made pcs fibre (nc � 1.458, 200/260 �m core/cladding diameter, dow corning optigard siloxane cladding, produced at ire ascr, prague, cr) was utilized throughout the experiments. the fibre sections were washed in acetone for at least 12 h prior to further sensitization. the reagent was soaked into the fibre cladding from its ethanol/chloroform solution (1:1 wt.), washed in ethanol, and left to dry for 12 h in room conditions. two types of fibres were prepared: short sections (12 cm) intended for vis-nir absorption spectroscopy measurements and a long fibre (120 m, sensitized within the interval 104–110 m) tested by otdr. the whole preparation and subsequent characterization of the fibres was performed at room temperature (rt). 2.3 ligand exchange reaction the ammonia sensing fibres prepared in this work employ the ligand exchange reaction [ ]l me l a nnh� � � �� �2 2 3 � � [( ) ]nh me a ln3 2 2 2� � �� � (1) where l is an organic chromatic ligand, me is a bivalent positive metal ion forming a complex ion with l (see also table 1), a is a suitable counter-anion, and n is the integer number, which depends on the type of me co-ordination. the superscripts indicate the degree of ionization of the particular fragment. 2.4 experimental characterization the vis-nir absorption spectra of the solutions and short fibre sections were measured using a setup containing a white light source (halogen lamp, 25 w), a cuvette holder with focusing optics (solution spectroscopy measurements) or a sealed testing chamber attached to a gas mixing system (spectroscopy measurements of fibre sections), an optical pigtail collecting light from the sample under test and transferring it into an ocean optics s1000 array spectrometer (200–1200 nm wavelength range, 1 nm resolution, single channel operation) controlled by a pc. in the case of fibre testing, the light beam was focused into the tested fibre through a microscopic objective and collected by an integration sphere (1 mm in diameter) at the end of the pigtail. the light source, the testing chamber, and the collecting pigtail were all placed on a common optical rail. the otdr setup consisted of a photodyne 5500 xfa otdr unit (laser diode wavelength � � 850 nm, pulse width 20 ns, average pulse power 30 �w, repeating frequency 3.1 khz), hp 54615b digital oscilloscope (maximum sampling speed 1 gsa/s, signal bandwidth 500 mhz), a sealed testing chamber (4 dm3) attached to the cylinder with 10000 ppm ammonia/nitrogen gas, and a pc controller. the otdr unit launched laser pulses into the tested fibre, registered a back-scattered light intensity using an internal pin si diode, and logarithmically amplified the detected signal. the temporal signal course was then recorded by the digital oscilloscope and transferred to the pc. the final pulse width limited the spatial resolution along the fibre length to ~4 m (cf eq. 3). 
the otdr curves showing the full temporal course were recorded with the oscilloscope set to collect data within the time range $\Delta t \le 2000$ ns. the curve details were examined with $\Delta t \le 200$ ns. a numeric data averaging procedure (128 single-shot curves were accumulated) was employed to improve the signal-to-noise ratio of the resulting otdr curves. variable concentrations of ammonia gas in nitrogen were prepared by volumetrically dosing the concentrated ammonia/nitrogen mixture (10000 ppm) into the gas circuit (including the measuring chamber) filled with nitrogen. a membrane circulation pump included in the circuit provided fast homogenization of the prepared gas mixture.

2.5 basic theory

the vis-nir absorption spectra were obtained by a standard absorption spectroscopy method [15]. the otdr technique relies on interrogation of an attached optical fibre by short monochromatic laser pulses, and a subsequent temporal analysis of the light intensity $p(x)$ returning to the otdr unit from the fibre distance $x$, $x = \Delta t\,c/n_c$ ($c$ – light velocity in a vacuum, $n_c$ – refractive index of the fibre core, $\Delta t$ – time interval between the pulse onset and the measuring time point). two types of processes have to be considered as the dominant signal sources in the case of an intrinsic absorption-based sensing fibre: rayleigh scattering along the fibre length, and fresnel reflection from the fibre splices and the free end [16]. the recorded otdr curves ($\log(p(x))$ versus $x$) provide information about the immediate optical properties of the tested fibre along its length.

the total fresnel reflection intensity $p_F(x)$ is proportional to the forward pulse power $p_+(x)$ at position $x$, multiplied by the square of the corresponding reflection coefficients $r_i$ averaged over all $m$ reflected light modes guided within the multimode fibre of core diameter $d$ and numeric aperture $na$ [15, 17]: $p_F(x) = p_+(x)\,\frac{1}{m}\sum_{i=1}^{m} r_i^2(x)$; $m = \frac{\pi^2 d^2\,na^2}{2\lambda^2}$; $p_+(x) = i_L \int_0^{w} \exp\{-a(x+\xi)\}\,\mathrm{d}\xi$. (2) the integration runs over the pulse spatial width $w = c\,\tau/n_c$; $\tau$ is the pulse duration. the intensity of the rayleigh back-scattered light can be expressed as [10, 16, 18]: $p_R(x) = \frac{C}{2 n_c}\,i_L\,s_b(x)\,s_r(x)\,na^2(x)\int_0^{w}\exp\{-2\,a(x+\xi)\}\,\mathrm{d}\xi$, (3) where $C$ is a constant characterizing the coupling efficiency of the optics, $i_L$ is the laser intensity injected into the fibre, and $s_b$ and $s_r$ are respectively a back-scattering factor and the rayleigh scattering coefficient at position $x$. the latter relates on a microscopic level to the local polarizability $\alpha(x)$ of the fibre as [15]: $s_r(x) \propto \frac{\alpha^2(x)}{\lambda^4}$. (4) the total fibre attenuation $a(x)$ caused by scattering and absorption effects is obtained as [9]: $a(x) = \int_0^{x} \alpha_a(l)\,\mathrm{d}l$, (5) where $\alpha_a(l)$ is a local attenuation coefficient and the integration runs over the fibre length $0$–$x$. for example, in the case of a uniform fibre ($s_r(x)$, $s_b(x)$, $na(x)$, and $\alpha_a(x)$ independent of $x$), it follows from (3) that the rayleigh contribution to the otdr logarithmic output, $\log(p_R(x))$, should be a linear function of $x$. contingent variations of $\alpha_a(x)$ and/or $s_r(x)$, $s_b(x)$ and $na(x)$ along the fibre length lead respectively to the corresponding slope changes and/or local extremes appearing on the resulting otdr curve.
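the qualitative behaviour predicted by eqs. (3)–(5), namely a log-linear rayleigh trace whose slope steepens where the local attenuation rises, can be sketched with a toy forward model (all coefficients below are illustrative assumptions, not fitted values, and the pulse-width convolution of eq. (3) is omitted):

```python
import numpy as np

length, dx = 120.0, 0.1
x = np.arange(0.0, length, dx)              # distance along the fibre (m)

alpha = np.full_like(x, 0.01)               # local attenuation coefficient (1/m)
sens = (x >= 104.0) & (x <= 110.0)          # sensitized region, cf. the experiment
alpha[sens] += 0.05                         # extra absorption by the reagent

a = np.cumsum(alpha) * dx                   # eq. (5): a(x) = int_0^x alpha_a dl
s_r = np.ones_like(x)                       # rayleigh coefficient, uniform fibre
s_r[sens] *= 1.1                            # slightly enhanced scattering, cf. eq. (4)

log_otdr = np.log10(s_r * np.exp(-2.0 * a)) # backscatter with round-trip attenuation

for lo, hi in ((20.0, 100.0), (104.5, 109.5)):
    sel = (x > lo) & (x < hi)
    slope = np.polyfit(x[sel], log_otdr[sel], 1)[0]
    print(f"slope over {lo}-{hi} m: {slope:.4f} per m")
```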
3 results and discussion as expected, dyes l1 and l2 showed strong absorption bands in the spectral range 500 nm – 800 nm with maxima at 626 nm and 690 nm, respectively (fig. 1). creation of metallic complexes of the dyes resulted in bathochromic shifts (� �) of the absorption bands due to widening of the corresponding electronic resonant systems. the complex ions containing ligand l1 showed remarkably larger shifts than those containing ligand l2; the greatest � �-value was observed for complex dye [l1-cu-l1] so4 (table 1, fig. 2). the impact of the counter-anion (so4 2� and (cl �)2 were tested) on the � � value was small. the [l1-cu-l1] so4 complex dye (hereafter referred to as reagent r) was selected for the subsequent sensing tests performed on the sensitized fibre samples. studies employing other interesting complexes, especially those containing co2�, are now in progress and will be published in a separate paper. © czech technical university publishing house http://ctn.cvut.cz/ap/ 43 czech technical university in prague acta polytechnica vol. 46 no. 2/2006 fig. 1: absorption spectrum of dye l1 (1) and l2 (2) in ethanol. the solution concentration was 24 �m and 19 �m for l1 and l2, respectively. the peak positions are indicated. central ion (me2�) counter-ion � � (nm), l1 � � (nm), l2 fe2� (br�)2 96 76 co2� (br�)2 105 82 ni2� (cl �)2 108 70 cu2� (cl �)2 125 70 cu2� so4 2� 129 70 zn2� so4 2� 81 69 hg2� (no3 �)2 106 80 pb2� (no3 �)2 98 68 table 1: bathochromic shifts � � resulting from creation of complex ions [l-me-l]2�. the values refer to the absorption maximum of the corresponding ligand l (l � l1 or l2) in ethanol. fig. 2: absorption spectra of ethanol solutions. curve 1 – dye l1; 2 – complex reagent r � [l1-cu-l1] so4; 3 – spectrum of r after addition of ammonia liquor. the bathochromic shift � � accompanying the creation of complexes r is indicated. as expected (cf eq. 1), the vis-nir absorption spectra showed that ion [l1-cu-l1]2� decomposes in contact with ammonia liquor and the ligand absorption spectrum is restored (fig. 2). a similar reaction was also observed on short fibre sections sensitized with reagent r and exposed to 10000 ppm ammonia gas (fig. 3). the forward reaction time �90 (the time necessary to reach 90 % of the total light intensity drop) was ~30 sec, much faster than the fibre recovery time; full spectral relaxation was obtained after ~5 minutes of nitrogen blow. ammonia, as a stronger electron donor than the organic ligand, substitutes the latter in the metallic complex, giving rise to a new [(nh3)nme] 2� complex ion (cf eq. 1). the decomposition of the original chromatic complex leads to the observed colour changes. the equilibrium constant of reaction (1) depends not only on the ammonia concentration, but also on the actual degree of dissociation of the individual ionic species. the degree of dissociation varies with the type of ions (metallic ions as well as anions) and with the permittivity of the solvating medium. if a polymer matrix acts as the solvent, the possible diffusion of water into the polymer bulk can also modify the actual degree of dissociation, thus contributing to a remarkable dependence of the sensor signal on the ambient humidity. such behaviour was indeed experimentally observed [12, 14]. the presence of hydroxyl and hydronium ions in the polymer matrix can also lead to an alternative chemical process – the creation of an ammonium salt competing with the reaction (1), thus disturbing the sensor function. 
careful optimization of the cladding polymer and counter-anion type is therefore necessary to reduce undesired effects. we are currently researching in this direction, and our work will be presented in a forthcoming paper. for this study, only dry gas was used throughout the experiments; thereby reaction scheme (1) could be adopted. spectroscopic measurements also confirmed the crucial importance of the initial fibre wash for the long-term spectral stability of the sensitized fibres. the rapid decay of the optical absorption of reagent r observed for an unwashed sample (fig. 4) resulted very likely from a chemical decomposition of the reagent fractions due to a reaction with remains of a uv-initiator/catalyser in the siloxane cladding. the optical properties of fibres properly treated with acetone remained stable for several months of storage in laboratory conditions; the stability and reversibility in field conditions are currently being tested.

the optical absorption of a short sensing fibre sample integrated within the spectral interval (840–860) nm decayed with increasing ammonia gas concentration (fig. 5); the corresponding concentration sensitivity descends with growing analyte concentration and approaches a saturated level at an ammonia concentration of ~4000 ppm. the saturated state likely corresponds to a complete reagent decomposition (cf. reaction (1) involving all reagent molecules embedded in the cladding). thus, increasing the reagent concentration in the fibre cladding could potentially increase the ammonia sensitivity threshold, but it would also enhance the total fibre attenuation (undesirable for longer fibres), and likely decrease the resolution at low concentrations. this is because the diffusing analyte molecules will mostly react with the reagent molecules located in the outer shell of the cladding, which interacts only weakly with the evanescent field of the fibre core [17]. the cladding thickness and reagent concentration profile therefore also have to be carefully optimized.

fig. 3: absorption spectrum of a short sensing fibre section exposed to dry nh3/nitrogen gas (10000 ppm, exposure start at t = 160 s) followed by nitrogen blow (start at t = 260 s)

fig. 4: absorption spectra of a short sensing fibre demonstrating the influence of initial fibre rinsing in acetone on the spectral stability of the reagent after sensitization. curve 1 – freshly prepared fibre; curve 2 – after 1 week storage, rinsed fibre; curve 3 – after 1 week storage, non-rinsed fibre.

fig. 5: absorbance variations measured on a short sensing fibre when exposed to a dry nh3/nitrogen mixture of increasing concentration. absorbance values are integrated over the spectral interval (840–860) nm. the estimated signal range $6\sigma$, including ~99.7 % of the signal fluctuations, is indicated.

the low concentration resolution $r$ within the interval 0–1000 ppm can be roughly estimated from the course in fig. 5. assessing the standard deviation (std) value as $\sigma \approx 0.002$ and taking into account the signal change of c. 0.092, we get $r \approx 1000 \cdot \frac{2\sigma}{0.092} \approx 50$ ppm.
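the two figures of merit quoted above ($\tau_{90}$ and the low-concentration resolution) are simple to evaluate from a recorded absorbance trace; a small sketch, where the trace and its 13 s time constant are synthetic assumptions chosen to mimic the shape of fig. 3, not measured data:

```python
import numpy as np

# tau_90: time to cover 90 % of the total absorbance drop after exposure start.
def tau90(t, signal, t_start):
    s0 = signal[t < t_start].mean()           # pre-exposure level
    s_inf = signal[-10:].mean()               # saturated level after the drop
    target = s0 + 0.9 * (s_inf - s0)          # 90 % of the total change
    after = t >= t_start
    hit = np.argmax(signal[after] <= target)  # first sample past the target
    return t[after][hit] - t_start            # (assumes a decreasing signal)

t = np.linspace(0.0, 400.0, 4001)
trace = 0.30 - 0.09 * (1.0 - np.exp(-np.clip(t - 160.0, 0.0, None) / 13.0))
print("tau90 ~", round(tau90(t, trace, 160.0), 1), "s")    # ~30 s here

# resolution estimate as in the text: r ~ 1000 ppm * 2*sigma / signal change
sigma, change = 0.002, 0.092
print("r ~", round(1000.0 * 2.0 * sigma / change), "ppm")  # ~43, quoted as ~50 ppm
```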
as already mentioned, the practical reagent concentration which can be achieved in a sensing fibre cladding is generally limited. therefore, the detection resolution can be further improved mainly by reducing the noise level. in our case, we estimate that the primary noise source was the poor light coupling between the tested fibre and the measuring system; the measuring system may be significantly improved if fixing fibre splices are used. further improvement can also be achieved by applying numeric data accumulation and averaging procedures.

the full otdr curves recorded with the 120 m long fibre as fabricated, sensitized by reagent r within the range (104–110) m, and exposed to 10000 ppm ammonia gas within the same range, were dominated by two fresnel reflections (fig. 6a). the pulse dispersion along the fibre length is clearly demonstrated by the broadening of the distant (second) reflection. the reflection at x ~ 0 comes from the front end of an internal fibre within the photodyne unit; the second main maximum corresponds to the reflection from the free end of the tested fibre. the tiny side-maximum of the first reflection (at x ~ 8 m in fig. 6a) is caused by the splice connecting the tested fibre to the measuring unit.

the fibre sensitization resulted (i) in a slight increase in the intensity back-scattered from the sensitized region (fig. 6b), followed (ii) by a steeper signal decay from the more distant fibre part, combined with a well-resolved reduction in the second fresnel reflection intensity (fig. 6, curve 2) compared to the unsensitized state (fig. 6, curve 1). the first effect may likely be ascribed to the increase in polarizability within the sensitized range, enhancing the rayleigh term sr(x) and reducing na(x) (cf. eq. 3, 4). the local drop of the numerical aperture elicits leakage of some part of the guided modes into the cladding, where their scattering and absorption level is much higher than in the fibre core. the second effect results from the enhanced fibre attenuation within the sensitized region (cf. eq. 2, 3, 5).

the character of the otdr signal variation observed after exposure to concentrated ammonia gas (fig. 6, curve 3) was opposite to the reaction following the sensitization procedure. the second fresnel reflection grew, the back-scattered signal coming from the sensitized/exposed region decreased, and the slope of the signal originating just after the exposed region rose. the behaviour is again in accord with the basic otdr model represented by eq. 2–5 and with the reaction mechanism of eq. 1: decomposition of the reagent complexes and the creation of much smaller [(nh₃)₄cu]²⁺ ions led to a local reduction of the scattering and absorption parameters entering eq. 2–5, causing the observed signal changes. the two intersections of curves 2 and 3 (fig. 6b) conform very well to the boundaries of the exposed region.

4 conclusions

the results show the principal feasibility of fabricating an ammonia sensing fibre using the selected reagent and the fibre sensitization procedure employed here. the sensing parameters obtained with a short fibre sample (r ≈ 50 ppm, τ₉₀ ≈ 30 s) are comparable with the figures required for detecting an extensive ammonia gas leakage. we anticipate that comparable sensing parameters would also be achieved with long sensing fibres; the concentration resolution can likely be further improved by using a better optical coupling of the tested fibre to the measuring unit, and numeric signal accumulation and averaging procedures.
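the backscatter part of this behaviour can be reproduced qualitatively with the generic otdr law p(x) ∝ sr(x)·exp(−2∫α dx). the toy model below is a sketch under simple assumptions: the paper's eqs. 2–5 are not reproduced in this excerpt, so this is the textbook form with illustrative numbers, not the authors' exact formulation, and it ignores the fresnel end reflections:

```python
import math

def otdr_trace(alpha_bg, zones, length=120.0, dx=0.5):
    # generic backscatter law: p(x) ~ s_r(x) * exp(-2 * integral of alpha);
    # zones = [(x0, x1, extra_alpha, extra_rayleigh), ...] override the bare fibre
    xs, trace, attenuation = [], [], 0.0
    x = 0.0
    while x <= length:
        alpha, s_r = alpha_bg, 1.0
        for (x0, x1, d_alpha, d_s) in zones:
            if x0 <= x < x1:
                alpha, s_r = alpha_bg + d_alpha, 1.0 + d_s
        attenuation += 2.0 * alpha * dx        # two-way loss accumulates with x
        xs.append(x)
        trace.append(math.log10(s_r * math.exp(-attenuation)))
        x += dx
    return xs, trace

# curve 2: sensitized span (104-110 m) scatters and absorbs more than bare fibre,
# so the trace steps up locally and then decays with a steeper slope behind it
_, curve2 = otdr_trace(0.002, [(104.0, 110.0, 0.05, 0.3)])
# curve 3: after ammonia exposure the reagent is decomposed, both excesses shrink,
# so the local step and the extra slope are reduced, matching the sign of fig. 6b
_, curve3 = otdr_trace(0.002, [(104.0, 110.0, 0.02, 0.1)])
```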
the otdr measurements were performed with the aim of demonstrating the principal feasibility of distributed ammonia gas detection using the proposed fibre design.

fig. 6: otdr curves obtained with a 120 m long fibre sensitized within the distance x = (104–110) m; a) full courses, b) courses corresponding to the sensitized region, recorded with higher resolution (see the experimental section for details). curve 1 – before sensitization; curve 2 – after sensitization with reagent r; curve 3 – after exposure to 10000 ppm ammonia gas in nitrogen.

two features of the observed otdr signal are important for the subsequent design of the prototype: (1) the variation of the fresnel reflection coming from the free fibre end can be instrumental in detecting any ammonia leak along the fibre length. (2) the absorbing reagent embedded in the fibre cladding not only enhances the local fibre attenuation (the main factor restricting the maximum length of absorption-based sensing fibres), but also contributes to the growth of the local back-scattered light intensity, thus slightly improving the signal-to-noise ratio of the resulting signal.

our forthcoming research will focus on tests of alternative reagents (such as the l1 complex with cobalt), followed by the fabrication of an optimized distributed fibre optic sensor. the static and dynamic characteristics and the long-term stability in field conditions of the sensor will be analysed with reference to the influence of ambient temperature and variations in relative humidity.

acknowledgment
part of this research was supported by the european commission in the framework of the copernicus program (contract number cipa-ct94-0206).

references
[1] timmer, b., olthuis, w., van den berg, a.: "ammonia sensors and their applications – a review." sensors and actuators, vol. b107 (2005), p. 666–677.
[2] zakrzewska, k.: "mixed oxides as gas sensors." thin solid films, vol. 391 (2001), p. 229–238.
[3] shahriari, m. r., zhou, q., sigel jr., g. h.: "porous optical fibers for high-sensitivity ammonia-vapor sensors." opt. lett., vol. 13 (1988), p. 407–409.
[4] blyler jr., l. l. et al.: "optical fiber chemical sensors utilizing dye-doped silicone polymer claddings." polym. eng. sci., vol. 29 (1989), p. 1215–1222.
[5] lieberman, r. a.: "distributed and multiplexed chemical fiber optic sensors." proc. spie, vol. 1586 (1991), p. 80–91.
[6] klein, r., voges, e.: "integrated-optic ammonia sensor." sensors and actuators, vol. b11 (1993), p. 221–225.
[7] preininger, c. et al.: "ammonia fluorosensors based on reversible lactonization of polymer-entrapped rhodamine dyes and the effect of plasticizers." anal. chim. acta, vol. 334 (1996), p. 113–123.
[8] grady, t. et al.: "optical sensor for gaseous ammonia with tuneable sensitivity." analyst, vol. 122 (1997), p. 803–806.
[9] potyrailo, r. a., hieftje, g. m.: "optical time-of-flight chemical detection: absorption-modulated fluorescence for spatially resolved analyte mapping in a bidirectional distributed fiber-optic sensor." anal. chem., vol. 70 (1998), p. 3407–3412.
[10] nissilä, s.: "on the use of optical fibres in industrial sensor applications." acta universitatis ouluensis c – technica, vol. 83, oulun yliopisto, oulu, 1995.
[11] takada, k.: "improvement in signal-to-noise ratio of rayleigh backscattering measurement using olcr." j. lightwave tech., vol. 20 (2002), p. 1001–1017.
[12] malins, c. et al.: "fibre optic ammonia sensing employing near infrared dyes." sensors and actuators, vol. b51 (1998), p. 359–367.
[13] kvasnik, f. et al.: "rapid detection and location of ammonia leaks." final report on copernicus joint research project, cipa-ct94-0206, umist, manchester, 1997.
[14] scorsone, e. et al.: "fibre-optic evanescent sensing of gaseous ammonia with two forms of a new near-infrared dye in comparison to phenol red." sensors and actuators, vol. b90 (2003), p. 37–45.
[15] ingle, j. d., crouch, s. r.: spectrochemical analysis. new jersey: prentice hall, 1988.
[16] bürck, j., sensfelder, e., ache, h.-j.: "distributed measurement of chemicals using fiber optic evanescent wave sensing." proc. spie, vol. 2836 (1996), p. 1–11.
[17] saleh, b. e. a., teich, m. c.: fundamentals of photonics. new york: john wiley & sons, 1991.
[18] klocek, p. et al.: "measurement system for attenuation, numerical aperture (na), dispersion, and optical time-domain reflectometry (otdr) in infrared (ir) optical fibers." proc. spie, vol. 618 (1986), p. 151–158.

ing. ladislav kalvoda, csc., specialist assistant, phone: +420 224 358 606, +420 233 325 508, fax: +420 224 358 601, e-mail: ladislav.kalvoda@fjfi.cvut.cz
ing. rudolf klepáček, phone: +420 224 358 606, +420 233 325 508, fax: +420 224 358 601, e-mail: rudolf.klepacek@fjfi.cvut.cz
jan aubrecht, phone: +420 224 358 606, +420 233 325 508, fax: +420 224 358 601, e-mail: jan.aubrecht@centrum.cz
department of solid state physics, czech technical university in prague, faculty of nuclear science and physical engineering, trojanova 13, 120 00 prague 2, czech republic

dimension of fracture surfaces
t. ficker
the question of the universality of a dimension of fracture surfaces is discussed, and it is shown that such a general parameter may exist at least for a particular class of materials.
keywords: fractal dimension, fracture surfaces, porous materials, compressive strength.

1 introduction

fracture surfaces are valuable sources of information on the structural composition and physical properties of materials. for these reasons they are a subject of interest for many research laboratories. since the publication of the basic work by mandelbrot and his co-workers [1], many authors have tried to correlate the fractal dimensions of fracture surfaces with the mechanical properties of materials. this effort has been impacted by the great complexity of these surfaces, especially in the case of composite porous materials. cementitious materials have complex fracture surfaces that have been extensively studied [2–4]. the values of the fractal dimensions of a range of materials show only a narrow scatter, ranging from ~2.0 to ~2.2. repeatedly determined dimensions of different fractured samples of the same materials have often resulted in identical values, and this has led some authors [5] to the idea of a universal co-dimension (hurst exponent) $H \approx 0.8$ that characterizes fracture surfaces as a whole; for a self-affine surface this corresponds to a fractal dimension $D = 3 - H \approx 2.2$, consistent with the narrow range quoted above. though this idea may provoke certain doubts at first sight [6], it should be carefully considered before it is rejected or accepted. the aim of this paper is to investigate the concept of a universal co-dimension of fracture surfaces. as will be shown, this concept may be verified experimentally using fracture surfaces of porous materials in connection with their compressive strength.
for this purpose, it is necessary first to derive corresponding relations for fractal porosity and fractal strength, and then to apply them to a particular material, in our case cement gel.

2 fractal porosity

the large class of porous materials possesses at least one common feature, namely, they are composed of grains (particles, globules, etc.) of microscopic size $l$. the grains are usually arranged fractally, with number distribution

$N(l) = \left(\frac{L}{l}\right)^{D}$, $l < L$. (1)

the porosity $P$ of the cluster

$P = 1 - \left(\frac{l}{L}\right)^{3-D}$ (2)

modifies its form if the porous material consists of more than one ($n + 1$) fractal cluster

$P = 1 - \sum_{i=0}^{n} \left(\frac{l_i}{L_i}\right)^{3-D_i}$. (3)

however, relation (3) does not take into account the case of a composite material in which the fractal clusters of characteristic sizes $L_i$ can be stochastically scattered and mixed with other phases, so that the size $\Lambda$ of the investigated sample may considerably exceed the cluster sizes, $L_i \ll \Lambda$. in order to generalize relation (3), let us suppose that there are $m_i$ fractal clusters with dimension $D_i$ in the sample. their volume fractions $\xi_i = m_i L_i^3/\Lambda^3$ enable us to calculate the porosity of the whole sample, as follows

$P = 1 - \sum_{i=0}^{n} \xi_i \left(\frac{l_i}{L_i}\right)^{3-D_i}$. (4)

eq. (4) includes all possibilities of fractal, non-fractal ($D = 3$) or mixed arrangements of the solid environment surrounding the pores. provided there are $n + 1$ components distributed over the whole sample ($L_i = \Lambda$, $m_i = 1$, $\xi_i = 1$), eq. (4) then converts back to (3).

3 fractal compressive strength

one of the most widely used relations for the compressive strength $\sigma$ of a porous material is that of balshin [7], though other relations [8] have also been proposed for this purpose. balshin considered an ideal case when the pores are not filled with an incompressible liquid and the compressive strength is directly dependent on the compactness $(1 - P)$ of the material, $\sigma = \sigma_0^{*}(1-P)^k$. however, as soon as the virtual incompressibility of the pore liquid is included together with some other factors, a certain remaining strength $S_0$ appears as a constant when the porosity reaches a critical value $P_{cr}$, i.e. $\sigma(P_{cr}) = S_0$. the generalized balshin function then reads

$\sigma(P) = \sigma_0^{*}(1-P)^k - b = \sigma_0^{*}\left[(1-P)^k - (1-P_{cr})^k\right] + S_0$, $0 \le P \le P_{cr}$, $b = \sigma_0^{*}(1-P_{cr})^k - S_0$, (5)

where $P$ is the total porosity and $\sigma$ is the compressive strength of the sample with a virtually incompressible fluid filling at least a part of its pore space. combining (4) and (5), the compressive strength of porous matter appears as a function of the fractal structure

$\sigma = \sigma_0^{*}\left[\sum_{i=0}^{n} \xi_i \left(\frac{l_i}{L_i}\right)^{3-D_i}\right]^k - b$. (6)
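a minimal numeric sketch of eqs. (4) and (5); the helper names and the example values below are ours, not from the paper:

```python
def porosity(clusters):
    # eq. (4): clusters is a list of (xi, l, L, D) tuples, where xi is the
    # volume fraction, l the grain size, L the cluster size, D its dimension
    return 1.0 - sum(xi * (l / L) ** (3.0 - D) for (xi, l, L, D) in clusters)

def balshin_strength(P, sigma0, k, P_cr, S0):
    # generalized balshin law, eq. (5): sigma0*(1-P)^k shifted by b so that
    # sigma(P_cr) = S0, the remaining strength at the critical porosity
    b = sigma0 * (1.0 - P_cr) ** k - S0
    return sigma0 * (1.0 - P) ** k - b

# illustrative numbers only: one mass-fractal cluster plus one compact phase
P = porosity([(0.6, 1e-8, 1e-6, 2.6), (0.4, 1e-8, 1e-6, 3.0)])
print(P, balshin_strength(P, sigma0=200.0, k=3.0, P_cr=0.6, S0=5.0))
```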
4 dimensions of a fracture surface

generally, in the case of a mixed structure containing both fractal and non-fractal regions, some of the dimensions $D_i$ are associated with volume (mass) fractals ($0 < D_i < 3$).
towards integration of cax systems and a multiple-view product modeller in mechanical design
h. song, b. eynard, p. lafon, l. roucoules
this paper deals with the development of an integration framework and its implementation for the connexion of cax systems and multiple-view product modelling. the integration framework is presented regarding its conceptual level, and the implementation level is currently described with the connexion of a functional modeller, a multiple-view product modeller, an optimisation module and a cad system. the integration between the multiple-view product modeller and catia v5 based on the step standard is described in detail. finally, the presented work is discussed and future research developments are suggested.
keywords: concurrent engineering, integrated design, cax system, product modelling, step.

1 introduction

the demand for high quality, low cost products with a short development time for the dynamic global market has forced researchers and industries to focus on various effective product development strategies. concurrent engineering is an effective approach to reduce product development lead-time and improve the overall lifecycle quality by incorporating downstream product development considerations into the early design stage. the product development process deals with various kinds of data issued from value engineering, structural breakdown, dfx assessments, etc. integration of product data in each design phase (conceptual design, embodiment design and detail design, pahl and beitz [1]) will provide more complete information-based decision making. product modelling has been recognized as an effective technique for facilitating the representation and management of product data. during the 1990s, many research works dealing with product modelling were published by krause et al. [2], laakko and mäntylä [3], and anderl and mendgen [4]. the main approach of these works was to describe the product based on geometric and form features. in recent years, the research focus of product modelling has gradually shifted to the earlier design phase. essentially, the earlier design phase is function-driven or function-oriented, because the main design focus at this stage is to find a design solution that is able to achieve the required functions. form feature modelling approaches and systems, dealing mainly with geometry, cannot take functional information into account at all, because this information is a more abstract concept than geometric and topological information. in such a situation, product modelling oriented to functional design has been widely studied. the main idea of these modelling approaches is to describe the relationships between functions, behaviours and structures during the design process; see umeda et al. [5], deng et al. [6], zhang et al. [7]. the representation of the data that has to be managed during the product development process depends on the design phase. it is hardly feasible to find an all-round perfect product model which is able to handle all the product data aspects above.
in the presented research work, two models are used to deal with product data from the conceptual phase to the detail design phase. a function-behaviour-structure (fbs) model is employed for the conceptual and embodiment design phases, and a multiple-view model is used for the embodiment and detail design phases. as we know, commercial cad systems provide powerful capability in cad modelling and analysis. currently, the research aim is to obtain the linking which will integrate these three models. this research proposes an integration framework covering different aspects of product data. on the conceptual level, the different kinds of product data and their logical relations are described through functional modelling, multiple-view modelling and cad modelling. on the implementation level, the integration of the functional modeller, the multiple-view product modeller and catia composes the environment for implementing the modelling logics provided on the conceptual level. the data exchange between catia and an already-existing optimisation module is also considered, to aid designers in a quick assessment of a sub-assembly taking into account a set of evaluation criteria.

this paper is organised as follows: (1) an integration framework for multiple aspects of product data is proposed; (2) as an application of this framework, the fbs model and the multiple-view model are briefly described and the linking between them is emphasised; (3) the integration of the implementing modellers is described, especially the details of the linking between the multiple-view modeller and catia; (4) the conclusions and future work are presented.

2 the proposed integration framework

the product development process deals with different kinds of data in each design phase. during the phase of task clarification, the design specifications originating from customer requirements are the description of a product to be designed; during the conceptual design, the means (principle solutions or components) are generated to meet the functional requirements; then the configurations of each of the components and the connections between the components are set during the embodiment design; after that, final decisions on the dimensions, arrangement, shapes of individual components and materials are made, giving due consideration to the manufacturing function in the detail design. as mentioned in section 1, the modelling of product data depends on the design phase.
it is obviously difficult to handle all the aspects of product data given above in an all-round model. the key to product modelling is product model partition and integration. fig. 1 shows an overview of the proposed integration framework. the purpose of this framework is to provide methods and tools to enable product data integration in a concurrent engineering approach.

several mapping methods have been proposed to implement system integration within the product modelling area. in nowak et al. [8], a meta-modelling based method has been proposed for the integration of product models. a formal modelling notation is required for the definition of mappings on the conceptual level. this notation provides a method to describe the correspondences between the models. after the mapping has been defined, it is possible to translate data to the implementation level.

according to the identified product data in different design phases, the following modelling techniques are adopted on the conceptual level:
1) customer requirement modelling: the customer requirements are analysed and translated into a statement which defines the function that the product should provide (referred to as a functional requirement) and the physical requirements that the product must satisfy.
2) functional modelling: to describe a design and its requirements from its functional aspects, so as to allow reasoning about its functionality and generating schemes.
3) multiple-view modelling: to embed engineering knowledge (geometry, process planning, manufacturing, assembly, etc.) into the description of the form features.
4) cad modelling: to describe the geometric and topological information and to enable analysis.

according to the product modelling techniques provided on the conceptual level, the corresponding implementing systems have already been developed by research institutes or commercial companies. these implementing tools and their relationships compose the implementation level in the framework. there are a number of techniques for customer requirement modelling; the most well-known of these is quality function deployment (qfd). the functional modeller and the multiple-view modeller have already been developed by song [9] and by roucoules and tichkiewitch [10], respectively. the commercial catia v5, from dassault systèmes (www.3ds.com), is used here for classical cad modelling. to aid designers in making a quick assessment of a sub-assembly taking into account a set of evaluation criteria, the already-developed optimisation module is also considered to be integrated into the proposed framework. as this research work focuses on mechanical design, customer requirement modelling and its modeller system are not considered currently. on the conceptual level, the integration of functional modelling and multiple-view modelling will be emphasised; on the implementation level, the data exchange between the multiple-view modeller and catia v5 will be focused on.

3 product modelling

in this section, we will first briefly describe the two models our work is based on. then the integration of these two models will be emphasised.

3.1 functional modelling

the idea of functional modelling in conceptual and embodiment design is to reason at the functional level in order to generate solutions to specified design problems. the product requirements are progressively mapped onto product structures via functional modelling.
to support functional modelling, it is now generally accepted that design information should include not only the physical structure of a design, but also its required functions and implementing behaviour [6]. for example, the function of a steam valve in a boiler is to prevent an explosion; its behaviour is that it opens when a certain pressure difference is detected; its structure is the physical layout and the connection between the various physical components [11].

fig. 1: overview of the integration framework

fig. 2 shows a functional representation of the structure. this representation suggests that each given structure be described by the behaviours it can provide and the functions provided by those behaviours, and, in order to provide each of these, the functions that the structure requires.

fig. 2: functional representation of a structure (f: function, b: behaviour, s: structure; p marks provided functions or behaviours, r marks required functions)

the search for a solution to a given problem, defined in terms of a given set of desired functions, would start with a search among the known behaviours for those that can provide the desired functions, then among the known structures for those that can provide the intended behaviours. these chosen structures, in turn, require some other functions in order that they can provide the desired functions. now, new behaviours and structures will be searched for, so as to provide these required functions, which in turn give rise to new functional requirements. this process will continue until all the required functions are provided by some structures. each resulting combination of structures evolved by the above process thereby becomes a solution to the design problem posed at the beginning.

as the design level decreases (from abstract to concrete), the difference between the meaning of function and behaviour becomes more and more vague. in such a situation, a function may be mapped onto a structure without the behaviour as the transition. for example, the intended function "transmitting torque" can be mapped onto the structure "shaft". when a function may not be mapped onto a behaviour or a structure, it may be broken down into sub-functions. thus, the behaviour or structure in the above representation may be a virtual node. a detailed discussion of the mappings in functional modelling is presented by song and lin [12].
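the search procedure just described, with behaviours sought for desired functions, structures sought for behaviours, and the functions a chosen structure requires feeding back into the search, can be summarised in a small recursive sketch. this is our illustration, not the authors' implementation; the toy catalogue entries are borrowed from the sewing-machine example of fig. 5 and the "shaft" example above:

```python
# toy fbs catalogues: function -> candidate behaviours, behaviour -> structures,
# structure -> functions it requires in turn (cf. fig. 2)
BEHAVIOURS_FOR = {"join-material": ["sewing"]}
STRUCTURES_FOR = {"sewing": ["needle-bar"]}
REQUIRES = {"needle-bar": ["transmit-torque"]}
DIRECT_STRUCTURE = {"transmit-torque": "shaft"}  # function mapped straight to structure

def solve(function, solution=None):
    # recursively collect structures until every required function is provided
    solution = solution if solution is not None else []
    if function in DIRECT_STRUCTURE:              # lower design levels: no behaviour step
        solution.append(DIRECT_STRUCTURE[function])
        return solution
    for behaviour in BEHAVIOURS_FOR.get(function, []):
        for structure in STRUCTURES_FOR.get(behaviour, []):
            solution.append(structure)
            for required in REQUIRES.get(structure, []):
                solve(required, solution)         # required functions trigger new searches
    return solution

print(solve("join-material"))  # ['needle-bar', 'shaft']
```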
3.2 multiple-view modelling

for embodiment and detail design, a multiple-view model is used. the product has several breakdowns according to different points of view, i.e., the product has multiple-view breakdowns. the multiple-view breakdown was introduced by chapa and detailed in [10] to gather all the data that describe a product with a specific vision of it. fig. 3 shows an example of the multiple-view approach with technology, manufacturing, assembly and finite element analysis views. roucoules et al. [13] describe the integration of the process planning view. this representation allows multiple-view breakdowns of the product and ensures the link with the geometry. these feature-based breakdowns complete the product definition, adding new data and new constraints from specific points of view such as tooling, finite element analysis, etc. the multiple-view model is fully described in [14].

fig. 3: multiple-view model

3.3 integration of the functional model and the multiple-view model (conceptual level)

to integrate the functional model and the multiple-view model, it is necessary to establish the linking between these two models. a feasible way is to find the relationship between the results of functional modelling and the initial information of multiple-view modelling. the scheme generated from functional modelling is described by the information on function, behaviour and structure in terms of the representation model shown in fig. 2. the multiple-view breakdowns start from some structures in multiple-view modelling. thus, both models hold information on the structural description. it is obvious that the structural descriptions in these two models are in fact different. the objective of functional modelling is to generate an outline scheme, so the structures here are the components on the upper design level. in the multiple-view model, the structure descriptions are relatively detailed. it is nevertheless suggested that the terminal point of functional modelling overlaps with the starting point of multiple-view modelling. fig. 4 shows the integration framework between these two models.

fig. 4: integration of the functional model and the multiple-view model

from the integration framework, the structure which is the terminal node in functional modelling can be shifted to the technology view in multiple-view modelling. thus, functional relationships that exist between structures, such as drive, support, hold, locate, couple, etc., can be analysed further. figure 5 shows the integration of the two models with an example of a component in a sewing machine.

fig. 5: a case of integration of a functional model and a multiple-view model. legend: f1 – join-material; f2 – interweave-thread-material; f21 – tense-stitch; f22 – catch-thread-loop; f23 – lead-thread-through-material; b1 – sewing; b3 – thread-takeup; s8 – needle-bar; s9 – thread-takeup-bar; s11 – brace.

4 integration on the implementation level

in this section, integration between the implementing systems is discussed. there are four different approaches to the data exchange problem: manual re-input of data, direct translation, neutral format translation, and a shared product database [15]. each approach has its own pros and cons. direct translation is the simple and accurate solution, but the number of translators grows quadratically with the number of systems involved, since every pair of systems needs its own translator. the last two solutions are reasonable, flexible, and adaptable.
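the neutral-format pattern that the following subsections implement (clips schemes written to plain text for the multiple-view modeller; catia parameters written out, optimised, and read back) reduces to a write/process/read-back round trip. a minimal sketch follows; the file layout and names are assumed by us, since the syntax of the paper's plain-text neutral files is not given in this excerpt:

```python
import json

def export_problem(path, params, constraints):
    # writer side, e.g. parameters and constraints abstracted from the cad model
    with open(path, "w") as f:
        json.dump({"params": params, "constraints": constraints}, f)

def optimise(path_in, path_out):
    # reader side, e.g. the optimisation module: load, process, write results back
    with open(path_in) as f:
        problem = json.load(f)
    # placeholder "optimisation": clamp each parameter to its upper bound
    result = {name: min(value, problem["constraints"].get(name, value))
              for name, value in problem["params"].items()}
    with open(path_out, "w") as f:
        json.dump(result, f)

export_problem("problem.txt", {"diameter": 12.0}, {"diameter": 10.0})
optimise("problem.txt", "result.txt")  # result.txt now holds {"diameter": 10.0}
```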
as explained in section 3.3, the structural information plays the internuncial role between the functional modelling and multiple-view modelling. after the functional modelling, the schemes which contextualised the functional product data are output in a neutral plain text file. then the multiple-view modeller can read the related structures from the neutral file. thus, according to the structures, the different aspects of the product data can easily be retrieved. 4.2 linking between catia v5 and optimisation module integration of the optimisation module into the proposed framework allows designers to quickly assess a sub-assembly, taking into account a set of evaluation criteria and to evaluate the influence of the data on the design solutions by modifying one or more item [17]. the mechanism of data exchange between catia v5 and the developed optimisation module is shown in fig. 6. in catia v5, the optimisation problem is abstracted from the parameterised geometric model. the parameters and constraints which define the optimisation problem are output into a neutral file by calling caa-api (component application architecture application programming interface) of catia v5. then, the optimisation module reads the data from the neutral file and performs the optimisation operation. the optimal values of the variables are output into a neutral file. finally, catia v5 gets back the results from the neutral file by calling caa-api and updates the geometric model. 4.3 linking between the multiple-view modeller and catia v5 in the cad/cam context, there exist several standards for data exchange, such as iges, set, vda-fs, edif, etc. [18]. the most popular exchange standard in use is iges. it was designed as a neutral format for the iges exchange of cad data, and has been used as the standard for geometric data by most cad/cam systems. although iges is best supported as an interchange format for geometric data, it cannot fulfill the completeness requirement in representing product data. step was first proposed in 1984 [19] to represent complete information of a product throughout its life cycle. then the choice was made that the data exchange between the multiple-view modeller and catia v5 would be based on step. another approach was developed in eynard et al. [20], aiming at a direct connexion of the multiple-view modeller and an opensource cad system based on on opencascade libraries. the relevant output data from the multiple-view modeller is converted into step file format by a translator, and then the step file can be opened by catia v5 directly. the key to this data exchange is the development of a translator. this module has already been developed using ms visual c++ 6.0. considering the reusability and extensibility of the module, the mapping about an entity from the output file of the multiple-view modeller onto the step neutral file is 28 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 45 no. 3/2005 czech technical university in prague f1 f 2 f 22f 21 f 23 b 3 s 8 s 9 b1 s 11 ...... link 2link 1 roll pin ................................................................... functional modelling multiple-view modelling................................................................... f1: join-material b1: sewing f2: interweave-threadmaterial f21: tense-stitch f22: catch-thread-loop f23: lead-thread-through-material s11: brace b3: thread-takeup s9: thread-takeup-bar s8: needle-bar fig. 
when an entity of the step file is needed, the only thing to be done is to extract the required parameters and call the corresponding class member function. for example, the cylindrical surface defined by the multiple-view modeller is shown as follows:

{ axe sline : pl_axe_doigt_cyl4
  dir vector : pl_dir_axe_doigt_cyl4
    z float : [0.000000 .. 0.000000]
    y float : [1.000000 .. 1.000000]
    x float : [0.000000 .. 0.000000]
  diameter float : [12.0 .. 12.0]
  position point : pl_pos_doigt_cyl4
    z float : [0.000000 .. 0.000000]
    y float : [-20.000000 .. -20.000000]
    x float : [0.000000 .. 0.000000]
  length float : [30.0 .. 30.0] }

the parameters required by the translator are the centre points of each section circle and the radius. the two centre points are centre 1 (0, -20, 0) and centre 2 (0, 10, 0); the radius is 6 (half the stored diameter of 12). then, a cylindrical surface object (css) can be built with these parameters, and its member function (getstepcode) is called to obtain the step code: ccylindrical_surface_segment css(center1, center2, radius); css.getstepcode(). the obtained step file is shown as follows (in part):

#51=cartesian_point('axis2p3d location',(-5.,0.,0.));
#52=direction('axis2p3d direction',(0.,1.,0.));
#53=direction('axis2p3d xdirection',(-1.,0.,0.));
#54=axis2_placement_3d('cylinder axis2p3d',#51,#52,#53);
#55=cylindrical_surface('generated cylinder',#54,6.);
...

fig. 7 shows the interfaces of the process of data exchange.

fig. 7: example of data exchange between a multiple-view modeller and catia v5

5 conclusion

the research work presented in this paper deals with product data integration in design. several aspects of this problem have been studied in the literature, and some concepts have been well specified. the current research work focuses on developing a new integration approach and associated systems. an integration framework of cax systems and a multiple-view modeller is proposed. on the conceptual level, an fbs model and the multiple-view model are used to represent the product data during the design process. the linking between these two models is established to enable the integration of product data in each design phase (conceptual design, embodiment design and detail design). on the implementation level, the data exchange between the corresponding modellers is carried out by neutral file translation. therefore, the designer can easily access the product data from different views and contextualise it throughout the product development process. this research work provides designers with a comprehensive product model and the corresponding modellers, so as to help them make information-based design decisions.

references
[1] pahl, g., beitz, w.: engineering design: a systematic approach. springer, london, 1996.
[2] krause, f. l. et al.: "product modelling." annals of the cirp, vol. 42 (1993), no. 2, p. 695–706.
[3] laakko, t., mäntylä, m.: "feature modelling by incremental feature recognition." computer-aided design, vol. 25 (1993), no. 8, p. 479–492.
[4] anderl, r., mendgen, r.: "modelling with constraints: theoretical foundation and applications." computer-aided design, vol. 28 (1996), no. 3, p. 155–166.
[5] umeda, y., ishii, m., yoshioka, m., tomiyama, t.: "supporting conceptual design based on the function-behavior-state modeler." artificial intelligence for engineering design, analysis and manufacturing, vol. 10 (1996), p. 275–288.
[6] deng, y.-m., tor, s. b., britton, g. a.: "a dual-stage functional modelling framework with multilevel design knowledge for conceptual mechanical design." journal of engineering design, vol. 11 (2000), no. 4, p. 347–375.
[7] zhang, w. y., tor, s. b., britton, g. a., deng, y.-m.: "efdex: a knowledge-based expert system for functional design of engineering systems." engineering with computers, vol. 17 (2001), no. 4, p. 339–353.
[8] nowak, p., roucoules, l., eynard, b.: "product meta-modelling: an approach for linking product models." in: proc. of the ieee world congress on system, man and cybernetics, hammamet, tunisia, october 6th–9th, 2002.
[9] song, h. j.: "research on methodology and key techniques of scheme generation in conceptual design of mechanical products." phd thesis (in chinese). xi'an jiaotong university, china, 2003.
[10] roucoules, l., tichkiewitch, s.: "code: a co-operative design environment. a new generation of cad systems." concurrent engineering research and application, vol. 8 (2000), no. 4, p. 263–280.
[11] kuipers, b.: "commonsense reasoning about causality: deriving behaviour from structure." artificial intelligence, vol. 24 (1984), no. 1–3, p. 169–203.
[12] song, h. j., lin, z. h.: "hierarchical function solving framework with hybrid mappings in conceptual design of mechanical products." chinese journal of mechanical engineering, vol. 38 (2003), no. 5, p. 82–87.
[13] roucoules, l., salomon, o., paris, h.: "process planning as an integration of knowledge in the detailed design phase." international journal of computer integrated manufacturing, vol. 16 (2003), no. 1, p. 25–37.
[14] tichkiewitch, s.: "specification on integrated design methodology using a multi-view product model." in: proc. of the asme engineering system design and analysis conference, montpellier, france, july 1996.
[15] fowler, j.: step for data management, exchange and sharing. technology appraisals, 1995.
[16] giarratano, j., riley, g.: expert systems: principles and programming. 3rd ed. boston: pws, 1998.
[17] eynard, b., lafon, p.: "towards a mechanical systems modelling and optimal embodiment method." in: proc. of the 13th international conference on engineering design, glasgow, scotland, uk, august 21st–23rd, 2001.
[18] bloor, m. s., owen, j.: "cad/cam product-data exchange: the next step." computer-aided design, vol. (1991), p. 237–243.
[19] iso 10303-1. step: industrial automation systems and integration – product data representation and exchange, part 1: overview and fundamental principles, 1994.
[20] eynard, b., roucoules, l., yan, x. t.: "knowledge integration approach for product modelling using an opensource cad system." in: proc. of the 5th knowledge intensive cad ifip 5.2 workshop, st. julian, malta, july 23rd–25th, 2002.

dr. huijun song, phone: +33 325 715 671, fax: +33 325 715 675, e-mail: song.huijun@utt.fr
dr. benoît eynard, e-mail: benoit.eynard@utt.fr
dr. pascal lafon, e-mail: pascal.lafon@utt.fr
dr. lionel roucoules, e-mail: lionel.roucoules@utt.fr
troyes university of technology, laboratory of mechanical systems and concurrent engineering – fre 2719 cnrs, 12 rue marie curie, bp 2060, f-10010 troyes cedex, france

the symbolic-aesthetic dimension of industrial architecture as a method of classification and evaluation: the example of bridge structures in the czech republic
l. popelová
this paper deals with the symbolic-aesthetic dimension of industrial and technical structures as a possible category for classifying and evaluating such structures. this category should play an important role in assessing the value of industrial and technical structures so that they are thoroughly and properly assessed as a part of our cultural heritage. this paper was presented at the xiii international ticcih congress in terni in 2006.
keywords: symbolic-aesthetic demands, industrial archeology, industrial heritage, conservation.

1 introduction

this paper was prepared at the faculty of architecture ctu in prague and the research centre for industrial heritage (vcpd), and was presented at the xiii international ticcih congress in terni in 2006. the vcpd, which was founded in 2002, works on issues related to industrial heritage and technical monuments, and cooperates with other institutions to this end. the centre gathers information on significant industrial heritage sites and buildings, which is compiled in a database for the use of decision-making bodies, regional planning institutions and investors. the paper is based on texts that have been prepared under the research project "aesthetic and symbolic dimension of industrial buildings – one of the classification categories", which was carried out at fa ctu in prague in 2006, and under a project funded by the ministry of education, youth and sports, "interdisciplinary perspectives on the development of sectors of technology and industrial architecture in the czech republic with a view to typology", which was carried out at vcpd in 2006.

the paper deals with the symbolic-aesthetic dimension of industrial and technical structures as a possible category for classifying and evaluating such structures. this category should play an important role in assessing the value of such structures so that they are thoroughly and properly assessed as a part of our cultural heritage. industrial structures from the czech republic are used as examples, and a specific analysis is made of two bridges.

2 symbolic appreciation of the world and its relationship to architecture

first we will consider an observation made by carl gustav jung about the symbolic appreciation of the world. he stated that real life begins only when we are able to see the world in terms of its symbolic meaning. according to jung, this need and capacity is a part of our natural psyche, and it must receive adequate attention throughout our lives. a symbolic appreciation and interpretation of the world plays an important role not just in ordinary life, but also in connection with architecture and urban studies. the theories of c. norberg-schulz, c. jencks, k. lynch, etc., had quite different foundations, but all have pointed out that we must maintain a symbolic-aesthetic perception of our environment at every level, from individual isolated structures to more elaborate structural complexes. nowadays, however, we primarily approach the environment around us with practical perspectives. architecture and the environment come to form a meaningful unit when they, too, possess this dimension. technical and industrial structures can also be perceived as symbolic-aesthetic phenomena, which is what was intended by many of the builders and architects who conceived them. this approach to perceiving and appreciating these structures should serve as a useful argument in efforts aimed at conserving them, as many of these structures are still under threat of destruction because they are often viewed as having no significance or architectural value.
3 the symbol and the industrial structure

symbolic-aesthetic values have long been a standard classification and assessment criterion for other common types of architectural works; such values have even been overemphasised. stereotypes have thus far dictated that this criterion is not applied to industrial or technical structures, because in their case it is assumed that functional and operational criteria take precedence over aesthetic design. nevertheless, some completed industrial and technical works were undertaken with explicitly symbolic-aesthetic ambitions. they are thus comparable with the construction of churches or representative residential buildings, and at the time of their origin they were perceived in this light. a number of important works were conceived by architects who were also engaged in theoretical work on this issue (antonín engel, karel teige, jan e. koula, františek a. libra, františek l. gahura, emanuel hruška). there are some industrial and technical works that are generally acknowledged as outstanding, especially works linked to the modernist movement, in which case their function was almost uncritically glorified and was imposed as the new aesthetic on most architectural work. if we perceive and assess industrial and technical structures from an aesthetic-symbolic perspective, it is clear that the value of these works is not based only on their functional and technical components, but also derives from their specific poetry, monumentality and beauty. in contrast to other types of architecture, they acquire these qualities in very interesting forms. that it is necessary to interpret industrial structures from a symbolic-aesthetic perspective has also been pointed out by p. neaverson and m. palmer in their book industrial archeology, and by other authors, such as g. darley and b. h. bradley.

4 classifying industrial and technical structures from the perspective of their symbolic-aesthetic qualities

the categories dealt with here are close to each other, and can overlap and intermingle from time to time. the categories refer to individual objects as well as to the whole setting of a plant. individual branches of industry often incline toward certain categories, and in this area the typical features need to be described in greater detail. it will also be interesting to identify elements that are only locally typical (e.g. inspired by folk architecture) and those drawn from other cultures
(e.g. the low alpine gables of the austro-hungarian railway stations and the use of brickwork in the czech environment), and to think about how these categories are perceived on the subjective and objective levels, and how the perception of them today differs from their contemporary perception. industrial objects can be viewed as purely technical structures or as works of architecture, but for our kind of evaluation this dichotomy has no importance. the ways in which industrial and technical structures confront symbolic-aesthetic demands can be divided into five categories:
a unintended symbols
b stylistic unity
c intended metaphors
c.1 romantic metaphors
c.2 technological metaphors
c.3 value of standard
c.4 the grand total

a unintended symbols

the production and technology character of these structures gives them an unmistakable expression and form. they are usually single-purpose objects and freestanding structures, which have acquired almost archetypal significance on account of their typicality and immediateness, and in this sense they also affect the observer's symbolic perception. the form of these objects is defined by their structure and their technical equipment. some are exceptional examples of an engineering aesthetic. this category includes limekilns, mining towers, water mills and windmills, smelting furnaces, and cooling towers. these objects have a certain monumentality due to their dimensions and their functional specialization. the outer appearance of these structures already identifies the kind of production involved (fig. 1).

fig. 1: category a – unintended symbol. conical lime-kilns, former vojtěch – koněv ironworks in kladno near prague, built after 1854 (left). a mining tower of the michal mine in ostrava-michálkovice, 1842; architect of the redevelopment františek fiala, 1913–1915 (right).

b stylistic unity

this category includes objects which have been integrated into the predominant local or historical stylistic and formal framework. in this way the relationship of the structure to the surrounding structures has become harmonised, though at the time of their construction the effort to integrate industrial and technical structures into their surroundings may have been somewhat awkward. these structures can be multifunctional or single-purpose objects. rather confusing associations are created by the reuse of historical buildings for new production – e.g. in the early stages of industrialisation, factories were built inside cloisters, castles and residential buildings (fig. 2).

fig. 2: category b – stylistic unity. a floodgate, the so-called laudon pavilion in the veltrusy palace garden, built 1792–1797 by matěj hummel (left). feigl & widrich textile mill (later textilana, a. g.), chrastava (kratzau), built 1904–1907 by gustav sachers and son (right).

c intended metaphors

c.1 romantic metaphors

in this case, parts or aspects of traditional architectural styles are used in a new context and are often applied in a very exalted and exotic way. production buildings in this category are thematically stylised. for example, a paper mill may have been designed like an egyptian temple (an association with papyrus), a carpet factory in the style of islamic architecture (an association with persian carpets), or a tobacco processing plant as a mosque (an association with the exotic origin of tobacco). the appearance of the factory may have served to advertise and promote the goods produced in it. other typical examples are 19th century pseudo-gothic "industrial castles" or water-towers that resemble minarets or defence architecture. these models, borrowed from a glorified version of history, aimed to enhance the significance of the structure (fig. 3).

fig. 3: category c – intended metaphor (romantic metaphor and metaphor of technology). winternitz mill in pardubice, built 1909–1910 and 1919–1926 by josef gočár (left). ferra a.s. warehouse in praha-holešovice (former l. g. bondy, a.s.), built 1928–1929 by josef kříž (right).
c.2 technological metaphors

the style of the industrial structures may blend with the modern buildings in the area, thus tending to create a formal unity with them. however, due to the specific forms and accents of production buildings and technical structures, they belong rather to the sphere of metaphors. at the present time in particular, built objects metaphorically proclaim the sovereignty of technology and production. however, this is very often a general expression only, because in reality their forms are not influenced by the specific kind of technology located inside them. the form sometimes expresses the metaphor of a fully integrated and hidden technology that is totally controlled by man: in fact, the control room of a fully automatic bakery can look quite similar to the control room of a nuclear power plant. in such cases, the most advanced building methods are chosen, even in combination with futuristic stylization. this metaphor symbolises the optimism and the all-powerful potential of the technological era (fig. 3).

c.3 value of standard

the value and importance of multifunctional objects lies in their universality, the enormous quantity in which they were built, and their long service to industry. primarily, they are interesting engineering feats and structural innovations: e.g. ceiling structures, high capacity ventilation, skylight systems, the use of precast elements and the re-implementation of successful solutions. secondarily, they can be important for their advanced equipment and machinery, which has probably been replaced several times and may no longer be preserved. the value of these universal buildings lies in the fact that they form a part of architecture: they have respected the basic sense of proportion, have applied the basic tectonic details, and the scheme of the facade is well-balanced. even their traditional building materials were in harmony with the rest of the city. their industrial function was manifested by great metal-framed windows, chimneystacks, water towers, ramps, cranes, entrance ways and rail tracks. the whole setting of such plants can be seen best from a bird's eye view. typical examples of this type of structure are textile mills and engineering works from the 19th century. we can hardly include most contemporary industrial structures in this category, because they are often built and assembled ad hoc for an immediate functional purpose, without any architectural aim or pretensions (fig. 4).

fig. 4: category c – value of standard. an engineering works, koh-i-noor, a.s. (former waldes and co.) in praha, built 1919–1921 by jindřich pollert (left). a pencil works, the koh-i-noor hardtmuth factory in české budějovice, built 1846–1862 and later (right).

c.4 the grand total

here we are speaking about stylistically and somehow ideologically unified assemblies of buildings and objects. they include not only production buildings but also houses for workers, parks, schools, and sports and cultural facilities.
this kind of organism emerges as an entity, not unlike the old utopian phalanstères or the urban agglomerations inspired by the garden-city or industrial-city conceptions of the 19th century. they were based on capital, economic success, and especially on lofty social ideals with a great deal of optimism. they aspired to create a platform which would renew harmony in human life on all levels. architecturally, at least on a small scale and for a period of time, such projects achieved a "great unity", similar to that achieved by the great historical styles (in the czech republic e.g. baťa's city of zlín with its satellites).
5 the importance of the symbolic-aesthetic criterion in evaluating industrial structures
in the european context, the heritage of czech industrial and technical structures is very valuable. this discussion of the topic is a small contribution to promoting a criterion thus far absent, a criterion that would systematically evaluate technical structures from a symbolic-aesthetic perspective. such an evaluation is an innovation, except in the case of some very exceptional works by well-known architects. today we admire many industrial structures for their aesthetic appearance, which they acquired from their specific typological class. paradoxically, however, association with an industrial typology nowadays often underlies the thoughtless destruction of such works. old technology has been surpassed, all that remains is the shell of a structure, and its potential aesthetic value is not usually an adequate motive for new investors to make an effort to conserve the building. moreover, such structures are more complicated to evaluate than other types of buildings, because their urban impact tends to be much broader in scope. when they finish serving their function, it is often necessary to transform much larger sections of areas.
fig. 4: category c – value of standard. an engineering works: koh-i-noor, a.s. (former waldes and co.) in praha, built 1919–1921 by jindřich pollert (left). the koh-i-noor hardtmuth pencil factory in české budějovice, built 1846–1862 and later (right).
it also seems that the public, and especially the conservation bodies, have not yet learned to appreciate the quality of these structures. valuable works are thus disappearing rapidly, without ever having been assessed and without any thought being given to appropriate ways of saving them and converting them for new use in the future.
6 an analysis of selected technical structures – bridge structures
bridge structures have been selected as an example that can quite easily demonstrate the application of the symbolic-aesthetic interpretation of a structural work. two bridges will be discussed in detail: the art nouveau svatopluk čech bridge in prague and the long bridge in české budějovice. using these examples i will also demonstrate the difference between the traditional and modern forms of bridges.
6.1 the symbolic perception of water
in the human mind, water is associated with continuity, the passage of time, the principle of change, and danger. it is also seen as a subconscious source of wisdom. for this reason rivers and water commonly appear in human dreams and myths.
6.2 bridges in the organism of the city
bridges have always played an important role in the arrangement of the city as a whole. they are important components of the image of the city and its genius loci.
the bridges of prague (especially the most famous, charles bridge) are an important part of its cityscape, together with many of its towers. over the ages bridges have also served many secondary functions; for example, they have formed part of fortifications, dwellings, businesses and shops have been located on them, and they have had representational and leisure functions (e.g. a bridge in verona, the old london bridge, the ponte vecchio, the ponte rialto, chinese bridges). these historical structures were always to some extent aesthetically and symbolically cultivated, and were created to form a harmonious part of the city.
6.3 svatopluk čech bridge (constructed in 1908–1909, architect j. koula)
this art nouveau iron bridge is an example of the category of stylistic unity and the romantic metaphor (figs. 5–9). the bridge consists of three truss arches supported by two concrete pillars clad in stone. in fact, the bridge creates a sort of counterpart and at the same time an addition to the famous charles bridge. the two are comparable in terms of two factors: their technical merit and their artistic merit. at the same time, they represent two different epochs of the national "golden age". from the ideological point of view, charles bridge is a manifestation of the universality of the catholic faith, symbolised by the well-known gallery of open-air statues. the svatopluk čech bridge, on the other hand, glorifies the emancipated modern society of the early 20th century. in an urbanistic context, the svatopluk čech bridge is actually the continuation of a long axis that starts in the historic market centre, attached to a building of great civic pride – the old town hall on the old town square. the axis then runs through the former jewish quarter along the new and representative pařížská street, and then up to the bridge (fig. 7). this straight axis was intended to continue through a triumphal arch above the planned opening in the steep letná hill (fig. 5). the culmination and planned conclusion of this line was to be a new city quarter, in which government buildings would be centred. in this way the bridge can be compared to charles bridge, which formed an integral part of the royal route that ran through the complicated organism of the medieval city. the route symbolically unified the objects of spiritual power with their temporal counterparts. at one end of the route is the royal palace, with the cathedral as a sacred source of authority and power. the reverse view presents the city as an image of blessed human activity. here, one direction complements the opposite direction – the design represents the way of salvation as a two-way street. the svatopluk čech bridge also created a specific vista of the new centre of gravity – the government buildings – in accordance with the modern belief that all power comes from the people, not from god. as regards form, we have already said that the art nouveau svatopluk čech bridge is the continuation of the triumphalism of pařížská street (fig. 7). stylistically, it is a mixture of all historical styles, but without religious accents. in a spiritual sense, the artists and architects preferred pre-christian motifs.
fig. 5: svatopluk čech bridge in prague, built 1908–1909 by j. koula. the face side of the bridge (left). the design of the triumphal arch at letná hill by j. koula, 1897 (right).
the whole project is a manifestation of the optimism and ambition of an era of prosperity, based on modern management, modern technology and a modern social system. the decoration of the bridge is organised on three main levels (figs. 5, 6). the lowest level is thematically related to water. the mid-level is dedicated to man and to work associated with water. the upper level belongs to protective forces and beings. in this way all the strategic points and levels of the bridge are symbolically occupied. the face of the svatopluk čech bridge is the upstream side (fig. 7). surprisingly, the coats-of-arms of prague are on the opposite side, on the rear of the structure, like the flag of a nation situated on the stern of a ship, where the flag is protected by the whole mass of the ship. in our case, this supreme symbol is flanked by a pair of water dragons, hydras that only the divine hercules was able to overcome. here, these creatures are placed in the role of loyal monsters – guard dogs of a sort. this kind of application recalls the lions' heads that hold iron rings in their mouths, or the frightening cathedral gargoyles that facilitate roof drainage. below the coats-of-arms we also see the face of the water god, captured in the wall. he, too, guards the bridge, watching over everything closely, and perhaps shouting through his open mouth. on the face side of the bridge there are two prominent female figures, each holding a torch in her hands (fig. 6). they may also be symbols of republican liberty, like the statue of liberty in new york. the torches used to carry gas lamps, lighting the waterway, ceaselessly welcoming the river and the approaching boats. at night, they warned of the potentially dangerous presence of the two pillars. these gigantic ladies stand on pillars formed in the shape of decorative ship bows festooned with flowers. the spaces between the arches are decorated with relief sculptures of the naked bodies of real and mythical creatures, which play together in harmony.
fig. 6: svatopluk čech bridge in prague, built 1908–1909 by j. koula. the face side of the bridge with a female figure holding a torch (left). the railing with identical medallions with a female face, a "chain" of electric lamps, and relief sculptures (right).
fig. 7: svatopluk čech bridge in prague, built 1908–1909 by j. koula. the specific vista to pařížská street (left). the upstream side of the bridge with the coats-of-arms of prague and water dragons (right).
the lines of the arches are marked by a chain of electric lamps that look like grand pearls, supernaturally taken from the depths of the purifying water, as in a wagnerian opera (figs. 6, 8). on the terrain level of the bridge there are motifs based on work related to water. in the centre of the railing panels there are identical medallions with a beautiful female face and the latin inscription: prague, mother of cities (figs. 6, 8). the replication of this image and the inscription are like a sort of mantra, proclaiming the eternity of the city. even the mandala-like shape of the relief sculptures refers to the world of universal archetypes. on top of the railing, above the pillars, there are pairs of urns that borrow their shape from ancient times and remind us of the process of death and rebirth from earth and water – a natural cycle (fig. 7). the tower gates at charles bridge are of a defensive character.
on the threshold of svatopluk čech bridge, by contrast, all that can be found are some small stone houses crowned by seven-meter-high columns (fig. 8). these slender structures, placed at a good distance from each other, invite free and rapid movement rather than creating an obstacle. they form, as it were, open gates indicating just the symbolic beginning and end of the bridge. at the top of the columns there are small glass structures, and above them stand four poetic figures of victory. at night they stand on the light of the shining lanterns. in this connection we can recall the old christian image of the virgin mary standing on the crescent, or the lady dressed in the sun. according to a more ancient vision, the image is reminiscent of the goddess nike holding the branch of victory.
fig. 8: svatopluk čech bridge in prague, built 1908–1909 by j. koula. decoration of the spaces between the arches, candelabras and small stone houses on the threshold of the bridge (left). the threshold of svatopluk čech bridge with seven-meter-high columns (right).
fig. 9: svatopluk čech bridge in prague, built 1908–1909 by j. koula. the decoration of the candelabras – relief works depicting water-related trades (left), a head of a ram (right).
other impressive vertical features are the candelabras (figs. 7, 8, 9), the lower parts of which are decorated with relief work. these originally depicted several water-related trades, such as rafters and fishermen. they have today been replaced by replicas of one of the originals, which depicts people carrying jugs of water on their shoulders (fig. 9). the figures have an air of peace and dignity to them, like the figures from the ara pacis altar in rome. the same place is marked by an almost three-dimensional image of the head of a ram (fig. 9). this seems to symbolise the nature of the power of man and, according to the ancient models, perhaps the sacred sacrifice – in this case through dedicated human work. slightly higher up, there are metal platforms for flowers, which used to decorate the whole structure on holidays and festivals. at the highest point of the candelabras there are gilded images of another life-giving element – the disk of the sun with rays of sunshine. its simplicity recalls the monstrances carried by the hussites in front of their troops. at night, this image is simulated by two hanging electric lamps. in summary, the symbolic meaning of this bridge is the celebration of the elements of nature, human work and the new social perspective. charles bridge, on the other hand, is guarded by the images of saints, including some czech compatriots, who together help us to pass safely across to the opposite shore, "the other side", and they remind us that this walk is certainly not endless.
6.4 modern bridges
the metaphor of modern bridges is often indifferent to their surroundings and to the genius loci of the whole city. such bridges create the impression that the problems to be solved and the particular situation of the place are often artificially complicated to create an ambitious challenge. in addition to their radical technical designs and forms, new bridges are also stylised to be contemporary. this additional articulation can be much stronger than the plain structural design. because of its extremity, such an approach can disturb the genius loci of the surroundings.
6.5 long bridge in české budějovice (constructed in 1999, architects r. koucký and t. rotter)
this bridge is an example of the metaphor of technology category, due to its simple functional structures with typical production and technical features (fig. 10). the steel structure of this cable-stayed bridge is supported by one pillar. its typical silhouette resembles the old chain bridges. however, because of the expected loads and the small span of the river, this structure seems to be unnecessary. the hanging cables are attached at the centre of the bridge to a two-part frame. at the anchorage, this frame simulates the separation of the parts, and simulates the pull and the weight of the load. the illusion of deformation also has an aesthetic function in other parts of the bridge. we see rubber-like deformed openings in the steel sheets which form part of the side beams (fig. 10). these openings resemble the holes that form the inner structure of aircraft; the metallic character and the round shape enhance this association. the radically outward-arched railing also has something in common with the leading edge of the giant wing of an airplane. despite the technoid character of the bridge, however, the general impression has something surprisingly akin to the style of art nouveau. the bridge is located close to some traditional houses, not far from the historical centre of the city. from this point of view, the aircraft metaphor is rather improper. this new bridge blocks off its surroundings and draws all attention to itself, a feature that has also been criticised by other architects and experts.
fig. 10: long bridge in české budějovice, built 1999 by roman koucký and tomáš rotter. the structure of the bridge (left). the rubber-like deformed openings of the side beams (right). (authors of figures: lenka popelová and the database of the vcpd, ctu in prague.)
7 conclusion
this paper is a contribution in support of a criterion that will systematically evaluate industrial structures, including bridge structures, from a symbolic-aesthetic perspective. this approach should serve as a useful argument in efforts aimed at conserving and regenerating such structures.
8 acknowledgments
this research was supported by ig ctu0622115 "aesthetic and symbolic dimension of industrial buildings: one of the classification categories" in 2006 and by rp mšmt 61-66100 "interdisciplinary perspectives on the development of technical fields and industrial architecture in the czech republic and the creation of a typology" in 2006.
ing. arch. lenka popelová
phone: +420 224 355 322
email: lenka.popelova@fsv.cvut.cz; lenpop@centrum.cz
department of architecture
czech technical university in prague
faculty of civil engineering
thákurova 7
166 29 prague 6, czech republic
acta polytechnica vol. 46 no. 6/2006
fuzzy algorithm for supervisory voltage/frequency control of a self excited induction generator
hussein f. soliman, abdel-fattah attia, s. m. mokhymar, m. a. l. badr
abstract: this paper presents the application of a fuzzy logic controller (flc) to regulate the voltage of a self excited induction generator (seig) driven by wind energy conversion schemes (wecs). the proposed flc is used to tune the integral gain (ki) of a proportional plus integral (pi) controller. two types of control, for the generator and for the wind turbine, using an flc algorithm, are introduced in this paper. the voltage control is performed to adapt the terminal voltage via self excitation. the frequency control is conducted to adjust the stator frequency through tuning the pitch angle of the wecs blades. both controllers utilize the fuzzy technique to enhance the overall dynamic performance. the simulation results depict a better dynamic response for the system under study during the starting period and under load variation. the percentage overshoot, rising time and oscillation are better with the fuzzy controller than with the pi controller.
keywords: fuzzy logic controller, self excited induction generator, voltage control, frequency control, wind power station.
1 introduction
many publications in the field of the seig have dealt with solutions to a range of problems (e.g. enhancing performance, loading, interfacing with the grid, etc.). el-sousy et al. [1] discuss a method for controlling a 3-phase induction generator using indirect field orientation control, while mashaly et al. [2] introduce an flc controller for a wind energy utilization scheme. we have found no studies concentrating on a wind energy scheme for supplying an isolated load. the primary advantages of the seig are lower maintenance costs, better transient performance, no need for a dc power supply for field excitation, brushless construction (squirrel-cage rotor), etc. in addition, induction generators have been widely employed to operate as wind-turbine generators and small hydroelectric generators for isolated power systems [3, 4]. induction generators can be connected to large power systems, in order to inject electric power, when the rotor speed of the induction generator is greater than the synchronous speed of the air-gap revolving field. in this paper the dynamic performance is studied for an seig driven by a wecs to feed an isolated load. the d-q axes equivalent circuit model, based on different reference frames and extracted from fundamental machine theory, can be employed to analyze the machine's transient response [3, 4]. the voltage controller for the seig is designed to adapt the terminal voltage via a semiconductor switching system. the semiconductor switch regulates the duty cycle, which adjusts the value of the capacitor bank connected to the seig [5, 6]. the seig is equipped with a frequency controller to regulate the mechanical input power; in addition, the stator frequency is regulated. this is achieved by adjusting the pitch angle of the wind turbine. in this paper, the integral gain (ki) of the pi controller is supervised using the flc to enhance the overall dynamic response. the simulation results of the proposed technique are compared with the results obtained for the pi controller with fixed and variable ki.
2 the system under study
fig. 1 shows the block diagram of the study system, which consists of an seig driven by a wecs connected to an isolated load.
fig. 1: system under study
two control loops, for the terminal voltage and for the pitch angle, using the flc to tune ki of the pi controller, are shown in the same figure. the mathematical model of the seig driven by the wecs is simulated using the matlab/simulink package to solve the differential equations. meanwhile, two controllers have been developed for this system. the first is the voltage controller, which adjusts the terminal voltage to the rated value. this is done by varying the switching of the capacitor bank, i.e. changing the duty cycle, to adjust the self excitation. the second is the frequency controller, which regulates the input power to the generator and thus keeps the stator frequency constant. this is achieved by changing the value of the pitch angle of the wind turbine blade. first, the system under study is tested when equipped with a pi controller, for both the voltage and frequency loops, at different fixed values of ki. then the technique is developed to drive the pi controller with a variable ki to enhance the dynamic performance of the seig. ki is then tuned using two different algorithms. the simulation is carried out when the pi controller is driven by a variable ki using a linear function, with limiters, between ki and the voltage error for the voltage control. the simulation also includes a variable ki based on the mechanical power error for the frequency controller; the variable ki has lower and upper limits. then, the simulation is conducted when the pi controller is driven by a variable ki through the flc technique. the simulation results depict the variation of the different variables of the system under study, such as the terminal voltage, load current, frequency, duty cycle of the switching capacitor bank, the variable ki in the voltage controller (kiv) and the variable ki in the frequency controller (kif).
3 mathematical model of the seig driven by wecs
3.1 electrical equations of the seig
the stator and rotor voltage equations using the krause transformation [3, 4], based on a stationary reference frame, are given in the appendix [7].
3.2 mechanical equations of the wecs
the mechanical equations relating the power coefficient of the wind turbine, the tip speed ratio (λ) and the pitch angle (β) are given in [7, 8, 9]; a small numerical sketch of these relations follows the assumption list below. the analysis of the seig in this research is performed taking the following assumptions into account [3]:
- all parameters of the machine can be considered constant except xm.
- per-unit values of the stator and rotor leakage reactance are equal.
- core loss in the excitation branch is neglected.
- space and time harmonic effects are ignored.
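as a rough numerical illustration of the mechanical relations of section 3.2, the following python sketch evaluates the tip speed ratio, the power coefficient and the mechanical power using the forms of eqs. (30), (33) and (34) reproduced in the appendix. the air density and the example rotor figures (6 m diameter, 120 rpm, 7 m/s wind, 2° pitch) are illustrative assumptions, not values taken from the machine under study.

    import math

    RHO = 1.225   # assumed air density [kg/m^3]

    def tip_speed_ratio(n_rpm, d, v_w):
        # lambda = pi*d*n / (60*v_w), appendix eq. (34)
        return math.pi * d * n_rpm / (60.0 * v_w)

    def power_coefficient(lam, beta):
        # cp(lambda, beta) approximation, appendix eq. (33); beta in degrees
        return ((0.44 - 0.0167 * beta)
                * math.sin(math.pi * (lam - 3.0) / (15.0 - 0.3 * beta))
                - 0.00184 * (lam - 3.0) * beta)

    def mechanical_power(v_w, d, n_rpm, beta):
        # pm = (1/8)*rho*cp*pi*d^2*vw^3, appendix eq. (30)
        lam = tip_speed_ratio(n_rpm, d, v_w)
        return 0.125 * RHO * power_coefficient(lam, beta) * math.pi * d ** 2 * v_w ** 3

    # illustrative rotor: 6 m diameter, 120 rpm, 7 m/s wind, 2 degree pitch
    print(round(mechanical_power(7.0, 6.0, 120.0, 2.0)), "w")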
3.3 equivalent circuit
the d and q axes equivalent-circuit model parameters for a no-load, three-phase symmetrical induction generator refer to a 1.1 kw, 127/220 v (line voltage), 8.3/4.8 a (line current), 60 hz, 2-pole, wound-rotor induction machine [4]. more details about the machine are described in [7, 8].
3.4 voltage control and the switching capacitor bank technique
3.4.1 switching
switching of capacitors was discarded in the past because of the practical difficulties involved [5, 6], i.e. the occurrence of voltage and current transients. it has been argued, and justly so, that current 'spikes', for example, would inevitably exceed the maximum current rating as well as the (di/dt) value of a particular semiconductor switch. the only way out of this dilemma would be to design the semiconductor switch to withstand the transient value at the switching instant. the equivalent circuit in fig. 2 is added to explain this behaviour of the switching capacitor bank under the duty cycle; the details of this circuit are given in [6]. for the circuit of fig. 2, the switches are operated in anti-phase, i.e. the switching function fs2 which controls switch s2 is the inverse of the function fs1 which controls switch s1. in other words, switch s2 is closed during the time when switch s1 is open, and vice versa. this means that s1 and s2 of branches 1 and 2 are operated in such a manner that one switch is closed while the other is open.
fig. 2: semiconductor switches (s1, s2) circuit for the capacitor bank
3.4.2 voltage control
as shown in fig. 1, the input to the controllers is the voltage error, while the output of the controllers is used to set the duty cycle (δ). the calculated value of δ is used as an input to the semiconductor switches to change the value of the capacitor bank according to the required effective value of the excitation. accordingly, the terminal voltage is controlled by adjusting the self-excitation through automatic switching of the capacitor bank.
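the way the duty cycle translates into an effective excitation capacitance can be sketched as follows. the weighting form follows the reconstruction of eq. (28) in the appendix (the original print of that equation is garbled, so the exact form is an assumption), and the 240 µf / 60 µf bank values are illustrative only. the endpoints behave as expected: δ = 0 leaves the full c_max in circuit, δ = 1 leaves c_min.

    def effective_capacitance(c_max, c_min, delta):
        # duty-cycle weighting of the two anti-phase branches, following the
        # reconstruction of appendix eq. (28):
        # c_eff = c_max * ((1 - delta)**2 + delta**2 / alpha), alpha = c_max/c_min
        alpha = c_max / c_min
        return c_max * ((1.0 - delta) ** 2 + delta ** 2 / alpha)

    # sweep the duty cycle between its limits (illustrative bank values)
    for delta in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(delta, effective_capacitance(240e-6, 60e-6, delta))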
3.5 frequency control
frequency control is applied to the system by adjusting the pitch angle of the wind turbine blades. this is used to keep the seig operating at a constant stator frequency and to counteract the effect of speed disturbances. the pitch angle determines the power coefficient cp of the wind turbine; the value of cp is calculated from the pitch angle according to the equation mentioned in [7, 8, 9]. consequently, the best adjustment of the pitch angle improves the mechanical power regulation, which in turn achieves a better adaptation of the frequency of the overall system. accordingly, the frequency control regulates the mechanical power of the wind turbine.
4 controllers
two different controller strategies have been investigated. first, the conventional pi controller with fixed and variable gains is applied. second, the flc is applied to adjust the value of ki for both the frequency and voltage controllers.
4.1 conventional pi controller
the simulation program is carried out for different values of ki while the value of the proportional gain is kept constant, as shown in fig. 3. it is observed from the simulation results that the percentage overshoot (p.o.s.), rising time and settling time change as ki is changed. the technique of having a variable ki depending on the voltage error, for the voltage control, is then introduced to obtain the advantage of both high and low values of the integral gain of the voltage loop, kiv.
4.2 pi controller with variable gain
a program has been developed to compute the value of the variable integral gain kiv, using the following rule:

    if (ev <= evmin), kiv = kivmin;
    elseif (ev >= evmax), kiv = kivmax;
    else  % evmin < ev < evmax
        m = (kivmax - kivmin)/(evmax - evmin);
        c = kivmin - m*evmin;
        kiv = m*ev + c;
    end

where ev is the voltage error, evmin and evmax are the minimum and maximum values of the voltage error, kivmin and kivmax are the minimum and maximum values of kiv, c is a constant and m is the slope. fig. 4 shows the resulting curve of kiv against the terminal voltage error ev. the values of evmin and evmax are obtained by trial and error to give the best dynamic performance. the proportional gains (kpv and kpf) are also kept constant for the voltage and frequency controllers, respectively. various characteristics are tested to study the effect of changing the value of kiv on the voltage control. the simulation results cover the starting period and the period when the system is subjected to a sudden increase in the load, at the instant 8 sec. fig. 3 shows the simulation results for the variable kiv. figs. 5 and 6 show the variable voltage integral gain kiv and the variable frequency integral gain kif versus time, respectively.
fig. 3: dynamic response of the terminal voltage with different values of the integral gain for voltage control
fig. 4: variable integral gain for the pi controller
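a direct python counterpart of the section 4.2 rule may make the gain scheduling easier to test in isolation; the limits passed in the example call (a ±220 v error band, kiv between 5e−3 and 5.5e−3) are taken from the ranges quoted in section 6.1, while the sample error value is hypothetical.

    def scheduled_ki(ev, ev_min, ev_max, ki_min, ki_max):
        # piecewise-linear integral gain with saturation, mirroring the
        # section 4.2 rule (fig. 4): clamp outside [ev_min, ev_max],
        # interpolate linearly in between
        if ev <= ev_min:
            return ki_min
        if ev >= ev_max:
            return ki_max
        m = (ki_max - ki_min) / (ev_max - ev_min)   # slope of the linear segment
        c = ki_min - m * ev_min                     # intercept
        return m * ev + c

    print(scheduled_ki(10.0, -220.0, 220.0, 5.0e-3, 5.5e-3))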
5 a fuzzy logic controller (flc)
to design the fuzzy logic controller, the control engineer must gather information on how the artificial decision maker should act in the closed-loop system, and this is done from the knowledge base [10]. the fuzzy system is constructed from input fuzzy sets, fuzzy rules and output fuzzy sets, based on prior knowledge of the system. fig. 7 shows the basic construction of the flc. rules govern and execute the relations between the inputs and outputs of the system. every input and output parameter has a membership function, defined between the limits of these parameters through a universe of discourse. the better the adaptation of the fuzzy set parameters, the better the tuning of the fuzzy output. the proposed flc is used to compute and adapt the variable integral gain ki of the pi controller.
fig. 5: variable integral gain in the pi voltage controller with flc
fig. 6: variable integral gain in the pi frequency controller with flc
fig. 7: the three stages of the fuzzy logic controller
5.1 global input and output variables
for the voltage control, the fuzzy input vector consists of two variables: the terminal voltage deviation ev and the change of the terminal voltage deviation Δev. five linguistic variables are used for each of the input variables, as shown in fig. 8a and fig. 8b, respectively. the output variable fuzzy set is shown in fig. 8c, while fig. 8d shows the fuzzy surface. for the frequency control, the fuzzy input vector also consists of two variables: the mechanical power deviation ef and the change of the mechanical power deviation Δef. five linguistic variables are used for each of the input variables, as shown in fig. 9a and fig. 9b, respectively. the output variable fuzzy set is shown in fig. 9c, and fig. 9d shows the fuzzy surface. in figs. 8 and 9, linguistic labels have been used for the input variables: p for positive, n for negative, av for average, b for big and s for small. for example, pb is positive big and ns is negative small, etc.
fig. 8: a) membership function of the voltage error, b) membership function of the change in the voltage error, c) membership function of the variable kiv, d) fuzzy surface
fig. 9: a) membership function of the mechanical power error, b) membership function of the change in the mechanical power error, c) membership function of the variable kif, d) fuzzy surface
after constructing the fuzzy sets for the input and output variables, it is required to develop the set of rules, the so-called look-up table, which defines the relation between the input variables ev, ef, Δev and Δef and the output variable of the fuzzy logic controller. the output from the fuzzy controller is the integral gain value ki used in the pi controller. the look-up table is given in table 1.

table 1: look-up table of fuzzy set rules for voltage control (rows: voltage deviation change Δev; columns: voltage deviation ev)
Δev \ ev | nb | ns | av | ps | pb
nb | nb | nb | nb | ns | av
ns | nb | nb | ns | av | ps
av | nb | ns | av | ps | pb
ps | ns | av | ps | pb | pb
pb | av | ps | pb | pb | pb

5.2 the defuzzification method
the minimum-of-maximum method has been used to find the output of the fuzzy rules, representing a polyhedron map, as shown in fig. 10. first, the minimum membership grade is calculated from the minimum value of the intersection of the two input variables (x1 and x2) with the related fuzzy set in each rule. this minimum membership grade is used to rescale the output of the rule; then the maximum is taken (fig. 10). finally, the centroid (center of area) is used to compute the fuzzy output, which represents the defuzzification stage, as follows:
k_i = ∫ y·μ(y) dy / ∫ μ(y) dy.
more details about the variables of the above equation are given in [10].
6 simulation results
6.1 dynamic performance due to sudden load variation
the flc utilizes the terminal voltage error (ev) and its rate of change (Δev) as input variables for the voltage control. the output of the flc is used to tune ki of the pi controller. another flc is used to regulate the mechanical power via the blade angle adaptation of the wind turbine. figs. 8a, b, c and d depict the fuzzy sets of ev, Δev, kiv and the fuzzy surface, respectively. the terminal voltage error (ev) varies between −220 and 220, and its change (Δev) varies between −22 and 22. the output of the flc is kiv, which changes between 5e−3 and 5.5e−3. table 1 shows the look-up table of fuzzy set rules for the voltage control. the same technique is applied for the frequency controller, where the two inputs of the flc are the mechanical power error (ef), which varies between −1 and 1, and its change (Δef), which varies between −0.1 and 0.1. the output of the flc is kif, which changes between 4e−6 and 5e−6. the output of the pi-flc in the frequency controller adapts the pitch angle value to enhance the stator frequency. figs. 9a, b, c and d show the fuzzy sets of ef, Δef, the related output fuzzy set and the fuzzy surface, respectively.
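the whole inference chain of sections 5.1 and 5.2 (five triangular sets per input, the look-up table, min firing strength, max aggregation and a centroid) can be sketched compactly in python. the evenly spaced triangular sets and the discrete centroid over the output set centers are simplifying assumptions; the paper's exact membership shapes are only shown graphically in figs. 8 and 9.

    LABELS = ["nb", "ns", "av", "ps", "pb"]

    def tri(x, a, b, c):
        # triangular membership with peak at b and feet at a and c
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

    def memberships(x, lo, hi):
        # grades of x in five evenly spaced triangular sets on [lo, hi]
        step = (hi - lo) / 4.0
        peaks = [lo + k * step for k in range(5)]
        return {lab: tri(x, p - step, p, p + step) for lab, p in zip(LABELS, peaks)}

    # table 1; rows keyed by the ev label, columns ordered by the dev label
    # (the table is symmetric, so the orientation does not matter)
    RULES = {
        "nb": ["nb", "nb", "nb", "ns", "av"],
        "ns": ["nb", "nb", "ns", "av", "ps"],
        "av": ["nb", "ns", "av", "ps", "pb"],
        "ps": ["ns", "av", "ps", "pb", "pb"],
        "pb": ["av", "ps", "pb", "pb", "pb"],
    }

    def flc_ki(ev, dev, ki_lo=5.0e-3, ki_hi=5.5e-3):
        # min for rule firing, max aggregation per output set, then a
        # discrete centroid over the output set centers (a stand-in for
        # the center-of-area of section 5.2)
        mu_e = memberships(ev, -220.0, 220.0)
        mu_de = memberships(dev, -22.0, 22.0)
        strength = dict.fromkeys(LABELS, 0.0)
        for ei in LABELS:
            for dj, out in zip(LABELS, RULES[ei]):
                strength[out] = max(strength[out], min(mu_e[ei], mu_de[dj]))
        centers = [ki_lo + k * (ki_hi - ki_lo) / 4.0 for k in range(5)]
        num = sum(strength[lab] * c for lab, c in zip(LABELS, centers))
        den = sum(strength.values())
        return num / den if den else 0.5 * (ki_lo + ki_hi)

    print(flc_ki(ev=80.0, dev=-5.0))   # hypothetical operating point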
based on the mathematical model of the system, equipped with the two controllers (pi & flc) for the terminal voltage and the blade angle, the simulation is carried out using the matlab/simulink package. it runs for a pi controller with varying integral gain, finding a relation between the voltage or frequency error and the value of these gains. figs. 11, 12 and 13 show the simulation results for the terminal voltage for different loads; at time = 8 sec the system is subjected to a sudden change in load. two example rules read:
rule 1: if verror (x1) is ns and the change in verror (x2) is av, then the output (integral gain, y) is ns.
rule 2: if verror (x1) is av and the change in verror (x2) is ps, then the output (integral gain, y) is ps.
fig. 10: schematic diagram of the defuzzification method using the center of area
fig. 11: dynamic response of the terminal voltage for pi with and without flc
fig. 12: dynamic response of the terminal voltage for pi with and without flc
fig. 13: dynamic response of the terminal voltage for pi with and without flc
fig. 14 shows the stator frequency. the system is equipped with a conventional controller having fixed and variable integral gain, and with the flc algorithm. the proposed flc is used to adapt ki to give a better dynamic performance of the overall system, as shown in figs. 11, 12 and 13, regarding the p.o.s. and settling time compared with the fixed-gain pi and the pi with variable ki for different loads. figs. 15 and 16 depict the simulation results for the load current and the controller's duty cycle; the same conclusion is reached as explained for fig. 11.
6.2 dynamic performance due to sudden wind speed variation
further simulation results are obtained when the overall system is subjected to a sudden disturbance of the wind speed from 7 m/s to 15 m/s. figs. 17 and 18 show the simulation results for the wind speed variation and the stator frequency, respectively. the simulation given in fig. 18 shows the ability of the proposed controller to overcome the speed variation for both variable and fixed integral gain.
fig. 14: dynamic response of the stator frequency for pi with and without flc
fig. 15: dynamic response of the load current for pi with and without flc
7 conclusion
this paper presents an application of the flc to a self-excited induction generator driven by wind energy. the proposed flc is applied to the frequency and voltage controls of the system to enhance its dynamic performance. the flc is used to regulate the duty cycle of the switched capacitor bank to adjust the terminal voltage of the induction generator. the flc is also applied to regulate the blade angle of the wind turbine to control the stator frequency of the overall system. the simulation results show a better dynamic performance of the overall system using the flc controller than for the variable-gain pi type. a further simulation was conducted to study the dynamic performance of the system under a sudden wind speed disturbance; a comparison of the stator frequency dynamic performance was conducted with variable and fixed ki.
fig. 16: dynamic response of the duty cycle for pi with and without flc
fig. 17: sudden variation of the wind speed versus time
fig. 18: stator frequency according to the wind speed variation for the seig controlled by flc & pi
appendix: seig differential equations at no load
stator voltage equations (p = d/dt, ω_b the base speed, ω the reference-frame speed; ω = 0 in the stationary frame):
v_ds = r_s·i_ds − (ω/ω_b)·ψ_qs + (p/ω_b)·ψ_ds, (1)
v_qs = r_s·i_qs + (ω/ω_b)·ψ_ds + (p/ω_b)·ψ_qs, (2)
rotor voltage equations:
v_dr = r_r·i_dr − ((ω − ω_r)/ω_b)·ψ_qr + (p/ω_b)·ψ_dr, (3)
v_qr = r_r·i_qr + ((ω − ω_r)/ω_b)·ψ_dr + (p/ω_b)·ψ_qr, (4)
flux linkage equations for the stator and rotor components:
ψ_ds = x_ls·i_ds + x_m·(i_dr + i_ds), (5)
where ψ_ds is the stator flux linkage (wb) on the direct axis, i_dr the rotor current (a) on the direct axis and i_ds the stator current (a) on the direct axis,
ψ_qs = x_ls·i_qs + x_m·(i_qr + i_qs), (6)
where ψ_qs is the stator flux linkage on the quadrature axis, i_qr the rotor current on the quadrature axis and i_qs the stator current on the quadrature axis,
ψ_dr = x_lr·i_dr + x_m·(i_dr + i_ds), (7)
ψ_qr = x_lr·i_qr + x_m·(i_qr + i_qs), (8)
dψ_qs/dt = ω_b·(v_qs − r_s·i_qs), (9)
dψ_ds/dt = ω_b·(v_ds − r_s·i_ds), (10)
where ψ_dr, ψ_qr are the rotor flux linkages on the direct and quadrature axes, respectively, and ω_b is the base speed.
magnetizing reactance and load-case equations:
i_ds = c·dv_ds/dt + (v_ds − l_l·di_lds/dt)/r_l, (11)
i_qs = c·dv_qs/dt + (v_qs − l_l·di_lqs/dt)/r_l, (12)
i_m = [(i_qr + i_qs)² + (i_dr + i_ds)²]^0.5, (13)
t_e = ψ_ds·i_qs − ψ_qs·i_ds, (14)
x_m = 105.77 at 0 ≤ i_m ≤ 0.864, (15)
x_m = 340.2/(2.35 + i_m) at 0.864 < i_m ≤ 1.051, (16)
x_m = 227.4/(1.22 + i_m) at 1.051 < i_m ≤ 1.476, (17)
x_m = 202.3/(0.93 + i_m) at 1.476 < i_m ≤ 1.717, (18)
x_m = 179.8/(0.63 + i_m) at i_m > 1.717, (19)
excitation equations:
i_cd = c·p·v_cd − ω_s·c·v_cq, (20)
i_cq = c·p·v_cq + ω_s·c·v_cd, (21)
where ω_s is the synchronous speed (rad/sec), i_cd the capacitor current on the direct axis, i_cq the capacitor current on the quadrature axis and c the value of the capacitor bank,
c·dv_ds/dt = i_ds − i_lds, (22)
c·dv_qs/dt = i_qs − i_lqs, (23)
l_l·p·i_lds = v_ds − r_l·i_lds, (24)
l_l·p·i_lqs = v_qs − r_l·i_lqs, (25)
c·dv_ds/dt = i_ds − (v_ds − l_l·di_lds/dt)/r_l, (26)
c·dv_qs/dt = i_qs − (v_qs − l_l·di_lqs/dt)/r_l, (27)
where i_lds is the load current on the direct axis, i_lqs the load current on the quadrature axis, r_l the load resistance (Ω) and l_l the load inductance (h),
c_eff = c_max·[(1 − δ)² + δ²/α], (28)
where c_eff is the effective capacitor bank value (µf), c_max the maximum capacitor value, c_min the minimum capacitor value, α = c_max/c_min, and δ the duty cycle value.
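since the piecewise magnetizing curve of eqs. (15)–(19) had to be reconstructed from a badly garbled print, a small numerical check is useful: with the segment forms above, the curve is approximately continuous (to within a few tenths, i.e. within the rounding of the printed coefficients) at the four breakpoints, which supports the reconstruction.

    def magnetizing_reactance(i_m):
        # piecewise xm(im) of appendix eqs. (15)-(19), as reconstructed;
        # the segments join approximately continuously at the breakpoints
        if i_m <= 0.864:
            return 105.77
        if i_m <= 1.051:
            return 340.2 / (2.35 + i_m)
        if i_m <= 1.476:
            return 227.4 / (1.22 + i_m)
        if i_m <= 1.717:
            return 202.3 / (0.93 + i_m)
        return 179.8 / (0.63 + i_m)

    # continuity check at the segment boundaries
    for im in (0.864, 1.051, 1.476, 1.717):
        print(im, magnetizing_reactance(im - 1e-9), magnetizing_reactance(im + 1e-9))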
mechanical differential equations:
dω_r/dt = (ω_b/2h)·(t_m − t_e − b·ω_r), (29)
where ω_r is the rotor speed (rad/sec),
p_m = (1/8)·c_p·ρ·π·d²·v_w³, (30)
ω_m = 2πn/60, (31)
t_m = p_m/ω_m, (32)
c_p = (0.44 − 0.0167·β)·sin(π·(λ − 3)/(15 − 0.3·β)) − 0.00184·(λ − 3)·β, (33)
λ = ω_m·r/v_w = π·d·n/(60·v_w), (34)
dω_r/dt = (ω_b/2h)·(t_m − t_e − b_a·ω_r), (35)
where ω_m is the mechanical speed (rad/sec), p_m the mechanical power (kw), t_m the mechanical torque (nm), n the rotor revolutions per minute (rpm), c_p the power coefficient of the wind turbine, β the blade pitch angle (°), λ the tip speed ratio, v_w the wind speed (m/s), r the rotor radius of the wind turbine (m), d the rotor diameter of the wind turbine (m), b_a the friction factor, t_e the electrical torque (nm), π = 3.14 and ρ the air density (kg/m³).
references
[1] el-sousy, f., orabi, m., godah, h.: indirect field orientation control of self-excited induction generator for wind energy conversion. icit, december 2004.
[2] mashaly, a., sharaf, m., mansour, m., abd-satar, a. a.: a fuzzy logic controller for wind energy utilization scheme. proceedings of the 3rd ieee conference on control applications, august 24–26, 1994, glasgow, scotland, uk.
[3] wang, li, su, jian-yi: dynamic performance of an isolated self excited induction generator under various loading conditions. ieee transactions on energy conversion, vol. 15 (1999), no. 1, march 1999, p. 93–100.
[4] wang, li, lee, ching-huei: long-shunt and short-shunt connections on dynamic performance of a seig feeding an induction motor load. ieee transactions on energy conversion, vol. 14 (2000), no. 1, p. 1–7.
[5] atallah, a. m., adel, a.: terminal voltage control of self excited induction generators. sixth middle-east power systems conference mepcon’98, mansoura, egypt, december 15–17, 1998, p. 110–118.
[6] marduchus, c.: switched capacitor circuits for reactive power generation. ph.d. thesis, brunel university, 1983.
[7] soliman, h. f., attia, a. f., mokhymar, s. m., badr, m. a. l., ahmed, a. e. m. s.: dynamic performance enhancement of self excited induction generator driven by wind energy using ann controllers. sci. bull. fac. eng. ain shams university, part ii, vol. 39, no. 2, june 30, 2004, p. 631–651, issn 1110-1385.
[8] mokhymar, s. m.: enhancement of the performance of wind driven induction generators using artificial intelligence control. ph.d. thesis, fac. eng. ain shams university, march 10, 2005.
[9] ezzeldin, s. a., xu, w.: control design and dynamic performance analysis of a wind turbine-induction generator unit. ieee transactions on energy conversion, vol. 15 (2000), no. 1, march 2000, p. 91–96.
[10] passino, k. m., yurkovich, s.: fuzzy control. addison wesley longman, inc., 1998, isbn 0-201-18074-x.
associate prof. dr. ing. hussein f. soliman
e-mail: hfaridsoliman@yahoo.com
currently elect. & computer department
king abdulaziz university
faculty of engineering
jeddah, saudi arabia
dr. ing. abdel-fattah attia
e-mail: attiaa1@yahoo.com
national research inst. of astronomy and geophysics
p.o. 11421 helwan
cairo, egypt
dr. ing. mokhymar m. sabry
e-mail: sabry40@hotmail.com
electricity & energy ministry
new & renewable energy authority, wind management
cairo, egypt
prof. dr. ing. m. a. l. badr
elect. power and machines department
ain shams university
faculty of engineering
abbasia
cairo, egypt
acta polytechnica 61(5):617–623, 2021. https://doi.org/10.14311/ap.2021.61.0617
© 2021 the author(s). licensed under a cc-by 4.0 licence. published by the czech technical university in prague.
on reducing co2 concentration in buildings by using plants
ondřej franek∗, čeněk jarský
czech technical university in prague, faculty of civil engineering, department of construction technology, thákurova 7, 160 00 prague, czech republic
∗ corresponding author: ondrej.franek@fsv.cvut.cz
abstract. the article deals with the implementation of plants in the indoor environment of buildings to reduce the concentration of co2. based on a specified model representing the internal environment of an office space, it was studied whether the requirement for the total amount of ventilated air could be reduced by using plants, thereby achieving savings in operating costs in the building ventilation sector. the present research describes the effect of plant implementation for different levels of co2 concentration in the supply air, specifically 410 ppm corresponding to the year 2020, 550 ppm to the year 2050 and 670 ppm to the year 2100, as well as for different levels of co2 concentration in the indoor environment, namely 1000 ppm and 1500 ppm; the illumination of the plants in the indoor environment is constant in the model, with a ppfd of 200 µmol m−2 s−1. based on the computational model, it was found that the implemented plants can positively influence the requirement for the total amount of ventilated air; the most significant effect occurs in the case of a low indoor environment quality, with a co2 concentration of 1500 ppm, and a high supply air quality of 410 ppm. the simulation also showed that, compared to 2020, by the year 2100 it will be necessary to increase the ventilation of the indoor environment by 25.1 % to ensure the same quality of the indoor environment.
keywords: carbon dioxide, climate change, indoor greening, indoor air quality, building ventilation.
1. introduction
a building's ventilation is essential for maintaining the quality of the indoor environment, especially with regard to the current concentration of co2. it is known that an increased co2 concentration brings a significant deterioration of work efficiency [1, 2]. the most significant impact of a high concentration of co2 is on work performance that is directly associated with the worker's concentration, such as initiative and strategic decision-making. for these activities, it is absolutely essential to keep the co2 concentration ideally between 600–1000 ppm [3, 4], whereas at significantly higher concentrations, around 2500 ppm, the concentration of workers is significantly reduced and their ability to work can be described as insufficient and non-functional [1]. for these reasons, it is necessary to keep the co2 concentration in the workplace within acceptable limits, which is ensured by the supply of outside air with a low concentration of co2 to the indoor environment of buildings. however, the outdoor concentration of co2 has increased significantly during the last century. in the years 1900–2000, the concentration ranged from 280 to 400 ppm; since 2000, values exceeding 400 ppm are common [5], and today the outdoor concentration of co2 is typically around 410 ppm [6].
based on the performed studies, it is expected that the outdoor concentration of co2 will continue to rise; it is estimated to reach 550 ppm in the year 2050, and 670 ppm in 2100 [7, 8]. this finding raises a potential problem in the future with maintaining the required quality of the indoor environment in terms of co2 concentration with an economic sustainability of operation, especially in situations with a requirement for a high quality indoor environment. of the total energy consumption for the operation of office buildings today, approximately 40 % is spent on the treatment of ventilated air [9]. an interesting solution may be the implementation of green plants in the indoor environment of buildings, in order to passively improve the air quality and thus reduce the total amount of air supplied to the indoor environment of the building, ideally to the maximum degree of optimization, while preventing the shortcomings of the design itself [10]. it has previously been shown that the implementation of greenery can have a positive effect on indoor humidity, while it can also have a positive effect on the reduction of co2 concentration in naturally ventilated buildings [11]. the implementation of plants in the indoor environment is fundamentally affected by the degree of illumination, namely a sufficient photosynthetic photon flux density (ppfd). in an indoor environment, ppfds of less than 10 µmol m−2 s−1 can be considered low light levels, while ppfd values around 50 µmol m−2 s−1 can be considered high light levels [11]. with the help of light sources, even very high levels of illumination can be achieved in the indoor environment, higher than 200 µmol m−2 s−1. the design of the light source should consider the area that it should reliably illuminate. the basic choices of lighting sources include led, par or cycloptics technologies [12]. led lamps are distinguished by their ability to illuminate a smaller area intensely; however, they are not suitable for illuminating larger areas with only one source and must comprise several smaller sources so as to cover the entire area addressed. cycloptics sources can illuminate a much larger area from a single luminaire with a similar power to approximately the same extent; their disadvantage is that their local intensity is lower. a certain compromise are par sources, which, at a similar power, have approximately half the local intensity compared to led sources, but at the same time a slightly better coverage of the area [13]. in the design stage of the luminaires, it must also be ensured that the leaves are not overburdened, especially by the heat radiated from the light reflector; in the case of led lighting, local overexposure must also be taken into account, as it could result in the gradual death of the plant [14]. when selecting suitable plants for the indoor environment, it is necessary to monitor the light compensation point (lcp); plants with a very low lcp are typically hedera helix with an lcp of 30.9 µmol m−2 s−1 and spathiphyllum wallissii verdi with an lcp of 20.1 µmol m−2 s−1 [12].

table 1. net co2 assimilation per m2 of the plant's leaf at ppfd 200 µmol m−2 s−1 [12].
cultivar | lcp [µmol m−2 s−1] | metabolism | net co2 assimilation [g h−1 m−2]
hedera helix | 30.9 | c3 | 0.998
spathiphyllum wallissii verdi | 20.1 | c3 | 0.325
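the table's two unit systems are linked by a simple conversion (later formalised as relation (3) in section 2.3); the following sketch backs out the molar assimilation rates implied by the tabulated gram-per-hour values, with the molar mass of co2 taken as 44.01 g/mol.

    M_CO2 = 44.01e-6   # molar mass of co2 [g per umol]

    def umol_to_g_per_h(a_umol_m2_s):
        # umol m-2 s-1  ->  g h-1 m-2: multiply by the molar mass and 3600 s/h
        return a_umol_m2_s * M_CO2 * 3600.0

    # back out the molar rates implied by the two table 1 cultivars
    for name, g_per_h in (("hedera helix", 0.998),
                          ("spathiphyllum wallissii verdi", 0.325)):
        print(name, round(g_per_h / (M_CO2 * 3600.0), 2), "umol m-2 s-1")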
it can clearly be seen from table 1 that at a ppfd of 200 µmol m−2 s−1, hedera helix is able to assimilate 0.998 g h−1 m−2 of co2 and spathiphyllum wallissii verdi assimilates 0.325 g h−1 m−2 of co2 [12]. the above ability of plants to assimilate co2 from the indoor environment applies to an ambient temperature of 25 °c, a relative humidity of 35–45 % and a co2 concentration of 400–450 ppm. it can be assumed that with an increasing co2 concentration in the indoor environment, the ability of plants to assimilate co2 also increases, but it increases fundamentally with the ppfd; for this reason, the effect of a high concentration of co2 on the plant's ability to assimilate co2 can be neglected in the indoor environment [12]. the most important producers of co2 in the indoor environment are the users themselves (i.e., people), who produce approximately 31.5 g h−1 of co2 per person during administrative activities [15].
2. materials and methods
2.1. method description
to determine the theoretical influence of green plants implemented in the indoor environment on the total amount of air supplied from the outdoor environment, a basic model of an office room was defined. before starting the simulation, all internal parameters of the indoor environment that are necessary for determining the plant's ability to reduce co2 are defined. the methodology specifically examines the total amount of supplied air for different combinations of situations depending on the presence of plants and the quality of the supply air. subsequently, based on the simulation, it determines the differences in the total requirement for the amount of air supplied to the indoor environment depending on the presence of a defined number of plants and a defined quality of the indoor and outdoor environment in terms of the mass concentration of co2. the simulation clearly shows the differences between the individual situations, and based on this, evaluations can be made from the point of view of plant implementation efficiency.
2.2. model description
an office room with a total area of 24.1 m2, a height of 3.1 m and a volume of 74.7 m3 was chosen as the model environment for simulating the development of the co2 concentration. for the purposes of the simulation, it is assumed that this room is force-ventilated by a central ventilation system, while the amount of supply air is controlled according to the current co2 concentration in the room, based on a co2 concentration sensor. the room has 3 permanent posts, occupied by administrative staff from 8:00 to 16:00. for the purposes of the model, 2 investigated conditions are considered from the point of view of the maximum permissible concentration. the first limit state specifies the maximum concentration of co2 in the indoor environment to be 1000 ppm, which is generally considered acceptable for administrative work. the second limit state specifies the maximum concentration of co2 in the indoor environment to be 1500 ppm; this concentration is considered the maximum permissible from the point of view of the ability to perform administrative activities. at the beginning of the working hours, an initial concentration corresponding to the set modelling limit is considered.
figure 1. scheme of the model environment.
the temperature of the indoor environment in office buildings can generally be considered stable; for the purposes of the model environment, the temperature is considered in the range of 19–25 °c, and the relative humidity of the indoor environment is considered to be 35 % to 55 %. the atmospheric pressure is set at 101.3 kpa. the illumination of the implemented plants is considered at a ppfd value of 200 µmol m−2 s−1. the simulation assumes the placement of the hedera helix plant, which under these conditions is able to reduce the co2 value by approximately 0.998 g h−1 m−2; the placement of 3 m2 of green leaves of this plant is considered, which corresponds to 1 m2 of leaf per 1 administrative worker. a schematic representation of the model environment can be seen in fig. 1. the supply of air to the indoor environment is considered by means of air distribution, while for the purposes of the modelling, variable values of the quality of the supply air from the outdoor environment are considered according to the expected development of the co2 concentration in the earth's atmosphere: a value of 410 ppm is considered for the year 2021, 550 ppm for the year 2050 and 670 ppm for the year 2100. local negative influences on the supply air are neglected for the purposes of the modelling; such influences can occur in the real environment especially where the air intake from the outside environment is near traffic routes or located in polluted industrial zones. air penetration due to leaks in the building envelope or office operations is also neglected.
2.3. computational relations
basic computational relations are used to simulate the development of the internal environment. the requirement for the amount of supplied air is determined as
q_sup = (v_in/t) × (c_req − c_in)/(c_out − c_in), (1)
where q_sup [m3 h−1] is the requirement for the amount of air supplied to the office, v_in [m3] is the volume of the air in the room, t [h] is a defined period of time, c_req [g m−3] is the desired pollutant concentration, where c_req ∈ (c_out, c_in), c_in [g m−3] is the current pollutant concentration inside the room, and c_out [g m−3] is the pollutant concentration of the supply air. to determine the actual concentration c_in in the indoor environment, it is necessary to proceed from relation (2):
c_in = m_ori + m_per − m_pl − m_vent, (2)
where c_in [g m−3] is the current pollutant concentration inside the room, m_ori [g m−3] is the initial indoor pollutant mass, m_per [g m−3] is the indoor pollutant mass excess due to the workers' activity, m_pl [g m−3] is the indoor pollutant mass loss due to the plant reduction, and m_vent [g m−3] is the indoor pollutant mass loss due to the air ventilation. to express the photosynthetic reaction in grams, relation (3) is used:
a_ph,g = a_ph,mol × m_co2 × 3600, (3)
where a_ph,g [g m−2 h−1] and a_ph,mol [µmol m−2 s−1] express the level of pure co2 assimilation by 1 m2 of green leaves, and m_co2 [g µmol−1] is the molar mass of co2.
3. results
3.1. simulation results
based on the performed computational simulation, it was shown that, to ensure the same level of co2 concentration in the indoor environment, the co2 concentration of the supply air from the outdoor environment has a very significant effect.
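relations (1)–(3) can be transcribed literally into python together with the model parameters of section 2.2. the sketch below is a literal transcription only: the paper's sensor-driven control loop, its time stepping and its averaging window are not fully specified, so it is not expected to reproduce the table 2 values exactly. the ideal-gas ppm-to-mass conversion at 25 °c and 101.3 kpa is an added assumption.

    R, T, P, M = 8.314, 298.15, 101300.0, 44.01   # si units; 25 degc, 101.3 kpa assumed

    def ppm_to_g_m3(ppm):
        # ideal-gas conversion of a co2 volume fraction [ppm] to [g/m3]
        return ppm * 1e-6 * M * P / (R * T)

    def q_sup(v_in, t, c_req, c_in, c_out):
        # relation (1): supply-air demand [m3/h] for room volume v_in [m3]
        # over the period t [h]; concentrations in [g/m3]
        return v_in / t * (c_req - c_in) / (c_out - c_in)

    def step_c_in(c_in, q, dt, v_in=74.7, c_out_ppm=410.0,
                  workers=3, per_worker=31.5, leaf_m2=3.0, leaf_rate=0.998):
        # one explicit time step of relation (2): occupant production,
        # plant uptake (hedera helix at ppfd 200) and ventilation removal
        m_per = workers * per_worker * dt / v_in
        m_pl = leaf_m2 * leaf_rate * dt / v_in
        m_vent = q * dt / v_in * (c_in - ppm_to_g_m3(c_out_ppm))
        return c_in + m_per - m_pl - m_vent

    # demand to pull the room from 1500 ppm toward 1000 ppm within one hour,
    # with 410 ppm supply air (values as in the model description)
    q = q_sup(74.7, 1.0, ppm_to_g_m3(1000.0), ppm_to_g_m3(1500.0), ppm_to_g_m3(410.0))
    print(round(q, 1), "m3/h")
    c = step_c_in(ppm_to_g_m3(1500.0), q, dt=0.1)   # advance the room state by 6 min
    print(round(c / ppm_to_g_m3(1.0)), "ppm")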
3. results
3.1. simulation results
based on the performed computational simulation, it was shown that the co2 concentration of the supply air from the outdoor environment has a very significant effect on the amount of air needed to ensure a given level of co2 concentration in the indoor environment.

c_req [ppm]   c_out [ppm]   without plants [m3 h−1]   with plants, ppfd 200 µmol m−2 s−1 [m3 h−1]   difference [%]
1000          670           50.627                    50.462                                        0.33
1000          550           45.315                    45.135                                        0.40
1000          410           40.373                    40.186                                        0.46
1500          670           34.014                    33.828                                        0.55
1500          550           31.531                    31.347                                        0.58
1500          410           29.056                    28.877                                        0.62

table 2. the resulting simulation values according to the co2 concentration in the supply air, the required co2 concentration, and the presence of plants.

the implementation of 1 m2 of plants per person in the simulated area has an interesting effect. a total of 3 m2 of green leaves slightly favourably affects the co2 concentration in the room and is able to reduce the required amount of supply air, although only by tenths of a percent. the specific results of the performed simulation for the various required co2 concentrations in the room, the specific levels of the outdoor co2 concentration and the implementation of plants are shown in tab. 2. from tab. 2, it is evident that, in order to maintain a concentration of 1000 ppm in the indoor environment without any plant implementation, the simulated area has to be supplied with 40.3 m3 h−1 for a co2 concentration of 410 ppm in the supply air, with 45.3 m3 h−1 for a supply air concentration of 550 ppm, and with 50.6 m3 h−1 for a supply air concentration of 670 ppm. to maintain a concentration of 1500 ppm in the indoor environment, a supply of 29.1 m3 h−1 is needed for a co2 concentration of 410 ppm in the supply air, 31.5 m3 h−1 for 550 ppm and 34.0 m3 h−1 for 670 ppm. the implementation of plants in the indoor environment with a total amount of 3 m2 of green leaves (corresponding to 1 m2 of green leaves per person) has only a minimal effect on the total requirement for the amount of supplied air, in the order of tenths of a percent. the most significant effect of the plants is observable when the simulated area is ventilated less (i. e. at higher co2 concentrations in the indoor environment). in the environment with the specified co2 concentration limit of 1000 ppm, the most significant effect of the plants is observable for air supplied from the outdoor environment with a co2 concentration of 410 ppm, where the implementation of plants can reduce the demand for supply air by 0.46 %. similarly, in an environment with a co2 concentration limit of 1500 ppm, the implementation of plants can reduce the demand for supply air by 0.62 % for supply air from the outdoor environment with a co2 concentration of 410 ppm. as the co2 concentration in the supply air increases, more air must be supplied to the model environment, and the effect of a constant amount of green leaves capable of a photosynthetic reaction decreases proportionally.

3.2. influence of the environmental concentration
the present simulation describes the basic theoretical assumption that, with an increasing co2 concentration in the outdoor environment, the operation of a building requires a larger amount of air to be ventilated into the indoor environment to ensure the required indoor co2 concentrations. such a trend is evident from fig. 2. from fig. 2, it can be seen that, while in the year 2021 it is sufficient to supply 40.4 m3 h−1 from the outdoor environment to maintain an indoor concentration of 1000 ppm co2, by the year 2050 this will increase to 45.3 m3 h−1, which represents an increase of 12.1 %, and by the year 2100 to 50.6 m3 h−1, which represents an increase of 25.2 % when compared to the year 2021. a similar trend applies to maintaining an indoor co2 concentration of 1500 ppm.
in the year 2021, it is necessary to supply 29.1 m3 h−1; 31.5 m3 h−1 will be needed in the year 2050, which represents an 8.2 % increase, and 34.0 m3 h−1 in the year 2100, which represents a 16.8 % increase when compared to the year 2021.

3.3. influence of plant implementation into the indoor environment
the results of the simulation showed that the implementation of living plants in the indoor environment can favourably affect the indoor co2 concentration, although, for the amount of green leaf area and the lighting level defined in the simulation, only in the order of tenths of a percent. the potential for reducing the supply air demand is shown in fig. 3 and fig. 4.

figure 2. development of the amount of ventilated air depending on the concentration of co2 in the supply air and according to the required concentration of co2 in the indoor environment.
figure 3. influence of the implementation of plants into the indoor environment on the total amount of ventilated air with the maximum c_req concentration of 1000 ppm.
figure 4. influence of the implementation of plants into the indoor environment on the total amount of ventilated air with the maximum c_req concentration of 1500 ppm.

it is clear from fig. 3 that, for the given parameters of the model environment, i.e. the implementation of 1 m2 of plants per worker illuminated at a ppfd of 200 µmol m−2 s−1, the amount of air supplied to the office environment slightly decreases. at the required limit concentration of 1000 ppm, in the case of outdoor air with a co2 concentration of 410 ppm, the plant implementation is able to reduce the supply air requirement by 0.46 %; at an outdoor co2 concentration of 550 ppm, the supply air requirement is reduced by 0.40 %, and at an outdoor co2 concentration of 670 ppm, the requirement is reduced by 0.33 %. it is obvious that, in general, with a decreasing rate of indoor air exchange, the relative efficiency of the implemented plants increases. if the limit concentration of co2 in the indoor environment is set at 1500 ppm, it is generally sufficient to ventilate less to achieve the desired concentration, which increases the efficiency of the implemented plants. from fig. 4, it can be seen that, to maintain a co2 concentration of 1500 ppm in the indoor environment, the plant implementation can reduce the supply air requirement by 0.62 % for 410 ppm of co2 in the supply air, by 0.58 % for 550 ppm of co2 in the supply air, and by 0.55 % for a concentration of 670 ppm of co2 in the supply air.
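the percentage differences quoted above can be checked directly from the flows in tab. 2; the short script below is our own convenience check, not part of the paper's methodology, and uses the unrounded table values.

```python
# arithmetic check of the trends in sections 3.1-3.2, using the flows of tab. 2
supply_1000 = {2021: 40.373, 2050: 45.315, 2100: 50.627}  # m3/h, c_req = 1000 ppm
supply_1500 = {2021: 29.056, 2050: 31.531, 2100: 34.014}  # m3/h, c_req = 1500 ppm

for label, flows in (("1000 ppm", supply_1000), ("1500 ppm", supply_1500)):
    base = flows[2021]
    for year in (2050, 2100):
        rise = 100.0 * (flows[year] - base) / base
        print(f"c_req {label}, year {year}: +{rise:.1f} %")
# the unrounded flows give +12.2/+25.4 % (1000 ppm) and +8.5/+17.1 % (1500 ppm),
# close to the +12.1/+25.2 % and +8.2/+16.8 % quoted from the rounded values
```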
4. discussion
the simulation theoretically showed a trend that, with an increasing co2 concentration in the outdoor environment, it will be necessary to ventilate buildings with more air to ensure the required co2 concentration in the indoor environment. the results show that these are very significant differences, which may have a significant impact on the operating costs of buildings in the future. almost alarming is the fact that, in the year 2100, it will be necessary to increase the ventilation of the indoor environment by about 25 % to maintain a co2 concentration of 1000 ppm, as compared to the year 2021. it has to be assumed that, to ensure a higher quality of the indoor environment, i.e. a lower co2 concentration (e. g. 800 ppm), it will be necessary to supply even more outdoor air, while the percentage difference will increase even further. the implementation of plants in the indoor environment of buildings has the potential to slightly reduce the indoor co2 concentration, but it is affected by many parameters. the most important parameters include the total possible area of greenery implemented in the indoor environment and the provision of sufficient lighting; as the area of green leaves in the indoor environment increases, the total amount of assimilated co2 in the indoor environment increases proportionally. from the point of view of the development of co2 in the external environment, the question also arises as to how spaces with a requirement for a high quality of the indoor environment (i. e., with a requirement for a low co2 concentration, such as 600 ppm) will be ventilated. it follows from the basic limiting conditions of the modelling that, if the co2 concentration in the air supplied to the indoor environment is higher than the required concentration, the required concentration cannot be achieved and the environment will not meet the quality requirements. another issue that needs to be addressed is how to design the plant lighting so as to ensure a maximum efficiency of the luminaire from the point of view of ppfd while minimizing the cost of operating such a lighting source.

5. conclusion
based on the simulation, it was theoretically demonstrated that plants in indoor conditions can contribute to the co2 reduction and slightly reduce the requirement for the total amount of ventilated air, if the ventilation is controlled by a sensor based on the current co2 concentration in the room. the most significant share of the plants in reducing the demand is evident at lower ventilation levels, i. e., in the case of a low co2 concentration in the supply air (c_out = 410 ppm) or at a lower required quality of the indoor environment (c_req = 1500 ppm). an increasing co2 concentration in the supply air (c_out = 550 ppm and c_out = 670 ppm) results in a deterioration of the overall efficiency of the implemented plants; their effect on the total amount of supply air is minimized with a deteriorating quality of the supplied air. the same applies to an increasing requirement for the quality of the indoor environment, i.e. a lower required concentration of c_req = 1000 ppm, where the percentage efficiency of the plants in the simulated environment is considerably smaller than in the case of the required co2 concentration c_req = 1500 ppm. for a higher level of co2 assimilation by plants in the indoor environment, it is necessary to adjust the input parameters, especially the lighting conditions (ppfd > 200 µmol m−2 s−1), or to increase the overall amount of greenery implemented in the indoor environment. although the simulation shows a theoretically minimal impact of plants in the indoor environment of buildings from the point of view of ventilation, the ability to reduce the ventilation requirement by tenths of a percent is, in the long run, very important in terms of potential operational savings. on a global scale, this is an important area, as plants can slightly offset the ever-increasing amount of air that has to be supplied to the indoor environment of buildings to ensure the same indoor environmental quality.
further research into the implementation of greenery in the indoor environment of buildings and into its ability to assimilate co2 is appropriate, especially with regard to the potential for significant savings on a global scale, with the aim of finding optimal options for the implementation of plants in the indoor environment.

acknowledgements
the authors would like to express their gratitude to the czech technical university in prague. this study was financially supported by the grant sgs21/007/ohk1/1t/11 of the czech technical university in prague.

references
[1] u. satish, m. j. mendell, k. shekhar, et al. is co2 an indoor pollutant? direct effects of low-to-moderate co2 concentrations on human decision-making performance. environmental health perspectives 120(12):1671–1677, 2012. https://doi.org/10.1289/ehp.1104789.
[2] j. g. allen, p. macnaughton, u. satish, et al. associations of cognitive function scores with carbon dioxide, ventilation, and volatile organic compound exposures in office workers: a controlled exposure study of green and conventional office environments. environmental health perspectives 124(6):805–812, 2016. https://doi.org/10.1289/ehp.1510037.
[3] c.-y. lu, j.-m. lin, y.-y. chen, y.-c. chen. building-related symptoms among office employees associated with indoor carbon dioxide and total volatile organic compounds. international journal of environmental research and public health 12(6):5833–5845, 2015. https://doi.org/10.3390/ijerph120605833.
[4] d.-h. tsai, j.-s. lin, c.-c. chan. office workers' sick building syndrome and indoor carbon dioxide concentrations. journal of occupational and environmental hygiene 9(5):345–351, 2012. https://doi.org/10.1080/15459624.2012.675291.
[5] l.-g. hersoug, a. sjödin, a. astrup. a proposed potential role for increasing atmospheric co2 as a promoter of weight gain and obesity. nutrition and diabetes 2(3):e31, 2012. https://doi.org/10.1038/nutd.2012.2.
[6] noaa: national oceanic and atmospheric administration. global monitoring laboratory: earth system research laboratories. noaa/gml and scripps institution of oceanography, 2021. [2021-04-07], https://www.esrl.noaa.gov/gmd/ccgg/trends/data.html.
[7] b. i. mcneil, t. p. sasse. future ocean hypercapnia driven by anthropogenic amplification of the natural co2 cycle. nature 529(7586):383–386, 2016. https://doi.org/10.1038/nature16156.
[8] c. b. field, v. r. barros, d. j. dokken, et al. climate change 2014: impacts, adaptation, and vulnerability, chap. human health: impacts, adaptation, and co-benefits, pp. 709–754. cambridge university press, cambridge, 2014. https://doi.org/10.1017/cbo9781107415379.016.
[9] d. ürge-vorsatz, l. f. cabeza, s. serrano, et al. heating and cooling energy trends and drivers in buildings. renewable and sustainable energy reviews 41:85–98, 2015. https://doi.org/10.1016/j.rser.2014.08.039.
[10] m. tuháček, o. franek, p. svoboda. application of fmea methodology for checking of construction's project documentation and determination of the most risk areas. acta polytechnica 60(5):448–454, 2020. https://doi.org/10.14311/ap.2020.60.0448.
[11] d. tudiwer, a. korjenic. the effect of an indoor living wall system on humidity, mould spores and co2 concentration. energy and buildings 146:73–86, 2017. https://doi.org/10.1016/j.enbuild.2017.04.048.
[12] c. gubb, t. blanusa, a. griffiths, c. pfrang. can houseplants improve indoor air quality by removing co2 and increasing relative humidity?
air quality, atmosphere & health 11(10):1191–1201, 2018. https://doi.org/10.1007/s11869-018-0618-9.
[13] j. a. nelson, b. bugbee, d. a. campbell. economic analysis of greenhouse lighting: light emitting diodes vs. high intensity discharge fixtures. plos one 9(6):e99010, 2014. https://doi.org/10.1371/journal.pone.0099010.
[14] j. a. nelson, b. bugbee, z.-h. chen. analysis of environmental effects on leaf temperature under sunlight, high pressure sodium and light emitting diodes. plos one 10(10):e0138930, 2015. https://doi.org/10.1371/journal.pone.0138930.
[15] ansi/ashrae 62.1-2013: ventilation for acceptable indoor air quality. informative appendix c - rationale for minimum physiological requirements for respiration air based on co2 concentration. ashrae, atlanta, 2013.

https://doi.org/10.14311/ap.2022.62.0001
acta polytechnica 62(1):1–7, 2022. © 2022 the author(s). licensed under a cc-by 4.0 licence, published by the czech technical university in prague.

conserved quantities in non-hermitian systems via vectorization method
kaustubh s. agarwal, jacob muldoon, yogesh n. joglekar*
indiana university purdue university indianapolis (iupui), indianapolis, indiana 46202, u.s.a.
* corresponding author: yojoglek@iupui.edu

abstract. open classical and quantum systems have attracted great interest in the past two decades. these include systems described by non-hermitian hamiltonians with parity-time (pt) symmetry that are best understood as systems with balanced, separated gain and loss. here, we present an alternative way to characterize and derive conserved quantities, or intertwining operators, in such open systems. as a consequence, we also obtain non-hermitian or hermitian operators whose expectation values show single exponential time dependence. by using a simple example of a pt-symmetric dimer that arises in two distinct physical realizations, we demonstrate our procedure for static hamiltonians and generalize it to time-periodic (floquet) cases where intertwining operators are stroboscopically conserved. inspired by the lindblad density matrix equation, our approach provides a useful addition to the well-established methods for characterizing time-invariants in non-hermitian systems.

keywords: parity-time symmetry, pseudo-hermiticity, conserved quantities.

1. introduction
since the seminal discovery of bender and coworkers in 1998 [1], non-hermitian hamiltonians h with real spectra have become a subject of intense scrutiny [2–4].
the initial work on this subject focused on taking advantage of the reality of the spectrum to define a complex extension of quantum theory [5] in which the traditional dirac inner product is replaced by a hamiltonian-dependent (cpt) inner product. soon it became clear that this process can be thought of as identifying positive definite operators η̂ ≥ 0 that intertwine with the hamiltonian [6–8], i.e. η̂h = h†η̂, and that a non-unique complex extension of standard quantum theory is generated by each positive definite η̂ [9, 10]. these mathematical developments were instrumental in elucidating the role played by non-hermitian, self-adjoint operators, biorthogonal bases, and non-unitary similarity transformations that change an orthonormal basis set into a non-orthogonal, but linearly independent, basis set in physically realizable classical and quantum models [11].

a decade later, this mathematical approach gave way to experiments with the recognition that non-hermitian hamiltonians that are invariant under the combined operations of parity and time-reversal (pt) represent open systems with balanced gain and loss [12–15]. the spectrum of a pt-symmetric hamiltonian h_pt(γ) is purely real when the non-hermiticity γ is small. with increasing γ, a level attraction and the resulting degeneracy turn the spectrum into complex-conjugate pairs when the non-hermiticity exceeds a nonzero threshold γ_pt [16]. this transition is called the pt-symmetry breaking transition, and at the threshold γ_pt the algebraic multiplicity of the degenerate eigenvalue is larger than its geometric multiplicity, i.e. it is an exceptional point (ep) [17]. fueled by this physical insight, the past decade has seen an explosion of experimental platforms, usually in classical wave systems, where effective pt-symmetric hamiltonians with balanced gain and loss have been realized. they include evanescently coupled waveguides [18], fiber loops [19], microring resonators [20, 21], optical resonators [22], electrical circuits [23–25], and mechanical oscillators [26]. the key characteristics of this transition, driven by the non-orthogonality of the eigenstates, are also seen in systems with mode-selective losses [27–29]. in the past two years, these ideas have been further extended to minimal quantum systems, thereby leading to the observation of pt-symmetry breaking and attendant phenomena in a single spin [30], a single superconducting transmon [31], ultracold atoms [32], and quantum photonics [33].

we remind the reader that the effective hamiltonian approach requires the dirac inner product, and is valid in both the pt-symmetric and pt-broken regions. apropos, the non-unitary time evolution generated by the effective h_pt signals the fact that the system under consideration is open. in this context, every intertwining operator η̂ – positive definite or not – represents a time-invariant of the system. in other words, although the state norm ⟨ψ(t)|ψ(t)⟩ or the energy ⟨ψ(t)|h_pt|ψ(t)⟩ of a state |ψ(t)⟩ = exp(−ih_pt t)|ψ(0)⟩ of a pt-symmetric system is not conserved [8], the expectation values ⟨ψ(t)|η̂|ψ(t)⟩ remain constant in time. for a system with n degrees of freedom, a complete characterization of the intertwining operators for a given system is carried out by solving the set of n² simultaneous, linear equations

$$\hat{\eta} h_{pt} = h_{pt}^{\dagger} \hat{\eta}. \quad (1)$$
in the past, several different avenues have been used to obtain these conserved quantities. they include spectral decomposition methods [8, 34], an explicit recursive construction that generates a tower of intertwining operators [25, 35], the sum-rules method [36], and the stokes parametrization approach for a pt-symmetric dimer [37]. here, we present yet another approach to the problem, and illustrate it with two simple examples. the plan of the paper is as follows. in section 2, we present the eigenvalue-equation approach for intertwining operators and the details of the vectorization scheme. this method is valid for any finite-dimensional pt-symmetric hamiltonian. in section 3, we present the results of such an analysis for a quantum pt-symmetric dimer with static or time-periodic gain and loss. corresponding results for a classical pt-symmetric dimer are presented in section 4. we conclude the paper with a brief discussion in section 5.

2. intertwining operators as an eigenvalue problem
for a pt-symmetric system undergoing coherent but non-unitary dynamics with a static hamiltonian h_pt, the expectation value of an operator η̂ satisfies the following first-order differential equation, linear in η̂:

$$\partial_t \langle\psi(t)|\hat{\eta}|\psi(t)\rangle = -i \langle\psi(t)| \,\hat{\eta} h_{pt} - h_{pt}^{\dagger}\hat{\eta}\, |\psi(t)\rangle. \quad (2)$$

this equation is reminiscent of the gorini–kossakowski–sudarshan–lindblad (gksl) equation [38, 39] (henceforth referred to as the lindblad equation) that describes the dynamics of the reduced density matrix of a quantum system coupled to a much larger environment [40–42]. interpreting η̂ as an n × n matrix, all η̂s that satisfy eq. (2) can be obtained from the corresponding eigenvalue problem

$$e_k \hat{\eta}_k = -i(\hat{\eta}_k h_{pt} - h_{pt}^{\dagger}\hat{\eta}_k) \equiv l\hat{\eta}_k, \quad (3)$$

for 1 ≤ k ≤ n². we vectorize the matrix η̂ into an n²-sized column vector |η^v⟩ by stacking its columns, i.e. [η̂]_{pq} → η^v_{p+(q−1)n} [43]. under this vectorization, the hilbert–schmidt trace inner product carries over to the dirac inner product, tr(η̂₁†η̂₂) = ⟨η₁^v|η₂^v⟩, where ⟨η₁^v| is the hermitian-conjugate row vector obtained from the column vector |η₁^v⟩. using the identity a η̂ b → (bᵀ ⊗ a)|η^v⟩, the eigenvalue problem eq. (3) becomes det(l − e 1_{n²}) = 0, where the n² × n² "liouvillian" matrix is given by

$$l = -i\left[ h_{pt}^{t} \otimes \mathbb{1}_n - \mathbb{1}_n \otimes h_{pt}^{\dagger} \right], \quad (4)$$

and 1_m is the m × m identity matrix. thus, the intertwining operators are the distinct eigenvectors |η_m^v⟩ with zero eigenvalue in eq. (3). the n² eigenvalues of the liouvillian l are simply related to the n eigenvalues ϵ_m of h_pt as

$$e_{pq} = -i(\epsilon_p - \epsilon_q^*). \quad (5)$$

since the spectrum of h_pt is either real (ϵ_p = ϵ_p^*) or comes in complex-conjugate pairs (ϵ_p = ϵ_q^* for some pair), there are n zero eigenvalues of l when h_pt has no symmetry-driven degeneracies; the number of zero eigenvalues grows to n² if the hamiltonian is proportional to the identity matrix [34]. this analysis also provides a transparent way to construct the corresponding intertwining operators via the spectral decomposition of h_pt [8]. note that when e = 0, due to the linearity of the intertwining relation, eq. (1), we can, without loss of generality, choose the n intertwining operators η̂_m to be hermitian. so what is the advantage of this approach? for one, it gives us n(n − 1) other, generally non-hermitian, operators whose expectation value in any arbitrary state evolves simply exponentially in time.
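the vectorization step translates directly into a few lines of numerical linear algebra. the following numpy sketch is our own illustration (the function names are not from the paper): it builds the liouvillian of eq. (4) and extracts the zero-eigenvalue modes as intertwining operators.

```python
# minimal numpy sketch of section 2: zero modes of the liouvillian, eq. (4),
# reshaped back into matrices, satisfy the intertwining relation eq. (1)
import numpy as np

def liouvillian(h):
    n = h.shape[0]
    eye = np.eye(n)
    return -1j * (np.kron(h.T, eye) - np.kron(eye, h.conj().T))

def intertwiners(h, tol=1e-9):
    evals, evecs = np.linalg.eig(liouvillian(h))
    # column-stacked eigenvectors with e ~ 0 are the conserved operators
    return [evecs[:, k].reshape(h.shape, order="F")
            for k in range(evals.size) if abs(evals[k]) < tol]

# example: pt-symmetric dimer h = j sigma_x + i gamma sigma_z, j = 1, gamma = 0.5
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
h = sx + 0.5j * sz
etas = intertwiners(h)
assert len(etas) == 2                        # n zero modes for n = 2
for eta in etas:                             # each satisfies eta h = h^dagger eta
    assert np.allclose(eta @ h, h.conj().T @ eta, atol=1e-8)
```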
when e_pq is purely imaginary, it leads to a non-hermitian η̂_pq whose expectation value in any state remains constant in magnitude; on the other hand, if e_pq is purely real, one can choose a hermitian η̂_pq whose expectation value exponentially grows or decays with time. this analysis of the constants of motion is valid for systems with a static, pt-symmetric hamiltonian. it can be suitably generalized to time-periodic, pt-symmetric hamiltonians via the floquet formalism [25, 28, 32, 44, 45]. when h_pt(t) = h_pt(t + t) is periodic in time, the long-time dynamics of the system is governed by the floquet time-evolution operator [46]

$$g_f(t) = \mathbb{t}\, e^{-i\int_0^{t} h_{pt}(t')\,dt'}, \quad (6)$$

where t stands for the time-ordered product that takes into account the non-commuting nature of the hamiltonians at different times. the (stroboscopic) dynamics of the system at times t_m = mt is then given by |ψ(t_m)⟩ = g_f^m|ψ(0)⟩, and the corresponding hermitian, conserved operators η̂ = η̂† are determined by [25, 34]

$$g_f^{\dagger}\hat{\eta}\,g_f = \hat{\eta}. \quad (7)$$

vectorization of eq. (7) implies that the conserved quantities are given by the eigenvectors of the "floquet liouville time-evolution" matrix

$$g = g_f^{t} \otimes g_f^{\dagger} \quad (8)$$

with unit eigenvalue. since g_f(t) inherits the pt symmetry of the time-periodic hamiltonian, the eigenvalues κ_m of g_f(t) either lie on a circle (|κ_p| = const.; pt-symmetric phase) or occur along a radial line in pairs with a constant geometric mean (|κ_p κ_q| = const.; pt-broken phase). therefore, it is straightforward to see that, among the n² eigenvalues λ_pq ≡ κ_p κ_q^* of g, there are n unit eigenvalues, giving rise to n conserved quantities. as in the case of the static hamiltonian, the remaining n(n − 1) eigenvectors give operators that vary exponentially with the stroboscopic time t_m irrespective of the initial state |ψ(0)⟩. if λ_pq is real, we can choose them to be hermitian, as in the case of a static hamiltonian. we now demonstrate these ideas with two concrete examples.

3. quantum pt-symmetric dimer
we first consider the prototypical pt-symmetric dimer (n = 2) with a hamiltonian given by

$$h_1(t) = j\sigma_x + i\gamma f(t)\sigma_z = h_1^{t} \neq h_1^{\dagger}. \quad (9)$$

we call this model "quantum" because it arises naturally in minimal quantum systems undergoing lindblad evolution when we confine ourselves to trajectories that undergo no quantum jumps [31], as well as in wave systems [18–22]. here, j > 0 denotes the coupling between the two degrees of freedom and γ > 0 is the strength of the gain-loss term. h_1 is pt-symmetric with the parity operator p = σ_x and the time-reversal operator t = ∗ (complex conjugation). the eigenvalues ϵ_{1,2} = ±√(j² − γ²) ≡ ±∆(γ) of the hamiltonian h_1(γ) remain real when γ < γ_pt = j and become purely imaginary when γ exceeds this threshold. in the static case, f(t) = 1, it is easy to show using eq. (1) that η̂_1 = p = σ_x is the first intertwining operator [34, 35], and the recursive construction gives the second intertwining operator as η̂_2 = η̂_1 h_1/j = 1 + (γ/j)σ_y. however, the corresponding 4 × 4 liouvillian matrix l, eq. (4), has two nonzero eigenvalues, given by e_± = ±2i∆. the corresponding eigen-operators are

$$\hat{\eta}_{\pm} = \frac{1}{j^2}\begin{bmatrix} (\gamma \pm i\Delta)^2 & -i(\gamma \pm i\Delta) \\ +i(\gamma \pm i\Delta) & 1 \end{bmatrix}. \quad (10)$$

note that the 2 × 2 matrices η̂_± have rank 1 and are thus not invertible. in the pt-symmetric region (∆ ∈ ℝ), the operators η̂_± are not hermitian, whereas in the pt-broken region (∆ ∈ iℝ), they are hermitian.
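eq. (10) is easy to verify numerically. the check below is our own, with illustrative parameter values: it confirms that η̂± satisfy the eigenvalue problem eq. (3) with e± = ±2i∆ and that both matrices indeed have rank 1.

```python
# numerical check of eq. (10) for h1 = j sigma_x + i gamma sigma_z
import numpy as np

j, gamma = 1.0, 0.5                       # pt-symmetric side: gamma < j
delta = np.sqrt(j**2 - gamma**2)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
h1 = j * sx + 1j * gamma * sz

for sign in (+1, -1):
    z = gamma + sign * 1j * delta
    eta = np.array([[z**2, -1j * z], [1j * z, 1.0]]) / j**2
    lhs = -1j * (eta @ h1 - h1.conj().T @ eta)     # eq. (3) acting on eta
    assert np.allclose(lhs, sign * 2j * delta * eta)
    assert np.linalg.matrix_rank(eta) == 1         # rank 1, not invertible
```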
next, we consider the time-periodic case, i.e. f(t) = f(t + t), where f(t) = sgn(t) for |t| < t/2 denotes a square wave. this piecewise constant gain and loss means that the hamiltonian switches from h_{1+} = jσ_x + iγσ_z for 0 ≤ t < t/2 to h_{1−} = t h_{1+} t = jσ_x − iγσ_z for t/2 ≤ t < t. the non-unitary floquet time-evolution operator can be evaluated explicitly as [47]

$$g_f(t) = e^{-ih_{1-}t/2}\, e^{-ih_{1+}t/2} \quad (11)$$
$$= g_0 \mathbb{1}_2 + i g_x \sigma_x + g_y \sigma_y, \quad (12)$$

where g_0 = [j² cos(∆t) − γ²]/∆², g_x = −j sin(∆t)/∆ and g_y = −jγ[1 − cos(∆t)]/∆² are coefficients that remain real irrespective of whether ∆(γ) is real or purely imaginary. when γ → 0, this reproduces the expected result g_f(t) = exp(−ijσ_x t), and in the limit t → 0, the time-evolution operator reduces to 1_2, as expected. on the other hand, as ∆ → 0, the power series for g_f(t) terminates at second order in t, in sharp contrast to the static case, where it terminates at first order in time. the eigenvalues of g_f, eq. (12), are

$$\kappa_{1,2} = g_0 \pm i\sqrt{g_x^2 - g_y^2}. \quad (13)$$

thus, the ep contours separating the pt-symmetric phase (|κ_1| = |κ_2|) from the pt-broken phase (|κ_1| ≠ |κ_2|) are given by g_x = ±g_y [47]. it is easy to check that η̂_1 = σ_x satisfies g_f†η̂_1 g_f = η̂_1 and is thus a stroboscopically conserved quantity. the second conserved operator is obtained from the symmetrized or antisymmetrized version of the recursive construction [34], i.e.

$$\hat{\eta}_2 = \begin{cases} (\hat{\eta}_1 g_f + g_f^{\dagger}\hat{\eta}_1)/2, \\ -i(\hat{\eta}_1 g_f - g_f^{\dagger}\hat{\eta}_1)/2. \end{cases} \quad (14)$$

in the present case, the symmetrized version returns η̂_1, while the antisymmetrized version gives the second, linearly independent conserved operator as η̂_2 = g_x 1_2 + g_y σ_z. following the procedure outlined in section 2 gives us two unity eigenvalues of g, eq. (8), with the corresponding conserved operators. the remaining two eigenvalues are complex conjugates with unit length in the pt-symmetric region, i.e. λ_3 = λ_4^* = e^{iϕ}, with eigen-operators η̂_+ = η̂_−^† that are hermitian conjugates of each other. in the pt-broken region, the two complex eigenvalues with equal phase satisfy |λ_3 λ_4| = 1. figure 1 shows the expectation values normalized to their initial values,

$$\eta_{\alpha}(t) \equiv \frac{\langle\psi(t)|\hat{\eta}_{\alpha}|\psi(t)\rangle}{\langle\psi(0)|\hat{\eta}_{\alpha}|\psi(0)\rangle}, \quad (15)$$

calculated with the initial state |ψ(0)⟩ = |+x⟩, as a function of the dimensionless time t/t. the system parameters are γ = 0.5j and jt = 1, and |+x⟩ is the eigenstate of σ_x with eigenvalue +1. thus, the system is in the pt-symmetric region. figure 1a shows that η_1(t) is conserved during this evolution at all times, not just stroboscopically at t_m = mt. on the other hand, η_2(t), shown in figure 1b, has a periodic behavior with a period of ∼ 30t (not shown). although η_2(t) varies with time, it is stroboscopically conserved, η_2(t_m) = 1. the dotted red line shows ℜλ_2^t = 1. figure 1c shows that the real part of η_+(t), with eigenvalue λ_3 = −0.44 + 0.9i, also shows a periodic variation. the dotted black line shows ℜλ_3^t, and the fact that ℜη_+(t_m) matches it stroboscopically confirms the simple sinusoidal variation of this eigen-operator. figure 1d shows the corresponding results for the fourth operator, η̂_− = η̂_+^†, with eigenvalue λ_4 = −0.44 − 0.9i.

figure 1. conserved quantities for a floquet, quantum pt-symmetric dimer. system parameters are γ = 0.5j, jt = 1, |ψ(0)⟩ = |+x⟩, and η_α(t) denote normalized expectation values. (a) η̂_1 = σ_x is an eigen-operator of g with eigenvalue λ_1 = 1; η_1(t) is constant. (b) η̂_2 = g_x 1_2 + g_y σ_z is the second eigen-operator of g with λ_2 = 1; η_2(t) oscillates with time, but is stroboscopically constant at t/t = n; the dotted red line shows ℜλ_2^t = 1. (c) η̂_+ is a non-hermitian eigen-operator with the unit-length eigenvalue λ_3 = −0.44 + 0.9i. the real part of its normalized expectation value stroboscopically matches ℜλ_3^t, shown in dotted black. (d) the corresponding result for η̂_− = η̂_+^† with eigenvalue λ_4 = λ_3^*.
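the stroboscopic conservation displayed in figure 1 can be reproduced with a few lines of code. the sketch below is our own (it uses scipy's matrix exponential): it composes g_f from eq. (11) for γ = 0.5j, jt = 1 and verifies eq. (7) for η̂₁ = σ_x and for the antisymmetrized η̂₂ of eq. (14), as well as the equal magnitudes of the κ's in the pt-symmetric phase.

```python
# floquet dimer of section 3: build g_f, eq. (11), and check conserved operators
import numpy as np
from scipy.linalg import expm

j, gamma, t = 1.0, 0.5, 1.0               # gamma = 0.5j, jt = 1 (figure 1)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
h_plus = j * sx + 1j * gamma * sz         # gain-loss on for 0 <= t < t/2
h_minus = j * sx - 1j * gamma * sz        # reversed for t/2 <= t < t
gf = expm(-1j * h_minus * t / 2) @ expm(-1j * h_plus * t / 2)

# eq. (7): eta_1 = sigma_x is stroboscopically conserved
assert np.allclose(gf.conj().T @ sx @ gf, sx)

# antisymmetrized recursive construction, eq. (14): eta_2 = gx*1 + gy*sigma_z
eta2 = -0.5j * (sx @ gf - gf.conj().T @ sx)
assert np.allclose(gf.conj().T @ eta2 @ gf, eta2)

# pt-symmetric phase: both eigenvalues of g_f lie on the unit circle
kappa = np.linalg.eigvals(gf)
assert np.allclose(np.abs(kappa), 1.0)
```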
we conclude this section with the transformation properties of g_f(t) and of the conserved operators η̂. when the periodic hamiltonian is hermitian, i.e. h_0(t) = h_0^†(t) = h_0(t + t), shifting the zero of time to t_0 leads to a unitary transformation,

$$g_f(t + t_0, t_0) = u(t_0)\, g_f(t)\, u^{\dagger}(t_0), \quad (16)$$
$$u(t_0) = \mathbb{t}\, e^{-i\int_0^{t_0} h_0(t')\,dt'}. \quad (17)$$

therefore, the conserved operators are also unitarily transformed. however, in our case, eq. (16) becomes a similarity transformation, g_f(t + t_0, t_0) = s g_f(t) s^{-1}, where s = t exp(−i∫_0^{t_0} h_pt(t′)dt′) does not satisfy s†s = 1 = ss†. under this transformation, the conserved operators change as η̂ → s^{-1†} η̂ s^{-1}. this non-unitary transformation of the conserved quantities under a shift of the zero of time suggests that they are not related to "symmetries" of the open system with balanced gain and loss.

4. classical pt-symmetric dimer
we now consider a different example, characterized by a non-hermitian hamiltonian with purely imaginary entries. we call such a system "classical" because having h_pt = −h_pt^* ensures that the non-unitary time evolution operator exp(−ih_pt t) is purely real, and therefore |ψ(t)⟩ remains real if |ψ(0)⟩ is. such a classical hamiltonian arises naturally in describing the energy density dynamics in mechanical or electrical circuits [23–26, 28], where |ψ(t)⟩ encodes time-dependent positions, velocities, voltages, currents, etc., and is obviously real. as its simplest model, we consider a dimer governed by the hamiltonian

$$h_2(t) = j\sigma_y + i\gamma f(t)\sigma_z = -h_2^{*}. \quad (18)$$

on one level, the hamiltonian h_2(t), eq. (18), is "just a change of basis" from h_1(t), eq. (9): h_2(t) = exp(−iπσ_z/4) h_1(t) exp(+iπσ_z/4). however, since h_2(t) models effective, classical systems where the entire complex state space is physically accessible, it is necessary to treat it differently. a physical realization of h_2(t) is found in a single lc circuit whose inductance l(t) and capacitance c(t) are varied such that its characteristic frequency j = 1/√(l(t)c(t)) remains constant [25]. the hamiltonian h_2(t) is pt-symmetric with pt = σ_x ∗. in the static case (f(t) = 1), the two hermitian intertwining operators are given by η̂_1 = σ_y and η̂_2 = η̂_1 h_2/j = 1_2 − (γ/j)σ_x. in addition, the vectorization approach gives two rank-1 eigen-operators,

$$\hat{\eta}_{\pm} = \frac{1}{j^2}\begin{bmatrix} (\gamma \pm i\Delta)^2 & -(\gamma \pm i\Delta) \\ -(\gamma \pm i\Delta) & 1 \end{bmatrix}, \quad (19)$$

with eigenvalues e_± = ±2i∆. as discussed in section 3, these operators are not hermitian in the pt-symmetric phase and become hermitian in the pt-broken phase. for the floquet case, we choose a gain-loss term that is nonzero only at discrete times. this is accomplished by choosing the dimensionless function f(t) as

$$f(t) = t\left[\delta(t) - \delta(t - t/2)\right] = f(t + t). \quad (20)$$

the resulting floquet time-evolution operator g_f(t) can be calculated analytically [25]. since the hamiltonian h_2(t) is hermitian at all times except t_k = kt/2, the evolution is mostly unitary, punctuated by non-unitary contributions that occur due to the δ-functions at times t_k.
the result is

$$g_f(t) = e^{+\gamma t \sigma_z}\, e^{-ijt\sigma_y/2}\, e^{-\gamma t \sigma_z}\, e^{-ijt\sigma_y/2} = g_0\mathbb{1}_2 + g_x\sigma_x + ig_y\sigma_y + g_z\sigma_z, \quad (21)$$

where the four real coefficients g_k are given by

$$g_0 = \cos^2(jt/2) - \sin^2(jt/2)\cosh(2\gamma t), \quad (22)$$
$$g_x = -\sin(jt)\sinh(2\gamma t)/2, \quad (23)$$
$$g_y = -\sin(jt)\left[1 + \cosh(2\gamma t)\right]/2, \quad (24)$$
$$g_z = -\sin^2(jt/2)\sinh(2\gamma t). \quad (25)$$

as expected, the purely real g_f(t) reduces to exp(−ijtσ_y) in the hermitian limit γ → 0. the ep contours, on the other hand, are determined by the constraint g_x² − g_y² + g_z² = 0, which reduces to cos(jt/2) = tanh(γt) [25]. two linearly independent floquet intertwining operators, obtained by solving eq. (7), are given by η̂_1 = σ_y and η̂_2 = −i(η̂_1 g_f − g_f^† η̂_1)/2. the latter simplifies to η̂_2 = g_y 1_2 + g_z σ_x − g_x σ_z. we leave it to the reader to check that, as in the case of the floquet quantum pt-dimer problem, the symmetrized version of the recursive procedure, eq. (14), does not lead to a result that is linearly independent of η̂_1. following the recipe in section 2, we supplement these analytical results with symbolic or numerical results for the four eigenvalues λ_k and the four eigen-operators η̂_1, η̂_2, η̂_± of g, eq. (8).

figure 2 shows the behavior of the normalized expectation values η_α(t), calculated with |ψ(0)⟩ = |+x⟩, as a function of time. the system parameters are γ = 0.5j and jt = 1, and therefore the system is in the pt-symmetric region. note that, since |ψ(t)⟩ is purely real, ⟨ψ(t)|η̂_1|ψ(t)⟩ = 0 independent of time [25]. on the other hand, η_2(t), shown in figure 2a, has a periodic behavior. although η_2(t) varies with time, it is stroboscopically conserved, η_2(t_m) = 1. figure 2b shows that the real part of η_+(t), with the unit-magnitude eigenvalue λ_3 = −0.65 + 0.756i, also varies periodically. the dotted black line shows ℜλ_3^t, and the fact that ℜη_+(t_m) matches it stroboscopically confirms the simple sinusoidal variation of this eigen-operator. since the system is in the pt-symmetric phase, η̂_− = η̂_+^†, and therefore ℜη_−(t) = ℜη_+(t). figure 2c shows the corresponding imaginary parts, ℑη_+(t) = −ℑη_−(t), for the eigen-operator with the complex conjugate eigenvalue λ_4 = λ_3^*. we note that, in the pt-broken regime, the non-unit-modulus eigenvalues are not complex conjugates of each other, and therefore the corresponding eigen-operators will not satisfy the relations shown in figures 2b–c.

figure 2. conserved quantities for a classical pt-symmetric dimer with γ = 0.5j, jt = 1, |ψ(0)⟩ = |+x⟩. since |ψ(t)⟩ is purely real, the expectation value of η̂_1 = σ_y is always zero. (a) η̂_2 = g_y 1_2 + g_z σ_x − g_x σ_z is the second eigen-operator of g with λ_2 = 1. η_2(t) oscillates with time, but is stroboscopically constant at t/t = n; the dotted red line shows ℜλ_2^t = 1. (b) since the system is in the pt-symmetric phase, ℜη_+(t) = ℜη_−(t) (solid black) shows a periodic behavior with values that stroboscopically match ℜλ_3^t, shown in dotted black. (c) the corresponding imaginary parts, ℑη_−(t) = −ℑη_+(t) (dot-dashed black), show a similar, stroboscopically matching behavior.
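the closed-form coefficients (22)–(25) can be cross-checked by composing eq. (21) numerically; the sketch below is our own check, with illustrative parameters, and also evaluates the ep condition cos(jt/2) = tanh(γt).

```python
# cross-check of eqs. (21)-(25) for the delta-kicked classical dimer
import numpy as np
from scipy.linalg import expm

j, gamma, t = 1.0, 0.5, 1.0
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

gf = (expm(+gamma * t * sz) @ expm(-1j * j * t * sy / 2)
      @ expm(-gamma * t * sz) @ expm(-1j * j * t * sy / 2))   # eq. (21)

g0 = np.cos(j * t / 2)**2 - np.sin(j * t / 2)**2 * np.cosh(2 * gamma * t)
gx = -np.sin(j * t) * np.sinh(2 * gamma * t) / 2
gy = -np.sin(j * t) * (1 + np.cosh(2 * gamma * t)) / 2
gz = -np.sin(j * t / 2)**2 * np.sinh(2 * gamma * t)
assert np.allclose(gf, g0 * np.eye(2) + gx * sx + 1j * gy * sy + gz * sz)

# pt-symmetric side of the ep contour when cos(jt/2) > tanh(gamma*t)
print(np.cos(j * t / 2), np.tanh(gamma * t))    # 0.878 > 0.462 here
```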
5. conclusions
in this article, we have presented a new method to obtain intertwining operators, or conserved quantities, in pt-symmetric systems with static or time-periodic hamiltonians. in this approach, these operators appear as zero-e eigenmodes of the static liouvillian l or as λ = 1 eigenmodes of the floquet g. for an n-dimensional system, in addition to the n constants of motion, this approach also leads to n(n − 1) operators whose expectation values in any arbitrary state undergo a simple exponential-in-time change. we have demonstrated these concepts with two simple, physically motivated examples of a pt-symmetric dimer with different, periodic gain-loss profiles. we have deliberately stayed away from continuum models, because extending this approach or the recursive construction [34, 35] to infinite dimensions will probably be plagued by challenges regarding the domains of the resulting, increasingly higher-order differential operators.

the definition of an intertwining operator via eq. (1) can be generalized to obtain conserved observables for hamiltonians that possess other antilinear symmetries, such as anti-pt symmetry [48–50] or anyonic-pt symmetry [51, 52]. the recursive procedure to generate a tower of such operators [34] and the vectorization method presented in section 2 remain valid for an arbitrary antilinear symmetry. thus, this approach can be used to investigate the constants of motion in such systems as well.

references
[1] c. m. bender, s. boettcher. real spectra in non-hermitian hamiltonians having pt symmetry. physical review letters 80(24):5243–5246, 1998. https://doi.org/10.1103/physrevlett.80.5243.
[2] g. lévai, m. znojil. systematic search for pt-symmetric potentials with real energy spectra. journal of physics a: mathematical and general 33(40):7165–7180, 2000. https://doi.org/10.1088/0305-4470/33/40/313.
[3] c. m. bender, d. c. brody, h. f. jones. must a hamiltonian be hermitian? american journal of physics 71(11):1095–1102, 2003. https://doi.org/10.1119/1.1574043.
[4] c. m. bender. making sense of non-hermitian hamiltonians. reports on progress in physics 70(6):947–1018, 2007. https://doi.org/10.1088/0034-4885/70/6/r03.
[5] c. m. bender, d. c. brody, h. f. jones. complex extension of quantum mechanics. physical review letters 89(27):270401, 2002. https://doi.org/10.1103/physrevlett.89.270401.
[6] a. mostafazadeh. pseudo-hermiticity versus pt symmetry: the necessary condition for the reality of the spectrum of a non-hermitian hamiltonian. journal of mathematical physics 43(1):205–214, 2002. https://doi.org/10.1063/1.1418246.
[7] a. mostafazadeh. exact pt-symmetry is equivalent to hermiticity. journal of physics a: mathematical and general 36(25):7081–7091, 2003. https://doi.org/10.1088/0305-4470/36/25/312.
[8] a. mostafazadeh. pseudo-hermitian representation of quantum mechanics. international journal of geometric methods in modern physics 07(07):1191–1306, 2010. https://doi.org/10.1142/s0219887810004816.
[9] m. znojil, h. b. geyer. construction of a unique metric in quasi-hermitian quantum mechanics: nonexistence of the charge operator in a 2 × 2 matrix model. physics letters b 640(1-2):52–56, 2006. https://doi.org/10.1016/j.physletb.2006.07.028.
[10] m. znojil. complete set of inner products for a discrete pt-symmetric square-well hamiltonian. journal of mathematical physics 50(12):122105, 2009. https://doi.org/10.1063/1.3272002.
[11] m. znojil. special issue "pseudo-hermitian hamiltonians in quantum physics in 2014". international journal of theoretical physics 54(11):3867–3870, 2015. https://doi.org/10.1007/s10773-014-2501-2.
[12] a. ruschhaupt, f. delgado, j. g. muga. physical realization of pt-symmetric potential scattering in a planar slab waveguide. journal of physics a: mathematical and general 38(9):l171–l176, 2005. https://doi.org/10.1088/0305-4470/38/9/l03.
[13] r. el-ganainy, k. g. makris, d. n. christodoulides, z. h. musslimani. theory of coupled optical pt-symmetric structures.
optics letters 32(17):2632–2634, 2007. https://doi.org/10.1364/ol.32.002632.
[14] k. g. makris, r. el-ganainy, d. n. christodoulides, z. h. musslimani. beam dynamics in pt symmetric optical lattices. physical review letters 100:103904, 2008. https://doi.org/10.1103/physrevlett.100.103904.
[15] s. klaiman, u. günther, n. moiseyev. visualization of branch points in pt-symmetric waveguides. physical review letters 101:080402, 2008. https://doi.org/10.1103/physrevlett.101.080402.
[16] y. n. joglekar, c. thompson, d. d. scott, g. vemuri. optical waveguide arrays: quantum effects and pt symmetry breaking. the european physical journal applied physics 63(3):30001, 2013. https://doi.org/10.1051/epjap/2013130240.
[17] t. kato. perturbation theory for linear operators. springer, berlin heidelberg, 1995. https://doi.org/10.1007/978-3-642-66282-9.
[18] c. e. rüter, k. g. makris, r. el-ganainy, et al. observation of parity-time symmetry in optics. nature physics 6(3):192–195, 2010. https://doi.org/10.1038/nphys1515.
[19] a. regensburger, c. bersch, m.-a. miri, et al. parity–time synthetic photonic lattices. nature 488(7410):167–171, 2012. https://doi.org/10.1038/nature11298.
[20] h. hodaei, m.-a. miri, m. heinrich, et al. parity-time-symmetric microring lasers. science 346(6212):975–978, 2014. https://doi.org/10.1126/science.1258480.
[21] b. peng, ş. k. özdemir, f. lei, et al. parity–time-symmetric whispering-gallery microcavities. nature physics 10(5):394–398, 2014. https://doi.org/10.1038/nphys2927.
[22] l. chang, x. jiang, s. hua, et al. parity–time symmetry and variable optical isolation in active–passive-coupled microresonators. nature photonics 8(7):524–529, 2014. https://doi.org/10.1038/nphoton.2014.133.
[23] j. schindler, a. li, m. c. zheng, et al. experimental study of active lrc circuits with pt symmetries. physical review a 84(4):040101, 2011. https://doi.org/10.1103/physreva.84.040101.
[24] t. wang, j. fang, z. xie, et al. observation of two pt transitions in an electric circuit with balanced gain and loss. the european physical journal d 74(8), 2020. https://doi.org/10.1140/epjd/e2020-10131-7.
[25] m. a. quiroz-juárez, k. s. agarwal, z. a. cochran, et al. on-demand parity-time symmetry in a lone oscillator through complex, synthetic gauge fields, 2021. arxiv:2109.03846.
[26] c. m. bender, b. k. berntson, d. parker, e. samuel. observation of pt phase transition in a simple mechanical system. american journal of physics 81(3):173–179, 2013. https://doi.org/10.1119/1.4789549.
[27] d. duchesne, v. aimez, r. morandotti, et al. observation of pt-symmetry breaking in complex optical potentials. physical review letters 103(9):093902, 2009. https://doi.org/10.1103/physrevlett.103.093902.
[28] r. de j. león-montiel, m. a. quiroz-juárez, j. l. domínguez-juárez, et al. observation of slowly decaying eigenmodes without exceptional points in floquet dissipative synthetic circuits. communications physics 1(1), 2018. https://doi.org/10.1038/s42005-018-0087-3.
[29] y. n. joglekar, a. k. harter. passive parity-time-symmetry-breaking transitions without exceptional points in dissipative photonic systems [invited]. photonics research 6(8):a51–a57, 2018. https://doi.org/10.1364/prj.6.000a51.
[30] y. wu, w. liu, j. geng, et al. observation of parity-time symmetry breaking in a single-spin system. science 364(6443):878–880, 2019. https://doi.org/10.1126/science.aaw8205.
[31] m. naghiloo, m. abbasi, y. n. joglekar, k. w. murch. quantum state tomography across the exceptional point in a single dissipative qubit. nature physics 15(12):1232–1236, 2019. https://doi.org/10.1038/s41567-019-0652-z.
[32] j. li, a. k. harter, j. liu, et al. observation of parity-time symmetry breaking transitions in a dissipative floquet system of ultracold atoms. nature communications 10(1):855, 2019. https://doi.org/10.1038/s41467-019-08596-1.
[33] f. klauck, l. teuber, m. ornigotti, et al. observation of pt-symmetric quantum interference. nature photonics 13(12):883–887, 2019. https://doi.org/10.1038/s41566-019-0517-0.
[34] f. ruzicka, k. s. agarwal, y. n. joglekar. conserved quantities, exceptional points, and antilinear symmetries in non-hermitian systems. journal of physics: conference series 2038(1):012021, 2021. https://doi.org/10.1088/1742-6596/2038/1/012021.
[35] z. bian, l. xiao, k. wang, et al. conserved quantities in parity-time symmetric systems. physical review research 2(2), 2020. https://doi.org/10.1103/physrevresearch.2.022039.
[36] m. v. berry. optical lattices with pt symmetry are not transparent. journal of physics a: mathematical and theoretical 41(24):244007, 2008. https://doi.org/10.1088/1751-8113/41/24/244007.
[37] m. h. teimourpour, r. el-ganainy, a. eisfeld, et al. light transport in pt-invariant photonic structures with hidden symmetries. physical review a 90:053817, 2014. https://doi.org/10.1103/physreva.90.053817.
[38] v. gorini, a. kossakowski, e. c. g. sudarshan. completely positive dynamical semigroups of n-level systems. journal of mathematical physics 17(5):821–825, 1976. https://doi.org/10.1063/1.522979.
[39] g. lindblad. on the generators of quantum dynamical semigroups. communications in mathematical physics 48(2):119–130, 1976. https://doi.org/10.1007/bf01608499.
[40] m. ban. lie-algebra methods in quantum optics: the liouville-space formulation.
physical review a 47(6):5093–5119, 1993. https://doi.org/10.1103/physreva.47.5093.
[41] v. v. albert, l. jiang. symmetries and conserved quantities in lindblad master equations. physical review a 89:022118, 2014. https://doi.org/10.1103/physreva.89.022118.
[42] d. manzano. a short introduction to the lindblad master equation. aip advances 10(2):025106, 2020. https://doi.org/10.1063/1.5115323.
[43] j. gunderson, j. muldoon, k. w. murch, y. n. joglekar. floquet exceptional contours in lindblad dynamics with time-periodic drive and dissipation. physical review a 103:023718, 2021. https://doi.org/10.1103/physreva.103.023718.
[44] y. n. joglekar, r. marathe, p. durganandini, r. k. pathak. pt spectroscopy of the rabi problem. physical review a 90(4):040101, 2014. https://doi.org/10.1103/physreva.90.040101.
[45] t. e. lee, y. n. joglekar. pt-symmetric rabi model: perturbation theory. physical review a 92:042103, 2015. https://doi.org/10.1103/physreva.92.042103.
[46] p. hänggi. driven quantum systems, 1998. [2020-10-31], https://www.physik.uni-augsburg.de/theo1/hanggi/chapter_5.pdf.
[47] a. k. harter, y. n. joglekar. connecting active and passive pt-symmetric floquet modulation models. progress of theoretical and experimental physics 2020(12):12a106, 2020. https://doi.org/10.1093/ptep/ptaa181.
[48] p. peng, w. cao, c. shen, et al. anti-parity–time symmetry with flying atoms. nature physics 12(12):1139–1145, 2016. https://doi.org/10.1038/nphys3842.
[49] y. choi, c. hahn, j. w. yoon, s. h. song. observation of an anti-pt-symmetric exceptional point and energy-difference conserving dynamics in electrical circuit resonators. nature communications 9(1), 2018. https://doi.org/10.1038/s41467-018-04690-y.
[50] f. zhang, y. feng, x. chen, et al. synthetic anti-pt symmetry in a single microcavity. physical review letters 124:053901, 2020. https://doi.org/10.1103/physrevlett.124.053901.
[51] s. longhi, e. pinotti. anyonic pt symmetry, drifting potentials and non-hermitian delocalization. epl (europhysics letters) 125(1):10006, 2019. https://doi.org/10.1209/0295-5075/125/10006.
[52] g. arwas, s. gadasi, i. gershenzon, et al. anyonic parity-time symmetric laser, 2021. arxiv:2103.15359.
acta polytechnica vol. 47 no. 4–5/2007

modeling measurement uncertainty in room acoustics
p. dietrich

abstract: this paper investigates a way of determining and modeling uncertainty contributions in measurements of room acoustic parameters, which are commonly used to describe the acoustic situation of a room in an objective manner. if the range of uncertainty and the confidence interval are not given, the results remain incomparable to those of other measurement teams, since modern pc-based measurements still show appreciable sources of measurement errors. the guide to the expression of uncertainty in measurement (gum) defines a unified guideline for determining uncertainties in all fields of measurement. its application is increasingly required by modern measurement standards. however, the gum procedures have not been applied to room acoustics yet. hence, a scalable linear approach for calculating the combined uncertainty of room acoustic parameters with regard to the input quantities is proposed. in-situ measurement results of specially designed experiments show the significance of the main influence factors and are used to build the uncertainty budget.

keywords: room acoustics, measurement uncertainty, gum, iso 3382, uncertainty modeling.

1 introduction
room acoustic parameters are commonly used to describe a room's acoustical quality in a simplified and objective manner; the most common parameter is the reverberation time [1]. usually, a personal computer and professional audio hardware meeting certain requirements are essential for these measurements, as defined in iso 3382 [2] and iso 18233 [3]. when comparing the measurement results obtained by different teams for the same room, deviations larger than human perception have been found [4]. a good measurement accounts for a range of uncertainty smaller than this perception, in order to obtain parameters corresponding to the listener's impressions in a room. furthermore, world-wide round robins have been conducted in the past to investigate these variations and their significance [5]. in order to provide unified results in all fields of measurement, the iso/bipm developed the guide to the expression of uncertainty in measurement (gum) [6]. determining the range of uncertainty according to the gum often requires complex modeling of the error propagation, or even monte-carlo simulations if an analytical expression cannot be given [7] or if the influence factors, called input quantities, are not directly measurable. therefore, applicable practical modeling techniques have been developed [8]. this paper presents a scalable linear uncertainty model according to the gum procedures for measurements of room acoustic parameters. the main input quantities are determined by theoretical and experimental investigations of the measurement signal chain and its inner components. special experiments are designed to be conducted in different types of halls, to investigate the dependence of the individual uncertainty contributions on the type and shape of the room. the combined uncertainty of the room acoustic parameter is calculated in a following step from the available data from the experiments. the uncertainty budget will be given and discussed in the context of measurement quality.
2 theoretical background
in order to provide the necessary acoustic and mathematical background, the most important facts of the measurement standard for room acoustics (iso 3382), which is the central document for room acoustic measurements, and of the gum are summarized below. knowledge of signal theory and of the correspondences between the time and frequency domains (e.g. the fourier transformation) is presumed.

2.1 iso 3382 – measurement standard
iso 3382 defines measurable objective acoustic parameters and their measurement procedures. these parameters have been designed to be related to human perception. for instance, the reverberation time t corresponds to the perceived reverberance of a room, and the clarity index c80 is related to the subjective sensation of clarity. further parameters contain information on the spatial distribution of the sound incidence at a listener's seat, e.g. the lateral fraction. for demonstration purposes, this paper focuses on the clarity index, but the model is meant to be used for other acoustic parameters as well. to provide more detailed objective information at a certain position in a room, the parameters are required to be stated as a function of frequency by means of octave or third-octave frequency band values. the typical range covers octave bands from 63 hz up to 8 khz. the object to be measured is the room, which can be assumed to be a linear time-invariant (lti) system, at least for the short time frame of a measurement session. therefore, the impulse response completely describes the behavior of a room from a certain source position to a certain receiver position in a system-theoretical manner. acoustic parameters are defined as mathematical equations involving the impulse response h(t). hence, the effort is reduced to the measurement of the impulse response only. the clarity index, which is representatively chosen, is defined by

$$c = 10 \log_{10} \frac{\int_0^{\tau_e} h^2(t)\,dt}{\int_{\tau_e}^{\infty} h^2(t)\,dt} \;[\mathrm{db}] \quad (1)$$

with an integration time τ_e = 80 ms for music. this gives the relation between the early and late arriving sound energy.
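a direct implementation of eq. (1) is straightforward once a band-filtered impulse response is available. the sketch below is our own illustration: the synthetic exponential decay merely stands in for a measured room impulse response, and the sampling rate and reverberation time are assumed values.

```python
# minimal sketch of eq. (1): clarity index from a sampled impulse response h
import numpy as np

def clarity(h, fs, tau_e=0.080):
    """early-to-late energy ratio in db; tau_e = 80 ms gives c80 (for music)."""
    n_e = int(round(tau_e * fs))             # samples within the first tau_e
    early = np.sum(h[:n_e] ** 2)
    late = np.sum(h[n_e:] ** 2)
    return 10.0 * np.log10(early / late)

fs = 44100                                   # assumed sampling rate [hz]
t = np.arange(0, 2.0, 1.0 / fs)
rt = 1.5                                     # assumed reverberation time [s]
rng = np.random.default_rng(0)
h = rng.standard_normal(t.size) * 10 ** (-3 * t / rt)   # -60 db decay after rt
print(clarity(h, fs))                        # c80 of the synthetic decay
```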
For most room acoustic measurements, a loudspeaker and microphones with omni-directional directivity patterns are required, as shown in Fig. 2. A typical geometrical setup for an omni-directional sound source is the dodecahedron loudspeaker, consisting of 12 equal loudspeaker chassis. The loudspeaker shown in Fig. 2 has been developed and enhanced at the Institute of Technical Acoustics at RWTH Aachen in recent years. It consists, from top to bottom, of a small high-tone dodecahedron, a mid-tone dodecahedron, and a bandpass sub-woofer speaker. In addition, the measurement standard summarizes the investigated difference limen (just noticeable difference, JND) for the defined room acoustic parameters. For the reverberation time, the JND is on the order of 5 % of the mean reverberation time for each frequency band. For the clarity index, it is assumed to be on the order of 1 dB, independent of frequency.

2.2 GUM basics

The GUM was published in 1995. It enhances the well-known simple Gaussian error calculus by distinguishing between type A and type B uncertainty contributions instead of the formerly used random and systematic errors. In more detail, type A covers results obtained by statistical analysis of a series of measurements, and type B represents other analysis methods, e.g. using data from calibration certificates. The uncertainty of a measurement is defined as a parameter associated with a measurement result, called the measurand. It expresses a range of values which can reasonably be attributed to the measurand with a certain level of confidence. Typically, confidence levels of 68 % and 95 % are used, meaning that the true value lies in the given interval with this probability. Furthermore, the GUM describes a guideline scheme for deriving uncertainties, which consists of the following 7 steps:

1. Collecting information on the measurement and its input quantities $x_i$.
2. Modeling the particular measurement in terms of a model function $f$.
3. Evaluation of the input quantities according to type A or type B.
4. Combination of the results to obtain the value $y$ and the associated uncertainty $u(y)$.
5. Calculating the expanded uncertainty $U(y)$.
6. Statement of the complete measurement result $y \pm U$ for a chosen coverage factor $k$.
7. Assembling the measurement uncertainty budget.

Formally, the output quantity $y$, which is the result of a measurement, can be expressed as a function of the input quantities $x_i$:

$$y = f(x_1, x_2, \ldots, x_N). \qquad (2)$$

If all input quantities $x_i$ are known, the corresponding uncertainties $u(x_i)$ have to be determined. In the case of type A, which is used throughout this paper, the uncertainty can be expressed as the standard deviation of the mean,

$$u^2(x_i) = \frac{1}{n(n-1)} \sum_{k=1}^{n} (q_k - \bar{q})^2, \qquad (3)$$

where the $q_k$ stand for the individual measurement results. The best value for the input quantity is defined as the arithmetic mean. Under consideration of these uncertainty contributions, the combined uncertainty for the output quantity can be calculated as

$$u^2(y) = \sum_{i=1}^{N} \left( \frac{\partial f}{\partial x_i} \right)^2 u^2(x_i) + 2 \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \frac{\partial f}{\partial x_i} \frac{\partial f}{\partial x_j}\, u(x_i, x_j). \qquad (4)$$

In most cases the correlation between the input quantities, represented by $u(x_i, x_j)$, can be neglected.

3 Application to room acoustics

As mentioned above, determining the uncertainty contributions and the model function is the most difficult task in modeling uncertainties.
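A minimal numerical sketch of Eqs. (3) and (4), with correlations neglected, might look as follows; the function names and sample values are illustrative and not part of any standard library.

```python
import numpy as np

def type_a_uncertainty(q):
    """Eq. (3): standard deviation of the mean of repeated observations q_k."""
    q = np.asarray(q, dtype=float)
    n = q.size
    return np.sqrt(np.sum((q - q.mean()) ** 2) / (n * (n - 1)))

def combined_uncertainty(sensitivities, u_inputs):
    """Eq. (4) without the correlation term: root sum of squares of
    sensitivity coefficients (partial derivatives of f) times the
    input uncertainties."""
    c = np.asarray(sensitivities, dtype=float)
    u = np.asarray(u_inputs, dtype=float)
    return np.sqrt(np.sum((c * u) ** 2))

# Example: five repeated readings of one input quantity
u_x1 = type_a_uncertainty([4.93, 5.07, 5.01, 4.98, 5.04])
print(u_x1)
```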
Hence, certain assumptions and simplifications are made in order to apply the GUM procedures. As can be seen, the central measured quantity in room acoustics is the impulse response.

Fig. 1: Measurement chain in room acoustic measurements
Fig. 2: 3-way dodecahedron loudspeaker (left) and condenser microphones in an array (right)

Modeling the error propagation from the input quantities to the impulse response, and onward to the derived parameters, involves finding a model function $f$ as in Equation (2). This is problematic, since the only input quantity for calculating a room acoustic parameter is the impulse response, which is not suitable in this context. In addition, the input quantities to the impulse response cannot always be measured directly. Since the major uncertainty contributions can be identified, it is practical to analyze the variations of the output quantity itself by varying these input quantities. Fig. 3 illustrates this approach, where the sources of error are grouped into room, equipment and evaluation errors. The linear dependence graph in Fig. 4 shows that the corrected output value can be formulated as

$$y = y' + k_{\mathrm{room}} + k_{\mathrm{equipment}} + k_{\mathrm{evaluation}} \qquad (5)$$

with the correction factors $k_i$; the model function $f$ is thereby linearized. The correction factors are introduced to capture the uncertainty contributions by means of a standard deviation of the output quantity. They are meant to have a mean value $\bar{k}_i = 0$ but an uncertainty $u(k_i) \neq 0$, modeling the different sources of error. As acoustic parameters depend on variables such as frequency or position, the correction factors are generally assumed to depend on these variables until proven otherwise,

$$k_i = k_i(f, \mathrm{room}, \mathrm{position}, \ldots). \qquad (6)$$

In more detail, Fig. 5 shows the grouping of the main input quantities influencing the accuracy of a room acoustic measurement. Hence, the task changes to finding experiments that model the uncertainty of the input quantities in an appropriate manner, and to analyzing the output quantity, in terms of a correction factor, directly.

4 Measurement results

Special experiments were designed to investigate, e.g., noise and meteorological influences, loudspeaker directivity, and variations in positioning the microphones and the loudspeakers. Repeated overnight measurements were conducted, the dodecahedron loudspeaker was rotated by a controlled turntable, and the microphones were configured in an array to scan the area of a seat. The corresponding uncertainties are calculated from the standard deviations in each experiment for each frequency band independently.

Fig. 3: Encapsulated sources of measurement errors
Fig. 4: Linear uncertainty dependence graph
Fig. 5: Groups of correction factors with details
Fig. 6: Source rotation with two different dodecahedron loudspeakers

Since the directional pattern of the dodecahedron loudspeaker has been under investigation at the Institute of Technical Acoustics for many years, measurement results obtained from two different loudspeakers on a turntable are presented to show the influence of the directional pattern on the acoustic parameters. If the emitted wavelength is smaller than the geometry of the speaker, even a source designed to be omni-directional becomes more and more directional. A per-band evaluation of such a rotation experiment is sketched below.
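The following sketch shows how the standard deviation over turntable orientations could be evaluated per octave band to quantify the equipment correction factor; the array shape, the random placeholder data and the band labels are assumptions for illustration.

```python
import numpy as np

# Hypothetical data: clarity index C80 in dB, measured at 36 turntable
# orientations (rows) for 8 octave bands from 63 Hz to 8 kHz (columns).
rng = np.random.default_rng(0)
c80 = 2.0 + 0.3 * rng.standard_normal((36, 8))

bands_hz = [63, 125, 250, 500, 1000, 2000, 4000, 8000]
u_equipment = c80.std(axis=0, ddof=1)   # type A standard deviation per band

for f, u in zip(bands_hz, u_equipment):
    print(f"{f:5d} Hz: u(k_equipment) = {u:.2f} dB")
```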
Fig. 6 shows that the deviations in the calculated parameters increase with frequency for both sources used. In Fig. 6 (left), we use a commercially available dodecahedron loudspeaker commonly used for room acoustic measurements, and in Fig. 6 (right) a special three-way dodecahedron loudspeaker with a smaller diameter, developed by the Institute of Technical Acoustics in Aachen. The results for the latter equipment show less variation, as the source is closer to omni-directionality. Due to its geometrical properties, the dodecahedron loudspeaker always shows a periodicity of 120°, as can be seen in the derived parameters. The spatial distribution of the parameters within a seat, by means of an area scanned at the typical height of the listeners' ears, 120 cm above the floor, is drawn in Fig. 7. Fig. 7 (left) shows the 250 Hz octave band and Fig. 7 (right) the 500 Hz octave band. As can be seen, the parameters depend on the position, and the fluctuations are in good agreement with the mean wavelength in these frequency bands. Therefore, the parameter varies faster over position in the 500 Hz band than in the lower band. Towards higher frequencies the variations decrease, since the room's behavior changes from distinct room modes to overlapping modes; this transition happens at the well-known Schröder frequency. Considering the measurement at a single seat, the actual receiver position one chooses when measuring a room could lie anywhere within the scanned area. Therefore, it is appropriate to incorporate this fact into the uncertainty budget for each octave band separately. Further experiments and results have been obtained that are not discussed in detail here. The uncertainty budget, under the given simplifications and assumptions, can be seen in Fig. 8 for the major uncertainty contributions investigated. The combined uncertainty shown here is valid for measuring the room impulse response for one source and one seat position at a single time and deriving the room acoustic parameter. In this case, the uncertainty exceeds the difference limen. In order to provide more precise results, several measurements under variation of the source position and angle, the receiver position, etc., have to be averaged.

5 Conclusion

This paper has proposed an approach for modeling and determining uncertainty contributions in room acoustics according to the GUM procedures, involving in-situ measurement results. Based upon the current measurement standard ISO 3382, the common measurement chain has been investigated with respect to uncertainty. The main influence factors and their relevance regarding measurement uncertainty have been analyzed and determined. Experiments have been designed and conducted in several rooms and halls to quantify the uncertainty contributions and their dependence on different types of rooms. In a following step, the combined uncertainty for the output quantity was calculated and presented, representatively, for the clarity index. However, the model, the procedures and the measured room impulse responses serve equally for the uncertainty calculation of the remaining room acoustic parameters.

Fig. 7: Spatial variations within a seat position for two different frequency bands
Fig. 8: Uncertainty budget for the clarity index

The uncertainty budget for a single measurement shows values on the order of the JND. Hence, a single measurement for one source-receiver pair is not sufficient. Instead, several measurement results have to be used to give an averaged result; a sketch of this combination step follows.
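A minimal sketch of assembling the budget from the group uncertainties of Eq. (5), and of the reduction obtained by averaging N independent source-receiver configurations; the numbers are placeholders, not the measured values of Fig. 8.

```python
import numpy as np

# Placeholder group uncertainties for one octave band, in dB
u_room, u_equipment, u_evaluation = 0.6, 0.8, 0.3

# Combined uncertainty of a single measurement (Eq. 4 applied to Eq. 5,
# all sensitivity coefficients equal to 1, correlations neglected)
u_single = np.sqrt(u_room**2 + u_equipment**2 + u_evaluation**2)

# Averaging N independent configurations reduces the uncertainty
for n in (1, 4, 9):
    print(f"N = {n}: u = {u_single / np.sqrt(n):.2f} dB")
```

With these placeholder values a single measurement exceeds the 1 dB JND of the clarity index, while four averaged configurations would fall below it, mirroring the conclusion drawn from Fig. 8.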
Acknowledgments

The author would like to thank Prof. Michael Vorländer, Ingo Witew and Dr. Gottfried Behler from the Institute of Technical Acoustics at RWTH Aachen for their support. Furthermore, he would like to thank Mr. Ernsting, Mr. Mörtel, and Mr. Tennhard for providing the halls for measurement purposes.

References

[1] Bradley, J. S.: Using ISO 3382 measures, and their extensions, to evaluate acoustical conditions in concert halls. Acoustical Science and Technology, 2005.
[2] ISO 3382 – Acoustics – Measurement of the reverberation time – Part 1: Performance spaces, ISO TC 43/SC 2 N 0767 – 2004-06-30, ISO, June 2004.
[3] ISO/DIS 18233, Acoustics – Application of new measuring methods in building and room acoustics – Final draft, ISO, 2006.
[4] Witew, I. B., Behler, G. K.: Uncertainties in measurement of single number parameters in room acoustics. Forum Acusticum, 2005.
[5] Lundeby, A., Vigran, T. E., Bietz, H., Vorländer, M.: Uncertainties of measurements in room acoustics. Acustica, Vol. 81 (1995), p. 344–355.
[6] Guide to the Expression of Uncertainty in Measurement. International Organization for Standardization, Geneva, 1995.
[7] Siebert, B. R. L., Sommer, K. D.: Weiterentwicklung des GUM und Monte-Carlo-Techniken. Technisches Messen, Vol. 71 (2004).
[8] Sommer, K. D., Weckenmann, A., Siebert, B. R. L.: A systematic approach to the modelling of measurements for uncertainty evaluation. Journal of Physics: Conference Series, Vol. 13 (2005), p. 224–227.

Dipl. Ing. Pascal Dietrich
e-mail: pdi@akustik.rwth-aachen.de
Institute of Technical Acoustics, RWTH Aachen University, D-52056 Aachen, Germany

Pseudo-Hermitian operators in a description of physical systems

V. Jakubský

We present some basic features of pseudo-Hermitian quantum mechanics and illustrate the use of pseudo-Hermitian Hamiltonians in a description of physical systems.

Keywords: pseudo-Hermitian operators, Klein-Gordon and Proca equation, thermodynamics.

1 Scalar products and metric operators

In classical mechanics, a system is described by its coordinates in an n-dimensional phase (configuration) space. The unique trajectory is a solution of the canonical Hamiltonian or Euler-Lagrange equations and satisfies fixed initial conditions. The observable quantities of the system (energy, angular momentum, …) are represented by functions on the phase space. The description of the system is completely different in quantum theory. Here we deal with an infinite-dimensional Hilbert space, i.e. a vector space complete with respect to the norm induced by the associated scalar product. A quantum mechanical system is represented by a vector in the Hilbert space, and the observables are represented by operators acting on this space. By postulate, the measurable quantities are associated with the spectra of the operators, which are required to be real. For this reason, only self-adjoint operators are considered physically admissible. The evolution equations of classical mechanics are replaced by the Schrödinger equation

$$i\partial_t \psi = H\psi,$$

and the initial conditions are replaced by the requirement that $\psi$ lies in the domain of the Hamiltonian. The task of solving the evolution equations is more complicated in quantum mechanics: from the mathematical point of view, we face a second-order differential equation for complex-valued functions. Additionally, the self-adjoint Hamiltonians involved are mostly unbounded, which requires careful determination of their domains. The scalar product plays the key role in interpreting the solutions of the Schrödinger equation.
The quantity $(\psi_1, \psi_2)$ is proportional to the transition amplitude that the system in state $\psi_1$ will be found in the state described by the vector $\psi_2$. When the system does not depend on time explicitly, the measured quantities have to be invariant under time translations. In particular, the time evolution of the states, $\psi_i(t) = U(t)\psi_i(0)$, should not alter the transition amplitude; in other words, there should hold

$$\bigl(\psi_1(0), \psi_2(0)\bigr) = \bigl(U(t)\psi_1(0),\, U(t)\psi_2(0)\bigr).$$

This implies that the time evolution operator has to be unitary, i.e. $U^\dagger(t)\,U(t) = 1$. The dagger denotes Hermitian conjugation with respect to the scalar product $(\cdot,\cdot)$, which is usually taken to be

$$(\psi_1, \psi_2) = \int \psi_1^*\, \psi_2\, \mathrm{d}^n q, \qquad (1)$$

where $*$ denotes complex conjugation. The time evolution operators form a one-parameter group, $U(0) = 1$, $U(t_1 + t_2) = U(t_1)U(t_2)$. The Hamiltonian of the system plays the role of the generator of time translations: we have $U(t) = \exp(-iHt)$. This relation is consistent with Stone's theorem, which establishes a correspondence between self-adjoint operators and one-parameter groups of unitary operators.

A few years ago, we witnessed a boom in the study of PT-symmetric Hamiltonians [1–21]. These operators are characterized by their PT-symmetry, i.e. $[PT, H] = 0$, where $P$ is space reflection and $T$ is time reversal. They are covered by the broad class of pseudo-Hermitian operators, which satisfy the operator equation

$$H^\dagger = \eta\, H\, \eta^{-1}.$$

The operator $\eta$ is required to be Hermitian, invertible and bounded. Standard self-adjointness is the special case of the preceding relation with $\eta = 1$. It was observed that these operators possess a real spectrum in many cases. The natural question emerged whether it would be possible to replace the standard requirement of self-adjointness by the less restrictive requirement of pseudo-Hermiticity. The answer proved to be affirmative, as long as we are dealing with operators having purely real spectra. At first sight, we encounter a serious problem: the time evolution generated by these Hamiltonians is non-unitary with respect to the scalar product (1). This problem is fixed by the following theorem [14]: for any pseudo-Hermitian operator with a purely real spectrum there exists a positive-definite operator $\Theta$ with respect to which the Hamiltonian is pseudo-Hermitian, i.e.

$$H^\dagger \Theta = \Theta H, \qquad \Theta > 0.$$

Consistent unitary evolution generated by $H$ is recovered by redefining the scalar product. Instead of (1), we fix it in the following form:

$$(\psi_1, \psi_2)_\Theta = \int \psi_1^*\, \Theta\, \psi_2\, \mathrm{d}^n q. \qquad (2)$$

The Hamiltonian $H$ is self-adjoint with respect to the new scalar product and may consequently serve as a generator of time translations. So, we encounter a twofold problem in pseudo-Hermitian quantum mechanics. First, one has to check the spectrum of the operator. When it consists of real eigenvalues only, the existence of a positive-definite metric operator $\Theta$ is guaranteed, and its explicit construction follows. A finite-dimensional illustration of this construction is sketched below.
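The following is a minimal numerical sketch, not taken from the paper: for a diagonalizable matrix Hamiltonian with a real spectrum, a positive-definite metric can be assembled from the left eigenvectors, $\Theta = \sum_n |\chi_n\rangle\langle\chi_n|$, and the pseudo-Hermiticity relation $H^\dagger\Theta = \Theta H$ can be verified directly. The matrix entries are arbitrary choices.

```python
import numpy as np

# Toy non-Hermitian Hamiltonian with a real spectrum (eigenvalues +-sqrt(2))
H = np.array([[1.0, 0.5],
              [2.0, -1.0]])

evals, R = np.linalg.eig(H)               # columns of R: right eigenvectors
assert np.allclose(evals.imag, 0.0)       # spectrum is purely real

Lv = np.linalg.inv(R).conj().T            # columns: left eigenvectors chi_n
Theta = Lv @ Lv.conj().T                  # metric Theta = sum_n |chi_n><chi_n|

# Positive definiteness and pseudo-Hermiticity H^dagger Theta = Theta H
assert np.all(np.linalg.eigvalsh(Theta) > 0)
assert np.allclose(H.conj().T @ Theta, Theta @ H)
print("metric found:\n", Theta)
```

Rescaling the eigenvectors by arbitrary positive factors yields another admissible $\Theta$, which already illustrates the non-uniqueness discussed next.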
The construction is non-trivial in most cases and contains an additional subtlety: the resulting metric operator is non-unique. For a given Hamiltonian $H$ there exists a whole class of admissible metric operators and associated scalar products. The mathematical framework does not provide any restriction that would make the result of the construction unique. To weaken the ambiguity, one employs additional, physically motivated requirements on the scalar product, i.e. on the metric operator. In the following section, we provide an example of pseudo-Hermitian systems which appear in relativistic quantum mechanics. It illustrates the construction of a proper positive-definite scalar product and discusses the consequent restriction of its ambiguity by the additional physical requirement of relativistic invariance.

2 Pseudo-Hermitian Hamiltonians in relativistic quantum mechanics

Attempts to marry the special theory of relativity (STR) and quantum mechanics gave rise to the well-known equations of relativistic quantum mechanics. The theory built on the Dirac equation provides a consistent description of relativistic spin one-half particles, at least in the presence of weak external fields. The theory based on the Klein-Gordon and Proca equations, the evolution equations for spin-zero and spin-one particles, has met serious problems with the probabilistic interpretation of its solutions: the associated scalar product turned out to be indefinite. The problem is closely related to the fact that these equations are of second order in the time derivative. Nowadays, the genuine fusion of STR and quantum theory is represented by quantum field theory. However, relativistic quantum mechanics may serve as a powerful tool for describing compound particles in weak external fields, or may provide relativistic corrections to non-relativistic systems. Recently, relativistic integer-spin systems have been reconsidered within the framework of pseudo-Hermitian quantum mechanics, and the positive scalar product has been constructed for both cases [15, 16]. The dynamics of a free spin-less particle is governed by the well-known Klein-Gordon equation

$$\left(\partial_t^2 + p^2 + m^2\right)\phi = 0. \qquad (3)$$

The first derivative of the wave function with respect to time can be understood as a new physical degree of freedom. In this context, we introduce the notation

$$i\partial_t \phi = \chi. \qquad (4)$$

Together with this relation, Eq. (3) turns into a pair of mutually intertwined differential equations for the functions $\phi$ and $\chi$. This system can be rewritten in the two-component formalism in a compact form which resembles the Schrödinger equation,

$$i\partial_t \Psi = H\Psi, \qquad \Psi = \begin{pmatrix} \phi \\ \chi \end{pmatrix}, \qquad H = \begin{pmatrix} 0 & 1 \\ p^2 + m^2 & 0 \end{pmatrix}. \qquad (5)$$

It was shown a long time ago [22] that the relativistic evolution equations can be unified formally in the framework of a 2(2s+1)-component formalism, where $s$ denotes the spin of the particle. Spin one-half keeps its privileged position even in this framework, as the associated (Dirac) Hamiltonian is Hermitian, in contrast to the integer-spin systems. The operator $H$ above ceases to be Hermitian. Instead, it satisfies the relation

$$H^\dagger = P H P, \qquad P = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad (6)$$

which establishes its pseudo-Hermiticity with respect to $P$. The Hamiltonian has a purely real spectrum, consisting of the values $\pm\sqrt{p^2 + m^2}$. Consequently, the existence of a positive metric operator is guaranteed. Its explicit form in momentum representation can be assembled from the left eigenvectors of $H$ and reads, with $E(p) = \sqrt{p^2 + m^2}$,

$$\Theta(p) = \begin{pmatrix} \bigl(\lambda_+(p) + \lambda_-(p)\bigr) E^2(p) & \bigl(\lambda_+(p) - \lambda_-(p)\bigr) E(p) \\ \bigl(\lambda_+(p) - \lambda_-(p)\bigr) E(p) & \lambda_+(p) + \lambda_-(p) \end{pmatrix}. \qquad (7)$$
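As a quick numerical cross-check of (5)–(7) at a fixed momentum; the values of p, m and of the two positive functions are arbitrary assumptions for illustration.

```python
import numpy as np

p, m = 0.8, 1.0
E = np.hypot(p, m)                          # E = sqrt(p^2 + m^2)

H = np.array([[0.0, 1.0],
              [p**2 + m**2, 0.0]])
P = np.array([[0.0, 1.0],
              [1.0, 0.0]])

# Pseudo-Hermiticity (6) and reality of the spectrum (+-E)
assert np.allclose(H.T, P @ H @ P)          # H is real, so H^dagger = H^T
ev = np.sort(np.linalg.eigvals(H).real)
assert np.allclose(ev, [-E, E])

# Metric (7) for an arbitrary positive choice of lambda_plus, lambda_minus
lp, lm = 0.7, 1.3
Theta = np.array([[(lp + lm) * E**2, (lp - lm) * E],
                  [(lp - lm) * E,    lp + lm]])
assert np.all(np.linalg.eigvalsh(Theta) > 0)
assert np.allclose(H.T @ Theta, Theta @ H)
```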
This contains the positive but otherwise arbitrary functions $\lambda_\pm(p)$, which represent the non-uniqueness of the scalar product. To restrict the ambiguity, we impose the additional requirement of Lorentz invariance of the scalar product,

$$(\psi_1, \psi_2)_\Theta = (\Lambda\psi_1, \Lambda\psi_2)_\Theta, \qquad (8)$$

which, in terms of infinitesimal transformations

$$\Lambda = 1 + i\epsilon M, \qquad \epsilon \to 0, \qquad (9)$$

leads to

$$M^\dagger \Theta = \Theta M. \qquad (10)$$

Here $M$ is a generator of the Poincaré group in the appropriate representation. Instead of an explicit computation, let us note that the solution of the preceding relation can be facilitated considerably by the use of Schur's lemma [15, 16]. The additional, physically motivated constraint reduces the ambiguity to a single parameter. Thus, we may write the resulting scalar product of two solutions $\phi_i$ of the Klein-Gordon equation, with $\hat{E} = \sqrt{p^2 + m^2}$ and $a \in (-1, 1)$, in the following form:

$$(\phi_1, \phi_2)_a = \frac{1}{2}\left[\,(\phi_1, \hat{E}\,\phi_2) + (\dot{\phi}_1, \hat{E}^{-1}\dot{\phi}_2) + i a\bigl((\phi_1, \dot{\phi}_2) - (\dot{\phi}_1, \phi_2)\bigr)\right]. \qquad (11)$$

The system of a free particle with spin one can be treated in a similar manner. The evolution equation

$$\partial_\mu F^{\mu\nu} + m^2 A^\nu = 0, \qquad (12)$$

where $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ and the $A_\mu$ are complex-valued functions, the components of a four-potential, can be rewritten in the multi-component formalism as well. Introducing a six-component wave function $\Psi^T = (m\vec{A},\, i\vec{E})$, where we define $F_{j0} = E_j$ for $j = 1, 2, 3$, one obtains a Schrödinger-like equation

$$i\partial_t \Psi = H\Psi, \qquad (13)$$

whose $6\times 6$ Hamiltonian $H$ is built from the mass term $m$ and the operator $\frac{1}{m}\,\mathrm{grad}\,\mathrm{div}$ in its off-diagonal blocks, and which is again pseudo-Hermitian with respect to $P$. Its spectrum is real, and the existence of a positive metric operator is guaranteed once more. After the explicit construction of the metric operator, we impose the requirement of relativistic invariance of the scalar product. As a result, we obtain a one-parameter family of scalar products [16]. Its explicit form (14) on the space of solutions $A^T = (A_0, \vec{A})$ of the Proca equations combines the spatial components of the potentials, their time derivatives and the helicity operator $\hat{h}$, again with a single free parameter multiplying the helicity term.

3 Metric operator in thermodynamics

In the preceding section, we considered a system involving a single particle described by the Hamiltonian $H$; the scalar products played the crucial role in computing the physical properties of such systems. The situation is different when we are interested in the collective behavior of identical systems: a statistical description of an ensemble of quantum systems comes into play. Let us assume that a system in equilibrium is described by a pseudo-Hermitian Hamiltonian with a purely real spectrum. Then the state of the system is described by the density matrix $\rho$, which is the solution of the Bloch equation

$$\partial_\beta \rho = -H\rho \qquad (15)$$

with the initial condition $\rho(0) = 1$. The parameter $\beta$ equals the inverse temperature, i.e. $\beta = 1/T$. The density matrix fixes the probability of finding the system in a given state; in this sense it provides only statistical information on the system. A numerical sketch of its solution follows.
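A minimal numerical sketch (not from the paper) of Eq. (15): for a matrix Hamiltonian, the Bloch equation is solved by $\rho(\beta) = e^{-\beta H}$, and the trace of $\rho$ depends only on the (real) eigenvalues, so it is insensitive to similarity transformations of $H$, in particular to the choice of metric. The matrices and the value of beta are arbitrary.

```python
import numpy as np

beta = 0.7
H = np.array([[1.0, 0.5],
              [2.0, -1.0]])                 # non-Hermitian, real spectrum

# rho(beta) = exp(-beta H) via the eigendecomposition H = R diag(e) R^-1
e, R = np.linalg.eig(H)
rho = (R @ np.diag(np.exp(-beta * e)) @ np.linalg.inv(R)).real

# The trace depends only on the spectrum ...
Z = np.trace(rho)
assert np.isclose(Z, np.sum(np.exp(-beta * e.real)))

# ... and is therefore invariant under any similarity transform of H.
S = np.array([[1.0, 0.3], [0.0, 2.0]])      # arbitrary invertible matrix
H2 = S @ H @ np.linalg.inv(S)
assert np.isclose(Z, np.sum(np.exp(-beta * np.linalg.eigvals(H2).real)))
print("partition function:", Z)
```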
The formal solution of the Bloch equation reads $\rho = e^{-\beta H}$ and should be normalized to unit trace ($\mathrm{tr}\,\rho = 1$) to acquire the desired statistical properties. The temperature-dependent normalization factor

$$Z = \mathrm{tr}\,\exp(-\beta H)$$

is of crucial importance, as it allows the direct computation of thermodynamic quantities (entropy, inner and free energy, …) and the subsequent derivation of the thermodynamic properties (state equation) of the system. It is called the partition function. It has been shown [20] that the partition function does not depend on the explicit form of the scalar product. This implies that information on the thermodynamics of the system can be gained without explicit knowledge of the scalar product. This statistical approach may represent a contribution to the discussion of the ambiguity of the scalar product, in the sense that the thermodynamics of the system is insensitive to the choice of the metric operator.

4 Conclusion

This paper was intended as a brief look into the realm of pseudo-Hermitian, or PT-symmetric, quantum mechanics. To satisfy the need for more detailed information, we refer to the review article [18], to the special issues of Czech. J. Phys. and J. Phys. A dedicated to PT-symmetric (pseudo-Hermitian) quantum mechanics [7–11], and to the references in these publications. In our presentation, we have tried to illustrate by examples how the formalism of pseudo-Hermitian quantum mechanics can be used in the description of a physical system. In particular, we have illustrated how it contributes to a consistent approach to the relativistic quantum mechanics of integer-spin systems. Pseudo-Hermitian operators can be used in the description of physical systems; however, their use requires a specific approach. One has to be careful about the spectrum of the operator: only operators with a real spectrum are physically admissible. The derivation of the physical properties of the system requires the construction of a new scalar product. The construction depends on the Hamiltonian, so that the scalar product is, in this sense, dynamically generated. As we have remarked, the study of the thermodynamic properties of systems described by pseudo-Hermitian Hamiltonians represents a new stream in the field, and may provide insight into the physical properties of these systems.

Acknowledgment

The author would like to thank the Universidad de Santiago de Chile for its kind hospitality. The work was partially supported by project No. LC06002 and by GAČR grant No. 202/07/1307.

References

[1] Scholtz, F. G., Geyer, H. B., Hahne, F. J.: Quasi-Hermitian operators in quantum mechanics and the variational principle. Ann. Phys. Vol. 213 (1992), p. 74.
[2] Bender, C. M., Boettcher, S.: Real spectra in non-Hermitian Hamiltonians having PT symmetry. Phys. Rev. Lett. Vol. 80 (1998), p. 5243.
[3] Bender, C. M., Boettcher, S., Meisinger, P.: PT-symmetric quantum mechanics. J. Math. Phys. Vol. 40 (1999), p. 2201.
[4] Dorey, P., Dunning, C., Tateo, R.: Spectral equivalences, Bethe ansatz equations and reality properties in PT-symmetric quantum mechanics. J. Phys. A, Vol. 34 (2001), p. 5679.
[5] Mostafazadeh, A.: Pseudo-Hermiticity versus PT symmetry: the necessary condition for the reality of the spectrum of a non-Hermitian Hamiltonian. J. Math. Phys. Vol. 43 (2002), p. 205.
[6] Mostafazadeh, A.: Pseudo-Hermiticity versus PT symmetry II: a complete characterization of non-Hermitian Hamiltonians with a real spectrum. J. Math. Phys. Vol. 43 (2002), p. 2814.
[7] Proc. 1st Int. Workshop on Pseudo-Hermitian Hamiltonians in Quantum Physics (editor: Znojil, M.), Czech. J. Phys. Vol. 54 (2004), p. 1–156.
[8] Proc. 2nd Int. Workshop on Pseudo-Hermitian Hamiltonians in Quantum Physics (editor: Znojil, M.), Czech. J. Phys. Vol. 54 (2004), p. 1005–1148.
[9] Proc. 3rd Int. Workshop on Pseudo-Hermitian Hamiltonians in Quantum Physics (editor: Znojil, M.), Czech. J. Phys. Vol. 55 (2005), p. 1045–1192.
[10] Special issue devoted to the subject of pseudo-Hermitian Hamiltonians in quantum physics (editor: Znojil, M.), Czech. J. Phys. Vol. 56 (2006), p. 885–1064.
[11] Special issue dedicated to the physics of non-Hermitian operators (editors: Geyer, H. B., Heiss, D., Znojil, M.), J. Phys. A, Vol. 39 (2006), p. 9965–10262, and references therein.
[12] Znojil, M.: Conservation of pseudo-norm in PT symmetric quantum mechanics. Rendiconti del Circ. Mat. di Palermo, Ser. II, Supp. Vol. 72 (2004), p. 211, arXiv:math-ph/0104012.
[13] Bender, C. M., Brody, D. C., Jones, H. F.: Complex extension of quantum mechanics. Phys. Rev. Lett. Vol. 89 (2002), p. 270401.
[14] Mostafazadeh, A.: Pseudo-Hermiticity versus PT symmetry III: equivalence of pseudo-Hermiticity and the presence of antilinear symmetries. J. Math. Phys. Vol. 43 (2002), p. 3944.
[15] Mostafazadeh, A.: Probability interpretation for Klein-Gordon fields and the Hilbert space problem in quantum cosmology. Class. Quant. Grav. Vol. 20 (2003), p. 155.
[16] Jakubský, V., Smejkal, J.: A positive-definite scalar product for free Proca particle. Czech. J. Phys. Vol. 56 (2006), p. 985.
[17] Figueira de Morisson Faria, C., Fring, A.: Time evolution of non-Hermitian Hamiltonian systems. J. Phys. A, Vol. 39 (2006), p. 9269.
[18] Bender, C. M.: Making sense of non-Hermitian Hamiltonians. To be published in Rep. Prog. Phys., arXiv:hep-th/0703096.
[19] Znojil, M., Geyer, H.: Construction of a unique metric in quasi-Hermitian quantum mechanics: non-existence of the charge operator in a 2×2 matrix model. Phys. Lett. B, Vol. 640 (2006), p. 52.
[20] Jakubský, V.: Thermodynamics of pseudo-Hermitian systems in equilibrium. To appear in Mod. Phys. Lett. A, arXiv:quant-ph/0703092.
[21] Bender, C. M., Brody, D. C., Chen, Jun-Hua, Jones, H. F., Milton, K. A., Ogilvie, M. C.: Phys. Rev. D, Vol. 74 (2006), 025016.
[22] Case, K. M.: Some generalizations of the Foldy-Wouthuysen transformation. Phys. Rev. Vol. 95 (1954), p. 1323.

Ing. Vít Jakubský, Ph.D.
Phone: +420 266 173 255
e-mail: jakub@ujf.cas.cz
Departamento de Física, Universidad de Santiago de Chile, Casilla 307, Santiago 2, Chile
(and Academy of Sciences of the Czech Republic, 250 68 Řež, Czech Republic)

PT-symmetry and integrability

A. Fring

We briefly explain some simple arguments based on pseudo-Hermiticity, supersymmetry and PT-symmetry which explain the reality of the spectrum of some non-Hermitian Hamiltonians. Subsequently we employ PT-symmetry as a guiding principle to construct deformations of some integrable systems, the Calogero-Moser-Sutherland model and the Korteweg-de Vries equation. Some properties of these models are discussed.

Keywords: PT-symmetry, pseudo-Hermiticity, Calogero-Moser-Sutherland model, KdV equation.

1 Introduction

Non-Hermitian Hamiltonians with a complex eigenvalue spectrum have been studied almost since the formulation of quantum mechanics, most prominently as consistent descriptions of dissipative systems resulting, for instance, from channel coupling [1]. It has also been known for a very long time that many interesting non-Hermitian Hamiltonians with a real eigenvalue spectrum arise naturally in various circumstances.
For instance, it was argued more than thirty years ago that the lattice versions of Reggeon field theory [2],

$$H = \sum_i \left[ \Delta\, a_i^\dagger a_i + i g\, a_i^\dagger (a_i + a_i^\dagger)\, a_i - \tilde{g} \sum_{\langle ij \rangle} (a_i^\dagger - a_j^\dagger)(a_i - a_j) \right], \qquad (1)$$

with $a_i, a_i^\dagger$ being standard creation and annihilation operators and $\Delta, g, \tilde{g} \in \mathbb{R}$, possess a real eigenvalue spectrum [3] (I am grateful to John Cardy for pointing this out). The reduction of the Hamiltonian in (1) to a single lattice site in zero transverse dimension [4] is very reminiscent of the so-called Swanson model [5], which results from replacing the interaction term by the simpler bilinear expression $g\, a^\dagger a^\dagger + \tilde{g}\, a a$. The latter model currently serves as a concrete, popular, solvable model to exemplify various general features related to the study of non-Hermitian Hamiltonians [5, 6, 7, 8, 9]. Affine Toda field theory with a complex coupling constant is a very prominent class of field-theoretical models which are argued [10, 11] to be consistent despite their Hamiltonians being non-Hermitian. Besides the study of such explicit models related to non-Hermitian Hamiltonians, in particular their spectral properties, the question of how to formulate the corresponding quantum mechanical description consistently was first addressed in [12]. A useful insight into how to implement PT-symmetry in this formulation was obtained later [13]. The current large interest in the subject of non-Hermitian Hamiltonian systems was initiated about nine years ago [14] by the surprising numerical observation that even the class of simple non-Hermitian Hamiltonians

$$H = p^2 - g\,(iz)^N, \qquad (2)$$

defined on a suitable domain, possesses a real, positive and discrete eigenvalue spectrum for integers $N \geq 2$ with $g \in \mathbb{R}$. Supported by the numerous new results and insights which have been obtained since (for some recent reviews see [15, 16, 17, 18]), the natural question arises of how to construct non-Hermitian Hamiltonians with real eigenvalue spectra in a more systematic way. The question I would like to address in this paper is how this may be achieved, in particular by generalizing some integrable models.

2 Real spectra of non-Hermitian Hamiltonians

Activities in spectral theory usually focus on normal or self-adjoint operators in some Hilbert space. With regard to the remarks made in the introduction, we shall first briefly review some arguments which may be used to explain the reality of the spectra of non-Hermitian Hamiltonians, and then employ them to construct new models which, depending on the argument used, are guaranteed, or at least are likely, to have a real eigenvalue spectrum.

2.1 Pseudo-Hermiticity

Since a Hermitian operator, say $h = h^\dagger$, is guaranteed to have real eigenvalues, i.e. $h\phi = \varepsilon\phi$ with $\varepsilon \in \mathbb{R}$, one may trivially construct isospectral Hamiltonians by means of a similarity transformation $H = \eta^{-1} h\, \eta$, such that $H\Phi = \varepsilon\Phi$ with $\Phi = \eta^{-1}\phi$. When $\eta$ is a Hermitian operator, this implies that the conjugation of $H$ is simply achieved by $H^\dagger = \eta^2 H \eta^{-2}$. Hamiltonians of this type are denoted as pseudo-Hermitian Hamiltonians [12, 19, 20, 21, 22]. One of the immediate virtues of these relations is that $\eta^2$ can be used consistently as a metric operator. Given a Hermitian Hamiltonian, it is of course trivial to construct several isospectral non-Hermitian Hamiltonians in this manner, simply by computing $\eta^{-1} h\, \eta$ for some positive $\eta$.
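Returning to (2) for a moment: a rough numerical illustration (not from the paper) of the observation made in [14] is to discretize the $N = 3$, $g = 1$ case, $H = p^2 + iz^3$, on the real line with Dirichlet boundary conditions; the low-lying eigenvalues already come out real to good accuracy. The box size and grid resolution below are ad hoc choices.

```python
import numpy as np

L, n = 6.0, 1200                      # box half-width and grid size (ad hoc)
z = np.linspace(-L, L, n)
dz = 2 * L / (n - 1)

# H = p^2 + i z^3 as a finite-difference matrix (Dirichlet boundaries)
T = (np.diag(2.0 / dz**2 + 1j * z**3)
     - np.diag(np.ones(n - 1), 1) / dz**2
     - np.diag(np.ones(n - 1), -1) / dz**2)

ev = np.linalg.eigvals(T)
ev = ev[np.argsort(ev.real)][:5]      # five lowest eigenvalues
print(np.round(ev, 4))                # imaginary parts should be ~0;
                                      # the ground state lies near 1.1563
```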
However, interesting situations arise when one is given simple non-Hermitian Hamiltonians, such as (1) and (2), possibly together with the knowledge that they possess a positive real spectrum, and one tries to construct their Hermitian counterparts by seeking convenient Hermitian operators $\eta$ such that $\eta H \eta^{-1} = h = h^\dagger$. Unfortunately, this is only feasible in an exact manner in some very rare cases [23, 5, 6, 7, 8, 9], and mostly one has to rely on perturbation theory, see e.g. [24, 25, 26, 8, 27]. More awkward is the fact that, when given exclusively the non-Hermitian Hamiltonian $H$, there might be several Hermitian counterparts, and the metric is therefore not even uniquely determined. One may select a particular metric by specifying, for instance, at least one more observable [12] or the spectrum.

2.2 Supersymmetry

Another standard procedure for producing isospectral Hamiltonians is to employ Darboux transformations or, equivalently, a supersymmetric quantum mechanical construction [28, 29]. For this one considers Hamiltonians $\mathcal{H}$ which can be decomposed into the form

$$\mathcal{H} = H_+ \oplus H_- = Q\tilde{Q} \oplus \tilde{Q}Q. \qquad (3)$$

As indicated in (3), one assumes that the two superpartner Hamiltonians $H_\pm$ factorize into the two supercharges $Q$ and $\tilde{Q}$, which intertwine the Hamiltonians $H_\pm$ as $QH_- = H_+Q$ and $\tilde{Q}H_+ = H_-\tilde{Q}$. Evidently the two charges commute with the Hamiltonian $\mathcal{H}$, i.e. $[\mathcal{H}, Q] = [\mathcal{H}, \tilde{Q}] = 0$, and thus the sl(1/1) algebra constitutes a symmetry of $\mathcal{H}$. As pointed out by various authors [30, 31, 32, 33, 34, 35, 36], one does not require the Hamiltonians $H_\pm$ to be Hermitian, so we allow $H_\pm \neq H_\pm^\dagger$. The only constraints which are natural to impose, when one wishes to make contact with the pseudo-Hermitian treatment of the previous section, are that the individual factors of $H_\pm$ are conjugated as [34]

$$Q^\dagger = \eta_-^2\, \tilde{Q}\, \eta_+^{-2} \quad \text{and} \quad \tilde{Q}^\dagger = \eta_+^2\, Q\, \eta_-^{-2}, \qquad (4)$$

where the operators $\eta_\pm$ are Hermitian, $\eta_\pm = \eta_\pm^\dagger$. As an immediate consequence of (4), both Hamiltonians $H_\pm$ in (3) become pseudo-Hermitian and possess Hermitian counterparts $h_\pm = h_\pm^\dagger$:

$$H_\pm^\dagger = \eta_\pm^2\, H_\pm\, \eta_\pm^{-2}, \qquad h_\pm = \eta_\pm H_\pm \eta_\pm^{-1}. \qquad (5)$$

By construction, all four Hamiltonians $H_\pm$, $h_\pm$ are therefore isospectral,

$$H_\pm \Phi_\pm = \varepsilon\, \Phi_\pm \quad \text{and} \quad h_\pm \phi_\pm = \varepsilon\, \phi_\pm, \qquad (6)$$

and their corresponding wavefunctions are intimately related,

$$\Phi_+ \sim Q\,\Phi_-, \qquad \Phi_- \sim \tilde{Q}\,\Phi_+, \qquad \phi_\pm = \eta_\pm \Phi_\pm. \qquad (7)$$

One may now characterize four qualitatively different cases, depending on the properties of the Hermitian operators $\eta_\pm$ in (7): i) for generic $\eta_\pm$ we have isospectral quartets; ii) for generic $\eta_-$ and $\eta_+ = 1$, and iii) for generic $\eta_+$ and $\eta_- = 1$, we find isospectral triplets; and finally iv) for $\eta_\pm = 1$ we have isospectral doublets. The interesting cases ii) and iii), which contain a Hermitian Hamiltonian, have been considered in [31]. Next, one needs to specify the explicit representation of the supercharges in terms of the superpotential W(x).
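Before specializing to differential operators, a quick matrix-level check (an illustrative sketch, with randomly chosen charges) of (3) and (6): the products $Q\tilde{Q}$ and $\tilde{Q}Q$ are isospectral for any pair of operators, and the intertwining relation holds identically.

```python
import numpy as np

rng = np.random.default_rng(2)
Q = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
Qt = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

Hp, Hm = Q @ Qt, Qt @ Q                  # superpartners as in (3)

# Q H_- = H_+ Q holds identically: Q (Qt Q) = (Q Qt) Q
assert np.allclose(Q @ Hm, Hp @ Q)

# Isospectrality (6): eigenvalues of AB and BA coincide
ep = np.sort_complex(np.linalg.eigvals(Hp))
em = np.sort_complex(np.linalg.eigvals(Hm))
assert np.allclose(ep, em)
print(np.round(ep, 3))
```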
Setting the parameters $\hbar = 2m = 1$, the simplest choices are differential operators of first order,

$$Q = \frac{\mathrm{d}}{\mathrm{d}x} + W \quad \text{and} \quad \tilde{Q} = -\frac{\mathrm{d}}{\mathrm{d}x} + W, \qquad (8)$$

such that the two superpartner Hamiltonians may be written as

$$H_\pm = -\frac{\mathrm{d}^2}{\mathrm{d}x^2} + W^2 \pm W' = -\frac{\mathrm{d}^2}{\mathrm{d}x^2} + V_\pm. \qquad (9)$$

Alternative choices with higher-order differential operators are discussed, for instance, in [37]. Assuming further that $H_-$ possesses a discrete spectrum, $H_- \Phi_n^- = \varepsilon_n \Phi_n^-$, one may adjust the energy scale such that $H_-^m \Phi_m^- = 0$ for some chosen $m$. In order to single out this ground-state wavefunction we denote it as $\psi_m := \Phi_m^- = c\, \exp\bigl(-\int^x W_m(s)\, \mathrm{d}s\bigr)$, $c \in \mathbb{R}$. Consequently, the superpartner potentials may be expressed in terms of the ground-state wavefunction and acquire the forms

$$W_m = -\frac{\psi_m'}{\psi_m}, \qquad V_-^m = \frac{\psi_m''}{\psi_m}, \qquad V_+^m = 2\left(\frac{\psi_m'}{\psi_m}\right)^2 - \frac{\psi_m''}{\psi_m}. \qquad (10)$$

Therefore the Hamiltonians

$$H_\pm^m = -\frac{\mathrm{d}^2}{\mathrm{d}x^2} + W_m^2 \pm W_m' + E_m \qquad (11)$$

are isospectral,

$$H_\pm^m \Phi_n^\pm = \varepsilon_n \Phi_n^\pm \quad \text{for } n \geq m. \qquad (12)$$

In order to disentangle the Hermitian from the non-Hermitian case, we separate the superpotential into its real and imaginary parts, $W_m = w_m + i\hat{w}_m$ with $w_m = w_m^\dagger$, $\hat{w}_m = \hat{w}_m^\dagger$, and likewise for the ground-state energy, $E_m = \varepsilon_m + i\hat{\varepsilon}_m$. With these notations we can rewrite (11) as

$$H_\pm^m = -\frac{\mathrm{d}^2}{\mathrm{d}x^2} + w_m^2 - \hat{w}_m^2 \pm w_m' + \varepsilon_m + i\left(2 w_m \hat{w}_m \pm \hat{w}_m' + \hat{\varepsilon}_m\right). \qquad (13)$$

Clearly, we encounter the situation ii) or iii) when

$$w_m = -\frac{\pm\hat{w}_m' + \hat{\varepsilon}_m}{2\hat{w}_m} \quad \text{or} \quad \hat{w}_m = 0, \qquad (14)$$

respectively. When given a Hamiltonian, irrespective of whether it is Hermitian or non-Hermitian, and at least one wavefunction, the exploitation of supersymmetry is a very constructive procedure for obtaining isospectral Hamiltonians, which again may be Hermitian or non-Hermitian.

2.3 PT-symmetry

A further very simple and transparent way to explain the reality of the spectrum of some non-Hermitian Hamiltonians results when we encounter unbroken PT-symmetry, which in the recent context was first pointed out in [13]. This means that both the Hamiltonian and the wavefunction remain invariant under a simultaneous parity transformation $P: x \to -x$ and time reversal $T: t \to -t$; that is, we require

$$[H, PT] = 0 \quad \text{and} \quad PT\,\Phi = \Phi, \qquad (15)$$

where $\Phi$ is a square-integrable eigenfunction on some domain of $H$. It is crucial to note that the PT-operator is an anti-linear operator, i.e. it acts as $PT(\lambda\Phi + \mu\Psi) = \lambda^*\, PT\Phi + \mu^*\, PT\Psi$, with $\lambda, \mu \in \mathbb{C}$ and $\Phi, \Psi$ being some eigenfunctions. An easy way to convince oneself of this property is to consider the standard canonical commutation relation $[x, p] = i$. Since $PT: x \to -x,\ p \to p$, we require $PT: i \to -i$ to keep this relation invariant. Utilizing now both relations in (15) and the anti-linear nature of the PT-operator, a very simple argument leads to the reality of the spectrum:

$$\varepsilon\,\Phi = H\Phi = H\,PT\,\Phi = PT\,H\,\Phi = PT\,\varepsilon\,\Phi = \varepsilon^*\,PT\,\Phi = \varepsilon^*\,\Phi \;\Rightarrow\; \varepsilon = \varepsilon^*. \qquad (16)$$

Whereas the first relation in (15) is usually trivial to check, the second is in general difficult to access, as one rarely knows all the wavefunctions. In case it does not hold, one speaks of a broken PT-symmetry, and the eigenvalues come in complex conjugate pairs. All the arguments in this subsection were essentially already known to Wigner in 1960 [38], relating to anti-linear operators in a completely generic form. Noting that the PT-operator is an example of such an operator, these ideas have been revitalized in a modified form and developed further in the recent context of the study of non-Hermitian Hamiltonians [13].
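The unbroken/broken dichotomy can be seen in the smallest possible example, a PT-symmetric 2×2 matrix (a toy sketch with arbitrary parameters, where P swaps the two components and T is complex conjugation): for $H = \begin{pmatrix} r e^{i\theta} & s \\ s & r e^{-i\theta} \end{pmatrix}$ the eigenvalues $r\cos\theta \pm \sqrt{s^2 - r^2\sin^2\theta}$ are real when $s^2 \geq r^2\sin^2\theta$ and form a complex conjugate pair otherwise.

```python
import numpy as np

P = np.array([[0, 1], [1, 0]])

def H(r, s, theta):
    return np.array([[r * np.exp(1j * theta), s],
                     [s, r * np.exp(-1j * theta)]])

for s in (1.2, 0.4):                    # unbroken vs broken regime
    M = H(r=1.0, s=s, theta=np.pi / 3)
    # antilinear PT symmetry: P conj(M) P = M
    assert np.allclose(P @ M.conj() @ P, M)
    print(s, np.round(np.linalg.eigvals(M), 4))
```

For s = 1.2 the eigenvalues are real; for s = 0.4 they come out as a complex conjugate pair, exactly as argued below (16).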
3 PT-symmetry as a guiding principle to construct new models

If we now wish to construct new models with real eigenvalue spectra, we may in principle use any of the previous arguments. Clearly, the exploitation of PT-symmetry on the level of the Hamiltonian is the most direct and transparent way, as one can simply read off this property immediately. One can then write down new PT-symmetric Hamiltonians by means of simple deformations, i.e. replacing for instance the potential $V(x)$ by $V(x)f(ix)$, $V(x)f(ix^p)$, $V(x) + f(ix)$ or $V(x) + f(ix^p)$, etc., with $f$ being some arbitrary function. Clearly, the Hamiltonians in (1) and (2) are of this type. Of course, these new models are not guaranteed to have real spectra, as the second property in (15) might be spoiled. Nonetheless, they have a high chance of describing non-dissipative physics and are potentially interesting.

3.1 PT-symmetric extensions of multi-particle systems

Basu-Mallick and Kundu [39] were the first to write down non-Hermitian extensions of some integrable many-particle systems, i.e. the rational $A_\ell$-Calogero models [40],

$$\mathcal{H}_{BK} = \frac{1}{2}\sum_i \bigl(p_i^2 + \omega^2 q_i^2\bigr) + \sum_{i \neq k} \frac{g^2}{(q_i - q_k)^2} + i\tilde{g}\sum_{i \neq k} \frac{p_i}{q_i - q_k}, \qquad (17)$$

with $g, \tilde{g} \in \mathbb{R}$ and $q, p \in \mathbb{R}^{\ell+1}$. There are some immediate questions one may pose [41] with regard to the properties of $\mathcal{H}_{BK}$: i) how can one formulate $\mathcal{H}_{BK}$ independently of the representation of the roots? ii) can one generalize $\mathcal{H}_{BK}$ to other potentials apart from the rational one? iii) can one generalize $\mathcal{H}_{BK}$ to other algebras, or more precisely Coxeter groups? iv) is it possible to include more coupling constants, and, in particular, v) are the extensions still integrable? It turns out that the answers to all these questions become quite simple once one realizes that (17) is in fact the standard Calogero model simply shifted in the momenta; the similarity transformation $\eta$ is simply the translation operator in p-space. In order to see this, and to answer the above questions, we ignore the confining term in (17) by taking $\omega \to 0$ and rewrite the Hamiltonian as

$$\mathcal{H}_\mu = \frac{1}{2}\, p^2 + \frac{1}{2}\sum_{\alpha \in \Delta} g_\alpha V(\alpha \cdot q) + i\,\mu \cdot p, \qquad (18)$$

where $\Delta$ is now any root system invariant under Coxeter transformations, $\mu = \frac{1}{2}\sum_{\alpha \in \Delta} \tilde{g}_\alpha f(\alpha \cdot q)\,\alpha$, $f(x) = 1/x$ and $V(x) = f^2(x)$. We have also introduced coupling constants $g_\alpha, \tilde{g}_\alpha \in \mathbb{R}$ for each individual root. The Hamiltonians $\mathcal{H}_\mu$ are meaningful for any representation of the roots and all Coxeter groups. For a specific choice of the representation of the roots, namely $\alpha_i = \epsilon_i - \epsilon_{i+1}$ for $1 \leq i \leq \ell$ with $\epsilon_i \cdot \epsilon_j = \delta_{ij}$, and for the Coxeter group $A_\ell$, we recover the expression in (17). To establish the integrability of these models, it is crucial to note the following non-obvious property:

$$\mu^2 = \frac{\tilde{g}_s^2\, \alpha_s^2}{2} \sum_{\alpha \in \Delta_s} V(\alpha \cdot q) + \frac{\tilde{g}_l^2\, \alpha_l^2}{2} \sum_{\alpha \in \Delta_l} V(\alpha \cdot q), \qquad (19)$$

where $\Delta_s$, $\Delta_l$ denote the short and long roots, respectively. For the details of the proof of this identity we refer to [41]. As a consequence of (19), we may re-express $\mathcal{H}_\mu$ in the form of the usual Calogero Hamiltonian with shifted momenta, together with a redefinition of the coupling constants:

$$\mathcal{H}_\mu = \frac{1}{2}(p + i\mu)^2 + \frac{1}{2}\sum_{\alpha \in \Delta} \hat{g}_\alpha V(\alpha \cdot q), \qquad \hat{g}_\alpha = \begin{cases} g_s + \tfrac{1}{2}\tilde{g}_s^2\,\alpha_s^2 & \text{for } \alpha \in \Delta_s, \\ g_l + \tfrac{1}{2}\tilde{g}_l^2\,\alpha_l^2 & \text{for } \alpha \in \Delta_l. \end{cases} \qquad (20)$$

Therefore, upon redefining the coupling constants, we may obtain $\mathcal{H}_\mu$ by a similarity transformation, $\mathcal{H}_\mu = \eta^{-1} H_{\mathrm{cal}}\, \eta$ with $\eta = e^{-x \cdot \mu}$, i.e. $\eta$ is precisely the translation operator in p-space mentioned above.
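The identity (19) can be checked numerically; the sketch below does so for the $A_2$ root system in its three-dimensional representation (a toy verification with an arbitrarily chosen configuration q and coupling; for a simply-laced algebra there is a single root length and coupling).

```python
import numpy as np

# Positive roots of A_2 in the standard 3d representation
pos = np.array([[1., -1., 0.],
                [0., 1., -1.],
                [1., 0., -1.]])
roots = np.vstack([pos, -pos])             # the full root system Delta

gt = 0.7                                   # common coupling g~
q = np.array([0.31, -1.24, 2.05])          # arbitrary positions

# mu = (g~/2) * sum_alpha f(alpha.q) alpha, with f(x) = 1/x
mu = 0.5 * gt * sum(a / np.dot(a, q) for a in roots)

# Right-hand side of (19): (g~^2 alpha^2 / 2) * sum_alpha V(alpha.q)
alpha2 = 2.0                               # squared length of the A_2 roots
rhs = 0.5 * gt**2 * alpha2 * sum(1.0 / np.dot(a, q)**2 for a in roots)

assert np.isclose(np.dot(mu, mu), rhs)
print(np.dot(mu, mu), rhs)
```

The cross terms between different roots cancel pairwise, which is the root-system identity underlying the proof in [41].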
The results of section 2.1 therefore apply, and one may construct, for instance, the corresponding wavefunctions as $\Phi_\mu = \eta^{-1}\Phi_{\mathrm{cal}}$. Similarly, one can establish integrability with the help of a Lax pair with shifted momentum. One may verify that

$$L = (p + i\mu)\cdot H + i\sum_{\alpha \in \Delta} g_\alpha f(\alpha \cdot q)\, E_\alpha, \qquad M = m\cdot H + i\sum_{\alpha \in \Delta} g_\alpha f'(\alpha \cdot q)\, E_\alpha \qquad (21)$$

fulfill the Lax equation $\dot{L} = [L, M]$, upon the validity of the classical equations of motion resulting from (18). Here the Lie-algebraic commutation relations

$$[H_i, H_j] = 0, \quad [H_i, E_\alpha] = \alpha_i E_\alpha, \quad [E_\alpha, E_{-\alpha}] = \alpha \cdot H, \quad [E_\alpha, E_\beta] = \varepsilon_{\alpha\beta}\, E_{\alpha+\beta}$$

are taken in the Cartan-Weyl basis, i.e. they are normalized as $\mathrm{tr}(H_i H_j) = \delta_{ij}$, $\mathrm{tr}(E_\alpha E_{-\alpha}) = 1$. The vector $m$ can be expressed in terms of the structure constants $\varepsilon_{\alpha\beta}$ and the potential in the usual fashion. We note that the Lax equation is PT-symmetric, since $PT: L \to L,\ M \to M$. Naturally, the conserved charges $I_k = \mathrm{tr}(L^k)/2$, notably the Hamiltonian $I_2$, have the same property. Having established the integrability of the Calogero models, one may address question ii) and try to extend these considerations to other potentials. Allowing now $f(x) = 1/\sinh x$ or $f(x) = 1/\sin x$, we obtain the hyperbolic and elliptic cases with $V(x) = f^2(x)$. Integrability is guaranteed by means of the same Lax pairs (21). However, when expanding the square in (20), the resulting Hamiltonian is not quite of the form (18),

$$\mathcal{H}_\mu = \frac{1}{2}\, p^2 + \frac{1}{2}\sum_{\alpha \in \Delta} \hat{g}_\alpha V(\alpha \cdot q) + i\,\mu \cdot p - \frac{1}{2}\mu^2, \qquad (22)$$

because the identity (19) does not hold for the other potentials. This means that the Hamiltonians in (22) constitute non-Hermitian integrable extensions of Calogero-Moser-Sutherland (CMS) models for all crystallographic Coxeter groups, including, besides the rational, also trigonometric, hyperbolic and elliptic potentials. Dropping the last term would break the integrability for the non-rational potentials.

3.2 PT-symmetric deformations of the Korteweg-de Vries equation

An even more popular integrable model than the CMS model is the one having the Korteweg-de Vries (KdV) equation [42] as its equation of motion,

$$u_t + u\, u_x + u_{xxx} = 0. \qquad (23)$$

This equation is known to remain invariant under $x \to -x$, $t \to -t$, $u \to u$, i.e. it is PT-symmetric. By the same recipe as outlined above, we may then carry out the deformation $u_x \to -i(iu_x)^\varepsilon$ with $\varepsilon \in \mathbb{R}$, which was originally performed for the second term in [43] and for the third term in [44], leading to the equations

$$u_t - iu\,(iu_x)^\varepsilon + u_{xxx} = 0 \qquad (24)$$

and

$$u_t + u\, u_x + i\varepsilon(\varepsilon - 1)(iu_x)^{\varepsilon - 2} u_{xx}^2 + \varepsilon\,(iu_x)^{\varepsilon - 1} u_{xxx} = 0, \qquad (25)$$

respectively. For the model in (24) one can establish the following properties: the Galilean symmetry is broken, the model possesses two conserved quantities in terms of infinite sums, and it exhibits steady-state solutions. However, it is unclear how PT-symmetry can be utilized further. In contrast, (25), despite looking more complicated, has simpler properties: it is Galilean invariant, possesses three simple conserved charges, exhibits steady-state solutions, PT-symmetry can be utilized to explain the reality of the energy, and it allows for a Hamiltonian formulation with the non-Hermitian Hamiltonian density

$$\mathcal{H} = -\frac{u^3}{6} - \frac{(iu_x)^{\varepsilon + 1}}{\varepsilon + 1}. \qquad (26)$$

Analogues of various different types of solutions of the KdV equation have been studied in [43, 44]. No soliton solutions have been found, and it seems unlikely that the models are integrable.
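For orientation, the one-soliton solution of the undeformed equation (23) can be verified symbolically; this checks the baseline before any deformation (a quick sketch, with an arbitrary soliton parameter k).

```python
import sympy as sp

x, t, k = sp.symbols('x t k', real=True, positive=True)

# One-soliton solution of u_t + u*u_x + u_xxx = 0 (Eq. 23):
# u = 12 k^2 sech^2( k (x - 4 k^2 t) )
u = 12 * k**2 * sp.sech(k * (x - 4 * k**2 * t))**2

residual = sp.diff(u, t) + u * sp.diff(u, x) + sp.diff(u, x, 3)
print(sp.simplify(residual))   # should print 0
```

The deformation in (24)/(25) destroys exactly this balance between the nonlinear and dispersive terms, which is consistent with the absence of soliton solutions reported in [43, 44].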
4 Conclusions

We have demonstrated that PT-symmetry serves as a very useful guiding principle for constructing new interesting models, some of which even remain integrable. Being closely related to integrable models, these new models have appealing features and deserve further investigation. Naturally, one may also reverse the setting and employ methods that have been developed in the context of integrable models to address questions which arise in the study of non-Hermitian Hamiltonians. For instance, one may employ Bethe ansatz techniques to establish the reality of the spectrum for Hamiltonians of type (2) [45].

5 Acknowledgments

I would like to thank the members of the Řež Nuclear Physics Institute, especially Miloslav Znojil, for their kind hospitality and for organizing this meeting. I also gratefully acknowledge the kind hospitality granted by the members of the Department of Physics of the University of Stellenbosch, in particular Hendrik Geyer. For useful discussions I thank Carla Figueira de Morisson Faria and Paulo Gonçalves de Assis.

References

[1] Friedrich, H., Wintgen, D.: Interfering resonances and bound states in the continuum. Phys. Rev. Vol. A32 (1985), p. 3231–3243.
[2] Cardy, J. L., Sugar, R. L.: Reggeon field theory on a lattice. Phys. Rev. Vol. D12 (1975), p. 2514–2522.
[3] Brower, R. C., Furman, M. A., Subbarao, K.: Quantum spin model for Reggeon field theory. Phys. Rev. Vol. D15 (1977), p. 1756–1771.
[4] Bronzan, J. B., Shapiro, J. A., Sugar, R. L.: Reggeon field theory in zero transverse dimensions. Phys. Rev. Vol. D14 (1976), p. 618–631.
[5] Swanson, M. S.: Transition elements for a non-Hermitian quadratic Hamiltonian. J. Math. Phys. Vol. 45 (2004), p. 585–601.
[6] Jones, H.: On pseudo-Hermitian Hamiltonians and their Hermitian counterparts. J. Phys. Vol. A38 (2005), p. 1741–1746.
[7] Jones, H., Mateo, J.: An equivalent Hermitian Hamiltonian for the non-Hermitian −x⁴ potential. Phys. Rev. Vol. D73 (2006), 085002.
[8] Figueira de Morisson Faria, C., Fring, A.: Isospectral Hamiltonians from Moyal products. Czech. J. Phys. Vol. 56 (2006), p. 899–908.
[9] Musumbu, D. P., Geyer, H. B., Heiss, W. D.: Choice of a metric for the non-Hermitian oscillator. J. Phys. Vol. A40 (2007), p. F75–F80.
[10] Hollowood, T.: Solitons in affine Toda field theory. Nucl. Phys. Vol. B384 (1992), p. 523–540.
[11] Olive, D. I., Turok, N., Underwood, J. W. R.: Solitons and the energy momentum tensor for affine Toda theory. Nucl. Phys. Vol. B401 (1993), p. 663–697.
[12] Scholtz, F. G., Geyer, H. B., Hahne, F.: Quasi-Hermitian operators in quantum mechanics and the variational principle. Ann. Phys. Vol. 213 (1992), p. 74–101.
[13] Bender, C. M., Brody, D. C., Jones, H. F.: Complex extension of quantum mechanics. Phys. Rev. Lett. Vol. 89 (2002), p. 270401.
[14] Bender, C. M., Boettcher, S.: Real spectra in non-Hermitian Hamiltonians having PT symmetry. Phys. Rev. Lett. Vol. 80 (1998), p. 5243–5246.
[15] Znojil, M. (guest editor): Special issue: Pseudo-Hermitian Hamiltonians in quantum physics. Czech. J. Phys. Vol. 56 (2006), p. 885–1064.
[16] Geyer, H., Heiss, D., Znojil, M. (guest editors): Special issue dedicated to the physics of non-Hermitian operators (PHHQP IV) (University of Stellenbosch, South Africa, 23–25 November 2005). J. Phys. Vol. A39 (2006), p. 9965–10261.
[17] Figueira de Morisson Faria, C., Fring, A.: Non-Hermitian Hamiltonians with real eigenvalues coupled to electric fields: from the time-independent to the time-dependent quantum mechanical formulation. Laser Physics Vol. 17 (2007), p. 424–437.
[18] Bender, C.: Making sense of non-Hermitian Hamiltonians. Rept. Prog. Phys. Vol. 70 (2007), p. 947–1018.
[19] Mostafazadeh, A.: Pseudo-Hermiticity versus PT-symmetry: the necessary condition for the reality of the spectrum. J. Math. Phys. Vol. 43 (2002), p. 205–214.
[20] Mostafazadeh, A.: Pseudo-Hermiticity versus PT-symmetry II: a complete characterization of non-Hermitian Hamiltonians with a real spectrum. J. Math. Phys. Vol. 43 (2002), p. 2814–2816.
[21] Mostafazadeh, A.: Pseudo-Hermiticity versus PT-symmetry III: equivalence of pseudo-Hermiticity and the presence of anti-linear symmetries. J. Math. Phys. Vol. 43 (2002), p. 3944–3951.
[22] Mostafazadeh, A.: Exact PT-symmetry is equivalent to Hermiticity. J. Phys. Vol. A36 (2003), p. 7081–7092.
[23] Znojil, M.: PT-symmetric harmonic oscillators. Phys. Lett. Vol. A259 (1999), p. 220–223.
[24] Bender, C. M., Brody, D. C., Jones, H. F.: Extension of PT-symmetric quantum mechanics to quantum field theory with cubic interaction. Phys. Rev. Vol. D70 (2004), 025001.
[25] Mostafazadeh, A.: PT-symmetric cubic anharmonic oscillator as a physical model. J. Phys. Vol. A38 (2005), p. 6557–6570.
[26] Figueira de Morisson Faria, C., Fring, A.: Time evolution of non-Hermitian Hamiltonian systems. J. Phys. Vol. A39 (2006), p. 9269–9289.
[27] Caliceti, E., Cannata, F., Graffi, S.: Perturbation theory of PT-symmetric Hamiltonians. J. Phys. Vol. A39 (2006), p. 10019–10027.
[28] Witten, E.: Constraints on supersymmetry breaking. Nucl. Phys. Vol. B202 (1982), p. 253–316.
[29] Cooper, F., Khare, A., Sukhatme, U.: Supersymmetry and quantum mechanics. Phys. Rept. Vol. 251 (1995), p. 267–385.
[30] Cannata, F., Junker, G., Trost, J.: Schrödinger operators with complex potential but real spectrum. Phys. Lett. Vol. A246 (1998), p. 219–226.
[31] Andrianov, A. A., Cannata, F., Dedonder, J. P., Ioffe, M. V.: SUSY quantum mechanics with complex superpotentials and real energy spectra. Int. J. Mod. Phys. Vol. A14 (1999), p. 2675–2688.
[32] Tkachuk, V. M., Fityo, T. V.: Factorization and superpotential of the PT-symmetric Hamiltonian. J. Phys. Vol. A34 (2001), p. 8673–8677.
[33] Dorey, P., Dunning, C., Tateo, R.: Supersymmetry and the spontaneous breakdown of PT symmetry. J. Phys. Vol. A34 (2001), p. L391–L400.
[34] Mostafazadeh, A.: Pseudo-supersymmetric quantum mechanics and isospectral pseudo-Hermitian Hamiltonians. Nucl. Phys. Vol. B640 (2002), p. 419–434.
[35] Sinha, A., Roy, P.: Generation of exactly solvable non-Hermitian potentials with real energies. Czech. J. Phys. Vol. 54 (2004), p. 129–138.
[36] Znojil, M.: PT-symmetric regularizations in supersymmetric quantum mechanics. J. Phys. Vol. A37 (2004), p. 10209–10222.
[37] Andrianov, A. A., Cannata, F., Dedonder, J. P., Ioffe, M. V.: Second order derivative supersymmetry, q-deformations and scattering problem. Int. J. Mod. Phys. Vol. A10 (1995), p. 2683–2702.
[38] Wigner, E.: Normal form of antiunitary operators. J. Math. Phys. Vol. 1 (1960), p. 409–413.
[39] Basu-Mallick, B., Kundu, A.: Exact solution of Calogero model with competing long-range interactions. Phys. Rev. Vol. B62 (2000), p. 9927–9930.
[40] Calogero, F.: Solution of a three-body problem in one dimension. J. Math. Phys. Vol. 10 (1969), p. 2191–2196.
[41] Fring, A.: A note on the integrability of non-Hermitian extensions of Calogero-Moser-Sutherland models. Mod. Phys. Lett. Vol. 21 (2006), p. 691–699.
[42] Korteweg, D. J., de Vries, G.: On the change of form of long waves advancing in a rectangular canal, and on a new type of long stationary waves. Phil. Mag. Vol. 39 (1895), p. 422–443.
[43] Bender, C. M., Brody, D. C., Chen, J., Furlan, E.: PT-symmetric extension of the Korteweg-de Vries equation. J. Phys. Vol. A40 (2007), p. F153–F160.
[44] Fring, A.: PT-symmetric deformations of the Korteweg-de Vries equation. J. Phys. Vol. A40 (2007), p. 4215–4224.
[45] Dorey, P., Dunning, C., Tateo, R.: The ODE/IM correspondence. arXiv:hep-th/0703066.

Andreas Fring
Phone: +44 (0)20 7040 4123
e-mail: a.fring@city.ac.uk
Centre for Mathematical Science, City University, Northampton Square, London EC1V 0HB, United Kingdom
(and Department of Physics, University of Stellenbosch, 7602 Matieland, South Africa)

Quality control in automated manufacturing processes – combined features for image processing

B. Kuhlenkötter, X. Zhang, C. Krewet

In production processes the use of image processing systems is widespread. Hardware solutions and cameras are available for nearly every application. One important challenge for image processing systems is the development and selection of appropriate algorithms and software solutions in order to realise ambitious quality control for production processes. This article characterises the development of innovative software combining features for an automatic defect classification on product surfaces. The artificial-intelligence method of support vector machines (SVM) is used to execute the classification task according to the combined features. This software is one crucial element in the automation of a manually operated production process.

Keywords: quality control, image processing, defect classification, support vector machine.

1 Introduction

Vision and image processing systems are widely employed in current manufacturing processes. Such systems provide the means to inspect and control the manufacturing process at the production site. In addition, vision systems make automatic inspection and process examination possible with the help of "intelligent" software components. The project on which this paper is based was initiated in this context, in order to assist and simplify the quality control of free-form surface manufacturing, especially of shining water taps in the sanitary industry. The first stage in the fabrication of water taps is the casting of the rough part; the material mainly used is brass. After the casting process, further machining processes such as drilling, milling and threading are carried out. The water tap is then ground and polished sequentially in order to obtain high surface quality. Finally, the end product is finished by electroplating. It is crucial to find the potential defects existing on the workpiece surface after grinding and polishing, before the final electroplating. If a defect is found on a part after electroplating, the manufacturer has to discard the part or remove the electroplated coating; removing the coating is a very expensive process because it is harmful to the environment. Cost is therefore saved if the defects can be detected at an early stage. In addition, it is always useful to know which type of defect has been identified. Once the type of a defect is known, the decision can be made whether the part must be discarded or can be retouched with appropriate compensatory engineering processes, and whether an adjustment of a previous machining process is necessary. So far, the tasks of defect inspection and categorization have been performed by human operators in a traditional "see and evaluate" manner. This is a labour-intensive and therefore also cost-intensive job. Automating this process improves the efficiency of defect inspection, releases workers from an unpleasant working environment, and finally reduces the overall manufacturing cost, especially in countries where wages are high. To automate defect inspection and classification, a vision system is installed and integrated into the manufacturing chain; a sketch of the classification step it performs is given below.
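As a sketch of the feature-based SVM classification described in the abstract, the following is a minimal scikit-learn pipeline; the feature matrix, class labels and parameter values are placeholders, not the authors' actual implementation.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Placeholder data: one row of combined features (e.g. geometric and
# texture descriptors) per detected surface anomaly; labels 0..14 for
# the 15 defect classes (10 real defects, 5 pseudo-defects).
rng = np.random.default_rng(1)
X = rng.standard_normal((300, 12))
y = rng.integers(0, 15, size=300)

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0, gamma="scale"))
clf.fit(X[:240], y[:240])

# With random placeholder features the score stays near chance level;
# real descriptors would be needed for a meaningful accuracy.
print("held-out accuracy:", clf.score(X[240:], y[240:]))
```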
in our project, the inspection process takes place after the end of the grinding and polishing processes. if no defects are found on the surface, the workpiece is accepted for the next processing step. otherwise, it is classified automatically in order to determine whether a removal of the defect is possible or the workpiece should be rejected directly. the vision system consists of a carrier, a camera system, a lighting system, other accessories and the software. the system hardware is responsible for providing a constant lighting environment and for obtaining the digital images under this constant circumstance. the software provides the solution for examining the images from the camera system, locating and classifying the defects on the workpiece surfaces. the challenge in our project is to characterise and determine the type of defects from predefined categories, and of additional foulings (which are called pseudo-defects in our project) on the surface. to this end, this paper presents measures and considerations, regarding both theoretical and practical aspects, to efficiently classify all defects as well as to separate real-defects from pseudo-defects.

2 automatic classification system
2.1 defects definition
the defects have been generally divided into 15 classes according to their physical attributes and the consequent handling operations. fig. 1 gives samples of all defect categories. all defects can mainly be split up into two major categories: real-defects, and pseudo-defects, which are actually foulings, like dust or oil, on the surface, or shades caused by uneven lighting on the free-form surfaces. the first ten defects in fig. 1 are real-defects and the last five are pseudo-defects. the pseudo-defects cause no quality problem, while the real-defects should be critically picked out, because they not only spoil the aesthetic aspect but sometimes also result in malfunctions of the final products. the vision system considers both kinds as failures at first and distinguishes them in the classification phase. therefore, two indices have to be taken into account: the overall correct classification ratio and the misclassification ratio between the real-defects and pseudo-defects. in addition, the misclassification ratio from real- to pseudo-defects is more crucial than that from pseudo- to real-defects, since in the former case a product with a real defect would wrongly be passed on in the manufacturing chain.

intelligent support for a computer aided design optimisation cycle
b. dolšak, m. novak, j. kaljun

it is becoming more and more evident that adding intelligence to existing computer aids, such as computer aided design systems, can lead to significant improvements in the effective and reliable performance of various engineering tasks, including design optimisation. this paper presents three different intelligent modules to be applied within a computer aided design optimisation cycle to enable more intelligent and less experience-dependent design performance.

keywords: knowledge based systems, computer aided design, structural analysis, ergonomics, aesthetics.

1 introduction
the use of computers is essential in modern design processes.
computer aided design (cad) is extensively applied in a wide range of industrial branches. however, there is a body of opinion that the benefits of applying cad are below expectations. the development of cad systems and their applications in engineering practice have been greatly influenced in the last decade by rapid increases in the performance of computer hardware. emphasis has been laid on the implementing numerical methods and computer graphics. hence, cad still concentrates rather too much on providing a means for representing the final form of design, whereas designers also need a continual stream of advice and information. design problems are known to be “ill–defined”. the problem statement usually sets a goal, some constraints within which the goal must be archived, and some criteria by which a successful solution might be recognised. the solution is unknown, and there is no certain way of proceeding from the statement of the problem to a statement of the solution. moreover, many design constraints and criteria also remain unknown and existing cad approaches, based on conventional programming methods, are not able to help the designer in dealing with uncertainty and inconsistencies. thus, the quality of design solutions still depends mostly on the designer’s skill and experience. it can be argued that not many successful intelligent systems are known to be applied in engineering domains, especially not in the field of design optimisation. until recently, engineering (design) problems were indeed considered as well-defined mathematically-formulated problems, that can be managed with the computer aids based on numeric representation and computer graphics, without ‘interference’ from artificial intelligence. however, it is becoming more and more evident that adding intelligent behaviour to the existing cad tools [1] can lead to significant improvements in the effective and reliable performance various engineering tasks, including design optimisation. in this paper, we will present three different intelligent modules to be applied within a design process to enable more intelligent and less experience-dependent design optimisation. a computer-aided geometric modeller and a structural analysis package are basic cad tools within the proposed optimisation cycle. the design cycle begins with the initial design and problem definition. the first intelligent module is applied to support the preparation of the numerical model for structural analysis. after the analysis, the second intelligent module is provided to support evaluation of the “expert” results, upon which the most appropriate design optimisation steps are defined. the final design is usually reached after several structural analyses, and each run involves appropriately adjusted input data, including design changes if necessary. the third intelligent module addresses more specific design issues related to ergonomic and aesthetic aspects of the design. this will be used by the designer when specifying the outer geometric shape and appearance of the product. 2 design optimisation design optimisation is a very complex iterative process. in many cases, the basic parameters for the optimising process are the results of structural engineering analysis. finite element analysis (fea) [2] is the most frequently used numerical method for simulation and verification of the conditions in the structure. if the structure does not satisfy given criteria, certain optimisation steps, such as redesign, use of other materials, etc., have to be taken. 
the initial design is made in a geometric modeller, analysed by fea or some other method for engineering analysis, and then re-designed in the modeller. this optimisation loop is repeated until the final design, one that satisfies the given criteria, is developed. since existing cad tools fail to provide functional advice, the quality of the initial design, the analyses and the re-design actions, and also the number of iterative steps needed to reach the final solution, depend mainly on the designer's experience. design experts can efficiently perform structural design optimisation. they have built up their experience over time by working on design and analysis problems for various products. their strategy when dealing with a design problem is based mainly on heuristics or rules of thumb. but what about less experienced designers? is it possible to avoid trial-and-error behaviour and help them to perform computer-aided design optimisation more efficiently? evidently, traditional design optimisation systems that concentrate on the numerical aspects of the design process are not successful in integrating the numerical parts with human expertise [3].

3 intelligent design optimisation cycle
in order to make the design optimisation process more intelligent and less experience-dependent, existing cad systems should be supplemented with some intelligent modules that will provide advice when needed. in the last decade, various artificial intelligence (ai) applications to engineering design have been reported. the book edited by d. t. pham [1] is a good collection of early examples related to this area. it is evident that ai applications to design are now the subject of intensive development and implementations. fig. 1 (fig. 1: intelligent design optimisation cycle) shows our proposal for an intelligent design optimisation cycle, based on interaction between a geometric modeller and an fea package. the idea is to encode the knowledge and the experience required in dealing with engineering design optimisation into a knowledge base (kb) that can be used by a computer system. the proposed intelligent kb modules should help the user to reach a final design that will fulfil the structural and other specific design criteria. the optimisation cycle begins with the problem definition and initial design. the first kb module is applied to support the user in the process of setting the finite element (fe) mesh parameters. after the initial fea is performed, the second intelligent module is involved to perform an evaluation of the "expert" results. for every iterative analysis, the input data is corrected and/or the structure is re-meshed before a new analysis is performed. if the geometry of the structure needs to be optimised, design changes are also performed before returning to the pre-processing phase of the analysis. a sketch of this control flow is given below.
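the following python sketch mirrors the cycle of fig. 1 under stated assumptions: all four helper functions are trivial stand-ins invented here for illustration and are not part of any of the systems described in this paper.

```python
# control-flow sketch of the intelligent optimisation cycle (fig. 1).
# all four helpers are invented stand-ins for the real kb modules,
# the fea package and the geometric modeller.
def propose_mesh(design):                     # kb module 1 (mesh design)
    return {"element": "solid", "resolution": 0.1}

def run_fea(design, mesh):                    # placeholder "analysis"
    return {"max_stress": 100.0 / (1.0 + design["thickness"])}

def evaluate_results(results, limit=40.0):    # kb module 2 (evaluation)
    if results["max_stress"] <= limit:
        return True, []
    return False, ["increase thickness"]      # redesign recommendation

def apply_redesign(design, actions):          # modeller-side design change
    if "increase thickness" in actions:
        design["thickness"] += 0.5
    return design

design = {"thickness": 1.0}
for iteration in range(10):                   # the iterative loop itself
    mesh = propose_mesh(design)
    ok, actions = evaluate_results(run_fea(design, mesh))
    if ok:
        break                                 # final design reached
    design = apply_redesign(design, actions)
print("final design:", design, "after", iteration + 1, "analyses")
```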
a special group of intelligent kb modules is envisaged to support the various design tasks in the geometric modeller in order to satisfy more specific design criteria, such as manufacturability, appropriateness for assembly, ergonomics, aesthetics, etc. as is shown in fig. 1, the idea is to link the existing fea package and the geometric modeller with the proposed kb modules into an integrated intelligent software environment for design optimisation. in addition, for a proper integration leading to a transparent and user-friendly optimisation tool, several interfaces need to be developed to ensure that the data is always converted to the format used in the next step of the optimisation. we will now discuss in greater detail three kb modules that are important composite parts of the proposed intelligent design optimisation cycle. two of them are strongly related to the structural analysis part of the optimisation process, while the third kb module contains more specific knowledge to support the designer in setting the ergonomic and aesthetic value of the new product. all three intelligent systems mentioned here are under research and development in our laboratory.

4 kb module for finite element mesh design
within fea, a number of different mesh models usually need to be created until the right one is found. the trouble is that each mesh has to be analysed, since the next mesh is generated with respect to the results derived from the previous mesh. considering that one fea can take from a few minutes to several hours and even days of computer time, there is obviously a strong motivation to design "optimal" finite element mesh models more efficiently – in the first step, or at least with a minimum of trials. as an alternative to the conventional "trial-and-error" approach to this problem, we have developed the finite element mesh design expert system named femdes [4]. the system was designed to help the user to define the appropriate finite element mesh model more easily, more rapidly, and with less dependence on experience. fea has been applied extensively for more than 30 years. however, there is no clear and satisfactory formalisation of the mesh design know-how. finite element mesh design is still a mixture of art and experience, which is hard to describe explicitly. however, many reports have been published in terms of problem definition, an adequate finite element mesh (chosen after several trials), and results of the analysis. these reports were used as a source of training examples for machine learning algorithms to construct more than 1700 rules for finite element mesh design by generalising the given examples. the way in which inductive logic programming techniques were applied to develop the kb in this particular case is presented in detail in [5]. fig. 2 (fig. 2: femdes application within the fea pre-processing phase) shows how femdes is to be applied within the fea pre-processing phase. the user has to define the problem (geometry, loads, and supports). the data about the problem needs to be converted from the fea pre-processor format into the symbolic qualitative description to be used by the kb module. femdes' task is to propose the appropriate types of finite elements and to determine the mesh resolution values. a command file for the mesh generator can be constructed according to the results obtained by the intelligent system. a toy illustration of this rule-based mapping is sketched below.
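as an illustration of the idea only (femdes itself is written in prolog and holds over 1700 induced rules), the following python fragment shows the flavour of mapping a symbolic qualitative description to mesh advice; the rules and attribute names here are invented, not taken from femdes.

```python
# invented, femdes-flavoured illustration: a symbolic qualitative
# description of the structure is matched against ordered rules that
# recommend an element type and a mesh resolution. the real system is
# written in prolog and uses >1700 rules induced from past analyses.
def recommend_mesh(description):
    rules = [
        (lambda d: d["shape"] == "cylindrical" and d["load"] == "pressure",
         ("shell elements", "fine mesh near supports")),
        (lambda d: d["load"] == "force" and d["stress_gradient"] == "high",
         ("solid elements", "fine mesh")),
        (lambda d: True, ("solid elements", "coarse mesh")),  # default rule
    ]
    for condition, advice in rules:
        if condition(description):
            return advice

print(recommend_mesh({"shape": "cylindrical", "load": "pressure",
                      "stress_gradient": "low"}))
```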
for femdes, we have built our own program to gain the most efficient correlation between the knowledge and the program part of the system. like the kb, the shell is also written in prolog. this enables the proper use of the kb for fe mesh design (inference engine), and also communication between the user and the system (user interface). a very important and useful feature of the user interface is its capability to explain the inference process, by answering the questions “why?” and “how?” a complete source code of the program, together with explanations of the algorithms performed by the program, can be found in [6]. the user has to prepare the input data file before running the system. the structure that is to be analysed needs to be described in exactly the same way as the training examples were presented to the learning algorithm in the knowledge acquisition phase. this can be done automatically by using the geometric model of the structure. guidelines for automatic transformation from numeric form to symbolic qualitative description are presented in [6]. however, femdes is not yet integrated with any commercial fea pre-processor. thus, the problem description currently needs to be made manually. this takes some time, especially for structures that are more complex. however, following a few simple algorithms the task does not require special knowledge and experience. to simplify the learning problem, the training set used for developing of the knowledge base was designed with the aim of being representative of a particular type of structures. the following limitations were taken into account: � all structures were cylindrical, � only forces and pressure were considered as loads, � highly local mesh refinement was not required. however, femdes can also be applied as a general tool for determining the mesh resolution values for real-life three-dimensional structures outside the scope of these limitations. the results of the system have to be adjusted subsequently, according to the specific requirements of the particular analysis. furthermore, they can always serve as a basis for an initial fe mesh, which is subject to further adaptations considering the results of the numerical analyses. it is very important to choose a good initial mesh and to minimise the number of iterative steps leading to an appropriate mesh model. thus, femdes can be very helpful to inexperienced users, especially through its ability to explain the inference process. 5 kb module for structural analysis-based design improvements the post-processing phase of the engineering analysis represents a synthesis of the whole analysis and is therefore of special importance. it concludes with the final report of the analysis, where the results are quantified and evaluated with respect to the next design steps, which have to follow, the analysis in order to find an optimal design solution. the sources for post-processing are numerical results of the computation performed in the previous phase of the analysis. the data is stored in a computer file. in spite of the fact that the records are quite well ordered, the numerical figures are hard to follow in the case of a complex real-life problem, when the data file is usually complex and extensive. nowadays fea software is very helpful at this point, as it offers adequate computer graphics support in terms of reasonably clear pictures showing the distribution of the unknown parameters inside the body of the structure. 
however, the user still has to answer many questions and solve many dilemmas in order to conclude the analysis and compose the report. the designer has to be able to judge whether the results of the analysis are correct and reliable, and also to decide what kind of design changes are needed, if any. most users need "intelligent" advice to interpret the results of the analysis adequately. unfortunately, this kind of help cannot be expected from the present software. traditional systems tend to concentrate on the numerical aspects of the analysis and are not successful in integrating the numerical parts with human expertise. in order to overcome this bottleneck, we decided to collect and encode the knowledge and experience needed to propose appropriate design actions that may lead to a design improvement. in this way, the prototype of the intelligent consultative system propose, for supporting design decisions considering the results of a prior stress/strain or thermal analysis, was developed [7]. propose provides a list of redesign recommendations that should be considered in order to optimise a certain critical area within the structure. fig. 3 (fig. 3: propose application within the fea post-processing phase) shows that, as a result of applying propose, the user may expect a list of redesign recommendations, based on the expert interpretation of the results of the prior numerical analysis. as a rule, several redesign steps are possible for design improvement. the selection of one or more redesign steps to be performed in a certain case depends on requirements, possibilities and wishes. the proposed system was developed in several steps. the most important step was to develop the knowledge base, where knowledge acquisition was the most crucial task [8]. theoretical and practical knowledge about design and redesign actions was investigated and collected. a wide range of different knowledge is needed to explore the possible design actions that should follow the engineering analysis. knowledge acquisition was carried out in three different ways: from a literature survey, from examinations of old engineering analyses, and from interviews with human experts. that was not an easy task. recommendations on redesign are scarce, and are dispersed in many different design publications. many reports on analyses contain confidential data and cannot be used. on the other hand, interviews and examination of existing redesign elaborations depend on cooperation with experts, and can be time-consuming. therefore, the scope of the results is greatly limited by the experts. production rules were selected as an appropriate formalism for encoding the knowledge, because they are quite similar to the actual rules used in the design process. each rule proposes a list of recommended redesign actions that should be taken into consideration while dealing with a certain problem, taking into account some specific design limits. the rules are generalised, and do not refer only to the examples that were used during the knowledge acquisition process. they can be used whenever the problem and the limits match those in the head of the rule. in such a case, applying the appropriate rule will result in a list of recommended redesign actions for dealing with the given problem. when using the propose system, the user has to answer some questions stated by the system in order to describe the results of the engineering analysis.
in addition, the critical areas within the structure need to be qualitatively described to the system. this input data is then compared with the rules in the knowledge base, and the most appropriate redesign changes to be taken into account in the given case are determined and recommended to the user. the system provides constant support to the user's decisions in terms of explanations and advice. finally, the user receives an explanation of how the proposed redesign changes were selected, and also some more precise information on how to implement a certain redesign proposal, including some pictorial explanations. fig. 4 (fig. 4: example of a pictorial explanation of a redesign action recommended by propose) shows an example of such an explanation for the proposed design action: "add smaller relief holes in the line of loads on both sides of the problem hole". the abstract description of the problem area should be as general as possible, in order to cover most of the problem areas, instead of addressing only very specific products, which is characteristic of some recent research projects in this field [9–11]. for this reason, the number of predefined attributes is relatively small. however, by answering some additional questions, the problem can be defined in a more refined manner. in cases when the problem area can be described to the system in different ways, it is advisable to run the system several times, each time with a different description. thus, the system will be able to propose more design actions, at the expense of only a few more minutes at the console. the larger number of proposals may confuse the user, who will probably need help in the form of explanations of the proposals. on the other hand, more proposals will provide more options for design improvements. the propose system was evaluated in two ways. first, experts who had already been involved in the knowledge acquisition process evaluated the system. then some real-life examples were used to test the performance of the system. the experts that participated in the evaluation process are practising designers and some academics. they individually evaluated the system from two points of view. first, they analysed the performance of the system using some real-life examples. they also evaluated the user interface by inspecting how well the system helps and guides the user, or even enables him or her to acquire some new knowledge. the suitability, clearness and sufficiency of the redesign proposals were also evaluated. all comments, critiques and suggestions presented by the experts were taken into consideration and led to numerous corrections and adjustments of the system.

6 kb module for ergonomic and aesthetic design
in order to deliver suitable design solutions, designers have to consider a wide range of influential factors. ergonomics and aesthetics are certainly among the most complex considerations. less experienced designers can meet several problems at this design stage. some computer tools are available for evaluating the ergonomic condition of the product [12]. however, much experience and knowledge in the field of ergonomics is required in order to choose and carry out the appropriate redesign actions to improve the ergonomic value of the product within a reasonable time.
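to make the production-rule formalism described above for propose concrete (a rule head of problem plus design limits, a rule body of recommended redesign actions - the same formalism the ergonomic rules below build on), here is a minimal invented sketch; only the first rule's action text is the fig. 4 example from the paper, everything else is hypothetical.

```python
# invented sketch of propose-style production rules: each rule pairs a
# (problem, design-limits) head with a list of recommended redesign
# actions. the first action text is the fig. 4 example from the paper;
# the rest is hypothetical.
RULES = [
    {"problem": "stress concentration at hole",
     "limits": {"hole_position": "in line of loads"},
     "actions": ["add smaller relief holes in the line of loads "
                 "on both sides of the problem hole"]},
    {"problem": "excessive deflection",
     "limits": {},
     "actions": ["add a stiffening rib", "increase wall thickness"]},
]

def recommend(problem, limits):
    recommendations = []
    for rule in RULES:
        head_matches = rule["problem"] == problem and all(
            limits.get(key) == value for key, value in rule["limits"].items())
        if head_matches:
            recommendations.extend(rule["actions"])
    return recommendations

print(recommend("stress concentration at hole",
                {"hole_position": "in line of loads"}))
```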
on the other hand, the aesthetic design phase still depends mainly on the skill and experience of the designer and is not supported by any computer tool of any practical value. in this context, we decided to develop an intelligent consultative system that will be able to support the designer through the decision-making process when defining the ergonomic and aesthetic parameters of the product [13]. expert advice is clearly often needed, and it could be very useful to apply the intelligent advisory system. moreover, since the aesthetic and ergonomic properties of the product are established in the early phases of product development, the intelligent advisory system should be able to support this process with minimum data requirements. the ergonomic analysis and aesthetic evaluation should be performed on the cad model. after that, the intelligent system can be used again to advise the user which design changes are possible or even necessary in order to improve the ergonomic and/or aesthetic value of the product. in order to improve the ergonomic and aesthetic value of the product, the design recommendations will be proposed to the user by using the expert knowledge collected in the kb of the system and the case-specific data given by the user. for proper control over each part of the system, we decided to build two separate knowledge bases, containing the theoretical and practical knowledge about the design and redesign actions, one for the ergonomic part and the other for the aesthetic part of the system. if the user applies only one part of the system, the inference engine will be able to use the separate kb that belongs to that part. on the other hand, if the complete system is used, both knowledge bases will be used, while some special rules will be applied to harmonise the ergonomic and aesthetic design recommendations, when necessary (fig. 5; fig. 5: intelligent support to ergonomic and aesthetic design). the kb module for ergonomic and aesthetic design, discussed in this section, is still under intensive research and development. currently, we are working on the development of the kb for the ergonomic part of the system, where we have limited the target area to hand tool design. in this context, the global hand tool ergonomic design goals [14], which need to be followed by the designer of the hand tool in order to meet health, safety and efficiency requirements, have been specified. the following are some of the goals recognised as most important:
- consider the anthropometrical data to define the dimensions and configurations;
- maintain the wrist in the neutral straight position;
- avoid tissue compression;
- reduce the excessive forces;
- protect against vibration, heat, cold and noise;
- ensure that the task can be performed at the appropriate height;
- reduce the static load;
- consider cognitive ergonomics.
the kb that is being developed will contain rules on how to realise these goals within the hand tool design.

7 conclusions
engineering design is obviously much more than just analysis and modelling, and existing cad systems need to be further explored to be able to assist in the other aspects of design as well.
the application of ai techniques in design is certainly an approach that is becoming more and more important. design optimisation is a part of the development process for almost every new product. it plays a very important role in the modern high-tech world, where only optimal solutions can win the game on the market. however, developing optimal design solutions is a very complex domain that cannot be treated adequately by using conventional cad tools, unless the user possesses special skills and experience. thus, many research activities are aimed at making the design optimisation process more intelligent and less experience-dependent. many experts share the opinion that this can be achieved by supplementing existing cad systems with some intelligent modules that will provide advice when needed. the intelligent modules discussed in this paper are important parts of the overall intelligent design optimisation cycle and are now under intensive research and development in our laboratory. design data is not always well formulated, and almost never complete. humans can deal with such data reasonably easily, while even the most “intelligent” programs have great difficulties. designers are also reluctant to assign responsibility for decisions to computer programs, no matter how competent they may appear. it can also be argued that an encoded design kb does not allow designers to express their creative ideas. all these and many other factors constrain the application of intelligent systems in design. therefore, it is likely that no single technology will be adequate by itself. the designer will have available a toolkit of techniques, including intelligent modules, for information and well-defined advice for making decisions and judgements. our research presented here takes a step or two forward on the way intelligent support for the design process. references [1] pham, d. t. (editor): artificial intelligence in design. springer–verlag, 1991. [2] zienkiewicz, o. c., taylor, r. l.: the finite element method – basic formulation and linear problems. mcgraw-hill, london, 1988. [3] lee, d., kim, s-y.: a knowledge-based expert system as a pre-post processor in engineering optimisation. expert systems with applications, vol. 11 (1996), no. 1, p. 79–87. [4] dolšak, b.: finite element mesh design expert system. knowledge-based systems, vol. 15 (2002), no. 5/6, p. 315–322. [5] dolšak, b., bratko, i., jezernik a.: knowledge base for finite element mesh design learned by inductive logic programming. artificial intelligence in engineering design, analysis and manufacturing, vol. 12 (1998), p. 95–106. [6] dolšak, b.: a contribution to intelligent mesh design for fem analyses, phd thesis (in slovene with english abstract), faculty of mechanical engineering, university of maribor, slovenia,1996. [7] novak, m., dolšak, b.: intelligent computer-aided structural analysis-based design optimisation. wseas transactions on information science and applications, vol. 3 (2006), no. 2, p. 307–314. [8] novak, m.: intelligent computer system for supporting design optimisation, phd thesis (in slovene with english abstract), faculty of mechanical engineering, university of maribor, slovenia, 2004. [9] calkins, d., su, w., chan, w.: a design rule-based tool for automobile systems design. sae special publications, vol. 1318 (1998), paper 980397, p. 237–248. [10] smith, l., midha, p.: a knowledge-based system for optimum and concurrent design and manufacture by powder metallurgy technology. int. journal of prod. res., vol. 
37 (1999), no. 1, p. 125–137. [11] pilani, r. et al.: a hybrid intelligent systems approach for die design in sheet metal forming. int. journal of advanced manufacturing technology, vol. 16 (2000), p. 370–375. [12] porter, j. m., freer, m. t., case, k.: computer aided ergonomics. engineering designer, vol. 25 (1999), no. 2, p. 4–9. [13] kaljun, j., dolšak, b.: computer aided intelligent support to aesthetic and ergonomic design. wseas transactions on information science and applications, vol. 3 (2006), no. 2, p. 315–321. [14] noyes, j. m.: designing for humans. hove press, 2001.

dr. bojan dolšak, univ. dipl. ing., e-mail: dolsak@uni-mb.si
dr. marina novak, univ. dipl. ing.
jasmin kaljun, univ. dipl. ing.
laboratory for intelligent cad systems, university of maribor, faculty of mechanical engineering, smetanova 17, si-2000 maribor, slovenia

power system stabilizer driven by an adaptive fuzzy set for better dynamic performance
h. f. soliman, a.-f. attia, m. hellal, m. a. l. badr

this paper presents a novel application of a fuzzy logic controller (flc) driven by an adaptive fuzzy set (afs) for a power system stabilizer (pss). the proposed flc, driven by afs, is compared with a classical flc driven by a fixed fuzzy set (ffs). both flc algorithms use the speed error and its rate of change as input vectors. a single generator equipped with flc-pss and connected to an infinite bus bar through double transmission lines is considered. both flcs, using afs and ffs, are simulated and tested when the system is subjected to different step changes in the reference value. the simulation results of the proposed flc, using the adaptive fuzzy set, give a better dynamic response of the overall system by improving the damping coefficient and decreasing the rise time and settling time compared with the classical flc using ffs. the proposed flc using afs also reduces the computational time of the flc, as the number of rules is reduced.

keywords: fuzzy controller, static and adapted fuzzy sets, power system stabilizer.

1 introduction
the design of a power system stabilizer is an important issue from the viewpoint of power system stability, since it damps out local plant modes and inter-area modes of oscillation without compromising the stability of other modes. a conventional power system stabilizer (cpss) comprises a wash-out circuit and a cascade of two phase-lead networks. to design such a pss, many strategies [1–6] have been proposed that included techniques such as root locus, bode plots, eigenvalue assignment, multivariable frequency domain techniques, optimal control design, etc. new techniques [7–10] based on expert systems, neural networks and rule-based fuzzy logic for pss design are also emerging. the fuzzy logic controller has emerged as a very active and fruitful research area. adaptive fuzzy controllers provide a means for continuously adapting the fuzzy set to match the desired performance criteria. their typical application area is the control of time-varying and/or nonlinear plants. fuzzy controllers may be either static or dynamic. the static fuzzy rules are usually based on operator experience, as fuzzy logic can easily encode linguistic information [11]. this is the main advantage of the fuzzy logic controller over neural networks. in the adaptive case, the linguistic information captured from operator experience can be used to initialize the fuzzy set. this helps to reduce the number of iterations in the training [12]. in this paper, three fuzzy controllers are proposed for the pss. the first uses seven fixed fuzzy sets for the input and output variables. the second uses five fixed fuzzy sets for the input and output variables. the third controller is proposed as an adaptation scheme, based on the back-propagation (bp) algorithm [12], which dynamically varies the fuzzy sets to achieve a better dynamic performance. the performance of the pss with both static and adaptive fuzzy logic controllers is compared through the simulation results, when the system is subjected to various disturbances.

2 system under study and a mathematical model of it
the system under study consists of a single synchronous generator (s.g.) connected to the infinite bus via double short transmission lines, as shown in fig. 1 (fig. 1: the synchronous generator, equipped with flc-pss based on ffs or afs, or cpss). a matlab program has been developed for the system under study. the synchronous generator is equipped with an automatic voltage regulator (avr). the equations of the system are derived in the d- and q-axis in [13, 14]. the magnetic circuit is assumed to be linear.
the differential equations that describe the system under study are as follows:

$$\Delta\dot{\omega} = \frac{1}{M}\left(\Delta T_m - K_1\,\Delta\delta - K_2\,\Delta E'_q - D\,\Delta\omega\right) \qquad (1)$$

$$\Delta\dot{\delta} = \omega_b\,\Delta\omega \qquad (2)$$

$$\Delta\dot{E}'_q = \frac{1}{T'_{do}}\left(\Delta E_{fd} - K_4\,\Delta\delta - \frac{\Delta E'_q}{K_3}\right) \qquad (3)$$

$$\Delta\dot{E}_{fd} = \frac{1}{T_a}\left(K_a\left(-K_5\,\Delta\delta - K_6\,\Delta E'_q + \Delta u_{pss}\right) - \Delta E_{fd}\right) \qquad (4)$$

where $\Delta$ denotes a small change in the following symbol or variable, $\omega$ is the mechanical angular speed, $T_m$ is the mechanical torque, $\delta$ is the power angle, $\omega_b$ is the base angular speed, and $E'_q$ is the voltage behind the transient reactance in the d-axis. the equations which describe the constants $K_1$ to $K_6$ and the variable $E_{fd}$ are given in [13, 14]. $T'_{do}$ is the field-winding open-circuit time constant (sec), $V_t$ is the terminal voltage, and $K_a$ and $T_a$ are the exciter gain and time constant; their values are given in appendix 1. $u_{pss}$ is the output of the power system stabilizer. all the system parameters and constants are given in appendix 1. the mathematical equations of the avr together with the exciter and the governor are given as follows:

$$E_{fd} = \frac{K_a}{1 + sT_a}\left(V_{ref} - V_t + u_{pss}\right) \qquad (5)$$

$$\Delta G = \left(a + \frac{b}{1 + sT_g}\right)\Delta\omega \qquad (6)$$

where $V_{ref}$ is the reference terminal voltage, $a$ and $b$ are constants, and $T_g$ is the governor time constant; their values are given in appendix 1. the complete system model, in block diagram presentation, is shown in fig. 2 (fig. 2: transfer function block diagram). all the system parameters and the constants of the avr, exciter, governor, and conventional pss are given in appendix 1. a numerical sketch of this model is given below.
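the step-response behaviour studied in section 5 can be reproduced qualitatively from eqs. (1)-(4). the following sketch integrates the linearised model with forward euler; ka, ta, d, h and t'do follow appendix 1, while k1..k6 are assumed illustrative values (the paper derives them from the machine data per [13, 14]), so the numbers are indicative only.

```python
# numerical sketch of eqs. (1)-(4) under a 10 % step in mechanical
# torque with upss = 0. ka, ta, d, h, t'do follow appendix 1; k1..k6
# are assumed illustrative values, not derived from the machine data.
import numpy as np

Ka, Ta, D, H, Tdo = 400.0, 0.02, 2.0, 4.63, 7.76
M, wb = 2 * H, 2 * np.pi * 50           # inertia constant, base speed
K1, K2, K3, K4, K5, K6 = 1.0, 1.2, 0.3, 1.5, 0.05, 0.5   # assumed

def derivs(x, dTm, upss=0.0):
    dw, dd, dEq, dEfd = x               # speed, angle, E'q, Efd deviations
    return np.array([
        (dTm - K1 * dd - K2 * dEq - D * dw) / M,          # eq. (1)
        wb * dw,                                          # eq. (2)
        (dEfd - K4 * dd - dEq / K3) / Tdo,                # eq. (3)
        (Ka * (-K5 * dd - K6 * dEq + upss) - dEfd) / Ta,  # eq. (4)
    ])

dt = 1e-4                               # small step: Ta = 0.02 s is stiff
x = np.zeros(4)
for _ in range(int(5.0 / dt)):          # 5 s of simulated time
    x = x + dt * derivs(x, dTm=0.1)
print("after 5 s: delta_omega = %.4f, delta_delta = %.4f" % (x[0], x[1]))
```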
3 fuzzy logic controller
fuzzy control systems are rule-based systems. a set of fuzzy rules represents the flc mechanism for adjusting the effect of certain system stimuli. the aim of fuzzy control systems is normally to replace a skilled human operator with a fuzzy rule-based system. the flc also provides an algorithm which can convert the linguistic control strategy, based on expert knowledge, into automatic control strategies. fig. 3 (fig. 3: generic structure of fuzzy logic controller) depicts the basic configuration of the flc. it consists of a fuzzification interface, a knowledge base, decision-making logic, and a defuzzification interface [11].

3.1 global input variables
the fuzzy input vector consists of two variables: first, the generator speed deviation $\Delta\omega$ and, second, the acceleration $\Delta\dot{\omega}$. we designed two fuzzy controllers. the first is the fuzzy controller with seven linguistic variables, using fixed fuzzy sets for each one of the input variables, as shown in fig. 4 and fig. 5, respectively (fig. 4: input fuzzy sets for speed deviation; fig. 5: input fuzzy sets for speed deviation change). the fuzzy set for the output variable is shown in fig. 6 (fig. 6: output fuzzy sets). in these figures, linguistic variables have been used such as pl (positive large), pm (positive medium), ps (positive small), z (zero), ns (negative small), nm (negative medium), nl (negative large). they have also been used for the output fuzzy set, adding the prefix u_ to the related ones, as indicated in table 1. the second fuzzy controller, based on five linguistic variables, is used for each input variable, as shown in fig. 7 and fig. 8, respectively. the fuzzy set for the output variables is shown in fig. 9. the fuzzy sets shown in figs. 7, 8 and 9 are used for both flc-pss with a fixed and with an adaptive fuzzy set; in these figures the solid lines represent the ffs (mfs before adaptation), while the dashed lines depict these fuzzy sets after applying the adaptation technique via the back-propagation algorithm [12] (the afs, or normalized mfs after adaptation). the adaptation technique is used to modify the membership function (mf) of each fuzzy set. in these figures, mfs of the linguistic variables have been used such as pl (positive large), p (positive), z (zero), n (negative), nl (negative large). these linguistic variables have been used for the output fuzzy set, adding the prefix u_ to the related ones, as indicated in table 2. generally, after building up the fuzzy sets for the input and output variables, it is required to develop the set of rules, the so-called look-up table, in which the relation between the input variables, $\Delta\omega$ and $\Delta\dot{\omega}$, and the output variable of the fuzzy controller is defined. the output of the flc is used to adjust the $u_{pss}$ value, as shown in fig. 1. these look-up tables are given in table 1, when using seven fuzzy sets (7flc), and in table 2, when using five fuzzy sets (5flc). the surface viewer, as shown in fig. 10 (fig. 10: rules surface viewer for flc using a fuzzy set), has a special capability that is very helpful in cases with two (or more) inputs and one output; this figure shows the output surface of the flc of the system versus the two inputs.

table 1: look-up table defining the relation between the input and output variables in fuzzy-set form for seven fuzzy sets (rows: speed deviation; columns: speed deviation change)

        nl     nm     ns     z      ps     pm     pl
  nl    u_nl   u_nl   u_nl   u_nl   u_nm   u_ps   u_z
  nm    u_nl   u_nm   u_nm   u_nm   u_ns   u_z    u_ps
  ns    u_nl   u_nm   u_ns   u_ns   u_z    u_ps   u_pm
  z     u_nl   u_nm   u_ns   u_z    u_ps   u_pm   u_pl
  ps    u_nm   u_ns   u_z    u_ps   u_ps   u_pm   u_pl
  pm    u_ns   u_z    u_ps   u_pm   u_pm   u_pl   u_pl
  pl    u_z    u_ps   u_pm   u_pl   u_pl   u_pl   u_pl

table 2: look-up table defining the relation between the input and output variables in fuzzy-set form for five fuzzy sets (rows: speed deviation; columns: speed deviation change)

        nl     n      z      p      pl
  nl    u_nl   u_nl   u_nl   u_n    u_z
  n     u_nl   u_nl   u_n    u_z    u_p
  z     u_nl   u_n    u_z    u_p    u_pl
  p     u_n    u_z    u_p    u_pl   u_pl
  pl    u_z    u_p    u_pl   u_pl   u_pl
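for illustration, table 1 can be encoded directly as a look-up structure; this is a minimal sketch, not the paper's matlab implementation.

```python
# direct encoding of table 1 (seven fuzzy sets): the linguistic output
# label is looked up from the linguistic values of the speed deviation
# (rows) and its rate of change (columns).
COLS = ["nl", "nm", "ns", "z", "ps", "pm", "pl"]
TABLE1 = {
    "nl": ["u_nl", "u_nl", "u_nl", "u_nl", "u_nm", "u_ps", "u_z"],
    "nm": ["u_nl", "u_nm", "u_nm", "u_nm", "u_ns", "u_z",  "u_ps"],
    "ns": ["u_nl", "u_nm", "u_ns", "u_ns", "u_z",  "u_ps", "u_pm"],
    "z":  ["u_nl", "u_nm", "u_ns", "u_z",  "u_ps", "u_pm", "u_pl"],
    "ps": ["u_nm", "u_ns", "u_z",  "u_ps", "u_ps", "u_pm", "u_pl"],
    "pm": ["u_ns", "u_z",  "u_ps", "u_pm", "u_pm", "u_pl", "u_pl"],
    "pl": ["u_z",  "u_ps", "u_pm", "u_pl", "u_pl", "u_pl", "u_pl"],
}

def rule_output(speed_dev, speed_dev_change):
    return TABLE1[speed_dev][COLS.index(speed_dev_change)]

# matches rule 1 of fig. 11 below: if x1 is ps and x2 is pm, y is u_pm
print(rule_output("ps", "pm"))
```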
3.2 the defuzzification method
the minimum of maximum value method has been used to calculate the output from the fuzzy rules. this output is usually represented by a polyhedron map, as shown in fig. 11. the defuzzification stage is executed in two steps. first, the minimum membership is selected from the minimum value of the two input variables (x1 and x2) with the related fuzzy set in that rule. this minimum membership is used to rescale the output rule, and then the maximum is taken to give the final polyhedron map, as shown in fig. 11 (fig. 11: schematic diagram of the defuzzification method using the center of area; rule 1: if x1 is ps and x2 is pm then y is u_pm; rule 2: if x1 is z and x2 is ps then y is u_ps). finally, the centroid or center of area has been used to compute the fuzzy output, which represents the defuzzification stage, as follows:

$$u_{pss} = \frac{\int y\,\mu(y)\,\mathrm{d}y}{\int \mu(y)\,\mathrm{d}y}$$

4 the proposed adaptive fuzzy logic controller
the adaptive fuzzy logic controller (aflc), using an adaptive fuzzy set, has the same inputs and output as the static flc. a full rule base (25 rules) is also defined. the rules have the general form: if $\Delta\omega$ is nl and $\Delta\dot{\omega}$ is n, then $u_{pss}$ is u_nl, where the membership-function labels are taken from {nl, n, z, p, pl}, as in the static fuzzy case. however, the output space has 25 different fuzzy sets. to accommodate the change in operating conditions, the adaptation algorithm changes the parameters of the input and output fuzzy sets. the algorithm presented in this section is designed to optimize the rule base of the fuzzy controller by shifting and/or modifying the support of the input and output fuzzy sets. in addition, it will not modify the rules or the structure of the fuzzy controller. in general, the membership grade $\mu_j^{(i)}$ of the fuzzy sets used in this research for the input or output variables has a triangular or trapezoidal configuration. these fuzzy sets are defined as follows. in the case of a triangular mf, the three parameters $\{a, b, c\}$ define

$$\mu_j^{(i)}(x;\,a,\,b,\,c) \;\overset{\mathrm{def}}{=}\; \begin{cases} 0, & x \le a \\ \dfrac{x-a}{b-a}, & a \le x \le b \\ \dfrac{c-x}{c-b}, & b \le x \le c \\ 0, & x \ge c \end{cases}$$

where $a_j^{(i)}, b_j^{(i)}, c_j^{(i)} \in \mathbb{R}$ and $a_j^{(i)} < b_j^{(i)} < c_j^{(i)}$, with $\mathbb{R}$ the set of real numbers.
this means that $\mu_j^{(i)}\!\left(a_j^{(i)}\right) = 0$, $\mu_j^{(i)}\!\left(b_j^{(i)}\right) = 1$ and $\mu_j^{(i)}\!\left(c_j^{(i)}\right) = 0$, as shown in fig. 12 (fig. 12: triangular membership function (mf)). the triangular membership functions used symmetric consequents and antecedents to represent the fuzzy set [11]. in the case of a trapezoidal mf, the four parameters $\{a, b, c, d\}$ define

$$\mu_j^{(i)}(x;\,a,\,b,\,c,\,d) \;\overset{\mathrm{def}}{=}\; \begin{cases} 0, & x \le a \\ \dfrac{x-a}{b-a}, & a \le x \le b \\ 1, & b \le x \le c \\ \dfrac{d-x}{d-c}, & c \le x \le d \\ 0, & x \ge d \end{cases}$$

where $a_j^{(i)}, b_j^{(i)}, c_j^{(i)}, d_j^{(i)} \in \mathbb{R}$ and $a_j^{(i)} < b_j^{(i)} \le c_j^{(i)} < d_j^{(i)}$. this means that $\mu_j^{(i)}\!\left(a_j^{(i)}\right) = 0$, $\mu_j^{(i)}\!\left(b_j^{(i)}\right) = 1$, $\mu_j^{(i)}\!\left(c_j^{(i)}\right) = 1$ and $\mu_j^{(i)}\!\left(d_j^{(i)}\right) = 0$, as shown in fig. 13 (fig. 13: trapezoidal membership function (mf)). the trapezoidal membership functions used symmetric consequents and antecedents to represent the fuzzy set [11]. the parameters of the membership functions of the fuzzy sets for the input and output variables are updated using the on-line back-propagation (bp) algorithm [12]. a simple mathematical form of the on-line updating of the mf is given by the following equation:

$$\theta_{k+1} = \theta_k + \eta\,\Delta\theta_k$$

where $\eta$ represents the learning rate and $\theta$ represents the parameters $a_j^{(i)}, b_j^{(i)}, c_j^{(i)}, d_j^{(i)}$ of the membership functions of the fuzzy sets for the input and output variables. $\Delta\theta_k$ is the change of these parameters, based on the performance of the system under study [12]. figs. 7, 8 and 9 show the simulation results of the normalized mfs of the input and output fuzzy sets before and after adaptation. the goal of the tuning strategy for the afs is to achieve a fast dynamic response, with no overshoot and a negligible steady-state error. the complexity of the fuzzy logic controller is reduced due to the lower number of rules and the adaptation of the membership function parameters.
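a minimal sketch of this adaptation step follows: a triangular mf and the update theta_{k+1} = theta_k + eta * delta(theta_k). the paper obtains delta(theta_k) by back-propagation per [12], which it does not spell out; a numerical gradient of a squared output error stands in for it here, so this is an illustration only.

```python
# sketch of the on-line update theta_{k+1} = theta_k + eta * d(theta_k)
# applied to one triangular mf. the paper obtains d(theta_k) by
# back-propagation per [12]; a numerical gradient of a squared error
# stands in for it here, so this is an illustration only.
import numpy as np

def tri_mf(x, a, b, c):
    # piecewise-linear triangular mf: mu(a) = 0, mu(b) = 1, mu(c) = 0
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def loss(theta, x, target):
    return (tri_mf(x, *theta) - target) ** 2

theta = np.array([-1.0, 0.0, 1.0])      # parameters a, b, c
eta, eps = 0.1, 1e-5                    # learning rate, finite difference
x_sample, mu_target = 0.4, 0.9          # one (input, desired grade) pair
for _ in range(100):
    grad = np.array([(loss(theta + eps * e, x_sample, mu_target)
                      - loss(theta - eps * e, x_sample, mu_target)) / (2 * eps)
                     for e in np.eye(3)])
    theta = theta + eta * (-grad)       # d(theta_k) chosen as -gradient
print("adapted (a, b, c):", np.round(theta, 3))
```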
9, 10 and 11 show the normalized mfs before and after training, using the bp technique for the input and output variables of the fuzzy controller. 5.2 terminal voltage disturbance the second case of disturbance is carried out when the s.g. is subjected to a 10 % step increase in the voltage reference, and then returns to the initial condition. figs. 15a and 15b show �, in radians, and � �, in rad/sec, respectively. the dynamic performance of aflc-pss is much better in compared with five flc-pss, and the same conclusions have been reached as those obtained with torque reference disturbances, as regards rising time, percentage overshoot, and oscillations. 6 conclusions this paper introduces a novel application flc-pss and tests it through a simulation program based on an adaptive fuzzy set. a classical fuzzy logic controller, using a fixed fuzzy set, was simulated and tested. the settling time and the rise © czech technical university publishing house http://ctn.cvut.cz/ap/ 9 czech technical university in prague acta polytechnica vol. 46 no. 2/2006 b) a) fig. 14: dynamic response of the synchronous generator (s.g.) equipped with flc-pss, aflc-pss and cpss, (a) power angle displacement, (b) speed change a) b) fig. 15: dynamic response of the synchronous generator (s.g.) equipped with flc-pss, aflc-pss and cpss, a) power angle displacement, b) speed change time are decreased when using the adaptive fuzzy controller. a aflc also improves the damping coefficient of the overall system under study. simulation results show the superiority of the adaptive fuzzy controller in over other controllers. the simulation results also show the effectiveness of the proposed fl with an adaptive fuzzy set scheme as a promising technique. in addition the proposed technique reduces the computation time of the flc. the number of rules is reduced from 49 in the case of seven fuzzy sets, to 25 when using aflc. appendix 1 the data of the machine, avr exciter and governor are given as follows: all parameters and data are given in per-unit values pe � 0.65, qe � 0.45, vt � 1, xd � 1.2, �xd � 0.19, xq � 0.743, h � 4.63, �td � 7.76, d � 2, local load data: g2 � 0.249, b1 � 0.262 line data: rt.l � 0.034, xt.l � 0.997 avr data: ka � 400, ta � 0.02 speed governor parameters (in p.u.) a � � 0.001238, b � � 0.17, tg � 0.03. reference [1] larsen, e. v., swann d. a.: “applying power system stabilizers: part i, part ii and part iii”, ieee trans. on power apparatus & systems, 1981, pas-100, p. 3017. [2] kothari, m. l., nanda, j., bhattacharya k.: “discrete mode power system stabilisers”, iee proceedings part c, 1993, 140, (6), p. 523–531. [3] aldeen, m., chetty m.: “a dynamic ouput feedback power system stabiliser”, proceedings of the control ’95 conference, vol. 2 (1995), melbourne, p. 575–579. [4] samarasinghe, v. g. d. c., pahalawaththa, n. c.: “design of universal variable-structure controller for dynamic stabilization of power systems”, iee proceedings generation, transmission and distribution, vol. 141 (1994), no. 4, p. 363–368. [5] anderson, p. m., fouad, a. a.: power system control and stability. ieee press, new york, 1994. [6] ahmed, s. s., chen, l., petroianu, a.: “design of suboptimal h� excitation controllers”, ieee transactions on power systems, vol. 11 (1996), no. 1, february 1996, p. 312–318. [7] ohtsuka, k., taniguchi, t., sato, t., yokokawa, s., ueki, y.: “a h� optimal theory-based generator control system”, ieee transactions on energy conversion, vol. 7 (1992), no. 1. [8] el-metwally, k. a., malik, o. 
p.: “a fuzzy logic power system stabiliser”, iee proc. generation, transmission and distribution, vol. 145 (1995), no. 3, p. 277–281. [9] lie, t. t, sharaf, a. m.: “an adaptive fuzzy logic power system stabiliser”, electric power system research, vol. 38 (1996), p. 75–81. [10] hsu, y. y., chen, c. r.: “tuning of power system stabiliser using artificial neural network”, ieee trans. energy conversion, vol. 6 (1991), no. 4, p. 612–619. [11] lee, c. c.: “fuzzy logic control systems: fuzzy logic controller, part i”, iee trans. syst. man, cybernetic, vol. 20 (1990), march /april, p. 404–418. [12] nürnberger, a., nauck, d., kruse, r.: “neuro-fuzzy control based on the nefcon-model”, recent developments. soft computing vol. 2 (1999), springer-verlag 1999, p. 168–182. [13] demello, laskowski, t. f.: “concepts of power system dynamic stability” ieee trans. on power apparatus & systems, vol. 94 (1979), p. 833. [14] soliman, h. f., badr, m. a., hellal, m. n. a.: ”improving the performance of the power system stabilizer using a variable rule based fuzzy logic controller”, scientific bulletin of ain shams univ., vol. 39 (2004), no. 3, september 2004, egypt. associate prof. dr. ing. hussein f. soliman e-mail: hfaridsoliman@yahoo.com dept. of electric power and machine faculty of engineering ain-shams university abbasia, cairo, egypt dr. ing. abdel-fattah attia phone:+202 5560046 fax:+202 5548020 e-mail: attiaa1@yahoo.com astronomy department national research institute of astronomy and geophysics (nriag), 11421 helwan cairo, egypt ing. mohammed hellal south cairo electric power station helwan, cairo, egypt prof. dr. ing.m. a. l. badr dept. of electric power and machine faculty of engineering ain-shams university abbasia, cairo, egypt 10 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 46 no. 2/2006 czech technical university in prague acta polytechnica https://doi.org/10.14311/ap.2023.63.0140 acta polytechnica 63(2):140–157, 2023 © 2023 the author(s). licensed under a cc-by 4.0 licence published by the czech technical university in prague utilising pareto efficiency and rsm to adjust binder content in clay stabilisation for yttre ringvägen, malmö per lindha, b, polina lemenkovac, ∗ a swedish transport administration, department of investments technology and environment, neptunigatan 52, box 366, se-201-23 malmö, sweden b lund university, lunds tekniska högskola lth (faculty of engineering), department of building and environmental technology, division of building materials, box 118, se-221-00, lund, sweden c université libre de bruxelles (ulb), école polytechnique de bruxelles (brussels faculty of engineering), laboratory of image synthesis and analysis (lisa), campus de solbosch, ulb – lisa cp165/57, avenue franklin d. roosevelt 50, b-1050 brussels, belgium ∗ corresponding author: polina.lemenkova@ulb.be abstract. in this paper, we present a new framework for improving soil strength using an advanced method of engineering statistics. the materials included clay till collected in yttre ringvägen, southern sweden. binders included quicklime, slag and ordinary portland cement used as pure binders and blended mixtures. we first applied the response surface methodology techniques aimed at binder blend optimisation: 1) central composite design; 2) box-behnken design; 3) simplex lattice design. the pareto charts were presented for modelling responses from tests with different binders and estimating their effects on soil strength. 
finally, to examine the variables important for soil stabilisation, we also evaluated the effect of the amount of binder and the interaction between cement/lime/slag in different ratios: 30-50-20 %, 50-50-0 % and 100-0-0 %. the paper highlights the major opportunities and challenges of engineering statistics as a cross-cutting research direction for the issues of civil engineering.

keywords: soil stabilisation, simplex experimental design, binder, opc, statistical analysis.

1. introduction
soil stabilisation is a critically important task in civil engineering. it is aimed at improving soil parameters and properties in various areas of civil engineering, such as road constructions, bridge or building engineering, and earthworks on pavements. stabilisation and solidification of soil is a widely applied method in geotechnical works performed using various binders as stabilising agents. the existing methods of soil stabilisation are aimed at improving the soil performance to obtain the required characteristics of foundations. although the state-of-the-art methods are applicable, new approaches using various stabilisation agents, used solitarily or in combinations, and novel binders require experimental testing and evaluation to assess their quality and effectiveness. despite widespread applications in civil engineering and extensive existing research, there are still some challenges in soil stabilisation, including the following:
(1.) evaluation of variations in soil properties and responses of individual soil specimens and variations between binder characteristics that affect the stabilisation process. for instance, this includes the ratio of stabilising agents with respect to water content, technical characteristics of the stabilising agent and the use of accelerators;
(2.) processing large amounts of sampling materials – modern engineering tasks in the construction industry are increasingly high dimensional and require, in many cases, processing of several tons of soil using hundreds of kg of binders. this necessarily requires the use of effective techniques to optimise this process;
(3.) dealing with occlusions in soil, such as fibre, which may create noise in the technical parameters of soil while modelling data, and therefore affect the stabilisation process.
at the same time, effective soil stabilisation is crucial for safe road constructions and engineering works. this especially concerns northern regions with harsh environmental and climate conditions that create challenges for infrastructure [1–7], due to the unique physical and mechanical properties of soil collected in the real-world environment, rather than the theoretical models presented in technical guidance. taking this into account, the importance of experiments on soil stabilisation consists in the complexity of real-case situations, because soil is a highly variable porous structure. formed as a mixture of organic matter, minerals and rocks, chemical components (gases or liquids), and biological particles (microorganisms), the properties of soil vary significantly, which requires experimental testing in earthwork constructions. while much progress has been made in soil stabilisation techniques to address various aspects of the abovementioned challenges, many issues are still far from being resolved.
this particularly concerns the lack of an effective and standardised framework for binder optimisation during the process of soil stabilisation. the majority of the accepted in situ techniques of soil stabilisation are based on the approach where the stabilisation agent, as one binder, is distributed over the soil specimens. in such cases, the most widely used binder is lime, as shown by numerous existing cases [8–14]. ordinary portland cement (opc) is another widely used stabilising agent. recent advances in utilising opc as a binder [15–19] have made it widely used for soil stabilisation in the field of civil engineering. the difference in the effects of opc and other binders opens a new way to model and compare the reaction of soil with binders during the stabilisation process. for instance, the reactions of lime and opc with soil differ and have their own advantages and disadvantages, as discussed in the existing literature [20–25]. the advantages of lime, which is one of the oldest binders used to improve the engineering properties of soils, are as follows:
(1.) it provides a long period of workability and service life;
(2.) it enables a high level of homogeneity in the mixtures with soil;
(3.) it is effective in decreasing the water content;
(4.) it reduces the plasticity index of soil, which facilitates a higher workability of foundations [26].
however, there are also certain disadvantages of lime, which can be mentioned as follows:
(1.) lime needs certain conditions regarding the mineral content and grading of soil;
(2.) although lime does increase the strength of soil, the process is rather slow;
(3.) the organic content of soil may affect the performance of lime as a binder.
the advantages of opc as a binder include the following:
(1.) opc ensures a high level of strength during the process of stabilisation;
(2.) using opc increases the speed of the strength gain;
(3.) opc reacts well with water and thus enables fabricating an opc slurry;
(4.) compared to lime, opc has a higher robustness regarding soil grading;
(5.) the organic content in soil does not affect opc's performance as much as in the case of lime.
nevertheless, some drawbacks of opc as a binder should also be mentioned:
(1.) the effects resulting from the properties of opc remain for a shorter time;
(2.) the use of opc requires a significant amount of compaction work;
(3.) using opc results in a lower homogeneity of the opc-soil mixture as compared to a lime-soil mixture.
the difference in the effects of lime and opc is ultimately explained by the mineral structure, which governs the performance of the binder during soil stabilisation and its reaction with the specimens. thus, aluminium and silica minerals and water are necessary for lime to produce pozzolanic reactions. their main effects include soil cementation with higher strength, reduced deformability, and higher durability [27–30]. these minerals can already be present naturally or can be mixed with lime during the soil stabilisation. the advantage of the pozzolanic reaction is that it ensures an increase in soil strength which continues over years. in contrast, opc and cementitious binders are less sensitive and only need water for an effective reaction. in this case, hardening starts immediately and the strength of the opc-soil mixture grows rapidly, with the maximum value achieved after 28 days of curing time [31, 32].
besides the effects from binders, various soil types behave differently during the stabilisation process [33–37]. therefore, there is no unique recipe for the mixture of stabilisation agents. in response to this problem, the objective of the stabilisation methods is to correctly select binders and adjust their amounts and ratios in order to achieve the most effective stabilisation results. hence, binders should be defined and regulated carefully with respect to the in situ conditions and parameters of soil. blended binders may sometimes ensure the best performance, while in other cases, the effects of various binders on soil should be tested empirically. very often, the use of blended binders results in the best performance of the binder-soil mixture, which is the goal of stabilisation and the required workability of soil [38, 39]. in this paper, we propose a framework comprising a number of experimental designs for establishing an optimal technique of soil stabilisation, tested for different types of binders and blended binders. the objective of the study is to contribute to the development of technical methods of robust soil stabilisation through empirical binder optimisation, using a combination of the statistical approach and a series of practical tests. the goal was to find and indicate the optimal mixture of stabilising agents for soil stabilisation, considering both the soil properties and the economical limitations. the experiment revealed both the effective and the ineffective interactions between the different stabilising agents and their reaction with the soil specimens over time during the stabilisation process. the project is based on the empirical geotechnical works performed in the laboratory of the swedish geotechnical institute (sgi). the technical aim was to fabricate blended mixtures made from different pure binders having the best effect on soil stabilisation. to this end, we used opc, lime, ggbfs and fly ash as binders for stabilising the soil samples. the framework included a number of existing standardised technical workflows of soil stabilisation, modified and applied for our case.

2. essentials on binder selection

a mixture of two or three different binders is commonly acceptable for soil stabilisation in general and for the deep mixing method (dmm) in particular [40–42]. in the latter case, the lime-opc columns are fabricated using a special machine. previous research has shown that a mixture of opc and fly ash contributes to the gain of soil strength better than opc used as a single binder [43, 44]. other cases showed that there is a positive reaction between lime and fly ash, which can be used to improve the results of stabilisation [45–47]. blended binders have been successfully used for the past 30 years for the purpose of soil stabilisation [48–56]. to improve the efficiency of soil stabilisation, some commercial binder suppliers developed their own blends. such binders can be classified as follows:
(1.) general purpose (gp) opc;
(2.) general blend (gb) opc;
(3.) cementitious triple and quaternary blends of binders mixed from combinations of fly ash, gp opc, ground granulated blast furnace slag (ggbfs) and lime;
(4.) hydrated lime;
(5.) opc/asphalt blends.
previous studies on the evaluation of the effects of various mixtures demonstrated [57] that ggbfs contributes the most to the compressive strength development in expansive clayey soil, followed by opc.
moreover, blended mixtures of cement with ggbfs and of lime with ggbfs perform better than single binders. therefore, we chose ggbfs combined with cement and lime as the mixture to achieve notable effects on soil stabilisation and strength gain. in sweden, the use of blended binders has been on the rise since the 1970s (figure 1). the swedish opc and lime industry encouraged the use of the additives with regard to soil type (figure 2). nowadays, the use of mixed binders in deep mixing has increased up to 100 %, see figure 1.
figure 1. binders traditionally used for dmm. source: modified after [59]
figure 2. boundaries best suited for various binders. source: modified after [60]
the applicability of different binders is summarised in table 1. however, there is a lack of documented experience regarding the optimisation of different mixtures of binders for stabilisation purposes, specifically for sweden. selected papers reported cases of binder optimisation using deep mixing methods, for instance, at the sgi [61, 62]. regarding the strength development under the effects of additives added to different soil types, the sgi documented the following: opc usually demonstrates very good effects for high-plasticity clayey silt (mh), the lime-opc blend gives a good effect, and lime is satisfactory for soil stabilisation. the tested specimens included soil of the following types, according to the unified soil classification system (uscs):
(1.) inorganic clays of low to medium plasticity, gravelly clays, sandy clays, silty clays, lean clays (cl);
(2.) inorganic silts and very fine sands, rock flour, silty or clayey fine sands or clayey silts with slight plasticity (ml).
the specimens belong to the category of fine-grained soil consisting of silts and clays with a liquid limit of less than 50 %. for low plasticity silty clay (cl), opc has the best effects, followed by the lime-opc blend and lime with equally good effects. likewise, high plasticity clay (ch) is best stabilised by opc with excellent effects, followed by the lime-opc mixture and lime with an equally good effect. the experimental results from the existing works show a certain difference between this stabilisation and fieldwork using dmm, since the latter is used in in-situ conditions with dominating soft high-plasticity clay (ch); therefore, such methods do not include laboratory-based compaction works. at the same time, soil compaction is one of the essential parameters required for stabilisation.

table 1. applicability of binders for soil types. source: modified after [58]
binder type             | coarse-grained soil: gp | gw | gm & gc | sp & sw | fine-grained soil: cl | ch
gp opc (a)              | 1 | 1 | 1 | 2 | 2 | 3
gb opc (b)              | 1 | 1 | 1 | 1 | 1 | 2
cementitious blends (c) | 1 | 1 | 1 | 1 | 1 | 3
lime & opc (c)          | 3 | 3 | 2 | 3 | 2 | 1
lime & fly ash (c)      | 3 | 1 | 1 | 3 | 2 | 2
lime (d)                | 2 | 2 | 1 | 3 | 2 | 1
opc/bitumen (e)         | 1 | 1 | 2 | 2 | 3 | 3
notations for table 1: gp – poorly graded gravel and crushed rock; gw – well-graded gravel; gm – silty gravel; gc – clayey gravel; sp – clean sand, poorly graded; sw – clean sand, well graded; cl – low plasticity sandy/silty clay; ch – high plasticity heavy clay. ratings: 1 – excellent; 2 – satisfactory; 3 – not suitable.

following the different national standards and approaches to soil stabilisation using one binder, recommendations exist for the laboratory workflow [63, 64]. according to these references, traditional and modified techniques of stabilisation provide examples of improving or optimising the existing methods [65–73].
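as an aside, table 1 lends itself to a simple programmatic lookup; the following python sketch (ours, not part of the original workflow) encodes the table so that candidate binders can be ranked for a given uscs soil class.

```python
# a minimal sketch encoding table 1 as a lookup table.
# ratings: 1 = excellent, 2 = satisfactory, 3 = not suitable.

SUITABILITY = {
    #                  gp  gw  gm/gc  sp/sw  cl  ch
    "gp opc":          (1,  1,  1,     2,     2,  3),
    "gb opc":          (1,  1,  1,     1,     1,  2),
    "cementitious":    (1,  1,  1,     1,     1,  3),
    "lime & opc":      (3,  3,  2,     3,     2,  1),
    "lime & fly ash":  (3,  1,  1,     3,     2,  2),
    "lime":            (2,  2,  1,     3,     2,  1),
    "opc/bitumen":     (1,  1,  2,     2,     3,  3),
}
COLUMNS = ["gp", "gw", "gm/gc", "sp/sw", "cl", "ch"]

def rank_binders(soil_class: str) -> list[tuple[str, int]]:
    """return binders ordered from excellent (1) to not suitable (3)."""
    i = COLUMNS.index(soil_class)
    return sorted(((name, row[i]) for name, row in SUITABILITY.items()),
                  key=lambda pair: pair[1])

# e.g. for high-plasticity clay (ch):
print(rank_binders("ch"))  # lime & opc and lime rank first
```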
for the assessment of binder blends, multiple and advanced experimental designs are preferable, including those adopted from different disciplines or domains. for example, the test for the initial consumption of lime (icl) evaluates the amount of lime which should be added in order to ensure that the required ph level is reached. this lower limit guarantees that the effective pozzolanic reactions of soil with lime will occur. in contrast, if the binder consists of a lime-ggbfs blend, there is no need to reach this ph to ensure strength development, because lime and ggbfs together can lead to this effect. with regard to the above, complex design experiments on soil stabilisation can be classified into two groups:
(1.) those assessing the amount of required stabilising agents;
(2.) those testing the ratios of binders with respect to the proportions of the selected components in a blended mixture.

3. statistical approach

statistical analysis ensures quality improvement, optimisation, and economisation of the soil stabilisation workflow, which is otherwise a very expensive, time-consuming and laborious process. despite the similarities to high-level standard tests on binder selection, the advanced approaches using statistical methods of data analysis are fundamentally different in the following aspects:
(1.) statistical analysis enables using contextual information for a more robust evaluation of binder proportions;
(2.) statistical analysis is capable of leveraging arbitrary features of soil collected in real in-situ conditions, which may vary considerably;
(3.) statistical analysis embeds the standard context of soil classification data in the local features of soil.
statistical analysis enables separating the local features and the standard classification types of soil, which are matched and kept separately for the optimisation of binder blends using their different matrices. as a result, a number of statistical methods have been used to verify soil stabilisation performance by integrating geotechnical engineering approaches with computational analysis [74–79]. following this experience, the technique of the response surface methodology (rsm) is adopted from industrial areas including the automotive, chemical, and process industries. the rsm proved to be an effective method to improve the reliability of binder testing. besides, it helps the exploration of new processes, optimises performance, and improves the experimental design, as demonstrated in selected relevant studies [80–83].

4. stabilisation quality control

robust estimation methods used in geotechnical engineering, such as soil stabilisation, are applied under real conditions, which may introduce outliers into the statistical data analysis. for instance, the in-situ setting may vary locally and include various soil gradings, water contents or mineral contents in the specimens. therefore, the tested methods should be fitted to the prevailing natural conditions, which often include a large number of variables. to achieve the high level of soil strength which indicates the quality of the stabilisation process, blended mixtures should also be independent of variations in binder amount, homogeneity of mixtures, and compaction. this ensures the objectivity of the statistical experiment using separated variables. despite the complexity of the adopted approaches, it is important to optimise the workflow of data analysis in order to achieve the ultimate goal of binder optimisation.
this enables the analysis of the performance of soil in various scenarios, e.g. with varied proportions and contents of binders, varied water ratio, or with testing various soil samples. defining the determinant parameters for the stabilised soil is essential to design a high-quality soil-binder blend. the existing methods of parametrisation [84] are adopted in the experimental work using background knowledge and pre-processing tests. these steps are crucial for the smooth workflow of the laboratory experiments. in this regard, it is principally important to note that the evaluation of the amount of binders differs from the procedure of testing their types. these are two separate subproblems that are to be solved in the same framework, but using different evaluation approaches, in order to match the optimisation criteria of soil stabilisation. the core of the evaluation method is estimating the amount of binder, which is required to define the lower and upper limits of the binder in a mixture. it strongly depends on the soil type as well as its mineral and moisture content. the potential solutions to this include a series of pre-tests integrated with the empirical evaluation of soil samples. such pre-processing can utilise a few specimens and 1–2 response variables. the final design of the presented work was performed based on the principles discussed above and considering the results of the data preprocessing. the difference in the effects of binders on the gain in soil strength allowed us to evaluate their effectiveness on the quality of soil stabilisation. the decision on the selection of binders that fit best to the given soil was made with regard to the following response variables:
(1.) optimum moisture content (omc);
(2.) compaction energy required for a specific water content, the moisture condition value (mcv);
(3.) uniaxial compressive strength (ucs) at different curing periods;
(4.) water content after mixing of soil with binders, which was assessed to evaluate and compare the effects of different agents on binding water in a mixture.
the aim of the undertaken study was a quantitative-qualitative analysis of binders for soil stabilisation. first, we examined the types of binders which suit best for the stabilisation of the given soil type, that is, clay till. second, we optimised the amount of these binders required to achieve the desired quality of soil using methods of statistical analysis. as a result of these tests, the strength characteristics of the soil were significantly improved upon completion of the stabilisation process. the maximum ucs was achieved with the optimum amount of binder. the statistical analysis allowed us to ensure economic improvements of the works through the optimisation of the process. the ccd and the bbd methods were applied and integrated for the quantity evaluation of binders.

5. results

5.1. central composite design (ccd) as a factorial experiment

the method of the central composite design (ccd) is one of the most accepted 2nd-order experimental designs in engineering practice. it is based on the response surface methodology (rsm) [85] and a matrix approach, as shown in equation (1):

\[
\begin{pmatrix}
\alpha & 0 & 0 & \cdots & 0 \\
-\alpha & 0 & 0 & \cdots & 0 \\
0 & \alpha & 0 & \cdots & 0 \\
0 & -\alpha & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & \alpha \\
0 & 0 & 0 & \cdots & -\alpha
\end{pmatrix} . \quad (1)
\]

it comprises the 2^κ factorial with n_f runs, 2κ axial runs, and n_c centre runs. the κ = 2 ccd is presented in figure 3a.
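to make the structure of equation (1) and figure 3a concrete, here is a minimal python sketch (ours, in coded units) that generates the factorial, axial and centre runs of a ccd; α = √2 gives the rotatable two-factor design, while α = 1 yields the face-centred variant used later in section 5.3.

```python
import itertools
import numpy as np

def ccd(k: int, alpha: float, n_center: int = 1) -> np.ndarray:
    """central composite design in coded units:
    2**k factorial corners, 2*k axial (star) points at +/-alpha,
    and n_center centre runs, cf. equation (1) and figure 3a."""
    corners = np.array(list(itertools.product([-1.0, 1.0], repeat=k)))
    axial = np.zeros((2 * k, k))
    for i in range(k):
        axial[2 * i, i] = alpha       # +alpha on factor i, 0 elsewhere
        axial[2 * i + 1, i] = -alpha  # -alpha on factor i, 0 elsewhere
    center = np.zeros((n_center, k))
    return np.vstack([corners, axial, center])

print(ccd(k=2, alpha=np.sqrt(2)))  # rotatable 2-factor ccd
print(ccd(k=2, alpha=1.0))         # face-centred ccd (3**2-like layout)
```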
the mathematical approach is based on the matrix derived from the axial points, with 2κ rows depending on the factors, which signify the binder content. in our case, each factor is a binder, which is consecutively located at α, while all other factors which might influence soil stabilisation are set to zero. this enables evaluating the influence of the binders as factors controlling the gain of strength. the value of α is chosen in accordance with the runs of binder testing. changing the values of the binders enables achieving the required properties of soil in the design of the mixture, which leads to an increased gain in soil strength. the input of each factor in this approach is defined at least at three levels: low, medium and high. the response corresponding to the general model is presented in equation (2):

y = β0 + β1x1 + … + βκxκ + β12x1x2 + β13x1x3 + … + βκ−1,κxκ−1xκ + β11x1² + … + βκκxκ² + ε ,  (2)

where βi represents the regression coefficient and ε the statistical error, so that ε ∼ N(0, σ²). fitting a model to the observed values of the dependent variable y includes three essential steps of data processing:
• the main effects of the factors as components of binder blends (x1, …, xκ);
• the interactions between the stabilising agents in binder blends as factors (x1x2, x1x3, …, xκ−1xκ);
• the quadratic components of the factors (x1², …, xκ²).
figure 3. (a): ccd for κ = 2 with star points and rotatable 2nd-order designs. (b): 3² factorial design. (c): a comparison between the 2-factor ccd (continuous line) and a 2² factorial design (dashed line).
modelling binder blends also provides useful information for the detection of shear strength development, which can be measured additionally. the presented models for mixture data included the two models used for testing binder blends: the quadratic model and the special cubic model. respectively, the quadratic model is formulated using equation (3):

y = β1x1 + β2x2 + β3x3 + β12x1x2 + β13x1x3 + β23x2x3 + ε ,  (3)

where βi represents the regression coefficient and ε the statistical error, similarly to equation (2). correspondingly, the special cubic model can be expressed as demonstrated in equation (4):

y = β1x1 + β2x2 + β3x3 + β12x1x2 + β13x1x3 + β23x2x3 + β123x1x2x3 + ε ,  (4)

where βi represents the regression coefficient and ε represents the statistical error, respectively. as one can note, the difference between these models consists in the special approach of the cubic model, which applies a test for interactions between all the factors (x1x2x3), cf. equations (2), (3) and (4); a small numerical sketch of these two mixture models is given below. evaluating the effects of binders on soil stabilisation through a series of experiments can be used to detect the influence of each component of a blend on the particular soil specimen with regard to the soil type and the specifics of its mineral and moisture content. in this regard, we have performed the quantitative evaluation of the binder reaction and assessed the effectiveness of the low, medium and high levels of binder content independently.
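the following python sketch (ours) evaluates the quadratic and special cubic mixture models of equations (3) and (4) for a three-component blend; the coefficient values below are placeholders for illustration, not fitted results.

```python
def quadratic_mixture(x, b):
    """equation (3): y = b1*x1 + b2*x2 + b3*x3
                        + b12*x1*x2 + b13*x1*x3 + b23*x2*x3.
    x = (x1, x2, x3) are component proportions summing to 1;
    b = (b1, b2, b3, b12, b13, b23) are regression coefficients."""
    x1, x2, x3 = x
    b1, b2, b3, b12, b13, b23 = b
    return b1*x1 + b2*x2 + b3*x3 + b12*x1*x2 + b13*x1*x3 + b23*x2*x3

def special_cubic_mixture(x, b):
    """equation (4): the quadratic model plus the ternary term b123*x1*x2*x3."""
    *b_quad, b123 = b
    x1, x2, x3 = x
    return quadratic_mixture(x, b_quad) + b123 * x1 * x2 * x3

# placeholder coefficients for illustration only (not fitted values):
beta = (1.0, 0.8, 0.6, 0.4, 0.2, 0.3, 0.5)
blend1 = (0.30, 0.50, 0.20)   # opc / lime / ggbfs, cf. blend 1 below
print(special_cubic_mixture(blend1, beta))
```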
to ensure the robustness of the quantitative evaluation, a single level was chosen initially: 2.5 % binder per dry weight of soil. this initial proportion of binder was used to make groups of soil samples stabilised by single binders prior to the blended mixtures, where the proportions were selected in varying ratios to obtain the optimum percentage of each additive. the tested experiments were applied for clayey expansive tills of types cl (clays of low to medium plasticity to lean clays) and ml (inorganic silts and tills with silty or clayey fine sands or clayey silts with slight plasticity). other types of soil would require different binders, since the reaction of binders with soil particles differs depending on the type of soil: coarse-grained, medium-grained or fine-grained. thus, clays and silts are best stabilised by lime (quicklime or hydrated lime), while for sandy soils, it is more effective to use cementitious binders or pure opc. when using blended binders consisting of three different stabilising agents, such as opc, lime and ggbfs, the blends can be fabricated using the defined proportions of binders as follows:
– blend 1: opc = 30 %, lime = 50 %, ggbfs = 20 %
– blend 2: opc = 50 %, lime = 50 %, ggbfs = 0 %
– blend 3: opc = 100 %, lime = 0 %, ggbfs = 0 %
using the statistical and combinatorial methods ccd, bbd, and sld, we established that adding binders, in general, positively affects the development of strength, with lime being the best binder for the stabilisation of expansive clayey tills, followed by slag (ggbfs) and cement (opc). the tested binders increased the strength of the specimens of clay tills that have a high plasticity, with the best performance shown by lime, followed by slag (ggbfs) and cement (opc). such effects are explained by the high content of active components in these binders, such as aluminium (iii) oxide, calcium oxide, and active silicon dioxide (silica), which increase the binding between the particles of the soil-binder mixtures and, hence, the strength. compared to single binders, a mixture of lime, slag (ggbfs), and cement (opc) according to the proportions in blend 1 (opc = 30 %, lime = 50 %, ggbfs = 20 %) demonstrated the best effects on the stabilisation of clayey tills. note that the hypothesis is made concerning the impact of the factors, i.e. binders, to analyse the effects from a combination of values of these factors [86]. using this general model is beneficial in several aspects, as follows:
(1.) the ccd method is adaptable in the search for the optimum in the response from the factors. it enables finding the extreme values and saddle points, which correspond to the optimal regions.
(2.) the ccd presents feasible solutions to evaluate the parameters.
(3.) the application of the ccd model is possible for different domains of industrial design with specific assumptions.
(4.) additionally, although the ccd model is not suitable for the whole factorial range, an iterative process of gradual approximation through loops of adjustments of the true response function can result in an optimised operation region achieved in parts, e.g. a narrow range around the best values.
(5.) a combinatorial approach employed for finding the decisions can be used as an auxiliary means of determining the direction of the optimum, based on the choice of levels of the independent factors of the model.
(6.) the 2nd-order models and the ccd as a factorial experiment can be used complementarily.
the ccd with κ = 2 is a 2² factorial design which employs the combinatorial approach to solving the optimisation problem with 4 axial runs (figure 3a).
this design includes loops used to fit the 2nd-order model [85]. in the case of a face-centred ccd with κ = 2, the treatment combination is applied exactly as in a 3² design (figure 3b). the correspondences discovered by the ccd and the factorial design also include certain constraints on factor combinations for a standard ccd. thus, a high level of both binders (opc and lime) is not applicable, since both binders require adding water, which leads to a conflict of variables. figure 3c demonstrates this difference in the applicability region. the observed cases in the first experimental design comprise opc, lime and water as independent variables in a ccd for κ = 3.

5.2. specimen collection and processing

the soil used in this study is presented by specimens of clay till collected at yttre ringvägen, a ring road in malmö in southern scania, sweden. this soil is being used as an embankment fill material for the connecting road for the öresund bridge. the specimens were excavated from the test pit at the location petersborg. during the laboratory workflow, all the materials were passed through a 19 mm sieve to exclude coarse-grained particles (i.e. soil with a grain size over 19 mm) from further processing. the processing included the evaluation of factors as response variables, which were the following: the moisture condition value (mcv), density, unconfined compressive strength (ucs), and the change in water content after mixing of soil with binders. the material was systematically mixed and placed in sealed containers, which were stored in a climate room with a maintained temperature of 8 °c and a constant humidity of 85 %. the temperature of 8 °c was maintained to simulate curing conditions of the specimens close to the natural conditions of the swedish environment during cool periods. under these conditions, the soil samples were stored in a curing chamber for a 28-day curing period. besides, such temperature and humidity conditions reproduce the real-case environment during the freezing cycles and air drying, which enables stimulating the actions of the binders. as a result, the reaction of the soil-binder mixtures performs effectively, with particles of binder filling the pores of soil, which ultimately improves the compressive strength of the soil. the curing period of one week was followed by the preprocessing, which was performed to set up the test design. the water content measured in the stored soil used as raw material in the stabilisation tests is demonstrated in figure 4 ((a) test 1; (b) test 2). the water content demonstrated variations in the test range, with the theoretical value at 14.8 %, the modelled values ranging from 15.5 to 16.5 %, and the actual water content in the clay till lying in a range from 8.4 % to 11.2 % (set 1) and 13.2 % to 15.8 % (set 2) (figure 4). the variation in water content and grading was minimised during the experiment.

5.3. effects from opc and quicklime

the opc and quicklime were tested as independent variables during the 2nd experimental design of the face-centred ccd for κ = 2. the face-centred design was applied to compare the performance and the effect of both mixed and unmixed binders, i.e., opc and quicklime. in this case, the design shown in figure 3a was changed to that shown in figure 3b, with the low values assigned to zero. statistically, this means that 8 of the tested design points comprised
the stabilised material and one comprised the unstabilised material, i.e., the unstabilised specimens form part of the empirically tested region. here, two design points correspond to pure lime and pure opc, respectively, for the tested specimens with the low values assigned to zero. the purpose of this step is to avoid negative values in the star (target) points, which would arise if the low values in a standard ccd were set to zero and which would result in an infeasible design of the experiment. figure 5a and figure 5b show the results from these empirical tests, with the difference that figure 5a only depicts the significant effects shown in a surface, while figure 5b illustrates all the effects which were considered to model the surface.
figure 5. (a): the significant differences in water content depending on the type and amount of binder. (b): the difference in water content depending on the type and amount of binder, not adjusted for significance.
the response surface in figure 5a is represented by the fitted function in equation (5) (cf. equation (2)):

w = 15.2 − 27.0·opc − 63.7·ql + 738.0·opc·ql .  (5)

this equation only includes the significant terms of opc, lime and the interaction term opc·ql (here, ql denotes quicklime). correspondingly, the response surface in figure 5b is expressed by the following fitted function in equation (6):

w = 15.2 − 31.1·opc + 68.4·opc² − 31.0·ql − 1634.2·ql² + 738·opc·ql .  (6)

in equation (6), all the terms are included in the computation of the fitted function. the experimental loop comprised a series of tests with 40 specimens. this means that each point in the design represents four specimens, except for the centre point, which stands for eight samples. the major linear effects and the 2-way interactions significantly impacted the variations in the water content. the interaction part of equation (6) is positive, which means that using a mixture of ql and opc did not reduce the overall water content in the same way as using a pure stabilisation agent. this information can be used for cases when soil specimens are sensitive to changes in water content, as in the case of the clay till from southwest scania. moreover, stabilised and unstabilised soil behave distinctly regarding some parameters, such as omc or grading. therefore, it is a challenging task to include them in the same design. to support this through a comparative analysis, the relations between the mcv and the water content in stabilised and unstabilised soil are demonstrated in figure 6.
figure 6. mcv vs. water content for treated and raw soil. the 95 % confidence bound is shown by dash-dot lines.

table 2. anova statistical analysis. dependent variable: water content upon stabilisation (w_after). factor anova; r² = 0.79783; adj. r² = 0.76809; 2 factors, 1 block, 40 runs; ms-residual = 0.10063.
factor                     |     ss | df |     ms |      f |      p
(1) cem. (linear)          | 8.3162 |  1 | 8.3162 | 82.640 | 0.0000
cem. (quadratic)           | 0.0353 |  1 | 0.0353 |  0.351 | 0.5572
(2) q.lime (linear)        | 4.1409 |  1 | 4.1409 | 41.148 | 0.0000
q.lime (quadratic)         | 0.2492 |  1 | 0.2492 |  2.476 | 0.1247
1(l) by 2(l) (interaction) | 0.7843 |  1 | 0.7843 |  7.793 | 0.0085
error                      | 3.4215 | 34 | 0.1006 |      – |      –
total ss                   | 16.923 | 39 |      – |      – |      –

5.4. anova

we estimated the design of the experiment by the analysis of variance (anova) technique with the significance level at 0.05 for the tested binders.
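as a sketch of how an anova of the form of table 2 can be reproduced outside statistica, the following python fragment (ours; the column names cem, qlime and w_after are hypothetical placeholders) fits the two-factor quadratic model and prints the per-term anova table and the fitted coefficients, the latter corresponding to the form of equations (5) and (6).

```python
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# hypothetical frame: one row per run, with coded binder levels and the
# measured response (water content after stabilisation), cf. table 2.
df = pd.read_csv("runs.csv")  # columns assumed: cem, qlime, w_after

# full quadratic model of equation (2) for two factors:
# linear, quadratic and interaction terms.
model = smf.ols(
    "w_after ~ cem + I(cem**2) + qlime + I(qlime**2) + cem:qlime",
    data=df,
).fit()

# sequential (type i) anova table: ss, df, f and p per term, analogous
# to table 2; non-significant quadratic terms (p > 0.05) would then be
# dropped to obtain a reduced model of the form of equation (5).
print(sm.stats.anova_lm(model, typ=1))
print(model.params)  # fitted coefficients, cf. equations (5) and (6)
```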
for the statistical evaluation, we used both pure and blended stabilisation agents, which included quicklime, hydrated lime, ordinary portland cement (opc), and their blends in various forms. the anova was used to test the null hypothesis that there are no linear or quadratic main effects from these factors and no interaction between them, respectively. the results of the anova are summarised in table 2, with the p-level described as the probability of rejecting the null hypothesis when it is actually true. in the case of p = 0.05, this means that there is a 5 % probability that any relation between variables identified in the samples is random [86]. the notations used in table 2 are defined in the list of symbols. table 2 demonstrates that there is an interaction between the quicklime and the opc, as indicated in figures 8 and 9. the interaction between the quicklime and the opc is significant even at p = 0.01. these results are also presented in the pareto chart in figure 7, where each column presents one of the effects from the anova summarised in table 2. these effects are organised and presented categorically, ordered by the magnitude of the values. the pareto chart enables estimating the magnitude, or the values of the effect, which should be achieved to be statistically significant according to the level p.
figure 7. pareto chart of the effects that notably change the variable w_after at p = 0.05.
thus, figure 7 shows that the opc, the ql and the interaction between them are significant at a level of p = 0.05. the quadratic effects from the opc and ql are not significant. there are no 2nd-order terms in this equation, since it is reduced and modified according to the values of the significant effects, cf. table 2. all the data points represent the same type of soil for a robust statistical experiment. although there is a difference between the reactions of the various stabilisation agents with soil, all the binders fit the same regression line, which has an r² of 0.7759 for stabilised soil, based on 72 observations tested in this study. for the unstabilised soil, the r² of the regression line is 0.9303 and the fit is based on 18 observations. figure 6 illustrates that the mcv for the stabilised soil specimens becomes less sensitive to the variations in water content compared to the unstabilised raw soil before curing. from this comparison, one can conclude that using different types and different amounts of binders in the same ccd concerning the response variable mcv is possible and results in an effective soil stabilisation. if a pure binder should be a part of the design, a box-behnken design is considered instead of using a face-centred ccd approach. we base this recommendation on the results from four other statistical runs, in which the standard ccd and the face-centred ccd were compared and analysed.
figure 8. pareto charts: modelled responses from four different tests. subfigures 1 and 3 show the face-centred ccd, while 2 and 4 show the standard ccd. the response variable is the modified water content before and after stabilisation.
figure 9. predicted values against residual values from the tests.
the pareto charts of the standardised effects from the four different runs are shown in figure 8. here, two different specimens of clay till were evaluated by tests and modelled using the response variable approach with regard to the changed water content before and after stabilisation (∆w).
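a small python sketch (ours, with made-up effect values) shows how a pareto chart of standardised effects like those in figures 7 and 8 can be drawn: the absolute t-values of the model terms are ranked and compared against the critical t at p = 0.05.

```python
import matplotlib.pyplot as plt
from scipy import stats

# hypothetical standardised effects (t-values) for the five model terms,
# in the spirit of table 2 / figure 7; not the paper's actual numbers.
effects = {
    "cem (l)": 9.1, "q.lime (l)": 6.4, "1(l) by 2(l)": 2.8,
    "q.lime (q)": 1.6, "cem (q)": 0.6,
}
t_crit = stats.t.ppf(1 - 0.025, df=34)  # two-sided p = 0.05, 34 error df

names, values = zip(*sorted(effects.items(), key=lambda kv: kv[1]))
plt.barh(names, values)
plt.axvline(t_crit, linestyle="--")  # significance threshold
plt.xlabel("absolute standardised effect (t-value)")
plt.title("pareto chart of standardised effects")
plt.tight_layout()
plt.show()
```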
the results from these runs are shown in subfigures 1 and 3 of figure 8, respectively, which represent these samples with a face-centred ccd. subplot 1 in figure 8 shows that there are three significant effects: the linear effect from cement (l), the quadratic effect from cement (q), and the linear effect from lime (l). both linear effects from the opc and quicklime increase ∆w, while the quadratic effect from the opc decreases ∆w. this is owing to the disturbance from the unstabilised specimens, which are part of the design experiment. the results from subplot 1 in figure 8 can be compared with those in subplot 2, showing two significant linear effects from the opc and ql. the results showed that the quadratic effect of opc is not significant at p = 0.05 in a standard ccd test. the difference in the results is explained by the fact that in a face-centred ccd, the unstabilised soil affects the final results. subplot 3 in figure 8 shows the significant linear effect of ql. subplot 4 in figure 8 shows the significant linear effects of ql and opc. adding 2.5 % of ql to the soil reduced the water content, which means that the results in subplot 3 are affected by the varied water content. in order to check for the presence of outliers in the general data pool, the predicted values were plotted against the residual ones (figure 9), which demonstrated the usual data pattern, confirming the robustness of the experiment.

5.5. box-behnken design (bbd)

the box-behnken design is an experimental design which determines the amount of binders [87], originally developed for the rsm by g. e. p. box and d. behnken in 1960. it fits a quadratic model containing squared terms, products of two factors, linear terms and an intercept. in the bbd, each factor is evaluated at three hierarchical levels: low, medium, and high, which are placed at one of three equally spaced values.
figure 10. bbd for 3 factors with a centre point.
by a visual examination of the bbd in figure 10, one can note that the origin is located at a cube corner and that the points in the corners of the cube and the face points are absent. such a design of the bbd constrains the prediction of the response for a pure binder, which is not included among the design points, as the design is bounded within the low and high levels. the benefits of the bbd consist in its optimisation with respect to the economical limitations, because it is particularly useful in modelling very expensive tests and experimental runs of large datasets. in the bbd, the minimum number of factors is three. thus, if the design has >3 various binders, the high level of all binders is not considered, which presents a positive modelling effect. this is because, if the region of interest is between 1–5 % for the 3 binders, the total amount of binders would otherwise be 15 % if all high levels were used at the same time. however, this is not appropriate in most of the cases. thus, the bbd is optimised for the minimum number of factors at 3. therefore, the bbd was applied to estimate the time of performance and the effects from 3 different binders at 3 different curing periods before compaction. we used the bbd in the case where the standard ccd could not be used due to the target (star) points, with the time variable being set to 1, 3 and 5 hours of curing time.
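a minimal python sketch (ours) constructs the three-factor bbd of figure 10 in coded units: every pair of factors takes the ±1 combinations while the remaining factor is held at its centre level, plus a centre run; no cube corners appear in the design.

```python
import itertools
import numpy as np

def box_behnken(k: int = 3, n_center: int = 1) -> np.ndarray:
    """box-behnken design in coded units (-1, 0, +1):
    +/-1 combinations on every pair of factors, others held at 0,
    plus centre runs; corner and face points are absent."""
    runs = []
    for i, j in itertools.combinations(range(k), 2):
        for a, b in itertools.product([-1.0, 1.0], repeat=2):
            row = [0.0] * k
            row[i], row[j] = a, b
            runs.append(row)
    runs.extend([[0.0] * k] * n_center)
    return np.array(runs)

design = box_behnken(k=3)
print(design.shape)  # (13, 3): 12 edge midpoints + 1 centre point
```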
5.6. simplex lattice design

similar to the case of the bbd, each factor is evaluated at three hierarchical levels: low, medium, and high. we used the proposed simplex design, which consists of two parts: (1.) the simplex-lattice design; (2.) the simplex-centroid design. these two types of design are complemented with constraints. the real-time conditions accurately record the changing state of the computed variables. thus, if a binder requires an activator, e.g. ggbfs, its pure blend will affect the results of the test, which requires the assignment of an upper constraint. figure 11 demonstrates the design for the three components and a simplex lattice design of polynomial order 2 (a generation sketch is given at the end of this section).
figure 11. simplex-lattice design 3².
the existing constraints in a simplex design are as follows. the cycles of the tests occur on the boundary of the model, and the predictions in the interior can be uncertain. besides, the design cannot be used as a universal approach for testing binders, since it does not contain the complete mixtures, but only binary or pure mixtures. at the same time, the interactions between all the binder components as stabilising agents might be negative as well. the solution in such cases consists in adding interior points, in case the complete design region is of interest and requires extrapolation. to improve the resolution, the experimental setup was adjusted by repeated cycles, with the iterations increased from 6 to 10, following the existing methods [88]. the results regarding the fabrication of the soil-binder mixture by the simplex lattice approach are presented in figure 12. subfigures a and b show the results from the augmented simplex lattice design with 20 tests, while subfigures c and d illustrate the results from a 3² simplex lattice design with 12 runs. subplots a and c show the estimated significant effects, while the results presented in subplots b and d are shown as a response surface. the interior points, in this case, did not affect the final results. empirically, both response surfaces demonstrate a positive interaction between lime and ggbfs regarding the gain in the ucs of the soil. based on the results of this study, we recommend avoiding normal factor designs if a single stabilising agent is selected, because the zero value of a normal factor includes raw specimens. in case it exceeds the zero value, only blended binders should be tested. therefore, our method can be adopted if the goal of the test is to adjust the most suitable content of binders for soil stabilisation, including specimens collected from other regions. we evaluated the mixture design of the ternary statistical modelling applied on mixtures of binders for soil stabilisation, which proved to be an effective and robust approach. nevertheless, the type of soil and the variation in water content in the sample specimens should always be taken into consideration in geotechnical engineering, prior to construction works. the performed experimental design proved to be an excellent tool for improving the efficiency and quality of various methods of testing soil quality and performance after stabilisation with various binders: opc, slag, and lime, tested for swedish soil. the soil material included clay tills collected from the embankment fill from the test places of the connecting road for the öresund bridge, located in the southern scania region of sweden. besides the comparisons of the binders and the performance of the soil after curing, we conducted the anova statistical significance test implemented in statistica. we showed that the difference between the testing methods is statistically significant, as the p-values are all smaller than the 0.05 significance level.
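a short python sketch (ours) generates the {3, 2} simplex-lattice of figure 11: all compositions of three components whose proportions take the levels 0, 1/2 and 1 and sum to 1; the blends defined in section 5.1 are mixture points of exactly this kind.

```python
from fractions import Fraction
from itertools import product

def simplex_lattice(q: int = 3, m: int = 2) -> list[tuple[Fraction, ...]]:
    """{q, m} simplex-lattice: q components, proportions taking the
    m+1 equally spaced levels 0, 1/m, ..., 1, constrained to sum to 1."""
    levels = [Fraction(i, m) for i in range(m + 1)]
    return [pt for pt in product(levels, repeat=q) if sum(pt) == 1]

for point in simplex_lattice():
    print(tuple(float(x) for x in point))
# 6 points: the 3 pure vertices and the 3 binary 50/50 edge midpoints;
# e.g. blend 2 (opc 50 %, lime 50 %, ggbfs 0 %) is one of the edge points,
# while blend 1 (30/50/20) would be an added interior-type point.
```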
regarding the anova results, we noted that the opc and lime and the interaction between them are significant at p = 0.05 and that the quadratic effects of the opc and lime are not significant.

6. discussion

in this study, we proposed an approach to the evaluation of the amount and types of binders necessary for soil stabilisation. in some cases, the amount of binder is set to a defined permanent value, or the interaction between various stabilising agents is set to a fixed level, which is regarded as a mixture of designs that needs to be applied. a careful assessment of binder proportions enables taking advantage of the stabilisation agents constituting the binder blend. to enhance the strength development during soil stabilisation, we modelled the modifications of binders and tested their effects on a soil-binder mixture. the sum of the components in a mixture design is set as constant, i.e. the amount of binders is not an independent variable but always counts for 100 %, regardless of the ratio of binders in each particular test case. we presented a new framework for the improvement of the engineering properties of soil using an advanced method of stabilisation with blended binders. the effects of binder ratios and water content on soil strength were evaluated by statistical methods and pareto charts. among the proposed mixed binder blends for various types of soil, the tested blend 1 is an example of the complete mixture, fabricated of all three binders. blend 2 is a modified binary blend, where we excluded the influence of the ggbfs and the impacts of opc and lime are equal (50/50 %). the single-binder blend 3 is a pure binder containing only 100 % of opc.

7. conclusions

in this paper, we introduced a framework for selecting the optimal amounts of binders used for soil stabilisation, with a case study of clay soils collected from the region of yttre ringvägen, a ring road in malmö located in southern sweden. a novel framework for adjusting binder content was proposed. we tested three approaches of the rsm techniques for evaluating the performance of soil stabilised using various binder types. the results were compared and presented using pareto charts. we used the following combinatorial statistical methods for data analysis and assessment:
(1.) central composite design (ccd);
(2.) box-behnken design (bbd);
(3.) simplex lattice design.
figure 12. pareto charts and rsm from the simplex lattice design with ucs as a response variable. subfigures c and d – quadratic model; a and b – idem with interior points. notable effects are visible in subfigures a and c.
given the large amount of data and processed materials in the construction industry, where tons of soil should be processed for evaluation to ensure the safety of the foundations prepared for constructed buildings, statistical analysis is one of the major tools supporting and facilitating the laboratory experiments through data modelling. to deal with soil specimens, we propose incorporating the statistical methods of ccd, bbd and sld to refine the selection of binders prepared as components for blended mixtures. an increase of lime content improved the stabilisation performance, followed by cement and slag. we devised a combinatorial method to use information on the types and amounts of 3 blends of binders for refining and optimising binder blends, and we then developed a modelling approach using the statistica software to use the refined formulae of binder blends as a practical application for soil stabilisation.
blend 1, a mixture fabricated of the three binders, showed the best results, followed by blend 2, which is a modified binary blend where the influence of slag is excluded, while the impacts of opc and lime are equal, and the single-binder blend 3, a pure binder containing only opc. the experimental results show a good improvement in binder blend matching using the approaches of ccd, bbd, and sld, which proves the significance of these methods for soil stabilisation. to demonstrate the capability of various binders and their influence on soil stabilisation and strength development, we applied and tested two approaches: a design for evaluating the amount of binder and a design for evaluating the interaction between several binders. the proposed method is shown to be capable of incorporating the rsm, which is an effective tool for geotechnical experiments. we demonstrated that the ccd, using robust functionality, is essential for the assessment of the amount and type of the stabilising agents necessary for effective soil stabilisation. additionally, we indicated that the bbd can be used for statistical tests in civil engineering and geotechnical tasks. the statistical analysis included a quadratic model, regression analysis, and the anova test to analyse the efficiency of the measurements. we recommend taking into account the following closing remarks for similar related works:
• the ccd is effective for a blended binder with 2 agents for modelling the content of the mixture and the ratio of the components.
• the ccd and bbd techniques can both be applied for blends with >3 stabilising agents to optimise their amounts in each case.
• a mixture design is effective to test the effects of the existing binders or to adjust new ones in cases of complex mixtures of binders (>3 elements of stabilising agents in a mixture).
• the external factors (water content or temperature) can be processed as covariates in a function.
• repetitive measurements with >2 sample tests should be used for every type of experiment.
• for a correct comparison between unstabilised and stabilised specimens, a design should include only stabilised samples, followed by reference samples for the unstabilised ones.
the specific value of our research is that the methods and techniques have been borrowed from other industrial areas, such as the automotive sector, chemical engineering, and the process industry. thus, we used the rsm as an advanced and efficient technique which can be used to optimise binder types and amounts for soil stabilisation. next, pareto charts were presented for modelling the responses from different tests with changed types and amounts of binders and for estimating the effects of binder ratios and water content on soil strength. using this approach, we conducted extensive experiments on soil stabilisation using new binders and novel methods of soil stabilisation, which required different testing and evaluation methods to benchmark the performance of the framework. the enhanced properties of binders with regard to the modified and updated performance can be applied in civil engineering. otherwise, using traditional ways of testing soil stabilisation is an expensive, time-consuming and laborious task in construction works. further experiments on testing binder blends by means of statistical analysis demonstrated their potential applications in civil engineering, applied material science, and geotechnical studies.
this paper is an important point in the further development of these directions, presenting the combined methods of statistical analysis, theoretical logic, data science, and approaches of combinatorics for the practical tasks of soil stabilisation in civil engineering. finally, we demonstrated the applications of combinatorics for an optimal combination of binders in soil stabilisation.

acknowledgements

we acknowledge cementa (heidelbergcement ag group – northern europe), nordkalk ab, the swedish national board for industrial and technical development (nutek) and peab ab for their support of this study. the authors gratefully acknowledge the valuable assistance of prof. dr. björn holmqvist from the department of statistics, lund university, and prof. dr. em. igor rychlik from the centre for mathematical sciences, chalmers university of technology. we appreciate the critical comments, corrections and remarks from the two anonymous reviewers, which improved the initial version of this manuscript.

list of symbols

adj – adjusted r²
anova – analysis of variance
bbd – box-behnken design
ccd – central composite design
df – degrees of freedom
dmm – deep mixing method
f – f test, f = ms_treatments / ms_error
ggbfs – ground granulated blast furnace slag
icl – initial consumption of lime
mcv – moisture condition value
ms – mean squares
omc – optimum moisture content
opc – ordinary portland cement
r-sqr – coefficient of multiple determination (= 1 − ss_e/ss_t)
rsm – response surface methodology
sgi – swedish geotechnical institute
ss – sum of squares, for each treatment, error and total
p – p-level
ucs – unconfined/uniaxial compressive strength

references

[1] t. dahlin, m. svensson, p. lindh. dc resistivity and sasw for validation of efficiency in soil stabilisation prior to road construction. in proceedings 5th eegs-es meeting, vol. 6–9, pp. 1–3. eegs, budapest, hungary, 1999. cp-35-00099.
[2] m. rothhämel, i. tole, j. mácsik, j. laue. stabilization of sulfide soil with by-product originated hydraulic binder in a region with seasonal frost – a field investigation. transportation geotechnics 34:100735, 2022. https://doi.org/10.1016/j.trgeo.2022.100735
[3] p. lindh. compaction and strength properties of stabilised and unstabilised fine-grained tills. ph.d. thesis, lund university, lund, sweden, 2004. https://doi.org/10.13140/rg.2.1.1313.6481
[4] j. vestin, d. nordmark, m. arm, et al. biofuel ash in road stabilization – lessons learned from six years of field testing. transportation geotechnics 14:146–156, 2018. https://doi.org/10.1016/j.trgeo.2017.12.002
[5] v. lemenkov, p. lemenkova. measuring equivalent cohesion ceq of the frozen soils by compression strength using kriolab equipment. civil and environmental engineering reports 31(2):63–84, 2021. https://doi.org/10.2478/ceer-2021-0020
[6] v. lemenkov, p. lemenkova. testing deformation and compressive strength of the frozen fine-grained soils with changed porosity and density. journal of applied engineering sciences 11(2):113–120, 2021. https://doi.org/10.2478/jaes-2021-0015
[7] k. d. eigenbrod, s. knutsson, d. sheng. pore-water pressures in freezing and thawing fine-grained soils. journal of cold regions engineering 10(2):77–92, 1996. https://doi.org/10.1061/(asce)0887-381x(1996)10:2(77)
[8] h. m. greaves. an introduction to lime stabilisation. in lime stabilisation: proceedings of the seminar held at loughborough university civil & building engineering department, vol. 108, pp. 5–12. thomas telford services limited, 2015. [2023-01-19], https://trid.trb.org/view/482068
[9] s. m. rao, t.
thyagaraj. lime slurry stabilisation of an expansive soil. proceedings of the institution of civil engineers – geotechnical engineering 156(3):139–146, 2003. https://doi.org/10.1680/geng.2003.156.3.139
[10] c. d. f. rogers, s. glendinning, c. c. holt. slope stabilization using lime piles – a case study. proceedings of the institution of civil engineers – ground improvement 4(4):165–176, 2000. https://doi.org/10.1680/grim.2000.4.4.165
[11] n. o. attoh-okine. lime treatment of laterite soils and gravels – revisited. construction and building materials 9(5):283–287, 1995. https://doi.org/10.1016/0950-0618(95)00030-j
[12] v. robin, o. cuisinier, f. masrouri, a. javadi. chemo-mechanical modelling of lime treated soils. applied clay science 95:211–219, 2014. https://doi.org/10.1016/j.clay.2014.04.015
[13] k. harichane, m. ghrici, s. kenai, k. grine. use of natural pozzolana and lime for stabilization of cohesive soils. geotechnical and geological engineering 29:759–769, 2011. https://doi.org/10.1007/s10706-011-9415-z
[14] s. glendinning, c. d. f. rogers, n. dixon. deep stabilisation using lime. in lime stabilisation: proceedings of the seminar held at loughborough university civil & building engineering department, pp. 127–138. thomas telford services limited, 2015. [2023-01-19], https://www.icevirtuallibrary.com/doi/abs/10.1680/ls.25639.0012?src=recsys
[15] j. r. prusinski, s. bhattacharja. effectiveness of portland cement and lime in stabilizing clay soils. transportation research record 1652(1):215–227, 1999. https://doi.org/10.3141/1652-28
[16] s. bhattacharja, j. i. bhatty, h. a. todres. stabilization of clay soils by portland cement or lime – a critical review of literature. portland cement association, skokie, illinois, usa, 2003. pca r&d serial no. 2066.
[17] p. lindh, p. lemenkova. geochemical tests to study the effects of cement ratio on potassium and tbt leaching and the ph of the marine sediments from the kattegat strait, port of gothenburg, sweden. baltica 35(1):47–59, 2022. https://doi.org/10.5200/baltica.2022.1.4
[18] p. lindh, p. lemenkova. resonant frequency ultrasonic p-waves for evaluating uniaxial compressive strength of the stabilized slag–cement sediments. nordic concrete research 65(2):39–62, 2021. https://doi.org/10.2478/ncr-2021-0012
[19] p. p. kulkarni, j. n. mandal. strength evaluation of soil stabilized with nano silica-cement mixes as road construction material. construction and building materials 314:125363, 2022. https://doi.org/10.1016/j.conbuildmat.2021.125363
[20] m. mahedi, b. cetin, d. j. white. cement, lime, and fly ashes in stabilizing expansive soils: performance evaluation and comparison. journal of materials in civil engineering 32(7):04020177, 2020. https://doi.org/10.1061/(asce)mt.1943-5533.0003260
[21] s. kolias, v. kasselouri-rigopoulou, a. karahalios. stabilisation of clayey soils with high calcium fly ash and cement. cement and concrete composites 27(2):301–313, 2005. cement and concrete research in greece. https://doi.org/10.1016/j.cemconcomp.2004.02.019
[22] d. barman, s. k. dash.
stabilization of expansive soils using chemical additives: a review. journal of rock mechanics and geotechnical engineering 14(4):1319–1342, 2022. https://doi.org/10.1016/j.jrmge.2022.02.011
[23] p. lindh, t. dahlin, m. svensson. comparisons between different test methods for soil stabilisation. in proceedings of the isrm international symposium 2000, is 2000, 19-24 november 2000, pp. 1–5. isrm, melbourne, australia, 2000. code 139306.
[24] s. h. chew, a. h. m. kamruzzaman, f. h. lee. physicochemical and engineering behavior of cement treated clays. journal of geotechnical and geoenvironmental engineering 130(7):696–706, 2004. https://doi.org/10.1061/(asce)1090-0241(2004)130:7(696)
[25] m. koohmishi, m. palassi. mechanical properties of clayey soil reinforced with pet considering the influence of lime-stabilization. transportation geotechnics 33:100726, 2022. https://doi.org/10.1016/j.trgeo.2022.100726
[26] p. lindh, p. lemenkova. impact of strength-enhancing admixtures on stabilization of expansive soil by addition of alternative binders. civil and environmental engineering 18(2):726–735, 2022. https://doi.org/10.2478/cee-2022-0067
[27] m. di sante, d. bernardo, i. bellezza, et al. linking small-strain stiffness to development of chemical reactions in lime-treated soils. transportation geotechnics 34:100742, 2022. https://doi.org/10.1016/j.trgeo.2022.100742
[28] p. akula, d. n. little. analytical tests to evaluate pozzolanic reaction in lime stabilized soils. methodsx 7:100928, 2020. https://doi.org/10.1016/j.mex.2020.100928
[29] f. g. bell. stabilisation and treatment of clay soils with lime. part 1 – basic principles. ground engineering 21(1). international journal of rock mechanics and mining sciences & geomechanics abstracts 25(5):240, 1988. https://doi.org/10.1016/0148-9062(88)90321-x
[30] e. vitale, d. deneele, m. paris, g. russo. multiscale analysis and time evolution of pozzolanic activity of lime treated clays. applied clay science 141:36–45, 2017. https://doi.org/10.1016/j.clay.2017.02.013
[31] f. g. bell. cement stabilization and clay soils, with examples. environmental & engineering geoscience i(2):139–151, 1995. https://doi.org/10.2113/gseegeosci.i.2.139
[32] g. p. makusa, j. mácsik, g. holm, s. knutsson. process stabilization-solidification and the physicochemical factors influencing the strength development of treated dredged sediments. in geo-chicago 2016, pp. 532–545. american society of civil engineers, 2016. https://doi.org/10.1061/9780784480168.053
[33] s. bhattacharja, j. bhatty. comparative performance of portland cement and lime stabilization of moderate to high plasticity clay soils. rd125. portland cement association, skokie, illinois, usa, 2003.
154 https://doi.org/10.1680/grim.2000.4.4.165 https://doi.org/10.1016/0950-0618(95)00030-j https://doi.org/10.1016/j.clay.2014.04.015 https://doi.org/10.1007/s10706-011-9415-z https://www.icevirtuallibrary.com/doi/abs/10.1680/ls.25639.0012?src=recsys https://www.icevirtuallibrary.com/doi/abs/10.1680/ls.25639.0012?src=recsys https://doi.org/10.3141/1652-28 https://doi.org/10.5200/baltica.2022.1.4 https://doi.org/10.2478/ncr-2021-0012 https://doi.org/10.1016/j.conbuildmat.2021.125363 https://doi.org/10.1016/j.conbuildmat.2021.125363 https://doi.org/10.1061/(asce)mt.1943-5533.0003260 https://doi.org/10.1061/(asce)mt.1943-5533.0003260 https://doi.org/10.1016/j.cemconcomp.2004.02.019 https://doi.org/10.1016/j.cemconcomp.2004.02.019 https://doi.org/10.1016/j.jrmge.2022.02.011 https://doi.org/10.1061/(asce)1090-0241(2004)130:7(696) https://doi.org/10.1061/(asce)1090-0241(2004)130:7(696) https://doi.org/10.1016/j.trgeo.2022.100726 https://doi.org/10.2478/cee-2022-0067 https://doi.org/10.1016/j.trgeo.2022.100742 https://doi.org/10.1016/j.mex.2020.100928 https://doi.org/10.1016/0148-9062(88)90321-x https://doi.org/10.1016/j.clay.2017.02.013 https://doi.org/10.2113/gseegeosci.i.2.139 https://doi.org/10.1061/9780784480168.053 vol. 63 no. 2/2023 pareto efficiency and rsm to adjust binder content [34] s. burroughs. soil property criteria for rammed earth stabilization. journal of materials in civil engineering 20(3):264–273, 2008. https://doi.org/10. 1061/(asce)0899-1561(2008)20:3(264) [35] r. renjith, d. j. robert, c. gunasekara, et al. optimization of enzyme-based soil stabilization. journal of materials in civil engineering 32(5):04020091, 2020. https: //doi.org/10.1061/(asce)mt.1943-5533.0003124 [36] p. j. venda oliveira, a. a. correia, j. c. cajada. effect of the type of soil on the cyclic behaviour of chemically stabilised soils unreinforced and reinforced with polypropylene fibres. soil dynamics and earthquake engineering 115:336–343, 2018. https://doi.org/10.1016/j.soildyn.2018.09.005 [37] w. j. wall. influence of soil types on stabilization of the savannah river. journal of the waterways and harbors division 91(3):7–23, 1965. https://doi.org/10.1061/jwheau.0000404 [38] t. schanz, m. b. d. elsawy. stabilisation of highly swelling clay using lime–sand mixtures. proceedings of the institution of civil engineers – ground improvement 170(4):218–230, 2017. https://doi.org/10.1680/jgrim.15.00039 [39] x. kang, j. cao, b. bate. large-strain strength of polymer-modified kaolinite and fly ash–kaolinite mixtures. journal of geotechnical and geoenvironmental engineering 145(2):04018106, 2019. https: //doi.org/10.1061/(asce)gt.1943-5606.0002008 [40] n. bergman, s. larsson. comparing column penetration and total–sounding data for lime-cement columns. proceedings of the institution of civil engineers – ground improvement 167(4):249–259, 2014. https://doi.org/10.1680/grim.12.00019 [41] m. s. al-naqshabandy, n. bergman, s. larsson. strength variability in lime-cement columns based on cone penetration test data. proceedings of the institution of civil engineers – ground improvement 165(1):15–30, 2012. https://doi.org/10.1680/grim.2012.165.1.15 [42] p. paniagua, b. k. bache, k. karlsrud, a. k. lund. strength and stiffness of laboratory-mixed specimens of stabilised norwegian clays. proceedings of the institution of civil engineers – ground improvement 175(2):150– 163, 2022. https://doi.org/10.1680/jgrim.19.00051 [43] y. liu, d. e. l. ong, e. oh, et al. 
sustainable cementitious blends for strength enhancement of dredged mud in queensland, australia. geotechnical research 9(2):65–82, 2022. https://doi.org/10.1680/jgere.21.00046 [44] m. vukićević, v. pujević, m. marjanović, et al. fine grained soil stabilization using class f fly ash with and without cement. in geotechnical engineering for infrastructure and development – proceedings of the xvi european conference, edinburgh, scotland, vol. 5, pp. 2671–2676. geotechnical engineering for infrastructure and development, 2015. [2023-01-19], https: //hdl.handle.net/21.15107/rcub_grafar_683. [45] r. k. goswami, b. singh. influence of fly ash and lime on plasticity characteristics of residual lateritic soil. proceedings of the institution of civil engineers – ground improvement 9(4):175–182, 2005. https://doi.org/10.1680/grim.2005.9.4.175 [46] m. r. rao, a. s. rao, d. r. babu. efficacy of lime-stabilised fly ash in expansive soils. proceedings of the institution of civil engineers – ground improvement 161(1):23–29, 2008. https://doi.org/10.1680/grim.2008.161.1.23 [47] k. knapik, j. bzówka, d. deneele, g. russo. physical properties and mechanical behaviour of soil treated with fluidal fly ash and lime. in geotechnical engineering for infrastructure and development: xvi european conference on soil mechanics and geotechnical engineering, pp. 2571–2576. 2015. [2023-01-19], https://www.icevirtuallibrary.com/ doi/abs/10.1680/ecsmge.60678.vol5.396. [48] a. ghazy, m. t. bassuoni, a. k. m. r. islam. response of concrete with blended binders and nanosilica to freezing-thawing cycles and different concentrations of deicing salts. journal of materials in civil engineering 30(9):04018214, 2018. https: //doi.org/10.1061/(asce)mt.1943-5533.0002349 [49] n. c. consoli, e. j. b. marin, r. a. q. samaniego, et al. use of sustainable binders in soil stabilization. journal of materials in civil engineering 31(2):06018023, 2019. https: //doi.org/10.1061/(asce)mt.1943-5533.0002571 [50] p. lindh. optimizing binder blends for shallow stabilisation of fine-grained soils. proceedings of the institution of civil engineers ground improvement 5:23–34, 2001. https://doi.org/10.1680/grim.2001.5.1.23 [51] s. lee, m. chung, h. m. park, et al. xanthan gum biopolymer as soil-stabilization binder for road construction using local soil in sri lanka. journal of materials in civil engineering 31(11):06019012, 2019. https: //doi.org/10.1061/(asce)mt.1943-5533.0002909 [52] h. c. s. filho, r. b. saldanha, c. g. da rocha, n. c. consoli. sustainable binders stabilizing dispersive clay. journal of materials in civil engineering 33(3):06020026, 2021. https: //doi.org/10.1061/(asce)mt.1943-5533.0003595 [53] y. yi, m. liska, a. al-tabbaa. properties of two model soils stabilized with different blends and contents of ggbs, mgo, lime, and pc. journal of materials in civil engineering 26(2):267–274, 2014. https: //doi.org/10.1061/(asce)mt.1943-5533.0000806 [54] p. lindh, p. lemenkova. leaching of heavy metals from contaminated soil stabilised by portland cement and slag bremen. ecological chemistry and engineering s 29(4):537–552, 2022. https://doi.org/10.2478/eces-2022-0039 [55] k. gu, f. jin, a. al-tabbaa, b. shi. initial investigation of soil stabilization with calcined dolomite-ggbs blends. in ground improvement and geosynthetics, pp. 148–157. american society of civil engineers, 2014. 
https://doi.org/10.1061/9780784413401.015 155 https://doi.org/10.1061/(asce)0899-1561(2008)20:3(264) https://doi.org/10.1061/(asce)0899-1561(2008)20:3(264) https://doi.org/10.1061/(asce)mt.1943-5533.0003124 https://doi.org/10.1061/(asce)mt.1943-5533.0003124 https://doi.org/10.1016/j.soildyn.2018.09.005 https://doi.org/10.1061/jwheau.0000404 https://doi.org/10.1680/jgrim.15.00039 https://doi.org/10.1061/(asce)gt.1943-5606.0002008 https://doi.org/10.1061/(asce)gt.1943-5606.0002008 https://doi.org/10.1680/grim.12.00019 https://doi.org/10.1680/grim.2012.165.1.15 https://doi.org/10.1680/jgrim.19.00051 https://doi.org/10.1680/jgere.21.00046 https://hdl.handle.net/21.15107/rcub_grafar_683 https://hdl.handle.net/21.15107/rcub_grafar_683 https://doi.org/10.1680/grim.2005.9.4.175 https://doi.org/10.1680/grim.2008.161.1.23 https://www.icevirtuallibrary.com/doi/abs/10.1680/ecsmge.60678.vol5.396 https://www.icevirtuallibrary.com/doi/abs/10.1680/ecsmge.60678.vol5.396 https://doi.org/10.1061/(asce)mt.1943-5533.0002349 https://doi.org/10.1061/(asce)mt.1943-5533.0002349 https://doi.org/10.1061/(asce)mt.1943-5533.0002571 https://doi.org/10.1061/(asce)mt.1943-5533.0002571 https://doi.org/10.1680/grim.2001.5.1.23 https://doi.org/10.1061/(asce)mt.1943-5533.0002909 https://doi.org/10.1061/(asce)mt.1943-5533.0002909 https://doi.org/10.1061/(asce)mt.1943-5533.0003595 https://doi.org/10.1061/(asce)mt.1943-5533.0003595 https://doi.org/10.1061/(asce)mt.1943-5533.0000806 https://doi.org/10.1061/(asce)mt.1943-5533.0000806 https://doi.org/10.2478/eces-2022-0039 https://doi.org/10.1061/9780784413401.015 per lindh, polina lemenkova acta polytechnica [56] x. yang, z. you, j. mills-beale. asphalt binders blended with a high percentage of biobinders: aging mechanism using ftir and rheology. journal of materials in civil engineering 27(4):04014157, 2015. https: //doi.org/10.1061/(asce)mt.1943-5533.0001117 [57] p. lindh, p. lemenkova. dynamics of strength gain in sandy soil stabilised with mixed binders evaluated by elastic p-waves during compressive loading. materials 15(21):7798, 2022. https://doi.org/10.3390/ma15217798 [58] auststab. national auststab guidelines – australian binders used for road stabilisation. tech. rep. version a, auststab (pavement recycling and stabilisation association), sutherland, nsw 1499, australia. 27 november, 1997. [59] h. åhnberg, s.-e. johansson, a. retelius, et al. cement and lime for deep stabilization of soil [in swedish; cement och kalk för djupstabilisering av jord]. tech. rep. 48, swedish geotechnical institute (sgi), linköping, sweden, 1995. [60] k. assarson. stabilization of cohesive soils with lime [in swedish; stabilisering av kohesionära jordarter med kalk]. norsk vegtidskrift 2:2–16, 1968. [61] g. holm. deep mixing. in soft ground technology, pp. 105–122. american society of civil engineers, 2001. https://doi.org/10.1061/40552(301)9 [62] g. holm. state of practice in dry deep mixing methods. in grouting and ground treatment, pp. 145–163. american society of civil engineers, 2003. https://doi.org/10.1061/40663(2003)5 [63] m. a. hasan, z. hossain, a. elsayed. laboratory and field investigation of subgrade soil stabilization in arkansas. in tran-set 2021, pp. 259–275. american society of civil engineers, 2021. https://doi.org/10.1061/9780784483787.026 [64] c. d. rogers, d. i. boardman, g. papadimitriou. stress path testing of realistically cured lime and lime/cement stabilized clay. journal of materials in civil engineering 18(2):259–266, 2006. https://doi. 
org/10.1061/(asce)0899-1561(2006)18:2(259) [65] a. roduner, d. flum, m. stolz, e. gröner. large-scale field tests with flexible slope stabilization systems. in geotechnical engineering for infrastructure and development: xvi european conference on soil mechanics and geotechnical engineering, pp. 1645–1650. 2015. [2023-01-19], https://www.icevirtuallibrary. com/doi/abs/10.1680/ecsmge.60678.vol4.243. [66] z. kichou, m. mavroulidou, m. j. gunn. triaxial testing of saturated lime-treated high plasticity clay. in geotechnical engineering for infrastructure and development: xvi european conference on soil mechanics and geotechnical engineering, pp. 3201–3206. 2015. [2023-01-19], https://www.icevirtuallibrary. com/doi/abs/10.1680/ecsmge.60678.vol6.500. [67] y. wang, y.-j. cui, a. m. tang, et al. shrinkage behaviour of a compacted lime-treated clay. géotechnique letters 10(2):174–178, 2020. https://doi.org/10.1680/jgele.19.00006 [68] k. fahoum, m. s. aggour, f. amini. dynamic properties of cohesive soils treated with lime. journal of geotechnical engineering 122(5):382–389, 1996. https://doi.org/10.1061/(asce)0733-9410(1996) 122:5(382) [69] p. lindh, p. lemenkova. seismic velocity of p-waves to evaluate strength of stabilized soil for svenska cellulosa aktiebolaget biorefinery östrand ab, timrå. bulletin of the polish academy of sciences: technical sciences 70(4):e141593, 2022. https://doi.org/10.24425/bpasts.2022.141593 [70] p. lindh, p. lemenkova. soil contamination from heavy metals and persistent organic pollutants (pah, pcb and hcb) in the coastal area of västernorrland, sweden. gospodarka surowcami mineralnymi – mineral resources management 38(2):147–168, 2022. https://doi.org/10.24425/gsm.2022.141662 [71] j. c. sasser, r. crowley, m. davies, et al. surfactant-induced soil strengthening (siss) – a potential new method for temporary stabilization along beaches and coastal waterways. in geo-congress 2022, pp. 212–221. american society of civil engineers, 2022. https://doi.org/10.1061/9780784484012.022 [72] p. lindh. mcv and shear strength of compacted fine-grained tills. in soil mechanics and geotechnical engineering (12arc) – proceedings of the twelfth asian regional conference, pp. 493–496. world scientific publishing co pte ltd, singapore, 2003. proceedings of the soil mechanics and geotechnical engineering (12arc) – proceedings of the twelfth asian regional conference. 1652 pp. [73] g. habibagahi, a. niazi, b. n. sisakht, e. nikooee. stabilisation of collapsible soils: a biological technique, vol. 5, pp. 2829–2834. 2015. [2023-01-19], https://sciexplore.ir/documents/details/ 203-143-108-113. [74] a. eissa, m. t. bassuoni, a. ghazy, m. alfaro. improving the properties of soft clay using cement, slag, and nanosilica: experimental and statistical modeling. journal of materials in civil engineering 34(4):04022031, 2022. https: //doi.org/10.1061/(asce)mt.1943-5533.0004172 [75] t. d. o’rourke, a. j. mcginn. lessons learned for ground movements and soil stabilization from the boston central artery. journal of geotechnical and geoenvironmental engineering 132(8):966–989, 2006. https://doi.org/10.1061/(asce)1090-0241(2006) 132:8(966) [76] y. najjar, c. huang, h. yasarer. neuronet-based soil chemical stabilization model. in contemporary topics in ground modification, problem soils, and geo-support, pp. 393–400. american society of civil engineers, 2009. https://doi.org/10.1061/41023(337)50 [77] h. källén, a. heyden, k. åström, p. lindh. 
measuring and evaluating bitumen coverage of stones using two different digital image analysis methods. measurement 84:56–67, 2016. https: //doi.org/10.1016/j.measurement.2016.02.007 [78] e. b. ojo, d. s. matawal, a. k. isah. statistical analysis of the effect of mineralogical composition on the qualities of compressed stabilized earth blocks. journal of materials in civil engineering 28(11):06016014, 2016. https: //doi.org/10.1061/(asce)mt.1943-5533.0001609 156 https://doi.org/10.1061/(asce)mt.1943-5533.0001117 https://doi.org/10.1061/(asce)mt.1943-5533.0001117 https://doi.org/10.3390/ma15217798 https://doi.org/10.1061/40552(301)9 https://doi.org/10.1061/40663(2003)5 https://doi.org/10.1061/9780784483787.026 https://doi.org/10.1061/(asce)0899-1561(2006)18:2(259) https://doi.org/10.1061/(asce)0899-1561(2006)18:2(259) https://www.icevirtuallibrary.com/doi/abs/10.1680/ecsmge.60678.vol4.243 https://www.icevirtuallibrary.com/doi/abs/10.1680/ecsmge.60678.vol4.243 https://www.icevirtuallibrary.com/doi/abs/10.1680/ecsmge.60678.vol6.500 https://www.icevirtuallibrary.com/doi/abs/10.1680/ecsmge.60678.vol6.500 https://doi.org/10.1680/jgele.19.00006 https://doi.org/10.1061/(asce)0733-9410(1996)122:5(382) https://doi.org/10.1061/(asce)0733-9410(1996)122:5(382) https://doi.org/10.24425/bpasts.2022.141593 https://doi.org/10.24425/gsm.2022.141662 https://doi.org/10.1061/9780784484012.022 https://sciexplore.ir/documents/details/203-143-108-113 https://sciexplore.ir/documents/details/203-143-108-113 https://doi.org/10.1061/(asce)mt.1943-5533.0004172 https://doi.org/10.1061/(asce)mt.1943-5533.0004172 https://doi.org/10.1061/(asce)1090-0241(2006)132:8(966) https://doi.org/10.1061/(asce)1090-0241(2006)132:8(966) https://doi.org/10.1061/41023(337)50 https://doi.org/10.1016/j.measurement.2016.02.007 https://doi.org/10.1016/j.measurement.2016.02.007 https://doi.org/10.1061/(asce)mt.1943-5533.0001609 https://doi.org/10.1061/(asce)mt.1943-5533.0001609 vol. 63 no. 2/2023 pareto efficiency and rsm to adjust binder content [79] p. lindh, p. lemenkova. evaluation of different binder combinations of cement, slag and ckd for s/s treatment of tbt contaminated sediments. acta mechanica et automatica 15(4):236–248, 2021. https://doi.org/10.2478/ama-2021-0030 [80] b. subha, y. c. song, j. h. woo. bioremediation of contaminated coastal sediment: optimization of slow release biostimulant ball using response surface methodology (rsm) and stabilization of metals from contaminated sediment. marine pollution bulletin 114(1):285–295, 2017. https://doi.org/10.1016/j.marpolbul.2016.09.034 [81] h. bian, j. wan, t. muhammad, et al. computational study and optimization experiment of nzvi modified by anionic and cationic polymer for cr(vi) stabilization in soil: kinetics and response surface methodology (rsm). environmental pollution 276:116745, 2021. https://doi.org/10.1016/j.envpol.2021.116745 [82] r. h. myres, d. c. montgomery, c. m. anderson-cook. response surface methodology: process and product optimization using designed experiments. wiley, new jersey, us, 4th edn., 2016. [83] k. chen, d. wu, z. zhang, et al. modeling and optimization of fly ash-slag-based geopolymer using response surface method and its application in soft soil stabilization. construction and building materials 315:125723, 2022. https: //doi.org/10.1016/j.conbuildmat.2021.125723 [84] p. lindh, p. lemenkova. shear bond and compressive strength of clay stabilised with lime/cement jet grouting and deep mixing: a case of norvik, nynäshamn. 
Effect of the Hydraulic Characteristics of a Stream Channel and its Surroundings on the Runoff Hydrograph

V. Matoušek

Abstract: The course and magnitude of a rainfall flood depend primarily on the intensity and duration of the rainfall event, on the morphological parameters of the watershed (e.g. its slope and shape), and on how the watershed has been exploited. A flood wave develops in the stream channel that drains the watershed, and it transforms while passing along the channel. This is particularly the case if the water spreads into floodplains and/or storage reservoirs while passing through the channel. This paper addresses an additional effect that has a significant influence on the magnitude and course of the flood wave but has not previously been addressed adequately, namely the effect of the hydraulic parameters of the stream channel itself on the transformation of a flood wave. The paper explains theoretically, and shows on a practical example, that a smooth channel with a high capacity significantly increases the magnitude and speed of a flood wave. Many flood events are unnecessarily severe just because the watershed is drained by a hydraulically inappropriate channel: the channel is large and smooth, and therefore it gathers most of the flowing water during the flood event, producing a high water velocity in the channel. As a result, the large and smooth channel accelerates the runoff from the watershed and constrains the spread of water into the floodplain. A high and steep flood wave develops in the channel and floods areas with a limited water-throughput capacity (e.g. urban areas in the vicinity of hydraulic structures) downstream of the channel. This paper offers a methodology for evaluating the ability of a channel to convey a flood wave safely, and for recognizing whether a regulated channel should be subjected to restoration due to its inability to convey flood waves safely.

Keywords: rainfall-runoff process, runoff velocity, flood-wave transformation, flood damping, stream channel restoration.

1 Introduction
In the part of Central Europe that is now the Czech Republic, the effect of an artificial channel of high capacity on the culmination discharge of the runoff was already under discussion more than a century ago, during the implementation of the General Program of River Regulation in the Kingdom of Bohemia, which started in 1903. It was claimed that river regulation concentrates water into a river channel and constrains the spread of water into floodplains, resulting in amplification of the flood wave. To suppress this effect, the program suggested building storage reservoirs on regulated rivers.

Nowadays, the change of the flood-wave hydrograph caused by the spread of water into a floodplain or into a storage reservoir is discussed in standard hydrology textbooks, e.g. [1], [5], together with methods for describing this phenomenon. However, the transformation of the flood wave due to the spread of water into a floodplain or into a storage reservoir is only one part of the problem. The other is the formation of a flood wave due to the channel itself, and this effect has not yet been addressed adequately. For drainage purposes, many small natural channels crossing crop fields have been regulated into artificial, straight, deep, smooth channels of unnecessarily high capacity. Such channels accelerate the runoff process, the runoff hydrograph transforms (it becomes steeper), and the culmination discharge increases.

This paper offers a theoretically based methodology for evaluating the runoff discharge during a flood event in a watershed drained by either artificial or natural (restored) channels. The methodology should help to evaluate the function of a channel during flooding and to decide whether an artificial channel should be subjected to restoration. The proposed methodology is based on the standard method of isochrones.

2 Theoretical analysis
2.1 Survey of the flow routing method producing a theoretical hydrograph
One of the most efficient overland flow routing methods is the time-area method [1]. This method omits storage effects and divides the watershed into subareas using isochrones. An isochrone is a contour that passes through points with the same travel time of a water particle from the location given by the point to the outlet of the watershed. To construct isochrones, it is first necessary to determine the average travel time of a water particle. This is accomplished by using empirical formulae that calculate the travel time along the hydraulic length of the watershed: the surface flow velocity is deduced from the hydraulic properties of the watershed surface, and the channel flow velocity is obtained from the Manning equation for the channel draining the watershed. Isochrones of constant travel time are drawn on a map of the watershed. The area of each zone is then determined by a planimeter, and the area values are used for determining the runoff discharge in the outflow profile of the watershed (see Fig. 1).

Fig. 1: Isochrones in a watershed, and the constructed ascending part of a hydrograph (flood wave) in an outflow profile.

The time-area method calculates the zonal runoff from the zone area and from the excess (effective) rainfall. The equation for the zonal runoff discharge Q_t reads

\[ Q_t = \frac{F_t\,k_0\,H_t \cdot 10^3}{60\,t}\quad (\mathrm{m^3/s}), \tag{1} \]

where F_t is the area bordered by the isochrone associated with the time period t, in km²; k_0 is the coefficient of direct runoff; H_t is the total magnitude of rainfall in mm over the time period t; and t is the time period in minutes from the beginning of the rain.

Eq. 1 contains the runoff coefficient k_0, a dimensionless empirical coefficient related to the abstractive and diffusive properties of the catchment. Its value is influenced by many parameters associated with the properties of the evaluated watershed, and it is very difficult to determine accurately. However, the objective of this paper is to analyze the effect of channel properties on a rainfall flood, rather than the effect of watershed properties; therefore an investigation into appropriate values of the coefficient is not a part of the analysis described here, and the same value of the coefficient is taken for the different stream channels.

2.2 Basics of the proposed methodology: an evaluation of the effect of the velocity in the stream channel on the theoretical hydrograph
Imagine that the same watershed as in Fig. 1 is drained by a stream channel that is reclaimed in such a way that the mean velocity of the water in the reclaimed channel is twice that considered in Fig. 1. The higher velocity in the channel has a profound effect on the course of the isochrones (see Fig. 2), because the travel time in the channel becomes shorter while the travel time in the terrain of the watershed remains unchanged. In Fig. 2, the new isochrones are drawn as dotted lines. The new lines (representing the isochrones for a channel velocity twice that in the original channel before adaptation) match the original lines at the border of the watershed, because the travel time of the water in the terrain is the same in both situations. The difference in the shapes of the isochrones for a given travel time is greatest where the isochrones cross the stream channel: for the new velocity in the channel, the dotted line for a certain travel time must match the dashed line of the travel time that is one half of that of the dotted line. For example, the dotted line representing the 10-minute isochrone must match the dashed line representing the 20-minute isochrone at the location where both isochrones cross the stream channel (see Fig. 2).

Fig. 2: Comparison of isochrones and hydrographs for two different mean velocities in a channel.

The comparison of the hydrographs for the two situations in Fig. 2 shows that the hydrograph grows faster if the velocity in the channel is higher; in other words, a higher velocity in the channel creates a steeper hydrograph. In the first case, the maximum discharge in the outflow profile is reached in 55 minutes, i.e. the concentration time is 55 minutes. This is also the duration of the simulated rainfall (the descending part of the hydrograph starts at time 0:55). The concentration time is much shorter if the channel velocity is two times higher: it is 35 minutes (see Fig. 2).
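For concreteness, Eq. 1 can be wrapped in a few lines of Python. This is a minimal sketch; the function name and the worked check against the Dubanka values of Section 3 are ours, not the paper's.

```python
def zonal_discharge(F_t, k0, H_t, t):
    """Eq. 1: zonal runoff discharge in m^3/s.

    F_t: area bordered by the isochrone for time t (km^2)
    k0:  coefficient of direct runoff (dimensionless)
    H_t: total rainfall depth over the period t (mm)
    t:   time period from the beginning of the rain (min)
    """
    # F_t [km^2] * H_t [mm] * 1e3 is the rain volume in m^3; multiplying by
    # k0 gives the direct-runoff volume, and dividing by 60*t [s] a discharge.
    return F_t * k0 * H_t * 1e3 / (60.0 * t)

# Illustrative check against the Dubanka reconstruction in Section 3:
# 1.123 km^2, 20 mm of rain in 30 min, k0 = 0.96  ->  ~12 m^3/s
print(zonal_discharge(1.123, 0.96, 20.0, 30.0))  # 11.98
```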
Since the rainfall duration and intensity remain the same in both cases, the maximum discharge has the same value in both cases; in the adapted channel it merely passes the outflow profile over a longer time period, namely for 20 minutes (from time 0:35 to 0:55). The application of Eq. 1 shows why the value of the maximum discharge Q_k is the same in both cases. The area of the watershed F_t remains unchanged, and the same holds for the coefficient of direct runoff k_0. The concentration time t_k is 55 minutes in the first case and 35 minutes in the other case. In 35 minutes the precipitation depth on the watershed is H_{35} = (H_{55}/55) · 35, and thus

\[ Q_k = \frac{F_t\,k_0\,\frac{H_{55}}{55}\cdot 35 \cdot 10^3}{60 \cdot 35} = \frac{F_t\,k_0\,H_{55}\cdot 10^3}{60 \cdot 55}. \]

Rainfalls of shorter duration have higher intensities (see the example in Table 1). In the region of Hermanuv Mestec, a rainfall event with 50-year periodicity (n = 50) and a duration of 30 minutes produces a precipitation depth of 45.3 mm; if its duration is 60 minutes, the total precipitation depth is 55.0 mm, i.e. 22.5 mm in 30 minutes.

Table 1: Intensity-duration-frequency table of rainfalls (precipitation depth in mm) in the region of Hermanuv Mestec, Czech Republic

Duration   n = 2   n = 5   n = 10   n = 20   n = 50   n = 100
30 min     16.9    23.5    30.1     36.8     45.3     52.0
40 min     18.5    26.0    33.3     40.7     50.3     58.0
60 min     19.6    28.0    36.0     44.2     55.0     63.0
90 min     21.2    30.0    39.0     47.9     59.6     68.3
120 min    22.5    32.0    41.3     50.7     63.0     72.3

A rainfall event produces a much higher maximum discharge Q_k when the concentration time of the watershed is equal to the rainfall duration than when the concentration time is longer than the rainfall duration. To show this effect, Fig. 3 compares the hydrographs for a rainfall of 35-minute duration in a watershed with a concentration time of 35 minutes (the watershed with the adapted channel) and of 55 minutes (the watershed with the unadapted channel), respectively. The maximum flow rate in the outflow profile of the reclaimed channel, with a velocity twice that in the natural channel, is almost two times higher than the maximum flow rate in the natural channel, and the hydrograph is also much steeper. The outflow from the watershed with the natural channel is more equally distributed in time: the maximum flow rate corresponding to the flow rate at the time equal to the concentration time is never reached, and water from the farthest locations reaches the outflow profile after the end of the rain event. Since the longer concentration time of the watershed is caused by the natural channel, it is obvious that the channel has a damping effect on the flood wave passing through it.

Fig. 3: Comparison of hydrographs of the same watershed for a rainfall of duration equal to the concentration time in the reclaimed channel with the higher water velocity (precipitation depth 48 mm, duration 35 min).

Fig. 4 compares the hydrographs in the two channels for rainfalls of a given periodicity (50 years) and durations corresponding to the concentration times, i.e. for rainfalls with durations of 35 minutes (reclaimed channel) and 55 minutes (natural channel). From Table 1, the rainfalls of periodicity n = 50 are selected; their precipitation depths are 48 mm for the 35-minute duration and 52 mm for the 55-minute duration. Fig. 4 shows that the channel with the two times higher velocity significantly increases and accelerates the runoff from the watershed. Thus the general conclusion is that a channel draining a watershed is able to increase and accelerate the flood wave.

Fig. 4: Comparison of hydrographs of the same watershed for rainfalls of 50-year periodicity and duration equal to the concentration time of each watershed (precipitation depth 48 mm for the 35-minute rainfall in the watershed with the reclaimed channel, and 52 mm for the 55-minute rainfall in the watershed with the natural channel).

The above analysis, based on the method of isochrones, shows a profound effect of the hydraulic characteristics of a stream channel on the runoff hydrograph. The following step is to demonstrate the results of the analysis on a practical example of an observed flood event.

Remark: In reality, a stream channel with a low velocity has a small capacity. Thus during flood events the channel is usually not able to drain all the water down the channel; instead, some water spreads into the floodplains along the channel. This helps to keep the velocity of the flow low. At the same time, the floodplain accumulates water for some time and thus damps the flood wave: the flood wave transforms itself, becoming flatter and longer. The method of isochrones does not solve the transformation of a flood wave due to the spread of water into the floodplain. There are methods available for solving the transformation, and one suitable method will be used below to analyze the observed flood event.

3 Practical example: a flood event on the Dubanka creek on May 30th, 2005
3.1 Description of the flood event
In the late afternoon of May 30th, 2005, water and mud from the local Dubanka creek flooded the village of Rozhovice, which lies 4 km from Hermanuv Mestec and 6 km from Chrudim, in East Bohemia. The Dubanka is a left-side tributary of the small river Bylanka, which joins the Labe river near Pardubice. The report of the municipality representatives from June 1st, 2005 describes the flood event as follows: "The flood event was created by an intensive rainstorm (lasting for approximately 10-15 minutes) accompanied by a hailstorm, a strong wind and a thunderstorm. Due to the specific configuration of the landscape and the inappropriate exploitation of the land (maize fields) in the watershed of the Dubanka creek above the village of Rozhovice, an extensive runoff of water occurred and the village was flooded. The surface runoff caused transport of soil and mud from the fields. The sediments covered the streets, gardens, parks, cellars, garages, houses, cars, sewers, wells and the channel of the creek."

The report identifies three reasons for the flood event: 1) rainfall of high intensity, 2) the configuration of the landscape, namely the steep slope of the creek valley, and 3) inappropriate agricultural exploitation of the valley slopes (maize fields).

3.2 Reconstruction of the flood event, and an analysis of its causes
Information on the parameters of the rainfall that caused the flood event was obtained from photos of meteorological radars acquired from the Czech Hydrometeorological Institute. Photos giving the local precipitation in 10-minute intervals and photos of one-hour precipitation totals were available. An analysis of the photos revealed that, surprisingly enough, the rain lasted only 30 minutes and the total precipitation depth was only 20 mm. A comparison of this depth with the depths typical for the area (see Table 1) shows that the periodicity of the rainfall is less than 5 years; such a rainfall should not cause a flood event of the size experienced on May 30th, 2005. A field survey revealed that the rainfall was very local: the rain hit the slope of the Dubanka creek valley above the village of Rozhovice.
The slope was being exploited as a maize field. Apparently, the affected area was small, but the runoff from the area was very great. The hydraulic parameters and the longitudinal water-surface slope of the stream channel of the Dubanka creek were measured in a suitable reach behind a culvert just above the village. The channel in the reach was very deep (a depth of 1.5 m) and had a large longitudinal slope (2.86 %). Traces on the channel banks and in their surroundings indicated that the channel was full during the flood event and conveyed all the drained water (there were no traces of a water surface in the floodplain), and there were no backwater conditions. The Manning equation gave a mean velocity of about 4 m/s (R = 0.62 m, n = 0.030). The banks and bottom of the channel were lined with a stone pavement covered by reeds; the high degree of devastation of the reed vegetation in the channel supported the idea of a high velocity in the channel during the flood event. According to the traces, the discharge area of the channel was 3 m² at the maximum flow rate, and thus the maximum flow rate was approximately 12 m³/s.

The estimated value of the maximum discharge was confirmed by measurements at a hydraulic structure on the Dubanka creek below Rozhovice. The structure, a small railway bridge, exhibited clear traces of the maximum water level on its front and rear faces. This information was processed in the HEC-RAS program, which gave a maximum flow rate of 15 m³/s. A parallel hydraulic structure, a rectangular culvert 1 × 1.2 m in dimensions, conveyed approximately 3.0 m³/s of additional flow. Thus the total maximum flow rate in the creek below Rozhovice was about 18 m³/s. This value corresponds very well with the value of 12 m³/s above the village, because the outflow profile above the village drains a watershed of 1.123 km², while the outflow profile below the village drains an area of 1.68 km², i.e. an area one third larger (by 0.557 km²).

The landscape of the watershed drained by the Dubanka creek above Rozhovice is described in Fig. 5. The area of the watershed is 1.12 km². The map shows the steep slopes of the Dubanka valley: the mean left-bank slope is about 7 %, with a maximum local value of 14.5 %, and the mean right-bank slope is about 6 %, with a maximum local value of 10.4 %. Maize should not be grown on such steep slopes. The rules stipulated by Czech standard CSN 75 4500 "Erosion control of agricultural land" were not followed when this maize field was sown on the slopes of the Dubanka valley. Not even contour seeding was applied: the drills were seeded in the downhill direction, as shown in Fig. 6. The downhill-directed drills created small natural channels that drained the field during the rainfall event and increased the direct runoff and its velocity.
Water flowing through the drills eroded the soil surface and formed deeper rills, in which the water velocity increased further and through which the water transported large amounts of eroded soil to the creek. The maize field was probably able to pond only a small part of the precipitation water, and thus the coefficient of direct runoff must have been high.

Fig. 5: Map showing isochrones of the Dubanka watershed above Rozhovice (the profile with a discharge of 12 m³/s, the initial profile with the increased discharge, the border of the watershed, and the border of the runoff area).

Information on the rainfall, the maximum flow rate, and the watershed slopes made it possible to construct the isochrones and to calculate the coefficient of direct runoff k_0. The 15-minute isochrone is drawn in Fig. 5. The concentration time of the entire affected area above the outflow profile in Fig. 5 was 30 minutes. Therefore the value of the coefficient k_0 could be determined using Eq. 1 as

\[ 12 = \frac{1.123 \cdot 20 \cdot 10^3}{60 \cdot 30}\,k_0, \quad \text{and thus} \quad k_0 = 0.96. \]

This is an extremely high value, and it indicates that the maize field absorbed virtually no water, owing to the steep slope and the use of inappropriate drills in the maize field. Thus the fast runoff of precipitation water from the watershed was a major reason for the extremely high maximum flow rate in the outflow profile just above the village of Rozhovice. However, it is unlikely that it was the only reason. It is necessary to examine what role the stream channel played in the formation of the maximum flow rate. This role was not recognized in the report by the municipality representatives; apparently, the representatives were not aware of this aspect of the formation of a flood wave. In general, little is known about the role of a stream channel of a hydraulically unsuitable shape in the development of a runoff hydrograph.

3.3 Evaluation of the role of the hydraulic characteristics of a stream channel
The channel of the Dubanka creek is virtually straight and smooth (Fig. 7). Moreover, it is deep. It was adapted in the 1930s to drain the surrounding fields. The mean longitudinal slope of the channel is 0.93 %. The channel has a trapezoidal cross section with a bottom width of 0.5 m, a bank slope of 1:1, and a minimum depth of about 1.2 m. The full channel has a discharge area of 2.04 m² and a hydraulic radius R = 0.60 m. With a roughness coefficient n = 0.030 (velocity coefficient C = 30.64), the Manning equation gives a mean velocity v = 2.3 m/s in the full channel. The capacity of the full channel is Q = 2.3 × 2.04 = 4.7 m³/s. There is no vegetation belt along the channel banks that would prevent the washing of soil from the surrounding fields into the channel (see Fig. 7). The fields are cultivated down to the upper border of the channel bank, and thus the wetted perimeter of the discharge area remains relatively smooth even if the channel is overfull. Due to the straight course of the channel and the small roughness of the surface of the channel and of its surroundings, the flow velocity increases during a flood event even when the channel is full and a part of the water flows over the bank.

The method of isochrones is used to reconstruct the runoff hydrograph of May 30th, 2005. The investigation described above provided the values of the precipitation depths, the direct runoff coefficient, and the areas confined by the isochrones.
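The Manning-equation values quoted above, and the back-calculation of k_0 from Eq. 1, can be reproduced with the following sketch. The function names are ours, and the hydraulic radii are taken as measured in the paper rather than recomputed from the cross-section.

```python
def manning_velocity(R, S, n):
    """Mean velocity v = R^(2/3) * S^(1/2) / n (SI units)."""
    return R ** (2.0 / 3.0) * S ** 0.5 / n

# Measured reach above the village: R = 0.62 m, slope 2.86 %, n = 0.030
print(manning_velocity(0.62, 0.0286, 0.030))        # ~4.1 m/s ("about 4 m/s")

# Full reclaimed channel: R = 0.60 m, slope 0.93 %, discharge area 2.04 m^2
v = manning_velocity(0.60, 0.0093, 0.030)
print(v, v * 2.04)                                  # ~2.3 m/s, capacity ~4.7 m^3/s

def runoff_coefficient(Q, F, H, t):
    """Eq. 1 solved for k0: k0 = 60 t Q / (F H 10^3)."""
    return 60.0 * t * Q / (F * H * 1e3)

# 12 m^3/s from 1.123 km^2 with 20 mm of rain in 30 minutes:
print(runoff_coefficient(12.0, 1.123, 20.0, 30.0))  # ~0.96
```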
This information is sufficient to determine the flow rates through the outflow profile at various times during the flood event, using Eq. 1. After the first 15 minutes of the rainfall, the channel at the outflow profile drained the steep-slope part of the watershed, an area of 0.408 km². This is a large area, and its size is primarily due to the high velocity of the water in the channel. The precipitation depth on the area after 15 minutes was 10 mm, and an estimated 80 per cent of the precipitation water ran off. This means k_0 = 0.8, which is smaller than the value of 0.96 found earlier; the value of 0.96 is associated with the situation at the end of the rainfall (i.e. 30 minutes after its beginning). Basically, the value of the direct runoff coefficient should increase with the duration of a rain event: after the rain begins, the soil pores are first filled with rainfall water, and interception and surface accumulation take place, all of these processes inhibiting the direct runoff. Eq. 1 determines the flow rate in the outflow profile 15 minutes after the beginning of the rain as

\[ Q_{15} = \frac{0.408 \cdot 10 \cdot 0.80 \cdot 10^3}{60 \cdot 15} = 3.62\ \mathrm{m^3/s}. \]

Fig. 6: Maize drills in the downhill direction on a slope of the Dubanka valley.

Fig. 7: Reclaimed channel of the Dubanka creek and its surroundings above Rozhovice.

In the 30th minute after the beginning of the rain, the channel in the outflow profile drained the watershed area of 1.123 km², and the flow rate in the profile was

\[ Q_{30} = \frac{1.123 \cdot 20 \cdot 0.96 \cdot 10^3}{60 \cdot 30} = 11.98\ \mathrm{m^3/s}. \]

The duration of the rainfall was 30 minutes; thus the hydrograph curve starts to decrease after reaching the Q_30 value. The constructed hydrograph is shown in Fig. 8. The hydrograph is steep and the maximum flow rate is high, because the water drained from the fields and accumulated in the channel reached the outflow profile very quickly, owing to the high velocity of the water in the channel. During the flood event virtually the entire water current flowed through the channel (no spilling over the bank); thus all water elements flowed very quickly, and even the elements from the farthest locations of the watershed reached the outflow profile in 30 minutes. As discussed theoretically in the previous section, this causes a steep and high flood wave.

Fig. 8: Deduced hydrograph in the existing channel above Rozhovice during the flood event of May 30th, 2005.

The reconstruction of the hydrograph of the May 30th, 2005 flood event suggests that the role of the properties of the stream channel in the formation of the runoff hydrograph may be important. In addition to the causes recognized in the report by the municipality representatives, a further cause of the flood was the hydraulically inappropriate shape of the reclaimed stream channel. An evaluation of the real impact of the hydraulic characteristics of the stream channel on the hydrograph development requires a simulation and a mutual comparison of the roles of the existing reclaimed channel and of a proposed natural (restored) channel with very different hydraulic characteristics. An inspection of the Dubanka valley indicated how the natural channel of the creek looked before it was adapted seventy years ago. The proposed natural (restored) channel adopts the properties of the original channel before the adaptation: it is shallow and curved, with a vegetation belt along its banks (Fig. 9). A cross section of the proposed natural channel is shown in Fig. 10.

Fig. 9: Proposal for a new natural channel.
During a flood event, water spills over the banks of such a channel and flows partially through the floodplain at the bottom of the valley. The velocities and flow rates for different water depths and longitudinal slopes are summarized in Table 2.

Table 2: Velocities and flow rates in the proposed natural channel (Fig. 10) for different water depths and longitudinal slopes (n = 0.10)

h (m)   S (m2)   O (m)   R (m)   C (m^0.5/s)   v, 1.2 % (m/s)   v, 1.0 % (m/s)   Q, 1.2 % (m3/s)   Q, 1.0 % (m3/s)
0.20    0.6      6.0     0.1     6.81          0.24             0.22             0.14              0.13
0.40    2.4      12.0    0.2     7.64          0.35             0.34             0.84              0.82
0.60    5.4      18.0    0.3     8.18          0.49             0.45             2.65              2.42
0.80    9.6      24.0    0.4     8.58          0.59             0.54             5.71              5.21
1.00    15.0     30.0    0.5     8.91          0.69             0.63             10.35             9.45

The velocities are small in comparison with the existing channel. This is particularly the case for the maximum flow rate: the velocity was more than 2.3 m/s in the existing channel during the flood event, while the velocity in the natural channel would be about 0.5 m/s. This has a significant effect on the distribution of the isochrones in the Dubanka watershed above Rozhovice (see Fig. 11). Remember that the conditions on the slopes of the Dubanka valley are the same for both compared channels, as is the velocity of the water in the maize field; the diminished velocity is due exclusively to the increased roughness of the discharge perimeter at the bottom of the valley and to the consequent spilling of water into the floodplain. The construction of the isochrones and of the hydrograph for the Dubanka watershed drained by the proposed natural channel is described in Tables 3-5.

Table 3: Time development of the flow rate along the length of the natural channel, and the distance of the isochrones from the outflow profile

t = 15 min: 1st reach Q = 0-1.5 (aver. 0.75) m3/s, v = 0.33 m/s, l = 297 m; total length 297 m
t = 30 min: 1st reach Q = 1.5-5.0 (aver. 3.3) m3/s, v = 0.48 m/s, l = 432 m; 2nd reach Q = 0-3.0 (aver. 1.5) m3/s, v = 0.39 m/s, l = 351 m; total length 783 m
t = 45 min: 1st reach Q = 5.0-7.0 (aver. 6.0) m3/s, v = 0.55 m/s, l = 495 m; 2nd reach Q = 3.0-5.0 (aver. 4.0) m3/s, v = 0.50 m/s, l = 450 m; 3rd reach Q = 0-4.0 (aver. 2.0) m3/s, v = 0.42 m/s, l = 378 m; total length 1323 m

Table 4: Areas between isochrones in the watershed drained by the natural channel

Area     1       2       3       4       total
km2      0.112   0.398   0.420   0.193   1.123

The isochrones in Fig. 11 reveal a significant deceleration of the runoff. The time of concentration changed from 30 minutes for the existing channel to 60 minutes for the proposed natural channel. This affects the runoff areas in Table 4 and the hydrograph parameters in Table 5. The maximum discharge is reached at the 45th minute after the beginning of the rain. This discharge is produced by the runoff from the area between the 15-minute isochrone and the 45-minute isochrone. This area is the largest, and it is logical that it forms the highest discharge. According to Eq. 1, with the zonal runoff coefficients of Table 5, the runoff from this area is equal to

\[ Q_{45} = \frac{(0.398 \cdot 1.00 + 0.420 \cdot 0.90) \cdot 20 \cdot 10^3}{60 \cdot 30} = 8.62\ \mathrm{m^3/s}, \]

which confirms the value reached in Table 5. The maximum flow rate (8.6 m³/s) is significantly lower than that for the existing channel (12 m³/s). Thus the presence of the natural channel flattens and prolongs the hydrograph in comparison with that for the existing channel.

Fig. 10: Cross section of the proposed natural channel and its surroundings.
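Table 2 appears to follow the Chezy formula with C = R^(1/6)/n. The sketch below reproduces its rows, assuming the cross-section geometry implied by the tabulated values (discharge area S = 15 h², wetted perimeter O = 30 h); this geometry is our inference, not stated in the paper.

```python
def natural_channel_row(h, slope, n=0.10):
    """One row of Table 2 via the Chezy formula, v = C * sqrt(R * S_slope).

    Geometry inferred from the tabulated values: S = 15 h^2 (m^2),
    O = 30 h (m), hence hydraulic radius R = h / 2 (m).
    """
    area = 15.0 * h ** 2           # discharge area (m^2)
    perimeter = 30.0 * h           # wetted perimeter (m)
    R = area / perimeter           # hydraulic radius (m)
    C = R ** (1.0 / 6.0) / n       # Chezy velocity coefficient (m^0.5/s)
    v = C * (R * slope) ** 0.5     # mean velocity (m/s)
    return C, v, v * area          # coefficient, velocity, flow rate

for h in (0.2, 0.4, 0.6, 0.8, 1.0):
    print(h, natural_channel_row(h, 0.012), natural_channel_row(h, 0.010))
```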
Table 5: Hydrograph parameters for the natural channel, using Eq. 1

t (min)   area 1 (k01; Q1)   area 2 (k02; Q2)   area 3 (k03; Q3)   area 4 (k04; Q4)   Qt (m3/s)
15        0.90; 1.12         -                  -                  -                  1.12
30        1.00; 1.24         0.90; 3.98         -                  -                  5.22
45        -                  1.00; 4.42         0.90; 4.20         -                  8.62
60        -                  -                  1.00; 4.67         0.90; 1.93         6.60
75        -                  -                  -                  1.00; 2.14         2.14

Moreover, there is an additional effect - the retention of water in the floodplain at the bottom of the Dubanka valley. This effect is not taken into account in the calculations in Table 5.

3.4 Evaluation of the role of the floodplain in transforming the hydrograph
Retention takes place in the Dubanka valley drained by the proposed natural channel. The channel has a low capacity of about 0.3 m³/s, and each flow rate greater than this causes a spillage of water into the floodplain. Thus the valley accumulates water and diminishes the flow rate; this transforms the flood wave, i.e. the hydrograph. Various methods are available for calculating the transformation of a flood wave; the volume-balance method is applied here. Water accumulates at the bottom of the valley and forms a temporary lake. There is a flow rate into the lake (inflow Q_p) and out of the lake (outflow Q_0). Over the infinitesimal time period dt, the volume Q_p dt enters the lake and the volume Q_0 dt leaves the lake. The difference between these two volumes gives the change in the volume of water accumulated in the lake; if Q_p > Q_0, the water depth in the lake increases. Hence

\[ \mathrm{d}V = (Q_p - Q_0)\,\mathrm{d}t = P\,\mathrm{d}h, \tag{3} \]

where dV is the change in the volume of water accumulated in the lake, P is the area of the water surface of the lake, and dh is the change in the water depth in the lake. Since we do not have analytical functions for Q_p = f(t) and Q_0 = f(t), we are not able to integrate Eq. 3. Instead, we can solve the equation

\[ Q_p\,\Delta t - Q_0\,\Delta t = \Delta V, \tag{4} \]

provided that information on Q_p = f1(t), Q_0 = f2(h), and V = f3(h) is available. The relationship V = f3(h) is found from the geometrical shape and dimensions (read from a map) of the floodplain at the bottom of the Dubanka valley, and is given in Table 6. Table 6 takes into account the fact that different relations between volume and depth hold in the upper reach of the lake (a reach 400 metres in length, volume V2) and in the lower reach of the lake (a reach 1000 metres in length, volume V1). Table 2 gives Q_0 = f2(h); for example, if the longitudinal slope is 1 % and the flow rate is 5.21 m³/s, the water depth in the lake is 0.8 m according to Table 2, and Table 6 shows that in this case the lake accumulates 11305 m³ of water. Table 5 gives Q_p = f1(t): at time t = 15 min, the total inflow into the lake is 11.23 m³/s (1.12 m³/s from area 1, 3.98 m³/s from area 2, 4.20 m³/s from area 3, and 1.93 m³/s from area 4). The average flow rate Q_pp over the first 15 minutes (t from 0 to 15 min) is one half of 11.23 m³/s, because the flow rate is zero at t = 0. Over the next 15 minutes (t from 15 to 30 min) the inflow is much higher: at t = 30 min, Q_p = 12.47 m³/s (see Table 5), and Q_pp = (11.23 + 12.47)/2 = 11.85 m³/s. There is no rain in the time period from 30 to 45 min, so Q_pp = (12.47 + 0)/2 = 6.24 m³/s. The method discussed above thus operates with simplified input relationships; these are, however, accurate enough to demonstrate the effects of water retention.
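The volume-balance routing of Eqs. 3 and 4 can be sketched as a level-pool computation, with V = f3(h) and Q_0 = f2(h) linearly interpolated from Tables 6 and 2. The fixed-point iteration below imitates the manual iteration used for Table 7 and reproduces its values only approximately; the interpolation scheme and function names are our assumptions.

```python
import bisect

# Depth-volume curve of the floodplain (Table 6): depth (cm) vs. total V (m^3)
DEP = [40, 50, 60, 70, 80, 90, 100]
VOL = [2828, 4415, 6360, 8650, 11305, 14310, 17669]
# Outflow rating of the profile (Table 2, slope 1.0 %): depth (cm) vs. Q (m^3/s)
DEP_Q = [20, 40, 60, 80, 100]
Q_OUT = [0.13, 0.82, 2.42, 5.21, 9.45]

def interp(x, xs, ys):
    # Piecewise-linear; linear through the origin below the first point,
    # clamped above the last point (good enough for this demonstration).
    if x <= xs[0]:
        return ys[0] * x / xs[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect.bisect_left(xs, x)
    f = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
    return ys[i - 1] + f * (ys[i] - ys[i - 1])

def route(avg_inflows, dt):
    """Level-pool routing of Eq. 4: Qp*dt - Q0p*dt = delta V.

    avg_inflows: average inflow Qpp per step (m^3/s, as in Table 7);
    dt: step length (s). Returns the instant outflows Q0t.
    """
    V, q_end, out = 0.0, 0.0, []
    for qpp in avg_inflows:
        q_start = q_end
        for _ in range(50):                  # fixed-point iteration per step
            q0p = 0.5 * (q_start + q_end)    # average outflow over the step
            V_new = V + (qpp - q0p) * dt     # volume balance (Eq. 4)
            h = interp(V_new, VOL, DEP)      # depth from the volume curve
            q_end = interp(h, DEP_Q, Q_OUT)  # instant outflow from the rating
        V = V_new
        out.append(round(q_end, 2))
    return out

# First two 15-min steps with the Table 7 inflow averages:
print(route([5.615, 11.85], 900.0))  # ~[1.6, 5.6]; Table 7 gives 1.5 and 5.5
```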
Higher accuracy of the input relationships could be achieved if the geodetic parameters of the floodplain surface were known and used for calculating V = f3(h) and Q_0 = f2(h) with hydraulic software such as HEC-RAS.

Fig. 11: Map showing the isochrones of the Dubanka watershed drained by the proposed natural channel above Rozhovice.

Table 7 demonstrates the process of determining the flow rate in the outflow profile above Rozhovice, as influenced by the water retention in the floodplain. The flow rate is determined using the volume-balance method described above, from V = f3(h), the volume curve of the retention space, and Q_0 = f2(h), the rating (consumption) curve of the outflow profile. The known value of the inflow rate and its volume determines the predicted outflow volume and the volume accumulated in the lake within a time step. The accumulated volume is added to the total volume R of accumulated water from the previous time step, and a new value of R is found for the end of the time step. R = f3(h) then gives the value of h for the determined R, and Q_0t = f2(h) gives the instant flow rate in the outflow profile. The chosen mean flow rate in the outflow profile must be equal to the average value of the instant flow rates Q_0t; this result is often achieved already in the second iteration.

After the first 15 minutes the floodplain accumulates 4487 m³ of water and the water depth in the lake grows to 50 cm; the stage-discharge curve determines the outflow discharge Q_0t = 1.5 m³/s. After another 15 minutes, the volume of the accumulated water is much greater, namely 12110 m³; the water depth is 82 cm and the outflow discharge from the stage-discharge curve is 5.5 m³/s. This discharge value is very similar to that calculated in Table 5. In Table 7, the time period from t = 30 min to t = 60 min is calculated with a shorter time step of 5 minutes, in order to avoid errors in the time and magnitude of the maximum flow rate caused by averaging the outflow discharge over an excessively long integration period.

Table 5 determines that the maximum flow rate is reached at time t = 45 min. According to the method of isochrones, the flow rate in the outflow profile is composed of the outflow discharge from area 2 in the time period from 15 to 30 min and of the outflow discharge from area 3 in the time period from 0 to 15 min. The maximum flow rate in the outflow profile is 8.6 m³/s.
Table 6: Relationship V = f3(h)

h (cm)   S1 (m2)   V1 (m3)   (2/3)h (cm)   S2 (m2)   V2 (m3)   sum V (m3)
40       2.4       2400      26.7          1.07      428       2828
50       3.75      3750      33.3          1.66      665       4415
60       5.4       5400      40.0          2.4       960       6360
70       7.35      7350      46.7          3.25      1300      8650
80       9.6       9600      53.3          4.26      1705      11305
90       12.15     12150     60.0          5.4       2160      14310
100      15.0      15000     66.7          6.67      2669      17669

Table 7: Determination of the instant flow rate Q0t in the outflow profile above Rozhovice; the flow rate is influenced by water retention in the Dubanka valley

t (min)   Qpp (m3/s)   Vp (m3)   Q0p (m3/s)   V0 (m3)   R (m3)   h (cm)   Q0t (m3/s)
15        5.615        5054      0.75         675       4379     50       1.5
30        11.85        10665     3.5          3150      11894    82       5.5
35        10.39        3117      5.95         1785      13226    86       6.4
40        6.23         1869      6.4          1920      13175    86       6.4
45        2.08         624       6.05         1815      11984    82.5     5.7
50        0            0         5.2          1560      10424    77       4.7
55        0            0         4.3          1290      9230     72       4.0
60        0            0         3.7          1110      8120     68       3.4
75        0            0         2.8          2520      5600     56       2.0
90        0            0         1.6          1440      4160     48       1.3
105       -            -         1.1          990       3170     42       0.9

Legend: Qpp - average inflow discharge over the time interval, Vp - inflow volume, Q0p - average outflow discharge over the time interval, V0 - outflow volume, R - total volume of water accumulated in the floodplain, h - water depth in the lake of accumulated water, Q0t - instant outflow discharge at time t.
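A single row of Table 7 can be checked directly from Eq. 4 (our arithmetic, using the tabulated averages):

```python
# Volume balance (Eq. 4) for the t = 30 min row of Table 7:
# R_new = R_old + (Qpp - Q0p) * dt
r_15 = 4379.0           # accumulated volume after 15 min (m^3)
qpp = 11.85             # average inflow over 15-30 min (m^3/s)
q0p = 3.5               # average outflow over 15-30 min (m^3/s)
dt = 15 * 60            # step length (s)
print(r_15 + (qpp - q0p) * dt)  # 11894.0 m^3, as tabulated (h = 82 cm)
```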
the culvert was blocked by old wooden window frames washed away from a warehouse belonging to a company that makes windows.

4 discussion of results

fig. 12 demonstrates the effect of the stream channel and its surroundings on the runoff hydrograph of the dubanka flood event of may 30th, 2005. the discussion focuses on the effects of variations in the coefficient of direct runoff and in the concentration time of the watershed on the course of the flood caused by the rainfall of may 30th, 2005.

fig. 12: comparison of hydrographs for two different channels in the dubanka valley above rozhovice. the hydrographs are caused by rainfall of 30-minute duration and total precipitation of 20 mm (curves: deep smooth channel; shallow channel with vegetation belt)

4.1 effect of maize in a sloping terrain on the direct runoff coefficient

the presence of the maize field on the slopes of the dubanka valley affected the course of the flood primarily through the value of the direct runoff coefficient, i.e. through the velocity of the runoff, which was high due to the presence of the maize field. the slope of the valley affected the flood through the value of the direct runoff coefficient and also through the acceleration of the runoff, i.e. through the enlargement of the areas between the isochrones. a combination of the effects of the presence of the maize field and the slope of the valley caused the extremely high value of the coefficient (k0 = 0.96). the effects on runoff from agriculturally cultivated land of areas smaller than 10 km2 affected by torrential rains can be evaluated using the curve number (cn) method established by the usda soil conservation service in the united states. a cn curve gives the proportion between the rainfall and the direct runoff, and hence the direct runoff coefficient (a sketch of this calculation is given at the end of this section). the method does not take the slope into account, but works only with information about field exploitation, soil cultivation, hydrological conditions and soil infiltration conditions. it gives the value cn = 91 for wide-space crops with downhill directed drills and bad hydrological conditions on soil with a very low velocity of infiltration. for a 20-mm rainfall event, this curve corresponds to a coefficient of direct runoff of k0 = 0.33, which is a very low value; the value k0 = 0.8 is not reached until a precipitation depth of 120 mm. if k0 = 0.33 is correct, then the effect of the slope of the valley on the runoff is more important than the effect of the presence of the maize field. the danger of a steep slope is stressed in every textbook on erosion control on agricultural land. according to czech standard csn 75 4500 "erosion control of agricultural land", wide-space crops should be raised on fields with slopes not steeper than 8 %. for slopes between 8 % and 15 %, belts of wide-space crops should be combined with belts of cereals, cereals with underseeding, or perennial fodder plants to limit erosion. the effect of the downhill orientation of the drills is probably not very important: if the drills were not oriented downhill, the k0 value would drop to, say, 0.8, and the maximum discharge from 12 m3/s to 10 m3/s.
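for orientation, the cn relations can be written out explicitly. this is a minimal sketch using the standard scs formulas (potential retention s = 25400/cn - 254 in mm, direct runoff q = (p - 0.2s)^2 / (p + 0.8s) for p > 0.2s); it approximately reproduces the values quoted above (it yields k0 of roughly 0.3 for 20 mm and roughly 0.8 for 120 mm; the exact tabulation used in the paper may differ slightly):

#include <cstdio>

// direct runoff coefficient k0 = q/p from the scs curve-number method.
// p: precipitation depth [mm], cn: curve number. standard metric form:
// potential retention s = 25400/cn - 254 [mm], initial abstraction 0.2*s.
double runoffCoefficient(double p, double cn)
{
    double s = 25400.0 / cn - 254.0;
    double ia = 0.2 * s;                              // initial abstraction
    if (p <= ia) return 0.0;                          // all rainfall retained
    double q = (p - ia) * (p - ia) / (p + 0.8 * s);   // direct runoff depth [mm]
    return q / p;
}

int main()
{
    // cn = 91: wide-space crops, downhill drills, bad hydrological conditions
    std::printf("k0(20 mm)  = %.2f\n", runoffCoefficient(20.0, 91.0));   // ~0.3
    std::printf("k0(120 mm) = %.2f\n", runoffCoefficient(120.0, 91.0));  // ~0.8
    return 0;
}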
4.2 effect of concentration time on the shape of the runoff hydrograph

the proposed natural channel prolonged the concentration time from 30 minutes for the existing channel to 60 minutes. in the previous sections it was shown that, for a rainfall of 30-minute duration (i.e. a duration equal to the concentration time of the existing channel), the proposed channel significantly flattens the hydrograph. it is necessary to show that this hydrograph flattening also takes place for rainfalls of longer duration, e.g. a duration equal to the concentration time of the proposed channel, i.e. 60 minutes. according to table 1, rainfall of 4-year periodicity and 60-min duration gives a precipitation depth of 25 mm. for the existing channel and the conditions of may 30th, 2005 in the dubanka valley (maize field with k0 = 0.96, deep smooth channel, no vegetation belt), the drained area of the outflow profile above rozhovice after 15 minutes of rainfall is 0.408 km2 and the outflow discharge is

q15 = 0.408 · 6.25 · 0.8 · 10^3 / (60 · 15) = 2.3 m3/s.

after another 15 minutes (t = 30 min) the drained area is 1.123 km2 and the outflow discharge is

q30 = 1.123 · 12.5 · 0.96 · 10^3 / (60 · 30) = 7.5 m3/s.

this flow rate will be maintained for the rest of the rain duration (see fig. 13). the determination of the hydrograph for the proposed natural channel is shown in table 8 (the superposition computed there is sketched in code at the end of this section). the areas between the isochrones are taken from table 4. the maximum flow rate in the outflow profile above rozhovice is virtually the same as for the existing channel. it changes, however, if the effect of water retention is applied.

table 8: determination of the hydrograph for the proposed channel, using eq. 1 for rainfall of 60 min and 25 mm (no water retention); k0i and qi refer to areas 1–4 between the isochrones

t (min)  k01   q1   k02   q2   k03   q3   k04   q4   qt (m3/s)
15       0.92  0.7                                   0.7
30       1.0   0.8  0.92  2.5                        3.3
45       1.0   0.8  1.0   2.8  0.92  2.7             6.3
60       1.0   0.8  1.0   2.8  1.0   2.9  0.92  1.2  7.7
75                  1.0   2.8  1.0   2.9  1.0   1.3  7.0

fig. 13: comparison of the hydrographs for two different channels in the dubanka valley above rozhovice. the hydrographs are caused by rainfall of 60-minute duration and total precipitation of 25 mm (curves: deep smooth channel; shallow channel with vegetation belt)

the retention effect is very similar to that for the 30-min rainfall in the natural channel, because for both rains the maximum flow rate in the natural channel is about 8 m3/s for the hydrograph constructed with no retention effect. for the 30-min rainfall the retention effect diminished the maximum flow rate from 8.6 m3/s to 6.4 m3/s (see table 7). for the 60-min rainfall the retention effect diminishes the maximum flow rate from 7.7 m3/s to 6.0 m3/s (see fig. 13). the comparison of the hydrographs in fig. 13 shows that the difference between the flood waves for the two channels is smaller for the rainfall of 60-min duration than for the rainfall of 30-min duration (in fig. 12). however, the difference is significant for both rains. the smaller difference for the longer-lasting rainfall is because the longer-lasting rainfall has a lower intensity, which reduces the adverse effects of the fast runoff promoted by the deep smooth channel.
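the superposition behind table 8 reduces to a sum of sub-area contributions, each delayed by the travel time between isochrones. a minimal sketch, with hypothetical isochrone areas chosen so that the output matches table 8 (the real values come from table 4, which is not reproduced here):

#include <cstdio>
#include <vector>

// discharge contribution of one sub-area over a 15-min interval, following
// the pattern q = a * p * k0 * 10^3 / (60 * 15) used in the text:
// a [km2], p [mm of rain falling in the interval], k0 [-], result [m3/s].
double contribution(double a_km2, double p_mm, double k0)
{
    return a_km2 * p_mm * k0 * 1e3 / (60.0 * 15.0);
}

int main()
{
    // hypothetical isochrone areas 1..4 [km2]; the paper takes these from table 4.
    std::vector<double> area = {0.11, 0.40, 0.42, 0.19};
    // runoff coefficient per rain interval on a given sub-area:
    // 0.92 during the first wetting interval, 1.0 afterwards (as in table 8).
    auto k0 = [](int rainInterval) { return rainInterval == 0 ? 0.92 : 1.0; };

    double p = 25.0 / 4.0;   // 25 mm over 60 min -> 6.25 mm per 15-min interval
    for (int t = 0; t < 5; ++t) {           // evaluate t = 15, 30, 45, 60, 75 min
        double qt = 0.0;
        // area i contributes at step t the rain that fell (t - i) steps earlier;
        // the rain lasts 4 intervals (60 min).
        for (std::size_t i = 0; i < area.size(); ++i) {
            int rainStep = t - static_cast<int>(i);
            if (rainStep < 0 || rainStep >= 4) continue;  // no rain was falling then
            qt += contribution(area[i], p, k0(rainStep));
        }
        std::printf("t = %2d min: qt = %.1f m3/s\n", 15 * (t + 1), qt);
    }
    return 0;
}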
the comparisons of the hydrographs in figs. 12 and 13 are actually not completely appropriate, because the figures do not compare flood events that produce the maximum flow rate for the given conditions. a flood event generates the maximum flow rate if the flood is caused by rainfall of duration equal to the concentration time of the watershed. thus, the two hydrographs that should be compared with each other are the hydrograph caused by rainfall of 20 mm and 30 min for the existing channel (fig. 12) and the hydrograph caused by rainfall of 25 mm and 60 min for the proposed natural channel (fig. 13). this comparison shows that the maximum flow rate in the proposed channel is one half of the maximum flow rate in the existing reclaimed channel. theoretically, the maximum flow rate generated in the watershed with the proposed natural channel should be higher for the longer-lasting rain (25 mm and 60 min) than for the shorter-lasting rain (20 mm and 30 min). however, figs. 12 and 13 show that this is not the case for our conditions: the shorter rain produces a maximum flow rate that is 0.4 m3/s greater than that produced by the longer rain. this disproportion is due to the specific configuration of the watershed terrain in the dubanka valley above rozhovice. the areas confined by the isochrones of 15 and 30 min, and by the isochrones of 30 and 45 min, are large in comparison with the areas between the other isochrones, and hence rainfall with a duration of 30 minutes produces an extraordinarily great runoff.

5 conclusions

the effect of a stream channel and of its floodplain on the course of a flood is described using a comparison of runoff hydrographs for a certain rainfall impacting a certain watershed that is drained by channels with different hydraulic characteristics. the paper proposes a methodology for evaluating the flood waves produced by different (either reclaimed or natural/restored) stream channels. the methodology adopts the method of isochrones for evaluating the effect of the channel itself and the volume-balance method for the effect of its floodplain. the proposed methodology is demonstrated on a flood event that took place in 2005, and shows that restoration of the channel should significantly diminish the maximum discharge produced in the stream channel. if the existing reclaimed hydraulically smooth channel were replaced by the restored channel, the maximum discharge would be reduced to approximately one half. the change in the shape of the runoff hydrograph (and thus the value of the maximum discharge) depends primarily on the change in the velocity of flow through the channel. for the evaluated flood event, the proposed restoration of the stream channel would diminish the velocity from a value of 2.3 m/s in the existing reclaimed channel to 0.5 m/s in the proposed restored channel. besides the changes in the hydraulic characteristics of the stream channel, the runoff hydrograph of the observed watershed is further modified by the effect of the spread of water into the floodplain; the methodology also takes this effect into account. the methodology shows that reclaimed channels, which are usually deep, straight and smooth, accelerate and increase floods. this is because the increased velocity of flow through a channel shortens the time of concentration of the runoff from a watershed. it is well known that the maximum discharge of the runoff is produced by rainfall equal in duration to the concentration time of the affected watershed.
due to the fixed periodicity of a rainfall event, rainfalls of shorter duration are more intense and thus produce floods with higher maximum discharges if they impact watersheds with a shorter concentration time equal to the rainfall duration.

references
[1] ponce, v. m.: engineering hydrology. new jersey: prentice-hall, 1989, 640 p.
[2] alberta environmental protection: stormwater management guideline for the province of alberta, 1999.
[3] wmo: guide to hydrological practices. wmo no. 168, 1994.
[4] čsn 75 4500: erosion control of agricultural land (in czech).
[5] kemel, m.: climatology, meteorology, hydrology. ctu prague, 2000 (in czech).
[6] matoušek, v.: documentation and evaluation of a flood event on the dubanka creek. research report mzp 0002071101, prague: vúv, 2005 (in czech).
[7] matoušek, v.: impact of the stream channel and floodplain on the course and magnitude of a flood. in: just et al.: water-engineering revitalizations and their applications in flood control, prague: artedit, 2005 (in czech).
[8] usda soil conservation service: scs national engineering handbook, section 4: hydrology. washington, d.c., 1985.

ing. václav matoušek, drsc.
phone: 220 197 382
fax: 233 333 804
e-mail: vaclav_matousek@vuv.cz
výzkumný ústav vodohospodářský t. g. masaryka
veřejná výzkumná instituce
podbabská 2582/30
160 00 praha 6, czech republic

realtimeframe – a real time processing framework for medical video sequences

s. gross, t. stehle

imaging technology is highly important in today's medical environments. it provides information upon which the accuracy of the diagnosis and consequently the wellbeing of the patient rely. increasing the quality and significance of medical image data is therefore one of the aims of scientific research and development. we introduce an integrated hardware and software framework for real time image processing in medical environments, which we call realtimeframe. our project is designed to offer flexibility, easy expandability and high performance. we use standard personal computer hardware to run our multithreaded software. a frame grabber card is used to capture video signals from medical imaging systems. a modular, user-defined process chain performs arbitrary manipulations on the image data. the graphical user interface offers configuration options and displays the processed image in either window or full screen mode. image source and processing routines are encapsulated in dynamic library modules for easy functionality extension without recompilation of the entire software framework. documented template modules for sources and processing steps are part of the software's source code.

keywords: medical image processing, software framework, real time image processing, multithreading.

1 introduction

medical imaging is important in today's clinical environments. multiple imaging modalities such as x-ray angiography or endoscopy gather information on the patient's well-being and medical status [1]. visual information is essential for the medical practitioner to confirm a diagnosis and to administer a therapy. an example is coronary angiography, where the inflow and outflow of a contrast agent in the coronary arteries provide decisive information for the planning of interventions [2]. for example, image processing algorithms can be used for image enhancement, which can lead to a reduction of the required x-ray dose for an accurate diagnosis. medical image information often has to be acquired from various imaging modalities, enhanced, analyzed, interpreted, and displayed for the medical practitioner. integration of these steps in a single software framework is a complex task, but it also offers the chance to create a foundation for the operation and development of medical image processing algorithms. the rest of the paper is organized as follows. chapter two lists the requirements for a real time software framework for medical image data processing. chapter three illustrates the work flow of our software framework. a more detailed view of the software design patterns and techniques used in our software framework is the focus of chapter four. the fifth chapter concentrates on the hardware and software we used to implement our project. the results and a report on clinical trials are highlighted in chapter six. chapter seven sums up the main points and concludes with final remarks.

2 requirements

the intention of our project was to design a platform for complex image and video processing algorithms. image frames are acquired from a medical image source, often processed with appropriate algorithms and finally displayed for the medical practitioner. the aim of the development was to combine these steps in a fast and efficient framework. a catalogue of criteria to be met by realtimeframe was developed.
the requirements can be divided into three categories: features, performance and expandability, which are described in the following sections.

2.1 features

images have to be acquired from a medical image source and may be preprocessed for display in real time. during interventions, video data is supplied via the s-video connector of the endoscopy system's camera control unit. thus, frame grabber functionality is mandatory to digitise the video stream. for testing and presentation purposes it must be possible to read video streams from the hard drive to simulate a real medical image source. a further requirement is that arbitrary operations can be performed on the image data. such operations could enhance image quality and significance, or store the data as video or image sequences on the hard drive. the results must be displayed with minimal time delay for use during diagnosis or surgery, and a full screen mode is needed to provide the best possible view for the medical practitioner.

2.2 performance

the images processed are 720×576 pixels in size. the refresh rate required for clinical evaluation is 50 fields per second. the images are rgb coded, and each of the three channels is sampled with 8-bit colour depth. handling such a stream leads to a total of 31,104,000 bytes (≈ 29.66 megabytes) of image data per second. the delay between image acquisition and image display must not exceed 100 ms; otherwise, a physician would experience an irritating time lag during the intervention.

2.3 expandability

expandability was mandatory for realtimeframe. the aim was to create a system flexible enough to be constantly expanded with further image sources and image processing algorithms, while providing performance for real time application in the medical field. ideally, additional image sources and processing algorithms can be integrated into the program without recompiling the whole software framework.

3 work flow

realtimeframe was designed to provide a flexible platform for medical image processing. fig. 1 illustrates the work flow of the framework (a minimal sketch of this pipeline is given below).

fig. 1: work flow concept for the realtimeframe software framework (image acquisition: source, source buffer, source thread; data manipulation: process modules 1…n, process thread; image display: display buffer, display thread; all controlled via the graphical user interface)
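to make the three-thread pipeline of fig. 1 concrete, here is a minimal self-contained sketch. it is our own illustration in modern c++ (the framework itself predates std::thread and uses qt's threading primitives), and all names are hypothetical:

#include <condition_variable>
#include <cstdio>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// a minimal stand-in for one video frame (720x576 rgb in the paper).
struct Frame { std::vector<unsigned char> rgb; };

// a tiny thread-safe buffer standing in for the source/display buffers of fig. 1.
class FrameBuffer {
public:
    void push(Frame f) {
        { std::lock_guard<std::mutex> lk(m_); q_.push(std::move(f)); }
        cv_.notify_one();
    }
    Frame pop() {
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [this] { return !q_.empty(); });
        Frame f = std::move(q_.front()); q_.pop();
        return f;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<Frame> q_;
};

int main() {
    FrameBuffer sourceBuffer, displayBuffer;
    // the process chain: an arbitrary, user-configured list of modules.
    std::vector<std::function<void(Frame&)>> processChain = {
        [](Frame& f) { for (auto& b : f.rgb) b = 255 - b; }  // e.g. colour inversion
    };

    std::thread sourceThread([&] {            // image acquisition
        for (int i = 0; i < 10; ++i)
            sourceBuffer.push(Frame{std::vector<unsigned char>(720 * 576 * 3, 0)});
    });
    std::thread processThread([&] {           // data manipulation
        for (int i = 0; i < 10; ++i) {
            Frame f = sourceBuffer.pop();
            for (auto& module : processChain) module(f);
            displayBuffer.push(std::move(f));
        }
    });
    std::thread displayThread([&] {           // image display
        for (int i = 0; i < 10; ++i) { Frame f = displayBuffer.pop(); std::printf("frame shown\n"); }
    });

    sourceThread.join(); processThread.join(); displayThread.join();
    return 0;
}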
3.1 image acquisition

a source is chosen by the user to generate image data. this source is encapsulated in a dll file and uses the interface defined by the framework to place image data in the source buffer. the image source can be replaced by any module implementing the image retrieval process for a new source; no change to the framework has to be made. an example of this would be the replacement of the installed frame grabber card by another model: a new source module can be implemented and then easily selected via the gui.

3.2 image processing

data from the source buffer is sent to a user-defined chain of processing modules implementing image manipulation and processing steps. these steps may include algorithms for improving the image quality or significance, filtering, and storing in videos or file sequences. the functionality of the processing modules is again encapsulated in dll files. the process chain can be configured to hold an arbitrary combination of processing modules, and is not limited by any implemented restriction. with respect to real time processing and the cpu time required for the process chain, however, there are practical limitations.

3.3 graphical user interface

the graphical user interface (gui) delivered with realtimeframe is depicted in fig. 2. it offers control options for all operations performed by realtimeframe on the left side of the window. drop-down menus list all available sources and processing modules. parameters for the selected modules are represented by check boxes, spin boxes, and text lines. these parameters and their visual representations are automatically extracted from the associated modules and integrated into the gui. the process chain can be configured via the gui and turned on and off during an intervention. the right hand side of fig. 2 displays the processed image. a full screen mode is available, offering an enlarged image display area during medical interventions.

fig. 2: screenshot of the main window, showing configuration buttons and a processed medical image

3.4 available modules

video source plug-ins shipped with realtimeframe are sourceframegrabber for accessing euresys picolo series frame grabber cards, sourcevideo for accessing video files and sourcesingleimage for loading image sequences from the hard drive. processing modules shipped with realtimeframe include a median filter, a colour channel inversion module and the canny edge detector. documented examples and ready-to-use templates for source and processing modules are included in the source code. therefore, creating new modules based on the template files is fast and relatively easy (a sketch of what such a module interface might look like is given below).
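as an illustration of the module concept, a processing-module interface of this kind could look as follows; the class and function names are our assumptions, not the actual realtimeframe api:

#include <map>
#include <string>
#include <vector>

// minimal frame type; realtimeframe works on 720x576 rgb images.
struct Frame { int width, height; std::vector<unsigned char> rgb; };

// hypothetical plug-in interface: every processing module implements it.
class ProcessModule {
public:
    virtual ~ProcessModule() = default;
    virtual std::string name() const = 0;
    // parameters the gui should expose (check boxes, spin boxes, ...).
    virtual std::map<std::string, double> parameters() const = 0;
    virtual void setParameter(const std::string& key, double value) = 0;
    // in-place manipulation of one frame of the stream.
    virtual void process(Frame& frame) = 0;
};

// example module: colour channel inversion (one of the shipped modules).
class InvertModule : public ProcessModule {
public:
    std::string name() const override { return "invert"; }
    std::map<std::string, double> parameters() const override { return {}; }
    void setParameter(const std::string&, double) override {}
    void process(Frame& frame) override {
        for (auto& byte : frame.rgb) byte = 255 - byte;
    }
};

// a dll would additionally export a factory with c linkage, e.g.:
// extern "C" ProcessModule* createModule();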
4 software concepts

the premises for the software require careful planning, and especially the postulated flexibility and expandability have to be considered. an object-orientated approach is a prerequisite for the successful creation of an easily extendable software framework.

4.1 model-view-controller concept

the model-view-controller (mvc) concept is a well-known and widely-used pattern in object-orientated software architecture [4]. the central idea is to increase the flexibility of the software project, to simplify the reuse of code parts and to shorten the time needed for familiarisation with the code. the concept introduces functional units and enforces encapsulation of the associated code. the software developer can concentrate on certain parts of the code, which can be modified or even replaced without any change to the rest of the framework; the only necessity is to leave the interfaces to the other program parts unchanged. applying the mvc concept, the functionality of a program can be divided into three functional blocks: model, view and controller. fig. 3 shows an organisational diagram of realtimeframe, indicating the separation of program parts into these functional blocks.

fig. 3: model-view-controller concept for the realtimeframe software framework (controller: main, gui, source control, process control; model: source buffer, display buffer, state; view: source, process step)

the model is responsible for storage of the processed and unprocessed image data. it does not have any knowledge of the source of the data or its meaning. in realtimeframe's case, the model does not hold any information on whether the stored frame is from a video, a single image, a frame grabber or any other image source. the model does not know if the data will be processed or displayed at all; it simply offers memory to store and read the frame data. the task of the view is to display data. the view is separated from acquisition, storage or processing of the data: data handed over to the view module is presented to the user. the controller is responsible for modifying the data and for administering the image manipulation process. it is separated from the data storage in the model and from the representation of the data in the view.

4.2 multithreading

multithreading enables the work load to be spread into different work flows. these work flows, so-called threads, run simultaneously and independently. interaction is possible between threads, and enables them to exchange data, to wait for signals or to start and stop each other. this also enables user interaction while program tasks run in the background. however, multithreaded programming has some drawbacks [5]. the method usually leads to less understandable code, and it is left to the programmer to identify all possible problems with thread interactions and to solve them. bug tracking in these cases is difficult and time consuming.

4.3 parallel processing

performance is one of realtimeframe's key requirements, as we are working under real time conditions. performance can be boosted in several ways; one of them, if a multiprocessor computer is available, is parallel processing. having more than one processor core enables the programmer to distribute the work load. theoretically, one would expect four cores to work four times faster than a single core. this, however, is not the case, for the following reasons:

- coordinating the distribution and ensuring load balancing for four processors creates a computational overhead.
- if the process waits for hardware resources or user input, the number of processors is of no consequence for the duration of the delay.
- not all problems can be distributed evenly, or at all: some results may depend on others, and calculations may need previous results. this prevents operations from being run simultaneously.

even taking these constraints into account, there is still a considerable performance increase when running applicable tasks on multiple cpus. we therefore make use of parallel processing whenever possible, as image manipulation algorithms can often be vastly accelerated in comparison with a single processor system (a sketch is given below).
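a typical applicable task is a per-pixel filter, which can be split over the rows of a frame. a minimal sketch, again in modern c++ for brevity (the framework itself predates std::thread):

#include <thread>
#include <vector>

// invert a range of rows of an interleaved rgb frame.
void invertRows(std::vector<unsigned char>& rgb, int width,
                int rowBegin, int rowEnd)
{
    for (int y = rowBegin; y < rowEnd; ++y)
        for (int i = y * width * 3; i < (y + 1) * width * 3; ++i)
            rgb[i] = 255 - rgb[i];
}

// distribute the rows of one frame over nThreads worker threads.
void invertParallel(std::vector<unsigned char>& rgb, int width, int height,
                    int nThreads)
{
    std::vector<std::thread> workers;
    int rowsPerThread = (height + nThreads - 1) / nThreads;
    for (int t = 0; t < nThreads; ++t) {
        int begin = t * rowsPerThread;
        int end = begin + rowsPerThread < height ? begin + rowsPerThread : height;
        if (begin >= end) break;
        workers.emplace_back(invertRows, std::ref(rgb), width, begin, end);
    }
    for (auto& w : workers) w.join();   // all rows are done before returning
}

int main()
{
    std::vector<unsigned char> frame(720 * 576 * 3, 100);
    invertParallel(frame, 720, 576, 4);  // e.g. the two dual-core xeons used here
    return 0;
}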
4.4 dynamic link libraries

the linking process was invented to bind multiple object files, created by the compiler from different source code files, into one single executable. libraries were used to supply additional functionality, and to reduce code size and compile time. a frequently used library was compiled into many programs, so it ultimately consumed a large amount of hard drive space and, even more importantly, a large amount of the computer's main memory. in the late 1960s dynamic linking was developed to cope with the problems of disk space. the linker process only records the necessary library and the entry points of the functions called, but does not include the object code of the library itself in the binary. a library only has to be stored once on a hard drive of the computer, and its path must be made known to the operating system; all programs can then load this library if needed to perform their operations. dynamic loading, on the other hand, seeks to free up system memory by directing all programs in need of a certain library to the same memory location. the library is loaded only once; thus multiple programs use the same instance of the library, which is unloaded when the last program that depends on it exits. dynamic linking and loading offers more advantages than just the originally intended effect of saving disc space and memory. replacing a library with a new version takes effect for all programs which refer to it: the programs do not need to be changed and are completely separated from the implementation of the library. there is no need to recompile the programs, and introducing updated versions or additional functionality in libraries is therefore uncomplicated. however, the library creator has to ensure that the library's interface remains unchanged and the entry points referenced in the code still exist. the expandability postulated for realtimeframe can be achieved with dynamic link libraries (dll) without the need to recompile the main program. we decided to define an interface for all source and processing modules to use. at start-up realtimeframe checks for available dlls and loads each of them (a sketch of this plug-in loading is given below). adding a source or processing module therefore only requires moving the dll into the appropriate directory. this helps in developing additional functionality and in servicing customer-deployed copies of realtimeframe, as new or updated functionality can be installed easily and no change to the rest of the framework is necessary.
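on windows, the plug-in mechanism described above boils down to a handful of win32 calls. a minimal sketch; the module file name and the exported factory name are our assumptions, not the actual realtimeframe conventions:

#include <windows.h>
#include <cstdio>

// hypothetical factory signature exported by every module dll;
// the actual interface of the framework is not published in the paper.
struct ProcessModule;  // opaque here
typedef ProcessModule* (*CreateModuleFn)();

int main()
{
    // load a plug-in found in the module directory at start-up.
    HMODULE lib = LoadLibraryA("modules\\median_filter.dll");  // path is illustrative
    if (!lib) { std::printf("module not found\n"); return 1; }

    // resolve the agreed-upon entry point by name.
    CreateModuleFn create =
        reinterpret_cast<CreateModuleFn>(GetProcAddress(lib, "createModule"));
    if (!create) { std::printf("not a valid module\n"); FreeLibrary(lib); return 1; }

    ProcessModule* module = create();   // hand the module to the process chain
    (void)module;                       // ... use it ...

    FreeLibrary(lib);                   // unloaded when no longer needed
    return 0;
}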
5 development platform

we chose our hardware and software to fit the requirements defined in chapter 2.

5.1 software

the software was chosen for performance, features and flexibility, as well as for improved and simplified implementation. we opted to use windows xp professional rather than windows vista. windows vista was introduced on january 30, 2007, and users then experienced various problems with the new operating system, including numerous driver compatibility and availability issues. this convinced us to use windows xp professional, as the manufacturers of our hardware provided drivers for this operating system and most issues with this software platform are well-known and have been solved in the meantime. the development platform was microsoft visual studio .net 2003. we decided to replace the standard microsoft compiler with the intel c++ compiler 10.0 for windows, as it provides better support for multithreading and generates faster code. apart from the choice of the operating system and the integrated development environment (ide), the choice of the underlying libraries has a major influence on the implementation. many functions and algorithms are provided by libraries, and using these well tested, reliable libraries speeds up the implementation process. the graphical user interface (gui) is based on trolltech's qt. it offers fast and comfortable gui design as well as an interface for multithreading and inter-thread communication. the images are displayed using microsoft directx 9 to take advantage of hardware acceleration. we use intel's open computer vision library (opencv) for image handling and manipulation. as opencv uses the outdated video for windows interface to access video files, we replaced this functionality with ffmpeg's libavcodec. thus, realtimeframe is able to handle video files larger than 2 gb, a size reached rapidly when medical video data is stored in high quality.

5.2 hardware

today's hardware offers ample performance to create a system capable of acquiring images, manipulating image data and displaying results with only an insignificant lag. we decided to use standard personal computer hardware components, as they offer solid performance at a reasonable price. our hardware platform consists of two intel xeon 5140 dual core processors with 2.2 ghz. the two cpus are placed on a tyan server motherboard with 4 gb ram. we use two 70 gb hard drives in a raid setup for the operating system, realtimeframe, and the development tools. two 500 gb sata-2 hard drives offer performance for simultaneous data storage and reading. we use an nvidia geforce 7950 gx2 graphics adapter, which offers the possibility of performing matrix calculation operations via a programmable interface. for video acquisition from medical devices, we installed a euresys picolo series industrial frame grabber card.

6 results

realtimeframe is run on two identical systems. one is the development platform, which remains at the institute of imaging and computer vision, rwth aachen university. the other system is used in clinical trials. realtimeframe has proved to be of exceptional value for the development, testing, and clinical evaluation of algorithms for medical image processing. our framework was recently used as a basis for the implementation of our fascia enhancement algorithm [6], and it currently serves as the operating platform in the associated clinical trials during surgical interventions. clinical trials have shown that realtimeframe introduces an acceptable time lag. the complete system, including the endoscope used in the interventions, delayed the image by about 90–110 ms on average.
activating the fascia enhancement algorithm increases this number to 130 ms. the endoscope alone causes an average lag of 60 ms. taking this into account, the postulated maximum increase of 100 ms has not been exceeded. the software framework itself causes a very low cpu load: at a refresh rate of 25 frames per second, about 2 % of the cpu capacity of the presented hardware platform is consumed by image acquisition via the frame grabber card and display on the screen.

7 summary and conclusion

realtimeframe is a solution for real time video processing in medical environments. it provides services for image acquisition from various image sources. the data is processed by a chain of processing steps before it is displayed; the process chain may include filtering, analysis, and the saving of frame sequences or entire videos in an arbitrary combination. we have designed source and processing modules to encapsulate the functionality. the module initialisation is performed at start-up, and the gui offers control and configuration options for all realtimeframe plug-ins found. thus, adding new modules is straightforward, and is done without changes to the framework. realtimeframe meets the requirements listed in chapter two. the concepts of the implementation were illustrated in chapters three to five. based on the results highlighted in chapter six, we are convinced that realtimeframe will be of exceptional value for medical image processing in clinical environments, and also in the development of new algorithms in the future.

8 acknowledgments

we would like to thank prof. t. aach of the institute of imaging and computer vision, rwth aachen university, for supervising the research described in this paper. we would also like to extend our thanks to prof. dr. med. bruch and dr. med. keller at the university medical center schleswig-holstein, lübeck, germany, for their cooperation and their efforts in connection with the clinical evaluations employing realtimeframe. part of the project was funded by olympus winter & ibe gmbh, hamburg, germany. we are grateful to them for supporting our research.

references
[1] aach, t., schiebel, u., spekowius, g.: digital image acquisition and processing in medical x-ray imaging. journal of electronic imaging, vol. 8 (1999), p. 7–22.
[2] aach, t., mayntz, c., rongen, p., schmitz, g., stegehuis, h.: spatiotemporal multiscale vessel enhancement for coronary angiograms. in: medical imaging 2002, spie, vol. 4684, p. 1010–1021, san diego, usa.
[3] gross, s., stehle, t., behrens, a., aach, t.: realtimeframe – a real time image processing solution for medical environments. in: 41. dgbmt-jahrestagung biomedizinische technik 2007.
[4] gamma, e., helm, r., johnson, r., vlissides, j.: design patterns – elements of reusable object-orientated software. addison-wesley, 2003, p. 4–6.
[5] lee, e. a.: the problem with threads. computer, vol. 39 (2006), no. 5, p. 33–42.
[6] stehle, t., behrens, a., bolz, m., aach, t.: visual enhancement of facial tissue in endoscopy. spie medical imaging 2008 (to appear).

sebastian gross
e-mail: sebastian.gross@lfb.rwth-aachen.de
thomas stehle
e-mail: thomas.stehle@lfb.rwth-aachen.de
institute of imaging and computer vision
rwth aachen university
sommerfeldstraße 24
d-52074 aachen, germany
various properties of sturmian words

p. baláži

this overview paper is devoted to sturmian words. the first part summarizes different characterizations of sturmian words. besides the well known theorem of hedlund and morse it also includes recent results on the characterization of sturmian words using return words or palindromes. the second part deals with substitution invariant sturmian words, where we present our recent results. we generalize one-sided sturmian words using the cut-and-project scheme and give a full characterization of substitution invariant sturmian words.

keywords: sturmian words, mechanical words, 2-interval exchange map, palindromes, return words, substitutions.

1 introduction

in recent years, the combinatorial properties of finite and infinite words have become significantly important in the fields of physics, biology, mathematics and computer science. one of the first impulses for extensive research in this field was the discovery of quasi-crystals. normal crystal structures show rotational and translational symmetry. in 1982, however, dan shechtman discovered an aperiodic structure (formed by rapidly-quenched aluminum alloys) that has a perfect long-range order, but no three-dimensional translational periodicity (see e.g. [1] or [2]). since then many stable and unstable aperiodic structures have been discovered; they are now known as quasi-crystals. aperiodic structures have been studied from various points of view, and there are numerous relations with other applications (besides solid-state physics), such as pseudo-random number generators [3], pattern recognition and symbolic dynamical systems. early results are reported in [4] or [5].

this paper is devoted to sturmian words. sturmian words are infinite words over a binary alphabet with exactly $n+1$ factors of length $n$ for each $n \ge 0$. they represent the simplest family of quasi-crystals. the history of sturmian words dates back to the astronomer j. bernoulli iii [6]. he considered the sequence $\big( \lfloor (n+1)\alpha + \tfrac{1}{2} \rfloor - \lfloor n\alpha + \tfrac{1}{2} \rfloor \big)_{n \ge 1}$, where $\alpha$ is a positive irrational. based on the continued fraction expansion of $\alpha$, he gave (without proof) an explicit description of the terms of the sequence. there also exist some early works by christoffel and markoff. a. a. markoff was the first to prove the validity of bernoulli's description. he did that in his work [7], where he described the terms of the sequence $\big( \lfloor (n+1)\alpha + \beta \rfloor - \lfloor n\alpha + \beta \rfloor \big)_{n \ge 1}$ (later known as a mechanical sequence). the first detailed investigation of sturmian words is due to hedlund and morse [4], who studied such words from the point of view of symbolic dynamics and, in fact, introduced the term "sturmian", named after the mathematician charles françois sturm. it appears that there are several equivalent ways of constructing sturmian words. we will describe some of them and show the relationship with other notions from the combinatorics on words, such as palindromes and return words. then we make some notes on an extension of sturmian words using the cut-and-project scheme. the next section is devoted to the important question of the invariance of sturmian words under a substitution. in the last section we present some open problems related to generalizations of sturmian words.

2 sturmian words

an infinite one-sided word $w = (w_n)_{n \ge 0} = w_0 w_1 w_2 \ldots$ is a sequence of letters from a finite set $A$, which is called an alphabet. we use the notation $\mathbb{N}$ for the set of positive integers and $\mathbb{N}_0 = \mathbb{N} \cup \{0\}$. a finite word $v = v_0 v_1 \ldots v_{n-1}$ is a finite string of letters from $A$, and $|v| = n$ is the length of $v$. the set of all finite words over the alphabet $A$ is denoted by $A^*$. a finite word $u \in A^*$ is a factor of $w$ if there exist $0 \le k \le l$ such that $u = w_k \ldots w_l$. the empty word is denoted by $\varepsilon$. the set of factors of $w$ of length $n$ is written $L_n(w)$, and the set of all factors of $w$ is denoted by $L(w)$; the set $L(w)$ is often called the language of $w$. an infinite word $w$ is ultimately periodic if there exist a word $u$ and a word $v$ such that $w = u v^\omega$, where $v^\omega$ is the infinite concatenation of the word $v$; it is periodic if $u$ is the empty word. if the infinite word $w$ has neither of these forms, we say that it is aperiodic.
we say that $u$ is a prefix (resp. suffix) of $v = v_0 v_1 \ldots v_{n-1} \in L_n(w)$ if there exists $0 \le l \le n-1$ such that $u = v_0 \ldots v_l$ (resp. $u = v_l v_{l+1} \ldots v_{n-1}$). the infinite word $w_l w_{l+1} w_{l+2} \ldots$, $l \ge 0$, is a suffix of $w$. an infinite word $w$ is uniformly recurrent if for every integer $k$ there exists an integer $l$ such that each word of $L_k(w)$ occurs in every factor of $w$ of length $l$.

the usual way of defining sturmian words is via the complexity function. let $w$ be an infinite word and let $C_w$ be a mapping $\mathbb{N}_0 \to \mathbb{N}$ such that $C_w(n)$ is the number of different factors of length $n$ of $w$, i.e. $C_w(n) = \# L_n(w)$. $C_w$ is called the complexity, and we say that the word $w$ is sturmian if $C_w(n) = n + 1$ for all $n$. since $C_w(1) = 2$, sturmian words are defined over a binary alphabet, say $A = \{0,1\}$; clearly $C_w(0) = 1$ for any $w$. there have been some attempts to extend sturmian words to words over alphabets with more than two letters, for instance [8] or [9], but none of these constructions shows such nice properties as sturmian words. however, the approach of arnoux and rauzy presented in [8] resulted in another interesting family of words, called arnoux-rauzy sequences.

another combinatorial definition of sturmian words is based on the distribution of letters in the word. let $w$ be an infinite word, let $u, v \in L(w)$ be two factors of the same length $|u| = |v|$, and let the function $\delta$ be defined by $\delta(u,v) = \big| |u|_0 - |v|_0 \big|$, where $|u|_0$ denotes the number of occurrences of 0 in $u$. we say that $w$ is balanced if and only if $\delta(u,v) \le 1$ for all $u, v \in L(w)$ with $|u| = |v|$. note that the structure of a balanced word over the alphabet $A = \{0,1\}$ is formed either by blocks of 0's between consecutive 1's, or by blocks of 1's between consecutive 0's. it is easy to see that in a balanced word the lengths of the blocks of 0's between two consecutive 1's (resp. of the blocks of 1's between two consecutive 0's) differ by at most 1. as we will see, balanced words are equivalent to sturmian words.

in the literature one can find several constructions of binary words. we start with the construction presented by hedlund and morse in [4]. let $\alpha, \beta$ be real numbers, where $\alpha \in (0,1)$ and $\beta \in [0,1)$. then $\underline{s}_{\alpha,\beta} = (\underline{s}_{\alpha,\beta}(n))$ and $\overline{s}_{\alpha,\beta} = (\overline{s}_{\alpha,\beta}(n))$ are the sequences defined by

$\underline{s}_{\alpha,\beta}(n) := \lfloor (n+1)\alpha + \beta \rfloor - \lfloor n\alpha + \beta \rfloor$, for $n \ge 0$,

$\overline{s}_{\alpha,\beta}(n) := \lceil (n+1)\alpha + \beta \rceil - \lceil n\alpha + \beta \rceil$, for $n \ge 0$.

these words are usually referred to as mechanical words; $\alpha$ is called the slope and $\beta$ the intercept. the mechanical words $\underline{s}_{\alpha,\beta}$, resp. $\overline{s}_{\alpha,\beta}$, have a nice geometrical interpretation. consider the integer lattice $\mathbb{Z}^2$, the straight line $y = \alpha x + \beta$, $x \ge 0$, and two sequences of integer points $x_n = (n, \lfloor n\alpha + \beta \rfloor)$ and $y_n = (n, \lceil n\alpha + \beta \rceil)$.
the elements of $x_n$ cover the integer points of the lattice just below the line $y = \alpha x + \beta$, and the elements of $y_n$ cover the points just above the line. if $\underline{s}_{\alpha,\beta}(n) = 0$ then the points $x_n$ and $x_{n+1}$ lie on a horizontal line, and if $\underline{s}_{\alpha,\beta}(n) = 1$ then they lie on a diagonal line; the same holds for $\overline{s}_{\alpha,\beta}(n)$ and $y_n$. in fact, the sequences $\underline{s}_{\alpha,\beta}$, $\overline{s}_{\alpha,\beta}$ are codings of a discretization of a straight line. note that if $\alpha$ is irrational then $\underline{s}_{\alpha,\beta}$ and $\overline{s}_{\alpha,\beta}$ differ for at most two values of $n$; clearly, this can happen only if $n\alpha + \beta$ is an integer for some $n$. if we consider $\beta = 0$ and $n = 0$, then $\underline{s}_{\alpha,0}(0) = 0$ and $\overline{s}_{\alpha,0}(0) = 1$, and we obtain the important special case of mechanical words $\underline{s}_{\alpha,0} = 0\, c_\alpha$, $\overline{s}_{\alpha,0} = 1\, c_\alpha$. the infinite word $c_\alpha$ is called the characteristic sequence of $\alpha$.

crisp et al. study, in their work [10], another way of constructing characteristic sequences. consider again the integer lattice $\mathbb{Z}^2$ and the straight line $y = \theta x$, where $\theta$ is a positive irrational and $x \ge 0$. we label by 0 the intersections of the line $y = \theta x$ with the verticals of the grid, and we label by 1 its intersections with the horizontals. the sequence of labels forms the so-called cutting sequence, denoted by $s_\theta$. it can be shown that $c_\alpha = s_\theta$ if and only if $\alpha = \theta/(1+\theta)$ (see e.g. [10]).

one often encounters the term r-interval exchange map in connection with infinite words. properties of words generated by the r-interval exchange map (called the coding of the r-interval exchange map) are studied from different aspects in [11], [12] or [13]. let us look more closely at the 2-interval exchange map and its relation to sturmian words. let $\alpha \in (0,1)$ be an irrational number and $x \in [0,1)$. let $I_1 = [0, 1-\alpha)$, $I_2 = [1-\alpha, 1)$ be a decomposition of the interval $[0,1)$. the map $T_\alpha$, given by the rule

$T_\alpha(x) = x + \alpha$ for $x \in [0, 1-\alpha)$, and $T_\alpha(x) = x + \alpha - 1$ for $x \in [1-\alpha, 1)$,

is called the 2-interval exchange map. it can be written more compactly as $T_\alpha(x) = x + \alpha - \lfloor x + \alpha \rfloor = \{x + \alpha\}$, for $x \in [0,1)$. for the n-th iteration of the 2-interval exchange map $T_\alpha$ we obtain $T_\alpha^{(n)}(x) = \{x + n\alpha\}$. since $\alpha$ is irrational, $T_\alpha$ does not have any fixed point. it is not difficult to check that

$\lfloor (n+1)\alpha + x \rfloor - \lfloor n\alpha + x \rfloor = 1$ if and only if $T_\alpha^{(n)}(x) \in [1-\alpha, 1)$,

which implies that

$\underline{s}_{\alpha,x}(n) = 0$ if $T_\alpha^{(n)}(x) \in [0, 1-\alpha)$, and $\underline{s}_{\alpha,x}(n) = 1$ if $T_\alpha^{(n)}(x) \in [1-\alpha, 1)$.

let us show the most famous example of sturmian words, the fibonacci word.

example 2.1: let $\varphi$ be the map given by $\varphi: 0 \mapsto 01$, $1 \mapsto 0$, with the property $\varphi(uv) = \varphi(u)\varphi(v)$ for any finite words $u, v$ over the alphabet $\{0,1\}$. define the n-th finite fibonacci word $f_n$ in the following way: $f_0 = 0$ and, for all $n \ge 0$, $f_{n+1} = \varphi(f_n)$. since $f_0 = 0$ and $\varphi(0)$ starts with 0, $f_n$ is a prefix of $f_{n+1}$ for each $n \ge 0$. the fibonacci word is defined as the limit of the sequence of words $f_n$, thus

$\lim_{n \to \infty} f_n = 010010100100101001010010010\ldots$

note that further applications of the map $\varphi$ do not change the fibonacci word. observe that the length $|f_n|$ is the n-th element of the fibonacci sequence $F_n$ ($F_n = F_{n-1} + F_{n-2}$; $F_0 = 1$, $F_1 = 2$), for all $n \ge 0$. it can be shown that the complexity of the fibonacci word equals $n+1$; thus the word is sturmian. the fibonacci word also coincides with the characteristic sequence $c_\alpha$ with slope $\alpha = 1/\tau^2$ and zero intercept, where $\tau = \frac{1}{2}(1+\sqrt{5})$ is the golden mean.

we finalize this part with a theorem by hedlund and morse [4], which states that sturmian, balanced and mechanical words are indeed equivalent.

theorem 2.1 (hedlund, morse): let $w$ be an infinite word over the alphabet $A = \{0,1\}$. the following conditions are equivalent:
1. $w$ is sturmian;
2. $w$ is balanced and aperiodic;
3. there exist an irrational $\alpha \in (0,1)$ and a real $\beta \in [0,1)$ such that $w = \underline{s}_{\alpha,\beta}$ or $w = \overline{s}_{\alpha,\beta}$.

there exist several proofs of the theorem. the original proof [4] is of a combinatorial nature, while that of lunnon and pleasants [14] is based on geometrical considerations.
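the equivalences of theorem 2.1 are easy to experiment with numerically. the following self-contained sketch (our own illustration, not part of the paper) generates a prefix of the fibonacci word with the 2-interval exchange map and checks that the number of factors of length $n$ is $n+1$:

#include <cmath>
#include <cstdio>
#include <set>
#include <string>

int main()
{
    // slope 1/tau^2: the fibonacci word (see example 2.1).
    const double tau = 0.5 * (1.0 + std::sqrt(5.0));
    const double alpha = 1.0 / (tau * tau);

    // generate a prefix by iterating the 2-interval exchange map starting at
    // x = alpha, which yields the characteristic sequence c_alpha:
    // w_n = 0 if T^n(x) in [0, 1-alpha), w_n = 1 otherwise.
    std::string w;
    double x = alpha;
    for (int n = 0; n < 2000; ++n) {
        w += (x < 1.0 - alpha) ? '0' : '1';
        x += alpha;
        if (x >= 1.0) x -= 1.0;   // T_alpha(x) = {x + alpha}
    }

    // count distinct factors of each length; a sturmian word has n + 1 of them.
    for (int n = 1; n <= 10; ++n) {
        std::set<std::string> factors;
        for (std::size_t i = 0; i + n <= w.size(); ++i)
            factors.insert(w.substr(i, n));
        std::printf("C(%d) = %zu (expected %d)\n", n, factors.size(), n + 1);
    }
    return 0;
}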
theorem 2.1: (hedlund, morse). let w be an infinite word over the alphabet a � { , }0 1 . the following conditions are equivalent: 1. w is sturmian; 2. w is balanced and aperiodic; 3. there exist an irrational �, � �( , )0 1 and a real � �[ , )0 1 such that w s� � �, or w s� � �, , for all n � 0. there exist several proofs of the theorem. the original proof [4] is of combinatorial nature, while the other by lunnon and pleasants [14] is based on geometrical considerations. 2.1 other characteristics of sturmian words we have mentioned several equivalent definitions of sturmian words as those with minimal complexity, balanced aperiodic sequences and mechanical words. in the past few years, there have been successful attempts to find a new characterization of sturmian words. the first one, which we will describe uses return words, while the second uses palindromes. let w be a one-sided infinite word and u a factor of w. we say that a finite word v is the return word over u if vu is a factor of w, u is a prefix of vu and there are exactly 2 occurrences of u 20 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 45 no. 5/2005 czech technical university in prague in vu. in other words, the return word v over u starts with the occurrence of u and ends just before the next occurrence of u. example 2.2: let 0100101001001010010100100101… be the fibonacci word. the set of return words over 0101 contains words 01010010 and 01010. for clarity, here is the fibonacci word with indicated return words over 0101 : 0100101001001010010100100101… vuillon [15] observed that the number of return words indicates whether a word is sturmian or not. he showed the following theorem. theorem 2.2: a binary infinite word w is sturmian if and only if the set of return words over u has exactly two elements for every non empty word u. note that the proof of the necessary condition includes a nice application of rauzy graphs, which are often used for investigation of growth of the complexity in infinite words [8]. let us focus on palindromes. a palindrome is a finite word that reads the same backwards as forwards. for instance, these are the first palindromes of the fibonacci word: �, 0, 1, 00, 010, 101, 1001, 00100, 01010,… in [16], droubay and pirillo showed the characterization of sturmian words by observing the number of palindromes of even and odd length. theorem 2.3: an infinite word is sturmian if and only if, for each nonnegative integer n, there is exactly one palindrome of length n, if n is even, and there are exactly two palindromes of length n, if n is odd. the mapping that assigns to an integer n the number of palindromes of length n in a word is called the palindromic complexity. in the context of palindromes and sturmian words, let us draw attention to paper [17]. the authors proved that the number of palindromes in a word n of length n is less or is equal to n � 1. note that this holds for any kind of words (not necessary sturmian) over arbitrary alphabets. however in the case of sturmian words it was shown in [17] that the number of all palindromes in a factor of length n is equal to n � 1. the reader may like to try finding words of length n over a 2-letter alphabet with the number of palindromes less than n � 1; this is not as trivial as it may appear to be. 2.2 bidirectional sturmian words in the previous sections we have outlined several characteristics of sturmian words. 
2.2 bidirectional sturmian words

in the previous sections we have outlined several characteristics of sturmian words. however, we have limited ourselves, from the very first definitions, to one-sided infinite words. this restriction is typical for a large number of papers, in spite of the fact that notions like balanced words, mechanical words etc. can be extended, very naturally, to both sides. the question is whether the above listed theorems still hold for bidirectional infinite words. let us show a way of generating bidirectional infinite words from which it will be clear that such a generalization is possible. consider $\varepsilon, \eta$ fixed positive irrational numbers, $\varepsilon < 1 < \eta$, $\beta \in [0,1)$, and the set

$\Sigma_{\varepsilon,\eta}(\beta, \beta+1] = \{ m + n\eta \mid m, n \in \mathbb{Z},\ m + n\varepsilon \in (\beta, \beta+1] \}$.

it follows that $m$ has to satisfy $\beta - n\varepsilon < m \le \beta + 1 - n\varepsilon$, hence $m = \lceil \beta - n\varepsilon \rceil$, i.e. any element of $\Sigma_{\varepsilon,\eta}(\beta,\beta+1]$ has the form $\lceil \beta - n\varepsilon \rceil + n\eta$, for some $n \in \mathbb{Z}$. using this fact we can write

$\Sigma_{\varepsilon,\eta}(\beta,\beta+1] = \{ x_n = \lceil \beta - n\varepsilon \rceil + n\eta : n \in \mathbb{Z} \}$,

thus the elements of $\Sigma_{\varepsilon,\eta}(\beta,\beta+1]$ form an increasing sequence $(x_n)_{n\in\mathbb{Z}}$. we can compute the distances between two consecutive points of $(x_n)_{n\in\mathbb{Z}}$ as

$x_{n+1} - x_n = \eta - \big( \lceil \beta - n\varepsilon \rceil - \lceil \beta - (n+1)\varepsilon \rceil \big)$.

the term $\lceil \beta - n\varepsilon \rceil - \lceil \beta - (n+1)\varepsilon \rceil$ takes only the two values 0 and 1 for each $n \in \mathbb{Z}$; thus there are only two distances between neighbours in $\Sigma_{\varepsilon,\eta}(\beta,\beta+1]$, namely $\eta - 1$ and $\eta$. let us define a sequence $(w_n)_{n\in\mathbb{Z}}$ of 0's and 1's by

$w_n = 0$ if $x_{n+1} - x_n = \eta$, and $w_n = 1$ if $x_{n+1} - x_n = \eta - 1$.

the sequence $(w_n)_{n\in\mathbb{Z}}$ is a pointed bidirectional infinite word $(w_n)_{n\in\mathbb{Z}} = \ldots w_{-2} w_{-1} | w_0 w_1 w_2 \ldots$, where | denotes a delimiter. since the distances between the consecutive points in $\Sigma_{\varepsilon,\eta}(\beta,\beta+1]$ are ordered as in a mechanical word with slope $\varepsilon$, the word $w_0 w_1 w_2 \ldots$ (i.e. the infinite word to the right of the delimiter) is a sturmian word. from the construction it is clear that the language of $w_0 w_1 w_2 \ldots$ is the same as the language of $\ldots w_{-2} w_{-1}$. the complexity function $C_w$ has been defined for right-sided infinite words; however, the definition can be naturally extended to a left-sided infinite word, say $\overleftarrow{w} = \ldots w_{-2} w_{-1}$, as $C_{\overleftarrow{w}}(n) = \# L_n(\overleftarrow{w})$. since $C_{\overleftarrow{w}}(n) = \# L_n(\overleftarrow{w}) = \# L_n(w) = C_w(n)$, the word $\ldots w_{-2} w_{-1}$ is also sturmian. since an arbitrary shift of the whole interval $(\beta, \beta+1]$ (i.e. keeping the unit length of the interval) does not change the structure of the set $\Sigma_{\varepsilon,\eta}(\beta,\beta+1]$, but just shifts the position of the delimiter, we can conclude that the word $(w_n)_{n\in\mathbb{Z}}$ is a pointed bidirectional infinite sturmian word. from now on we will consider only bidirectional infinite words. note that the set $\Sigma_{\varepsilon,\eta}(\beta,\beta+1]$ is called a cut-and-project set, and the interval $(\beta,\beta+1]$ an acceptance window. there are a couple of papers dealing with cut-and-project sets. in [18] it is shown that a cut-and-project set has either two or three distances between adjacent points; two distances correspond to the case of an acceptance window of unit length, and the distances then form a sturmian word. on the other hand, the words corresponding to cut-and-project sets with three distances are exactly those which arise from the coding of the 3-interval exchange map, and vice versa. in [19], [20] the authors study substitution properties and the substitutivity of cut-and-project sets.

3 sturmian words and substitutions

let us take a look at sturmian words from a different point of view. one can see in example 2.1 that there exist sturmian words which are generated by certain maps (here we call them substitutions); in fact, the fibonacci word is a fixed point of the map $\varphi$. the question is whether this is a general property of all sturmian words, or whether there exists a class of sturmian words invariant under substitutions. let us now state the basic definitions and then give an overview of the most interesting results. a morphism is a map $\varphi$ of $A^*$ into itself satisfying $\varphi(uv) = \varphi(u)\varphi(v)$ for each $u, v \in A^*$. the morphism is called non-erasing if $\varphi(a_i)$ is not the empty word for any $a_i \in A$. a non-erasing morphism is called a substitution.
the question is whether this is a general property of all sturmian words, or whether there exists a class of sturmian words invariant under substitutions. let us now state basic definitions and then we give an overview of the most interesting results. a morphism is a map of a* into itself satisfying ( ) ( ) ( )uv u v� for each u v a, *� . the morphism is called non-erasing if ( )ai is not an empty word for any a ai � . a non-erasing morphism is called a substitution. © czech technical university publishing house http://ctn.cvut.cz/ap/ 21 czech technical university in prague acta polytechnica vol. 45 no. 5/2005 the action of can be extended to bidirectional infinite words ( ) |w w w w w wn n z� � ��� �2 1 0 1 2 by � � ( ) ( )| ( ) ( ) ( )w w w w w� �2 1 0 1 2 we say that the word ( )wn n z� is invariant under the substitution (or is a fixed point of ) if � � � � ( ) ( )| ( ) ( ) ( ) |w w w w w w w w w w� � � ��2 1 0 1 2 2 1 0 1 2 suppose that we have a substitution and there exist letters a a ai j, � , such that ( )a uai i� and ( )a a vj j� for some non-empty words u v a, *� . then by repeated application of on the pair a ai j| of letters separated by the delimiter | we obtain words ( ) ( )( )| ( )n i n ja a , n n� of increasing length. clearly ( ) ( ) ( ) ( )( )| ( ) ( )| ( )n i n j n n i n j na a u a a v� � �1 1 , for certain words u v an n, *� . the limit of ( ) ( )( )| ( )n i n ja a for n � � is an infinite bidirectional word ( )wn n z� and we say that generates the word ( )wn n z� . let us define a weaker notion of a substitutive word. we say that ( ) |w w w w w wn n z� � ��� �2 1 0 1 2 is a substitutive word, if there exists a substitution on an alphabet b with a fixed point ( ) |v v v v v vn n z� � ��� �2 1 0 1 2 and a map � : b a� such that w vn n� �( ), for each n z� . note that all fixed points of a substitution are substitutive. the opposite is not true. let a a ak� { , , }1 � be an alphabet. to a substitution one may assign a substitution matrix a � �n k k in the following way: a ij j ia a: ( )� number of letters in the word . the problem of invariance under a substitution (or the weaker notion of substitutivity) has motivated many papers. there are some partial results, where authors consider only one sided sturmian words or characterize substitution invariant bidirectional sturmian words depending on the slope � if the intercept � � 0. crisp et al. [10] carried on the work of brown [21] and studied substitution invariant cutting sequences. they proved that the cutting sequence s� (resp. the mechanical sequence c� is substitution invariant if and only if the continuous fraction expansion of � (resp. �) has a certain form. the author in [22] used some of their results and simplified their condition on the invariance of the characteristic sequence c� under a substitution. he showed that c�, � �( , )0 1 , is invariant under a substitution if and only if � is a quadratic irrational with conjugate � � ( , )0 1 . such � is called a sturm number. this result was shown independently by allauzen, [23]. parvaix [24] proved that bidirectional non-pointed sturmian words with � � 0 are invariant under � substitution if and only if � is a sturm number and the intercept � belongs to the quadratic field q a a b a,b q( ) { | }� � �� . a complete characterization of infinite one-sided substitution invariant sturmian words was done by yasutomi [25]. berthé et al. 
berthé et al. [26] studied infinite words which arise from the coding of the 2-interval exchange map and gave an alternative proof of yasutomi's result using rauzy fractals associated with invertible primitive substitutions. the authors also defined, for every fixed sturm number $\alpha$, a matrix $M_\alpha \in \mathbb{N}^{2\times 2}$ that is called the generating matrix of $\alpha$ and is closely related to the smallest solution of a pell equation. they showed that a sturmian sequence $\underline{s}_{\alpha,\beta}$ ($\alpha$ a sturm number, $\beta \in [0,1)$) is a fixed point of a substitution with substitution matrix $A_\varphi$ if and only if $A_\varphi$ has the form $M_\alpha^l$, for some $l \ge 1$.

we complete this overview by giving a few notes on paper [27]. the authors have completely solved the question of the substitution invariance of pointed bidirectional sturmian words. the main theorem shown in the paper is the following.

theorem 3.1: let $\alpha$ be an irrational number, $\alpha \in (0,1)$, $\beta \in [0,1)$. the pointed bidirectional sturmian word with slope $\alpha$ and intercept $\beta$ is invariant under a non-trivial substitution if and only if the following three conditions are satisfied:
1. $\alpha$ is a sturm number,
2. $\beta \in \mathbb{Q}(\alpha)$,
3. $\alpha' \le \beta' \le 1 - \alpha'$ or $1 - \alpha' \le \beta' \le \alpha'$, where $\beta'$ is the image of $\beta$ under the galois automorphism of the quadratic field $\mathbb{Q}(\alpha)$.

note that this result is analogous to those derived in [25] and [26] for the one-sided case. the proof presented in [27] is constructive, based on the cut-and-project scheme that was sketched in the paragraph devoted to bidirectional words. this approach had not been used before in the study of substitution invariant sturmian words, and it turns out to be a good choice because the resulting proof is simple. one of the advantages of the cut-and-project scheme is that the more difficult parts of the proof can be illustrated on vivid examples, which makes the whole paper more comprehensible. we believe that methods similar to those in [27] can solve the question of the substitution invariance of words over a 3-letter alphabet which arise from the coding of the 3-interval exchange map.

4 open problems

we have summarized several interesting properties of sturmian words, and here we would like to highlight how the results about sturmian words can help in further advances in the field of aperiodic words. as sturmian words are the simplest aperiodic structures, the question of generalizing the results obtained for sturmian words is particularly interesting. the direct generalization of sturmian words are the infinite words which arise from the coding of a 3-interval (resp. r-interval) exchange map. we believe that results and techniques used in the study of sturmian words can be applied with success to open problems connected with words coding a 3-interval exchange map. below is a list of the most interesting issues.

1. a necessary and sufficient condition for the substitution invariance of the words coding a 3-interval exchange map is still not known. the problem of finding a condition similar to that of theorem 3.1 is open, but the techniques used in [27] may bring some answers.

2. giving a description of the substitution matrices of substitutions which generate the words coding a 3-interval exchange map is a problem closely related to the previous one. the question of finding the generating matrices of such substitutions seems to be very challenging. this issue was solved in [26] for sturmian words.
4 open problems

we have summarized several interesting properties of sturmian words and here we would like to highlight how the results about sturmian words can help in further advances in the field of aperiodic words. as sturmian words are the simplest aperiodic structures, the question of generalisation of results obtained for sturmian words is particularly interesting. the direct generalization of sturmian words are the infinite words which arise from the coding of a 3-interval (resp. r-interval) exchange map. we believe that results and techniques used in the study of sturmian words can be applied with success to open problems connected with words coding a 3-interval exchange map. below you may find a list of the most interesting issues.
1. a necessary and sufficient condition on the substitution invariance of the words coding a 3-interval exchange map is still not known. the problem of finding a condition similar to that of theorem 3.1 is still open, but the techniques used in [27] may bring some answers.
2. giving a description of the substitution matrices of substitutions which generate the words coding a 3-interval exchange map is a problem closely related to the previous one. the question of finding generating matrices of such substitutions seems to be very challenging. this issue was solved in [26] for sturmian words.
3. although a description of palindromes in the words coding a 3-interval exchange map is already known [17], we believe that the use of the cut-and-project scheme can give an alternative proof of this result. nevertheless, the palindromic complexity is described only for a 2-interval exchange map (see theorem 2.3) and a 3-interval exchange map [28]; we believe that a general description can be obtained by considerations based on the cut-and-project scheme.

5 acknowledgment

the author acknowledges partial funding from the czech science foundation ga čr 201/05/0169. the author is also very obliged to edita pelantová and zuzana masáková for their help and discussions on the paper.

references

[1] shechtman, d. et al.: "metallic phase with long-range orientational order and no translational symmetry." phys. rev. lett., vol. 53 (1984), p. 1951–1953.
[2] bombieri, e., taylor, j. e.: "which distributions of matter diffract? an initial investigation." j. phys. colloque c3, vol. 14 (1986), p. 19–28.
[3] guimond, l. s., patera, j.: "proving the deterministic period breaking of linear congruential generators using two tile quasicrystals." math. of comput., vol. 71 (2002), p. 319–332.
[4] hedlund, g., morse, m.: "symbolic dynamics ii: sturmian sequences." amer. j. math., vol. 61 (1940), p. 1–42.
[5] venkov, b. a.: elementary number theory. groningen: wolters-noordhoff, 1970.
[6] bernoulli, j.: "sur une nouvelle espece de calcul." receuil pour les astronomes, vols. 1, 2, berlin, 1772.
[7] markoff, a. a.: "sur une question de jean bernoulli." math. ann., vol. 19 (1882), p. 27–36.
[8] arnoux, p., rauzy, g.: "représentation géométrique de suites de complexité 2n+1." bull. soc. math. france, vol. 119 (1991), p. 199–215.
[9] coven, e. m.: "sequences with minimal block growth ii." math. sys. theory, vol. 8 (1973), p. 376–382.
[10] crisp, d. et al.: "substitution invariant cutting sequences." j. de th. des nom. de bordeaux, vol. 5 (1993), p. 123–137.
[11] adamczewski, b.: "codages de rotations et phénomènes d'autosimilarité." j. de th. des nom. de bordeaux, vol. 14 (2002), p. 351–386.
[12] alessandri, b., berthé, v.: "three distance theorems and combinatorics on words." l'enseignement math., vol. 44 (1998), p. 103–132.
[13] ferenczi, s. et al.: "structure of three interval exchange transformations i: an arithmetic study." ann. inst. fourier, vol. 51 (2001), p. 861–901.
[14] lunnon, w. f., pleasants, p. a. b.: "characterization of two-distance sequences." j. austral. math. soc., vol. 53 (1992), p. 198–218.
[15] vuillon, l.: "a characterization of sturmian words by return words." european j. combin., vol. 22 (2001), p. 263–275.
[16] droubay, x., pirillo, g.: "palindromes and sturmian words." theoret. comput. sci., vol. 223 (1999), p. 73–85.
[17] droubay, x. et al.: "episturmian words and some constructions of de luca and rauzy." theoret. comput. sci., vol. 225 (2001), p. 539–553.
[18] guimond, l. s. et al.: "combinatorial properties of infinite words associated with cut-and-project sequences." j. de th. des nom. de bordeaux, vol. 15 (2003), p. 697–725.
[19] baláži, p.: "infinite words coding 3-interval exchange." diploma work, 2003.
[20] masáková, z. et al.: "substitution rules for aperiodic sequences of cut and project type." j. of phys. a: math. gen., vol. 33 (2000), p. 8867–8886.
[21] brown, t. c.: "a characterization of the quadratic irrationals." canad. math. bull., vol. 34 (1991), p. 36–41.
[22] baláži, p.: "jednoduchá charakteristika sturmových čísiel." proc. of 9th math. conf. of tech. uni., 2001, p. 5–15.
[23] allauzen, c.: "une caractérisation simple des nombres de sturm." j. de th. des nom. de bordeaux, vol. 10 (1998), p. 237–241.
[24] parvaix, b.: "substitution invariant sturmian bisequences." j. de th. des nom. de bordeaux, vol. 11 (1999), p. 201–210.
[25] yasutomi, s.: "on sturmian sequences which are invariant under some substitutions." number theory and its applications, vol. 2 (1999), p. 347–373.
[26] berthé, v. et al.: invertible substitutions and sturmian words: an application of rauzy fractals. preprint, 2003.
[27] baláži, p. et al.: "complete characterization of substitution invariant sturmian sequences." to appear in integers, 2005.
[28] damanik, d., zamboni, l. q.: "arnoux-rauzy subshifts: linear recurrences, powers, and palindromes." arxiv: math.co/0208137v1.
[29] berstel, j.: "recent results on extensions of sturmian words." internat. j. algebra comput., vol. 12 (2002), p. 371–385.

ing. peter baláži
e-mail: balazi@km1.fjfi.cvut.cz
department of mathematics
czech technical university in prague
faculty of nuclear sciences and physical engineering
trojanova 13
120 00 prague 2, czech republic

solid modeling and finite element analysis of an overhead crane bridge

c. alkin, c. e. imrak, h. kocabas

the design of an overhead crane bridge with a double box girder has been investigated and a case study of a crane with 35 ton capacity and 13 m span length has been conducted. in the initial phase of the case study, conventional design calculations proposed by f. e. m. rules and din standards were performed to verify the stress and deflection levels. the crane design was modeled using both solids and surfaces. finite element meshes with 4-node tetrahedral and 4-node quadrilateral shell elements were generated from the solid and shell models, respectively. after a comparison of the finite element analyses, the conventional calculations and the performance of the existing crane, the analysis with quadratic shell elements was found to give the most realistic results. as a result of this study, a design optimization method for an overhead crane is proposed.

keywords: overhead crane, finite element method, solid modeling, box girder.

notation
b: distance between two side plates
bk: width of lower plate
faa: static load due to the trolley
fy: load due to the working load
h0: height of the girder end
h2: height of the side plates
la: distance between trolley wheels
lk: span of crane girder
lp: distance between two adjacent supports
q: weight of one meter platform
qk: weight of one meter maintenance platform
qp: uniformly distributed mass units of bridge
t1: thickness of the upper and lower plates
t2: thickness of the side plates
x2: distance between center of gravity and the midpoint of the left side plate
x4: distance between center of gravity and the midpoint of the rail
y1: distance between neutral axis and the midpoint of the rail
y3: distance between center of gravity and the midpoint of the top plate
y5: distance between neutral axis and the midpoint of the top plate
wx1: moment of resistance about the x-axis
wy1: moment of resistance about the y-axis
γc: amplifying coefficient
ψ: dynamic coefficient

1 introduction

cranes are the best way of providing a heavy lifting facility covering virtually the whole area of a building. an overhead crane is the most important materials handling system for heavy goods. the primary task of the overhead crane is to handle and transfer heavy payloads from one position to another, so they are used in areas such as automobile plants and shipyards [1, 2]. their design features vary widely according to their major operational specifications, such as: type of motion of the crane structure, weight and type of the load, location of the crane, geometric features and environmental conditions. since the crane design procedures are highly standardized with these components, most effort and time are spent on interpreting and implementing the available design standards [3]. there are many published studies on structural and component stresses, safety under static loading and dynamic behaviour of cranes [5–16]. solid modeling of bridge structures and finite element analysis to find the displacements and stress values has been investigated by demirsoy [17]. solid modeling techniques applied to road bridge structures, and an analysis of these structures using the finite element method, are provided in [18]. in this study, stresses and displacements were found using fem90 software.
solid modeling of a crane bridge, the loading at different points on the bridge and the subsequent application of the finite element method have been studied by celiktas [19]. she presented the results of finite element methods for an overhead crane. din-taschenbuch and f. e. m. (fédération européenne de la manutention) rules offer design methods and empirical approaches and equations that are based on previous design experience and widely accepted design procedures. din-taschenbuch 44 and 185 are a collection of standards related to crane design. din norms generally state standard values of design parameters. f. e. m. rules are mainly an accepted collection of rules to guide crane designers; they include criteria for deciding on the external loads to select crane components [3, 20]. in this study, the calculations apply the f. e. m. rules and din standards, which are used for box girder crane bridges. the calculation of the box girder uses the cesan inc. standard bridge tables. then a solid model of the crane bridge is generated with the same dimensions as in the calculation results. then static analysis is performed, using the finite element method. before starting the solution, the boundary conditions are applied as in practice.

2 overhead cranes with a double box girder

overhead travelling cranes with a double box girder not only hoist loads but also carry them horizontally. a double beam overhead crane is built of a trolley travelling on bridges, and bridges travelling on rails. the trolley hoists or lowers the loads and carries them on the bridge structure. the bridges carry the loads on a rail. as a result, three perpendicular movements are performed. the system is depicted in fig. 1, where the payload of the mass is attached to the bridge with wire ropes [21, 22].

fig. 1: overall view of an overhead crane (bridge girder, trolley with hoisting mechanism, wheels, bridge rail)

the double box girders are subjected to vertical and horizontal loads by the weight of the crane, the working (hook) load and the dynamic loads. with a double box girder construction, the trolley runs above or between the girders. the acceptable construction requirements and values for a box girder bridge structure are shown in fig. 2.

fig. 2: construction requirements for a box girder bridge

3 application of fem to an overhead crane

among numerical techniques, the finite element method is widely used due to the availability of many user-friendly commercial software packages.
the finite element method can analyse any geometry, and solves for both stresses and displacements [23]. fem approximates the solution of the entire domain under study as an assemblage of discrete finite elements interconnected at nodal points on the element boundaries. the approximate solution is formulated over each element matrix and thereafter assembled to obtain the stiffness matrix, and the displacement and force vectors, of the entire domain. in this study, finite element modeling is carried out by means of the cosmosworks and msc.patran commercial packages. 4-node tetrahedral elements and 4-node quadrilateral shell elements have been used for modeling the overhead crane bridge.

the four-node tetrahedral element is the simplest three-dimensional element used in the analysis of solid mechanics problems such as bracket stress analysis. this element has four nodes, with each node having three translational and three rotational degrees of freedom on the x, y, and z-axes. a shell element may be defined on a plane or curved surface; it possesses both length and width, and may only be used in 3-d simulations. the four-node shell element is obtained by assembling the bending element to the appropriate degrees of freedom. this is sufficient as long as the shell element deflection is within the predefined ratio of the shell thickness; otherwise the system must be treated as a large-deflection problem. a typical four-node tetrahedral element and four-node quadratic shell element, and their coordinate systems, are illustrated in fig. 3 [24]. the four-node tetrahedral element chosen has six degrees of freedom at each node: translation in the nodal x, y, and z directions and rotations about the nodal x, y, and z directions. for the four-node quadratic shell element used to model the overhead crane girder, r and s denote the natural coordinates and t is the thickness of the element.

fig. 3: elements used to model an overhead crane girder (4-node tetrahedral element and 4-node quadratic shell element)

this system does not have any horizontal force. the axial displacements and rotations of the first and last faces are equal to zero. in addition, the transverse displacement is zero at the first and last face nodes. the external forces acting on the system are the mass of the main girder of the crane (distributed load) and the forces acting on the wheels of the trolley along the crane (active load). the forces acting on the trolley wheels are caused by the mass of the trolley and the lifting load which will be moved on the crane.

4 solid and finite element modelling of an overhead crane bridge

the finite element method is a numerical procedure that can be applied to obtain solutions to a variety of problems in engineering. steady, transient, linear or nonlinear problems in stress analysis, heat transfer, fluid flow and electromechanism problems may be analysed with finite element methods. the basic steps in the finite element method are defined as follows: preprocessing phase, solution phase, and postprocessing phase. real crane data was gathered from cesan inc., a turkish company involved in mass production of overhead cranes. first, the crane bridge is modeled as a surface; the bridge geometry is suitable for this, and long and thin parts should also be modeled as a surface.
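the assembly step described above can be illustrated on a drastically simplified model. the sketch below (python with numpy; my own one-dimensional illustration, not the three-dimensional cosmosworks/msc.patran models of the paper, and with arbitrary demonstration values for e and i) assembles euler-bernoulli beam elements into a global stiffness matrix for a simply supported span with a central point load and recovers the textbook deflection p lk³/(48 e i).

# minimal 1-d fem assembly: euler-bernoulli beam elements on a simply
# supported span with a central point load. e, i are arbitrary demo values.
import numpy as np

E, I = 2.1e11, 2.0e-4        # young's modulus [pa], area moment [m^4]
L, P = 13.0, 93195.0         # span [m] and point load [n] from the example
ne = 10                      # even element count: the load lands on a node
le = L / ne

def k_beam(e, i, l):
    # 4x4 element stiffness; dofs per element: w1, theta1, w2, theta2
    c = e * i / l**3
    return c * np.array([[ 12.0,  6*l, -12.0,  6*l],
                         [ 6*l, 4*l*l, -6*l, 2*l*l],
                         [-12.0, -6*l,  12.0, -6*l],
                         [ 6*l, 2*l*l, -6*l, 4*l*l]])

ndof = 2 * (ne + 1)
K = np.zeros((ndof, ndof))
for e_id in range(ne):       # assemble each element matrix into the domain
    dofs = [2*e_id, 2*e_id + 1, 2*e_id + 2, 2*e_id + 3]
    K[np.ix_(dofs, dofs)] += k_beam(E, I, le)

F = np.zeros(ndof)
F[2 * (ne // 2)] = -P        # downward load at the midspan node

free = [d for d in range(ndof) if d not in (0, ndof - 2)]  # w = 0 at ends
u = np.zeros(ndof)
u[free] = np.linalg.solve(K[np.ix_(free, free)], F[free])

print(abs(u[2 * (ne // 2)]))    # fem midspan deflection
print(P * L**3 / (48 * E * I))  # analytic check: p*l^3/(48*e*i)

the two printed values agree, because cubic beam elements reproduce point-load solutions exactly at the nodes; the shell and tetrahedral models of the paper follow the same assemble-and-solve pattern in three dimensions.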
later, a mesh is created. in this study, a quadratic element type is used. solid modeling is generated for the calculated crane bridge and the solid model is shown in fig. 4 [20].

fig. 4: models of an overhead crane bridge (solid model and wireframe view)

5 numerical example of an overhead crane

a 35-ton-capacity overhead crane of overall length 13 m and total weight 22.5 tons was selected as a study object. the configuration of the overhead crane is shown in fig. 1. the overhead crane consists of two girders, two saddles to connect them, a trolley moving in the longitudinal direction of the overhead crane, and wheels. the driving unit is installed in one of the two girders. the overhead crane is supported by two rails and the runway girders installed in the building. in order to calculate the stress in the structure, the rules of f. e. m. 1.001 are applied. the design values used in the bridge analysis from the f. e. m. and din standards are given in table 1.

table 1: bridge property values
handling capacity: gy = 35 ton
trolley weight: fa = 3 ton
bridge length: lk = 13 m
distance between wheels of trolley: la = 2 m
trolley velocity: va = 20 m/min
crane velocity: vf = 15 m/min
hoisting velocity: vh = 2.7 m/min
total duration of use: u4
load spectrum class: q3
appliance group: a5
loading type: h (main load)
dynamic coefficient: ψ = 1.15
amplifying coefficient: γc = 1.11

first the maximum and minimum stresses and then the shear stress are calculated using the f. e. m. rules. using the finite element method for the considered girder, we obtain the stress values. we consider the static loads due to the dead weight, the loads due to the working load multiplied by the dynamic coefficient, and the two most unfavourable horizontal effects, excluding the buffer forces. the maximum stress consists of the stress from the bridge dead weight, the stress from the trolley dead weight, the stress from the hoisting load, the stress from the inertia forces and the stress of the trolley contraction. the minimum stress includes the stress from the bridge dead weight and the stress from the trolley dead weight. the maximum and minimum stresses for the given values according to the f. e. m. rules [20] are written in standard form as equations (1) and (2): the maximum stress (1) sums the dead-weight bending term (qk + qp) g lk² / (8 wx1), the trolley wheel-load terms in faa and ψ fy acting on wx1 through the wheel spacing la on the span lk, and the horizontal inertia terms on wy1 with the coefficients 0.075 and 0.05, all multiplied by the amplifying coefficient γc; the minimum stress (2) retains only the dead-weight terms of the bridge and the trolley on wx1, again multiplied by γc. the value of the dynamic coefficient ψ is applied to the loading arising from the working load. the value of the amplifying coefficient γc depends on the group classification of the application, and the weight of one meter maintenance platform is zero in this work [25]. it is assumed that the total load (372780 n) acts at the mid point of the rail and each girder shares this total load equally. this load is applied via the contact points of the two trolley wheels in this system. therefore the value of the acting force on each point is 93195 n.
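the quoted load figures follow from simple arithmetic, sketched below (the variable names are mine; g = 9.81 m/s² is assumed): 35 t of handling capacity plus 3 t of trolley weight give the 372780 n total, shared by the two girders and by the two wheel contact points per girder.

# reproducing the load figures quoted in the text (assumes g = 9.81 m/s^2).
g = 9.81
hook_t, trolley_t = 35.0, 3.0             # handling capacity and trolley [t]

total = (hook_t + trolley_t) * 1000 * g   # 372780.0 n, as quoted
per_girder = total / 2                    # each girder carries half
per_wheel = per_girder / 2                # two wheel contact points

print(total, per_wheel)                   # 372780.0 93195.0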
applying the total load in the system, the value of the maximum stress according to eq. (1) is 143.90 n/mm² to two decimal places, and the value of the minimum stress according to eq. (2) is 47.33 n/mm² to two decimal places. according to fig. 5, the permissible stress in shear consists of the shear stresses of the wheel forces; equation (3) of [20] expresses it in terms of the wheel loads faa and fy, the coefficients ψ and γc, the plate thickness t2, the side-plate height h2, and the cross-section coordinates x2, x4, y1, y3, y5 defined in fig. 5. the value of the maximum shear stress from eq. (3) is 24.82 n/mm² to two decimal places. substituting eqs. (1)–(3), the equivalent stress σv = √(σmax² + 3τ²) is obtained; its value is 150.18 n/mm² to two decimal places.

fig. 5: inertia and moment of resistance in a box girder

fig. 6: stress values of an overhead crane girder with a four-node tetrahedral element

6 results from a girder model with a four-node tetrahedral element

to model the overhead crane girder with a four-node tetrahedral element, cosmosworks software was used for the finite element analysis, using the girder solid model generated by means of solidworks 2003. young's modulus (e) is 2.1×10⁵ n/mm² and the poisson ratio (νst) is 0.3 for the finite element analysis. the value of the maximum stress of the side plate is 12.07 n/mm² to two decimal places and the value of the maximum stress of the bottom plate is 15.08 n/mm² to two decimal places, from fig. 6 [20]. the displacement of the modelled overhead crane girder was obtained from cosmosworks, and is illustrated in fig. 7. the value of the maximum displacement of the girder is about 2.2 mm.

7 results from a girder model with a four-node quadratic shell element

to model the overhead crane girder with a four-node quadratic shell element, msc.patran software was used for the finite element analysis. young's modulus (e) is 2.1×10⁵ n/mm² and the poisson ratio (νst) is 0.3 for the finite element analysis. the value of the maximum stress of the side plate is 35.40 n/mm² to two decimal places, and the value of the maximum stress of the bottom plate is 49.30 n/mm² to two decimal places, from fig. 8 [20]. the displacement of the modelled overhead crane girder was obtained from msc.patran, and is illustrated in fig. 9. the value of the maximum displacement of the girder is about 3.89 mm.

the value of the maximum stress according to eq. (1) is calculated as 143.90 n/mm² to two decimal places. the safety factor should be considered between 2 and 3 for overhead crane girder design. the maximum stress value of the side plate is between 24.14 and 36.21 n/mm² to two decimal places, and the maximum stress value of the bottom plate is between 30.16 and 45.24 n/mm² to two decimal places for a four-node tetrahedral element, taking into account the safety factor. the maximum stress value of the side plate is between 70.8 and 106.2 n/mm² to two decimal places and the maximum stress value of the bottom plate is between 98.6 and 147.9 n/mm² to two decimal places for a four-node quadratic shell element, taking into account the safety factor. the permissible displacement of the girder is 13 mm according to the f. e. m. rules. the maximum displacement obtained from the finite element model with a four-node tetrahedral element is between 4.40 and 6.60 mm, taking into account the safety factor. the maximum displacement obtained from the finite element model with a four-node quadratic shell element is between 7.78 and 11.67 mm, taking into account the safety factor.
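the equivalent stress and the safety-factor ranges quoted above can be reproduced directly (a sketch assuming, as the quoted value 150.18 n/mm² suggests, the equivalent-stress form √(σ² + 3τ²), and safety factors 2 and 3 as plain multipliers on the fea results):

# checking the quoted equivalent stress and the safety-factor ranges.
from math import sqrt

sigma_max, tau = 143.90, 24.82                    # [n/mm^2], eqs. (1) and (3)
print(round(sqrt(sigma_max**2 + 3*tau**2), 2))    # 150.18, as quoted

# fea stresses [n/mm^2] and displacements [mm] scaled by factors 2 and 3
for v in (12.07, 15.08, 35.40, 49.30, 2.2, 3.89):
    print(v, "->", round(2*v, 2), "to", round(3*v, 2))
# e.g. 12.07 -> 24.14 to 36.21 and 3.89 -> 7.78 to 11.67, matching the text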
fig. 7: displacements of an overhead crane girder with a four-node tetrahedral element

fig. 8: stress values of an overhead crane girder with a quadratic shell element

fig. 9: displacements of an overhead crane girder with a four-node quadratic shell element

8 conclusion

in this study, unlike the other studies carried out previously, shell elements in the finite element modeling of an overhead box girder have been examined. in order to show the use of shell elements, one illustrative overhead crane bridge example is given. the maximum stress value is 143.90 n/mm² from the calculation according to the f. e. m. rules, while the finite element analyses give, with the safety factor taken into account, up to 45.24 n/mm² for the four-node tetrahedral element and up to 147.9 n/mm² for the four-node quadratic shell element. the value of the equivalent stress is 150.18 n/mm² to two decimal places. taking into account the safety factor, the stress value obtained from msc.patran varies between 97 and 145.5 n/mm². the ratio of length to thickness of the element used in modelling the overhead crane box girder is higher than 20. therefore, in order to show the accuracy of the analysis of overhead crane bridges, a four-node quadratic shell element is used instead of the four-node tetrahedral element for the finite element analysis.

acknowledgment

it is a pleasure to acknowledge much stimulating correspondence with dr. haydar livatyali and to gratefully acknowledge the support of cesan inc., which provided the design data.

references

[1] oguamanam, d. c. d., hansen, j. s., heppler, g. r.: "dynamic response of an overhead crane system." journal of sound and vibration, vol. 213 (1998), no. 5, p. 889–906.
[2] otani, a., nagashima, k., suzuki, j.: "vertical seismic response of overhead crane." nuclear eng. and design, vol. 212 (1996), p. 211–220.
[3] erden, a.: "computer automated access to the 'f.e.m. rules' for crane design." anadolu university journal of science and technology, vol. 3 (2002), no. 1, p. 115–130.
[4] anon, a.: "new thinking in mobile crane design." cargo systems, vol. 5 (1998), no. 6, p. 81.
[5] baker, j.: "cranes in need of change." engineering, vol. 211 (1971), no. 3, p. 298.
[6] buffington, k. e.: "application and maintenance of radio controlled overhead travelling cranes." iron and steel engineer, vol. 62 (1985), no. 12, p. 36.
[7] demokritov, v. n.: "selection of optimal system criteria for crane girders." russian engineering journal, vol. 54 (1974), no. 4, p. 7.
[8] erofeev, m. j.: "expert systems applied to mechanical engineering design experience with bearing selection and application program." computer aided design, vol. 55 (1987), no. 6, p. 31.
[9] lemeur, m., ritcher, c., hesser, l.: "newest methods applied to crane wheel calculations in europe." iron and steel engineer, vol. 51 (1977), no. 9, p. 66.
[10] mccaffery, f. p.: "designing overhead cranes for nonflat runways." iron and steel engineer, vol. 62 (1985), no. 12, p. 32.
[11] reemsyder, h. s., demo, d. a.: "fatigue cracking in welded crane runway girders, causes and repair procedures." iron and steel engineer, vol. 55 (1978), no. 4, p. 52.
[12] rowswell, j. c., packer, j. a.: "crane girder tie-back connections." iron and steel engineer, vol. 66 (1989), no. 1, p. 58.
[13] moustafa, k. a., abou-el-yazid, t. g.: "load sway control of overhead cranes with load hoisting via stability analysis." jsme int. journal, series c, vol. 39 (1996), no. 1, p. 34–40.
[14] oguamanam, d. c. d., hansen, j. s., heppler, g. r.: "dynamics of a three-dimensional overhead crane system." journal of sound and vibration, vol. 242 (2001), no. 3, p. 411–426.
[15] auernig, j. w., troger, h.: "time optimal control of overhead cranes with hoisting of the load." automatica, vol. 23 (1987), no. 4, p. 437–447.
[16] huilgol, r. r., christie, j. r., panizza, m. p.: "the motion of a mass hanging from an overhead crane." chaos, solitons & fractals, vol. 5 (1995), no. 9, p. 1619–1631.
[17] demirsoy, m.: "examination of the motion resistance of bridge cranes." phd. thesis, dokuz eylul university, izmir, turkey, 1994.
[18] ketill, p., willberg, n. e.: "application of 3d solid modeling and simulation programs to a bridge structure." phd. thesis, chalmers university of technology, sweden.
[19] celiktas, m.: "calculation of rotation angles at the wheels produced by deflection using finite element method and the determination of motion resistance in bridge cranes." j. of mechanical design, vol. 120 (1998).
[20] alkin, c.: "solid modeling of overhead crane bridges and analysis with finite element method." m.sc. thesis, istanbul technical university, istanbul, turkey, 2004. (in turkish)
[21] scheffer, m., feyrer, k., matthias, k.: fördermaschinen: hebezüge, aufzüge, flurförderzüge. wiesbaden: viehweg & sohn, 1998.
[22] kogan, j.: crane design: theory and calculations of reliability. new york: john wiley & sons, 1976.
[23] errichello, r.: "gear bending stress analysis." asme journal of mechanical design, vol. 105 (1983), p. 283–284.
[24] moaveni, s.: finite element analysis: theory and application with ansys. new jersey: prentice-hall, 1999.
[25] verschoof, j.: cranes: design, practice and maintenance. london: professional engineering pub., 2000.

coskun alkin
dr. c. erdem imrak, e-mail: imrak@itu.edu.tr
dr. hikmet kocabas
istanbul technical university
faculty of mechanical engineering
34439 istanbul, turkey

zno-zeolite nanocomposite application for photocatalytic degradation of procion red and its adsorption isotherm

tuty emilia agustina (a, *), rianyza gayatri (b), david bahrin (a), rosdiana moeksina (a), gustini (c)

acta polytechnica 62(2):238–247, 2022. https://doi.org/10.14311/ap.2022.62.0238
© 2022 the author(s), licensed under a cc-by 4.0 licence, published by the czech technical university in prague.

(a) universitas sriwijaya, faculty of engineering, chemical engineering department, jl. raya palembang – prabumulih km 32 indralaya 30662, south sumatera, indonesia
(b) universitas sriwijaya, environmental technology, master program of chemical engineering, jl. srijaya negara, bukit besar, palembang 30139, south sumatera, indonesia
(c) universitas sriwijaya, faculty of engineering, mechanical engineering department, jl. raya palembang – prabumulih km 32 indralaya 30662, south sumatera, indonesia
* corresponding author: tuty_agustina@unsri.ac.id

abstract.
in this paper, the photocatalytic degradation of procion red dye, one of the most frequently used dyes in the textile industry, was studied. the objective of the research is to study the zno-zeolite nanocomposite application to degrade procion red dye by using different irradiation sources. the adsorption isotherm was also investigated. the zno-zeolite nanocomposite was prepared by a sol-gel process. photodegradation tests were applied under sunlight irradiation, under an ultraviolet (uv) lamp, and in a darkroom. the dye degradation was also examined with the synthetic zeolite alone and with zno alone for comparison. another objective of this study is to analyse the appropriate adsorption isotherm to describe the degradation process of procion red dye by using the zno-zeolite nanocomposite. the adsorption ability of the nanocomposite was described by the langmuir and freundlich isotherms. the adsorption of the nanocomposite was reported to depend on the degradation time. the highest photodegradation result of 98.24 % was achieved by irradiating 50 mg/l of procion red dye under the sunlight for 120 minutes. the result showed that the langmuir adsorption isotherm was the appropriate adsorption equation for the degradation process of procion red dye by using the zno-zeolite nanocomposite, with an r² value of 0.995.

keywords: nanocomposite, zno, zeolite, adsorption isotherm, photocatalytic degradation, procion red.

1. introduction

the growing textile industry, in addition to producing commercial products, also produces byproducts in the form of dye wastewater. procion red is one of the most often used synthetic dyes; it is hard to decompose and is present in the wastewater. one of the most important aspects of water treatment technology is the process of removing the organic compounds in the wastewater [1]. based on previous research, there are several conventional methods used to degrade textile dye wastewater; however, the photodegradation process with the use of a photocatalyst is the best and most suitable method for wastewater treatment. the photocatalytic oxidation process combines uv irradiation with a catalyst such as tio2, cds, or zno [2]. the photocatalytic reaction produces hydroxyl radicals as an oxidator to break down the pollutant gradually in a stepwise process [3]. zno is the most suitable and most often used photocatalyst [4]. in photocatalytic applications, semiconductor zno is cheaper than the other nanosized metal oxides, and zno is also better due to its environmental stability. the adsorption capacity of a photocatalyst alone is still weak, and the combination of a photocatalyst with adsorbents will solve this problem [5]. this combination is carried out in order to maximize the dye pollutants' contact with the photocatalyst [6]. the adsorbent used also does not need to be regenerated, because the photocatalyst will immediately degrade the pollutants that have been adsorbed on the adsorbent in situ, so that the adsorbent is not easily saturated. zeolite has good adsorption properties and a large surface, so it is used as an adsorbent and helps the adsorption-catalytic process [7]. the activation of zeolite aims to increase its purity [8]. synthetic zeolite, however, has much better physical properties than natural zeolite: the pore size of the synthetic zeolite is uniform, which maximizes the adsorption results. this is the basis for selecting the synthetic zeolite adsorbent to be combined with the zno photocatalyst. the zeolite adsorbent also has a good adsorption ability.
the adsorption capacity can be determined by the adsorption isotherm equation, generally by the freundlich or langmuir equation [9]. among the several adsorption isotherms, the equations proposed by freundlich and langmuir are most frequently used and are very useful for a mathematical description of adsorption from aqueous solutions [10]. therefore, this study uses the photodegradation process as a method of removal of the procion red textile dye, using composites consisting of the zno photocatalyst and the synthetic zeolite adsorbent. in this study, the sol-gel process is used for the formation of the zno-zeolite nanocomposites; the process produces metal oxides using alcohol or water [11]. based on several researches that have been carried out, the zno-zeolite nanocomposite was applied for the photodegradation process of procion red dye, and the effect of the irradiation source in the photodegradation process should also be seen. therefore, in this study, the determination of the adsorption isotherm that is suitable for the degradation process of the procion red dye by using zno-zeolite nanocomposites was also considered, to study the adsorption mechanism and the interaction between the adsorbent and the absorbed substance, and to determine the maximum adsorption capacity [12].

2. materials and methods

zinc acetate, zinc oxide, ethanol, naoh, and hcl were obtained from sigma aldrich, and synthetic procion red powder dye was obtained from a dyestuff store (fajar setia, jakarta). commercial synthetic zeolite of a-type (na2o·al2o3·2sio2·4.5h2o) was purchased from pt. phy edumedia, east java. synthetic zeolites with a 400 mesh size were activated by heating in an oven at 110 °c for 2 hours, then washed with 0.4 m hcl for 60 minutes, and finally washed with distilled water. the synthetic zeolite was dried in an oven at 110 °c for 2 hours [13].

2.1. synthesis and characterisation of zno-zeolite nanocomposite

the zno-zeolite nanocomposite is produced by the sol-gel method, as referred to in [13]. the precursor weight ratio of zinc acetate to zeolite of 2:1 is used because this ratio produces the highest degradation of dyes, as was discovered in previous research [14], which compared the influence of the zno and zeolite photocatalyst ratios and found the 2:1 weight ratio to be the best. zinc acetate as the precursor is mixed with active synthetic zeolite in the precursor-to-synthetic-zeolite ratio of 2:1, then dissolved in 80 ml of 99 % ethanol. the precursor and zeolite mixture was heated in a reflux flask to 76 °c for 2 hours. then, 225 ml of 2 m naoh was added to the solution and stirred for 60 minutes. the mixture was left to stand for 12 hours and then filtered. next, the precipitate obtained was heated at 60 °c for 24 hours and then stored in a desiccator to keep it dry [13]. the zno-zeolite nanocomposites were analysed by sem-edx, bet, and xrd.

2.2. zno-zeolite nanocomposite application for photocatalytic degradation

the zno-zeolite nanocomposite was tested for the degradation of 50 mg/l of procion red dye. dye photodegradation was applied in three ways: by using sunlight, an ultraviolet (uv) lamp, and dark conditions. the use of zeolite only and zno only was also tested for the photodegradation process.
before the degradation process, the maximum wavelength of the procion red solution was measured. a 100 mg portion of zno-zeolite nanocomposite, mixed with 25 ml of 50 mg/l procion red, was stirred and placed directly under the sun. experiments using sunlight were carried out at noon, between 11.00 am and 1.00 pm. the light intensity at that time was measured by a luxmeter. samples were tested at several time points from 5 to 120 minutes; first, they were filtered, and then the colour degradation was analysed. the experiments were also carried out with ultraviolet light irradiation [14], by using a uv lamp (evaco 254 nm), and in dark conditions. the experiments in dark conditions were carried out without turning on the uv lamp and by closing the reactor using a black plastic coated box. the final dye absorbance and dye concentration were measured by using a uv-vis spectrophotometer. the maximum wavelength of the procion red solution is 470 nm. the degradation percentage of procion dye and the adsorption percentage formulas are shown in equations (1) and (2).

degradation percentage = (a0 − a1) / a0 × 100 % (1)

equation (1) explains how to find the dye degradation percentage; a0 is the value of the initial absorbance of the dye and a1 is the value of the final absorbance.

adsorption percentage = (c0 − c1) / c0 × 100 % (2)

equation (2) shows the adsorption percentage calculation; c0 is the value of the initial concentration of the dye and c1 is the value of the final concentration.

2.3. determination of zno-zeolite nanocomposite adsorption isotherm type

the adsorption process in dark conditions with the zno-zeolite nanocomposite provides data on the photodegradation results of procion red dye in the form of adsorption values and the final concentration of the dye. the data analysis was performed by determining the final content of the procion red dye after the adsorption process. calculations of the langmuir and freundlich equations are carried out to determine the pattern or type of adsorption isotherm suitable for the absorption process of procion red dye by zno-zeolite nanocomposites. the calculation results of each type will be plotted on a graph, and the most suitable zno-zeolite nanocomposite adsorption isotherm will be determined. a good linearization of the line on the graph and a coefficient of determination r² ≥ 0.9 (close to 1) are the parameters for determining the adsorption isotherm equation appropriate to describe the degradation process of procion red dye using the zno-zeolite nanocomposite. the langmuir isotherm model in equation (3) assumes that a single particle will be adsorbed by each site and a monolayer on the surface of the adsorbent will be formed by the adsorbate,

ce/qe = 1/(a·b) + (1/a) ce. (3)

the freundlich isotherm in equation (4) explains the adsorption on heterogeneous surfaces and microporous adsorbents [2]:

ln qe = ln kf + (1/n) ln ce (4)
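equations (1) and (2) are straightforward to apply; a short sketch (python; the absorbance values are hypothetical, chosen only to reproduce the best result reported below):

# degradation and adsorption percentages, eqs. (1) and (2).
def degradation_pct(a0, a1):
    # eq. (1): from the initial and final absorbance
    return (a0 - a1) / a0 * 100.0

def adsorption_pct(c0, c1):
    # eq. (2): from the initial and final concentration
    return (c0 - c1) / c0 * 100.0

# hypothetical absorbances a0 = 0.500 and a1 = 0.0088 after 120 minutes
# (absorbance is proportional to concentration for a dye obeying beer-lambert)
print(round(degradation_pct(0.500, 0.0088), 2))   # 98.24 %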
3. results and discussion

3.1. zno-zeolite nanocomposite characters

figure 1 shows the sem image of zno with a magnification of 15 000. the image shows a uniform distribution of the particle shape. figure 2 presents the sem image of the activated synthetic zeolite with a magnification of 10 000. the active synthetic zeolite has a smoother structure and a more regular shape and size than before the activation process.

figure 1. sem image of zno.

figure 2. sem image of activated synthetic zeolite.

the edx results of the zno-zeolite nanocomposites showed that the zno and zeolite components are present in the zno-zeolite nanocomposite, with a zn content of 56.13 % and an oxygen content of 22.42 % by weight. the other components are a carbon content of 12.61 %, aluminium of 2.75 % and silica of 6.09 % by weight [13]. the silica and aluminium contents indicate the presence of zeolite in the nanocomposites. the sem image of the nanocomposite, with a magnification of 10 000, is shown in figure 3. the nanocomposite particle distribution and particle size look smooth and uniform.

figure 3. sem image of zno-zeolite nanocomposite.

based on a previous study, the surface area of the nanocomposite that consists of zno and synthetic zeolite also has a significant impact on the photodegradation process. the results of the bet characterisation showed that the surface area of the activated synthetic zeolite was 47.192 m²/g, that of zno was 19.192 m²/g, and that of the zno-zeolite nanocomposite was 95.981 m²/g. the zno-zeolite nanocomposite has the highest surface area. these bet results have also been discussed in previous research by [13]. the xrd of the zno-zeolite nanocomposite confirmed that the nanocomposite consists of zinc oxide (zno) and synthetic zeolite components, as shown in figure 4. the xrd pattern of the nanocomposite shows diffraction peaks at 2θ of 31.43°, 34.51°, 36.37°, and 54.99° that are identical to hexagonal zno peaks [15]. the other peaks, found at 2θ = 10.01°, 22.12°, 26.11°, and 29.98°, represent the presence of synthetic zeolite [16], [13].

figure 4. xrd of zno-zeolite nanocomposite.

3.2. the photodegradation results of procion red

overall, the highest dye photodegradation was achieved by using the zno-zeolite nanocomposite for the longest time period of 120 minutes. research by [17] showed similar results, where zno photocatalysts with natural zeolites were the most effective and resulted in the highest degradation of dyes. the percentage rate of dye removal increases with increasing time up to 120 minutes, because the nanocomposite has a high photocatalytic activity and produces more hydroxyl radicals to degrade the dye; moreover, the zno photocatalyst will directly degrade pollutants on the surface of the zeolite adsorbent. the photodegradation process was applied by using three different materials in different irradiation conditions. figures 5–7 below show graphs comparing the decrease in the procion red dye concentration by using the zno-zeolite nanocomposite, zno, and synthetic zeolite under three different exposure conditions: the sunlight, uv light, and dark conditions. the results of the photodegradation using the nanocomposite and zno under the sunlight showed the highest photodegradation percentage of procion red as compared to the photodegradation under the uv light and in dark conditions. as shown in figure 5, the procion red dye showed the highest degradation rate by using the nanocomposites under the sunlight. during 5 minutes of degradation by using the nanocomposites, the concentration of procion red dye under sunlight decreased by 24.5 %, by 18 % under the uv lamp, and in the dark conditions the concentration decreased by 17 %.
during 15, 25 and 30 minutes of degradation, it was seen that the degradation of procion red under the uv light was higher than that under sunlight; this was due to the intensity of sunlight changing over time, so it is possible that the sun's intensity was lower at those moments and the removal of the dye decreased accordingly. after degrading under the sunlight for 30 minutes, the percentage of degradation was 70 %, and it had reached 91 % at 60 minutes of degradation. degradation under the sunlight for 120 minutes resulted in the highest percentage of degradation of procion red, 98.24 %; in the case of the uv lamp, the degradation percentage was 90.42 %. the lowest degradation percentage, of only 28.56 %, was obtained in the dark conditions. as can be seen in figure 5, the degradation of procion red in the dark conditions by using the zno-zeolite nanocomposite showed no significant increase after 60 minutes, indicating the possibility that the zeolite adsorbent could have become saturated over time. degradation by using only zno also showed a similar result.

figure 5. effect of irradiation sources on degradation percentage of 50 mg/l procion red by using zno-zeolite nanocomposite.

figure 6. effect of irradiation sources on degradation percentage of 50 mg/l procion red by using only zno.

based on figure 6, it can be noticed that the highest decrease in procion red concentration was obtained with radiation from the sunlight. however, for 20, 25, and 30 minutes, the uv lamp degradation shows higher results than under the sunlight; this is also due to the intensity of sunlight, which is not constant and changes over time, and at that time the intensity of sunlight was low, thus reducing the degradation process. for 120 minutes, the percent degradation of procion red under the sun reached 97.65 %, for the uv lamp it reached 80.64 %, and in the dark conditions it was still low, only 35.57 %. the zno-zeolite nanocomposite and zno can decompose procion red dye using either a direct exposure to the sunlight or the uv lamp. this is because the zno-zeolite nanocomposite and zno alone work based on the photocatalytic mechanism. these results are also reported from the research conducted by [18], which states that the differences in light sources also have an impact on the photocatalytic process. the photocatalytic process occurs under ultraviolet light (uv lamp) and under the sun as the light source. the sunlight can produce the highest percentage of degradation because the intensity of the light produced is very high; the sun's intensity is much greater than that of the uv rays. energy from the sunlight produces the highest percentage not because it is better than the uv light, but because the energy from the sunlight is much higher than from the uv light installed in the reactor. the wavelength of sunlight is polychromatic: the sunlight has a wide range of wavelengths from ultraviolet light to infrared light, while the uv light is monochromatic, meaning that the range of wavelengths is narrower. the photocatalytic process yields better results using sunlight as a light source as compared to uv lamps. this is due to the fact that the sunlight has more energy than ultraviolet light, meaning a higher electron excitation takes place [18]. the sunlight has a greater light intensity than the uv light. the sunlight also has a wider wavelength range, resulting in the highest reduction in dye concentration [17].
research by [2] also reached similar results, namely that the percentage of dye degradation in the sunlight was higher than for the ultraviolet light or for the conditions without any light. the greater the sunlight intensity, the easier the photocatalytic process runs, so the percentage of dye degradation will be higher as well. the sunlight causes an even greater rate of decomposition of the dye. electrons can be excited to a higher energy level due to the energy emitted by light, both by sunlight and by uv rays [19]. in this study, photodegradation was carried out between 11.00 am and 1.00 pm, during hot conditions, when the light intensity measured by the luxmeter is very high. in that time frame, the highest light intensity measurement results were obtained, around or more than 100 000 lux. the results have shown that the highest average light intensity is at 12.00 am–1.00 pm. the large number of active photocatalysts exposed to visible light enhances the formation of hydroxyl radicals for the photodegradation process [20]. the photodegradation mechanisms under sunlight and under uv rays are different, and the percentage degradation of the sun is higher than that of uv light: the sunlight provides a lot of visible light irradiation and its photodegradation uses both the ultraviolet light mechanism and the visible light mechanism, whereas the uv light only uses the ultraviolet light mechanism [21]. photon energy from the uv light makes the percentage of degradation increase; this is also due to the excitation of electrons and the formation of hydroxyl radicals. the results of degradation in dark conditions are not significant, because in dark conditions there are no photons that can activate the zno or the zno-zeolite nanocomposite, so no hydroxyl radicals, which are strong oxidizers for the photodegradation process of procion red, were formed. when it is dark, the degradation process happens only due to the entrapment process carried out by the zeolite particles. in dark conditions, there is no light that will help the zno photocatalyst to produce hydroxyl radicals that can degrade dyes, so the photocatalytic process does not run optimally; this is the reason why the degradation results of using the nanocomposite and zno in the dark conditions are so unsatisfying. the zno-zeolite nanocomposite can still degrade procion red dye even in dark conditions, due to the dye adsorption process carried out by the synthetic zeolite, but the results are still not optimal because they are not assisted by the zno photocatalytic process. in the zno-zeolite nanocomposite degradation, the zno photocatalyst plays the more important role, because in terms of sustainability the catalyst can be used continuously, and its sustainability is more guaranteed than that of the adsorbent. when using the zeolite adsorbent, after the dye is adsorbed, the adsorbent can become saturated and can no longer serve as an adsorbent. a longer degradation time will result in the zeolite getting saturated, and the semiconductor photocatalyst zno itself then becomes responsible for the degradation of the dye.

figure 7. effect of irradiation on adsorption percentage of 50 mg/l procion red by using synthetic zeolite.

figure 7 shows the effect of irradiation on the adsorption percentage of procion red dye by using only synthetic zeolite. the adsorption of procion red showed that the highest yield was obtained in dark conditions.
meanwhile, the percentage of dye removal by using the uv lamp and under the sunlight was 29.73 % and 23.74 %, respectively. after 120 minutes, the adsorption in dark conditions reached an adsorption percentage of 78 %, against 50.64 % for the uv lamp and 27.91 % for the sun. after 90 minutes, the percent adsorption by the synthetic zeolite did not increase anymore; this was because the adsorption of the procion dye was carried out only by the synthetic zeolite adsorbent, which could have become saturated. the adsorption percentage by using the synthetic zeolite under the sunlight and the uv light is lower than in the dark conditions because the adsorption process of procion red dye, unlike in the case of zno and the nanocomposites, does not work based on the photocatalytic method that requires light. the decrease in dye concentration was due to the adsorption of the dye by the synthetic zeolite. the adsorption percentage by using the synthetic zeolite is quite low because the adsorption is performed only by the zeolite adsorbent, for which the process occurs only at the surface: the larger the surface area of the zeolite is, the more dye is adsorbed [22]. the zno-zeolite nanocomposite gave the highest photodegradation percentage as compared to using zno or synthetic zeolite alone, because it not only uses the dye adsorption process by the synthetic zeolite, but is also aided by the zno semiconductor present in the zno-zeolite nanocomposite. the effect of the zeolite is also increased under the sunlight or uv light.

3.3. types of adsorption isotherm for degradation of zno-zeolite nanocomposite

the adsorption process carried out in dark conditions by using the zno-zeolite nanocomposite provided data on the degradation results of the procion red dye in the form of the adsorption value and the final concentration of the dye. experiments in dark conditions were carried out without turning on the uv lamp and by closing the reactor using a black plastic coated box. the interaction between the adsorbent and the adsorbate is described by several types of adsorption isotherms. the adsorption isotherm will show the maximum capacity of the adsorbent [23]. the results were then analysed using the adsorption isotherm equation. the suitable type of adsorption isotherm for the adsorption process of procion red dye by the zno-zeolite nanocomposite was determined using the langmuir and freundlich equations. these equations are very well known and applicable [23]. in wastewater treatment, the freundlich and langmuir isotherm equations are the most used. the calculation results of each type were plotted on a graph and the most suitable zno-zeolite nanocomposite adsorption isotherm was confirmed. the adsorption process of the adsorbent against the adsorbate and the maximum capacity of the adsorbent can be specified by determining the appropriate type of adsorption [24]. the adsorption process of the zno-zeolite nanocomposite was applied by varying the dye concentration during the degradation time of 60 minutes. the value of the procion red dye concentration was set to 50, 100, 150, 200, and 250 mg/l.

table 1. calculation of the values of qe, ce/qe, ln qe, and ln ce for the freundlich and langmuir isotherms.
c0 [mg/l] | ce [mg/l] | qe [mg/g] | ln qe | ln ce | ce/qe
50 | 33.56 | 4.11 | 1.41 | 3.51 | 8.18
100 | 81.31 | 4.67 | 1.54 | 4.44 | 18.21
150 | 128.12 | 5.47 | 1.69 | 4.91 | 24.77
200 | 176.53 | 5.87 | 1.77 | 5.24 | 32.14
250 | 227.25 | 5.69 | 1.73 | 5.47 | 41.69
the values of qe, ce/qe, ln qe and ln ce were calculated to be included in the langmuir and freundlich isotherm equations and are presented in table 1. the qe value is obtained from the following equation (5):

qe = (c0 − ce) × v / w. (5)

plotting ce/qe against ce will yield the langmuir equation, and plotting ln qe versus ln ce will yield the freundlich equation.

3.4. freundlich isotherm

a good linearization graph and a value of the coefficient of determination r² ≥ 0.9 (close to 1) indicate the corresponding adsorption isotherm equation. figure 8 demonstrates the relationship between ln qe and ln ce in the freundlich isotherm equation. based on the graph, an equation in the form of the linear equation y = 0.088x + 1.369 and the value of r² = 0.849 were obtained. the linear equation of the freundlich isotherm graph fulfils equation (6) and gives a constant kf value of 3.933, a value of 1/n of 0.088, and an n value of 11.403:

ln qe = ln kf + (1/n) ln ce. (6)

figure 8. freundlich isotherm graph.

kf and 1/n are the freundlich constants indicating the rate of adsorption and the heterogeneity factor. freundlich described a heterogeneous adsorption system. the value of kf (mg/g) also indicates the power, or the maximum adsorption capacity, of the material. a value of 1/n > 1 indicates that the saturation of the adsorbent is not achieved; on the contrary, when 1/n < 1, the adsorbent has been saturated by the adsorbate molecules, which happens more often in adsorption systems [24].

3.5. langmuir isotherm

the langmuir isotherm equation is illustrated in figure 9, which shows the ce/qe versus ce relationship curve for the langmuir isotherm. the langmuir graph produces the linear equation y = 8.098x + 0.705 with r² = 0.995:

ce/qe = 1/(a·b) + (1/a) ce. (7)

figure 9. langmuir isotherm graph.

the linear equation of the langmuir isotherm graph follows equation (7) and gives a constant value of b of 11.488 l/mg and a of 0.124 mg/g. the value of b is the langmuir equilibrium constant (l/mg) and the value of a is the maximum concentration value in the solid phase (mg/g), which can also indicate the maximum adsorption capacity of the material. table 2 shows the adsorption isotherm parameters for both types of adsorption isotherms.

table 2. adsorption isotherm parameters.
langmuir isotherm: r² = 0.995, b = 11.488 l/mg, a = 0.124 mg/g
freundlich isotherm: r² = 0.849, n = 11.403, kf = 3.933 mg/g

based on the langmuir and freundlich graphs, the adsorption isotherm equation that is suitable for the process of degradation of the procion red dye with zno-zeolite nanocomposites is the langmuir adsorption equation with an r² value of 0.995, while the freundlich isotherm equation does not meet the requirements, because its coefficient of determination r² is 0.849. this is also similar to the research of [2], where the dye adsorption isotherm pattern of the zno-zeolite composite follows the langmuir isotherm: from the graph, it can be seen that it produces a coefficient of determination r² that is close to 1 (0.94), while the freundlich isotherm does not comply because its value of r² is 0.1. the research of [24] also conducted a similar study and obtained the highest correlation coefficient (r²) value for the langmuir isotherm; this indicates that the adsorption of liquid waste follows the langmuir isotherm approach. this shows that the adsorption of procion red dye by using the zno-zeolite nanocomposite is more appropriately described by, and in accordance with, the langmuir adsorption isotherm type, as evidenced by the value of the correlation coefficient (r²), which is closer to 1 than for the freundlich isotherm type; so it can be assumed that the adsorbed dye molecules are adsorbed in a single layer (monolayer) and the adsorption process of procion red dye by the zno-zeolite nanocomposite is a homogeneous one. the freundlich isotherm type, however, describes a multilayer adsorption process and involves more physical interactions [3].
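both linearizations can be fitted directly from the data of table 1 (a sketch; qe is recomputed from eq. (5) with v = 0.025 l and w = 0.1 g, so the fitted constants and r² values need not match the paper's reported ones exactly, since those were obtained from the rounded tabulated columns):

# linearized langmuir and freundlich fits from the table 1 data.
import numpy as np

c0 = np.array([50.0, 100.0, 150.0, 200.0, 250.0])      # initial [mg/l]
ce = np.array([33.56, 81.31, 128.12, 176.53, 227.25])  # equilibrium [mg/l]
v, w = 0.025, 0.1                                      # sample [l], mass [g]
qe = (c0 - ce) * v / w                                 # eq. (5), [mg/g]

def linfit(x, y):
    # least-squares slope, intercept and r^2 of y = m x + c
    m, c = np.polyfit(x, y, 1)
    r = np.corrcoef(x, y)[0, 1]
    return m, c, r * r

m_l, c_l, r2_l = linfit(ce, ce / qe)             # langmuir, eq. (7)
m_f, c_f, r2_f = linfit(np.log(ce), np.log(qe))  # freundlich, eq. (6)

print("langmuir:   a =", 1/m_l, " b =", m_l/c_l, " r2 =", r2_l)
print("freundlich: kf =", np.exp(c_f), " 1/n =", m_f, " r2 =", r2_f)

on this data the langmuir line fits visibly better (its r² is the higher of the two), consistent with the conclusion below.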
4. conclusions

the photodegradation application by using the zno-zeolite nanocomposite produced a higher decomposition percentage of the procion red dye under the sunlight as compared to uv light and dark conditions. the highest degradation percentage was 98.24 %, by irradiation under the sunlight for 120 minutes. the photodegradation process of procion red dye using the zno-zeolite nanocomposite followed the langmuir
https://doi.org/10.24817/jkk.v36i2.1889. [7] m. o. ruliza, t. e. agustina, r. mohadi. impregnation of activated carbon-tio2 composite and its application in photodegradation of procion red synthetic dye in aqueous medium. iop conference series: earth and environmental science 105:012024, 2018. https://doi.org/10.1088/1755-1315/105/1/012024. [8] d. a. wulandari, n. nasruddin, e. djubaedah. selectivity of water adsorbent characteristic on natural zeolite in cooling application. journal of advanced research in fluid mechanics and thermal sciences 55(1):111–118, 2020. [9] jasmal, sulfikar, ramlawati. kapasitas adsorpsi arang aktif ijuk pohon aren (arenga pinnata) terhadap pb2+. sainsmat : jurnal ilmiah ilmu pengetahuan alam 4(1):57–66, 2015. https://doi.org/10.35580/sainsmat4112842015. [10] a. puszkarewicz, j. kaleta. adsorption of chromium (vi) on raw and modified carpathian diatomite. journal of ecological engineering 20(7):11–17, 2019. https://doi.org/10.12911/22998993/108640. [11] p. laokul, v. amornkitbamrung, s. seraphin, s. maensiri. characterization and magnetic properties of nanocrystalline cufe2o4, nife2o4, znfe2o4 powders prepared by the aloe vera extract solution. current applied physics 11(1):101–108, 2011. https://doi.org/10.1016/j.cap.2010.06.027. [12] m. murtihapsari, b. mangallo, d. d. handyani. model isoterm freundlich dan langmuir oleh adsorben arang aktif bambu andong (g. verticillata (wild) munro) dan bambu ater (g. atter (hassk) kurz ex munro). jurnal sains natural 2(1):17–23, 2012. https://doi.org/10.31938/jsn.v2i1.31. [13] r. gayatri, t. e. agustina, r. moeksin, et al. preparation and characterization of zno-zeolite nanocomposite for photocatalytic degradation by ultraviolet light. journal of ecological engineering 22(2):178–186, 2021. https://doi.org/10.12911/22998993/131031. [14] a. salam, t. e. agustina, r. mohadi. photocatalytic degradation of procion red synthetic dye using zno-zeolite composites. international journal of scientific & technology research 7(8):54–59, 2018. [15] a. c. mohan, b. renjanadevi. preparation of zinc oxide nanoparticles and its characterization using scanning electron microscopy (sem) and x-ray diffraction (xrd). procedia technology 24:761–766, 2016. https://doi.org/10.1016/j.protcy.2016.05.078. [16] e. nyankson, j. k. efavi, a. yaya, et al. synthesis and characterisation of zeolite-a and zn-exchanged zeolite-a based on natural aluminosilicates and their potential applications. cogent engineering 5(1):1440480, 2018. https://doi.org/10.1080/23311916.2018.1440480. [17] t. e. agustina, e. melwita, d. bahrin, et al. synthesis of nano-photocatalyst zno-natural zeolite to degrade procion red. international journal of technology 11(3):472–481, 2020. https://doi.org/10.14716/ijtech.v11i3.3800. [18] a. rahman, m. nurjayadi, r. wartilah, et al. enhanced activity of tio2/natural zeolite composite for degradation of methyl orange under visible light irradiation. international journal of technology 9(6):1159–1167, 2018. https://doi.org/10.14716/ijtech.v9i6.2368. 
[19] r. fraditasari, s. wardhani, m. m. khunur. degradasi methyl orange menggunakan fotokatalis tio2-n: kajian pengaruh sinar dan konsentrasi tio2-n. jurnal ilmu kimia universitas brawijaya 1(1):606–612, 2015.
[20] a. charanpahari, s. s. umare, s. p. gokhale, et al. enhanced photocatalytic activity of multi-doped tio2 for the degradation of methyl orange. applied catalysis a: general 443-444:96–102, 2012. https://doi.org/10.1016/j.apcata.2012.07.032.
[21] d. windy dwiasi, t. setyaningtyas. fotodegradasi zat warna tartrazin limbah cair industri mie menggunakan fotokatalis tio2 sinar matahari. molekul 9(1):56–62, 2014. https://doi.org/10.20884/1.jm.2014.9.1.150.
[22] i. kustiningsih, d. k. sari. uji adsorbsi zeolit alam bayah dan pengaruh sinar ultraviolet terhadap degradasi limbah methylene blue. teknika: jurnal sains dan teknologi 13(1):25–32, 2017.
[23] m. ghosh, k. manoli, x. shen, et al. solar photocatalytic degradation of caffeine with titanium dioxide and zinc oxide nanoparticles. journal of photochemistry and photobiology a: chemistry 377:1–7, 2019. https://doi.org/10.1016/j.jphotochem.2019.03.029.
[24] d. darmansyah, s. ginting, l. ardiana, h. saputra. mesopori mcm-41 sebagai adsorben: kajian kinetika dan isotherm adsorpsi limbah cair tapioka. jurnal rekayasa kimia & lingkungan 11(1):10–16, 2016. https://doi.org/10.23955/rkl.v11i1.4228.

acta polytechnica vol. 48 no. 3/2008
unrra and support for science
j. frydryšková
abstract: the second world war was the most devastating war in history. millions of people died, countless cities were destroyed, and economic recovery was one of the most important problems dealt with in post-war conferences. the us needed an economically stable europe, and for this reason they started preparing a recovery program before the war came to an end. one of the most important programs was unrra. unrra consisted of two parts – relief and rehabilitation. an important part of rehabilitation was the fellowship program, which was the first international scholarship program after the second world war. this program provided scholarships for specialists from every country that received help from unrra. these specialists returned to their own countries after their residency and helped with the recovery. this was the main reason for the fellowship program.
keywords: post-war recovery, unrra, fellowship program.

1 unrra
the abbreviation unrra stands for the united nations relief and rehabilitation administration. this organization was established on the proposal of american president franklin delano roosevelt. on november 9, 1943, deputies from 44 countries, which were fighting against a common enemy – nazi germany and its allies – gathered in the white house in washington. there they signed an agreement to establish unrra (for czechoslovakia the document was signed by jan masaryk, foreign minister in the czechoslovak government-in-exile).
the fundamental idea was to establish mutual economic cooperation in the post-war period, and to assist in rehabilitating and renewing war-devastated countries. participation in reconstruction was not a part of unrra activities, and there was no long-term cooperation in economic programs and agreements. the aim of this cooperation was to use world resources to help the countries immediately affected by the war, countries which did not have their own resources to recover from the consequences of world war ii. in the period of urgent assistance (so-called relief), unrra primarily supplied food, clothing, and medicine. the program also contributed some supplementary health and socio-political services, and cooperated in repatriating refugees dispersed all round the world. in the period of renewal (so-called rehabilitation), unrra supplied the liberated countries with goods needed to rehabilitate their production and transport networks. for example, seeds, fertilizers, raw materials, fishing gear, and industrial machinery and parts were supplied. unrra also contributed to the renewal of important installations such as waterworks, electric power plants and gasworks, to the restoration of the infrastructure, and to the procurement of materials for renewing educational institutions.
fig. 1: woodbridge, g., the history of unrra, i. volume, p. 105
fig. 2: unrra in czechoslovakia, p. 177

2 fellowship program
in addition to these joint activities, unrra worked on developing advanced science and engineering in the war-afflicted countries. the leaders of this organization clearly realized that countries occupied for a long time by the enemy faced a shortage of well-qualified people. originally, unrra sent all sorts of experts to the war-devastated countries. they trained local specialists, and thus accelerated the rehabilitation process in the devastated countries. the experts also brought the local scientific communities into contact with the latest technical, scientific and medical discoveries. in spite of this creditable activity, the measures were considered temporary, because the seven years of imposed separation of scholars and scientific workers from other scientific and specialist communities could not be made good by short-term specialized discussions.
the council of unrra decided to run a scholarship program that would enable engineers, experts, doctors and scientific workers to acquaint themselves with the latest achievements in their spheres. thus, the unrra fellowship program was established. the unrra relief recipient countries were given an opportunity to send specialists to the usa, canada and great britain to learn about the latest advances made by their colleagues and to implement the experience that they acquired in their home countries. the scholarship holders, who had been proposed by their own governments, were appointed as unrra experts when they returned home. a total of 155 six-month and nine-month scholarships were provided. czechoslovakia obtained 21 scholarship places for its candidates. eighteen were sent to the usa and canada, and only three went to france and great britain. each grant holder was given some training and other introductory courses. then he/she was sent to a department, institution or factory where he/she studied and specialized in a specific topic. many of the scholars took specialized courses at a university. all were required to write a concluding report, which had the following structure: 1) program of studies, 2) acquired technical and scientific information, 3) proposals for aid and renewal of his/her field in his/her homeland, 4) list of places he/she studied at, 5) list of special literature and facilities. the unrra council sent this final report to the government of the scholarship holder. the candidates who successfully passed the courses received a letter from unrra (a certificate of merit), signed personally by the director-general. in addition, unrra prepared a plan to improve in-hospital care. in six european countries, including czechoslovakia, about 100 nurses were chosen to go to the united states, where they were trained in the latest methods. all expenses for a three-month stay were covered by unrra.

3 conclusion
it should be pointed out that this was the first large-scale grant program of its kind after world war ii. despite all the difficulties, it met people's expectations. unrra director-general l. w. rooks provided proof of this. on his european tour, he proclaimed that: "the returning grant holders teach, conduct studies, reorganize and are very valued persons in the renewal process of their own countries. in many european countries the recovery often depends on the technical guidance and efficiency of that handful of educated men and women." the significance of these activities was tremendous; the scholarship holders from the recipient countries were able to enter into personal relations, exchange opinions and experience, and also become familiar with technical and scientific progress in countries away from the battlefields. the experience and results of this first scholarship program established a model for international organizations, e.g., unesco, which was developing similar concepts of scholarship programs at that time.

4 acknowledgments
the research described in this paper was supervised by prof. phdr. ivan jakubec, csc., institute for economic and social history, philosophical faculty of charles university in prague.

references
[1] dowd, a. w.: exporting civil society. contributors. in: world, vol. 18 (2003), number 7, p. 255–265.
[2] unrra in czechoslovakia, prague, 1971.
[3] woodbridge, g.: unrra – the history of the united nations relief and rehabilitation administration, i., ii., iii. part, columbia university, 1950.
[4] fund – unrra, national archives, prague.
jana frydryšková
e-mail: j.frydryskova@seznam.cz
the institute for economic and social history
philosophical faculty of charles university
celetna 20
116 42 prague 1, czech republic

field                        total   no. in u.s.a.
health                         25       16
penicillin                      5        5
welfare                        20       18
public services                 6        5
agricultural rehabilitation    32       27
industrial rehabilitation      67       55
total                         155      126
table 1: fields of study of unrra fellowship holders

acta polytechnica vol. 46 no. 5/2006
experience of implementing moisture sorption control in historical archives
p. zítek, t. vyhlídal, j. chyský
abstract: this paper deals with a novel approach to inhibiting the harmful impact of moisture sorption in old art works and historical exhibits preserved in remote historic buildings that are in use as depositories or exhibition rooms for cultural heritage collections. it is a sequel to the previous work presented in [2], where the principle of moisture sorption stabilization was explained. sorption isotherm investigations and emc control implementation in historical buildings not provided with heating are the main concern of this paper. the proposed microclimate adjustment consists in leaving the interior temperature to run almost its spontaneous yearly cycle, while the air humidity is maintained in a specific relationship to the current interior temperature. the interior air humidity is modestly adjusted to protect historical exhibits and art works from harmful variations in the content of absorbed moisture, which would otherwise arise owing to the interior temperature drifts. since direct measurements of moisture content are not feasible, the air humidity is controlled via a model-based principle. two long-term implementations of the proposed microclimate control have already proved that it can permanently maintain a constant moisture content in the preserved exhibits.
keywords: equilibrium moisture sorption, preventive conservation, sorption isotherm, humidity control.

1 introduction
all over europe, the opportunities for preserving most of the cultural heritage are intimately connected with the use of historical buildings – castles, mansions, palaces, monasteries etc. – to deposit historic and artistic collections, and with opening these buildings to public access. the microclimate regulations for this class of preservation are still a subject of controversial discussions [8]. as regards the czech republic, the interiors of more than 250 castles and mansions serve as exhibition rooms for authentic historic and/or artistic collections, as archives, historical libraries, etc. in approximately one third of these buildings, valuable collections of paintings and sculptures are presented, although neither heating nor air-handling devices are in operation in most of these interiors. hence there is almost no way to achieve a stable and non-aggressive internal environment. reusing historic sites to keep historic and artistic collections is common in many countries, and owing to this an invaluable part of europe's cultural heritage is exposed to the more or less damaging impact of an unsuitable microclimate in historical buildings, particularly to the impact of air humidity [1, 2]. internal air humidity is the most significant harmful exposure for exhibits made of porous organic materials, such as wood, paper, canvas, etc. more specifically, variations in air humidity and temperature during the annual cycle cause variations in the moisture content of these mostly organic materials and cause dimensional changes which may lead to dangerous stresses and deformations, and may even open up cracks [7]. in earlier research, the authors developed a microclimate control system aimed at keeping the moisture content in the protected exhibits constant. the idea of moisture content stabilization is based on the sorption isotherm model of a selected material, for which a constant moisture content is achieved by an adequate humidity adjustment compensating for the temperature changes of the natural annual cycle. this paper reports on the implementation and successful trial operation of a mol-26 dehumidifying device, controlled according to the equal-sorption principle, in the interior of the historica collection in the state archives in třeboň castle, czech republic, where the microclimate control has been tested over a period of more than twenty-two months. the paper explains the equal-sorption approach to environment adjustment, including the specific issue of the different sorption characteristics of various materials. the design of the dehumidification device control unit is based on applying the henderson model and implementing a microprocessor. the implementation in an archive is described, and the resulting environment parameters are reported and discussed.
2 equilibrium moisture content model
for each of the considered materials, the equilibrium moisture content (emc) u settles at a specific level appropriate to the relative humidity φ and temperature t of the ambient air. the emc value increases with growing φ and decreases with growing t and, in general, it is much more sensitive to changes in air humidity than to variations in temperature. various models, e.g. those by day and nelson [3], simpson [4], and henderson [5], are used to describe the relationship between the emc u and the pair of air temperature t and relative air humidity φ, u = f(φ, t). unfortunately, these investigations have dealt with the higher temperature ranges of emc relevant to industrial purposes, far from the temperatures typical for preventive conservation. the relationship f(·) is usually displayed in the coordinates φ and u, as so-called sorption isotherms, with temperature considered as a parameter. the logarithmic henderson model was chosen as the most suitable available model, namely its three-parameter version [5]

u = \left[ \frac{-\ln(1-\varphi)}{a\,(t+b)} \right]^{c} , (1)

where φ ∈ (0, 1) is the relative air humidity expressed as a dimensionless ratio, t is the air temperature in °c, and u is the emc, expressed as the ratio of the moisture mass content to the mass of anhydrous material. the parameters of the model are specific for each material: the additive temperature constant b is in °c, c is a positive dimensionless exponent less than one, and the sensitivity coefficient a is in °c⁻¹. the model is clearly not suitable for air humidity approaching the state of saturation, i.e. for φ → 1, since then the logarithm is not defined. however, any state that is approaching saturation brings about condensation, and thus such a state is quite inadmissible for the interiors that we are considering here.
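for readers who want to experiment with the model, a minimal c sketch of equation (1) follows. the parameter values are those fitted for the pwo2 samples in table 2 below; the function name henderson_emc is ours, not part of the authors' software.

#include <stdio.h>
#include <math.h>

/* equilibrium moisture content u(phi, t) after the three-parameter
   henderson model (1); phi is relative humidity in (0,1), t in degrees c */
static double henderson_emc(double phi, double t, double a, double b, double c)
{
    return pow(-log(1.0 - phi) / (a * (t + b)), c);
}

int main(void)
{
    /* parameters for the pwo2 samples (table 2) */
    const double a = 0.431, b = 68.88, c = 0.605;
    printf("u(0.5, 20 c) = %.4f g/g\n", henderson_emc(0.5, 20.0, a, b, c));
    printf("u(0.5,  5 c) = %.4f g/g\n", henderson_emc(0.5, 5.0, a, b, c));
    return 0;
}

the second printed value is higher than the first, illustrating that at constant relative humidity the emc grows when the temperature falls, which is exactly the effect the control described below has to compensate.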
a common feature of most emc models and measurements available from the references is that they cover a higher temperature range than is needed for our work. the temperature range of the microclimate adjustment will be approximately from 5 °c to 25 °c, but emc characteristics are seldom available for these temperatures. we therefore decided to perform our own emc measurements to find the truest possible parameters of the henderson model (1) for the purposes of microclimate control in historical interiors.

3 investigations of moisture sorption in wood and paper samples
in order to assess the parameters of the henderson model, in cooperation with the department of carbohydrate chemistry and technology, institute of chemical technology in prague, we performed several long-term experiments to assess the sorption isotherms for the following samples of materials: a new pine wood sample (pwn), a new oak wood sample (own), pine wood samples around a hundred years old (slightly rotten pwo1, well-preserved pwo2), and a paper sheet sample taken from a fifty-year-old book (pso). the average size of the wood samples was 8×8×40 mm. the paper samples consisted of a bunch of 40 paper sheets, 50×50 mm in size. for more than three weeks, these samples were examined in different combinations of temperature and relative humidity, maintained by temperature control and by humidity adjustment through specific saturated salt solutions of hygroscopic agents placed in four desiccators [6], see table 1.

desiccator   saturated salt solution   rh [%]   t [°c]
1            nacl                      77        6.0
2            nacl                      77       22.3
3            mg(no3)2                  63        7.0
4            mg(no3)2                  56       22.3
table 1: conditions inside the desiccators

the conditions inside the desiccators were monitored by testo 175-h1 dataloggers. the experiment, which consisted in periodic weighing of the samples, lasted 22 days, after which the saturated weights of the samples were determined. then the samples were dried, and the dry residua were weighed again to determine the equilibrium moisture content (emc), see table 2. this table also shows the parameters of the henderson model resulting from the measurements.

             equilibrium moisture content [g/g]                  henderson model parameters
sample   77 %, 6 °c   77 %, 22.3 °c   63 %, 7 °c   56 %, 22.3 °c   a [°c⁻¹]   b [°c]    c
pwn      0.150        0.136           0.115        0.092           0.247      95.83     0.668
own      0.138        0.120           0.102        0.084           0.361      80.02     0.653
pwo1     0.166        0.146           0.130        0.103           0.425      62.81     0.601
pwo2     0.154        0.137           0.121        0.096           0.431      68.88     0.605
pso      0.090        0.083           0.090        0.059           0.447     135.68     0.642
table 2: equilibrium moisture content estimates of samples stored in the desiccators, and parameters of the henderson model for each set of samples

as an example of the sorption dynamics, the time evolution of the moisture content in the pwo2 samples is shown in fig. 1. in fact, the results shown in the figure are step responses. the step changes in the conditions are caused by taking the samples from an environment with average temperature t = 22 °c and relative humidity rh = 40 % (usual interior conditions) and placing them in the desiccators with the microclimates characterized by the values shown in table 1. as can be seen from the responses, the sorption dynamics are relatively slow, even though the samples weigh only about three grams.
fig. 1: the moisture content in the pwo2 samples during the experiment
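the emc evaluation behind table 2 is a one-line computation once the saturated and dry masses are known; a small c helper shows it for a single sample. the masses used here are illustrative stand-ins, since the paper does not report the raw weighings.

#include <stdio.h>

/* emc as the ratio of absorbed moisture mass to the anhydrous mass */
static double emc_from_weights(double m_saturated, double m_dry)
{
    return (m_saturated - m_dry) / m_dry;   /* g of water per g of dry matter */
}

int main(void)
{
    double m_sat = 3.30, m_dry = 2.86;      /* illustrative masses in grams */
    printf("u = %.3f g/g\n", emc_from_weights(m_sat, m_dry));  /* ~0.154 */
    return 0;
}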
it can be concluded from the responses that the emc fluctuations caused by the natural changes in temperature and relative humidity during a daily cycle are quite negligible. however, the longer-term fluctuations in the environment caused by alternations of sunny and rainy periods, especially in the spring and autumn seasons, can cause considerable fluctuations in moisture content, particularly in artifacts with thin layers on the surface. fig. 2 shows the sorption isotherms computed according to the henderson model with the parameters obtained for the pwo2 samples. the experimental results show significant differences in the temperature sensitivity of emc in the lower temperature range. the experiments also show that not only the material of the sample, but also its age, plays a considerable role in the sorption behavior.

4 maintaining constant emc by air humidity adjustment
model (1) was designed to assess the moisture content in particular materials. however, it also contains an intuitive suggestion for forming microclimate conditions that will keep this moisture content constant. it is apparent from (1) that for a temperature change from t1 to t2, a specific humidity change from the initial φ1 to a new φ2 can keep the emc value constant, i.e. u1 = u2. since the change in moisture content is the crucial harmful impact originating from the air humidity, this is the main factor in preventive conservation. using (1), the requirement u1 = u2 leads to the following relationship, dependent only on parameter b (while a and c cancel each other out)

\frac{\ln(1-\varphi_1)}{t_1+b} = \frac{\ln(1-\varphi_2)}{t_2+b} . (2)

our idea for preventing variations in moisture sorption consists in the following. the control stabilizing the moisture content in the preserved exhibits is conceived as a reference-tracking humidity control, where the modestly varying reference value φd (the desired air humidity) is assessed from the temperature and humidity measurements by means of relationship (2), i.e.

\varphi_d = 1 - \exp\left[ \ln(1-\varphi_0)\,\frac{t+b}{t_0+b} \right] , (3)

where t0, φ0 is a selected reference air state satisfying the preventive conservation claims and t is the actual (measured) interior temperature. this desired φd is applied as the reference setting for a humidity control adjusting the actual (measured) interior humidity φ towards φd by means of an air handling device. strictly speaking, as mentioned above, this control protects only a single material – the one corresponding to the parameter b considered in (3) – from variations in the moisture content. however, although various materials differ from each other in their sorption characteristics, it can be seen from (2) that the differences in parameter b values influence the required humidity readjustment only weakly. take, for instance, two materials m and n, having sorption parameters bm and bn respectively, assume an air state change from the initial air state t0, φ0 to a different temperature t, and try to assess the new desired humidity values φm and φn for materials m and n respectively. using equality (2), the following equations are obtained for the two materials

\frac{t+b_m}{t_0+b_m} = \frac{\ln(1-\varphi_m)}{\ln(1-\varphi_0)} , \qquad \frac{t+b_n}{t_0+b_n} = \frac{\ln(1-\varphi_n)}{\ln(1-\varphi_0)} . (4)

the additive temperature constant b takes on relatively high values, say from 60 to 200 °c, for the various materials of the considered protected exhibits, while the temperature range assumed in this study is only from about 5 to 25 °c.
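before examining the influence of b further, note that relation (3) is directly implementable; here is a small c routine computing the humidity setpoint from a measured temperature. it is a sketch under the assumptions stated above: the reference state, the value of b (taken from the pwo2 row of table 2) and the function name are our choices, not the authors' firmware.

#include <stdio.h>
#include <math.h>

/* desired relative humidity after (3): keeps the emc at the value it has
   at the reference state (t0, phi0) for a material with parameter b */
static double equal_sorption_setpoint(double t, double t0, double phi0, double b)
{
    return 1.0 - exp(log(1.0 - phi0) * (t + b) / (t0 + b));
}

int main(void)
{
    const double t0 = 20.0, phi0 = 0.50, b = 68.88;   /* pwo2 value of b */
    for (double t = 5.0; t <= 25.0; t += 5.0)
        printf("t = %4.1f c  ->  phi_d = %.3f\n",
               t, equal_sorption_setpoint(t, t0, phi0, b));
    return 0;
}

for a drop from 20 °c to 15 °c with φ0 = 0.5, the routine lowers the setpoint by roughly two percentage points, in line with the 1.5 to 2.0 % readjustment discussed below.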
apparently, the higher b is, the less sensitive the material is to changes in moisture sorption. it is easy to prove that the temperature ratios in (4) are close to one, even if there are substantial differences in b. consequently, the humidity logarithm ratios are also close to one, and therefore the required humidity readjustments φm − φ0 and φn − φ0 have rather low values and cannot differ from each other substantially. for example, the required humidity readjustment for the pine wood assessed above differs only by about one per cent from the values appropriate for other sorts of wood, e.g. oak, beech etc. a similar conclusion can be arrived at for other moisture-sensitive materials, e.g. various sorts of paper, canvas, parchment etc. with regard to the attainable accuracy level of the humidity measurements, these mutual differences are negligible, and the readjustment of air humidity resulting from (2) can be considered satisfactory not only for wood in general but also for miscellaneous exhibits of paper, canvas, parchment, etc. the only exception to this rule is a painting or an artefact on an interior wall. in this case, the humidity readjustments needed to prevent the surface moisture content from varying are substantially higher than for the other cases mentioned. hence, as a general rule, variations in emc are inhibited by humidity readjustments which are only weakly dependent on differences in the sorption properties of various materials.
fig. 2: experimental fitting of the sorption isotherms by the henderson model for wood samples pwo2 (pine, old); + emc measurement according to table 2

the humidity adjustments resulting from relations (2) or (3) may seem very modest. however, these values cannot be viewed separately, since they are to be compared with the spontaneous changes in relative humidity brought about by temperature variations. using the derivative of the magnus formula [7] for the humidity mixing ratio x, in g/kg,

x = 3.795\,\varphi\,\exp\left( \frac{a\,t\,\ln 10}{b+t} \right) (5)

with the parameters a = 7.5 and b = 237.3 °c, the condition of keeping x constant results in the following approximate relationship for a change from air state t0, φ0 to another air state t, φ

\varphi \doteq \varphi_0 \left[ 1 + \frac{a\,b\,\ln 10\,(t_0-t)}{(b+t_0)^2} \right] . (6)

for example, if φ0 = 0.5 and t0 = 20 °c, a temperature decrease, e.g. to t = 15 °c, in a well insulated room brings about an increase of relative humidity by more than fifteen per cent. by contrast, the derivatives of the henderson model (1)

\frac{\partial u}{\partial t} = -\frac{c\,u}{t+b} < 0 , \qquad \frac{\partial u}{\partial \varphi} = \frac{c\,u}{(1-\varphi)\,[-\ln(1-\varphi)]} > 0 (7)

prove that a humidity adjustment according to (2), providing no change in emc, requires a decrease of φ, in this case by about 1.5 to 2.0 %. hence, instead of the natural relationship, where the relative humidity in a well insulated room increases when the temperature decreases, an inverse proportion between these two variables is to be artificially provided by the equal-sorption control. humidity readjustment (2) thus brings about a more substantial intervention into the air state than at first appears. the desired relationship between t and φ to be artificially provided by the control results from the requirement du = 0.
the following relationship is obtained from the derivatives (7)

\frac{\mathrm{d}\varphi}{\mathrm{d}t} = -\frac{\partial u/\partial t}{\partial u/\partial \varphi} = k_c(\varphi, t) , \qquad k_c(\varphi, t) = \frac{-(1-\varphi)\,\ln(1-\varphi)}{b+t} > 0 , (8)

where the gain parameter kc(φ, t) is a function of both temperature and air humidity. it should be noted that only b of the three henderson model parameters influences this gain parameter. due to the opposite trends of (1−φ) and ln(1−φ), and also due to the relatively high value of b, b ≥ 60 °c, the variability of kc(φ, t) with humidity and temperature is fairly weak. in fact, owing to the limited accuracy of the t and φ measurements, relationship (8) can be treated as linear.

5 performance estimation of the air handling device
in order to perform the above-discussed humidity adjustments, an air handling device was applied in a historic interior. the required performance of this device results from the volume and humidity parameters of the protected room. to achieve the desired relative humidity φd in the room, it is necessary to exhaust, or sometimes to add, an amount of water per hour qm. therefore, φd is to be expressed through the humidity mixing ratios xd and xe appropriate to the indoor and outdoor air states, using the magnus formula (5). considering xw as the humidity mixing ratio on the internal surface of the walls, the total transferred moisture amount qm is given as

q_m = q_l\,(x_e - x_d) + d_w\,(x_w - x_d) , (9)

where ql is the in-leakage of the outdoor air and dw is the effective diffusion coefficient between the entire wall surface and the internal air. if qm > 0, qm is the required performance of the dehumidifier; on the contrary, qm < 0 means a humidifying water input requirement. the values of absolute humidity are computed from the measured temperature and relative humidity values by means of the magnus formula (5). the in-leakage flow of the outdoor air is usually estimated at approximately 25 to 40 % of the room volume per hour [8]. the moisture diffusion from the walls is highly variable. our measurements show that the moisture diffusion from rather wet walls may be as significant a water source for the interior as window leakage. the performance of the dehumidifying device has to be designed to manage even the highest demands of moisture transport throughout the annual cycle of the weather at the site. however, humidification is only exceptionally needed in remote historic buildings, and its provision can often be omitted to reduce costs.

6 implementation in the třeboň archives
the first implementation of the proposed microclimate adjustment, in the chapel of the holy cross at karlštejn castle, proved its ability to provide permanently favourable conditions for keeping the absorbed moisture constant in most of the exhibits deposited in the medieval interior [1], in spite of extreme demands due to large numbers of visitors. another implementation of the proposed humidity adjustment operates in the historica collection in the state archives in třeboň castle, czech republic, using the mol-26 dehumidifying device, produced by pzp komplet, dobruška, czech republic. this apparatus is controlled according to the equal-sorption principle (3), see fig. 3. before the dehumidifying device was installed in the collection room in 2005, its environment had been monitored over a long term. the annual cycle measurements performed in the interior, shown in fig. 4, show, firstly, that the temperature does not fall below zero even in the coldest winter periods.
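as an aside before the second observation from the monitoring campaign: the sizing relation (9), combined with the magnus formula (5), is easy to script in a few lines of c. the magnus parameters are the ones quoted above, while the room volume, leakage rate, wall coefficient and air states are invented placeholders, so the printed figure is purely illustrative.

#include <stdio.h>
#include <math.h>

/* humidity mixing ratio in g/kg after the magnus-type formula (5) */
static double mixing_ratio(double phi, double t)
{
    const double a = 7.5, b = 237.3;         /* magnus parameters */
    return 3.795 * phi * exp(a * t * log(10.0) / (b + t));
}

int main(void)
{
    /* illustrative values only */
    double x_d = mixing_ratio(0.48, 15.0);   /* desired indoor state */
    double x_e = mixing_ratio(0.85, 10.0);   /* outdoor air */
    double x_w = mixing_ratio(0.95, 14.0);   /* air at the damp wall surface */
    double q_l = 0.30 * 500.0 * 1.2;         /* in-leakage: 30 % of a 500 m3 room
                                                per hour times 1.2 kg/m3 = kg air/h */
    double d_w = 40.0;                       /* effective wall diffusion, kg air/h */
    double q_m = q_l * (x_e - x_d) + d_w * (x_w - x_d);   /* equation (9), g/h */
    printf("q_m = %.0f g/h (positive: dehumidifier duty)\n", q_m);
    return 0;
}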
secondly, a comparison of the measured relative humidity with its desired value computed from the henderson model to maintain a constant emc (10 % for wood, parameters for pwo1 in table 2) shows that only dehumidifying is needed. thus, the dehumidifying device is able to provide a favourable microclimate in this interior throughout the whole year. a special control unit was developed to control the mol-26 dehumidifying device. the main parts of the control unit are the atmel at89c2051 eight-bit microcontroller and the sht11 temperature-humidity sensor (produced by sensirion ag), with a capacitive polymer moisture-sensing element for relative humidity (0 to 100 % rh) and a bandgap temperature sensor (−40 °c to 120 °c). the two parts are connected by a two-wire sensirion serial bus. the microcontroller processes the data measured by the temperature/humidity sensor. a data table for predefined types of objects is stored in the memory of the control unit. from this, the program assigns the desired relative humidity value φd to each measured temperature sample t according to (3). consequently, the desired φd is compared with the measured φ, and on the basis of this comparison, the program algorithm decides whether the dehumidifier is to be switched on or off. thus, the control unit accomplishes a relay-based control, where the reference value of the desired relative humidity φd is assigned by (3) on the basis of the measured temperature t. the algorithm operates with a hysteresis of 0.5 % rh, and a time delay of 5 min limits the switching frequency, in accordance with the operating conditions of the mol-26 dehumidifier. the control unit is powered by an external ac adapter (230 v ac / 9 v dc, 100 ma). two examples of records of the controlled environment in the archives are shown in figs. 5 and 6. as can be seen, for both paper and wood materials the emc is almost constant, which is in agreement with the primary aim of the control. comparing the emc characteristics in figs. 5 and 6 with those of the uncontrolled environment shown in fig. 4 in the same time period, we can clearly see the improvement brought about by implementing the control method described in this paper.
fig. 3: interior of the historica collection in třeboň archives with the mol-26 dehumidifying device
fig. 4: records of the uncontrolled environment in the historica collection of třeboň archives and the proposed relative humidity adjustment for the desired emc ud = 10 %

7 conclusions
the proposed equal-sorption humidity control involves adjusting the relative indoor air humidity to a level continuously adapted to the indoor air temperature according to condition (3), where the desired humidity is determined by temperature measurements. as the air handling processes are much faster than the dynamics of the emc changes, correcting the humidity can easily forestall changes in moisture sorption, and in this way the control is
endowed with a predictive character. although the moisture content is viewed as the controlled variable, no acceptable method for measuring emc in real time is available. the model-based control scheme presented here is therefore applied. the dehumidifying devices are designed as local apparatuses, preferably portable. if they are portable, the staff can put them in the most suitable place in the interior. in addition, low-power dehumidifiers can be produced for a reasonable price. the air handling device is controlled to operate intermittently, producing low-amplitude oscillations of humidity around the desired φd. for implementation purposes, the authors' own measurements of sorption isotherms were of primary importance, since relevant data for typical preventive conservation materials were not available. the crucial point for the proposed microclimate adjustment was the finding that, despite differences in the sorption characteristics of various organic materials such as wood, paper, canvas, parchment, etc., the desired humidity corrections can be considered universal for all of them. the main contribution of the paper consists in the implementation results, since the proposed control has shown long-term effectiveness in maintaining the interior air in a state that stabilizes the absorbed moisture in moisture-sensitive materials.
fig. 5: records of the controlled environment in the historica collection of třeboň archives – time period from 17. 7. 2005 to 30. 10. 2005
fig. 6: records of the controlled environment in the historica collection of třeboň archives – time period from 5. 5. 2006 to 1. 8. 2006

8 acknowledgment
the research presented here has been supported by the ministry of education of the czech republic under project no. 1m0567.

references
[1] zítek, p., němeček, m., vyhlídal, t., kopecká, i.: moisture sorption stabilization as a preventive conservation technique. 6th european commission conference on sustaining europe's cultural heritage, london, 2004.
[2] zítek, p., vyhlídal, t.: model-based control of moisture sorption in a historical interior. acta polytechnica, vol. 45 (2005), no. 4, prague: ctu publ. house.
[3] avramidis, s.: evaluation of "three-variable" models for the prediction of equilibrium moisture content in wood. wood science and technology, springer-verlag, 1989.
[4] simpson, t. w.: predicting equilibrium moisture content of wood by mathematical models. wood and fiber, 1973, vol. 5, p. 41–49.
[5] henderson, s. m.: a basic concept of equilibrium moisture. agr. eng., vol. 33, 1952, p. 29–33.
[6] greenspan, l.: humidity fixed points of binary saturated aqueous solutions. journal of research of the national bureau of standards, physics and chemistry, vol. 81 (1977), no. 1, p. 89–96.
[7] camuffo, d.: microclimate for cultural heritage. amsterdam, london: elsevier science ltd., 1998.
[8] kotterer, m.: research report of the project eu 1383 prevent. museum ostdeutsche galerie regensburg, 2002.

prof. ing. pavel zítek, drsc.
phone: +420 224 352 564
e-mail: pavel.zitek@fs.cvut.cz
centre for applied cybernetics
department of instrumentation and control engineering

doc. ing. tomáš vyhlídal, ph.d.
phone: +420 224 352 877
e-mail: tomas.vyhlidal@fs.cvut.cz
centre for applied cybernetics

jan chyský
phone: +420 224 352 469
e-mail: jan.chysky@fs.cvut.cz
department of instrumentation and control engineering

ctu in prague
faculty of mechanical engineering
technická 4
166 07 praha 6, czech republic
acta polytechnica vol. 45 no. 4/2005
composite axial flow propulsor for small aircraft
r. poul, d. hanus
abstract: this work focuses on the design of an axial flow ducted fan driven by a reciprocating engine. the solution minimizes the turbulization of the flow around the aircraft. the fan has a rotor – stator configuration. due to the need for low weight of the fan, a carbon/epoxy composite material was chosen for the blades and the driving shaft. the fan is designed for optimal isentropic efficiency and free vortex flow. a stress analysis of the rotor blade was performed using the finite element method. the skin of the blade is calculated as a laminate and the foam core as a solid. static and dynamic analyses were made. the rtm technology is compared with other technologies and is described in detail.
keywords: axial flow fan design, ducted fan, composite blades.

1 introduction
our work deals with many aspects of the design of an axial-flow, one-stage, ducted fan driven by a yamaha four-stroke, four-cylinder reciprocating engine with a power output of 110 kw. an airplane with this propulsion is expected to have distinctly low drag, because of the location of the fan inside the hull. this will make the airplane aerodynamically smoother than classic airplanes with a tractor propeller. such a solution minimizes the disturbance of the flow around the aircraft, so a significant part of the wetted surface of the airplane is expected to work in the laminar regime. this fan has a rotor – stator configuration. the need for low weight of the fan is the reason for choosing a carbon/epoxy composite material for the blades and for the driving shaft connecting the engine with the fan. this work is divided into the following main parts: aerodynamic design, choice of production technology, evaluation of the quasistatic and dynamic loads of the rotor, calculation of the properties of the composite materials, and rotor fea. because of the great complexity of the work, other aspects, such as a strength and stiffness analysis of the stator vanes and the solution of the cooler, are omitted.

2 aerodynamic design
a method for the aerodynamic design of a fan is described in [1]. the fan is designed for optimal isentropic efficiency in the design regime and on the assumption that the distribution of the tangential component of the velocity corresponds to a free vortex. the rotor blade and stator vane profiles have the naca 65 a 010 profile, parabolically modified in the rear part and wrapped around a circular arc centerline. the design point of the fan is an airplane velocity equal to 0 m/s at an altitude of 0 m of isa. the blades and vanes are divided into 8 sections for geometry calculation purposes. the rotor contains 14 blades and the stator contains 20 vanes. the following method is used for calculating the geometry (see fig. 1). the incoming air angle is calculated as

\beta_1 = \operatorname{arctg}\left( \frac{u}{w_{ax}} \right) ,

where u is the tangential velocity and w_ax is the axial velocity of the air. the angle of the air leaving the rotor blade is

\beta_2 = \operatorname{arctg}\left( \operatorname{tg}\beta_1 - \frac{p}{\dot m\,w_{ax}\,u} \right) ,

where ṁ is the mass flow rate and p is the power input. the air deflection is then

\Delta\beta = \beta_1 - \beta_2 .
� 0 23 1 0 23 s c s c , where s c is relative blade/vane density. the stator calculation is accomplished by analogy. for technological reasons the hub and duct diameters are constant along the fan axis. the hub diameter is 300 mm, and the duct diameter is 580 mm. the next task was to check the 104 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 45 no. 4/2005 czech technical university in prague composite axial flow propulsor for small aircraft r. poul, d. hanus this work focuses on the design of an axial flow ducted fan driven by a reciprocating engine. the solution minimizes the turbulization of the flow around the aircraft. the fan has a rotor stator configuration. due to the need for low weight of the fan, a carbon/epoxy composite material was chosen for the blades and the driving shaft. the fan is designed for optimal isentropic efficiency and free vortex flow. a stress analysis of the rotor blade was performed using the finite element method. the skin of the blade is calculated as a laminate and the foam core as a solid. a static and dynamic analysis were made. the rtm technology is compared with other technologies and is described in detail. keywords: axial flow fan design, ducted fan, composite blades. � 1 � 2 2 �* � � � w w w 1 2 3 � 3 w4 rotor stator fig. 1: symbols used for a description of the rotor blade and stator vane geometry mach number on the blade surface to prevent the occurrence of shock waves. two approximate methods were used for this check. the first was to find the local velocity maximum of the blade compared with the mean velocity between the rotor blades. the second method was based on lift and drag coefficients, and is described in [1]. 3 technology in this section only the rotor blade is studied. the rotor blade will be made of carbon reinforced epoxy resin and a foam core, with the dovetail lock made of al-alloy. the most suitable technology for composite production is chosen in this section. three composite production technologies are compared in this work: 1) wet lay-up is the cheapest from the tooling point of view, but the mechanical properties would vary considerably from piece to piece, 2) prepreg has the most expensive resin curing and storage conditions, but properties such as the fibre volume fraction of the composite will vary only slightly 3) vacuum assisted resin transfer molding (vartm) was chosen as the most favourable method, in terms of both cost and technological considerations. fig. 3 shows a comparison of the production costs. scheme of vacuum assisted resin transfer molding (fig. 4). © czech technical university publishing house http://ctn.cvut.cz/ap/ 105 czech technical university in prague acta polytechnica vol. 45 no. 4/2005 -60° -40° -20° 0° 20° 40° 60° 0.5 1 1.5 2 2.5 3 3.5 � 1= 6 0 ° 4 5 ° 3 0 ° 1 5 ° 0 ° -1 5 ° -3 0 ° -4 5 ° -6 0 ° -7 5 ° 4 r o v n o t l a k � =3 0° 45° 60° 75° 90° 10 5° 12 0° 13 5° 15 0° �� s/c fig. 2: diagram from [1] used for choosing the optimum blade/vane relative density work structural materials auxiliary consumables workplace protection waste disposal vartm prepreg wet lay-up fig. 3: comparison of the production costs for the composites upper part of mould lower part of mould skin material foam core resin intake holes for bolting sealing air suction fig. 
4: scheme of vartm 4 loads applied to rotor blades a rotor blade is loaded by a centrifugal force, resulting in a tensional force and a torsional moment acting in the blade axis direction, and by air pressure, which induces bending moments acting in the fan tangential and axial direction. the torsional moment in the blade axis direction induced by air pressure was neglected in this work, because of the unknown exact pressure distribution on the blade surface. the centrifugal forces and moments are calculated directly by finite element analysis (fea) software. the pressure forces are obtained by the following process and are applied to the fea model. the distributed pressure loads in directions x and y are calculated as: q p p r x c c� �( )1 2 2 14 � q r w w u y ax� � � 2 90 14 2 2� � �( cos( ) ) , where p1c and p2c are the total pressures in front of and behind the blade. for explanation of other symbols, see fig. 5. to obtain the distribution of the bending moment, the following integrals were applied: m q x r xpxr y r r � � ( ) max d , m q x r xpyr x r r � � ( ) max d . the dynamic load is produced by the airflow and by the variations in the angular velocity of a rotor typical for reciprocating engines. in our calculation, three loading sources were taken into account: 1) the wake from the union of the air inlet, which is divided into two parts – 2 pulses per turn, 2) the pulses induced by passing the stator vanes – 20 per turn and 3) the variations in angular velocity, which are 2 per turn for a four-stroke fourcylinder engine. the natural frequencies of the rotor blades must not be the same as the loading frequencies or their multiples in revolutions of working regimes. for real values, we used fea with more angular velocities given to obtain a sufficient amount of data to create campbell’s diagram (see fig.7). explanation: 1v, 2v, 3v, 4v: 1st to 4th blade natural frequency, grey region around 1v to 4v represents +10 % of natural frequencies covering the model inaccuracy 1m: engine excitation frequency – twice per turn 2m, 3m: double and triple 1m 1s: stator vanes excitation frequency twenty per turn 2s: double 1s 106 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 45 no. 4/2005 czech technical university in prague r r r d r x y z q q m a x m in x y � fig. 5: scheme of pressure forces qx r [m]0.16 0.18 0.2 0.22 0.24 0.26 0.28 -900 -800 -700 -600 -500 -400 qy 0.16 0.18 0.2 0.22 0.24 0.26 0.28 2 4 6 3 5 7 1 m m pxr pyr r [m] m [nm] fig. 6: distributed pressure load and bending moment 20 40 60 80 100 100 200 300 400 revolutions [-/s] f [hz] 1v 2v 3v 4v 1m 2m 3m 1s 2s fig. 7: campbell’s diagram of a rotor blade 5 mechanical properties of the composites three types of composite are used for the rotor blade skin: high modulus carbon fabric reinforced epoxy resin, unidirectional high modulus carbon reinforced epoxy resin, and aramide fabric reinforced epoxy resin. the mechanical properties of these materials were calculated for fea. the properties were calculated using the methods described in [6, 7], which allowed us to calculate the mechanical properties of the composite from the known properties of the fibres and of the resin. the distribution of the moduli of elasticity of high modulus carbon fabric reinforced resin is shown in fig. 8. the axes of the polar diagram represent the directions of the warp and weft of the fabric. 6 rotor blade finite element analysis a stress analysis of the rotor blade was performed using the finite element method. 
the calculation and modelling were accomplished using msc patran/nastran. because of the composite skin and foam core structure of the blade the following model was used: the composite skin was modeled using 4 node shell elements connected with a foam core made of wedge type solid elements. the skin of the blade was calculated as laminate. this model was loaded by centrifugal force and “air” pressure, as in the section rotor blade applied loads. from this analysis the values of the tsai-hill failure criterion [6, 7] were obtained. the same model was used for natural frequency analysis. the tsai-hill criterion values are presented in fig. 9, and the stress distribution in fig. 10. © czech technical university publishing house http://ctn.cvut.cz/ap/ 107 czech technical university in prague acta polytechnica vol. 45 no. 4/2005 40 000 -20000 20000 40 000 40 000 20 000 20 000 40 000 -15000 1 0000-5000 5000 10000 15 000 -15000 -10000 -5000 5000 10000 15000 fig. 8: polar diagrams of young’s modulus and the shear modulus as a function of the angle between the evaluated and main direction of the material fig. 9: tsai-hill failure criterion 7 design features and conclusion the very light composite structure of the stator and rotor of the ducted fan allows lighter corresponding supporting structures, which will significantly influence the aircraft weight. in this way the composite propulsor can be useful for very light airplanes. the proposed solution of the rotor blade is shown in fig. 11. references [1] jerie, j.: teorie motorů, čvut, 1981. [2] statečný, j., sedlář, f., doležal, z.: pevnost a životnost leteckých turbínových motorů část 1, čvut, 1995. [3] růžek, j., kmoch, p.: teorie leteckých motorů část 1, va brno, 1979. [4] ušakov, k. a., brusilovskij, i. v., bušel, a. r.: aerodynamika osových ventilátorů a jejich konstrukční prvky. praha: sntl, 1962. [5] hanus, d.: pohon letadel, čvut, 1997. [6] agarwal, b. d., broutman, l. j.: vláknové kompozity. praha: sntl, 1987. [7] gay, d.: matériaux composites. hermes, 1997. [8] hoskins, b. c., baker, a. a.: composite materials for aircraft structures. american institute of aeronautics and astronautics, inc., 1986. [9] potter, k.: resin transfer moulding. chapman & hall, 1997. [10] kolář, v., němec, i., kanický, v.: fem – principy a praxe metody konečných prvků. computer press, 1997. [11] martaus, f.: výroba přesných kompozitních dílů metodou vartm – 1. etapa řešení, vzlú, 2001. [12] uher, o.: mathematical modeling of behavior of filament wound composite structures. čvut, 2002. ing. robin poul e-mail: robin.poul@fs.cvut.cz doc. ing. daniel hanus, csc. e-mail: daniel.hanus@fs.cvut.cz automotive and aerospace engineering department faculty of mechanical engineering czech technical university in prague karlovo náměstí 13 121 35 prague 2, czech republic 108 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 45 no. 4/2005 czech technical university in prague fig. 10: stress distribution in the first layer of the rotor blade: �x, �y, �xy blade glass/epoxy blade bottom al-alloy modified dovetail lock adhesive bonds fig. 11: rotor blade detail ap06_2.vp 1 introduction we have implemented the ce program, which can simulate the behavior of caches inside the smp system a the software basis. the user must have the source code of the program in c language and modify it (explicitly include memory operations for ce purposes). 
the cache model for one cpu considered here corresponds to the structure of l1 – l3 caches on most modern memory architectures [1]. we consider a multilevel set-associative cache. the number of sets is denoted by h. if h is equal to 1 then the cache is called fully associative. one set consists of s independent blocks (called lines in intel terminology). if s is equal to 1 then the cache is called direct mapped. the size of the data part of a cache in bytes is denoted by dcs. the cache block size in bytes is denoted by bs. we assume write-back caches with the lru block replacement strategy. obviously, dc s b hs s� � � . in shared memory systems, each cpu (and each cache) is connected by the shared bus to the shared memory. a coherence protocol is used to maintain memory coherence according to the specified consistency model. 2 general assumptions of the ce we make following assumptions during the implementation of the ce: 1. the ce is designed for smp systems using bus snoopy cache coherence protocols. 2. the whole cache size is used only for data. we omit bus transactions for instruction reading. 3. the operand read does not cross the cache block boundary. 4. we assume only write-back caches with the lru replacement strategy. 5. we assume that all caches in each level are the same; caches have at most 3 levels. 6. we assume that all bus transactions are atomic. 7. we assume random bus arbitration. 3 an example of an application of the ce application of the ce is very easy and straightforward. © czech technical university publishing house http://ctn.cvut.cz/ap/ 47 czech technical university in prague acta polytechnica vol. 46 no. 2/2006 a simple cache emulator for evaluating cache behavior for smp systems i. šimeček every modern cpu uses a complex memory hierarchy, which consists of multiple cache memory levels. it is very difficult to predict the behavior of this hierarchy for a given program (for details see [1, 2]). the situation is even worse for systems with a shared memory. the most important example is the case of smp (symmetric multiprocessing) systems [3]. the importance of these systems is growing due to the multi-core feature of the newest cpus. the cache emulator (ce) can simulate the behavior of caches inside an smp system and compute the number of cache misses during a computation. all measurements are done in the “off-line” mode on a single cpu. the ce uses its own emulated cache memory for an exact simulation. this means that no other cpu activity influences the behavior of the ce. this work extends the cache analyzer introduced in [4]. keywords: cache hierarchy, cache emulator, symmetric multiprocessing, mesi protocol. fig. 1: overview of cache parameters fig. 2: example of an smp system 4 comparison between ce and hw cpu cache monitors ce has many important advantages in comparison to hw cpu cache monitors: � the user can measure the behavior of the smp system on the uniprocessor system. � ce is supported on any platform, because ce is implemented as a gnu c library and can easily be included in any program. � the measurements are not influenced by other processes due to the “off-line” mode of the measurement. � external hw cpu cache monitors are very expensive and platform-dependent. modern cpus also have internal cache counters (for example, in the ia-32 architecture they are called “performance counters”), but they are available only for privileged users and are hard-to-use. 
4 comparison between the ce and hw cpu cache monitors
the ce has many important advantages in comparison to hw cpu cache monitors:
- the user can measure the behavior of the smp system on a uniprocessor system.
- the ce is supported on any platform, because it is implemented as a gnu c library and can easily be included in any program.
- the measurements are not influenced by other processes, due to the "off-line" mode of the measurement.
- external hw cpu cache monitors are very expensive and platform-dependent. modern cpus also have internal cache counters (for example, in the ia-32 architecture they are called "performance counters"), but they are available only to privileged users and are hard to use.
- the ce can measure the effects of memory functions not supported by the cpu core (for example, a "read once" operation) and can serve for the development of more effective cache hierarchies.
- the user can easily change the cache configuration or the number of cpus for the measurement by a #define statement. parameters for real systems can easily be included from predefined files.
- the user can measure the cache behavior only in an area of interest.
- conditional memory operations are supported.
- the "off-line" mode guarantees that small quantities of cache misses can also be measured exactly.

the ce also has potential drawbacks:
- memory operations in the ce are about 1000 times slower than hw memory reads, because in the ce all read (or write) operations are simulated by software. the exact slowdown ratio depends on the type of the measured task and the complexity of the simulated smp system. this drawback is reduced by the fact that the user can measure the cache behavior only in the area of interest.
- the ce requires additional memory for its own emulated cache memory.
- only data caches are assumed, not the tlb or other parts of the memory architecture.
- only the numbers of cache misses are measured, not the effects of these misses. some memory latencies, conflicts or stalls occurring during code execution can be overlapped by other computation, so they do not result in performance degradation.
- some cache misses cannot be measured, because it is difficult to express these memory operations on the source code level (for example, those caused by stack operations in calling subroutines – call-ret sequences).
- for caches which can hold both data and instructions, the effect of loading instructions into the cache is omitted. this drawback is not usually significant, because the code-sizes of the inner loops are much smaller than the data-sizes used by these loops.
- the current version supports only the mesi protocol and write-back caches with the lru replacement strategy. a version that also supports different coherence protocols or cache configurations is under development.

5 solution of coherency misses
the number of coherency cache misses is strongly influenced by the exact times of memory request execution and by the type of bus arbitration. since no sw cache emulator is able to predict these times exactly across the whole smp system, the memory request ordering is solved on a statistical basis. we assume that all cpus enter the section of a global memory operation at approximately the same time and that there are no preferences among memory requests. for each execution of this global memory operation inside the loop, a random ordering of the cpu memory requests is generated, because we assume random bus arbitration. under this ordering, the memory operations get serialized.
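one simple way to draw such a random ordering — a sketch only; the paper does not show the ce's code — is a fisher–yates shuffle of the cpu indices, redrawn before every execution of the global memory operation:

    #include <stdlib.h>

    /* fill order[0..ncpu-1] with a uniformly random permutation of cpu ids */
    static void random_request_order(int *order, int ncpu)
    {
        for (int i = 0; i < ncpu; i++)
            order[i] = i;
        for (int i = ncpu - 1; i > 0; i--) {
            int j = rand() % (i + 1);    /* good enough for a simulation */
            int tmp = order[i];
            order[i] = order[j];
            order[j] = tmp;
        }
    }

    /* the emulator would then serialize the requests in this order:
       for (k = 0; k < ncpu; k++) perform_memory_request(order[k]);  */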
6 validation of the ce
in order to validate the ce, we ran the following subroutines from the linear algebra package:
- sparse matrix-vector multiplication,
- cholesky factorization,
- matrix-matrix multiplication.

all routines were run in two forms:
1. the original code was measured by the performance analyzer hw cache monitor on this smp system: a sunfire v880 with 8 ultrasparc iii cu processors (each cpu has a 64 kb l1 cache and an 8 mb l2 cache), running os solaris 10.
2. the modified code was emulated by the ce. all cache parameters were equal to the real smp system configuration (see above), on this hw configuration: an intel celeron at 2.4 ghz, 512 mb ram, a 128 kb l2 cache and an 8 kb l1 cache, running os windows xp with the intel c compiler version 7.01.

the results from these tests are very similar. of course, they are not exactly the same, because the real measurement and the emulation are inexact in different ways, as discussed above. for simplicity, we can say that the differences between these results were smaller than 20 %.

7 conclusions
we have implemented a cache emulator to study quantitative parameters of the cache behavior in smp systems during different types of computation. we have also discussed the advantages and drawbacks of this emulator. the main advantage of this emulator is that a user can simulate the cache behavior of any smp system on a uniprocessor system. the emulator has been verified on different types of usual tasks. the results were similar to those obtained from the hw cache monitor. the errors in the estimations are due to minor simplifying assumptions in the ce.

acknowledgment
this work was supported by mšmt under research program msm6840770014.

references
[1] wadleigh, k. r., crawford, i. l.: software optimization for high performance computing. hewlett-packard professional books, 2000.
[2] kennedy, k., allen, j. r.: optimizing compilers for modern architectures: a dependence-based approach. morgan kaufmann publishers inc., 2002.
[3] bik, a., girkar, m., grey, p., tian, x.: "efficient exploitation of parallelism on pentium iii and pentium 4 processor-based systems." intel technology journal, 2001, no. 5.
[4] tvrdík, p., šimeček, i.: "software cache analyzer." proceedings of ctu workshop, 2005, p. 180–181.

ing. ivan šimeček, phone: +420 224 357 268, e-mail: xsimecek@fel.cvut.cz
department of computer science, czech technical university, faculty of electrical engineering, technická 2, 166 27 prague 6, czech republic

linearised coherent states for non-rational susy extensions of the harmonic oscillator
alonso contreras-astorga (a), david j. fernández c. (b), césar muro-cabral (c, b, ∗)
(a) conacyt – centro de investigación y de estudios avanzados del i. p. n., departamento de física, av. instituto politécnico nacional no. 2508, col. san pedro zacatenco, c.p. 07360, ciudad de méxico, méxico
(b) centro de investigación y de estudios avanzados del i. p. n., departamento de física, av. instituto politécnico nacional no. 2508, col. san pedro zacatenco, c.p. 07360, ciudad de méxico, méxico
(c) centro de investigación y de estudios avanzados del i. p. n., unidad querétaro, libramiento norponiente no. 2000, fracc. real de juriquilla, c.p. 76230, querétaro, qro., méxico
∗ corresponding author: cesar.muro@cinvestav.mx
acta polytechnica 62(1):30–37, 2022. https://doi.org/10.14311/ap.2022.62.0030. © 2022 the author(s), licensed under a cc-by 4.0 licence, published by the czech technical university in prague.

abstract. in this work, we derive two equivalent non-rational extensions of the quantum harmonic oscillator using two different supersymmetric transformations.
for these extensions, we built ladder operators as the product of the intertwining operators related to these equivalent supersymmetric transformations, which results in two-step ladder operators. we linearised these operators to obtain operators of the same nature that follow a linear commutation relation. after the linearisation, we derive coherent states as eigenstates of the annihilation operator and analyse some relevant mathematical and physical properties, such as the completeness relation, mean-energy values, temporal stability, time evolution of the probability densities, and wigner distributions. from these properties, we conclude that these coherent states present both classical and quantum behaviour.
keywords: supersymmetric quantum mechanics, non-rational extensions, linearised ladder operators, coherent states.

1. introduction
in quantum physics, supersymmetric quantum mechanics (susy) is considered the most efficient technique to generate new quantum potentials from an initial solvable one (see [1–5] for reviews on the topic). this method allows modifying the energy spectrum of an initial hamiltonian to obtain new hamiltonians with known eigenstates and eigenvalues. the potentials obtained with susy are known as extensions or susy partners of the considered initial potential. moreover, when two different susy transformations lead to the same potential (up to an additive constant), it can be said that the extensions are equivalent [6, 7]. equivalent rational extensions of the quantum harmonic oscillator are very attractive in mathematical physics, since their eigenstates are written in terms of exceptional orthogonal polynomials and the results are useful for studying superintegrable systems or generating solutions to the painlevé equations [8–10]. in a recent work of the authors [11], it was shown that the equivalence between susy transformations goes beyond rational extensions and can be extended to non-rational extensions of the harmonic oscillator, i.e. extensions whose potentials cannot be written as the quotient of two polynomials, by considering not only polynomial solutions but also general solutions of the schrödinger equation. however, since the birth of quantum theory, it has been relevant to study the quantum states at the border between classical and quantum regimes. in this sense, it is well known that schrödinger, in 1926 [12], derived quantum states of the harmonic oscillator that resemble classical behaviour in phase space, as the classical oscillator does. later on, in 1963, glauber rediscovered these states, known as coherent states, and found that they provided the quantum description of coherent light [13]. since then, there has been continuous research activity in quantum physics looking for quantum states with a behaviour at the border between classical and quantum regimes by examining semi-classical phase-space properties, in particular for systems generated by susy [4, 14–20]. the coherent states of the harmonic oscillator are gaussian states, labeled by a complex number $z$, that minimize the heisenberg uncertainty relation. they can be constructed either as displaced versions of the ground state or as eigenvectors of the annihilation operator. moreover, they form an overcomplete set in the sense that
$$\frac{1}{\pi}\int_{\mathbb{C}} |z\rangle\langle z|\, d^2z = 1. \quad (1)$$
these four properties are commonly used as definitions of coherent states when we have a potential different from the harmonic oscillator, see for example [21–25].
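as a quick check of (1) in the harmonic-oscillator case — a textbook computation included here only for illustration — expand $|z\rangle = e^{-|z|^2/2}\sum_n z^n/\sqrt{n!}\,|n\rangle$ in the fock basis and integrate in polar coordinates $z = re^{i\theta}$:
$$\frac{1}{\pi}\int_{\mathbb{C}} |z\rangle\langle z|\,d^2z
= \sum_{m,n=0}^{\infty}\frac{|m\rangle\langle n|}{\pi\sqrt{m!\,n!}}
\int_0^{\infty}\!\!\int_0^{2\pi} e^{-r^2}\,r^{m+n+1}\,e^{i(m-n)\theta}\,d\theta\,dr
= \sum_{n=0}^{\infty}\frac{|n\rangle\langle n|}{n!}\;2\!\int_0^{\infty} e^{-r^2}r^{2n+1}\,dr
= \sum_{n=0}^{\infty}|n\rangle\langle n| = 1,$$
since the angular integral gives $2\pi\,\delta_{mn}$ and $2\int_0^{\infty} e^{-r^2}r^{2n+1}\,dr = n!$. the analogous resolution for the linearised coherent states constructed below is stated in section 4.1.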
each definition gives, in general, different sets of coherent states. in this work, we obtain coherent states of non-rational extensions of the harmonic oscillator as eigenvectors of the annihilation operator. for this purpose, we need to find ladder operators of the system. the outline of the work is the following: in the next section, we present a short summary of susy. in section 3, we generate two equivalent non-rational extensions of the harmonic oscillator. then, we construct ladder operators as the product of the intertwining operators of the susy transformations. in section 4, we linearise the ladder operators to obtain a linear commutation relationship; then, we derive coherent states as eigenstates of the annihilation operator and study some of their properties. our conclusions are presented in the last section.

2. supersymmetric quantum mechanics
with this technique, we start with two hamiltonians
$$H = -\frac{1}{2}\frac{d^2}{dx^2} + V(x), \qquad \tilde{H} = -\frac{1}{2}\frac{d^2}{dx^2} + \tilde{V}(x), \quad (2)$$
where $H$ is the initial hamiltonian with known eigenfunctions $\psi_n(x)$ and eigenvalues $E_n$, $n = 0, 1, 2, \dots$, whereas $\tilde{H}$ is the hamiltonian under construction. the potential $\tilde{V}$ is known as the extension or supersymmetric partner of $V$. now, we propose the existence of $k$-th order differential operators $B, B^+$ that intertwine $H$ and $\tilde{H}$ as
$$\tilde{H}B^+ = B^+H, \qquad B\tilde{H} = HB. \quad (3)$$
by properly choosing $k$ general solutions $u_j$ ($j = 1, 2, \dots, k$) of the stationary schrödinger equation $Hu_j = \epsilon_j u_j$, with corresponding energies $\epsilon_j$, the susy partner potential $\tilde{V}(x)$ reads
$$\tilde{V}(x) = V(x) - \left[\ln W(u_1, u_2, \dots, u_k)\right]'', \quad (4)$$
where $W(f_1, f_2, \dots, f_k)$ denotes the wronskian of the functions in its argument. the functions $u_j$ are usually referred to as seed solutions and the constants $\epsilon_j$ as factorization energies. be aware that, to have a regular potential, we must choose the seed solutions in such a way that the wronskian has no zeroes. if $B^+\psi_n \neq 0$, the eigenfunctions $\tilde{\psi}_n$, $n = 0, 1, \dots$, of $\tilde{H}$ can be computed with the relation
$$\tilde{\psi}_n(x) = \frac{B^+\psi_n(x)}{\sqrt{(E_n - \epsilon_1)\cdots(E_n - \epsilon_k)}}
= \frac{1}{\sqrt{(E_n - \epsilon_1)\cdots(E_n - \epsilon_k)}}\,
\frac{W(u_1, u_2, \dots, u_k, \psi_n)}{W(u_1, u_2, \dots, u_k)}. \quad (5)$$
the constructed hamiltonian $\tilde{H}$ may contain additional eigenfunctions $\tilde{\psi}_{\epsilon_i}$, known as missing states, for some of the factorization energies $\epsilon_i$, given by
$$\tilde{\psi}_{\epsilon_i} \propto \frac{W(u_1, \dots, u_{i-1}, u_{i+1}, \dots, u_k)}{W(u_1, \dots, u_k)}. \quad (6)$$
if $\tilde{\psi}_{\epsilon_j}$ fulfills the boundary conditions of the quantum problem, then $\epsilon_j$ must be included in the spectrum of $\tilde{H}$. in particular, for second-order supersymmetric quantum mechanics, the intertwining operators have the explicit form [26]
$$B = \frac{1}{2}\left[\frac{d^2}{dx^2} + g(x)\frac{d}{dx} + g'(x) + h(x)\right], \quad (7)$$
$$B^+ = \frac{1}{2}\left[\frac{d^2}{dx^2} - g(x)\frac{d}{dx} + h(x)\right], \quad (8)$$
where the functions $g(x), h(x)$ are found in terms of the only two seed solutions $u_1, u_2$ with the corresponding factorization energies $\epsilon_1, \epsilon_2$, as
$$g = \frac{W'(u_1, u_2)}{W(u_1, u_2)}, \qquad h = \frac{g'}{2} + \frac{g^2}{2} - 2V + \frac{\epsilon_1 + \epsilon_2}{2}. \quad (9)$$
finally, the intertwining operators $B$ and $B^+$ fulfill the following factorization relations:
$$B^+B = (\tilde{H} - \epsilon_1)\cdots(\tilde{H} - \epsilon_k), \quad (10)$$
$$BB^+ = (H - \epsilon_1)\cdots(H - \epsilon_k), \quad (11)$$
i.e., the products of $B^+$ and $B$ are polynomials of the hamiltonians $H$ and $\tilde{H}$.

3.
non-rational extensions of the quantum harmonic oscillator and their ladder operators let us consider the harmonic oscillator potential v = 1 2x 2 and the hamiltonian h as h = − 1 2 d2 dx2 + 1 2 x2, (12) whose eigenfunctions and eigenvalues are ψn(x) = √ 1 2n √ πn! e− x2 2 hn(x), en = n + 1 2 , where n = 0, 1, 2, . . . and hn(x) are hermite polynomials [27]. when eigenfunctions of a hamiltonian are employed as seed functions to generate its susy partner, the results are rational extensions and the transformation is called krein-adler transformation [6, 7, 28]. moreover, rational extensions can also be built by employing the polynomial non-normalizable solutions of the schrödinger equation φm(x) = e x2 2 hm(x), e−m−1 = − ( m + 1 2 ) , where m = 0, 1, 2, . . . , and hm(x) = (−i)mhm(ix) are the modified hermite polynomials [29], which are free of nodes for even m and possess a single node at x = 0 for m odd. in the case of m even, the reciprocal of these solutions are square-integrable functions [6]. we can generate non-rational extensions of the harmonic oscillator potential using non-polynomial solutions of the schrödinger equation as seed functions 31 a. contreras-astorga, d. j. fernández c., c. muro-cabral acta polytechnica in a susy transformation. let us write down the general solution of the stationary schrödinger equation, with an arbitrary factorization energy denoted by e = λ + 1/2, as u(x) = e− x2 2 [hλ(x) + γhλ(−x)], (13) where hλ(x) ≡ 2λγ ( 1 2 ) γ ( 1−λ 2 ) 1f1 (−λ2 ; 12 ; x2 ) + 2λγ ( − 12 ) γ ( − λ2 ) x1f1 ( 1 − λ2 ; 32 ; x2 ) , (14) are defined as hermite functions [30, 31], 1f1(a; b; z) ≡ γ(b) γ(a) ∞∑ n=0 γ(a + n) γ(b + n) zn n! , (15) is the confluent hypergeometric function, and γ is a real parameter. if γ > 0, the solution will have an even number of zeroes and for γ < 0, an odd number of nodes. 3.1. first susy transformation as the first non-rational extension of the harmonic oscillator, we perform a second-order susy transformation where we add two new levels with factorization energies −3/2 < e1 < 1/2 and e2 = e−2 = −3/2, both below the ground state energy. we start by choosing the seed solutions as u (1) 1 (x) = e − x 2 2 [hλ1 (x) + γhλ1 (−x)], u (1) 2 (x) = φ1(x), (16) where λ1 = e1 − 1/2. to obtain a nodeless wronskian w(u(1)1 ,u (1) 2 ), we take γ > 0. notice that e1 is an arbitrary energy between e0 = 1/2 and e−2 = −3/2. by following the relation (8), we can define a set of second-order intertwining operators b(1),b(1)+ which satisfy the relations h̃(1)b(1)+ = b(1)+h, (17) and its adjoint. the susy partner potential is ṽ (1) = 1 2 x2 − [ ln w(u(1)1 ,u (1) 2 ) ]′′ . (18) since u(1)1 is an infinite series, the potential ṽ (1) is a non-rational extension of v . to find the eigenfunctions of the hamiltonian h̃(1), we use the operator b(1)+ as ψ̃(1)n = b(1)+ψn√ (en − e1)(en − e2) , n = 0, 2, 3, . . . (19) regarding both missing states of this extension ψ̃ (1) e1 ∝ u (1) 2 w(u(1)1 ,u (1) 2 ) , ψ̃ (1) e2 ∝ u (1) 1 w(u(1)1 ,u (1) 2 ) , (20) due to a stronger divergent behaviour of the wronskian when |x| → ∞ than the solutions u(1)1 ,u (1) 2 , the hamiltonian h̃(1) contains two new bounded states ψ̃(1)e1 , and ψ̃ (1) e2 , so its spectrum is sp{h̃ (1)} = {e−2, e1,en, n = 0, 1, 2, . . . }. 3.2. second but equivalent susy transformation we can obtain the same hamiltonian h̃(1), up to an additive constant, with a different second-order susy transformation. 
let us choose the following seed solutions: u (2) 1 (x) = ψ1(x), u (2) 2 (x) = e − x 2 2 [hλ2 (x) + γhλ2 (−x)], (21) with the factorization energies e3 = e1, and e4 = e1 + 2, respectively. note that λ2 = λ1 + 2. again, through the relations (7) and (8), we can define second-order differential operators b(2), b(2)+, which intertwine a hamiltonian h̃(2) with h as h̃(2)b(2)+ = b(2)+h. (22) the supersymmetric partner potential is ṽ (2) = 1 2 x2 − [ ln w(u(2)1 ,u (2) 2 ) ]′′ . (23) since u(2)2 is an infinite series, ṽ (2) is a non-rational extension of v . the eigenfunctions of its hamiltonian are ψ̃(2)n = b(2)+ψn√ (en − e3)(en − e4) , n = 0, 2, 3, . . . , (24) and the missing states ψ̃ (2) e3 ∝ u (2) 2 w(u(2)1 ,u (2) 2 ) , ψ̃ (2) e4 ∝ u (2) 1 w(u(2)1 ,u (2) 2 ) . (25) in this case, owing to the divergent asymptotic behaviour of the solution u(2)2 when |x| → ∞, the missing state ψ̃(2)e3 is not normalizable, and since u (2) 1 converges, the state ψ̃(2)e4 is square-integrable. therefore, the energy spectrum of h̃(2) is sp(h̃(2)) = {e0, e3,e2, . . . }. it is important to notice that the seed functions u (1) 1 , u (1) 2 used to construct h̃ (1) are related to the seed solutions u(2)1 , u (2) 2 involved in h̃ (2). the functions u(1)1 and u (2) 2 satisfy u (1) 2 = √ 2 √ πex 2 u (2) 1 , and a−a−u (2) 2 = 2λ(λ − 1)u (1) 1 , where a − is the annihilation operator of the harmonic oscillator. then, by a direct substitution, it can be shown that h̃(2) = h̃(1) + 2. thus, ṽ (1) and ṽ (2) are equivalent non-rational extensions of the harmonic oscillator. notice that due to this equivalence, the eigenfunctions obtained by both 32 vol. 62 no. 1/2022 linearised cs for non-rational susy extensions of the ho transformations are the same but with eigenvalues displaced. in the first extension, the ground state is the missing state ψ̃(1)e2 , which is also obtained by b (2)+ψ0. moreover, the missing state ψ̃(1)e1 corresponds to the missing state ψ̃(1)e4 . finally, relations (19) and (24) are also equivalent as ψ̃(2)n ∝ ψ̃ (1) n−2, where n = 2, 3, 4, . . . 3.3. ladder operators since both hamiltonians h̃(1) and h̃(2) are equivalent, we can simplify the notation by defining h̃(2) as h̃, its eigenfunctions simply by ψ̃, the potential ṽ (2) as ṽ , and e3 as ϵ. be aware that 1/2 < ϵ < 5/2. now, we can define the ladder operators for the susy extension h̃ as the product of the intertwining operators related to the equivalent susy transformations as in [32], i.e. l+ = b(1)+b(2), l− = b(2)+b(1). (26) they satisfy the following commutation algebra [h̃, l±] = ±2l±, (27) and [l−, l+] = (h̃ + 2 − e1)(h̃ + 2 − e2) (h̃ + 2 − e3)(h̃ + 2 − e4) − (h̃ − e1)(h̃ − e2)(h̃ − e3)(h̃ − e4). (28) from the relation (27) and the diagram in figure 1, we can observe how these operators are two-step ladder operators. furthermore, the commutation relation (28) indicates that these operators, together with h̃, realize a polynomial heisenberg algebra of thirdorder [33], with a generalized number operator: n4(h̃) = l+l− = (h̃ − e1)(h̃ − e2)(h̃ − e3)(h̃ − e4). (29) the kernel of the annihilation operator l− is composed by the functions kl− = {ψ̃e0, ψ̃ϵ, ψ̃e3,b (1)+u (2) 2 }. (30) the first three elements of the kernel are eigenfunctions of h̃ and the last one is a non-normalizable solution of the corresponding schrödinger equation. 
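the two-step action of these operators can be read off directly from (27); for an eigenfunction $\tilde{\psi}$ of $\tilde{H}$ with $\tilde{H}\tilde{\psi} = E\tilde{\psi}$ (a one-line consequence, written out here for clarity),
$$\tilde{H}\,(L^{\pm}\tilde{\psi}) = \left(L^{\pm}\tilde{H} \pm 2L^{\pm}\right)\tilde{\psi} = (E \pm 2)\,L^{\pm}\tilde{\psi},$$
so $L^{+}$ and $L^{-}$ map eigenfunctions to eigenfunctions with the energy shifted by $\pm 2$, unless the result vanishes, as it does on the kernel elements listed in (30).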
by applying iteratively the operator l+ onto these three eigenfunctions, we can construct a basis of three subspaces of the hilbert space, the direct sum of the three hilbert-subspaces compose the whole hilbert space (see figure 2). notice that ψ̃ϵ is annihilated by l+, then the corresponding subspace will be one-dimensional whereas the other two are infinite-dimensinal subspaces. . figure 1. diagram of the mechanism of the two-step ladder operators (26) figure 2. three independent energy ladders that make up the spectrum of h̃. this spectrum is composed by two infinite energy ladders and a singleelement one. 4. linearised coherent states and their properties once we have defined the ladder operators l± in (26), and clarify how they divide the hilbert space into two infinite subspaces (or energy ladders) plus a onedimensional subspace, we proceed to linearise them. we focus on the two infinite subspaces since the construction of the coherent state of the third subspace is trivial. we define new ladder operators for each infinite subspace as l+ν = σν (h̃)l +, l−ν = σν (h̃ + 2)l −, (31) where ν = 0, 3 is the index of the subspace. when ν = 0, we refer to the subspace span{ψ̃0, ψ̃2, ψ̃4, . . . } and, when ν = 3, we refer to the subspace span{ψ̃3, ψ̃5, ψ̃7, . . . }. the operators σν are defined as σ0(h̃) = [(h̃ − e1)(h̃ − e3)(h̃ − e4)]−1/2, σ3(h̃) = [(h̃ − e1)(h̃ − e2)(h̃ − e3)]−1/2. (32) 33 a. contreras-astorga, d. j. fernández c., c. muro-cabral acta polytechnica from (27), and considering σν (x) a regular function, we obtain the following useful relations. σν (h̃)l+ = l+σν (h̃ + 2), σν (h̃)l− = l−σν (h̃ − 2); l+σν (h̃) = σ(h̃ − 2)l+, l−σν (h̃) = σν (h̃ + 2)l−. using (29), it is direct to show that the operators l±ν fulfill the linear commutation relation [lν, l+ν ] = 21hν , (33) where 1hν is the identity in the subspace hν . therefore, on both hilbert subspaces, the action of the linearised ladder operators is l−ν ψ̃ν+2n = √ 2nψ̃ν+2(n−1), l+ν ψ̃ν+2n = √ 2(n + 1)ψ̃ν+2(n+1), (34) where n = 0, 1, 2, . . . at this stage, we can define the linearised coherent states as eigenstates of the linear annihilation operator, l−ν |z ν ⟩ = z |zν ⟩ , ν = 0, 3, (35) where z ∈ c. we can make the expansion |zν ⟩ = ∞∑ n=0 cn |ν + 2n⟩ , (36) where ψ̃ν+2n(x) = ⟨x|ν + 2n⟩ are the eigenfunctions of the susy hamiltonian, and following the definition (35), we find that the explicit form of the normalised coherent states is |zν ⟩ = e− |z|2 4 ∞∑ n=0 (z/ √ 2)n √ n! |ν + 2n⟩ . (37) notice that we obtained a similar expression of the standard coherent states but with the relevant difference that the expansion is in terms of eigenfunctions of the supersymmetric partner hamiltonian h̃ in the subspace ν. 4.1. completeness relation an important property that the constructed coherent states fulfill is that they form an over-complete set on hilbert subspaces, i.e., they solve an identity expression [25] 1 2π ∫ c |zν ⟩ ⟨zν | d2z = 1hν . (38) 4.2. mean-energy values the eigenvalue equation of the hamiltonian h̃ is given by h̃ |ν + 2n⟩ = ( ν + 1 2 + 2n ) |ν + 2n⟩ , (39) which leads to the energy expectation ⟨zν | h̃ |zν ⟩ = ν + 1 2 + |z|2. (40) we observe that we obtain the well-known quantity of energy-growth corresponding to the oscillator coherent states, this result is another direct consequence of the linear commutation relation between the linearised ladder operators. 4.3. temporal stability another relevant property of the coherent states is that they must remain coherent as they evolve in time. 
by applying the time evolution operator u(t), we obtain u(t) |zν ⟩ = e−i(ν+ 1 2 )t |zν (t)⟩ , i.e., our linearised coherent states fulfill this condition. the period of evolution of these states is τ = π, the half of the harmonic oscillator coherent states (t = 2π). this means that in the phase-space, our states need just the half of the time to return to the same point with an acquired phase. this represents a first clear indication of non-classical behaviour. 4.4. evolution of the probability densities let us analyse the time evolution of the probability densities. for the classical coherent states, this quantity is represented by a gaussian wave packet oscillating around the minimum of the potential. in our case, we have: ρz (z,x,t) = |⟨x| u(t) |zν ⟩| 2 = | ∞∑ n=0 e− |z|2 4 (ze−i2t/ √ 2)n √ n! ψ̃ν+2n(x)|2. (41) in the figure 3, we plot this evolution. we observe that each coherent state is composed by two wavepackets with a back-and-forth motion resembling a semi-classical behaviour, since each wavepacket looks like a harmonic-oscillator coherent state. the two wavepackets interfere with each other, and it is more noticeable when they collide around x = 0. a parity symmetry x → −x, is only apparent and cannot be guaranteed for the susy extensions since the potential ṽ is only symmetric around x = 0 when the parameter γ = 0 in the seed function u(2)2 . 4.5. wigner distributions an efficient tool to determine the nature of quantum wave functions is the wigner quasiprobability distribution in the phase space, defined by w (x,p) ≡ 1 2π ∫ ∞ −∞ ψ ∗ ( x − y 2 ) ψ ( x + y 2 ) e ipy dy. (42) in figure 4, we show the corresponding wigner functions of coherent states for both subspaces. we observe that the distributions possess regions with non-positive values, which is a clear indication of the non-classical behaviour or pure quantum nature of our linearised coherent states. 34 vol. 62 no. 1/2022 linearised cs for non-rational susy extensions of the ho figure 3. time evolution of the probability densities (41) of the linearised coherent states (37) with ϵ = 2, γ = 2, top: ν = 0, z = 5, and bottom: ν = 3, z = 5. 4.6. heisenberg uncertainty relation first, we introduce two hermitian quadrature operators x1 = l+ν + l − ν 2 , x2 = l−ν − l + ν 2i , (43) and the uncertainties σ2xi = ⟨x 2 i ⟩zν − ⟨xi⟩ 2 zν , i = 1, 2. (44) since the coherent states are eigenfunctions of l−, it is found that these uncertainties follow the product σ2x1σ 2 x2 = 1 4 , (45) indicating that they saturate the heisenberg inequality. 5. conclusions we have found a family of equivalent non-rational extensions of the harmonic oscillator potential generated through two different susy transformations involving general solutions of the stationary schrödinger equation in terms of hermite functions. these susy figure 4. wigner distributions of the linearised coherent states with ϵ = 2, γ = 2, z = 5, top: ν = 0, and bottom: ν = 3. transformations consisted in moving the first-excited state to an arbitrary level between the ground and the second-excited states, and, on the other hand, adding two new levels below the ground state. we built fourth-order differential ladder operators as the product of the intertwining operators related to the equivalent susy transformations. then, we linearised these ladder operators to have a linear commutation relation. in addition, we realized that these operators divide the entire hilbert space of eigenfunctions into two infinite energy ladders or hilbert-subspaces, and one single-element subspace. 
then, we derived coherent states of the linearised annihilation operator as eigenstates. we uncovered that they are temporally stable cyclic states with a period τ = π, and we showed as well that they form an overcomplete set in each subspace. moreover, they present the same energy growth as the oscillator coherent states. for the time evolution of the probability densities, we obtained the structure of two oscillating wave-packets, each one with a period 2π, but the collective behaviour with a period τ. for the wigner functions, we observed that they possess regions with non-positive values, unveiling the quantum nature of these states. finally, by defining two hermitian quadrature operators as in the harmonic oscillator, we got the linearised coherent states saturate the heisenberg inequality. therefore, as we already mentioned, we conclude that our states present both classical and quantum behaviour. 35 a. contreras-astorga, d. j. fernández c., c. muro-cabral acta polytechnica acknowledgements the authors acknowledge consejo nacional de ciencia y tecnología (conacyt-méxico) under grant fordecyt-pronaces/61533/2020. references [1] c. v. sukumar. supersymmetric quantum mechanics of one-dimensional systems. journal of physics a: mathematical and general 18(15):2917, 1985. https://doi.org/10.1088/0305-4470/18/15/020. [2] v. b. matveev, m. a. salle. darboux transformations and solitons. springer series in nonlinear dynamics. springer berlin heidelberg, 1992. isbn 9783662009246. [3] f. cooper, a. khare, u. sukhatme. supersymmetry and quantum mechanics. physics reports 251(5-6):267–385, 1995. https://doi.org/10.1016/0370-1573(94)00080-m. [4] d. j. fernández c., n. fernández-garcía. higher-order supersymmetric quantum mechanics. aip conference proceedings 744:236–273, 2004. https://doi.org/10.1063/1.1853203. [5] g. junker. supersymmetric methods in quantum, statistical and solid state physics. iop expanding physics. institute of physics publishing, 2019. isbn 9780750320245. [6] s. odake, r. sasaki. krein-adler transformations for shape-invariant potentials and pseudo virtual states. journal of physics a: mathematical and theoretical 46(24):245201, 2013. https://doi.org/10.1088/1751-8113/46/24/245201. [7] d. gomez-ullate, y. grandati, r. milson. extended krein-adler theorem for the translationally shape invariant potentials. journal of mathematical physics 55(4):043510, 2014. https://doi.org/10.1063/1.4871443. [8] d. bermudez, d. j. fernández c. supersymmetric quantum mechanics and painlevé iv equation. sigma symmetry, integrability and geometry: methods and applications 7:025, 2011. https://doi.org/10.3842/sigma.2011.025. [9] d. bermudez, d. j. fernández c., j. negro. solutions to the painlevé v equation through supersymmetric quantum mechanics. journal of physics a: mathematical and theoretical 49(33):335203, 2016. https://doi.org/10.1088/1751-8113/49/33/335203. [10] p. clarkson, d. gómez-ullate, y. grandati, r. milson. cyclic maya diagrams and rational solutions of higher order painlevé systems. studies in applied mathematics 144(3):357–385, 2020. https://doi.org/10.1111/sapm.12300. [11] c. muro-cabral. ladder operators and coherent states for supersymmetric extensions of the harmonic oscillator. master’s thesis, physics department, cinvestav, 2020. [12] e. schrödinger. der stetige übergang von der mikrozur makromechanik. naturwissenschaften 14(28):664– 666, 1926. https://doi.org/10.1007/bf01507634. [13] r. j. glauber. coherent and incoherent states of the radiation field. 
physical review 131(6):2766, 1963. https://doi.org/10.1103/physrev.131.2766. [14] h. bergeron, j. p. gazeau, p. siegl, a. youssef. semi-classical behavior of pöschl-teller coherent states. epl (europhysics letters) 92(6):60003, 2011. https://doi.org/10.1209/0295-5075/92/60003. [15] d. j. fernández c., l. m. nieto, o. rosas-ortiz. distorted heisenberg algebra and coherent states for isospectral oscillator hamiltonians. journal of physics a: mathematical and general 28(9):2693, 1995. https://doi.org/10.1088/0305-4470/28/9/026. [16] d. j. fernández c., v. hussin, o. rosas-ortiz. coherent states for hamiltonians generated by supersymmetry. journal of physics a: mathematical and theoretical 40(24):6491, 2007. https://doi.org/10.1088/1751-8113/40/24/015. [17] d. bermudez, a. contreras-astorga, d. j. fernández c. painlevé iv coherent states. annals of physics 350:615–634, 2014. https://doi.org/10.1016/j.aop.2014.07.025. [18] s. e. hoffmann, v. hussin, i. marquette, z. yao-zhong. coherent states for ladder operators of general order related to exceptional orthogonal polynomials. journal of physics a: mathematical and theoretical 51(31):315203, 2018. https://doi.org/10.1088/1751-8121/aacb3b. [19] s. e. hoffmann, v. hussin, i. marquette, z. yaozhong. ladder operators and coherent states for multistep supersymmetric rational extensions of the truncated oscillator. journal of mathematical physics 60(5):052105, 2019. https://doi.org/10.1063/1.5091953. [20] s. garneau-desroches, v. hussin. ladder operators and coherent states for the rosen-morse system and its rational extensions. journal of physics a: mathematical and theoretical 54(47):475201, 2021. https://doi.org/10.1088/1751-8121/ac2549. [21] a. o. barut, l. girardello. new “coherent” states associated with non-compact groups. communications in mathematical physics 21(1):41–55, 1971. https://doi.org/10.1007/bf01646483. [22] a. m. perelomov. coherent states for arbitrary lie group. communications in mathematical physics 26(3):222–236, 1972. https://doi.org/10.1007/bf01645091. [23] m. m. nieto, l. m. simmons. coherent states for general potentials. i. formalism. physical review d 20:1321–1331, 1979. https://doi.org/10.1103/physrevd.20.1321. [24] v. v. dodonov, e. v. kurmyshev, v. i. manko. generalized uncertainty relation and correlated coherent states. physics letters a 79(2-3):150–152, 1980. https://doi.org/10.1016/0375-9601(80)90231-5. [25] j. p. gazeau, j. r. klauder. coherent states for systems with discrete and continuous spectrum. journal of physics a: mathematical and general 32(1):123, 1999. https://doi.org/10.1088/0305-4470/32/1/013. [26] d. j. fernández c. integrability, supersymmetry and coherent states: a volume in honour of professor véronique hussin, chap. trends in supersymmetric quantum mechanics, pp. 37–68. springer international publishing, cham, 2019. https://doi.org/10.1007/978-3-030-20087-9_2. 
[27] m. abramowitz, i. a. stegun (eds.). handbook of mathematical functions, with formulas, graphs, and mathematical tables. dover, 1965.
[28] v. e. adler. a modification of crum's method. theoretical and mathematical physics 101(3):1381–1386, 1994. https://doi.org/10.1007/bf01035458.
[29] k.-h. kwon, l. l. littlejohn. classification of classical orthogonal polynomials. journal of the korean mathematical society 34(4):973–1008, 1997.
[30] n. n. lebedev, r. a. silverman. special functions and their applications. physics today 18(12):70–72, 1965. https://doi.org/10.1063/1.3047047.
[31] l. weisner. generating functions for hermite functions. canadian journal of mathematics 11:141–147, 1959. https://doi.org/10.4153/cjm-1959-018-4.
[32] i. marquette, c. quesne. new ladder operators for a rational extension of the harmonic oscillator and superintegrability of some two-dimensional systems. journal of mathematical physics 54(10):102102, 2013. https://doi.org/10.1063/1.4823771.
[33] j. m. carballo, d. j. fernández c., j. negro, l. m. nieto. polynomial heisenberg algebras. journal of physics a: mathematical and general 37(43):10349, 2004. https://doi.org/10.1088/0305-4470/37/43/022.

speaker non-speech event recognition with standard speech datasets
j. rajnoha
a non-speech event modelling approach to speech recognition is presented in this paper. a speaker independent spoken czech digit recogniser is used for this purpose, and speaker generated non-speech events are modelled. because it is important for the recogniser to be trained on suitable data, the paper shows some factors that influence the occurrence of the modeled non-speech events in the training database. some results achieved on the analysed training database are then shown. in the experiments on forced alignment the recogniser eliminates almost all the insertion error, which is a promising property for subsequent training. however, experiments with a different basis for the non-speech event models provide almost the same results, so the difference seems to be not so significant for recognition.
keywords: speech recognition, digit recognition, non-speech events, training database, forced alignment.

1 introduction
automatic speech recognition has become a very popular field of research, and the results come into our lives in many forms, e.g. in voice controlled machines, like pcs and mobile phones. these command controlled systems often work with real, spontaneous speech, which is different from the read speech used in the clean laboratory conditions of a research centre. this noisy kind of speech is recorded in a real environment, which causes some noise addition to the speech signal. the speech is also created "on the fly" and the speaker has to think of the next word while speaking.
all this causes many non-speech events to be present in such speech. therefore it is important to make the recogniser robust against these events, and to take the events into account and ensure that they are not incorrectly recognised. the long-term goal of our work is to create a hidden markov model (hmm) based digit recogniser, which will suppress the non-speech event influence. one solution is to model the non-speech events that affect speech. for this purpose it is important to have a large training database with many occurrences of each modelled item, e.g. phoneme, to get a general description of the item. therefore in the first part of this work, two czech speech databases are analysed for the presence and quality of the speaker non-speech events. in this very beginning of our work, only speaker-generated non-speech events are taken in account because of their position between words of recognised speech. this enables us to consider the events to be another word and to model them easily. for this purpose, the databases were inspected for the presence of the events in different situations. the second part of the work is concerned with speaker non-speech event modelling. a robust czech digit sequence recogniser based on hmms of czech phonemes is trained on the analysed database, and the results gained with the non-speech event recognition feature are presented. an hmm toolkit [2] is used for this purpose. based on these results, forced-alignment experiments are shown in the third part of the work. the recogniser is used to re-recognise the training data. this will remove unsuitable event marks in the transcription and it also enables to remark the event, which should improve the recognition results. 2 the database for the non-speech event recognition task as noted above, it is important for automatic speech recogniser training to have a speech database that has enough occurrences of each type of modelled unit, e.g. phonemes. in the non-speech event modelling task this calls for a sufficient number of non-speech events present in the database, which is the way to get a general description of the modelled event. this work deals with speaker non-speech events that appear in between particular words, so the database is analysed for the presence of these events. in most cases, speech recognition systems are trained on a database of mainly read speech. this helps to complete the database, because it is known approximately what was said. however, read speech is different from the spontaneous speech used in voice communication with a machine. read speech is low in speaker non-speech events. unlike a spontaneous speaker, a reader is not forced to think while speaking and so the occurrence of hesitation is low. the same problem occurs with other possible non-speech events, like lip-smacking, because the reader has better control over his mouth. the recognizer presented here was trained on two different sets of speech. the first database (spee) is a collection of czech speaker records in a different environment, which was inspected to contain only the items in a silent environment for training purposes. the spee set includes speaker non-speech events divided into two classes: filled pauses (fil), which are pauses in speech filled with some sound (common in hesitation), and other speaker non-speech events (spk). to increase the number of non-speech events present in the training set of speech data, one other dataset was used. 
it is a collection of records from a car (tem), which was also inspected to contain only silent items from a standing car. this database divides the events into several classes, which helps to create more different models of non-speech events and to describe these events more accurately. table 1 shows the number of events marked in the transcription of the whole dataset and of the selected clean training subset for both the spee and the tem dataset.

table 1: number of non-speech events

                 utterances    spk-event    fil-event
    spee-all     180213        108474       7382
    spee-train   63024         33138        1856
    tem-all      221318        46742        691
    tem-train    38391         11532        153

it can be seen that the training selections (clean) contain notably fewer speaker non-speech events marked in the transcription. the analysis in 2.2 shows that higher snr (lower noise level) in speech leads to lower occurrence of the events. some compromise is therefore needed in the choice between clean speech (more understandable for the recogniser) and the number of non-speech events for training the non-speech event robust recogniser. the reason for the low number of non-speech events in the tem-training subset is that the training subset is only a small fragment of the whole tem dataset. only the standing car items were taken for training purposes. the datasets also include some records that are not suitable for standard phoneme training, e.g. web-page addresses or spelled utterances. while the average occurrence of a non-speech event is 0.64 events per one utterance in the whole spee dataset and 0.56 events per utterance in the training subset, the average rate for web-page utterances is 0.89. however, these records were not used in the training phase, which decreases the number of non-speech events in the training subset.
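the quoted average occurrence rates can be verified directly from table 1; for the whole spee set and its training subset:
$$\frac{108474 + 7382}{180213} \approx 0.64, \qquad \frac{33138 + 1856}{63024} \approx 0.56 \quad \text{events per utterance}.$$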
the analyses follow. they try to find groups of records with some property that is important for the speaker non-speech event distribution in the training subset. this can help to discover some inefficiency in the non-speech event training process.

2.1 event distribution in speaker's age
the datasets include some basic information about the speaker. this can help us to find out whether some group of people can influence the training dataset in some way. one such kind of information is the age-class distribution and the number of speaker non-speech events for different age-classes. figure 2 shows that the presence of speaker non-speech events is not much influenced by the speaker's age till about 65 years, and only for a higher age does the amount rise. there are not many speakers from the 65+ age-class in the datasets (fig. 1), and this increase is not high enough to cause harmful effects, like training the recogniser for a specific kind of event only.

fig. 1: age-class distribution in the datasets
fig. 2: non-speech event distribution in age for the speech datasets

2.2 event distribution in different noise levels
the spee database includes an snr estimation for the records. it enables the user to analyse the influence of a noisy environment on the occurrence of a non-speech event in speech. there is no snr estimate for the tem database, but the information on car type and engine state can also partially describe the environment. the graph in fig. 3 shows that the distribution of the average non-speech event rate has its maximum in the snr group between 10 and 15 db. for lower snr, the rate falls mainly because of the noise that covers the event. a similar effect can be seen in tem, where the items without a running engine in the background contain more non-speech event marks. for higher snr ranges the rate also falls. in this environment the speaker is not disturbed by the environmental noise, so he pronounces properly, and this leads to a lower non-speech event rate.

fig. 3: non-speech events in the training spee dataset in different noisy environments

2.3 event distribution for different workgroups
the realization team for creating the two databases was divided into two parts with different supervisors and workplaces within the czech republic. this led to a different dialect distribution within the groups, as the records from the second group were spoken mainly by speakers with a moravian dialect. as shown in fig. 5, this separation led to a difference in the annotations. unlike the second group, the workers in the first group had to use fewer non-speech event marks in the transcription, and these items have a notably lower rate of non-speech event occurrence. this may have caused some of the marked non-speech events from the second group to be less loud or less prominent. because such events do not decrease the recognition score, it is quite undesirable to train the recogniser to separate them from common background noise. if these unimportant event marks were not there, the recogniser would take only significant events into account and would be able to model the significant events more properly. this is one reason for using forced alignment on the training dataset (see below).

3 previous tests
the analysed databases were used to create a czech digit sequence recogniser that models speaker non-speech events [1]. in the first step, only two classes of speaker non-speech events were modelled and both datasets could be used. in subsequent processing, the spk event was divided into separate plosive and fricative events. therefore only tem was used for subsequent training, because there was no information on these properties in the spee dataset. the recogniser was tested on the selection of both datasets (because of the different environment). figure 6 shows the recognition results for the testing data derived from spee and tem in terms of the word error rate
$$\mathrm{WER} = \frac{D + S + I}{N} \cdot 100\ \%, \quad (1)$$
where $N$ means the number of recognised words, and $D$ stands for deleted words, $S$ for substituted words and $I$ for incorrectly inserted words.
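in code, (1) reduces to a few integer counts; the following self-contained c fragment (illustrative only, with made-up counts) reproduces the arithmetic:

    #include <stdio.h>

    /* word error rate according to eq. (1): n = number of recognised words,
       d = deletions, s = substitutions, i = insertions */
    static double wer(int d, int s, int i, int n)
    {
        return 100.0 * (d + s + i) / n;
    }

    int main(void)
    {
        /* hypothetical counts, for illustration only */
        printf("wer = %.1f %%\n", wer(4, 10, 2, 200));  /* prints: wer = 8.0 % */
        return 0;
    }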
at the beginning of non-speech event modelling there was a recogniser trained on spee only (col. a). in the first step, tem was added to the training set and also two basic non-speech event marks were added to the set of models. the first retraining decreased the error rate even without using non-speech events (col. b), but the recogniser that takes the events into account has better results (col. c). additional retraining brought no improvement, so only tem was used for subsequent phases and the general spk mark was divided into two classes: a fricative (bre) and a plosive (pl) event. after two retraining steps the results show that, unlike in the case of non-speech event modelling (col. e, f), the word error rate and the insertion error rate increase rapidly without using the additional models (col. d). the experiment therefore shows that the non-speech event modelling approach helps to improve recognition.

fig. 4: non-speech events in the whole (full filled bar) and training (lined bar) tem dataset in different noisy environments
fig. 5: non-speech events in the whole (full filled bar) and training (dashed bar) datasets for different dialect/workgroups
fig. 6: word error rate (dashed bar) and insertion error rate (full filled bar) on the spee-testdat (top) and on the tem-testdat (bottom)

4 experiments on forced alignment
as noted above, the spee database divides speaker non-speech events into two classes only, but it is better for the recogniser to have more classes of these events. the more classes are used, the more accurately the events can be described and modelled. therefore three models of non-speech events were used in the recogniser above, which meant using only tem for subsequent retraining, and the training dataset size decreased. using the spee database for subsequent training again needs a decision on whether the spk mark in the transcription is closer to a bre or to a pl event. the htk system enables a feature called forced alignment, which tries to recognise the training database and puts the most fitting (most probable) form of the given word into the record. this is used for deciding between pronunciation variants of one word, and in this work it was used in a similar way for speaker non-speech events. in this experiment the spee dataset was re-recognised using the recogniser above. because the recogniser is able to classify the spk event, the result of this recognition is the spee database with 3 classes of speaker non-speech events, like in the tem database.

4.1 breath-like noise
the recogniser above uses two types of the bre model. one is based on the phonetically similar phoneme "f", while the second takes advantage of the length of the silence model. so it was necessary to decide which should be used for subsequent processing. the recognition results (fig. 6, col. e, f) show that in the two cases the recognition score is almost the same. it seems that both models have the same ability to describe the event, but using forced alignment on the spee dataset revealed the quality of these models. this was done by comparing the results of the re-aligned spee when using the bre models with a different base model. the recogniser was trained in several retraining steps.
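such a comparison of two alignments of the same utterances can be sketched as follows — an illustrative reconstruction assuming the event marks have already been extracted per utterance position; the authors' actual tooling is not described:

    #include <stdio.h>

    enum mark { NONE = 0, BRE = 1, PL = 2 };

    /* compare two alignments position by position; counts matching marks,
       substitutions (bre vs pl) and one-sided deletions */
    static void compare(const enum mark *a, const enum mark *b, int len,
                        int *match, int *subst, int *del)
    {
        for (int k = 0; k < len; k++) {
            if (a[k] == NONE && b[k] == NONE) continue;
            if (a[k] == b[k])        (*match)++;
            else if (a[k] && b[k])   (*subst)++;   /* both marked, classes differ */
            else                     (*del)++;     /* only one alignment has a mark */
        }
    }

    int main(void)
    {
        enum mark r1[] = { BRE, NONE, PL, BRE, NONE };   /* hypothetical alignments */
        enum mark r2[] = { BRE, NONE, BRE, NONE, NONE };
        int match = 0, subst = 0, del = 0;
        compare(r1, r2, 5, &match, &subst, &del);
        printf("match %d, subst %d, del %d\n", match, subst, del);  /* 1, 1, 1 */
        return 0;
    }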
in the last step (phase 2) and in one preceding step (phase 1), a forced alignment of the spee database was performed. table 2 shows the difference between the 'f'-based and silence-based forced alignment in those two succeeding retraining phases. this led to 4 comparisons. we analysed whether one recogniser had marked the non-speech event with the same mark (bre vs pl) as another recogniser (substitution), and we noted as a deletion the situation when one of them had not placed a mark where the second one had marked the event. in the case of 'f'-based bre models, the event was marked in the same way as by the other recognisers in 89.5 % of cases at maximum. this shows that the recogniser does not have stable models, and these models continue to change their properties notably while being trained. as a result, one event will serve as a bre event for a while, but then it will be used to train the pl model, because of the non-stability of the forced-alignment results. on the other hand, the comparison of the silence-based models in both succeeding phases shows that these models act in a rather stable way. therefore such an event will be used to train only one type of non-speech event model, which seems to be more suitable for subsequent processing.

table 2: comparison of re-aligned bre non-speech events in different retraining phases

                                        silence, phase 1    'f', phase 2
    'f'-based, phase 1      subst.          24.18 %            10.44 %
                            deletion         5.34 %             4.14 %
    silence-based, phase 2  subst.           4.44 %            18.77 %
                            deletion         3.86 %             7.19 %

4.2 aligned data-based recogniser
the first analysis (a listening test) of the re-aligned data discovered faults in the ability of the recogniser to decide whether the spk event is of the bre or pl type. in some cases the event was too close to the beginning word, sometimes the event was too silent, so no high accuracy could be expected. however, the results expanded the training database, which could help the recognition. based on the results above, retraining was performed with the use of the re-aligned spee training dataset. both kinds of bre models were used to check whether the comparison experiment has any effect on recognition accuracy. table 3 shows that using a different base model for a bre event has no significant influence on recognition accuracy. after two steps of retraining, the recogniser was able to eliminate all insertions except for the one that remained. even though the accuracy does not achieve the value of the original recogniser, the re-alignment seems to bring an improvement for subsequent training.

table 3: recognition results after re-alignment with a different bre basis

                           acc [%]    insertions
    'f'-based, original     96.44         3
    sil-based, original     96.44         2
    'f'-based, phase 1      95.96         3
    sil-based, phase 1      95.96         2
    'f'-based, phase 2      96.20         1
    sil-based, phase 2      96.32         1

4.3 alignment against listening
as noted above, the tem dataset divides non-speech events into several classes. these events were marked by a human annotator, so the original transcription can be used as a good basis for a re-alignment quality comparison. this transcription can be considered a good estimate of the non-speech event class, and if the re-alignment phase marks some plosive event as bre in many cases (or vice versa), the recogniser is unable to describe the difference between these events and needs to be trained better. section 2.3 shows that not all the non-speech events marked in the original transcription of the speech datasets can be considered a suitable pattern for training the event model. so the re-alignment can help to reduce the difference between the marked events by removing silent events, which can be modelled by the silence model.

table 4 shows the number of non-speech event marks bre and pl that were deleted from the original training subset (annotated by a human being) in different training phases, and which were substituted for each other. for the spee dataset there was no human-aligned transcription for these non-speech event classes, so there is no information about substituted marks. the number of deleted marks of non-speech events rises in the case of the spee subset, but the difference between the numbers of deleted marks in particular phases decreases. the recogniser therefore tends to some final form of non-speech event models that consider about 2300 marks in the original transcription as too silent or inappropriate in some other way. for the tem subset this number does not change notably. this may be because the recogniser was trained on the tem subset in all 4 phases, while spee was used only in the last two training phases.

table 4: comparison of non-speech events marked in the original and re-aligned training subsets

                                          deleted    substituted
    speecon  phase 1, before re-align      1116          –
             phase 2, before re-align      1723          –
             phase 1, after re-align       2169          –
             phase 2, after re-align       2282          –
    temic    phase 1, before re-align      1381          424
             phase 2, before re-align      1381          566
             phase 1, after re-align       1296          458
             phase 2, after re-align       1286          448

the substitutions in table 4 show that training leads to a decreasing number of substituted marks. this effect means that the recogniser classifies the non-speech events similarly to the human annotator, and so the recogniser can re-align the data more precisely. the substitutions are more often caused by marking the pl event as a bre event. a simple listening test showed that plosive non-speech events are followed by or mixed with breath in some cases and only the pl event is marked, so the substitution does not necessarily indicate a bad model of a non-speech event.

5 conclusion
this paper describes some analyses and tests in a non-speech event modelling task. the analyses of the training datasets show some properties that can influence recognition accuracy. then some recognition tests were performed to find the best way to model speaker non-speech events. a spoken czech digit sequence recogniser based on phoneme hmms was used for this purpose. the speech databases used for the experiments were analysed, and it was found that a part of both sets contains a notably different non-speech event rate. this was caused by the different supervision in the annotation phase of database creation. the distribution in different noise backgrounds supports the intuitive conclusion that a high noise level covers non-speech events, and so the occurrence rate decreases. for highly silent environments the rate is also lower, so not only the cleanest items are best for the non-speech event recognition task. the analysed datasets were used for training the recogniser, and although they were not checked to ensure that they contained only suitable non-speech items, using the non-speech event modelling feature brought a notable improvement. this recogniser was used to re-align one of the training datasets to get a more accurate description of the non-speech events. this reduced the insertion error. the choice of the model that stands as a basis for the non-speech events seems to be of less importance, because after retraining the difference slowly disappears.
acknowledgments

the work presented here was supported by gačr 102/05/0278 "new trends in voice technologies research and usage", avčr 1et201210402 "voice technologies in information systems", iga mz čr nr8287-3/2005 and research activity msm 6840770014 "research in the area of the prospective information and navigation technologies".

josef rajnoha
e-mail: rajnoj1@fel.cvut.cz
dept. of circuit theory
czech technical university in prague
faculty of electrical engineering
technická 2, 166 27 praha, czech republic
acta polytechnica 62(4):473–478, 2022
https://doi.org/10.14311/ap.2022.62.0473
© 2022 the author(s). licensed under a cc-by 4.0 licence. published by the czech technical university in prague.

evaluation of residual strength of polymeric yarns subjected to previous impact loads

gabriela hahn^a, antonio henrique monteiro da fonseca thomé da silva^{b,c}, felipe tempel stumpf^d, carlos eduardo marcos guilherme^{a,*}

a universidade federal do rio grande, policab – stress analysis laboratory, av. itália km 8, 96203-000 rio grande, brazil
b petróleo brasileiro sa – petrobras, cenpes r&d centre, cidade universitária, 21941-915 rio de janeiro, brazil
c universidade federal fluminense – departamento de engenharia mecânica, rua passo da pátria 156 bl. d, 24210-240 niterói, brazil
d universidade federal do rio grande do sul, departamento de engenharia mecânica, rua sarmento leite 425, 90050-170 porto alegre, brazil
* corresponding author: carlosguilherme@furg.br

abstract. the discovery of oil fields in deep and ultra-deep waters provided an opportunity to evaluate the use of synthetic ropes, complementarily or alternatively to traditional steel-based mooring lines in offshore units, mainly because of the former's lower specific weight. considering the series of complex, mainly axial dynamic-mechanical loads to which these structures may be subjected, originating from different sources such as wind, water current and tide, there may be cases when at least one of these lines faces an abrupt, shock-like axial load of considerably larger magnitude. the goal of the present study is to evaluate the residual tensile strength of three different synthetic yarns (polyester, and two grades of high modulus polyethylene) after exposure to such axial impact loads. it was observed that, among the tested materials, polyester is the one with the largest impact resistance under the conditions evaluated herein, mainly because of its comparatively greater energy absorption properties.

keywords: offshore mooring, ultra deep-water, impact load.

1. introduction

since the end of world war ii, there has been an increase in the application of synthetic materials, mainly because of the reduction of their production costs and their significantly advantageous mechanical properties [1]. as an example, one can mention the construction of polymeric ropes, which can be used in a wide range of sports and industrial applications, such as climbing, rescue operations, mooring of offshore structures and shipping operations [2]. during the 1990s, the offshore oil industry began to replace the traditional mooring system based on steel cables and chains by systems consisting mainly of polyester cables. the main motivation for this shift was the severe increase in the water depth in which these structures were now being anchored, requiring compliant ropes with low specific weight in order to reduce the overall weight of the floating system [3, 4].
nowadays, as examples, one can mention the synthetic fibres typically used for mooring rope manufacturing: polyester (pet), high modulus polyethylene (hmpe), polyamide (pa), liquid crystal polymer (lcp), aramid and polypropylene (pp). apart from the mechanical loads originating from the movement of the floating unit, such anchoring systems may be subjected to some degree of environmental damage caused, for example, by ultraviolet incidence and hydrolysis, depending on the fibre group [5, 6]. yarn-on-yarn abrasion is another (now mechanical) degradation mechanism, even more relevant than the previous ones, that can affect the material's mechanical behaviour [7]. for characterisation purposes considering mooring rope applications, the static and dynamic stiffness of polymeric multifilaments are typically assessed according to iso 18692 [8] and iso 14909 [9, 10].

polyester yarns have a high mechanical resistance, good tenacity and abrasion resistance [7, 11, 12]. when exposed to the environment under typical mooring conditions, they do not degrade considerably and are resistant to hydrolysis and ultraviolet incidence [5, 13]. polyester is also not biodegradable, has a negligible creep behaviour at room temperature and, when exposed to high temperatures, contracts instead of expanding [14, 15]. high modulus polyethylene is produced by a gel spinning process, resulting in a highly crystalline structure, oriented and extended along the fibre axis, with many different grades available in the market with specific properties. in general, hmpe fibres present a lower-than-water density allowing buoyancy in regular water, which makes them a very interesting choice for marine applications. as for the mechanical properties, the fibre has a high tenacity and stiffness compared to similar materials. the deformation of hmpe at rupture is also very low, but it shows remarkable creep behaviour even at lower temperatures [16, 17].

there is a lack of previous studies in the literature regarding the influence of severe and abrupt tensile loads on the mechanical properties of polymeric materials applied to mooring ropes. this is nevertheless a relevant topic for synthetic fibres used in mooring and operation lines, since during its lifetime an anchoring or operation line might be exposed to such loads several times. one important factor that influences the capacity to support shock loads is the degree of crystallinity, which is inversely proportional to the resistance to high, instantaneous tensile forces [18, 19]. considering the aforementioned, the main goal of this paper is to assess the residual tensile strength of polyester and high modulus polyethylene yarns exposed to a prior shock-like axial load.

2. materials and methods

2.1. materials

in the present work, one grade of pet and two different grades of hmpe are evaluated, referred to as pet, hmpe1 and hmpe2, respectively. the materials have titers of 3300 dtex, 1761 dtex and 1759 dtex, respectively.
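since break loads are reported below in n/tex, it is convenient to relate force, titer and the %ybl dead weights used later in the impact protocol. the sketch below is only an illustration of this bookkeeping; the helper names are our own, while the titer (3300 dtex) and the reference tenacity (0.76 n/tex, see table 1 in section 3.1) are taken from this paper:

```python
# illustrative bookkeeping for yarn tenacity and %YBL dead weights (assumed helpers).
G = 9.81  # m/s^2

def tenacity_n_per_tex(break_force_n, titer_dtex):
    """tenacity = force / linear density; 1 tex = 10 dtex."""
    return break_force_n / (titer_dtex / 10.0)

def dead_weight_kg(percent_ybl, ybl_n_per_tex, titer_dtex):
    """mass whose weight equals the requested percentage of the yarn break load."""
    ybl_newton = ybl_n_per_tex * (titer_dtex / 10.0)
    return (percent_ybl / 100.0) * ybl_newton / G

# PET: titer 3300 dtex, reference tenacity ~0.76 N/tex
print(round(dead_weight_kg(1.0, 0.76, 3300), 3), "kg for 1 % YBL on PET")
print(round(tenacity_n_per_tex(250.8, 3300), 2), "N/tex for a 250.8 N break force")
```

for pet this reproduces the roughly 2.5 n force quoted in section 3.2 for the 1 % ybl case, which explains the measurement difficulty reported there.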
2.2. methods

2.2.1. environmental conditions during tests

all tests were performed with 500 mm long yarn specimens conditioned according to iso 139:2014, which determines that the samples must stay for at least two hours in an environment at 20 ± 2 °c and a relative humidity of 65 ± 4 % prior to any experimental procedure. the tests themselves must also be performed under these environmental conditions.

2.2.2. tensile tests in reference unexposed samples

to determine the reference ybl values, for comparison purposes, unexposed fibres were tested according to astm d2256. 30 rupture tests were performed on 500 mm long yarn samples of pet, hmpe1 and hmpe2 at 250 mm/min. prior to tensile testing, the samples were twisted along their axes with 60 rounds per meter. an emic dl2000 universal testing machine with a 1 kn load cell was used.

2.2.3. impact tests

it is considered that there is a critical velocity above which the material shows brittle behaviour. the british standard bs en 892:2012 proposes impact tests on mountaineering ropes, which consist in the application of an instantaneous tensile force by a free-falling mass (figure 1) [20]. the input data of the experiment are the rope length, the free-fall height, the mass of the falling object, and standard atmospheric conditions, such as temperature and relative humidity.

figure 1: free-fall diagram. source: [20].

bs en 892:2012 specifies different procedures depending on the investigated factors, such as stiffness, the number of impact-load cycles, the force transmitted in the rope and the maximum stretching. multifilaments used in the impact test are expected to be capable of dissipating and absorbing the potential energy: the specimens must preserve their original elastic behaviour, ensuring structural integrity without compromising their mechanical properties. in the present paper, we apply such loads to the specimens and investigate eventual changes in their yarn break load (ybl) at two distinct moments: immediately after the impact load, and 24 hours after the test. the ybl of the tested material is compared to that of an unexposed sample. the abrupt tensile forces were applied with increasingly heavier dead weights, typically from 1 % to 7 % ybl (and higher whenever applicable), increased in steps of 2 % ybl. if the samples did not fail when exposed to the chosen dead weight, the residual strength was evaluated (as described in the next sections) and a new set of impact tests with an increased dead weight was performed. in all tests, the dead weight was released from a height of 250 mm, which corresponds to half of the samples' length. a total of 30 untwisted samples were used for each of the investigated materials.
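for orientation, the peak force in such a dead-weight drop can be estimated from an energy balance if the yarn is idealised as linear-elastic. this simple fall-factor model is our own illustration and not part of the standard's procedure or of the measurements reported below; the example mass, height and stiffness are assumed values:

```python
# idealised estimate of the peak force in a free-fall impact on a linear-elastic yarn.
# energy balance: m*g*(h + x) = 0.5*k*x^2 at maximum elongation x; peak force F = k*x.
import math

def peak_impact_force(mass_kg, drop_height_m, stiffness_n_per_m, g=9.81):
    w = mass_kg * g
    # solve 0.5*k*x^2 - w*x - w*h = 0 for the positive root
    x = (w + math.sqrt(w * w + 2.0 * stiffness_n_per_m * w * drop_height_m)) / stiffness_n_per_m
    return stiffness_n_per_m * x   # equivalently w * (1 + sqrt(1 + 2*k*h/w))

# assumed example values only: 0.26 kg dead weight, 0.25 m drop, 10 kN/m yarn stiffness
print(round(peak_impact_force(0.26, 0.25, 1.0e4), 1), "N")
```

because the peak force grows roughly with the square root of the yarn stiffness, the much stiffer hmpe yarns are expected to see higher impact forces than pet for comparable dead weights, in line with the maximum impact loads reported in the tables below.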
2.2.4. residual strength tests

2.2.4.1. residual strength of partially impacted samples

in order to evaluate the intermediate residual quasi-static tensile strength of the materials after the application of each of the impact load steps detailed in section 2.2.3, the samples that did not fail by rupture were subjected to tensile tests using the procedure of section 2.2.2. the time interval between the two experiments was 1 minute and 30 seconds, during which the samples were kept under the same controlled environmental conditions of section 2.2.1. the same emic dl2000 machine was used.

2.2.4.2. residual strength of samples after resting time

a second round of tensile tests was performed in order to evaluate the influence of the resting time of the samples between the impact experiments and the residual tensile strength measurements. for that, instead of testing the samples for their residual ybl immediately after the shock-like events, they were first left to rest for 24 hours in a controlled environment (see section 2.2.1). after this period, the tensile tests were performed according to the procedure detailed in section 2.2.2.

3. results and discussion

3.1. yarn break load of reference (unexposed) samples

table 1 shows the results for the ybl of pet, hmpe1 and hmpe2, determined according to the experimental setup detailed in section 2.2.2, considering 30 samples each. it can be seen that pet shows a lower ybl and a higher elongation at break when compared to both grades of hmpe, as expected.

table 1: yarn break load of reference unexposed samples of pet, hmpe1 and hmpe2.

  material   ybl (n/tex)    specific deformation at break (%)
  pet        0.76 ± 0.01    10.90 ± 0.04
  hmpe1      3.04 ± 0.15     3.30 ± 0.14
  hmpe2      3.08 ± 0.10     3.20 ± 0.09

3.2. tensile tests immediately after the impact loads

as detailed in section 2.2.3, each material was tested with abrupt tensile loads using an initial dead weight equivalent to 1 % ybl, which exposed the samples to a specific impact load. due to technical difficulties in measuring the very small force during the tests for pet (approximately 2.5 n), this material was excluded from this first experimental batch. after the tests with 1 % ybl, no visible damage was observed in any of the hmpe samples. moreover, no sample failed during the impact load. immediately after the impact tests, tensile tests were performed in order to observe eventual changes in the materials' original ybl. table 2 shows the results, considering 30 samples each.

table 2: results of the tensile tests for residual strength evaluation immediately after the impact load of 1 % ybl.

  material   max. impact load from the        ybl (n/tex)     specific deformation
             1 % ybl dead weight (n/tex)      after impact    at break (%)
  hmpe1      1.24 ± 0.03                      3.25 ± 0.09     3.30 ± 0.12
  hmpe2      1.33 ± 0.02                      3.23 ± 0.10     3.20 ± 0.12

although the impact forces applied to the samples were close to half of the materials' original ybl, both hmpe grades showed an apparent increase in their tensile strength: hmpe1 increased its ybl by about 7 % and hmpe2 by about 5 % when compared to the materials' reference ybl (see table 1). there was no significant change in the specific deformation at break. it was also observed that the standard deviation of the hmpe1 samples decreased compared to the results for the reference unexposed samples. then, additional impact tests were performed, increasing the dead weight to 3 % of the materials' original ybl, now including the pet samples. again, there was no visible damage in any of the samples after the application of the sudden axial loads. table 3 shows
the results of the tensile tests performed after the impacts, considering 30 samples each.

table 3: results of the tensile tests for residual strength evaluation immediately after the impact load of 3 % ybl.

  material   max. impact load from the        ybl (n/tex)     specific deformation
             3 % ybl dead weight (n/tex)      after impact    at break (%)
  pet        0.31 ± 0.01                      0.76 ± 0.01     10.30 ± 0.40
  hmpe1      2.14 ± 0.13                      3.15 ± 0.17      3.20 ± 0.20
  hmpe2      2.22 ± 0.10                      3.11 ± 0.11      3.20 ± 0.11

here, the axial load for the hmpe samples reaches almost 70 % of the materials' original ybl, while for polyester it was close to 50 % of its original ybl. the results for pet showed no significant difference when compared to the unexposed reference material (table 1). both hmpe grades showed, after impact, a higher ybl than that of the reference samples, but now only 3 % higher for hmpe1 and 1 % for hmpe2. again, there was no significant change in the specific deformation at break. the high standard deviation of hmpe1 may be an indication of permanent damage caused to the multifilament structure during the impact tests, even though the quasi-static mechanical behaviour was not jeopardized.

increasing the dead weight to 5 % ybl, more than 50 % of the hmpe1 samples failed by rupture during the impact test, which means that the impact strength of that material was reached. less than half of the hmpe2 samples failed by rupture. some of the samples that did not break during the impact test showed visible, macroscopic damage in their structure, while none of the pet samples showed any visible structural damage. table 4 shows the residual tensile strength test results. impact sampling was performed in order to always guarantee around 30 viable samples for the subsequent residual tensile strength tests.

table 4: results of the tensile tests for residual strength evaluation immediately after the impact load of 5 % ybl.

  material   max. impact load from the      samples         broken during   ybl (n/tex)      specific deformation
             5 % ybl dead weight (n/tex)    impact tested   impact          after impact     at break (%)
  pet        0.50 ± 0.02                    30               0              0.77 ± 0.02      10.50 ± 0.40
  hmpe1      2.86 ± 0.12                    78              48              not applicable
  hmpe2      2.98 ± 0.10                    54              24              3.09 ± 0.12       3.20 ± 0.13

the results show that the higher the impact load, the higher the standard deviation; in this case the limits of the impact tolerance of the hmpe fibres are being reached, with a significant number of fibres already failing under impact. again, pet showed a very small increase in its tensile strength, while the specific deformation at break remained almost unchanged. the next set of tests was performed with a dead weight of 7 % ybl. as expected, all samples of hmpe2 broke during impact, so, for that material, the load was decreased to 6 % ybl in order to find its impact strength. following the procedure of increasing the load by 2 % ybl, pet was further tested up to 11 % ybl, which was found to be beyond the material's impact strength. table 5 shows these results.

table 5: results of the tensile tests for residual strength evaluation immediately after impact load – higher dead weights.

  material    max. impact force from       samples         broken during   ybl (n/tex)      specific deformation
              dead weight drop (n/tex)     impact tested   impact          after impact     at break (%)
  pet 7 %     0.62 ± 0.02                  30              10              0.77 ± 0.02      10.50 ± 0.35
  pet 9 %     0.69 ± 0.01                  47              17              0.77 ± 0.02      10.30 ± 0.41
  pet 11 %    1.12 ± 0.02                  58              28              not applicable
  hmpe2 6 %   3.09 ± 0.09                  52              22              not applicable

3.3. tensile tests 24 hours after the impact loads

in this section, the effect of the time interval between the impact test and the subsequent tensile test on the quasi-static residual strength of the fibres was observed, aiming to detect any potential microstructural accommodation. because of the similar mechanical behaviour of hmpe1 and hmpe2, only pet and hmpe1 were chosen to undergo this new set of experiments. samples of pet were subjected to an impact load of 5 % ybl and hmpe1 to 3 % ybl. these values were chosen because they were found to be the highest impact loads (among the loads tested in the study) that do not cause macroscopic damage to the material (see section 3.2).
after the impact experiment, the samples were left to rest for 24 hours in a controlled environment as determined by iso 139:2014. table 6 shows the results.

table 6: results of the tensile tests for residual strength evaluation 24 hours after the impact load.

  material         number of   max. impact      ybl (n/tex)     specific deformation
                   samples     force (n/tex)    after impact    at break (%)
  pet 5 % ybl      30          0.50 ± 0.02      0.77 ± 0.02     11.20 ± 0.33
  hmpe1 3 % ybl    30          2.14 ± 0.13      3.11 ± 0.01      3.20 ± 0.12

in table 3, it can be seen that pet has a tensile strength of 1.12 ± 0.02 n/tex when the tensile test is performed immediately after the impact test. when the tensile test was conducted 24 hours after the impact experiment, the ybl was measured as 1.13 ± 0.02 n/tex. for hmpe1, the equivalent results are 3.15 ± 0.17 n/tex (immediately after impact) and 3.11 ± 0.01 n/tex (24 hours after impact). it was concluded that the tested materials do not recover their tensile strength even after a considerable resting interval of 24 hours after the impact test.

4. conclusions

the main goal of this article is to evaluate an eventual loss in tensile strength of polyester and (two grades of) high modulus polyethylene yarns after an exposure to abrupt axial impact loads (expressed as a percentage of the materials' original ybl). this is done by measuring the ybl of unexposed reference samples and comparing it to the ybl of samples previously exposed to different levels of impact loads. the obtained results suggest that, among the tested materials, pet is the one least affected by a % ybl impact load, having shown impact strength up to an axial load equivalent to about 9 % of its original ybl (see table 5). the two evaluated grades of hmpe, hmpe1 and hmpe2, presented an impact strength equal to 5 % ybl and 6 % ybl, respectively (see tables 4 and 5). a possible explanation is that pet's elongation at break (~10 %) is significantly larger than that of the hmpes (~3.3 %), which means that the former is more capable of absorbing strain energy than the latter.

because of the absence of similar studies in the literature, this study is considered pioneering in the assessment of the consequences of abrupt axial loads for the posterior tensile strength of polymeric yarns. the methodology followed here is considered adequate when one intends to quantitatively compare the impact strength of different polymeric fibres. it should be noted that, due to the difference between the axial stiffnesses of the tested materials, the strain rate is naturally not expected to be the same when this dead-weight release approach is used. if one intends to apply exactly the same strain rate to the materials being compared, a more sophisticated experimental apparatus must be employed. it is also important to note that the results regarding impact strength can be highly affected by the temperature during the experiments, so it is recommended to perform the shock-like experiments at the service temperature of the materials for more accurate comparisons.

references

[1] w. d. callister jr., d. r. rethwisch. materials science and engineering: an introduction. john wiley & sons, 2007.

[2] a. j. mclaren. design and performance of ropes for climbing and sailing. proceedings of the institution of mechanical engineers, part l: journal of materials: design and applications 220(1):1–12, 2006. https://doi.org/10.1243/14644207jmda75.

[3] v. sry, y. mizutani, g.
endo, et al. consecutive impact loading and preloading effect on stiffness of woven synthetic-fiber rope. journal of textile science and technology 3:1–16, 2017. https://doi.org/10.4236/jtst.2017.31001.

[4] k. h. lo, h. xü, l. a. skogsberg. polyester rope mooring design considerations. in ninth international offshore and polar engineering conference. brest, france, 1999. isope-i-99-169.

[5] j. p. duarte, c. e. m. guilherme, a. h. m. f. t. da silva, et al. lifetime prediction of aramid yarns applied to offshore mooring due to purely hydrolytic degradation. polymers and polymer composites 27(8):518–524, 2019. https://doi.org/10.1177/0967391119851386.

[6] m. m. shokrieh, a. bayat. effects of ultraviolet radiation on mechanical properties of glass/polyester composites. journal of composite materials 41(20):2443–2455, 2007. https://doi.org/10.1177/0021998307075441.

[7] s. r. da silva soares, v. e. fortuna, f. e. g. chimisso. yarn-on-yarn abrasion behavior for polyester, with and without marine finish, used in offshore mooring ropes. in 9th youth symposium on experimental solid mechanics, pp. 60–64. 2010.

[8] international organization for standardization. fibre ropes for offshore stationkeeping – part 2: polyester (iso standard no. 18692-2), 2019.

[9] international organization for standardization. fibre ropes for offshore stationkeeping – high modulus polyethylene (hmpe) (iso standard no. 14909), 2012.

[10] f. t. stumpf, c. e. m. guilherme, f. e. g. chimisso. preliminary assessment of the change in the mechanical behavior of synthetic yarns submitted to consecutive stiffness tests. acta polytechnica ctu proceedings 3:75–77, 2016. https://doi.org/10.14311/app.2016.3.0075.

[11] t. s. lemmi, m. barburski, a. kabziński, k. frukacz. effect of thermal aging on the mechanical properties of high tenacity polyester yarn. materials 14(7):1666, 2021. https://doi.org/10.3390/ma14071666.

[12] g. susich. abrasion damage of textile fibers. textile research journal 24(3):210–228, 1954. https://doi.org/10.1177/004051755402400302.

[13] m. moezzi, m. ghane. the effect of uv degradation on toughness of nylon 66/polyester woven fabrics. the journal of the textile institute 104(12):1277–1283, 2013. https://doi.org/10.1080/00405000.2013.796629.

[14] c. oudet, a. bunsell. effects of structure on the tensile, creep and fatigue properties of polyester fibres. journal of material science 22:4292–4298, 1987. https://doi.org/10.1007/bf01132020.

[15] h. y. jeon, s. h. kim, h. k. yoo. assessment of long-term performances of polyester geogrids by accelerated creep test. polymer testing 21(5):489–495, 2002. https://doi.org/10.1016/s0142-9418(01)00097-6.

[16] p. davies, y. reaud, l. dussud, p. woerther. mechanical behaviour of hmpe and aramid fibre ropes for deep sea handling operations. ocean engineering 38(17-18):2208–2214, 2011. https://doi.org/10.1016/j.oceaneng.2011.10.010.

[17] y. lian, h. liu, y. zhang, l. li. an experimental investigation on fatigue behaviors of hmpe ropes. ocean engineering 139:237–249, 2017. https://doi.org/10.1016/j.oceaneng.2017.05.007.

[18] e. hage jr. resistência ao impacto. in s. v. canevarolo jr. (ed.), técnicas de caracterização de polímeros. artliber, são paulo, 2003.
[19] e. l. v. louzada, c. e. m. guilherme, f. t. stumpf. evaluation of the fatigue response of polyester yarns after the application of abrupt tension loads. acta polytechnica ctu proceedings 7:76–78, 2017. https://doi.org/10.14311/app.2017.7.0076.

[20] i. emri, a. nikonov, b. zupančič, u. florjančič. time-dependent behavior of ropes under impact loading: a dynamic analysis. sports technology 1(4-5):208–219, 2008. https://doi.org/10.1080/19346182.2008.9648475.

acta polytechnica vol. 48 no. 2/2008

lectures on classical integrable systems and gauge field theories

m. olshanetsky

in these lectures we consider hitchin integrable systems and their relations with the self-duality equations and twisted super-symmetric yang-mills theory in four dimension. we define the symplectic hecke correspondence between different integrable systems. as an example we consider the elliptic calogero-moser system and the integrable euler-arnold top on coadjoint orbits of the group gl(N, ℂ), and explain the symplectic hecke correspondence for these systems.

keywords: integrable systems, gauge theories, monopoles.

1 introduction

some interrelations between classical integrable systems and field theories in dimensions 3 and 4 were proposed by n. hitchin twenty years ago [1, 2]. this approach to integrable systems has some advantages. it immediately leads to the lax representation with a spectral parameter, and makes it possible in some cases to prove algebraic integrability and to find separated variables [3, 4]. it was found later that some well-known integrable systems can be derived in this way [5, 6, 7, 8, 9, 10, 11, 12]. it was demonstrated in [13] that there exists an integrable regime in N = 2 supersymmetric yang-mills theory in four dimensions, described by seiberg and witten [14]. a general picture of the interrelations between integrable models and gauge theories in dimensions 4, 5 and 6 was presented in [15]. some new aspects of the interrelations between integrable systems and gauge theories have recently been found in the framework of the four-dimensional reformulation of the geometric langlands program [16, 17, 18]. these lectures take this approach into account, but they are also based on [1, 2, 11, 12, 19, 20, 21, 22].

the derivation of integrable systems from field theories is based on symplectic or poisson reduction. this construction is familiar in gauge field theories. the physical degrees of freedom in gauge theories are defined upon imposing first and second class constraints. first class constraints are analogs of the gauss law generating gauge transformations. a combination of the gauss law and constraints coming from the gauge fixing yields second class constraints. we start with gauge theories that have some important properties. first, they have at least a finite number of independent conserved quantities; after reduction these will play the role of integrals of motion. next, we assume that after gauge fixing and solving the constraints the reduced phase space becomes a finite-dimensional manifold, and that its dimension is twice the number of integrals. this property provides complete integrability. such is, for example, the theory of higgs bundles describing hitchin integrable systems [1]. this theory corresponds to a gauge theory in three dimensions.
on the other hand, a similar type of constraints arises in reducing the self-duality equations in four-dimensional yang-mills theory [1], and in four-dimensional N = 4 supersymmetric yang-mills theory [16], after reducing them to a space of dimension two. we also analyze the problem of classifying integrable systems. roughly speaking, two integrable systems are called equivalent if the original field theories are gauge equivalent. we extend the gauge transformations by allowing singular gauge transformations of a special kind. on the field theory side these transformations correspond to monopole configurations, or, equivalently, to the inclusion of the 't hooft operators [23, 24]. for some particular examples we establish in this way an equivalence between integrable systems of particles (calogero-moser systems) and integrable euler-arnold tops. it turns out that this equivalence is the same as the equivalence of the two types of r-matrices, of dynamical and vertex type [25, 26]. before considering concrete cases we recall the main definitions concerning completely integrable systems [20, 21, 27].

2 classical integrable systems

consider a smooth symplectic manifold ℛ of dim(ℛ) = 2l. this means that there exists a closed non-degenerate two-form ω, and the inverse bivector π (ω_{ab} π^{bc} = δ_a^c), such that the space C^∞(ℛ) becomes a lie algebra (poisson algebra) with respect to the poisson brackets

  {f, g} = ⟨df, π dg⟩ = ∂_a f π^{ab} ∂_b g.

any H ∈ C^∞(ℛ) defines a hamiltonian vector field on ℛ,

  V_H = ⟨dH, π ∂⟩ = ∂_a H π^{ab} ∂_b.

a hamiltonian system is a triple (ℛ, ω, H) with the hamiltonian flow

  ∂_t x_a = {H, x_a} = ∂_b H π^{ba}.

a hamiltonian system is called completely integrable if it satisfies the following conditions:

• there exist l poisson-commuting hamiltonians (integrals of motion) I_1, …, I_l on ℛ;

• since the integrals commute, the level set T_c = {I_j = c_j} is invariant with respect to the hamiltonian flows {I_j, ·}; restricted to T_c, the I_j(x) are functionally independent for almost all x ∈ T_c, i.e. det(∂_a I_b)(x) ≠ 0.

in this way we come to a hierarchy of commuting flows on ℛ,

  ∂_{t_j} x = {I_j(x), x}.   (2.1)

T_c is a lagrangian submanifold of ℛ, i.e. ω vanishes on T_c. if T_c is compact and connected, then it is diffeomorphic to an l-dimensional torus, called the liouville torus. in a neighborhood of T_c there is a projection

  p: ℛ → B,   (2.2)

where the liouville tori are generic fibers and the base of the fibration B is parameterized by the values of the integrals. the coordinates on a liouville torus ("the angle" variables), along with the dual variables on B ("the action" variables), describe a linearized motion on the torus. globally, the picture can be more complicated: for some values of c_j, T_c ceases to be a submanifold. in this way the action-angle variables are local.
here we consider a complex analog of this picture. we assume that ℛ is a complex algebraic manifold and that the symplectic form ω is a (2,0)-form, i.e. locally, in the coordinates (z_1, …, z_l, z̄_1, …, z̄_l), the form is represented as ω = ω_{a,b} dz_a ∧ dz_b. general fibers of (2.2) are abelian subvarieties of ℛ, i.e. they are complex tori ℂ^l/Λ, where the lattice Λ satisfies the riemann conditions. integrable systems in this situation are called algebraically integrable systems.

let two integrable systems be described by two isomorphic sets of action-angle variables. in this case the integrable systems can be considered as equivalent. establishing equivalence in terms of the action-angle variables is troublesome; there is a more direct way, based on the lax representation. the lax representation is one of the commonly accepted methods for constructing and investigating integrable systems. let L(x, z), M_1(x, z), …, M_l(x, z) be a set of l + 1 matrices depending on x ∈ ℛ with a meromorphic dependence on the spectral parameter z ∈ Σ, where Σ is a riemann surface, called the basic spectral curve. (it will be explained below that L and M are sections of certain vector bundles over Σ.) assume that the commuting flows (2.1) can be rewritten in the matrix form

  ∂_{t_j} L(x, z) = [L(x, z), M_j(x, z)].   (2.3)

let f be a non-degenerate matrix of the same order as L and M. the transformation

  L → f⁻¹ L f,  M_j → f⁻¹ ∂_{t_j} f + f⁻¹ M_j f   (2.4)

is called a gauge transformation because it preserves the lax form (2.3). the flows (2.3) can be considered as special gauge transformations

  L(t_1, …, t_l) = f⁻¹(t_1, …, t_l) L_0 f(t_1, …, t_l),

where L_0 is time-independent and defines the initial data, and M_j = f⁻¹ ∂_{t_j} f. moreover, it follows from this representation that the quantities tr(L^j(x, z)) are preserved by the flows and can thereby produce, in principle, the integrals of motion; this gauge invariance of the spectral invariants is also illustrated numerically in the sketch below. as mentioned above, it is reasonable to consider two integrable systems to be equivalent if their lax matrices are related by a non-degenerate gauge transformation. we relax the definition of the gauge transformations and allow f to have poles and zeroes on the basic spectral curve Σ, with some additional restrictions on f. this equivalence is called the symplectic hecke correspondence. this extension of equivalence will be considered in detail in these lectures. the following systems are equivalent in this sense:

examples

• 1. elliptic calogero-moser system ⟷ elliptic gl(N, ℂ) top, [11];

• 2. calogero-moser field theory ⟷ landau-lifshitz equation, [11, 10];

• 3. painlevé vi ⟷ zhukovsky-volterra gyrostat, [12].

the first example will be considered in section 4. the gauge invariance of the lax matrices allows one to define the spectral curve

  Σ̃ = {(λ, z) : det(λ − L(x, z)) = 0}.   (2.5)

the jacobian of Σ̃ is an abelian variety of dimension g̃, where g̃ is the genus of Σ̃. if g̃ = l = ½ dim ℛ, then the jacobian plays the role of the liouville torus and the system is algebraically integrable. in generic cases g̃ > l, and to prove algebraic integrability one should find additional reductions of the jacobians, leading to abelian spaces of dimension l.

finally, we formulate the two goals of these lectures:

• to derive the lax equation and the lax matrices from a gauge theory;

• to explain the equivalence between integrable models by inserting 't hooft operators in the gauge theory.
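as a quick sanity check of (2.4)–(2.5), the following minimal numpy sketch (our own illustration, with randomly generated matrices standing in for L and f at a fixed value of z) verifies that a gauge transformation leaves the eigenvalues, and hence the spectral curve and the traces tr L^m, unchanged:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 4
L = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))   # a "Lax matrix" at fixed z
f = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))   # generic invertible gauge

Lg = np.linalg.inv(f) @ L @ f                                # L' = f^{-1} L f, eq. (2.4)

# the spectral curve det(lambda - L) = 0 is gauge invariant:
lam = np.sort_complex(np.linalg.eigvals(L))
lam_g = np.sort_complex(np.linalg.eigvals(Lg))
print(np.allclose(lam, lam_g))                               # True
print([np.allclose(np.trace(np.linalg.matrix_power(L, m)),
                   np.trace(np.linalg.matrix_power(Lg, m))) for m in (1, 2, 3)])
```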
3 1d field theory

the simplest integrable models, such as the rational calogero-moser system, the sutherland model, and the open toda model, can be derived from matrix models of finite order. here we consider a particular case – the rational calogero-moser system (rcms) [32, 33].

3.1 rational calogero-moser system (rcms)

the phase space of the rcms is ℛ_RCM = ℂ^{2N} = {(v, u)}, v = (v_1, …, v_N), u = (u_1, …, u_N), with the canonical symplectic form

  ω_RCM = Σ_{j=1}^N dv_j ∧ du_j,  {v_j, u_k} = δ_{jk}.   (3.1)

the hamiltonian describes interacting particles with complex coordinates u = (u_1, …, u_N) and complex momenta v = (v_1, …, v_N),

  H_RCM = ½ Σ_{j=1}^N v_j² − ½ Σ_{j≠k} ν²/(u_j − u_k)².

the hamiltonian leads to the equations of motion

  ∂_t u_j = v_j,   (3.2)

  ∂_t v_j = −2ν² Σ_{k≠j} 1/(u_j − u_k)³.   (3.3)

the equations of motion can be put in the lax form

  ∂_t L(v, u) = [L(v, u), M(v, u)].   (3.4)

here L, M are the N×N matrices

  L = P + X,  M = D + Y,  P = diag(v_1, …, v_N),  X_{jk} = ν (u_j − u_k)⁻¹ (j ≠ k),   (3.5)

  Y_{jk} = −ν (u_j − u_k)⁻² (j ≠ k),  D = diag(d_1, …, d_N),  d_j = ν Σ_{k≠j} (u_j − u_k)⁻².   (3.6)

the diagonal part of the lax equation (3.4) reads ∂_t P = [X, Y]_diag and coincides with (3.3). the non-diagonal part has the form ∂_t X = [P, Y] + [X, Y]_nondiag + [X, D]. one finds that [X, Y]_nondiag = −[X, D], and the remaining equation ∂_t X = [P, Y] coincides with (3.2). the lax equation produces the integrals of motion

  I_m = (1/m) tr(L^m),  ∂_t tr(L^m) = 0,  m = 1, 2, …, N.   (3.7)

it will be proved later that they are in involution, {I_m, I_n} = 0; in particular, I_2 = H_RCM. eventually, we come to the rcms hierarchy

  ∂_j f(v, u) = {I_j(v, u), f}.   (3.8)
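the lax pair (3.5)–(3.6) can be checked numerically. the following self-contained sketch is our own illustration (the initial data and step size are arbitrary; a purely imaginary coupling ν = 0.7i is chosen so that the effective interaction keeps the real trajectories collision-free). it integrates (3.2)–(3.3) with a hand-rolled rk4 step and confirms that the traces tr L^m of (3.7) stay constant along the flow:

```python
import numpy as np

NU, N = 0.7j, 3   # purely imaginary coupling: nu^2 < 0, repulsive real dynamics

def lax_pair(v, u):
    du = u[:, None] - u[None, :] + np.eye(N)        # avoid 0 on the diagonal
    X = NU / du * (1 - np.eye(N))                   # X_jk = nu/(u_j - u_k), eq. (3.5)
    Y = -NU / du**2 * (1 - np.eye(N))               # Y_jk = -nu/(u_j - u_k)^2, eq. (3.6)
    d = NU * ((1 - np.eye(N)) / du**2).sum(axis=1)  # d_j = nu * sum_{k!=j} (u_j - u_k)^-2
    return np.diag(v) + X, np.diag(d) + Y

def rhs(state):                                     # equations of motion (3.2)-(3.3)
    v, u = state[:N], state[N:]
    du = u[:, None] - u[None, :] + np.eye(N)
    dv = -2 * NU**2 * ((1 - np.eye(N)) / du**3).sum(axis=1)
    return np.concatenate([dv, v])

def rk4(state, h):
    k1 = rhs(state); k2 = rhs(state + h/2*k1)
    k3 = rhs(state + h/2*k2); k4 = rhs(state + h*k3)
    return state + h/6*(k1 + 2*k2 + 2*k3 + k4)

state = np.array([0.3, -0.1, 0.5, 0.0, 1.0, 2.3], dtype=complex)   # (v, u)
L0, _ = lax_pair(state[:N], state[N:])
I0 = [np.trace(np.linalg.matrix_power(L0, m)) for m in (1, 2, 3)]
for _ in range(1000):
    state = rk4(state, 1e-3)
L, _ = lax_pair(state[:N], state[N:])
print([abs(np.trace(np.linalg.matrix_power(L, m)) - I0[m-1]) for m in (1, 2, 3)])
# all three differences are ~1e-12: the integrals (3.7) are conserved
```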
3.2 matrix mechanics and rcms

this construction was proposed in refs. [35, 36]. consider a matrix model with the phase space

  ℛ = gl(N, ℂ) ⊕ gl(N, ℂ) = {(Φ, A)},  Φ, A ∈ gl(N, ℂ)

(these notations will be justified in the next section); dim ℛ = 2N². the symplectic form on ℛ is

  ω = tr(dΦ ∧ dA) = Σ_{j,k} dΦ_{jk} ∧ dA_{kj}.   (3.9)

the corresponding poisson brackets have the form {Φ_{jk}, A_{lm}} = δ_{jm} δ_{kl}. choose N commuting integrals

  I_m = (1/m) tr(Φ^m),  {I_m, I_n} = 0,  m = 1, 2, …, N,

and take H = I_2 as the hamiltonian. then we come to free motion on ℛ:

  ∂_t Φ = {H, Φ} = 0,   (3.10)

  ∂_t A = {H, A} = Φ.   (3.11)

more generally, we have the free matrix hierarchy

  ∂_j Φ = 0,  ∂_j A = Φ^{j−1}  (∂_j ≡ {I_j, ·}).   (3.12)

3.2.1 hamiltonian reduction for rcms

the form ω and the integrals I_m are invariant with respect to the action of the gauge group 𝒢 = GL(N, ℂ),

  Φ → f⁻¹ Φ f,  A → f⁻¹ A f,  f ∈ GL(N, ℂ).

the action of the gauge lie algebra Lie(𝒢) = gl(N, ℂ) is represented by the vector fields

  V_ε Φ = [Φ, ε],  V_ε A = [A, ε].   (3.13)

let ι_ε be the contraction operator with respect to the vector field V_ε, and ℒ_ε = d ι_ε + ι_ε d the corresponding lie derivative. the invariance of the symplectic form and of the integrals means that ℒ_ε ω = 0 and ℒ_ε I_m = 0. since the symplectic form is closed, dω = 0, we have d(ι_ε ω) = 0. then on the affine space ℛ the one-form ι_ε ω is exact,

  ι_ε ω = d F(Φ, A, ε).   (3.14)

the function F(Φ, A, ε) is called the momentum hamiltonian. the poisson brackets with the momentum hamiltonian generate the gauge transformations, {F(Φ, A, ε), x} = V_ε x. the explicit form of the momentum hamiltonian is F(Φ, A, ε) = tr(ε [Φ, A]). define the moment map

  μ: ℛ → Lie*(𝒢) ≅ gl(N, ℂ),  μ(Φ, A) = [Φ, A].   (3.15)

let us fix its value as

  [Φ, A] = J,   (3.16)

  J = ν(E − Id),  i.e. J_{jk} = ν (1 − δ_{jk}),   (3.17)

where E is the matrix of all ones (zeros on the diagonal of J, ν everywhere else). it follows from the definition of the moment map that (3.16) is a set of first class constraints; in particular,

  {F(Φ, A, ε), F(Φ, A, ε′)} = F(Φ, A, [ε, ε′]).

note that the matrix J is degenerate and is conjugate to the diagonal matrix ν·diag(N − 1, −1, …, −1). let 𝒢_0 be the subgroup of the gauge group preserving the moment value,

  𝒢_0 = {f : f⁻¹ J f = J},  dim 𝒢_0 = (N − 1)² + 1.

in other words, 𝒢_0 preserves the surface in ℛ

  μ⁻¹(J) = {[Φ, A] = J}.   (3.18)

let us fix a gauge on this surface with respect to the 𝒢_0 action. it can be proved that a generic matrix A can be diagonalized by 𝒢_0:

  f⁻¹ A f = u = diag(u_1, …, u_N),  f ∈ 𝒢_0.   (3.19)

in other words, we have two types of conditions – the first class constraints (3.16) and the gauge fixing (3.19). the reduced phase space ℛ_red is the result of imposing both types of constraints,

  ℛ_red = ℛ // 𝒢 = μ⁻¹(J)/𝒢_0,

and it has dimension

  dim ℛ_red = dim ℛ − (dim 𝒢 − 1) − (dim 𝒢_0 − 1) = 2N² − (N² − 1) − (N − 1)² = 2N

(the overall scalars act trivially and are therefore subtracted from both 𝒢 and 𝒢_0).

let us prove that ℛ_red ≅ ℛ_RCM and that the hierarchy (3.12), restricted to ℛ_RCM, coincides with the rcms hierarchy (3.8). let f ∈ 𝒢_0 diagonalize A as in (3.19), and define

  L = f⁻¹ Φ f.   (3.20)

then it follows from (3.10) that L satisfies the lax equation

  ∂_t L = [L, M],  M = f⁻¹ ∂_t f.

the moment constraint (3.18) allows one to find the off-diagonal part of L; evidently, it coincides with X (3.5). the diagonal elements of L are free parameters. in a similar way, the off-diagonal part Y (3.6) of M can be derived from the equation of motion for A (3.11). thereby we come to the lax form of the equations of motion for the rcms. since Φ → L and A → u, the symplectic form ω (3.9) coincides on ℛ_RCM with ω_RCM (3.1). it follows from (3.20) that the integrals (3.7) poisson commute, since tr L^m = tr Φ^m. therefore, we obtain the rcms hierarchy. the same system can be derived starting with matrix mechanics based on sl(N, ℂ); in this case I_1 = tr Φ = 0, and thereby in the reduced system Σ_j v_j = 0.
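the statement about the spectrum of the moment value (3.17) is immediate to confirm numerically; in the minimal sketch below (our own illustration; N = 5 and ν = 2.0 are arbitrary choices) the (N − 1)-fold eigenvalue −ν is exactly the degeneracy that makes the stabilizer 𝒢_0 as large as (N − 1)² + 1:

```python
import numpy as np

N, nu = 5, 2.0
J = nu * (np.ones((N, N)) - np.eye(N))   # J_jk = nu*(1 - delta_jk), eq. (3.17)
print(np.sort(np.linalg.eigvalsh(J)))    # [-2. -2. -2. -2.  8.] = nu*(-1,...,-1, N-1)
```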
we assume that f is a smooth map f c g� � ��� ( )�3 , vanishing at infinity and at some marked points f a( )y � 0, y a a a ay y y� ( , , )1 2 3 , ( , , )a n�1 � . infinitesimal gauge transformations defines the vector field on the phase space v� e e� [ , �], v� a � d� [ ,a �], � � lie( )� . the corresponding momentum hamiltonian is f( , ,e a �) � � � 3 �� ( [ , ])�j j j j j e a e � � 1 3 , (compare with (3.14)). therefore the moment takes the form � �( , [ , ]e a) = j j j j j e a e � � 1 3 . this is an element of the gauge co-algebra lie * ( )� . in other words the moment belongs to the map of the phase space � to the distributions on �3 with values in lie g* ( ) . let us fix it as � � �j j j j j a a a e a e � � � � �[ , ] ( ) 1 3 x y , (3.21) where �a lie� * ( )� . the moment constraints (3.21) are nothing else than the gauss law, and �a are electric charges. to come to the reduced phase space � � �red � / / we should add a gauge fixing condition to the gauss law. note that the gauge transformations vanish at the points ya, and in this way they preserves the right hand side of (3.21). starting with six fields ( , )a ej j defining �, we put two types of constraints – the gauss law and gauge fixing. roughly speaking, they kill two fields and the reduced phase space describes the “transversal degrees of freedom”. 4 3d field theory 4.1 hitchin systems 4.1.1 fields define a field theory on (2 1) dimensional space-time of the form � � g n, , where is a riemann surface of genus g with a divisor d x xn� ( , , )1 � of n marked points. the phase space of the theory is defined by the following field content: 1) consider a vector bundle e of rank n over �g,n equipped with the connection �� �! "d dzz . it acts on the sections 8 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 48 no. 2/2008 s s st n� ( , , )1 � of e as �� � d s s as� . the vector fields a z z( , ) are c� maps � g n n, ( , )� gl � . 2) the scalar fields (the higgs fields) ( , )z z dz" , �: ( , ),g n n� gl � . the higgs field is a section of the bundle � �( , ) ,( , ) 1 0 g n eend . this means that acts on the sections s s dzj kj j� " . we assume that has holomorphic poles at the marked points ~ a az x� � let ( , , , , , )� � � �1 1� �g g be a set of fundamental cycles of � g n, , ( � � � �j j j jj � �# �1 1 1). the bundle e is defined by the monodromy matrices ( , )q j j� � j js q s: � �1 , � j j s:� �1 similarly, for a and we have � �j j j j ja q q q aq: � � �1 1, � �j j j j ja a: � � �� � � �1 1 �j j jq q: � �1, bj j j: � �� �1. (4.1) 3) the spin variables are attributed to the marked points s na � gl( , )� , a n�1, ,� , s g s ga a� �1 0( ) , where s a ( )0 is a fixed element of gl( , )n � . in other words, s a belong to coadjoint orbits a of gl( , )n � . they play the role of non-abelian charges located at the marked points. let � �t� , ( , , )� �1 2 � n be a basis in the lie algebra gl( , )n � , [ , ] ,t t c t� � � � � �� . define the poisson structure on the space of fields: 1) darboux brackets of the fields ( , )a : a z z a z z t( , ) ( , )�� � � � , ( , ) ( , )w w w w t�� � � � . � � � � � � �( , ), ( , ) ( , )w w a z z t t z w z w� � � , (��= trace in ad) 2) linear lie brackets for the spin variables: s a as t�� �� � � �s s c sa b a b a� � ���� ��, ,� . in this way we have defined the phase space � � ( , , )a a s . 
(4.2) the poisson brackets are non-degenerate and the space � is symplectic with the form � � � �� � � ��� � 0 1 a a a a n z x z x g n ( , ) ,� , (4.3) �0 � �� d da g n � , , (4.4) �a ad s g dg� ��( )1 . (4.5) the last form is the kirillov-kostant form on the coadjoint orbits. the fields ( , ) a are holomorphic coordinates on � and the form �0 is the (2,0)-form in this complex structure on �. similarly, ( , )s g ga �1 are holomorphic coordinates on the orbit a, and �a is also (2,0) form. 4.1.2 hamiltonians the traces j ( j n�1, ,� ) of the higgs field are periodic (j, 0)-forms� �( , ) ,( ) j g n 0 with holomorphic poles of order j at the marked points. to construct integrals from j one should integrate them over � g n, and to this end prepare (1, 1)-forms from the (j, 0)-forms. for this purpose consider the space of smooth (1�j, 1)-differentials � �( , ) ,( \ ) 1 1� j g n d vanishing at the marked points. locally, they are represented as � � � �j j z jz z dz� "�( , )( ) 1 . in other words �j are (0,1)-forms taking values in degrees of vector fields on � g n d, \ . for example, �2 is the beltrami differential. the product j j� can be integrated over the surface. we explain below that �j can be chosen as elements of the basis in the cohomology space h dg n j1 1( \ , ),� " � . this space has dimension n h d j g jn j g jj g n j� � � � � � � � � " �dim ( \ , ) ( )( ) , 1 1 2 1 1 1 1 � (4.6) let �j,k be a basis in h dg n j1 1( \ , ),� " � , (k n j�1, ,� ). the product � j k j , can be integrated to define the hamiltonians i jj k j k j g n , , , � � 1 � � , j n�1, ,� . (4.7) it follows from (4.6) that the number of independent integrals n j� for gl( , )n � is d n g n n n n n g n j j n , , ( ) ( ) � � � � � � 1 1 12 2 1 . (4.8) since � 0 for sl( , )n � the number of independent integrals is d n g n n n n n g n j j n , , ( )( ) ( ) � � � � � � � 1 1 12 2 2 . (4.9) © czech technical university publishing house http://ctn.cvut.cz/ap/ 9 acta polytechnica vol. 48 no. 2/2008 fig. 1: ( , , )x x1 4� –marked points the integrals i(j, k) are independent and poisson commute � �i ij k j k( , ) ( , ),1 1 2 2 0� . (4.10) thus we come to dn g n, , commuting flows on the phase space �( , , )a a s � �� �t i j,k j,k � ! �, 0, (4.11) � � � t a j,k j,k j� � 1, (4.12) � �tj,k as � 0.res z x a a s� � (4.13) 4.1.3 action and gauge symmetries the same theory can be described by the action � � � � � �� � � ����� �� � � � j k k n j n a a a a z x z x g nj k j , ,, ( , ) � 12 s g g i dta j k a j k a n j k � � � � � � �� 1 1 � , , , where the time-like wilson lines at the marked points are included. the action is gauge invariant with respect to the gauge group � ��� �� �smooth maps : gl� g n n, ( , ) the elements f ��� are smooth and have the same monodromies as the higgs field (4.1). the action is invariant with respect to the gauge transformations a f f f af� � �1 1� , � �f f1 , g g fa a a� , s sa a a af f� �( ) 1 , f f z za z xa � �( , ) . consider the infinitesimal gauge transformations v a a� �� �� [ , ], v� � � [ , ], v g g xa a a� �� ( ) , v s s x a a a� �� [ , ( ) ], � � lie( )�� . the hamiltonian f generating the gauge vector fields �� � df has the form f a z x z xa a a a n g n � � � � � �� � � �( [ , ] ( , )) , � s 1 . the moment map � : ( , , ) * ( )� �a liea s � � , � � �� � � � � � [ , ] ( , )a z x z xa a a a n s 1 . the gauss law (the moment constraints) takes the form � � � � � � �[ , ] ( , )a z x z xa a a a n s 1 . 
(4.14) upon imposing these constraints the residues of the higgs fields become equal to the spin variables by analogy res z x a a s� � with the yang-mills theory, where the higgs field corresponds to the electric field and sa are the analog of the electric charges. the reduced phase space � � red aa +� ( , , ) / ( s gauss law) (gauge fixing) defines the physical degrees of freedom, and the reduced phase space is the symplectic quotient � � � red aa� ( , , ) / / s � (4.15) 4.1.4 algebra-geometric approach the operator ��d acting on sections defines a holomorphic structure on the bundle e. a section s is holomorphic if ( )� �a s 0 the moment constraint (4.14) means that the space of sections of the higgs field over � g d\ is holomorphic. consider the set of holomorphic structures � �� � da on e. two holomorphic structure are called equivalent if the corresponding connections are gauge equivalent. the moduli space of holomorphic structures is the quotient � ��. generically the quotient has very singular structure. to have a reasonable topology one should consider the so-called stable bundles. the stable bundles are generic and we consider the space of connection � stable corresponding to the stable bundles. the quotient is called the moduli space of stable holomorphic bundles � � �( , , )n g n stable� . it is a finite-dimensional manifold. the tangent space to �( , , )n g n is isomorphic to h eg n 1( , ),� end . its dimension can be extracted from the riemann-roch theorem and for curves without marked points ( )n � 0 dim ( , ) dim ( , ) ( ) dimh e h e g g0 1 1� �end end� � � . for stable bundles and g �1 and dim ( , )h e0 1� end � and dim ( , , ) ( )� n g g n0 1 12� � for gl( , )n � , and dim ( , , ) ( )( )� n g g n0 1 12� � � for sl( , )n � . thus, in the absence of the marked points we should consider bundles over curves of genus g $ 2. but the curves of genus g � 0 and 1 are important for applications to integrable systems. including the marked points improves the situation. we extend the moduli space by providing an additional data at the marked points. consider an n-dimensional vector space v and choose a flag fl v v v vn� � � �( )1 2 � , where vj is a subspace in vj 1. note that a flag is a point in a homogeneous space called the flag variety fl n b�gl( , )� , where b is a borel subgroup. if ( , , )e en1 � is a basis in v and fl is a flag � �fl v a e v a e a e v vn� � � �1 11 1 2 21 1 22 2{ }, { },� , 10 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 48 no. 2/2008 then b is the subgroup of lower triangular matrices. the flag variety has the dimension 1 2 1n n( )� . the moduli space �( , , )n g n is the moduli space �( , , )n g 0 equipped with maps g na �gl( , )� of v to the fibers over the marked points v e xa � , preserving fl in v. in other words ga are defined up to the right multiplication of b and therefore we supply the moduli space �( , , )n g 0 with the structure of the flag variety gl( , )n b� at the marked points. we have a natural “forgetting” projection � : ( , , ) ( , , )� �n g n n g� 0 . the fiber of this projection is the product of copies of the flag varieties. bundles with this structure are called the quasi-parabolic bundles. the dimension of the moduli space of quasi-parabolic holomorphic bundles is dim ( , , ) ( , , ) ( )� �n g n n g nn n� �0 1 2 1 . for curves of genus g �1, dim ( , , )� n g n is independent on the degree of the bundles d e c e� �deg( ) (det )1 . 
in fact, we have a disjoint union of components labeled by the corresponding degrees of the bundles � �� ( )d� . for elliptic curves ( )g �1 one has dim ( , ) dim ( , )h e h e1 0� �end end� , and dim ( , )h e0 � end does depend on deg(e). namely, dim( ( , , , )) ( , )� n d n d1 0 � g.c.d. . (4.16) in this case the structure of the moduli space for trivial bundles (i.e. with deg( )e � 0) and, for example, for bundles with deg( )e �1 is different. now consider the higgs field . as we already mentioned, defines an endomorphism of the bundle e � � � �: ( , ) ( , )( ) , ( , ) , 0 1 0 g n g ne e� , s s dz� " . similarly, they can be described as sections of � � c g n d e k� " ( ) ,( , ) 0 end . here kd is the canonical class on � \ d that locally apart from d is represented as dz. rememeber that has poles at d. on the other hand, as it follows from the definition of the symplectic structure (4.4) on the set of pairs ( , ) a , that the higgs field plays the role of a “covector” with respect to vector a. in this way the higgs field is a section of the cotangent bundle t stable* � . the pair of the holomorphic vector bundle and the higgs field (e, ) is called the higgs bundle. the reduced phase space (4.15) is the moduli space of the quasi-parabolic higgs bundles. it is the cotangent bundle � � red t n g n d� * ( , , , ) . (4.17) due to the gauss law (4.14), the higgs fields are holomorphic on � \ d. then on the reduced space �red �� "h e kg n d 0( , ), end* . (4.18) a part of t n g n d* ( , , , )� comes from the cotangent bundle to the flag varieties t g b a*( ) located at the marked points. without the null section t g b a*( ) is isomorphic to a unipotent coadjoint orbit, while the null section is the trivial orbit. generic coadjoint orbits passing through a semi-simple element of gl( , )n � are affine spaces over t g b a*( ) . in this way we come to the moduli space of the quasi-parabolic higgs bundles [29]. it has dimension dim ( ) ( )� red n g n n n� � �2 1 2 12 (4.19) this formula is universal and valid also for g � 0 1, and does not depend on deg(e). at first glance, for g �1 this formula appears to contradict to (4.16). in fact, we have a residual gauge symmetry generated by a subgroup of the cartan group of gl( , )n � . the symplectic reduction with respect to this symmetry kill these degrees of freedom and we come to dim ( )� red n n n� �2 1 (see (4.6). we explain this mechanism on a particular example in section 4.2.2. formula (4.6) suggests that the phase spaces corresponding to bundles of different degrees may be symplectomorphic. we will see soon this is the case. it follows from (4.18) that �j g n d jh k� 0( , ), . in other words j are meromorphic forms on the curve with poles of the order j at the divisor d. let �jk be a basis of h kg n d j0( , ),� . then 1 1 j ij jk jk k n j � � � � . (4.20) the basis �jk in h dg n j1 1( \ , ),� " � introduced above is dual to the basis �jk � � � �jk lm j l k m g n� , � � . then the coefficients of the expansion (4.20) coincide with the integrals (4.7). the dimensions nj (4.6) can be calculated as dim ( , ),h kg n d j0 � . symplectic reduction preserves the involutivity (4.10) of the integrals (4.7). since (see(4.8), (4.9)) we come to integrable systems on the moduli space of the quasi-parabolic higgs bundles �red. for gl( , )n � the liouville torus is the jacobian of the spectral curve � (2.5). consider bundles with the structure group replaced by a reductive group g. 
The algebraic integrability for $g > 1$ and $G$ a classical simple group was proved in [1]; the case of the exceptional groups was considered in [30, 31].

4.1.5 Equations of motion on the reduced phase space

Let us fix a gauge $\bar A = \bar A_0$. For an arbitrary connection $\bar A$ define a gauge transform $f[\bar A]:\ \bar A \to \bar A_0$,
$$\bar A_0 = f^{-1}[\bar A]\,\big(\bar A + \bar\partial\big)\,f[\bar A].$$
Then $f[\bar A]$ is an element of the coset space $\mathcal G/\mathcal G_0$, where the subgroup $\mathcal G_0$ preserves the gauge fixing,
$$\mathcal G_0 = \{f_0 \mid \bar\partial f_0 + [\bar A_0, f_0] = 0\}.$$
The same gauge transformation brings the Higgs field to the form
$$L = f^{-1}[\bar A]\,\Phi\,f[\bar A].$$
The equations of motion (4.11) in terms of $L$ take the form of the Lax equation
$$\partial_{j,k}\,L = [M_{j,k},\,L], \qquad (4.21)$$
where $M_{j,k} = f^{-1}[\bar A]\,\partial_{j,k}\,f[\bar A]$. Therefore, after the reduction the Higgs field becomes the Lax matrix. Equations (4.21) describe the Hitchin integrable hierarchy. The matrix $M_{j,k}$ can be extracted from the second equation in (4.12),
$$\bar\partial M_{j,k} + [\bar A_0, M_{j,k}] - \partial_{j,k}\bar A_0 + \mu_{j,k}\,L^{j-1} = 0. \qquad (4.22)$$
The Gauss law restricted to $\mathcal R^{red}$ takes the form
$$\bar\partial L + [\bar A_0, L] = \sum_{a=1}^{k} S^a\,\delta^2(x - x_a). \qquad (4.23)$$
Thus, the Lax matrix is the matrix Green function of the operator $\bar\partial_{\bar A_0}$ on $\Sigma_{g,k}$ acting in the space $\Omega^{(1,0)}(\Sigma_{g,k}, \mathrm{End}\,E)$. The linear system corresponding to the integrable hierarchy takes the following form. Consider a section $\psi$ of the vector bundle $E$. The section is called a Baker-Akhiezer function if it is a solution of the linear system
$$1.\ (\bar\partial + \bar A_0)\,\psi = 0,\qquad 2.\ (\lambda + L)\,\psi = 0,\qquad 3.\ (\partial_{j,k} + M_{j,k})\,\psi = 0.$$
The first equation means that $\psi$ is a holomorphic section. Compatibility of the first and the second equation is the Gauss law (4.23); compatibility of the first and the last equation gives the Lax equations (4.21). In terms of the Lax matrix the integrals of motion $I_{j,k}$ are expressed by the integrals (4.7),
$$I_{j,k} = \frac1j\int_{\Sigma_{g,k}} \mu_{jk}\,\mathrm{tr}\big(L^j(x, z)\big), \qquad (4.24)$$
or by the expansion
$$\frac1j\,\mathrm{tr}\,L^j = \sum_{k=1}^{n_j} I_{j,k}\,\chi_{jk}. \qquad (4.25)$$
The moduli space of the Higgs bundles (4.17) is parameterized by the pairs $(\bar A_0, L)$. The projection (2.2),
$$T^*\mathcal M(n, g, k)\ \to\ B = \bigoplus_{j=1}^{n} H^0(\Sigma\setminus D,\,K_D^j),$$
is called the Hitchin fibration. An illustrative example of the Hitchin construction is Higgs bundles over elliptic curves. These cases will be described explicitly in the following subsections.

4.2 N-body elliptic Calogero-Moser system (ECMS)

4.2.1 Description of the system

Let $\Sigma_\tau$ be the elliptic curve $\mathbb C/(\mathbb Z + \tau\mathbb Z)$, $\mathrm{Im}\,\tau > 0$. The phase space $\mathcal R^{ECM}$ of the ECMS is described by $n$ complex coordinates and their momenta:
$$\mathbf u = (u_1,\dots,u_n),\ u_j\in\mathbb C \ \text{(coordinates of the particles)},\qquad \mathbf v = (v_1,\dots,v_n) \ \text{(momentum vector)},$$
with the Poisson brackets $\{v_j, u_k\} = \delta_{jk}$. The Hamiltonian takes the form
$$H^{CM} = \frac12\,\mathbf v^2 + \nu^2\sum_{j<k}\wp(u_j - u_k). \qquad (4.26)$$
Here $\nu^2$ is a coupling constant and $\wp(z)$ is the Weierstrass function. It is a double periodic meromorphic function, $\wp(z+1) = \wp(z+\tau) = \wp(z)$, with a second-order pole $\wp(z)\sim z^{-2}$, $z\to 0$.
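Before specializing, note that the Lax form (4.21) makes the conservation of all $\mathrm{tr}\,L^j$ manifest. This can be checked numerically; the sketch below (an illustration with a randomly chosen constant $M$ rather than a genuine Hitchin flow) integrates $\dot L = [M, L]$ and verifies that the spectrum of $L$ is preserved.

```python
# Isospectrality of a Lax equation dL/dt = [M, L]: numerical illustration.
import numpy as np

rng = np.random.default_rng(0)
n = 4
L = rng.standard_normal((n, n))          # initial Lax matrix
M = rng.standard_normal((n, n))          # (constant) generator of the flow

def rhs(L):
    return M @ L - L @ M                 # the commutator [M, L]

dt, steps = 1e-3, 2000
ev0 = np.sort_complex(np.linalg.eigvals(L))
for _ in range(steps):                   # 4th-order Runge-Kutta
    k1 = rhs(L); k2 = rhs(L + 0.5*dt*k1)
    k3 = rhs(L + 0.5*dt*k2); k4 = rhs(L + dt*k3)
    L = L + dt/6*(k1 + 2*k2 + 2*k3 + k4)
ev1 = np.sort_complex(np.linalg.eigvals(L))
print(np.max(np.abs(ev1 - ev0)))         # tiny: the spectrum is conserved
```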
The system has the Lax representation [39] with the Lax matrix
$$L^{CM} = V + X,\qquad V = \mathrm{diag}(v_1,\dots,v_n), \qquad (4.27)$$
$$X_{jk} = \nu\,\phi(u_j - u_k,\,z)\ \ (j\neq k),\qquad \mathbf e(x) = \exp(2\pi i x), \qquad (4.28)$$
where
$$\phi(u, z) = \frac{\vartheta(u+z)\,\vartheta'(0)}{\vartheta(u)\,\vartheta(z)} \qquad (4.29)$$
and
$$\vartheta(z|\tau) = q^{1/8}\sum_{n\in\mathbb Z}(-1)^n\,q^{n(n+1)/2}\,\mathbf e\big((n+\tfrac12)z\big),\qquad q = \mathbf e(\tau), \qquad (4.30)$$
is the standard theta-function with a simple zero at $z = 0$ and the monodromies
$$\vartheta(z+1) = -\vartheta(z),\qquad \vartheta(z+\tau) = -q^{-1/2}\,\mathbf e(-z)\,\vartheta(z). \qquad (4.31)$$
Then it follows from (4.31) that
$$\phi(u, z+1) = \phi(u, z),\qquad \phi(u, z+\tau) = \mathbf e(-u)\,\phi(u, z), \qquad (4.32)$$
and $\phi(u, z)$ has a simple pole at $z = 0$,
$$\mathrm{Res}_{z=0}\,\phi(u, z) = 1. \qquad (4.33)$$

4.2.2 ECMS and Higgs bundles [6, 7]

To describe the ECMS as a Hitchin system, consider a vector bundle $E$ of rank $n$ and degree 0 over an elliptic curve $\Sigma_{1,1}$ with one marked point. We assume that the curve is isomorphic to $\Sigma_\tau = \mathbb C/(\mathbb Z+\tau\mathbb Z)$. The quasi-parabolic Higgs bundle has coordinates
$$\big(\Phi(z,\bar z),\ \bar A(z,\bar z),\ \mathbf S\big),\qquad \Phi, \bar A \in \mathfrak{gl}(n,\mathbb C),\quad \mathbf S\in\mathcal O,$$
where $\mathcal O$ is a degenerate orbit at the marked point $z = 0$,
$$\mathcal O = \{\mathbf S = g^{-1}\mathbf S^0 g \mid g\in\mathrm{GL}(n,\mathbb C)\},\qquad \mathbf S^0 = \nu J,$$
and $J$ is the matrix (3.17). The orbit has dimension $\dim\mathcal O = 2n - 2$. For degree-zero bundles the monodromies around the two fundamental cycles can be chosen as $Q_1 = \mathrm{Id}$ and $\Lambda_1 = \mathbf e(\mathbf u)$, where $\mathbf e(\mathbf u) = \mathrm{diag}(\exp 2\pi i u_1,\dots,\exp 2\pi i u_n)$. A section with these monodromies is
$$s^t = (s_1,\dots,s_n),\qquad s_j = \phi(u_j, z), \qquad (4.34)$$
where $\phi(u, z)$ is (4.29); it follows from (4.32) that this section has the prescribed monodromies. For the fields and the gauge group we have the same monodromies:
$$\bar A(z+1) = \bar A(z),\qquad \Phi(z+1) = \Phi(z),$$
$$\bar A(z+\tau) = \mathbf e(\mathbf u)\,\bar A(z)\,\mathbf e(-\mathbf u),\qquad \Phi(z+\tau) = \mathbf e(\mathbf u)\,\Phi(z)\,\mathbf e(-\mathbf u),$$
$$f(z+1,\bar z+1) = f(z,\bar z),\qquad f(z+\tau,\bar z+\bar\tau) = \mathbf e(\mathbf u)\,f(z,\bar z)\,\mathbf e(-\mathbf u).$$
It can be proved that for bundles of degree zero a generic connection is trivial, $\bar A = -\bar\partial f\cdot f^{-1}$, and therefore one can pass to the gauge
$$\bar A \to \bar A_0 = 0. \qquad (4.35)$$
This means that stable bundles $E$ of rank $n$ decompose into a direct sum of line bundles, $E = \oplus_{j=1}^n\mathcal L_j$, with the sections (4.34). The elements $u_j$ are points of the Jacobian $\mathrm{Jac}(\Sigma_\tau)$; they play the role of coordinates, and thereby $\Sigma_\tau\sim\mathrm{Jac}(\Sigma_\tau)$. This gauge fixing is invariant with respect to the constant diagonal subgroup $\mathcal D_0$, which acts on the spin variables $\mathbf S\in\mathcal O$. This action is Hamiltonian; its moment equation is $\mathrm{diag}(\mathbf S) = 0$, and this condition dictates the form $\mathbf S^0 = \nu J$. The gauge fixing allows one to kill the degrees of freedom related to the spin variables, because $\dim\mathcal O = 2(n-1)$ and $\dim\mathcal D_0 = n - 1$; thus the symplectic quotient is a point, $\dim(\mathcal O/\!/\mathcal D_0) = 0$.

Remark 4.1. One can choose an arbitrary orbit $\tilde{\mathcal O}$. In this case we come to the symplectic quotient $\tilde{\mathcal O}/\!/\mathcal D_0$; it has dimension $\dim(\tilde{\mathcal O}) - 2(n-1)$.

Now consider solutions of the moment equation (4.23) with the prescribed monodromies and prove that $\Phi$ becomes the Lax matrix, $\Phi \to L^{CM} = V + X$ (4.27). Since $\bar A_0 = 0$, $V$ does not contribute to (4.23) and its elements are free parameters; we identify them with the momenta of the particles, $V = \mathrm{diag}(v_1,\dots,v_n)$. Due to the term with the delta-function in (4.23), the off-diagonal part must have a simple pole with residue $\nu J$ and the prescribed monodromies. It follows from (4.32) and (4.33) that $X_{jk}$ satisfies these conditions, and they uniquely fix its matrix elements. The reduced space is described by the variables $\mathbf v$ and $\mathbf u$.
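The quasi-periodicity (4.31)-(4.32) and the residue (4.33) can be verified numerically directly from the series (4.30). The sketch below (our own check; the series is truncated at $|n|\le 25$, an arbitrary cutoff) does this in plain Python.

```python
# Numerical check of (4.31)-(4.33) for the theta series (4.30).
import cmath

TAU = 0.3 + 1.1j
Q = cmath.exp(2j * cmath.pi * TAU)
N = 25                                    # truncation of the series

def theta(z, d=0):
    """Theta series (4.30); d=1 returns the z-derivative."""
    s = 0
    for n in range(-N, N + 1):
        c = (-1)**n * Q**(n*(n+1)/2) * cmath.exp(2j*cmath.pi*(n+0.5)*z)
        s += c * (2j*cmath.pi*(n+0.5))**d
    return Q**(1/8) * s

def phi(u, z):
    """The function (4.29)."""
    return theta(u + z) * theta(0, d=1) / (theta(u) * theta(z))

u, z = 0.17 + 0.05j, 0.4 - 0.2j
print(abs(theta(z + 1) + theta(z)))                                # ~0, (4.31)
print(abs(phi(u, z + 1) - phi(u, z)))                              # ~0, (4.32)
print(abs(phi(u, z + TAU) - cmath.exp(-2j*cmath.pi*u)*phi(u, z)))  # ~0, (4.32)
print(abs(1e-6 * phi(u, 1e-6) - 1))                                # ~0, (4.33)
```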
The symplectic form on the reduced space,
$$\omega\big|_{(L^{CM},\,\bar A_0 = 0)} = \sum_{j=1}^n dv_j\wedge du_j,$$
leads to the brackets $\{v_j, u_k\} = \delta_{jk}$. From the general construction the integrals of motion come from the expansion of $\mathrm{tr}(L^{CM})^j(\mathbf v,\mathbf u, z)$. These are double periodic meromorphic functions with poles at $z = 0$; the space of such functions is finite-dimensional and is generated by the Weierstrass function and its derivatives, which are the elements of the basis $\chi_{jk}$ in (4.25):
$$\frac1j\,\mathrm{tr}(L^{CM})^j(\mathbf v,\mathbf u, z) = I_{0,j} + I_{2,j}\,\wp(z) + \dots + I_{j,j}\,\wp^{(j-2)}(z). \qquad (4.36)$$
There are $\tfrac{n(n+1)}2 + 1$ integrals in all; due to the special choice of the orbit only $n$ of them are independent. In particular,
$$\tfrac12\,\mathrm{tr}(L^{CM})^2(\mathbf v,\mathbf u, z) = H^{CM} + c\,\wp(z).$$
For generic orbits (see Remark 4.1) the Hamiltonian takes the form
$$H^{CM} = \tfrac12\,\mathbf v^2 + \sum_{j\neq k} S_{jk}S_{kj}\,\wp(u_j - u_k);$$
this is the ECMS with spin [34]. Note that the $I_{j,j}$ are the Casimir functions defining a generic orbit. Therefore we have
$$\frac{n(n+1)}2 + 1 - n = \frac{n(n-1)}2 + 1$$
commuting integrals of motion. The number of independent commuting integrals is always equal to $\tfrac12\dim(\mathcal R^{red})$.

4.3 Elliptic top (ET) on GL(n, C)

4.3.1 Description of the system

The elliptic top is an example of an Euler-Arnold top related to the group $\mathrm{GL}(n,\mathbb C)$. Its phase space is a coadjoint orbit of $\mathrm{GL}(n,\mathbb C)$, and the Hamiltonian is a quadratic form on the coalgebra $\mathfrak g^* = \mathfrak{gl}(n,\mathbb C)^*$. The ET is an integrable Euler-Arnold top. Before defining the Hamiltonian we introduce a special basis in the Lie algebra $\mathfrak{gl}(n,\mathbb C)$. Define the finite set
$$\mathbb Z_n^{(2)} = \mathbb Z/n\mathbb Z\oplus\mathbb Z/n\mathbb Z,\qquad \tilde{\mathbb Z}_n^{(2)} = \mathbb Z_n^{(2)}\setminus(0,0),$$
and let $\mathbf e_n(x) = \exp\frac{2\pi i}{n}x$. Then a basis is generated by the $n^2-1$ matrices
$$T_\alpha = \frac{n}{2\pi i}\,\mathbf e_n\Big(\frac{\alpha_1\alpha_2}{2}\Big)\,Q^{\alpha_1}\Lambda^{\alpha_2},\qquad \alpha = (\alpha_1,\alpha_2)\in\tilde{\mathbb Z}_n^{(2)},$$
where
$$Q = \mathrm{diag}\big(1, \mathbf e_n(1),\dots,\mathbf e_n(n-1)\big), \qquad (4.37)$$
$$\Lambda = \sum_{j=1}^n E_{j,\,j+1\ (\mathrm{mod}\ n)}. \qquad (4.38)$$
The commutation relations in this basis have the simple form
$$[T_\alpha, T_\beta] = \frac n\pi\,\sin\Big(\frac\pi n\,\alpha\times\beta\Big)\,T_{\alpha+\beta},\qquad \alpha\times\beta = \alpha_2\beta_1 - \alpha_1\beta_2.$$
Let $\mathbf S = \sum_{\alpha\in\tilde{\mathbb Z}_n^{(2)}} S_\alpha\,T_\alpha\in\mathfrak g^*$. The Poisson brackets for the linear functions $S_\alpha$ come from the Lie brackets:
$$\{S_\alpha, S_\beta\} = \frac n\pi\,\sin\Big(\frac\pi n\,\alpha\times\beta\Big)\,S_{\alpha+\beta}.$$
The phase space $\mathcal R^{et}$ of the ET is a coadjoint orbit,
$$\mathcal R^{et}\sim\tilde{\mathcal O} = \{\mathbf S = g^{-1}\mathbf S^0 g \mid g\in\mathrm{GL}(n,\mathbb C)\}.$$
A particular orbit passes through $\mathbf S^0 = \nu J$, as for the spinless ECMS. The Euler-Arnold Hamiltonian is defined by the quadratic form
$$H^{et} = -\tfrac12\,\mathrm{tr}\big(\mathbf S\,J(\mathbf S)\big),$$
where $J$ is diagonal in the basis $T_\alpha$:
$$J(\mathbf S):\ S_\alpha\mapsto \wp_\alpha\,S_\alpha,\qquad \wp_\alpha = \wp\Big(\frac{\alpha_1 + \alpha_2\tau}{n}\Big),\qquad \alpha\in\tilde{\mathbb Z}_n^{(2)}.$$
The equations of motion corresponding to this Hamiltonian take the form
$$\dot{\mathbf S} = \{H^{et}, \mathbf S\} = [J(\mathbf S),\,\mathbf S],\qquad \dot S_\alpha = \frac n\pi\sum_{\beta\in\tilde{\mathbb Z}_n^{(2)}} S_\beta\,S_{\alpha-\beta}\,\wp_\beta\,\sin\Big(\frac\pi n\,\alpha\times\beta\Big).$$

4.3.2 Field theory and Higgs bundles

The curve $\Sigma_{1,1}$ is the same as for the Calogero-Moser system. Consider a vector bundle $E$ of rank $n$ and degree one over $\Sigma_{1,1}$. It is described by its sections $s = (s_1(z,\bar z),\dots,s_n(z,\bar z))$ with the monodromies
$$s^t(z+1,\bar z+1) = Q\,s^t(z,\bar z),\qquad s^t(z+\tau,\bar z+\bar\tau) = \tilde\Lambda(z)\,s^t(z,\bar z), \qquad (4.39)$$
where $Q$ is (4.37), $\tilde\Lambda(z) = -\mathbf e_n\big({-z}-\tfrac\tau2\big)\,\Lambda$, and $\Lambda$ is (4.38). Since $\det Q = \pm1$ and $\det\tilde\Lambda = \pm\,\mathbf e\big({-z}-\tfrac\tau2\big)$, the determinants of the transition matrices have the same quasi-periods as the Jacobi theta-function. The theta-function has a simple zero on $\Sigma_{1,1}$; thereby the vector bundle $E$ has degree one. The Higgs bundle has the same field content as for the ECMS: $(\Phi, \bar A, \mathbf S)$, with $\Phi, \bar A\in\mathfrak{gl}(n,\mathbb C)$ and $\mathbf S\in\mathcal O$.
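Returning briefly to the basis (4.37)-(4.38): its commutation relations can be checked by direct matrix multiplication. The sketch below (our own verification; the sign of the pairing $\alpha\times\beta$ matches the conventions fixed above, and the indices are chosen so that no mod-$n$ reduction of $\alpha+\beta$ is needed) does this for $n = 4$.

```python
# Check of the sin-algebra relations for T_a built from Q and Lam.
import numpy as np

n = 4
eps = np.exp(2j * np.pi / n)
Q = np.diag([eps**k for k in range(n)])
Lam = np.roll(np.eye(n), 1, axis=1)      # E_{j, j+1 mod n}

def T(a1, a2):
    """T_alpha = (n / 2 pi i) e_n(a1 a2 / 2) Q^a1 Lam^a2."""
    return n/(2j*np.pi) * eps**(a1*a2/2) \
        * np.linalg.matrix_power(Q, a1) @ np.linalg.matrix_power(Lam, a2)

a, b = (1, 2), (2, 1)
lhs = T(*a) @ T(*b) - T(*b) @ T(*a)
cross = a[1]*b[0] - a[0]*b[1]            # alpha x beta = a2 b1 - a1 b2
rhs = n/np.pi * np.sin(np.pi*cross/n) * T(a[0]+b[0], a[1]+b[1])
print(np.max(np.abs(lhs - rhs)))         # ~1e-15
```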
The orbit $\mathcal O = \{\mathbf S = g^{-1}\mathbf S^0 g \mid g\in\mathrm{GL}(n,\mathbb C)\}$ is located at the marked point $z = 0$. It follows from (4.39) that the fields $\Phi$, $\bar A$ have the monodromies
$$\bar A(z+1) = Q\,\bar A(z)\,Q^{-1},\qquad \Phi(z+1) = Q\,\Phi(z)\,Q^{-1},$$
$$\bar A(z+\tau) = \Lambda\,\bar A(z)\,\Lambda^{-1},\qquad \Phi(z+\tau) = \Lambda\,\Phi(z)\,\Lambda^{-1}.$$
The group $\mathcal G^{\mathbb C} = \{f\}$ of automorphisms of $E$ should have the same monodromies,
$$f(z+1) = Q\,f(z)\,Q^{-1},\qquad f(z+\tau) = \Lambda\,f(z)\,\Lambda^{-1}.$$
Due to the monodromy conditions the generic field $\bar A$ is gauge equivalent to the trivial one, $f\,\bar A\,f^{-1} + f\,\bar\partial f^{-1} = 0$; therefore
$$\bar A = -\bar\partial f[\bar A]\cdot f^{-1}[\bar A]. \qquad (4.40)$$
This allows us to choose $\bar A = 0$ as an appropriate gauge. It means that there are no moduli of holomorphic vector bundles; more precisely, the holomorphic moduli are related only to the quasi-parabolic structure of $E$, i.e. to the spin variables $\mathbf S$. The monodromies of the gauge matrices prevent the existence of nontrivial residual gauge symmetries. Let $f[\bar A](z,\bar z)$ be a solution of (4.40), and consider the transformation of $\Phi$ by solutions of (4.40):
$$L^{et} = f[\bar A](z,\bar z)\,\Phi(z,\bar z)\,f^{-1}[\bar A](z,\bar z). \qquad (4.41)$$
The moment constraint (4.14) takes the form
$$\bar\partial L^{et} = \mathbf S\,\delta^2(z,\bar z),$$
and the solution takes the form
$$L^{et} = \sum_{\alpha\in\tilde{\mathbb Z}_n^{(2)}} S_\alpha\,\varphi_\alpha(z)\,T_\alpha,\qquad \varphi_\alpha(z) = \mathbf e_n(\alpha_2 z)\,\phi\Big(\frac{\alpha_1+\alpha_2\tau}{n},\,z\Big).$$
This Lax matrix was found in [40] using another approach: it is the Lax matrix of the vertex spin chain. The Lax matrix is meromorphic on $\Sigma_{1,1}$ with a simple pole, $\mathrm{Res}_{z=0}\,L^{et} = \mathbf S$. The monodromies of $\varphi_\alpha(z)$ are read off from (4.32),
$$\varphi_\alpha(z+1) = \mathbf e_n(\alpha_2)\,\varphi_\alpha(z),\qquad \varphi_\alpha(z+\tau) = \mathbf e_n(-\alpha_1)\,\varphi_\alpha(z),$$
so $L^{et}$ has the prescribed monodromies. The reduced phase space $\mathcal R^{et}$ is the coadjoint orbit
$$\mathcal R^{et} = \{\mathbf S = g^{-1}\mathbf S^0 g\},\qquad \mathbf S = \sum_{\alpha\in\tilde{\mathbb Z}_n^{(2)}} S_\alpha\,T_\alpha\in\mathfrak g^*.$$
The symplectic form on $\mathcal R^{et}$ is the Kirillov-Kostant form (4.5). For the particular choice of the orbit passing through $\nu J$, its dimension coincides with the dimension of the phase space of the spinless ECMS:
$$\dim\mathcal R^{et} = \dim\mathcal R^{cms} = 2n - 2.$$
This is not accidental, and we prove below that $\mathcal R^{cm}$ is symplectomorphic to $\mathcal R^{et}$. Since the traces of powers of $L^{et}$ are double periodic and have poles at $z = 0$, the integrals of motion come from the expansion (see (4.36))
$$\mathrm{tr}\big(L^{et}(z)\big)^k = I_{0,k} + I_{2,k}\,\wp(z) + \dots + I_{k,k}\,\wp^{(k-2)}(z);$$
in particular, $\tfrac12\,\mathrm{tr}(L^{et})^2 = H^{et} + c\,\wp(z)$. The coefficients $I_{s,k}$ are in involution, $\{I_{s,k}, I_{j,m}\} = 0$; in particular, all functions $I_{s,k}$ Poisson commute with the Hamiltonian $H^{et}$. Therefore they play the role of conservation laws of the elliptic rotator hierarchy on $\mathrm{GL}(n,\mathbb C)$. We have a tower of $\tfrac{n(n+1)}2 - 1$ independent integrals of motion:
$$I_{0,2},\ I_{2,2};\qquad I_{0,3},\ I_{2,3},\ I_{3,3};\qquad \dots\qquad I_{0,n},\ I_{2,n},\ \dots,\ I_{n,n}.$$
There are no integrals $I_{1,k}$, because there are no double periodic meromorphic functions with one simple pole. The integrals $I_{k,k}$, $k = 2, 3, \dots, n$, are the Casimir functions corresponding to the coadjoint orbit
$$\mathcal R^{et} = \{\mathbf S\in\mathfrak{gl}(n,\mathbb C)^*,\ \mathbf S = g^{-1}\mathbf S^0 g\}.$$
The conservation laws $I_{s,k}$ generate commuting flows on $\mathcal R^{rot}$:
$$\partial_{s,k}\,\mathbf S = \{I_{s,k},\,\mathbf S\},\qquad \partial_{s,k} := \partial_{t_{s,k}}.$$

4.4 Symplectic Hecke correspondence

Let $E$ and $\tilde E$ be two bundles over $\Sigma$ of the same rank. Assume that there is a map $\Xi: E\to\tilde E$ (more precisely, a map of the spaces of sections $\Xi: \Gamma(E)\to\Gamma(\tilde E)$) such that it is an isomorphism on the complement of $z_0$ and has a one-dimensional cokernel at $z_0$:
$$0\ \to\ E\ \xrightarrow{\ \Xi\ }\ \tilde E\ \to\ \mathbb C_{z_0}\ \to\ 0.$$
The map $\Xi$ is called an upper modification of the bundle $E$ at the point $z_0$.
Let $w = z - z_0$ be a local coordinate in a neighborhood of $z_0$. We represent $E$ locally as a sum of line bundles, $E = \oplus_{j=1}^n\mathcal L_j$, with holomorphic sections
$$s = (s_1, s_2, \dots, s_n). \qquad (4.42)$$
After the modification we come to the bundle $\tilde E = \oplus_{j=1}^{n-1}\mathcal L_j\oplus\big(\mathcal L_n\otimes\mathcal O(z_0)\big)$. The sections of $\tilde E$ are represented locally as
$$\tilde s = \big(g_1(w)\,s_1,\ \dots,\ g_{n-1}(w)\,s_{n-1},\ w^{-1}g_n(w)\,s_n\big),\qquad g_j(0)\neq 0.$$
In this basis the upper modification at the point $z_0$ is represented by the matrix
$$\Xi = \begin{pmatrix}\mathrm{Id}_{n-1} & 0\\ 0 & w^{-1}\end{pmatrix}.$$
This is a modification of order 1, since it increases the degree of $E$:
$$\deg(\tilde E) = \deg(E) + \deg(\mathcal O(z_0)) = \deg(E) + 1. \qquad (4.43)$$
On the complement of the point $z_0$, consider the map $\Xi^-: \tilde E\to E$ such that $\Xi^-\,\Xi = \mathrm{Id}$; it defines the lower modification at the point $z_0$. The upper modification $\Xi$ is represented by the vector $(0,\dots,0,1)$ and $\Xi^-$ by $(0,\dots,0,-1)$. For Higgs bundles the modification acts as $(E, \Phi)\to(\tilde E, \tilde\Phi)$,
$$\tilde\Phi = \Xi\,\Phi\,\Xi^{-1},\qquad \tilde{\bar A} = \Xi\,\bar A\,\Xi^{-1} - \bar\partial\Xi\cdot\Xi^{-1}. \qquad (4.44)$$
The Higgs fields $\Phi$ and $\tilde\Phi$ should be holomorphic with the prescribed simple poles at the marked points, and this holomorphy puts restrictions on the form of $\Phi$. Consider the upper modification $\Xi\sim(0,\dots,0,1)$ and assume that in the above-defined basis
$$\Phi = \begin{pmatrix} a & b\\ c & d\end{pmatrix},$$
where $a$ is a block of order $n-1$. Then
$$\Xi\,\Phi\,\Xi^{-1} = \begin{pmatrix} a & w\,b\\ w^{-1}c & d\end{pmatrix}.$$
We see that a generic Higgs field acquires a first-order pole after the modification. To escape this, we assume that the covector $\xi = (0,\dots,0,1)$ is a left eigenvector of $\Phi(z_0)$, i.e. the block $c$ vanishes at $w = 0$. Then the Higgs field $\tilde\Phi$ does not have a pole:
$$\tilde\Phi = \begin{pmatrix} a & w\,b\\ w^{-1}c & d\end{pmatrix},\qquad c = O(w).$$
In other words, the matrix elements $\Phi_{nj}$ should have a first-order zero at $z_0$. In this way the upper modification is lifted from $E$ to the Higgs bundle, and after the reduction we come to the map (see (4.17))
$$T^*\mathcal M(n, g, k, d)\ \to\ T^*\mathcal M(n, g, k, d+1).$$
We call this the upper symplectic Hecke correspondence (SHC). Generically the modified bundle $\tilde E$ is represented locally as a sum of line bundles $\tilde E = \oplus_{j=1}^n\big(\mathcal L_j\otimes\mathcal O(m_j z_0)\big)$ with holomorphic sections
$$\tilde s = (\tilde s_1,\dots,\tilde s_n) = \big(w^{-m_1}g_1 s_1,\ w^{-m_2}g_2 s_2,\ \dots,\ w^{-m_n}g_n s_n\big). \qquad (4.45)$$
It has the degree
$$\deg(\tilde E) = \deg(E) + \sum_{j=1}^n m_j.$$
This modification is represented by the vector $(m_1,\dots,m_n)$. Remember that the Higgs field is an endomorphism of $E$, $s\mapsto\Phi s$, and near $z_0$ it acts as $(\Phi s)_j = \Phi_{jk}\,s_k$. Similarly, the modified Higgs field acts on sections of the modified bundle $\tilde E$, $\tilde s\mapsto\tilde\Phi\tilde s$, $(\tilde\Phi\tilde s)_j = \tilde\Phi_{jk}\,\tilde s_k$. Then it follows from (4.45) that
$$\tilde\Phi_{jk}(w) = w^{-m_j}\,g_j(w)\,\Phi_{jk}\,g_k^{-1}(w)\,w^{m_k}.$$
Since $\tilde\Phi$ is holomorphic and $g_j(0)\neq 0$, the combination $\Phi_{jk}\,(z-z_0)^{m_k - m_j}$ must be regular at $z = z_0$. If we order $m_1\ge m_2\ge\dots\ge m_n$, then the number of parameters of the endomorphisms is $\sum_{j<k}(m_j - m_k)$. In the general case
$$T^*\mathcal M(n, g, k, d)\ \to\ T^*\mathcal M\Big(n, g, k,\ d + \sum_{j=1}^n m_j\Big).$$
If $\sum_{j=1}^n m_j = 0$, the SHC does not change the topological type of the bundle; such an SHC defines a Bäcklund transformation of the integrable hierarchy.

4.5 Symplectic Hecke correspondence R^{CM} -> R^{ET} [11]

We work directly with the Lax matrices: $L^{et}\,\Xi = \Xi\,L^{cm}$. The modification matrix should intertwine the multipliers corresponding to the fundamental cycles,
$$\Xi(z+1,\tau) = Q\,\Xi(z,\tau), \qquad (4.46)$$
$$\Xi(z+\tau,\tau) = \tilde\Lambda(z)\,\Xi(z,\tau)\,\mathrm{diag}\big(\mathbf e(-u_j)\big). \qquad (4.47)$$
Consider the modification at $z = 0$. The Lax matrix of the CMS has a first-order pole,
$$L^{cm}\ \sim\ \frac1z\,\nu J.$$
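Before analyzing this pole, the regularity mechanism of section 4.4 can be made explicit for $n = 2$. In the sketch below (a sympy illustration with $\Xi = \mathrm{diag}(1, w^{-1})$ and a hypothetical first-order zero built into $\Phi_{21}$), the conjugated Higgs field stays regular.

```python
# Pole cancellation under an upper modification, n = 2 (illustration).
import sympy as sp

w = sp.symbols('w')
a, b, d, c1 = sp.symbols('a b d c1')
c = w * c1                               # first-order zero of Phi_{21}
Phi = sp.Matrix([[a, b], [c, d]])
Xi = sp.diag(1, 1/w)                     # the modification matrix
Phi_mod = sp.simplify(Xi * Phi * Xi.inv())
print(Phi_mod)                           # [[a, b*w], [c1, d]] -- no pole
```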
The residue of this pole has the eigenvector $\xi^t = (1,\dots,1)$ with the eigenvalue $\nu(n-1)$. The matrix $\Xi$ satisfying (4.46) and (4.47) and annihilating this vector at $z = 0$ has the form
$$\Xi(z) = \tilde\Xi(z; u_1,\dots,u_n)\ \mathrm{diag}\Big(\prod_{l\neq j}\vartheta(u_j - u_l,\tau)\Big)_{j=1,\dots,n}^{-1},$$
$$\tilde\Xi_{ij}(z; u_1,\dots,u_n) = \vartheta\!\begin{bmatrix}\tfrac in - \tfrac12\\[2pt] \tfrac n2\end{bmatrix}\!\big(z - n u_j,\ n\tau\big),$$
where $\vartheta\big[{a\atop b}\big](z,\tau)$ is the theta-function with a characteristic. The determinant of $\tilde\Xi$ can be calculated explicitly,
$$\det\tilde\Xi(z; u_1,\dots,u_n)\ \sim\ \frac{\vartheta(z,\tau)}{\eta(\tau)}\,\prod_{k<l}\frac{\vartheta(u_l - u_k,\tau)}{\eta(\tau)},$$
where $\eta(\tau) = q^{1/24}\prod_{n\ge1}(1-q^n)$ is the Dedekind function. The determinant has a simple zero at $z = 0$, and therefore $\Xi$ is degenerate there. We use the modification to write down the interrelations between the coordinates and momenta of the Calogero-Moser particles and the orbit variables of the elliptic top in the $\mathrm{SL}(2,\mathbb C)$ case. They have the structure
$$s_a = v\,A_a(u,\tau) + \nu\,B_a(u,\tau),\qquad a = 1, 2, 3, \qquad (4.48)$$
where the coefficients $A_a$ and $B_a$ are ratios built from the theta functions with characteristics $\vartheta_{10}$, $\vartheta_{01}$, $\vartheta_{00}$ and their derivatives, evaluated at $0$ and at $2u$; the explicit formulas are given in [11]. Here
$$\vartheta\!\begin{bmatrix}a\\ b\end{bmatrix}(z,\tau) = \sum_{n\in\mathbb Z}\mathbf e\Big(\frac\tau2\,(n+a)^2 + (n+a)(z+b)\Big),\qquad \vartheta_{10} = \vartheta\!\begin{bmatrix}\tfrac12\\ 0\end{bmatrix},\quad \vartheta_{01} = \vartheta\!\begin{bmatrix}0\\ \tfrac12\end{bmatrix},\quad \vartheta_{00} = \vartheta\!\begin{bmatrix}0\\ 0\end{bmatrix}.$$
These relations provide the Darboux coordinates $(v, u)$ on the coadjoint $\mathrm{SL}(2,\mathbb C)$-orbit $s_1^2 + s_2^2 + s_3^2 = \nu^2$. It turns out that this modification is equivalent to a twist of r-matrices: namely, it describes the passage from the dynamical r-matrix of the IRF models to the vertex r-matrix [25, 26]. We do not discuss this aspect of the SHC here.

5 4d theories

5.1 Self-dual YM equations and Hitchin equations

5.1.1 2-d self-dual equations

Consider a rank $n$ complex vector bundle $E$ over $\mathbb R^4$ with coordinates $x = (x^0, x^1, x^2, x^3)$. Assume that the space of sections is equipped with a nondegenerate Hermitian metric $h$, $h = h^\dagger$, satisfying the condition
$$d\,h(x, y) = h(\nabla x, y) + h(x, \nabla y),$$
where $\nabla$ is a connection on $E$. If this compatibility holds, then there exist connections $\nabla = \sum_j(\partial_j + A_j)\,dx^j$ such that
$$\mathcal A^\dagger = -h\,\mathcal A\,h^{-1} + dh\cdot h^{-1},\qquad \mathcal A = \sum_{j=0}^3 A_j\,dx^j.$$
In this situation the transition functions are reduced to the unitary group $\mathrm{SU}(n)\subset\mathrm{GL}(n,\mathbb C)$. Let $F(\mathcal A)\in\Omega^2(\mathbb R^4, \mathfrak{su}(n))$ be the curvature, $F_{ij} = [\nabla_i, \nabla_j]$, or $F(\mathcal A) = d\mathcal A + \mathcal A\wedge\mathcal A$; here $\mathfrak{su}(n) = \{X \mid X = -h\,X^\dagger h^{-1}\}$. The self-duality equation $F = {*}F$, where ${*}$ is the Hodge operator in $\mathbb R^4$, takes the form
$$F_{01} = F_{23},\qquad F_{02} = F_{31},\qquad F_{03} = F_{12}. \qquad (5.1)$$
Assume that the $A_j$ depend only on $(x^1, x^2)$; this means that the fields are invariant under shifts in the directions $x^0$, $x^3$. Then $(A_0, A_3)$ become adjoint-valued scalar fields, which we denote $(\phi_1, \phi_2)$. They are called the Higgs fields; in fact, they will be associated below with the Higgs field $\Phi$. In this way we come to the self-dual equations on the plane $\mathbb R^2 = (x^1, x^2)$:
$$F_{12} = [\phi_1, \phi_2], \qquad (5.2)$$
$$\nabla_1\phi_1 = -\nabla_2\phi_2, \qquad (5.3)$$
$$\nabla_1\phi_2 = \nabla_2\phi_1. \qquad (5.4)$$
Introduce the complex coordinates $z = x^1 + ix^2$, $\bar z = x^1 - ix^2$, and let $d' = \nabla_z$, $d'' = \nabla_{\bar z}$.
Consider the fields taking values in the Lie algebra $\mathfrak{sl}(n,\mathbb C)$:
$$\Phi_z = \tfrac12(\phi_1 - i\phi_2)\,dz\in\Omega^{(1,0)}(\mathbb R^2, \mathrm{ad}\,E),\qquad \Phi_{\bar z} = \tfrac12(\phi_1 + i\phi_2)\,d\bar z\in\Omega^{(0,1)}(\mathbb R^2, \mathrm{ad}\,E).$$
They are not independent, since the Hermitian conjugation acts as
$$\Phi_{\bar z} = h^{-1}\,\Phi_z^\dagger\,h. \qquad (5.5)$$
Similarly,
$$A_z = \tfrac12(A_1 - iA_2),\qquad A_{\bar z} = \tfrac12(A_1 + iA_2),$$
$$A_{\bar z} = h^{-1}\bar\partial h + h^{-1}A_z^\dagger\,h. \qquad (5.6)$$
In terms of the fields
$$\mathcal W = (A_z, A_{\bar z}, \Phi_z, \Phi_{\bar z}), \qquad (5.7)$$
equations (5.2)–(5.4) can be rewritten in a coordinate-invariant way:
$$1.\ F_{z\bar z} + [\Phi_z, \Phi_{\bar z}] = 0,\qquad 2.\ d''\Phi_z = 0,\qquad 3.\ d'\Phi_{\bar z} = 0. \qquad (5.8)$$
Due to (5.5) and (5.6) the third equation is not independent. Thus, we have two equations with left-hand sides of type (1,1) for the two complex-valued fields $(\Phi_z, A_z)$ and the Hermitian matrix $h$. Equations (5.8) are conformally invariant and thereby can be defined on a complex curve $\Sigma_g$. In this case
$$\Phi_z\in\Omega^{(1,0)}(\Sigma_g, \mathfrak{su}(n)),\qquad \Phi_{\bar z}\in\Omega^{(0,1)}(\Sigma_g, \mathfrak{su}(n)),\qquad d'':\ \Omega^{(j,k)}(\Sigma_g, \mathfrak{su}(n))\to\Omega^{(j,k+1)}(\Sigma_g, \mathfrak{su}(n)).$$
The self-duality equations (5.8) on $\Sigma_g$ are called the Hitchin equations. In fact, instead of (5.8) we will further consider the modified system
$$1.\ F_{z\bar z} - [\Phi_z, \Phi_{\bar z}] = 0,\qquad 2.\ d''\Phi_z = 0,\qquad 3.\ d'\Phi_{\bar z} = 0, \qquad (5.9)$$
which comes from the self-duality on $\mathbb R^4$ with a metric of signature (2,2). Consider the gauge group action on the solutions of (5.9):
$$f\in\mathcal G = \Omega^0(\Sigma_g, \mathrm{SU}(n)), \qquad (5.10)$$
$$\Phi_z\to f^{-1}\Phi_z f,\qquad \Phi_{\bar z}\to f^{-1}\Phi_{\bar z}f, \qquad (5.11)$$
$$d''\to f^{-1}\,d''\,f. \qquad (5.12)$$
If $(A_z, A_{\bar z}, \Phi_z, \Phi_{\bar z})$ is a solution of (5.9), then the transformed fields are also a solution. If $f$ takes values in $\mathrm{GL}(n,\mathbb C)$, it again transforms solutions to solutions; as above, we denote this gauge group by $\mathcal G^{\mathbb C}$. Define the moduli space of solutions of (5.9) as the quotient under the gauge group action,
$$\mathcal M_H(\Sigma_g) = \{\text{solutions of (5.9)}\}/\mathcal G. \qquad (5.13)$$
Now look at the second equation in (5.9): it is the moment constraint equation (4.14) for the Higgs bundles in the absence of marked points. The gauge group $\mathcal G^{\mathbb C}$ transforms solutions of (5.9) to solutions but breaks (5.5), (5.6). Restricting ourselves to the second equation in (5.9) and dividing the space of its solutions by the gauge group $\mathcal G^{\mathbb C}$, we come to the moduli space of Higgs bundles $T^*\mathcal M(n, g, 0, d)$ (4.17). There exists a dense subset of stable Higgs bundles, $T^*\mathcal M^{stable}(n, g, 0, d)\subset T^*\mathcal M(n, g, 0, d)$. The moduli space of stable Higgs bundles parameterizes the smooth part of $\mathcal M_H(\Sigma_g)$ (5.13) [2]. Conversely, consider a Higgs bundle with data $(\bar A, \Phi)$ satisfying eq. 2 in (5.9) and reconstruct from it a solution $(A_z, A_{\bar z}, \Phi_z, \Phi_{\bar z})$ of (5.9). Define
$$\Phi_z = \Phi,\qquad \Phi_{\bar z} = h^{-1}\Phi^\dagger h,\qquad A_{\bar z} = \bar A,\qquad A_z = h^{-1}\partial h + h^{-1}\bar A^\dagger h.$$
Then $(\Phi_{\bar z}, A_z)$ satisfy eq. 3 of (5.9), while eq. 1 of (5.9) becomes a nonlinear equation for the metric $h$ alone. For almost all $(\bar A, \Phi)$ this equation has a solution $h$ (see the appendix by Donaldson in [2]). In this way we pass from the holomorphic data to solutions of the system (5.9). Summarizing, to define $\mathcal M_H(\Sigma_g)$ one can act in two ways:
1. divide the space of solutions of (5.9) by the SU(n)-valued gauge group $\mathcal G$;
2. consider the moduli space of stable Higgs bundles.

5.1.2 Hyper-Kähler reduction

In this section we explain how to derive the moduli space $\mathcal M_H(\Sigma_g)$ (5.13) via an analog of the symplectic reduction, the so-called hyper-Kähler reduction [41]. We prove that the infinite-dimensional space $\mathcal W$ (5.7) is a hyper-Kähler
manifold, and $\mathcal M_H$ is its hyper-Kähler quotient, where (5.9) play the role of the moment equations. To define a hyper-Kähler manifold we need three complex structures and a metric satisfying certain axioms. Define a flat metric on $\mathcal W$ depending on the complex structure on $\Sigma$:
$$ds^2 = -\tfrac14\int_{\Sigma_g}\mathrm{tr}\big(\delta A_z\otimes\delta A_{\bar z} + \delta A_{\bar z}\otimes\delta A_z + \delta\Phi_z\otimes\delta\Phi_{\bar z} + \delta\Phi_{\bar z}\otimes\delta\Phi_z\big). \qquad (5.14)$$
Introduce three complex structures $I$, $J$, $K$ on $\mathcal W$. The corresponding operators act on the tangent bundle $T\mathcal W$ so that they obey the imaginary quaternion relations,
$$I^2 = J^2 = K^2 = -\mathrm{Id},\qquad IJ = K.$$
The complex structures are integrable because $\mathcal W$ is flat. Introduce the basis of one-forms $V = (\delta A_z, \delta A_{\bar z}, \delta\Phi_z, \delta\Phi_{\bar z})$ in $T^*\mathcal W$. Then the action of the conjugated operators on $T^*\mathcal W$ in this basis takes the form
$$I^t = \begin{pmatrix} i&0&0&0\\ 0&-i&0&0\\ 0&0&i&0\\ 0&0&0&-i\end{pmatrix},\qquad J^t = \begin{pmatrix} 0&0&0&1\\ 0&0&-1&0\\ 0&1&0&0\\ -1&0&0&0\end{pmatrix},\qquad K^t = \begin{pmatrix} 0&0&0&i\\ 0&0&i&0\\ 0&i&0&0\\ i&0&0&0\end{pmatrix}.$$
Linear functions on $\mathcal W$ are holomorphic with respect to a complex structure if they are transformed under the action of the corresponding operator with eigenvalue $i$. Thus $(A_z, \Phi_z)$ are holomorphic in the complex structure $I$, $(A_z + i\Phi_{\bar z},\ A_{\bar z} - i\Phi_z)$ are holomorphic in the complex structure $J$, and $(A_z + \Phi_{\bar z},\ A_{\bar z} + \Phi_z)$ are holomorphic in the complex structure $K$. To be hyper-Kähler, the metric $ds^2$ should be of type (1,1) in each complex structure, i.e.
$$ds^2 = (I^t\otimes I^t)\,ds^2 = (J^t\otimes J^t)\,ds^2 = (K^t\otimes K^t)\,ds^2.$$
In this way we have described a flat hyper-Kähler metric on $\mathcal W$. A linear combination of the complex structures produces a family of complex structures parameterized by $\mathbb{CP}^1$. We define three symplectic structures associated with the complex structures on $\mathcal W$, $\omega_I = (I^t\otimes\mathrm{id})\,ds^2$, $\omega_J = (J^t\otimes\mathrm{id})\,ds^2$, $\omega_K = (K^t\otimes\mathrm{id})\,ds^2$:
$$\omega_I = -\frac i2\int_{\Sigma_g}\mathrm{tr}\big(\delta A_z\wedge\delta A_{\bar z} + \delta\Phi_z\wedge\delta\Phi_{\bar z}\big),$$
$$\omega_J = \frac12\int_{\Sigma_g}\mathrm{tr}\big(\delta\Phi_z\wedge\delta A_{\bar z} + \delta\Phi_{\bar z}\wedge\delta A_z\big), \qquad (5.15)$$
$$\omega_K = \frac i2\int_{\Sigma_g}\mathrm{tr}\big(\delta\Phi_z\wedge\delta A_{\bar z} - \delta\Phi_{\bar z}\wedge\delta A_z\big).$$
These forms are closed and of type (1,1) with respect to the corresponding complex structures. Now consider the gauge transformations (5.10) of the fields (5.11), (5.12). Since the gauge transform takes values in $\mathrm{SU}(n)$, the forms (5.15) are gauge invariant. Therefore we can proceed as in the case of the standard symplectic reduction (3.14), but now we obtain three generating Hamiltonians with respect to the three symplectic forms,
$$F_I = -\frac i2\int_{\Sigma_g}\mathrm{tr}\Big(\epsilon\,\big(F_{z\bar z} + [\Phi_z, \Phi_{\bar z}]\big)\Big),\qquad \epsilon\in\mathrm{Lie}(\mathcal G),$$
$$F_J = -\frac12\int_{\Sigma_g}\mathrm{tr}\Big(\epsilon\,\big(d''\Phi_z + d'\Phi_{\bar z}\big)\Big),\qquad F_K = -\frac i2\int_{\Sigma_g}\mathrm{tr}\Big(\epsilon\,\big(d''\Phi_z - d'\Phi_{\bar z}\big)\Big),$$
and the three moment maps $\mu\in\mathrm{Lie}^*(\mathcal G)$,
$$\mu_I = i\big(F_{z\bar z} + [\Phi_z, \Phi_{\bar z}]\big),\qquad \mu_J = d''\Phi_z + d'\Phi_{\bar z},\qquad \mu_K = i\big(d''\Phi_z - d'\Phi_{\bar z}\big).$$
The zero level of the moment maps reproduces the Hitchin equations. The hyper-Kähler quotient $\mathcal W/\!/\!/\mathcal G$ is defined as
$$\mathcal W/\!/\!/\mathcal G = \big(\mu_I^{-1}(0)\cap\mu_J^{-1}(0)\cap\mu_K^{-1}(0)\big)/\mathcal G.$$
To come to the system (5.9), consider the linear combination
$$\mu_I^{\mathbb C} = \mu_J - i\,\mu_K = 2\,d''\Phi_z. \qquad (5.16)$$
This moment map is derived from the symplectic form
$$\omega_I^{\mathbb C} = \omega_J - i\,\omega_K = \int_{\Sigma_g}\mathrm{tr}\big(\delta\Phi_z\wedge\delta A_{\bar z}\big).$$
This is a (2,0)-form in the complex structure $I$; thus we have the holomorphic moment map $\mu_I^{\mathbb C}$ in the complex structure $I$. Vanishing of the holomorphic moment map $\mu_I^{\mathbb C}$ and of the real moment map $\mu_I$ is equivalent to the Hitchin equations. Dividing their solutions by the gauge group $\mathcal G$, we come to the moduli space $\mathcal M_H(\Sigma_g)$ (5.13). Now consider an analog of (5.16) corresponding to the complex structure $J$:
$$\mu_J^{\mathbb C} = \mu_K - i\,\mu_I = \mathcal F_{z\bar z},\qquad \mathcal F_{z\bar z} = [\nabla_z + \mathcal A_z,\ \nabla_{\bar z} + \mathcal A_{\bar z}],\qquad \mathcal A_z = A_z + i\Phi_z,\quad \mathcal A_{\bar z} = A_{\bar z} + i\Phi_{\bar z}.$$
This moment map comes from the symplectic form
$$\omega_J^{\mathbb C} = -\frac12\int_{\Sigma_g}\mathrm{tr}\big(\delta\mathcal A_z\wedge\delta\mathcal A_{\bar z}\big).$$
It is a (2,0)-form in the complex structure $J$.
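The quaternion relations for the matrices $I^t$, $J^t$, $K^t$ written above can be confirmed by direct multiplication; the following short numerical check (our own) does so.

```python
# Verify I^2 = J^2 = K^2 = -Id and IJ = K, JI = -K for the 4x4 operators
# acting in the basis (dA_z, dA_zbar, dPhi_z, dPhi_zbar).
import numpy as np

I = np.diag([1j, -1j, 1j, -1j])
J = np.array([[0, 0, 0, 1],
              [0, 0, -1, 0],
              [0, 1, 0, 0],
              [-1, 0, 0, 0]], dtype=complex)
K = I @ J
E = np.eye(4)
for lhs, rhs in [(I @ I, -E), (J @ J, -E), (K @ K, -E),
                 (I @ J, K), (J @ I, -K)]:
    assert np.allclose(lhs, rhs)
print("quaternion relations hold; K =")
print(K)
```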
Putting $\mu_J^{\mathbb C} = 0$ we come to the flatness condition for the bundle $E$. Dividing the set of solutions $\{\mathcal F_{z\bar z} = 0\}$ by the $\mathrm{GL}(n,\mathbb C)$-valued gauge transformations $\mathcal G^{\mathbb C}$, we come to the space
$$\mathcal M_{flat} = \{\mathcal F_{z\bar z} = 0\}/\mathcal G^{\mathbb C} \qquad (5.17)$$
of homomorphisms $\pi_1(\Sigma_g)\to\mathrm{GL}(n,\mathbb C)$ defined up to conjugation. In accordance with [42] and Donaldson (the appendix in [2]), generic flat bundles parameterize $\mathcal M_H(\Sigma_g)$ (5.13) in the complex structure $J$. This space is a phase space of non-autonomous Hamiltonian systems leading to monodromy preserving equations (see section 6.3). Thus, the space $\mathcal M_H(\Sigma_g)$ describes the phase spaces of integrable systems $\mathcal R^{red}$ (4.17) in the complex structure $I$, and the phase spaces of monodromy preserving equations (5.17) in the complex structure $J$.

5.2 N = 4 SUSY Yang-Mills in four dimensions and Hitchin equations

Here we consider a twisted version of the N = 4 super Yang-Mills theory in four dimensions. This theory was analyzed in detail in [16, 17, 18] to develop a field-theoretical approach to the geometric Langlands program. Quantum Hitchin systems are one side of this construction, and here we use only a minor part of [16]. The twisted theory is a topological theory that contains a generalization of the Hitchin equations (5.9) as a condition of BRST invariance. Our goal is to describe the Hecke transformations in terms of this theory. In section 4 we defined the Hecke transformations as instant singular gauge transformations. The four-dimensional theory allows one to consider gauge transformations varying along a space coordinate $x^3$. They become singular at some point, say $x^3 = 0$, where a singular 't Hooft operator is located. This gives a natural description of the symplectic Hecke correspondence in terms of a monopole configuration in the twisted theory.

5.2.1 Twisting of SUSY Yang-Mills theory

The N = 4 SUSY SU(n) Yang-Mills action in four dimensions can be derived from the N = 1 SUSY Yang-Mills action in ten dimensions by dimensional reduction. We need only the bosonic part of the reduced theory. The bosonic fields of the 4d Yang-Mills theory are the four-dimensional gauge potential $A = (A_0, A_1, A_2, A_3)$ and six scalar fields $\phi = (\phi_0, \phi_1, \phi_2, \phi_3, \phi_4, \phi_5)$ coming from the six extra dimensions. The bosonic part of the action has the form
$$I = \frac1{2e^2}\int d^4x\ \mathrm{tr}\Big(\frac12\sum_{\mu,\nu=0}^3 F_{\mu\nu}F^{\mu\nu} + \sum_{i=0}^5\sum_{\mu=0}^3 D_\mu\phi_i\,D^\mu\phi_i + \frac12\sum_{i,j=0}^5\,[\phi_i,\phi_j]^2\Big).$$
The symmetry of the action is Spin(4) × Spin(6) (or Spin(1,3) × Spin(6) in the Lorentz signature). The sixteen generators of the 4d supersymmetry are transformed under Spin(1,3) × Spin(6) ~ SL(2) × SL(2) × Spin(6) as $(2, 1, 4)\oplus(1, 2, \bar 4)$: $\{Q_{ax}\}\oplus\{\bar Q_{\dot a y}\}$, $a, \dot a = 1, 2$, $x, y = 1,\dots,4$. They satisfy the supersymmetry algebra
$$\{Q_{ax}, \bar Q_{\dot a y}\} = \delta_{xy}\sum_{\mu=0}^3\sigma^\mu_{a\dot a}\,P_\mu,\qquad \{Q, Q\} = 0,\qquad \{\bar Q, \bar Q\} = 0. \qquad (5.18)$$
The action of $Q$ on a field $X$ takes the form $\delta X = \{Q, X\}$. Let $\varrho$ be a map Spin(4) → Spin(6) and set $\mathrm{Spin}'(4) = (\mathrm{id}\times\varrho)\,\mathrm{Spin}(4)$. Define $\varrho$ in such a way that the action of $\mathrm{Spin}'(4)$ on the chiral spinors has an invariant vector, and let $Q$ be the corresponding supersymmetry generator. It follows from (5.18) that it obeys $Q^2 = 0$. The twisted theory is defined by the physical observables from the cohomology groups $H^*(Q)$. The twisted four scalar fields $\phi = (\phi_0,\dots,\phi_3)$ are reinterpreted as adjoint-valued one-forms on $\mathbb R^4$, while the untwisted $\sigma, \bar\sigma = \phi_4\pm i\phi_5$ remain adjoint-valued scalars.
In fact, there is a family of topological theories parameterized by $t\in\mathbb{CP}^1$. To be invariant under $Q$ the bosonic fields should satisfy the equations
$$1.\ \big(F - \phi\wedge\phi + t\,D\phi\big)^+ = 0,\qquad 2.\ \big(F - \phi\wedge\phi - t^{-1}D\phi\big)^- = 0,\qquad 3.\ D{*}\phi = 0, \qquad (5.19)$$
where $\pm$ denote the self-dual and the anti-self-dual parts of four-dimensional two-forms, $D = d + [A,\ \cdot\ ]$, and ${*}$ is the Hodge operator in four dimensions. We are interested in solutions of this system up to gauge transformations. This theory, defined on flat $\mathbb R^4$, can be extended to any four-manifold $M$ in such a way that it preserves the $Q$-symmetry and the contributions of the metric come only from $Q$-exact terms. The bosonic part of the theory is described by connections $A = (A_0, A_1, A_2, A_3)$ in a bundle $E$ over $M$ in the presence of the adjoint-valued one-forms $\phi = (\phi_0, \phi_1, \phi_2, \phi_3)$ satisfying (5.19). The important case for integrable systems is $M = \mathbb R^2\times\Sigma_g$, where $\mathbb R^2 = (x^0 = \text{time},\ x^3 = y)$ and $\Sigma_g$ will play the role of the basic spectral curve. $\mathbb R^2$ is not involved in the twisting, so the fields $(\phi_0, \phi_3)$ remain scalars, while $\phi_1, \phi_2$ become one-forms on $\Sigma_g$. It turns out that after the reduction the system (5.19) becomes equivalent to the Hitchin equations (5.9).

5.2.2 Hecke correspondence and monopoles

The system (5.19) for $t = 1$ can be replaced by
$$F - \phi\wedge\phi + {*}\,D\phi = 0, \qquad (5.20)$$
$$D{*}\phi = 0. \qquad (5.21)$$
Assume that the fields are time independent and consider the system on the three-dimensional manifold $W = I(x^3)\times\Sigma_g$, where $I\subset\mathbb R$ is an interval of $x^3$. In terms of the three-dimensional fields $A = (A_0, \tilde A)$, $\phi = (\phi_0\,dx^0, \tilde\phi)$, the equations take the form [16]
$$\tilde F - \tilde\phi\wedge\tilde\phi = {*}\big(\tilde D\phi_0 + [A_0, \tilde\phi]\big),\qquad \tilde D{*}\tilde\phi = {*}\,[A_0, \phi_0],\qquad \tilde D A_0 = [\tilde\phi, \phi_0]. \qquad (5.22)$$
Here the Hodge operator ${*}$ is taken in the three-dimensional sense. Replace the coordinates $x = (x^1, x^2, x^3)$ by $(z, \bar z, y)$, where $x^3 = y$ and $(z, \bar z)$ are local coordinates on $\Sigma_g$. Let $g(z,\bar z)\,|dz|^2$ be a metric on $\Sigma_g$; the metric on $W$ is $ds^2 = g\,dz\,d\bar z + dy^2$. Then the Hodge operator takes the form
$${*}\,dy = \tfrac12\,i\,g\,dz\wedge d\bar z,\qquad {*}\,dz = i\,dz\wedge dy,\qquad {*}\,d\bar z = -i\,d\bar z\wedge dy.$$
It is argued in [16] that $\phi_y = 0$ and $A_0 = 0$ on the solutions. Then we come to the equations
$$1.\ F_{z\bar z}(A_z, A_{\bar z}) - [\Phi_z, \Phi_{\bar z}] = \tfrac12\,i\,g\,D_y\phi_0,\qquad 2.\ D_{\bar z}\Phi_z = 0,$$
$$3.\ F_{y\bar z}(A_y, A_{\bar z}) = i\,D_{\bar z}\phi_0,\qquad 4.\ D_y\Phi_z = i\,[\phi_0, \Phi_z], \qquad (5.23)$$
where, as before, $\Phi_{\bar z} = h^{-1}\Phi_z^\dagger h$. The system is simplified in the gauge $A_y = 0$. In particular, for $\phi_0 = 0$ the system (5.23) becomes essentially two-dimensional and coincides with the Hitchin equations (5.9). Let $\Sigma_g$ be an elliptic curve ($g = 1$); this case is important for the application to integrable systems. The nonlinear system (5.23) can be rewritten as the compatibility condition of a linear system depending on a spectral parameter $\lambda\in\mathbb{CP}^1$, schematically
$$\big(D_{\bar z} - \lambda\,(D_y - i\phi_0)\big)\psi = 0,\qquad \big(D_y + i\phi_0 - \lambda\,D_z\big)\psi = 0.$$
This linear system allows one to apply the methods of the inverse scattering problem or the Whitham approximation to find solutions of (5.23). Define the complex connection $\mathcal A_z = A_z + i\Phi_z$. In terms of $(\mathcal A_z, \mathcal A_{\bar z}, A_y)$ the system (5.23) assumes the form of the Bogomolny equation
$$\mathcal F = {*}\,\mathcal D\phi_0. \qquad (5.24)$$
Consider a monopole solution of this equation. Let $\tilde W = W\setminus x_0$, $x_0 = (y_0, z_0, \bar z_0)$. The Bianchi identity $\mathcal D\mathcal F = 0$ on $\tilde W$ implies that $\phi_0$ is the Green function of the operator ${*}\mathcal D{*}\mathcal D$:
$${*}\,\mathcal D\,{*}\,\mathcal D\,\phi_0 = m\,\delta(x - x_0),$$
where $m$ takes values in $\mathfrak{gl}(n,\mathbb C)$. Consider first the abelian case $G = \mathrm U(1)$. Then $F(A_z, A_{\bar z})$ is the curvature of a line bundle $\mathcal L$.
Locally near $x_0 = (y_0, z_0, \bar z_0)$ the field $\phi_0$ has the singularity
$$\phi_0 \sim \frac{i\,m}{2\,|x - x_0|}, \qquad (5.25)$$
where $m$ is a magnetic charge. Due to eq. 1 in (5.23), $F$ takes the form
$$F(A_z, A_{\bar z}) = \tfrac12\,i\,g\,\partial_y\phi_0\ \sim\ \tfrac12\,i\,m\,g\,\frac{y - y_0}{|x - x_0|^3}.$$
Consider a small sphere $S^2$ enclosing $x_0$. Due to (5.24) and (5.25),
$$\int_{S^2} F = m$$
in the chosen normalization. This solution describes the Dirac monopole of charge $m$, corresponding to a line bundle over $S^2$ of degree $m$. Let $\Sigma_g^\pm = \Sigma_g\times(y\gtrless 0)$ and let $\mathcal L^\pm$ be the line bundles over $\Sigma_g^\pm$. The two-dimensional cycle $C$ describing the boundary $C = \partial\big((I\times\Sigma_g)\setminus x_0\big)$ is $\Sigma_g^+ - \Sigma_g^- + S^2$. Taking the integral over $C$ we find that
$$\int_C F = 0.$$
In other words, for the Chern classes of the bundles, $c_1(\mathcal L^\pm) = \deg(\mathcal L^\pm)$, we have $\deg(\mathcal L^+) - \deg(\mathcal L^-) = m$, or $\mathcal L^+ = \mathcal L^-\otimes\mathcal O(z_0)^m$. Here $\mathcal O(z_0)^m$ is a line bundle whose holomorphic sections are holomorphic functions away from $z_0$ with a possible single pole of degree $m$ at $z_0$. The line bundles over $\Sigma_g$ are topologically equivalent for $y > 0$ or for $y < 0$ separately. The singular gauge transformation generated by $\phi_0$ is smooth away from $x_0$, and its singularity changes the degree of the bundle.

Fig. 3: A monopole located at $y_0$ changes the Chern class.

The monopole increases the Chern class, $c_1\to c_1 + m$. In the four-dimensional abelian theory we have the Dirac monopole singular along the time-like line $L = (x^0, x_0)$. This corresponds to including the 't Hooft operator in the theory, stating that the connections have a monopole singularity along the line $L$. A generic vector bundle $E$ near $x_0$ splits into a direct sum of line bundles, $E\sim\mathcal L_1\oplus\mathcal L_2\oplus\dots\oplus\mathcal L_n$. Consider the gauge transformation
$$\phi_0 \sim \frac{i}{2\,|x - x_0|}\,\mathrm{diag}(m_1,\dots,m_n). \qquad (5.26)$$
It causes the transformation $\mathcal L_j\to\mathcal L_j\otimes\mathcal O(z_0)^{m_j}$, and the degree of the bundle $E$ changes after crossing $y = 0$ by $\sum_j m_j$, as was described for bundles over $\Sigma$ in section 4.4. To be more precise, we specify the boundary conditions of the solutions at the ends $y\to-\infty$ and $y\to+\infty$. Since $\phi_0\to 0$ for $y\to\pm\infty$, the system (5.23) coincides there with the Hitchin system (5.9). If $\mathcal M_H(n, g, k, m^\pm)$ denotes the moduli space of solutions at the boundaries $y\to\pm\infty$, then the gauge transformation with the monopole singularity states that
$$m^+ = m^- + \sum_j m_j.$$
It defines the SHC between the two integrable systems related to $\mathcal M_H(n, g, k, m^\pm)$. In particular, we have described it at the point $y = 0$ for $\mathcal M_H(n, 1, 1, 0)$ and $\mathcal M_H(n, 1, 1, 1)$.

6 Conclusion

Here we briefly discuss some related issues that have not been included in the lectures.

6.1

Solutions of the Hitchin equations (5.9) corresponding to quasi-parabolic Higgs bundles were analyzed in [17]. In the three-dimensional gauge theory considered in section 5.2 we have the Wilson lines located at the marked points; in the four-dimensional Yang-Mills theory they correspond to singular operators along two-dimensional surfaces. Locally, on a punctured disc around a marked point, the Hitchin system (5.9) assumes the form of the Nahm equations [43]. It was proved in [46] that the space of its solutions, after dividing by a special gauge group, is symplectomorphic to a coadjoint orbit of $\mathrm{SL}(n,\mathbb C)$. A hyper-Kähler structure on the space of solutions induces a hyper-Kähler structure on the orbits. This establishes the interrelations between the Hitchin equations and the Higgs bundles with marked points (the quasi-parabolic Higgs bundles).
6.2

There exists a generalization of this approach to Higgs bundles of infinite rank; in other words, the structure group $\mathrm{GL}(n,\mathbb C)$ or $\mathrm{SL}(n,\mathbb C)$ of the bundles is replaced by an infinite-rank group. One way is to consider the centrally extended loop group of maps $S^1\to G$. Then the Higgs field depends on the additional variable $x\in S^1$, and instead of the Lax equation we come to the Zakharov-Shabat equation,
$$\partial_j L - \partial_x M_j + [M_j, L] = 0.$$
This equation describes an infinite-dimensional integrable hierarchy like the KdV hierarchy. The two-dimensional version of the ECMS was constructed in [10, 11]. In particular, the SHC establishes an equivalence of the two-particle ($n = 2$) elliptic Calogero-Moser field theory with the Landau-Lifshitz equation [44, 45]; the latter system is the two-dimensional version of the $\mathfrak{sl}(2,\mathbb C)$ elliptic top. Relations (4.48) work in the two-dimensional case as well. Another way is to consider $\mathrm{GL}(\infty)$-bundles. In [48] the ECMS for an infinite number of particles ($n\to\infty$) was analyzed. The elliptic top on the group of the non-commutative torus, a subgroup of $\mathrm{GL}(\infty)$, was considered in [47]. This construction describes an integrable modification of the hydrodynamics of the ideal fluid on a non-commutative two-dimensional torus.

6.3

Consider dynamical systems where the role of the times is played by the parameters of complex structures of the curves $\Sigma_{g,k}$. In this case we come to monodromy preserving equations, like the Schlesinger system or the Painlevé equations. They can be constructed in a similar fashion as the integrable Hitchin systems [8]. To this purpose one should replace the Higgs bundles by flat bundles and afterwards use the same symplectic reduction (see (5.17)). In this situation the Lax equations take the form
$$\partial_j L - \partial_z M_j + [M_j, L] = 0.$$
The analysis of this system is more complicated than that of the standard Lax equations, due to the presence of the derivative with respect to the spectral parameter. Note that the $M_j$ correspond only to quadratic Hamiltonians, since these are responsible for the deformations of complex structures. Concrete examples of this construction were given in [8, 52, 53]. Interrelations with Higgs bundles were analyzed in [8, 49]. It is remarkable that the symplectic Hecke correspondence works in this case as well: it establishes an equivalence of the Painlevé VI equation and a non-autonomous Zhukovski-Volterra gyrostat [12].

6.4

A modification of the Higgs bundles allows one to construct relativistic integrable systems [50]. The role of the Higgs field is played by a group element $g = \exp(c^{-1}K)$, where $K$ is a canonical class on $\Sigma$ and $c$ is the relativistic parameter. This construction works only for curves of genus $g\le 1$. This approach was realized in [28] to derive the elliptic Ruijsenaars system, and in [51, 53] to derive the elliptic classical r-matrix of Belavin-Drinfeld [54] and a quadratic Poisson algebra of the Sklyanin-Feigin-Odesskii type [55, 56]. The inclusion of the relativistic systems allows one to define a duality in integrable systems [57, 58] (see [59] for recent developments). This type of duality has a natural description for the corresponding quantum integrable systems in terms of Hecke algebras [60]; there it is called the Fourier transform and takes the form of S-duality. Another form of duality in the classical Hitchin systems was considered in [61, 62, 63]; it is related to Langlands duality and is similar to the T-duality of the fibers in the Hitchin fibration.
6.5

There is a useful description of the moduli space of holomorphic vector bundles closely related to the modification described in section 4.4, the so-called Tyurin parametrization [64]. This construction was applied to describe Higgs bundles and integrable systems related to a curve of arbitrary genus in [10, 65, 66]. Using this approach, classical r-matrices with a spectral parameter living on curves of arbitrary genus were constructed in [67].

References

[1] Hitchin, N.: Stable bundles and integrable systems. Duke Math. Journ., vol. 54 (1987), p. 91–114.
[2] Hitchin, N.: The self-duality equations on a Riemann surface. Proc. London Math. Soc., vol. 55 (1987), p. 59–126.
[3] Hurtubise, J.: Integrable systems and algebraic surfaces. Duke Math. Journ., vol. 83 (1996), p. 19–50.
[4] Gorsky, A., Nekrasov, N., Rubtsov, V.: Hilbert schemes, separated variables, and D-branes. Commun. Math. Phys., vol. 222 (2001), p. 299–318.
[5] Markman, E.: Spectral curves and integrable systems. Comp. Math., vol. 93 (1994), p. 255–290.
[6] Gorsky, A., Nekrasov, N.: Elliptic Calogero-Moser system from two dimensional current algebra. [arXiv:hep-th/9401021].
[7] Enriquez, B., Rubtsov, V.: Hitchin systems, higher Gaudin operators and r-matrices. Math. Res. Lett., vol. 3 (1996), p. 343–357.
[8] Levin, A., Olshanetsky, M.: Isomonodromic deformations and Hitchin systems. Amer. Math. Soc. Transl. (2), vol. 191 (1999), p. 223–262.
[9] Nekrasov, N.: Holomorphic bundles and many-body systems. Commun. Math. Phys., vol. 180 (1996), p. 587–604. [arXiv:hep-th/9503157].
[10] Krichever, I.: Vector bundles and Lax equations on algebraic curves. Commun. Math. Phys., vol. 229 (2002), p. 229–269. [arXiv:hep-th/0108110].
[11] Levin, A., Olshanetsky, M., Zotov, A.: Hitchin systems – symplectic Hecke correspondence and two-dimensional version. Commun. Math. Phys., vol. 236 (2003), p. 93–133.
[12] Levin, A., Olshanetsky, M., Zotov, A.: Painlevé VI, rigid tops and reflection equation. Commun. Math. Phys., vol. 268 (2006), p. 67–103.
[13] Gorsky, A., Krichever, I., Marshakov, A., Mironov, A., Morozov, A.: Integrability and Seiberg-Witten exact solution. Phys. Lett. B, vol. 355 (1995), p. 466. [arXiv:hep-th/9505035].
[14] Seiberg, N., Witten, E.: Electric-magnetic duality, monopole condensation, and confinement in N=2 supersymmetric Yang-Mills theory. Nucl. Phys. B, vol. 426 (1994), p. 19–52 [Erratum-ibid. B, vol. 430 (1994), p. 485]. [arXiv:hep-th/9407087]; Seiberg, N., Witten, E.: Monopoles, duality and chiral symmetry breaking in N=2 supersymmetric QCD. Nucl. Phys. B, vol. 431 (1994), p. 484–550. [arXiv:hep-th/9408099].
[15] Gorsky, A., Mironov, A.: Integrable many-body systems and gauge theories. [arXiv:hep-th/0011197].
[16] Kapustin, A., Witten, E.: Electric-magnetic duality and the geometric Langlands program. [arXiv:hep-th/0604151].
[17] Gukov, S., Witten, E.: Gauge theory, ramification, and the geometric Langlands program. [arXiv:hep-th/0612073].
[18] Witten, E.: Gauge theory and wild ramification. [arXiv:0710.0631].
[19] Olshanetsky, M., Perelomov, A.: Classical integrable finite-dimensional systems related to Lie algebras. Phys. Rep., vol. 71 (1981), p. 313–400.
[20] Donagi, R.: Seiberg-Witten integrable systems. [arXiv:alg-geom/9705010].
[21] Etingof, P.: Lectures on Calogero-Moser systems. [arXiv:math/0606233].
[22] Zotov, A.: Classical integrable systems and their field-theoretical generalizations. Phys. Part. Nucl., vol. 37 (2006), p. 400–443; Fiz. Elem. Chast. Atom. Yadra, vol. 37 (2006), p. 759–842.
[23] 't Hooft, G.: On the phase transition towards permanent quark confinement. Nucl. Phys. B, vol. 138 (1978).
[24] Kapustin, A.: Wilson-'t Hooft operators in four-dimensional gauge theories and S-duality. Phys. Rev. D, vol. 74 (2006), 025005.
[25] Baxter, R. J.: Eight-vertex model in lattice statistics and one-dimensional anisotropic Heisenberg chain, I. Ann. Phys., vol. 76 (1973), p. 48–71.
[26] Date, E., Jimbo, M., Miwa, T., Okado, M.: Fusion of the eight vertex SOS model. Lett. Math. Phys., vol. 12 (1986), p. 209.
[27] Arnold, V.: Mathematical Methods in Classical Mechanics. Springer, 1978.
[28] Arutyunov, G., Frolov, S., Medvedev, P.: Elliptic Ruijsenaars-Schneider model from the cotangent bundle over the two-dimensional current group. J. Math. Phys., vol. 38 (1997), p. 5682–5689.
[29] Simpson, C.: Harmonic bundles on noncompact curves. Journ. Am. Math. Soc., vol. 3 (1990), p. 713–770.
[30] Faltings, G.: Stable G-bundles and projective connections. J. Alg. Geom., vol. 2 (1993), p. 507–568.
[31] Donagi, R.: Spectral covers. In: Current Topics in Complex Algebraic Geometry (H. Clemens and J. Kollar, eds.), MSRI Pub., vol. 28 (1995), p. 65–86. [arXiv:alg-geom/9505009].
[32] Calogero, F.: Solution of the one-dimensional N-body problem with quadratic and/or inversely quadratic pair potentials. J. Math. Phys., vol. 12 (1971), p. 419–436; Exactly solvable one-dimensional many-body problem. Lett. Nuovo Cim., vol. 13 (1975), p. 411.
[33] Moser, J.: Three integrable systems connected with isospectral deformations. Adv. Math., vol. 16 (1975), p. 1–23.
[34] Gibbons, J., Hermsen, T.: A generalization of the Calogero-Moser system. Physica D, vol. 11 (1984), p. 337–348.
[35] Olshanetsky, M., Perelomov, A.: Explicit solution of the Calogero model in the classical case and geodesic flows on symmetric spaces of zero curvature. Lett. Nuovo Cim., vol. 16 (1976), p. 333–339.
[36] Kazhdan, D., Kostant, B., Sternberg, S.: Hamiltonian group actions and dynamical systems of Calogero type. Comm. Pure Appl. Math., vol. 31 (1978), p. 481–507.
[37] Faddeev, L., Slavnov, A.: Gauge Fields: Introduction to Quantum Field Theory. Benjamin Reading, 1980.
[38] Konopleva, N., Popov, V.: Gauge Fields (N. M. Queen, ed.). Harwood Academic Publishers, New York, 1981.
[39] Krichever, I.: Elliptic solutions of the Kadomtsev-Petviashvili equation and many-body problems. Funct. Anal. Applic., vol. 14 (1980), p. 45.
[40] Reyman, A., Semenov-Tian-Shansky, M.: Lie algebras and Lax equations with spectral parameter on elliptic curve (Russian). Zap. Nauchn. Sem. Leningrad. Otdel. Mat. Inst. Steklov. (LOMI), vol. 150 (1986), Voprosy Kvant. Teor. Polya i Statist. Fiz., vol. 6, p. 104–118, 221; translation in J. Soviet Math., vol. 46 (1989), no. 1, p. 1631–1640.
[41] Hitchin, N., Karlhede, A., Lindström, U., Roček, M.: Hyper-Kähler metrics and supersymmetry. Comm. Math. Phys., vol. 108 (1987), p. 535–589.
[42] Corlette, K.: Flat G-bundles with canonical metrics. J. Diff. Geom., vol. 28 (1988), p. 361–382.
[43] Nahm, W.: The construction of all self-dual multi-monopoles by the ADHM method. Trieste Cent. Theor. Phys., IC-82-016.
[44] Sklyanin, E.: On complete integrability of the Landau-Lifshitz equation. Preprint LOMI E-3-79 (1979).
[45] Borovik, A., Robuk, V.: Linear pseudopotentials and conservation laws for the Landau-Lifshitz equation. Theor. Math. Phys., vol. 46 (1981), p. 371–381.
[46] Kronheimer, P.: A hyper-Kähler structure on coadjoint orbits of a semisimple complex group. J. London Math. Soc., vol. 42 (1990), p. 193–208.
[47] Khesin, B., Levin, A., Olshanetsky, M.: Bihamiltonian structures and quadratic algebras in hydrodynamics and on non-commutative torus. Comm. Math. Phys., vol. 250 (2004), p. 581–612.
[48] Olshanetsky, M.: The large N limits of integrable models. Mosc. Math. J., vol. 3 (2003), p. 1307–1331.
[49] Takasaki, K.: Spectral curves and Whitham equations in the isomonodromic problems of Schlesinger type. [arXiv:solv-int/9704004].
[50] Ruijsenaars, S. N. M.: Complete integrability of relativistic Calogero-Moser systems and elliptic function identities. Commun. Math. Phys., vol. 110 (1987), p. 191; Ruijsenaars, S. N. M., Schneider, H.: A new class of integrable systems and its relation to solitons. Ann. Phys., vol. 170 (1986), p. 370–405.
[51] Braden, H., Dolgushev, V., Olshanetsky, M., Zotov, A.: Classical r-matrices and the Feigin-Odesskii algebra via Hamiltonian and Poisson reductions. Journ. Phys. A, vol. 36 (2003), p. 6979–7000.
[52] Takasaki, K.: Gaudin model, KZ equation, and isomonodromic deformation on torus. Lett. Math. Phys., vol. 44 (1998), p. 143–156. [arXiv:hep-th/9711058].
[53] Chernyakov, Yu., Levin, A., Olshanetsky, M., Zotov, A.: Elliptic Schlesinger system and Painlevé VI. Journ. Phys. A, vol. 39 (2006), p. 12083–12102.
[54] Belavin, A., Drinfeld, V.: Solutions of the classical Yang-Baxter equation for simple Lie algebras. Funct. Anal. Applic., vol. 16 (1982), no. 3, p. 1–29.
[55] Sklyanin, E.: Some algebraic structures connected with the Yang-Baxter equation. Funct. Anal. Applic., vol. 16 (1982), no. 4, p. 27–34.
[56] Feigin, B., Odesskii, A.: Sklyanin's elliptic algebras. Funct. Anal. Applic., vol. 23 (1989), no. 3, p. 207–214.
[57] Fock, V., Gorsky, A., Nekrasov, N., Rubtsov, V.: Duality in integrable systems and gauge theories. JHEP 0007 (2000) 028. [arXiv:hep-th/9906235].
[58] Gorsky, A., Rubtsov, V.: Dualities in integrable systems: geometrical aspects. [arXiv:hep-th/0103004].
[59] Abenda, S., Previato, E.: How is the Hitchin system self-dual? Let us count the ways. Inst. Mittag-Leffler, Report No. 64, 2006/2007.
[60] Cherednik, I.: Double Affine Hecke Algebras. Cambridge University Press, 2005.
[61] Hausel, T., Thaddeus, M.: Mirror symmetry, Langlands duality, and the Hitchin system. Invent. Math., vol. 153 (2003), p. 197–229.
[62] Donagi, R., Pantev, T.: Langlands duality for Hitchin systems. [arXiv:math/0604617].
[63] Hitchin, N.: Langlands duality and G2 spectral curves. [arXiv:math/0611524].
[64] Tyurin, A.: Classification of vector bundles over an algebraic curve of arbitrary genus. Amer. Math. Soc. Transl. II Ser., vol. 63 (1967), p. 245–279.
[65] Enriquez, B., Rubtsov, V.: Hecke-Tyurin parametrization of the Hitchin and KZB systems. [arXiv:math/9911087].
[66] Takasaki, K.: Tyurin parameters of commuting pairs and infinite dimensional Grassmann manifold. In: Proceedings of the RIMS Workshop "Elliptic Integrable Systems" (RIMS, 2004). [arXiv:nlin/0505005].
[67] Dolgushev, V.: R-matrix structure of Hitchin system in Tyurin parameterization. Commun. Math. Phys., vol. 238 (2003), p. 131–147.

Mikhail Olshanetsky
E-mail: olshanet@itep.ru
Chern Institute of Mathematics, Nankai University, Tianjin, China
Problems in different measuring and assessment of the modulus of deformation using the Czech and German methodologies

M. Lidmila, L. Horníček, H. Krejčiříková, P. Tyc

Abstract: Comparative laboratory and in-situ measurements were used to establish the relationships between the static moduli of deformation calculated under the ČD methodology and the DB methodology. The measurements proved that the moduli of deformation determined in accordance with the two methodologies cannot be substituted for each other.

Keywords: modulus of deformation, conventional trans-European system, substructure, classification of soils, earth subgrade, fixed track section, corridor track, rubber plates, granulated gravel, correlation coefficient.

1 Introduction

While building the fixed track section from Třebovice to Rudoltice in 2005, the substructure was designed in accordance with the regulations in force for DB (German Federal Railways). The requirement for the bearing capacity of the substructure subgrade was 120 MPa, i.e. a value greatly exceeding the requirements of SŽDC (Czech Railway Infrastructure Administration) for corridor tracks. Ministry of Transport of the Czech Republic research project No. 1F52H/052/130 [1] was therefore charged with investigating the correlations between the static modulus of deformation determined under the regulations of ČD (Czech Railways) and DB. Comparative measurements were carried out in laboratory conditions in an experimental box (on gravel and granulated gravel) and in in-situ conditions (on sand and soil).

2 Measurement of moduli of deformation at SŽDC (ČD)

The requirements for the measuring apparatus, the detailed test procedure and its assessment are regulated in Instruction ČD S4 Substructure [2] and in Standard ČSN 72 1006 Checking the Compaction of Soils and Fills [3]. At the beginning of the measurement, to ensure tight fitting of the individual parts of the load device, a short-term load not exceeding 10 s is applied; it must not generate a pressure on the loaded layer greater than 20 % of the maximum plate load. The initial reading is taken after the load is removed and the path indicators are stabilized. The plate is then gradually loaded in four steps of equal magnitude. For each loading step, the deformation of the subsoil under the plate is recorded.
if the change in deformation during 1 minute is less than 0.02 mm, it is considered as stabilized and the next loading step follows. the procedure continues in the same way until the maximum load required for the loaded layer is reached. afterwards, the load plate is relieved in the same steps down to zero load, and the cycle is repeated a second time. during the test, the plate load values are read on the load gauge and the plate insertion is read on the path indicators. the mean load plate insertion values in each loading and load removal step are plotted in a chart expressing the relationship between the specific pressure acting on the load plate and the load plate insertion. the value of the total mean plate insertion from the loading branch of the second cycle is entered in the chart with a simultaneous calculation of the modulus of deformation using the general formula

$e = \frac{1.5 \, p \, r}{y}$,

where e is the static modulus of deformation [mpa], p is the specific pressure acting on the load plate [mpa], r is the load plate radius, i.e. 0.15 [m], and y is the total mean load plate insertion determined in the loading branch of the second cycle [m].

the specific pressure on the load plate is selected as follows: on the earth subgrade p = 0.2 mpa with a loading step of 0.05 mpa; for less bearing soils p = 0.1 mpa with a loading step of 0.025 mpa; on a structural layer of the substructure body p = 0.2 mpa with a loading step of 0.05 mpa; and on the rail bed at the sleeper loading area p = 0.4 mpa with a loading step of 0.10 mpa (note: since 1. 1. 2003 the bearing capacity at this level has no longer been assessed). the measurement of the modulus of deformation in the experimental box on the surface of a granulated gravel layer is shown in fig. 1.
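the čd evaluation reduces to a one-line computation once the second-cycle insertion is known. a minimal sketch in python (the function name and the numbers are illustrative, not taken from instruction čd s4):

def modulus_cd(p_max_mpa, insertion_m, plate_radius_m=0.15):
    # e = 1.5 * p * r / y, with p the maximum specific pressure [mpa],
    # r the plate radius [m] and y the total mean plate insertion [m]
    # from the loading branch of the second cycle
    return 1.5 * p_max_mpa * plate_radius_m / insertion_m

# example: earth subgrade loaded to p = 0.2 mpa, mean insertion 0.45 mm
print(f"e = {modulus_cd(0.2, 0.45e-3):.1f} mpa")  # -> e = 100.0 mpa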
3 measurement of moduli of deformation at db

the requirements for the measuring apparatus, the detailed test procedure and its assessment are regulated in german standard din 18 134 plattendruckversuch [4]. at the beginning of the measurement, a preload of ca 0.01 mpa is applied and removed after about 30 seconds, and the reading on the path indicators is set at 0.00 mm. the test procedure consists of two loading branches and one load removal branch. a plate with a diameter of 300 mm is loaded until the plate insertion reaches a value of ca 5 mm or a specific pressure of 0.50 mpa. the load is added step by step in at least 6 roughly identical steps. in each loading step, the path indicators are read at 120 second intervals, and in the case of load-bearing layers this time may be reduced to 60 seconds. when measuring with 3 path indicators, the reading on the first indicator is taken 10 seconds before the waiting time expires. the path indicators are always read in the same order and at identical time intervals. load removal is carried out in 3 steps – 50 %, 25 % and 0 % of the maximum plate load. the second loading branch terminates in the last but one step of the first loading branch. the modulus of deformation (verformungsmodul) is calculated from both the first and the second loading branch. it is determined from the loading chart, from the inclination of the secant line between the two points given by the 0.3- and 0.7-multiple of the maximum load, using the general formula

$e_v = 1.5 \, r \, \frac{\Delta\sigma}{\Delta s}$,

where $e_v$ is the static modulus of deformation [mpa], r is the load plate radius, i.e. 0.15 [m], $\Delta\sigma$ is the difference between the 0.3- and 0.7-multiple of the maximum load [mpa], and $\Delta s$ is the difference in the load plate insertion between the 0.3- and 0.7-multiple of the maximum load [m].
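the db secant evaluation can be sketched as follows; the data layout is assumed, and the synthetic loading branch is linear only so that the result is easy to check by hand:

import numpy as np

def modulus_db(pressure_mpa, insertion_m, plate_radius_m=0.15):
    # e_v = 1.5 * r * (delta sigma / delta s), the deltas being taken
    # between the points at 0.3 and 0.7 of the maximum load
    p = np.asarray(pressure_mpa)
    s = np.asarray(insertion_m)
    p_max = p.max()
    s03 = np.interp(0.3 * p_max, p, s)  # insertion at 0.3 * p_max
    s07 = np.interp(0.7 * p_max, p, s)  # insertion at 0.7 * p_max
    return 1.5 * plate_radius_m * (0.7 - 0.3) * p_max / (s07 - s03)

# a synthetic loading branch in steps up to 0.5 mpa
branch_p = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
branch_s = [0.0, 0.3e-3, 0.6e-3, 0.9e-3, 1.2e-3, 1.5e-3]
print(f"e_v = {modulus_db(branch_p, branch_s):.1f} mpa")  # -> e_v = 75.0 mpa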
4 measurement of moduli of deformation on rail bed models

in order to establish the correlation between the moduli of deformation of rail bed structures determined in accordance with the czech and german methodologies in laboratory conditions, the experimental box of the department of railway structures of the faculty of civil engineering, ctu in prague was used. the experimental box consists of welded steel sections and removable walls of wooden planks. the box walls were lined with a thin galvanized plate to minimize the friction of the model at contacts with the walls of the box. the basic dimensions of the inside space of the experimental box are: length 2095 mm, width 990 mm, height 800 mm. the experimental box also includes a movable frame, which enables the load press to be propped while measuring the moduli of deformation on models in the experimental box. the experimental box was used for experimental monitoring of 4 rail bed models in a 1:1 scale composed of rubber plates simulating earth subgrade with different bearing capacities (10.9 mpa, 25 mpa), structural layers of granulated gravel of 0/32 mm grading with different thicknesses (15 cm, 30 cm) and a rail bed of gravel of 32/63 mm grading with a constant thickness of 35 cm. the gravel and granulated gravel were exposed to a grain composition test, which proved that the grain size composition of these materials is suitable for use in rail beds or in structural layers of the substructure body. the moduli of deformation were determined at three height levels: on a simulated earth subgrade, on a layer of granulated gravel and on a layer of gravel. the individual models were labelled with codes containing information on the thickness of the structural layer in cm, the abbreviation of the material used for the structural layer and the approximate bearing capacity of the simulated earth subgrade in mpa (e.g. 15sd e10). characteristic data for each model is presented in table 1. the model of the 15sd e10 structure in the experimental box is shown in fig. 2. the structural layer was composed of either one or two layers, 150 mm in thickness, and the rail bed consisted of two layers, 175 mm in thickness. the individual layers were compacted with a special manual vibrating compacting device with a compacting area of 174×174 mm, and each layer was uniformly compacted for 30 minutes along the whole surface of the experimental box.

fig. 1: measuring the static modulus of deformation on a granulated gravel layer in the experimental box

table 1: labelling of rail bed models and their characteristics

model labelling | earth subgrade bearing capacity [mpa] | structural layer thickness [mm] | rail bed thickness [mm]
15sd e10 | 10.9 | 150 | 350
30sd e10 | 10.9 | 300 | 350
15sd e25 | 25.0 | 150 | 350
30sd e25 | 25.0 | 300 | 350

5 measurements of moduli of deformation on a layer of gravel

after each model had been constructed, three plate load tests on a layer of gravel were performed following the čd methodology and then, in identical positions, three plate load tests were performed following the db methodology. this measurement method was selected due to the confined plan of the model; the inaccuracy in determining the modulus of deformation under the db methodology, however, will be insignificant. the composition of the 15sd e10 model with schematic positions of the circular load plate is shown in fig. 2. the determined values of the moduli of deformation are given in table 2.

6 measurements of moduli of deformation on a layer of granulated gravel

once the plate load tests on a layer of gravel were completed, the gravel was removed from each model, and three plate load tests were performed on a layer of granulated gravel under the čd methodology and subsequently, in identical positions, three plate load tests were performed under the db methodology. the determined values of the moduli of deformation are given in table 3.

7 measurements of moduli of deformation on sand

the location selected for in-situ measurements of the moduli of deformation was the klíčany sand pit, where sand with fine soil admixtures is quarried. the measurements were carried out at two points (field no. 1 and field no. 2) with the natural placement of sand layers showing no significant disintegration due to the extraction machinery. the counterweight used during the measurement of the static moduli of deformation was a loaded lorry. in all, six plate load tests were performed using the čd methodology and six plate load tests using the db methodology. once the plate load tests were completed, holes were dug at the measurement points to a depth of 1.0 m. these dug holes served for taking disintegrated samples for geotechnical laboratory analyses, which proved that the tested layer of sand can be classified under čsn 72 1002 [5] as sand with fine soil admixtures (s3 s-f) with a mean natural moisture content of 7.8 % in field no. 1 and 5.5 % in field no. 2. the determined values of the moduli of deformation are given in table 4. fig. 3 shows the modulus of deformation being measured on a layer of sand in the klíčany sand pit.
fig. 2: rail bed structure model 15sd e10 in the experimental box: a) longitudinal section, b) cross section

table 2: moduli of deformation on a gravel surface for each methodology

model labelling | mean modulus of deformation [mpa], čd methodology | db methodology
15sd e10 | 63.8 | 71.3
30sd e10 | 79.4 | 98.3
15sd e25 | 90.0 | 104.5
30sd e25 | 103.9 | 117.7

table 3: moduli of deformation on a granulated gravel surface for each methodology

model labelling | mean modulus of deformation [mpa], čd methodology | db methodology
15sd e10 | 26.1 | 23.0
30sd e10 | 53.3 | 48.9
15sd e25 | 50.1 | 44.1
30sd e25 | 82.9 | 86.4

fig. 3: static modulus of deformation being measured on a layer of sand in the klíčany sand pit

table 4: moduli of deformation on sand for each methodology

field labelling | mean natural moisture content [%] | mean modulus of deformation [mpa], čd methodology | db methodology
field no. 1 | 8.7 | 82.2 | 136.8
field no. 2 | 5.5 | 38.7 | 67.4

8 measurements of moduli of deformation on soil

the location selected for in-situ measurements of the moduli of deformation was the construction site of an embankment at svojšovice as part of corridor iv within the říčany – stránčice section. the measurement was carried out on a compacted layer of highly non-homogeneous material at two points (field no. 1 and field no. 2). the counterweight used was a flat vibratory roller or a loaded lorry. in all, six plate load tests were performed using the čd methodology and six plate load tests using the db methodology. once the plate load tests were completed, holes were dug at the measurement points to a depth of 0.5 m. these dug holes served for taking disintegrated samples for geotechnical laboratory analyses, which proved that the layer of soil in field no. 1 can be classified under čsn 72 1002 [5] as dirty gravel (g5 gc) to gravel with fine soil admixtures (g3 gf) with a mean natural moisture content of 12.7 %, while the soil layer in field no. 2 is composed of gravelly clay (f2 cg) to gravelly loam (f1 mg) with a mean moisture content of 15.8 %. the values of the moduli of deformation are shown in table 5.

table 5: moduli of deformation on soil for each methodology

field labelling | mean natural moisture content [%] | mean modulus of deformation [mpa], čd methodology | db methodology
field no. 1 | 12.7 | 92.1 | 66.3
field no. 2 | 15.8 | 56.0 | 45.6

9 correlation coefficients

in order to express the relationship between the moduli of deformation determined under the čd methodology and under the db methodology, the correlation coefficients were calculated from the formula

$k = \frac{e_{db}}{e_{čd}}$,

where k is the correlation coefficient [-], $e_{db}$ is the modulus of deformation determined under the german methodology from the second loading branch [mpa], and $e_{čd}$ is the modulus of deformation determined under the czech methodology [mpa]. the correlation coefficients of the moduli of deformation on a layer of gravel are shown in table 6, and on a layer of granulated gravel in table 7. the correlation coefficients of the moduli of deformation on sand are given in table 8 and on soil in table 9.

table 6: correlation coefficients on a layer of gravel

model labelling | mean correlation coefficient "k" | total mean correlation coefficient "k"
15sd e10 | 1.12 | 1.17
30sd e10 | 1.24 |
15sd e25 | 1.16 |
30sd e25 | 1.13 |

table 7: correlation coefficients on a layer of granulated gravel

model labelling | mean correlation coefficient "k" | total mean correlation coefficient "k"
15sd e10 | 0.88 | 0.94
30sd e10 | 0.92 |
15sd e25 | 0.88 |
30sd e25 | 1.04 |

table 8: correlation coefficients on sand

field labelling | natural moisture content [%] | mean correlation coefficient "k" | total mean correlation coefficient "k"
field no. 1 | 8.7 | 1.66 | 1.70
field no. 2 | 5.5 | 1.74 |

table 9: correlation coefficients on soil

field labelling | natural moisture content [%] | mean correlation coefficient "k" | total mean correlation coefficient "k"
field no. 1 | 12.7 | 0.72 | 0.76
field no. 2 | 15.8 | 0.81 |
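the coefficients in tables 6–9 can be cross-checked directly from the mean moduli in tables 2–5; a minimal sketch (the published total means were averaged over the individual tests, so they agree with these ratios up to rounding):

pairs = {  # (e_cd, e_db) in mpa, taken from tables 2, 4 and 5
    "gravel 15sd e10": (63.8, 71.3),
    "gravel 30sd e10": (79.4, 98.3),
    "sand field no. 1": (82.2, 136.8),
    "soil field no. 1": (92.1, 66.3),
}
for name, (e_cd, e_db) in pairs.items():
    print(f"{name}: k = {e_db / e_cd:.2f}")  # 1.12, 1.24, 1.66, 0.72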
10 conclusion

the correlation coefficients determined in our experiment lead to the following conclusions:

1. the čd and db methodologies for measuring the static moduli of deformation apply the same testing apparatus (a circular load plate with a diameter of 30 cm), but the prescribed plate loading is different and so is the calculation procedure for the static modulus of deformation.

2. comparative measurements of static moduli of deformation show that the correlation dependency emerging from the determined moduli of deformation, $k = e_{db}/e_{čd}$, shows considerable differences both for different materials and for multi-layer systems of various materials. for example, the measurements revealed fluctuations in the correlation coefficient "k" in the range of 0.76 to 1.70. in designing the track body, the values of the static moduli of deformation as determined under the čd methodology and the db methodology cannot be substituted for each other.

3. in designing the rail bed on sždc tracks, the requirements for the load-bearing capacity of individual layers must be respected as regulated in instruction čd s4 substructure. the requirements given by instruction čd s4 substructure cannot be substituted for the requirements given by instruction ds 836 vorschrift für erdbauwerke applied by db.

4. in the case of a potential change in the čd methodology for measuring the static modulus of deformation (e.g. by taking over a uniform european standard), instruction čd s4 substructure, čsn 72 1006 checking the compaction of soils and fills and other related regulations would have to undergo considerable modifications.

11 acknowledgment

this paper was written within the research project no. 1f52h/052/130 funded by the ministry of transport of the czech republic.

references

[1] krejčiříková, h., tyc, p., lidmila, m., horníček, l., voříšek, p., et al.: methodology of transitional parameters for the construction of substructure and tracks of the conventional trans-european system. research reports of project 1f52h/052/130, faculty of civil engineering, ctu in prague, department of railway structures, 2005 and 2006.
[2] instruction čd s4 substructure, 1997 (in force since 1. 7. 1998).
[3] čsn 72 1006 checking the compaction of soils and fills. český normalizační institut, 1998.
[4] din 18 134 plattendruckversuch. deutsches institut für normung e. v., 1993.
[5] čsn 72 1002 classification of soils for transportation structures. český normalizační institut, 1993.

ing. martin lidmila, ph.d., e-mail: lidmila@fsv.cvut.cz
ing. leoš horníček, ph.d., e-mail: leos.hornicek@fsv.cvut.cz
doc. ing. hana krejčiříková, csc., phone: +420 224 354 756, e-mail: krejcirikova@fsv.cvut.cz
prof. ing. petr tyc, drsc., e-mail: petr.tyc@fsv.cvut.cz
department of railway structures, czech technical university in prague, faculty of civil engineering, thákurova 7, 166 29 praha 6, czech republic
improving the energy efficiency of a tram's running gear

stanislav semenov a, evgeny mikhailov a, sergii kliuiev a, ján dižo b,∗, miroslav blatnický b, vadym ishchuk b

acta polytechnica 63(3):216–226, 2023. https://doi.org/10.14311/ap.2023.63.0216 © 2023 the author(s). licensed under a cc-by 4.0 licence. published by the czech technical university in prague.

a volodymyr dahl east ukrainian national university, faculty of transport and building, department of logistics and traffic safety, ioanna pavla ii str., 17, 01042 kyiv, ukraine
b university of žilina, faculty of mechanical engineering, department of transport and handling machines, univerzitná 8215/1, 010 26 žilina, slovak republic
∗ corresponding author: jan.dizo@fstroj.uniza.sk

abstract. this research analyses the influence of some design features of an undercarriage of a tram on its energy efficiency in terms of mechanical energy losses during operation. the work compares the results of the research from two main points of view, namely the running gear design and the railway wheel design. two variants of a tram's bogie are investigated: a standard bogie and a bogie with a system that allows a radial adjustment of the wheelsets in curves. two designs of a railway wheel are compared: a wheel with the traditional construction scheme and a wheel with a perspective construction scheme. the values of mechanical energy losses due to slippage in the contact between the wheels and rails are analysed. these losses are obtained with reference to a specific route, together with the value of the average power dissipated. moreover, an analysis of the advantages and disadvantages of the bogie designs has made it possible to identify the most appropriate bogie design in terms of ensuring the energy efficiency. in this case, the bogie design with the wheels with the perspective construction scheme can be considered as the optimal option.

keywords: energy efficiency, tram's bogie, simulations, mbs model, railway wheel.

1. introduction

the successful functioning of large cities is only possible with an efficient public transport system. one of the most important types of street rail-based public transport, usually with electric traction, used mainly in large cities to transport passengers along fixed routes, is the tram [1–5]. in some places in europe, the tram has even begun to go beyond the classical concept of urban transport, as it can drive on the tracks of a conventional railway and operate in suburban areas, covering distances of up to 80 kilometers outside of the city limits. in some places where there is no subway, the tram can pass under the busiest streets through tunnels. tram transport is therefore actively developing [6–12]. currently, the transport industry is faced with the urgent task of ensuring the energy efficiency of vehicles, including rail vehicles. the urgency of this issue is heightened by rising prices of electricity and fuel resources. therefore, even a slight decrease in the energy consumption required to move trams can significantly reduce operating costs, since the provision of movement accounts for the main part of these costs.
according to [13], energy efficiency can be understood as the comparative ability to use less energy for the same level of energy supply for technological processes [14–16]. one of the main indicators of the energy efficiency of a tram as a rail vehicle is the amount of energy required to move along the rail track, i.e. to overcome the resistance to its movement. among several ways to create rail vehicles with competitive energy efficiency indicators, one of the most promising is the use of innovative bogie designs [17–21]. for example, it is well known that the use of radially mounted bogies and wheelsets can significantly reduce the energy required to overcome the resistance to movement. the energy required to move the rolling stock is reduced while at the same time providing resistance against wobbling of the bogies. this resistance depends primarily on the bending stiffness (resistance to the angular rotation of the wheelsets) and the shear stiffness of the bogies (resistance to the mutual lateral movement of the wheelsets). high shear stiffness provides stable movement on straight tracks, and low bending stiffness allows the vehicle to negotiate curved track sections without flange guidance [22–24]. finding the optimum combination of these characteristics will lead to a reduction in the energy required to move rail vehicles on both straight and curved tracks [25]. there are known technical solutions in which a wheelset does not consist of a pair of wheels rigidly mounted on a common axle, but of three independently moving components, i.e. two wheels rotating on an axle independently of each other [26]. this technical solution is known as the irws (independently rotating wheels). on the one hand, this reduces the wear of the wheel/rail pair in curves with small radii and partially, depending on the application, also helps to reduce the noise of a driving urban railway vehicle. on the other hand, releasing the rigid coupling between the wheels and the axle leads to unstable rotation of the wheelset, or rather, to unpredictable movement of the irw wheelset. it means that the standard and necessary sinusoidal movement of the wheelset is no longer present. many technical solutions have been developed to eliminate the described disadvantages of the standard irws wheelset. a number of them include more or less complicated mechatronic systems [27–30], which increase the complexity of the wheelset and thus of the entire bogie (running gear) [30–33]. another related problem is the braking system, which also needs adequate modifications [34–37]. the question is, therefore, whether it is possible to design a wheelset (mainly for urban railway vehicles – trams) which will be able to reduce the negative effects of the traditional wheelset, mainly wear and energy losses, and at the same time eliminate the disadvantages of the irws. one of the perspective construction scheme (pks) solutions in this direction is the use of control systems for the installation of the running gear in the rail track [38, 39]. for example, in some of them, the input parameter is the angle between the bodies of two cars or between the car body and the bogie. although such systems cannot accurately operate when entering and leaving curves, their operation does not depend on the geometry of the profiles of wheels and rails.
if the rails are in a good condition, radial adjustment systems can be used which depend on the geometry of the contacting wheel and rail profiles, with partial use of the mass inertia. however, due to the predominantly worn state of the rail track, a negative effect of using this technical solution is possible and consists in an increased uneven wear of the wheels [40–45]. the operation of the railway vehicle and the application of the proposed pks wheel design relate not only to the running of the rail vehicle through curves, but also to running through crossings and switches. research focused on the phenomenon of the lifetime of the wheel/rail pair components is included in [46, 47]. these works mainly focus on monitoring the condition of the most critical part of turnouts (the common crossing) as well as on predicting the crossing geometry deterioration. future research with the proposed pks may also demonstrate the contribution in terms of reduced wear and deterioration effects on the surfaces of the railway wheel and the rails. the mechanism of radial installation of wheelsets cannot be used advantageously in the modernisation of existing rolling stock [48]. on new rolling stock, the implementation of this measure does not require additional high costs. according to [48], the use of such devices is expedient almost everywhere, except for lines with a small number of curves. taking into account the predominant contribution of the kinematic resistance to the overall resistance to the movement of the railway vehicle, another pks technical solution that can significantly reduce the resistance to movement in curved sections of the track and reduce the wear of the contact surfaces is the use of independently rotating wheels in the running gear [49, 50]. a further development of such technical solutions is, for example, the design of a rail vehicle wheel proposed and justified in [51, 52], allowing the independent rotation of its supporting rolling surface and its guide surface. this technical solution makes it possible to minimise the amount of kinematic resistance to the movement of the rail vehicle.

2. a method of the problem solving

an important generalising indicator of the energy consumption during the movement of a transport unit is the total specific resistance to its movement $w''_{t\_srm}$ [n·kn−1]. based on [53–57], the total specific resistance to the movement of a railway vehicle is

$w''_{t\_srm} = w''_{m\_srm} + w''_{ad\_srm}$, (1)

where $w''_{m\_srm}$ [n·kn−1] is the main specific resistance to movement of the railway vehicle and $w''_{ad\_srm}$ [n·kn−1] is the additional specific resistance to movement of the railway vehicle.
among the well-known works on determining the components of the main specific resistance to the movement of railway vehicles, the studies of p. n. astakhov stand out [53, 54], where it is indicated that the main components of this resistance are the specific resistance from friction in bearings $w''_{fb}$ [n·kn−1], the specific resistance from rolling friction of wheels on rails $w''_{rfw}$ [n·kn−1], the specific resistance from sliding friction of wheels on rails $w''_{sfw}$ [n·kn−1], the specific aerodynamic drag $w''_{sad}$ [n·kn−1], the specific resistance from energy dissipation in transit $w''_{edt}$ [n·kn−1], and the specific resistance from dissipation of energy into the environment $w''_{dee}$ [n·kn−1], i.e.

$w''_{m\_srm} = w''_{fb} + w''_{rfw} + w''_{sfw} + w''_{sad} + w''_{edt} + w''_{dee}$. (2)

as the wheels rotate, frictional forces are created in the bearings, depending on the type of bearing, the quantity and quality of the lubricant, the air temperature, and the speed of the vehicle. modern axle boxes use cylindrical or tapered roller bearings, which provide a significant reduction in drag as compared to plain bearings. the value of the friction coefficient of roller bearings is from 0.001 to 0.005. the specific aerodynamic drag $w''_{sad}$ [n·kn−1] of the movement of the car mainly depends on the factors associated with the design of the body of the rail vehicle, its speed, the axial load, and the ambient temperature. it is known that the condition of the railway track significantly affects the overall resistance to the movement of the vehicle, particularly as the speed of movement and the axial loads increase. in a number of works, when determining the specific resistance to the movement of a rail vehicle from energy dissipation along the way, the indicator of energy dissipation in the track is used, which is determined experimentally [55, 56]. the use of generalised empirical formulas for calculating the value of this indicator is problematic due to the presence of a large number of influencing factors. the value of the specific resistance to movement from the dissipation of energy into the environment $w''_{dee}$ [n·kn−1] is also difficult to quantify using generalised empirical formulas for the same reason. in [57, 58], it was proposed to take the value of this indicator in comparative calculations considering the assumption of unwearable elements of the running gear. the value of the specific resistance to movement from the rolling friction of wheels on rails $w''_{rfw}$ [n·kn−1] also depends on many factors: the arrangement and loading of the rolling stock, the type of rails, the sleepers and their number per 1 km of the track, the type and condition of the ballast, etc. the better the quality and technical condition of the track, the lower the rolling resistance of the wheels on the rails. this indicator is also difficult to quantify using generalised empirical formulas, because under the influence of large loads and plastic deformations of the material, microprocesses of sliding friction and wear of the contact surfaces occur simultaneously with rolling friction. the friction coefficient for pure rolling is from 0.001 to 0.01. when the wheel is rolling, in addition to the rolling friction, there is also an elastic sliding of the wheel along the rail. the dimensionless coefficient of elastic sliding of one body over another, or creep, is defined as the deviation from pure rolling conditions. in addition, in some cases, inelastic sliding of the rolling surfaces of the wheels and flanges along the rails also occurs.
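for bookkeeping, eqs. (1) and (2) are plain summations over the named components; a minimal sketch with assumed, purely illustrative values in n·kn−1:

main = {           # eq. (2) components, all values assumed for illustration
    "w_fb": 0.4,   # friction in bearings
    "w_rfw": 0.3,  # rolling friction of wheels on rails
    "w_sfw": 0.5,  # sliding friction of wheels on rails
    "w_sad": 0.2,  # aerodynamic drag
    "w_edt": 0.3,  # energy dissipation in transit
    "w_dee": 0.1,  # dissipation of energy into the environment
}
additional = {"w_r": 0.8, "w_i": 0.0, "w_b": 0.0}  # curve, inclination, wind

w_m = sum(main.values())               # eq. (2)
w_t = w_m + sum(additional.values())   # eq. (1)
print(f"w''_m = {w_m:.1f} n/kn, w''_t = {w_t:.1f} n/kn")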
the value of the specific resistance to the movement of the car from the sliding friction of the wheels on the rails $w''_{sfw}$ [n·kn−1] is determined by converting the absolute values [n] of the forces of resistance to movement from the sliding of the wheels of a railway vehicle along the rails into specific values [n·kn−1] [59, 60]. the additional specific resistance to the movement of the car $w''_{ad\_srm}$ [n·kn−1] generally includes the resistance to movement in a curve $w''_{r}$ [n·kn−1], the specific resistance caused by the rail track inclination $w''_{i}$ [n·kn−1], and the specific resistance caused by the action of wind load $w''_{b}$ [n·kn−1]. it should be noted that it is not possible to influence the values of the last two components through the use of innovative designs of the running gear. one of the most significant components of the resistance to the movement of a railway vehicle is the so-called kinematic resistance to movement, which is associated with slips at the contact points of the wheels with the rails, leading to the dissipation of mechanical energy during the movement of the vehicle. a review of scientific papers on the study of the kinematic resistance to the movement of railway vehicles showed the dependence of its value on a significant number of factors: the presence of curves of a small radius and the specific length of these curves on a given test site, the presence of lubrication on the contact surfaces, and the technical condition of the vehicle and the railway track. the presence of a large number of curves on a certain polygon and the specific length of these curves increase the kinematic resistance to movement and the wear of the contact surfaces. this is especially true for curves of a small radius. this factor cannot be influenced. to reduce the kinematic resistance to movement and the wear of the contact surfaces, a widespread measure is the lubrication of the contact surfaces of wheels and rails. the result of its application is an increase in traffic safety, improved conditions for fitting into curves and passing through turnouts, and a reduction in the consumption of fuel and energy resources for train traction, in the wear of rolling stock wheel flanges and rails, and in noise levels. the method of lubrication and the conditions of use affect the complex nature of the interaction processes between the wheel and the rail when the vehicle moves in curved sections of the track, the wear of the running gear, and the rails. however, a number of serious issues remain unresolved, such as the environmental friendliness of lubricating oils and the ingress of sand and dust into lubricant compositions, which causes additional resistance to movement and increased wear of the wheel-rail contact surfaces. an important factor in reducing the kinematic resistance to the movement of rail vehicles is the maintenance of the rail track in an appropriate technical condition. however, these measures require significant material costs, which makes simultaneous improvement processes impossible. also, a reason for the unsatisfactory technical condition of the rail track is the discrepancy between the geometric parameters of the track and the parameters of the wheels, which also affects the indicators of wear and traffic safety [61, 62]. in this short theoretical review, it was found that the existing methods for calculating the resistance to movement of railway vehicles do not take into account its dependence on the design features of their running gear.
this does not allow a direct possibility of determining this characteristic at the design stage of the vehicle. it is advisable to study and evaluate the influence of these design features on the energy efficiency of the vehicle by means of simulation. the characteristics of the movement of a rail vehicle along the section are influenced by many random factors; this causes certain difficulties in determining the possible effect of the use of certain technical solutions from the standpoint of energy efficiency. therefore, in order to calculate the energy costs for the movement of a tram vehicle when comparing competing variants of technical solutions of its components, it is advisable to determine the main characteristics of energy consumption with a mandatory reference to a specific route. this requirement is especially relevant when conducting research on the influence of the design features of tram running gear on the level of kinematic resistance to their movement. it should be taken into account that the values of most of the abovementioned components of the resistance to movement of a rail vehicle are practically independent of the design of its running gear. therefore, the study considered, as the impact on the reduction of energy consumption during the movement of the tram, precisely the component that is determined by the forces generated when the wheels slip on the rails (the kinematic resistance to movement). in fact, the magnitude of this component can be influenced through the use of innovative designs of the undercarriage. in order to determine the impact on the energy efficiency of trams of the possible use of innovative technical solutions in their running gear by reducing the kinematic resistance to their movement, the simulation of the movement of several variants of the vehicle was carried out on the example of the tatra t3 tram car. the simulation was carried out with reference to a real tram route (part of route no. 12: ukraine, kharkov, yuzhny station – lesopark, appendix 1, figure a1). this route consists of straight sections of the track and curves of different radii, the minimum of which is 20 m. a map-scheme of the part of the route in relation to which the simulation was performed is shown in figure 1. the map in figure 1 shows the distribution of curves with various radii along the entire length of the track section. the colours of the bars indicate the curve radii in the following manner: the red colour means the smallest radius of a curve and the blue colour the biggest radius of a curve.

figure 1. a bar graph of the curved sections of the route.
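an overview of this kind can be produced from a simple table of the curves of a route; an illustrative sketch with assumed radii and lengths (not the data of the real route no. 12):

import numpy as np

curves = [  # (radius [m], curve length [m]); assumed values
    (20, 35), (25, 40), (50, 60), (150, 120), (300, 200), (600, 250),
]
route_length_m = 7000.0
radii = np.array([r for r, _ in curves])
lengths = np.array([l for _, l in curves])
bins = [0, 50, 100, 200, 400, 1000]
for lo, hi in zip(bins[:-1], bins[1:]):
    share = 100.0 * lengths[(radii >= lo) & (radii < hi)].sum() / route_length_m
    print(f"r in [{lo:4d}, {hi:4d}) m: {share:4.1f} % of the route")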
the tramcar motion modelling was carried out using the simpack software package. the main components of the resistance to the movement of a rail rolling stock are associated with the forces generated when the wheels slip on the rails on straight and curved sections of the track. to reduce them, it is necessary to create innovative undercarriages, which use, for example, advanced wheel designs or devices for the radial installation of bogies or wheelsets.

figure 2. a railway running gear with the tks wheels (left) and with the pks wheels (right).

figure 3. a visualisation of the contact zone of a pks wheel with a rail.

the study considered the movement of the running gear of a tram car, taking into account the following design features:
• a tram car with bogies and wheels of a standard design,
• a tram car with bogies of a standard design and wheels with the pks, which have the possibility of separate rotation of the supporting and guiding surfaces,
• a tram car with bogies with a radial installation of wheelsets and wheels of a standard design.

figures 2, 3 and 4 show diagrams of the main design features of the tram running gears under consideration. the bogie designs shown in figure 2 are standard, so there is no detailed description of them; the difference between them is only in the design schemes of the wheels. the bogie on the left of figure 2 has wheels of a standard design. a feature of the wheels of the bogie on the right of figure 2 is the possibility of separate rotation of the supporting and guiding surfaces. this means that the wheel flange is rigidly mounted on the axle and the tread surface can rotate separately in relation to the flange and the wheelset axle. in this design, the tread surface rests on bearings mounted on the axle. due to the specified design, the amount of kinematic slip of the contact surfaces is reduced, especially in the flange contact, and the corresponding work of the friction forces is also reduced. figure 3 shows the visualisation of the contact zone of a wheel of the perspective design scheme with a rail, formed in simpack. the track gauge of the model is 1520 mm, the wheel diameter is 680 mm, the wheel profile is e-99-00 and the rail profile is b1. the wheel/rail contact is calculated by means of the fastsim contact algorithm. the bogie presented in [63–65] was accepted as the bogie of the tram car with the radial installation of the wheelsets. a feature of such a bogie (figure 4) is the fastening of the axle frame with the help of three bearing housings and the presence of a lever mechanism for adjusting the radial position of the axles when moving along a curve [66–68]. the key feature of the suspension is its kinematic independence from the axle positioning system.

figure 4. a scheme of a tram car running gear with a radial installation of wheelsets and with wheels of a standard design.

3. results and discussion

in general, the consumption of mechanical energy at the points of contact of the wheels with the rails during the movement of the vehicle has been calculated by determining and summing the mechanical work of the friction forces at each contact point of the wheels of the tram vehicle with the rails. the described types of the tram bogies were analysed. the analyses have been focused on investigating the energy consumption during running on a selected track line (section 2). there are many quantities and output parameters which can be assessed. in the case of such a complex model of a tram bogie or an entire tram, it is possible to focus on various outputs. some limits, which have been considered regarding the evaluation of the outputs, should be noted. the simulations were carried out for running on a track with no irregularities. this means that the mechanical system of a bogie/tram was not excited by the track irregularities and these dynamic effects were neglected during the simulations.
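in the simulation itself the tangential contact problem is solved by fastsim; as a crude stand-in for orientation only, kalker's linear theory saturated by the coulomb friction limit can be sketched (the creep coefficients f11, f22 and the other numbers are assumed, illustrative values):

import math

def creep_forces(nu_x, nu_y, n_load, mu=0.3, f11=7.0e6, f22=6.0e6):
    # returns (u, v) [n], the longitudinal and lateral creep forces,
    # for creepages nu_x, nu_y [-] and a normal wheel load n_load [n]
    u_lin = -f11 * nu_x
    v_lin = -f22 * nu_y
    f_lin = math.hypot(u_lin, v_lin)
    f_max = mu * n_load               # coulomb saturation limit
    if f_lin <= f_max:
        return u_lin, v_lin
    scale = f_max / f_lin             # saturate, keeping the direction
    return u_lin * scale, v_lin * scale

print(creep_forces(0.002, 0.001, n_load=25e3))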
the mechanical energy $e_{abi}$ [j], which is dissipated when the wheels slip on the rails in the corresponding contact point (equal to the work $a_{abi}$ [j] of the creep forces in the corresponding contact), was defined as the scalar product of the corresponding components of the creep force $f_{abi}$ [n] at the corresponding contact point and of the slip on the rail at the contact point in the corresponding direction:

$e_{abi} = a_{abi} = -(u_{abi} \cdot s^{u}_{abi} + v_{abi} \cdot s^{v}_{abi})$, (3)

where a, b are the number of the wheel pair of the vehicle in the direction of travel and the side (left or right) of the tram running gear, respectively, $u_{abi}$ [n], $v_{abi}$ [n] are the longitudinal and lateral components of the creep force in the i-th contact of the corresponding wheel and the rail, respectively, and $s^{u}_{abi}$, $s^{v}_{abi}$ are the slips in the i-th contact of the corresponding wheel and the rail in the longitudinal and lateral directions, respectively. in order to take into account the rapid changes in the values of the forces acting at the corresponding contact points of the wheels and the rails, their values have been sampled at a specific frequency (the simulation frequency $f_s$ [hz]). the corresponding slip was calculated taking into account the components of the slip velocity at the contact, i.e.

$s^{u}_{abi} = v^{u}_{abi} \cdot t_{fs}$, $s^{v}_{abi} = v^{v}_{abi} \cdot t_{fs}$, (4)

where $v^{u}_{abi}$, $v^{v}_{abi}$ are the components of the sliding speed at each contact in the longitudinal and lateral directions, respectively, and $t_{fs}$ [s] is the simulation time period corresponding to the simulation frequency $f_s$ (in the calculations, $f_s$ = 200 hz was accepted), that is

$t_{fs} = \frac{1}{f_s}$. (5)

the total consumption of mechanical energy due to sliding at the points of contact of the wheels with the rails during the movement of the vehicle is equal to the sum $e_{sum}$ [j] of the mechanical energy dissipated at each contact point of each wheel of the railway vehicle with the rail:

$e_{sum} = \sum_{a=1}^{4} \sum_{b=1}^{2} \sum_{i=1}^{2} e_{abi}$, (6)

where $e_{abi}$ [j] is the mechanical energy dissipated when the wheels slip on the rails in the corresponding contact point. the average power dissipated when the wheels slipped on the rails while moving along the route, $n$ [w], was determined as follows:

$n = \frac{e_{sum}}{t}$, (7)

where t [s] is the time of the movement of the vehicle along the considered route.
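the post-processing chain of eqs. (3)–(7) can be sketched as follows, assuming the simulation exports per-step creep forces and slip velocities for the eight wheel/rail contacts (the arrays here are synthetic stand-ins):

import numpy as np

fs = 200.0                       # simulation frequency [hz], as in the paper
t_fs = 1.0 / fs                  # eq. (5): sampling period [s]

rng = np.random.default_rng(0)   # synthetic slip velocities [m/s]
n_steps, n_contacts = 20000, 8   # 4 wheelsets x 2 sides, one contact each here
vu = rng.normal(0.0, 1e-3, (n_steps, n_contacts))  # longitudinal
vv = rng.normal(0.0, 1e-3, (n_steps, n_contacts))  # lateral
u = -5.0e5 * vu                  # illustrative creep forces opposing the slip [n]
v = -5.0e5 * vv

su, sv = vu * t_fs, vv * t_fs    # eq. (4): slips over one step [m]
e = -(u * su + v * sv)           # eq. (3): dissipated energy, >= 0 [j]
e_sum = float(e.sum())           # eq. (6): total over the route and contacts
t_total = n_steps * t_fs         # travel time along the route [s]
n_avg = e_sum / t_total          # eq. (7): average dissipated power [w]
print(f"e_sum = {e_sum:.1f} j, n = {n_avg:.3f} w")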
as a result of processing the simulation data, the following values of the total mechanical energy losses (the work of the tangential forces in the contact points of wheels and rails) from sliding at the points of contact of the wheels and the rails were obtained for a tram car moving on the given route:
• a tram car with bogies and wheels of a standard design: 297.3 kj,
• a tram car with bogies of a standard design and wheels capable of separate rotation of the supporting and guiding surfaces: 178.6 kj,
• a tram car with wheels of a standard design and the possibility of radial installation of wheelsets: 168.1 kj.

the average power consumption to overcome the kinematic resistance to movement when the tram car variants move along the indicated route was:
• a tram car with bogies and wheels of a standard design: 0.083 kw,
• a tram car with bogies of a standard design and wheels capable of separate rotation of the supporting and guiding surfaces: 0.05 kw,
• a tram car with wheels of a standard design and the possibility of radial installation of wheelsets: 0.048 kw.

figures 5, 6, 7 and 8 show, as an example, the results of several studies on the effectiveness of applying innovative technical solutions in the running gear of a tram to reduce energy losses when it moves at a speed of 10 km·h−1. figure 8 shows the results of a study of the power dissipated when the wheels of the tram cars slide on the rails during the passage of the considered route with the speed ranging from 10 km·h−1 to 30 km·h−1.

figure 5. total loss of mechanical energy $e_{sum}$ [j] due to slipping of the tram wheels on the rails during the passage of the considered route at a speed of 10 km·h−1; tks – a bogie with the traditional construction scheme, pks – a bogie with the perspective construction scheme, radial – a bogie with the radial setting system.

figure 6. the average power n [kw] dissipated when the wheels of the tram car slip on the rails during the passage of the considered route at a speed of 10 km·h−1; tks – a bogie with the traditional construction scheme, pks – a bogie with the perspective construction scheme, radial – a bogie with the radial setting system.

figure 7. possible effect of energy saving by using the innovative technical solutions at a running speed of 10 km·h−1: red colour – the amount of energy dissipated due to wheel slippage on the rails when the tram moves along the route, green colour – energy saving effect; tks – a bogie with the traditional construction scheme, pks – a bogie with the perspective construction scheme, radial – a bogie with the radial setting system.

figure 8. the value of the average power n [kw] dissipated when the wheels of the tram cars slide on the rails during the passage of the considered route l [km]; tks – a bogie with the traditional construction scheme, pks – a bogie with the perspective construction scheme, radial – a bogie with the radial setting system.

the analysis of the obtained data allows us to state that the use of wheels with the pks wheel design as part of standard tram car bogies can reduce the consumption of mechanical energy to overcome the resistance to movement as effectively as the use of bogies with the possibility of radial installation of wheelsets.
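the savings quoted in the conclusions below follow directly from these route totals; a quick cross-check:

losses_kj = {"tks": 297.3, "pks": 178.6, "radial": 168.1}  # reported totals
base = losses_kj["tks"]
for name, e in losses_kj.items():
    saving = base - e
    print(f"{name}: {e:.1f} kj, saving {saving:.1f} kj ({100 * saving / base:.0f} %)")
# -> pks saves 118.7 kj (~40 %), radial saves 129.2 kj (~43 %)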
these are the possibilities of radial installation of wheel sets and the use of wheels with the capability of separate rotation of the supporting and guiding surfaces. the energy efficiency improvements for these running gear variants are almost the same. the results of the results of the simulations are as follows. the total mechanical energy loss due to sliding in the wheel/rail contact, in the case of the tram with the bogie equipped with the pks, was 297.3 kj. however, the tks wheels can reduce these losses. the reduction indicated for the analysed track section is 118.7 kj and the installation of the mechanism for adjusting the wheelset to the radial position in the track even reduces these losses by 129.2 kj. the comparison of the average power consumption has also shown that the most favourable technical solution of the bogie seems to be the bogie with the mechanism for adjusting the wheelset in the radial position under the analysed conditions. it can reduce this power by almost half, to the value of 0.048 kw, compared to the bogie with the tks wheels (0.083 kw). the installation of the pks wheels has a similar effect on the average power consumption. the value for this technical solution is 0.05 kw. an analysis of the advantages and disadvantages of the running gear options under consideration allows us to conclude that, in this case, the option of the tram bogie with wheels of the pks design can be considered the most appropriate in terms of ensuring the energy efficiency of the tram car. list of symbols w ′′ t_srm total specific resistance [n·kn−1] w ′′ m_srm main specific resistance [n·kn−1] w ′′ ad_srm additional specific resistance [n·kn −1] w ′′ f b specific frictional resistance of bearings [n·kn −1] w ′′ rf w specific rolling resistance of wheels on rails [n·kn−1] w ′′ sf w specific sliding friction resistance of wheels on rails [n·kn−1] w ′′ sad specific aerodynamic drag [n·kn −1] w ′′ edt specific resistance resulting from energy dissipation in transit [n·kn−1] w ′′ dee specific resistance resulting from dissipation of energy into the environment [n·kn−1] w ′′ r specific resistance resulting from motion in the curve [n·kn−1] w ′′ i specific resistance resulting from rail track inclination [n·kn−1] w ′′ b specific resistance resulting from the action of wind load [n·kn−1] eabi mechanical energy [j] aabi mechanical work of the creep forces in the corresponding contact [j] fabi corresponding component of the creep force [n] uabi longitudinal component of the creep force in the i-th contact between the corresponding wheel and the rail [n] vabi lateral component of the creep force in the i-th contact between the corresponding wheel and the rail [n] suabi longitudinal slip [–], [%] svabi lateral slip [–], [%] fs simulation frequency [hz] vuabi longitudinal component of a sliding speed [m·s −1] vvabi lateral component of a sliding speed [m·s −1] tfs period of the simulation time [s] esum sum of the mechanical energy consumed [j] n average dissipated power [w] t time of movement of a vehicle [s] acknowledgements this research was supported by the cultural and educational grant agency of the ministry of education of the slovak republic in the project no. kega 031žu-4/2023: development of key competences of the graduates of the study programme vehicles and engines. this research was supported by the scientific grant agency of the ministry of education, science, research and sport of the slovak republic and the slovak academy of sciences in the project no. 
references [1] m. damjanović, s. željeko, d. stanimirović, et al. impact of the number of vehicles on traffic safety: multiphase modeling. facta universitatis series: mechanical engineering 20(1):177–197, 2022. https://doi.org/10.22190/fume220215012d [2] j. szmagliński, s. grulkowski, k. birr. identification of safety hazard and their sources in tram transport. matec web of conferences 231:05008, 2018. https://doi.org/10.1051/matecconf/201823105008 [3] j. gnap, š. senko, m. kostrzewski, et al. research on the relation between transport infrastructure and performance in rail and road freight transport – a case study of japan and selected european countries. sustainability 13(12):6654, 2021. https://doi.org/10.3390/su13126654 [4] j. gnap, p. varjan, p. ďurana, m. kostrzewski. research of relationship between freight transport and transport infrastructure in selected european countries. transport problems 14(3):63–74, 2019. https://doi.org/10.20858/tp.2019.14.3.6 [5] a. lovska, o. fomin, a. horban, et al. investigation of the dynamic loading of a body of passenger cars during transportation by rail ferry. eureka, physics and engineering 2019(4):91–100, 2019. https://doi.org/10.21303/2461-4262.2019.00950 [6] r. melnik, s. koziak, b. sowiński, a. chudzikiewicz. reliability analysis of metro vehicles operating in poland. transportation research procedia 40:808–814, 2019. https://doi.org/10.1016/j.trpro.2019.07.114 [7] o. fanta, f. lopot, p. kubovy, et al. kinematic analysis and head injury criterion in a pedestrian collision with a tram at the speed of 10 and 20 km.h−1. manufacturing technology 22(2):139–145, 2022. https://doi.org/10.21062/mft.2022.024 [8] e. bernal, m. spiryagin, i. persson, et al. traction control algorithms versus dynamic performance in light rail vehicle design architectures. lecture notes in mechanical engineering pp. 78–87, 2022. https://doi.org/10.1007/978-3-031-07305-2_9 [9] j. stefanović-marinović, ž. vrcan, s. troha, m. milovančević. optimization of two-speed planetary gearbox with brakes on single shafts. reports in mechanical engineering 3(1):94–107, 2022. https://doi.org/10.31181/rme2001280122m [10] ž. vrcan, j. stefanović-marinović, m. tica, s. troha. research into the properties of selected single speed two-carrier planetary gear trains. journal of applied and computational mechanics 8(2):699–709, 2022. https://doi.org/10.22055/jacm.2021.39143.3358 [11] j. mašek, m. kendra, s. milinković, et al. proposal and application of methodology of revitalization of regional railway track in slovakia and serbia. part 1: theoretical approach and proposal of methodology for revitalization of regional railways. transport problems 10:85–95, 2015. https://doi.org/10.21307/tp-2015-064 [12] l. naegeli, u. weidmann, a. nash. checklist for successful application of tram-train systems in europe. transportation research record 2275(1):39–48, 2012. https://doi.org/10.3141/2275-05 [13] iso 50001. energy management systems – requirements with guidance for use. [14] z. nunić, m. ajanović, d. miletić, r. lojić. determination of the rolling resistance coefficient under different traffic conditions.
facta universitatis series: mechanical engineering 18(4):653–664, 2020. https://doi.org/10.22190/fume181116015n [15] k. de payrebrune. relation of kinematics and contact forces in three-body systems with a limited number of particles. facta universitatis series: mechanical engineering 20(1):95–108, 2022. https://doi.org/10.22190/fume210310035p [16] j. moravec. increase of the operating life of active parts of cold-moulding tools. technicki vjesnik 24:143–146, 2017. https://doi.org/10.17559/tv-20150520132311 [17] z. yang, z. lu, x. sun, et al. robust lpv-h ∞ control for active steering of tram with independently rotating wheels. advances in mechanical engineering 14(11):1–12, 2022. https://doi.org/10.1177/16878132221130574 [18] a. chudzikiewicz, i. maciejewski, t. krzyżyńński, et al. electric drive solution for low-floor city transport trams. energies 15(13):4640, 2022. https://doi.org/10.3390/en15134640 [19] a. chudzikiewicz, m. sowińska. modelling the dynamics of an unconventional rail vehicle bogie with independently rotating wheels with the use of boltzmann-hamel equations. vehicle system dynamics 60(3):865–883, 2022. https://doi.org/10.1080/00423114.2020.1838567 [20] a. heckmann, d. lüdicke, a. keck, b. goetjes. a research facility for the next generation train running gear in true scale. lecture notes in mechanical engineering pp. 18–27, 2022. https://doi.org/10.1007/978-3-031-07305-2_3 [21] m. gharagozloo, a. shahmansoorian. chaos control in gear transmission system using gpc and smc controllers. journal of applied and computational mechanics 8(2):545–556, 2022. https://doi.org/10.22055/jacm.2020.32499.2028 [22] f. klimenda, j. skocilas, b. skocilasova, et al. vertical oscillation of railway vehicle chassis with asymmetry effect consideration. sensors 22(11):4033, 2022. https://doi.org/10.3390/s22114033 [23] m. svoboda, v. schmid, m. sapieta, et al. influence of the damping system on the vehicle vibration. manufacturing technology 19(6):1034–1040, 2019. https://doi.org/10.21062/ujep/408.2019/a/ 1213-2489/mt/19/6/1034 [24] z. dvořák, b. leitner, l. novák. software support for railway traffic simulation under restricted conditions of the rail section. procedia engineering 134:245–255, 2016. https://doi.org/10.1016/j.proeng.2016.01.066 [25] j. gerlici, v. sakhno, a. yefymenko, et al. the stability analysis of two-wheeled vehicle model. matec web of conferences 157:01007, 2018. https://doi.org/10.1051/matecconf/201815701007 [26] r. melnik, s. sowiński. analysis of dynamics of a metro vehicle model with differential wheelsets. transport problems 12(3):113–124, 2017. https://doi.org/10.20858/tp.2017.12.3.11 [27] a. barbera, g. bucca, r. corradi, et al. electronic differential for tramcar bogies: system development and performance evaluation by means of numerical simulation. vehicle system dynamics 52(sup1):405–420, 2014. https://doi.org/10.1080/00423114.2014.901543 [28] g. vaiciunas, s. steisunas. sperling’s comfort index study in a passenger car with independently rotating wheels. transport problems 16(2):121–130, 2021. https://doi.org/10.21307/tp-2021-028 [29] g. megna, a. bracciali. gearless track-friendly metro with guided independently rotating wheels. urban rail transit 7:285–300, 2021. https://doi.org/10.1007/s40864-021-00159-2 [30] t. zhang, x. guo, t. jin, et al. dynamic derailment behaviour of urban tram subjected to local collision. international journal of rail transportation 10(5):581–605, 2022. 
https://doi.org/10.1080/23248378.2021.1964392 223 https://doi.org/10.21303/2461-4262.2019.00950 https://doi.org/10.1016/j.trpro.2019.07.114 https://doi.org/10.21062/mft.2022.024 https://doi.org/10.1007/978-3-031-07305-2_9 https://doi.org/10.31181/rme2001280122m https://doi.org/10.22055/jacm.2021.39143.3358 https://doi.org/10.21307/tp-2015-064 https://doi.org/10.3141/2275-05 https://doi.org/10.22190/fume181116015n https://doi.org/10.22190/fume210310035p https://doi.org/10.17559/tv-20150520132311 https://doi.org/10.1177/16878132221130574 https://doi.org/10.3390/en15134640 https://doi.org/10.1080/00423114.2020.1838567 https://doi.org/10.1007/978-3-031-07305-2_3 https://doi.org/10.22055/jacm.2020.32499.2028 https://doi.org/10.3390/s22114033 https://doi.org/10.21062/ujep/408.2019/a/1213-2489/mt/19/6/1034 https://doi.org/10.21062/ujep/408.2019/a/1213-2489/mt/19/6/1034 https://doi.org/10.1016/j.proeng.2016.01.066 https://doi.org/10.1051/matecconf/201815701007 https://doi.org/10.20858/tp.2017.12.3.11 https://doi.org/10.1080/00423114.2014.901543 https://doi.org/10.21307/tp-2021-028 https://doi.org/10.1007/s40864-021-00159-2 https://doi.org/10.1080/23248378.2021.1964392 s. semenov, e. mikhailov, s. kliuiev et al. acta polytechnica [31] g. vaiciunas, s. steisunas, g. bureika. adaptation of rail passenger car suspension parameters to independently rotating wheels. transport problems 17(1):215–226, 2022. https://doi.org/10.20858/tp.2022.17.1.18 [32] b. leitner. the software tool for mechanical structures dynamic systems identification. in proceedings of the 15th international conference transport means, kaunas, lithuania, 20-21 october 2011, pp. 38–41. 2011. [33] j. harušinec, a. suchánek, m. loulová. creation of prototype 3d models using rapid prototyping. matec web of conferences 254:01013, 2019. https://doi.org/10.1051/matecconf/201925401013 [34] p. kurčík, j. gerlici, t. lack, et al. innovative solution for test equipment for the experimental investigation of friction properties of brake components of brake systems. transportation research procedia 40:759–766, 2019. https://doi.org/10.1016/j.trpro.2019.07.107 [35] k. topczewska, j. gerlici, a. yevtushenko, et al. analytical model of the frictional heating in a railway brake disc at single braking with experimental verification. materials 15(19):6821, 2022. https://doi.org/10.3390/ma15196821 [36] a. yevtushenko, k. topczewska, p. zamojski. influence of thermal sensitivity of functionally graded materials on temperature during braking. materials 15(3):963, 2022. https://doi.org/10.3390/ma15030963 [37] p. zvolensky, l. kašiar, p. volna, d. barta. simulated computation of the acoustic energy transfer through the structure of porous media in application of passenger carriage body. procedia engineering 187:100–109, 2017. https://doi.org/10.1016/j.proeng.2017.04.355 [38] r. goodall, w. kortüm. mechatronic developments for railway vehicles of the future. control engineering practice 10(8):887–898, 2022. https://doi.org/10.1016/s0967-0661(02)00008-4 [39] e. mikhailov, j. gerlici, s. kliuiev, et al. mechatronic system of control position of wheel pairs by railway vehicles in the rail track. aip conference proceedings 2198(1):020009, 2019. https://doi.org/10.1063/1.5140870 [40] l. smetanka, p. šťastniak, j. harušinec. wear research of railway wheelset profile by using computer simulation. matec web of conferences 157:03017, 2018. https://doi.org/10.1051/matecconf/201815703017 [41] a. miltenović, m. banić, j. tanasković, et al. 
wear load capacity of crossed helical gears, 2022. [in print]. [42] l. kou, m. sysyn, j. liu. influence of crossing wear on rolling contact fatigue damage of frog rail, 2022. [in print]. [43] l. smetanka, s. hrček, p. šťastniak. investigation of railway wheelset profile wear by using computer simulation. matec web of conferences 254:02041, 2019. https://doi.org/10.1051/matecconf/201925402041 [44] t. lack, j. gerlici, p. šťastniak. wheelset/rail geometric characteristics and contact forces assessment with regard to angle of attack. matec web of conferences 254:01014, 2019. https://doi.org/10.1051/matecconf/201925401014 [45] a. yevtushenko, k. topczewska. model for calculating the mean temperature on the friction area of a disc brake. journal of friction and wear 42(4):296–302, 2021. https://doi.org/10.3103/s1068366621040048 [46] m. sysyn, o. nabochenko, f. kluge, et al. common crossing structural health analysis with track-side monitoring. communications-scientific letters of the university of zilina 21(3):77–84, 2019. https://doi.org/10.26552/com.c.2019.3.77-84 [47] m. sysyn, o. nabochenko, u. gerber, et al. common crossing condition monitoring with on-board inertial measurements. acta polytechnica 59(4):422–433, 2019. https://doi.org/10.14311/ap.2019.59.0423 [48] s. bühler, b. thallemer. how to avoid squeal noise on railways state of the art and practical experience, vol. 99, chap. noise and vibration mitigation for rail transportation systems. notes on numerical fluid mechanics and multidisciplinary design. springer, berlin, heidelberg, 2008. [49] r. dukkipati, s. narayana, m. osman. independently rotating wheel systems for railway vehicles: a state of the art review. vehicle system dynamics 21(1):297–330, 1992. https://doi.org/10.1080/00423119208969013 [50] o. kyryl’chuk, j. kalivoda, l. neduzha. high speed stability of a railway vehicle equipped with independently rotating wheels. in engineering mechanics 2018, svratka, czech republic, may 14-17, vol. 24, pp. 473–476. 2018. https://doi.org/10.21495/91-8-473 [51] e. mikhailov, s. semenov, h. shvornikova, et al. a study of improving running safety of a railway wagon with an independently rotating wheel’s flange. symmetry 13(10):1955, 2021. https://doi.org/10.3390/sym13101955 [52] e. mihajlov, v. slashhov, m. gorbunov, et al. rail vehicle wheel: utility model patent. ukraine, no. 87418, 2014. [53] p. astahov. resistance to the movement of railway rolling stock. [in russian; soprotivlenie dvizheniju zheleznodorozhnogo podvizhnogo sostava], transport, 178 p., 1966. [54] p. astahov. determination of the main resistance of the rolling stock on the experimental ring. [in russian; opredelenie osnovnogo soprotivlenija podvizhnogo sostava na jeksperimental’nom kol’ce], vestink vniizht, 2, pp. 27–29, 1962. [55] p. grebenjuk, a. dolganov, o. nekrasov, a. lisicyn. traction calculation rules. [in russian; pravila tjagovyh raschetov], transport, 287 p., 1985. [56] a. kogan. track dynamics and its interaction with rolling stock. [in russian; dinamika puti i ego vzaimodejstvie s podvizhnym sostavom], transport, 327 p., 1997. [57] a. komarova. influence of bogie characteristics on the energy efficiency of freight cars. [in russian; vlijanie harakteristik telezhek na jenergojeffektivnost’ gruzovyh vagonov], ph.d. thesis, department of wagons and rail track economy, st. petersburg state transport university, sankt-peterburg, russia, 2015. 
224 https://doi.org/10.20858/tp.2022.17.1.18 https://doi.org/10.1051/matecconf/201925401013 https://doi.org/10.1016/j.trpro.2019.07.107 https://doi.org/10.3390/ma15196821 https://doi.org/10.3390/ma15030963 https://doi.org/10.1016/j.proeng.2017.04.355 https://doi.org/10.1016/s0967-0661(02)00008-4 https://doi.org/10.1063/1.5140870 https://doi.org/10.1051/matecconf/201815703017 https://doi.org/10.1051/matecconf/201925402041 https://doi.org/10.1051/matecconf/201925401014 https://doi.org/10.3103/s1068366621040048 https://doi.org/10.26552/com.c.2019.3.77-84 https://doi.org/10.14311/ap.2019.59.0423 https://doi.org/10.1080/00423119208969013 https://doi.org/10.21495/91-8-473 https://doi.org/10.3390/sym13101955 vol. 63 no. 3/2023 improving the energy efficiency of a tram’s running gear [58] l. gracheva, a. hudjakova. influence of energy dissipation in the spring suspension of bogies on the resistance to movement of freight cars. [in russian; vlijanie rasseivanija jenergii v ressornom podveshivanii telezhek na soprotivlenie dvizheniju gruzovyh vagonov], vestink vniizht, 3, pp. 37–39, 1979. [59] s. semenov, e. mihailov, j. dizo, m. blatnicky. the research of running resistance of a railway wagon with various wheel designs. in lecture notes in intelligent transportation and infrastrucutre, pp. 110–119. 2022. https://doi.org/10.1007/978-3-030-94774-3_11 [60] study of the interaction between track and rolling stock in the usa. [in russian; issledovanie vzaimodejstvija puti i podvizhnogo sostava v ssha], railways of the world, 9, pp. 45–48, 1991. [61] e. blohin, s. mjamlin, n. sergienko. increased wear of wheels and rails is the most important problem of transport. [in russian; povyshennyj iznos koles i rel’sov – vazhnejshaja problema transporta], railway transport of ukraine. technics and technologies, 1, pp. 10–14, 2011. [62] a third of accidents at “ukrzaliznytsia” pov’jazani with a filthy camp of a crumbling warehouse. [in ukrainian; tretyna avarij na “ukrzaliznyci” pov’jazani z poganym stanom ruhomogo skladu – menedzher] [2023-01-04], https://lb.ua/economics/2021/10/19/ 496566_tretina_avariy_ukrzaliznitsi.html/. [63] v. hauser, o. nozhenko, k. kravchenko, et al. proposal of a steering mechanism for tram bogie with three axle boxes. procedia engineering 192:289–294, 2017. https://doi.org/10.1016/j.proeng.2017.06.050 [64] v. hauser, o. nozhenko, k. kravchenko, et al. impact of wheelset steering and wheel profile geometry to the vehicle behavior when passing curved track. manufacturing technology 17(3):306–312, 2017. https://doi.org/10.21062/ujep/x.2017/a/ 1213-2489/mt/17/3/306 [65] v. hauser. construction proposal of tramcar bogie with minimized force effects to the track. [in slovak; konštrukčný návrh podvozka električky so zníženými silovými účinkami na trať], ph.d. thesis, department of transport and handling machines, university of žilina, žilina, slovak republic, 2017. [66] v. hauser, k. kravchenko, m. loulova, et al. analysis of a tramcar ride when passing a point frog and when entering small radius arc by specific rail geometry. manufacturing technology 19(3):391–396, 2019. https://doi.org/10.21062/ujep/302.2019/a/ 1213-2489/mt/19/3/391 [67] v. hauser, o. nozhenko, k. kravchenko, et al. proposal of a steering mechanism for tram bogie with three axle boxe. procedia engineering 192:289–294, 2017. https://doi.org/10.1016/j.proeng.2017.06.050 [68] p. strážovec, j. gerlici, t. lack, j. harušinec. innovative solution for experimental research of phenomena resulting from the wheel and rail rolling. 
a. appendix
figure a1: a map showing the route of a tram.

properties and performance of a new compact hf aerial design for multi-band operation
d. telfer, j. spencer

this work is an extension to that of telfer and austin [1] in that here a balanced-feed embodiment of an inwardly-inclined folded dual monopole aerial is presented and discussed in terms of its improved performance over the original configuration. this includes greater control of the stability of the far-field (ff) lobe pattern with operating frequency, and a considerably extended frequency range (3:1 ratio) over which near vertical incidence skywave (nvis) propagation in the high frequency (hf) bands can be exploited. furthermore, the ff lobe patterns at frequencies >2× the fundamental design frequency are such that advantage can be taken of conventional non-nvis horizontal propagation at those frequencies using the same aerial. at the fundamental frequency, the compactness of the design and the robustness of its nvis ff pattern to orientation make the novel balanced aerial design a convenient replacement for a full-length low dipole in cluttered environments. the paper also presents a vehicle-mounted version for medium-range operation within hf skip distances. applications highlighted include stations for remote monitoring of environmental measurements in difficult or hostile terrain.

keywords: aerial, antenna, multi-band, nvis, hf, environmental monitoring.

notation / abbreviations
λ: wavelength (here expressed in mm).
dbi: gain in decibels (db) relative to that of an isotropic radiator in free space.
hf: high frequency; refers to radio 'short wave' communication over the frequency range 3 to 30 mhz.
nvis: near vertical incidence skywave; a high-angle radio mode involving the ionosphere.
ff: far-field pattern; refers to an envelope of constant field strength surrounding an energised radiating aerial. in conformity with the principle of reciprocity, the same envelope may be used to represent the profile of relative gain in various directions when using the aerial to receive radio signals.
dm: dual monopole; in this paper referring to aerials of a type consisting of inwardly inclined unfolded (i.e., single) elements on or near a ground plane. the elements cross over but do not physically touch.
swr: standing wave ratio that would be present on a transmission line if it were connected between a source of excitation and the aerial feed point; an swr of unity implies a perfect match between the aerial and the feeder cable.

1 introduction
an overview of nvis propagation with references has already been given in [1]. to summarise, there are three main modes of terrestrial hf radio propagation. the first is the ground wave (gw in fig. 1), for distances up to 50 miles, but subject to blocking by hills, buildings and other obstacles. the second is the over-the-horizon 'classical' skywave, also launched horizontally, for intercontinental contacts employing skips between the ionosphere and the ground. the third is nvis, launched at high angles (grey lines) and returned from the ionosphere; it has the useful property of straddling ground obstacles and overcoming the limitations of the dead zone in respect of the other two modes. this dead zone would otherwise extend from the groundwave maximum s1 to the nearest 'skip distance' from the ionosphere at s2. in the diagram, the nvis range is represented by rp(n) and varies with the state of the ionospheric layers e, f1 and f2. nvis working is best at or just below the critical frequency, which during daytime is generally in the range 2 to 10 mhz and commonly around 7 mhz. the aerial designs reported in this paper are for a fundamental resonant frequency of 7.05 mhz, the centre of the uk 7 to 7.1 mhz amateur radio band. this is primarily for the convenience of practical testing by the author and similarly licensed persons.
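since all the aerial dimensions later in the paper are quoted as fractions of a wavelength, it is worth fixing the physical scale first; a minimal sketch (python, using only the stated design frequency fd = 7.05 mhz):

```python
# free-space wavelength at the design frequency f_d = 7.05 MHz
C = 299_792_458.0          # speed of light in vacuum, m/s

def wavelength_m(f_hz: float) -> float:
    """Free-space wavelength in metres for a given frequency in hertz."""
    return C / f_hz

f_d = 7.05e6               # fundamental design frequency, Hz
lam = wavelength_m(f_d)
print(f"lambda at {f_d/1e6:.2f} MHz = {lam:.2f} m ({lam*1e3:.0f} mm)")
# a half-wave dipole at f_d would span roughly lam/2 ~ 21 m end to end,
# which illustrates why a compact NVIS aerial is attractive.
```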
the application potential is far wider and includes emergency services and telemetry, e.g., environmental monitoring from locations in valleys or other difficult terrain, and at locations when and where links by line-of-sight vhf/uhf, satellite or mobile phone may not be available. under such circumstances, hf radio may be the only viable means of communication.

the ff pattern of an nvis aerial should consist of a broad lobe directed upwards, with shapes ranging from near spherical (a in fig. 1b) to an oblate spheroid (b), and with no horizontally directed sidelobes. this latter feature is particularly important for rejecting long-distance 'skip' traffic when working local nvis paths. also, simultaneous reception of groundwave and nvis signals can cause multi-path interference-induced fading. however, in some designs presented here, it is possible to use the same aerial both for nvis work at or near the design frequency fd (near-vertical directivity) and at higher frequencies, e.g., around 2× fd or 3× fd, with a different ff pattern using horizontal lobes for over-the-horizon skip propagation.

we note that the numerical electromagnetic code (nec) ff pattern computations take the ground reflection of the aerial geometry into account. in comparing ff patterns for this work, a perfect earth is assumed. as with any near-ground aerial system, the efficacy of the earth is affected by local conditions, which can cause losses and adversely affect aerial performance. the eznec™ software used is capable of simulating a range of imperfect-earth scenarios, but these effects blur subtle ff pattern differences and are not discussed further here.

2 dual monopole aerial: unbalanced vs balanced configuration
the main experimental aerial referred to in [1] was an unbalanced folded dual monopole (ubfdm) design (fig. 2a) which, once computed, was relatively easy to set up and use. however, the ability to maintain the integrity of the near-spherical lobe pattern
(fig. 2c) over an extended frequency range either side of fd is limited. simulations of this design also showed that the optimal ff pattern may not coincide with the condition of minimum swr for a good match between the aerial and the feeder cable. one practical solution was to design the element configuration for an optimal lobe shape, connect the unbalanced (50 ohm co-axial) feeder directly to the aerial, then tune the system for minimal swr at the transmitter end with an aerial tuning unit (atu). this 'external tuning' worked well, but the challenge of obtaining an optimal lobe coincident with a perfect match at the aerial end remained. therefore a 'matching unit' (m in fig. 2a) was introduced. this is a single reactance, either inductive for an electrically short aerial or capacitive for longer aerial elements. when m is adjusted for resonance (minimum swr) at the design frequency, attention can be switched to optimising the ff lobe, which is a property solely of the aerial configuration. this may be achieved by, e.g., making one element longer than the other, or changing the x-elevation of one element. iteration of the whole process is carried out as required to give an acceptable lobe shape in the centre of the operational frequency range. but if the frequency range limitation (in our case ±0.2 mhz at 7 mhz) for a well-shaped nvis lobe is not acceptable, the nvis bandwidth may be extended by adopting balanced configurations. two such solutions are explored in this work. one of these involved conversion of the foregoing ubfdm system to the balanced design of fig. 2b. although appearing more complex, both simulation and practical matching are made easier because the ff lobe retains its desired shape (fig. 2d) over a much wider (circa 3:1) frequency range. the extra conductor length introduced between p and q requires equal-value pre-match capacitors at m, positioned on each side of the feed point. the second solution is essentially the same matching arrangement driving a pair of non-folded elements.

fig. 1: a) nvis propagation (schematic), b) desired ff pattern
fig. 2: a) unbalanced aerial: m is a matching reactance, close to the feed point f, b) balanced aerial with integrated feed and matching unit m, c) ubfdm aerial fd ff pattern, d) fd ff lobe for bfdm aerial

3 balanced dm aerials: effects of changes in x-elevation
using non-folded elements (fig. 3a) decreases the swr bandwidth slightly, but a simplified balanced aerial, based on an embodiment in [2], facilitates modelling and testing, especially when changing element orientations. eznec™ software was employed to model the effects of changing the x-elevation angle θ. in order to unify the scaling, the parameters zi and cm (the matching capacitance values) are conveniently plotted against the aerial height h (c.f. fig. 3a), which is an important consideration in some locations. figs. 5a to 5f show how the angle θ affects the ff pattern. fig. 5b shows the ff pattern of a horizontal low dipole radiator for comparison. the relative gain profile in fig. 5a is shown in fig. 5c for the xz plane, for which the beamwidth between the half-power points is 69.4 degrees. there is an observable near-ground bifurcation along the y-axis in fig. 5a, and the yz cross section is significantly broadened (fig. 5d), with a half-power beamwidth of 99.4 degrees. the maximum gain (z-direction) is 8.26 dbi.
by comparison, the dipole ff pattern of fig. 5b has half-power beamwidths of 65 degrees in the xz plane and 102.6 degrees in the yz plane. in the dipole case, the maximum (z-direction) gain is 8.35 dbi. differences between the ff patterns for the dm aerial and a low λ/2 dipole are very small, and for practical purposes the two are nearly identical. the physical dimensions of the test dm aerial (vertically aligned, θ = 45 degrees) are summarised as: lengths of e1, e2, 0.3051 λ; length of the centre wire containing m, 0.00706 λ; length of the wires joining the m-wire to the lower ends of e1, e2, 0.07984 λ; heights (agl): m-wire, 0.0141 λ; lower ends of e1, e2, 0.0235 λ; upper ends of e1, e2, 0.2416 λ; e1, e2 crossover gap, 0.027 λ. progressive variations in the ff pattern with θ are relatively small over the range 29.5 degrees (fig. 5e) to 56.4 degrees (fig. 5f). the most notable visual observations are the changes in the structure and orientation of the near-ground bifurcations. as these are low-gain components of the profile, they make little practical difference, especially under 'real-earth' conditions, when distinctions between near-ground features are further reduced. the z-direction gains for the lobes in figs. 5e and 5f are 8.78 dbi and 7.08 dbi, respectively.

fig. 3: a) simple dm aerial showing the x-elevation θ of elements e1 and e2, b) view of the same aerial from the z direction, showing the inter-element gap
fig. 4: plots of zi and cm in relation to the total height (mm) for the dm aerial with fd = 7.05 mhz, and the concomitant adjustments to the capacitor values cm (pf) in the matching unit m to make the feed impedance zi (ohm) purely resistive. we note the considerable tolerance of zi to θ, with zi(max) at θ ~ 45 degrees

4 effects of changes in y-elevation
in fig. 6, the reference dm aerial has a constant θ = 45 degrees, where θ is now the inclination of elements e1, e2 with respect to the medial axis, elevated φ degrees in the y-direction; i.e., (90 - φ) is the tilt angle of the medial axis from the vertical (z) axis. the ff pattern in fig. 7, for 58 degrees of sideways tilt, illustrates the robustness of the design to constraints in which sideways tilting is necessary, e.g., for obstacle avoidance or height restriction. however, in fig. 8, zi falls off increasingly with the medial-axis tilt (going right to left), whereas only small changes in the matching capacitance cm are needed to maintain minimum swr at fd. in practice, matching may be achieved using an atu, e.g., at the transmitter end, but it is better to rely on a dedicated matching unit close to the aerial as indicated and then, if desired, use the atu for 'tweaking'.

fig. 5: a) ff pattern at fd for the dm aerial of fig. 3a with θ = 45 degrees, b) ff pattern at fd for a fully extended low dipole (h = 0.17 λ), c) xz plane cross section of the ff pattern in a), d) yz plane cross section of the ff pattern in b) showing broadening, e) ff pattern at fd for the dm aerial with x-elevation θ = 29.5 degrees, f) ff pattern at fd for the dm aerial with x-elevation θ = 56.4 degrees
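the element dimensions quoted above are fractions of the free-space wavelength; a small sketch converting them to metres at fd = 7.05 mhz (the fractions are taken directly from the list above, and the helper names are ours):

```python
# convert the lambda-normalised dimensions of the test DM aerial
# (theta = 45 degrees) to metres at f_d = 7.05 MHz
C = 299_792_458.0
f_d = 7.05e6
lam = C / f_d   # ~42.5 m

dimensions_lambda = {
    "element length e1, e2":           0.3051,
    "centre wire containing m":        0.00706,
    "wires joining m-wire to e1, e2":  0.07984,
    "height of m-wire (agl)":          0.0141,
    "height of lower ends of e1, e2":  0.0235,
    "height of upper ends of e1, e2":  0.2416,
    "e1/e2 crossover gap":             0.027,
}

for name, frac in dimensions_lambda.items():
    print(f"{name:33s} {frac:8.5f} lambda = {frac*lam:6.2f} m")
```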
5 the aerial matching unit
a toroidal transformer (t1 in figs. 9a,b) with a suitable turns ratio matches the cable (generally 50 ohms) to zi. in practice, the secondary winding is connected to fixed pre-matching capacitors, each in parallel with a variable trimmer capacitor (see fig. 9b) that is left preset after the initial tuning. any further tuning is done with the aerial tuning unit (atu) connected at the transmitter end of the feeder cable.

fig. 6: tilt of the medial axis
fig. 7: ff pattern at fd, at 58 degrees tilt, with some broadening close to the ground; the predicted z-gain is 8.13 dbi
fig. 8: plots of zi and cm against y-elevation (φ)
fig. 9: a) matching unit, schematic: primary feed connection to the bifilar transformer t1 and secondary to the pre-matching capacitors cm, b) matching unit construction at 7 mhz and weatherproof housing

6 swr considerations and multiband properties
the eznec™ predicted swr at 7.0 to 7.1 mhz for the dm aerial of fig. 3 is shown in fig. 10a. an extended plot of the swr over the range 5 to 30 mhz for the same aerial appears in fig. 10b. the plot of fig. 10a covers the 40 metre uk amateur radio band (7.0 mhz to 7.1 mhz), which was used for practical tests. the swr curve is such that only small adjustments to the tuning with an aerial tuning unit (atu) at the transmitter end of the cable will suffice for complete band coverage. this was borne out during practical nvis experiments, where support for the predicted aerial gain profile was also obtained, as estimated from comparative signal strength readings with stations of known transmitter power and aerial configuration (e.g., horizontal dipole). similar in-band swr and lobe properties were evident in experiments with the bfdm configuration of fig. 2b, for which the 7.0 mhz to 7.1 mhz swr plot was very similar. however, the extended spectral plot (fig. 10c) revealed a more detailed signature, as anticipated from the increased aerial complexity.

fig. 10: a) swr plot, 7 to 7.1 mhz, for the dm aerial of fig. 3, zi = 50 ohms at 7.05 mhz; b) swr spectrum, 5 mhz to 30 mhz, for the dm aerial of fig. 3, for zi = 50 ohms at 7.05 mhz; c) predicted swr spectrum, range 5 mhz to 30 mhz, of the bfdm aerial in fig. 2b, for zi = 50 ohms at 7.05 mhz; the occurrence of multiple peaks at higher frequencies (b to e) has positive implications for multiband operation
fig. 11: a) third-harmonic ff pattern from a low dipole (h = 0.17 λ), b) third-harmonic ff pattern from the non-folded dm aerial of fig. 3a, c) ff f2 (14 mhz) plot for the bfdm aerial of fig. 2b
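swr values of the kind plotted in figs. 10a to 10c follow from the complex feed impedance via the standard reflection-coefficient relation; a minimal sketch (python; the mismatched impedance below is an illustrative assumption, not a value from the paper), together with the ideal turns ratio for the transformer of section 5:

```python
import math

Z0 = 50.0   # feeder characteristic impedance, ohms

def swr(z_re: float, z_im: float, z0: float = Z0) -> float:
    """SWR from the reflection coefficient gamma = (Z - Z0) / (Z + Z0)."""
    gamma = math.hypot(z_re - z0, z_im) / math.hypot(z_re + z0, z_im)
    return (1 + gamma) / (1 - gamma)

def turns_ratio(zi: float, z0: float = Z0) -> float:
    """Ideal transformer secondary:primary ratio matching z0 to a resistive zi."""
    return math.sqrt(zi / z0)

print(swr(50.0, 0.0))      # perfectly matched feed point -> SWR = 1.0
print(swr(75.0, 25.0))     # illustrative mismatched feed point -> SWR ~ 1.8
print(turns_ratio(200.0))  # a resistive zi of 200 ohms needs a 2:1 ratio
```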
but how useful the dm aerial and its folded element variants would be for multiband operations depends on the higher frequency ff patterns, which, for working above nvis frequencies, should have substantial horizontal lobes for ‘skip’ working. fig. 11a shows the ff pattern for the 7.05 mhz low horizontal dipole operated near f3 (21 mhz). this is compared in fig. 11b with the f3 ff pattern for the ‘standard’ dm aerial with x-elevation � � 45 degrees, and in with the bfdm aerial of fig. 2b operated at f2 (fig. 11c) and f3 (fig. 2d). the dm aerial f3 ff pattern differs substantially from that of the low dipole, but has substantial energy input into sidelobes under ‘perfect earth’ conditions. however, under real-earth terrain conditions, these will be elevated and diminished in proportion to the losses. although peaks b to e in fig. 10c for the bdfm aerial do not coincide exactly with f2 and f3, the f2 plot is shown in fig. 11c, and the f3 plot was very similar to fig. 11b for the dm aerial. similar trends (not shown) were observed for the ubfdm aerial. these behaviours have also been confirmed by the author through radio skip working on 14 mhz and 21 mhz bands. 7 conclusions computer simulation and practical testing of a number of dual inwardly-inclined monopole aerials for nvis has confirmed the viability of this design avenue. in particular, the balanced feed variants exhibit consistency of vertical far-field pattern shape with frequency and element orientation, the latter augmenting spatial adaptability and compactness without compromising nvis performance. versatility is further exemplified by useful horizontal gain achieved at frequencies above those used for nvis propagation, allowing the same aerial to be used, without further adjustment, for multi-band operation. potential for applications in telemetry and emergency communications from shielded locations and difficult terrain is thereby demonstrated. 8 acknowledgments thanks are due to our colleague dr. brian austin for his continuing interest and many fruitful discussions, and to dr. alan brown of the countryside council for wales for insights into telemetry applications. references [1] telfer, d. j., austin, b. a.: “novel antenna design for near-vertical incidence (nvis) hf communications.” proc. 2nd international conference on advanced engineering design [electrical engineering section]. university of glasgow, 2001, june 24–26. [2] telfer, d. j.: “multiple monopole aerial.” uk patent gb2375235. april 2004. dr. duncan telfer e-mail: djtelfer@liverpool.ac.uk dr. joe spencer centre for intelligent monitoring systems university of liverpool department of electrical engineering and electronics brownlow hill liverpool l69 3gj © czech technical university publishing house http://ctn.cvut.cz/ap/ 17 czech technical university in prague acta polytechnica vol. 45 no. 4/2005 fig. 12: vehicular mounted shortened dm aerial, inductively loaded, has narrow swr bandwidth but can be tuned readily with an atu. its potential for nvis in fieldwork is being currently investigated ap08_3.vp 1 introduction brushless direct-current motors (bldcs) are so named because they have a straight-line speed-torque curve like their mechanically commutated counterparts, permanent-magnet direct-current (pmdc) motors. in pmdc motors, the magnets are stationary and the current-carrying coils rotate. the current direction is changed using a mechanical commutation process. 
a brushless dc motor, on the other hand, has a rotor with permanent magnets and a stator with windings (the magnets rotate and the current-carrying coils are stationary). it is essentially a dc motor turned inside out. the brushes and the commutator have been eliminated, and the windings are connected to the control electronics. the control electronics replace the function of the commutator and energize the proper winding. the energized stator winding leads the rotor magnet, and switches just as the rotor aligns with the stator [1]. bldc motor control requires knowledge of the rotor position and a mechanism to commutate the motor. to sense the rotor position, bldc motors use hall effect sensors to provide absolute position sensing. this results in more wires and a higher cost. bldc motors can be designed into systems that are sensor-based or sensorless. sensorless bldc control eliminates the need for hall effect sensors, using the back-emf (electromotive force) of the motor instead to estimate the rotor position. sensorless control is essential for low-cost variable-speed applications such as fans and pumps [2]. brushless dc (bldc) motors are widely used in industrial applications such as machine tool drives, computer peripherals, robotics and electric propulsion. bldc motors have many advantages. many of these are due to the reduced maintenance of bldc motors (no brushes), better speed versus torque characteristics, a high dynamic response, a long operating life, noiseless operation, higher speed ranges, compact size, controllability, a high torque-to-volume ratio, high efficiency and a low moment of inertia.

2 cascaded control
feedback is a mechanical, process- or signal-mediated response that is looped back to control the system within itself. this loop is called the feedback loop. a control system usually has an input and an output. when the output of the system is fed back into the system as part of its input, it is called 'feedback'. cascade control is used to enable a process having multiple lags to be controlled with the fastest possible response to process disturbances, including set-point changes. cascade control is widely used in industrial processes. conventional cascade schemes have two distinct features, with two nested feedback control loops.

fig. 1: block diagram of the p-controller with the subsidiary speed loop
there is a secondary control loop located inside the primary control loop. the primary loop controller is used to calculate the setpoint for the inner (secondary) control loop. the inner loop (secondary, slave loop) in a cascade-control strategy should be tuned before the outer loop (primary, master loop). after the inner loop is tuned and closed, the outer loop should be tuned using knowledge of the dynamics of the inner loop. the most common use of a cascaded control structure is an inner closed current loop, followed by a speed loop, with the outermost position loop superimposed on the speed loop. block diagrams of a closed-loop position control system with p, pid and pd controllers with-without a subsidiary speed loop are shown in fig. 1, fig. 2 and fig. 3.

3 comparison of p, pd, pid controllers with-without the subsidiary speed loop

3.1 proportional control
proportional control is denoted by the p-term in the pid controller. it is used when the controller action is to be proportional to the size of the process error signal $e(t) = r(t) - y_m(t)$. the time and laplace domain representations for proportional control are given as [3]:

time domain: $u_c(t) = k_v e(t)$, (1)
laplace domain: $U_c(s) = k_v E(s)$, (2)

where $k_v$ is the proportional gain. fig. 4 shows the block diagrams for proportional control.

3.2 proportional and derivative control
a property of derivative control that should be noted arises when the controller input error signal becomes constant but not necessarily zero, as might occur in steady-state process conditions. in these circumstances, the derivative of the constant error signal is zero, and the derivative controller produces no control signal. consequently, the controller takes no action and is unable to correct for steady-state offsets, for example. to avoid the controller settling into a somnambulant state, the derivative control term is always used in combination with a proportional term. this combination is called proportional and derivative control, or pd control. the formulae for simple pd controllers are given as [3]:

time domain: $u_c(t) = k_v e(t) + k_d \dfrac{\mathrm{d}e(t)}{\mathrm{d}t}$, (3)
laplace domain: $U_c(s) = [k_v + k_d s]\, E(s)$, (4)

where $k_v$ is the proportional gain and $k_d$ is the derivative gain.

fig. 2: block diagram of the pid-controller without the subsidiary speed loop
fig. 3: block diagram of the pd-controller without the subsidiary speed loop
fig. 4: block diagrams: proportional control term

3.3 parallel pid controllers
the family of pid controllers is constructed from various combinations of the proportional, integral and derivative terms as required to meet specific performance requirements. the formula for the basic parallel pid controller (transfer function form) is

$$U_c(s) = \left[k_p + k_i \frac{1}{s} + k_d s\right] E(s), \qquad (5)$$

and the time-domain pid controller formula is

$$u_c(t) = k_p e(t) + k_i \int_0^t e(\tau)\,\mathrm{d}\tau + k_d \frac{\mathrm{d}e(t)}{\mathrm{d}t}. \qquad (6)$$

this controller formula is often called the textbook pid controller form, because it does not incorporate any of the modifications that are usually implemented to give a working pid controller. for example, the derivative term is not usually implemented in the pure form due to its adverse noise amplification properties.
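as a minimal illustration of the textbook parallel form of equations (5) and (6), one discrete-time step of such a controller can be sketched as follows (python; the class name and the gains are ours, and none of the practical modifications discussed next, such as derivative filtering or anti-windup, are included):

```python
class TextbookPID:
    """Discrete-time parallel PID: u = kp*e + ki*integral(e) + kd*de/dt."""

    def __init__(self, kp: float, ki: float, kd: float, dt: float):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, setpoint: float, measurement: float) -> float:
        error = setpoint - measurement           # e(t) = r(t) - y_m(t)
        self.integral += error * self.dt         # rectangular integration of e
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return (self.kp * error
                + self.ki * self.integral
                + self.kd * derivative)

# setting ki = 0 gives the PD controller of equations (3)-(4),
# and ki = kd = 0 the pure P controller of equations (1)-(2).
```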
other modifications that are introduced into the textbook form of pid control include those used to deal with so-called kick behaviour, which arises because the textbook pid controller operates directly on the reference error signal. this parallel or textbook formula is also known as a decoupled pid form. this is because the pid controller has three decoupled parallel paths. as can be seen from the figure, a numerical change in any individual coefficient, $k_p$, $k_i$ or $k_d$, changes only the size of the contribution in the path of that term. for example, if the value of $k_d$ is changed, then only the size of the derivative action changes, and this change is decoupled and independent from the size of the proportional and integral terms. this decoupling of the three terms is a consequence of the parallel architecture of the pid controller. the block diagram of the p, pd and pid controllers with-without the subsidiary speed loop is given in fig. 5.

fig. 5: block diagram of the p-controller, pd-controller and pid-controller with-without the subsidiary speed loop
fig. 6: step disturbance of p, pd and pid controllers with-without subsidiary speed loop

4 simulation of the circle rounds of the two regulators in matlab-simulink
to simulate the circle rounds we have two drives. the first moves on the x axis and has a sine signal; the second is on the y axis and has a cosine signal. if these two drives have the same gain values, then they will have a circular movement, otherwise an elliptical one. the two drives should be identical on their respective x and y axes. a block diagram of two drives with a p-controller in conjunction with the subsidiary speed loop, or a pid-controller without a subsidiary speed loop, is shown in fig. 8. a sketch of this test outside simulink is given after the conclusion below.

5 conclusion
the simulation of two drives with the same frequency of 20 rad/s has been configured and initialized in matlab-simulink. if these two drives have the same values of the gain $k_v$, they will have a circular movement, or else an elliptical movement. an increase or decrease in the frequency of the sine and cosine signal has a profound effect on the radius of the circle: decreasing the frequency results in an increase in the circle radius, because of the frequency bandwidth of the drive. a comparison (shown in fig. 6) of the p, pd and pid controllers with-without the subsidiary speed loop shows that the p and pid controllers have zero error ($y - w = 0$) in a steady state, but the pd-controller has a non-zero error.

acknowledgments
the research described in the paper was supervised by prof. ing. jiří skalický, csc., vut in brno. it has been supported by research program msm 0021630516.

fig. 7: step response of p, pd and pid controllers with-without subsidiary speed loop
fig. 8: block diagram of the circle rounds for the p-controller and the state feedback controller

references
[1] duane, d., douglas, w.: electronically commutated motors, 2001.
[2] information on url: http://robotika.cz/wiki/bldcmotor.
[3] michael, a., mohammad, h.: pid control. new identification and design methods, 2006.

mustafa aboelhassan, e-mail: xaboel01@stud.feec.vutbr.cz
dept. of power electrical and electronic engineering, brno university of technology, technická 8, 602 00 brno, czech republic
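the circle-rounds test of section 4 can also be sketched outside simulink; a minimal version (python), in which an assumed first-order closed loop stands in for each real position-controlled drive, reproduces the circle-versus-ellipse behaviour described above:

```python
import math

def simulate(gain_x=1.0, gain_y=1.0, w=20.0, wb=200.0, dt=1e-4, t_end=1.0):
    """Two axis drives tracking sin/cos references of frequency w (rad/s).
    Each closed loop is approximated by a first-order lag with bandwidth wb;
    this lag is an assumption standing in for the real controlled drives."""
    x = y = 0.0
    t = 0.0
    points = []
    while t < t_end:
        rx = gain_x * math.sin(w * t)   # x-axis reference
        ry = gain_y * math.cos(w * t)   # y-axis reference
        x += wb * (rx - x) * dt         # first-order tracking, axis x
        y += wb * (ry - y) * dt         # first-order tracking, axis y
        points.append((x, y))
        t += dt
    return points

# equal gains -> circular trajectory; unequal gains -> ellipse, as in section 4
circle = simulate(gain_x=1.0, gain_y=1.0)
ellipse = simulate(gain_x=1.0, gain_y=0.5)
```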
fig. 9: circle rounds of the p-controller with the subsidiary speed loop and with the same gain values (sine frequency = 20 rad/s)
fig. 10: circle rounds of the pid-controller without the subsidiary speed loop and with the same gain values (sine frequency = 20 rad/s)

corn starch doped with sodium iodate as solid polymer electrolytes for energy storage applications
fatin farhana awang, mohd faiz hassan (corresponding author: mfhassan@umt.edu.my), khadijah hilmun kamarudin
universiti malaysia terengganu, faculty of science and marine environment, ionic state analysis (isa) laboratory, advanced nano-materials (anoma) research group, 21030 kuala nerus, terengganu, malaysia

abstract. the concern about environmental problems has inspired the development of energy storage devices from natural sources. in this study, solid polymer electrolyte (spe) films made from corn starch doped with different compositions of sodium iodate (naio3) were prepared via the solution casting technique. the effect of the dopant on the structure, morphology and electrical properties of the spe films was analysed using x-ray diffraction (xrd), scanning electron microscopy (sem) and electrochemical impedance spectroscopy (eis). the xrd results show that the amorphous state influences the conductivity values of the spe films. the sem observations revealed that the films are rough and porous and have a branched structure, which may affect the conductivity of the spe films. the maximum conductivity is obtained for the spe film with 3 wt.% of naio3, with a value of 1.08 × 10−4 s cm−1 at room temperature (303 k). from these results, this spe is proposed to have a great potential for future energy storage applications.

keywords: biodegradable polymers, polymer membranes, solid polymer electrolytes, sodium salt, electrical properties.

1. introduction
the electrolyte is one of the main components in a battery, where it is used as a medium for ion movement between the anode and the cathode. it can be divided into liquid and solid electrolytes. according to [1], there are a few limitations of liquid electrolytes, such as leakage issues and the difficulty of storing and producing them. in order to solve these problems, the solid polymer electrolyte (spe) has been considered. the important criteria in the selection of an spe include stability and safety, a reduction in cost and weight, and flexibility: it is easy to fabricate and can perform as a proper medium for the electrode-electrolyte contact [2-6]. furthermore, it is seen as suitable for applications in electronic devices such as batteries, solar cells, sensors and supercapacitors [7-11]. however, the main barrier that makes spes hard to utilise is their relatively low ionic conductivity at ambient temperature. ion conduction usually takes place in the amorphous region [12]. this is because the amorphous phase provides free spaces for the ions to move along the polymer matrix. many efforts have been suggested to enhance the ionic conductivity of polymer electrolytes, including incorporating various ionic dopants into host polymers [13]. the usage of biodegradable polymers is vital, as they can be decomposed by natural agents easily. besides, they also take a short time to decay as compared to non-biodegradable ones, without leaving any electronic waste.
some examples of such polymers are chitosan, starch, pectin and cellulose [14-16]. starch is commonly used as a host for the preparation of a solid polymer electrolyte (spe) due to its properties, such as being renewable, cheap, non-toxic and soluble in water [17-20]. in this spe preparation, starch is used as the host polymer and sodium iodate (naio3) as the ionic dopant. basically, starches are carbohydrate polymers that consist of amylose and amylopectin, which are bonded together [21]. however, the starch chains are held together by hydrogen bonds, making them insoluble in cold water. then, to obtain corn starch thin films, the starch granules should be incorporated with a plasticizer in excess water at temperatures below 100 °c. examples of commonly used plasticizers are glucose, glycerol and sorbitol [22]. according to [23], glycerol is the most suitable plasticizer because of its mechanical characteristics and transparency. also, an increase in the glycerol content can promote the penetration of mobile ions into the polymer matrix. this facilitates a rise in the amorphicity of the electrolyte and thus leads to an increase in conductivity [24]. salts act as an important constituent of an electrolyte, as they have a strong influence on electrolyte properties such as conductivity and thermal stability. in this study, sodium (na) salts were selected as a well-established candidate due to their material abundance, low cost and environmental friendliness [25-27]. moreover, sodium also exhibits chemical properties similar to those of lithium, which is applied in battery applications; meanwhile, lithium (li) materials themselves are very hazardous and their sources are very limited. thus, the purpose of this study is to investigate the morphology and electrical properties of corn starch doped with sodium iodate (naio3) by using the x-ray diffraction (xrd), scanning electron microscopy (sem) and electrical impedance spectroscopy (eis) techniques.

figure 1. the physical changes of the spe films.

2. experimental

2.1. materials
the solution casting technique was used to prepare the corn-starch-based polymer electrolyte. sodium iodate, naio3 (purity 96 %, sigma-aldrich), and corn starch, c6h10o5 (sigma-aldrich), were used as the raw materials. in addition, two solvents (20 ml of distilled water and 0.6 ml of glycerine, with purities of 100 % and 96 %, respectively) were also used.

2.2. preparation of spe films
the spe was prepared via the solution casting method. different amounts of naio3 (0, 1, 3, 5, 7 and 9 wt.%) were dissolved in the mixed solvents (distilled water and glycerine) until complete dissolution. the weight percentage was calculated using the formula expressed in equation 1:

$$\mathrm{wt.\%} = \frac{x}{x + y} \times 100 \qquad (1)$$

where $x$ is the weight of the salt (g), $y$ is the weight of the polymer (g), and the weight percentage gives the proportion of salt used as a dopant. thereafter, 1 g of corn starch was added into the solution and continuously stirred at 60-70 °c until it turned homogeneous with no phase separation. the solution was poured into glass petri dishes and left at room temperature for the spe to form. for a further drying process, the spe films were kept in a desiccator filled with silica gels to eliminate any excess moisture.
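equation 1 can be checked numerically; a small sketch (python) shows how a target wt.% maps to the mass of salt to be weighed out for 1 g of starch (the function names are ours, and the masses below are purely illustrative):

```python
def salt_wt_percent(salt_g: float, polymer_g: float) -> float:
    """Equation 1: wt.% = x / (x + y) * 100."""
    return salt_g / (salt_g + polymer_g) * 100.0

def salt_mass_for_target(wt_pct: float, polymer_g: float = 1.0) -> float:
    """Invert equation 1: mass of NaIO3 needed for a given wt.% with y g of starch."""
    return wt_pct * polymer_g / (100.0 - wt_pct)

for target in (0, 1, 3, 5, 7, 9):
    x = salt_mass_for_target(target)
    print(f"{target} wt.% -> {x:.4f} g NaIO3 per 1 g corn starch")
```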
2.3. characterisation technique
x-ray diffraction (xrd) is a technique to determine the structural changes in a polymer electrolyte system. the crystallinity or amorphousness of the polymer electrolyte systems was observed using a miniflex ii diffractometer with cu kα radiation. the 2θ angle varies from 10° to 60°, at a scan rate of 5°/min for each spe film sample. the obtained data were then analysed using the search-match and origin software. the degree of crystallinity (χc) of the spe films was calculated from equation 2:

$$\text{crystallinity } (\%) = \frac{A_c}{A_c + A_m} \times 100 \qquad (2)$$

where $A_c$ and $A_m$ are the crystalline and amorphous areas, respectively.

scanning electron microscopy (sem) was used for the morphological investigation. for this purpose, the spe films were coated to make them conductive for a better observation and placed on specimen stubs with double-sided cellophane tape. they were examined using a jeol jsm-6350la with an acceleration voltage of 20 kv at room temperature. the images were taken at magnifications of ×350, ×750 and ×1000. through this method, the structure of the spe films can be directly observed.

the electrical properties of the prepared films were observed by the electrochemical impedance spectroscopy (eis) technique. the thickness of the spe films was measured using a digital micrometer screw gauge. furthermore, the impedance of the spe films was determined using a hioki 3532-50 lcr hi-tester at room temperature, interfaced to a computer, over a frequency range of 50 hz to 1 mhz. the films were cut to an appropriate size (3.0 cm × 1.0 cm) and placed between two stainless-steel blocking electrodes. the values of the bulk resistance, $R_b$, obtained from the measurement were used to calculate the conductivity, $\sigma$, using equation 3:

$$\sigma = \frac{L}{R_b A} \qquad (3)$$

where $L$ is the thickness of the spe film, $R_b$ is the bulk resistance obtained from the intercept of the plot of imaginary against real impedance, and $A$ is the contact area between the electrode and the electrolyte.
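equation 3 can be applied directly to the measured geometry; a minimal sketch (python) using the stated 3.0 cm × 1.0 cm contact area and a mid-range film thickness. the thickness of each individual film is not given (only the 0.104-0.124 mm range reported later), so with the bulk resistance later tabulated for the 3 wt.% film this only approximates the reported 1.08 × 10−4 s cm−1:

```python
def conductivity_s_per_cm(thickness_cm: float, rb_ohm: float, area_cm2: float) -> float:
    """Equation 3: sigma = L / (R_b * A), in S/cm."""
    return thickness_cm / (rb_ohm * area_cm2)

# assumed values: mid-range film thickness and the 3.0 cm x 1.0 cm electrode
# contact area of section 2.3; R_b = 47 ohm is the table 1 value for 3 wt.%
sigma = conductivity_s_per_cm(thickness_cm=0.0114, rb_ohm=47.0, area_cm2=3.0)
print(f"sigma ~ {sigma:.2e} S/cm")   # same order of magnitude as 1.08e-4 S/cm
```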
% of naio3, the pattern of complex films changed to a wider hump as compared to the pure corn starch film with 68% of amorphicity. then, the incorporation of 3 wt.% of naio3 showed a pattern similar to the complex film that contained 1 wt. % of naio3 but the amorphous phase had increased to 77%. the amorphicity started to decrease with a further addition of naio3. this might be caused by the interaction between the sodium salt and polymer host, which altered the film’s state. it could be seen that the amorphousness of the spe films started to decrease for samples containing 5 wt.% and 7 wt.% of naio3 with a value of 58% and 40%, respectively. thereafter, several crystalline peaks appeared at 2θ = 14.3 °, 21.4 °, 24.9 °, 27.1 °and 32.7 °for the spe sample with 9 wt.% of naio3. therefore, from the xrd results, it can be summarised that the spe films tend to appear in the amorphous phase rather than crystalline phase. moreover, the presence of naio3 did not significantly influence the formation of peaks in the complex samples. the morphology and microstructure of spe films were investigated by using the scanning electron microscopy (sem). figure 3 shows the surface images of the spe films. in general, the surface condition played a vital role in the enhancement of conductivity and was linked with an amorphous content. in addition, according to [31] and [32], the smooth surface of the films would provide a better ionic conductivity, which correlates to the amorphous nature of the sample. it could be observed that the surface of the films became rougher with the addition of salt. the increase in the roughness and amorphous content was responsible for promoting the ion movement, hence, enhancing the conductivity. therefore, it can be concluded that when salt was added to the polymer electrolyte, the figure 2. the x-ray diffraction pattern of corn starch doped with sodium iodate (naio3). film surface was slightly modified, which indicated that the salt was taking part in the system. figure 4 shows the sem images of the prepared films. the corn starch image (figure 4a) revealed the smooth surface and compact structure without any pores or cracks. upon addition of 1 wt.% naio3 (figure 4b), the rough images appeared. it was clear from figure 4b that after the addition of sodium salt, the surface morphology of the spe film changed. after increasing the salt content to 3 wt.% (figure 4c), the film images showed a higher number of pores with a larger diameter. it was believed that the appearance of pores could help the ions to penetrate easily through the membrane. thus, it could lead to a rise in the conductivity. the branched structure with some pores are presented in figure 4d (5 wt.% naio3). from the figure, the number and estimated pore sizes were smaller than in the case of the previous spe film sample. furthermore, for spe films that contained 7 wt.% of naio3 (figure 4e) the root image did not show any salt particles present. the spe film with the highest content of naio3 (9 wt.%) appeared with some impurities and tore on the surface. the diameter of pores for selected spe films (figure 4c figure 4d) was analysed using imagej software and the results are shown in figure 5. for the spe film with 3 wt.% of naio3, the diameter size is in the range of 5.00 to 26.01 µm. as stated by [33], the characteristics of pores are also to enhance the conductivity values because they could assist the ions to easier penetrate through the polymer matrix. 
moreover, the larger pore size will also contribute to the higher ion penetration through the system for ion conduction. hence, the conductivity of spe films will increase. this is confirmed by the fact that higher conductivity values were obtained with a higher number and larger size of pores. then, 5 wt.% of naio3 spe film has shown a decrease in the amount and diameter of pores between 7.61 to 13.15 µm. 499 f. f. awang, m. f. hassan, k. h. kamarudin acta polytechnica figure 3. the surface conditions of spe films that contain various ratios of naio3 (a) 0 wt.%, (b) 1 wt.%, (c) 3 wt.%, (d) 5 wt.%, (e) 7 wt.% and (f) 9 wt %. (a). (b). (c). (d). (e). (f). figure 4. sem micrographs of the surfaces for (a) corn starch film, (b) 1 wt.% ofnaio3, (c) 3 wt.% of naio3, (d) 5 wt.% of naio3, (e) 7 wt.% ofnaio3, (f) 9 wt.% of naio3. designation naio3 concentration bulk resistance, rb (wt. %) (ω) a 0 3.00 × 103 b 1 2.00 × 102 c 3 4.70 × 101 d 5 1.50 × 102 e 7 3.32 × 102 f 9 5.00 × 102 table 1. the bulk resistance and conductivity of spe films at different salt concentrations. 500 vol. 61 no. 4/2021 corn starch doped with sodium iodate as solid . . . figure 5. the sem image of a pore in the specific location for (a) 3 wt.% of naio3 and (b) 5 wt.% of naio3 content in spe films . (a). (b). figure 6. cole-cole plots for (a) 0 wt. % of naio3 and (b) 9 wt. % of naio3 spe films. the nyquist plot or cole-cole plot is a powerful technique used to better understand the electrical properties of the spe films. typically, the cole-cole plot is comprised of two semicircles. the first semicircle forms due to the contribution of the grain boundary at a low frequency and the second forms due to the grain or bulk properties at a high frequency of the material. in general, the bulk resistance (rb) value of the sample is determined from the plot in the interception of the higher frequency region on the zr axis. figure 6 shows the cole-cole plot of spe samples containing 0 and 9 wt. % of naio3 while the bulk resistance values are tabulated in table 1. a reduction of bulk resistance (rb) also might affect the ionic conductivity values. as the conductivity of spe films is calculated using equation 3, as stated before. the best way to study the properties and mechanisms of any electrochemical reaction is by applying complex impedance spectroscopy [34]. it was observed that the conductivity of pure corn starch film was 1.1 × 10−6 scm−1. the conductivity started to increase when the sodium salt was introduced into the electrolyte system. the conductivity increased to 1.08 × 10−4 scm−1 with the presence of 3 wt.% of naio3. it could be concluded that the rise in conductivity, as the salt content increased, was attributed to the increase in ion concentration in the polymer matrix [35]. additionally, the increase in the ion density would raise the number of free ions, which could lead to an increase in conductivity. further addition of salt into the electrolyte caused a decrease in conductivity. it might be due to the inability of the salt to fill the polymer host and deposit it on the surface after the films were formed [36]. this might decrease the number of mobile ions in the spe sample, thus decreasing the conductivity. moreover, [37] also reported that at a higher salt concentration, the distance between dissociated ions may become too small, which leads to the ions reassociating back to neutral ion pairs that do not contribute to the ion conduction. this phenomenon could be observed at a higher naio3, 5 wt. % 9 wt. 
figure 7 indicates the percentage of crystallinity versus the conductivity for the spe films.

figure 7. the crystallinity (%) against conductivity for each spe film.

4. conclusion
in conclusion, solid polymer electrolytes based on corn starch-naio3 were successfully prepared using the solution cast method. the xrd analysis proved the coexistence of peaks, which confirmed that complexation of the studied materials took place. the sem results showed that doping sodium iodate into corn starch improved the structural integrity. moreover, it can be seen that the spe film that contained 3 wt.% of naio3 (1.08 × 10−4 s cm−1) tends to have a branched structure with some pores, enabling it to facilitate the movement of free ions through the polymer electrolyte. it was also believed that the prepared spe film has the potential to be used in energy storage applications such as batteries, as the required minimum conductivity value was achieved.

acknowledgements
the authors acknowledge the financial support from the ministry of education via the frgs 2019-1 grant (vot. no. 59586), and the faculty of science and marine environment, universiti malaysia terengganu, for the financial and technical support for this work to be completed.

references
[1] a. arya, a. l. sharma. effect of salt concentration on dielectric properties of li-ion conducting blend polymer electrolytes. journal of materials science: materials in electronics 29(20):17903–17920, 2018. https://doi.org/10.1007/s10854-018-9905-3.
[2] g. dave, d. k. kanchan. dielectric relaxation and modulus studies of peo-pam blend based sodium salt electrolyte system. indian journal of pure & applied physics 56:978–988, 2018.
[3] m. s. mustafa, h. o. ghareeb, s. b. aziz, et al. electrochemical characteristics of glycerolized peo-based polymer electrolytes. membranes 10(6):116, 2020. https://doi.org/10.3390/membranes10060116.
[4] m. johnsi, s. a. suthanthiraraj. compositional effect of zro2 nanofillers on a pvdf-co-hfp based polymer electrolyte system for solid state zinc batteries. chinese journal of polymer science 34(3):332–343, 2016. https://doi.org/10.1007/s10118-016-1750-3.
[5] m. f. hassan, n. noruddin. the effect of lithium perchlorate on poly (sodium 4-styrenesulfonate): studies based on morphology, structural and electrical conductivity. materials physics and mechanics 36:8–17, 2018. https://doi.org/10.18720/mpm.3612018_2.
[6] a. chandra, a. chandra, r. s. dhundhel. electrolytes for sodium ion batteries: a short review. indian journal of pure & applied physics 58:113–119, 2020.
[7] s. k. deraman, n. s. mohamed, r. h. y. subban. conductivity and electrochemical studies on polymer electrolytes based on poly vinyl (chloride)-ammonium triflate-ionic liquid for proton battery. international journal of electrochemical science 6(1):1459–1468, 2013.
[8] m. f. hassan, h. k. ting. physical and electrical analyses of solid polymer electrolytes. arpn journal of engineering and applied sciences 13:8189–8196, 2018.
[9] c.-w. kuo, w.-b. li, p.-r. chen, et al. effect of plasticizer and lithium salt concentration in pmma based composite polymer electrolytes. international journal of electrochemical science 8:5007–5021, 2013.
[10] s. g. rathod, r. f. bhajantri, v. ravindhrachary, et al. influence of transport parameters on conductivity of lithium perchlorate-doped poly(vinyl alcohol)/chitosan composites.
journal of elastomers & plastics 48(5):442–455, 2016. https://doi.org/10.1177/0095244315580457.
[11] b. jinisha, k. anilkumar, m. manoj, et al. development of a novel type of solid polymer electrolyte for solid state lithium battery applications based on lithium enriched poly (ethylene oxide) (peo)/poly (vinyl pyrrolidone) (pvp) blend polymer. electrochimica acta 235:210–222, 2017. https://doi.org/10.1016/j.electacta.2017.03.118.
[12] s. b. aziz, z. h. z. abidin. electrical conduction mechanism in solid polymer electrolytes: new concepts to arrhenius equation. journal of soft matter pp. 1–8, 2013. https://doi.org/10.1155/2013/323868.
[13] a. s. samsudin, m. a. saadiah. ionic conduction study of enhanced amorphous solid bio-polymer electrolytes based carboxymethyl cellulose doped nh4br. journal of non-crystalline solids 497:19–29, 2018. https://doi.org/10.1016/j.jnoncrysol.2018.05.027.
[14] n. h. ahmad, m. i. n. isa. ionic conductivity and electrical properties of carboxymethylcellulose-nh4cl solid polymer electrolytes. journal of engineering science and technology 11(6):839–847, 2016.
[15] s. çavuş, e. durgun. poly(vinyl alcohol) based polymer gel electrolytes: investigation on their conductivity and characterization. acta physica polonica a 129(4):621–624, 2016. https://doi.org/10.12693/aphyspola.129.621.
[16] m. f. hassan, f. f. awang, n. s. n. azimi, c. k. sheng. starch/mgso4 solid polymer electrolyte for zinc carbon batteries and its application in a simple circuit. journal of sustainability science and management 15(8):1–8, 2020. https://doi.org/10.46754/jssm.2020.12.001.
[17] s. b. aziz. li+ ion conduction mechanism in poly (ε-caprolactone)-based polymer electrolyte. iranian polymer journal 22(12):877–883, 2013. https://doi.org/10.1007/s13726-013-0186-7.
[18] d. hambali, z. zainnuddin, i. supa'at, z. osman. studies of ion transport and electrochemical properties of plasticized composite polymer electrolytes. sains malaysiana 45(11):1697–1705, 2016.
[19] p. c. sekhar, p. n. kumar, a. k. sharma. effect of plasticizer on conductivity and cell parameters of (pmma+naclo4) polymer electrolyte system. iosr journal of applied physics (iosr-jap) 2(4):1–6, 2012. https://doi.org/10.9790/4861-0240106.
[20] s. b. aziz, o. g. abdullah, m. a. rasheed, h. m. ahmed. effect of high salt concentration (hsc) on structural, morphological, and electrical characteristics of chitosan based solid polymer electrolytes. polymers 9(6), 2017. https://doi.org/10.3390/polym9060187.
[21] c. l. luchese, p. benelli, j. c. spada, i. c. tessaro. impact of the starch source on the physicochemical properties and biodegradability of different starch-based films. journal of applied polymer science 135(33):1–11, 2018. https://doi.org/10.1002/app.46564.
[22] b. chatterjee, n. kulshrestha, p. n. gupta. preparation and characterization of lithium ion conducting solid polymer electrolytes from biodegradable polymers starch and pva. international journal of engineering research and applications 5:116–131, 2015.
[23] r. alves, m. m. silva.
the influence of glycerol and formaldehyde in gelatin-based polymer electrolytes. molecular crystals and liquid crystals 591(1):64–73, 2014. https://doi.org/10.1080/15421406.2013.822739.
[24] r. alves, j. p. donoso, c. j. magon, et al. solid polymer electrolytes based on chitosan and europium triflate. journal of non-crystalline solids 432:307–312, 2016. https://doi.org/10.1016/j.jnoncrysol.2015.10.024.
[25] h. m. a. herath, v. a. seneviratne. electrical and thermal studies on sodium based polymer electrolyte. procedia engineering 215:124–129, 2017. https://doi.org/10.1016/j.proeng.2018.02.089.
[26] s. brutti, et al. ionic liquid electrolytes for room temperature sodium battery systems. electrochimica acta 306:317–326, 2019. https://doi.org/10.1016/j.electacta.2019.03.139.
[27] p. k. nayak, l. yang, w. brehm, p. adelhelm. from lithium-ion to sodium-ion batteries: advantages, challenges, and surprises. angewandte chemie international edition 57(1):102–120, 2018. https://doi.org/10.1002/anie.201703772.
[28] j. mei, y. yuan, y. wu, y. li. characterization of edible starch-chitosan film and its application in the storage of mongolian cheese. international journal of biological macromolecules 57:17–21, 2013. https://doi.org/10.1016/j.ijbiomac.2013.03.003.
[29] m. f. hassan, s. z. m. yusof. poly(acrylamide-co-acrylic acid)-zinc acetate polymer electrolytes: studies based on structural and morphology and electrical spectroscopy. microscopy research 2(2):30–38, 2014. https://doi.org/10.4236/mr.2014.22005.
[30] f. f. awang, k. h. kamarudin, m. f. hassan. effect of sodium bisulfite on corn starch solid polymer electrolyte. malaysian journal of analytical sciences 25(2):224–233, 2021.
[31] s. b. aziz, m. h. hamsan, w. o. karim, et al. study of impedance and solid-state double-layer capacitor behavior of proton (h+)-conducting polymer blend electrolyte-based cs:ps polymers. ionics 26(9):4635–4649, 2020. https://doi.org/10.1007/s11581-020-03578-6.
[32] n. angulakshmi, d. j. yoo, k. s. nahm, et al. mgal2sio6-incorporated poly(ethylene oxide)-based electrolytes for all-solid-state lithium batteries. ionics 20(2):151–156, 2013. https://doi.org/10.1007/s11581-013-0985-z.
[33] m. f. hassan, n. s. n. azimi, k. h. kamarudin, c. k. sheng. solid polymer electrolytes based on starch-magnesium sulphate: study on morphology and electrical conductivity. asm science journal special issue pp. 17–28, 2018.
[34] s. b. aziz, m. a. brza, k. mishra, et al. fabrication of high performance energy storage edlc device from proton conducting methylcellulose: dextran polymer blend electrolytes. journal of materials research and technology 9(2):1137–1150, 2020. https://doi.org/10.1016/j.jmrt.2019.11.042.
[35] m. f. z. kadir, s. r. majid, a. k. arof. plasticized chitosan-pva blend polymer electrolyte based proton battery. electrochimica acta 55(4):1475–1482, 2010. https://doi.org/10.1016/j.electacta.2009.05.011.
[36] m. n. z. m. sapri, a. h. ahmad. conductivity and ftir studies on peo-nacf3so3 solid polymer electrolyte films. science letters 10(1):11–13, 2016.
[37] n. n. a. amran, n. s. a. manan, m. f. z. kadir. the effect of licf3so3 on the complexation with potato starch-chitosan blend polymer electrolytes. ionics 22(9):1647–1658, 2016. https://doi.org/10.1007/s11581-016-1684-3.
dynamic simulations in cost and time estimation of the construction process
v. beran, e. hromada

this paper describes a model which is able to simulate the costs and the duration of construction for a building project. the model predicts the set of expected costs and the duration of the project depending on input parameters such as the production rate, the scope of the work, the time schedule, the bonding conditions, and the maximum and minimum deviations from the scope of the work and the production rate. clients are able to make proper decisions concerning the time and cost schedules of their investments.

keywords: simulation, time scheduling, cost scheduling, reliability and risk.

1 introduction
a simulation of construction activities on the basis of the production rate makes it possible to monitor the reliability of the expected time schedule and the total costs. the input parameters are the production rate, the scope of the work, the time schedule, the bonding conditions, and the maximum and minimum deviations from the scope of the work and the production rate. the simulated model can be used at many levels of project management. clients are able to make decisions about implementing their intentions; competitors can assess a bid price; building contractors can make a detailed calculation of the costs and the time schedule of construction activities, and they can optimize the construction process. the simulation model stems from the research of haas and hájek [4] and then beran and dlask [1], [2], which has been carried out at the faculty of civil engineering, czech technical university in prague, during the last ten years.

2 definition of the problem
within the framework of initiating a simulation of a building project, it is necessary to define the problem as such. the application obtains input data by means of the module of input data, which defines the particular construction activities, the volume $q$ of each construction activity expressed in physical or financial units, its production rate $v$, and the bonding conditions $\mathbf{d}_{\mathrm{connection\ activities}}$ linking the particular activities. $\mathrm{Tab}_{\mathrm{project}}$ characterizes the calculation as a meta-problem called a dynamic progress chart (flow-sheet). $N$ generally characterizes the sequential networks $N_i$ [2]. the set expression is as follows:

$$\mathrm{Tab}_{\mathrm{project}} = \bigcup_i N_i\big(f_{\mathrm{risk}}(d_i(q_i, v_i)),\ \mathbf{d}_{\mathrm{connection\ activities}}\big), \quad i = 1, \ldots \qquad (1)$$

where $i$ indexes the partial processes and $d$ is the set of activity durations, while the risk influence is a conditioned externality (see table 1).
table 1: dynamic progress chart of a building project (key: noncritical activity, critical activity, total reserve, waiting for activity, waiting for activity on the critical path)

the notation is supplemented by the conditionality of a breach of the assumed input parameters – the scope of the work and the production rate. a practical solution of the calculation according to the dynamic progress chart (1) is based on inputting the work volume, the production rate and the time schedule of the particular activities. the time duration in the dynamic progress chart is calculated as the quotient of the quantities $q$ and $v$, or more precisely $d_i = q_i / v_i$. the input data included in the module of input data in the connection activities sheet defines the bonding conditions among the particular production activities. the deviations of project parameters 1 and 2 sheets contain input data about the minimum/maximum deviations of the scope of the work and the production rate of the particular activities, based on the expected parameters of the building process [6].

3 solution and an example
on the basis of the excel vba application, the algorithm enables us to calculate an instant dynamic progress chart of the building project, including a time schedule of resources in terms of expression (1); the dynamic progress chart is difference-calculated on the basis of a common progress chart. the calculation is based on the production rates and the individual activities, which are described in the columns start and end (tab. 1), which present the links between the individual activities. it practically represents the relations between the declared function $f_{\mathrm{risk}}(q, v, \mathbf{d}_{\mathrm{connection\ activities}})$ from expression (1) and the composition of the task as a consecutive process on the basis of the time durations of the individual processes $N$ [d] [1, 2, 5]. the dynamic progress chart creates a comprehensive, methodically uniform model. the outputs of the model include information about the deadlines for the start and the end of the production activities, and information about cost schedules. the application creates a graphic visualization of the demand for resources in time (see fig. 1). the question of the continuity of the project realization is interconnected with cost-cutting management measures. the varying construction rate causes changes in the construction costs; the flow of the construction costs is a significant indicator of the economy of the capital employment. the calculation and software application described here can be used for evaluating bid proposals for investment projects. the approach carries out a two-dimensional simulation. the project described in the propositions time and costs will be marked as a predefined project. the discrete probabilistic variables $(t; c)$ obtain the values $(t_i; c_j)$; we mark this as $\mathrm{P}(t = t_i;\ c = c_j) = p_{ij}$. the sw application simulates the presumed development of the examined construction phase, the whole construction project, or just a set of construction activities. we can identify the effects of changes, and we can view management changes in the scope of particular jobs (construction activities) and the probability (reliability) of meeting the proposed (contracted) completion deadline $t_{\mathrm{fin}}$ and the proposed contracting cost limit $c_{\mathrm{fin}}$. in general, we are searching for an acceptable

$$F_{tc}(t_{\mathrm{fin}}; c_{\mathrm{fin}}) = \mathrm{P}(t \le t_{\mathrm{fin}};\ c \le c_{\mathrm{fin}}) \qquad (2)$$

for the selected project activities $a_k$, or their activity sets $a_k, a_l, \ldots, a_x$, functionally designed into a network.
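the paper's application is written in excel vba; as a language-neutral illustration, the following python sketch mimics the core loop under strong simplifying assumptions: a purely sequential bonding of activities, uniformly distributed deviations of scope and production rate, and made-up input values (the activity list and unit costs are hypothetical). it returns samples of (t; c) and an estimate of expression (2).

```python
# minimal sketch of the two-dimensional (time; cost) simulation,
# assuming a purely sequential network and uniform deviations
import random

# hypothetical activities: (scope q, production rate v, unit cost per q)
ACTIVITIES = [(120.0, 4.0, 1.5), (300.0, 10.0, 2.0), (80.0, 2.5, 3.0)]
DEV_Q, DEV_V = 0.10, 0.15   # assumed max relative deviations of q and v

def simulate_once():
    t_total, c_total = 0.0, 0.0
    for q, v, unit_cost in ACTIVITIES:
        q_i = q * random.uniform(1 - DEV_Q, 1 + DEV_Q)
        v_i = v * random.uniform(1 - DEV_V, 1 + DEV_V)
        t_total += q_i / v_i          # duration d_i = q_i / v_i
        c_total += q_i * unit_cost    # cost follows the realized scope
    return t_total, c_total

def estimate_reliability(t_fin, c_fin, runs=50_000):
    samples = [simulate_once() for _ in range(runs)]
    hits = sum(1 for t, c in samples if t <= t_fin and c <= c_fin)
    return hits / runs, samples       # estimate of F_tc(t_fin; c_fin)

p, samples = estimate_reliability(t_fin=100.0, c_fin=1100.0)
print(f"P(t <= t_fin; c <= c_fin) ~ {p:.3f}")
```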
the results of the particular simulations are

$$\mathrm{P}(t; c) = \mathrm{P}\big(\mathrm{sim}_{\mathrm{Tab}_{\mathrm{project}}}(t; c)\big), \qquad (2a)$$

and for the example described below they span $t \in \langle 45; 89\rangle$ and $c \in \langle 2411; 3131\rangle$. the simulation data is continuously recorded on the basis of (1), [2]. the simulation is based on the time schedule given in table 1. the 50 000 simulations can be ranged into 30×30 categories. when a simulation is finished, the recorded data serves as a basis for a statistical analysis of the construction processes. the data file serves for the final analysis and, inter alia, is a basis for the modified 3d visualization, as shown in fig. 2. the calculation of the expected or fixed probability starts, ends and reserves the results, as shown in fig. 3 and fig. 4.

fig. 1: required cash flow of the capital needed in the course of the construction period, and the cumulative need for capital (thousand eur over working days 1–96)

table 2 contains the structural data of the comprehensive simulation example. the particular items are calculated as a construction bid proposal, which is further described by a simulation study. this shows how competitive and realistic the supposed completion date and costs are in practice:

$$\mathrm{P}(45 \le t \le 89;\ 2411 \le c \le 3131) = \mathrm{P}\big(\mathrm{sim}(\mathrm{Tab}_{\mathrm{project}})\big) = \mathrm{P}\Big(\mathrm{sim}\big(\textstyle\bigcup_i N_i(f_{\mathrm{risk}}(d_i(q_i, v_i)),\ \mathbf{d}_{\mathrm{connection\ activities}})\big)\Big) \qquad (2b)$$

for subprojects or subactivities $i = 1, \ldots$

the construction project [3] is proposed with the time schedule and the scope of the work as given in the 3d bar chart. the ellipse in table 2 shows the shift of probability in time and costs. using this approach, the results of the simulations can be specified more precisely. the occurrence frequencies of the particular scenarios of building project bids are comparable. the highest simulation frequency values in the 3d bar graph indicate the highest probabilities of potential success scenarios for the construction project. in the given case, the building project will be completed with satisfactory commercial probability within the range of 57 to 59 days, and the construction cost will be in the range of 2 731 to 2 751 thousand eur. within the framework of a building project simulation, the calculation frequently reveals a unique regular solution; an example of a 3d probability bar chart with a unique regular solution is shown in fig. 2. in the case of complicated bonding conditions and other additional interdependences between the particular activities, the solution of the simulation may not be unique. fig. 5 and fig. 5a present a building project in which the input parameters contain a specific interdependence within the first activity (ground works). let us compare these results with fig. 2. in the event that the first activity takes more than 25 days, the building ground machinery must be removed without delay to another major activity (another building project).

fig. 2: example of a 3d probability bar chart as an expression on the basis of (1)
fig. 3: probability 3d bar chart for a construction project with a fixed cost scope c
fig. 4: probability 3d bar chart for a construction project with a fixed time duration t

this situation causes a slippage of dates within the range of 21 days. after that period, the ground works can be resumed. this specific condition leads to a heterogeneous solution of the simulation. it is difficult to find the solution of such a building project using standard statistical methods; it is convenient to take advantage of visualization techniques and particular simulation calculations. important information regarding a proposal for a future project time schedule is provided by tests of the potential scenarios of the project development with current fixing of certain parameters of the building organizational model. important information can be obtained about the critical parameters of the planned project, for example by fixing the deviations of the scope of the work for particular activities, see $q$ in expression (1) [7]. it is common practice to present the probability of the total construction time of a building project without the cost viewpoint (fig. 4). a better-expressed project cost is presented as a respected fixed value that will be stable and independent of the duration of the project. addressing this notion, the proposed approach that simulates the interrelated time and cost values, shown in fig. 3, is more understandable and more comprehensive than the information shown in fig. 4, where

$$\mathrm{P}(a \le t \le b) = F_t(b) - F_t(a), \qquad (3)$$

table 2: example of a 3d bar chart illustrating the result of 50 000 simulations (probability plotted over the construction cost, 2 447 to 2 993 thousand eur, and the time duration, 36 to 114 working days)

fig. 5: example of a 3d probability bar chart with heterogeneous solutions
fig. 5a: example of a graph with interdependences between time duration and construction cost (frequencies)

or than the calculation with a fixed scope of work,

$$\mathrm{P}(45 \le t \le 89) = \mathrm{P}\big(\mathrm{sim}_{\mathrm{Tab}_{\mathrm{project}}}(t)\big). \qquad (3a)$$

a similar situation arises if we fix the alternation of the time schedules of the project. the scope of the work, given as $c$, is specified as

$$\mathrm{P}(x \le c \le y) = F_c(y) - F_c(x) \qquad (3b)$$

for the data simulated in table 2 and demonstrated in fig. 4, and

$$\mathrm{P}(2411 \le c \le 3131) = \mathrm{P}\big(\mathrm{sim}_{\mathrm{Tab}_{\mathrm{project}}}(c)\big) \qquad (3c)$$

in the calculation with a fixed time ratio. fig. 4 shows the changes in the project cost with a fixed duration of the observed project. the expected time duration of the total construction project is given by its mean value

$$E[t]_{\mathrm{project}} = \sum_{t_i} t_i\, p(t_i)\Big|_{c = \mathrm{const.}} \qquad (4)$$

in the same way, we can quantify the expected scope of the work on the total construction project by its mean value

$$E[c]_{\mathrm{project}} = \sum_{c_i} c_i\, p(c_i)\Big|_{t = \mathrm{const.}} \qquad (5)$$

4 the search for reliable construction cost and time duration
the simulation model is able to calculate the adequate construction costs and the time duration of a project on the basis of an input probability level. the reciprocal view aims to find the adequate level of probability for a given construction cost and activity durations. there are two ways to calculate an adequate level of probability: the first consists in fixing one variable parameter and investigating the changes in the remaining parameter; the second involves a simultaneous investigation of the deviations of both parameters.
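a compact way to reproduce these summary quantities from recorded (t; c) samples is sketched below. this is an illustration only: synthetic random samples stand in for the recorded simulation file, the data is binned into the 30×30 categories mentioned above, and interval probabilities of the type (3a)/(3c), mean values of the type (4)–(5), and the cumulative function of the type (2) are evaluated.

```python
# minimal sketch: statistics over recorded (t, c) simulation samples,
# using synthetic data in place of the recorded 50 000-run file
import numpy as np

rng = np.random.default_rng(0)
t = rng.normal(64, 8, 50_000)                  # stand-in durations [days]
c = 2500 + 9 * t + rng.normal(0, 60, 50_000)   # stand-in costs [thousand eur]

# 30x30 joint histogram, normalized to a discrete probability table
p, t_edges, c_edges = np.histogram2d(t, c, bins=30)
p = p / p.sum()                                # all p_ij sum to 1

# interval probabilities of the type (3a) and (3c)
print("P(45 <= t <= 89)     =", np.mean((t >= 45) & (t <= 89)))
print("P(2411 <= c <= 3131) =", np.mean((c >= 2411) & (c <= 3131)))

# mean values of the type (4) and (5)
print("E[t] =", t.mean(), " E[c] =", c.mean())

# cumulative function F_tc(t_fin; c_fin) = P(t <= t_fin; c <= c_fin)
def F_tc(t_fin, c_fin):
    return np.mean((t <= t_fin) & (c <= c_fin))

print("F_tc(59, 2751) =", F_tc(59, 2751))
```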
the approach used in this paper is based on expression (1) and on table 3 (a discrete probability density table), and it enables us, according to the data in table 2, to calculate the level of probability as a cumulative density function

$$F_{tc}(t; c) = \sum_{t_i \le t} \sum_{c_i \le c} \mathrm{P}(t_i; c_i), \qquad (6)$$

where $t_i$ and $c_i$ run through the set of all possible values of $t$ and $c$, and

$$\sum_{t_i} \sum_{c_i} \mathrm{P}(t_i; c_i) = 1. \qquad (6a)$$

in the course of a closer investigation of the results of the particular simulations, the dependence between the level of probability and the construction cost and time duration was found. the following figures show the bilateral interactions of these project parameters.

5 conclusion
the model described here enables us to predict the expected costs and the duration schedule of a project depending on input parameters such as the production rate, the scope of the work, the time schedule, the bonding conditions, and the maximum and minimum deviations from the scope of the work and the production rate. the results present a useful risk evaluation for projects or project activities.

table 3: example of calculating the level of the discrete probability distribution (dpd) for 50 000 simulations

acknowledgement
this paper was developed with financial support from the ministry of education, youth and sports of the czech republic, project no. 1m6840770001.

references
[1] beran, v., dlask, p.: management udržitelného rozvoje. praha: academia, 2005.
[2] beran, v., dlask, p., heralová, r., berka, v.: dynamický harmonogram, rozvrhování výroby do času. 1. vyd. praha: academia, 2002. 170 s. isbn 80-200-1007-6.
[3] dlask, p., beran, v.: mdm 2004 – teoretická příručka. 1. vyd. praha: vydavatelství čvut, 2004. 90 s. isbn 80-01-03072-5.
[4] haas, š., hájek, v.: systémové plánování a řízení ve stavebnictví. praha: sntl, 1981.
[5] heralová, r., frková, j., tománková, j.: decision making and bids in construction industry. integrated risk management. berlin: humboldt universität berlin, 2002.
[6] hromada, e., beran, v.: reliable construction cost and time estimation on the basis of the dynamic flow-chart (software for project reliability estimation and risk evaluation). in: advanced engineering design aed 2006 [cd-rom]. prague: ctu, 2006, vol. 1, p. p3.14-1–p3.14-15. isbn 80-86059-44-8.
[7] hromada, e., beran, v.: management časových a finančních potřeb stavebních zakázek. in: stavebnictví a interiér. 2005, roč. 2005, č. 12, s. 98–100. issn 1211-6017.

doc. ing. václav beran, drsc., phone: +4202 2435 3720, fax: +420 224 355 439, e-mail: beran@fsv.cvut.cz
ing. eduard hromada, ph.d., e-mail: eduard.hromada@fsv.cvut.cz
czech technical university in prague, faculty of civil engineering, thákurova 7, 166 29 praha 6, czech republic

properties of erbium and ytterbium doped gallium nitride layers fabricated by magnetron sputtering
v. prajzler, z. burian, i. hüttel, j. špirková, j. hamáček, j. oswald, j. zavadil, v. peřina

we report on some properties of erbium and erbium/ytterbium doped gallium nitride (gan) layers fabricated by magnetron sputtering on silicon, quartz and corning glass substrates. for fabricating the gan layers, two types of targets were used: gallium in a stainless steel cup and a ga2o3 target. the deposition was carried out in an ar + n2 gas mixture. for erbium and ytterbium doping of the gan layers, erbium metallic powder and ytterbium powder, or er2o3 and yb2o3 pellets, were laid on the top of the target. the samples were characterized by x-ray diffraction (xrd), photoluminescence spectra and nuclear analytical methods. while the use of a metallic gallium target ensured the deposition of well-developed polycrystalline layers, the use of a gallium oxide target provided gan films with poorly developed crystals. both approaches enabled doping with erbium and ytterbium ions during deposition, and the typical emission at 1 530 nm due to the er3+ intra-4f 4i13/2 → 4i15/2 transition was observed.

keywords: gallium nitride, erbium, ytterbium, magnetron sputtering, photoluminescence.

1 introduction
gallium nitride (gan) has become one of the most promising wide band gap (3.4 ev) direct semiconductor materials for utilization in high power and high frequency transistors, solid state photodetectors and high brightness blue light emitting diodes (leds), laser diodes (lds) and full colour flat panel displays [1], [2]. er3+-doped optical materials are candidates for the fabrication of optical amplifiers and lasers operating at 1 550 nm [3] due to the er3+ intra-4f emission, which corresponds to the 4i13/2 → 4i15/2 transition.
this wavelength is commonly used in telecommunication systems due to the fact that it corresponds to a low-loss window of silica based optical fibers. erbium doped amplifiers are usually optically pumped by a source operating at 1 480 nm or 980 nm. when only er3+ ions are present in short waveguides, optical pumping at 980 nm is not sufficiently efficient, because the er3+ absorption cross-section at this wavelength is not very good. this problem can be overcome by adding yb3+, as its 2f5/2 → 2f7/2 transition is approximately ten times stronger than that of 4i13/2 → 4i15/2 [4], [5]. the basic schematic energy levels and laser transitions of er3+ and yb3+ are shown in fig. 1.

fig. 1: schematic energy levels and laser transitions of er3+ and yb3+ ions

it was previously shown in [6] that thermal quenching in er3+-doped semiconductors decreases with increasing band gap. therefore, wide-band gap semiconductors such as gan are attractive hosts for er3+ and yb3+ ions (re ions). gan layers are usually grown by epitaxy methods such as metal organic chemical vapor deposition (mocvd) and molecular beam epitaxy (mbe) [7], [8]. epitaxy methods such as hydride vapor phase epitaxy (hvpe) and liquid phase epitaxy (lpe) [9], [10] are used for the fabrication of free-standing gan substrates. to obtain gan layers doped with erbium and ytterbium ions, two procedures are basically available. the first involves fabricating the gan layers and then doping them by ion implantation [11], [12]. the second involves doping the gan layers with erbium and ytterbium ions during the deposition process [13], [14]. re-doped gan layers fabricated by epitaxy methods are of high quality; however, the deposition process is rather complicated (for gan fabricated by mocvd a toxic precursor is needed, and for gan fabricated by mbe an ultrahigh-vacuum chamber must be applied). instead of these rather complicated methods, an easier approach to gan fabrication is now being investigated: yang et al. in 2003 already managed to fabricate high quality gan layers [15] by magnetron sputtering. their gan samples exhibited luminescence at a 354 nm wavelength at room temperature.
erbium and ytterbium can easily be doped into the deposited gan layers in the course of the sputtering process [16]. moreover, sputter deposition is relatively inexpensive and it is ideal for covering a large area.

2 experiment
2.1 fabrication of the samples
the gan samples were fabricated by radio frequency (rf) magnetron sputtering (balzers pfeiffer pls 160) on silicon, quartz or corning glass. before deposition, the substrates were cleaned by a standard cleaning procedure. the sputtering experimental set-up is shown in fig. 2. we used two types of target: a ga target and a ga2o3 target. because of its very low melting point (29.78 °c), gallium cannot be used directly as a target, so we had to pour it into a stainless steel crucible. another way is to use a ga2o3 target, as already reported in [17]; this satisfactorily solves the problem arising from the low melting point of gallium, as ga2o3 melts at about 1 600 °c. in our experiments we sintered ga2o3 powder (sigma-aldrich) to form a target of 5 cm diameter. the typical deposition parameters were: temperature 300 k, time 60 min, nitrogen-argon ratio 3:7, power 50 w. the apparatus was evacuated before each experiment to below 0.01 pa, and the deposition was done at a total gas pressure of 3.4 pa. further details of the fabrication process are given in table 1. the typical thickness of the deposited layers was 0.5 to 3.2 µm, depending on the time of deposition. for erbium doping of the gallium nitride layers, er metallic powder and yb powder were laid on the top of the gallium targets; alternatively, er metallic powder and yb powder, or er2o3 and yb2o3 pellets 5 or 10 mm in diameter, were put on top of the ga2o3 targets. the er2o3 and yb2o3 pellets were fabricated by pressing er2o3 and yb2o3 powder (sigma-aldrich).

2.2 measurement
the structure of the deposited gan layers was studied by xrd (x-ray diffraction). the compositions of the samples were determined with the use of nuclear chemical analysis (rutherford backscattering spectroscopy (rbs) and elastic recoil detection analysis (erda)). the gan stoichiometry and the amount of the o admixture were checked by rbs using 2.4 mev protons; for this energy, the non-rutherford cross-section for n and o is sufficiently enhanced to obtain satisfactory sensitivity. the amounts of the erbium and ytterbium dopants were checked by rbs with both 2.4 mev protons and 2.2 mev alpha particles. the areas in the spectra above the surface ga energy edge enabled us to determine the re concentrations up to a depth of 600 nm and 240 nm from the gan surface for the 2.4 mev protons and the 2.2 mev alpha particles, respectively. the h impurity was checked by erda with 2.7 mev alpha particles. the evaluations of the rbs and erda spectra were done by the gisa3 [18] and simnra [19] codes, respectively. the transmission spectra of the samples in the spectral region from 400 nm to 1 000 nm at room temperature were also taken; for this purpose, a tungsten lamp and an mdr 23 monochromator were used as light sources, and the light transmitted through the samples was detected by a pyrodetector.
fig. 2: schema of the planar magnetron-sputtering set-up used for the deposition of gan layers

table 1: deposition parameters for er/yb:gan fabrication
target | ga, ga2o3
power (13.56 mhz) | 50 w
gas precursor (purity 99.999 %) | mixture n2/ar (3:7)
total gas pressure | 3.4 pa
target-substrate distance | 3.7 cm
deposition time | 1–4 hr
deposition temperature | 300 k
re doping | using pellets: er2o3, yb2o3; using powder: er metallic powder, yb powder

the photoluminescence measurement was carried out at three excitation wavelengths:
- an ar laser ila-120 operating at λ_ex = 488 nm, e_ex = 100 mw,
- ar lasers operating at λ_ex = 514.5 nm, e_ex = 300 mw,
- a semiconductor laser p4300 operating at λ_ex = 980 nm, e_ex = 500 mw.
an feu62 photocell was used to detect the wavelengths from 500 to 1 000 nm, while a ge detector was used for the wavelengths from 1 000 nm to 1 600 nm. the reference chopper frequency was 75 hz. all the luminescence measurements were performed at room temperature.

3 results and discussion
the structure of the deposited gan thin films was studied by xrd (x-ray diffraction), and the results have already been given in [20]. it was shown that the gan structure depended on the type of the target and the temperature used for the deposition. gan films grown using the ga2o3 target at room temperature had an amorphous structure, while gan films fabricated using the ga target at room temperature had a polycrystalline structure (according to the literature, gan layers fabricated at an elevated temperature (above 800 °c) can have a single crystalline structure [21]). the exact composition of the deposited gan layers was determined by the nuclear analytical methods (rbs, erda). the typical rbs spectrum of an erbium doped gan layer is shown in fig. 3. the analyses proved that the samples contained gallium, nitrogen, oxygen, argon, hydrogen and erbium and/or ytterbium ions (see table 2). the amount of incorporated er3+ and yb3+ ions differed depending on the area of the target covered by the erbium and ytterbium co-dopant, and also on the erosion area represented by the part of the target surface covered by erbium and ytterbium.

fig. 3: rbs spectrum of er-doped gan containing 1.3 at.% er

table 2: composition of the re-doped gan samples as determined by rutherford backscattering analysis and elastic recoil detection analysis (at.%)
sample | description | ga | n | o | h | er + yb
#160 | reference sample * | 39.2 | 14.1 | 41.5 | 5.2 | 0
#110 | er, 1 × er2o3 ** | 33 | 26.9 | 32 | 8 | 0.1
#111 | er, 3 × er2o3 ** | 35.3 | 25.4 | 30.5 | 8.6 | 0.2
#112 | er, 5 × er2o3 ** | 32.6 | 16.1 | 42.3 | 8.5 | 0.5
#161 | er, m_er = 0.05 g *** | 36.8 | 21.4 | 34.6 | 6 | 1.2
#162 | er/yb, m_er = 0.05 g, m_yb = 0.0997 g *** | 33.8 | 15.5 | 41.9 | 6 | 2.8
#165 | er/yb, m_er = 0.05 g, m_yb = 0.4996 g *** | 37.7 | 11.3 | 35.6 | 8.9 | 6.5
#163 | er/yb, m_er = 0.05 g, m_yb = 1.0008 g *** | 19.4 | 12.6 | 48.3 | 4.7 | 15
* sample without er + yb doping; ** number of er2o3 pellets (5 mm diameter) put on top of the ga2o3 target; *** weight of er or yb powder put on top of the ga2o3 target

as er and yb have very close atomic weights, these two elements cannot be distinguished in the rbs spectra, so that only the sum of the two elements can be obtained.
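table 2 lends itself to a quick programmatic sanity check; the sketch below is an illustration, not part of the published analysis. it stores the measured atomic fractions and verifies that each row sums to 100 at.%, extracting the er + yb content referred to in the discussion.

```python
# minimal sketch: sanity-checking the at.% compositions in table 2
compositions = {            # sample: (ga, n, o, h, er+yb) in at.%
    "#160": (39.2, 14.1, 41.5, 5.2, 0.0),
    "#110": (33.0, 26.9, 32.0, 8.0, 0.1),
    "#111": (35.3, 25.4, 30.5, 8.6, 0.2),
    "#112": (32.6, 16.1, 42.3, 8.5, 0.5),
    "#161": (36.8, 21.4, 34.6, 6.0, 1.2),
    "#162": (33.8, 15.5, 41.9, 6.0, 2.8),
    "#165": (37.7, 11.3, 35.6, 8.9, 6.5),
    "#163": (19.4, 12.6, 48.3, 4.7, 15.0),
}

for sample, fractions in compositions.items():
    total = sum(fractions)
    assert abs(total - 100.0) < 0.5, f"{sample} does not sum to 100 at.%"
    print(f"{sample}: er+yb = {fractions[-1]:.1f} at.%, total = {total:.1f}")
```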
according to table 2, a significant amount of hydrogen is found in the gan films, with relative concentrations ranging between 4 and 9 at.%. this unintended presence of hydrogen in the samples is probably a consequence of residual contamination of the ar and/or n2 gases that are employed, which contained approximately 5 ppm of hydrogen [22]. the gan samples also contained a small amount of argon (around the detection limit of 1 at.%), due to the argon atmosphere used during the deposition.

fig. 4 compares the transmission spectra of the sputtered gan doped with 1.2 at.% of erbium with the un-doped gan sample. the arrows in the figure mark the strongest transitions of the er3+ ions (2h11/2); however, we observed only a weak peak attributed to this er3+ transition. we did not observe any transition of the yb3+ ions at 980 nm (2f5/2) in the erbium doped gan layers co-doped by yb3+ ions, probably because the absorption coefficients of the ytterbium ions are very low and/or the deposited layers are rather thin.

fig. 4: transmission spectra of the er:gan sample

the photoluminescence spectrum of the er3+ doped gan layers fabricated using a ga target, excited at λ_ex = 514.5 nm at a temperature of 4 k, is given in fig. 5. the figure shows the typical photoluminescence bands attributed to the erbium transition 4i13/2 → 4i15/2. we obtained the best result for the gan sample containing about 2.83 at.% of erbium. fig. 6 shows the photoluminescence spectra of a gan layer doped with er3+ fabricated using the ga2o3 target and er2o3 pellets laid on top of the target, obtained by optical pumping at 980 nm at room temperature.

fig. 5: pl spectra of the er-doped gan layers fabricated by magnetron sputtering using a ga target and erbium metallic powder laid on top of the target
fig. 6: pl spectra of an er-doped gan layer fabricated by magnetron sputtering using the gallium oxide target and two pellets (5 mm diameter) of erbium oxide laid on the top of the target

fig. 7 shows the 1 530 nm region of the photoluminescence spectra of the er3+/yb3+ containing gan samples fabricated by doping from erbium-ytterbium powder put onto a ga2o3 target, excited by an ar laser (λ_ex = 514.5 nm, temperature 4 k). the typical photoluminescence bands attributed to the erbium 4i13/2 → 4i15/2 transition increased in intensity with increasing ytterbium content (for details, see table 2). the best results were obtained when we laid 0.05 g of erbium metallic powder and 1 g of ytterbium powder onto the target. fig. 8 shows the same photoluminescence spectra as fig. 7, but now obtained by optical pumping at 980 nm at room temperature, which indicates the better quality of the samples.

fig. 7: pl spectra of er/yb-doped gan layers fabricated by magnetron sputtering using erbium and ytterbium powder laid onto a ga2o3 target (excited at 514.5 nm)
fig. 8: pl spectra of er/yb-doped gan layers fabricated by magnetron sputtering using erbium and ytterbium powder laid onto a ga2o3 target (pumped at 980 nm)

4 conclusion
two basic approaches to the rf magnetron sputtering of gan thin films have been presented. the first one, utilizing a metallic gallium target, provides deposition of well-developed polycrystalline layers. the second, using a gallium oxide target, resulted in almost amorphous gan films with poorly developed crystals. the er/yb doped gan samples exhibited the typical emission at 1 530 nm due to the er3+ intra-4f 4i13/2 → 4i15/2 transition, even when pumped at 980 nm at room temperature.
the layers co-doped with yb ions revealed an increased intensity of luminescence. thus, the possibility of fabricating gan films containing erbium and ytterbium ions by magnetron sputtering was demonstrated.

acknowledgments
our research has been supported by the grant agency of the czech republic (grant no. 102/06/0424), by research program msm6840770014 of the czech technical university in prague and by the ministry of education, youth and sports of the czech republic (grant no. lc6041). we specially thank petr bak for technical support and bohumír dvořák for providing the er2o3 and yb2o3 pellets.

references
[1] morkoc, h., strite, s., gao, g. b., lin, m. e., sverdlov, b., burns, m.: "large-band-gap sic, iii-v nitride, and ii-vi znse-based semiconductor-device technologies." journal of applied physics, vol. 76 (1994), no. 3, p. 1363–1398.
[2] steckl, a. j., heikenfeld, j. c., lee, d. s., garter, m. j., baker, c. c., wang, y. q., jones, r.: "rare-earth-doped gan: growth, properties, and fabrication of electroluminescent devices." ieee journal of selected topics in quantum electronics, vol. 8 (2002), no. 4, p. 749–766.
[3] kik, p. g., polman, a.: "erbium doped optical-waveguide amplifiers on silicon." mrs bulletin, vol. 23 (1998), no. 4, p. 48–54.
[4] chryssou, c. e., di pasquale, f., pitt, c. w.: "improved gain performance in yb3+-sensitized er3+-doped alumina (al2o3) channel optical waveguide amplifiers." journal of lightwave technology, vol. 19 (2001), no. 3, p. 345–349.
[5] hubner, j., guldberg-kjaer, s., dyngaard, m., shen, y., thomsen, c. l., balslev, s., jensen, c., zauner, d., feuchter, t.: "planar er- and yb-doped amplifiers and lasers." applied physics b – lasers and optics, vol. 73 (2001), no. 5–6, p. 435–438.
[6] favennec, p. n., lharidon, h., salvi, m., moutonnet, d., leguillou, y.: "luminescence of erbium implanted in various semiconductors – iv-materials, iii-v-materials and ii-vi materials." electronics letters, vol. 25 (1989), no. 11, p. 718–719.
[7] amano, h., sawaki, n., akasai, i., toyoda, y.: "metalorganic vapor-phase epitaxial-growth of high-quality gan film using an aln buffer layer." applied physics letters, vol. 48 (1986), no. 5, p. 353–355.
[8] doppalapudi, d., iliopoulos, e., basu, s. n., moustakas, t. d.: "epitaxial growth of gallium nitride thin films on a-plane sapphire by molecular beam epitaxy." journal of applied physics, vol. 85 (1999), no. 7, p. 3582–3589.
[9] molnar, r. j., gotz, w., romano, l. t., johnson, m. n.: "growth of gallium nitride by hydride vapor-phase epitaxy." journal of crystal growth, vol. 178 (1997), no. 1–2, p. 147–156.
[10] klemenz, c., scheel, h. j.: "crystal growth and liquid-phase epitaxy of gallium nitride." journal of crystal growth, vol. 211 (2000), no. 1–4, p. 62–67.
[11] song, s. f., chen, w. d., zhu, j. j., hsu, c. c.: "dependence of implantation-induced damage with photoluminescence intensity in gan:er." journal of crystal growth, vol. 265 (2004), no. 1–2, p. 78–82.
[12] chen, w. d., song, s. f., zhu, j. j., wang, x. l., chen, c. y., hsu, c. c.: "study of infrared luminescence from er-implanted gan films." journal of crystal growth, vol. 268 (2004), no. 3–4, p. 466–469.
[13] hommerich, u., nyein, e. e., lee, d. s., heikenfeld, j., steckl, a. j., zavada, j. m.: "photoluminescence studies of rare earth (er, eu, tm) in situ doped gan." materials science and engineering b – solid state materials for advanced technology, vol. 105 (2003), no. 1–3, p. 91–96.
[14] hansen, d. m., zhang, r., perkins, n. r., safvi, s., zhang, l., bray, k.
l., kuech, t. f.: "photoluminescence of erbium-implanted gan and in situ-doped gan:er." applied physics letters, vol. 72 (1998), no. 10, p. 1244–1246.
[15] yang, y. g., ma, h. l., xue, c. s., hao, x. t., zhuang, h. z., ma, j.: "characterization of gan films grown on silicon (111) substrates." physica b – condensed matter, vol. 325 (2003), no. 1–4, p. 230–234.
[16] kim, j. h., shepard, n., davidson, m. r., holloway, p. h.: "visible and near infrared alternating current electroluminescence from sputter grown gan thin films doped with er." applied physics letters, vol. 83 (2003), no. 21, p. 4279–4281.
[17] yang, y. g., ma, h. l., xue, c. s., zhuang, h. z., hao, x. t., ma, j., teng, s. y.: "preparation and structural properties for gan films grown on si (111) by annealing." applied surface science, vol. 193 (2002), no. 1–4, p. 254–260.
[18] saarilahti, j., rauhala, e.: "interactive personal-computer data-analysis of ion backscattering spectra." nuclear instruments & methods in physics research, section b – beam interactions with materials and atoms, vol. 64 (1992), no. 1–4, p. 734–738.
[19] mayer, m.: simnra users guide, institut für plasmaphysik, 1998.
[20] prajzler, v., schröfel, j., hüttel, i., špirková, j., hamáček, j., machovič, v., peřina, v.: er:gan thin films fabricated by magnetron sputtering. in: thin films 2004 & nanotech 2004 (editors: sam zhang, xianting zeng). singapore, nanyang technological university, 2004, 34-otf-a315.
[21] ross, j., rubin, m.: "high-quality gan grown by reactive sputtering." materials letters, vol. 12 (1991), no. 4, p. 215–218.
[22] zanatta, a. r., ribeiro, c. t., freire, f. l.: "optoelectronic and structural properties of er-doped sputter-deposited gallium-arsenic-nitrogen films." journal of applied physics, vol. 90 (2001), no. 5, p. 2321–2328.

ing. václav prajzler, e-mail: xprajzlv@feld.cvut.cz
doc. ing. zdeněk burian, csc., e-mail: burian@fel.cvut.cz
department of microelectronics, czech technical university, faculty of electrical engineering, technická 2, 166 27 prague, czech republic

doc. ing. ivan hüttel, e-mail: ivan.huttel@vscht.cz
rndr. jarmila špirková, csc., e-mail: jarmila.spirkova@vscht.cz
ing. jiří hamáček, e-mail: jiri.hamacek@vscht.cz
institute of chemical technology, technická 5, 166 27 prague, czech republic

ing. jiří oswald, csc., e-mail: oswald@fzu.cz
institute of physics, czech academy of sciences, cukrovarnická 10, 162 53 prague, czech republic

rndr. jiří zavadil, csc., e-mail: zavadil@ure.cas.cz
institute of radio engineering and electronics, czech academy of sciences, chaberská 57, 182 51 prague, czech republic

rndr. vratislav peřina, csc., e-mail: perina@ujf.cas.cz
institute of nuclear physics, czech academy of sciences, 250 68 řež near prague, czech republic
shear strengthening of reinforced concrete beams using gfrp wraps
m. a. a. saafan

the objective of the experimental work described in this paper was to investigate the efficiency of gfrp composites in strengthening simply supported reinforced concrete beams designed with insufficient shear capacity. using the hand lay-up technique, successive layers of a woven fiberglass fabric were bonded along the shear span to increase the shear capacity and to avoid catastrophic premature failure modes. the strengthened beams were fabricated with no web reinforcement to explore the efficiency of the proposed strengthening technique using the results of control beams with closed stirrups as web reinforcement. the test results of 18 beams are reported, addressing the influence of different shear strengthening schemes and variable longitudinal reinforcement ratios on the structural behavior. the results indicated that significant increases in the shear strength and improvements in the overall structural behavior of beams with insufficient shear capacity could be achieved by proper application of gfrp wraps.

keywords: strengthening, shear capacity, gfrp, polyester, fiberglass, wrapping.

notation and symbols
a = shear span from support to first concentrated load, mm
as = longitudinal steel area, mm2
b = beam width, mm
d = beam effective depth, mm
hw = depth of the gfrp composite, mm
fc = compressive strength of concrete, mpa
fwa = allowable tensile stress in the gfrp composite, mpa
fwu = ultimate tensile stress of the gfrp composite, mpa
fy = yield stress of reinforcement, mpa
vc = shear strength provided by concrete, n
vf = beam shear capacity assuming flexural failure, n
vn = nominal shear strength, n
vs = shear capacity provided by web reinforcement, n
vu = ultimate shear load, n
vw = shear strength provided by gfrp wraps, n
s = spacing of web reinforcement, mm
tw = thickness of the gfrp composite, mm
ρ = reinforcement ratio (as/bd)

1 introduction
upgrading of reinforced concrete beams usually involves strengthening existing members to carry higher ultimate loads or to satisfy certain serviceability requirements. strengthening of reinforced concrete members by bonding external steel plates with epoxy has been recognized as an effective method for improving structural performance. however, this method has had two major disadvantages: (a) difficulty in manipulating the steel plates at the construction site due to their bulk, and (b) deterioration of the bond caused by corrosion of the steel. these difficulties have led to the idea of replacing the steel plates by fiber-reinforced polymer (frp) composites. glass fibers are commonly used as a reinforcement offering unique advantages for gfrp composites, such as cost effectiveness, low density (one-quarter that of steel), resistance to electrochemical deterioration and low maintenance cost. reinforced concrete beams can be deficient in shear capacity due to a variety of factors, including improper detailing of the shear reinforcement, poor construction practice, changing the function of the structure accompanied by higher service loads, and a reduction in, or total loss of, the area of the shear reinforcement due to corrosion in a harsh environment. an innovative method of beam shear strengthening involves the use of frp externally bonded to the faces of the member where the shear capacity is deficient. several schemes are available: frp plates bonded to the sides, strips of frp bonded to the sides, or a jacket (wrap) placed along the shear span. the literature shows that only a few studies have addressed the use of externally bonded frp sheets to improve shear strength. al-sulaimani et al. [1] tested simply supported beams with fiberglass in different configurations (plates, strips and wraps) under four-point loading. the specimens were 150×150 mm in cross section and 1 250 mm long, with a shear span to depth ratio equal to 3. compression and tension reinforcement as well as web stirrups were present. these beams were damaged before retrofit and were designed to fail in shear, as the stirrups served mostly to confine the flexural reinforcement.
the researchers determined that fiberglass plates and strips bonded to the sides of the beams produced a moderate (25 %–30 %) increase in shear capacity. this repair technique, however, was not sufficient to prevent a shear mode of failure, and the fiberglass plates and strips peeled off. beams fitted with a fiberglass wrap, however, nearly doubled the shear capacity of the beam, and this increase was sufficient to produce a flexural mode of failure. it was concluded that shear repair by a jacket on three sides performed better than repair by strips or wings. the wings of the jacket were well anchored at the bottom of the beam, so that no premature peeling failure occurred. additionally, the continuity provided by the geometry of the jacket minimized the effect of stress concentrations in the plates. therefore, a jacket configuration should be considered whenever possible. chajes et al. [2] investigated reinforced concrete beams with aramid, glass, and graphite wraps loaded in four-point bending. these specimens were structural tees in cross section, having a 190 mm depth, a 140 mm wide flange, a 64 mm thick web, and a 1 220 mm length. these beams were completely lacking in shear reinforcement but contained enough flexural reinforcement so that a shear failure would occur. while all the beams experienced an increase in ultimate capacity, they still failed in shear; the glass and graphite wraps were torn along the diagonal crack. the purpose of this experimentation was not to force a flexural failure, but to determine the effectiveness of the system in increasing the shear capacity of specimens that were designed to fail in shear. the frp wrap was thus shown to be effective for shear repair. the work was continued in another research project [3], in which the beams were designed to fail in flexure, and this requirement was successfully fulfilled.

2 research significance
several studies have investigated the use of externally bonded frp composites to improve the strength and stiffness of reinforced concrete beams, but most have addressed flexural strength, not shear.
the efficiency of applying successive layers of fiberglass fabric as external shear reinforcement to enhance the shear capacity of rc beams without web reinforcement was investigated. the test parameters included a variable tension reinforcement ratio and different shear strengthening schemes. a combination of shear and flexural strengthening was also considered, aiming to achieve a further increase in ultimate loads accompanied by desirable ductile modes of failure. the efficiency of the proposed strengthening technique was evaluated making use of the test results of two groups of control beams: the beams in the first group were not strengthened and were fabricated with and without web reinforcement, while those in the second group were strengthened in flexure and were fabricated with and without web reinforcement. simple equations describing the shear capacity of gfrp wraps were revised, based on the current results, to present a simple design tool for practicing engineers.

3 experimental program
3.1 design considerations
the design criteria for the conventional beams were based on the provisions of the aci 318-95 code [4] for ultimate strength design. the flexural and shear capacities were computed based on the actual material properties without using the reduction factors specified by the code. the code provisions for shear design use the concept that the nominal shear strength $V_n$ of a reinforced concrete member is taken as the sum of the shear carried by the concrete, $V_c$, and by the web reinforcement, $V_s$. the term $V_c$ in a diagonally cracked beam represents three separate components: (a) dowel action of the tension longitudinal reinforcement, (b) aggregate interlock across the crack faces, and (c) shear carried by the uncracked concrete in the flexural compression zone [5]. the term $V_s$ represents the vertical component of the shear reinforcement across an assumed 45 deg. failure crack. the terms $V_c$ and $V_s$ are given by the aci 318-95 code as (dimensions in lb and in.):

$$V_c = \left(1.9\sqrt{f_c} + 2500\,\rho\,\frac{V_u\, d}{M_u}\right) b\, d, \qquad (1)$$

$$V_s = \frac{A_v\, f_y\, d}{s}. \qquad (2)$$

according to the mathematical model proposed by woo kim [6], the term $V_c$ can be more accurately estimated for a beam under four-point loading using the following equation, which takes into account the influence of the parameter $a/d$:

$$V_c = 9.4\left(\frac{d}{a}\right)^{1/2}\left(\rho\, f_c\right)^{1/3} b\, d. \qquad (3)$$

the design of the beams strengthened in flexure, by bonding a gfrp composite of a specified thickness to the soffit of the beam, was conducted according to the same design provisions as for the conventional beams. the strain in the extreme fiber of the concrete was set equal to 0.003, and the flexural capacity was computed using an iterative technique applying compatibility and equilibrium conditions. the degree of shear strengthening required is represented by the difference between the load demand and the existing section capacity. the final shear capacity of the strengthened beam should be sufficiently higher than the flexural capacity to avoid shear failure modes, which can occur without warning and may be catastrophic.
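as a worked illustration of the aci equations (1)–(2), the sketch below evaluates the concrete and stirrup contributions for a beam of the size tested here. the material inputs (concrete strength, steel yield stress, stirrup area) are assumptions for demonstration, not the paper's design values, and equation (1) is evaluated in the psi/in. units in which it is written.

```python
# minimal sketch: aci 318-95 shear terms, eqs (1)-(2)
# (inputs are illustrative assumptions, not the paper's design values)
import math

MM_PER_IN, N_PER_LB = 25.4, 4.448

def vc_aci(fc_psi, rho, vu_d_over_mu, b_in, d_in):
    # eq (1): concrete contribution in lb; vu*d/mu is dimensionless
    return (1.9 * math.sqrt(fc_psi) + 2500 * rho * vu_d_over_mu) * b_in * d_in

def vs_aci(av_in2, fy_psi, d_in, s_in):
    # eq (2): stirrup contribution in lb
    return av_in2 * fy_psi * d_in / s_in

# assumed 100 x 150 mm beam, d = 130 mm, two 8 mm bars (rho ~ 0.0077)
b_in, d_in = 100 / MM_PER_IN, 130 / MM_PER_IN
vc_lb = vc_aci(fc_psi=4000, rho=0.0077, vu_d_over_mu=1 / 3,
               b_in=b_in, d_in=d_in)
vs_lb = vs_aci(av_in2=0.088, fy_psi=36_000, d_in=d_in,
               s_in=65 / MM_PER_IN)      # two-legged 6 mm stirrups at 65 mm

print(f"Vc ~ {vc_lb * N_PER_LB / 1e3:.1f} kN, "
      f"Vs ~ {vs_lb * N_PER_LB / 1e3:.1f} kN, "
      f"Vn ~ {(vc_lb + vs_lb) * N_PER_LB / 1e3:.1f} kN")
```

with these assumed inputs the concrete term comes out near the 12–13 kn range reported for the control beams below, which is only meant to show that the orders of magnitude are consistent.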
this requirement is fulfilled by limiting the allowable composite material stress, $f_{wa}$, to 0.004 times the composite modulus of elasticity, $e_w$, but not more than 0.75 times the composite ultimate tensile stress, $f_{wu}$. the shear force $v_w$ can be computed according to the following equation, assuming a shear crack inclined at 45 degrees:

$$v_w = 2\,t_w h_w f_{wa}, \qquad (4)$$

in which $t_w$ is the thickness of the frp composite and $h_w$ is its depth. the bond between the frp composite and the underlying concrete surface should be capable of transmitting this shear force, and consequently a check for bond is necessary. unfortunately, it is not straightforward to determine the loading level at which the bond is broken, because (a) the actual ultimate bond strength is influenced by many factors, including the quality of the adhesive material, the quality of the surface preparation, and the efficiency of the fabric application process, and (b) the distribution of shear along the depth of the composite is generally not uniform, due to possible shear stress concentration at the composite ends, which causes the composite to peel off without developing the ultimate shear stress in a uniform distribution [1]. al-sulaimani et al. [1] pointed out that the ultimate shear strength can develop uniformly along the depth of the wings of strengthening jackets, as previously discussed. the shear force $v_w$ is related to the bond stress when using side plates or jackets according to the following equation, assuming a shear crack inclined at 45 degrees [1]:

$$v_w = \tau\,h_w d, \qquad (5)$$

where $\tau$ is the average shear stress developed along the interface. the actual value of the average stress can be estimated from the experiments by knowing the ultimate shear force $v_u$ and the concrete shear capacity $v_c$ provided by control beams with no web reinforcement. however, for the purpose of preliminary design, the results provided by pulling double-lap frp plates bonded to concrete provide a reasonable estimate for the average shear stress.

3.2 specimen details

a total of 20 singly reinforced concrete beams (100×150×1050 mm) were cast. all beams were reinforced with two lower bars, allowing for an effective depth of 130 mm. the beams were divided into three main groups according to the strengthening scheme (table 1):

group c: six control beams without strengthening.
group cf: six control beams with flexural strengthening.
group s: eight beams with different schemes of shear strengthening.

three of the six beams in group c and in group cf, covering the different reinforcement ratios, had vertical stirrups of 6 mm mild steel bars at 65 mm spacing as web reinforcement, as shown in fig. 1a. in order to determine the shear capacity provided by the concrete and to evaluate the efficiency of the external strengthening compared to steel web reinforcement, the other three beams in groups c and cf were cast without stirrups. table 2 reports the design details, including the ultimate shear capacity $v_c + v_s$ as a ratio of the ultimate shear force, $v_f$, associated with an assumed flexural failure. this ratio varied from 3.6 to 1.25 for beams with stirrups and from 0.33 to 0.88 for those without stirrups. the calculations in table 2 show that eq. (1) provided a conservative estimate of $v_c$ compared to eq. (3), according to which beam c8 was safe against shear failure.
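the design chain of section 3.1, together with the table 2 checks just described, lends itself to a quick numerical illustration. the python sketch below is ours, not a tool from the paper; the function names are ours, and the example numbers are taken from tables 3 and 4 where available. note that eqs. (1)-(2) are in lb and in. units while the wrap check is in si units, as in the source.

```python
import math

def vc_aci(fc_psi, rho_w, vu_d_over_mu, bw_in, d_in):
    """eq. (1): concrete contribution per aci 318-95 (lb and in. units)."""
    return (1.9 * math.sqrt(fc_psi) + 2500.0 * rho_w * vu_d_over_mu) * bw_in * d_in

def vs_aci(av_in2, fy_psi, d_in, s_in):
    """eq. (2): stirrup contribution (lb and in. units)."""
    return av_in2 * fy_psi * d_in / s_in

def vw_wrap(tw_mm, hw_mm, ew_mpa, fwu_mpa):
    """eq. (4) with the icbo stress limit fwa = min(0.004*ew, 0.75*fwu); returns newtons."""
    fwa = min(0.004 * ew_mpa, 0.75 * fwu_mpa)
    return 2.0 * tw_mm * hw_mm * fwa

def bond_stress(vw_n, hw_mm, d_mm):
    """eq. (5) rearranged: average interface shear stress (mpa) needed to carry vw."""
    return vw_n / (hw_mm * d_mm)

# two fabric layers (1 mm) as side plates, hw = 130 mm, frp data from table 4:
vw = vw_wrap(1.0, 130.0, 21500.0, 420.0)
print(vw / 1000.0)                    # ~22.4 kn, as in table 3
print(bond_stress(vw, 130.0, 130.0))  # ~1.3 mpa average bond demand
```

with the two-layer side-plate inputs, the sketch reproduces the 22.4 kn wrap capacity listed in table 3; the u-jacket and four-layer cases follow by changing hw and tw.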
the beams in group cf were typically the same as those in group c, with frp fabric adhered to the soffit of the beam. the frp composite (100×800 mm) consisted of eight layers of fabric with an overall thickness of 4 mm, as shown in fig. 1b. testing the strengthened beams in group cf was intended to provide information about the beam behavior due to flexural strengthening and to check for possible premature failure modes due to, for instance, composite debonding or concrete cover rip-off. all beams in group s were without stirrups. the beams were strengthened in shear along the whole shear span length by wrapping the whole cross section, by bonding rectangular side wraps (130×300 mm), or by bonding u-jackets, as demonstrated in figs. 1c and 1d. beams with 8 and 10 mm reinforcement bar diameter were also strengthened in flexure in the same way as in group cf. table 3 summarizes the design calculations for shear strengthening based on the properties of the frp material reported in table 4 for different numbers of fabric layers and different composite thicknesses. beams with a 13 mm reinforcement bar diameter were strengthened using four fabric layers in all shear repair schemes, while only two fabric layers were used in the other beams. using aci equations (1) and (2), the theoretical ratio $(v_c + v_w)/v_f$ assumes values between 1.42–2.52 when side plates are used and between 1.59–2.78 in the case of jackets, provided that the bond strength is sufficient so that $v_w$ can develop.

table 1: description of test beams

group | beam designation | no. of beams | description
c | cφw* | 3 | control beams with steel web reinforcement
c | cφ | 3 | control beams with no web reinforcement
cf | cfφw | 3 | control beams with steel web reinforcement and frp wraps for flexural strengthening
cf | cfφ | 3 | control beams with no web reinforcement and frp wraps for flexural strengthening
s | sφfp | 2 | beams strengthened in flexure and in shear by side wraps forming side plates
s | sφfu | 2 | beams strengthened in flexure and in shear by u-jacket wraps
s | sφfr | 2 | beams strengthened in flexure and in shear by wraps around the whole cross section
s | sφp & sφr | 2 | beams strengthened in shear by side plate wraps and by wraps around the whole cross section
* (φ) stands for the diameter of the reinforcing bar (8, 10, 13 mm)

table 2: details on control beams (group c)

beam | as (mm²) | ρ/ρb | vf (kn) | vc (kn)* | vs (kn) | (vc + vs)/vf
c8w | 100.53 | 0.19 | 13.64 | 12.06 (17.69) | 36.74 | 3.6 (4.0)
c8 | 100.53 | 0.19 | 13.64 | 12.06 (17.69) | – | 0.88 (1.3)
c10w | 157.08 | 0.28 | 20.39 | 12.49 (18.74) | 36.74 | 2.4 (2.7)
c10 | 157.08 | 0.28 | 20.39 | 12.49 (18.74) | – | 0.6 (0.9)
c13w | 265.46 | 0.68 | 40.09 | 13.31 (19.95) | 36.74 | 1.25 (1.4)
c13 | 265.46 | 0.68 | 40.09 | 13.31 (19.95) | – | 0.33 (0.5)
* values in parentheses: vc according to eq. (3).

3.3 fabrication of test beams

the concrete mix was placed in the forms and vibrated to ensure consolidation of the concrete. the specimens were covered with wet burlap, which was kept moist for the first 3 days. the specimens were left to cure at room temperature and tested 28 days after casting. prior to the application of the composite overlays, the beams were prepared by using a sand blasting machine to roughen the concrete surface to enhance the bond with the frp composite, and by rounding the corners of the beam along the shear span to a radius of 10 mm. this practice was necessary to avoid bending the jacket wraps at a right angle and thus to prevent the creation of gaps between the beam and the composite. the surfaces were then thoroughly cleaned of debris and dust using an air blower.
the reinforcing fabric was cut to the proper dimensions using scissors and was infused with polyester. this was done by laying the fabric over the soffit while the beam was upside down and spreading the resin by hand to saturate the fabric. for wrapping with u-jackets, it was found more convenient to saturate the fabric laid flat over a polyethylene sheet and then lay it up around the beam. to wrap the whole cross section, one layer in the form of a u-jacket was applied, followed by an inverted u-jacket.

3.4 instrumentation and test procedure

all beams were tested to ultimate load in four-point bending over a simple span of 900 mm and a shear span of 300 mm, providing a shear span-to-depth ratio (a/d) of 2.3. both ends of the beam were free to rotate and translate under the load. the load was applied by means of a 100 kn capacity flexural machine equipped with a digital control console. the load was applied in increments of 2.5 kn until the tensile reinforcement yielded. deflection control, in which the load step corresponds to a specified increase in deflection, was used when the beam entered the plastic range. the midspan deflection was recorded at each load step using a dial gage. electrical resistance strain gages were affixed at the beam midspan section to monitor the strain internally in the tensile reinforcement and externally in the extreme concrete fiber in compression. another two strain gages were located under the applied loads and affixed to the beam side 5 mm below the extreme compression fiber.

3.5 materials

concrete: a normal strength concrete mix with a specified 28-day compressive strength of 30 mpa was made using natural gravel with a maximum size of 19 mm and type i portland cement. the cement:sand:gravel proportions in the concrete mix were 1:2:3.37 by weight, and the water-cement ratio was 0.51. three cylinders (150×300 mm) and prisms (100×100×500 mm) were cast and cured under the same conditions as the rc beams and tested at 28 days of age to determine the average compressive strength, modulus of elasticity and modulus of rupture. the results are given in table 4.

fig. 1: beam details and the different repair schemes, panels (a)–(d)

table 3: design calculations for shear strengthening

no. of fabric layers | 2 | 4
composite thickness (mm) | 1 | 2
allow. tensile stress (mpa) | 86 | 84
shear capacity vw (kn), side plates (hw = 130 mm) | 22.4 | 43.7
shear capacity vw (kn), u-jacket or full wrapping (hw = 150 mm) | 25.8 | 50.4
required bond strength (mpa) | 1.15 | 2.24

steel: four different reinforcing bar sizes of mild and high strength steel were used as web and flexural reinforcement. tension tests were conducted on full-size bar samples
to determine the yield and ultimate strengths, as given in table 4. frp: a commercially available fiberglass reinforced polymer matrix was used for beam strengthening as external shear and flexural reinforcement. the composite consisted of a number of layers of woven roving fiberglass embedded in a polyester resin. the fabric is made by weaving untwisted rovings in a plane weave (half of the strands are laid at right angles to the other half), with a nominal thickness of 0.5 mm and a density of 450 g/m². tests were conducted to explore the mechanical properties of different locally available polymers, including polyester, vinyl ester and epoxy adhesives. based on these results, polyester was a convenient choice because of its excellent mechanical properties and relatively low price. liquid polyester resin is cured with benzoyl peroxide as an initiator, supplied in the form of an emulsion, since the pure substance may cause fire due to rapid decomposition. an initiator dose of 8 ml/kg of polyester allowed for a curing time of 90 minutes before setting into a solid state in the mixing mold. the mixed amount was limited so that it could be applied with a brush without going into the gel state. flexural tests on prismatic specimens revealed a major drawback of using polyester as an adhesive: an average linear shrinkage as high as 2.5 percent was measured. it is, however, expected that the strain incompatibility problems, which are common and sometimes severe in plate bonding techniques due to autogenous shrinkage and thermal effects, should not be pronounced when using the wrapping technique. this is attributed to the very small thickness of the bonding line, which is at the same time reinforced by the fiber reinforcement. separate plates were manufactured to prepare the frp specimens tested in tension. the plates were manufactured by laying the fabric layers out flat and evenly spreading the resin on the fabric by hand to saturate it. 24 hours later, three specimens were cut out of each plate and tested in uniaxial tension upon full cure. the material exhibited a linear elastic behavior up to failure, with decreasing ultimate strength as the number of fabric layers, and consequently the plate thickness, increased. the average fiber content was 63 % by weight (71 % by volume) of the composite. table 4 gives the results of the mechanical tests performed in accordance with astm specifications on both the frp material and the polyester matrix. a special test arrangement was designed to evaluate the bond strength between the frp composite and the underlying concrete surface. the specimen consisted of a concrete block (100×100×400 mm) provided with a wedge notch at the middle of each of the two opposite sides, and with two steel bars to be clamped in the testing machine. the specimen was first pulled and split into two pieces at the notch location; the composite was then adhered to the two notched faces to join the two pieces together. the composite was a strip 25.4 mm wide that extended 100 mm over one side of the notch and 200 mm over the other side. the unequal extension about the notch was intended to force debonding of the composite to occur over a specified area. three specimens were prepared in this manner and tested 72 hours later by pulling the specimen to the ultimate load, at which the frp strip peeled off (fig. 2).

table 4: mechanical properties of concrete, reinforcement and frp materials (values in mpa)

material | comp. strength | mod. of elasticity | mod. of rupture | yield strength | ultimate strength | bond strength | shear strength | no. of fabric layers
concrete | 29.8 | 29500 | 2.6 | – | – | – | – | –
steel, φ6 mm* | – | 200000 | – | 325 | 475 | – | – | –
steel, φ8 mm* | – | 210000 | – | 330 | 462 | – | – | –
steel, φ10 mm* | – | 210000 | – | 325 | 462 | – | – | –
steel, φ13 mm | – | 210000 | – | 420 | 625 | – | – | –
polyester | 102 | 1970 | 61.7 | – | – | 1.5 | 17.3 | –
frp comp., 1 mm thick. | – | 21500 | – | – | 420 | – | – | 2
frp comp., 2 mm thick. | – | 21000 | – | – | 311 | – | – | 4
frp comp., 4 mm thick. | – | 18500 | – | – | 200 | – | – | 8
* plain mild steel bars

fig. 2: frp-concrete bond test
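the notched-block test just described reduces to simple arithmetic once a peel load is measured. the back-of-envelope sketch below is ours, and it assumes (a) that the pull load divides evenly between the two face strips and (b) that debonding is confined to the shorter, 100 mm bonded length; the 7.6 kn load is purely illustrative.

```python
def average_bond_stress(pull_load_n, n_strips=2, width_mm=25.4, bonded_len_mm=100.0):
    """average interface shear stress (mpa) at peel-off for the notched-block test.
    assumes the load splits evenly between the strips and that debonding is
    confined to the shorter bonded length on one side of the notch."""
    return pull_load_n / (n_strips * width_mm * bonded_len_mm)

# a hypothetical 7.6 kn peel load gives ~1.5 mpa, the order of the
# polyester bond strength reported in table 4:
print(average_bond_stress(7600.0))
```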
4 results and discussion

the test results for the eighteen beams are summarized in table 5, listing the measured cracking, yield and ultimate loads for the tested beams along with a description of the different modes of failure. the load versus deflection curves for the tested beams are shown in figs. 3 through 7. the load-deflection curves were terminated upon either sudden failure due to composite debonding or reaching an ultimate compressive strain of 0.003 associated with concrete crushing. circular and triangular marks on the curves are used to indicate tension steel yielding and concrete crushing, respectively.

fig. 3: load-deflection curves for control beams with web reinforcement (c8w, c10w, c13w, cf8w, cf10w)

fig. 4: load-deflection curves for control beams with no web reinforcement (c8, c10, c13, cf8, cf10)

fig. 5: load-deflection curves for shear strengthened beams (s8fp, s8fu, s8fr, c8)

fig. 6: load-deflection curves for shear strengthened beams (s10fp, s10fu, s10fr, c10)

the test results for the control beams in group c show that beams c8 and c10 failed in flexure, which means that the shear capacity was not exhausted and thus the ultimate shear force vu can only provide a lower bound for the concrete shear capacity vc. while the mode of failure for beam c13w was shear-compression, the shear capacity was also exhausted in beam c13, which failed due to diagonal tension. it can be concluded that eq. (1) provided a highly conservative estimate for vc, as its value was only 57 percent of the experimental value for beam c13. for beams c8 and c10 this ratio was 82 and 57 percent, applying vu as a lower bound for vc. the higher values for vc obtained by eq. (3) provided a more accurate, yet still conservative, prediction of the shear capacity. diagonal tension failure modes always occurred due to a single crack extending throughout the shear span on one end of the beam. the crack started from the extreme flexure crack and gradually inclined with loading to finally approach the near point of loading with a flat slope. in shear-compression failures, the cracking pattern was almost symmetrical and was characterized by the extension of a flexural-shear crack towards the points of loading, with web shear cracks connected to it. the web cracks started above the reinforcement level and extended towards the bottom of the beam.

the beams in group cf were strengthened in flexure by bonding eight successive layers of fiberglass fabric extending over 90 percent of the span. the aim of flexural strengthening was to provide additional flexural strength for group c beams so that shear failure and possibly other brittle modes would be expected. because beam c13w failed in shear, it was not necessary to strengthen this beam in flexure, and thus the results of only eighteen beams are reported. according to the proposed iterative solution, the flexural strength for beams cf8w and cf10w was 37.9 and 40.55 kn, respectively, making these two beams more likely to fail in shear with regard to their shear capacity. this was confirmed by the test results reported in table 5. the mode of failure for beams cf8w and cf10w was shear-compression associated with simultaneous composite debonding.
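the iterative compatibility-equilibrium technique referred to above (extreme-fibre concrete strain fixed at 0.003) can be sketched as follows. this is our minimal reading of the procedure, assuming a whitney stress block, elastic-perfectly-plastic steel and linear-elastic frp with no rupture check; the section numbers in the example are only loosely based on beam cf8, and af and ef in particular are our assumptions.

```python
def frp_section_capacity(b, h, d, As, fy, Es, Af, Ef, fc, beta1=0.85, eps_cu=0.003):
    """moment capacity (n*mm) of a rectangular section with a bottom frp plate,
    found by bisection on the neutral-axis depth c so that compression = tension."""
    def forces(c):
        eps_s = eps_cu * (d - c) / c          # steel strain (compatibility)
        eps_f = eps_cu * (h - c) / c          # frp strain at the soffit
        fs = min(Es * eps_s, fy)              # elastic-perfectly-plastic steel
        ff = Ef * eps_f                       # linear-elastic frp
        C = 0.85 * fc * b * beta1 * c         # whitney block resultant
        return C, As * fs + Af * ff, fs, ff
    lo, hi = 1e-3, h
    for _ in range(100):                      # bisection on equilibrium
        c = 0.5 * (lo + hi)
        C, T, fs, ff = forces(c)
        lo, hi = (c, hi) if C < T else (lo, c)
    a = beta1 * c
    return As * fs * (d - a / 2) + Af * ff * (h - a / 2)

# cf8-like numbers (af = 4 mm x 100 mm and ef = 18500 mpa are assumptions):
Mn = frp_section_capacity(b=100, h=150, d=130, As=100.53, fy=330, Es=210000,
                          Af=400.0, Ef=18500, fc=29.8)
print(Mn / 1e6, "kn*m")
```

dividing the resulting moment by the 300 mm shear span gives a load of roughly 37–38 kn, the order of the 37.9 kn value reported above.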
the beams showed sufficient ductility, as the tension reinforcement yielded at an early stage. it can be seen that there was no plateau in the load-deflection curves for these beams upon yielding, indicating that the increment in the tension component of the internal moment couple was carried by the composite after reinforcement yielding. the load carrying capacity for beams cf8w and cf10w was 96 and 55 percent higher than the ultimate loads of beams c8w and c10w, respectively. the increase in the ultimate load for beams cf8 and cf10 compared to beams c8 and c10 was not as high as for the similar beams with web reinforcement. however, the mode of failure for these beams changed from flexure to diagonal tension due to the flexural strengthening.

fig. 7: load-deflection curves for shear strengthened beams (s13p, s13r, c13)

table 5: shear force (kn) at different loading stages and mode of failure

group | beam | vcr | vy | vu | mode of failure
c | c8w | 6.0 | 14.0 | 14.7 | flexural (steel yielded)
c | c10w | 6.0 | 20.3 | 21.8 | flexural (steel yielded)
c | c13w | 7.5 | – | 38.8 | shear-compression (onset of steel yielding)
c | c8 | 6.0 | 14.0 | 14.7 | flexural (steel yielded)
c | c10 | 6.0 | 20.6 | 21.4 | flexural (steel yielded)
c | c13 | 8.0 | – | 23.5 | diagonal tension (steel remained elastic)
cf | cf8w | 11.0 | 19.4 | 28.9 | shear-compression associated with peeling of the frp composite (steel yielded)
cf | cf10w | 12.0 | 26.3 | 33.7 | shear-compression associated with peeling of the frp composite (steel yielded)
cf | cf8 | 11.5 | 19.7 | 22.3 | diagonal tension (steel yielded)
cf | cf10 | 12.0 | – | 23.2 | diagonal tension (steel remained elastic)
s | s8fp | 16.0 | 25.0 | 32.0 | debonding of lower plate (steel yielded)
s | s8fu | 15.0 | 22.0 | 29.0 | flexural (steel yielded)
s | s8fr | 18.0 | 25.0 | 35.0 | steel yielding followed by rupture of the lower plate and concrete crushing
s | s10fp | 17.0 | 27.2 | 33.7 | debonding of lower plate (steel yielded)
s | s10fu | 17.0 | 29.0 | 37.9 | flexural (steel yielded)
s | s10fr | 19.0 | 31.8 | 42.8 | steel yielding followed by rupture of the lower plate and concrete crushing
s | s13p | 13.0 | – | 38.0 | debonding of side plates / jackets
s | s13r | 13.0 | – | 40.0 | debonding of side plates / jackets

fig. 8: cambering of beam s8fu due to the initial tensile force in the frp composite bonded to the soffit of the beam

shear strengthening was planned to modify the mode of failure of the control beams that failed in a brittle manner and to evaluate the associated increase in ultimate loads. the efficiency of the composite system as external shear reinforcement compared to web reinforcement can be assessed using the results for group s beams listed in table 5. compared to beam cf8w, the ultimate load was increased by 11, 0, and 21 percent in beams s8fp, s8fu, and s8fr, respectively. in all repair schemes, the ductility of the shear strengthened beams was superior to that of the control beams, as the steel yielded at an early stage of loading. the use of jackets in beam s8fu and full wrapping in beam s8fr effectively anchored the end of the lower plate and prevented it from peeling off. beam s8fr developed the highest strength, as the lower plate ruptured simultaneously with concrete crushing in the region of maximum moment. surprisingly, beam s8fu did not gain any significant increase in the ultimate load compared to the control beam. compared to beam s8fp, its mode of failure was modified into flexure, although both stiffness and strength were superior in beam s8fp.
this could be explained by the existence of an unfavorable stress state in the beam due to an initial eccentric tensile force in the lower plate. this particular beam was strengthened in shear by bonding the u-jackets 24 hours after bonding the lower composite. the lower plate was sufficiently cured by this time to resist the shrinkage associated with hardening of the matrix bonding the wing jackets. this force reduced the load carrying capacity of the beam and caused it to crack earlier. concrete evidence in support of this explanation can be seen in the photograph in fig. 8, showing cambering of the beam upon load release under the eccentric tensile force in the perfectly elastic lower composite plate. this argument suggests that a high shrinkage matrix like polyester should be avoided, or used with caution when planning the sequence of strengthening works, to avoid such a problem. compared to beam cf10w, the ultimate load was increased by 0, 12 and 27 percent in beams s10fp, s10fu, and s10fr, respectively. it can be seen that beam s10fp did not develop its full strength, due to peeling of the lower plate, while beam s10fr yielded the highest strength, as the lower plate ruptured at failure, producing the maximum tensile force in the plate. beams s13p and s13r, strengthened only in shear, developed 98 and 103 percent of the ultimate load developed by the control beam c13w. failure of these beams was due to limited debonding of the side plates or wings over separate spots. fig. 7 shows linear behavior for these two beams up to failure, with a sudden drop in the load-deflection curves upon composite debonding. to determine the shear capacity provided by the different shear repair schemes, the shear capacity provided by the concrete was specified as provided by testing the control beams c13, cf8 and cf10. all these beams failed in shear and yielded an average concrete shear capacity of 23 kn. the two-layer composite in beam s10fr developed and sustained a load of 20 kn with no signs of debonding. similar computations showed that the four composite layers in beams s13p and s13r could not be fully utilized, as a limiting bond stress of 0.77 mpa governed. from the above discussion, two major conclusions can be drawn. first, applying an external shear reinforcement using gfrp composites is as efficient as using closed stirrups as web reinforcement, with a possible improvement in the mode of failure. second, the design guidelines outlined in table 3 are satisfactory for external shear strengthening design when used along with the design equations of the aci code. the load-deflection curves for all beams showed a minor improvement in beam stiffness due to flexural strengthening, because of the small thickness and modulus of the composite. on the other hand, a remarkable increase in cracking loads was recorded. the tensile force generated in the composite when cracking was imminent effectively arrested the crack and also prevented it from opening up wide during all loading stages. a further increase in the cracking load was recorded in the shear strengthened beams, as the composite extended along the shear span (66 percent of the span) and thus the cracks were forced to occur along a limited portion of the beam.

5 conclusions

the present study focused on performing comprehensive experimental work to explore the efficiency of gfrp composites as an external shear reinforcement.
the results indicated that a significant increase in shear strength could be achieved by the application of gfrp to concrete beams deficient in shear capacity. when u-jackets are properly applied over the shear span, the failure mode of the beam may be altered from a brittle shear failure to a ductile flexural failure mode. also, the strengthened beams were able to achieve the strength and stiffness levels of web reinforced beams. the results also show that the serviceability performance of strengthened beams is expected to be superior with regard to increased cracking loads, the limited number of cracks and the small crack widths. it was observed that the composite system was relatively difficult to adhere properly to the concrete prior to curing, as the fabric tended to separate. it was also noticed that the layers can easily slip down under their own weight. these difficulties suggest that suitable clamps are necessary to hold the system in place until curing. also, the bond strength developing at the interface was a critical link in the strengthened beam. it was found difficult to avoid the existence of gaps near the edges when jackets were used, despite the rounding of these edges along the shear span. these gaps are suspected to initiate bond failure, which may govern the ultimate strength of the beam. for this reason, special attention should be given to the preparation of the concrete surface and to the proper selection of the matrix and the placement of the fabric. through proper design, gfrp-strengthened members developed sufficient ductility despite the brittle nature of the composite, which encourages the use of gfrp as an effective concrete reinforcement. finally, the cost effectiveness of the system makes it a very attractive alternative in strengthening and repair works, as the materials are commercially available at a convenient price.

references

[1] al-sulaimani, g. j., sharif, a., basunbul, i. a., baluch, m. h., ghaleb, b. n.: "shear repair for reinforced concrete by fiberglass plate bonding." aci structural journal, vol. 91 (1994), no. 3, p. 458–464.
[2] chajes, m. j., januszka, t. f., mertz, d. r., thomson, t. a., finch, w. w.: "shear strengthening of reinforced concrete beams using externally applied composite fabrics." aci structural journal, vol. 92 (1995), no. 3, p. 295–303.
[3] chajes, m. j., thomson, t. a., tarantino, b.: "reinforcement of concrete structures using externally bonded composite materials." in: proceedings, non-metallic reinforcement for concrete structures. london: e&fn spon, 1995, p. 501–508.
[4] aci committee 318: "building code requirements for reinforced concrete, aci 318-95, and commentary aci 318r-95." aci, 1995, p. 369.
[5] johnson, m. k., ramirez, j. a.: "minimum shear reinforcement in beams with high strength concrete." aci structural journal, vol. 86 (1989), no. 4, p. 376–382.
[6] kim, w., white, r. n.: "initiation of shear cracking in reinforced concrete beams with no web reinforcement." aci structural journal, vol. 88 (1991), no. 3.
[7] international conference of building officials evaluation service (icbo): "acceptance criteria for concrete and reinforced and unreinforced masonry strengthening using fiber-reinforced composite systems." ac125, 1997.
dr. mohamed abdel aziz saafan
e-mail: m.safan2000@yahoo.com
department of civil engineering, faculty of engineering
menoufia university, egypt

finite element modelling of cold formed stainless steel columns

m. macdonald, j. rhodes

this paper describes the results obtained from a finite element investigation into the load capacity of column members of lipped channel cross-section, cold formed from type 304 stainless steel, subjected to concentric and eccentric compression loading. the main aims of this investigation were to determine the effects which the non-linearity of the stress-strain behaviour of the material would have on the column behaviour under concentric or eccentric loading. stress-strain curves derived from tests and design codes are incorporated into non-linear finite element analyses of eccentrically loaded columns, and the results obtained are compared with those obtained on the basis of experiments on stainless steel channel columns with the same properties and dimensions. comparisons of the finite element results and the test results are also made with existing design specifications, and conclusions are drawn on the basis of the comparisons.

keywords: finite elements, stainless steel, cold forming.

1 introduction

the mechanical properties of stainless steels are significantly different from those of carbon steel. stainless steels display a pronounced response to cold working, resulting in anisotropic, non-linear stress-strain behaviour and low proportional limits. the material properties of various stainless steels have been thoroughly investigated since the 1960s by a number of investigators, e.g. refs. [1, 2, 3, 4]. it has been generally concluded that the stress-strain behaviour of stainless steels is best described by the ramberg-osgood model [5], and hill's [6] modified form of the ramberg-osgood equation is used in design specifications. the main design specification for cold formed stainless steel members in the usa is the asce specification [7]; in europe, eurocode 3: part 1.4 [8] has recently been developed and is still under examination. the two codes use different approaches when dealing with the mechanical properties of the material. the asce code employs the modified form of the ramberg-osgood model to describe the stress-strain behaviour of a material, whereas the eurocode relies for most purposes on the specification of a linear stress-strain law, with the yield strength taken as the 0.2 % proof stress. in refs. [9, 10], a comparison of the eurocode and asce code load capacity predictions for lipped channel columns is illustrated. the simpler eurocode analysis has been found to give reasonable estimates of concentrically loaded column strength without taking account of the non-linearity of the stress-strain curve. as part of a previous investigation [11], a series of tensile tests was carried out on coupons cut from stainless steel lipped channel sections, and also on full sections; the stress-strain characteristics are examined in this paper and incorporated into a non-linear finite element analysis.

2 mechanical properties of stainless steel lipped channel members

in the formation of a profiled section, the cold working occurs in localised areas, with the material at the bends being strain hardened. therefore the properties of the material vary throughout the cross-section: at the formed bends, higher yield and tensile strengths exist, leading to a more complex stress-strain relationship for cold formed members and, in particular, for stainless steel members. the level of increase of both yield and tensile strength is dependent on the ratio of corner radius to material thickness (r/t). the cold formed lipped channels under investigation are of stainless steel, of cross-sections with small web, flange and lip dimensions, and are considered to be thick; hence four corner bends are formed with small r/t ratios (<1). these four corners will have an effect on the stress-strain response of the material obtained from a full section test, which can then be compared to that obtained for virgin material from a standard tensile test. also, most commercially available finite element programs allow for a non-linear analysis and hence for the inclusion of the actual stress-strain data obtained from tensile testing and from existing theories. the asce design specification adopts the modified form of the ramberg-osgood formula given by equation (1).
it is a three-parameter equation for expressing the relationship between the stress and strain for stresses up to a value slightly greater than the yield strength of the material:

$$\varepsilon = \frac{\sigma}{e} + k\left(\frac{\sigma}{e}\right)^n, \qquad (1)$$

where ε is the unit strain, σ is the unit stress (n/mm²), e is the modulus of elasticity (n/mm²), and k and n are constants for a given curve, which are evaluated through two secant yield strength values for slopes of 0.7e and 0.85e. equation (1) was modified by hill [6] so that, instead of using secant yield strengths, k and n can be evaluated in terms of two yield strength values: (i) σ1 at an offset ε1; (ii) σ2 at an offset ε2. using the most common offset of 0.002 for the yield stress (σ2) and assuming that the modulus of elasticity e is equal to the initial value e0, equation (1) becomes:

$$\varepsilon = \frac{\sigma}{e_0} + 0.002\left(\frac{\sigma}{\sigma_y}\right)^n. \qquad (2)$$

the asce design code makes use of this modified equation (2), and the three points on the stress-strain curve are defined as: (i) the origin; (ii) the point of 0.2 % proof stress; (iii) another offset strength (e.g. 0.01 %). if these points are substituted into equation (2), then n can be evaluated. the term n is referred to in the asce design code as the plasticity factor. the accuracy of the above method is largely based on how well the analytical equation fits the stress-strain relationship of the material. the code lists, for particular grades of stainless steel, tables of yield stress, tangent modulus and plasticity factors. the results obtained for the stress-strain relationship from both virgin material and full section tensile tests will be used for comparison with the results obtained from the above asce ramberg-osgood approach and from a trial and error 'best fit' method using the experimental stress-strain curves. these will then be incorporated into a non-linear finite element analysis of eccentrically loaded stainless steel columns.
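as a concrete illustration of eq. (2), the short sketch below evaluates the strain at a few stress levels; the 480 n/mm² proof stress and the n = 6.22 'best fit' coupon value anticipate tables 2 and 3, while the 200 kn/mm² initial modulus is an assumed, typical value rather than a measured one.

```python
def ramberg_osgood_strain(sigma, e0, sigma_y, n):
    """total strain from eq. (2): elastic part plus 0.2 % offset plastic part."""
    return sigma / e0 + 0.002 * (sigma / sigma_y) ** n

# illustrative type 304 values: e0 = 200e3 n/mm^2, 0.2 % proof stress = 480 n/mm^2
for s in (120, 240, 360, 480):
    print(s, ramberg_osgood_strain(s, 200e3, 480.0, 6.22))
```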
3 load capacity of stainless steel lipped channel columns subjected to combined bending and axial compression loading

rhodes et al. [9, 10] investigated both concentric and eccentric loading of cold formed stainless steel lipped channel section columns. the findings showed that the relevant design codes [7, 8] provided very accurate predictions of load capacity for the concentric loading case, using both virgin and full section material properties, when compared to experimental results. a finite element analysis also produced a very accurate correlation to both the experimental results and the design code predictions. however, for shorter length eccentrically loaded columns, the design codes were very conservative in their prediction of load capacity using both virgin and full section material properties. it was concluded that the design codes' interaction formulae were inadequate in predicting the load capacity of short-to-medium length columns. the eurocode [8] interaction formula to determine the axial strength n_sd is given by equation (3):

$$\frac{n_{sd}}{f_y a/\gamma_{m1}} + \frac{n_{sd}\,e}{m_n/\gamma_{m1}} = 1, \qquad (3)$$

where all terms are defined in ref. [9]. the asce [7] interaction formula to determine the axial strength p_u is given by equation (4):

$$\frac{p_u}{p_n} + \frac{p_u\,e}{m_n\left(1 - \dfrac{p_u}{p_e}\right)} = 1.0, \qquad (4)$$

where all terms are defined in ref. [9]. both equations produced very conservative estimates of load capacity, and an attempt to improve the interaction formulae was proposed by macdonald [12]. this modification involved replacing the linear moment capacity m_n with the true moment capacity of the lipped channel cross-section, m_exp, obtained from bending tests. hence the asce interaction formula was modified as given by equation (5):

$$\frac{p_u}{p_n} + \frac{p_u\,e}{m_{exp}\left(1 - \dfrac{p_u}{p_e}\right)} = 1.0. \qquad (5)$$

the eurocode interaction formula was modified as given by equation (6):

$$\frac{n_{sd}}{f_y a/\gamma_{m1}} + \frac{n_{sd}\,e}{m_{exp}/\gamma_{m1}} = 1. \qquad (6)$$

in both equations (5) and (6), m_exp is the cross-section true moment capacity, the 0.2 % proof stress is taken from the full section tensile test results, and all other terms are defined in ref. [12].
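equation (4) (and eq. (5), with m_exp in place of m_n) is nonlinear in p_u but monotonic, so the axial capacity can be extracted by bisection. the helper below is our sketch, and every number in the example call is illustrative rather than taken from the test programme.

```python
def solve_pu(pn, pe, m_cap, e_mm, tol=1e-6):
    """axial capacity pu from the asce interaction formula, eq. (4) (or eq. (5)
    when m_cap is the measured moment capacity):
    pu/pn + pu*e/(m_cap*(1 - pu/pe)) = 1."""
    def g(pu):
        return pu / pn + pu * e_mm / (m_cap * (1.0 - pu / pe)) - 1.0
    lo, hi = 0.0, min(pn, pe) * 0.999999   # g is monotonically increasing here
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if g(mid) < 0 else (lo, mid)
    return 0.5 * (lo + hi)

# illustrative values only: pn = 30 kn, pe = 60 kn, mn = 450 kn*mm, e = 8 mm
print(solve_pu(30.0, 60.0, 450.0, 8.0))   # kn
```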
4 finite element analysis

a finite element analysis was then performed using the ansys software package to determine the load capacity of concentrically and eccentrically loaded columns. two types of buckling analyses are available within the ansys package: eigenvalue analysis and non-linear analysis. both types were used; however, the eigenvalue analysis was used only to verify that the finite element model boundary conditions (i.e. column pin ends and eccentric loading) were accurate, as this type of analysis takes no account of material non-linearity. a full non-linear analysis was conducted using shell elements (ansys shell181), which are four-noded elements with six degrees of freedom at each node, i.e. translations in the x, y and z directions and rotations about the x, y and z axes. fig. 2 shows a typical displacement and boundary conditions plot for an eccentrically loaded column, while fig. 3 shows a typical nodal stress plot. for the eccentrically loaded columns, the non-linear material properties of the stainless steel were defined in ansys using the initial elastic modulus, poisson's ratio and stress-strain data obtained from: (i) coupon tensile tests on material cut from section webs; (ii) full-section tensile tests; (iii) the asce (ramberg-osgood) approach; (iv) 'best fit' stress-strain curves. for concentrically loaded columns, the virgin material properties were used. the non-linear solution breaks the load up into a series of load increments that can be applied over several load steps. at the completion of each incremental solution, the program adjusts the stiffness matrix to reflect the non-linear changes in the structural stiffness before proceeding to the next load increment. for a non-linear buckling analysis to be accurate using ansys, it is necessary to set an initial imperfection in the structure being modelled. this was achieved by modelling a very small mid-span deflection, which produced a very large radius of curvature for the lipped channel columns and which would approximate any actual imperfections. a parametric model was constructed by defining the positions of keypoints to allow for easy alterations to the model, particularly for the variation in column length. a column half-model was modelled using appropriate symmetry commands, which helped to reduce the considerable computer processing time.

5 experimental investigations

5.1 tensile tests

fig. 1 shows a schematic diagram of a typical cold formed stainless steel lipped channel member under investigation. the member is commercially available; specimens were cut from supplied lengths, and all specimens were accurately measured at a number of points, with the values averaged to obtain the finished dimensions. all calculations were based on mid-line dimensions. the details are given in table 1. in order to determine the material properties of the sections, tensile tests were set up in which the applied load and gauge specimen elongation were recorded continuously until fracture of the specimen occurred. the measured load and elongation were normalised to give a stress-strain relationship. due to the anisotropy of stainless steel, a full analysis of the material properties would require tensile tests in the longitudinal and transverse directions, as well as compression tests in the same directions; indeed, provision is made in the asce design code to enable use to be made of them in specific applications. however, compression tests were not carried out, as there would be difficulty in establishing the true material properties due to likely buckling effects. also, transverse direction tensile tests could not be carried out because of the limitations in the geometry of the sections. hence tensile testing was limited to the longitudinal direction. all tensile tests were carried out in accordance with bsen10002-1 [13]. standard tensile tests were performed to ascertain the material properties of the stainless steel. coupons were cut from the webs of the columns and tested to obtain the 0.2 % proof stress and the modulus of elasticity. tensile tests were also performed on full sections to include the effects of the cold formed corners, and from these tests the 0.2 % proof stress and the initial modulus of elasticity were determined. for the standard coupons, a total of three specimens were tested and the average results were noted. for the full section tests, two specimens were tested and, again, the average results were noted; they are shown in table 2. the results obtained for the plasticity factors n from the asce design code, i.e. the modified form of the ramberg-osgood equation given by equation (2), are detailed in ref. [11] and shown in table 3. also shown in table 3 are the plasticity factors obtained from a comparative plots/trial and error ('best fit') process using the stress-strain curves obtained from the tensile tests, as reported in ref. [11].
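substituting the 0.2 % and 0.01 % offset points into eq. (2) and eliminating the elastic part gives n = ln(20)/ln(σ0.2/σ0.01), which is how the table 3 'asce' values would be obtained. a one-line sketch follows, with an assumed 0.01 % proof stress, since the measured value is not reproduced here.

```python
import math

def plasticity_factor(sigma_02, sigma_001):
    """asce plasticity factor n from the 0.2 % and 0.01 % proof stresses,
    obtained by substituting both offset points into eq. (2)."""
    return math.log(20.0) / math.log(sigma_02 / sigma_001)

# e.g. an assumed 0.01 % proof stress of 220 n/mm^2 with the 480 n/mm^2
# proof stress of table 2 gives ~3.8, the order of the coupon value in table 3:
print(plasticity_factor(480.0, 220.0))
```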
5.2 compression tests

in the experimental investigation, a series of compression tests to failure was made on stainless steel columns of the lipped channel cross-section described above. the specimen parameters investigated were as follows: column lengths varied from 222 mm to 1222 mm in increments of 100 mm (slenderness ratios varying from 42 to 234); twenty-two tests to failure (2 sets of columns, average results noted) were carried out with the loading applied concentric to the centroid of the cross-section, to give pure axial compression; twenty-two tests to failure (2 sets of columns, average results noted) were carried out with the loading applied 8 mm eccentric to the centroid of the cross-section, to give combined bending and axial compression loading. each column tested was cut to the specified length and then milled flat at each end to avoid any possible gripping problems. the end grips were designed such that they would hold the ends of the column and allow the loading to be applied at the required eccentricity through knife edges. the specimens were tested using a tinius olsen electro-mechanical testing machine, with the column vertical displacement and mid-span horizontal deflection measured during the tests using displacement transducers. fig. 1 shows a schematic diagram of the column test configuration.

fig. 1: schematic diagram of the eccentrically-loaded column test (load p applied at e = 8 mm over column length l)

5.3 observations

fig. 4 shows the results obtained for the load capacity of concentrically loaded stainless steel lipped channel section columns from tests, from the design codes (using virgin material properties and full section properties) and from the finite element analysis (using virgin material properties). the design code and finite element predictions show excellent correlation with the test results for all but the shortest of the columns.

table 1: average dimensions of lipped channel cross-section

web b1 (mm) | flange b2 (mm) | lip b3 (mm) | thickness t (mm) | radius r1 (mm) | radius r2 (mm)
28.00 | 14.88 | 7.45 | 2.43 | 1.10 | 1.10

table 2: tensile test results: virgin material and full section (fs) mechanical properties

thickness t (mm) | av. virgin 0.2 % p.s. (n/mm²) | av. virgin uts (n/mm²) | av. fs 0.2 % p.s. (n/mm²) | av. fs uts (n/mm²)
2.43 | 480 | 553 | 520 | 689

table 3: plasticity factors

tensile test | n (asce) | n (best fit)
coupon | 3.80 | 6.22
full section | 5.02 | 6.65

fig. 2: finite element plot of displacement and boundary conditions for eccentric column loading (column length = 722 mm; cross-section centroid marked)

fig. 3: finite element plot of nodal stress and boundary conditions for eccentric column loading (column length = 222 mm)

fig. 5 shows the results obtained for the load capacity of eccentrically loaded stainless steel lipped channel section columns from tests, from the design codes (using virgin material properties and full section properties) and from the modifications to the design codes described by equations (5) and (6). all design code predictions show conservatism in the prediction of load capacity for the shorter range of columns, with improvements gained when full section properties are used and further improvements gained by using the modified forms.
the best correlation obtained was for columns where the modified design codes predicted accurate load capacities for all column lengths. fig. 6 shows the results obtained for the load capacity of eccentrically loaded stainless steel lipped channel section columns from tests and from finite element analysis using the various stress-strain curves described earlier. the predictions of the finite element analysis show an excellent correlation to the test results and a real improvement on the predictions of the design codes. however, any differences between finite element predictions and test results are very slight and occur mainly for very short length columns.

fig. 4: graph of load capacity v. column length: concentric loading (test/design codes/finite element analysis)

fig. 5: graph of load capacity v. column length: eccentric loading (test/design codes)

fig. 6: graph of load capacity v. column length: eccentric loading (test/finite element analysis; curves for fea 'best fit' (n = 6.22), fea asce (n = 3.80), fea virgin, fea fs and experiment)

6 conclusions

the load capacity of short-to-medium length cold formed stainless steel lipped channels under pure axial compressive (concentric) loading was shown to be accurately predicted by eurocode 1.4 and by the asce design code. a non-linear finite element analysis using virgin material stress-strain properties also provided accurate load predictions. however, for eccentrically loaded columns with shorter lengths, the design codes predicted very conservative load capacities. improvements were gained by employing enhanced stress-strain properties and by incorporating the full section moment capacity within the code interaction formulae. finally, it has been shown in this paper that finite element analysis can be used with a high level of confidence in predicting the load capacity of eccentrically loaded cold formed stainless steel, short-to-medium length columns of lipped channel section. this has been shown to be true for various stress-strain curves, including that obtained from virgin material tensile tests.

references

[1] johnson, a. l.: the structural performance of austenitic stainless steel members. department of structural engineering, report no. 327. cornell university, new york, 1966.
[2] wang, s. t.: cold-rolled austenitic steel: material properties and structural performance. department of structural engineering, report no. 334. cornell university, new york, 1969.
[3] wang, s. t., errera, s. j., winter, g.: "behaviour of cold-rolled stainless steel members." journal of the structural division, proc. asce, vol. 101 (1975), no. st11, p. 2337–2357.
[4] van den berg, g. j., van der merwe, p.: "prediction of corner mechanical properties for stainless steels due to cold forming." 11th international specialty conference on cold formed steel structures, st. louis, missouri (usa), october 1992.
[5] ramberg, w., osgood, w. r.: "description of stress-strain curves by three parameters." national advisory committee for aeronautics (naca), technical note no. 902, february 1943.
[6] hill, h. n.: "determination of stress-strain relations from 'offset' yield strength values." national advisory committee for aeronautics (naca), technical note no. 927, february 1944.
[7] ansi/asce-8-90: specification for the design of cold-formed stainless steel structural members. 1991.
[8] env 1993-1-3, eurocode 3: design of steel structures, part 1.4: general rules, supplementary rules for stainless steel. july 1996.
[9] rhodes, j., macdonald, m., mcniff, w.: "buckling of cold formed stainless steel columns under concentric and eccentric loading." proc. 15th int. specialty conference on cold formed steel structures, st louis, missouri (usa), october 2000.
[10] rhodes, j., macdonald, m., kotelko, m., mcniff, w.: "buckling of cold formed stainless steel columns under concentric and eccentric loading." proc. 3rd int. conference on thin walled structures, krakow (poland), june 2001.
[11] macdonald, m., rhodes, j., taylor, g. t.: "mechanical properties of stainless steel lipped channels." proc. 15th int. specialty conference on cold formed steel structures, st louis, missouri (usa), october 2000.
[12] macdonald, m.: "the effects of cold forming on material properties and post-yield behaviour of structural sections." phd thesis, glasgow caledonian university, glasgow (scotland), january 2002.
[13] british standards institution: tensile testing of metallic materials. bsen10002-1.

martin macdonald
glasgow caledonian university, school of engineering, science and design
cowcaddens road, glasgow, g4 0ba, scotland, uk

jim rhodes
e-mail: jrhodes@mecheng.strath.ac.uk
university of strathclyde, department of mechanical engineering
75 montrose street, glasgow, g1 1xj, scotland, uk

acta polytechnica 62(1):157–164, 2022
https://doi.org/10.14311/ap.2022.62.0157
© 2022 the author(s). licensed under a cc-by 4.0 licence. published by the czech technical university in prague.

swanson hamiltonian revisited through the complex scaling method

marta reboiro (a, b, *), romina ramírez (c, d), viviano fernández (c)
a university of la plata, faculty of exact science, department of physics, 49 & 115, 1900 la plata, argentine
b conicet, institute of physics of la plata, 63 & diag. 113, 1900 la plata, argentine
c university of la plata, faculty of exact science, department of mathematics, 50 & 115, 1900 la plata, argentine
d conicet, institute argentine of mathematics, saavedra 15 3º, c1083aca buenos aires, argentine
* corresponding author: reboiro@fisica.unlp.edu.ar

abstract. in this work, we study the non-hermitian pt-symmetry swanson hamiltonian in the framework of the complex scaling method. we show that by applying this method we can work with eigenfunctions that are square-integrable both in the pt and in the non-pt symmetry phase.

keywords: pt-symmetric hamiltonians, swanson model, complex scaling method.

1. introduction

the swanson model has been introduced in [1] as an example of a pt-symmetry hamiltonian [2–8].
since then it has been extensively studied, allowing for several interesting extensions [9–25]. among recent works, let us mention the extension of the swanson model to complex parameters [23, 25], which introduces bicoherent-state path integration as a method to quantize non-hermitian systems. though the swanson model is described by quadratic operators, the underlying physics is nevertheless very rich. depending on the region in the model parameter space, the swanson model is similar to the hamiltonian of a parabolic barrier or the hamiltonian of a harmonic oscillator [26]. from the mathematical point of view, it is an example of a hamiltonian with eigenfunctions that do not belong to l²(ℝ) in some regions of the space of parameters. among the methods that are employed to describe the physics of resonances with complex energy, the complex scaling method (csm) [27–32] is one of the most powerful. it has been extensively used in the description of many-body resonant states and non-resonant continuum states observed in unstable nuclei [32]. in this work, we propose the use of the csm to describe the dynamics of the swanson model, particularly in the region of non-pt-symmetry.

the work is organized as follows. in section 2 we describe the application of the csm to the swanson hamiltonian. we establish a similarity transformation between the transformed hamiltonian and its adjoint operator. we discuss, according to the space of parameters of the model, the possibility of having square-integrable eigenfunctions. we present the mean values of some observables. in section 3, we analyse, through an example, the survival probability as a function of time for an initial coherent state. conclusions are drawn in section 4.

2. formalism

the hamiltonian of swanson [1] is given by

$$h = \hbar\omega\left(a^\dagger a + \frac{1}{2}\right) + \hbar\alpha\,a^2 + \hbar\beta\,a^{\dagger 2}, \qquad (1)$$

with ω, α, β ∈ ℝ. the hamiltonian of eq. (1) can be written in terms of the coordinate operator, x̂, and the momentum operator, p̂, by implementing the following representation:

$$a = \frac{1}{\sqrt{2}}\left(\frac{\hat{x}}{b_0} + i\,\frac{b_0}{\hbar}\,\hat{p}\right), \qquad a^\dagger = \frac{1}{\sqrt{2}}\left(\frac{\hat{x}}{b_0} - i\,\frac{b_0}{\hbar}\,\hat{p}\right), \qquad (2)$$

with b0 the characteristic length of the non-interacting system. the hamiltonian of eq. (1) then reads

$$h(\omega,\alpha,\beta) = \frac{1}{2}\hbar(\omega+\alpha+\beta)\left(\frac{\hat{x}}{b_0}\right)^2 + \frac{1}{2}\hbar(\omega-\alpha-\beta)\left(\frac{b_0\,\hat{p}}{\hbar}\right)^2 + \hbar\,\frac{\alpha-\beta}{2}\left(\frac{2i}{\hbar}\,\hat{x}\hat{p} + 1\right). \qquad (3)$$

the adjoint hamiltonian of h(ω,α,β) is h_c = h(ω,β,α). as we showed in [26], some of the eigenfunctions of eq. (3) do not belong to the usual hilbert space, l²(ℝ), so that we have to work in a rigged hilbert space [33, 34]. an alternative approach to solve the eigenvalue problem of the hamiltonian of eq. (1) is the use of the csm [27–32]. the aim of the csm is to make a similarity transformation from the original hamiltonian to a hamiltonian whose eigenfunctions belong to l²(ℝ). in the framework of the csm, we shall introduce the transformation operator $\hat{v}(\theta) = e^{-\frac{\theta}{2\hbar}(\hat{x}\hat{p}+\hat{p}\hat{x})}$, with a real scaling parameter θ:

$$\hat{v}(\theta)\,\hat{x}\,\hat{v}^{-1}(\theta) = e^{i\theta}\,\hat{x}, \qquad \hat{v}(\theta)\,\hat{p}\,\hat{v}^{-1}(\theta) = e^{-i\theta}\,\hat{p}. \qquad (4)$$

the hamiltonian of eq. (3) is transformed as h(θ) = v̂(θ) h v̂⁻¹(θ):

$$h(\theta) = h(\theta,\omega,\alpha,\beta) = \frac{1}{2}\hbar(\omega+\alpha+\beta)\left(\frac{e^{i\theta}\hat{x}}{b_0}\right)^2 + \frac{1}{2}\hbar(\omega-\alpha-\beta)\left(\frac{b_0\,e^{-i\theta}\hat{p}}{\hbar}\right)^2 + \hbar\,\frac{\alpha-\beta}{2}\left(\frac{2i}{\hbar}\,\hat{x}\hat{p} + 1\right). \qquad (5)$$

it is straightforward to observe that

$$h^\dagger(\theta) = h(-\theta,\omega,\beta,\alpha). \qquad (6)$$

notice that h(θ) is not invariant under the usual pt-symmetry given by x̂ → −x̂, p̂ → p̂ and i → −i.
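before introducing the scaling transformations, the spectrum of eq. (1) is easy to check numerically. the sketch below, ours and with arbitrary sample parameters, diagonalizes a truncated matrix of h/ħ in the harmonic-oscillator basis and compares it with the closed form ħΩ(n + 1/2), Ω = √(ω² − 4αβ), quoted in eq. (9) below.

```python
import numpy as np

def swanson_matrix(omega, alpha, beta, nmax=60):
    """matrix of h/hbar = omega*(a†a + 1/2) + alpha*a^2 + beta*a†^2 (eq. (1))
    in the truncated harmonic-oscillator basis |0>, ..., |nmax-1>."""
    n = np.arange(nmax)
    h = np.diag(omega * (n + 0.5)).astype(complex)
    for k in range(nmax - 2):
        amp = np.sqrt((k + 1) * (k + 2))
        h[k, k + 2] = alpha * amp       # <k| a^2 |k+2>
        h[k + 2, k] = beta * amp        # <k+2| a†^2 |k>
    return h

omega, alpha, beta = 1.0, 0.30, 0.15    # sample values, region of real spectrum
ev = np.sort_complex(np.linalg.eigvals(swanson_matrix(omega, alpha, beta)))
big_omega = np.sqrt(omega**2 - 4 * alpha * beta)
print(ev[:4].real)                       # lowest levels from the truncated matrix
print(big_omega * (np.arange(4) + 0.5))  # Omega*(n + 1/2) for comparison
```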
we shall introduce the following similarity transformation, induced by the operator $\upsilon(\theta) = e^{-\frac{\alpha-\beta}{\omega-\alpha-\beta}\,e^{2i\theta}\frac{x^2}{2b_0^2}}$. it reads

$$\upsilon(\theta)\,h(\theta)\,\upsilon(\theta)^{-1} = \mathcal{h}(\theta), \qquad (7)$$

where $\mathcal{h}(\theta)$ is given by

$$\mathcal{h}(\theta) = \frac{1}{2m}\left(e^{-i\theta}\hat{p}\right)^2 + \frac{1}{2}\,k\left(e^{i\theta}\hat{x}\right)^2. \qquad (8)$$

we have defined [26]

$$k = m\,\Omega^2, \qquad m = m(\omega,\alpha,\beta,b_0) = \frac{\hbar}{(\omega-\alpha-\beta)\,b_0^2}, \qquad \Omega = \Omega(\omega,\alpha,\beta) = \sqrt{\omega^2 - 4\alpha\beta} = |\Omega|\,e^{i\phi}. \qquad (9)$$

though $\mathcal{h}(\theta)$ is a non-hermitian operator, $\mathcal{h}^\dagger(\theta) = \mathcal{h}(-\theta) = v(-2\theta)\,\mathcal{h}(\theta)\,v(-2\theta)^{-1}$. consequently,

$$\upsilon(-\theta)^{-1}\,h^\dagger(\theta)\,\upsilon(-\theta) = \mathcal{h}(\theta)^*, \qquad \left(\upsilon(-\theta)\,v(-2\theta)\right)^{-1}\,h^\dagger(\theta)\,\left(\upsilon(-\theta)\,v(-2\theta)\right) = h(\theta). \qquad (10)$$

from eqs. (7) and (10), it results that h†(θ) s = s h(θ), with s = υ(−θ) v(−2θ) υ(θ) [35–37]. the eigenfunctions and eigenvalues of $\mathcal{h}(\theta)$, φ(θ) and E(θ), are related to those of h and h† as follows. given $\mathcal{h}(\theta)\,\varphi(\theta,x) = E(\theta)\,\varphi(\theta,x)$:

$$h(\theta)\,\tilde{\varphi}(\theta,x) = \tilde{e}(\theta)\,\tilde{\varphi}(\theta,x), \qquad h^\dagger(\theta)\,\psi(\theta,x) = e(\theta)\,\psi(\theta,x), \qquad (11)$$

with

$$\tilde{\varphi}(\theta,x) = \upsilon(\theta)^{-1}\varphi(\theta,x), \quad \tilde{e}(\theta) = E(\theta), \quad \psi(\theta,x) = \upsilon(-\theta)\,(\varphi(\theta,x))^*, \quad e(\theta) = E(\theta)^*. \qquad (12)$$

thus, the eigenfunctions of h(θ) with eigenvalue $\tilde{e}_\nu(\theta) = E_\nu(\theta)$ are given by

$$\tilde{\varphi}_\nu(\theta,x) = e^{\frac{\alpha-\beta}{\omega-\alpha-\beta}\,e^{2i\theta}\frac{x^2}{2b_0^2}}\,n_\nu\,\varphi_\nu(\theta,x), \qquad (13)$$

with n_ν a normalization constant. it can be shown that the eigenfunctions of h†(θ) are

$$\psi_\nu(\theta,x) = e^{-\frac{\alpha-\beta}{\omega-\alpha-\beta}\,e^{-2i\theta}\frac{x^2}{2b_0^2}}\,(n_\nu\,\varphi_\nu(\theta,x))^*, \qquad (14)$$

and the corresponding eigenvalue is given by $e_\nu(\theta) = \tilde{e}_\nu(\theta)^*$. a similar structure for eqs. (10)-(14) can be found in [38, 39]. moreover, the relation between the eigenvalues, eq. (12), is a typical feature of operators which are self-adjoint in krein spaces [39–41]. it should be mentioned that the hamiltonian of eq. (5), for α = β = 0 and ω = 1/cos(2θ), reduces to the one introduced in [23–25]. particularly, in [25] the dynamics under the action of this hamiltonian is described for values of θ ∈ (−π/4, π/4). for further results, the reader is kindly referred to [23–25]. in what follows, we aim to determine the range of values of θ for which φ(θ,x) belongs to the hilbert space l²(ℝ).

2.1. eigenfunctions and eigenvectors

for ω − (α + β) ≠ 0, eq. (8) can also be written as

$$-\frac{d^2\phi(y)}{dy^2} + \left(\frac{1}{4}\,y^2 - \epsilon\right)\phi(y) = 0, \qquad (15)$$

with

$$\epsilon = \frac{E}{\hbar\Omega} = \frac{E}{\hbar|\Omega|}\,e^{-i\phi} \qquad (16)$$

and

$$y = \sqrt{2}\,|\sigma|\,e^{i(\theta+\gamma)}\,\frac{x}{b_0}, \qquad (17)$$

where we have defined

$$\sigma = \left(\frac{m\,\Omega}{\hbar}\right)^{1/2} b_0 = e^{i\gamma}\,|\sigma|. \qquad (18)$$

eq. (15) is the schrödinger equation corresponding to the effective potential

$$u(\theta,x) = \frac{U(\theta,x)}{\hbar\Omega} = e^{2i(\theta+\gamma)}\,\frac{1}{2}\,|\sigma|^2\,\frac{x^2}{b_0^2}. \qquad (19)$$

figure 1: effective potential of eq. (19), u(θ,x)/|u(θ,x)|, for a fixed x, in the regions determined by the signs of m(ω,α,β,b0) and Ω²(ω,α,β): panel (a) region i, (sg(m), sg(Ω²)) = (+,+); panel (b) region ii, (+,−); panel (c) region iii, (−,+); panel (d) region iv, (−,−). the real part, re(u/|u|), is displayed in solid lines; the imaginary part, im(u/|u|), in dashed lines.

solutions corresponding to eq. (15) represent different physical systems according to the signs of m(ω,α,β,b0) and Ω²(ω,α,β) [26]. in what follows, we shall refer to region i when (sg(m), sg(Ω²)) = (+,+), region ii for the case (sg(m), sg(Ω²)) = (+,−), region iii for (sg(m), sg(Ω²)) = (−,+), and region iv for (sg(m), sg(Ω²)) = (−,−), respectively.
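the bookkeeping of eq. (9) and of the four regions can be wrapped in a small helper; the function name and the sample parameters are ours, with ħ = b0 = 1.

```python
import cmath

def classify(omega, alpha, beta, b0=1.0, hbar=1.0):
    """return m, Omega and the region label i-iv from the signs of m and Omega^2 (eq. (9))."""
    m = hbar / ((omega - alpha - beta) * b0**2)
    omega2 = omega**2 - 4 * alpha * beta
    region = {(True, True): "i", (True, False): "ii",
              (False, True): "iii", (False, False): "iv"}[(m > 0, omega2 > 0)]
    return m, cmath.sqrt(omega2), region

print(classify(1.0, 0.3, 0.15))   # region i: m > 0 and real Omega
print(classify(1.0, 0.8, 0.9))    # region iv: m < 0 and omega^2 - 4*alpha*beta < 0
```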
2.1.1. discrete spectrum

for the discrete sector of the spectrum, the eigenvalues and eigenfunctions are given by

$$E_n = \hbar\,\Omega\,[n] = \hbar\,|\Omega|\,\mathrm{e}^{\mathrm{i}\phi}\,[n], \qquad \tilde{\phi}_m(\theta,x) = \mathrm{e}^{\frac{\alpha-\beta}{\omega-\alpha-\beta}\,\mathrm{e}^{2\mathrm{i}\theta}\frac{x^2}{2b_0^2}}\,\phi_m(\theta,x), \qquad (20)$$

$$\psi_m(\theta,x) = \mathrm{e}^{-\frac{\alpha-\beta}{\omega-\alpha-\beta}\,\mathrm{e}^{-2\mathrm{i}\theta}\frac{x^2}{2b_0^2}}\left(\phi_m(\theta,x)\right)^{*}, \qquad (21)$$

where $\phi_n(\theta,x)$ can be written as

$$\phi_n(\theta,x) = N_n\,\mathrm{e}^{-\mathrm{e}^{2\mathrm{i}(\theta+\gamma)}\frac{x^2}{2b_0^2}|\sigma|^2}\,H_n\!\left(\mathrm{e}^{\mathrm{i}(\theta+\gamma)}\,\frac{x}{b_0}\,|\sigma|\right), \qquad N_n^2 = \frac{\mathrm{e}^{\mathrm{i}(\theta+\gamma)}}{\sqrt{\pi}\,n!\,2^n}\,\frac{|\sigma|}{b_0}, \qquad (22)$$

$H_n(z)$ being the hermite polynomial of order $n$, and $[n] = n + 1/2$.

figure 2. real part of the effective potential of eq. (19), $u(\theta,x)/|u(\theta,x)|$. the shadowed sectors correspond to the values of $\theta$ for which the solutions of eq. (15) are square-integrable. in panels (a), (b), (c) and (d) we present the results for regions i, ii, iii and iv, respectively.

the eigenfunctions $\phi_n(\theta,x)$ are square-integrable for the $\theta$-intervals where $\mathrm{re}(u(\theta,x))$ takes positive values. in figure 2, we plot $\mathrm{re}(u(\theta,x)/|u(\theta,x)|)$ for every region; the gray regions correspond to the intervals for which the eigenfunctions are square-integrable. in table 1, we summarize the signs of the parameters $m$ and $\Omega^2$, which characterize the different regions of the model, and for each region we present the values of the phases $\gamma$ and $\phi$ and the interval where the eigenfunctions are square-integrable. in regions (i) and (iii), we can define two well-defined $\theta$-domains: $I_1 = [-\pi, -3\pi/4) \cup (-\pi/4, \pi/4) \cup (3\pi/4, \pi]$ and $I_2 = (-3\pi/4, -\pi/4) \cup (\pi/4, 3\pi/4)$; while, in regions (ii) and (iv), the $\theta$-domains are $I_3 = (-\pi, -\pi/2) \cup (0, \pi/2)$ and $I_4 = (-\pi/2, 0) \cup (\pi/2, \pi)$. the intervals repeat themselves periodically, with period $\pi$.

in the domains summarized in table 1, the eigenfunctions $\{\psi_\nu(\theta,x), \tilde{\phi}_\nu(\theta,x)\}$ form a biorthogonal complete set:

$$\int_{-\infty}^{\infty}\left(\psi_m(\theta,x)\right)^{*}\tilde{\phi}_n(\theta,x)\,\mathrm{d}x = \int_{-\infty}^{\infty}\phi_m(\theta,x)\,\phi_n(\theta,x)\,\mathrm{d}x = \delta_{mn}. \qquad (23)$$

it should be noticed that in all regions the $\theta$-domains of positive spectrum are different from the domains of negative spectrum. they represent different physical boundary conditions.

region | $\mathrm{sg}(m)$ | $\mathrm{sg}(\Omega^2)$ | $\gamma$ | $\phi$ | interval | $\theta_c$
i | + | + | $0$ | $0$ | $I_1$ | $\pm\pi/4$
i | + | + | $\pi/2$ | $\pi$ | $I_2$ | $\pm\pi/4$
iii | $-$ | + | $\pi/2$ | $0$ | $I_2$ | $\pm\pi/4$
iii | $-$ | + | $0$ | $\pi$ | $I_1$ | $\pm\pi/4$
ii | + | $-$ | $\pi/4$ | $\pi/2$ | $I_4$ | $\pm\pi/2$
ii | + | $-$ | $-\pi/4$ | $-\pi/2$ | $I_3$ | $\pm\pi/2$
iv | $-$ | $-$ | $-\pi/4$ | $\pi/2$ | $I_3$ | $\pm\pi/2$
iv | $-$ | $-$ | $\pi/4$ | $-\pi/2$ | $I_4$ | $\pm\pi/2$

table 1. values of the characteristic parameters for the different model-space regions. in columns 2 and 3 we give the signs of $m$ and $\Omega^2$, respectively. the phases $\gamma$ and $\phi$ for the different regions are given in columns 4 and 5, respectively. in column 6 we present the $\theta$-interval for which the different eigenfunctions are square-integrable. in the last column, we give the values of $\theta_c$ for which the eigenfunctions of the continuous spectrum are square-integrable. in the table, $I_1 = [-\pi, -3\pi/4) \cup (-\pi/4, \pi/4) \cup (3\pi/4, \pi]$, $I_2 = (-3\pi/4, -\pi/4) \cup (\pi/4, 3\pi/4)$, $I_3 = (-\pi, -\pi/2) \cup (0, \pi/2)$ and $I_4 = (-\pi/2, 0) \cup (\pi/2, \pi)$. the intervals repeat themselves periodically, with period $\pi$.
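the square-integrability domains of table 1 can be reproduced with a few lines (an illustrative check, not from the paper): the gaussian factor of eq. (22) decays at large $|x|$ iff $\mathrm{re}\,\mathrm{e}^{2\mathrm{i}(\theta+\gamma)} = \cos 2(\theta+\gamma) > 0$:

```python
import numpy as np

# Scan theta for the four values of gamma quoted in table 1
# (gamma = 0, pi/2 for regions I/III; gamma = +-pi/4 for II/IV) and
# report where Re e^{2i(theta+gamma)} > 0, i.e. where phi_n is normalizable.
for gamma, label in [(0.0, "gamma=0"), (np.pi / 2, "gamma=pi/2"),
                     (np.pi / 4, "gamma=pi/4"), (-np.pi / 4, "gamma=-pi/4")]:
    theta = np.linspace(-np.pi, np.pi, 2001)
    ok = np.cos(2.0 * (theta + gamma)) > 0.0
    edges = theta[np.where(np.diff(ok.astype(int)) != 0)]
    print(label, "theta=0 allowed:", bool(ok[1000]),
          "interval edges (units of pi):", np.round(edges / np.pi, 2))
```

running it reproduces the boundaries $\pm\pi/4$, $\pm 3\pi/4$ (regions i and iii) and $0$, $\pm\pi/2$ (regions ii and iv), i.e. the intervals $I_1$ to $I_4$.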
2.1.2. continuous spectrum

the eigenfunctions associated with the continuous spectrum [26, 42–45] are given, in terms of the eigenfunctions $\phi^E_\pm(\theta,x)$ of $\mathcal{H}(\theta)$ of eq. (8), by

$$\tilde{\phi}^E_\pm(\theta,x) = \mathrm{e}^{\frac{\alpha-\beta}{\omega-\alpha-\beta}\,\mathrm{e}^{2\mathrm{i}\theta}\frac{x^2}{2b_0^2}}\,\phi^E_\pm(\theta,x), \qquad (24)$$

$$\psi^E_\pm(\theta,x) = \mathrm{e}^{-\frac{\alpha-\beta}{\omega-\alpha-\beta}\,\mathrm{e}^{-2\mathrm{i}\theta}\frac{x^2}{2b_0^2}}\left(\phi^E_\pm(\theta,x)\right)^{*}, \qquad (25)$$

with

$$\phi^E_\pm(\theta,x) = c\,\Gamma(\nu+1)\,D_{-\nu-1}\!\left(\mp\sqrt{-2}\,\mathrm{e}^{\mathrm{i}(\theta+\gamma)}\,|\sigma|\,\frac{x}{b_0}\right), \qquad (26)$$

$D_{-\nu-1}(y)$ being the parabolic cylinder functions and $\nu = \epsilon - \frac{1}{2}$. the normalization constant takes the value $c = \frac{\mathrm{e}^{\mathrm{i}\pi/8}\,\mathrm{i}^{\nu/2}\left(\frac{|\sigma|}{b_0}\,\mathrm{e}^{\mathrm{i}(\theta+\gamma)}\right)^{1/2}}{\pi\,2^{3/4}}$. the biorthogonality and completeness relations can be written as

$$\int_{-\infty}^{\infty}\left(\psi^E_\pm(\theta,x)\right)^{*}\tilde{\phi}^{E'}_\pm(\theta,x)\,\mathrm{d}x = \delta(E-E'), \qquad \sum_{s=\pm}\int_{-\infty}^{\infty}\left(\psi^E_s(\theta,x)\right)^{*}\tilde{\phi}^E_s(\theta,x')\,\mathrm{d}E = \delta(x-x'). \qquad (27)$$

the possible values that the parameter $\theta$ can take to fulfil the requirements of biorthogonality and completeness of eq. (27), $\theta_c$, are presented in the last column of table 1. in the framework of the csm, the continuous spectrum lies along the line $2\theta$. in regions ii and iv, $2\theta_c = \pm\pi$, so that $E \in (-\infty, +\infty)$; meanwhile, in regions i and iii, $2\theta_c = \pm\frac{\pi}{2}$, so that $E$ takes imaginary values. consequently, the parameter $\nu$ associated with the order of the eigenfunctions of eq. (26) takes the value $\nu = -\mathrm{i}|\epsilon| - \frac{1}{2}$. if we look at the effective potential $u(\theta,x)$, the values of $\theta_c$ correspond to the values of $\theta$ for which $\mathrm{re}(u(\theta,x)) = 0$.

2.1.3. particular cases

case (a): $\omega = 0$. when $\omega = 0$ and $\omega - (\alpha+\beta) \neq 0$, the problem reduces to that of a free particle of energy $E = \varepsilon\,\mathrm{e}^{-2\mathrm{i}\theta}$. eq. (8) reduces to

$$-\frac{\hbar^2}{2m}\,\mathrm{e}^{-2\mathrm{i}\theta}\,\frac{\mathrm{d}^2 f(x)}{\mathrm{d}x^2} = E\,f(x), \qquad (28)$$

and the wave function can be written as $f(x) = a\,\mathrm{e}^{\mathrm{i}kx} + b\,\mathrm{e}^{-\mathrm{i}kx}$, with $k = \sqrt{\frac{2\varepsilon}{\hbar(\omega-\alpha-\beta)\,b_0^2}}$.

case (b): $\omega - (\alpha+\beta) = 0$, $\alpha \neq \beta$. to study this case we have to look at eq. (5). if $\omega - (\alpha+\beta) = 0$, it reads

$$h(\theta) = \hbar(\alpha+\beta)\left(\frac{\mathrm{e}^{\mathrm{i}\theta}\,\hat{x}}{b_0}\right)^2 + \frac{\hbar(\alpha-\beta)}{2}\left(\frac{2\,\mathrm{i}}{\hbar}\,\hat{x}\hat{p} + 1\right), \qquad (29)$$

$$f(x) = \mathrm{e}^{-\mathrm{e}^{2\mathrm{i}\theta}\frac{x^2}{4b_0^2}\frac{\alpha+\beta}{\alpha-\beta}}\; x^{-\frac{1}{2} + \frac{\varepsilon\,\mathrm{e}^{-2\mathrm{i}\theta}}{\hbar(\alpha-\beta)}}. \qquad (30)$$

in table 2 we present the values of $E$ for which the wave function $f(x)$ is square-integrable.

$\mathrm{sg}\!\left(\frac{\alpha+\beta}{\alpha-\beta}\right)$ | $\mathrm{sg}\!\left((\alpha-\beta)\cos(2\theta)\right)$ | $\theta$ | condition on $\varepsilon$
+ | + | $I_1$ | $\frac{\varepsilon|\cos(2\theta)|}{\hbar|\alpha-\beta|} < \frac{1}{2}$
+ | $-$ | $I_2$ | $\frac{\varepsilon|\cos(2\theta)|}{\hbar|\alpha-\beta|} < \frac{1}{2}$
$-$ | + | $I_1$ | $\frac{\varepsilon|\cos(2\theta)|}{\hbar|\alpha-\beta|} > \frac{1}{2}$
$-$ | $-$ | $I_2$ | $\frac{\varepsilon|\cos(2\theta)|}{\hbar|\alpha-\beta|} > \frac{1}{2}$

table 2. regions for which the wave function of eq. (30) is square-integrable.

2.2. mean values of observables

to compute the mean values, we use the operators $\hat{\mathcal{P}}$ and $\hat{\mathcal{X}}$ defined as [19, 46, 47]

$$\hat{\mathcal{P}} = \Upsilon^{-1}V(\theta+\gamma)\,\hat{p}\,V(\theta+\gamma)^{-1}\Upsilon = \mathrm{e}^{-\mathrm{i}(\theta+\gamma)}\,\hat{p} + \mathrm{i}\hbar\,\mathrm{e}^{\mathrm{i}(\theta+\gamma)}\,\frac{\alpha-\beta}{(\omega-\alpha-\beta)\,b_0^2}\,\hat{x},$$
$$\hat{\mathcal{X}} = \Upsilon^{-1}V(\theta+\gamma)\,\hat{x}\,V(\theta+\gamma)^{-1}\Upsilon = \mathrm{e}^{\mathrm{i}(\theta+\gamma)}\,\hat{x}, \qquad (31)$$

which satisfy

$$[\hat{\mathcal{X}}, \hat{\mathcal{P}}] = \mathrm{i}\hbar. \qquad (32)$$

for the discrete spectrum of $h$, it can be proved that

$$\langle m|\hat{\mathcal{P}}|n\rangle = \int_{-\infty}^{\infty}\left(\psi_m(\theta,x)\right)^{*}\hat{\mathcal{P}}\,\tilde{\phi}_n(\theta,x)\,\mathrm{d}x = \int_{-\infty}^{\infty}\phi_m(\theta,x)\,\mathrm{e}^{-\mathrm{i}(\theta+\gamma)}\,\hat{p}\,\phi_n(\theta,x)\,\mathrm{d}x = \frac{\mathrm{i}\hbar}{\sqrt{2}\,b_{0r}}\left(\sqrt{n+1}\,\delta_{m,n+1} - \sqrt{n}\,\delta_{m,n-1}\right),$$

$$\langle m|\hat{\mathcal{P}}^2|n\rangle = \int_{-\infty}^{\infty}\left(\psi_m(\theta,x)\right)^{*}\hat{\mathcal{P}}^2\,\tilde{\phi}_n(\theta,x)\,\mathrm{d}x = \int_{-\infty}^{\infty}\phi_m(\theta,x)\,\mathrm{e}^{-2\mathrm{i}(\theta+\gamma)}\,\hat{p}^2\,\phi_n(\theta,x)\,\mathrm{d}x = -\frac{\hbar^2}{2\,b_{0r}^2}\left(\sqrt{(n+2)(n+1)}\,\delta_{m,n+2} - (2n+1)\,\delta_{m,n} + \sqrt{n(n-1)}\,\delta_{m,n-2}\right), \qquad (33)$$

and

$$\langle m|\hat{\mathcal{X}}|n\rangle = \int_{-\infty}^{\infty}\left(\psi_m(\theta,x)\right)^{*}\hat{\mathcal{X}}\,\tilde{\phi}_n(\theta,x)\,\mathrm{d}x = \int_{-\infty}^{\infty}\phi_m(\theta,x)\,\mathrm{e}^{\mathrm{i}(\theta+\gamma)}\,\hat{x}\,\phi_n(\theta,x)\,\mathrm{d}x = \frac{b_{0r}}{\sqrt{2}}\left(\sqrt{n+1}\,\delta_{m,n+1} + \sqrt{n}\,\delta_{m,n-1}\right),$$

$$\langle m|\hat{\mathcal{X}}^2|n\rangle = \int_{-\infty}^{\infty}\left(\psi_m(\theta,x)\right)^{*}\hat{\mathcal{X}}^2\,\tilde{\phi}_n(\theta,x)\,\mathrm{d}x = \int_{-\infty}^{\infty}\phi_m(\theta,x)\,\mathrm{e}^{2\mathrm{i}(\theta+\gamma)}\,\hat{x}^2\,\phi_n(\theta,x)\,\mathrm{d}x = \frac{b_{0r}^2}{2}\left(\sqrt{(n+2)(n+1)}\,\delta_{m,n+2} + (2n+1)\,\delta_{m,n} + \sqrt{n(n-1)}\,\delta_{m,n-2}\right), \qquad (34)$$

with $b_{0r} = b_0/|\sigma|$.
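the ladder structure of eqs. (33)-(34) can be verified numerically in a truncated basis (an illustrative check, not from the paper); setting $b_{0r} = 1$, squaring the matrix of $\hat{\mathcal{X}}$ reproduces the closed form of $\langle m|\hat{\mathcal{X}}^2|n\rangle$ away from the truncation edge:

```python
import numpy as np

N, b = 30, 1.0                       # basis size; b0r = b0/|sigma| set to 1
n = np.arange(N)

# <m|X|n> = b/sqrt(2) * (sqrt(n+1) delta_{m,n+1} + sqrt(n) delta_{m,n-1})
X = np.zeros((N, N))
X[n[:-1] + 1, n[:-1]] = b / np.sqrt(2.0) * np.sqrt(n[:-1] + 1.0)
X[n[:-1], n[:-1] + 1] = b / np.sqrt(2.0) * np.sqrt(n[:-1] + 1.0)

# closed form of <m|X^2|n> from eq. (34)
X2 = np.zeros((N, N))
X2[n, n] = b**2 / 2.0 * (2.0 * n + 1.0)
X2[n[:-2] + 2, n[:-2]] = b**2 / 2.0 * np.sqrt((n[:-2] + 2.0) * (n[:-2] + 1.0))
X2[n[:-2], n[:-2] + 2] = X2[n[:-2] + 2, n[:-2]]

# agreement holds away from the truncation edge of the basis
print(np.allclose((X @ X)[:N - 2, :N - 2], X2[:N - 2, :N - 2]))   # True
```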
2.3. time dependent mean values

from the schrödinger equation

$$\mathrm{i}\hbar\,\frac{\partial}{\partial t}\,\tilde{\varphi}_n(\theta,x,t) = h(\theta)\,\tilde{\varphi}_n(\theta,x,t), \qquad (35)$$

it follows that

$$\tilde{\varphi}_n(\theta,x,t) = \mathrm{e}^{-\mathrm{i}\tilde{E}_n t/\hbar}\,\tilde{\phi}_n(\theta,x). \qquad (36)$$

in the same way, from

$$\mathrm{i}\hbar\,\frac{\partial}{\partial t}\,\psi_n(\theta,x,t) = h(\theta)^\dagger\,\psi_n(\theta,x,t), \qquad (37)$$

it follows that

$$\psi_n(\theta,x,t) = \mathrm{e}^{-\mathrm{i}\overline{E}_n t/\hbar}\,\psi_n(\theta,x). \qquad (38)$$

2.3.1. regions i and iii: real spectrum

in regions i and iii, the discrete eigenvalues of $h(\theta)$ take the values $E^\pm_n = \pm\hbar|\Omega|[n]$, with eigenfunctions $\tilde{\phi}^\pm_n(\theta,x)$. in region i, the eigenfunctions of the positive (negative) spectrum are square-integrable in the interval $I_1$ ($I_2$), see table 1; meanwhile, in region iii, the eigenfunctions of the positive (negative) spectrum are square-integrable in the interval $I_2$ ($I_1$). consequently, the time evolution of the states is given by

$$\tilde{\varphi}^\pm_n(\theta,x,t) = \mathrm{e}^{-\mathrm{i}\tilde{E}_n t/\hbar}\,\tilde{\phi}^\pm_n(\theta,x) = \mathrm{e}^{\mp\mathrm{i}(n+\frac{1}{2})|\Omega|t}\,\tilde{\phi}^\pm_n(\theta,x), \qquad \psi^\pm_n(\theta,x,t) = \mathrm{e}^{-\mathrm{i}\overline{E}_n t/\hbar}\,\psi^\pm_n(\theta,x) = \mathrm{e}^{\mp\mathrm{i}(n+\frac{1}{2})|\Omega|t}\,\psi^\pm_n(\theta,x), \qquad (39)$$

and then

$$\langle m|\hat{o}|n\rangle = \mathrm{e}^{\mp\mathrm{i}(n-m)|\Omega|t}\int_{-\infty}^{\infty}\left(\psi^\pm_m(\theta,x)\right)^{*}\hat{o}\,\tilde{\phi}^\pm_n(\theta,x)\,\mathrm{d}x. \qquad (40)$$

2.3.2. regions ii and iv: complex spectrum

in regions ii and iv, the discrete eigenvalues of $h(\theta)$ take the values $E^\pm_n = \pm\mathrm{i}\hbar|\Omega|[n]$, with eigenfunctions $\tilde{\phi}^\pm_n(\theta,x)$. in region ii, the eigenfunctions of the positive (negative) spectrum are square-integrable in the interval $I_4$ ($I_3$); meanwhile, in region iv, the eigenfunctions of the positive (negative) spectrum are square-integrable in the interval $I_3$ ($I_4$). so the time evolution of the eigenfunctions is given by

$$\tilde{\varphi}^\pm_n(\theta,x,t) = \mathrm{e}^{-\mathrm{i}E_n t/\hbar}\,\tilde{\phi}^\pm_n(\theta,x) = \mathrm{e}^{\pm(n+\frac{1}{2})|\Omega|t}\,\tilde{\phi}^\pm_n(\theta,x) \qquad (41)$$

and

$$\psi^\pm_n(\theta,x,t) = \mathrm{e}^{-\mathrm{i}E^{*}_n t/\hbar}\,\psi^\pm_n(\theta,x) = \mathrm{e}^{\mp(n+\frac{1}{2})|\Omega|t}\,\psi^\pm_n(\theta,x). \qquad (42)$$

as a result,

$$\langle m|\hat{o}|n\rangle = \mathrm{e}^{\pm(n-m)|\Omega|t}\int_{-\infty}^{\infty}\left(\psi^\pm_m(\theta,x)\right)^{*}\hat{o}\,\tilde{\phi}^\pm_n(\theta,x)\,\mathrm{d}x. \qquad (43)$$

3. results and discussion

in order to evaluate the benefits of the present approach, let us consider the time evolution of a given initial state when the parameters of the model correspond to region ii. in [26] we analysed the swanson model by solving its eigenvalue problem in the rigged hilbert space. we found that in region ii the hamiltonian is similar to that of a particle in a parabolic barrier. in the framework of the csm, we model the effective interaction by a complex potential. this resembles the spirit of the optical potential in nuclear physics [48, 49], where the potential seen by a nucleon incident on a nucleus is modeled by a complex effective potential accounting for the loss of flux due to the interaction of the incident particle with the nucleons of the nucleus. we shall consider the solutions with eigenvalues $E_n = -\mathrm{i}\hbar|\Omega|(n+1/2)$, which evolve in time as $\mathrm{e}^{-|\Omega|(n+1/2)t}$. they correspond to the boundary problem for $0 < t < \infty$. in this case $\gamma = -\pi/4$ and $\theta \in I_3$. for simplicity, let us assume that the initial state is a coherent state of the form

$$\phi_I(z,\theta,x) = \mathrm{e}^{-|z|^2/2}\sum_{k=0}^{\infty}\frac{z^k}{\sqrt{k!}}\,\tilde{\phi}_k(\theta,x), \qquad (44)$$

where $\tilde{\phi}_k(\theta,x)$ is the $k$-th eigenfunction of $h(\theta)$. the survival probability of the state can be computed as

$$P(t) = \left|\int_{-\infty}^{\infty}\left(\psi_I(z,\theta,x)\right)^{*}\mathrm{e}^{-\mathrm{i}h(\theta)t/\hbar}\,\phi_I(z,\theta,x)\,\mathrm{d}x\right|^2 = \left|\mathrm{e}^{-|z|^2-|\Omega|t/2}\sum_{k=0}^{\infty}\frac{\left(|z|^2\mathrm{e}^{-|\Omega|t}\right)^k}{k!}\right|^2 = \mathrm{e}^{-|\Omega|t + 2|z|^2\left(\mathrm{e}^{-|\Omega|t}-1\right)}. \qquad (45)$$

notice that, in this particular case, $P(t)$ is independent of the parameter $\theta$.
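as a quick numerical check (not part of the original text) of the closed form in eq. (45), one can compare it with a truncated version of the intermediate series; the values of $|\Omega|$ and $z$ below are illustrative:

```python
import numpy as np
from math import factorial

# Survival probability of the coherent state, eq. (45):
# P(t) = |exp(-|z|^2 - W t/2) * sum_k (|z|^2 e^{-W t})^k / k!|^2
#      = exp(-W t + 2 |z|^2 (e^{-W t} - 1)),  with W = |Omega|.
W, z = 1.0, 1.5                                   # illustrative values
t = np.linspace(0.0, 5.0, 11)
q = abs(z) ** 2 * np.exp(-W * t)
series = sum(q ** k / factorial(k) for k in range(60))     # truncated sum
p_series = np.abs(np.exp(-abs(z) ** 2 - W * t / 2.0) * series) ** 2
p_closed = np.exp(-W * t + 2.0 * abs(z) ** 2 * (np.exp(-W * t) - 1.0))
print(np.allclose(p_series, p_closed))            # True
```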
4. conclusions

in this work we analyse the advantages of the csm for describing the dynamics of a non-hermitian system when the eigenfunctions of the problem do not belong to $L^2(\mathbb{R})$. we have shown that we can cast the original problem into a complex potential, which includes absorption and dissipation effects according to the sign of its imaginary component. we have shown that for a range of values of $\theta$ in the different regions of the model, the resulting eigenfunctions are square-integrable. this feature facilitates the study of the dynamics of the system from the computational point of view. the price we have to pay is the lack of pt-symmetry invariance of the transformed hamiltonian. work is in progress concerning the application of the csm to a more involved problem, such as the one presented in [50].

acknowledgements

this work was partially supported by the national research council of argentina (conicet) (pip 0616) and by the agencia nacional de promoción científica (anpcyt) of argentina.

references

[1] m. s. swanson. transition elements for a non-hermitian quadratic hamiltonian. journal of mathematical physics 45(2):585–601, 2004. https://doi.org/10.1063/1.1640796.
[2] c. m. bender, s. boettcher. real spectra in non-hermitian hamiltonians having pt symmetry. physical review letters 80(24):5243–5246, 1998. https://doi.org/10.1103/physrevlett.80.5243.
[3] c. m. bender, m. v. berry, a. mandilara. generalized pt symmetry and real spectra. journal of physics a: mathematical and general 35(31):l467–l471, 2002. https://doi.org/10.1088/0305-4470/35/31/101.
[4] c. bender, b. berntson, d. parker, e. samuel. observation of pt phase transition in a simple mechanical system. american journal of physics 81(3):173–179, 2013. https://doi.org/10.1119/1.4789549.
[5] c. m. bender, m. gianfreda, ş. k. özdemir, et al. twofold transition in pt-symmetric coupled oscillators. physical review a 88(6):062111, 2013. https://doi.org/10.1103/physreva.88.062111.
[6] a. beygi, s. p. klevansky, c. m. bender. coupled oscillator systems having partial pt symmetry. physical review a 91(6):062101, 2015. https://doi.org/10.1103/physreva.91.062101.
[7] z. wen, c. m. bender. pt-symmetric potentials having continuous spectra. journal of physics a: mathematical and theoretical 53(37):375302, 2020. https://doi.org/10.1088/1751-8121/aba468.
[8] c. m. bender, h. f. jones. interactions of hermitian and non-hermitian hamiltonians. journal of physics a: mathematical and theoretical 41(24):244006, 2008. https://doi.org/10.1088/1751-8113/41/24/244006.
[9] a. sinha, r. roychoudhury. isospectral partners of a complex pt-invariant potential. physics letters a 301(3-4):163–172, 2002. https://doi.org/10.1016/s0375-9601(02)00736-3.
[10] a. sinha, p. roy. generalized swanson models and their solutions. journal of physics a: mathematical and theoretical 40(34):10599–10610, 2007. https://doi.org/10.1088/1751-8113/40/34/015.
[11] a. sinha, p. roy. pseudo supersymmetric partners for the generalized swanson model. journal of physics a: mathematical and theoretical 41(33):335306, 2008. https://doi.org/10.1088/1751-8113/41/33/335306.
[12] h. f. jones. on pseudo-hermitian hamiltonians and their hermitian counterparts. journal of physics a: mathematical and general 38(8):1741–1746, 2005. https://doi.org/10.1088/0305-4470/38/8/010.
[13] ö. yeşiltaş. quantum isotonic nonlinear oscillator as a hermitian counterpart of swanson hamiltonian and pseudo-supersymmetry. journal of physics a: mathematical and theoretical 44(30):305305, 2011. https://doi.org/10.1088/1751-8113/44/30/305305.
[14] p. e. g. assis, a. fring. metrics and isospectral partners for the most generic cubic pt-symmetric non-hermitian hamiltonian. journal of physics a: mathematical and theoretical 41(24):244001, 2008.
https://doi.org/10.1088/1751-8113/41/24/244001. [15] b. midya, p. p. dube, r. roychoudhury. nonisospectrality of the generalized swanson hamiltonian and harmonic oscillator. journal of physics a: mathematical and theoretical 44(6):062001, 2011. https://doi.org/10.1088/1751-8113/44/6/062001. [16] a. mostafazadeh. metric operators for quasi-hermitian hamiltonians and symmetries of equivalent hermitian hamiltonians. journal of physics a: mathematical and theoretical 41(24):244017, 2008. https://doi.org/10.1088/1751-8113/41/24/244017. [17] a. mostafazadeh. pseudo-hermitian representation of quantum mechanics. international journal of geometric methods in modern physics 07(07):1191–1306, 2010. https://doi.org/10.1142/s0219887810004816. [18] m. znojil. complete set of inner products for a discrete pt -symmetric square-well hamiltonian. journal of mathematical physics 50(12):122105, 2009. https://doi.org/10.1063/1.3272002. [19] b. bagchi, a. fring. minimal length in quantum mechanics and non-hermitian hamiltonian systems. physics letters a 373(47):4307–4310, 2009. https://doi.org/10.1016/j.physleta.2009.09.054. [20] a. sinha, p. roy. generalized swanson model and its pseudo supersymmetric partners. in recent developments in theoretical physics, pp. 222–234. indian statistical institute, india, 2009. https://doi.org/10.1142/9789814287333_0010. [21] b. bagchi, i. marquette. new 1-step extension of the swanson oscillator and superintegrability of its two-dimensional generalization. physics letters a 379(26-27):1584–1588, 2015. https://doi.org/10.1016/j.physleta.2015.04.009. [22] s. dey, a. fring, l. gouba. milne quantization for non-hermitian systems. journal of physics a: mathematical and theoretical 48(40):40ft01, 2015. https://doi.org/10.1088/1751-8113/48/40/40ft01. [23] j. da providência, n. bebiano, j. p. da providência. non-hermitian hamiltonians with real spectrum in quantum mechanics. brazilian journal of physics 41:78–85, 2011. https://doi.org/10.1007/s13538-011-0010-9. [24] f. bagarello. examples of pseudo-bosons in quantum mechanics. physics letters a 374(37):3823–3827, 2010. https://doi.org/10.1016/j.physleta.2010.07.044. [25] f. bagarello, j. feinberg. bicoherent-state path integral quantization of a non-hermitian hamiltonian. annals of physics 422:168313, 2020. https://doi.org/10.1016/j.aop.2020.168313. [26] v. fernández, r. ramírez, m. reboiro. swanson hamiltonian: non-pt-symmetry phase. journal of physics a: mathematical and theoretical 55(1):015303, 2022. https://doi.org/10.1088/1751-8121/ac3a35. [27] j. aguilar, j. m. combes. a class of analytic perturbations for one-body schrödinger hamiltonians. communications in mathematical physics 22:269–279, 1971. https://doi.org/10.1007/bf01877510. [28] e. balslev, j. m. combes. spectral properties of many-body schrödinger operators with dilatation-analytic interactions. communications in mathematical physics 22:280–294, 1971. https://doi.org/10.1007/bf01877511. [29] b. simon. quadratic form techniques and the balslev-combes theorem. communications in mathematical physics 27:1–9, 1972. https://doi.org/10.1007/bf01649654. [30] h. feshbach. a unified theory of nuclear reactions. ii. annals of physics 19(2):287–313, 1962. https://doi.org/10.1016/0003-4916(62)90221-x. [31] y. ho. the method of complex coordinate rotation and its applications to atomic collision processes. physics reports 99(1):1–68, 1983. https://doi.org/10.1016/0370-1573(83)90112-6. [32] t. myo, k. katõ. complex scaling: physics of unbound light nuclei and perspective. 
progress of theoretical and experimental physics 2020(12):12a101, 2020.
[33] i. m. gel'fand, g. shilov. generalized functions vol. i. academic press, new york and london, 1964.
[34] a. bohm, j. d. dollard, m. gadella. dirac kets, gamow vectors and gelfand triplets. lecture notes in physics vol. 348. springer, 1989.
[35] a. mostafazadeh. pseudo-hermiticity versus pt-symmetry. ii. a complete characterization of non-hermitian hamiltonians with a real spectrum. journal of mathematical physics 43(5):2814–2816, 2002. https://doi.org/10.1063/1.1461427.
[36] f. bagarello, j. p. gazeau, f. h. szafraniec, m. znojil. non-selfadjoint operators in quantum physics: mathematical aspects. john wiley & sons, usa, 2015.
[37] t. y. azizov, i. s. iokhvidov. linear operators in spaces with an indefinite metric. john wiley & sons, usa, 1989.
[38] u. günther, b. f. samsonov. naimark-dilated pt-symmetric brachistochrone. physical review letters 101:230404, 2008. https://doi.org/10.1103/physrevlett.101.230404.
[39] u. günther, s. kuzhel. pt-symmetry, cartan decompositions, lie triple systems and krein space-related clifford algebras. journal of physics a: mathematical and theoretical 43(39):392002, 2010. https://doi.org/10.1088/1751-8113/43/39/392002.
[40] a. dijksma, h. langer. operator theory and ordinary differential operators. american mathematical society, providence, ri, 1996.
[41] f. s. u. günther. ir-truncated pt-symmetric ix3 model and its asymptotic spectral scaling graph. arxiv:1901.08526.
[42] d. chruściński. quantum mechanics of damped systems. journal of mathematical physics 44(9):3718–3733, 2003. https://doi.org/10.1063/1.1599074.
[43] d. chruściński. quantum mechanics of damped systems. ii. damping and parabolic potential barrier. journal of mathematical physics 45(3):841–854, 2004. https://doi.org/10.1063/1.1644751.
[44] g. marcucci, c. conti. irreversible evolution of a wave packet in the rigged-hilbert-space quantum mechanics. physical review a 94:052136, 2016. https://doi.org/10.1103/physreva.94.052136.
[45] d. bermudez, d. j. fernández c. factorization method and new potentials from the inverted oscillator. annals of physics 333:290–306, 2013. https://doi.org/10.1016/j.aop.2013.02.015.
[46] s. dey, a. fring. squeezed coherent states for noncommutative spaces with minimal length uncertainty relations. physical review d 86:064038, 2012. https://doi.org/10.1103/physrevd.86.064038.
[47] s. dey, a. fring, b. khantoul.
hermitian versus non-hermitian representations for minimal length uncertainty relations. journal of physics a: mathematical and theoretical 46(33):335304, 2013. https://doi.org/10.1088/1751-8113/46/33/335304.
[48] l. l. foldy, d. walecka. on the theory of the optical potential. annals of physics 54(2):403, 1969. https://doi.org/10.1016/0003-4916(69)90161-4.
[49] j. rotureau, p. danielewicz, g. hagen, et al. optical potential from first principles. physical review c 95:024315, 2017. https://doi.org/10.1103/physrevc.95.024315.
[50] r. ramírez, m. reboiro. squeezed states from a quantum deformed oscillator hamiltonian. physics letters a 380(11-12):1117–1124, 2016. https://doi.org/10.1016/j.physleta.2016.01.027.

acta polytechnica vol. 46 no. 5/2006

ergonomic optimization of a manufacturing system work cell in a virtual environment

f. caputo, g. di gironimo, a. marzano

the paper deals with the development of a methodology for studying, in a virtual environment, the ergonomics of a work cell in an automotive manufacturing system. the methodology is based on the use of digital human models and virtual reality techniques in order to simulate, in a virtual environment, human performances during the execution of assembly operations. the objective is to define the optimum combination of those geometry features that influence human postures during assembly operations in a work cell. in the demanding global marketplace, ensuring that human factors are comprehensively addressed is becoming an increasingly important aspect of design. manufacturers have to design work cells that conform to all relevant health and safety standards. the proposed methodology can assist the designer to evaluate the performance of workers in a workplace before it has been realized. the paper presents an analysis of a case study proposed by comau, a global supplier of industrial automation systems for the automotive manufacturing sector and a global provider of full maintenance services. the study and all the virtual simulations have been carried out in the virtual reality laboratory of the competence regional center for the qualification of transportation systems (crdc "trasporti", www.centrodicompetenzatrasporti.unina.it), which was founded by the campania region with the aim of delivering advanced services and introducing new technologies into local companies operating in the field of transport.

keywords: ergonomics, digital human models, manufacturing process, work cell optimization.

1 introduction

the implementation of a so-called "digital factory" is a tremendous challenge for automotive engineering. the technical task is to effect a seamless information backbone spanning three key departments: design, production process planning, and manufacturing. suppliers, such as machine and tool vendors, also have to be integrated into the information flow. furthermore, there is the challenge of assimilating the human factor into the digital factory. new production planning tools will significantly change not only the contemporary production process planner's work but also the collaboration with suppliers. this raises one major issue: how to integrate different user groups into the design of complex engineering applications for production planning. the authors focus on a case study about the development of a methodology for optimizing the workplace in the automotive field. in particular, they investigate the feasibility of integrating virtual humans into design environments to perform ergonomic assessments [1]. the paper illustrates the general benefits of ergonomic assessments and the detailed advantages due to the utilisation of virtual humans. a virtual human is an accurate biomechanical model of a human being. these models fully mimic human motion to allow an ergonomics (or human-factors) expert to perform process flow simulations. this study uses an analysis of the jack software package to highlight the usefulness of such software options for applications in the manufacturing industry [2]. workplace ergonomic considerations have traditionally been reactive, time-consuming, incomplete, sporadic, and difficult. the experience of an expert in ergonomic studies, or data from injuries that have been previously observed and reported, have always been necessary for these studies, and analyses are made after problems have occurred in the workplace. there are now emerging technologies supporting simulation-based engineering, and several operational simulation-based engineering systems that address this in a proactive manner.
at present, various commercial systems are available for ergonomic analysis of human posture and workplace design.

2 related work

the importance of applying ergonomics to workplace design is illustrated by the injuries, illnesses, and fatalities (iif) program of the u.s. department of labor, bureau of labor statistics [3]. according to this report, there were 5.2 million occupational injuries and illnesses among u.s. workers, and approximately 5.7 of every 100 workers experienced a job-related injury or illness. workplace-related injuries and illnesses increase workers' compensation and retraining costs, absenteeism, and faulty products. many research studies have shown the positive effects of applying ergonomic principles in workplace design [4]. riley et al. [5] describe a study demonstrating how applying appropriate ergonomic principles during design can reduce many life cycle costs. traditional methods for ergonomic analysis were based on statistical data obtained from previous studies, or on equations based on such studies. an ergonomics expert was required to interpret the situation, analyze and compare with existing data, and suggest solutions. the standard analytical tools included the niosh lifting equation [6], ovako posture analysis [7], and rapid upper limb assessment [8], among others. various commercial software systems are now available for ergonomic studies. hanson [9] presents a survey of three such tools, annie-ergoman, jack, and ramsis, used for human simulation and ergonomic evaluation of car interiors. the tools are compared, and the comparison shows that all three tools have excellent potential for ergonomically evaluating car interiors in the early design phase. jack [10], an ergonomics and human factors product, enables users to position biomechanically accurate digital humans of various sizes in virtual environments, assign them tasks and analyze their performance. gill et al. [11] provide an analysis of the jack software to highlight its usefulness for applications in the manufacturing industry.
eynard et al. [12] describe a methodology using jack to generate and apply body typologies from anthropometric data of the italian population and compare the results with a global manikin. the study identified the importance of using accurate anthropometric data for ergonomic analysis. sundin et al. [13] present a case study to highlight the benefits of the use of jack analysis in the design phase of a new volvo bus. the importance of virtual humans in simulation and design has also been set out by badler [1] and hou [14]. ford has made use of the "design for ergonomics" virtual manufacturing process, using jack. the ergonomic design technology lab at pohang institute of science and technology is also involved in human modeling, design simulation, design evaluation in virtual environments and design optimization [15]. the potential value of ergonomics analysis using virtual environments is discussed in detail by wilson [16].

3 the methodology for optimizing a workplace: pei method & wei method

in this paper the problem that the authors have faced is the optimization of the geometric features of a workplace in order to guarantee the maximum postural comfort for operators from different anthropometric percentiles during assembly operations. such optimization, which has to consider the presence of possible external restraints, is strictly connected to the layout of the physical elements present in the working area. for this purpose, a methodology is proposed, based on the application of the "task analysis toolkit" (tat) included in the jack software, whose functions will be analyzed in the following sections. among the tools made available by tat for the analysis of a working activity (niosh lifting analysis; rula; manual material handling limits; static strength prediction; owas; low back analysis; predetermined time standard), it has not been possible to find one that, by itself, enables us to determine the optimal solution among several alternatives. if the geometric features characterizing a workplace influence the ergonomics of only one operation, the pei method can be applied in order to define the optimum combination of these features. it follows the phases illustrated in the flow diagram of fig. 1a. the aim of the pei method is the ergonomic optimization of a single operation within a work cell, so it refers to one operation only. in general, more than one operation is performed in a work cell. in this case, the combination of geometric features may not influence the single operations in the same way and, therefore, the pei method is not applicable; the combination of geometric features that optimizes the posture of all human percentiles can then be evaluated by applying the wei method. fig. 1b shows the flow diagram of this approach, where m represents the number of operations that have to be performed in the work cell.
3.1 first phase: analysis of the working environment

the first phase consists in an analysis of the working environment and in the consideration of all the possible movement alternatives: this, in general, involves considering alternative routes, postures and speeds of execution, which all contribute to the effective conclusion of the work. it is essential, in a virtual environment, to simulate all these operations in order to verify, in the first place, their feasibility. in fact, for instance, it cannot be taken for granted that all the points can be reached starting from different postures. the execution of this analysis guarantees the feasibility of the assignment. among the phases of the optimization this is the one that requires the longest time, since it needs the creation of a large number of simulations in real time, without taking into account that some of them will turn out to be useless because, for instance, the simulation shows that some points cannot be reached with the movements that the designer had conceived. other parameters that can be modified are the distances of the manikin from objects taken as a reference, and the possibility to move the objects in the working area.

3.2 second phase: reachability and accessibility analysis

the design of a workplace always requires a preliminary study of the accessibility of the critical points. this is a very interesting problem, and often occurs in assembly lines.

fig. 1: a) pei method flow chart, b) wei method flow chart

the problem consists in verifying that in the designed layouts it is possible to carry out the movements necessary for the operation and that all the critical points can be reached; in a lifting operation, for instance, it could happen that a shelf is positioned too high and that therefore the worker does not succeed in completing his assignment. such an analysis can be conducted in jack by activating the collision detection algorithm. the layout configurations that do not satisfy the accessibility analysis do not have to be taken into consideration in the following analyses. from the analysis of the working environment and the accessibility analysis, the different configurations can be designed. if the number of configurations is high, a design of experiments (doe) procedure can be applied [17].

3.3 third phase: static strength prediction (ssp)

once the possible working sequences have been conceived, the question is: how many workers will be able to exert the necessary efforts for these movements? the answer can come from the static strength prediction tool. in the case that the task must be performed, during a given period of time, by workers of different stature, age and sex, it can be accepted only in the hypothesis that the tool evaluates at 100 % the percentage of workers capable of the working activity. in practice, this cannot be done, because many activities provide percentages lower than 100 %. in the workplace design phase, the operations that have a percentage of 0 % should not be taken into consideration in the following analyses. the operations whose evaluated percentage is below a certain limit should also be discarded.

3.4 fourth phase: low back analysis (lba)

low back analysis is a tool that allows the forces on the virtual manikin's spine to be evaluated, according to each posture assumed by the digital human model and any loading action.
this tool evaluates, in real time, the actions linked to the tasks imposed on the manikin, according to the niosh standards and according to the studies carried out in this field by raschke [18]. the low back analysis tool offers information related to the compression and shear forces on the l4 and l5 lumbar disks, together with the reaction moments in the axial, sagittal and lateral planes on the l4 and l5 lumbar disks, and the activity level of the trunk muscles needed to balance the spine moments. in particular, in the following, we use the value, expressed in newtons, of the compression on the l4 and l5 lumbar disks.

3.5 fifth phase: ovako working posture analysis system (owas)

owas is a simple method for verifying the degree of comfort related to working postures and for evaluating the degree of urgency that has to be assigned to corrective actions. the method was developed in the finnish metallurgic industry in the 1970s. it is based on a classification of postures and on an observation of working tasks. the owas method consists in the use of a four-digit code to assess the position of the back, the arms and the legs, together with the intensity of the existing loads, during the performance of a specific task. the activity under examination has to be observed over a period of about thirty seconds. during each step, the positions and the applied forces have to be registered, in accordance with a technique that decomposes complex activities. in this way, the distribution of the postures, the repeated positions and the critical positions are identified. the data collection and the successive analysis enable the working procedure to be redesigned to reduce or eliminate postures that are potentially dangerous. in fact, the tasks are classified using four principal classes: 1) no harmful effect, 2) a limited harmful effect, 3) a recognised harmful effect on health, 4) a highly harmful effect on health.

3.6 sixth phase: rapid upper limb assessment analysis (rula)

from the initial scenario of possible layout configurations, the procedure progressively discarded those that: 1) did not ensure accessibility of the critical points, 2) asked for efforts that the workers were presumably not able to perform, 3) were potentially dangerous for the lower back. in this phase, the postural quality is analyzed. the purpose is to minimize the risks of muscular-skeletal pathologies in the medium-to-long term. the tool used is rula. rula analysis refers to exposure to the risk of disease and/or damage to the upper limbs. the analysis takes into account loads and biomechanical and postural parameters, focusing on the position of the head, body and upper limbs. the rula method is based on filling in a data sheet. the sheet enables the user to quickly compute a value that indicates the degree of urgency of an intervention that needs to be adopted in order to reduce the risk of damage to the upper limbs. the method enables not only arm and wrist analyses, but also head, body and leg analyses. the first analysis, together with the information about the muscles in use and the existing loads, enables an assessment of the final score that represents the evaluation of the working posture. the risk is considered "acceptable" when the score is 1 or 2, "in need of further investigation" (a score of 3 or 4), "in need of further investigation and a rapid change" (a score of 5 or 6) or "in need of investigation and immediate change" (a score of 7).
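the rula scoring bands quoted above translate directly into a small lookup (an illustrative sketch, not part of the paper):

```python
def rula_action(score: int) -> str:
    """Map a RULA final score (1-7) to the action category quoted above."""
    if score in (1, 2):
        return "acceptable"
    if score in (3, 4):
        return "further investigation needed"
    if score in (5, 6):
        return "further investigation and rapid change needed"
    if score == 7:
        return "investigation and immediate change needed"
    raise ValueError("RULA final scores range from 1 to 7")

print(rula_action(6))   # further investigation and rapid change needed
```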
3.7 seventh phase: pei evaluation at this point a comparison can be established among the layout configurations, through the critical postures associated with them. the comparison allows us to establish a classification of risk of the operator contracting muscle-skeletal pathologies in the medium-to-long term. the choice of this optimal solution passes through the individuation of the more comfortable posture, which can be carried out using a posture evaluation index (pei), which integrates the results of lba, owas and rula [19]. in particular, pei is the sum of three adimensional variables i1, i2 and i3. the variable i1 is evaluated normalizing the lba value with the niosh limit for the compression strength (3400 n). variables i2 and i3 are respectively equal to the owas index normalized with its critical value (“3”) and the rula index normalized with its critical value “5”. pei � � �i i i1 2 3 (1) where: i1 3400� lba n, i2 3� owas , i3 5� rula . pei definition and the consequent use of lba, owas and rula task analysis tools depend on the following consideration. the principal risk factors for work requiring biomechanical overload are: repetition, frequency, posture, © czech technical university publishing house http://ctn.cvut.cz/ap/ 23 acta polytechnica vol. 46 no. 5/2006 effort, recovery time. the factors that mainly influence the execution of an assembly task are extreme postures, in particular of the upper limbs, and high efforts. consequently, attention has to be paid to the evaluation of compression strengths on the l4 and l5 lumbar disks (i1 determination), to the evaluation of the level of discomfort of the posture (i2 determination), and to the evaluation of the level of fatigue of the upper limbs (i3 determination). pei enables us to select the modus operandi to perform the disassembly task in a simple way. in fact, the optimal posture associated to an elementary task is the critical posture with the minimum pei value. the variables defining pei depend on the discomfort level associated with the examined posture: the greater the discomfort, the higher are i1, i2 and i3 and, consequently, pei. pei expresses, in a synthetic way, the “quality” of a posture with values varying between a minimum value of 0.47 (no loads applied to the hands, values of joints angles within the acceptability range) and a maximum value depending on the i1 index. in order to ensure the conformity of the work with the laws protecting health and safety, a posture whose i1 index is more than or equal to 1 is assumed not valid. in fact, in this way the niosh limit related to compression strengths on the l4 and l5 lumbar disks will be exceeded. according to these considerations, the maximum acceptable value for pei is 3 (compression strength on the l4 and l5 lumbar disks equal to the niosh limit 3400 n; values of joints angles not acceptable). iterating the procedure for all the elementary tasks of the assembly sequence, it is possible to associate to each of them the optimal posture to be assumed and, finally, to individuate the optimal value of the geometric parameters for the assembly task. 3.8 eighth phase: wei evaluation once we have individuated the optimal value of the geometric parameters for each operation within a work cell (m represents the number of operation), the wei (work cell evaluation index) index is introduced [20]. this is defined as: wei pei( )configuration wj i i i � �� , (2) where: w time of operationi i� work cell time cycle. 
the best wei index is obtained from the following expression:

$$\mathrm{wei}_{\mathrm{best}} = \min_j\left\{\mathrm{wei}(\mathrm{config}_j)\right\}. \qquad (3)$$

the wei definition depends on the following consideration: if the aim is the ergonomic optimization of the work cell, it is necessary to establish a single optimal solution.

4 case study

in order to test the pei method and the wei method, a case study proposed by comau was analyzed. the goal was to optimize a body welding work cell by using the methodology explained above.

5 working environment analysis

the geometric model of the body welding work cell was imported into the jack software, and then the 9 operations that have to be realized in the work cell were simulated. the 9 operations are as follows:

1. welding visual control
2. welding imperfections restoring with brush
3. braze welding renewal
4. smearing sealer with gun
5. smearing sealer renewal
6. bottom cross member wind screen loading
7. upper cross member wind screen loading
8. back cross member wind screen loading
9. upper front rafter loading

fig. 2: sequence of operations in the body welding work cell

fig. 2 shows the sequence of operations performed in the body welding work cell. by simulating the operations that have to be performed in this work cell, a qualitative analysis of the postural sequence for each operation was conducted, in order to individuate the geometric parameters to be optimized. table 1 shows the results of this first phase. as shown in the same table, a score was assigned to each operation (score values: 0 = not critical, 1 = low injury, 2 = middle-low injury, 3 = middle-high injury, 4 = high injury), and the total score $s = \sum_{i=1}^{9} s_i$ associated with each parameter that influences the postural positions of the workers while they are performing the tasks was calculated.

table 1: definition of the parameters to be optimized. the parameter "worker percentile" scores 1, 3, 2, 1, 2, 3, 2, 2, 4 over operations op 1–op 9, for a total score of 20; "postural positions" has a total score of 5, and the geometric parameter "body locking point" a total score of 7; the geometric parameter "height of the body with respect to the assembly line" ranges from not critical to high injury over the operations.

it can be asserted that in this case study there is just one geometric parameter to be optimized, represented by the height of the body with respect to the assembly line, and one external factor, represented by the percentile of the worker.

6 accessibility analysis

the analysis of the geometry was conducted in order to define the range of the body height, taking into account the geometric constraints of the work cell. then, the range was reduced through the accessibility analysis. the visual control of the welding (operation 1) and the smearing of sealer with a gun (operation 4) defined the limits of the range, as shown in fig. 3 and fig. 4. the lower limit is −5 cm, because the body positioned at −10 cm does not allow the spot welding to be visualized completely (fig. 3b), and the upper limit is 20 cm, because the body positioned at 25 cm does not allow the sealer smearing to be realized by the 5th percentile. a step of 5 cm was established, so the possible body height values are 6 (l1 = 6), and the percentiles considered are 3 (l2 = 3): 5th, 50th, 95th.

fig. 3: a) spot welding visible, b) spot welding not completely visible

fig. 4: the smearing sealer operation realized by the 5th percentile (on the left) vs the smearing sealer operation not practicable by the 5th percentile (on the right)
now it is possible to define the number of configurations (n): from the combinations of the values of these parameters there are $n = l_1 \cdot l_2 = 18$ configurations, as shown in table 2.

table 2: experimental plan ($n = l_1 \cdot l_2$, with $l_1 = 6$ and $l_2 = 3$)

configuration | height of body [cm] | percentile
1 | −5 | 5th
2 | −5 | 50th
3 | −5 | 95th
4 | 0 | 5th
5 | 0 | 50th
6 | 0 | 95th
7 | 5 | 5th
8 | 5 | 50th
9 | 5 | 95th
10 | 10 | 5th
11 | 10 | 50th
12 | 10 | 95th
13 | 15 | 5th
14 | 15 | 50th
15 | 15 | 95th
16 | 20 | 5th
17 | 20 | 50th
18 | 20 | 95th

7 pei method & wei method results

by applying the ssp, lba, owas and rula tools for each configuration and operation, the configurations injurious to the worker have been discarded. table 3 shows the evaluation of pei and wei for the remaining configurations; note that each pei has been evaluated taking into account an average value among those obtainable. it can be asserted that the height of the body with respect to the assembly line corresponding to the optimal postural sequence is 20 cm.

table 3: wei method & pei method results. for each operation op 1–op 9, the column gives the (percentile-averaged) pei; the weights are w = 0.044 (op1), 0.088 (op2), 0.088 (op3), 0.133 (op4), 0.133 (op5), 0.159 (op6), 0.133 (op7), 0.133 (op8), 0.088 (op9), and the last column gives wei = Σ wᵢ peiᵢ.

conf. | height [cm] | perc. | op1 | op2 | op3 | op4 | op5 | op6 | op7 | op8 | op9 | wei
10–12 | 10 | 5th/50th/95th | 1.937 | 1.600 | 1.611 | 1.801 | 1.600 | 2.599 | 1.627 | 1.644 | 2.218 | 1.864
13–15 | 15 | 5th/50th/95th | 1.854 | 1.432 | 1.527 | 1.679 | 1.433 | 2.590 | 1.869 | 1.814 | 1.859 | 1.821
16–18 | 20 | 5th/50th/95th | 1.753 | 1.263 | 1.534 | 1.543 | 1.263 | 2.527 | 1.863 | 2.057 | 1.721 | 1.771
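a minimal sketch of eqs. (1)-(3) in code, reproducing the wei column of table 3 from the listed weights and percentile-averaged pei values (names are illustrative, not the authors' implementation):

```python
import numpy as np

def pei(lba_n: float, owas: int, rula: int) -> float:
    """Posture evaluation index, eq. (1): PEI = I1 + I2 + I3."""
    i1 = lba_n / 3400.0          # LBA over the NIOSH compression limit
    i2 = owas / 3.0              # OWAS over its critical value
    i3 = rula / 5.0              # RULA over its critical value
    return i1 + i2 + i3

# weights w_i (operation time over work cell time cycle) and the
# percentile-averaged PEI values of table 3, one row per body height [cm]
w = np.array([0.044, 0.088, 0.088, 0.133, 0.133, 0.159, 0.133, 0.133, 0.088])
pei_rows = {
    10: [1.937, 1.600, 1.611, 1.801, 1.600, 2.599, 1.627, 1.644, 2.218],
    15: [1.854, 1.432, 1.527, 1.679, 1.433, 2.590, 1.869, 1.814, 1.859],
    20: [1.753, 1.263, 1.534, 1.543, 1.263, 2.527, 1.863, 2.057, 1.721],
}
wei = {h: float(w @ np.array(p)) for h, p in pei_rows.items()}   # eq. (2)
best = min(wei, key=wei.get)                                     # eq. (3)
print({h: round(v, 3) for h, v in wei.items()}, "-> best height:", best, "cm")
# {10: 1.864, 15: 1.821, 20: 1.771} -> best height: 20 cm
```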
8 conclusions

the proposed methodology makes available a valid tool for workplace analysis. the following objectives have been achieved: to appraise the quality of the postures assumed during a working activity; in designing a new layout, to establish whether it ensures the feasibility of the operation (based on the criteria of accessibility of the critical points, of compatibility of the efforts, and of danger for the lower back); and to compare the possible alternatives for the configuration of the layout, supplying useful criteria for the designer to choose which is the most convenient to realize in the production chain. the reliability of the results depends on the extent to which the assumptions on which the tools of the tat are based are respected: almost static movements, non-excessive temperature and humidity of the environment, satisfactory times of rest. such assumptions are generally satisfied in the normal workplace. the objective of industry is to apply ergonomic criteria to reduce the number of accidents in the workplace and, secondly, to increase productivity. currently only large firms turn their attention to this sector, because the simulation software has a certain cost, and also because applying the software requires time and human resources that small firms do not have. the future objective is, on the one hand, to improve the interaction between the theoretical concepts of ergonomics and the software, and, on the other, to simplify the analytical procedures to reduce time and costs.

9 acknowledgment

the authors, who have contributed equally to this work, thank prof. antonio lanzotti for his helpful discussions and suggestions about future work, and the future engineer fabrizio di gioia for his technical support.

references

[1] badler, n.: virtual humans for animation, ergonomics, and simulation. in: proceedings of the ieee non rigid and articulated motion workshop, 1997, p. 28–36.
[2] choi, h.: integration of multiple human models into a virtual reality environment. thesis, washington state university, 2003.
[3] u.s. department of labor (bureau of labor statistics): workplace injuries and illnesses in 2001. available online [accessed january 2004].
[4] das, b., shikdar, a.: participative versus assigned production standard setting in a repetitive industrial task: a strategy for improving worker productivity. international journal of occupational safety and ergonomics, vol. 5 (1999), no. 3, p. 417–430.
[5] riley, m. w., dhuyvetter, r. l.: design cost savings and ergonomics. in: proceedings of the xivth triennial congress of the international ergonomics association and 44th annual meeting of the human factors and ergonomics association, "ergonomics for the new millennium", san diego, ca, usa, july 29–aug 4, 2000.
[6] dempsey, p. g.: usability of the revised niosh lifting equation. ergonomics, vol. 45 (2002), no. 12, p. 817–828.
[7] keyserling, w. m.: owas: an observational approach to posture analysis. available online [accessed january 2004].
[8] mcatamney, l., corlett, e. n.: rula: a survey method for the investigation of work-related upper limb disorders. applied ergonomics, vol. 24 (1993), no. 2, p. 91–99.
[9] hanson, l.: computerized tools for human simulation and ergonomic evaluation of car interiors. in: proceedings of the xivth triennial congress of the international ergonomics association and 44th annual meeting of the human factors and ergonomics association, "ergonomics for the new millennium", san diego, ca, usa, july 29–aug 4, 2000.
[10] di gironimo, g., martorelli, m., monacelli, g., vaudo, g.: use of virtual mock-up for ergonomic design. in: proc. of 7th international conference on "the role of experimentation in the automotive product development process" – ata 2001 [cd rom], florence, may 23–24, 2001.
[11] gill, s. a., ruddle, r. a.: using virtual humans to solve real ergonomic design problems. in: proceedings of the 1998 international conference on simulation, iee conference publication, 457, p. 223–229.
[12] eynard, e., et al.: generation of virtual man models representative of different body proportions and application to ergonomic design of vehicles. in: proceedings of the xivth triennial congress of the international ergonomics association and 44th annual meeting of the human factors and ergonomics association, "ergonomics for the new millennium", san diego, ca, usa, july 29–aug 4, 2000.
[13] sundin, a., christmansson, m., ortengren, r.: methodological differences using a computer manikin in two case studies: bus and space module design. in: proceedings of the xivth triennial congress of the international ergonomics association and 44th annual meeting of the human factors and ergonomics association, "ergonomics for the new millennium", san diego, ca, usa, july 29–aug 4, 2000, p. 496–498.
[14] hou, h., sun, s., pan, y.: research on virtual human in ergonomics simulation. chinese journal of mechanical engineering, vol. 13 (2000), p. 112–117.
[15] ergosolutions magazine, 2003. available online via http://www.ergosolutionsmag.com/.
[16] wilson, j. r.: virtual environments applications and applied ergonomics. applied ergonomics, vol. 30 (1999), no. 1, p. 3–9.
[17] park, s. h.: robust design and analysis for quality engineering. london, chapman and hall, 1996.
[18] raschke, u.: lumbar muscle activity prediction during dynamic sagittal plane lifting conditions: physiological and biomechanical modeling considerations. ph.d. dissertation, bioengineering, university of michigan, usa, 1994.
[19] di gironimo, g., monacelli, g., patalano, s.: a design methodology for maintainability of automotive components in virtual environment. in: proc. of international design conference – design 2004, dubrovnik, may 18–21, 2004.
[20] love, r. f., morris, j. k., wesolowsky, g. o.: multiobjective & methods. north-holland, 1989.

prof. francesco caputo, e-mail: francesco.caputo@unina.it
giuseppe di gironimo, ph.d., e-mail: giuseppe.digironimo@unina.it
ing. adelaide marzano, e-mail: a.marzano@unina.it
dep. of progettazione e gestione industriale, university of naples federico ii, p.le v. tecchio, 80, 80125 naples, italy

acta polytechnica vol. 45 no. 4/2005

simulation of uav systems

p. kaňovský, l. smrcek, c. goodchild

the study described in this paper deals with the issue of a design tool for the autopilot of an unmanned aerial vehicle (uav) and the selection of the airdata and inertial system sensors. this project was processed in cooperation with vtul a pvo o.z. [1]. the feature that distinguishes the autopilot requirements of a uav (figs. 1, 7, 8) from the flight systems of conventional manned aircraft is the paradox of controlling a high bandwidth dynamical system using sensors that are in harmony with the low cost, low weight objectives that uav designs are often expected to achieve. the principal function of the autopilot is flight stability, which establishes the uav as a stable airborne platform that can operate at a precisely defined height. the main sensor for providing this height information is a barometric altimeter. the solution to the uav autopilot design was realised with simulations using the facilities of matlab® and in particular simulink® [2].

keywords: autopilot, modelling, sojka, tools, uav.

symbols

m airplane mass
v velocity
ȧ time derivative of a
α angle of attack
β angle of drift (sideslip)
φ, θ, ψ euler angles (roll, pitch, yaw)
u, v, w components of velocity of the airplane mass centre relative to the atmosphere
p, q, r components of the angular velocity of the airplane
cd drag coefficient
cl lift coefficient
clf lift coefficient of the vertical tail surface
cl rolling moment coefficient
cn yawing moment coefficient
cm pitching moment coefficient
ρ air density
ix, iy, iz moments of inertia about the (x, y, z) axes
c length of mean aerodynamic chord
s wing area
b wing span
g standard gravitational acceleration
t engine thrust

1 introduction

civil and military usage of low-cost uavs is becoming more and more needed. possibly the most expensive design items are the control and navigation systems. therefore, one of the main questions that each system designer has to face is the selection of appropriate sensors for a specific autopilot system. such sensors should satisfy the main requirements without contravening their boundaries. higher sensor quality can lead to a significant rise in costs. in aircraft design this kind of consideration is especially important due to the safety requirements expressed in airworthiness standards. therefore the question is how to determine the optimal solution. this problem is mostly solved by the designer's experience and by thorough testing. however, this can be very expensive, and involves many risks in relation to flight safety. the problem can be resolved by using a suitable simulation method, for example in the matlab® simulink® program environment. this program can be considered as a facility fully competent for this task. an important factor is what is done to manipulate the functions of the program to achieve the autopilot design. a computer only solves logical problems. it cannot implement practical real world entities, and a computer simulation only simulates what in a sense the designer already knows.
for the precise design solution it is necessary to have a mathematical model of the aircraft, or at least the basic constraints on its movement. by using suitable simulations it is possible not only to evaluate the sensors, but also to optimise their filters and control algorithms.

fig. 1: uav sojka iii (wing span 4.5 m, overall length 3.78 m, mtow 145 kg, max. speed 210 km/hr, payload 20 kg, endurance >4 hr, engine 28.4 kw)

2 experiment

the first phase of the project was to verify the sensor parameters declared by the manufacturer. in order to measure the sensor parameters (sensitivity, accuracy, stability, temperature dependence and hysteresis), it was necessary to adapt an existing automatic system by recording data into a file (see fig. 2). the second phase in this project was to make a statistical evaluation of the data obtained by the automatic measurement system. the validity of the measurements themselves was verified by an accuracy analysis of the measurement system and by processing the statistical data. the most important quantity in the set of measured data was the pressure variation between two different altitudes, which could be measured very precisely. the entire evaluation of the measured data then helped to find the sensor parameters and, consequently, to design a sensor model for the matlab simulink® program. the designed model was a simplified version, because it reflected only the parameters relevant for the specified uav autopilot design. the sensor delay in this case could be ignored, because its value was negligible in comparison with the previously mentioned sensor parameters. the basic requirement for this project was to obtain data concerning the uav design system. in this case, the uav system was described in the following referential axes (fig. 3) and by a set of differential equations (equations 1, 2) [3, 4]:

force equations

$$\dot{u} = rv - qw - \frac{c_D\,\rho v^2 s}{2m} + \frac{t}{m} - g\sin\theta,$$
$$\dot{v} = pw - ru + \frac{c_{Lf}\,\rho v^2 s}{2m} + g\cos\theta\sin\phi,$$
$$\dot{w} = qu - pv - \frac{c_L\,\rho v^2 s}{2m} + g\cos\theta\cos\phi. \qquad (1)$$

moment equations

$$\dot{p} = \frac{i_y - i_z}{i_x}\,qr + \frac{c_l\,\rho v^2 s\, b}{2\,i_x},$$
$$\dot{q} = \frac{i_z - i_x}{i_y}\,rp + \frac{c_m\,\rho v^2 s\, c}{2\,i_y},$$
$$\dot{r} = \frac{i_x - i_y}{i_z}\,pq + \frac{c_n\,\rho v^2 s\, b}{2\,i_z}. \qquad (2)$$

all variables were calculated in a frame of differential equations.
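for illustration (not the authors' simulink implementation), the right-hand side of eqs. (1)-(2) can be sketched as follows; all names are placeholders, the sign conventions are those of the standard body-axis equations used in the reconstruction above, and the aerodynamic coefficients are assumed to be supplied for the current flight state:

```python
import numpy as np

def rigid_body_derivatives(s, thrust, coef, const):
    """Body-axis force and moment equations, eqs. (1)-(2) (a sketch).

    s = (u, v, w, p, q, r, phi, theta); coef holds the aerodynamic
    coefficients cd, clf, cl_lift, cl_roll, cm, cn for this state."""
    u, v, w, p, q, r, phi, theta = s
    m, g, rho, S, b, c = (const[k] for k in ("m", "g", "rho", "S", "b", "c"))
    ix, iy, iz = const["ix"], const["iy"], const["iz"]
    qs = 0.5 * rho * (u * u + v * v + w * w) * S       # (1/2) rho V^2 S

    du = r * v - q * w - qs * coef["cd"] / m + thrust / m - g * np.sin(theta)
    dv = p * w - r * u + qs * coef["clf"] / m + g * np.cos(theta) * np.sin(phi)
    dw = q * u - p * v - qs * coef["cl_lift"] / m + g * np.cos(theta) * np.cos(phi)
    dp = ((iy - iz) * q * r + qs * b * coef["cl_roll"]) / ix
    dq = ((iz - ix) * r * p + qs * c * coef["cm"]) / iy
    dr = ((ix - iy) * p * q + qs * b * coef["cn"]) / iz
    return np.array([du, dv, dw, dp, dq, dr])
```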
all aerodynamic parameters needed for these equations were obtained from wind tunnel experiments [4]. the airplane model in simulink® consists of partial blocks. these blocks represent basic mathematical operations or functions and tools that are necessary for modelling, for example memory, delay, and signal sinks and sources (for an illustration, see fig. 4). the mathematical model of the airplane was compiled as a continuous system, i.e., the calculations were not performed as a time-defined sequence; instead, the time interval changed according to the magnitude of the outcomes of the calculations. this solution guaranteed that all relevant quantities would be captured, because the time between two calculations was in the order of milliseconds, which means a frequency more than two orders of magnitude higher than the highest natural frequency of the airplane movement. the model in simulink® was divided into subblocks representing the individual equations. the main initial conditions were defined in an external .m file, which had to be run before starting the main simulation in simulink®. the rest of the initial conditions and quantities could be set up directly in the simulink® scheme. these conditions were, for example, initial velocity, altitude and flight path.

fig. 4: simulink® scheme representing the calculation of the first equation from equation 1

the input quantities to be modelled were:
• elevator deflection
• aileron deflection
• thrust
• wind, as defined in mil-f-8785c [6]

fig. 2: automatic measurement system (pressure regulator druck dpi 530, barometer druck dpi 145, pc, temperature sensor pt100, measured pressure sensor in a thermostatic box labio ls80, data acquisition control unit agilent 34970a, connected via rs 232 and hpib)

fig. 3: main axes

all the calculated output quantities could be shown in a graphical interface for visualisation, or saved. these quantities could be, for example, the euler angles and their derivatives, translational and angular movements, speed, altitude, etc. assuming that all the necessary aerodynamic parameters were available, the design of a non-linear aircraft model was worked out. the model describes the aircraft behaviour in almost all standard phases of flight. algorithms representing the autopilot control were also simulated. in order to make the simulation comprehensive, the model was extended by submodels of the wind and the actuators. a simplified diagram of the design simulation is shown in fig. 5.

3 results

the processed data from the automatic measurement system was used for designing a model corresponding to the basic parameters of the sensor. the results showed a significant temperature dependence. this dependence was easy to correct; however, taking into account the desired function of the sensor, it could also be ignored. the standard deviation was 0.031. by processing the 152 measured data items, the following results were obtained:
• 80 % of the results were within the interval ±0.5 m
• 60 % of the results were within the interval ±0.3 m
• 20 % of the results were within the interval ±0.1 m
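the second-phase statistics above reduce to counting errors inside tolerance bands; a minimal sketch, assuming the per-measurement altitude errors are available as a list (the synthetic gaussian data here is only a stand-in for the 152 measured items):

import random

random.seed(1)
# stand-in for the 152 measured altitude errors [m]; the real values came
# from the druck/agilent measurement chain described above
errors = [random.gauss(0.0, 0.35) for _ in range(152)]

for tol in (0.5, 0.3, 0.1):
    within = sum(1 for e in errors if abs(e) <= tol)
    print(f"{100.0 * within / len(errors):5.1f} % of results within ±{tol} m")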
fig. 5: simulink® block scheme (airplane mathematical model, sensor models with filtration algorithm, actuators, pitch autopilot, roll autopilot, data visualisation)

a decision on the suitability of using the sensor as an altimeter could have been made on the basis of the manufacturer's data stated in the supplied datasheets. unfortunately, this method was inapplicable in the case of using the pressure sensor as a vertical speedometer. the definition for calculating vertical speed is obvious from fig. 6 and eq. 3: vertical speed is defined as the change of altitude in time,

vertical speed = Δh / Δt        (3)

fig. 6: calculating the vertical speed

it is evident from graph 1 that the steps of the discrete time derivative of altitude can cause undesirable step changes in the measurement of vertical speed. these steps could be eliminated by implementing a filter, but this would involve unacceptable time delays. surprisingly, thanks to the robustness of the system, the system itself was resistant to sharp and large steps of the indicated vertical speed. graph 2 shows the resulting change of altitude. the control mechanism works by holding the vertical speed at the set value (in this case 0) and holding the set altitude value.

graph 1: discrete derivative of the measured altitude (indicated vertical speed)
graph 2: resulting change of altitude

4 conclusion

the simulated flight quantities were evaluated by comparison with real flight records submitted by real sojka operators. this correspondence certifies the correctness of the simulink® airplane model. the model of the uav design system helps to create a powerful tool for suitability testing of the autopilot and its sensors. this procedure can speed up the choice of sensors, which reduces the cost of their implementation into the uav autopilot. a particular result of this study is the decision to use a low-cost altimeter for vertical speed measurements. this design method is also suitable for further use in uav design system simulation. another possible use of the method is in evaluating sensor quality as it ages, for the purposes of flight safety.

5 acknowledgment

the author thanks mr. z. cech, from delong instruments ltd., brno, czech republic, who described the mathematical model of the uav sojka, and mr. v. dvorak from ctu prague, who designed the basic version of the automatic measurement system for pressure measurement.

references

[1] vtul a pvo o. z. (air force research institute), czech republic, http://www.vtul.cz/.
[2] http://www.mathworks.com/.
[3] etkin, b., reid, l. d.: dynamics of flight – stability and control. john wiley & sons, inc., 1996, 3rd ed.
[4] cech, z.: využití magnetického pole země ke stabilizaci bezpilotních prostředků (usage of the earth's magnetic field for stabilising uav flight) (in czech), dissertation work, military academy antonina zapotockeho, brno, cz, 1989.
[5] proks, m.: "měření modelu letounu sojka-m4 v aerodynamickém tunelu" (measurement of the sojka-m4 airplane model in a wind tunnel, in czech), technical report z-3763/02, aeronautical research and test institute in prague, 2002.
[6] http://www.pdiaero.com/downloads/download_files/mil-f-8785c.pdf, section 3.7

ing. petr kaňovský
e-mail: kanovsp@fel.cvut.cz
department of measurements
czech technical university in prague
faculty of electrical engineering
technická 2, 166 27 prague 6, czech republic

dr. ladislav smrcek
dr. colin goodchild
aerospace engineering department
james watt building
university of glasgow
glasgow, g12 8qq, scotland, u.k.

fig. 7, fig. 8: the uav sojka (cf. fig. 1)
self-matching properties of beatty sequences

z. masáková, e. pelantová

we study the self-matching properties of beatty sequences, in particular of the graph of the function ⌊jθ⌋ against j, for every quadratic unit θ ∈ (0, 1). we show that translation in the argument by an element g_i of a generalized fibonacci sequence almost always causes a translation of the value of the function by g_{i−1}. more precisely, for fixed i ∈ ℕ, we have ⌊(j + g_i)θ⌋ = g_{i−1} + ⌊jθ⌋ for j ∉ u_i. we determine the set u_i of mismatches and show that it has a low frequency, namely θ^i.

keywords: beatty sequences, fibonacci numbers, cut-and-project scheme.

1 introduction

sequences of the form (⌊jα⌋)_{j∈ℕ} for α > 1, now known as beatty sequences, were first studied in the context of the famous problem of covering the set of positive integers by disjoint sequences [1]. further results in the direction of so-called disjoint covering systems are due to [2], [3], [4] and others. other aspects of beatty sequences were then studied, such as their generation using graphs [5], their relation to generating functions [6, 7], their substitution invariance [8, 9], etc. a good source of references on beatty sequences and other related problems can be found in [10, 11]. in [12] the authors study the self-matching properties of the beatty sequence (⌊jτ⌋)_{j∈ℕ} for the golden ratio τ = (√5 + 1)/2. their study is rather technical; for their proof they use the zeckendorf representation of integers as a sum of distinct fibonacci numbers. the authors also state the open question of whether the results obtained can be generalized to irrationals other than τ. in our paper we answer this question in the affirmative. we show that beatty sequences (⌊jθ⌋)_{j∈ℕ} for quadratic pisot units θ have a similar self-matching property, and for our proof we use a simpler method, based on the cut-and-project scheme. it is interesting to note that beatty sequences, fibonacci numbers and the cut-and-project scheme have attracted the attention of physicists in recent years because of their applications in the mathematical description of non-crystallographic solids with long-range order, the so-called quasicrystals, discovered in 1982 [13]. the first observed quasicrystals revealed the crystallographically forbidden rotational symmetry of order 5. this necessitates, for an algebraic description of the mathematical model of such a structure, the use of the quadratic field ℚ(τ).
such a model is self-similar with the scaling factor τ. later, the existence of quasicrystals with 8-fold and 12-fold rotational symmetries was observed, corresponding to mathematical models with self-similar factors 1 + √2 and 2 + √3. note that all three factors are quadratic pisot units, i.e., they belong to the class of numbers for which the result of bunder and tognetti is generalized here.

2 quadratic pisot units and the cut-and-project scheme

the self-matching properties of the beatty sequence (⌊jθ⌋)_{j∈ℕ} are best displayed on the graph of ⌊jθ⌋ against j = 1, 2, 3, … an important role is played by the fibonacci numbers, f_0 = 0, f_1 = 1, f_{k+1} = f_k + f_{k−1} for k ≥ 1. the result of [12] states that

⌊(j + f_i)τ⌋ = ⌊jτ⌋ + f_{i+1},        (1)

with the exception of isolated mismatches of frequency τ^{−i}, namely at points of the form j = k f_{i−1} + ⌊k f_i τ⌋, k ∈ ℤ. our aim is to give a very simple proof of these results that is valid for all quadratic units θ ∈ (0, 1). every such unit is a solution of a quadratic equation x² + mx = 1 with m ∈ ℕ, or x² + 1 = mx with m ∈ ℕ, m ≥ 3. the considerations differ slightly in the two cases.

a) let θ ∈ (0, 1) satisfy θ² + mθ = 1 for m ∈ ℕ. the algebraic conjugate θ′ of θ, i.e., the other root of the equation, satisfies θ′ = −1/θ < −1. we define the generalized fibonacci sequence

g_0 = 0, g_1 = 1, g_{n+2} = m g_{n+1} + g_n, n ≥ 0.        (2)

it is easy to show by induction that for i ∈ ℕ we have

(−1)^{i+1} θ^i = g_i θ − g_{i−1}  and  (−1)^{i+1} θ′^i = g_i θ′ − g_{i−1}.        (3)

b) let θ ∈ (0, 1) satisfy θ² + 1 = mθ for m ∈ ℕ, m ≥ 3. the algebraic conjugate θ′ of θ satisfies θ′ = 1/θ > 1. we define

g_0 = 0, g_1 = 1, g_{n+2} = m g_{n+1} − g_n, n ≥ 0.        (4)

in this case we have, for i ∈ ℕ,

θ^i = g_i θ − g_{i−1}  and  θ′^i = g_i θ′ − g_{i−1}.        (5)

the proof we give here is based on the algebraic expression of one-dimensional cut-and-project sets [14]. let v_1, v_2 be the straight lines in ℝ² spanned by the vectors

x⃗_1 = (−θ′, 1)/(θ − θ′)  and  x⃗_2 = (θ, −1)/(θ − θ′),

respectively. the projection of the square lattice ℤ² on the line v_1 along the direction of v_2 is then given by the decomposition

(a, b) = (a + bθ) x⃗_1 + (a + bθ′) x⃗_2, for (a, b) ∈ ℤ².

for the description of the projection of ℤ² on v_1 it suffices to consider the set

ℤ[θ] := {a + bθ : a, b ∈ ℤ}.

the integral basis of this free abelian group is (1, θ′), and thus every element x of ℤ[θ′] = ℤ[θ] has a unique expression in this basis. we will say that a is the rational part of x = a + bθ′ and b is its irrational part. since θ′ is a quadratic unit, ℤ[θ′] is a ring and, moreover, it satisfies

θ′ ℤ[θ′] = ℤ[θ′].        (6)

a cut-and-project set is the set of projections of points of ℤ² to v_1 that are found in a strip of given bounded width, parallel to the straight line v_1.
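before the formal development, the identities (3) and (5) and the announced self-matching behaviour are easy to check numerically; a minimal sketch (the function name and sample size are arbitrary; m = 1 in case a) reproduces the golden-ratio case of [12]):

import math

def check(m, case_a=True, i=4, N=2000):
    # quadratic unit in (0,1): case a) t^2 + m*t = 1, case b) t^2 + 1 = m*t
    t = (math.sqrt(m*m + 4) - m) / 2 if case_a else (m - math.sqrt(m*m - 4)) / 2
    g = [0, 1]
    for _ in range(i):
        g.append(m*g[-1] + g[-2] if case_a else m*g[-1] - g[-2])
    # identities (3)/(5): |g_i*t - g_{i-1}| = t^i
    assert abs(abs(g[i]*t - g[i-1]) - t**i) < 1e-9
    # self-matching: floor((j+g_i)*t) - floor(j*t) equals g_{i-1} up to rare mismatches
    diffs = [math.floor((j + g[i])*t) - math.floor(j*t) for j in range(1, N + 1)]
    mism = sum(1 for d in diffs if d != g[i-1])
    print(f"m={m}, case {'a' if case_a else 'b'}: mismatch rate "
          f"{mism/N:.4f} vs theta^{i} = {t**i:.4f}")

check(1, case_a=True)    # fibonacci / golden-ratio case
check(3, case_a=False)   # a case b) example with g_{n+2} = 3*g_{n+1} - g_n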
formally, for a bounded interval � we define � �� �( ) , ,� � � �a b a b a b� � �� note that a b �� corresponds to the projection of the point (a, b) to the straight line v1 along v2, whereas a b � corresponds to the projection of the same lattice point to v2 along v1. among the simple properties of the cut-and-project sets that we use here are � � � �( ) ( ) � 1 1 , � �� �� � � �( ) ( ), where the latter is a consequence of (6). if the interval � is of unit length, one can derive directly from the definition a simpler expression for � �( ). in particular, we have � � � �� ��[ , ) [ , )0 1 0 1� � � � � �a b a b b b b� � � � � , (7) where we use that the condition 0 1� �a b� is satisfied if and only if � � � �a b b� � � � . let us mention that the above properties of one-dimensional cut-and-project sets, and many others, are explained in the review article [14]. 3 self-matching property of the graph � �j� against j an important role in the study of the self-matching properties of the graph � �j� against j is played by the generalized fibonacci sequence ( )gi i�� , defined by (2) and (4), respectively. it turns out that shifting the argument j of the function � �j� by the integer gi results in shifting the value by gi 1, with the exception of isolated mismatches with low frequency. the first proposition is an easy consequence of the expressions of �� as an element of the ring � �� � in the integral basis 1, �, given by (3) and (5). theorem 1 let � �( , )0 1 satisfy � �2 1 �m and let ( )gi i� � 0 be defined by (2). let i ��. then for j �� we have � � � �� � ( ) ( )j g j g ji i i � 1 where � � i ij( ) ,( )� 0 1 1 . the frequency of integers j for which the value i j( ) is non-zero is equal to � � � �i n i ij n j n j n : lim # , ( ) � � � � � � �� � 0 2 1 . proof. the first statement is trivial. for, we have � � � � � �� � � � � � � � � � � i i i i i i j j g j g j j g g j j ( ) ( ) ( ) � � � 1 1 11� � � �� i i� 0 1 1, ( ) . (8) the frequency �i is easily determined in the proof of theorem 1. � in the following theorem we determine the integers j for which i j( ) is non-zero. from this, we easily derive the frequency of such mismatches. theorem 2 with the notation of theorem 1, we have i i ij if j u otherwise, ( ) , ( ) � � � � � 0 1 1 where � �� �u k g k g k k gi i i i i� � � � � � � � � � 1 1 0 1 2 � �, ( ) . before starting the proof, let us mention that for i even, the set ui can be written simply as � �� �u k g k g ki i i� � 1 � � . for i odd, the element corresponding to k � 0 is equal to gi instead of 0. the distinction according to the parity of i is necessary here, since unlike the paper [12], we determine the values of i j( ) for j ��, not only for. proof. it is convenient to distinguish two cases according to the parity of i. � first let i be even. it is obvious from (8), that � � i j( ) ,� 0 1 and � � � � �i ij j j( ) [ , ).� �1 0if and only if (9) let us denote by m the set of all such j, � �� � � � m j j j j k j k i i � � � � � � � � � � � � � � � [ , ) [ , ), 0 0 for some 22 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 47 no. 2–3/2007 therefore m is formed by the irrational parts of the elements of the set � �k j k j g g k i i i i i � � � � � � � � � � � � � � � [ , ) [ , ) [ , ) ( ) 0 0 0 1 1 � � � �� � �k k� � , where the last equality follows from (3) and (7). 
separating the irrational part we obtain � �� � � �� � m k g m k g k g k g k k g k u i i i i i i � � � � � 1 1 � � � � , where we have used the equations � � �� �2 1m and mg g gi i i � 1 1. � now let i be odd. then from (8), � � i j( ) ,� 0 1 and � � � � �i ij j j( ) [ , ).� � 1 1 1if and only if (10) let us denote by m the set of all such j, � �� � � � m j j j j k j k i i � � � � � � � � � � � � � � � 1 0 0 [ , ) [ , ), .for some therefore m is formed by the irrational parts of elements of the set � �k j k j i i i i � � � � � � � � � � � � � [ , ) [ , ) [ , ) ( [ , ) 0 0 1 0 1 0 1 � � � � �� �) ( ) .� � � � � � �g g k k ki i 1 1 � separating the irrational part we obtain � �� � � �� � m k g m k g k g g k k g g k k k g i i i i i i i � � � � � 1 1 1 1 � � � �( ) � �� �g k k ui i( ) ,� � �1 � where we have used the equation � � �� �2 1m , mg g gi i i � 1 1 and � � � � �k k� � . let us recall that the weyl theorem [15] states that numbers of the form � �j j� � , j ��, are uniformly distributed in (0, 1) for every irrational �. therefore the frequency of those j �� that satisfy � �j j i� � � ( , )0 1 is equal to the length of the interval i. therefore one can derive from (9) and (10) that the frequency of mismatches (non-zero values i j( )) is equal to �i, as stated by theorem 1. � if � �( , )0 1 is the quadratic unit satisfying � �2 1 � m , then the considerations are even simpler, because expression (5) does not depend on the parity of i. we state the result as the following theorem. theorem 3 let � �( , )0 1 satisfy � �2 1 � m and let ( )gi i� � 0 be defined by (4). for i ��, put � �� �v k g k g ki i i� � 1 1( )� � . then for j �� we have � � � �� � ( ) ( )j g j g ji i i � 1 , where i ij if j v otherwise. ( ) , � �� � � 0 1 the density of the set ui of mismatches is equal to � i. proof. the proof follows the same lines as proofs of theorems 1 and 2. � 4 conclusions one-dimensional cut-and-project sets can be constructed from �2 for every choice of straight lines v1, v2, if they have irrational slopes. however, in our proof of the self-matching properties of the beatty sequences we strongly use the algebraic ring structure of the set � �� �� , and its scaling invariance with the factor �� , namely � � � �� � �� � �� � . for this, �� must necessarily be a quadratic unit. however, it is plausible that, even for other irrationals �, some self-matching property is displayed by the graph � �j� against j. to show that, other methods would be necessary. 5 acknowledgments the authors acknowledge financial support from the czech science foundation ga čr 201/05/0169, and from the grant lc06002 of the ministry of education, youth and sports of the czech republic. references [1] beatty, s.: amer. math. monthly, vol. 33 (1926), no. 2, p. 103–105. [2] fraenkel, a. s.: the bracket function and complementary sets of integers. canad. j. math., 21, 1969, 6–27 [3] graham, r. l.: covering the positive integers by disjoint sets of the form � �� �n n� � �: , ,1 2 � . j. combinatorial theory ser. a, vol. 15 (1973), p. 354–358. [4] tijdeman, r.: exact covers of balanced sequences and fraenkel’s conjecture. in algebraic number theory and diophantine analysis (graz, 1998), berlin: de gruyter 2000, p. 467–483. [5] de bruijn, n. g.: updown generation of beatty sequences. nederl. akad. wetensch. indag. math., vol. 51 (1989), p. 385–407. [6] komatsu, t.: a certain power series associated with a beatty sequence. acta arith., vol. 76 (1996), p. 109–129. 
[7] o’bryant, k.: a generating function technique for beatty sequences and other step sequences. j. number theory, vol. 94 (2002), p. 299–319. [8] komatsu, t.: substitution invariant inhomogeneous beatty sequences. tokyo journal math., vol. 22 (1999), p. 235–243. [9] parvaix, b.: substitution invariant sturmian bisequences. thor. nombres bordeaux, vol. 11 (1999), p. 201–210. [10] brown, t.: descriptions of the characteristic sequence of an irrational. canad. math. bull., vol. 36 (1993), p. 15–21. © czech technical university publishing house http://ctn.cvut.cz/ap/ 23 acta polytechnica vol. 47 no. 2–3/2007 [11] stolarsky, k.: beatty sequences, continued fractions, and certain shift operators. canad. math. bull., vol. 19 (1976), p. 473–482. [12] bunder, m., tognetti, k.: on the self matching properties of [j�]. discr. math., vol. 241 (2001), p. 139–151. [13] shechtman, d., blech, i., gratias, d., cahn, j. w.: metallic phase with long-range orientational order and no translation symmetry. phys. rev. lett., vol. 53 (1984), p. 1951–1953. [14] gazeau, j. p., masáková, z., pelantová, e.: nested quasicrystalline discretization of the line. in: physics and number theory (editor: l. nyssen), vol. 10 of irma lectures in mathematics and theoretical physics, zürich, ems 2006, p. 79–132. [15] weyl, h.: über die gleichverteilung von zahlen mod. eins. math. ann., vol. 77 (1916), p. 313–352. doc. ing. zuzana masáková, ph.d. phone: +420 224 358 544 e-mail: masakova@km1.fjfi.cvut.cz, prof. ing. edita pelantová, csc. phone: +420 224 358 544 e-mail: pelantova@km1.fjfi.cvut.cz doppler institute for mathematical physics and applied mathematics czech technical university in prague faculty of nuclear sciences and physical engineering trojanova 13 120 00 praha 2, czech republic 24 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 47 no. 2–3/2007 ap06_3.vp 1 terminology and notation consider a sparse n×n matrix a with elements aij; i j n, ,� 1 . the largest distance between nonzero elements in any row is the bandwidth of matrix a and is denoted by �b, i.e., l ai j ijj� �min ( ; )0 r ai j ijj� �max ( ; )0 �b i i i� � �max ( )r l 1 1.1 storage schemes for sparse matrices 1.1.1 compressed sparse row (csr) format matrix a is represented by 3 linear arrays a, adr, and ci (see fig. 1). array a stores the nonzero elements of input matrix a, array adr [1, …, n] contains indexes of the initial nonzero elements of rows of a, and array ci contains column indexes of nonzero elements of a. hence, the first nonzero element of row j is stored at index adr [ j] in array a. © czech technical university publishing house http://ctn.cvut.cz/ap/ 3 czech technical university in prague acta polytechnica vol. 46 no. 3/2006 performance aspects of sparse matrix-vector multiplication i. šimeček sparse matrix-vector multiplication (shortly spm×v) is an important building block in algorithms solving sparse systems of linear equations, e.g., fem. due to matrix sparsity, the memory access patterns are irregular and utilization of the cache can suffer from low spatial or temporal locality. approaches to improve the performance of spm×v are based on matrix reordering and register blocking [1, 2], sometimes combined with software-pipelining [3]. due to its overhead, register blocking achieves good speedups only for a large number of executions of spm×v with the same matrix a. 
we have investigated the impact of two simple sw transformation techniques (software-pipelining and loop unrolling) on the performance of spm×v, and have compared it with several implementation modifications aimed at reducing computational and memory complexity and improving the spatial locality. we investigate performance gains of these modifications on four cpu platforms. keywords: sparse matrix-vector multiplication, code restructuring, loop unrolling, software pipelining, cache hierarchy. fig. 2: the idea of the static l-csr format: a) a sparse matrix a in dense format, b) the static l-csr representation of a fig. 1: the idea of the csr format: a) a sparse matrix a in dense format, b) the csr representation of a 1.1.2 length-sorted csr (l-csr) storage format the main idea is explained in [4]. the data is represented as in the csr format, but the rows are sorted by length in increasing order. this means that the length of row i is less or equal to the length of row i � 1. there are two variants: 1. static: the rows are physically stored in the sorted order in the csr format (see fig. 2). 2. dynamic: the original csr format is extended with two additional arrays (see fig. 3). array member [1 … n] contains the indexes of the rows after sorting. array begin[1 … �b] contains the indexes into the array member: begin [i] is the index of the first row of length i in the array member. 1.2 code restructuring for demonstration purposes, we will use the following pseudocode: an example code s � 0.0; for i � 1 to n do s � s � a[i]; 1.2.1 loop unrolling modern cpus with deep instruction and arithmetic logic unit (shortly alu) pipelines achieve peak performance if they execute straight serial codes without conditional branches. a non-optimizing compiler translates a for cycle with n iterations into a code with a single loop in which the condition must be tested n times, even if the loop body is very small. loop unrolling by unrolling factor uf consists in constructing another loop whose body consists of uf instances of the loop body in a sequence followed by a clean-up sequence if n is not a multiple of uf . this makes the serial code longer, so that the instructions can be better scheduled, the internal pipeline can be better utilized, and the number of condition tests drops from n to � �n uf . loop unrolling applied to the example code (uf � 2) s � 0.0; for i � 1 to n step 2 do s � s � a [i]; s � s � a [i � 1]; 1.2.2 loop unrolling-and-jam (loop jamming) since operations in the floating point unit (shortly fpu) on modern cpus are multi-stage pipelined operations, dependences between iterations can prevent the floating-point pipeline from filling, even if unrolling is used. to improve pipeline utilization in dense matrix codes that have a recurrence in the inner loop, the unroll-and-jam transformation is often used. the transformation consists in unrolling the outer loop and fusing the resulting inner loop bodies. this can increase floating point pipeline utilization by interleaving the computation of multiple independent recurrences. loop unrolling-and-jam applied to the example code (uf � 2) s1 � 0.0; s2 � 0.0; for i � 1 to n step 2 do s1 � s1 � a[i]; s2 � s2 � a[i+1]; s � s1 � s2; 1.2.3 software pipelining the initial instruction(s) of the first iteration is/are moved into the prologue phase, and the final instruction(s) of the last iteration is/are moved into the epilogue phase. 
this technique is usually combined with loop unrolling and makes the loop code larger and more efficient for instruction scheduling. 4 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 46 no. 3/2006 czech technical university in prague fig. 3: the idea of the dynamic l-csr format: a) a sparse matrix a in dense format, b) the dynamic l-csr representation of a software pipelining applied to the example code s � 0.0; a1 � a[1]; for i � 2 to n do a2 � a[i]; s � s � a1; a1 � a2; s � s � a2; 1.2.4 sparse matrix-vector multiplication consider a sparse n×n matrix a stored in the format csr as defined in the previous section and input dense array x [1, …, n] representing vector x. the goal is to compute output dense array y [1, …, n] representing vector y � ax. the following algorithm mvm_csr is a straightforward implementation of spm×v. algorithm mvm_csr low � adr [1]; for i � 1 to n do s � 0.0; up � adr [i + 1]; for j � low to up do k � c [ j]; s � a [ j]*x [k]; y [i] � s; low � up; 2 improving the performance of sparse matrix-vector multiplication the mvm_csr code stated above has poor performance on modern processors. due to matrix sparsity, the memory access patterns are irregular and the utilization of caches suffers from low spatial and temporal locality. the multiplication at codeline (4) requires indirect addressing, which causes performance degradation due to the small cache-hit ratio. this problem is difficult to solve in general [5] and it is even more difficult if the matrix has only a few nonzero elements per row. in our case, we consider sparse matrices produced by discretization of 2d differential equations of the second order with typical stencils. these matrices contain only a few (typically between 4 and 20) nonzero elements per row. the performance of spm×v is influenced by sw transformations (shortly swt) and implementation decisions (shortly id). a) using explicit software preload ( swt, spm×v_(a) ) the elements of arrays a and ci are loaded in advance by using software-pipelining in the loop at codeline (4). this should hide memory system latencies. b) interleaving of adjacent rows ( swt, spm×v_(b) ) two (or more) adjacent rows are computed simultaneously by using loop unrolling in the loop at codeline (4). this should improve fpu pipeline utilization. c) condensed implementation of the csr format ( id, spm×v_(c) ) the matrix in the csr format is not represented by two independent arrays ci and a, but by a single array that holds two records. this should improve spatial locality, because elements of these arrays are always used together. d) use of pointers in array ci ( id, spm×v_(d) ) the array ci contains pointers to array x instead of indexes into it. this should decrease the amount of work for the operand address decoder in the instruction pipeline. e) use of the static l-csr format for storing a ( id, spm×v_(e) ) arrays a and ci are reordered to the static l-csr format. this modification can save some fpu and alu operations. the number of conditional branches is strongly reduced, and operations for loading zero into floating point registers are completely removed. for rows with a very small number of nonzero elements, this can have a significant impact. f) using single precision ( id, spm×v_(f) ) all elements of array a are stored in the single precision format (float). this floating point format requires half of the space of the double format, so the memory requirements for storing a drop by 33 %. 
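for concreteness, a minimal python transcription of the csr layout of fig. 1 and of algorithm mvm_csr (python stands in here for the measured c implementations and uses 0-based indexing; illustrative only, not tuned code):

# csr arrays for a sparse matrix and the straightforward spmxv of mvm_csr
def dense_to_csr(M):
    a, ci, adr = [], [], [0]
    for row in M:
        for j, v in enumerate(row):
            if v != 0.0:
                a.append(v)      # nonzero values, row by row
                ci.append(j)     # their column indexes
        adr.append(len(a))       # adr[i] .. adr[i+1] spans row i
    return a, ci, adr

def mvm_csr(a, ci, adr, x):
    n = len(adr) - 1
    y = [0.0] * n
    for i in range(n):
        s = 0.0
        for j in range(adr[i], adr[i + 1]):
            s += a[j] * x[ci[j]]   # the indirect access x[ci[j]] is the locality bottleneck
        y[i] = s
    return y

M = [[4.0, 0.0, 1.0],
     [0.0, 2.0, 0.0],
     [1.0, 0.0, 3.0]]
a, ci, adr = dense_to_csr(M)
print(mvm_csr(a, ci, adr, [1.0, 1.0, 1.0]))   # -> [5.0, 2.0, 4.0]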
3 hw and sw configuration the impact of using these transformations of spm×v on the performance was evaluated empirically by measurements on four different processors: 1. ibm power 3, 200 mhz, 1 gb ram, 32 kb instruction and 64 kb data l1 cache (128 b cache line size), and 1 mb l2 cache (32 b cache line size), os aix 4.3.2, ibm c compiler, used switches: -o5. 2. amd opteron, 1.6 ghz, 1 gb ram, 64 kb instruction and 64 kb data l1 cache, and 1 mb l2 cache, os linux debian, kernel version 2.4.18, gnu c compiler, used switches: -o3. 3. intel pentium iii coppermine 1 ghz, 512 mb ram, 16 kb instruction and 16 kb data l1 cache (32 byte cache line size), and 256 kb l2 cache (32 byte cache line size), os linux debian, kernel version 2.4.18, intel compiler version 6.0 build 020312z, used switches: icc -o3 -fno_alias -pc64 -tpp6 -xk -ipo -align -zp16. 4. sun ultrasparc iiii (sparc v9), 1 ghz, 1 gb ram, 32 kb instruction and 64 kb data l1 cache, and 1 mb l2 cache, os sunos 5.9, gnu c compiler, used switches: -o3. the order n starts at n0 � 5100 and grows by a geometric series with factor q � 1.32 up to n10 � n0 q 10 � 80400. all measurements on all four processors were performed with the same set of matrices generated by an fem generator. the graphs illustrate the performance of spm×v on these four processors measured in mflops as a function of the order of matrix a: © czech technical university publishing house http://ctn.cvut.cz/ap/ 5 czech technical university in prague acta polytechnica vol. 46 no. 3/2006 � mv represents performance of standard spm×v. � mv_a represents performance of spm×v_(a). � mv_b2 represents performance of spm×v_(b) interleaving of 2 rows. � mv_b4 represents performance of spm×v_(b) with interleaving of 4 rows. � mv_c represents performance of spm×v_(c) with preload with 1 iteration distance. � mv_d represents performance of spm×v_(d). � mv_e represents performance of spm×v_(e). � mv_f represents performance of spm×v_(f). 4 results evaluation of the results: � using structures ( spm×v_(a) ) our assumptions were not fulfilled; this modification caused slowdown on all architectures. one possible reason is that the loop code becomes less clear and these compilers are unable to optimize it. � using explicit preload ( spm×v_(b) ) this modification caused slowdown on all architectures. the reason is that explicit sw preload collides with default compiler preload transformation. 6 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 46 no. 3/2006 czech technical university in prague fig. 4: the performance of algorithms on ibm power 3 fig. 5: the performance of algorithms on amd opteron � interleaving of 2 adjacent rows ( spm×v_(c) ) the effect of this modification is very different (pentium iii: speedup about 20 %, sun: constant performance, power 3: constant performance, opteron: slowdown about 40 %.). one possible reason is that explicit loop unroll-and-jam collides with default loop unroll-and-jam heuristics in compilers. � using pointers ( spm×v_(d) ) our assumptions were not fulfilled; this modification caused slowdown on all architectures. one possible reason is that the loop code becomes less clear and these compilers are unable to optimize it. � matrix a is stored in l-csr format ( spm×v_(e) ) this modification achieves speedup on all architectures due to slight reduction of the number of fpu operations and conditional branches. the main drawback of this method is that “typical” row lengths must be known at compile-time. 
� using single precision ( spm×v_(f) ) this modification achieves speedup on all architectures caused by 33 % smaller amount of data for matrix and by 50 % smaller amount of data for vectors. the main drawback of this method is lower precision of the resulting vector. © czech technical university publishing house http://ctn.cvut.cz/ap/ 7 czech technical university in prague acta polytechnica vol. 46 no. 3/2006 fig. 6: the performance of algorithms on intel pentium 3 fig. 7: the performance of algorithms on sun ultrasparc iiii 5 conclusion we have tried to increase the performance of spm×v, one of most common routines in la. to fulfill this goal, we have used either sw code transformation techniques or some implementation decisions. we have measured the performance of several modifications of an spm×v algorithm on four different hw platforms. the results differ due to the use of different cpu architectures and compilers, but we can conclude that three of the techniques improve the performance of the code and can be used to accelerate spm×v. 6 acknowledgment this work was supported by mšmt under research program msm6840770014. references [1] heras, d. b., cabaleiro, j. c., rivera, f. f.: modeling data locality for the sparse matrix-vector product using distance measures. parallel computing, vol. 27 (2001), no. 7, p. 897–912, june 2001. [2] vuduc, r., demmel, j. w., yelick, k. a., kamil, s., nishtala, r., lee, b.: performance optimizations and bounds for sparse matrix-vector multiply. in: proceedings of supercomputing 2002. baltimore (md, usa), november 2002. [3] rollin, s., geus, r.: towards a fast parallel sparse matrix-vector multiplication. in: parallel computing: fundamentals and applications. (d’hollander, e. h., joubert, j. r., peters, f. j., sips, h. eds.), proc. of parco’99, imperial college press, 2000, p. 308–315. [4] white, j., sadayappan, p.: on improving the performance of sparse matrix-vector multiplication. in: proceedings of the 4th international conference on high performance computing (hipc ’97), ieee computer society, 1997, p. 578–587. [5] wolfe, m. j.: high-performance compilers for parallel computing. reading (massachusetts, usa): addison-wesley, 1995. ing. ivan šimeček phone: +420 224 357 268 e-mail: xsimecek@fel.cvut.cz department of computer science czech technical university in prague faculty of electrical engineering technická 2 166 27 praha 6, czech republic 8 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 46 no. 3/2006 czech technical university in prague ap07_4-5.vp 1 introduction there are a vast number of telecommunications groups in europe. to pick the companies for the valuation presented, the following criteria were used: � quality of the web page (english version), � negotiability on the stock market, � stability and integrity in accounting principles, � stability in the structure of the group. the companies were selected on the basis of these criteria between 2002 and 2005. i selected the following six telecommunications operators [1]: � ote (greek), � czech telecom (ct) (czech), � swisscom (swiss), � telekom austria (ta) (austrian), � telenor (norwegian), � teliasonera (ts) (swedish-finnish). each of these companies has a division that deals with the mobile segment and also a division that runs fixed lines. there are significant distinctions between the companies. 
for example, czech telecom and swisscom operate only in their home countries, while the others have subsidiaries also in other european countries (some of them in asia). 2 techno-economic indicators the first part of the comparison is based on techno-economic indicators. the following absolute size indicators were chosen: � number of customers, � number of employees, � revenues, � ebitda – earnings before interest, taxes depreciation and amortization, � capex – capital expenditures, � arpu – average revenue per user – revenue generated by a mobile customer per month. subsequently, several relative indicators were evaluated and used to compare the companies. the highest numbers of customers are reported by the northern groups, telenor and teliasonera. the highest numbers of employees are found in the ote group, but during the monitored years, the number decreased dramatically. the progression of the other indicators is very variable. 3 financial indicators i decided to divide the financial indicators into three main groups: � profitability indicators, � altman’s z-score, � stock market indicators. the currency conversion between eur, sek, chf, nok and czk is based on the exchange rate issued by the czech national bank. 3.1 profitability indicators the following ratios were used [2]: � roa – return on assets, � roe – return on equity, � ros – return on sales. each of these was calculated in two variants, one using ebit and the other using net profit in the numerator. 3.2 altman’s z-score the edward altman z-score formula for predicting bankruptcy is a multivariate formula for measuring the financial health of a company. it is a powerful diagnostic tool forecasting the probability that the company will go into bankruptcy. the z-score bankruptcy predictor combines five common business ratios, using a weighting system calculated by altman to determine the likelihood of a company going bankrupt. z a b c d e� � � � � � � � � �12 1 4 3 3 0 6 0 999. . . . . (1) a � working capital/total assets, b � retained earnings/total assets, c � ebit/total assets, d � market value of equity/book value of liabilities, e � sales/total assets. if the score is 3.0 or above – bankruptcy is not likely. if the score is 1.8 or less – bankruptcy is likely. a score between 1.8 and 3.0 is the gray area [5]. 14 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 47 no. 4–5/2007 valuation of companies jiří lisník this paper deals with valuating telecommunications companies. six groups operating mainly in the european countries are compared. the comparison is based on financial and techno-economic indicators. these indicators are used to value the company by the dcf method. however, many companies have only a very short history. to value these, classical methods like dcf or real options are not suitable. in this case, methods based on market confrontation seem to be efficient. a further exploration of these methods is also the topic of my graduate studies. keywords: valuation, companies, dcf, indicator, market confrontation. in this case, the use of the z-score formula is disputable, because this model was created in 1968 for the american companies. i use it only to compare the groups, not to predicting bankruptcy. the highest score is achieved by swisscom. another interesting result is displayed by ct for years 2004–2005, due to the sale of unneeded assets (especially phone boxes). 3.3 stock market indicators each of the selected companies is listed on a public stock exchange. 
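before turning to the individual stock market indicators, note that the z-score of eq. (1) in section 3.2 reduces to a few lines of code; a minimal sketch (the input figures below are arbitrary placeholders, not the groups' published accounts):

def altman_z(working_capital, retained_earnings, ebit,
             market_value_equity, book_value_liabilities,
             sales, total_assets):
    # eq. (1): z = 1.2a + 1.4b + 3.3c + 0.6d + 0.999e
    a = working_capital / total_assets
    b = retained_earnings / total_assets
    c = ebit / total_assets
    d = market_value_equity / book_value_liabilities
    e = sales / total_assets
    return 1.2*a + 1.4*b + 3.3*c + 0.6*d + 0.999*e

z = altman_z(working_capital=2.0, retained_earnings=5.0, ebit=3.0,
             market_value_equity=25.0, book_value_liabilities=10.0,
             sales=12.0, total_assets=20.0)
print(f"z = {z:.3f}",
      "-> bankruptcy not likely" if z >= 3.0 else
      "-> gray area" if z > 1.8 else "-> bankruptcy likely")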
3.3.1 earnings per share the eps indicator informs shareholders about the net profit that can be paid as a dividend. eps � net profit number of shares (2) 3.3.2 price to earnings ratio the p/e ratio of a stock (also called its „earnings multiple“, or simply „multiple“, „p/e“, or „pe“) is used to measure how cheap or expensive its share price is. p e � price per share earnings per share (3) 3.3.3 book value the book value is the shareholders’ equity of a business (assets – liabilities), as measured by the accounting ‘books’. bv � equity number of shares (4) © czech technical university publishing house http://ctn.cvut.cz/ap/ 15 acta polytechnica vol. 47 no. 4–5/2007 year 2002 2003 2004 2005 ct 1.979 1.202 2.480 4.586 ote 1.789 1.642 1.659 1.586 swisscom 3.576 4.022 4.906 4.692 ta 0.793 1.311 1.841 1.950 telenor 1.119 2.109 2.380 1.441 teliasonera 1.204 2.334 2.763 2.735 table 1: altman’s z-score 0,0 0,5 1,0 1,5 2,0 2,5 3,0 3,5 4,0 4,5 5,0 2002 2003 2004 2005 year z-score ct ote swisscom ta telenor ts fig. 1: altman’s z-score year 2002 2003 2004 2005 ct 12.96 -5.72 17.27 19.40 ote 22.27 27.15 10.65 �12.83 swisscom 490.79 492.95 475.41 613.48 ta 0.81 8.70 13.85 24.20 telenor �10.36 9.73 11.33 16.24 teliasonera �6.05 6.93 9.36 8.05 table 2: earningss per share –eps (in czk) -30 czk -20 czk -10 czk 0 czk 10 czk 20 czk 30 czk 2002 2003 2004 2005 year eps ct ote swisscom ta telenor ts fig. 2: eps (for better lucidity the values are shown only from �30 30; ) year 2002 2003 2004 2005 ct 18.88 -50.94 21.38 27.04 ote 14.90 12.51 37.83 �40.70 swisscom 17.76 17.24 18.61 12.61 ta 377.28 36.50 30.69 22.77 telenor �11.12 17.21 17.96 14.82 teliasonera �18.73 19.36 14.35 16.39 table 3: price tu earnings – p/e -60,00 -40,00 -20,00 0,00 20,00 40,00 60,00 2002 2003 2004 2005 year p/e ct ote swisscom ta telenor ts fig. 3: p/e (for better lucidity the values are shown only from �60 60; ) 4 comparison of the companies in most of the ratios used here the best values are achieved by swisscom. telekom austria takes second place. the subsequent places are occupied by the northern groups, telenor and teliasonera, which are not consolidated as they made many acquisitions between 2002 and 2005. in upcoming years they will restructure and consolidate. the fifth place goes to ct, which is being restructured at the present time. the last position is occupied by the greek operator, ote. 5 methods of valuation one way of categorizing valuation methods is as follows [3]: � market based – comparable market transactions or comparable companies. these methods assume that the value of the company can be determined by using as a reference market information on companies with similar characteristics as the company being valued. � income based – perhaps the most commonly-used set of valuation methods in the context of small-to-medium company acquisitions. financial performance methods attempt to measure historical performance and also to predict future performance in determining the value of the seller’s business to the buyer on a post-closing basis: � capitalization of profits, � dcf, � gross profit differential method, � excess profits method, � real options. � asset based – e. g. historical or replacement cost. if a company has a large portion of its value wrapped up in fixed assets, an appraiser may lean toward some type of asset valuation when attempting to price it. each of the methods mentioned above has its own pros and cons. 
selection of the optimal method depends on the following considerations: � history of the company (short, long), � type of company (private, public), � required accuracy of calculation and its complexity, � business cycle of the company. 6 purpose of valuation based on the character and extent of the available information about the company being valued and the actual purpose of the valuation itself, using a single method, various values can be obtained: � open-market value, � estimated realization value, � existing use value, � estimated restricted realization price, � depreciated replacement cost. 7 valuation of czech telecom the dfcff method [4] was chosen for the valuation here. ct was valued to january 1st 2006. swot and competition analyses were carried out, and the indicators calculated in the previous part were used to predict the cash flow. fcff ebit t dep wc invt t t t t� � � � � �( )1 � (5) fcfft – free cash flow to the firm, ebitt – earnings before interest and taxes, t – taxes, dept – depreciation, �wct – change in working capital, invt – investments. for the valuation, a two-phase model was used: v fcff wacc fcff wacc g waccb t t n n n t t n � � � � � �� � � � � ( ) ( ) 1 11 1 (6) vb – gross value, wacc – weighted average cost of capital, gn – growth. the share price: sp v ns v fc ns n b� � � (7) sp – share price, vn – nett value, fc – foreign capital, ns – number of shares. for calculating wacc and the cost of the own capital, the capm model was used. the shares were valued at a price of czk 580. the actual share prices at the prague stock exchange on december 30th, 16 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 47 no. 4–5/2007 year 2002 2003 2004 2005 ct 345.47 279.88 280.02 294.87 ote 224.93 237.36 212.73 195.96 swisscom 2 399.4 2 409.5 1 992.6 1 820.7 ta 158.60 171.06 167.05 166.46 telenor 81.19 79.49 79.48 98.55 teliasonera 81.61 85.80 87.85 87.40 table 4: book value –bv (in czk) 0 czk 50 czk 100 czk 150 czk 200 czk 250 czk 300 czk 350 czk 400 czk 2002 2003 2004 2005 year bv ct ote swisscom ta telenor ts fig. 4: bv (for better lucidity the values are shown only from �30 30; ) 2005 and january 1st 2006 were 524.50 czk and 527 czk, respectively [1]. 8 conclusion six telecommunications groups were compared using techno-economic and financial indicators. swisscom achieved the best values across most of the criteria considered. subsequently, several of these indicators were employed for a valuation of czech telecom by the dfcff method. the shares were valued at czk 580. on december 30th 2005 and january 1st 2006, the actual share prices at the prague stock exchange were 524.50 czk and 527 czk respectively. the dfcff method was chosen based on the characteristics of ct. this is a public company and its shares are traded on the prague stock exchange. a sufficient amount of information about ct is publicly available, including data from the past years. information for predicting the company’s future can be also found. the forecast of future revenues and expenses is the most important aspect of valuing the company using dcf methods. another important step is to determine wacc. for this purpose, historical data from the prague stock exchange was used. a sensitivity analysis on changing initial expectations was also conducted, see [1]. my future work will be on valuing companies with a short history of existence. historical data cannot be used, and hence it is complicated to predict the future. 
so far, a no unified theory is available for such cases. the market confrontation method may deal sufficiently with this issue, and this will be the focus of my work. references [1] lisník, j.: financial analysis of the telecommunications companies. diploma thesis, 2007, in czech. [2] kislingerová, e., hnilica, j.: financial analysis step by step. prague: c. h. beck, 2005, in czech. [3] mařík, m. et al.: methods of valuating companies. prague: ekopress, 2003, in czech. [4] kislingerová, e.: valuating company. prague: c. h. beck, 2001, in czech. [5] mrkvička, j.: financial analysis. prague: bilance, 1997, in czech. jiří lisník e-mail: j.lisnik@sh.cvut.cz dept. of economics, management and humanities czech technical university in prague faculty of electrical engineering technická 2 166 27 prague, czech republic © czech technical university publishing house http://ctn.cvut.cz/ap/ 17 acta polytechnica vol. 47 no. 4–5/2007 acta polytechnica https://doi.org/10.14311/ap.2022.62.0080 acta polytechnica 62(1):80–84, 2022 © 2022 the author(s). licensed under a cc-by 4.0 licence published by the czech technical university in prague on the wess-zumino model: a supersymmetric field theory daya shankar kulshreshthaa, ∗, usha kulshreshthab a university of delhi, department of physics and astrophysics, delhi – 110007, india b university of delhi, kirori mal college, department of physics, delhi – 110007, india ∗ corresponding author: dskulsh@gmail.com abstract. we consider the free massless wess-zumino model in 4d which describes a supersymmetric field theory that is invariant under the rigid or global supersymmetry transformations where the transformation parameter ϵ (or ϵ̄) is a constant grassmann spinor. we quantize the theory using the hamiltonian and path integral formulations. keywords: wess-zumino model, supersymmetric field theories, hamiltonian and path integral quantization. 1. introduction supersymmetry (susy) is a symmetry that rotates bosons into fermions and fermions into bosons. it is one of the beautiful symmetries of nature. also, a field theory (ft) which remains invariant under the rigid or global supersymmetry transformations (where the transformation paremeter is a constant grassman spinor) and which also satisfies the super poincare algebra (spa) is usually referred to as a supersymmetric field theory (sft). in this article, we consider the free massless wess-zumino model (wzm) in 4d which describes a sft. it may be important to mention here that the wzm is the first known example of an interacting 4d quantum field theory with linearly realised susy, studied by wess and zumino using the dynamics of a single chiral superfield (composed of a complex scalar and a spinor fermion). it may be important to mention that the wzm represents a typical sft which is of central importance in the theory of susy, supergravity and superstring theory (sst) and for further details we refere to the work of refs. [1–8]. the wzm describes an example of a non-manifest supersymmetry [5]. one could of course go to the formalism of superspace and superfields to construct a theory that has a manifest supersymmetry [5]. taking this theory as an example, it is possible to formulate supersymmertic field theories in different dimensions including in higher dimensions. the wzm also provides a basic framework for the study of ramond nievue schwarz (rns) sst [8] which is an example of a sst with non-manifest susy. further, starting with the wzm, it is also possible to construct a supergravity theory [1–6, 8]. 
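the closure relation {δ1, δ2}b = aμ∂μb above rests only on the clifford algebra {γμ, γν} = 2ημν, which is easy to verify numerically; a minimal sketch in the dirac representation (numpy is used purely as a checking device and is not part of the original analysis):

import numpy as np

I2 = np.eye(2)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

# dirac representation: gamma^0 has +-1 blocks, gamma^k has off-diagonal pauli blocks
g0 = np.block([[I2, 0*I2], [0*I2, -I2]])
gk = [np.block([[0*I2, s], [-s, 0*I2]]) for s in (sx, sy, sz)]
gamma = [g0] + gk
eta = np.diag([1.0, -1.0, -1.0, -1.0])

for mu in range(4):
    for nu in range(4):
        anti = gamma[mu] @ gamma[nu] + gamma[nu] @ gamma[mu]
        assert np.allclose(anti, 2 * eta[mu, nu] * np.eye(4)), (mu, nu)
print("clifford algebra {gamma^mu, gamma^nu} = 2 eta^{mu nu} verified")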
spa is a graded lie algebra that includes anticommutation relations (acr’s) involving the supercharge qa – the generator of the susy transformations. wzm is one of the simplest examples of a sft. in this article, we discuss the supersymmetry of wzm and present some remarks with respect to the rigid or global supersymmetry versus the local supersymmetry (which happens to be a supergravity theory). finally we consider the constraint quantization of this theory [7]. it is important to mention that the supersymmetry has profound applications in conformal hadron physics from light-front holography where it even has some observational prospects [9–11]. as mentioned above, the supersymmetry is a symmetry that relates bosonic and fermionic variables (or the bosons and fermions) so that: δb = ϵ̄f , δf = ϵ ∂b ; ∂ ≡ ∂µ (1) here, δ is bosonic, b is bosonic and f is fermionic. the transformation parameter ϵ (or ϵ̄) is a constant grassman spinor and is fermionic. grassman variables are anti-commuting. supergravity theory on the other hand is a theory that has “local supersymmetry” and it is invariant under local susy transformations where the transformation parameter depends on the spacetime xµ. so the transformation parameter for supergravity: ϵ(xµ) or ϵ̄(xµ) depends on xµ and hence supergravity is a “gauge theory” of gravity. in contrast to this the wzm is a supersymmetric ft with rigid or global (not local) supersymmetry. let us us consider two consecutive infinitesimal rigid supersymmetry transformations of a bosonic field b: δ1 b = ϵ̄1f , δ2f = ϵ2∂b (2) this then implies that the two internal susy transformations lead us to a spacetime translation: {δ1,δ2}b = aµ∂µb ; aµ = (ϵ̄2γµϵ1) (3) presence of a spacetime derivative of b on right hand side (rhs) of above equation suggests that the susy is an extension of the poincare spacetime symmetry: {qa,q̄b} = 2(γµ)abpµ (4) 80 https://doi.org/10.14311/ap.2022.62.0080 https://creativecommons.org/licenses/by/4.0/ https://www.cvut.cz/en vol. 62 no. 1/2022 on the wess-zumino model: a sft supercharge qa (a = 1, 2, 3, 4 in 4d) is the generator of susy transformations. it is related to the generator of spacetime translations pµ and therefore is not an internal symmetry generator. the susy transformation is an extention of the poincare spacetime symmetry. supercharge qa is a spinor. it is fermionic and anti-commuting. poincare algebra (pa) after including the supersymmetry becomes the spa. 2. the wess-zumino model the wzm is defined (on-shell) by the lagrangian density [5]: l := [ 1 2 (∂µa)∂µa− 1 2 m2a2 + 1 2 (∂µb)∂µb − 1 2 m2b2 −mga(a2 + b2) + ψ̄(iγν∂ν −m)ψ −g (ψ̄ψ a + iψ̄γ5ψb) − 1 2 g2(a2 + b2)2 ] (5) here a is a scalar field, b is a pseudoscalar field, ψ is a spin-1/2 majorana field (ψ = ψc = cψ̄t ). c is the charge conjugation matrix and a = a† and b = b†. all the fields here have the same mass m and they couple with the same strength g. this is in contrast to the non-susy ft’s. this is due to the fact that states of a particular representation of the super poincare algebra (spa) are characterized by the eigenvalue m2 of p 2 (= pµp µ) and different values of spin s. actually, all the fields belong to the same mass multiplet in spa. pauli-ljubanski polarization vector is defined as: wµ := 1 2 ϵµνρσp νmρσ (6) here p 2 = pµp µ w 2 = wµw µ (7) are casimir operators of pa that satisfy:[ p 2,mµν ] = 0[ p 2,pµ ] = 0[ w 2,mµν ] = 0[ w 2,pµ ] = 0 (8) we further have: p 2 = m2 > 0 w 2 = −m2s(s + 1). (9) where, m2 and (−m2s(s+1)) are the eigenvalues of p 2 and w 2. 
here s denotes the spin of the representation which assumes discrete values: s = 0, 1/2, 1, 3/2, . . . this representation is specified in terms of the mass m and spin s. physically a state in a representation (m,s) corresponds to a particle of rest mass m and spin s. also, since the spin projection s3 can take any value from −s to +s, (massive particles fall into (2s + 1)-dimensional multiplets). in wzm, all the fields namely, a, b, ψ, ψ̄ have the same mass m and they couple with the same strength g (in the unbroken susy) – in contrast to the nonsupersymmetric field theories. states of a particular representation of spa are characterized by the eigenvalue m2 of casimir operator p 2 and different values of spin s. w µ is proportional to p µ (generator of the poincare group): w µ = λp µ (10) and w0 = λp0 = −→ p · −→ j (11) where p µ = (p0 , −→ p ). (12) the constant of proportionality λ in wµ = λpµ is called helicity and it is defined by: λ := −→ p · −→ j p0 (13) for massless particles with λ := ±s where s = 0, 1/2, 1, . . . is the spin of representation. n = 1 is called as the minimal supersymmetry and n > 1 is called the extended supersymmetry. for simplicity we set (g = 0) yielding the lagrangian density of the free wzm [5]: l := [ 1 2 ∂µa∂ µa− 1 2 m2a2 + 1 2 ∂µb∂ µb − 1 2 m2b2 + ψ̄(iγν∂ν −m)ψ ] (14) theory is seen to be invariant (up to a total derivative) under the rigid susy transformation [5]: δa = ϵ̄ ψ δb = −i ϵ̄ γ5 ψ δψ = − (iγν∂ν + m) (a− iγ5 b) ϵ δψ̄ = ϵ̄ (a− iγ5 b) (iγν ←− ∂ ν −m) (15) here ϵ is a constant grasmann variable (which does not depend on spacetime x ≡ xµ) implying a global or rigid susy transformations. however, δψ and δψ̄ here, are seen to depend on spacetime derivatives of a and b. this implies that this is an extention of poincare spacetime symmetry (different than an internal symmetry). supercurrent jµ of the theory could be easily calculated to be [5]: jµ = [ i 2 ϵ̄(a− iγ5 b) (iγν ←− ∂ ν −m)γµ ψ ] ≡ [ 1 β ϵ̄ kµ ] (16) 81 daya shankar kulshreshtha, usha kulshreshtha acta polytechnica here β is a real constant which could be suitably choosen. the spinor charges qa are defined by [5]: qa := ∫ d3x k0a k0a = i 2 β [ {(a− iγ5 b)(iγν ←− ∂ ν −m)}γ0ψ ] a (17) here k0a are the spinor charge densities with a = 1, 2, 3, 4. spinor charges and spinor charge densities being fermionic satisfy spa and the spinor charges are seen to satisfy the anti-commutation relation (acr) [5]: {qa,q̄b} = 2pµ(γµ)ab (18) this explicitly shows that the wzm obeys the spa and it is a supersymmetric ft having a rigid or global susy. also, the supersymmetry of the theory is a nonmanifest supersymmetry. we now set m = 0 for making the fields to be massless, so that the free massless wzm is defined by the lagrangian density: l := [ 1 2 ∂µa∂ µa + 1 2 ∂µb∂ µb + ψ̄(iγν∂ν )ψ ] (19) this is the simplest example of a supersymmetric ft in 4d with a non-manifest supersymmetry. we obtain the free wzm by setting g = 0 and it is seen to be invariant, up to a total derivative, under the rigid susy transformations [5]: δa = ϵ̄ ψ δb = −i ϵ̄ γ5 ψ δψ = − (iγν∂ν ) (a− iγ5 b) ϵ δψ̄ = ϵ̄ (a− iγ5 b) (iγν ←− ∂ ν ) (20) here ϵ is a constant grasmann variable (which does not depend on spacetime x ≡ xµ). here, δψ and δψ̄ are seen to depend on spacetime derivatives of a and b which implies that this is an extention or generalization of the poincare spacetime symmetry. (ϵ, ϵ̄) being constant, implies that the symmetry is a rigid or global susy. 
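as a quick consistency check, the closure of two transformations (20) on the scalar a can be sketched; this is only an outline, with the overall normalisation left convention-dependent:

\[
\delta_2\,\delta_1 A
  \;=\; \bar\epsilon_1\,\delta_2\psi
  \;=\; -\,i\,(\bar\epsilon_1\gamma^\nu\epsilon_2)\,\partial_\nu A
        \;-\; (\bar\epsilon_1\gamma^\nu\gamma_5\,\epsilon_2)\,\partial_\nu B .
\]

for majorana spinors the vector bilinear \(\bar\epsilon_1\gamma^\nu\epsilon_2\) is antisymmetric and the axial bilinear \(\bar\epsilon_1\gamma^\nu\gamma_5\epsilon_2\) is symmetric under the exchange \(1 \leftrightarrow 2\), so the b-terms drop out of the commutator and

\[
[\delta_1,\delta_2]\,A \;\propto\; (\bar\epsilon_2\gamma^\nu\epsilon_1)\,\partial_\nu A ,
\]

a spacetime translation with parameter \(a^\nu = \bar\epsilon_2\gamma^\nu\epsilon_1\), in line with \(\{q_a,\bar q_b\} = 2(\gamma^\mu)_{ab}\,p_\mu\).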
it is also possible to consider it as a theory of a single complex scalar field and a fermionic field by combining the fields $A$ and $B$ as follows:

$$\phi(x) := (A + iB)/2, \qquad \phi^\star(x) = (A - iB)/2, \tag{21}$$

implying therefore:

$$\delta\phi = \bar\epsilon\,\bar\psi, \quad \delta\phi^\star = \epsilon\,\psi \qquad \text{and} \qquad \delta\psi_a = 2i\,(\sigma^\mu\bar\epsilon)_a\,\partial_\mu\phi^\star(x), \quad \delta\bar\psi_{\dot a} = -2i\,(\bar\sigma^\mu\epsilon)_{\dot a}\,\partial_\mu\phi(x) \tag{22}$$

since $A$ is a scalar field and $B$ is a pseudoscalar field, the complex combination $\phi(x)$ transforms under the parity transformation like complex conjugation. here, $\psi$ and $\bar\psi$ are not independent fields, as they are majorana spinor fields in the weyl formulation; hence the transformations $\delta\psi$ and $\delta\bar\psi$ are not independent, and one can be obtained from the other. the supercurrent $J^\mu$ of the theory is obtained as:

$$J^\mu = \Big[\tfrac{i}{2}\,\bar\epsilon\,(A - i\gamma_5 B)(i\gamma^\nu\overleftarrow{\partial}_\nu)\,\gamma^\mu\,\psi\Big] \equiv \Big[\tfrac{1}{\beta}\,\bar\epsilon\, K^\mu\Big] \tag{23}$$

here $\beta$ is a real constant. the spinor charges $Q_a$ are:

$$Q_a := \int d^3x\, K^0_a, \qquad K^0_a = \tfrac{i}{2}\,\beta\,\Big[\{(A - i\gamma_5 B)(i\gamma^\nu\overleftarrow{\partial}_\nu)\}\,\gamma^0\,\psi\Big]_a \tag{24}$$

here the $K^0_a$ are the spinor charge densities, with $a = 1, 2, 3, 4$. the wzm being a supersymmetric ft, the spinor charges and the spinor charge densities are seen to satisfy the spa, and the spinor charges satisfy the acr:

$$\{Q_a, \bar Q_b\} = 2\, P_\mu (\gamma^\mu)_{ab} \tag{25}$$

this implies that the wzm obeys the spa and is a supersymmetric ft with a rigid susy. the spa reads [5]:

$$[P_\mu, P_\nu] = 0 \tag{26}$$
$$[M_{\mu\nu}, P_\rho] = -i\,(\eta_{\mu\rho}\, P_\nu - \eta_{\nu\rho}\, P_\mu) \tag{27}$$
$$[M_{\mu\nu}, M_{\rho\sigma}] = -i\,(\eta_{\mu\rho} M_{\nu\sigma} + \eta_{\nu\sigma} M_{\mu\rho}) + i\,(\eta_{\mu\sigma} M_{\nu\rho} + \eta_{\nu\rho} M_{\mu\sigma}) \tag{28}$$
$$[P_\mu, Q_a] = 0 \tag{29}$$
$$[M_{\mu\nu}, Q_a] = -(\sigma^4_{\mu\nu})_{ab}\, Q_b, \qquad \sigma^4_{\mu\nu} := \tfrac{i}{4}\,[\gamma_\mu, \gamma_\nu] \tag{30}$$
$$\{Q_a, \bar Q_b\} = 2(\gamma^\mu)_{ab}\, P_\mu, \qquad \{Q_a, Q_b\} = -2\,(\gamma^\mu C)_{ab}\, P_\mu, \qquad \{\bar Q_a, \bar Q_b\} = 2\,(C^{-1}\gamma^\mu)_{ab}\, P_\mu \tag{31}$$

the spa has 14 generators: the 4 generators of spacetime translations $P_\mu$, the 6 generators of lorentz transformations $M_{\mu\nu}$, and the 4 spinor charges $Q_a$ (the majorana spinors). here the indices $a$ and $b$ run from 1 to 4 in 4d.

3. free massless wzm

we now set $m = 0$ to make the fields massless, so that the free massless wzm is defined by the lagrangian density:

$$\mathcal{L} := \Big[\tfrac{1}{2}\partial_\mu A\,\partial^\mu A + \tfrac{1}{2}\partial_\mu B\,\partial^\mu B + \bar\psi\, i\gamma^\nu\partial_\nu\,\psi\Big] \tag{32}$$

we break up the lagrangian density of the free massless wzm into bosonic and fermionic parts:

$$\mathcal{L} = \mathcal{L}^B + \mathcal{L}^F, \qquad \mathcal{L}^B = \tfrac{1}{2}\partial_\mu A\,\partial^\mu A + \tfrac{1}{2}\partial_\mu B\,\partial^\mu B, \qquad \mathcal{L}^F = \bar\psi\, i\gamma^\nu\partial_\nu\,\psi \tag{33}$$

further, $\mathcal{L}^F$ can be written in two different-looking but conceptually equivalent forms, which differ by a total derivative (t.d.), as follows:

$$\mathcal{L}^F_1 = i\,\bar\psi\gamma^\mu\partial_\mu\psi, \qquad \mathcal{L}^F_2 = \tfrac{i}{2}\Big[\bar\psi\gamma^\mu(\partial_\mu\psi) - (\partial_\mu\bar\psi)\gamma^\mu\psi\Big] \tag{34}$$

$$\mathcal{L}^F_1 - \mathcal{L}^F_2 = \tfrac{i}{2}\,\partial_\mu(\bar\psi\gamma^\mu\psi) = \tfrac{i}{2}\,\partial_\mu j^\mu, \qquad j^\mu = \bar\psi\gamma^\mu\psi \tag{35}$$

the theory described by $\mathcal{L}^F_1$ is seen to possess a set of two second-class constraints:

$$\rho_1 = (\pi + i\,\bar\psi\gamma^0) \approx 0, \qquad \rho_2 = \bar\pi \approx 0 \tag{36}$$

here, the fermi fields $\psi$ and $\bar\psi$ are to be treated as independent fields. the theory described by $\mathcal{L}^F_2$ is also seen to possess a set of two second-class constraints:

$$\chi_1 = (\pi + \tfrac{i}{2}\,\bar\psi\gamma^0) \approx 0, \qquad \chi_2 = (\bar\pi + \tfrac{i}{2}\,\gamma^0\psi) \approx 0. \tag{37}$$

the fermi fields $\psi$ and $\bar\psi$ in this latter case are not independent fields. this is consistent with the definition of majorana spinor fields (we remind ourselves that in the wzm the fermionic fields are majorana spinor fields). we now study the hamiltonian formulation of the theory [7]. the canonical momenta following from the lagrangian density of the wzm defined by $\mathcal{L} := (\mathcal{L}^B + \mathcal{L}^F)$ with $\mathcal{L}^F = \mathcal{L}^F_2$ (working with the signature $\eta_{\mu\nu} := \mathrm{diag}(+1, -1, -1, -1)$) are:

$$\pi_A := \frac{\partial\mathcal{L}}{\partial(\partial_0 A)} = \partial_0 A, \qquad \pi_B := \frac{\partial\mathcal{L}}{\partial(\partial_0 B)} = \partial_0 B \tag{38}$$

$$\pi := \frac{\partial\mathcal{L}}{\partial(\partial_0\psi)} = -\tfrac{i}{2}\,\bar\psi\gamma^0, \qquad \bar\pi := \frac{\partial\mathcal{L}}{\partial(\partial_0\bar\psi)} = -\tfrac{i}{2}\,\gamma^0\psi \tag{39}$$

the theory thus has 2 primary constraints (pc's):

$$\chi_1 = (\pi + \tfrac{i}{2}\,\bar\psi\gamma^0) \approx 0, \qquad \chi_2 = (\bar\pi + \tfrac{i}{2}\,\gamma^0\psi) \approx 0 \tag{40}$$

in principle, $\chi_1, \chi_2$ represent an infinite number of pc's, which could be labelled, say, by $\alpha, \beta$ (running from one to infinity); we, however, ignore these further labelings in our considerations.
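the commutator (28) can be checked explicitly in the four-dimensional vector representation, where $(M_{\mu\nu})^\rho{}_\sigma = i(\delta^\rho_\mu \eta_{\nu\sigma} - \delta^\rho_\nu \eta_{\mu\sigma})$. the numerical sketch below is our own illustration (not from the paper) and verifies eq. (28) for all index combinations:

import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])

def M(mu, nu):
    """lorentz generator (M_{mu nu})^rho_sigma = i(delta^rho_mu eta_{nu sigma}
    - delta^rho_nu eta_{mu sigma}) in the vector representation."""
    out = np.zeros((4, 4), dtype=complex)
    for rho in range(4):
        for sig in range(4):
            out[rho, sig] = 1j * ((rho == mu) * eta[nu, sig] - (rho == nu) * eta[mu, sig])
    return out

def comm(a, b):
    return a @ b - b @ a

# verify [M_mn, M_rs] = -i(eta_mr M_ns + eta_ns M_mr) + i(eta_ms M_nr + eta_nr M_ms)
for m_ in range(4):
    for n_ in range(4):
        for r_ in range(4):
            for s_ in range(4):
                lhs = comm(M(m_, n_), M(r_, s_))
                rhs = (-1j * (eta[m_, r_] * M(n_, s_) + eta[n_, s_] * M(m_, r_))
                       + 1j * (eta[m_, s_] * M(n_, r_) + eta[n_, r_] * M(m_, s_)))
                assert np.allclose(lhs, rhs)
print("lorentz algebra of eq. (28) verified in the vector representation")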
the canonical hamiltonian density of the theory is obtained as:

$$\mathcal{H}_c = (\partial_0 A)\,\pi_A + (\partial_0 B)\,\pi_B + (\partial_0\psi_\alpha)\,\pi_\alpha + (\partial_0\bar\psi_\alpha)\,\bar\pi_\alpha - \mathcal{L}^B - \mathcal{L}^F \tag{41}$$

$$\mathcal{H}_c = \tfrac{1}{2}\Big[\pi_A^2 + \pi_B^2 - i\,\bar\psi\gamma_k\partial^k\psi + i\,(\partial_k\bar\psi)\gamma^k\psi\Big] \tag{42}$$

the total hamiltonian density is:

$$\mathcal{H}_T := \mathcal{H}_c + \chi_1\, u + \chi_2\, v \tag{43}$$

demanding that the constraints $\chi_1$ and $\chi_2$ be preserved in the course of time, one does not obtain any secondary constraints; these are therefore the only 2 constraints that the theory possesses. the non-vanishing matrix elements of the $2\times 2$ matrix of the pb's of these constraints among themselves are:

$$R_{12} = -R_{21} = i\gamma^0\,\delta(x^1 - y^1)\,\delta(x^2 - y^2)\,\delta(x^3 - y^3). \tag{44}$$

the non-vanishing equal-time (et) commutation relations (cr's) (denoted by a square bracket) and et anti-commutation relations (acr's) (denoted by a curly bracket) of the bosonic and fermionic variables of the theory are found to be:

$$[A(\vec x, t), \pi_A(\vec y, t)] = i\,\delta(\vec x - \vec y) \tag{45}$$
$$[B(\vec x, t), \pi_B(\vec y, t)] = i\,\delta(\vec x - \vec y) \tag{46}$$
$$\{\psi_\alpha(\vec x, t), \bar\psi_\beta(\vec y, t)\} = (\gamma^0)_{\alpha\beta}\,\delta(\vec x - \vec y) \tag{47}$$
$$\delta(\vec x - \vec y) := \delta(x^1 - y^1)\,\delta(x^2 - y^2)\,\delta(x^3 - y^3) \tag{48}$$

these relations appear to be similar to the usual ones. however, the fermionic spinor field $\psi$ here is not a dirac spinor but a majorana spinor having real components ($\psi = \psi^c$). we need to remember that the dirac spinor is a 4-component spinor with complex elements, expressible in terms of two 2-component weyl spinors having complex elements; if the elements of these weyl spinors are taken to be real ($\psi = \psi^c$), it becomes a majorana spinor (having real elements). in path integral quantization (piq) [7], the transition to the quantum theory is made by writing the vacuum-to-vacuum transition amplitude for the theory, called the generating functional $Z[J_k]$, which in the presence of the external sources $J_k$ for the present theory is [7]:

$$Z[J_k] = \int [d\mu]\, \exp\Big[i\int dx\, dy\,\big(J_k\Phi^k + \pi_A\partial_0 A + \pi_B\partial_0 B + \pi\,\partial_0\psi + \bar\pi\,\partial_0\bar\psi + \pi_u\partial_0 u + \pi_v\partial_0 v - \mathcal{H}_T\big)\Big] \tag{49}$$

here $\Phi^k \equiv (A, B, \psi, \bar\psi, u, v)$ are the phase space variables of the theory, with the corresponding respective canonical conjugate momenta $\Pi_k \equiv (\pi_A, \pi_B, \pi, \bar\pi, \pi_u, \pi_v)$. the functional measure $[d\mu]$ of the theory (with the above generating functional $Z[J_k]$) is:

$$[d\mu] = \Big[\delta(x^1-y^1)\,\delta(x^2-y^2)\,\delta(x^3-y^3)\Big]\,[dA]\,[dB]\,[d\psi]\,[d\bar\psi]\,[du]\,[dv]\,[d\pi_A]\,[d\pi_B]\,[d\pi]\,[d\bar\pi]\,[d\pi_u]\,[d\pi_v]\;\delta\big[(\pi + \tfrac{i}{2}\bar\psi\gamma^0) \approx 0\big]\;\delta\big[(\bar\pi + \tfrac{i}{2}\gamma^0\psi) \approx 0\big] \tag{50}$$
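as a brief reader's aid (the standard second-class constraint procedure, added by us), the route from the constraint matrix (44) to the et relations (45)–(47) goes through the dirac bracket:

\begin{equation}
\{f(\vec x), g(\vec y)\}_D = \{f(\vec x), g(\vec y)\}_P - \int d^3z\, d^3z'\;\{f(\vec x), \chi_i(\vec z)\}_P\,\big(R^{-1}\big)_{ij}(\vec z, \vec z\,')\,\{\chi_j(\vec z\,'), g(\vec y)\}_P,
\end{equation}

where $R$ is the matrix of the poisson brackets of the constraints, eq. (44). upon quantization, one replaces $\{\,,\,\}_D \to -i\,[\,,\,]_{\mp}$ (commutators for the bosonic sector, anticommutators for the fermionic one); applied to $\psi$ and $\bar\psi$ with the constraints (40), this construction yields precisely the et acr (47), while the unconstrained bosonic variables reproduce eqs. (45)–(46).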
4. conclusions and summary

some important remarks may be helpful. in relativistic quantum mechanics, the dirac equation (de) is a single-particle relativistic wave equation in which $\psi$ represents a wave function. in ft, the de is an euler-lagrange field equation, obtained from the dirac action or the dirac lagrangian by using the variational principle. the wzm is the simplest example of a supersymmetric field theory in 4d; it is also an example of a ft with non-manifest supersymmetry. taking the example of the free massless wzm, one can study many important theories in different dimensions, including higher dimensions. the theory also provides a basic framework for the study of the ramond–neveu–schwarz (rns) superstring theory (sst), which is an example of an sst with non-manifest susy. starting with the wzm, it is possible to construct a supergravity theory by gauging its global (rigid) susy into a local susy through the noether procedure. to summarize in brief, we have studied in this work the wzm [5], which is a supersymmetric ft that has rigid or global supersymmetry. the theory has a supercharge $Q_a$ ($a = 1, 2, 3, 4$ in 4d), which is a grassmann spinor having anti-commuting properties. the theory is invariant under rigid supersymmetry transformations, where the transformation parameter is a constant grassmann spinor [5]. finally, we have also studied the hamiltonian and path integral quantization of the theory [7].

acknowledgements

it is a matter of great pleasure to thank the organizers of the international conference on analytic and algebraic methods in physics xviii (aamp xviii) – 2021 at prague, czechia, prof. vít jakubský, prof. vladimir lotoreichik, prof. matej tusek, prof. miloslav znojil and the entire team, for a wonderful organization of the conference.

references
[1] j. wess, b. zumino. supergauge transformations in four dimensions. nuclear physics b 70(1):39–50, 1974. https://doi.org/10.1016/0550-3213(74)90355-1
[2] s. deser, b. zumino. consistent supergravity. physics letters b 62(3):335–337, 1976. https://doi.org/10.1016/0370-2693(76)90089-7
[3] d. z. freedman, p. van nieuwenhuizen, s. ferrara. progress toward a theory of supergravity. physical review d 13:3214–3218, 1976. https://doi.org/10.1103/physrevd.13.3214
[4] p. van nieuwenhuizen. supergravity. physics reports 68(4):189–398, 1981. https://doi.org/10.1016/0370-1573(81)90157-5
[5] h. j. w. müller-kirsten, a. wiedemann. introduction to supersymmetry. world scientific, singapore, 2nd edn., 2010. https://doi.org/10.1142/7594
[6] j. wess, j. bagger. supersymmetry and supergravity. princeton university press, 1992. isbn 9780691025308.
[7] h. j. w. müller-kirsten. introduction to quantum mechanics: schrödinger equation and path integral. world scientific, singapore, 2006. isbn 9789812566911.
[8] k. becker, m. becker, j. h. schwarz. string theory and m-theory: a modern introduction. cambridge university press, 2007. isbn 9780521860697.
[9] s. j. brodsky. supersymmetric and conformal features of hadron physics. universe 4(11):120, 2018. https://doi.org/10.3390/universe4110120
[10] s. j. brodsky, g. f. de teramond, h. g. dosch. light-front holography and supersymmetric conformal algebra: a novel approach to hadron spectroscopy, structure, and dynamics, 2020. arxiv:2004.07756.
[11] s. j. brodsky. color confinement and supersymmetric properties of hadron physics from light-front holography. journal of physics: conference series 1137:012027, 2019. https://doi.org/10.1088/1742-6596/1137/1/012027

parity codes used for on-line testing in fpga
p. kubalík, h. kubátová

this paper deals with on-line error detection in digital circuits implemented in fpgas. error detection codes have been used to ensure the self-checking property. the adopted fault model is discussed. a fault in a given combinational circuit must be detected and signalized at the time of its appearance and before further distribution of errors; hence safe operation of the designed system is guaranteed. the check bits generator and the checker were added to the original combinational circuit to detect an error during normal circuit operation. this concurrent error detection ensures the totally self-checking property. combinational circuit benchmarks have been used in this work in order to compute the quality of the proposed codes. the description of the benchmarks is based on equations and tables. all of our experimental results were obtained with xilinx fpga implementation eda tools. a possible tsc structure consisting of several tsc blocks is presented.

keywords: on-line testing, self-checking, error detection code, fault, error, fpga.

1 introduction

the design process for fpgas differs mainly in the "design time", i.e., in the time needed from the idea to its realization, in comparison with the design process for asics. moreover, fpgas enable different design properties, e.g., in-system reconfiguration to correct functional bugs or to update the firmware to implement new standards. due to this fact, and due to the growing complexity of fpgas, these circuits can also be used in mission-critical applications such as aviation, medicine or space missions. there have been many papers [1, 2] on concurrent error detection (ced) techniques. ced techniques can be divided into three basic groups according to the type of redundancy.
the first group focuses on area redundancy, the second group on time redundancy, and the third one on information redundancy. when we speak about area redundancy, we assume duplication or triplication of the original circuit. time redundancy is based on the repetition of some computation. information redundancy is based on error detecting (ed) codes, and leads either to area redundancy or to time redundancy. next, we will assume the utilization of information redundancy (area redundancy) caused by using ed codes. the event in which high-energy particles impact sensitive parts is described as a single event upset (seu) [3]. seus can lead to bit-flips in sram. the fpga configuration is stored in sram, and any change of this memory may lead to a malfunction of the implemented circuit. some results of seu effects on fpga configuration memory are described in [4]. ced techniques can allow faster detection of a soft error (an error which can be corrected by a reconfiguration process) caused by an seu. seus can also change values in the embedded memory used in the design, and can cause data corruption; these changes are not detectable by off-line tests, only by some ced techniques. the fpga fabrication process allows the use of sub-micron technology with smaller and smaller transistor sizes. due to this fact, changes in fpga memory contents caused by seus can be observable even at sea level. this is another reason why ced techniques are important. there are three basic terms in the field of ced:

• the fault security (fs) property means that, for each modeled fault, the produced erroneous output vector does not belong to the proper output code word.
• the self-testing (st) property means that, for each modeled fault, there is an input vector occurring during normal operation that produces an output vector which does not belong to the proper output code word.
• the totally self-checking (tsc) property means that the circuit must satisfy both the fs and st properties.

the basic method for the proper choice of a ced model is described in [5]. techniques using ed codes have also been studied by other research groups [6, 7]. one method is based on a parity bits predictor and a checker, see fig. 1.

fig. 1: structure of a tsc circuit (a combinational circuit with m primary outputs, a parity predictor producing k check bits, and a checker verifying that the outputs and check bits form a code word)

2 the fault model

all of our experiments are based on fpga circuits. the circuit implemented in an fpga consists of individual memory elements (luts – look-up tables). we can see 3 gates mapped into an lut in fig. 2. the original circuit has two inner nets, and the original set of test vectors covers all faults in these inner nets; these test vectors are redundant for an lut.

fig. 2: fault model (gates mapped into a 16-bit lut with address inputs i0..i3 and output o; faults on the inner nets of the gate-level circuit become redundant faults of the lut)

for circuits realized by luts, a change (a defect) in the memory leads to a single event upset (seu) at the primary output of the lut.
therefore we can use the stuck-at fault model in our experiments to detect seus – only some of the detected faults will be redundant. our fault model is described by a simple example in fig. 3. only one lut is used for simplicity; this lut implements a circuit containing 3 gates. the primary inputs i0 to i3 are the same as the address inputs of the lut: when an address is selected, its content is propagated to the output. we assume the following situation: first, the content of this lut is changed, e.g., by electromagnetic interference, cross-talk or alpha particles; the affected memory cell is set to one, and the wrong value is propagated to the output. this means that the realized function is changed and the output behaves as a single event upset. according to this example, we can say that a change of any lut cell leads to a stuck-at fault on the output. this fault is observed only if the bad cell is selected; this is the same situation as for circuits implemented by gates. some faults can be masked and do not necessarily lead to an erroneous output. due to the masking of some faults, a fault may first appear at the time when previously unused logic begins to be used: e.g., if one bit of an lut is changed, the erroneous output will appear when the appropriate bit of the lut is selected by the address decoder.

fig. 3: fault model – example (a flipped cell in the 16-bit lut propagates to the output as a single event upset whenever its address is selected)

in our design methodology, we evaluate the fs and st properties. for the st property, a hidden fault is not assumed. the evaluation of the fs property is independent of the set of allowed input words: if a fault does not manifest itself as an incorrect codeword for all possible input words, it cannot cause an undetectable error for any subset of input words. so we can use the exhaustive test set for combinational circuits. the exhaustive test set is generated to evaluate the st property for combinational circuits, where the set of input words is not defined. in a real situation, however, some input words may not occur; this means that some faults can be undetectable, which can decrease the final fault coverage. therefore, the number of faults that can be undetectable is higher. the fault simulation process is performed for circuits described by a netlist (for example, .edif).
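the lut fault model can be illustrated in a few lines of python. this is a toy model of our own (not the authors' tooling): a 16-bit lut realizes a 4-input function, and flipping one configuration cell behaves as a stuck-at fault that is visible only when that cell's address is applied:

from itertools import product

def make_lut(func):
    """configuration memory of a 4-input lut: 16 cells, one per address."""
    return [func(*addr) for addr in product((0, 1), repeat=4)]

def lut_out(lut, i0, i1, i2, i3):
    return lut[(i0 << 3) | (i1 << 2) | (i2 << 1) | i3]

# example: 3 gates mapped into one lut, o = (i0 and i1) or (i2 xor i3)
good = make_lut(lambda a, b, c, d: (a & b) | (c ^ d))

faulty = list(good)
faulty[5] ^= 1            # an seu flips one configuration cell

# the flipped cell is observable only when its address (0, 1, 0, 1) is selected
for addr in product((0, 1), repeat=4):
    g, f = lut_out(good, *addr), lut_out(faulty, *addr)
    if g != f:
        print(f"address {addr}: output {g} -> {f} (behaves as a stuck-at fault)")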
3 parity bits predictor

there are many ways to generate checking bits. a single even parity code is the simplest code that may be used to get a code word at the output of the combinational circuit: this parity generator performs an xor over all primary outputs. however, the single even parity code is mostly not appropriate to ensure the tsc goal. another error code is a hamming-like code, which is in essence based on the single parity code (a multi-parity code). the hamming code is defined by its generating matrix; we used a matrix containing the unity sub-matrix on the left side for simplicity. the generating matrix of the hamming code (15, 11) is shown in fig. 4:

$$G = \begin{pmatrix} 1 & 0 & \cdots & 0 & a_{1,1} & a_{1,2} & a_{1,3} & a_{1,4} \\ 0 & 1 & \cdots & 0 & a_{2,1} & a_{2,2} & a_{2,3} & a_{2,4} \\ \vdots & & \ddots & & \vdots & & & \vdots \\ 0 & 0 & \cdots & 1 & a_{11,1} & a_{11,2} & a_{11,3} & a_{11,4} \end{pmatrix}$$

fig. 4: generating matrix for the hamming code (15, 11)

the values $a_{ij}$ have to be defined; when a more complex hamming code is used, more values have to be defined. the number of outputs $o_i$ used for the checking bits determines the appropriate code: e.g., the circuit alu1 [10], having 8 outputs, requires at least the hamming code (15, 11), and therefore 8 data bits and 4 checking bits are used. the definition of the values $a_{ik}$ is also important, and we now present a method for generating them. consider the hamming code (15, 11), having 4 checking bits. in our case (alu1) we have only 8 data bits, so the reduced hamming matrix must be used: the sub-matrix has only 8 rows and 4 columns after the reduction. we can define it either by eight 4-bit vectors or by four 8-bit vectors; the second case is used here. the search for the erroneous output is then a method similar to a binary search. the first vector is composed of logic 1s only; the last vector is composed of logic 1s in the odd places and logic 0s in the even places; every vector except the first contains the same number of 1s and the same number of 0s. an example of the possible content of the right-hand sub-matrix, consistent with this description, is shown in fig. 5:

1 1 1 1
1 1 1 0
1 1 0 1
1 1 0 0
1 0 1 1
1 0 1 0
1 0 0 1
1 0 0 0

fig. 5: right part of the generating matrix

the number of vectors in the set is the same as the number of rows in the appropriate hamming matrix. the way to generate the parity output for checking bit $x_k$ is described by equation (1):

$$x_k = a_{1k}\, o_1 \oplus a_{2k}\, o_2 \oplus \cdots \oplus a_{mk}\, o_m, \tag{1}$$

where $o_1, \dots, o_m$ are the primary outputs of the original circuit.
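to make eq. (1) and the fig. 5 matrix concrete before the small parity example that follows, the sketch below (our own illustration) computes the four check bits of an 8-output circuit; the matrix entries follow the fig. 5 example, with rows corresponding to the outputs o1..o8 and columns to the check bits x1..x4:

# right-hand sub-matrix of fig. 5: column 1 is all ones, the remaining
# columns partition the outputs like a binary search
A = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 1, 0, 0],
    [1, 0, 1, 1],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 0],
]

def check_bits(outputs):
    """eq. (1): x_k = xor over i of a_{ik} * o_i."""
    assert len(outputs) == len(A)
    xs = []
    for k in range(len(A[0])):
        x = 0
        for o, row in zip(outputs, A):
            x ^= row[k] & o
        xs.append(x)
    return xs

# example: one output word of the original circuit and its check bits
print(check_bits([1, 0, 1, 1, 0, 0, 1, 0]))   # -> [0, 1, 1, 1]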
4 area overhead minimization

the benchmarks used in this paper are described by a two-level network. the final area overhead depends on the minimization process. we used two different methods in our approach; both are based on a simple duplication of the original circuit. our first method is based on a modification of the circuit described by a two-level network. the area of the check bits generator contributes significantly to the total area of the tsc circuit. as an example, we consider a circuit with 3 inputs (c, b and a) and 2 outputs (f and e). the check bits generator uses the odd parity code to generate the check bits; in our example we have only one check bit x, calculated from the outputs e and f. our example is shown in table 1.

table 1: example of a parity generator
c b a | f e x
0 0 0 | 0 1 0
0 0 1 | 1 0 0
0 1 0 | 1 0 0
0 1 1 | 1 0 0
1 0 0 | 0 1 0
1 0 1 | 0 1 0
1 1 0 | 1 1 1
1 1 1 | 1 0 0

we now have to generate the minimal form of the equations; this can be achieved using methods like the karnaugh map or quine-mccluskey. after minimization we obtain three equations, one per output (f, e and x), where x is the odd parity of the outputs f and e. if we want to know whether the odd parity covers all faults in our simple combinational circuit example, we have to generate the minimal test set and simulate all faults on each net of this circuit. the final equations, consistent with table 1, are:

$$e = \bar a\,\bar b + c\,(\bar a + \bar b) \tag{2}$$
$$f = b + a\,\bar c \tag{3}$$
$$x = \bar a\, b\, c \tag{4}$$

our second method is based on a modification of the multi-level network: the parity bits are incorporated into the tested circuit as a tree composed of xor gates. the maximal area of the parity generator can be calculated as the sum of the original circuit and the size of the xor tree.

5 experimental evaluation software

fig. 6 describes how the test is performed for each detecting code. the mcnc benchmarks [11] were used in our experiments. these benchmarks are described by a truth table. to generate the output parity bits, all the output values have to be defined for each particular input vector. only several output values are specified for each multi-dimensional input vector, and the rest are assigned as don't cares, left to be specified by another term. thus, in order to be able to compute the parity bits, we have to split the intersecting terms, so that all the terms in the truth table are disjoint. in the next step, the original primary outputs are replaced by parity bits. two different error codes were used to calculate the output parity bits (the single even parity code and the hamming code). another tool was used in the case where the original circuit was modified in multi-level logic; this tool is described in [8]. the two circuits generated in the first step (the original circuit and the parity circuit) are processed separately, to avoid sharing any part of the circuit. each part is minimized by the espresso tool [9]. the final area overhead depends on the software used in this step. many tools were used to achieve a small area of the parity bits generator; only espresso was used to minimize the final area of the circuit described by the two-level network. at this step, the area overhead is known for an asic implementation; for fpgas, the area overhead is known only after the synthesis process has been performed. the "pla" format is converted into the "bench" format in the next step.

fig. 6: design flow of the self-checking circuit (mcnc benchmark → split intersecting terms → generate pla with parity bits → minimization by espresso → pla-to-bench conversion → bench-to-vhdl conversion and exhaustive test set generation → vhdl synthesis by synplify and leonardo spectrum → fault injection & simulation of the original + parity circuit → fault coverage)
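the minimized equations can be checked mechanically against table 1; the few lines below (our own) enumerate all eight input vectors and confirm both eqs. (2)–(4) and the odd-parity property of the word (f, e, x):

from itertools import product

table1 = {  # (c, b, a) -> (f, e, x), transcribed from table 1
    (0, 0, 0): (0, 1, 0), (0, 0, 1): (1, 0, 0), (0, 1, 0): (1, 0, 0), (0, 1, 1): (1, 0, 0),
    (1, 0, 0): (0, 1, 0), (1, 0, 1): (0, 1, 0), (1, 1, 0): (1, 1, 1), (1, 1, 1): (1, 0, 0),
}

for c, b, a in product((0, 1), repeat=3):
    e = ((1 - a) & (1 - b)) | (c & ((1 - a) | (1 - b)))   # eq. (2)
    f = b | (a & (1 - c))                                  # eq. (3)
    x = (1 - a) & b & c                                    # eq. (4)
    assert (f, e, x) == table1[(c, b, a)]
    assert (f + e + x) % 2 == 1                            # odd parity holds on every row
print("eqs. (2)-(4) reproduce table 1; (f, e, x) always has odd parity")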
the final netlist is generated by the leonardo spectrum [13] software. the fault coverage was obtained by simulation using our software. 6 software solution description special tools had to be developed to evaluate the area overhead and fault coverage. in addition to some commercial tools such as leonardo spectrum [13] and synplify [12] we used format converting tools, parity circuit generator tools and simulation tools. at first, area minimization and term splitting is performed for the original circuit by boom [10]. the hamming code generator (or single parity generator) is generated by the second software. these two circuits are minimized again with espresso. the next two tools convert the two-level format into a multi level format. the first converts a “pla” file to “bench”, and the second converts “bench” to vhdl. the second software is used for generating the final circuit in the “bench” format for further usage in the exhaustive test set generator. the format converting software and parity generator software were written in microsoft visual c++. the netlist fault simulator was written in java. the parser source code was used for parsing the netlist that is generated by the two commercial tools described above. 7 experiments the combinational mcnc benchmarks [11] were used for all the experiments. these benchmarks are based on real circuits used in large designs. since the whole circuit will be used for reconfiguration in fpga, only small circuits were used. real designs having a large structure must by partitioned into several smaller parts. for large circuits, the process of area minimization and fault simulation takes a long time. this disadvantage prevents us examining more methods of designing the check bits generator. the evaluated area, fs and st properties depend on circuit properties such as the number of inputs and outputs, and the circuit complexity. the experimental results show that a more important property is the structure of the circuit. two basic properties are described in table 2. in the first set of experiments our goal was to obtain one hundred percent of the fs and st property, while we measured the area overhead. in this case, the maximum of the parity bits was used. this task was divided into two experiments (fig. 7). in the first experiment the two-level network was being modified (fig. 7a). the results are shown in table 3. 56 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 45 no. 6/2005 czech technical university in prague circuit inputs outputs alu1 12 8 apla 10 12 b11 8 31 br1 12 8 al2 16 47 alu2 10 8 alu3 10 8 c17 5 2 table 2: description of tested benchmarks generate bench with parity bits pla to bench convert mcnc benchmark split intersection terms split intersection terms originaloriginal parity generate pla with parity bits generate pla with parity bits originaloriginal originaloriginal parity pla to bench convert originaloriginal parity a) b) minimization espre sos originaloriginal parity minimization esspreso originaloriginal fig. 7: two different flows for creating a parity generator circuit parity nets original [lut] parity [lut] overhead [%] st fs alu1 4 8 84 1050 100 100 apla 5 45 105 233 100 98.3 b11 6 38 38 100 100 99.7 br1 4 50 59 118 100 95.9 al2 7 51 54 106 100 98.8 alu2 4 30 127 423 100 100 alu3 4 28 94 336 100 100 c17 2 2 3 150 100 100 table 3: hamming code – pla the st property was fulfilled in 7 cases and the fs property was fulfilled in 4 cases. the area overhead in many cases exceeds 100%. 
this means that the cost of one hundred percent fault coverage is too high; in these cases, though, the tsc goal is satisfied for most tested benchmarks. we then used an older method, where the original circuit described by a multi-level network is modified by additional xor logic (fig. 7b) [8]. the results obtained from this experiment are shown in table 4. the fs and st properties were fulfilled in the same cases as in the first experiment, but the overhead is in some cases smaller. in the second set of experiments, we tried to obtain a small area overhead, and the fault coverage was measured; in this case, the minimum number of parity bits is used (single even parity). the experiments are again divided into two groups, a) and b) of fig. 7, and the procedure is the same as described above. in the first experiment, the two-level network of the original circuit was modified (fig. 7a); the results are shown in table 5. the st property is achieved in four cases, but the area overhead is smaller in five cases; the fs property is satisfied in one case. in the last experiment, we modified the circuit described by a multi-level network (fig. 7b). the st property was satisfied in four cases and the fs property in two cases. the area overhead is higher than 100 % for most benchmarks, but the fault coverage did not increase; see table 6.

8 huge design

our previous results show that it is in many cases too difficult to achieve the tsc goals with a minimal area overhead [8]. a way to detect and localize the faulty part of the circuit has to be proposed. assuming that the tsc goals cannot reach more than 90 %, the area overhead can be rapidly decreased, and other methods to cover and localize the fault can be used. on-line testing methods can only detect faults; the localization process must exploit other methods for off-line testing. however, neither on-line nor off-line tests by themselves increase the reliability parameters: the reliability mostly decreases, due to the larger area occupied by the tsc circuit compared with the original circuit. therefore we propose a reconfigurable system to increase these parameters. each block in our design is designed as a tsc block, and we have been working on a methodology to satisfy the tsc goals for the whole design and to design highly reliable systems. the way all tsc blocks are connected is shown in fig. 8. the main idea is based on the detection of an error code word generated in any block: the detecting process is moved from the primary outputs to the primary inputs of the following circuit. the interconnection of the individual blocks plays an important role with respect to the tsc property of the whole circuit, since a bad order of the connections between the inner blocks leads to lower fault coverage. additional logic has to be included in the control arrangement of the implemented blocks with respect to the way the automatic tools handle the interconnection. in our structure, we can assume six places where an error can be observed. we assume, for simplicity, that an error occurring in the check bits generator will be observable at the parity nets (number 1) and that an error occurring in the original circuit will be observable at the primary outputs (number 5). the checker in block n will detect the error if it occurs on net number 1, 2, 4 or 5; if the error occurs on net number 3 or 6, the error will be detected in the next checker (n + 1). all our experiments were applied to combinational circuits only.
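the cascade idea of fig. 8 can be sketched in a few lines. this is a toy model of our own, not the authors' implementation: the checker of each block validates the word produced by the preceding block, so an error injected on an interconnection net (cases 3 or 6 of fig. 8) is caught one stage later:

def parity(bits):
    p = 0
    for b in bits:
        p ^= b
    return p

class TSCBlock:
    """toy tsc block: a combinational function, a check bits generator on its
    outputs, and a checker validating the incoming (data, check) word."""
    def __init__(self, func):
        self.func = func

    def run(self, data_in, check_in):
        ok = (parity(data_in) == check_in)   # checker moved to the block's inputs
        data_out = self.func(data_in)
        return data_out, parity(data_out), ok

block1 = TSCBlock(lambda d: [d[0] ^ d[1], d[1] & d[2], d[0] | d[2]])
block2 = TSCBlock(lambda d: [d[0] & d[1], d[1] ^ d[2], d[0]])

data, check, _ = block1.run([1, 0, 1], parity([1, 0, 1]))
data[1] ^= 1       # error on an interconnection net between the blocks
_, _, ok = block2.run(data, check)
print("error detected by the checker of the following block:", not ok)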
table 4: hamming code – xor
circuit | parity nets | original [lut] | parity [lut] | overhead [%] | st [%] | fs [%]
alu1 | 4 | 8 | 13 | 163 | 100 | 100
apla | 5 | 45 | 114 | 253 | 100 | 97.2
b11 | 6 | 38 | 73 | 192 | 100 | 99
br1 | 4 | 50 | 85 | 170 | 100 | 96.5
al2 | 7 | 52 | 109 | 210 | 100 | 99.1
alu2 | 4 | 30 | 52 | 173 | 100 | 100
alu3 | 4 | 28 | 44 | 157 | 100 | 100
c17 | 2 | 2 | 3 | 150 | 100 | 100

table 5: single even parity – pla
circuit | parity nets | original [lut] | parity [lut] | overhead [%] | st [%] | fs [%]
alu1 | 1 | 8 | 271 | 3388 | 100 | 98.9
apla | 1 | 46 | 23 | 50 | 99.5 | 82.6
b11 | 1 | 37 | 3 | 8 | 89.9 | 77.3
br1 | 1 | 54 | 10 | 19 | 86.9 | 62.1
al2 | 1 | 52 | 4 | 8 | 97.3 | 91.7
alu2 | 1 | 29 | 47 | 162 | 100 | 91.2
alu3 | 1 | 26 | 32 | 123 | 100 | 92
c17 | 1 | 2 | 2 | 100 | 100 | 100

table 6: single even parity – xor
circuit | parity nets | original [lut] | parity [lut] | overhead [%] | st [%] | fs [%]
alu1 | 1 | 8 | 10 | 125 | 100 | 100
apla | 1 | 46 | 56 | 122 | 99.7 | 87.2
b11 | 1 | 37 | 36 | 97 | 93.9 | 81.4
br1 | 1 | 54 | 61 | 113 | 92.7 | 69
al2 | 1 | 52 | 23 | 44 | 97.9 | 93.2
alu2 | 1 | 29 | 44 | 152 | 100 | 91.1
alu3 | 1 | 26 | 39 | 150 | 100 | 91.6
c17 | 1 | 2 | 2 | 100 | 100 | 100

the same techniques can be used for a sequential circuit, because such circuits can be divided into simple combinational parts separated by flip-flops. a finite state machine can be divided into two parts: the first part covers the combinational logic from the inputs to the flip-flops (with feedback), while the second part covers the combinational logic from the flip-flops to the outputs (and the parts connected directly from the input to the output).

9 conclusion

the paper describes one part of an automatic design process methodology for a dynamic reconfiguration system. we designed concurrent error detection (ced) circuits based on fpgas with a possible dynamic reconfiguration of the faulty part. the reliability characteristics can be increased by reconfiguration after the error detection. the most important criteria are the speed of the fault detection and the safety of the whole circuit with respect to the surrounding environment. in summary, the fs and st properties can be satisfied for the whole design, including the checking parts. this is achieved by using more redundancy outputs generated by special codes. a hamming-like code can be used as a suitable code to generate the check bits; the type depends on the number of outputs and on the complexity of the original circuit [9], and more complex circuits need more check bits. we would like to reduce the duplicated circuit and compute the fault coverage again. we have proposed a new solution of the check bits generator design method. because we want to increase the reliability characteristics of circuits implemented in fpgas, we have to modify the circuits at the netlist level. all of our experiments apply to combinational circuits only; since sequential circuits can be decomposed into simple combinational parts separated by flip-flops, this restriction does not reduce the quality of our methods and experimental results. our future improvements will involve discovering closer relations between real fpga defects and our fault models. minimization of the whole tsc design, to obtain the lowest area overhead, has been under intensive experimentation, and we are also working intensively on an appropriate decomposition of the designed circuit.

references
[1] mohanram, k., sogomonyan, e. s., gössel, m., touba, n. a.: "synthesis of low-cost parity-based partially self-checking circuits." proceedings of the 9th ieee international on-line testing symposium, 2003, p. 35.
[2] drineas, p., makris, y.: "concurrent fault detection in random combinational logic." proceedings of the ieee international symposium on quality electronic design (isqed), 2003, p. 425–430.
[3] quicklogic corporation: "single event upsets in fpgas", 2003, www.quicklogic.com
[4] bellato, m., bernardi, p., bortalato, d., candelaro, a., ceschia, m., paccagnella, a., rebaudego, m., sonza reorda, m., violante, m., zambolin, p.: "evaluating the effects of seus affecting the configuration memory of an sram-based fpga." design, automation and test in europe, 2004, p. 584–589.
[5] mitra, s., mccluskey, e. j.: "which concurrent error detection scheme to choose?" proc. international test conference, 2000, p. 985–994.
[6] bolchini, c., salice, f., sciuto, d.: "designing self-checking fpgas through error detection codes." 17th ieee international symposium on defect and fault tolerance in vlsi systems (dft'02), 2002, p. 60.
[7] bolchini, c., salice, f., sciuto, d., zavaglia, r.: "an integrated design approach for self-checking fpgas." 18th ieee international symposium on defect and fault tolerance in vlsi systems (dft'03), 2003, p. 443.

fig. 8: proposed structure of tsc circuits implemented in an fpga (a cascade of totally self-checking blocks n−1 and n; each block contains the original combinational circuit and a check bits generator, and each checker receives the primary outputs and output check bits of the preceding block together with its ok/fail signals; the numbered nets 1–6 mark the places where an error can be observed)
web coated \050swop\051 v2) /srgbprofile (srgb iec61966-2.1) /cannotembedfontpolicy /error /compatibilitylevel 1.4 /compressobjects /tags /compresspages true /convertimagestoindexed true /passthroughjpegimages true /createjobticket false /defaultrenderingintent /default /detectblends true /detectcurves 0.0000 /colorconversionstrategy /cmyk /dothumbnails false /embedallfonts true /embedopentype false /parseiccprofilesincomments true /embedjoboptions true /dscreportinglevel 0 /emitdscwarnings false /endpage -1 /imagememory 1048576 /lockdistillerparams false /maxsubsetpct 100 /optimize true /opm 1 /parsedsccomments true /parsedsccommentsfordocinfo true /preservecopypage true /preservedicmykvalues true /preserveepsinfo true /preserveflatness true /preservehalftoneinfo false /preserveopicomments true /preserveoverprintsettings true /startpage 1 /subsetfonts true /transferfunctioninfo /apply /ucrandbginfo /preserve /useprologue false /colorsettingsfile () /alwaysembed [ true ] /neverembed [ true ] /antialiascolorimages false /cropcolorimages true /colorimageminresolution 300 /colorimageminresolutionpolicy /ok /downsamplecolorimages true /colorimagedownsampletype /bicubic /colorimageresolution 300 /colorimagedepth -1 /colorimagemindownsampledepth 1 /colorimagedownsamplethreshold 1.50000 /encodecolorimages true /colorimagefilter /dctencode /autofiltercolorimages true /colorimageautofilterstrategy /jpeg /coloracsimagedict << /qfactor 0.15 /hsamples [1 1 1 1] /vsamples [1 1 1 1] >> /colorimagedict << /qfactor 0.15 /hsamples [1 1 1 1] /vsamples [1 1 1 1] >> /jpeg2000coloracsimagedict << /tilewidth 256 /tileheight 256 /quality 30 >> /jpeg2000colorimagedict << /tilewidth 256 /tileheight 256 /quality 30 >> /antialiasgrayimages false /cropgrayimages true /grayimageminresolution 300 /grayimageminresolutionpolicy /ok /downsamplegrayimages true /grayimagedownsampletype /bicubic /grayimageresolution 300 /grayimagedepth -1 /grayimagemindownsampledepth 2 /grayimagedownsamplethreshold 1.50000 /encodegrayimages true /grayimagefilter /dctencode /autofiltergrayimages true /grayimageautofilterstrategy /jpeg /grayacsimagedict << /qfactor 0.15 /hsamples [1 1 1 1] /vsamples [1 1 1 1] >> /grayimagedict << /qfactor 0.15 /hsamples [1 1 1 1] /vsamples [1 1 1 1] >> /jpeg2000grayacsimagedict << /tilewidth 256 /tileheight 256 /quality 30 >> /jpeg2000grayimagedict << /tilewidth 256 /tileheight 256 /quality 30 >> /antialiasmonoimages false /cropmonoimages true /monoimageminresolution 1200 /monoimageminresolutionpolicy /ok /downsamplemonoimages true /monoimagedownsampletype /bicubic /monoimageresolution 1200 /monoimagedepth -1 /monoimagedownsamplethreshold 1.50000 /encodemonoimages true /monoimagefilter /ccittfaxencode /monoimagedict << /k -1 >> /allowpsxobjects false /checkcompliance [ /none ] /pdfx1acheck false /pdfx3check false /pdfxcompliantpdfonly false /pdfxnotrimboxerror true /pdfxtrimboxtomediaboxoffset [ 0.00000 0.00000 0.00000 0.00000 ] /pdfxsetbleedboxtomediabox true /pdfxbleedboxtotrimboxoffset [ 0.00000 0.00000 0.00000 0.00000 ] /pdfxoutputintentprofile () /pdfxoutputconditionidentifier () /pdfxoutputcondition () /pdfxregistryname () /pdfxtrapped /false /createjdffile false /description << /ara /bgr /chs /cht /cze /dan /deu /esp /eti /fra /gre /heb /hrv (za stvaranje adobe pdf dokumenata najpogodnijih za visokokvalitetni ispis prije tiskanja koristite ove postavke. stvoreni pdf dokumenti mogu se otvoriti acrobat i adobe reader 5.0 i kasnijim verzijama.) 
/hun /ita /jpn /kor /lth /lvi /nld (gebruik deze instellingen om adobe pdf-documenten te maken die zijn geoptimaliseerd voor prepress-afdrukken van hoge kwaliteit. de gemaakte pdf-documenten kunnen worden geopend met acrobat en adobe reader 5.0 en hoger.) /nor /pol /ptb /rum /rus /sky /slv /suo /sve /tur /ukr /enu (use these settings to create adobe pdf documents best suited for high-quality prepress printing. created pdf documents can be opened with acrobat and adobe reader 5.0 and later.) >> /namespace [ (adobe) (common) (1.0) ] /othernamespaces [ << /asreaderspreads false /cropimagestoframes true /errorcontrol /warnandcontinue /flattenerignorespreadoverrides false /includeguidesgrids false /includenonprinting false /includeslug false /namespace [ (adobe) (indesign) (4.0) ] /omitplacedbitmaps false /omitplacedeps false /omitplacedpdf false /simulateoverprint /legacy >> << /addbleedmarks false /addcolorbars false /addcropmarks false /addpageinfo false /addregmarks false /convertcolors /converttocmyk /destinationprofilename () /destinationprofileselector /documentcmyk /downsample16bitimages true /flattenerpreset << /presetselector /mediumresolution >> /formelements false /generatestructure false /includebookmarks false /includehyperlinks false /includeinteractive false /includelayers false /includeprofiles false /multimediahandling /useobjectsettings /namespace [ (adobe) (creativesuite) (2.0) ] /pdfxoutputintentprofileselector /documentcmyk /preserveediting true /untaggedcmykhandling /leaveuntagged /untaggedrgbhandling /usedocumentprofile /usedocumentbleed false >> ] >> setdistillerparams << /hwresolution [2400 2400] /pagesize [612.000 792.000] >> setpagedevice ap05_4.vp notation ar wing aspect ratio. clmax maximum lift coefficient of the aircraft with retracted flaps. clmaxff maximum lift coefficient of the aircraft with full flaps. clmaxl maximum landing lift coefficient of the aircraft. clmaxto maximum take off lift coefficient of the aircraft. rc, rcmax maximum rate of climb. s wing area. slg, stog landing ground run, take off ground run. t time. tmin minimum time of climb to altitude z. v(rcmax) speed at maximum rate of climb. vmax, vmin maximum level speed, minimum level speed. vs, vsff stalling speed flaps up, stalling speed flapsdown. we empty weight. wto maximum take off weight. p power. z altitude. �, �0 density, density at sea level. 1 introduction the class of ultralight (ulm) and light aircraft in general has attracted by growing interest through europe in recent years. only in italy in the last 5–6 years, at least 10 companies have started production of ulm aircraft. there is a very active market for this class, used to promote flight at all levels and for sports aircraft the maximum flight speed for ulm aircraft has been increased in recent years through the use of more powerful engines (100 hp instead of 64 or 80) and better aerodynamics. it is not surprising that a maximum level speed of about 280 km/h has been reached. since the weight constraints are very strict, it is important to study ways to improve structural design, safety, flight qualities, aeroelastic behaviour and systems reliability, without raising costs.. following the experience acquired in our department in designing light and ultralight aircraft, the design of a new composite ulm is being carried out at dpa. 
the design goals established for this new design were:
1) a short take-off and landing (stol) aircraft capable of taking off and landing on an unprepared runway within 40 m;
2) almost complete construction in composite material;
3) a foldable wing, in order to make the ulm very easy to use, to put on a trailer and to hangar in a normal-size garage;
4) a wing with a retractable leading edge slat and slotted/fowler flaps;
5) a maximum speed around 190–200 km/h at an mtow of 450 kg;
6) good flight and handling qualities, so that it can be safely flown by inexperienced pilots;
7) low cost.

2 market survey

all the analyzed aircraft are ulm (wto = 450 kg = 4415 n) and are equipped with an 80 hp (59.6 kw) engine; most of them are made of aluminium alloy with a high-wing configuration, ensuring high stability and easy piloting. none satisfies all the above-mentioned design goals. in fact, the yuma, the savannah and the zenair ch 701 are successful stol aircraft made of aluminium alloy; however, their design is unattractive, and they have a fixed slat on the leading edge, which reduces the maximum cruising speed. the sky arrow 450t and the remos g-3, on the contrary, are high-cost "non-stol" advanced ulm aircraft in composite materials; they can easily be put onto a trailer, due to their removable or foldable wing. the main characteristics of the analyzed aircraft are shown in table 1; their main performance characteristics, in terms of landing run versus maximum level speed at sea level, are shown in fig. 1.

table 1: weights, sizes and performances at sea level of the analyzed aircraft (m. – material: a – aluminium alloy, c – composite; w.p. – wing position: h – high, l – low)
aircraft | m. | w.p. | we [n] | we/wto | (w/s)to [n/m2] | s [m2] | ar | vs [km/h] | vsff [km/h] | vmax [km/h] | rc [m/s] | stog [m] | slg [m] | clmax | clmaxff
p92 echo 80 | a | h | 2757 | 0.62 | 334.43 | 13.20 | 6.55 | 71 | 61 | 210 | 5.5 | 110 | 100 | 1.40 | 1.90
p96 golf 80 | a | l | 2757 | 0.62 | 361.84 | 12.20 | 5.78 | 71 | 61 | 225 | 4.5 | 110 | 100 | 1.52 | 2.06
remos g-3 | c | h | 2757 | 0.62 | 366.65 | 12.04 | 7.98 | 75 | 63 | 220 | 6.5 | 80 | 140 | 1.38 | 1.95
df 2000 | a | h | 2747 | 0.62 | 367.88 | 12.00 | 8.33 | 66 | 56 | 215 | 5.5 | 110 | 100 | 1.79 | 2.48
yuma (stol) | a | h | 2766 | 0.63 | 328.46 | 13.44 | 7.07 | 55 | 50 | 175 | 6.0 | 40 | 55 | 2.30 | 2.78
savannah (stol) | a | h | 2668 | 0.60 | 343.81 | 12.84 | 6.28 | 50 | 45 | 160 | 6.0 | 50 | 50 | 2.91 | 3.59
zenair ch 701 (stol) | a | h | 2580 | 0.58 | 387.24 | 11.40 | 5.90 | 53 | 48 | 153 | 7.0 | 50 | 50 | 2.92 | 3.56
amigo! | a | l | 2806 | 0.64 | 339.58 | 13.00 | 5.24 | 74 | 64 | 250 | 6.5 | 80 | 100 | 1.31 | 1.75
slepcev storch mk4 (stol) | a | h | 2649 | 0.60 | 275.91 | 16.00 | 6.76 | 52 | 46 | 155 | 4.5 | 50 | 50 | 2.16 | 2.76
sky arrow 450t | c | h | 2825 | 0.64 | 326.76 | 13.51 | 6.96 | 70 | 61 | 192 | 5.1 | 120 | 80 | 1.41 | 1.86
allegro 2000 | a | h | 2727 | 0.62 | 387.24 | 11.40 | 10.23 | 73 | 63 | 220 | 5.0 | 150 | 100 | 1.54 | 2.06
sinus 912 motoaliante | c | h | 2786 | 0.63 | 360.07 | 12.26 | 18.28 | 66 | 63 | 220 | 6.5 | 88 | 100 | 1.75 | 1.92
avio j-jabiru | c | h | 2649 | 0.60 | 474.17 | 9.31 | 9.49 | 74 | 64 | 216 | 6.0 | 100 | 160 | 1.83 | 2.45
ev-97 euro star model 2001 | a | l | 2570 | 0.58 | 448.63 | 9.84 | 6.67 | 75 | 65 | 225 | 5.5 | 125 | 90 | 1.69 | 2.25
jet fox 97 | a, c | h | 2845 | 0.64 | 301.95 | 14.62 | 6.54 | 70 | 60 | 175 | 6.0 | 100 | 120 | 1.30 | 1.77
tl 96 star | a | l | 2747 | 0.62 | 364.83 | 12.10 | 6.87 | 80 | 63 | 250 | 6.0 | 90 | 100 | 1.21 | 1.94

fig. 1: landing ground run versus maximum level speed at sea level for the analyzed aircraft, with the goal position for the aircraft to be designed

3 design point

the methodology followed during the design process is similar to that reported in [1], but it has been expressly modified for the ulm category:
in particular, new statistical relations between the take off ground run stog and the take off parameter for ulm, topulm (1), between the landing ground run slg and the landing stall speed vsl, and between the power index ip (3) and the maximum speed at sea level vmax have been calculated, as shown in figs. 2, 3 and 4. topulm is defined as:

$$\mathrm{TOP}_{ulm} = \frac{\left(\dfrac{W}{S}\right)_{to}\left(\dfrac{W}{P}\right)_{to}}{\sigma\; C_{L\,max}^{to}} \tag{1}$$

with (w/s)to in [n/m2] and (w/p)to in [n/w], and

$$\sigma = \frac{\rho}{\rho_0}. \tag{2}$$

ip is defined as:

$$I_p = \left[\frac{(W/S)}{(W/P)_{cr}}\right]^{1/3} \tag{3}$$

with w/s in [psf] and (w/p)cr in [lbs/hp];

$$P_{cr} = k_z\, k_v\, \eta\, P_{to}. \tag{4}$$

in (4), pcr and pto are respectively the power at cruise and at take off, kv and kz are the speed and altitude factors (for a four-stroke engine, kv = 1 and kz = 1/1.22), and η is the engine admission limit. the data scattering is probably due to the limited reliability of the published data, and to an unbiased difficulty in measuring the data: for example, slight differences in the executed manoeuvres lead to great differences in the measured data. for this stol aircraft, the main restrictions are the maximum speed and the take off and landing runs, as shown in fig. 5. once these limitations are reported in a graph relating the power loading (w/p)to and the wing loading (w/s)to, the resulting shaded area represents all the possible design point choices. the maximum power loading is fixed ((w/p)to = 74 n/kw), because the maximum take off weight (450 kg = 4415 n) and the power (80 hp = 59.6 kw) have been fixed. in this way, only the maximum wing loading had to be chosen, based on the criterion of keeping the wing area as small as possible (mainly for cost reasons) and of using appropriate values of the maximum take off and landing lift coefficients: (w/s)to = 324 n/m2, s = 13.6 m2, clmaxto = 2.45, clmaxl = 3.12.

fig. 2: take off ground run stog versus take off parameter top; the ulm regression is stog [m] = 0.0649 topulm^2 + 5.0024 topulm, compared with the far23 relation stog [m] = 5.22922 top23 + 0.01025 top23^2

fig. 3: landing ground run slg versus landing stall speed at sea level vsl; the ulm regression is slg [m] = 0.038 vsl^2 − 0.641 vsl, compared with the far23 relation slg [m] = 0.02354 vsl^2
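as a numerical cross-check of eqs. (1)–(4) and of the regressions quoted in figs. 2 and 3, the short sketch below is our own; the design-point numbers are those quoted in the text, while the engine admission limit η is an assumed illustrative value, since the paper does not quote one:

# design-point data quoted in the text (section 3)
sigma = 1.0                      # rho/rho0 at sea level, eq. (2)
ws_to = 324.0                    # (w/s)_to [n/m^2]
wp_to = 74.0 / 1000.0            # (w/p)_to: 74 n/kw expressed in [n/w]
clmax_to = 2.45

top_ulm = ws_to * wp_to / (sigma * clmax_to)       # eq. (1)
stog = 0.0649 * top_ulm**2 + 5.0024 * top_ulm      # ulm regression of fig. 2
print(f"top_ulm = {top_ulm:.2f} -> stog = {stog:.0f} m")   # ~55 m, cf. table 3

vsl = 48.0                       # landing stall speed, flaps down [km/h]
slg = 0.038 * vsl**2 - 0.641 * vsl                 # ulm regression of fig. 3
print(f"slg = {slg:.0f} m")

# power index, eqs. (3)-(4); w/s in psf, (w/p)_cr in lbs/hp
kv, kz = 1.0, 1.0 / 1.22         # speed and altitude factors, four-stroke engine
eta = 0.75                       # engine admission limit (assumed, not from the paper)
p_cr_hp = kz * kv * eta * 80.0   # eq. (4), cruise power [hp]
w_lb = 450.0 * 2.20462           # mtow [lbs]
ws_psf = ws_to * 0.020885        # n/m^2 -> psf
ip = (ws_psf / (w_lb / p_cr_hp)) ** (1.0 / 3.0)    # eq. (3)
print(f"ip = {ip:.2f}")          # falls in the 0.7-0.95 band of fig. 4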
4 preliminary design

the conceptual loop is shown in fig. 6. it looks simple, but, for example, converting the geometry of sections into cad geometry is a complicated and delicate step: aircraft surfaces must be carefully defined, otherwise the aircraft geometry will differ from the desired design.

fig. 4: maximum speed vmax at sea level versus power index ip

fig. 5: maximum power loading (w/p)to versus maximum wing loading (w/s)to, with the take off distance, maximum cruise speed, landing distance and all-engine-operative rate of climb limits; the limit lines correspond to clmaxl = 3.2 and clmaxto = 2.6, and the design point lies at the chosen power loading (w/p)to

fig. 6: conceptual design loop (geometry of sections → cad geometry → section generation, mass and inertial data, surface grids and fem → semi-empirical aerodynamic and structural calculations → performance and flight-quality virtual simulation → "have the design goals been achieved?" → if not, parametric optimization; if yes, detailed calculations, wind tunnel and flight tests)

the parametric optimization loop is shown in fig. 7. first of all, the preliminary geometry was fixed by analyzing existing aircraft and applying semi-empirical methods. the wing was sized to minimize the required power at cruising speed. several airfoils were analyzed, and a new airfoil was designed (by modifying the naca gaw1 airfoil) to provide a compromise between the lift, drag and pitching moment coefficients. the high-lift system and aileron sizing ensure the stol characteristics and good lateral control; this has been demonstrated by j. roskam [2], w. mccormick [3], c. d. perkins and r. e. hage [4], and by the authors [5]. in particular, two possible high-lift system configurations are shown in fig. 8. the horizontal and vertical tails were sized by the volume method, ensuring good stability and control also in landing. the fuselage design is very important, and it was based on aerodynamic, ergonomic and line-of-sight studies, as shown in fig. 9. a 3-view of the aircraft is shown in fig. 10; table 2 reports the main dimensions, weights and loadings.

fig. 7: parametric optimization loop (an attempt geometry defined by n parameters → definition of the goal function and constraints, e.g. decreasing the fuselage drag while binding the diameter to a prefixed ergonomic value and leaving the structural endurance unmodified → numerical analysis (aerodynamic, structural, performance and flight quality) → parameter modification, respecting the constraints, according to mathematical logics seeking the goal function optimum (minimum) → stop when the goal function has been minimized)

fig. 8: possible high-lift system configurations: (a) slat – single slot; (b) slat – fowler (flap deflections 20°/15° at take off and 40°/15° at landing)
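the quoted lift coefficients tie directly to the stall speeds of table 3 through the standard relation $v_s = \sqrt{2\,(W/S)/(\rho\, C_{L\,max})}$; the check below is our own (isa sea-level density assumed):

import math

rho0 = 1.225                     # isa sea-level density [kg/m^3]
ws = 324.0                       # wing loading [n/m^2]

def stall_speed_kmh(clmax):
    return math.sqrt(2.0 * ws / (rho0 * clmax)) * 3.6

print(f"vs (landing, clmaxl = 3.12)   = {stall_speed_kmh(3.12):.0f} km/h")  # ~47, cf. 48 km/h in table 3
print(f"vs (take off, clmaxto = 2.45) = {stall_speed_kmh(2.45):.0f} km/h")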
fig. 9: ergonomics and line of sight of the fuselage

fig. 10: 3-view of the aircraft

5 numerical analysis

the design was accomplished using a code named aereo [5], which has been developed in recent years at dpa to predict all the aerodynamic characteristics in linear and non-linear conditions (high angles of attack), all the flight performances, and the dynamic behaviour and flight qualities of propeller-driven aircraft. the figures below report some aerodynamic characteristics (figs. 11, 12, 13 and 14) and performance characteristics (fig. 15) of the aircraft calculated with the aereo code. table 3 reports the main performances of the aircraft. further optimization of the global configuration is in progress, to improve the wing aero-structural behaviour as well as the relative position of the wing and horizontal tail, in order to minimize downwash and induced drag.

table 2: main dimensions, weights and loadings
wing: span 9.71 m; root chord 1.40 m; tip chord 1.40 m; aspect ratio 6.93; incidence 2.00 deg
horizontal tail: span 2.80 m; root chord 0.72 m; tip chord 0.72 m; aspect ratio 3.90
vertical tail: span 1.47 m; root chord 0.87 m; tip chord 0.61 m; aspect ratio 2.00; incidence 0.00 deg; leading edge sweep angle 22.20 deg; trailing edge sweep angle 13.00 deg
external dimensions: length overall 6.52 m; height overall 1.35 m
propeller (fixed-pitch): 3 blades; diameter 1.66 m
areas: wing 13.60 m2; ailerons 1.22 m2; leading edge flap (slat) 2.04 m2; trailing edge flap (single slot) 2.85 m2; horizontal tail 2.01 m2; vertical tail 1.08 m2
weights and loadings: empty weight 280 kg (2747 n); max t-o and landing weight 450 kg (4415 n); max wing loading 33.09 kg/m2 (324 n/m2); max power loading 5.63 kg/hp (74 n/kw)

table 3: performances (max weight, isa, at sea level)
max speed: 194 km/h
cruising speed: 165 km/h
stall speed (flaps up): 65 km/h
stall speed (flaps down, slat – single slot): 48 km/h
max rate of climb: 6.69 m/s
take off run: 55 m
take off run to 15 m: 121 m
landing run: 50 m
landing run from 15 m: 100 m
theoretical ceiling: 7908 m
service ceiling: 7317 m

6 conclusion

the preliminary design of a stol ulm aircraft and the numerical performance prediction have been shown. the aircraft shows acceptable performances, consistent with the desired design goals. the predicted performances were obtained with the aereo code, which confirmed its usefulness as a fast and reliable design tool for propeller-driven aircraft. the parametric design and optimization loops have been highlighted. detailed design and optimization of the high-lift system and three-dimensional aerodynamic analysis are in progress, while wind tunnel tests (high-lift airfoil, aircraft model) are planned in the near future.

fig. 11: polar curves parameterized in δ (horizontal all-movable tail deflection) and the equilibrium polar curve

fig. 12: lift coefficient of the aircraft versus alpha body (incidence angle measured with respect to the thrust axis), parameterized in δ (horizontal all-movable tail deflection), and the equilibrium lift coefficient
12: lift coefficient of the aircraft versus alpha body (incidence angle measured in regard to the thrust axis) parameterized in � (horizontal all-movable tail deflection) and equilibrium lift coefficient -0.6 -0.4 -0.2 0 0.2 0.4 0.6 -0.5 0 0.5 1 1.5 2 cl c m � � � �� � � � �� � � � �� � � ��� � � ��� � � ��� � � �� � fig. 13: pitch moment coefficient versus lift coefficient parameterized in � (horizontal all-movable tail deflection) references [1] roskam, j.: part i: preliminary sizing of airplanes. lawrence, kansas 66044, u.s.a.: 120 east 9th street, suite 2, darcorporation, 1997. [2] roskam, j.: part vi: preliminary calculation of aerodynamic, thrust and power characteristcs. lawrence, kansas 66044, u.s.a.: 120 east 9th street, suite 2, darcorporation, 2000. [3] mccormick, w.: aerodynamics, aeronautics and flight mechanics. new york, chichester, brisbane, toronto, singapore, john wiley & sons, 1979. [4] perkins, c. d., hage, r. e.: airplane performance, stability and control. new york, john wiley & sons, 1949. [5] coiro, d. p., nicolosi, f.: “aerodynamics, dynamics and performance prediction of sailplanes and light aircraft.” technical soaring, vol. 24, no. 2, april 2000. prof. d. p. coiro phone: +39 081 7683322 fax: +39 081 624609 e-mail: coiro@unina.it dr. a. de marco e-mail: agodemar@unina.it dr. f. nicolosi e-mail: fabrnico@unina.it dr. n. genito e-mail: nigenito@unina.it dr. s. figliolia e-mail: jdrfig@tin.it dipartimento di progetazione aeronautica (dpa) university of naples “federico ii” via claudio 21 80125 naples, italy 80 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 45 no. 4/2005 czech technical university in prague 0 1000 2000 3000 4000 5000 6000 7000 8000 9000 0 5 10 15 20 25 30 35 40 45 50 55 t [min] v[m/s] z [m ] tmin vmin vmax v(rcmax) rcmax theoretical ceiling service ceiling fig. 15: flight envelope -20 -18 -16 -14 -12 -10 -8 -6 -4 -2 0 2 60 80 100 120 140 160 180 200 v [km/h] � e q [d e g ] fig. 14: equilibrium horizontal all-movable tail deflection versus speed (center of gravity position is at 25 % of mean aerodynamic chord) acta polytechnica https://doi.org/10.14311/ap.2022.62.0248 acta polytechnica 62(2):248–261, 2022 © 2022 the author(s). licensed under a cc-by 4.0 licence published by the czech technical university in prague effect of pigments on bond strength between coloured concrete and steel reinforcement joseph j. assaad∗, matthew matta, jad saade university of balamand, department of civil and environmental engineering, balamand, po box 100, al kourah, lebanon ∗ corresponding author: joseph.assaad@balamand.edu.lb abstract. the effect of pigments on mechanical properties of coloured concrete intended for structural applications, including the bond stress-slip behaviour to embedded steel bars, is not well understood. series of concrete mixtures containing different types and concentrations of iron oxide (red and grey colour), carbon black, and titanium dioxide (tio2) pigments are investigated in this study. regardless of the colour, mixtures incorporating increased pigment additions exhibited higher compressive and splitting tensile strengths. this was attributed to the micro-filler effect that enhances the packing density of the cementitious matrix and leads to a denser microstructure. also, the bond to steel bars increased with the pigment additions, revealing their beneficial role for improving the development of bond stresses in reinforced concrete members. 
the highest increase in bond strength was recorded for mixtures containing tio2, which was ascribed to formation of nucleus sites that promote hydration reactions and strengthen the interfacial concrete-steel transition zone. the experimental data were compared to design bond strengths proposed by aci 318-19, european code ec2, and ceb-fip model code. keywords: coloured concrete, iron oxide pigment, carbon black, titanium dioxide, durability, bond strength. 1. introduction pigments including iron oxide (io), carbon black (cb), and titanium dioxide (tio2) are finely ground particles for integral colouring of concrete and cementitious materials intended for architectural applications [1–3]. often manufactured as per astm c979 [3] specification, the pigments are bound onto the surface of cement grains, thus altering the colour characteristics by absorbing certain wavelengths of the visible light and reflecting others. the ios are synthetic colourants manufactured to display a variety of colours (i.e., red, grey, yellow, etc.), thus infusing the concrete with their shades [4, 5]. these pigments are stable in the high-alkaline portland cement environment, conferring proper colour fastness to sunlight exposure and resistance to weathering effects. the cb is an economical black colourant with high tinting strength produced from petroleum and charring organic materials [1, 6]. compared to io, the cb generally disrupts the air-entrainment and increases the vulnerability of concrete to leaching when exposed to repeated wet/dry cycles [6]. the white-coloured tio2 mostly occurs in the natural rutile and anatase crystal forms [7, 8]; it is normally used with white cement and other pozzolanic materials (metakaolin) to brighten the cementitious mixture. earlier studies showed that the pigment characteristics (i.e., type, fineness, mineralogy, morphology, solubility, etc.), additions rates, and dispersion can drastically alter the fresh and hardened concrete properties [9–11]. for instance, it is accepted that pigments absorb part of the free mixing water because of a significantly higher surface area (vs. cement), thus requiring increased water demand and/or high-range water reducer (hrwr) to achieve a given workability [12–14]. meng et al. [15] reported that the drop in fluidity due to tio2 can be controlled through hrwr and slag additions. if poorly dispersed, added pigments may agglomerate in the cement matrix, causing unreacted pockets or weak zones that decrease mechanical properties [15, 16]. lopez et al. [17] suggested using the mortar phase, while others recommended the use of extended mixing time [18] or water-based colourants [19] to improve pigment dispersibility. despite their chemically inert nature, most studies showed that synthetic io and cb pigments lead to increased strength and durability of cementitious materials. this is generally associated to a microfiller effect that blocks the capillary pores and leads to a denser microstructure [9, 20, 21]. yildizel et al. [22] found that yellow and black ios lead to increased strength and resistance to water permeability. mortars produced using red pigments exhibited relatively higher pore ratios, which detrimentally affected freeze/thaw resistance and durability. assaad et al. [11] reported that strength and bond to existing substrates increased when red or yellow pigments are incorporated by up to 6 % of cement mass. 
the curtail in strength at high pigment rates (above 6 %) was related to improper hydration reactions resulting from the excessive amount of powders that are adsorbed onto the cement grains. masadeh [23] found that 248 https://doi.org/10.14311/ap.2022.62.0248 https://creativecommons.org/licenses/by/4.0/ https://www.cvut.cz/en vol. 62 no. 2/2022 effect of pigments on bond strength between coloured . . . physical properties specificgravity median particle size soundness blaine fineness lightness (l-value) 3.15 26.5 µm 0.04 % 3150 cm2/g 88.5 chemical properties cao sio2 al2o3 fe2o3 mgo 68.5 % 21.8 % 4.15 % 0.27 % 1.18 % table 1. physical and chemical properties of white cement. cb incorporated up to 0.5 % of cement mass reduces the concrete chloride permeability and corrosion rates of inserted steel bars. inoue et al. [24] noted that the cb treatment using aqueous solution of humic acids helps improving the dispersibility together with a reduced interaction with air-entraining surfactants and superior adhesion to the cement matrix (i.e., less leaching). in addition to the micro-filler effect, numerous researchers found that tio2 can participate in the cement hydration process, at least as nucleation sites, to accelerate setting times and promote strength development [15, 16, 25, 26]. chen et al. [27] showed that concrete durability and resistance to water infiltration significantly improved with 3 % tio2 additions, given the conversion of greater volume of calcium hydroxide (ch) crystals into c-s-h gels. zhang et al. [28] reported that tio2 acts as a filler in empty spaces and crystallization centre of ch to refine the concrete microstructure including its resistance to chloride ion penetration. folli et al. [29] speculated that the strength improvement might be related to alteration in packing density and nucleus orientation around the interfacial transition zones, rather than increased amounts of hydration products. 2. context and paper objectives the performance of coloured concrete in structural members, including the extent to which the use of pigments would alter the bond strengths to embedded steel bars, is not well understood. generally, the transfer of stresses between the reinforcement and surrounding concrete is attributed to chemical adhesion and mechanical bearing arising from the concrete surface around the steel ribs [30, 31]. the parameters affecting the bond are broadly related to the reinforcement characteristics (i.e., yield strength of bar, size, geometry, epoxy coating, cover, position in cast member, etc.) and concrete constituents and properties (i.e., density, strength, workability, presence of fibres, mineral admixtures, etc.) [32]. the spliced or developed lengths are computed by relevant models proposed by various building codes; for example, aci 318-19 [33] considers that the development length for deformed bars in tension members is inversely proportional to the square root of compressive strength, multiplied by specific factors to account for special considerations due to the reinforcement size, lightweight concrete, top bars, epoxy-coated bars, and contribution of confining transverse reinforcement. yet, limited attempts have been made to assess the validity of existing models and design provisions in the case of coloured concrete. this paper is a part of a comprehensive research project undertaken to assess the effect of pigments on durability and mechanical properties of coloured concrete mixtures. 
two concrete series made with 350 and 450 kg/m3 cement content and various concentrations of io (red or grey colour), cb, and tio2 pigments are investigated. tested properties included the compressive strength, splitting tensile strength, modulus of elasticity, and bond stress-slip behaviour to reinforcing steel bars. the experimental data were compared to design bond strengths proposed by relevant building codes including aci 318-19 [33], european code ec2 [34], and ceb-fip model code [35]. data reported herein can be of interest to civil engineers and architects seeking the use of pigments in coloured concrete intended for structural applications. 3. experimental program 3.1. materials white-coloured portland cement conforming to astm c150 type i was used in this study. its physical and chemical properties are listed in table 1. the gradations of siliceous sand and crushed limestone aggregate were within the astm c33 specifications. the specific gravity for the sand, fineness modulus, and absorption rate were 2.65, 3.1, and 0.75 %, respectively. those values were 2.72, 6.4, and 1 %, respectively, for the coarse aggregate, while the nominal maximum particle size was 20 mm. naphthalenebased hrwr was used; its specific gravity, solid content, and maximum dosage rate were 1.2, 40.5 %, and 3.5 % of cement mass, respectively. commercially available io (i.e., red and grey colour), cb, and tio2 pigments were used. as shown in figure 1, the red and grey coloured ios had almost spherical shapes; their specific gravities were 4.64 and 4.8, respectively, while their fe2o3 contents were 97.5 % and 98.8 %, respectively. the white-coloured tio2 is rutile-based manufactured by the chloride process; it also possesses round shape (figure 1) with a specific gravity of 4.1. the cb is produced by combustion of aromatic petroleum oil feedstock and consists essentially of pure carbon (i.e., > 98 %); its specific gravity was 2.05. the particle size gradation curves obtained by laser diffraction for the various pigments are plotted in 249 j. j. assaad, m. matta, j. saade acta polytechnica red io grey io tio2 carbon black figure 1. morphology of various pigments used. 0 20 40 60 80 100 0.01 0.1 1 10 100 pe rc en t p as si ng particle diameter, 𝛍m cement red io grey io tio2 carbon black figure 2. particle size distribution curves for the cement and various pigments. figure 2. generally speaking, the fineness of io pigments is pretty close to each other; the median diameter (d50) computed as the size for which 50 % of the material is finer for the grey and red io is 4.1 and 5.3 µm, respectively. the tio2 and cb were remarkably finer, which shifted the gradation curves towards much smaller particle sizes. the resulting d50 dropped to 0.21 and 0.092 µm for the tio2 and cb, respectively. deformed steel bars complying to astm a615 no. 13 were used in this work to evaluate the effect of pigments on bond stress-slip properties of coloured concrete to embedded rebars. the bar nominal diameter (db), young’s modulus, and yield strength (fy ) were 12 mm, 205 gpa, and 520 mpa, respectively. 3.2. mixture proportions two control concrete mixtures containing 350 (or, 450) kg/m3 cement with 0.5 (or, 0.42) water-to-cement ratio (w/c) were considered; the corresponding 28days f ′c was 26.7 and 34.2 mpa, respectively. the fine and coarse aggregate contents in the lean concrete mixture were 830 and 1020 kg/m3, respectively; while these were 790 and 925 kg/m3 in the higher strength concrete mix. 
the resulting sand-to-total aggregate ratio was 0.45. the hrwr dosage was either 2.6 % or 2.35 % of cement mass, respectively, in order to secure a fixed workability corresponding to a slump of 210 ± 10 mm. the io, cb, and tio2 pigments were incorporated at three different concentrations varying from 1.5 % to 4.5 % of cement mass, at 1.5 % increment rates. the 250 vol. 62 no. 2/2022 effect of pigments on bond strength between coloured . . . tio2 carbon black control mix red io grey io 1.5% 3% 4.5% 1.5% 3% 4.5% 1.5% 3% 4.5% 1.5% 3% 4.5% concrete with cement content = 350 kg/m3 figure 3. photo of coloured cylinders after the splitting tensile test. mixing sequence consisted on homogenizing the fine aggregate, coarse aggregate, and powder pigment for 3 minutes to ensure efficient dispersion of colourant materials. the cement, water, and hrwr were then sequentially introduced over 2 minutes. after 30 sec rest period, the mixing resumed for 2 more minutes. the ambient temperature and relative humidity (rh) remained within 23 ± 3 °c and 55 ± 5 %, respectively. 3.3. testing procedures 3.3.1. fresh and hardened properties right after mixing, the workability and air content were determined as per astm c143 and c231, respectively. the concrete was cast in 100 × 200 mm cylinders to determine the density, compressive strength (f ′c), and splitting tensile strength (f t) as per astm c642, c39, and c496, respectively [36–38]. all specimens were demoulded after 24 hours, cured in water, and tested after 28 days. averages of 3 values were considered. the modulus of elasticity (e) was determined through ultrasonic pulse velocity (u p v ) measurements using 100 × 200 mm concrete cylinders, as per astm c597 [39]. the pulse velocity was computed as the ratio between the 200 mm length of the concrete specimen to the measured transit time. the e was computed using the conventional wave propagation equation in solid rocks, expressed as: e, gp a = [(ρ × u p v 2)/g] 10−2, where g is gravity acceleration (9.81 m/s2) and ρ is the concrete density (kg/m3) [1, 31]. 3.3.2. colourimetry the l, a, and b colour coordinates were determined following the commission internationale d’eclairage (cie) system using a portable colourimeter. the l-value reflects the colour lightness varying from 0 (black) to +100 (white), a-value represents the chromatic intense of magenta/red (+127) and green (–128), and b-value the chromatic intense of yellow (+127) and blue (–128) [19, 40]. the specimens were oven-dried for one day at 50 ± 5 °c prior to testing. the measurements were realized using the broken cylinders after the tensile splitting test, as shown in figure 3. special care was taken to position the colourimeter sensor in the mortar phase (not the aggregate particle); while an average of 6 measurements was considered. the colour deviation (∆(e)) due to pigment additions from the control mix was determined as: ∆(e) = √ (lc − l)2 + (ac − a)2 + (bc − b)2, where lc = lcontrol, ac = acontrol, bc = bcontrol. 3.3.3. bond to steel reinforcement the direct bond method was used to determine the bond stress-slip properties of concrete mixtures, in accordance with rilem/ceb/fib specification [41]. the bars were vertically centred in the 150 mm cubic moulds (figure 4); the embedded length was 60 mm (5 db) and pvc bond breaker was inserted around the bar at the concrete surface to reduce the concentration of stresses during loading. 
after 24 hours from casting, the specimens were demoulded and covered with plastic bags to cure at 23 ± 3°c for 28 days. the direct bond test was realized using a universal testing machine, whereby the pullout load and slips of the steel bar relative to the concrete block are recorded [30, 42]. the tensile load was gradually applied until failure at a rate hovering 0.25 kn/sec. 4. test results and discussion 4.1. hrwr demand table 2 summarizes the hrwr demand and colour coordinates for mixtures prepared with various pigment types and concentrations. in line with current literature [1, 4, 8], the demand for hrwr increased with pigment additions, given their higher fineness 251 j. j. assaad, m. matta, j. saade acta polytechnica 150 mm 150 mm 140 mm steel bar unbonded length (80 mm) bonded length (5db = 60 mm) top view: bar placed centrally side view: figure 4. photo of the experimental testing of bond stress-slip properties. mixture codification hrwr [% of cement] slump [mm] air content [%] density [kg/m]3 l a b ∆(e) 350-control 2.6 205 2.8 2320 67.8 5.6 19.3 – 350-red-1.5 % 2.6 205 n/a 2315 55.2 12.8 12.9 15.8 350-red-3 % 2.9 200 2.7 2330 51.9 16.1 10.6 21.0 350-red-4.5 % 3.2 200 2.9 2375 44.4 19.8 12 28.3 350-grey-1.5 % 2.7 195 n/a 2330 61.9 2.2 12.4 9.6 350-grey-3 % 2.9 205 3.1 2360 58.4 0.8 7.2 16.0 350-grey-4.5 % 3.1 205 n/a 2350 49.4 -0.1 4.0 24.6 350-tio2-1.5 % 2.7 210 n/a 2310 72.0 5.3 19.8 4.3 350-tio2-3 % 3.0 210 3 2340 71.9 5.4 19.7 4.2 350-tio2-4.5 % 3.1 195 2.9 2380 76.3 4.4 17.9 8.7 350-cblack-1.5 % 2.9 190 n/a 2350 42.4 -0.2 -0.2 32.5 350-cblack-3 % 3.5 195 3.2 2340 42.2 -0.5 -0.7 33.0 350-cblack-4.5 % 3.6 205 3.4 2390 31.0 -0.05 -1.2 42.4 450-control 2.4 205 3 2345 68.3 5.7 19.1 – 450-red-3 % 2.5 210 3.1 2385 48.8 19.0 10.8 25.0 450-grey-3 % 2.8 200 3 2385 56.2 0.58 6.9 17.9 450-tio2-3 % 2.7 200 2.8 2405 77.1 5.1 19.1 8.9 450-cblack-3 % 3.2 205 3.5 2415 41.9 -0.4 -0.8 33.6 the mix codification refers to cement content-pigment type-pigment dosage. n/a refers to not tested. table 2. effect of pigment types and concentrations on workability and colourimetry properties. 252 vol. 62 no. 2/2022 effect of pigments on bond strength between coloured . . . 15.8 21.0 28.3 9.6 16.0 24.6 4.3 4.2 8.7 32.5 33.0 42.4 0 10 20 30 40 50 1. 5% 3% 4. 5% 1. 5% 3% 4. 5% 1. 5% 3% 4. 5% 1. 5% 3% 4. 5% ∆( e) red grey tio2 c. black cement content = 350 kg/m3 figure 5. effect of pigment type and concentration on ∆(e) variations for concrete prepared with 350 kg/m3 cement. that absorbs part of the mixing water and results in requirement of additional superplasticizing molecules to ensure the targeted slump of 210 ± 10 mm. for example, at a 3 % pigment rate, the hrwr dosage varied from 2.6 % for the 350-control mix to 2.9 % and 2.85 % for the 350-red-3 % and 350-grey-3 % mixtures, respectively. the increase in hrwr was particularly pronounced for the cb pigment, given its extremely fine particles [13]. hence, the hrwr reached 3.45 % and 3.6 % for the concrete containing 3 % or 4.5 % cb, respectively. 4.2. colourimetry as can be noticed in table 2, the lightness of colour (lvalue) increased from 67.75 for the 350-control mix to 72 and 76.3 with the addition of 1.5 % and 4.5 % tio2, respectively, which can be attributed to the intrinsic white-coloured nature of this pigment [26, 29]. yet, as expected, the l-value followed a decreasing trend when darker pigments were used; it dropped to 49.35, 44.35, and 31 for mixtures containing 4.5 % grey, red, and cb pigments, respectively. 
to the other end, the magenta chromatic intense (i.e., a-value) varied from 5.55 to 16.1 for the 350-control and 350-red-3 % mixtures, respectively, while in contrast, the highest b-value of 19.75 corresponded to the 350-tio2-1.5 % mix. concrete mixtures prepared with cb exhibited negative a and b values, reflecting the black colouring effects of such powders. as shown in figure 5, the mixtures containing the white-coloured tio2 pigments exhibited the lowest ∆(e) values, reflecting relatively limited variations with respect to the control mix. the incorporation of red or grey io pigments gradually increased the ∆(e) values that varied from 9.6 to 28.3, while the cbmodified mixtures exhibited the highest ∆(e) that varied from 32.5 to 42.4. it should be noted that ∆(e) steadily increased with pigment additions (figure 5), without showing a clear stabilization tendency that reflects colour saturation [11, 17]. this can be attributed to the relatively reduced cement volume (i.e., about 11 % of the overall concrete mix), thus requiring additional pigment powders to achieve colour saturation. additionally, the beige-like colour of natural sand could have affected the pigment tinting strength, which reduced the tendency towards the colour saturation [10, 11]. 4.3. hardened properties the effect of pigment type and concentration on the 28-days f ′c for concrete prepared with 350 kg/m3 cement are summarized in table 3, and plotted in figure 6. regardless of the colour, mixtures incorporating increased pigment additions exhibited higher strength values. for example, compared to the 26.7 mpa value obtained for the control mix, the f ′c increased to 29.7 and 34.9 mpa for the mixtures containing 1.5 % and 4.5 % red io, respectively. such values reached 30 and 34.2 mpa for the mixtures containing 1.5 % and 4.5 % grey io, respectively. this could be associated to the micro-filler effect and enhanced packing density that lead to a denser microstructure capable of supporting higher loads. yildizel et al. [22] reported that io pigments are inert materials (i.e., do not react with water) that fill the interspaces and capillary pores in cementitious systems, leading to an improved resistance against permeability and attack of aggressive ions. for the given concentration, the effect of cb on strength development is pretty similar or slightly higher than the io pigments. hence, the f ′c reached 35.1 mpa for the 350-cblack-4.5 % mixture. knowing the inert nature of such powders, the increase in strength can be physically related to the micro-filler effect that enhances packing density of the cementitious matrix. the highest increase in f ′c was recorded for concrete mixtures prepared with tio2 additions; this 253 j. j. assaad, m. matta, j. saade acta polytechnica 26.7 29.7 34 34.9 30 29.8 34.2 33.5 36 37.7 30.2 34 35.1 25 30 35 40 co nt ro l 1. 5% 3% 4. 5% 1. 5% 3% 4. 5% 1. 5% 3% 4. 5% 1. 5% 3% 4. 5% f'c , m pa red grey tio2 c. black cement content = 350 kg/m3 figure 6. effect of pigment types and concentrations on hardened properties and bond to steel bars. 
mixture codification 7-d f ′c [mpa] 28-d f ′c [mpa] f t [mpa] upv [km/s] e [gpa] τu [mpa] δu [mm] 350-control 14.8 26.7 2.16 3.55 29.8 11.36 4.1 350-red-1.5 % 17 29.7 2.28 3.58 30.2 11.94 5.3 350-red-3 % 20.1 34 2.41 3.6 30.8 18.00 10.3 350-red-4.5 % 19.8 34.9 3.06 3.62 31.7 17.78 10.5 350-grey-1.5 % 17.2 30 n/a 3.5 29.1 n/a n/a 350-grey-3 % 20.2 29.8 2.86 3.62 31.5 12.74 4.9 350-grey-4.5 % 20.6 34.2 3.36 3.58 30.7 14.05 6.1 350-tio2-1.5 % 19.7 33.5 2.62 3.57 30.0 17.25 6.2 350-tio2-3 % 19.6 36 2.7 3.66 32.0 n/a n/a 350-tio2-4.5 % 20.6 37.7 3.22 3.62 31.8 20.70 7.8 350-cblack-1.5 % 19 30.2 2.87 3.6 31.0 15.34 4.4 350-cblack-3 % 18.8 34 n/a 3.72 33.0 n/a n/a 350-cblack-4.5 % 20.5 35.1 3.28 3.8 35.2 17.89 5 450-control 23.4 34.2 2.56 3.7 32.7 18.90 5.8 450-red-3 % 27.3 38.9 3.76 3.8 35.1 23.41 11.5 450-grey-3 % 25.8 42.3 n/a 3.83 35.7 19.78 6.3 450-tio2-3 % 28.6 44.5 3.94 3.7 33.6 24.85 9.2 450-cblack-3 % 27.6 41.6 3.85 3.9 37.4 21.96 7.2 n/a refers to not tested. table 3. effect of pigment types and concentrations on workability and colourimetry properties. 254 vol. 62 no. 2/2022 effect of pigments on bond strength between coloured . . . y = 6.962x + 13.87 r² = 0.7 y = 2.976x + 23.32 r² = 0.58 25 29 33 37 41 45 2.0 2.4 2.8 3.2 3.6 4.0 f'c , m pa o r e, g pa ft, mpa f'c e figure 7. relationships between f t with respect to f ′c and e responses for all tested concrete mixtures. 0 4 8 12 16 20 24 0 2 4 6 8 10 12 14 bo nd s tr es s, m pa displacement, mm mixtures with 350 kg/m3 cement control mix grey-4.5% tio2-4.5% red-4.5% figure 8. typical bond stress vs. displacement curves for mixtures prepared with 350 kg/m3 cement. reached 33.5 and 37.7 mpa at 1.5 % and 4.5 % rates, respectively. besides the micro-filler effect, the increase in strength may be ascribed to the formation of nucleation sites that promote hydration reactions and precipitate additional gels in the hardened matrix [16, 29]. the hydration products grow around the tio2 particles, causing the formation of secondary cs-h in the capillary pores that reduces the porosity of the matrix. as shown in table 3, the f ′c significantly increased from 34.2 mpa for the 450-control mix to 44.5 mpa for the 450-tio2-3 % concrete, which can be associated to the micro-filler effect prompted with additional c-s-h hydrating compounds that could refine the concrete microstructure [27, 28]. the effect of pigment type and concentration on f t and e properties is quite similar to the one observed on f ′c responses. hence, the strength increased with io and cb pigments, while being particularly pronounced with the use of tio2 (table 3). moderate relationships with correlation coefficients (r2) larger than 0.58 are obtained between the hardened properties for all tested concrete mixtures prepared with 350 and 450 kg/m3 cement, as shown in figure 7. 4.4. bond stress-slip behavior table 3 summarizes the ultimate bond strength (τu) at failure and corresponding slip (δu) for all tested concrete mixtures. it is worth noting that the coefficient of variation (cov) for τu responses determined for selected mixtures varied from 9.6 % to 14.7 %, representing an acceptable repeatability. the steel bars did not reach their yielding state during pullout testing (i.e., the yielding load is 58.8 kn). a pullout mode of failure occurred for all tests, whereby the concrete crushed and sheared along the embedded steel region with no visible cracks on the external concrete specimens [30, 31]. typical bond stress-slip (τ vs. 
δ) curves determined for the 350-control mix and those incorporating different pigment types and concentrations are plotted in figure 8. all curves are initially linear, which can be ascribed to the adhesive component of the bond and mechanical interlock that takes place between the 255 j. j. assaad, m. matta, j. saade acta polytechnica 11.36 11.94 18.00 17.78 12.74 14.05 17.25 20.70 15.34 17.89 10 13 16 19 22 co nt ro l 1. 5% 3% 4. 5% 3% 4. 5% 1. 5% 4. 5% 1. 5% 4. 5% u lti m at e bo nd s tr en gt h, m pa red grey tio2 c. black cement content = 350 kg/m3 figure 9. effect of pigment type and concentration on τu for mixtures prepared with 350 kg/m3 cement. y = 1.129x + 15.12 r² = 0.83 y = 0.104x + 1.17 r² = 0.57 2.0 2.5 3.0 3.5 4.0 4.5 25 29 33 37 41 45 10 13 16 19 22 25 28 ft , m pa f'c , m pa ultimate bond strength, mpa f'c ft figure 10. relationships between τu with respect to f ′c and f t for all tested concrete mixtures. embedded steel ribs and the surrounding concrete [32]. the 350-tio2-4.5 % mixture exhibited the highest τ vs. δ responses, given the micro-filler effect and formation of secondary c-s-h hydrating compounds that strengthen the interfacial concrete-steel transition zone [26, 29]. hence, compared to the 11.36 mpa value obtained from the control mix, τu reached 20.7 mpa for the 350-tio2-4.5 % concrete. the increase in τu was also noticeable for mixtures prepared with io pigments, albeit this remained comparatively lower than what was achieved with tio2 additions. hence, τu reached 14.05 and 17.78 mpa for the 350-grey-4.5 % and 350-red-4.5 %, respectively. when the adhesive and interlock components fail, the concrete between the steel ribs breaks, causing excessive local slips at reduced bond stresses [42, 43]. only the frictional bond component remains in the post-peak region of τ vs. δ curves, whereby the steel bars are dynamically pulled out from the concrete specimens. figure 9 summarizes the effect of the pigment type and concentration on τu responses determined for mixtures prepared with a 350 kg/m3 cement content. regardless of the pigment type, τu gradually increased with such additions, which practically reveals their beneficial role for improving the development and transfer of bond stresses in reinforced concrete members. the highest value of 20.7 mpa was recorded for the concrete containing the highest tio2 concentration of 4.5 %. this was followed by mixtures incorporating 3 % and 4.5 % red io as well as those made using 4.5 % cb; the resulting τu hovered around 18 mpa. just like the mechanical properties, the increase in τu due to inert io or cb pigments can be attributed to the micro-filler effect that densifies the cementitious microstructure around the steel ribs, leading to an improved bond behaviour. moderate relationships with r2 of 0.57 and 0.83 are established between τu with respect to f ′c and f t for all tested concrete mixtures (figure 10). as shown in figure 8, the increase in τu due to pigment additions is accompanied by an increase in the maximum slip that occurs at failure. for example, δu of 4.1 mm was registered for the 350-control mix, while it reached 6.1 and 7.8 mm for the 350-grey256 vol. 62 no. 2/2022 effect of pigments on bond strength between coloured . . . y = 0.838x + 10.28 r² = 0.41 y = 0.86x + 14.89 r² = 0.67 10 14 18 22 26 3 4 5 6 7 8 9 10 11 12 u lt im at e bo nd s tr en gt h, m pa displacement at failure, mm cement = 350 kg/m3 cement = 450 kg/m3 figure 11. 
relationships between τu and δu for mixtures prepared with 350 or 450 kg/m3 cement. 4.5 % and 350-tio2-4.5 %, respectively. in addition to the improved bond strength, this reflects that pigments confer higher ductility, which can be particularly relevant in high-strength concrete reinforced members [33, 43, 44]. figure 11 plots the relationships between τ and δ for concrete mixtures prepared with 350 or 450 kg/m3 cement. clearly, mixtures exhibiting higher τu are characterized by increased displacements at failure. 5. comparison with international bond models in order to ensure compliance of bond properties of the coloured concrete to international design code models, the τu values determined experimentally are compared with the design bond strengths (τmax) specified in ceb-fip [35], aci 318-19 [33], and european code ec2 [34] models. the ceb-fip (2010) considers that the stiffness of ascending τ vs. δ curves follows an exponential trend raised to a power of 0.4, until reaching τmax equal to 2 or 2.5 √ f ′c, depending on whether the concrete is confined or not. this can be expressed in eqs. 1 and 2 as follows. ceb-fip for unconfined concrete: τmax = 2 √ f ′c (1) ceb-fip for confined concrete: τmax = 2.5 √ f ′c (2) in ultimate state conditions, the aci 318-19 [33] considers that τmax can be calculated as: τmax = 10 √ f ′c ( cb+ktr db ) 4 × 9 ψt ψe ψs λ (3) where cb is the concrete cover and ktr the transverse reinforcement index (note that the (cb + ktr )/db ratio is limited to 2.5). the ψs, ψe, ψt, and λ factors refer to the bar-size, epoxy coated bars, bar location with respect to the upper surface, and lightweight concrete, respectively. in this study, ψs is taken as 0.8 for bars no. 13, while ψt, ψe, and λ equal to 1. the τmax expression proposed by ec2 [34] for determining the ultimate bond stress is given as: τmax = 2.25 η1 η2 fctd (4) where η1 is a coefficient reflecting the bond quality to the embedded steel (taken as 1.0) and η2 is related to the bar diameter (taken as 1, given that db is less than 32 mm). the fctd = αct fctk,0.05/γc refers to concrete design tensile strength, where αct and γc refer to the long-term effects on tensile strength and partial safety factor, respectively (both taken as 1). the fctk,0.05 refers to the concrete characteristic axial tensile strength computed as 0.7 × 0.3 × f (2/3)ck , where fck is the 28-days compressive strength concrete cylinder. table 4 summarizes the τmax values computed using the different codes as well as the resulting experimental-to-design bond strength ratios (i.e., τu/τmax). as shown in figure 12, the τmax values followed an increasing trend with pigment additions. on average, the experimental τu values are 3.35and 4.85times higher than the aci 318-19 and ec2 equations (table 4), respectively; this reveals the conservative nature of such models for predicting the bond strength between steel bars and coloured concrete structures. yet, the τu/τmax becomes pretty close or even lower than 1.0 when the ceb-fip equations are used (i.e., eqs. 1 and 2), reflecting the unconservative nature of such equations for assessing the bond strengths of coloured concrete. 6. conclusions this paper is part of an investigation that aims at investigating the impact of pigments on the structural properties of reinforced coloured concrete members. the findings of this paper reveal that such additions have a rather beneficial effect on the concrete bond 257 j. j. assaad, m. matta, j. 
saade acta polytechnica y = 0.375x + 13.18 r² = 0.71 y = 0.13x + 4.57 r² = 0.71 y = 0.117x + 3.03 r² = 0.71 2.5 3.5 4.5 5.5 12 13 14 15 16 0 1 2 3 4 5 𝜏m ax , m pa (a ci a nd e c2 ) 𝜏m ax , m pa (c eb -f ip ) pigment concentration, % of cement ceb-fip aci ec2 cement = 350 kg/m3 figure 12. relationships between pigment concentration and τmax computed by different codes for mixtures prepared with 350 kg/m3 cement. mixture codification τmax computed by different codes experimental-to-design bond ratio mpa (τu/τmax) 2 √ f ′c 2.5 √ f ′c aci 318 ec2 2 √ f ′c 2.5 √ f ′c aci 318 ec2 350-control 10.3 12.9 4.5 3 1.1 0.88 2.53 3.84 350-red-1.5 % 10.9 13.6 4.7 3.2 1.1 0.88 2.52 3.76 350-red-3 % 11.7 14.6 5.1 3.5 1.54 1.23 3.56 5.19 350-red-4.5 % 11.8 14.8 5.1 3.5 1.5 1.2 3.47 5.03 350-grey-3 % 10.9 13.6 4.7 3.2 1.17 0.93 2.69 4.01 350-grey-4.5 % 11.7 14.6 5.1 3.5 1.2 0.96 2.77 4.03 350-tio2-1.5 % 11.6 14.5 5 3.4 1.49 1.19 3.43 5.02 350-tio2-4.5 % 12.3 15.4 5.3 3.7 1.69 1.35 3.88 5.57 350-cblack-1.5 % 11 13.7 4.8 3.2 1.4 1.12 3.22 4.78 350-cblack-4.5 % 11.8 14.8 5.1 3.5 1.51 1.21 3.48 5.04 450-control 11.7 14.6 5.1 3.5 1.62 1.29 3.72 5.42 450-red-3 % 12.5 15.6 5.4 3.8 1.88 1.5 4.32 6.16 450-grey-3 % 13 16.3 5.6 4 1.52 1.22 3.5 4.93 450-tio2-3 % 13.3 16.7 5.8 4.2 1.86 1.49 4.29 5.98 450-cblack-3 % 12.9 16.1 5.6 4 1.7 1.36 3.92 5.53 average = 1.45 1.16 3.35 4.85 st. deviation = 0.26 0.2 0.59 0.77 table 4. experimental-to-design bond strengths computed by different codes. 258 vol. 62 no. 2/2022 effect of pigments on bond strength between coloured . . . properties to embedded steel bars, which could practically be assuring to consultants and architects in the concrete building industry. based on the foregoing, the following conclusions can be warranted: • mixtures containing tio2 exhibited the lowest ∆(e) values, reflecting relatively limited variations with respect to the control mix. the incorporation of gradually increased red or grey io pigments led to an increased ∆(e), while the cb-modified mixtures exhibited the highest ∆(e) values. • the steady ∆(e) increase with pigment additions was attributed to the relatively reduced cement volume and beige-like colour of natural sand, thus affecting the pigment tinting strength and reducing the tendency towards colour saturation. • regardless of the colour, mixtures incorporating increased pigment additions exhibited higher f ′c and f t responses. this was directly associated to the micro-filler effect and enhanced packing density that lead to denser microstructure. the io and cb pigments are inert materials (i.e., do not react with water) that fill the interspaces and capillary pores in cementitious systems, leading to improved strength properties. • the highest increase in strength was recorded for mixtures prepared with tio2 additions. besides the micro-filler effect, the increase in strength was ascribed to the formation of nucleation sites that promote hydration reactions and reduce the porosity of the hardened matrix. • just like the f ′c and f t responses, τu gradually increased with such additions, which practically reveals their beneficial role for improving the development and transfer of bond stresses in reinforced concrete members. the highest increase was noticed for the concrete mixture containing tio2 additions, given the micro-filler effect and formation of additional hydrating gels that strengthen the interfacial concrete-steel transition zone. 
• the increase in τu due to pigment additions was accompanied with an increase in the maximum slip that occurs at failure. this reflects that pigments confer higher ductility, which can be particularly relevant in high-strength concrete reinforced members. • on average, the experimental τu values are 3.35and 4.85-times higher than the aci 318-19 and ec2 equations, respectively. yet, the τu becomes pretty close to τmax computed by the ceb-fip equations, reflecting the unconservative nature of such equations to predict the bond strengths of coloured concrete mixtures. acknowledgements the authors wish to acknowledge the support provided by sodamco-weber, lebanon. references [1] p. bartos. fresh concrete properties and test. elsevier, amsterdam, 1992. [2] l. popelová. the symbolic-aesthetic dimension of industrial architecture as a method of classification and evaluation: the example of bridge structures in the czech republic. acta polytechnica 47(1):23–31, 2007. https://doi.org/10.14311/912. [3] astm c979. standard specification for pigments for integrally colored concrete. annual book of astm standards. farmington hills, usa, 4(2), 2007. [4] v. hospodarova, j. junak, n. stevulova. color pigments in concrete and their properties. pollack periodica 10(3):143–151, 2015. https://doi.org/10.1556/606.2015.10.3.15. [5] s. r. naganna, h. a. ibrahim, s. p. yap, et al. insights into the multifaceted applications of architectural concrete: a state-of-the-art review. arabian journal for science and engineering 46:4213–4223, 2021. https://doi.org/10.1007/s13369-020-05033-0. [6] y.-m. gao, h.-s. shim, r. h. hurt, et al. effects of carbon on air entrainment in fly ash concrete: the role of soot and carbon black. energy fuels 11(2):457–462, 1997. https://doi.org/10.1021/ef960113x. [7] k. loh, c. c. gaylarde, m. a. shirakawa. photocatalytic activity of zno and tio2 ‘nanoparticles’ for use in cement mixes. construction and building materials 167:853–859, 2018. https: //doi.org/10.1016/j.conbuildmat.2018.02.103. [8] s. s. lucas, v. m. ferreira, j. l. barroso de aguiar. incorporation of titanium dioxide nanoparticles in mortars – influence of microstructure in the hardened state properties and photocatalytic activity. cement and concrete research 43:112–120, 2013. https://doi.org/10.1016/j.cemconres.2012.09.007. [9] h.-s. jang, h.-s. kang, s.-y. so. color expression characteristics and physical properties of colored mortar using ground granulated blast furnace slag and white portland cement. ksce journal of civil engineering 18(4):1125–1132, 2014. https://doi.org/10.1007/s12205-014-0452-z. [10] r. alves, p. faria, a. brás. brita lavada – an eco-efficient decorative mortar from madeira island. journal of building engineering 24:100756, 2019. https://doi.org/10.1016/j.jobe.2019.100756. [11] j. j. assaad, d. nasr, s. chwaifaty, t. tawk. parametric study on polymer-modified pigmented cementitious overlays for colored applications. journal of building engineering 27:101009, 2020. https://doi.org/10.1016/j.jobe.2019.101009. [12] j. j. assaad. disposing waste latex paints in cementbased materials – effect on flow and rheological properties. journal of building engineering 6:75–85, 2016. https://doi.org/10.1016/j.jobe.2016.02.009. [13] h.-s. lee, j.-y. lee, m.-y. yu. influence of inorganic pigments on the fluidity of cement mortars. cement and concrete research 35(4):703–710, 2005. https://doi.org/10.1016/j.cemconres.2004.06.010. 
259 https://doi.org/10.14311/912 https://doi.org/10.1556/606.2015.10.3.15 https://doi.org/10.1007/s13369-020-05033-0 https://doi.org/10.1021/ef960113x https://doi.org/10.1016/j.conbuildmat.2018.02.103 https://doi.org/10.1016/j.conbuildmat.2018.02.103 https://doi.org/10.1016/j.cemconres.2012.09.007 https://doi.org/10.1007/s12205-014-0452-z https://doi.org/10.1016/j.jobe.2019.100756 https://doi.org/10.1016/j.jobe.2019.101009 https://doi.org/10.1016/j.jobe.2016.02.009 https://doi.org/10.1016/j.cemconres.2004.06.010 j. j. assaad, m. matta, j. saade acta polytechnica [14] a. woods, j. y. lee, l. j. struble. effect of material parameters on color of cementitious pastes. journal of astm international 4(8):1–18, 2007. https://doi.org/10.1520/jai100783. [15] t. meng, y. yu, x. qian, et al. effect of nano-tio2 on the mechanical properties of cement mortar. construction and building materials 29:241–245, 2012. https: //doi.org/10.1016/j.conbuildmat.2011.10.047. [16] a. mohammed, n. t. k. al-saadi, j. sanjayan. inclusion of graphene oxide in cementitious composites: state-of-the-art review. australian journal of civil engineering 16(2):81–95, 2018. https://doi.org/10.1080/14488353.2018.1450699. [17] a. lópez, j. m. tobes, g. giaccio, r. zerbino. advantages of mortar-based design for coloured self-compacting concrete. cement and concrete composites 31(10):754–761, 2009. https: //doi.org/10.1016/j.cemconcomp.2009.07.005. [18] l. hatami, m. jamshidi. application of sbrincluded pre-milled colored paste as a new approach for coloring self-consolidating mortars (scms). cement and concrete composites 65:110–117, 2016. https: //doi.org/10.1016/j.cemconcomp.2015.10.015. [19] p. weber, e. imhof, b. olhaut. coloring pigments in concrete: remedies for fluctuations in raw materials and concrete recipes. tech. report, harold scholz & co gmbh, 2009. [20] j. j. assaad, m. vachon. valorizing the use of recycled fine aggregates in masonry cement production. construction and building materials 310:125263, 2021. https: //doi.org/10.1016/j.conbuildmat.2021.125263. [21] m. j. positieri, p. helene. physicomechanical properties and durability of structural colored concrete. in aci symposium publication, vol. 253, pp. 183–200. 2013. https://doi.org/10.14359/20175. [22] s. a. yıldızel, g. kaplan, a. u. öztürk. cost optimization of mortars containing different pigments and their freeze-thaw resistance properties. advances in materials science and engineering 2016:5346213, 2016. https://doi.org/10.1155/2016/5346213. [23] s. masadeh. the effect of added carbon black to concrete mix on corrosion of steel in concrete. journal of minerals and materials characterization and engineering 3(4):271–276, 2015. https://doi.org/10.4236/jmmce.2015.34029. [24] d. inoue, h. sakurai, m. wake, t. ikeshima. carbon black for coloring cement and method for coloring molded cement article. european patent, wo 01/046322, 10 p, 1999. [25] j. topič, z. prošek, k. indrová, et al. effect of pva modification on properties of cement composites. acta polytechnica 55(1):64–75, 2015. https://doi.org/10.14311/ap.2015.55.0064. [26] a. nazari, s. riahi. the effect of tio2 nanoparticles on water permeability and thermal and mechanical properties of high strength self-compacting concrete. materials science and engineering: a 528(2):756–763, 2010. https://doi.org/10.1016/j.msea.2010.09.074. [27] j. chen, s.-c. kou, c.-s. poon. hydration and properties of nano-tio2 blended cement composites. cement and concrete composites 34(5):642–649, 2012. 
https: //doi.org/10.1016/j.cemconcomp.2012.02.009. [28] r. zhang, x. cheng, p. hou, z. ye. influences of nano-tio2 on the properties of cement-based materials: hydration and drying shrinkage. construction and building materials 81:35–41, 2015. https: //doi.org/10.1016/j.conbuildmat.2015.02.003. [29] a. folli, c. pade, t. b. hansen, et al. tio2 photocatalysis in cementitious systems: insights into self-cleaning and depollution chemistry. cement and concrete research 42(3):539–548, 2012. https://doi.org/10.1016/j.cemconres.2011.12.001. [30] m. konrad, r. chudoba. the influence of disorder in multifilament yarns on the bond performance in textile reinforced concrete. acta polytechnica 44(5-6):186–193, 2004. https://doi.org/10.14311/654. [31] j. j. assaad, p. matar, a. gergess. effect of quality of recycled aggregates on bond strength between concrete and embedded steel reinforcement. journal of sustainable cement-based materials 9(2):94–111, 2020. https://doi.org/10.1080/21650373.2019.1692315. [32] aci 408r-03. bond and development of straight reinforcing bars in tension, 2003. [33] aci 318-19. building code requirements for reinforced concrete, 2019. [34] en 1992-1-1. eurocode 2: design of concrete structures – part 1-1: general rules and rules for buildings, 2004. [35] fib (international federation for structural concrete). model code for concrete structures. ernst & sohn, berlin, 2013. [36] astm c642-13. standard test method for density, absorption, and voids in hardened concrete. annual book of astm standards. west conshohocken, pa, 2013, usa. [37] astm c39/c39 m-05. standard test method for compressive strength of cylindrical concrete specimens. annual book of astm standards. west conshohocken, pa, 2005, usa. [38] astm c496/c496 m-04. standard test method for splitting tensile strength of cylindrical concrete specimens. annual book of astm standards. west conshohocken, pa, 2004, usa. [39] astm c 597-16. standard test method for pulse velocity through concrete. annual book of astm standards. west conshohocken, pa, 2016, usa. [40] g. teichmann. the use of colorimetric methods in the concrete industry. betonwerk fertigteil-technik 11:58–73, 1990. [41] rilem/ceb/fib. bond test for reinforcing steel: 2, pullout test. materials and structures 3:175–178, 1970. [42] j. němeček, p. padevět, z. bittnar. effect of stirrups on behavior of normal and high strength concrete columns. acta polytechnica 44(5-6):158–164, 2004. https://doi.org/10.14311/648. 260 https://doi.org/10.1520/jai100783 https://doi.org/10.1016/j.conbuildmat.2011.10.047 https://doi.org/10.1016/j.conbuildmat.2011.10.047 https://doi.org/10.1080/14488353.2018.1450699 https://doi.org/10.1016/j.cemconcomp.2009.07.005 https://doi.org/10.1016/j.cemconcomp.2009.07.005 https://doi.org/10.1016/j.cemconcomp.2015.10.015 https://doi.org/10.1016/j.cemconcomp.2015.10.015 https://doi.org/10.1016/j.conbuildmat.2021.125263 https://doi.org/10.1016/j.conbuildmat.2021.125263 https://doi.org/10.14359/20175 https://doi.org/10.1155/2016/5346213 https://doi.org/10.4236/jmmce.2015.34029 https://doi.org/10.14311/ap.2015.55.0064 https://doi.org/10.1016/j.msea.2010.09.074 https://doi.org/10.1016/j.cemconcomp.2012.02.009 https://doi.org/10.1016/j.cemconcomp.2012.02.009 https://doi.org/10.1016/j.conbuildmat.2015.02.003 https://doi.org/10.1016/j.conbuildmat.2015.02.003 https://doi.org/10.1016/j.cemconres.2011.12.001 https://doi.org/10.14311/654 https://doi.org/10.1080/21650373.2019.1692315 https://doi.org/10.14311/648 vol. 62 no. 
2/2022 effect of pigments on bond strength between coloured . . . [43] a. alarab, b. hamad, j. j. assaad. strength and durability of concrete containing ceramic waste powder and blast furnace slag. journal of materials in civil engineering 34(1):04021392, 2022. https: //doi.org/10.1061/(asce)mt.1943-5533.0004031. [44] j. machovec, p. reiterman. influence of aggressive environment on the tensile properties of textile reinforced concrete. acta polytechnica 58(4):245–252, 2018. https://doi.org/10.14311/ap.2018.58.0245. 261 https://doi.org/10.1061/(asce)mt.1943-5533.0004031 https://doi.org/10.1061/(asce)mt.1943-5533.0004031 https://doi.org/10.14311/ap.2018.58.0245 acta polytechnica 62(2):248–261, 2022 1 introduction 2 context and paper objectives 3 experimental program 3.1 materials 3.2 mixture proportions 3.3 testing procedures 3.3.1 fresh and hardened properties 3.3.2 colourimetry 3.3.3 bond to steel reinforcement 4 test results and discussion 4.1 hrwr demand 4.2 colourimetry 4.3 hardened properties 4.4 bond stress-slip behavior 5 comparison with international bond models 6 conclusions acknowledgements references ap07_2-3.vp 1 preliminaries the magnetic fields of planets, stars and galaxies are maintained by dynamo effects in conducting fluids or plasmas [1, 2, 3]. these dynamo effects are caused by a topologically nontrivial interplay of fluid (plasma) motions and a balanced self-amplification of the magnetic fields – and can be described within the framework of magnetohydrodynamics (mhd) [1, 2]. for physically realistic dynamos the coupled system of maxwell and navier-stokes equations has, in general, to be solved numerically. for a qualitative understanding of the occurring effects, semi-analytically solvable toy models play an important role. one of the simplest dynamo models is the so called �2-dynamo with a spherically symmetric �-profile(1) (see, e.g. [2]). for such a dynamo the magnetic field can be decomposed into poloidal and toroidal components, expanded over spherical harmonics [2, 4] and unitarily re-scaled [5]. as a result, one arrives at a set of mode decoupled matrix differential eigenvalue problems [2, 4, 5] a� � � � � � � � � � � q q q [ ] [ ] [ ] 1 1 , q r r l l r r r[ ]: ( ) ( ) ( ) � � � � �� � 1 2 , (1) with boundary conditions (bcs) which have to be imposed in dependence on the concrete physical setup and which will be discussed below. the �-profile describes the net effect of small scale helical turbulence on the magnetic field [2]. it can be assumed real-valued �( )r � � and sufficiently smooth. we note that the reality of the differential expression (1), independently from the concrete bcs, implies an operator spectrum which is symmetric with regard to the real axis, i.e. which consists of purely real eigenvalues and of complex conjugate eigenvalue pairs. in [4] it was shown that the differential expression (1) of this operator has the fundamental (canonical) symmetry [6, 7] a a� �� j j † , j i i � � � � � � 0 0 . (2) in case of bcs compatible with this fundamental symmetry the operator turns out self-adjoint in a krein space(2) © czech technical university publishing house http://ctn.cvut.cz/ap/ 75 acta polytechnica vol. 47 no. 2–3/2007 the spherically symmetric �2–dynamo and some of its spectral peculiarities u. günther, o. n. kirillov, b. f. samsonov, f. stefani a brief overview is given of recent results on the spectral properties of spherically symmetric mhd �2-dynamos. 
in particular, the spectra of sphere-confined fluid or plasma configurations with physically realistic boundary conditions (bcs) (surrounding vacuum) and with idealized bcs (super-conducting surrounding) are discussed. the subjects comprise third-order branch points of the spectrum, self-adjointness of the dynamo operator in a krein space as well as the resonant unfolding of diabolical points. it is sketched how certain classes of dynamos with a strongly localized �-profile embedded in a conducting surrounding can be mode decoupled by a diagonalization of the dynamo operator matrix. a mapping of the dynamo eigenvalue problem to that of a quantum mechanical hamiltonian with energy dependent potential is used to obtain qualitative information about the spectral behavior. links to supersymmetric quantum mechanics and to the dirac equation are indicated. keywords: mhd dynamo, operator spectrum, krein space, boundary conditions, supersymmetric quantum mechanics, diabolical points, resonance, kdv soliton potential. fig. 1: real and imaginary components of the �2-dynamo spectrum as functions of the scale factor c of an �-profile �( ) ( . . . )r c r r r� � � � � � �1 26 09 5364 28 222 3 4 in the case of angular mode number l � 1 and physically realistic boundary conditions (3). the concrete coefficients in the quartic polynomial �( )r have their origin in numerical simulations of field reversal dynamics (see ref. [20, 21]). only the imaginary components with �� 0 are shown. the symmetrically located complex conjugate �� 0 components are omitted for the sake of brevity. (�j, [.,.]j) [4, 5] and in this way it behaves similar like hamiltonians of ��-symmetric quantum mechanics (ptsqm) [9–14]. subsequently, we first present a sketchy overview of some recent results on the spectral behavior of �2-dynamos obtained in [5, 15–19], which we extend by a discussion of the transition from �2-dynamo configurations confined in a box to dynamos living in an unconfined conducting surrounding. 2 physically realistic bcs and spectral triple points for roughly spherically symmetric dynamical systems like the earth, the conducting fluid is necessarily confined within the core of the earth so that the �-effect resulting from the fluid motion has to be confined to this core. setting the surface of the outer core at a radius r �1, one can assume �( )r � �1 0 and a behavior of the magnetic field at r �1 like in a vacuum. a multi-pole-like decay of the magnetic field at r � � then leads to mixed effective bcs at r �1 (see, e.g. [2]) and a corresponding operator domain of the type �� �( ) ~ ( , ) ( , ) (a u u� � � � �l l r2 201 01 �0 0) ,� bu r � �1 0� , u:� � � � � � u u 1 2 , (3) b:� � � � � � � �r l r 0 0 1 . from the domain �( )†a� of the adjoint operator �� �( ) � � ( , ) ( , ) �(†a u u� � � � �l l r2 201 01 �0 0) , � �� bu r � �1 0� , �: � � u � � � � � � u u 1 2 , (4) � : ( ) b � �� � � � � � � � �r r l r r 0 1 , one reads off that � �( ) ( )†a a� �� and, hence, the dynamo operator itself is not self-adjoint even in a krein space. in the case of constant �-profiles and arbitrary l � �, the spectrum is implicitly given by a characteristic equation built from spherical bessel functions [2]. in all other cases numerical studies are required. a typical spectral branch graph is depicted in fig. 1. obviously, for the specific �-profile it contains a large number of spectral phase transitions from real spectral branches to complex ones and back. 
there are strong indications that phase transition points (second order branch points/exceptional points) of the spectrum close to the � � 0 line play an important role in polarity reversals of the magnetic field (see [20–23] for numerical studies and [24] for recent experiments). apart from the second-order branch points visible in fig. 1 there may occur thirdand higher-order branch points. they are located on hyper-surfaces of higher co-dimension in parameter space and they therefore require the tuning of more parameters to pin them down(3). corresponding results have been obtained in [16] and are illustrated in fig. 2. the triple points result from coalescing second-order branch points, correspond to 3×3 jordan blocks in the spectral decomposition of the operator and are accompanied by a merging or disconnecting of two complex spectral sectors 76 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 47 no. 2–3/2007 fig. 2: �2-dynamo with a r c r( ) [ ( . . ) ( . . ) ( .� � �21 465 2 467 426 412 167 928 8062� � 729 436 289 392 276 2729913 4 . ) ( . . ) ]� �r r and a spectral triple point at ( . , . )� � �0 45 0 86c . highlighted (fat) lines correspond to purely real branches of the spectrum. the cusp in the imaginary component (lower right graphics) indicates the closely located triple point. over the parameter space. an implicit indication of a closely located triple point is the presence of cusps in the imaginary components as shown in figs.1, 2. 3 idealized bcs and krein-space related perturbation theory in order to gain some deeper insight into possible dynamo-related processes, semi-analytical toy model considerations play a crucial role. a certain simplification of the eigenvalue problem was achieved in [17, 18] by considering a reduced and idealized (auxiliary) problem(4) with dirichlet bcs imposed at r �1, i.e. by setting u( )r � �1 0. in this case it holds � �( ) ( )†a a� �� and the operator a� is self-adjoint in a krein space ( , [.,. ] )� j j [5]. for constant �-profiles � �( )r � �0 const the eigenvalue problem ( )a u� �0 0� � becomes exactly solvable in terms of orthonormalized riccati-bessel functions � � � � u r n r j r n j u u u n n l n n l n m n mn n ( ) , : , ( , ) , � � � � 1 2 1 2 3 2 2 1 � � � (5) with �n � 0 the squares of bessel function roots � �jl n �1 2 0� . the solutions of the eigenvalue problem have the form un n nu l � � � � � � � � � 1 0 12 � � ( , ), (6) are krein space orthonormalized [u u [u u u u u m n n mn m n n n n � � � � � � � � � � � � � , ] , , ] , , : , 2 0� � �� � � � � (7) and correspond to eigenvalue branches � � �� ��n n n� � 0 which scale linearly with �0. in the ( , )� �0 � -plane the branches � n and � n � of states un , un � of positive and negative krein space type form a spectral mesh (see fig. 3). the intersection points (nodes of the mesh) are semisimple double eigenvalues, i.e. eigenvalues of geometric and algebraic multiplicity two – so-called diabolical points (dps) [26]. two given branches � ��n ( )0 and � � � m( )0 intersect at the single point � � �� � � � � � � � � � � � � 0 0 0: , :n m n m, and we obtain that branches from states of opposite krein space type � �� � intersect for � 0 0� , whereas states of the same type (� �� ) intersect at � 0 0� . 
under small inhomogeneous perturbations � � � � ( ) ( ) ( )r r r� � 0 0� � the diabolical points split � � � 0 0 1� � � into two real or complex points (see also [27] for similar considerations) with leading contribution �1 resulting from the quadratic equation © czech technical university publishing house http://ctn.cvut.cz/ap/ 77 acta polytechnica vol. 47 no. 2–3/2007 fig. 3: the spectral mesh of the operator matrix a� for l � 0 (a); its resonant deformation due to harmonic perturbations of a constant �-profile (b), (c); the formation of overcritical oscillatory dynamo regimes for � increasing from � � �0 0( )� to � � � � �35 0 0( ,� � for some branches) (d); and the resonant unfolding of dps in the complex plane (e), (f). � � � � � � �� � � � � 1 2 1 2 2 � � � � � � � [ , ] [ , ] [ bu u bu u bu n n n m m m n � � � � � � � � , ][ , ] [ , ] , u bu u bu un m m n m n m � � 4 0 (8) where [ , ] ( ) bu um n n m m n m n l l r u u u u� � � �� � �� � � � � � � � � ! " # $ 1 2 dr 0 1 % . (9) the unfolding of the dps follows the typical krein space rule. when they result from branches of the same type (� �� , � 0 0� ) then the corresponding dps unfold purely real-valued, whereas dps from branches of opposite type (� �� � , � 0 0� ) may unfold into complex conjugate eigenvalue pairs. this behavior is clearly shown in fig. 3b,c. direct inspection reveals that the spectral meshes of unperturbed operators a� 0 for l � 0 and 0 � �l � show strong qualitative similarities so that results obtained for the quasi-exactly solvable l � 0 – model will also qualitatively hold for models with 0 � �l � . via fourier expansion of �( )r a very pronounced resonance has been found along parabolas in the ( , )� �0 � – plane indicated by white and colored dots in fig. 3a – leaving regions away from these parabolas almost unaffected. an especially pronounced resonance is induced by cosine perturbations which in linear approximation affect only the single parabola j k� 2 , fig. 3b,e. sine perturbations act strongest on parabolas j k� �2 1 with decreasing effect on j k m� �2 for increasing m (see fig. 3c,f). physically important is the fact that higher mode numbers k (shorter wave lengths of the � �(r) perturbations) affect more negative ��. due to a magnetic field behavior & e t� this is the mathematical formulation of the physically plausible fact that small-scale perturbations decay faster than large-scale perturbations. numerical indications for the importance of this behavior in the subtle interplay of polarity reversals and so-called excursions (“aborted” reversals) of the magnetic field have recently been given in [23]. 4 diagonalizable �2-dynamo operators, susyqm and the dirac equation another approach to obtain quasi-exact solution classes of the eigenvalue problem ( )a u� �� � 0 consists in a �-dependent diagonalization of the operator matrix (1). the basic feature of this technique, as demonstrated in [19], is a two-step procedure consisting of a gauge transformation which diagonalizes the kinetic term and a subsequent global (coordinate-independent) diagonalization of the potential term. such an operator diagonalization is possible for �-profiles satisfying the constraint �� � �� � �( ) ( ) ( )r r a r 1 2 03 2 (10) with a � �const � a free parameter. solutions �(r) of this autonomous differential equation (de) can be expressed in terms of elliptic integrals. 
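the unfolding rule of section 3 can be checked on a minimal 2×2 toy in the spirit of the matrix models of remark (3): a real matrix is self-adjoint in the krein space with metric j = diag(1, −1) iff vᵀj = jv. this is a generic illustration of the rule, not the perturbation formula (8) itself; the coupling strengths are arbitrary:

# sketch: unfolding of a semisimple double eigenvalue (a diabolical point)
# under Hermitian vs. J-self-adjoint perturbations, J = diag(1, -1)
import numpy as np

lam0 = -3.0                                  # the unperturbed double eigenvalue

def split(eps, same_type):
    if same_type:
        V = np.array([[0.4, 0.7],            # Hermitian coupling: both states of
                      [0.7, -0.2]])          # the same Krein type
    else:
        V = np.array([[0.4, 0.7],            # J-self-adjoint coupling: states of
                      [-0.7, -0.2]])         # opposite Krein type (V^T J = J V)
    return np.linalg.eigvals(lam0 * np.eye(2) + eps * V)

for eps in (0.1, 0.5):
    print("same type    :", np.round(split(eps, True), 3))   # real splitting
    print("opposite type:", np.round(split(eps, False), 3))  # complex conjugate pair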
in order to maximally explore the similarities to known qm type models(5) a strongly localized �-profile has been assumed which smoothly vanishes toward r � �. physically, such a setup can be imagined as a strongly localized dynamo-maintaining turbulent fluid/plasma motion embedded in an unbounded conducting surrounding (plasma) with fixed homogeneous conductivity. the only �-profile with �( )r � � � 0 satisfying (10) has the form of a korteweg-de vries(kdv)-type one-soliton potential �( ) cosh[ ( )] r a a r r � � 2 0 . (11) this amazing finding points to deep structural links to kdv and supersymmetric quantum mechanics (susyqm) and opens up a completely new exploration approach to �2-dynamos (6). in [19] we restricted our consideration to the most elementary solution properties of such models. the decoupled equation set after a parameter and coordinate rescaling has been found in terms of two quadratic pencils � � �� ! " #$ � � � �� � � � x l l x f x x 2 2 2 2 0 1 1 2 1 2 0 2 ( ) , cosh( � � � ) (12) in the new variable x ar:� and with new auxiliary spectral parameter � �� � �12 1 2� . the equivalence transformation from ( )a u� �� � 0 to (12) is regular for � � 0 and becomes singular at � � 0 where (12) has to be replaced by a jordan type equation system � � x x v v v 2 0 1 2 0 1 00 0 � � � � � � � � � � � � � � � (13) with potentials v l l x0 2 1 2 21 1� � ��( ) ( )� , v1 � � �. in terms of the original spectral parameter � the eigenvalue problems (12) read � � � �� � � � � � ! ! ! " # $ $ $ � �� �� � � � �x l l x f f2 2 2 1 21 1 2 1 2 ( ) (14) and can be related to the spectral problem of a qm hamiltonian with energy e � �� and energy-dependent potential component � � � �� �12 12 1 2 1 2� � � �� � �� �x e x . for physical reasons asymptotically vanishing field configurations with f x� � � �( ) 0, 0 1 0, ( )x � � � are of interest. these dirichlet bcs at infinity imply the self-adjointness of the operator a� in a krein space �j – with (12), (13) as a special 78 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 47 no. 2–3/2007 representation of the eigenvalue problem ( )a u� �� � 0. from the structure of (12), (14) it follows that the only free parameter apart from the angular mode number l � � is the maximum position x0 of the �-profile �(x) (the minimum position of the potential component � �2 2( )x ) so that the solution branches will be functions �( )x0 . with the help of susy techniques it has been shown in [19] that (14) has a single bound state (bs) type solution which via e � � �� 0 corresponds to an overcritical dynamo mode � � 0. it has been found that the bs solutions of (14) behave differently for x x j0 � and x x j0 � , where for x x j0 � the description in terms of (12) breaks down and has to be replaced by the singular jordan type representation (13). by a susy inspired factorization ansatz � � �� �x l l x l l2 2 21 1 2 1 2 ( ) † , (15) l wx� � � , l wx † � � , w u u � � , (16) an equivalence relation between (12) and a system of two dirac equations h� � ��� �� , h vx� �� �� (17) �� � � � � � � � � �� � � � � � � � � � � � � � � � � � � 1 2 1 0 1 1 0 , , : , , f lf� v w w� � � � � � � �� 0 (18) has been established for models with x x j0 � . general results on dirac equations then allowed for the conclusion that in the case of x x j0 � the bound state related spectrum has to be real. 
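the factorization ansatz (15), (16) can be verified symbolically for the sech-type profile: with u = cosh(x) the superpotential w = u′/u = tanh(x) generates the reflectionless one-soliton well, a standard susyqm fact. the check below is generic, not the paper's exact pencil (12):

# sketch: SUSY factorization L = d/dx + W, L^T = -d/dx + W, with W = u'/u, u = cosh(x)
import sympy as sp

x = sp.symbols('x', real=True)
u = sp.cosh(x)
W = sp.diff(u, x) / u                        # superpotential tanh(x)

V_minus = sp.simplify(W**2 - sp.diff(W, x))  # potential of L^T L:
print(V_minus)                               #   2*tanh(x)**2 - 1 == 1 - 2/cosh(x)**2
V_plus = sp.simplify(W**2 + sp.diff(W, x))   # partner potential of L L^T:
print(V_plus)                                #   1  (reflectionless partner)

psi0 = 1 / sp.cosh(x)                        # ground state, annihilated by L
print(sp.simplify(sp.diff(psi0, x) + W * psi0))   # 0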
a perturbation theory with the distance � � �x x j0 from the jordan configuration as a small parameter supplemented by a bootstrap analysis showed that the dirichlet bcs f x� � � �( ) 0 render only the solution f x ( ) non-trivial and with a real eigenvalue, whereas f x�( ) has to vanish identically f x� '( ) 0. the single spectral branch in terms of �( )x0 and �( )x0 is depicted in fig. 4 for angular mode numbers l � 0 1 2 3, , , . © czech technical university publishing house http://ctn.cvut.cz/ap/ 79 acta polytechnica vol. 47 no. 2–3/2007 fig. 4: spectra � ( )x0 (a) and �( )x0 (b) in the case of angular mode numbers l � 0 1 2 3, , , . for numerical reasons the dirichlet bc has been imposed at the large distance x � 100. fig. 5: cutoff (x-)dependence of the spectral branches with radial mode numbers n � 1 2, and angular mode numbers l � 0 (a) and l � 2 (b) for cutoffs (box-lengths) x � 10 20 40, , . clearly visible are the x-independence of the overcritical bs type modes (n � 1) and the tendency � & �1 2x for the undercritical (box-type) mode. the modes with n � 3 show the same qualitative � & �1 2x behavior as the n � 2 mode and are note depicted here. assuming the dynamo model with strongly localized �-profile (11), (12) confined in a large box, i.e. with dirichlet bcs imposed at large x x� �0, one can study the dynamo spectrum in the infinite box limit. figs. 5a,b show the corresponding behavior. due to its localization the bs-related overcritical dynamo mode is almost insensitive to the x � � limit. this is in contrast to the under-critical (decaying) modes which behave as expected for a sign inverted box spectrum of qm. for fixed mode number n � 2 and x � � the energies en decrease like e xn &1 2 � 0 and the corresponding part of the spectrum becomes quasi-continuous and related to the continuous (essential) spectrum of qm scattering states of a particle moving in the energy dependent potential l l x x e x ( ) ( ) ( ) � � � � � � 1 1 2 1 22 2 1 2� �� . for the associated dynamo eigenvalues this implies � & �1 2x �0 – as is clearly shown in figs. 5a,b. 5 concluding remarks a brief overview of some recent results on the spectra of dynamo operators has been given. the obtained structural features like the resonance effects in the unfolding of diabolical points as well as the unexpected link to kdv soliton potentials, elliptic integrals, susyqm and the dirac equation appear capable to open new semi-analytical approaches to the study of �2-dynamos. acknowledgment the work reviewed here has been supported by the german research foundation dfg, grant ge 682/12-3 (u.g.), by the crdf-brhe program and the alexander von humboldt foundation (o.n.k.), as well as by rfbr-06-02-16719 and ss-5103.2006.2 (b.f.s.). remarks (1) the �-profile �(x) plays the role of an effective potential for the �2-dynamo. (2) for comprehensive discussions of operators in krein spaces see, e.g., [6–8]. (3) an explicit hyper-surface parametrization of second-order branch point configurations embedded in a ��symmetric 3 × 3 matrix model with corresponding 2 × 2 jordan-block preserving modes can be found e.g. in the recent work [25]. (4) from a physical point of view such �2-dynamos can be regarded as embedded in a superconducting surrounding. (5) for early comments on structural links between mhd dynamo models and qm-related eigenvalue problems, see e.g. [28]. 
(6) the question of whether this new class of quasi-exactly solvable �2-dynamo models might be structurally related (via dynamical embedding) to the recently studied ��-symmetrically extended kdv solitons [29, 30] remains to be clarified. references [1] moffatt, h. k.: magnetic field generation in electrically conducting fluids. cambridge university press, cambridge, 1978. [2] krause, f., rädler, k.-h.: mean-field magnetohydrodynamics and dynamo theory. akademie-verlag, berlin and pergamon press, oxford, 1980, chapter 14. [3] zeldovich, ya. b., ruzmaikin, a. a., sokoloff, d. d.: magnetic fields in astrophysics. gordon & breach science publishers, new york, 1983. [4] günther, u., stefani, f.: isospectrality of spherical mhd dynamo operators: pseudo-hermiticity and a no-go theorem. journal math. phys. vol. 44 (2003), p. 3097–3111, math-ph/0208012. [5] günther, u., stefani, f., znojil, m.: mhd �2-dynamo, squire equation and ��-symmetric interpolation between square well and harmonic oscillator. j. math. phys. vol. 46 (2005), 063504, math-ph/0501069. [6] azizov, t. ya., iokhvidov, i. s.: linear operators in spaces with an indefinite metric. wiley-interscience, new york, 1989. [7] dijksma, a., langer, h.: operator theory and ordinary differential operators. in: böttcher a. (ed.) et al.: lectures on operator theory and its applications. fields institute monographs, vol. 3 (1996), p. 75, am. math. soc., providence, ri. [8] bognár, j.: indefinite inner product spaces. springer, new york, 1974. [9] bender, c. m., boettcher, s.: real spectra in non-hermitian hamiltonians having pt symmetry. phys. rev. lett. vol. 80 (1998), p. 5243–5246, physics/9712001 [10] bender, c. m., boettcher, s., meisinger, p. n.: ��-symmetric quantum mechanics. j. math. phys. vol. 40 (1999), p. 2201–2229, quant-ph/9809072. [11] znojil, m.: ��-symmetric harmonic oscillators. phys. lett. vol. a259, (1999), p. 220–223, quant-ph/9905020; znojil, m.: quasi-exactly solvable quartic potentials with centrifugal and coulombic terms. j. phys. a: math. gen. vol. 33 (2000), p. 4203–4211, math-ph/0002036; znojil, m., cannata, f., bagchi, b., roychoudhury, r.: supersymmetry without hermiticity. phys. lett. vol. b483 (2000), p. 284–289, hep-th/0003277. [12] mostafazadeh, a.: pseudo-hermiticity versus �� symmetry: the necessary condition for the reality of the spectrum of a non-hermitian hamiltonian. j. math. phys. vol. 43 (2002), p. 205–214, math-ph/0107001; pseudo-hermiticity versus ��-symmetry ii: a complete characterization of non-hermitian hamiltonians with a real spectrum. ibid. vol. 43 (2002), p. 2814–2816, math-ph/0110016; pseudo-hermiticity versus ��-symmetry iii: equivalence of pseudo-hermiticity and the presence of antilinear symmetries. ibid. vol. 43 (2002), p. 3944–3951, math-ph/0203005. [13] bender, c. m., brody, d. c., jones, h. f.: complex extension of quantum mechanics. phys. rev. lett. vol. 89 (2002), 270401, quant-ph/0208076. [14] bender, c. m.: making sense of non-hermitian hamiltonians. hep-th/0703096. 80 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 47 no. 2–3/2007 [15] günther, u., stefani, f., gerbeth, g.: the mhd �2-dynamo, �2graded pseudo-hermiticity, level crossings, and exceptional points of branching type. czech. j. phys. vol. 54 (2004), p. 1075–1090, math-ph/0407015. [16] günther, u., stefani, f.: third order spectral branch points in krein space related setups: ��-symmetric matrix toy model, mhd �2-dynamo, and extended squire equation. czech. j. phys. vol. 
55 (2005), p. 1099–1106, math-ph/0506021. [17] günther, u., kirillov, o. n.: krein space related perturbation theory for mhd �2-dynamos and resonant unfolding of diabolical points. j. phys. a: math. gen. vol. 39 (2006), p. 10057–10076, math-ph/0602013. [18] kirillov, o. n., günther, u.: on krein space related perturbation theory for mhd �2-dynamos. proc. appl. math. mech. (pamm) vol. 6 (2006), p. 637–638. [19] günther, u., samsonov, b. f., stefani, f.: a globally diagonalizable �2-dynamo operator, susy qm and the dirac equation. j. phys. a: math. theor. vol. 40 (2007), p. f169–f176, math-ph/0611036. [20] stefani, f., gerbeth, g.: asymmetric polarity reversals, bimodal field distribution, and coherence resonance in a spherically symmetric mean-field dynamo model. phys. rev. lett. vol. 94 (2005), 184506, physics/0411050. [21] stefani, f., gerbeth, g., günther, u., xu, m.: why dynamos are prone to reversals. earth planet. sci. lett. vol. 243 (2006), p. 828–840, physics/0509118. [22] stefani, f., gerbeth, g., günther, u.: a paradigmatic model of earth’s magnetic field reversals. magnetohydrodynamics, vol. 42 (2006), p. 123–130, physics/0601011. [23] stefani, f., xu, m., sorriso-valvo, l., gerbeth, g., günther, u.: oscillation or rotation: a comparison of two simple reversal models. geophys. astrophys. fluid dyn. vol. 101 (2007), p. 227–248. [24] berhanu, m. et al.: magnetic field reversals in an experimental turbulent dynamo. eur. phys. lett. vol. 77 (2007), 59001, physics/0701076. [25] znojil, m.: a return to observability near exceptional points in a schematic ��-symmetric model, phys. lett. vol. b647 (2007), p. 225–230, quant-ph/0701232. [26] berry, m. v., wilkinson, m.: diabolical points in the spectra of triangles. proc. r. soc. lond. vol. a392 (1984), p. 15–43. [27] kirillov, o. n., seyranian, a. p.: collapse of the keldysh chains and stability of continuous nonconservative systems. siam journal appl. math. vol. 64 (2004), p. 1383–1407. [28] meinel, r.: generation of localized magnetic fields by dynamos with conducting surroundings. astron. nachr. vol. 310 (1989), p. 1–6. [29] bender, c. m., brody, d. c., chen, j., furlan, e.: ��-symmetric extension of the korteweg-de vries equation. j. phys. a: math. theor. vol. 40 (2007), p. f153–f160, math-ph/0610003. [30] fring, a.: ��-symmetric deformations of the korteweg-de vries equation. j. phys. a: math. theor. vol. 40 (2007), p. 4215–4224, math-ph/0701036. dr. uwe günther phone: +49 351 260 2415 email: u.guenther@fzd.de research center dresden-rossendorf pob 510119 d-01314 dresden, germany dr. oleg n. kirillov phone: +7 7095 939 2039 email: kirillov@imec.msu.ru institute of mechanics moscow state lomonosov university michurinskii pr. 1 119192 moscow, russia prof. dr. boris f. samsonov phone: +7 3822 913 019 email: samsonov@phys.tsu.ru physics department tomsk state university 36 lenin avenue 634050 tomsk, russia dr. frank stefani phone: +490 351 260 3069 email: f.stefani@fz-rossendorf.de research center dresden-rossendorf pob 510119 d-01314 dresden, germany © czech technical university publishing house http://ctn.cvut.cz/ap/ 81 acta polytechnica vol. 47 no. 2–3/2007 ap07_4-5.vp 1 introduction cancer is the second leading cause of death in industrial countries. so as to have the best possible chance of cure, cancer has to be detected and treated as early as possible. as cancer starts within a single cell, many types of cancer can be detected very early from already marginal changes within single cells using cytopathological methods. 
these cell specimens can be obtained easily and painlessly, e.g. with tiny brushes, from the oral mucosa. one cytopathological diagnostic method is dna image cytometry (dna-icm). for this the cells are stained stoichiometrically according to feulgen to visualize the dna content within the nuclei. images of the nuclei are captured with a camera mounted on a brightfield light microscope. since dna-icm needs the dna value of each cell, the integral optical densities of the nuclei are computed. to obtain the dna value from these integral optical densities, a set of about normal, non-proliferating cells, called reference cells, has to be selected. then the dna content of each nucleus is computed from the ratio of its integral optical density to the mean integral optical density of the reference cells. subsequently the diagnosis is performed based on the histogram of the dna values of diagnostically relevant, suspicious analysis cells. for oral mucosa these are squamous epithelial cells with noticeably altered morphology and chromatin texture, i.e., amount and distribution of dna, of their nuclei, cells that therefore are suspicious of cancer. according to the european guidelines for dna-icm [1], a cytopathologist has to find and select manually about 30 ref86 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 47 no. 4–5/2007 automated classification of analysisand reference cells for cancer diagnostics in microscopic images of epithelial cells from the oral mucosa t. e. schneider to get the best possible chance of healing, cancer has to be detected as early as possible. as cancer starts within a single cell, cytopathological methods offer the possibility of early detection. one such method is standardized dna image cytometry. for this, the diagnostically relevant cells have to be found within each specimen, which is currently performed manually. since this is a time-consuming process, a preselection of diagnostically relevant cells has to be performed automatically. for specimens of the oral mucosa this involves distinguishing between undoubted healthy epithelial cells and possibly cancerous epithelial cells. based on cell images from a brightfield light microscope, a set of morphological and textural features was implemented. to identify highly distinctive feature subsets the sequential forward floating search method is used. for these feature sets k-nearest neighbor and fuzzy k-nearest neighbor classifiers as well as support vector machines were trained. on a validation set of cells it could be shown that normal and possibly cancerous cells can be distinguished at overall rates above 95.5 % for different classifiers, enabling us to choose the support vector machine with a set of two features only as the classifier with the lowest computational costs. keywords: cells, cytopathology, feature extraction, classification, fuzzy k-nearest neighbor, support vector machine. fig. 1: two examples of feulgen stained squamous epithelial cells. on the left a normal cell is shown, on the right a morphologically abnormal analysis cell. since feulgen stains the dna stoichiometrically, the dna within the nuclei is stained, whereas the surrounding cytoplasm is not. erence cells and about 300 analysis cells. this is a time-consuming task, which the expert should therefore be assisted by automated preselection of cells. a machine for fully automated screening of cervical smears using dna-icm is already available. 
this machine searches for cells with an abnormally high dna content, by measuring the dna content of all existing cells [2]. but as these cells are rare and cancer starts already with small changes of dna content, the same high sensitivity and specificity as with the standardized interactive dna-icm [1] cannot be achieved. the approach in this paper therefore aims at an algorithmic implementation of such expert knowledge for specimens of the oral mucosa, i.e., automatic discrimination of normal epithelial nuclei as reference cells and non healthy epithelial nuclei as analysis cells. the paper is organized as follows: in section 2 the database of cells from oral smears and the imaging modality is described. features characterizing the properties of epithelial reference and analysis cells, the process of reducing the whole feature set to a subset of features with optimal discriminating power and different classification algorithms to classify the cells automatically is presented in section 3. the results of these algorithms computed on the dataset are presented in section 4, showing that epithelial reference and analysis cells from 15 specimens of the oral mucosa can be discriminated with different classification methods at total classification rates of above 95.5 %. the paper ends by analyzing the classification results and suggesting topics for future study. 2 material our dataset is based on 15 feulgen stained smears from the oral mucosa. seven of these specimens are without cancer cells, including three with inflammation. five specimens contain cancer cells. from these specimens images have been acquired with a brightfield light microscope and a 63× oil immersion objective (na 1.32) and a three-chip ccd camera with a resulting resolution of � x � 01. �m. an experienced cytopathologist classified 950 reference cells within the specimens without cancer cells, and 748 analysis cells within the specimens with cancer cells. fig. 1 shows two example cells. for these cells, the contours of the nuclei are given as chaincodes. 3 methods to be able to distinguish automatically between analysis and reference squamous epithelial cells, a set of potentially relevant features has been implemented. to miminize the risk of overfitting during training of a classifier, a feature selection method is performed that can be combined with different objective functions to select good discriminative feature subsets. to solve the classification task, different classification algorithms can be chosen. each of these has to be trained on a training set of cells, whereas the classification rate is calculated on an independent cell set (validation set). the process of feature selection and classifier generation is sketched in fig. 2, and the algorithms used within each step are described below. the distance measure used here within each step is the euclidean distance. 3.1 features when performing a manual discrimination between analysis and reference cells, cytopathologists consider geometrical criteria like area of the nuclei, shape, as well as textural characteristics of the chromatin pattern and the overall amount of stain related to the size of a nucleus. to cover these criteria, several morphological features are used, as described in [3] and [4]. these comprise area, perimeter, form factor, fourier descriptor energies, and, further on, translation, rotation and scale invariant features. the latter are computed as combinations of the central 2d polynomial moments for the nuclear mask. 
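since the nuclear contours are provided as chain codes, the elementary morphological features can be computed directly from them. a minimal sketch for 8-connected freeman codes on a unit grid follows; the helper names are ours, not from the paper's implementation, and the form factor 4πA/P² equals 1 for a circle:

# sketch: area, perimeter and form factor from an 8-connected Freeman chain code
import numpy as np

STEPS = {0: (1, 0), 1: (1, 1), 2: (0, 1), 3: (-1, 1),
         4: (-1, 0), 5: (-1, -1), 6: (0, -1), 7: (1, -1)}

def shape_features(chain, start=(0, 0)):
    pts = [np.array(start, float)]
    perimeter = 0.0
    for c in chain:                           # chain must describe a closed contour
        dx, dy = STEPS[c]
        perimeter += np.hypot(dx, dy)         # 1 for even codes, sqrt(2) for odd
        pts.append(pts[-1] + (dx, dy))
    P = np.array(pts)
    x, y = P[:, 0], P[:, 1]
    area = 0.5 * abs(np.dot(x[:-1], y[1:]) - np.dot(x[1:], y[:-1]))  # shoelace
    return area, perimeter, 4.0 * np.pi * area / perimeter**2

square = [0] * 4 + [2] * 4 + [4] * 4 + [6] * 4   # a closed 4x4 square
print(shape_features(square))                    # area 16, perimeter 16, ~0.785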
as textural features, these moment-based features are calculated for a density estimation of the chromatin (extinction image) and an edge image of the chromatin distribution. the extinction image is calculated on the green channel of the original rgb image, and the edge image is the difference of the extinction image and its median filtered version. these features are supplemented by histogram features of the topological gradient [3] and particle-oriented features homogeneity, granularity and distribution of the chromatin [5]. see fig. 3 for examples of the image transformations. this sums © czech technical university publishing house http://ctn.cvut.cz/ap/ 87 acta polytechnica vol. 47 no. 4–5/2007 fig. 2: algorithmic workflow of feature subset selection and classifier generation. the different sized feature subsets resulting from the searching method are denoted as set_1 to set_p, the output of classifier training and training set classification as p classification rates. up to 203 features describing the morphology of the nuclei and their chromatin texture. 3.2 feature selection testing the separability performance of each possible feature subset is computationally too expensive for large feature sets. therefore the parameter-free sequential forward floating selection search method [6] is implemented to identify feature subsets with a good separability performance for our classification task. to rate the separability performance, three different objective functions can be chosen. based on the assumption of normally distributed data these are fisher’s criterion [7], and the bhattacharyya distance [7], which is adapted to the special case of two classes only. using the likelihood of feature vectors instead of a density model, mutual information [8] can be selected as the third objective function. 3.3 classifier to carry out the classification task, three classification algorithms are applicable. these have been selected due to the reproducibility of the classification results, as well as the possibly changing behavior of the distribution of the data during feature selection. for the different subsets of features during this selection the distribution of the data may fit different distribution models, including non-gaussian, multi-modal ones. furthermore, for higher dimensions of the feature space, the data is in general sparsely distributed and less compact, which leads to the use of non-parametric classifiers to be more general. as non-parametric classifiers that provide comprehensible decisions, the knn algorithm and the fuzzy-knn are chosen and, additionally, support vector machines (svm), which are known to provide a good generalization capability. a knn version is trained that makes its decision as soon as k neighbors belong to the same class i. the fuzzy-knn is 88 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 47 no. 4–5/2007 fig. 3: image transformations needed to compute the features. from left to right: within the upper row the original rgb image, its green channel, the extinction image computed from the green channel and the edge image are displayed. 
based on the extinction image the lower row contains the watershed regions computed for local maxima and for local minima, each filled with the gray value of the local optima, the topological gradient as the difference of the watershed regions and a three-color image to partition the chromatin into darker and brighter particles basic feature set classification (a;r;o) objective function number of features number of neighbors morphology 75.5; 92.3; 85.2 b 14 3 chromatin green 92.5; 95.5; 94.1 b 14 1 chromatin extinction 90.2; 96.5; 93.8 b 19 3 chromatin+moments 90.7; 98.3; 95.1 m 2 4 all features 90.7; 98.3; 95.1 m 2 4 table 1: results of the feature selection for each basic feature set, selected due to the best overall classification result. the classification results (a; r; o) in percentages for the (a)nalysis, the (r)eference cells, and the (o)verall rate are given for each of the basic sets. additionally the used objective function ((m)utual information, (f)ishers’ criterion, or (b)hattacharyya distance), the number of features and the number of neighbors of the knn are noted. the computed optimal feature sets of chromatin�moments and all features are identical. implemented according to [9]. for this the membership to each class is computed, taking the influence of a nearest neighbor into account using the distance to the sample. this distance can moreover be weighted. for this we used the proposed version of assigning complete membership of the neighbors into their own class and nonmembership in all other classes. the second version of weighting the influence of the neighbours based on their distances to the class means assumes a unimodal distribution of the classes with equal variances, which turned out to be too restrictive [10–11]. since the third proposed weighting increases the computational costs through a search for nearest neighbors for each nearest neighbor of a sample, it is excluded. for the svm algorithm [12] the spider toolbox for matlab is applied, interfacing the libsvm. the rbf kernel is chosen as kernel function, and the classifiers are trained using gridsearch for the parameter search, and cross-validation on the training data. 4 experiments and results the cell set from section 2 is split into reference cells and analysis cells for the training set and a validation set of reference cells, and analysis cells. on the basis of the training set all features are normalized to the range [0, 1]. firstly, feature selection was performed. to rate a wide range of feature combinations during the feature selection process, the features are grouped into five basic feature sets. these are morphology, chromatin green and chromatin extinction as the chromatin features computed for the green channel and the extinction image respectively. additionally the combination of the two former chromatin sets is extended through moment based features computed for the extinction and the edge texture image to chromatin � moments. and, finally, within the last basic feature set all features are included. feature selection was performed on each of these basic feature sets up to feature set sizes of 50 features for each objective function (searching method in fig. 2). to determine the best feature set size of each objective function, knn classifiers were trained for k from 1 to 10 (classifier within the block feature selection in fig. 2). these sets are chosen using the overall classification rates from leave-one-out cross-validation on the training set to rate the feature subsets. 
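the crisp-initialization variant of the fuzzy-knn [9] used in the comparisons below reduces to a few lines. a sketch with the customary fuzzifier m = 2 (the value is an assumption, not stated in the paper):

# sketch: fuzzy k-nearest-neighbor memberships after Keller et al. [9];
# crisp initialization: each training sample belongs fully to its own class
import numpy as np

def fuzzy_knn(X_train, y_train, x, k=4, m=2.0, n_classes=2):
    d = np.linalg.norm(X_train - x, axis=1)      # Euclidean distances
    nn = np.argsort(d)[:k]
    w = d[nn] ** (-2.0 / (m - 1.0))              # distance weights
    w[~np.isfinite(w)] = 1e12                    # an exact hit dominates the vote
    mu = np.zeros(n_classes)
    for j, idx in enumerate(nn):
        mu[y_train[idx]] += w[j]
    mu /= w.sum()                                # class memberships in [0, 1]
    return mu.argmax(), mu

# toy usage with two features, as in the final classifier
X = np.array([[0.1, 0.2], [0.15, 0.25], [0.8, 0.9], [0.85, 0.8]])
y = np.array([0, 0, 1, 1])                       # 0 = reference, 1 = analysis
print(fuzzy_knn(X, y, np.array([0.7, 0.75])))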
table 1 shows the results of the best sets for each basic feature set. for these five best classifiers the classification rates on the validation set were computed with knn and fuzzy-knn (table 2). it turned out that the best distinction between the two classes needs two features only. these are a moment based feature (imtote in [3]) as an estimate of the integral optical density, and, as an inhomogeneity measure of the chromatin distribution, the median of the topological gradient image (rg in [3]) for the green channel. as svms are known to provide good generalization results, two svms were trained on the training set. one on the whole set of 203 features (svm203) and one for the two features from the feature selection process that showed a good discriminative power with the knn classifiers (svm2). both svms are trained with 13-fold cross-validation and an iteratively refined grid-search for the svm parameter width of the rbf kernel and the penalty term. the grid search started at logarithmic scale for both parameters. the classification results on the validation set are comparable for both classifiers and also with the results of the knn and the fuzzy knn, as shown in table 3. a comparison of the two classifiers with the best overall classification rate, f-knn and svm2, shows a higher detection rate of the diagnostically relevant analysis cells for f-knn, but with a lower predictive value and higher computational costs compared to svm2. table 4 lists the number of misclassified cells and the predictive values in comparison for these two classifiers. the predictive value of each class is calculated as the ratio of the number of correctly classified cells of the class to the sum of the number of correctly classified cells of this class and the number of cells falsely classified into this class. 5 discussion for the different basic feature sets the classification results of the knn for the best feature sets vary slightly. it can be seen that chromatin features provide a good separability performance, whereas only the geometrical features within the basic set morphology do not distinguish the two classes sufficiently. © czech technical university publishing house http://ctn.cvut.cz/ap/ 89 acta polytechnica vol. 47 no. 4–5/2007 basic feature set knn (a; r; o) f-knn-a (a; r; o) morphology 75.3; 92.5; 85.4 77.7; 88.5; 83.2 chromatin green 91.9; 97.5; 94.7 91.9; 97.5; 94.7 chromatin extinction 93.9; 95.5; 94.7 91.9; 94.5; 93.2 chromatin+moments/ all features 92.4; 98.5; 95.5 93.9; 97.5; 95.7 table 2: classification results of the validation set, computed for the feature sets and the number of neighbors of the classifiers within table 1. the classification results (a; r; o) are computed for the classifiers knn and f-knn and are given as percentages for the (a)nalysis, the (r)eference cells, and the (o)verall rate. classifiers analysis cells reference cells overall rate knn 92.4 98.5 95.5 f-knn 93.9 97.5 95.7 svm 2 92.9 98.5 95.7 svm203 93.9 97.0 95.5 table 3: the detection rates of analysis cells, reference cells, and the overall rates of the validation set are shown in percentages for the different classifiers. knn, f-knn and svm2 use two features only, svm203 classifies according to all features. classifier f-knn svm2 analysis cells 186 (198) 97.4 % 184 (198) 98.4 % reference cells 195 (200) 94.2 % 197 (200) 93.4 % table 4: absolute number of correctly classified cells within each class and the predictive values in percentages for f-knn and svm2. 
the total number of cells in each class is noted in brackets. to achieve the best overall classification rate, only two features are needed, which results in low computational costs. these features also are consistent with the visual criteria used by cytopathologists (section 3), which simplifies an understanding of the computed decisions. validating these two features on the validation set using knn and f-knn, as well as svm2, results in comparable classification rates between the classifiers. these rates on the validation set are slightly better than those classification results achieved with the knn classifier after feature selection on the training set. this is due to the distribution of the validation set within the feature space. training a svm on all features for the possibility of detecting another relation between the features than was tested during feature selection did not result in better classification results. however, these results are still comparable to the classifiers using two features only. the quantity of training data may still be too low for this. in consequence, to be sure not to have performed an overfitting or to have used a training set that is not representative, these classifiers have to be tested with cells from specimens that are different from the specimens of the training set. furthermore, there is a persistent difference between the classification results of the reference cells and analysis cells for the knn methods, and also for the svms. the major reason for this may be the different number of training cells in the classes. therefore the analysis cells should be supplemented with new prototypes. in conclusion, it can be stated that the classification results on the 398 validation cells result in overall classification results between 95.5 % and 95.7 % for the different classifiers, thus providing good separability between the analysis cells and the reference cells. while the classification results of the different classification algorithms are comparable, we can choose the final classifier according to the best classification results of either analysis cells or reference cells, or on the basis of computational costs. acknowledgments i would like to thank prof. dr.-ing. t. aach and prof. emer. dr.-ing. d. meyer-ebrecht, institute of imaging and computer vision, rwth aachen university, germany, for many insightful discussions, and i would like to thank prof. dr. med. a. böcking, institute for cytopathology, heinrich-heine-university, düsseldorf, germany, for providing the dataset and for his teaching about the medical background. the project is supported by the viktor and mirka pollak fund for biomedical engineering. references [1] böcking, a., et al.: consensus report of the esacp task force on standardization of diagnostic dna image cytometry. anal. cell pathol., vol. 8 (1995), no. 1, p. 67–74. [2] sun, x. r., wanq, j., garner, d., palcic, b.: detection of cervical cancer and high grade neoplastic lesions by a combination of liquid-based sampling preparation and dna measurements using automated image cytometry. cellular oncology, vol. 27 (2005), no. 8, p. 33–41. [3] rodenacker, k., bengtsson, e.: a feature set for cytometry on digitized microscopic images. anal cell pathol, vol. 25 (2003), no. 1, p. 1–36. [4] suk, t., flusser, j.: graph method for generating affine moment invariants. procs icpr, vol. 2 (2004), p. 192–195. [5] young, i. t., verbeek, p. w., mayall, b. h.: characterization of chromatin distribution in cell nuclei. cytometry, vol. 
7 (1986), no. 5, p. 467–474. [6] pudil, p., novovičová, j., kittler, j.: floating search methods in feature selection. pattern recognition letters, vol. 15 (1994), no. 11, p. 1119–1125. [7] fukunaga, k.: statistical pattern recognition. academic press, 1990. [8] pluim, j. p. w., maintz, j. b. a., viergever, m. a.: mutual information based registration of medical images: a survey. ieee trans. med imaging, vol. 22 (2003), no. 8, p. 986–1004. [9] keller, j. m., gray, m. r., givens, j. a.: a fuzzy k-nearest neighbor algorithm. ieee trans. system, man, and cybernetics, vol. smc-15 (1985), no. 4, p. 580–585. [10] schneider, t. e., bell, a. a., meyer-ebrecht, d., böcking, a., aach, t.: computer aided cytological cancer diagnosis: cell type classification as a step towards fully automatic cancer diagnostics on cytopathological specimens of serous effusions. spie medical imaging 2007, computer–aided diagnosis, vol. 6514 (2007), p. 6514og-1–10. [11] schneider, t. e., bell, a. a., gerberich, g., meyer-ebrecht, d., böcking, a., aach, t.: chromatinmuster-basierte zellklassifizierung für die dns-bildzytometrie an mundschleimhaut-abstrichen. bildverarbeitung für die medizin 2007, 2007, p. 257–261. [12] vapnik, v. n.: the nature of statistical learning theory. springer, 1995. timna esther schneider e-mail: timna.schneider@lfb.rwth-aachen.de institute of imaging and computer vision rwth aachen university sommerfeldstrasse 24 52074 aachen, germany 90 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 47 no. 4–5/2007 ap08_3.vp 1 nuclear fuel cycle the nuclear fuel cycle consists in principle of three basic parts. the first part is usually called the “front-end”, and includes all activities connected with procurement of nuclear material and services, production of nuclear fuel and its transportation to the npp site. the second part of the nuclear fuel cycle is normally defined as use of the nuclear fuel in the reactor. it is generally referred to as “middle” or “in-core”. for all subsequent manipulations and operations with used (“spent”) fuel the term “back-end” of the nuclear fuel cycle is used. there are different points of view on the nuclear fuel cycle. in this paper i will look at it mainly from the position of a nuclear power plant operator. 1.1 front-end the front-end nuclear fuel cycle starts with mining of uranium ore. then uranium in the form of concentrates is purified and transformed into uranium hexafluoride in a conversion plant (uranium conversion). in light water reactors it is necessary to increase the proportion of fissionable isotope 235u (in nature only 0.711 %) to 3–5 % in the nuclear fuel. such isotopic enrichment is based on separating the initial quantity of uranium hexafluoride in gaseous form into the enriched and the depleted streams using a slightly lower mass of the uf6 molecule of 235u then 238u (uranium enrichment). the enrichment plant uses diffusion or centrifuge technology. uranium hexafluoride also has very convenient properties for phase transformation. easy transition between solid, gaseous and liquid phases enables simple and safe filling, handling and transportation of uranium hexafluoride. finally, in the fabrication plant uranium hexafluoride is re-converted into uo2. fuel pellets are pressed from uo2 powder, sintered, braced and stacked into fuel elements (zirconium tubes). the completed fuel (fresh fuel) assembly is an array of fuel elements fixed together via the top and bottom nozzle and several spacer grids. 
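the enrichment stage sketched above is commonly budgeted with the standard mass-balance and separative-work relations. a minimal sketch with illustrative assays (feed 0.711 %, product 4 %; the tails assay of 0.25 % is an assumption):

# sketch: enrichment feed and separative work for 1 kg of enriched product
import math

def value(x):                          # value function V(x) = (2x-1)*ln(x/(1-x))
    return (2.0 * x - 1.0) * math.log(x / (1.0 - x))

xf, xp, xw = 0.00711, 0.04, 0.0025     # feed, product, tails assays (weight fractions)
P = 1.0                                # kg of product
F = P * (xp - xw) / (xf - xw)          # feed, from the U-235 mass balance
W = F - P                              # depleted tails
SWU = P * value(xp) + W * value(xw) - F * value(xf)
print(f"feed {F:.2f} kg, tails {W:.2f} kg, {SWU:.2f} kg-SWU per kg of product")

with these assays, roughly 8 kg of natural uranium and about 5.8 kg-swu are needed per kilogram of 4 % enriched product.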
the fresh fuel is then transported in dedicated containers to the npp site and stored in fresh fuel storage. 1.2 middle part as fissionable isotopes are gradually consumed in the reactor core, due to their depletion a chain reaction is not longer sustainable under safe conditions. reactor operation must be halted. during outage the most depleted nuclear fuel is replaced by fresh fuel and the quantity of fissionable isotopes in the reactor (core reactivity) is increased again. in fact, the reactivity in the core must be excessive at the beginning of the cycle in order to enable 12 months or longer operation of the reactor. such excessive reactivity must be compensated at the beginning of the reactor cycle by burnable absorbers integrated in the fuel (gadolinium or boron based), control clusters and boron acid concentration in the primary circuit. in case of a 12-month cycle between two refueling outages, 1/4–1/5 of the fuel inventory in the reactor core is changed. in the case of a 18-month cycle, roughly 1/3 of the fuel is reloaded, and in case of a 24-month cycle about 1/2 is changed. partly used fuel remaining in the core is repositioned and / or rotated. important aspects of the middle part of the cycle are fuel design and loading pattern calculation. the basic input for this calculation is the required operation of the npp unit (defined by the planned use of the nuclear unit within the operator’s power plant portfolio) taking into account the current depletion of the fuel in the core. the design optimization sets the number of fuel assemblies to be reloaded and their enrichments, including its radial and axial profiling. the necessary content of burnable absorbers is also calculated. the loading pattern sets the position of each fuel assembly in the reactor core. 1.3 back-end the back-end of the nuclear fuel cycle includes all operations which ensure that highly radioactive, toxic and heat releasing spent fuel is safely separated from the environment. it starts with storage of spent fuel in the pool in the reactor building. the fuel is stored until the residual heat is decreased to a level that enables further handling of the spent fuel. the capacity of the spent fuel pool is limited, and is usually suffi30 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 48 no. 3/2008 nuclear fuel cycle evaluation and real options l. havlíček the first part of this paper describes the nuclear fuel cycle. it is divided into three parts. the first part, called front-end, covers all activities connected with fuel procurement and fabrication. the middle part of the cycle includes fuel reload design activities and the operation of the fuel in the reactor. back-end comprises all activities ensuring safe separation of spent fuel and radioactive waste from the environment. the individual stages of the fuel cycle are strongly interrelated. overall economic optimization is very difficult. generally, npv is used for an economic evaluation in the nuclear fuel cycle. however the high volatility of uranium prices in the front-end, and the large uncertainty of both economic and technical parameters in the back-end, make the use of npv difficult. the real option method is able to evaluate the value added by flexibility of decision making by a company under conditions of uncertainty. the possibility of applying this method to the nuclear fuel cycle evaluation is studied. 
keywords: nuclear fuel cycle, front-end, back-end, economic evaluation, evaluation methods, npv, uncertainty, flexibility, real options. cient only for a few years of operation. the spent fuel therefore must be transferred to some outside storage facility. there are two basic strategies for the back-end of the fuel cycle: � open cycle – the spent fuel is stored in long-term storage and then disposed of in a deep geological repository. � closed cycle – the spent fuel is reprocessed (the nuclear material is separated from the construction material and the products of fission reaction, and is re-used in the new fuel), and only vitrified radioactive waste is disposed of in a deep geological repository (smaller in size than for open cycle). treatment of radioactive waste generated during the operation of a nuclear power plant and during storage, reprocessing or other treatment of spent fuel and during decommissioning of nuclear facilities (the nuclear power plant itself, storage, reprocessing facility) is an integral part of the back-end of the fuel cycle. 2 economic optimization in the nuclear fuel cycle all three stages of the nuclear fuel cycle are strongly interconnected and mutually dependent. the design parameters of the fuel will have a major effect on the procurement costs in the front-end. such procurement costs will be proportional to the heat generated in the reactor allocated to the total npp costs during use of the fuel in the reactor. at the same time, the fuel burn-up and its power history in the core will determine the properties of the spent fuel. the resulting residual heat, the activity, the toxicity and the total quantity of such fuel will strongly influence the back-end fuel costs. these future costs must be estimated with a sufficient degree of reliability and allocated to the unit of production generated during the lifetime of an npp. the above-mentioned relations illustrate the importance, and at the same time the difficulty of such complex optimization (which would take into account all three stages of the fuel cycle). the basic assumption for optimal use of nuclear fuel is the best possible use of the neutrons produced by the chain reaction in the reactor core. because there is an upper limit of 5 % on the enrichment level used in fresh fuel (an administratively imposed limit aimed at preventing misuse of nuclear material for military purposes), the total theoretical reactivity in fresh fuel is limited. at the same time, there is another limit on the amount of reactivity loaded into reactor in fresh fuel during an outage, as the power distribution curve in the reactor core must fulfill certain conditions during the whole cycle. with the use of burnable absorbers reactivity can be temporarily suppressed but at an additional cost. as a result, optimal designs of fresh fuel have been developed for individual reactors with the aim of making maximum use of the fuel in the reactor. progress in the alloys used for structural fuel materials has enabled this. however such optimized fuels achieve very high burn-up, and are subjected to very harsh conditions in the core over a long period of time. the number of fuel element failures and the release of radioactivity into the primary circuit coolant has been rising with increasing fuel burn-up. this leads to increased npp operational costs and to higher production of waste for treatment and disposal. greater burn-up of spent fuel also means higher residual heat. disposal containers must then be spaced further apart. 
this implies a larger geological repository and higher back-end costs. optimization in only one part of the nuclear fuel cycle can thus have an adverse effect on other parts of the cycle. it is also necessary to consider that the fuel cycle is only one aspect of npp operation. it is sometimes necessary to compromise fuel cycle effectiveness due to:
- malfunctioning of other technological systems in the power plant (operation at lower than nominal parameters, or shortened cycles – not all reactivity in the fuel is used as planned and paid for),
- coordination of operation and outages of different units (not only nuclear) in the fleet of an operator (suboptimal fuel use, as above),
- business opportunities – the price of electricity can motivate an operator to maximize generation despite higher specific fuel costs (this is the case with 18- and 24-month npp cycles; fuel use and spent fuel production are optimal for a 12-month cycle).
there is an important difference between the front-end and middle fuel cycle on the one hand and the back-end on the other. the first two stages are fully controlled by the operator: all decisions are solely at the operator's discretion. in the nuclear fuel back-end, however, some activities are controlled by other entities. in the czech republic the state is responsible for disposal of radioactive waste, while in some countries the responsibility lies with a company founded jointly by several nuclear waste producers. the operator and the other subject (e.g. the state) must to some degree coordinate their activities. it is an open question how consensus should be achieved between these subjects in the case of a major change in strategy (e.g. a transition from an open to a closed back-end fuel cycle), and how such decisions would be introduced into the repository cost estimation system and recalculated into the fund created during npp operation for the future costs of spent fuel and waste disposal.
this review indicates that optimization over the whole fuel cycle is very difficult. in practice, some parameters must be set as fixed input values in order to decrease the number of variables to a reasonable level. in real life, fuel cycle lengths are imposed by power plant fleet outage scheduling. consequently, fuel design is optimized, and one of the criteria is to minimize the number of fresh fuel assemblies in a reload batch. other criteria are minimization of the neutron flux to the reactor vessel (lower costs for future decommissioning of the npp, and slower ageing of the reactor vessel). more reliable fuel design leads to lower production of radioactive waste and lower costs for processing and disposing of it. currently this is the limiting factor for optimization in the nuclear fuel cycle: more detailed optimizations are performed separately in the individual stages, but there is no interrelation between such analyses. nowadays even power companies are preoccupied with short-term financial results and capitalization on the stock exchange. strategic
as the analyzed options in most cases represent the same amount of generated (and sold) electricity, for npv only costs are considered in the calculations. 3.1 front-end analyses the major uncertainty in the front-end is the price of uranium. following a long period of stability this price has been highly volatile for several years. the price of nuclear services (conversion, enrichment and fabrication) has been increasing steadily and is more predictable. as long-term contracts are used to procure nuclear materials and services (5–15 years), and the procurement chain for a reload batch (from mining of raw uranium to the completed fuel) is roughly 2 years, the cost of capital can be predicted reasonably. plain npv complemented by scenarios and sensitivity is used. for cycle x of the reactor calculation of front-end costs the following formula can be used: npvx t t t t n t m cp fa r� � � � �� � �( ) , (1) where cpt is cost of procurement in year t, fa is fuel amortization in year t, n is year of the first expenditure for nuclear fuel, m is year of full amortization of the fuel. 3.2 back-end analyses the back-end period lasts at least 50–100 years. it is logical that the uncertainty is much higher than in the front-end part of the nuclear fuel cycle. the high degree of uncertainty is connected with estimating the spent fuel and the radioactive waste storage and disposal. the actual future technologies, time schedules and waste volumes can vary substantially from those currently considered. it is difficult to predict economic parameters like inflation, interest rates and rates of return on invested back-end liability funds over such a long period. the most important task for back-end analyses is to determine the rate of accumulation of financial funds (e.g. in czk per mwh generated) during npp operation in order to cover fully all future costs connected with nuclear liabilities (waste and spent fuel storage and disposal, npp decommissioning). using the traditional npv approach, a simplified rate (constant over npp lifetime) can be calculated in the following way: rate c r e r t t t n t m t t t t t � � � � � � � � � � � � 1 1 , (2) where ct is back-end cost in year t, et is electricity generated in year t, r is risk free real interest rate, t is year of npp closure, n is year r of the first expenditure for nuclear fuel, m is year of last back-end cost. in most cases, this type of npv is used in analyses, complemented by sensitivity analysis, scenarios and trees. however a probabilistic approach is used in some countries. the individual input values are represented by probability distribution. result is also represented by a probability distribution. the interpretation can be such that the operators are required to accumulate financial means at a rate corresponding to 50 % probability and to provide guarantees corresponding to the difference between rate for 80 % probability and 50 % probability. due to the high degree of uncertainty and volatility, it is tempting to use the real option concept and its applicability to analyses in the nuclear fuel cycle. 4 concept of real options 4.1 what is an option the term “option” can be understood as any situation which gives a subject the right to postpone its decision until new or better information is available. there are two important terms related to options – uncertainty and flexibility. uncertainty is a term used to describe something which is not known. 
4 concept of real options

4.1 what is an option

the term "option" can be understood as any situation which gives a subject the right to postpone its decision until new or better information is available. there are two important terms related to options – uncertainty and flexibility.

uncertainty describes something which is not known: the set of outcomes that are beyond the knowledge and power of the subject making the decision. uncertainty can be technical or economic. technical uncertainty is not a function of any economic parameter; it is a function of purely technical factors (unknown ore grade, fuel deterioration, geological parameters at the repository site). technical uncertainty can be decreased by investment in research and exploration, and it is wise to do this in stages. economic uncertainty is a function of movements in economic parameters in general and in a particular industrial branch (e.g. commodity prices). uncertainty can be quantified by volatility (standard deviation, variance).

flexibility is the ability to adapt decisions in real time to a change in situation (market prices, new technology). however, flexibility does not always have a useful value for a decision-making subject (table 1).

table 1: value of flexibility

                   rigid system                       flexible system
low uncertainty    better to focus at the beginning   flexibility too expensive
high uncertainty   rigid system too risky             flexibility needed

based on the degree of uncertainty and flexibility, the analytical methods in table 2 are recommended. this does not mean that option techniques cannot be used for low flexibility or uncertainty; it just would not be effective – it would provide the same results as npv. however, the methods can be combined; for example, volatility can be simulated by the monte carlo method.

4.2 real options

a real option is the right, but not the obligation, to undertake a business decision in future concerning a company's assets. the attraction of such a right depends on the value of the underlying asset. the current price of the underlying asset is the present value of future cash-flows, and the exercise price is the capital investment connected with the decision. the option expiry date is not known in advance. if the option can be exercised at any time before the expiry date, we speak of an "american option"; if it can be exercised only at expiry, we speak of a "european option". the risk-free interest rate is the same as the risk-free interest rate used in financial options, and the volatility of the underlying asset is expressed by the volatility of the future discounted cash-flows. by analogy with the theory of financial options, the option can be a "put" (to sell) or a "call" (to buy). if npv is the value of the project without considering the real option, then the following formula is valid:

NPV* = NPV + value of the option.   (3)

traditional npv represents the value of the project when the parameters estimated at the beginning remain valid, whereas the option value includes the value of adaptation to new conditions. the most widely-used options are:
- option to wait or to defer a project
- option to expand or to contract a project
- option to abandon a project
- option to stage
- option to shut down and restart
- option to switch

5 applicability of the real option concept in the nuclear fuel cycle

5.1 applicability in the front-end

if an operator buys nuclear materials and services to be delivered just in time for the next stage of the processing chain, there is no room for flexibility: the operator cannot adapt its procurement to the situation in the market.
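the deferral flexibility discussed next, which fig. 1 prices with a binomial model, can be sketched numerically. one common, simplified framing values the right to postpone a purchase as an american put on the commodity price with strike at today's contract price; all numbers below are hypothetical, not those behind fig. 1:

# sketch: deferral flexibility valued as an American put on the uranium price
# (strike = today's contract price); all numbers are hypothetical
import numpy as np

S0 = K = 100.0                         # spot uranium price and contract price
sigma, r = 0.30, 0.03                  # assumed volatility and risk-free rate
T, n = 2.0, 24                         # two years of inventory cover, monthly steps
dt = T / n
u = np.exp(sigma * np.sqrt(dt)); d = 1.0 / u
p = (np.exp(r * dt) - d) / (u - d)     # risk-neutral up-move probability
disc = np.exp(-r * dt)

S = S0 * u ** np.arange(n, -1, -1) * d ** np.arange(0, n + 1)  # terminal prices
V = np.maximum(K - S, 0.0)             # payoff at expiry
for step in range(n, 0, -1):
    S = S[:step] * d                   # lattice prices one step earlier
    V = np.maximum(K - S, disc * (p * V[:-1] + (1.0 - p) * V[1:]))  # early exercise
print(f"value of the option to defer: {V[0]:.2f} per unit of uranium")

with positive price volatility the option value is strictly positive, which is exactly the value added by holding strategic stocks.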
in the middle part of the nuclear fuel cycle, i do not see a possibility of using a real option.

5.2 applicability in the back-end
on the other hand, the potential use of real options in the back-end is wider: we can find more volatile underlying assets with at least some degree of flexibility. if we stick with the price of uranium, we can find the first option of an npp operator. the basic selected alternative for the back-end is the open nuclear fuel cycle, in which it is assumed that all spent fuel will be disposed of in a deep geological repository after long-term storage. this alternative was selected not only for technical reasons, but mainly because the price of uranium was low and the cost of reprocessed uranium conversion, enrichment and fabrication is much higher than the cost of freshly mined uranium; npv preferred the open fuel cycle. however, if we consider the volatility of uranium prices, and consider the option to switch from an open to a closed fuel cycle, we get a different picture (fig. 2: real option to switch from an open to a closed fuel cycle). another option can be the option to expand – if a new npp is built, the volume of spent fuel and radioactive waste changes substantially, so it is important to retain the flexibility to expand the capacity of back-end facilities. we can also imagine other real options.

6 conclusions
complex optimization of the whole fuel cycle is a very difficult task; it is necessary to fix some input values in order to decrease the number of variables and make the calculation viable. npv stays the basic, and for some purposes irreplaceable, analytic tool for economic evaluation in the fuel cycle. it is used, for example, for calculating the rate to be paid per unit of electricity generated in order to create sufficient funding for future back-end liabilities; it is sometimes complemented by sensitivity analysis or probabilistic calculations. the volatility of the underlying assets and a sufficient degree of flexibility in the decisions of the npp operator (or of the state), or flexibility of the back-end technical solutions, enable the use of real option techniques as an important support for strategic decision making.

acknowledgments
i wish to express my gratitude to my family, to my colleagues at work, and last but not least to my teachers in the department of economics, management and humanities for their support.
ing. ladislav havlíček
e-mail: havlicek.ladislav@seznam.cz
dept. of economics, management and humanities
czech technical university in prague, faculty of electrical engineering
technická 2, 166 27 prague, czech republic

wireless and non-contact ecg measurement system – the "aachen smartchair"
a. aleksandrowicz, s. leonhardt

abstract: this publication describes a measurement system that obtains an electrocardiogram (ecg) via capacitively coupled electrodes. for demonstration purposes, this measurement system was integrated into an off-the-shelf office chair (the so-called "aachen smartchair"). whereas in usual clinical applications adhesive, conductively-coupled electrodes have to be attached to the skin, the described system is able to measure an ecg without direct skin contact, through the cloth. a wireless communication module was integrated for transmitting the ecg data to a pc or to an icu patient monitor. for system validation, a classical ecg with conductive electrodes and an oxygen saturation signal (spo2) were obtained simultaneously. finally, system-specific problems of the presented device are discussed.
keywords: electrocardiography, ecg, non-contacting, monitoring, capacitive electrodes, chair.

1 introduction
electrocardiography is one of the most important diagnostic methods to monitor proper heart function. increasingly, it is not only used in a clinical environment but more and more applied to the "personal healthcare" scenario. in this application field, medical devices are intended to be used in the domestic environment and can communicate wirelessly [1]. traditionally, conductive electrodes directly attached to the skin have been used for ecg measurement; with the help of contact gel, they provide direct resistive contact to the patient. unfortunately, these electrodes have various disadvantages which make them less than optimal for long-term use in a "personal healthcare" scenario: as a result of drying of the contact gel and surface degradation of the electrodes, the transfer resistance may change with time. furthermore, metal allergies can cause skin irritations and may result in pressure necroses; especially infants' skin reacts sensitively to this kind of electrode. finally, as a single-use item, they are rather expensive. capacitive (insulated) electrodes, which can obtain an ecg signal without conductive contact to the body, even through clothes, represent an alternative. this kind of electrode was first described by richardson [2]. unlike conductive electrodes, the surface of these electrodes is electrically insulated and remains stable in long-term applications; integrated into objects of daily life, they seem ideal for the "personal healthcare" field. preliminary work concerning the integration of ecg measuring systems into objects of daily life was done by ishijima [3]: in his work, conductive textile electrodes were used, acting as underlay and pillow, which could obtain an ecg during sleep. however, due to direct skin contact, the patient's coupling was probably not exclusively capacitive. a group of korean researchers around k. s. park continued this work and integrated insulated electrodes into several objects, e.g. a toilet seat or a bath tub. recently, two ecg applications with a chair have been presented [4, 5] that are somewhat similar to the work described below. thus, based on the measurement principle of insulated electrodes, we present a cable-free, battery-operated ecg measurement system, integrated into an off-the-shelf office chair.

2 materials and methods
fig. 1 (block diagram of the measurement system) gives an overview of the developed measurement system. the front-end part consists of components for analog processing: the insulated electrodes, an instrumentation amplifier, and analog filters plus in-between amplifiers. a/d conversion is followed by the digital part, with a radio transmitter as a data source and a receiver as a data drain. for demonstration and validation purposes, the receiver may be connected either to a standard icu patient monitor via a d/a converter and magnitude adjustment, or to a pc via a serial interface.

2.1 analog signal processing
the front-end features two active electrodes with an effective surface area of a = 4 cm × 8 cm. in both cases, the electrode's surface forms a coupling capacitance (c_{1,2}) between the subject's body and the input of the unity-gain amplifiers a1 and a2, respectively (fig. 1).
the integrated unity-gain amplifiers assure the required high input impedance of the measurement system. the electrode's surface is covered by a very thin insulation layer (here: approx. 20 μm of clear-transparent insulating lacquer). hence, the coupling capacitance depends mainly on the thickness d and the dielectric constant $\varepsilon_r$ of the cloth located between the electrode and the subject's skin. assumed values of d = 0.3 mm and $\varepsilon_r = 1$ result in a capacitance of

$c_{1,2} = \varepsilon_r \varepsilon_0 \dfrac{a}{d} \approx 92\ \mathrm{pf}$ . (1)

to suppress interference due to changing electromagnetic fields in the environment, the electrodes have to be actively shielded [6]. any static charges on the coupling capacitances and/or on the subject's clothing must be discharged over the resistance $r_{bias}$. note that $c_{1,2}$ and $r_{bias}$ form a high-pass filter; a compromise between the discharging time constant and the attenuation of important ecg spectral fractions needs to be found by adjusting the resistor $r_{bias}$ properly. in this application, $r_{bias}$ was selected as 100 gΩ. thus, the cut-off frequency of the electrodes can be calculated as

$f_c = \dfrac{1}{2\pi\, c_{1,2}\, r_{bias}} \approx 17.3\ \mathrm{mhz}$ (i.e. millihertz). (2)
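the two values in eqs. (1) and (2) can be re-derived from the stated parameters; the quick python check below uses only the constants given in the text (the small offset from the quoted 92 pf comes from rounding).

```python
# numeric check of eqs. (1) and (2) using the parameter values stated above.
import math

eps0, eps_r = 8.854e-12, 1.0   # vacuum permittivity (f/m), cloth dielectric constant
a = 0.04 * 0.08                # electrode area 4 cm x 8 cm, in m^2
d = 0.3e-3                     # cloth thickness, m
r_bias = 100e9                 # bias resistor, ohm

c12 = eps_r * eps0 * a / d                   # eq. (1): coupling capacitance
f_c = 1.0 / (2 * math.pi * c12 * r_bias)     # eq. (2): high-pass cut-off
print("c_1,2 = %.0f pf" % (c12 * 1e12))      # ~94 pf; the text quotes ~92 pf
print("f_c = %.1f mhz (millihertz)" % (f_c * 1e3))   # well below the ecg band
```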
in practice, 50 hz voltages from the power supply line that are capacitively coupled into the patient's body may cause a common-mode voltage ($v_{cm}$) of approx. 1 v between the subject's body and the insulated circuit common ground, see fig. 2 (the "driven-ground-plane" circuit). in this figure, $c_s$ represents the isolation capacitance between the circuit common ground and potential earth, and $c_b$ and $c_p$ the stray coupling with the 220-v power line and potential earth, respectively. due to the finite common-mode rejection ratio (cmrr) of the instrumentation amplifier, these common-mode voltages on the body cause interference in the output signal $v_{out}$ and have to be suppressed. for this reason, a so-called "driven-right-leg" circuit with an additional reference electrode is used in most conventional (conductive) ecg measurements [7]. by analogy, kim et al. [4] introduced a so-called "driven-ground-plane" circuit for non-contacting ecg measurements, which was also implemented in the application presented here. the sum of the electrodes' output signals is fed back via the inverting amplifier a3 to a conductive plane (c3), as shown in fig. 2; this plane is also insulated and capacitively coupled to the subject. resistors $r_a$ and $r_f$ adjust the amplifier's gain to

$g_{a3} = \dfrac{2 r_f}{r_a}$ . (3)

besides the limited cmrr of the instrumentation amplifier, another reason for needing to reduce $v_{cm}$ is its possible transformation into a differential signal $v_d$ at the instrumentation amplifier's input, see fig. 3 (equivalent circuit for common-mode voltages at the input of the measurement system). in the case of capacitive ecg measurements, this may occur due to different coupling capacitances $c_1 \neq c_2$, e.g. as a result of inhomogeneities in clothing. with $z_1 = \frac{1}{j\omega c_1}$ and $z_2 = \frac{1}{j\omega c_2}$, and the input impedance $z_c$ of a1 and a2 (here: opa124, burr-brown corp., dallas, usa), eq. (4) shows the relation between $v_d$ and $v_{cm}$:

$v_d = v_{cm}\left(\dfrac{z_b}{z_b + z_2} - \dfrac{z_b}{z_b + z_1}\right)$ (4)

with

$z_c = 10^{14}\ \Omega \parallel 3\ \mathrm{pf}$ (5)

and

$z_b = \dfrac{z_c\, r_{bias}}{z_c + r_{bias}}$ . (6)

furthermore, eq. (4) illustrates that, to minimize interference at the instrumentation amplifier's input, either both coupling capacitances must be identical or $v_{cm}$ must be as low as possible; the "driven-ground-plane" circuit supports the second option. the resulting common-mode voltage suppression was found to be

$\left.\dfrac{v_{cm}}{v_{pf}(220\ \mathrm{v})}\right|_{50\ \mathrm{hz}} = -118\ \mathrm{db}$ (7)

using a gain of $g_{a3} = 1000$ (compare with [4]).
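to see how a capacitance mismatch turns common-mode voltage into a differential input signal, the sketch below evaluates eq. (4) at 50 hz with the impedances of eqs. (5) and (6); the 10 % mismatch between c1 and c2 is an assumed example value, not a measurement from the paper.

```python
# evaluation of eq. (4): differential fraction of v_cm caused by c1 != c2.
import math

f = 50.0                                     # power-line frequency, hz
w = 2 * math.pi * f
r_bias, r_in, c_in = 100e9, 1e14, 3e-12      # r_bias and the opa124 model of eq. (5)

def par(za, zb):                             # parallel combination of two impedances
    return za * zb / (za + zb)

z_c = par(r_in, 1 / (1j * w * c_in))         # input impedance, eq. (5)
z_b = par(z_c, r_bias)                       # eq. (6)

c1 = 92e-12                                  # nominal coupling capacitance from eq. (1)
c2 = 0.9 * c1                                # assumed 10 % mismatch (clothing inhomogeneity)
z1, z2 = 1 / (1j * w * c1), 1 / (1j * w * c2)
ratio = z_b / (z_b + z2) - z_b / (z_b + z1)  # eq. (4): v_d / v_cm
print("|v_d/v_cm| = %.2e  (%.1f db)" % (abs(ratio), 20 * math.log10(abs(ratio))))
```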
finally, the measurement chain for analog processing of the ecg signal consists of the following elements:
• an instrumentation amplifier (ina 114, texas instruments inc., dallas, usa)
• a 4th-order butterworth high-pass filter with cut-off frequency $f_{c,hp} = 0.5$ hz
• an in-between amplifier
• a 6th-order butterworth low-pass filter with cut-off frequency $f_{c,lp} = 200$ hz
• a 50 hz notch filter with cut-off frequencies $f_l = 40$ hz and $f_h = 60$ hz
in the version presented here, the overall gain in the forward direction (i.e. from the capacitive electrodes to the a/d converter, see fig. 1) was set to 950.
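for offline experiments with recorded data, a comparable chain can be mimicked digitally; the sketch below builds the corresponding filters with scipy at an assumed sampling rate of 1 khz. it is a digital stand-in for illustration, not the analog hardware itself.

```python
# digital stand-in for the analog filter chain (assumed fs = 1 khz).
import numpy as np
from scipy import signal

fs = 1000.0
hp = signal.butter(4, 0.5, btype="highpass", fs=fs, output="sos")    # 0.5 hz high-pass
lp = signal.butter(6, 200.0, btype="lowpass", fs=fs, output="sos")   # 200 hz low-pass
b_n, a_n = signal.iirnotch(50.0, Q=2.5, fs=fs)                       # 50 hz notch (40-60 hz)

def process(x):
    """apply the high-pass, low-pass and notch stages to one ecg channel."""
    x = signal.sosfiltfilt(hp, x)
    x = signal.sosfiltfilt(lp, x)
    return signal.filtfilt(b_n, a_n, x)

t = np.arange(0, 2.0, 1.0 / fs)              # demo: 1 hz wave plus 50 hz hum
demo = np.sin(2 * np.pi * t) + 0.5 * np.sin(2 * np.pi * 50.0 * t)
print("residual hum fraction: %.3f" % (np.std(process(demo)) / np.std(demo)))
```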
2.2 integration of the measurement system
the capacitive ecg measurement system was integrated into an off-the-shelf office chair. fig. 4 (the "aachen smartchair", front view) shows the electrodes on the backrest and the position of the insulated reference electrode hidden under the cover (copper meshes as "driven ground", area: 34 cm × 23 cm). the chair's backrest is vertically adjustable, in order to adapt the position of the electrodes to the subject's torso.

2.3 wireless data transmission
the analog signal processing block is followed by a 12-bit a/d converter and a zigbee™ radio transmission module (zebra module from sentec elektronik corp., ilmenau, germany). low power consumption and a data transmission rate sufficient for the specific application (unidirectional transmission of a one-channel ecg) make the zebra module particularly attractive. a second zebra module was used as a receiver. at the receiver's output, two interfaces were implemented: either an icu patient monitor can be connected via d/a conversion and magnitude adjustment, allowing standard ecg clamp leads to be applied, or a simple wired serial interface to a pc can be realized. in this second case, digital processing and the ecg data display may be done by a labview® application. due to the wireless ecg data communication and the use of an accumulator for the power supply, the "aachen smartchair" possesses a mobility which is limited only by the transmission range.

3 results
fig. 5 shows an ecg signal obtained with our measurement system (capacitively measured ecg while the subject is wearing a cotton wool shirt 0.3 mm in thickness). all given measurements were recorded from the same subject (male, 26 yr., healthy) under normal breathing conditions and without further deliberate body movements. for further validation purposes, a classical, conductive einthoven ecg and an oxygen saturation signal (spo2) were recorded in parallel to the capacitive ecg signal. these signals were displayed on an icu monitor mp70 in combination with an "intellivue" module (both produced by philips medical inc., boeblingen, germany). as an example, a screenshot (inverted for better visibility) visually demonstrates the good correlation of the three vital signals (fig. 6: comparison between a synchronously measured spo2 signal (above), a capacitive ecg signal (center) and a classical (conductive) ecg derivation after einthoven (below), measured under normal breathing conditions with a cotton wool shirt 0.3 mm in thickness). in addition, fig. 7 shows three capacitive ecg signals, obtained with our system, with different cotton wool shirt thicknesses.

4 discussion
with our measurement system, it is possible to monitor an ecg without resistive skin contact. the qrs complex and the t wave were clearly identified, and often the p wave was also seen (see fig. 5). however, a negative subsidence right after the t wave, atypical in comparison with the einthoven ecg, was typically observed by the authors; this coincides with the findings of kim [5]. a possible reason could be the non-typical position of the insulated electrodes compared to the classical ecg positions. this negative wave could also be caused by body movement due to the mechanical activity of the heart. in the application presented here, the ecg electrodes are not fixed to the body, so movement artifacts cannot be prevented in principle; a conductive ecg measurement may be less sensitive to movement artifacts, due to the use of electrodes glued to the body. to reduce movement artifacts, kim et al. suggest raising the high-pass cut-off frequency from 0.5 to 8 hz [5]. this method smoothes the baseline, but may remove diagnostically important fractions of the ecg signal (like the p or t wave). fig. 6 shows that the ecg signal quality of the presented measurement device is comparable to a conductive ecg measurement under certain clothing conditions: a shirt thickness of approx. 0.3 mm or lower. due to possible static charging of the clothes, a cotton wool material is preferred. after searle [6], a general increase of the coupling impedance between body and measurement system leads to an increased impedance difference and, referring to eq. (4), to increasing interference in the ecg signal; accordingly, fig. 7 shows increasing 50 hz hum with thicker clothing. reducing the corner frequency of the low-pass filter to 35 hz, as also applied by kim [5], could decrease this 50 hz noise. in any case, even with a clothing thickness of 2.5 mm, at least the qrs complex was clearly identified.

5 conclusion
with the capacitively coupled ecg measurement system presented here, an ecg can be obtained without direct skin contact and, thus, without causing skin irritations. compared to a conventional, conductive measurement system, it is more sensitive to movement artifacts. furthermore, the quality of the capacitive ecg is strongly dependent on the subject's clothing, i.e. an adequate distance between the surface of the electrodes and the subject's body is necessary for a high-quality ecg measurement. taking these disadvantages into consideration, our system seems useful for heart rate detection in long-term applications. however, further research is needed before the diagnostic potential of capacitive ecg measurement can be finally evaluated.

acknowledgments
the research described in this paper was supervised by prof. dr. s. leonhardt, philips chair for medical information technology at rwth aachen university. thanks to all colleagues from the chair for valuable discussions and technical assistance.

references
[1] leonhardt, s.: personal healthcare devices. in: mukherjee, s. et al. (eds.), amiware: hardware technology drivers of ambient intelligence, chapter 6.1, springer verlag, dordrecht, nl, 2006, p. 349–370.
[2] richardson, p. c.: the insulated electrode. in: proceedings of the 20th annual conference on engineering in medicine and biology, boston, ma (usa), 1967, p. 157.
[3] ishijima, m.: monitoring of electrocardiograms in bed without utilizing body surface electrodes. ieee transactions on biomedical engineering, vol. 40 (1993), no. 6, p. 593–594.
[4] kim, k. k., lim, y. k., park, k. s.: common mode noise cancellation for electrically non-contact ecg measurement system on a chair. in: proceedings of the 27th annual conference of the ieee embs, shanghai (china), sept. 2005, p. 5881–5883.
[5] lim, y. g., kim, k. k., park, k. s.: ecg measurement on a chair without conductive contact. ieee transactions on biomedical engineering, vol. 53 (2006), no. 5, p. 956–959.
[6] searle, a., kirkup, l.: a direct comparison of wet, dry and insulating bioelectric recording electrodes. physiological measurement, vol. 21 (2000), p. 271–283.
[7] winter, b. b., webster, j. g.: reduction of interference due to common mode voltage in biopotential amplifiers. ieee transactions on biomedical engineering, vol. 30 (1983), no. 1, p. 58–61.

adrian aleksandrowicz
e-mail: medit@hia.rwth-aachen.de
prof. dr. steffen leonhardt, ph.d.
philips chair for medical information technology
rwth aachen university
pauwelsstr. 20, 52074 aachen, germany

from positional representation of numbers to positional representation of vectors
izabella ingrid farkas (a), edita pelantová (b, ∗), milena svobodová (b)
(a) eötvös loránd university, doctoral school of informatics, pázmány p. sétány 1/c, 1117 budapest, hungary
(b) czech technical university in prague, faculty of nuclear sciences and physical engineering, department of mathematics, trojanova 13, 120 00 prague, czech republic
∗ corresponding author: edita.pelantova@fjfi.cvut.cz
acta polytechnica 63(3):188–198, 2023, https://doi.org/10.14311/ap.2023.63.0188, © 2023 the author(s), licensed under a cc-by 4.0 licence.

abstract. to represent real m-dimensional vectors, a positional vector system given by a non-singular matrix $M \in \mathbb{Z}^{m \times m}$ and a digit set $D \subset \mathbb{Z}^m$ is used. if m = 1, the system coincides with the well-known numeration system used to represent real numbers. we study some properties of the vector systems which are transformable from the case m = 1 to higher dimensions.
we focus on an algorithm for parallel addition and on systems allowing an eventually periodic representation of vectors with rational coordinates.
keywords: number system, positional representation, local function, parallel addition, eventually periodic representation.

1. introduction
the expression of a number as a linear combination of elements of the sequence $(\beta^j)_{j \in \mathbb{Z}}$ with coefficients from a finite set $D$ is nowadays the most common way to represent numbers. such a system is called a positional number system with base β and digit set $D$. the decimal number system, with base ten and digits 0, 1, ..., 9, has prevailed in europe for several centuries. in the age of computers, the binary and hexadecimal number systems broke the domination of the decimal number system, but the advantages of working with another number system had been observed before computers came on the scene: a. cauchy [1] checked the correctness of his computations by simultaneously using the classical decimal number system and the decimal number system with the symmetric set of digits {−5, −4, ..., 0, 1, ..., 5}. v. grünwald [2] considered a number system with base β = −2 and digits {0, 1}. an important moment for number systems, making them a source of interest for many areas of mathematics, came in 1957, when a. rényi [3] introduced number systems with an arbitrary real base β > 1. algebraic, dynamical, topological, geometric and algorithmic properties of the rényi number systems have been studied very intensively since then, from both theoretical and practical points of view. for example, a suitable choice of a base in an algebraic extension of the rational numbers enables one to represent all elements of the algebraic field by a finite or eventually periodic string of digits, see [4] and [5]. further generalisations of numeration systems emerged in the following years. knuth [6] and penney [7] came up with positional representations of complex numbers: instead of two strings of digits representing the real and the imaginary part of a complex number separately, they suggested using a complex base β, in order to represent the complex number by a single string of digits. this type of representation was then further developed into the concept of canonical number systems [8] and the (even more general) shift radix systems, see [9] for a survey of the topic. in most of the above-mentioned generalisations of numeration systems, every number has a unique representation, whose digits are determined by iterations of some transformation function. one exception is the decimal system with a symmetric digit set used by cauchy: it has eleven digits – more than necessary for representing all positive integers. such a system is called redundant. already cauchy noticed that, with this redundant system, the addition of two numbers is easier than in the classical system, as the carry propagation is limited. this property was further exploited by a. avizienis, aiming to speed up addition. in the classical b-ary numeration system, where the base is an integer β = b ≥ 2, addition has linear time complexity with respect to the length of the representations of the summands. avizienis [10] designed an algorithm with constant time complexity for addition in redundant number systems using a base β = b ≥ 3 and a symmetric integer digit set. in this paper, we consider representations of m-dimensional vectors. the numeration system is given by a square matrix base $M \in \mathbb{Z}^{m \times m}$ and by a finite set of digits $D \subset \mathbb{Z}^m$.
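as a scalar (m = 1) warm-up, the helper below computes grünwald's base −2 expansions with digits {0, 1}, mentioned in the introduction above; the conversion routine is our own illustration, not taken from the paper.

```python
# grünwald's negabinary system: base -2, digits {0, 1} (illustrative helper).

def to_negabinary(n):
    """digits of the integer n in base -2, least significant first."""
    if n == 0:
        return [0]
    digits = []
    while n != 0:
        n, rem = divmod(n, -2)
        if rem < 0:                   # python's floor division can leave rem = -1 ...
            n, rem = n + 1, rem + 2   # ... so normalise it into the digit set {0, 1}
        digits.append(rem)
    return digits

for n in (-7, -1, 6):
    ds = to_negabinary(n)
    assert n == sum(d * (-2) ** j for j, d in enumerate(ds))   # verify the expansion
    print(n, "->", "".join(str(d) for d in reversed(ds)))
```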
the origin of such numeration systems can be found in the works [11] and [12] of a. vince, showing that for any expansive matrix $M \in \mathbb{Z}^{m \times m}$ there exists a digit set $D \subset \mathbb{Z}^m$ such that any element x of the lattice $\mathbb{Z}^m$ can be written in the form $x = \sum_{j=0}^{n} M^j d_j$, where $d_j \in D$. in other words, the whole lattice $\mathbb{Z}^m$ is representable in the matrix numeration system $(M, D)$. however, if the matrix M has an eigenvalue inside the unit circle, then no choice of the digit set $D \subset \mathbb{Z}^m$ allows all integer vectors to be represented as a combination of only non-negative powers of M. the matrix formalism for numeration systems (under the name numeration systems in lattices) was systematically used by a. kovács in [13]. of course, kovács, just like vince, considers integer matrices, as they map a lattice into itself. in fact, already positional representations of gaussian integers, or of algebraic integers from an algebraic extension of the rational numbers, can be interpreted as special cases of matrix numeration systems. j. jankauskas and j. thuswaldner [14] generalised vince's results to matrix bases $M \in \mathbb{Q}^{m \times m}$ with rational entries and without eigenvalues of modulus strictly smaller than 1. another generalisation, introduced recently, allows both positive and negative powers of the matrix base to be used for the representation of vectors. in [15], it is shown that for $M \in \mathbb{Z}^{m \times m}$ with $\det M = \Delta \neq 0$, there exists a finite digit set $D \subset \mathbb{Z}^m$ such that every integer vector from $\mathbb{Z}^m$ has a finite $(M, D)$-representation, i.e., $\mathbb{Z}^m \subset \mathrm{fin}_D(M)$, where

$\mathrm{fin}_D(M) = \left\{ \sum_{j \in I} M^j d_j : I \subset \mathbb{Z},\ I \text{ finite},\ d_j \in D \right\}.$ (1)

we show (in theorem 9) that if, moreover, no eigenvalue of M lies on the unit circle, then for a suitable finite digit set $D \subset \mathbb{Z}^m$, addition and subtraction on $\mathrm{fin}_D(M)$ can be performed by a parallel algorithm, i.e., in a constant number of steps independent of the length of the $(M, D)$-representations of the summands. according to proposition 15, the required assumption on the eigenvalues of M is in fact necessary for the existence of a parallel addition algorithm on $(M, D)$. then, we restrict our study to expansive matrices – i.e., matrices with all eigenvalues strictly outside the unit circle. in theorem 21, we show that a digit set $D \subset \mathbb{Z}^m$ allowing parallel addition enables (for an expansive matrix base M) an eventually periodic $(M, D)$-representation of every element of $\mathbb{Q}^m$, i.e. $\mathbb{Q}^m = \mathrm{per}_D(M)$, where

$\mathrm{per}_D(M) = \left\{ \sum_{j=-\infty}^{n} M^j d_j : d_j \in D,\ (d_j)_{-\infty}^{n} \text{ eventually periodic} \right\}.$

consequently, every element of $\mathbb{R}^m$ has an $(M, D)$-representation (corollary 22). the methods we use in our proofs are based on the proofs of analogous results for positional representations of real and complex numbers, modified according to the formalism of matrices and vectors.

2. preliminaries
a numeration system used for a positional representation of complex numbers is given by a base $\beta \in \mathbb{C}$ with $|\beta| > 1$ and a finite digit set $A \subset \mathbb{C}$. if $x \in \mathbb{C}$ can be written in the form $x = \sum_{j=-\infty}^{n} a_j \beta^j$, where $a_j \in A$ for each $j \in \mathbb{Z}$, $j \leq n$, we say that x has a (β, A)-representation. the assumption $|\beta| > 1$ guarantees that the series $\sum_{j=-\infty}^{n} a_j \beta^j$ converges for any choice of digits $a_j \in A$. w. penney in [7] introduced the following numeration system, which we use to demonstrate our approach.

example 1. let us consider β = i − 1. penney in [7] showed that:
(1.) each $x \in \mathbb{Z}[i] = \{a + ib : a, b \in \mathbb{Z}\}$, x ≠ 0, can be expressed uniquely as $x = \sum_{j=0}^{n} a_j \beta^j$, where $a_0, a_1, \ldots, a_n \in \{0, 1\}$ and $a_n \neq 0$;
(2.) each $x \in \mathbb{C}$ can be expressed as $x = \sum_{j=-\infty}^{n} a_j \beta^j$, where $a_j \in \{0, 1\}$ for every $j \in \mathbb{Z}$, $j \leq n$.

our aim is to study selected properties of matrix numeration systems used to represent m-dimensional vectors. any matrix numeration system used in this paper is given by a non-singular matrix base $M \in \mathbb{Z}^{m \times m}$ and a finite (vector) digit set $D \subset \mathbb{Z}^m$. thanks to the result of [15] mentioned earlier, we always assume that

$\mathbb{Z}^m$ is a subset of $\mathrm{fin}_D(M)$, (2)

and, moreover,

$D$ contains the zero vector. (3)

in the first part of this paper, we work only with vectors from $\mathrm{fin}_D(M)$. therefore, we do not yet impose additional assumptions on the matrix base M analogous to the assumption |β| > 1 required for number bases (which is important to ensure the convergence of the infinite series $\sum_{j=-\infty}^{n} a_j \beta^j$). only in the second part of the paper do we revisit the question of convergence of infinite representations in matrix numeration systems. let us list some obvious properties of the set $\mathrm{fin}_D(M)$:
• $M^p\, \mathrm{fin}_D(M) = \mathrm{fin}_D(M)$ for every $p \in \mathbb{Z}$.
• $\mathrm{fin}_D(M) \subset \mathbb{Q}^m$; more precisely, $\mathrm{fin}_D(M) \subset \bigcup_{k \in \mathbb{N}} \frac{1}{\Delta^k} \mathbb{Z}^m$, where $\Delta = \det M$.
• if $x = \sum_{j=0}^{n} M^j d_j$ for some $n \in \mathbb{N}$, then $x \in \mathbb{Z}^m$.
• $\mathrm{fin}_D(M)$ is closed under addition and subtraction: indeed, if $x, y \in \mathrm{fin}_D(M)$, then there exists $p \in \mathbb{N}$ such that $M^p x \in \mathbb{Z}^m$ and $M^p y \in \mathbb{Z}^m$, and hence $M^p(x \pm y) \in \mathbb{Z}^m$. by assumption (2), $M^p(x \pm y) \in \mathrm{fin}_D(M) = M^p\, \mathrm{fin}_D(M)$, and thus $x \pm y \in \mathrm{fin}_D(M)$.

if a vector $x \in \mathbb{Q}^m$ is expressed as $x = \sum_{j \in I} M^j d_j$ for a finite $I \subset \mathbb{Z}$ and $d_j \in D$, we can, for some integers $s \leq 0 \leq n$, write $x = \sum_{j=s}^{n} M^j d_j$, because the zero vector belongs to D. hence x can be identified with a bi-infinite string $(d_j)_{j \in \mathbb{Z}} \in D^{\mathbb{Z}}$, usually referred to as the $(M, D)$-representation of x:

$(x)_{M,D} = {}^{\omega}0\, d_n d_{n-1} \cdots d_1 d_0 \bullet d_{-1} d_{-2} \cdots d_s\, 0^{\omega}$,

where the zero index in the bi-infinite string is indicated by •. as mentioned in the introduction, some numeration systems used for the representation of numbers can also be interpreted as matrix numeration systems. let us illustrate this concept on the penney numeration system introduced in example 1.

example 2. let us transform the number numeration system from example 1 into a matrix numeration system on $\mathbb{Z}^2$. in place of the number base β = i − 1, we use the matrix base

$M = \begin{pmatrix} -1 & -1 \\ +1 & -1 \end{pmatrix} \in \mathbb{Z}^{2 \times 2}.$

it is easily seen that multiplication of a gaussian integer – a complex number $x = b + ic \in \mathbb{Z}[i]$, with $b, c \in \mathbb{Z}$ – by the number base β corresponds to multiplication of the integer vector $v = (b, c)^{\top} \in \mathbb{Z}^2$ by the matrix base M:

$\beta x = (i-1)(b+ic) = (-b-c) + i(b-c)$, and $M v = \begin{pmatrix} -1 & -1 \\ +1 & -1 \end{pmatrix} \begin{pmatrix} b \\ c \end{pmatrix} = \begin{pmatrix} -b-c \\ b-c \end{pmatrix}.$

let us define a mapping $\xi : \mathbb{Z}[i] \to \mathbb{Z}^2$ by the formula

$\xi(b + ic) := (b, c)^{\top}$ for any $b, c \in \mathbb{Z}$. (4)

obviously, $\xi(x + y) = \xi(x) + \xi(y)$ for every $x, y \in \mathbb{Z}[i]$; thus, the mapping ξ is an isomorphism between the lattices $\mathbb{Z}[i]$ and $\mathbb{Z}^2$. moreover, it fulfils the equality

$\xi(\beta \cdot x) = M \cdot \xi(x)$ for every $x \in \mathbb{Z}[i]$. (5)

hence, if $b + ic \in \mathbb{Z}[i]$ is written in the form $b + ic = \sum_{j=0}^{n} a_j \beta^j$ with $a_j \in \{0, 1\}$, then

$\begin{pmatrix} b \\ c \end{pmatrix} = \xi(b + ic) = \xi\left(\sum_{j=0}^{n} \beta^j a_j\right) = \sum_{j=0}^{n} M^j d_j$, where $d_j = \xi(a_j) \in \left\{ \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 \\ 0 \end{pmatrix} \right\} = \{\xi(0), \xi(1)\}.$

using the properties of the penney numeration system from example 1, we conclude that the matrix numeration system given by the base M above and the digit set $D = \{(0, 0)^{\top}, (1, 0)^{\top}\}$ provides for any vector $v \in \mathbb{Z}^2$ a unique $(M, D)$-representation of the form $v = \sum_{j=0}^{n} M^j d_j$, with $d_j \in D$ and $d_n \neq 0$ (if v ≠ 0).
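the unique expansion of examples 1 and 2 is computable digit by digit: $b + ic$ is divisible by β = i − 1 exactly when b + c is even, so the lowest digit is $(b+c) \bmod 2$ and the rest follows by exact division by β. the helper below is our own sketch of this procedure; via the map ξ, the same digits serve the matrix system (M, D).

```python
# digits of a gaussian integer in penney's base i-1 (examples 1 and 2); our own helper.

def penney_digits(x):
    """digits a_0, a_1, ... in {0, 1} with x = sum_j a_j * (i-1)^j, least significant first."""
    digits = []
    while x != 0:
        a = (int(x.real) + int(x.imag)) % 2   # x - a must be divisible by i - 1
        digits.append(a)
        x = (x - a) / (1j - 1)                # exact: the quotient is again a gaussian integer
    return digits or [0]

x = 7 - 3j
ds = penney_digits(x)
assert x == sum(a * (1j - 1) ** j for j, a in enumerate(ds))   # verify the expansion
print(ds)   # via xi, these are also the vector digits (a_j, 0) of the matrix system
```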
3. parallel addition in matrix numeration systems
let us consider the operations of addition and subtraction on the set of m-dimensional vectors from an algorithmic point of view. similarly to the classical algorithms for arithmetic operations, we work only with finite representations – i.e., on the set $\mathrm{fin}_D(M)$. let $x, y \in \mathrm{fin}_D(M)$, with

$(x)_{M,D} = {}^{\omega}0\, x_n x_{n-1} \cdots x_1 x_0 \bullet x_{-1} x_{-2} \cdots x_s\, 0^{\omega}$ and $(y)_{M,D} = {}^{\omega}0\, y_n y_{n-1} \cdots y_1 y_0 \bullet y_{-1} y_{-2} \cdots y_s\, 0^{\omega}$.

adding x and y means rewriting the $(M, D+D)$-representation

${}^{\omega}0\, (x_n + y_n) \cdots (x_0 + y_0) \bullet (x_{-1} + y_{-1}) \cdots (x_s + y_s)\, 0^{\omega}$

of the vector x + y into an $(M, D)$-representation of x + y. as already announced, we are interested in parallel algorithms for addition; let us formalise the parallelism mathematically. firstly, we recall the notion of a local function, which comes from symbolic dynamics, see [16].

definition 3. let A and B be finite sets. a function $\varphi : A^{\mathbb{Z}} \to B^{\mathbb{Z}}$ is said to be p-local if there exist non-negative integers r and t satisfying p = r + t + 1, and a function $\Phi : A^p \to B$ such that, for any $u = (u_j)_{j \in \mathbb{Z}} \in A^{\mathbb{Z}}$ and its image $v = \varphi(u) = (v_j)_{j \in \mathbb{Z}} \in B^{\mathbb{Z}}$, we have $v_j = \Phi(u_{j+t} \cdots u_{j-r})$ for every $j \in \mathbb{Z}$. this means that the image of u under φ is obtained through a window of limited length p. the parameter r is called the memory and the parameter t the anticipation of the function φ. such functions, restricted to finite sequences, are computable by a parallel algorithm in constant time, irrespective of the length of the operands' representations.

definition 4. given a (matrix) base $M \in \mathbb{Z}^{m \times m}$ with det M ≠ 0 and (vector) digit sets $A, B \subset \mathbb{Z}^m$ containing 0, a digit set conversion in base M from A to B is a function $\varphi : A^{\mathbb{Z}} \to B^{\mathbb{Z}}$ such that
(1.) for any $u = (u_j)_{j \in \mathbb{Z}} \in A^{\mathbb{Z}}$ with a finite number of non-zero digits, $v = (v_j)_{j \in \mathbb{Z}} = \varphi(u) \in B^{\mathbb{Z}}$ has only a finite number of non-zero digits, and
(2.) $\sum_{j \in \mathbb{Z}} M^j v_j = \sum_{j \in \mathbb{Z}} M^j u_j$.
such a conversion is said to be computable in parallel if it is a p-local function for some $p \in \mathbb{N}$. thus, the operation of addition on $\mathrm{fin}_D(M)$ is computable in parallel if there exists a digit set conversion in base M from D + D to D which is computable in parallel.

two useful lemmas precede the statement about parallel addition in matrix numeration systems.

lemma 5. let $M \in \mathbb{Z}^{m \times m}$ be a non-singular matrix and $D \subset \mathbb{Z}^m$ a finite digit set such that every $x \in \mathbb{Z}^m$ is representable in the numeration system $(M, D)$. if addition is computable in parallel in $(M, D)$, then it is computable in parallel also in $(M, D')$ for each finite digit set $D' \subset \mathbb{Z}^m$ containing D.

proof. each digit $d' \in D'$ can be written in the form $d' = \sum_{j \in I(d')} M^j d_j$, where $I(d')$ is a finite subset of $\mathbb{Z}$ and $d_j \in D$ for each $j \in I(d')$. denote

$q = \max\{|x| : d' \in D',\ x \in I(d')\}.$

the string $0^{\omega} d'_n d'_{n-1} \cdots d'_0 \bullet d'_{-1} d'_{-2} \cdots d'_{-n}\, 0^{\omega}$ can be transformed by a (2q+1)-local function into a string of the form $0^{\omega} e_{n+q} e_{n+q-1} \cdots e_0 \bullet e_{-1} e_{-2} \cdots e_{-n-q}\, 0^{\omega}$, where each $e_j \in D + D + \cdots + D$ ((2q+1) times). in other words, a sum of two finite $(M, D')$-representations can be rewritten as a sum of 2(2q+1) finite $(M, D)$-representations.
since addition of two strings is doable in parallel in $(M, D)$, the addition of 2(2q+1) strings with fixed q is possible in parallel in $(M, D)$ as well, and the resulting $(M, D)$-representation is also an $(M, D')$-representation, due to D ⊂ D′.

the following lemma is stated without proof here, as the course of the proof would be identical to that of proposition 5.1 in [17]. although that proposition works with the roots of the minimal polynomial of an algebraic number, the minimality of the polynomial is not used in the proof itself. in fact, the idea of the proof comes from [18], where expansive polynomials are considered; in our case, we extend the considerations to polynomials with no roots on the unit circle.

lemma 6. let $\alpha_1, \ldots, \alpha_n \in \mathbb{C}$ be the roots of a polynomial $f \in \mathbb{Z}[x]$ satisfying $|\alpha_k| \neq 1$ for all $k = 1, \ldots, n$. then, for any $t \geq 1$, there exists a non-zero polynomial $g \in \mathbb{Z}[x]$, $g(x) = \sum_{l=0}^{p-1} c_l x^l$, such that g is divisible by f and for one coefficient $c_L$ we have

$\frac{1}{t}\, c_L > \sum_{l=0,\, l \neq L}^{p-1} |c_l|.$ (6)

in particular, if $|\alpha_k| > 1$ for all $k = 1, \ldots, n$, then L = 0.

corollary 7. let $M \in \mathbb{Z}^{m \times m}$, with det M ≠ 0 and no eigenvalue of M equal to 1 in modulus. then there exists a polynomial $g \in \mathbb{Z}[x]$, $g(x) = \sum_{l=0}^{p-1} c_l x^l$, such that $g(M) = \theta$ (the zero matrix) and for one coefficient $c_L$ we have

$c_L \geq 4 \cdot \left( c + \sum_{l=0,\, l \neq L}^{p-1} |c_l| \right),$ (7)

where $c = c_L - 4 \lfloor c_L / 4 \rfloor \in \{0, 1, 2, 3\}$.

proof. let f be the characteristic polynomial of the matrix M. the cayley–hamilton theorem says that f(M) = θ. by lemma 6 applied to f with t = 16, we find a polynomial $g(x) = \sum_{l=0}^{p-1} c_l x^l \in \mathbb{Z}[x]$ such that

$c_L > 16 \cdot \sum_{l=0,\, l \neq L}^{p-1} |c_l|$, for some $L \in \{0, \ldots, p-1\}$.

using the fact that $\sum_{l \neq L} |c_l| \geq 1$ and denoting $c := c_L - 4\lfloor c_L/4 \rfloor \in \{0,1,2,3\}$, we obtain the following estimate:

$c_L > 16 \sum_{l \neq L} |c_l| = 4\left(3 \sum_{l \neq L} |c_l| + \sum_{l \neq L} |c_l|\right) \geq 4\left(3 + \sum_{l \neq L} |c_l|\right) \geq 4\left(c + \sum_{l \neq L} |c_l|\right).$

since the characteristic polynomial f divides g, we have g(M) = θ.

example 8. the minimal polynomial of the complex number β = i − 1 and the characteristic polynomial of the matrix M defined in example 2 are both equal to $f(x) = x^2 + 2x + 2$. the polynomial $g(x) = x^4 + 4 = (x^2+2x+2)(x^2-2x+2)$ satisfies g(M) = θ and (7) with L = 0.

theorem 9. let $M \in \mathbb{Z}^{m \times m}$, with det M ≠ 0 and no eigenvalue of M equal to 1 in modulus. then there exists a finite (vector) digit set $D \subset \mathbb{Z}^m$ such that $\mathbb{Z}^m \subset \mathrm{fin}_D(M)$ and both addition and subtraction on $\mathrm{fin}_D(M)$ are computable in parallel.

proof. let $g(x) = \sum_{l=0}^{p-1} c_l x^l$ be the polynomial from corollary 7. denote $k = \lfloor c_L/4 \rfloor \geq 1$ and define $D = [-3k, 3k)^m \cap \mathbb{Z}^m$. in order to show that the digit set D enables parallel addition, we introduce two auxiliary sets

$D' = [-2k, 2k)^m \cap \mathbb{Z}^m$ and $Q = [-1, 1]^m \cap \mathbb{Z}^m$,

and then exploit the obvious fact that

$D + D \subset D' + 4kQ.$ (8)

let $x = \sum_{j \in \mathbb{Z}} M^j a_j$ and $y = \sum_{j \in \mathbb{Z}} M^j b_j$, where $a_j, b_j \in D$, and assume that $a_j, b_j \neq 0$ for just a finite number of indices $j \in \mathbb{Z}$. clearly, $a_j + b_j \in D + D$. due to (8), we find for each j a vector $q_j \in Q$ such that $a_j + b_j - 4kq_j \in D'$. then

$x + y = \sum_{j \in \mathbb{Z}} M^j (a_j + b_j) = \sum_{j \in \mathbb{Z}} \left( M^j (a_j + b_j) - M^{j-L}\, g(M)\, q_j \right)$, since $g(M) = \theta$.

we express g(M) in the explicit polynomial form $g(M) = \sum_{l=0}^{p-1} c_l M^l$ in the rightmost sum:

$\sum_{j \in \mathbb{Z}} M^{j-L} g(M)\, q_j = \sum_{j \in \mathbb{Z}} \sum_{l=0}^{p-1} M^{j-L+l} c_l q_j = \sum_{j \in \mathbb{Z}} M^j \left( \sum_{l=0}^{p-1} c_l q_{j+L-l} \right).$

therefore, $x + y = \sum_{j \in \mathbb{Z}} M^j z_j$, with

$z_j = a_j + b_j - \left( \sum_{l=0}^{p-1} c_l q_{j+L-l} \right).$ (9)

using $c_L = 4k + c$, we get

$z_j = \underbrace{a_j + b_j - 4kq_j}_{\in D'} - \left( c\,q_j + \underbrace{\sum_{l=0,\, l \neq L}^{p-1} c_l q_{j+L-l}}_{=:\, u} \right).$
all entries of all vectors $q_j$ belong to {0, 1, −1}, and thus every component of the vector u is in modulus at most $c + \sum_{l \neq L} |c_l|$. equation (7) guarantees that the components of u are not greater than $\frac{c_L}{4}$; since the components are integers, they are at most $k = \lfloor c_L/4 \rfloor$. as $D' + [-k, k]^m \subset [-3k, 3k)^m$, we can conclude that $z_j \in D$. in order to compute $z_j$, we needed to know, besides $a_j$ and $b_j$, also $q_{j+L}, q_{j+L-1}, \ldots, q_{j+L-p+1}$. let us stress that $q_j$ depends only on $a_j + b_j$; hence, $z_j$ is determined by the digits on p positions, i.e., the addition is performed by a p-local function. to demonstrate point (1.) of definition 4, we have to show that $z_j \neq 0$ for only finitely many indices $j \in \mathbb{Z}$. the form of D, D′ and Q guarantees that there exists a unique $q_j$ satisfying $a_j + b_j - 4kq_j \in D'$; in particular, if $a_j = b_j = 0$, then $q_j = 0$. formula (9) then implies that $z_j$ is non-zero for only finitely many indices $j \in \mathbb{Z}$. let us note that the digit set D is not closed under multiplication by −1. but for each $b \in D$, we can find $c, d \in D$ such that $-b = c + d$; hence, a subtraction of two vectors $x = \sum_{\mathbb{Z}} M^j a_j$ and $y = \sum_{\mathbb{Z}} M^j b_j$ can be viewed as an addition of three vectors, and is therefore computable in parallel as well. it remains to prove that $\mathbb{Z}^m \subset \mathrm{fin}_D(M)$. but this is clear, since $D \subset \mathrm{fin}_D(M)$, $\mathrm{fin}_D(M)$ is closed under addition, and each $x \in \mathbb{Z}^m$ can be expressed as a finite sum of digits from D.

remark 10. the vectors we add by the parallel algorithm described in the previous proof are represented by two-sided infinite strings, but only finitely many entries of the strings are occupied by non-zero digits. assume that $x + y = \sum_{j=n}^{N} M^j (a_j + b_j)$ for some integers $n \leq N$. as stated in the previous proof, if both digits $a_j$ and $b_j$ are zero, then the algorithm puts $q_j = 0$. formula (9) for $z_j$ implies that $z_j$ is zero for all $j \leq n - L - 1$ and for all $j \geq N + p - L$. hence, $\sum_{j=n}^{N} M^j (a_j + b_j) = \sum_{j=n'}^{N'} M^j z_j$, where $n' = n - L$ and $N' = N + p - L - 1$.

example 11. consider the matrix numeration system with the base matrix $M = \begin{pmatrix} -1 & -1 \\ +1 & -1 \end{pmatrix} \in \mathbb{Z}^{2 \times 2}$. by example 8, the polynomial $g(x) = x^4 + 4$ with $c_L = c_0 = 4$ is suitable for the parallel addition algorithm described in the proof of theorem 9. following the proof, we put $k = \lfloor c_L/4 \rfloor = 1$ and define $D = [-3, 3)^2 \cap \mathbb{Z}^2$, i.e., the digit set has 36 elements. with such a choice of the digit set D, the addition in $(M, D)$ is computable in parallel. the algorithm for parallel addition constructed in the proof of theorem 9 is very simple, as the value $q_j$ depends only on the digits $a_j$ and $b_j$ having the same index j; an algorithm with this property is usually called neighbour-free. however, we pay a large price for the simplicity of the algorithm – the digit set is huge.
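for example 11, the neighbour-free rule of theorem 9 becomes completely explicit: with $g(x) = x^4 + 4$ (so L = 0, k = 1, c = 0), formula (9) reads $z_j = a_j + b_j - 4q_j - q_{j-4}$, where $q_j \in \{-1,0,1\}^2$ is the unique vector with $a_j + b_j - 4q_j \in [-2,2)^2$. the sketch below implements exactly this 5-local rule; the dictionary encoding of digit strings is our own choice.

```python
# neighbour-free parallel addition of theorem 9 for example 11 (g(x) = x^4 + 4, M^4 = -4*I).
import numpy as np

M = np.array([[-1, -1], [1, -1]])        # base of example 11
Z2 = lambda: np.zeros(2, dtype=int)

def q_of(w):
    """unique q in {-1,0,1}^2 with w - 4q in [-2,2)^2, componentwise."""
    return np.array([-1 if s < -2 else (1 if s > 1 else 0) for s in w])

def parallel_add(a, b):
    """digit strings as dicts {position: digit in [-3,3)^2}; rule z_j = w_j - 4q_j - q_(j-4)."""
    w = {j: a.get(j, Z2()) + b.get(j, Z2()) for j in set(a) | set(b)}
    q = {j: q_of(wj) for j, wj in w.items()}
    z = {}
    for j in range(min(w), max(w) + 5):                 # z_j can be non-zero up to max+4
        zj = w.get(j, Z2()) - 4 * q.get(j, Z2()) - q.get(j - 4, Z2())
        if zj.any():
            z[j] = zj
    return z

def value(rep):
    """evaluate sum_j M^j d_j (non-negative positions suffice for this demo)."""
    return sum((np.linalg.matrix_power(M, j) @ d for j, d in rep.items()), Z2())

a = {0: np.array([2, -3]), 1: np.array([-3, 2]), 2: np.array([1, 1])}
b = {0: np.array([2, 2]),  1: np.array([2, -3])}
z = parallel_add(a, b)
assert (value(a) + value(b) == value(z)).all()          # the value is preserved
assert all(((-3 <= d) & (d < 3)).all() for d in z.values())   # digits land back in D
print({j: tuple(int(c) for c in d) for j, d in sorted(z.items())})
```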
with another choice of algorithm, the digit set can be substantially smaller and still sufficient for performing addition in parallel by means of a p-local function (with a larger parameter p, though).

example 12. consider the numeration system in $\mathbb{C}$ with base β = i − 1. in [19], a 7-local function of parallel addition in the system (β, A) is found for the digit set A = {0, ±1, ±i}. let us denote the 7-local function as $\Phi : (A+A)^7 \to A$, acting on a 7-tuple $(w_j, \ldots, w_{j-6}) \in (A+A)^7$ by means of an auxiliary quotient function $Q : (A+A)^6 \to \mathcal{Q} \subset \mathbb{Z}[i]$ as follows: $q_j := Q(w_j, \ldots, w_{j-5}) \in \mathcal{Q}$ and, consequently,

$z_j := w_j + q_{j-1} - \beta q_j = \Phi(w_j, \ldots, w_{j-6}) \in A.$

the local functions Φ and Q acting on numbers can be transformed into local functions $\Phi' : (D+D)^7 \to D = \xi(A)$ and $Q' : (D+D)^6 \to \mathcal{Q}' = \xi(\mathcal{Q})$ acting on vectors, by means of the isomorphism $\xi : \mathbb{Z}[i] \to \mathbb{Z}^2$ defined in example 2; thereby, we can define $Q' = \xi \circ Q \circ \xi^{-1}$ and $\Phi' = \xi \circ \Phi \circ \xi^{-1}$. in other words, with the help of the formulas for 7-local parallel addition in the number system (β, A), we obtain a 7-local parallel addition in the matrix system $(M, D)$ with digit set size #D = #ξ(A) = #A = 5; the vector digit set of size 5 has the elements $D = \{(0,0)^{\top}, (1,0)^{\top}, (-1,0)^{\top}, (0,1)^{\top}, (0,-1)^{\top}\}$. as proved in [20], the size 5 is minimal for a digit set allowing parallel addition in the number system with base β = i − 1; consequently, the digit set size #D = 5 must be minimal for parallel addition in the matrix system $(M, D)$ as well, due to the isomorphism ξ.

the algorithm for parallel addition of vectors in $\mathbb{Z}^2$ presented in example 12 uses, for the given matrix base M, a digit set of the minimal possible size for parallel addition. however, the way of determining the coefficients $q_j = Q(w_j, \ldots, w_{j-5})$ is very laborious, as the formula for Q is in fact a look-up table with 136 rows. with the digit set size increased from 5 to 9 elements, a much simpler algorithm for parallel addition can be obtained, as presented in the following example.

example 13. let us consider $M = \begin{pmatrix} -1 & -1 \\ +1 & -1 \end{pmatrix}$ and the digit set of size 9

$\tilde{D} = \{(b, c)^{\top} : b, c \in \{0, \pm 1\}\} \subset \mathbb{Z}^2.$

again, we construct an auxiliary coefficient function $\tilde{Q} : (\tilde{D}+\tilde{D})^2 \to \tilde{\mathcal{Q}} \subset \mathbb{Z}^2$. the coefficients $\tilde{q}_j \in \tilde{\mathcal{Q}}$ produced by $\tilde{Q}$ then provide the resulting sum digits $\tilde{z}_j \in \tilde{D}$ via the local function $\tilde{\Phi} : (\tilde{D}+\tilde{D})^3 \to \tilde{D}$, as follows: $\tilde{q}_j := \tilde{Q}(\tilde{w}_j, \tilde{w}_{j-2}) \in \tilde{\mathcal{Q}}$ and, consequently,

$\tilde{z}_j := \tilde{w}_j + \tilde{q}_{j-2} - M^2 \tilde{q}_j = \tilde{\Phi}(\tilde{w}_j, \tilde{w}_{j-2}, \tilde{w}_{j-4}) \in \tilde{D}.$
by checking all the possible variants (w̃j , w̃j−2, w̃j−4) ∈ w̃ 3, it can be verified that the final digit z̃j calculated by z̃j := w̃j + q̃j−2 − m 2q̃j is always an element of the desired digit set d̃. correct value of the final sum z̃ = ∑ j∈z m j z̃j is guaranteed by z̃ = ∑ j∈z m j z̃j = ∑ j∈z m j (w̃j + q̃j−2 − m 2q̃j ) = ∑ j∈z m j w̃j + ∑ j∈z m j q̃j−2 − ∑ j∈z m j+2q̃j = = w̃ + ∑ j∈z m j q̃j−2 − ∑ l∈z m lq̃l−2 = w̃ + 0 = w̃ . remark 14. consider m ∈ zm×m with no eigenvalue on the unit circle. then, at least one eigenvalue λ of m satisfies |λ| > 1. the eigenvalue λ is an algebraic integer, as it is a root of the characteristic polynomial f ∈ z[x] of the matrix m , and, obviously, f is monic. if, moreover, f is irreducible over q, then z[λ] = {a0+a1λ+· · ·+am−1λm−1 : a0, . . . , am−1 ∈ z} and ξ : z[λ] 7→ zm defined by ξ(a0 + a1λ + · · · + am−1) = (a0, a1, . . . , am−1)⊤ is an isomorphism. consequently, any algorithm for parallel addition in the number system (λ, a) with a ⊂ z[λ] can be transformed by the isomorphism ξ to the matrix numeration system (m, ξ(a)). in particular, if a ⊂ z, then the digit set ξ(a) ⊂ {ce1 : c ∈ z}, where e1 = (1, 0, . . . , 0)⊤. any known result on the minimal size of the digit set allowing parallel addition in the number system with base λ can be applied to the matrix numeration system with base m , thanks to irreducibility of f over q. some results on the minimal size of digit sets for parallel addition in number systems with base β ∈ c can be found e.g. in [21]. theorem 9 states that each non-singular matrix m ∈ zm×m with no eigenvalue on the unit circle can be equipped with a suitable finite digit set d ⊂ zm such that any integer vector x ∈ zm is representable in the system (m, d). as shown in [15], the assumption |λ| ≠ 1 for each eigenvalue λ of m is not necessary for the representability of zm in (m, d). nevertheless, this assumption is necessary for parallel addition in (m, d), as shown below. proposition 15. if addition in a matrix numeration system (m, d) with base m ∈ zm×m and a finite digit set d ⊂ zm is doable in parallel, then no eigenvalue of m lies on the unit circle. proof. by lemma 5, we consider, without a loss of generality, that the digit set d generates rm – i.e., that rm is the linear hull of d. assume, for a contradiction, that m has an eigenvalue λ ∈ c on the unit circle, |λ| = 1. let u ∈ cm be an eigenvector of the matrix m ⊤ to the eigenvalue λ – i.e., m ⊤u = λu. as d generates rm, the vector u cannot be orthogonal to all digits, hence there exists a digit d ∈ d such that αu⊤d = 1 for some α ∈ c. let v be the eigenvector v = αu, so that v⊤d = 1 and v⊤m = λv⊤. let parallel addition be performed by a p-local function. denote s = max { ∣∣p−1∑ j=0 λj v⊤dj ∣∣ : dj ∈ d} . as |λ| = 1, there exist infinitely many j ∈ n such that ℜ(λj ) > 12 . hence one can find n ∈ n and coefficients ε0, ε1, . . . , εn −1 ∈ {0, 1} such that ℜ ( ∑n −1 j=0 εj λ j ) > 193 i. farkas, e. pelantová, m. svobodová acta polytechnica 2s. let x = ∑n −1 j=0 m j xj with xj ∈ d such that |ℜ(v⊤x)| = max {∣∣ℜ(v⊤( n −1∑ j=0 m j dj ))∣∣ : dj ∈ d} = max {∣∣ℜ(n −1∑ j=0 λj v⊤dj )∣∣ : dj ∈ d} . since 1 and 0 belong to {v⊤d : d ∈ d}, we have |ℜ(v⊤x)| ≥ ℜ ( n −1∑ j=0 εj λ j ) > 2s. (10) the p-local function used to add x + x produces digits zj ∈ d such that x + x = n +p−2∑ j=n m j zj + n −1∑ j=0 m j zj + −1∑ j=−p+1 m j zj . after multiplication of x + x by the vector v⊤ from the left, we have v⊤(x + x) = λn p−2∑ j=0 λj v⊤zn +j +v⊤ n −1∑ j=0 m j zj +λ−p p−1∑ j=1 λj v⊤zj−p . 
(11) the definitions of s and x guarantee that ∣∣p−2∑ j=0 λj v⊤zn +j ∣∣ ≤ s, ∣∣p−1∑ j=1 λj v⊤zj−p ∣∣ ≤ s, and ∣∣ℜ(v⊤(n −1∑ j=0 m j zj ))∣∣ ≤ |ℜ(v⊤x)| . using these inequalities and the triangle inequality, together with (11) and the fact that |ℜ(y)| ≤ |y| for every y ∈ c, and with |λ| = 1, we get 2|ℜ(v⊤x)| ≤ |2v⊤x| ≤ 2s + |ℜ(v⊤x)| which contradicts (10). 4. eventually periodic representations with expansive matrix base t. vávra in [22] shows, for any algebraic complex base β with |β| > 1, that there exists a suitable (finite) digit set a ⊂ z such that any x ∈ q(β) has an eventually periodic expansion in this base, i.e., x = ∑n j=−∞ aj β j , where the sequence (aj )j≤n of digits from a is eventually periodic. looking for an analogy to this result in the matrix systems, we first have to give a meaning of the previous sum in the case when the number base β is replaced with a matrix base m and (integer) number digits aj by (integer) vector digits dj . if a matrix m ∈ zm×m is expansive, then m −1 is contractive and there exists a vector norm ∥·∥c in r m such that ∥∥m −1∥∥ < 1, where ∥·∥ is the matrix norm induced by the vector norm ∥·∥c, see [23]. let us recall that for these two norms, the following inequalities hold: ∥ax∥c ≤ ∥a∥ . ∥x∥c and ∥ab∥ ≤ ∥a∥ . ∥b∥ for every x ∈ rm and a, b ∈ rm×m. if (dj )n−∞ is a sequence of (vector) digits from a finite set d ⊂ rm, then the vector ∑n j=−∞ m j dj is well defined, as the sequence ( n∑ j=−n m j dj )+∞ n=0 is a cauchy sequence. indeed, let us denote r :=∥∥m −1∥∥ < 1 and d := max{∥d∥c : d ∈ d}. the triangle inequality implies for every n, q ∈ n that∥∥∥∥∥∥ n∑ j=−n m j dj − n∑ j=−n−q m j dj ∥∥∥∥∥∥ c = ∥∥∥∥∥∥ −n−1∑ j=−n−q m j dj ∥∥∥∥∥∥ c = ∥∥∥∥∥∥ n+q∑ j=n+1 ( m −1 )j d−j ∥∥∥∥∥∥ c ≤ n+q∑ j=n+1 rj d ≤ drn+1 1 − r , so the value dr n+1 1−r can be made arbitrarily small for all n > n0, with sufficiently large n0 ∈ n. consequently, if m is expansive, then there exists lim n→+∞ ∑n j=−n m j dj , which may be denoted as∑n j=−∞ m j dj . in the remaining part of this chapter, we focus on matrix numeration systems with m being an expansive matrix. matrix numeration systems with base m being an expansive matrix have been intensively studied since the work of vince. the main focus of the research in this area is on the systems where each lattice point has a unique representation. recall that a lattice in rm is the set of all integer combinations of m linearly independent vectors. a lattice numeration system can be formalised as follows (see [24]): definition 16. let λ be a lattice, m : λ → λ be a linear operator (also called the base or radix) and let d be a finite subset of λ containing 0 (called the digit set). the triplet (λ, m, d) is called a generalised number system (gns) if every element x ∈ λ has a unique finite representation of the form x = n∑ j=0 m j dj , where n ∈ n, dj ∈ d for every j = 0, 1, . . . , n and dn is a non-zero digit (if x ̸= 0). characterisation of triplets (λ, m, d) forming gns seems to be a difficult task. to quote a necessary 194 vol. 63 no. 3/2023 positional representation of vectors condition, we recall that two elements (lattice points) x, y ∈ λ are congruent modulo m if they belong to the same residue class, i. e., if (x − y) ∈ m λ. we denote this fact by x ≡λ y (mod m ). theorem 17. [13] if (λ, m, d) is a gns, then the following holds: (1.) the operator m is expansive. (2.) the digit set d is a complete residue system modulo m. (3.) det(m − i) ̸= ±1. 
note that the number of residue classes modulo m equals | det(m )|, hence the gns has exactly #d = | det(m )| digits. a sufficient condition for gns was provided by l. germán and a. kovács in [24]. theorem 18. [24] let m : λ 7→ λ be a non-singular linear operator. if the spectral radius of the inverse operator m −1 is less than 12 , then there exists a digit set d ⊂ λ such that (λ, m, d) is a gns. any lattice λ ⊂ rm is just an image of zm under a non-singular linear map. since a linear map does not change the gns property, we can consider, without a loss of generality, that λ = zm. a linear operator mapping zm to zm corresponds to multiplication of integer vectors by a matrix m ∈ zm×m. further results are known regarding the existence of gns with special types of the digits sets: • if d = {k(1, 0, . . . , 0)⊤) : k ∈ z, 0 ≤ k < | det(m )|}, we speak about canonical systems. • if d = {k(1, 0, . . . , 0)⊤) : k ∈ z, − 12 | det(m )| < k ≤ 12 | det(m )|}, the digit set is called symmetric. • if d contains the lattice point of the smallest norm from each residue class (modulo m ), then the digits set d is said to be dense. the choice of the smallest lattice point is not necessarily unique. • the adjoint digit set consists of those lattice points which belong to | det(m )| · [− 12 , 1 2 ) m. the results on gns with the special digits sets can be consulted in [25]. if we abandon the requirement of uniqueness for the representation of vectors from zm, then the question on existence of a suitable digit set for a given expansive base m is much more simpler. in fact, using a sufficiently redundant digit set d allows to represent all vectors in rm. first, we focus on vectors from rm which have eventually periodic representation in the numeration system (m, d), i.e., we focus on the set perd (m ) = { n∑ j=−∞ m j dj : dj ∈ d, (dj )n−∞ eventually periodic } . to work with eventually periodic representations, we exploit a well known fact about contractive matrices, namely that (i − a)−1 = ∑+∞ j=0 a j for any contractive matrix a ∈ cm×m. lemma 19. let m ∈ zm×m be an expansive matrix and d ⊂ zm. if x ∈ perd (m ), then x ∈ qm. proof. first, assume that an (m, d)-representation of x has the form 0 • (d−1d−2 · · · d−p)ω . it means that x = m −1d−1 + m −2d−2 + · · · + m −pd−p + m −px, which implies that x = (i−m −p)−1(m −1d−1+m −2d−2+· · ·+m −pd−p) . since all elements of m are integer, it follows that all the matrix powers m −j with j ∈ n and (im −m −p)−1 belong to qm×m. therefore, x ∈ qm. now, let us assume that a representation of x is eventually periodic and the preperiodic part ends at an index −k ∈ z, −k ≤ 0. then m kx equals z + y, where z = ∑n j=0 m j dj and y has the purely periodic form y = 0 • (d−1d−2 · · · d−p)ω . obviously, z ∈ zm and, by the previous argumentation, y ∈ qm. hence x = m −k(z + y) belongs to qm as well. to study the implication opposite to the previous lemma 19, we use the concept of parallel addition. lemma 20. if the digit set d allows parallel addition on find(m ), then it allows parallel addition also on perd (m ), and the result of addition is again eventually periodic. it means that, for such a digit set d, the set perd (m ) is closed under addition. proof. let us explain this statement, assuming the parallel addition on find(m ) is a p-local function, with p = r + 1 + t, memory r and anticipation t. let ℓx, ℓy be the lengths of the periods of (m, d)representations of the summands x, y, and denote ℓxy := ℓxℓy . 
clearly, both x and y have (m, d)representations with the same period ℓxy , and the same holds for the (m, d + d)-representation of their (eventually periodic) sum w = x + y calculated by a pure summation of digits on each position separately. it is clear that, by applying the p-local function φ : (d + d)p 7→ d onto the (m, d + d)-representation of the interim sum w, we obtain an eventually periodic string over the alphabet d. in other words, x + y has an eventualy periodic (m, d)-representation. theorem 21. let m ∈ zm×m be an expansive matrix. then there exists a finite digit set d ⊂ zm such that perd (m ) = qm. proof. by theorem 9 and lemma 5, there exists a finite digit set d such that addition in (m, d) can be performed in parallel. due to lemma 19, it remains to prove only the inclusion qm ⊂ perd(m ). we denote by es the vector from zm whose sth coordinate equals 1 and all other coordinates are zero. in the 195 i. farkas, e. pelantová, m. svobodová acta polytechnica first step, we show that for each q ∈ n and each s = 1, 2, . . . , m, the vector 1 q es has an eventually periodic (m, d)-representation. for this purpose, we define a congruence relation on the matrices from zm×m. we say that a ∈ zm×m is congruent modulo q to b ∈ zm×m, if (a − b) ∈ q zm×m. as the number of congruence classes is finite (for a fixed q ∈ n), we find in the list i, m, m 2, m 3, . . . two matrices from the same congruence class. in other words, there exist k, ℓ ∈ n, ℓ > 0 such that m k+ℓ − m k = qc for some c ∈ zm×m, or, equivalently 1 q i = m −ℓ−k (i − m −ℓ)−1 c = m −ℓ−k ( +∞∑ j=0 m −ℓj ) c . (12) let c1 denote the first column of the matrix c. as zm ⊂ find(m ), we can write c1 = ∑n t=0 m tft with ft ∈ d. it is due to lemma 6 and remark 10 that the bottom index of the sum equals zero, as m is expansive, so all of its eigenvalues are > 1 in modulus. the first column of the matrix equality (12) then equals 1 q e1 = m −ℓ−k +∞∑ j=0 m −ℓj c1 = n∑ t=0 +∞∑ j=0 m −ℓj−ℓ−k+tft . thus, we have expressed 1 q e1 as a sum of n + 1 vectors with eventually periodic (m, d)-representation. since perd (m ) is closed under addition, due to lemma 20, the vector 1 q e1 belongs to perd(m ) as well. analogously, the same holds for 1 q e2, . . . , 1 q em and for − 1 q e1, . . . , − 1q em, therefore, they are also elements of perd (m ). in the second step, we consider an arbitrary vector x ∈ qm. we find q ∈ n and integer nubers p1, . . . , pm such that x = p1 ( 1 q e1 ) + · · · + pm ( 1 q em ) . it means that x is a sum of (|p1| + · · · + |pm|) vectors from the set perd(m ), which is closed under addition. hence x ∈ perd (m ). corollary 22. let m ∈ zm×m be an expansive matrix. then there exists a finite digit set d ⊂ z such that every x ∈ rm has an (m, d)-representation. proof. let ∥·∥c be the vector norm of r m, for which the induced matrix norm of the contractive matrix m −1 is ∥∥m −1∥∥ = r < 1. since m is expansive, any vector y ∈ zm has an (m, d)-representation in the form y = ∑n j=0 m j dj for some n ∈ n and dj ∈ d. in general, the representation is not unique. we denote by height(y) the minimal n ∈ n among all the (m, d)-representations of y. let x ∈ rm and ( x(n) ) n∈n be a sequence of vectors from qm such that lim n→∞ xn = x. by the previous theorem, we have x(n) = ∑nn j=−∞ m j d (n) j . denote the integer and fractional parts of x(n) by y(n) := ∑nn j=0 m j d (n) j and z (n) := ∑−1 j=−∞ m j d (n) j , respectively. obviously, y(n) ∈ zm. 
corollary 22. let m ∈ z^{m×m} be an expansive matrix. then there exists a finite digit set d ⊂ z^m such that every x ∈ r^m has an (m, d)-representation.

proof. let ∥·∥_c be a vector norm on r^m for which the induced matrix norm of the contractive matrix m^{−1} is ∥m^{−1}∥ = r < 1. since m is expansive, any vector y ∈ z^m has an (m, d)-representation of the form y = ∑_{j=0}^{n} m^j d_j for some n ∈ n and d_j ∈ d. in general, the representation is not unique. we denote by height(y) the minimal n ∈ n among all the (m, d)-representations of y.

let x ∈ r^m and (x^{(n)})_{n∈n} be a sequence of vectors from q^m such that lim_{n→∞} x^{(n)} = x. by the previous theorem, we have x^{(n)} = ∑_{j=−∞}^{n_n} m^j d_j^{(n)}. denote the integer and fractional parts of x^{(n)} by y^{(n)} := ∑_{j=0}^{n_n} m^j d_j^{(n)} and z^{(n)} := ∑_{j=−∞}^{−1} m^j d_j^{(n)}, respectively. obviously, y^{(n)} ∈ z^m. assume that the (m, d)-representations of the integer parts y^{(n)} satisfy n_n = height(y^{(n)}) for every n ∈ n. the size of the fractional parts z^{(n)} is bounded. indeed,

∥z^{(n)}∥_c = ∥ ∑_{j=−∞}^{−1} m^j d_j^{(n)} ∥_c ≤ ∑_{j=−∞}^{−1} r^{−j} D = rD/(1 − r),

where D denotes max{∥d∥_c : d ∈ d}. since any convergent sequence is bounded, the integer parts y^{(n)} ∈ z^m are bounded as well, as

∥y^{(n)}∥_c ≤ ∥x^{(n)} − z^{(n)}∥_c ≤ ∥z^{(n)}∥_c + ∥x^{(n)}∥_c .

it means that y^{(n)} can take only finitely many values in z^m, and thus their heights n_n are bounded as well, say by N. hence, we can rewrite the representation of x^{(n)} for each n ∈ n into the form x^{(n)} = ∑_{j=−∞}^{N} m^j d_j^{(n)} with the uniform upper index N of the sums (if necessary, we add leading zero coefficients to the sums).

now, we are ready to find an (m, d)-representation of x = lim_{n→∞} x^{(n)}. we construct a sequence d_N d_{N−1} · · · d_0 • d_{−1}d_{−2} · · · of digits from d. a digit which appears infinitely many times among x_N^{(n)} will be chosen as d_N. then we choose d_{N−1} as a digit such that the pair d_N d_{N−1} appears infinitely many times among x_N^{(n)} x_{N−1}^{(n)}. similarly, d_{N−2} is chosen as a digit such that the triplet d_N d_{N−1} d_{N−2} appears infinitely many times among x_N^{(n)} x_{N−1}^{(n)} x_{N−2}^{(n)}, and so on. by this construction, for every step l ∈ n, l ≥ 1, there exists k_l ∈ z such that the strings d_N d_{N−1} · · · d_0 • d_{−1}d_{−2} · · · and x_N^{(k_l)} x_{N−1}^{(k_l)} · · · x_0^{(k_l)} • x_{−1}^{(k_l)} x_{−2}^{(k_l)} · · · have a common prefix of length at least l. therefore,

∥ ∑_{j=−∞}^{N} m^j d_j − ∑_{j=−∞}^{N} m^j x_j^{(k_l)} ∥_c = ∥ ∑_{j=−∞}^{N−l} m^j ( d_j − x_j^{(k_l)} ) ∥_c ≤ 2D ∑_{j=l−N}^{+∞} r^j = (2D/(1 − r)) r^{l−N} ,

with D as above. since (2D/(1 − r)) r^{l−N} tends to 0 with l → +∞, we obtain

∑_{j=−∞}^{N} m^j d_j = lim_{l→+∞} x^{(k_l)} = lim_{n→+∞} x^{(n)} = x ,

and thus ∑_{j=−∞}^{N} m^j d_j is an (m, d)-representation of x.

5. open questions

we have focused only on two questions connected with the (m, d)-representation of vectors: computability of addition in parallel and eventually periodic representations. many open problems still remain unresolved. let us list some of them:

(1.) what is the minimal size of a digit set d ⊂ z^m with the property z^m ⊂ fin_d(m)? this question was already tackled in [26] for very special matrices, namely for jordan blocks j_m(1) corresponding to the eigenvalue 1.

(2.) is it possible for a given non-singular matrix m ∈ z^{m×m} to find a (finite) digit set d ⊂ z^m such that every element in fin_d(m) has a unique representation in this numeration system? if m is expansive, then the size of the suitable digit set (if it exists) is |det(m)|, see [13]. what is an analogue of this result for a non-expansive matrix?

(3.) does there exist any (finite) digit set such that multiplication of vectors from r^m by a scalar x ∈ r can be performed by an on-line algorithm? note that k. trivedi and m. ercegovac in [27] designed on-line algorithms for the multiplication and division of two numbers represented in a numeration system (β, a) with β ∈ n. their algorithms were later generalised to numeration systems (β, a) with base β being a (real or complex) pisot number [28].

acknowledgements

edita pelantová acknowledges financial support by the ministry of education, youth and sports of the czech republic, project no. cz.02.1.01/0.0/0.0/16_019/0000778.

references

[1] a. cauchy. sur les moyens d'éviter les erreurs dans les calculs numériques. no. 11 in série i. c.r. acad. sc. paris, france, 1840.
[2] v. grünwald. intorno all'aritmetica dei sistemi numerici a base negativa con particolare riguardo al sistema numerico a base negativo-decimale per lo studio delle sue analogie coll'aritmetica ordinaria (decimale). giornale di matematiche di battaglini 23, italy, 1885.
[3] a. rényi. representations for real numbers and their ergodic properties. acta mathematica academiae scientiarum hungaricae 8:477–493, 1957. https://doi.org/10.1007/bf02020331
[4] k. schmidt. on periodic expansions of pisot numbers and salem numbers. bulletin of the london mathematical society 12(4):269–278, 1980. https://doi.org/10.1112/blms/12.4.269
[5] t. vávra, f. veneziano. pisot unit generators in number fields. journal of symbolic computation 89:94–108, 2018. https://doi.org/10.1016/j.jsc.2017.11.005
[6] d. e. knuth. an imaginary number system. communications of the acm 3(4):245–247, 1960. https://doi.org/10.1145/367177.367233
[7] w. penney. a "binary" system for complex numbers. journal of the acm 12(2):247–248, 1965. https://doi.org/10.1145/321264.321274
[8] b. kovács, a. pethö. number systems in integral domains, especially in orders of algebraic number fields. acta scientiarum mathematicarum 55:287–299, 1991. http://acta.bibl.u-szeged.hu/15313/1/math_055_fasc_003_004_287-299.pdf
[9] p. kirschenhofer, j. thuswaldner. shift radix systems: a survey. numeration and substitution b46:1–59, 2014. http://hdl.handle.net/2433/226207
[10] a. avizienis. signed-digit number representations for fast parallel arithmetic. ire transactions on electronic computers ec-10(3):389–400, 1961. https://doi.org/10.1109/tec.1961.5219227
[11] a. vince. radix representation and rep-tiling. in proceedings of the 24th southeastern international conference on combinatorics, graph theory, and computing, vol. 98, pp. 199–212. 1993.
[12] a. vince. replicating tessellations. siam journal on discrete mathematics 6(3):501–521, 1993. https://doi.org/10.1137/0406040
[13] a. kovács. number expansions in lattices. mathematical and computer modelling 38(7):909–915, 2003. https://doi.org/10.1016/s0895-7177(03)90076-8
[14] j. jankauskas, j. thuswaldner. characterization of rational matrices that admit finite digit representations. linear algebra and its applications 557:350–358, 2018. https://doi.org/10.1016/j.laa.2018.08.006
[15] e. pelantová, t. vávra. on positional representation of integer vectors. linear algebra and its applications 633:316–331, 2022. https://doi.org/10.1016/j.laa.2021.10.018
[16] d. lind, b. marcus. an introduction to symbolic dynamics and coding. cambridge university press, cambridge, 1995. https://doi.org/10.1017/cbo9780511626302
[17] c. frougny, e. pelantová, m. svobodová. parallel addition in non-standard numeration systems. theoretical computer science 412(41):5714–5727, 2011. https://doi.org/10.1016/j.tcs.2011.06.028
[18] s. akiyama, p. drungilas, j. jankauskas. height reducing problem on algebraic integers. functiones et approximatio commentarii mathematici 47(1):105–119, 2012. https://doi.org/10.7169/facm/2012.47.1.9
[19] j. legerský, m. svobodová. construction of algorithms for parallel addition in expanding bases via extending window method. theoretical computer science 795:547–569, 2019. https://doi.org/10.1016/j.tcs.2019.08.010
[20] j. legerský. minimal non-integer alphabets allowing parallel addition. acta polytechnica 58(5):285–291, 2018. https://doi.org/10.14311/ap.2018.58.0285
[21] c. frougny, e. pelantová, m. svobodová. minimal digit sets for parallel addition in non-standard numeration systems.
journal of integer sequences 16(2):1–36, 2013. https://cs.uwaterloo.ca/journals/jis/vol16/frougny/frougny3.pdf
[22] t. vávra. periodic representations in salem bases. israel journal of mathematics 242:83–95, 2021. https://doi.org/10.1007/s11856-021-2123-3
[23] e. isaacson, h. keller. analysis of numerical methods. john wiley & sons, new york, 1966.
[24] l. germán, a. kovács. on number system constructions. acta mathematica hungarica 115:155–167, 2007. https://doi.org/10.1007/s10474-007-5224-5
[25] p. hudoba, a. kovács. toolset for supporting the research of lattice based number expansions. acta cybernetica 25(2):271–284, 2021. https://doi.org/10.14232/actacyb.289524
[26] j. caldwell, k. hare, t. vávra. non-expansive matrix number systems with bases similar to jn(1). [2022-02-01]. arxiv:2110.11937
[27] k. trivedi, m. ercegovac. on-line algorithms for division and multiplication. ieee transactions on computers c-26(7):681–687, 1977. https://doi.org/10.1109/tc.1977.1674901
[28] c. frougny, m. pavelka, e. pelantová, m. svobodová. on-line algorithms for multiplication and division in real and complex numeration systems. discrete mathematics & theoretical computer science 21(3), 2019. https://doi.org/10.23638/dmtcs-21-3-14

acta polytechnica 63(1):19–22, 2023. https://doi.org/10.14311/ap.2023.63.0019. © 2023 the author(s). licensed under a cc-by 4.0 licence. published by the czech technical university in prague.

linearisation of a second-order nonlinear ordinary differential equation

adhir maharaj∗, peter g. l. leach, megan govender, david p. day
durban university of technology, steve biko campus, department of mathematics, durban, 4000, republic of south africa
∗ corresponding author: adhirm@dut.ac.za

abstract. we analyse nonlinear second-order differential equations in terms of algebraic properties by reducing a nonlinear partial differential equation to a nonlinear second-order ordinary differential equation via the point symmetry f(v)∂_v.
the eight lie point symmetries obtained for the second-order ordinary differential equation are the maximal number and form a representation of the sl(3, r) algebra. we extend this analysis to a more general nonlinear second-order differential equation and we obtain similar interesting algebraic properties.

keywords: lie symmetries, integrability, linearisation.

1 introduction

nonlinear differential equations are ubiquitous in mathematically orientated scientific fields, such as physics, engineering, epidemiology etc. therefore, the analysis and closed-form solutions of differential equations are important to understand natural phenomena. in the search for solutions of differential equations, one discovers the beauty of the algebraic properties that the equations possess. even though closed-form solutions are the primary objective, one cannot ignore the interesting properties of the equations [1–6].

in recent years, one such area in relativistic astrophysics involves the embedding of a four-dimensional differentiable manifold into a higher dimensional euclidean space, which gives rise to the so-called karmarkar condition for class i spacetimes [7]. the karmarkar condition leads to a quadrature, which reduces the problem of determining the gravitational behaviour of a gravitating system to a single generating function. this is then used to close the system of field equations in order to get a full description of the thermodynamical and gravitational evolution of the model. in a recent approach, nikolaev and maharaj [8] investigated the embedding properties of the vaidya metric [9]. the vaidya solution is the unique solution of the einstein field equations describing the exterior spacetime, filled with null radiation, of a spherical mass distribution undergoing dissipative gravitational collapse. in their work, nikolaev and maharaj showed that the vaidya solution is not class i embeddable, but the generalised vaidya metric describing an anisotropic and inhomogeneous atmosphere comprising a mixture of strings and null radiation gives rise to interesting embedding properties. here, we consider the nonlinear partial differential equation arising from the requirement that the generalised vaidya metric be of class i. the governing equation is

2r²mm″ − r²m′² − 2rmm′ + 3m² = 0, (1)

where the prime denotes differentiation of the dependent variable, m(v, r), with respect to the independent variable, r. equation (1) does not depend on v explicitly and possesses the point symmetry f(v)∂_v, where f(v) is an arbitrary function of v only. using this symmetry, we obtain the invariants r = x and m = y(x), which reduce (1) to the nonlinear nonautonomous second-order ordinary differential equation

2x²yy″ − x²y′² − 2xyy′ + 3y² = 0, (2)

where y is a function of x only. we use the lie symmetry approach to obtain the solution of (2). using the solution of (2), we obtain the solution of (1).

2 preliminaries

let (x, y) denote the variables of a two-dimensional space. suppose that x is the independent variable and y is the dependent variable. an infinitesimal transformation in this space has the form

x̄ = x + ϵξ(x, y) (3)
ȳ = y + ϵη(x, y) (4)

which can be regarded as generated by the differential operator

γ = ξ(x, y) ∂/∂x + η(x, y) ∂/∂y . (5)

since we are concerned with point symmetries in this paper, ξ and η depend upon x and y only. under the infinitesimal transformation (3) and (4), the nth derivative transforms according to

ζ_n = η^{(n)} − ∑_{j=1}^{n} (n choose j) y^{(n+1−j)} ξ^{(j)} (6)

and

γ_n = ζ_n ∂/∂y^{(n)} , (7)

where η^{(n)}, ξ^{(j)} and y^{(n)} denote the nth, jth and nth total derivatives with respect to x. in the case of a function f(x, y, y′, . . . , y^{(n)}), the infinitesimal transformation is generated by γ + γ₁ + γ₂ + . . . + γ_n, which we write as γ^{[n]}, where [10]

γ^{[n]} = γ + ∑_{i=1}^{n} [ η^{(i)} − ∑_{j=1}^{i} (i choose j) y^{(i+1−j)} ξ^{(j)} ] ∂/∂y^{(i)} (8)

is called the nth extension of γ. in the case of an equation

e(x, y, y′, . . . , y^{(n)}) = 0 (9)

the equation is a constraint, and the condition [11, 12] for γ to be a symmetry of the equation is

γ^{[n]} e |_{e=0} = 0, (10)

i.e. the action of the nth extension of γ on the function e is zero when the equation (9) is taken into account. we note that e = 0 may be a scalar equation or a system of equations¹.

¹an interested reader is referred to [13–16].
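condition (10) can be checked mechanically. as a small illustration (a sketch of ours, not material from the paper), the following sympy code builds the second extension of the scaling generator 2x ∂/∂x + 3y ∂/∂y, which appears as γ₅ in the next section, and verifies that it annihilates equation (2) on shell:

```python
import sympy as sp

x, y, y1, y2 = sp.symbols('x y y1 y2')     # y1, y2 stand for y', y''

# equation (2) written as e = 0
e = 2*x**2*y*y2 - x**2*y1**2 - 2*x*y*y1 + 3*y**2

xi, eta = 2*x, 3*y                         # candidate generator gamma_5

def dx(f):
    """total derivative with respect to x for functions of (x, y, y')."""
    return sp.diff(f, x) + y1*sp.diff(f, y) + y2*sp.diff(f, y1)

eta1 = dx(eta) - y1*dx(xi)                 # first extension coefficient
eta2 = dx(eta1) - y2*dx(xi)                # second extension coefficient

action = (xi*sp.diff(e, x) + eta*sp.diff(e, y)
          + eta1*sp.diff(e, y1) + eta2*sp.diff(e, y2))

# impose e = 0 by eliminating y'' and check that the residue vanishes
y2_on_shell = sp.solve(sp.Eq(e, 0), y2)[0]
print(sp.simplify(action.subs(y2, y2_on_shell)))   # prints 0
```

off shell the action equals 6e, so it indeed vanishes identically whenever (2) holds.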
3 symmetry analysis

the lie point symmetries² of (2) are

γ₁ = x ∂/∂x
γ₂ = y ∂/∂y
γ₃ = x^{3/2} √y ∂/∂y
γ₄ = √(xy) ∂/∂y
γ₅ = 2x ∂/∂x + 3y ∂/∂y
γ₆ = x² ∂/∂x + 3xy ∂/∂y
γ₇ = √(y/x) ∂/∂x + (y/x)^{3/2} ∂/∂y
γ₈ = √(xy) ∂/∂x + 3√(y³/x) ∂/∂y

which is the maximal number for a second-order ordinary differential equation, and they must form a representation of the sl(3, r) algebra in the mubarakzyanov classification scheme [21–24]. equation (2) is linearisable to

d²ȳ/dx̄² = 0 (11)

by means of a point transformation. the solution of (11) is

ȳ = ax̄ + b, (12)

while the solution of (2) is not exactly obvious. however, one can transform (2) to (11). we seek the transformation from (2) to (11) which casts γ₄ = √(xy) ∂/∂y into canonical form. γ₄ assumes canonical form provided

ξ(x, y) ∂x̄/∂x + η(x, y) ∂x̄/∂y = 0 (13)
ξ(x, y) ∂ȳ/∂x + η(x, y) ∂ȳ/∂y = 1, (14)

where ξ = 0 and η = √(xy), because (2) possesses a symmetry of the general form γ = ξ∂x + η∂y. when we apply the method of characteristics for first-order partial differential equations to (13) and (14), we obtain

dx/0 = dy/√(xy) = dx̄/0 (15)
dx/0 = dy/√(xy) = dȳ/1 (16)

for which the solutions are

x̄ = x, ȳ² = 4y/x. (17)

under the transformation (17), equation (2) takes the form of (11). hence we may apply (17) to (12) to obtain the solution to (2), which is

y(x) = (1/4) x(ax + b)², (18)

where a and b are two constants of integration. by using the invariants r = x and m = y(x), the solution of (1) follows from (18) and is

m(v, r) = (1/4) r(a(v)r + b(v))², (19)

where a(v) and b(v) are functions of integration.

²the mathematica add-on package sym [17–20] was used to obtain the symmetries.
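a quick independent check of (18), again our own sketch rather than material from the paper, substitutes the closed-form solution back into (2) with symbolic constants:

```python
import sympy as sp

x, a, b = sp.symbols('x a b')
y = x*(a*x + b)**2 / 4                    # candidate solution (18)

lhs = (2*x**2*y*sp.diff(y, x, 2) - x**2*sp.diff(y, x)**2
       - 2*x*y*sp.diff(y, x) + 3*y**2)    # left-hand side of (2)
print(sp.expand(lhs))                     # prints 0 for all a and b
```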
4 the general case

we consider a general case by setting y(x) = uⁿ, where u is a function of x, in equation (2); we obtain the more general second-order equation

2nx²uu″ + n(n − 2)x²u′² − 2nxuu′ + 3u² = 0. (20)

the lie point symmetries of (20) are

λ₁ = x ∂/∂x
λ₂ = ∂/∂x + u/(nx) ∂/∂u
λ₃ = (1/n) x^{3/2} u^{1−n/2} ∂/∂u
λ₄ = (1/n) √x u^{1−n/2} ∂/∂u
λ₅ = 2x ∂/∂x + (3/n) u ∂/∂u
λ₆ = x² ∂/∂x + (3/n) xu ∂/∂u
λ₇ = √(uⁿ/x) ∂/∂x + u^{1+n/2}/(nx^{3/2}) ∂/∂u
λ₈ = √(xuⁿ) ∂/∂x + (3/n) √(u^{n+2}/x) ∂/∂u .

as (20) is a second-order ordinary differential equation and possesses eight lie point symmetries, it is related to the generic second-order equation [25]

d²ȳ/dx̄² = 0. (21)

when we apply the method of characteristics for first-order partial differential equations to (13) and (14), using the symmetry λ₄, we obtain

dx/0 = du/((1/n)√x u^{1−n/2}) = dx̄/0 (22)
dx/0 = du/((1/n)√x u^{1−n/2}) = dȳ/1 (23)

for which the solutions are

x̄ = x, ȳ² = 4uⁿ/x. (24)

from the solution of (21), by means of the transformation (24), we obtain the solution of (20) as

u(x) = (x/4)^{1/n} (c₁x + c₂)^{2/n}, (25)

where c₁ and c₂ are constants of integration.

5 conclusion

most studies of the algebraic properties of ordinary differential equations are focused on first-, second- and third-order equations, which is most natural, since these are the equations which arise in the modelling of natural phenomena. in this paper, we performed the symmetry analysis of equation (2) and showed that the equation possesses the sl(3, r) algebra. in turn, we reported the solution of (2) and thus obtained the solution of (1). a natural generalisation of (2) followed. by setting m(v, r) = zⁿ, where z is a function of v and r, in equation (1), we obtain the more general partial differential equation

2nr²zz″ + n(n − 2)r²z′² − 2nrzz′ + 3z² = 0, (26)

where the prime denotes differentiation of the dependent variable, z(v, r), with respect to the independent variable, r. we note that, as in equation (1), (26) is not explicitly dependent on v, and therefore possesses the point symmetry g(v)∂_v, where g(v) is an arbitrary function of v only. we use this symmetry to obtain the invariants r = x and z = u(x), which reduce (26) to the second-order nonlinear equation (20) with the solution given by (25). using (25) and the invariants mentioned above, we obtain the solution of equation (26) to be

z(v, r) = (r/4)^{1/n} (c₁(v)r + c₂(v))^{2/n}, (27)

where c₁(v) and c₂(v) are functions of integration.

this paper demonstrates that equation (1), and hence (26), which at first glance look complicated, have some very interesting properties from the viewpoint of symmetry analysis. using the symmetry approach, we were able to show that these equations are integrable and have closed-form solutions.

acknowledgements

mg expresses grateful thanks to the national research foundation of south africa and the durban university of technology for their continuing support. am acknowledges the support of the durban university of technology. pgll appreciates the support of the national research foundation of south africa, the university of kwazulu-natal and the durban university of technology. dpd acknowledges the support provided by the durban university of technology.

references

[1] b. abraham-shrauner, p. g. l. leach, k. s. govinder, g. ratcliff. hidden and contact symmetries of ordinary differential equations. journal of physics a: mathematical and general 28(23):6707, 1995. https://doi.org/10.1088/0305-4470/28/23/020
[2] k. andriopoulos, p. g. l. leach, a. maharaj. on differential sequences. applied mathematics & information sciences 5(3):525–546, 2011.
[3] c. géronimi, m. r. feix, p. g. l. leach. third order differential equation possessing three symmetries, the two homogeneous ones plus the time translation. tech. rep., scan-9905040, 1999.
[4] a. k. halder, a. paliathanasis, p. g. l. leach. singularity analysis of a variant of the painlevé–ince equation. applied mathematics letters 98:70–73, 2019. https://doi.org/10.1016/j.aml.2019.05.042
[5] a. maharaj, p. g. l. leach. properties of the dominant behaviour of quadratic systems. journal of nonlinear mathematical physics 13(1):129–144, 2006.
https://doi.org/10.2991/jnmp.2006.13.1.11
[6] a. maharaj, k. andriopoulos, p. g. l. leach. properties of a differential sequence based upon the kummer-schwarz equation. acta polytechnica 60(5):428–434, 2020. https://doi.org/10.14311/ap.2020.60.0428
[7] k. karmarkar. gravitational metrics of spherical symmetry and class one. proceedings of the indian academy of sciences – section a 27:56, 1948. https://doi.org/10.1007/bf03173443
[8] a. v. nikolaev, s. d. maharaj. embedding with vaidya geometry. the european physical journal c 80(7):1–9, 2020. https://doi.org/10.1140/epjc/s10052-020-8231-0
[9] p. chunilal vaidya. the external field of a radiating star in general relativity. current science 12:183, 1943.
[10] f. m. mahomed, p. g. l. leach. symmetry lie algebras of nth order ordinary differential equations. journal of mathematical analysis and applications 151(1):80–107, 1990. https://doi.org/10.1016/0022-247x(90)90244-a
[11] h. stephani. differential equations: their solution using symmetries. cambridge university press, 1989.
[12] g. w. bluman, s. kumei. symmetries and differential equations, vol. 81. springer science & business media, 2013.
[13] a. maharaj, p. g. l. leach. the method of reduction of order and linearization of the two-dimensional ermakov system. mathematical methods in the applied sciences 30(16):2125–2145, 2007. https://doi.org/10.1002/mma.919
[14] a. maharaj, p. g. l. leach. application of symmetry and singularity analyses to mathematical models of biological systems. mathematics and computers in simulation 96:104–123, 2014. https://doi.org/10.1016/j.matcom.2013.06.005
[15] p. g. l. leach. symmetry and singularity properties of a system of ordinary differential equations arising in the analysis of the nonlinear telegraph equations. journal of mathematical analysis and applications 336(2):987–994, 2007. https://doi.org/10.1016/j.jmaa.2007.03.045
[16] p. g. l. leach, j. miritzis. analytic behaviour of competition among three species. journal of nonlinear mathematical physics 13(4):535–548, 2006. https://doi.org/10.2991/jnmp.2006.13.4.8
[17] k. andriopoulos, s. dimas, p. g. l. leach, d. tsoubelis. on the systematic approach to the classification of differential equations by group theoretical methods. journal of computational and applied mathematics 230(1):224–232, 2009. https://doi.org/10.1016/j.cam.2008.11.002
[18] s. dimas, d. tsoubelis. sym: a new symmetry-finding package for mathematica. in proceedings of the 10th international conference in modern group analysis, pp. 64–70. university of cyprus press, 2004.
[19] s. dimas, d. tsoubelis. a new mathematica-based program for solving overdetermined systems of pdes. in 8th international mathematica symposium, pp. 1–5. avignon, 2006.
[20] s. dimas. partial differential equations, algebraic computing and nonlinear systems. ph.d. thesis, university of patras, greece, 2008.
[21] v. v. morozov. classification of nilpotent lie algebras of sixth order. izvestiya vysshikh uchebnykh zavedenii matematika 4:161–171, 1958.
[22] g. m. mubarakzyanov. on solvable lie algebras. izvestiya vysshikh uchebnykh zavedenii matematika (1):114–123, 1963.
[23] g. m. mubarakzyanov. classification of real structures of lie algebras of fifth order. izvestiya vysshikh uchebnykh zavedenii matematika (3):99–106, 1963.
[24] g. m. mubarakzyanov.
classification of solvable lie algebras of sixth order with a non-nilpotent basis element. izvestiya vysshikh uchebnykh zavedenii matematika (4):104–116, 1963.
[25] f. m. mahomed, p. g. l. leach. the linear symmetries of a nonlinear differential equation. quaestiones mathematicae 8(3):241–274, 1985. https://doi.org/10.1080/16073606.1985.9631915

a reservation aggregation framework design for demand estimation

t. vítek, d. pachner

effective management practices in the tourism and hotel area have seldom been more important than at the present time. pricing decisions cannot be taken without serious thought. the internet has provided the opportunity for a customer to make a quick market search, and it offers decision support systems that can be used in hotel management. the heart of a yield management system is the predicting machine, which estimates the number of incoming reservations. incoming reservations arrive randomly in time. the time series calculi, as well as the estimators known from control engineering, require properly defined time rows (with a constant period). this requirement is usually not fulfilled, so the input data are not exploited properly. this paper outlines a procedure that aggregates the reservations into a time series that is useful for demand prediction. the algorithm prepares the data systematically for further processing. any method that processes time rows can be used for subsequent prediction: time series, linear models or time extrapolation.

keywords: yield management, demand prediction, time rows, data processing.

1 introduction

the internet and other technologies enable a customer to find an appropriate substitute for a service if he is not satisfied with the pricing and the quality of the given offer. hence, the service provider must find the optimum market position. from the customer's point of view, price is often the essential feature of a particular offer. on the other hand, a hotel wants to keep its unit price at the highest possible level. the definition of the expedient price can vary from business to business. a hotel company rarely has a monopolistic or oligopolistic position; a competitive market should therefore be assumed.

the yield mechanism can be interpreted, from a technical point of view, as a decision support system that helps to find the optimum pricing level for a particular product. the greatest advantage is that it suggests the pricing automatically. the hotel management can then either accept or modify the offered scenario. a yield management (ym) system architecture – historical data processing, estimation and modelling of the future demand, and price optimization – needs to be developed. the new idea presented in this paper involves an analogy between the ym approach and the model-based predictive control approach. this new concept leads to more appropriate demand forecasting and control. while traditional ym demand predictions are based on historical demand data only, the new idea suggests compound predictions based on both demand and price data simultaneously. this new approach recovers hidden dependencies in the historical data and should provide better justified demand predictions.

2 ordering of incoming reservations

the next section describes the booking system structure, and the time dependency of incoming reservations is analyzed. finally, a structure that enables analysis of time trends is introduced. in the following text, the term "reservation" is often used. a reservation is basically a data record consisting of the date for which the reservation is arranged, the number of persons, the agreed price per person and the time at which the deal was arranged. in the next paragraph only the date of stay, the number of persons and the arrangement date are to be considered. the price factor will be considered in a separate section.

3 historical reservations

a reservation can appear at any time t or advance s. for a statistical row analysis we need the time row format. therefore a discretisation system for time t and for the order of advance s must be stated. the task is not easy, as this is a two-dimensional problem.
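before formalising the discretisation, it helps to pin down the reservation record itself. the sketch below is ours; the field names and datetime conventions are illustrative assumptions, but it captures the record described in section 2 and the advance variable s:

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class Reservation:
    stay_date: datetime    # day the reservation is arranged for
    persons: int           # number of persons
    price: float           # agreed price per person
    booked_at: datetime    # time at which the deal was arranged

def advance_hours(r: Reservation, check_in: datetime) -> float:
    """the advance s: zero at check-in time, larger for earlier bookings."""
    return (check_in - r.booked_at).total_seconds() / 3600.0

r = Reservation(datetime(2006, 7, 14), 2, 80.0, datetime(2006, 7, 9, 18, 0))
print(advance_hours(r, check_in=datetime(2006, 7, 14, 14, 0)))   # 116.0
```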
first of all, the sampling period for time t must be stated. the system is supposed to plan the prices for a night in a hotel. this implicitly defines the sampling period of time – the sampling moment is the check-in time. the sampling period is one day, as there is only one check-in per day. the reservations must be handled separately for each day in the calendar. the advance ordering period s is thus the time interval before one particular check-in time, as in fig. 1. the advance is zero at check-in time, and it is positive – the earlier the reservation, the greater s will be.

as was mentioned above, the reservations for a particular day form a discontinuous function, not a smooth one. it is difficult to compare, analyze and build a model in such conditions. for this reason, the reservations must be aggregated in some well-selected intervals, which will enable us to compare and analyze trends. the user must define the bounds of these intervals. the use of a defined aggregation enables a comparison to be made between two situations. an example of such an aggregation is shown in fig. 2.

the definition of the aggregation can be mediated by the introduction of indices r_i. these are the bounds of the intervals in which we want to sum the reservations. it is very important to assign each reservation to one specific index r_i. the appropriate assignment is shown in fig. 2, where an arrowhead marks it. it represents the assignment of reservations in the interval between two successive indices. these intervals are also used in real booking system applications. the standard categories of reservations are as follows.

1) check-in on or even after day closure – directly on r_0 or later that night
2) check-in between r_0 – r_1 hours in advance
3) check-in between r_1 – r_2 hours in advance
in order to monitor all incoming reservation, one index rn�1 should be implicitly added to the end of this set. this index will contain all the reservations coming before the index rn. this induces the following set of time indices � �r r r r r r i j n i j r rg n n i j� � � �0 1 2 1, , , , , , ,� . (1) an example of such a set is rg = {0 hours, 6 hours, 12 hours, 1 day, 5 days, 5+}. (2) all data coming after day closure should be assigned to index r0. it should be noted that the maximum delay is 24 hours, because then it is already the next day. the previously-defined structure can be used for defining the projection of the reservations onto the set of indices. 4 representative price for an interval the aggregation procedure described in section 3 gathers a series of reservations from a particular interval for one value. this value quantifies the interest of customers in that particular period as a total number of buys. each of these sales can generally have a different selling price. this algorithm will introduce a single representative price that will be attractive to an average buyer in this interval. fig. 3 displays the situation between two time indices ri and ri�1. this implies that a price representing a time span should be counted, as some kind of weighed average throughout the interval. the yield from a given period is known and constant, as is the number of reservations, so the right representative is that, which preserves a constant yield from a given set of reservations. this equation can be formulated in an integral or discrete version, as c r n s s c s n s sk r r r r k k k k ( ) ( ) ( ) ( )d d � � � �� 1 1 . (3) the right side of the equation represents the overall revenue in the particular period. the left side is equal to the average price multiplied by the number of reservations in the period. hence the equation expresses the balance between sales counted from the dates and sales counted from the interval representative price. this representative price can be further expressed from equation (3) as follows c r c s n s s n s s k r r r r k k k k ( ) ( ) ( ) ( ) � � � � � d d 1 1 . (4) 44 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 46 no. 4/2006 fig. 1: reservations coming for the days as a discontinuous function of s fig. 2. aggregation of reservations in time intervals defined in the s variable fig. 3: reservation as a combination of price and number of persons – visualization of aggregation in prices and reservations equation (4) represents the weighted average of the prices, where the weight for each particular price is the number of reservations that were sold for this price. 5 mapping reservations at the border of the future and the past section 3 considers that all reservations are available when they are to be analyzed. this kind of post processing enabled aggregation intervals of any size to be defined (1). these intervals reflect the specification of the hotel management concerning the moments, at which price can be changed. however the connected aggregation of reservations in these intervals supposes that of all data in particular interval is known. this system works well with historical data, where the reserved day is in the past, but it cannot be used unchanged when dealing with reservations for future days. first of all, it is necessary to recover what data is known at an arbitrary time denoted as �. fig. 
4 displays a time period where the data is already known, as is the future period in which the information about reservations and prices remains unknown. it can be clearly seen that each of the days that border a time border has a different ratio of known and unknown history. hence the desired set of time indices (1) cannot be used directly. it is necessary to define one extra index for each of rows different that will separate known past and unknown future data. the situation is displayed in fig. 5. as can be seen, the actual time � divides the time plane of variables t and s into two half planes. one of them is filled with known data, while the other remains unknown. the separating line is well known, determined by the current time �. the condition that determines the points on this line is t s� �� . (5) the reservations for each of the days {t1, t2, t3, …} is defined on the line that goes through the particular day (ti) and it is parallel to the s axis. the separating line intersects the line of an individual day at only one point. this is the point that separates known and future reservations. therefore it can be an appropriate limit for an interval, at which the reservations can be aggregated. the previous paragraph implies that each day should have its own set of indices. the user gives the original set of indices. one index is inserted later, and its position is dependent on the current time �. the intersection rm can be computed, from current time � and the particular date for which a reservation should be made t1 r tm i� � �. (6) as shown in fig. 5 the intersection rm can be either in the past for ti � �24 �, or in the future. as shown in the figure, only the future reservation should be separated by the middle index rm. for one particular date, the set of indices can be created as a union of the set of desired indices (1). for a particular date the set can be created by ordering the union � � � �r r r r r r re n n m� �0 1 2 1, , , , ,� . (7) the values are known in those intervals where the lower limit is greater or equal to rm. this condition splits set defined by condition 7 into two sets. one of them rp contains the limits to intervals with known data, while the other set rf contains the interval limits for unknown data. these two sets can be created by ordering the following sets � �r r r r rf i e i m� � � , (8) � �r r r r rp i e i m� � � . (9) the set of indices rp enables aggregation of reservations in intervals of already known data. the set contains the limits of the intervals. aggregation of the number of reserved places must be performed for all neighboring limits. the past indices induce the following occupation and price aggregations for all elements of the ordered set rp � � �� �r r r r ri i p i i, 1 1, n r n s si r r i i ( ) ( )� � � �1 1 d , c r c s n s s n s s i r r r r i i i i ( ) ( ) ( ) ( ) � � � � � � 1 1 1 d d . © czech technical university publishing house http://ctn.cvut.cz/ap/ 45 acta polytechnica vol. 46 no. 4/2006 fig. 4: reservations known at time � – problem with interval limits fig. 5: the intersection of the actual time line and a particular date is a new interval limit future aggregations remain unknown. but they can be defined very similarly. 6 conclusion the outlined algorithm enables standard reservation data to be processed into a proper time row format. the data is commonly available in a hotel database. the idea must be transformed into an sql macro that generates the time row for each required day. 
the algorithm can be successfully exploited by various estimation methods that compare the trends for individual days. the procedure enables a comparison of time trends. this requirement is essential for successful demand prediction. 7 acknowledgments the research described in the paper was funded by the czech ministry of trade under grant no. fi-im/189. references [1] luciani, s.: implementing yield management in medium sized hotels: an investigation of obstacles and success in florence hotels. hospitality management, vol. 18 (1999), p. 129–142. [2] tallury, k. t., van rizin, g. j.: the theory and practice of the revenue management. kluver academic publishers, 2004. isbn 1-4020-7933-8. [3] donaghy, k., mcmahon, u., mcdowell, d.: yield management: an overview. hospitality management, vol. 14 (1995), p. 139–150. [4] kimes, e. s.: revenue management: a retrospective. cornell hotel and restaurant quarterly, vol. october–december (2003), p. 131–138. [5] weatherford, l. r., kimes, s. e., scott, d. a.: forecasting for hotel revenue management testing aggregation against disaggregation. cornell hotel and restaurant administration quarterly, cornell university vol. 3 (2001), p. 53–64. [6] francis, g., fidato, a., humphreys, i.: airport-airline interaction: the impact of low-cost carriers on two european airports. journal of air transport management, pergamon vol. 9 (2003), p. 267–273. ing. tomáš vítek phone: +420 737 109 868 e-mail: vitekt2@centrum.cz daniel pachner phone: +420 224357206 email: daniel.pachner@seznam.cz department of control engineering czech technical university faculty of electrical engineering technická 2 166 27 praha 6, czech republic 46 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 46 no. 4/2006 ap07_1.vp 1 introduction civil aviation has continuously been progressing since the onset of commercial flights in florida in 1916 and the appearance of scheduled international links just after the end of the first world war [1]. to sustain and foster this progress, all branches of the aeronautical sciences and technology have also experienced a breathtaking evolution. the contributions of aerodynamics, structures, propulsion, avionics and materials have been essential [2], and have been so impacting that air travel has changed our notion of time and space, and our way of life. many legendary airplanes have paved the way of civil aviation. to mention but a few, let us arrange a short list: douglas dc-3, lockheed super constellation, boeing b707, sud aviation caravelle, boeing b747, bac-sa concorde… however, a certain paradox has arisen in the midst of this development: an airplane enthusiast looking at the runways from an airport spectators terrace would see the same shapes nowadays as 30 years back. the vast majority of the flying vehicles belong to the so-called conventional configuration [2]. around 60 years ago, boeing engineers created a new jet airplane concept characterized by a slender fuselage mated to a high aspect ratio swept wing, horizontal and vertical stabilizers attached to the rear fuselage, and pod-mounted engines under the wing. the aircraft, named b47, was a bomber, designed to fly at high subsonic speeds [3]. in subsequent years, most engineers working around the world on new airliners equipped with jet engines, to fly faster and higher, followed the same concept. a variant, with pod-mounted engines attached to the rear fuselage, soon appeared and was better suited for small and medium-size transport airplanes. 
both the basic concept and the variant are still in use in current designs at the beginning of the 21st century (see fig. 1).

fig. 1: great change and steadiness: evolution in 100 years of aviation

the conservatism in maintaining the configuration over decades contrasts sharply with the enormous efforts made in aerodynamics, materials, propulsion, etc. [2]. the average lift over drag ratio in cruise has increased more than 30 percent since the advent of jet airliners. at the same time, new alloys and composite materials have reduced the structural weight by nearly 40 percent in the same period. the engines have halved their specific fuel consumption in the last five decades. these improvements have resulted in an outstanding decrease in direct operating costs. flying has become an ordinary and popular way of travel in the rich countries [4], and a similar societal transformation is occurring in the developing countries, led by the continuous decrease in air fares. consequently, the centre of gravity of civil aviation is changing [5–7]. in the recent past the usa, in first place, and europe, in second place, led the rankings. by around 2020, europe will be number one (as depicted in fig. 2), and some years later the asia-pacific rim will overtake europe in terms of revenue passenger-kilometres and air freight.
on the other hand, the changes in the geopolitical scenario are accompanied by changes in the perception of public opinion in the usa and western europe, where a strong wave of environment protectionism is putting under pressure and detailed scrutiny all sources of noise and pollution including, needless to say, civil aviation. in the last two decades aeronautical engineers have responded with larger and more efficient airplanes to the strong requirements for cheaper and more environment-friendly aircraft. but the so-called conventional configuration is approaching an asymptote (see fig. 5) about the size of the airbus a380. therefore aeronautical engineers have started to consider unconventional aircraft in order to overcome the limits and to achieve performance or operational improvements, including drag reduction, increased useful load, diminishing environmental impact, etc. the arrangements studied are both strange and creative [9–12], as shown in fig. 6. © czech technical university publishing house http://ctn.cvut.cz/ap/ 33 acta polytechnica vol. 47 no. 1/2007 fig. 2: evolution and geographical distribution of world air traffic at the beginning of the 21st century in terms of revenue passenger kilometres (rpk). on the left, data for year 2000 (3.3 trillion rpk). on the right, year 2020 (8.3 trillion rpk) air trips per capita per capita income fig. 3: relationship between wealth and air traffic 10900 at 180 10900 at 180 8800 at 270 2000 2020 fig. 4: evolution of fleet and average size of jet airliners. the grey area accounts for aging jets remaining, jets recycled and jets required for fleet replacement. the white area above represents fleet growth. overall, around 16000 passenger transport airplanes will be built between 2000 and 2020 fig. 5: evolution of fuel efficiency of jet transport airplanes fig. 6: unconventional configurations studied for future transport airplanes within this market and technology framework, one of the most promising configurations is the flying wing in its different concepts: blended-wing-body, c-wing, tail-less aircraft, etc. it may provide significant fuel savings and, hence, a lower level of pollution. moreover, the engines are located above the wing and the aircraft does not need high lift devices in a low-speed configuration, which results in a quieter airplane. this explains the great deal of activity carried out by the aircraft industry and by numerous researchers throughout the world to perform conceptual design level studies, to address the problems and challenges posed by this layout. this paper discusses the main features of flying wings and blended wing bodies, their advantages over conventional competitors, and some key operational issues, such as airport operation, evacuation, and vortex wake intensity. 2 the arrangement of flying wings and blended wing bodies the flying wing concept is not new. it was used by lippisch and horten in germany in the 1930s and by northrop on prototypes flown in the 1940s [13], as presented in fig. 7. some british firms performed interesting conceptual design work during the 1950s on potential airliners with this configuration (see fig. 8) [14]. currently, researchers and designers are working on two main configurations: a rather pure flying wing, fw, with straight leading and trailing edges (depicted in fig. 9); and a blended wing-body arrangement, bwb, in which the body adopts the shape of a much flattened fuselage mated to an outer wing (see fig. 10). 
the studies published cover most existing segments of commercial aviation, from a one hundred seat delta wing [15] to gigantic 1500 seat aircraft [10]. the great majority of papers deal with the bwb layout [16–22], mainly due to its growing capability which easily results in a family [23, 24]. it has improved characteristics over conventional layouts, in terns of aerodynamics, due to its relatively reduced wetted 34 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 47 no. 1/2007 fig. 7: northrop yb49 flying wing fig. 8: sketches of the all-wing bristol 260 seats fig. 9: plan view sketch of a 300-seat flying wing fig. 10: sketch of the blended-wing-body aircraft area, and in terms of structural weight, due to its spanloading effect. however, the bulky inner body of bwb counter balances, at least in part, these beneficial effects. the lift coefficient is still relatively high [22] and the lift alleviation by payload and main structure is only partial [21]. pure flying wings behave much better in these two key aspects [25–27]. however, to enable efficient use of the inner space they need to be larger than a minimum size and to incorporate thick airfoils. moreover, they can hardly form families in a similar sense to what is common with conventional airliners. in payload-driven design the cabin area, around 1 m2 per passenger in three-class seating, is of extreme importance. the cabin surface, scab, is linked to the wing geometry as follows: s s fcab wing planform inner arrangement spar location � ( , , , , , )a t c � � (1) where s is cabin area, a is aspect ratio, � is taper ratio and t/c the relative thickness. it should be recalled that, by definition, s � b2/a. the wing aspect ratio is ordinarily chosen as a compromise among proper aerodynamic performance, minimum gross weight and suitable area per passenger. published values are typically around 6, clearly below those of conventional jets, which fall into the 7–10 range. medium-size flying wings pose a crucial problem: habitability, presented in fig. 11. the torque box of the wing, i.e. the space between the front and rear spars, is pressurised to accommodate the payload. the minimum height in any place of the cabin must be around 1.85 m, which implies either a long chord or a thick airfoil. large bwb are designed with more than one deck, as in fig. 12, and habitability issues almost disappear. in contrast, they exhibit a major drag penalty. to avoid very high aerodynamic drag the relative thickness has to be below a certain threshold. eq. 2 shows the relationship among admissible wing thickness, t/c, cruise lift coefficient, clcr, cruise mach number, mcr, and swept angle, � [28]. t c c m� � � �( . . ) ( . ) cos .0 90 01 0 02 0 5lcr cr � . (2) in the former equation the drag rise is assumed to occur at two cents above the cruise mach number. with appropriate values of intervening variables many fw and bwb designs consider 17 percent thick airfoils in their studies. architecture of the flying wing from the structural point of view, flying wings are arranged as dual entities: an unconventional inner wing with a pressurised torque-box between the spars, for passenger cabins and holds; and an outer wing with fairly conventional architecture, including fuel tanks outboard of the cargo holds. 
as the payload is located, completely or in part, inside the wing or inside a lifting shape, the structure around the payload must resist pressurization loads added to ordinary bending and torque induced by the overall configuration. the loads combine in a nonlinear manner and, under extreme manoeuvres or gusts, produce high stresses and severe deformation. the stress level has to be analyzed in terms of fail-safe and damage-tolerance. significant deformations may in additions affect the operation of the flight controls and equipment, the habitability of the cabin and the aerodynamics of the external shape [29, 30]. two main concepts have been studied for to enable skin of these aircraft to resist all prescribed loads: a flat sandwich shell and a double-skin vaulted shell. fig. 13 sketches such concepts. in both cases the resisting shells are helped by large ribs, in order to carry on the concentrated shear forces, to maintain the external aerodynamic shape and to contribute to torsion stiffness. from the point of view of the passengers, there is no major difference, since headroom and passages are optimized to become almost equal. structurally, however, there are major differences. in the thick flat sandwich shell, the skin has to be sized to withstand all loads: pressurization, bending and torque; but © czech technical university publishing house http://ctn.cvut.cz/ap/ 35 acta polytechnica vol. 47 no. 1/2007 fig. 11: habitability inside a flying wing fig. 12: cabin and freighthold arrangement in a blended-wing-body aircraft fig. 13: alternative structural solutions for the cabin area in flying wings its geometry is essentially the same as in conventional wings, i.e. a nearly flat skin stiffened with stringers and held by ribs. on the other hand, the external shape of the wing skin corresponds to the airfoil that is selected. in all cases the upper and lower skins are formed by quasi-planar panels. resisting the internal pressure with these planar panels is very inefficient [29, 31], and thus this solution may lead to undesirable extra weight. conversely, the double skin vaulted solution divides the load carrying responsibility into two major components: side-by-side inner cylindrical segments, similar to narrow-body sections, to resist pressurization; and a conventional outer skin to bear the additional loads. the ribs are very large and provide load distributing paths and a joint inner to outer skin. some researchers indicate that this solution is lighter and also superior due to its load diffusion and fail-safe features [24, 30, 32]. such a construction is also claimed to be well suited to prevent fatigue crack propagation and to increase buckling rigidity. however, other designers argue that if a rupture occurs in the inner vaulted skin the cabin pressure would have to be borne by the outer skin which, therefore, has to be sized to carry that load too [21]. obviously this would imply a great deal of extra weight. as an additional point in this controversy, let us recall that common fuselage structures are sized to resist quite large ruptures; up to 1.86 m2 (20 sq ft) according to far-jar 25.365 [33, 34]. consequently, if the inner vaulted skin takes equivalent responsibility as the fuselage structure in conventional airplanes, the rupture or pressurization loss becomes of secondary importance. moreover, the available space between the skins may be used to accommodate various items of equipment as will be mentioned later. 
selecting one of these two structural solutions is an open issue, and the critical decision on it affects manufacturing, wiring and maintenance, among other relevant aspects. no estimations exist yet on the influence of such concepts on direct operating costs, and it is not easy to speculate about the possible pre-eminence of one of them. the decision will likely depend on the size and number of decks of each aircraft. independently of the chosen solution for the skin, the best cabin arrangement seems to be a set of parallel bays, slightly shifted longitudinally following the wing’s sweptback (see fig. 14). once this concept is established the best layout to maximize the number of passengers abreast with the minimum width is a 3-by-3 disposition, since it requires only one aisle per bay. this means a narrow body-like size. the cabin width of the current boeing narrow body fuselages is 3.53 m and the analogous figure for airbus is 3.70 m. this last must be considered a minimum, since fw and bwb are designed to fly much longer ranges than any a320 or b737 and, therefore, should offer roomer cabins. such an arrangement provides enough flexibility for three-class seating [24, 27], as shown in fig. 14, corresponding to a 300-seat class fw. in larger aircraft there can be more bays, the bays may be more stretched and the overall capacity can easily reach 800-1000 passengers [21, 23, 24]. first class and business travellers occupy the central bays to benefit from improved comfort levels in rough flights, although recent investigations indicate that unpleasant rolling accelerations could be counterbalanced by smoothed manoeuvres and multimedia equipment [35]. needless to say, these aircraft comply with suitable standards concerning access, evacuation and on board services. the doors must be arranged to allow passengers to board and leave the aircraft independently of galley servicing and cleaning. furthermore they have to be equipped with ramp slides, provisions for ditching, etc. several symmetrical couples of type a doors are located on the sides of the front corridor, through the front spar and leading edge, and some symmetrical pairs are located at the rear, through the second spar and trailing edge. in some conceptual designs all galleys, toilets and wardrobes are located at the rear of the cabin for aesthetic and operational reasons. this arrangement of exits and services is very efficient and improves emergency evacuation. as in any other airliner, on board services include galleys (one area per 100–120 passengers), toilets (one per 40–50 passengers), overhead lockers for passengers’ light baggage, stowage for mail, duty-free items or passengers’ coats, new in-flight services, etc [36]. these roomy aircraft have few problems with accommodating all equipment and installations required on board: electrical, air conditioning, emergency oxygen, avionics, fuel tanks, de-icing and anti-icing, auxiliary power units, etc. for example, the non-pressurized leading and trailing edges provide precious spaces close to the cabin bays as does the space between the inner and outer skins in the double vaulted shell architectural solution. 3 advantages of the flying wing one of the claimed advantages of flying wings is weight saving, in terms of both maximum take-off weight, mtow, and trip fuel, tf. these and other main weights can be estimated at conceptual design level as follows. 
by definition, the maximum take-off weight to perform a specified mission is

$$\mathrm{mtow} = \mathrm{oew} + \mathrm{pl} + \mathrm{tf} + \mathrm{rf}, \quad (3)$$

where oew is the operating empty weight, obtained from an empirical correlation between oew on one side and mtow and wing size on the other, analogous to the procedure described in [37] for oew, mtow and fuselage size; pl stands for payload; tf represents the fuel burnt during the flight; and rf the reserve fuel. both the reserve fuel and the consumption in take-off, climb, descent and landing can be considered as known fractions of the actual weight in each phase [38]. the fuel burnt in cruise, $w_{fcr}$, is computed using the breguet range equation [28, 37, 38]

$$r = k \ln \frac{w_i}{w_i - w_{fcr}}, \quad (4)$$

where r is range, $w_i$ the initial weight in the cruise phase, and k a range parameter assumed constant and defined as

$$k = \frac{v}{c_j}\,\frac{l}{d}. \quad (5)$$

eq. 5 shows how the range parameter incorporates the influence of the cruise speed, v, the specific fuel consumption, $c_j$, and the average lift-over-drag ratio in cruise, l/d. the average lift-over-drag ratio of a jet airliner in cruise is about 92 to 95 percent of the maximum [37]. this last can be expressed as

$$\left(\frac{l}{d}\right)_{\max} = \sqrt{\frac{\pi a e}{4\, c_f\, s_{wet}/s}}, \quad (6)$$

where $c_f$ is the average friction coefficient (mainly dependent on the reynolds number) over the exposed area, $s_{wet}$. since flying wings have a lower aspect ratio but also a lower friction coefficient (due to their larger chord and, hence, larger reynolds number) [37], the key point is the smaller relative exposed area. this is easily observed in fig. 15. the lift-over-drag ratio of flying wings in cruise is thus about 20–25 percent larger than for conventional jet transports. eq. 6 includes the product $\pi a e$ that comes from the lift-dependent part of the drag polar, as made explicit in eq. 7:

$$c_d = c_{d0} + \frac{c_l^2}{\pi a e}. \quad (7)$$

quantifying these effects, most databases suggest $c_{d0}$ values in the range of 0.014–0.020 for common jets [28, 37, 38–40], falling to 0.008–0.010 in flying wings and similar concepts. this means cruise l/d around 23–25 [21, 26, 27], slightly above 20 percent over current jets. if laminar flow control (lfc) can be introduced over part of the exposed area [41–43], the non-lift-dependent term decreases to 0.007–0.008, with l/d almost reaching 30 [27]. an uncommon characteristic of flying wings is that they have to cruise at higher altitudes than conventional jets, between 41000 and 47000 ft in the various steps of long flights. this fact deserves some explanation. the lift coefficient for maximum range must be [44]

$$c_{l,cr} = \sqrt{\frac{1+\varepsilon}{3-\varepsilon}\,\pi a e\, c_{d0}}, \quad (8)$$

where $\varepsilon$ is a parameter related to the mach number dependence of the specific fuel consumption [44]. for current high-bypass-ratio turbofans it is about 0.6. this results in $c_{l,cr}$ around 0.5 for conventional airliners and some 0.3 for flying wings. in cruise, lift must balance weight; i.e.

$$w = l = \frac{\gamma}{2}\, p_{cr}\, m_{cr}^2\, c_{l,cr}\, s, \quad (9)$$

where $\gamma$ is 1.4, and $p_{cr}$ and $m_{cr}$ are the pressure and mach number at cruise conditions, respectively. eqs. 8 and 9 can be rearranged as follows:

$$p_{cr} = \frac{2\,(w/s)}{\gamma\, m_{cr}^2\, \sqrt{\dfrac{1+\varepsilon}{3-\varepsilon}\,\pi a e\, c_{d0}}}. \quad (10)$$

flying wings are foreseen with cruise wing loadings around 2000–3000 pa, instead of the 4000–6000 pa of conventional airliners. so, with the same mach number, lower $c_{d0}$ and much lower wing loading than conventional airliners, the flying wing must fly at a higher altitude (as indicated in fig. 16) to benefit from its intrinsic design features.

fig. 14: three-class seating in a flying wing. the two outer bays are symmetrical
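as an illustration of how eqs. 3–5 chain together, the sketch below inverts the breguet equation to obtain the cruise fuel for assumed values of speed, specific fuel consumption, lift-over-drag ratio and initial cruise weight. every number is a hypothetical input chosen only to show the mechanics, not data from this study.

import math

v   = 0.80 * 295.0   # cruise speed [m/s] at m = 0.8, ~41000 ft (assumed)
cj  = 1.6e-5         # specific fuel consumption [kg/(n*s)] (assumed)
lod = 24.0           # cruise lift-over-drag ratio, mid of the 23-25 band
r   = 10_000e3       # mission range [m] (assumed)

g = 9.81
k = (v / (cj * g)) * lod                 # range parameter, eq. 5 (g converts mass to weight)
wi = 220_000.0                            # initial cruise weight [kg] (assumed)
wfcr = wi * (1.0 - math.exp(-r / k))      # eq. 4 solved for the cruise fuel

print(f"range parameter k : {k/1e3:8.0f} km")
print(f"cruise fuel       : {wfcr/1e3:8.1f} t  ({100*wfcr/wi:.1f} % of wi)")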
the engines are sized following four common requirements [37–40]: cruise capability, take-off field length, landing field length and second-segment climb. a suitable design value for the thrust-over-weight ratio is $t_{to}/w_{to} \approx 0.25$, which includes an allowance for the thrust lapse from static take-off to high-subsonic, high-altitude cruise conditions. interestingly, most designs do not consider high-lift devices for field manoeuvres, due to the relatively low wing loading of flying wings. closely related to the very good aerodynamics just mentioned, the specific range, i.e. the distance flown per unit fuel mass burnt, is defined as

$$\frac{dr}{dw} = \frac{v}{c_j}\,\frac{l}{d}\,\frac{1}{w}. \quad (11)$$

the specific range can also be expressed as a function of the flight conditions, according to eq. 12:

$$\frac{dr}{dw} = f(m, h, \omega), \quad (12)$$

where $\omega$ is the weight relative to mtow. fig. 17 presents the specific range for a 300-seat class flying wing. the results show that the optimum altitude for maximum range of this particular flying wing is around 45000 ft, as already indicated. this maximum, as a function of altitude, is rather flat, which means that the loss of range is almost negligible provided the flight remains between 41000 and 47000 ft. the sharp decline seen on the right-hand side of the plots is due to the sudden drag rise at high subsonic mach numbers. estimating the fuel consumption in the short-duration phases of the flight (take-off, landing, etc.) and integrating the specific range over the duration of the mission yields the payload-range diagram. a two- or three-step cruise is commonly defined in long flights to match altitude with decreasing wing loading. fig. 18 presents the pl-r diagram of the 300-seat class aircraft mentioned above. although not fully optimized, it may be close to the future practice of fw operation. in the case of lfc, a hypothesis has been introduced: to be conservative in the pl-r estimation it is assumed that the lfc equipment fails completely during the last three hours of flight, in parallel to the conditions and extra fuel allowances required for current etops (extended-range twin operations) [45]. hence fig. 18 shows three different diagrams: one corresponding to the fw without lfc; another with lfc but keeping the same mtow as the original aircraft (which is not sound, since the mission range is not maintained); and another with the same mission specifications but reduced mtow. obviously the payload-range diagram is a boundary of all possible routes. the utilization of a given airplane is quite different in distinct companies. thus, fig. 19 depicts a histogram of routes flown with the a340-300. the dual mode is characteristic of current practice by airlines: they use the aircraft on many medium-to-long range stages, as well as on some dense, short stages to increase annual utilization and profits [46]. the fw shows greater operational flexibility, since it exhibits lower empty weight and much lower fuel consumption, thus making route planning easier and more efficient.

fig. 15: lift-over-drag ratio and aerodynamics of various types of aircraft (conventional a/c, flying wings, fw with lfc)
fig. 16: relationship between wing loading and cruise altitude for a given aircraft (curves of constant $c_l m^2$; axes: cruise wing loading $w_{cr}/s$ in pa vs. altitude h in km)
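the link between wing loading and cruise altitude expressed by eqs. 9–10 can be made concrete with a few lines of code: given a wing loading and a cruise lift coefficient, eq. 9 yields the cruise static pressure, which an inverted isa stratosphere law converts to altitude. the wing loadings and lift coefficients follow the text; everything else is an illustrative assumption of ours.

import math

def cruise_altitude_m(wing_loading_pa, cl_cr, mach=0.8, gamma=1.4):
    # eq. 9 solved for the cruise static pressure
    p_cr = 2.0 * wing_loading_pa / (gamma * mach**2 * cl_cr)
    # invert the isa stratosphere law p = 22632*exp(-(h-11000)/6341.6) [pa, m]
    return 11000.0 - 6341.6 * math.log(p_cr / 22632.0)

for label, ws, cl in [("conventional jet", 5000.0, 0.5),
                      ("flying wing     ", 2500.0, 0.3)]:
    h = cruise_altitude_m(ws, cl)
    print(f"{label}: h_cruise = {h:7.0f} m = {h/0.3048:6.0f} ft")

with the assumed inputs the flying wing comes out roughly 4000 ft higher than the conventional jet, in line with the 41000–47000 ft band quoted above.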
taking all the fuel burnt during the flight of an lfc flying wing, the resulting consumption is 14.6 g/pax·km or 1.82 l/pax·100 km, somewhat lower than that of the airbus a380 [47] and comparable to the consumption of fully occupied efficient cars [27]. table 1 collects the main performances of flying wings and common airliners of the 300-seat class. it shows the great advantages of flying wings both in fuel efficiency and in take-off and landing field lengths.

fig. 17: specific range for a 300-seat flying wing, designed at $m_{nom} = 0.8$, in terms of actual cruise mach number, cruise altitude and weight relative to mtow
fig. 18: payload-range diagram of a flying wing (thin line), an lfc flying wing with the same mtow as the former concept (dashed line), and an lfc flying wing with the same mission range

4 some operational aspects of the flying wing

this section is devoted to three meaningful aspects of airplane operation: on-ground manoeuvres and servicing, emergency evacuation, and vortex wake effects. no major differences are found when studying runway/taxiway movements or airport terminal operations of unconventional aircraft, such as flying wings or blended-wing-bodies, provided they fit within the newly defined 80 m box and the landing gear track is smaller than or equal to 14 m [48]. these limits are very restrictive for large bwb, frequently designed with a wingspan greater than 100 m and a landing gear track near 20 m. it should be noted that all major airports have already made major modifications to accommodate the a380 and other potential ultra-high-capacity airplanes [49], and no further major works can be justified in the mid-term future. with respect to servicing unconventional airplanes, there are no relevant issues that could delay or pose serious difficulties to their entry into service. as a basis for comparison, fig. 20 shows the busy on-ground servicing of a conventional medium-size airplane, the b767, with many vehicles around the aircraft to enable a complete turn-around in about 45 minutes. in the case of a medium-size flying wing belonging to the 300-seat class, presented in figs. 9 and 14, the rear doors can be installed through the rear spar and trailing edge. they are used for cabin cleaning and galley servicing. in this situation passenger services, cargo/baggage handling and airplane servicing can be done simultaneously with the usual overlap of activities, as shown in fig. 21. interestingly, the loading and unloading of passengers at airport piers requires the finger's floor to be positioned about 5 m above the ground for medium-size wide bodies, but at a moderate height of around 3 m through the leading edge of the flying wing. container loading is also performed in the front part, but at a distance from the passenger doors. the doors of the cargo compartments are at a similar height, around 2.5–3.0 m, in all cases. waste draining is performed below the central part of the wing and fuel refilling is carried out below the outer part of the wing. it is evident that the situations in figs. 20 and 21 are very similar. emergency evacuation is another issue in the future operation of any aircraft. it is always a major issue, particularly in large airplanes or when the configuration is unusual.
table 1: summary of field and trip performances of flying wings and conventional airliners

                         a330-200   b777-200   flying wing   lfc fw
  stofl (m)                2390       2310        1860        1930
  slfl (m)                 1600       1550        1320        1350
  fuel eff. (g/pax·km)     21.5       23.5        19.8        14.6

fig. 19: worldwide utilisation histogram of the a340-300
fig. 20: on-ground servicing of the b767-200

very little has been published about this point on flying wings. the only aspect frequently included in conceptual design drawings is that the doors/exits are located in the leading edge. this location entails many problems, such as the need for structural reinforcements to assure integrity in crashes or object strikes, or provisions to avoid cabin flooding after ditching. exits through the trailing edge pose fewer problems, but they require a great deal of engineering imagination. to improve survivability after crashes, emergency landings, etc., the airworthiness authorities require manufacturers and operators to meet a number of design and performance standards related to cabin evacuation. one of the most important regulations, albeit controversial, is the 90-second rule, which requires a demonstration in any new or derivative type of airplane that all passengers and crew members can safely abandon the aircraft in less than 90 seconds, with half of the usable exits blocked, minimum illumination provided by floor lighting, and a certain age-gender mix in the simulated occupants [33, 34]. the only objective of the demonstration is to show that the airplane can be evacuated in less than 90 seconds under the aforementioned conditions. therefore, it only provides a benchmark for consistent evaluation; it cannot represent accident scenarios, nor is it intended for system optimization. since demonstrations are costly and dangerous, various computer models have been developed by the airworthiness authorities, airplane manufacturers and independent researchers to gain insight into and understanding of the evacuation process. however, all models have significant limitations. the results reported here are obtained with a seat-to-exit assignment algorithm that can be combined with various rules to minimize the total distance travelled by all passengers along their escape trajectories [50]. fig. 22 shows the suitable evacuation routes and the connections among the various areas in the 300-seat class flying wing considered in this paper [27]. the innermost bay has no front door, since the leading edge there is distorted by the nose bullet cockpit. according to the rules, only half of the exits can be used in the 90-second trials. several scenarios have been analysed. the worst results correspond to the case where the usable exits are the two front doors on one side of the plane of symmetry, and the rear door on the other side. table 2 summarises the results of the evacuation analysis in this condition and shows a fairly unbalanced situation. the average distances are acceptable, but the maximum distance appears rather long for a passenger escaping through the rear exit. moreover, the outermost front door is lightly used in comparison with the others; in a real trial some of the passengers assigned to the inner front door would escape through the nearby empty door, since there would be no queue most of the time. these results closely resemble those of airplanes with a slightly higher capacity, like the a340-300, dc10-30 or l1011-200 [50].
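the seat-to-exit assignment algorithm of [50] is not reproduced here, but a minimal greedy variant conveys the idea: walk through the seats in order of their distance to the nearest exit and assign each to the closest exit that still has capacity. the cabin geometry, the exit positions and capacities, and the greedy rule itself are all invented for illustration.

seats = [(x, y) for x in range(20) for y in range(6)]        # 120 seats on a grid
exits = {"outer front": (0.0, -2.0), "inner front": (10.0, -2.0),
         "rear": (19.0, 8.0)}                                # usable exits (assumed)
capacity = {"outer front": 40, "inner front": 40, "rear": 40}

def dist(a, b):
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5

# hand out seats greedily, nearest-exit-first
assignment, load = {}, {e: 0 for e in exits}
for s in sorted(seats, key=lambda s: min(dist(s, p) for p in exits.values())):
    free = [e for e in exits if load[e] < capacity[e]]
    e = min(free, key=lambda e: dist(s, exits[e]))
    assignment[s] = e
    load[e] += 1

for e in exits:
    ds = [dist(s, exits[e]) for s, ex in assignment.items() if ex == e]
    print(f"{e:12s}: n={len(ds):3d}  mean={sum(ds)/len(ds):5.1f}  max={max(ds):5.1f}")

the real algorithm additionally models aisle congestion and escape trajectories; this sketch only shows the assignment step that the distance statistics of table 2 are built on.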
therefore the flying wing configuration suffers a certain penalty for its extra-wide cabin layout, but without posing any remarkable problem in terms of passenger flow rate or total evacuation time.

fig. 21: on-ground servicing of a 300-seat flying wing
fig. 22: evacuation corridors and exits of a flying wing

table 2: evacuation results of a 300-seat flying wing

                      n° of evacuees   x_mean (m)   x_max (m)
  outer front exit          54             5.3          8.9
  inner front exit         132             6.3         10.3
  rear exit                138            10.7         18.5

regarding the third topic, all airplanes in flight produce an intense wake, mainly characterized by a pair of counter-rotating vortices originating at the wing tips. the wake thus formed is very severe, with induced velocities higher than those found in natural atmospheric turbulence. this phenomenon obliges a certain time and distance separation to be maintained between aircraft in take-off and landing manoeuvres, imposing a limit on the number of movements per runway. table 3 indicates aircraft separation in the terminal area in a low-speed configuration [51].

table 3: icao aircraft separation rules. for all combinations not shown, the minimum of 3 nm or 79 s applies

  leader aircraft (mtow)          follower   separation (nm)   time delay (s) (v = 70 m/s)
  heavy (>136000 kg)              heavy            4                 106
  heavy                           medium           5                 132
  heavy                           light            6                 159
  medium (<136000 kg, >7000 kg)   light            5                 132

the initial circulation of the aircraft wake, $\Gamma_0$, is practically maintained over a very long distance downstream. it can be computed under the hypothesis of an elliptic spanwise lift distribution as

$$\Gamma_0 = \frac{4}{\pi}\,\frac{w}{\rho\, v\, b}, \quad (13)$$

where w is weight (equal to lift), v speed, b wingspan and $\rho$ air density. following the hallock-burnham model [51, 52], one of the most realistic for describing the evolution of the vortex wake, the maximum tangential velocity is achieved at the radius of the vortex core,

$$v_{\max} = \frac{\Gamma_0}{4\pi\, r_c}, \quad (14)$$

where $r_c$ is the core radius and x the distance downstream. this radius enlarges downstream by diffusion, roughly as $\sqrt{x}$, and hence the maximum tangential velocity decays approximately as $x^{-1/2}$. fig. 23 represents the evolution of the maximum tangential velocity for an a330, a b777 and a 300-seat flying wing [53] in take-off; the relative positions are exactly the same in landing. the b777 always exhibits the most intense wake, and the fw300 the least powerful vortices. the results show that the flying wing, which by weight belongs to the heavy aircraft category (w > 136000 kg in table 3), could be considered within the medium category in terms of airport operations, i.e. time and distance separation, and this could increase the number of operations at airports.

fig. 23: downstream evolution of the maximum tangential speed (in metres per second) in the wake produced by the a330-200, the b777-200er and a flying wing at take-off, versus distance x (nautical miles)
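a short numerical sketch of eqs. 13–14 follows: it computes the initial circulation and the downstream decay of the peak tangential velocity for the three aircraft of fig. 23. the weights, spans, speed and, in particular, the diffusive core-growth law are assumptions of ours for illustration; the constants actually used in [51–53] differ.

import math

rho, v = 1.0, 80.0                      # air density [kg/m3], flight speed [m/s] (assumed)
aircraft = {"a330-200": (230_000.0, 60.3),
            "b777-200": (247_000.0, 60.9),
            "fw300   ": (190_000.0, 80.0)}   # (mass [kg], span [m]) - rough assumptions

g = 9.81
for name, (m, b) in aircraft.items():
    gamma0 = 4.0 * m * g / (math.pi * rho * v * b)           # eq. 13
    rc0 = 0.05 * b                                            # initial core radius (assumed)
    for x_nm in (1.0, 5.0, 10.0):
        rc = rc0 * math.sqrt(1.0 + x_nm * 1852.0 / 2000.0)    # assumed sqrt(x) core growth
        vmax = gamma0 / (4.0 * math.pi * rc)                  # eq. 14
        print(f"{name}  x={x_nm:4.0f} nm: vmax = {vmax:5.1f} m/s")

even with these crude inputs the ordering of fig. 23 is reproduced: the b777 wake is the strongest, the flying wing wake the weakest, mainly thanks to its larger span.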
5 conclusions

the flying wing, in its different arrangements, is one of the most promising and efficient configurations envisaged to face the increasing demand for air traffic and attenuate the impact produced by so many aircraft operations. this configuration, already studied in the 1930s and 1940s, is presently receiving a great deal of attention because of its potential advantages over conventional competitors in field and cruise performance and its greater environmental friendliness. current interest focuses on the so-called blended-wing-body layout, in which the fuselage is replaced by a much-flattened body that coalesces with a conventional outer wing, and on a rather pure flying wing, with straight leading and trailing edges. not only very large but also medium-size aircraft, similar in passenger capacity to common wide bodies, exhibit a remarkable improvement over conventional airplanes in field and cruise performance, as well as in emissions and noise. the flying wing configuration may better exploit emerging technologies such as laminar flow control, vectored thrust or active stability, taking these advantages even further. there are also many open issues: the structural arrangement around the cabin; emergency evacuation; large-size effects (beyond the 80 m box or the 14 m landing gear track); passenger acceptance of an uncommon cabin layout; vertical accelerations in gusty weather or roll manoeuvres; etc. taking into account the very positive results published by many researchers all over the world, and the level of difficulty of the still-open questions, the flying wing could become a new paradigm for commercial aviation and could enter into service within the next decade.

6 acknowledgment

a version of this paper was presented as a plenary lecture at the aed2006 conference, held in prague in june 2006. the author feels honoured by the invitation to contribute to this conference and expresses his sincere acknowledgment to the organising committee. the written version was finished at toulouse, france, during a sabbatical year hosted by supaero. the financial support of universidad politecnica de madrid and the spanish ministry of education for this research and sabbatical under grants pr2005-0100 and tra2004-07220 is highly appreciated.

references

[1] trimble, w. f. (ed.): from airships to airbus: the history of civil and commercial aviation. volume 2: pioneers and operations. washington, usa, smithsonian institution, 1995.
[2] anderson, j. d.: the airplane. a history of its technology. reston (va), usa, aiaa, 2002.
[3] www.boeing.com/history/boeing/b47.html
[4] jarrett, p.: modern air transport: worldwide air transport from 1945 to the present. london, uk, putnam, 2000.
[5] worldwide airports traffic. report 2003. london, uk, airport council international, 2003.
[6] long-term forecast of air traffic 2004–2025. brussels, b, eurocontrol, 2005.
[7] current market outlook 2004. seattle, usa, boeing commercial airplanes, 2004.
[8] global market forecast 2004–2023. blagnac, f, airbus, 2005.
[9] lange, r. h.: review of unconventional aircraft design concepts. journal of aircraft, vol. 25 (1988), p. 385–392.
[10] mcmasters, j. h., kroo, i. m.: advanced configurations for very large transport airplanes. aircraft design, vol. 1 (1998), p. 217–242.
[11] schmitt, d.: challenges for unconventional transport aircraft configurations. air and space europe, vol. 3 (2001), p. 67–72.
[12] nangia, r., palmer, m., tilman, c.: unconventional high aspect ratio joined-wing aircraft incorporating laminar flow. applied aerodynamics conference, aiaa paper 2003-3927, 2003.
[13] pelletier, a.: les ailes volantes. boulogne, f, etai, 1999.
[14] payne, r.: stuck on the drawing board. unbuilt british commercial aircraft since 1945. stroud, uk, tempus, 2004.
[15] stinton, d.: the anatomy of the aeroplane. 2nd ed. oxford, uk, blackwell, 1998.
[16] denisov, v. e., bolsunovsky, a. l., buzoverya, n. p., gurevich, b. i., shkadov, l. m.: conceptual design for passenger airplane of very large passenger capacity in the flying wing layout. proceedings 20th icas congress, sorrento, italy, vol. ii, 1996, p. 1305–1311.
[17] liebeck, r. h., page, m. a., rawdon, b. k.: blended-wing-body subsonic commercial transport. 36th aerospace sciences meeting & exhibit, reno (nv), usa, aiaa paper 98-0438, 1998.
[18] bolsunovski, a. l., buzoverya, n. p., gurevich, b. i., denisov, v. e., dunaevski, a. i., shkadov, l. m., sonin, o. v., udzhukhu, a. j., zhurihin, j. p.: flying wing – problems and decisions. aircraft design, vol. 4 (2001), p. 193–219.
[19] mialon, b., fol, t., bonnaud, c.: aerodynamic optimization of subsonic flying wing configurations. 20th aiaa applied aerodynamics conference, st. louis (mo), usa, aiaa paper 2002-2931, 2002.
[20] cook, m. v., castro, h. v.: the longitudinal flying qualities of a blended-wing-body civil transport aircraft. aeronautical journal, vol. 108 (2004).
[21] liebeck, r. h.: design of the blended wing body subsonic transport. journal of aircraft, vol. 41 (2004), p. 10–25.
[22] qin, n., vavalle, a., le moigne, a., laban, m., hackett, k., weinerfelt, p.: aerodynamic considerations of blended wing body aircraft. progress in aerospace sciences, vol. 40 (2004), p. 321–343.
[23] willcox, k., wakayama, s.: simultaneous optimization of a multiple-aircraft family. journal of aircraft, vol. 40 (2003), p. 616–622.
[24] bradley, k. a.: sizing methodology for the conceptual design of blended-wing-body transports. nasa cr 2004-213016, 2004.
[25] martinez-val, r., schoep, e.: flying wing versus conventional transport airplane: the 300 seat case. proceedings 22nd icas congress, harrogate, uk, cd-rom, 2000, paper 113.
[26] martinez-val, r., perez, e.: medium size flying wings. in: innovative configurations and advanced concepts for future civil aircraft (eds. e. torenbeek, h. deconinck). rhode-saint-genèse, b, von karman institute, 2005.
[27] martinez-val, r., perez, e., alfaro, p., perez, j.: conceptual design of a medium size flying wing. journal of aerospace engineering, part g, proc. imeche, vol. 221 (2007), p. 57–66.
[28] howe, d.: aircraft conceptual design synthesis. london, uk, professional engineering publishing, 2000.
[29] niu, m. c.-y.: airframe structural design. 2nd ed. hong kong, prc, conmilit press, 1999.
[30] mukhopadhyay, v., sobieszczanski-sobieski, j., kosaka, i., quinn, g., charpentier, c.: analysis, design and optimization of non-cylindrical fuselage for blended-wing-body (bwb) vehicle. aiaa paper 2002-5664, 2002.
[31] timoshenko, s. p., woinowsky-krieger, s.: theory of plates and shells. 2nd ed. new york, usa, mcgraw-hill, 1959.
[32] mukhopadhyay, v.: structural concept study of non-circular fuselage configuration. aiaa/sae world aviation congress, los angeles, usa, 1996, paper wac-67.
[33] part 25. airworthiness standards: transport category airplanes. in: code of federal regulations, title 14, aeronautics and space. washington, usa, office of the federal register, 1995.
[34] certification specifications for large aeroplanes cs-25. amendment 1. köln, d, european aviation safety agency, 2005.
[35] wittmann, r.: passenger acceptance of bwb configurations. proceedings 24th icas congress, yokohama, japan, cd-rom, 2004, paper 1.3.3.
[36] roskam, j.: airplane design. part 3. layout design of cockpit, fuselage, wing and empennage: cutaways and inboard profiles. ottawa (ks), usa, roskam aviation, 1986.
[37] torenbeek, e.: synthesis of subsonic airplane design. delft, nl, delft university press, 1976.
[38] roskam, j.: airplane design. part i. preliminary sizing of airplanes. ottawa (ks), usa, roskam aviation, 1985.
[39] fielding, j. p.: introduction to aircraft design. cambridge, uk, cambridge university press, 1999.
[40] raymer, d. p.: aircraft design: a conceptual approach. 3rd ed. reston (va), usa, aiaa, 1999.
[41] joslin, r. d.: aircraft laminar flow control. annual review of fluid mechanics, vol. 30 (1998), p. 1–29.
[42] bewley, t. r.: flow control: new challenges for a new renaissance. progress in aerospace sciences, vol. 37 (2001), p. 21–58.
[43] gad-el-hak, m.: flow control: the future. journal of aircraft, vol. 38 (2001), p. 402–418.
[44] martinez-val, r., perez, e.: optimum cruise lift coefficient in initial design of jet aircraft. journal of aircraft, vol. 29 (1992), p. 712–714.
[45] martinez-val, r., perez, e.: extended range operations of two and three turbofan engined airplanes. journal of aircraft, vol. 30 (1993), p. 382–386.
[46] clark, p.: buying the big jets. aldershot, uk, ashgate, 2001.
[47] vigneron, y.: commercial aircraft for the 21st century – a380 and beyond. aiaa/icas international air & space symposium and exposition, dayton (oh), usa, aiaa paper 2003-2886, 2003.
[48] annex 14, part 1, aerodromes. montreal, cnd, international civil aviation organisation, 2003.
[49] turrado, e., martinez-val, r.: impact of future generation aircraft in airport operativity and airspace capacity. proceedings of the icrat conference, zilina, slovakia, 2004, p. 311–317.
[50] martinez-val, r., hedo, j. m.: analysis of evacuation strategies for design and certification of transport airplanes. journal of aircraft, vol. 37 (2000), p. 440–447.
[51] gerz, t., holzäpfel, f., darracq, d.: commercial aircraft wake vortices. progress in aerospace sciences, vol. 38 (2002), p. 181–208.
[52] hallock, j. n., greene, g. c., burnham, d. c.: wake vortex research – a retrospective look. air traffic control quarterly, vol. 6 (1998), p. 161–178.
[53] ghigliazza, h. h., martinez-val, r., perez, e., smrcek, l.: on the wake of transport flying wings. submitted for publication to journal of aircraft, 2006.

prof. dr. rodrigo martinez-val
e-mail: rodrigo.martinezval@upm.es
etsi aeronauticos
universidad politecnica de madrid
plaza cardenal cisneros 3
28040 madrid, spain

lectures on noncommutative geometry

a. sitarz

abstract: we present a short overview of noncommutative geometry. starting with c*-algebras and noncommutative differential forms we pass to k-theory, k-homology and cyclic (co)homology, and we finish with the notion of spectral triples and the spectral action.
keywords: connes' noncommutative geometry, spectral triples (msc 2000: 58b34, 46l85, 46l87, 81t75).

1 what is noncommutative geometry?
“... to savour the strange warm glow of being much more ignorant than ordinary people, who are only ignorant of ordinary things.” (terry pratchett, “equal rites”)

noncommutative geometry begins with classical geometry and extends into the realm of abstract algebras and operators. one may, of course, say that noncommutative geometry studies the geometry of quantum spaces – or, to be more explicit, the geometry of noncommutative algebras. clearly, the word quantum, although at first only superficially related to quantum mechanics or quantum field theory, might be the right one: both physics and mathematics are involved in many examples, and there is a huge interplay between them. however, the notion of quantum spaces is a delicate one, since the objects that noncommutative geometry attempts to study are (usually) not spaces – they cannot be visualized. then why study noncommutative geometry? first of all, it seems to be a natural and rich extension of the concept of spaces, one that can admit the notion of geometry in its various aspects. moreover, within noncommutative geometry one has various objects on the same footing, and one can do more than within classical geometry. last but not least, one should mention that many basic examples arise from physics: the phase space in quantum mechanics, the brillouin zone in the quantum hall effect, the geometry of finite spaces in the noncommutative description of the standard model, quantum groups in integrable models, or the quantized target space of string theory. before we begin this easy walk through the noncommutative reality, we need to mention that – like any subject at an early stage of development – it has many branches. the approach presented here falls close to connes' noncommutative geometry (it appears that the wording connesian variety has also been used to describe this approach...) – 58b34, to give the mathematical subject classification number – and noncommutative differential geometry – 46l87. however, the topics that we shall mention range from c*-algebras and differential algebras to ideals and exotic traces. let us also attempt to place the subject matter of noncommutative geometry in relation to other subjects in physics and mathematics. clearly, in mathematics noncommutative geometry lies between algebra and geometry (meaning rather differential geometry), and is based on fundamental results of operator algebras. there are, however, many other, sometimes less evident connections – with topology, probability, measure theory, algebraic geometry, ring theory and also with number theory. in physics, it touches classical field theory, quantum field theory, renormalization, quantum mechanics, as well as gravity and string theory. of course, one should be aware that we are far from certain whether the notion of space-time is indeed best described by noncommutative geometry, and we still need some crucial theoretical steps to pursue this goal. nevertheless, some qualitative considerations, and also the evidence that we already have from high energy physics, make this line of research quite promising. in this set of lectures we shall present an overview of mathematical objects and tools leading us towards the noncommutative world. this set of lectures is by no means self-contained, and many statements are quoted without proofs. some statements are also presented not in their most general form – we do this on purpose, having as a basic guide the need to explain the ideas rather than the technical details.
we assume some basic knowledge of algebra, differential geometry, operator algebras and hilbert spaces, as well as some knowledge of gauge theory and characteristic classes. although not essential, some knowledge of these topics is a big help when learning noncommutative geometry – a good example of a textbook that can be used as a starting point is the classical text [4]. let us finish this introduction with a quotation from alain connes' interview with georges skandalis (newsletter of the european mathematical society, no. 3 (2007)).

– what is noncommutative geometry? in your opinion, is “noncommutative geometry” simply a better name for operator algebras or is it a close but distinct field?

– yes, it's important to be more precise. first, noncommutative geometry for me is this duality between geometry and algebra, with a striking coincidence between the algebraic rules and the linguistic ones. ordinary language never uses parentheses inside the words. this means that associativity is taken into account, but not commutativity, which would permit permuting the letters freely.

2 towards noncommutative topology

mathemata mathematicis scribuntur. (mathematics is written for mathematicians.) nicolaus copernicus

2.1 where it all begins: gelfand-naimark

when we say space, we usually mean a topological space. it may, of course, have some additional properties and features, but the basic ingredient, topology, is there. we usually assume that the topology is hausdorff, which means that every two points can be separated by disjoint open sets. we know many examples of topological spaces and we have a good notion of continuous (complex-valued) functions. one of the very basic observations (which we know almost intuitively) is that all continuous functions over a topological space form a complex vector space, and that the product of two continuous functions is still a continuous function. this says that we have an algebra of continuous functions and, since we can take the complex conjugate of a function, it is a *-algebra. let us assume for a while that our space is compact; then clearly to each function we can associate the supremum of its absolute value. thus we arrive at the norm of a function. going a step further we come to the notion of a c*-algebra, with the following formal definition:

definition 2.1: an involutive banach algebra $\mathcal{A}$ (that is, a complex normed algebra which is complete as a topological space in the norm) such that $\|a a^*\| = \|a\|^2$ for all $a \in \mathcal{A}$ is a c*-algebra.

so, to cut the story short: the algebra of continuous functions on a compact hausdorff space is a c*-algebra with a unit! but does it work the other way round? surprisingly (or not surprisingly) yes, as stated in the gelfand-naimark theorem:

theorem 2.2 (gelfand-naimark): a commutative unital c*-algebra is an algebra of continuous functions on a compact hausdorff space.

we shall not discuss the details of the proof, as it can be found in numerous textbooks – and is (in principle) quite easy.
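the c*-condition of definition 2.1 can be tested numerically in the matrix algebras that will serve as our main examples. the following sketch is our illustration, not part of the original lectures: it checks $\|aa^*\| = \|a\|^2$ for random complex matrices, with the operator norm computed as the largest singular value.

import numpy as np

rng = np.random.default_rng(0)

def opnorm(x):
    # operator norm on a finite-dimensional hilbert space = largest singular value
    return np.linalg.svd(x, compute_uv=False)[0]

for n in (2, 5, 20):
    a = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    lhs = opnorm(a @ a.conj().T)     # ||a a*||
    rhs = opnorm(a) ** 2             # ||a||^2
    assert np.isclose(lhs, rhs), (lhs, rhs)
    print(f"n={n:2d}: ||a a*|| = {lhs:10.4f} = ||a||^2")

the identity holds because $aa^*$ has eigenvalues equal to the squared singular values of $a$; it is exactly this rigidity of the norm that makes the c*-axioms so powerful.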
the points of the space are provided by the characters of the algebra, which are continuous algebra morphisms from the algebra to the complex numbers. this is the dawn of noncommutative geometry. why? note that c*-algebras might not be commutative. indeed, take an example: the algebra $m_n(\mathbb{C})$ of matrices with complex entries, with hermitian conjugation and matrix multiplication, is a very good and simple example of a noncommutative c*-algebra. then, in view of the gelfand-naimark theorem, we can use noncommutative c*-algebras as the definition of noncommutative hausdorff compact spaces. but these spaces have no points, or have just a couple of points! coming back to the matrix algebra $m_n(\mathbb{C})$ we see that for $n > 1$ there are no characters at all. another very good (and, as we shall see, generic) example of a c*-algebra comes from the theory of hilbert spaces. the recipe is very easy: take a (separable) hilbert space $\mathcal{H}$ and take the algebra $\mathcal{B}(\mathcal{H})$ of bounded operators on $\mathcal{H}$ with the operator norm. then any norm-closed subalgebra of $\mathcal{B}(\mathcal{H})$ is a (separable) c*-algebra. of course, our toy-model algebra $m_n(\mathbb{C})$ is in fact of this type: just take $\mathcal{H} = \mathbb{C}^n$; the algebra of all bounded operators on it is nothing else but $m_n(\mathbb{C})$. of course, we have stated one of the most restrictive versions of the theorem: if we drop unitality, we still get hausdorff spaces, but they are only locally compact.

2.2 into the c*-world

what are c*-algebras? there are many (equivalent) definitions, one of which we have already presented. we know that there are many c*-algebras, and what we would like is a description that puts them all on the same footing – independently of whether they are commutative or noncommutative. and we want not an abstract description but a concrete one. again, gelfand, naimark and segal come to our aid:

theorem 2.3 (gns): every abstract c*-algebra $\mathcal{A}$ is isometrically *-isomorphic to a concrete c*-algebra of operators on a hilbert space $\mathcal{H}$. if the algebra $\mathcal{A}$ is separable, then we can take $\mathcal{H}$ to be separable.

now we have a powerful tool: a description of all c*-algebras as operators on a hilbert space. this is a good starting point for noncommutative topology and for many other notions – like measurable functions, for instance. to summarize this section, let us quote the dictionary which establishes parallel notions between standard and noncommutative topology:

  topology                              | algebra
  (locally compact) topological space   | commutative c*-algebra
  homeomorphism                         | automorphism
  continuous proper map                 | morphism
  compact space                         | unital c*-algebra
  open (dense) subset                   | (essential) ideal
  compactification                      | unitization
  stone-čech compactification           | multiplier algebra
  cartesian product                     | tensor product

2.3 the tricky bits and examples

among the first problems that arise in the noncommutative world and have no classical counterpart are some ambiguities in constructions. for this reason, one should treat some “equivalences” from the above dictionary with due respect. a good example is the case of the tensor product of noncommutative c*-algebras, which might depend on the completion of the algebraic tensor product of the two algebras. let us first have a look at an example:

example 2.4: take the interval $i = (0,1)$ and the algebra of continuous functions on it, $c(i)$. of course, we can interpret
each element of the tensor product $c(i) \otimes c(i)$ as a continuous function on the cartesian product $i^2 = i \times i$, but it is clear that not all continuous functions on $i^2$ are of this form. of course, it is true that $c(i) \otimes c(i)$ is dense in $c(i^2)$, so, in order to work with tensor products of c*-algebras, we need to work out a way to complete the algebraic tensor product. this leads to a rather involved and non-unique construction, unlike the topological cartesian product of spaces. this is rather bad news, but we might find some comfort in the fact that many algebras (called nuclear) do have unique completions of tensor products with them. the list includes all commutative algebras, matrix algebras and – last but not least – the algebra of compact operators (which we shall define below). so in the end there is no ambiguity in defining continuous functions on the square, but it might be a different story for a noncommutative square! nevertheless, we shall always work with the algebraic tensor product, keeping in mind that some further details and more knowledge are needed when we try to think of c*-algebras. we shall very often restrict ourselves to the c*-algebras generated by some concrete operators, which are defined by their actions on an orthonormal basis. so, if t is a bounded operator on a hilbert space $\mathcal{H}$, then we shall denote by $c^*(t)$ the c*-algebra which is the norm closure of the algebra of all polynomials in $t, t^*$. let us consider a couple of examples, the first of which defines compact operators:

example 2.5: take $p_n$ to be the one-dimensional projection on a hilbert space with basis $\{e_k\}_{k \geq 0}$ given by $p_n e_k = \delta_{nk} e_k$, $n, k \geq 0$. then the smallest c*-algebra that contains all the projections $p_n$, $c^*(\{p_n\})$, is the algebra of compact operators $\mathcal{K}$.

example 2.6: let u be the bilateral shift on a hilbert space with basis $\{e_k\}_{k \in \mathbb{Z}}$: $u\, e_n = e_{n+1}$, $n \in \mathbb{Z}$. then the algebra $c^*(u)$ is isomorphic to the algebra of continuous functions on the circle, $c(s^1)$.

remark 2.7: note that our intuition is that the operator u corresponds to the function $\phi \mapsto e^{2\pi i \phi}$ on the circle, where $\phi \in [0,1)$ is the normalized angle. however, this may not be the case. take, for instance, any monotonic, continuous function $\psi : [0,1] \to [0,1]$. we can replace one of the elements of the basis of the hilbert space by $e^{2\pi i \psi(\phi)}$ and, using the standard procedure, introduce a new orthonormal basis of the hilbert space in which one of the basis vectors is proportional to the chosen one. certainly it will not be the basis we have in mind, and the shift operator u cannot then be identified with $e^{2\pi i \phi}$.

example 2.8: take t to be the unilateral shift on a hilbert space with basis $\{e_k\}_{k \geq 0}$: $t\, e_n = e_{n+1}$, $n \geq 0$. then the algebra $c^*(t)$ is related to the algebra $c(s^1)$ in the following sense: there exist c*-algebra morphisms $i$, $\pi$ such that the following sequence is exact:

$$0 \to \mathcal{K} \xrightarrow{\;i\;} c^*(t) \xrightarrow{\;\pi\;} c(s^1) \to 0.$$

example 2.9: take u, v to be the following unitary operators on the hilbert space with basis $\{e_{m,n}\}_{m,n \in \mathbb{Z}}$:

$$u\, e_{m,n} = e_{m+1,n}, \qquad v\, e_{m,n} = e^{2\pi i \theta m}\, e_{m,n+1}, \qquad m, n \in \mathbb{Z},$$

where $\theta \in [0,1)$. if $\theta = 0$ we can identify the algebra with the continuous functions on the torus. if $\theta$ is irrational, then the algebra $c^*(u,v)$ is the so-called irrational rotation algebra, aka functions on the noncommutative torus. it is easy to see that u and v satisfy

$$v\, u = e^{2\pi i \theta}\, u\, v.$$

clearly, the above presentation of the noncommutative torus is not unique. we can take as the hilbert space $l^2(s^1)$ and as operators u and v: $(u f)(z) = z f(z)$, $(v f)(z) = f(e^{2\pi i \theta} z)$. both operators are unitary and both satisfy the same commutation relation – is the c*-algebra then the same? it is reassuring that the answer is positive (though it takes some time to prove it).
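for rational $\theta = 1/q$ the commutation relation of example 2.9 has a finite-dimensional shadow: the q×q clock and shift matrices. the following numpy check is an aside of ours, not from the lectures; it verifies unitarity and the relation $vu = e^{2\pi i/q}\,uv$.

import numpy as np

q = 5
omega = np.exp(2j * np.pi / q)

u = np.roll(np.eye(q), 1, axis=0)            # shift: u e_k = e_{k+1 mod q}
v = np.diag(omega ** np.arange(q))           # clock: v e_k = omega^k e_k

assert np.allclose(u @ u.conj().T, np.eye(q))    # u unitary
assert np.allclose(v @ v.conj().T, np.eye(q))    # v unitary
assert np.allclose(v @ u, omega * (u @ v))       # vu = e^{2 pi i theta} uv

print("clock and shift realize the rational noncommutative torus, q =", q)

for irrational $\theta$ no such finite-dimensional representation exists – one is forced onto an infinite-dimensional hilbert space, exactly as in the two presentations above.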
3 differential geometry (noncommutative way)

alice laughed: “there's no use trying,” she said; “one can't believe impossible things.” – “i daresay you haven't had much practice,” said the queen. “when i was younger, i always did it for half an hour a day. why, sometimes i've believed as many as six impossible things before breakfast.” (lewis carroll, “alice in wonderland”)

having started with topology, we have established a nice setup for the discussion of noncommutative spaces. however, we are still very far from geometry, as topology does not distinguish between a ball and a cube! our task in this section is to carry over the parallels built up for c*-algebras as noncommutative spaces to some more geometric notions. we shall begin with the purely algebraic setup of differential calculi.

3.1 differential calculi

in a course of differential geometry one begins with the notion of a smooth manifold, $c^\infty$ functions and vector fields. this is, however, reserved for a purely commutative world, as can be immediately noticed when one takes the simplest example of a noncommutative space, described by the algebra of matrices $m_n(\mathbb{C})$, $n > 0$. suppose we want to have a vector field – what is a vector field then? a good answer is that a vector field is a derivation on the algebra of smooth functions:

definition 3.1: a derivation $\delta$ on an algebra $\mathcal{A}$ is a map $\delta : \mathcal{A} \to \mathcal{A}$ satisfying the leibniz rule:

$$\delta(ab) = \delta(a)\,b + a\,\delta(b), \qquad a, b \in \mathcal{A}.$$

a derivation is inner if there exists an element $x \in \mathcal{A}$ such that for every $a \in \mathcal{A}$, $\delta(a) = [x, a]$. a derivation which is not inner is called an outer derivation.

since for commutative algebras all inner derivations vanish, vector fields are (in classical differential geometry) outer derivations. but let us look at $m_n(\mathbb{C})$ – where every derivation is in fact inner (which means that there are no outer derivations). so, can we call them vector fields? we can take an even simpler example: the commutative algebra of complex functions on two points. as a vector space it has two basis vectors: the unit (1) and the function e which takes the value 1 on the first point and −1 on the other. each function is a linear combination of these two, and the algebra structure is encoded in one simple identity, $e^2 = 1$. now, what is the space of derivations? clearly, there are no inner ones, as the algebra is commutative. assume that $\delta$ is a derivation. then, using the leibniz rule, we have

$$0 = \delta(1) = \delta(e^2) = 2 e\, \delta(e),$$

and, since e is invertible, this simply tells us that apart from the trivial derivation, $\delta = 0$, there are no derivations at all. of course, a good lesson to learn is that we have chosen a bad object to start with. instead of looking at vector fields we need to look at differential algebras, which generalize nicely to the noncommutative world.
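that commutators are derivations is easy to confirm numerically. the sketch below is our illustration of definition 3.1, checking the leibniz rule for an inner derivation $\delta = [x, \cdot]$ on $m_n(\mathbb{C})$ with random matrices.

import numpy as np

rng = np.random.default_rng(1)
n = 4
x, a, b = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
           for _ in range(3))

delta = lambda m: x @ m - m @ x          # inner derivation delta = [x, .]

lhs = delta(a @ b)
rhs = delta(a) @ b + a @ delta(b)        # leibniz rule of definition 3.1
assert np.allclose(lhs, rhs)
print("leibniz rule holds: delta(ab) = delta(a) b + a delta(b)")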
definition 3.2: a differential graded algebra (dga) over an algebra $\mathcal{A}$ is an $\mathbb{N}$-graded algebra, not necessarily finite, whose 0-th grade is isomorphic to $\mathcal{A}$, equipped with a degree-one linear map d which obeys the graded leibniz rule

$$d(\omega\rho) = (d\omega)\,\rho + (-1)^{|\omega|}\, \omega\, (d\rho),$$

for any elements $\omega, \rho$, where $|\omega|$ denotes the degree of the form $\omega$.

there is, however, no unique way to construct such an object in the noncommutative situation, and one might have many different dgas over a single algebra – even in the commutative case. before we look at some interesting examples, let us define an important dga which can be canonically constructed for every algebra.

proposition 3.3: we assume that the algebra $\mathcal{A}$ is unital. let $\Omega^1_u(\mathcal{A})$ be the kernel of the multiplication map in $\mathcal{A} \otimes \mathcal{A}$:

$$\Omega^1_u(\mathcal{A}) = \Big\{ \sum_i a_i \otimes b_i \in \mathcal{A} \otimes \mathcal{A} \;:\; \sum_i a_i b_i = 0 \Big\}.$$

let us take as $\Omega_u(\mathcal{A})$ the tensor algebra over $\mathcal{A}$ of $\Omega^1_u(\mathcal{A})$. then the linear map defined on $\Omega^0_u(\mathcal{A}) = \mathcal{A}$ as

$$d_u(a) = 1 \otimes a - a \otimes 1$$

extends in a unique way to a degree-one linear operator on $\Omega_u(\mathcal{A})$ which satisfies the graded leibniz rule and is nilpotent, $d_u^2 = 0$. this dga is called the universal dga and $d_u$ the universal external derivative.

proof: it is a good exercise to have a look at the proof; actually, most of the properties of $d_u$ are used in the construction. for instance, we extend the definition of $d_u$ in such a way that the leibniz rule is assured. universal one-forms can always be expressed as $\sum_i a_i\, d_u b_i$, for $a_i, b_i \in \mathcal{A}$ with $\sum_i a_i b_i = 0$. indeed, an arbitrary universal one-form is of the type $\sum_i a_i \otimes b_i$ for some $a_i, b_i \in \mathcal{A}$ such that $\sum_i a_i b_i = 0$, but

$$\sum_i a_i \otimes b_i = \sum_i a_i \left( 1 \otimes b_i - b_i \otimes 1 \right) + \Big( \sum_i a_i b_i \Big) \otimes 1 = \sum_i a_i\, d_u b_i.$$

we set

$$d_u\Big( \sum_i a_i\, d_u b_i \Big) = \sum_i d_u a_i\, d_u b_i.$$

this assures the graded leibniz rule between elements of the algebra and one-forms, and makes sure that $d_u^2 = 0$ on $\mathcal{A}$. the rest is just the application of the leibniz rule: we extend $d_u$ to products of universal one-forms through

$$d_u(\omega_1 \otimes \omega_2 \otimes \cdots \otimes \omega_k) = \sum_{j=1}^{k} (-1)^{j-1}\, \omega_1 \otimes \cdots \otimes d_u \omega_j \otimes \cdots \otimes \omega_k,$$

for $\omega_i \in \Omega^1_u(\mathcal{A})$, $i = 1, \ldots, k$. of course, we need to check that this definition is compatible with the tensor product over $\mathcal{A}$, i.e. that it gives the same result on $\omega \otimes a\rho$ and $\omega a \otimes \rho$; this is a direct computation using the leibniz rule and $d_u(a) = 1 \otimes a - a \otimes 1$. it is then a matter of easy verification that $d_u^2 = 0$. □

from the above construction we obtain a very convenient presentation of universal differential forms: each form of degree k can be presented as a finite sum of elements of the type $a_0\, d a_1 \cdots d a_k$. moreover, two forms $\omega_a = a_0\, d a_1 \cdots d a_k$ and $\omega_b = b_0\, d b_1 \cdots d b_k$ are different unless, for each $i = 0, 1, \ldots, k$, $a_i$ and $b_i$ are linearly dependent. this follows directly from the definition of $\Omega^k_u(\mathcal{A})$ as the product of k one-forms and the fact that each one-form is already of that shape; the rest is the iterative application of the leibniz rule. note that this hints at a very nice feature of the universal differential algebra: each space of forms of fixed grade is generated as a left module over $\mathcal{A}$ by the image of d.
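proposition 3.3 can be probed concretely for a matrix algebra: encoding simple tensors $a \otimes b$ as kronecker products, the sketch below (our illustration; the encoding of the bimodule actions by kron factors is our choice) checks that $d_u(a)$ lies in the kernel of the multiplication map and that the leibniz rule on degree zero holds.

import numpy as np

rng = np.random.default_rng(2)
n = 3
one = np.eye(n)
a = rng.standard_normal((n, n))
b = rng.standard_normal((n, n))

def du(m):
    # universal derivative d_u(m) = 1 (x) m - m (x) 1, simple tensors via np.kron
    return np.kron(one, m) - np.kron(m, one)

def mult(xi):
    # multiplication map m(sum_i x_i (x) y_i) = sum_i x_i y_i: reshape the kron
    # matrix into a rank-4 tensor and contract the two middle indices
    t = xi.reshape(n, n, n, n)
    return np.einsum("abbd->ad", t)

assert np.allclose(mult(du(a)), 0)       # d_u(a) is a universal one-form

# leibniz rule: d_u(ab) = (d_u a).b + a.(d_u b); right action is kron(1,b),
# left action is kron(a,1)
lhs = du(a @ b)
rhs = du(a) @ np.kron(one, b) + np.kron(a, one) @ du(b)
assert np.allclose(lhs, rhs)
print("d_u(a) lies in ker(mult); leibniz rule holds")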
we did not assume this in the definition of a dga, but we can use the (nonstandard) name of a proper dga for those that have this property. finally, we come to the question of the name. the universal differential algebra owes its name to a nice property: it is indeed universal! what does this mean? the following is due to karoubi:

theorem 3.4: if $\Omega^*(\mathcal{B})$ is a graded differential algebra over $\mathcal{B}$ and $\phi : \mathcal{A} \to \mathcal{B}$ an algebra homomorphism, then there exists a unique extension of $\phi$,

$$\tilde\phi : \Omega_u(\mathcal{A}) \to \Omega^*(\mathcal{B}),$$

which is a morphism of graded differential algebras: $\tilde\phi \circ d_u = d \circ \tilde\phi$.

for our purposes, we shall actually need a consequence of this statement.

corollary 3.5: for any proper dga $\Omega(\mathcal{A})$ there exists a surjective morphism of differential graded algebras $\pi : \Omega_u(\mathcal{A}) \to \Omega(\mathcal{A})$.

proof: we set

$$\pi(a_0\, d_u a_1 \cdots d_u a_k) = a_0\, d a_1 \cdots d a_k.$$

this is a well-defined morphism of differential algebras. but since the differential graded algebra $\Omega(\mathcal{A})$ is proper, all its elements are of the form $a_0\, d a_1 \cdots d a_k$ and hence in the image of $\pi$. □

briefly, corollary 3.5 means that every proper differential graded algebra over the algebra $\mathcal{A}$ is isomorphic to a quotient of $\Omega_u(\mathcal{A})$ by a differential ideal, that is, an ideal $\mathcal{J} \subset \Omega_u(\mathcal{A})$ of the universal graded algebra such that $d_u(\mathcal{J}) \subset \mathcal{J}$. to see how this works we need to look at a couple of examples.

example 3.6: let $\mathcal{A}$ be the algebra of functions on two points (described earlier). the bimodule of universal one-forms is generated by $d_u e$. applying the leibniz rule we see that

$$e\,(d_u e) + (d_u e)\,e = d_u(e^2) = 0,$$

so there are only two linearly independent one-forms: $de$ and $e\,de$. similarly one constructs all higher-order forms, which (as in every universal calculus) can be of arbitrary order. in fact, $d_u$ is a derivation on the algebra $\mathcal{A}$, but a special one: it takes values not in $\mathcal{A}$ itself (it cannot, as we have shown before) but in a bimodule over $\mathcal{A}$.

example 3.7: let x be a space and $\mathcal{A}$ an algebra of functions on x (we shall not need anything more about this algebra, so we do not say whether the functions are measurable or smooth, or just arbitrary). now, consider the following differential graded algebra:

$$\Omega^k(x) = \Big\{ f : x^{k+1} \to \mathbb{C} \;;\; f(x_0, \ldots, x_k) = 0 \text{ if } x_i = x_{i+1} \text{ for some } 0 \leq i < k \Big\},$$

with the product

$$(f \cdot g)(x_0, \ldots, x_k, \ldots, x_{k+p}) = f(x_0, \ldots, x_k)\, g(x_k, \ldots, x_{k+p}),$$

for $f \in \Omega^k(x)$, $g \in \Omega^p(x)$. the external derivative d is defined on the functions $f \in \Omega^0(x)$ as

$$(d f)(x_0, x_1) = f(x_1) - f(x_0),$$

and then extended to forms of arbitrary order through

$$(d f)(x_0, x_1, \ldots, x_{n+1}) = \sum_{i=0}^{n+1} (-1)^i f(x_0, \ldots, \hat{x}_i, \ldots, x_{n+1}),$$

where $\hat{x}_i$ denotes that we omit the i-th variable.

exercise 3.8: prove that d satisfies the graded leibniz rule and is nilpotent, $d^2 = 0$. if x is a space consisting of a finite number of points, can we identify the differential algebra $\Omega(x)$?

note that the differential graded algebra $\Omega(x)$ is in general not a proper one. this depends strongly on the class of functions that we consider, as we can easily see:

exercise 3.9: assume that x is the interval i and $\Omega^k(x)$ consists of polynomials (in several variables) understood as functions on $i^{k+1}$. show that the differential graded algebra is a proper one in this case (that is, for each k, $\Omega^k(x)$ is generated by the image of d). what happens if we take $\Omega^k(x)$ to be all continuous functions on $i^{k+1}$?
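on a finite set the calculus of example 3.7 can be implemented directly, which also settles the computational half of exercise 3.8. in the sketch below (our illustration) a k-form is an array with k+1 indices, the product glues the shared middle point, and d is the alternating insertion sum.

import itertools
import numpy as np

npts = 4                                 # x = {0, 1, 2, 3}
rng = np.random.default_rng(3)

def d(f):
    # simplicial coboundary: (df)(x_0..x_k) = sum_i (-1)^i f(.. x_i omitted ..)
    k = f.ndim
    out = np.zeros((npts,) * (k + 1))
    for idx in itertools.product(range(npts), repeat=k + 1):
        out[idx] = sum((-1) ** i * f[idx[:i] + idx[i + 1:]] for i in range(k + 1))
    return out

def mul(f, g):
    # (f.g)(x_0..x_{k+p}) = f(x_0..x_k) g(x_k..x_{k+p}): glue the shared point
    kf = f.ndim
    out = np.zeros((npts,) * (kf + g.ndim - 1))
    for idx in itertools.product(range(npts), repeat=kf + g.ndim - 1):
        out[idx] = f[idx[:kf]] * g[idx[kf - 1:]]
    return out

f = rng.standard_normal(npts)            # a 0-form
g = rng.standard_normal(npts)            # another 0-form
w = rng.standard_normal((npts, npts))    # a generic 1-form

assert np.allclose(d(d(f)), 0)                                   # d^2 = 0
assert np.allclose(d(d(w)), 0)
assert np.allclose(d(mul(f, g)), mul(d(f), g) + mul(f, d(g)))    # leibniz
assert np.allclose(d(mul(w, g)), mul(d(w), g) - mul(w, d(g)))    # graded leibniz
print("d^2 = 0 and the (graded) leibniz rule hold on a", npts, "-point space")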
although $\Omega(x)$ might not be proper, it appears to be a suitable generalization of the universal differential graded algebra, especially in the case when the algebraic tensor products might not give us all the elements to play with (as is the case with smooth or continuous functions).

example 3.10: an interesting question is whether we can identify the standard de rham differential graded algebra using the universal calculus and choosing a suitable quotient. the answer is positive, but again we need to be careful about the tensor products. the construction is similar to example 3.7, with the algebra of smooth functions over x. the de rham differential calculus is obtained if we take the quotient of this differential graded algebra $\Omega_s(x)$ by the ideal described as follows. let $\Lambda$ be the space of all functions on $x^2$ such that

$$\lim_{x \to y} \frac{f(x, y)}{x - y} = 0, \qquad x, y \in x.$$

it is not difficult to observe that multiplication by smooth functions from both sides does not lead outside $\Lambda$. taking the quotient $\Omega^1_s(i)/\Lambda$ we first obtain all de rham one-forms on x. the construction of the higher-order forms is then a formality.

we already know that there are many noncommutative differential calculi over one algebra. the universal one, apart from its universality property, is not actually very interesting: it carries very little information about the “space” underneath. the interesting examples are the “smaller” calculi. actually, even in the commutative case we might have many differential graded algebras which are neither de rham nor universal. see the following example:

example 3.11: consider the algebra of smooth functions on the circle and the following differential graded algebra. the zero-forms are the smooth functions, $\Omega^0(s^1) = c^\infty(s^1)$. the bimodule of one-forms and the entire differential algebra are generated by two one-forms, $\theta$ and $\chi$. the following gives the algebra rules and the action of the external derivative:

$$d f = f' \theta + \tfrac{1}{2} f'' \chi, \qquad \theta f = f \theta + f' \chi, \qquad \chi f = f \chi,$$
$$d\theta = 0, \qquad d\chi = -2\, \theta \wedge \theta, \qquad \theta \wedge \chi = -\chi \wedge \theta, \qquad \chi \wedge \chi = 0,$$

where $'$ denotes the standard derivative of a function on the circle. note that a priori the differential algebra is infinite dimensional, as we can construct products of an arbitrary number of $\theta$'s. however, since $\theta \wedge \theta$ generates a differential ideal, we may take a quotient and therefore set $\theta \wedge \theta = 0$. a point worth mentioning is that although the algebra is commutative, the differential forms (which are not universal) do not commute with functions!

3.2 involution, tensor products and representations

3.2.1 involution

if the algebra $\mathcal{A}$ is equipped with an involution, we would like to have this operation extended onto the differential algebra. this is by no means a problem for the universal calculus: on the zero-forms we take the assumed involution on the algebra, and for higher-order universal forms we set

$$(da)^* = -\,d(a^*).$$

so the problem, when reduced to the universal case, is trivial, and the only thing we need to take care of is that the differential ideal $\mathcal{J} \subset \Omega_u(\mathcal{A})$ is involution invariant, $\mathcal{J}^* \subset \mathcal{J}$.

example 3.12: on algebras of complex functions we have a natural involution – complex conjugation. taking the trivial (but still interesting) example of the two-point geometry, with $e^* = e$, we have $(de)^* = -de$.

exercise 3.13: verify that in example 3.11 of the nonstandard differential algebra over the circle we have $\theta^* = -\theta$ and $\chi^* = \chi$.
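the relations of example 3.11 had to be reconstructed from a corrupted source, so the following sympy sketch is offered only as a consistency check of that reconstruction, not as the author's computation: with $\theta f = f\theta + f'\chi$, $\chi f = f\chi$ and $df = f'\theta + \tfrac12 f''\chi$, the leibniz rule $d(fg) = (df)g + f(dg)$ holds identically.

import sympy as sp

t = sp.symbols("t")
f = sp.Function("f")(t)
g = sp.Function("g")(t)

def d0(h):
    # d on functions: d h = h' theta + (1/2) h'' chi, stored as (theta, chi) coefficients
    return (sp.diff(h, t), sp.Rational(1, 2) * sp.diff(h, t, 2))

def right_mul(w, h):
    # (a theta + b chi) h = a h theta + (a h' + b h) chi, via theta h = h theta + h' chi
    a, b = w
    return (a * h, a * sp.diff(h, t) + b * h)

def left_mul(h, w):
    a, b = w
    return (h * a, h * b)

lhs = d0(f * g)
rhs = tuple(x + y for x, y in zip(right_mul(d0(f), g), left_mul(f, d0(g))))
assert all(sp.simplify(l - r) == 0 for l, r in zip(lhs, rhs))
print("leibniz rule consistent with the reconstructed relations of example 3.11")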
3.2.2 tensor products

next, let us discuss the procedure for constructing the differential graded algebra for the tensor product of two algebras $\mathcal{A}$ and $\mathcal{B}$. assuming that the respective differential algebras $\Omega^*(\mathcal{A})$, $\Omega^*(\mathcal{B})$ are given, there exists a canonical procedure for creating a differential algebra over the tensor product, which uses the $\mathbb{Z}_2$-graded tensor product:

$$\Omega^*(\mathcal{A} \otimes \mathcal{B}) = \Omega^*(\mathcal{A})\, \hat\otimes\, \Omega^*(\mathcal{B}),$$

where $\hat\otimes$ means that the product of forms coming from the two factors is commutative or anticommutative depending on their degrees: for forms of fixed degrees $\omega \in \Omega^*(\mathcal{A})$ and $\rho \in \Omega^*(\mathcal{B})$,

$$\omega\, \rho = (-1)^{|\omega||\rho|}\, \rho\, \omega.$$

3.3 representations of differential algebras

finally, let us consider a specific way of obtaining differential graded algebras – connected with representations and commutators. let $\mathcal{A}$ be an algebra and let $\pi$ be its representation on a vector space (not necessarily finite dimensional). let f be an endomorphism (a linear operator, in other words) of this vector space.

lemma 3.14: if $\pi$ is a representation of the algebra $\mathcal{A}$, then for each linear operator f the following gives a representation of the universal differential algebra $\Omega_u(\mathcal{A})$:

$$\pi_f(a_0\, da_1\, da_2 \cdots da_n) = \pi(a_0)\,[f, \pi(a_1)]\,[f, \pi(a_2)] \cdots [f, \pi(a_n)]. \quad (1)$$

note that $\pi_f$ is nothing else but a representation of the algebra, and neither the grading nor the external derivative is in any way preserved. this follows from the fact that the kernel of $\pi_f$ might not be a differential ideal. we also need to be careful while dealing with infinite-dimensional representations, such as representations on a hilbert space. in such a case it is natural to assume that all the operators $\pi(a)$ and the commutators $[f, \pi(a)]$ are bounded for all $a \in \mathcal{A}$. this naturally implies that the image of an arbitrary form is a bounded operator. f itself might not be bounded, and in the most natural examples, as we shall see, it is not. there exists a canonical way to obtain a differential graded algebra through $\pi_f$: we have to take $\mathcal{J} = \ker \pi_f + d_u(\ker \pi_f)$. this is a differential ideal within $\Omega_u(\mathcal{A})$, and then $\Omega_u(\mathcal{A})/\mathcal{J}$ will be a differential algebra. note that unless $d_u(\ker \pi_f) \subset \ker \pi_f$, the obtained differential algebra will not have a representation on the hilbert space. one might always choose for a higher-order form its representative in the image $\pi_f(\Omega_u^*(\mathcal{A}))$; however, this cannot be done in a unique way.

3.4 from representations to differential calculi

in view of the previous section we might consider just a different way of obtaining differential graded algebras: start with a representation $\pi$ of the algebra, choose a suitable operator f, and consider all commutators $[f, \pi(a)]$ as one-forms. with a bit of luck (and some additional assumptions) we shall obtain a good example of a differential calculus. we shall consider two canonical cases. first, let f be selfadjoint, $f = f^\dagger$. this assures that for an involutive algebra $\mathcal{A}$ and a *-representation we have $(da)^* = -d(a^*)$:

$$\pi_f(da)^\dagger = [f, \pi(a)]^\dagger = [\pi(a)^\dagger, f] = -[f, \pi(a^*)] = -\pi_f(d(a^*)).$$

moreover, let us take $f^2 = 1$, which means that (as seen on a hilbert space) f is a sign operator with eigenvalues 1 and −1. we have:

lemma 3.15: let $f = f^\dagger$, $f^2 = 1$ be an operator on the hilbert space $\mathcal{H}$, and let $\pi$ be a representation of $\mathcal{A}$ as bounded operators on $\mathcal{H}$.
Then $\pi_F$ defined in (1) is a representation of the differential algebra, with
$$\pi_F(d\omega) = \begin{cases} [F,\pi_F(\omega)], & \omega\ \text{even},\\[2pt] \{F,\pi_F(\omega)\}, & \omega\ \text{odd},\end{cases}$$
for any universal form $\omega$, where $\{\cdot,\cdot\}$ denotes the anticommutator.

Proof: First observe that $[F,x]\,F = -F\,[F,x]$ for any $x\in B(\mathcal{H})$, since $F^2=1$. Then
$$\pi_F\big(d(a_0\,da_1\cdots da_n)\big) = [F,\pi(a_0)]\,[F,\pi(a_1)]\cdots[F,\pi(a_n)],$$
and moving $F$ through the product $\pi(a_0)[F,\pi(a_1)]\cdots[F,\pi(a_n)]$ picks up one sign for each commutator factor, so
$$[F,\pi(a_0)][F,\pi(a_1)]\cdots[F,\pi(a_n)] = \big[F,\;\pi(a_0)[F,\pi(a_1)]\cdots[F,\pi(a_n)]\big]_\pm,$$
where the bracket on the right is the commutator or the anticommutator depending on the parity of $n$. Clearly $d^2=0$, as
$$\{F,[F,x]\} = 0 = [F,\{F,x\}]. \qquad\square$$

We thus have a first, crude method for obtaining a differential algebra through a representation of $\mathcal{A}$ on a Hilbert space. Of course, the differential calculi obtained in this way are not very nice. Even for commutative algebras they are rather awkward and, in particular, infinite dimensional. Consider the example of a circle:

Example 3.16: Let $\mathcal{A}$ be the algebra of functions on the circle (for our purposes we shall take only the polynomials in $u$), with the representation on the Hilbert space with basis $\{e_n\}$, $n\in\mathbb{Z}$, given as $u\,e_n = e_{n+1}$. We take the operator $F$ to be:
$$F\,e_n = \mathrm{sign}\left(n+\tfrac12\right) e_n.$$
What are then the differential forms? We calculate $du$:
$$du\;e_n = [F,u]\,e_n = \Big(\mathrm{sign}\big(n+\tfrac32\big) - \mathrm{sign}\big(n+\tfrac12\big)\Big)\,e_{n+1} = 2\,\delta_{n,-1}\,e_{n+1}.$$
The differential form $du$ is an operator of finite rank and, as we can easily see, $\frac12\,u^*\,du$ is a projection onto the subspace spanned by $e_{-1}$.

Exercise 3.17: Show that every differential form over the algebra of polynomials in $u$ and $u^*$ is in fact a finite rank operator. Can we generalize this result to the one-forms constructed with $F$ over the algebra of continuous functions on the circle?

The obtained calculus is somewhat strange, but we shall see that it plays an important role. Surely enough, it is genuinely noncommutative even for such a commutative space as the circle. We may, however, look for some examples of differential graded algebras which are more "reasonable".

Example 3.18: Let $\mathcal{A}$ be the algebra of polynomial functions on the circle. We take the same representation on the Hilbert space as in Example 3.16, but instead of the operator $F$ let us consider an unbounded operator $D$,
$$D\,e_n = n\,e_n.$$
Of course $D$, being unbounded, is only densely defined, so we need to work with a dense subspace of $\mathcal{H}$. The one-forms are all generated by the commutator $[D,u]$:
$$[D,u]\,e_n = e_{n+1},$$
and $du$ is a bounded operator (and hence well-defined on the entire Hilbert space). Since all one-forms and higher order forms arise from products of the elements of the algebra and $du$, the representation $\pi_D$ is into bounded operators. Observe that
$$u\,du = du\,u, \qquad u^*\,du = du\,u^*,$$
so the one-forms really do commute with the elements of the algebra. Since $D^2\neq 1$, the image of the universal DGA is not a DGA itself. For instance, in the differential algebra we would have
$$0 = d^2(u^2) = d(2u\,du) = 2\,du\,du,$$
but on the other hand $\pi_D(du)\,\pi_D(du)$ is certainly a non-zero operator:
$$\big(\pi_D(du)\,\pi_D(du)\big)\,e_n = e_{n+2}.$$
To obtain a true differential graded algebra we need to quotient the algebra generated by $u$ and $du$ by the differential ideal.
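The finite-rank claim of Example 3.16 and the boundedness claim of Example 3.18 are easy to see numerically on a truncated basis $e_n$, $n=-N,\ldots,N$. The following numpy sketch is our own illustration; truncation edge effects are negligible here:

```python
import numpy as np

N = 6
ns = np.arange(-N, N + 1)
dim = len(ns)
U = np.zeros((dim, dim))            # u e_n = e_{n+1}
for i in range(dim - 1):
    U[i + 1, i] = 1.0
F = np.diag(np.sign(ns + 0.5))      # F e_n = sign(n + 1/2) e_n
D = np.diag(ns.astype(float))       # D e_n = n e_n

dU = F @ U - U @ F                  # the one-form du = [F, u]
print(np.linalg.matrix_rank(dU))    # 1: du is a rank-one (finite rank) operator

P = 0.5 * U.T @ dU                  # (1/2) u* du
print(np.allclose(P @ P, P))        # True: projection onto span{e_{-1}}

dU_D = D @ U - U @ D                # [D, u]
print(np.allclose(dU_D, U))         # True: [D, u] = u, a bounded operator
```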
The above calculation was not a coincidence: it happens that $du\,du$ is exactly the element which should be added to the ideal generated by $\ker\pi_D$ in order to make it a differential ideal. As a result, we have a differential graded algebra with central one-forms (commuting with the elements of the algebra) and all higher-order forms vanishing. But this is nothing else than the de Rham differential algebra over a circle, when restricted to polynomial functions!

Finally, let us come back to the case of functions on two points:

Exercise 3.19: Take the algebra of complex-valued functions on two points and its representation on the Hilbert space $\mathbb{C}^2$. Show that the universal differential calculus is isomorphic to the calculus given by the operator
$$F = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$

4 The pleasures of geometry

"Geometry is the only science that it hath pleased God hitherto to bestow on mankind." (Thomas Hobbes)

So far we have extended one geometric notion: that of differential algebras and differential forms. We still have many tasks ahead of us, at least from the practical point of view of applications to physics. We need to understand the noncommutative generalization of vector bundles (so we shall come back to the notion of vector fields in the end), connections on them, and integration; last but not least, we need to recognize whether our constructions fall into the same classes from the topological point of view.

4.1 Projective modules

In the same manner as we have dealt with spaces, which we replaced by (suitable) algebras, we shall tackle vector bundles. So, instead of taking the vector bundle itself, we take the linear space of all its sections. What is the structure of that space? Clearly, it is not an algebra, but since each section might be multiplied by a function from the algebra, we have the structure of a module.

Definition 4.1: $\mathcal{E}$ is a left module over the algebra $\mathcal{A}$ if it is a linear space and there exists an associative action
$$\mathcal{A}\times\mathcal{E}\ni(a,m)\mapsto am\in\mathcal{E}$$
which satisfies
$$a(m_1+m_2)=am_1+am_2, \qquad a(bm)=(ab)m,$$
for all $a,b\in\mathcal{A}$ and $m,m_1,m_2\in\mathcal{E}$.

However, for a vector bundle we have the condition of local triviality, which is an essential ingredient. Cutting the story short, it can also be nicely translated into the language of modules, based on the crucial result of Serre-Swan. First we need a definition:

Definition 4.2: The module $\mathcal{E}$ over an algebra $\mathcal{A}$ is projective if there exists a module $\mathcal{E}'$ such that $\mathcal{E}\oplus\mathcal{E}'\cong\mathcal{A}^n$ for some $n\geq 0$. The module is said to be finitely generated if there exists a finite number of elements $m_1,\ldots,m_k$ such that
$$\mathcal{E} = \Big\{\textstyle\sum_{i=1}^{k} a_i m_i \;:\; a_i\in\mathcal{A}\Big\}.$$

Then we have the Serre-Swan equivalence:

Theorem 4.3: The continuous sections of a vector bundle over a manifold form a finitely generated projective module over the algebra of continuous functions on the manifold. In turn, every finitely generated projective module over a commutative algebra of continuous functions is of that form.

Due to this theorem we have another straightforward generalization: sections of vector bundles are just elements of projective modules! Note that there are, of course, several equivalent but different definitions of projectivity in addition to 4.2.

Exercise 4.4: Let $\mathcal{E}$ be a left module over $\mathcal{A}$. Show that if every surjective module morphism $\rho:\mathcal{N}\to\mathcal{E}$ splits, for any left module $\mathcal{N}$ (which means that there exists a morphism $\sigma:\mathcal{E}\to\mathcal{N}$ such that $\rho\circ\sigma=\mathrm{id}_\mathcal{E}$), then the module $\mathcal{E}$ is projective.
Another equivalent statement is that for any surjective module morphism $\rho:\mathcal{M}\to\mathcal{N}$, every homomorphism $\phi:\mathcal{E}\to\mathcal{N}$ can be lifted to a homomorphism $\tilde\phi:\mathcal{E}\to\mathcal{M}$ such that $\phi = \rho\circ\tilde\phi$.

Why do we not play with projective modules which are infinitely generated? Certainly, a good reason is that they do not correspond to locally trivial bundles. Moreover, the world of infinitely generated modules is a strange one. Look, for instance, at the so-called Eilenberg swindle. Take a projective module $\mathcal{E}$ and an infinitely generated free module $\mathcal{F}$. Then, if $\mathcal{E}'$ is the completion of $\mathcal{E}$ to a free module, we have
$$\mathcal{E}\oplus\mathcal{F} \cong \mathcal{E}\oplus(\mathcal{E}'\oplus\mathcal{E})\oplus(\mathcal{E}'\oplus\mathcal{E})\oplus\cdots \cong (\mathcal{E}\oplus\mathcal{E}')\oplus(\mathcal{E}\oplus\mathcal{E}')\oplus\cdots \cong \mathcal{F},$$
and hence
$$\mathcal{E}\oplus\mathcal{F} \cong \mathcal{F}.$$

In the next step we need to learn a bit more: how to construct projective modules, and how to distinguish between different projective modules!

4.1.1 Modules and projections

Starting with an algebra $\mathcal{A}$ we already know how to construct a certain class of projective modules: the free modules $\mathcal{A}^n$. Now, imagine we have a projection $p\in M_n(\mathcal{A})$, that is, an $n\times n$ matrix with entries from the algebra $\mathcal{A}$ such that $p^2=p$.

Lemma 4.5: Let $p\in M_n(\mathcal{A})$ be a projection. Let $\mathcal{E}_p$ be defined as the subspace of all elements $m\in\mathcal{A}^n$ such that $mp=m$:
$$m_j = \sum_{i=1}^{n} m_i\,p_{ij},$$
where we have explicitly written $m\in\mathcal{A}^n$ as a collection of elements of $\mathcal{A}$, $m=\{m_1,\ldots,m_n\}$, and denoted by $p_{ij}$ the entries of the matrix $p$. Then $\mathcal{E}_p$ is a finitely generated projective module.

Proof: We need to show that $\mathcal{E}_p$ is indeed a left module over $\mathcal{A}$. This, however, follows immediately from the definition: if $m$ satisfies $mp=m$, so does $am$ (as the multiplication in $\mathcal{A}$ is associative). If $e_i$ denotes the basis of $\mathcal{A}^n$ (an element with zeroes at all entries apart from the $i$-th, where it has 1), then all elements of $\mathcal{E}_p$ are generated by $\{e_i p\}_{i=1,\ldots,n}$. Moreover, because $1-p$ is also a projection and $\mathcal{E}_p\oplus\mathcal{E}_{1-p}\cong\mathcal{A}^n$, the module $\mathcal{E}_p$ is projective. $\square$

Example 4.6: Take the algebra of functions over the sphere $S^2$ and the following projection in $M_2(C(S^2))$:
$$p = \frac12\begin{pmatrix} 1+\cos\theta & e^{i\phi}\sin\theta \\ e^{-i\phi}\sin\theta & 1-\cos\theta \end{pmatrix}.$$
The projective module $\mathcal{E}_p$ is nontrivial (that is, it is not free) and has a deep physical meaning: it is the projective module associated with the vector bundle of the magnetic monopole. The module for $1-p$ is then the antimonopole. (A symbolic check that $p^2=p$ is given below.)

The next example shows that we need to be careful about the algebra we take. In the commutative case this simply means that it does matter whether we take polynomials, smooth functions or continuous functions.

Example 4.7: Let $C(T^2)$ be the algebra of continuous functions over a torus. Consider the following projection:
$$p = \begin{pmatrix} f(\theta) & g(\theta)+h(\theta)\,e^{i\phi} \\ g(\theta)+h(\theta)\,e^{-i\phi} & 1-f(\theta) \end{pmatrix},$$
where the real-valued functions $f$, $g$, $h$ satisfy
$$gh = 0, \qquad g^2+h^2 = f-f^2,$$
and $0\leq\theta,\phi<2\pi$ denote the usual coordinates on the torus. Although the verification that $p$ is a projection is easy, we need to wait a while to show that it might correspond to a nontrivial line bundle (a 1-dimensional complex vector bundle) over the two-torus! Actually, we might propose a different projection:

Example 4.8: Let $C(T^2)$ be again the algebra of continuous functions over the torus.
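Checking that the monopole matrix of Example 4.6 really is a selfadjoint projection is a one-liner in sympy (our own verification sketch, not from the original text):

```python
import sympy as sp

th, ph = sp.symbols('theta phi', real=True)
p = sp.Rational(1, 2) * sp.Matrix([
    [1 + sp.cos(th),                  sp.exp(sp.I * ph) * sp.sin(th)],
    [sp.exp(-sp.I * ph) * sp.sin(th), 1 - sp.cos(th)],
])
print(sp.simplify(p * p - p))   # zero matrix: p^2 = p
print(sp.simplify(p - p.H))     # zero matrix: p is selfadjoint
```

The same three lines, applied to the matrix of Example 4.7 together with the constraints $gh=0$ and $g^2+h^2=f-f^2$, verify that projection as well.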
Let us take the following projection $\tilde p$: an explicit $2\times 2$ matrix whose entries are trigonometric polynomials in $e^{i\theta}$ and $e^{i\phi}$ (built from combinations of $\sin\theta$, $1\pm\cos\theta$ and $e^{\pm i\phi}$), which (certainly) is not of the above form. Again, to see what the corresponding vector bundle is we need to wait a bit. An interesting observation is that while the projection $p$ might be smooth (provided that $f$, $g$, $h$ are smooth), $\tilde p$ is only continuous! The advantage is, however, that $\tilde p$ is close (of course, in a naive sense) to the matrix algebra over polynomials in $e^{i\theta}$ and $e^{i\phi}$.

The notion of equivalence of projective modules is a natural one, given through module isomorphisms. (To be more precise, the notion could be slightly relaxed to stable isomorphism: two modules are stably isomorphic if they are isomorphic after adding a free module.) But how can we tell whether two modules given by two different projections are equivalent? For this purpose we need a notion of equivalence of projections, which is due to Murray-von Neumann.

Definition 4.9: We say that the projections $p$ and $q$ are equivalent if there exists $u\in M_n(\mathcal{A})$ such that $u^*u = p$ and $uu^* = q$. (A toy numerical illustration is given below.)

Note that any projection $p\in M_n(\mathcal{A})$ can always be embedded in $M_{n'}(\mathcal{A})$, $n'\geq n$, by putting $p$ in the upper left corner and filling the remaining entries with zeroes. This means that the equivalence of projections needs to be understood in $M_\infty(\mathcal{A})$ (seen as the inductive limit of $M_n(\mathcal{A})$, $n\to\infty$).

To finish this section, let us mention what we can do with the newly established set of all equivalence classes of projections. First, a little help comes from the lemma:

Lemma 4.10: The space of equivalence classes of projections has the structure of a semigroup, with the addition:
$$[p] + [q] = \left[\begin{pmatrix} p & 0 \\ 0 & q \end{pmatrix}\right].$$

And thus we have met K-theory.

Definition 4.11: We define the $K_0$ group of the algebra $\mathcal{A}$ as the Grothendieck group associated with the semigroup $V(\mathcal{A})$. Formally, we speak of classes of pairs with the equivalence relation:
$$([p],[q]) \sim ([p'],[q']) \;\Leftrightarrow\; \exists\,[r]:\; [p]+[q']+[r] = [p']+[q]+[r].$$

The origins of K-theory lie in the classification problems of (real) vector bundles over manifolds. The construction of the $K_0$-group in the seminal work of Atiyah was the breakthrough of topological K-theory; this, together with the Serre-Swan theorem, which formulates the equivalence between vector bundles and finitely generated projective modules over commutative algebras, made it possible to push the theory into the C*-algebraic setup. We have been very sloppy here on the details, for example on whether we use arbitrary algebras or C*-algebras. For the purpose of the first K-theory group, $K_0$, this does not really matter (in the sense that for C*-algebras the algebraic K-theory we defined is the same as the topological K-theory). The difference arises later, when one turns to higher K-groups, but graciously we shall not touch this topic here. In principle, the K-theory of a C*-algebra is a functor from this category to the category of abelian groups: $K_0$ is defined through equivalence classes of projections, whereas $K_1$ is the $\pi_0$ of the $GL$ group of the algebra (the inductive limit of invertible matrices over the algebra). We skip here the details of the construction and its properties, which can be found in many textbooks. The interested reader is recommended to consult them (see the list at the end of the notes).
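As a toy illustration of Definition 4.9 (our own addition), one can build the partial isometry $u$ between two rank-one projections in $M_2(\mathbb{C})$ explicitly from orthonormal bases of their ranges:

```python
# Murray-von Neumann equivalence in M_2(C): two projections of equal rank
# are equivalent via u with u*u = p and uu* = q.
import numpy as np

p = np.array([[1.0, 0.0], [0.0, 0.0]])          # rank-one projection
q = 0.5 * np.array([[1.0, 1.0], [1.0, 1.0]])    # another rank-one projection

w = np.linalg.svd(p)[0][:, :1]   # unit vector spanning range(p)
v = np.linalg.svd(q)[0][:, :1]   # unit vector spanning range(q)
u = v @ w.T                      # partial isometry range(p) -> range(q)

print(np.allclose(u.T @ u, p))   # True: u*u = p
print(np.allclose(u @ u.T, q))   # True: uu* = q
```

Over a nontrivial algebra the same idea applies, but the entries of $u$ must themselves be algebra elements, which is where the topology of the underlying "space" enters.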
For us, there are two important things. First of all, K-theory can be calculated (thanks to advanced tools such as excision and the connecting morphism between $K_0$ and $K_1$). Furthermore, it provides important information about the algebra itself, such as the existence and classification of nontrivial (noncommutative) vector bundles. What is also significant is that K-theory actually depends only on a dense (and stable under holomorphic functional calculus) subalgebra; that is, in the commutative case one might work with continuous as well as with smooth functions, and the K-theory does not change.

4.2 Connections on projective modules

Finally we can use what we have already learned about both differential algebras and projective modules to construct a new object (known from differential geometry, of course): a connection. In differential geometry we know that the theory of connections on vector bundles is equivalent to that of connections on principal fibre bundles. In noncommutative geometry we have all the necessary tools, so we can start the theory in just the same way. Such objects are interesting from many points of view. First of all, they provide an element of the theory that enables some practical calculations of noncommutative Chern characters; this links K-theory with noncommutative differential forms. From the point of view of physics, connections are just a tool for gauge theory and gravity. Therefore we can view this as a step towards the notion of noncommutative gauge theory! Although the use of connections appears in many places, including Connes and Rieffel's seminal work on the noncommutative torus [24], it is worth mentioning that the systematic analysis and the proof of the relation between projectivity and the existence of universal connections is a great result of Cuntz and Quillen [33].

Let us start with a definition and some interesting properties.

Definition 4.12: Let $\mathcal{E}$ be a left projective module over an algebra $\mathcal{A}$ and let $\Omega^*(\mathcal{A})$ be a DGA over $\mathcal{A}$. A connection $\nabla$ on $\mathcal{E}$ is a linear map
$$\nabla:\;\mathcal{E}\to\Omega^1(\mathcal{A})\otimes_\mathcal{A}\mathcal{E},$$
such that
$$\nabla(a\,m) = a\,\nabla(m) + da\otimes m, \qquad m\in\mathcal{E},\ a\in\mathcal{A}.$$

Before we proceed further, let us pose the question of existence. Evidently, if $\mathcal{E}$ is a left projective module then we have a projection $p\in M_n(\mathcal{A})$ such that $\mathcal{E}\cong\mathcal{A}^n p$. Then one can always construct a canonical (so-called Grassmannian) connection. On a free module it is a trivial exercise to set
$$\nabla(a) = da, \qquad a\in\mathcal{A}^n,$$
where the expression on the right-hand side is understood as an element of $\Omega^1(\mathcal{A})^n$. Then, using the inclusion of the projective module into $\mathcal{A}^n$, we construct the Grassmannian connection as the composition of the connection on the free module with the projection:
$$\nabla(m) = (dm)\,p, \qquad m\in\mathcal{A}^n p.$$
(See the sketch below.)

It is a nice feature that the existence of universal connections, that is, connections related to the universal differential algebra, is actually equivalent to the projectivity of the module. Before we define the curvature of a connection, let us observe that the space of connections is an affine space: every two connections differ by an element of $\mathrm{End}_\mathcal{A}\big(\mathcal{E},\Omega^1(\mathcal{A})\otimes_\mathcal{A}\mathcal{E}\big)$. The connection is also easily extended to a general degree-one linear map satisfying the graded Leibniz rule on $\Omega^*(\mathcal{A})\otimes_\mathcal{A}\mathcal{E}$. The difference of two connections is always an $\Omega^*(\mathcal{A})$-module homomorphism. In particular, for a free module the connection is always of the form
$$\nabla(a) = da + \alpha\,a,$$
where $\alpha\in M_n\big(\Omega^1(\mathcal{A})\big)$ can be understood as a matrix of one-forms over $\mathcal{A}$.
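As a concrete sketch of the Grassmannian connection (our own addition), one can verify the Leibniz rule for $\nabla m = (dm)\,p$ on the monopole module of Example 4.6; one-forms are stored as pairs of $d\theta$- and $d\phi$-coefficients, a bookkeeping choice of ours:

```python
import sympy as sp

th, ph = sp.symbols('theta phi', real=True)
p = sp.Rational(1, 2) * sp.Matrix([
    [1 + sp.cos(th),                  sp.exp(sp.I * ph) * sp.sin(th)],
    [sp.exp(-sp.I * ph) * sp.sin(th), 1 - sp.cos(th)],
])

m0 = sp.Matrix([[sp.sin(th), sp.cos(th) * sp.exp(sp.I * ph)]])  # sample row vector
m = m0 * p                         # forces m p = m, i.e. m lies in A^2 p
a = sp.cos(th) + sp.sin(ph)        # a sample function in the algebra

nabla = lambda x: (sp.diff(x, th) * p, sp.diff(x, ph) * p)   # (dm) p componentwise

lhs = nabla(a * m)
rhs = (sp.diff(a, th) * m + a * nabla(m)[0],
       sp.diff(a, ph) * m + a * nabla(m)[1])
print(sp.simplify(lhs[0] - rhs[0]))   # zero row: Leibniz rule holds
print(sp.simplify(lhs[1] - rhs[1]))   # zero row
```

The cancellation uses exactly $mp=m$ (equivalently $p^2=p$), which is why the Grassmannian connection only exists on the projective submodule, not on arbitrary subspaces of $\mathcal{A}^n$.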
The curvature of a connection is its square, $\nabla^2$, which is a degree-two $\mathcal{A}$-linear module endomorphism of $\Omega^*(\mathcal{A})\otimes_\mathcal{A}\mathcal{E}$. For a free module we have
$$\nabla^2(a) = (d\alpha + \alpha\wedge\alpha)\,a,$$
which is a well-known formula from gauge theory, but now applicable to a general class of noncommutative objects!

We shall finish this short review of connections by mentioning that in case the module $\mathcal{E}$ has an $\mathcal{A}$-valued Hermitian inner product, one might require that the connection is Hermitian:
$$\langle\nabla m, m'\rangle + \langle m, \nabla m'\rangle = d\,\langle m, m'\rangle,$$
where the left-hand side uses the natural extension of the inner product to tensor products with one-forms. In the canonical example of the free module (and the canonical inner product) this translates to the requirement $\alpha^* = -\alpha$ (as a matrix of differential forms).

To make the picture a bit more comprehensible, let us study an important example.

Example 4.13: Take the algebra $\mathcal{A}$ of complex-valued functions on the space of two points. It has (of course) only free modules, and we take just the simplest one, $\mathcal{A}$ itself, equipped with the standard Hermitian product arising from complex conjugation and the product in $\mathcal{A}$. Let us consider all Hermitian connections with respect to the universal differential calculus. We have, for $a\in\mathcal{A}$ (with $\mathcal{A}$ seen as a left module over itself),
$$\nabla(a) = da + \hat\alpha\,a,$$
where $\hat\alpha$ is an arbitrary antiselfadjoint one-form, $\hat\alpha = \varphi\,de$, with $\varphi$ a function on the two points. Recalling the rules of the differential calculus (see Examples 3.6 and 3.12), the antiselfadjointness of $\hat\alpha$ means that $\varphi$ is determined by a single complex number $\phi$. The curvature of the connection $\nabla$ is
$$\nabla^2(a) = (d\hat\alpha + \hat\alpha\,\hat\alpha)\,a,$$
which, rewritten in the language of the form $de$ and the number $\phi$, gives
$$\nabla^2(a) = (\phi + \phi^* + \phi^*\phi)\;de\,de\;a.$$
Next, by introducing $h = 1+\phi$, we obtain
$$\nabla^2(a) = (hh^* - 1)\;de\,de\;a.$$
Suppose now that we think of $h$ as a gauge connection field and calculate the square of the curvature. Integrating this square over the space (two points) we get
$$F^2 = \frac12\,(hh^* - 1)^2.$$
If we construct a weird type of Kaluza-Klein theory, with the base being the usual spacetime and the extra space being the two points, we might interpret this as an action for a field $h(x)$ which arises from the discrete geometry of two points. It is a surprise that this type of action is well known and has a very important role in physics: it is the action for the Higgs field!

5 Cycles and cyclic cohomology

"Most of the fundamental ideas of science are essentially simple, and may, as a rule, be expressed in a language comprehensible to everyone." (Albert Einstein)

Let us recall that the classical theory of connections on vector bundles leads to the notion of characteristic classes. The square of the connection, the curvature, which is a differential two-form valued in the algebra of endomorphisms of the vector bundle, is the basic building block of the theory. In the noncommutative setup we have almost the same possibility, with the exception that the differential calculi are plentiful, and therefore some of them (in particular the universal differential calculus) may carry no cohomological information that can be used to construct characteristic classes. However, if we accept this approach we shall have no general principle in the theory: every single case needs to be studied separately! Moreover, it is rather unrealistic to study all possible differential calculi that might bring some new data.
Again, the solution to the problem lies in the approach: we need to introduce a more general notion, which replaces de Rham cohomology in the noncommutative setup.

Definition 5.1: Consider $C^n(\mathcal{A})$, the space of $(n+1)$-linear maps from $\mathcal{A}^{\otimes(n+1)}$ to $\mathbb{C}$, and the linear map $b: C^n(\mathcal{A})\to C^{n+1}(\mathcal{A})$:
$$(b\varphi)(a_0,a_1,\ldots,a_{n+1}) := \sum_{j=0}^{n}(-1)^j\,\varphi(a_0,\ldots,a_j a_{j+1},\ldots,a_{n+1}) \;+\; (-1)^{n+1}\,\varphi(a_{n+1}a_0,a_1,\ldots,a_n),$$
where $\varphi$ is an $(n+1)$-linear functional and $a_0,a_1,\ldots,a_{n+1}\in\mathcal{A}$. Then $b^2=0$, and we can define the cohomology of the complex $\{C^n,b\}_{n\in\mathbb{N}}$, which is called the Hochschild cohomology of $\mathcal{A}$ and denoted $HH^*(\mathcal{A})$:
$$HH^n(\mathcal{A}) = \frac{\ker b|_{C^n}}{\mathrm{im}\,b|_{C^{n-1}}}.$$
(A numerical check that $b^2=0$ is sketched below, after Example 5.5.)

Example 5.2: Let us see what the zeroth Hochschild cohomology group is for any algebra $\mathcal{A}$. From the definition, it consists of all linear functionals $\varphi$ which obey
$$(b\varphi)(a_0,a_1) = \varphi(a_0a_1) - \varphi(a_1a_0) = 0.$$
Therefore $HH^0(\mathcal{A})$ is nothing else but the linear space of all traces on the algebra $\mathcal{A}$! Higher Hochschild cohomology groups are more difficult to calculate (but still calculable in most examples). For the purpose of making contact with commutative differential geometry, let us quote:

Theorem 5.3: For the algebra of smooth functions on a manifold $M$, the continuous Hochschild cohomology group $HH^k(C^\infty(M))$ is canonically isomorphic to the space of de Rham currents on the manifold $M$ (which are continuous linear functionals on the space of de Rham forms). For the proof (and for more details) we refer to the seminal work of Connes [23].

So, we have a space which corresponds (roughly) to the differential forms (although it is not a differential algebra)! What we need is to find a subcomplex which, in the commutative situation, would give us some information about the de Rham cohomology.

Definition 5.4: Consider $C^n_\lambda(\mathcal{A})$, the space of those $(n+1)$-linear maps from $\mathcal{A}^{\otimes(n+1)}$ to $\mathbb{C}$ which are cyclic:
$$\varphi(a_0,a_1,\ldots,a_n) = (-1)^n\,\varphi(a_n,a_0,a_1,\ldots,a_{n-1}),$$
and the linear map $b: C^n_\lambda(\mathcal{A})\to C^{n+1}_\lambda(\mathcal{A})$, which is the restriction of the coboundary $b$ defined above. The cohomology of the cochain complex $(C^n_\lambda,b)_{n\in\mathbb{N}}$ is the cyclic cohomology of $\mathcal{A}$, denoted $HC^*(\mathcal{A})$.

Example 5.5: Let $M$ be a manifold of dimension $n$ and $\Omega_{dR}(M)$ the differential algebra of de Rham forms over $M$. For smooth functions $a_0,a_1,\ldots,a_n$ we define
$$\varphi(a_0,a_1,\ldots,a_n) := \int_M a_0\,da_1\wedge\cdots\wedge da_n,$$
where $\int_M$ is the standard integral on the manifold. Then $\varphi$ is a cyclic cocycle of dimension $n$. The cyclicity of $\varphi$ follows from the Leibniz rule and the fact that the wedge product of forms is antisymmetric. We check explicitly that $\varphi$ is a Hochschild cocycle:
$$(b\varphi)(a_0,\ldots,a_{n+1}) = \int_M a_0a_1\,da_2\wedge\cdots\wedge da_{n+1} - \int_M a_0\,d(a_1a_2)\wedge da_3\wedge\cdots\wedge da_{n+1} + \cdots + (-1)^{n+1}\int_M a_{n+1}a_0\,da_1\wedge\cdots\wedge da_n.$$
Expanding each $d(a_ja_{j+1}) = (da_j)\,a_{j+1} + a_j\,da_{j+1}$ by the Leibniz rule, the consecutive terms cancel pairwise, and we are left with
$$(-1)^n\int_M a_0\,da_1\wedge\cdots\wedge da_{n-1}\wedge (da_n)\,a_{n+1} \;+\; (-1)^{n+1}\int_M a_{n+1}a_0\,da_1\wedge\cdots\wedge da_n = 0,$$
since functions commute with forms.

A cycle is a noncommutative generalization of what we have in differential geometry (as presented earlier): functions and forms, together with the integral and the (inevitable) Stokes theorem. However, it is not at all that easy, and if one goes noncommutative then, in general, there is no default construction of such a structure.
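The property $b^2=0$ claimed in Definition 5.1 can be tested numerically. The following numpy sketch (our own illustration) implements $b$ for cochains over the matrix algebra $M_3(\mathbb{C})$ and applies it twice to a randomly chosen 1-cochain:

```python
import numpy as np

rng = np.random.default_rng(1)
X, Y = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))
phi = lambda a0, a1: np.trace(X @ a0 @ Y @ a1)   # a generic 1-cochain

def b(psi, n):
    """Hochschild coboundary of an n-cochain psi (a function of n+1 matrices)."""
    def bpsi(*a):   # takes n+2 matrices
        total = 0.0
        for j in range(n + 1):
            args = a[:j] + (a[j] @ a[j + 1],) + a[j + 2:]
            total += (-1) ** j * psi(*args)
        total += (-1) ** (n + 1) * psi(a[-1] @ a[0], *a[1:-1])
        return total
    return bpsi

a = [rng.standard_normal((3, 3)) for _ in range(4)]
print(abs(b(b(phi, 1), 2)(*a)) < 1e-10)   # True: (b b phi) vanishes
```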
Definition 5.6: A cycle of dimension $n$ over $\mathcal{A}$ is a graded differential algebra $\Omega$ over $\mathcal{A}$, together with a closed graded trace $\int:\Omega^n\to\mathbb{C}$:
$$\int \omega\,\rho = (-1)^{\deg\omega\,\deg\rho}\int \rho\,\omega, \qquad \int d\omega = 0, \quad \omega\in\Omega^{n-1}.$$

Using cycles one might easily construct cyclic cocycles.

Lemma 5.7: Each cycle of dimension $n$ over the algebra $\mathcal{A}$ defines the class of a cyclic cocycle in $HC^n(\mathcal{A})$:
$$\varphi(a_0,a_1,\ldots,a_n) = \int i(a_0)\;d\,i(a_1)\cdots d\,i(a_n),$$
where $i:\mathcal{A}\to\Omega^0$ is the inclusion. For the proof see [5], Proposition 4, p. 186.

Recall that we have met a very nice prescription for the construction of differential graded algebras through a representation and commutators with an operator $F$, $F^2=1$. Can we get a cycle in this way? The answer is positive, provided that we are ready to add a couple of additional features.

Definition 5.8: If $\mathcal{A}$ is an algebra, $\pi$ its representation as bounded operators on a Hilbert space, and $F$ a selfadjoint operator such that $F^2=1$ and for every $a\in\mathcal{A}$ the commutator $[F,\pi(a)]$ is compact, then we call $(\mathcal{A},\pi,F)$ a Fredholm module over $\mathcal{A}$. We say that the Fredholm module is even if there exists an operator $\gamma=\gamma^\dagger$, $\gamma^2=1$, which commutes with the representation $\pi$ and anticommutes with $F$.

Actually, we have already met an odd Fredholm module: the "strange" differential algebra for the functions on the circle in Example 3.16, where we showed that all commutators with the algebra elements are compact! To define the graded trace on the cycle we need to know something about the summability of the Fredholm module, i.e. we need to assume that the products of commutators $[F,\pi(a)]$ fall into the ideal of trace class operators. Trace class operators are operators such that the series of their eigenvalues is absolutely summable. More precisely, we say that the Fredholm module is $(p+1)$-summable if for any $p+1$ elements the product of compact operators
$$[F,a_1][F,a_2]\cdots[F,a_{p+1}]$$
is an operator of trace class. Then we have:

Lemma 5.9: For an $(n+1)$-summable Fredholm module, a closed graded trace of dimension $n$ is given by
$$\int\omega = \frac12\,\mathrm{tr}\big(F\,(F\omega+\omega F)\big)$$
in the odd case, and for even Fredholm modules by
$$\int\omega = \frac12\,\mathrm{tr}\big(\gamma\,F\,(F\omega+\omega F)\big).$$

As a corollary we have:

Corollary 5.10: Each $(n+1)$-summable Fredholm module gives rise to an $n$-cyclic cocycle.

To exploit the relations with K-theory and formulate the Chern map using connections, one needs an additional ingredient. This is cyclic cohomology (or rather a version of it, the so-called periodic cyclic cohomology), which is the natural receptor for the Chern map. We will see how differential graded algebras can help us to construct cyclic cohomology elements.

Example 5.11: Recall again the construction of the differential algebra for the circle, with the operator $F$, $F^2=1$, from Example 3.16. The commutators of $F$ with the elements of the algebra are compact, and in particular the commutators with polynomials are finite rank operators. Therefore they are trace class, and we can easily define a 1-cyclic cocycle on the polynomial algebra:
$$\varphi(a_0,a_1) = \frac12\,\mathrm{tr}\big(a_0[F,a_1] + F a_0[F,a_1]F\big) = \mathrm{tr}\big(a_0\,[F,a_1]\big),$$
the second equality holding because $[F,a_1]$ is trace class and $F^2=1$. We calculate this explicitly for homogeneous polynomials. First,
$$u^n\,[F,u^k]\;e_p = \Big(\mathrm{sign}\big(p+k+\tfrac12\big) - \mathrm{sign}\big(p+\tfrac12\big)\Big)\,e_{p+k+n}.$$
Therefore:
$$\mathrm{tr}\big(u^k[F,u^n]\big) = \begin{cases} 2n, & k+n=0,\\ 0 & \text{otherwise},\end{cases}$$
and hence
$$\varphi(u^k,u^n) = 2n\,\delta_{k+n,0}.$$
Let us verify that it is a cyclic cocycle. First of all, observe that it is cyclic:
$$\varphi(u^k,u^n) = 2n\,\delta_{k+n,0} = -2k\,\delta_{n+k,0} = -\varphi(u^n,u^k).$$
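The values $\varphi(u^k,u^n)=2n\,\delta_{k+n,0}$ can be reproduced on a truncated basis; since $[F,u^n]$ is supported near the sign change of $F$, the truncated trace is exact for small $k$, $n$. A small numpy sketch of ours:

```python
import numpy as np

N = 20
ns = np.arange(-N, N + 1)
U = np.eye(len(ns), k=-1)            # u e_n = e_{n+1}
F = np.diag(np.sign(ns + 0.5))

def power(k):
    return np.linalg.matrix_power(U if k >= 0 else U.T, abs(k))

def phi(k, n):
    Un = power(n)
    return np.trace(power(k) @ (F @ Un - Un @ F))

print(phi(-3, 3), phi(-5, 5), phi(2, 3))   # 6.0 10.0 0.0
print(phi(-3, 3) == -phi(3, -3))           # True: cyclicity
```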
Furthermore:
$$(b\varphi)(u^k,u^m,u^n) = \varphi(u^{k+m},u^n) - \varphi(u^k,u^{m+n}) + \varphi(u^{n+k},u^m) = 2\big(n - (m+n) + m\big)\,\delta_{k+m+n,0} = 0.$$
Note that, up to a rescaling, we obtain the same cocycle as the one arising from classical de Rham forms and the standard integration:
$$\varphi_{dR}(u^k,u^n) = \int_{S^1} e^{ik\theta}\,d\big(e^{in\theta}\big) = 2\pi i\,n\,\delta_{k+n,0}.$$

Now we can turn to the truly noncommutative world.

Example 5.12: Do you remember the noncommutative torus, $uv = e^{2\pi i\theta}vu$? We shall construct a two-dimensional cycle over it. First, the differential forms. We take two generating one-forms $\chi_u$ and $\chi_v$, which are central. The external derivative becomes
$$da = \delta_1(a)\,\chi_u + \delta_2(a)\,\chi_v,$$
where $\delta_1$ and $\delta_2$ are the two outer derivations on the algebra of the noncommutative torus:
$$\delta_1(u^n v^m) = n\,u^n v^m, \qquad \delta_2(u^n v^m) = m\,u^n v^m.$$
The two-forms are generated by the wedge product of the one-forms:
$$\chi_u\wedge\chi_v = -\,\chi_v\wedge\chi_u.$$
For the trace we take
$$\int a\;\chi_u\wedge\chi_v = \mathrm{tr}(a),$$
where $\mathrm{tr}$ is the standard trace on the algebra, defined on the polynomials as
$$\mathrm{tr}\big(u^n v^m\big) = \delta_{n,0}\,\delta_{m,0}.$$
Clearly $\int$ is a graded trace (since the basic one-forms anticommute and $\mathrm{tr}$ is a trace), so we only need to show that it is closed:
$$\int d\big(u^n v^m\,\chi_u + u^k v^l\,\chi_v\big) = \mathrm{tr}\big(k\,u^k v^l - m\,u^n v^m\big) = k\,\delta_{k,0}\delta_{l,0} - m\,\delta_{n,0}\delta_{m,0} = 0.$$
The resulting two-cyclic cocycle is:
$$\varphi(a_0,a_1,a_2) = \int a_0\,da_1\,da_2 = \mathrm{tr}\Big(a_0\big(\delta_1(a_1)\,\delta_2(a_2) - \delta_2(a_1)\,\delta_1(a_2)\big)\Big).$$
We shall use this construction later, also in the case $\theta=0$, which is the usual commutative torus. (A small computational check of this cocycle is sketched below.)

5.1 Chern-Connes pairing

In this section we shall use cyclic cohomology to produce invariants of projective modules. We must be aware that in doing so we are cutting a really long story short; the details and further aspects of the theory are left for the intrigued reader to self-study.

Assume now we have an algebra $\mathcal{A}$. Let $p$ be a projection in $M_n(\mathcal{A})$ associated to a finitely generated projective module $\mathcal{E}_p$. On the other side, let $\varphi$ be an even cyclic cocycle over $\mathcal{A}$. Even without knowing much about the theory we can construct the following pairing:
$$\langle\varphi,p\rangle = \sum_{i_1,\ldots,i_{2k+1}} \varphi\big(p_{i_1 i_2},\,p_{i_2 i_3},\,\ldots,\,p_{i_{2k+1} i_1}\big), \qquad (2)$$
which (for brevity) we shall always write as $\varphi(p,p,\ldots,p)$. As such, it is only a number. However, the following theorem makes it a really important number:

Theorem 5.13: The pairing (2) depends only on the equivalence class of the projection $p$, and only on the class of the cocycle $\varphi$ within the cyclic cohomology group $HC^{\mathrm{even}}(\mathcal{A})$. For the proof we refer to Connes [5].

What have we gained? An extremely useful tool of noncommutative geometry, applicable to noncommutative topology: we have a very precise way of distinguishing different topological classes of projective modules. Let us test this knowledge on some examples.
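The cocycle of Example 5.12 is easy to realize in code. The following pure-Python sketch is our own addition: a polynomial on the noncommutative torus is stored as a dictionary $\{(m,n):\text{coeff}\}$ for the ordered monomials $u^m v^n$, and we check both the closedness of the trace and the Hochschild cocycle condition on random polynomials:

```python
import cmath, random

THETA = 0.3

def mul(x, y):
    """Product of ordered polynomials, using v^b u^c = e^{-2 pi i theta b c} u^c v^b."""
    out = {}
    for (a, b), cx in x.items():
        for (c, d), cy in y.items():
            key = (a + c, b + d)
            phase = cmath.exp(-2j * cmath.pi * THETA * b * c)
            out[key] = out.get(key, 0) + cx * cy * phase
    return out

d1 = lambda x: {k: k[0] * c for k, c in x.items()}   # delta_1(u^m v^n) = m u^m v^n
d2 = lambda x: {k: k[1] * c for k, c in x.items()}
tau = lambda x: x.get((0, 0), 0)                     # tr(u^m v^n) = delta_{m,0} delta_{n,0}

def phi(a0, a1, a2):
    w1, w2 = mul(d1(a1), d2(a2)), mul(d2(a1), d1(a2))
    diff = {k: w1.get(k, 0) - w2.get(k, 0) for k in set(w1) | set(w2)}
    return tau(mul(a0, diff))

def rand_poly():
    return {(random.randint(-2, 2), random.randint(-2, 2)): random.gauss(0, 1)
            for _ in range(3)}

random.seed(0)
a = [rand_poly() for _ in range(4)]
print(abs(tau(d1(a[0]))) < 1e-12, abs(tau(d2(a[0]))) < 1e-12)   # closedness
bphi = (phi(mul(a[0], a[1]), a[2], a[3]) - phi(a[0], mul(a[1], a[2]), a[3])
        + phi(a[0], a[1], mul(a[2], a[3])) - phi(mul(a[3], a[0]), a[1], a[2]))
print(abs(bphi) < 1e-10)   # True: b phi = 0
```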
Example 5.14: Let us take the two-dimensional sphere and the projection of the magnetic monopole from Example 4.6. Taking the standard cyclic cocycle that arises from de Rham differential forms and the standard integration, we compute $\mathrm{tr}(p\,dp\wedge dp)$. A direct calculation with the matrix of Example 4.6 gives
$$\mathrm{tr}\big(p\,dp\wedge dp\big) = -\frac{i}{2}\,\sin\theta\;d\theta\wedge d\phi.$$
The integral over $S^2$ of the trace of the above expression is
$$\int_0^{2\pi}\!\!d\phi\int_0^{\pi}\!d\theta\,\Big(-\frac{i}{2}\Big)\sin\theta = -2\pi i.$$
Since the trivial line bundle over the sphere $S^2$, given by the trivial projection $p=1$, has a trivial pairing with the two-cyclic cocycle (for the obvious reason that $p=1$ and hence $dp=0$), we have the proof (based, of course, on Theorem 5.13) that the projective module of the magnetic monopole is not trivial. (A symbolic computation of this integral follows below.)
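The whole computation of Example 5.14 can be delegated to sympy (a verification sketch we add here): compute the $d\theta\wedge d\phi$ component of $\mathrm{tr}(p\,dp\wedge dp)$, integrate it over the sphere, and normalize by $2\pi i$:

```python
import sympy as sp

th, ph = sp.symbols('theta phi', real=True)
p = sp.Rational(1, 2) * sp.Matrix([
    [1 + sp.cos(th),                  sp.exp(sp.I * ph) * sp.sin(th)],
    [sp.exp(-sp.I * ph) * sp.sin(th), 1 - sp.cos(th)],
])
pth, pph = p.diff(th), p.diff(ph)
integrand = sp.trigsimp((p * (pth * pph - pph * pth)).trace())
print(integrand)                                          # -I*sin(theta)/2
total = sp.integrate(integrand, (th, 0, sp.pi), (ph, 0, 2 * sp.pi))
print(sp.simplify(total / (2 * sp.pi * sp.I)))            # -1
```

The normalized result is an integer, the Chern number of the monopole line bundle, as the pairing theorem requires.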
Example 5.15: Let us take the commutative two-torus, with the cyclic cocycle given by de Rham forms and integration, starting with the smooth projection of Example 4.7 (writing now $t$ for $\theta$ and $s$ for $\phi$). Skipping the easy calculation, $\mathrm{tr}(p\,dp\wedge dp)$ splits into a term built from $f$, $g$ and their derivatives and a term proportional to $\cos(s)$ built from $f$, $g$, $h$. Since the integral of $\cos(s)$ vanishes, only the first component contributes to the pairing. Integrating by parts, and using the fact that all functions are smooth and periodic, the pairing reduces to
$$12\pi i\int_0^{2\pi}\!dt\; f'(t)\,g^2(t).$$
Recall that the functions $f$, $g$, $h$ must obey
$$g(t)\,h(t) = 0, \qquad g(t)^2 + h(t)^2 = f(t) - f(t)^2.$$
Then
$$12\pi i\int_0^{2\pi}\!dt\; f'(t)\,g^2(t) = 12\pi i\int_{\mathrm{supp}\,g}\!dt\; f'(t)\big(f(t)-f(t)^2\big) = 2\pi i\sum_i\Big[\,3f^2-2f^3\,\Big]_{x_{2i-1}}^{x_{2i}},$$
where the points $x_i$ are such that
$$\mathrm{supp}\,g = (x_1,x_2)\cup(x_3,x_4)\cup\cdots\cup(x_{2n-1},x_{2n}) = \{t : g(t)\neq 0\} \subset [0,2\pi).$$
Since the points $x_i$ are the boundary points of the set where $g$ (and thus $h$) does not vanish, by continuity both $g$ and $h$ must vanish at each $x_i$. So $f$ satisfies $f(x_i)^2 = f(x_i)$ there, and is therefore either 0 or 1 at each such point. Hence we introduce
$$\epsilon_j = 3f(x_j)^2 - 2f(x_j)^3 \in \{0,1\},$$
and the pairing becomes
$$2\pi i\sum_{j=1}^{2n}(-1)^j\,\epsilon_j.$$
Since $f$ is periodic, $f(0)=f(2\pi)$, and the points $0$ and $2\pi$ cancel each other in the sum. Therefore, independently of the choice of $f$, the result is an integral multiple of $2\pi i$!

Example 5.16: Let us turn to the second presentation of the (supposedly) nontrivial projection on the torus. We use the cyclic cocycle as derived in Example 5.12, keeping in mind that the functions that we use in the projection are in fact only continuous and are not differentiable at one point ($t=0$, identified with $t=1$). We calculate the pairing $\varphi(\tilde p,\tilde p,\tilde p)$ (again skipping the tedious but uncomplicated algebraic manipulations), obtaining a non-zero multiple of $2\pi i$, where we have taken the non-normalized trace coming from the integration, $\mathrm{tr}\,1 = (2\pi)^2$. Again, we obtain a result different from zero, which ensures that the projective module is nontrivial. Shall we worry that the projection was not actually a smooth one? Yes, but only a little bit. Note that each function in the matrix elements of the projection has a well-defined left derivative at each point. Even though this derivative is not continuous at one point, we can multiply functions with such discontinuities without problems. Since the integration is also well-defined for such functions, we see that no significant problems arise.

Finally, we turn to a very nontrivial (and very noncommutative) example.

Example 5.17: Consider the following family of projective (and smooth) modules over the noncommutative torus:
$$\mathcal{E}_{p,q} = \mathcal{S}(\mathbb{R})\otimes\mathbb{C}^q, \qquad q\in\mathbb{N},$$
where $\mathcal{S}(\mathbb{R})$ denotes the space of Schwartz functions (rapidly decreasing functions on $\mathbb{R}$), with the module structure
$$(u\,\xi)(s) = e^{2\pi i s}\,\big(W_u\,\xi\big)(s), \qquad (v\,\xi)(s) = \big(W_v\,\xi\big)\!\left(s-\theta-\frac{p}{q}\right),$$
where $W_u$, $W_v$ are matrices satisfying $W_u W_v = e^{2\pi i p/q}\,W_v W_u$. It is not evident that these modules are projective, and that every projective module over the NC torus is either free or of this form; the details are to be found in the papers by Rieffel and Connes [24]. Here we shall just use this knowledge to find the pairing between the standard two-cyclic cocycle over the noncommutative torus and the module. The drawback is that we do not know the projection, and although one can construct it (surprisingly enough, it is a projection in the algebra of the noncommutative torus itself!), it is sufficiently complicated to make this approach rather difficult. Instead, we shall construct a connection and its curvature over the module; then, using a function of the curvature (as in the case of characteristic classes), we shall recover the pairing. As the connection on the module we take
$$\big(\nabla\xi\big)(s) = \frac{2\pi i\,q}{p - q\theta}\,s\,\xi(s)\;\chi_u + \frac{d\xi}{ds}(s)\;\chi_v,$$
where we use the forms $\chi_u$, $\chi_v$ introduced earlier. The verification that $\nabla$ is a connection is explicit. The curvature is
$$F = \nabla^2 = \frac{2\pi i\,q}{p - q\theta}\;\chi_u\wedge\chi_v.$$
The pairing, in this two-dimensional case, is given by applying the closed graded trace in the differential algebra to the trace of the curvature two-form (remember that the curvature is a two-form with values in the endomorphisms of the projective module):
$$\frac{1}{2\pi i}\int \mathrm{tr}(F) = \frac{q}{p - q\theta}\,\mathrm{tr}\big(\mathrm{id}_{\mathcal{E}}\big), \qquad \mathrm{tr}\big(\mathrm{id}_{\mathcal{E}}\big) = p - q\theta,$$
and we get the integer $q$ as the value of the pairing. Here we have smuggled in the information about the dimension of the projective module (or whatever we might call it), which could be defined as the trace of the identity endomorphism. In fact, one may use the pairing and some facts about its integrality (which we do not mention in these notes) to get this value. Nevertheless, although we are dealing with a very strange family of projective modules in a purely noncommutative setup, we can still say that we can distinguish between those which are not equivalent to each other, with the help of the Chern-Connes pairing.

5.2 Summary

We may now summarize all the tools and constructions that we have learned in this crash course on noncommutative geometry. We just extend the dictionary which we constructed first for noncommutative topology:

geometry                          | algebra
vector bundles                    | finitely generated projective modules
differential forms                | differential graded algebra
integration of differential forms | closed graded trace on the DGA
simplicial (de Rham) homology     | cyclic homology
infinitesimals                    | compact operators
integral                          | trace (exotic trace)
connection on a vector bundle     | connection on a projective module
characteristic classes            | Chern-Connes character

6 Spectral geometry and its applications

"Ubi materia, ibi geometria."
("Where there is matter, there is geometry.") (Johannes Kepler)

In this last part of the lectures we shall use (and probably overuse) the word spectral. Its sense will be described in the definition of the properties of spectral triples, a concise proposal for noncommutative spin manifolds. The clue is that (almost) everything is set by the Dirac operator, which, in turn, is defined through its set of discrete eigenvalues with multiplicities. We briefly touch the main proposition which links the theory with physics: the construction of gauge theories and the spectral action principle.

6.1 Enter: the Dirac operator

In the example of the differential graded algebra on the circle we tested a peculiar unbounded operator $D$ (Example 3.18). Of course, this was not a coincidence, and the story could be repeated for any compact spin manifold. Taking as the Hilbert space the square-summable sections of the spinor bundle, $L^2(M,S)$, and $D$ as the true Dirac operator (for a given Riemannian metric), we shall always (in a similar manner) recover the de Rham differential algebra. The Dirac operator on a compact spin manifold is indeed a very elegant object: an unbounded, selfadjoint operator, with a discrete spectrum and with the growth of its eigenvalues governed by the dimension of the manifold. The Dirac operator is also a very useful tool: it encodes a lot of topological and geometrical information about the manifold, in particular about the differential algebra and the metric. We shall mimic this construction in the noncommutative world, assuming that it is the basic datum that makes noncommutative geometry the geometry. But to do this we shall learn one more tool: the construction of exotic traces.

6.2 Exotic traces and residues

Let us start with a definition:

Definition 6.1: Assume we have a positive, compact operator $T$ on a Hilbert space, with eigenvalues $\lambda_i(T)$, $i=1,2,\ldots$ (in decreasing order). Suppose that the eigenvalues decrease to 0 so fast that the following expression makes sense:
$$\mathrm{tr}_\omega(T) = \lim_{n\to\infty}\frac{1}{\log n}\sum_{i=1}^{n}\lambda_i(T).$$
If it exists, then we call it a Dixmier trace.

Example 6.2: Let us take, for instance, the operator $(|D|+1)^{-1}$, where $D$ is the Dirac operator on the circle mentioned in the previous example. Since each positive integer (apart from 1) enters the spectrum of $|D|+1$ exactly twice, we have
$$\mathrm{tr}_\omega\big((|D|+1)^{-1}\big) = \lim_{n\to\infty}\frac{1}{\log n}\left(1+2\sum_{i=2}^{n}\frac{1}{i}\right) = \lim_{n\to\infty}\frac{2\big(\log n + O(1)\big)}{\log n} = 2,$$
where we have used the asymptotic form of the harmonic numbers, $H_n = \log n + \gamma + o(1)$, with $\gamma$ the Euler constant. (A numerical illustration follows below.)

Theorem 6.3: The space of all operators for which the Dixmier trace can be calculated is a two-sided ideal in the algebra of bounded operators; moreover, for $R$ bounded and $T$ in this ideal,
$$\mathrm{tr}_\omega(RT) = \mathrm{tr}_\omega(TR).$$
For the proof of the theorem, and also for the verification that the Dixmier trace is well defined and is indeed a trace, we refer to Connes' book [5].

Assume now that we have an unbounded operator $D$ such that $|D|^{-1}$ is compact (for simplicity we eliminate zeroes from the spectrum of $D$). For sufficiently large $s>0$ the operator $|D|^{-s}$ will be trace class, and thus the function
$$\zeta_D(s) = \mathrm{tr}\,|D|^{-s}$$
makes sense. Using the functional calculus on a Hilbert space we can define this function at least on the part of the complex plane where $\mathrm{Re}(s)$ is big enough.
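The limit of Example 6.2 can be watched numerically (our own illustration); the convergence is logarithmically slow, as expected from the harmonic-number asymptotics:

```python
import numpy as np

N = 10**6
eigs = np.sort(1.0 / (np.abs(np.arange(-N, N + 1)) + 1.0))[::-1]
partial = np.cumsum(eigs)
for n in (10**3, 10**5, 2 * 10**6):
    print(n, partial[n - 1] / np.log(n))   # 1.8..., 1.87..., 1.9... slowly -> 2
```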
Suppose now that we make an analytic continuation of the function $\zeta_D$ and that it extends to the entire complex plane, with the exception of several isolated points. Then at each of these points we can calculate the residue of $\zeta_D(s)$. We have:

Theorem 6.4: If $|D|^{-1}$ is an operator of Dixmier class, then
$$\mathrm{tr}_\omega\,|D|^{-1} = \mathrm{Res}_{s=1}\,\zeta_D(s).$$

Example 6.5: An elliptic differential operator on a manifold is just a special case of an unbounded operator. For instance, taking the Laplace operator $\Delta$ on a compact manifold of dimension $n$, it appears that $\Delta^{-n/2}$ is in the Dixmier class and its Dixmier trace is related to the Wodzicki residue (which is a functional on the space of differential operators, given explicitly by the integral of the principal symbol). As a specific example take (again) the 2-dimensional torus and its Laplace operator
$$\Delta = -\partial_\theta^2 - \partial_\phi^2,$$
where $0\leq\theta,\phi<2\pi$ are again the standard coordinates on the torus. The principal symbol of $\Delta^{-1}$ is constant on the sphere bundle, so the Wodzicki residue, which is (in two dimensions) its integral over the sphere bundle of the manifold, is
$$\mathrm{Wres}(\Delta^{-1}) = 2\pi\cdot 4\pi^2 = 8\pi^3.$$
To calculate the Dixmier trace we need to calculate the limit
$$\lim_{n\to\infty}\frac{1}{\log n}\sum_{\substack{(k,m)\neq(0,0)\\ k^2+m^2\leq\mu_n}}\frac{1}{k^2+m^2},$$
where $\mu_n$ is the $n$-th eigenvalue. Leaving the evaluation of the asymptotics of the sum as an exercise (see [2] for hints), let us state the result:
$$\mathrm{tr}_\omega(\Delta^{-1}) = \pi.$$
The ratio
$$\frac{\mathrm{Wres}(\Delta^{-1})}{\mathrm{tr}_\omega(\Delta^{-1})} = 2\,(2\pi)^2$$
is, in fact, universal in dimension 2.
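A similar numerical experiment (again our own sketch) illustrates Example 6.5; the convergence toward $\pi$ is even slower here, so only rough agreement should be expected:

```python
import numpy as np

M = 600
m, n = np.meshgrid(np.arange(-M, M + 1), np.arange(-M, M + 1))
mu = (m**2 + n**2).ravel().astype(float)
mu = np.sort(mu[(mu > 0) & (mu <= M**2)])   # spectrum of Delta inside a disc, zero mode removed
partial = np.cumsum(1.0 / mu)
print(partial[-1] / np.log(len(mu)), np.pi)  # ~3.3 vs 3.14159 (logarithmic convergence)
```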
6.3 Spectral triples

Imagine we want to encompass everything we have learned in one compact definition. So, we work with a suitable algebra, which is a subalgebra of a C*-algebra. It is represented (faithfully) on a Hilbert space. Further, we need a suitable definition of a differential algebra: here we just need to choose a suitable unbounded operator $D$ on the Hilbert space so that the differential one-forms, the commutators of $D$ with the elements of the algebra, remain bounded. The sign of $D$ will define for us a Fredholm module, and thus a cyclic cocycle. Moreover, a suitably chosen $D$ will allow us to introduce noncommutative integration through the residues of $\zeta_D$. So, we are ready for the formal definition:

Definition 6.6: Let us have an algebra $\mathcal{A}$, its faithful representation $\pi$ on a Hilbert space $\mathcal{H}$, and a selfadjoint unbounded operator $D$ with compact resolvent, such that
$$[D,\pi(a)]\in B(\mathcal{H}) \quad \text{for all } a\in\mathcal{A}.$$
Then we call $(\mathcal{A},\mathcal{H},D)$ a spectral triple.

Since the definition is very basic, we shall need (in most cases) some additional structures. We say that the spectral triple is even if there exists an operator $\gamma$ such that $\gamma=\gamma^\dagger$, $\gamma\pi(a)=\pi(a)\gamma$ and $\gamma D + D\gamma = 0$. We say that the spectral triple is finitely summable if the operator $|D|^{-1}$ has eigenvalues of the order $O(n^{-1/p})$ for some $p>0$. If the growth of the eigenvalues of $D$ is exactly of the order $n^{1/p}$, we say that the spectral triple is of metric dimension $p$. For such a triple we might introduce the noncommutative integral
$$\int T = \mathrm{Res}_{z=0}\,\mathrm{tr}\big(T\,|D|^{-z}\big).$$
This exists for all operators $T$ which are products of $\pi(a)$, powers of $D$, and their commutators with $D$ and $|D|$.

Definition 6.7: A real spectral triple of KO-dimension $n$ mod 8 is a spectral triple with an antilinear unitary operator $J$, $JJ^*=1$, such that
$$DJ = \epsilon\,JD, \qquad J^2 = \epsilon', \qquad J\gamma = \epsilon''\,\gamma J, \qquad (3)$$
and
$$\big[\pi(a),\,J\pi(b)J^{-1}\big] = 0, \qquad \big[[D,\pi(a)],\,J\pi(b)J^{-1}\big] = 0, \qquad (4)$$
for all $a,b\in\mathcal{A}$, where the signs $\epsilon,\epsilon',\epsilon''$ depend on the KO-dimension modulo 8 according to the following rules:

n mod 8 |  0 |  1 |  2 |  3 |  4 |  5 |  6 |  7
epsilon    |  1 | -1 |  1 |  1 |  1 | -1 |  1 |  1
epsilon'   |  1 |  1 | -1 | -1 | -1 | -1 |  1 |  1
epsilon''  |  1 |    | -1 |    |  1 |    | -1 |

(with $\epsilon''$ defined only in the even case). The first of the conditions (4) states that conjugation by $J$ maps the algebra to its commutant, whereas the second condition, the so-called order-one condition, states that the one-forms commute as well with the commutant of $\mathcal{A}$.

The following establishes (precisely) the relation of spectral triples to classical differential geometry (for the proof see [2]).

Theorem 6.8: If $\mathcal{A}=C^\infty(M)$, where $M$ is a compact Riemannian spin manifold, $S$ is a spinor bundle over $M$, $\mathcal{H}=L^2(M,S)$ (the square-summable sections of the spinor bundle) and $D$ is the Dirac operator on $M$, then $(\mathcal{A},\mathcal{H},D)$ is a spectral triple (with a real structure), of KO- and metric dimension $\dim(M)$.

Even more interesting is the reconstruction theorem, which states the converse:

Theorem 6.9: If $\mathcal{A}$ is a commutative algebra and $(\mathcal{A},\mathcal{H},D)$ is a spectral triple satisfying all the required conditions, then there is a compact spin manifold $M$ such that $\mathcal{A}\cong C^\infty(M)$, $\mathcal{H}=L^2(M,S)$, and $D$ is the Dirac operator on $M$.

A weaker version of the theorem is due to Gracia-Bondía & Várilly [2]; a proof of the full version was proposed by Várilly & Rennie [44], and then improved by Connes.

We did not present here all the further requirements for spectral triples. These include further algebraic conditions, like the existence of a certain Hochschild cycle which is mapped to $\gamma$ or 1 (depending on the dimension), or Poincaré duality in K-theory. There are also some very restrictive conditions, more of the "analysis" type. They ensure, for instance, that the algebra is smooth with respect to the derivation given by the commutators with $D$ and $|D|$, and that the suitable domain of definition of these operators on the Hilbert space is a projective module over this algebra. Although all of this plays a crucial role in the reconstruction theorem, it is not certain that the formulation of these requirements is in its final form in the noncommutative situation. In fact, some of the recent examples coming from q-deformed spaces can hardly meet these requirements, yet they have reasonable Dirac operators.

6.3.1 The use of spectral geometries and examples

If spectral geometries are applicable to noncommutative algebras, then we should learn how to extract the geometric data. We already know how to obtain a differential graded algebra. But how can we get the metric? It appears that a simple formula allows us to recover the distances on the manifold (and, in general, also to introduce a metric on the space of states):
$$d(x,y) = \sup\big\{\,|a(x)-a(y)| \;:\; a\in\mathcal{A},\ \|[D,a]\|\leq 1\,\big\}, \qquad (5)$$
where $x,y\in M$ and $a\in C(M)$. We already know that $\int a = \mathrm{tr}_\omega\big(a\,|D|^{-n}\big)$ is well-defined; once we are sure that the Dixmier trace of $|D|^{-n}$ exists, we can use this as the definition of the integral on the manifold. Summarizing: all the practical information of the geometry is encoded in the datum of the spectral triple.

Example 6.10: We might come back to the canonical example of two points. As the Dirac operator we take an arbitrary selfadjoint off-diagonal operator on $\mathbb{C}^2$. Clearly all the conditions of the spectral triple are fulfilled. Even more: we can construct a real spectral triple, of course of dimension 0. For this, we double the Hilbert space to $\mathbb{C}^2\oplus\mathbb{C}^2$.
It is now convenient to write every operator in a block form:
$$\pi(a) = \begin{pmatrix} a & 0 \\ 0 & a \end{pmatrix}, \qquad \gamma = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad J = \begin{pmatrix} 0 & J_0 \\ J_0 & 0 \end{pmatrix}, \qquad D = \begin{pmatrix} 0 & D_0 \\ D_0^\dagger & 0 \end{pmatrix},$$
where $J_0$ is the standard complex conjugation on $\mathbb{C}^2$. The condition $DJ=JD$ gives $D_0 = D_0^T$, whereas the order-one condition is satisfied for any $D$. Therefore the spectral triple is set by a symmetric two-by-two matrix $D_0$. Of course, since the diagonal entries do not contribute to the commutators, it is the off-diagonal term that matters and that fixes, for instance, the metric. Indeed, we might ask: what is the distance between these two points? Using formula (5), we need to calculate $\sup|f(+)-f(-)|$, where $+,-$ denote the two points of the space, over functions with $\|[D,f]\|\leq 1$. Take $f = z + we$, where $z,w\in\mathbb{C}$ and $e$ is the generating function of the algebra, $e^2=1$. We calculate:
$$[D,f] = \begin{pmatrix} 0 & w\,[D_0,e] \\ w\,[D_0^\dagger,e] & 0 \end{pmatrix}.$$
Next:
$$[D_0,e] = \begin{pmatrix} 0 & -2(D_0)_{12} \\ 2(D_0)_{12} & 0 \end{pmatrix},$$
where we have used that $(D_0)_{12}=(D_0)_{21}$ (the matrix $D_0$ is symmetric). The norm of $[D,f]$ satisfies
$$\|[D,f]\|^2 = 4\,|(D_0)_{12}|^2\,|w|^2.$$
For the function $f$ given above, $|f(+)-f(-)| = 2|w|$; therefore the distance between the two points, in the noncommutative geometry given by the Dirac operator $D$, is
$$\mathrm{dist}(+,-) = \sup_{4|(D_0)_{12}|^2|w|^2\leq 1} 2|w| = \frac{1}{|(D_0)_{12}|}.$$
(A numerical check of this value follows below.) It is a more difficult task to calculate the distance for the circle. There we need to consider all smooth functions on $S^1$, represent them on the Hilbert space, and calculate their operator norm!
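The distance formula can be checked numerically for the two-point geometry (a sketch we add here, with an arbitrarily chosen $D_0$): scanning over $w$, the largest value of $|f(+)-f(-)|=2|w|$ compatible with $\|[D,f]\|\leq 1$ reproduces $1/|(D_0)_{12}|$:

```python
import numpy as np

d12 = 0.7 + 0.2j
D0 = np.array([[0.3, d12], [d12, -0.1]])      # symmetric, as required by DJ = JD
D = np.block([[np.zeros((2, 2)), D0],
              [D0.conj().T, np.zeros((2, 2))]])
e = np.diag([1.0, -1.0])

best = 0.0
for w in np.linspace(0, 2.0, 4001):           # z drops out of the commutator
    pf = np.kron(np.eye(2), w * e)            # f = w e, acting diagonally on H + H
    if np.linalg.norm(D @ pf - pf @ D, 2) <= 1.0:
        best = max(best, 2 * w)
print(best, 1 / abs(d12))                     # both ~1.3736
```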
Finally, let us study the best known example of a spectral triple: that of the noncommutative torus.

Example 6.11: We take the algebra of the noncommutative torus as generated by $u$, $v$ with the relation $uv = e^{2\pi i\theta}vu$, together with the faithful representation on the Hilbert space $\mathcal{H}$ (presented in Example 2.9). We double the Hilbert space, take the diagonal representation of the algebra, and introduce the operators $J$, $D$, $\gamma$ as block-type operators. First we set
$$J_0\,e_{m,n} = e^{2\pi i\theta\,mn}\,e_{-m,-n} \ \ \text{(antilinearly)}, \qquad \delta\,e_{m,n} = (m+in)\,e_{m,n}.$$
Then:
$$\gamma = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad J = \begin{pmatrix} 0 & -J_0 \\ J_0 & 0 \end{pmatrix}, \qquad D = \begin{pmatrix} 0 & \delta \\ \delta^\dagger & 0 \end{pmatrix}$$
give the data of a real spectral triple of KO- and metric dimension 2 on the noncommutative torus.

Exercise 6.12: Verify that all the properties of spectral triples are satisfied!

6.4 Making (noncommutative) physics

Suppose we accept that spectral triples do describe noncommutative manifolds. Is there any physical content in them? Can we use them to describe some noncommutative physics? The answer is yes, and indeed we shall be able to provide at least some partial answers.

6.4.1 Gauge theory and gravity

If we have a spectral triple $(\mathcal{A},\mathcal{H},D)$, we may always wonder whether the Dirac operator we have chosen is a good one. Certainly nothing (apart from some symmetries) guarantees it, and in fact a simple transformation $D\to D+A$, where $A$ is a selfadjoint one-form,
$$A = \sum_i \pi(a_i)\,[D,\pi(b_i)],$$
allows us to construct a spectral triple with the same algebra and Hilbert space but a slightly modified Dirac operator. Moreover, when we look at it, we see that in this way we reconstruct the gauge theory, and $A$ is nothing else but the gauge potential. We call these the inner fluctuations of the Dirac operator.

It is a small step to calculate $D^2$, recover the curvature of the gauge potential, and construct the action. In this approach there are, however, some hidden obstacles, which have their origin in the fact that $D$ defines a differential algebra only after we quotient out an additional ideal. Then there are many non-equivalent ways of embedding the two-forms into the algebra of operators on the Hilbert space (which we need if we want to use noncommutative integrals to calculate the action). Clearly, the information encoded in $D$ includes also the metric and hence the Riemannian connection. Are we able to construct the gravity action as well? A partial answer was provided some time ago by Kastler, Kalau and Walze, who proved that the Einstein-Hilbert functional (the integral of the scalar of curvature, in other words) on manifolds can be expressed as a Wodzicki residue of a certain power of the operator $D$ [38, 39]. Now we are ready to see the proposition that encompasses both contributions.

6.4.2 The spectral action

Again, assume that we have a spectral triple $(\mathcal{A},\mathcal{H},D)$. The following defines a functional on the space of all admissible Dirac operators:
$$S(D) = \mathrm{Tr}\,f\!\left(\frac{D^2}{\Lambda^2}\right),$$
where $f$ is a cut-off function which, for instance, vanishes for arguments bigger than a certain number. This idea appeared for the first time (in a similar phrasing) in the work of Sakharov [41] in 1965. Of course, the action functional depends on the choice of $f$. However, it appears that the dependence is not as significant as we might suspect, as we shall see later.

Let us consider now the easy example of two points and the spectral action in this case.

Example 6.13: Recall the construction of the spectral triple for the algebra of two points and its Dirac operator $D$. The only free parameter in $D$ was the complex number $(D_0)_{12}$. The most general selfadjoint one-form $A$ is given as
$$A = \begin{pmatrix} 0 & \phi \\ \phi^\dagger & 0 \end{pmatrix}, \qquad \phi = (D_0)_{12}\begin{pmatrix} 0 & w \\ z & 0 \end{pmatrix},$$
for arbitrary $w,z\in\mathbb{C}$. Since our triple is a real one and $JD=DJ$, we need to require the same for the gauge potential $A$, and thus we have $z=\bar w$. Then the spectral action is
$$\mathrm{tr}\,(D+A)^2 = 4\,|(D_0)_{12}|^2\,(1+z)(1+\bar z).$$
It is not an exciting answer but, in fact, we did not expect anything exciting here.
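The little computation of Example 6.13 can be confirmed with sympy (our own check; we absorb the phases into the positive parameter $m=|(D_0)_{12}|$ and parametrize the fluctuation as $\phi = m z$, so the comparison is with one copy of the doubled space):

```python
import sympy as sp

m = sp.symbols('m', positive=True)    # stands for |(D_0)_{12}|
z = sp.symbols('z')
Dfluct = sp.Matrix([[0, m * (1 + z)],
                    [m * (1 + sp.conjugate(z)), 0]])   # D_0 + A on one copy of C^2
print(sp.simplify((Dfluct * Dfluct).trace()))
# 2*m**2*(z + 1)*(conjugate(z) + 1); the doubling of the real triple gives the factor 4
```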
More interesting things happen when we consider the continuous geometry of the $C^\infty$ functions on a manifold and the spectral action of the true Dirac operator!

Lemma 6.14: For the spectral triple $(C^\infty(M), L^2(M,S), D)$ over a 4-dimensional manifold, the spectral action (modulo topological and boundary terms) has the following asymptotic expansion:
$$\mathrm{Tr}\,f\!\left(\frac{D^2}{\Lambda^2}\right) \sim \frac{4 f_4\Lambda^4}{16\pi^2}\!\int\!\sqrt{g}\,d^4x \;-\; \frac{2 f_2\Lambda^2}{96\pi^2}\!\int\! R\,\sqrt{g}\,d^4x \;+\; \frac{f_0}{16\pi^2}\,\frac{1}{360}\!\int\!\big(5R^2 - 8R_{\mu\nu}R^{\mu\nu} - 7R_{\mu\nu\rho\sigma}R^{\mu\nu\rho\sigma}\big)\sqrt{g}\,d^4x,$$
where $f_k = \int_0^\infty f(u)\,u^{k/2-1}\,du$ for $k>0$ and $f_0 = f(0)$ are the moments of $f$.

This is a pure gravity action, which includes the cosmological constant, the Einstein-Hilbert action, and some additional terms which depend on the Riemannian curvature tensor, the Ricci tensor and the scalar curvature. If we introduce just a bit of noncommutative geometry, by taking as the algebra not $C^\infty(M)$ but $C^\infty(M)\otimes M_n(\mathbb{C})$, the algebra of matrix-valued functions on $M$, we obtain the possibility of constructing an $SU(n)$ gauge theory. The gauge connection one-form $A = A_\mu dx^\mu$ (in local coordinates) will enter the spectral action as well, and we shall additionally obtain (apart from some change in the coefficients of the first two terms) another term in the $\Lambda$-independent part of the expansion: the Yang-Mills action
$$\frac{1}{4}\,\frac{f_0}{120\pi^2}\int \mathrm{tr}\big(F_{\mu\nu}F^{\mu\nu}\big)\,\sqrt{g}\,d^4x.$$

6.5 The standard model

So far we have recovered important parts of theoretical physics, all encoded in one simple action. There is, however, more to it, as we shall finally see in this section. The crucial point is to take geometries of the type $M\times F$, where $M$ is a Riemannian manifold and $F$ is a discrete geometry. It is like a Kaluza-Klein model, but with the extra dimensions being in fact of (classical) dimension zero. We shall study here a toy model of the construction, referring to the recent papers by Connes [31, 32] for the detailed advanced construction, which makes contact with the real physical standard model.

Example 6.15: Let us begin with the construction of the (real) spectral triple for the algebra of functions on $M\times F_2$, with $F_2$ the space of two points. The standard procedure is to take the spectral triples on both spaces and construct their tensor product. However, we shall simplify the construction and just postulate the Dirac operator. Another simplification that we make is that we skip the requirement of the reality conditions (in the sense of $J$-reality) and take the manifold to be flat (say, a 4-torus!). As the Hilbert space we take two copies of the spinorial Hilbert space over the manifold $M$: $\mathcal{H} = L^2(M,S)\oplus L^2(M,S)$. The functions act on $\mathcal{H}$ so that
$$f\,(\psi_1\oplus\psi_2) = f(+)\,\psi_1 \oplus f(-)\,\psi_2.$$
As the Dirac operator we take
$$D = \begin{pmatrix} D_M & \gamma_5 \\ \gamma_5 & D_M \end{pmatrix},$$
where by $\gamma_5$ we denote the $\mathbb{Z}_2$-grading of the standard spectral triple over $M$ (which in physical notation is indeed $\gamma_5$), and $D_M = i\gamma^\mu(\partial_\mu+\omega_\mu)$ is the standard Dirac operator expressed in local coordinates with the spin connection $\omega_\mu$. The most general inner fluctuations of the Dirac operator are
$$A = \begin{pmatrix} \gamma^\mu a^+_\mu(x) & \gamma_5\,h(x) \\ \gamma_5\,h^*(x) & \gamma^\mu a^-_\mu(x) \end{pmatrix},$$
where $a^\pm_\mu(x)$ are two copies of the $U(1)$ gauge potential and $h(x)$ is a complex-valued field! We calculate the square of the Dirac operator, paying particular attention to the terms that involve $h$; the rest, depending only on $a^\pm_\mu(x)$, does not differ from the usual classical theory. The terms that depend on $h$ in the square of the Dirac operator $D_A$ are, schematically,
$$h(x)\,h^*(x) \;+\; \gamma_5\gamma^\mu\Big(\partial_\mu h(x) + i\big(a^+_\mu(x)-a^-_\mu(x)\big)h(x)\Big) \;+\; \text{conjugate terms}.$$
The next step is not a trivial one, but using the knowledge of the asymptotic expansion and the Seeley-de Witt coefficients (see [22, 25] for explanation and details) we can find out what will change in the expansion of the spectral action. We skip the exact calculation (which we recommend, however, as a good exercise!). It is no surprise that the additional terms involve certain covariant functionals of $h$:
- the vacuum energy of the field $h(x)$ (both in the $\Lambda^2$ and $\Lambda^0$ parts): $\int\sqrt{g}\,d^4x\;h(x)h^*(x)$,
- the coupling of $h$ to the scalar of curvature ($\Lambda^0$ part): $\int\sqrt{g}\,d^4x\;h(x)h^*(x)\,R$,
- the kinetic term for the $h(x)$ field: $\int\sqrt{g}\,d^4x\;D_\mu h(x)\,\big(D^\mu h(x)\big)^*$,
- the potential of the $h(x)$ field: $\int\sqrt{g}\,d^4x\;|h(x)|^4$.
Here $D_\mu$ denotes the $U(1)\times U(1)$-covariant derivative. We can interpret these contributions in physical terms as the kinetic action for the Higgs field $h(x)$ and the Higgs potential, which after some rescaling can be written as
$$\int\sqrt{g}\,d^4x\,\big(h(x)h^*(x) - v^2\big)^2.$$
The crucial information (from the physical point of view) lies actually not in the exact values of the coefficients but in their relative signs. If $|h|^2$ and $|h|^4$ appear with opposite signs, we have the standard Higgs potential leading to the symmetry breaking mechanism.
since (a priori) � and all coefficients fk from the cutoff function are free parameters we can actually fix them so that the signs are correct. for the more realistic approach to the standard model we need to take a slightly complicated model, with the finite algebra being � �& &� m3( ). it comes as no surprise that this construction leads to the full gauge group of the standard model: u su su( ) ( ) ( )1 2 3� � . taking an appropriate spectral triple over the finite algebra (which is actually of ko-dimension 6) and tensoring it with the usual spectral triple over a manifold, one obtains as a spectral action: s f f c f d g d x f f c r g d � � � � � � � � 2 1 48 4 96 24 2 4 4 2 0 4 2 2 0 2 � � � � � � 4 0 2 4 2 2 10 11 6 3 2 x f r r c c g d x a f 2 2 ���� � � � � � ���� ����* * ( � e f g d x f a d g d x f ar g d x f 0 2 2 4 0 2 2 4 0 2 2 4 0 2 12 2 ) � � � � � 2 2 2 � 2 3 2 2 2 1 2 4 0 5 3 g g g g f f g b b g d x f i i �� �� �� � ��� �� �� � � � � � � 2 2 2 4 4 � b g d x2 , where f, b, g are curvatures of the electro-weak and strong gauge fields and � is the sm higgs doublet (seen as a quaternionic field). for more details, please consult [6]. 6.6 where and why learn more? in these three lectures we have tried to give a glimpse of noncommutative geometry – a theory, which, motivated by examples, extends the notion of geometry into the algebraic world. what we still need to supply is a word about prospects: first of learning (where to learn more) but also the prospects of the field (why learn it). 6.6.1 the sources for the more interested reader we recommend further reading. first of all, there is an excellent introduction to almost all of the topics of these lectures: “elements of noncommutative geometry” [2] by josé gracia-bondia, joseph várilly and hector figueroa. it offers a comprehensive and detailed course in noncommutative geometry reviewing also the most recent trends and links with physics. there are, of course, the books by alain connes: the seminal work “noncommutative geometry” [5] and the more recent book with mathilde marcolli [6]. other textbooks which give a review of selected topics (and are written more from the perspective of mathematical physicist) are “an introduction to noncommutative differential geometry and its physical applications”, [17], by john madore and “an introduction to noncommutative spaces and their geometry”, [14], by giovanni landi. © czech technical university publishing house http://ctn.cvut.cz/ap/ 55 acta polytechnica vol. 48 no. 2/2008 where a x�( ) $ are two copies of the u( )1 gauge potential and h x( ) is a complex-valued field! we calculate the square of the dirac operator paying particular attention to the terms that involve h. the rest, depending only on a x�( , )$ , does not differ from the usual classical theory. the terms that depend on h in the square of the dirac operator d a are: h x h x a x a x h x a x ( ) ( ) ( ( ) ( ) ) ( ) ( ( ) * � � � � � � � � � � � 5 5 1 � � � � � � � � � � � � �a x h x h x h x�( ) ) ( ) ( ) ( )* * there are, of course, numerous books on some mathematical aspects of noncommutative geometry, e.g. k-theory, for instance. we shall not list here all possible available books and monographs but will give only single examples. first of all, there is an excellent monograph of jean-louis loday [16] on cyclic homology, hochschild and related subjects. a concise and useful review of the topis is given by husemoller [12]. cyclic homology within noncommutative geometry is presented in [8]. 
a very good and comprehensible introduction to k-theory can be found in a friendly approach to k-theory by wegge-olsen [19] and also in the book by blackaddar [1]. the overview of the link between cyclic cohomology, k-theory and chern parings is nicely explained in the book of jacek brodzki [3]. almost everything on operator algebras can be found in the excellent monograph by richard kadison, john ringrose [13]. a review of differential graded algebras can be found in the lecture notes by michel dubois-violette [35, 36]. one of the basic classical texts on all aspects of topology, differential geometry, gauge theories and characteristic classes is the textbook “analysis, manifolds and physics”, [4], by choquet-bruhat and dewitt-morette. everything one wants to know about spin geometry is in the book (surprise, surprise): “spin geometry”, [15], by lawson and michelsohn. all properties of dirac operators are explained in the work of thomas friedrich “dirac operator” [10]. the most recent concise introduction to spectral triples (treating the classical and noncommutative examples) is contained in the book by joseph varilly, [18], “an introduction to noncommutative geometry”. there are, of course, numerous reviews and lecture notes from courses at institutions and schools (for example: [7, 9, 11]). written for different purposes and by different authors, they offer views of the topic from many differen angles. a selection can also be found found on the internet, on the web pages of noncommutative geometers or common sites like the „noncommutative blog“. 6.6.2 the outlook it is hard to see at the moment whether noncommutative geometry will become the right tool for describing the physics: both known physics and physics yet to be discovered. we have mentioned that ncg finds applications in a range of topics, from the quantum hall effect [20], standard model [43, 40] up to string theory [42, 28]. there are other branches of noncommutative geometry that we have not even touched: hopf algebras, quantum groups and quantum deformations, deformation quantization, noncommutative field theory, hopf algebras in renormalization – to list only those, which (more or less) have some links to physics. one may say that we are just at the dawn of noncommutative geometry: it is a world still to be discovered. whether this geometry will be the geometry used to describe the world is not known. but we might soon find out. acknowledgement partially supported by polish government grants 115/e-343/spb/6.prue/die 50/2005–2008 and 189/6.prue/2007/7 references [1] blackadar, b.: k-theory for operator algebras. cambridge: mathematical sciences research institute publications, 5. cambridge university press, 1998. [2] bondia, josé m., várilly, j., figueroa, h.: elements of noncommutative geometry. boston (ma): birkhauser advanced texts, birkhauser boston, inc., 2001. [3] brodzki, j.: an introduction to k-theory and cyclic cohomology. advanced topics in mathematics. warsaw: pwn – polish scientific publishers, 1998. [4] choquet-bruhat, y., c. dewitt-morette, c.: analysis, manifolds and physics. amsterdam: north-holland publishing co., 1989. [5] connes, a.: noncommutative geometry and physics. san diego: academic press, inc., 1994. [6] connes, a., marcolli, m.: noncommutative geometry, quantum fields and motives. (colloquium publications), providence: american mathematical society, 2008. [7] cuntz, j., khalkhali, m. (eds.): cyclic cohomology and noncommutative geometry. (proceedings of a workshop, fields institute, waterloo). 
providence: american mathematical society (ams), 1997. [8] cuntz, j., skandalis, g., tsygan, b.: cyclic homology in non-commutative geometry. (encyclopaedia of mathematical sciences 121(ii)). berlin: springer, 2004. [9] doplicher, s., longo, r. (eds.): noncommutative geometry. (lectures given at the c. i. m. e. summer school, martina franca), berlin: springer, 2004. [10] friedrich, t.: dirac operators in riemannian geometry. providence: american mathematical society (ams), 2000. [11] higson, n., nigel, j. roe (eds.): surveys in noncommutative geometry. (proceedings from the clay mathematics institute instructional symposium). providence: american mathematical society, 2006. [12] husemoller, d.: lectures on cyclic homology. berlin: springer-verlag, 1991. [13] kadison, r., ringrose, j.: fundamentals of the theory of operator algebras. advanced theory, vol. 1–2 , american mathematical society, 1997. [14] landi, g.: an introduction to noncommutative spaces and their geometry. lecture notes in physics. berlin: springer-verlag, 1997. [15] lawson, h., michelsohn, m-l.: spin geometry. princeton, nj: princeton mathematical series, 38. princeton university press, 1989. [16] loday, j-l.: cyclic homology. grundlehren der mathematischen wissenschaften, 301. berlin: springer-verlag, 1998. [17] madore, j.: an introduction to noncommutative differential geometry and its physical applications. cambridge: cambridge university press, 1995, 1999. [18] várily, j.: an introduction to noncommutative geometry. (ems series of lectures in mathematics), zürich: european mathematical society publishing house, 2006. 56 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 48 no. 2/2008 [19] wegge-olsen, n.: k-theory and c*-algebras. a friendly approach. oxford science publications. oxford, new york: the clarendon press, 1993. [20] bellissard, j.: noncommutative geometry and quantum hall effect. in proceedings of the international congress of mathematicians, vol. 1, 2 zürich: 1994, p. 1238–1246, basel: birkhauser, 1995. [21] chamseddine, a. h., connes, a.: an universal action formula, phys. rev. lett., vol. 77 (1996), p. 4868. [22] chamseddine, a. h., connes, a.: the spectral action principle. commun. math. phys., vol. 186 (1997), p. 731. [23] connes, a.: noncommutative differential geometry. inst. hautes études sci. publ. math., (1985), no. 62, p. 257–360. [24] connes, a., rieffel, m.: yang-mills for noncommutative two-tori. operator algebras and mathematical physics (1985), p. 237–266, contemp. math., 62. [25] connes, a.: the action functional in noncommutative geometry. commun. math. phys., vol. 117 (1988), p. 673. [26] connes, a., lott, j.: particle models and noncommutative geometry. nucl. phys. proc., suppl. 18b (1991), p. 29. [27] connes, a.: noncommutative geometry and reality. j. math. phys., vol. 36 (1995), p. 6194. [28] connes, a., douglas, m. r., schwarz, a.: noncommutative geometry and matrix theory: compactification on tori. jhep 9802 (1998), 003. [29] connes, a.: a short survey of noncommutative geometry. j. math. phys., vol. 41 (2000), p. 3832. [30] connes, a.: noncommutative geometry: year 2000. arxiv:math.qa/0011193. [31] connes, a., chamseddine, a. h., marcolli, m.: gravity and the standard model with neutrino mixing. arxiv:hep-th/0610241 [32] connes, a., chamseddine, a. h.: a dress for sm the beggar. arxive:math/07063690. [33] cuntz, j., quillen, d.: algebra extensions and nonsingularity. j. amer. math. soc., vol. 8 (1995), no. 2, p. 251–289. 
[34] doplicher, s., fredenhagen, k., roberts, j.: the quantum structure of spacetime at the planck scale and quantum fields. comm. math. phys., vol. 172 (1995), no. 1, p. 187–220. [35] dubois-violette, m.: lectures on graded differential algebras and noncommutative geometry. arxiv:math.qa/9912017. [36] dubois-violette, m.: lectures on differentials, generalized differentials and on some examples related to theoretical physics. arxiv:math.qa/0005256. [37] gracia-bondia, j. m.: noncommutative geometry and the standard model: an overview. arxiv:hep-th/9602134. [38] kalau, w., walze, m.: gravity, non-commutative geometry and the wodzicki residue. j. geom. phys., vol. 16 (1995), no. 4, p. 327–344. [39] kastler, d.: the dirac operator and gravitation. commun. math. phys., vol. 166 (1995), p. 633. [40] kastler, d.: noncommutative geometry and fundamental physical interactions: the lagrangian level: historical sketch and description of the present situation. j. math. phys., vol. 41 (2000), p. 3867. [41] sakharov, a.: vacuum quantum fluctuations in curved space and the theory of gravitation. dokl. akad. nauk ser. fiz. 177 (1967), p. 70–71. [42] seiberg, n., witten, e.: string theory and noncommutative geometry. j. high energy phys., (1999), no. 9, paper 32. [43] varilly, j., gracia-bondia, j. m.: connes’ noncommutative differential geometry and the standard model. j. geom. phys., vol. 12 (1993), p. 223. [44] varilly, j., rennie, a.: reconstruction of manifolds in noncommutative geometry. arxiv:math/0610418 andrzej sitarz e-mail: sitarz@if.uj.edu.pl institute of physics jagiellonianuniversity reymonta 4 30-059 kraków, poland © czech technical university publishing house http://ctn.cvut.cz/ap/ 57 acta polytechnica vol. 48 no. 2/2008 << /ascii85encodepages false /allowtransparency false /autopositionepsfiles true /autorotatepages /none /binding /left /calgrayprofile (dot gain 20%) /calrgbprofile (srgb iec61966-2.1) /calcmykprofile (u.s. 
web coated \050swop\051 v2) /srgbprofile (srgb iec61966-2.1) /cannotembedfontpolicy /error /compatibilitylevel 1.4 /compressobjects /tags /compresspages true /convertimagestoindexed true /passthroughjpegimages true /createjobticket false /defaultrenderingintent /default /detectblends false /detectcurves 0.0000 /colorconversionstrategy /cmyk /dothumbnails false /embedallfonts true /embedopentype false /parseiccprofilesincomments true /embedjoboptions true /dscreportinglevel 0 /emitdscwarnings false /endpage -1 /imagememory 1048576 /lockdistillerparams false /maxsubsetpct 100 /optimize true /opm 1 /parsedsccomments true /parsedsccommentsfordocinfo true /preservecopypage true /preservedicmykvalues true /preserveepsinfo true /preserveflatness true /preservehalftoneinfo false /preserveopicomments true /preserveoverprintsettings true /startpage 1 /subsetfonts true /transferfunctioninfo /apply /ucrandbginfo /preserve /useprologue false /colorsettingsfile () /alwaysembed [ true ] /neverembed [ true ] /antialiascolorimages false /cropcolorimages true /colorimageminresolution 300 /colorimageminresolutionpolicy /ok /downsamplecolorimages true /colorimagedownsampletype /bicubic /colorimageresolution 300 /colorimagedepth -1 /colorimagemindownsampledepth 1 /colorimagedownsamplethreshold 1.50000 /encodecolorimages true /colorimagefilter /dctencode /autofiltercolorimages true /colorimageautofilterstrategy /jpeg /coloracsimagedict << /qfactor 0.15 /hsamples [1 1 1 1] /vsamples [1 1 1 1] >> /colorimagedict << /qfactor 0.15 /hsamples [1 1 1 1] /vsamples [1 1 1 1] >> /jpeg2000coloracsimagedict << /tilewidth 256 /tileheight 256 /quality 30 >> /jpeg2000colorimagedict << /tilewidth 256 /tileheight 256 /quality 30 >> /antialiasgrayimages false /cropgrayimages true /grayimageminresolution 300 /grayimageminresolutionpolicy /ok /downsamplegrayimages true /grayimagedownsampletype /bicubic /grayimageresolution 300 /grayimagedepth -1 /grayimagemindownsampledepth 2 /grayimagedownsamplethreshold 1.50000 /encodegrayimages true /grayimagefilter /dctencode /autofiltergrayimages true /grayimageautofilterstrategy /jpeg /grayacsimagedict << /qfactor 0.15 /hsamples [1 1 1 1] /vsamples [1 1 1 1] >> /grayimagedict << /qfactor 0.15 /hsamples [1 1 1 1] /vsamples [1 1 1 1] >> /jpeg2000grayacsimagedict << /tilewidth 256 /tileheight 256 /quality 30 >> /jpeg2000grayimagedict << /tilewidth 256 /tileheight 256 /quality 30 >> /antialiasmonoimages false /cropmonoimages true /monoimageminresolution 1200 /monoimageminresolutionpolicy /ok /downsamplemonoimages true /monoimagedownsampletype /bicubic /monoimageresolution 1200 /monoimagedepth -1 /monoimagedownsamplethreshold 1.50000 /encodemonoimages true /monoimagefilter /ccittfaxencode /monoimagedict << /k -1 >> /allowpsxobjects false /checkcompliance [ /none ] /pdfx1acheck false /pdfx3check false /pdfxcompliantpdfonly false /pdfxnotrimboxerror true /pdfxtrimboxtomediaboxoffset [ 0.00000 0.00000 0.00000 0.00000 ] /pdfxsetbleedboxtomediabox true /pdfxbleedboxtotrimboxoffset [ 0.00000 0.00000 0.00000 0.00000 ] /pdfxoutputintentprofile (none) /pdfxoutputconditionidentifier () /pdfxoutputcondition () /pdfxregistryname () /pdfxtrapped /false /createjdffile false /description << /ara /bgr /chs /cht /dan /deu /esp /eti /fra /gre /heb /hrv (za stvaranje adobe pdf dokumenata najpogodnijih za visokokvalitetni ispis prije tiskanja koristite ove postavke. stvoreni pdf dokumenti mogu se otvoriti acrobat i adobe reader 5.0 i kasnijim verzijama.) 
/hun /ita /jpn /kor /lth /lvi /nld (gebruik deze instellingen om adobe pdf-documenten te maken die zijn geoptimaliseerd voor prepress-afdrukken van hoge kwaliteit. de gemaakte pdf-documenten kunnen worden geopend met acrobat en adobe reader 5.0 en hoger.) /nor /pol /ptb /rum /rus /sky /slv /suo /sve /tur /ukr /enu (use these settings to create adobe pdf documents best suited for high-quality prepress printing. created pdf documents can be opened with acrobat and adobe reader 5.0 and later.) /cze >> /namespace [ (adobe) (common) (1.0) ] /othernamespaces [ << /asreaderspreads false /cropimagestoframes true /errorcontrol /warnandcontinue /flattenerignorespreadoverrides false /includeguidesgrids false /includenonprinting false /includeslug false /namespace [ (adobe) (indesign) (4.0) ] /omitplacedbitmaps false /omitplacedeps false /omitplacedpdf false /simulateoverprint /legacy >> << /addbleedmarks false /addcolorbars false /addcropmarks false /addpageinfo false /addregmarks false /convertcolors /converttocmyk /destinationprofilename () /destinationprofileselector /documentcmyk /downsample16bitimages true /flattenerpreset << /presetselector /mediumresolution >> /formelements false /generatestructure false /includebookmarks false /includehyperlinks false /includeinteractive false /includelayers false /includeprofiles false /multimediahandling /useobjectsettings /namespace [ (adobe) (creativesuite) (2.0) ] /pdfxoutputintentprofileselector /documentcmyk /preserveediting true /untaggedcmykhandling /leaveuntagged /untaggedrgbhandling /usedocumentprofile /usedocumentbleed false >> ] >> setdistillerparams << /hwresolution [2400 2400] /pagesize [612.000 792.000] >> setpagedevice ap06_2.vp 1 introduction studies of self-excited induction generators have been investigated since 1935. many papers dealing with various problems in the field of seig have been published. the primary advantages of seig are lower maintenance costs, better transient performance, lack of a dc power supply for field excitation, brushless construction (squirrel-cage rotor), etc. in addition, induction generators have been widely employed to operate as wind-turbine generators and small hydroelectric generators of isolated power systems. induction generators can also be connected to large power systems, to inject electric power [1, 2]. the generator action takes place, when the rotor speed of the induction generator is greater than the synchronous speed of the air-gap-revolving field. various configurations for connecting seig to a large power system have been discussed in many publications. this research concentrates on the dynamic performance of an isolated seig, driven by wind energy, to supply an isolated static load. a d-q axis equivalent circuit model based on various reference frames, extracted from fundamental machine theory, is created to study seig performance in dynamic case [2]. this paper studies seig performance, when equipped with a switching capacitor bank, using a controller based on gas to adjust the duty cycle and adjusting the stator frequency via the pitch control. genetic algorithms are search algorithms that simulate the process of natural selection and survival of the fittest [3]. the ga starts off with a population of randomly generated chromosomes, and advances toward better chromosomes in a sequence of generations. during each generation, the fitness of each solution is evaluated and solutions are selected for reproduction based on their fitness. 
then, the chromosomes with higher fitness have higher probabilities of having more copies in the following generation, while the chromosomes with worst fitness are eliminated. then a roulette wheel cheme is applied for reproduction. consequently, the new population of chromosomes is formed using a selection mechanism and specific genetic operators such as crossover and mutation. recently, genetic algorithms have received considerable attention. global optimization utilizes techniques that can distinguish between the global optimum and numerous local optima within a region of interest. many papers have been published using ga to optimize pi and pid controllers [4]. in this paper, the mathematical model of seig driven by wecs is simulated using the matlab/simulink package to solve its differential equations. two controllers have been developed for the system under study. the first of these is the reactive controller to adjust the terminal voltage at the rated value, by controlling the duty cycle of the switching capacitor bank. the second controller is the active controller, which adjusts the input mechanical power to the generator and thus keeps the stator frequency constant. this is achieved by controlling the pitch angle of the blade of the wind turbine. both controllers are implemented using the conventional pi controller. then, the integral gains of both pi controllers are optimized using the ga technique. fig. 1 shows the block diagram for the system under study. it consists of a self-excited induction generator driven by a wind energy conversion scheme connected to an isolated load. in addition, the system under study is equipped with the reactive and active controller’s loops, using a pi controller. then, the integral gains are tuned using the ga algorithm. 2 mathematical model for seig driven by wecs 2.1 electrical equations for seig fig. 2 shows the d-q axis equivalent-circuit model for a no-load, three-phase symmetrical induction generator. the stator and rotor voltage equations using krause transformation [1, 2], based on a stationary reference frame, are given in appendix a. © czech technical university publishing house http://ctn.cvut.cz/ap/ 11 czech technical university in prague acta polytechnica vol. 46 no. 2/2006 genetic algorithm based control system design of a self-excited induction generator a.-f. attia, h. soliman, m. sabry this paper presents an application of the genetic algorithm (ga) for optimizing controller gains of the self-excited induction generator (seig) driven by the wind energy conversion scheme (wecs). the proposed genetic algorithm is introduced to adapt the integral gains of the conventional controllers of the active and reactive control loop of the system under study, where ga calculates the optimum value for the gains of the variables based on the best dynamic performance and a domain search of the integral gains. the proposed genetic algorithm is used to regulate the terminal voltage or reactive power control, by adjusting the self excitation, and to control the mechanical input power or active power control by adapting the blade angle of wecs, in order to adjust the stator frequency. the ga is used for optimizing these gains, for an active and reactive power loop, by solving the related optimization problem. the simulation results show a better dynamic performance using the ga than using the conventional pi controller for active and reactive control. keywords: genetic algorithms, conventional controllers, self-excited induction generator. 
2.2 mechanical equations for wecs the mechanical equations relating the power coefficient of the wind turbine, tip speed ratio � and pitch angle � are given in appendix a [5]. the analysis of seig in this paper is based on the following assumptions [1]: � all parameters of the machine can be considered constant except xm � per-unit values of both stator and rotor leakage reactance are equal � core loss in the excitation branch is neglected � space and time harmonic effects are ignored. 2.3 equivalent circuit the d-q axsis equivalent-circuit models for a no-load, three-phase symmetrical induction generator are shown in fig. 2a and fig. 2b. the equivalent-circuit parameters shown in these figures are based on the machine data in appendix b [1, 2]. the equation of motion of the rotating part of the combined studied seig and the wind turbine is also included in the system in order to provide a detailed simulation model. 2.4 reactive control and switching capacitor bank technique 2.4.1 the switching capacitor bank capacitor switching has been discarded in the past because of the practical difficulties involved [6], i.e. the occurrence of voltage and current transients. it has been argued, and justly so, that current ‘spikes’, for example, would inevitably exceed the maximum current rating as well as the ( di /dt ) 12 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 46 no. 2/2006 czech technical university in prague load semi conductor switching kiv tuned using ga pi for reactive control kif tuned using ga pitch control pi for active control bus_bar wecs_input mech. power + pm_ ref. p m _ a c tu a l + v ref. duty cycle pm_error v_error v -t e rm in a l seig seig fig. 1: system under study a) rs �ds lls llr ( ws �wr) (� dr / wb) rr vqs iqs xm iqr vqr b) vds ids xm i dr vdr rs �qs lls llr r r( ws �wr) (� qr / wb) fig. 2: a) equivalent circuit of an induction generator for quadrate axis, b) equivalent circuit of an induction generator for direct axis value of a particular semiconductor switch. the only way out of this dilemma would be to design the semiconductor switch to withstand the transient value at the switching instant. an equivalent circuit of the switching capacitor bank with a controlled value of the duty cycle is shown in fig. 3. in this figure, the switches are operated in anti-phase, i.e. the switching function fs2 which controls switch s2 is the inverse function of fs1 which controls switch s1. in other words, switch s2 is closed during the time when switch s1 is open, and vice versa. this means that s1 and s2 of branch 1 and 2 are operated in such a manner that one switch is closed while the other is open. 2.4.2 reactive control through the switching capacitor bank technique in the system under study given in fig. 1, the controller input is the voltage error for the reactive power controller, and the output of the controller performs the value of the duty cycle �. the duty cycle is used as an input to the semiconductor switches to adjust the capacitor bank ceff value according to the need for the effective value of the excitation, which regulates the terminal voltage. accordingly the semiconductor switching technique as explained in the above section and, hence, the terminal voltage is controlled by adjusting the self-excitation through automatic switching of the capacitor bank. 2.5 the active power control active control is applied to the system under study by adjusting the pitch angle of the wind turbine blades. 
this is used to keep the seig operating at a constant stator frequency and to avoid the effect of the disturbance. the pitch angle is a function of the power coefficient cp of the wind turbine wecs. the value of cp is calculated using the pitch angle according to eq. (14), given in appendix a. consequently, the best adjustment for the value of the pitch angle improves the mechanical power regulation, which achieves better adaptation for the frequency of all systems. accordingly, the active power control regulates the mechanical power of the wind turbine. 3 proportional plus integral (pi) controllers first, the pi controller using a fixed gain is applied to the system under study. then, the integral gain ki of the pi controller is varied linearly with reference to terminal voltage error ev, while proportional gain kp is fixed. the voltage or frequency error is used as an input variable to the pi control© czech technical university publishing house http://ctn.cvut.cz/ap/ 13 czech technical university in prague acta polytechnica vol. 46 no. 2/2006 v c1 c2 s1 s2 l1 fs1 fs2 fig 3: semi conductor switches (s1, s2) circuit for capacitor bank fig. 4: dynamic response of the terminal voltage with different values of integral gain for reactive control ler, then the output is used to regulate the duty cycle of the switching capacitor bank in the reactive controller, while the output of the active controller the output is utilized to tune up the pitch angle of the wind turbine to adjust the system frequency. fig. 4 shows the simulation results for the system under study when starting against a step change in the reference voltage. then the system is subjected to a sudden change in the local electric load. this simulation result is carried out for different values of fixed integral gain. figs. 5, 6 and 7 show the simulation results for the stator frequency, the duty cycle and the stator current for the previous condition indicated in fig. 4. these simulation results show that the dynamic performance is changed, as regards percentage over shoot (p.o.s), rising time and oscillation, by changing the value of the integral gain. thus, the idea of driving the system using a variable integral gain is introduced to achieve the benefits of using high and small integral gains, as follows. from fig. 4, a higher value of the integral gain, kiv � 0.007, is associated with a shorter rising time but the 14 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 46 no. 2/2006 czech technical university in prague fig. 5: dynamic response of the stator frequency fig. 6: dynamic response of the duty cycle p.o.s and oscillation are greater than the other values of kiv. meanwhile, the lowest value of kiv � 0.0051 is associated with negligible p.o.s and oscillation and the rising time is greater than other values of kiv. fig. 8 show the variation of the integral gain during the starting period versus time. 
3.1 pi-controller with a variable gain a program has been developed to compute the value of the variable integral gain kiv using the following rule: if e e k k elseif e e k k v v iv iv v v iv iv ( ), ; ( ), min min max ma � � � � x min max max min max m ; ( ), ( ) ( else e e e m k k e e v v v iv iv v v � � � � � in min min ); ; ( ) ; c k m e k m e c en d iv v iv v � � � � � � where ev is the voltage error, evmin and evmax are the minimum and maximum values of the voltage error, respectively, kivmin and kivmax are the minimum and maximum values of the variable integral gain, respectively, c is a constant and m is the slop constant of the linear part. fig. 9 shows the above rule used to calculate the variable kiv . the value of evmin and evmax is obtained by trial and error to give the best dynamic performance. figs. 4, 5, 6 and 7 also show the dynamic performance of the overall system when equipped with this variable integral gain compared with other fixed integral gains. the simulation results depict an improvement in the dynamic performance when using the variable kiv. © czech technical university publishing house http://ctn.cvut.cz/ap/ 15 czech technical university in prague acta polytechnica vol. 46 no. 2/2006 fig. 7: dynamic response of the load current ×10 �3 fig. 8: variable integral gain for pi controller lower limit upper limit voltage error = ev kiv ev evmin evmax kivmax kivmin in te g ra l g a in fig. 9: variable integral gain for a pi controller 4 genetic algorithm with a constrained search space for optimizing pi controller gains the integral gain of the second pi controller is optimized based on the genetic algorithm. ga is used to calculate the optimum value of the variables based on the best dynamic performance and a domain search of the variable. the objective function used in the ga technique is f j� �1 1( ), where j is the minimum cost function, which will be defined later. ga uses its operators and functions to find the values of kiv and kif of the pi controllers to achieve better dynamic performance of the overall system. these values of gains lead to the optimum value of gains for which the system achieves the desired values by improving the p. o. s, rising time and oscillations. the main aspects of the proposed ga approach for optimizing the gains of pi controllers, and the flowchart procedure for the ga optimization process, are shown in fig. 10. 4.1 representation of pi controller gains the pi gains are formulated using the ga approach, where all the gains are represented in a chromosome. the chromosome representation determines the ga structure. the parameter gains are encoded in a chromosome. the pi controller gains are initially started using minimum values of the domain search for pi gains. based on the simulation 16 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 46 no. 2/2006 czech technical university in prague reproduction start initialization j > tolerance mutation crossover yes no randomly generate chromosomes of calculate f 1/(1+ )� j for generation = 1: max_ gen for k = 1: pop_size [kiv , kif] record required information stop pi active controllers pitch angle valuekif pi reactive controllers duty cycleev, ev kiv pme, pme fig. 10: flowchart of ga approach for optimizing pic gains results given in the previous sections, the values of [kivmax, kivmin] are shown in fig. 4, while the values of [kifmax, kifmin] are shown in fig. 11. 
the acceptable domain search for each gain is defined as [kivmax, kivmin] and [kifmax, kifmin] based on five times less and five times more than the gains obtained using the ziegler-nichols rule to satisfy minimum cost function j, as given in the following equation [7]: �j e t e t e t t t � � � � �� � � �1 1 1 0 ( ) ( ) ( ) d , (1) where; e(t) is equal to ev or pme; ev is the voltage error used for the reactive power control and pme is the mechanical power error used for active power control, as shown in fig. 1. the parameters �1, �1 and �1 are weighting coefficients. 4.2 coding of pi controller gains the coded parameters are arranged on the basis of their constraints, as shown in fig. 12, to form a chromosome of the population. the binary representation, given in fig. 12, is the coded form for parameters with chromosome length equal to the sum of the bits for all parameters. in binary coding, the relation between the bit length li and the corresponding bit resolution ri is given in the following equation [8]: r ub lb i i i l i � � �2 1 , (2) where ubi and lbi are the upper and lower bounds of parameter i, respectively. in the present case study, we assume bit resolution r � 105 for all parameters. fig. 12 shows the coded parameters of the pi controller gains for reactive and active power controllers, respectively. the chromosome length used in this paper was 20 bits, where the bit length of kiv equal 10 bits and the bit length of kif equals 10 bits. 4.3 selection function the selection strategy decides how to select individuals to be parents for new ‘children’. the selection usually applies some selection pressure by favoring individuals with better fitness. after procreation, the suitable population consists, for example, of l chromosomes, which are all initially randomized. each chromosome has been evaluated and associated with fitness, and the current population undergoes the reproduction process to create the next population. then, the “roulette wheel” selection scheme is used to determine the member of the new population. the roulette scheme is shown in fig. 13. the chance on the roulette-wheel is adaptive and is given as p p � �� as in the eq. (3) [8]: p j l � � � �� � 1 1, { , , }, (3) and j � is the performance of the model encoded in the chromosome measured in the terms used in eq. (1). © czech technical university publishing house http://ctn.cvut.cz/ap/ 17 czech technical university in prague acta polytechnica vol. 46 no. 2/2006 fig. 11: stator frequency for pi and pi & ga controllers kiv kif 10 bits 10 bits chromosome length 1 1 … 0 1 0 1 … 1 1 fig. 12: coded parameters of pi controller gains maximizing the fitness function of each chromosome, which is inversely proportional to the performance criteria, eq. (1) will damp the overshoot or the oscillations [8]. 4.4 crossover and mutation operators the mating pool is formed, and crossover is applied. then the mutation operation is applied, followed by the proposed ga approach. finally, after these three operations, the overall fitness of the population is improved. the procedure is repeated until the termination condition is reached. the termination condition is the maximum allowable number of generations, or a certain value of j. this procedure is shown in the flowchart given in fig. 10. 5 simulation results the nonlinear differential equation, which describes the system under study, was solved using the runge-kutta fifth order method, using the matlab simulink package. 
the integration step value was automatically varied in this package. the relative tolerance was set at 0.00001. the minimum and maximum step size was adjusted automatically. several tests were carried out to validate the efficiency of the proposed control schemes. the simulations show the comparison between the two proposed pi controllers. the system performance checks the terminal voltage vl, the stator frequency fs, the load current il and the duty cycle versus time. the overall system is tested against a sudden change in the load. this disturbance is made by applying a sudden change in the resistance part of the load impedance. the load is equivalent to the r-l series circuit with load resistance rl � 80 ohm and load inductance ll � 0.12 h per phase. 6 pi controller based on ga 6.1 dynamic performance due to sudden load variation figs. 14, 15, 16 and 11 show the simulation results of the system under study using the pi controller based on ga for the terminal voltage, the stator current, the duty cycle and the stator frequency, respectively. the simulation results show that the performance of a pi controller based on ga is much better, as regards maximum overshoot and rising time, than a pi controller with fixed or variable kiv. the simulation results show the effectiveness of the proposed controller, as shown in the previous figures. 6.2 simulation results due to sudden wind speed variation other simulation results are obtained when the overall system is subjected to a sudden variation in wind speed from 18 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 46 no. 2/2006 czech technical university in prague p 2 p 1 p l fig. 13: roulette wheel selection scheme fig. 14: terminal voltage for pi and pi & ga controllers 7 m/s to 15 m/s. figs. 17, 18 show the simulation results of the wind speed variation and the stator frequency, respectively. the simulation result given in fig. 18 shows the ability of the proposed controller to overcome the speed variation when using the variable and fixed integral gain. 7 conclusions this paper presents the application of two types of controller to enhance the performance of seig driven by wecs, using a variable rule based integral gain and ga. ga is used © czech technical university publishing house http://ctn.cvut.cz/ap/ 19 czech technical university in prague acta polytechnica vol. 46 no. 2/2006 fig. 15: load current for pi and pi & ga controllers fig. 16: duty cycle for pi and pi & ga controllers to optimize the pi controller gains in order to improve the dynamic response of the overall system. optimal gains for the pi controller were determined using the ga procedure. the simulation results show that a pi controller tuned by the proposed ga was able to decrease the overshoot, and at same time to decrease the rising time. the simulation results show the effectiveness of the ga approach as a promising identification technique for pi controller gains. the two different 20 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 46 no. 2/2006 czech technical university in prague fig. 17: sudden variation for wind speed versus time fig. 18: stator frequency according to wind speed variation for seig controlled by pi-ga types of controllers improve the dynamic performance when tested against warious types of disturbances. 
appendix a – seig differential equation vds (volt) stator voltage‘s differential equation at direct axis v r i pds s ds b qs ds b � � � � � � � � � � � � � � � � � � � � � � � � � � (4) vqs stator voltage‘s differential equation at quadrate axis v r i pqs s qs b ds qs b � � � � � � � � � � � � � � � � � � � � � � � � � � (5) vdr rotor voltage‘s differential equation at direct axis v r i pdr r dr r b qr dr b � � � �� � � � � � � � � � � � � � � � � � � � � � � (6) vqr rotor voltage‘s differential equation at quadrate axis v r i pqr r qr r b dr qr b � � � �� � � � � � � � � � � � � � � � � � � � � � � (7) flux linkage differential equation for stator and rotor components: �ds s ds m dr dsx i x i i� � � � �� ( ), (8) where: �ds (weber) is the stator flux linkage at the direct axis, idr (amp) is the rotor current at the direct axis but ids (amp) is the stator current at direct axis, p is the differentiation parameter � d/dt. �qs s qs m qr qsx i x i i� � � � �� ( ), (9) where: �qs is the stator flux linkage at the quadrant axis, iqr is the rotor current at quadrant axis but iqs is the stator current at the quadrant axis. �dr r dr m dr dsx i x i i� � � �� ( ), (10) where: �dr is the rotor flux linkage at the direct axis. �qr r ds m qr qsx i x i i� � � �� ( ), (11) where: �qr is the rotor flux linkage at quadrant axis. d d � � � qs b qs s qs dst v r i� � � �( ), (12) where: �b is the base speed. p c d vm p w� 1 8 2 3( ) (13) c p � � � � � � � �� � � �� �( . . ) sin ( ) . .0 44 0 0167 3 15 0 3 0 00184� � � ( )� �� � � � � � �3 (14) where: �m (rad/sec) is the mechanical speed pm (kw) is the mechanical power tm (nm) is the mechanical torque n (rpm) is the rotor revolution per minute cp is the power coefficient of the wind turbine � is the blade pitch angle (degree) � is the tip speed ratio vw (m/s) is the wind speed d (m) is the of the rotor diameter of the wind turbine � 3.14 (kg/m3) air density appendix b – seig parameters the induction machine under study as a seig has the following parameters: 1.1 kw, 127/ 220 v (line voltage), 8.3/4.8 a (line current), 60 hz, 2 poles, wound-rotor induction machine [9, 10]. by choosing proper base values: � base voltage vb � [220/(1.73)] v, � base current ib � 4.8 a, � base impedance zb � 26.462 ohm, � base rotor speed nb � 3600 rpm, and � base frequency fb � 60 hz, the per-unit parameters of the induction machine under study are equal: � stator resistance rs � 0.0779, � rotor resistance rr � 0.0781, � stator reactance xs and rotor reactance xr are equal 0.0895. the equation of the motion of rotating parts of the combined studied seig and the wind turbine is also included in the system in order to provide a detailed simulation model. the inertia constant of the machine h � 0.055 s. references [1] li, wang, jian-yi-su: “dynamic performance of an isolated self excited induction generator under various loading conditions”, ieee transactions on energy conversion, vol. 15 (1999), no. 1, march 1999, p. 93–100. [2] li, wang, chinghuei lee: “longshunt and shortshunt connections on dynamic performance of a seig feeding an induction motor load”, ieee transactions on energy conversion, vol. 14 (2000), no. 1, p. 1–7. [3] golodberg, d.: genetic algorithms in search optimization and machine learning. addision-wesely, reading, ma, 1989. [4] attia, a., soliman, h.: “an efficient genetic algorithm for tuning pd controller of electric drive for astronomical telescope.” scientific bulletin of ain shams university, faculty of engineering, part ii, issue no. 37/2, june 30, 2002. 
[5] ezzeldin, s. abdin, wilson, xu: “control design and dynamic performance analysis of a wind turbine – induction generator unit”, ieee transaction on © czech technical university publishing house http://ctn.cvut.cz/ap/ 21 czech technical university in prague acta polytechnica vol. 46 no. 2/2006 energy conversion, vol. 15 (2000), no. 1, march 2000, p. 91–96. [6] marduchus, c.: “switched capacitor circuits for reactive power generation”, ph.d. thesis, brunuel university, 1983. [7] sekaj, i.: “genetic algorithm – based control system design and system identification”, 5th international mendel conference on soft computing, 1999, brno, czech republic, p. 139–144. [8] michalewic, z.: genetic algorithms + structure = evolution program. springer-verlag, berlin heidelberg, 1992. dr. ing. abdel-fattah attia phone:+202 5560046 fax:+202 5548020 e-mail: attiaa1@yahoo.com astronomy department national research institute of astronomy and geophysics (nriag), 11421 helwan cairo, egypt associate prof. dr. ing. hussein f. soliman e-mail: hfaridsoliman@yahoo.com dept. of electric power and machine faculty of engineering ain-shams university abbasia, cairo, egypt dr. ing. mokhymar sabry e-mail: sabry40@hotmail.com electricity & energy ministry new & renewable energy authority “nrea” wind management, cairo, egypt 22 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 46 no. 2/2006 czech technical university in prague acta polytechnica https://doi.org/10.14311/ap.2022.62.0023 acta polytechnica 62(1):23–29, 2022 © 2022 the author(s). licensed under a cc-by 4.0 licence published by the czech technical university in prague photonic graphene under strain with position-dependent gain and loss miguel castillo-celeitaa, ∗, alonso contreras-astorgab, david j. fernández c.a a cinvestav, physics department, p.o. box. 14-740, 07000 mexico city, mexico b cinvestav, conacyt – physics department, p.o. box. 14-740, 07000 mexico city, mexico ∗ corresponding author: mfcastillo@fis.cinvestav.mx abstract. we work with photonic graphene lattices under strain with gain and loss, modeled by the dirac equation with an imaginary mass term. to construct such hamiltonians and their solutions, we use the free-particle dirac equation and then a matrix approach of supersymmetric quantum mechanics to generate a new hamiltonian with a magnetic vector potential and an imaginary position-dependent mass term. then, we use a gauge transformation that maps our solutions to the final system, photonic graphene under strain with a position-dependent gain/loss term. we give explicit expressions for the guided modes. keywords: graphene, dirac materials, photonic graphene, matrix supersymmetric, quantum mechanics. 1. introduction graphene is the last known carbon allotrope, it was isolated for the first time by novoselov, geim, et al. in 2004 [1]. this material consists of a two-dimensional hexagonal arrangement of carbon atoms. graphene excels for its interesting properties, such as mechanical resistance, electrical conductivity, and optical opacity [2, 3]. the study of graphene has contributed to the development of different areas in physics, for example, in solid-state, graphene has prompted the discovery of other materials with similar characteristics, such as borophene and phosphorene. at low energy, the charge carriers in graphene behave like dirac massless particles, and from this approach, graphene has allowed the verification of the klein tunneling paradox as well as the quantum hall effect. 
these phenomena have gained a special interest in particle physics and quantum mechanics [4]. exploring graphene in an external constant magnetic field has allowed identifying the discrete bound states in the material, the so-called landau levels. moreover, theoretical physicist have analyzed the behavior of dirac electrons in graphene under different magnetic field profiles as well. supersymmetric quantum mechanics is a useful tool to find solutions of the dirac equation under external magnetic fields [5–9]. following this approach, a mechanical deformation in a graphene lattice is equivalent to introducing an external magnetic field [10, 11]. graphene has its analog in photonics, called photonic graphene. it is constructed through a twodimensional photonic crystal with weakly coupled optical fibers in a three-dimensional setting [12–17]. photonic graphene under strain is modeled through a deformation in the coupled optical fiber lattice [18– 21]. compared with the conventional graphene hamiltonian, the photonic graphene hamiltonian has an extra term that represents the gain/loss in the fibers. the literature on this topic always considers a constant gain/loss in space. with the previous motivation, we will apply supersymmetric quantum mechanics in a matrix approach (matrix susy-qm) to obtain solutions of the dirac equation for strain photonic graphene with a position-dependent gain/loss. 2. strain in photonic graphene the graphene structure consists of carbon atoms in a hexagonal arrangement similar to a honeycomb lattice. this structure can be described by two triangular sublattices of atoms, which are denoted as type a and type b. the base vectors to the unitary cell are given by a1 = a 2 ( √ 3, 3), a2 = a 2 (− √ 3, 3), (1) where a is the interatomic distance, for graphene a = 1.42 å (see figure 1a). the position of the atoms in the whole lattice can be defined by the set of vectors rl = l1a1 + l2a2, with l1, l2 ∈ z. an alternative description of graphene is through the first neighbors, which are connected by the vectors δn δ1 = a 2 ( √ 3, 1), δ2 = a 2 (− √ 3, 1), δ3 = a(0, −1). (2) a reciprocal lattice can be defined in the momentum space, which is also hexagonal, as shown in figure 1b. it is rotated 90◦ with respect to the original carbon network. a hexagon in the reciprocal lattice is recognized as the first brillouin zone. in this zone, there are only two inequivalent points, k± = (± 4π3√3a, 0). 23 https://doi.org/10.14311/ap.2022.62.0023 https://creativecommons.org/licenses/by/4.0/ https://www.cvut.cz/en m. castillo-celeita, a. contreras-astorga, d. j. fernández c. acta polytechnica a b a1a2 δ1δ2 δ3 (a). b1 b2 k+k(b). figure 1. (a) hexagonal graphene lattice. the lattice is constructed by type a and type b atoms, in this case, a1 and a2 correspond to the lattice unitary vectors, and δn are the vectors that connect the atoms a(b) with the nearest neighbors. (b) reciprocal lattice, which is characterized by the b1,2 vectors and k± correspond to the two possible inequivalent points in the lattice. all subsequent corners are determined from either k+ or k− plus integer multiples of the vectors b1 = 2π 3a ( √ 3, 1), b2 = 2π 3a (− √ 3, 1). (3) vectors ai and bj fulfill the condition ai · bj = 2πδij . 2.1. 
tight-binding model the tight-binding hamiltonian describes the hopping of an electron from an atom a (b) to an atom b (a) h = −t ∑ ri 3∑ n=1 (|ari ⟩ ⟨bri +δn | + |bri +δn ⟩ ⟨ari |), (4) where t ≈ 3 ev is called the hopping integral, ri runs over all sites in the sublattice a, thus |ari ⟩ is a state vector in these sites, the same applies to b and |bri+δn ⟩, recall that δn connects the atoms of the sublattice a(b) with its nearest neighbors in the sublattice b(a). the translational symmetry suggests the use of bloch states |ψbloch⟩ = 1 √ nc ∑ rj (eik·rj ψa(k) |arj ⟩ + eik·(rj +δ3)ψb (k) |brj +δ3 ⟩), (5) where nc is the number of the unitary cell [22]. then h |ψ⟩ = e |ψ⟩ becomes a matrix problem  0 −t 3∑ n=1 e−ik·δn −t 3∑ n=1 eik·δn 0   ( ψa ψb ) = e ( ψa ψb ) , (6) with ψa ≡ ψa(k) and ψb ≡ ψb (k) and the energy term given by e± = ± ∣∣∣∣∣t 3∑ n=1 e−ik·δn ∣∣∣∣∣ = ±t √ 3 + 2 cos( √ 3kxa) + 4 cos( √ 3kx a 2 ) cos(3ky a 2 ). to obtain an effective hamiltonian at low energy, we can consider the taylor series around the dirac points h(k = k± +q) ≈ q · ∇kh|k± . note that e(k±) = 0, as a consequence, at these points, the valence and conduction bands are connected. the above calculus leads to the analog of the dirac-weyl equation hϱψ = ℏv0(ϱσ1qx + σ2qy )ψ = eψ, (7) where ϱ = ±1 correspond to the k± valleys, v0 is called the fermi velocity, in graphene, v0 = 3ta/2ℏ ≈ c/300, with c being the velocity of light, σi are the pauli matrices σ1 = ( 0 1 1 0 ) , σ2 = ( 0 −i i 0 ) , σ3 = ( 1 0 0 −1 ) , (8) and ψ is a bi-spinor. the matrix nature of this equation is related to the sublattices a and b, this degree of freedom is called pseudo-spin. notice that at low energies, the dispersion relation is linear, given by e±(q) = ±ℏv0|q|, then, the dirac cones are connecting at e± = 0, as expected for particles without mass [23]. 2.2. uniform strain the photonic analog of a graphene lattice is built with weakly coupled optical fibers. this kind of photonic system is described by the same tight-binding hamiltonian in graphene with an additional term γa/b ; that represents the gain and loss in the optical fibers in the position a/b, this new term produces an attenuation or amplification in the optical modes. if we consider uniform strain in the lattices, which is represented by a strain tensor u = ( u11 0 0 u22 ) , (9) the fermi velocity is modified in the following form vij = v0(1 + (1 − β)uij ). (10) the hopping integrals are modified with a little perturbation t → tn, that, considering the changes in the orbitals by the modification of the carbon distances tn ≈ t ( 1 − β a2 δn · u · δn ) , (11) 24 vol. 62 no. 1/2022 photonic graphene under strain with position-dependent . . . where β = − ∂ ln t ∂ ln a (12) is the grüneisen parameter that depends on the model; for graphene, β is between 2 and 3 [10] (see also [24, 25]). in photonic graphene, a is the distance between adjacent waveguides. the hamiltonian of a photonic graphene with a uniform strain reads as h = γa ∑ ri |ari ⟩ ⟨ari | + γb ∑ ri |bri +δn ⟩ ⟨bri +δn | − ∑ ri 3∑ n=1 tn(|ari ⟩ ⟨bri +δn | + |bri +δn ⟩ ⟨ari |). (13) the deformation of the lattice produces a shift of the dirac points kd± ≈ (1 − u) · k± ± a, where a = (ax, ay ) ax = β 2a (u11 − u22), ay = − β 2a (2u12). (14) using the bloch solution, the hamiltonian under strain takes the form: h =   γa − 3∑ n=1 tne −ik·(1−u)·δn − 3∑ n=1 tne ik·(1−u)·δn γb   , (15) under the assumption |u·δn| ≪ a. 
in this work, we will assume that γa = iγ and γb = γ∗a, then, for positive γ, the waveguides in the sublattice a (b) present the energy gain (loss), as in the arrangements proposed in [14]. expanding this hamiltonian around the dirac points, through the substitution k = kd± + q, one arrives at a dirac hamiltonian analog with minimal coupling h = v0σ · (1 + u − βu)q + iγσ3. (16) comparing with (7), the effect of strain is equivalent, to consider magnetic-like field modeled through a pseudo-magnetic vector potential a. the last term represents a gain/loss balance in sublattices a/b. in photonic graphene, strain could be generated by deformations in the geometry of the optical-fiber lattice. 2.3. non-uniform strain for non-uniform strain, the deformation matrix depends of the position, u → u(r). thus, the expression for the hamiltonian becomes h = −iσi √ vij∂j √ vij + v0σiai + iγσ3, (17) considering a strain tensor of the form u = ( u11(x) 0 0 u22(y) ) , (18) and equations (10) and (14), still apply. we can also write the strain hamiltonian as h(x,y) = − iσ1 √ v11∂x √ v11 − iσ2 √ v22∂y √ v22 + σ1 v0β 2a (u11(x) − u22(y)) + iγσ3, (19) where v11 = v11(x), v22 = v22(y). we can relate the eigenvalue equation of this hamiltonian, hψ = eψ, with a strain-free one using the following transformation. first, we define the coordinates r = ∫ v0 v11(x) dx, s = ∫ v0 v22(y) dy, (20) and the operator g(x,y) = √ v11v22 v0 exp ( iv0β 2a ∫ x 0 u11(q) v11(q) dq ) , (21) then, h will be related with a flat fermi velocity hamiltonian h0 as h(x,y) = g−1(x,y)h0(r(x),s(y))g(x,y), (22) where h0φ = ( −iv0σ1∂r − iv0σ2∂s − v0β 2a u22 + iγσ3 ) φ, (23) and u22 = u22(y(r,s)). the solutions are mapped as ψ(x,y) = g−1(x,y)φ(r(x),s(y)). (24) the energy spectrum is the same for both hamiltonians [18, 19, 26]. 3. supersymmetric quantum mechanics: matrix approach supersymmetric quantum mechanics (susy-qm) is a method that relates two schrödinger hamiltonians through an intertwining operator [27, 28]. another approach is the matrix susy-qm, which intertwines two dirac hamiltonians h0, h1 by a matrix operator l. in this work, we use the latter to construct an appropriate hamiltonian h1 that will be linked via the operator g introduced in (21) to a photonic graphene system under strain. for the sake of completeness we will give a brief review of matrix susy-qm (more details can be found in [29]). we start by proposing the following intertwining relation: l1h0 = h1l1, (25) where the dirac hamiltonians are given by h0 = −iσ2∂s + v0(s), h1 = −iσ2∂s + v1(s), (26) and the intertwining operator is l1 = ∂s − usu−1, (27) with u being a matrix function called seed or transformation matrix, the subindex in us represent the derivative respect to s, and u must satisfy h0u = uλ. let us write u in a general form and λ as a diagonal matrix u = ( u11 u12 u21 u22 ) , λ = ( λ1 0 0 λ2 ) . (28) 25 m. castillo-celeita, a. contreras-astorga, d. j. fernández c. acta polytechnica from the intertwining relation and the given definitions, the potential v1 can be written in terms of the potential v0 and the transformation matrix as v1 = v0 + i[usu−1,σ2]. (29) solutions of the dirac equation h0ξ = eξ can be mapped onto solutions of h1φ = eφ using the intertwining operator as φ ∝ l1ξ. there are some extra solutions, usually referred to missing states. they can be obtained from each column of (ut )−1, named φλj , j = 1, 2, which satisfy h1φλj = λj φλj . if the vectors φλj fulfill the boundary conditions of the problem, λj must be included in the spectrum of h1. 
as a summary: with this technique, we start from $h_0$, its eigenspinors and spectrum; then we construct $h_1$ and obtain the solutions of the corresponding dirac equation and the spectrum. let us now mention that it is possible to iterate this technique. the main advantage comes from the modification of the spectrum, since with each iteration we can add more energy levels. the second-order matrix susy-qm can be reached through a second intertwining relation
$$l_2 h_1 = h_2 l_2, \tag{30}$$
which is similar to (25). the intertwining operator now takes the form
$$l_2 = \partial_s - (u_2)_s u_2^{-1}. \tag{31}$$
the operator $l_1$ is used to determine the transformation matrix of the second iteration, $u_2 = l_1 w$, where $w$ fulfills the relation $h_0 w = w\lambda_2$. in this case, $\lambda_2$ is a hermitian matrix that we choose diagonal once again,
$$\lambda_2 = \begin{pmatrix} \tilde\lambda_1 & 0 \\ 0 & \tilde\lambda_2 \end{pmatrix}, \qquad w = \begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{pmatrix}. \tag{32}$$
the elements of $\lambda_2$ are such that $(\tilde\lambda_1, \tilde\lambda_2) \neq (\lambda_1, \lambda_2)$. therefore, the second-order potential is given by
$$v_2 = v_1 + i\left[(u_2)_s u_2^{-1}, \sigma_2\right]. \tag{33}$$
the solutions of $h_2\chi = e\chi$ are obtained from the eigenspinors of $h_1$ as $\chi \propto l_2\phi$. the second-order matrix susy-qm generates, in principle, two sets of eigenspinors that correspond to the columns of the matrices $(u_2^t)^{-1}$ and $l_2(u^t)^{-1}$.

4. photonic graphene under strain and position-dependent gain and loss

in this section, we start from the auxiliary dirac equation of a free particle with imaginary mass, and, using matrix susy-qm and a gauge transformation $g$, we obtain a photonic graphene model with strain and position-dependent gain/loss. we show that we can iterate the technique to add more propagation modes.

figure 2. graph of the functions $v_0 k_r + k(s)$ (blue line) and the gain/loss term $\gamma - \Gamma(s)$ (dashed red line) for $\epsilon = 1.5$, $k_r = \pi$, $\gamma = 1$, $v_0 = 1.0$. notice that $\gamma - \Gamma(s)$ coincides asymptotically with $\gamma$.

4.1. photonic graphene with a single mode

let us start from the free-particle dirac equation where we included a purely imaginary mass term:
$$h_0\phi = (-i v_0\sigma_1\partial_r - i v_0\sigma_2\partial_s + i\gamma\sigma_3)\phi. \tag{34}$$
considering $\phi(r,s) = \exp(i k_r r)\,(\phi_a(s), \phi_b(s))^t$, the hamiltonian can be written as
$$h_0(r,s) = -i v_0\sigma_2\partial_s + v_0, \tag{35}$$
where $v_0 = v_0 k_r\sigma_1 + i\gamma\sigma_3$. now, we use the matrix susy-qm to construct a new system. a convenient selection of the $\lambda$ elements is $\lambda_1 = \epsilon = -\lambda_2$. we build the transformation matrix $u$ with the entries $u_{21} = u_{22}^* = \cosh(\kappa s) + i\sinh(\kappa s)$; the corresponding momentum in $s$ is given by $\kappa = \sqrt{k_r^2 - (\gamma^2 + \epsilon^2)/v_0^2}$. the other two components are found through the equation
$$u_{1j} = \frac{v_0}{\lambda_j - i\gamma}\left(-u_{2j}' + k_r u_{2j}\right), \quad j = 1, 2. \tag{36}$$
from (29) we obtain $v_1$ as
$$v_1 = v_0 + \sigma_1 k(s) - i\sigma_3\Gamma(s), \tag{37}$$
where $\Gamma(s)$, $k(s)$ are given by
$$\Gamma = 2\gamma + \frac{2\epsilon\bigl(\kappa(\gamma\sinh(2\kappa s) + \epsilon) - \gamma k_r\cosh(2\kappa s)\bigr)}{\kappa(\gamma - \epsilon\sinh(2\kappa s)) + k_r\epsilon\cosh(2\kappa s)},$$
$$k = \frac{2 v_0 k_r\epsilon\bigl(k_r\cosh(2\kappa s) - \kappa\sinh(2\kappa s)\bigr)}{\kappa(\gamma - \epsilon\sinh(2\kappa s)) + k_r\epsilon\cosh(2\kappa s)} - 2 v_0 k_r.$$
figure 2 shows a plot of the functions $v_0 k_r + k(s)$ and $\gamma - \Gamma(s)$; a short numerical sketch of these profile functions is given below, after (40). the new hamiltonian takes the form
$$h_1(r,s) = -i\sigma_2 v_0\partial_s + \sigma_1(-i v_0\partial_r + k) + i\sigma_3(\gamma - \Gamma). \tag{38}$$
this system supports two bound states. they are the columns of the matrix $(u^t)^{-1} = (\varphi_\epsilon, \varphi_{-\epsilon})$. the eigenvector associated with $\epsilon$ is given by
$$\varphi_\epsilon(r,s) = \frac{e^{i k_r r}}{2}\begin{pmatrix} -\dfrac{(\gamma^2 + \epsilon^2)\bigl(\cosh(\kappa s) + i\sinh(\kappa s)\bigr)}{v_0\kappa(\gamma - \epsilon\sinh(2\kappa s)) + v_0 k_r\epsilon\cosh(2\kappa s)} \\[2ex] \dfrac{(\gamma - i\epsilon)\bigl((\kappa + i k_r)\cosh(\kappa s) - (k_r + i\kappa)\sinh(\kappa s)\bigr)}{\kappa(\gamma - \epsilon\sinh(2\kappa s)) + k_r\epsilon\cosh(2\kappa s)} \end{pmatrix}. \tag{39}$$
our next step is to apply the gauge transformation defined in (20)–(22). the strain and fermi velocity tensors that we consider are
$$\bar{u} = \begin{pmatrix} 0 & 0 \\ 0 & -\frac{2a}{\beta}k(y) \end{pmatrix}, \qquad v = v_0\begin{pmatrix} 1 & 0 \\ 0 & 1 - (1-\beta)\frac{2a}{\beta}k(y) \end{pmatrix}, \tag{40}$$
see (18).
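before mapping this system into the strained photonic lattice, the following minimal sketch (an addition for illustration, not from the original article) evaluates the closed-form corrections $k(s)$ and $\Gamma(s)$ of eq. (37) for the figure 2 parameters and checks the asymptotic statement in the caption, namely that $v_0 k_r + k(s) \to v_0 k_r$ and $\gamma - \Gamma(s) \to \gamma$ far from $s = 0$:

```python
import numpy as np

# parameters of figure 2
v0, kr, gam, eps = 1.0, np.pi, 1.0, 1.5
kap = np.sqrt(kr**2 - (gam**2 + eps**2) / v0**2)

def profiles(s):
    """closed-form susy corrections k(s) and Gamma(s) from eq. (37)."""
    den = kap * (gam - eps * np.sinh(2 * kap * s)) + kr * eps * np.cosh(2 * kap * s)
    big_gamma = 2 * gam + 2 * eps * (kap * (gam * np.sinh(2 * kap * s) + eps)
                                     - gam * kr * np.cosh(2 * kap * s)) / den
    k = 2 * v0 * kr * eps * (kr * np.cosh(2 * kap * s)
                             - kap * np.sinh(2 * kap * s)) / den - 2 * v0 * kr
    return k, big_gamma

# both corrections are localized around s = 0 and die out at large |s|
for s in (-4.0, -1.0, 0.0, 1.0, 4.0):
    k, g = profiles(s)
    print(f"s = {s:+.1f}:  v0*kr + k = {v0 * kr + k:.4f},  gam - Gamma = {gam - g:.4f}")
```

note that the common denominator stays strictly positive for $\gamma \ge 0$ because $k_r > \kappa$, so the profiles are free of singularities.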
the change of variables in (20) becomes
$$r = x, \qquad s = \int \frac{\mathrm{d}y}{1 - (1-\beta)\frac{2a}{\beta}k(y)}, \tag{41}$$
and the operator $g(x,y) = \sqrt{1 - (1-\beta)\frac{2a}{\beta}k(y)}$. this choice leads to the following hamiltonian:
$$h_1(x,y) = -i v_0\sigma_1\partial_x - i\sigma_2\sqrt{v_{22}(y)}\,\partial_y\sqrt{v_{22}(y)} - \sigma_1\frac{v_0\beta}{2a}u_{22}(y) + i(\gamma - \Gamma(y))\sigma_3. \tag{42}$$
bounded eigenstates of $h_1$ can be found as $\psi_\epsilon(x,y) = g^{-1}(x,y)\,\varphi_\epsilon(r(x),s(y))$. in this system, there is a single mode in the upper dirac cone and another in the bottom cone. the strain generates an analog of a magnetic field perpendicular to the graphene layer, $\vec{b}(y) = (\beta/2a)\,\partial_y u_{22}\,\hat{z}$. since we are working with a photonic graphene, such a pseudo-magnetic field affects light. moreover, the term $i\Gamma(y)\sigma_3$ indicates a position-dependent gain/loss in the optical fibers of the sublattice a/b. figure 3a shows the square modulus of each component of $\varphi_\epsilon = (\phi_{\epsilon a}, \phi_{\epsilon b})^t$ (shadowed curves) and the intensity $|\phi_{\epsilon a}|^2 + |\phi_{\epsilon b}|^2$ (red curve). figure 3b shows the same for the mode $\psi_\epsilon$.

figure 3. (a) plot of the individual intensities $|\phi_{\epsilon a}|^2$ (gray curve) and $|\phi_{\epsilon b}|^2$ (blue curve) and the total intensity $|\phi_{\epsilon a}|^2 + |\phi_{\epsilon b}|^2$ (red line). (b) analog of the (a) plot for the solution $\psi_\epsilon$ of the hamiltonian under strain. the parameters in this case are: $k_r = \pi$, $\epsilon = 1.5$, $a = 1.0$, $\beta = 0.8$, $\gamma = 1$, $v_0 = 1.0$.

4.2. photonic graphene with two modes

in this subsection, we use two iterations of the matrix susy-qm, starting again from the free-particle hamiltonian. let us choose an initial system with zero gain/loss ($\gamma = 0$), which is a massless fermion in graphene. in the first matrix susy-qm step we use the same transformation matrix $u$ as in the example above. the first matrix susy-qm partner hamiltonian has the form
$$h_1 = -i\sigma_2 v_0\partial_s + \sigma_1 k(s) + i\sigma_3\gamma(s), \tag{43}$$
where
$$k(s) = k_r v_0, \qquad \gamma(s) = \frac{2\epsilon v_0\kappa}{\kappa v_0\sinh(2\kappa s) - k_r v_0\cosh(2\kappa s)}, \tag{44}$$
with $\kappa = \sqrt{(k_r v_0)^2 - \epsilon^2}/v_0$. as a result of the first matrix susy-qm step, a position-dependent gain/loss term $\gamma(s)$ is generated. the iteration of the method requires defining the second diagonal matrix $\lambda_2$, with $\tilde\lambda_1 = -\tilde\lambda_2 = \epsilon_1 \neq \epsilon$, and the second transformation matrix $u_2 = l_1 w$. for this example, we choose $w_{21} = \cosh(\kappa_2 s)$ and $w_{22} = \cosh(\kappa_2 s)$, where $\kappa_2 = \sqrt{(k_r v_0)^2 - \epsilon_1^2}/v_0$. the other two components can be found through the equation
$$w_{1j} = \frac{v_0}{\tilde\lambda_j}\left(-w_{2j}' + k_r w_{2j}\right), \quad j = 1, 2. \tag{45}$$
the potential $v_2$ can be calculated from (33),
$$v_2 = v_1 + \sigma_1 k_2(s) + i\sigma_3\gamma_2(s) = v_0 + \sigma_1 k_2 + i\sigma_3(\gamma + \gamma_2).$$
the functions $v_0 k_r + k_2(s)$ and $\gamma(s) + \gamma_2(s)$ are shown in figure 4. it is important to highlight that the gain/loss term remains a pure imaginary quantity.

figure 4. graph of the function $v_0 k_r + k_2(s)$ (black line) and the gain/loss function $\gamma(s) + \gamma_2(s)$ (red dashed line), for $\epsilon = 1.5$, $\epsilon_1 = 2.0$, $k_r = \pi$, $\gamma = 0$, $v_0 = 1.0$.

figure 5. (a) intensity of the superposition $|\bar\varphi(s,z)|^2$ propagating in the z-axis. (b) intensity of the superposition $|\bar\psi(y,z)|^2$. the values of the parameters taken are $\epsilon = 1.5$, $\epsilon_1 = 2.0$, $k_r = \pi$, $\gamma = 0$, $\beta = 0.8$, $a = 1.0$, $v_0 = 1.0$.

the second matrix susy-qm step introduces two new sets of eigenmodes. they can be extracted from the columns of the matrix $(u_2^t)^{-1} = (\chi_{\epsilon_1}, \chi_{-\epsilon_1})$. the eigenmodes added in the first step are mapped as $\chi_{\pm\epsilon} = l_2\varphi_{\pm\epsilon}$. similar to the previous example, it is possible to perform the gauge transformation (20)–(22). then, in the system under strain, the modes become $\psi_{\pm\epsilon_1}(x,y) = g^{-1}(x,y)\,\chi_{\pm\epsilon_1}(r(x),s(y))$, $\psi_{\pm\epsilon}(x,y) = g^{-1}(x,y)\,\chi_{\pm\epsilon}(r(x),s(y))$.
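as an illustration of the gauge mapping (again an added sketch, not part of the original article), the code below builds $s(y)$ from (41) by numerical quadrature, evaluates $g$, and maps the bound spinor (39) into the strained system via $|\psi_\epsilon|^2 = g^{-2}|\varphi_\epsilon(s(y))|^2$, cf. (24). since the text defines $k(y)$ only implicitly, the sketch simply reuses the closed-form $k$ of section 4.1 evaluated at $y$; this and the integration constant $s(0) = 0$ are assumptions made here only to obtain an explicit example. python with numpy and scipy is assumed.

```python
import numpy as np
from scipy.integrate import cumulative_trapezoid

# parameters as in figure 3
v0, kr, gam, eps, a, beta = 1.0, np.pi, 1.0, 1.5, 1.0, 0.8
kap = np.sqrt(kr**2 - (gam**2 + eps**2) / v0**2)

def k_fun(t):
    """closed-form k from eq. (37), reused as k(y) (an assumption, see above)."""
    den = kap * (gam - eps * np.sinh(2 * kap * t)) + kr * eps * np.cosh(2 * kap * t)
    return 2 * v0 * kr * eps * (kr * np.cosh(2 * kap * t)
                                - kap * np.sinh(2 * kap * t)) / den - 2 * v0 * kr

y = np.linspace(-4, 4, 2001)
v22 = v0 * (1 - (1 - beta) * (2 * a / beta) * k_fun(y))   # eq. (40)
s = cumulative_trapezoid(v0 / v22, y, initial=0.0)        # eq. (41)
s -= np.interp(0.0, y, s)                                 # fix the constant: s(y=0) = 0
g = np.sqrt(v22 / v0)                                     # g = sqrt(v11 v22)/v0 with v11 = v0

def phi_eps(t):
    """bound spinor of eq. (39); the global phase e^{i kr r}/2 is dropped."""
    d = kap * (gam - eps * np.sinh(2 * kap * t)) + kr * eps * np.cosh(2 * kap * t)
    up = -(gam**2 + eps**2) * (np.cosh(kap * t) + 1j * np.sinh(kap * t)) / (v0 * d)
    lo = (gam - 1j * eps) * ((kap + 1j * kr) * np.cosh(kap * t)
                             - (kr + 1j * kap) * np.sinh(kap * t)) / d
    return up, lo

ua, ub = phi_eps(s)
intensity = (np.abs(ua)**2 + np.abs(ub)**2) / g**2        # |psi|^2 = g^{-2} |phi|^2
print("mode intensity peaks at y =", y[np.argmax(intensity)])
```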
therefore, in this new optical system, two guided modes are created in the upper dirac cone and two more in the bottom dirac cone. finally, let us mention that we can have superpositions of the introduced modes and let them propagate along the z-axis inside the photonic graphene. for example, in the flat fermi velocity system (before the gauge transformation),
$$\bar\varphi(s,z) = a_1 e^{-i\epsilon z}\varphi_\epsilon(s) + a_2 e^{-i\epsilon_1 z}\varphi_{\epsilon_1}(s) \tag{46}$$
becomes
$$\bar\psi(y,z) = a_1 e^{-i\epsilon z}\psi_\epsilon(y) + a_2 e^{-i\epsilon_1 z}\psi_{\epsilon_1}(y) \tag{47}$$
in the photonic graphene system under strain with the position-dependent gain/loss balance. figure 5a shows the propagation along the z-axis of the intensity $|\bar\varphi(s,z)|^2$, while figure 5b shows $|\bar\psi(y,z)|^2$.

5. summary

this article shows a natural way to construct hamiltonians associated with a photonic graphene under strain with a position-dependent gain/loss balance. the main tools that we use are a matrix approach to supersymmetric quantum mechanics and a gauge transformation. with a correct choice of a transformation matrix $u$, it is possible to add a bound state to the free-particle hamiltonian using the matrix susy-qm, but the dirac equation will have two new terms in the potential, $v_1 = v_0 + \sigma_1 k(s) - i\sigma_3\Gamma(s)$. the function $k$ could be associated with a magnetic vector potential, but the function $i\Gamma$ is related to an imaginary mass term, which is difficult to interpret or realize in solid-state graphene. the gauge transformation $g$ maps solutions from the flat fermi velocity system of the previous step to a graphene system under strain. at this point, it becomes relevant to work with the photonic graphene. the magnetic vector potential translates into deformations of the lattice of optical fibers, while the $i\Gamma$ function indicates the gain/loss of the fibers in the sublattice a/b. we end with the hamiltonian of photonic graphene with a single mode. this mode is confined by the strain and the position-dependent gain/loss balance. finally, we show that the technique can be iterated to have two or more modes in the photonic graphene.

acknowledgements

the authors acknowledge the support of conacyt, grant fordecyt-pronaces/61533/2020. m. c-c. acknowledges as well the conacyt fellowship 301117.

references

[1] k. s. novoselov, a. k. geim, s. v. morozov, et al. electric field effect in atomically thin carbon films. science 306(5696):666–669, 2004. https://doi.org/10.1126/science.1102896.
[2] r. c. andrew, r. e. mapasha, a. m. ukpong, n. chetty. mechanical properties of graphene and boronitrene. physical review b 85(12):125428, 2012. https://doi.org/10.1103/physrevb.85.125428.
[3] s.-e. zhu, s. yuan, g. c. a. m. janssen. optical transmittance of multilayer graphene. epl (europhysics letters) 108(1):17007, 2014. https://doi.org/10.1209/0295-5075/108/17007.
[4] a. k. geim, k. s. novoselov. the rise of graphene. in nanoscience and technology: a collection of reviews from nature journals, pp. 11–19. 2009. https://doi.org/10.1142/9789814287005_0002.
[5] ş. kuru, j. negro, l. m. nieto. exact analytic solutions for a dirac electron moving in graphene under magnetic fields. journal of physics: condensed matter 21(45):455305, 2009. https://doi.org/10.1088/0953-8984/21/45/455305.
[6] b. midya, d. j. fernández c.
dirac electron in graphene under supersymmetry generated magnetic fields. journal of physics a: mathematical and theoretical 47(28):285302, 2014. https://doi.org/10.1088/1751-8113/47/28/285302.
[7] a. contreras-astorga, a. schulze-halberg. the confluent supersymmetry algorithm for dirac equations with pseudoscalar potentials. journal of mathematical physics 55(10):103506, 2014. https://doi.org/10.1063/1.4898184.
[8] m. castillo-celeita, d. j. fernández c. dirac electron in graphene with magnetic fields arising from first-order intertwining operators. journal of physics a: mathematical and theoretical 53(3):035302, 2020. https://doi.org/10.1088/1751-8121/ab3f40.
[9] a. contreras-astorga, f. correa, v. jakubský. super-klein tunneling of dirac fermions through electrostatic gratings in graphene. physical review b 102(11):115429, 2020. https://doi.org/10.1103/physrevb.102.115429.
[10] g. g. naumis, s. barraza-lopez, m. oliva-leyva, h. terrones. electronic and optical properties of strained graphene and other strained 2d materials: a review. reports on progress in physics 80(9):096501, 2017. https://doi.org/10.1088/1361-6633/aa74ef.
[11] m. oliva-leyva, c. wang. theory for strained graphene beyond the cauchy–born rule. physica status solidi (rrl) – rapid research letters 12(9):1800237, 2018. https://doi.org/10.1002/pssr.201800237.
[12] m. polini, f. guinea, m. lewenstein, et al. artificial honeycomb lattices for electrons, atoms and photons. nature nanotechnology 8(9):625–633, 2013. https://doi.org/10.1038/nnano.2013.161.
[13] y. plotnik, m. c. rechtsman, d. song, et al. observation of unconventional edge states in 'photonic graphene'. nature materials 13(1):57–62, 2014. https://doi.org/10.1038/nmat3783.
[14] h. ramezani, t. kottos, v. kovanis, d. n. christodoulides. exceptional-point dynamics in photonic honeycomb lattices with pt symmetry. physical review a 85(1):013818, 2012. https://doi.org/10.1103/physreva.85.013818.
[15] g. g. pyrialakos, n. s. nye, n. v. kantartzis, d. n. christodoulides. emergence of type-ii dirac points in graphynelike photonic lattices. physical review letters 119(11):113901, 2017. https://doi.org/10.1103/physrevlett.119.113901.
[16] s. grosche, a. szameit, m. ornigotti. spatial goos-hänchen shift in photonic graphene. physical review a 94(6):063831, 2016. https://doi.org/10.1103/physreva.94.063831.
[17] t. ozawa, a. amo, j. bloch, i. carusotto. klein tunneling in driven-dissipative photonic graphene. physical review a 96(1):013813, 2017. https://doi.org/10.1103/physreva.96.013813.
[18] a. szameit, m. c. rechtsman, o. bahat-treidel, m. segev. pt-symmetry in honeycomb photonic lattices. physical review a 84(2):021806, 2011. https://doi.org/10.1103/physreva.84.021806.
[19] h. schomerus, n. y. halpern. parity anomaly and landau-level lasing in strained photonic honeycomb lattices. physical review letters 110(1):013903, 2013. https://doi.org/10.1103/physrevlett.110.013903.
[20] m. c. rechtsman, j. m. zeuner, a. tünnermann, et al. strain-induced pseudomagnetic field and photonic landau levels in dielectric structures. nature photonics 7(2):153–158, 2013. https://doi.org/10.1038/nphoton.2012.302.
[21] d. a. gradinar, m. mucha-kruczyński, h. schomerus, v. i. fal'ko. transport signatures of pseudomagnetic landau levels in strained graphene ribbons. physical review letters 110(26):266801, 2013. https://doi.org/10.1103/physrevlett.110.266801.
[22] c. bena, g. montambaux. remarks on the tight-binding model of graphene. new journal of physics 11(9):095003, 2009.
https://doi.org/10.1088/1367-2630/11/9/095003.
[23] p. dietl, f. piéchon, g. montambaux. new magnetic field dependence of landau levels in a graphenelike structure. physical review letters 100(23):236405, 2008. https://doi.org/10.1103/physrevlett.100.236405.
[24] t. m. g. mohiuddin, a. lombardo, r. r. nair, et al. uniaxial strain in graphene by raman spectroscopy: g peak splitting, grüneisen parameters, and sample orientation. physical review b 79(20):205433, 2009. https://doi.org/10.1103/physrevb.79.205433.
[25] f. ding, h. ji, y. chen, et al. stretchable graphene: a close look at fundamental parameters through biaxial straining. nano letters 10(9):3453–3458, 2010. https://doi.org/10.1021/nl101533x.
[26] a. contreras-astorga, v. jakubský, a. raya. on the propagation of dirac fermions in graphene with strain-induced inhomogeneous fermi velocity. journal of physics: condensed matter 32(29):295301, 2020. https://doi.org/10.1088/1361-648x/ab7e5b.
[27] d. j. fernández c. susy quantum mechanics. international journal of modern physics a 12(01):171–176, 1997. https://doi.org/10.1142/s0217751x97000232.
[28] d. j. fernández c., n. fernández-garcía. higher-order supersymmetric quantum mechanics. aip conference proceedings 744(1):236–273, 2004. https://doi.org/10.1063/1.1853203.
[29] l. m. nieto, a. a. pecheritsin, b. f. samsonov. intertwining technique for the one-dimensional stationary dirac equation. annals of physics 305(2):151–189, 2003. https://doi.org/10.1016/s0003-4916(03)00071-x.

acta polytechnica 62(1):23–29, 2022

https://doi.org/10.14311/ap.2022.62.0488
acta polytechnica 62(4):488–497, 2022
© 2022 the author(s). licensed under a cc-by 4.0 licence. published by the czech technical university in prague.

a comparative study of ferrofluid lubrication on double-layer porous squeeze curved annular plates with slip velocity

niru c. patel^a, jimit r. patel^a,*, gunamani m. deheri^b

^a charotar university of science and technology (charusat), p. d. patel institute of applied sciences, department of mathematical sciences, charusat campus, changa 388 421, gujarat, india
^b sardar patel university, department of mathematics, v. v.
nagar 388 120, anand, gujarat, india
* corresponding author: patel.jimitphdmarch2013@gmail.com

abstract. this article makes an effort to present a comparative study of the performance of a shliomis model-based ferrofluid (ff) lubrication of a porous squeeze film in curved annular plates, taking slip velocity into account. the modified darcy's law has been adopted to find the impact of the double-layered porosity, while the slip velocity effect has been calculated according to beavers and joseph's slip conditions. the modified reynolds equation for the double-layered bearing system is solved to compute a dimensionless pressure profile and load-bearing capacity (lbc). the graphical results of the study reveal that the lbc increases with magnetization, volume concentration and the upper plate's curvature parameter, while it decreases with the other parameters, for both film thickness profiles. the comparative study suggests that the exponential film thickness profile is more suitable to enhance the lbc for annular plates lubricated by ferrofluid, including in the presence of slip. the study shows that the slip model performs quite well and that there is a potential for improving the performance efficiency. besides, multiple methods have been presented to enhance the performance of the above-mentioned bearing system by selecting various combinations of the parameters governing the system.

keywords: shliomis model, curved annular plates, double-layered porous, slip velocity, exponential and hyperbolic film profile, ferrofluid.

1. introduction

porous materials seem to be ubiquitous and play a notable role in many aspects of day-to-day life. they are extensively used in various areas, such as energy management, automotive vibration damping, heat insulation, sound processing, turbine industries and fluid filtration. due to the phenomenon known as self-lubrication, a porous bearing has a porous film filled with some amount of lubricant, so that it does not require further lubrication throughout the life period of the bearing. the lubricant comes out of the porous layer and is deposited between the annular plates to inhibit friction and wear, as well as to withstand the original load applied to the annular plates. therefore, the lubrication effect of a double porous layer is better than that of a single porous layer. owing to the remarkable mechanical properties and wide applications of annular plates, many researchers have focused on analysing annular bearing systems, such as lin [1], shah and bhat [2], bujurke et al. [3], deheri et al. [4], fatima et al. [5] and hanumagowda et al. [6]. also, numerous studies (ting [7], gupta et al. [8], bhat and deheri [9], shah et al. [10], shimpi and deheri [11], patel and deheri [12], rao and agarwal [13], vasanth et al. [14] and shah et al. [15]) have been carried out to examine the impact of porosity on the effectiveness of annular plates.

a synthetic fluid, namely "ferrofluid", is a colloidal dispersion of ferromagnetic particles in a liquid carrier. besides being used in elastic dampers to reduce noise, ffs are used in cooling and heating cycles, long-term sealing of rotating shafts, and reducing unwanted resonances in loudspeakers. in the last four decades, several investigators (kumar et al. [16], sinha et al. [17], shah and bhat [18], patel and deheri [19], shah and shah [20] and munshi et al. [21]) have worked on ff lubrication theory to examine the behaviour of various bearing systems.
alternative physical boundary conditions were proposed in the advanced study of beavers and joseph [22], which allowed a non-zero tangential velocity (called slip velocity) at the surfaces and uncovered that slip velocity has a broad effect on the bearing performance. several studies have been documented in the literature about slip velocities under different conditions in bearing systems (chattopadhyay and majumdar [23], shah and parsania [24], shah and patel [25], venkata et al. [26], deheri and patel [27], patel and deheri [28], shah et al. [29], and mishra et al. [30]). in their studies, fragassa et al. [31], janevski et al. [32] and geike [33] analysed the theory of static-dynamic load and lubricated contacts, respectively. these investigations confirm that the load profile remains crucial for the bearing design. patel and deheri [34] examined the influence of variations of viscosity of the ferrofluid on long bearings. it was noticed that the viscosity variation does not help to increase the lbc in the case of long bearings. patel and deheri [35] presented a comparison of porous structures in shliomis-model-based ferrofluid lubrication of a squeeze film between rotating rough curved circular plates. it was ascertained that the kozeny-carman model has an edge over irmay's model in improving the lbc. a study of thin film lubrication at the nanoscale appears in patel and deheri [36], where a ferrofluid-based infinitely long rough porous slider bearing was considered. it was found that the magnetic fluid induced a higher load and showed a further improvement when thin film lubrication at the nanoscale took place.

very few studies have been made regarding ferrofluid lubrication in multi-layered porous plates in the presence of slip velocity, and even fewer comparative studies concerning the performance of a ferrofluid-lubricated porous squeeze film in a multi-layered bearing system considering slip velocity. thus, it was thought proper to put forward a comparative study regarding the performance of a ferrofluid-based squeeze film in two-layered porous annular plates when the slip velocity is taken into account. to what extent can the ferrofluid lubrication counter the adverse effect of porosity and slip velocity? this fundamental question has been addressed while presenting the comparison.

2. analysis

figure 1 involves two annular disks (inner and outer radius $b$ and $a$, respectively, $b < a$) with a curved (exponential or hyperbolic film) upper surface and a flat lower surface.

figure 1. diagram of the porous annular bearing

in view of murti [37], shah and bhat [2] and patel and deheri [38], the thickness profile $h$ of the film is assumed as
$$h(r) = h_0 e^{-\beta r^2}, \quad b \le r \le a$$
for the exponential profile and
$$h(r) = \frac{h_0}{1 + \beta r}, \quad b \le r \le a \tag{1}$$
for the hyperbolic profile. as per the discussions of shliomis [39] and kumar [40], and neglecting the assumptions of shukla and kumar [41], the governing equations for the flow of ff as suggested by shliomis [39] are
$$-\nabla p + \eta\nabla^2 q + \mu_0(M\cdot\nabla)H + \frac{1}{2\tau_s}\nabla\times(S - I\omega) = 0, \tag{2}$$
$$S = I\omega + \mu_0\tau_s(M\times H), \tag{3}$$
$$M = \frac{M_0}{H}H + \frac{\tau_B}{I}(S\times M), \tag{4}$$
with the continuity equation ($\nabla\cdot q = 0$), the maxwell equations $\nabla\times H = 0$, $\nabla\cdot(H + M) = 0$ and $\omega = \frac{1}{2}\nabla\times q$ (bhat [42]).
the above-mentioned equation (2) reduces to
$$-\nabla p + \eta\nabla^2 q + \mu_0(M\cdot\nabla)H + \frac{1}{2}\mu_0\nabla\times(M\times H) = 0 \tag{5}$$
and
$$M = \frac{M_0}{H}\left[H + \tau(\omega\times H)\right], \quad\text{where}\quad \tau = \frac{\tau_B}{1 + \frac{\mu_0\tau_B\tau_s}{I}M_0 H},$$
with the help of equations (3) and (4), as given in shliomis [39]. equation (5) takes the following form, as discussed in bhat [42] and patel and patel [43], with $H = (0, 0, H_0)$ and an axially symmetric flow:
$$\frac{\partial^2 u}{\partial z^2} = \frac{1}{\eta(1 + \tau)}\frac{\mathrm{d}p}{\mathrm{d}r}, \tag{6}$$
where
$$\tau = \frac{3}{2}\phi\,\frac{\xi - \tanh\xi}{\xi + \tanh\xi}.$$
because of beavers and joseph's [22] slip boundary conditions
$$u(z = 0) = 0, \qquad u(z = h) = -\frac{1}{s}\frac{\partial u}{\partial z}, \qquad \eta_a = \eta(1 + \tau),$$
the solution of equation (6) can be transformed to
$$u = -\frac{1}{\eta_a}\,\frac{z^2 - s h z(z - h)}{2(1 + sh)}\,\frac{\mathrm{d}p}{\mathrm{d}r}. \tag{7}$$
with the help of the above expression (7), one can write the continuity equation
$$\frac{1}{r}\frac{\mathrm{d}}{\mathrm{d}r}\int_0^h r u\,\mathrm{d}z + w_h - w_0 = 0$$
as
$$\frac{1}{r}\frac{\mathrm{d}}{\mathrm{d}r}\left(\frac{h^3 r(2 + sh)}{1 + sh}\frac{\mathrm{d}p}{\mathrm{d}r}\right) = 12\eta_a(w_h - w_0). \tag{8}$$
we assume that the upper surface has a double-layered porous facing and that the lower flat surface is solid in the annular plate bearing. in the present study, the pressures $p_1$ and $p_2$ of the porous region satisfy the following equations, respectively (bhat [42]):
$$\frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial p_1}{\partial r}\right) + \frac{\partial^2 p_1}{\partial z^2} = 0 \quad\text{and}\quad \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial p_2}{\partial r}\right) + \frac{\partial^2 p_2}{\partial z^2} = 0. \tag{9}$$
using the morgan-cameron approximation, one gets
$$\left(\frac{\partial p_1}{\partial z}\right)_{z = H_1} = H_1\,\frac{1}{r}\frac{\mathrm{d}}{\mathrm{d}r}\left(r\frac{\mathrm{d}p}{\mathrm{d}r}\right), \qquad \left(\frac{\partial p_2}{\partial z}\right)_{z = H_2} = H_2\,\frac{1}{r}\frac{\mathrm{d}}{\mathrm{d}r}\left(r\frac{\mathrm{d}p}{\mathrm{d}r}\right). \tag{10}$$
since the lower surface is solid and the upper surface has a double-layer porous facing, the velocity component along the z-direction is
$$w_0 = 0, \qquad w_h = \dot h_0 - \left[\frac{k_1}{\eta_a}\left(\frac{\partial p_1}{\partial z}\right)_{z = H_1} + \frac{k_2}{\eta_a}\left(\frac{\partial p_2}{\partial z}\right)_{z = H_2}\right]. \tag{11}$$
incorporating equations (9) and (10), equation (11) turns into
$$w_0 = 0, \qquad w_h = \dot h_0 - \left[\frac{k_1 H_1}{\eta_a}\,\frac{1}{r}\frac{\mathrm{d}}{\mathrm{d}r}\left(r\frac{\mathrm{d}p}{\mathrm{d}r}\right) + \frac{k_2 H_2}{\eta_a}\,\frac{1}{r}\frac{\mathrm{d}}{\mathrm{d}r}\left(r\frac{\mathrm{d}p}{\mathrm{d}r}\right)\right]. \tag{12}$$
using equation (12) and $\eta_a = \eta_0\left(1 + \frac{5}{2}\phi\right)(1 + \tau)$, equation (8) yields
$$\frac{1}{r}\frac{\mathrm{d}}{\mathrm{d}r}\left(\left\{\frac{h^3(2 + sh)}{1 + sh} + 12 k_1 H_1 + 12 k_2 H_2\right\} r\frac{\mathrm{d}p}{\mathrm{d}r}\right) = 12\eta_0\left(1 + \frac{5}{2}\phi\right)(1 + \tau)\,\dot h_0. \tag{13}$$
upon introduction of the non-dimensional measures
$$\bar r = \frac{r}{b}, \quad \bar h = \frac{h}{h_0}, \quad \bar\beta = \beta b^2 \ \text{(exponential)}, \quad \bar\beta = \beta b \ \text{(hyperbolic)},$$
$$\bar p = -\frac{h_0^3\,p}{\eta_0 b^2\,\dot h_0}, \quad \bar s = s h_0, \quad \psi_1 = \frac{k_1 H_1}{h_0^3}, \quad \psi_2 = \frac{k_2 H_2}{h_0^3}, \tag{14}$$
equation (13) transforms to
$$\frac{1}{\bar r}\frac{\mathrm{d}}{\mathrm{d}\bar r}\left\{\left[\frac{\bar h^3(2 + \bar s\bar h)}{1 + \bar s\bar h} + 12(\psi_1 + \psi_2)\right]\bar r\frac{\mathrm{d}\bar p}{\mathrm{d}\bar r}\right\} = -12\left(1 + \frac{5}{2}\phi\right)(1 + \tau). \tag{15}$$
considering the boundary conditions of the annular plates,
$$\bar p(1) = \bar p(k) = 0, \tag{16}$$
one can find the dimensionless pressure $\bar p$ as
$$\bar p = \int_1^{\bar r}\left(-\frac{6 e\bar r}{g}\right)\mathrm{d}\bar r + c_1\int_1^{\bar r}\frac{1}{g\bar r}\,\mathrm{d}\bar r, \tag{17}$$
where
$$c_1 = \frac{\int_1^k \frac{6 e\bar r}{g}\,\mathrm{d}\bar r}{\int_1^k \frac{1}{g\bar r}\,\mathrm{d}\bar r}, \qquad g = \frac{\bar h^3(2 + \bar s\bar h)}{1 + \bar s\bar h} + 12(\psi_1 + \psi_2), \qquad e = \left(1 + \frac{5}{2}\phi\right)(1 + \tau),$$
while the non-dimensional lbc of the annular plates can be found as
$$\bar w = -\frac{h_0^3\,w}{2\pi\eta_0 b^4\,\dot h_0} = \int_1^k \bar r\,\bar p\,\mathrm{d}\bar r = -\frac{1}{2}\left(\int_1^k -\frac{6 e\bar r^3}{g}\,\mathrm{d}\bar r + c_1\int_1^k \frac{\bar r}{g}\,\mathrm{d}\bar r\right). \tag{18}$$

3. result and discussion

the results for the double-layered porous medium and slip velocity on the exponential and hyperbolic film profiles of the annular bearing are discussed in this section. equation (17) establishes the non-dimensional pressure, while equation (18) represents the dimensionless lbc. in addition, expression (18) is linear in terms of the magnetization parameter, which mathematically indicates an improvement of the lbc of the annular plates. as far as the lbc is concerned, a comparison of the film profiles is exhibited graphically in figures 2–13 with the double porous facing in the presence of slip. the first figure demonstrates the exponential film shape and the second figure depicts the impact of the hyperbolic profile.
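as a check on equations (17)–(18), and purely as an illustrative addition, the dimensionless load can be evaluated directly by numerical quadrature; python with scipy is assumed, and all parameter values below are assumptions drawn from the ranges quoted in this section (the radii ratio $k = a/b = 2$ is likewise assumed, since it is not fixed in the text):

```python
import numpy as np
from scipy.integrate import quad

# illustrative values taken from the parameter ranges quoted in section 3
tau, phi, betab, psi1, psi2 = 0.3, 0.03, 1.7, 0.003, 0.003
sbar, k = 1.0 / 0.06, 2.0          # slip parameter s-bar and radii ratio k = a/b (assumed)
e = (1.0 + 2.5 * phi) * (1.0 + tau)

def hbar(r, hyperbolic=False):
    """dimensionless film thickness from eq. (1)."""
    return 1.0 / (1.0 + betab * r) if hyperbolic else np.exp(-betab * r**2)

def g(r, hyperbolic=False):
    """g of eq. (17): film term plus the two porous-layer contributions."""
    h = hbar(r, hyperbolic)
    return h**3 * (2.0 + sbar * h) / (1.0 + sbar * h) + 12.0 * (psi1 + psi2)

def load(hyperbolic=False):
    """dimensionless lbc, eq. (18), via numerical quadrature."""
    c1 = quad(lambda r: 6.0 * e * r / g(r, hyperbolic), 1.0, k)[0] \
       / quad(lambda r: 1.0 / (g(r, hyperbolic) * r), 1.0, k)[0]
    i1 = quad(lambda r: -6.0 * e * r**3 / g(r, hyperbolic), 1.0, k)[0]
    i2 = quad(lambda r: r / g(r, hyperbolic), 1.0, k)[0]
    return -0.5 * (i1 + c1 * i2)

print("exponential profile  w =", load(hyperbolic=False))
print("hyperbolic profile   w =", load(hyperbolic=True))
```

for such inputs the exponential profile should return the larger $\bar w$, consistent with the comparison drawn in this section.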
the performance of the shliomis ff-lubricated double porous medium annular plates rests on several non-dimensional parameters: magnetization, the upper plate's curvature, porosity, volume concentration and slip velocity. it can be observed that the exponential film fares better. notice that equation (18) gives the lbc of a single-layer porous medium when $\psi_2 \to 0$. with $\psi_1 \to 0$ and $\psi_2 \to 0$, this investigation reduces to the non-porous ff-based annular bearing system with slip velocity. also, this study reduces to a study of a traditional annular bearing when the effect of magnetization is removed in the absence of slip. the ranges of the parameters are as follows: $\tau$: 0.1–0.5, $\phi$: 0.01–0.05, $\bar\beta$: 1.5–1.9, $1/\bar s$: 0.02–0.1, $\psi_1$: 0.001–0.005 and $\psi_2$: 0.001–0.005.

the distribution of the lbc with respect to $\tau$, for numerous values of $\phi$, $\bar\beta$, $\psi_1$, $\psi_2$ and $1/\bar s$ shown in figures 2–5, indicates that the lbc rises strictly due to the ff lubricant. a closer examination of the figures emphasizes that the functioning of the bearing system, as well as the increase in load, is connected with all the parameters for both film profiles. the exponential film profile registers a higher load as compared to the hyperbolic shape in figures 2–5.

the behaviour of the volume concentration parameter with respect to the lbc is illustrated in figures 6–8. with the rise of the volume concentration parameter, the lbc decreases with porosity (figure 7) and slip velocity (figure 8), while the effect of $\bar\beta$ (figure 6) increases the lbc. moreover, figure 8 suggests a marginally improved effect of the slip velocity in an exponential film bearing, which indicates an enhancement of the overall annular bearing's performance up to some extent.

the profile of the non-dimensional lbc with respect to $\bar\beta$ is described in figures 9–11. if we increase $\bar\beta$, the load capacity grows sharply in the case of the hyperbolic function, while a reverse behaviour is observed with porosity and slip velocity. however, the lbc increases only slightly for the exponential profile and follows the same trends for the parameters mentioned above. one can visualize an identical scenario for the curvature of the exponential and hyperbolic functions, as shown in figures 9–11.

the effect of the porosity on the load distribution of the bearing is shown in figures 12 and 13. in figure 12, the effect of slip velocity is negligible for the exponential profile. however, figure 13 suggests that the trends of both porosity parameters are almost the same. both layers help to improve the lifespan of the system by creating a film layer between the surfaces. the graphical representation makes the following clear.

(1.) the nature of both porous facings is almost the same for equal values ($\psi_1 = \psi_2$); however, that does not apply when the porosity values differ ($\psi_1 > \psi_2$ or $\psi_1 < \psi_2$).
(2.) figure 13 displays the maximum load among all the figures, which means the double porous layer improves the lbc in annular plate bearing systems.
(3.) higher values of the curvature parameter have a negligible effect on the lbc in the case of the exponential profile, but they do affect the lbc in the case of the hyperbolic film profile.
(4.) the effect of slip velocity is more satisfactory for the exponential surface profile than for the hyperbolic surface.
(5.)
finally, this study helps to improve the lbc through a proper selection of all parameters and film shapes while designing the bearing system of annular plates.

4. conclusion

the effect of the double porous layer on an mf-lubricated curved annular bearing is investigated theoretically with the theory of shliomis' flow model of ff, the modified darcy's law for a double layer, and beavers and joseph's more realistic slip conditions. in view of the bearing's life period, it is evident that some of the parameters have an opposite effect on the performance of the bearing system. hence, this investigation makes it clear that the porosity of the two layers and the slip velocity must be considered while designing the bearing system. interestingly, numerous factors (like porosity and slip velocity) disturb the system adversely; even though the bearing can support a load without flow, this does not apply in the case of traditional lubricants. even the upper plate's curvature, either exponential or hyperbolic, may significantly impact the performance of this bearing system, considering moderate values of volume concentration, slip velocity and porosity. lastly, the exponential film profile exhibits a higher load-bearing capacity in the case of the double-layered porosity with shliomis' magnetic fluid flow when slip is in place. a pertinent question is how to elevate this analysis by incorporating the effect of surface roughness and deformation. an immediate concern is to carry out this analysis for some other types of bearing systems, including circular ones with slip velocity.

figure 2. variation of lbc with respect to $\tau$ and $\phi$.
figure 3. variation of lbc with respect to $\tau$ and $\bar\beta$.
figure 4. variation of lbc with respect to $\tau$ and $\psi_1$.
figure 5. variation of lbc with respect to $\tau$ and $1/\bar s$.
figure 6. variation of lbc with respect to $\phi$ and $\bar\beta$.
figure 7. variation of lbc with respect to $\phi$ and $\psi_2$.
figure 8. variation of lbc with respect to $\phi$ and $1/\bar s$.
figure 9. variation of lbc with respect to $\bar\beta$ and $\psi_1$.
figure 10. variation of lbc with respect to $\bar\beta$ and $\psi_2$.
figure 11. variation of lbc with respect to $\bar\beta$ and $1/\bar s$.
figure 12. variation of lbc with respect to $\psi_1$ and $1/\bar s$.
figure 13. variation of lbc with respect to $\psi_1$ and $\psi_2$.
list of symbols

$a$ outer radius of annular plates
$b$ inner radius of annular plates
$h$ film thickness
$r$ radial coordinate
$p$ pressure of fluid
$u$ x-component of $q$
$w$ z-component of $q$
$H$ magnitude of $\mathbf{H}$
$I$ sum of moments of inertia of the particles per unit volume
$s$ slip parameter
$q$ fluid velocity in the film region
$M$ magnetization vector
$\mathbf{H}$ magnetic field vector
$S$ internal angular momentum
$h_0$ central film thickness
$w_0, w_h$ values of $w$ at $z = 0, h$ respectively
$h_1$ thickness of lubricant in the inner layer
$h_2$ thickness of lubricant in the outer layer
$k_1$ permeability of the inner layer of the porous region
$k_2$ permeability of the outer layer of the porous region
$M_0$ equilibrium magnetization
$H_0$ constant magnetic field
$H_1$ thickness of the inner layer adjacent to the lubricant layer
$H_2$ thickness of the outer layer adjacent to the solid wall
$p_1$ pressure in the inside layer of the porous region
$p_2$ pressure in the outside layer of the porous region
$\eta$ viscosity of suspension
$\beta$ curvature of the upper plate
$\xi$ langevin's parameter
$\tau$ magnetization parameter
$\phi$ volume concentration
$\tau_B$ brownian relaxation time parameter
$\tau_s$ relaxation time parameter
$\mu_0$ permeability of free space
$\eta_0$ carrier fluid viscosity
$\psi_1$ inner layer porous structure parameter
$\psi_2$ outer layer porous structure parameter

references

[1] j. lin. magneto-hydrodynamic squeeze film characteristics between annular disks. industrial lubrication and tribology 53(2):66–71, 2001. https://doi.org/10.1108/00368790110384028.
[2] r. shah, m. bhat. ferrofluid squeeze film between curved annular plates including rotation of magnetic particles. journal of engineering mathematics 51(4):317–324, 2005. https://doi.org/10.1007/s10665-004-1770-9.
[3] n. bujurke, n. naduvinamani, d. basti. effect of surface roughness on the squeeze film lubrication between curved annular plates. industrial lubrication and tribology 59(4):178–185, 2007. https://doi.org/10.1108/00368790710753572.
[4] g. deheri, r. patel, n. abhangi. magnetic fluid-based squeeze film behavior between transversely rough curved annular plates: a comparative study. industrial lubrication and tribology 63(4):254–270, 2011. https://doi.org/10.1108/00368791111140477.
[5] s. fatima, t. biradar, b. hanumagowda. magneto-hydrodynamics couple stress squeeze film lubrication of rough annular plates. international journal of current research 9(9):58007–58014, 2017.
[6] b. hanumagowda, g. savitramma, a. salma, noorjahan. combined effect of piezo-viscous dependency and non-newtonian couple stresses in annular plates squeeze-film characteristics. journal of physics: conference series 1000(1):1–8, 2018. https://doi.org/10.1088/1742-6596/1000/1/012083.
[7] l. ting. engagement behavior of lubricated porous annular disks. part i: squeeze film phase surface roughness and elastic deformation effects. wear 34(2):159–172, 1975. https://doi.org/10.1016/0043-1648(75)90062-9.
[8] j. gupta, k. vora, m. bhat. the effect of rotational inertia on the squeeze film load between porous annular curved plates. wear 79:235–240, 1982. https://doi.org/10.1016/0043-1648(82)90171-5.
[9] m. bhat, g. deheri. squeeze film behaviour in porous annular discs lubricated with magnetic fluid. wear 151(1):123–128, 1991. https://doi.org/10.1016/0043-1648(91)90352-u.
[10] r. shah, s. tripathi, m. bhat. magnetic fluid based squeeze film between porous annular curved plates with the effect of rotational inertia. pramana journal of physics 58(3):545–550, 2002. https://doi.org/10.1007/s12043-002-0064-x.
[11] m. shimpi, g. deheri.
a study on the performance of a magnetic fluid based squeeze film in curved porous rotating rough annular plates and deformation effect. tribology international 47:90–99, 2012. https://doi.org/10.1016/j.triboint.2011.10.015.
[12] j. patel, g. deheri. theoretical study of shliomis model based magnetic squeeze film in rough curved annular plates with assorted porous structures. fme transactions 42(1):56–66, 2014. https://doi.org/10.5937/fmet1401056p.
[13] p. rao, s. agarwal. couple stress fluid-based squeeze film between porous annular curved plates with the effect of rotational inertia. iranian journal of science and technology, transactions a: science 41(4):1171–1175, 2017. https://doi.org/10.1007/s40995-017-0295-9.
[14] k. vasanth, j. hanumagowda, j. santhosh kumar. combined effect of piezoviscous dependency and non-newtonian couple stress on squeeze-film porous annular plate. journal of physics: conference series 1000(1):1–8, 2018. https://doi.org/10.1088/1742-6596/1000/1/012080.
[15] r. shah, d. patel, d. patel. ferrofluid-based annular squeeze film bearing with the effects of roughness and micromodel patterns of porous structures. tribology materials, surfaces & interfaces 12(4):208–222, 2018. https://doi.org/10.1080/17515831.2018.1542192.
[16] d. kumar, p. sinha, p. chandra. ferrofluid squeeze film for spherical and conical bearings. international journal of engineering science 30(5):645–656, 1992. https://doi.org/10.1016/0020-7225(92)90008-5.
[17] p. sinha, p. chandra, d. kumar. ferrofluid lubrication of cylindrical rollers with cavitation. acta mechanica 98:27–38, 1993. https://doi.org/10.1007/bf01174291.
[18] r. shah, m. bhat. ferrofluid squeeze film in a long journal bearing. tribology international 37:441–446, 2004. https://doi.org/10.1016/j.triboint.2003.10.007.
[19] j. patel, g. deheri. performance of a ferrofluid based rough parallel plate slider bearing: a comparison of three magnetic fluid flow models. advances in tribology 2016:1–9, 2016. https://doi.org/10.1155/2016/8197160.
[20] r. shah, r. shah. derivation of ferrofluid lubrication equation for slider bearings with variable magnetic field and rotations of the carrier liquid as well as magnetic particles. meccanica 53(4-5):857–869, 2017. https://doi.org/10.1007/s11012-017-0788-9.
[21] m. munshi, a. patel, g. deheri. lubrication of rough short bearing on shliomis model by ferrofluid considering viscosity variation effect. international journal of mathematical, engineering and management sciences 4(4):982–997, 2019. https://doi.org/10.33889/ijmems.2019.4.4-078.
[22] g. beavers, d. joseph. boundary conditions at a natural permeable wall. journal of fluid mechanics 30(1):197–207, 1967. https://doi.org/10.1017/s0022112067001375.
[23] a. chattopadhyay, b. majumdar.
steady state solution of finite hydrostatic porous oil journal bearings with tangential velocity slip. tribology international 17(6):317–323, 1984. https://doi.org/10.1016/0301-679x(84)90095-1.
[24] r. shah, m. parsania. ferrofluid lubrication equation for non-isotropic porous squeeze film bearing with slip velocity. mathematics today 28(2):43–49, 2012.
[25] r. shah, d. patel. squeeze film based on ferrofluid in curved porous circular plates with various porous structure. applied mathematics 2(4):121–123, 2012. https://doi.org/10.5923/j.am.20120204.04.
[26] j. venkata, r. murthy, m. kumar. effect of slip parameter on the flow of viscous fluid past an impervious sphere. international journal of applied science and engineering 12(3):203–223, 2014. https://doi.org/10.6703/ijase.2014.12(3).203.
[27] g. deheri, s. patel. combined effect of slip velocity and surface roughness on a magnetic squeeze film for a sphere in a spherical seat. indian journal of materials science 2015:1–9, 2015. https://doi.org/10.1155/2015/159698.
[28] j. patel, g. deheri. numerical modeling of jenkins model based ferrofluid lubrication squeeze film performance in rough curved annular plates under the presence of slip velocity. facta universitatis, series: mathematics and informatics 31(1):11–31, 2016.
[29] r. shah, n. patel, r. kataria. some porous squeeze film-bearings using ferrofluid lubricant: a review with contributions. proceedings of the institution of mechanical engineers, part j: journal of engineering tribology 230(9):1157–1171, 2016. https://doi.org/10.1177/1350650116629096.
[30] s. mishra, m. barik, g. dash. an analysis of hydrodynamic ferrofluid lubrication of an inclined rough slider bearing. tribology materials, surfaces & interfaces 12(1):17–26, 2018. https://doi.org/10.1080/17515831.2017.1418280.
[31] c. fragassa, g. minak, a. pavlovic. measuring deformations in the telescopic boom under static and dynamic load conditions. facta universitatis, series: mechanical engineering 18(2):315–328, 2020. https://doi.org/10.22190/fume181201001f.
[32] g. janevski, p. kozic, a. pavlovic, s. posavljak. moment lyapunov exponents and stochastic stability of a thin-walled beam subjected to axial loads and end moments. facta universitatis, series: mechanical engineering 19(2):209–228, 2021. https://doi.org/10.22190/fume191127014j.
[33] t. geike. bubble dynamics-based modeling of the cavitation dynamics in lubricated contacts. facta universitatis, series: mechanical engineering 19(1):115–124, 2021. https://doi.org/10.22190/fume210112027g.
[34] j. patel, g. deheri. influence of viscosity variation on ferrofluid based long bearing. reports in mechanical engineering 3(1):37–45, 2022. https://doi.org/10.31181/rme200103037j.
[35] j. patel, g. deheri. effect of various porous structures on the shliomis model based ferrofluid lubrication of the film squeezed between rotating rough curved circular plates. facta universitatis, series: mechanical engineering 12(3):305–323, 2014.
[36] j. patel, g. deheri. a study of thin film lubrication at nanoscale for a ferrofluid based infinitely long rough porous slider bearing. facta universitatis, series: mechanical engineering 14:89–99, 2016. https://doi.org/10.22190/fume1601089p.
[37] p. murti. squeeze films in curved porous circular plates. journal of lubrication technology 97(4):650–652, 1975. https://doi.org/10.1115/1.3452699.
[38] j. patel, g. deheri. jenkins model based magnetic squeeze film in curved rough circular plates considering slip velocity: a comparison of shapes.
fme transactions 43(2):144–153, 2015. https://doi.org/10.5937/fmet1502144p.
[39] m. shliomis. effective viscosity of magnetic suspensions. soviet physics – jetp 34(6):1291–1294, 1972.
[40] d. kumar. lubrication with a magnetic fluid. ph.d. thesis, iit kanpur, 1991.
[41] j. shukla, d. kumar. a theory for ferromagnetic lubrication. journal of magnetism and magnetic materials 65(2-3):375–378, 1987. https://doi.org/10.1016/0304-8853(87)90075-8.
[42] m. bhat. lubrication with a magnetic fluid. team spirit (india) pvt. ltd., india, 2003.
[43] n. patel, j. patel. magnetic fluid-based squeeze film between curved porous annular plates considering the rotation of magnetic particles and slip velocity. journal of serbian society for computational mechanics 14(2):69–82, 2020. https://doi.org/10.24874/jsscm.2020.14.02.05.

acta polytechnica 62(4):488–497, 2022

optimal (comfortable) operative temperature estimation based on physiological responses of the human organism

m. v. jokl, k. kabele

abstract: problems following the application of optimal operative temperatures estimated on the basis of pmv, and the necessity to apply correct values in the new czech government directive no. 523/2002 code, led to experiments based on the physiological human body response instead of solely people's feelings in a given environment. on the basis of experiments on 32 subjects (university students) it has been possible to estimate: a) the total balance of hygrothermal flows between the human body and the environment, b) the optimal operative temperature as a function of the subject's activity, c) the thermoregulatory range for each optimal operative temperature, i.e. maximal (category cmax) limited by the onset of sweating, minimal (category cmin) limited by the onset of shivering (category c can be applied to naturally ventilated buildings), optimal (comfort level – category a) defined by time constant 0.368 (can be applied to air conditioned buildings), and submaximal (decreased comfort level – category b) defined by time constant 0.632 (can be applied to buildings with basic air conditioning systems).

keywords: thermal comfort, microenvironment, hygienic regulations, pmv problems, thermoregulatory ranges.

1 introduction

the provision of optimal hygrothermal conditions, i.e. above all an optimal operative temperature (calm air and air temperature reaching radiant temperature), is the principal condition for healthy human life in the interior of a building. the optimal operative temperature has in the past been calculated from the pmv (predicted mean vote) (see e.g. en iso 7730 moderate thermal environment) estimated on the basis of a positive reaction from 80 % of the persons present. the feelings of human beings are very subjective values, impacted by many other factors in addition to hygrothermal conditions, e.g. by indoor interior colors, a person's mood, etc. in addition, due to the way in which pmv is experimentally estimated, and as proved in other experimental works (see fishman, pimbert 1979, newsham, tiller 1995), it is approximately valid for the neutral zone only. the further away from the neutral zone, the more the real values depart from the values calculated from pmv, see fig. 1. what is more, the greater a person's activity, the bigger the difference. the application of high activity values is thus impossible in practice.

fig. 1: a) comparison of mean thermal comfort votes (ashrae scale) with predictions by the pmv model in an english office building (fishman and pimbert, 1979), activity 80 w·m−2, clothing 0.64 up to 0.82 clo; b) comparison of mean thermal comfort votes (ashrae scale) with predictions by the pmv model in a building (newsham and tiller, 1995), activity 70 w·m−2, clothing 0.78 ± 0.21 clo

in fig. 2 the mean thermal sensation vote is plotted against the operative temperature for a range of velocities. each point represents the mean vote of thirty-two subjects. the correlation between the operative temperature and the mean thermal sensation vote is high, with a correlation coefficient of 0.97 (n = 5). there is no significant difference between the sexes. the solid curve is the regression line for the individual vote (n = 80). for comparison, the dotted line represents
it can be concluded that the pmv equation overpredicts the neutral temperature by as much as 2 k and underpredicts the comfort requirement when air temperature deviates from neutrality. humphreys and nicol (2000) have suggested that there may be formulaic errors in such a complex index as pmv, with two contributing factors: 1) steady state approximation. pmv, like other indices of warmth, is a steady state heat exchange equation, and therefore its application to the office environment can only be an approximation. recent research shows that among an office population the temperature of the fingers varies extensively and rapidly, indicating that the thermal state of the bodies of office workers is in continual flux (humphreys et al. 1999). this suggests that it is better to regard the people a soon as being in dynamic thermal equilibrium rather than in a steady thermal state. by extension, the same is likely to be true of other and more varied pursuits. thus, any index built on steady state assumptions is of limited relevance to normal living. indices that exlude thermoregulation cannot therefore simulate real life conditions. 2) inaccurate numerical formulae for steady state. most indices have errors in the numerical values used in the equations, such as the convective and radiant heat transfer coefficients, skin temperature and sweat rates that are assumed in comfort conditions. these contribute to formulaic errors and additionally there are numerical errors attributable to conceptual simplifications. for example, although the calculation of pmv is based on calculated skin temperature and sweat rate but when considering external conditions to be neutral, pmv is based solely on a hypothetical heat load. this results in the same body thermal states being attributed different pmv values in different environments (humphreys and nicol 1996). conceptual and numerical approximations add to the formulaic error. and, no thermoregulatory ranges can be estimated, from the pmv system. for these reasons we decided to estimate the optimal operative temperatures on the basis of the physiological response of the human organism. 4 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 46 no. 6/2006 mean air velocity m/s[ ] present result tanabe et al. 1987 pmv (calculation) 0.52 0.29 0.22 0.16 0.1 �3 �2 �1 0 1 2 3 25 26 27 operative temperature °c[ ] m e a n th e rm a ls e n sa tio n vo te a s h r a e fig. 2: mean thermal sensation vote versus operative temperature for japanese college-age subjects (tanabe et al. 1987) 16 18 20 22 24 26 28 22.8 °c �2 �1 0 1 2 3 t h e rm a l s e n s a ti o n vo te a s h r a e a) for windows and the door closed operative temperature [ ]°c fo ot lev el (r= 0. 67 ) (r= 0.6 2) he ad lev el (r= 0.5 2) pm v ov er all 22.8 °c �2 �1 0 1 2 3 16 18 20 22 24 26 28 �3t h e rm a l s e n s a ti o n vo te a s h r a e b) for windows and the door open operative temperature [ ]°c fo ot le ve l (r= 0. 62 ) (r= 0. 72 ) he ad le ve l ( r= 0. 70 ) pmv o ve ra ll fig. 3: effect of operative temperature on thermal sensation vote, activity 1.2 met, clo value results from neutral temperature 22.8 °c, rh � 40�55 %, mean radiant temperature equals air temperature (croome et al. 1993) 2 mathematical model of the physiological body response the total heat rate production and its distribution into individual components during heat exchange between the human body and the environment are shown in fig. 4, where qm � m � w � metabolic heat (see jokl 1989). 
qres and qev, d are the components of the heat rate from the organism due to respiration and due to skin moistening (evaporation), when the human body is in the thermal neutral zone. the heat flow qdry represents the component transferred from the organism through the clothing layer with total thermal resistance rt, wa(qdry � qc� qr). the regulatory process within the neutral zone is achieved mainly by vasodilation and vasoconstriction changing the body’s internal resistance into thermoregulatory and adaptational heat flux qtr and qa to the skin surface. qtr and qa is the heat flux regulating the instantaneous value of the skin temperature during the subject’s interaction with the environment, qtr is the organism’s immediate response to changes in the microclimate or metabolic heat changes; qa is the reaction shift due to adaptation to heat in summer and cold in winter. qtr� qa may be negative (heat loss) or positive (heat gain). it is the transient heat flow – even in the thermal neutral zone – that is called “quasi-stationary”, to be distinct from the hyperthermia and hypothermia zone. qtr� qa represents the rates of heat storage or heat debt accumulation. when the body is in a steady-state thermal balance with the environment, these therms are equal to zero. however we can consider the state of the subject in the neutral zone by non-steady-state conditions due to periodical changes of the metabolic heat rate, qm, or short thermal excitations in time followed by changes in the internal thermal resistance of the body within the neutral zone. the temporary characteristics of each non-steady process are determined, in addition to the thermal resistances rt, i and rt, wa, by the human body heat capacity, ct. the values characterizing the heat exchange are: tsk, tcore and tg. the internal thermal resistance, rt, i ,also determines the changes in thermoregulation and the adaptational heat, qtr+ qa, which is necessary for maintaining the skin temperature within physiological values if the core temperature is to remain constant (tcore � 36.7 � 0.4 °c). the heat flow balance, as presented in the model shown in fig. 4, can be expressed by a thermal flux equation at the subject-environment boundary. thus (if heat conduction is neglected): q r t t q q q q q q q dry t, wa g sk m res ev tr a i sw w m � � � � � � � � � � � 1 ( ) [ 2 ] (1) where qev � qev, ins � qev, sens � qev, ins� qsw [w � m �2], qm� q res� qev, ins � qi [w � m �2], qsw � 0.6(qm� 58.14) [w � m �2], � the quantity of excreted perceptible but mostly invisible sweat. this was estimated by weighing during the experiments as a mean value for the whole range. heat flux within the human body can be represented as (see model in fig. 4): q q q q g t t r t t m res tr a t,ti i sk t,ti i sk w m � � � � � � � � � ( ) ( ) [ ] 1 2 (2) where gt, ti is total body thermal conductance, which can be expressed by eq. (3): g q q t t q q t t g gt,ti m res i sk tr a i sk t,m t,i� � � � � � � � [w � m�2� k�1] (3) where gt, i is internal thermal conductance and gt, m is metabolic thermal conductance. © czech technical university publishing house http://ctn.cvut.cz/ap/ 5 acta polytechnica vol. 46 no. 6/2006 qres rt,wa rt,wa rt,i (tsk ) cttcore t ti core� tsk tg qev qr qc qm, rt,m metabolic heat production qres�qres core thermoregulation fig. 
fig. 4: total heat rate production and its distribution in individual components during heat exchange between the human body and the environment ($q_m$ metabolic heat, $q_{res}$ respiration heat, $q_{tr}$ thermoregulatory heat, $q_{ev}$ evaporative heat, $q_c$ convective heat, $q_r$ radiant heat, $r_{t,wa}$ total thermal resistance of clothing, $r_{t,i}$ total internal thermal body resistance, $c_t$ thermal body capacity, $t_i$ deep body temperature, $t_{core}$ core body temperature, $t_{sk}$ skin temperature, $t_g$ globe temperature)

fig. 5: graph of the relationship $q_{dry} = f(q_i - q_{sw})$ for clothing 0.5 clo; points from the experiment, optimal values lie on the line $q_{dry} = q_i - q_{sw}$

fig. 6: graph of the relationship $t_g = f(q_m)$ for the optimal values transferred from the graph in fig. 5 within the range $q_{dry} - (q_i - q_{sw}) \geq -4.8$, where the value $-4.8\ \mathrm{w\cdot m^{-2}}$ represents the minimal thermoregulatory heat (regression line $t_g = -0.1017\,q_m + 31.708$, $r = 0.8863$)

the thermoregulatory and adaptational heat flux first affects the skin temperature $t_{sk}$. the internal thermal resistance value, $r_{t,i} = 1/g_{t,i}$, characterizing the vasodilatation and vasoconstriction process, can be calculated from the equation:

$$r_{t,i} = \frac{t_i - t_{sk}}{q_{tr} + q_a} \quad [\mathrm{w^{-1}\cdot m^2\cdot k}] \qquad (4)$$

3 experimental estimation of mathematical model parameters

an experiment over the course of several years was undertaken in a climatic chamber, leading to the identification of the parameters in eqs. (1) and (4). the experimental subjects were male university students. each of them underwent six experiments lasting about three hours at four levels of activity: (1) sitting in a chair, (2) sitting on a bike-ergometer without pedaling, (3) pedaling on a bike-ergometer with a 40 w load and (4) pedaling on a bike-ergometer with a load of 1 w per kg of body mass (for as long as he was able to continue). metabolic heat production during each activity was measured by an indirect calorimetric method. mean skin temperature, heart rate and body water loss were estimated continuously during each experiment. two sets of clothing were used by the subjects: lightweight (pyjamas) and heavier clothing (an anti-g suit for fighter pilots). the results of the anti-g suit experiments will be presented in a separate report. there were no differences between the air temperature and the surface wall temperatures – it can be assumed that the overall temperature equals the operative temperature. six temperatures were chosen (29 ± 3 °c and 14 ± 3 °c, which delimit the temperature ranges where some of the subjects started to leave the neutral zone and appeared to begin sweating or shivering). the originally chosen range of temperatures 8, 11, 14, 17, 20, 23, 26, 29, 32 °c was found to be excessive, and so it was reduced. within the comfort range the relative humidity was maintained corresponding to a partial water vapour pressure from 700 to 1850 pa. the onset of sweating and shivering was always assessed by the same person.
experiments were carried out in all seasons of the year, thus reflecting the seasonal adaptation effect on maximal and minimal thermoregulatory heat, i.e. it was possible to determine the adaptational heat. however, it became evident that the seasonal adaptation effect can be neglected (jokl, moos 1992), being lower than 0.2 °c (i.e. within the range of experimental error in measuring the temperatures). the same finding has been described by other authors (see fanger 1970). the results were only accepted from subjects within the thermal neutral zone with the thermoregulatory heat constant.

3.1 graph construction of $t_g = f(q_m)$

the measured values are plotted as $q_{dry} = f(q_i - q_{sw})$ in fig. 5, where for optimal values the linear equation $q_{dry} = q_i - q_{sw}$ [w·m⁻²], representing equilibrium, is valid. practical application of this graph is very difficult, but the relationship $t_g = f(q_m)$ is useful. therefore the linear relationship from fig. 5 was transferred into the graph in fig. 6 by plotting a regression line through the points limited by the equation $q_{dry} - (q_i - q_{sw}) \geq -4.8$ [w·m⁻²]. the value $-4.8\ \mathrm{w\cdot m^{-2}}$ of the regression line is the minimal thermoregulatory heat, i.e. it represents maximal vasoconstriction in the human body, and can be estimated from the minimum value of the total body thermal conductance (see fig. 7), which equals $9.07\ \mathrm{w\cdot m^{-2}\cdot k^{-1}}$ (for core body temperature $t_i = 36.6$ °c, skin temperature $t_{sk} = 30.5$ °c and $q_m = 45.7\ \mathrm{w\cdot m^{-2}}$):

$$q_{tr,min} = g_{t,i,min}\,(t_i - t_{sk}) = 1.57\,(36.6 - 30.5) = 9.6\ \mathrm{w\cdot m^{-2}}, \text{ i.e. } \pm 4.8\ \mathrm{w\cdot m^{-2}},$$

where

$$g_{t,i,min} = g_{t,ti,min} - g_{t,m,min} = 9.07 - 7.5 = 1.57\ \mathrm{w\cdot m^{-2}\cdot k^{-1}},$$

$$g_{t,m,min} = \frac{q_m}{t_i - t_{sk}} = \frac{45.7}{36.6 - 30.5} = 7.5\ \mathrm{w\cdot m^{-2}\cdot k^{-1}}.$$

fig. 7: human body thermal conductance $g_{t,ti}$ and human body heat loss as a function of skin temperature $t_{sk}$ for a resting subject during daytime ($q_m = 45.6$–$57.4\ \mathrm{w\cdot m^{-2}}$) (burton, bazett 1936; du bois et al. 1952; lefevre 1898; liebmaster 1869; from itoh et al. 1972)
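the arithmetic of section 3.1 can be checked in a few lines; the following is a minimal sketch using only the values quoted above (python is an assumption for illustration, not part of the original work):

```python
# check of the section 3.1 estimates, using only values quoted in the text
q_m, t_i, t_sk = 45.7, 36.6, 30.5    # [w/m^2], [degc], [degc]
g_t_ti_min = 9.07                    # minimal total conductance, fig. 7

g_t_m_min = q_m / (t_i - t_sk)       # metabolic part: ~7.5 w/(m^2.k)
g_t_i_min = g_t_ti_min - g_t_m_min   # internal part:  ~1.57 w/(m^2.k)
q_tr_min = g_t_i_min * (t_i - t_sk)  # ~9.6 w/m^2, i.e. +/-4.8 about the line
print(round(g_t_m_min, 2), round(g_t_i_min, 2), round(q_tr_min, 1))
```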
3.2 estimation of thermoregulatory range

the widest thermoregulatory range, i.e. from the optimum up to the onset of visible sweating, can be estimated by plotting the regression line through the points of the onset of sweating. however, to ensure comfort we need lower values, without visible sweating occurring. this area is between the line of the optimum value and the tangent from the origin (which is the intersection of the line of the optimum and the regression line of the onset of sweating) to the area of the beginning of shivering (see fig. 8). these tangents are analogous to the thermoregulatory range of category c according to cr 1752-1998. for categories a and b it must be taken into consideration that the human body is a thermoregulatory mechanism balancing the operative temperature changes in the surrounding environment by thermoregulatory heat flows, so that equilibrium can be achieved, and this must take place at three levels (by analogy with technological mechanisms) (see fig. 9):

- level a, corresponding to the time constant $0.368\,t_{o,tr,max}$
- level b, corresponding to the time constant $0.632\,t_{o,tr,max}$
- level c, corresponding to the time constant $1.000\,t_{o,tr,max}$

level a is valid for building interiors with the highest requirements, and can only be attained with the use of air conditioning systems. level c is valid for building interiors with the lowest requirements, usually only naturally ventilated. level b covers other buildings, where air conditioning is necessary only in some cases. the time constant, according to control theory, characterizes the system response (the response of the human organism) to the operative temperature changes, and is equal to the product of the system thermal resistance r and its thermal capacity c: time constant $= r \cdot c$, where $r = r_{t,i} + r_{t,wa}$ [w⁻¹·m²·k].

fig. 8: how to obtain thermoregulatory ranges (see the text for an explanation)

fig. 9: graph of the relationship $t_g = f(q_m)$ with the regression line of the onset of sweating and the thermoregulatory range for levels (categories) a, b, c, for warm (towards the onset of sweating) and for cold (towards the onset of shivering)

the thermoregulatory changes are shown in fig. 10 as transferred from fig. 9 and rounded to 0.5 °c for practical application. for a complete list of values, see table 1.

fig. 10: thermoregulatory changes $q_{tr}$ and $t_{g,tr}$ for levels a, b, c (0.5 clo) as transferred from the graph in fig. 9 and rounded to 0.5 °c for practical application (see also table 1)

table 1: optimal operative temperatures and thermoregulatory range as a function of man's activity $q_m$ (temperatures and differences in °c; activity columns in w·m⁻²)

                        50     70     80     100    120    150    180
sweat                   32.2   31.4   31.1   30.3   29.6   28.5   27.4
sweat - opt              5.6    6.8    7.5    8.8   10.1   12.0   14.0
sweat - opt (0.5)        5.5    6.5    7.0    8.5   10.0   12.0   13.5
max c                   29.3   27.9   27.2   25.8   24.4   22.3   20.2
max c - opt              2.7    3.4    3.7    4.3    4.9    5.9    6.8
max c - opt (0.5)        2.5    3.0    3.5    4.0    4.5    5.5    6.5
max b (0.632)           28.3   26.7   25.9   24.3   22.6   20.2   17.7
max b - opt              1.7    2.1    2.3    2.7    3.1    3.7    4.3
max b - opt (0.5)        1.5    2.0    2.0    2.5    3.0    3.5    4.0
max a (0.368)           27.6   25.8   24.9   23.1   21.3   18.6   15.9
max a - opt              1.0    1.2    1.4    1.6    1.8    2.2    2.5
max a - opt (0.5)        1.0    1.0    1.0    1.5    1.5    2.0    2.5
opt                     26.6   24.6   23.6   21.5   19.5   16.5   13.4
min a (0.368)           25.9   23.7   22.6   20.4   18.2   14.9   11.6
min a - opt             -0.7   -0.9   -1.0   -1.1   -1.3   -1.5   -1.8
min a - opt (0.5)       -0.5   -0.5   -0.5   -1.0   -1.0   -1.5   -1.5
min b (0.632)           25.4   23.1   21.9   19.6   17.3   13.8   10.4
min b - opt             -1.2   -1.5   -1.6   -1.9   -2.2   -2.6   -3.1
min b - opt (0.5)       -1.0   -1.0   -1.5   -1.5   -2.0   -2.5   -3.0
min c                   24.7   22.2   21.0   18.5   16.0   12.3    8.6
min c - opt             -1.9   -2.4   -2.6   -3.0   -3.5   -4.2   -4.8
min c - opt (0.5)       -1.5   -2.0   -2.5   -3.0   -3.0   -4.0   -4.5

4 a comparison between optimal values and thermoregulatory ranges with accepted values

the proposed optimal temperatures and their thermoregulatory ranges were compared with the values according to iso 7730 (moderate thermal environments, iso 7730-1984 (e)), cr 1752 (1998), iso/dis 7730 (2003) and ansi/ashrae standard 55-2004. the comparison between the proposed operative temperatures and the values according to cr 1752 and iso/dis 7730 is presented in table 2 and in fig. 11. there is agreement on the operative temperatures for 50 w·m⁻², 70 w·m⁻² and 80 w·m⁻². for higher activities the values differ: the greater the activity, the greater the operative temperature difference. the findings are in agreement with the experiments (sitting persons in the neutral zone) on which the pmv value is based.
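before continuing with the comparison, the subdivision of table 1 by the time-constant fractions 0.368, 0.632 and 1.000 can be verified directly; a minimal sketch (the 80 w·m⁻² column is used as the example, and python is assumed only for illustration):

```python
# category limits as fractions of the full (category c) range, checked
# against the "max ... - opt" rows of table 1 at q_m = 80 w/m^2
full_range = 3.7   # max c - opt at 80 w/m^2 [k]
for level, frac in (("a", 0.368), ("b", 0.632), ("c", 1.000)):
    print(level, round(frac * full_range, 1))
# -> a 1.4, b 2.3, c 3.7, matching the unrounded rows of table 1
```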
a comparison between the proposed optimal operative temperatures and the values according to iso and ansi/ashrae is presented in table 3 and fig. 12. there is evident agreement for low activities (the graph is also based on iso 7730).

table 2: a comparison between experimentally found operative temperatures and the values in cr 1752 and iso/dis 7730 (2003) in categories a, b, c (0.5 clo, 1.2 met)

category                                         cr - iso/dis     jo/ka
a (air conditioning)                             24.5 ± 1.0       24.6 +1.0 / -0.5
b (air conditioning and natural ventilation)     24.5 ± 1.5       24.6 +2.0 / -1.0
c (natural ventilation)                          24.5 ± 2.5       24.6 +3.0 / -2.0

fig. 11: comparison between experimentally found optimal operative temperatures and the values in iso 7730 (the values presented in the graph correspond to temperatures only, not to activity)

5 discussion

the optimal operative temperatures derived from pmv values (from the 1970s) are now not fully acceptable. it is more precise to use optimal operative temperatures based on the physiological human body response and not based only on people's feelings. this has been proved by experimental works (fishman, pimbert 1979) and shown when the iso values have been applied in practice. the greater a person's activity, the greater the discrepancy in the optimal temperature. because of this discrepancy, the new czech government directive no. 523/2002 code is based on the values presented here, and not on iso/dis 7730, which is based on pmv. the absence in the directive of adaptation to heat and cold, e.g. as a result of staying in a heated room in winter and in air-conditioned cars in summer, results in the same optimal operative temperatures for winter and for summer; the temperatures are differentiated only by different clothing.

fig. 12: comparison between experimentally found optimal operative temperatures and the values in ansi/ashrae (psychrometric chart with data based on iso 7730 and ashrae std 55; pmv limits for 0.5 clo and 1.0 clo; upper recommended humidity limit 0.012 humidity ratio, no recommended lower humidity limit)
table 3: a comparison between experimentally found operative temperatures and the values in iso and ansi/ashrae (clothing 0.5 clo); (b) means category b

qm [w·m⁻²]                                   50        70        80        100       120       150       180
[met]                                        0.86      1.20      1.38      1.72      2.07      2.59      3.10
jo/ka [°c] (b)                               26.6      24.6      23.6      21.5      19.5      16.5      13.4
                                             +1.5      +2.0      +2.0      +2.5      +3.0      +3.5      +4.0
                                             -1.0      -1.0      -1.5      -1.5      -2.0      -2.5      -3.0
iso 7730 (1984)                              26.6±1.5  24.5±1.5  23.6±2.0  22.3±2.0  20.6±2.5  18.5±2.5  16.4±2.5
cr 1752, iso/dis 7730 (2003) [°c] (b)        -         24.5±1.5  23.5±2.0  -         -         -         -
ansi/ashrae 55 (1992),
t_o,active = t_o,sedentary - 4.5 (met - 1.2) [°c]  -   24.5      23.7      22.2      20.6      18.2      16.0

6 results

the mathematical model (fig. 4) shows the role of the various heat flows produced by the human body as it interacts with the environment. all the heat flows must be in mutual equilibrium if the human body is to stay homoiothermic. this equilibrium forms the basis for the optimal operative temperature estimation (fig. 5). practical application of this graph is very difficult, and $t_g = f(q_m)$ is a more useful relationship (fig. 6). the experimental data on the onset of sweating and the onset of shivering enable the thermoregulatory ranges to be estimated (fig. 8). the thermoregulatory area is between the line of the optimum and the tangent from the pole, defined as the intersection of the line of the optimum and the regression line of the onset of sweating, to the field of the onset of sweating (upper limit, level c_max) and to the field of the onset of shivering (lower limit, c_min). it is interesting, and in agreement with human feelings, that the thermoregulatory field for cold is smaller than the thermoregulatory field of the warm area – the human body is more sensitive to temperature decreases in the cold area.

the question is how to sub-divide the thermoregulatory range into categories (a, b and c). instead of a qualified assumption, it is proposed to base the categories on control theory. the human body behaves like any other system to which control theory can be applied, and it is proposed that the human body time constant is used to differentiate the categories. the following values were used: time constant $0.368\,t_{0,tr,max}$ (a), $0.632\,t_{0,tr,max}$ (b) and $1.0\,t_{0,tr,max}$ (c), which correspond to categories a, b and c (fig. 9). category a can be applied to air conditioned buildings and category c to naturally ventilated buildings. as a result, two previously separate methods of assessment can be merged: those for air conditioned buildings, based on pmv, and those for naturally ventilated buildings, based on the mean monthly outdoor temperature. the results have been compared with the values according to iso 7730, cr 1752 (1998) and iso/dis 7730 (2003) (tables 2 and 3), and with the ansi/ashrae standard (table 3). most importantly, however, it was possible to base the new czech government directive no. 523/2002 code (table 3) on these new findings, which have been used to derive the compulsory microclimatic conditions for workplaces in the czech republic.

7 acknowledgments

i would like to thank professor d. j. nevrala for his help with the english text.

references

[1] ansi/ashrae standard 55-2004: thermal environmental conditions for human occupancy.
[2] croome, d. j., gan, g., awbi, h. b.: evaluation of indoor environment in naturally ventilated offices. in: research on indoor air quality and climate. cib proceedings, publication 163, rotterdam 1993.
[3] en iso 7730: moderate thermal environment.
[4] european technical report cr 1752-1998: ventilation for buildings – design criteria for the indoor environment.
[5] fanger, p. o.: thermal comfort. danish technical press, copenhagen 1970.
[6] fishman, d. s., pimbert, s. l.: survey of the objective responses to the thermal environment in offices. in: indoor climate (eds. p. o. fanger, o. valbjorn). copenhagen, danish building research institute, 1979, p. 677–698.
[7] humphreys, m. a., nicol, j. f.: effects of measurement and formulation error on thermal comfort indices in the ashrae database of field studies. ashrae transactions, vol. 106 (2000), no. 2, p. 493–502.
[8] humphreys, m. a., nicol, j. f.: conflicting criteria for thermal sensation within the fanger predicted mean vote equation. in: cibse/ashrae joint national conference proceedings, harrogate, uk, vol. 2, 1996, p. 153–158.
[9] humphreys, m. a., nicol, j. f.: an analysis of some observations of finger-temperature and thermal comfort of office workers. in: indoor air, edinburgh, uk, 1999.
[10] itoh, s., ogata, k., yoshimura, h.: advances in climatic physiology. igaku shoin ltd., tokyo 1972; springer verlag, berlin, heidelberg, new york 1972.
[11] jokl, m. v.: microenvironment: the theory and practice of indoor climate. thomas, illinois, usa, 1989.
[12] jokl, m. v., moos, p., štverák, j.: the human thermoregulatory range within the neutral zone. physiol. res. vol. 41 (1992), p. 227–236.
[13] jokl, m. v., moos, p.: die wärmeregelungsgrenze des menschen in neutraler zone. bauphysik vol. 14 (1992), no. 6, p. 175–181.
[14] nařízení vlády č. 523/2002 sb., kterým se mění nařízení vlády č. 178/2001 sb., kterým se stanoví podmínky ochrany zaměstnanců při práci (government directive no. 523/2002 code, changing government directive no. 178/2001 code prescribing the conditions for the protection of employees at work).
[15] newsham, g. r., tiller, d. k.: a field study of office thermal comfort using questionnaire software. national research council canada, internal report no. 708, nov. 1995.
[16] tanabe, s. i., kimura, k. i., hara, t., akimoto, t.: effects of air movement on thermal comfort in air-conditioned spaces during summer season. journal of architecture, planning and environmental engineering, vol. 382 (1987), p. 20–30.

prof. ing. miloslav jokl, drsc.
phone: +420-22435-4432
e-mail: miloslav.jokl@fsv.cvut.cz

prof. ing. karel kabele
phone: +420-22435-4570
e-mail: kabele@fsv.cvut.cz

dept. of microenvironmental and building services engineering
czech technical university in prague, faculty of civil engineering
thákurova 7, 166 29 prague 6, czech republic

acta polytechnica 63(1):50–64, 2023, https://doi.org/10.14311/ap.2023.63.0050
© 2023 the author(s). licensed under a cc-by 4.0 licence. published by the czech technical university in prague.

comparative study of state space averaging and pwm with extra element theorem techniques for complex cascaded dc-dc buck converter

sangeeta shete*, prasad joshi

shivaji university, government college of engineering, department of electrical engineering, karad, maharashtra, india
* corresponding author: sangeetashete2019@gmail.com

abstract. until now, the most commonly used state space averaging and pwm techniques have been applied to different converter topologies and their advantages and disadvantages stated.
however, the superiority of an analytical technique has not been justified by a parametric comparison of these techniques for the same converter topology. hence, in this paper, a comparative evaluation of two commonly used modelling techniques for a fourth order converter is presented for the first time. the first approach makes use of state-space averaging of the converter and is based on analytical manipulations using different state representations of the converter. the second approach is based on pwm switch modelling with an extra element theorem and consists of topological manipulations. the two modelling techniques are applied to the same complex cascaded dc-dc buck converter and a transfer function is obtained. these techniques are compared for different features, and the study concludes that the state space modelling technique is systematic and less complicated than pwm switch modelling with an extra element theorem for a higher order converter.

keywords: analytical technique, cascaded dc-dc buck converter, state space averaging, pwm switch modelling, extra element theorem, transfer function.

1. introduction

analytical techniques play a vital role in analysing the behaviour of converters under different conditions, performance improvements and design aspects. dc-dc converters have a high number of components and nonlinear behaviour due to the existence of a switch and diode in the circuit, which makes them complex in nature. a study of the dynamic behaviour and an assessment of the stability of such nonlinear and complex circuits is a very challenging job. article [1] presented a review of the existing analytical techniques in a structured and meaningful manner and stated that circuit averaging (ca), state space averaging (ssa) and pwm switch modelling are the most commonly used analytical techniques, whereas signal flow graphs, energy factor, switching function and the s-z method are special techniques.

ssa is an organised method which allows the inclusion of the parasitic elements of a circuit even at the initial stage, and permits the exploration of a linear system with a zero initial condition and of nonlinear systems with all initial conditions [2–4]. it is preferred for the design of a robust controller. however, as it doesn't consider the switching frequency in the analysis, this technique is applied by neglecting the ripple effect on the inductor current and output voltage [5, 6], which affects its accuracy [7–9]. state space representation was used for the mathematical modelling of a sepic converter, which was further used to implement the controller; the results of pi control and hysteresis control were compared using psim software [10].

the pwm switch modelling technique employs the determination of the invariant properties of the pwm switch to obtain the average model of the circuit, and the small signal characteristics can be obtained from this average model [11]. the pwm technique with the conventional approach is a simple and pedagogical approach to analysis and provides complete information about the steady state and dynamic properties of the converter. this approach is also useful for frequency domain analysis and quasi resonant converters. the pwm technique using the extra element theorem (eet) is more efficient and practical for general circuits, as it leads to a faster analysis due to reduced mathematical manipulation and a meaningful form of expressions, and it is easy to track the errors [12]. an enormous amount of study has been conducted regarding ssa and pwm switch modelling techniques for converters.
article [2] investigated the dynamic modelling of a zeta converter, wherein transfer functions, bode plots, transient response and steady state response were obtained using matlab and pspice. article [13] obtained state equations and output equations using the ssa technique for buck, boost and buck-boost converters and observed that the state space model simulation results are comparable with a hardware model with a deviation of only 0.0015 v; however, this technique doesn't simulate the ripple effect of the inductor current and output voltage due to the absence of the switching frequency. article [14] analysed a sepic converter using an average state-space modelling approach taking into account power losses in the converter elements and concluded that this methodology contributes to converter design as per the requirements and reduces the need for accurate time domain simulations. mathematical modelling of a dc-dc buck converter using state space averaging and validation of the models was performed using psim and matlab simulink [15]. studies were performed on state space small signal modelling of a double boost converter integrated with a sepic converter [3, 16, 17], and of a sepic converter with coupled and uncoupled inductors [18]. book [19] presented a well-defined and systematic process of small signal analysis of nonlinear circuits using state space analysis to determine the control to output transfer function.

the concepts and mathematical modelling of circuits using the eet [20] and pwm with the eet [21] were studied. the output impedances of three pwm dc-dc converters were analysed in a general and unified manner using the pwm switch modelling with eet technique [22]. article [23] presented the effect of input filter interaction in the small signal analysis of the voltage and current control modes of a dc-dc boost converter and concluded that the methodology is universally applicable. matlab simulink was employed to study the output voltage for multiple dc-dc converters [24] and for a boost converter [25]. article [12] presented pwm with eet applied to a general circuit and concluded that it is a fast analytical technique which produces a well-ordered polynomial and a low entropy form of transfer functions. however, applying this technique to a complex sepic converter, the study concluded that as the circuit becomes complicated, a conventional node and mesh analysis is better than the pwm with eet. nevertheless, the basis of comparison for this statement was not presented in the study, and therefore it is necessary to study the features of the pwm with eet technique for higher order circuits.

some of the existing studies have developed analytical models, obtained simulation results and commented on advantages, disadvantages and applications in a relative manner without comparing the results with a counterpart technique, which doesn't fit the accepted scientific and technical approach. the advantages, disadvantages and superiority of a technique should be based on a comparison of two or more techniques for the same converter topology. even though numerous studies regarding ssa and pwm with eet techniques for different converters have been performed, surprisingly no effort has been made to compare these techniques for the same converter and to comment on the superiority of either analytical technique.
the past few years have seen notable developments in the research of dc-dc converter topologies. although the conversion efficiency of a single-stage converter is better than that of a two-stage converter, quadratic converters are proposed for extremely large ranges of conversion ratios. as part of this, the complex cascaded buck converter (cbc) is preferred for many applications, as it has a better conversion ratio, a high voltage step down ratio and large voltage regulation [26]. therefore, in this paper, a comparative evaluation of two widely preferred modelling tools for a higher order dc-dc converter is presented. the first approach makes use of a state-space averaged model of the converter based on analytical manipulations, whereas the second approach is based on the pwm switch modelling with eet technique, based primarily on manipulations of circuits. the two modelling techniques are applied to the same complex cascaded dc-dc buck converter and are compared on the basis of their various features to decide the superiority for a higher order circuit.

2. materials and methods

2.1. complex cascaded dc-dc buck converter

the two transistor switches of the conventional cascaded buck converter can be reduced to a single transistor switch in the complex cascaded dc-dc buck converter shown in figure 1, which has two inductors and two capacitors, making it a fourth-order circuit with a quadratic conversion ratio of $d^2$. this single-transistor realisation is the main additional advantage over a straightforward cascade of two basic converters. the switching section consists of one active switch and three diodes. though the circuit is complex, the conversion ratio of the cbc cannot be realised with fewer than two capacitors, two inductors and four switches; however, additional complexity of the converter network may compromise the wide conversion ratio. the control strategy of the complex cbc is depicted in figures 2a and 2b. the on state and off state correspond to the operation of two buck converters connected in cascade, as depicted in figure 3.

figure 1. topology of the cascaded dc-dc buck converter.
figure 2. switching mechanism of the complex cbc.
figure 3. operation of the complex cbc.

the total output voltage is given by equation (1):

$$v_2 = \frac{1}{t}\int_0^t \left(-d\,v_1 + d\,v_1(1 + d)\right)\mathrm{d}t = d^2 v_1 \qquad (1)$$

where $d$ is the duty ratio and $d_2 = \frac{1}{t}\int_0^t \check{d}_2(t)\,\mathrm{d}t$. the chopper circuit of the cbc has one active switch and three passive switches, which are driven by the switching functions $\check{d}_2(t)$ and $\check{d}'_2(t)$, respectively. the switching function is given as

$$\check{d}_2(t) = \begin{cases} 1 & 0 < t < t_{on} \\ 0 & t_{on} < t < t \end{cases}$$
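the quadratic conversion ratio of eq. (1) is easy to check numerically; the following is a minimal sketch (python assumed for illustration; the operating point $v_1 = 25$ v, $d = 0.35$ is the one used later in the simulation section, and the ideal expression neglects all device drops and losses):

```python
# ideal quadratic conversion ratio of the complex cbc, eq. (1)
v1 = 25.0   # input voltage [v]
for d in (0.3, 0.35, 0.4, 0.5):
    print(d, round(d**2 * v1, 2))   # v2 = d^2 * v1
# d = 0.35 gives ~3.06 v, close to the ~3 v observed in the simulation
```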
2.2. state space averaging technique

the state-space averaging method is based on analytical operations on the converter states, comprising the determination of a linear state model for each possible configuration of the circuit and the subsequent combination of all these elementary models into a single, unified one through the duty factor. the method involves the formulation of the state equations for each state, averaging, perturbation, and then rearranging the equations to obtain the required transfer function by applying the laplace transform. the complex cbc is considered in continuous conduction mode, and the switch and diodes are considered ideal. considering the inductor currents $i_{l1}$, $i_{l2}$ and the capacitor voltages $v_{c1}$, $v_{c2}$ as the state variables, the state equation and output equation in the on state of the cbc (figure 2a) are expressed by equations (2) and (3):

$$\frac{\mathrm{d}i_{l1}}{\mathrm{d}t} = \frac{v_1}{l_1} - \frac{v_{c1}}{l_1}, \quad \frac{\mathrm{d}i_{l2}}{\mathrm{d}t} = \frac{v_1}{l_2} - \frac{v_{c2}}{l_2}, \quad \frac{\mathrm{d}v_{c1}}{\mathrm{d}t} = \frac{i_{l1}}{c_1}, \quad \frac{\mathrm{d}v_{c2}}{\mathrm{d}t} = \frac{i_{l2}}{c_2} - \frac{v_2}{r c_2} \qquad (2)$$

$$v_2 = v_{c2} \qquad (3)$$

similarly, the state equation and output equation in the off state of the converter (figure 2b) are expressed by equations (4) and (5):

$$\frac{\mathrm{d}i_{l1}}{\mathrm{d}t} = -\frac{v_{c1}}{l_1}, \quad \frac{\mathrm{d}i_{l2}}{\mathrm{d}t} = \frac{v_{c1}}{l_2} - \frac{v_{c2}}{l_2}, \quad \frac{\mathrm{d}v_{c1}}{\mathrm{d}t} = \frac{i_{l1}}{c_1} - \frac{i_{l2}}{c_1}, \quad \frac{\mathrm{d}v_{c2}}{\mathrm{d}t} = \frac{i_{l2}}{c_2} - \frac{v_2}{r c_2} \qquad (4)$$

$$v_2 = v_{c2} \qquad (5)$$

the general form of the state and output equations for the on state and off state is

$$\dot{x}(t) = a_1 x(t) + b_1 u(t), \quad y(t) = c_1 x(t) \quad \text{during } dt,$$
$$\dot{x}(t) = a_2 x(t) + b_2 u(t), \quad y(t) = c_2 x(t) \quad \text{during } d't,$$

where $x(t)$ is the state variable vector, $u(t)$ is the input variable vector, $a_1$ and $a_2$ are the state matrices, $b_1$ and $b_2$ are the input matrices, $c_1$ and $c_2$ are the output matrices and $d$ is the switching function. thus, equations (2) and (3) are represented in matrix form as

$$\begin{bmatrix} \dot{i}_{l1} \\ \dot{i}_{l2} \\ \dot{v}_{c1} \\ \dot{v}_{c2} \end{bmatrix} = \begin{bmatrix} 0 & 0 & -1/l_1 & 0 \\ 0 & 0 & 0 & -1/l_2 \\ 1/c_1 & 0 & 0 & 0 \\ 0 & 1/c_2 & 0 & -1/(r c_2) \end{bmatrix} \begin{bmatrix} i_{l1} \\ i_{l2} \\ v_{c1} \\ v_{c2} \end{bmatrix} + \begin{bmatrix} 1/l_1 \\ 1/l_2 \\ 0 \\ 0 \end{bmatrix} v_1 \qquad (6)$$

$$v_2 = \begin{bmatrix} 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} i_{l1} & i_{l2} & v_{c1} & v_{c2} \end{bmatrix}^t \qquad (7)$$

equations (4) and (5) are represented in matrix form as

$$\begin{bmatrix} \dot{i}_{l1} \\ \dot{i}_{l2} \\ \dot{v}_{c1} \\ \dot{v}_{c2} \end{bmatrix} = \begin{bmatrix} 0 & 0 & -1/l_1 & 0 \\ 0 & 0 & 1/l_2 & -1/l_2 \\ 1/c_1 & -1/c_1 & 0 & 0 \\ 0 & 1/c_2 & 0 & -1/(r c_2) \end{bmatrix} \begin{bmatrix} i_{l1} \\ i_{l2} \\ v_{c1} \\ v_{c2} \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix} v_1 \qquad (8)$$

$$v_2 = \begin{bmatrix} 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} i_{l1} & i_{l2} & v_{c1} & v_{c2} \end{bmatrix}^t \qquad (9)$$

the matrices of equations (6) to (9) are averaged with respect to the switching functions $d$ and $d'$:

$$a = a_1 d + a_2 d' = \begin{bmatrix} 0 & 0 & -1/l_1 & 0 \\ 0 & 0 & d'/l_2 & -1/l_2 \\ 1/c_1 & -d'/c_1 & 0 & 0 \\ 0 & 1/c_2 & 0 & -1/(r c_2) \end{bmatrix}, \quad b = b_1 d + b_2 d' = \begin{bmatrix} d/l_1 \\ d/l_2 \\ 0 \\ 0 \end{bmatrix} \qquad (10)$$

$$c = c_1 d + c_2 d' = \begin{bmatrix} 0 & 0 & 0 & 1 \end{bmatrix} \qquad (11)$$

thus, the complete averaged state space model of the cbc is given by equations (12) and (13):

$$\dot{x} = a\,x + b\,v_1 \qquad (12)$$

$$v_2 = c\,x \qquad (13)$$

to obtain the linear small-signal state-space model, a perturbation is added to each state variable and the linear steady state model is obtained as in equation (14):

$$\begin{bmatrix} \dot{\hat{i}}_{l1} \\ \dot{\hat{i}}_{l2} \\ \dot{\hat{v}}_{c1} \\ \dot{\hat{v}}_{c2} \end{bmatrix} = \begin{bmatrix} 0 & 0 & -1/l_1 & 0 \\ 0 & 0 & d'/l_2 & -1/l_2 \\ 1/c_1 & -d'/c_1 & 0 & 0 \\ 0 & 1/c_2 & 0 & -1/(r c_2) \end{bmatrix} \begin{bmatrix} \hat{i}_{l1} \\ \hat{i}_{l2} \\ \hat{v}_{c1} \\ \hat{v}_{c2} \end{bmatrix} + \begin{bmatrix} -v_1/(d' l_1) \\ v_1/(d' l_2) \\ -d\,v_1/(2 r c_1) \\ 0 \end{bmatrix} \hat{d} \qquad (14)$$

the general equation of the laplace transform of the linear steady state model is given by

$$\hat{x}(s) = [s i - a]^{-1}\left[(a_1 - a_2)x + (b_1 - b_2)u\right]\hat{d}(s) = \left[[s i - a]^{-1} b_d\right]\hat{d}(s). \qquad (15)$$

for the cbc,

$$(s i - a) = \begin{bmatrix} s & 0 & 1/l_1 & 0 \\ 0 & s & -d'/l_2 & 1/l_2 \\ -1/c_1 & d'/c_1 & s & 0 \\ 0 & -1/c_2 & 0 & s + 1/(r c_2) \end{bmatrix}. \qquad (16)$$

now,

$$[s i - a]^{-1} = \frac{\left(\text{co-factors of } (s i - a)\right)^t}{\det (s i - a)}. \qquad (17)$$

let the co-factors of $(s i - a)$ be $c_{11}, c_{12}, \ldots, c_{44}$; then

$$\left(\text{co-factors of } (s i - a)\right)^t = \begin{bmatrix} c_{11} & c_{21} & c_{31} & c_{41} \\ c_{12} & c_{22} & c_{32} & c_{42} \\ c_{13} & c_{23} & c_{33} & c_{43} \\ c_{14} & c_{24} & c_{34} & c_{44} \end{bmatrix}, \qquad (18)$$

where the co-factors are given in table 1.
table 1: co-factors of $(s i - a)$

$c_{11} = s^3 + s^2\frac{1}{rc_2} + s\left(\frac{1}{l_2c_2} + \frac{d'^2}{l_2c_1}\right) + \frac{d'^2}{r l_2 c_1 c_2}$; $\quad c_{31} = -s^2\frac{1}{l_1} - s\frac{1}{r l_1 c_2} - \frac{1}{l_1 l_2 c_2}$

$c_{12} = s\frac{d'}{l_2c_1} + \frac{d'}{r l_2 c_1 c_2}$; $\quad c_{32} = s^2\frac{d'}{l_2} + s\frac{d'}{r l_2 c_2}$

$c_{13} = s^2\frac{1}{c_1} + s\frac{1}{r c_1 c_2} + \frac{1}{l_2 c_1 c_2}$; $\quad c_{33} = s^3 + s^2\frac{1}{rc_2} + s\frac{1}{l_2c_2}$

$c_{14} = \frac{d'}{l_2 c_1 c_2}$; $\quad c_{34} = s\frac{d'}{l_2 c_2}$

$c_{21} = s\frac{d'}{l_1c_1} + \frac{d'}{r l_1 c_1 c_2}$; $\quad c_{41} = -\frac{d'}{l_1 l_2 c_1}$

$c_{22} = s^3 + s^2\frac{1}{rc_2} + s\frac{1}{l_1c_1} + \frac{1}{r l_1 c_1 c_2}$; $\quad c_{42} = -s^2\frac{1}{l_2} - \frac{1}{l_1 l_2 c_1}$

$c_{23} = -s^2\frac{d'}{c_1} - s\frac{d'}{r c_1 c_2}$; $\quad c_{43} = s\frac{d'}{l_2 c_1}$

$c_{24} = s^2\frac{1}{c_2} + \frac{1}{l_1 c_1 c_2}$; $\quad c_{44} = s^3 + s\frac{d'^2}{l_2c_1} + s\frac{1}{l_1c_1}$

expanding the determinant of $(s i - a)$ along the first row of (16) and normalizing yields

$$\det(s i - a) \propto 1 + s\left(\frac{l_2 + d'^2 l_1}{r}\right) + s^2\left[l_2c_2 + l_1c_1 + l_1c_2 d'^2\right] + s^3\left(\frac{l_1 l_2 c_1}{r}\right) + s^4\,l_1 l_2 c_1 c_2. \qquad (19)$$

substituting equations (18) and (19) into (17) gives $[s i - a]^{-1}$ as the transposed co-factor matrix divided by the polynomial (19), and

$$b_d = \begin{bmatrix} -v_1/(d' l_1) & v_1/(d' l_2) & -d\,v_1/(2 r c_1) & 0 \end{bmatrix}^t. \qquad (20)$$

thus, from equation (15), the laplace transform of the state vector of the complex cbc becomes

$$\hat{x}(s) = \frac{\left(\text{co-factors of } (s i - a)\right)^t b_d}{1 + s\left(\frac{l_2 + d'^2 l_1}{r}\right) + s^2\left[l_2c_2 + l_1c_1 + l_1c_2 d'^2\right] + s^3\left(\frac{l_1 l_2 c_1}{r}\right) + s^4\,l_1 l_2 c_1 c_2}\,\hat{d}(s). \qquad (21)$$

the general equation of the laplace transform of the output at steady state is $\hat{y}(s) = c\hat{x}(s) + [(c_1 - c_2)x(s)]\hat{d}(s)$. from equations (7) and (9), $c_1 = c_2$, and thus $\hat{y}(s) = c\,\hat{x}(s)$. hence, for the complex cbc,

$$\hat{v}_2(s) = \begin{bmatrix} 0 & 0 & 0 & 1 \end{bmatrix} \hat{x}(s). \qquad (22)$$

solving equation (22), the control to output transfer function of the complex cbc is

$$\frac{\hat{v}_2(s)}{\hat{d}(s)} = \frac{1 + s\,l_1\frac{d\,d'}{2r} + s^2\,l_1 c_1\frac{1+d}{2d}}{1 + s\frac{l_2 + d'^2 l_1}{r} + s^2\left(l_2c_2 + l_1c_1 + l_1c_2 d'^2\right) + s^3\frac{l_1 l_2 c_1}{r} + s^4\,l_1 l_2 c_1 c_2}. \qquad (23)$$

it is realised that the calculation of the matrix $(s i - a)^{-1}$ becomes complicated for any high order circuit, which is in agreement with what is stated in [27].
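as a cross-check of the algebra above, the control-to-output transfer function can also be obtained numerically from the averaged matrices. the following is a minimal sketch (python with scipy is an assumption; the component values are the ones used later in section 3, with d = 0.35 chosen to match the simulated operating point, and the $b_d$ column is taken as printed in eq. (14)):

```python
import numpy as np
from scipy import signal

# component values from section 3; duty ratio d = 0.35 is an assumption
l1, l2 = 524e-6, 1200e-6   # inductances [h]
c1, c2 = 5e-6, 5e-6        # capacitances [f]
r, v1, d = 8.05, 25.0, 0.35
dp = 1.0 - d               # d' = 1 - d

# averaged state matrix a from eq. (10), states [il1, il2, vc1, vc2]
a = np.array([[0,      0,      -1/l1,  0],
              [0,      0,       dp/l2, -1/l2],
              [1/c1,  -dp/c1,   0,      0],
              [0,      1/c2,    0,     -1/(r*c2)]])

# duty-ratio input column bd as printed in eq. (14)
bd = np.array([[-v1/(dp*l1)],
               [ v1/(dp*l2)],
               [-d*v1/(2*r*c1)],
               [ 0.0]])

c = np.array([[0, 0, 0, 1]])    # output is vc2, eq. (13)
num, den = signal.ss2tf(a, bd, c, np.zeros((1, 1)))

# the denominator should reproduce the polynomial of eq. (19)/(23)
print("numerator  :", num[0])
print("denominator:", den)
```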
2.3. pwm switch with eet technique

the main principle of the pwm switch modelling technique is the elimination of the active and passive switches by their time-averaged models, to obtain an averaged circuit model for the switched network, which is then inserted into the converter circuit. the final model is a time-averaged equivalent circuit model, where all branch currents and node voltages correspond to the averaged values of the corresponding original currents and voltages. the replacement of the active and passive switches of the basic switch model (figure 4) by the switching functions $d$ and $d'$, respectively, is depicted in figure 5.

figure 4. basic circuit model of the switch.
figure 5. replacement of the switches.

the invariant average equations for figure 5 are

$$i_1 = d\,i_2, \qquad v_{23} = d\,v_{13}. \qquad (24)$$

adding a small deviation to the duty ratio function $d$ and then differentiating equation (24),

$$\hat{i}_1 = d\,\hat{i}_2 + i_2\,\hat{d}, \qquad \hat{v}_{23} = d\,\hat{v}_{13} + v_{13}\,\hat{d}, \qquad (25)$$

where $d$, $i_2$ and $v_{13}$ are the steady state operating points of the pwm switch. the pwm switch model corresponding to the invariant average equations (24) is depicted in figure 6. also, using equation (25), the dependent sources $d\hat{i}_2$ and $d\hat{v}_{23}$ are replaced with a 1:d transformer and, by moving the controlled source $\hat{d}v_{13}$ from the common terminal side to the active terminal side, the pwm switch model of figure 7 is obtained. thus, in pwm switch modelling with eet, the switch and diode are replaced point by point by their equivalent circuit models, and the eet is then employed to find the transfer function of the converter circuit.

figure 6. pwm switch model.
figure 7. pwm switch model as a 1:d transformer.
figure 8. two pwm switches as 1:d transformers for the cbc.

for the complex cbc, the on state and off state conditions of the switches are employed and two pwm switches are identified, represented as s1 and s2. the identified pwm switch models are replaced with 1:d transformers along with their small signals, as depicted in figure 8. in order to determine the dc operating point of the complex cbc, all inductors are short circuited and all capacitors are open circuited. referring to figure 8, the dc operating point with respect to switch s1 is $v_{ap1} = v_1$, and the current flowing through terminal 'c' of switch s1 is $i_{c1} = i_{c2} - d\,i_{c2} = d'\,i_{c2}$. similarly, the dc operating point with respect to switch s2 is $v_{ap2} = v_1(1 + d)$, $i_{c2} = \frac{v_2}{r} = \frac{d^2 v_1}{r}$.

once the dc operating point is determined, the 4-eet is applied, which provides separate, independent steps of the analytical technique. the general form of the control to output transfer function is $\hat{v}_2(s)/\hat{d}(s)$. in the very first step of the 4-eet, the denominator $\hat{d}(s)$ is determined, for which each impedance element of the converter is treated as an extra element and the input sources are set to zero. with respect to figure 3, $c_1$, $c_2$, $l_2$ and $l_1$ are considered as 4 ports, numbered as ports (1), (2), (3) and (4), respectively. treating these ports as extra elements, the various time constants are determined by removing the port element and observing the remaining circuit through the removed port. the denominator of the transfer function is given by

$$\hat{d}(s) = 1 + a_1 s + a_2 s^2 + a_3 s^3 + a_4 s^4, \qquad (26)$$

where

$$a_1 = \tau_1 + \tau_2 + \tau_3 + \tau_4,$$
$$a_2 = \tau_1\tau_2^{1} + \tau_1\tau_3^{1} + \tau_1\tau_4^{1} + \tau_2\tau_3^{2} + \tau_2\tau_4^{2} + \tau_3\tau_4^{3},$$
$$a_3 = \tau_1\tau_3^{1}\tau_2^{31} + \tau_4\tau_1^{4}\tau_2^{41} + \tau_4\tau_1^{4}\tau_3^{41} + \tau_3\tau_2^{3}\tau_4^{23},$$
$$a_4 = \tau_4\tau_1^{4}\tau_3^{41}\tau_2^{413}. \qquad (27)$$

figure 9. determination of (a) $\tau_1$, (b) $\tau_2$, (c) $\tau_3$, (d) $\tau_4$.

figure 9 is referred to for the determination of the time constants for $a_1$, which is explained in table 2:

$\tau_1$: the time constant of the first port is $r_{eq}\,c_1$, determined by observing the circuit through port (1); the capacitor of port (1) is assumed to be the extra element and is removed temporarily, while $l_1$, $l_2$ and $c_2$ are assumed to be in the dc condition. observing through port (1), $r_{eq} = 0$; therefore $\tau_1 = r_{eq}\,c_1 = 0$.

$\tau_2$: the time constant of the second port is $r_{eq}\,c_2$, determined by observing the circuit through port (2); the capacitor of this port is now the extra element and is removed temporarily, while $l_1$, $l_2$ and $c_1$ are assumed to be in the dc condition. after observation, $r_{eq} = 0$; therefore $\tau_2 = r_{eq}\,c_2 = 0$.

$\tau_3$: the time constant of the third port is $l_2/r_{eq}$, determined by observing the circuit through port (3); the inductor of this port is the extra element and is removed temporarily, while $c_1$, $c_2$ and $l_1$ are kept in the dc condition. by observation, $r_{eq} = r$.
hence, $\tau_3 = l_2/r$.

$\tau_4$: the time constant of the fourth port is $l_1/r_{eq}$, determined by observing the circuit through port (4); the inductor of this port is the extra element and is removed temporarily, while $c_1$, $c_2$ and $l_2$ are kept in the dc condition. by observing the circuit it is seen that $r_{eq} = r/d'^2$. hence, $\tau_4 = l_1 d'^2/r$.

table 2: determination of the time constants for $a_1$.

summarizing the results of table 2, we get

$$a_1 = \frac{l_2}{r} + \frac{l_1 d'^2}{r}. \qquad (28)$$

the first two terms and the last term of $a_2$ (equation (27)) are zero. the calculation of $a_2$ therefore reduces to the sum of the middle terms, i.e. $(\tau_1\tau_4^{1} + \tau_2\tau_3^{2} + \tau_2\tau_4^{2})$, which are indeterminate; the indeterminacy is removed by changing the port sequence to $(\tau_4\tau_1^{4} + \tau_3\tau_2^{3} + \tau_4\tau_2^{4})$. referring to figure 10, the terms $\tau_1^{4}$, $\tau_2^{3}$ and $\tau_2^{4}$ are determined as described in table 3:

figure 10. determination of (a) $\tau_1^{4}$, (b) $\tau_2^{3}$, (c) $\tau_2^{4}$.

$\tau_1^{4}$: observing the circuit through port (1) with port (4) in the high frequency state and ports (2) and (3) in the dc condition; the capacitor of port (1) is treated as the extra element and removed temporarily. the equivalent resistance is $r_{eq} = r/d'^2$; hence $\tau_4\tau_1^{4} = l_1 c_1$.

$\tau_2^{3}$: observing the circuit through port (2) with port (3) in the high frequency state and ports (1) and (4) in the dc condition; the capacitor of port (2) is treated as the extra element and removed temporarily. the equivalent resistance is $r_{eq} = r$; therefore $\tau_3\tau_2^{3} = l_2 c_2$.

$\tau_2^{4}$: observing the circuit through port (2) with port (4) in the high frequency state and ports (1) and (3) in the dc condition; the capacitor of port (2) is treated as the extra element and removed temporarily. the equivalent resistance is $r_{eq} = r$; therefore $\tau_4\tau_2^{4} = l_1 c_2 d'^2$.

table 3: determination of the time constants for $a_2$.

summing up the results of table 3, we get

$$a_2 = l_1c_1 + l_2c_2 + l_1c_2 d'^2. \qquad (29)$$

the first term of $a_3$ (equation (27)) cannot be determined, as it gives a ratio of (0/0); this indeterminacy in the time constants is removed with a new sequence of ports for $a_3$:

$$a_3 = \tau_1\tau_3^{1}\tau_2^{31} + \tau_4\tau_1^{4}\tau_2^{41} + \tau_4\tau_1^{4}\tau_3^{41} + \tau_3\tau_2^{3}\tau_4^{23}. \qquad (30)$$

the first two terms and the last term of equation (30) are zero. to calculate $\tau_3^{41}$, the circuit in figure 11 is observed through port (3) while ports (1) and (4) are kept in the high frequency condition. here $r_{eq} = r$, which results in $\tau_3^{41} = l_2/r_{eq} = l_2/r$. hence,

$$a_3 = \tau_4\tau_1^{4}\tau_3^{41} = \frac{l_1 l_2 c_1}{r}. \qquad (31)$$

figure 11. determination of $\tau_3^{41}$.
figure 12. determination of $\tau_2^{413}$.

for the determination of $a_4$, referring to figure 12, $\tau_2^{413}$ is determined by observing the circuit through port (2) while ports (4), (1) and (3) are kept in the high frequency condition. in this situation $r_{eq} = r$ and $\tau_2^{413} = r\,c_2$. the calculation of the term $\tau_4\tau_1^{4}\tau_3^{41}\tau_2^{413}$ results in $l_1l_2c_1c_2$. hence,

$$a_4 = l_1l_2c_1c_2. \qquad (32)$$

using equations (28), (29), (31) and (32) in equation (26), the denominator of the control to output transfer function of the complex cbc becomes

$$\hat{d}(s) = 1 + \left(\frac{l_2}{r} + \frac{l_1 d'^2}{r}\right)s + \left(l_1c_1 + l_2c_2 + l_1c_2 d'^2\right)s^2 + \left(\frac{l_1 l_2 c_1}{r}\right)s^3 + l_1l_2c_1c_2\,s^4. \qquad (33)$$

now, the excitation applied to the port is retained as it is, and an additional excitation is applied to the port of the extra element with the extra element removed. with these two excitations, the response is considered null. simultaneously, the conversion ratio $d^2$ is differentiated with respect to $d$. considering this differentiation and the null response condition together, the numerator of the transfer function is determined.

figure 13. null condition at the output of the complex cbc for the determination of the numerator.
by considering a null response at the output of the complex cbc (figure 13), it is observed that the presence of $\hat{i}_{l2}(s)$ creates the condition represented by the equation $\hat{i}_{p2}(s) = \hat{i}_{c2}\,\hat{d}(s)$, and the resultant voltage across $c_1$ is

$$\hat{v}_{c1}(s) = v_{ap}\hat{d}\,\frac{1/(sc_1)}{s l_1 + 1/(sc_1)} + i_{c2}\hat{d}\left[s l_1 \,\|\, \frac{1}{sc_1}\right]. \qquad (34)$$

this voltage must be the same as that on the d side of the second pwm switch. hence,

$$\hat{v}_{c1}(s) = -\left[\frac{v_{ap2}}{d}\,\hat{d} - \hat{v}_{c1}(s)\right]d. \qquad (35)$$

from equations (34) and (35),

$$d'\,v_{ap1}\,\frac{1}{1 + s^2 l_1 c_1} + d'\,i_{c2}\,\frac{s l_1}{1 + s^2 l_1 c_1} + v_{ap2} = 0, \qquad (36)$$

which yields the numerator of the control to output transfer function of the complex cbc:

$$\hat{v}_2(s) = 1 + s\,l_1\frac{d\,d'}{2r} + s^2\,l_1 c_1\,\frac{1+d}{2d}. \qquad (37)$$

using equations (33) and (37), the overall transfer function of the complex cbc is given by equation (38):

$$\frac{\hat{v}_2(s)}{\hat{d}(s)} = \frac{1 + s\,l_1\frac{d\,d'}{2r} + s^2\,l_1 c_1\frac{1+d}{2d}}{1 + s\frac{l_2 + d'^2 l_1}{r} + s^2\left(l_2c_2 + l_1c_1 + l_1c_2 d'^2\right) + s^3\frac{l_1 l_2 c_1}{r} + s^4\,l_1l_2c_1c_2}. \qquad (38)$$

3. simulation of complex cbc

the simulation circuit is developed, and the voltage and current graphs (figure 14), the time response (figure 15) and the frequency response plot (figure 16) are obtained for the complex cbc using matlab simulation to study its behaviour. the circuit parameters for the simulation are selected as: inductors $l_1 = 524\ \mu$h, $l_2 = 1200\ \mu$h; capacitors $c_1 = c_2 = 5\ \mu$f; output resistance $r_0 = 8.05\ \Omega$; switch (q1) parameters: fet on resistance $= 0.1\ \Omega$, internal diode resistance $= 0.01\ \Omega$, internal diode forward voltage $= 0.7$ v; diode (d1, d2, d3) parameters: resistance $= 0.01\ \Omega$, forward voltage $= 0.8$ v; switching frequency $= 50$ khz. for a duty ratio of 0.35 and an input voltage $v_1 = 25$ v, a source current of 0.05746 a, an output voltage $v_2 = 3$ v and a load current of 0.367 a were observed (figure 14). the output voltages and load currents for different duty ratios of the complex cbc are given in table 4. from the transient and steady state response, it is observed that the output voltage settles to about 3 volts with a settling time of approximately 0.9 ms after the application of the input voltage. the system is stable (figure 16), as the gain margin is positive (∞ db) and the phase margin is also positive (45°).

figure 14. voltage and current graphs of the complex cbc.
figure 15. transient and steady state response of the complex cbc.
figure 16. bode plot of the control to output transfer function of the complex cbc.

table 4: load current and output voltage for different duty ratios

duty ratio    output voltage [v]    load current [a]
0.3           2.47                  0.3076
0.4           3.434                 0.4266
0.5           5.44                  0.6762
0.6           8.601                 1.068
0.7           11.78                 1.464
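the transfer function of eq. (38) can be evaluated at this operating point to connect it with the simulated frequency response. the following is a minimal sketch (python/numpy assumed; the coefficient expressions follow the printed form of eq. (38), with d = 0.35 and the section-3 component values):

```python
import numpy as np

# eq. (38) with the section-3 component values, d = 0.35 (d' = 0.65)
l1, l2, c1, c2, r, d = 524e-6, 1200e-6, 5e-6, 5e-6, 8.05, 0.35
dp = 1 - d

num = [l1*c1*(1 + d)/(2*d), l1*d*dp/(2*r), 1.0]                # s^2, s, 1
den = [l1*l2*c1*c2, l1*l2*c1/r, l2*c2 + l1*c1 + l1*c2*dp**2,
       (l2 + dp**2*l1)/r, 1.0]                                 # s^4 ... 1

print("zero frequencies [khz]:", np.abs(np.roots(num)) / (2e3*np.pi))
print("pole frequencies [khz]:", np.abs(np.roots(den)) / (2e3*np.pi))
```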
4. results and discussion

the ssa and pwm switch modelling with eet techniques for the cbc (sections 2.2 and 2.3) are examined and compared for different features, and the results are summarized in table 5. in ssa, a linearisation of all the components of the converter is performed, whereas in the pwm switch model method only the non-linear switching devices are linearised and the linear components remain unchanged. in the ssa technique, a set of differential equations is written and then averaged with respect to the duty ratio function. the averaged equations are linearised using the principle of perturbation, the laplace transform is applied to the linear state space model of the cbc, and the control to output transfer function is obtained.

unlike the ssa technique, differential equations are not required in the pwm switch modelling technique. averaging and linearisation with the help of perturbation are accomplished using the pwm switch itself. instead of the laplace transform, the eet is applied to the linearised model of the cbc to derive the control to output transfer function. the ssa model of the cbc is a systematic procedure involving mathematics without any circuit transformation; it involves complex mathematical transformations and a higher number of mathematical equations. in contrast, the pwm switch modelling technique with eet consists of simple mathematical transformations; however, expertise is required for the proper circuit transformations. based on the parametric comparison (table 5) of the two analytical techniques, it is observed that ssa is better than pwm with eet for a high-order circuit, due to its systematic, straightforward procedural steps. pwm with eet for a high-order circuit requires a higher number of circuit transformations and high expertise, which makes it time consuming and difficult.

table 5: results of the comparative evaluation of the modelling techniques

feature                                      ssa technique     pwm switch modelling with eet
analysis approach                            mathematical      mainly circuit oriented
structure of analysis                        systematic        less systematic
equivalent diagram                           not required      required
circuit transformation                       not required      required
number of mathematical transformations       more              less
number of mathematical equations             more              less
complexity of mathematical transformations   high              low
expertise required                           mathematical      identification of the pwm switch, changing the port sequences in case of indeterminacy, application of the eet
identification of pwm switch                 not required      required
extra element theorem                        not applicable    applicable
form of transfer function                    low entropy       low entropy
complexity of the modelling process          low               high
steady state analysis                        yes               yes
transient state analysis                     yes               yes

5. conclusion

this paper presents a comparative study of two modelling techniques commonly used for the modelling of dc-dc converters, the first of its kind. the transfer functions of the complex cbc converter were developed using
hegde, a. izadian. a new sepic inverter: small signal modeling. in iecon 2013 – 39th annual conference of the ieee industrial electronics society, pp. 240–245. 2013. https://doi.org/10.1109/iecon.2013.6699142. [4] a. davoudi, j. jatskevich, t. de rybel. numerical state-space average-value modeling of pwm dc-dc converters operating in dcm and ccm. ieee transactions on power electronics 21(4):1003–1012, 2006. https://doi.org/10.1109/tpel.2006.876848. [5] d. czarkowski, m. k. kazimierczuk. energy-conservation approach to modeling pwm dc-dc converters. ieee transactions on aerospace and electronic systems 29(3):1059–1063, 1993. https://doi.org/10.1109/7.220955. [6] m. bartoli, a. reatti, m. k. kazimierczuk. open loop small-signal control-to-output transfer function of pwm buck converter for ccm: modeling and measurements. in proceedings of 8th mediterranean electrotechnical conference on industrial applications in power systems, computer science and telecommunications (melecon 96), vol. 3, pp. 1203–1206. 1996. https://doi.org/10.1109/melcon.1996.551161. [7] n. kroutikova, c. a. hernandez-aramburo, t. c. green. state-space model of grid-connected inverters under current control mode. iet electric power applications 1:329–338, 2007. https://doi.org/10.1049/iet-epa:20060276. [8] h. mashinchi mahery, e. babaei. mathematical modeling of buck-boost dc-dc converter and investigation of converter elements on transient and steady state responses. international journal of electrical power & energy systems 44(1):949–963, 2013. https://doi.org/10.1016/j.ijepes.2012.08.035. [9] s. laali, h. m. mahery. buck dc-dc converter: mathematical modeling and transient state analyzes. in 2012 3rd ieee international symposium on power electronics for distributed generation systems (pedg), pp. 661–667. 2012. https://doi.org/10.1109/pedg.2012.6254073. [10] m. muntasir nishat, m. a. moin oninda, f. faisal, m. a. hoque. modeling, simulation and performance analysis of sepic converter using hysteresis current control and pi control method. in 2018 international conference on innovations in science, engineering and technology (iciset), pp. 7–12. 2018. https://doi.org/10.1109/iciset.2018.8745619. [11] v. vorperian. simplified analysis of pwm converters using model of pwm switch. continuous conduction mode. ieee transactions on aerospace and electronic systems 26(3):490–496, 1990. https://doi.org/10.1109/7.106126. [12] c. basso. switching-converter dynamic analysis with fast analytical techniques: overview and applications. ieee power electronics magazine 4(3):41–52, 2017. https://doi.org/10.1109/mpel.2017.2718238. [13] h. g. tan rodney, l. y. h. hoo. dc-dc converter modeling and simulation using state space approach. in 2015 ieee conference on energy conversion (cencon), pp. 42–47. 2015. https://doi.org/10.1109/cencon.2015.7409511. [14] t. polsky, y. horen, s. bronshtein, d. baimel. transient and steady-state analysis of a sepic converter by an average state-space modelling. in 2018 ieee 18th international power electronics and motion control conference (pemc), pp. 211–215. 2018. https://doi.org/10.1109/epepemc.2018.8522000. [15] b. p. mokal, k. vadirajacharya. extensive modeling of dc-dc cuk converter operating in continuous conduction mode. in 2017 international conference on circuit ,power and computing technologies (iccpct), pp. 1–5. 2017. https://doi.org/10.1109/iccpct.2017.8074188. [16] g. kanimozhi, j. meenakshi, v. t. sreedevi. 
small signal modeling of a dc-dc type double boost converter integrated with sepic converter using state space averaging approach. energy procedia 117:835–846, 2017. https://doi.org/10.1016/j.egypro.2017.05.201. [17] v. eng, u. pinsopon, c. bunlaksananusorn. modeling of a sepic converter operating in continuous conduction mode. in 2009 6th international conference on electrical engineering/electronics, computer, telecommunications and information technology, pp. 136–139. 2009. https://doi.org/10.1109/ecticon.2009.5136982. 63 https://doi.org/10.1007/s40031-022-00759-x https://doi.org/10.1109/ecticon.2008.4600593 https://doi.org/10.1109/iecon.2013.6699142 https://doi.org/10.1109/tpel.2006.876848 https://doi.org/10.1109/7.220955 https://doi.org/10.1109/melcon.1996.551161 https://doi.org/10.1049/iet-epa:20060276 https://doi.org/10.1016/j.ijepes.2012.08.035 https://doi.org/10.1109/pedg.2012.6254073 https://doi.org/10.1109/iciset.2018.8745619 https://doi.org/10.1109/7.106126 https://doi.org/10.1109/mpel.2017.2718238 https://doi.org/10.1109/cencon.2015.7409511 https://doi.org/10.1109/epepemc.2018.8522000 https://doi.org/10.1109/iccpct.2017.8074188 https://doi.org/10.1016/j.egypro.2017.05.201 https://doi.org/10.1109/ecticon.2009.5136982 sangeeta shete, prasad joshi acta polytechnica [18] o. kircioğlu, m. ünlü, s. çamur. modeling and analysis of dc-dc sepic converter with coupled inductors. in 2016 international symposium on industrial electronics (indel), pp. 1–5. 2016. https://doi.org/10.1109/indel.2016.7797807. [19] n. mohan, t. m. undeland, w. p. robbins. power electronics – converters, applications and design. 3rd edition. john wiley and sons, 2013. [20] c. basso. linear circuit transfer functions: an introduction to fast analytical techniques. wiley, hoboken, nj, 2016. [21] v. vorpérian. fast analytical techniques for electrical and electronic circuits. cambridge university press, cambridge, u.k., 2002. [22] s. k. pidaparthy, b. choi. output impedance analysis of pwm dc-to-dc converters. in 2019 10th international conference on power electronics and ecce asia (icpe 2019 – ecce asia), pp. 849–855. 2019. https://doi.org/10.23919/icpe2019-ecceasia42246.2019.8796492. [23] b. choi, d. kim, d. lee, et al. analysis of input filter interactions in switching power converters. ieee transactions on power electronics 22(2):452–460, 2007. https://doi.org/10.1109/tpel.2006.889925. [24] e. can. pwm controlling of a new multi dc-dc converter circuit. technical journal 13(2):116–122, 2019. https://doi.org/10.31803/tg-20190427093441. [25] e. can, h. h. sayan. different mathematical model for the chopper circuit. technical journal 10(1-2):13–15, 2016. https://hrcak.srce.hr/file/238913. [26] d. maksimovic, s. cuk. switching converters with wide dc conversion range. ieee transactions on power electronics 6(1):151–157, 1991. https://doi.org/10.1109/63.65013. [27] m. m. garg, y. v. hote, m. k. pathak. pi controller design of a dc-dc zeta converter for specific phase margin and cross-over frequency. in 2015 10th asian control conference (ascc), pp. 1–6. 2015. https://doi.org/10.1109/ascc.2015.7244716. 
identification and predictive control by p-norm minimization

j. pekař, j. štecha

real time system parameter estimation from a set of input-output data is usually solved by minimization of the quadratic norm of the errors of the system equations – known in the literature as least squares (ls) or its modifications, total least squares (tls) or mixed ls and tls. it is known that utilization of the p-norm (1 ≤ p ≤ 2) instead of the quadratic norm suppresses the wrong measurements (outliers) in the data. this property is shown for different norms, and it is shown that the influence of outliers is suppressed as p approaches 1. optimal predictive control utilizing p-norm minimization of the criterion is also developed, and the simulation results show the properties of such control.

keywords: identification, predictive control, p-norm, arx model, iteratively reweighted least squares, linear programming.

1 introduction

the dynamic properties of a real plant are usually identified by making a model – choosing a model structure and estimating the unknown parameters of the model using data measured on the real plant [1]. the first goal of this paper is to compare a set of parameter estimates of an arx model, where each estimate is obtained by minimizing the p-norm (1 ≤ p ≤ 2). the measurement of the system output is considered to be damaged by a number of outliers. another problem is optimal control of dynamic systems. model predictive control (mpc) strategies are very popular [2, 4]. optimal predictive control of an arx or state space model is usually obtained by minimizing a quadratic criterion. if a non-quadratic norm is used in the optimality criterion, different results are obtained; for example, for p = 1 dead beat control is obtained. minimizing the l1 norm using linear programming in mpc control has been considered by many authors (e.g. [5, 6, 7]). a connection between linear programming and optimal control is shown for example in [8, 9]. in this paper, optimal predictive control utilizing p-norm minimization of the criterion is shown, and the results are illustrated by simple examples. the paper is organized as follows: section 2 formulates arx model identification, section 3 recapitulates p-norm minimization, section 4 describes the predictive control strategy, sections 5 and 6 present examples, and section 7 concludes.

2 identification of an arx model using p-norm

the arx model of a dynamic system can be described by the equation

$$y(t) + a_1 y(t-1) + \dots + a_n y(t-n) = b_0 u(t-1) + b_1 u(t-2) + \dots + b_{n-1} u(t-n) + e(t) \qquad (1)$$

where y(t) is the system output at time t, u(t) is the system input, e(t) is an equation error, and $a_i$, $b_j$ are system parameters. note that only a single input – single output (siso) system is considered in this paper. the parameter estimation problem from a given set of data $\mathcal{D}_t = \{y(t), y(t-1), \dots, u(t), u(t-1), \dots\}$ leads to minimization of the error vector $e = [e(t), e(t-1), \dots]^T$. we are looking for the parameter vector $x = [a_1, a_2, \dots, b_0, b_1, \dots]^T$ that ensures the best approximation of the output vector $b = [y(t), y(t-1), \dots, y(t-m+1)]^T$ by the vector $Ax$, where

$$A = \begin{bmatrix} -y(t-1) & -y(t-2) & \cdots & u(t-1) & \cdots \\ -y(t-2) & -y(t-3) & \cdots & u(t-2) & \cdots \\ \vdots & \vdots & & \vdots & \end{bmatrix} \qquad (2)$$

both the vector b and the matrix A are formed from the measured data $\mathcal{D}_t$. the most usual solution of the parameter estimation problem involves minimization of the quadratic norm of the error vector, known as the euclidean norm. in such a case, the optimization problem is

$$\min_x \|Ax - b\|_2 \qquad (3)$$

the solution is known as least squares (ls) [10].
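to make (1)–(3) concrete, the following is a minimal numerical sketch of building the regression matrix and solving the least-squares problem; it is our illustration with toy data, not code from the paper.

```python
import numpy as np

def arx_ls(y, u, n):
    """Estimate ARX parameters x = [a_1..a_n, b_0..b_{n-1}]^T
    by least squares, following equations (1)-(3)."""
    rows, rhs = [], []
    for t in range(n, len(y)):
        # regressor row: [-y(t-1) ... -y(t-n), u(t-1) ... u(t-n)]
        rows.append(np.concatenate((-y[t-n:t][::-1], u[t-n:t][::-1])))
        rhs.append(y[t])
    A, b = np.asarray(rows), np.asarray(rhs)
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x

# toy data (placeholder): first-order system with small noise
rng = np.random.default_rng(0)
u = rng.standard_normal(500)
y = np.zeros(500)
for t in range(1, 500):
    y[t] = 0.8 * y[t-1] + 0.5 * u[t-1] + 0.01 * rng.standard_normal()
print(arx_ls(y, u, n=1))  # approx [-0.8, 0.5]; note the sign convention of eq. (1)
```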
in reality the noise of the output measurement is also found in the elements of the data matrix A. the solution of such a problem leads to total least squares (tls) [11]. if the input measurement is noise free, the problem can be solved by mixed ls and tls [11]. in some applications, we can use the more general p-norm instead of the quadratic norm [10, 12]. the p-norm is defined as

$$\|y\|_p = \left(\sum_{i=1}^{n} |y_i|^p\right)^{1/p},\quad 1 \le p < \infty, \qquad \|y\|_\infty = \max\left(|y_1|, \dots, |y_n|\right) \qquad (4)$$

then the problem of parameter estimation is defined as follows

$$x^* = \arg\min_x \|Ax - b\|_p^p \qquad (5)$$

for 1 ≤ p ≤ 2, p-norm minimization can be done by iterative solution of the ls problem. for p = 1 and p = ∞ the solution can be obtained by linear programming.

3 p-norm minimization

the general p-norm can be used in the optimization problem defined by (5). if the p-norm is restricted by 1 < p < 2, the minimization problem is convex and the solution is unique. this problem can be solved by an iterative algorithm known as iteratively reweighted least squares (irls) [10]. note that the optimization problem with the 1-norm or the ∞-norm can be solved by linear programming.

3.1 iteratively reweighted least squares

let us solve the approximation problem

$$\min_x \varphi(x) = \|b - Ax\|_p^p, \quad 1 < p < 2 \qquad (6)$$

we assume that all coordinates of the residuum $\rho(x) = b - Ax$ are nonzero. then the function $\varphi(x)$ can be written as

$$\varphi(x) = \sum_{i=1}^{m} |\rho_i(x)|^p = \sum_{i=1}^{m} |\rho_i(x)|^{p-2} \rho_i^2(x) \qquad (7)$$

the previous problem is then a weighted least squares problem:

$$\min_x \|D(\rho)(b - Ax)\|_2^2, \qquad D(\rho) = \mathrm{diag}\left(|\rho_i|^{\frac{p-2}{2}}\right) \qquad (8)$$

because of the dependency of the diagonal weighting matrix D(ρ) on the unknown solution x, the problem must be solved by an iterative algorithm:

1. x^(0) is an initial solution; set the iteration counter k = 0.
2. calculate ρ^(k) by
$$\rho^{(k)} = b - Ax^{(k)}, \qquad D^{(k)} = \mathrm{diag}\left(\left|\rho_i^{(k)}\right|^{\frac{p-2}{2}}\right) \qquad (9)$$
3. utilizing the weighted ls algorithm, δx^(k) is obtained by solving
$$\delta x^{(k)} = \arg\min_{\delta x} \left\|D^{(k)}\left(\rho^{(k)} - A\,\delta x\right)\right\|_2^2 \qquad (10)$$
4. the next iteration x^(k+1) is obtained as
$$x^{(k+1)} = x^{(k)} + \delta x^{(k)} \qquad (11)$$
5. if the convergence criterion is satisfied, then stop and x^(k+1) is the solution; else set k ← k+1 and go to 2.
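the five steps above translate directly into a few lines of numerical code. the sketch below is a minimal illustration, not the authors' implementation; the eps clipping is our own guard for the nonzero-residuum assumption of (7).

```python
import numpy as np

def irls(A, b, p=1.5, iters=50, eps=1e-8):
    """Iteratively reweighted least squares for min ||Ax - b||_p^p,
    1 < p < 2, following steps 1-5 of section 3.1."""
    x = np.linalg.lstsq(A, b, rcond=None)[0]            # step 1: LS start
    for _ in range(iters):
        r = b - A @ x                                   # step 2: residuum
        w = np.abs(r).clip(eps) ** ((p - 2) / 2.0)      # diagonal of D
        dx = np.linalg.lstsq(w[:, None] * A, w * r, rcond=None)[0]  # step 3
        x = x + dx                                      # step 4
        if np.linalg.norm(dx) < eps:                    # step 5: converged?
            break
    return x
```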
3.2 minimization of the 1-norm by linear programming

in order to minimize the norm with p = 1, linear programming (lp) can be used. the following problems are equivalent

$$\min_x \|Ax - b\|_1 = \min_{x,y}\left\{\mathbf{1}^T y : Ax - b \le y,\ Ax - b \ge -y\right\} \qquad (12)$$

introducing the augmented vector $z = \begin{bmatrix} x \\ y \end{bmatrix}$, the standard form of the lp problem is obtained

$$\min_z \left\{c^T z : \bar{A} z \le \bar{b}\right\} \qquad (13)$$

$$c^T = [0, \dots, 0, 1, \dots, 1], \qquad \bar{A} = \begin{bmatrix} A & -I \\ -A & -I \end{bmatrix}, \qquad \bar{b} = \begin{bmatrix} b \\ -b \end{bmatrix}$$

the only drawback of such a computation is that the lp problem can have more than one solution.

3.3 minimization of the ∞-norm by linear programming

in order to minimize the norm with p = ∞, linear programming (lp) can also be used. similarly as for the 1-norm, the following two problems are equivalent

$$\min_x \|Ax - b\|_\infty = \min_{x,y}\left\{y : Ax - b \le y\mathbf{1},\ Ax - b \ge -y\mathbf{1}\right\} \qquad (14)$$

where $\mathbf{1} = [1, 1, \dots, 1]^T$ is the unit vector. introducing the augmented vector $z = \begin{bmatrix} x \\ y \end{bmatrix}$, the standard form of the lp problem is obtained:

$$\min_z \left\{c^T z : \bar{A} z \le \bar{b}\right\} \qquad (15)$$

where

$$c = \begin{bmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix}, \qquad \bar{A} = \begin{bmatrix} A & -\mathbf{1} \\ -A & -\mathbf{1} \end{bmatrix}, \qquad \bar{b} = \begin{bmatrix} b \\ -b \end{bmatrix}$$

4 predictive control strategy

suppose the state and output development model is given in the form

$$x(t+1) = Ax(t) + Bu(t) + v(t), \qquad y(t) = Cx(t) + Du(t) + e(t) \qquad (16)$$

where x(t), y(t) and u(t) are the state, output and input of the system, A, B, C and D are matrices of appropriate dimensions, and v(t) and e(t) are the state and measurement noises, respectively, with zero means and covariances $R_v$ and $R_e$, independent of the state and input of the process. for the optimal predictive strategy the quality criterion is usually given in the quadratic form

$$J = \mathcal{E}\left\{\sum_{t=1}^{K}\left[\left(y(t) - w(t)\right)^2 + r\,u^2(t)\right] \,\middle|\, x(1)\right\} \qquad (17)$$

where K is the horizon of the predictive control strategy, w(t) is the reference and r is the weighting coefficient. let us introduce the augmented vectors $\mathbf{y} = [y(1), \dots, y(K)]^T$, $\mathbf{w} = [w(1), \dots, w(K)]^T$, $\mathbf{u} = [u(1), \dots, u(K)]^T$, $\mathbf{v} = [v(1), \dots, v(K)]^T$, $\mathbf{e} = [e(1), \dots, e(K)]^T$ and the augmented matrices

$$P = \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{K-1} \end{bmatrix}, \qquad S = \begin{bmatrix} D & 0 & \cdots & 0 \\ CB & D & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ CA^{K-2}B & CA^{K-3}B & \cdots & D \end{bmatrix}, \qquad Q = \begin{bmatrix} 0 & 0 & \cdots & 0 \\ C & 0 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ CA^{K-2} & CA^{K-3} & \cdots & 0 \end{bmatrix}$$

then the criterion with the quadratic cost function can be written in the form

$$J = \mathcal{E}\left\{(\mathbf{y} - \mathbf{w})^T(\mathbf{y} - \mathbf{w}) + r\,\mathbf{u}^T\mathbf{u} \,\middle|\, x(1)\right\} \qquad (18)$$

where the augmented output vector is $\mathbf{y} = Px(1) + S\mathbf{u} + Q\mathbf{v} + \mathbf{e}$. the criterion can be minimized by completing the squares if the constraints are not considered. after simple manipulation the optimal control results

$$\mathbf{u}^* = \left(S^T S + rI\right)^{-1} S^T \left(\mathbf{w} - Px(1)\right) \qquad (19)$$

the state and output noises only enlarge the optimal value of the criterion. often only differences of the input signal are penalized. the criterion then has the form

$$J = \mathcal{E}\left\{\sum_{t=1}^{K}\left[\left(y(t) - w(t)\right)^2 + r\,\Delta u^2(t)\right] \,\middle|\, x(1)\right\} \qquad (20)$$

where Δu(t) = u(t) − u(t−1). in such a case, the optimal predictive control strategy is

$$\mathbf{u}^* = \left(S^T S + r R^T R\right)^{-1} \left[S^T\left(\mathbf{w} - Px(1)\right) + r R^T \mathbf{u}(0)\right] \qquad (21)$$

where $\mathbf{u}(0) = [u(0), 0, \dots, 0]^T$ and the matrix R equals

$$R = \begin{bmatrix} 1 & 0 & \cdots & 0 & 0 \\ -1 & 1 & \cdots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & -1 & 1 \end{bmatrix}$$

in this way an integral action of the control is achieved. a different norm can also be used in the quality criterion for predictive control:

$$J = \sum_{i=1}^{K}\left(|y(i) - w(i)|^p + r\,|u(i)|^p\right) \qquad (22)$$

according to the weighting coefficient r, the quadratic norm suppresses large control errors and large input signals. if a norm with p < 2 is used, small members in the criterion have greater influence, and as p → 1 the control law approaches dead beat control.
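a compact numerical sketch of the unconstrained control law (19) follows; the function builds P and S for a siso model of form (16) and is an illustration under stated assumptions, not the authors' implementation.

```python
import numpy as np

def mpc_u_star(A, B, C, D, x1, w, r):
    """Unconstrained quadratic predictive control, eq. (19), for a
    SISO model (16): u* = (S^T S + r I)^{-1} S^T (w - P x(1)).
    A: (n,n), B: (n,1), C: (1,n), D: scalar, x1: (n,), w: (K,)."""
    K = len(w)
    P = np.vstack([C @ np.linalg.matrix_power(A, k) for k in range(K)])  # K x n
    S = np.zeros((K, K))
    for i in range(K):
        S[i, i] = D
        for j in range(i):
            S[i, j] = (C @ np.linalg.matrix_power(A, i - j - 1) @ B).item()
    e = w - (P @ x1).ravel()          # reference minus free response
    return np.linalg.solve(S.T @ S + r * np.eye(K), S.T @ e)
```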
5 example i – suppression of outliers by p-norm minimization

this section shows an example of system parameter identification. the experiment clearly shows the influence of outliers on the results of identification when different norms are utilized. consider a discrete-time system described by the transfer function

$$Y(z) = \frac{0.002870\,z^2 + 0.009882\,z + 0.002126}{z^3 - 2.444\,z^2 + 2.008\,z - 0.5488}\,U(z) + \frac{1}{z^3 - 2.444\,z^2 + 2.008\,z - 0.5488}\,E(z) \qquad (23)$$

using this model, the data for our experiments was generated. fig. 1 shows the system input and output time trajectories. 1000 samples of input-output data are corrupted by 11 outliers (completely wrong measurements). fig. 2 and fig. 3 show the step responses of the identified models together with the step responses of the nominal model, where each identification was done by minimizing a different norm. it is not convenient to use the p = ∞ norm for system parameter identification from data corrupted by outliers – see fig. 4. the euclidean norm of the parameter estimation error can be used to demonstrate the accuracy of the estimation. the norm is

$$J = \|x - x^*\|_2 \qquad (24)$$

where x is the estimate and x* is the vector of the true parameters of the nominal system. fig. 5 shows the dependency of the norm of the estimation error on the p-norm of the criterion (i.e., on p).

fig. 1: data for experiments
fig. 2: results of identification – step responses (left p = 2, right p = 1.7)
fig. 3: results of identification – step responses (left p = 1.3, right p = 1)
fig. 4: results of identification – step response for p = ∞
fig. 5: dependency of the norm of the estimation error on the p-norm (i.e., on p)

6 example ii – predictive control by p-norm minimization

consider the second order continuous-time system described by the transfer function

$$G(s) = \frac{1}{s^2 + 0.7s + 0.93} \qquad (25)$$

that is sampled with a sampling period $T_s = 0.1\,\mathrm{s}$. note that the discrete-time system model was transformed to the state space. in the first experiment, three different mpc controllers were designed:

- the first mpc minimizes the 1-norm. such a minimization problem can be converted to linear programming (see section 3).
- the second mpc minimizes the p-norm where p is considered to be 1 < p < 2. in this case, the minimization problem can be solved by the irls algorithm (see section 3).
- the third case is standard mpc minimizing the quadratic norm of the error vector. this optimization can be solved as a least squares problem or, with constraints, as a quadratic programme.
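the plant (25) and its sampling can be reproduced with standard tools; the sketch below uses zero-order-hold discretization, which is an assumption, since the paper does not state the method used.

```python
import numpy as np
from scipy import signal

# G(s) = 1 / (s^2 + 0.7 s + 0.93), eq. (25)
num, den = [1.0], [1.0, 0.7, 0.93]
Ts = 0.1
A, B, C, D = signal.tf2ss(num, den)          # continuous state space
Ad, Bd, Cd, Dd, _ = signal.cont2discrete((A, B, C, D), Ts, method='zoh')
print(Ad, Bd, sep='\n')                       # discrete model for the MPC design
```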
fig. 6: simulation results for the 1-norm (left) and the 1.1-norm (right), with weighting coefficient r = 1
fig. 7: simulation results for the 1.5-norm (left) and the 2-norm (right), with weighting coefficient r = 1

the only parameter that is different for each mpc is the kind of p-norm. all other parameters are unchanged (prediction horizon N = 4 s and weighting coefficient r = 1). fig. 6 and fig. 7 show the simulation results for the first experiment. the performance of reference tracking and the system input are shown for p = 1, 1.1, 1.5, 2. minimizing the 1-norm of the quality criterion achieves dead beat control.

the second experiment shows the influence of the weighting coefficient r in the criterion (22) on the time length of dead beat control (for the 1-norm). the simulation results for weighting coefficient r = 1 are shown in fig. 6 (left), and for r = 0.1 and r = 10 in fig. 8. the relatively large penalty r does not allow big changes in the differences of the control variable Δu, and therefore the time of dead beat control is longer.

7 conclusions

the purpose of this paper is to show the influence of different p-norm selection on arx model identification and control. when the measurements of the system output were damaged by outliers, it is shown that the euclidean norm gives worse results than the p-norm for 1 ≤ p < 2. the best results were achieved with the p-norm for p = 1. predictive control realized by p-norm minimization of the quality criterion shows interesting results ranging from lq control (linear system and quadratic criterion) to dead beat control.

8 acknowledgment

this work was partly supported by grants 102/05/0903, 405/05/2173 and 102/05/2075 of the grant agency of the czech republic. a shorter version of this paper was presented in [3].

references

[1] ljung, l.: system identification: theory for the user. prentice hall, englewood cliffs, nj, 1987.
[2] aström, k. j., wittenmark, b.: computer controlled systems: theory and design. prentice hall, inc., upper saddle river, nj, 1997.
[3] pekař, j., štecha, j., havlena, v.: "identification and predictive control of arx model by p-norm minimization." proceedings of the 23rd iasted international conference on modelling, identification, and control. anaheim, 2004. isbn 0-88986-387-3.
[4] findeisen, r., imsland, l., allgöwer, f., foss, b. a.: "state and output feedback nonlinear model predictive control: an overview." european journal of control, vol. 9 (2003), p. 190–206.
[5] rao, c. v., rawlings, j. b.: "linear programming and model predictive control." journal of process control, vol. 10 (2000), p. 283–289.
[6] allwright, j. c., papavasiliou, g. c.: "on linear programming and robust model predictive control using impulse response." sys. cont. let., vol. 18 (1992), p. 159–164.
[7] genceli, h., nikolaou, m.: "robust stability analysis of constrained l1-norm model predictive control." aiche j., vol. 39 (1993), no. 12, p. 1954–1965.
[8] chang, t. s., seborg, d. e.: "a linear programming approach for multivariable feedback control with inequality constraints." int. j. control, vol. 37 (1983), p. 583–597.
[9] zadeh, l. a., whalen, j. h.: "on optimal control and linear programming." ire trans. auto. cont., vol. 7 (1962), p. 45–46.
[10] björck, å.: numerical methods for least squares problems. siam, philadelphia, 1996.
[11] van huffel, s., vandewalle, j.: the total least squares problem. computational aspects and analysis. siam, philadelphia, 1991.
[12] boyd, s., vandenberghe, l.: "introduction to convex optimization with engineering applications." lecture notes, stanford university, 2002. available at: isl.stanford.edu/pub/boyd/.

fig. 8: simulation results for the 1-norm with weighting coefficient r = 0.1 (left) and r = 10 (right)

ing. jaroslav pekař, ph.d.
phone: +420 266 052 325
fax: +420 286 890 555
e-mail: jaroslav.pekar@honeywell.com
honeywell prague laboratory
pod vodárenskou věží 4
182 08 prague 8, czech republic

prof. ing. jan štecha, csc.
phone: +420 224 357 238
fax: +420 224 918 646
e-mail: stecha@control.felk.cvut.cz
department of control engineering
czech technical university in prague
karlovo nám. 13
121 35 prague 2, czech republic

physical and numerical difficulties in computer modelling of pellet-cladding contact problems for burned-up fuel

m. dostál, j. zymák, m. valach

the importance of fuel reliability is growing due to the deregulated electricity market and the demands on operability and availability to the electricity grid of nuclear units. under these conditions of fuel exploitation, the problems of pcmi (pellet-cladding mechanical interaction) are very important from the point of view of fuel rod integrity and reliability. severe loading is thermophysically and mechanically expressed as a greater probability of cladding failure, especially during power maneuvering. we have to be able to make a realistic prediction of safety margins, which is very difficult using computer simulation methods. nri (nuclear research institute) has recently been engaged in developing 2d and 3d fem (finite element method) based models dealing with this problem. the latest effort in this field has been to validate 2d r-z models developed in the cosmos/m system against calculations using the femaxi-v code. this paper presents a preliminary comparison between classical fem based integral code calculations and new models that are still under development. the problem has not been definitely solved. the presented data is of a preliminary nature, and several difficult problems remain to be solved.

keywords: pcmi, fem, contact, pellet, cladding.

1 introduction

a necessary condition for extensive development of nuclear energy is to maintain and increase the safety of nuclear fuel. at the same time, there is a need to operate npps (nuclear power plants) economically. the importance of nuclear fuel as a basic part of each npp is more and more apparent, so our work aims at a more correct description of the behaviour of cylindrical fuel rods under more stringent operational conditions, including high burn-up. this implies that modern fuel rod computer models must be able to predict with greater reliability the margin for cladding tube integrity loss under normal operation and operational transients. one of the critical moments that occur during fuel duty is the pellet – cladding contact (and also modified pellet – pellet contact after a thermo-mechanical interaction). the difficulty of modelling the contact event can be shown by a very simple example: one core in one loading of one reactor contains ~10⁴ fuel rods, and each fuel rod contains ~10² fuel pellets. where, when, how and in what detail do we have to predict and model it?

2 mathematical description of the problem – unilateral contact

mathematical modelling of unilateral contact problems is a new area in applied mathematics. the finite element method (fem) is used. the method is based on the fact that the zone of unilateral contact of two solid bodies need not be defined a priori; the correct definition is one of the results of the solution of the problem.
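for orientation, the weak formulation behind this statement is usually written as a variational inequality of the standard textbook type; the following is a generic sketch, not reproduced from this paper (cf. the references cited at the end of this section):

$$\text{find } u \in K: \quad a(u, v - u) \ge L(v - u) \quad \forall v \in K, \qquad K = \{ v : v_n \le g \ \text{on } \Gamma_c \}$$

where $a(\cdot,\cdot)$ is the bilinear form of the elastic deformation energy, $L$ is the work of the external loads, $\Gamma_c$ is the candidate contact boundary, $v_n$ the normal displacement and $g$ the initial gap. the admissible set $K$ is exactly the "closed part" of the function space mentioned in the following paragraph.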
the classical analysis of this problem, started by hertz in 1896, was limited to simple geometries. the age of high-speed computers has brought a qualitative change into the analysis of contact problems. on the basis of suitable discretization, by means of finite differences or fem, the problems can be solved approximately even for complex geometrical situations and boundary conditions. a. signorini originally formulated it as a technical problem in 1933 for the case of unilateral contact of an elastic body with an ideally rigid and smooth base (known as signorini's problem). from the mathematical point of view, the unilateral contact problem is an application of variational inequalities, which today form the basis of convex analysis and make a connection between mathematical physics problems and optimisation problems. unlike classical variational equations, variational inequalities are solved not on the whole function space but on some closed part of it. the problem thus seeks the conditional extreme of the functional of the deformation energy. these problems are nonlinear, due to the existence of material nonlinearities, and also because the set on which the solution is being found can alter during the solution process. the mathematical details can be found, e.g., in [1, 2, 3].

3 systems used for calculation

3.1 cosmos/m

the problem has been solved by means of the cosmos/m [4] software system. it is a complete, modular finite element system, which includes, for instance, modules to solve linear and nonlinear static problems in addition to problems of heat transfer. developing a reliable model capable of predicting the behaviour of structural systems represents one of the most difficult problems to face the analyst. fem provides an effective vehicle for performing these analyses due to its versatility and the great advancement in its adaptation to computer use. the success of a finite element analysis depends largely on how accurately the geometry, the material behaviour and the boundary conditions of the actual problem are defined. one has to take into consideration that all real structures are nonlinear in some way. in addition, the unilateral contact problems are nonlinear due to their optimization substance, as was mentioned above. unilateral contact forms the basis for adequate pellet-cladding modelling, and it was used in all calculated problems.
after the first thermoelastic cases, all further problems of pellet-cladding interaction were solved as nonlinear problems in these ways: geometrical nonlinearity, and the thermal, mostly nonlinear, dependencies of all used material properties. heat transfer with inner sources included was always solved as the initial solution for the subsequent mechanical problem.

3.2 femaxi-v

the second system used was femaxi-v [5] – a japanese light water reactor fuel analysis code. it deals with a single fuel rod and predicts the thermal and mechanical response of a fuel rod to irradiation, including fp (fission product) gas release. the thermal analysis predicts the rod temperature distribution on the basis of pellet heat generation, changes in pellet thermal conductivity and gap thermal conductance, and (transient) change in surface heat transfer to the coolant, using radial one-dimensional geometry. the mechanical analysis performs elastic/plastic, creep and pcmi calculations by fem. the fp gas release model calculates the diffusion of fp gas atoms and their accumulation in bubbles, their release and the increase of internal pressure in the rod. femaxi-v consists of two main parts: one for analyzing the temperature distribution, thermally induced deformation, fp gas release, etc., and the other for analyzing the mechanical behaviour of the fuel rod. in the thermal analysis part, the temperature distribution is calculated as a one-dimensional axisymmetrical problem in the radial direction, and with this temperature, such temperature-dependent values as fp gas release, gap gas flow in the axial direction, and their feedback effects on gap thermal conduction are also calculated. in the mechanical analysis part, users can select either analysis of the entire length of a fuel rod, or analysis of one pellet length. in the former, the axisymmetrical finite element method (fem) is applied to the entire length of the rod; in the latter, the axisymmetrical fem is applied to half the pellet length (for symmetry reasons), and the mechanical interaction between pellet and cladding, i.e. local pcmi, is analysed. in the mechanical analysis, the magnitude of the pellet strain caused by thermal expansion, densification, swelling and relocation is calculated first, and a stiffness equation is formulated with consideration given to cracking, elasticity/plasticity and creep of the pellet. then, the stress and strain of pellet and cladding are calculated by solving the stiffness equation with boundary conditions corresponding to the pellet-cladding contact mode. when pcmi occurs and the state of the pellet-cladding contact changes, the calculation is re-started with the new boundary conditions of contact from the time when the change occurs.

4 validation case postulated in nri

the validity of the new models in cosmos/m is examined by a comparison with femaxi-v. the power history is shown in fig. 1 and covers 6 days: a linear increase of the linear heat rate from 0 to 300 w/cm, constant 300 w/cm, a linear decrease to 150 w/cm, constant 150 w/cm, a linear increase to 300 w/cm, and constant 300 w/cm. one-hour time steps were used in both codes.

fig. 1: linear heat rate vs. time
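for reference, the heat-rate profile can be encoded as a piecewise-linear function; the equal one-day segment durations below are an assumption read off fig. 1, since the text gives only the sequence of ramps and plateaus.

```python
import numpy as np

# Piecewise-linear heat rate q(t) [W/cm] over the 6-day history of fig. 1.
# One day per segment is an assumption; the paper gives only the sequence.
DAY = 24.0  # hours
knots_t = np.array([0, 1, 2, 3, 4, 5, 6]) * DAY
knots_q = np.array([0, 300, 300, 150, 150, 300, 300])

def heat_rate(t_hours):
    return np.interp(t_hours, knots_t, knots_q)

t = np.arange(0, 6 * DAY + 1)   # one-hour steps, as in both codes
q = heat_rate(t)
```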
5 2d cosmos/m results against femaxi-v results

5.1 input data model synchronization

the dimensions of a pellet for this validation case are selected to represent a vver-440 fuel rod: inner diameter 1.4 mm, outer diameter 7.68 mm, height 10.0 mm, and cladding thickness 0.69 mm. a radial gap of 20 μm at room temperature is assumed, considering pellet swelling at a high burn-up. the nonlinear, temperature-dependent material properties of the pellet and cladding were applied from femaxi-v into cosmos/m. the 2d meshing of the pellet and cladding is given in femaxi-v (see fig. 2), so it was rearranged in the same way in cosmos/m. a detail of an 8-node element with four integration points is shown in fig. 3.

5.2 checking the thermal field to achieve comparability

temperature is an important factor in predicting pcmi. the reason is that almost all processes and effects (thermal expansion, creep, etc.) are temperature dependent. the temperature in time follows the linear heat rate. the central temperatures are quite well comparable (see fig. 4 and fig. 5). the peak values are almost the same: 994 °c in femaxi-v and approx. 1245 k (972 °c) in cosmos/m. the pellet outer surface temperature and the cladding inner and outer surface temperatures are also consistent. this is a good base point for further comparison.

fig. 2: mesh of the pellet and cladding in femaxi-v
fig. 3: integration points for a rectangular 8-node element
fig. 4: temperature (°c) vs. time (days) from femaxi-v
fig. 5: temperature (k) vs. time (seconds) from cosmos/m
fig. 6: a) eq. stress at the outer clad. surface, b) eq. stress at the inner clad. surface, c) circ. stress at the outer clad. surface, d) circ. stress at the inner clad. surface

5.3 presentation of the stress-strain field

at a given outer pressure (coolant pressure) acting on the fuel rod, the contact pressure increases the stress in the cladding. reaching the yield stress value can be the primary cause of cladding failure. the critical value is also dependent, e.g., on the fission rate (fluence). as comparable stresses, we chose the circumferential and equivalent (effective) stresses in the most exposed places, where the highest stress due to the "hour glassing" effect of the pellet is expected. in the femaxi-v mesh, this is the node with coordinates axial 6 (iz = 6 in fig. 2) and radial 1 (ir = 1 in fig. 2), i.e., the inner surface of the cladding, and of course also the node axial 6 (iz = 6 in fig. 2) and radial 2 (ir = 2 in fig. 2), i.e., the outer surface of the cladding. these values vs. time can be seen in figs. 6a–6d. the stresses correspond well with the gap size: the gap is closed after about 12–18 hours, from where the stresses show a high increase and reach the peak value (for the equivalent stress at the outer surface of the cladding about 65 mpa), and then they relax during the steady power.
a gap opens with the power decrease, and the stresses fall almost to their initial values. when there is low power (fourth day), the stresses also remain low. the second power increase induces the same behaviour, only with higher peak values. the cosmos/m results are shown in fig. 7. the top line corresponds with the line in fig. 6b, the second top with fig. 6a, the third with fig. 6c, and the fourth with fig. 6d. the difference can be seen on the first day – the initial values are zero, in contrast with femaxi-v. from the contact point, cosmos/m shows the same behaviour as femaxi-v, with higher peak values (85 mpa effective stress at the outer surface). on the second and sixth days relaxation takes place due to the steady power and established contact conditions. during day four, the stresses remain low. a slight numerical instability can also be seen at the beginning of the second day (24th–25th hour). all calculations were made with a contact coefficient of friction of 0.1, and the following questions remain open:

- what is the influence of bonding or sliding with friction on the transient behaviour?
- what are the real (realistic) local friction coefficients for medium and highly burned-up fuel?

6 conclusion

the paper summarizes the preliminary results from a local axisymmetric model developed in the cosmos/m system. for validation we used the results from the femaxi-v integral code. the absolute values of the calculated stresses are higher for cosmos/m, and the stress progress in time differs on the first day of the power history. we were not fully able to define a totally compatible initial modelling state in both codes. we regard this as a weakness, and we will work on a more detailed analysis.

fig. 7: stresses (mpa) from cosmos/m vs. time (seconds)

references

[1] timošenko, š.: strength of materials i., ii. praha: technicko-vědecké vydavatelství, 1951 (in czech).
[2] hlaváček, i., haslinger, j., nečas, j., lovíšek, j.: solution of variational inequalities in mechanics. bratislava: alfa, 1982 (in slovak).
[3] hinton, e., owen, d. r. j.: finite element programming. london: ap, 1977.
[4] cosmos/m 2.8 for windows – online manuals, srac, 2003.
[5] suzuki, m.: femaxi-v manual, japan, jaeri, 2000.

ing. martin dostál
e-mail: m.dostal@post.cz
department of nuclear reactors
czech technical university in prague
faculty of nuclear sciences and physical engineering
v holešovičkách 2
180 00 praha 8, czech republic

nuclear research institute řež plc.
reactor technology department
250 68 husinec-řež 130, czech republic

rndr. jiří zymák, csc.
e-mail: zym@ujv.cz
ing. mojmír valach, csc.
e-mail: val@ujv.cz
nuclear research institute řež plc.
reactor technology department
250 68 husinec-řež 130, czech republic

an improved version of the fluxgate compass module

v. petrucha

satellite based navigation systems (gps) are widely used for ground, air and marine navigation. in the case of a malfunction or satellite signal inaccessibility, some back-up navigation system is needed. an electronic compass can provide this function. the compass module described in this paper is designed for precise navigation purposes. it is equipped with electronic tilt error compensation, and includes everything in one package – electronics with digital output and sensors. a typical application of this compass is in underground drilling. a critical parameter in this application is heading accuracy: a reading error of 1 degree can cause a displacement of 1.8 metres in the target area (length of tunnel 100 m). this is not acceptable in an urban conglomeration, and therefore a more accurate heading sensing device must be used. an improved version of this electronic compass is being finished.

keywords: navigation, azimuth, electronic compass, fluxgate, magnetometer.

1 navigation

satellite navigation systems can determine an absolute position on the earth's surface. an electronic compass must be a part of an inertial navigation system to be able to do the same. the output value from an electronic compass is the azimuth. the azimuth can be calculated using equation (1), where $H_{ey}$ and $H_{ex}$ are the horizontal components of the magnetic vector and D is the declination at the measurement location (see fig. 1).

$$\psi = \arctan\left(\frac{H_{ey}}{H_{ex}}\right) + D \qquad (1)$$
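a minimal sketch of the azimuth computation of equation (1) follows, extended with the pitch/roll compensation anticipated in the next paragraph; the compensation formulas are the usual textbook ones and, together with the function name, are illustrative assumptions, not the module's firmware.

```python
import math

def azimuth_deg(hx, hy, hz, pitch, roll, declination_deg):
    """Tilt-compensated azimuth following eq. (1). Angles in radians,
    declination in degrees. The compensation formulas are standard
    (an assumption; the paper only states that compensation is applied)."""
    # rotate the measured field back into the horizontal plane
    hex_ = hx * math.cos(pitch) + hz * math.sin(pitch)
    hey = (hx * math.sin(roll) * math.sin(pitch) + hy * math.cos(roll)
           - hz * math.sin(roll) * math.cos(pitch))
    psi = math.degrees(math.atan2(hey, hex_)) + declination_deg
    return psi % 360.0
```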
the azimuth calculated using equation (1) is correct only when the magnetic sensors are in the horizontal plane (pitch = 0, roll = 0). this cannot be easily mechanically assured in underground drilling applications. therefore tilt sensors must be introduced into the system. the data from the magnetometers is then mathematically compensated for the actually measured pitch and roll. three mems accelerometers are used as tilt sensors.

2 magnetometer

the main part of the compass is a sensor of the earth's magnetic field. three types of sensors are typically used for geomagnetic field sensing. hall-effect magnetometers are used in applications where the cost of the sensor, its dimensions and power consumption are critical, e.g. watches and mobile phones. an amr sensor (honeywell hmc1001) offers higher accuracy, but it is still very difficult to achieve the desired error limits (< 0.5 degree). a fluxgate sensor is the best choice for applications where accuracy is the most critical parameter. the miniature pcb fluxgate sensors used in this compass have smaller dimensions (34×16×1.2 mm), lower power consumption (important for a battery operated device) and a lower price (in the case of mass production). three types of pcb fluxgate sensors are shown in fig. 2 (type a on the left with an excitation coil around the whole core, type b on the right with an excitation coil only on the two sides, type c where the excitation coil is equally distributed around the whole core with a higher count of compensation coil turns). this difference in sensor excitation coil distribution has a considerable impact on the sensor properties. a type a sensor, with the excitation coil equally distributed, has a lower non-linearity error. however, a higher compensation current is needed because of the lower count of sensing-compensation coil turns. a type b sensor is used in the compass module, as the lower compensation current (for the earth's magnetic field it is 28 ma) means an easier design of the compensation loop. a common operational amplifier can supply such a current.

fig. 1: azimuth, heading direction and magnetic vector components
fig. 2: three types of pcb fluxgate sensors developed at ctu. type a on the left, type b on the right, and c at the bottom

a type c sensor has been developed, which combines the best from type a and type b.
this sensor, with a low non-linearity error and a small compensation current, is used in the new version of the electronic compass.

2.1 magnetometer electronics

fluxgate sensors are usually excited using a sine wave signal. in order to suppress the power consumption, pulse excitation is used. the typical signal evaluation electronics for sine wave excitation is the second harmonic detector. when operating with pulse excitation, some other type of evaluation has to be used; a differential switched integrator seems to be a good choice [1]. various excitation signal patterns were tested (fig. 3). the excitation frequency is constant (10 khz), while the pulse width (e.g. 12 %) and the phase between the excitation and evaluation sync signals are changed. a three-channel pcb fluxgate magnetometer is used in this compass. the construction of such a device is a demanding challenge. at least a four-layer pcb must be used for the three-channel magnetometer evaluation electronics. because of the unavailability of smd components (resistors with a low temperature coefficient, high quality capacitors), mixed components were used. the second version is constructed entirely with smt; only three connectors are "through hole" types. another problem is the temperature bias stability of the pcb fluxgate sensors used here. the compass consumes less than two watts of electric energy, but this still causes considerable self-heating. by using resistors with a very small temperature coefficient (±3 ppm) as a current to voltage converter (the magnetometers operate in a closed loop) and a stable voltage reference (±3 ppm), the temperature stability is mainly influenced by the sensors themselves. fig. 4 shows the temperature stability of the magnetometers. simple measuring equipment was used to test the pcb fluxgate magnetometer linearity. the test field (±55000 nt) was generated with a helmholtz coil driven by a power supply with an ieee488 interface. the magnetometer output voltage, digitized by the adc of the compass, was sent to the computer and processed with ms excel. the measured non-linearity depends on the actual environmental conditions (presence of magnetic disturbance). a typical non-linearity value was 0.05 % of full scale (see fig. 5).

2.2 compass electronics

the whole system consists of three magnetometers, three accelerometers, six delta-sigma adcs (ads1210) and two microcontrollers (atmel attiny2313) – the first one is used for data acquisition and communicates through a serial line with the master system (e.g. a pc), and the second is used in an excitation unit. an improved version contains an atmega16 in the adc module (the larger flash memory allows easier programming using the c language). the output voltages from the accelerometers and magnetometers must be converted simultaneously, so six single adcs are used. higher resolution can also be achieved with this configuration compared to a single adc with a sampling unit. colibrys ms7202 mems capacitive accelerometers with a low magnetic housing are used. the ms7202 has a sensitivity of 500 mv/g with a range of ±2 g, a bias temperature coefficient of 400 μg/°c max. (scale factor temperature coefficient 100 ppm/°c typ.) and a white noise spectral density of 36 μg/√hz. a battery operated power supply with a usb/serial converter was used for field testing.

3 mechanical assembly

the compass case is made from a fibreglass-filled plastic rod (length 230 mm, diameter 48 mm). this material ensures the solidity of the sensor placement, and its temperature deformation is very low (this is very important, because any change in sensor alignment increases the azimuth error).

fig. 3: from top to bottom: excitation current (600 ma p-p), synchronization signals for the evaluation electronics, sensor output response (b ~ 40 μt)
fig. 4: pcb fluxgate magnetometer offset temperature drift (0.9 nt/°c) measured in a six-layer shielding
fig. 5: pcb fluxgate magnetometer linearity error (channel x, 0.036 % f.s.)
this material ensures solidity of sensor placement and its temperature deformation is very low (this is very important, because any change in © czech technical university publishing house http://ctn.cvut.cz/ap/ 19 acta polytechnica vol. 47 no. 4–5/2007 fig. 3: from top to bottom: excitation current (600 map-p), synchronization signals for evaluation electronics, sensor output response (b ~ 40 �t) fig. 4: pcb fluxgate magnetometer offset temperature drift (0.9 nt/ °c) measured in a six-layer shielding -60 -40 -20 0 20 40 -60000 -40000 -20000 0 20000 40000 60000 b [nt] e [n t ] fig. 5: pcb fluxgate magnetometer linearity error (channel x, 0.036 % f.s.) sensor alignment extends the azimuth error). the rod was cut longitudinally into two pieces. space for the electronics and sensors was then made in both pieces with the use of a milling machine. the sensors installed in the compass case are shown in fig. 6. all sensors are fixed with non-magnetic screws and hot glue. the electronic parts are mounted on four pcbs. all pcbs are four layered and are mounted one above the other in the center of the compass case. all components used in the compass should be made of non-magnetic materials (the presence of soft ferromagnetic materials has an unpredictable influence on the earth’s magnetic field, and causes considerable errors in azimuth estimation). the complete system ready for testing is shown in fig. 7. 4 calibration calibration of such a system is essential in order to achieve the desired accuracy. sensitivities, offsets and orthogonality errors can be calibrated using a process called scalar calibration. the three sensors (magnetometers or accelerometers) must be orthogonally placed in order to be able to properly sense the vector of the appropriate field. imperfections in mechanical placing cause the angle between two sensors not to be 90 degrees. a typical orthogonality error can be up to 5 degrees for single axis sensors and up to 1 degree for dual or triple axis sensors (depending on the actual inner sensor construction – one die or multiple dies in one package). scalar calibration is based on the presumption of a stable and homogeneous field. data from sensors is taken in various (random) positions of a sensor triplet. in the case of magnetometers this can be performed during smooth slow motion. this is not suitable for accelerometers, because the sensor output may be influenced by dynamic acceleration. the acquired data is then processed by an iteration algorithm. the purpose of the iteration algorithm is to find nine coefficients (three sensitivities, three offsets and three misalingment angles). the algorithm minimizes the variance of the total field (erms), which is computed using equations (2), (3), (4) and (5). fmeas-x is the value measured by sensor x, fkx is sensitivity, fox is sensor offset, a11–a33 are coeficients that represent the non-orthogonality of the sensor triplet, and fmeas is the total field value measured by a scalar magnetometer (overhauser magnetometer). f f f f f f t temp-x temp-y temp-z meas-x meas-y me � � � � � � � � � � as-z x y z ox oy o � � � � � � � � � � � � � � � � � � � t k k k f f f f f f 0 0 0 0 0 0 z � � � � � � � � � t , (2) f f f a a a a a a a a a x y z � � � � � � � � � � � � � � �11 12 13 21 22 23 31 32 33 � � � � � � � � � � � � � � f f f temp-x temp-y temp-z , (3) f f f f� � �x y z 2 2 2 , (4) � �e nf f i n rms meas� � �11 2 1 . (5) the operation of the iteration algorithm is shown in fig. 8. 
fig. 6: three pcb fluxgate sensors (left picture) and three mems accelerometers (right picture) placed in the compass case
fig. 7: compass module on a non-magnetic tripod. on the table is a module with accumulators and a usb/serial converter
fig. 8: iteration algorithm – one step of the iteration of one parameter

4.1 calibrating the misalignment of sensor triplets and the compass case

after scalar calibration has been done, we have an ideal vector magnetometer and accelerometer. the azimuth is measured with reference to the compass body. the sensitivity axes of the sensor triplets should coincide with the compass case axis. in real conditions, the angles α, β and γ (fig. 10) are non-zero, and this misalignment must be calibrated, otherwise a significant error in azimuth estimation can occur. for this calibration, a stable non-magnetic platform is needed (e.g. a tripod). the compass module is then rotated about one axis. the output value (corresponding to the rotation axis) should be constant during this motion. in the real case the output will change sinusoidally. an iteration algorithm is used to minimize this change in all axes.

5 software equipment

the software equipment (created in labwindows/cvi ver. 5.5) is also very important. the software for the supervising pc consists of three basic parts: data acquisition is the first part, followed by processing with a calibration algorithm [2] and measurement with the calibrated device, see fig. 11. azimuth, pitch and roll are the main output values. the total magnetic field f can be used to check for the presence of a magnetic disturbance at the place of measurement; otherwise, a significant error in azimuth estimation could remain hidden.

6 conclusion

a compass module suitable for underground drilling was constructed. the azimuth error is ±0.5° after all calibration procedures have been done. the technical parameters of the current version of the compass module are summarized in table 1. a new version is being finished, and not all measurements have yet been made. the new design addresses the drawback of the currently used pcb fluxgate sensors, i.e., the influence of the electronics on the magnetic sensors. it was anticipated that the mechanical dimensions of the new version would be slightly increased and the space between the fluxgate sensors and the rest of the compass would be wider. however, there were difficulties in manufacturing the case, and these have not yet been overcome.
the main advantages of the new version are briefly mentioned here:

- lower power consumption (30 % reduction)
- elimination of current loops
- a lower amount of ferromagnetic materials contained in the electronics of the compass
- improved electromechanical construction (smds)
- better software for the control cpu (invariable adc setting)
- the type c fluxgate sensor (better linearity of the magnetometers)

the last part of the work consists in completing the software equipment and the calibration of the device.

fig. 9: total field before and after scalar calibration
fig. 10: misalignment of the sensor triplet sensitivity axis and the compass case
fig. 11: pc software – output data presentation: azimuth, roll, pitch, total magnetic field

table 1: compass parameters summarized
dimensions (length, diameter): 230 mm, 48 mm
power consumption (power supply voltage ±12 v): < 2 w (< 1.4 w in the new version)
azimuth accuracy (after calibration): ±0.5°

references

[1] kubík, j., janošek, m., ripka, p., včelák, j.: low-power fluxgate signal processing using gated differential integrator. in: emsa'06 – 6th european magnetic sensors and actuators conference, bilbao, 2006, p. 132.
[2] včelák, j., ripka, p., kubík, j., platil, a., kašpar, p.: amr navigation systems and methods of their calibration. sensors and actuators a: physical, vol. 123–124 (2005), p. 122–128.

ing. vojtěch petrucha
e-mail: petruv1@fel.cvut.cz
department of measurement
czech technical university in prague
faculty of electrical engineering
technická 2
166 27 prague 6, czech republic

flow visualisation by condensing steam – an unusual method applied to development of a low reynolds number fluidic selector valve

j. r. tippetts, v. tesař

a visualization method so far not mentioned in the literature has been recently developed by the authors as a useful validation supplement to numerical flowfield computations in the design of microfluidic devices. the method is based upon water vapour condensation on device channel walls. it is extremely easy to set up with minimum expense – and yet it is very reliable. as an application example, the paper shows the method used in a study of the properties of a microfluidic valve intended for switching gaseous sample flows in a microfluidic selector sampling unit. a scaled-up model of the valve was built, as usual, in transparent acrylic material, making possible observation and photo-recording of the deposition and subsequent drying of the condensed droplets. the scaling-up slowed down the time scale enough for investigating the transition processes which take place as the flow in the valve is switched on and off.

keywords: flow visualization, condensing steam, microfluidics, valves.

1 introduction

visualization of fluid flow [3], [4] is an important research tool for obtaining information about the mechanisms of flows, and also a validation and confirmation tool for developing hydraulic and pneumatic devices or – to use a more general term – fluidic devices [5]. in internal flow problems, visualisation usually requires making a model of the device with transparent walls, but this is rarely a problem, since the commonly used transparent acrylic material (polymethylmetacrylate) is cheap, sufficiently soft for convenient and fast machining, and stiff enough to resist the acting mechanical forces. the main problem is that the common fluids used in laboratory investigations – water and air – are transparent and require an additional facility to generate observable effects. for many purposes, this causes inconvenient complications, and there is a general demand for visualisation facilities which would be simple and inexpensive. the problem is less severe in the case of water, which may be relatively simply coloured using a dye. for air flows, an analogous addition of smoke has been tried, but the smoke is inconvenient to generate and usually not well observable in narrow channels (where there cannot be enough smoke for significant light extinction). also, the "smoke" – actually droplets of decomposition products – tends to condense in the channels, generating a non-transparent layer on the walls which obstructs the observation and requires repeated cleaning of the model. a new method described in the present paper is used for studying the air flow in the channels of acrylic models. it uses condensation of water vapour on the channel walls. the patterns of the regions with condensed water droplets are easily observable.
since the vapour deposition depends on the local flow velocity (and the local shear stress on the wall), and the droplets also tend to be removed from high-velocity locations (either because they evaporate or because they are moved by the shear stress), the condensation pattern provides a useful picture of the conditions inside the channel. the method is extremely simple and cheap. somewhat surprisingly, there does not seem to be any description of this method in authoritative monographs (e.g. [3, 4]) on flow visualization methods.

2 the task

the present authors find flow visualization indispensable for their current development of no-moving-part fluidic devices, in particular flow control valves. despite the convenience of cfd numerical flowfield solutions as a tool for designing the internal geometry of the valve, experience shows that they cannot be relied upon without verification experiments. this may sound rather surprising in the context of microfluidics, with devices usually operated at low reynolds numbers. indeed, microfluidic flows are usually laminar, and the numerical solution is thus spared the complexities and intricacy of turbulence modeling, the usual source of unreliability and troubles. this, however, is not valid universally – there are microfluidic devices with fully turbulent flows. moreover, even at moderate and low reynolds numbers the character of the flow may be complicated by vortices which the computations – especially the usual steady-state computations – may not always handle properly. the developed valves are based on the flow diverting principle. the supplied flow forms a jet deflected by the action of the control flow. in particular, the described novel visualisation method was developed and used in verification experiments with pressure-driven valves intended for applications in fluid sample selectors [2, 8] for delivering samples to a chemical composition analyser. the need for sample purity results in a very special requirement – the generation of a jet pumping effect by the control flow. the pumping reverses the flow in the downstream cavities, and this removes any remains of a sample that may linger in the conduits and other volumes connecting the valve with the composition analyzer further downstream. achieving a sufficiently strong entrainment into the jet called for a control jet reynolds number much higher than is usual in microfluidics, reaching values of the order of re ≈ 1000. this is still mostly within the laminar regime, but the flows become dominated by quite strong vortices, actually welcome for generating an effective entrainment effect in the jet.
apart from the desirable vortices entraining fluid in the jet, the geometry gave rise to various other vortices, more or less standing inside various corners. the problem with the cfd solution unreliability was traced down to the software's improper handling of these vortices (tesař et al., 2004 – [2, 12]). even the most sophisticated software versions – large-eddy simulations – tend to compute the vortices as steady, while in reality (as documented by the flow visualizations) they are shed and carried away with the flow. other vortices are then formed anew in their place. the energy for their formation is extracted from the main flow. as a result, this was found to lead to seriously overpredicted diffuser effects and underpredicted losses [12]. the disappointing experience with the initial cfd results led to an altered design procedure in which the computations are validated by laboratory tests using large scale acrylic (perspex) models. one attractive feature of the scaled-up models is the significantly slower, easier to investigate switching frequency in the model. this is due to the fact that the scaling factor of the switching is the stokes number

$$Sk = \frac{f\,d^2}{\nu} \qquad (1)$$

when increasing the size d of the model – as the authors did – five times relative to the actual microvalve, hydrodynamic similarity (equal Sk) requires switching at a 25-times lower frequency f if the kinematic viscosity ν remains the same. as a matter of fact, the viscosity of our model fluid is lower. the valve is designed for switching "syngas" – synthesis gas containing a substantial proportion of a high-viscosity h₂ component, at a high temperature (400 k) – while our model experiments are run with much cooler air, having a roughly 3-times lower viscosity. as a result, the switching frequency for the same stokes number in the model is about 75-times lower. this makes the switching processes in the model easily observable – and recordable by a standard camcorder. requirements concerning the resolution power of the flow visualization method in these tests are certainly undemanding. we need not study fine details of the flowfield. the interest is mainly in detecting into which channel the sample flow passes in response to the admission of the control flow. standard, expensive gas flow visualization methods, aiming at revealing the fine points of the flowfield, are unnecessary. after all, at a 5:1 scale the model channel widths are still rather small, of the order of millimetres, which is certainly not enough for any elaborate study of the flow details. on the other hand, with the visualization having in our case a more or less supplementary role, we need the method to be inexpensive, readily available and reliable.
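the frequency scaling implied by eq. (1) is easily checked numerically; the viscosity ratio of 1/3 is the rough figure quoted in the text.

```python
# Frequency scaling from equal Stokes number Sk = f d^2 / nu, eq. (1).
# Matching Sk between prototype (p) and model (m) gives
#   f_m = f_p * (d_p / d_m)**2 * (nu_m / nu_p)
scale = 5.0           # model is 5x larger: d_m = 5 d_p
nu_ratio = 1.0 / 3.0  # model air roughly 3x less viscous than hot syngas
factor = (1.0 / scale) ** 2 * nu_ratio
print(1.0 / factor)   # ~75: model switching is about 75x slower
```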
requirements concerning the resolution power of the flow visualization method in these tests are certainly undemanding. we need not study fine details of the flowfield. the interest is mainly in detecting into which channel the sample flow passes in response to the admission of the control flow. standard, expensive gas flow visualization methods, aiming at revealing the fine points of the flowfield, are unnecessary. after all, on a 5:1 scale the model channel widths are still rather small, of the order of millimetres, which is certainly not enough for any elaborate study of the flow details. on the other hand, with the visualization having in our case a more or less supplementary role, we need the method to be inexpensive, readily available and reliable.

the standard gas flow visualization method, which we could have used but decided not to, is the well-known “smoke” method (e.g. merzkirch 1974 [3]). the gas motion is made visible by particles of “oil smoke” carried with the flow. in fact the particles are formed by condensing oil vapours. for our purpose, this approach has several unpleasant disadvantages. setting up an efficient “burning” (in fact mainly evaporation) of the very small, properly dosed oil flow so as to obtain a reasonably constant vapour concentration is by no means easy. good results certainly require a professionally built facility, which is not exactly cheap. another problem is the fouling of the model channels – sedimentation of the condensed particles on the walls – producing a difficult-to-remove layer that gradually obscures the view into the channel. what we needed was a simple, cheap and easy-to-set-up method leaving no permanent deposits on the walls. similar requirements may arise in many other situations, where our experience may be found useful.

3 condensing steam method

the inspiration for the development of the new method came from observations of colleagues who tested the behaviour of the newly delivered model by blowing into its terminal inlets with their mouths. though extremely crude, this method enabled some observation of the switching action, because the humidity contained in human breath was seen to condense on the walls of the model channels through which the breath passed, revealing a clearly visible gas flowpath inside the model.

a more permanent source of moist air was built according to fig. 1. the air flow rate is measured upstream, where the air is still dry. the moisture content is then increased by bubbling the air through water in a container, easily improvised from a plastic soft-drink bottle. we used a standard pet bottle of 1.5 litre volume. of course, the bottle has to be closed tightly to prevent leakage of the air. for the air inflow and outflow, the bottle plug was adapted by drilling two holes through it, with two different lengths of plastic tubing inserted and glued into the plug. one of them reaches deep below the water level, while the end of the other one remains above the water.

fig. 1: schematic arrangement of the humid air source. the only component needed is any suitable bottle filled with hot water. the authors used an empty coca cola bottle, filled with water heated in the office tea kettle.

the saturation with moisture was found to improve significantly if the water temperature was increased. it is easy to fill the bottle with hot, indeed boiling, water. we warmed it simply in the office tea kettle. for protracted experiments the bottle was provided with polystyrene thermal insulation (using the polystyrene inserts from the boxes in which computers are delivered). it is also advisable (though not seen in our setup shown in fig. 6) to provide a similar thermal insulating layer for the feed pipe leading to the model (as shown in fig. 1). it is useful to keep this pipe short (the length of our 5 mm i.d. pipe was 100 mm). even with the complicating feature of heated water, anyone can improvise the moist air source according to fig. 1 in a few minutes. the cost is virtually zero.

the advantage of water condensation is that the generated droplets are easily removed from the surfaces after the visualization experiment by simply blowing dry air through the model. the water is in effect distilled and leaves no fouling residua.
although the water vapour condensation in the air flow inside the cavities is also observable, we relied mainly on the clearly visible layer of water droplets created by condensation on the cool walls of the model channels. the layer being white, it is useful to improve the observation by placing a contrasting black surface behind the model (fig. 6). the model may be cooled to enhance the condensation – we tested cooling it by tap water or by the air flow from a small blower – but this is not strictly necessary. we used pressure cylinders as the source of our control flow and found particularly well-observable condensation produced by the collision of the warm moist air flow from the valve supply nozzle with the control flow, which is cooled by the expansion from the cylinder prior to entry into the model. the condensed water droplets on the surfaces are very small and are removed quite fast if exposed to dry control air. the changes in the optical properties of the condensed layer – the droplet deposition as well as removal by drying – are fast enough for investigations of the switching processes in the valve.

of course, in the variant utilizing condensation on the walls, the method is particularly suitable for investigating the behaviour of fluidic circuits, with emphasis on the switching processes and on following the flowpaths. it is obviously less useful for a study of the internal flowfield inside the channel, though it may indicate some interesting facts, such as the boundaries between the moist and dry gas flows if they share the same channel. it can also provide some information about the orientation and magnitude of the wall shear stress.

4 application of the method

4.1 object of investigations

although of quite wide applicability, the present method was developed and tested in studies of a microfluidic valve. the valve is a successor to several earlier designs (tesař [7, 8]; tesař et al. [6]). it is intended to form a part of a 16-channel fluidic flow selector sampling unit to be used for catalyst testing (low et al., 2001 [1]; wilkin et al., 2002 [10]). the details of its layout and its properties are not discussed here, as they are available elsewhere (tesař et al., 2004 [2]). the valve is of planar arrangement, with relatively shallow cavities (the cavity contours are shown in fig. 4) of everywhere constant depth. it is operated in two principal states: “open” and “closed”. in the sampling unit, all but one of the valves are in their “closed” state while the only remaining one is “open”, permitting the flow of the corresponding fluid sample to a composition analyzer.

in the “open” state, shown in the schematic representation of fig. 2, there is no control flow. the sample flow passes through the valve from the supply terminal s to the output terminal y. the hydrodynamic conditions in the valve are determined by the very small sample supply flow rate, rendering the reynolds number very low – its value is only around re ≈ 88, evaluated from the conditions in the supply nozzle. this is too small for efficient driving of the sample flow into the connected load (analyzer). instead, the sample is driven there by the applied constant pressure difference Δpyv. this is applied between the mutually interconnected vents v and the output terminals y of all valves forming the sampling unit. to prevent contamination of the sample by the fluid mixture from the vent, the pressure is adjusted so as to generate a small spillover “guard” flow into the vent v.
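the order of magnitude of the quoted re ≈ 88 is easy to reproduce from the nozzle conditions. the sketch below is illustrative only – the flow rate and the slit dimensions are hypothetical placeholders, not the actual valve geometry – and evaluates re from the mean exit velocity and the nozzle width.

def nozzle_reynolds(q, width, depth, nu):
    """re = u * width / nu, with u the mean velocity in a rectangular slit."""
    u = q / (width * depth)   # mean exit velocity [m/s]
    return u * width / nu

q     = 2.0e-6   # volume flow rate [m^3/s]            (hypothetical)
width = 1.0e-3   # nozzle slit width [m]               (hypothetical)
depth = 1.5e-3   # constant cavity depth [m]           (hypothetical)
nu    = 1.5e-5   # kinematic viscosity of air [m^2/s]
print(nozzle_reynolds(q, width, depth, nu))   # about 89, the order of re ≈ 88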
the “closed” state, represented schematically in fig. 3, is brought about by the admission of the control flow to the control terminal x. the sample flow is not actually turned down. as shown in fig. 3, it is instead prevented from entering the valve output y by being diverted into the vent v.

fig. 2: schematic diagram of the valve. black triangles: nozzles, white triangles: collectors. in the open state shown here, the sample flow from supply terminal s passes into output y. a small percentage is spilled over as the protective “guard” flow into vent v.

fig. 3: schematic diagram of the valve in the closed state. to clean the cavities downstream from y, the powerful control flow from x generates a reverse output flow in the jet pump part of the valve.

the control flow actually meets the supply flow already slowed down in the diffuser (note the white triangle symbol of the diffuser in fig. 3) of the jet pump collector, so that the diverting effect is not so much due to the jet momentum interactions used in large-scale fluidic devices (e.g., tesař, 1998 [5]) as it is due to the fact that the entrance into the output channel is obstructed by the presence of the control fluid. the control jet leaving the control nozzle is here rather extraordinarily powerful. its reynolds number is as high as re ≈ 1000, a value out of proportion in usual microfluidics, but required here for the formation of vortical motions in the control jet. these produce quite effective entrainment of the fluid from the output channel, thus generating a return flow in the output terminal y, required for cleaning the downstream cavities between the valve and the analyzer. it is in this “closed” state – and in the transitions between the “open” and “closed” regimes – that the problems with cfd unreliability occur.

4.2 visualisation experiments

the enigmatic disagreements between the computations and the experimental data [12] were investigated using 5:1 scaled-up models of the valves. the interaction cavities, as shown in fig. 4, were made by laser cutting in a perspex plate. the plate is clamped between the top and bottom cover plates. the latter are quite thick (they may be seen in fig. 6) and hold the tube connection ferrules of the four terminals of the valve. of these, three terminals (s, x, and y) are in the top plate (fig. 6), unfortunately somewhat obstructing the observation in figs. 5 and 7 (the valves were viewed from the top during the experimental runs). it should be said that the models were intended for working purposes, to obtain information, not to generate nice visualization pictures. nevertheless, despite the somewhat cluttered appearance of the pictures taken during the runs, the essential information – the presence or absence of the gas flow in the individual parts of the valve – is clearly recognizable.

by way of example, fig. 7 presents a photograph of the model in the “open” state. the humid air entering from the supply terminal s leaves well-observable traces on its way into the output terminal y. it is not only possible to trace the flow in the channels; in the large cavity area downstream from the supply channel it is also possible to see a part of the supplied gas spilling over into the vent v.
this shows the desirable generation of the small spillover “guard” flow, which eliminates the possibility of the uncontrolled fluid mixture in the common vents getting into contact with the sample. though a back flow from the vent v is not very likely, its existence cannot be ruled out in the absence of any mechanical separation (which would rule out such a possibility in mechanically closing valves). the “guard” flow represents a sacrifice of a part of the sample and should therefore not be higher than strictly necessary. this is adjusted by a proper choice of the applied driving pressure difference Δpyv. observing the amount of the flow sacrificed in situations like that shown in fig. 7 is a useful aid for the adjustment.

fig. 4: contours of the valve cavities. the layout is planar and is dominated by what is in effect an incorporated jet pump, driven by the control flow.

fig. 5: photograph of the scaled-up acrylic model of the valve as used in the laboratory flow visualisation experiments using the steam condensation method – some condensation is visible in the channels.

fig. 6: oblique view of the acrylic model, with the top of the bottle filled with hot water seen in the foreground. partly visible in the background are another model and its bottle – the tests were performed with the whole multi-valve selector circuit.

fig. 8 presents a sequence of the flow visualization photographs taken during the transition from the “open” state (no control flow) into the “closed” state. it should be noted how the condensation in the path of the control flow in the jet pump part of the valve gradually clears away. this indicates the inaccessibility of the path to the valve output y for the humid supplied gas once the control flow is switched on. after a mere 4 seconds, the switching into the “closed” state is completed. in view of the stokes number similarity, eq. (1) above, this represents a very short transition in the actual valve.

an effect which deserves pointing out is the much slower but demonstrably present gradual clearing of the condensation in the output channel, between the jet pump part of the valve and the output terminal y. this manifests the presence of the reverse flow, cf. fig. 3, generated by the jet pumping. this is an important feature required for maintaining sample purity. the fact that this reverse flow is smaller – so that the condensation removal took 30 seconds in the scaled-up experiment – is actually quite desirable. the generation of the reverse flow takes place in the “closed” state, and since in the sampling unit a large number of the valves are simultaneously in this “closed” state, the combined loss of sample resulting from all the protective reverse flows may easily represent a substantial proportion of the available sample. we must keep it small, and this is achieved by generating only rather weak jet pumping – well within the capability of the essentially laminar control jet.

5 conclusions

in the design of fluidic devices similar to our microfluidic valve, flow visualization is a useful and in fact indispensable validation tool accompanying the numerical flowfield computations. because of the auxiliary nature of the visualization and the lack of interest in revealing details of the flow, the main requirement on the flow visualization methods for this purpose is simplicity and low cost.
the steam condensation method, described in this paper, fulfills this requirement almost ideally.

6 acknowledgment

the authors gratefully acknowledge financial support from iac – the institute of applied catalysis, united kingdom.

fig. 7: visualised flow in the model valve in its open state. clearly recognisable due to the condensation is the spillover flow into vent v. its magnitude can be estimated.

fig. 8: sequence of video records showing the transition from the open state into the closed state by the action of the control flow. the control air removes the condensed water droplets.

references
[1] low, y. y., tesař, v., tippetts, j. r., allen, r. w. k., pitt, m.: “multichannel catalyst testing reactor with microfluidic flow control.” 3rd european congress of chemical engineering, nuremberg, germany, 2001.
[2] tesař, v., tippetts, j. r., low, y. y., allen, r. w. k.: “development of a microfluidic unit for sequencing fluid samples for composition analysis.” chemical engineering research and design, vol. 82 (a6) (2004), p. 708–718, transactions of the institution of chemical engineers, u.k., part a, june 2004.
[3] merzkirch, w.: flow visualization. academic press, new york, 1974.
[4] řezníček, r.: visualisace proudění. (flow visualization, in czech), academia publishing house of čsav, praha, 1972.
[5] tesař, v.: “valvole fluidiche senza parti mobili.” (no-moving-part fluidic valves, in italian), oleodinamica – pneumatica, rivista delle applicazioni fluidodinamiche e controllo dei sistemi, vol. 39 (1998), no. 3, issn 1122-5017, p. 216.
[6] tesař, v., low, y. y., allen, r. w. k., tippetts, j. r.: “microfluidics for mems – microfluidic valve.” 2001 picast iv – proceedings of the 4th pacific international conference on aerospace science and technology, kaohsiung, taiwan, p. 301–306, publ. by national cheng kung university, tainan, taiwan, 2001.
[7] tesař, v.: “microfluidic valves for flow control at low reynolds numbers.” journal of visualisation, vol. 4 (2001), no. 1, p. 51–60, tokyo.
[8] tesař, v.: “sampling by fluidics and microfluidics.” acta polytechnica – journal of advanced engineering, vol. 42 (2002), no. 2, prague, p. 41–49.
[9] tesař, v.: “subdynamické asymptotické chování mikrofluidického ventilu.” (subdynamic asymptotic behaviour of a microfluidic valve, in czech), automatizace, issn 0005-125x, vol. 45 (2002), no. 12, p. 766–770, prague.
[10] wilkin, o. m., allen, r. w. k., maitlis, p. m., tippetts, j. r., tesař, v., turner, m. l., haynes, a., pitt, m. j., low, y. y., sowerby, b.: “high throughput testing of catalysts for the hydrogenation of carbon monoxide to ethanol.” in: derouane, e. g. et al.: principles and methods for accelerated catalyst design and testing, p. 299–303, isbn 1402007205, kluwer academic publishers, netherlands, 2002.
[11] tesař, v.: “microfluidic turn-down valve.” journal of visualisation, vol. 5 (2002), no. 3, p. 301–307, tokyo.
[12] tesař, v., tippetts, j. r., allen, r. w. k.: “failure of steady cfd solutions caused by vortex shedding.” developments in machinery design and control, vol. 3, ed. k. peszynski, isbn 83-89334-85-2, p. 87–94, poland, 2004.

dr. j. r. tippetts, ph.d.
e-mail: j.tippetts@sheffield.ac.uk
university of sheffield
mappin street
s1 3jd sheffield, uk

prof. ing. václav tesař, csc.
e-mail: v.tesar@sheffield.ac.uk
university of sheffield
mappin street
s1 3jd sheffield, uk

av čr
dolejškova 5
182 00 praha 8, czech republic
1 introduction

in physics there are usually one or more free parameters which specify some properties of the system. aside from trivial parametric dependences that can be determined directly, e.g., by a change of units, variation in the values of some parameters can qualitatively affect the system's behaviour. in quantum physics, where almost all physical properties are related to spectral properties of a chosen set of operators (in particular the hamiltonian), qualitative changes of spectra due to varying parameters of the hamiltonian are of prominent importance.

pt-symmetry, which was introduced as a concept generalising the usually demanded self-adjointness of the hamiltonian, does not per se provide the reality of its spectrum and consequently of the energies. even if the spectrum is real, it need not be so for all values of the considered parameters. therefore determining the regions in the parametric space where the spectrum's reality holds is a key step in finding the limits of the physical significance of any pt-symmetric theory.(1)

most attention has been concentrated on discrete spectra, for the following reasons. first, it is usually more convenient (at least for mathematically less careful and less rigorous physicists) to deal with isolated eigenvalues than with the continuous part of the spectrum, where all manipulations become in a way more mathematically intricate. second, there are many systems whose spectrum is fully discrete and others where the essential spectrum is insensitive to parametric change. the points of the eigenvalues' complexification are thus the investigated limits of physical relevance. after examining many of the “classical” examples of pt-symmetric systems an apparently regular pattern was observed – a square-root singularity structure and a jordan-block degeneracy.
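the kind of parametric survey described above is straightforward to carry out numerically. the following minimal sketch scans the parameter of a toy two-level pt-symmetric family of the linear form h0 + c·v used in section 2 below, and records where the eigenvalues first leave the real axis; the matrices are an illustrative choice, not a system from this paper, and the exact exceptional point of this family is at c = 1.

# scan a parametric family h(c) = h0 + c*v and locate the loss of
# spectral reality (the exceptional point) by brute force.
import numpy as np

h0 = np.array([[0.0, 1.0], [1.0, 0.0]], dtype=complex)  # real symmetric part
v  = np.array([[1.0j, 0.0], [0.0, -1.0j]])              # imaginary pt part

def spectrum_is_real(c, tol=1e-6):
    eigenvalues = np.linalg.eigvals(h0 + c * v)
    return bool(np.all(np.abs(eigenvalues.imag) < tol))

cs = np.linspace(0.0, 2.0, 2001)
first_complex = next(c for c in cs if not spectrum_is_real(c))
print(first_complex)  # close to 1.0; the eigenvalues are +-sqrt(1 - c**2)

near the point found by the scan the two levels approach each other and merge – the pattern analysed in detail below.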
though this is beyond the scope of this article, it may be useful for the reader to consider one more application of studying complexification and its properties. complex (albeit not necessarily pt-symmetric) hamiltonians also arise when the parameters of hermitian systems are extended into the complex domain. the structure of the spectrum, and especially the distribution of the exceptional points (which are, in this case, up to a slight generalisation, the points of the spectrum's complexification), then influences the phase transitions and the chaotic behaviour of many systems; see e.g. [2] or [3] (and refs. therein). outside of quantum mechanics, pt-symmetric hamiltonians are of use in the framework of magnetohydrodynamics. here again the exceptional points and level crossings (we will discuss their close relatedness later) have a direct physical significance. for the connection between magnetohydrodynamics and pt-symmetry, see [4, 5]. the concrete nature of the complexification is important, e.g., if one has to construct a perturbation theory around these singularities [6].

the purpose of this paper is to summarise some well-known facts about the complexification of energies and also to discuss some of the other (than usual) possible scenarios of complexification, especially for dirac operators. in the first section we will make clear the terminology and compare particular properties of self-adjoint and pt-symmetric parametrically dependent hamiltonians. in the second section, we will present the situation in relativistic models.

2 the basics of complexification

in the following, we will investigate the spectra of operators that depend on a single parameter. these operators will be called hamiltonians, as is usual in the pt-symmetric context, even if the matrices used for illustration in this section are rather toy models than realistic generators of time evolution. however, the physical background of what is discussed in the second section will be rather obvious. the investigated parameter will be denoted by c, the hamiltonian by h(c) and its eigenvalues by e, unless stated otherwise. the point in the parametric space where complexification occurs will be called the exceptional point(2) (ep) and denoted by c_ep; complexification means that some chosen subset (usually consisting of two eigenvalues) of the spectrum σ(h(c)) is real for c < c_ep and complex for c > c_ep, or vice versa. to avoid unnecessary complications we will examine the most common situation, when the hamiltonian is of the form

h(c) = h0 + c v   (1)

note that in spite of being quite restrictive within all imaginable parametric dependences, form (1) is useful in a broad class of physical applications. obviously, one can get other than linear parametric dependences by means of reparametrisation.

how do energies complexify?
h. bíla
some particular properties of the parametric dependence of eigenvalues, with emphasis on their complexification, are discussed. the non-diagonalisability of pt-symmetric matrix hamiltonians in exceptional points is compared with the level-crossing prohibition of hermitian systems. for non-matrix hamiltonians, the difference in the way klein-gordon and dirac hamiltonians complexify is demonstrated.
keywords: pt-symmetry, exceptional points, dirac equation

all discussed hamiltonians will be pt-symmetric.
this means that there exists an anti-linear operator, conventionally written as a product of the time reversal t and the parity reflection p,(3) which commutes with the hamiltonian:

pt h(c) = h(c) pt   (2)

it is well known that condition (2) itself provides that the discrete spectrum consists of real eigenvalues and complex-conjugated pairs. thanks to the simple parametric dependence of h it is reasonable to estimate that the eigenvalues are continuous functions of c.(4) from these two facts one can infer that at the exceptional point at least two eigenvalues must merge. hence it could be useful to generalise the notion of eps to those c where the spectrum is degenerate even if it is real (and non-degenerate) in the neighbourhood of the point.

2.1 matrices and avoided crossings

the simplest case of a pt-symmetric system is a parametrically dependent (finite-dimensional) matrix. examples of these have been used many times to demonstrate various properties of pt-symmetric systems. for some infinite-dimensional systems, non-self-adjointness is essential only in the subspaces spanned by the lowest-energy eigenvectors [7, 8], and matrix representations of the hamiltonian on this subspace can be considered a good approximation to the complete problem.

because for matrices of form (1) the eigenvalues are always continuous functions of c and their total number is fixed, complexification can clearly happen in only two ways. in the first case, at the exceptional point there is an n-dimensional degenerate subspace with n independent eigenvectors. in the second case, there are only n − k independent eigenvectors with k > 0, and there is thus a jordan-block structure. the important thing now is that the former case, let us call it a diagonalisable ep, is much rarer than the latter (non-diagonalisable). to see this, let us first examine the simplest 2×2 matrix case. a two-dimensional diagonalisable matrix with a degenerate spectrum is a multiple of the unit matrix, and so if one assumes a diagonalisable ep at c_ep, then

h(c_ep) = k i   and thus   h0 = k i − c_ep v.   (3)

the consequence is that h0 and v commute. this is something that is not satisfied in almost all physical situations. however, it must be emphasised that although the commutativity of h0 and v is a sufficient condition for the existence of diagonalisable eps in any dimension of the hilbert space, it is a necessary condition only in the case of two-dimensional systems.

the stated argument cannot be easily generalised to more dimensions, and a different approach is needed to see the reasons why diagonalisable eps are “prohibited”. hermitian matrices are always diagonalisable, and thus the only ep that can occur in a hermitian parametrically dependent system is a diagonalisable one, which is usually referred to as a level crossing. that level crossings are in some way avoided for hermitian systems is well known, but because the mechanism of this is exactly the same as the mechanism protecting against diagonalisable complexification in the pt-symmetric case, let us formulate the statement more precisely. for the purposes of this article, the following formulation of the “no-crossing” theorem will be sufficient:

let 𝓗 be the set of all n-dimensional hermitian matrices and s: 𝓗 → ℝ^(n²) be the mapping that maps any v ∈ 𝓗 with entries v_ij to the vector

s(v) = (v11, re v12, im v12, …, re v1n, im v1n, v22, re v23, …, vnn).   (4)
(4) let $H_0$ then be a fixed hermitian operator and let $\mathcal{M}_c$ be the subset of $\mathcal{M}$ such that for all $V \in \mathcal{M}_c$ the operator $H_0 + cV$ has a level crossing for at least one $c \in \mathbb{R}$. then $s(\mathcal{M}_c)$ has zero measure in $\mathbb{R}^{n^2}$.(5)

the message of the theorem is that parametrically dependent matrices of the form (1) which have a level crossing are extremely rare, in the sense that they constitute a zero-measure subset of all possible matrices of the considered type. a sketch of the proof can be outlined as follows: the characteristic polynomial
$$\chi(E, c) = \det(H_0 + cV - E\,\mathbb{I}) \qquad (5)$$
of a hermitian matrix is real for all real $c$ and $E$. it must have a multiple root at an exceptional point, which is otherwise stated as
$$\chi\big|_{\rm ep} = 0, \qquad (6)$$
$$\frac{\partial\chi}{\partial E}\bigg|_{\rm ep} = 0. \qquad (7)$$
the dependence of $\chi$ on $c$ is polynomial and thus smooth, and therefore also
$$\frac{\partial\chi}{\partial c}\bigg|_{\rm ep} = 0; \qquad (8)$$
otherwise $\chi$ would have complex roots at $c_{\rm ep} \pm \epsilon$ for some non-zero small $\epsilon$. now, having a given $H_0$, let us choose any $c_{\rm ep}$ and $E_{\rm ep}$ and find any matrix $V$ such that $H_0 + cV$ has an ep at $c_{\rm ep}$ with the energy $E_{\rm ep}$.(6) one may ask what happens with the eigenvalues if the matrix elements of $V$ (or better, the components $x_i$ of the vector $\mathbf{x} = s(V)$) and the parameter $c$ are slightly varied around the selected point. in the $(n^2+2)$-dimensional space of the variables $x_i$, $c$ and $E$ the eigenvalues are located on the subset where $\chi(\mathbf{x}, c, E) = 0$, therefore one needs to solve the differential equation
$$0 = \mathrm{d}\chi = (\nabla_{\mathbf{x}}\chi)\cdot\mathrm{d}\mathbf{x} + \frac{\partial\chi}{\partial E}\,\mathrm{d}E + \frac{\partial\chi}{\partial c}\,\mathrm{d}c. \qquad (9)$$
from (6) and (7) it follows that if one starts at the ep, the last two terms in (9) vanish and one gets
$$0 = (\nabla_{\mathbf{x}}\chi)\cdot\mathrm{d}\mathbf{x}. \qquad (10)$$
this is an equation which defines an $(n^2-1)$-dimensional manifold in $\mathbb{R}^{n^2}$. it is obvious now that the $n^2$-dimensional vectors $\mathbf{x}$ for which $s^{-1}(\mathbf{x}) \in \mathcal{M}_c$ are restricted to lie on such manifolds, which are clearly of zero measure, since they have one dimension less than the space of all hermitian n×n matrices.(7)

what are the differences if one takes the PT-symmetric case instead of the hermitian one? the number of free real parameters determining a PT-symmetric matrix is the same as for a hermitian matrix, and its characteristic polynomial is also real. the crucial difference is that since there is no guaranteed reality of the spectrum, condition (8) does not hold. on the other hand, since we know [9] that a diagonalisable PT-symmetric matrix with real eigenvalues can be transformed to a hermitian one by a similarity transformation, and because similar matrices have the same characteristic polynomials, condition (8) is still valid for diagonalisable eps of PT-symmetric matrices. however, the argument does not apply to non-diagonalisable eps, where an analogous similarity transformation does not exist. in this case, (9) can be solved by a function $c(\mathbf{x})$ that determines the position of the selected ep under an arbitrary change of the parameters $x_i$.

2.2 square-root dependence

to illustrate the behaviour of the eigenvalues on a concrete example, let us consider the simple two-dimensional matrices
$$H = \begin{pmatrix} c_1 & i\,d\,e^{q} \\ i\,d\,e^{-q} & c_2 \end{pmatrix} \qquad (11)$$
with $c_{1,2}, d, q \in \mathbb{R}$.(8) the eigenvalues are given by
$$2E = c_1 + c_2 \pm \sqrt{(c_1 - c_2)^2 - 4d^2}. \qquad (12)$$
the square root in (12) is a typical example of the parametric dependence near a non-diagonalisable ep.
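to see this square-root behaviour concretely, the following short numerical sketch (mine, not part of the original text; python with numpy) evaluates (11) for the parameter choice of fig. 1, i.e. $c_1 = -c_2 = c$, $d = 1$, $q = 0$, together with its hermitian counterpart. the eigenvalues are $\pm\sqrt{c^2 - 1}$ and $\pm\sqrt{c^2 + 1}$, respectively, so the PT-symmetric pair complexifies for $|c| < 1$ with eps at $c = \pm 1$, while the hermitian levels only exhibit an avoided crossing at $c = 0$.

```python
import numpy as np

def eigenvalues_pt(c, d=1.0, q=0.0):
    """eigenvalues of the PT-symmetric matrix (11) with c1 = -c2 = c."""
    h = np.array([[c, 1j * d * np.exp(q)],
                  [1j * d * np.exp(-q), -c]])
    return np.linalg.eigvals(h)

def eigenvalues_hermitian(c, d=1.0):
    """hermitian analogue with both off-diagonal elements equal to d."""
    h = np.array([[c, d], [d, -c]])
    return np.linalg.eigvalsh(h)

for c in (0.0, 0.5, 1.0, 1.5):
    print(c, np.round(eigenvalues_pt(c), 4), np.round(eigenvalues_hermitian(c), 4))
# PT case: +-sqrt(c**2 - 1), a complex pair for |c| < 1, eps at c = +-1;
# hermitian case: +-sqrt(c**2 + 1), the gap never closes (avoided crossing).
```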
since (11) is a very simple (i.e. two-level) system, the dependence is exactly the square root; for larger matrices of type (1) the square-root function is modified by some non-singular $c$-dependent factor, but still
$$\lim_{c \to c_{\rm ep}} \left|\frac{\mathrm{d}E}{\mathrm{d}c}\right| = \infty. \qquad (13)$$
the essential fact is that in a complex neighbourhood of the exceptional point the eigenvalues form a structure with two riemann sheets typical for the square root – this is universally true for matrices in all eps where two levels cross. the scheme also holds for a large class of schrödinger operators with PT-symmetric potentials. a square-root singularity was attested, for example, in the founding problem of PT-symmetry – the hamiltonians
$$-\frac{\mathrm{d}^2}{\mathrm{d}x^2} + x^2(ix)^c \qquad (14)$$
of bender, boettcher and meisinger [10]. another example is the one-dimensional laplacian on the interval $(0, d)$ discussed in [11] with the PT-symmetric boundary conditions(9)
$$\psi'(0) = (b + ic)\,\psi(0), \qquad \psi'(d) = (-b + ic)\,\psi(d). \qquad (15)$$
the system defined in such a way has several interesting properties: the total number of complex eigenvalues does not exceed one pair. the exceptional point usually has a jordan-block degeneracy and the square-root-like behaviour of the energy, but the situation can be made slightly more complicated, for example by keeping $b^2 + c^2$ fixed and varying $c$, which leads to an unusual triple (still non-diagonalisable) degeneracy in the ep. an even more extraordinary possibility is to put $b = 0$ and observe the dependence on $c$. in this setting the spectrum is real for all $c$ and only one energy level is not constant; the exceptional points occur at $k = n\pi/d$, where $n \in \mathbb{N}$ and $k = \sqrt{E}$.

fig. 1: the eigenvalues of matrix (11) with $c_1 = -c_2 = c$ and $d = 1$, $q = 0$ (solid lines). the dashed lines represent an analogous hermitian matrix where both off-diagonal elements are equal to 1. the graph illustrates the typical behaviour of the energy levels of both hermitian systems (avoided crossing at $c = 0$) and PT-symmetric systems (complexification at the exceptional points $c = \pm 1$).

fig. 2: the energy dependence of the laplacian with boundary conditions (15). the left graph shows the level crossings for $b = 0$. the situation on the right represents varying $b$ for the fixed value $b^2 + c^2 = 2.4$. solid lines represent real eigenvalues, while dashed and dotted lines are the real and imaginary parts of one of the complex eigenvalues respectively (the second one is conjugated). the values of the wave vector $k = \sqrt{E}$ are shown.

because no complexification actually occurs here, (13) does not hold either, but a jordan-block structure is still maintained. as a summary of the first section it is worth noting that the merging of two (or occasionally more) eigenstates and eigenvalues at the point of complexification, while being obligatory for matrices, is prevalent but not universal for schrödinger operators. complications come into play when the potential involves a singularity, typically a divergent term proportional to $1/x$. in such situations the eigenfunction corresponding to one of the real eigenvalues expected to merge at the ep ceases to be square integrable, and thus at this point one sees a splitting of one real energy into two complex energies (see for example [12]). we will also discuss this later with the radial dirac equation.
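the level structure of the laplacian model just described can be checked numerically. the sketch below is mine and not from the original text; it assumes the reconstructed form of the boundary conditions (15) given above, discretises $-\psi''$ on $(0, d)$ by finite differences, eliminates the endpoint values with one-sided approximations of (15), and diagonalises the resulting non-hermitian tridiagonal matrix. for $b = 0$ the computed spectrum comes out (numerically) real, with a single $c$-dependent level close to $c^2$ moving through the constant levels $(n\pi/d)^2$, in agreement with the behaviour described above.

```python
import numpy as np

def robin_pt_spectrum(b, c, d=np.pi, n=400):
    """finite-difference spectrum of -psi'' on (0, d) with the robin
    conditions (15) as reconstructed above:
    psi'(0) = (b + 1j*c) psi(0),  psi'(d) = (-b + 1j*c) psi(d)."""
    h = d / n
    # eliminate the endpoint values via one-sided differences:
    # psi_0 = psi_1 / (1 + h (b + ic)),  psi_n = psi_{n-1} / (1 - h (-b + ic))
    a0 = 1.0 / (1.0 + h * (b + 1j * c))
    an = 1.0 / (1.0 - h * (-b + 1j * c))
    m = (2.0 * np.eye(n - 1) - np.eye(n - 1, k=1) - np.eye(n - 1, k=-1)).astype(complex)
    m[0, 0] -= a0      # coupling of the first interior point to psi_0
    m[-1, -1] -= an    # coupling of the last interior point to psi_n
    return np.sort_complex(np.linalg.eigvals(m / h**2))

print(np.round(robin_pt_spectrum(b=0.0, c=1.5)[:4], 3))
# b = 0 and d = pi: a real spectrum; one level near c**2 = 2.25
# among the constant levels n**2 = 1, 4, 9, ...
```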
3 relativistic systems

the equations of relativistic quantum mechanics provide a natural resource of systems with complexification phenomena. in contrast to non-relativistic PT-symmetry, here one does not need to add an "artificial" complex term, and complexification is encountered within undoubtedly physically relevant problems. the point of complexification is usually regarded as the furthermost boundary of a quantum-mechanically describable configuration. in the following subsections we will discuss three different relativistic models in more detail.

3.1 dirac square well

one of the simplest imaginable interactions is represented by a finite square-well potential. the first discussed relativistic system is the spin-1/2 particle in such a potential. the one-dimensional dirac hamiltonian in one suitable representation reads
$$H = \begin{pmatrix} m - c\,\chi_{(0,d)}(x) & i\,\dfrac{\mathrm{d}}{\mathrm{d}x} \\[1ex] i\,\dfrac{\mathrm{d}}{\mathrm{d}x} & -m - c\,\chi_{(0,d)}(x) \end{pmatrix}, \qquad (16)$$
where $\chi_{(0,d)}$ denotes the characteristic function of the interval $(0,d)$, $c$ and $d$ are the well's depth and width, respectively, and $m$ is the mass of the particle, throughout the following deliberately put equal to one. the domain of the hamiltonian (16) is
$$\mathrm{dom}\,H = \left\{ \psi \in L^2(\mathbb{R}) \oplus L^2(\mathbb{R}) : H\psi \in L^2(\mathbb{R}) \oplus L^2(\mathbb{R}) \right\}. \qquad (17)$$
the hamiltonian has, as usual for a dirac hamiltonian, the continuous spectrum $\sigma_c = (-\infty, -1] \cup [1, \infty)$, which does not depend on $c$, $d$. our interest is concentrated on the bound states. the search for stationary states consists of solving the problem on the intervals $I_{\rm I} = (-\infty, 0)$, $I_{\rm II} = (0, d)$ and $I_{\rm III} = (d, \infty)$. the normalisable solutions for the energy $E$ are
$$\psi_{\rm I} = k_{\rm I}\begin{pmatrix}1\\ i\nu\end{pmatrix} e^{\kappa x}, \qquad \psi_{\rm II} = k_{\rm II}^{+}\begin{pmatrix}1\\ -\mu\end{pmatrix} e^{ipx} + k_{\rm II}^{-}\begin{pmatrix}1\\ \mu\end{pmatrix} e^{-ipx}, \qquad \psi_{\rm III} = k_{\rm III}\begin{pmatrix}1\\ -i\nu\end{pmatrix} e^{-\kappa x}, \qquad (18)$$
where the $k$'s refer to still unspecified complex constants and
$$\kappa = \sqrt{1 - E^2}, \qquad p = \sqrt{(c+E)^2 - 1}, \qquad \nu = \frac{\kappa}{1+E}, \qquad \mu = \frac{p}{1+c+E}; \qquad (19)$$
the square roots are taken positive or positive imaginary (that is, purely imaginary with a positive imaginary part). the matching conditions at 0 and d then involve only the wave function, not its derivative, as $H$ contains only a first-derivative term. after eliminating all the $k$'s we get the equation
$$\frac{1 - \lambda}{1 + \lambda} = \pm\, e^{-ipd} \qquad (20)$$
with $\lambda = i\nu/\mu$. from the definitions above it can be seen that $\lambda$ is real positive or imaginary positive, but no real positive $\lambda$ can fulfil equation (20). it follows that $E$ lies in the interval $(-1, 1)$ and thus in the gap of the continuous spectrum, as might be expected. now it is useful to take the logarithm of both sides of (20) to get a real equation,
$$\arctan\sqrt{\frac{(1-E)(1+c+E)}{(1+E)(c+E-1)}} = \frac{d}{2}\sqrt{(c+E)^2 - 1} \qquad \left(\operatorname{mod}\ \frac{\pi}{2}\right). \qquad (21)$$
by investigating (21) we arrive at the following observations about the dependence of the energies on the parameters $c$ and $d$:
1. due to the symmetry of the system, the change $c \to -c$ yields $E \to -E$, and thus one can restrict one's attention to $c > 0$ without loss of generality.
2. each energy decreases as $c$ or $d$ increases.
3. for the $i$-th bound state and fixed $d$ there are critical values $c_i^{\min}(d)$ and $c_i^{\max}(d)$ such that the bound state exists only if $c \in (c_i^{\min}, c_i^{\max})$. the energy of the bound state "sinks" into the continuous spectrum at these points:
$$\lim_{c \to c_i^{\min}} E_i = 1, \qquad \lim_{c \to c_i^{\max}} E_i = -1. \qquad (22)$$
4. by analogy, for fixed $c$ there are $d_i^{\min}(c)$ and $d_i^{\max}(c)$, and the bound state exists only if $d \in (d_i^{\min}, d_i^{\max})$.
5. $d_i^{\min}(c) = \dfrac{i\pi}{\sqrt{c(c+2)}}$ and $d_i^{\max}(c) = \dfrac{(i+1)\pi}{\sqrt{c(c-2)}}$; when $c \le 2$ the value $d_i^{\max}(c) = \infty$. the energies are numbered from 0 to $\infty$.
we can formally consider the points with parameters $c, d_i^{\min}(c)$ or $c, d_i^{\max}(c)$ as exceptional points, but at these points no complexification takes place.
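the bound-state condition (21) (in the reconstructed form given above, to be read as a best-effort repair of the garbled original) can be solved numerically. the sketch below (mine; python with scipy) finds the bound-state energies for a fixed width d and several depths c: the levels decrease with growing c and simply leave the gap (-1, 1) instead of complexifying.

```python
import numpy as np
from scipy.optimize import brentq

def dirac_well_levels(c, d, eps=1e-9):
    """bound states of the 1d dirac square well (m = 1) from the matching
    condition (21): arctan(...) = (d/2) p  (mod pi/2)."""
    def g(e):
        p = np.sqrt((c + e)**2 - 1.0)
        theta = np.arctan(np.sqrt((1.0 - e) * (1.0 + c + e)
                                  / ((1.0 + e) * (c + e - 1.0))))
        return 0.5 * d * p - theta
    elo, ehi = max(-1.0, 1.0 - c) + eps, 1.0 - eps
    levels, k = [], 0
    while 0.5 * k * np.pi < g(ehi):          # the k-th branch exists at all
        if 0.5 * k * np.pi > g(elo):         # ... and has not yet "sunk"
            levels.append(brentq(lambda e: g(e) - 0.5 * k * np.pi, elo, ehi))
        k += 1
    return levels

for c in (1.0, 2.0, 4.0):
    print(c, [round(e, 4) for e in dirac_well_levels(c, d=1.5)])
# all computed energies stay real and move downwards as c grows
```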
the spin-1/2 particle in a square well thus behaves "correctly" and exhibits no complexification for a strong potential; this can be placed in contrast to a scalar particle in an analogous potential, described in the next subsection.

3.2 klein-gordon square well

it is well known that a scalar particle is described by the klein-gordon equation. as for the involved potential, we will consider the minimal coupling, and thus the equation for stationary states has the form
$$\left[\big(E + c\,\chi_{(0,d)}(x)\big)^2 + \frac{\mathrm{d}^2}{\mathrm{d}x^2} - m^2\right]\psi(x) = 0, \qquad (23)$$
where $c$ is the well's depth and $d$ its width. it is worth mentioning that it is possible to describe system (23) by a hamiltonian of form (1) (see [13]). the price paid is that one must introduce a two-component formalism, which leads to some ambiguities, as the step from one to two components is not unique, and the resulting hamiltonian is non-hermitian.(10) since we are mainly interested in the spectrum, we will not follow this way. after some manipulations with the matching conditions at 0 and d, and taking logarithms as in the dirac case, it is easy to obtain the secular equation for the eigenvalues, which reads
$$\arctan\sqrt{\frac{(1-E)(1+E)}{(c+E-1)(c+E+1)}} = \frac{d}{2}\sqrt{(c+E)^2 - 1} \qquad \left(\operatorname{mod}\ \frac{\pi}{2}\right). \qquad (24)$$
although (24) resembles (21), its solutions behave differently at larger values of $c$ and $d$. in addition to the levels analogous to those of the dirac hamiltonian, there is a second set of energies emerging from the lower continuum with increasing $c$ or $d$, which merge with the former in a traditionally shaped exceptional point (fig. 3). this effect is enabled by the non-self-adjointness of the hamiltonian, in contrast with the dirac problem.

3.3 relativistic coulomb hamiltonian

the last example is intended to show that the dirac equation can also lead to complex energies. it is the well-known radial coulomb hamiltonian
$$H = \begin{pmatrix} -\dfrac{c}{r} + m & -\dfrac{\mathrm{d}}{\mathrm{d}r} + \dfrac{\kappa}{r} \\[1ex] \dfrac{\mathrm{d}}{\mathrm{d}r} + \dfrac{\kappa}{r} & -\dfrac{c}{r} - m \end{pmatrix}, \qquad (25)$$
where $\kappa = \pm 1, \pm 2, \pm 3, \ldots$ characterises the angular momentum and $m$ is the mass, once more without loss of generality set to one in what follows. we will not discuss the algorithm of the solution, since the reader is assumed to know it, and instead write down the resulting formula
$$E_n = \left(1 + \frac{c^2}{(n + \gamma)^2}\right)^{-1/2}, \qquad (26)$$
$$\gamma = \sqrt{\kappa^2 - c^2}. \qquad (27)$$
obviously, $c > |\kappa|$ leads to complex values of $E_n$, and all the levels with the same $|\kappa|$ complexify simultaneously. the energy dependence is clearly of square-root type; however, only one real eigenvalue exists at $c < c_{\rm ep}$ for each complex pair at $c > c_{\rm ep}$.
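formula (26) can be evaluated directly. the following sketch (mine; python with numpy, using (26) and (27) with $\kappa = 1$ and $m = 1$) shows the lowest s-wave levels turning complex once $c$ exceeds $|\kappa| = 1$.

```python
import numpy as np

def coulomb_levels(c, kappa=1, nmax=4):
    """relativistic coulomb energies from (26)-(27) with m = 1:
    E_n = (1 + c**2/(n + gamma)**2)**(-1/2), gamma = sqrt(kappa**2 - c**2)."""
    gamma = np.sqrt(complex(kappa**2 - c**2))   # becomes imaginary for c > |kappa|
    n = np.arange(nmax)
    return (1 + c**2 / (n + gamma)**2) ** -0.5

print(np.round(coulomb_levels(0.8), 4))   # real levels; e.g. E_0 = gamma = 0.6
print(np.round(coulomb_levels(1.2), 4))   # c > |kappa| = 1: complex energies
```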
to see how this is possible for the symmetric hamiltonian (25), one needs to examine it more closely. we can restrict ourselves to s-states ($\kappa = 1$), because these complexify first and there is no qualitative difference for higher $\kappa$. the hamiltonian's domain of definition trivially must be a subset of
$$D = \left\{ \psi \in L^2(\mathbb{R}_+) \oplus L^2(\mathbb{R}_+) : H\psi \in L^2(\mathbb{R}_+) \oplus L^2(\mathbb{R}_+) \right\} \qquad (28)$$
(the existence of $H\psi$ for all $\psi \in D$ is understood). if one wants to establish the self-adjointness of $H$, one first has to apply per partes integration to the integrals of the wave-function products and demand the vanishing of the boundary terms. the boundary at $r = \infty$ causes no complications, because $\psi(r) \to 0$ for $r \to \infty$ trivially, due to the square integrability. one must be more careful at $r \to 0$: the square integrability of $\psi$ itself clearly does not forbid non-zero values at the origin; however, the square integrability of $H\psi$ can do better.

to see whether there are functions from $D$ with a non-zero limit at $r \to 0$, one may check the asymptotics of $H\psi$; around zero one can disregard the mass-proportional terms, and one has
$$H\begin{pmatrix}\psi_1\\ \psi_2\end{pmatrix} \approx \begin{pmatrix} -\dfrac{c\,\psi_1}{r} - \psi_2' + \dfrac{\psi_2}{r} \\[1ex] \psi_1' + \dfrac{\psi_1}{r} - \dfrac{c\,\psi_2}{r} \end{pmatrix}. \qquad (29)$$

fig. 3: the lowest energy level plotted against the depth c with fixed width d = 1.5 for relativistic square wells. the spin-1/2 case on the left-hand side does not complexify, while the spinless case on the right-hand side does; in the latter graph the two levels that merge in the ep are shown.

fig. 4: the six lowest real energies of the dirac hamiltonian with the coulomb potential. the line styles distinguish between different values of $\kappa$.

this two-component function can be square-integrable
1. either if $\psi \sim r^{\alpha}$ with $\alpha > \frac{1}{2}$ (and thus $\psi(0) = 0$), because then both $\psi'$ and $\psi/r$ are proportional to $r^{\alpha-1}$ and therefore square integrable,
2. or if the whole expression (29) vanishes.
the second case leads to a differential equation for the asymptotic behaviour of $\psi$ near the origin, and its solution is
$$\psi \sim r^{\pm\gamma} \qquad (30)$$
with the same $\gamma$ as in (27) – $\kappa$ still being equal to 1. the solution proportional to $r^{\gamma}$ is not interesting, being zero at $r = 0$, but the solution $r^{-\gamma}$ diverges. for $\gamma > \frac{1}{2}$ it is not square integrable and can thus be ruled out, but if $\gamma < \frac{1}{2}$ (that is, $c > \frac{\sqrt{3}}{2}$) this solution lies in $D$. in other words, in this case there are functions from $D$ which do not vanish at $r = 0$, and if we made $D$ the domain of definition of $H$, the operator would not be symmetric, because of the non-zero boundary term in the per partes integration. after restricting the domain of $H$ to functions which vanish at the origin, the operator is symmetric with the point spectrum (26), but still not self-adjoint, since the domain of the adjoint operator would be the entire set $D$, which is, for $c > \frac{\sqrt{3}}{2}$, different from $\mathrm{dom}\,H$.

from the previous discussion it follows that there are two main differences between the square-well and coulomb interactions within the framework of the dirac equation. the first, connected with the long range of the coulomb potential, is that the eigenvalues do not emerge from the upper continuum as the strength of the interaction increases (as they do in the case of a square well), but all of them exist for any non-zero $c$. a similar difference is also present in the non-relativistic treatment of the problems. the second difference, more interesting for the purposes of this paper – the existence of complexification at a certain value of $c$ – is a consequence of the $1/r$ singularity at the origin. this complexification is of an extraordinary type within the family of PT-symmetric models, with only one real eigenvalue forming a complex pair.(11)

4 summary

the purpose of this paper was two-fold: first, to discuss the simple principles behind the complexification of eigenvalues and to show the connection between avoided level crossings in the hermitian framework and avoided diagonalisable eps in PT-symmetry. and second, to show that in less standard situations (where standard means a matrix or a schrödinger hamiltonian), there are more ways in which the eigenvalues can cross the boundary between the physical world and the complex realm. it is by no means intended to state that the few examples presented here constitute in any sense a complete list of what can happen.
rather, the aim of the article was to illustrate that some quite ordinary systems can behave contrary to the usual "PT-symmetric intuition".

remarks

(1) it may be useful to note that a systematic approach to PT-symmetry probably cannot be achieved without use of the krein space concept [1]. when the spectrum of a krein space operator is real, a transformation to a hilbert space operator exists and the PT-symmetric system can be treated as in standard quantum mechanics.
(2) some authors distinguish between "exceptional" and "diabolic" points regarding the type of complexification. here we use the first term for all types.
(3) usually, but not necessarily, the standard realisations of $\mathcal{P}$ and $\mathcal{T}$ are used. a particularly useful non-standard choice is $\mathcal{P} = \mathbb{I}$, which identifies symmetric and PT-symmetric operators.
(4) this need not be valid for general $H_0$ and $V$, but most physical systems do not spoil the assumption. in many situations it can be proved through perturbation theory.
(5) the mapping $s$ is a bijection which creates an $n^2$-dimensional vector from the $n^2$ independent real parameters determining a hermitian n×n matrix, i.e. the real diagonal matrix elements and the real and imaginary parts of the entries above the diagonal.
(6) such matrices obviously exist. a particular example can easily be constructed in the diagonal basis of $H_0$: any matrix which is diagonal in this basis and has the values $V_{ii} = (E_{\rm ep} - (H_0)_{ii})\,c_{\rm ep}^{-1}$ for at least two different $i \in \{1, \ldots, n\}$ is good enough.
(7) we suppose the measure in the space of matrices is generated by the euclidean norm $\|A\| = \sqrt{\sum_{ij} |A_{ij}|^2}$ or an equivalent one. on the contrary, there can be reasons for choosing measures concentrated on lower-dimensional subspaces – usually because of some symmetry prescription for the "randomly chosen" hamiltonian. under such conditions the level crossings are not forbidden.
(8) it is the most general PT-symmetric two-dimensional matrix with respect to $\mathcal{P}$ defined by the pauli matrix $\sigma_3$ and $\mathcal{T}$ being simply complex conjugation.
(9) strictly speaking, the system defined by (15) is not of type (1). but the linear parametrisation is still valid in a generalised sense, for the hamiltonian's associated sesquilinear form.
(10) and also not manifestly lorentz covariant.
(11) to be more specific, the statement is true if the boundary condition $\psi(0) = 0$ is given. without it, demanding only the square integrability of $\psi$ and $H\psi$, one gets a clearly non-physical situation: if $c > \frac{\sqrt{3}}{2}$, then all real numbers become eigenvalues, with eigenfunctions proportional to the confluent hypergeometric function $U$ multiplied by a decaying exponential.

references

[1] mostafazadeh, a.: krein-space formulation of PT-symmetry, CPT-inner products, and pseudo-hermiticity, czech. j. phys. vol. 56 (2006), p. 919.
[2] heiss, w. d., müller, m., rotter, i.: collectivity, phase transitions and exceptional points in open quantum systems, phys. rev. vol. e58 (1998), p. 2894.
[3] cejnar, p., heinze, s., dobeš, j.: thermodynamic analogy for quantum phase transitions at zero temperature, phys. rev. vol. c71 (2005), p. 011304.
[4] günther, u., stefani, f., gerbeth, g.: the mhd α²-dynamo, Z₂-graded pseudo-hermiticity, level crossings and exceptional points of branching type, czech. j. phys. vol. 54 (2004), p. 1075.
[5] günther, u., stefani, f., znojil, m.: mhd α²-dynamo, squire equation and PT-symmetric interpolation between square well and harmonic oscillator, j. math. phys. vol. 46 (2005), p. 063504.
[6] günther, u., rotter, i., samsonov, b. f.: projective hilbert space structures at exceptional points, arxiv:0704.1291.
[7] mostafazadeh, a., batal, a.: physical aspects of pseudo-hermitian and PT-symmetric quantum mechanics, j. phys. vol. a37 (2004), p. 11645.
[8] quesne, c., bagchi, b., mallik, s., bíla, h., jakubský, v., znojil, m.: PT-supersymmetric partner of a short-range square well, czech. j. phys. vol. 55 (2005), p. 1161.
[9] mostafazadeh, a.: exact PT-symmetry is equivalent to hermiticity, j. phys. vol. a36 (2003), p. 7081.
[10] bender, c. m., boettcher, s., meisinger, p.: PT-symmetric quantum mechanics, j. math. phys. vol. 40 (1999), p. 2201.
[11] krejčiřík, d., bíla, h., znojil, m.: closed formula for the metric in the hilbert space of a PT-symmetric model, j. phys. vol. a39 (2006), p. 10143.
[12] lévai, g.: comparative analysis of real and PT-symmetric scarf potentials, czech. j. phys. vol. 56 (2006), p. 953.
[13] feshbach, h., villars, f.: elementary relativistic wave mechanics of spin 0 and spin 1/2 particles, rev. mod. phys. vol. 30 (1958).

mgr. hynek bíla, phone: +420 266 173 280, email: hynek.bila@ujf.cas.cz, nuclear physics institute, academy of sciences of the czech republic, 250 68 řež, czech republic

acta polytechnica 61(6):672–683, 2021. https://doi.org/10.14311/ap.2021.61.0672 © 2021 the author(s), licensed under a cc-by 4.0 licence, published by the czech technical university in prague

comparison of the walk techniques for fitness state space analysis in the vehicle routing problem

anita agárdi (a,*), lászló kovács (a), tamás bányai (b)
a: university of miskolc, faculty of mechanical engineering and informatics, institute of informatics, miskolc-egyetemváros 3515, miskolc, hungary
b: university of miskolc, faculty of mechanical engineering and informatics, institute of logistics, miskolc-egyetemváros 3515, miskolc, hungary
*: corresponding author: agardianita@iit.uni-miskolc.hu

abstract. the vehicle routing problem (vrp) is a highly researched discrete optimization task. the first article dealing with this problem was published by dantzig and ramser in 1959 under the name truck dispatching problem. since then, several versions of vrp have been developed. the task is np-hard, so in practice it can only be solved in a foreseeable time by relying on different heuristic algorithms. the geometrical properties of the state space influence the efficiency of the optimization method. in this paper, we present an analysis of the following state space methods: adaptive, reverse adaptive and uphill-downhill walk. the efficiency of four operators is analysed on a complex vehicle routing problem. these operators are the 2-opt, partially matched crossover, cycle crossover and order crossover. based on the test results, the 2-opt and partially matched crossover are superior to the other two methods.

keywords: fitness state space, vehicle routing problem, optimization.

1. introduction

the vehicle routing problem (vrp) is one of the best known discrete optimization tasks. in the basic vrp, the positions and demands of the customers are given in advance, and the number of vehicles and the capacity limit are also known in advance. starting from the depot, vehicles visit the customers and then return to the depot. the objective of the optimization is the minimization of the distance travelled by the vehicles. figure 1 illustrates the layout of a basic vehicle routing problem.

figure 1. example of a basic vehicle routing problem (vrp)
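to make the objective concrete, here is a minimal python sketch (mine, not from the paper) that evaluates a candidate solution of the basic vrp. the coordinates are made-up illustrative values, node 0 is the depot, a solution assigns each vehicle a route over the customers, and the cost is the total euclidean distance including the returns to the depot.

```python
import math

# made-up coordinates: node 0 is the depot, nodes 1..5 are customers
coords = [(50, 50), (10, 80), (90, 20), (30, 40), (70, 60), (20, 10)]

def dist(a, b):
    (xa, ya), (xb, yb) = coords[a], coords[b]
    return math.hypot(xa - xb, ya - yb)

def total_distance(routes):
    """total travelled distance of a candidate solution; every route
    starts and ends at the depot (node 0)."""
    return sum(sum(dist(a, b) for a, b in zip([0] + r, r + [0]))
               for r in routes)

print(round(total_distance([[1, 3], [2, 4, 5]]), 2))  # two vehicles
```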
the vrp was first introduced by dantzig and ramser [1] as the truck dispatching problem. since then, many types and constraints of the problem have emerged as the complexity of transportation tasks has begun to increase. below, we introduce some types of the problem; table 1 contains the most researched vehicle routing problem types.

table 1: vrp types
• time window [2]: customers must be visited during a specific time interval.
• multiple time window [3]: customers have one or more time windows.
• soft time window [4]: customers can be visited outside of the time interval (but the solution gets a penalty point).
• single depot [5]: the system contains a single depot.
• multiple depot [6]: the system contains multiple depots; the vehicles must start their route from one of the depots.
• open route [7]: vehicles do not have to return to the depot.
• inter-depot routes [8]: vehicles can arrive at any of the depots.
• two-echelon [9]: the system contains intermediate locations (called satellites); the products are shipped along the route depot–satellite–customer.
• homogeneous fleet [10]: the system contains one type of vehicle.
• heterogeneous fleet [10]: the system contains multiple types of vehicles.
• pickup and delivery [11]: delivering products from the depot to the customers and collecting products from the customers to the depots.
• multiple product [12]: the system contains multiple products.
• periodic problem [13]: periodic visits to the customers.

in the fitness state space analysis, the key elements are the representation form of the state and the applied neighbourhood operator. the analysis can be used to show the efficiency of the selected operators of the optimisation algorithm. below, we present several papers that investigate fitness state space analysis techniques. fitness state space analysis was first published by sewall wright [14]. he described the search space mathematically as a pair containing a set of solutions and a fitness function. since then, fitness state space analysis techniques have been applied to many problems, for example, the knapsack problem [15], the timetabling problem [16], the traveling salesman problem [17, 18], the flowshop scheduling problem [19], the quadratic assignment problem [20], the maximal constraint satisfaction problem [21] and the vehicle routing problem [22]. the work [15] presents the analysis of several representation methods, such as weight-coding representation, random-key representation, ordinal representation, binary representation and permutation representation. the analysis includes the following metrics: fitness distance correlation, autocorrelation function, correlation length. in [23], the fitness state space of the optimum multiuser detection problem is also analysed, using the following fitness state space methods: fitness distance correlation, random walk correlation function. the timetabling problem landscape is characterised by the following techniques: fitness distance correlation, autocorrelation [16]. we can also find examples of a fitness state space analysis of the traveling salesman problem, e.g., [17, 18], where the k-opt operator is analysed by calculating the hamming distance.
the following analyses are performed by the authors: distance to the global optimum, number of local optima, probability of visiting an optimum, autocorrelation, time to a local optimum, the distance between the optima. the effectiveness of edge recombination and maximal preservative crossover (mpx) is compared. an analysis of the fitness state space of the flowshop scheduling problem (fsp) has also been performed, for example, by [19]; in that article, the following analytical methods are detailed: evolvability, neutral degree, neutral neighbour, autocorrelation. an analysis of the fitness state space of the quadratic assignment problem has also been carried out, for example, by the authors of [20]. the following operators are analysed: pairwise exchange, partially mapped crossover (pmx) and swap. the following fitness state space analysis techniques are performed: autocorrelation function, autocorrelation length, random walk, the basin of attraction. in [21], the maximal constraint satisfaction problem is investigated with a random-walk and cost-density analysis. an analysis of the fitness state space of the traveling salesman problem was also presented by the authors of [24]. they demonstrated the efficiencies of the 2-opt (edge swap) and city swap operators. the distance from the global optimum was examined by the authors as a function of the fitness value. according to the authors, the 2-opt operator is more efficient, because in this case the fitness distance correlation is strong. the time to a local optimum, distances from optima, autocorrelation, number of local optima, and probability of getting to the optimum analyses of the traveling salesman problem and the no-wait job scheduling task were also detailed by the authors of [17]. the fitness state space analyses of the quadratic assignment problem and the traveling salesman problem [25] were performed with the following techniques: fitness distance correlation, fitness cloud, autocorrelation, correlation length, information content, information stability, ruggedness. the following walk techniques are presented: random walk, adaptive walk, reverse adaptive walk, uphill-downhill walk, neutral walk. we can find an example of an analysis of the vehicle routing problem fitness state space in [22], where the following operators were analysed: swap, inversion, insertion, partially matched crossover, uniform order crossover, cycle crossover, displacement. a random walk technique with 2000 steps was used. the task was also examined from an information-theoretic point of view, with the following techniques: information content, density-basin information and partial information content. in our earlier research, we have already investigated the fitness state space of a multi-echelon system [26, 27]: we performed fitness cloud and random walk analyses of the multi-echelon vehicle routing problem. now, we turn to the extension of the problem, namely the investigation of the adaptive, reverse adaptive and uphill-downhill walks. the remainder of this paper is organized as follows. section 2 contains the concept of our fitness state space analysis: the adaptive, reverse adaptive and uphill-downhill walks, the applied neighbourhood operators and the applied distance calculations. section 3 contains the results and discussion.
this section presents our vehicle routing problem, the results of the adaptive, reverse adaptive and uphill-downhill walks, and the summary results. the last section presents the conclusions and the scope of our future work.

2. fitness state space analysis

optimisation metaheuristics operate iteratively to find the extremum points in the object space. the body of the iteration contains the following elements [28]:
• the set of possible states.
• the neighbourhood, which is defined by an operator: from the current state, the application of a neighbourhood operator yields the next state.
• the objective function.
• the representation: the applicability of each operator depends on the representation.
• the transition rule: the selection of the next state point from the neighbours.
• the termination criteria of the algorithm.
• whether the initial state point is generated randomly or with some heuristics.

2.1. adaptive, reverse adaptive and uphill-downhill walk

analytical methods can be divided into two categories: exhaustive search and stochastic, sampling-based techniques [29]. an exhaustive search gives the most complete picture of the search space, but it can only be applied to smaller problems, hence it is difficult to use in practice. its main advantage is completeness: the running time is high, but the search space is completely traversed [29]. the most commonly used techniques in search space analysis are the stochastic, sampling-based techniques. their advantage over the exhaustive search is that their running time is low; a disadvantage is that we cannot get a complete, systematic view of the search space. it is comparable to metaheuristics in the sense that metaheuristics search for a "relatively good" solution, while the stochastic search offers a "relatively good" mapping [29]. the basic problem of stochastic analysis methods is the selection of a representative sample; selecting the optimal sample is a difficult task, as we do not know the complete set of solutions. so far, two types of sample generation techniques have been used [29]:
• the trajectory-based sampling technique creates the path of optimization methods with consecutive solution candidates.
• discovery strategies generate scattered samples.

sampled trajectories create a path in the search space or, in other words, a sequence of adjacent state points (solution candidates). this method is also called a walk [29]. there are several walks to analyse the search space. the walk starts from a random solution candidate and uses the neighbourhood search to "walk" to neighbouring solution candidates. depending on the type of the neighbourhood search, the following walking techniques are possible [25]:
• random walk – a state point is randomly selected from the set of neighbours.
• adaptive walk – a better state point (neighbour) is selected.
• reverse adaptive walk – always selects the worst neighbour; this is the reverse of the adaptive walk.
• uphill-downhill walk – first, an adaptive walk is performed; then, if a better fitness point (solution) is not found during the step, a reverse adaptive walk is performed until a better state point cannot be found.
• neutral walk – a neighbour with a fitness value equal to the parent's fitness value is chosen (with the effort to increase the distance from the starting solution).
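as an illustration, a minimal python sketch of these walk techniques is given below (mine, not from the paper). here `fitness` stands for any objective to be minimised, and the neighbour generator is a placeholder that can be replaced by any operator from the next subsection; the uphill-downhill variant shown here alternates fixed-length phases, whereas the paper switches direction when no better neighbour is found.

```python
import random

def random_neighbour(sol):
    """placeholder neighbourhood move: reverse a random segment (2-opt style)."""
    i, j = sorted(random.sample(range(len(sol)), 2))
    return sol[:i] + sol[i:j + 1][::-1] + sol[j + 1:]

def adaptive_walk(start, fitness, steps, sample=20, pick=min):
    """adaptive walk: step to the best of a sample of neighbours;
    pick=max gives the reverse adaptive walk."""
    path = [start]
    for _ in range(steps):
        path.append(pick((random_neighbour(path[-1]) for _ in range(sample)),
                         key=fitness))
    return path

def uphill_downhill_walk(start, fitness, phases=6, steps=50):
    """simplified uphill-downhill walk: alternate adaptive and reverse
    adaptive phases of fixed length."""
    path = [start]
    for k in range(phases):
        path += adaptive_walk(path[-1], fitness, steps,
                              pick=min if k % 2 == 0 else max)[1:]
    return path
```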
2.2. neighbourhood operators

the optimization algorithm is greatly influenced by the applied search operators. in this article, the efficiency of the following operators is presented: 2-opt, order crossover (ox), cycle crossover (cx), partially matched crossover (pmx).

in the case of 2-opt [30], two edges are swapped: this is the reversal of the elements of a section of the permutation, where the two edges are selected randomly. the problem is defined in a permutation space; a solution is given as a permutation. figure 2 shows an example.

figure 2. example of the 2-opt operator

the order crossover (ox) [31] operator creates two child solutions from two parent solutions. when using the operator, we need a fitting section, which is selected randomly. the children are initially copies of the parents. then, the elements of the fitting section of the second parent are deleted from the first child (marked with the letter h in the figure), and the elements of the fitting section of the first parent are deleted from the second child. the child solutions are then rearranged so that the letters h (empty places) fall into the fitting section of the permutations. finally, the first child receives the fitting section of the second parent and the second child receives the fitting section of the first parent. figure 3 illustrates the operations.

figure 3. example of the order crossover (ox) operator

cycle crossover (cx) [32] searches for cycles. this operator also creates two child solutions from two parent solutions. initially, the first element of the first child is the first element of the first parent, and the first element of the second child is the first element of the second parent. in the example, this is the pair (1,9), so the first element of the first child is 1 and the first element of the second child is 9. then comes the pair (9,6), placed in position 4, the pair (6,4) in position 6, (4,2) in position 8, (2,7) in position 3, and (7,1) in the last position. then (1,9) would come, but this pair has already been placed, so the cycle is closed. there are still gaps in the child solutions; they are filled so that the first child gets the remaining items from the second parent and the second child gets the remaining items from the first parent. the operator is shown in figure 4.

figure 4. example of the cycle crossover (cx) operator

partially matched crossover (pmx) [33] also creates two child solutions from two parent solutions. this operator also selects the fitting section randomly. first, the child solutions are copies of the parents. the elements of the fitting sections of the two parents are then paired; in figure 5, the pairs are the following: (9,6), (5,5), (6,4) and (8,3). these items are swapped in both children.

figure 5. example of the partially matched crossover (pmx) operator

2.3. distance calculation

we used three types of distances to describe the similarity between two solutions: the fitness [34], hamming [35] and basic swap sequence [36] distances. the fitness distance between two solutions is illustrated in figure 6. in the example, the fitness distance is 500, because the absolute value of the difference of the fitness values of the two solutions must be taken. the hamming distance is 4, because the solutions differ in 4 positions. the basic swap sequence distance is 1, because with a single edge swap operation we get one solution from the other (the basic swap sequence distance is the minimum number of edge swap operations).

figure 6. distance example
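for concreteness, here is a small python sketch (mine) of the 2-opt move, the pmx recombination as described above (pairing the elements of the matching section and swapping each pair in both children), and the hamming distance. the permutations in the usage lines are illustrative, not the ones of the figures, and the basic swap sequence distance is omitted because its elementary move (the edge swap) was defined only informally above.

```python
import random

def two_opt(perm):
    """2-opt: reverse the segment between two random cut points."""
    i, j = sorted(random.sample(range(len(perm)), 2))
    return perm[:i] + perm[i:j + 1][::-1] + perm[j + 1:]

def pmx(p1, p2):
    """partially matched crossover: pair the elements of a random matching
    section and swap each pair in both children."""
    i, j = sorted(random.sample(range(len(p1)), 2))
    c1, c2 = p1[:], p2[:]
    for a, b in zip(p1[i:j + 1], p2[i:j + 1]):
        if a != b:
            for c in (c1, c2):
                ia, ib = c.index(a), c.index(b)
                c[ia], c[ib] = c[ib], c[ia]
    return c1, c2

def hamming(p1, p2):
    """number of positions in which two permutations differ."""
    return sum(a != b for a, b in zip(p1, p2))

random.seed(3)
p1, p2 = [1, 5, 2, 9, 7, 6, 8, 4, 3], [4, 9, 1, 6, 8, 2, 5, 3, 7]
print(pmx(p1, p2), hamming(p1, p2), two_opt(p1))
```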
figure 7. the investigated vehicle routing problem

3. results and discussion

3.1. our vehicle routing system

our vehicle routing problem (which is also presented in our previous works [26, 27]) is illustrated in figure 7. the test system contains a single depot, satellites and customers. the x and y position coordinates of the depot are generated in the interval (0–100). for the first-level satellites, the coordinates are generated in the interval (200–300), and the coordinates of the second-level satellites are in the interval (400–500). the system contains 10 first-level and 10 second-level satellites and 15 customers. the products are transported from the depot to the satellites, and then from the satellites to the customers. the demands for the products vary by customer, and the system contains a single type of product. the system contains different types of vehicles; the vehicles differ in the capacity limit and the fuel consumption. the values of the capacity limits are generated in the interval (10,000–50,000), and the fuel consumption in the interval (10–100). the travel time between the customers and the route status are also important factors. additionally, the following components are considered: loading and unloading time, administration time, loading and unloading cost, administrative and quality control cost. the values of these components are generated in the interval (30–50). the objective function of the optimization is a compound cost function with the following components and objectives: fuel consumption (minimization), route time (minimization), unvisited customers (minimization), and route status (minimization). our goal was to select a complex vrp system having several layers and several vehicle types for the analysis; this kind of architecture is very frequent in real-life applications.
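the description above can be turned into a small instance generator; the sketch below (mine; python) samples from the stated intervals. the customer coordinate interval and the number of vehicles are not stated in the text, so the values used here are assumptions.

```python
import random

def make_instance(seed=0, vehicles=5):
    """random test instance; only the intervals come from the text, the
    sampling scheme, the customer box and the fleet size are assumptions."""
    rng = random.Random(seed)
    box = lambda lo, hi: (rng.uniform(lo, hi), rng.uniform(lo, hi))
    return {
        "depot": box(0, 100),
        "first_level_satellites": [box(200, 300) for _ in range(10)],
        "second_level_satellites": [box(400, 500) for _ in range(10)],
        "customers": [box(0, 100) for _ in range(15)],   # interval assumed
        "capacity_limits": [rng.uniform(10_000, 50_000) for _ in range(vehicles)],
        "fuel_consumption": [rng.uniform(10, 100) for _ in range(vehicles)],
        "service_components": {k: rng.uniform(30, 50) for k in
                               ("loading_unloading_time", "administration_time",
                                "loading_unloading_cost", "admin_quality_cost")},
    }

instance = make_instance()
print(len(instance["customers"]), instance["depot"])
```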
3.2. adaptive walk

in this subsection, the results of the adaptive walk are analysed. according to figure 8, the fitness values range from 120,000 to 150,000 for 2-opt, and the average fitness distances taken from the other solutions range from 4,000 to 21,000; here we obtain a parabolic function. figure 9 represents the average of the hamming distances for 2-opt; here the hamming distances are between 20–34. the same can be said for the basic swap sequence (here the distances are between 13–26). we also get a parabola-like function for the fitness distances in the case of the order crossover (here the fitness values are between 110,000–140,000, while the averages of the fitness differences range from 2,000–17,000), while the hamming distances (averages of the differences between 16–34) and the basic swap sequence distances (averages between 12–26) also depend on the fitness value of the solution. the cycle crossover is similar to 2-opt: here the fitness values are between 110,000–130,000, the averages of the fitness distances are between 2,000–15,000, and the basic swap sequence distances are in the range of 8–24. for the partially matched crossover, the fitness values are between 110,000–140,000 and the average fitness differences are between 2,000–21,000. the hamming distances are between 22–36 and also depend on the fitness value: the higher the fitness value, the higher the averages of the hamming distances. we also got this result in the case of the basic swap sequence, where the values are between 16–28.

figure 8. the relationship between fitness values and the average of fitness distances to the other solutions in the case of the 2-opt operator

figure 9. the relationship between fitness values and the average of hamming distances to the other solutions in the case of the 2-opt operator

regarding the distance from the best solution for the adaptive walk: the higher the fitness value, the greater the distance from the best solution, for all operators. the fitness distance from the best solution and the average of the fitness distances to the other solutions (figure 10) describe a parabolic function: if the distance from the best solution is small or large, then the average distance from the other solutions is large when using the 2-opt operator. there are no similarities between the hamming and basic swap sequence distances (figure 11); the distance from the best solution does not affect the average of the distances taken from the other solutions. in the case of the order crossover, if the fitness distances from the best solution are very large, the averages of the distances are also very large, and this is also true for the hamming and basic swap sequence distances. in the case of the cycle crossover, the averages of the fitness distances are also high when the fitness distance taken from the best solution is high; the same is true for the hamming and basic swap sequence distances. in the case of the partially matched crossover, the fitness, hamming and basic swap sequence distances taken from the best solution also greatly affect the average distances.

figure 10. the relationship between the fitness distances of the best solution and the average of fitness distances to the other solutions in the case of the 2-opt operator

figure 11. the relationship between the hamming distances of the best solution and the average of hamming distances to the other solutions in the case of the 2-opt operator

figure 12. cost of density in the case of the 2-opt operator

in the case of 2-opt, each solution has a different fitness value, but in the case of the order crossover and cycle crossover (figure 12), some solutions have the same fitness value. in the case of the partially matched crossover, almost every solution has a unique fitness value. the summary of the adaptive walk for each operator is given in table 2.

3.3. reverse adaptive walk

in the following, the reverse adaptive walk solutions are analysed. for 2-opt, the averages of the fitness distances describe a parabolic function, while the hamming and basic swap sequence averages decrease with increasing fitness value. the order crossover fitness distances also describe a parabolic function, while the hamming and basic swap sequence distances do not depend on the fitness values. the situation is similar in the case of the cycle and partially matched crossover: the averages of the fitness-value differences can be described by a parabolic function, and the hamming and basic swap sequence distances do not depend on the fitness value; they are located close to each other. the fitness distance from the best solution in the case of 2-opt, order crossover, cycle crossover and partially matched crossover also describes a linear function.
the hamming and basic swap sequence distances occur in a large interval, from a small distance of 1–2 to a very large distance of 30–40 between solutions. the function of the distance from the best solution versus the averages of the distances from the other solutions is as follows: the fitness distance for 2-opt is a parabolic function, but shows a prolonged, long-decreasing trend and then starts to increase slightly at the end of the scale. the hamming and basic swap sequence distances show wider average distances for larger distances. in the case of the order crossover, the fitness distances also describe a parabolic function, only in this case a larger increase can be observed after a small decrease. for the hamming and basic swap sequence distances, the scale of the average distances increases as the distances from the best solution increase. in the case of the cycle and partially matched crossover, we also get a parabolic function for the fitness distance, and the hamming and basic swap sequence distances increase with an increasing distance from the best solution; the scale of the averages of the distances also increases. the cost of density values are low, which means that the fitness values of the solutions are different during the steps: 1–2 is the cost of density for 2-opt, and 1–4 for the order crossover. the cycle crossover, however, provided several solutions with the same fitness value; here 1–5 is the number of identical fitness values. for the partially matched crossover this value is 1–3, but in this case almost all solutions have different fitness values. table 3 summarises the results of the reverse adaptive walk.

3.4. uphill-downhill walk

during the uphill-downhill walk, the fitness values range from 130,000 to 150,000, while the average fitness distances are between 2,500 and 9,000 in the case of the 2-opt operator. this function is parabolic, because small and large fitness values have a large average fitness distance, while medium fitness values have a small average distance. the averages of the hamming distances do not depend on the fitness values; they move in a relatively small interval, and the same can be said for the basic swap sequence distance. in the case of the order crossover, the fitness values of the solutions range from 120,000 to 140,000. the averages of the fitness distances taken from the solutions are large for small and large fitness values, while they are small for average fitness values, so their function is parabolic. the hamming and basic swap sequence distances are condensed into a single point; the averages of the distances do not depend on the fitness values. the cycle crossover values are also similar to the 2-opt and order crossover values: here the fitness values range from 120,000 to 140,000, the fitness distances average between 2,000 and 8,500, the hamming distances range from 28 to 34, and the basic swap sequence distances range from 23 to 26. the partially matched crossover results are also similar to those of the other three operators. the fitness distances taken from the best solution for the 2-opt operator increase as a function of the fitness value, while the hamming and basic swap sequence distances do not correlate with the fitness values, and these distances are also in a large interval here.
table 2: adaptive walk results (lower bound – upper bound)
• 2-opt: fitness values 120,000–150,000; average of fitness distances 4,000–21,000; average of hamming distances 20–34; average of basic swap sequence distances 13–26; fitness distance of the best solution 0–28,000; hamming distance of the best solution 2–39; basic swap sequence distance of the best solution 1–30; cost of density 1–3.
• order crossover: fitness values 110,000–140,000; average of fitness distances 2,000–17,000; average of hamming distances 16–34; average of basic swap sequence distances 12–26; fitness distance of the best solution 1,000–21,000; hamming distance of the best solution 2–32; basic swap sequence distance of the best solution 1–28; cost of density 1–7.
• cycle crossover: fitness values 110,000–130,000; average of fitness distances 2,000–15,000; average of hamming distances 12–30; average of basic swap sequence distances 8–24; fitness distance of the best solution 0–18,000; hamming distance of the best solution 2–36; basic swap sequence distance of the best solution 1–28; cost of density 1–14.
• partially matched crossover: fitness values 110,000–140,000; average of fitness distances 2,000–21,000; average of hamming distances 22–36; average of basic swap sequence distances 16–28; fitness distance of the best solution 0–24,000; hamming distance of the best solution 4–40; basic swap sequence distance of the best solution 2–34; cost of density 1–9.

for the order crossover, the fitness distances also increase linearly as a function of the fitness value, and the hamming and basic swap sequence distances do not correlate with the fitness values; these distances are also in a large interval here. the results of the cycle crossover are similar to the results of 2-opt and order crossover: the fitness distances here are between 500–16,000, the hamming distances are between 2–42 and the basic swap sequence distances range from 1–34. the situation is similar in the case of the partially matched crossover: here the fitness distances are between 0–14,000, the hamming distances are between 6–40, and the basic swap sequence distances are between 4–32. the cost density values move in a small interval for each operator, which means that the fitness values are different for each solution. the cost density value for 2-opt is 1–2, 1–3 for the order crossover, 1–3 for the cycle crossover, and 1–3 in the case of the partially matched crossover. table 4 illustrates the results of the uphill-downhill walk.

3.5. summary analysis

based on the test results, we can summarize our experiments as follows. for the walk techniques presented above, the fitness values are good in the following case: the difference between the lower and the upper boundary is great, and the lower limit is small, because the objective is to minimize the fitness value. the average of the fitness, hamming and basic swap sequence distances is good in the following case: the greater the distances, the better the algorithm maps the search space. the evaluation of the fitness, hamming and basic swap sequence distances of the best solution is as follows: when the lower bound is small and the upper bound is large, then the operator explores the space well (large upper bound), but it has also found a very good solution (small lower bound).
table 3: reverse adaptive walk results (lower bound – upper bound)
• 2-opt: fitness values 120,000–150,000; average of fitness distances 3,000–16,000; average of hamming distances 24–34; average of basic swap sequence distances 17–25; fitness distance of the best solution 500–21,000; hamming distance of the best solution 2–40; basic swap sequence distance of the best solution 1–30; cost of density 1–2.
• order crossover: fitness values 130,000–140,000; average of fitness distances 800–4,000; average of hamming distances 28–34; average of basic swap sequence distances 22–28; fitness distance of the best solution 500–6,000; hamming distance of the best solution 6–40; basic swap sequence distance of the best solution 6–34; cost of density 1–4.
• cycle crossover: fitness values 130,000–140,000; average of fitness distances 1,200–3,800; average of hamming distances 28–34; average of basic swap sequence distances 22–27; fitness distance of the best solution 0–7,000; hamming distance of the best solution 6–40; basic swap sequence distance of the best solution 5–32; cost of density 1–5.
• partially matched crossover: fitness values 130,000–150,000; average of fitness distances 1,500–7,000; average of hamming distances 29–35; average of basic swap sequence distances 22–28; fitness distance of the best solution 3,000–11,000; hamming distance of the best solution 4–40; basic swap sequence distance of the best solution 2–34; cost of density 1–3.

the evaluation of the cost density is as follows: for many small values, the operator maps the space well; the objective is to create many solutions with different fitness values. table 5 shows that the 2-opt operator is efficient, and the cycle crossover has a weak performance. in the case of the adaptive walk, the cycle crossover has the smallest average fitness distance, and the average hamming and basic swap sequence distances are also the smallest here. the cost of density value is the highest in the case of the cycle crossover, which means that we got several solutions with the same fitness value during the walk. according to the order crossover cost of density diagram, several solutions also have the same fitness value. the cost of density values are the lowest for 2-opt, so this operator produced the highest number of different solutions. since the objective is to map the search space as well as possible, the 2-opt operator proved to be the best in this measurement experiment. in the reverse adaptive walk analysis, we find the greatest distances between the solutions for the 2-opt operator. the hamming and basic swap sequence distances are approximately the same for each operator. the cost of density values are low for all operators, but the lowest for 2-opt; in the case of the cycle crossover, several solutions again have the same fitness value. the results of the uphill-downhill walk are nearly identical for each operator, and the cost density values are small for all operators. during the neutral walk, we got the highest value of the average distances of the fitness values in the case of the partially matched crossover.
table 4: uphill-downhill walk results (lower bound – upper bound)
• 2-opt: fitness values 130,000–150,000; average of fitness distances 2,500–9,000; average of hamming distances 26–34; average of basic swap sequence distances 19–26; fitness distance of the best solution 1,500–15,000; hamming distance of the best solution 4–36; basic swap sequence distance of the best solution 2–30; cost density 1–2.
• order crossover: fitness values 120,000–140,000; average of fitness distances 3,000–9,500; average of hamming distances 30–36; average of basic swap sequence distances 23–28; fitness distance of the best solution 1,000–18,000; hamming distance of the best solution 2–38; basic swap sequence distance of the best solution 4–32; cost density 1–3.
• cycle crossover: fitness values 120,000–140,000; average of fitness distances 2,000–8,500; average of hamming distances 28–34; average of basic swap sequence distances 23–26; fitness distance of the best solution 500–16,000; hamming distance of the best solution 2–42; basic swap sequence distance of the best solution 1–34; cost density 1–3.
• partially matched crossover: fitness values 120,000–130,000; average of fitness distances 2,000–8,000; average of hamming distances 32–36; average of basic swap sequence distances 24–28; fitness distance of the best solution 0–14,000; hamming distance of the best solution 6–40; basic swap sequence distance of the best solution 4–32; cost density 1–3.

table 5: summary
• adaptive walk – effective: 2-opt; weak: cx, ox.
• reverse adaptive walk – effective: 2-opt; weak: cx.
• uphill-downhill walk – no operator stood out as clearly effective or weak.

the average hamming and basic swap sequence distances were the greatest for the partially matched crossover operator, which means that this operator created the highest number of different solutions. the order crossover distance values are also high, but the cycle crossover and 2-opt fitness distances are low. based on the cost density diagrams, the cycle crossover solutions are the most unchanged, but the order crossover solutions also do not change much. the 2-opt and partially matched crossover solutions vary greatly, with almost every solution having a unique fitness value. the practical significance of the presented fitness state space analysis lies in the selection of the appropriate optimization operators. we can assume that the search space topology of transportation problems of a similar architecture is also similar; based on that assumption, our findings on the optimal search space operators can be generalized to vrp problems of the same type.

4. conclusions and future work

in this paper, we have presented the fitness state space analysis of a complex vehicle routing problem. in our fitness state space analysis, we examined the efficiencies of four operators: 2-opt, cycle crossover, order crossover, partially matched crossover. the adaptive walk, reverse adaptive walk and uphill-downhill walk techniques were used as the analytical method. the adaptive walk moves to a better neighbour of the previous solution in each iteration; the reverse adaptive walk is just the opposite.
during the uphill-downhill walk, the adaptive and reverse adaptive steps alternate (a minimal code sketch of the three walks is given at the end of this section). the walk results were analysed using the fitness values, the average of fitness distances, the average of hamming distances, the average of basic swap sequence distances, the fitness distances of the best solution, the hamming distances of the best solution, the basic swap sequence distances of the best solution and the cost of density. based on the test results, the 2-opt operator proved to be much more efficient than the other operators. our further research will investigate other operators, such as the er, edge-2, edge-3, mpx, rar and gnx operators. in addition, we plan to analyse the search spaces of other permutation-based discrete optimization problems, such as scheduling tasks (parallel machine scheduling, job shop, flow shop, etc.).
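to make the three walk techniques concrete, the following sketch runs them over a permutation-encoded routing tour with a 2-opt neighbourhood. it is a minimal illustration under our own assumptions, not the authors' implementation: the random instance, the function names and the reading of "closest" as "closest in fitness to the current solution" are ours (the classical adaptive walk, stepping to the fittest neighbour, would only change the selection key).

```python
import random

def tour_length(tour, dist):
    # total length of a closed tour under a distance matrix (the fitness value)
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def two_opt_neighbours(tour):
    # every tour reachable by one 2-opt move (reversal of a contiguous segment)
    n = len(tour)
    for i in range(n - 1):
        for j in range(i + 2, n):
            yield tour[:i + 1] + tour[i + 1:j + 1][::-1] + tour[j + 1:]

def walk(tour, dist, steps, mode):
    # "adaptive": step to the neighbour whose fitness is closest to the current one;
    # "reverse": step to the farthest one; "uphill-downhill": alternate the two steps
    trace = [tour_length(tour, dist)]
    for t in range(steps):
        f = tour_length(tour, dist)
        gap = lambda s: abs(tour_length(s, dist) - f)
        neighbours = list(two_opt_neighbours(tour))
        pick = min if (mode == "adaptive" or (mode == "uphill-downhill" and t % 2 == 0)) else max
        tour = pick(neighbours, key=gap)
        trace.append(tour_length(tour, dist))
    return tour, trace

random.seed(1)
n = 10
dist = [[0 if i == j else random.randint(1, 100) for j in range(n)] for i in range(n)]
start = random.sample(range(n), n)
for mode in ("adaptive", "reverse", "uphill-downhill"):
    print(mode, walk(start[:], dist, steps=5, mode=mode)[1])
```

the `trace` of fitness values along each walk is exactly the raw material from which the fitness, hamming and swap-sequence distance statistics of tables 3 and 4 would be computed.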
detection of facial features in scale-space

p. hosten, m. asbach

abstract
this paper presents a new approach to the detection of facial features. a scale adapted harris corner detector is used to find interest points in scale-space. these points are described by the sift descriptor; thus invariance with respect to image scale, rotation and illumination is obtained. applying a karhunen-loeve transform reduces the dimensionality of the feature space. in the training process these features are clustered by the k-means algorithm, followed by a cluster analysis to find the most distinctive clusters, which represent facial features in feature space. finally, a classifier based on the nearest neighbor approach is used to decide whether the features obtained from the interest points are facial features or not.

keywords: clustering methods, face recognition, feature extraction, interest points, karhunen-loeve transforms, object detection, pattern classification.

1 introduction
face detection is one of the most challenging tasks in object recognition, because of the high variance among human faces, including facial expression. furthermore, the typical challenges of object detection, such as variability in scale, orientation and pose, as well as occlusion and lighting conditions, have to be treated by a face detector. in recent years, several methods have been developed for face detection that deal with some but not all sources of variance. holistic approaches such as eigenfaces [12] (sometimes called image-based [4] or appearance-based [14] methods in the context of detection and recognition) or the boosted cascade of simple features [13] perform object detection by classifying image regions through a sliding window [15]. while they can be made invariant to scale and lighting, the major drawback of these techniques is that they cannot deal with different rotation or pose. in addition, facial expression and occlusion must be treated as intra-class variance, decreasing the performance of the classifier/detector. on the other hand, feature-based approaches handle pose, expression and even partial occlusion very well; rotation invariance is only limited by the properties of the features used to build the detector.

this paper presents a new approach to the detection of facial features. the local features used for detection are invariant to scale, rotation and change in illumination, and are robust to changing viewpoints. we show that the chosen features are well suited to detecting several meaningful parts of the human face, such as the pupils, nostrils and the corners of the mouth. the paper is organized as follows: in section two, the feature extraction process is described. feature space reduction and feature selection are explained in section three. section four presents the classifier, and the results are shown in section five. in section six, the results are discussed.

2 feature extraction
this section explains feature extraction from an image and its description in feature space; the features have to be invariant to affine transformations and illumination. while there have been numerous proposals for appropriate feature points, they have usually been chosen on the basis of human observation of face characteristics. in contrast to such knowledge-based methods, several methods have been developed to detect structures that are generally easy to locate, can be computed with high reliability, and satisfy the demand of scale, rotation and illumination invariance [6, 10].
the local area around these so-called interest points is then extracted and modeled. a 3d scale-space representation of an image is usually taken to detect local features and their corresponding scales (see fig. 1). given any image $I(\mathbf{x})$, its scale-space representation $L(\mathbf{x}, \sigma)$ is defined by

$$L(\mathbf{x}, \sigma) = g(\mathbf{x}, \sigma) * I(\mathbf{x}) \tag{1}$$

where $g(\mathbf{x}, \sigma)$ denotes the gaussian kernel function given by

$$g(\mathbf{x}, \sigma) = \frac{1}{2\pi\sigma^{2}} \exp\!\left( -\frac{\mathbf{x}^{\mathsf T}\mathbf{x}}{2\sigma^{2}} \right) \tag{2}$$

fig. 1: the image $I(\mathbf{x})$ is embedded into a continuous family $L(\mathbf{x}, \sigma)$ of gradually smoother versions of it; the original image corresponds to the scale $\sigma = 0$ [11]. interest points are detected in this scale-space.

our approach uses the scale adapted harris corner detector to detect interest points. this detector is based on the second moment matrix [7]:

$$\mu(\mathbf{x}, \sigma_I, \sigma_D) = \sigma_D^{2}\, g(\mathbf{x}, \sigma_I) * \begin{pmatrix} L_x^{2}(\mathbf{x}, \sigma_D) & L_x L_y(\mathbf{x}, \sigma_D) \\ L_x L_y(\mathbf{x}, \sigma_D) & L_y^{2}(\mathbf{x}, \sigma_D) \end{pmatrix} \tag{3}$$

this matrix describes the gradient distribution in a local neighborhood of a point. its eigenvalues $\lambda_1, \lambda_2$ represent the two principal signal changes. thus it is possible to extract points with a significant signal change in both orthogonal directions, indicating edges or junctions, for example. since it is easier to compute the trace and the determinant of the second moment matrix than its eigenvalues, the harris detector uses the following measure to determine the location of interest points [3]:

$$\mathrm{cornerness} = \lambda_1 \lambda_2 - \alpha (\lambda_1 + \lambda_2)^2 = \det(\mu(\mathbf{x}, \sigma_I, \sigma_D)) - \alpha\, \mathrm{trace}^{2}(\mu(\mathbf{x}, \sigma_I, \sigma_D)) \tag{4}$$

this measure is not suitable for detecting the maximum over scales in a scale-space representation; thus the normalized laplacian-of-gaussian is used for automatic scale selection [5]:

$$\mathrm{LoG}(\mathbf{x}, \sigma_n) = \sigma_n^{2} \left| L_{xx}(\mathbf{x}, \sigma_n) + L_{yy}(\mathbf{x}, \sigma_n) \right| \tag{5}$$

the region around interest points is described using local descriptors. recently, several descriptors have been developed [8]. this paper uses the sift descriptor, based on the gradient distribution in the detected region around the interest point [6]. the size of the region being described depends on the detection scale $\sigma$; thus a scale invariant description is obtained. the resulting descriptor maps each interest point into a 128-dimensional vector $\mathbf{m}$. in addition, a dominant orientation is calculated for each interest point, and the descriptors are calculated on the basis of this angle to gain rotation invariance.
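the scale-adapted harris measure translates directly into code. the sketch below builds the gaussian scale-space with scipy and evaluates eqs. (3)–(5); the random image, the chosen σ values and the constant α = 0.06 are illustrative assumptions (the paper does not state them), not values from the authors' setup.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def harris_cornerness(img, sigma_d, sigma_i, alpha=0.06):
    # derivatives of the image smoothed at the differentiation scale sigma_d
    L = gaussian_filter(img, sigma_d)
    Lx = np.gradient(L, axis=1)
    Ly = np.gradient(L, axis=0)
    # entries of the second moment matrix (eq. 3), integrated at scale sigma_i
    w = sigma_d ** 2
    a = w * gaussian_filter(Lx * Lx, sigma_i)
    b = w * gaussian_filter(Lx * Ly, sigma_i)
    c = w * gaussian_filter(Ly * Ly, sigma_i)
    det = a * c - b * b
    trace = a + c
    return det - alpha * trace ** 2          # eq. (4)

def log_response(img, sigma):
    # scale-normalised laplacian-of-gaussian used for scale selection (eq. 5)
    L = gaussian_filter(img, sigma)
    Lxx = np.gradient(np.gradient(L, axis=1), axis=1)
    Lyy = np.gradient(np.gradient(L, axis=0), axis=0)
    return sigma ** 2 * np.abs(Lxx + Lyy)

rng = np.random.default_rng(0)
img = rng.random((64, 64))
for s in (1.0, 2.0, 4.0):                     # a few levels of the scale-space
    c = harris_cornerness(img, sigma_d=s, sigma_i=1.4 * s)
    print(s, float(c.max()), float(log_response(img, s).max()))
```

an interest point would then be a spatial maximum of the cornerness whose log response also peaks over the scale axis, which yields the detection scale σ used to size the sift region.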
3 feature reduction and selection
in this section, dimensionality reduction and feature selection are explained. the karhunen-loeve transform is applied to reduce the dimensionality; then the clustering process and the cluster analysis are described.

3.1 karhunen-loeve transform
it is a well-known problem in the machine learning domain that the number of required training samples increases exponentially with the dimensionality of the feature space; this is called the "curse of dimensionality" [1]. since the sift descriptor forms a 128-dimensional vector, the karhunen-loeve transform is applied to reduce the dimensionality. positive side effects are a reduction in computing time for the classification and the fact that the elements of the feature vector $\mathbf{m}$ become uncorrelated. hence the covariance matrix $\boldsymbol{\Sigma}$ of the $n$ training feature vectors $\mathbf{m}$ is used to find a linear subspace for a better representation:

$$\boldsymbol{\mu} = \frac{1}{n} \sum_{i=1}^{n} \mathbf{m}_i, \qquad \boldsymbol{\Sigma} = \frac{1}{n-1} \sum_{i=1}^{n} (\mathbf{m}_i - \boldsymbol{\mu})(\mathbf{m}_i - \boldsymbol{\mu})^{\mathsf T}$$

therefore the covariance matrix $\boldsymbol{\Sigma}$ has to be orthogonalized by an eigenvector transform:

$$\boldsymbol{\Phi}^{\mathsf T} \boldsymbol{\Sigma}\, \boldsymbol{\Phi} = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix} \tag{6}$$

the columns of the transform matrix $\boldsymbol{\Phi}$ are the eigenvectors of $\boldsymbol{\Sigma}$. these eigenvectors are orthonormal and represent the basis vectors. the use of only a subset of these eigenvectors as a transformation matrix $\tilde{\boldsymbol{\Phi}}^{\mathsf T}$ leads to a reduced dimensionality of the feature space [2]:

$$\mathbf{f} = \tilde{\boldsymbol{\Phi}}^{\mathsf T} (\mathbf{m} - \boldsymbol{\mu}) \tag{7}$$

3.2 clustering
in contrast to knowledge-based methods, the exact facial features are not defined beforehand. for clustering the training feature vectors $\mathbf{f}$ in the reduced feature space we need, however, to estimate their number. a single cluster represents a distribution of feature vectors $\mathbf{f}$ that belong to the same facial feature. the number of clusters must therefore be large enough to offer at least one cluster centroid per facial feature. choosing too many clusters, however, leads to overtraining, i.e. a single facial feature will be represented by a multitude of cluster centroids that are adapted to the training set instead of generalizing a given feature. hence an evaluation with different numbers of clusters was conducted (see section 4) to find the optimum value ex post, based on the classifier performance. a detailed description of the clustering process is given in the following paragraphs. first of all, the training feature vectors $\mathbf{f}$ are split into two subsets. the first subset contains all feature vectors describing the facial features (class $a = 1$); in the second subset, all feature vectors describing the rest of the image are aggregated (class $a = 0$):

$$F_{\mathrm{face}} = \{ \mathbf{f} \mid a = 1 \}, \qquad F_{\mathrm{non\text{-}face}} = \{ \mathbf{f} \mid a = 0 \} \tag{8}$$

each subset is clustered with the k-means algorithm. thereby the number of clusters in each subset is kept proportional to the respective number of features $\mathbf{f}$; in other words, the average number of features represented by each cluster is chosen to be identical for both subsets.

3.3 cluster analysis
a cluster analysis is applied to $F_{\mathrm{face}}$ to find the most characteristic clusters of facial features. the cluster precision introduced in [9] is a measure of the representativeness of a cluster:

$$p_j^{a} = \frac{\# f_j^{a}}{\# f_j^{a=0} + \# f_j^{a=1}} \tag{9}$$

fig. 2: cluster $j$ represents features of class $a = 0, 1$. the cluster precision is $p_j^{a=0} = 0.67$ and $p_j^{a=1} = 0.33$. this results in the probability $p_j^{a}$ that a feature of class $a$ is represented by cluster $j$.
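the reduction and clustering pipeline of sections 3.1–3.3 can be sketched as follows. this is an illustrative reimplementation under our own assumptions — random stand-ins for the sift descriptors, 20 kept eigenvectors, 50 features per cluster and scikit-learn's KMeans — not the authors' code; only the 0.9 precision threshold is taken from the paper.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
m_face = rng.normal(0.5, 1.0, (400, 128))    # stand-ins for 128-d sift vectors, a = 1
m_rest = rng.normal(0.0, 1.0, (2000, 128))   # stand-ins for the rest of the image, a = 0
m_all = np.vstack([m_face, m_rest])

# karhunen-loeve transform (eqs. 6-7): project onto the leading eigenvectors
mu = m_all.mean(axis=0)
cov = np.cov(m_all - mu, rowvar=False)
eigval, eigvec = np.linalg.eigh(cov)
phi = eigvec[:, np.argsort(eigval)[::-1][:20]]       # keep 20 eigenvectors (assumed)
f_face = (m_face - mu) @ phi
f_rest = (m_rest - mu) @ phi

# k-means on the face subset; cluster count proportional to class size (section 3.2)
per_cluster = 50                                      # assumed average features per cluster
km_face = KMeans(n_clusters=len(f_face) // per_cluster, n_init=10).fit(f_face)

# cluster precision (eq. 9): quantize *all* features against the face clusters
labels_all = km_face.predict(np.vstack([f_face, f_rest]))
is_face = np.r_[np.ones(len(f_face), bool), np.zeros(len(f_rest), bool)]
targets = []
for j in range(km_face.n_clusters):
    members = labels_all == j
    p_face = is_face[members].mean() if members.any() else 0.0   # p_j^{a=1}
    if p_face > 0.9:                                  # keep the most distinctive clusters
        targets.append(j)
print("target clusters:", targets)
```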
in order to determine which feature is represented by which cluster, a vector quantization is applied to all training feature vectors $\mathbf{f}$; thus the cluster precision for each cluster is obtained. finally, only clusters that represent facial features ($a = 1$) and whose cluster precision is higher than 0.9 are kept to form the classifier. in this way the most distinctive clusters representing the facial features in feature space are determined. in the following, these clusters will be denoted as target clusters.

4 classifier
the distances of the feature vectors $\mathbf{f}$ to the target clusters determined in section 3.3 are a measure of the performance of the classification process. as shown in fig. 3, the training features assigned to a face ($a = 1$) are closer to the target cluster than the other features ($a = 0$). a threshold can therefore be calculated from the training features for a given false-positive rate. in this context, false positive denotes all features wrongly classified as facial features, whereas true positive denotes all features correctly classified as facial features. in this way a classifier is obtained. up to this point, all evaluations have been based on the training set only. this is necessary for a cross evaluation study, which requires that tests are performed only on data that has not been part of the training process. the dataset is split up into 46 subsets containing approx. 49 images each. during cross evaluation, a training corpus is formed from a random selection of 45 subsets, leaving a single subset as a test set. the results of such evaluation runs are averaged for the final results presented in section 5. a single run consists of the following steps:
1. all sift descriptors of the test set are projected into the feature space found in section 3.1.
2. for all such features, the respective nearest neighbors among the target clusters from section 3.3 are calculated.
3. all features whose distance to their target cluster is smaller than a given threshold are classified as facial features.
4. from the annotation of the dataset, true positives and false positives are discriminated.
5. by varying the threshold, the classifier sensitivity is adjusted; a roc curve of true positives vs. false positives results.

this procedure is repeated for a number of cluster densities (see fig. 5) to find the ideal number of clusters.

5 results
in this section the results of the previously introduced interest point detector and classifier are presented. our dataset consists of 2254 images showing one to three people in cluttered backgrounds. as described in section 4, the dataset is randomly split into a training set containing 2205 images and a test set containing 49 images. fig. 4 depicts the distribution of the interest points over the face area. about 150,000 facial interest points have been extracted from all images of the whole dataset; they are mapped into a normalized coordinate system to achieve comparability. as can be seen, eyes, nostrils and corners of the mouth are extracted very reliably. thus these interest points are suitable for facial analysis. in order to determine the best setup of the classifier, the influence of the number of clusters on its performance has been analyzed. the result is shown in fig. 5: obviously, the effect of overfitting arises for a large number of clusters and a low cluster density, respectively. fig. 6 shows the distances of the target clusters to the features $\mathbf{f}$ extracted from the test set.
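the evaluation loop of section 4 (steps 1–5) reduces to a sweep over a distance threshold. the sketch below is our own illustration with synthetic distances standing in for the distances of test features to their nearest target cluster; none of the numbers come from the paper.

```python
import numpy as np

def roc_points(dist_to_target, is_face, thresholds):
    # steps 3-5: classify by a distance threshold, count true/false positives
    pts = []
    for th in thresholds:
        positive = dist_to_target < th
        tp = positive[is_face].mean()        # true-positive rate
        fp = positive[~is_face].mean()       # false-positive rate
        pts.append((fp, tp))
    return pts

# hypothetical test-set distances: face features lie closer to their target cluster
rng = np.random.default_rng(1)
d_face = rng.gamma(2.0, 1.0, 500)
d_rest = rng.gamma(2.0, 1.0, 10000) + 2.0
dist = np.r_[d_face, d_rest]
is_face = np.r_[np.ones(500, bool), np.zeros(10000, bool)]

for fp, tp in roc_points(dist, is_face, thresholds=np.linspace(0.5, 4.0, 8)):
    print(f"fp = {fp:.3%}  tp = {tp:.1%}")
```

picking the threshold whose false-positive rate matches the acceptable level (0.1 %–0.2 % in the paper, because non-face interest points outnumber face points roughly 20 to 1) then fixes the operating point of the classifier.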
just as has been observed for the training set, the test features assigned to a face ($a = 1$) are closer to the target cluster than the other features ($a = 0$). since only a small part of the image area is covered by human faces, the number of interest points describing facial features is 20 times lower than the number of interest points found in the remaining image. that is why only really low false-positive rates (0.1 %–0.2 %) are acceptable. setting a threshold for 0.2 % false positives results in a detection rate of about 20 %, which seems acceptable given the fact that this corresponds to about 10 correctly detected facial feature points per image, whereas approximately 2 features are wrongly classified. this aspect is exemplified in the following: fig. 7 shows all interest points detected by the scale-adapted harris corner detector. these points, indicated by circles, are especially located in corner-like structures. the result of the classifier can be seen in fig. 8, which shows only the interest points that have been classified as facial features; these features represent the eyes, eyebrows and corners of the mouth.

fig. 3: distances of the training feature vectors f to the target cluster
fig. 4: this figure shows the distribution of interest points detected in a face by the scale adapted harris corner detector. the interest points are mapped into the normalized coordinate system.
fig. 5: this figure shows the influence of different cluster densities on the performance of the classifier. the cluster density is a measure of the number of training features represented by a cluster; the smaller the value, the more clusters are computed.
fig. 6: distances of the test feature vectors f to the target cluster

6 conclusion
this paper has introduced a new approach to the detection of facial features. our approach builds on local image descriptions that are invariant towards affine image transformations and illumination. we have shown that the scale adapted harris detector yields feature points that are suitable for the detection of specific facial features like eyes, nostrils and corners of the mouth, and that these features can be appropriately described by the sift descriptor. our new method is able to detect on average 10 features per face, with a false-positive rate of 0.2 %, which corresponds to approximately 2 wrongly classified features per image.

acknowledgements
the research described in this paper was supervised by prof. j.-r. ohm, institute of communications engineering at rwth aachen university.

references
[1] bellman, r.: adaptive control processes: a guided tour. princeton university press, 1961.
[2] fukunaga, k.: introduction to statistical pattern recognition. 1972.
[3] harris, c., stephens, m.: a combined corner and edge detector. alvey vision conference, 1988.
[4] hjelmas, e., low, b. k.: face detection: a survey. computer vision and image understanding, vol. 83 (2001), no. 3, p. 236–274.
[5] lindeberg, t.: edge detection and ridge detection with automatic scale selection. international journal of computer vision, 1998.
[6] lowe, d. g.: distinctive image features from scale-invariant keypoints. international journal of computer vision, 2004.
fig. 7: this image of the test set shows all interest points detected by the scale-adapted harris corner detector. the diameter of the circle indicates the appropriate scale on which the interest point has been detected.
fig. 8: this image shows only the interest points that have been classified as facial features.

[7] mikolajczyk, k., schmid, c.: scale & affine invariant interest point detectors. international journal of computer vision, 2004.
[8] mikolajczyk, k., schmid, c.: a performance evaluation of local descriptors. ieee transactions on pattern analysis and machine intelligence, 2005.
[9] mikolajczyk, k., leibe, b., schiele, b.: local features for object class recognition. technical report, multimodal interactive systems, tu darmstadt, germany, 2005.
[10] mikolajczyk, k., tuytelaars, t., schmid, c., zisserman, a., matas, j., schaffalitzky, f., van gool, l., kadir, t.: a comparison of affine region detectors. international journal of computer vision, 2005.
[11] ter haar romeny, b., florack, l., koenderink, j., viergever, m. (eds.): scale-space theory in computer vision. first international conference, scale-space '97, utrecht, the netherlands, july 2–4, 1997, proceedings. springer, 1997.
[12] turk, m., pentland, a.: eigenfaces for recognition. journal of cognitive neuroscience, vol. 3 (1991), no. 1, p. 71–86.
[13] viola, p., jones, m.: rapid object detection using a boosted cascade of simple features. conference on computer vision and pattern recognition, 2001.
[14] yang, m.-h., kriegman, d. j., ahuja, n.: detecting faces in images: a survey. ieee transactions on pattern analysis and machine intelligence, vol. 24 (2002), no. 1, p. 34–58.
[15] zhang, j., yan, y., lades, m.: face recognition: eigenface, elastic matching and neural nets. proceedings of the ieee, vol. 85 (1997), no. 9.

peter hosten, e-mail: hosten@ient.rwth-aachen.de
ing. mark asbach, e-mail: asbach@ient.rwth-aachen.de
institute of communications engineering, rwth aachen university, 52056 aachen, germany

fuzzy concepts in the detection of unexpected situations

j. bíla, j. jura

abstract
this paper establishes three essential classes of unexpected situations (ux1, ux2, ux3), and concentrates attention on ux3 detection. the concepts of a special model of a system of situations (mss) and a model of a system of faults (msf) are introduced. an original method is proposed for detecting unexpected situations, indicating a violation of a proper invariant of the mss (msf). the presented approach offers a promising application for the starting and ending phases of complex processes, for knowledge discoveries on data and knowledge bases developed with incomplete experience, and for modeling communication processes with unknown (disguised) communication subjects. the paper also presents a way to utilize ill-separable situations for ux3 detection. the paper deals with the conceptual background for detecting ux3 situations, recapitulates recent results in this field and opens the way for further research.

keywords: unexpected situations, model of system of situations, model of system of faults, invariants, fuzzy variables, degree of unexpectedness, emergence zone, association rules, hasse diagram.

(this text was a part of the international conference on advanced engineering design (aed 2006), which was held in prague in 2006.)

1 introduction
the approaches to unexpected situations (as a special class of so-called undetectable faults) come from various domains, and they are referenced, e.g., in [1], [2], [3], [4], [5], [6], [7], [8], [9]. (a detailed analysis of these sources is in [17].) our approach to ux3 detection is based on the concept of a ux3 type situation, on the concepts of a model of a system of situations (mss) and a model of a system of faults (msf), and on an original method which detects a ux3 type unexpected situation as a violation of a proper structural invariant constructed on the mss (msf). the structural invariant is constructed on the mss during the so-called "cognitive phase" of mss (msf) development. in this "cognitive phase", it is considered that some "classes" of situations (faults) have already been established, but some of the new situations are processed with large uncertainty. violation of the structural invariant represents the detection of a ux3 type situation. (a more detailed explanation is given in section 2.)

analysing our approach from the fault diagnosis (fdi) point of view, some important issues may be indicated. the first issue is the concept of the mss (msf) in the context of the model-based approach in fdi, and the assignment of our method to some of the known diagnostic approaches, e.g. to abductive diagnostics.
our approach is model-based, though the development and the use of the models (mss, msf) are different from the examples introduced, e.g., in [10]. our models are developed as a result of data and signal analysis (not as a result of preformed knowledge about the diagnosed system, e.g., knowledge about its internal structure or about a mathematical model). as will be explained in the following sections, the phase of fault detection in an abductive diagnostic model (e.g., in [11], [12]) is a rather special case of ux3 detection in our method.

the second issue is the concept of a symptom. our approach concentrates on processing situation signals (data) of the following types: vectors of outputs from qualitative models, ultrasound signals (representing the internal structure of material samples), sequences of symbols (signs) and words from monitoring processes, and ecg and eeg signals. special situations (faults) which represent an extraordinary (faulty) behavior of a monitored (diagnosed) system are spread along the run of the signals (e.g., in their morphology) and in the sets of data; it is sometimes hard to speak about symptoms (symptoms of what?).

the presented approach to ux3 detection has been tested in the following three application cases:
– in a supervisory control system for an industrial distillation column, especially in a qualitative model designed for the starting phase of the distillation process (e.g., after maintenance operations); details in [13, 14];
– in detecting unexpected faults in welds (laser, micro-arc and electron beam welds of thin-walled welded structures used in the aerospace industry), in combination with neural fault detectors [15, 16];
– within the framework of a special supervisory and monitoring system, e.g., in [17].

2 unexpected situations – concepts and examples
general features of unexpected situations and three basic classes of unexpected situations will be introduced in this section. the first class (ux1) is induced by the relativity of the unexpected situation with respect to the levels of available data and knowledge of a reasoning human. (we will denote these situations as ux (ux1) – emphasizing the intuitive aspect of the detection of such situations.)

example 2.1: let us suppose that the extent of values measurable by an instrument (given in the instrument protocol) is (umin, umax). the situation when we measure with this instrument a value of 10·umax (without problems) is a ux1 situation. (one interpretation of such a phenomenon is that wrong information was introduced in the protocol of the instrument.)

example 2.2: the correct representation of a process by a differential equation depends on the identified type of equation and on the precision of the identified quotients. all cases with unknown noise in the input or output variables, unknown drifts in the parameters, and cases of so-called hidden parameters, are cases that generate ux1 situations.
situations of the second class (ux2) are generated by models that are a priori insufficient for representing some situations in the modeled process or system. (this means, e.g., that most situations are well represented, but a small number of situations are represented incorrectly.)

example 2.3: situations generated as an unprovable formula in fol (first order language), situations for which a turing machine does not stop (or works too long), or situations in complex robotic production lines (which are impossible to simulate wholly), belong to this class.

example 2.4: situations generated by models of systems with deterministic chaos, for example systems modeled by the duffing or lorenz equations, belong to this class.

situations of the third class (ux3) have causes that differ from those that induce situations of the two previous classes. one principal scenario for ux3 emergence is shown in fig. 1. a general type of model is considered as a pair (c, i) which represents the synthesis of a carrier (c) and information (i). (a simple example of such a model is a classical photograph, where the photographic paper represents the carrier; what a human interpreter sees in the photograph is the information.) most cases of sign modeling work with so-called "hard" carriers equipped with resistance to the influence of the information that is encoded on them. however, sign and symbolic models with hard carriers are not resistant to the encoding and transfer of false information (though they are very successful even in commercial terms). such a case is described in example 2.5; the whole scenario of this example is important, including the physical and technical background. it is this background that distinguishes this case from a case of ciphering (which lies beyond the scope of our paper). for this reason we search for additional models $M_3(C_3, I_3), \ldots, M_n(C_n, I_n)$, which enable the correctness of the model $M_2(C_2, I_2)$ to be checked with regard to $M_1(C_1, I_1)$. such models in most cases really exist, being induced during the evolution of $M_1(C_1, I_1)$ and $M_2(C_2, I_2)$; however, it is not trivial to discover such models. the correctness of the function of $M_2$ with regard to $M_1$ is verified (in cases when some of the models $M_3, \ldots, M_n$ are discovered) by the symbolic commutation of the transformation diagrams (from fig. 1):

$$h_1 \circ g_1 = f_1, \;\ldots,\; h_n \circ g_n = f_n \tag{1}$$

this brief explanation of the essential concepts for the theory of ux3 situations introduced above will be extended and supplemented by a formal description and a recent application of this theory in section 3.

example 2.5: approximation of ux3 type situations by a scheme with standard intentions. let us consider the scheme in fig. 2.
the scheme consists of an unavailable ("invisible") part and a transparent part. the invisible part contains: a process (for which we have no model), process observer variables (x1, …, xn), non-process external variables (y1, …, ym) and switches. the transparent part contains: an observer and a situation recognition block with classes of situations (s1, …, sq) and with the developed mss. the situation recognition block (which contains the mss) is developed during standard operation of the process, as a result of the work of the observer. the process is "represented" (for the observer) by the process observer variables (x1, …, xn). let us consider that, after a period of successful function, the structure of the situation recognition block is accepted as stable. now let us assume that some of the switches are suddenly switched to the variables (y1, …, ym), which represent another external reality but are formally the same as (x1, …, xn). as a result of this action, the assignment of the new incoming situations into the classes (s1, …, sq) will work incorrectly. such a change is undetectable using standard methods, and it requires a special detection approach.

example 2.6: special ux3 situations emerge in cases when inappropriate intentions (in frege's sense, e.g., in [18]) are used to describe a process. (usual intentions are propositions, quantities or properties.) quantities, for example, are sometimes wrongly used for complex processes (or systems with complicated behavior) which are poorly measurable, and representing them by time series or by complicated signals introduces further difficulties in processing and interpretation. typical examples of such cases are ecg and hrv (heart rate variability) signals. (these facts are known, and they have been published (e.g., in the journal of cardio-vascular research from 1996), and have been presented in our research (e.g., in [28]).)

fig. 1: models, carriers and information
fig. 2: approximation of a ux3 type situation
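the switch scenario of example 2.5 can be made concrete with a toy observer. the sketch below is entirely our own illustration: a nearest-centroid situation recogniser is trained on process variables x and then the inputs are silently replaced with formally identical external variables y. the class assignments keep "working" while becoming meaningless, which is exactly the kind of change the structural invariants of section 3 are meant to expose.

```python
import numpy as np

rng = np.random.default_rng(0)
# "cognitive phase": process observer variables x, two situation classes s1, s2
x1 = rng.normal(0.0, 0.3, (200, 3))
x2 = rng.normal(1.0, 0.3, (200, 3))
centroids = np.array([x1.mean(axis=0), x2.mean(axis=0)])

def recognise(v):
    # nearest-centroid assignment into the established situation classes
    return int(np.argmin(np.linalg.norm(centroids - v, axis=1)))

# the switches flip: formally identical variables y from another external reality
y = rng.normal(0.5, 1.5, (10, 3))
print([recognise(v) for v in y])   # still returns s1/s2 labels, now without meaning
```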
model mss has in general the following form: mss s s s inv inv� , 1� � � �( ), , ( ) , ( ), , ( )� �n i p , (2) where s represents a basic set of situations, � �1( ), , ( )s s� n are structures on s considered as relevant for ux3 detection and inv inv( ), , ( )� �i p� are invariants on some � �1( ), , ( )s s� n � �( , , , )i p n� 1 � for ux3 detection. model msf has in general the following form: msf s f s f s f inv inv � , , ,1, ( ), , ( ) , ( ), , ( ) , � � � � � � n i p , (3) where s f, represent basic sets of situations and faults, � �1 , ,( ), , ( )s f s f� n are structures on (s, f) considered as relevant for ux3 detection and inv inv( ), , ( )� �i p� are invariants on some � �1( ), , ( )s s� n for ux 3 detection. models for ux3 detection have the forms md mss cond md msf cond inv inv ( ) ( ) , ux , ux , v v 3 3 � � or (4) where condvinv represents the conditions of violation of mss (msf) invariants. (these conditions are analysed in the process of ux3 detection.) fig. 3 illustrates the position of the models for ux3 detection in a block scheme of the fdi system respecting one of many possible structures of the fdi system. the figure expresses only the fact that msf and md(ux3) work as a parallel block with a fault recognition system (which is usually understood as an ending member of an fdi system). 3.1 a general type of structural invariants inv(�i ), …, inv(�p ) the general type of invariants inv(�i), …, inv(�p) is connected with the commutation of diagrams (1) and is limited in this paper to the form of morphisms hi, gi, for i � 1, …, n: m c i g m c i m c i h i i i i i i i� � � � � �� �2 2 2 2 2 2 2 2 2( , ) ( ( , )), , ( , )�� i m c i( ( , )).1 1 1 (5) the concrete form of the morphisms depends on the type of models m c i1 1 1( , ) and m c i2 2 2( , ). the following subsections introduce three examples of mss and msf and md(ux3). 3.2 md(ux3) with an emergence zone mss has (in this case) the form (6) mss s m s� � � � � � � , ( ), , , ( , ( ( , ) ))s s� � , (6) where s represents a basic set of situations, m(s) is a matroid constructed on this set of situations, � is a basis on m(s), � represents a cover on m(s) (a subset of the matroid closure), �( , )s � is a metric function and � is a positive real number. in addition to � �, there is the so-called emergence zone �. in this zone there are elements that can not be constructed from the elements in � �, , but these elements are relevant to mss and they are not in zone outside, see fig. 4. with regard to the fact that “regular” situations can be assigned in � or in � or they are classified in zone outside (law of the excluded third alternative), elements of � are considered as extraordinary situations and they represent (in the context of our paper) ux3 situations. in this case md(ux3) has the form (7) md mss md mss ( ) , ) ( , )) ( ) ) ux u ux p 3 3 ,( (s or , ( s � � � � � � � � � � (7) where s* is an unexpected situation and up is a real number (upper bound). © czech technical university publishing house http://ctn.cvut.cz/ap/ 5 acta polytechnica vol. 47 no. 1/2007 process (system) sensors diagnostic model extraction of diagnostic situations fault recognition system detection of unexpected faults fig. 3: position of models for ux3 detection in a rough block scheme of the fdi system m s( ) o s* outside � fig. 4: basis, cover and emergence zone the mss described above was used in real conditions for ux3 detection in the starting phase of a deisobutanisation process [13], [14]. 
3.3 md(ux3) with bipartite graphs
in this case, the msf has the form (8):

$$\mathrm{MSF} = \langle \langle S, F \rangle, G, \langle \{G_1, G_2, \ldots, G_m\}, \gg \rangle \rangle \tag{8}$$

where $\langle S, F \rangle$ represent basic sets of situations and faults, $G$ is a bipartite graph (situations × faults), and $\langle \{G_1, G_2, \ldots, G_m\}, \gg \rangle$ is a special dulmage-mendelsohn decomposition [19, 21] ($G_1, G_2, \ldots, G_m$ are irreducible sub-graphs, "$\gg$" is a tree ordering on the set of the sub-graphs). the detection of ux3 is represented by a violation of the ordering "$\gg$" and is indicated by the following conditions discovered for a situation $s^*$; in this case, md(ux3) has the form (9):

$$\mathrm{MD(UX^3)} = \langle \mathrm{MSF},\ (\mathrm{acc}(G_i, s^*)\ \text{and not}\ \mathrm{acc}(G_j, s^*),\ \text{for some}\ G_j \gg G_i) \rangle \tag{9}$$

where $s^*$ is a ux3 situation related to $\langle \{G_1, G_2, \ldots, G_m\}, \gg \rangle$.

note 3.1: the expression $\mathrm{acc}(G, s)$ denotes "situation $s$ is accepted by the bipartite graph $G$". this md(ux3) model has been used, e.g., in [15] and [16]. (however, taking into account the well-studied concept of dm-irreducibility (e.g., in [20], pages 63–64), we are aware of the limited applicability of the dmd.)

3.4 md(ux3) with association rules
the background for the md(ux3) model presented in this section continues the line started in [22, 23] and nowadays utilises formulations, e.g., from [24, 25]. in this case, the msf has the form (10):

$$\mathrm{MSF} = \langle \langle S, F \rangle, M_G, \mathrm{HD}, \mathrm{ER} \rangle \tag{10}$$

where $\langle S, F \rangle$ represent basic sets of situations and faults, $M_G$ is a qualitative matrix with data acquired from the cognitive phase of fdi system operation (the concept of the "cognitive phase" was introduced and explained in section 1), $\mathrm{HD}$ is a hasse diagram [25] derived from $M_G$, and $\mathrm{ER}$ is a set of evaluated association rules extracted from $\mathrm{HD}$ (each rule evaluated by the quantities supp (rule support) and conf (rule confidence) [24]).

note 3.2: the hasse diagram facilitates the process of extracting the rules from the matrix $M_G$, but its use is not obligatory for the formation of $\mathrm{ER}$.

md(ux3) has (in this case) the form (11):

$$\mathrm{MD(UX^3)} = \langle \mathrm{MSF},\ (\exists r \in \mathrm{ER})(\mathrm{not}\ \mathrm{acc}(r, s^*)) \rangle \tag{11}$$

where $s^*$ is a ux3 situation.

note 3.3: the expression $\mathrm{acc}(r, s^*)$ denotes "situation $s^*$ is accepted by rule $r$".

3.5 additional important circumstances
a. the basic purpose of the mss (msf) and of md(ux3) is the formation of rules that enable a distinction between "ill-separable situations" and "ux3 situations". such decision rules are simple:

if situation s satisfies the invariant, then s is an ill-separable situation.
if situation s does not satisfy the invariant, then s is a ux3 situation.

b. when detecting a ux3 situation, we should like to know how important the discovered ux3 situation is in comparison with other possible ux3 situations (which could be detected by the considered invariant). for this reason a numerical function $d(\langle \mathrm{inv}(\Pi_i), \ldots, \mathrm{inv}(\Pi_p) \rangle, \mathrm{UX^3})$ has been suggested. it is called the degree of ux3 (the degree of unexpectedness) with respect to the invariants $\mathrm{inv}(\Pi_i), \ldots, \mathrm{inv}(\Pi_p)$, and it depends on the following two factors:
– the complexity (the constructibility) of the applied invariant (invariants),
– the sensitivity of the invariant (invariants) to the measure of the violation.
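the rule-based invariant of eqs. (10)–(11) amounts to checking whether a new situation violates any sufficiently supported and confident association rule. the following sketch is an assumed, simplified encoding of that check — rules as antecedent/consequent item sets over boolean situation attributes, with illustrative support/confidence thresholds — and we read "does not satisfy the invariant" as "violates at least one evaluated rule"; it is not the mss/msf implementation from the paper.

```python
# a rule is (antecedent items, consequent items, support, confidence)
RULES = [
    ({"valve_open", "high_flow"}, {"pump_on"}, 0.30, 0.95),
    ({"pump_on"}, {"pressure_ok"}, 0.25, 0.90),
]

def violates(rule, situation, min_supp=0.1, min_conf=0.8):
    # a situation violates rule r when the antecedent holds but the consequent
    # does not; only rules above the thresholds count as part of the invariant
    ante, cons, supp, conf = rule
    if supp < min_supp or conf < min_conf:
        return False
    return ante.issubset(situation) and not cons.issubset(situation)

def is_ux3(situation, rules=RULES):
    # eq. (11): the invariant er is violated when some evaluated rule fails
    return any(violates(r, situation) for r in rules)

print(is_ux3({"valve_open", "high_flow", "pump_on", "pressure_ok"}))  # False: regular
print(is_ux3({"valve_open", "high_flow"}))                            # True: pump_on missing
```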
4 degree of ux3
the following form has been introduced for the computation of $d(\mathrm{inv}(\Pi_p), \mathrm{UX^3})$ (with one invariant $\mathrm{inv}(\Pi_p)$):

$$d(\mathrm{inv}(\Pi_p), \mathrm{UX^3}) = \frac{1}{q_{\mathrm{cpmx}}} \left( \frac{\sum_{i=1}^{|\mathrm{elm}(\mathrm{inv}(\Pi_p))|} \left( \psi_i \left( {}^{1}\kappa(x_i) - {}^{2}\kappa(x_i) \right) \right)^{a}}{|\mathrm{elm}(\mathrm{inv}(\Pi_p))|} \right)^{1/a} \tag{12}$$

where $q_{\mathrm{cpmx}}$ is a quotient of the invariant complexity, $x_i$ are elements of the invariant structure (from the ground set $\mathrm{elm}(\mathrm{inv}(\Pi_p))$), and $\psi_i$ are quotients of importance for the elements $x_i$. the quantities ${}^{1}\kappa(x_i)$ are quotients of deployment of the elements $x_i$ before the invariant violation, and ${}^{2}\kappa(x_i)$ are quotients of deployment for the elements $x_i$ after the violation of the invariant by means of $x_i$. (the number $a = 2$ usually, but not necessarily.)

note 4.1: if an element is not violated, it holds that ${}^{1}\kappa(x_i) = {}^{2}\kappa(x_i)$.

the quotient of structure complexity $q_{\mathrm{cpmx}}$ expresses the difficulty of forming a model of such a structure. the complexity of the structures is compared in [26]. some illustrative examples of the $q_{\mathrm{cpmx}}$ quantities for structures with one composition operation are introduced in table 1.

no.  name of structure                                  qcpmx [1]
1    klein group of the 4th order                       1
2    permutation group of the 6th order (4×4)           0.9455
3    group of linear transformations of the …th order   0.7432
4    semi-group of binary equivalencies                 0.5571
5    semi-group of binary relations                     0.1979
6    matroid of the 2nd order                           0.1750
7    linear regular grammar                             0.1345
8    non-context grammar                                0.1047
9    structure of association rules                     0.0883

table 1
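as a check on eq. (12), the short sketch below recomputes example 4.1 below (and can be fed the values of example 4.2); it encodes our reading of the formula, with the exponent a = 2 and the leading factor 1/q_cpmx as reconstructed above.

```python
def degree_ux3(q_cpmx, psi, kappa_before, kappa_after, a=2):
    # eq. (12): (1/q_cpmx) * (sum_i (psi_i * (k1_i - k2_i))^a / |elm|)^(1/a)
    terms = [(p * (k1 - k2)) ** a for p, k1, k2 in zip(psi, kappa_before, kappa_after)]
    return (sum(terms) / len(psi)) ** (1.0 / a) / q_cpmx

# example 4.1: klein group, elements i, n, r, c; n and r violated (kappa drops to 0.8)
print(degree_ux3(1.0, [1, 1, 1, 1], [1, 1, 1, 1], [1, 0.8, 0.8, 1]))  # 0.1414...
```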
5 conclusions this paper demonstrates the use of a fuzzy approach for modeling very complex problems. fuzzy concepts are contained in all essential conceptual constructs as ux3, mss, msf, a violation of an invariant, emergence zone, evaluated association rules, hasse diagram (and in the fuzzy values and variables). the paper has introduced methods for ux3 detection. it has introduced a general approach for the development of mss, msf and md(ux3) models, and three variants of these models have been described. examples of md(ux3) with an emergence zone, with bipartite graphs and with evaluated associations rules in the role of structural invariants of mss and (msf) have been presented, e.g., in [13]–[17].) in [17] we introduced an illustrative application of md(ux3) with evaluated association rules for the conditions of an industrial monitoring system (developed with data support from “ventilation system of the mrazovka road tunnel in prague” in czech republic). the proposed method may be applied for processes with a similar formal description, e.g., in transport systems, power supply systems, and in the chemical industry. the approach is also well applicable in special signal based cases (with no available model of the observed and detected system), where the signals are acquired from special sensors and especially when neural networks or fuzzy systems are used for processing them. (such applications fields are described, e.g., in [16], [27], [28].) 6 acknowledgments this research has been supported by research grant msm no. 2b06023 references [1] abramovici, m., breuer, m. a., friedman, a. d.: digital system testing and testable design, ieee press, new york, 1995, p. 99–104. [2] wang, h., wu, q. h.: detect and diagnose unexpected changes in the output probability density function for dynamic stochastic system: an identification approach. in proc. 14th world congress of ifac, beijing, china, vol. 17 (1999), p. 217–222. [3] padmanabhan, b., tuzhilin, a.: knowledge refinement based on the discovery of unexpected patterns in data mining, decision support systems, vol. 33 (2002), no. 3, p. 309–321. [4] deruyver, a., hode, y.: image interpretation with a semantic graph: labelling over-segmented images and detection of unexpected objects. in proc. conf. on applications and science of computational intelligence, orlando, fl, 1999, p. 424–432. [5] atkins, e. m., durfee, e. h., shin, k. g.: expecting the unexpected: detecting and reacting to unplanned for world states. in proc. 13th national conf. on artificial intelligence, portland, or, 1996, p. 1377. [6] grégorie, é.: a fast logically-complete preferential reasoner for the assessment of critical situations. in soft computing for risk evaluation and management, (da ruan, j. kacprzyk, m. fedrizzi, eds.) heidelberg, springer-verlag, 2001, p. 121–132. [7] ohsawa, y., yachida, m.: discovery of unknown causes from unexpected co-occurrence of inferred known causes. in proc. conf. on discovery science, fukuoka, japan, 2001, p. 174–185. © czech technical university publishing house http://ctn.cvut.cz/ap/ 7 acta polytechnica vol. 47 no. 1/2007 [8] koscielny, j. m.: application of fuzzy logic for fault isolation in a three-tank system. in proc. 14th world congress of ifac, beijing, china, vol. p, 1999, p. 73–78. [9] cellier, f. e., nebot’, a., escobet, a.: model acceptability measure for identification of failures in qualitative fault monitoring systems. in proc. of int. multiconf. on modelling and simulation esm 99, warsaw, poland, vol. 2 (1999), p. 
[10] frank, p. m., ding, s. x., köppen-seliger, b.: current developments in the theory of fdi. in proc. of 4th ifac symposium on fault detection, supervision and safety for technical processes, budapest, hungary, vol. 1 (2000), p. 16–27.
[11] cox, p. t., pietrzykowski, t.: general diagnosis by abductive inference. in proc. of symp. on logic programming, 1987, p. 183–189.
[12] bécher el ayeb, wang, s.: abduction-based diagnosis: a competition-based neural model simulating abductive reasoning. journal of parallel and distributed computing, vol. 24 (1995), p. 202–212.
[13] bíla, j.: the development of qualitative models for knowledge-based process control support systems. in proc. int. conf. on systems, analysis, control and design, lyon, france, vol. 2 (1994), p. 256–262.
[14] bíla, j.: knowledge-based fault detection system for distillation process. in proc. of int. conf. on artificial intelligence in industry – aii '98, high tatras, slovak republic, 1998, p. 30–37.
[15] bíla, j., koran, l., mankova, r.: detection of ill separable faults. in proc. of the 4th ifac symposium on fault detection, supervision and safety for technical processes – safeprocess 2000, budapest, hungary, 2000, p. 461–465.
[16] bíla, j.: neural network based methods for ultrasonic testing of welded constructions. in proc. of iasted int. symposium on applied informatics 2001, innsbruck, austria, 2001, p. 427–431.
[17] bíla, j.: detection of unexpected faults. in proc. of 8th iasted int. conf. on artificial intelligence and soft computing, marbella, spain, 2004, p. 60–65.
[18] tichý, p.: the foundations of frege's logic. de gruyter, 1988.
[19] iri, m.: a review of recent works in japan on principal partitions of matroids and their applications. annals of the new york academy of sciences, vol. 14 (1979), p. 306–319.
[20] murota, k.: matrices and matroids for systems analysis. springer-verlag, heidelberg, 2000.
[21] sugihara, k.: a graph-theoretical method for monitoring concepts formation. pattern recognition, vol. 11 (1995), p. 1635–1643.
[22] hu, x., cercone, n.: mining knowledge rules from databases. in proc. of 12th int. conf. on data engineering, new orleans, louisiana, 1996, p. 493–515.
[23] pedrycz, w.: fuzzy set technology in knowledge discovery. fuzzy sets and systems, vol. 98 (1998), p. 279–290.
[24] delgado, m., marín, n., sánchez, d., vila, m.-a.: fuzzy association rules: general model and applications. ieee transactions on fuzzy systems, vol. 11 (2003), no. 2, p. 214–226.
[25] yi, z., jianliang, s., da ruan, pengfei, s.: interesting rough lattice-based implication rules discovery. in soft computing for risk evaluation and management (da ruan, j. kacprzyk, m. fedrizzi, eds.), heidelberg, springer-verlag, 2001, p. 155–169.
[26] bíla, j.: identifiability of complex processes and tools for its assessment. dacii/fme/ctu/02/05, czech republic, 2005.
[27] collins, l. m., yan zhang, king li, hua w., carin, l., hart, s. j., rose pehrsson, s. l., nelson, h. h., mcdonald, j. r.: a comparison of the performance of statistical and fuzzy algorithms for unexploded ordnance detection. ieee transactions on fuzzy systems, vol. 9 (2001), p. 17–30.
[28] bíla, j., bukovsky, i., oliveira, t., martins, j.: modelling of influence of autonomic neural system to heart rate variability. in proc. of the seventh iasted international conference on artificial intelligence and soft computing, banff, canada, 2003, p. 345–350.

prof. ing. jiří bíla, drsc., e-mail: jiri.bila@fs.cvut.cz
jakub jura
czech technical university in prague
faculty of mechanical engineering
technická 4
166 07 prague 6, czech republic

identification of nonlinear systems: volterra series simplification
a. novák

traditional measurements of multimedia systems, e.g. the linear impulse response and the transfer function, are sufficient but not faultless. these methods assume a purely linear system and disregard the nonlinearities that are usually present in real systems. one of the ways to describe and analyze a nonlinear system is by using a volterra series representation. however, this representation uses an enormous number of coefficients. in this work a simplification of this method is proposed, and an experiment with an audio amplifier is shown.

keywords: volterra series, nonlinear, system, identification, audio.

1 introduction
as the nonlinear properties of the analyzed multimedia/audio system are unknown, the system is considered as a black box. for such a system, only the input and the output are observable. the black box is time invariant, which means that its properties do not depend explicitly on time. signal y(t) is the system's response at the output to an input signal x(t). in a nonlinear system, more than one input x(t) can produce the same output y(t); the converse, however, is not true, i.e., there is a unique response y(t) to a given input x(t). the black box with its properties can be represented as shown in fig. 1, where the symbol H_n is called a volterra operator. this volterra series theory was introduced in [1] and later used in electro-acoustics with maximum length sequence excitation [2].

fig. 1: schematic representation of a volterra series model

the relation between the output and the input can be expressed as the total sum

y(t) = \sum_{n} H_n[x(t)], (1)

in which

H_n[x(t)] = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} h_n(\tau_1, \dots, \tau_n)\, x(t-\tau_1) \cdots x(t-\tau_n)\, \mathrm{d}\tau_1 \cdots \mathrm{d}\tau_n (2)

represents the n-dimensional convolution of the input signal x(t) with the n-dimensional volterra kernel h_n(\tau_1, \dots, \tau_n). the symbol H_n represents the n-th order volterra operator. if the total volterra series sum is itemized into the sum of the separate convolutions, the relation between the input and the output becomes

y(t) = \int_{-\infty}^{\infty} h_1(\tau_1)\, x(t-\tau_1)\, \mathrm{d}\tau_1 + \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} h_2(\tau_1, \tau_2)\, x(t-\tau_1)\, x(t-\tau_2)\, \mathrm{d}\tau_1\, \mathrm{d}\tau_2 + \int\!\!\int\!\!\int h_3(\tau_1, \tau_2, \tau_3)\, x(t-\tau_1)\, x(t-\tau_2)\, x(t-\tau_3)\, \mathrm{d}\tau_1\, \mathrm{d}\tau_2\, \mathrm{d}\tau_3 + \dots + \int \cdots \int h_n(\tau_1, \dots, \tau_n)\, x(t-\tau_1) \cdots x(t-\tau_n)\, \mathrm{d}\tau_1 \cdots \mathrm{d}\tau_n + \dots (3)

2 first-order volterra systems
in this section only a causal, stable and lti (linear time invariant) first-order volterra system is considered. it can be expressed as

y(t) = H_1[x(t)], (4)

which can be expanded using the volterra operator H_1 in the form

y(t) = \int_{-\infty}^{\infty} h_1(\tau)\, x(t-\tau)\, \mathrm{d}\tau. (5)

this equation represents a simple one-dimensional convolution, which determines a pure linear system. the first-order volterra system is in general a linear system, in which the first-order volterra kernel h_1(t) is called the impulse response of the system. this impulse response can be obtained by dirac impulse excitation \delta(t), from

h_1(t) = H_1[\delta(t)]. (6)

3 the second-order volterra system
since an lti system obeys the rules of linear combination, its response to a linear combination of input signals equals the same linear combination of the outputs. a second-order system does not obey the rules of linear combination, but the rules of bilinear combination: the response to a linear combination of input signals equals a bilinear combination of the output signals.
let us take into consideration a causal, stable, second-order system, which is defined by

y(t) = H_2[x(t)]. (7)

the operator H_2 is called a second-order volterra operator. following eq. (2), it is expressed by

H_2[x(t)] = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} h_2(\tau_1, \tau_2)\, x(t-\tau_1)\, x(t-\tau_2)\, \mathrm{d}\tau_1\, \mathrm{d}\tau_2. (8)

the function h_2(\tau_1, \tau_2) is called a second-order volterra kernel. in general, a kernel h_2^*(\tau_1, \tau_2) need not be symmetric about the axis \tau_1 = \tau_2, but for reasons of definiteness it is better to work with a symmetric kernel. the symmetrization can be done by

h_2(\tau_1, \tau_2) = \tfrac{1}{2}\left[ h_2^*(\tau_1, \tau_2) + h_2^*(\tau_2, \tau_1) \right]. (9)

from now on, only axis-symmetric kernels will be considered, i.e.

h_2(\tau_1, \tau_2) = h_2(\tau_2, \tau_1). (10)

as is known from the theory of linear systems, and as described in eq. (6), the impulse response of a first-order (linear) system can be obtained as the response to a dirac impulse. let us take into consideration the signal x(t) = \delta(t), which is brought to the input of a second-order system. the output is given by

y(t) = H_2[\delta(t)] = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} h_2(\tau_1, \tau_2)\, \delta(t-\tau_1)\, \delta(t-\tau_2)\, \mathrm{d}\tau_1\, \mathrm{d}\tau_2 = h_2(t, t). (11)

the response to the dirac impulse therefore does not determine the second-order system; it represents just a slice through the axis of the second-order volterra kernel (see fig. 2).

let the input signal x(t) be given by the sum of two signals x_1(t) + x_2(t). the response to such a signal is given by

y(t) = H_2[x(t)] = H_2[x_1(t) + x_2(t)] = H_2[x_1(t)] + 2\,\bar{H}_2\{x_1(t), x_2(t)\} + H_2[x_2(t)], (12)

where \bar{H}_2\{\cdot, \cdot\} is a bilinear volterra operator, defined by

\bar{H}_2\{x_1(t), x_2(t)\} = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} h_2(\tau_1, \tau_2)\, x_1(t-\tau_1)\, x_2(t-\tau_2)\, \mathrm{d}\tau_1\, \mathrm{d}\tau_2. (13)

thence

\bar{H}_2\{x_1(t), x_1(t)\} = H_2[x_1(t)], (14)

thus a bilinear volterra operator applied to two identical signals is, simply speaking, the second-order volterra operator. generally, any higher-order system can be considered, but the complexity increases with the order of the system. a higher-order representation is also more difficult to visualize, as its dimension increases. the analysis consisting of finding all the kernels is based on finding the highest-order kernel first, and then recursively finding the lower-order kernels.

4 a simplified model
using the whole volterra model introduces many difficulties into both identifying and reconstructing a nonlinear system. since the n-th volterra kernel is a function of n variables, the model which represents the system has to contain many coefficients to determine the system. this section describes a simplified model, which reduces the number of coefficients required for a volterra series representation. the first simplification replaces the n-th volterra kernel by a separable representation built from a one-dimensional sub-kernel. the second-order volterra kernel is reduced to

h_2(\tau_1, \tau_2) = \tilde{h}_2(\tau_1)\, \tilde{h}_2(\tau_2). (15)

fig. 2: example of a second-order volterra kernel
fig. 3: a demonstration of kernel simplification: a) sub-kernel, b) kernel

this is demonstrated in fig. 3, which shows the sub-kernel \tilde{h}_2(\tau) and the kernel h_2(\tau_1, \tau_2). generally, for higher volterra kernels it holds that

h_n(\tau_1, \tau_2, \dots, \tau_n) = \prod_{i=1}^{n} \tilde{h}_n(\tau_i). (16)

the output signal of the second-order system is then given by eq. (17):

y(t) = \int\!\!\int h_2(\tau_1, \tau_2)\, x(t-\tau_1)\, x(t-\tau_2)\, \mathrm{d}\tau_1\, \mathrm{d}\tau_2 = \int\!\!\int \tilde{h}_2(\tau_1)\, \tilde{h}_2(\tau_2)\, x(t-\tau_1)\, x(t-\tau_2)\, \mathrm{d}\tau_1\, \mathrm{d}\tau_2 = \left[ \int_{-\infty}^{\infty} \tilde{h}_2(\tau)\, x(t-\tau)\, \mathrm{d}\tau \right]^2. (17)

the scheme from fig. 1 can be simplified by applying the simplifications described above. the simplified volterra model is not able to determine all the nonlinearities in the same manner as the full volterra model [1], but it will be shown that in some cases, such as the analysis of an amplifier in a weakly nonlinear mode, the simplified model is sufficiently precise.

fig. 4: scheme of the measured system (amplifier)
fig. 5: comparison of responses to a) 200 hz and 1 khz tones, up to −150 db; b) 200 hz and 1 khz tones, up to −100 db; c) 500 hz and 2 khz tones, up to −100 db; d) 1 khz and 5 khz tones, up to −100 db; sony sdp-e300 – above, model – below

the output y(t) is then given by

y(t) = \int_{-\infty}^{\infty} h_1(\tau)\, x(t-\tau)\, \mathrm{d}\tau + \left[ \int_{-\infty}^{\infty} \tilde{h}_2(\tau)\, x(t-\tau)\, \mathrm{d}\tau \right]^2 + \dots + \left[ \int_{-\infty}^{\infty} \tilde{h}_n(\tau)\, x(t-\tau)\, \mathrm{d}\tau \right]^n, (18)

which can be rewritten in the shortened form

y(t) = \sum_{n} \left[ \int_{-\infty}^{\infty} \tilde{h}_n(\tau)\, x(t-\tau)\, \mathrm{d}\tau \right]^n. (19)

5 measuring non-linear audio systems
the simplification of volterra kernels described above has been tested on a real audio system with nonlinear behavior. the method gives sufficiently precise results in a weakly nonlinear mode. if the higher kernels are too feeble, i.e. if the nonlinearity is weak, it is better to use the simplest model, as the higher kernels are near the level of the noise. the simplified method for determining the sub-kernels has been verified on a surround processor sony sdp-e300, used in amplifier mode. the measurement scheme for identifying the volterra sub-kernels is shown in fig. 4. a simple method using a workstation with an audio card has been used to generate and record the input and output signals. to verify the simplified volterra model, a comparison between the audio amplifier and the volterra model was performed: an input signal consisting of two sinusoids was put into both the audio amplifier and the model, and the output spectra of the two were compared. the results are shown in fig. 5.

6 conclusion
the method for identifying nonlinear systems using a simplified volterra series representation has been presented and tested on a real (low-cost) audio system. the results of the nonlinear model are in some cases (weak nonlinearities) very similar to the real system. in cases of more complex nonlinearities the model gives worse results, and the simplification is not appropriate for use. the simplification of the kernels gives better results in systems with weak nonlinearities, which can be found in multimedia systems such as amplifiers, loudspeakers, etc.
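as a closing illustration of eq. (19), the following sketch (ours, not the author's measurement code; the sub-kernels are invented stand-ins for identified ones) synthesizes the output of the simplified model for a two-tone input, mirroring the comparison of fig. 5:

```python
import numpy as np

def simplified_volterra(x, subkernels):
    """simplified volterra model, eq. (19): each branch is a linear
    convolution with the n-th sub-kernel h~_n, followed by an n-th power."""
    y = np.zeros(len(x))
    for n, h in enumerate(subkernels, start=1):
        branch = np.convolve(x, h)[:len(x)]
        y += branch ** n
    return y

fs = 48000
t = np.arange(0, 0.1, 1.0 / fs)
x = np.sin(2 * np.pi * 200 * t) + np.sin(2 * np.pi * 1000 * t)  # two-tone test

tau = np.arange(0, 2e-3, 1.0 / fs)
h1 = np.exp(-tau / 2e-4); h1 /= h1.sum()         # hypothetical linear branch
h2 = np.exp(-tau / 1e-4); h2 *= 0.05 / h2.sum()  # weak 2nd-order sub-kernel
h3 = np.exp(-tau / 1e-4); h3 *= 0.01 / h3.sum()  # weaker 3rd-order sub-kernel

y = simplified_volterra(x, [h1, h2, h3])
spectrum = 20 * np.log10(np.abs(np.fft.rfft(y * np.hanning(y.size))) + 1e-12)
# besides 200 hz and 1 khz, harmonic and intermodulation lines
# (400, 800, 1200, 2000 hz, ...) now appear in `spectrum`
```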
acknowledgments research described in the paper was supervised by doc. f. kadlec, fee ctu in prague and supported by the research project msm6840770014 ”research in the area of prospective information and communication technologies”. references [1] schetzen, m.: the volterra and wiener theories of nonlinear systems, newyork: johnwiley&sons, 1980. [2] greest, m. c., hawksford, m. o.: distortion analysis of nonlinear system with memory using maximum-length sequences. iee proc. – circuits devices systems, vol. 142 (1995), october 1995, p. 345–350. ing. antonín novák phone: +420 224 352 109 email: novaka4@fel.cvut.cz department of radioelectronics czech technical university in prague faculty of electrical engineering technická 2 166 27 prague, czech republic © czech technical university publishing house http://ctn.cvut.cz/ap/ 75 acta polytechnica vol. 47 no. 4–5/2007 ap07_4-5.vp 1 introduction optical coherence tomography (oct) is a rather new imaging technique based on broadband interferometry. like ultrasonic imaging, oct generates a cross-sectional image of the scanned tissue. the penetration depth of oct ranges from about 1–2 mm in opaque tissue (e.g. skin) to 2 cm in transparent tissue (e.g. eye). the achievable lateral resolution mainly depends on the focusing optics, while the axial resolution is based on the light source of the oct scanner. currently, resolutions down to a few micrometers are possible, allowing oct images to compete with histological examination of tissue. the main drawback of histology is that for each observation a tissue sample has to be taken surgically. thus, the main advantages of oct are its non-invasive nature and real-time feasibility. similar to ultrasonic images, oct speckle arises from constructive and destructive interference of light backscattered from reflectors smaller than a wavelength. speckle noise degrades contrast, and makes it difficult to identify small reflectors. additionally, post-processing algorithms (such as deconvolution or segmentation) that are used to enhance image resolution and quality may suffer from the presence of speckle in oct images. thus, prior to image enhancement algorithms, a speckle reduced image is available that can be blended with the original image to enhance viewing convenience. this paper is organized as follows: in section 2 the basic principles of oct are described and the high resolution oct system is presented in detail. section 3 deals with the theory of speckle reduction algorithms and gives an overview of the algorithmic complexity. based on these investigations, section 4 evaluates real time implementation feasibility on an fpga. 2 high resolution oct this section gives details about the oct system used throughout this work, and a flexible digital post processing realized on an fpga is presented. 2.1 oct principle a general oct system is depicted in fig. 1. the light from a broadband light source is split into a reference beam and a probe beam directed onto the object of interest. by combining the backscattered light with the reference beam, an interference pattern is created and detected by a photodetector. the strength of the interference is dependent on the optical path difference and on the reflection. the envelope of the interference pattern shows the reflectivity in a certain depth of the probe [1]. hence, by altering the length of the reference beam, a reflectivity profile of the probe can be recorded, called a scan. 
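the scan formation just described can be illustrated by a short simulation (our sketch; the 350 khz carrier and 1 mhz sampling rate match the values quoted for this system below, while the reflector position and the coherence-envelope width are invented; numpy and scipy are assumed available):

```python
import numpy as np
from scipy.signal import hilbert

fs = 1.0e6                           # adc sampling rate (hz), as in section 2.3
t = np.arange(0.0, 5e-3, 1.0 / fs)   # 5 ms of reference-mirror travel
f_c = 350e3                          # carrier set by the mirror velocity (hz)

# a single reflector: gaussian coherence envelope centred at t0, reflectivity r
t0, r, sigma = 2.5e-3, 0.6, 2e-5
interferogram = r * np.exp(-((t - t0) / sigma) ** 2) * np.cos(2 * np.pi * f_c * t)
interferogram += 0.01 * np.random.randn(t.size)   # detector noise

# envelope detection via the analytic signal, then log compression (cf. 2.3)
envelope = np.abs(hilbert(interferogram))
a_scan = 20 * np.log10(envelope + 1e-9)   # reflectivity profile versus depth
```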
the photodetector signal is amplified and analog-to-digital converted to enable digital filtering, demodulation and advanced post processing. by deflecting successively the probe beam in lateral direction, multiple scans can be combined to a 2-dimensional b-scan, showing a high resolution cross-sectional view of the probe. 2.2 oct system interference of the detected light depends on the difference of the optical length of the reference arm and probe arm. unlike narrowband light sources, broadband light © czech technical university publishing house http://ctn.cvut.cz/ap/ 91 acta polytechnica vol. 47 no. 4–5/2007 stick based speckle reduction for real-time processing of oct images on an fpga h. luecken, g. tech, r. schwann, g. kappen this paper presents an fpga based real-time implementation of an adaptive speckle reduction algorithm. applied to the log-compressed image of a high-resolution optical coherence tomography (oct) system, all related signal processing steps from envelope detection to vga video signal generation are executed on a single chip. images from measured oct data show that the chosen algorithm produces a smooth, detailed image with fewer image artifacts than comparable approaches. an estimation of the hardware effort, the possible throughput rate and the resulting image frame rate is given for different window sizes used here in speckle reduction. keywords: optical coherence tomography, real-time processing, speckle reduction, fpga. fig. 1: oct principle sources show interference only for small differences up to the coherence length. the applied source emits light with a spectral bandwidth of 255 nm at a center wavelength of 800 nm, which leads to a coherence length of 1.6 �m. the coherence length is the limit for the axial resolution of the system. resolution in the lateral direction of the oct-system depends basically on the focusing optics and lenses. the system considered here provides up to 0.01 degrees of lateral resolution. the length of the reference arm can be varied on a micrometer scale with a piezoelectric crystal. adaptive control enables a linear movement of the reference mirror at constant speed. therefore the recorded signal shows the interference pattern as a sine-wave carrier signal. the depth-dependent reflectivity of the probe shapes a superimposed envelope. the carrier frequency depends on the mirror velocity and center frequency of the light source, and is about 350 khz for the considered system, mainly limited by the mechanical set-up of the reference mirror. 2.3 digital post-processing after amplifying the photodiode signal, it is sampled and converted to the digital domain. a standard analog-to-digital converter is used, providing precision of 12 bits at a sampling frequency of 1 mhz. this data is fed to an altera stratix ii fpga, which serves as a test-bed to evaluate different speckle reduction algorithms and the real-time abilities of the system. fig. 3 shows a block diagram of the architecture. standard signal processing for an oct system consists of filtering and amplitude demodulation to determine the signal’s envelope. the implemented system starts with a bandpass filter. subsequently a hilbert transformation is chosen to demodulate the signal by adding the imaginary part to the oct signal and taking the absolute value. the bandpass and hilbert transform can be performed in one step, in fig. 3 denoted as complex bandpass. to compute the absolute value, i.e. 
the square root of the summed squares of the real and imaginary parts of the signal, an efficient coordinate rotation (cordic) algorithm has been implemented, as proposed in [5]. subsequent to envelope detection, the signal is compressed to 8 bits by computing the logarithm. to adapt the sampling rate to the axial resolution given by the coherence length, down-sampling is done after low-pass filtering to avoid aliasing. the demodulated and compressed oct data of each scan line is written to a memory buffer, and the adjacent scans are combined in the lateral direction into a b-scan. a vga protocol converter allows visualization of the oct picture on a standard vga display with a resolution of 1024×768 pixels at 75 hz.

the architecture makes use of a 2-stage memory structure (table 1). for advanced processing and speckle reduction, up to 63 adjacent a-scans can be stored in a buffer and processed in both the axial and the lateral direction. a 32 mbyte sdram is used as a frame buffer by the video controller.

table 1: memory usage of oct post-processing
block       | a-scan buffer | sdram
memory size | 524,288 bits  | 804,864 bytes

the numbers of adaptive look-up tables (aluts) required for an fpga implementation of this post-processing scheme are summarized in table 2. as can be seen, only 9 % of the overall fpga (48,352 aluts) is used by this post-processing approach, leaving sufficient space for extensions and enhancements.

table 2: resource requirements of basic oct post-processing blocks
block | band-pass | abs | log | lp  | vga  | sdram controller | Σ
aluts | 936       | 469 | 109 | 763 | 1562 | 429              | 4268

fig. 2: spectrum of the broadband light source and interference pattern of an ideal mirror
fig. 3: digital post-processing of oct data

3 speckle reduction algorithm
this section deals with the speckle reduction technique considered here. speckle reduction aims to reduce the noise-like speckle distortion in b-scans, while generally preserving structures such as edges, borders and small features of interest. a large variety of speckle-reducing filters has been published, particularly in research on ultrasonic image processing; approaches using region growing, diffusion and wavelet decomposition have been proposed. this paper considers the adaptive speckle filter introduced in [3] using asymmetric sticks, called asf. similar to other speckle reduction techniques, the basic concept is to compute the output from windows which are adapted to the local content of the source image. in contrast to algorithms working with rectangular windows and/or weights that depend on the distance to the center of the window [6, 7, 8], asf resolves the window in angular steps. this method shows superior results at the expense of a high computational load, and is impracticable to implement on a general-purpose processor for real-time processing.

the algorithm will now be explained in detail. the filter uses directional features (called sticks, fig. 5) and allows iterative application, since it behaves like a nonlinear diffusion model. the method is based on a weighted sum of means calculated in the stick directions, with weights given by the reciprocal of the local variance in each direction. according to [3], the filter algorithm is given by

\hat{i} = \frac{1}{w} \sum_{i=1}^{4n-4} g(v_i)\, \bar{i}_i \quad \text{and} \quad w = \sum_{i=1}^{4n-4} g(v_i),

where \hat{i} denotes the output for each pixel. v_i and \bar{i}_i are the variance and the mean of the original image data i along the i-th stick, pointing from the center of the n×n filter window in one of the 4n − 4 different directions. the sticks can be constructed in hardware, making efficient use of bresenham's line algorithm [9]. given the i-th stick as a filter kernel h_i(x, y), the mean and the variance are computed by

\bar{i}_i(x, y) = i(x, y) \star h_i(x, y)

and

v_i(x, y) = i^2(x, y) \star h_i(x, y) - \bar{i}_i^2(x, y),

with \star denoting a 2-dimensional correlation. as the variation function g(\cdot), this paper applies

g(v_i) = 255 - v_i / k \ \text{if}\ v_i \le 255\,k, \qquad g(v_i) = 0 \ \text{else}.

sticks across an edge show a high variance, and therefore the mean in this direction is weighted to have less influence on the computed output. hence, smoothing only happens in homogeneous directions, whereas it is suppressed across edges. in homogeneous speckle regions, all sticks have approximately the same variance and weight, and as a result all directions are smoothed evenly. because of the fine quantization of the angular steps, asf is able to adapt its kernel very flexibly to the image content. inhomogeneities of various shapes can not only be retained, but can be smoothed without blurring edges. fig. 4 shows a measured oct image of a spheroid placed on an agar substrate, and the enhanced images for two different stick lengths. the speckle pattern and the additive noise are significantly reduced.

in [3] it is stated that, due to the overlapping sticks, the algorithm works like a gaussian smoothing filter in homogeneous regions. however, a closer look at the filter kernel in these regions shows that the weights are

w(r) = c\,(4n-4) \ \text{for}\ r = 0, \qquad w(r) = \frac{c\,(4n-4)}{8r} \ \text{else}

(assuming v_i equal for all directions i, with r denoting the number of stick pixels from the window center and c a constant). the kernel is thus not gaussian, but decreases with 1/(8r). for better smoothing, the influence of the center pixel can be reduced. a straightforward way to do this is to subtract a fraction of the original pixel value from the output,

\hat{i}' = \frac{\hat{i} - \alpha\, i}{1 - \alpha}.

an empirically found value of \alpha is 0.125. this adjustment is especially important if only one iteration is applied.

to compare the image quality of asf with an existing fpga implementation of the argf-based [8] algorithm by mazumdar et al. [4], both filters have been implemented in c++ and matlab and adjusted to oct image data. fig. 6 shows the resulting images. the right image is slightly degraded by artifacts (marked with arrows). these are caused by hard switching between square filter kernels and by outliers found in the speckle statistics. they do not appear in the left image, because of the soft thresholding and the fine angular quantization of asf. furthermore, asf needs only one parameter, k, which can be adjusted very easily; the filter introduced in [4] needs 4 parameters, which are difficult to adjust empirically.

fig. 4: high resolution oct image of a spheroid: (a) original image, (b) asymmetric stick filter (11×11 window), (c) asymmetric stick filter (21×21)
fig. 5: kernels h_i(x, y) showing sticks for a window size n = 11
fig. 6: comparison of speckle reduction algorithms (left: asymmetric stick filter, right: argf-based filter)

4 real-time implementation
this section describes the real-time fpga implementation of the asf algorithm introduced in section 3. this work focuses on meeting the real-time requirements; therefore, the timing requirements for the asf algorithm are derived based on the a-scan rate and the number of pixels per a-scan. a software sketch of the filter is given below.
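the following unoptimized python sketch renders the filter as just derived (our reference model, not the authors' c++/fpga code; the value of k is an arbitrary demo choice, since the paper leaves it as a tuning parameter, and n is assumed odd):

```python
import numpy as np

def bresenham(x0, y0, x1, y1):
    """integer line from (x0, y0) to (x1, y1) (bresenham [9])."""
    points, dx, dy = [], abs(x1 - x0), -abs(y1 - y0)
    sx, sy = (1 if x1 > x0 else -1), (1 if y1 > y0 else -1)
    err = dx + dy
    while True:
        points.append((x0, y0))
        if (x0, y0) == (x1, y1):
            return points
        e2 = 2 * err
        if e2 >= dy: err += dy; x0 += sx
        if e2 <= dx: err += dx; y0 += sy

def asf(img, n=11, k=4.0, alpha=0.125):
    """asymmetric stick filter: weighted sum of per-stick means with
    weights g(v_i) = max(255 - v_i / k, 0), plus center-pixel correction."""
    half = (n - 1) // 2
    # the 4n - 4 border cells of the n x n window give the stick directions
    border = [(dx, dy) for dx in range(-half, half + 1)
              for dy in range(-half, half + 1) if max(abs(dx), abs(dy)) == half]
    pad = np.pad(img.astype(float), half, mode='reflect')
    out = np.zeros_like(img, dtype=float)
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            num = den = 0.0
            for dx, dy in border:
                stick = [pad[y + half + py, x + half + px]
                         for px, py in bresenham(0, 0, dx, dy)]
                m, v = np.mean(stick), np.var(stick)
                g = max(255.0 - v / k, 0.0)     # soft threshold, one parameter k
                num += g * m
                den += g
            i_hat = num / den if den > 0 else pad[y + half, x + half]
            # reduce the center pixel's influence: i' = (i_hat - alpha*i)/(1 - alpha)
            out[y, x] = (i_hat - alpha * img[y, x]) / (1.0 - alpha)
    return out
```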
subsequently, the architecture of the asf algorithm is shown, and the throughput rates of a serial and a parallel implementation are estimated. in the second part, implementation results are presented for a stratix ii fpga for different sets of parameters of the algorithm. the dependency of the required hardware, measured in logic elements (les), on the stick length (corresponding to varying window sizes) as well as on the number of sticks is shown. a possible parallel execution of the asf algorithm, to achieve the highest throughput rate, is also described.

4.1 signal processing concept
fig. 7 shows the signal processing concept of the asf algorithm. as can be seen, the algorithm is divided into three blocks that realize different steps of the algorithm at different sample rates. this architecture allows pipelining of the algorithm to increase the throughput rate. in fig. 7, the first block receives a window of n×n pixels and calculates the sum along each stick and the sum of the squared elements for the n_s sticks. because the n×n window pixels are prefetched and stored in an array, they are available without latency. in each clock cycle, a serial implementation performs one of the n_s summations over n_l stick elements. this leads to a required number of n_s processing cycles for the first block, and thus one summation value is calculated each clock cycle. in a parallel implementation, all these summations are calculated simultaneously, leading to a calculation time of one clock cycle for all summation results. the processing cycles of the second block are mainly determined by the summation over the n_s sticks, leading also to a number of about n_s cycles, and hence one calculated value per clock cycle; here the parallel implementation calculates all results in one clock cycle. finally, block iii realizes a divider, which does not affect the throughput rate of this pipelined architecture.

the number of processing cycles required to perform the asf algorithm realized by the three-block approach considered here is equal to the maximum of the numbers of processing cycles of blocks i and ii. thus, a serial implementation requires about n_s processing cycles once the pipeline has been filled. in contrast, a fully parallel implementation requires three clock cycles to determine a speckle-reduced pixel value. based on the a-scan rate and the number of pixels per a-scan, the upper limit of processing cycles available to calculate a new pixel value can be derived from

cyc = \frac{f_{clk}}{f_{a\text{-}scan}\, l},

where l is the a-scan length (in pixels), and f_{clk} and f_{a-scan} are the clock frequency and the a-scan rate in hz, respectively. with an a-scan rate of 50 hz and an a-scan length of 768 pixels, approximately 2600 cycles are acceptable for the calculation of each speckle-reduced pixel. as can be seen from the processing-cycle calculation above, serial as well as parallel asf implementations easily meet this requirement for reasonable window sizes.

fig. 7: signal processing architecture of the speckle reduction algorithm

4.2 fpga implementation
fig. 8 shows the number of aluts required for a serial implementation of the asf algorithm. the increasing number of aluts is mainly caused by the first block. as can be seen in fig. 8, the bit widths of the second and third block's input parameters increase only slightly, by ld(n_l) and ld(n_s) respectively, leading to nearly constant hardware resources for these blocks. the number of aluts required by the first block is approximately a quadratic function of n. furthermore, the maximum clock frequency for an implementation on the stratix ii fpga (ep2s60f1020c4) is given in fig. 8. as can be seen, for reasonable values of n the maximum clock frequency is greater than 50 mhz.

4.3 performance evaluation
to determine the maximum window size for real-time calculation, the equation for the upper limit of available cycles per pixel in section 4.1 can be rewritten. for a stick length of n_l = (n + 1)/2, using the minimum number of required cycles n_s = 4n − 4 for the serial implementation yields

n = \frac{f_{clk}}{4\, l\, w\, f_{img}} + 1.

this form allows a calculation of the window size depending on the a-scan length (l), the image rate (f_img) and the image width (w), as well as on the clock frequency (f_clk). for a second available oct system, based on fourier-domain image reconstruction, an a-scan length of 1000 pixels, an image rate of 5 hz and a width of 250 lines are required. in this case (e.g. with f_clk = 50 mhz: n = 50·10^6 / (4 · 1000 · 250 · 5) + 1 = 11), a window size of n = 11 can be processed in real time.

5 conclusion
a generic design approach allows a flexible implementation of the asf algorithm: algorithm parameters such as the window size, the number of sticks and the weighting function can be configured at compile time. the performance of the serial implementation meets the requirements of the broadband oct system used here. even for fft-based fast oct systems, the presented hardware implementation can apply real-time speckle reduction to the full image size. for even higher demands, parallelization of the algorithm is straightforward using the proposed generic design approach; increased throughput rate can thus be traded for hardware effort in terms of the number of required aluts. the introduction of a slight modification of the algorithm shows that a single run provides sufficient smoothing of the image.

acknowledgments
parts of this work were funded by the deutsche forschungsgemeinschaft (dfg). the work was supported by felix spöler and the institute of semiconductor electronics at rwth aachen university. the authors would like to thank prof. tobias g. noll for supervising this work at the chair of electrical engineering and computer systems, rwth aachen university.

references
[1] schmitt, j. m.: optical coherence tomography (oct): a review. in ieee journal of selected topics in quantum electronics, vol. 5 (1999), no. 4, july/august 1999.
[2] schmitt, j. m., xiang, s. h., yung, k. m.: speckle in optical coherence tomography. in journal of biomedical optics, vol. 4 (1999), no. 1, january 1999.
[3] xiao, c. y., zhang, s., chen, y. z.: a diffusion stick method for speckle suppression in ultrasonic images. in pattern recognition letters, 2004, no. 25.
[4] mazumdar, b., mediratta, a., bhattacharyya, j., banerjee, s.: a real time speckle noise cleaning filter for ultrasound images. in proceedings of the 19th ieee symposium on computer-based medical systems, 2006.
[5] schwann, r., kappen, g.: cordic based postprocessing of ultrasound beamformer data. in proceedings of the 9th international student conference on electrical engineering "poster 2005", 2005.
[6] loupas, t., mcdicken, w. n., allan, p. l.: adaptive weighted median filter for speckle suppression in medical ultrasonic images. in ieee transactions on circuits and systems, vol. 36 (1989), p. 129–135.
[7] koo, j. i., park, s. b.: speckle reduction with edge preservation in medical ultrasonic images using a homogeneous region growing mean filter (hrgmf), in ultrasonic imaging, vol. 13 (1991), p. 211–237. [8] chen, y., yin, r., flynn, p., broschat, s.: aggressive region growing for speckle reduction in ultrasound images. in pattern recogn. lett. vol. 24 (feb. 2003), no. 4–5, p. 677–691. [9] bresenham, j.: algorithm for computer control of a digital plotter, in ibm systems journal, vol. 4 (1965), no. 1, p. 25–30. heinrich luecken gerhard tech dipl.-ing. robert schwann e-mail: schwann@eecs.rwth-aachen.de dipl.-ing. götz kappen e-mail: kappen@eecs.rwth-aachen.de chair of electrical engineering and computer systems rwth aachen university schinkelstrasse 2 aachen, germany © czech technical university publishing house http://ctn.cvut.cz/ap/ 95 acta polytechnica vol. 47 no. 4–5/2007 fig. 8: resource usage and maximum clock frequency of asf as a function of window size ap05_4.vp 1 introduction the approach of artificial intelligence disciplines to support for design problem solution synthesis has changed qualitatively in recent times. the traditional interest in sophisticated formal means for system design (description of components, procedures of system synthesis from components, etc.) is now targeted at semantic modelling and at an effective description of the functions of the designed systems. the field which promises the necessary improvements in modelling the semantics is the field of ontologies. 2 ontologies an extended interpretation of the term “ontology” (e.g. in [1, 11]) in the context of this paper is as follows: an ontology is a specification of the way of conceptualisation that is used. the internal basis of an ontology is given by the methods for knowledge acquisition, knowledge representation, knowledge sharing, knowledge management and data retrieval. the formal “shape” of an ontology depends on the means used representation that is used (the most general “ancestor” is a semantic network). essential points for specifying ontology (with respect to the ontologies discussed in this paper) include: � purpose and objective of the development and application of the ontology. � subject domains and tasks relevant for the development and application of the ontology. � classes, relations, functions and other formal categories which will be used for conceptualisation. � way of working with the ontology, the form of computer support and the user of the ontology. � terminology used in the ontology (“schools”, traditions and usage). � implementation environment in which the ontology will be developed and applied. 2.1 languages for representation of ontologies from the list of “older” semantic formalisms which can nowadays be considered as ontologies, we mention only bylander‘s consolidations [2]. consolidations are graphic-symbolic formations to describe functions on the level of principles. in combination with suh‘s axiomatic theory of design [4], knowledge acquisition and knowledge representation were used to explain the system functions and for system design [3]. they are still used, e.g., in systems for automatic identification of functional structures, [5]. one of the most powerful means for representing ontologies is ontolingua, [7, 8]. its basic layer is done by kif language (knowledge interchange formate), [6], which is a variant of predicate first order language with the syntax of lisp. cycl language is a language of the cyc project. 
it is based on lisp syntax and, like kif, it follows the features of predicate first order language. (some parts of the developed ontology – 6000 concepts and 60 000 assertions – are available in [12].) of the many of other languages for representation of ontologies, the following are widely used: ocml (ontology compositional modelling language), daml-ont (darpa agent mark-up language-ontology), oil (ontology inference layer) and daml+oil. details, e.g., in [13]. 2.2 uml language and its use for representation of ontologies uml (unified modelling language, [9]) was developed by omg (object management group) for the analysis and design of large sofware systems. it is nowadays also used for other applications, especially in the fields of analysis and design of general systems, for conceptual design and also for the representation of ontologies [14]. uml works with 8 layers of models. for conceptual design and for the representation of ontologies most important are: class diagrams (which express the necessary relations between elements of conceptual categories, such as classes, associations, attributes, operations, dependencies, relations between dependencies), state diagrams (to describe dynamic processes inside the classes) and sequence diagrams (to describe dynamic processes in established tasks with interaction between the classes). though uml has been accepted as one of languages for representation of ontologies it has two principal disadvantages: limited semantics of associations in the layer of © czech technical university publishing house http://ctn.cvut.cz/ap/ 33 czech technical university in prague acta polytechnica vol. 45 no. 4/2005 ontologies and formation spaces for conceptual redesign of systems j. bíla, m. tlapák this paper discusses ontologies, methods for developing them and languages for representing them. a special ontology for computational support of the conceptual redesign process (crdp) is introduced with a simple illustrative example of an application. the ontology denoted as global context (glb) combines features of general semantic networks and features of uml language. the ontology is task-oriented and domain-oriented, and contains three basic strata – glbexpl (stratum of explanation), glbfact (stratum of fields of activities) and glbenv (stratum of environment), with their sub-strata. the ontology has been developed to represent functions of systems and their components in crdp. the main difference between this ontology and ontologies which have been developed to identify functions (the semantic details in those ontologies must be as deep as possible) is in the style of the description of the functions. in the proposed ontology, formation spaces were used as lower semantic categories the semantic deepness of which is variable and depends on the actual solution approach of a specialised conceptual designer. keywords: ontologies, conceptual redesign, fields of activities, principles, uml. class diagrams, and the absence of means for the inference of novel knowledge. the importance of these two obstacles has been decreased by modifications and extensions of uml. (the second obstacle has been only partially solved by means of ocl (object constraint language) [10] and by the addition of a special inference system.) 3 an ontology for conceptual design we sketch here the essential characteristics of conceptual design: � conceptual design starts by specifying a goal-designed system, and it results in a functional structure usually called the scheme. 
• the scheme has substantial features of the product or system which is being designed, but it need not contain geometrical and quantitative data.
• the scheme has two roles:
  – to explain the function of the designed system,
  – to describe the basic features of the designed system structure (components, materials, relevant relations, rough computations and estimations).
• the field of conceptual design may be decomposed (according to the type of the designed systems) into three classes:
  a. conceptual design of systems (control systems, technological systems, transport systems, telecommunication systems),
  b. conceptual design of technological components, machines and devices (holders, attachment tools, frames, bicycles, cars, paragliding sets, refrigerators, heat pumps),
  c. conceptual design of configurations (flats, buildings, parks, allocation of machines in halls, …).
• (before the conceptual design phase there is usually an early design phase, and the conceptual design phase is followed by a detailed design phase.)

uml language (mentioned in the previous section) may be used directly for the conceptual design of group a systems [14]. on the other hand, the development of ontologies for group b systems and products is particularly interesting.

3.1 ontology for representation of functions in conceptual redesign of systems in group b
an ontology that has been developed for conceptual redesign will now be proposed. the conditions of redesign (where the specification of a novel system – the goal of the redesign process – is given by conditions for improving the "old" system) enable us to develop an effective but not too extensive ontology. (details about redesign methods are given, e.g., in [15, 16, 17].) the ontology, denoted as global context (glb), combines the features of general semantic networks and the features of uml language. the ontology is task-oriented and domain-oriented, and contains three basic strata (with their sub-strata):

glb_expl … stratum of explanation,
glb_fact … stratum of fields of activities,
glb_env … stratum of environment.

the stratum of fields of activities (glb_fact) has 5 sub-strata (principles): glb_princ1, glb_princ2, glb_princ3, glb_princ4, glb_princ5. the structure of the strata and sub-strata is shown in fig. 1, and corresponds to expression (1):

glb = \langle glb_{expl}, glb_{fact}, (glb_{princ1}, glb_{princ2}, glb_{princ3}, glb_{princ4}, glb_{princ5}), glb_{env} \rangle. (1)

fig. 1: a partial ordering of strata in glb ontology

the strata and sub-strata glb_fact, glb_princ1, glb_princ2 and glb_princ3 have the structure of models

glb_p = \langle fam_p, \rho(fam_p) \rangle, (2)

and the strata and sub-strata glb_expl, glb_princ4, glb_princ5 and glb_env have the structure of algebras

glb_p = \langle fam_p, f(fam_p) \rangle, (3)

where fam_p are the carriers of the models and algebras (p \in \{expl, fact, princ1, princ2, princ3, princ4, princ5, env\}), \rho(fam_p) are systems of relations and f(fam_p) are systems of operations introduced on the carriers fam_p. the carriers fam_p of the models and algebras will in this paper be called "families" (as in [16]), and their elements will be called "formation spaces" (denoted fs).

note: in this paper, only fragments of the ontology from stratum fact are demonstrated. the description of its sub-strata is very brief, limited to the requirements of example 1.

stratum "field of activities" (fact): the carrier fam_fact contains formation spaces of the type

fam_{fact} = \{ me, pnu, hme, els, msf, tcs \}, (4)

with the following meaning: me … mechanics, pnu … pneumatics, hme … hydromechanics, els … electrical and electronics (field of activities), msf … mathematics, symbolic and formal (field of activities), tcs … technological constructions (bridges, frames, walls, …).

stratum "principles 1" (princ1): the carrier fam_princ1 contains formation spaces of the type

fam_{princ1} = \{ agg, trns, contr, protc, cnstr, r\text{-}eff, instr, dam, emb, prod \}, (5)

with the following meaning: agg … aggregation, trns … transformation, contr … control, protc … protection, cnstr … constructions, r-eff … relative effects, instr … instrumental, dam … damage, emb … embedding, prod … production.

stratum "principles 2" (princ2): the carrier fam_princ2 contains formation spaces of the type

fam_{princ2} = \{ agg: (accum, synth); trns: (chcarr, chcarrv, transfer, transms, chbeh, chvval); contr: (rep, supp, catal, analog, logic, f\text{-}logic); protc: (protcprod, protcprop, consvstate); cnstr: (separ, fix, bear, content, join, milieu); r\text{-}eff: (filter, joint, bearing); instr: (tool, material, means); dam: (discard, contamin, destruct); emb: (inconstr, include, annex); prod: (objects, univqual, univpower) \}, (6)

with the following meaning: accum … accumulation (aggregation without a change of the aggregated components), synth … synthesis (aggregation with a change of the aggregated components), chcarr … change of energy carriers, chcarrv … change of carrier variables, transfer … change of position of energy/matter with possible changes of the internal properties, transms … (transmission) change of position of energy/matter without changes of the internal properties, chbeh … change of behaviour of energy/matter, chvval … change of values of descriptive variables, rep … repression of an effect (process, principle), supp … support of an effect (process, principle), catal … catalysation of an effect (process, principle), analog … analog control of an effect (process, principle), logic … logic control of an effect (process, principle), f-logic … fuzzy logic control of an effect (process, principle), protcprod … protection of products, protcprop … protection of properties, consvstate … conservation of a state, separ … to separate, fix … to fix, bear … to bear, content … to form a volume, join … to join, milieu … to form a milieu, filter … filter, joint … joint, bearing … generalised bearing, tool … tool, material … material, means … means (non-special facilities to help an effect or action), discard … to discard (to eliminate the existence), contamin … to contaminate, destruct … to destruct, inconstr … to embed in a system and to use the functionality (of the embedded system or of both), include … to embed without specified utilisation of functionalities, annex … to annex, objects … production of objects, univqual … production of universal qualities (money, water, light, foodstuffs), univpower … production of universal powers (electrical energy, heat).
strata "principles 3" (princ3), "principles 4" (princ4) and "principles 5" (princ5): the stratum "principles 3" contains uml class diagrams, the stratum "principles 4" contains uml state diagrams related to the relevant class diagram from stratum "principles 3", and the stratum "principles 5" contains uml sequence diagrams related to the relevant class diagram from stratum "principles 3". for each line fact – princ1 – princ2 there exists at least one class, state or sequence diagram (according to need). (in the final stage, xml language will be used to represent the diagrams from strata princ3, princ4 and princ5.)

an example of a class diagram and a sequence diagram for the principle "pneu – trns – chcarr" is illustrated in figs. 2 and 3. the class diagram expresses a process in which three main classes participate – external actor (man, nature, a pneumatic system, …), gas medium and active space. gas medium has the attribute p-carriers (pneumatic carriers of energy) and two operations (principles), transms (transmission) and actenergy (activation of energy). the class active space has the attributes shape, interaction zone and non p-carriers (non-pneumatic carriers). the operation (principle) cnstr (construction) provides a shaping of the interaction zone, and the operation interact performs the interaction of the non p-carriers in the interaction zone. a detailed description (if needed) will be introduced in the classes tcarr (class of carriers), tshape (class of shapes) and ti-zone (class of interaction zones).

fig. 2: class diagram for principle "pneu – trns – chcarr"
fig. 3: sequence diagram for principle "pneu – trns – chcarr"

the sequence diagram in fig. 3 expresses the dynamics of the described principle ("pneu – trns – chcarr"), where lines with arrows denote operations and lower principles as events between objects, and their order in event time. the first event, cnstr, is started by the external actor and is oriented to the object active space. the second event is the operation (principle) transms, induced by the external actor and oriented to the interaction zone of active space. (further description is obvious.)

example 1: for illustration we now introduce a fragment glb_fact ∪ glb_princ1 ∪ glb_princ2 ∪ glb_princ4 of the ontology which describes the function of a sensor. this is a sensor for measuring the flow of a gas, which has to improve the properties of the measurement orifice. compared with the ontologies from [3] and [5], which were developed to identify functions, the proposed fragment of the ontology has to advise the designer from which fields of activities and principles a novel device (sensor) may be formed. (the procedure for automated synthesis of functional structures is introduced, e.g., in [15, 16].)
one possible functional structure is described by the following expression x:

x = fact( pneu → princ1( trns → princ2(chcarr and chvval and transms and chbeh) and contr → princ2(analog) and cnstr → princ2(shape) ) and me → princ1( agg → princ2(accum) and trns → princ2(chcarrv and chbeh) and contr → princ2(analog) and r-eff → princ2(bearing) and cnstr → princ2(separ and fix and shape) ) ).

expression x contains a structure of instances of fields of activities and principles for the given case. the state diagram in fig. 4 represents these facts: the solution which is searched for lies in the activities of the pneumatic (pneu) and mechanical (me) fields of activities. the time sequence of states starts in a preparatory state in field pneu (p-state pneu), which is left when the quantity of variable v1 is higher than the lower limit quantity v1l and a certain construction shaping of the space of the gas flow is provided ((valv1 > valv1l) and cnstr(shape)). in active state pneu 1 the carriers of energy (and information) are changed (chcarr), and the flow of the gas continues (transms) until the behaviour of the gas flow changes into type beh pneu1 (chbeh: beh pneu1); then active state me 1 (cnstr(shape) and (chbeh: beh pneu1)) starts. in active state me 1 the behaviour of the mechanical components is changed and differentiated (chbeh: beh me1), using the principles of fixation (cnstr(fix)) and of a relative effect (generalised bearing, r-eff(bearing)), and the quantities of the variables of the energy carriers are changed (chcarrv). releasing the transition from active state me 1 into active state pneu 2 (on exit from active state me 1), a part of the flow space is separated (cnstr(separ)) and the quantity of a variable is aggregated (agg(accum)). the process continues by a transition into active state pneu 2 (achieving a change of behaviour of the mechanical components into type beh me2, still satisfying the condition on the shape of the flow space), etc. (the description of the further steps is obvious.)

fig. 4: description of a structure of principles from the fields of activities pneu and me by a state diagram

note: the state diagram does not contain all the relevant principles which are introduced in solution x, and does not contain the precise
however, there is no need to describe all circumstances in “hard” detail, because the diagram has the role of an intelligent prompter (similarly as solution x). expression x and the state diagram (fig. 4) describe the function of a device for measuring flow, where in the interaction of the principles of the mechanical and pneumatic fields a cyclic alternation of the behaviour of the gas flow and the behaviour of the mechanical components is established. this cyclic process induces aggregation (accumulation) of the quantities (values) of some variable. one possible interpretation of the proposed solution is the device in fig. 5 (details in [19]). 4 conclusions the development of ontologies for many different engineering domains represents a synthesis of present-day informatic and engineering methods. this paper has shown the increasing importance of an effective ontology for computer support of problem solving in conceptual design. 5 acknowledgments this research has been conducted at the department of instrumentation and control engineering, faculty of mechanical engineering, czech technical university in prague and has been supported by research grant mšm 68 40 77 0008. references [1] guarino, n., giaretta, p.: ontologies and knowledge bases, towards a terminological clarification, http//www.ladseb .pd.cnr.it/infor/ontology/papers/ kbks95.pdf, 1995. [2] bylander, t., chandrasekaran, b.: “generic tasks in knowledge-based reasoning: the right level of abstraction for knowledge acquisition.” in: (gaines, b. r., boose, j. h., eds.) knowledge acquisition based systems. london, academic press., vol. 1 (1988), p. 65–77. [3] katai, o., kawakami, h., sawaragi, t., konishi, t., iwai, s.: “method of extracting meta plan for designing from artefacts analysis based on axiomatic design theory.” transactions of the society of instrument and control engineers, vol. 31 (1995), no. 3, p. 347–356. [4] suh, n. p.: the principles of design. oxford university press. 1990. [5] kitamura, yo.,toshinobu, s., namba, k., mizoguchi, r.: “a functional concept ontology and its application to automatic identification of functional structures.” advanced engineering informatics, vol. 16 (2002), no. 2, p. 145–163. [6] michael, r. g.: “knowledge interchange format.” in: (fikes, r., allen, j. a., sandewall, e., eds.) proc. of 2nd int. conf. on principles of knowledge representation and reasoning. cambridge, ma, 1991, p. 599–600. [7] gruber, t. r.: “translation approach to portable ontology specifications.” knowledge acquisition, vol. 5 (1993), no. 2., p. 199–220. [8] gruber, t. r., olsen, g. r.: “an ontology for engineering mathematics.” in: (doyle, j., torasso, p., sandewall, e., eds.) 4th int. conf. on principles of knowledge representation and reasoning. bonn, morgan kaufmann, 1994, p. 256–278. [9] rumbaugh, j., jacobson, i., booch, g.: the unified modelling language reference manual. addison-wesley, 1998. [10] warmer, j. b., kleppe, a. g.: the object constraint language: precise modelling with uml. addison-wesley, 1998. [11] brasethvik, t., gull, j.: “natural language analysis for semantic document modelling.” data and knowledge engineering, 2001, no. 38, p. 45–62. [12] http://www.opencyc.org/ [13] svátek, v.: “ontologies and www.” in: proc. of int. conference datacon, 2002. http://www.cogsci.princeton.edu/~wn/ [14] cranfield, s., purvis, m.: “uml as an ontology modelling language.” in: proc. of workshop on intelligent information integration, 16th int. joint conf. on artificial intelligence – ijcai '99, 1999, p. 
119–127. [15] bíla, j.: “degree of emergence in evolutionary synthesis process.” in: proc. of int. iasted conference on neural networks – nn 2000, pittsburgh, pa, 2000, p. 142–147. [16] bíla, j.: “one variant of the emergence phenomenon in the design process and its computer support.” submitted for automation in construction (issn 0926-5805), 2003. [17] bíla, j.: “emergent phenomena in a co-evolutionary conceptual design.” proc. of int. conf. artificial intelligence in design – aid 02, cambridge, england, (2002) cdrom with poster papers, paper no.1. © czech technical university publishing house http://ctn.cvut.cz/ap/ 37 czech technical university in prague acta polytechnica vol. 45 no. 4/2005 fig. 5: one possible interpretation of solution x of the conceptual redesign problem from example 1. (description: 30 … flapper, 31, 32, 33 … constraint pins, 28, 29 … input and output neck, 24 … interaction space, 27 … shaped wall of the interaction space) [18] bíla, j., tlapák, m.: “application of ontologies in mechanical engineering.” in: proc. of worskshop on “the development of methods and tools in integration of mechanical engineering. faculty of mechanical engineering, ctu in prague, prague, cr, 2003, p. 20–24. isbn 80-01-02866-6. [19] bíla, j., preisler, v.: the device for measurement of flows and amounts of gases. utility model no. cz 11576 u1, the office for industry properties, prague, cr, 2001. prof. ing. jiří bíla, drsc. phone: +420 224 352 534 fax: +420 233 336 414 e-mail: jiri.bila@fs.cvut.cz mgr. miroslav tlapák e-mail: miroslav.tlapak@fs.cvut.cz dept. of instrumentation and control engineering czech technical university in prague faculty of mechanical engineering technická 4 166 07 prague 6, czech republic 38 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 45 no. 4/2005 czech technical university in prague ap06_6.vp 1 introduction analyses of the alkali ion contents in aqueous solutions are attracting considerable attention nowadays within the community of sensor developers. information about instant, selective concentrations of na�, k�, mg2� and ca2� ions has considerable relevance for assays of biological fluids, such as human and animal blood, sap, or, for instance, for tests of drinking water, underground water, water used in the food-processing industry, and waste water. the most common method currently applied for alkali ion analysis is absorption or emission flame spectrophotometry. it enables selective determination of the ion concentration within the ppm range. the basic drawbacks of the method – high instrumentation and high labor costs – have stimulated the development of alternative sensing approaches based on electro-chemical and optical principles. the most important of the optical approaches, nowadays, are systems utilizing an intrinsic spectroscopic fibre optic scheme. the general physical principle for their operation employs interaction of the evanescent light field (‘circling’ the waveguide core and exponentially decaying along the external core normal) with the fibre cladding changing the spectral composition of the light propagating through the core, in response to diffusion of the target ions from the water phase surrounding the sensing fibre into the cladding bulk. the cladding material (usually a polymer with a refractive index slightly lower than the core material) has to be properly sensitized to provide the desired optical absorption or fluorescence changes in the presence of the selected target ions [1]. 
one such sensitization method that is widely applied at present is based on enriching the cladding polymer with a carefully selected combination of an ionophore and an acid-base dye [2]. the ionophore molecules selectively create complex ions with the target ions migrating into the cladding bulk from the water phase. the necessary condition of overall electric neutrality of the cladding material then leads to de-protonation of the dye molecules, resulting in a change in the optical absorption of the cladding. the alkalinity and lipophilicity of the dye have to be carefully tuned to get the sensing scheme working. the selectivity and reversibility of the complex ion formation are also very important. a broad range of ionophores has already been tested with the aim of optimizing the latter properties, such as crown-ethers, cryptands, calix[n]arenes (n = 4, 5 or 6), validomycin derivatives, and calix-crowns [3]. within the ionophore family, calix[n]arenes provide an extremely flexible construction base due to their combination of stereo and electro-chemical ionic affinity [2].

the aim of the research presented here is to investigate the behavior of a typical calix[n]arene ionophore under the conditions of a restricted proton exchange between the tested aqueous solution and the sensing membrane, i.e., the case when the concentration of the proton donor is low compared to the ionophore concentration. from a practical point of view, such a situation can occur, e.g., in the vicinity of the cladding/water interface after the acidic-basic reagent has been partially washed out into the aqueous phase. a liquid, lipophilic membrane (hexane in our case) containing the dissolved ionophore has been used throughout the experiments to simplify the experimental conditions. it has already been proved that such an arrangement can be used for a realistic simulation of the ion exchange between a lipophilic polymeric membrane and water [4]. the specific conductivity and the ph of the water phase were tested to get information about the course of the complexing reaction. the measurements provided simultaneous information about the concentration changes in the target ions and protons in the tested solution.

2 experimental

the tetraethyl p-tert-butylcalix[4]arene tetraacetate ionophore (tbt, fig. 1) was prepared from p-tert-butylcalix[4]arene (tb), as described in [5]. tb was purchased from alfa aesar. the other chemicals were obtained from sigma-aldrich, and they were of analytical grade purity: hexane, ethanol (etoh), naoh, koh, nacl, kcl, kclo4, cacl2, and mgcl2. nitrogen gas (99.999 % wt) was provided by messer technogas cz. de-ionized water (conductivity less than 1 µs/cm) was used for preparing the tested solutions. the specific conductivity and ph of the aqueous solutions were measured in a thermostated glass test-tube (internal diameter 20 mm) kept at 20 °c and under nitrogen, with the aid of a conductance electrode (4-strip pt electrode gryf xb1, controlled by a gryf xbc magic unit attached to a pc) and a combined glass/agcl ph-electrode (radelkis op-08083 attached to a radelkis ok104 conductometer). the temperature in the test-tube was controlled by a haake f423 thermostat.
first, the electrolyte conductivity of the tested chlorides and hydroxides was determined as a function of the solution concentration in the interval 0.1–100 mm, and the calibration curves giving the concentration of the metallic ions in the solution versus their conductivity were calculated. full dissociation of the solutions could be assumed.

extraction of the alkali ions from an aqueous phase into a lipophilic phase was tested as follows. the test-tube was filled with 13 ml of 1 mn alkali salt or alkali hydroxide solution, the thermostat temperature was set to 20 °c, the cannula providing the nitrogen inflow was inserted deep into the solution, and the conductance and ph values were measured until stable values were obtained. then, 6.5 ml of 2 mm tbt solution in hexane was carefully added, and the changes in alkali ion and proton concentration in the water phase were recorded for 20 hours. the nitrogen bubbles climbing from the bottom of the tube provided the moderate and stable agitation of the solutions, which is very important for achieving non-biased values of the stability constants characterizing the complex-ion forming reaction. since the distribution coefficient of alkaline ions between hexane and water is negligible, the metal ions cannot penetrate into the hexane and form a complex there. thus, the complex ions are formed only at the interface and then diffuse into the bulk hexane. if the solutions are not stirred, the equilibrium values are often misrepresented due to the restricted diffusion kinetics [5].

the development of etoh in the aqueous solutions observed during the experiments was qualitatively evaluated using the iodoform test [6]: 10 drops of 10 % (wt) naoh solution were added to 5 ml of the tested solution. then a 10 % solution of iodine in ki was added drop-wise until a slightly yellow color was obtained. the solution was then heated up to 60 °c and left to cool down slowly. optional precipitation of a colorless iodoform cloud gives the sought evidence of the presence of etoh in the tested solution.

3 results and discussion

no significant change in conductivity was observed for any of the tested alkali chloride solutions (fig. 2). however, the ph-values showed distinct variations during the inter-phase contact (fig. 3): while an increase was observed for the solutions of nacl, kcl, and mgcl2, a slight fall occurred for cacl2. it is well known [2] that tbt can undergo hydrolysis in acid and basic solutions, so that an acidic hydrolysis process following reaction scheme (1) is likely compatible with the observed behavior:

ca(och2cooet) + h+ + cl− → ca(och2cooh) + etcl.  (1)

here ca denotes the upper rim and central annulus of the tbt molecule.
the validity of the proposed mechanism was supported by the remarkable development of a gas (the boiling temperature of ethyl chloride at atmospheric pressure is ~12 °c) observed after shaking the solutions together. the observed behavior of the potassium perchlorate and calcium chloride solutions is also consistent with (1). both solutions showed a lower starting ph-value (~5.6–5.7) than the other tested chlorides. the less electro-negative perchlorate ions did not react according to scheme (1). in the case of calcium chloride, reaction (1) took place, but the shift in the balance towards basic conditions led to the creation of calcium hydroxide. this precipitated due to its very limited solubility, forming an observable white cloud in the solution. the equilibrium concentration of ca2+ in the solution remained unchanged at the same time, since it is buffered by the incomplete dissociation of the cacl2 present in the solution.

naoh and koh solutions showed a remarkable decay of the conductivity value with time. this was accompanied by a decrease in the ph of the solutions (fig. 4, fig. 5). magnesium and calcium hydroxides were not tested because of their low solubility in water.

fig. 1: structure of tetraethyl p-tert-butylcalix[4]arene tetraacetate (tbt)
fig. 2: temporal evolution of the electrolyte conductivity of the tested chloride solutions during the extraction process (temperature 20 °c, initial concentrations of the chloride solutions and the tbt solution 1 mn and 2 mm, respectively)
fig. 3: temporal evolution of the ph value of the tested chloride solutions during the extraction process (temperature 20 °c, initial concentrations of the chloride solutions and the tbt solution 1 mn and 2 mm, respectively)

the observed data can in this case likely be explained by basic hydrolysis of the ionophore side groups:

ca(och2cooet) + oh− + me+ → ca(och2coo− me+) + etoh.  (2)

me+ stands for the na+ and k+ ions. the relevance of reaction scheme (2) was supported by the observed presence of ethanol in the resulting aqueous solutions, qualitatively verified by the iodoform test. the following stability constants were calculated from the obtained conductivity data, using relation (2): k(na+) = 7.3×10−7, k(k+) = 6.5×10−8, k(ca2+) = 1.6×10−5, k(mg2+): no measurable reaction. thus, the highest selectivity (β, β ~ 1/k) toward the target ion was obtained for the potassium ions: β(k+)/β(na+) = 11.1. this result confirms again that the observed extraction process differs from the expected creation of complex ions [tbt-me]+ known to take place, e.g., in methanol solutions. the latter process shows a remarkable preference for na+ ions, with the selectivity ratio β(na+)/β(k+) ~ 400 [7].

4 conclusions

the results presented here show that the nature of the chemical reaction of the tested tbt calix[4]arene derivative with alkali ions can be strongly changed in a lipophilic solvent, such as hexane, and depends on the ph value of the water phase. when no free protons are available for an effective charge exchange at the water/hexane interface, there is a hydrolytic attack on the ionophore molecules.
in the case of alkaline solutions, this results in extraction of the target ions into the hexane phase; for slightly acidic solutions of chlorides, no ionic extraction can be observed. the proposed hydrolytic mechanism only includes the ethyl ester side groups, so it cannot be directly extended to other calix[n]arene derivatives bearing other substituents at the bottom rim. however, if hydrolysis of the side groups may occur, the effect has to be seriously considered as a possible parasitic reaction disturbing the function of, e.g., intrinsic fibre optic sensors utilizing calix[n]arene ionophores as the selective reacting centers for alkali metal detection.

5 acknowledgment

the research was supported by grant ctu0611014 of the czech technical university in prague.

references

[1] lieberman, r. a.: recent progress in intrinsic fibre optic sensing ii. sensors and actuators b11, 1993, p. 43–55.
[2] diamond, d., nolan, k.: calixarenes: designer ligands for chemical sensors. analytical chemistry, january 1, 2001, p. 23a–29a.
[3] poonia, n. s., bajaj, a. v.: coordination chemistry of alkali and alkaline earth cations. indore (india): university of indore, 1979.
[4] morf, w. e.: the principles of ion-selective electrodes and membrane transport. budapest: akadémiai kiadó, 1981.
[5] arnaud-neu, f. et al.: synthesis, x-ray crystal structure, and cation-binding properties of alkyl calixaryl esters and ketones, a new family of macrocyclic molecular receptors. j. am. chem. soc., vol. 111 (1989), p. 8681–8691.
[6] šedivec, v., flek, j.: příručka analýzy organických rozpouštědel. praha: sntl, 1968.
[7] arnaud-neu, f. et al.: selective alkali-metal cation complexation by chemically modified calixarenes. part 4. j. chem. soc. perkin trans. 2, 1992, p. 1119–1125.

fig. 4: temporal evolution of the electrolyte conductivity of the tested hydroxide solutions during the extraction process (temperature 20 °c, initial concentrations of the hydroxide solutions and the tbt solution 1 mn and 2 mm, respectively)
fig. 5: temporal evolution of the ph value of the tested hydroxide solutions during the extraction process (temperature 20 °c, initial concentrations of the hydroxide solutions and the tbt solution 1 mn and 2 mm, respectively)

ing. ladislav kalvoda, csc., specialist assistant, phone: +420 224 358 606, +420 233 325 508, fax: +420 224 358 601, e-mail: ladislav.kalvoda@fjfi.cvut.cz
ing. rudolf klepáček, phone: +420 224 358 606, +420 233 325 508, e-mail: rudolf.klepacek@fjfi.cvut.cz
department of solid state physics, czech technical university in prague, faculty of nuclear science and physical engineering, trojanova 13, 120 00 prague 2, czech republic

structure of the enveloping algebras
č. burdík, o. navrátil, s. pošta

the adjoint representations of several small dimensional lie algebras on their universal enveloping algebras are explicitly decomposed. it is shown that commutants of raising operators are generated as polynomials in several basic elements. the explicit form of these elements is given, and the general method for obtaining these elements is described.

keywords: adjoint representation, enveloping algebra, lie algebra.

1 introduction

let g be a finite dimensional complex semisimple lie algebra and u = u(g) its corresponding enveloping algebra. if we want to study all two-sided ideals of the associative algebra u (see [1], chapters 6–10), i.e. to find all the vector spaces i ⊂ u which satisfy

for all a, b ∈ u: a i b ⊂ i,

we can easily see that such a two-sided ideal is invariant with respect to the adjoint action of u: if we take the adjoint action ρ: g → l(u), i.e.
ρ(x)a = [x, a] for x ∈ g and a ∈ u, and extend this definition to the whole of u by

ρ(u)a = ρ(x1)ρ(x2)…ρ(xn)a = [x1, [x2, …, [xn, a]…]] for all u = x1 x2 … xn, xi ∈ g,

we see that ρ(u)i ⊂ i. it is well known that such an adjoint action ρ is completely reducible for g semisimple: it is possible to cut the enveloping algebra u into pieces u_k such that

u = ⊕_{k≥0} u_k,

and each piece u_k is an invariant subspace (corresponding to an irreducible representation (component) of ρ) generated by its highest weight vector v_k:

u_k = ρ(u) v_k.

now if we take any ideal i ⊂ u and find any highest weight vector v_k ∈ i, we automatically know that ρ(u)v_k = u_k ⊂ i, i.e. we know a large set of vectors which must also be contained in the ideal i. this can help considerably in the classification of all ideals of u. the main aim of this paper is to give an explicit decomposition u = ⊕_{k≥0} u_k, i.e. to give a list of highest weight vectors v_k, k = 0, 1, …, for the simplest cases of lie algebras, and to show that even in more complicated examples the decomposition does not look too weird and is relatively easily obtained by a generally described procedure.

2 the simplest example – the algebra sl(2)

let us take the algebra sl(2), which is the complex linear span of three vectors e12, e21, h1 = e11 − e22, where by eij we denote the matrix having 1 in position (i, j) and 0s elsewhere. if we compute the commutation relations of these elements, we obtain

[e12, e21] = h1, [h1, e12] = 2e12, [h1, e21] = −2e21.

the enveloping algebra u = u(sl(2)) fulfils the poincaré-birkhoff-witt theorem and consists of all complex linear combinations of the monomials e12^α e21^β h1^γ, α, β, γ ≥ 0. it has a natural filtration u_n, n ≥ 0, given by the degree n of the elements in u_n. it is easily seen that the adjoint action has u_n as its invariant subspace (by applying a commutator we cannot obtain an element of higher degree). therefore it is completely reducible on each u_n, i.e. we can see u_n as a direct sum of invariant subspaces generated by certain highest weight vectors. the highest weight vector v of weight m satisfies the relations

[e12, v] = 0, [h1, v] = 2m v.

from this we can directly find all highest weight vectors of small degree:

degree 0: 1
degree 1: e12
degree 2: e12², c1
degree 3: e12³, c1 e12
…

here c1 = ¼h1² + e12e21 − ½h1 is a casimir element from the center of u. by commuting these vectors we see that

u_0 = ρ(u)1, u_1 = ρ(u)e12 ⊕ u_0, u_2 = ρ(u)e12² ⊕ ρ(u)c1 ⊕ u_1, …

from this we can claim that for any n ≥ 0

u_n = ⊕_{2k+m≤n} ρ(u)(c1^k e12^m).  (1)

the proof of this general formula is based on a dimensional check. first we see that the sum ρ(u)v1 + ρ(u)v2, where v1, v2 are highest weight vectors, is direct if and only if v1, v2 are linearly independent. thus, in our example, we must check that the vectors c1^k e12^m are linearly independent for different k, m.
but this is easily seen from the definition, since

c1^k e12^m = (1/4^k) e12^m h1^{2k} + lower terms,

where the “lower terms” contain monomials with h1^{2k−1} and lower. next, it is well known from representation theory that the dimension of the irreducible representation with highest weight m is 2m+1. on the other hand, the dimension of u_n is also easy to determine, because it is well known that u_n is isomorphic (as a vector space) to the direct sum ⊕_{k=0}^{n} s^k(g) of the k-th symmetric powers of the lie algebra g. we can construct the symmetric algebra s(g) from the tensor algebra t(g) by taking the quotient of t(g) by the ideal generated by all differences of products v⊗w − w⊗v for v, w ∈ g; then there is a direct sum decomposition of s(g), as a graded algebra, into the summands s^k(g), which consist of the linear span of the monomials of degree k in vectors of g. the symmetric algebra s(g) is in effect the same as the polynomial ring in indeterminates that are basis vectors for g. therefore we see that

dim s^k(g) = c(k + dim g − 1, k)

(c(a, b) denotes the binomial coefficient), and from this we have

dim u_n = Σ_{k=0}^{n} dim s^k(g) = c(n + dim g, dim g),

thus in our example dim u_n = c(n+3, 3). now it is sufficient to prove that the dimensions of the spaces on both sides of (1) are equal, i.e. that

c(n+3, 3) = Σ_{2k+m≤n, k,m≥0} (2m+1).

we have

Σ_{2k+m≤n} (2m+1) = Σ_{k=0}^{⌊n/2⌋} Σ_{m=0}^{n−2k} (2m+1) = Σ_{k=0}^{⌊n/2⌋} (n−2k+1)² = (n+1)(n+2)(n+3)/6 = c(n+3, 3).

3 moving to the commutative case

there is an alternative approach to obtaining this decomposition – moving to the commutative case. it is a well-known fact (see [2]) that the mapping

e_i ↦ ê_i = Σ_{j,k} c_{ij}^k x_k ∂/∂x_j

is a representation of an n-dimensional lie algebra with basis e_i, i = 1, …, n (and structure constants [e_i, e_j] = c_{ij}^k e_k) on the space of polynomials in the variables x_i, i = 1, …, n; if we view s(g) as this polynomial ring, the representation is equivalent, as a representation of the enveloping algebra, to the adjoint representation via the canonical isomorphism u(g) ≅ s(g). therefore, if we return to our example and represent the basis vectors of the lie algebra sl(2) by the operators

ê12 = h1 ∂/∂e21 − 2e12 ∂/∂h1,
ê21 = −h1 ∂/∂e12 + 2e21 ∂/∂h1,
ĥ1 = 2e12 ∂/∂e12 − 2e21 ∂/∂e21,

acting on the space of polynomials in the three variables e12, e21, h1, we can find all highest weight vectors by solving the system of differential equations

h1 ∂f/∂e21 − 2e12 ∂f/∂h1 = 0, 2e12 ∂f/∂e12 − 2e21 ∂f/∂e21 = 2m f.

this system can be easily solved, and the general solution is

f(e12, e21, h1) = e12^m F(c1),

where F is any differentiable function of the variable c1 = ¼h1² + e12e21. we see that when F(x) = x^k, the solution is polynomial, and we can transfer it back to the enveloping algebra using the canonical isomorphism s(g) ≅ u(g): if a1, …, ak ∈ g, this isomorphism sends

a1 a2 … ak ↦ (1/k!) Σ_σ a_σ(1) a_σ(2) … a_σ(k)

(we sum over all permutations σ of the indices 1, …, k). in our example

c1 = ¼h1² + ½(e12e21 + e21e12) = ¼h1² + e12e21 − ½h1.

doing the dimension check as before, we see that the decomposition is complete.
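both the highest-weight property of the solutions e12^m c1^k and the counting identity above can be checked mechanically. the following is a minimal sketch in python (using sympy and the differential operators written out above; the variable names are ours):

```python
from math import comb
import sympy as sp

e12, e21, h1 = sp.symbols('e12 e21 h1')
c1 = h1**2/4 + e12*e21          # commutative casimir

def op_e12(f):                  # action of e12: h1*d/de21 - 2*e12*d/dh1
    return h1*sp.diff(f, e21) - 2*e12*sp.diff(f, h1)

def op_h1(f):                   # action of h1: 2*e12*d/de12 - 2*e21*d/de21
    return 2*e12*sp.diff(f, e12) - 2*e21*sp.diff(f, e21)

for m in range(4):
    for k in range(3):
        f = e12**m * c1**k
        assert sp.simplify(op_e12(f)) == 0           # highest weight vector
        assert sp.simplify(op_h1(f) - 2*m*f) == 0    # weight 2m

# counting identity: c(n+3, 3) = sum of (2m+1) over 2k+m <= n
for n in range(25):
    assert comb(n+3, 3) == sum(2*m + 1
                               for k in range(n//2 + 1)
                               for m in range(n - 2*k + 1))
print("sl(2) checks passed")
```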
4 the algebras sl(3) and gl(3)

the case of the algebra sl(3) is more complicated; the decomposition for this case is done in [3]. let us show where the complications in the decomposition arise, using the similar case of the lie algebra gl(3) with the basis e_ij, i, j = 1, 2, 3, and the commutation relations

[e_ij, e_kl] = δ_jk e_il − δ_li e_kj.

we turn directly to the commutative case. first we set up, as before, the system of differential equations for the highest weight vectors of weight (n1, n2, n3):

ê12 f = 0, ê13 f = 0, ê23 f = 0, ê11 f = n1 f, ê22 f = n2 f, ê33 f = n3 f,

where the operators realizing the adjoint action on polynomials in the variables x_ij are

ê_ij = Σ_m (x_im ∂/∂x_jm − x_mj ∂/∂x_mi).

from this it immediately follows that n1 + n2 + n3 = 0. if we define the new variables

x1 = x13, x3 = x12²x23 − x12x13(x33 − x22) − x13²x32,

then any solution f can be written as

f(x_ij) = x1^{n2−n3} x3^{−n2} f̃,

where the new unknown function f̃ satisfies a reduced system of differential equations of the same type with zero weight. solving this system, we obtain that the general solution of the system has the form

f(x_ij) = x1^{n2−n3} x3^{−n2} F(x2/x1, c1, c2, c3),

where x2 = x_1k x_k3, c1 = x_kk, c2 = x_ik x_ki, c3 = x_ik x_kl x_li (we omit the summation over repeated indices going over 1, 2, 3).

we now encounter the first problem which arises in the process of decomposition. although we know the general solution of this system of differential equations, an important question is how to obtain enough polynomial solutions in the variables x_ij from this general solution. this can be generally difficult. for example, the element

x4 = x12x23² − x13x23(x11 − x22) − x13²x21,

although it is obtained from the general solution as an expression rational in the variables x_j, c_j, is polynomial in the variables x_ij!

the second question which arises is, in some sense, the opposite kind of problem. we can generate too many polynomial solutions, and their polynomial combinations will no longer be linearly independent (or, equivalently, the polynomial solutions will be functionally dependent). for example, in our case of the algebra gl(3), the product x3x4 is itself a polynomial in x1, x2, c1, c2, c3 alone. this dependency signals that it is forbidden to have the product x3x4 in the decomposition.

luckily, both above-mentioned problems are easily detected by a dimension check. if there are not enough solutions, we will not have enough linearly independent vectors to construct the decomposition, so the number of linearly independent vectors we generate is less than the dimension of the corresponding filtration part u_n. in the opposite case, if the number is greater, it clearly indicates that there is some linear dependency between the considered vectors.
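the linear-independence part of such a dimension check can itself be mechanized: candidate monomials in the generators are evaluated at random matrices, and the rank of the resulting value matrix is computed. a minimal numerical sketch in python follows; the monomials tested here are an illustrative low-degree subset, not the full basis:

```python
import numpy as np

rng = np.random.default_rng(0)

def gens(x):
    """invariants c_k = tr(x^k) and covariants x1 = x_13, x2 = (x^2)_13."""
    xx = x @ x
    return np.trace(x), np.trace(xx), np.trace(xx @ x), x[0, 2], xx[0, 2]

def row(x):
    c1, c2, c3, x1, x2 = gens(x)
    # a few illustrative low-degree monomials in the generators
    return [1.0, c1, c2, c3, x1, x2, c1*x1, x1**2]

a = np.array([row(rng.standard_normal((3, 3))) for _ in range(40)])
print(np.linalg.matrix_rank(a))  # 8 -> these monomials are linearly independent
```

a rank smaller than the number of columns would flag a linear dependency of the kind described above.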
in the opposite case, if the number is greater, it clearly indicates that there is some linear dependency between the considered vectors. © czech technical university publishing house http://ctn.cvut.cz/ap/ 27 acta polytechnica vol. 47 no. 2–3/2007 summing up these facts, if we now anti-symmetrize xj, cj into elements xj, cj of the enveloping algebra u, i.e. x x e x x e e e e e e 1 1 13 2 2 12 23 13 11 33 13 1 2 � � � � �( ) etc. we can construct vectors c c c x x x xk k k n n n n1 2 3 1 2 3 4 1 2 3 1 2 3 4 , k nj j, � 0, n n3 4 0� and compute their weights. the weight of the product of any elements is obtained by the sum of the weights of the corresponding elements; we have element weight x1 (1,0,�1) x2 (1,0,�1) x3 (2,�1,�1) x4 (1,1,�2) c1,2,3 (0,0,0) therefore weight ( ) ( c c c x x x x n n n n k k k n n n n 1 2 3 1 2 3 4 1 2 3 4 1 2 3 1 2 3 4 2� , , ).n n n n n n4 3 1 2 3 42� � � � � the degree of the general element is n k k k n n n n� 1 2 3 1 2 3 42 3 2 3 3 . now we can perform a dimension check to see if our hypothesis is correct: from representation theory, the dimension of the representation with highest weight (n1, n2, n3) is d n n n n n n n n n( , , ) ( )( )( )1 2 3 1 2 1 3 2 3 1 2 1 2 1� � � � , hence the dimension of the representation generated by our general element is ~ ( , , , ) ( )( ) ( d n n n n n n n n n n n 1 2 3 4 1 2 3 1 2 4 1 1 2 3 1 3 1 2 2 � n n n2 3 43 3 2 ). it follows that we must prove that ~ ( , , , ) , , , , , , d n n n n k k k n n n n n n k k 1 2 3 4 0 0 2 1 2 3 1 2 3 4 3 3 1 2 � � 3 2 3 33 1 2 3 4 9 k n n n n n n n � � � � � � � � . the first look leads us of course to check if it is correct for small n: we make a table n sum on the left side n � � � � � � 9 9 1 10 10 2 55 55 3 220 220 4 715 715 … … from which we see that it should work. a general proof of the formula can be performed for example using generating functions: because n n n � � � � � � � � � � � � � � � � � � � � � 9 9 1 9 9 8 8 ( ) it is sufficient to show ~ ( , , , ) , , , , , , d n n n n k k k n n n n n n k k 1 2 3 4 0 0 2 1 2 3 1 2 3 4 3 4 1 2 � � 3 2 3 33 1 2 3 4 8 8 k n n n n n n � � � � � � � � � or, equivalently, ~ ( , , , ) , , , , , , d n n n n xn k k k n n n n n n k k 1 2 3 4 0 0 2 1 2 3 1 2 3 4 3 4 1 � � 2 3 1 2 3 43 2 3 3 0 0 8 8 � � � �� �� � � � � � � k n n n n n n n n n x . this is easily done with the help of the sum of the geometric series, namely 1 1 0 � � � �x x n n which can be differentiated several times to get 1 1 2 0( )� � � �x n x n n , x x x n xn n ( ) ( ) � � � �11 3 2 0 , … etc. this way, after expanding the right hand side we get n x x n n � � � � � � � � � � 8 8 1 10 9( ) . 28 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 47 no. 2–3/2007 to remove the unwanted condition n n3 4 0� we can rewrite the inner sum as ~ ( , , , ) , , , , , , d n n n n k k k n n n n n n k k 1 2 3 4 0 0 2 1 2 3 1 2 3 4 3 4 1 2 � � 3 2 3 3 1 2 3 3 1 2 3 4 1 2 3 1 2 0 k n n n n n k k k n n d n n n � � � ~ ( , , , ) , , , , , , ~ ( , , , n n k k k n n n n n d n n n 3 4 1 2 3 1 2 3 4 0 2 3 2 3 3 1 2 40 � � � ) , , , , , ,k k k n n n n k k k n n n n n 1 2 3 1 2 3 4 1 2 3 1 2 3 4 0 2 3 2 3 3 � � � � � ~ ( , , , ) , , , , , , d n n k k k n n n n k k k n 1 2 0 2 3 0 0 1 2 3 1 2 3 4 1 2 3 1 � � 2 3 32 3 4n n n n © czech technical university publishing house http://ctn.cvut.cz/ap/ 29 acta polytechnica vol. 47 no. 2–3/2007 and sum each of the summands separately. 
a general proof of the formula can be performed, for example, using generating functions. because

c(n+9, 9) − c((n−1)+9, 9) = c(n+8, 8),

it is sufficient to show that the sum restricted to elements of degree exactly n equals c(n+8, 8), or, equivalently,

Σ_{n≥0} ( Σ_{degree = n} d̃(n1, n2, n3, n4) ) x^n = Σ_{n≥0} c(n+8, 8) x^n = 1/(1−x)^9.

this is easily done with the help of the sum of the geometric series, namely 1/(1−x) = Σ_{n≥0} x^n, which can be differentiated several times to get

1/(1−x)² = Σ_{n≥0} (n+1) x^n, x(1+x)/(1−x)³ = Σ_{n≥0} n² x^n, … etc.

to remove the unwanted condition n3n4 = 0, we can rewrite the inner sum, by inclusion and exclusion, as

Σ_{n3n4=0} d̃(n1, n2, n3, n4) = Σ_{n4=0} d̃(n1, n2, n3, 0) + Σ_{n3=0} d̃(n1, n2, 0, n4) − Σ_{n3=n4=0} d̃(n1, n2, 0, 0),

and sum each of the summands separately. carrying out the summations with the above series, the first summand yields a rational function Σ1(x); the second summand is obviously equal to the first; and the third yields a rational function Σ3(x). now

2 Σ1(x) − Σ3(x) = 1/(1−x)^9,

and the dimension check succeeds.

5 the lie algebra gl(4)

finally, let us demonstrate how the procedure goes in the case of the lie algebra gl(4). for the vectors with highest weights (n1, n2, n3, n4) we obtain the analogous system of equations

ê_ij f = 0 for i < j, ê_ii f = n_i f, i = 1, …, 4,

with the operators ê_ij defined as in the case of gl(3) (summation of repeated indices now going over 1, …, 4). now let us denote

e(1)_ik = x_ik, e(2)_ik = x_in x_nk, e(3)_ik = x_in x_nm x_mk.

using these elements we define the following solutions of the system:

c1 = e(1)_kk, c2 = e(2)_kk, c3 = e(3)_kk, c4 = e(3)_kl e(1)_lk,
x1 = e(1)_14, x2 = e(2)_14, x3 = e(3)_14,
y1 = det [ e(1)_13, e(1)_14 ; e(1)_23, e(1)_24 ], y2 = det [ e(2)_13, e(2)_14 ; e(2)_23, e(2)_24 ], y3 = det [ e(3)_13, e(3)_14 ; e(3)_23, e(3)_24 ],
z1 = det [ e(1)_13, e(1)_14 ; e(2)_13, e(2)_14 ], z2 = det [ e(1)_13, e(1)_14 ; e(3)_13, e(3)_14 ], z3 = det [ e(2)_13, e(2)_14 ; e(3)_13, e(3)_14 ],
z′1 = det [ e(2)_14, e(2)_24 ; e(1)_14, e(1)_24 ], z′2 = det [ e(3)_14, e(3)_24 ; e(1)_14, e(1)_24 ], z′3 = det [ e(3)_14, e(3)_24 ; e(2)_14, e(2)_24 ],
w = det [ e(1)_12, e(1)_13, e(1)_14 ; e(2)_12, e(2)_13, e(2)_14 ; e(3)_12, e(3)_13, e(3)_14 ],
w′ = det [ e(3)_14, e(3)_24, e(3)_34 ; e(2)_14, e(2)_24, e(2)_34 ; e(1)_14, e(1)_24, e(1)_34 ],
t = 2e(2)_24 (x13 e(3)_14 − x14 e(3)_13) − e(2)_14 (x23 e(3)_14 − x13 e(3)_24 + x14 e(3)_23 − x24 e(3)_13).

the highest weights of the corresponding elements of the enveloping algebra are

c_k – (0,0,0,0); x_k – (1,0,0,−1); y_k – (1,1,−1,−1); z_k – (2,0,−1,−1); z′_k – (1,1,0,−2); w – (3,−1,−1,−1); w′ – (1,1,1,−3); t – (2,1,−1,−2).

the dimension of the representation of gl(4) with highest weight (n1, n2, n3, n4) is

d(n1, n2, n3, n4) = (1/12)(n1−n2+1)(n1−n3+2)(n1−n4+3)(n2−n3+1)(n2−n4+2)(n3−n4+1).

now we start, as in the case of gl(3), to count the dimensions of the filtration subspaces to see the conditions which arise as the degree of the elements grows. the elements of degree 1 are c1 (dimension 1) and x1 (dimension 15), total 16. the elements of degree 2 are c1² (1), c1x1 (15), x1² (84), c2 (1), x2 (15), y1 (20), total 136; … etc.
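the degree tables are reproduced directly by the weyl dimension formula stated above; the totals must equal dim s^k(gl(4)) = c(k+15, k). a minimal sketch in python:

```python
from math import comb

def dim_gl4(w):
    """weyl dimension formula for gl(4), highest weight w = (n1, n2, n3, n4)."""
    n1, n2, n3, n4 = w
    return ((n1-n2+1)*(n1-n3+2)*(n1-n4+3)
            * (n2-n3+1)*(n2-n4+2)*(n3-n4+1)) // 12

deg1 = [(0,0,0,0), (1,0,0,-1)]                      # c1, x1
deg2 = [(0,0,0,0), (1,0,0,-1), (2,0,0,-2),          # c1^2, c1*x1, x1^2
        (0,0,0,0), (1,0,0,-1), (1,1,-1,-1)]         # c2, x2, y1
print([dim_gl4(w) for w in deg2])                   # [1, 15, 84, 1, 15, 20]
assert sum(map(dim_gl4, deg1)) == comb(16, 1)       # dim s^1(gl(4)) = 16
assert sum(map(dim_gl4, deg2)) == comb(17, 2)       # dim s^2(gl(4)) = 136
```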
analyzing degree 6, we encounter the first linear dependency, which signals some polynomial relations (syzygies) between the generators. namely, we have

x1 z3 − x2 z2 + x3 z1 = 0, x1 z′3 − x2 z′2 + x3 z′1 = 0,

together with further relations of the same kind: a relation expressing the product z1 z′1 through x1, x2, y1, y2 and the invariants c_k, and a relation of the form 8y3 − … = 0 expressing y3 through y2, y1 and the invariants. higher degrees give us many other relations of this type, which lead to the system of conditions that must be fulfilled to keep the set of highest weight vectors linearly independent. going up to degree 12 finishes the analysis, and we no longer encounter new relations. thus we are ready to formulate the hypothesis, and after a successful dimension check we can state the final result, which is contained in the following theorem.

theorem. the set of elements

c2^{k1} c3^{k2} c4^{k3} x1^{n1} x2^{n2} x3^{n3} y1^{m1} y2^{m2} t^{p} z1^{r1} z2^{r2} z3^{r3} z′1^{s1} z′2^{s2} z′3^{s3} w^{t1} w′^{t2},

with the conditions that the products of exponents

r1n3, r1s1, r1s2, r1s3, r1p, r1t2, s1n3, s1r2, s1r3, s1p, r2s2, r2s3, r2t2, r2p, …, r3t2, r3p, s3t1, s3p, t1p, t2p, t1t2

all vanish, and r2, s2, r3, s3, p ∈ {0, 1}, forms the desired decomposition of the enveloping algebra u(gl(4)).

6 conclusion

we have studied the structure of the enveloping algebras u(g), where g = sl(2), gl(3), gl(4), and we have decomposed the adjoint representation into its irreducible components. we have found explicitly the highest weight vector in each such component. our result can be useful for a further study of tensor products of representations and of ideals of enveloping algebras. the method can also be used for other simple lie algebras (see [4]).

acknowledgment

the authors are grateful to m. havlíček for valuable and useful discussions. this work was partially supported by gačr 201/05/0857 and by research plan msm6840770039.

references

[1] dixmier, j.: algèbres enveloppantes. paris: gauthier-villars, 1974.
[2] kirillov, a.: elementy teorii predstavlenij. moscow: nauka, 1972.
[3] flath, d.: decomposition of the enveloping algebra of sl3. j. math. phys., vol. 31 (1990), no. 5, p. 1076–1077.
[4] burdík, č., navrátil, o.: decomposition of the enveloping algebra so(5). submitted to j. gener. lie theory and appl.

prof. rndr. čestmír burdík, drsc., phone: +420 224 358 549, e-mail: burdik@kmlinux.fjfi.cvut.cz
ing. severin pošta, ph.d., phone: +420 224 358 562, e-mail: severin@km1.fjfi.cvut.cz
department of mathematics, czech technical university in prague, faculty of nuclear sciences and physical engineering, trojanova 13, 120 00 prague 2, czech republic
doc. rndr. ondřej navrátil, ph.d., phone: +420 224 890 714, e-mail: navratil@fd.cvut.cz
department of mathematics, czech technical university in prague, faculty of transportation sciences, na florenci 25, 110 00 prague 1, czech republic

advanced load effect model for probabilistic structural design
m. sýkora

in probabilistic structural design, some actions on structures can be well described by renewal processes with intermittencies. the expected number of renewals for a given time interval and the probability of the “on”-state at an arbitrary point in time are of main interest when estimating the structural reliability level related to the observed period. it appears that the expected number of renewals follows the poisson distribution. the initial probability of the “on”-state is derived assuming random initial conditions. based on a two-state markov process, the probability of the “on”-state at an arbitrary point in time then proves to be a time-invariant quantity under random initial conditions. the results are numerically verified by monte carlo simulations. it is anticipated that the proposed load effect model will become a useful tool in probabilistic structural design. the aims of future research are outlined in the conclusions of the paper.

keywords: rectangular wave renewal process, probability of “on”-state, expected number of renewals, actions on structures.

1 introduction

actions on structures are often of a time-variant nature. special attention is in particular required when a combination of time-variant loads needs to be considered. approaches to probabilistic structural design based on different load combination models are indicated by jcss [1]. it appears that an advanced load effect model based on renewal processes can suitably be used to describe random load fluctuations in time, enabling sufficiently accurate estimates of the reliability level in practical applications. a great number of actions on structures can be approximated by rectangular wave renewal processes with random durations between renewals, as already recognized e.g. by wen [2]. models based on renewal processes with exponentially distributed durations between renewals and exponentially distributed durations of load pulses were also recommended for practical use by iwankiewicz and rackwitz [3, 4].

when estimating the structural reliability level related to a specified observed period t, the failure probability pf(0, t) is often assessed by lower and upper bounds in the case of combinations of renewal processes, as indicated e.g. by sýkora [5]. the upper bound on pf(0, t) is of great importance for practical applications. two basic properties of the renewal process, the expected number of renewals e[N(0, t)] and the probability of the “on”-state pon(t), are needed to evaluate the upper bound, see sýkora [5]. the probability of the “on”-state was investigated by shinozuka [6], considering a “sufficiently long” observed period t. an extension to an arbitrarily long period t, the so-called non-stationary case, was then provided by iwankiewicz and rackwitz [4], where formulas for the probability of the “on”-state were derived considering various initial conditions.

the present paper attempts to reinvestigate the formulas for pon(t) achieved by shinozuka [6] and by iwankiewicz and rackwitz [3, 4]. in addition, a formula for the expected number of renewals e[N(0, t)] is verified. both basic properties of the renewal process are investigated under random as well as given initial conditions. special attention is paid here to the correct definition of the random initial conditions. initially, the expected number of renewals e[N(0, t)] is shown to be independent of the initial conditions. the initial probability of the “on”-state pon(0) is then derived under random initial conditions. a two-state markov process developed by madsen and ditlevsen [7] is adopted to derive the probability of the “on”-state pon(t) at an arbitrary point in time. it appears that pon(t) is a time-invariant quantity under random initial conditions. in general, the paper provides a comprehensive theoretical background for practical applications of advanced load models based on renewal processes. in addition to the newly derived formulas, several results already obtained by shinozuka [6] and iwankiewicz and rackwitz [4] are verified.
desirable extensions for further research are outlined.

2 basic properties of the considered renewal process

it is further considered that the actual load process can suitably be approximated by a renewal process s(t) with the following properties:

– the process is intermittent, i.e. the load may be “on” (present) or “off”,
– the durations Tren between renewals are mutually independent random variables described by an exponential distribution with the rate λ,
– the durations τon of the “on”-state (the durations of load pulses) are also mutually independent exponential variables (rate μ); “on”-states are initiated when a renewal occurs,
– load pulses do not overlap, and thus the effective load pulse duration Ton is given by

Ton = τon if τon < Tren, Ton = Tren if τon ≥ Tren,  (1)

– the load intensities s are mutually independent variables having an appropriate extreme distribution max_Tren[s(t)] related to the expected duration between renewals e[Tren] = 1/λ, see e.g. weisstein [8],
– random conditions are primarily assumed at the initial time t = 0 of the observed period t, i.e. it is of a purely random nature whether the process starts in an “on”-state or an “off”-state. some remarks on processes with given initial conditions are also provided in the following.

the considered process s(t) is indicated in fig. 1, where the actual load history is depicted in grey.

fig. 1: rectangular wave renewal process s(t) with intermittencies

note that random variables are further denoted by upper-case letters (e.g. the durations between renewals are referred to as Tren), while lower-case letters (e.g. tren) stand for their realizations/trials.
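a process with these properties is straightforward to simulate directly from the definitions (a minimal sketch in python; the parameter names lam and mu stand for λ and μ; the sketch starts a renewal at t = 0, i.e. the “given” initial conditions discussed in section 4, while the random initial conditions are treated below):

```python
import numpy as np

def sample_path(lam, mu, t_end, rng):
    """one trial of the intermittent process: list of (pulse start, pulse end)."""
    t, pulses = 0.0, []
    while t < t_end:
        t_ren = rng.exponential(1.0 / lam)   # duration between renewals
        tau_on = rng.exponential(1.0 / mu)   # raw pulse duration
        t_on = min(tau_on, t_ren)            # truncation rule (1)
        pulses.append((t, t + t_on))         # the load is "on" on [t, t + t_on)
        t += t_ren
    return pulses

rng = np.random.default_rng(1)
print(sample_path(lam=1.0, mu=2.0, t_end=5.0, rng=rng)[:3])
```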
3 random initial conditions

special attention is paid to the correct definition of the random conditions at the initial state. as the process s(t) constitutes a sequence of intervals Tren, the random initial conditions are fulfilled when the initial point is located in an interval T0 selected from a population of Tren on a purely random basis, taking into account the random properties of Tren, as indicated in fig. 2. the interval T0 (a random variable) is denoted as the first renewal.

fig. 2: sequence of intervals Tren and random selection of the first renewal T0

to derive the cumulative distribution function (cdf) F_T0(t) of the first renewal T0, a sufficiently long sequence of a large number n̄ of trials Tren,i (1 ≤ i ≤ n̄) is further considered. a trial T0 = Tren,j (1 ≤ j ≤ n̄) is randomly selected from the population of the Tren,i. consider next that the duration of the selected trial is τ, T0 = Tren,j ∈ (τ, τ+dτ). by intuition, the probability p[T0 ∈ (τ, τ+dτ)] that the selected trial T0 is of duration τ can be obtained as the ratio of the total duration of all Tren,j ∈ (τ, τ+dτ) over the total duration of all trials Tren,i in the population. using the probability density function (pdf) of the exponential distribution, f_Tren(t) = λ e^(−λt), the central limit theorem for a sum of n̄ independent random variables (see weisstein [8]), and the expected value e[Tren] = 1/λ, the probability becomes

p[T0 ∈ (τ, τ+dτ)] = lim_{n̄→∞} ( Σ_{Tren,i ∈ (τ, τ+dτ)} Tren,i ) / ( Σ_{i=1}^{n̄} Tren,i ) = τ λ e^(−λτ) dτ / e[Tren] = λ² τ e^(−λτ) dτ.  (2)

the cumulative distribution function of the first renewal T0 is obtained from (2) as follows:

F_T0(t) = p(T0 < t) = ∫_0^t λ² τ e^(−λτ) dτ = 1 − (1 + λt) e^(−λt),  (3)

and the probability density function becomes, using (3),

f_T0(t) = dF_T0(t)/dt = λ² t e^(−λt).  (4)

note that cdf (3) can suitably be used in monte carlo simulations to achieve random initial conditions. the first renewal T0 can either be randomly selected from a large population of samples of Tren, or directly simulated using cdf (3). simulations based on cdf (3) are clearly incomparably more efficient than “the first approach” described at the beginning of this section. more details are provided by sýkora [5].
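pdf (4) is the density of an erlang (gamma) distribution with shape 2 and rate λ, so the first renewal can be simulated simply as the sum of two independent exponential variables. a minimal sketch comparing this sampling recipe against cdf (3):

```python
import numpy as np

lam, n = 1.5, 200_000
rng = np.random.default_rng(2)

# t0 has pdf lam^2 * t * exp(-lam*t), i.e. an erlang-2 distribution,
# so it can be sampled as the sum of two independent exponentials
t0 = rng.exponential(1/lam, n) + rng.exponential(1/lam, n)

for t in (0.5, 1.0, 2.0, 4.0):
    empirical = (t0 < t).mean()
    cdf = 1.0 - (1.0 + lam*t)*np.exp(-lam*t)   # cdf (3)
    print(f"t = {t}: empirical {empirical:.4f} vs cdf {cdf:.4f}")
```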
4 expected number of renewals

in the following, N denotes the random number of renewals of the process s(t). the expected number of renewals e[N(0, t)] is the essential process characteristic used to estimate the failure probability pf(0, t). unlike in section 3, consider that the process s(t) is defined so that the first renewal starts at the initial time t = 0, as indicated in fig. 3, so that “given” initial conditions are taken into account. this assumption can be used e.g. for an imposed load model, where the action starts to be “on” approximately at t = 0 when a new structure is put into operation. the expected number of renewals for such a process is obtained e.g. by weisstein [8]:

e[N(0, t)] = λt.  (5)

the first renewal of the process described above is a “standard” exponentially distributed duration Tren with cdf

F_Tren(t) = p(Tren < t) = 1 − e^(−λt).  (6)

next, consider the random initial conditions again, i.e. the first renewal T0 of the process s(t) is selected purely on a random basis, and the initial time point t = 0 is randomly located in the first renewal. the process is again a sequence of exponentially distributed durations, except for the first renewal T0. it will be shown that the random effective duration T0eff of T0 corresponds to the first renewal (6) of the process described above. the difference between the two processes is indicated in fig. 3. note that T0on denotes the duration of the “on”-state within the first renewal, and T0oneff the effective duration of T0on, i.e. the duration of the “on”-state involved in the observed period (0, t); the random properties of the durations T0on and T0oneff are discussed in the following sections.

fig. 3: investigated processes s(t): a) process with given initial conditions, b) process with random initial conditions

the effective duration T0eff indicated in fig. 3 is randomly selected from the first renewal T0 to fulfil the random initial conditions. given T0 = t0, the effective duration T0eff ∈ (0, t0) has the rectangular distribution r(0, t0) with pdf f_T0eff(t | t0) = 1/t0 and cdf F_T0eff(t | t0) = t/t0. the cdf F_T0eff(t) of the effective duration T0eff for an arbitrary T0 can be derived from the sum of the probabilities of two disjoint events (t > 0):

– T0 is less than t, implying T0eff < t,
– T0 is greater than or equal to t, while T0eff < t.

this can be rewritten as follows:

F_T0eff(t) = p(T0eff < t) = p(T0 < t) + p(T0eff < t ∩ T0 ≥ t).  (7)

the first part of the right-hand side of (7) is already obtained in (3). to evaluate the second part, consideration is initially taken that t < τ < T0 < τ+dτ. under this assumption, p(T0eff < t ∩ T0 ∈ (τ, τ+dτ)) is equal to p(T0eff < t | T0 ≈ τ) p[T0 ∈ (τ, τ+dτ)], and, using the cdf of T0eff for a given T0, we arrive at

p(T0eff < t | T0 ≈ τ) = t/τ.  (8)

the condition t < τ implies τ ∈ (t, ∞). using (4) and (8), p(T0eff < t ∩ T0 ≥ t) can be obtained by the expectation

p(T0eff < t ∩ T0 ≥ t) = ∫_t^∞ p(T0eff < t | T0 ≈ τ) f_T0(τ) dτ = ∫_t^∞ (t/τ) λ² τ e^(−λτ) dτ,  (9)

and evaluation of the integral yields

p(T0eff < t ∩ T0 ≥ t) = λ t e^(−λt).  (10)

substitution of (3) and (10) into (7) leads to the cdf F_T0eff(t):

F_T0eff(t) = 1 − (1 + λt) e^(−λt) + λ t e^(−λt) = 1 − e^(−λt).  (11)

a comparison of cdf (6) with cdf (11) indicates that the first renewal Tren of the process with the given initial conditions and the effective duration of the first renewal T0eff of the process with random initial conditions are random variables with the same cumulative distribution function. since the following durations Tren between renewals are the same random variables, the investigated processes inevitably have the same statistical properties. the number of renewals therefore remains the same for both processes, and formula (5) is thus valid for the process s(t) with random initial conditions.

the expected number of renewals e[N(0, t)] (5) of the process s(t), as a function of t and λ, is numerically compared with the number of renewals predicted by the crude monte carlo simulation method (mc) in fig. 4; solid lines denote the results of (5) and circles (‘o’) denote the results of the simulations. 1000 trials of the whole process s(t) are simulated for each considered combination of t and λ, using the following procedure (see also the sketch below):

– a trial t0 is randomly selected from a “sufficiently large” population of realizations tren,i (1 ≤ i ≤ n̄, n̄ → ∞), in accordance with “the first approach” described in section 3,
– the effective duration t0eff is obtained as a pseudorandom number in the range from 0 to t0 (see fig. 2),
– the subsequent durations between renewals tren,j (2 ≤ j ≤ n̄) are simulated as exponential variables with the rate λ,
– a realization of the number of renewals n is determined using the condition

t0eff + Σ_{j=2}^{n} tren,j ≤ t < t0eff + Σ_{j=2}^{n+1} tren,j.  (12)

note that n = 1 if t0eff ≤ t < t0eff + tren,2, and n = 0 if t0eff > t. the number of renewals n is identified for each trial of the whole process s(t), and the expected value is then determined. more details on the mc verification are provided by sýkora [5].

fig. 4: expected number of renewals e[N(0, t)] for λ = 0.1, 1 and 10

fig. 4 indicates that formula (5) and the mc verification lead to the same results. it is therefore concluded that formula (5) is applicable also for the process s(t) with random initial conditions.
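the verification procedure described above translates directly into code. in the following minimal sketch, t0 is sampled via the erlang-2 representation of cdf (3) instead of being drawn from a stored population:

```python
import numpy as np

def count_renewals(lam, t, rng):
    """one trial of n(0, t) under random initial conditions."""
    t0 = rng.exponential(1/lam) + rng.exponential(1/lam)  # first renewal, cdf (3)
    t0eff = rng.uniform(0.0, t0)                          # effective duration
    time, n = t0eff, 0
    while time <= t:                                      # condition (12)
        n += 1                                            # renewal at `time`
        time += rng.exponential(1/lam)
    return n

rng = np.random.default_rng(3)
for lam in (0.1, 1.0, 10.0):
    est = np.mean([count_renewals(lam, 5.0, rng) for _ in range(20_000)])
    print(f"lam = {lam}: mc estimate {est:.3f} vs lam*t = {lam*5.0:.3f}")
```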
5 probability of the “on”-state – stationary case

in addition to the expected number of renewals e[N(0, t)], the probability of the “on”-state pon(t) of the process s(t) needs to be known for applications of renewal processes. the time variability of s(t) is completely described by the durations Tren and Ton, and therefore pon(t) can be derived from the statistical properties of Tren and Ton. note that the initial conditions at t = 0 may have an influence on the probability of the “on”-state pon(t): given that the process s(t) is “on” at t = 0, pon(t) apparently attains 1 for t → 0. however, the probability of the “on”-state approaches, by intuition, a “stationary” value for large t, becoming independent of the initial conditions.

consider that the observed period is “sufficiently long” (conservatively t > 1/λ for processes with “short” load pulses when λ/μ < 0.05, and approximately t > 3/λ for other processes; note that these conditions are usually satisfied in practical applications). the probability of the “on”-state then approaches the stationary value obtained by shinozuka [6]:

pon = e[Ton] / e[Tren],  (13)

where e[Ton] is the expected duration of the “on”-state and e[Tren] = 1/λ denotes the expected duration between renewals. the duration Ton is a truncated exponential variable, as defined in (1): if a realization of the exponential duration with rate μ is less than a realization tren, i.e. τon < tren, the duration is ton = τon; otherwise it is truncated to ton = tren. this truncation apparently influences the expected value e[Ton]. the cdf F_Ton(t) is initially derived. two possible exclusive events can yield Ton < t:

– the duration Tren is less than t; this implies that the duration Ton is less than t,
– the duration Tren is greater than or equal to t; in this case the cdf of Ton in the interval (0, t) remains the “standard” cumulative distribution function of the exponential distribution, the truncation has no effect here, and the probability p[Ton < t] can be determined from the cdf of the exponential distribution with rate μ.

using the cdfs of the exponential distributions with rates λ and μ, F_Ton(t) is given for the mutually independent durations τon and Tren as follows:

F_Ton(t) = p(Tren < t) + p(Tren ≥ t) p(τon < t) = (1 − e^(−λt)) + e^(−λt)(1 − e^(−μt)) = 1 − e^(−(λ+μ)t).  (14)

from (14), the expected value e[Ton] arrives at

e[Ton] = 1/(λ+μ).  (15)

substitution of (15) into (13) yields the probability of the “on”-state under stationary conditions,

pon = λ/(λ+μ),  (16)

which is in accordance with the well-known result obtained e.g. in rcp [9].
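the truncation rule (1) and the results (15) and (16) are easy to check by simulation (a minimal sketch):

```python
import numpy as np

lam, mu, n = 1.0, 3.0, 400_000
rng = np.random.default_rng(4)

t_ren = rng.exponential(1/lam, n)
tau_on = rng.exponential(1/mu, n)
t_on = np.minimum(tau_on, t_ren)                    # truncation rule (1)

print(t_on.mean(), 1/(lam + mu))                    # (15): e[ton] = 1/(lam+mu)
print(t_on.mean() / t_ren.mean(), lam/(lam + mu))   # (13) and (16)
```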
6 initial probability of the “on”-state

considering the random initial conditions, the probability of the “on”-state p0on = pon(0) at the initial time t = 0 can be defined as the probability that the sum of the effective duration of the first renewal T0eff and the initial duration of the “on”-state T0on exceeds the duration of the first renewal T0 (see the right-hand side of fig. 3):

p0on = p[T0on(T0) + T0eff(T0) ≥ T0].  (17)

note that both durations T0eff and T0on are dependent on T0. if the sum is less than T0, the “on”-state is finished before the observed period starts. the initial probability of the “off”-state p0off = poff(0), the complementary probability to p0on, then follows from (17):

p0off = p[T0on(T0) + T0eff(T0) < T0].  (18)

initially consider that T0 ≈ τ. application of the convolution integral to (18), see e.g. weisstein [8], yields

p0off(T0 ≈ τ) = ∫_0^τ F_T0on(τ − t) f_T0eff(t) dt,  (19)

where F_T0on(·) is the cdf of T0on. the upper bound for the integration in (19) is τ, since T0eff is always less than or equal to T0 (see fig. 3). the duration T0on is defined in accordance with (1): T0on has an exponential distribution with rate μ if T0on < T0 ≈ τ, otherwise it is truncated to T0on = τ. since the argument τ − t ∈ (0, τ) in (19) is always positive, F_T0on is evaluated only for values less than τ; within this interval, T0on is a “standard” exponential variable with cdf F_T0on(t) = 1 − e^(−μt). to satisfy the random initial conditions, the effective duration T0eff has the rectangular distribution r(0, τ) with pdf f_T0eff(t) = 1/τ. substitution of the aforementioned functions into (19), followed by integration, leads to the probability of the “off”-state conditional on T0 ≈ τ:

p0off(T0 ≈ τ) = (1/τ) ∫_0^τ (1 − e^(−μ(τ−t))) dt = 1 − (1 − e^(−μτ))/(μτ).  (20)

the probability density function of T0 is provided in (4). the probability of the “off”-state p0off for an arbitrary T0 can be obtained by the expectation of (20), using (4):

p0off = e[p0off(T0)] = ∫_0^∞ [1 − (1 − e^(−μτ))/(μτ)] λ² τ e^(−λτ) dτ = μ/(λ+μ).  (21)

since the probabilities p0on and p0off are mutually complementary, the initial probability of the “on”-state becomes

p0on = p[T0on(T0) + T0eff(T0) ≥ T0] = 1 − p0off = λ/(λ+μ).  (22)

it appears that under random initial conditions, the initial probability of the “on”-state p0on (22) is equal to the “stationary” probability (16). this is an expected conclusion, as, e.g., the probability of the “on”-state of a wind action nearly always has a constant value, barely dependent on the origin t = 0 of the observed period (0, t). the numerical study published by sýkora [5] indicates that the result obtained in (22) is in accordance with the monte carlo simulations.
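definition (17) can likewise be checked directly, sampling T0 from pdf (4), T0eff from the rectangular distribution and T0on as a truncated exponential (a minimal sketch; our reading of (17) is that the process is “on” at t = 0 if and only if T0eff + T0on ≥ T0):

```python
import numpy as np

lam, mu, n = 1.0, 3.0, 400_000
rng = np.random.default_rng(5)

t0 = rng.exponential(1/lam, n) + rng.exponential(1/lam, n)  # first renewal, pdf (4)
t0eff = rng.uniform(0.0, t0)                                # rectangular r(0, t0)
t0on = np.minimum(rng.exponential(1/mu, n), t0)             # truncated pulse

p0on = np.mean(t0eff + t0on >= t0)                          # definition (17)
print(p0on, lam/(lam + mu))                                 # compare with (22)
```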
7 probability of the “on”-state – non-stationary case

to derive the probability of the “on”-state pon(t) at an arbitrary point in time t, consider the process s(t) with “short” pulses, so that the probability of the duration of the “on”-state Ton exceeding the duration between renewals Tren is negligible, p(Ton > Tren) ≈ 0. this process can be suitably modelled as the product of a non-intermittent rectangular wave renewal process s̄(t) and a two-state markov process z(t), as already proposed by madsen and ditlevsen [7]:

s(t) = s̄(t) z(t).  (23)

the states of the markov process are characterized as follows:

z(t) = 0 … the “off”-state, z(t) = 1 … the “on”-state.  (24)

the considered processes s̄(t) and z(t) are indicated in fig. 5.

fig. 5: non-intermittent rectangular wave renewal process s̄(t) and two-state markov process z(t)

the renewal process s(t) may be “on” [z(t) = 1] or “off” [z(t) = 0] at time t. within an infinitely short time interval Δt → 0, the process may jump between the “on”- and “off”-states or may remain in the same state; therefore, the process s(t) may again be “on” [z(t+Δt) = 1] or “off” [z(t+Δt) = 0] at t+Δt. it is further assumed that the “on”- and “off”-states form a markov process, and the states at t+Δt merely depend on the states at t.

consider that the process s(t) is “off” at the time point t, i.e. z(t) = 0. within Δt, the process may remain in the “off”-state [z(t+Δt) = 0], or a renewal may occur and an “on”-state may be initiated [z(t+Δt) = 1]. the probability pΔt(n > 0) of the occurrence of at least one renewal within Δt is obtained e.g. by wen [2]:

pΔt(n > 0) = 1 − pΔt(n = 0) = 1 − e^(−λΔt) = λΔt + o(Δt²),  (25)

where o(Δt²) denotes terms of higher order in Δt, for which lim_{Δt→0} o(Δt²)/Δt = 0. this implies that the transition (transfer) probability p[z(t+Δt) = 1 | z(t) = 0] of a jump from the “off”-state at t into the “on”-state at t+Δt, given the “off”-state at t, is

p[z(t+Δt) = 1 | z(t) = 0] = λΔt.  (26)

the complementary transition probability p[z(t+Δt) = 0 | z(t) = 0] that the process s(t) remains “off” becomes

p[z(t+Δt) = 0 | z(t) = 0] = 1 − p[z(t+Δt) = 1 | z(t) = 0] = 1 − λΔt.  (27)

assuming an “on”-state at a time point t [z(t) = 1], the process may remain in the “on”-state [z(t+Δt) = 1], or the load pulse may be finished and an “off”-state may be initiated [z(t+Δt) = 0]. note that the probability of the event of an “on”-state being finished by a renewal occurrence is neglected here; this event is conditioned by Ton > Tren, but only processes with “short” load pulses are investigated in this section, and thus p[Ton > Tren] ≈ 0. by analogy with (25), (26) and (27), the transition probability p[z(t+Δt) = 0 | z(t) = 1] of a jump from the “on”-state at t into the “off”-state at t+Δt, given the “on”-state at t, is

p[z(t+Δt) = 0 | z(t) = 1] = μΔt,  (28)

and the complementary transition probability p[z(t+Δt) = 1 | z(t) = 1] that the process s(t) remains “on” is

p[z(t+Δt) = 1 | z(t) = 1] = 1 − μΔt.  (29)

fig. 6: transition probabilities and markov states

fig. 6 indicates all the transition probabilities (26) to (29) and the markov states. the probability of the “on”-state pon(t+Δt) is obtained as

pon(t+Δt) = p[z(t+Δt) = 1] = p[z(t+Δt) = 1 | z(t) = 1] p[z(t) = 1] + p[z(t+Δt) = 1 | z(t) = 0] p[z(t) = 0] = (1 − μΔt) pon(t) + λΔt poff(t).  (30)

as the probabilities of the “on”- and “off”-states are complementary, pon(t) + poff(t) = 1, (30) can be rewritten as

p′on(t) = λ − (λ+μ) pon(t),  (31)

where p′on(t) is the derivative of the probability of the “on”-state at time t. equation (31) is in agreement with the formulas obtained by iwankiewicz and rackwitz [4]. it follows from (31) that the probability of the “on”-state pon(t) at an arbitrary point in time t, the so-called non-stationary probability of the “on”-state, is

pon(t) = c e^(−(λ+μ)t) + λ/(λ+μ),  (32)

where c is a constant of integration obtained from the initial conditions.
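a quick numerical cross-check of (31) and (32): forward integration of the differential equation is compared with the closed-form solution (a minimal sketch; the initial condition pon(0) = 1 corresponds to the “on”-state at t = 0 treated below):

```python
import numpy as np

lam, mu, p0 = 1.0, 10.0, 1.0        # pon(0) = 1, i.e. "on" at t = 0
dt, t_end = 1e-4, 2.0

# forward euler integration of ode (31)
p, t = p0, 0.0
while t < t_end:
    p += dt * (lam - (lam + mu) * p)
    t += dt

# closed form (32) with c fixed by pon(0) = p0
c = p0 - lam / (lam + mu)
p_exact = c * np.exp(-(lam + mu) * t_end) + lam / (lam + mu)
print(p, p_exact)                   # both tend to lam/(lam+mu) for large t
```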
Under stationary conditions, $t \to \infty$, the first term on the right-hand side of (32) vanishes and $p_{on}(t)$ equals $\lambda/(\lambda + \mu)$, as already given in (16). Considering the random initial conditions and an arbitrarily long period $(0, t)$, the initial probability of the "on"-state is, in accordance with (22), $p_{on}(0) = \lambda/(\lambda + \mu)$. For $t = 0$, substitution of (22) into (32) leads to $c = 0$, and the non-stationary probability of the "on"-state under random initial conditions becomes

$p_{on}(t) = \frac{\lambda}{\lambda + \mu}$. (33)

Fig. 5: Non-intermittent rectangular wave renewal process s(t) and two-state Markov process z(t)
Fig. 6: Transition probabilities and Markov states

It appears that under random initial conditions, the probability of the "on"-state is time-invariant for an arbitrarily long observed period $t$, and the process $s(t)$ is stationary and ergodic. This is a very important conclusion, probably published here for the first time. However, the initial conditions can be given explicitly in some cases: a load process $s(t)$ can then be "on" or "off" at $t = 0$. If the process $s(t)$ is assumed to be "on" at $t = 0$, $p_{on}(0) = 1$, the non-stationary probability of the "on"-state (32) reads

$p_{on}(t) = \frac{\lambda}{\lambda + \mu} + \frac{\mu}{\lambda + \mu}\, e^{-(\lambda + \mu)t}$. (34)

If the load process $s(t)$ is "off" at $t = 0$, $p_{on}(0) = 0$, the non-stationary probability of the "on"-state (32) becomes

$p_{on}(t) = \frac{\lambda}{\lambda + \mu} - \frac{\lambda}{\lambda + \mu}\, e^{-(\lambda + \mu)t}$. (35)

Probabilities (34) and (35) are identical with those already obtained by Iwankiewicz and Rackwitz [4]. Note that in the referenced paper the random initial conditions are approximated as $p_{on}(0) = p_{off}(0) = 0.5$. Formula (32) then yields

$p_{on}(t) = \frac{\lambda}{\lambda + \mu} + \frac{\mu - \lambda}{2(\lambda + \mu)}\, e^{-(\lambda + \mu)t}$. (36)

The difference between the initial conditions applied to derive formula (33) in the present paper and (36) published by Iwankiewicz and Rackwitz [4] is obvious, and needs no further comment.

The probability of the "on"-state $p_{on}(t)$ (33) is further numerically evaluated using MC simulations. For a number of trials $n = 10^5$, "on"- and "off"-states are identified at selected times $t$, and the probability of the "on"-state $p_{on}(t)$ is then determined. Fig. 7 indicates the probabilities $p_{on}(t)$ for three alternatives:
• Alt. A: the time variability of the considered process is defined by $\lambda_A = 1$ and $\mu_A = 0.1$. Formula (33) predicts $p_{on,A} = 0.91$; it is thus foreseen that process A is nearly always active.
• Alt. B: rates $\lambda_B = 1$ and $\mu_B = 1$, and $p_{on,B} = 0.5$. Process B is sometimes "on" and sometimes "off".
• Alt. C: rates $\lambda_C = 1$ and $\mu_C = 10$, and $p_{on,C} = 0.09$. Process C has large intermittencies and short load pulses.

Note that processes A and B hardly fulfil the condition for a process with short pulses. It can easily be shown that $\mathrm{P}_A(t_{ren} < t_{on}) = \lambda_A/(\lambda_A + \mu_A) = 0.91$ and $\mathrm{P}_B(t_{ren} < t_{on}) = 0.5$; the condition is perhaps satisfied for process C only. The dashed lines in Fig. 7 indicate the results obtained by (36), the solid lines those of (33), and the circles 'o' the simulations. The MC simulations are in agreement with (33): the probability of the "on"-state $p_{on}(t)$ proves to be time-invariant when random initial conditions are satisfied. The non-stationary probability of the "on"-state (36) provided in the literature fails to describe alternatives A and C for lower $t$. It appears that for larger times (in this case approximately $t > 5$) the "non-stationary" effects involved in (36) vanish, and formula (36) corresponds well to the MC simulations.

Fig. 7: Probability of the "on"-state pon(t)
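The MC experiment of Fig. 7 can be reproduced in a few lines for the exponential on/off idealisation used above. This is a sketch under our reconstruction of the model, not the author's original simulation code; drawing a fresh exponential phase length for the initial state is legitimate here because exponential durations are memoryless:

```python
import numpy as np

rng = np.random.default_rng(0)

def p_on_mc(lam, mu, t_obs, trials=20000):
    """MC estimate of p_on(t_obs) for the exponential on/off process
    started under random initial conditions, cf. (22) and (33)."""
    hits = 0
    p0 = lam / (lam + mu)
    for _ in range(trials):
        on = rng.random() < p0                       # initial state per (22)
        t = rng.exponential(1.0 / (mu if on else lam))
        while t < t_obs:                             # alternate phases up to t_obs
            on = not on
            t += rng.exponential(1.0 / (mu if on else lam))
        hits += on
    return hits / trials

for lam, mu in [(1.0, 0.1), (1.0, 1.0), (1.0, 10.0)]:   # alternatives A, B, C
    print(lam, mu, [p_on_mc(lam, mu, t) for t in (0.5, 2.0, 5.0)],
          "theory:", lam / (lam + mu))
```

For each alternative the estimates stay at the time-invariant value $\lambda/(\lambda+\mu)$ within sampling error, which is the behaviour of the circles in Fig. 7.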
As already mentioned, formulas (33), (34) and (35) for $p_{on}(t)$ are derived assuming "short" load pulses, $\mathrm{P}(t_{on} > t_{ren}) \approx 0$. It is foreseen that deriving the probability of the "on"-state for a process with general load pulses is a much more difficult task. However, the preliminary numerical results partly presented in Fig. 7 for alternatives A and B indicate that the derived formulas remain valid for processes with arbitrarily long pulses.

8 Concluding remarks

It is indicated that the cumulative distribution function $F_{t_0}(t)$ of the first renewal $t_0$ obtained in the paper can suitably be used to simulate random initial conditions of the renewal process. The expected number of renewals $\mathrm{E}[n(0, t)]$ of the renewal process considered here is shown to be independent of the initial conditions. Using the newly derived initial probability of the "on"-state $p_{on}(0)$ and a two-state Markov process, the probability of the "on"-state $p_{on}(t)$ at an arbitrary point in time then proves to be time-invariant under correctly defined random initial conditions. It is foreseen that the investigated model based on renewal processes will become a useful tool in probabilistic structural design, particularly in applications of time-variant reliability analysis. As the formula obtained for $p_{on}(t)$ is derived for a process with "short" load pulses, future research should focus on deriving $p_{on}(t)$ for a general process with arbitrarily long pulses.

Acknowledgments

This research has been conducted at the Czech Technical University in Prague, Klokner Institute. Financial support has been provided by the Czech Science Foundation in the framework of GAČR research project 103/06/P237 "Probabilistic analysis of time-variant structural reliability". The help given by the author's colleague, Prof. M. Holický, is highly appreciated.

References

[1] JCSS (Joint Committee for Structural Safety): JCSS Probabilistic Model Code. http://www.jcss.ethz.ch, 2001.
[2] Wen, Y. K.: Structural Load Modeling and Combination for Performance and Safety Evaluation. Amsterdam: Elsevier, 1990.
[3] Iwankiewicz, R., Rackwitz, R.: Coincidence probabilities for intermittent pulse load processes with Erlang arrivals and durations. In: Proceedings of ICOSSAR'97 – Structural Safety and Reliability (Vol. 2), (editors: Shirashi, Shinozuka, Wen). Rotterdam: Balkema, 1998, p. 1105–1112.
[4] Iwankiewicz, R., Rackwitz, R.: Non-stationary and stationary coincidence probabilities for intermittent pulse load processes. Probabilistic Engineering Mechanics, Vol. 15 (2000), No. 2, p. 155–167.
[5] Sýkora, M.: Probabilistic Analysis of Time-Variant Structural Reliability. Prague: CTU Publishing House, 2005.
[6] Shinozuka, M.: Notes on the combinations of random loads (I). In: Probability Based Load Criteria for the Design of Nuclear Structures: A Critical Review of the State-of-the-Art. BNL Report No. NUREG/CR 1979, 1981.
[7] Madsen, H. O., Ditlevsen, O.: Transient load modelling: Markov on-off rectangular pulse process. Structural Safety, Vol. 2 (1985), p. 253–271.
[8] Weisstein, E. W.: MathWorld. A Wolfram web resource: http://mathworld.wolfram.com/, 2006.
[9] RCP Reliability Consulting Programs: STRUREL: A Structural Reliability Analysis Program System, COMREL & SYSREL User's Manual. Munich: RCP Consult, 2003.

Ing. Miroslav Sýkora, Ph.D.
E-mail: sykora@klok.cvut.cz
Department of Structural Reliability
Czech Technical University in Prague, Klokner Institute
Šolínova 7, 166 08 Prague 6, Czech Republic

CAC in ATM – the diffuse method

I. Baroňák, M. Vozňák

Connection admission control is an element of the preventive mechanisms of ATM traffic management. Its main task is to prevent overloading of the network and to ensure the required quality of service. This means that it has to predict the state of the network and, according to that state, manage both existing and new connections. This paper deals with the diffuse method, a CAC method that enables us to obtain the required results.

Keywords: ATM, QoS, CAC – the diffuse method.

1 Introduction

ATM technology arose as the basic communication protocol for the B-ISDN worldwide broadband communication network. The ATM philosophy is based both on fast switching of very small cells of fixed length and on effective usage of the transmission bandwidth. ATM enables us to realize transmissions with a guaranteed quality of service for all services provided in broadband networking. If a large set of services is to be supported effectively, we need an ATM control mechanism that accepts the different quality requirements of the particular services. This is called traffic management.

2 ATM traffic management

The main task of traffic management is to protect the network and the end systems from overloading in such a way that efficiency goals are achieved and the given quality of service is retained. If overloading of the network occurs, the next task of traffic management is to eliminate it. An additional function is to increase the effective usage of network resources. Reactive management responds to network overloading when it happens, i.e. it reduces the consequences of overloading to an acceptable level; it regulates the traffic flow at the entry points on the basis of the current level of traffic in the network, with the help of feedback control. Preventive management provides a fair allocation of the communication capacity in such a way that during high load it ensures that the traffic flow stays within the specified range acceptable for the particular service. The chief idea of preventive management is to prevent overloading of the multiplexer at the entry point to the network during the process of connection establishment.

3 Call admission control

Call admission control (CAC) is the basic function of preventive management, defined as the series of actions taken by the network during the phase of connection establishment in order to decide whether the connection will be accepted or refused. A traffic contract is made between the customer and the network at the time of connection establishment, which specifies the properties of the ATM connection on the UNI and NNI interfaces that it passes through. The network undertakes to support the traffic at a given level, and the customer agrees not to exceed the given efficiency parameters. Maintenance of the given QoS is important for ATM services with reference to fulfilling the traffic contract. The given QoS depends to a great extent on the rationing of network resources, and this is what CAC determines. For CBR (constant bit rate) services, for rt-VBR (real time – variable bit rate) and also for nrt-VBR (non real time – variable bit rate), CAC is compulsorily applied as the preventive function of traffic management. The connection is judged on the basis of operative parameters: PCR (peak cell rate), defined as the maximum rate of cell transmission for an individual ATM connection; SCR (sustainable cell rate), measured over a long time period with respect to the value T, where PCR = 1/T; and others. ABR (available bit rate) and UBR (unspecified bit rate) services characteristically use the spare network resources. To secure the QoS mechanism, ABR uses feedback control as a reactive function of traffic management.
UBR does not have QoS or assigned network resources secured by the network.

4 The diffuse method

The diffuse method [2] is based on two statistical formulations of the required bandwidth. In the first case, it is based on the cell loss ratio for a finite buffer,

$p_{fb} = \frac{1}{2}\, e^{-\frac{(c - \lambda)^2}{2\sigma^2}}\, e^{-\frac{2b(c - \lambda)}{\alpha}}$, (1)

derived from the model of an ATM multiplexor with a finite capacity. Relation (1) is based on the probability of line capacity overflow given by the Gaussian method,

$p_{overflow} = \mathrm{P}\left(\sum_{i=1}^{n} r_i(t) > c\right) \approx \frac{1}{2}\, e^{-\frac{(c - \lambda)^2}{2\sigma^2}}$, (2)

and on the exponential function

$e^{-\frac{2b(c - \lambda)}{\alpha}}$, (3)

which represents the usage of the capacity of the buffer store $b$ according to the diffuse method. The service time of the fixed-size cells in the buffer store has a specified constant value; depending on the load of the line, the occupancy of the buffer store can vary. The difference $\lambda - c$ represents the immediate average rate of cell accumulation in the buffer store (the drift), where

$\lambda = \sum_{i=1}^{n} \lambda_i$ (4)

is the central bit rate of cell arrivals into the buffer store and $c$ is the overall line capacity (multiplexor). The parameter $\alpha$ determines the immediate variance of cell arrivals into the buffer store, and is defined as

$\alpha = \sum_{i=1}^{n} \lambda_i c_i^2$, (5)

where the variance coefficient of cell arrivals $c_i^2$ (6) of connection $i$ is determined from its peak rate $r$, the average duration of the active state $b$ and the average duration of the inactive state $p$.

In the second case, we deal with the application of the cell loss ratio for an infinite buffer,

$p_{ib} = \frac{\sigma}{\sqrt{2\pi}}\, e^{-\frac{(c - \lambda)^2}{2\sigma^2}}\, e^{-\frac{2b(c - \lambda)}{\alpha}}$, (7)

derived from the ATM multiplexor model when an infinite capacity is considered. Relation (7) is determined from the estimate of the top cell loss ratio,

$p_{loss} = \mathrm{P}\left(\sum_{i=1}^{n} r_i(t) > c\right) \approx \frac{\sigma}{\sqrt{2\pi}}\, e^{-\frac{(c - \lambda)^2}{2\sigma^2}}$, (8)

where $r_i(t)$ is the immediate bit rate of connection $i$ and

$\sigma^2 = \sum_{i=1}^{n} \sigma_i^2$ (9)

is the central quadratic deviation (variance) of the bit rate. Relation (7) likewise contains the exponential function (3).

Let $\varepsilon$ represent the maximum accepted cell loss ratio. Consequently, two statistical formulations of the required bandwidth, marked $c_{fb}$ and $c_{ib}$, can be derived. $c_{fb}$ denotes the statistical bandwidth derived from the diffuse model of a system with a finite buffer for an ATM multiplexor. From relation (1), setting $p_{fb} = \varepsilon$, we obtain the quadratic equation

$(c - \lambda)^2 + 2\beta\sigma^2 (c - \lambda) - 2\gamma_1 \sigma^2 = 0$, (10)

where for the computation we use the artificial variables

$\beta = \frac{2b}{\alpha}$ (11)

and

$\gamma_1 = -\ln(2\varepsilon)$. (12)

If $c = c_{fb}$ holds, then

$c_{fb} = \lambda - \beta\sigma^2 + \sqrt{\beta^2\sigma^4 + 2\gamma_1\sigma^2}$. (13)

$c_{ib}$ marks the statistical bandwidth obtained from the diffuse model of a system with an infinite buffer for an ATM multiplexor.
In the same manner as for $c_{fb}$ we can derive $c_{ib}$, this time starting from relation (8), and we get

$c_{ib} = \lambda - \beta\sigma^2 + \sqrt{\beta^2\sigma^4 + 2\gamma_2\sigma^2}$, (14)

where the artificial variable is determined as

$\gamma_2 = \ln\left(\frac{\sigma}{\sqrt{2\pi}}\right) - \ln(\varepsilon)$. (15)

This formulation of the two statistical bandwidths incorporates the cell loss ratio selected by the user, the general characteristics of the traffic, and the available size of the buffer store. It also defines the acceptable range for different types of connections on the basis of their traffic descriptors, for different types of buffer stores. The statistical bandwidth of the diffuse method, $c_{df}$, which is needed for the particular connection, can be determined from the relation

$c_{df} = \max(c_{fb}, c_{ib})$. (16)

5 Simulation of the diffuse method

By simulating the diffuse method we tried to ensure the required bandwidth for the connections. When simulating the CAC method we had to define the model of the ATM service for the CBR, VBR and on/off services. We dealt with the case in which there exist n = 100 independent connections in the network, and for each connection n = 100 values of the service in time are generated, depending on the PCR of the individual connections. The chosen size of the line was c = 155 Mbit/s. We chose a constant size of the PCR parameter for all the service models (CBR, VBR and on/off) according to the formula

$PCR = \frac{k \cdot c}{n}$, (17)

where k is a supply constant. When simulating the CAC method it is appropriate to generate the traffic in the network in such a way that the link capacity is occasionally exceeded and the QoS of a few connections is corrupted; the constant k = 1.95 is defined for this purpose. The cell loss ratio on the whole line is represented by the parameter CLR, and we chose its value as CLR = 1·10⁻⁶. The buffer store is b = 1 Mbit/s.

5.1 CBR service

Fig. 1 shows the dependence of the cell loss ratio for the finite buffer, pfb, and for the infinite buffer, pib, of the diffuse method in CBR service. Both probabilities show zero values. Because the diffuse method is derived from the Gaussian method, this is caused by the central quadratic deviation of the bit rate, which for a constant immediate bit rate approaches zero; thereby, division by zero occurs in formulae (1) and (7). The statistical bandwidth obtained from the diffuse model with the finite buffer, cfb, and with the infinite buffer, cib, for an ATM multiplexor in CBR service, as well as the resulting bandwidth of the diffuse method, cdf, is shown in Fig. 2. The figure shows that both statistical bandwidths grow with an increasing number of connections n, with equal estimated values, and therefore they equal the resulting bandwidth of the diffuse method cdf. The 97th connection exceeds the line capacity c. In comparison, if we reserved for every new connection its maximum value PCR of the required bandwidth, cpcr, the line capacity would already be exceeded at the admission of the 52nd connection. Here we have to take into account that the standard deviation σ is zero in CBR service. It follows that, after substitution into formulae (13) and (14), the bandwidths cfb and cib are determined only by the central bit rate, and therefore they are equal. In a real ATM network this would cause a large deviation of the delay, which cannot be accepted. For this reason, the diffuse method is not appropriate for CBR service.

Fig. 1: The cell loss ratio for the finite and infinite buffer of the diffuse method in CBR service for constant PCR
Fig. 2: The bandwidth estimated by the diffuse method in CBR service for constant PCR
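The bandwidth formulas can be exercised numerically. The sketch below implements (11)–(16) as reconstructed above and runs a toy admission loop against the line capacity of (17); the per-connection values λi, σi² and αi are invented for illustration and do not come from the paper's traffic models:

```python
import numpy as np

def diffuse_bandwidth(lam, sigma2, alpha, b, eps):
    """Statistical bandwidth c_df = max(c_fb, c_ib) of eqs. (11)-(16)."""
    beta = 2.0 * b / alpha                                          # eq. (11)
    g1 = -np.log(2.0 * eps)                                         # eq. (12)
    g2 = np.log(np.sqrt(sigma2) / np.sqrt(2.0 * np.pi)) - np.log(eps)  # eq. (15)
    c_fb = lam - beta * sigma2 + np.sqrt(beta**2 * sigma2**2 + 2*g1*sigma2)  # (13)
    c_ib = lam - beta * sigma2 + np.sqrt(beta**2 * sigma2**2 + 2*g2*sigma2)  # (14)
    return max(c_fb, c_ib)                                          # eq. (16)

# Toy admission loop: accept connections while the aggregate c_df fits the line.
C, CLR, B = 155.0, 1e-6, 1.0                # line [Mbit/s], loss ratio, buffer
lam_i, var_i, alpha_i = 1.5, 0.9, 0.8       # illustrative per-connection values
n = 0
while diffuse_bandwidth((n+1)*lam_i, (n+1)*var_i, (n+1)*alpha_i, B, CLR) <= C:
    n += 1
print("connections admitted:", n)
```

With these made-up parameters the loop admits noticeably more connections than a peak-rate reservation C/(k·PCR) would, which is the qualitative effect reported for the VBR and on/off services below.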
5.2 VBR service

Fig. 3 shows the dependence of the cell loss ratio for the finite buffer, pfb, and for the infinite buffer, pib, of the diffuse method in VBR service. In comparison with CBR service, the values of both probabilities change. Up to the 72nd connection, their values are about zero. After the 72nd connection both probabilities increase, which reflects the admission of several further connections into the ATM network: an overflow of the line capacity occurs. The line capacity overflows after the admission of the 78th connection for the finite buffer pfb, and after the admission of the 80th connection for the infinite buffer pib (Fig. 4). Fig. 3 shows that the line capacity overflow occurred at the set value of the cell loss ratio parameter on the line, CLR = 1·10⁻⁶.

The statistical bandwidth obtained from the diffuse model of the system with the finite buffer, cfb, and with the infinite buffer, cib, for the ATM multiplexor in VBR service, as well as the resulting bandwidth of the diffuse method, cdf, are shown in Fig. 4. The figure shows that both statistical bandwidths expand with an increasing number of connections n, the estimated values for the finite buffer cfb being greater than those for the infinite buffer cib. For this reason the cfb values are taken as the resulting bandwidth of the diffuse method cdf. At the establishment of the 78th connection for cfb, or of the 80th connection for cib, the line capacity c is exceeded. In comparison, if we reserved for each new connection its maximum value PCR of the required bandwidth, cpcr, the line capacity would already be exceeded at the admission of the 52nd connection. In this way, more connections can be admitted by using the diffuse method; for this reason, the diffuse method is appropriate for VBR service.

Fig. 3: The cell loss ratio for the finite and infinite buffer of the diffuse method in VBR service for constant PCR
Fig. 4: The bandwidth estimated using the diffuse method in VBR service for constant PCR

We can admit even more connections than shown in Fig. 4 by using greater buffer stores b in the network, or by increasing the cell loss ratio parameter CLR on the line. Figs. 5 and 6 show the influence of changes of the buffer store capacity b or of the cell loss ratio parameter CLR on the number of connections that can be accepted. When the CLR parameter changes, Fig. 5 shows that at low values of the buffer store capacity b, 0.1 to 1 Mbit/s, the number of acceptable connections becomes greater and greater with increasing CLR. With a buffer store size of 10 Mbit/s we can see a very similar growth, and as many as 89 connections can be admitted. Fig. 6 shows the changes, this time in dependence on the buffer store capacity b.
In the initial region it does not matter whether we use a buffer store of 0.01 or 0.6 Mbit/s in size, as the number of acceptable connections is the same. A small improvement occurs at 1 Mbit/s, and a sharp improvement occurs when the buffer store is over 10 Mbit/s in size. This means that when the buffer store size is increased above b = 10 Mbit/s, the number of acceptable connections increases linearly. For this reason we have to choose such values when designing the layout of the ATM service. A change in the CLR parameter, as well as a change in b, causes sharp changes in the number of acceptable connections. The best combination is to use the greatest acceptable cell loss ratio CLR and as great a buffer store size b as possible. In reality, however, these parameters are influenced by many other factors.

Fig. 5: The number of acceptable connections of the diffuse method with the change of CLR in VBR service for constant PCR
Fig. 6: The number of acceptable connections of the diffuse method with the change of b in VBR service for constant PCR

5.3 On/off service

Fig. 7 shows the dependence of the cell loss ratio for the finite buffer, pfb, and for the infinite buffer, pib, of the diffuse method in on/off service. Up to the 70th connection their values are about zero. After the 70th connection both probabilities increase, which reflects the admission of several further connections into the ATM network: a line capacity overflow occurs. The line capacity overflows after the admission of the 74th connection for the finite buffer pfb, and after the admission of the 76th connection for the infinite buffer pib (Fig. 8). Fig. 7 shows that the line capacity overflow again occurred at the set value of the cell loss ratio parameter on the whole line, CLR = 1·10⁻⁶.

The statistical bandwidth obtained from the diffuse model of the system with the finite buffer, cfb, and with the infinite buffer, cib, for the ATM multiplexor in on/off service, as well as the resulting bandwidth of the diffuse method, cdf, is shown in Fig. 8. The figure shows that both statistical bandwidths expand with an increasing number of connections n, the estimated values for the finite buffer cfb being greater than those for the infinite buffer cib. For this reason the cfb values are taken as the resulting bandwidth of the diffuse method cdf. At the establishment of the 74th connection for cfb, or of the 76th connection for cib, the line capacity c is exceeded. If we reserved for each new connection its maximum value PCR of the required bandwidth, cpcr, the line capacity would already be exceeded at the admission of the 52nd connection. In this way, more connections can be admitted with the use of the diffuse method.

Fig. 7: The ratio of line capacity overrun for the finite and infinite buffer of the diffuse method in on/off service for constant PCR
Fig. 8: The statistical bandwidth estimated using the diffuse method in on/off service for constant PCR
However, we have to take into account the fact that with only a few admitted connections the bandwidths cfb and cib are slightly greater than if we had reserved the maximum capacity cpcr for the connection. This factor can be eliminated by using greater buffer stores b or by allowing a greater cell loss ratio CLR on the line. The diffuse method is thus also appropriate for on/off service, although with few connections it can be ineffective; for this reason we have to focus on the proper arrangement of the service. Also in this case we can admit more connections than shown in Fig. 8 when we use greater buffer stores b, or when we increase the cell loss ratio parameter CLR on the line.

Figs. 9 and 10 show the influence of changes in the buffer store capacity b or in the cell loss ratio parameter CLR on the number of acceptable connections. When the CLR parameter changes, Fig. 9 shows that at low values of the buffer store capacity b, 0.1 to 1 Mbit/s, the number of acceptable connections grows with increasing CLR. With a buffer store size of 10 Mbit/s we can see a very similar growth, and as many as 95 connections can be admitted. Fig. 10 shows the changes, this time in dependence on the buffer store capacity b. In the initial region it does not matter whether we use a buffer store of 0.01 or 0.1 Mbit/s in size, as the number of acceptable connections is the same. A sharp improvement occurs when the size of the buffer store rises above 10 Mbit/s; if we change b around 10 Mbit/s, the number of acceptable connections increases linearly. An absolute decay occurs at about 60 Mbit/s. This is why we have to choose buffer store values of about 10 Mbit/s when creating the layout of the ATM service. A change in the CLR parameter, as well as a change in b, causes sharp changes in the number of acceptable connections. The best combination is to use the greatest possible cell loss ratio CLR and a buffer store size b of about 10 Mbit/s.

Fig. 9: The number of acceptable connections of the diffuse method with the change of CLR in on/off service for constant PCR
Fig. 10: The number of acceptable connections of the diffuse method with the change of b in on/off service for constant PCR

6 Conclusion

The diffuse method cannot be used in CBR service, because the central quadratic deviation of the bit rate σ² approaches zero at a constant immediate bit rate, i.e. division by zero occurs in formulae (1) and (7); the bandwidths cfb and cib are then determined only by the central bit rate. In a real ATM service this could cause large delay deviations, which cannot be accepted. For this reason the diffuse method is not appropriate for CBR service. In VBR and bursty on/off service the method can be used effectively. In both service models the results were much better than if we were to reserve the maximum PCR value for each new connection. With the diffuse method an even greater number of connections can be admitted than shown in the figures, with the use of greater buffer stores b in the network, or by expanding the cell loss ratio parameter CLR on the line.

In comparison with other methods, the diffuse method maintains the values of the cell loss ratio while being more effective in allocating the required bandwidth. It is appropriate for homogeneous as well as heterogeneous types of service. In comparison with classic models, the computation of this method is efficient, and it is easily employable as a CAC algorithm.

Acknowledgment

This work is a part of the AV 1002/2004, VEGA 1/0156/03 and VEGA 1/3118/06 projects.

References

[1] Baroňák, I., Kajan, R.: Quality of ATM Services and CAC Methods. FEI STU Department of Telecommunications, Bratislava, 1999.
[2] Shiomoto, K., Yamanaka, N., Takahashi, T.: Overview of measurement-based connection admission control methods in ATM networks. IEEE Communications Surveys, First Quarter 1999.
[3] Marzo, I., Lazaro, J. L.: Enhanced Convolution Approach for CAC in ATM Networks: An Analytical Study and Implementation. Girona, 1996, ISBN 84-8458-106-53.
[4] Baroňák, I., Kvačkaj, P.: Submission to CAC. Communications – Scientific Letters of the University of Žilina, Vol. 6 (2004), No. 4, p. 80–83.
[5] Kvačkaj, P.: CAC method of effective bandwidth. In: International Competition Student EEICT 2004, Section
in comparison with other methods, the diffuse method maintains the values of the cell loss ratio during which it is more effective during allocation of the required bandwidth. it is appropriate for homogeneous and also for heterogeneous type of service. in comparison with classic models, the account of this method is effective. it is easily employable as a cac algorithm. acknowledgment this work is a part of av 1002/2004, vega 1/0156/03 and vega 1/3118/06 projects. references [1] baroňák, i., kajan, r.: quality of atm services and cac methods. fei stu department of telecommunications, bratislava, 1999. [2] shiomoto, k., yamanaka, n., takahashi, t.: overview of measurement-based connection admission control methods in atm networks. ieee communications surveys, first quarter 1999. [3] marzo, i., lazaro, j. l.: enhanced convolution approach for cac in atm networks, an analytical study and implementation. girona, 1996, isbn 84-8458-106-53. [4] baroňák, i., kvačkaj, p.: submission to cac. communications, scientific letters of the university of žilina, vol. 6 (2004), no. 4, p. 80–83. [5] kvačkaj, p.: cac method of effective bandwidth. international competition student eeict 2004, section 62 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 46 no. 3/2006 czech technical university in prague 90 85 80 75 70 10 �8 10 �7 10 �6 10 �5 10 �4 clr cdf (b=0.1[ mbit/s]) cdf (b=1[ mbit/s]) cdf (b=10[ mbit/s]) n u m b e r o f p o s s ib le c o n n e c ti o n d m fig. 9: the number of possible connections of the diffuse method with the change of clr in on/off service for constant pcr 100 90 80 70 10 �2 10 �1 10 0 10 1 10 2 b [mbit/s] cdf (clr=1.0e-05) cdf (clr=1.0e-06) cdf (clr=1.0e-07) n u m b e r o f p o s s ib le c o n n e c ti o n d m fig. 10: the number of possible connections of the diffuse method with the change of clr in on/off service for constant pcr telecommunications, bratislava, may 27, 2004, p. 232–237. [6] the atm forum: traffic management specification version 4.0. af-95-0013r13, letter ballot, april 1996. [7] engel, r.: design and implementation of a new connection admission control algorithm using a multistate traffic source model. department of computer science, washington university st. louis, 1996. [8] jakab, f., giertl, j., bača, j., andoga, r., mirilovič, m.: contribution to adaptive sampling of qos parameters in computer networks. acta electrotechnica et informatica, vol. 1 (2006 ), p. 52–59, issn 1335-8243. [9] marchevský, s., čižmár, a.: converging the pstn/isdn and the internet. in: ittw 98 – international workshop. tempus telecomnet project. barcelona, 1998, p. 77–81. [10] marchevský, s., kocúr, d., longauer, l., čížová, j.: simulation of adaptiveblind multi-user detection of cdma signals by system design tool-system view. in: recent trends in multimedia information processing. iwssip 2003 – proceedings of the 10th international workshop on system, signal and image processing. prague, czech republic, 10 – 11 september 2003, p. 203–206. doc. ing. ivan baroňák, ph.d. department of telecommunications slovak technical university, bratislava faculty of electrical engineering and information technology ilkovičova 3 bratislava 1, sk 812 19, slovak republic ing. miroslav vozňák, ph.d. phone: +420 596 991 699, +420 234 680 468 e-mail: miroslav.voznak@vsb.cz department of electronics and telecommunications vsb technical university of ostrava faculty of electrical engineering and computer science 17. 
Acta Polytechnica 62(5):538–548, 2022. https://doi.org/10.14311/ap.2022.62.0538
© 2022 The author(s). Licensed under a CC-BY 4.0 licence. Published by the Czech Technical University in Prague.

Experimental study of the torsional effect for yarn break load test of polymeric multifilaments

Daniel Magalhães da Cruz*, Fernanda Mazuco Clain, Carlos Eduardo Marcos Guilherme
Federal University of Rio Grande, Engineering School, Stress Analysis Laboratory POLICAB, 96203-000, Rio Grande – RS, Brazil
* Corresponding author: dacruz.daniel@furg.br

Abstract. Polymeric multifilaments have gained significant interest in recent decades. Although there are different types of mechanical tests, such as rupture, abrasion, creep, impact and fatigue, it can be said that the main mechanical characterisation is the tensile rupture strength (yarn break load, YBL), which also serves as a parameter for other tests. The objective of this work is to evaluate the breaking strength under different torsional conditions in polymeric multifilaments and to determine the optimal twists for failure. The tests were carried out on the following materials: polyamide, polyester, and high modulus polyethylene (HMPE), under the torsional conditions of 0, 20, 40, 60, 120, 240, and 480 turns per metre. As a result, curves were obtained for the three materials that present an optimal point of maximum rupture value, which was also experimentally confirmed. The twist that optimises the breaking strength is 38 turns per metre for HMPE, 56 turns per metre for polyester, and 95 turns per metre for polyamide. Twist groups that exceed the optimal torsion have a deleterious effect on the material: the multifilament ceases to be homogeneous and starts to exhibit an excessive "spring effect". The results differ from the recommendation of the standard that regulates the YBL test; a relationship is therefore built between the optimal torsion groups and the linear density, which provides evidence that as the linear density increases, the optimal torsion for rupture also increases, while the standard prescribes 30 turns per metre for linear densities greater than 2200 dtex and 60 turns per metre for linear densities below 2200 dtex. In addition to the optimal torsion values, this conclusion is paramount: the test procedure makes a general recommendation that does not optimise the breaking strength.

Keywords: Yarn break load, twist effect, maximum breaking strength, polymeric multifilaments.

1. Introduction

The last few decades have brought significant advances in the field of materials engineering and its applications. There has been a significant increase in academic production and commercial focus on polymers [1]. It is noteworthy that, over time, some polymers could be produced at low cost with properties very similar, and even superior, to those of natural materials [2]. Thus, polymers have become, in certain applications, a viable and promising alternative to classic engineering materials. In technical terms, polymers have several advantages compared to other materials.
Highlights are: low specific weight, high toughness (sometimes similar to that of metallic materials), good vibration damping, low friction coefficient, thermal insulation, and high corrosion resistance. Their limitations are related to low stiffness, lower levels of hardness and abrasion resistance, and heat sensitivity [3]. Synthetic polymers, when extruded, are drawn to provide textile yarns [4]. Due to this elongation during fabrication, the ultimate tensile strength becomes greater, with the additional effect of reducing the maximum elongation. The mechanism responsible for this improved performance is the orientation of the polymeric chains, with a possible formation of oriented crystals and a reduction of the amorphous phase [5].

One of the prominent uses of such polymeric materials is in offshore mooring systems; these synthetic fibre ropes have replaced traditional steel ropes due to their better properties in the marine environment and their low weight [6]. There is constant development of mooring ropes; since it is desired that they be stiffer, to guarantee limits on the movement of floating units, their mechanical performance is extensively studied [7, 8]. In offshore moorings, the synthetic polymeric multifilaments for ropes can be made from different materials, among which polyamide, polyester and high modulus polyethylene (HMPE) stand out. Polyamide can be highlighted by recent studies of moorings for offshore energy conversion systems, such as floating offshore wind turbines (FOWT) [9]. Polyester is a material already established in offshore mooring systems; it can be said that it is the most used in these applications, and studies that address it have excellent references in the literature [10]. High modulus polyethylene (HMPE) is a newer and more expensive material, but it has excellent mechanical performance, and there are studies for mobile offshore drilling units [11, 12].

For the mechanical characterisation of polymeric materials, work carried out with multifilaments stands out, where the conditions of use involve multiple variables, such as temperature, humidity, and types of loads (static, sudden, cyclic). Predicting the behaviour of these materials is a challenge due to their non-linear characteristics [13]; this complexity means that much of the research is carried out experimentally [14]. Among the mechanical characterisations carried out in the various studies with polymeric multifilaments, the tensile rupture test (yarn break load, YBL) stands out as the main one. This is because the YBL serves as a parameter for several other mechanical characterisation tests, such as impact, creep, abrasion, fatigue, quasi-static stiffness, and dynamic stiffness; as a load, these usually use a specific percentage of the standard YBL breaking load, as seen in several works that use the YBL as a parameter for other mechanical tests on multifilaments [15–19].

The ISO 2062 standard regulates the execution of the yarn break load test. Several parameters are determined in this standard, but no requirement is stated about the twist that must be applied to the multifilament; only a recommendation exists in the standard, based on the linear density in dtex.
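For reference, the standard's recommendation (quoted in full in Section 2.5 below) reduces to a simple threshold rule. A minimal helper — the function name is ours — applied to the three yarns characterised later (Table 1; 1 tex = 10 dtex):

```python
def recommended_twist_iso2062(dtex: float) -> int:
    """Twist recommendation of ISO 2062 as quoted in Section 2.5:
    60 +/- 1 turns/m below 2200 dtex, 30 +/- 1 turns/m above."""
    return 60 if dtex < 2200 else 30

# Linear densities measured in this study (Table 1), converted to dtex
for name, tex in [("polyamide", 284.4), ("polyester", 225.5), ("HMPE", 185.4)]:
    print(name, recommended_twist_iso2062(tex * 10), "turns/m")
```

Note that the rule prescribes less twist (30 turns/m) for the denser polyamide and polyester yarns and more twist (60 turns/m) for HMPE, which is exactly the trend this paper's results will contradict.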
Due to this standard, most of the papers that report YBL experimental data use either 60 twists per metre or no twist at all. To provide break values (YBL) that serve as a reference for each material, results from the cited literature can be used: [15] for HMPE, [16] for polyamide, and [17] for polyester. Furthermore, a study on the torsional effect in aramid and LCP yarns can be mentioned [20]; its results show that yarn strength can be improved with a slight twist, but a high degree of twist damages the fibres and reduces the yarn's tensile strength. The approach in the study by Rao & Farris [20] is based on torsional angles and shows a general tendency for each material to have an ideal torsional angle group at which maximum strength is achieved.

Thus, the aim of this study is, through an experimental approach, to study the torsion effect on the yarn break load (YBL) test, which is the main test used for the mechanical characterisation of multifilaments, in particular analysing the effects on the breaking force values. The hypothesis is that, as found in [20] for aramid multifilaments, there is a torsion group that optimises the breaking strength of polyamide, polyester, and HMPE.

2. Materials and methods

2.1. Multifilament materials and torsion conditions

In this study, the following materials are analysed: polyamide, polyester, and high modulus polyethylene. The materials tested were characterised by measuring their linear density and breaking load according to the current standards, as described in Sections 2.2 and 2.3. Each material is initially tested with the following twist groups: 0 (no torsion), 20, 40, 60, 120, 240 and 480 turns per metre. All tests followed the standard atmosphere determinations of ISO 139:2005: the specimens are conditioned during the test at a temperature of 20 ± 2 °C and a relative humidity of 65 ± 4 % [21].

2.2. Linear density test, tex

In the characterisation methodology for each of the multifilament materials, the weight per length is very important, serving also as a parameter to determine the linear tenacity of the multifilament. The linear density test (mass/length ratio) was performed for each material with a series of 10 specimens, to obtain the mean and standard deviation according to ASTM D1577 [22]. The same procedure was performed for all materials: polyamide, polyester, and HMPE. For a sample with a length of 1000 ± 1 mm, the mass was measured on a high-accuracy balance with a resolution of 0.0001 grams. For each sample, a 3-minute stabilisation period was allowed before recording the mass indicated by the balance. Using the mass and length data, the linear density in grams per metre was determined. It should be highlighted that this quantity usually appears in the literature in tex [g/km], and can also be presented in the submultiple notation decitex [dtex].

2.3. Yarn breaking load test

The multifilament break test follows the methodology of the ISO 2062:2009 standard and is called the yarn break load (YBL) test [23]. The main parameters of the test are: a tested useful length of 500 millimetres, a speed of 250 millimetres per minute, and the environmental standard ISO 139:2005. The yarn break load test gives the breaking value as well as the maximum elongation of the tested specimen. This result is also important to determine the linear tenacity of the multifilament. Breakage tests were performed on 12 specimens for each of the 7 torsion groups and for each of the 3 materials, resulting in 252 specimens.
The data were statistically filtered using box-plots, leaving 8 values for each group, which characterise the mean of the required condition and make up the standard deviation. In the box-plot tool, values that differ greatly from the set fall outside an accepted range; these atypical and extreme values are known as outliers and are excluded from the treatment [23].

2.4. Curve parameterization for the optimal point

In the work of Rao & Farris [20], an optimal point of twist angle is presented. The expectation that something similar will occur for the torsion groups in turns per metre is what motivates the parameterisation of a curve around an optimal point. Theoretically, increasing the twist condition increases the breaking strength up to an optimal point, and increasing the twist any further decreases the breaking strength. With the rupture results for each of the torsional groups, graphs of the torsion groups versus YBL were plotted, where the x axis corresponds to the twist value and the y axis corresponds to the rupture strength. The optimal point is then determined from three discrete data sets for each of the groups tested: the highest breaking strength data are used, together with the groups immediately before and immediately after. It should be noted that the non-linear characteristics of polymeric materials hinder a general approach to the results; furthermore, it would be very difficult to find a curve with a satisfactory coefficient of determination passing through all 7 or 8 discrete points for each material. With these 3 data sets, it is possible to perform a quadratic regression and determine a second-degree polynomial that passes through the 3 points. In fact, this determination is a precise mathematical procedure that produces a perfect fit: since R² represents a quantitative measure of whether a given mathematical model satisfactorily describes the behaviour, in this case the fit is exact, with R² equal to 1 [24]. By definition, the maximum and minimum points of a given function correspond to the first derivative of the function being equal to zero, and this provides the ideal torsional group for the maximum breaking strength of each material [25]. After finding the ideal twist point, all materials were tested again at this ideal twist condition to compare the experimental YBL value with the one obtained mathematically.

2.5. Relationship between optimal twist and linear density

The recommendation of the standard that regulates the yarn break load test, ISO 2062:2009, is: "A twist of 60 ± 1 turns/m for yarns below 2200 dtex and a twist of 30 ± 1 turns/m for yarns above 2200 dtex are recommended. Other twist amounts may be allowed on agreement of the interested parties" [26]. This is not a requirement of the standard, but based on that recommendation it is possible to infer that the optimal torsion group is treated as a function of the linear density (tex) of the material. Based on this, and taking into account the optimal experimental results, it may be possible to build a mathematical model for the twist per metre as a function of the linear density value in tex [g/km]. The methodology used is the least-squares method, associated with the concept of the best coefficient of determination, that is, R² = 1 [24, 27]. It should be emphasised that this step depends on the experimental results and on the aforementioned methodology.
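A sketch of the outlier rejection of Section 2.3 follows, using the common Tukey box-plot fences; the 1.5·IQR factor and the sample values below are assumptions for illustration, since the paper does not state the exact fence used:

```python
import numpy as np

def boxplot_filter(values):
    """Box-plot (Tukey) outlier rejection: keep points inside
    [Q1 - 1.5*IQR, Q3 + 1.5*IQR].  The 1.5 factor is the usual
    box-plot convention, assumed here."""
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if lo <= v <= hi]

# Illustrative set of 12 break loads [N] with two stray values
raw = [219.8, 220.5, 221.0, 220.1, 219.5, 221.3,
       220.8, 220.2, 211.0, 228.9, 220.6, 220.0]
kept = boxplot_filter(raw)
print(len(kept), np.mean(kept), np.std(kept, ddof=1))
```

The filtered subset then provides the mean and standard deviation reported per torsion group in Table 2.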
3. Results and discussions

3.1. Linear density of the multifilaments

The results of the linear density measurements are shown in Table 1 for each material series. It is observed that polyamide is the material with the highest linear density, while high modulus polyethylene has the lowest; the difference in linear density between these two materials is significant. The standard deviation (SD) is very satisfactory. The results give linear density values of 284.4 tex for polyamide, 225.5 tex for polyester, and 185.4 tex for high modulus polyethylene. For comparison, and even validation, results from the literature can be cited for each material: in [16], the linear density for polyamide was 284.6 tex; for polyester, the value of 233.8 tex can be found in [17]; and for HMPE, a linear density of 176.4 tex was found in [15].

Table 1. Linear density results in grams per metre [g/m]

            Polyamide   Polyester   HMPE
Average     0.2844      0.2255      0.1854
SD          0.0011      0.0006      0.0009

3.2. Yarn breaking load for the torsion groups

For the yarn break load criterion, the breaking force is used for each of the twist groups. Table 2 shows the mean and SD of the YBL for the 8 sets of filtered data, as described in Section 2.3, for each torsion group. Although there are no data in the literature for all torsional conditions, the results presented in Table 2 can be compared with the specific literature cited in the introduction [15–17]. In the reference works, a breaking force for HMPE of 500.84 N with a standard deviation of 16.37 N was measured [15], but the torsion used was not specified, although the required adherence to the standard suggests that it was 60 turns per metre. For polyamide, the average breaking strength found for an untwisted condition was 210.47 N with a standard deviation of 3.78 N [16]. A reference value for polyester can be an average breaking force of 174.04 N with a standard deviation of 3.20 N for a twist condition of 60 turns per metre [17]. As can be seen, these are numerically different but consistently similar values; the small differences can even be justified by the coil manufacturer or the polymerisation method. Thus, the values presented in Table 2 fall within ranges verified in the literature for each material (mainly for the untwisted groups and those with 60 turns per metre).

Table 2. Yarn break load results for polyamide, polyester and HMPE

Torsion    Polyamide                   Polyester                   HMPE
[rev/m]    Max. load [N]  Strain [%]   Max. load [N]  Strain [%]   Max. load [N]   Strain [%]
0          210.08±3.90    17.68±1.00   172.36±2.48    12.52±0.17   467.17±10.74    2.53±0.12
20         214.31±2.50    17.19±0.61   176.58±4.23    13.15±0.48   532.00±8.15     2.81±0.09
40         218.45±0.72    19.81±0.33   179.14±1.17    13.50±0.30   546.45±5.47     2.96±0.04
60         220.32±1.82    19.68±0.32   181.34±2.18    13.29±0.28   522.60±7.85     2.85±0.05
80         –              –            176.56±1.12    12.91±0.30   –               –
120        221.15±1.61    21.17±0.69   175.62±2.29    13.24±0.44   474.16±21.67    2.96±0.17
180        210.40±2.49    21.49±0.57   –              –            –               –
240        190.87±4.72    20.39±1.11   151.66±4.53    13.76±0.38   193.93±21.13    3.24±0.25
480        95.39±2.85     28.03±2.70   67.03±3.08     16.15±1.71   94.25±13.44     7.78±1.05

In addition to the initial 7 torsion groups, the 80-turns-per-metre group for polyester and the 180-turns-per-metre group for polyamide were added, as can be seen in Table 2.
The reason for these additions is the higher tensile strength in the group immediately preceding the added one; the intention was to verify the behaviour of the breaking force for a marginally intermediate group that would allow interpretation, since the initial torsion gaps were large. There was also the intention of verifying an increasing and then decreasing behaviour, because the absence of a plateau confirms a result similar to that of [20]: there is a maximum point that optimises resistance, reported for the torsional angle in the literature and for a torsional ratio in turns per metre in the present work.

The results can be represented graphically for each material: Figure 1 for polyamide, Figure 2 for polyester and Figure 3 for HMPE. There is a torsional group that provides the highest break value in the yarn break load test, and it differs for each material. For polyamide, the maximum break value is 221.15 N for a twist of 120 turns per metre; polyester breaks at 181.34 N for a twist of 60 turns per metre; and for HMPE, the maximum break is 546.45 N for a twist of 40 turns per metre.

The table and the graphs also carry important information about the elongation. An increasing trend in elongation is observed. At certain points the mean even decreases, but if the standard deviation is considered, it can be said that the elongation increases as more twist is added to the specimen. The addition of torsion promotes a densification of the multifilament; the effect of this torsion manifests itself in the elongation as what can be called the "spring effect". As the "spring effect" increases, the elongation increases in a contained manner, until the point at which the multifilament starts to fold in on itself. At approximately 200 to 240 turns per metre, when this "self-folding" of the multifilament occurs (indicated by the red circles in Figure 4b), the elongation data grow exponentially in the YBL test. This more pronounced increase in elongation is not a characteristic of the material; it is the removal of the excessive "self-folding" torsion through traction. In the rupture graph, this removal of "self-folding" is evident in the abrupt drops that occur during the test, which take away the visual homogeneity of the graph (Figure 5). These groups with excessive elongation are also the groups with the lowest breaking strength. What happens is that excessive twist promotes shear forces; a much larger amount of twist promotes more shear in the multifilament when it is pulled, and therefore the break value is lower.

It is noteworthy that any torsion group has an important homogenising effect on the moment of rupture, due to the densification of the multifilament. For all materials in this study, this homogeneity in rupture can be observed for the groups with torsion. Taking polyamide as an example, Figures 6 and 7 display the graphs for the untwisted and the 20-turns-per-metre conditions, respectively. In Figure 6 the rupture occurs unevenly, while in Figure 7, where the specimen is twisted, the rupture is much more homogeneous.

Figure 1. Twist × YBL for polyamide.
Figure 2. Twist × YBL for polyester.
Figure 3. Twist × YBL for HMPE.
Figure 4. (a) Counter-turn display at 240 turns per metre; (b) appearance of "self-folding".
Figure 5. Graphic effect of the self-folding removal on the HMPE specimen.
Figure 6. Graph of the YBL test results for polyamide, 0 turns per metre.
Figure 7. Graph of the YBL test results for polyamide, 20 turns per metre.

3.3. Determination of the optimal torsion group

From the results already shown, it is found that there is an optimal group for each material. As described in Section 2.4, for each material and its discrete data set, a system of equations was created.
The matrix system was solved by the Gauss method, thus providing the coefficients of the quadratic models in equations (1), (2) and (3).

For polyamide, equation (1):
y = −0.0016·x² + 0.3031·x + 207.92.  (1)

For polyester, equation (2):
y = −0.0087·x² + 0.9819·x + 153.82.  (2)

For HMPE, equation (3):
y = −0.0479·x² + 3.5945·x + 479.26.  (3)

The approximated curves developed by the quadratic model are shown in Figure 8 (polyamide), Figure 9 (polyester) and Figure 10 (HMPE). The first derivative of a function equal to zero gives the maximum and/or minimum points of the function [25]; as all the models are downward-opening parabolas, it is the maximum of the function that, when the derivative is set equal to zero, yields the optimal torsion group for the maximum breaking value.

For polyamide, equation (4):
d/dx[−0.0016·x² + 0.3031·x + 207.92] = 0.  (4)

For polyester, equation (5):
d/dx[−0.0087·x² + 0.9819·x + 153.82] = 0.  (5)

For HMPE, equation (6):
d/dx[−0.0479·x² + 3.5945·x + 479.26] = 0.  (6)

Thus, the optimal torsional groups, rounded to whole numbers, are: 95 turns per metre for polyamide, 56 turns per metre for polyester, and 38 turns per metre for HMPE. With these twist values it is possible to return to the respective models, equations (1), (2) and (3), and determine the expected force value at the optimal point.

For polyamide, equation (7) gives:
y(95) = −0.0016·(95)² + 0.3031·(95) + 207.92 = 222.2745 [N].  (7)

For polyester, equation (8) gives:
y(56) = −0.0087·(56)² + 0.9819·(56) + 153.82 = 181.5232 [N].  (8)

For HMPE, equation (9) gives:
y(38) = −0.0479·(38)² + 3.5945·(38) + 479.26 = 546.6834 [N].  (9)

Observing the rupture forces obtained for the optimal mathematical groups, the values are higher than those of the works taken as reference [15–17], and it is possible to infer an improvement in strength at a given twist per metre. It is now important to verify whether the experimental breakage values coincide for the same optimal groups obtained by the mathematical models above.

3.4. YBL results for the optimal torsion

Having determined the ideal torsional groups, as well as the expected breaking force, it is possible to verify experimentally whether these torsions correspond to the maximum breaking response in the yarn break load test. The results of these tests are presented in Table 3, which contains the experimental results and the force predicted by the quadratic mathematical model. Very small relative differences in the maximum breaking load occurred between the quadratic model and the experimental data, which confirms that these are the ideal torsional groups for each material.

Figure 8. Quadratic model for the optimal point, polyamide.
Figure 9. Quadratic model for the optimal point, polyester.
Figure 10. Quadratic model for the optimal point, HMPE.
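The whole procedure of Sections 2.4 and 3.3 — an exact parabola through the best group and its two neighbours (Table 2), followed by the vertex condition of equations (4)–(6) — fits in a few lines. Running this sketch reproduces the coefficients of (1)–(3) and the optima of (7)–(9) up to rounding:

```python
import numpy as np

# Three (twist, load) points per material: the best group and its neighbours (Table 2)
points = {
    "polyamide": [(60, 220.32), (120, 221.15), (180, 210.40)],
    "polyester": [(40, 179.14), (60, 181.34), (80, 176.56)],
    "HMPE":      [(20, 532.00), (40, 546.45), (60, 522.60)],
}

for name, pts in points.items():
    x, y = zip(*pts)
    a, b, c = np.polyfit(x, y, 2)        # exact parabola through 3 points (R^2 = 1)
    x_opt = -b / (2.0 * a)               # vertex: dy/dx = 2ax + b = 0
    y_opt = np.polyval([a, b, c], x_opt)
    print(f"{name}: a={a:.4f}, b={b:.4f}, c={c:.2f}; "
          f"optimal twist ~ {x_opt:.0f} rev/m, load ~ {y_opt:.1f} N")
```

The printed optima are approximately 95 rev/m and 222.3 N for polyamide, 56 rev/m and 181.5 N for polyester, and 38 rev/m and 546.7 N for HMPE, matching equations (7)–(9).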
Table 3. Experimental results for the optimal torsion group of each material

                           Quadratic model     Experimental
                           Max. load [N]       Max. load [N]      Strain [%]
Polyamide, 95 [rev/m]      222.2745            222.45 ± 1.23      20.60 ± 0.40
Polyester, 56 [rev/m]      181.5232            182.68 ± 2.11      13.24 ± 0.36
HMPE, 38 [rev/m]           546.6834            548.60 ± 15.95     2.87 ± 0.08

3.5. Curve relating the optimal torsion and tex

As described in Section 2.5, and considering that there are reasons for the optimal torsion to be related to the linear density, a mathematical model relates these variables. There are 3 discrete data pairs, relating to polyamide (284.4 tex; 95 rev/m), polyester (225.5 tex; 56 rev/m) and HMPE (185.4 tex; 38 rev/m). Thus, to force a perfect coefficient of determination (R² = 1), a quadratic model of these discrete data is built. The system can be set up in matrix form to determine the coefficients of the quadratic equation, and solved by the Gauss method, obtaining equation (10):

y = (16790/7794237)·x² − (3400351/7794237)·x + 10590325/236189
  ≅ 2.1542·10⁻³·x² − 0.43626·x + 44.838.  (10)

The coefficient of the quadratic term is very small, which suggests that a linear model obtained by the method of least squares may be satisfactory. The programming was done in Octave, and the curves, equations and coefficients of determination indicated in Figure 11 were obtained. The linear model is obtained using the least-squares method [27] and returns the equation of the straight line shown in equation (11):

y = 0.58219·x − 71.931.  (11)

As can be seen, the model makes it possible to consider the interrelationship between the linear density and a certain optimal torsion. However, the standard's recommendation implies a reduction in twist as the linear density increases, while in the constructed model exactly the opposite is verified: the increase in linear density causes the optimal torsion for the maximum break value to increase as well. In other words, the standard's recommendation can contribute to a lower performance of polymeric multifilaments, although it should always be emphasised that the standard's recommendation is general, whereas here a model is described for specific synthetic polymeric materials.

Figure 11. Mathematical model, optimal torsion as a function of linear density.
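Both fits of this section can be checked directly; the following sketch reproduces the coefficients of (10) and (11) from the three (tex, optimal twist) pairs:

```python
import numpy as np

tex = np.array([185.4, 225.5, 284.4])   # HMPE, polyester, polyamide
twist = np.array([38.0, 56.0, 95.0])    # optimal groups [rev/m]

# Exact quadratic through the three points, eq. (10)
print(np.polyfit(tex, twist, 2))   # ~ [2.1542e-03, -4.3626e-01, 4.4838e+01]

# Least-squares straight line, eq. (11)
slope, intercept = np.polyfit(tex, twist, 1)
print(slope, intercept)            # ~ 0.58219, -71.931
```

Both slopes are positive, which is the numerical expression of the paper's central observation: the optimal twist grows with the linear density.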
4. Conclusions
In this study, it is evident that the amount of twist applied to the specimens influences the yarn breaking load results. The gradual increase in twist densifies the multifilament, making its breakage more homogeneous. A consequence of this densification is also an increase in the breaking load up to a certain optimal twist. In this study, optimal twists were determined for polyamide (95 turns per metre), polyester (56 turns per metre) and HMPE (38 turns per metre). Beyond the optimal twist, additional torsion causes the shear forces to increase and consequently leads to lower breaking values. Another conclusion concerns the relationship between linear density and optimal torsion (Figure 11): the proposed mathematical model proves very satisfactory both in the form of a quadratic model and of a linear model. The model obtained in the study demonstrates that an increase in linear density causes the optimal twist value for the maximum breaking force to increase, which is exactly the opposite of the recommendation of the ISO 2062 standard for yarn breaking load testing. That is, the standard makes a general, comprehensive recommendation that does not optimise the performance of the material in terms of breaking strength. In this study, the torsion effect for the YBL test was evaluated; future studies can likewise evaluate the torsion effect in other multifilament mechanical tests, such as creep, fatigue and abrasion.
Acknowledgements
The authors would like to express their gratitude to the Federal University of Rio Grande and the POLICAB Stress Analysis Laboratory for supporting this study.
References
[1] E. Hage Jr. Aspectos históricos sobre o desenvolvimento da ciência e da tecnologia de polímeros. Polímeros 8(2):6–9, 1998. https://doi.org/10.1590/s0104-14281998000200003.
[2] W. D. Callister Jr. Ciência e engenharia de materiais. Editora LTC, Rio de Janeiro, 7th edn., 2008.
[3] D. S. D. Rosa. Correlação entre envelhecimentos acelerado e natural do polipropileno isotático (PPi). Ph.D. thesis, Universidade Estadual de Campinas, 1996.
[4] H. A. McKenna, J. W. S. Hearle, N. O'Hear. Handbook of fibre rope technology. Elsevier, 2004.
[5] L. A. Santos. Desenvolvimento de cimento de fosfato de cálcio reforçado por fibras para uso na área médico-odontológica. Ph.D. thesis, Universidade Estadual de Campinas, 2002.
[6] H. da Costa Mattos, F. Chimisso. Modelling creep tests in HMPE fibres used in ultra-deep-sea mooring ropes. International Journal of Solids and Structures 48(1):144–152, 2011. https://doi.org/10.1016/j.ijsolstr.2010.09.015.
[7] M. B. Bastos, L. F. Haach, D. T. Poitevin. Prospects of synthetic fibers for deepwater mooring. In Rio Oil and Gas 2010, IBP2745_10, 2010.
[8] S. Leite, P. E. Griffin, R. Helminem, R. D. S. Challenges and achievements in the manufacturing of DWM polyester tether ropes for the Chevron Tahiti projects – Gulf of Mexico. In Rio Oil and Gas 2010, IBP2291_10, 2010.
[9] Y. Chevillotte, Y. Marco, G. Bles, et al. Fatigue of improved polyamide mooring ropes for floating wind turbines. Ocean Engineering 199:107011, 2020. https://doi.org/10.1016/j.oceaneng.2020.107011.
[10] C. J. M. Del Vecchio. Light weight materials for deep water moorings. Ph.D. thesis, University of Reading, UK, 1992.
[11] C. Berryman, R. Dupin, N. Gerrits. Laboratory study of used HMPE MODU mooring lines. In Offshore Technology Conference, OTC-14245-MS, 2002. https://doi.org/10.4043/14245-ms.
[12] I. Corbetta, F. Sloan. HMPE mooring line trial for Scarabeo III. In Offshore Technology Conference, OTC-13272-MS, 2001. https://doi.org/10.4043/13272-ms.
[13] E. L. V. Louzada, C. E. M. Guilherme, F. T. Stumpf. Evaluation of the fatigue response of polyester yarns after the application of abrupt tension loads. Acta Polytechnica CTU Proceedings 7:76–78, 2017. https://doi.org/10.14311/app.2017.7.0076.
[14] V. Sry, Y. Mizutani, G. Endo, et al. Consecutive impact loading and preloading effect on stiffness of woven synthetic-fiber rope. Journal of Textile Science and Technology 3:1–16, 2017. https://doi.org/10.4236/jtst.2017.31001.
[15] E. S. Belloni, F. M. Clain, C. E. M. Guilherme. Post-impact mechanical characterization of HMPE yarns. Acta Polytechnica 61(3):406–414, 2021. https://doi.org/10.14311/ap.2021.61.0406.
[16] D. M. Cruz, E. S. Belloni, F. M. Clain, C. E. M. Guilherme. Analysis of impact cycles applied to dry polyamide multifilaments and immersed in water. Rio Oil and Gas Expo and Conference 20(2020):199–200, 2020. https://doi.org/10.48072/2525-7579.rog.2020.199.
[17] I. Melito, E. S. Belloni, M. B. Bastos, et al. Effects of mechanical degradation on the stiffness of polyester yarns. Rio Oil and Gas Expo and Conference 20(2020):176–177, 2020. https://doi.org/10.48072/2525-7579.rog.2020.176.
[18] L. Cofferri, C. E. M. Guilherme, F. T. Stumpf. Application of the stepped isothermal method to evaluate creep behavior in high-modulus polyethylene yarns. In 27º Congresso Internacional de Transporte Aquaviário, Construção Naval e Offshore, 2018. https://doi.org/10.17648/sobena-2018-87587.
[19] L. Caldeira, P. Lucas, F. Chimisso. Creep comparative behavior of HMPE (high modulus polyethylene) multifilaments when submitted to changing conditions of temperature and load. In Youth Symposium on Experimental Solid Mechanics, pp. 99–102, 2010.
[20] Y. Rao, R. J. Farris. A modeling and experimental study of the influence of twist on the mechanical properties of high-performance fiber yarns. Journal of Applied Polymer Science 77(9):1938–1949, 2000. https://doi.org/10.1002/1097-4628(20000829)77:9<1938::aid-app9>3.0.co;2-d.
[21] International Organization for Standardization. Textiles – standard atmospheres for conditioning and testing (ISO standard No. 139), 2005.
[22] American Society for Testing and Materials. Standard test methods for linear density of textile fibers (ASTM standard No. D1577), 2018.
[23] W. W. Hines, D. C. Montgomery, D. M. Goldsman, C. M. Borror. Probabilidade e estatística na engenharia. Editora LTC, Rio de Janeiro, 4th edn., 2006.
[24] P. A. Barbetta, M. M. Reis, A. C. Bornia. Estatística: para cursos de engenharia e informática. Atlas, São Paulo, 2010.
[25] P. A. Morettin, S. Hazzan, W. O. Bussab. Cálculo – funções de uma e várias variáveis. Editora Saraiva, São Paulo, 3rd edn., 2012.
[26] International Organization for Standardization. Textiles – yarns from packages – determination of single-end breaking force and elongation at break using constant rate of extension (CRE) tester (ISO standard No. 2062), 2009.
[27] O. Helene. Método dos mínimos quadrados com formalismo matricial. Livraria da Física, São Paulo, 2006.
Extending the life time of a nuclear power plant: impact on nuclear liabilities in the Czech Republic
L. Havlíček

Abstract: Nuclear power plant (NPP) operators have several basic long-term liabilities. Such liabilities include storage, treatment and disposal of radioactive waste generated at the operator's NPP; storage and management of nuclear fuel irradiated in the reactor of the operator's NPP ("spent fuel"); and disposal of the spent fuel (SF) or residues resulting from spent fuel reprocessing. Last but not least, the operator is liable for decommissioning its nuclear facilities. If the operator considers extending the life time of its NPP, or if the construction of a new NPP is being evaluated by an investor, an integral part of the economic evaluation must be a comprehensive assessment of the future incremental costs related to the above-mentioned long-term liabilities. An economic evaluation performed by standard methods (usually NPV, alternatively real options) leads to a decision either to proceed with the project or to shelve it. If the investor decides to go ahead with the project, there can be an immediate impact on nuclear liabilities. The impact is not the same for all operator liabilities. Depending on the valid legislation and the nature of the liability, in some cases the extent of the liability must be immediately recalculated when a decision is made to proceed with the project, and the annual accrual of accumulated reserves/funds must be adjusted. In other cases, the change in liability is linked to the generation of additional radioactive waste or spent fuel. In the Czech Republic, responsibility for each of the nuclear liabilities is defined, as is the form in which the financial means are to be accumulated. This paper deals with the impact of NPP life time extension (alternatively NPP power up-rate or construction of a new NPP) on the individual nuclear liabilities in the conditions of the Czech Republic.
Keywords: nuclear liabilities, spent fuel, radioactive waste, storage, disposal, decommissioning, financing.

1 Definition of nuclear liabilities
Liability can be simply defined as the obligation to transfer economic benefits as a result of a past transaction or a past event. If we apply this principle to the operation of an NPP, together with the other required attributes of liabilities (sufficient probability of the event, possibility to express the consequences in financial terms, enforceability), we can identify the following basic nuclear liabilities:
• radioactive waste (RW) treatment, storage and disposal
• spent fuel (SF) storage and management
• disposal of SF or of residues from SF reprocessing
• decommissioning of nuclear facilities, i.e. the NPP, RW treatment facilities, SF storages, etc.
The basic legislation in the Czech Republic is the Nuclear Act 18/1997 Coll., as amended. The Act strictly specifies the division of responsibilities (and therefore liability) in the back-end of the nuclear fuel cycle. The state has assumed responsibility for the safe disposal of the RW and SF that is produced. On the other hand, the NPP operator remains liable for treating its RW (the main aim is volume reduction and transformation to a form suitable for disposal). RW is in fact usually processed continually or in short-term campaigns. If it is necessary to store RW before treatment, or in the event that the RW is not suitable for disposal in the currently operated near-surface repositories and must be disposed of in a future deep geological repository, the operator is responsible for RW storage. The NPP operator is also responsible for SF storage until the moment it is declared to be RW by the operator and handed over to the state for disposal. The NPP operator is also fully liable for decommissioning its nuclear facilities after the end of their operation. It is logical that the financial means have to be accumulated by the subject responsible for the particular liability. In the case of RW and SF disposal, financial means in the form of a nuclear fund are accumulated and controlled by the state, though the funding is basically sourced from the prescribed contributions from the waste producers, including NPP operators. However, it is the operator who accumulates the funds for RW and SF storage and for decommissioning. Although the liabilities will be settled in the distant future, many years after the closure of the NPP, it is necessary to accumulate the necessary funds while the NPP is still in operation and to invest them continuously and safely, in order to compensate the effects of inflation and to obtain at least some real yield over inflation. When the unit cost of current production is calculated, many different assumptions must be taken into account.
2 Operator actions with an impact on nuclear liabilities
When we analyze how liabilities are defined and calculated, we can list the following events that have a substantial influence on the financial expression of a liability:
• changes in the production of SF and RW (changes on the costs side)
• changes in storage, disposal and decommissioning technologies (costs side)
• changes in the electricity generated (revenue side)
• changes in the time-schedule of future expenditures (influence on the yield from invested funds)
• changes in the economic environment and related factors, e.g. cost escalation, influence of inflation, yield from invested funds
• changes in legislation – redefinition of responsibilities, changes in the form of fund accumulation, enlargement or limitation of investment opportunities.
If the operator decides to extend the life time of its NPP, there will be a combination of the above-mentioned effects – the volume of SF and RW is increased, the timing of the expenditures is changed and the amount of electricity to be generated is increased. If the operator (or potentially any investor) decides to build a new NPP, the effect on liabilities will in principle be the same, though the effect will be more pronounced. In addition, there are other projects with a potential impact on nuclear liabilities.
The operator may decide to up-rate the installed capacity of its current NPP for the remaining part of the (unchanged) planned life time. These operator actions with an impact on nuclear liabilities are summarized in Fig. 1. Below, I will analyze the impact on the individual nuclear liabilities. If we take the current situation in the Czech Republic, several such operator actions are being implemented or considered. The installed capacity at NPP Dukovany is being increased by means of turbine and control system modifications, and similar measures will be applied at NPP Temelín. As the fuel at NPP Dukovany has been improved substantially and is now able to support a higher output with an unchanged 5-year fuel cycle, there is no obvious impact on the cost side, and only the revenue side (the amount of electricity generated) will be increased. In the future the operator may decide to change the technology for the storage of SF, or to reprocess and recycle its SF in the reactor and dispose of only the separated high-level waste. In such cases there would also be a substantial change in the operator's liability. As the current stage of technological development and the economics of reprocessing are not favorable, no fundamental change in the fuel cycle back-end strategy in the Czech Republic is foreseen in the near future.

3 Aspects of liability
In principle, the basic considerations are as follows:
• Investor's view – an investor needs to know whether the project (i.e. an extension/up-rate/new build) will be efficient. Standard analytical methods like net present value (NPV) can be used. The costs per unit of future/additional production (e.g. MWh) connected with any particular liability can be calculated as an enhancement of the NPV analysis.
• Shareholder's view – any change in liability must be correctly quantified in the operator's calculations. Only then can current or potential shareholders base their decisions on correct assumptions.
• Regulator's (state's) view – the role of the state/regulator is to monitor whether sufficient funds for future expenditures are being generated and set aside.

4 Options for model calculations
Let us assume an NPP that has been in operation since the 1980s. The currently planned life time is 40 years. The operator considers extending the life time by 10 or 20 years. This leads to the options in Table 1.

Table 1: Definition of the options for extending the life time of the NPP

                            Basic option   Option 1   Option 2
  Extension [years]                    0         10         20
  Total life time [years]             40         50         60

Fig. 1: Basic operator actions influencing nuclear liabilities (three panels of annual electricity generation [TWh] over 1990–2100: life time extension of the current NPP, up-rating of the current NPP, and construction of a new NPP).

Fig. 2: Options for extending the current life time of the NPP and the connected time schedule (NPP operation for 40/50/60 years, NPP decommissioning, deep geological repository operation, spent fuel storage operation).
5 Impact on individual liabilities
5.1 Spent fuel & radioactive waste storage
As was mentioned above, in the Czech Republic SF and RW storage is a liability of the NPP operator. The operator is required to cover not only all the future costs connected with the storage process itself, but also the costs related to transport from the reactor to the storage and, finally, from the storage to the deep geological repository for final disposal. The operator of the NPPs in the Czech Republic has selected "dry" storage technology in dual-purpose (storage and transport) casks. The casks are stored in a specifically designed building which protects them from adverse weather conditions. The dry storage technology was selected under time pressure in the first half of the 1990s when, after the break-up of Czechoslovakia, it was necessary to transport back from Slovakia the SF irradiated at NPP Dukovany, which was at that time being stored in a wet-technology SF storage in Slovakia. Cask technology can be implemented more quickly, and therefore this technology was chosen despite being more expensive. Cask technology involves the following costs:
• storage building construction costs – including the preparatory phase (siting, licensing, selection of a constructor)
• cask purchasing costs – over the life time of the NPP this is a multi-billion CZK cost item and the major element in the storage costs
• storage operation costs – personnel, energy, maintenance, insurance, technical assistance and research. During simultaneous operation with the NPP these costs are not high, as many cost items are shared with the NPP and some are not required (e.g. insurance).
Currently two SF storages are in operation at NPP Dukovany. The first storage (60 casks, i.e. 600 t of heavy metal) was put into operation in the mid 1990s. Its capacity was exhausted in 2006. When it was constructed, its capacity was limited by a political decision, due to the reluctance of the neighboring communities to accept storage for the full life time of NPP Dukovany. At that time a central storage for all Czech NPPs was under consideration. However, this concept was later abandoned and a second storage (132 casks, i.e. 1340 tHM) was put into operation in 2006. This storage was planned to absorb the SF production for a 40-year life time. Due to improvements in the fuel cycle (a reduced number of loaded fuel assemblies), the storage is sufficient for roughly 45 years of operation of NPP Dukovany. For Option 1 and Option 2 it is therefore necessary to include the cost of building additional storage. For the purposes of an economic evaluation, the following storage costs (cash-flow view), based on 2006 fixed prices, were considered:

Fig. 3: Storage costs (cash-flow view) for the evaluated model NPP life time options.

The same cash-flow expenses are shown on a cumulative basis in the following graph:

Fig. 4: Cumulative storage costs (cash-flow view) for the evaluated model NPP life time options.

These cost estimates were used as one of the inputs for an overall economic evaluation of the life time options.

Fig. 5: Unit storage costs per generated MWh for the considered options, calculated by NPV as a function of the discount rate.

The results show that, in terms of storage costs, the extension project is efficient.
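For illustration, here is a minimal sketch of the unit-cost calculation behind Fig. 5: discounted storage expenditures divided by discounted electricity generation, evaluated at several discount rates. All cash flows and generation figures below are assumed placeholders, not the actual model inputs of [1].

```python
# Levelized unit storage cost [CZK/MWh]: PV of costs over PV of generation.
def unit_cost(costs, generation, rate):
    """costs: list of (year_offset, cost in mil. CZK)
    generation: list of (year_offset, generation in TWh)"""
    pv_cost = sum(c / (1 + rate) ** t for t, c in costs)              # mil. CZK
    pv_gen = sum(g * 1e6 / (1 + rate) ** t for t, g in generation)    # MWh
    return pv_cost * 1e6 / pv_gen                                     # CZK/MWh

# hypothetical cash flows: cask purchases during operation, then
# stand-alone storage operating costs until hand-over for disposal
costs = [(t, 150.0) for t in range(0, 40)] + [(t, 60.0) for t in range(40, 75)]
generation = [(t, 13.0) for t in range(0, 40)]   # 13 TWh/a over 40 years

for r in (0.0, 0.01, 0.02, 0.03):
    print(f"discount rate {r:.0%}: {unit_cost(costs, generation, r):.1f} CZK/MWh")
```

Because the storage expenditures lag the generation, the unit cost falls as the discount rate rises, which is the qualitative shape of Fig. 5.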
In terms of liability, the situation is different. The liability to store a fuel assembly (FA) arises at the moment when the particular FA is loaded into the reactor. Before this loading, a fresh FA can be de-fabricated and the individual components (nuclear material, zirconium and stainless steel structural parts) can be sold; there is therefore no liability to store such an FA in the form of SF. However, once an FA is exposed to a chain reaction, its nuclear properties become such that it must be safely stored for a long time, and later disposed of. The liability is always to be assessed on the basis of the actual quantity of SF. Consequently, there is no direct link between the adopted planned NPP life time (or the decision to build a new NPP) and the financial expression of the storage liability (under the IAS 37 accounting standard) in the operator's balance of accounts. This liability is increased only if new fresh fuel is loaded into the reactor, or if the future cost estimate is updated upwards. On the other hand, the liability is decreased when the related expenses are paid (new casks, operational costs). If SF generation reaches such a level that there is not sufficient capacity for placing the casks with SF, there will be a sudden increase in liability due to the need to build a new storage building. As far as RW storage is concerned, there is no particular liability to be assessed: there are no explicit storage costs for items being stored for future disposal. Unlike the disposal and decommissioning liabilities, there is no obvious relation between the NPP life time (waste volume produced) and the liability.

5.2 Spent fuel & radioactive waste disposal
As noted earlier, SF and RW disposal is the responsibility of the state. At the present time 50 CZK per MWh generated in an NPP is paid by the operator into the nuclear account (NA). The accumulated payments, together with the earnings from the investment income, should cover all the future costs of SF and RW disposal. The assessment of the operator's liability is currently based on the legal requirement to make regular payments into the NA of 50 CZK per each MWh generated at the NPP for the rest of the life time of the NPP. This duty of the operator is defined by Government Decree No. 416/2002 Coll. The fee paid to the NA was calculated on the basis of an estimate of the costs of a deep geological repository. The capacity of this repository and the disposal time schedule were based on NPP Dukovany and NPP Temelín operating for a maximum of 40 years. These assumptions are included in the Czech Republic's strategy for SF and RW disposal. In the event that the operator officially decides to extend the life time of its NPP (or to build a new NPP), such a decision must first be approved by the state. Then the state will have to modify and adjust its SF and RW disposal strategy.
The updated repository cost estimates, the increased amount of electricity, and the new time schedule for the repository operation will be used for recalculating the fee to be paid per MWh generated at the NPP. When the new fee is calculated and a new government decree ordering the NPP operator to pay the updated fee comes into effect, the operator's disposal liability changes as well. For the purposes of an economic evaluation, the operator needs at least an estimate of the influence of the extended life time, or of the new NPP, on the rate of payment to the NA. For such purposes the operator uses its own model of future disposal costs and of the NA balance. A model of the NPP Dukovany life time options has shown that the effect on the rate is rather small. This was a surprising conclusion, considering the relatively small increase in the disposal costs and the substantial increase in the electricity generated. From a detailed analysis we can conclude that this is because most of the future repository costs will be covered from the proceeds of investing today's funds (past and current payments) in the nuclear account. Only about 16–17 % of the future expenditures will be covered from the operator's payments. Therefore, the additional earnings (electricity generation) and costs (additional SF and RW) do not play a substantial role when the rate is calculated. For the options considered here, the changes in the rate are shown in Fig. 6.

Fig. 6: Rate of the payment to the nuclear account (CZK/MWh) for the evaluated options.

The difference is very small, but it is in favour of extending the life time of the NPP.
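The logic of the fee recalculation can be illustrated with a toy balance model. The actual analysis used the KOPALC model of [2]; everything below is an assumed, simplified stand-in in which the fee per MWh is set so that the current balance, invested at a real yield, plus discounted future fee income just covers the discounted repository costs.

```python
# Toy nuclear-account model (illustrative numbers only, not the KOPALC model).
def required_fee(balance, real_yield, repo_costs, generation):
    """balance [mil. CZK]; repo_costs {year: mil. CZK}; generation {year: TWh}.
    Returns the break-even fee in CZK per MWh."""
    pv_cost = sum(c / (1 + real_yield) ** t for t, c in repo_costs.items())
    pv_gen_mwh = sum(g * 1e6 / (1 + real_yield) ** t
                     for t, g in generation.items())
    return max(pv_cost - balance, 0.0) * 1e6 / pv_gen_mwh

repo_costs = {t: 1500.0 for t in range(60, 100)}    # mil. CZK/year, far future
for years in (40, 50, 60):                          # the three lifetime options
    generation = {t: 25.0 for t in range(years)}    # TWh/year, hypothetical
    fee = required_fee(balance=10000.0, real_yield=0.02,
                       repo_costs=repo_costs, generation=generation)
    print(f"{years}-year life time: {fee:.1f} CZK/MWh")
```

With a large invested balance, the fee barely moves across the lifetime options, mirroring the observation that investment income, not the fee, covers most of the future expenditures.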
5.3 Decommissioning an NPP
The operator is fully liable for decommissioning (DC) its NPP. The state only verifies the cost estimates and the availability of funds for DC (a dedicated blocked account within the assets of the NPP operator). The DC liability is assessed based on the cost estimate for the DC procedure selected by the operator and approved by the nuclear regulatory body (the State Office for Nuclear Safety, SONS). The cost estimate is then verified by the state repository authority (RAWRA). Currently a deferred DC strategy is adopted, i.e. only non-contaminated or low-contaminated technology units will be dismantled immediately. Highly activated technology units (reactor, primary circuit) will be dismantled only after a long "waiting" period (the decrease in activity during this period will reduce the doses received by the workers). The cost estimate includes expenditures for technology dismantling, removal of SF and RW, decontamination, and maintenance and personnel costs during the "waiting" period. For an economic evaluation of the NPP Dukovany life time extension options, cost estimates in fixed 2003 prices have been prepared. It is important to note that by adding 10 years to the NPP operation the start of the NPP DC is also deferred by 10 years, but the end of the DC process is fixed by the date of closure of the deep geological repository, when the highly-contaminated NPP elements will be disposed of. As a result, for the individual life time options there will be different "waiting" periods. Therefore, after 60 years of NPP operation the highly activated elements will have less time to decrease their activity, and technological adjustments will have to be made, with a potential effect on the cost estimate. The final effect on the DC liability (and the connected accumulation of funds) is very small and positive for the life time extension project. Thus, if the NPP operator declares its intention to extend the life time of its NPP, it will be necessary to make technical adjustments to the DC procedure for the different time schedule. The amended procedure must then be submitted to SONS for approval. If approval is granted, the cost estimate will have to be submitted for verification to RAWRA. When this procedure has been completed, the DC liability in the operator's balance of accounts will be re-stated. A somewhat different situation will arise if the investor declares an intention to build and put into operation a new NPP. Although the preparation of the DC procedures, approval by SONS and verification of the related cost estimate are prerequisites for granting an NPP operational license, the NPP operator has no DC liability until the chain reaction in the reactor core is started for the first time.

6 Conclusions
For a current NPP, two options to extend its operational life time were evaluated from the point of view of the additional costs related to SF storage and disposal, as well as NPP decommissioning. The relevant operator liabilities were assessed. If we calculate the costs per unit of generated electricity using NPV, for the extension considered here we can conclude that the project is efficient. However, nuclear liabilities form only one item in the very complex additional costs connected with the additional investments necessary for extending the operation of an NPP. The final decision must therefore be taken only when the overall evaluation has been finalized. It is not possible at this stage to anticipate the results of the overall evaluation, but as far as the back-end liabilities are concerned there is no adverse impact on the effectiveness of extending the life time of the NPP. If the NPP operator declares a life time extension or the construction of a new NPP, there is no immediate impact on the storage liability. There is also no immediate effect on the disposal liability: the state, as the responsible entity, will have to amend its disposal strategy and order an amended fee per MWh generated at the NPP to be paid by the operator. As far as the DC liability is concerned, the effect of a life time extension would be immediate, though the change in the liability can be quantified only after the amended cost estimate has been verified by RAWRA. In the case of a new NPP, a cost estimate will have to be prepared by the operator and verified by RAWRA, but there is no DC liability until the chain reaction in the reactor is started.

References
[1] Havlíček, L.: Storage costs for considered NPP life time extension. ČEZ, a.s., internal document, February 2007.
[2] Havlíček, L.: Modeling of payment to the nuclear account. ČEZ, a.s., analytical model KOPALC 2007, March 2007.

Ing. Ladislav Havlíček
e-mail: poster2007@radio.feld.cvut.cz, havlicek.ladislav@seznam.cz
Dept. of Economics, Management and Humanities
Czech Technical University in Prague, Faculty of Electrical Engineering
Technická 2, 166 27 Praha, Czech Republic

Acta Polytechnica 62(1):197–207, 2022. https://doi.org/10.14311/ap.2022.62.0197. © 2022 The author(s).
Licensed under a CC-BY 4.0 licence. Published by the Czech Technical University in Prague.

Complex topological soliton with real energy in particle physics
Takanobu Taira
City, University of London, Department of Mathematics, Northampton Square, London EC1V 0HB, UK
correspondence: takanobu.taira@city.ac.uk

Abstract. We summarise the procedure used to find the classical masses of the Higgs particle, the massive gauge boson and the 't Hooft-Polyakov monopole in a non-Hermitian gauge field theory. Their physical regions are explored, and the mechanism behind the real value of the monopole solution is analysed in different physical regions.
Keywords: 't Hooft-Polyakov monopole, quantum field theory, non-Hermitian quantum field theory.

1. Introduction
Quantum field theory is a key tool in the analysis of particle physics. The most modern physical description of fundamental particle interactions is the model called the Standard Model. However, the model possesses several problems, such as incompatibility with general relativity, the hierarchy problem, etc. Extending the Standard Model is therefore an active area of research. Recently a growing number of research papers have started exploring non-Hermitian extensions of the Standard Model [1–14]. We have contributed to this development by analysing the Goldstone theorem [8, 10], the Higgs mechanism [9] and 't Hooft-Polyakov monopoles [11]. The classical masses of the Higgs particles, the massive gauge boson and the monopoles were analysed. However, a detailed analysis of their intersecting physical regions, and of the mechanism behind the real value of the energy of the monopole, was not carried out. The main aim of this contribution is to fill this gap.

There are two separate mechanisms that guarantee the real value of the particle masses in question. First, the masses of the Higgs particles are given by a non-Hermitian mass matrix $M$. Assume that the matrix possesses an anti-linear symmetry, which we refer to as PT symmetry, satisfying
$$[\mathcal{PT}, M] = 0, \qquad M v = \lambda v, \qquad \mathcal{PT} v = e^{i\theta} v,$$
where $\{v, \lambda\}$ are the eigenvectors and eigenvalues of the mass matrix. From this it is trivial to show that the eigenvalues are real:
$$\mathcal{PT} M v_i = \mathcal{PT}\lambda_i v_i = \lambda_i^*\,\mathcal{PT} v_i = \lambda_i^* e^{i\theta_i} v_i, \qquad \mathcal{PT} M v_i = M\,\mathcal{PT} v_i = M e^{i\theta_i} v_i = \lambda_i e^{i\theta_i} v_i.$$
It was shown in [8] that this PT symmetry is related to the CPT symmetry of the field-theoretic action. On the other hand, the classical energy of a soliton solution is found by inserting the solution into the Hamiltonian, $E = H[\phi] = \int d^3x\, \mathcal{H}(\phi)$. Therefore, the techniques from PT-symmetric quantum mechanics shown above cannot be applied. We will show below that the energies of the soliton solutions are real when the three conditions stated below hold; they are therefore sufficient conditions for the reality of the particle masses in the model, though we do not claim that they are necessary conditions.

Let $\{\phi_1, \phi_2\}$ be a set of distinct (or identical) solutions to the equations of motion $\delta\mathcal{L}/\delta\phi - \partial_\mu(\delta\mathcal{L}/\delta\partial_\mu\phi) = 0$, where $\mathcal{L}(\phi)$ is the field-theoretic Lagrangian density. The classical energies of the solutions are given by inserting the solutions into the Hamiltonian, $E_i = H[\phi_i] = \int d^3x\,\mathcal{H}(\phi_i)$ for $i \in \{1,2\}$. The classical masses of the solutions $\phi_1$ and $\phi_2$ are real if there exists some anti-linear symmetry $\mathcal{CPT}$ (note that it is not the standard CPT symmetry of quantum field theory) such that the following three conditions are satisfied:
(1.) $\mathcal{CPT}: H[\phi(x)] \to H[\mathcal{CPT}\phi(x)] = H^\dagger[\phi(-x)]$;
(2.) $\mathcal{CPT}: \phi_1(x) \to \phi_2(-x)$;
(3.) $H[\phi_1] = H[\phi_2]$.
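As a quick numerical illustration of the antilinear-symmetry argument above (not taken from the paper), consider a 2×2 non-Hermitian matrix that commutes with PT realised as complex conjugation combined with $\sigma_x$: its eigenvalues are real while the symmetry is unbroken and merge into a complex-conjugate pair when it breaks.

```python
# Eigenvalues of a PT-symmetric non-Hermitian 2x2 matrix: real in the
# unbroken regime, complex conjugate pair in the broken regime.
import numpy as np

def pt_spectrum(a, b, s):
    # M is non-Hermitian for s != 0 but satisfies (PT) M (PT)^{-1} = M,
    # with P = sigma_x and T = complex conjugation.
    M = np.array([[a + 1j * s, b], [b, a - 1j * s]])
    P = np.array([[0, 1], [1, 0]])
    assert np.allclose(P @ M.conj() @ P, M)   # check the PT symmetry of M
    return np.linalg.eigvals(M)

print(pt_spectrum(1.0, 2.0, 0.5))   # |s| < |b|: real pair a ± sqrt(b^2 - s^2)
print(pt_spectrum(1.0, 2.0, 3.0))   # |s| > |b|: complex conjugate pair
```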
If the two solutions are identical, $\phi_1 = \phi_2$, the above conditions reduce to the reality condition for the soliton solution already derived in [15]. Using the above three conditions, the reality of the classical mass follows easily:
$$\int d^3x\, \mathcal{H}[\mathcal{CPT}\phi(x)] \overset{(1)}{=} \int d^3x\, \mathcal{H}^\dagger[\phi(-x)] = M_1^\dagger, \qquad \int d^3x\, \mathcal{H}[\mathcal{CPT}\phi(x)] \overset{(2)}{=} \int d^3x\, \mathcal{H}[\phi_2(-x)] = M_2,$$
$$\implies M_1^\dagger = M_2 \overset{(3)}{\implies} M_1^\dagger = M_1,$$
where the numbers above the equality signs indicate the condition used.

The above analysis can be performed directly on the complex model. However, the non-Hermitian theory is only well defined once the inner product is identified. The modern formulation of well-defined non-Hermitian quantum mechanics was first realised by Frederik Scholtz, Hendrik Geyer and Fritz Hahne in 1992 [16]. The authors used a mathematical condition on operators, called quasi-Hermiticity (the term was first coined in [17], but the metric was not given there), to define a positive-definite inner product. Quasi-Hermiticity is defined as a condition on a bounded linear operator of the Hilbert space, $A: \mathcal{H} \to \mathcal{H}$, which satisfies
(i) $\langle v|\rho v\rangle > 0$ for all $|v\rangle \in \mathcal{H}$, $|v\rangle \neq 0$;
(ii) $\rho A = A^\dagger\rho$,
where the bounded Hermitian linear operator $\rho: \mathcal{H} \to \mathcal{H}$ is often called the metric operator, because the inner product defined by $\langle\cdot|\cdot\rangle_\rho := \langle\cdot|\rho\,\cdot\rangle$ restores the Hermiticity of the operator. This can be shown using condition (ii):
$$\langle v|A w\rangle_\rho \equiv \langle v|\rho A w\rangle = \langle v|A^\dagger\rho w\rangle = \langle Av|\rho w\rangle = \langle Av|w\rangle_\rho,$$
for all $|v\rangle, |w\rangle \in \mathcal{H}$. Note that quasi-Hermiticity alone does not guarantee a real energy spectrum of the Hamiltonian. In fact, one requires two extra conditions:
(iii) the metric operator is invertible;
(iv) $\rho = \eta^\dagger\eta$.
An operator which satisfies only conditions (ii) and (iv) is referred to as a pseudo-Hermitian operator, first introduced in [18]. These extra conditions were considered in [16] to prove that, given a set of pseudo-Hermitian operators $\mathcal{A} = \{A_i\}$, the metric operator $\rho_\mathcal{A}$ which satisfies conditions (i), (ii), (iii) and (iv) for all operators of the set $\mathcal{A}$ is uniquely determined if and only if all operators of the set $\mathcal{A}$ are irreducible on the Hilbert space $\mathcal{H}$. This procedure is analogous to the Dyson mapping, first introduced by Freeman Dyson [19] and used in the study of nuclear reactions [20–22], which maps a non-Hermitian operator $A$ to a Hermitian operator $\eta^{-1}A\eta$ via the Dyson map $\eta$. The relation between the metric operator and the Dyson map is found by utilising the Hermiticity of the expression $\eta^{-1}A\eta$:
$$\eta^{-1}A\eta = \left(\eta^{-1}A\eta\right)^\dagger \implies A\eta^\dagger\eta = \eta^\dagger\eta A^\dagger \implies \eta^\dagger\eta = \rho. \tag{1}$$
We will utilise this mapping to transform the non-Hermitian field-theoretic Hamiltonian into a Hermitian Hamiltonian. This procedure will resolve the issues of the complex vacuum solution and of Derrick's scaling argument, as we will see below. However, we note that the Dyson map used here introduces a negative sign in the kinetic term of one of the fields, indicating a ghost-field problem. This issue is removed if one further diagonalises the Hamiltonian. Such a diagonalisation can be realised via a field redefinition or via another Dyson map. A more detailed discussion can be found in [23], and a Dyson map which diagonalises the free part of the non-Hermitian Hamiltonian is found in [12].
2. Methods
In this section we summarise the method used in [8–11] to find the masses of the Higgs particles, the massive gauge particles and the 't Hooft-Polyakov monopoles in a non-Hermitian gauge field theory. We note that the explicit forms of the similarity transformation will not be discussed in this paper, as the non-Hermitian and Hermitian theories are isospectral as long as the CPT symmetry is preserved for the Hamiltonian, the Higgs particles and the monopole solution. We begin with the non-Hermitian local $SU(2)$ gauge theory with matter fields in the adjoint representation,
$$\mathcal{L}_{ad2} = \frac{1}{4}\mathrm{Tr}(D\phi_1)^2 + \frac{m_1^2}{4}\mathrm{Tr}(\phi_1^2) - i\frac{\mu^2}{2}\mathrm{Tr}(\phi_1\phi_2) - \frac{g}{64}\left[\mathrm{Tr}(\phi_1^2)\right]^2 + \frac{1}{4}\mathrm{Tr}(D\phi_2)^2 + \frac{m_2^2}{4}\mathrm{Tr}(\phi_2^2) - \frac{1}{8}\mathrm{Tr}(F^2). \tag{2}$$
Here we take $g, \mu \in \mathbb{R}$, $m_i \in \mathbb{R}$ and discrete values $c_i \in \{-1, 1\}$. The two fields $\{\phi_i\}_{i=1,2}$ are Hermitian matrices $\phi_i(t,\vec x) \equiv \phi_i^a(t,\vec x)T^a$, where $\phi_i^a(t,\vec x)$ is a real-valued field. The three generators $\{T^a\}_{a=1,2,3}$ of $SU(2)$ in the adjoint representation are defined by three Hermitian matrices of the form $(T^a)_{bc} = -i\epsilon_{abc}$, satisfying the commutation relation $[T^a, T^b] = i\epsilon_{abc}T^c$. One can check that $\mathrm{Tr}(T^aT^b) = 2\delta_{ab}$. The field strength tensor is defined as $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu - ie[A_\mu, A_\nu]$, where the gauge fields are $A_\mu = A_\mu^aT^a$. The partial derivative is replaced with the covariant derivative $(D_\mu\phi_i)^a := \partial_\mu\phi_i^a + e\varepsilon_{abc}A_\mu^b\phi_i^c$ to compensate for the local symmetry group $SU(2)$. This action is invariant under the local $SU(2)$ transformation of the matter fields, $\phi_i \to e^{i\alpha^a(x)T^a}\phi_i e^{-i\alpha^a(x)T^a}$, and gauge fields, $A_\mu \to e^{i\alpha^a(x)T^a}A_\mu e^{-i\alpha^a(x)T^a} + \frac{1}{e}\partial_\mu\alpha^a(x)T^a$. It is also symmetric under a modified CPT symmetry, which transforms the two fields $\phi_1$ and $\phi_2$ as
$$\mathcal{CPT}: \ \phi_1(t,\vec x) \to \phi_1(-t,-\vec x), \qquad \phi_2(t,\vec x) \to -\phi_2(-t,-\vec x), \qquad i \to -i. \tag{3}$$
The equations of motion for the fields $\phi_i$ and $A_\mu$ of the Lagrangian (2) are
$$(D_\mu D^\mu\phi_i)^a + \frac{\delta V}{\delta\phi_i^a} = 0, \qquad D_\nu F^{\nu\mu}_a - e\epsilon_{abc}\phi_1^b(D^\mu\phi_1)^c + e\epsilon_{abc}\phi_2^b(D^\mu\phi_2)^c = 0, \tag{4}$$
where repeated indices are summed over. We perform the similarity transformation of the complex Lagrangian (2) by momentarily resorting to a quantum theory, where we assume an equal-time commutation relation between the fields $\phi_i^a$ and their canonical momenta $\pi_i^a = \partial_0\phi_i^a$, satisfying $[\phi_i^a(t,\vec x), \pi_j^b(t,\vec y)] = \delta(\vec x - \vec y)\delta_{ij}\delta_{ab}$. Using this relation, we can transform the corresponding complex Hamiltonian of the Lagrangian (2) by
$$H \to e^{\eta_\pm}He^{-\eta_\pm}, \qquad \eta_\pm = \prod_{a=1}^{3}\exp\left(\pm\frac{\pi}{2}\int d^3x\,\pi_2^a\phi_2^a\right), \tag{5}$$
where $H$ is the field-theoretic Hamiltonian of our model (2), obtained via a Legendre transformation. The non-uniqueness of the metric implied by the two choices $\eta_\pm$ is analogous to the non-uniqueness of the metric and its connection to the observables in the quantum-mechanical setting discussed in [16, 24]. The adjoint action of $\eta_\pm$ maps the complex action (2) into the following real action:
$$S = \int d^4x\ \frac{1}{4}\mathrm{Tr}(D\phi_1)^2 - \frac{1}{4}\mathrm{Tr}(D\phi_2)^2 + c_1\frac{m_1^2}{4}\mathrm{Tr}(\phi_1^2) - c_2\frac{m_2^2}{4}\mathrm{Tr}(\phi_2^2) - c_3\frac{\mu^2}{2}\mathrm{Tr}(\phi_1\phi_2) - \frac{g}{64}\left(\mathrm{Tr}(\phi_1^2)\right)^2 - \frac{1}{8}\mathrm{Tr}(F^2) \equiv \int d^4x\ \frac{1}{4}\mathrm{Tr}(D\phi_1)^2 - \frac{1}{4}\mathrm{Tr}(D\phi_2)^2 - V - \frac{1}{8}\mathrm{Tr}(F^2). \tag{6}$$
The parameter $c_3$ indicates the two different similarity transformations, taking the values $\pm 1$ for $\eta_\pm$, respectively. For convenience, let us rewrite the real action in terms of the components of the fields $\phi_i^a$:
$$\mathcal{L}_{ad2} = \frac{1}{2}(D_\mu\phi_i)^a\mathcal{I}_{ij}(D^\mu\phi_j)^a + \frac{1}{2}\phi_i^aH_{ij}\phi_j^a - \frac{g}{16}\left(\phi_i^aE_{ij}\phi_j^a\right)^2 - \frac{1}{4}F_{\mu\nu}^aF^{a\mu\nu}, \tag{7}$$
where the matrices $H$, $\mathcal{I}$ and $E$ are defined as
$$H := \begin{pmatrix} m_1^2 & -\mu^2 \\ -\mu^2 & -m_2^2 \end{pmatrix}, \qquad \mathcal{I} := \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad E := \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}. \tag{8}$$
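The adjoint-representation conventions quoted above are easy to verify numerically. The following sketch (not from the paper) checks $(T^a)_{bc} = -i\epsilon_{abc}$, the commutation relation and the trace normalisation.

```python
# Check the su(2) adjoint generators: (T^a)_{bc} = -i eps_{abc},
# [T^a, T^b] = i eps_{abc} T^c and Tr(T^a T^b) = 2 delta^{ab}.
import numpy as np

eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k] = 1.0     # even permutations
    eps[i, k, j] = -1.0    # odd permutations

T = [-1j * eps[a] for a in range(3)]   # (T^a)_{bc} = -i eps_{abc}

for a in range(3):
    for b in range(3):
        comm = T[a] @ T[b] - T[b] @ T[a]
        expected = 1j * sum(eps[a, b, c] * T[c] for c in range(3))
        assert np.allclose(comm, expected)                     # commutator
        assert np.isclose(np.trace(T[a] @ T[b]).real, 2.0 * (a == b))
print("su(2) adjoint conventions verified")
```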
2.1. Higgs and gauge masses
Next, we define the trivial solution of the equations of motion by solving $\delta V = 0$ and $D_\mu\phi_i = 0$. Such a vacuum is often referred to as the Higgs vacuum. The first equation can be simplified by choosing the ansatz $(\phi_i^0)^a(t,\vec x) = h_i^0\,\hat r^a(\vec x)$, where $\hat r = (x,y,z)/\sqrt{x^2+y^2+z^2}$ and the $\{h_i^0\}$ are constants to be determined. Note that the vacuum solution has a rotational $SO(3)$ symmetry, since $\hat r^a\hat r^a = 1$. Inserting this ansatz into (7), we find
$$V = -\frac{1}{2}h_iH_{ij}h_j + \frac{g}{16}h_1^4. \tag{9}$$
The vacuum equation $\delta V = 0$ then reduces to simple coupled third-order algebraic equations,
$$\frac{g}{4}(h_1^0)^3 - c_1m_1^2h_1^0 + c_3\mu^2h_2^0 = 0, \qquad c_2m_2^2h_2^0 + c_3\mu^2h_1^0 = 0, \qquad D_\mu\phi_\alpha = 0. \tag{10}$$
The resulting vacuum solutions are
$$h_2^0 = -\frac{c_2c_3\mu^2}{m_2^2}h_1^0, \qquad (h_1^0)^2 = 4\,\frac{c_2\mu^4 + c_1m_1^2m_2^2}{gm_2^2} := R^2, \qquad (A_i^0)^a = -\frac{1}{er}\epsilon_{iaj}\hat r^j + \hat r^a\mathcal{A}_i, \qquad (A_0^0)^a = 0, \tag{11}$$
where the $\mathcal{A}_i$ are arbitrary functions of space-time. The Higgs particles can be identified with the fundamental fields of the theory after the spontaneous breaking of the continuous $SU(2)$ symmetry, by Taylor expanding around the vacuum solution. Performing a Taylor expansion around the Higgs vacuum and focusing on the second-order terms containing only the matter fields $\phi_i^a$, the real Lagrangian (7) contains the term
$$S = \int d^4x\ \frac{1}{2}\phi_i^a\left(-\partial_\mu\partial^\mu\mathcal{I}_{ij}\delta_{ab} - \tilde H_{ij}^{ab}\right)\phi_j^b + \ldots,$$
where $\tilde H$ is a $6\times 6$ block-diagonal Hermitian matrix. After diagonalising this term by redefining the fields with the eigenvectors of the non-Hermitian mass matrix $M_{ij}^{ab} := \mathcal{I}_{ik}\delta^{ac}\tilde H_{kj}^{cb}$, we find that the masses of the fundamental fields after the symmetry breaking are equal to the eigenvalues of $M_{ij}^{ab}$, given as
$$m_0^2 = c_2\,\frac{\mu^4 - m_2^4}{m_2^2}, \qquad m_\pm^2 = K \pm\sqrt{K^2 + 2L}, \tag{12}$$
where $K = c_1m_1^2 - c_2\frac{m_2^2}{2} + \frac{3\mu^4}{2c_2m_2^2}$ and $L = \mu^4 + c_1c_2m_1^2m_2^2$. Notice that we only find three non-zero eigenvalues. The redefined fields with zero masses (eigenvalues) are called Goldstone fields, which can be absorbed into the gauge fields $A_\mu^a$ by defining new massive gauge fields. This process of giving mass to previously massless fields is called the Higgs mechanism. The mass of the gauge fields can be found by expanding the kinetic term of $\phi$ around the Higgs vacuum $\phi_i^0 = h_i^0\hat r^a$. Without loss of generality, we can choose a particular direction of the vacuum by taking $\hat r = (0,0,1)^T$; this is possible due to the $SO(3)$ symmetry of the vacuum, as discussed above. Keeping only the terms quadratic in the gauge field, we find
$$\frac{1}{2}(D_\mu\phi_i + D_\mu\phi_i^0)^a\mathcal{I}_{ij}(D^\mu\phi_j + D^\mu\phi_j^0)^a = \frac{1}{2}\left(eA_\mu\times\phi_i^0\right)^a\mathcal{I}_{ij}\left(eA^\mu\times\phi_j^0\right)^a + \ldots = \frac{1}{2}e^2h_i^0\mathcal{I}_{ij}h_j^0\left(A_\mu^1A^{1\mu} + A_\mu^2A^{2\mu}\right) + \ldots = \frac{1}{2}m_g^2\left(A_\mu^1A^{1\mu} + A_\mu^2A^{2\mu}\right) + \ldots, \tag{13}$$
where the mass of the gauge field is identified as $m_g := e\sqrt{h_i^0\mathcal{I}_{ij}h_j^0} = \frac{eR}{m_2^2}\sqrt{m_2^4 - \mu^4}$.

2.2. 't Hooft-Polyakov monopole
To find the monopole solutions, let us consider the following ansatz:
$$(\phi_i^{cl})^a(\vec x) = h_i(r)\,\hat r^a, \qquad (A_i^{cl})^a = \epsilon_{iaj}\hat r^j a(r), \qquad (A_0^{cl})^a = 0, \tag{14}$$
where the subscript $cl$ denotes classical solutions of the equations of motion (4). The difference between this ansatz (14) and the Higgs vacuum (11) is that the quantity $h_i$ now depends on the spatial radius, $h_i = h_i(r)$. Here we consider only the static ansatz, to simplify the calculation, but one may, of course, also consider a time-dependent solution by utilising the Lorentz symmetry of the model and performing a Lorentz boost.
according to derrick’s scaling argument [25], for the monopole solution to have finite energy, we require the two matter fields of the equation (14) to approach the vacuum solutions in the equation (11) at spatial infinity lim r→∞ h1(r) = h0±1 = ±r , (15) lim r→∞ h2(r) = h0±2 = ∓ c2c3µ 2 m22 r. also, notice that at some fixed value of the radius r, the vacuum solutions ϕ0α and monopole solutions ϕclα both belongs to the 2-sphere in the field configuration space. for example, ϕ01 belongs to the 2-sphere with radius r because (ϕ01)2 = r2. therefore, solutions ϕcli can be seen as a mapping between 2-sphere in spacetime (where the radius is given by the profile function hi) to 2-sphere in field configuration space. such mapping has a topological number called the winding number n ∈ z, which can be explicitly realised by redefining the unit vector r̂a as r̂an =   sin(θ) cos(nφ)sin(θ) sin(nφ) cos(θ)   . (16) therefore different n represent topologically inequivalent solutions. since we require the monopole and vacuum solutions to smoothly deformed into each other at spacial infinity, both solutions need to share the same winding number. it is important to note that winding numbers of ϕ1 and ϕ2 need to be equal to satisfy dϕ1 = dϕ2 = 0, and therefore we will denote the winding numbers of ϕ1 and ϕ2 as n collectively. if they are not equal, we would have dϕ1 = 0 but dϕ2 ̸= 0. next, let us insert our ansatz equation (14) into the equations of motion equation (4). we will also redefine the ansatz for the gauge fields to be aai = ϵ aibr̂b ( 1−u(r) er ) , aa0 = 0, which are more in line with the original ansatz given in [26, 27], compared to equation (14). inserting these expressions into the equations of motion equation (4), we find u ′′ (r) + u(r) [ 1 − u2(r) ] r2 (17) + e2u(r) 2 { h22(r) − h 2 1(r) } = 0, h ′′ 1 (r) + 2h ′ 1(r) r − 2h1(r)u 2(r) r2 (18) +g { −c1 m21 g h1(r) + c3 µ 2 g h2(r) + 14 h 3 1(r) } = 0, h ′′ 2 (r) + 2h ′ 2(r) r − 2h2(r)u2(r) r2 (19) +c2m22 { h2(r) + c3 µ2 m22 h1(r) } = 0. notice that these differential equations are similar to the ones discussed in [26, 27], but with the extra field h2 and extra differential equation (19). in the hermitian model, the exact solutions to the differential equations were found by taking the parameter limit called the bps limit [26, 27], where parameters in the hermitian model are taken to zero while keeping the vacuum solution finite. here we will follow the same procedure and take the parameter limit where quantities in the curly brackets of equations (18) and (19) vanish but keep the vacuum solutions equation (11) finite. we will see in section 2.4 that we also find the approximate solutions in this limit. 2.3. the energy bound surprisingly, by utilising derrick’s scaling argument, one can find the lower bound of the monopole energy without the explicit form of the solution. the energy of the monopole can be found by inserting the monopole solution into the corresponding hamiltonian of equation (6). h = ∫ d3x t r ( e2 ) + t r ( b2 ) (20) +t r { (d0ϕ1)2 } + t r { (diϕ1)2 } −t r { (d0ϕ2)2 } − t r { (diϕ2)2 } + v, where e, b are eia = fa0i , bia = − 12 ϵ ijkf jka , i, j, k ∈ {1, 2, 3}. the gauge is fixed to be the radiation gauge (i.e aa0 = 0, ∂iaai = 0). notice that our monopole ansatz equation (10) is static with no electric charge eai = 0 and therefore, the above hamiltonian reduces to e = ∫ d3x t r ( b2 ) + t r { (diϕ1)2 } (21) −t r { (diϕ2)2 } + v = 2 ∫ d3x bi abi a + (diϕ1)a(diϕ1)a −(diϕ2)a(diϕ2)a + 1 2 v. 
Here we have simplified the expression by dropping the superscripts, $A_i^{cl} \to A_i$, $\phi_\alpha^{cl} \to \phi_\alpha$, keeping in mind that these fields depend on the winding number $n \in \mathbb{Z}$. In the Hermitian model (i.e. when $\phi_2 = 0$), one can rewrite the kinetic term as $B^2 + (D\phi)^2 = (B - D\phi)^2 + 2BD\phi$ and find the lower bound $\int 2BD\phi$. Here we follow a similar procedure, but introduce arbitrary constants $\alpha, \beta \in \mathbb{R}$ with $\alpha^2 - \beta^2 = 1$, so that $B^2 = \alpha^2B^2 - \beta^2B^2$. This allows us to rewrite the energy as
$$E = 2\int d^3x\ \alpha^2\left\{B_i^a + \frac{1}{\alpha}(D_i\phi_1)^a\right\}^2 - \beta^2\left\{B_i^a + \frac{1}{\beta}(D_i\phi_2)^a\right\}^2 + 2\left\{-\alpha B_i^a(D_i\phi_1)^a + \beta B_i^a(D_i\phi_2)^a\right\} + \frac{1}{2}V. \tag{22}$$
To proceed from here, we need to assume extra constraints on $\alpha$ and $\beta$ such that the following inequalities hold:
$$\int d^3x\ \alpha^2\left\{B_i^a + \frac{1}{\alpha}(D_i\phi_1)^a\right\}^2 - \beta^2\left\{B_i^a + \frac{1}{\beta}(D_i\phi_2)^a\right\}^2 \geq 0, \qquad \int d^3x\ V \geq 0. \tag{23}$$
With these constraints we can now write down the lower bound of the monopole energy,
$$E \geq 2\int d^3x\ \left\{-\alpha B_i^a(D_i\phi_1)^a + \beta B_i^a(D_i\phi_2)^a\right\} = 2\int d^3x\ \left\{-\alpha\,\partial_i\!\left(B_i^a\phi_1^a\right) + \beta\,\partial_i\!\left(B_i^a\phi_2^a\right)\right\} = \lim_{R\to\infty}\left(-2\alpha\int_{S_R}dS_i\,B_i^a\phi_1^a + 2\beta\int_{S_R}dS_i\,B_i^a\phi_2^a\right), \tag{24}$$
where, in the second step, we used $D_iB_i^a = 0$, which follows from the Bianchi identity $D_\mu\epsilon^{\mu\nu\rho\sigma}F_{\rho\sigma}^a = 0$, to convert $B_i^a(D_i\phi_\alpha)^a$ into a total derivative; the last expression is obtained using the Gauss theorem at some fixed value of the radius $R$. Since the $\phi_\alpha^a$ in the integrand are then only evaluated on a 2-sphere of large radius, we can use the asymptotic conditions (15) and replace the monopole solutions $\{\phi_\alpha^a, B_i^a\}$ with the Higgs vacuum $\{(\phi_\alpha^0)^a, (B_i^0)^a\}$:
$$E \geq \left(-2\alpha(\phi_1^0)^a + 2\beta(\phi_2^0)^a\right)\lim_{R\to\infty}\int_{S_R}dS_i\,(B_i^0)^a = \left(\mp 2\alpha R\,\hat r_n^a \mp 2\beta\frac{c_2c_3\mu^2}{m_2^2}R\,\hat r_n^a\right)\lim_{R\to\infty}\int_{S_R}dS_i\,(B_i^0)^a, \tag{25}$$
where the upper and lower signs of the energy correspond to the upper and lower signs of the vacuum solutions in (11). The explicit value of $(B_i^0)^a$ is obtained by inserting the Higgs vacuum (11) into the definition of the magnetic field,
$$B_i^a = -\frac{1}{2}\epsilon_{ijk}\left(\partial_jA_k - \partial_kA_j + eA_j\times A_k\right)^a. \tag{26}$$
After a lengthy calculation, this expression simplifies to $(B_i^0)^a = \hat\phi_0^a\,b_i = \hat r_n^a\,b_i$, where $\hat\phi_0^a$ is the normalised solution, $\sum_a\hat\phi_0^a\hat\phi_0^a = 1$, and $b_i$ is defined as
$$b_i \equiv -\frac{1}{2}\epsilon_{ijk}\left\{\partial_ja_k - \partial_ka_j + \frac{1}{e}\,\hat r_n\cdot\left(\partial_j\hat r_n\times\partial_k\hat r_n\right)\right\}, \tag{27}$$
where $a$ was defined in (11). Notice that integrating the curl of $a$ over the 2-sphere gives zero by Stokes' theorem, $\int_S\partial\times a = \int_{\partial S}a = 0$ (one can show that Stokes' theorem on a closed surface gives zero by dividing the sphere into two open surfaces). The remaining term is a topological term, which can be evaluated as
$$\int dS_i\,b_i = -\frac{4\pi n}{e}. \tag{28}$$
The explicit calculation can be found in [28]. This is the magnetic charge of the monopole solution; the integer $n$, which corresponds to the winding number of the solution, comes from the ansatz $B_i^a = \hat\phi_0^ab_i$. In our case there is an ambiguity whether to choose $B_i^a = \hat\phi_1^{0\,a}b_i$ or $B_i^a = \hat\phi_2^{0\,a}b_i$. Now we see explicitly why we chose to keep the same integer value for the solutions $\phi_1^0$ and $\phi_2^0$: if the integer values of $\hat r_n^a$ in the solutions $\phi_1^0$ and $\phi_2^0$ were different, the integrals $\int_{S_R}dS_i(B_i^0)^a$ would differ, leading to an inconsistent energy.
Finally, we find our lower bound on the monopole energy:
$$E \geq \mp 2R\left(\alpha + \beta\frac{c_2c_3\mu^2}{m_2^2}\right)\hat r_n^a\hat r_n^a\left(-\frac{4\pi n}{e}\right) = \pm\frac{8\pi nR}{e}\left(\alpha + \beta\frac{c_2c_3\mu^2}{m_2^2}\right). \tag{29}$$
Notice that we still have some freedom in choosing $\alpha, \beta \in \mathbb{R}$, as long as our initial assumptions (23) are satisfied. We will see in the next section that we can take a parameter limit of the model which saturates the above inequality and assigns exact values to $\alpha$ and $\beta$.

2.4. The fourfold BPS scaling limit
Our main goal is now to solve the coupled differential equations (17)–(19). Prasad, Sommerfield and Bogomolny [26, 27] managed to find an exact solution by taking a parameter limit which simplifies the differential equations. The multiple scaling limit is taken so that all the parameters of the model tend to zero while certain combinations of the parameters remain finite; the combinations are chosen so that the vacuum solutions stay finite in this limit. Inspired by this, we take here the fourfold scaling limit
$$g, m_1, m_2, \mu \to 0, \qquad \frac{m_1^2}{g} < \infty, \qquad \frac{\mu^2}{g} < \infty, \qquad \frac{\mu^2}{m_2^2} < \infty. \tag{30}$$
This ensures that the vacuum solutions (11) stay finite but, crucially, the curly-bracket parts of equations (18) and (19) vanish. There is a physical motivation for this limit, in which the mass ratio of the Higgs and gauge masses is taken to zero (i.e. $m_{Higgs} \ll m_g$), as described in [29]. We will see in the next section that the same type of behaviour is present in our model, hence justifying (30). The resulting set of differential equations after taking the BPS limit is similar to the ones considered in [26, 27], with a slightly different quadratic term in equation (17). It is natural to consider an ansatz similar to that given in [26, 27],
$$u(r) = \frac{evr}{\sinh(evr)}, \qquad h_1(r) = -\alpha f(r), \qquad h_2(r) = -\beta f(r), \tag{31–33}$$
where $\alpha, \beta \in \mathbb{R}$ were introduced in Section 2.3 and $f(r) \equiv v\coth(evr) - \frac{1}{er}$. One can check that this ansatz indeed satisfies the differential equations (17)–(19) in the BPS limit. We have put prefactors $\alpha$ and $\beta$ in front of (32) and (33) in order to satisfy the differential equation (17). Note that for $\alpha = 1$ we recover exactly the solution given in [26, 27], which is known to satisfy the first-order differential equation called the Bogomolny equation, $B_i - D_i\phi = 0$. The ansatz (31)–(33) differs from that of [26, 27] only by the prefactors $\alpha$ and $\beta$, and therefore our ansatz satisfies the Bogomolny equations with the appropriate scaling to cancel the prefactors in (32) and (33):
$$B_i^b + \frac{1}{\alpha}(D_i\phi_1)^b = 0, \qquad B_i^b + \frac{1}{\beta}(D_i\phi_2)^b = 0, \tag{34, 35}$$
where $\phi_\alpha \equiv h_\alpha(r)\hat r_n$. If we compare these equations with the terms appearing in the monopole energy (22), we can saturate the inequality (29):
$$E[\phi_1, \phi_2] = \pm\frac{8\pi nR}{e}\left(\alpha + \beta\frac{c_2c_3\mu^2}{m_2^2}\right), \tag{36}$$
where the upper and lower signs correspond to the vacuum solutions (11) when taking the square root. We can calculate the explicit forms of $\alpha$ and $\beta$ by comparing the asymptotic conditions (15),
$$\lim_{r\to\infty}h_1^\pm = h_1^{0\pm} = \pm R, \qquad \lim_{r\to\infty}h_2^\pm = h_2^{0\pm} = \mp\frac{c_2c_3\mu^2}{m_2^2}R, \tag{37}$$
with the asymptotic values of (31)–(33),
$$\lim_{r\to\infty}u(r) = 0, \qquad \lim_{r\to\infty}h_1^\pm(r) = -\alpha v, \qquad \lim_{r\to\infty}h_2^\pm(r) = -\beta v. \tag{38}$$
By Derrick's scaling argument, the two sets of asymptotic values (37) and (38) must match, resulting in algebraic equations for $\alpha$ and $\beta$. Using $\alpha^2 - \beta^2 = 1$ and assuming $m_2^4 \geq \mu^4$, we find four sets of real solutions,
$$\alpha = \mp(\pm)\,\frac{m_2^2}{l}, \qquad v = (\pm)\,\frac{Rl}{m_2^2}, \qquad \beta = \pm(\pm)\,\frac{c_2c_3\mu^2}{l}, \tag{39}$$
where $l = \sqrt{m_2^4 - \mu^4}$.
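As a short numerical check (not from the paper), one can verify that the ansatz (31)–(33), with $\alpha$, $\beta$ and $v$ taken from one sign branch of (39), reproduces the asymptotic values (15). The coupling values below are assumed illustrative numbers with $m_2^4 > \mu^4$.

```python
# Evaluate the BPS profile functions and check the large-r asymptotics (15).
import numpy as np

e, R = 2.0, 1.0
m2, mu, c2, c3 = 1.0, 0.8, 1.0, 1.0
l = np.sqrt(m2**4 - mu**4)
alpha = -m2**2 / l                 # one sign branch of (39)
beta = c2 * c3 * mu**2 / l
v = R * l / m2**2

f = lambda r: v / np.tanh(e * v * r) - 1.0 / (e * r)   # f(r) of (31)-(33)
u = lambda r: e * v * r / np.sinh(e * v * r)

r = 200.0                          # stands in for spatial infinity
print("h1 ->", -alpha * f(r), " expected", R)
print("h2 ->", -beta * f(r), " expected", -c2 * c3 * mu**2 / m2**2 * R)
print("u  ->", u(r), " expected 0")
```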
The plus-minus signs in the brackets correspond to the two possible solutions of the algebraic equation $\alpha^2 - \beta^2 = 1$. These need to be distinguished from the upper and lower signs of $\alpha$ and $\beta$, which correspond to the vacuum solutions (11). Inserting the explicit values of $\alpha$ and $\beta$ into the energy (36), we find
$$E[\phi_1, \phi_2] \equiv (\pm)\,\frac{8\pi nR}{em_2^2}\left(\frac{-m_2^4 + \mu^4}{l}\right) = -(\pm)\,\frac{8\pi nR}{em_2^2}\,l, \tag{40}$$
with the corresponding solutions
$$h_1^\pm(r) = \pm(\pm)\,\frac{m_2^2}{l}\left[\frac{Rl}{m_2^2}\coth\!\left(\frac{eRl}{m_2^2}\,r\right) - \frac{1}{er}\right], \qquad h_2^\pm(r) = \mp(\pm)\,\frac{c_2c_3\mu^2}{l}\left[\frac{Rl}{m_2^2}\coth\!\left(\frac{eRl}{m_2^2}\,r\right) - \frac{1}{er}\right]. \tag{41}$$
It is crucial to note that, although it seems as if there are two monopole solutions $\{h_1^\pm, h_2^\pm\}$, the two solutions are related non-trivially in their asymptotic limit by the constraint $\lim_{r\to\infty}h_2^\pm = (-c_2c_3\mu^2/m_2^2)\lim_{r\to\infty}h_1^\pm$, which follows from (10). For example, one cannot choose $\{h_1^+, h_2^-\}$ as a solution, as this would break the asymptotic constraint. The solutions (41) can be constrained further by imposing that the energy (40) be real and positive,
$$E[\phi_1,\phi_2] > 0 \implies -(\pm)\,\frac{8\pi nR}{em_2^2}\,l > 0 \implies -(\pm)\,n > 0. \tag{42}$$
We can therefore ensure positive energy if $(\pm) = \mathrm{sign}(n)$. The final form of the monopole solutions with positive energy is
$$h_1^\pm(r) = \pm\,\mathrm{sign}(n)\,\frac{m_2^2}{l}\left[\frac{Rl}{m_2^2}\coth\!\left(\frac{eRl}{m_2^2}\,r\right) - \frac{1}{er}\right], \qquad h_2^\pm(r) = \mp\,\mathrm{sign}(n)\,\frac{c_2c_3\mu^2}{l}\left[\frac{Rl}{m_2^2}\coth\!\left(\frac{eRl}{m_2^2}\,r\right) - \frac{1}{er}\right], \tag{43}$$
with energy $E = 8|n|\pi lR/(em_2^2)$. We conclude this subsection by observing that the above solution depends on the parameter $c_3$, which takes the value $-1$ or $1$ depending on the choice of the similarity transformation. Choosing a different value of $c_3$ also results in different asymptotic values (37), meaning that the solutions for $c_3 = 1$ and $c_3 = -1$ are topologically different. Since the energy is independent of $c_3$, the two distinct solutions share the same energy, respecting one of the main features of the similarity transformation, which is to preserve the energy of the transformed Hamiltonian. In the next section we investigate in detail how the solution changes, and how a new CPT symmetry emerges, as the parameter values are varied.

Figure 1. Monopole, gauge and Higgs masses plotted for $m_1^2/g = -0.44$, $\mu/g = -0.14$, $e = 2$, $c_1 = -c_2 = -1$. The solid lines represent the real parts and the dotted lines the imaginary parts of the masses. The dotted vertical lines indicate the boundaries of the physical regions, where all the masses acquire real positive values.

3. Results and discussion
In this section we investigate the behaviour of the solution (43) in different regimes of the parameter space. We compare the physical regions of the gauge particles, Higgs particles and monopoles found in the previous section. We will see that the regions coincide, but that the solutions in different regions possess different CPT symmetries. The different symmetries of the solutions in the different regions are not a coincidence, but a consequence of the three reality conditions stated in the introduction; in fact, this is deeply related to the real value of the energy, which will be discussed extensively in [30].

3.1. Higgs mass and exceptional points
Let us recall the masses of the particles and of the monopole:
$$m_0^2 = c_2\,\frac{\mu^4 - m_2^4}{m_2^2}, \qquad m_\pm^2 = K \pm\sqrt{K^2 + 2L}, \qquad m_g = \frac{eRl}{m_2^2}, \qquad m_{mono} = \frac{8|n|\pi lR}{em_2^2}, \tag{44}$$
where $K$ and $L$ are as in (12), i.e. $K = c_1m_1^2 - c_2\frac{m_2^2}{2} + \frac{3\mu^4}{2c_2m_2^2}$ and $L = \mu^4 + c_1c_2m_1^2m_2^2$, and $l = \sqrt{m_2^4 - \mu^4}$. Notice that the masses do not depend on $c_3$, i.e. they do not depend on the similarity transformation, as expected.
we also comment that in the bps limit we have $m_0 = m_\pm = 0$, but $m_g$ and $m_{mono}$ stay finite, such that the ratios $m_{higgs}/m_g$ vanish in the bps limit. this is in line with the hermitian case [29], providing the physical interpretation with $m_{higgs} \ll m_g$ for the bps limit. one may notice that when $c_2 = 1$, requiring a positive mass $m_0^2 > 0$ implies that $\mu^4 - m_2^4 > 0$. this means the quantity $L = \sqrt{m_2^4 - \mu^4}$ is purely imaginary. one may then discard this region as unphysical. however, we will see in the next section that there is a disconnected region beyond $\mu^4 - m_2^4 > 0$ which admits real energy, because $R$ also becomes purely complex. this is not coincidental, and in fact we will see an emerging new cpt symmetry for the monopoles. in the rest of the section, we will exclusively focus on the monopole and gauge masses. the main message of this section is the emerging symmetry responsible for the real value of the monopole masses. the requirement to make the whole theory physical demands that we also consider the intersection of the physical regions of the monopole masses and the higgs masses. as an example, we plot all the masses of the theory in figure 1. as one can see, the intersection points of the physical regions of the higgs masses and the monopole/gauge masses are non-trivial. in fact, they are bounded by two types of exceptional points. the first type is when two masses of higgs particles coincide and form a complex conjugate pair. such a point is known as an exceptional point, where the mass matrix is non-diagonalisable and the corresponding eigenvectors coincide. the second type is when the gauge and the monopole masses vanish. interestingly, this is where one of the higgs masses also vanishes. since the mass matrix already has a zero eigenvalue, as the result of the spontaneous symmetry breaking, it seems the number of massless fields is increased. however, at this point the mass matrix is also non-diagonalisable. therefore one can not diagonalise the hamiltonian to identify the field which corresponds to the extra massless fundamental field. therefore this point is also an exceptional point. however, the eigenvalues do not become complex conjugate pairs beyond this point, and as one can see from figure 1, one of the mass squares, $m_0^2$, becomes negative, while the gauge and monopole masses become complex but with no conjugate pair. we dub such a point a zero exceptional point, to distinguish it from the standard exceptional point.

3.2. change in cpt symmetry and complex monopole solution

we begin by introducing the useful quantities $m_1^2/g \equiv x$, $\mu^2/g \equiv y$, $\mu^2/m_2^2 \equiv z$. the gauge mass, monopole mass and monopole solutions can be rewritten in terms of these quantities

$m_g = eR\sqrt{1 - z^2}\,, \qquad m_{mono} = \frac{8|n|\pi R}{e}\sqrt{1 - z^2}\,,$ (45)

figure 2. monopole and gauge masses plotted for $x = 1$, $y = 0.8$, $e = 2$, $c_1 = -c_2 = 1$. the solid line represents the real part, and the dotted line represents the imaginary part of the masses.

figure 3. both panels are plotted for $x = 1$, $y = 0.8$, $n = 1$, $e = 2$. the solid line represents the real part, and the dotted line represents the imaginary part of the masses. panel (a) shows the monopole and gauge masses against $z \geq 0$, with vertical lines indicating the location of the boundaries of the three regions. panel (b) shows the three profile functions $h_1(r)$ defined on each region indicated in panel (a).

$h_1^\pm(r) = \pm\,\frac{\mathrm{sign}(n)}{\sqrt{1 - z^2}}\left[R\sqrt{1 - z^2}\,\coth(\hat{r}) - \frac{1}{er}\right],$ (46)
$h_2^\pm(r) = \mp\,\frac{\mathrm{sign}(n)\,c_2 c_3\, z}{\sqrt{1 - z^2}}\left[R\sqrt{1 - z^2}\,\coth(\hat{r}) - \frac{1}{er}\right],$ (47)

where $R^2 = 4(c_2 z y + c_1 x)$ and $\hat{r} = eR\sqrt{1 - z^2}\,r$.
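the following minimal sketch, with the illustrative parameter values $x = 1$, $y = 0.8$ and $c_1 = -c_2 = 1$ taken from the figure captions, evaluates the masses (45) and locates the region boundaries discussed below; it is an aid to the reader, not code from the paper.

```python
import numpy as np

e, n = 2.0, 1
x, y = 1.0, 0.8            # x = m1^2/g, y = mu^2/g (values from figures 2 and 3)
c1, c2 = 1, -1             # the sign choice c1 = -c2 = 1

z = np.linspace(0.0, 2.0, 2001)                  # z = mu^2 / m2^2
R2 = 4.0 * (c2 * z * y + c1 * x)                 # R^2 = 4(c2*z*y + c1*x)
R = np.sqrt(R2.astype(complex))

root = np.sqrt((1.0 - z**2).astype(complex))
m_gauge = e * R * root                           # gauge mass, eq. (45)
m_mono = 8 * abs(n) * np.pi * R / e * root       # monopole mass, eq. (45)

# region boundaries: the zero exceptional point z^2 = 1 and the point R^2 = 0
print("zero exceptional point at z =", z[np.argmin(np.abs(z**2 - 1.0))])
print("vacuum manifold trivial (R^2 = 0) near z =", z[np.argmin(np.abs(R2))])
print("gauge mass real on region 1:",
      bool(np.all(np.abs(m_gauge[z < 1.0].imag) < 1e-12)))
```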
the monopole masses are plotted against the gauge mass for fixed parameters with $n \in \{1, 2, 3, 4\}$ in figure 2, with weak and strong couplings $e = 2$ and $e = 10$. notice that the gauge mass is smaller than any of the monopole masses for weak coupling, but when $e$ is large enough, some of the monopole masses can become smaller than the gauge mass. this is clear by inspecting the monopole and gauge masses in equation (45): the two masses coincide when $e = \sqrt{8|n|\pi}$. note that $n = 0$ is not a monopole mass, as it corresponds to the solution with zero winding number, which is topologically equivalent to the trivial solution. from figure 2, we also observe disconnected regions where both the monopole and gauge masses change from real to purely complex. a more detailed plot of this is shown in figure 3. region 2 is bounded by two points, with the lower bound $\mu^2/m_2^2 = 1$ corresponding to the zero exceptional point where the vacuum manifolds stay finite (i.e. spontaneous symmetry breaking occurs). however, the higgs mechanism fails because the hamiltonian is non-diagonalisable, as discussed in the previous section. the upper bound corresponds to the point where the vacuum manifold vanishes. therefore, the spontaneous symmetry breaking does not occur, implying that the gauge fields do not acquire a mass through the higgs mechanism, resulting in a massless gauge field. most crucially, an interesting region (denoted by region 3 in figure 4) reappears as one increases the value of $z$. the profile function in region 3 is purely complex, which signals that this may lead to complex energies. however, as one can see from figure 3, the energy is real. the reason for the real energy is that the conditions stated in the introduction hold. we will specify below the cpt symmetry responsible for the real value of the energy. note that the profile function $h_2$ only differs from $h_1$ by a constant factor in front; therefore we omitted it from the plot. another physical region is when $c_1 = -c_2 = -1$. the monopole and gauge masses for this case are plotted in figure 4. we observe an almost identical plot to figure 3, but with the real and imaginary parts swapped. the profile functions also respect these changes, as regions 1 and 3 no longer have a definite asymptotic value. the boundaries are unchanged, as one can see from figure 4.

figure 4. both panels are plotted for $x = 1$, $y = 1$, $n = 1$, $e = 2$. the solid line represents the real part, and the dotted line represents the imaginary part of the masses.

figure 5. both panels are plotted for $x = 1$, $y = 1$, $n = 1$, $e = 2$. the solid line represents the real part, and the dotted line represents the imaginary part of the masses.

finally, there is an interesting parameter point $x = y$ where region 2 vanishes (see figure 5). the two boundaries $z^2 = 1$ and $c_2 z y + c_1 x = 0$ coincide when $x = y$, and the zero exceptional point no longer exists because the spontaneous symmetry breaking does not occur in this case. next, let us explain the real value of the energies in the different regions. first, to realise the conditions 1-3 stated in the introduction, we require the following transformations:

$h_2^\pm(r) \to -h_2^\pm(r)\,,\quad h_1^\pm(r) \to h_1^\pm(r)$ in region 1,
no symmetry in region 2,
$h_2^\pm(r) \to -\left(h_2^\pm(r)\right)^*\,,\quad h_1^\pm(r) \to \left(h_1^\pm(r)\right)^*$ in region 3.
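a small numerical check of the maps just listed, under the assumption that the closed forms (46)-(47) are used as written; the parameter values are arbitrary choices for illustration.

```python
import numpy as np

e, n, R = 2.0, 1, 1.5      # illustrative values; R plays the role of the constant R
c2, c3 = -1, 1

def profiles(z, r, s):     # s = +1 or -1 labels the h^+ / h^- branch
    root = np.sqrt(complex(1.0 - z**2))
    rhat = e * R * root * r
    brack = R * root / np.tanh(rhat) - 1.0 / (e * r)
    h1 = s * np.sign(n) / root * brack                  # eq. (46)
    h2 = -s * np.sign(n) * c2 * c3 * z / root * brack   # eq. (47)
    return h1, h2

r = 0.9
for z, label in [(0.5, "region 1 (real solution)"),
                 (1.8, "region 3 (purely complex solution)")]:
    h1p, h2p = profiles(z, r, +1)
    h1m, h2m = profiles(z, r, -1)
    # region 1 map: h2 -> -h2 lands on the other branch, eq. (48);
    # region 3 map: h2 -> -(h2)* reproduces the same branch, eq. (49)
    print(label, np.isclose(-h2p, h2m), np.isclose(-np.conj(h2p), h2p))
```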
by using the explicit forms of the solutions (46) and (47), we can show that the above transformations satisfy condition 2 stated in the introduction: in region 1

$h_2^\pm(r) \to -h_2^\pm(r) = h_2^\mp(r)\,, \qquad h_1^\pm(r) \to h_1^\pm(r)\,,$ (48)

and in region 3

$h_2^\pm(r) \to -\left(h_2^\pm(r)\right)^* = h_2^\pm(r)\,, \qquad h_1^\pm(r) \to \left(h_1^\pm(r)\right)^* = h_1^\mp(r).$ (49)

notice that in regions 1 and 3, the cpt symmetry relates the two distinct solutions in two different ways. for example, $h_2^\pm$ is mapped to $h_2^\mp$ in region 1, but it is mapped to itself in region 3. finally, condition 3 stated in the introduction is satisfied because the energy does not depend on the $\pm$ signs of the solutions. this explains the real energies of the complex monopoles in region 3 and the complex energy in region 2. indeed, we observe the predicted behaviour in figure 3. region 2 is a hard barrier between two cpt-symmetric regions where the solutions are either real or purely imaginary. the same analysis can be carried out in the other physical region $c_1 = -c_2 = -1$, where the symmetry is now

no symmetry in region 1,
$h_2^\pm(r) \to -\left(h_2^\pm(r)\right)^*\,,\quad h_1^\pm(r) \to \left(h_1^\pm(r)\right)^*$ in region 2,
no symmetry in region 3. (50)

we have observed that one can find a well-defined monopole solution in two disconnected regions. however, in the full theory, where we include the higgs particles, only one of the regions is considered physical. this is because the higgs mass $m_0^2$ is either positive or negative depending on which side of $z^2 = 1$ it is defined. because the two disconnected regions are defined on either side of the zero exceptional point $z^2 = 1$, the full physical region restricts one from moving from region 1 to region 3 by changing $z$. this is most clearly seen in figure 1, where the plot of $m_0^2$ (green line) becomes negative beyond the zero exceptional point. this may imply that the purely complex monopole solution we observed is not a possible solution of the theory. however, the purely complex solution can exist in the full physical region. an example of this is shown in figure 6, where we observe that the profile function $h_1$ (and therefore $h_2$) is purely complex, and the higgs masses and gauge mass are all real and positive.

figure 6. both panels are plotted for $x = -2$, $y = -0.6$, $c_1 = -c_2 = 1$, $n = 1$, $e = 2$. the solid line represents the real part, and the dotted line represents the imaginary part of the masses.

4. conclusions

we have found the t'hooft-polyakov monopole solution (43) in the non-hermitian theory by drawing an analogue from the standard procedure in the hermitian theory. the monopole masses were plotted together with the massive gauge and higgs masses, where the physical region of the monopole masses coincided with that of the gauge mass. it was also observed that there are two distinct physical regions bounded by the zero exceptional point and the parameter limit where the vacuum manifold becomes trivial. the profile function (the radial part of the monopole solution) is plotted in figures 3, 4, 5, where it is real and purely complex in regions 1 and 3, respectively. incidentally, the cpt symmetries of the solution are different in regions 1 and 3.

acknowledgements

tt is supported by epsrc grant ep/w522351/1.

references

[1] j. alexandre, p. millington, d. seynaeve. symmetries and conservation laws in non-hermitian field theories. physical review d 96(6):065027, 2017. https://doi.org/10.1103/physrevd.96.065027.
[2] j. alexandre, j. ellis, p. millington, d. seynaeve. spontaneous symmetry breaking and the goldstone theorem in non-hermitian field theories. physical review d 98(4):045001, 2018. https://doi.org/10.1103/physrevd.98.045001.
[3] p. d. mannheim. goldstone bosons and the englert-brout-higgs mechanism in non-hermitian theories. physical review d 99(4):045006, 2019. https://doi.org/10.1103/physrevd.99.045006. [4] p. millington. symmetry properties of non-hermitian pt -symmetric quantum field theories. journal of physics: conference series 1586(1):012001, 2020. https://doi.org/10.1088/1742-6596/1586/1/012001. [5] j. alexandre, j. ellis, p. millington, d. seynaeve. spontaneously breaking non-abelian gauge symmetry in non-hermitian field theories. physical review d 101(3):035008, 2019. https://doi.org/10.1103/physrevd.101.035008. [6] j. alexandre, j. ellis, p. millington, d. seynaeve. gauge invariance and the englert-brout-higgs mechanism in non-hermitian field theories. physical review d 99(7):075024, 2019. https://doi.org/10.1103/physrevd.99.075024. [7] j. alexandre, j. ellis, p. millington. discrete spacetime symmetries and particle mixing in non-hermitian scalar quantum field theories. physical review d 102(12):125030, 2020. https://doi.org/10.1103/physrevd.102.125030. [8] a. fring, t. taira. goldstone bosons in different pt -regimes of non-hermitian scalar quantum field theories. nuclear physics b 950:114834, 2020. https://doi.org/10.1016/j.nuclphysb.2019.114834. [9] a. fring, t. taira. massive gauge particles versus goldstone bosons in non-hermitian non-abelian gauge theory, 2020. arxiv:2004.00723. [10] a. fring, t. taira. pseudo-hermitian approach to goldstone’s theorem in non-abelian non-hermitian quantum field theories. physical review d 101(4):045014, 2020. https://doi.org/10.1103/physrevd.101.045014. [11] a. fring, t. taira. ’t hooft-polyakov monopoles in non-hermitian quantum field theory. physics letters b 807:135583, 2020. https://doi.org/10.1016/j.physletb.2020.135583. [12] j. alexandre, j. ellis, p. millington. pt -symmetric non-hermitian quantum field theories with supersymmetry. physical review d 101(8):085015, 2020. https://doi.org/10.1103/physrevd.101.085015. [13] m. n. chernodub, a. cortijo, m. ruggieri. spontaneous non-hermiticity in the nambu–jonalasinio model. physical review d 104(5):056023, 2021. https://doi.org/10.1103/physrevd.104.056023. [14] m. n. chernodub, p. millington. ir/uv mixing from local similarity maps of scalar non-hermitian field theories, 2021. arxiv:2110.05289. [15] a. fring. pt -symmetric deformations of the korteweg-de vries equation. journal of physics a: mathematical and theoretical 40(15):4215–4224, 2007. https://doi.org/10.1088/1751-8113/40/15/012. [16] f. g. scholtz, h. b. geyer, f. j. w. hahne. quasi-hermitian operators in quantum mechanics and the variational principle. annals of physics 213(1):74–101, 1992. https://doi.org/10.1016/0003-4916(92)90284-s. [17] j. dieudonné. quasi-hermitian operators. proceedings of the international symposium on linear spaces, jerusalem 1960, pergamon, oxford pp. 115–122, 1961. 
[18] m. froissart. covariant formalism of a field with indefinite metric. il nuovo cimento 14:197–204, 1959. https://doi.org/10.1007/bf02724848.
[19] f. j. dyson. thermodynamic behavior of an ideal ferromagnet. physical review 102(5):1230–1244, 1956. https://doi.org/10.1103/physrev.102.1230.
[20] t. marumori, m. yamamura, a. tokunaga. on the "anharmonic effects" on the collective oscillations in spherical even nuclei. i. progress of theoretical physics 31(6):1009–1025, 1964. https://doi.org/10.1143/ptp.31.1009.
[21] s. t. beliaev, v. g. zelevinsky. anharmonic effects of quadrupole oscillations of spherical nuclei. nuclear physics 39:582–604, 1962. https://doi.org/10.1016/0029-5582(62)90416-9.
[22] d. janssen, f. dönau, s. frauendorf, r. jolos. boson description of collective states. nuclear physics a 172(1):145–165, 1971. https://doi.org/10.1016/0375-9474(71)90122-9.
[23] a. fring, t. taira. non-hermitian gauge field theories and bps limits. journal of physics: conference series 2038(1):012010, 2021. https://doi.org/10.1088/1742-6596/2038/1/012010.
[24] d. p. musumbu, h. b. geyer, w. d. heiss. choice of a metric for the non-hermitian oscillator. journal of physics a: mathematical and theoretical 40(2):f75–f80, 2007. https://doi.org/10.1088/1751-8113/40/2/f03.
[25] g. h. derrick. comments on nonlinear wave equations as models for elementary particles. journal of mathematical physics 5(9):1252–1254, 1964. https://doi.org/10.1063/1.1704233.
[26] m. k. prasad, c. m. sommerfield. exact classical solution for the 't hooft monopole and the julia-zee dyon. physical review letters 35(12):760–762, 1975. https://doi.org/10.1103/physrevlett.35.760.
[27] e. b. bogomolny. the stability of classical solutions. soviet journal of nuclear physics (english translation, united states) 24(4), 1976.
[28] j. arafune, p. g. o. freund, c. j. goebel. topology of higgs fields. journal of mathematical physics 16(2):433–437, 1975. https://doi.org/10.1063/1.522518.
[29] t. w. kirkman, c. k. zachos. asymptotic analysis of the monopole structure. physical review d 24(4):999–1004, 1981. https://doi.org/10.1103/physrevd.24.999.
[30] f. correa, a. fring, t. taira. complex bps skyrmions with real energy. nuclear physics b 971(2):115516, 2021. https://doi.org/10.1016/j.nuclphysb.2021.115516.
signal separation in ultrasonic non-destructive testing

v. matz, m. kreidl, r. šmíd

abstract
in ultrasonic non-destructive testing the signals characterizing the material structure are commonly evaluated. the sensitivity and resolution of ultrasonic systems is limited by the backscattering and electronic noise level commonly contained in the acquired ultrasonic signals. for this reason, it is very important to use appropriate advanced signal processing methods for noise reduction and signal separation. this paper compares algorithms used for efficient noise reduction in ultrasonic signals in a-scan. algorithms based on the discrete wavelet transform and the wiener filter are considered. part of this paper analyses and applies blind source separation, which has never been used in practical ultrasonic non-destructive testing. all proposed methods are evaluated on both simulated and acquired ultrasonic signals.

keywords: ultrasonic testing, de-noising algorithms, noise reduction.

1 introduction

ultrasonic non-destructive testing is commonly used for flaw detection in materials. ultrasound uses the transmission of high-frequency sound waves in a material to detect a discontinuity or to locate changes in material properties. ultrasonic wave propagation in the tested materials is essentially influenced by the structure of the material. due to the material structure, the acquired ultrasonic signal can be corrupted by a relatively high noise level, commonly called backscattering noise. another source of noise is the electronic circuitry. these noise components are generally present in all acquired ultrasonic signals, together with the flaw and back-wall echoes. the back-wall echo is due to the reflection of an ultrasonic wave from the end of the material, and the fault echo is caused by the reflection of ultrasonic waves from cracks or defects. the main task here is to detect the fault echo in an ultrasonic signal, i.e., to locate the cracks or defects in the tested materials. the flaw detection efficiency is mainly influenced by the noise level (backscattering and electronic), and for this purpose efficient signal processing techniques used for noise reduction and signal separation are proposed. in the past, many methods have been evaluated [2–5, 11] for efficient noise reduction in ultrasonic signals. the simplest method [2] is based on averaging the acquired ultrasonic signals. other popular methods are based on filters [2] with finite (fir) and infinite (iir) impulse response. these methods are quite simple, but the noise suppression is not effective. non-linear methods based on band-pass filters, known as split spectrum processing, offer greater signal-to-noise improvement, but the setting of the parameters is based on heuristic methods, with varying results. a very popular method is based on the discrete wavelet transform algorithm [3, 6, 7]. this method is very efficient, but it is important to choose the proper mother wavelet, threshold level and threshold rule [8]. other methods used for signal de-noising are based on adaptive algorithms derived from the wiener filter [9, 10]. this paper presents and evaluates methods used for ultrasonic signal de-noising: the discrete wavelet transform, the wiener filter and blind source separation with appropriate settings. these methods with selected parameters are evaluated in terms of signal-to-noise improvement and flaw detection efficiency. another method used for ultrasonic signal and noise separation is also proposed, and its applicability is discussed in detail. this method is based on independent component analysis, and it has never been applied before in the ultrasonic non-destructive testing area. the rest of this paper is structured as follows. the second section offers basic theoretical descriptions of the de-noising and signal separation methods. in the third section there is an evaluation of the methods with different parameter settings. for the case of the blind signal separation method, the appropriate configuration of ultrasonic transducers is described. based on the theoretical analysis, all methods are applied to real acquired ultrasonic signals in section four. for the evaluation, samples of materials used for constructing aircraft engines were used. finally, the results are discussed and future work is indicated.

2 de-noising and signal separation methods

2.1 discrete wavelet transform

the wavelet transform [3, 6, 7] is a multiresolution analysis technique that can be used to obtain a time-frequency representation of an ultrasonic signal. the discrete wavelet transform (dwt) analyzes the signal by decomposing it into its coarse and detailed information, which is accomplished with the use of successive high-pass and low-pass filtering and subsampling operations, on the basis of the following equations:

$y_{high}(k) = \sum_n x(n)\,g(2k - n)\,, \qquad y_{low}(k) = \sum_n x(n)\,h(2k - n)\,,$ (1)

where $y_{high}(k)$ and $y_{low}(k)$ are the outputs of the high-pass and low-pass filters with impulse responses $g$ and $h$, respectively, after subsampling by 2 (decimation). this procedure is repeated for further decomposition of the low-pass filtered signals. starting from the approximation and detailed coefficients, the inverse discrete wavelet transform reconstructs the signal, inverting the decomposition step by inserting zeros and convolving the results with the reconstruction filters. dwt can be used as an efficient de-noising method for families of signals that have a few nonzero wavelet coefficients for a given wavelet family. this is fulfilled for most ultrasonic signals. the common filtering procedure (also called de-noising) affects the signal in both frequency and amplitude, and involves three steps. the basic version of the procedure consists of:
a) decomposing the signal using dwt into n levels, using filtering and decimation to obtain the approximation and detailed coefficients,
b) thresholding the detailed coefficients,
c) reconstructing the signal from the detailed and approximation coefficients using the inverse transform (idwt).
when decomposing the signal it is important to choose a suitable mother wavelet, threshold rule and threshold level.
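the three-step procedure described above can be sketched compactly with the pywavelets package; this is a minimal illustration assuming a "dmey" wavelet and a per-level hard threshold, not the authors' matlab implementation.

```python
import numpy as np
import pywt

def dwt_denoise(x, wavelet="dmey", level=4, k=1.4):
    """decompose -> hard-threshold the detail coefficients -> reconstruct."""
    coeffs = pywt.wavedec(x, wavelet, level=level)
    out = [coeffs[0]]                       # approximation coefficients kept as-is
    for cd in coeffs[1:]:
        t = k * np.std(cd, ddof=1)          # local per-level threshold (cf. v1 below)
        out.append(pywt.threshold(cd, t, mode="hard"))
    return pywt.waverec(out, wavelet)

# toy a-scan: one gaussian-windowed "echo" buried in white noise
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 1024)
x = 0.8 * np.exp(-((t - 0.5) ** 2) / 1e-4) * np.sin(2 * np.pi * 120 * t)
y = x + 0.2 * rng.standard_normal(t.size)
print(dwt_denoise(y).shape)
```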
2.2 wiener filter based group delay statistics

the wiener filter [9, 10] is a global filter and produces an estimation of the uncorrupted signal by minimizing the mean square error between the estimated signal and the uncorrupted signal in a statistical sense. the process representing the received signal consists of signal and noise, both uncorrelated zero-mean wide-sense-stationary random processes. by filtering $y(t)$ we estimate $s(t)$ using a time-invariant linear system with transfer function $H(f)$. the resulting mean-square error will then be

$e = \int \left|1 - H(f)\right|^2 S(f)\,\mathrm{d}f + \int \left|H(f)\right|^2 N(f)\,\mathrm{d}f\,,$ (2)

where $N(f)$ and $S(f)$ are the power spectral densities of the noise and the signal. the error $e$ is minimized over $H(f)$ for fixed $S(f)$ and $N(f)$. the transfer function can be estimated by means of the group delay, the target signal having a deterministic phase delay over the working frequency range [9]. the following techniques are based on using a discrete group delay. it can be calculated by

$t(k) = -\frac{N}{2\pi}\left[\phi(k+1) - \phi(k)\right],$ (3)

where $\phi(k)$ is the phase component of the discrete fourier transform, $k$ is the frequency index and $N$ is the total number of points. to minimize the edge effect, various windows for the received time sequence are applied. to obviate discontinuities in the group delay, phase-unwrapping techniques are used. two useful variants [9] based on group delay statistics are the group delay moving standard deviation

$\sigma_k = \left[\frac{1}{2m+1}\sum_{j=k-m}^{k+m}\left(t(j) - \bar{t}\right)^2\right]^{1/2}$ (4)

and the group delay moving entropy

$h_k = \sum_{j=k-m}^{k} f_j(t)\,\log_2\frac{1}{f_j(t)}\,,$ (5)

where $\bar{t}$ is the mean group delay within the window. both estimates are computed within a moving window $m$. the window $m$ is set small compared to the data length, and reflects a trade-off between resolution and estimation error.
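a minimal sketch of a group-delay-based wiener-type filter follows; the paper does not spell out how the moving statistic (4) is mapped onto the transfer function $H(f)$, so the gain rule 1/(1+σ) and the band limits used below are assumptions made for illustration only.

```python
import numpy as np

def group_delay_wiener(y, m=8, thresh=0.4, band=(0.1, 0.35)):
    """build a wiener-like spectral gain from the moving std of the group delay.

    y      : one windowed a-scan segment
    m      : half-width of the moving window, cf. eq. (4)
    thresh : fraction of the maximal gain kept (0.4 ~ the 40 % level found best)
    band   : assumed normalised frequency band of the transducer
    """
    Y = np.fft.rfft(y * np.hamming(y.size))
    phi = np.unwrap(np.angle(Y))
    gd = -y.size / (2 * np.pi) * np.diff(phi)          # discrete group delay, eq. (3)
    sigma = np.array([np.std(gd[max(0, k - m):k + m + 1])
                      for k in range(gd.size)])        # moving std, eq. (4)
    gain = 1.0 / (1.0 + sigma)                         # small std -> coherent signal
    f = np.arange(gain.size) / y.size
    gain[(f < band[0]) | (f > band[1])] = 0.0          # restrict to transducer band
    gain[gain < thresh * gain.max()] = 0.0             # threshold the transfer fn
    H = np.append(gain, gain[-1])                      # pad back to rfft length
    return np.fft.irfft(H * Y, n=y.size)

rng = np.random.default_rng(0)
print(group_delay_wiener(rng.standard_normal(1024)).shape)
```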
2.3 blind signal separation

blind signal separation (bss) consists in recovering unobserved signals or sources from several observed mixtures. the simplest bss model assumes the existence of $n$ independent signals $s_1(t), \ldots, s_n(t)$ and the observation of as many mixtures $x_1(t), \ldots, x_n(t)$, these mixtures being linear and instantaneous. this is compactly represented by the mixing equation

$\mathbf{x}(t) = \mathbf{A}\,\mathbf{s}(t)\,,$ (6)

where $\mathbf{s}(t) = [s_1(t), \ldots, s_n(t)]^T$ is a column vector collecting the source signals, the vector $\mathbf{x}(t)$ similarly collects the $n$ observed signals, and the square mixing matrix $\mathbf{A}$ contains the mixture coefficients. the bss problem consists in recovering the source vector $\mathbf{s}(t)$ using only the observed data $\mathbf{x}(t)$, the assumption of independence between the entries of the input vector $\mathbf{s}(t)$, and possibly some a priori information about the probability distribution of the inputs. this can be formulated as the computation of an $n \times n$ separating matrix $\mathbf{B}$ whose output $\mathbf{y}(t)$,

$\mathbf{y}(t) = \mathbf{B}\,\mathbf{x}(t)\,,$ (7)

is an estimate of the vector $\mathbf{s}(t)$ of the source signals. the basic bss model can be extended in several directions, taking into account, for example, more sensors than sources, noisy observations, complex signals and mixtures. the solution of equation (7) depends on the selected algorithm. many algorithms have been published, with different results. one of the most popular algorithms, called fastica (fast independent component analysis), is described in detail in [12, 13]. before the fastica algorithm can be used, it is very important to characterize the model for the ultrasonic signal separation. the main question is how to propose the source signals $\mathbf{s}(t)$. in our study, we used two ultrasonic transducers and acquired the ultrasonic signals synchronously with the ultrasonic transducer configuration shown in fig. 1.

fig. 1: configuration of ultrasonic transducers for bss

3 theoretical results

first of all, for the detailed analysis and for performing the de-noising methods it is necessary to generate a simulated ultrasonic signal. the signal is simulated based on the amplitude and frequency analysis of a set of acquired ultrasonic signals. based on this analysis and a physical analysis of ultrasonic wave propagation, the signal was generated based on a simple clutter model using the equation [11]:

$h_{mat}(f) = \sum_{k=1}^{k_{tot}} \beta_k\,\exp(-\alpha x_k)\,\exp\!\left(\frac{i\,4\pi f\,x_k}{c_l}\right),$ (8)

where $\alpha$ is the material attenuation coefficient, $c_l$ is the velocity of the longitudinal waves, $x_k$ are the grain positions, $k = 1, \ldots, k_{tot}$, $k_{tot}$ is the number of grains, and $\beta_k$ is a random vector depending on the grain volume. an example of a generated ultrasonic signal is shown in fig. 2. the signal consists of noise (backscattering, electronic), a fault echo and a back-wall echo. first of all, the dwt de-noising algorithm was used. for efficient noise reduction, it is necessary to select the shape of the mother wavelet, the threshold level and the threshold rule [8]. the shape of the mother wavelet has to be very similar to the ultrasonic echo [6]. it has to fulfill the following properties: symmetry, orthogonality and feasibility for dwt. a group of mother wavelets was tested: haar's wavelet, the discrete meyer wavelet, daubechies' wavelets and coiflet's wavelet. in the proposed procedure, only local thresholding of the detailed coefficients was used. in the case of the thresholding rule, soft and hard thresholding can be used. according to the literature [6, 8], soft thresholding is not a proper option for noise reduction in ultrasonic signals, because the noise level and the amplitude of the fault echo are decreased by the threshold level. other options for thresholding rules are modifications of the hard thresholding rule using the following equations. the compromise thresholding rule [7] can be defined as

$\hat{t} = \begin{cases} \operatorname{sign}(t)\left(|t| - \alpha\,t_{ij}\right), & |t| > t_{ij}\,,\\ 0, & |t| \leq t_{ij}\,, \end{cases}$ (9)

where $t_{ij}$ is the threshold level for sample $i$ at level $j$ and $\alpha$ is the coefficient for a compromise between hard and soft thresholding. the custom thresholding rule [7] is defined as

$\hat{t} = \begin{cases} t - (1 - \gamma)\,t_{ij}\operatorname{sign}(t), & |t| \geq t_{ij}\,,\\ 0, & |t| \leq \gamma\,t_{ij}\,,\\ \text{(a smooth polynomial interpolation between the two branches)}, & \gamma\,t_{ij} < |t| < t_{ij}\,, \end{cases}$ (10)

where $\gamma$ is the coefficient characterizing the sample level from which the thresholding is valid. the principle of compromise and custom thresholding, together with hard and soft thresholding, is shown in fig. 3. we evaluated the common thresholding methods implemented in the matlab wavelet toolbox [7] (rigrsure, sqtwolog, heursure, minimaxi).
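for concreteness, the hard, soft and compromise rules can be written as follows; this is a minimal sketch of equation (9) and the two standard rules, with arbitrary sample values.

```python
import numpy as np

def hard(t, tij):
    """hard thresholding: keep a coefficient unchanged if it exceeds tij."""
    return np.where(np.abs(t) > tij, t, 0.0)

def soft(t, tij):
    """soft thresholding: shrink every surviving coefficient by tij."""
    return np.sign(t) * np.maximum(np.abs(t) - tij, 0.0)

def compromise(t, tij, a=0.5):
    """eq. (9): a = 0 reduces to hard, a = 1 to soft thresholding."""
    return np.where(np.abs(t) > tij, np.sign(t) * (np.abs(t) - a * tij), 0.0)

cd = np.array([-2.0, -0.3, 0.1, 0.8, 3.0])
print(hard(cd, 0.5), soft(cd, 0.5), compromise(cd, 0.5, a=0.5))
```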
due to the unsatisfactory results we proposed a new method based on the standard deviation ($v_1$) and on the mean value together with the standard deviation ($v_2$). the local thresholds at each level of decomposition are given by

$v_1 = k\sqrt{\frac{1}{n-1}\sum_{j=1}^{n}\left(cd_j - \overline{cd}\right)^2}$ (11)

and

$v_2 = \bar{\mu} + v_1\,,$ (12)

where $n$ is the length of the vector of detail coefficients, $k$ is a constant (crest factor), $cd$ is the vector of detailed coefficients, and $\bar{\mu}$ is the mean value.

fig. 2: simulated ultrasonic signal (fault echo and back-wall echo marked)

fig. 3: wavelet thresholding demonstration (hard, soft and compromise rules around the threshold level)

with the use of all the mother wavelets and the proposed threshold levels and rules, we evaluated the de-noising process on the simulated ultrasonic signals (fig. 2) by calculating two parameters. the first parameter evaluates the signal-to-noise ratio enhancement and can be expressed as

$\mathrm{SNRE} = 10\log\frac{p_1}{p_2}\,,$ (13)

where $p_1$ and $p_2$ are the powers of the noise before and after de-noising. another parameter evaluates the changes of the fault echo and the decrease of its amplitude, and can be expressed as

$k_c = r_{a_b a_a}(0)\,\frac{a_a}{a_b}\,,$ (14)

where $r$ is the cross-correlation function, and $a_b$ and $a_a$ are the fault echo amplitudes before and after de-noising. many combinations have been processed with different threshold levels, threshold rules and mother wavelets, and a fault echo within 1–100 % of the initial echo amplitude was added to the simulated ultrasonic signal. in the case of the threshold rule evaluation, the parameters $k$, $\alpha$ and $\gamma$ were changed within the appropriate range. the best results for hard, compromise and custom thresholding are shown in table 1, table 2 and table 3.

table 1: evaluation of hard thresholding

| threshold level | v1 | v1 | v1 | v1 | v2 | v2 | v2 | v2 |
| mother wavelet / parameter | db2 | db4 | db6 | dmey | db2 | db4 | db6 | dmey |
| max. dx (-) | 0.994 | 0.989 | 0.978 | 0.981 | 0.967 | 0.976 | 0.966 | 0.984 |
| max. snre (db) | 25.97 | 37.76 | 35.18 | 37.59 | 24.70 | 24.59 | 19.33 | 19.72 |
| min. af (%) | 9 | 7 | 9 | 5 | 13 | 9 | 20 | 2 |
| min. k (-) | 1.35 | 2 | 1.1 | 1.4 | 1.35 | 4.5 | 1.4 | 1.4 |

table 2: evaluation of compromise thresholding

| threshold level | v1 | v1 | v1 | v1 | v2 | v2 | v2 | v2 |
| mother wavelet / parameter | db2 | db4 | db6 | dmey | db2 | db4 | db6 | dmey |
| max. dx (-) | 0.991 | 0.991 | 0.989 | 0.991 | 0.959 | 0.967 | 0.982 | 0.976 |
| max. snre (db) | 26.76 | 32.88 | 31.09 | 31.83 | 26.70 | 32.98 | 30.34 | 30.81 |
| min. af (%) | 8 | 6 | 9 | 5 | 13 | 10 | 20 | 10 |
| min. k (-) | 1.35 | 2 | 1.1 | 1.4 | 1.35 | 4.5 | 1.4 | 1.4 |
| min. α (-) | 0.16 | 0.22 | 0.18 | 0.2 | – | – | – | – |

table 3: evaluation of custom thresholding (threshold level v1)

| mother wavelet / parameter | db2 | db4 | db6 | dmey |
| max. dx (-) | 0.869 | 0.820 | 0.887 | 0.820 |
| max. snre (db) | 26.88 | 35.72 | 29.38 | 32.23 |
| min. af (%) | 9 | 7 | 11 | 6 |
| k (-) | 1.35 | 2 | 1.1 | 1.4 |
| α (-) | 0.16 | 0.22 | 0.18 | 0.2 |
| γ (-) | 0.03 | 0.04 | 0.03 | 0.03 |

it can be seen from table 1 that in the case of hard thresholding the best results were obtained using the discrete meyer mother wavelet and the threshold level based on the standard deviation. the value snre = 37.59 db was reached, and a fault echo with an amplitude of 5 % of the initial echo amplitude was detected. other thresholding rules and mother wavelets do not provide better results. the noise was suppressed, and a fault echo equal in amplitude to the noise level was efficiently detected. the next method to be evaluated here is based on the wiener filter using group delay statistics. the wiener filter based on the group delay moving entropy and the group delay moving standard deviation was used, and the best parameters were also searched for efficient ultrasonic signal noise reduction. the main idea of using group delay statistics is that the useful signal has a constant group delay in a certain frequency range. this frequency range depends on the frequency response of the ultrasonic transducer. the de-noising efficiency can be increased by using an appropriate window with a frequency bandwidth and a threshold level. in our evaluation only the hamming window was used. in the case of the threshold level and frequency bandwidth, we changed the threshold level within 1–80 % of the maximal amplitude of the wiener filter transfer function and the frequency bandwidth within 5–15 mhz. the wiener filter was evaluated using the same parameter snre as in the case of dwt. an evaluation of the wiener filter based on group delay statistics is shown in fig. 4. ultrasonic signals with different fault echo amplitudes were also generated. the best results were obtained with a threshold level of 40 % and a frequency bandwidth corresponding to 9 mhz. with this setting, the highest snre = 14.7 db. a comparison of the two algorithms shows that the snre values for the wiener filter based on the standard deviation are higher. however, the snre values are lower than in the case of dwt. this may be because the wiener filter is similar in shape to a band-pass filter that suppresses only the frequencies outside the frequency range of the proposed filter. the last method applied here is blind source separation, used for ultrasonic signal and noise separation in the configuration shown in fig. 1. based on this configuration, the ultrasonic signals were acquired. we obtained two ultrasonic signals that can be described using the following equations:

$x_1(t) = a_{1s}\,s_{1s}(t) + a_{1n}\,n_{1e}(t)\,, \qquad x_2(t) = a_{2s}\,s_{2s}(t) + a_{2n}\,n_{2e}(t)\,,$ (15)

where $s_{1s}(t)$ and $s_{2s}(t)$ are the source signals acquired with ultrasonic transducers no. 1 and no. 2, and $n_e(t)$ is the electronic noise. the source signals in this configuration are considered to be all the reflections from the material structure (backscattering noise, fault echo and back-wall echo). if the basic presumptions of equation (6) are valid, then the sources $s_{1s}(t) = s_{2s}(t)$ and the noise $n_{1e}(t) = n_{2e}(t)$. here the presumptions are only theoretical, and if we investigate the real situation in detail, the conditions are completely different. the ultrasonic waves propagated through the material structure from ultrasonic transducer no. 1 clearly have different reflections from the ultrasonic waves propagated from ultrasonic transducer no. 2. this means that the two sources are different, due to the different material structure, and it is clear that $s_{1s}(t) \neq s_{2s}(t)$. the situation with the noise $n_e(t)$ is the same. if the structure of the electronic noise from ultrasonic transducer no. 1 were to be equal to the structure of the electronic noise from transducer no. 2, the two ultrasonic transducers, the ultrasonic system, the cables and the measurement conditions would have to be completely the same. this is also impossible; nobody can design the same parts with the same noise characteristics, so $n_{1e}(t) \neq n_{2e}(t)$. from this simple overview it is clear that the basic presumptions cannot be fulfilled, and the blind source separation method cannot be used in the area of ultrasonic non-destructive testing in the configuration presented here.
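a self-contained illustration of the mixing model (6)-(7) with scikit-learn's fastica follows, showing that separation succeeds precisely when both mixtures contain the same two sources, which is the presumption argued above to fail for two transducers; the sources and the mixing matrix are invented for the example.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(1)
t = np.linspace(0, 1, 2000)
s1 = np.exp(-((t - 0.5) ** 2) / 1e-3) * np.sin(2 * np.pi * 40 * t)  # echo-like source
s2 = rng.laplace(size=t.size)                                       # noise source
S = np.c_[s1, s2]
A = np.array([[1.0, 0.6], [0.8, 1.0]])   # unknown mixing matrix, eq. (6)
X = S @ A.T                              # the two observed mixtures x1(t), x2(t)

Y = FastICA(n_components=2, random_state=0).fit_transform(X)  # y = B x, eq. (7)
corr = np.abs(np.corrcoef(np.c_[Y, S].T)[:2, 2:])
print(corr.max(axis=1))   # close to 1: each source recovered up to scale and order
# the separation works only because both mixtures contain the SAME two sources;
# with two transducers seeing different speckle, s_1s != s_2s, and eq. (6) fails.
```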
4 experimental results

for the performance assessment of all the proposed de-noising algorithms, we used ultrasonic signals acquired on samples of coarse-grained materials used for constructing aircraft engines. for all measurements, we used an ultrasonic transducer with a frequency of 25 mhz. the signals were measured above the flaw, and consequently the de-noising algorithms were applied.

fig. 4: wiener filter evaluation

fig. 5: ultrasonic signal de-noising using the dwt algorithm, a) acquired signal, b) filtered signal

fig. 5b shows the de-noised signal with dwt using the discrete meyer mother wavelet, hard thresholding and a threshold based on the standard deviation. the noise was efficiently suppressed, and the fault echo and back-wall echo are without amplitude changes. in the case of the wiener filter, the noise was also suppressed, but the signal after de-noising remains more corrupted by residual noise (fig. 6). it can be seen that the noise was also efficiently suppressed, but the fault echo amplitude was decreased. in general, it can be concluded that dwt is a more efficient method and more useful than the wiener filter in the case of ultrasonic signal de-noising.

fig. 6: ultrasonic signal de-noising using the wiener filter, a) acquired signal, b) filtered signal

5 conclusion

this paper describes and evaluates methods for ultrasonic signal de-noising and separation. based on our analysis, the best de-noising method for efficient noise suppression is the discrete wavelet transform. the noise reduction for a signal with a fault echo is 35 db. the amplitude of a fault echo higher than 5 % of the initial echo amplitude is without changes, and the fault echo can be easily detected. we also investigated improvements in the fault detection sensitivity using the appropriate parameter setting. our setting identifies a fault echo with an amplitude comparable with the noise level. however, the wiener filter using group delay statistics does not offer efficient noise suppression. the blind source separation method is not appropriate for separating the signal and the noise in ultrasonic signals, because the basic presumptions of this method are not fulfilled.

acknowledgments

this research work has received support from research program no. msm210000015 "research of new methods for physical quantities measurement and their application in instrumentation" of the czech technical university in prague (sponsored by the ministry of education, youth and sports of the czech republic).

references

[1] krautkrämer, j., krautkrämer, h.: ultrasonic testing of materials. springer-verlag, 4th fully revised edition, 1990, 670 p., isbn 3-540-51231-4.
[2] drai, r., sellidj, f., khelil, m., benchaala, a.: elaboration of some signal processing algorithms in ultrasonic techniques: application to materials ndt. ultrasonics, elsevier, vol. 38 (2000), p. 503–507.
[3] louis, a. k., maaß, p., rieder, a.: wavelets: theory and applications. john wiley and sons ltd., england, 1997.
[4] qi, tian, bilgutay, n. m.: statistical analysis of split spectrum processing for multiple target detection. ieee transactions on ultrasonics, ferroelectrics and frequency control, vol. 45 (1998), no. 1, january 1998, p. 251–256.
[5] zhenqing, liu, mingda, lu, moan, wei: structure noise reduction of ultrasonic signals using artificial neural network adaptive filtering. ultrasonics, elsevier: vol. 35 (1997), p. 325–328.
[6] shou-peng, song, pei-wen, que: wavelet based noise suppression technique and its application to ultrasonic flaw detection. ultrasonics, elsevier: vol. 44 (2006), p. 188–193.
[7] paul, s. a.: the illustrated wavelet transform handbook: introductory theory and applications in science, engineering, medicine and finance. napier university, edinburgh, uk, 2004, isbn 07-50-30692-0.
[8] pardo, e., san emeterio, j. l., rodriguez, m. a., ramos, a.: noise reduction in ultrasonic ndt using undecimated wavelet transforms. ultrasonics, elsevier: vol. 44 (2006), p. 1063–1067.
[9] xing, li, nihat, m. bilgutay: wiener filter realization for target detection using group delay statistics. ieee transactions on signal processing, vol. 41 (1993), no. 6, june 1993.
[10] izquierdo, m. a. g., hernandez, m. g., graullera, o., ullate, l. g.: time frequency wiener filtering for structural noise reduction. ultrasonics, elsevier, vol. 41 (2002), p. 269–271.
[11] gustafsson, m. g., stepinski, t.: studies of split spectrum processing, optimal detection, and maximum likelihood amplitude estimation using a simple clutter model. ultrasonics, elsevier: vol. 35 (1997), p. 31–52.
[12] hyvarinen, a., oja, e.: a fast-fixed point algorithm for independent component analysis. neural computation, vol. 9 (1997), p. 1483–1492.
[13] cichocki, a., shun-ichi, a.: adaptive blind signal and image processing: learning algorithms and applications. london: john wiley & sons, ltd., 2002, isbn 0471-60791-6.

ing. václav matz
e-mail: vmatz@email.cz

doc. ing. marcel kreidl, csc.
phone: +420 224 352 117
e-mail: kreidl@feld.cvut.cz

ing. radislav šmíd, ph.d.
phone: +420 224 352 131
e-mail: smid@feld.cvut.cz

department of measurement
czech technical university in prague
faculty of electrical engineering
technická 2
166 27 prague 6, czech republic

acta polytechnica 63(3):208–215, 2023. https://doi.org/10.14311/ap.2023.63.0208
© 2023 the author(s). licensed under a cc-by 4.0 licence. published by the czech technical university in prague.

a comparative experimental investigation of high-temperature effect on fibre concrete and high strength concrete using ut and cm methods

javad royaei∗, kabir sadeghi, fatemeh nouban

near east university, department of civil engineering, lefkosa, via mersin 10, turkey
∗ corresponding author: royaei.j1975@yahoo.com

abstract. in this paper, a 28-day compressive strength test has been performed on samples including normal fibre concrete and high-strength concrete. the ultrasonic test (ut) as a non-destructive test and the compression machine (cm) as a destructive test were applied, and the results were compared. to investigate the effect of temperature, the samples were subjected to 200, 400, 600, 800, 1000, and 1200 degrees celsius, and the exposure time was equal to 30, 45, 60, 90, 120, and 180 minutes. based on the results, it was observed that the minimum error observed between the ut and cm tests was 2.9 % and the maximum error between the two methods was 10.9 %, which shows the high accuracy of the ultrasonic testing method in determining the specimen's strength.
the average probable error of the method is determined to be around 6.8 %. based on the results of the average decrease in compressive strength versus the heat exposure time, it is observed that the trend of the changes and the decrease in resistance over time for both types of tests is almost the same and has a negligible difference. at the end of 180 minutes of exposure, the resistance ratio for the ultrasonic test is 69.8 %, and 71.1 % for the compression machine. furthermore, according to the average reduction in compressive strength due to the heat exposure time, it has been observed that the results of the ut and cm tests have slight numerical differences; however, the trend of the changes and the reduction in resistance over time for both types of tests is almost the same. finally, the accuracy of the ut in determining the compressive strength of specimens at high temperatures is fully confirmed.

keywords: fibre concrete, high strength concrete, temperature, ultrasonic test.

1. introduction

high temperatures resulting from a fire are considered a serious threat to most buildings, whether steel or concrete. due to the widespread use of concrete as a building material, it is very necessary to fully investigate the effect of heat on it in order to achieve a safe design [1]. since human safety against fire is one of the most important considerations in building design, research on the factors affecting the resistance of concrete exposed to high temperatures began in 1920, and since then, much research has been carried out on the resistance of concrete to fire [2]. many explanations have been proposed for the behaviour of concrete exposed to high temperatures due to differences in three factors, namely, the high-temperature conditions, the concrete constituents, and the laboratory equipment used [3]. researchers have recognised that many factors affect the actual behaviour of concrete exposed to high temperatures, with the most important being environmental factors, such as the heating rate and the final temperature, and the constituent materials [4–10]. extensive research has been carried out on the subject of fire in structures, especially reinforced concrete buildings; the studies and research carried out so far on the effects of high temperatures on concrete, as well as on the problems and disadvantages of reinforced concrete structures during a fire, have been reviewed in [11]. as noted above, research on the factors affecting the resistance of concrete exposed to high temperatures began in 1920, and a great deal of research has since been carried out on the resistance of concrete to fire; many explanations have been proposed for the behaviour of concrete exposed to high temperatures due to differences in the high-temperature conditions, the concrete constituents, and the laboratory equipment used [12–18]. zhang's experiments [1] showed that concrete will lose its compressive strength as the temperature increases, but the rate of this reduction in strength differs according to the type of granular materials used in the concrete mix and the density of the hardened concrete. ramachandran's laboratory studies [19] also concluded that concrete loses some of its compressive strength at a temperature of about 200 to 250 degrees celsius. as the temperature increases beyond 300 degrees, the concrete starts to crack. in this case, concrete will lose approximately 30 % of its compressive strength, and as the temperature increases, the resistance will continue to decrease. the research of otieno et al.
[20] on the effect of fire on concrete with lightweight aggregate materials showed that in reinforced concrete structures with the possibility of fire, the use of lightweight concrete in the cover of the members can prevent the phenomenon of thermal bursting and maintain the members' resistance. elsalamawy's studies [21] in the field of the destructive effects of explosive bursting of the peripheral layer of reinforced concrete members due to fire, and of considering this phenomenon in safe design conditions, show the importance of this discussion. close examination of the microtexture of concrete with the help of electro-optic equipment also corroborates this statement. in their research, wang and zheng [22] showed that all concrete structures are classified into two categories (in terms of covering) with regard to fire: 1 – non-combustible, 2 – combustible (burnable). in this classification, the combustible group represents the common case of reinforced concrete structures, while the non-combustible group is based on creating a protective coating layer on the concrete surfaces to dissipate the high temperature resulting from a fire. several tests were also conducted in the field of protective coatings used as thermal insulation in concrete structures during a fire, which can be found in the research conducted by kong et al. [23]. in this research, high-strength fibre concrete samples were made and then, by controlling the temperature in laboratory furnaces, conditions such as fires of varying durations were simulated. the heated samples were subjected to an ultrasonic test and then re-tested for mechanical properties using the compression machine. establishing a meaningful relationship between these two types of tests will help us to form a scientific opinion about the amount of damage and the reduction in fire resistance of concrete structures using non-destructive tests.

2. materials and methods

the main purpose of this article is to carry out a comparative experimental investigation of the high-temperature effect on fibre concrete and high-strength concrete using the ut and cm methods. the routine procedure in all concrete construction projects is to carry out compression machine tests (cm) for determining the mechanical compressive behaviour of the specimens. also, if the project is already built and there are uncertainties about the concrete design parameters, the only common way is to use destructive methods such as taking sample cores. therefore, if we are able to develop a non-destructive method, which is the ultrasonic test (ut) in this research, all the possible extra costs and damage to structures that come with destructive methods will be eliminated. literature relating ultrasonic tests to the compressive properties of normal concrete specimens is already available, yet concise research for other types of concrete mix designs, such as fibre concrete or high-strength concrete mixtures, is still lacking. in this paper, the relationship between the pulse velocity acquired from ut and the compressive strength obtained by cm is determined. the experiments are performed for various temperatures to further broaden our knowledge of all concrete infrastructures and ordinary structures affected by the occurrence of fire incidents during their operational life.
table 1. specification of the mix design of normal concrete.

| item | value |
| water | 150 kg·m−3 |
| cement | 350 kg·m−3 |
| water/cement ratio | 0.43 |
| coarse aggregates | 810 kg·m−3 |
| fine aggregates | 990 kg·m−3 |
| fibre (steel/plastic) | 1 % by volume of cement |
| superplasticizer | 1 liter |

table 2. specifications of the mix design of the high-strength concrete samples.

| item | value |
| water | 300 kg·m−3 |
| cement | 846 kg·m−3 |
| water/cement ratio | 0.32 |
| coarse aggregates | 1500 kg·m−3 |
| fine aggregates | 1325 kg·m−3 |
| micro silica | 35 kg·m−3 |
| superplasticizer | 1 liter |

the concrete samples have been prepared for the tests using portland cement type 2 (based on the astm c 150 standard) according to the aci 211.1 standard. two types of tests have been conducted: the compressive strength test with a compression machine and the non-linear ultrasonic (nlu) measurement test. table 1 shows the specifications of the sample mix design. it should be noted that the concrete samples are made in two versions: normal concrete with fibre (steel fibre, plastic fibre) and high-strength concrete (table 2). fresh concrete is poured into the mould and cured at room temperature. after opening the moulds, the samples were stored in a chamber with a temperature of 23 degrees celsius and 95 % humidity to accelerate the hydration process, and this process continued until the concrete was 28 days old. according to the software of the heating furnace, the standard time-temperature curve of iso 834 was used (figure 1). after heating the samples to the desired temperature, the samples remained in the closed furnace for three hours until the furnace temperature reached the ambient temperature, so that they did not suffer from thermal shock and sudden cracks due to the decrease in temperature. in this paper, the concrete samples (normal, metal fibres, plastic fibres, high strength) were kept in humid conditions after being poured into the mould and compacted, until reaching the test age: after 24 hours, they were removed from the moulds and placed in a water tank with a temperature of 20±2 degrees celsius. the samples were cured according to the astm c 192 standard.

figure 1. time-temperature curve.

the prepared samples were placed in the furnace at different temperatures and for different durations. then, the samples were first subjected to a non-destructive ultrasonic test, and their compressive strength was determined, after which they were subjected to a destructive compression machine test, and their strength was measured again. in this study, the compressive strength test based on the astm c39-86 standard was used. the compressive strength tests have been performed on cube samples. in the compressive strength tests, the cubes were placed in the compression machine in such a way that the two opposite surfaces that were adjacent to the mould during curing were in contact with the upper and lower platens of the machine. the loading speed should be in the range of 0.14 to 0.34 mpa per second; in this study, the loading speed was set to 0.25 mpa·s−1. one of the most common methods in the field of quantitative and qualitative assessment of concrete on site is the use of the non-destructive method of ultrasonic waves, which is known as the ultrasonic method. in this experiment, the speed of longitudinal (pressure) waves is determined. this process involves measuring the time required for a pulse to travel a certain distance; the test method is suggested by astm c 597-83.
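for reference, a short sketch of the iso 834 standard curve, $t = 20 + 345\log_{10}(8t + 1)$ with $t$ in minutes, and of the pulse-velocity relation $v = l/t$ underlying astm c 597 follows; the formulas are the standard definitions, not expressions printed in this paper, and the example numbers are arbitrary.

```python
import numpy as np

def iso834_temperature(t_min):
    """iso 834 standard fire curve: gas temperature (deg C) after t_min minutes."""
    return 20.0 + 345.0 * np.log10(8.0 * t_min + 1.0)

def pulse_velocity(path_m, transit_us):
    """ultrasonic pulse velocity v = l / t (cf. astm c 597), in m/s."""
    return path_m / (transit_us * 1e-6)

for t in (30, 45, 60, 90, 120, 180):
    print(t, "min ->", round(iso834_temperature(t)), "deg C")
print(pulse_velocity(0.15, 33.0))   # 150 mm cube, 33 us transit -> ~4545 m/s
```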
ultrasonic pulse testing has the significant advantage of providing information about the interior of a concrete element, including the concrete uniformity. the basis of the device's operation is that an electro-acoustic transducer, which produces pulses of longitudinal vibrations, is placed on the concrete surface and emits pulses. after the pulse passes a certain length (l) of the concrete, the pulse vibrations are converted into electronic signals by a second transducer. the electronic circuit of the device is able to measure the pulse transit time in microseconds (t). in this study, the speed of the ultrasonic waves was measured by the pundit device (with a frequency of 54 khz).

3. results

the results of the compressive strength of the concrete samples obtained by the two methods, the concrete compression machine (cm) and the ultrasonic test (ut), have been investigated to determine the limits of the difference in the accuracy of the two methods.

figure 2. comparing the results of the compression machine test and ultrasonic test for concrete samples exposed to different temperatures for 30 minutes.

figure 3. ultrasonic test error for samples exposed to different temperatures for 30 minutes.

figure 2 shows the comparison of the results of the compression machine test and the ultrasonic test for concrete samples exposed to different temperatures for 30 minutes. in figure 3, the average difference between the results of the two methods is presented. based on the results of the ultrasonic test error for the samples exposed to different temperatures for 30 minutes, a 6.3 % error has been observed for the concrete samples with a conventional mix design. the lowest observed error was 3.3 % for the concrete samples with steel fibres, and the highest error was 6.3 % for the samples with a conventional mix design. the observed error for the high-strength concrete is also determined, at about 5.8 %.

figure 4. comparing the results of the compression machine test and ultrasonic test for concrete samples exposed to different temperatures for 45 minutes.

figure 5. ultrasonic test error for samples exposed to different temperatures for 45 minutes.

figure 4 shows the comparison of the results of the compression machine test and the ultrasonic test for concrete samples exposed to different temperatures for 45 minutes. in figure 5, the average difference between the results of the two methods is presented. based on the results of the ultrasonic test error for the samples exposed to different temperatures for 45 minutes, a 5.8 % error has been observed for the concrete samples with a conventional mix design. the lowest observed error was 3.2 % for the concrete samples with plastic fibres, and the highest error was 5.8 % for the samples with a conventional mix design. the observed error for the high-strength concrete is also determined, at about 4.6 %.

figure 6. comparing the results of the compression machine test and ultrasonic test for concrete samples exposed to different temperatures for 60 minutes.

figure 7. ultrasonic test error for samples exposed to different temperatures for 60 minutes.

figure 6 shows the comparison of the results of the compression machine test and the ultrasonic test for concrete samples exposed to different temperatures for 60 minutes. in figure 7, the average difference between the results of the two methods is presented. based on the results of the ultrasonic test error for the samples exposed to different temperatures for 60 minutes, a 7.1 % error has been observed for the concrete samples with a conventional mix design.
figure 6 shows the comparison of the results of the compression machine test and the ultrasonic test for concrete samples exposed to different temperatures for 60 minutes. in figure 7, the average difference between the results of the two methods is presented.

figure 6. comparing the results of the compression machine test and ultrasonic test for concrete samples exposed to different temperatures for 60 minutes.

figure 7. ultrasonic test error for samples exposed to different temperatures for 60 minutes.

based on the ultrasonic test error for samples exposed to different temperatures for 60 minutes, a 7.1 % error was observed for the concrete samples with the conventional mix design. this was also the lowest error value, while the highest error, 9.2 %, was observed for the high-strength samples. the observed errors for the steel and plastic fibre concretes were 8.4 % and 8.8 %, respectively.

figure 8 shows the comparison of the results of the compression machine test and the ultrasonic test for concrete samples exposed to different temperatures for 90 minutes. in figure 9, the average difference between the results of the two methods is presented.

figure 8. comparing the results of the compression machine test and ultrasonic test for concrete samples exposed to different temperatures for 90 minutes.

figure 9. ultrasonic test error for samples exposed to different temperatures for 90 minutes.

based on the ultrasonic test error for the samples exposed to different temperatures for 90 minutes, a 7.7 % error was observed for the concrete samples with the conventional mix design. the lowest error value was 2.9 %, for the high-strength concrete samples, and the highest error was 9.3 %, for the samples with steel fibres.

figure 10 shows the comparison of the results of the compression machine test and the ultrasonic test for concrete samples exposed to different temperatures for 120 minutes. in figure 11, the average difference between the results of the two methods is presented.

figure 10. comparing the results of the compression machine test and ultrasonic test for concrete samples exposed to different temperatures for 120 minutes.

figure 11. ultrasonic test error for samples exposed to different temperatures for 120 minutes.

based on the ultrasonic test error for samples exposed to different temperatures for 120 minutes, a 7.4 % error was observed for the concrete samples with the conventional mix design. the lowest error value was 3.8 %, for the high-strength concrete samples, and the highest error was 9.2 %, for the samples with plastic fibres.

figure 12 shows the comparison of the results of the compression machine test and the ultrasonic test for concrete samples exposed to different temperatures for 180 minutes. in figure 13, the average difference between the results of the two methods is presented.

figure 12. comparing the results of the compression machine test and ultrasonic test for concrete samples exposed to different temperatures for 180 minutes.

figure 13. ultrasonic test error for samples exposed to different temperatures for 180 minutes.

based on the ultrasonic test error for samples exposed to different temperatures for 180 minutes, a 9.2 % error was observed for the concrete samples with the conventional mix design. the lowest error value was 4.7 %, for the concrete samples with steel fibres, and the highest error was 10.9 %, for the samples with plastic fibres. the error for the high-strength concrete was about 10.3 %.

4. comparison of two compressive test methods

in the following, a summary of the changes and losses of compressive strength over the time of exposure to heat is presented in figure 14.
based on the findings for the average drop of compressive strength against the heat exposure time, it has been observed that, although the results of the compressive strength test based on the compression machine and on the ultrasonic test show slight numerical differences, the trend of the reduction in strength over time is almost the same for both types of tests. for example, at 180 minutes of exposure, the residual strength ratio was 80.3 % for the ultrasonic test and 79.1 % for the compression machine, even though the fracture strengths of the samples differed by less than 10 % between the two cases. therefore, the accuracy of the ultrasonic test in determining the decreasing trend of the compressive strength is fully confirmed.

figure 14. the average drop in compressive strength against heat exposure time.

from the results of the compressive strength tests of concrete at different temperatures, we conclude that the compressive strength decreases with increasing temperature. the reason is that, in the range of 100 to 200 degrees celsius, the free water in the concrete starts to evaporate, which builds up water vapour pressure, and this pressure creates a network of microcracks inside the concrete. the process then continues in the range of 200–400 degrees celsius, where the microcracks turn into larger cracks and lead to a decrease in compressive strength. other factors reducing the strength can be the growth of the structure and diameter of the pores in the concrete, the difference in the coefficients of thermal expansion of the components that constitute the concrete (expansion in the aggregate and contraction in the cement paste), and the non-linear expansion of these components, which causes stress concentrations in the concrete.

finally, the summary of the comparison of the results of the compressive strength tests of the concrete samples with the compression machine and with the ultrasonic test is given in figure 15.

figure 15. summary of comparing the results of compressive strength tests of concrete samples with compression machine and ultrasonic test.

based on this summary, it has been concluded that the minimum error recorded in the tests was 2.9 % and the maximum error between the two methods was 10.9 %, which shows the high accuracy of the ultrasonic test method in determining the compressive strength of concrete samples exposed to fire. the average probable error of the method is around 6.8 %. the best performance and lowest error were observed for the concrete samples with steel fibres and for the high-strength samples, while the worst performance and the highest error of the method were observed for the concrete samples with plastic fibres (7.9 % error).
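the residual-strength ratios used in the comparison above (e.g. 80.3 % and 79.1 % at 180 minutes) follow from dividing the strength after a given heat exposure by the unheated reference strength. a short sketch, with hypothetical values chosen so that the 180-minute ratios approximately reproduce the quoted figures:

# residual compressive strength, in percent of the unheated reference,
# for both test methods. all strength values are illustrative only.

def residual_ratio(strengths: dict[int, float], reference: float) -> dict[int, float]:
    """map exposure time (min) -> residual strength in % of the reference."""
    return {t: s / reference * 100.0 for t, s in strengths.items()}

f0 = 45.0  # mpa, hypothetical 28-day strength before heating
cm = {30: 42.5, 60: 40.1, 90: 38.0, 120: 36.8, 180: 35.6}   # compression machine
ut = {30: 43.0, 60: 40.8, 90: 38.9, 120: 37.2, 180: 36.15}  # ultrasonic estimate

rr_cm = residual_ratio(cm, f0)
rr_ut = residual_ratio(ut, f0)
for t in sorted(cm):
    print(f"{t:3d} min: cm {rr_cm[t]:.1f} %, ut {rr_ut[t]:.1f} %")
# at 180 min this prints cm 79.1 %, ut 80.3 %, matching the trend comparison above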
the reason for the lower compressive strength of concrete containing plastic fibres, as compared to the other concretes at normal temperatures, is that the use of plastic fibres reduces the compactability of the concrete. this causes weak points in the concrete texture (due to local porosity caused by entrapped air bubbles) and is the reason why the compressive strength decreases in the presence of plastic fibres. from 400 to 600 degrees celsius, a noticeable decrease in compressive strength can be seen for all samples. this can be explained by the fact that, in this temperature range, the decomposition of the cement paste and of the calcium hydroxide ca(oh)2 in the concrete takes place; as a result of this reaction, the porosity of the cement matrix increases and the mechanical properties of the concrete, including strength and hardness, decrease. this range can therefore be considered the critical temperature range for the strength reduction. the greater decrease in strength in the concrete containing plastic fibres is caused, in addition to the mechanisms mentioned so far, by the fact that the plastic fibres start to melt in the temperature range of 160 to 180 degrees celsius, leaving voids and additional particles inside the concrete; this causes a greater decrease in the compressive strength of concrete containing plastic fibres at high temperatures, as compared to concrete without plastic fibres. a similar effect was also present for the concrete with steel fibres at temperatures higher than 800 degrees celsius.

5. conclusions

to investigate the effect of a temperature increase on the compressive strength of concrete, concrete samples were subjected to elevated temperatures of 200, 400, 600, 800, 1000, and 1200 degrees celsius. each compressive strength result is the average of three samples. the results of the compressive strength tests against the temperature increase were obtained for all four mixtures at the age of 28 days. the most important results and findings of this part of the research are as follows:

• based on the summary of the comparison of the compressive strength test results of the concrete samples by the compression machine and the ultrasonic test, the minimum error recorded in the tests was 2.9 % and the maximum error between the two methods was 10.9 %, which shows the high accuracy of the ultrasonic test method in determining the compressive strength of concrete samples exposed to fire. the average probable error of the method was found to be about 6.8 %. the best performance and lowest error were observed for the concrete samples with steel fibres and for the high-strength samples, while the method performed worst for the concrete samples with plastic fibres (7.9 % error).

• based on the findings for the average decrease in compressive strength against the time of exposure to heat, although the results of the compressive strength test based on the compression machine and on the ultrasonic test show slight numerical differences, the trend of the reduction in strength over time is almost the same for both types of tests. for example, at 180 minutes of exposure, the residual strength ratio was 80.3 % for the ultrasonic test and 79.1 % for the compression machine, even though the fracture strengths of the samples differed by less than 10 % between the two cases. therefore, the accuracy of the ultrasonic test in determining the decreasing trend of the compressive strength is fully confirmed.
• based on the findings for the average decrease in compressive strength against the heat exposure time, although the results of the compressive strength test based on the compression machine and on the ultrasonic test show slight numerical differences, the trend of the reduction in strength over time is almost the same for both types of tests. for example, at 180 minutes of exposure, the residual strength ratio was 69.8 % for the ultrasonic test and 71.1 % for the compression machine, i.e. the fracture strengths of the samples differed by less than the maximum possible error of 6.2 % between the two cases. therefore, the accuracy of the ultrasonic test in determining the decreasing trend of the compressive strength is fully confirmed.
acta polytechnica vol. 46 no. 2/2006

some effects on fatigue strength

p. brož

it has been widely known for some time that the mean stress affects the fatigue strength of steel. this is distinctly evident when working with non-welded or stress-relieved welded details. some consequences of this influence are revealed when evaluating the stress history, or when counting the cycles. in this paper, decisive influences on the plain fatigue limits or strengths are indicated, when applying a mean stress, together with the effects of both combined stress and anisotropy and the minimum stress required to grow a crack of a given length and depth.

keywords: alternating stress, mean load, notched and cracked specimen, notched fatigue data, reversed direct stress, stress ratio.

1 introduction

when the plain fatigue limit or the plain fatigue strength of a material is determined from a series of specimens tested at zero mean load, it is of practical importance to clarify how this value is influenced by, for instance, a superimposed mean load or a combined stress loading. it may be necessary to know the effect of a combination of these variables. this information has to be assessed from a careful observation of the variables acting separately. there is no recent evidence to suggest that compressive mean stresses reduce the zero mean stress fatigue limit. the data suggest that the fatigue limit either remains constant or increases, approximately linearly, above the zero mean stress value as the compressive mean stress increases in magnitude, provided that buckling or gross yielding does not occur. the initiation of a surface microcrack in a wrought ductile metal depends on the resolved cyclic shear stresses necessary to cause continuing cyclic slip exceeding some minimum value. therefore, the effect of the mean stress on its fatigue limit depends on the extent to which this minimum value is either increased or reduced by the resolved static stresses acting both along and normal to the operative slip planes. the data imply that, for metals exhibiting a sharp knee and a definite fatigue limit, this effect is small until the maximum stress in the cycle is above the yield stress; for other materials there is a gradual decrease with increasing static tensile stress and perhaps a slight increase with increasing compressive mean stress.

2 influence of uniaxial loading

the stress cycle $\sigma_m \pm \sigma$, where $\sigma_m$ is considered positive, is demonstrated in fig. 1; it follows that

$\sigma_{\max} = \sigma_m + \sigma$ , (1a)
$\sigma_{\min} = \sigma_m - \sigma$ . (1b)

fig. 1: tensile loading cycle

the ratio $\sigma_{\min}/\sigma_{\max}$ is called the stress ratio and is commonly denoted by $r$.
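a short numerical illustration of equations (1a), (1b) and the stress ratio $r$; the values are arbitrary:

# cycle extremes and stress ratio for the cycle sigma_m +/- sigma_a

def stress_cycle(sigma_m: float, sigma_a: float) -> tuple[float, float, float]:
    """return (sigma_max, sigma_min, r) per equations (1a) and (1b)."""
    s_max = sigma_m + sigma_a
    s_min = sigma_m - sigma_a
    return s_max, s_min, s_min / s_max

print(stress_cycle(100.0, 150.0))   # (250.0, -50.0, -0.2)
print(stress_cycle(0.0, 190.0))     # fully reversed cycle: r = -1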
procedures to study the effect of mean stress generally involve establishing s/n curves for a series of values of mean stress, so that a diagram can be plotted showing the relationship between the fatigue limit at a particular mean stress and the corresponding value of the mean stress. in order to avoid the necessity of carrying out comprehensive series of tests at different mean stresses, on different materials, attempts have been made to formulate relationships linking the pertinent variables, thus enabling the fatigue limit of a material (or the strength at a given endurance) under a given mean stress to be predicted from the fatigue limit at zero mean stress. the most common requirement for practical design purposes is the fatigue limit under a tensile mean stress. according to [1], two relationships are generally accepted as representing the experimental data, one due to goodman and the other due to gerber, who found that wöhler's tensile mean stress data conformed to a parabola, having as end-points the fatigue limit at zero mean stress and the tensile strength of the material. goodman assumed that the safe repeated tension loading cycle was zero to half the tensile strength of the material $r_m$, that is, $r_m/4 \pm r_m/4$, the safe stress decreasing linearly to zero at $r_m$ and increasing linearly to the zero mean stress condition; this gives a zero mean stress fatigue limit of $\pm r_m/3$. this relationship was subsequently modified (and is now called the modified goodman relationship) so that the fatigue limit decreased linearly with increasing tensile mean stress from its experimentally determined zero mean stress value to zero at $r_m$, the tensile strength of the material. the two relationships are illustrated in fig. 2; they can be expressed in the form

$\pm\sigma = \pm\sigma_c \left[ 1 - \left( \frac{\sigma_m}{r_m} \right)^n \right]$ , (2)

where $\pm\sigma$ is the fatigue limit (or strength at a given endurance) when a tensile mean stress $\sigma_m$ is present, $\pm\sigma_c$ is the fatigue limit (or strength at the same endurance) at zero mean stress, $r_m$ is the tensile strength, $n = 1$ gives the modified goodman relationship, and $n = 2$ gives the gerber relationship.

fig. 2: modified goodman (a) and gerber (b) diagrams

if the above equation is written in the form (omitting $\pm$ signs)

$\frac{\sigma}{\sigma_c} = 1 - \left( \frac{\sigma_m}{r_m} \right)^n$ , (3)

the diagram of fig. 2 can be replotted using the non-dimensional variables $\sigma/\sigma_c$ and $\sigma_m/r_m$, as shown in fig. 3. this figure is often referred to as the r-m diagram because it shows the relationship between the safe range of stress r and the mean stress m.

fig. 3: r-m diagram
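equation (3) is directly computable; the sketch below evaluates the modified goodman (n = 1) and gerber (n = 2) predictions for the mild steel of table 1 and closely reproduces the calculated columns of that table:

# permissible alternating stress at a tensile mean stress sigma_m, given the
# zero-mean-stress fatigue limit sigma_c and the tensile strength r_m

def mean_stress_corrected(sigma_c: float, r_m: float, sigma_m: float, n: int) -> float:
    """equation (3): sigma = sigma_c * (1 - (sigma_m / r_m)**n)."""
    return sigma_c * (1.0 - (sigma_m / r_m) ** n)

sigma_c, r_m = 190.0, 410.0          # mild steel, table 1
for sigma_m in (91.0, 250.0):
    g1 = mean_stress_corrected(sigma_c, r_m, sigma_m, n=1)   # modified goodman
    g2 = mean_stress_corrected(sigma_c, r_m, sigma_m, n=2)   # gerber
    print(f"sigma_m = {sigma_m:5.0f} mpa: goodman {g1:6.1f} mpa, gerber {g2:6.1f} mpa")
# prints ~148/181 mpa at sigma_m = 91 and ~74/119 mpa at sigma_m = 250,
# in agreement with the calculated values listed in table 1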
if a line is drawn in fig. 3 joining points a and b, where a and b are the yield stress of the material divided by $\sigma_c$ and by $r_m$, respectively, then points to the right of ab represent tests in which the maximum tensile stress in the cycle is sufficient to cause gross yielding. to ensure that neither yielding nor fatigue failure occurs, a diagram similar in form to the modified goodman diagram has been proposed in which the criterion of failure at zero alternating stress is taken as the yield stress instead of the tensile strength; the straight line joining this point to $\pm\sigma_c$ is often referred to as the soderberg line. in all these diagrams, loading conditions inside the curve or straight line are supposedly safe, while those outside lead to failure. the experimental data show that the fatigue limit tends to increase above the zero mean stress value with increasing compressive mean stress, provided the specimen does not yield or buckle. heywood [2] derived an empirical relationship from an analysis of the available data, which can be written in the form

$\pm\sigma = \left( 1 - \frac{\sigma_m}{r_m} \right) \left( \sigma_c + \gamma\,\sigma_m \right)$ , (4)

where $\gamma$ is an empirical curvature factor that depends on the ratio $\sigma_m/r_m$; heywood gives separate expressions for steels and for aluminium alloys, the latter also involving the logarithm of the life at which the fatigue strength is estimated, with all stresses in mpa. $\gamma$ can be considered as the curvature factor of a line on the $\sigma/\sigma_c$ versus $\sigma_m/r_m$ diagram, the expression reducing to the modified goodman relationship (2), with $n = 1$, when $\gamma = 0$. some experimental data relating to the effect of a mean stress on the uniaxial (reversed direct stress) fatigue limit of various steels are given, according to [3], in table 1. data obtained on wrought metallic alloys (steels, aluminium alloys) subjected to tensile mean stresses tend to lie between the modified goodman and gerber lines. a survey of the literature showed that 90 percent of the data lay above the goodman line, falling mainly between the goodman and gerber lines. however, some of the low- and medium-strength aluminium alloys give values lying below the goodman line. provided a specimen does not yield or buckle under the maximum compressive stress in the loading cycle, the fatigue limit does not decrease below the zero mean stress value when a compressive mean stress is superimposed. data on steels and aluminium alloys have been given which imply that the fatigue limit increases linearly with increasing compressive mean stress, the value of the fatigue limit at a compressive mean stress equal to the yield stress of the material being about 1.4 times that at zero mean stress. some data illustrating the variation of fatigue strength with compressive mean stress are given, following [3], in table 2. in all cases, it is seen that the fatigue limit either increases above or remains equal to the zero mean stress value. specimens subjected to a wholly compressive loading cycle exhibited numerous surface cracks, pieces of material often flaking away from the surface. it has been reported that a more accurate prediction of the effect of a tensile mean stress can be obtained by using true stress instead of nominal stress and the fracture stress of the material instead of the tensile strength.
experimental points from various steels and aluminium alloys tested at various tensile mean stresses fall around a straight line on a true stress/modified goodman diagram. a material whose fatigue limit depends on whether or not cracks can grow directly from inherent flaws responds more markedly to mean stress (either tensile or compressive) than a material whose fatigue limit depends on whether or not the applied cyclic stress is sufficient to initiate and develop surface microcracks. cast iron is an example of the former material, and it is found [3] that a tensile mean stress reduces and a compressive mean stress increases its fatigue limit by a greater extent than that predicted by the modified goodman relationship (2). the ratio of the fatigue limit of cast iron in repeated compression to that in repeated tension averages about 3.3, compared to an average value of about 1.5 for malleable cast irons and wrought steels. this is further illustrated by the following fatigue limits for grey cast iron [2]: pulsating tension 0 to 100 mpa, zero mean stress ±73 mpa, pulsating compression 0 to −450 mpa.

table 1: effect of a tensile mean stress on the uniaxial fatigue strength of various steels (after [3])

material (tensile strength)            tensile mean    calculated:         calculated:   experimental
                                       stress (mpa)    modified goodman    gerber        fatigue strength (mpa)
mild steel (410 mpa)                   0               ±190                ±190          ±190
                                       43              ±160                ±185          ±182
                                       91              ±148                ±179          ±168
                                       148             ±122                ±165          ±159
                                       250             ±74                 ±120          ±153
                                       310             ±46                 ±82           ±105
nickel-chromium alloy steel (865 mpa)  0               ±480                ±480          ±480
                                       155             ±394                ±465          ±448
                                       310             ±310                ±418          ±418
                                       465             ±224                ±340          ±356
15 130 steel (800 mpa)                 0               ±338                ±338          ±338
                                       70              ±310                ±334          ±338
                                       140             ±283                ±328          ±330
                                       210             ±252                ±317          ±317

table 2: variation of the fatigue strength with compressive mean stress (after [3])

material                       compressive mean stress (mpa)   fatigue strength (mpa)
nickel-chromium alloy steel    0                               ±480
                               −155                            ±510
                               −310                            ±510
11 140 steel                   0                               ±390
                               −173                            ±410

3 anisotropy and combined stress effects

in many applications, components are subjected to more general loading cycles, for example, combined bending and torsional loads. combined cyclic loading comprises either in-phase bending and torsion or in-phase biaxial tension. attempts to predict the fatigue limit (or strength at long endurances) at zero mean stress, under a combined stress loading, from the corresponding uniaxial fatigue limit or strength have been based on the usually accepted criteria for predicting the onset of plastic deformation under a static combined stress loading, the limiting static stresses merely being replaced by the corresponding limiting cyclic stress amplitudes. only biaxial stress systems need be considered in the case of those materials in which cracks are initiated at a free surface, and the three failure criteria most commonly used (that is, maximum principal stress, maximum shear stress, and the maximum shear-strain energy or von mises criterion) may be expressed by the formulae:

maximum principal stress
$\sigma_e = \tfrac{1}{2} \left[ (\sigma_x + \sigma_y) + \left( (\sigma_x - \sigma_y)^2 + 4\tau_{xy}^2 \right)^{1/2} \right]$ , (5a)

maximum shear stress
$\sigma_e = \left( (\sigma_x - \sigma_y)^2 + 4\tau_{xy}^2 \right)^{1/2}$ , (5b)

von mises
$\sigma_e = \left( \sigma_x^2 - \sigma_x \sigma_y + \sigma_y^2 + 3\tau_{xy}^2 \right)^{1/2}$ , (5c)

where $\sigma_e$ is the equivalent principal stress, $\sigma_x$ and $\sigma_y$ are the normal stresses, and $\tau_{xy}$ is the shear stress.
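the three criteria (5a)–(5c) translate directly into code; evaluating them for pure torsion (σ_x = σ_y = 0) recovers the torsion/uniaxial ratios of 1, 0.5 and 0.577 quoted in the following paragraph. a minimal sketch:

# the three biaxial failure criteria of equations (5a)-(5c)
import math

def principal(sx: float, sy: float, txy: float) -> float:
    """equation (5a): maximum principal stress."""
    return 0.5 * ((sx + sy) + math.sqrt((sx - sy) ** 2 + 4.0 * txy ** 2))

def max_shear(sx: float, sy: float, txy: float) -> float:
    """equation (5b): equivalent stress per the maximum shear stress criterion."""
    return math.sqrt((sx - sy) ** 2 + 4.0 * txy ** 2)

def von_mises(sx: float, sy: float, txy: float) -> float:
    """equation (5c): von mises equivalent stress."""
    return math.sqrt(sx ** 2 - sx * sy + sy ** 2 + 3.0 * txy ** 2)

# pure torsion: the criteria give tau, 2*tau and sqrt(3)*tau, hence the
# torsional/uniaxial fatigue limit ratios 1, 0.5 and 0.577
print(principal(0, 0, 100.0), max_shear(0, 0, 100.0), von_mises(0, 0, 100.0))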
for a cylindrical bar subjected to combined bending and torsional loads, the formulae for the surface stresses become:

maximum principal stress
$\sigma_e = \tfrac{1}{2} \left[ \sigma + \left( \sigma^2 + 4\tau^2 \right)^{1/2} \right]$ , (6a)

maximum shear stress
$\sigma_e = \left( \sigma^2 + 4\tau^2 \right)^{1/2}$ , (6b)

von mises
$\sigma_e = \left( \sigma^2 + 3\tau^2 \right)^{1/2}$ , (6c)

where $\sigma$ is the maximum surface bending stress and $\tau$ is the maximum surface shear stress. verification of whether or not fatigue data obtained on a particular material conform to any of these criteria (5a–c) may be obtained from the value of the ratio of its torsional fatigue limit to the uniaxial fatigue limit (the above criteria require this ratio to equal 1, 0.5, and 0.577, respectively) and by determining the fatigue limits of specimens subjected to various combined stress loadings. suitable methods for performing these tests are either to apply in-phase bending and torsional stresses to cylindrical specimens or to subject thin-walled tubes to in-phase pulsating internal pressure and axial cyclic loading.

table 3: average ratio of the fatigue limit in torsion to that in rotating bending

materials           ratio
19 carbon steels    0.55
14 alloy steels     0.58

fig. 4: fatigue limits of 3 steels under combined bending and torsional alternating stresses (taken from frost [1])

average values of the ratio of the fatigue limit in torsion to that in rotating bending are given in table 3. it would appear, therefore, that for metals in which failure is initiated at a free surface, the ratio of the fatigue limit (or strength at very long endurances) in torsion to that in uniaxial loading has an average value not far removed from that predicted by the von mises criterion, the value for a particular material depending on the endurance at which the fatigue strength is estimated and on the metallurgical structure. most data relating to combined stresses were obtained for combined bending and torsional stresses. machines were designed to apply either an alternating bending or an alternating torsion loading, or a loading consisting of any in-phase combination of the two, to obtain data for numerous ferrous alloys. the fatigue limit (at zero mean stress) of each material was obtained under alternating bending, alternating torsion, and five different combinations of alternating in-phase bending and torsional stresses, using 7.6 mm diameter plain specimens. none of the usual theories of failure under combined stresses was found to represent all the results; instead, it was suggested that for the wrought steels tested (these covered a range of tensile strengths from 400 mpa to 1850 mpa) the experimental results for each steel could be represented by an ellipse quadrant having as end points the fatigue limits in pure torsion and pure bending. three typical groups of results are shown in fig. 4. the equation of the ellipse quadrant is (omitting ± signs)

$\frac{\sigma_b^2}{\sigma_{cb}^2} + \frac{\sigma_q^2}{\sigma_{cq}^2} = 1$ , (7)

where $\sigma_{cb}$ and $\sigma_{cq}$ are the fatigue limits in pure bending and torsion, respectively, and $\sigma_b$ and $\sigma_q$ are the stress ranges due to bending and torsion, respectively, at the fatigue limit under a combined stress loading. this is equivalent to the maximum shear stress criterion (5b) if $\sigma_{cb}/\sigma_{cq} = 2$ and to the von mises criterion (5c) if $\sigma_{cb}/\sigma_{cq} = \sqrt{3}$.
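the ellipse quadrant (7) gives a simple safety check for a pair of combined loading amplitudes: the quadratic form is below one inside the quadrant and reaches one at the fatigue limit. a sketch with illustrative pure-bending and pure-torsion limits, their ratio chosen near the von mises value of √3 ≈ 1.73:

# ellipse-quadrant check of equation (7); the limits are hypothetical values

def ellipse_quadrant(sigma_b: float, sigma_q: float,
                     sigma_cb: float, sigma_cq: float) -> float:
    """equation (7): (sigma_b/sigma_cb)**2 + (sigma_q/sigma_cq)**2."""
    return (sigma_b / sigma_cb) ** 2 + (sigma_q / sigma_cq) ** 2

sigma_cb, sigma_cq = 300.0, 170.0   # mpa, hypothetical bending/torsion limits
for sb, sq in ((250.0, 80.0), (200.0, 140.0)):
    q = ellipse_quadrant(sb, sq, sigma_cb, sigma_cq)
    print(f"sigma_b={sb}, sigma_q={sq}: {q:.2f} ->", "safe" if q < 1.0 else "failure")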
although it was considered that the $\sigma_{cb}/\sigma_{cq}$ ratios obtained for the various steels did not agree exactly with either of these two criteria, consideration of all the data for any one steel shows that, in general, they are not far removed from the values predicted by the von mises criterion. some data are replotted, using $\sigma_b^2$ and $\sigma_q^2$ axes, in fig. 5. the full line represents the von mises criterion; the dotted line is the best straight line through the points.

fig. 5: $\sigma_b^2$ versus $\sigma_q^2$ diagrams, a) 16 320 steel (tensile strength 700–770 mpa), b) chromium-vanadium steel (tensile strength 700–770 mpa) (taken from frost [1])

values quoted in [3] for the torsion/uniaxial ratio of various cast irons usually lie between 0.9 and 1.0, thus appearing to conform more closely to the maximum principal stress criterion rather than to the von mises criterion. fatigue failure of cast iron is associated, however, with the presence of inherent flaws. some tests carried out on both cast iron plain specimens and wrought steel notched specimens gave results that could not be represented by an ellipse quadrant. instead, the fatigue limits lay around an ellipse arc having the equation

$\frac{\sigma_q^2}{\sigma_{cq}^2} + \frac{\sigma_b^2}{\sigma_{cb}^2} \left( \frac{\sigma_{cb}}{\sigma_{cq}} - 1 \right) + \frac{\sigma_b}{\sigma_{cb}} \left( 2 - \frac{\sigma_{cb}}{\sigma_{cq}} \right) = 1$ . (8)

a set of results obtained from plain "silal" cast iron specimens and vee-notched wrought steel specimens is shown in fig. 6. a specimen containing a notch subjected to uniaxial loading has a biaxial stress system set up in the material at the notch root. tests on chromium-vanadium and 0.14 % c steel specimens, notched to create a known biaxial stress system on the surface of the notch root, gave fatigue limits in good agreement with the von mises criterion. the magnitude and sign of the normal stress occurring on the maximum shear stress planes would be expected to influence the initiation and development of surface microcracks [4], and since a normal stress is present on the planes of maximum shear stress in bending but is absent in torsion, it might be expected to produce differences in the ratio of the fatigue limit in bending to that in torsion. to allow for the influence of the normal stress and also for the anisotropy of the material, it was suggested that the maximum shear stress criterion, for example, could be modified as follows. let $\sigma_{cb}$ and $\sigma_{cq}$ be the fatigue limits in pure bending and torsion, respectively, and $\sigma_b$ and $\sigma_q$ the stress ranges due to bending and torsion at the fatigue limit of the combined stress loading; then, writing $\sigma_{cb} = 2k\sigma_{cq}$ (omitting ± signs), where $k$ is a correction factor to allow for the state of stress and anisotropy, the maximum shear stress criterion can be written as

$\sigma_b^2 + 4(k\sigma_q)^2 = \sigma_{cb}^2$ , (9)

which, on substituting for $k$, gives

$\frac{\sigma_b^2}{\sigma_{cb}^2} + \frac{\sigma_q^2}{\sigma_{cq}^2} = 1$ . (10)
sines [5] suggested that the experimental data could be summarized as follows:

1) the combined stress fatigue limits of wrought metallic alloys were in good agreement with the von mises criterion,
2) the uniaxial fatigue limit was decreased by a tensile mean stress and increased by a compressive mean stress, the change in fatigue limit, for practical purposes, being linearly dependent on the mean stress, provided the material did not yield,
3) both the torsional and uniaxial fatigue limits were unaffected by a static torsional mean stress, provided the material did not yield,
4) the torsional fatigue limit was affected by a tensile or compressive mean stress as in 2),

and thus expressed the relationship between the static and permissible cyclic stresses as

$\tfrac{1}{3} \left[ (p_1 - p_2)^2 + (p_2 - p_3)^2 + (p_3 - p_1)^2 \right]^{1/2} = a - \alpha (s_x + s_y + s_z)$ , (11)

where $p_{1,2,3}$ are the amplitudes of the alternating principal stresses, $s_{x,y,z}$ are the orthogonal static stresses, and $a$ and $\alpha$ are material constants. he suggested that the constants could be evaluated from uniaxial tests carried out at zero mean stress and with a zero-to-tension loading cycle. if $\pm\sigma_1$ is the fatigue limit at zero mean stress, then

$s_x = s_y = s_z = 0 , \quad p_2 = p_3 = 0 , \quad p_1 = \sigma_1$ , (12)

therefore $a = \frac{\sqrt{2}}{3}\,\sigma_1$. if $\sigma_2 \pm \sigma_2$ is the fatigue limit under a zero to tensile stress loading cycle, then

$p_2 = p_3 = 0 , \quad s_y = s_z = 0 , \quad s_x = \sigma_2 , \quad p_1 = \sigma_2$ , (13)

therefore

$a = \frac{\sqrt{2}}{3}\,\sigma_2 + \alpha\,\sigma_2$ , (14)

from which

$\alpha = \frac{\sqrt{2}}{3} \left( \frac{\sigma_1}{\sigma_2} - 1 \right)$ . (15)
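equations (12)–(15) make the sines relationship (11) immediately usable once the two uniaxial fatigue limits are known. the sketch below evaluates $a$ and $\alpha$ for hypothetical values of $\sigma_1$ (fully reversed limit) and $\sigma_2$ (amplitude of the zero-to-tension limit cycle $\sigma_2 \pm \sigma_2$), and the permissible value of the left-hand side of (11) at a given static tension:

# sines' constants from equations (12)-(15), and the lhs of equation (11)
import math

def sines_constants(sigma_1: float, sigma_2: float) -> tuple[float, float]:
    """a = (sqrt(2)/3) sigma_1; alpha = (sqrt(2)/3)(sigma_1/sigma_2 - 1)."""
    a = math.sqrt(2.0) / 3.0 * sigma_1
    alpha = math.sqrt(2.0) / 3.0 * (sigma_1 / sigma_2 - 1.0)
    return a, alpha

def sines_lhs(p1: float, p2: float, p3: float) -> float:
    """left-hand side of equation (11)."""
    return math.sqrt((p1 - p2) ** 2 + (p2 - p3) ** 2 + (p3 - p1) ** 2) / 3.0

a, alpha = sines_constants(sigma_1=190.0, sigma_2=160.0)   # hypothetical limits
sx = 50.0                                 # static tensile stress, mpa
allowable = a - alpha * sx                # permissible value of the lhs of (11)
print(f"a = {a:.1f} mpa, alpha = {alpha:.3f}, allowable lhs at sx=50: {allowable:.1f} mpa")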
fig. 6: fatigue limits of plain cast iron and notched wrought steel specimens under combined bending and torsional alternating stresses (taken from frost [1])

4 conclusion

these test results imply that, with the exception of soft metals that deform noticeably under the applied loads, the fatigue limits of wrought metals and alloys fall within the band of values predicted by the modified goodman and gerber expressions, the former providing the safer estimate. there is a tendency for the data to lie closer to the modified goodman line at low mean stresses and closer to the gerber line at high mean stresses. it may well be for this reason that it has been reported that the data can be more accurately represented by the modified true-stress goodman line, because this line lies above the conventional goodman and gerber lines at mean stresses that are a large fraction of the tensile strength. materials which exhibit a definite fatigue limit give results somewhat closer to the gerber line than to the goodman line; materials which do not exhibit an s/n curve having a sharp knee, such as high-strength aluminium alloys, give results falling around the modified goodman line when the fatigue strength is estimated at relatively low endurances (about 10^7 cycles), but tend to approach the gerber line when the fatigue strength is estimated at relatively long endurances (greater than 10^8 cycles). for all the stated materials, the soderberg line gives a safe prediction, but in many cases it would be excessively conservative. it seems that, for ductile metals and alloys in which microcracks are initiated and evolve in surface grains, the failure criterion applicable to an in-phase combined stress loading is some function of the cyclic maximum shear stresses, and the von mises criterion is sufficiently accurate for practical purposes.

in the case of in-phase combined bending and torsion, the results are well represented by an ellipse quadrant having as end-points the fatigue limits in pure bending and pure torsion. little work has been done on the effect of out-of-phase combined stress loadings, but it is difficult to visualize their effect being more dangerous than when the corresponding loadings are applied in-phase. combined bending and torsion fatigue tests in which there were various phase differences between the bending and torsion loading confirmed that in no case was the fatigue strength less than in the corresponding in-phase case. the ratio of the fatigue limit in torsion to that in uniaxial loading is nearer 0.57 than 0.5, which is probably due to the fact that there is a normal stress across the operative slip planes in the latter tests but not in the former. thus, although the criterion for the onset of surface slip in a surface grain may indeed be that of maximum shear stress (as indeed it is for the onset of static slip in a single crystal), the progressive development of a microcrack will be easier when a normal stress acts across its faces than when it is absent, and thus the uniaxial fatigue limit is less than twice the torsional fatigue limit. the fact that this ratio approximates to that predicted by the von mises criterion may therefore be coincidental; indeed, it has been argued [6] that, although the strain energy or von mises criterion is useful as a design formula, there is no evidence that fluctuating strain energy is a cause of fatigue cracking. in tests on materials not possessing a definite fatigue limit, the ratio of torsional to bending fatigue strengths increases progressively above the von mises predicted value of 0.57 as the endurance at which the fatigue strengths are estimated decreases. this is presumably because it is easier for a crack having a normal cyclic stress acting across its faces to grow than for one which does not, thus making the bending fatigue strength (at a given endurance) increase less rapidly with decreasing endurance than the corresponding torsional fatigue strength. this fact may also account for the more marked effect, on both the uniaxial and torsional fatigue limits, of a uniaxial rather than a torsional mean stress, because the latter does not induce a normal stress component across the operative slip planes. the effect of anisotropy does not appear to be important in interpreting combined bending and torsion fatigue tests, presumably because the fatigue limit in torsion is not significantly dependent on the direction from which specimens are cut from the stock material. however, experiments on thin-walled tubes under in-phase pulsating internal pressure and longitudinal alternating uniaxial loads yield rather scattered results, as a rule attributed to anisotropy, as the circumferential fatigue limit is substantially lower than that in the longitudinal direction.

5 references

[1] frost, n. e., marsh, k. j., pook, l. p.: metal fatigue. dover publications, inc., mineola, new york, 1999, 499 p.
[2] heywood, r. b.: designing against fatigue. chapman and hall, london, 1962.
[3] forrest, p. g.: fatigue of metals. pergamon press, oxford, 1962.
[4] findley, w. n., mathur, p. n.: proc. soc. exp. stress analysis, 14, 35, 1956.
[5] sines, g.: metal fatigue. mcgraw-hill, new york, 1959, 145 p.
[6] findley, w. n., mathur, p. n., szcepanski, e., temel, a. o.: trans. am. soc. mech. engrs., j. bas. eng., 83, 10, 1961.

doc. ing. petr brož, drsc.
phone: +420 224 354 630
e-mail: petr.broz@fsv.cvut.cz
department of concrete and masonry structures
czech technical university in prague, faculty of civil engineering
thákurova 7, 166 29 prague, czech republic

acta polytechnica vol. 45 no. 4/2005

design and development of the engine unit for a twin-rotor unmanned aerial vehicle

g. avanzini, s. d'angelo, g. de matteis

advanced computer-aided technologies played a crucial role in the design of an unconventional uninhabited aerial vehicle (uav), developed at the turin technical university and the university of rome "la sapienza". the engine unit of the vehicle is made of a complex system of three two-stroke piston engines coupled with two counter-rotating three-bladed rotors, controlled by rotary pwm servos. the focus of the present paper lies on the enabling technologies exploited in the framework of activities aimed at designing a suitable and reliable engine system, capable of performing the complex tasks required for operating the proposed rotorcraft. the synergic use of advanced computational tools for estimating the aerodynamic performance of the vehicle, solid modeling for mechanical component design, and rapid prototyping techniques for control system logic synthesis and implementation will be presented.

keywords: uninhabited aerial vehicle, computational fluid dynamics, solid modeling, rapid prototyping.

1 introduction

the last decade saw ever growing interest in the academic world in the study of autonomous flight. most of the academic projects share a common approach, where an existing model aircraft is equipped with sensors and on-board computers, in order to achieve the prescribed level of autonomous management of the mission profile. as for aircraft with vertical take-off and landing (vtol) capabilities, model helicopters such as the yamaha r-max and r-50 were employed by the university of california at berkeley [1], carnegie mellon university [2] and the georgia institute of technology [3]. a more ambitious project has been in progress in the past 5 years at the department of aerospace engineering of the polytechnic of turin (dae) and the department of mechanics and aeronautics of the university of rome "la sapienza" (dma), where the vehicle itself is a completely new machine, characterized by an original configuration, and all the activities related to the design, development, and ground and flight testing of the vehicle are performed in an academic framework. the program received funding from the italian national plan for research in antarctica (pnra) and the italian ministry for education, scientific research and innovation (miur), and it was entirely carried out in academic structures, with limited support from small and medium-size enterprises, for the realization of those mechanical, electrical and electronic components that could not be built in the workshops of the involved universities. the shrouded-fan uninhabited aerial vehicle (uav), shown in fig. 1, is made of a toroidal fuselage that envelopes two counter-rotating, three-bladed rotors, driven by three two-stroke piston engines, derived from a model used for para-gliders. a brief description of the vehicle is reported in the next paragraph, while its dimensions and expected performance are reported in tab. 1. more details on the vehicle itself, on the status of the project and on related activities can be found in [4] and [5]. next, the advanced tools employed for the design, development and testing of the engine unit will be presented. the three sections will deal with: (i) aerodynamic modeling of the shrouded-fan configuration, using advanced cfd software; (ii) solid modeling of the mechanical components, using cad software; (iii) rapid prototyping techniques for the development of the control system. as for the last point, details will be given on the control system for engine rpm control currently being tested on the vehicle prototype. a section of conclusions ends the paper.
2 "ro.w.en.a." (rotorcraft with enhanced autonomy)

fig. 1: prototype of the uav

as shown in fig. 1, the vehicle is characterized by an axial-symmetric configuration, which allows for rotations around the symmetry axis in any flight condition. moreover, the fuselage sizably improves rotor performance in hovering and at low speed [6], while protecting the rotors in the case of low-speed impacts against obstacles, thus increasing the safety level for nearby personnel during ground operations with respect to a more conventional rotary-wing configuration with unprotected rotors. these features will provide the vehicle with unique maneuvering capabilities in a confined space and sensor aiming during forward flight. the drawbacks of such a configuration are (i) a relevant penalty in terms of performance in forward flight, due to the pitch-down attitude necessary to maintain a trim condition at high speed, with an increase of the frontal area and, thus, of the total drag, and (ii) the inherent instability of the unaugmented aircraft dynamics, due to the absence of any stabilizing surface, as discussed in detail in [6] and [7]. there are some other vehicles, such as the istar mav [8], that exploit the benefits of the ducted fan. nonetheless, the istar mav is significantly different, inasmuch as it uses a ducted, constant-pitch propeller for thrust generation, thrust modulation being achieved by engine rpm control, and control moments being generated by vanes in the propeller wake. conversion to an airplane-like attitude is necessary for high-speed flight, and the (usually small) payload is placed in the central hub. the only vehicle of comparable size with a configuration similar to that of "ro.w.en.a." is the sikorsky cypher [9], but many distinctive features make the rotorcraft presented in this paper a completely original machine. the cypher is devoted primarily to military uses and, in this sense, it was designed as an expendable, single-engine vehicle. by contrast, the uav currently under development at dae and dma is aimed at civil and/or scientific applications, where the issue of safety for both the payload and the surrounding personnel is a major concern. in this framework, the design of a small-size, multi-engine machine, capable of performing one-engine-out emergency landings, is far from being a trivial task. moreover, a vehicle developed industrially can rely upon the most advanced technologies, such as the advancing blade concept (abc), for solving the problem of the retreating blade stall at high speed, and elastomeric joints for blade pitch control, technologies that are ruled out in a research program developed under the tight budget constraints typical of an academic framework.
as for the mechanical assembly, the engine unit is probably the most complex, wholly original component of the vehicle. this unit is made of three two-stroke, air-cooled piston engines, coupled with two counter-rotating, three-bladed rotors with composite, rigid blades. the engines are mounted on a titanium mount and lie downstream of the rotors, in order to maximize the cooling effect of the wake, thus avoiding the complexity and weight of a liquid cooling system. each engine is connected by its own drive train to the main rotor gear. inside the pinion, a system of clutches and free wheels isolates a failed engine as soon as it no longer delivers torque to the main gear, in order to avoid the possibility that the failure of one of the engines stops the other two. the maximum power delivered by two engines should be sufficient for safely landing the rotorcraft, as demonstrated by the power requirement estimate of [6]. the main gear drives the lower rotor, while the upper rotor is driven by the reverse gear. with this configuration, no rotating shaft is present, and this, in turn, results in a reduced vibration level, a feature that was experimentally verified during the ground testing of the rotors described in [4], up to the nominal rotor speed of 3,000 rpm. the control of the thrust and moments generated by the shrouded rotors is achieved by means of two independent, helicopter-like swashplates, the position of which is determined by six couples of rotary digital pwm servos. the carbon fiber blades are mounted on steel pivots, which must be capable of withstanding both a very high static load, due to the centrifugal force acting on the blades, and the high-frequency dynamic loads due to cyclic blade pitch variation. the high-frequency load also challenges the electrical servos. in this respect, digitally controlled pwm servos can withstand such a load only if used in a coupled configuration. if on the one hand this choice causes a considerable reduction of the overall system reliability, on the other it allowed us to go on with the research program without the development of an ad-hoc micro-actuator, an activity that, although representing on its own an interesting field for the application of advanced design technologies, was considered not strategic at the current status of the project, where the in-flight demonstration of the vehicle concept is considered the main objective. in this respect, the development of a vehicle suitable for civil applications, accompanied by the requirement for the capability of performing an emergency landing in case of engine failure, was already a very ambitious scope. the multi-engine configuration represents a sizeable penalty in terms of power-to-weight ratio and overall system complexity, from both the mechanical and the avionic system point of view. also, engine start-up and rotor speed control become more complicated, because it is necessary to properly coordinate the control action on three engine throttles, while keeping the power delivered and the fuel consumption of the three engines close to one another.
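one small figure implied by the numbers quoted in the text: with maximum engine power at 11,000 rpm (tab. 1) and a nominal rotor speed of 3,000 rpm, the drive train must provide an overall reduction of roughly 3.7:1; the actual gear staging is not detailed in the paper.

# overall drive-train reduction implied by the quoted engine and rotor speeds
engine_rpm_at_max_power = 11_000
rotor_rpm_nominal = 3_000

reduction_ratio = engine_rpm_at_max_power / rotor_rpm_nominal
print(f"required overall reduction ratio: {reduction_ratio:.2f}:1")  # 3.67:1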
the mechanical complexity of the engine mount, drive trains and blade pitch control system is accompanied by the uncertainty relative to the vehicle dynamics, the configuration of which is considerably different from that of any other conventional rotorcraft. unfortunately, sikorsky delivered to the public literature only qualitative data on the cypher, and a detailed study, presented in [6], was necessary for determining a reasonable, although uncertain, vehicle model and for studying its dynamic behavior. it was only thanks to the modern advanced design and computational tools described in the sequel that the enabling technologies required for developing such a complex vehicle were made available to faculty staff, technicians and students, allowing for surprising advances in the project, in spite of very limited human and economic resources.

table 1: uav characteristics and expected performance

external diameter:         1.9 m
rotor diameter:            1.1 m
hub diameter:              0.25 m
max. power at 11000 rpm:   42 hp
rotor angular velocity:    3000 rpm
maximum take-off weight:   800 n
payload:                   100 n
maximum speed:             30 m·s−1
endurance:                 1.5 h
service ceiling:           2000 m

3 aerodynamic design with advanced cfd tools

budget constraints ruled out, from the beginning of the research program, the possibility of wind-tunnel tests on a scale model of the shrouded-fan uav, the cost of development of an instrumented model being close to that of the full-size prototype. computational fluid dynamics software was regarded as a viable solution for obtaining a reasonable estimate of the aerodynamic behavior of the vehicle in a short time, with the possibility of "testing" several rotor-fuselage configurations in the course of a few hours. during the first phase of the program, i.e., vehicle preliminary sizing and configuration design, the main problems were the estimate of rotor performance and power demand in every flight condition, and the determination of the aerodynamic model of the vehicle to be used for flight simulation of the uav and for synthesizing suitable control laws. in this phase the numerical code vsaero, by analytical methods, inc., was used [10]. this software uses a boundary element integral method for the potential flow solution, with a boundary layer correction. an iterative wake relaxation scheme is used for determining flowfields about lifting and blunt bodies. the overall aerodynamic force and moment coefficients can be easily determined at the end of the computation, from the pressure distribution over the body. an example of the results obtained from the graphical output of vsaero is reported in fig. 2. the numerical code is extremely efficient from the computational point of view, being able to provide a solution on a common desktop computer in just a few minutes. nonetheless, the application was not straightforward for the present configuration, inasmuch as (i) the rotor-fuselage interaction is extremely complex and (ii) the wake separation can be automatically detected only in two-dimensional flows, or when sharp edges clearly indicate a reasonable separation point, a feature not present in a doughnut-shaped fuselage. as for the first issue, a strong simplifying assumption was employed, such that the rotors were modeled as an actuator disk with an assigned pressure increment. in this way, the simulated flowfield took into account the presence of the rotors when determining the pressure distribution over the fuselage. by contrast, the aerodynamic model of the rotors, based on a simple strip theory, did not take into account the influence of the fuselage on the rotor inflow.
the inflow is approximated by a triangular velocity distribution determined iteratively on the basis of a momentum conservation argument. wake separation points are relatively easy to detect when axial flow is considered, but the problem is much more complicated when forward flight is dealt with. a reasonable estimate of wake separation points was made on the basis of a pressure gradient evaluation for the potential solution and of data obtained from published experiments with flow visualizations of separation from doughnut-shaped bodies. in this sense some recent studies made at the georgia institute of technology were of great help in understanding the nature of the closed wake downstream of the fuselage. an aerodynamic database was obtained with the vsaero code, where lift, drag and pitch moment coefficients are tabulated as a function of fuselage angle of attack, for different values of the inflow parameter

k = v / (v + v_i),   (1)

where v = (u, v, w)^t is the velocity vector expressed in body axis components, v = |v|, and v_i is the velocity increment induced by the rotor. details on the aerodynamic model developed in this phase (and still used in the simulation model of the uav) can be found in [11]. plots of the aerodynamic coefficients are reported in fig. 3.
fig. 2: graphical output of cfd code vsaero
fig. 3: plot of aerodynamic force and moment coefficients acting on the uav fuselage
a second set of cfd simulations was performed using one of the most advanced, sophisticated, and complex flow solvers available on the market, the full navier-stokes code fine/turbo, developed by numeca international s.a. this package provides the user with the possibility of solving very complex flowfields, with reliable automatic mesh generation [12]. of course, the code is considerably more computationally demanding and, for this reason, its use for realizing a new database was not considered, inasmuch as several machine-months would have been necessary for completing such a job. this code was used in order to define the final configuration of the fuselage, in terms of section shape, and to investigate in greater detail phenomena not modeled by the potential solver of vsaero, such as the flowfield between the rotors, rotor-fuselage interaction, and wake separation from the fuselage in hovering, axial translation and forward flight. in the latter case, it was necessary to model the rotors as an actuator disk, inasmuch as the solver can deal with unsteady periodic solutions only in axial flow. in this way it was also possible to make a comparison with the results obtained from the potential code in a given set of working points. although this comparison cannot be considered a true validation of the aerodynamic database determined by vsaero, the combined use of a simple and efficient potential code with a more sophisticated navier-stokes solver significantly improved the confidence in the obtained aerodynamic model of the vehicle. two of the most important results obtained from the analysis performed with the fine/turbo software are the determination of the power required by the rotor system and of the rotor efficiency in hovering, two figures that play a crucial role in engine sizing and vehicle performance estimation. a required power of almost 21 kw (28 hp) was evaluated, and the beneficial effect of the toroidal fuselage at low speed received sound confirmation, when the fuselage lift in hovering (due to the suction effect at the shroud inlet) was determined to be as high as 1/3 of the total rotor thrust.
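as a quick arithmetic cross-check of the engine-out claim made in the description of the engine unit, the snippet below combines the installed power from table 1 with the hover requirement just quoted; the even split of power between the three engines is an assumption:

```python
HP_TO_KW = 0.7457

installed_hp = 42.0                  # max. power of the engine unit (table 1)
hover_required_hp = 28.0             # ~21 kW evaluated with fine/turbo
per_engine_hp = installed_hp / 3.0   # assuming three identical engines

two_engine_hp = 2.0 * per_engine_hp
print(f"two engines: {two_engine_hp:.0f} hp ({two_engine_hp * HP_TO_KW:.1f} kW), "
      f"hover needs {hover_required_hp:.0f} hp, "
      f"margin {two_engine_hp - hover_required_hp:.0f} hp")
```

on these figures, two engines just match the hover power estimate, which is consistent with the claim that they suffice for a safe landing, since a controlled descent demands less power than sustained hover.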
fig. 4: examples of fine/turbo graphical output
4 cad and solid modeling
the design of the mechanical components also relied heavily on the most advanced tools for computer aided design and solid modeling. in the very first phase of the project, autocad, by autodesk inc., demonstrated serious limitations when dealing with the design of such a complex machine. as an example, it was not possible to check for interferences in the mechanism for controlling blade pitch by rotor swashplates. solid modeling also became necessary for automatically generating the surfaces of those components that were to be realized by numerically controlled milling machines. for this reason, licenses for the use of solidworks, by solid works corp., were acquired and, later, for unigraphics, by ugs plm international. together with the aforementioned capabilities, parametric modeling of mechanical components allows for a trial-and-error approach, which would otherwise be extremely time consuming: several mechanism configurations and the dimensions of different components can be varied arbitrarily, until an optimal configuration is found that satisfies all the requirements. interference between mechanical components is ruled out at the design stage, by virtually moving all the components in the assembled mechanism. finally, the interface with the most widely used finite element codes, such as nastran, by msc software corp., allows for a user-friendly environment, where structural analysis is also carried out efficiently. instead of going into greater detail with the presentation of the functionalities of solid modeling software, which can be found in any manual of the aforementioned software, their capabilities on the present project are demonstrated by fig. 5, which represents the complex linkages that drive the rotor swashplates in order to control blade pitch according to the pwm servo commands.
fig. 5: solid model of the mechanism for blade pitch control
every part of the vehicle and every piece of equipment developed to ground test the machine was designed, modeled and realized on the basis of a detailed study carried out with these tools. the field of mechanical design is probably the framework inside which the undergraduate students that took part in the research program obtained the greatest advantage in terms of new professional skills. on the other hand, it is only thanks to these tools, the use of which could be easily learned from self-contained manuals, that students with just a basic knowledge of mechanical design could successfully take part in a “real life” project, in spite of their lack of experience.
5 control system rapid prototyping
after the realization of the rotor assembly and of the three-engine unit, which were separately ground tested, as reported in [4], ground testing of the whole engine/rotor system, coupled with the on-board computer for engine and rotor rpm control, will offer the first test benchmark for the control system software.
as explained in greater detail in [7] and [13], the flight management system (fms) is made of an inner robust control and stability augmentation system, coupled with basic autopilot functions realized by external feedback loops. during normal operations, i.e., after the engine start-up procedure, the engine throttles are controlled by the so-called rpm governor, a simple siso lead-lag compensator with three poles and two zeros and feedback on rotor speed, which must keep the rotor angular velocity at the nominal value of 3,000 rpm. large gain (28.3 db) and phase (103.4 deg) margins are required to avoid dynamic coupling between the governor and the uav controller and to preserve the stability of the complete system (including the vehicle) in the presence of several uncertainties and simplifications in the rotor dynamical model [14]. the fms hardware architecture, described in [13], is based on a set of cots pc/104 boards including:
- no. 1 eurotech 266 mhz cpu board with 64 mb dram and a 32 mb disk-on-chip used to store user’s programs and for data logging, 15 interrupt channels, 2 rs-232 serial ports: executes the flight control code, manages sensor data, organizes and commits telemetry to the ground station.
- no. 1 mesa electronics 4i23 4-channel rs-232 board: is the interface with the imu/gps main attitude/navigation sensor, the digital compass and the wireless modem.
- no. 2 diamond systems corp. quartz-mm 10 boards with 8 digital input lines, 8 digital output lines and 10 counter/timers: used to generate pwm signals for the 12 coupled servos (6 output channels) controlling the upper and lower swashplates, 3 servos for the engine throttles, and 2 servos for roll/pitch video camera control, and to interface with 5 hall-effect latches, rpm sensors for engines (3) and rotors (2).
- no. 1 diamond systems corp. diamond-mm 32 board with 32 single or 16 differential analog input channels (16 bit), 4 analog output channels (12 bit) and 24 digital i/o lines: used for analog signal acquisition, that is, egt and cylinder head temperatures, voltage and current in the power system.
- no. 1 eurotech idea-net2/et 10base-t ethernet board: used to connect the fms to a laptop computer for prior-to-flight system diagnosis and initialization.
- no. 1 diamond systems corp. 50 watt dc/dc power supply board.
development of the onboard sw is entirely carried out using the mathworks matlab/simulink and stateflow with real-time workshop and stateflow coder as automatic code generation tools. the latter produce customizable c-code directly from simulink and stateflow diagrams for stand-alone embedded system applications. the xpc target toolbox with the xpc target embedded option provides automatic implementation of the executable code in the target computer, where an efficient real-time kernel is running. supported hardware is connected to the simulink models via the s-function input/output interface [15]. in this way, rapid prototyping of the control law synthesized in the matlab/simulink environment can be performed in the hardware-in-the-loop (hitl) simulation facility, the structure of which is depicted in fig. 6.
fig. 6: architecture of control system hardware in the hitl simulation facility
even if it is by far the simplest element of the control system, the reliability of the rpm governor is of paramount importance for the vehicle. moreover, its demonstration will assess the viability of the approach and of the hw configuration, based on the elements and the design tools listed above. for this reason, an experimental setup was designed in order to test the engine system under automatic control.
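the rpm governor described above can be prototyped as a discrete-time difference equation; in the sketch below the compensator coefficients are invented placeholders only, not the actual three-pole, two-zero design of [14]:

```python
import numpy as np

# illustrative discrete siso lead-lag governor with three poles and two zeros
# (placeholder coefficients); input: rotor speed error, output: throttle command
b = np.array([0.8, -1.5, 0.72])          # numerator -> two zeros
a = np.array([1.0, -2.6, 2.25, -0.65])   # denominator -> three poles; the pole
                                         # at z = 1 provides integral action

def governor_step(error, e_hist, y_hist):
    # direct-form I difference equation: a0*y[n] = b . e[n..] - a1..3 . y[n-1..]
    e_hist = np.roll(e_hist, 1); e_hist[0] = error
    y = (b @ e_hist - a[1:] @ y_hist) / a[0]
    y_hist = np.roll(y_hist, 1); y_hist[0] = y
    return y, e_hist, y_hist

e_hist, y_hist = np.zeros(3), np.zeros(3)
rpm_ref, rpm = 3000.0, 2900.0            # nominal rotor speed, sample reading
for _ in range(5):
    throttle, e_hist, y_hist = governor_step(rpm_ref - rpm, e_hist, y_hist)
print(f"throttle command after 5 steps: {throttle:.2f}")
```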
in order to limit the risk of losing the prototype, or important parts of it, because of a major, unpredictable failure, a safe approach was undertaken where the engine is coupled with an aerodynamic brake, made of a bar with adjustable rectangular panels, instead of the actual rotor system (fig. 7).
fig. 7: solid model of the prototype with the aerodynamic brake
the brake was sized in order to create an aerodynamic torque q at the design speed ω = 3,000 rpm reasonably close to that produced by the rotor blades. using a simple strip theory for the bar and assuming a drag coefficient c_dp for the two panels of surface s_p, the torque estimate is given by

q = ρ ω² ( s_p c_dp r_p³ + a ∫_{r_h}^{r} c_d r³ dr ),   (2)

where r_p is the adjustable position of the panels, 2r is the bar length, a its thickness, and 2r_h is the diameter of the central hub. if the position of the panels is varied along the bar, the brake torque can be varied at will, in order to simulate different rotor loads. at present, the on-board hardware has successfully demonstrated the possibility of manually managing all the main tasks related to engine operations [16]. the next step, i.e. automatically managing the throttle setting with the engine running, is much more ambitious, but it is also a crucial stage for the entire project, inasmuch as most of the hardware realized by the two research units will work assembled for the very first time in its definitive configuration.
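equation (2), as reconstructed above, is easy to turn into a sizing helper; in the sketch below all numerical values are invented placeholders except the 3,000 rpm design speed, and a constant drag coefficient c_d is assumed along the bar so that the integral closes in closed form:

```python
import numpy as np

def brake_torque(rho, omega, s_p, c_dp, r_p, a, c_d, r, r_h):
    """aerodynamic brake torque per eq. (2); rho: air density [kg/m^3],
    omega: rotation speed [rad/s], s_p: panel area [m^2], c_dp/c_d: panel/bar
    drag coefficients, r_p: panel position [m], a: bar thickness [m],
    2*r: bar length [m], 2*r_h: hub diameter [m]."""
    bar_term = a * c_d * (r**4 - r_h**4) / 4.0   # integral of c_d r^3 dr, constant c_d
    return rho * omega**2 * (s_p * c_dp * r_p**3 + bar_term)

omega = 3000.0 * 2.0 * np.pi / 60.0              # design speed of the rotors
for r_p in (0.20, 0.35, 0.50):                   # sliding the panels outboard
    q = brake_torque(1.225, omega, 0.01, 1.2, r_p, 0.01, 1.0, 0.55, 0.125)
    print(f"r_p = {r_p:.2f} m -> q = {q:.0f} N·m")
```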
6 conclusions and future work
the key engineering technologies involved in the design and development of an uninhabited aerial vehicle in an academic environment were presented, with particular focus on the activities related to the development of the engine/rotor system. the paper describes in some detail how these advanced tools enabled academic staff, with contributions from undergraduate students, to complete the design of a wholly original machine, build the prototype and ground test its main components. work currently under way is aimed at demonstrating the overall viability of the approach undertaken for developing the control laws, starting from the simple siso control system in charge of keeping a constant rpm level for the rotors. the success of this phase represents a key step towards the in-flight demonstration of the prototype, i.e., a demonstration of the whole uav system.
references
[1] koo, t. j. et al.: “hybrid control of an autonomous helicopter.” in: proc. ifac workshop on motion control, grenoble, france, september 1998.
[2] la civita, m. et al.: “design and flight testing of a high-bandwidth h-infinity loop shaping controller for a robotic helicopter.” aiaa paper 2002-4836. in: proc. aiaa guidance, navigation, and control conference and exhibit, monterey, ca, august 2002.
[3] corban, j. e. et al.: “flight evaluation of adaptive high-bandwidth control methods for unmanned helicopters.” aiaa paper 2002-4441. in: proc. aiaa guidance, navigation, and control conference and exhibit, monterey, ca, august 2002.
[4] avanzini, g., d’angelo, s., de matteis, g.: “design and development of a vtol uninhabited aerial vehicle.” j. of aerospace eng., vol. 217 (2003), no. 4, p. 169–178.
[5] avanzini, g., d’angelo, s., de matteis, g.: “modeling and simulation of a shrouded-fan uav for environmental monitoring.” aiaa paper 2002-3464, aiaa 1st tech. conf. & workshop on uav systems, technologies, and operations, portsmouth (va), usa, may 2002.
[6] avanzini, g., d’angelo, s., de matteis, g.: “performance and stability of a ducted-fan uninhabited aerial vehicle.” j. of aircraft, vol. 40 (2003), no. 1, p. 86–93.
[7] avanzini, g., boserman, f., de matteis, g.: “design of a rate command μ-controller for a shrouded-fan uninhabited aerial vehicle.” aiaa paper 2003-5521. in: proc. of the aiaa guidance, navigation and control conference, austin (tx), usa, august 2003.
[8] lipera, l. et al.: “the micro craft istar micro air vehicle: control system design and testing.” proc. of the am. helicopter soc. 57th annual forum, washington (dc), usa, may 2001.
[9] thornberg, c. a., cycon, j. p.: “sikorsky aircraft’s unmanned aerial vehicle, cypher: system description and program accomplishments.” proc. of the american helicopter society 51st annual forum, fort worth (tx), usa, may 1995, p. 804–811.
[10] vsaero user’s manual, analytical methods inc., rev. e5, seattle (wa), usa, april 1994.
[11] de divitiis, n.: “aerodynamic modeling and performance analysis of a shrouded fan unmanned aerial vehicle.” proc. of the 23rd congress of the international council of the aeronautical sciences, paper 256, toronto, canada, september 2002.
[12] fine/turbo user manual. version 3.1, numerical mechanics applications, brussels, belgium, march 1999.
[13] avanzini, g., boserman, f., de matteis, g.: “development of the flight management system for a shrouded-fan u.a.v.” proc. of the xvii congress of the italian society of aeronautics and astronautics, rome, italy, september 2003.
[14] avanzini, g., de matteis, g., fresta, f.: “robust multivariable control of a shrouded-fan uninhabited aerial vehicle.” aiaa paper 2002-4703. in: proc. of the aiaa atmospheric flight mechanics conference, monterey, ca, august 2002.
[15] simulink: dynamic system simulation for matlab. the mathworks inc., 1997.
[16] avanzini, g., badami, m., d’angelo, s., de matteis, g.: “integration and preliminary testing of avionic system and engine unit of a twin rotor u.a.v.” proc. of the xvii congress of the italian society of aeronautics and astronautics, rome, italy, september 2003.
giulio avanzini, e-mail: giulio.avanzini@polito.it; salvatore d’angelo, department of aerospace engineering, politecnico di torino, c.so duca degli abruzzi, 24, turin, 10129, italy; guido de matteis, department of mechanics and aeronautics, università di roma “la sapienza”, via eudossiana, 18, rome, 00184, italy

on a quantum waveguide with a small 𝒫𝒯-symmetric perturbation
d. borisov
abstract: we consider a quantum waveguide with a small 𝒫𝒯-symmetric perturbation described by a potential. we study the spectrum of such a system and show that the perturbation can produce eigenvalues near the threshold of the continuous spectrum.
keywords: waveguide, 𝒫𝒯-symmetric potential, spectrum

in this paper we consider an example of a quantum waveguide with a small 𝒫𝒯-symmetric perturbation. the perturbed system is weakly non-self-adjoint and we employ general results of [1, 2] to study the problem. the main aim is to show that the technique suggested in the cited works can be used effectively in the perturbation theory for 𝒫𝒯-symmetric operators. let x = (x₁, x₂) be cartesian coordinates in ℝ², and let Π := {x : 0 < x₂ < π} ⊂ ℝ² be an infinite straight strip. by v = v(x) we denote a real-valued function defined on Π having bounded support and belonging to l∞(Π).
we assume that it satisfies the following assumption:

v(−x₁, x₂) = −v(x₁, x₂),   x ∈ Π.   (1)

the main object of our study is the operator

h_ε := −Δ + iεv   (2)

on Π subject to the dirichlet boundary condition. we define it rigorously as an unbounded operator in l₂(Π) with the domain w²₂,₀(Π), where the latter is the subspace of w²₂(Π) of the functions vanishing on ∂Π. the symbol ε indicates a small positive parameter. the dirichlet laplacian on Π is a closed operator in l₂(Π) and the multiplication operator by v is relatively bounded. because of this the operator h_ε is closed. it can also be shown that it is m-sectorial. the main property of the operator h_ε is 𝒫𝒯-symmetricity expressed by the identity h*_ε = p h_ε p⁻¹, where (pu)(x) = u(−x₁, x₂). our aim is to study the spectrum of the operator h_ε. we will focus our attention on the continuous, residual and point spectrum of this operator. we define these subsets of the spectrum in accordance with [3]. namely, the continuous spectrum is introduced in terms of singular sequences, the point spectrum is the set of all eigenvalues, and the residual spectrum is the complement of the continuous and point spectrum with reference to the whole spectrum. our first result describes the continuous and residual spectrum of h_ε.

theorem 1. the residual spectrum of h_ε is empty and the continuous one coincides with [1, +∞).

the proof is the same as the proof of similar results in [4]; we therefore do not give the proof here. it is well-known that the spectrum of the operator h₀ is purely continuous and coincides with [1, +∞). the small perturbation εv can generate an eigenvalue converging to the threshold of the continuous spectrum. our second theorem deals with the existence and asymptotic behaviour of such an eigenvalue. before formulating it, we introduce auxiliary notations. we denote

χ_j(x₂) := √(2/π) sin(j x₂),
v_j(x₁) := ∫₀^π v(x₁, x₂) χ_j(x₂) χ₁(x₂) dx₂,
u₁(x₁) := (1/2) ∫_ℝ |x₁ − t₁| v₁(t₁) dt₁,
u_j(x₁) := (1 / (2√(j² − 1))) ∫_ℝ e^(−√(j² − 1)|x₁ − t₁|) v_j(t₁) dt₁,   j ≥ 2.

employing these functions, we define

ṽ(x) := v(x) χ₁(x₂) − v₁(x₁) χ₁(x₂),   ũ(x) := Σ_{j=2}^∞ u_j(x₁) χ_j(x₂).

the function ũ is well-defined and belongs to w²₂(Π). it can be shown that it is an exponentially decaying solution to the problem

−Δũ − ũ = ṽ,   x ∈ Π,   ũ = 0,   x ∈ ∂Π.   (3)

finally, we introduce a number k by the formula

k := ‖u₁′‖²_{l₂(ℝ)} − (ṽ, ũ)_{l₂(Π)}.

we will show below that the first norm in this formula is well-defined.

theorem 2. if k > 0, there exists the unique eigenvalue of the operator h_ε converging to the threshold of the continuous spectrum. this eigenvalue is simple, real and satisfies the asymptotic formula

λ_ε = 1 − (k²/4) ε⁴ + O(ε⁵),   ε → 0.   (4)

if k < 0, the operator h_ε has no eigenvalues converging to the threshold of the continuous spectrum as ε → 0. in particular, if

v(x) = v₁(x₁)   (5)

the number k is positive, and if

v₁ ≡ 0   (6)

the number k is negative.
proof. we introduce the numbers

k₁ := (i/2) ∫_Π v(x) χ₁²(x₂) dx,   k₂ := −(1/2) ( ∫_Π ṽ ũ dx + ∫_ℝ v₁ u₁ dx₁ ).

it follows from [2, th. 1] that if

re k₁ > 0, or re k₁ = 0, re k₂ > 0,   (7)

there exists the unique eigenvalue of h_ε converging to the threshold of the continuous spectrum, and the asymptotics of this eigenvalue reads as follows

λ_ε = 1 − ε² (k₁ + ε k₂ + O(ε²))²,   ε → 0.   (8)

it also follows from [2, th. 1] that if

re k₁ < 0, or re k₁ = 0, re k₂ < 0,   (9)

the operator h_ε has no eigenvalues converging to the threshold as ε → 0. thus, it is sufficient to calculate the numbers k₁, k₂. the identity (1) implies that v₁ is an odd function, and hence

k₁ = 0.   (10)

therefore, it is sufficient to calculate k₂ and check its sign. the mean value of v₁ being zero, the function u₁ is constant as x₁ is large enough. this allows us to write

∫_ℝ v₁ u₁ dx₁ = ∫_ℝ u₁″ u₁ dx₁ = −∫_ℝ (u₁′)² dx₁.   (11)

hence, k₂ = k/2. we substitute this formula and (10) into (8), and by (7), (9) we conclude that if k > 0, there exists the unique eigenvalue of h_ε satisfying (4). if k < 0, the operator has no eigenvalues converging to the threshold of the continuous spectrum. using (1), one can check easily that λ̄_ε is an eigenvalue of h_ε as well. it converges to the threshold and by the uniqueness of such an eigenvalue we conclude that this eigenvalue is real. let us prove that the conditions (5), (6) are sufficient for the eigenvalue to be present or absent. assume first (5). in this case ṽ ≡ 0, ũ ≡ 0 and

k = ‖u₁′‖²_{l₂(ℝ)} > 0.   (12)

if the relation (6) holds true, the function u₁ is identically zero, and k = −(ṽ, ũ)_{l₂(Π)}. employing (3), by analogy with (11) in the same way we check

∫_Π ṽ ũ dx = ‖∇ũ‖²_{l₂(Π)} − ‖ũ‖²_{l₂(Π)}.   (13)

it follows from the definition of the function ũ that

‖∂ũ/∂x₂‖²_{l₂(Π)} ≥ 4 ‖ũ‖²_{l₂(Π)},

and hence ‖∇ũ‖²_{l₂(Π)} ≥ 4 ‖ũ‖²_{l₂(Π)}. in view of (12), (13) and this estimate we conclude that

k ≤ −3 ‖ũ‖²_{l₂(Π)} < 0.
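since the formulas above are reconstructed from a corrupted scan, the following numerical sketch is only an illustration and not part of the original paper: it evaluates the constant k for a sample potential that is odd in x₁, using the mode expansion and the explicit formulas for u₁ and u_j (grid sizes, the potential and the series truncation are arbitrary choices):

```python
import numpy as np

# numerical illustration of k for a sample odd potential on the strip (0, pi)
x1 = np.linspace(-10.0, 10.0, 2001)
x2 = np.linspace(0.0, np.pi, 401)
dx1, dx2 = x1[1] - x1[0], x2[1] - x2[0]

def chi(j):
    # transverse dirichlet modes chi_j(x2) = sqrt(2/pi) sin(j x2)
    return np.sqrt(2.0 / np.pi) * np.sin(j * x2)

V = np.outer(x1 * np.exp(-x1**2), np.sin(x2) ** 2)   # satisfies V(-x1,x2) = -V(x1,x2)

def V_mode(j):
    # V_j(x1) = int_0^pi V(x1,x2) chi_j(x2) chi_1(x2) dx2
    return V @ (chi(j) * chi(1)) * dx2

# u_1'(x1) = (1/2) int_R sgn(x1 - t) V_1(t) dt, since u_1 = (1/2) int |x1-t| V_1 dt
V1 = V_mode(1)
u1p = 0.5 * (np.sign(x1[:, None] - x1[None, :]) @ V1) * dx1

# (tilde V, tilde u) = sum_{j>=2} int_R V_j u_j dx1, by orthonormality of the modes
corr = 0.0
for j in range(2, 12):                                # truncated series
    mu = np.sqrt(j**2 - 1.0)
    Vj = V_mode(j)
    uj = (np.exp(-mu * np.abs(x1[:, None] - x1[None, :])) @ Vj) * dx1 / (2.0 * mu)
    corr += np.dot(Vj, uj) * dx1

k = np.dot(u1p, u1p) * dx1 - corr
print(f"k = {k:+.6f} -> eigenvalue near the threshold {'emerges' if k > 0 else 'is absent'}")
```

by theorem 2, a positive k signals a (real) eigenvalue emerging below the threshold at the rate 1 − (k²/4)ε⁴.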
in conclusion, we observe that the results of [1, 2] also allow one to study the existence of the eigenvalues emerging from the higher thresholds in the continuous spectrum, that are j², j ≥ 2. it was shown in [2] that, if it exists, this eigenvalue is unique. as in theorem 2, this fact implies that the eigenvalue is real and therefore in this case we are dealing with embedded eigenvalues.
acknowledgments
this work has been supported in parts by rfbr (05-01-97912) and by the czech academy of sciences and ministry of education, youth and sports (lc06002). the author has also been supported by a marie curie international fellowship within the 6th european community framework (mif1-ct-2005-006254) and gratefully acknowledges support from the deligne 2004 balzan prize in mathematics and from the grant of republic bashkortostan for young scientists and young scientific collectives.
references
[1] gadyl’shin, r.: on regular and singular perturbation of acoustic and quantum waveguides, comptes rendus mécanique, vol. 332 (2004), no. 8, p. 647–652.
[2] gadyl’shin, r.: local perturbations of quantum waveguides, theor. math. phys., vol. 145 (2005), no. 3, p. 1678–1690.
[3] glazman, i. m.: direct methods of qualitative spectral analysis of singular differential operators. london, oldbourne press, 1965.
[4] borisov, d., krejčiřík, d.: 𝒫𝒯-symmetric waveguide, submitted. preprint: arxiv:0707.3039.
dr. denis i. borisov, phone: +49 371 431 3215, e-mail: borisovdi@yandex.ru, borisovdi@ic.bashedu.ru, faculty of mathematics, chemnitz university of technology, reichenhainer str. 39, d-09107 chemnitz, germany

mechanical and physical properties of cement mixtures for 3d processing
jiří litoš^a,*, vladimír šána^a, adam uhlík^b, karel kolář^a, markéta nguyen^a
a czech technical university in prague, faculty of civil engineering, experimental centre, thákurova 7, 160 00 prague 6 – dejvice, czech republic
b slovak university of technology in bratislava, faculty of civil engineering, department of materials engineering and physics, radlinského 2766/11, 810 05 bratislava, slovak republic
* corresponding author: litos@fsv.cvut.cz
abstract. in this paper, information about cementitious composite materials for further 3d processing is discussed and supplemented. much of the research in this area focuses primarily on cement composites suitable for 3d printing. nevertheless, 3d printing is not the only robotic processing technique. another such technology is modelling with the help of a robotic arm, which can be used to create various elements that fulfil their original but also aesthetic function. the robotic arm creates, using a variety of sculptural or hand tools, a final unique relief of a given element. three different cement composite mixtures are discussed and their mechanical, physical and thermophysical properties are evaluated. the research aims to investigate and optimise these composites for robotic sculpturing and 3d printing.
keywords: 3d processing, robotic sculpturing, cementitious composite, printing technology.
1. introduction
currently, there is a lot of research in the world focused on optimising the composition of mixtures in 3d processing. 3d printing in particular has received considerable development and investment in recent years. in general, technologies based on 3d processing are on the rise. the surface processing of cement composites using a robotic arm has been overshadowed by 3d printing technology, although this technology has great potential. the research that led to this paper focuses primarily on the development of a special mixture for this purpose and its comparison with cementitious composite mixtures used for 3d printing. the use of cement composites modified with the help of robots is a new, innovative, and partly automated technology, which has recently been significantly promoted in construction. this production method has started to develop very fast, which brings the need to design a suitable composition of the mixture for this technology. it is important to clearly define the required properties of the material in the fresh state, as well as during the processes of solidification and hardening. such a technology brings the potential for an interesting architectural concept, either of the whole building or its parts and elements. an important and interesting characteristic of modern structural design is increasing the fire and dynamic resistance of buildings. however, this effort also increases the financial impact on the resulting architectural work.
therefore, the principle of prefabrication is used in civil engineering, whose main advantage lies especially in the speed of construction and the possibility of fast delivery of the prepared precast elements directly to the construction site. a significant advantage of using precast parts is the possibility of control and guarantee of the mechanical properties of these elements. however, for precast elements, it is not advantageous to create original shapes, due to the difficulty of formwork for irregular shapes and the impossibility of reusing such formwork. this has a very negative impact, especially on the financial aspect of the whole process and the final product. this situation can be conveniently solved by the mentioned 3d processing using a robotic hand. with the mentioned robotic sculpture, the robotic hand quickly and efficiently creates the original shape on the surface of the mixture in a simple formwork, with the help of programmed techniques and appropriate tools. therefore, the need to create a special formwork for each element, which would be used only once, is essentially eliminated. a more popular 3d technique using cement composite is 3d printing. the robotic processing mentioned above shapes the element only on the surface, directly in the fresh cement mixture in the formwork, and the elements are mostly of a non-load-bearing character. 3d printing, in contrast, can be used mainly for creating load-bearing structures. both these 3d cement composite processing technologies are suitable for producing original construction shapes and eliminate the use of formwork. 3d printing of buildings and other structures is used mainly for vertical load-bearing structures, various home furnishings, or for interesting design works. these 3d technologies have started to develop rapidly, but there are several conditions associated with them. during the use of a robotic hand, plasticity and a controllable setting time are required from the mixture, while for 3d printing technology, mainly good extrudability and hardening of the layers immediately after application are preferred. another important factor is to ensure a suitable consistency, a sufficient processing time, and maintaining the shape of the already extruded structure after the application. when shaping the surface of a cement composite, it is crucial to achieve good workability and consistency of the mixture. the primary objective is to create a mixture that will be able to remain in a plastic state for a time sufficient for the robot or artist to create the desired work of art. in 3d processing, the plastic state is understood as a mixture that has the consistency of a stiffer paste, but whose onset of solidification is significantly delayed. currently, there is a lot of extensive research being carried out worldwide to design suitable cementitious mixtures for 3d printing technology. most of these studies, such as nerella et al. [1], were focused mainly on extrudability, buildability and processing time. another team, kazemian et al. [2] at the university of southern california, has shown that the addition of micro-silica and nanofibers leads to a significant improvement in shape stability, and that these fibres have better properties in such a material than the addition of polypropylene fibres. rahul et al.
[3] focused on optimum buildability and extrudability; the optimal extrudability was achieved when the yield strength of the mixture was in the range of 1.5–2.5 kpa. after the addition of the additives, the yield strength and also the processing time of the mixture increased significantly. after 30 minutes, the increase was almost double compared to the reference mixture. unfortunately, there are currently not many studies focusing on the area of robotic hand sculpturing, so no significant progress has been observed here. another important and also necessary advancement in the field of civil engineering is the improvement of energy efficiency in buildings. this topic is closely related to the thermo-physical properties of materials. the thermo-physical properties of the material subsequently affect the heating demand and the associated greenhouse gas emissions. the knowledge of thermo-physical properties can be applied to cementitious composites used in 3d technologies such as robotic sculpturing, specifically in the design and implementation of cladding elements created by this method. it is generally known that the requirements for additional insulation are lower if the material has better thermal properties. this finding is particularly important for 3d printed buildings, where the potential implementation of thermal insulation could cause difficulties during installation. last but not least, rheology plays a very important role during the design process of any concrete or concrete-like material. in the literature, we can find, for instance, feys [4] and paul [5], who were dealing with this phenomenon.
2. mixtures
as part of the experimental programme, three mixtures were selected and subjected to detailed testing. the first, reference mixture, marked as number 1, is a commercially available dry mixture, masterflow 3d 100 from basf, intended for 3d printing. this mixture is based on the main component – portland cement. the manufacturer describes this dry bagged mixture as a special non-shrinking mixture with a grain size of up to 0.5 mm. as additional characteristics of the mixture, the manufacturer also quotes a good workability, suitable for 3d printing, of 1 h at +20 °c with 0.165 l·kg−1 of water, high strength, zero segregation, and fast hardening. the two newly designed mixtures, number 2 and 3, are composite mixtures resulting from a suitable combination of cement and commercially available additives. these mixtures, in the specific mixing ratios, lead to the desired effect. thanks to a controlled and continuous hydration process, such mixtures retain a sufficient workability time and also show a rapid increase in initial, especially mechanical, properties. these mixtures are subsequently able to achieve a reliable stability after curing. all of these properties make the mixture easy to handle and suitable for a wide range of practical uses. the first mixture (number 1) was mixed exactly according to the procedure specified by the manufacturer. the water-cement ratio for this reference mixture is prescribed to be 0.156. this value of the water-cement ratio is generally relatively low, and so it can be assumed that the mixture contains admixtures or additives which affect the workability of the fresh mixture. the procedure was, therefore, as follows – dry mixes 2 and 3 were first weighed with a small amount of water and mixed thoroughly. then, more water was gradually added until the desired consistency of the mixture was reached.
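the batching implied by these ratios can be scripted; the sketch below uses the ratio quoted above for mixture 1 and those quoted just below for mixtures 2 and 3, assumes the ratios apply to the dry-mix mass, and the 25 kg batch mass is a made-up example:

```python
# batching helper (illustrative only): water to add for a given mass of dry mix
ratios = {"mixture 1 (masterflow 3d 100)": 0.156,
          "mixture 2": 0.183,   # quoted just below in the text
          "mixture 3": 0.163}   # quoted just below in the text

dry_mass_kg = 25.0              # hypothetical one-bag batch
for name, w in ratios.items():
    print(f"{name}: add {w * dry_mass_kg:.2f} kg of water per {dry_mass_kg:.0f} kg of dry mix")
```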
the appropriate water-cement ratio was considered to be 0.183 for mixture 2 and 0.163 for mixture 3; these values are very close to those of the reference mixture. the mixtures exhibited optimum consistency during casting and forming in the formwork – the consistency of a stiffer paste. earlier, during the collaboration with federico díaz, 3d forming technology was used. figure 1a shows the forming of sand by a robotic hand, which is then used as part of the formwork for the individual cladding pieces. figure 1b shows the cladding created by 3d sculpture – federico díaz (tunelblanka-info) in prague [6].
figure 1. two examples of robotic sculpturing, the manufacturing process and the final application: (a) the formwork surface formation process by the robotic hand, see [6]; (b) artistic application of tiling elements created by a robotic hand in prague 6 [6].
3. experimental programme
the experimental programme included testing of a series of selected mechanical, physical, and thermophysical properties of the mixtures. test specimens in the form of cubes with edge lengths of 70 mm and 50 mm and beams with dimensions of 160 × 40 × 40 mm were used to test these properties. the beams were used for the mechanical properties tests and the cube-shaped specimens for the thermal and moisture experiments. most of the samples, namely two thirds, were stored in an environment with elevated humidity. the remaining samples, intended for the 7-day physical properties, were placed in a hot-air oven where drying was carried out until the weight settled. physical and mechanical properties of the samples were tested, as usual, after 7, 14, and 28 days. due to the continuous procedure of measurements carried out at different stages of concrete hardening, we are able to obtain a more comprehensive overview of the investigated cementitious composites and the evolution of their characteristics.
3.1. mechanical properties
the whole experimental programme started with the determination of the mechanical properties on five samples for each mixture. the flexural tensile strength test was performed on 160 × 40 × 40 mm specimens with a support distance of 100 mm, using a three-point bending test. after the bending test, a compression test according to en 12390-3 was performed on both fragments, with a contact area of 40 × 40 mm. figure 2a and figure 2b show a compression test in which a clear difference in failure can be observed between the samples of mixtures 1 and 2. this mode of failure indicates that mixture 1 contains an admixture of a specific type of microfibre, which here acts as a dispersed reinforcement.
figure 2. testing of the mechanical properties by hydraulic jack: (a) pressure test of sample 1, with fibres, and its failure pattern; (b) pressure test of sample 2, without fibres, and its failure pattern.
3.2. physical and thermo-physical properties
basic physical parameters, such as bulk density and matrix density, were determined using standard test methods, namely gravimetry and pycnometry. in this research, the thermo-physical parameter tests were
an isomet 2114 instrument was used for these measurements, which is equipped with interchangeable probes. these measurements were performed on cubes with an edge length of 70 mm, one measurement for each sample. after the drying process, the samples were placed in a desiccator until the weight was stabilised, see figure 3a. a silica gel was applied to the bottom of the desiccator to provide a stable and especially non-moist environment. subsequently, after the connection, the entire instrument was set up and a hot disk was attached to the top wall of the sample according to the expected measurement range. finally, the required number of measurements was set. 201 j. litoš, v. šána, a. uhlík et al. acta polytechnica (a). samples placement of tested mixtures in a desiccator. (b). the measurement of changes using a rubber wavy-line mould. figure 3. testing of the physical and thermo-physical properties. as a part of the experimental program, measurements of volume changes in the solidification and hardening phases were also performed. it is generally known that the formation and subsequent development of cracks in the hydration phase is the result of volume changes and these significantly affect the durability and service life of the entire structure. in the case of artistic processing, crack management is even more important. the aim, is therefore, to restrict, as much as possible, the volumetric changes that play an important role in the artistic processing of cementitious composites. most often, these volume changes are examined during the hydration phase, when the fresh mixture changes from a so-called quasi-fluid to a solid phase of the mass. the volume changes of the mixtures were investigated using the rubber wavyline false mould method on the rubber, see figure 3b. a mould of this shape is placed in a gripping chair, which primarily prevents horizontal deflection of the mould. this wavy-line shape has the required properties, such as high vertical flexibility, while at the same time sufficiently resisting horizontal length changes. in the first step, the moulds were filled with the prepared mixtures and then compacted by hand. immediately afterwards, the filled and compacted moulds with the fresh mixture were covered with a foil on the upper surface. this step should eliminate the exchange of moisture parameters with external conditions. a reflective surface was then placed on top of it for measurements by laser sensors. to ensure a constant ambient temperature, the whole assembly was placed in a thermostatic chamber. thereafter, in order to achieve a non-contact method of optical distance measurement, the device is equipped with a sensor support structure for a simple displacement measurement in the vertical direction. scanning was performed by using laser reflective sensors. there is a reflective surface on the top of the specimens which allows the reflection of the transmitted and received light beam by the laser sensor. the reflection is recorded by the measuring station during the measurement, from where it is exported as a text file and then evaluated. with this method, we can continuously measure length changes with an accuracy of 0.6 µm. 4. results and discussion 4.1. mechanical properties the graphs below show the behaviour of the tested cementitious composites in flexural tensile strength (figure 4) and the evolution of compressive strengths (figure 5) over time. 
if we take a look at the flexural tensile strength, it can be seen that mixes 1 and 2 showed a rapid increase at 7 days and a further step increase at 28 days, compared to mix 3, where the strength increased continuously in a linear trend. the presence of fibres in mix 1 was already visible in the flexural tensile strength tests. these fibres are activated as the load develops and increase the flexural tensile strength by preventing brittle fracture. the 28-day tensile strength values of mixes 1 and 3 were comparable. thus, it can be concluded that the results obtained from the compressive and tensile strength tests indicate that the tested mixes 1 and 3 are close to the values of high strength concrete (60 mpa in compression), see [7]. regarding the development of the compressive strength values of the individual mixtures 2 and 3, we can observe a standard trend, with almost the same values after 7 and 14 days. after 28 days, however, we can notice a significant difference between the mixtures, especially for mixture 3, where we can observe a rapid increase in compressive strength from 20 mpa to 71.3 mpa. for both new mixtures 2 and 3, we can see a faster increase in strength in comparison to the reference mixture 1. however, compared to commercially available concretes, our reference mixture 1 still achieved relatively high strength characteristics.
figure 4. graphical representation of the flexural tensile strength after 7, 14, and 28 days.
figure 5. graphical representation of the compressive strength after 7, 14, and 28 days.

table 1. the density and mechanical properties of the cement composites for 3d printing technology.
  mixture | density ϱ [kg·m−3] | compressive strength rc [mpa] | tensile bending strength rf [mpa]
  1       | 1990               | 51.9                          | 13.3
  2       | 2040               | 52.9                          | 11.4
  3       | 2100               | 71.3                          | 13.7

table 2. thermophysical properties of the measured mixtures.
  mixture | age [days] | λ [w·m−1·k−1] | cϱ [j·kg−1·k−1] | a [×10−6 m2·s−1]
  1       | 7          | 1.17          | 881.01          | 0.68
  1       | 14         | 1.19          | 922.43          | 0.68
  1       | 28         | 1.28          | 896.85          | 0.73
  2       | 7          | 1.24          | 869.60          | 0.76
  2       | 14         | 1.28          | 948.13          | 0.72
  2       | 28         | 1.35          | 951.95          | 0.75
  3       | 7          | 1.36          | 876.09          | 0.79
  3       | 14         | 1.45          | 914.09          | 0.81
  3       | 28         | 1.48          | 870.86          | 0.87

the table 1 summarises samples 1, 2 and 3, their densities, and the measured mechanical properties. according to pytlík [8], these composites can be classified from lightweight concrete lc (800–2000 kg·m−3) to ordinary concrete c (2000–2600 kg·m−3). the reference mixture 1 had the lowest bulk density, while mixture 3 reached the highest values of the three mixtures, as shown in table 1. these bulk densities also correspond to the measured compressive strengths. from the bulk density evaluations, it is apparent that mixture no. 1 is the most suitable for 3d printed structures. mixtures 2 and 3 have higher bulk densities and can, therefore, be considered less suitable for non-load-bearing structures. although they have a higher compressive strength, which might seem to be an advantage, their relatively high bulk density increases the self-weight of the structure in a negative way. for the purposes of lightweight cladding materials, such as tiles or pavers, a high bulk density is disadvantageous.
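a quick cross-check of table 2 (not in the original paper): the three reported quantities should be tied by a = λ/(ϱ·c). using the bulk densities of table 1 the identity holds only approximately, plausibly because those densities are not the dry densities of the thermal specimens:

```python
# consistency check of table 2: thermal diffusivity a = lambda / (rho * c),
# with bulk densities rho taken from table 1 (28-day rows shown)
rho = {1: 1990.0, 2: 2040.0, 3: 2100.0}          # [kg/m^3], table 1
rows = [(1, 1.28, 896.85, 0.73e-6),
        (2, 1.35, 951.95, 0.75e-6),
        (3, 1.48, 870.86, 0.87e-6)]              # (mix, lambda, c, tabulated a)

for mix, lam, c, a_tab in rows:
    a_calc = lam / (rho[mix] * c)                # [m^2/s]
    print(f"mixture {mix}: a = {a_calc:.2e} m2/s vs tabulated {a_tab:.2e} m2/s")
```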
4.2. physical and thermo-physical properties
this section summarises the results and evaluation of the physical and thermo-physical properties of the tested cementitious composite mixtures. the data in table 2 present the material results, specifically the thermo-physical characteristics, as they relate to research in the development of mixtures for 3d robotic processing. the table shows the measurements of these properties at 7/14/28 days. the timing is determined by the simultaneous measurement of the mechanical properties. only a slight increase in the measured thermo-physical values can be observed, while the strength characteristics increase significantly with increasing concrete age. as can be seen from the measured values of the thermal conductivity coefficient, all investigated composites can be classified according to their thermophysical properties and compared to standard concrete with a density of 2100–2300 kg·m−3, which has a thermal conductivity coefficient of approximately 1.23–1.36 w·m−1·k−1. mix number 3 reached values of 1.481 w·m−1·k−1, which is the same value of the thermal conductivity coefficient as for reinforced concrete according to ražnjević, see [9]; the values reported there are in the range of 1.43–1.74 w·m−1·k−1. when considering the design of elements that will be exposed to different temperature and humidity conditions, we must take into account that these values are given for dry material. therefore, we must also consider the practical humidity value for the cladding element. for materials with a porous structure, a higher temperature of the material in the pores can cause an increase in the values of the thermo-technical and thermo-physical properties. therefore, a material with a lower thermal conductivity coefficient works better as an insulator. the values shown in figures 6 and 7 correspond to the strength values of the individual mixtures. this indicates that the higher the strength of the material, the worse its thermal insulation properties. mixture 1 shows the best results in terms of conductivity, due to the lowest value of the thermal conductivity coefficient; next is mixture number 2, and mixture 3 is the least suitable from the thermal conductivity point of view, as shown in figure 6. however, this comparison is only meaningful for the dried state of the samples; in real conditions, moisture would play an important role in increasing the thermal conductivity.
figure 6. graphical representation of the thermal conductivity coefficient after 7, 14, and 28 days.
figure 7. graphical representation of the development of the specific heat capacity at 7, 14, and 28 days.
figure 7 shows the evolution of the specific heat capacity, where mixture number 2 demonstrates the best values, which improve further with increasing age, for example, after 28 days. on the contrary, mixtures 1 and 3 show a decrease in their heat capacities after 28 days. however, all the mixtures displayed different values of volume changes during the solidification phase, as shown in figure 8. mixture 3 showed the most significant volume changes, while the other mixtures, 1 and 2, had almost identical values in this respect. interestingly, mixture 3 also has the best strength properties.
figure 8. volume changes overview of the measured mixtures.
the importance of each of these properties depends on the purpose of application and use. it is a matter of
preference whether high strength or durability is more important for the resulting material.
5. discussion
like other authors who have devoted their efforts to the development of materials for 3d robotic printing, we arrived at results showing good and comparable values of compressive strength. for example, s. j. woo et al. [10] determined the appropriate 28-day compressive strength of such materials to be at least 50 mpa; our reference mixture 1 and mixture 2 slightly exceeded this value, with mixture 3 reaching a compressive strength exceeding 70 mpa after 28 days. this is almost the same value as for the mixture tested by a. p. rubin et al. [11]. in terms of tensile strength during bending, our samples showed almost two-times higher values after 28 days than the mixture from s. j. woo et al. [10]; on the other hand, they were only slightly lower than the values measured in the test study of a. p. rubin et al. [11]. according to the bulk densities of our reference mixture and our mixtures 2 and 3, we classify these materials in the category of lightweight concrete according to pytlík [8]. our mixtures showed slightly lower bulk densities than the material from s. j. woo et al. [10]. from the study of e. lublasser et al. [12], we know that the lower the bulk density, the more suitable the material is for use in such a technology. therefore, we can state that our results reached values very similar to those of comparable studies by other authors who dealt with the development and measurement of the mechanical and thermophysical properties of such mixtures for 3d printing.
6. conclusion
the results of the experimental programme show that all the tested mixtures are suitable for robotic processing. it can be stated that mixture 1 is more suitable for 3d printing and mixtures 2 and 3 are more suitable for robotic sculpturing, mainly due to their rheology. according to the mechanical property measurements, the proposed mixtures show better compressive strength values than the reference mixture. however, their bulk density and thermal conductivity values are higher, and therefore worse, than in the case of the reference mixture. the intended application of these materials is very important. unless it is necessary to use a high-strength composite, it is not recommended to use mixtures 2 and 3, which would significantly increase the self-weight of the structure. considering the physical properties of the tested composites, density may be the most important factor when the mixture is used for cladding material with artistic elements. on the contrary, strength and thermal insulation properties will be crucial if the material is used for load-bearing, solid or other external structures. the obtained knowledge and measured values will be used in a future investigation within the project, which will be focused on a better balance of cement mixtures, with a focus on the adjustability of the processing time and other physical-mechanical properties, whose optimal configuration is very important for 3d processing. the main goal of the project is to create a mixture that is suitable for robotic processing and, at the same time, commercially available as a dry bagged mixture.
acknowledgements
this research, “development of a special cementitious composite suitable for 3d robotic processing”, is supported by the ctu in prague, under project no. ltausa19018.
the authors would like to gratefully acknowledge the financial support from the ministry of education, youth and sport of the czech republic within inter-excellence, inter-action, ltausa19018.
references
[1] v. n. nerella, s. hempel, v. mechtcherine. effects of layer-interface properties on mechanical performance of concrete elements produced by extrusion-based 3d-printing. construction and building materials 205:586–601, 2019. https://doi.org/10.1016/j.conbuildmat.2019.01.235
[2] a. kazemian, x. yuan, e. cochran, b. khoshnevis. cementitious materials for construction-scale 3d printing: laboratory testing of fresh printing mixture. construction and building materials 145:639–647, 2017. https://doi.org/10.1016/j.conbuildmat.2017.04.015
[3] a. rahul, m. santhanam, h. meena, z. ghani. 3d printable concrete: mixture design and test methods. cement and concrete composites 97:13–23, 2019. https://doi.org/10.1016/j.cemconcomp.2018.12.014
[4] d. feys, k. h. khayat, r. khatib. how do concrete rheology, tribology, flow rate and pipe radius influence pumping pressure? cement and concrete composites 66:38–46, 2016. https://doi.org/10.1016/j.cemconcomp.2015.11.002
[5] s. c. paul, y. w. d. tay, b. panda, m. j. tan. fresh and hardened properties of 3d printable cementitious materials for building and construction. archives of civil and mechanical engineering 18(1):311–319, 2018. https://doi.org/10.1016/j.acme.2017.02.008
[6] tunelblanka-info. výtvarník a robot proměnili výdech v umělecké dílo. 2017, [2022-11-30], https://www.tunelblanka.info/vytvarnik-a-robot-promenili-vydech-v-umelecke-dilo/.
[7] čsn en 12390-3 část 3. pevnost v tlaku zkušebních těles. 2017, úřad pro technickou normalizaci, metrologii a státní zkušebnictví.
[8] p. pytlík. technologie betonu. 2. vydání. vutium, brno, czech republic, 2000.
[9] k. ražnjevič. thermodynamic tables. 1. vyd. bratislava: alfa, 2 sv. edícia energetickej literatúry, 1984.
[10] s.-j. woo, j.-m. yang, h. lee, h.-k. kwon. comparison of properties of 3d-printed mortar in air vs. underwater. materials 14(19):5888, 2021. https://doi.org/10.3390/ma14195888
[11] a. p. rubin, l. c. quintanilha, w. l. repette. influence of structuration rate, with hydration accelerating admixture, on the physical and mechanical properties of concrete for 3d printing. construction and building materials 363:129826, 2023. https://doi.org/10.1016/j.conbuildmat.2022.129826
[12] e. lublasser, t. adams, a. vollpracht, s. brell-cokcan. robotic application of foam concrete onto bare wall elements – analysis, concept and robotic experiments. automation in construction 89:299–306, 2018. https://doi.org/10.1016/j.autcon.2018.02.005
how to understand the structure of beta functions in six-derivative quantum gravity?
lesław rachwał
universidade federal de juiz de fora, departamento de física – instituto de ciências exatas, 33036-900, juiz de fora, mg, brazil
correspondence: grzerach@gmail.com
abstract. we extensively motivate the studies of higher-derivative gravities, and in particular we emphasize which new quantum features theories with six derivatives in their definitions possess. next, we discuss the mathematical structure of the exact on the full quantum level beta functions obtained previously for three couplings in front of generally covariant terms with four derivatives (weyl tensor squared, ricci scalar squared and the gauss-bonnet scalar) in minimal six-derivative quantum gravity in d = 4 spacetime dimensions. the fundamental role here is played by the ratio x of the coupling in front of the term with weyl tensors to the coupling in front of the term with ricci scalars in the original action. we draw a relation between the polynomial dependence on x and the absence/presence of enhanced conformal symmetry and renormalizability in the models where formally x → +∞ in the case of four- and six-derivative theories respectively.
keywords: quantum gravity, higher derivatives, beta functions, uv-finiteness, conformal symmetry.
1. introduction and motivation
six-derivative quantum gravity (qg) is a model of the quantum dynamics of the relativistic gravitational field with higher derivatives. it is a special case of general higher-derivative (hd) models, which are very particular modifications of einsteinian gravitational theory. the latter is based on a theory with up to two derivatives (the addition of the cosmological constant term brings terms with no derivatives of the metric field at all) and is simply based on the action composed of the ricci scalar r understood as a function of the spacetime metric. in this setup, we consider that the gravitational field is completely described by the symmetric tensor field gµν, the metric tensor of the pseudo-riemannian smooth differential manifold of physical spacetime. in einstein’s theory the scalar r contains precisely two ordinary (partial) derivatives of the metric. the action obtained by integrating the densitized lagrangian √|g| r over the spacetime volume is called the einstein-hilbert action. the qg models based on it were originally studied in [1–3]. below we consider modifications of the two-derivative gravitational theory, where the number of derivatives on the metric is higher than just two. it must be remarked, however, that the kinematical framework of general relativity (gr) (like the metric structure of the spacetime manifold, the form of the christoffel coefficients, the motion of probe particles, or the geodesic and fluid dynamics equations) remains intact for these modifications.
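for concreteness, the two-derivative action referred to above can be written in the standard textbook normalization (this display is added here for the reader's convenience, not part of the original text):

```latex
% einstein-hilbert action in d = 4, with an optional cosmological constant term
S_{\rm EH} \;=\; \frac{1}{16\pi G}\int d^{4}x\,\sqrt{|g|}\,\bigl(R - 2\Lambda\bigr)
```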
therefore these higher-derivative (hd) models of the gravitational field are still consistent with the physical basis of gr; the only difference is that their dynamics – the dynamics of the gravitational field – is described by classical equations of motion of higher-derivative character. hence these modifications of standard einsteinian gravity remain within the set of generally relativistic models of gravitational dynamics. they can be considered both on the classical and on the quantum level, with the benefit of new and deeper insights into the theory of the relativistic gravitational field.

our framework on the classical level can be summarized by saying that we work within the set of metric theories of gravity: the metric, and only the metric tensor, fully characterizes the configurations of the gravitational field, here represented by pseudo-riemannian differential manifolds of relativistic four-dimensional spacetimes. in this work we therefore neglect other classical modifications of gr, such as adding torsion, non-metricity, other geometric elements, or additional scalar, vector or tensor fields. this choice of dynamical variables for the relativistic gravitational field bears on both the classical dynamics and the quantum theory.

theories with higher derivatives naturally come with advantages as well as with some theoretical problems. this happens already on the classical level, when they are supposed to describe the modified dynamics of the gravitational field (the metric field $g_{\mu\nu}(x)$ living on the spacetime manifold). these successes and problems get amplified even more on the quantum level. the pros of hd theories give strong motivations for considering these modifications of einstein's gravitation seriously. we will briefly discuss various possibilities for resolving the problems of higher-derivative field dynamics in one of the last sections of this contribution, while here we concentrate on the motivations.

on the classical level, the set of hd gravitational theories can be viewed as one of the many possible modifications of the two-derivative theory. it is true that current observations, mainly in cosmology and on intergalactic scales, point to some possible failures of einsteinian gravity, or to our lack of understanding of the proper nature of the sources of gravity in these situations. various views on this situation and its explanation by gravitational theories are possible. in the first view, einstein-hilbert theory is still fine, but we need to add locally some new exotic matter source (exotic meaning endowed with non-standard properties). since we do not know what these sources of the energy-momentum tensor (emt) of matter are built of (for example, from which quantum fields of particle physics as understood nowadays), we call the missing sources dark energy and dark matter, respectively. in the contrary approach, the gravitational source is standard, that is, we describe what we really see in the galaxies and in the universe, without any "dark" components, but the gravitational theory itself should be modified.
in this second path, the internal dynamics of the gravitational field is changed, and that is why it reacts differently to the same classical visible emt source of standard matter. one promising option is to add higher derivatives of the metric on the classical level, but in such a way as to preserve the local lorentz symmetry of the dynamics, that is, to remain safe with respect to general covariance. hence all hd terms in the action must come from generally covariant densitized scalars, which are hd analogues of the ricci scalar. in full generality they can be built as contractions of the metric tensors (both covariant $g_{\mu\nu}$ and contravariant $g^{\mu\nu}$), the riemann curvature tensor $R_{\mu\nu\rho\sigma}$, and covariant derivatives $\nabla_\mu$ acting on these riemann tensors. (we do not need to consider covariant derivatives of the metric tensor itself because of the metricity condition $\nabla_\mu g_{\nu\rho} = 0$.)

initially this may look like an unnecessary complication, since classical equations of motion (eom) with higher derivatives of the gravitational field are even more complicated than the already coupled system of non-linear partial differential equations for the components of the metric in einstein's gravity. however, on cosmological and galactic scales some gravitational models with higher derivatives are successful in explaining the problem of dark matter halos, flat galactic rotation curves, cosmological dark energy (the late-time exponential expansion of the universe), and also primordial inflation without the need for an actual inflaton field. these are among the observational pieces of evidence that can be quoted in favour of hd models. since our work is theoretical, we provide below some conceptual and consistency arguments for hd gravities.

first, still on the classical level, within the class of higher-derivative gravitational theories one finds the first models which, besides relativistic symmetries, also enjoy invariance under conformal symmetry understood in the gr framework. properly, this is called weyl symmetry of rescalings of the covariant metric tensor, according to the law $g_{\mu\nu} \to \Omega^2 g_{\mu\nu}$, with $\Omega = \Omega(x)$ an arbitrary scalar parameter of the transformation. to understand this fact better, one may first recall that the metric tensor $g_{\mu\nu}$ is taken here as a dimensionless quantity, and all energy dimensions are brought in only by the partial derivatives acting on it. next, the prerequisite for full conformal symmetry is scale-invariance of the classical action, i.e. the absence of any dimensionful parameter in the definition of the theory. from these facts one derives that in four spacetime dimensions (d = 4) the gravitational conformal models must contain terms with precisely four derivatives acting on the metric. in general, in d dimensions, the classical action of a conformal gravitational theory must have precisely d derivatives on the metric. (due to the requirement of general covariance, this consideration of conformal gravitational theories makes sense only in even dimensions d.) another interesting observation is that the gravitational theory with the einstein-hilbert action is classically conformally invariant only in the two-dimensional framework. for a 4-dimensional scale-invariant gravitational theory one must use a combination of squares of the riemann tensor and its various contractions. (the term $\Box R$ is trivially a total derivative and so cannot be used.)
therefore, for d-dimensional conformal gravitational theories (d > 2) we inevitably must consider hd metric theories. conformally invariant gravitational dynamics is very special both on the classical and on the quantum level, as we will see in the next sections.

the main arguments for higher-derivative gravitational theories in dimensions d > 2 come instead from quantum considerations. after all, it is not so surprising that it is the quantum coupling between the quantum field theory (qft) of matter fields and quantum (or semi-classical) gravity, or the self-interactions within pure quantum gravity, that dictates what a consistent quantum theory of gravitational interactions should be. our initial guess (actually einstein's) might not be the best one once quantum effects are fully taken into account. since the classical theory is emergent from a more fundamental quantum one, working not only in the microworld but at all energy scales (equivalently, all distances), the underlying quantum theory must be mathematically consistent, while some different classical theories may not share this strong feature. already here we turn the reader's attention to the fact that the purely mathematical requirement of quantum consistency of gravitational self-interactions constrains the possibilities for quantum gravitational theories very strongly, more strongly than was originally thought. moreover, not all macroscopic (long-wavelength limit) classical theories have this quantum correspondence, only those which emerge as classical limits of consistent quantum gravity theories. following this path, at the end we must also correct our classical gravitational theory, and likely it will not be einsteinian gravity any more.

from a different side, we know that matter fields are quantum, they interact, and they are energetic, so they are "charged" under gravitation, since energy-momentum content is what gravity couples to. if we knew nothing about gravity, we could discover something about it from quantum considerations of gravitationally "charged" matter fields and their mutual interactions consistent with quantum mechanics. in this way we could make gravity dynamical and quantum, with a proper form of the graviton's propagation. actually, it is quantum considerations that make the gauge bosons mediating the interactions between charged quantum particles dynamical: these gauge bosons are quantum realizations of classical dynamical gauge fields, which must be introduced in the classical dynamics of matter fields or particles for overall consistency. below we present a few detailed arguments why we need hd gravities in d > 2, giving rise to dynamical gravitational fields with an hd form of the graviton propagator in the quantum domain. they are all related, and in a sense all touch upon the issue of coupling a potential unknown dynamical quantum gravity theory to some energetic quantum matter fields moving under the influence of a classical, initially non-dynamical, gravitational background field. (the background field does not have to be static, stationary or time-independent; we only require that it is not dynamical.) such classical fields can be understood as frozen expectation values of some dynamical quantum gravitational fields.
as one can imagine, for this process of quantum balancing of interactions the issue of back-reaction of quantum matter fields on the classical non-dynamical geometry is essential. firstly, we recall the argument of dewitt and utiyama [4]. due to quantum matter loops, some uv divergences are generated in the gravitational sector. this is so even if the original matter theory has a two-derivative action (like, for example, the standard model of particle physics). the reason for these divergences, pictorially, are feynman diagrams with quantum matter fields running in the perturbative loops, while the graviton lines are only external lines of the diagrams, since they constitute classical backgrounds. in such a way we generate dynamics for the gravitational field from quantum matter interactions with gravity, that is, from back-reaction phenomena. if the latter were neglected, we would only have the impact of the classical gravitational field on the motion and interactions of quantum matter particles. we can be very concrete here: in d = 4 spacetime dimensions, the dynamical action generated for gravity takes the form
$$ s_{\rm div} = \int d^4x \sqrt{|g|} \left( \alpha_C\, C^2 + \alpha_R\, R^2 \right), \tag{1} $$
so we see that counterterms of the gr-covariant form $C^2$ and $R^2$ are generated. (in the equation above, the $R^2$ and $C^2$ terms denote, respectively, the square of the ricci scalar and of the weyl tensor, with the indices contracted in the natural order, i.e. $C^2 = C_{\mu\nu\rho\sigma} C^{\mu\nu\rho\sigma}$. collectively we denote these curvatures by $\mathcal{R}$, so $\mathcal{R}^2 = R^2, C^2$.) this is true no matter what we intended the dynamical theory of the gravitational field to be. we might have thought that it was described by the standard two-derivative einstein-hilbert action, but the above results persist. one notices that in these two counterterms $C^2$ and $R^2$ there are four derivatives acting on the metric tensor, so these are theories of general higher-derivative type, differently from the originally intended e-h gravitational theory, whose action is based just on the ricci scalar r. these $C^2$ and $R^2$ terms appear in the divergent part of the dynamically induced action for the gravitational field. we must be able to absorb these divergences to have a consistent quantum theory of the gravitational field coupled to the quantum matter fields present on such curved (gravitational) backgrounds [5]. this implies that the dynamics of the gravitational field must contain exactly the higher-derivative terms of (1). finally, we can even abstract away from matter species and consider pure gravitational quantum theory. the consistency of its self-interactions on the quantum level puts the same restriction on the form of the action. in such a situation, in the language of feynman diagrams, one also considers loops with quantum gravitons running inside; these graphs induce the same form of uv divergences as in (1). then in such a model we must still consider the dynamics of the quantum gravitational field with higher derivatives. hence, from quantum considerations, higher derivatives are inevitable.
we also remark that in the special case where the matter theory is classically conformally invariant on a classical gravitational background (examples: a massless fermion, a massless klein-gordon scalar conformally coupled to the geometry, the electrodynamic field, or a non-abelian yang-mills field in d = 4), only the conformally covariant counterterm $C^2$ is generated, while the coefficient $\alpha_R = 0$ in (1). this is due to the fact that the quantization procedure preserves the conformal symmetries of the original classical theory coupled to a non-trivial spacetime background. this argument can be called a conformal version of the original dewitt-utiyama argument. the $R^2$ counterterm is then not needed, but the action of a quantum-consistent coupled conformal system still requires higher-derivative dynamics in the gravitational sector [6]. here this is clearly gravitational dynamics only in the spin-2 sector of metric fluctuations, which is contained entirely in the (conformal) $C^2$ sector of the generic four-derivative theory in (1).

an intriguing possibility for having higher derivatives in the gravitational action was first considered by stelle in [7], and some exact classical solutions of such a theory were analyzed in [8–10]. in d = 4 spacetime dimensions the minimal number of derivatives is exactly four, the same as the number of dimensions [11]. this reasoning coincides with the one presented earlier: in an even number of dimensions d we need precisely d derivatives in the gravitational action to obtain the first scale-invariant model of gravitational dynamics (later possibly promoted to full conformal invariance). however, as proven by asorey, lopez and shapiro in [12], theories with an even higher number of derivatives are also possible, and they still have good properties on the quantum level and when coupled to quantum matter fields. similarly, various motivations for conformal gravity in d = 4 are known in the literature; see the representatives [13, 14].

secondly, we emphasize that to have a minimal (in the sense of the smallest number of derivatives) perturbatively renormalizable model of qg in d dimensions, one also has to consider actions with precisely d derivatives. actions with a smaller number are not scale-invariant and have problems controlling all perturbative uv divergences on the quantum level: not all of them are absorbable in counterterms coming from the original classical action, so models with fewer than d derivatives are not multiplicatively renormalizable. the first renormalizable case is when the action contains all generic terms with d (partial) derivatives on the metric, with arbitrary coupling coefficients. special cases where some coefficients and coupling parameters vanish may lead to restricted situations in which full renormalizability is not realized; we discuss such limiting cases in further sections of this paper. the argument with the first renormalizable theory is very similar in type to the quantum-induced action from matter fields, but this time the particles running in the perturbative loops of feynman diagrams are the quantum gravitons themselves, so this renormalizability argument applies to pure quantum gravity.
unfortunately, the original einstein-hilbert action does not give a renormalizable qg model (at least not perturbatively) in d = 4 dimensions [15–18]. the problems show up when one goes off-shell, couples some matter, or goes to the two-loop order, while at the one-loop level with the pure e-h gravitational action all on-shell uv divergences can be successfully absorbed [15] on einstein vacuum backgrounds (so on ricci-flat configurations). actually, in such vacuum configurations the theory is completely uv-finite at the one-loop level.

there are also other ways to induce the higher-derivative terms in the gravitational action on the quantum level, although these further arguments are all related to the original one by dewitt and utiyama. one can, for example, integrate out the quantum matter species completely at the level of the functional integral, which represents all accessible information about the quantum theory. when these matter species are coupled to a background gravitational field, the resulting partition function z is a functional of that background field. not surprisingly, this functional is of higher-derivative nature in the number of derivatives of the fundamental metric field if we work in dimension d > 2. this reasoning was popularized, for example, by 't hooft [19–21], especially since in d = 4 it provides another motivation for conformal gravity as a quantum-consistent model of conformal and gravitational interactions, when massless fields are integrated out in the path integral. in this way we can discover the quantum-consistent dynamics of the gravitational field even if we did not know in the first place that such quantum fields mediating gravitational interactions existed. the graviton becomes a propagating particle with a higher-derivative form of the propagator, which in momentum space translates into an enhanced fall-off of the propagator for large momenta in the uv regime, due to the additional higher powers of the propagating momentum in the perturbative expression for the graviton propagator. this enhanced uv decay of the propagator is what brings the uv divergences under perturbative control and what makes the theory renormalizable in the end. besides a few (a finite number of) controlled uv divergences, the theory is convergent and gives finite perturbative answers to many questions one can pose about the quantum dynamics of the gravitational field, also in models coupled consistently to quantum matter fields.

another way is to consider einsteinian gravity and the corrections to it coming from higher-dimensional theories. it should already be understood from the discussion above that the e-h action is a good quantum action for a qg model only in the special 2-dimensional case. in d = 2, qg is a very special renormalizable and finite theory, but without dynamical content resembling anything known from four dimensions (such as the existence of gravitational waves or spin-2 graviton particles); this is again due to the infinite power of conformal symmetry in the d = 2 case. instead, if one considers higher dimensions, like 6, 8, etc.,
and then compactifies them to the common 4-dimensional case, one finds that even if in the higher dimension one dealt with the two-derivative theory based on the einstein-hilbert action, the reduced case in four dimensions again yields an effective (dimensionally reduced) action with four derivatives. arguments of this type were recently invoked by maldacena [22] in order to study higher-derivative (and conformal) gravities from the point of view of higher dimensions, when the integration out of quantum modes has already taken place and one derives a new dynamics for the gravitational field based on compactification arguments.

all of the above shows that many arguments, from various different directions, lead to the study of higher-derivative gravitational theories in spacetime dimensions d > 2. it is therefore very natural to quantize such four-derivative theories (as first done by stelle) and treat them as a starting point for the discussion of qg models in the d = 4 case. in the end, one can also come back and look for exact classical solutions of these higher-derivative gravitational theories, although due to the increased level of nonlinearity this is a very difficult task [23].

yet another argument is based on the apparent similarity and symmetry between the action of quadratic gravity and the action of a general yang-mills theory. both actions are quadratic in the corresponding field strengths (curvatures): curvatures in the external spacetime for the gravitational field, and in the internal space for the gauge degrees of freedom. the einstein-hilbert action is, by contrast, not similar to the $F^2$ action of yang-mills theory, and the einstein-maxwell or einstein-yang-mills system does not look symmetric, since the numbers of curvatures in the two sectors are not balanced. of course, this lack of balance is later amplified to a problematic level by quantum corrections and the presence of unbalanced uv divergences (non-renormalizability!). already on the classical level one sees this dichotomy, especially when one tries to define a common total covariant derivative $\mathcal{D}_\mu$ (covariant both with respect to the yang-mills internal group g and with respect to the gravitational field). for such an object one can define a curvature $\mathcal{F}_{\mu\nu}$, which decomposes in the respective sectors into the gauge field strength $F_{\mu\nu}$ and the gravitational riemann tensor $R_{\mu\nu\rho\sigma}$. the most natural thing to do here is to consider a symmetric action constructed from the total curvature of the derivative $\mathcal{D}_\mu$; the generalized $\mathcal{F}^2$ is then the first consistent option to include both the dynamics of the non-abelian gauge field and of the gravitational field. as we have seen, this choice is also quantum-mechanically stable [24], since there are no corrections that would destabilize it: the only quantum corrections present support this $\mathcal{F}^2$ structure of the theory, even if it was not there from the beginning. we emphasize that this is inevitably a higher-derivative structure for the dynamics of the quantum relativistic gravitational field studied here.

1.1. motivations for and introduction to six-derivative gravitational theories

now we would like to summarize the general procedure for defining a gravitational theory, both on the classical and on the quantum level. first, we decide what our theory is a theory of – which fields are dynamical in it.
in our case these are gravitational fields, entirely characterized by the metric tensor of the gravitational spacetime. secondly, we specify the set of symmetries (the invariance group) of the theory. in our setup these are, in general, invariances under general coordinate transformations, also known as diffeomorphism symmetries of gravitational theories. in this sense we also restrict the set of possible theories compared to general models considered in the gauge treatment of gravity, where the translation group or the full poincaré group is gauged. finally, following landau, we define the theory by specifying its dynamical action functional. on the classical level this is a gr-invariant scalar obtained by integrating a gr-densitized scalar lagrangian over the full 4-dimensional spacetime continuum. as emphasized above, for theoretical consistency we must use lagrangians (actions) containing higher (partial) derivatives of the metric tensor, once the lagrangian is fully expanded into a form where ordinary derivatives act on (variously contracted) metric tensors. specializing to the case motivated above, we shall use and study below theories defined by classical action functionals containing precisely six derivatives of the metric tensor field.

to define the theory on the quantum level, we use the standard functional-integral representation of the partition function (also known as the vacuum transition amplitude). that is, from the classical action functional $s_{\rm hd}[g_{\mu\nu}]$ of the classical metric field we construct the object
$$ z = \int \mathcal{D}g_{\mu\nu}\, \exp\left( i s_{\rm hd} \right), \tag{2} $$
where we must be more careful than just formal in properly defining the integration measure $\mathcal{D}g_{\mu\nu}$. for example, we should sum over all backgrounds and also over all topologies of the classical background gravitational field. one can hope that it is possible to classify all four-dimensional gravitational configurations (all gravitational pseudo-riemannian manifolds) over which we should integrate. the functional integral, if properly defined, is the basis of the quantum theory. one can even promote the point of view that by giving the functional z one defines the quantum theory without reference to any classical action s. however, it is difficult a priori to propose generating functionals z which are consistent with all symmetries of the theory (especially gauge invariances) and which possess sensible macroscopic (classical) limits. for the practical purpose of evaluating various correlation functions of quantum fields and their fluctuations, one modifies the functional z by coupling the quantum field (here the integration variable $g_{\mu\nu}$) to a classical external current j. for other theoretical reasons one can also compute this functional in the background field method, where the functional integration is over fluctuation fields, while the classical action functional is decomposed into a background part and parts quadratic, cubic and of higher order in the quantum fluctuations. for this one decomposes the full metric as $g_{\mu\nu} = \bar{g}_{\mu\nu} + h_{\mu\nu}$, where $\bar{g}_{\mu\nu}$ denotes the classical background metric and $h_{\mu\nu}$ the metric perturbation. a quick symbolic check of the induced expansion of the volume element $\sqrt{|g|}$ is sketched below.
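as a cross-check of this background expansion, one can verify the standard series $\sqrt{|g|} = \sqrt{|\bar g|}\,(1 + \tfrac{1}{2} h + \tfrac{1}{8} h^2 - \tfrac{1}{4} h_{\mu\nu} h^{\mu\nu} + \mathcal{O}(h^3))$, with $h = \bar g^{\mu\nu} h_{\mu\nu}$, symbolically on a random background. the following python sketch is our own illustration (it is not part of the computation in [26]); it uses a positive-definite 3×3 "metric" purely for the convenience of the check:

```python
import sympy as sp
import random

# verify sqrt(|g|)/sqrt(|gbar|) = 1 + h/2 + h^2/8 - h_{mu nu} h^{mu nu}/4 + O(h^3)
# on a random positive-definite background (purely illustrative check)
random.seed(1)
n = 3
A = sp.Matrix(n, n, lambda i, j: sp.Rational(random.randint(1, 5)))
gbar = A * A.T + n * sp.eye(n)            # positive-definite background metric
H = sp.Matrix(n, n, lambda i, j: sp.Rational(random.randint(-3, 3)))
h = (H + H.T) / 2                          # symmetric fluctuation h_{mu nu}

eps = sp.symbols('epsilon')
g = gbar + eps * h                         # the split g = gbar + h (h scaled by eps)

exact = sp.sqrt(g.det() / gbar.det())
series = sp.series(exact, eps, 0, 3).removeO()

ginv = gbar.inv()
tr_h = (ginv * h).trace()                  # h = gbar^{mu nu} h_{mu nu}
tr_h2 = (ginv * h * ginv * h).trace()      # h_{mu nu} h^{mu nu}
expected = 1 + eps * tr_h / 2 + eps**2 * (tr_h**2 / 8 - tr_h2 / 4)

print(sp.simplify(series - expected))      # -> 0
```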
by computing variational derivatives of the partition function z[j] with respect to the classical current j, one obtains higher n-point functions with full quantum accuracy. one can compute them perturbatively (in the loop expansion) or non-perturbatively, on trivial backgrounds or in the background field method. finally, for spacetimes which asymptotically reach riemann-flatness, one derives from on-shell quantum green functions, dressed by the wave functions of external classical states, the quantum matrix elements of scattering processes. only under such conditions can a general scattering problem be defined in a quantum gravitational theory.

in this article we analyze the quantum gravitational model with six derivatives in the action. that the theory has six derivatives can be seen in two related ways. first, one can derive the classical equations of motion from the action; the number of partial derivatives acting on the metric tensor in a general term of the resulting tensorial equations of motion is at most 6 in our model. alternatively, one can compute the tree-level graviton propagator, for example around the flat minkowski background, and notice that some of its components are suppressed in the uv regime by the power $k^6$ in fourier space, where k is the momentum of the propagating quantum mode. actually, for this last check one does not even have to invert anything and compute the propagator: the same analysis can be performed on the level of the kinetic operator between gravitational fluctuations around some background (the flat background being the easiest one). later in the main text we discuss how to overcome the problems of defining the propagator in some special cases, but the kinetic operator is almost always well-defined, and the six-derivative character of the theory can easily be read from it; a scalar toy illustration follows.
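as a trivial illustration of reading off the uv behaviour directly from the kinetic operator, consider a scalar toy analogue (our own sketch, with placeholder couplings omega, theta, kappa standing for the six-, four- and two-derivative terms): the sixth-order term dominates the large-momentum fall-off of the propagator.

```python
import sympy as sp

# scalar toy analogue: inverse propagator with six, four and two derivatives
k = sp.symbols('k', positive=True)
omega, theta, kappa = sp.symbols('omega theta kappa', positive=True)

inverse_propagator = omega * k**6 + theta * k**4 + kappa * k**2
propagator = 1 / inverse_propagator

# expansion around k -> oo: the leading uv behaviour is 1/(omega*k**6)
print(sp.series(propagator, k, sp.oo, 8))
```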
we have seen in the previous section that four-derivative gravitational theories in d = 4 spacetime dimensions are scale-invariant (and can be conformally invariant) on the classical level, and that they are also the first minimal renormalizable models of dynamical qg. the latter assertion is proved by power counting analysis. we will show below that the theory can be extended so that the control over divergences is strengthened even further; this is again based on the analysis of the superficial degree of divergence of any graph and on energy-dimensionality arguments. in this way we will also explain why generic six-derivative gravitational theories in d = 4 can be called perturbatively super-renormalizable.

the power counting analysis in the case of four-derivative stelle gravity as in (1) (quadratic gravity of the schematic type $\mathcal{R}^2$, as in [7]) leads to the equality
$$ \Delta + d_\partial = 4, \tag{3} $$
where $\Delta$ is the superficial degree of divergence of any feynman graph g, $d_\partial$ is the number of derivatives of the metric on the external lines of the graph, and for future use we define l as the loop order. at tree level (classical level) we have l = 0, while for concreteness we shall assume l ⩾ 1. this theory is simply renormalizable, since the gr-covariant counterterms needed to absorb the perturbative uv divergences have the same form as the original action (1). (we recall for completeness that the gauss-bonnet scalar $GB = E_4 = R_{\mu\nu\rho\sigma}^2 - 4 R_{\mu\nu}^2 + R^2$ is a topological term: its variation in four spacetime dimensions leads to total derivative terms, contributing nothing to the classical eom or to quantum perturbation theory. it may, however, contribute non-perturbatively when topology changes are expected. for the sake of computing uv divergences we may simply neglect its presence both in the original action and in the resulting one-loop uv-divergent part of the effective action.) in a general, local, perturbatively renormalizable hd model of qg in d = 4, the divergences at any loop order must take the form (1), with a potential addition of the topological gauss-bonnet term.

when the six-derivative terms are leading in the uv regime, formula (3) changes to
$$ \Delta + d_\partial = 6 - 2l. \tag{4} $$
since $d_\partial \geq 0$, this can be rewritten as a useful bound on the superficial degree $\Delta$:
$$ \Delta \leq 6 - 2l = 4 - 2(l - 1). \tag{5} $$
one sees an interesting feature here: while in the four-derivative stelle theory the bound is independent of the number of loops l, for six (and more) derivatives the bound tightens with growing loop order. this is the basis of super-renormalizability. in particular, in six-derivative theories there are no divergences at all at the fourth loop level, since for l = 4 we find $\Delta < 0$, so all graphs are uv-convergent. we emphasize that a super-renormalizable model is still renormalizable, but it is more special, since its infinities do not show up at arbitrary loop order l, as is the case in merely renormalizable models.

from formula (4), at the l = 3 loop level the only possible uv divergences are proportional to the cosmological constant parameter $\lambda$, i.e. completely without partial derivatives acting on the metric tensor. similarly, at l = 2 the divergences can be proportional to $\lambda$ (no derivatives) and to the first power of the ricci scalar r of the manifold (two derivatives on the metric, when expanded). in what follows we will not concentrate on these uv-subleading divergences; our main attention in this paper is on the four-derivative divergences present in the action (1). up to the presence of the gauss-bonnet term, they are the same as those induced from quantum matter loops. divergences of this type are generated only at the one-loop level, since for them we must have $d_\partial = 4$ and $\Delta = 0$. the latter signifies that they are universal logarithmic divergences; the name originates from the fact that they arise logarithmically when an ultraviolet cutoff $\Lambda_{\rm uv}$ is used to cut off the one-loop integrations over the momenta of modes running in the loop.

the power counting analysis thus implies that the theory has divergences only at the first, second and third loop orders, and starting from the fourth loop level it is a completely uv-finite model of qg. moreover, based on the above argumentation, the beta functions that we report below (in front of the gr-covariant terms with four derivatives in the divergent effective action) receive contributions only at the one-loop level; higher orders (like two- and three-loop) have no impact on them. a small numerical sketch of this counting is given below.
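to make this counting concrete, the following small sketch (ours, purely illustrative) tabulates the bound (5) and the admissible counterterm sectors at each loop order:

```python
# power counting (5) for six-derivative gravity in d = 4:
# Delta + d_partial = 6 - 2*L, and a divergence requires Delta >= 0

def divergent_sectors(loops):
    """even numbers of external-line derivatives d_partial compatible
    with a non-negative superficial degree Delta at the given loop order"""
    bound = 6 - 2 * loops
    return list(range(0, bound + 1, 2))

labels = {0: 'lambda (cosmological constant)',
          2: 'R (einstein-hilbert term)',
          4: 'C^2, R^2, GB (four-derivative terms)'}

for L in range(1, 6):
    sectors = [labels[d] for d in divergent_sectors(L)]
    print('L =', L, ':', sectors if sectors else 'uv-finite')
```

the output reproduces the statements above: the four-derivative divergences appear only at l = 1, the r divergence survives up to l = 2, the cosmological constant up to l = 3, and from l = 4 on everything is finite.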
consequently, the beta functions that we are interested in, computed at the one-loop level, are valid to all loop orders, and our results for them are truly exact: they receive no perturbative contributions from higher loops. (for the other terms in the divergent action, like $\lambda$ or r, this is not true.) the theory is four-loop finite, while the beta functions of the $R^2$, $C^2$ and gb terms are one-loop exact. all these miracles can happen only in a very special super-renormalizable model, where we have six derivatives in the gravitational propagator around flat spacetime. this number is bigger than the minimum for a renormalizable and scale-invariant qg theory in d = 4 spacetime dimensions, and this stronger momentum suppression in the graviton propagator is the origin of the facts above.

according to what we have stated before, we decide to study the quantum theory described by the classical lagrangian
$$ \mathcal{L} = \omega_C\, C_{\mu\nu\rho\sigma} \Box C^{\mu\nu\rho\sigma} + \omega_R\, R \Box R + \theta_C\, C^2 + \theta_R\, R^2 + \theta_{GB}\, GB + \omega_\kappa R + \omega_\lambda. \tag{6} $$
from this lagrangian we construct the action of our hd quantum gravitational model, here with six as the leading number of derivatives in the uv regime, by the formula
$$ s_{\rm hd} = \int d^4x \sqrt{|g|}\, \mathcal{L}. \tag{7} $$
above, $C_{\mu\nu\rho\sigma}$ denotes the weyl tensor (constructed from the riemann tensor $R_{\mu\nu\rho\sigma}$, the ricci tensor $R_{\mu\nu}$ and the ricci scalar r, with the coefficients suitable for the d = 4 case). by gb we mean the euler term, which gives rise to the euler characteristic of the spacetime after integration over the whole manifold. its integrand, also known as the gauss-bonnet term, has the following expansion in terms quadratic in the gravitational curvatures:
$$ GB = E_4 = R_{\mu\nu\rho\sigma}^2 - 4 R_{\mu\nu}^2 + R^2. \tag{8} $$
similarly, for the "square" of the weyl tensor in d = 4 we can write
$$ C^2 = C_{\mu\nu\rho\sigma}^2 = C_{\mu\nu\rho\sigma} C^{\mu\nu\rho\sigma} = R_{\mu\nu\rho\sigma}^2 - 2 R_{\mu\nu}^2 + \tfrac{1}{3} R^2. \tag{9} $$
finally, the box operator is defined as $\Box = g^{\mu\nu} \nabla_\mu \nabla_\nu$, the gr-covariant analogue of the d'alembertian operator $\partial^2$ known from flat spacetime.

it is important to emphasize that the lagrangian (6) is the most general six-derivative lagrangian describing the propagation of gravitational fluctuations on flat spacetime; for this purpose it is important to include all terms quadratic in the gravitational curvature. as is obvious from the construction in (6), for a six-derivative model we include terms quadratic in the weyl tensor or in the ricci scalar, containing precisely one power of the covariant box operator $\Box$ (built from the gr-covariant derivative $\nabla_\mu$). these two terms exhaust all possibilities, since any other quadratic term with two covariant derivatives can be reduced to the two above by exploiting the symmetry properties of the riemann curvature tensor together with cyclicity and bianchi identities. moreover, the basis with weyl tensors and ricci scalars is the most convenient for studying the graviton propagator around flat spacetime; other bases are possible, but they distort and entangle the contributions of the various terms to the propagator. a quick check of this change of basis is sketched below.
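since (8) and (9) amount to a linear change of basis in the three-dimensional space spanned by $(R_{\mu\nu\rho\sigma}^2, R_{\mu\nu}^2, R^2)$, its non-degeneracy can be checked in a few lines. the sketch below is our own, using only the coefficients quoted above:

```python
import sympy as sp

# rows: C^2, GB and R^2 expanded in the basis (Riem^2, Ric^2, R^2), eqs. (8)-(9)
M = sp.Matrix([[1, -2, sp.Rational(1, 3)],   # C^2
               [1, -4, 1],                   # GB
               [0, 0, 1]])                   # R^2

print(M.det())                               # -2 != 0, so {C^2, GB, R^2} is a basis
print(M.T.solve(sp.Matrix([1, 0, 0])))       # (2, -1, 1/3), i.e.
                                             # Riem^2 = 2*C^2 - GB + (1/3)*R^2
```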
we also remark that the addition of the gauss-bonnet term is possible here (though it is a total derivative in d = 4); one could also add a generalized gauss-bonnet term, the analogue of (8) in which one power of the gr-covariant box operator is inserted in the middle of each of the tensorial terms quadratic in curvatures. eventually, the generalized gauss-bonnet term contributes nothing to the flat-spacetime graviton propagator in any dimension, so for this purpose we do not need to add it to the lagrangian (6). in what follows we employ pseudo-euclidean notation, and by $\sqrt{|g|}$ we denote the square root of the absolute value of the metric determinant (always real in our conventions). the two most uv-subleading terms in the lagrangian (6) carry the couplings $\omega_\kappa$ and $\omega_\lambda$: the first is related to the newton gravitational constant $G_N$, the second to the value of the physical cosmological constant parameter. the qg model with the lagrangian (6) is definitely the simplest one that describes the most general form of the graviton propagator around flat spacetime, in four spacetime dimensions and for a theory with six derivatives.

we would like to emphasize already here that there are two remarkable special limiting cases of the theory (6). to have a non-degenerate classical action and a well-defined hessian operator of second variational derivatives, both coefficients of the uv-leading terms, $\omega_C$ and $\omega_R$, must be non-zero. only in this case is the theory renormalizable; moreover, only in this case does it have the additional nice features of super-renormalizability and complete finiteness of the fourth and higher perturbative loop contributions. the quantum calculations reported in the next section correspond only to this kind of well-balanced model, with both the weyl-tensor-squared and ricci-scalar-squared terms carrying one power of the gr-covariant box operator inserted in the middle (so that the action has six derivatives, with terms precisely quadratic in gravitational curvatures). in principle there also exist unbalanced models, with a dichotomy between different sectors of fluctuations. for example, in the special case $\omega_C = 0$, $\theta_C \neq 0$ and $\omega_R \neq 0$, the theory has a propagating spin-two mode with four derivatives and a propagating spin-zero mode with six derivatives in the perturbative spectrum around flat spacetime. this is to be contrasted with the interaction vertices, which always have six derivatives, both in the special and in the generic theories (with $\omega_C \neq 0$ and $\omega_R \neq 0$). in the other special version of the model, with $\omega_C \neq 0$ and $\omega_R = 0$, the situation of the spectrum is the opposite, but the negative conclusions are the same: according to the power counting arguments from [6, 25] and from (3), in both special cases the theories are unfortunately non-renormalizable. (we discuss the power counting for these two special limiting models in greater detail in section 4.4.) hence one should be very careful in performing computations in such cases and in trusting the results of limits taken there.
these cases will be analyzed in more detail in the next sections, where it will be revealed that they are crucial for understanding the structure of perturbative divergences both in four-derivative and in six-derivative qg models in d = 4. another consequence of the power counting formula (4) is that the uv-subleading terms of the original action (6) do not contribute at all to the four-derivative terms leading in the uv regime of the divergences (1). that is, the coefficients $\alpha_C$, $\alpha_R$ and $\alpha_{GB}$ in (1) depend only on the ratio of the coefficient in front of the term with weyl tensors and a box inserted in the middle (i.e. $C \Box C$) to the coefficient in front of the corresponding term with two ricci scalars (i.e. $R \Box R$), so only on the ratio $\omega_C/\omega_R$, also to be analyzed at length here. these coefficients of uv divergences do not depend on $\theta_C$, $\theta_R$, $\theta_{GB}$, $\omega_\kappa$ or $\omega_\lambda$. this follows from energy-dimensionality considerations: only terms having the same energy dimensionality as the terms leading in the uv regime (which shape the uv form of the perturbative propagator) may contribute to the leading form of the uv divergences, represented in the divergent action (1) by dimensionless numbers (in d = 4) such as $\alpha_C$, $\alpha_R$ and $\alpha_{GB}$. for example, the terms with coefficients $\theta_C$ or $\theta_R$ have different energy dimensions and cannot appear there. this pertinent observation lets us use for our computation just the reduced action, containing only the terms important for the uv divergences analyzed in this paper:
$$ s_{\rm hd} = \int d^4x \sqrt{|g|}\, \left( \omega_C\, C_{\mu\nu\rho\sigma} \Box C^{\mu\nu\rho\sigma} + \omega_R\, R \Box R \right). \tag{10} $$

we remark that the results in the theory with a six-derivative gravitational action are discontinuous with respect to the results of similar computations in four-derivative stelle quadratic qg models, which are usually analyzed in d = 4 as the first and most promising models of higher-derivative qg. this discontinuity is based on the known fact (both for hd gauge and for hd gravitational theories) that the cases with two and with four more derivatives in the action than in the minimal renormalizable model are discontinuous and exceptional, while a general formula exists starting from actions with six more derivatives (and this formula can then be analytically extended). all three cases – the minimal renormalizable theory and the models with two or four more derivatives – are special and cannot be obtained by any limiting procedure from the general results valid for higher-derivative regulated actions containing six or more derivatives beyond the minimal renormalizable model. for qg in d = 4 the minimal model obviously has four derivatives. of course, this discontinuity is related to the different types of enhanced renormalizability properties of the models in question. as we have already explained above, the gravitational model with six derivatives in d = 4 is the first super-renormalizable model of qg, where from the fourth loop on the perturbative uv divergences are completely absent. the case of stelle theory gives just a renormalizable theory, where divergences are present at every loop order (always the same divergences, always absorbable in the same set of counterterms, since the theory is renormalizable). a small sketch of how these properties jump with the number of derivatives follows.
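the jumps in renormalizability properties discussed next can be condensed into a few lines. the sketch below is our own; it assumes the natural generalization $\Delta + d_\partial = 4 + (2n - 4)(1 - l)$ for 2n-derivative gravity in d = 4, chosen so that it reproduces (3) for 2n = 4 and (4) for 2n = 6:

```python
# highest loop order with a non-negative superficial degree of divergence,
# assuming Delta + d_partial = 4 + (2n - 4)*(1 - L) for 2n-derivative gravity
def last_divergent_loop(two_n, l_max=100):
    last = 0
    for L in range(1, l_max + 1):
        if 4 + (two_n - 4) * (1 - L) >= 0:
            last = L
    return last

for two_n in (4, 6, 8, 10, 12):
    L = last_divergent_loop(two_n)
    if two_n == 4:
        print('4-derivative gravity: renormalizable, divergences at every loop')
    else:
        print(f'{two_n}-derivative gravity: {L}-loop super-renormalizable')
```

under this assumption the sketch reproduces the pattern described below: 3-loop super-renormalizability for six derivatives, 2-loop for eight, and one-loop for ten or more.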
one sees the discontinuity already in the behaviour of the uv divergences, as in the power counting analysis. when the number of derivatives is increased in steps of two, the loop level above which divergences are absent decreases, but in discontinuous jumps. for example, for qg theories with ten or more derivatives the uv divergences sit only at the one-loop level (for gravitational theories with 8 derivatives the last divergent level is the second loop). there also exist analytic formulas combining the results for uv divergences for theories with four or more derivatives beyond the minimal renormalizable four-derivative model in d = 4. again, one sees from such formulas that the correct results for the minimal renormalizable model and for the six-derivative one are discontinuous; the 8-derivative gravitational theory is the first one for which the analytic formulas hold true. this, however, apparently has nothing to do with the strengthened super-renormalizability properties at a given loop level emphasized above. the six-derivative gravitational theory is 3-loop super-renormalizable, since the third loop is the last one at which infinities must be absorbed and the theory renormalized anew. the jumps from 3-loop to 2-loop and finally to one-loop super-renormalizability are by nature discontinuous, and the results for divergences inherit this discontinuity. for theories with ten or more derivatives we have one-loop super-renormalizability, and the results for an even higher number of derivatives 2n must be continuous in the parameter 2n, which can be analytically extended to the whole complex plane from the even integer values 2n ⩾ 10 where it is originally defined. in this analytically extended picture, the cases with eight, six and four derivatives are special isolated points, which are discontinuous and cannot be obtained from the general analytic formula valid for any n ⩾ 5. the origin of this is again the power counting of divergences: integrals over loop momenta are convergent when the superficial degree of divergence is negative, and give non-trivial uv divergences otherwise – logarithmic in the uv cutoff $k_{\rm uv}$ of the loop integration momenta for vanishing degree $\Delta$, and of power-law type for positive $\Delta$. this sharp distinction between what is convergent and what is divergent (based on the sign of the degree $\Delta$ of any diagram) introduces the discontinuity, which is the main source of the problems here.

in this contribution we mainly discuss and analyze the results first obtained in our recent publication [26], where the methods used to derive them were presented in some detail. the method consists essentially of the barvinsky-vilkovisky trace technology [27], applied to compute the functional traces of the differential operators that yield the uv-divergent part of the effective action at the one-loop level. the main results were obtained in the background field method, and from the uv divergences in [26] we read off the beta functions of the running dimensionless gravitational couplings. the results for them in six-derivative gravitational theory in d = 4 spacetime dimensions were the main results there.
they are also described in section 2 here. in the present contribution we additionally include an extended discussion of the theoretical checks performed on these results, in section 3. the main novel contribution, however, is in section 4, where we present an analysis of the structure of the obtained beta functions. our main goal is to provide an argument explaining why the structure of the beta functions is unique and why it depends in this particular form on the ratio x (to be defined later in the main text, in (36)). these comments were not included in the original research article [26]; they constitute the main new development of the present paper. we remind the reader that we will, in particular, spend some time attempting to explain the discontinuity of the results for uv divergences when one passes from six- to four-derivative gravitational theories; in other words, when 3-loop super-renormalizability is reduced to mere renormalizability. equivalently, the situation at the fourth perturbative loop changes from having no divergences at all (all loop integrations convergent, with negative superficial degree $\Delta < 0$) to still having uv divergences at this loop level (degree $\Delta = 0$ for logarithmic uv divergences in the cutoff). this sharp contrast in the sign of the superficial degree of divergence is one of the reasons why the discontinuity between the six- and four-derivative gravitational theories in d = 4 persists.

1.2. addition of killer operators

as a matter of fact, we can also add other terms, cubic in the gravitational curvatures ($\mathcal{R}^3$), to the lagrangian (6). these terms come with coefficients of the highest energy dimensionality, equal to that of the coefficients $\omega_C$ and $\omega_R$; hence they can contribute to the leading four-derivative uv divergences of the theory. their general form is given by the following list of six gr-covariant terms:
$$ \mathcal{L}_{\mathcal{R}^3} = s_1 R^3 + s_2 R R_{\mu\nu} R^{\mu\nu} + s_3 R_{\mu\nu} R^{\mu}{}_{\rho} R^{\nu\rho} + s_4 R R_{\mu\nu\rho\sigma} R^{\mu\nu\rho\sigma} + s_5 R_{\mu\nu} R_{\rho\sigma} R^{\mu\rho\nu\sigma} + s_6 R_{\mu\nu\rho\sigma} R^{\mu\nu}{}_{\kappa\lambda} R^{\rho\sigma\kappa\lambda}. \tag{11} $$
actually, these terms can be very essential for making the gravitational theory with a six-derivative action completely uv-finite. for renormalizability or super-renormalizability, however, they are not necessary – they have no impact on the renormalizability of the theory – and should therefore be regarded as non-minimal. in the analysis below we did not take their contributions into account and performed the already technically demanding computation in the simplest minimal model with a six-derivative action. the set of terms in (11) is complete in d = 4 as far as terms cubic in gravitational curvatures are concerned; this non-trivial statement is due to various identities proven in [28]. these cubic terms are also sometimes called "killers" of the beta functions, since they may have profound effects on the form of the beta functions of all terms in the theory. this is roughly simple to explain. the killer terms, generally of the type $s\mathcal{R}^3$, are added to the original lagrangian (6) of the six-derivative theory, where the uv-leading terms were of the type $\omega \mathcal{R} \Box \mathcal{R}$. it is well known that to extract uv divergences at the one-loop level one has to compute the second variational derivative operator (the hessian $\hat{H}$) of the full action.
the contributions of the cubic killers to the hessian are of the form of at least $s\mathcal{R}$, when counted in powers of the generalized curvature $\mathcal{R}$. next, when computing the trace of the functional logarithm of the hessian operator to obtain the one-loop uv-divergent effective action, one uses the expansion of the logarithm in a series according to
$$ \ln(1 + z) = z - \tfrac{1}{2} z^2 + \ldots \tag{12} $$
hence we need to take at most the square of the contribution $s\mathcal{R}$ to the hessian from a cubic killer term; the third power would already be too much, because we are looking for terms of the general type $\mathcal{R}^2$ in the uv-divergent part of the effective action. the contribution of a killer cubic in curvatures therefore produces additions to the covariant uv-divergent terms of the general type $f(s)\mathcal{R}^2$, where the yet unknown functions f(s) are polynomials of at most second order in the killer coefficients $s_i$. requiring now that the total beta functions vanish (complete uv-finiteness), we have in general to solve a system of quadratic equations in the coefficients $s_i$. the only obstacle to finding the killer coefficients is that some solutions of this system may turn out to be complex rather than real, while we need all $s_i$ to be real for the definiteness of the action (for example, in the euclidean case of the signature of the metric). this issue therefore requires a more detailed mathematical analysis, but preliminary results based on [26, 29] show that in most cases uv-finiteness is possible and can easily be achieved by adding the cubic killer operators of (11) with real coefficients $s_i$. a toy version of this condition is sketched below.
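as a toy version of this finiteness condition (illustrative numbers only; the actual polynomials f(s) require the full one-loop computation of [26, 29]), suppose a single killer coefficient s enters one total beta function quadratically:

```python
import sympy as sp

# toy uv-finiteness condition: beta_0 + f(s) = 0 with f quadratic in the
# killer coefficient s, as argued from the ln(1+z) expansion truncated at z^2
s = sp.symbols('s', real=True)
beta_0 = 3                     # placeholder one-loop contribution
f = 2 * s**2 - 7 * s           # placeholder killer contribution

roots = sp.solve(sp.Eq(beta_0 + f, 0), s)
print([r for r in roots if r.is_real])   # [1/2, 3]: real solutions exist here
```

whether real roots exist is decided by the discriminant (here 49 − 24 > 0); this is exactly the reality issue raised above for the full system of quadratic equations in the $s_i$.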
one can compare the situation here with cubic killers to the better-known situation where quartic killers are used to obtain uv-finiteness. unfortunately, such quartic killers cannot be added to the six-derivative gravitational theory (6), since they would carry too many partial derivatives and would destroy the renormalizability of the model; quartic killers can be included in theories with at least 8 derivatives. that approach seems preferable, since the contribution of quartic killers (schematically of the type $\mathcal{R}^4$) to the uv divergences proportional to $\mathcal{R}^2$ is always linear in d = 4, and a linear system of equations is always solvable, with solutions that are always real. this approach was successfully applied to gravity theories in [29], to gauge theories in [30], to theories on de sitter and anti-de sitter backgrounds [31], and also to general non-local theories [32]. one can argue that uv-finiteness may be a universal feature of quantum field-theoretical interactions in nature [33]. moreover, this feature of the absence of perturbative uv divergences is related to quantum conformality, as advocated in [34, 35].

1.3. universality of the results

finally, one of the most important features of the expression for the uv-divergent part of the effective action in six-derivative gravitational theories is its complete independence of any parameter used in the computation: a gauge-fixing parameter, a parameter appearing in the gauge choice or in the details of some renormalization scheme, etc. this bold fact of complete universality of the results for the effective action is proven by the theorem of kallosh, tarasov and tyutin (ktt) [36–38], applied here to six-derivative qg theories. the theorem expresses the difference between two effective actions of the same theory computed using different sets of external parameters: this difference is proportional to the off-shell tensor of the classical equations of motion of the original theory, and it disappears on-shell. in our computation, however, we want to exploit the case where the effective action, and the various green functions computed from it, are understood as off-shell functionals. in super-renormalizable theories there is still an advantage in using this theorem: one notices the mismatch in the number of derivatives on the metric tensor between the original action and lagrangian of the theory in the form (10) (and the classical eom resulting from it) and the same counting done in the divergent part of the effective action. we remind the reader that in the former case we have six derivatives on the metric, while in the latter we count up to four derivatives. this mismatch, together with the ktt theorem, implies that in super-renormalizable qg theories with six-derivative actions the difference between the two uv-divergent parts of the effective action (only these parts of the effective actions) computed using two different schemes or methods must vanish, for whatever change of the external parameters used in the computation of these uv-divergent functionals. this means that our results for the divergences are completely universal and cannot depend on any parameter. hence we conclude that the divergences we found do not depend on the gauge-fixing parameters, gauge choices or other parametrization ambiguities. we remark that this situation is much better than, for example, in e-h gravity, where the dependence on the gauge is quite strong, or even in stelle four-derivative theory, where the four-derivative uv-divergent terms show an ambiguous dependence on the gauge parameters off-shell. here we are completely safe from such problems and such cumbersome ambiguities.

in this way the beta functions are a piece of a genuine observable quantity that can be defined in super-renormalizable models of qg: they are universal, independent of the spurious parameters needed to define a gauge theory with local symmetries, and moreover they are exact, while still being computed at the one-loop level of perturbation calculus. they are clearly very good candidates for observables in qg models. all these nice features give us an even stronger push towards analyzing the structure of such physical quantities and towards understanding it on the basis of theoretical considerations; this is what we attempt in this contribution.

another important feature is that in theories with higher derivatives in their defining classical action, there is no need on the full quantum level for a perturbative renormalization of the graviton wave function. this is again contrary to the case of the two-derivative theory, where one has to take this phenomenon into account, although its expression is not gauge-invariant and depends on the gauge fixing. the absence of wave-function renormalization can be easily understood in the batalin-vilkovisky formalism for the quantization of gauge theories (or, in general, of theories with differential constraints) [39, 40]. this important feature is also shared by other models, for example four-derivative qg.
since the wave function of the graviton does not receive any quantum correction, one can derive the beta functions for the couplings just by reading the uv divergences of the dressed two-point functions with two external graviton lines. this simplifies our computation drastically, since for this kind of one-loop computation we do not have to bother with three- or higher n-point functions to independently determine the wave function renormalization. unfortunately, the latter is the case, for example, for standard gauge theory (the yang-mills model) or for e-h gravity, where the renormalization of the coupling constant of interactions has to be read from a combination of the two- and three-point functions of the quantum theory, while the wave function renormalization of the gauge fields or of the graviton field, respectively, can be read just from the quantum dressed two-point green function. for six-derivative theories, just from the two-point function we can read everything about the renormalization of the coupling parameters of graviton interactions. additionally, at the first quantum loop level we do not need to study effective interaction vertices dressed by quantum corrections. hence, at the one-loop level there is no quantum renormalization of the graviton's wave function, and the uv divergences related to interactions are derived solely from the propagation of free modes (here of graviton fields) around flat spacetime, corrected (dressed) at the first quantum loop. effective vertices of interactions between gravitons do not matter for this, though the situation may change at higher loop orders. at the one-loop level this is a great simplification of our algorithm for deriving the covariant form of uv divergences, since we just need to extract them from the expression for the one-loop perturbative two-point correlators of the theory, both in the four- and the six-derivative qg models. all these nice features of the six-derivative qg model make it worth studying further as an example of non-trivial rg flows in qg. here we have exactness of the one-loop expressions for the running coupling parameters $\theta_C(t)$, $\theta_R(t)$ and $\theta_{GB}(t)$ in (6), together with super-renormalizability. this is one of the most powerful and beautiful features of the super-renormalizable qg theory analyzed here. this model therefore gives us a good and promising theoretical laboratory for studying rg flows in general quantum gravitational theories understood in the field-theoretical framework. we remark that, from a technical point of view, the one-loop calculations in super-renormalizable models of qg are more difficult than the ones done in the four-derivative, just renormalizable gravitational models [27, 41–43]. the level of complexity of such calculations depends strongly on the number of derivatives in the classical action of the model as well as on the type of one-loop counterterms one is looking for. the counterterm for the cosmological constant is actually very easy to obtain, and this was done already in [12]. next, the derivation of the divergence linear in the scalar curvature $R$ required a really big effort and was achieved only recently by our collaboration in [44]. in the present work, we comment on the next step, and we show the results of the calculation of the simple-looking one-loop uv divergences in the four-derivative sector of the six-derivative minimal gravity model.
in our results, we now have full answers for the beta functions of the weyl-squared $C^2$, ricci-scalar-squared $R^2$ and gauss-bonnet $GB$ scalar terms. the calculation is really tedious and cumbersome, and it was done for the simplest possible six-derivative qg theory, without cubic terms in the classical action (which here would be third powers of the generalized curvature tensor, $\mathcal{R}^3$). even in this simplest minimal case, the intermediate expressions are too large for explicit presentation here, hence they will be mostly omitted. similar computations in four-, six- and general higher-derivative gauge theories were also performed in [30, 45, 46]. as already mentioned above, the derivation of the zero- and two-derivative ultraviolet divergences was previously done in refs. [12] and [44]. below we show the results for the complete set of beta functions of the theory (10). we achieve this by deriving the exact one-loop beta function coefficients for the four-derivative gravitational couplings, namely $\theta_C$, $\theta_R$ and $\theta_{GB}$, extracted as the coefficients of the uv-divergent part of the effective action in (1). without loss of generality, the calculation is performed in the reduced model (10), i.e. without terms subleading in the number of partial derivatives acting on the metric tensor after the proper expansion. (we will not need to include terms like $R^2$, $C^2$ or even $R$ in (10).) this is clearly explained by dimensional analysis: the divergences with four derivatives of the metric in (1) are of our biggest interest here, and the numerical coefficients of those subleading terms cannot in any way combine with the coefficients of the propagators (shaped in the uv regime by the leading six-derivative terms in the action (10)) to form dimensionless ratios in front of the terms in (1) in $d=4$ spacetime dimensions.

2. brief description of the technique for computing uv divergences

an essential part of the calculation is pretty much the same as usually done in any higher-derivative qg model, especially in the renormalizable or super-renormalizable models [26, 44] considered here. in what follows, we can therefore skip a large part of the explanations and focus on the calculation of the four-derivative terms of the divergent part of the effective action. first, to perform the computation we use the background field method, defined by the following splitting of the metric,

$$g_{\mu\nu} \;\longrightarrow\; \bar g_{\mu\nu} + h_{\mu\nu}, \qquad (13)$$

into the background $\bar g_{\mu\nu}$ and the quantum fluctuation part given by the spin-2 symmetric tensor $h_{\mu\nu}$. the next step is to define the gauge-fixing condition. since our theory with six derivatives still possesses gauge invariance due to diffeomorphism symmetry, we have to fix the gauge to make the graviton propagator non-degenerate. for this we make some choice of the gauge-fixing parameters, here represented by the numerical parameters $\alpha$, $\beta$ and $\gamma$. first, we choose the parameter $\beta$ in the harmonic background gauge-fixing condition $\chi_\mu$,

$$\chi_\mu = \nabla_\lambda h^{\lambda}{}_{\mu} - \beta\, \nabla_\mu h, \qquad h = h^{\nu}{}_{\nu}, \qquad (14)$$

in the simplest "minimal" form, as will be indicated below. the same concerns the parameters $\alpha$ and $\gamma$. finally, we select a general form of the weighting operator $\hat C = \tilde C^{\mu\nu}$, defined by the formula

$$\hat C = \tilde C^{\mu\nu} = -\frac{1}{\alpha}\left( g^{\mu\nu}\Box^2 + (\gamma - 1)\nabla^\mu \Box \nabla^\nu \right). \qquad (15)$$
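before proceeding, a small aside on the splitting (13): the following toy sympy sketch (ours, not code from the paper) expands $\det g$ to second order in the fluctuation around flat space, the kind of expansion that feeds into the bilinear (hessian) part of the action considered below.

```python
# toy check (sympy; our illustration): for g = eta + eps*h around flat space,
# the second-order term of det g is ((tr h)^2 - tr h^2)/2, as needed when
# quadratizing the action in the background field method.
import sympy as sp

eps = sp.symbols('epsilon')
h = sp.Matrix(4, 4, lambda i, j: sp.Symbol(f'h{min(i, j)}{max(i, j)}'))  # symmetric h_{mu nu}
g = sp.eye(4) + eps*h
quadratic = sp.expand(g.det()).coeff(eps, 2)
expected = sp.expand((h.trace()**2 - (h*h).trace())/2)
print(sp.simplify(quadratic - expected))  # 0
```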
together with the gauge-fixing condition $\chi_\mu$, this defines the gauge-fixing action [41] in the form

$$S_{\rm gf} = \int d^4x \sqrt{|g|}\; \chi_\mu\, \tilde C^{\mu\nu} \chi_\nu. \qquad (16)$$

the action of the complex faddeev-popov (fp) ghost fields ($\bar c^\mu$ and $c_\mu$ respectively) has in turn the form

$$S_{\rm gh} = \int d^4x \sqrt{|g|}\; \bar c^\mu M_{\mu\nu}\, c^\nu, \qquad (17)$$

where the bilinear part between the anti-ghost $\bar c^\mu$ and ghost fields $c_\mu$, the so-called fp matrix $\hat M$, depends differentially on the gauge-fixing conditions $\chi_\mu$ and also on the contracted form of the generator of gauge transformations $\hat R$,

$$\hat M = M^{\mu}{}_{\nu} = \frac{\delta \chi^\mu}{\delta g_{\alpha\beta}}\, R_{\alpha\beta\,\nu} = \delta^\mu_\nu\, \Box + \nabla_\nu \nabla^\mu - 2\beta\, \nabla^\mu \nabla_\nu. \qquad (18)$$

in the above equation, by the matrix-valued operator $R_{\alpha\beta\,\nu}$ we mean the generator of infinitesimal diffeomorphism (local gauge) transformations in any metric theory of gravity. since, as proven and explained at the end of section 1.3, our final results for uv divergences are completely universal and independent of any parameter used to regularize, compute and renormalize the effective action of the theory, we can adopt the following working philosophy. we choose some specific gauge in order to simplify our calculation, and we can then be sure that the final results are still correct, provided they are obtained consistently within this computation done in the particular gauge choice. it is true that the intermediate steps of the computation may differ between gauges, but the final results must be unique, and it does not matter by which route we arrive at them. we think we chose one of the simplest paths to reach this goal. a posteriori this method is justified, but the intermediate steps of processing the hessian operator will not have any invariant, objectively physical meaning; they are just steps of the computational procedure in some selected gauge. one knows that in such a case, for example in the formalism of functional traces of differential operators due to barvinsky and vilkovisky (bv) [27] applied within the background field method framework, all intermediate results are manifestly gauge-independent. still, such partial contributions (any of them) separately do not have any sensible physical meaning; although they are gauge-independent and look superficially physical, a physical meaning can properly be associated only with their total, final sum. on the contrary, if the computation is performed using feynman diagrams and momentum integrals around flat spacetime, then the intermediate results are not gauge-invariant, as is well known for the partial contributions of individual graphs, and only the final sum acquires gauge-independence. we also need to distinguish two different features here. some partial results may be gauge-dependent and their form may not exhibit gauge symmetry (for example, in the feynman diagram approach, a contribution from a subset of divergent diagrams may not be absorbable by a gauge-covariant counterterm: $F^2$ in gauge theories, or $R^2$ in gravity in $d=4$). this feature should however be regained in the final results, which is actually a good check of the computation. a different property is independence of the gauge-fixing parameters, which are spurious non-physical parameters: a counterterm might be gauge-covariant (built with $F^2$ or $R^2$ terms), while its front coefficient still depends on the gauge parameters $\alpha$, $\beta$, $\gamma$, etc.
this should not happen in the final results: they should be both gauge-covariant (i.e. gauge-independent, or gauge-invariant) and independent of the gauge-fixing parameters. these two necessary properties, required to call the result physical, must be realized completely independently, and they are a good check of the correctness of the calculation. unfortunately, it seems that when using the bv computational methods, even the intermediate results for the traces of separate matrix-valued differential operators (like $\hat H$, $\hat M$ and $\hat C$) already exhibit both gauge-independence and gauge-fixing parameter independence, provided such parameters were not used in the definition of these operators; only in some cases is it merely the total result that is gauge-fixing parameter independent. this means that within this formalism of computation the check is not very valuable, and one basically has to be very careful to get correct results at the end. instead, we perform a number of other rigorous checks of our results, as mentioned, for example, in section 3. finally, let us briefly give a few details concerning the choice of the gauge-fixing parameters $\alpha$, $\beta$ and $\gamma$. the bilinear form of the action is defined from the second variational derivative (giving rise to the hessian operator $\hat H$),

$$\hat H = H^{\mu\nu,\rho\sigma} = \frac{1}{\sqrt{|g|}}\, \frac{\delta^2 (S + S_{\rm gf})}{\delta h_{\mu\nu}\, \delta h_{\rho\sigma}} = H^{\mu\nu,\rho\sigma}_{\rm lead} + O(\nabla^4), \qquad (19)$$

where the first term $H^{\mu\nu,\rho\sigma}_{\rm lead}$ contains the six-derivative terms, which are leading in the uv regime. by $O(\nabla^4)$ we denote the rest of the bilinear form, with four or fewer derivatives and with higher powers of the gravitational curvatures $\mathcal{R}$. the energy dimension of this expression is compensated by powers of the curvature tensor $\mathcal{R}$ and its covariant derivatives, hence in this case we can also write $O(\nabla^4) = O(\mathcal{R})$. the corresponding full expression for the hessian operator $\hat H$ is very bulky, and we do not include it here. the highest-derivative part (leading in the uv regime) of the $\hat H$ operator, after adding the gauge-fixing term (16) that we selected, has the form

$$\begin{aligned} H^{\mu\nu,\rho\sigma}_{\rm lead} ={}& \left[ \omega_C\, \delta^{\mu\nu,\rho\sigma} + \left( \frac{\beta^2\gamma}{\alpha} - \frac{\omega_C}{3} + 2\omega_R \right) g^{\mu\nu} g^{\rho\sigma} \right] \Box^3 + \left( \frac{\omega_C}{3} - 2\omega_R - \frac{\beta\gamma}{\alpha} \right) \left( g^{\rho\sigma}\, \nabla^\mu \nabla^\nu + g^{\mu\nu}\, \nabla^\rho \nabla^\sigma \right) \Box^2 \\ &+ \left( \frac{1}{\alpha} - 2\omega_C \right) g^{\mu\rho}\, \nabla^\nu \nabla^\sigma \Box^2 + \left( \frac{2\omega_C}{3} + 2\omega_R + \frac{\gamma - 1}{\alpha} \right) \nabla^\mu \nabla^\nu \nabla^\rho \nabla^\sigma \Box. \end{aligned} \qquad (20)$$

in this expression, for the sake of brevity, we do not mark explicitly the symmetrization within and between the pairs of indices $(\mu,\nu)$ and $(\rho,\sigma)$. to make the uv-leading part of the hessian operator $H^{\mu\nu,\rho\sigma}_{\rm lead}$ minimal, one has to choose the following values for the gauge-fixing parameters [44]:

$$\alpha = \frac{1}{2\omega_C}, \qquad \beta = \frac{\omega_C - 6\omega_R}{4\omega_C - 6\omega_R}, \qquad \gamma = \frac{2\omega_C - 3\omega_R}{3\omega_C}. \qquad (21)$$

we explained previously that this choice does not affect the values and the form of the one-loop divergences in super-renormalizable qg; thus we adopt it as the simplest option. one notices that the expressions for the gauge-fixing parameters in (21) are singular in the limit $\omega_C \to 0$ and also when $\omega_C = \frac{3}{2}\omega_R$. the first singularity is clearly understandable, because then we lose the term $\omega_C\, C\Box C$ in the action (10) and the theory is degenerate and non-generic; the second condition is not easily understood in the weyl basis of writing terms in the action (10) (with $R^2$ and $C^2$ terms). to explain this other, spurious degeneracy one goes instead to the ricci basis of writing terms (with $R^2$ and $R^2_{\mu\nu} = R_{\mu\nu}R^{\mu\nu}$ elements, properly generalized to the six-derivative models by inserting one power of the box operator in the middle). there one sees that the vanishing of the coefficient in front of $R^2_{\mu\nu}$ leads to the pathology in the case $\omega_C = \frac{3}{2}\omega_R$, and also to the formal divergence of the $\beta$ gauge-fixing parameter. we remark that in the final results there is no trace of this denominator and this divergence, hence the condition of non-vanishing of the coefficient in front of the covariant term $R^2_{\mu\nu}$ in the ricci basis does not have any crucial meaning; it is only a spurious intermediate dependence on $(4\omega_C - 6\omega_R)^{-1}$. on the contrary, the singular dependence on the $\omega_C$ coefficient is crucial and will be analyzed at length here. actually, verifying that the denominators with $(4\omega_C - 6\omega_R)^{-1}$ completely cancel out in the final results is a powerful check of our method of computation; a quick symbolic check of the minimality conditions implied by (21) is sketched below.
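the following short sympy sketch (ours, not code from the paper) verifies directly from (20) and (21) that the three non-minimal derivative structures drop out with this gauge choice.

```python
# symbolic check (sympy; our sketch) that the gauge choice (21) annihilates the
# coefficients of the three non-minimal derivative structures appearing in (20).
import sympy as sp

wC, wR = sp.symbols('omega_C omega_R', positive=True)
alpha = 1/(2*wC)
beta  = (wC - 6*wR)/(4*wC - 6*wR)
gamma = (2*wC - 3*wR)/(3*wC)

c1 = wC/3 - 2*wR - beta*gamma/alpha        # coeff of (g ∇∇ + ∇∇ g) Box^2
c2 = 1/alpha - 2*wC                        # coeff of g ∇∇ Box^2
c3 = 2*wC/3 + 2*wR + (gamma - 1)/alpha     # coeff of ∇∇∇∇ Box
print([sp.simplify(c) for c in (c1, c2, c3)])  # [0, 0, 0]
```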
now we can collect all the necessary elements and write down the general formula for the uv-divergent part of the one-loop contribution to the effective action of the theory [41],

$$\bar\Gamma^{(1)} = \frac{i}{2}\, {\rm tr} \ln \hat H - i\, {\rm tr} \ln \hat M - \frac{i}{2}\, {\rm tr} \ln \hat C. \qquad (22)$$

the calculation of the divergent parts of the first two expressions in (22) is very standard. one uses for this the technique of the generalized schwinger-dewitt method [27], first introduced by barvinsky and vilkovisky; for this reason we shall skip most of the standard technical details here. we use the barvinsky-vilkovisky trace technology, related to covariant heat kernel methods, together with dimensional regularization (dimreg) to evaluate the functional traces present in (22) and to keep the general covariance of the final results under control. due to this manifest covariance, however, covariance itself cannot serve as a check, because all three contributions in (22) give results which look covariant and sensible. we remind the reader that here we work in the minimal gauge choice, and that in general all three terms separately will show gauge dependence as well as spurious dependence on the gauge-fixing parameters $\alpha$, $\beta$ and $\gamma$; only the final result, i.e. the weighted sum in (22), is properly gauge-independent and gauge-fixing independent, and gives rise to the physical observable of the beta functional of the theory at the one-loop level. the computational method we adopt here consists basically in using the barvinsky-vilkovisky trace technology to compute the functional traces of differential operators, giving the expression for the uv-divergent parts of the effective action at the one-loop level. the main results are obtained in the background field method, and from the uv divergences in [26] we read off the beta functions of the running gravitational couplings. below we also present an illustrative scalar example of the techniques by which these results were obtained.

2.1. example of the bv method of computation for the scalar case

the simplest example of the computational technique presented here is the analysis of the scalar case given by the action

$$S = \int d^4x \left( -\frac{1}{2}\, \phi \Box \phi - \frac{\lambda}{4!}\, \phi^4 \right). \qquad (23)$$

from this action one reads the second variational derivative operator (also known as the hessian),

$$H = \frac{\delta^2 S}{\delta \phi^2} = -\Box - \frac{\lambda}{2}\, \phi^2. \qquad (24)$$

next, one needs to compute the functional trace ${\rm tr} \ln H$ to get the uv-divergent part of the one-loop effective action:

$${\rm tr} \ln H = {\rm tr} \ln \left( -\Box - \frac{\lambda}{2}\phi^2 \right) = {\rm tr} \ln \left( -\Box \left( 1 + \frac{\lambda}{2}\phi^2 \Box^{-1} \right) \right) = {\rm tr} \ln (-\Box) + {\rm tr} \ln \left( 1 + \frac{\lambda}{2}\phi^2 \Box^{-1} \right). \qquad (25)$$

in the above expression, one concentrates on the second part, which contains the $\lambda$ coupling. one expands the logarithm, as in (12), in the second trace to second order in $\lambda$. this yields

$${\rm tr} \ln \left( 1 + \frac{\lambda}{2}\phi^2 \Box^{-1} \right) = {\rm tr} \left( \frac{\lambda}{2}\phi^2 \Box^{-1} \right) - \frac{1}{2}\, {\rm tr} \left( \frac{\lambda}{2}\phi^2 \Box^{-1} \right)^{\!2} + \dots \qquad (26)$$

and one picks from it only the expression quadratic in $\lambda$ and quartic in the background scalar field $\phi$, which is also formally quadratic in the inverse box operator $\Box^{-1}$, that is, the part

$${\rm tr} \ln H \supset -\frac{1}{2}\, \frac{\lambda^2}{4}\, \phi^4\, {\rm tr}\, \Box^{-2} = -\frac{\lambda^2}{8}\, \phi^4\, {\rm tr}\, \Box^{-2}. \qquad (27)$$
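as a quick cross-check of this expansion step, the following sympy sketch (ours; it treats $\Box^{-1}$ as a formal commuting symbol, which is legitimate for this bookkeeping of coefficients) reproduces the $\phi^4$ coefficient in (27).

```python
# formal check (sympy; our sketch): expand ln(1+u) to second order with
# u = (lambda/2) phi^2 B, where B stands for the formal Box^{-1}, and pick
# the phi^4 part; it reproduces the -lambda^2/8 coefficient of (27).
import sympy as sp

lam, phi, B = sp.symbols('lambda phi B')
u = lam/2 * phi**2 * B
expansion = u - u**2/2                       # ln(1+u) through second order, as in (12)
quartic = sp.expand(expansion).coeff(phi, 4)
print(quartic)                               # -lambda**2*B**2/8
```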
precisely this expression is relevant for the uv divergence proportional to the quartic interaction term $-\frac{\lambda}{4!}\phi^4$ in the original scalar field action (23). noticing that the functional trace of the $\Box^{-2}$ scalar operator in $d=4$ is given by

$${\rm tr}\, \Box^{-2} = i\, \frac{\ln L^2}{(4\pi)^2}, \qquad (28)$$

where $L$ is a dimensionless uv-cutoff parameter related to the dimensionful momentum uv-cutoff $\Lambda_{\rm UV}$ and the renormalization scale $\mu$ via $\Lambda_{\rm UV} = L\mu$, one finds for the uv-divergent part of the one-loop effective action of interest here

$$\Gamma^{(1)}_{\rm div} = \frac{i}{2}\, {\rm tr} \ln H \supset \int d^4x\; \frac{\ln L^2}{(4\pi)^2}\, \frac{\lambda^2}{16}\, \phi^4. \qquad (29)$$

now, one can compare this to the term in the original action (23) describing the quartic interactions of the scalar fields $\phi$: $-\int d^4x\, \frac{\lambda}{24}\, \phi^4$. the counterterm action (absorbing the uv divergences) is opposite to $\Gamma_{\rm div}$, and the form of the terms in the counterterm action is expressed via the perturbative beta functions of the theory. that is, in the counterterm action $\Gamma_{\rm ct}$ we expect the terms

$$\Gamma_{\rm ct} = -\Gamma_{\rm div} = -\frac{1}{2}\, \frac{\ln L^2}{24} \int d^4x\; \beta_\lambda\, \phi^4, \qquad (30)$$

with the front coefficient exactly identical to one half of the coefficient in front of the quartic interactions in the original action (23) (equal to $-\frac{1}{4!} = -\frac{1}{24}$). from this one reads (identifying effectively $\ln L^2 \to 1$ for the comparison) that

$$-\frac{1}{48}\, \beta_\lambda = -\frac{\lambda^2}{16(4\pi)^2} \qquad (31)$$

and finally that

$$\beta_\lambda = \frac{3\lambda^2}{(4\pi)^2}, \qquad (32)$$

which is the standard result for the one-loop beta function of the quartic coupling $\lambda$ in $\frac{1}{4!}\lambda\phi^4$ scalar theory in $d=4$ spacetime dimensions. one sees that even in this simplest framework the details of such a computation are quite cumbersome, and we decided not to include in this manuscript other, more sophisticated illustrative examples of such derivations of explicit results for the beta functions of the theory. the reader who wants to see some samples can consult the more explicit similar calculations presented in references [26, 30, 44]. in particular, the appendix of [30] compares two approaches to the computation of uv divergences in gauge theory (simpler than gravity, but with non-abelian gauge symmetry): the bv heat kernel technique, and the standard feynman diagram computation using graphs and feynman rules around flat space in fourier momentum space.
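the arithmetic chain from (31) to (32) can itself be verified in one line; the sketch below (ours, not from the paper) solves the matching condition symbolically.

```python
# arithmetic check (sympy; our sketch) of the matching (31) -> (32):
# -(1/48) beta_lambda = -(lambda^2/16)/(4 pi)^2  gives  beta_lambda = 3 lambda^2/(4 pi)^2.
import sympy as sp

lam, beta_lam = sp.symbols('lambda beta_lambda')
sol = sp.solve(sp.Eq(-beta_lam/48, -lam**2/(16*(4*sp.pi)**2)), beta_lam)[0]
print(sol)                                          # 3*lambda**2/(16*pi**2)
print(sp.simplify(sol - 3*lam**2/(4*sp.pi)**2))     # 0, i.e. equals (32)
```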
2.2. results in six-derivative gravity

the final results of this computation of all uv divergences of the six-derivative gravitational theory are

$$\Gamma^{(1)\,R,C}_{\rm div} = -\frac{\ln L^2}{2(4\pi)^2} \int d^4x \sqrt{|g|}\, \left\{ \left( \frac{397}{40} + \frac{2x}{9} \right) C^2 + \frac{1387}{180}\, GB - \frac{7}{36}\, R^2 \right\} \qquad (33)$$

for the case of the six-derivative pure qg model in $d=4$ spacetime dimensions, and

$$\Gamma^{(1)\,R,C}_{\rm div} = -\frac{\ln L^2}{2(4\pi)^2} \int d^4x \sqrt{|g|}\, \left\{ -\frac{133}{20}\, C^2 + \frac{196}{45}\, GB + \left( -\frac{5}{2}\, x_{4\text{-der}}^{-2} + \frac{5}{2}\, x_{4\text{-der}}^{-1} - \frac{5}{36} \right) R^2 \right\} \qquad (34)$$

for the case of the four-derivative pure stelle quadratic model of qg, to one-loop accuracy. this last result was first reported in [42]; the result in six-derivative gravity is freshly new [26]. here we define the covariant cut-off regulator $L$ [27], which stands in the following relations to the dimensional regularization parameter $\epsilon$ [27, 43]:

$$\ln L^2 \equiv \ln \frac{\Lambda_{\rm UV}^2}{\mu^2} = \frac{1}{\epsilon} = \frac{1}{2-\omega} = \frac{2}{4-n}, \qquad (35)$$

where $n$ denotes the generalized dimensionality of spacetime in the dimreg scheme of regularization ($\Lambda_{\rm UV}$ is the dimensionful uv-cutoff energy parameter and $\mu$ is the quantum renormalization scale). moreover, to write our final results for the six-derivative theory compactly, we used the definition of the fundamental ratio $x$ of the theory,

$$x = \frac{\omega_C}{\omega_R}, \qquad (36)$$

while for stelle four-derivative theory in (34) we use the analogous definition, now with the theta couplings instead of the omegas,

$$x_{4\text{-der}} = \frac{\theta_C}{\theta_R}. \qquad (37)$$

it is also worth describing briefly the passage from the uv divergences of the theory at the one-loop level to the perturbative one-loop beta functions of the relevant dimensionless couplings. using the divergent contribution to the quantum effective action derived above, we can define the beta functions of the theory. let us first fix some definitions. the renormalized lagrangian $\mathcal{L}_{\rm ren}$ is obtained by starting from the classical lagrangian written in terms of the renormalized coupling constants and then adding the counterterms to subtract the divergences,

$$\mathcal{L}_{\rm ren} = \mathcal{L}(\alpha_b(t)) = \mathcal{L}\big( Z_{\alpha_i}(t)\, \alpha_i(t) \big) = \mathcal{L}(\alpha_i(t)) + \mathcal{L}_{\rm ct} = \mathcal{L}(\alpha_i(t)) + (Z_C - 1)\, \theta_C(t)\, C^2 + (Z_R - 1)\, \theta_R(t)\, R^2 + (Z_{GB} - 1)\, \theta_{GB}(t)\, GB, \qquad (38)$$

where $\mathcal{L}_{\rm ct} = -\mathcal{L}_{\rm div}$ and $\alpha_i(t) = \{\theta_C(t), \theta_R(t), \theta_{GB}(t)\}$. above we denoted by $\alpha_b(t)$ the rg running bare values of the coupling parameters, by $\mathcal{L}_{\rm ct}$ and $\mathcal{L}_{\rm div}$ the counterterm and divergent lagrangians respectively, by $Z_{\alpha_i}(t)$ the renormalization constants of all the dimensionless couplings, and finally by $\alpha_i(t)$ the running couplings themselves. here and above we neglect writing terms which are uv-divergent but subleading in the number of derivatives in the uv regime. from (38), the full counterterm action reads, already in dimensional regularization,

$$\Gamma^{(1)}_{\rm ct} = -\Gamma^{(1)}_{\rm div} = \frac{1}{2\epsilon}\, \frac{1}{(4\pi)^2} \int d^4x \sqrt{|g|}\, \left\{ \left( \frac{397}{40} + \frac{2x}{9} \right) C^2 - \frac{7}{36}\, R^2 + \frac{1387}{180}\, GB \right\} \equiv \frac{1}{2\epsilon}\, \frac{1}{(4\pi)^2} \int d^4x \sqrt{|g|}\, \left\{ \beta_C\, C^2 + \beta_R\, R^2 + \beta_{GB}\, GB \right\}. \qquad (39)$$

comparing the last two formulas, we can identify the beta functions and finally obtain the renormalization group equations of the six-derivative theory,

$$\beta_C = \mu \frac{d\theta_C}{d\mu} = \frac{1}{(4\pi)^2} \left( \frac{397}{40} + \frac{2x}{9} \right), \qquad (40)$$

$$\beta_R = \mu \frac{d\theta_R}{d\mu} = -\frac{1}{(4\pi)^2}\, \frac{7}{36}, \qquad (41)$$

$$\beta_{GB} = \mu \frac{d\theta_{GB}}{d\mu} = \frac{1}{(4\pi)^2}\, \frac{1387}{180}. \qquad (42)$$

the three lines above constitute the main results of this work. their structure, mainly the $x$-dependence, is the main topic of discussion in the following sections. above we denoted by $t$ the so-called logarithmic rg time parameter, related to the renormalization scale $\mu$ by $t = \log \frac{\mu}{\mu_0}$, where $\mu_0$ is some reference energy scale. as we show below, the differences between the four-derivative and the six-derivative cases are significant: the dependence on the ratio $x$ follows quite an opposite pattern, and it appears in completely different sectors of the ultraviolet divergences of the two respective theories. in the main part of this contribution we attempt to explain this now clearly noticeable difference using some general principles and arguments about the renormalizability of the quantum models. we also study some limiting cases of non-finite (infinite or zero) values of the $x$ parameter and argue that in such cases the qg model is non-renormalizable, which leads to the characteristic patterns in the structure of beta functions mentioned above for six-derivative theories. this is also why we can call the ratio $x$ the fundamental parameter of the gravitational theory.
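since the one-loop beta functions (40)-(42) are exact constants (for a fixed value of $x$), the corresponding rg running is strictly linear in the rg time $t$; the minimal numerical sketch below (ours; the value of $x$ and the initial couplings are hypothetical) makes this explicit.

```python
# minimal numerical illustration (numpy; ours) of the exact one-loop flows (40)-(42):
# with constant betas, theta_i(t) = theta_i(0) + beta_i * t, i.e. strictly linear running.
import numpy as np

x = 1.0                                   # hypothetical value of the fundamental ratio (36)
pref = 1.0/(4*np.pi)**2
betas = {'theta_C':  pref*(397/40 + 2*x/9),
         'theta_R': -pref*7/36,
         'theta_GB': pref*1387/180}

t = np.linspace(0.0, 10.0, 3)             # rg time t = log(mu/mu_0)
theta0 = 1.0                              # hypothetical common initial value at mu_0
for name, b in betas.items():
    print(name, theta0 + b*t)             # linear trajectories in t
```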
3. some theoretical checks of the results (33)

let us say that, regardless of the simplicity of the final formulas of the final result in [26], the intermediate calculations were quite sizable, which is why we cannot present the intermediate steps here. this was not only because of the size of the algebraic expressions, for which we used mathematica to help with the symbolic algebra manipulations, but also due to the complexity of all the steps of the computation, starting from the quadratic expansion of the action of the six-derivative classical theory. the ultimate validity of the calculations has been checked in several different ways, as briefly described below. the following checks were performed to ensure the correctness of the intermediate results of the computation of uv divergences, which was the main task of the work presented here.

(1.) first, the validity of the expression for the hessian operator derived from the classical action with six derivatives was verified as follows. the covariant divergence of the second variational derivative operator (hessian) with respect to the gravitational fluctuations $h_{\mu\nu}$, computed from each gr-covariant term $S_{{\rm grav},i}$ in the gravitational action, must separately vanish, namely

$$\nabla_\mu \left( \frac{\delta^2 S_{{\rm grav},i}}{\delta h_{\mu\nu}\, \delta h_{\rho\sigma}} \right) = 0 + O\!\left( \nabla^k \mathcal{R}^l,\ k + 2l > 4 \right), \qquad (43)$$

where $S_{\rm HD} = \sum_i S_{{\rm grav},i}$. this formula was explicitly checked for each term in the action (10), to the order quadratic in curvatures and up to a total of four covariant derivatives acting on the general gravitational curvature $\mathcal{R}$.

(2.) the computation of the functional trace of the logarithm of the gauge weighting operator $\hat C$, namely ${\rm tr} \ln \hat C$, was checked using three methods. since the $\hat C$ operator is a non-minimal four-derivative differential operator and matrix-valued (it carries vector indices), the computation of the trace of its logarithm is a bit troublesome, and one has to be more careful here. therefore, we performed additional verifications of our partial results for this trace. our three methods consist basically in transforming the problem into the computation of the same trace of a logarithm, but of new operators (with a higher number of derivatives). next, by selecting some adjustable parameters present in the construction of these new operators, the morphed operators could be put into a minimal form and easily traced (under the functional logarithm operation) using the standard methods and prescriptions of the barvinsky-vilkovisky trace technology [27]. this construction of new operators was achieved by operatorial multiplication with some two-derivative spin-one operator $\hat Y$, containing one free adjustable parameter; for details one can look up section iii of [26]. in the first variant of the method, we multiplied $\hat C$ from the right by $\hat Y$ once; in the second method we multiplied by $\hat Y$ from the left, also once; and in the final, third method we used the explicitly symmetric form of multiplication $\hat Y \hat C \hat Y$. (this last form of multiplication is presumably very important for the manifest self-adjointness property of the resulting 8-derivative differential operator $\hat Y \hat C \hat Y$.) for these operatorial multiplications, $\hat Y$ was a two-derivative operator whose trace of the logarithm is known and can be easily verified (this was also checked independently below).
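the operator identity underlying these methods, ${\rm tr} \ln (\hat Y \hat C) = {\rm tr} \ln \hat Y + {\rm tr} \ln \hat C$, can be illustrated on finite-dimensional toy matrices; the numpy sketch below (ours; only heuristic, since true functional traces require regularization) shows the bookkeeping.

```python
# finite-dimensional toy check (numpy; ours, not the BV computation itself):
# tr ln(Y C) = tr ln Y + tr ln C, which lets one trade the trace of a non-minimal
# operator C for that of a product Y C that can be put into minimal form.
import numpy as np

rng = np.random.default_rng(0)

def tr_ln(M):
    # tr ln M = ln det M; slogdet returns (sign, ln|det|)
    return np.linalg.slogdet(M)[1]

A = rng.normal(size=(5, 5)); Y = A @ A.T + 5*np.eye(5)   # positive-definite stand-in for Y-hat
B = rng.normal(size=(5, 5)); C = B @ B.T + 5*np.eye(5)   # positive-definite stand-in for C-hat
print(np.isclose(tr_ln(Y @ C), tr_ln(Y) + tr_ln(C)))     # True
```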
we emphasize that in the first two methods the resulting operators ($\hat Y \hat C$ and $\hat C \hat Y$ respectively) were six-derivative ones, while in the last one, with double multiplication from both sides, $\hat Y \hat C \hat Y$ was an eight-derivative matrix-valued differential operator. in the end, all three methods of computing ${\rm tr} \ln \hat C$ described above agree for the terms quadratic in curvatures. only these terms are important for us here, since they appear in the form of the uv divergences of the theory (and are composed of the gr-invariants $R^2$, $R^2_{\mu\nu}$ and $R^2_{\mu\nu\rho\sigma}$).

(3.) similarly, the computation of ${\rm tr} \ln \hat Y$ for the two-derivative operator $\hat Y$ was verified using three analogous methods. we used multiplication from both sides by an operator $\hat A$, and also the symmetric form of multiplication $\hat A \hat Y \hat A$, where $\hat Y$ is the two-derivative operator whose functional trace of the logarithm we sought here. above, $\hat A$ was another two-derivative non-minimal spin-one vector gauge (massless) operator, whose trace of the logarithm is well known [41] and easily found. again, for the final results for ${\rm tr} \ln \hat Y$, all three methods presented here agree to the order of terms quadratic in the curvatures $\mathcal{R}$.

(4.) in the total divergent part $\Gamma_{\rm div}$ of the quantum effective action, we checked the complete cancellation of terms with poles in the variable $y = 2\omega_C - 3\omega_R$: all terms with $\frac{1}{y}$ and $\frac{1}{y^2}$ in denominators (originating from the expression for the gauge-fixing parameter $\beta$ in (21)) cancel out completely. this is a non-trivial cancellation between the results of the traces ${\rm tr} \ln \hat H$ and ${\rm tr} \ln \hat M$.

(5.) finally, using the same code written in mathematica [47], a similar computation in the four-derivative gravitational theory (stelle theory in four dimensions) was repeated. we were easily able to reproduce all the results on one-loop uv divergences there [42]. to our satisfaction, we found complete agreement for all the coefficients and the same non-trivial dependence on the parameter $x_{4\text{-der}}$, which was already defined above for stelle gravity. this was the final check.

4. structure of beta functions in six-derivative quantum gravity

4.1. limiting cases

in this subsection, we discuss various limiting cases of higher-derivative gravitational theories (both with four and six derivatives). we study in detail the situation when some of the coefficients of the action terms in the weyl basis tend to zero. we comment on whether in such cases our method of computation is still valid, whether the final results for the uv divergences are correct in those cases, and whether they could be obtained by a continuous limiting procedure. first, we discuss the situation with a possible degeneracy of the kinetic operator of the theory acting between quantum metric fluctuations $h_{\mu\nu}$ at the level of the quadratized action. if the action of a theory has the following uv-leading terms,

$$S_{\rm grav} = \int d^dx \sqrt{|g|}\, \left( \omega_{C,N}\, C \Box^N C + \omega_{R,N}\, R \Box^N R \right), \qquad (44)$$

with $\omega_{C,N} \neq 0$ and also $\omega_{R,N} \neq 0$, then after adding the proper gauge-fixing functional the kinetic operator can be defined, so in these circumstances it is not degenerate. it then constitutes the operatorial kernel of the part of the action which is quadratic in the fluctuation fields. it is well-defined not only when $\omega_{C,N} \neq 0$ and $\omega_{R,N} \neq 0$, but also when $\omega_{R,N} = 0$. this last assertion can be checked by explicit inspection, but due to the length of the resulting expression we decided not to include such a bulky formula here.
however, in the case $\omega_{C,N} = 0$, a special procedure must be used to define the theory of perturbations and to extract the uv divergences of the model; we remark that in this last case the theory is non-renormalizable. we also emphasize that the addition of the gauge-fixing functional is necessary here, since without it the kinetic operator (hessian) is automatically degenerate as a result of the gauge invariance of the theory (represented here, in the gravitational setup, by the diffeomorphism gauge symmetry). in general, as emphasized in [26], in four spacetime dimensions the general uv divergences depend only on the coefficients appearing in the following uv-leading part of the gravitational hd action,

$$S_{\rm grav} = \int d^4x \sqrt{|g|}\, \big( \omega_{C,N}\, C\Box^N C + \omega_{R,N}\, R\Box^N R + \omega_{C,N-1}\, C\Box^{N-1} C + \omega_{R,N-1}\, R\Box^{N-1} R + \omega_{C,N-2}\, C\Box^{N-2} C + \omega_{R,N-2}\, R\Box^{N-2} R \big), \qquad (45)$$

where the last four terms are subleading in the uv regime. however, they are the most relevant for the divergences proportional to the ricci curvature scalar and also to the cosmological constant term [44]. below, for notational convenience, we adopt the following convention, specially suited to six-derivative gravitational theories, i.e. the case $N = 1$. we call the coupling coefficients in front of the leading terms the omega coefficients (like $\omega_C = \omega_{C,1}$ and analogously $\omega_R = \omega_{R,1}$), while the coefficients of the subleading terms with four derivatives we denote as the theta coefficients (like $\theta_C = \omega_{C,0}$ and analogously $\theta_R = \omega_{R,0}$). finally, for the most subleading term, with subindex value $N-2$ formally equal to $-1$ here, we have just one term contributing to the cosmological-constant type of uv divergence. we denote this coefficient $\omega_{-1} = \omega_{R,-1}$; it stands in front of the ricci scalar term in the original classical action of the theory (6), and it is simply related to the value of the 4-dimensional gravitational newton's constant $G_N$. the expressions for the rg running of the cosmological constant and the newton's constant in [26, 44] contain various fractions of the parameters of the theory appearing in the action (45). still, for a generic value of the integer $N$, giving roughly half of the order of higher derivatives in the model, we have the following schematic structure of these fractions:

$$\frac{\omega_{R,N-1}}{\omega_{R,N}}, \qquad \frac{\omega_{C,N-1}}{\omega_{C,N}}, \qquad \frac{\omega_{R,N-2}}{\omega_{R,N}}, \qquad \frac{\omega_{C,N-2}}{\omega_{C,N}}. \qquad (46)$$

the structure of the uv divergences and of these fractions can be easily understood from energy dimensionality arguments. we notice that in the weyl basis, with the terms in (45) written with weyl tensors $C_{\mu\nu\rho\sigma}$ and ricci scalars $R$, the only fractions which appear in such subleading uv divergences are "diagonal" and do not mix terms from the spin-2 (weyl) sector with terms from the spin-0 (ricci scalar) sector. if in any of the above fractions we take the limits $\omega_{R,N-1} \to 0$, $\omega_{C,N-1} \to 0$, $\omega_{R,N-2} \to 0$, or $\omega_{C,N-2} \to 0$, then the corresponding fractions, and also the related uv divergences (and the resulting beta functions in question), simply vanish, provided that the coefficients $\omega_{R,N}$ and $\omega_{C,N}$ in their denominators are non-zero. on the other hand, if $\omega_{C,N} = 0$, then we cannot rely on this limiting procedure. in this case, at the level of the quadratized action the operator between quantum fluctuations is degenerate even after adding the gauge-fixing terms. this means that in this situation a special procedure has to be used to extract the uv divergences of the model. this is possible, but we will not discuss it here.
it is worth noticing that, in turn, if $\omega_{R,N} = 0$, then the kinetic operator for small fluctuations, after adding the gauge fixing, is still well-defined, as emphasized above. for this case a special, additional kind of gauge fixing has to be used, which also fixes the value of the trace of the metric fluctuations $h = g^{\mu\nu} h_{\mu\nu}$; in the theory with $N = 0$ this resembles the conformal gauge fixing of the trace. if $\omega_{R,N} = 0$ and additionally $\omega_{R,N-1}$ or $\omega_{R,N-2}$ are non-zero, then the corresponding beta functions for the cosmological constant and newton's gravitational constant are indeed infinite and ill-defined, as viewed naively from the expressions in (46). this situation can be understood as the appearance of an additional new divergence not absorbed in the adopted renormalization scheme, so the renormalizability of such a theory is likely lost. but if the model has $\omega_{R,N} = 0$ and at the same time $\omega_{R,N-1} = \omega_{R,N-2} = 0$, then the contributions of the corresponding fractions in (46) vanish, because the limits $\omega_{R,N-1} \to 0$ or $\omega_{R,N-2} \to 0$ must be taken first, respectively; only after this should the final limiting procedure $\omega_{R,N} \to 0$ be performed. therefore, in this limiting situation, the proper sequence of limits on the respective fractions is as follows:

$$\lim_{\omega_{R,N}\to 0} \left( \lim_{\omega_{R,N-1}\to 0} \frac{\omega_{R,N-1}}{\omega_{R,N}} \right) = 0 \qquad (47)$$

and

$$\lim_{\omega_{R,N}\to 0} \left( \lim_{\omega_{R,N-2}\to 0} \frac{\omega_{R,N-2}}{\omega_{R,N}} \right) = 0. \qquad (48)$$

(a symbolic illustration of this ordering of limits is given in the short sketch at the end of this passage.) in this case, there are no contributions to the beta functions from these fractions, so the $R^2$ sector contributes nothing to the mentioned uv divergences, while it is expected that the terms in the $C^2$ sector make some impact on the beta functions. however, a similar procedure cannot be applied in the sector with weyl-square terms (the $C^2$ sector), i.e. to the model with $\omega_{C,N} = 0$ and at the same time $\omega_{C,N-1} = \omega_{C,N-2} = 0$, since these cases have to be treated specially and separately. in this last case, after the limit, only the pure sector with ricci-scalar-square terms (the $R^2$ sector) survives and the theory is likely non-renormalizable; we then expect contributions to the uv divergences only from terms in the $R^2$ sector. regarding the divergences proportional to expressions quadratic in curvatures ($R^2$, $C^2$, and the gauss-bonnet term $GB$), we have found the following generic structure in four-derivative gravity [42]:

$$a_{-2}\, x_{4\text{-der}}^{-2} + a_{-1}\, x_{4\text{-der}}^{-1} + a_0, \qquad (49)$$

where in this case of four-derivative gravity the fundamental ratio of the theory is defined as

$$x_{4\text{-der}} = \frac{\omega_{C,0}}{\omega_{R,0}} = \frac{\theta_C}{\theta_R}. \qquad (50)$$

the numerical coefficients $a_{-2}$, $a_{-1}$ and $a_0$ are different for the different types of uv divergences (here given by the terms with four derivatives, namely the $R^2$, $C^2$ and $GB$ terms respectively); the explicit numerical values are given in formula (34). one observes negative powers of the ratio $x_{4\text{-der}}$ in (49) and in (34), implying also negative powers of the coupling $\theta_C$ in the final results for these uv divergences. this result signifies that the theory with $\theta_C = 0$ must be treated separately, since then we do not have a well-defined kinetic operator in the standard scheme of computation; the naive results obtained by the limit $\theta_C \to 0$ of the formula in (49) do not exist. such theories with $\theta_C = 0$ entail the complete absence of gravitational terms in the $C^2$ sector. they are again very special, perturbatively non-renormalizable models.
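the ordering of limits in (47)-(48) matters; the following sympy sketch (ours, not from the paper) makes this explicit for a single generic fraction from (46).

```python
# sympy sketch (ours) of the ordered limits (47)-(48): the inner limit must be
# taken first; reversing the order makes the fraction ill-defined (divergent).
import sympy as sp

wN, wNm1 = sp.symbols('omega_RN omega_RNm1', positive=True)
frac = wNm1/wN

inner = sp.limit(frac, wNm1, 0)        # 0 for fixed omega_{R,N} != 0
print(inner, sp.limit(inner, wN, 0))   # 0 0 : the ordered double limit vanishes
print(sp.limit(frac, wN, 0))           # oo : reversed order diverges
```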
the above remarks apply both to the pure $R^2$ starobinsky theory and to theories in the $R^2$ sector with the addition of the einstein-hilbert $R$ term or the cosmological constant term $\omega_\Lambda$. on the other side, the limit $\theta_R \to 0$, towards pure $C^2$ gravity, seems not to produce any problem with the degeneracy of the kinetic operator, nor with the final expression (49); the naive answer for (49) would then be just $a_0$ for each of the uv divergences. but this is an incorrect answer, since for pure four-derivative gravity with $\theta_R = 0$ in the weyl basis of terms we have an enhancement of the symmetry of the model, beyond the case of non-zero $\theta_R$: the theory then also enjoys conformal symmetry, and a more specialized and delicate computation must be performed to cover this case. this is the case of four-dimensional conformal (weyl) gravity. (for simplicity, we decided not to analyze here the cases when, besides the $C^2$ action of four-dimensional conformal gravity, there are also some subleading terms from the almost "pure" $R^2$ sector, i.e. $\omega_{-1} \neq 0$, or when we allow for a non-vanishing cosmological constant term $\omega_\Lambda \neq 0$; these terms in the action would break classical conformality.) the computation in this case should reflect the fact that the conformal symmetry also needs to be gauge-fixed. we remark that the conformal symmetry does not require dynamical fp ghosts, because the conformal transformations of the gravitational gauge potentials (unlike those of the conformal weyl gauge potentials $b_\mu$) are without derivatives. in the end, when the more sophisticated method is employed, the eventual result differs from $a_0$ for each of the three types of four-derivative uv-divergent gr-invariant terms in the quantum effective action of the model. the strict result $a_0$ remains correct only for theories in which conformality is violated by the inclusion of other non-conformal terms, like the einstein-hilbert $R$ term or the cosmological constant term $\omega_\Lambda$. we conclude that in the four-derivative theory the two possible extreme cases, $\theta_R = 0$ and $\theta_C = 0$, are not covered by the general formula (49), but in each of these cases the reason for the omission is different; in both cases, separate, better-adapted methods of computation of the uv divergences have to be used. in the case of the six-derivative theory studied in [26], we have the following structure of the uv divergences quadratic in gravitational curvatures:

$$b_0 + b_1 x, \qquad (51)$$

with new values of the constants $b_0$ and $b_1$. the explicit numerical values are given in our formula (33) with the results. we also remark that the values of the constant terms $b_0$ differ from the values of $a_0$ in the previous four-derivative gravity case. moreover, the numerical coefficients $b_0$ and $b_1$ are different for the different types of uv divergences of the effective action (the $R^2$, $C^2$ and $GB$ terms respectively). when the leading dynamics in the uv regime is governed by the theory with six derivatives, we define the fundamental ratio $x$ as

$$x = \frac{\omega_{C,1}}{\omega_{R,1}} = \frac{\omega_C}{\omega_R}. \qquad (52)$$

we emphasize that in such a case we cannot continuously take the limit $\omega_C \to 0$. although naively this would mean the limit $x \to 0$, the result, just $b_0$ from (51), would be incorrect, because in this case we cannot trust the method of computation: when $\omega_C = 0$ the kinetic operator is degenerate (the same as in the four-derivative gravity case) and needs a non-standard treatment that we will not discuss here.
moreover, looking at the last formula (51), the other limit, $\omega_R \to 0$, is clearly impossible too, because it gives divergent results. however, in this case ($\omega_R = 0$), and contrary to the previous case ($\omega_C = 0$), we can trust the computation at least at the level of the kinetic operator (hessian) and the subsequent computation of the functional trace of its logarithm. in this case, the divergent final results in (51) signify that the theory is likely non-renormalizable and that there are new uv divergences besides those derived from the naive power counting analysis.³ we conclude that in the case of six-derivative gravity, both cases $\omega_R = 0$ and $\omega_C = 0$ require special treatment: formulas of the type (51) or (33) do not apply there, and the limiting cases are not continuous. more discussion of these limits is contained in the further subsection 4.4.

4.2. dependence of the final results on the fundamental ratio x

here we want to understand the $x$-dependence in the results for the beta functions in the six-derivative gravitational theory. we first analyze the situation in the simpler theory (with four derivatives), to prepare the ground for the theory with six derivatives, and then draw a comparison between the two. we look for singular $\frac{1}{\omega_R}$ or $\frac{1}{\omega_C}$ dependence (corresponding to positive or negative powers of the fundamental ratio $x = \frac{\omega_C}{\omega_R}$ respectively) in the functional traces of the fundamental operators defining the dynamics of quantum perturbations relevant at the one-loop perturbative level. we note that the two definitions of the ratio $x$ in (50) and (52), for four- and six-derivative gravities respectively, are compatible with each other, and the proper use of them (with theta or omega couplings) is obvious in the specific contexts in which they are used. below, when we refer to features shared by both four- and six-derivative gravitational theories, we use the common notation with general $\omega_C$, $\omega_R$ and $x$ coefficients, and we do not change to the special notation originally adequate only to stelle quadratic theory (with $\theta_C$, $\theta_R$ and $x_{4\text{-der}}$); we hope this will not lead to any confusion. we emphasize that when one of the two terms, with the $\omega_R$ or the $\omega_C$ front coupling, is missing in the uv-leading part of the action of the model, then the theory is badly non-renormalizable and degenerate. for example, one cannot define even at tree level the flat spacetime graviton propagator, since the parts proportional to the $P^{(0)}$ or $P^{(2)}$ projectors do not exist in the cases $\omega_R = 0$ or $\omega_C = 0$ respectively. however, there we can still use the barvinsky-vilkovisky (bv) trace technology to compute the new uv divergences. the fact that they cannot be absorbed in counterterms of the original theory is another story, related to the non-renormalizability of the model, which we will not discuss further here. we think that, for example, using the bv technique one can quickly compute the uv divergences in einstein-hilbert (e-h) theory in $d = 4$ (which is a non-renormalizable model), and this method still gives a definite result (besides the fact that these divergences are gauge-fixing dependent and valid only for one gauge choice).

³ we remark that the generic power-counting analysis of uv divergences in six-derivative quantum gravity, as presented in section 1.1, applies only when $\omega_C \neq 0$ and $\omega_R \neq 0$.
moreover, using the bv trace machinery and the minimal form of the kinetic operator is essential for getting final results for the unique effective action (as introduced by barvinsky [27, 48]), also in perturbatively non-renormalizable models. in quadratic gravity (the four-derivative theory) in $d = 4$, setting $\theta_C = 0$ is highly problematic. the same regards taking the limit $\theta_C \to 0$, because the pure $R^2$ theory can then be fully gauge-fixed; this means, for example, that on the flat spacetime background the kinetic operator vanishes, the perturbative modes are not dynamical, and there is no graviton propagator. using the standard technique of the one-loop effective action, one sees that the traces of the functional logarithms of the $\hat H$ and $\hat C$ operators both contain singular expressions $\frac{1}{\theta_C}$, and there is no final cancellation between them. in this case of four-derivative gravity, the final results for the uv divergences really do contain inverse powers of the fundamental ratio $x_{4\text{-der}}$ of the theory. the results in quadratic stelle gravity when we set $\theta_R = 0$ are not continuous either, because in this case the local gauge symmetry of the theory is enhanced: we also have conformal symmetry there, and the model is identical to weyl gravity in $d = 4$, described by the action $C^2$. as emphasized in [41], this case of $\theta_R = 0$ has to be treated specially. in this model the conformal symmetry also has to be gauge-fixed, and in this special case the operators $\hat H$ and $\hat C$ are different from their limiting versions under the $\theta_R \to 0$ limit of the generic four-derivative theory. hence the results for the beta functions are also different from the limits of the corresponding beta functions in the situation with $\theta_R \neq 0$. if we start with the theory with $\theta_C = 0$ from the beginning, then there are serious problems with the kinetic operator: we checked that it cannot be put by standard gauge fixing into the minimal form with a four-derivative leading operator. moreover, as a result of this process, one of the typical gauge-fixing parameters remains undetermined. here one can try to compute the trace of the logarithm of the hessian using the method proposed in [44], consisting of multiplying by some two-derivative non-minimal operator and getting a six-derivative operator whose trace can be easily found. but it is hard to believe that one has any chance to get a non-singular answer for all the beta functions in pure $R^2$ theory, since it is known that this theory is non-renormalizable (because it lacks the $C^2$ counterterm in the bare action). actually, here (for the $\theta_C = 0$ case) one could choose a $\hat C$ matrix-valued differential operator different from the standard minimal prescription and choose different values for the $\gamma$ gauge-fixing parameter. in the standard minimal choice for the gauge-fixing parameters, and in this model, the $\hat C$ matrix contains a part irregular in the $\theta_C$ coupling (a $\frac{1}{\theta_C}$ pole), because of the dependence of $\gamma$ on $\theta_C$. this last dependence originates from the conditions forced on the gauge-fixing parameters in order to put the kinetic operator into minimal form in the standard case $\theta_C \neq 0$. however, knowing that in the case $\theta_C = 0$ this procedure is anyhow unsuccessful, we have the freedom to choose a value of $\gamma$ different from the standard one, at our wish. in principle, similar considerations can be repeated verbatim for the case of the six-derivative theory (with $N = 1$ as the power exponent on the box operator in the action (45) defining the theory).
but we remark here that the theory with $\omega_R = 0$ and $N = 1$ is not conformally invariant in $d = 4$ dimensions, and the above problems with the gauge fixing of the hessian operator $\hat H$ and with its non-minimality in the case $\omega_C = 0$ still persist. this is because, for six-derivative gravitational theories, the box operator $\Box$ acting between two gravitational curvatures is only a spectator from the point of view of the uv-leading part of the $\hat H$ operator (with the highest number of derivatives and with the zeroth power of gravitational curvatures), or from the point of view of the flat spacetime kinetic operator and the flat spacetime graviton propagator. the box operator in momentum space gives only one additional factor of $-k^2$ to the kinetic operator and an additional suppression by $-k^{-2}$ of the propagator. the hessian in the six-derivative theory with $\omega_R = 0$ must therefore possess the same definitional issues as the one in the four-derivative theory (with $\theta_R = 0$), because for the kinetic terms the box operator again plays only the role of a spectator. hence the difference at this level between the four- and six-derivative theories is only in some overall multiplicative factor (just as the flat spacetime d'alembertian operator $\partial^2$ is $-k^2$ in fourier space). so, if we know that the hessian $\hat H$ is almost well-defined in the conformal gravity case (up to the need for an additional gauge fixing of the conformal symmetry), then the same will be true for the hessian in the six-derivative theory with the condition $\omega_R = 0$ in $d = 4$ spacetime dimensions, although the theory then ceases to be conformal. in conformal gravity in $d = 4$, when $\theta_R = 0$, we have an almost well-defined hessian, because we know that it gives rise to a good, renormalizable theory, at least at the one-loop perturbative level of computations. now, also in the case of six-derivative theories, setting $\omega_R = 0$ does not create any problem for the form of either the $\hat H$ or the $\hat C$ operator; only the final results for the beta functions show $\frac{1}{\omega_R}$ poles, as was manifest in the results of [26]. in turn, in six-derivative theories the limit $\omega_C \to 0$ seems regular, but it is questionable whether we can trust the results of this limit. in the pure $R\Box R$ theory, we expect to get some discontinuous results for the beta functions, not obtainable by the limit $\omega_C \to 0$, since this model is non-renormalizable. in this model there is still the open problem that one cannot make the kinetic operator of fluctuations a minimal 6-derivative one. furthermore, taking the limit $\omega_C \to 0$ of the kinetic operator of the generic case $\omega_C \neq 0$ produces a hessian $\hat H$ that vanishes on flat spacetime. hence it seems that in this case the intermediate steps of the process of computing the divergent part of the effective action are not well-defined, while the final result is amenable to taking the limit $\omega_C \to 0$; but exactly because of this former reason, we should not trust these apparently continuous-looking limits. one should analyze more deeply the form of the part of the kinetic operator $\hat H$ of the theory between graviton fluctuations that is leading in the number of derivatives (and also in the uv regime). the insertions of box operators, like any power or function of the box operator $\Box$, are only immaterial differences between the cases of four- and six-derivative theories here. these operators are only spectators for obtaining the leading part of the hessian, which carries the highest number of derivatives and is considered on flat spacetime, i.e. with the condition $\mathcal{R} = 0$.
using formula (20) with the solutions for the gauge-fixing parameters as in (21), one finds in the generic case $\omega_C \neq 0$ and $\omega_R \neq 0$ that the kinetic operator (leading part of the hessian) is indeed minimal and of the form

$$H^{\mu\nu,\rho\sigma}_{\rm lead} = \frac{\omega_C}{2}\, \Box \left( g^{\mu\rho} g^{\nu\sigma} + g^{\mu\sigma} g^{\nu\rho} \right) - \omega_C\, \frac{\omega_C - 6\omega_R}{4\omega_C - 6\omega_R}\, \Box\, g^{\mu\nu} g^{\rho\sigma}. \qquad (53)$$

in the above formula, one does not see any singularity when $\omega_C$ vanishes (one saw $\omega_C^{-1}$ divergences in the expressions for the $\alpha$ and $\gamma$ parameters in (21)), but in this case the above treatment was not justified. when $\omega_C = 0$, one can solve the system for the gauge-fixing parameters for $\beta$ and $\gamma' = \frac{\gamma}{\alpha}$, assuming that formally $\frac{1}{\alpha} = 0$ and $\frac{1}{\gamma} = 0$, while the limit of the ratio $\frac{\gamma}{\alpha}$ is finite. one then finds that $\beta = 1$ and $\gamma' = -2\omega_R$, and after substitution into the original hessian, one finds that its leading part explicitly vanishes. the same is obtained by plugging the naive limit $\omega_C \to 0$ into (53). one also sees from the explicit solutions in (21), and the resulting general expression for $\gamma'$ (i.e. $\gamma' = \frac{4}{3}\omega_C - 2\omega_R$), that by plugging in $\omega_C = 0$ one finds again $\beta = 1$ and $\gamma' = -2\omega_R$, as derived exactly above. the highest-derivative level of the gravitational action is then completely gauge-fixed. in the opposite case, when $\omega_R = 0$, the leading part of the hessian does not vanish, but it is degenerate, of the form

$$H^{\mu\nu,\rho\sigma}_{\rm lead} = \frac{\omega_C}{2}\, \Box \left( g^{\mu\rho} g^{\nu\sigma} + g^{\mu\sigma} g^{\nu\rho} \right) - \frac{\omega_C}{4}\, \Box\, g^{\mu\nu} g^{\rho\sigma}, \qquad (54)$$

because this operator does not possess a well-defined inverse, precisely in $d = 4$ dimensions. the addition of a new, conformal-like type of gauge fixing here, $\tau\, h \Box^3 h$, with a new (fourth) gauge-fixing parameter $\tau$, where the trace of the metric fluctuations $h = h^\mu_{\ \mu}$ is used, removes the degeneracy provided $\tau \neq 0$ is selected. the kinetic operator then takes the form

$$H^{\mu\nu,\rho\sigma}_{\rm lead} = \frac{\omega_C}{2}\, \Box \left( g^{\mu\rho} g^{\nu\sigma} + g^{\mu\sigma} g^{\nu\rho} \right) + \left( \tau - \frac{\omega_C}{4} \right) \Box\, g^{\mu\nu} g^{\rho\sigma}. \qquad (55)$$

moreover, for any non-zero value of $\tau$ the hessian is still a minimal operator. for $\tau \neq 0$ the inverse exists, and the propagator can be defined around flat spacetime. the only question is whether the final results are $\tau$-independent, since this is a spurious gauge-fixing parameter. the reason for such independence is obvious in the four-derivative case, since there $\tau$ is a gauge-fixing parameter for the conformal symmetry (a conformal gauge-fixing parameter, so in such circumstances this is a symmetry argument). but in the case of the six-derivative model in $d = 4$, the reasoning with conformal symmetry is not adequate, since this model is not conformal anymore. only the explicit computation may show that the $\tau$ parameter drops out of the final results, as it should for them to be physical and independent of the $\tau$ gauge choice. (a finite-dimensional illustration of the degeneracy of (54), and of how $\tau \neq 0$ lifts it, is sketched below.)
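the following small numeric sketch (ours, with numpy; $\omega_C$ and $\tau$ values are arbitrary illustrative numbers) checks that the tensor structure of (54) annihilates the trace mode precisely in $d = 4$, and that the conformal-like gauge fixing of (55) removes this zero mode.

```python
# numeric check (numpy; our illustration) of the degeneracy of (54) in d = 4:
# acting on the pure trace fluctuation h_{mu nu} ~ g_{mu nu} (box factors stripped).
import numpy as np

d = 4
g = np.eye(d)
X = g.copy()                               # pure trace mode
omega_C, tau = 1.0, 0.3                    # arbitrary nonzero omega_C; hypothetical tau

# structure of (54): omega_C * X - (omega_C/4) * tr(X) * g
img_54 = omega_C*X - (omega_C/4)*np.trace(X)*g
print(np.allclose(img_54, 0))              # True: zero mode, no inverse in d = 4

# structure of (55): omega_C * X + (tau - omega_C/4) * tr(X) * g  ->  4*tau*g
img_55 = omega_C*X + (tau - omega_C/4)*np.trace(X)*g
print(np.allclose(img_55, 0))              # False: tau != 0 lifts the degeneracy
```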
in the four-derivative gravitational theory, one sees the dependence on the ratio $x_{4\text{-der}}$ only in the coefficient of the $R^2$ counterterm. this dependence has the general schematic form $a_{-2} x_{4\text{-der}}^{-2} + a_{-1} x_{4\text{-der}}^{-1} + a_0 x_{4\text{-der}}^{0}$, as in (49) and in (34). we remark that for the other counterterms (namely $C^2$ and $GB$ in this weyl basis), the coefficients of the uv divergences are numbers completely independent of $x_{4\text{-der}}$. one could try to explain this quadratic dependence on the inverse ratio $x_{4\text{-der}}^{-1}$ in front of the $R^2$ counterterm in a spirit similar to the argumentation presented in [44], where we counted the active degrees of freedom contributing to the corresponding beta functions of the theory. it is well known, from the examples of the beta functions in qed coupled to some charged matter and in yang-mills theory, that the one-loop beta function expresses a weighted counting of the degrees of freedom and their charges in interactions with the gauge bosons in question (minimal couplings in three-leg vertices suffice here due to the gauge symmetry). similar counting could be attempted here, but in gravity, and especially in hd gravity, there are plenty of other gravitational degrees of freedom, so it is quite a difficult task to enumerate all of them and the strengths of their interactions in cubic vertices with the background gravitational potentials. therefore, this task of explaining the $x$-dependence and the numbers present in the expressions for all the beta functions, in both four- and six-derivative theories, seems too ambitious for now, and we leave it for future considerations. instead, we comment briefly on the general dependence on the $x_{4\text{-der}}$ ratio in the four-derivative theory and compare it with the six-derivative theory. in the case $N = 1$ (the six-derivative gravitational theory), it was found as a main result in [26] that the dependence on $x$ appears only in front of the $C^2$ counterterm, and this is a linear dependence, $b_1 x^1 + b_0 x^0$, as in (51), with non-negative powers of the ratio $x$. the other counterterms, $R^2$ and $GB$, carry constant coefficients (only $b_0$ terms present; cf. (33)). if a basis other than the weyl basis is used to write the counterterms, then the $x$-dependence is linear in the coefficients in front of each basis term (as in the basis with $R^2$, $R^2_{\mu\nu}$ and $R^2_{\mu\nu\rho\sigma}$ terms). these explicit dissimilarities between the $N = 0$ and $N = 1$ models certainly require deeper investigation. it is also interesting to analyze the special value of the fundamental ratio $x$ of the six-derivative gravitational theory which makes the $C^2$ sector of uv divergences completely finite. this value is exactly $x = -\frac{3573}{80} = -44.6625$; the $R^2$ sector of uv divergences cannot be made finite this way. we recall for comparison that in the case of quadratic gravity with four derivatives in $d = 4$, the special values of $x_{4\text{-der}}$ which, in contrast, made the $R^2$ sector uv-finite were two in number, namely $x_{4\text{-der}} = 3(3 \pm \sqrt{7})$ (with the numerical approximations $x_{4\text{-der},-} \approx 1.0627$ and $x_{4\text{-der},+} \approx 16.937$), as solutions of a non-degenerate quadratic algebraic equation. again, contrary to the previous case with six derivatives, the divergences in the $C^2$ sector cannot be made to vanish there. (a quick numerical cross-check of these special values is given below.) now, we discuss the differences between the two extreme cases $\omega_C = 0$ and $\omega_R = 0$. in the six-derivative model, or when we have even more derivatives, these two couplings and their roles in the computation of uv divergences may superficially look symmetric. this is, however, not true, due to the different impact of these two conditions on the form of the kinetic operator $\hat H$: in the case $\omega_R = 0$ the hessian operator still exists, while for $\omega_C = 0$ we lose its form. this observation has profound implications, as we explain below. first, it is a fact that both these conditions lead to badly non-renormalizable theories, in which the flat spacetime propagator cannot simply be defined. moreover, if $N > 0$, in neither of these two reduced models do we have an enhancement of symmetries, and neither of them has anything to do with conformal gravity models, which are present only in the case $N = 0$ and $\omega_R = 0$, despite the fact that in the construction of these six-derivative models we might use only terms with the weyl tensor. (however, here we use the term $C\Box C$, and it is known that the gr-covariant box operator $\Box$ is not conformal.)
however, note that in the construction just mentioned we use the term $C\Box C$, where it is known that the gr-covariant box operator $\Box$ is not conformal. our explanation of the x-dependence is as follows. we start with the generic model with n > 0, since it is the n = 0 scale-invariant gravitational model that is the exceptional one here. for the six-derivative theory (or any theory with n > 0) the two reduced models defined by the conditions $\omega_C = 0$ or $\omega_R = 0$ respectively are not renormalizable, and likely even at the one-loop level higher types of divergences (besides $C^2$ and $R^2$ from (1)) will be generated. from this we expect that there must be some problems with the uv divergences obtained here from naive power counting arguments. the problems must show up somehow, either in the final numerical values of the divergent terms or in the intermediate steps of the computation of these divergences. such problems then signal that we are working with a non-renormalizable theory, which does not have good control over the perturbative divergences appearing at the one-loop level. first, in the case $\omega_C = 0$, the problems arise already in the definition of the kinetic operator (hessian) between quantum metric fluctuation fields. this implies that any further processing with this operator is ill-defined; we cannot trust it, and even if it gave us some final results for divergences, they would not be reliable at all, since the theory is non-renormalizable. but we have already found here an instance of the problem, which makes our final limiting results (in the $\omega_C \to 0$ limit) untrustworthy. this means that in the expression in (51) we do not expect any additional obstacles, such as $1/\omega_C$ poles, since the price of non-renormalizability has already been paid: we have already met the dangerous problems which signal the incorrectness of the naive limit $\omega_C \to 0$. this should already take away our trust in the $\omega_C \to 0$ limit of the expressions for uv divergences in (33). this line of thought in the case $\omega_C = 0$ therefore puts no constraints at all on the final form of the x-dependence in (51), since results like those in (51) will likely be incorrect in the $\omega_C \to 0$ limit anyway. now, in the other case, with n > 0 and $\omega_R = 0$, we do not have the problem with the definition of the hessian $\hat H$. formally, we can process it to the very end, taking the functional trace of the logarithm and adding the contributions from $\operatorname{tr}\ln\hat M$ and $\operatorname{tr}\ln\hat C$. but somewhere we must find an occurrence of the problem, because the theory is non-renormalizable! so the only place where the problem may sit is in the final x-dependence of the results for uv divergences. these results should be ill-defined when the limit $\omega_R \to 0$ is attempted. this implies that they must contain poles in the $\omega_R$ coefficient, and hence positive powers of the x ratio of the theory. we conclude that the x-dependence must be linear or quadratic, but always with positive powers of the ratio x. this is indeed confirmed by the explicit form of the uv divergences of the six-derivative theory in (33). the problems with renormalizability of the pure $C\Box C$ theory show up at the last possible moment of the procedure for obtaining the result, namely when one wants to take the limit $\omega_R \to 0$, or equivalently $x \to \infty$. this is the generic situation for any super-renormalizable theory and for any n > 0.
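the correspondence between $\omega_R$ poles and positive powers of x can be made explicit (a schematic restatement of ours, assuming the fundamental ratio is $x = \omega_C/\omega_R$, which is consistent with the limits $x \to \infty$ for $\omega_R \to 0$ and $x \to 0$ for $\omega_C \to 0$ used throughout):
$$x = \frac{\omega_C}{\omega_R} \quad\Longrightarrow\quad \frac{1}{\omega_R^{\,k}} = \frac{x^{\,k}}{\omega_C^{\,k}}, \qquad k = 1, 2,$$
so a divergence that blows up as $\omega_R \to 0$ appears, at fixed $\omega_C$, precisely as a positive power $x^k$ in the beta functions, while regularity at $\omega_C \to 0$ would forbid any $x^{-k}$ terms.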
there are still some mysterious things here, like why the dependence is only linear in x, and why it occurs only for the $C^2$ type of uv counterterm, while the two other counterterms $R^2$ and $GB$ come with numbers completely independent of x. right now we cannot provide satisfactory mathematical explanations for these facts. using this argumentation in theory models with n = 1, we get an explanation for the x-dependence in the formula for uv divergences in (51). the logical chain of the explanation is as follows. firstly, in the pure theory $C\Box C$, one concludes that the problem of non-renormalizability shows up only in the final results, as the impossibility of taking the limit $\omega_R \to 0$ (or equivalently $x \to \infty$) of formula (51) for the divergences of the model. hence the dependence in formula (51) must be on non-negative powers of the ratio x, as is clearly confirmed by explicit inspection of this formula. this settles the issue of the structure of the exact beta functions for n = 1 models (and also for the higher $n \geqslant 1$ cases). now, the same formula is the starting point for an attempt to take the other limit, $x \to 0$, of the also non-renormalizable model of the type $R\Box R$. but in that model we have already found a source of the problem caused by non-renormalizability earlier: it is connected with the impossibility of properly defining a non-degenerate hessian operator of the model. this limiting case $x \to 0$ must nevertheless follow the same structure as already established in formula (51); theoretically speaking, there is simply no need to see further instances of problems due to non-renormalizability in the $R\Box R$ model. hence the first explanation, based on the $C\Box C$ model, is sufficient, and the results in the $R\Box R$ model must be consistent with it. moreover, from the analysis of the case $x \to \infty$ alone, we have deduced the structure in the generic renormalizable case, when both $\omega_R \neq 0$ and $\omega_C \neq 0$ (so $x \neq 0$ and $x \neq \infty$). this structure is beautifully confirmed by formula (51), or (33), explicitly for the generic case. as emphasized above, it is in turn the n = 0 case which is extraordinary, and it changes the pattern of the $x_{4\text{-}der}$-dependences described above. this can all be traced back to the fact that for n = 0 we have the possibility of reducing the generic hd scale-invariant model to the conformal one, where the full conformal symmetry is preserved (at least on the classical level of the theory). this happens when one takes the isolated case $\theta_R = 0$ and $\theta_C \neq 0$ of the four-derivative theory (for positive-definiteness we may also assume $\theta_C > 0$). this case is discontinuous and cannot be obtained as the naive limit $x_{4\text{-}der} \to \infty$ of formula (49) for the $R^2$ type of uv divergences, which would leave us effectively only with the $a_0$ coefficient. it is well known that the conformal gravity model is renormalizable (at least at the one-loop level), contrary to the $C\Box C$ theory discussed above. this means that we should not find any source of problems when computing and obtaining results for the uv divergences of this $C^2$ model. we find no problems with the hessian or the propagator, provided we also gauge-fix the conformal local symmetry of weyl conformal gravity. nor should we find problems with the final expression of the uv divergences, so we do not expect poles in the $\theta_R$ coefficient there.
but some $x_{4\text{-}der}$-dependence, up to quadratic order, could be present (this is due to the one-loop character of the computation here; one can understand this easily from the contributing feynman diagrams). we then conclude that this dependence may be only in positive powers of the inverse ratio, namely of $x_{4\text{-}der}^{-1} = \theta_R/\theta_C$. this is again confirmed in formulas (49) and (34), where we indeed find a quadratic dependence, but in the inverse ratio $x_{4\text{-}der}^{-1}$. simply put, the final results for the generic case $x_{4\text{-}der} \neq 0$ and $x_{4\text{-}der} \neq \infty$ cannot depend on positive powers of $x_{4\text{-}der}$, since then the limit of conformal gravity in d = 4 (i.e. $x_{4\text{-}der} \to \infty$) would produce divergent results, but we know that weyl gravity is renormalizable with good control over one-loop uv divergences. however, this does not mean that the results for conformal gravity are continuous and obtainable from the generic ones in (34) by taking the limit $x_{4\text{-}der} \to \infty$ there. we admit that the coefficients there may show some finite discontinuities; however, both in the true and in the naive $x_{4\text{-}der} \to \infty$ limiting forms they must be finite – we exclude only the case in which they would diverge in the $x_{4\text{-}der} \to \infty$ limit. in this way, the results for uv divergences in renormalizable 4-dimensional conformal gravity may be expressed via finite numbers multiplying just one common overall divergence (like the $1/\epsilon$ parameter in the dimensional regularization (dimreg) scheme). the theory is renormalizable, and there are no new divergences inside the coefficients of the established form of uv divergences of generic hd stelle theory in d = 4 spacetime dimensions, as in (1). the significant difference between the cases n = 0 and $n \geqslant 1$ is that in the former case the theory with $\theta_R = 0$ is conformal at the classical tree level as well as at the first quantum loop, since we know that weyl conformal quantum gravity is one-loop renormalizable. this is why the pattern of the x-dependence in these two cases is diametrically different. in both cases, n = 0 and $n \neq 0$, one can derive the general structure of the beta functions of the generic hd theory with any finite value of the fundamental ratio x ($x \neq 0$ and $x \neq \infty$) by just analyzing the limit $x \to \infty$ (or respectively $x_{4\text{-}der} \to \infty$) and the divergences which should or should not appear there, for the cases $n \neq 0$ or n = 0 respectively. the inverse quadratic dependence on the ratio $x_{4\text{-}der}$ in the case of the four-derivative stelle theory can be easily understood as well. it is up to quadratic order, and the same dependence would be expected in the case of the six-derivative gravitational theory in the $C^2$ sector of uv divergences. there, however, as a surprise, we find only up to linear dependence on the fundamental ratio x, and only in the one distinguished sector of $C^2$ divergences. in general, we can have up to quadratic dependence on x in six-derivative models, or on $x_{4\text{-}der}^{-1}$ in the stelle gravity case, in d = 4 spacetime dimensions. the uv divergences of renormalizable hd gravity models in d = 4 spacetime dimensions are all at most quadratic in the general gravitational curvature (schematically they are $R^2$). hence they can all be read from the one-loop perturbative quantum corrections to the two-point graviton green function, or equivalently from the quantum dressed graviton propagator around a flat spacetime background. we recall that here there is no quantum divergent renormalization of the graviton wave function.
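before turning to the diagrammatic counting, let us juxtapose the two structures schematically (our own summary of the forms quoted from (49) and (51)):
$$n = 0:\quad \beta_{R^2} \sim a_{-2}\,x_{4\text{-}der}^{-2} + a_{-1}\,x_{4\text{-}der}^{-1} + a_0, \qquad\qquad n = 1:\quad \beta_{C^2} \sim b_1\,x + b_0,$$
so the n = 0 structure stays finite as $x_{4\text{-}der} \to \infty$ (the conformal gravity direction), while the n = 1 structure stays finite as $x \to 0$ and blows up in the non-renormalizable direction $x \to \infty$.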
moreover, higher orders in graviton fields (appearing in interaction vertices) are completely determined here by the gauge invariance (diffeomorphism symmetry) present in any qg model, so below we can concentrate only on these two-point green functions. as is known from diagrammatics, the contributing feynman diagrams at the one-loop order, for corrections to the two-point function, may have either one propagator (the topology of a bubble attached to the line) or two propagators (sunset diagrams). in the most difficult case there are two perturbative propagators. since in our higher-derivative theory we have two leading terms shaping the uv form of the graviton propagator, namely the terms $\omega_C\, C\Box C$ and $\omega_R\, R\Box R$, the corresponding propagator may come with the front coefficient $\omega_C^{-1}$ or $\omega_R^{-1}$ respectively as the leading term. to change between the two expansions (in $\omega_C$ or in $\omega_R$) one needs to use one power of the ratio x. since we have two such propagators in the one-loop diagrams considered here, the dependence is up to quadratic power in x. sometimes we need to change back from $\omega_C$ to $\omega_R$ as the leading coefficient of the tree-level propagator, and then we need to multiply by inverse powers of the ratio x. the quadratic dependence is what we can have here in the most complicated case, which is actually realized in generic stelle theory with both $\theta_C \neq 0$ and $\theta_R \neq 0$. (the argumentation above can be repeated very similarly for quadratic gravity in d = 4, forgetting about one power of the box operator $\Box$ and changing the omega coefficients to the corresponding theta coefficients and x to $x_{4\text{-}der}$.) apparently, in the case of six-derivative gravitational theories there is some cancellation, for the moment unexplained, and we see only the dependence up to the first power of the ratio x of that theory. one should acknowledge here the speciality of the d = 4 case and of the one-loop type of computation. at higher loop orders, higher powers of the x ratio may appear in the final expressions for the uv divergences of the theory. similarly, if one goes to higher-dimensional qg models, then even in renormalizable models at the one-loop level one needs to compute higher n-point green functions. this is because in even dimension d, a renormalizable theory requires the renormalization not only of terms of the type $R\Box^{(d-4)/2}R$, but also of others with more curvatures (and correspondingly fewer powers of covariant derivatives), down to the term of the type $R^{d/2}$, where no covariant derivatives act on curvatures at all. in between, the general terms can be schematically parametrized as $\nabla^{d-2i}R^i$ for $i = 2, \ldots, d/2$ – all these terms have energy dimensionality equal to the dimensionality of spacetime d. for the last term, of the type $R^{d/2}$, one needs to look at the quantum dressed $n = d/2$-point function at the one-loop order. in conclusion, in higher dimensions one should consider not only two-point functions with one-loop diagrams of the two topologies described above, but quantum dressed functions with up to $d/2$ points. and even at the one-loop perturbative level these additional diagrams may have more complicated topologies, meaning more vertices and more propagators, and this means that the powers of the ratio x, or of $x^{-1}$ respectively, will be higher and higher.
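schematically, the counting works as follows (a sketch of ours, suppressing tensor structures and subleading terms):
$$D(k) \;\sim\; \frac{1}{\omega_C\,k^6}\,\big(\text{spin-2 part}\big) \;+\; \frac{1}{\omega_R\,k^6}\,\big(\text{spin-0 part}\big), \qquad \frac{1}{\omega_R} = \frac{x}{\omega_C},$$
so normalizing every propagator to the $\omega_C^{-1}$ expansion costs at most one power of x per propagator, and a one-loop two-point diagram with two propagators (the sunset topology) can carry at most $x^{\pm 2}$.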
in higher dimensions these powers are expected to be bounded from above by the maximal exponent $d/2$ – this can be derived from the expression for the quantum dressed $d/2$-point function, which is built from exactly $d/2$ propagators joining precisely $d/2$ identical perturbative 3-leg vertices. the topology of such a diagram is that of a main one-loop ring with $d/2$ external legs attached to it, each one separately and each one emanating from a single 3-leg vertex. again, the situation at one loop and in d = 4 is quite special and simple, since the ratio x appears here only up to the maximal exponent $d/2 = 2$. as a side result, one also sees that the situation in the four-derivative model with the condition $\theta_C = 0$ is somehow “doubly” bad. first, the hessian is not well-defined to start with, and this takes away our trust in this type of computation. moreover, if we attempted to take the limit $\theta_C \to 0$ (or equivalently $x_{4\text{-}der} \to 0$) in a final result like (49), we would face a second problem, since this limit gives infinite results. this means that the problem with the perturbative and multiplicative renormalizability of such a model is confirmed twice over. it is not that the two instances of the problem support each other – they appear independently, are not related, and do not cancel out. above, we have seen that in the six-derivative (or general n > 0) case they can occur completely independently, in two completely different types of non-renormalizable theories (with the conditions $\omega_R = 0$ or $\omega_C = 0$ respectively). here, we see that since conformal gravity at one loop must be free of any problem of this type (no problem with the hessian and no infinite results in the limit $x_{4\text{-}der} \to \infty$), the occurrence of both problems at the same time must happen in the badly non-renormalizable model with the condition $\theta_C = 0$. in other words, since conformal gravity is a safe model, the model with $\theta_C = 0$ must suffer twice, as both problems must inevitably appear in one model or the other if the extreme special cases $\theta_R = 0$ or $\theta_C = 0$ are considered. again, we remark that in the generic quadratic gravity model we see up to quadratic dependence on the inverse ratio $x_{4\text{-}der}^{-1}$, but the precise location where this dependence shows up is still not amenable to an easy explanation. we do not know why this happens only in the $R^2$ sector, while the $C^2$ and $GB$ sectors are free from any $x_{4\text{-}der}$-dependence. but at least the dependence on the inverse ratio $x_{4\text{-}der}^{-1}$, rather than on its original form $x_{4\text{-}der} = \theta_C/\theta_R$, in the exceptional case n = 0 can be explained by the miraculous one-loop perturbative renormalizability of the conformal gravity model in d = 4.

4.3. case of conformal gravity

here we continue the discussion of related issues, but now in the framework of conformal gravity, i.e. within the model with the reduced hd action with n = 0 and formally with $\theta_R = 0$. there are various motivations for conformal gravity in d = 4 spacetime dimensions [13, 14]. as is well known, the reason for the multiplicative renormalizability of this reduced model, when $\theta_R = 0$ from the beginning, is the presence of conformality – conformal symmetry both at the tree level and at the level of the first loop. unfortunately, the story of conformal gravity in d = 4 is even more complicated than what we argued above.
first, already at the one-loop level one discovers the presence of the conformal anomaly, which is typically thought of as not so harmful at the first loop level. however, it heralds the imminent breaking of conformal symmetry, for example via the appearance of the $R^2$ counterterm at the two-loop level. such a term as a counterterm is not fully invariant under local conformal transformations – it is invariant only under so-called restricted conformal transformations, that is, transformations whose parameters satisfy the source-free background gr-covariant d'alembert equation ($\Box\omega = 0$) on a general spacetime. hence the $R^2$ term is still scale-invariant, but it breaks the full conformal symmetry of quantum conformal gravity. it seems the only way out of this conformal anomaly problem is to couple conformal gravity to a specific matter sector chosen to cancel the anomaly. this is, for example, done in $\mathcal{N} = 4$ conformal supergravity coupled to two copies of $\mathcal{N} = 4$ super-yang-mills theory, first proposed by fradkin and tseytlin [49]. in such a coupled supergravity model, we have vanishing beta functions, implying complete uv-finiteness and conformality present also at the quantum level. this is conformal symmetry in the local version (not a rigid one), with weyl conformal transformations in the gravitational setup and on the quantum field theory level. if not in the framework of $\mathcal{N} = 4$ fradkin-tseytlin supergravity, the conformal anomaly of local conformal symmetry signals breaking of conformal symmetry, while scale-invariance (its global part) may still remain intact. in the long run, besides the presence of the non-conformal $R^2$ counterterm, this breaking puts the conformal ward identities in question, and with them the constraining power of quantum conformality. it will no longer constrain the detailed form of the gravitational correlation functions of the quantum theory. the conformal symmetry will not be there, and it will not protect the spectrum from the emergence of spurious ghost states. this last point endangers the perturbative unitarity of the theory (and we do not even speak here about the danger of unitarity breaking due to the hd nature of conformal gravity). without the power of quantum conformal symmetry, we may have unwanted states in the spectrum, corresponding to the states of generic stelle gravity rather than to the tree-level spectrum of conformal gravity, so we may see a mismatch in the counting of degrees of freedom and also in their characters, namely whether they are spin-1 or spin-0, ghosts or healthy particles, etc. moreover, in pure conformal gravity described simply by the action $C^2$, without any supergravitational extension, we notice a nomenclature problem, of sorts, with the presence of quantum conformality. even barring the issue of the conformal anomaly, the general pure gravitational theory has non-vanishing beta functions, so there is no uv-finiteness there. this implies that there is rg running and scale-dependence of couplings and of various correlators on the renormalization energy scale. hence already at the one-loop level one could say that scale-invariance is broken, which implies a violation of conformal symmetry too. however, one can live with this semantic difference provided there are no disastrous consequences of the conformal anomaly.
one can adopt the point of view that the theory at the one-loop level is still good provided the uv-divergent action is conformally invariant too, that is, when one has only conformally invariant uv counterterms. (although, strictly speaking, having them implies non-vanishing beta functions, rg running, and the loss of uv-finiteness and of scale-invariance.) in our case the conformally invariant counterterms are only of the types $C^2$ and $GB$, so if the $R^2$ counterterm is not present at the one-loop level, then we can speak of preserved quantum conformality in this second sense. it happens that this is exactly the situation we meet for quantum conformal gravity in d = 4 at the one-loop level. in order to see the quantum conformality of one-loop conformal gravity in d = 4, described by the $C^2$ action, one may first naively try to take the limit $x \to +\infty$ of the expression for the $R^2$ sector of uv divergences in formula (34). one would end up with just a constant $a_0$, which is generally not zero. the whole story is again more subtle, since the limits in this case are again not continuous, although, as we advocated above, they are luckily also not divergent when we want to send $\theta_R \to 0$. in the end, we have only a finite discrepancy in numbers, which can be easily explained. as emphasized above, in this special reduced model we have an enhancement of symmetries, and this new emergent conformal symmetry in its local version must be gauge-fixed too. this means that the kinetic operator needs to be modified and a new conformal gauge-fixing functional must be added to it for the consistency of the generalized faddeev-popov quantization prescription for theories with local gauge symmetries. we then also have a new conformal gauge-fixing parameter (the fourth one), which can be suitably adjusted to again provide the minimality of the hessian operator. although, of course, the full details of the covariant quantization procedure for conformal gravity are more delicate and more subtle, here we can take a shortcut and pinpoint the main points of attention. when computing uv divergences using generally covariant methods, like the bv trace technology and functional traces of logarithms of operators, one also necessarily needs to add the contribution of the third conformal ghosts, which are scalars from the point of view of lorentz symmetry but come with anti-commuting statistics. they are needed here because conformal gravity is a natural hd theory, and third ghosts are then necessary for the covariant treatment of any gauge symmetry in local form. it is true that for the conformal local symmetry we do not need fp ghost fields (for the reasons elucidated above), but we do need a new third conformal ghost, which is moreover independent of the third ghost of diffeomorphism symmetry. each symmetry with a local realization comes with its own set of third ghosts when the theory is with higher derivatives. it is also well known that classically conformal fields (like the massless gauge fields of electromagnetism and of yang-mills theory) contribute to divergences only through conformally invariant counterterms, so only of the types $C^2$ or $GB$. this can be understood easily as a kind of conformal version of the dewitt-utiyama argument used before. hence, if the scalars of the anti-commuting type that we have to subtract were conformally coupled, they would not contribute anything to the $R^2$ type of counterterm.
but we see from formula (34) that the $a_0$ coefficient there is non-zero, so only this one survives after the limit $x \to \infty$ is taken. cancelling the $R^2$ counterterm is crucial for the hypothesized conformal invariance of conformal gravity also at the first quantum loop level. and this must be done by explicitly non-conformal fields with non-conformal contributions to divergences. they cannot be massless gauge fields, but they can be minimally (so not conformally) coupled scalar fields. here, for the consistency of the whole formalism of the fp covariant quantization of conformal gravity, this role is played by the one real conformal third ghost with the kinetic operator $\Box^2$. the contribution of the third conformal ghosts is what we actually need to complete the whole computation of the uv divergences in the conformal gravity model. we need them for the overall consistency, since in this covariant framework we cannot check a posteriori the presence of all gauge invariances. here we assume that at the first quantum loop level the conformal gravity model fully enjoys diffeomorphism as well as conformal symmetry. the terms given in the covariant bv framework of computation all satisfy these requirements; we must only be careful to take all these contributions into account. the contribution of the third conformal ghosts is like that of two real scalars coupled minimally (but, to one's surprise, not conformally) to the background gravitational metric, though of the ghost type. this means that we have to subtract the contribution of two scalars, which is of course uv-divergent, but after extracting the overall divergence there remains only a finite number. this is the number which, when subtracted, matches the number obtained after the naive limit $x \to \infty$ of the generic results from (49). we explain that we need to subtract two real scalars, each coming with the standard two-derivative gr-covariant box operator as the kinetic operator, since in hd conformal gravity the operator between third conformal ghosts is of the $\Box^2$ type, as for the four-derivative theory. the limit to conformal gravity is discontinuous, but only in the sense that one also has to take out the contribution of real scalar fields minimally coupled to gravitation. the first part of the limiting procedure, namely $x \to \infty$, is only a partial step; to complete the whole limiting procedure one must also deal properly with conformal symmetry. this applies not only to the coefficient in front of the $R^2$ term, where we see the mysterious but explainable x-dependence, but also to the other coefficients, in front of terms like $C^2$ and $GB$. of course, for the last two terms the limit $x \to \infty$ does not change anything, but the contribution of the third conformal ghosts makes an impact and changes the numerical results, which are luckily still finite in conformal gravity. the coefficients in front of the $R^2$ and $GB$ counterterms are also finite in generic four-derivative gravity (cf. (34)); however, by these types of arguments with conformal gravity we cannot at present understand why the x-dependence happens only in front of the $R^2$ counterterm. of course, in the conformal gravity model there is no x-dependence at all. at the end, when one accounts for all these numerical contributions, one indeed finds that at the one-loop level in conformal gravity the coefficient of the $R^2$ term vanishes, so quantum conformality is present in the second sense.
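parenthetically, the counting of the two subtracted scalars can be made plausible by a formal one-line identity (our schematic restatement; the rigorous treatment requires the full heat-kernel computation):
$$\operatorname{tr}\ln\Box^2 = 2\,\operatorname{tr}\ln\Box,$$
so the single real conformal third ghost with the four-derivative kinetic operator $\Box^2$ contributes to the divergences like two real scalars, each with the standard two-derivative minimally coupled box operator.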
thus we have only conformally invariant counterterms in pure conformal gravity without any conformal matter, though there is still an interesting rg flow of couplings there. this also signifies that there is a conformally invariant but non-trivial divergent part of the effective action, with finite numerical constant coefficients once the overall divergence is extracted. these finite coefficients arise in a two-step process: first as the limit $x \to \infty$ of a generic hd gravity, and then by subtraction of the uv-divergent contributions of two real scalars minimally coupled to the gravitational field. since these last contributions are known to be finite numbers multiplying the overall uv divergence, this implies that the limit $x \to \infty$ of the generic expression in (49) must also give finite numbers. this explains why in the perturbatively one-loop renormalizable model of conformal gravity in d = 4 there are standard uv divergences, although this is a reduced model with $\theta_R = 0$ and the n = 0 case. so the x-dependence in (34) must be as emphasized above, that is, with inverse powers of the fundamental ratio x of the theory, in accord with what was schematically displayed in formula (49). hopefully the dissimilarities between the cases n = 0 and n > 0 are now clearer. in short, we think that the only sensible reason why we see completely different behaviour when going from the n = 0 to the n = 1 class of theories is that the theory with $\omega_R = 0$ and n = 1 ceases to be conformally invariant in d = 4. in a different vein, the degeneracy of the kinetic operator $\hat H$ in the $\omega_C = 0$ cases, for both n = 0 and n = 1, remains always the same. this proves again and again that the case of conformal gravity is very special among all hd theories, and in d = 4 among all theories quadratic in gravitational curvatures. one can also study the phenomenological applications of the weyl conformal gravity models to the evaporation process of black holes [50, 51], and use the technology of rg flows (and functional rg flows) in the quantum model of conformal gravity to derive some interesting consequences for cosmology (for example, for the presence of the dark components of the universe [52–54]). finally, the conformal symmetry realized fully on the classical level (and, as we have seen, also at the first loop level and perhaps beyond) is instrumental in solving the issue of spacetime singularities [34, 55, 56], which are otherwise ubiquitous problems in any other model of generally covariant gravitational physics (both on the classical and the quantum level). to resolve all singularities one must be sure that conformality (weyl symmetry) is present also on the full quantum level (and is not anomalous there), so that the resolution of singularities achieved on the classical level (by some compensating conformal transformations) is not immediately destroyed by dangerous non-conformal quantum fluctuations and corrections.

4.4. more on limiting cases

here we again analyze more closely the situation with various limits, in which some coefficients in the gravitational action (45) disappear. in a generic six-derivative theory, the traces of the logarithm of the fp ghost kinetic operator $\hat M$ and of the standard minimal $\hat C$ matrix are regular in the limit $\omega_R \to 0$, but not in the limit $\omega_C \to 0$. for the $\hat C$ matrix this is understandable, because the $\gamma$ parameter contains a factor $\omega_C^{-1}$ in the minimal gauge.
however, for the fp ghost kinetic operator $\hat M$ this dependence was unexpected, because in the explicit definition of the $\hat M$ operator there was never any singularity in $\omega_C$. moreover, this singularity is even quadratic in the $\omega_C$ coefficient. we also emphasize that in the general six-derivative theory the trace of the logarithm of the hessian operator $\hat H$ is irregular in both limits, $\omega_R \to 0$ and $\omega_C \to 0$, separately. it appears that in the total sum of contributions to the beta functions of the theory the singularity in $\omega_C$ cancels completely between $\operatorname{tr}\ln\hat H$, $\operatorname{tr}\ln\hat C$, and $\operatorname{tr}\ln\hat M$, while the poles in $\omega_R$ remain, and this is what is seen as the dependence of the final results on non-negative powers of the fundamental ratio x. as for the definition of the $\hat H$ operator, nothing bad is seen if the limit $\omega_R \to 0$ is taken; this may be a partial surprise. of course, when the limit $\omega_C \to 0$ is taken, this operator does vanish on flat spacetime, so its degeneracy is then clearly visible. the situation with the limits ($\theta_R \to 0$ or $\theta_C \to 0$) in the four-derivative theory is as follows. the functional trace $\operatorname{tr}\ln\hat C$ is regular in both limits; it actually does not depend on any gauge-fixing parameters here, despite the fact that in its formal definition we used the $\gamma$ parameter, which shows the $1/\theta_C$ singularity. the situation with the fp operator $\hat M$ is the same as before, because the operator is identical to that of the six-derivative theory. the operator $\hat H$ shows a problem with its definition only when the limit $\theta_C \to 0$ is considered. the same is true for the trace of its functional logarithm, which shows a singularity in the $\theta_C$ coupling coefficient up to quadratic order. in this case, in the total sum of all contributions we see only the $1/\theta_C$ singularity, up to quadratic order. however, here the limit $\theta_R \to 0$ is not continuous either, because for $\theta_R = 0$ the theory reaches a critical point in theory space with enhanced symmetry (conformal enhancement of local symmetries), as explained in subsection 4.3. let us also comment on what is special in the computation of uv divergences for the quadratic theory, from the perspective of the problems we initially encountered in the same computation for the six-derivative theory. first, we established, in the intermediate steps of our computation for the results published in [26], that in the traces $\operatorname{tr}\ln\hat H$ and $\operatorname{tr}\ln\hat C$ in stelle gravity there are no dangerous $1/y = 1/(2\omega_C - 3\omega_R)$ poles (cf. [42]). the cancellations happen separately within each trace. second, we found that the trace $\operatorname{tr}\ln\hat C$ surprisingly does not depend at all on the gauge-fixing parameter $\gamma$, which was needed and used in the initial definition of the $\hat C$ operator in (15). finally, one can notice that the addition of the gauss-bonnet term in d = 4 spacetime dimensions does not change anything for the $R^2$, $R^2_{\mu\nu}$, and $R^2_{\mu\nu\rho\sigma}$ divergences (as expected), because its variation is a topological term in d = 4. in this last part, we use a schematic notation for the various gravitational theories, in which, for simplicity, we do not write the coupling coefficients in front of the various terms, since they are not essential for the considerations here. in the case of six-derivative theories, it is impossible to obtain the results for the cases with $\omega_C = 0$ or $\omega_R = 0$ by any limiting procedure applied to the corresponding results obtained for the general six-derivative theory with $\omega_R \neq 0$ and $\omega_C \neq 0$.
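for orientation, the regularity pattern described above in the six-derivative case can be collected schematically (our own summary of the statements in this subsection):
$$\begin{aligned}
&\operatorname{tr}\ln\hat M,\ \operatorname{tr}\ln\hat C: && \text{regular as } \omega_R \to 0, \quad \text{singular (up to } \omega_C^{-2}\text{) as } \omega_C \to 0,\\
&\operatorname{tr}\ln\hat H: && \text{singular in both limits},\\
&\text{total sum}: && \omega_C \text{ poles cancel}, \quad \omega_R \text{ poles survive as non-negative powers of } x.
\end{aligned}$$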
these reduced theories (with $\omega_C = 0$ or $\omega_R = 0$) have different bilinear parts, with degenerate forms of the kinetic operator, and our calculational methods break down there. similarly, one can calculate the beta functions in a theory with $R^2$ only, and this was done indirectly many times. one can also calculate uv divergences in the $C\Box C + R^2$ theory, or in an analogous $R\Box R + C^2$ theory, but this is actually not easy to do. and we cannot easily extract these results from our general calculation done in the $C\Box C + R\Box R$ six-derivative theory. the simple reason is that all these theories have a different amount and different characteristics of degrees of freedom, and the transition from one to another at the quantum level is complicated (and to some extent unknown). moreover, we remark that the results for beta functions in the models $C\Box C + R^2$ (or $C\Box C + R^2 + C^2$) or $R\Box R + C^2$ (or $R\Box R + C^2 + R^2$) could be obtained, though by computations different from what we have done here. we summarize that a six-derivative gravitational theory, to be renormalizable, must contain both terms of the type $C\Box C$ and $R\Box R$. then the kinetic operator (hessian) between gravitational fluctuations and the graviton propagator are well-defined. in all other models there is no balance between the number of derivatives in the vertices of the theory and in all gauge-invariant pieces of the propagator, so the theory behaves badly with regard to perturbative uv divergences at higher loops. this does not mean that the computation of uv divergences at the one-loop level is forbidden, only that these are usually not all the divergences of the theory, that they may not be the uv-leading ones anymore, or that the theory does not have decent control over all of them. for the strictly non-renormalizable theory with the uv-leading term $C\Box C$, we can have additions of various subleading terms which do not change the fact of non-renormalizability. we can add terms (separately or in conjunction) of the following types: $\omega_\Lambda$ (cosmological constant term), $R$ (e-h term), $R^2$ (starobinsky's term), $C^2$ (weyl square term). the uv-leading part of the hessian is still well-defined, since it contains a six-derivative differential operator understood on flat spacetime and between tensorial fluctuations, derived from the terms quadratic in curvatures. the hessian is non-degenerate. (it has to be non-degenerate here, because the gr-covariant box operator is here only a spectator, and the hessian must be “almost” non-degenerate for the case of conformal gravity with the action $C^2$.) the flat spacetime propagator can be defined only if we add $\omega_\Lambda$, $R$, or $R^2$ terms – this is because of the problematic part of it, proportional to the projector $P^{(0)}$, which must be non-zero for the consistency of the procedure of inverting the whole propagator. this scalar part (spin-0 part) is sourced by any scalar term or by the cosmological constant term. if only the $C^2$ term is added, the propagator remains ill-defined. still, these additions do not change the fact that the theory is non-renormalizable if there is no accompanying six-derivative term of the form $R\Box R$. as for the final results for uv divergences in these extended models, naively one would think that there are no additional uv divergences proportional to terms with four derivatives of the metric (namely the terms $R^2$, $C^2$ and $GB$), because of the limit $\omega_R \to 0$ and the linear dependence on the x ratio in (33).
we would naively think that the divergences with the $R^2$ and $GB$ terms are the same as in (33). the only problematic one could be the one proportional to the $C^2$ term, since the limit already gives divergent results (so “doubly” divergent) – this would mean that the coefficient of the $C^2$ divergence is itself divergent, and renormalization of just $C^2$ does not absorb everything at the one-loop level. since the model is non-renormalizable, we cannot trust this computation and these limits at the end, especially if they give divergent results. but this probably means that we cannot sensibly define the $C^2$ counterterm needed for the uv renormalization of these theories. in a sense, an attempt to add $\omega_\Lambda$, $R$, or $R^2$ terms to regularize the theory $C\Box C + C^2$, or even the simplest one, just $C\Box C$, is unsuccessful, so we perhaps still cannot trust the final results given there by the two $b_0$ coefficients of the uv divergences proportional to the $R^2$ and $GB$ terms, while the $C^2$ divergences are never well-defined in this class of models. instead, in the case of the reduced model with the uv-leading action $R\Box R$, one may keep some hope that the results for the $C^2$ counterterms will be finite at the end, though maybe still discontinuous, despite the non-renormalizability of the model with the $R\Box R$ action (plus possible lower-derivative additions to regularize it, as mentioned above). maybe in this reduced model the projection procedure of the uv-divergent functional of the effective action onto the sector with only $C^2$ terms will succeed in giving sense to pure $C^2$ divergences in this limiting model. (here we may try to resort to some projection procedure for the functional with uv divergences, since in these non-renormalizable models one may expect to find more divergences than just of the form $C^2$ and $R^2$ as presented initially in (1); there could exist new uv divergences containing even more than four derivatives of the background metric tensor, even in the d = 4 case.) but the final finite value may be discontinuous and may not be obtainable by the naive limit $x \to 0$ of the expression for the divergent term in the $C^2$ sector of uv divergences, so it may not be just $b_0$ there. this remark about possible discontinuities may also apply to the coefficients in front of the divergent terms of the types $R^2$ and $GB$. they may still end up with some finite definite values for this model, but these are probably not the same as the $b_0$ coefficients of these terms from (51), so we will probably see here further discontinuities in taking the naive limit $x \to 0$. these conclusions about discontinuities and the negative consequences of the overall non-renormalizability of the two reduced models considered above are also reinforced by the analysis of the power counting of uv divergences. one can try to perform a “worst case scenario” analysis of the one-loop integrals, and the results show a complete lack of control over perturbative uv divergences in such reduced models. this is even worse than in the case of off-shell e-h gravity considered in d = 4 dimensions, which is known to be a one-loop off-shell perturbatively non-renormalizable theory. in the latter case, the superficial degree of divergence $\Delta$ is bounded at the one-loop level (L = 1) via formula (4) by the value 4.
in general, at arbitrary loop order, the power counting formula reads
$$\Delta + d_\partial = 4 + 2(L - 1), \tag{56}$$
so if we concentrate on logarithmic uv divergences only (with $\Delta = 0$), we get that at the one-loop level, for all green functions, we need counterterms with up to $d_\partial = 4$ partial derivatives of the metric tensor. at the two-loop level we instead need to absorb a divergent term with $d_\partial = 6$ partial derivatives, as was famously derived by goroff and sagnotti [17, 18]. the counterterm they found was of the form of the gr-covariant $C^3$ term, and its perturbative coefficient at the two-loop order does not vanish; this implies that the whole uv-divergent term does not vanish even in the on-shell situation. but still, we know which counterterms to expect at a given loop order, and the absorption of uv divergences works for all divergent green functions of the qg model. the situation in the reduced models with terms of the type $C\Box C$ or $R\Box R$ leading in the uv is much worse, even at the one-loop level of naive power counting. one sees that different gr-covariant counterterms are needed to absorb divergences in different divergent green functions of the quantum model at the one-loop level, so the counting does not stop at the two-point function level. we think that, despite these tremendous difficulties, one can still compute the divergent parts of the effective action; the actual computations are very tedious but still possible. this is provided that one can invert the propagator, so that one has non-vanishing parts in both gauge-invariant pieces of it, with the spin-0 and spin-2 projectors. so it is at present practically impossible to do the computation in the pure models with $C\Box C$ or $R\Box R$ only. we know that they give contributions in momentum space proportional to $k^6$ in the spin-2 and spin-0 parts of the propagator respectively, while the other parts are not touched. in order to regularize the theory and give sense to the perturbative propagator around flat spacetime, one has to add the regulator terms mentioned above. let us assume that they give contributions of the form $k^{-m}$ to the other sector of the spin projectors in the graviton propagator, where m is some integer with m < 6; they will likely also give additional subleading contributions to the main respective part of the propagator, the one which already had six derivatives in the uv regime (the spin-2 sector in the $C\Box C$ theory and the spin-0 sector in the $R\Box R$ model). the values of m are respectively: m = 0 for the cosmological constant addition (it still regularizes the propagator, but very, very weakly), m = 2 when the e-h term is added (it contributes both to the spin-0 and spin-2 parts), and m = 4 when either $R^2$ or $C^2$ terms are added (they contribute exclusively in their respective sectors). since even after this regularization of the graviton propagator its behaviour in the uv regime is still very unbalanced between different components, one sees the following results of the analysis of uv divergences at the one-loop order. first, the general gravitational vertex still carries six derivatives, while the propagator scales as $k^{-6}$ in the best (most suppressed) behaviour, and $k^{-m}$ is the worst behaviour of the other components. for the most dangerous situation, we have to assume the worst-case overall behaviour of the propagator, so the uv scaling of the form $k^{-m}$.
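to visualize this unbalanced propagator (a schematic sketch of ours, for the $C\Box C$-leading model, suppressing tensor structures; $c_m$ is our label for the coefficient supplied by the chosen regulator term):
$$\Pi(k) \;\sim\; \frac{P^{(2)}}{\omega_C\,k^6 + \ldots} \;+\; \frac{P^{(0)}}{c_m\,k^{m} + \ldots},$$
with m = 0, 2 or 4 as listed above, and with the roles of $P^{(2)}$ and $P^{(0)}$ interchanged in the $R\Box R$-leading model. with this worst-case $k^{-m}$ scaling in hand, the power counting proceeds as follows.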
the relation between the number of derivatives in a general gravitational vertex and in the propagator is then broken, and this is the reason for the very bad behaviour of uv divergences here. such a relation is typically present even in non-renormalizable models, like e-h gravity. the lack of this relation means that now the result for $d_\partial$ of any feynman graph g depends on the number $n_G$ of external graviton lines emanating from the one-loop diagram. previously, in the analysis of power counting, there was never any dependence on this $n_G$ parameter. this is a source of problems which grow even bigger as $n_G$ increases. for definiteness we can assume $n_G > 1$, since here we are not interested in vacuum or tadpole diagrams and quantum corrections to them. now, for a general diagram g with $n_G$ external graviton lines, the worst situation from the point of view of uv divergences occurs for the following topology of the diagram: one loop of gravitons (a so-called “ring of gravitons”) in the middle, with $n_G$ 3-leg vertices joined by $n_G$ propagators. if we concentrate on logarithmic divergences ($\Delta = 0$), we get the following result for the quantity $d_\partial$, which tells us how many derivatives the corresponding counterterm must have to absorb the divergence:
$$d_\partial = 4 + n_G\,(6 - m) \tag{57}$$
for the graph contributing one-loop quantum corrections to the $n_G$-point green function. one sees that this $d_\partial$ grows without bound, even at the one-loop level, as $n_G$ grows, so in principle, to renormalize the theory at the one-loop level one would already need infinitely many gr-covariant terms, if one does not bound the number of external legs of the green functions considered here. a few words to explain the numbers appearing in formula (57): the 4 is the number of spacetime dimensions (integration over all momentum components at the one-loop level), while the factor (6 − m) comes from the difference between the highest number of derivatives in the vertex, i.e. 6, compensated by the worst behaviour, $k^{-m}$, of some propagator components in the uv. moreover, there are precisely $n_G$ segments of the structure propagator-joined-with-3-leg-vertex creating the big loop. this behaviour signals a complete lack of control over perturbative divergences, even at the one-loop level. moreover, these divergences have to be absorbed into schematic terms of the type
$$\nabla^{\,4 + 4n_G - n_G m + 2i}\, R^{\,n_G - i}, \tag{58}$$
with the index i running over integer values in the range $i = 0, 1, 2, \ldots, n_G - 2$, where we only indicate the total number of covariant derivatives, without specifying how they act on the general gravitational curvatures. this is because this is an expression for the quantum dressed one-loop green function with $n_G$ graviton legs on flat spacetime, so terms with more curvatures than $n_G$ will not contribute to absorbing the divergences of the flat $n_G$-point green function. we mentioned here only the really worst situation, in which the divergence may finally be absorbable not only by the highest-curvature terms of the type $\nabla^{4+4n_G-n_G m}R^{n_G}$, but also by terms with a smaller number of curvatures (down to $R^2$ terms, in the precise form $R\,\Box^{\frac{1}{2}n_G(6-m)}R$). we neglect writing counterterms which are total derivatives or of the cosmological constant type. these are then the counterterms needed off-shell at the one-loop level in such a general reduced model.
to make this more concrete, we analyze the cases m = 0, 2, and 4, with special attention, in these badly non-renormalizable models, to small numbers of legs of the quantum dressed green functions. for m = 0, to absorb the uv divergences of the 2-point function we need generic counterterms of the form $R\Box^j R$ with the exponent j running over the values j = 0, 1, 2, 3, 4, 5, 6; to renormalize a three-point function one needs the previous terms and possibly new terms of the type $\nabla^j R^3$ with j = 0, …, 16; for four-point functions the new terms are of the type $\nabla^j R^4$ with j = 0, …, 20, and so on for higher green functions (for $n_G$-leg correlators one needs j up to $j_{\rm max} = 4n_G + 4$). when we regularize by adding the e-h term, with m = 2, the situation is slightly better, but then to absorb the uv divergences of the 2-point function we need generic counterterms of the form $R\Box^j R$ with j = 0, 1, 2, 3, 4; to renormalize a three-point function one needs the previous terms and possibly new terms of the type $\nabla^j R^3$ with j = 0, …, 10; for four-point functions the new terms are of the type $\nabla^j R^4$ with j = 0, …, 12, and so on for higher green functions (for $n_G$-leg correlators one needs j up to $j_{\rm max} = 2n_G + 4$). finally, we can add terms of the type $R^2$ and $C^2$ for regularization purposes. in this final case, to absorb the uv divergences of the 2-point function we need generic counterterms of the form $R\Box^j R$ with j = 0, 1, 2; to renormalize a three-point function one needs the previous terms and possibly new terms of the type $\nabla^j R^3$ with j = 0, …, 4; for four-point functions the new terms are of the type $\nabla^j R^4$ with j = 0, …, 4, and so on for higher green functions (for $n_G$-leg correlators one needs j up to $j_{\rm max} = 4$ here, independently of the number of legs $n_G$). still, in this last case one sees that one needs infinitely many counterterms to renormalize the theory at the one-loop level: although the index j counting the added covariant derivatives is bounded by the value 4, one still needs more terms with more powers of the gravitational curvature R. this shows how badly non-renormalizable these models are, already at the one-loop level, and that any control over perturbative uv divergences is likely lost when the number of external legs is not bounded from above. these reduced models are examples of theories in which the dimensionality and the number of derivatives one can extract from the vertices and from the propagators of the theory differ very much. in the quantum gravity models considered previously, these two numbers were identical, which leads to good control over perturbative divergences (renormalizability, super-renormalizability and even uv-finiteness). with these reduced models we are at the other, bad extreme of the vast landscape of possible qg models. but seeing them explicitly proves to us how precious the renormalizability property is, and why we strongly need it in hd models of qg – in particular, why we need super-renormalizability in six-derivative qg models. the arguments above convince us that there is no hope of getting convergent results for the front coefficient of the $C^2$ counterterm in the uv-divergent part of the effective action in the model considered here, with only the $C\Box C$ term leading in the uv.
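the bookkeeping above is mechanical enough to check with a short script (an illustrative sketch of ours implementing formulas (57) and (58); `n_G` is the number of external graviton legs and `m` the worst uv scaling exponent of the regularized propagator):

```python
def d_partial(n_G: int, m: int) -> int:
    # formula (57): derivatives needed in the counterterm absorbing the
    # divergence of the one-loop graviton-ring diagram with n_G legs
    return 4 + n_G * (6 - m)

def j_two_point(m: int) -> int:
    # R box^j R carries 4 + 2j derivatives, so j = (d_partial - 4) / 2
    return (d_partial(2, m) - 4) // 2

def j_n_point(n_G: int, m: int) -> int:
    # nabla^j R^{n_G} carries 2 n_G + j derivatives, so j = d_partial - 2 n_G;
    # this reproduces j_max = 4 n_G + 4 (m=0), 2 n_G + 4 (m=2), 4 (m=4)
    return d_partial(n_G, m) - 2 * n_G

for m in (0, 2, 4):
    print(f"m={m}: R box^j R up to j={j_two_point(m)}; "
          f"3-pt j={j_n_point(3, m)}, 4-pt j={j_n_point(4, m)}")
# m=0: j=6; 3-pt j=16, 4-pt j=20
# m=2: j=4; 3-pt j=10, 4-pt j=12
# m=4: j=2; 3-pt j=4,  4-pt j=4
```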
this lack of convergence may signify that there exists another uv-divergent term (perhaps of the structure $C\Box^n C$), which contains more derivatives, and this could be the reason why the coefficient in front of the $C^2$ term is itself divergent. the presence of such new needed counterterms with more derivatives is motivated by the analysis of the power counting of uv divergences in this reduced, unbalanced model, which was presented above. even if the higher $C\Box^n C$ type of uv divergence is properly extracted and taken care of, we can still be unable to properly define, and see as convergent, the divergence proportional to the four-derivative term $C^2$. even a projection of the uv-divergent functional of the theory onto the sector with only $C^2$ terms will not help here in giving sense to pure $C^2$ divergences in this limiting model. however, this remark does not necessarily apply to the coefficients in front of the divergent terms of the types $R^2$ and $GB$. they may still end up with some finite definite values for this model, but these are probably not the same as the $b_0$ coefficients of these terms from (51), so we could see here another discontinuity in taking the limit $x \to +\infty$. on the other hand, for the strictly non-renormalizable theory with the uv-leading term $R\Box R$, we can have additions of similar various subleading terms which do not change the fact of non-renormalizability. we can add terms (separately or in conjunction) of the following types: $\omega_\Lambda$, $R$, $R^2$, or $C^2$. the uv-leading part of the hessian is still not well-defined, since it should contain a six-derivative differential operator understood on flat spacetime and between tensorial fluctuations, while from the term $R\Box R$ we get only an operator between traces of metric fluctuations $h = \eta_{\mu\nu}h^{\mu\nu}$ (between spin-0 parts), derived from the terms quadratic in curvatures present in the uv regime. probably the degeneracy of the hessian operator can easily be lifted if we add one of the $\omega_\Lambda$, $R$, or $C^2$ terms. if only the $R^2$ term is added, the hessian is still degenerate. similarly, the flat spacetime propagator can be defined only if we add $\omega_\Lambda$, $R$, or $C^2$ terms – this is because of the problematic part of it, proportional to the projector $P^{(2)}$, which must be non-zero for the consistency of the procedure of inverting the whole propagator. this tensorial part (spin-2 part) is sourced exclusively by any gr-invariant term built with weyl tensors in the weyl basis of terms adopted here, or by the e-h term, or by the cosmological constant term. if only the $R^2$ term is added, the propagator remains ill-defined. still, these additions do not change the fact that the theory is formally non-renormalizable if there is no accompanying six-derivative term of the form $C\Box C$. as for the final results for uv divergences in these extended models, naively one would think that there are no new uv divergences proportional to terms with four derivatives of the metric (namely the terms $R^2$, $C^2$ and $GB$), because of the limit $\omega_C \to 0$ and the linear dependence on the x ratio in (33). we would naively think that the divergences with the $R^2$ and $GB$ terms are the same as in (33). the only problematic one could be the one proportional to the $C^2$ term, though here the limit gives an already constant result, namely the $b_0$ coefficient. since the model is non-renormalizable, we cannot trust this computation and these limits at the end, even if they give convergent results here.
but this probably means that we cannot sensibly define the $C^2$ counterterm needed for the renormalization of these theories. when we include additions to the action which remove the degeneracy of the flat spacetime graviton propagator, then at least the perturbative computation using feynman diagrams can be attempted in such a theory. of course, in this case the different parts of the propagator have different uv scalings, so the situation for one-loop integrals is a bit unbalanced, and there is no stable control over perturbative uv divergences when, for example, one goes to higher loop orders. probably new counterterms (with an even higher number of derivatives) will be needed here. adding some terms which are subleading from the point of view of the uv regime may help in defining the unbalanced perturbative propagator, but one still expects (on energy-dimension grounds) that these additions do not influence at all the quantitative form of the uv divergences with four-derivative terms, i.e. the ones which are leading in the uv. these additions are needed here only technically, on the formal level, to let the computation be done, for example using feynman diagrams with some mathematically existing expressions for the graviton propagator. in a sense, adding $\omega_\Lambda$, $R$, or $C^2$ terms regularizes the theory $R\Box R + R^2$, or even the simplest one, just $R\Box R$, so we can perhaps trust the final results there, given by all three $b_0$ coefficients of the uv divergences proportional to the $R^2$, $C^2$ and $GB$ terms. this can be motivated by the observation that here the limits $\omega_C \to 0$ or $\theta_C \to 0$ respectively do not enhance any symmetry of the model in question, so they can naively and safely be taken. but we agree that this case requires a special, detailed and careful computation to prove this conjectured behaviour. in the general case of badly non-renormalizable theories, with either $\omega_C = 0$ or $\omega_R = 0$, one trusts the computation using feynman diagrams around flat spacetime more than the fully gr-covariant bv method of computation. for the former, one only needs to be able to properly define the propagator – all its physical sectors – and for this purpose one can regularize the theory by adding the term $\omega_\kappa R$, which is a dynamical term with the smallest number of derivatives and for which flat spacetime is an on-shell vacuum background. (in this way, we exclude adding the cosmological constant term $\omega_\Lambda$, which would require adding some source, and then the flat spacetime propagator could not be considered in vacuum anymore.) then, in such a regulated non-renormalizable theory, one can get results around flat spacetime and in fourier momentum space, and at the end one can take the limit $\omega_\kappa \to 0$. the results for some uv divergences in these non-renormalizable models must be viewed as projected, since higher-derivative (6-derivative and even higher) infinities may be present as well. these last results must coincide with the ones we obtained in (33) when the proper limits $\omega_C \to 0$ or $\omega_R \to 0$ are taken. we note that adding the e-h term, which is always a good regulator, changes the dynamics of these higher-derivative models very insignificantly, and the results of feynman diagram computations can always be derived.
instead, for the part of the hessian operator leading in the uv regime, which is a crucial element of the bv method of computation, the addition of just the ωκ r term does not help much: the operator is still degenerate, since it is required that all its sectors contain six-derivative differential operators that are non-vanishing and non-degenerate there. we propose the following procedure for the derivation of the correct limiting cases analyzed here. first, the theory with ωc ≠ 0, ωr ≠ 0 and ωκ ≠ 0 is analyzed using the feynman diagram approach. the results for uv divergences must be identical to the ones found in (33) using the bv technique; at this stage they do not show any singularity. in the feynman diagram computation one can take the limit ωc → 0 or ωr → 0, while the propagator and perturbative vertices are still well defined. in these circumstances we still have ωκ ≠ 0. we admit that the theory now loses its renormalizability properties, but we just want to project the uv-divergent action onto the terms with the structure of the three gr-covariant terms c², r² and gb. for this, the method of feynman graph computation is still suitable, since it only requires a well-defined propagator, and it can work even in badly non-renormalizable theories. this is like taking the naive limits ωc → 0 or ωr → 0, respectively, in the results of (33), regardless of the way they were obtained. one justifies the step of taking these limits by recalling that we are still working in the feynman diagram approach, not with the bv technique. finally, one sends ωκ → 0, hoping that this does not produce any finite discontinuity in the results for the four-derivative uv divergences. this is justified by the dimensional analysis arguments provided earlier in this article, and by the fact that, except in the case of conformal gravity theory in d = 4, there is no enhancement of local symmetries in the limit ωκ → 0. in this way, one gives sense to the limits considered in these sections. this analysis concludes the part on special limiting cases of extended six-derivative theories, where one of the coefficients ωr or ωc is set to zero. probably the same considerations of some limits can be repeated very similarly (with the exception of the conformal gravity case) for the stelle quadratic gravity models, but we skip this analysis here, since it can be found in the literature. 5. stability of hd theories above we have seen that hd theories of gravitation are inevitable due to quantum considerations. they also come with many benefits that we have discussed at length before, like super-renormalizability and the possibility of uv-finiteness. however, it is also well known that they have their own drawbacks and problems. one of the most crucial is the issue of unitarity of the scattering (s-matrix) in the perturbative framework. this, of course, concerns the situation when we can discuss scattering problems at all, i.e. when we can define asymptotic states on an interacting gravitational background, so when the gravitational spacetime is asymptotically flat. in more generality, the related issue is that of the quantum stability of the theory. in the literature about general hd theories there exist various proposals for solving these perennial problems. we can mention a few of them here: pt-symmetric quantum theory, the anselmi-piva fakeon prescription, non-local hd theories, benign ghosts as proposed by smilga, etc.
below we will try to describe some of these methods and show that the problems with the unitarity or the stability of the quantum theory can be successfully solved. we also provide arguments, due to mannheim [57, 58], that the gravitationally coupled theory must be free of problems of this type if the original matter theory was completely consistent. we first express the view that the stability of the quantum theory is fundamental, while the classical theory may emerge from it only in some properly defined limits. hence we should care more about the full, even non-linear, stability on the quantum level, and some instabilities on the classical level may be just artifacts of using a classical theory which cannot be defined by itself, without any reference to the original fundamental quantum theory. an attempt to understand the stability entirely in classical terms may be doomed to fail, since forgetting about the quantum origin may be detrimental to the limiting process. if the quantum theory is stable and unitarity is preserved, then this is the only thing we should require: the world is quantum in its nature, and physically we know that ℏ = 1 in proper units, rather than ℏ → 0, so the classical limit may be only some kind of illusion. if there are problems with a classical stability analysis, like the one done originally by ostrogradsky, this may only mean that the classical theory obtained this way neglects some important features that were relevant on the quantum level for the full quantum stability of the system. first, in the anselmi-piva prescription one solves the unitarity issue for hd theories completely, by invoking the fakeon prescription to properly take into account the contributions of the particles which, in the spectrum, are related to the higher-derivative nature of the theories, and which are typically considered dangerous for the unitarity of the theory. the presence of a particle with negative residue, called a ghost at the classical level, makes the theory non-unitary in its original quantization based on the standard feynman prescription [7] of encircling the poles in the loop integrals. a new quantum prescription, recently introduced by anselmi and piva [59–61], was based on earlier works by cutkosky, landshoff, olive, and polkinghorne [62]. the former authors devised a procedure for the lee-wick theories [63, 64] which allows them to tame the effects typically associated with the presence of ghosts in stelle's theory. in this picture, the ghost problem (also known as the unitarity problem) is solved consistently at any perturbative order in the loop expansion [61] of the loop integrals which need to be computed in any qft, if one requires higher-order accuracy. at the classical level, the ghost particle (or what anselmi and piva call a "fakeon", because this particle, understood as a quantum state, can appear only as a virtual particle inside perturbative loops) is removed from the perturbative spectrum of the theory. this is done by solving the classical equations of motion for the fakeon field by means of a very specific combination of advanced plus retarded green's functions, and by fixing to zero the homogeneous solution of the resulting field equations [65, 66]. this is then equivalent to removing the complex ghosts in the quantum theory from the spectrum of asymptotic quantum states by hand.
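the content of this prescription can be made concrete on a single massive mode. in frequency space, the fakeon propagator is the average of the two opposite pole encirclings (equivalently, the principal value), which kills the on-shell imaginary part responsible for asymptotic propagation. the sketch below is our own minimal numerical illustration with made-up parameter values, not code from [59–61]:

    import numpy as np

    m, eps = 1.0, 1e-3                 # mode mass and pole regulator (illustrative)
    w = np.linspace(0.0, 2.0, 2001)    # frequency grid

    g_feyn = 1.0/(m**2 - w**2 - 1j*eps)            # feynman prescription: single -i*eps shift
    g_fake = 0.5*(1.0/(m**2 - (w + 1j*eps)**2)
                  + 1.0/(m**2 - (w - 1j*eps)**2))  # fakeon: average of the two encirclings

    # an imaginary part of the propagator at w = m signals on-shell propagation;
    # it survives for the feynman choice but cancels in the fakeon average, so no
    # asymptotic (on-shell) quanta are associated with the fakeon mode
    i = np.argmin(np.abs(w - m))
    print("im g_feyn at w=m:", g_feyn[i].imag)   # large, of order 1/eps
    print("im g_fake at w=m:", g_fake[i].imag)   # zero (exact cancellation)

in position space this average corresponds precisely to the half-sum of advanced and retarded green's functions mentioned above.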
however, this choice and this removal decision are fully preserved and protected by quantum corrections; hence the removal does not invalidate the unitarity of the s-matrix at higher loop orders. such a prescription of how to treat virtual particles arising due to the hd nature of the theories is very general and can be applied to both real and complex ghosts, and also to normal particles, if one wishes to. (every particle can be made fake, i.e. without observable effects on the unitarity of the theory.) in particular, this prescription is crucial in order to make perturbatively unitary the theory proposed by modesto and shapiro in [67, 68], which goes under the name of "lee-wick quantum gravity". the latter class of theories is based on the general gravitational higher-derivative actions proposed by asorey, lopez, and shapiro [12]. within this range of theories, we can safely state that we have a class of super-renormalizable or uv-finite and unitary higher-derivative theories of qg. in order to guarantee tree-level unitarity, the theory in [67, 68] has been constructed in such a way that it exhibits only complex-conjugate poles in the graviton propagator, besides the standard spin-2 pole typically associated with the normal massless graviton particle with two polarizations. the new prescription by anselmi and piva [61] then guarantees the unitarity of this theory at any perturbative order in the loop expansion. we also emphasize that stelle's theory quadratic in gravitational curvatures [7], with the anselmi-piva prescription, is the only strictly renormalizable theory of gravity in d = 4 spacetime dimensions, while the theories proposed in [67, 68] belong to a large (in principle infinite) class of super-renormalizable or uv-finite models of quantum gravity. next, in the other approach to higher-derivative theories and to non-symmetric and non-hermitian quantum mechanics, pioneered by bender and mannheim [69, 70], one exploits the power of non-hermitian pt-symmetric quantum gravity. here, the basic idea is that the gravitational hamiltonian in such theories (if it can be well defined) is not a hermitian operator on the properly defined hilbert space of quantum states; rather, it is only a pt-symmetric hamiltonian. then some eigenstates of such a hamiltonian may correspond to non-stationary solutions of the original classical wave equations. in the standard classical treatment, these would indeed correspond to ostrogradsky instabilities; famous examples are cosmological run-away solutions, or asymptotically non-flat gravitational potentials for black hole solutions. the problem of ghosts manifests itself already on the classical level of the equations of motion, where one studies linear perturbations and their evolution in time: for unstable theories, the perturbations grow in time without bound. but for some special solutions, like those present, for example, in models of conformal gravity, these instabilities are clearly avoided, and then one can say that the ghosts are benign, in opposition to malign ghosts destroying the unitarity of the theory. such benign ghosts [71, 72] are then harmless for the issues of perturbative stability. in the pt-symmetric approach to hd theories, at the beginning one cannot determine the hilbert space by looking at the c-number propagators of the quantum fields.
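the standard toy model behind this discussion is the pais-uhlenbeck oscillator studied in [69, 70], with the fourth-order equation x'''' + (ω1² + ω2²)x'' + ω1²ω2²x = 0. the minimal sketch below (our own illustration, not code from the cited works) integrates it numerically and confirms that the free evolution stays bounded even though the ostrogradsky hamiltonian is unbounded from below:

    import numpy as np

    w1, w2 = 1.0, 1.7            # unequal frequencies (illustrative choice)
    dt, nsteps = 1e-3, 200_000   # integrate up to t = 200

    # rewrite x'''' + (w1^2 + w2^2) x'' + w1^2 w2^2 x = 0 as a first-order
    # system for y = (x, x', x'', x''')
    a = np.array([[0.0, 1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0, 0.0],
                  [0.0, 0.0, 0.0, 1.0],
                  [-(w1*w2)**2, 0.0, -(w1**2 + w2**2), 0.0]])

    y = np.array([1.0, 0.0, 0.0, 0.0])   # generic initial data
    amax = 0.0
    for _ in range(nsteps):              # classical fourth-order runge-kutta
        k1 = a @ y
        k2 = a @ (y + 0.5*dt*k1)
        k3 = a @ (y + 0.5*dt*k2)
        k4 = a @ (y + dt*k3)
        y += dt/6.0*(k1 + 2*k2 + 2*k3 + k4)
        amax = max(amax, abs(y[0]))

    print("max |x| over t in [0, 200]:", amax)   # stays of order one: the free ghost is benign

interactions can turn such a benign ghost malign, which is exactly where smilga's special solutions and boundary conditions become relevant.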
in this case, one has to quantize the theory from the start and construct the hilbert space from scratch; it is different from the naive construction based on extending the one used normally, for example, for two-derivative qft's. with this new hilbert space and with the non-hermitian (but pt-symmetric) hamiltonian, the theory turns out to be quantum-mechanically stable. this is dictated by the construction of the new hilbert space and by the structure of the hamiltonian operator. in that case, the procedure of taking the classical limit results in a definition of the theory in one of the stokes wedges, and in such a region the hamiltonian is not real-definite, so the corresponding classical hamiltonian is not a hermitian operator. therefore, the whole ostrogradsky analysis is correct as far as a theory with real functions and real-valued hamiltonians is concerned, but it does not apply to the classical theory corresponding to the quantum theory proven earlier to be stable quantum-mechanically. with this understanding, there is no longer any problem with the unitarity or classical stability of the theory, but one has to be very careful in attempts to define the classical limiting theory. we also repeat here the arguments proposed by mannheim about the stability of the resulting gravitation-matter coupled theory [57, 58]. first, we take some two-derivative matter model (for example, the standard model of particle physics, where we have various scalars, fermions and spin-1 gauge bosons). this theory, considered on a flat minkowski background, is well known to be unitary, so it gives an s-matrix of interactions with these properties; the model can also be said to be stable on the quantum level. now, we want to couple it to gravity – in other words, to put it on a gravitational spacetime with a non-trivial background – in such a way that the mutual interactions between the gravitational sector and the matter sector are consistent. this, in particular, implies that the phenomena of back-reaction of the matter species on the geometry are not neglected. the crucial assumption here is that this procedure of coupling to gravity is well behaved and, for example, does not destroy the unitarity properties present in the matter sector. we know that the theory in the matter sector is stable, and its coupling to geometry should also be stable on the full quantum level; after all, this is just a simple coupling procedure (it could be minimal coupling) providing mutually consistent interactions with the background configurations of the gravitational field. next, on the quantum level described, for example, by the functional path integral, we can decide to completely integrate out the matter species while still staying on a general gravitational background. as emphasized in section 1, such a procedure in d = 4 spacetime dimensions generates an effective quantum gravitational dynamics of the background fields with higher derivatives; precisely, in this case there are terms of the type c² and r² (the latter term is absent when the matter theory is classically conformally invariant). in other words, the resulting functional of the quantum partition function of the total coupled model is a functional of only the background gravitational fields. this last reduced, or "effective", functional is given by the functional integral over the quantum fluctuations of the gravitational field of the theory given classically by the action with these hd terms above.
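to make the counting behind the induced c² term slightly more concrete: at one loop, free matter fields contribute to the coefficient of the weyl-squared term with well-known relative weights per real scalar, dirac fermion and gauge vector. the overall 1/(120(4π)²) normalization below is one common convention, and the standard-model-like field counting is purely illustrative – both should be treated as our assumptions:

    from math import pi

    def induced_c2_coefficient(n_scalar, n_dirac, n_vector):
        # relative weights (1, 6, 12) per real scalar / dirac fermion / gauge
        # vector; the overall factor is convention-dependent
        return (n_scalar + 6*n_dirac + 12*n_vector) / (120 * (4*pi)**2)

    # illustrative counting: 4 real higgs components, 45 weyl fermions
    # (counted as 22.5 dirac), 12 gauge bosons
    print(induced_c2_coefficient(4, 22.5, 12))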
let us recall what we have done: we have simply integrated out all quantum matter fields, which is an identity transformation for the functional integral representation of the partition function z of the quantum coupled theory. since this transformation does not change anything, the resulting theory of the gravitational background must necessarily possess the same features as the original coupled theory we started with. since the first theory was unitary, the last one – a theory of pure gravity, but with higher-derivative terms – must be unitary too. we emphasize that both theories give the same numerical values of the partition function z, understood here as a functional of the background spacetime metric. in the first theory the integration variables under the functional integral are the quantum matter fields, while in the second case we are dealing with pure gravity, so we need to integrate over quantum fluctuations of the gravitational field. in the latter case, the model which gives the integrand of the functional integral is given by the classical action shd, so it necessarily contains higher derivatives of the gravitational metric field. there also exist possibilities that the ghosts or classical instabilities, which one sees on the classical level thanks to the ostrogradsky analysis, disappear. this may happen if, for example, some very specific (or fine-tuned) initial or boundary conditions are used for solving the non-linear higher-derivative classical equations of motion of the theory. it is not excluded, as shown by smilga, that some instabilities may go away if one analyzes such special situations. various cures have been proposed in the literature for dealing with the ghost-tachyon issue: the lee-wick prescription [63, 64], fakeons [61, 65, 66, 73], non-perturbative numerical methods [71, 72, 74–78], analyses of ghost instabilities [79–81], non-hermitian pt-symmetric quantum gravity based on pt-symmetric quantum mechanics [69, 70], etc. (see also [82–87]). one might even entertain the idea that unitarity in quantum gravity is not a fundamental concept. so far, there is no consensus in the community on which solution is the correct one. the unfortunate prevalent viewpoint is that none of the proposed solutions solves the problem conclusively and completely, and it seems that, sadly, the solutions proposed in the literature are not compatible and are unrelated to each other. nevertheless, all the arguments given above should convince the reader that hd (gravitational) theories are stable on the full quantum level. in particular, this means that in situations in which we can define asymptotic states (like for asymptotically flat spacetimes), the scattering matrix between fluctuations of the gravitational field is unitary on the quantum level, both perturbatively and non-perturbatively. 6. conclusions in this contribution, we have discussed hd gravitational theories, in particular six-derivative gravitational theories. first, we motivated them by emphasizing their various advantages as models of consistent quantum gravity. we showed that six-derivative theories are even better behaved on the quantum level than four-derivative theories, although the latter are very useful regarding scale- and conformal invariance of gravitational models.
moreover, the models with four-derivative actions serve as good starting examples of hd theories, and they are reference points for the consideration of six- and higher-order gravitational theories. we first tried to explain the dependence of the beta functions in six-derivative theories by drawing analogies exactly to these prototype theories of stelle gravity. we also emphasize that only in six-derivative gravitational models do we have the very nice feature of super-renormalizability and the narrow but still viable option of complete uv-finiteness. this is why we think that super-renormalizable six-derivative theories have better control over perturbative uv divergences and give us a good model of qg, in which this last issue with perturbative divergences is finally fully under our control and theoretical understanding. in the main part of this paper, we analyzed the structure of the perturbative one-loop beta functions in six-derivative gravity for the couplings in front of terms containing precisely four derivatives in the uv-divergent part of the effective action. these terms can be considered scale-invariant terms, since the couplings in front of them are all dimensionless in d = 4 spacetime dimensions. our calculation of these divergences was done originally in the euclidean signature, using the so-called barvinsky-vilkovisky trace technology. however, the results are the same also in the minkowskian signature, independently of which prescription one uses to rotate back to the physical relativistic lorentzian signature, whether this is the standard wick rotation or the one using the anselmi-piva prescription with fakeons. this is because these are the leading divergences in the uv regime, and hence they do not depend on how the rotation procedure from euclidean to minkowskian is done, nor, for example, on how the contributions of arcs on the complex plane are taken into account, since the latter give subleading contributions to the uv-divergent integrals. moreover, the beta functions that we presented in this paper have the very nice and important feature of being renormalization-scheme-independent: they are computed at one loop, but the expressions we get for them are valid universally. these are exact beta functions, since they do not receive any perturbative corrections at higher loop orders, because the six-derivative gravitational theory is super-renormalizable in d = 4. further good properties of the beta functions obtained here are their complete gauge independence and their independence of the gauge-fixing parameters that one can use in the definition of the gauge-fixing functional. these last two properties are very important, since in a general gravitational theory we have access to perturbative computations only after introducing some spurious elements into the formalism, related to gauge freedoms (in this case, diffeomorphism symmetries). we modify the original theory (from the canonical formalism) by adding various additional fields and various spurious non-physical (gauge) polarizations of the mediating gauge bosons (in our case, gravitons), also in order to preserve relativistic invariance. these are redundancies that have to be eliminated when, at the end, one wants to compute some physical observables. therefore, it is very reassuring that our final results are completely insensitive to these gauge-driven modifications of the original theory.
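since these one-loop beta functions are exact, the renormalization-group flow of the dimensionless four-derivative couplings integrates in closed form to an exactly logarithmic running. the sketch below uses a made-up numerical value of the beta function purely to illustrate this structure:

    import numpy as np

    beta0 = 0.42             # hypothetical constant value of an exact one-loop beta function
    g_mu0, mu0 = 1.0, 1.0    # coupling at the reference scale mu0

    def g(mu):
        # d g / d ln(mu) = beta0  =>  g(mu) = g(mu0) + beta0 * ln(mu/mu0),
        # exact to all orders when the beta function gets no higher-loop corrections
        return g_mu0 + beta0 * np.log(mu / mu0)

    for mu in (1.0, 10.0, 100.0, 1000.0):
        print(f"mu = {mu:7.1f}   g = {g(mu):.3f}")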
our beta functions, being exact and having many other nice properties, constitute one significant part of the accessible observables in the qg model with six-derivative actions. their computation is a nice theoretical exercise which, from the point of view of the algebraic and analytic methods used in mathematical physics, is of interest in its own right. however, as we emphasized above, these final results for the beta functions may also have meaning as true physical observables in the model of six-derivative qg theories. we described in greater detail the analysis of the structure of the beta functions in this model. first, we used arguments of energy dimensionality and the dependence of the couplings on the dimensionless fundamental ratio x of the theory. next, we drew a comparison between the structures of the four-derivative gravitational stelle theory and the six-derivative theory in d = 4 dimensions. we showed that the dependence on the parameter x is quite opposite in the two cases. the case of the four-derivative theory is exceptional, because the model without any r term in the action (and also without the cosmological constant term) enjoys an enhanced symmetry, and then quantum conformal gravity is renormalizable at the one-loop order, so it is a special case of a sensible quantum physical theory (up to the conformal anomaly problems discussed earlier). we also remark that in the cases x → 0 and x → ∞ the generic six-derivative theories are badly non-renormalizable. this was the source of the problems with attempts to obtain sensible answers in these two limits. the non-renormalizability problem must show up somewhere in the middle or at the end of the computation, to warn us that in the end we cannot trust the final results for the beta functions in these cases. in these two cases this problem indeed showed up, in two different places, and its logical consequences strongly constrained the possible form of the rational x-dependence of these results. thanks to these considerations, we were finally able to understand whether positive or inverse powers of the ratio x must appear in the final results for the beta functions in question. of course, we admit that this analysis is a posteriori, since we first derived the results for the divergences and only later tried to understand the reasons behind them. but eventually we were able to find a satisfactory explanation, and there are a few additional spin-offs of the presented argumentation. first, we can make predictions about the structure of the beta functions in 8-derivative and other higher-derivative gravitational theories with a number of derivatives in the action bigger than 4 and 6 (analyzed in this paper). we conjecture that the structure should be very similar to what we have already seen in the generic case with six derivatives: only positive powers of the corresponding fundamental ratio x of the theory, and probably only in the sector with the c² type of uv divergences. another good side effect is that we provide the first (to our knowledge) theoretical explanation of the structure of the beta functions seen in the four-derivative case of stelle theory in d = 4 spacetime dimensions.
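the constraint used above can be phrased very simply: a beta function that is polynomial in x (only non-negative powers) stays finite as x → 0 but generically blows up as x → ∞, and conversely for inverse powers. a minimal symbolic sketch, with invented coefficients standing in for the actual results of (33):

    import sympy as sp

    x = sp.symbols('x', positive=True)

    beta_poly = sp.Rational(3, 2) + 5*x + 2*x**2         # only non-negative powers of x (illustrative)
    beta_inv = sp.Rational(3, 2) + sp.Rational(1, 4)/x   # contains an inverse power of x

    print(sp.limit(beta_poly, x, 0))       # 3/2 -> finite: the x -> 0 limit exists
    print(sp.limit(beta_poly, x, sp.oo))   # oo  -> no sensible x -> oo limit
    print(sp.limit(beta_inv, x, 0))        # oo  -> no sensible x -> 0 limit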
it is not only that the theory with the c² action is exceptional in d = 4 dimensions; we also "explained" these differences based on an extension of the theory to include higher-derivative terms (like six-derivative ones), and we quantified to what level the theory with the c² action is special and how this is reflected in the structure of its one-loop beta functions. we remark that in stelle gravity (or even in its sub-case with conformal symmetry based on the c² action) there are contributions to the beta functions originating from higher perturbative loops, since the super-renormalizability argument based on power counting analysis does not apply there. our partial explanation of the structure of the one-loop beta functions of stelle theory in d = 4 uses the general philosophy that, to "explain" some numerical results in theoretical physics, one perhaps has to generalize the original setup and, in the new extended framework, look for simplifying principles which, upon reduction to special cases, show explicitly how special these cases are, not only qualitatively but also quantitatively, and what the reduction procedure implies for the numbers one gets as results of the reduction. for example, one typically extends the original framework from the fixed condition of d = 4 spacetime dimensions to the more general situation with arbitrary d, and then draws general conclusions as functions of d based on some simple general principles. finally, the case of d = 4 is recovered as the particular value obtained when the function is evaluated at d = 4, and this should explain its speciality. in our case, we extended the four-derivative theory by adding terms with six derivatives, and in this way we were able to study a more generic situation. this was done in order to understand and explain the structure of divergences in the special reduced case of conformal gravity in d = 4 and in the still generic four-dimensional stelle theory. we think that this is a good theoretical explanation which sheds some light on the so far mysterious issue of the structure of the beta functions. one can also see this as another reason why it is worth studying generalizations of higher-derivative gravitational actions to terms with an even higher number of derivatives, like six-derivative and eight-derivative actions, etc. finally, we can comment here on the issue of experimental bounds on the values of the ratio x. since it appears in a six-derivative gravitational theory, the constraints on its possible values are very weak. slightly stronger constraints currently apply to the corresponding value of the ratio in the four-derivative stelle gravitational theory in the d = 4 case. since the main reason for higher-derivative modifications of gravity is the consistency of the coupled quantum theory, one would expect that stringent bounds would come from experimental measurements in the real domain of true quantum gravity. of course, this is a very, very far future of experimental gravitational physics, if it is possible at all. this is all due to the smallness of the gravitational coupling gn, inversely proportional to the square of the planck mass mp ∼ 10^19 gev. in the quantum domain of elementary particle physics this scale is bigger than any energy scale of interactions between elementary quanta of matter, which implies that quantum gravitational interactions are very weak in strength.
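a one-line estimate makes this weakness quantitative: the natural dimensionless strength of quantum-gravitational effects at energy e is of order (e/mp)². the energy scales below are illustrative choices:

    # suppression factor (e / m_planck)^2 of quantum-gravitational effects
    m_planck = 1.22e19                       # gev (ordinary planck mass)
    for name, e_gev in (("lhc collision", 1.3e4),            # ~13 tev
                        ("highest-energy cosmic ray", 1.0e11)):  # ~10^20 ev
        print(f"{name}: (e/mp)^2 ~ {(e_gev / m_planck)**2:.1e}")

even for the most energetic particles ever observed, the suppression is of order 10^-17 or smaller, which is why direct quantum-domain bounds on such coefficients are out of reach.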
hence, the only experimental/observational bounds we have on the coefficients in front of higher-derivative terms come from the classical/astrophysical domain of gravitational physics, and they are still very weak. to probe the values of the coefficients in front of six-derivative terms, one would really have to perform a gravitational experiment between elementary particles in the full quantum domain, at a much increased level of accuracy, which is now completely unfeasible. hence, we have to be satisfied with the already existing, very weak bounds; but this lets us freely consider the generic theoretical situation with arbitrary values of the ratio x, since perhaps only (very far) future experiments can force us theoreticians to consider some more restricted subset or interval of values of the x ratio as consistent with the situation observed in nature. for the moment, it is reasonable to consider and explore theoretically the whole range of possible values of the x ratio, with both possible signs. (only the case x = 0 is excluded, as the non-renormalizable theory that we have discussed before.) in the last section of this contribution, we commented on the important issue of the stability of higher-derivative theories. we touched upon both the classical and the quantum level, while the former should not be understood as a standalone level on which we can initially (before the supposed quantization) define the classical theory of the relativistic gravitational field. we followed the philosophy that the quantum theory is more fundamental, and that it is the starting point for considering various limits, provided it is properly quantized (in the sense that the quantum partition function is consistently defined, regardless of how we arrive at its form and no matter which formal quantization procedure we have been following). one such possible limit is the classical limit, where the field expectation values are large compared to the characteristic values found in the microworld of elementary particles, and the occupation numbers of bosonic states are large (of the order of the avogadro number, for example). then we can speak about coherent states, which can well define the classical limit of the theory. such a procedure has to be followed in order to define an hd classical gravitational theory. we emphasized that the quantum theory is the basis and the classical theory is the derived concept, not vice versa. on the quantum level, we shortly discussed various approaches present in the literature to solve the problems with unwanted ghost-like particle states. they were classified into two groups: theories with a pt-symmetric hamiltonian, and theories with the anselmi-piva prescription instead of the feynman prescription, taking into account the contributions of the ghost poles without spoiling the unitarity of the theory. on the quantum level, we considered mainly the issue of unitarity of the scattering matrix, since this seems the most problematic one. a violation of unitarity would signal a problem with the conservation of probability of quantum processes – something that we cannot allow to happen in the quantum-mechanical framework for an isolated quantum system (one not interacting with a noisy, decohering, dissipative or thermal environment). of course, such an analysis is tailor-made for the cases of gravitational backgrounds on which we can properly define the scattering process. in general, scattering processes are not everything we can talk about in quantum field theories, even for on-shell quantities.
the analysis of some on-shell dressed green functions may also reveal problems with the quantum stability of the system. therefore, we also briefly described the results of the stability analysis, both on the classical and the quantum level and at various loop accuracies, in qg models. this analysis is in principle applicable to the case of any gravitational background, more general than the one coming with the requirement of asymptotic flatness. we also mentioned that in some cases of classical field theory the analysis of exact classical solutions shows that very special and tuned solutions are free of classical instabilities and are well defined for any time, starting from very special initial or boundary conditions. for example, here we can mention the case of the so-called benign ghosts of higher-derivative gravitational theories, as proposed by smilga some time ago. this should prove to the reader that we are dealing with theories which, besides a very interesting structure of perturbative beta functions, are also amenable to solving the stability and unitarity issues, both on the quantum as well as on the classical level. with some special care we can exert control, and hd gravitational theories are stable quantum-mechanically – and this is what matters fundamentally. acknowledgements we would like to thank i. l. shapiro, l. modesto and a. pinzul for initial comments and encouragement about this work. l. r. thanks the department of physics of the federal university of juiz de fora for kind hospitality and fapemig for technical support. finally, we would like to express our gratitude to the organizers of the "algebraic and analytic methods in physics" – aamp xviii conference for accepting our talk proposal and for creating a stimulating environment for scientific online discussions. references [1] b. s. dewitt. quantum theory of gravity. 1. the canonical theory. physical review 160(5):1113–1148, 1967. https://doi.org/10.1103/physrev.160.1113. [2] b. s. dewitt. quantum theory of gravity. 2. the manifestly covariant theory. physical review 162(5):1195–1239, 1967. https://doi.org/10.1103/physrev.162.1195. [3] b. s. dewitt. quantum theory of gravity. 3. applications of the covariant theory. physical review 162(5):1239–1256, 1967. https://doi.org/10.1103/physrev.162.1239. [4] r. utiyama, b. s. dewitt. renormalization of a classical gravitational field interacting with quantized matter fields. journal of mathematical physics 3(4):608–618, 1962. https://doi.org/10.1063/1.1724264. [5] n. d. birrell, p. c. w. davies. quantum fields in curved space. cambridge monographs on mathematical physics. cambridge univ. press, cambridge, uk, 1984. isbn 978-0-521-27858-4. https://doi.org/10.1017/cbo9780511622632. [6] i. l. buchbinder, s. d. odintsov, i. l. shapiro. effective action in quantum gravity. 1992. [7] k. s. stelle. renormalization of higher derivative quantum gravity. physical review d 16(4):953–969, 1977. https://doi.org/10.1103/physrevd.16.953. [8] k. s. stelle. classical gravity with higher derivatives. general relativity and gravitation 9:353–371, 1978. https://doi.org/10.1007/bf00760427. [9] h. lü, a. perkins, c. n. pope, k. s. stelle. spherically symmetric solutions in higher-derivative gravity. physical review d 92(12):124019, 2015.
https://doi.org/10.1103/physrevd.92.124019. [10] h. lu, a. perkins, c. n. pope, k. s. stelle. black holes in higher-derivative gravity. physical review letters 114(17):171601, 2015. https://doi.org/10.1103/physrevlett.114.171601. [11] l. modesto. super-renormalizable quantum gravity. physical review d 86(4):044005, 2012. https://doi.org/10.1103/physrevd.86.044005. [12] m. asorey, j. l. lopez, i. l. shapiro. some remarks on high derivative quantum gravity. international journal of modern physics a 12(32):5711–5734, 1997. https://doi.org/10.1142/s0217751x97002991. [13] p. d. mannheim. making the case for conformal gravity. foundations of physics 42:388–420, 2012. https://doi.org/10.1007/s10701-011-9608-6. [14] p. d. mannheim. conformal invariance and the metrication of the fundamental forces. international journal of modern physics d 25(12):1644003, 2016. https://doi.org/10.1142/s021827181644003x. [15] g. ’t hooft, m. j. g. veltman. one loop divergencies in the theory of gravitation. annales de l’ inst h poincare physique théorique a 20(1):69–94, 1974. [16] s. deser, p. van nieuwenhuizen. one loop divergences of quantized einstein-maxwell fields. physical review d 10(2):401–410, 1974. https://doi.org/10.1103/physrevd.10.401. [17] m. h. goroff, a. sagnotti. quantum gravity at two loops. physics letters b 160(1-3):81–86, 1985. https://doi.org/10.1016/0370-2693(85)91470-4. [18] m. h. goroff, a. sagnotti. the ultraviolet behavior of einstein gravity. nuclear physics b 266(3-4):709–736, 1986. https://doi.org/10.1016/0550-3213(86)90193-8. [19] g. ’t hooft. local conformal symmetry: the missing symmetry component for space and time, 2014. arxiv:1410.6675. [20] g. ’t hooft. local conformal symmetry: the missing symmetry component for space and time. international journal of modern physics d 24(12):1543001, 2015. https://doi.org/10.1142/s0218271815430014. [21] g. ’t hooft. local conformal symmetry in black holes, standard model, and quantum gravity. international journal of modern physics d 26(03):1730006, 2016. https://doi.org/10.1142/s0218271817300063. [22] j. maldacena. einstein gravity from conformal gravity, 2011. arxiv:1105.5632. [23] y.-d. li, l. modesto, l. rachwał. exact solutions and spacetime singularities in nonlocal gravity. journal of high energy physics 2015:1–50, 2015. https://doi.org/10.1007/jhep12(2015)173. [24] i. l. buchbinder, o. k. kalashnikov, i. l. shapiro, et al. the stability of asymptotic freedom in grand unified models coupled to r2 gravity. physics letters b 216(1-2):127–132, 1989. https://doi.org/10.1016/0370-2693(89)91381-6. [25] i. l. buchbinder, i. shapiro. introduction to quantum field theory with applications to quantum gravity. oxford graduate texts. oxford university press, 2021. [26] l. rachwał, l. modesto, a. pinzul, i. l. shapiro. renormalization group in six-derivative quantum gravity. physical review d 104(8):085018, 2021. https://doi.org/10.1103/physrevd.104.085018. [27] a. o. barvinsky, g. a. vilkovisky. the generalized schwinger-dewitt technique in gauge theories and quantum gravity. physics reports 119(1):1–74, 1985. https://doi.org/10.1016/0370-1573(85)90148-6. [28] r. mistry, a. pinzul, l. rachwał. spectral action approach to higher derivative gravity. the european physical journal c 80(3):266, 2020. https://doi.org/10.1140/epjc/s10052-020-7805-1. [29] l. modesto, l. rachwal. super-renormalizable and finite gravitational theories. nuclear physics b 889:228–248, 2014. https://doi.org/10.1016/j.nuclphysb.2014.10.015. [30] l. modesto, m. piva, l. rachwal. 
finite quantum gauge theories. physical review d 94(2):025021, 2016. https://doi.org/10.1103/physrevd.94.025021. [31] a. s. koshelev, k. sravan kumar, l. modesto, l. rachwał. finite quantum gravity in ds and ads spacetimes. physical review d 98(4):046007, 2018. https://doi.org/10.1103/physrevd.98.046007. [32] l. modesto, l. rachwał. nonlocal quantum gravity: a review. international journal of modern physics d 26(11):1730020, 2017. https://doi.org/10.1142/s0218271817300208. [33] l. modesto, l. rachwał. universally finite gravitational and gauge theories. nuclear physics b 900:147–169, 2015. https://doi.org/10.1016/j.nuclphysb.2015.09.006. [34] l. modesto, l. rachwal. finite conformal quantum gravity and nonsingular spacetimes, 2016. arxiv:1605.04173. [35] l. modesto, l. rachwał. finite quantum gravity in four and extra dimensions. in 14th marcel grossmann meeting on recent developments in theoretical and experimental general relativity, astrophysics, and relativistic field theories, vol. 2, pp. 1196–1203. 2017. https://doi.org/10.1142/9789813226609_0084. [36] r. e. kallosh, o. v. tarasov, i. v. tyutin. one loop finiteness of quantum gravity off mass shell. nuclear physics b 137(1-2):145–163, 1978. https://doi.org/10.1016/0550-3213(78)90055-x. [37] m. y. kalmykov. gauge and parametrization dependencies of the one loop counterterms in the einstein gravity. classical and quantum gravity 12(6):1401–1412, 1995. https://doi.org/10.1088/0264-9381/12/6/007. [38] i. l. shapiro, a. g. zheksenaev. gauge dependence in higher derivative quantum gravity and the conformal anomaly problem. physics letters b 324(3-4):286–292, 1994. https://doi.org/10.1016/0370-2693(94)90195-3. [39] i. a. batalin, g. a. vilkovisky. quantization of gauge theories with linearly dependent generators. physical review d 28(10):2567–2582, 1983. [erratum: phys. rev. d 30, 508 (1984)], https://doi.org/10.1103/physrevd.28.2567. [40] i. a. batalin, g. a. vilkovisky. gauge algebra and quantization. physics letters b 102(1):27–31, 1981. https://doi.org/10.1016/0370-2693(81)90205-7. [41] e. s. fradkin, a. a. tseytlin. renormalizable asymptotically free quantum theory of gravity. nuclear physics b 201(3):469–491, 1982. https://doi.org/10.1016/0550-3213(82)90444-8. [42] i. g.
avramidi, a. o. barvinsky. asymptotic freedom in higher derivative quantum gravity. physics letters b 159(4-6):269–274, 1985. https://doi.org/10.1016/0370-2693(85)90248-5. [43] l. s. brown, j. p. cassidy. stress tensor trace anomaly in a gravitational metric: general theory, maxwell field. physical review d 15(10):2810–2829, 1977. https://doi.org/10.1103/physrevd.15.2810. [44] l. modesto, l. rachwał, i. l. shapiro. renormalization group in super-renormalizable quantum gravity. the european physical journal c 78(7):555, 2018. https://doi.org/10.1140/epjc/s10052-018-6035-2. [45] m. asorey, f. falceto. on the consistency of the regularization of gauge theories by high covariant derivatives. physical review d 54(8):5290–5301, 1996. https://doi.org/10.1103/physrevd.54.5290. [46] m. asorey, f. falceto, l. rachwał. asymptotic freedom and higher derivative gauge theories. journal of high energy physics 2021:75, 2021. https://doi.org/10.1007/jhep05(2021)075. [47] wolfram research, mathematica, version 9.0, champaign, il (2012). https://www.wolfram.com/. [48] a. o. barvinsky, g. a. vilkovisky. covariant perturbation theory. 2: second order in the curvature. general algorithms. nuclear physics b 333(2):471–511, 1990. https://doi.org/10.1016/0550-3213(90)90047-h. [49] e. s. fradkin, a. a. tseytlin. conformal supergravity. physics reports 119(4-5):233–362, 1985. https://doi.org/10.1016/0370-1573(85)90138-3. [50] c. bambi, l. modesto, s. porey, l. rachwał. black hole evaporation in conformal gravity. journal of cosmology and astroparticle physics 2017(9):033, 2017. https://doi.org/10.1088/1475-7516/2017/09/033. [51] c. bambi, l. modesto, s. porey, l. rachwał. formation and evaporation of an electrically charged black hole in conformal gravity. the european physical journal c 78(2):116, 2018. https://doi.org/10.1140/epjc/s10052-018-5608-4. [52] p. jizba, l. rachwał, j. kňap. infrared behavior of weyl gravity: functional renormalization group approach. physical review d 101(4):044050, 2020. https://doi.org/10.1103/physrevd.101.044050. [53] p. jizba, l. rachwał, s. g. giaccari, j. kňap. dark side of weyl gravity. universe 6(8):123, 2020. https://doi.org/10.3390/universe6080123. [54] l. rachwał, s. giaccari. infrared behavior of weyl gravity. journal of physics: conference series 1956(1):012012, 2021. https://doi.org/10.1088/1742-6596/1956/1/012012. [55] c. bambi, l. modesto, l. rachwał. spacetime completeness of non-singular black holes in conformal gravity. journal of cosmology and astroparticle physics 2017(05):003, 2017. https://doi.org/10.1088/1475-7516/2017/05/003. [56] l. rachwał. conformal symmetry in field theory and in quantum gravity. universe 4(11):125, 2018. https://doi.org/10.3390/universe4110125. [57] p. d. mannheim. mass generation, the cosmological constant problem, conformal symmetry, and the higgs boson. progress in particle and nuclear physics 94:125–183, 2017. https://doi.org/10.1016/j.ppnp.2017.02.001. [58] p. d. mannheim. p t symmetry, conformal symmetry, and the metrication of electromagnetism. foundations of physics 47(9):1229–1257, 2017. https://doi.org/10.1007/s10701-016-0017-8. [59] d. anselmi, m. piva. perturbative unitarity of lee-wick quantum field theory. physical review d 96(4):045009, 2017. https://doi.org/10.1103/physrevd.96.045009. [60] d. anselmi, m. piva. a new formulation of lee-wick quantum field theory. journal of high energy physics 2017:66, 2017. https://doi.org/10.1007/jhep06(2017)066. [61] d. anselmi. fakeons and lee-wick models. 
journal of high energy physics 2018:141, 2018. https://doi.org/10.1007/jhep02(2018)141. [62] r. e. cutkosky, p. v. landshoff, d. i. olive, j. c. polkinghorne. a non-analytic s-matrix. nuclear physics b 12(2):281–300, 1969. https://doi.org/10.1016/0550-3213(69)90169-2. [63] t. d. lee, g. c. wick. negative metric and the unitarity of the s-matrix. nuclear physics b 9(2):209–243, 1969. https://doi.org/10.1016/0550-3213(69)90098-4. [64] t. d. lee, g. c. wick. finite theory of quantum electrodynamics. physical review d 2(6):1033–1048, 1970. https://doi.org/10.1103/physrevd.2.1033. [65] d. anselmi. fakeons, microcausality and the classical limit of quantum gravity. classical and quantum gravity 36(6):065010, 2019. https://doi.org/10.1088/1361-6382/ab04c8. [66] d. anselmi, m. piva. quantum gravity, fakeons and microcausality. journal of high energy physics 2018:21, 2018. https://doi.org/10.1007/jhep11(2018)021. [67] l. modesto, i. l. shapiro. superrenormalizable quantum gravity with complex ghosts. physics letters b 755:279–284, 2016. https://doi.org/10.1016/j.physletb.2016.02.021. [68] l. modesto. super-renormalizable or finite lee–wick quantum gravity. nuclear physics b 909:584–606, 2016. https://doi.org/10.1016/j.nuclphysb.2016.06.004. [69] c. m. bender, p. d. mannheim. no-ghost theorem for the fourth-order derivative pais-uhlenbeck oscillator model. physical review letters 100(11):110402, 2008. https://doi.org/10.1103/physrevlett.100.110402. [70] c. m. bender, p. d. mannheim. exactly solvable pt-symmetric hamiltonian having no hermitian counterpart. physical review d 78(2):025022, 2008. https://doi.org/10.1103/physrevd.78.025022. [71] a. v. smilga. benign versus malicious ghosts in higher-derivative theories. nuclear physics b 706(3):598–614, 2005. https://doi.org/10.1016/j.nuclphysb.2004.10.037. [72] a. v. smilga. supersymmetric field theory with benign ghosts. journal of physics a: mathematical and theoretical 47(5):052001, 2014. https://doi.org/10.1088/1751-8113/47/5/052001. [73] d. anselmi, a. marino. fakeons and microcausality: light cones, gravitational waves and the hubble constant. classical and quantum gravity 37(9):095003, 2020. https://doi.org/10.1088/1361-6382/ab78d2. [74] v. i. tkach.
towards ghost-free gravity and standard model. modern physics letters a 27(22):1250131, 2012. https://doi.org/10.1142/s0217732312501313. [75] m. kaku. strong coupling approach to the quantization of conformal gravity. physical review d 27(12):2819–2834, 1983. https://doi.org/10.1103/physrevd.27.2819. [76] e. tomboulis. 1n expansion and renormalization in quantum gravity. physics letters b 70(3):361–364, 1977. https://doi.org/10.1016/0370-2693(77)90678-5. [77] e. tomboulis. renormalizability and asymptotic freedom in quantum gravity. physics letters b 97(1):77–80, 1980. https://doi.org/10.1016/0370-2693(80)90550-x. [78] e. t. tomboulis. unitarity in higher derivative quantum gravity. physical review letters 52(14):1173–1176, 1984. https://doi.org/10.1103/physrevlett.52.1173. [79] f. d. o. salles, i. l. shapiro. do we have unitary and (super)renormalizable quantum gravity below the planck scale? physical review d 89(8):084054, 2014. [erratum: phys.rev.d 90, 129903 (2014)], https://doi.org/10.1103/physrevd.89.084054. [80] p. peter, f. de o. salles, i. l. shapiro. on the ghost-induced instability on de sitter background. physical review d 97(6):064044, 2018. https://doi.org/10.1103/physrevd.97.064044. [81] f. de o. salles, i. l. shapiro. recent progress in fighting ghosts in quantum gravity. universe 4(9):91, 2018. https://doi.org/10.3390/universe4090091. [82] m. christodoulou, l. modesto. note on reflection positivity in nonlocal gravity. jetp letters 109(5):286–291, 2019. https://doi.org/10.1134/s0021364019050011. [83] f. briscese, l. modesto. cutkosky rules and perturbative unitarity in euclidean nonlocal quantum field theories. physical review d 99(10):104043, 2019. https://doi.org/10.1103/physrevd.99.104043. [84] j. f. donoghue, g. menezes. unitarity, stability and loops of unstable ghosts. physical review d 100(10):105006, 2019. https://doi.org/10.1103/physrevd.100.105006. [85] m. asorey, l. rachwal, i. l. shapiro. unitarity issues in some higher derivative field theories. galaxies 6(1):23, 2018. https://doi.org/10.3390/galaxies6010023. [86] f. briscese, l. modesto. nonlinear stability of minkowski spacetime in nonlocal gravity. journal of cosmology and astroparticle physics 2019(07):009, 2019. https://doi.org/10.1088/1475-7516/2019/07/009. [87] f. briscese, g. calcagni, l. modesto. nonlinear stability in nonlocal gravity. physical review d 99(8):084041, 2019. https://doi.org/10.1103/physrevd.99.084041. 
https://doi.org/10.14311/ap.2022.62.0341 acta polytechnica 62(3):341–351, 2022 © 2022 the author(s). licensed under a cc-by 4.0 licence. published by the czech technical university in prague. applicability of secondary denitrification measures on a fluidized bed boiler jitka jeníková∗, kristýna michaliková, františek hrdlička, jan hrdlička, lukáš pilař, matěj vodička, pavel skopec czech technical university in prague, faculty of mechanical engineering, department of energy engineering, technická 4, prague 6, dejvice, 160 00, czech republic ∗ corresponding author: jitka.jenikova@fs.cvut.cz abstract. this article compares the performance of selective catalytic reduction (scr) and selective non-catalytic reduction (sncr) applied on the same pilot unit, a 500 kw fluidized bed boiler burning czech lignite. a correlation of the denitrification efficiency with the normalized stoichiometric ratio (nsr) is investigated. the fundamental principles of scr and sncr are similar, with the same reaction scheme; the difference is the use of a catalyst that lowers the activation energy of the key reaction, so that in the scr method the reduction proceeds at lower temperatures. during the experiments, the nsr was up to 1.6 for the scr method. for the sncr method, which has a higher reducing agent consumption, the maximum denitrification efficiency was reached at an nsr of about 2.5. the efficiency of both secondary methods was investigated: the denitrification efficiency during the experiments exceeded 98 % for the scr method, and the sncr method, together with the primary measures, reached an efficiency of 58 %. keywords: scr, sncr, fluidized bed boiler, denitrification, denox, coal. 1. introduction many countries rely on, and over the next few years will have to rely on, the combustion of fossil fuels for electricity and heat production.
the combustion of fossil fuels is associated with the production of pollutants that must be minimized in order to operate the technology with low environmental impact. nitrogen oxides are among the typical pollutants, and they are responsible for acid gas deposition, ozone depletion, and health effects on humans. in the field of pollutant reduction, the most important regulation is given by the bat (best available techniques) reference document for large combustion plants (lcp) [1], which describes the primary and secondary measures to reduce the release of nitrogen oxides from combustion plants to the atmosphere (so-called denitrification). these measures, as far as they are usable for fluidized bed boilers, are summarized in table 1 together with the corresponding general nox reduction efficiencies. this article focuses on the experimental investigation of secondary denitrification measures in a bubbling fluidized bed boiler using czech lignite as a fuel. reachable denitrification levels are analysed using the scr and sncr technologies. the mitigation of nitrogen oxides is important for more than just meeting the bat and emission standards. regarding the upcoming trends of lowering co2 emissions from energy conversion, combustion systems using fossil fuels can be extended by ccs/u technologies, most typically post-combustion systems or oxy-fuel combustion. the reduction of nox production is crucial for those technologies as well, since a high purity of co2 and low levels of acid-forming gases (like nitrogen oxides) are required.
table 1. nox reduction rates of primary and secondary measures [1]:
primary measures – nox reduction rate [%]:
low excess air firing: 10–44
air staging: 10–77
flue-gas recirculation (fgr): 20–60
reduction of the combustion air temperature: 20–30
secondary measures – nox reduction rate [%]:
selective catalytic reduction (scr): 80–95
selective non-catalytic reduction (sncr): 30–50
2. nitrogen oxides 2.1. formation of nox emissions there are three known mechanisms of nitrogen oxide formation in combustion processes [2–4]: • thermal nox – oxidation of molecular nitrogen from the oxidant at high temperatures, known as the zeldovich mechanism, • fuel nox – oxidation of chemically bound nitrogen in solid fuels, • prompt nox – reactions of molecular nitrogen with hydrocarbon radicals, with subsequent oxidation of the intermediate products in high-temperature reducing flame zones, known as the fenimore mechanism. nox from conventional coal combustion typically consists of nitric oxide (no) and nitrogen dioxide (no2), where no is dominant with a share of about 90 % or more. the dominating formation mechanisms depend on the type of combustor: in high-temperature systems, e.g. pulverized coal combustion, the zeldovich and fenimore mechanisms are more prominent, while fuel-n oxidation dominates in fluidized beds. the fuel-n mechanism is only weakly dependent on the combustion temperature, and there is a proportional correlation with the oxygen stoichiometry [5–8]. in addition to no and no2, a significant production of nitrous oxide (n2o) can also be observed. although n2o is not part of the nox emission limits and measures for its reduction are not part of bat, it is a gas of importance due to its high gwp [9].
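as a quick orientation on why n2o matters despite being outside the nox limits, one can convert an n2o emission into a co2-equivalent via its global warming potential; the round gwp value of 300 used below is only an assumed order of magnitude (exact values depend on the assessment report and the time horizon):

    GWP_N2O = 300.0   # approximate 100-year global warming potential of n2o (assumed round value)

    def co2_equivalent_kg(n2o_kg):
        # co2-equivalent mass corresponding to an emitted mass of n2o
        return n2o_kg * GWP_N2O

    print(co2_equivalent_kg(1.0), "kg co2-eq per kg of n2o")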
the measured n2o emissions from coal combustion systems (except fluidized bed) as the ratio of n2o/nox emissions are typically less than 2 percent [4]. for coal-fired fluidized bed combustors, n2o emissions are within the range of 17 to 48 % of overall nox emissions [4]. n2o is produced in fluidized bed boilers due to its dependence on the bed temperature. a higher temperature leads to lower n2o emissions, which is the reverse of the bed temperature dependence of no formation [10]. the amount of conversion of fuelbound nitrogen to no and n2o is considered to be roughly constant as shown by de las obras-loscertales et al. [11]. 2.2. denitrification methods denitrification is a general term for a nox limitation. technologies for nox reduction can be categorized into primary measures that consist of modifying the operating parameters of combustion, leading to a suppression of the formation mechanisms, and secondary measures. secondary measures represent flue gas treatment leading to the reduction of nox already formed. those technologies can be used independently or in combinations. 2.2.1. primary measures the primary measures are typically most effective for the zeldovich and fenimore mechanisms, and they are focused on reducing the oxygen available in the combustion zone and reducing the peak temperatures. primary measures technologies include air or fuel staging, low nox burners, and flue gas recirculation systems. when solid fuels are burned in fluidized bed boilers, the relevant measures to reduce the nox production are those that focus on fuel-n-originating nox. as explained in section 2.1, the fuel-nox mechanism is mostly dependent on the concentration of oxygen in the primary combustion zone. therefore, the most effective measures aim only at lowering the stoichiometry of the primary air and not at lowering the combustion temperature, since the fluidized bed temperature is inevitably too low for the thermal and prompt mechanism to occur. in particular, the only primary measure, which is not an inherent part of the fluidized bed combustion control process, is the staged injection of combustion air. it is used to achieve the required combustion parameters, such as sub-stoichiometric conditions in the dense bed (which decrease the nox formation), simultaneous combustion of the unburned co in the freeboard section, and increase of the freeboard temperature for efficient injection of the reducing agent. when secondary air is used to burn unburned co, no more nitrogen oxides are formed in the freeboard section [3, 12–16]. air staging has been shown to be an effective primary measure for nox reduction in a fluidized bed boiler; for example, lupiáñez et al. [13] observed a 40 % reduction in nox for a 20 % secondary air ratio as compared to nox without air staging. however, air staging shows insufficient nox reduction rates to meet the emission limits defined in the lcp directive, and secondary measures have to be applied. 2.2.2. secondary measures secondary measures, also called post-combustion methods, represent a group of chemical processes in which already formed nitrogen oxides are decomposed into molecular nitrogen and water vapor using a reducing agent. typical reducing agents are ammonia and urea solutions. selective non-catalytic reduction and selective catalytic reduction are the basic secondary methods. other processes developed to date, such as simultaneous denitrification and desulphurization methods or wet scrubbing, have not been applied on a larger scale [7, 17, 18]. 
sncr
the selective non-catalytic reduction is a method that reduces nitrogen oxides in the absence of a catalyst. the process is based on the following reaction [18]:
4 nh3 + 4 no + o2 → 4 n2 + 6 h2o (1)
to achieve a sufficient no to n2 conversion, a reaction temperature of 900 °c is required according to the calculation of the gibbs energy. the typical temperature window for sncr in industrial applications is between 850 and 1100 °c. when the reducing agent is injected into a low-temperature region, the nitrogen oxides do not react with the nh2 radical due to the low reaction rate, and unreacted ammonia leaves the combustor with the flue gas. as a result, the concentration of ammonia in the flue gas increases, and it may also be adsorbed on fly ash particles. on the other hand, when the reducing agent is injected above the upper boundary of the temperature range, the nh2 radical preferentially begins to react with oxygen, resulting in an increase in the nox concentration in the flue gas. the efficiency of this method is, therefore, highly dependent on the injection of the reducing agent at the right temperature, which varies with the reducing agent used: for ammonia, the optimum is in the range of 850–1000 °c, and for urea, between 950–1100 °c [4–6, 17].
figure 1. possible locations of secondary denitrification measures.
scr
the fundamental principle of the selective catalytic reduction is similar to that of the non-catalytic reduction, with the same reaction scheme. in this method, a catalyst is used that lowers the activation energy of the key reaction. as a result, the reduction can be performed at lower temperatures, and there is no need to keep the reacting substances for the necessary period of time in the high-temperature region. depending on the type of catalyst, the temperature window can be 250 °c to 600 °c (for zeolite catalysts) [7], but the most commonly used vanadium pentoxide catalyst on the titanium dioxide carrier has an optimal temperature window of 250–430 °c with an achievable denitrification efficiency of more than 90 % [3, 7, 17, 19]. the lower temperature limit is set by the reaction rate and by the formation and deposition of ammonium sulphate salt, which may deposit on the catalyst and cause its temporary deactivation. the upper temperature limit is established by physical damage to the catalyst by sintering and by oxidation of nh3 to no, thus limiting the nox conversion, and by supersaturation of the catalyst that leads to an excess of unreacted ammonia, which escapes along with the combustion gas [4, 18, 20]. the latest v2o5-based scr catalysts are produced with the addition of tungsten trioxide (wo3) and molybdenum trioxide (moo3), which are added for the expansion of the optimal temperature window and because of their ability to resist catalyst poisoning. these are applied by impregnation on a tio2 support that has a good resistance to sulfur oxides. this support is coated on the ceramic skeleton of the catalyst body. vanadium catalysts work best at temperatures of about 350 °c [5]. at lower temperatures, their efficiency decreases rapidly, and at higher temperatures, corrosion problems arise [2, 7, 19]. the catalyst can be placed at different locations along the flue gas conduit, as shown in figure 1, and the placement depends on its type and material. it is not appropriate to place the catalyst in the high-dust region for the fluidized bed combustion while using the dry additive desulfurization method, because of the high abrasion properties of the present particles.
3. experimental set-up
3.1. experimental facility
the experimental boiler is located in the ctu laboratories in prague. this pilot unit is a fluidized bed boiler with a thermal output of 500 kw, and its scheme is shown in figure 2. fluidization is achieved by primary air together with recirculated flue gas passing through the distributor, which consists of 36 nozzles. the distributor is described in detail in [21] and the boiler in [22]. the combustion chamber has a cylindrical cross section. in the freeboard area, there are 6 thermocouples placed along the height. the secondary air is supplied to the freeboard section by four distributors evenly placed on the perimeter, and each distributor can provide a secondary air inlet at 4 different heights. the heat exchanger is located in the second descending draft of the boiler. the flue gases are sampled downstream of the boiler prior to the cyclone particle separator, and their composition is continuously analysed. in particular, the volumetric fractions of the following components are measured: o2 using a paramagnetic sensor; so2, nox, co2 and co using ndir analysers. the boiler can also be operated in oxy-fuel mode. the off-gas was also sampled downstream of the denox unit and analysed using the multicomponent ft-ir analyser.
figure 2. fluidized bed boiler scheme.
the sncr reducing agent distribution line basically consists of two main components: a probe with a spray nozzle and a system for transporting the reducing agent to the spray system. the probe is cooled by water to prevent the reducing agent from boiling before it is sprayed. compressed air is introduced in front of the nozzle orifice to improve the atomization of the supplied reducing agent. it is possible to place the probe in various inspection holes in the combustion space of the boiler and thus change the height of the injection of the reducing agent. for the experiments, secondary air inlets at a height of 550 mm above the fluidized bed were used to achieve the optimal temperature window.
the catalyst for the scr method has dimensions of 160 mm × 160 mm × 1260 mm. the reduction of nox in the flue gas is carried out by means of ammonia, which is dosed into the flue gas stream before the reactor itself. the technology is connected to the output of a cyclone separator of the fluidized bed boiler. dedusted flue gases at a temperature of 150–180 °c are heated in an electric heater to the required temperature of 250–300 °c. the amount of flue gas that passes through the reactor is approximately 150 nm3/hour, at a velocity of 4.5 m/s through the catalyst. it is necessary to inject the ammonia gas into the flue gas for the reaction on the catalyst surface. the stoichiometric amount of ammonia is approximately 0.012–0.016 kg/h. the catalyst itself is a honeycomb type based on vanadium pentoxide (v2o5) with the addition of tungsten trioxide (wo3) and molybdenum trioxide (moo3). doping is applied by impregnation of the tio2 supporting body and is used to improve the mechanical stability and chemical resistance of the catalyst, which is related to the widening of the optimal temperature window.
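as a rough cross-check of the dosing figures above, the stoichiometry of reaction (1) — one mole of nh3 per mole of no — can be turned into a reagent feed estimate. the following python sketch is illustrative only: it assumes nox is expressed as no2-equivalent (the usual convention for emission concentrations), and the 250 mg/nm3 inlet concentration in the example is an assumed value, not an operating point reported here.

```python
# a minimal sketch, assuming nox reported as no2-equivalent and a 1:1
# nh3:no molar ratio from reaction (1); input values are illustrative.
M_NO2 = 46.0  # g/mol, molar mass used for no2-equivalent nox
M_NH3 = 17.0  # g/mol

def nh3_feed_kg_per_h(c_nox_mg_per_nm3, flue_gas_nm3_per_h, nsr=1.0):
    """reagent feed for a given nox load and normalized stoichiometric ratio."""
    n_nox = c_nox_mg_per_nm3 * 1e-3 * flue_gas_nm3_per_h / M_NO2  # mol/h of nox
    return nsr * n_nox * M_NH3 * 1e-3                             # kg/h of nh3

# an assumed 250 mg/nm3 inlet at the 150 nm3/h reactor flow quoted above:
print(round(nh3_feed_kg_per_h(250.0, 150.0), 3))  # ~0.014 kg/h
```

for such an assumed inlet concentration, the estimate falls within the 0.012–0.016 kg/h stoichiometric ammonia feed stated above.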
3.2. fuel and reducing agents
lignite from the coal basin of north bohemia was used as fuel for the experiments. its proximate and ultimate analysis is shown in table 2. the size of the coal particles was less than 10 mm.

properties "as received":   lhv 18.5 mj/kg,  water 25.0 wt. %,  ash 9.3 wt. %
properties "dry ash free":  c 72.3,  h 6.3,  o 19.0,  n 1.1,  s 1.3 (all wt. %)
table 2. proximate and ultimate analysis of the fuel.

pure ammonia was used as the scr reducing agent and adblue (a chemically highly pure aqueous solution of synthetic urea – 32.5 % wt. urea) was used for the sncr.
3.3. methods
the normalized stoichiometric ratio is the molar ratio of the reducing agent and the nitrogen oxides at the beginning of the denitrification process. the ranges of the measured variables were as follows; a detailed description of the measured variables is given in tables 4 and 5 in the appendix, where the o2 concentrations are related to 6 % vol. of o2 in dry flue gas.
for the sncr method:
• nsr values from 0.55 to 3.47,
• application of primary measures (flue gas recirculation and air staging),
• temperature for reducing agent injection from 880 to 950 °c,
• the average time in one setting was 45 min.
for the scr method:
• nsr values from 0.29 to 1.6,
• application of primary measures (flue gas recirculation),
• catalyst temperatures from 259 to 299 °c,
• the average time in one setting was 74 min.
individual states were set maintaining a constant temperature while the injection of the reducing agent was gradually changed. the urea solution was chosen for the sncr due to the properties of ammonia, which has very strict storage rules and therefore is not used in large industrial plants for the sncr method. nevertheless, the fundamental reaction scheme for scr and sncr and both used reagents remain similar. furthermore, the experiments performed are closer to practical results due to the size of the experimental boiler.
4. results and discussion
measured data are presented in table 3 and figure 4. more detailed data with their variations are listed in the appendix. the co emissions were also measured; however, no significant correlation between co and nox was observed, as can be seen in figure 3.
figure 3. summarization of the experimental results for scr and sncr.
figure 4. achievable nox level for scr and sncr methods in nsr for the same bfb boiler. *bat-aels for nox emissions from combustion of black and/or brown coal into the atmosphere for new combustion plants with a total rated thermal output < 100 mwth of 100–150 mg/nm3 [23].

sncr
no.  temperature [°c]  nox before sncr [mg/nm3]  nox after sncr [mg/nm3]  nox conversion [%]  nsr [mol/mol]  nh3 slip [mg/nm3]
1    959               321                       250                      22.1                1.00           n. m.
2                                                238                      26.0                2.10           n. m.
3                                                228                      29.0                3.10           n. m.
4                                                209                      35.1                4.10           n. m.
5    883               459                       392                      14.7                0.55           n. m.
6                                                303                      34.1                1.10           n. m.
7                                                267                      41.9                1.65           n. m.
8                                                257                      43.9                2.20           n. m.
9                                                242                      47.2                2.64           n. m.
10   886               408                       356                      12.9                0.62           n. m.
11                                               293                      28.2                1.30           n. m.
12                                               249                      39.0                1.89           n. m.
13                                               179                      56.3                2.50           n. m.
14                                               170                      58.4                3.00           n. m.
15   868               424                       388                      8.7                 0.43           0.3
16                                               343                      19.2                0.87           0.2
17                                               306                      28.0                1.30           0.1
18                                               284                      33.0                1.70           0.2
19                                               269                      36.7                3.47           0.4
20   948               461                       450                      2.3                 0.57           0.0
21                                               437                      5.2                 1.12           0.0
22                                               426                      7.7                 1.70           0.0
23                                               393                      14.7                2.70           0.8
24   925               498                       458                      8.0                 0.41           0.1
25                                               450                      9.5                 0.81           0.0
26                                               410                      17.7                1.23           0.0
27                                               364                      26.9                1.97           0.3
scr
no.  temperature [°c]  nox before scr [mg/nm3]  nox after scr [mg/nm3]  nox conversion [%]  nsr [mol/mol]  nh3 slip [mg/nm3]
1    260               692                      544                     21.0                0.29           0
2                      705                      386                     45.0                0.59           0
3                      698                      305                     56.0                0.75           0
4                      593                      132                     77.6                0.91           0
5                      533                      31                      94.1                1.17           0.1
6                      613                      12                      98.0                1.29           0.9
7    290               518                      168                     67.4                0.81           0.2
8                      519                      158                     71.1                0.83           0.3
9                      504                      79                      85.2                0.99           0.2
10                     528                      124                     76.7                1.04           0.2
11                     487                      47                      90.4                1.09           0.2
12                     477                      18                      96.2                1.28           0.3
13                     486                      36                      93.0                1.35           0.0
14   300               520                      144                     72.4                1.00           0.1
15                     531                      75                      85.9                1.21           0.1
16                     519                      40                      92.3                1.43           0.1
17                     539                      31                      94.1                1.60           0.1
table 3. summarization of the experimental results for scr and sncr. (n. m. – not measured)

4.1. reducing agent excess
as shown in figure 4, there is a significant difference between the excess of reducing agent needed to achieve the same nox reduction efficiency for the scr and sncr denitrification methods. as can be seen from figure 4, the experiments confirmed that for scr, a lower excess of the reducing agent is needed to reach the same nox level after the secondary denitrification method, because the catalyst reduces the activation energy of the chemical process. according to [24], the reducing agent stoichiometry for sncr has its optimum values between 1.5–2.5. the experiments showed the highest efficiency values when using an nsr around 2.5, but it is always necessary to monitor the ammonia slip in the flue gas, which is subject to the bat-associated emission level. according to bat, the level of nh3 emissions into the air from the use of scr and/or sncr is < 3–10 mg/nm3 [1]. in the case of plants that burn biomass and operate at variable loads, the upper end of the bat-ael range is 15 mg/nm3 [1]. some of the unreacted ammonia is converted to ammonia salts and bound to the fly ash, which then exhibits an unwanted odour and could become inapplicable for future use. in addition, leachable ammonia salts can restrict its application as well. stricter requirements are therefore imposed by fly ash buyers who use it in the construction industry. therefore, large combustion sources require values as low as 7 mg/nm3 for nh3 in the flue gas after the esp, although ammonia can already be smelled at values of 5 mg/nm3. for these reasons, the feeding of higher amounts of the reducing agent, and thus a higher nsr, is not desirable. a nox reduction efficiency of more than 90 % can be reached with an nsr greater than 1.1 for the scr method. a further increase in the reducing agent feed is not desirable, because the catalyst would be supersaturated and the excess unreacted ammonia would escape along with the flue gas, causing the above-mentioned problems with the ash utilisation. in general, the sncr method requires higher doses of the reducing agent for the same required nox level in the flue gas.
4.2. scr and sncr efficiency
the nox reduction efficiency varied between 2 % and 58 % for the sncr method and between 21 % and 98 % for the scr method. the lower value corresponds to the lowest injection rate of the reducing agent for both methods. for the sncr method, the best results were achieved for temperatures between 880 and 890 °c and nsr between 2.2 and 3.0, when the efficiency reached 44 to 58 %. for the scr method, efficiencies higher than 90 % were reached for all catalyst temperatures, while the nsrs were between 1.1 and 1.6. the results agree with the bat conclusions as stated in section 1. from table 3, it can be seen that the primary measures reduce the input nox concentrations for the sncr method to values between 321 and 498 mg/nm3. the primary measures on this boiler are used mainly to increase the temperature in the freeboard section and thus to ensure that the optimum temperatures for nox reduction are reached. the scr method was tested with flue gas recirculation (in order not to lower the nox emission but to keep the combustion process stable), and the initial nox concentrations ranged between 477 and 705 mg/nm3. the nitrogen emission concentrations after the denitrification measures were between 170 and 458 mg/nm3 for the sncr method and between 12 and 544 mg/nm3 for the scr method (the highest value is for an insufficiently low nsr of 0.3).
4.3. scr and sncr efficiency correlation with temperature
there is a clear effect of temperature on the sncr method at the feeding point of the reducing agent. in figure 5, it can be seen that higher efficiencies are achieved at temperatures up to about 900 °c, and with increasing temperature, the efficiency decreases at the same nsrs. this corresponds to the theoretical knowledge as mentioned in section 2.2.2, where the optimal temperature window for urea is said to be between 850 and 1000 °c. in contrast, in the case of the scr method, the correlation with temperature is minimal and the efficiency basically depends only on the nsr, as shown in figure 6.
figure 5. correlation of sncr efficiency and temperature.
figure 6. correlation of scr efficiency and temperature.
4.4. scr and sncr comparison for the same nox input concentrations
a specific comparison was made for the same nox input concentrations. the inlet nitrogen oxides concentration was kept at 500 mg/nm3, and no primary measures were used with the purpose of lowering the nitrogen oxides levels. different nsrs were used, and the trend of the nox reduction according to the nsr can be seen in figure 7.
figure 7. comparison of the scr and sncr methods used within the same bfb boiler.
the maximum nsr for the sncr method was 1.97. from other experimental results, we can assume that with a greater injection of the reducing agent and a lower temperature of the fluidized bed, lower nitrogen emissions could be achieved. in general, from the results it can be seen that a lower excess of the reducing agent is needed for the scr method. the decision of whether to use the scr or sncr denitrification method depends on the final required level of nitrogen oxides and on the consideration of investment and operating costs.
5. conclusions
the experimental results show the denitrification possibilities applied on the fluidized bed boiler with a thermal output of 500 kw. the size of the experimental equipment is the biggest benefit of the performed experiments. the combustion of various fuel types and the generation of emissions have already been investigated, but mainly in experimental reactors with a diameter of 100–150 mm and with laboratory-made flue gas mixtures [16, 25]. the initial nox concentrations in the experimental boiler with lignite combustion range from 321 mg/nm3 to 705 mg/nm3. the correlation of the denitrification efficiency and the nsr was investigated. it has been found that the scr needs a lower nsr than the sncr method to reach the same efficiency.
the lower need for the reducing agent corresponds to the higher efficiency of the method, because the catalyst reduces the activation energy of the reaction. in particular, an nsr up to 1.6 was used for the scr method. with higher nsrs, the ammonia slip could become too high and the ash could be degraded in the practical use of high-dust catalysts. in contrast, the sncr method has a higher reducing agent consumption, and the best denitrification results were achieved for an nsr around 2.5. for the efficiency of both denitrification methods, the results are as follows: the sncr method, together with primary measures (flue gas recirculation and air staging), reaches an efficiency of 58 %, and the efficiency of the scr method exceeds 98 %.
acknowledgements
this work was supported by the project from research center for low-carbon energy technologies, cz.02.1.01/0.0/0.0/16_019/0000753, which is gratefully acknowledged.
list of symbols
bat best available techniques
ctu czech technical university
ft-ir fourier transform infrared spectroscopy
lcp large combustion plants
moo3 molybdenum trioxide
ndir non-dispersive infrared
nh3 ammonia
nsr normalized stoichiometric ratio
no nitric oxide
nox nitrogen oxide emissions
no2 nitrogen dioxide
n2o nitrous oxide
scr selective catalytic reduction
sncr selective non-catalytic reduction
tio2 titanium dioxide
v2o5 vanadium pentoxide
wo3 tungsten trioxide
references
[1] t. lecomte, j. ferrería de la fuente, f. neuwahl, et al. best available techniques (bat) reference document for large combustion plants. eur 28836 en. publications office of the european union, 2017. https://doi.org/10.2760/949.
[2] j. vejvoda, p. machač, p. buryan. technologie ochrany ovzduší a čištění odpadních plynů. university of chemistry and technology, 2003. isbn 80-708-0517-x.
[3] f. normann, k. andersson, b. leckner, f. johnsson. emission control of nitrogen oxides in the oxy-fuel process. progress in energy and combustion science 35(5):385–397, 2009. https://doi.org/10.1016/j.pecs.2009.04.002.
[4] c. t. bowman. control of combustion-generated nitrogen oxide emissions: technology driven by regulation. symposium (international) on combustion 24(1):859–878, 1992. https://doi.org/10.1016/s0082-0784(06)80104-9.
[5] j. hemerka, p. vybíral. ochrana ovzduší. czech technical university in prague, 2010. isbn 978-80-01-04646-3.
[6] p. machač, e. baraj. a simplified simulation of the reaction mechanism of nox formation and non-catalytic reduction. combustion science and technology 190(6):967–982, 2018. https://doi.org/10.1080/00102202.2017.1418335.
[7] p. forzatti. present status and perspectives in de-nox scr catalysis. applied catalysis a: general 222(1-2):221–236, 2001. https://doi.org/10.1016/s0926-860x(01)00832-8.
[8] k. el sheikh, m. j. h. khan, m. diana hamid, et al. advances in reduction of nox and n2o emission formation in an oxy-fired fluidized bed boiler. chinese journal of chemical engineering 27(2):426–443, 2019. https://doi.org/10.1016/j.cjche.2018.06.033.
[9] ipcc, 2007. climate change 2007: synthesis report. contribution of working groups i, ii and iii to the fourth assessment report of the intergovernmental panel on climate change [core writing team and pachauri, r. k. and reisinger, a. (eds.)]. ipcc, geneva, switzerland, 2007. isbn 92-9169-122-4.
[10] l. e. aamand, b. leckner, s. andersson. formation of nitrous oxide in circulating fluidized-bed boilers. energy & fuels 5(6):815–823, 1991. https://doi.org/10.1021/ef00030a008.
[11] m. de las obras-loscertales, t. mendiara, a. rufas, et al. no and n2o emissions in oxy-fuel combustion of coal in a bubbling fluidized bed combustor. fuel 150:146–153, 2015. https://doi.org/10.1016/j.fuel.2015.02.023.
[12] m. vodička, j. hrdlička, p. skopec. experimental study of the nox reduction through the staged oxygen supply in the oxy-fuel combustion in a 30 kwth bubbling fluidized bed. fuel 286:119343, 2021. https://doi.org/10.1016/j.fuel.2020.119343.
[13] c. lupiáñez, l. i. díez, l. m. romeo. influence of gas-staging on pollutant emissions from fluidized bed oxy-firing. chemical engineering journal 256:380–389, 2014. https://doi.org/10.1016/j.cej.2014.07.011.
[14] m. de las obras-loscertales, a. rufas, l. de diego, et al. effects of temperature and flue gas recycle on the so2 and nox emissions in an oxy-fuel fluidized bed combustor. energy procedia 37:1275–1282, 2013. https://doi.org/10.1016/j.egypro.2013.06.002.
[15] t. czakiert, z. bis, w. muskala, w. nowak. fuel conversion from oxy-fuel combustion in a circulating fluidized bed. fuel processing technology 87(6):531–538, 2006. https://doi.org/10.1016/j.fuproc.2005.12.003.
[16] w. moroń, w. rybak. nox and so2 emissions of coals, biomass and their blends under different oxy-fuel atmospheres. atmospheric environment 116:65–71, 2015. https://doi.org/10.1016/j.atmosenv.2015.06.013.
[17] x. cheng, x. t. bi. a review of recent advances in selective catalytic nox reduction reactor technologies. particuology 16:1–18, 2014. https://doi.org/10.1016/j.partic.2014.01.006.
[18] f. gholami, m. tomas, z. gholami, m. vakili. technologies for the nitrogen oxides reduction from flue gas: a review. science of the total environment 714:136712, 2020. https://doi.org/10.1016/j.scitotenv.2020.136712.
[19] y. gao, t. luan, t. lü, et al. performance of v2o5–wo3–moo3/tio2 catalyst for selective catalytic reduction of nox by nh3. chinese journal of chemical engineering 21(1):1–7, 2013. https://doi.org/10.1016/s1004-9541(13)60434-6.
[20] l. olsson, h. sjövall, r. j. blint. a kinetic model for ammonia selective catalytic reduction over cu-zsm-5. applied catalysis b: environmental 81(3-4):203–217, 2008. https://doi.org/10.1016/j.apcatb.2007.12.011.
[21] j. hrdlicka, p. skopec, f. hrdlicka. trough air distributor for a bubbling fluidized bed boiler with isobaric nozzles. in proceedings of the 22nd international conference on fluidized bed conversion, vol. 1. turku, finland, 2015. isbn 978-952-12-3222-0.
[22] p. skopec, j. hrdlička, j. opatřil, j. štefanica. nox emissions from bubbling fluidized bed combustion of lignite coal. acta polytechnica 55(4):275–281, 2015. https://doi.org/10.14311/ap.2015.55.0275.
[23] prováděcí rozhodnutí komise (eu) 2021/2326 [commission implementing decision (eu) 2021/2326]. úřední věstník evropské unie l 469:1–81, 2021. https://eur-lex.europa.eu/legal-content/cs/txt/pdf/?uri=celex:32021d2326&from=en.
[24] m. mladenović, m. paprika, a. marinković. denitrification techniques for biomass combustion. renewable and sustainable energy reviews 82:3350–3364, 2018. https://doi.org/10.1016/j.rser.2017.10.054.
[25] y. hu, s. naito, n. kobayashi, m. hasatani. co2, nox and so2 emissions from the combustion of coal with high oxygen concentration gases. fuel 79(15):1925–1932, 2000. https://doi.org/10.1016/s0016-2361(00)00047-8.
a. appendix
sncr
no.  temperature [°c]  nox before sncr [mg/nm3]  nox after sncr [mg/nm3]  nox conversion [%]  nsr [mol/mol]  o2 [%]      nh3 slip [mg/nm3]
1    959 ± 5           321 ± 31                  250 ± 12                 22.1 ± 2.3          1.00           5.0 ± 0.2   n. m.
2                                                238 ± 9                  26.0 ± 1.0          2.10           6.5 ± 0.7   n. m.
3                                                228 ± 14                 29.0 ± 0.8          3.10           7.1 ± 0.2   n. m.
4                                                209 ± 9                  35.1 ± 1.7          4.10           6.6 ± 0.2   n. m.
5    883 ± 10          459 ± 29                  392 ± 33                 14.7 ± 7.2          0.55           6.9 ± 0.6   n. m.
6                                                303 ± 14                 34.1 ± 3.0          1.10           5.9 ± 0.3   n. m.
7                                                267 ± 15                 41.9 ± 3.3          1.65           5.7 ± 0.3   n. m.
8                                                257 ± 14                 43.9 ± 3.4          2.20           5.7 ± 0.3   n. m.
9                                                242 ± 10                 47.2 ± 2.3          2.64           5.9 ± 0.3   n. m.
10   886 ± 8           408 ± 22                  356 ± 25                 12.9 ± 3.3          0.62           5.3 ± 0.4   n. m.
11                                               293 ± 29                 28.2 ± 3.6          1.30           4.6 ± 0.4   n. m.
12                                               249 ± 23                 39.0 ± 3.1          1.89           4.5 ± 0.4   n. m.
13                                               179 ± 37                 56.3 ± 3.9          2.50           4.7 ± 0.7   n. m.
14                                               170 ± 11                 58.4 ± 0.6          3.00           3.6 ± 0.3   n. m.
15   868 ± 20          424 ± 33                  388 ± 25                 8.7 ± 3.6           0.43           6.5 ± 0.8   0.3 ± 0.1
16                                               343 ± 19                 19.2 ± 1.7          0.87           5.8 ± 0.5   0.2 ± 0.1
17                                               306 ± 15                 28.0 ± 1.1          1.30           5.5 ± 0.6   0.1 ± 0.1
18                                               284 ± 13                 33.0 ± 1.2          1.70           5.1 ± 0.4   0.2 ± 0.1
19                                               269 ± 22                 36.7 ± 2.0          3.47           5.5 ± 0.5   0.4 ± 0.2
20   948 ± 29          461 ± 7                   450 ± 22                 2.3 ± 3.4           0.57           6.4 ± 0.5   0.0 ± 0.1
21                                               437 ± 38                 5.2 ± 1.2           1.12           5.4 ± 1.8   0.0
22                                               426 ± 27                 7.7 ± 2.1           1.70           4.3 ± 0.3   0.0
23                                               393 ± 12                 14.7 ± 0.8          2.70           4.4 ± 0.4   0.8 ± 0.3
24   925 ± 11          498 ± 8                   458 ± 12                 8.0 ± 2.6           0.41           7.0 ± 0.3   0.1 ± 0.1
25                                               450 ± 8                  9.5 ± 1.5           0.81           7.2 ± 0.3   0.0
26                                               410 ± 12                 17.7 ± 2.6          1.23           7.2 ± 0.2   0.0
27                                               364 ± 43                 26.9 ± 8.8          1.97           7.3 ± 0.8   0.3 ± 0.2
table 4. experimental results for sncr. (n. m. – not measured)
scr
no.  temperature [°c]  nox before scr [mg/nm3]  nox after scr [mg/nm3]  nox conversion [%]  nsr [mol/mol]  o2 [%]       nh3 slip [mg/nm3]
1    260 ± 2           692 ± 74                 544 ± 58                21.0 ± 9.9          0.29 ± 0.02    10.1 ± 0.5   0
2                      705 ± 60                 386 ± 46                45.0 ± 7.3          0.59 ± 0.03    10.5 ± 0.9   0
3                      698 ± 50                 305 ± 38                56.0 ± 6.5          0.75 ± 0.03    10.6 ± 0.7   0
4                      593 ± 54                 132 ± 20                77.6 ± 3.0          0.91 ± 0.07    9.8 ± 0.7    0
5                      533 ± 70                 31 ± 15                 94.1 ± 2.5          1.17 ± 0.12    9.7 ± 0.8    0.1 ± 0.2
6                      613 ± 73                 12 ± 4                  98.0 ± 0.9          1.29 ± 0.09    10.5 ± 0.9   0.9 ± 0.2
7    290 ± 1           518 ± 40                 168 ± 24                67.4 ± 4.3          0.81 ± 0.06    9.8 ± 0.6    0.2 ± 0.1
8                      519 ± 19                 158 ± 11                71.1 ± 1.8          0.83 ± 0.04    9.5 ± 0.5    0.3 ± 0.1
9                      504 ± 19                 79 ± 9                  85.2 ± 1.7          0.99 ± 0.05    9.3 ± 0.5    0.2 ± 0.1
10                     528 ± 45                 124 ± 24                76.7 ± 3.6          1.04 ± 0.15    10.7 ± 1.4   0.2 ± 0.1
11                     487 ± 28                 47 ± 9                  90.4 ± 1.9          1.09 ± 0.06    8.4 ± 0.4    0.2 ± 0.1
12                     477 ± 46                 18 ± 7                  96.2 ± 1.4          1.28 ± 0.10    8.4 ± 0.6    0.3 ± 0.1
13                     486 ± 18                 36 ± 8                  93.0 ± 1.6          1.35 ± 0.05    9.0 ± 0.5    0.0 ± 0.1
14   300 ± 1           520 ± 37                 144 ± 17                72.4 ± 2.4          1.00 ± 0.06    9.2 ± 0.5    0.1 ± 0.1
15                     531 ± 15                 75 ± 5                  85.9 ± 1.0          1.21 ± 0.06    9.4 ± 0.2    0.1 ± 0.1
16                     519 ± 15                 40 ± 7                  92.3 ± 1.2          1.43 ± 0.07    9.3 ± 0.4    0.1 ± 0.1
17                     539 ± 30                 31 ± 6                  94.1 ± 1.2          1.60 ± 0.09    9.5 ± 0.4    0.1 ± 0.1
table 5. experimental results for scr.

https://doi.org/10.14311/ap.2022.62.0361 acta polytechnica 62(3):361–369, 2022 © 2022 the author(s). licensed under a cc-by 4.0 licence published by the czech technical university in prague
experimental verification of the efficiency of selective non-catalytic reduction in a bubbling fluidized bed combustor
kristýna michaliková∗, jan hrdlička, matěj vodička, pavel skopec, jitka jeníková, lukáš pilař
czech technical university in prague, faculty of mechanical engineering, department of energy engineering, technická 4, 166 07 prague, czech republic
∗ corresponding author: kristyna.michalikova@fs.cvut.cz
abstract. controlling nitrogen oxide (nox) emissions is still a challenge, as increasingly stringent emission limits are introduced. strict regulations will lead to the need to introduce secondary measures even for boilers with a bubbling fluidized bed (bfb), which are generally characterized by low nox emissions. selective non-catalytic reduction has lower investment costs compared to other secondary measures for nox reduction, but the temperatures for its efficient utilization are difficult to achieve in bfbs. this paper studies the possibility of an effective application of selective non-catalytic reduction (sncr) of nitrogen oxides in a pilot-scale facility with a bubbling fluidized bed. the effect of temperatures between 880 and 950 °c in the reagent injection zone on the nox reduction was investigated. for the selected temperature, the effect of the amount of injected reagent, a urea solution with a concentration of 32.5 % wt., was studied. the experiments were carried out using a 500 kwth pilot-scale bfb unit combusting lignite. in addition, an experiment was performed with the combustion of wooden pellets.
with reagent injection, all experiments led to the reduction of nitrogen oxides, and the highest nox reduction of 58 % was achieved.
keywords: selective non-catalytic reduction, sncr, fluidized bed, bfb, denitrification.
1. introduction
the future of the energy industry is inextricably linked to the need to prevent the release of pollutants and greenhouse gases. one of the monitored pollutants is nitrogen oxides, which is the collective term for nitric oxide (no) and nitrogen dioxide (no2). their formation is well known and is described even in fluidized bed combustion [1, 2]. nitrogen oxides have a negative impact on the environment and contribute to problems such as acid rain, ozone depletion, and photochemical smog [3]. nitrous oxide (n2o) is a gas with a greenhouse effect. it is also formed during combustion, particularly at lower temperatures. at higher temperatures (greater than 1 500 k), it is rapidly decomposed, forming n2 or no [4]. therefore, its emissions may be of significance in the case of combustion in fluidized beds, which are generally operated at lower temperatures and do not reach the temperature range for its oxidation. besides the combustion process itself, a significant source of n2o can be the application of a selective non-catalytic reduction of nox [5], particularly at elevated ratios of the reducing agent [6].
fluidized bed boilers are widely used for their advantages, such as fuel flexibility, uniform temperature distribution, and operation at low temperatures. bubbling fluidized bed combustors (bfbs) as well as circulating fluidized bed combustors (cfbs) allow the combustion of fuels of different sizes, moisture contents, and heating values, and therefore can be used not only for the combustion of coal, but also for the combustion of biomass or various alternative fuels. the operating temperatures in the fluidized bed are in the range of 800–950 °c [7]. this lower operating temperature range also leads to a lower formation of nox. the formation of nitrogen oxides can generally be realised by three mechanisms – thermal, prompt, and fuel nox [8, 9]. the thermal and prompt pathways become more important above the operating temperature range of fluidized bed combustors and are therefore negligible during fluidized bed combustion, and fuel nox is then considered the main contributor to nitrogen oxides.
2. measures for nox reduction
reduction of nox emissions can be achieved by modifying the combustion process, thereby preventing the formation of nox (known as primary measures), or by secondary measures, which are techniques for the reduction of already generated nitrogen oxides. in principle, all primary measures involve adjustments leading to combustion conditions with a decreased o2 availability at the early stage of the combustion process, a reduction of the maximum flame temperature, or a change in the residence time in different parts of the combustion zone [5]. for fluidized bed combustors, the early stage of the combustion process takes place in the dense bed zone. one of the measures can be air staging, which creates an oxygen-lean primary combustion zone and an oxygen-rich secondary combustion zone. it is realized by supplying secondary or even multi-stage air above the area of primary combustion, typically to the freeboard of the bfbs or to the lean bed zone of the cfbs. air staging is often used in bfbs to control the combustion and ensure the required temperature and overall oxygen stoichiometry. the use of flue gas recirculation is required for fluidized bed combustion because it provides the necessary volume flow to fluidize the bed material. most primary measures are less effective in the case where the generated nox is predominantly formed by the oxidation of fuel nitrogen [5], which is the typical situation for fluidized bed combustors. therefore, to meet the nox limits, secondary measures may be necessary.
secondary measures are used to reduce nitrogen oxides that have already been formed. these post-combustion technologies that reduce nox are mainly selective catalytic (scr) and selective non-catalytic reduction (sncr). these flue gas treatments reduce nox to n2 by a reaction with an amine-based reagent, such as ammonia or urea. both technologies are widely used. in the process of scr, a catalyst is present, reducing the activation energy of the key reactions. therefore, the reduction of no to molecular nitrogen is realized in the temperature range 290–450 °c [10]. the efficiency of scr at a similar reducing agent stoichiometric ratio is higher than for sncr and can reach above 90 %; however, the investment and operating costs are significant and its installation has a large space requirement [11]. this may limit its application as a retrofit to an existing facility. additional problems can be fouling and catalyst poisoning. on the other hand, sncr can be easily retrofitted, and its investment costs are lower because of the absence of the catalyst.
2.1. selective non-catalytic reduction
sncr is a well-known and described technology [5], but its application in fluidized bed combustors is rare. this is mainly due to the lower formation of nox in fluidized bed combustion, which can then operate within the emission limits. a problem may arise if the limits for nox emissions become even stricter or if fuels with a higher fuel-nitrogen content are burned, for example, non-wooden biofuels. several different reducing agents can be used in the sncr process. the reduction process with ammonia is called the thermal denox process, whereas the use of urea is called the noxout process. a less common reducing agent is cyanuric acid within the raprenox process. since urea decomposes into ammonia and cyanuric acid, the sncr process with urea can be considered as a combination of the other two [12]. the reaction path of urea within the selective non-catalytic reduction is shown in figure 1, which describes the decomposition of urea into ammonia and cyanuric acid and the subsequent reactions.
figure 1. reaction path diagram with urea as a reagent [5].
the reduction of nox occurs in a temperature range called the temperature window. this required temperature window is affected by many parameters, such as the reducing agent, the composition of the flue gas, the residence time, and the mixing between the reagent and the flue gas [5]. however, the interval is usually given in the range 800–1 100 °c; the temperature window for ammonia is 850–1 000 °c, and for urea the temperature window is wider (800–1 100 °c) [13]. below these temperatures, the reaction is too slow and most of the injected ammonia remains unreacted, increasing the slip of ammonia. at temperatures above 1 200 °c, the degree of nox reduction decreases due to the thermal decomposition of the reagent, which subsequently oxidizes [5].
the amount of reagent injected is usually represented by a normalized stoichiometric ratio (nsr), which defines the number of moles of the injected nh2-based reagent relative to the number of moles of nox. for sncr, the preferred normalized stoichiometric ratio is greater than 1, and an nsr in the range of 1.5–2.5 is generally recommended [13]. as mentioned above, the temperatures in the dense bed of a bfb are usually in the range 800–900 °c, and therefore it is evident that it is difficult to reach sufficiently high temperatures in the freeboard zone for an efficient sncr application. however, our previous work [14], carried out in a laboratory-scale bfb combustor, confirmed that such temperatures can be reached in the freeboard section of a bfb combustor through an intensively staged supply of combustion air. such intensive air staging, which means that sub-stoichiometric conditions are established in the primary combustion zone in the dense bed, is not typically applied in industrial practice. in parallel, the application of sncr in a bfb combustor is not common, and the original experimental data from this study contribute to the possible realization of this technology. in this article, we have therefore focused on an experimental study of the real performance of the selective non-catalytic reduction of nox in a 500 kwth pilot-scale bfb combustor in connection with intensive air staging, used to increase the temperature of the gas phase in the freeboard section, where the reducing agent was injected, into the range for an effective reduction of nox. in this work, the characterization of the sncr performance in correlation with the nsr and the temperature in the freeboard section where the reducing agent was injected is presented.
3. experiments
3.1. experimental setup
the experiments were carried out in a pilot-scale bubbling fluidized bed combustor with a thermal input of 500 kw, and its scheme is given in figure 2. the facility is very flexible, as it allows the combustion of different fuels with different oxidants; it is possible to use oxygen-enriched air up to the pure oxy-fuel regime. in the case of the air regime, a mixture of primary air and recirculated flue gas provides the fluidization, and its volume and mixture ratio can be changed arbitrarily. the combustion chamber together with the freeboard is insulated with a fireclay lining, and the walls of the boiler are water-cooled. four secondary air distributors are evenly spaced around the perimeter, and each can supply secondary air at 4 heights. in addition, inspection windows are located in the freeboard, which can be used for the injection of the reagent within the selective non-catalytic reduction process.
figure 2. scheme of the 500 kwth pilot-scale bfb facility. 1) primary gas inlets, 2) fuel feeder, 3) dense bed region, 4) secondary air distributors, 5) freeboard section, 6) inspection windows, 7) crossover pass, and 8) heat exchanger. the ‘t’ signs indicate the temperature measurement points.
the flue gas was continuously sampled. the volumetric fraction of o2 was measured using a paramagnetic sensor and using a lambda probe (for operation). the volumetric fractions of so2, co2, co, nox (the sum of no and no2), and n2o were measured using ndir sensors. the volumetric fraction of nh3 was measured using an ftir analyzer.
for the selective non-catalytic reduction, the main part of the realization is the reagent injection. in this case, a single probe with a spray nozzle was used (figure 3). the probe is water-cooled, and compressed air is used to atomize the reagent. the probe can be placed at five different heights along the freeboard zone using the inspection windows.
figure 3. injection probe.
the reducing agent was transported from an accumulation vessel to the nozzle by increasing the air pressure above the level of the reducing agent in the accumulation vessel. the volumetric flow of the reducing agent was controlled by a manually operated proportional valve placed in the stream of the reducing agent. two rotameters with different operating ranges were used to measure the volumetric flow of the reducing agent: the first measured flows in the range 0–1.2 l/h and the second in the range 1–5 l/h.
to achieve the temperature window for an efficient injection of the reagent, air staging was realized. it leads to a primary zone with sub-stoichiometric conditions, above which the secondary air inlet is located, creating a fuel-lean zone. thanks to this measure, it was possible to obtain higher temperatures in the freeboard area while keeping the bed temperature below 900 °c. air staging simultaneously serves as a primary measure that leads to a reduction of nox. a comparison of the temperature profiles within the combustor during non-staged and staged combustion is given in figure 4. the dashed horizontal line indicates the height level of the secondary air supply in the combustor. each point in the chart corresponds to a mean temperature value measured at that specific position. the locations of the thermocouples are indicated in figure 2 by a ‘t’ sign. the height of 0 mm corresponds to the average temperature in the dense bed measured by 4 thermocouples. the temperatures in the freeboard section were measured by two thermocouples at each height. it can be seen that a flue gas temperature slightly higher than 950 °c was reached at a height of 1 200 mm above the dense bed. at this point, the reducing agent for the sncr was injected.
figure 4. the temperature profile along the height of the facility with and without the use of secondary air.
3.2. materials and methods
the experiments were carried out using czech lignite bílina hp1 135 and wooden pellets as fuels. the proximate and ultimate analyses of the lignite and wooden pellets are given in table 1.

         as received                                            dry ash free
         lhv [mj/kg]  water [wt. %]  ash [wt. %]  comb.* [wt. %]  c [wt. %]  h [wt. %]  n [wt. %]  s [wt. %]  o* [wt. %]  volatiles [wt. %]
lignite  17.6         21.1           9.9          69.0            72.3       6.3        1.1        1.3        19.0        47.0
wood     16.4         7.8            1.5          90.7            51.0       6.9        0.3        0.003      41.797      84.6
* calculated as balance to 100 %.
table 1. proximate and ultimate analysis of the fuels used within the experiments.

during the experiments with lignite coal, its inherent ash was used as the bed material. the experiments with biomass were realized combusting wooden pellets (according to the standard enplus a1) and using a lightweight ceramic aggregate with a mean diameter of 1.12 mm as the bed material, which is described in detail in [7]. the reagent used in all experiments with lignite as fuel was a urea solution with a concentration of 32.5 % by weight. since the combustion of wooden biomass is expected to lead to a lower production of nox due to the significantly lower nitrogen content in the fuel, it was necessary to adjust the concentration of the reagent to 10.8 % wt. to ensure a sufficient flow of the urea solution.
several series of experiments were performed for lignite and one experiment for the combustion of wooden pellets. in order to achieve temperatures in the freeboard zone (where the injection takes place) suitable for an effective nitrogen oxides reduction using sncr, a high level of staged supply of combustion air had to be realized. the stoichiometry in the fluidized bed for the individual experiments is given in table 2 for lignite combustion. in the case of combusting wooden pellets, no secondary air was used and the air excess in the dense bed was 1.7.

freeboard temperature [°c]   required freeboard temperature [°c]   air excess ratio in the dense bed [–]
combustion of lignite
870 ± 0.8                    870                                   0.50
882 ± 0.8                    880                                   0.94
885 ± 0.8                    890                                   0.97
925 ± 2.0                    920                                   0.63
949 ± 0.6                    950                                   0.53
combustion of wood
886 ± 0.3                    880                                   1.46
table 2. stoichiometry in the fluidized bed.

the value representing the freeboard temperature in table 2 is an average of the values measured at the place of injection of the reducing agent, sampled for at least 30 minutes at an interval of 1 s. the average value is given together with the interval of 95 % confidence. the experiments were performed with a constant bed temperature of 880 °c and the oxygen level in the dry flue gas maintained at 6 % to diminish the impact of these parameters on the nox formation. the secondary air was injected 900 mm above the surface of the fluidized bed for lignite. a reference case was measured for each selected temperature, and the measured values of the concentration of nox were used to calculate the corresponding values of the nsr and the efficiency of the nox reduction. during all experiments, the selected temperature in the freeboard area was kept constant and the amount of the injected urea solution was increased in steps. in the case of lignite combustion, the volumetric flow of the urea solution was increased by 0.25 l/h, except for the measurement with the freeboard temperature of 870 °c, when the step was 0.15 l/h. consequently, steady-state cases with volumetric flows of the urea solution of 0, 0.25, 0.5, 0.75, 1, and 1.2 l/h were measured for the freeboard temperatures 882, 885, 925, and 949 °c. for the freeboard temperature of 870 °c, the volumetric flows of the urea solution were 0, 0.15, 0.3, 0.45, 0.60, and 1.2 l/h. in the case of biomass combustion, steady states were measured with volumetric flows of the urea solution of 0, 0.4, 0.85, 1.3, 2, and 3 l/h. the volumetric flow of the urea solution was used together with the reference concentration of nox to determine the stoichiometric ratio nsr. in the case of urea, (nh2)2co, the normalized stoichiometric ratio is defined as two moles of urea in the urea solution to one mole of no in the flue gas [15]:
nsr = 2 · n_urea / n_no (1)
the nsr was used in the evaluation of the results as the variable parameter that describes the intensity of the sncr process. each of the steady-state cases was measured on average for 40 minutes (at least for 30 minutes).
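equation (1) can be evaluated directly from the dosing and flue gas data described above. the python sketch below is illustrative only: the solution density (≈1.09 kg/l for a 32.5 wt. % urea solution) and the flue gas flow and reference nox values in the example call are assumed numbers, not measurements from this study.

```python
# a minimal sketch of equation (1); the density, flue gas flow and reference
# nox in the example call are assumptions for illustration.
M_UREA = 60.06  # g/mol
M_NO = 30.01    # g/mol

def nsr_urea(v_solution_l_per_h, w_urea, rho_kg_per_l,
             c_no_mg_per_nm3, flue_gas_nm3_per_h):
    """nsr = 2 * n_urea / n_no, with the reference nox taken as no."""
    n_urea = v_solution_l_per_h * rho_kg_per_l * 1e3 * w_urea / M_UREA  # mol/h
    n_no = c_no_mg_per_nm3 * 1e-3 * flue_gas_nm3_per_h / M_NO           # mol/h
    return 2.0 * n_urea / n_no

# 0.5 l/h of the 32.5 wt. % solution at an assumed 450 mg/nm3 reference nox
# and an assumed 800 nm3/h flue gas flow:
print(round(nsr_urea(0.5, 0.325, 1.09, 450.0, 800.0), 2))  # ~0.49
```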
all measured values were sampled with an interval of 1 s, except for the nh3 concentrations and the nox concentrations measured downstream of the sncr in the case of lignite combustion and freeboard temperature 920 °c, where the data were sampled with an interval of 1 minute. all sampled data reported in the results are expressed using averages with corresponding intervals of 95 % confidence.
4. results and discussion
the measured data from all experiments relevant to this study can be found in table 3 (for the combustion of lignite) and table 4 (for the combustion of biomass). all concentrations reported in these tables are calculated for a reference content of o2 in dry flue gas of 6 %. achieving the different temperatures in the freeboard zone was realized by different mass flows of fuel, modification of the volume of the fluidizing gas and its composition, and changing the volumetric flow of the secondary air as well. these adjustments resulted in different initial concentrations of nox, which varied from 438 to 498 mg/m3n for lignite combustion. the difference in the initial nox concentration could also be caused by a slightly different o2 volumetric fraction in the flue gas. the formation of nox is significantly dependent on the availability of o2 for the oxidation of the fuel n, which was confirmed by krzywański et al. [16] or vodička et al. [17]. although a constant volumetric fraction of o2 in dry flue gas was desired within all experimental cases, its value varied in the range from 4.3 to 7.3 %. it can be seen in table 3 that the higher concentrations of nox were measured for higher volumetric fractions of o2 and vice versa. the nox concentrations measured for wood combustion were significantly lower than those measured for lignite combustion, particularly due to the significantly lower nitrogen content in the fuel [18].
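the normalization to the 6 % o2 reference mentioned above can be sketched as follows. this assumes the standard dry-flue-gas correction formula, which is not quoted in the paper itself:

```python
# a minimal sketch of the usual dry-flue-gas o2 correction (an assumption;
# the authors do not state the formula they applied).
def to_reference_o2(c_measured, o2_measured, o2_reference=6.0):
    """recalculate a concentration to a reference o2 content (vol. %, dry gas)."""
    return c_measured * (21.0 - o2_reference) / (21.0 - o2_measured)

# e.g. 400 mg/m3n measured at 4.3 % o2, expressed at the 6 % reference:
print(round(to_reference_o2(400.0, 4.3), 1))  # ~359.3 mg/m3n
```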
freeboard temp. (required) [°c] | nox before sncr [mg/m3n] | nox after sncr [mg/m3n] | nsr [mol/mol] | ηnox [%] | nh3 [mg/m3n] | o2 [% vol.] | co [mg/m3n] | co freeboard [mg/m3n] | n2o [mg/m3n]
870±0.8 (870)  | 438±0.5 | 438±0.5  | 0.00 | 0.0  | na*       | 6.1±0.04 | 313±5.4   | na*       | 19.2±0.19
               |         | 396±0.5  | 0.43 | 9.6  | na*       | 6.4±0.05 | 294±3.5   | 843±9.8   | 32.4±0.24
               |         | 348±0.5  | 0.87 | 20.5 | na*       | 5.7±0.05 | 394±15.0  | 777±13.9  | 30.4±0.45
               |         | 312±0.4  | 1.30 | 28.8 | 0.1±0.02  | 5.5±0.05 | 326±10.0  | 703±5.4   | 37.8±0.34
               |         | 289±0.4  | 1.70 | 34.0 | 0.2±0.03  | 5.1±0.04 | 341±12.2  | 677±3.3   | 41.2±0.46
               |         | 276±0.9  | 3.47 | 37.0 | 0.4±0.08  | 5.4±0.06 | 385±9.8   | na*       | 47.1±0.88
882±0.8 (880)  | 460±0.6 | 460±0.6  | 0.00 | 0.0  | na*       | 6.7±0.04 | 268±1.9   | na*       | 23.2±0.18
               |         | 390±0.5  | 0.55 | 15.1 | na*       | 6.8±0.04 | 303±2.3   | na*       | 30.7±0.19
               |         | 303±0.4  | 1.10 | 34.1 | na*       | 5.9±0.03 | 294±2.2   | na*       | 43.5±0.22
               |         | 265±0.5  | 1.65 | 42.3 | na*       | 5.6±0.03 | 283±2.3   | na*       | 55.8±0.30
               |         | 256±0.5  | 2.20 | 44.4 | na*       | 5.7±0.03 | 345±2.5   | na*       | 72.0±0.29
               |         | 241±0.4  | 2.64 | 47.5 | na*       | 5.9±0.03 | 333±2.4   | na*       | 83.4±0.27
885±0.8 (890)  | 443±1.1 | 443±1.1  | 0.00 | 0.0  | na*       | 4.9±0.05 | 216±2.5   | na*       | 19.8±0.09
               |         | 415±0.8  | 0.62 | 6.4  | na*       | 5.3±0.04 | 196±1.0   | na*       | 30.8±0.12
               |         | 360±0.7  | 1.30 | 18.7 | na*       | 4.6±0.03 | 182±1.2   | na*       | 32.8±0.13
               |         | 299±1.2  | 1.89 | 32.5 | na*       | 4.4±0.05 | 225±2.1   | na*       | 33.4±0.23
               |         | 202±1.0  | 2.50 | 54.4 | na*       | 4.6±0.04 | 511±16.7  | na*       | 37.0±0.40
               |         | 181±0.6  | 3.00 | 59.6 | na*       | 4.6±0.04 | 265±7.3   | na*       | 73.8±0.61
925±2.0 (920)  | 498±3.8 | 498±3.8  | 0.00 | 0.0  | na*       | 7.0±0.08 | 221±4.8   | 539±15.7  | na*
               |         | 458±1.3  | 0.41 | 8.0  | 0.1±0.02  | 7.0±0.08 | 227±3.5   | 660±72.5  | na*
               |         | 450±1.9  | 0.81 | 9.5  | 0.01±0.01 | 7.2±0.07 | 283±7.0   | 1883±94.1 | na*
               |         | 410±3.6  | 1.23 | 17.7 | 0.01±0.01 | 7.2±0.04 | 216±5.9   | 1274±39.4 | na*
               |         | 364±10.8 | 1.97 | 26.9 | 0.3±0.04  | 7.3±0.20 | 265±21.3  | 1564±71.6 | na*
949±0.6 (950)  | 468±0.3 | 468±0.3  | 0.00 | 0.0  | 0.03±0.02 | 6.8±0.03 | 219±9.1   | na*       | 6.8±0.36
               |         | 454±0.6  | 0.57 | 2.9  | 0.03±0.01 | 6.3±0.03 | 306±24.2  | 773±34.9  | 16.0±0.43
               |         | 442±0.9  | 1.12 | 5.5  | 0±0       | 5.5±0.04 | 358±18.2  | 958±18.4  | 14.4±0.55
               |         | 430±0.4  | 1.70 | 8.3  | 0±0       | 4.3±0.02 | 321±20.7  | 1027±20.0 | 12.6±0.54
               |         | 399±0.7  | 2.70 | 14.7 | 0.77±0.10 | 4.7±0.04 | 282±14.5  | 833±19.6  | 23.0±0.69
* the value was not available for this experiment.
table 3. operating conditions during experiments for lignite combustion, o2,ref ≈ 6 %.

freeboard temp. (required) [°c] | nox before sncr [mg/m3n] | nox after sncr [mg/m3n] | nsr [mol/mol] | ηnox [%] | nh3 [mg/m3n] | o2 [% vol.] | co [mg/m3n] | co freeboard [mg/m3n] | n2o [mg/m3n]
886±0.3 (880)  | 145±0.5 | 145±0.5 | 0   | 0.0  | na* | 6.7±0.02 | 76±0.4 | na* | 4.2±0.14
               |         | 133±0.2 | 1   | 8.7  | na* | 6.4±0.02 | 60±0.1 | na* | 7.5±0.01
               |         | 113±0.3 | 2.2 | 22.0 | na* | 5.4±0.02 | 55±0.1 | na* | 12.6±0.03
               |         | 107±0.5 | 3.3 | 26.6 | na* | 5.2±0.02 | 54±0.2 | na* | 13.1±0.08
               |         | 86±0.4  | 5.1 | 40.9 | na* | 4.8±0.02 | 52±0.2 | na* | 18.9±0.11
* the value was not available for this experiment.
table 4. operating conditions during experiments for biomass combustion, o2,ref ≈ 6 %.

the measurement of the co concentration in the freeboard and of the nh3 concentration in the flue gas required an additional analyzer. the absence of the data on nh3 and n2o concentrations in the off-gas and on co concentrations in the freeboard section reported in tables 3 and 4 was caused by the non-availability of the additional ftir analyzer during the corresponding experimental case. in all experiments, the injection of the reagent resulted in a reduction in nitrogen oxides, even at the lowest volumetric flows of the urea solution corresponding to the lowest nsrs. the impact of the injection of the urea solution on the concentration of nox in the flue gas downstream of the sncr is demonstrated in figure 5, with the injection of the reagent expressed as the nsr. the nox reduction efficiency was calculated as the fraction of the difference in the nox concentration before and after the sncr over the nox concentration before the sncr process. the best nox reduction efficiency of 58 % was achieved for the temperature 885 °c and nsr 3.1, and this efficiency decreased with the use of higher or lower temperatures. this trend is shown in figure 6 for four different nsrs in the range from 0.5 to 3.3.
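the efficiency definition used above is straightforward to evaluate; a minimal python sketch, with illustrative round numbers rather than rows from table 3:

```python
# a minimal sketch of the nox reduction efficiency defined in the text.
def nox_reduction_efficiency(c_before, c_after):
    """eta_nox = (c_before - c_after) / c_before, returned in percent."""
    return 100.0 * (c_before - c_after) / c_before

# e.g. an assumed drop from 450 to 180 mg/m3n:
print(round(nox_reduction_efficiency(450.0, 180.0), 1))  # 60.0
```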
figure 5. dependence of the nox concentration in the flue gas downstream of the sncr on n sr for different temperatures, (l) – lignite, (wp) – wooden pellets.
figure 6. dependence of the nox reduction efficiency on temperature for different n srs.
figure 7. dependence of the nox reduction efficiency on n sr for different temperatures.

nox reduction was also affected by the amount of injected reagent, expressed as n sr. an increase in n sr resulted in an increase in efficiency in all experiments, although the gain in nox reduction efficiency gradually decreased with increasing n sr, as can be seen in figure 7, where the dependence of the nox reduction efficiency on n sr is given. in the cases where the ammonia slip was measured, the ammonia concentration never exceeded 1 mg/m3n for lignite combustion, even for the highest n srs, which indicate a strong over-stoichiometry of the urea solution.

n2o is a greenhouse gas. its emissions in the energy industry are not currently limited by eu legislation; however, they are negatively affected by selective non-catalytic reduction when urea or cyanuric acid is used as the reducing agent [5]. the n2o concentrations increased with an increasing amount of reagent, up to four times the initial value, as can be seen in tables 3 and 4. in the case of lignite combustion, the highest concentrations of n2o were measured for the highest n srs and freeboard temperatures of 880 and 890 °c, where the nox reduction efficiency was the highest and the concentration of nox after the sncr was the lowest. for the freeboard temperature of 890 °c and n sr = 3, the concentration of n2o was 73.8 mg/m3n, which was 40 % of the concentration of nox. in the case of the combustion of wooden pellets and n sr 5.1, the concentration of nitrous oxide was 18.9 mg/m3n, whereas the concentration of nox achieved after the reduction was 86 mg/m3n. thus, the application of sncr with a urea solution can be limited by n2o emissions.

the best efficiency of the nitrogen oxides reduction was achieved by injecting the reagent into the area with a temperature of around 890 °c, which is a lower temperature compared to previous work [5, 19]. this optimal temperature was found to be independent of the n sr values. however, to evaluate the effect of temperature on the efficiency of the nox reduction, it would be necessary to keep the remaining combustion parameters constant, which is not feasible given the control of the combustion process in a pilot-scale boiler. therefore, the experiments differ not only in the temperature in the combustion region but also in the flow rate of the fluidizing medium and the ratio of secondary to primary air. these parameters may result in different reagent residence times and different mixing for each series of experiments, and may also affect the nox reduction efficiency.
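the saturation of the efficiency with increasing n sr, visible in figure 7, can also be read off table 3 directly. the sketch below computes the marginal efficiency gain per unit of n sr for the 882 °c lignite series taken from table 3.

```python
# marginal gain of nox reduction efficiency per unit n sr,
# using the 882 °c lignite series of table 3
nsr = [0.00, 0.55, 1.10, 1.65, 2.20, 2.64]
eta = [0.0, 15.1, 34.1, 42.3, 44.4, 47.5]    # efficiency [%]

for (n0, n1), (e0, e1) in zip(zip(nsr, nsr[1:]), zip(eta, eta[1:])):
    print(f"n sr {n0:.2f} -> {n1:.2f}: {(e1 - e0)/(n1 - n0):5.1f} % per unit n sr")
# the gain falls from ~30 % per unit n sr below n sr ~ 1.1
# to under 10 % per unit above n sr ~ 1.65
```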
for lignite combustion, the implementation of air staging increases the concentration of carbon monoxide in the flue gas, which can result in a shift of the temperature window for reagent injection into the lower temperature range [5, 20]. in the case of biomass combustion, only primary air and no air staging was used; the temperature dependence will be the subject of further studies, but a lower initial level of nox can lead to a lower reduction efficiency.

5. conclusion

this paper presents an experimental study of the selective non-catalytic reduction of nox in a bubbling fluidized bed boiler. the temperatures required for the reduction reactions were achieved by modifying the combustion process by means of a high degree of air staging with sub-stoichiometric conditions in the dense bed, which allowed controlling the freeboard temperature in the range required for the sncr. in several series of experiments in the bfb boiler, an efficient application of the selective non-catalytic reduction of nitrogen oxides by injecting the urea solution was achieved. during the experiments, it was possible to achieve a nitrogen oxide reduction efficiency of 59.6 % for a normalized stoichiometric excess of reagent of 3.0. high values of n sr resulted in a higher ammonia slip in the flue gas. the optimal temperature for the sncr in this experimental setup was approximately 890 °c, which is slightly below the typical optimum reported in the literature. this can be attributed to the elevated co levels in the high-degree air staging operation mode of the combustor. in all experiments, the injection of the reducing agent led to a significant increase in n2o emissions, which in several cases reached up to four times the initial value. the nox reduction in the case of biomass combustion was nearly 40 %, but due to the different concentrations of the urea solution, the results of lignite and biomass combustion are not directly comparable. further experiments with biomass as a fuel will be the subject of continued research, followed by oxy-fuel combustion of lignite and biomass, i.e., combustion with oxygen as the oxidant.

acknowledgements
this work was supported by the ministry of education, youth and sports under op rde grant number cz.02.1.01/0.0/0.0/16_019/0000753 "research center for low-carbon energy technologies".

references
[1] j. e. johnsson. formation and reduction of nitrogen oxides in fluidized-bed combustion. fuel 73(9):1398–1415, 1994. https://doi.org/10.1016/0016-2361(94)90055-8.
[2] p. skopec, j. hrdlička, j. opatřil, j. štefanica. nox emissions from bubbling fluidized bed combustion of lignite coal. acta polytechnica 55(4):275–281, 2015. https://doi.org/10.14311/ap.2015.55.0275.
[3] l. muzio, g. quartucy. implementing nox control: research to application. progress in energy and combustion science 23(3):233–266, 1997. https://doi.org/10.1016/s0360-1285(97)00002-6.
[4] c. t. bowman. control of combustion-generated nitrogen oxide emissions: technology driven by regulation. symposium (international) on combustion 24(1):859–878, 1992. https://doi.org/10.1016/s0082-0784(06)80104-9.
[5] m. tayyeb javed, n. irfan, b. m. gibbs. control of combustion-generated nitrogen oxides by selective non-catalytic reduction.
journal of environmental management 83(3):251–289, 2007. https://doi.org/10.1016/j.jenvman.2006.03.006.
[6] c. mendoza-covarrubias, c. e. romero, f. hernandez-rosales, h. agarwal. n2o formation in selective non-catalytic nox reduction processes. journal of environmental protection 2:1095–1100, 2011. https://doi.org/10.4236/jep.2011.28126.
[7] m. vodička, k. michaliková, j. hrdlička, et al. external bed materials for the oxy-fuel combustion of biomass in a bubbling fluidized bed. journal of cleaner production 321:128882, 2021. https://doi.org/10.1016/j.jclepro.2021.128882.
[8] y. b. zeldovich. the oxidation of nitrogen in combustion explosions. acta physicochimica urss 21:577–628, 1946.
[9] c. fenimore. formation of nitric oxide in premixed hydrocarbon flames. symposium (international) on combustion 13(1):373–380, 1971. https://doi.org/10.1016/s0082-0784(71)80040-1.
[10] j. he, k. chen, j. xu. urban air pollution and control. 2017. https://doi.org/10.1016/b978-0-12-409548-9.10182-4.
[11] k. abul hossain, m. nazri mohd jaafar, a. mustafa, et al. application of selective non-catalytic reduction of nox in small-scale combustion systems. atmospheric environment 38(39):6823–6828, 2004. https://doi.org/10.1016/j.atmosenv.2004.09.012.
[12] j. a. miller, c. t. bowman. mechanism and modeling of nitrogen chemistry in combustion. progress in energy and combustion science 15(4):287–338, 1989. https://doi.org/10.1016/0360-1285(89)90017-8.
[13] m. mladenović, m. paprika, a. marinković. denitrification techniques for biomass combustion. renewable and sustainable energy reviews 82:3350–3364, 2018. https://doi.org/10.1016/j.rser.2017.10.054.
[14] m. vodička, k. michaliková, j. hrdlička, et al. experimental verification of the impact of the air staging on the nox production and on the temperature profile in a bfb. acta polytechnica 62(3):400–408, 2022. https://doi.org/10.14311/ap.2022.62.0400.
[15] n. modliński. numerical simulation of sncr (selective non-catalytic reduction) process in coal fired grate boiler. energy 92:67–76, 2015. https://doi.org/10.1016/j.energy.2015.03.124.
[16] j. krzywański, w. nowak. neurocomputing approach for the prediction of nox emissions from cfbc in air-fired and oxygen-enriched atmospheres. journal of power technologies 97(2):75–84, 2017.
[17] m. vodička, n. e. haugen, a. gruber, j. hrdlička. nox formation in oxy-fuel combustion of lignite in a bubbling fluidized bed: modelling and experimental verification. international journal of greenhouse gas control 76:208–214, 2018. https://doi.org/10.1016/j.ijggc.2018.07.007.
[18] m. lisý, h. lisá, d. jecha, et al. characteristic properties of alternative biomass fuels. energies 13(6), 2020. https://doi.org/10.3390/en13061448.
[19] s. daood, m. javed, b. gibbs, w. nimmo. nox control in coal combustion by combining biomass co-firing, oxygen enrichment and sncr. fuel 105:283–292, 2013. https://doi.org/10.1016/j.fuel.2012.06.087.
[20] p. lodder, j. lefers. effect of natural gas, c2h6 and co on the homogeneous gas phase reduction of nox by nh3. the chemical engineering journal 30(3):161–167, 1985. https://doi.org/10.1016/0300-9467(85)80026-5.
https://doi.org/10.14311/ap.2022.62.0056 acta polytechnica 62(1):56–62, 2022. © 2022 the author(s), licensed under a cc-by 4.0 licence, published by the czech technical university in prague.

time-dependent step-like potential with a freezable bound state in the continuum

izamar gutiérrez altamirano (a, ∗), alonso contreras-astorga (b), alfredo raya montaño (a, c)
a instituto de física y matemáticas, universidad michoacana de san nicolás de hidalgo, edificio c-3, ciudad universitaria, francisco j. mújica s/n, col. felícitas del río, 58040 morelia, michoacán, méxico
b conacyt–physics department, cinvestav, p.o. box 14-740, 07000 mexico city, mexico
c centro de ciencias exactas, universidad del bío-bío, avda. andrés bello 720, casilla 447, 3800708, chillán, chile
∗ corresponding author: izamar.gutierrez@umich.mx

abstract. in this work, we construct a time-dependent step-like potential supporting a normalizable state with energy embedded in the continuum. the potential is allowed to evolve until a stopping time ti, where it becomes static. the normalizable state also evolves but remains localized at every fixed time up to ti. after this time, the probability density of this state freezes, becoming a bound state in the continuum. closed expressions for the potential, the freezable bound state in the continuum, and scattering states are given.

keywords: bound states in the continuum, supersymmetric quantum mechanics, time-dependent quantum systems.

1. introduction

the first discussion of bound states in the continuum (bics) in quantum mechanics dates back to von neumann and wigner [1], who constructed normalizable states corresponding to an energy embedded in the continuum of a periodic potential v(r) = e + ∇²ψ/ψ, starting from a modulated free-particle wave function ψ(r) = (sin(r)/r) f(r) with twice the period of the potential. the localization of this state is interpreted as the result of its reflection by the bragg mirror generated by the wrinkles of v(r) as r → ∞. the extended family of von neumann and wigner potentials has been discussed and extended for many years [2–5] within different frameworks, including the gel'fand-levitan equation [6], also known as the inverse scattering method [4, 7], darboux transformations [8, 9] and supersymmetry (susy) [10–13], among others. bound states in the continuum are nowadays recognized as a general wave phenomenon and have been explored theoretically and experimentally in many different setups; see [14] for a recent review.
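to make the von neumann–wigner construction concrete, the short symbolic computation below inverts the s-wave schrödinger equation for the modulated free-particle wave function quoted above. this is an illustrative sketch with the modulation f left symbolic; it is not the construction used later in this paper.

```python
import sympy as sp

# invert the schrödinger equation: given psi, recover the potential
# v = e + (laplacian psi)/psi for an s-wave psi(r) = sin(r)/r * f(r)
r, e = sp.symbols('r E', positive=True)
f = sp.Function('f')
psi = sp.sin(r)/r * f(r)

laplacian = sp.diff(r*psi, r, 2)/r     # radial laplacian of an s-wave
v = sp.simplify(e + laplacian/psi)
print(v)
# up to rearrangement: e - 1 + f''(r)/f(r) + 2*cot(r)*f'(r)/f(r),
# so v stays bounded only if f' vanishes at the zeros of sin(r);
# hence the modulation of f with twice the period of sin(r)
```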
exact solutions to the time-dependent schrödinger equation are known only in a few cases, including the potential wells with moving walls [15, 16], which have been explored through several approaches (see, for instance, ref. [17] and references therein), including the adiabatic approximation [18], perturbation theory [16], and point transformations [19–23], which, combined with supersymmetry techniques, allow one to go from the infinite potential well with a moving wall to the trigonometric pöschl-teller potential [24].

in this article, we present the construction of a time-dependent step-like potential. we depart from the standard stationary step potential and apply a second-order supersymmetric transformation to add a bic. then, by means of a point transformation, the potential and the state become dynamic and we allow them to evolve. after a certain time, we assume that all the time dependence of the potential is frozen, so that the potential becomes stationary again, and we explore the behavior of the normalizable state. intriguingly, the freezable bic is not an eigensolution of the stationary schrödinger equation in the frozen potential, but rather solves an equation that includes a vector potential which does not generate any magnetic field whatsoever. thus, by an appropriate gauge transformation, we gauge away the vector potential and observe that the bic remains frozen as an eigenstate of the stationary hamiltonian after the potential ceases to evolve in time.

to present our results, we have organized the remainder of this article as follows: in section 2 we describe the preliminaries of susy and a point transformation. section 3 presents the construction of the time-dependent step-like potential and gives explicit expressions for the freezable bic and the scattering states. final remarks are presented in section 4.

2. supersymmetry and a point transformation

point transformation is a successful technique to define a time-dependent schrödinger equation with a fully time-dependent potential from a known stationary problem [19, 20, 24]. in this section, we use a transformation of this kind in combination with a confluent supersymmetry transformation to obtain a time-dependent step-like potential from the stationary case.

2.1. confluent supersymmetry

the darboux transformation, intertwining technique, or supersymmetric quantum mechanics (susy) is a method to map solutions ψ of a schrödinger equation into solutions ψ̄ of another schrödinger equation [25–29]. it is based on an intertwining relation, where two hamiltonians and a proposed operator l† must fulfill the relation

h̄ l† = l† h, (1)

where

h = −d²/dy² + v₀(y),   h̄ = −d²/dy² + v̄(y). (2)

the main ingredient of susy is the seed solution, which corresponds to a solution of the initial differential equation h u = ϵ u, where ϵ is a real constant called the factorization energy. in this work, we focus on the so-called confluent supersymmetry, where l† is a second-order differential operator. once a seed solution and a factorization energy are chosen, the next step is to construct the following auxiliary function:

v = (1/u) (ω + ∫_{y₀}^{y} u²(z) dz), (3)

where ω is a real constant to be fixed.
then, one way to fulfill (1) is by selecting

l† = (−d/dy + v′/v)(−d/dy + u′/u), (4)

and the potential term in h̄ as

v̄(y) = v₀(y) − 2 (d²/dy²) ln(ω + ∫_{y₀}^{y} u²(z) dz). (5)

then, solutions of the differential equation hψ = eψ, where e is the energy, can be mapped using l† and the intertwining relation as follows:

hψ = eψ,
⇓ applying l†
l† h ψ = e l† ψ,
⇓ using (1)
h̄ l† ψ = e l† ψ,
⇓ defining ψ̄ ∝ l† ψ
h̄ ψ̄ = e ψ̄.

we define ψ̄ as

ψ̄ = (1/(e − ϵ)) l† ψ. (6)

the factor (e − ϵ)⁻¹ is introduced for normalization purposes. moreover, h̄ could have an extra eigenstate that cannot be written in the form (6). this state is called the missing state and plays an important role in this work. the missing state is obtained as follows. first, we have seen that l† maps solutions of hψ = eψ into solutions of h̄ψ̄ = eψ̄; by taking the adjoint of (1), h l = l h̄, where l = (l†)†, we can construct the inverse mapping. however, there is a solution ψ̄ϵ such that l ψ̄ϵ = 0. this solution is explicitly

ψ̄ϵ = cϵ (1/v) = cϵ u/(ω + ∫_{y₀}^{y} u²(z) dz), (7)

where cϵ is a normalization constant if ψ̄ϵ is square integrable. this state fulfills h̄ ψ̄ϵ = ϵ ψ̄ϵ. notice that the selection of u, ϵ and ω is very important; we must choose them carefully to avoid introducing singularities in the potential v̄, which would lead to singularities in ψ̄ as well. the function ω + ∫ u² dz must be nodeless. this can be ensured if either lim_{y→∞} u(y) = 0 or lim_{y→−∞} u(y) = 0 and ω is chosen appropriately.

2.2. point transformation

given that we know the solutions of the time-independent schrödinger equation

d²ψ̄(y)/dy² + [e − v̄(y)] ψ̄(y) = 0 (8)

with a potential defined for y ∈ (−∞, ∞), let us consider the following change of variable:

y(x,t) = x/(4t + 1), (9)

where x ∈ (−∞, ∞) is considered as a spatial variable and t ∈ [0, ∞) a temporal one. then, the wavefunction

ϕ(x,t) = (4t + 1)^{−1/2} exp{ i(x² + e/4)/(4t + 1) } ψ̄(x/(4t + 1)) (10)

solves the time-dependent schrödinger equation

i ∂ϕ(x,t)/∂t + ∂²ϕ(x,t)/∂x² − v(x,t) ϕ(x,t) = 0, (11)

where the potential term is

v(x,t) = (1/(4t + 1)²) v̄(x/(4t + 1)). (12)

in other words, the change of variable (9) together with the replacements v̄ → v and ψ̄ → ϕ transforms a stationary schrödinger equation into a time-dependent solvable one.

3. time-dependent step-like potential with a freezable bound state in the continuum

in this section, we depart from the well-known step potential v(y) = v̂ θ(−y) as the time-independent system. then, using confluent supersymmetry, we add a single bic. furthermore, with the point transformation previously introduced, we transform the stationary system into a time-dependent system with an explicitly time-dependent potential. we choose a stopping time or freezing time ti after which the potential no longer evolves:

vf(x,t) = { v(x,t), 0 ≤ t < ti ; v(x,ti), t ≥ ti. (13)

finally, the solutions of the schrödinger equation will be presented. let us commence our discussion by considering the step potential

v₀(y) = { v̂, y ≤ 0 ; 0, y > 0, (14)

defined along the axis y ∈ (−∞, ∞), where v̂ is a positive constant. the solutions of this system are well known in the literature (see [30, 31]). restricting ourselves to the case 0 < e_q < v̂, the solutions are

ψ(y) = { exp(ρy), y ≤ 0 ; cos(qy) + (ρ/q) sin(qy), y > 0, (15)

with energy e_q = q² and ρ = √(v̂ − e_q).
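as a quick sanity check of (5) and (7) before applying them to the step potential, the sketch below verifies symbolically, for the simplest possible seed (v₀ = 0, u = e^{κy}, which solves h u = ϵ u with ϵ = −κ²), that the missing state ψ̄ϵ = cϵ/v is indeed an eigenfunction of h̄ with eigenvalue ϵ. the choice of seed and factorization energy is ours and serves only as an illustration.

```python
import sympy as sp

# confluent susy sanity check on the simplest seed: v0 = 0, u = exp(kappa*y),
# which solves h u = eps u with eps = -kappa**2 (illustrative choice only)
y, z = sp.symbols('y z', real=True)
kappa, omega = sp.symbols('kappa omega', positive=True)

u = sp.exp(kappa*y)
W = omega + sp.integrate(sp.exp(kappa*z)**2, (z, -sp.oo, y))  # omega + int u^2
vbar = -2*sp.diff(sp.log(W), y, 2)        # eq. (5) with v0 = 0
psi_eps = u/W                             # missing state, eq. (7), c_eps = 1

eps = -kappa**2
residual = -sp.diff(psi_eps, y, 2) + vbar*psi_eps - eps*psi_eps
print(sp.simplify(residual))   # -> 0: the missing state is an eigenstate of hbar
```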
next, to perform the confluent supersymmetric transformation, we choose a factorization energy such that 0 < ϵ < v̂ and the corresponding seed solution u(y) as

u(y) = { exp(κy), y ≤ 0 ; cos(ky) + (κ/k) sin(ky), y > 0, (16)

with k² = ϵ and κ² = v̂ − ϵ. note that u(y) → 0 when y → −∞. then, from (5) we obtain explicitly the susy partner v̄:

v̄(y) = { v̂ − 16 κ³ ω exp(2κy)/[exp(2κy) + 2κω]², y ≤ 0 ; 32 k² [k cos(ky) + κ sin(ky)] ṽ(y)/v̂(y), y > 0, (17)

where the functions ṽ(y) and v̂(y) are

ṽ(y) = [(k² + κ²)(k²y + κ) + 2k⁴ω] sin(ky) − k[(k² + κ²)(κy + 1) + 2k²κω],
v̂(y) = [2ky(k² + κ²) + 4k³ω − 2kκ cos(2ky) + (k² − κ²) sin(2ky)]².

we can calculate directly from (7) the missing state associated with the factorization energy ϵ:

ψ̄ϵ(y) = cϵ { 2κ exp(κy)/(2κω + exp(2κy)), y ≤ 0 ; 4k³ (cos(ky) + (κ/k) sin(ky))/ψ̂ϵ(y), y > 0, (18)

where

ψ̂ϵ(y) = (k² − κ²) sin(2ky) − 2κk cos(2ky) + 4ωk³ + 2ky(κ² + k²).

figure 1. |ψ̄ϵ(y)|² and an envelope function of the form a(y) = a/(b + y), with a = 2k(κ² + k²)^{−1/2}, b = 2ωk²(κ² + k²)^{−1}. the scale of the graph is fixed with v̂ = 5, k = 1, κ = 2 and cϵ = 1, in the appropriate units.

in order to confirm that ψ̄ϵ is square integrable, we proceed in the following way. first, we separate the integral ||ψ̄ϵ||² = ∫_{−∞}^{∞} |ψ̄ϵ|² dy = ∫_{−∞}^{0} |ψ̄ϵ|² dy + ∫_{0}^{∞} |ψ̄ϵ|² dy. the first integral can be calculated directly:

∫_{−∞}^{0} |ψ̄ϵ|² dy = |cϵ|² √(2/(κω)) tan⁻¹(1/√(2κω)).

for the second integral, we can show that it is bounded by a square-integrable function:

∫_{0}^{∞} |ψ̄ϵ|² dy / |cϵ|² = ∫_{0}^{∞} | 4k³ (cos(ky) + (κ/k) sin(ky))/ψ̂ϵ(y) |² dy
≤ ∫_{0}^{∞} | 4k² √(k² + κ²)/(4ωk³ + 2ky(κ² + k²)) |² dy = ∫_{0}^{∞} | a/(b + y) |² dy = a²/b, (19)

where a = 2k/√(κ² + k²), b = 2ωk²/(κ² + k²). figure 1 shows a fair fit to the squared modulus of eq. (18) for y > 0.

for an energy e = q² ≠ ϵ, the wavefunction solving h̄ψ̄ = eψ̄ is constructed using (6) and (15). it reads

ψ̄(y) = { [(κ − ρ) exp(ρy)/(q² − k²)] ψ̄₋(y), y ≤ 0 ; [ψ̄₊(y) − q² cos(qy) − qρ sin(qy)]/(q² − k²), y > 0, (20)

where we abbreviated

ψ̄₋(y) = [2κω(κ + ρ) + (ρ − κ) exp(2κy)] / (2κω + exp(2κy)),
ψ̄₊(y) = k²(ρ sin(qy) + q cos(qy))/q + [4k(κ sin(ky) + k cos(ky))/ψ̂ϵ(y)] × [ (k/q)(κ cos(ky) − k sin(ky))(ρ sin(qy) + q cos(qy)) + (κ sin(ky) + k cos(ky))(q sin(qy) − ρ cos(qy)) ].

in figure 2, the potential v̄(y) is shown along with the probability densities of the missing state |ψ̄ϵ(y)|² and of a scattering state |ψ̄(y)|². we observe that the wavefunction of the bic has an envelope which tends to zero as |y| → ∞, whereas the state ψ̄(y) is not localized.

the next step is to construct a time-dependent potential from (17) using the point transformation presented in (9)-(12). notice that x = y at t = 0. then v̄ transforms as the piecewise potential

v(x,t) = (1/(4t + 1)²) { v̂ − 16 κ³ ω exp(2κx/(4t + 1)) / [2κω + exp(2κx/(4t + 1))]² } (21)

if x ≤ 0, otherwise

v(x,t) = (32 k²/(4t + 1)²) [ k cos(kx/(4t + 1)) + κ sin(kx/(4t + 1)) ] ṽ(y(x,t))/v̂(y(x,t)). (22)

in figure 3 (top) we show the potential v(x,t) at t = 0, t = 0.1 and t = 0.2. its shape changes in time and its spatial profile oscillates as expected, vanishing as x → ∞.
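the point transformation (9)-(12) that underlies (21) and (22) can be verified independently of the susy machinery. the sketch below checks it on the free particle (v̄ = 0, ψ̄(y) = sin(ky), e = k²), an illustrative case of our choosing that keeps the symbolic algebra small.

```python
import sympy as sp

# check of the point transformation (9)-(12) on a free-particle solution:
# vbar = 0, psibar(y) = sin(k*y), e = k**2 (illustrative choice)
x, t = sp.symbols('x t', real=True)
k = sp.symbols('k', positive=True)

s = 4*t + 1
e = k**2
phi = s**sp.Rational(-1, 2) * sp.exp(sp.I*(x**2 + e/4)/s) * sp.sin(k*x/s)

# eq. (11) with v(x,t) = 0: the residual must vanish identically
residual = sp.I*sp.diff(phi, t) + sp.diff(phi, x, 2)
print(sp.simplify(sp.expand(residual)))   # -> 0
```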
analogously, for the time-dependent bic, the associated wavefunction for energy ϵ is explicitly

ϕϵ(x,t) = (4t + 1)^{−1/2} exp{ i(x² + k²/4)/(4t + 1) } ψ̄ϵ(x/(4t + 1)). (23)

this function solves the time-dependent schrödinger equation i∂tϕϵ + ∂xxϕϵ − v ϕϵ = 0, and its square integrability is guaranteed since ψ̄ϵ(y) is a square-integrable function:

||ϕϵ||² = ∫_{−∞}^{∞} |ϕϵ(x,t)|² dx = (1/(4t + 1)) ∫_{−∞}^{∞} |ψ̄ϵ(x/(4t + 1))|² dx = ∫_{−∞}^{∞} |ψ̄ϵ(y)|² dy = ||ψ̄ϵ||², (24)

where we used the change of variable (9). its probability density is shown in figure 3 (center) at different times. this state is localized, and the first peak in the probability density broadens and diminishes in height as time increases. for states with energy e_q = q² ≠ ϵ, the corresponding time-dependent wavefunction has the explicit form

ϕ(x,t) = (4t + 1)^{−1/2} exp{ i(x² + q²/4)/(4t + 1) } ψ̄(x/(4t + 1)). (25)

the behavior of the probability density |ϕ(x,t)|², for e = 2, at different times is shown in figure 3 (bottom). this state is unlocalized at any time. finally, we choose the freezing or stopping time ti. then, we can consider a charged particle in the potential

vf(x,t) = { v(x,t), 0 ≤ t < ti ; v(x,ti), t ≥ ti, (26)

figure 2. potential v̄(y), along with the probability densities of the missing state |ψ̄ϵ(y)|² and a scattering state |ψ̄(y)|². the scale of the graph is fixed with v̂ = 5, k = 1, κ = 2, q = √2 and ω = 4.

figure 3. behavior of the potential v(x,t) (top), the bic ϕϵ(x,t) (center) and the scattering state ϕ(x,t) (bottom) at the times t = 0, t = 0.1, and t = 0.2. the scale of the graphs is fixed by v̂ = 5, k = 1, κ = 2, q = √2 and ω = 4.

where v(x,t) is given by (21, 22). notice that when t ∈ [0, ti) the potential changes in time, and when t ≥ ti the potential is frozen. this potential is in fact a family, parametrized by ω > 0; recall that ω was introduced by the confluent susy transformation. neither ϕ(x,t) nor ϕϵ(x,t) are stationary states: they evolve in time, and they are not eigenfunctions of the operator −∂xx + v. at any time t ≥ ti, the functions ϕ(x,ti) and ϕϵ(x,ti) satisfy the eigenvalue equation

[ (−∂/∂x + i aₓ(x))² + v(x,ti) ] ϕ(x,ti) = (e/(4ti + 1)²) ϕ(x,ti), t ≥ ti, (27)

where aₓ(x) = −∂xθ(x) and

θ(x) = (x² + e/4)/(4ti + 1). (28)

equation (27) is the schrödinger equation for a charged particle under the influence of a vector potential a = (aₓ, 0, 0) that, nevertheless, does not generate a magnetic field, since b = ∇ × a = 0. let us recall that the schrödinger equation for a charged particle of charge q immersed in an external electromagnetic field is best written in terms of the scalar potential φ and the vector potential a through the hamiltonian

h = (p̂ + qa)² + qφ. (29)

these electromagnetic potentials allow us to define the electric and magnetic fields as

e = −∇φ − ∂a/∂t,   b = ∇ × a, (30)

definitions that do not change if the following transformations are performed simultaneously,

a → a′ = a + ∇λ,   φ → φ′ = φ − ∂λ/∂t, (31)

where λ = λ(x,t) is a scalar function. this is a statement of the gauge invariance of maxwell's equations. in quantum mechanics, the time-dependent schrödinger equation

i ∂ψ/∂t = hψ (32)

retains this feature if, along with the transformations (31) in the hamiltonian (29), the wavefunction changes according to the local phase transformation

ψ → ψ′ = e^{iλ} ψ. (33)
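the gauge covariance expressed by (29)-(33) can be checked symbolically. sign conventions for the minimal coupling vary across references; the sketch below uses h = (p − qa)² + qφ with ψ′ = e^{iqλ}ψ (and ℏ = 2m = 1), a self-consistent textbook convention of our choosing that matches (31) and reduces to the phase transformation (33) for a unit charge.

```python
import sympy as sp

# gauge covariance of the schrödinger equation: with A' = A + dlam/dx,
# phi' = phi - dlam/dt and psi' = exp(i*q*lam)*psi, the residual of the
# equation only picks up the overall phase. convention: h = (p - qA)^2 + q*phi.
x, t, q = sp.symbols('x t q', real=True)
lam = sp.Function('lam')(x, t)
A = sp.Function('A')(x, t)
phi = sp.Function('phi')(x, t)
psi = sp.Function('psi')(x, t)

def residual(p, a, v):
    kin = lambda g: -sp.I*sp.diff(g, x) - q*a*g      # (p - qA) acting on g
    return sp.I*sp.diff(p, t) - (kin(kin(p)) + q*v*p)

r0 = residual(psi, A, phi)
r1 = residual(sp.exp(sp.I*q*lam)*psi,
              A + sp.diff(lam, x),
              phi - sp.diff(lam, t))
print(sp.simplify(sp.expand(r1 - sp.exp(sp.I*q*lam)*r0)))   # -> 0
```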
in our example at hand, this freedom allows us to select λ in such a way that if at a certain instant of time ti the vector potential a ≠ 0, while before we had a = 0, one can still have a schrödinger equation without a vector potential by tuning the scalar potential appropriately. in particular, by selecting

λ(x,t) = ℓ(x) θ(t − ti), (34)

we can shift the scalar potential such that the time-dependent equation governing this state never develops a vector potential to begin with. then, by choosing a vector potential a(x,t) = (aₓ(x,t), 0, 0), where aₓ(x,t) = −θ(t − ti) ∂xθ(x), we observe that the piecewise function

ϕf(x,t) = { ϕ(x,t), 0 ≤ t < ti ; ψ̄(x/(4ti + 1)), t ≥ ti, (35)

becomes a solution of i∂tϕf(x,t) = [−∂xx + vf(x,t)] ϕf(x,t) = hϕf(x,t). in particular, the function

ϕfϵ(x,t) = { ϕϵ(x,t), 0 ≤ t < ti ; ψ̄ϵ(x/(4ti + 1)), t ≥ ti, (36)

is just a time-dependent wave packet before the freezing time ti, but for t > ti it becomes a frozen bound state in the continuum satisfying the eigenvalue equation hϕfϵ = εϕfϵ, where ε = ϵ/(4ti + 1)². in figure 4 we plot the potential vf (top), the freezable bound state in the continuum ϕfϵ (center) and a scattering state ϕf (bottom) at t = 0.8, t = 1 and t = 1.8; the freezing time is ti = 1. note that after t = 1, neither the potential nor the wavefunctions evolve.

4. final remarks

in this article, we applied a confluent supersymmetric transformation to the standard step potential defined on the whole real axis. the seed solution that we use makes it possible to embed a localized, square-integrable state in the continuous spectrum: a bic. we provided the system, potential and states with time evolution through a point transformation. nevertheless, we notice that the wrinkles in the potential as x → ∞ still localize a bic at every fixed time. next, we let the evolution of the system continue, and at a given stopping time ti we freeze the potential and fix it stationary. upon exploring the behavior of the bic in this static potential after the freeze-out time, we surprisingly observe that it does not correspond to a solution of the stationary schrödinger equation; instead, it develops a geometric phase encoded in a vector potential which does not generate any magnetic field. thus, by gauging out this geometric phase, the resulting state becomes indeed an eigenstate of the frozen hamiltonian. we call this state a freezable bound state in the continuum. further examples are being examined under the strategy presented in this work, including vector potentials which might be relevant for pseudo-relativistic systems.

figure 4. behavior of the potential vf(x,t) (top), the fbic ϕfϵ(x,t) (center) and the scattering state ϕf(x,t) (bottom) at the times t = 0.8, t = 1, and t = 1.8. the freezing time is ti = 1. the scale of the graph is fixed by v̂ = 5, k = 1, κ = 2, q = √2 and ω = 4.

acknowledgements
the authors acknowledge consejo nacional de ciencia y tecnología (conacyt-méxico) under grant fordecyt-pronaces/61533/2020. ig and ar acknowledge valuable discussions with juan angel casimiro olivares.

references
[1] j. von neumann, e. wigner. über merkwürdige diskrete eigenwerte. über das verhalten von eigenwerten bei adiabatischen prozessen. physikalische zeitschrift 30:467–470, 1929. https://ui.adsabs.harvard.edu/abs/1929phyz...30..467v provided by the sao/nasa astrophysics data system.
[2] b. simon. on positive eigenvalues of one-body schrödinger operators.
communications on pure and applied mathematics 22:531–538, 1969. https://doi.org/10.1002/cpa.3160220405.
[3] f. h. stillinger, d. r. herrick. bound states in the continuum. physical review a 11:446–454, 1975. https://doi.org/10.1103/physreva.11.446.
[4] b. gazdy. on the bound states in the continuum. physics letters a 61(2):89–90, 1977. https://doi.org/10.1016/0375-9601(77)90845-3.
[5] m. klaus. asymptotic behavior of jost functions near resonance points for wigner–von neumann type potentials. journal of mathematical physics 32:163–174, 1991. https://doi.org/10.1063/1.529140.
[6] i. m. gel'fand, b. m. levitan. on the determination of a differential equation from its spectral function. izvestiya akademii nauk sssr, seriya matematicheskaya 15(4):309–360, 1951.
[7] t. a. weber, d. l. pursey. continuum bound states. physical review a 50:4478–4487, 1994. https://doi.org/10.1103/physreva.50.4478.
[8] a. a. stahlhofen. completely transparent potentials for the schrödinger equation. physical review a 51:934–943, 1995. https://doi.org/10.1103/physreva.51.934.
[9] d. lohr, e. hernandez, a. jauregui, a. mondragon. bound states in the continuum and time evolution of the generalized eigenfunctions. revista mexicana de física 64(5):464–471, 2018. https://doi.org/10.31349/revmexfis.64.464.
[10] j. pappademos, u. sukhatme, a. pagnamenta. bound states in the continuum from supersymmetric quantum mechanics. physical review a 48:3525–3531, 1993. https://doi.org/10.1103/physreva.48.3525.
[11] n. fernández-garcía, e. hernández, a. jáuregui, a. mondragón. exceptional points of a hamiltonian of von neumann–wigner type. journal of physics a: mathematical and theoretical 46(17):175302, 2013. https://doi.org/10.1088/1751-8113/46/17/175302.
[12] l. lópez-mejía, n. fernández-garcía. truncated radial oscillators with a bound state in the continuum via darboux transformations. journal of physics: conference series 1540:012029, 2020. https://doi.org/10.1088/1742-6596/1540/1/012029.
[13] a. demić, v. milanović, j. radovanović. bound states in the continuum generated by supersymmetric quantum mechanics and phase rigidity of the corresponding wavefunctions. physics letters a 379(42):2707–2714, 2015. https://doi.org/10.1016/j.physleta.2015.08.017.
[14] c. w. hsu, b. zhen, a. d. stone, et al. bound states in the continuum. nature reviews materials 1(9):16048, 2016. https://doi.org/10.1038/natrevmats.2016.48.
[15] d. l. hill, j. a. wheeler. nuclear constitution and the interpretation of fission phenomena. physical review 89:1102–1145, 1953. https://doi.org/10.1103/physrev.89.1102.
[16] s. w. doescher, m. h. rice. infinite square-well potential with a moving wall. american journal of physics 37:1246–1249, 1969. https://doi.org/10.1119/1.1975291.
[17] k. cooney.
the infinite potential well with moving walls, 2017. arxiv:1703.05282. [18] m. v. berry. quantal phase factors accompanying adiabatic changes. proceedings of the royal society lond a 392:45–57, 1984. https://doi.org/10.1098/rspa.1984.0023. [19] j. r. ray. exact solutions to the time-dependent schrödinger equation. physical review a 26:729–733, 1982. https://doi.org/10.1103/physreva.26.729. [20] g. w. bluman. on mapping linear partial differential equations to constant coefficient equations. siam journal on applied mathematics 43(6):1259–1273, 1983. https://doi.org/10.1137/0143084. [21] a. schulze-halberg, b. roy. time dependent potentials associated with exceptional orthogonal polynomials. journal of mathematical physics 55(12):123506, 2014. https://doi.org/10.1063/1.4903257. [22] k. zelaya, o. rosas-ortiz. quantum nonstationary oscillators: invariants, dynamical algebras and coherent states via point transformations. physica scripta 95(6):064004, 2020. https://doi.org/10.1088/1402-4896/ab5cbf. [23] k. zelaya, v. hussin. point transformations: exact solutions of the quantum time-dependent mass nonstationary oscillator. in m. b. paranjape, r. mackenzie, z. thomova, et al. (eds.), quantum theory and symmetries, pp. 295–303. springer international publishing, cham, 2021. https://doi.org/10.1007/978-3-030-55777-5_28. [24] a. contreras-astorga, v. hussin. infinite square-well, trigonometric pöschl-teller and other potential wells with a moving barrier, pp. 285–299. springer international publishing, cham, 2019. https://doi.org/10.1007/978-3-030-20087-9_11. [25] v. b. matveev, m. a. salle. darboux transformations and solitons. springer series in nonlinear dynamics. springer, berlin, heidelberg, 1992. isbn 9783662009246, https: //books.google.com.mx/books?id=pjdjvweacaaj. [26] f. cooper, a. khare, u. sukhatme. supersymmetry and quantum mechanics. physics reports 251(5-6):267–385, 1995. https://doi.org/10.1016/0370-1573(94)00080-m. [27] d. j. fernández c., n. fernández-garcía. higher-order supersymmetric quantum mechanics. aip conference proceedings 744:236–273, 2004. https://doi.org/10.1063/1.1853203. [28] a. gangopadhyaya, j. v. mallow, c. rasinariu. supersymmetric quantum mechanics: an introduction (second edition). world scientific publishing company, 2017. isbn 9789813221062. [29] g. junker. supersymmetric methods in quantum, statistical and solid state physics. iop expanding physics. institute of physics publishing, 2019. isbn 9780750320245. [30] c. cohen-tannoudji, b. diu, f. laloë. quantum mechanics. 1st ed. wiley, new york, ny, 1977. trans. of : mécanique quantique. paris : hermann, 1973, https://cds.cern.ch/record/101367. [31] r. shankar. principles of quantum mechanics. plenum, new york, ny, 1980. https://cds.cern.ch/record/102017. 
acta polytechnica vol. 48 no. 1/2008

moderna arhitektura u hrvatskoj 1930–ih / modern architecture in croatia 1930's, by darja radović mahečić. book jacket review by a. bičík.
keywords: croatian modern architecture, functionalism, functionalism in croatia.

in may 2007, institut za povijest umjetnosti (the institute of art history) in zagreb released an academic publication on the topography of modern architecture from the period between the two world wars on the territory of croatia. the book was produced by the školska knjiga publishing house, supported by the international working party for documentation and conservation of buildings, sites and neighbourhoods of the modern movement (docomomo), founded in 1988 in eindhoven. docomomo works on revitalizing and promoting 20th century modernist art. this book presents the results of a five-year project under the direction of darja radović mahečić from the institute of art history in zagreb. in the course of the project, a national register of croatian modern architecture from the period between 1926 and 1940 was set up. the task was to investigate this period of croatian architecture, when a national identity was being created and many buildings and concepts were being developed under the influence of political changes, comparable to events in the traditional european architectural centres. the one hundred selected buildings represent a broad spectrum of types: public spaces, offices, residential and multifunctional buildings, rental villas and family houses, schools, hospitals, hotels, churches and chapels, exhibition pavilions and workers' districts. some, mainly on the croatian coastline, show close links with local tradition (e.g., the institute for biology and oceanography in split, by fabjan kaliterna, 1930–1931). after zagreb, the other cultural centres were split, and also rijeka and zadar, which were under italian rule. modern architecture also reached the islands of krk, hvar, koločep and lopud, as well as the architecturally preserved town of dubrovnik (see the city café and cinema at the great arsenal, by mladen kauzlarić and stjepan gomboš, 1931–1933).

the structure of the book is clear and well arranged. after the introduction, there are chapters on croatian modern architecture in the 1930s and the internationalization of the croatian architectural avant-garde, followed by a register of modern architecture in croatia, a bibliography, a typological index, and an index of localities and names. the core material is a catalogue of one hundred selected and chronologically ordered examples of modern architecture in croatia. each building is presented with its main specifications (name, architect or architects, time of creation and address), black-and-white illustrations (original and current photos, in some cases reproductions of plans and sketches) and an index of sources.
the preconditions for inclusion in the catalogue were that the source documentation had been well researched, that the buildings were of high architectural value and, above all, that their present condition and appearance reveal the original architectural and urban concept. the third condition must have been complicated by the fact that many examples have been damaged, e.g., by removal of the flat roof, a typical formal feature. nevertheless, these buildings are on the list, because they are important for the development of modern architecture in croatia.

in the chapter croatian modern architecture in the 1930s (p. 16–32), the author discusses the historical context of croatia and the development of the croatian school architectural system. this opening chapter goes on to show international and local architectural and urban design competitions, the problem of artists' associations and their programmes, and contemporary evaluations of the new modern architecture. the question of urban planning is then analysed. the towns encountered new problems, e.g., the rise of the urban population and the design of large municipal structures rather than small flats. next, the book introduces some revolutionary events in the field of architectural education: the foundation of two architectural schools. the author describes the foundation of the royal technical college in zagreb in 1919. the establishment of this school had been under preparation by the croatian society of civic engineers and architects since 1898. its first rector was edo šen (schön), and its first graduate was alfred albini. the second school to be founded was the department of architecture, in 1926, at the academy of fine arts in zagreb, under the direction of drago ibler. at that time the rector was ivan meštrović, who later became a world-famous sculptor. sixty students graduated from this school before it closed in 1942. in 1920, the croatian society of civic engineers and architects accepted new rules for architectural competitions, including a commitment to arrange competitions for every general master plan (see the international competitions for the general master plans of zagreb and split, 1931–1936) and for every public building. this contributed to the diffusion of modern architecture to distant regions. the existence of two architectural schools competing with each other was good for the development of croatian modern architecture. one of the artists' associations, zemlja, is introduced here. it was founded by the painter krsto hegedušić in 1929, at the academy of fine arts in zagreb. the architect drago ibler headed the association and was also the author of its manifesto. the group wanted to promote art that reflected the modern vital needs of croatian society. they put this into practice through public lectures, exhibitions and cooperation with similar intellectual groups. before it was banned in 1935, the group held seven exhibitions with permanent and guest members. almost all its architects had attended ibler's course at the academy. the chapter on the internationalization of the croatian architectural avant-garde (p.
33–53) describes the contacts of croatian architects with the association congrès internationaux d'architecture moderne (ciam) and with important architects from a number of european art centres (vienna, dresden, frankfurt, berlin, budapest and prague). the author draws attention to the reflections of croatian modern architecture at international exhibitions and in technical papers published inside the kingdom of serbians, croatians and slovenes and elsewhere in europe. the main journals were tehnički list (published in zagreb), arhitektura (founded in ljubljana in 1931), građevinski vjesnik (founded in zagreb in 1932), the german reviews bauwelt and monatshefte für baukunst, and the french review l'architecture d'aujourd'hui. the book mentions prominent architects who studied outside yugoslavia, e.g. the prague students nikola dobrović, marko vidaković, ivan zemljak and zvonimir kavurić, the vienna student juraj neidhardt, and ernest weissmann, who worked with le corbusier. le corbusier helped him to become the first resident ciam representative of yugoslavia. many other architects returned to yugoslavia from major european studios led by le corbusier, josef hoffmann, adolf loos, peter behrens or hans poelzig. other architects took part in architectural competitions abroad; e.g., juraj neidhardt and vladimir potočnjak participated in the architectural competition for the workers' district in zlín, czech republic, in 1935, and potočnjak was awarded one of the prizes.

ivan zemljak: trešnjevka municipal primary school, zagreb, 1930–1931
marijan haberle: church of blessed marko křiževčanin, zagreb, 1940
fabjan kaliterna: institute for biology and oceanography, split, 1930–1931

the register of modern architecture in croatia (p. 58–467) includes some important czech architects and their works on the territory of croatia, and also several of their croatian colleagues who studied in prague. we are introduced to the villa pfefferman (which now houses the embassy of the czech republic) in zagreb. this project from 1928–1929 was led by marko vidaković, who studied in vienna and prague, where he graduated in 1918. he and his schoolfellows ivan zemljak and vladimir šterk are considered to be the architects who designed the first modern buildings in zagreb. the czech architect josef kodl designed the municipal schools in split, 1928–1930. the most important work alluding to czech architecture is the bataville satellite industrial town of the bat'a shoe factory in the neighbourhood of vukovar, by the architects františek lýdie gahura (general master plan) and vladimír karfík and antonín vítek (architectural design of houses), 1931–1938. the project included 13 six-storey industrial blocks sized 80×20 m, modelled on bat'a's zlín hall no. 24, with a reinforced concrete skeleton on 6.15 m × 6.15 m modules and large windows. the residential quarter built before 1936 was named after jan bat'a. the quarter contained 122 apartment buildings, located in the park belt, with 421 dwelling units: 17 family houses for directors and engineers, 8 two-apartment houses for top managers and 97 four-apartment buildings for workers. the houses were cubic in form and were made of unplastered facade bricks, with details in concrete and flat roofs. the town housed 1818 people in 1936.
in the midst of the public buildings there is a hostel for single workers with 200 beds, a primary school, a professional secondary school, a department store, a restaurant, a cinema, a stadium and a sports airport. at the peak of its expansion in 1939, the factory employed 6290 workers, of whom 4650 lived in the town. after 1945, the town was renamed borovo. the complex was damaged in yugoslav army attacks in 1991. restoration work has taken place since 1998, unfortunately without regard for the value of the original structures.

the book moderna arhitektura u hrvatskoj 1930–ih informs readers about the birth and development of croatian modern architecture in the 1930s. profiles of 100 buildings form the central material of this publication; the profiles are rich in iconographic documentation and information about sources, which makes them valuable not only for researchers but also for practicing architects. the effort to promote the modern architecture of this period through one hundred selected houses is laudable. however, a book covering all the findings of darja mahečić's team and providing a compact insight into all the structures included in the national register of croatian modern architecture would be of even greater value.

ing. arch. aleš bičík, e-mail: a.bicik@post.cz, department of monuments protection and renovation, czech technical university in prague, faculty of architecture, thákurova 7, 166 34 prague 6, czech republic

acta polytechnica vol. 47 no. 1/2007

preparation of thin metal layers on polymers
j. siegel, v. kotál

continuous gold layers of increasing thickness were prepared by the vacuum deposition method on pristine and plasma modified sheets of pe, pet and ptfe. various surface profiles were obtained. the surface morphology was studied using atomic force microscopy (afm). the continuity of the metal layer on the polymer surface was validated by measuring its electrical resistance. changes in the wettability of the plasma treated polymers were evaluated by measuring the aging curves. these were obtained as the dependence of contact angle on ageing time.
keywords: polymer, metal, vacuum deposition, atomic force microscopy, contact angle, plasma etching.
this text was a part of the international conference poster 2006, which was held at the faculty of electrical engineering, ctu in prague.

1 introduction
metallized polymer films are widely used in industries ranging from food packaging to biosensors [1–4]. interesting studies have been made on various aspects of metal/polymer interface formation [5]. polymeric films metallized from both sides are basic structures for constructing diodes with negative differential resistance. a crucial role in preparing metal layers on polymers is played by the interface properties between the substrate and the deposited metal. these properties can benefit from plasma treatment of the polymeric surface. plasma processing of polymer materials is routinely used to control their properties: just by varying the plasma treatment conditions, it is possible either to increase or to decrease the wettability of a given polymeric surface. in order to understand the changes made to the surfaces during plasma treatment, we need analytical techniques that can accurately characterize the surface before and after modification. the most important characteristic factor that affects interfacial interactions, such as adsorption, wetting and adhesion, is the surface energy. the surface energy of plasma treated polymers can easily be examined by measuring contact angles. metal deposition on polymers can be carried out by sputtering, vacuum deposition, and also by various electrochemical procedures [4, 6, 7]. the thicknesses of the prepared layers range from a few nanometres to hundreds of nanometres. the structure of the metal layer is mainly influenced by nucleation processes [8, 9]. one of the main problems when a metal layer is formed on a polymer is its adhesion to the substrate. the adhesion of the metal layer to the polymer can be increased by various methods, including chemical changes, plasma charge, laser and ion beam irradiation [10]. plasma treatment is probably the most versatile surface modification technique, and it has become an important process for altering the chemical and physical properties of surfaces [11].
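the introduction above points to surface energy as the key quantity accessible through contact angles. one standard way to turn two measured contact angles into a surface energy estimate, not necessarily the procedure used by these authors, is the owens–wendt two-liquid method sketched below; the liquid constants are textbook values and the angles in the example are illustrative placeholders.

```python
import math

# owens-wendt estimate of solid surface energy from two contact angles.
# this is a textbook method shown for illustration; the paper itself only
# states that the surface energy evaluation system was used.
LIQUIDS = {
    #              total, dispersive, polar [mJ/m^2]
    "water":         (72.8, 21.8, 51.0),
    "diiodomethane": (50.8, 50.8, 0.0),
}

def owens_wendt(theta_water_deg, theta_dim_deg):
    """solve gamma_s^d, gamma_s^p from the owens-wendt relation
    (1 + cos theta) * gamma_l = 2*sqrt(gs_d*gl_d) + 2*sqrt(gs_p*gl_p)."""
    rows = []
    for name, theta in (("water", theta_water_deg),
                        ("diiodomethane", theta_dim_deg)):
        g_l, g_ld, g_lp = LIQUIDS[name]
        rhs = (1 + math.cos(math.radians(theta))) * g_l / 2.0
        rows.append((math.sqrt(g_ld), math.sqrt(g_lp), rhs))
    # linear system in x = sqrt(gamma_s^d), y = sqrt(gamma_s^p)
    (a1, b1, c1), (a2, b2, c2) = rows
    det = a1*b2 - a2*b1
    xs, ys = (c1*b2 - c2*b1)/det, (a1*c2 - a2*c1)/det
    return xs**2, ys**2    # dispersive and polar parts of gamma_s

# hypothetical angles for a plasma-treated pe surface:
print(owens_wendt(55.0, 40.0))
```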
this work studies the modification of polymers (pe, pet, ptfe) and the formation of metal layers on them. surface profiles of pristine and modified pe and pet and images of the au film were obtained by the afm technique. the continuity of the metal layer on the polymer surface was validated by measuring its electrical resistance.

2 experimental
pe (hdpe, thickness 25 μm, density 0.965 g·cm⁻³), pet (thickness 50 μm, density 1.370–1.395 g·cm⁻³) and ptfe (thickness 25 μm) sheets with dimensions of 3×3 cm were prepared. gold layers were deposited on these pristine and modified polymeric sheets. plasma etching proceeded in an argon atmosphere at a pressure of 10 pa at room temperature. the etching times were held constant for each type of polymer: 180 s, 180 s and 240 s for pet, ptfe and pe, respectively. these times had been found to be the most effective in terms of improving wettability (according to our earlier investigation [13]). the power of the plasma discharge was kept constant at 8.6 w. the contact angles of the pristine polymers and of the polymeric sheets directly after plasma treatment, as well as the subsequent development of the surface wettability, were determined using the surface energy evaluation system device. to determine the sheet resistance, we used a uni-t multimeter, type ut83. the surface morphology of the pristine and modified pet and pe films and of the deposited gold layers was examined using afm (contact mode, digital instruments nanoscope dimension iii microscope). olympus oxide-sharpened silicon nitride probes (omcl-tr series) with a spring constant of 0.02 n/m were chosen. the normal force of the tip on the sample did not exceed 10 nn.

3 results and discussion
the surface morphology of pristine and plasma-treated pe and pet is shown in fig. 1, fig. 2, fig. 3 and fig. 4. these images clearly demonstrate the changes in surface roughness when plasma treatment is applied. further investigation of the gold layers and their morphologies provided a large number of afm images; two of them have been chosen as representative and are shown in fig. 5 and fig. 6. the changes in surface roughness after gold deposition on the polymers are obvious. the last two pictures demonstrate that increasing the thickness of the gold layer goes hand in hand with increasing roughness of the sample surface. a rapid drop in the contact angle (i.e., an increase in surface wettability) immediately after plasma treatment is obvious from the measured dependence of the contact angle on the time after exposure (see fig. 7).
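the aging behaviour referenced here (see fig. 7) is often summarized by a saturating-exponential model. the sketch below fits such a model to synthetic data points shaped like the pe curve described in the following paragraph (lowest angle right after treatment, saturation after roughly 48 hours); both the functional form and the data are our illustrative assumptions, not the authors' analysis.

```python
import numpy as np
from scipy.optimize import curve_fit

# saturating-exponential model for contact-angle aging after plasma
# treatment: theta(t) = theta_inf - (theta_inf - theta_0) * exp(-t / tau).
# synthetic, illustrative data roughly shaped like the pe curve in fig. 7.
def aging(t, theta0, theta_inf, tau):
    return theta_inf - (theta_inf - theta0)*np.exp(-t/tau)

t_h = np.array([0, 4, 12, 24, 48, 96, 160])            # hours after exposure
theta = np.array([35, 62, 90, 110, 122, 124, 125])     # degrees (synthetic)

popt, _ = curve_fit(aging, t_h, theta, p0=(35, 125, 20))
theta0, theta_inf, tau = popt
print(f"theta_0 ~ {theta0:.0f} deg, theta_inf ~ {theta_inf:.0f} deg, "
      f"tau ~ {tau:.0f} h")   # saturation time scale of a few tens of hours
```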
the value of the contact angle reached its lowest level immediately after exposure to the plasma in the case of all investigated samples. for the pet and pe sheets, it is evident that after approximately 48 hours the contact angle increases to a level about 20 degrees above that of the pristine samples, and then remains constant; this corresponds to values of 100° and 125° for pet and pe, respectively (see fig. 7).

fig. 1: afm image of pristine pe
fig. 2: afm image of 240 seconds plasma treated pe
fig. 3: afm image of pristine pet
fig. 4: afm image of 180 seconds plasma treated pet
fig. 5: afm image of pristine pet/au 20 nm
fig. 6: afm image of pristine pet/au 100 nm

a different dependence was obtained in the case of ptfe. it is evident that the increase in the contact angle is more gradual than in the case of pe and pet. however, subsequent measurement of the ptfe contact angle showed that the final value stabilized at a level near to that of pristine ptfe (about 115°, time 260 hours). it is also evident that immediately after exposure to the plasma, the value of the contact angle is higher in the case of ptfe (about 45°) than in pe and pet (see fig. 7).

4 conclusions
the results of this work can be summarised as follows:
• plasma treatment increases the surface roughness of pe, pet and ptfe,
• the surface morphology of the deposited au layers depends on the type of polymeric substrate,
• the value of the contact angle reached its lowest level immediately after exposure to the plasma, in the case of all investigated samples,
• the surface roughness of the vacuum evaporated au layers studied by the afm technique is an increasing function of the deposition time (thickness of the layer),
• the ageing curves of ptfe differ from those obtained for pe and pet.

acknowledgments
this work was supported by the grant agency of the cr under project no. 102/06/1106, by the internal grant of the ict no. 126080015 and by the ministry of education under project no. lc 06041.

references
[1] stutzmann, n., tervoort, t. a., bastiaansen, k., smith, p.: nature, vol. 407 (2000), no. 6, p. 613.
[2] bauer, g., pittner, f., schalkhammer, t.: mikrochim. acta, vol. 131 (1999), p. 107–114.
[3] sacher, e., pireaux, j., kowalczyk, s.: acs symp. ser., vol. 440 (1990), p. 282–288.
[4] švorčík, v., zehentner, j., rybka, v., slepička, p., hnatowicz, v.: appl. phys., vol. a75 (2002), p. 541–545.
[5] zaporojtchenko, v., strunskus, t., behnke, k., von bechtolsheim, c., keine, m., faupel, f.: j. adhesion sci. technol., vol. 14 (2000), p. 467–490.
[6] efimenko, k., rybka, v., švorčík, v., hnatowicz, v.: appl. phys., vol. a68 (1999), p. 479–482.
[7] gustafson, g., cao, y., colaneri, n., heeger, a. j.: nature, vol. 357 (1992), p. 477–479.
[8] zhang, l., cosandey, f., persaud, r., madey, t. e.: surf. sci., vol. 439 (1999), p. 73–85.
[9] walton, d. j.: chem. phys., vol. 37 (1962), p. 2182–2188.
[10] barker, c. p., kochem, k.-h., revel, k. m., kelly, r. s. a., badyal, j. p. s.: thin solid films, vol. 257 (1995), p. 77–82.
[11] chan, c. m., ko, t. m., hiraoka, h.: surf. sci. rep., vol. 26 (1996), p. 1–57.
[12] slepička, p., podgrabinski, t., špírková, m., švorčík, v., rybka, v.: eng. mech., vol. 11 (2004), p. 357–360.
[13] švorčík, v., kotál, v., slepička, p., bláhová, o., špírková, m., zajel, p., hnatowicz, v.: nucl. instrum. meth. b, vol. 244 (2006), p. 365–372.
ing. jakub siegel, e-mail: jakub.siegel@vscht.cz
ing. vladimír kotál
e-mail: vladimir.kotal@vscht.cz
department of solid state engineering
institute of chemical technology
technická 5
166 28 prague, czech republic

fig. 7: dependence of the contact angle of the studied polymers on time after exposure to ar plasma. the numbers in the legend represent treatment times in seconds.

25 years of quantum groups: from definition to classification
a. stolin

in mathematics and theoretical physics, quantum groups are certain non-commutative, non-cocommutative hopf algebras, which first appeared in the theory of quantum integrable models and were later formalized by drinfeld and jimbo. in this paper we present a classification scheme for quantum groups whose classical limit is a polynomial lie algebra. as a consequence we obtain deformed xxx and xxz hamiltonians.

keywords: lie bialgebra, quantum group, yang-baxter equation, quantization, classical twist, quantum twist, integrable model.

1 definition of quantum groups, lie bialgebras and twists

in mathematics and theoretical physics, quantum groups are certain non-commutative algebras that first appeared in the theory of quantum integrable models, and which were then formalized by drinfeld and jimbo. nowadays quantum groups are one of the most popular objects in modern mathematical physics. they also play a significant role in other areas of mathematics. it turned out that quantum groups provide invariants of knots, and that they can be used in quantization of poisson brackets on manifolds. quantum groups gave birth to quantum geometry, quantum calculus, quantum special functions and many other "quantum" areas.

at first, quantum groups appeared to be a useful tool for the following program: let $R$ satisfy the quantum yang-baxter equation (qybe) and let $R$ be expandable in a series in the formal parameter $\hbar$. then $r$, the coefficient at the first order term, satisfies the so-called classical yang-baxter equation (cybe). in many cases cybe can be solved, and the problem is to extend the solutions of cybe to solutions of qybe.

however, the most important applications of quantum groups relate to the theory of integrable models in mathematical physics. the presence of quantum group symmetries (so-called hidden symmetries) was the crucial point in explicit solutions of many sophisticated non-linear equations, such as korteweg-de vries or sine-gordon. quantum groups changed and enriched representation theory and algebraic topology.

quantum groups were defined by v. drinfeld as hopf algebra deformations of the universal enveloping algebras (and also their dual hopf algebras). more exactly, we say that a hopf algebra $A$ over $\mathbb{C}[[\hbar]]$ is a quantum group (or maybe a quasi-classical quantum group) if the following conditions are satisfied:

i. the hopf algebra $A/\hbar A$ is isomorphic as a hopf algebra to the universal enveloping algebra of some lie algebra $L$.
ii. as a topological $\mathbb{C}[[\hbar]]$-module, $A$ is isomorphic to $V[[\hbar]]$ for some vector space $V$ over $\mathbb{C}$.

the first examples of quantum groups were quantum universal enveloping algebras $U_q(\mathfrak{g})$, quantum affine kac-moody algebras $U_q(\hat{\mathfrak{g}})$, and yangians $Y(\mathfrak{g})$.

further, it is well-known that the lie algebra $L$ such that $A/\hbar A \cong U(L)$ is unique:

$$L = \{\, a \in A/\hbar A : \Delta(a) = a \otimes 1 + 1 \otimes a \,\}.$$

if $A$ is a quantum group with $A/\hbar A \cong U(L)$, then $L$ possesses a new structure, which is called a lie bialgebra structure $\delta$, and it is given by

$$\delta(x) = \hbar^{-1}\bigl(\Delta(a) - \Delta^{op}(a)\bigr) \bmod \hbar,$$

where $a$ is an inverse image of $x$ in $A$. in particular, the classical limit of $U_q(\mathfrak{g})$ is $\mathfrak{g}$, the classical limit of $U_q(\hat{\mathfrak{g}})$ is the affine kac-moody algebra, which is a central extension of $\mathfrak{g}[u, u^{-1}]$, and the classical limit of $Y(\mathfrak{g})$ is $\mathfrak{g}[u]$ (the corresponding lie bialgebra structures will be described later). the lie bialgebra $(L, \delta)$ is called the classical limit of $A$, and $A$ is a quantization of $(L, \delta)$.
so, any quantum group in our sense has its classical limit, which is a lie bialgebra, and the following natural problem arises: given a lie bialgebra, is there any quantum group whose classical limit is the given lie bialgebra? in other words: can any lie bialgebra be quantized? in the mid 1990s p. etingof and d. kazhdan gave a positive answer to this problem.

now, let $\mathfrak{g}$ be a simple complex finite-dimensional lie algebra and let $\mathfrak{g}[u]$ be the corresponding polynomial lie algebra. if you ask a physicist which quantum group is a quantization of $\mathfrak{g}[u]$, you will almost certainly hear that the quantization of $\mathfrak{g}[u]$ is the yangian $Y(\mathfrak{g})$. let us explain this in greater detail. set $\Omega = \sum_\mu I_\mu \otimes I_\mu$, where $\{I_\mu\}$ is an orthonormal basis of $\mathfrak{g}$ with respect to the killing form. then it is well-known that the function

$$r_2(u, v) = \frac{\Omega}{u - v} \qquad (1)$$

is a rational skew-symmetric solution of the classical yang-baxter equation, that is,

$$[r^{12}(u_1, u_2), r^{13}(u_1, u_3)] + [r^{12}(u_1, u_2), r^{23}(u_2, u_3)] + [r^{13}(u_1, u_3), r^{23}(u_2, u_3)] = 0. \qquad (2)$$

more generally, we call $r(u, v)$ a rational r-matrix if it is skew-symmetric, satisfies (2) and $r(u, v) - \Omega/(u - v)$ is polynomial in $(u, v)$. sometimes $r_2(u, v) = \Omega/(u - v)$ is called yang's classical r-matrix. further, the formula

$$\delta_2(p(u)) = [\,r_2(u, v),\ p(u) \otimes 1 + 1 \otimes p(v)\,], \qquad p(u) \in \mathfrak{g}[u], \qquad (3)$$

defines a lie bialgebra structure on $\mathfrak{g}[u]$, since $\Omega$ is a $\mathfrak{g}$-invariant element of $\mathfrak{g} \otimes \mathfrak{g}$, and this implies that $\delta_2(p(u))$ is a polynomial in $u, v$. the yangian $Y(\mathfrak{g})$ is precisely the quantization of this lie bialgebra. on the other hand, it is easy to see that all rational r-matrices introduced in [13] define lie bialgebra structures on $\mathfrak{g}[u]$ by the formula

$$\delta(p(u)) = [\,r(u, v),\ p(u) \otimes 1 + 1 \otimes p(v)\,]. \qquad (4)$$

of course, the next question is: which quantum groups quantize the other "rational" lie bialgebra structures on $\mathfrak{g}[u]$? before we give an answer to this question we need the following two definitions.

definition: let $\delta_1$ be a lie bialgebra structure on $L$. suppose $s \in \Lambda^2 L$ satisfies

$$[s^{12}, s^{13}] + [s^{12}, s^{23}] + [s^{13}, s^{23}] = \mathrm{alt}\,(\delta_1 \otimes \mathrm{id})(s),$$

where $\mathrm{alt}(x) := x^{123} + x^{231} + x^{312}$ for any $x \in L^{\otimes 3}$. then

$$\delta_2(a) := \delta_1(a) + [\,a \otimes 1 + 1 \otimes a,\ s\,]$$

defines a lie bialgebra structure on $L$. we say that $\delta_2$ is obtained from $\delta_1$ by twisting via $s$, and $s$ is called a classical twist. at this point we note that any rational r-matrix is a twist of yang's r-matrix.

let $H$ be a hopf algebra and let $F \in H \otimes H$ be an invertible element satisfying

$$F^{12}\,(\Delta \otimes \mathrm{id})(F) = F^{23}\,(\mathrm{id} \otimes \Delta)(F). \qquad (5)$$

such an $F$ is called a quantum twist. the formula

$$\Delta_F(a) = F\, \Delta(a)\, F^{-1} \qquad (6)$$

defines a new co-multiplication on $H$.
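as a concrete illustration (not part of the original paper), yang's r-matrix (1) can be checked against the cybe (2) numerically. the sketch below works in the fundamental representation of gl(2), where Ω becomes the permutation operator P on C²⊗C²; the sl(2) casimir element differs from P only by a multiple of the identity, which drops out of all commutators, so the check is the same.

```python
import numpy as np

I2 = np.eye(2)
# permutation operator P on C^2 (x) C^2: P(x (x) y) = y (x) x;
# in the fundamental representation of gl(2), Omega acts as P
P = np.einsum('il,jk->ijkl', I2, I2).reshape(4, 4)

# embed the two-site operator on pairs of factors of (C^2)^(x3)
P12 = np.kron(P, I2)
P23 = np.kron(I2, P)
P13 = P12 @ P23 @ P12          # swap(1,3) = swap(1,2) swap(2,3) swap(1,2)

def r(Pij, ui, uj):
    """yang's classical r-matrix Omega/(u_i - u_j) on the chosen pair."""
    return Pij / (ui - uj)

def comm(a, b):
    return a @ b - b @ a

u1, u2, u3 = 0.7, -1.3, 2.4    # arbitrary distinct spectral parameters
lhs = (comm(r(P12, u1, u2), r(P13, u1, u3))
       + comm(r(P12, u1, u2), r(P23, u2, u3))
       + comm(r(P13, u1, u3), r(P23, u2, u3)))
assert np.allclose(lhs, 0)     # cybe (2) holds identically
print("cybe residual:", np.abs(lhs).max())
```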
2 conjecture and scheme

although it is clear that any quantum twist on a quantum group induces uniquely a classical twist on its classical limit, the converse statement remained unknown for a long time. it was formulated in [9] in 2004.

conjecture 1. any classical twist can be extended to a quantum twist.

the conjecture was proved by g. halbout in [3]. in particular, we can now give an answer to the question posed in the previous section: a quantum group which quantizes any rational lie bialgebra structure on $L = \mathfrak{g}[u]$ is isomorphic to the yangian $Y(\mathfrak{g})$ as an algebra, and it has a twisted co-algebra structure defined by the corresponding rational solution of cybe. however, there might exist lie bialgebra structures on $\mathfrak{g}[u]$ of a different nature, and for classification purposes one has to find all of them. therefore, in order to classify quantum groups which have a given lie algebra $L$ as the classical limit, one has to solve the following four problems:

1. describe all basic lie bialgebra structures on $L$ (in other words, all lie bialgebra structures up to classical twisting).
2. find quantum groups corresponding to the basic structures.
3. describe all the corresponding classical twists.
4. quantize all these classical twists.

3 lie bialgebra structures on the polynomial lie algebras and their quantization

according to the results of an unpublished paper by montaner and zelmanov [11], there are four basic lie bialgebra structures on the polynomial lie algebra $P = \mathfrak{g}[u]$. let us describe them (and hence take the first step in the classification of lie bialgebra structures on $\mathfrak{g}[u]$). where possible, we make further steps.

case 1. here the one-cocycle $\delta_1 = 0$. in this case it is not difficult to show that there is a one-to-one correspondence between lie bialgebra structures of the first type and finite-dimensional quasi-frobenius lie subalgebras of $\mathfrak{g}[u]$. the corresponding quantum group is $U(\mathfrak{g}[u])$. classical twists can be quantized following drinfeld's quantization of skew-symmetric constant r-matrices.

case 2. in this case the lie bialgebra structure is described by

$$\delta_2(p(u)) = [\,r_2(u, v),\ p(u) \otimes 1 + 1 \otimes p(v)\,],$$

where $r_2(u, v)$ is yang's rational r-matrix. the corresponding lie bialgebra structures are in a one-to-one correspondence with the rational solutions of cybe described in [13]. the corresponding quantum group is the yangian $Y(\mathfrak{g})$. quantum twists were found for $\mathfrak{g} = sl_n$, $n = 2, 3$, in [9].

case 3. here the basic lie bialgebra structure is given by

$$\delta_3(p(u)) = [\,r_3(u, v),\ p(u) \otimes 1 + 1 \otimes p(v)\,] \quad \text{with} \quad r_3(u, v) = \frac{v\,\Omega}{v - u} + r_{DJ},$$

where $r_{DJ}$ is the classical drinfeld-jimbo modified r-matrix. this lie bialgebra is the classical limit of the quantum group $U_q(\mathfrak{g}[u])$, which is a certain parabolic subalgebra of a non-twisted quantum affine algebra $U_q(\hat{\mathfrak{g}})$ (see [15] or [10]). there is a natural one-to-one correspondence between lie bialgebra structures of the third type and the so-called quasi-trigonometric solutions of cybe. a complete classification of the classical twists was presented for $sl_n$ in [9] and for arbitrary $\mathfrak{g}$ in [12]. details on quantization of the classical twists for $sl_n$ can be found in [9].

case 4. finally, the fourth basic lie bialgebra structure on $\mathfrak{g}[u]$ is defined as follows:

$$\delta_4(p(u)) = [\,r_4(u, v),\ p(u) \otimes 1 + 1 \otimes p(v)\,] \quad \text{with} \quad r_4(u, v) = \frac{u v\,\Omega}{v - u} = \frac{\Omega}{u^{-1} - v^{-1}}.$$
it was proved in [14] that there is a natural one-to-one correspondence between lie bialgebra structures of this kind and the so-called quasi-rational solutions of cybe. the quasi-rational solutions of cybe for $\mathfrak{g} = sl_n$ were classified in [14]. some ideas indicate that quasi-rational r-matrices do not exist for $\mathfrak{g} = g_2, f_4, e_8$. the corresponding quantum group remains unknown, but it is rather clear that it is related to the dual hopf algebra of $Y(\mathfrak{g})$.

4 some open questions

1. following steps 1–4 in the classification scheme, it is natural to classify quantum groups related to affine kac-moody algebras. a conjecture is that in this case we have only two basic types of lie bialgebra structures, and the corresponding quantum groups are quantum affine algebras and the so-called doubles of $Y(\mathfrak{g})$.

2. more generally: let $L$ be a lie algebra. is it possible to describe the moduli space $\mathrm{Double}(L)$ of all lie bialgebra structures on $L$ modulo the action of classical twists?

3. it is well-known that there is a 1-1 correspondence between lie bialgebras and simply connected poisson-lie groups. let us consider a lie bialgebra $L$ and let $G$ be the corresponding poisson-lie group. let $M$ be a poisson homogeneous space, that is, a homogeneous space with a poisson structure such that the multiplication $m: G \times M \to M$ is a poisson map. now, let $H$ be the quantum group which quantizes $L$.

conjecture 2. let $\mathrm{Fun}(M)$ be the poisson algebra of smooth functions on $M$. then it can be quantized in an equivariant way, i.e., there exists an associative algebra $A$, which is a deformation of the poisson algebra $\mathrm{Fun}(M)$, and which is a hopf module algebra over $H$.

this conjecture is based on some results proved in [6, 7, 8]. a special case of this conjecture has been proved in [2]. more exactly, in that paper the conjecture was proved for lie bialgebras of the special type $L = D(K)$, where $K$ is another lie bialgebra and $L = D(K)$ is the corresponding classical double. recent progress in related questions was achieved in [4]. there, a result similar to the conjecture above was proved under the following assumptions: $L$ is a so-called coboundary lie bialgebra and $L$ acts freely on $M$ (in other words, $M$ is far from being a homogeneous space).

5 appendix: solutions for sl(2) and deformed hamiltonians

the aim of this section is to present concrete examples of quantum twists and the corresponding hamiltonians, following the results obtained in [5]. we consider the case $sl(2)$. let $\sigma_+ = e_{12}$, $\sigma_- = e_{21}$ and $\sigma_z = e_{11} - e_{22}$. recall that in $sl(2)$ we have, modulo gauge equivalence, two quasi-trigonometric solutions. the non-trivial solution is

$$x_1(z_1, z_2) = x_0(z_1, z_2) + (z_1 - z_2)\, \sigma_+ \otimes \sigma_+,$$

where $x_0$ denotes the trivial quasi-trigonometric solution. this solution is gauge equivalent to the following:

$$x_{a,b}(z_1, z_2) = \frac{z_2\,\Omega}{z_1 - z_2} + \frac{1}{4}\,\sigma_z \otimes \sigma_z + a\,(z_1\, \sigma_+ \otimes \sigma_z - z_2\, \sigma_z \otimes \sigma_+) + b\,(\sigma_+ \otimes \sigma_z - \sigma_z \otimes \sigma_+) + a b\,(z_1 - z_2)\, \sigma_+ \otimes \sigma_+. \qquad (7)$$

the above quasi-trigonometric solution was quantized in [5]. let $V_{1/2}(z)$ be the two-dimensional vector representation of $U_q(sl_2)$. in this representation, the generator $e_\alpha$ acts as the matrix unit $e_{21}$, $e_{\delta - \alpha}$ as $z\, e_{12}$, and $h_\alpha$ as $e_{11} - e_{22}$. the quantum r-matrix of $U_q(sl_2)$ in the tensor product $V_{1/2}(z_1) \otimes V_{1/2}(z_2)$ is the following:

$$R_0(z_1, z_2) = e_{11} \otimes e_{11} + e_{22} \otimes e_{22} + \frac{z_1 - z_2}{q z_1 - q^{-1} z_2}\,(e_{11} \otimes e_{22} + e_{22} \otimes e_{11}) + \frac{q - q^{-1}}{q z_1 - q^{-1} z_2}\,(z_1\, e_{12} \otimes e_{21} + z_2\, e_{21} \otimes e_{12}). \qquad (8)$$

proposition 1. the r-matrix

$$R := R_0(z_1, z_2)\, F_{a,b}(z_1, z_2), \qquad (9)$$

where $F_{a,b}(z_1, z_2) = 1 \otimes 1 + c_1\,\sigma_+ \otimes 1 + c_2\,1 \otimes \sigma_+ + c_3\,\sigma_+ \otimes \sigma_+$ is the quantum twist whose coefficients $c_i$ are the explicit combinations of $a$, $b$, $q$, $z_1$ and $z_2$ written out in [5], is a quantization of the quasi-trigonometric solution $x_{a,b}$.
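the trigonometric r-matrix (8) is, up to normalization and a change of variables, the standard six-vertex r-matrix. as an illustration (not from the paper), the sketch below checks the quantum yang-baxter equation numerically for the six-vertex r-matrix in the additive parametrization with entries sin(u+η), sin u, sin η; this parametrization, and the additive form of the qybe, are assumptions of the sketch, chosen because their consistency is standard.

```python
import numpy as np

def Rm(u, eta):
    """six-vertex (xxz-type) r-matrix in the additive parametrization."""
    a, b, c = np.sin(u + eta), np.sin(u), np.sin(eta)
    return np.array([[a, 0, 0, 0],
                     [0, b, c, 0],
                     [0, c, b, 0],
                     [0, 0, 0, a]])

I2 = np.eye(2)
P = np.einsum('il,jk->ijkl', I2, I2).reshape(4, 4)   # swap on C^2 (x) C^2
S23 = np.kron(I2, P)                                  # swap of factors 2,3

def r12(M): return np.kron(M, I2)
def r23(M): return np.kron(I2, M)
def r13(M): return S23 @ np.kron(M, I2) @ S23         # move slot 2 to slot 3

u, v, eta = 0.37, -0.81, 0.55
R12, R13, R23 = r12(Rm(u - v, eta)), r13(Rm(u, eta)), r23(Rm(v, eta))
# quantum yang-baxter equation:
#   R12(u-v) R13(u) R23(v) = R23(v) R13(u) R12(u-v)
diff = R12 @ R13 @ R23 - R23 @ R13 @ R12
assert np.allclose(diff, 0)
print("qybe residual:", np.abs(diff).max())
```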
corollary 1. the rational degeneration of (9), obtained by setting $z_i = e^{2\varepsilon u_i}$, $q = e^{\varepsilon}$ and letting $\varepsilon \to 0$,

$$R(u_1, u_2) = \left(1 \otimes 1 + \frac{P_{12}}{u_1 - u_2}\right) F_{\alpha,\beta}(u_1, u_2), \qquad (10)$$

where $P_{12}$ denotes the permutation of factors in $\mathbb{C}^2 \otimes \mathbb{C}^2$ and $F_{\alpha,\beta}$ is the corresponding degeneration of the quantum twist, is a quantization of the following rational solution of cybe:

$$r(u_1, u_2) = \frac{\Omega}{u_1 - u_2} + \alpha\,(u_1\, \sigma_+ \otimes \sigma_z - u_2\, \sigma_z \otimes \sigma_+) + \beta\,(\sigma_+ \otimes \sigma_z - \sigma_z \otimes \sigma_+). \qquad (11)$$

the hamiltonians of the periodic chains related to the twisted r-matrix were computed in [5]. we recall this result. we consider

$$t(z) = \mathrm{tr}_0\, R_{0N}(z, z_2) \cdots R_{02}(z, z_2)\, R_{01}(z, z_2), \qquad (12)$$

a family of commuting transfer matrices for the corresponding homogeneous periodic chain, $[t(z), t(z')] = 0$, where we treat $z_2$ as a parameter of the theory and $z = z_1$ as a spectral parameter. then the hamiltonian

$$H_{a,b,z_2} = \frac{q - q^{-1}}{2}\, z_2\, t(z)^{-1}\, \frac{d}{dz}\, t(z)\Big|_{z = z_2} \qquad (13)$$

can be computed by a standard procedure:

$$H_{a,b,z_2} = H_{xxz} + \sum_k \Bigl( c\,\bigl(\sigma_k^z \sigma_{k+1}^+ - \sigma_k^+ \sigma_{k+1}^z\bigr) + d\, \sigma_k^+ \sigma_{k+1}^+ \Bigr). \qquad (14)$$

here $c = \tfrac{1}{2}(b - a z_2)(q + q^{-1})$, $d = (a z_2 - b q)(a z_2 - b q^{-1})$, $\sigma_+ = e_{12}$, $\sigma_- = e_{21}$, $\sigma_z = e_{11} - e_{22}$, and

$$H_{xxz} = \sum_k \Bigl( \sigma_k^+ \sigma_{k+1}^- + \sigma_k^- \sigma_{k+1}^+ + \frac{q + q^{-1}}{4}\,\bigl(\sigma_k^z \sigma_{k+1}^z - 1\bigr) \Bigr). \qquad (15)$$

we see that, by a suitable choice of the parameters $a$, $b$ and $z_2$, we can add to the xxz hamiltonian an arbitrary linear combination of the terms $\sigma_k^z \sigma_{k+1}^+ - \sigma_k^+ \sigma_{k+1}^z$ and $\sigma_k^+ \sigma_{k+1}^+$, and the model will remain integrable. moreover, it was proved in [5] that the hamiltonian

$$H_{\alpha,\beta,u_2} = \frac{q - q^{-1}}{2}\, u_2\, t(u)^{-1}\, \frac{d}{du}\, t(u)\Big|_{u = u_2} \qquad (16)$$

for

$$t(u) = \mathrm{tr}_0\, R_{0N}(u, u_2) \cdots R_{01}(u, u_2), \qquad (17)$$

is given by the same formula (14), with coefficients $c$ and $d$ that now depend on $\alpha$, $\beta$, $u_2$ and $q$ (their explicit form is given in [5]). now it also makes sense in the xxx limit $q \to 1$:

$$H_{\alpha,\beta,u_2} = H_{xxx} + \sum_k \Bigl( c\,\bigl(\sigma_k^z \sigma_{k+1}^+ - \sigma_k^+ \sigma_{k+1}^z\bigr) + d\, \sigma_k^+ \sigma_{k+1}^+ \Bigr), \qquad (18)$$

where $c = -2\alpha$ and $d = \alpha u_2\,(\alpha u_2 - 2\beta)$.
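the deformed chain (14) is easy to realize explicitly for a small system. the following sketch (not from the paper) builds the xxz hamiltonian (15) plus the deformation terms of (14) for a periodic chain of n sites, with the coefficients c and d taken as free inputs rather than derived from a, b and z₂.

```python
import numpy as np
from functools import reduce

sp = np.array([[0., 1.], [0., 0.]])   # sigma^+ = e12
sm = np.array([[0., 0.], [1., 0.]])   # sigma^- = e21
sz = np.diag([1., -1.])               # sigma^z
I2 = np.eye(2)

def site(op, k, n):
    """operator acting as `op` on site k of a periodic chain of n sites."""
    ops = [I2] * n
    ops[k % n] = op
    return reduce(np.kron, ops)

def two(op1, op2, k, n):
    return site(op1, k, n) @ site(op2, k + 1, n)

def deformed_xxz(n, q, c, d):
    """hamiltonian of eq. (14): xxz chain plus the integrable deformation."""
    H = np.zeros((2**n, 2**n))
    for k in range(n):
        H += two(sp, sm, k, n) + two(sm, sp, k, n)
        H += (q + 1/q) / 4 * (two(sz, sz, k, n) - np.eye(2**n))
        H += c * (two(sz, sp, k, n) - two(sp, sz, k, n))
        H += d * two(sp, sp, k, n)
    return H

H = deformed_xxz(n=4, q=1.3, c=0.2, d=-0.1)
print("dim:", H.shape, " non-hermitian deformation:", not np.allclose(H, H.T))
```

note that the deformation terms are built from σ⁺ alone, so the deformed hamiltonian is not hermitian; integrability here means membership in the commuting family generated by the transfer matrix (12).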
references
[1] belavin, a. a., drinfeld, v. g.: on classical yang-baxter equation for simple lie algebras. funct. an. appl., vol. 16 (1982), no. 3, p. 1–29.
[2] etingof, p., kazhdan, d.: quantization of poisson algebraic groups and poisson homogeneous spaces. symétries quantiques (les houches, 1995), north-holland, amsterdam 1998, p. 935–946.
[3] halbout, g.: formality theorem for lie bialgebras and quantization of twists and coboundary r-matrices. adv. math., vol. 207 (2006), p. 617–633.
[4] halbout, g.: quantization of r–z-quasi-poisson manifolds and related modified classical dynamical r-matrices. [arxiv math.qa/0801.2789].
[5] khoroshkin, s., stolin, a., tolstoy, v.: q-power function over q-commuting variables and deformed xxx and xxz chains. phys. atomic nuclei, vol. 64 (2001), no. 12, p. 2173–2178.
[6] karolinsky, e., stolin, a., tarasov, v.: dynamical yang-baxter equation and quantization of certain poisson brackets. noncommutative geometry and representation theory in mathematical physics, contemp. math., vol. 391 (2005), amer. math. soc., providence, ri, p. 175–182.
[7] karolinsky, e., stolin, a., tarasov, v.: irreducible highest weight modules and equivariant quantization. adv. math., vol. 211 (2007), no. 1, p. 266–283.
[8] karolinsky, e., stolin, a., tarasov, v.: quantization of poisson homogeneous spaces, highest weight modules, and kostant's problem. in preparation.
[9] khoroshkin, s., pop, i., stolin, a., tolstoy, v.: on some lie bialgebra structures on polynomial algebras and their quantization. preprint no. 21, 2003/2004, mittag-leffler institute, sweden.
[10] khoroshkin, s., pop, i., samsonov, m., stolin, a., tolstoy, v.: on some lie bialgebra structures on polynomial algebras and their quantization. [arxiv math.qa/0706.1651v1]. to appear in comm. math. phys.
[11] montaner, f., zelmanov, e.: bialgebra structures on current lie algebras. preprint, university of wisconsin, madison, 1993.
[12] pop, i., stolin, a.: classification of quasi-trigonometric solutions of the classical yang-baxter equation. submitted.
[13] stolin, a.: on rational solutions of yang-baxter equation for sl(n). math. scand., vol. 69 (1991), no. 1, p. 57–80.
[14] stolin, a., yermolova-magnusson, j.: the 4th structure. czech. j. phys., vol. 56 (2006), no. 10/11, p. 1293–1297.
[15] tolstoy, v.: from quantum affine kac-moody algebras to drinfeldians and yangians. kac-moody lie algebras and related topics, contemp. math., vol. 343 (2004), amer. math. soc., p. 349–370.

alexander stolin
e-mail: astolin@chalmers.se
department of mathematics
university of göteborg
göteborg, sweden
functional determinants for radially separable partial differential operators
g. v. dunne

functional determinants of differential operators play a prominent role in many fields of theoretical and mathematical physics, ranging from condensed matter physics to atomic, molecular and particle physics. they are, however, difficult to compute reliably in non-trivial cases. in one-dimensional problems (i.e. functional determinants of ordinary differential operators), a classic result of gel'fand and yaglom greatly simplifies the computation of functional determinants. here i report some recent progress in extending this approach to higher dimensions (i.e., functional determinants of partial differential operators), with applications in quantum field theory.

keywords: quantum field theory, functional determinants, zeta functions, spectral theory, partial differential operators.

1 introduction and statement of results

this paper considers the fundamental questions: what is the determinant of a partial differential operator, and how might one compute it? determinants of differential operators occur naturally in many applications in mathematical and theoretical physics, and also have inherent mathematical interest, since they encode certain spectral properties of differential operators.
physically, such determinants arise, for example, in semiclassical approximations in quantum mechanics and quantum field theory, in grand canonical potentials in many-body theory and statistical mechanics, in gap equations in the mean-field approximation, in lattice gauge theory, and in gauge fixing (the faddeev-popov determinant) for non-abelian gauge theory. determinants of free laplacians and free dirac operators have been extensively studied [1–6], but much less is known about operators involving an arbitrary potential function. when the operator under consideration is an ordinary (i.e., one-dimensional) differential operator, a beautiful general theory due to gel'fand and yaglom [7] has been developed for defining and computing the determinant [8–11]. in this paper i discuss attempts to extend these results to partial differential operators. even for the simple radially separable case of the free laplacian on a 2d disc, the naive extension via a sum over partial waves of ordinary differential operators leads to a divergence, as noted by forman [9]. however, this divergence has a clear physical meaning and can be understood in the context of renormalization in quantum field theory. this leads to finite, renormalized expressions [see eqs. (6)–(8) below] for the determinant of such separable operators. the result for four dimensions was first found in [12], using radial wkb and an angular momentum cut-off regularization and renormalization [13], and then in [14] using the zeta function approach to determinants. the primary motivation of this work is applications in quantum field theory, so we concentrate on examples in two, three and four dimensions, but the mathematical generalization to arbitrary dimension should be clear.

consider the radially separable partial differential operators

$$\mathcal{M} = -\Delta + V(r), \qquad \mathcal{M}_{\rm free} = -\Delta, \qquad (1)$$

where $\Delta$ is the laplace operator in $\mathbb{R}^d$, and $V(r)$ is a radial potential vanishing at infinity as $r^{-2-\epsilon}$ for $d = 2$ and $d = 3$, and as $r^{-4-\epsilon}$ for $d = 4$. for $d = 1$, with dirichlet boundary conditions on the interval $[0, \infty)$, the results of gel'fand and yaglom [7] lead to the following simple expression for the determinant ratio:

$$\frac{\det[\mathcal{M} + m^2]}{\det[\mathcal{M}_{\rm free} + m^2]} = \frac{\psi(\infty)}{\psi_{\rm free}(\infty)}, \qquad (2)$$

where $[\mathcal{M} + m^2]\,\psi = 0$, with initial value boundary conditions $\psi(0) = 0$ and $\psi'(0) = 1$. the function $\psi_{\rm free}$ is defined similarly in terms of the free operator $[\mathcal{M}_{\rm free} + m^2]$. the squared mass, $m^2$, is important for physical applications, and plays the mathematical role of a spectral parameter. the result (2) is geometrically interesting, in addition to being computationally simple, as it means that the determinant is determined simply by the boundary values of the solutions of $[\mathcal{M} + m^2]\,\psi = 0$; no detailed information is needed concerning the actual spectrum of eigenvalues.

now consider dimensions $d > 1$. since the potential is radial, $V = V(r)$, we can express the eigenfunctions of $\mathcal{M}$ as linear combinations of basis functions of the form

$$\Psi(r, \theta) = r^{-\frac{d-1}{2}}\, \psi_{(l)}(r)\, Y_{(l)}(\theta),$$

where $Y_{(l)}(\theta)$ is a hyperspherical harmonic, labeled in part by a non-negative integer $l$, and the radial function $\psi_{(l)}(r)$ is an eigenfunction of the schrödinger-like radial operator

$$\mathcal{M}_{(l)} = -\frac{d^2}{dr^2} + \frac{\left(l + \frac{d-3}{2}\right)\left(l + \frac{d-1}{2}\right)}{r^2} + V(r). \qquad (3)$$

$\mathcal{M}_{(l)}^{\rm free}$ is defined similarly, with the potential omitted: $V = 0$. in dimension $d \geq 2$, the radial eigenfunctions $\psi_{(l)}$ have degeneracy given by

$$\deg(l; d) = \frac{(2l + d - 2)\,(l + d - 3)!}{l!\,(d - 2)!}. \qquad (4)$$
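before assembling the partial-wave sum, note that the one-dimensional result (2) is immediate to implement numerically. the sketch below (not part of the paper) integrates the initial value problem for a sample potential; the potential, mass and truncation radius are arbitrary illustrative choices, the interval being cut off where the ratio has converged.

```python
import numpy as np
from scipy.integrate import solve_ivp

m = 1.0
V = lambda r: -1.5 * np.exp(-r**2)      # illustrative radial potential

def psi_end(include_V, R=25.0):
    """integrate [-d^2/dr^2 + V + m^2] psi = 0, psi(0)=0, psi'(0)=1."""
    def rhs(r, y):
        pot = V(r) if include_V else 0.0
        return [y[1], (pot + m**2) * y[0]]
    sol = solve_ivp(rhs, (0.0, R), [0.0, 1.0], rtol=1e-10, atol=1e-12)
    return sol.y[0, -1]

ratio = psi_end(True) / psi_end(False)   # eq. (2)
print("det[M + m^2] / det[M_free + m^2] =", ratio)
print("ln det ratio =", np.log(ratio))
```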
formally, for the separable operators in (1), the logarithm of the determinant ratio can be written as a sum over $l$ (weighted with the degeneracy factor) of the logarithms of one-dimensional determinant ratios:

$$\ln \frac{\det[\mathcal{M} + m^2]}{\det[\mathcal{M}_{\rm free} + m^2]} = \sum_{l=0}^{\infty} \deg(l; d)\, \ln \frac{\det[\mathcal{M}_{(l)} + m^2]}{\det[\mathcal{M}_{(l)}^{\rm free} + m^2]}. \qquad (5)$$

each term in the sum can be computed using the sturm-liouville extension [10] of the gel'fand-yaglom result (2). however, the $l$ sum in (5) is divergent, as noted by forman [9] for the free laplace operator on a two-dimensional disc. nevertheless, it is possible to understand this divergence and to define a finite, renormalized determinant ratio for the radially separable partial differential operators (1). specifically, we have found [14] the following simple expressions, which generalize (2) to higher dimensions:

$$\ln \frac{\det[\mathcal{M} + m^2]}{\det[\mathcal{M}_{\rm free} + m^2]}\bigg|_{d=2} = \ln\!\left(\frac{\psi_{(0)}(\infty)}{\psi_{(0)}^{\rm free}(\infty)}\right) + \sum_{l=1}^{\infty} 2 \left[ \ln\!\left(\frac{\psi_{(l)}(\infty)}{\psi_{(l)}^{\rm free}(\infty)}\right) - \frac{1}{2l} \int_0^{\infty} dr\, r\, V(r) \right] + \int_0^{\infty} dr\, r\, V(r) \left[ \ln\!\left(\frac{\mu r}{2}\right) + \gamma \right], \qquad (6)$$

$$\ln \frac{\det[\mathcal{M} + m^2]}{\det[\mathcal{M}_{\rm free} + m^2]}\bigg|_{d=3} = \sum_{l=0}^{\infty} (2l + 1) \left[ \ln\!\left(\frac{\psi_{(l)}(\infty)}{\psi_{(l)}^{\rm free}(\infty)}\right) - \frac{1}{2l + 1} \int_0^{\infty} dr\, r\, V(r) \right], \qquad (7)$$

$$\ln \frac{\det[\mathcal{M} + m^2]}{\det[\mathcal{M}_{\rm free} + m^2]}\bigg|_{d=4} = \sum_{l=0}^{\infty} (l + 1)^2 \left[ \ln\!\left(\frac{\psi_{(l)}(\infty)}{\psi_{(l)}^{\rm free}(\infty)}\right) - \frac{1}{2(l + 1)} \int_0^{\infty} dr\, r\, V(r) + \frac{1}{8(l + 1)^3} \int_0^{\infty} dr\, r^3\, V(V + 2m^2) \right] - \frac{1}{8} \int_0^{\infty} dr\, r^3\, V(V + 2m^2) \left[ \ln\!\left(\frac{\mu r}{2}\right) + \gamma + 1 \right]. \qquad (8)$$

here $\gamma$ is euler's constant, and $\mu$ is a renormalization scale (defined in the next section), which is essential for physical applications, and which arises naturally in even dimensions. a conventional renormalization choice is to take $\mu = m$ in (6)–(8). in each of (6)–(8), the sum over $l$ is convergent once the indicated subtractions are made. the function $\psi_{(l)}(r)$ is the solution to the radial initial value problem

$$\left[\mathcal{M}_{(l)} + m^2\right] \psi_{(l)}(r) = 0, \qquad \psi_{(l)}(r) \sim r^{\,l + \frac{d-1}{2}} \quad \text{as } r \to 0. \qquad (9)$$

the function $\psi_{(l)}^{\rm free}(r)$ is defined similarly, with the same behavior as $r \to 0$, in terms of the operator $[\mathcal{M}_{(l)}^{\rm free} + m^2]$. thus, in $d$ dimensions, $\psi_{(l)}^{\rm free}(r)$ is expressed as a bessel function:

$$\psi_{(l)}^{\rm free}(r) = \Gamma\!\left(l + \frac{d}{2}\right) \left(\frac{m}{2}\right)^{-\left(l + \frac{d-2}{2}\right)} \sqrt{r}\; I_{l + \frac{d-2}{2}}(m r). \qquad (10)$$
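a minimal numerical sketch of (6) and (9) follows (it is not from the paper): for each l it integrates the ratio ψ₍ₗ₎/ψ₍ₗ₎^free directly, which avoids the exponential growth of the individual solutions; the sample potential, the truncation radius and the truncation of the l sum are arbitrary illustrative choices, and μ is set to m.

```python
import numpy as np
from scipy.integrate import solve_ivp, quad
from scipy.special import ive

m, mu = 1.0, 1.0                       # mass and renormalization scale mu = m
V = lambda r: -1.5 * np.exp(-r**2)     # illustrative, fast-decaying potential

def log_ratio(l, R=30.0, r0=1e-3):
    """ln( psi_(l)(inf) / psi_(l)^free(inf) ) for d = 2, via f = psi/psi_free,
    which obeys f'' + 2 (psi_free'/psi_free) f' = V f with f -> 1 as r -> 0."""
    nu = l                              # nu = l + (d-2)/2 with d = 2
    def w(r):                           # logarithmic derivative of psi_free
        x = m * r
        dI = (ive(nu - 1, x) + ive(nu + 1, x)) / (2 * ive(nu, x))
        return 0.5 / r + m * dI
    def rhs(r, y):
        return [y[1], V(r) * y[0] - 2 * w(r) * y[1]]
    sol = solve_ivp(rhs, (r0, R), [1.0, 0.0], rtol=1e-9, atol=1e-12)
    return np.log(sol.y[0, -1])

Ivr = quad(lambda r: r * V(r), 0, np.inf)[0]
Iren = quad(lambda r: r * V(r) * (np.log(mu * r / 2) + np.euler_gamma),
            0, np.inf)[0]

L = 40                                  # truncate where subtracted terms vanish
result = log_ratio(0) + Iren
result += sum(2 * (log_ratio(l) - Ivr / (2 * l)) for l in range(1, L + 1))
print("ln det ratio (d = 2):", result)
```

the exponentially scaled bessel function `ive` is used so that the logarithmic derivative of (10) stays finite at large mr.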
notice that the results (6)–(8) state once again that the determinant is determined by the boundary values of solutions of $[\mathcal{M} + m^2]\,\psi = 0$, with the only additional information being a finite number of integrals involving the potential $V(r)$. we also stress the computational simplicity of (6)–(8), as the initial value problem (9) is trivial to implement numerically.

2 zeta function formalism

the functional determinant can be defined in terms of a zeta function [1, 2, 5] for the operator $\mathcal{M}$. for dimensional reasons, we define

$$\zeta_{\mathcal{M} + m^2}(s) = \sum_{\lambda} \left(\frac{\lambda + m^2}{\mu^2}\right)^{-s} = \mu^{2s} \sum_{\lambda} (\lambda + m^2)^{-s}, \qquad (11)$$

where the sum is over the spectrum $\{\lambda\}$ of $\mathcal{M}$, and $\mu$ is an arbitrary parameter with the dimension of a mass. physically, $\mu$ plays the role of a renormalization scale. then the logarithm of the determinant is defined as [1, 2, 5]

$$\ln \det[\mathcal{M} + m^2] = -\zeta'_{\mathcal{M} + m^2}(0) = -\tilde{\zeta}'(0) - \ln(\mu^2)\, \tilde{\zeta}(0), \qquad (12)$$

where $\tilde{\zeta}(s) = \sum_\lambda (\lambda + m^2)^{-s}$. to compute the determinant ratio, we define the zeta function difference

$$\zeta(s) = \zeta_{\mathcal{M} + m^2}(s) - \zeta_{\mathcal{M}_{\rm free} + m^2}(s). \qquad (13)$$

thus we need to compute the zeta function and its derivative, each evaluated at $s = 0$. in general, the zeta function at $s = 0$ is related to the heat kernel coefficient $a_{d/2}(\mathcal{M})$ associated with the operator $\mathcal{M}$ [6]: $\zeta(0) = a_{d/2}(\mathcal{M})$. for the operator $\mathcal{M} = -\Delta + V$, these heat kernel coefficients are known [6], and we find the standard results

$$\zeta(0) = \begin{cases} -\dfrac{1}{2} \displaystyle\int_0^{\infty} dr\, r\, V(r), & d = 2, \\[2mm] 0, & d = 3, \\[2mm] \dfrac{1}{16} \displaystyle\int_0^{\infty} dr\, r^3\, V(V + 2m^2), & d = 4. \end{cases} \qquad (14)$$

this gives the first term on the rhs of (12). now we turn to the second term, the derivative of the zeta function at $s = 0$. this can be evaluated using the relation to the familiar jost functions of scattering theory [15]. consider the radial eigenvalue equation

$$\mathcal{M}_{(l)}\, \phi_{(l),p} = p^2\, \phi_{(l),p}, \qquad (15)$$

where $\mathcal{M}_{(l)}$ is the schrödinger-like radial operator defined in (3). a distinguished role is played by the so-called regular solution, $\phi_{(l),p}(r)$, which is defined to have the same behavior as $r \to 0$ as the solution without potential:

$$\phi_{(l),p}(r) \sim \hat{j}_l(p r) \quad \text{as } r \to 0. \qquad (16)$$

here the spherical bessel function is defined as

$$\hat{j}_l(z) = \sqrt{\frac{\pi z}{2}}\; J_{l + \frac{d-2}{2}}(z).$$

the asymptotic behavior of the regular solution $\phi_{(l),p}(r)$ as $r \to \infty$ defines the jost function $f_l(p)$ [15]:

$$\phi_{(l),p}(r) \sim \frac{i}{2} \left[ f_l(p)\, \hat{h}^-_{l}(p r) - f_l^*(p)\, \hat{h}^+_{l}(p r) \right] \quad \text{as } r \to \infty. \qquad (17)$$

here $\hat{h}^{\pm}_l(p r)$ are the riccati-hankel functions,

$$\hat{h}^{\mp}_l(z) = (\pm i)^{\,l + \frac{d-1}{2}}\, \sqrt{\frac{\pi z}{2}}\; H^{(2),(1)}_{l + \frac{d-2}{2}}(z). \qquad (18)$$

as is well known from scattering theory [15], the analytic properties of the jost function $f_l(p)$ strongly depend on the properties of the potential $V(r)$. analyticity of the jost function as a function of $p$ for $\Im\, p > 0$ is guaranteed if, in addition to the aforementioned behavior as $r \to \infty$, we impose $V(r) \sim r^{-2+\epsilon}$ as $r \to 0$, and continuity of $V(r)$ in $0 < r < \infty$ (except perhaps at a finite number of finite discontinuities). for us, the analytic properties of the jost function in the upper half plane will be of particular importance, because they are related to the shifting of contours in the complex momentum plane.
by standard contour manipulations [6], the zeta function can be expressed in terms of the jost functions as

$$\zeta(s) = \frac{\sin(\pi s)}{\pi} \sum_{l=0}^{\infty} \deg(l; d) \int_m^{\infty} dk\, (k^2 - m^2)^{-s}\, \frac{\partial}{\partial k} \ln f_l(ik). \qquad (19)$$

this representation is valid for $\Re\, s > d/2$, and the technical problem is the construction of the analytic continuation of (19) to a neighborhood of $s = 0$. if the expression (19) were analytic at $s = 0$, then we would deduce that

$$\zeta'(0) = -\sum_{l=0}^{\infty} \deg(l; d)\, \ln f_l(im). \qquad (20)$$

from the definition (17) of the jost function,

$$f_l(im) = \lim_{r \to \infty} \frac{\phi_{(l),im}(r)}{\phi^{\rm free}_{(l),im}(r)} = \frac{\psi_{(l)}(\infty)}{\psi^{\rm free}_{(l)}(\infty)}, \qquad (21)$$

where $\psi_{(l)}(r)$ is defined in (9). thus, the regulated expression (20) coincides with the formal partial wave expansion (5), using the gel'fand-yaglom result (2) for each $l$. however, the expansion (20) is divergent in positive integer dimensions. in the zeta function approach, the divergence of the formal sum in (20) is directly related to the need for analytic continuation of $\zeta(s)$ in $s$ to a region including $s = 0$. from (19), this analytic continuation relies on the uniform asymptotic behavior of the jost function $f_l(ik)$. denoting this behavior by $f_l^{\rm asym}(ik)$, the analytic continuation is achieved by adding and subtracting the leading asymptotic terms of the integrand in (19), writing

$$\zeta(s) = \zeta_f(s) + \zeta_{\rm as}(s), \qquad (22)$$

where

$$\zeta_f(s) = \frac{\sin(\pi s)}{\pi} \sum_{l=0}^{\infty} \deg(l; d) \int_m^{\infty} dk\, (k^2 - m^2)^{-s}\, \frac{\partial}{\partial k} \left[ \ln f_l(ik) - \ln f_l^{\rm asym}(ik) \right], \qquad (23)$$

and

$$\zeta_{\rm as}(s) = \frac{\sin(\pi s)}{\pi} \sum_{l=0}^{\infty} \deg(l; d) \int_m^{\infty} dk\, (k^2 - m^2)^{-s}\, \frac{\partial}{\partial k} \ln f_l^{\rm asym}(ik). \qquad (24)$$

ultimately we are interested in the analytic continuation of $\zeta(s)$ to $s = 0$. as many asymptotic terms will be included in $f_l^{\rm asym}(ik)$ as are necessary to make $\zeta_f(s)$ as given in (23) analytic around $s = 0$. on the other hand, for $\zeta_{\rm as}(s)$ the analytic continuation to $s = 0$ can be constructed in closed form using an explicit representation of the asymptotic behavior of the jost function, derived in the next section.

2.1 asymptotics of the jost function

the asymptotics of the jost function $f_l(ik)$ follows from standard results in scattering theory [15]. in particular, the partial-wave lippmann-schwinger integral equation for the regular solution,

$$\phi_{(l),ik}(r) = \hat{i}_l(k r) + \int_0^r dr' \left[ \hat{i}_l(k r)\, \hat{k}_l(k r') - \hat{i}_l(k r')\, \hat{k}_l(k r) \right] V(r')\, \phi_{(l),ik}(r'), \qquad (25)$$

where $\hat{i}_l$ and $\hat{k}_l$ denote the riccati versions of the modified bessel functions $I_\nu$ and $K_\nu$, with $\nu \equiv l + \frac{d-2}{2}$, leads to an iterative expansion for $f_l(ik)$ in powers of the potential $V(r)$. for dimensions $d \leq 4$, we need at most the $\mathcal{O}(V)$ and $\mathcal{O}(V^2)$ terms of $\ln f_l(ik)$:

$$\ln f_l(ik) = \int_0^{\infty} dr\, r\, V(r)\, K_\nu(k r)\, I_\nu(k r) - \int_0^{\infty} dr\, r\, V(r)\, K_\nu^2(k r) \int_0^r dr'\, r'\, V(r')\, I_\nu^2(k r') + \mathcal{O}(V^3). \qquad (26)$$

this iterative scheme effectively reduces the calculation of the asymptotics of the jost function to the well-known uniform asymptotics of the modified bessel functions $K_\nu$ and $I_\nu$. using these asymptotics, we define $\ln f_l^{\rm asym}(ik)$ as the $\mathcal{O}(V)$ and $\mathcal{O}(V^2)$ parts of this uniform asymptotic expansion (with $t \equiv 1/\sqrt{1 + (k r/\nu)^2}$):

$$\ln f_l^{\rm asym}(ik) = \frac{1}{2\nu} \int_0^{\infty} dr\, r\, V(r)\, t - \frac{1}{16 \nu^3} \int_0^{\infty} dr\, r\, V(r) \left( t^3 - 6 t^5 + 5 t^7 \right) - \frac{1}{8 \nu^3} \int_0^{\infty} dr\, r^3\, V^2(r)\, t^3. \qquad (27)$$
2.2 computing $\zeta_f'(0)$

by construction, $\zeta_f(s)$, defined in (23), is now well defined at $s = 0$, and we find

$$\zeta_f'(0) = -\sum_{l=0}^{\infty} \deg(l; d) \left[ \ln f_l(im) - \ln f_l^{\rm asym}(im) \right]. \qquad (28)$$

this form is suitable for straightforward numerical computation, as the jost function $f_l(im)$ can be computed using (9) and (21), while $\ln f_l^{\rm asym}(im)$ can be computed using (27). with the subtraction of $\ln f_l^{\rm asym}(im)$ in (28), the $l$ sum is now convergent. however, it is possible to find an even simpler expression. it turns out that the subtraction in (28) is an over-subtraction. to see this, expand $\ln f_l^{\rm asym}(im)$ in its large-$l$ behavior as follows (with $t_m \equiv 1/\sqrt{1 + (m r/\nu)^2}$):

$$\ln f_l^{\rm asym}(im) \sim \frac{1}{2\nu} \int_0^{\infty} dr\, r\, V - \frac{1}{8 \nu^3} \int_0^{\infty} dr\, r^3\, V(V + 2m^2) + \frac{1}{2\nu} \int_0^{\infty} dr\, r\, V \left[ t_m - 1 + \frac{(m r)^2}{2 \nu^2} \right] - \frac{1}{16 \nu^3} \int_0^{\infty} dr\, r\, V \left( t_m^3 - 6 t_m^5 + 5 t_m^7 \right) - \frac{1}{8 \nu^3} \int_0^{\infty} dr\, r^3\, V^2 \left( t_m^3 - 1 \right). \qquad (29)$$

the first term is $\mathcal{O}(1/l)$, and the second is $\mathcal{O}(1/l^3)$, while the remaining terms are all $\mathcal{O}(1/l^5)$. in dimensions $d \leq 4$, the degeneracy factor $\deg(l; d)$ is at most quadratic in $l$, and so these last terms are finite when summed over $l$ in (28). (in fact, in $d = 2$ and $d = 3$, the $\mathcal{O}(1/l^3)$ terms are also finite when summed over $l$.) in the next section we show that these finite terms cancel exactly against corresponding terms arising in the evaluation of $\zeta_{\rm as}'(0)$. thus, for $\zeta'(0) = \zeta_f'(0) + \zeta_{\rm as}'(0)$, we only actually need to subtract the leading large-$l$ terms in (29), rather than the full asymptotics in (27).

2.3 computing $\zeta_{\rm as}'(0)$

the explicit form of the asymptotic terms in (27) provides the analytic continuation to $s = 0$ of $\zeta_{\rm as}(s)$, as defined in (24). the $k$ integrals are done using

$$\frac{\sin(\pi s)}{\pi} \int_m^{\infty} dk\, (k^2 - m^2)^{-s}\, \frac{\partial}{\partial k} \left[ 1 + \left(\frac{k r}{\nu}\right)^2 \right]^{-n/2} = -\frac{\Gamma\!\left(s + \frac{n}{2}\right)}{\Gamma(s)\, \Gamma\!\left(\frac{n}{2}\right)}\, m^{-2s} \left(\frac{m r}{\nu}\right)^{2s} \left[ 1 + \left(\frac{m r}{\nu}\right)^2 \right]^{-s - \frac{n}{2}}. \qquad (30)$$

therefore, we find (with $T \equiv 1 + (m r/\nu)^2$)

$$\zeta_{\rm as}(s) = -\sum_{l=0}^{\infty} \deg(l; d)\, \frac{m^{-2s}}{\sqrt{\pi}\,\Gamma(s)} \left\{ \frac{\Gamma(s + \tfrac12)}{2\nu} \int_0^{\infty} dr\, r V \left(\frac{m r}{\nu}\right)^{2s} T^{-s - \frac12} - \frac{1}{16 \nu^3} \int_0^{\infty} dr\, r V \left(\frac{m r}{\nu}\right)^{2s} \left[ 2\,\Gamma(s + \tfrac32)\, T^{-s - \frac32} - 8\,\Gamma(s + \tfrac52)\, T^{-s - \frac52} + \tfrac{8}{3}\,\Gamma(s + \tfrac72)\, T^{-s - \frac72} \right] - \frac{\Gamma(s + \tfrac32)}{4 \nu^3} \int_0^{\infty} dr\, r^3 V^2 \left(\frac{m r}{\nu}\right)^{2s} T^{-s - \frac32} \right\}. \qquad (31)$$

we now subtract sufficiently many terms inside the sum to ensure the analytic continuation of $\zeta_{\rm as}(s)$ to $s = 0$. the added-back terms produce riemann zeta function terms, such as $\zeta_R(2s + 1)$, whose analytic continuation is immediate. for example, in $d = 4$ (where $\nu = l + 1$), we find

$$\zeta_{\rm as}'(0) = \frac{1}{8} \int_0^{\infty} dr\, r^3\, V(V + 2m^2) \left[ \ln\!\left(\frac{m r}{2}\right) + \gamma + 1 \right] - \sum_{\nu=1}^{\infty} \nu^2 \left\{ \frac{1}{2\nu} \int_0^{\infty} dr\, r V \left[ t_m - 1 + \frac{(m r)^2}{2 \nu^2} \right] - \frac{1}{16 \nu^3} \int_0^{\infty} dr\, r V \left( t_m^3 - 6 t_m^5 + 5 t_m^7 \right) - \frac{1}{8 \nu^3} \int_0^{\infty} dr\, r^3 V^2 \left( t_m^3 - 1 \right) \right\}. \qquad (32)$$

notice that the terms involving the summation over $\nu$ cancel exactly against identical terms in $\zeta_f'(0)$ from (29), after those terms are summed over $l$ with the $d = 4$ degeneracy factor $\nu^2 = (l + 1)^2$. furthermore, note that the $\ln r$ term inside the integral on the first line of (32) is precisely of the same form as the renormalization term in (14), so the $\ln \mu$ in (12) combines with $\ln r$ to form the dimensionless combination $\ln(\mu r)$ in (8).
the analogous computations for $d = 2$ and $d = 3$ lead to (6) and (7), respectively [14].

3 conclusions and applications

the mathematical results reported here are the formulas in (6)–(8), which provide simple new expressions for the determinant of a radially separable partial differential operator of the form $-\Delta + m^2 + V(r)$. this generalizes the gel'fand-yaglom result (2) to higher dimensions, and greatly increases the class of differential operators for which the determinant can be computed simply and efficiently. the derivation presented here [14] uses the zeta function definition of the determinant, but the same expressions can be found using the radial wkb approach of [12, 13]. furthermore, it can be shown [14] how these expressions relate to the feynman diagrammatic definition of the determinant based on dimensional regularization [16]. these superficially different expressions are in fact equal, although the zeta function expression is considerably simpler to implement.

these results lead to many direct applications in quantum field theory, where they extend the class of solvable fluctuation determinant problems away from the restrictive class of constant background fields, or one-dimensional background fields, to the more general class of separable higher-dimensional background fields. mathematically, this represents a small but surprisingly non-trivial step towards more general partial differential operators. while this is still a small class of partial differential operators, it is large enough to have many important applications in quantum field theory, for example the study of quantum fluctuations in the presence of vortices, monopoles, sphalerons, instantons, domain walls (branes), etc.

a number of generalizations could be made. first, in certain quantum field theory applications the determinant may have zero modes, and correspondingly one is actually interested in computing the determinant with these zero modes removed. our method provides a simple way to compute such determinants [12]. a systematic study of this approach, exploiting the relation between zero modes and topology, would be extremely interesting. another important generalization is to include directly the matrix structure that arises from dirac-like differential operators and from non-abelian gauge degrees of freedom. the feynman diagrammatic approach is well developed for such separable problems [17]; for example, it has been applied to the fluctuations about the electroweak sphaleron [18, 19] and to compute the metastability of the electroweak vacuum [20]. more recently, the angular momentum cut-off method has been used to compute the full mass dependence of the fermion determinant in a four-dimensional yang-mills instanton background [13], to compute the fermion determinant in a background instanton in the two-dimensional chiral higgs model [21], and to address the fluctuation problem for false vacuum decay in curved space [22]. a unified zeta function analysis should be possible, as there is a straightforward generalization of the gel'fand-yaglom result (2) to systems of ordinary differential operators [10]. finally, to conclude, the real mathematical challenge is to ask whether the restriction of separability can be loosened. this is a difficult problem, but it is clear that any progress will be interesting.

acknowledgments

i sincerely thank the organizer, miloslav znojil, and also pavel exner, for the kind invitation to villa lana and the beautiful city of prague.
i acknowledge support from the dfg through the mercator guest professor program, and from the us doe through grant de-fg02-92er40716. this talk is based on work done in collaboration with klaus kirsten.

references
[1] ray, d. b., singer, i. m.: r-torsion and the laplacian on riemannian manifolds. adv. math., vol. 7 (1971), p. 145–210.
[2] hawking, s. w.: zeta function regularization of path integrals in curved space-time. commun. math. phys., vol. 55 (1977), p. 133–148.
[3] d'hoker, e., phong, d. h.: on determinants of laplacians on riemann surfaces. commun. math. phys., vol. 104 (1986), p. 537–545.
[4] sarnak, p.: determinants of laplacians. commun. math. phys., vol. 110 (1987), p. 113–120.
[5] elizalde, e., odintsov, s. d., romeo, a., bytsenko, a. a., zerbini, s.: zeta regularization techniques with applications. world scientific, singapore, 1994.
[6] kirsten, k.: spectral functions in mathematics and physics. chapman-hall, boca raton, 2002.
[7] gelfand, i. m., yaglom, a. m.: integration in functional spaces and its applications in quantum physics. j. math. phys., vol. 1 (1960), p. 48–69.
47 (1990), p. 263–268; the effective action of a spin 1/2 field in the background of a chiral soliton, z. phys. c vol. 53 (1992), p. 407–411. [18] carson, l., li, x., mclerran, l. d., wang, r. t.: exact computation of the small fluctuation determinant around a sphaleron, phys. rev. d vol. 42, (1990), p. 2127–2143 [19] baacke, j., junker, s.: quantum fluctuations around the electroweak sphaleron, phys. rev. d vol. 49 (1994), p. 2055–2073 [arxiv:hep-ph/9308310]; quantum fluctuations of the electroweak sphaleron: erratum and addendum, phys. rev. d vol. 50 (1994), p. 4227–4228 [arxiv:hep-th/9402078]. [20]isidori, g. ridolfi, g., strumia, a.: on the metastability of the standard model vacuum, nucl. phys. b vol. 609 (2001), p. 387–409 [arxiv:hep-ph/0104016]. [21] burnier, y., shaposhnikov, m.: one-loop fermionic corrections to the instanton transition in two dimensional chiral higgs model, phys. rev. d vol. 72 (2005), 065011 [arxiv:hep-ph/0507130]. [22] dunne, g. v., wang, q. h.: fluctuations about cosmological instantons, phys. rev. d vol. 74 (2006), 024018 [arxiv:hep-th/0605176]. prof. gerald v. dunne phone: +1 860 486 4978 e-mail: dunne@phys.uconn.edu department of physics university of connecticut storrs, ct 06269, usa (and institut fürtheoretische physik universität heidelberg philosophenweg 16 69120 heidelberg, germany) 8 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 47 no. 2–3/2007 ap_06_4.vp 1 definition of liabilities 1.1 what is a “liability”? generally, the term “liability” refers to any legal responsibility, duty or obligation, the state of one who is bound in law and justice to do something which may be enforced by action. such liability may arise from contracts either express or implied or in consequence of torts committed. according to international accounting standard ias 37, liability is defined as an obligation to transfer economic benefits as a result of a past transaction or past events. the obligation can be legal (required by valid legislation) or construed. in this second case, a company (operator) has created a valid expectation that it will accept responsibilities by an established pattern of past practice, the company’s published policies or a sufficiently specific current statement. if we apply the above mentioned principle in the nuclear area, we should define events that can be considered as substantial for the formation of an obligation. at the same time, there must be a legal basis for distinct assignment of responsibility to the operator (legal obligation), or the operator should optionally accept responsibility. 1.2 spent nuclear fuel nuclear fuel, in the case of a nuclear power plant usually in the form of fuel assemblies (bundles of tubes made from a special alloy filled with uranium dioxide with increased isotopic uranium 235 content), does not pose any substantial danger from radioactivity or from the toxic point of view for the environment until it is loaded into the reactor core and exposed to neutron flux, and chain reaction is started. fission of atoms in uranium dioxide matrix leads to the creation of fission products. as the number of completed fissions within the fuel assembly increases with time of residence of the fuel in the core, the radioactivity and toxicity and the related hazard to the environment also rises. 
at a certain point of time, the number of atoms within the fuel assembly susceptible to fission by thermal neutrons is too low, and the number of atoms capturing thermal neutrons without subsequent fission is too high. a chain fission reaction is then no longer sustainable under safe operational conditions, and a part of the fuel assemblies, those with the highest burn-up, must be replaced by fresh fuel assemblies. fuel assemblies taken out of the reactor are called spent fuel assemblies (generally, spent fuel). spent fuel is highly radioactive and toxic, as the original chemical composition of almost pure uranium dioxide is partly changed by fission into many other chemical elements, some of them with dangerous radiological and toxic properties. therefore spent fuel must be reliably separated from the environment. what is more, spontaneous decay of the fission products, which continues after discharge of the fuel from the core, produces substantial heat. as a result, the discharged fuel must be cooled down in the spent fuel pool for at least 5 years. this period (as well as the radioactivity and toxicity) depends on the fuel burn-up. burn-up means megawatt-days of energy per metric ton of all uranium initially contained in the fuel assembly. the higher the burn-up, the longer the necessary cooling-down period; rising burn-up has been a general trend for many years.

when fuel assemblies have been cooled down sufficiently to be dealt with, they are usually placed into intermediate storage. the main purpose of this storage is to enable safe separation from the environment until geological disposal is in operation (or some other use of the spent fuel, e.g. reprocessing, partitioning and transmutation, or some other as yet unknown technology), or until a sufficient batch of spent fuel has accumulated and can be disposed of economically. additional cooling down and a decrease in radioactivity are also important factors. currently two main storage technologies are used. the older system is "wet" storage, i.e. in a large pool with circulated water as the cooling medium, in principle the same as the pool in a reactor building used to cool down spent fuel. the alternative is "dry" storage, based on large and heavy containers cooled by natural circulation of air. the containers are in most cases also used for transportation of spent fuel (dual purpose containers). this technology is currently more often chosen by nuclear power plant operators.
the time of intermediate storage varies widely according to the utility/state spent fuel policy and strategy. when spent fuel is reprocessed, the storage time tends to be shorter (up to 20 years); however, in some countries intermediate storage of up to 100 years is foreseen.

there are various concepts of future spent fuel disposal. spent fuel can either be conditioned (fuel assemblies are decomposed in order to reduce volume) before being placed into disposal containers, or it can be disposed of without any processing. disposal containers loaded with spent fuel are transported deep underground via an access shaft or access tunnel. the disposal container is placed into a pre-planned position (usually a hole bored in a disposal corridor) and fixed with sealing material. the properties of deep disposal systems must ensure that the radiotoxic content of the spent fuel is reliably contained (engineering barriers) and will not escape from the repository for several hundred thousand years, until the spent fuel is no longer dangerous for the environment.

if spent fuel is reprocessed, the separated uranium and plutonium can be re-used in the reactor (mox and/or repu fuel). usually only one reprocessing cycle is carried out; further recycling (reprocessing of mox fuel) is difficult due to the unfavourable isotopic composition, with a high ratio of neutron-absorbing isotopes. during the process of spent fuel reprocessing, the volume of waste is substantially reduced, as the re-usable uranium and plutonium form about 97 % and the waste constitutes only 3 % of the spent fuel volume. the highly radioactive waste is fixed in a matrix of borosilicate glass and placed into disposal containers. such containers can easily be stored for a long period in slightly modified intermediate storage for spent fuel, and disposed of in a deep geological repository in the same manner as spent fuel.

having in mind the back end of the fuel cycle, we return to the question of the nuclear power plant operator's liability. the crucial moment for the emergence of the liability is clearly the start of the first chain reaction after loading the fuel assembly, when the properties of the fuel assembly are irreversibly changed. this concept is widely used by utilities when assessing the liability for future expenditures connected with storage and disposal of spent fuel. therefore only fuel which has been loaded into the reactor is considered a liability, not purchased fresh fuel before it has been loaded into the core.
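the accounting consequence of this timing convention can be made concrete with a small illustration (not from the paper): under ias 37-style provisioning, when a batch of fuel is loaded, the operator recognizes a provision equal to the discounted estimate of its future storage and disposal costs. all figures below are invented inputs, and the constant real discount rate is an assumption.

```python
def provision_per_batch(cost_estimates, real_rate):
    """present value of future cash flows (year, cost) at a real discount
    rate; the provision is recognized when the fuel batch is loaded."""
    return sum(cost / (1 + real_rate) ** year for year, cost in cost_estimates)

# invented example: a batch loaded today, with outlays expected
# 10, 50 and 70 years ahead (millions, in real terms)
flows = [(10, 2.0),    # transfer to intermediate storage
         (50, 1.5),    # share of storage operation
         (70, 6.0)]    # share of geological disposal
for rate in (0.01, 0.02, 0.03):
    print(f"real rate {rate:.0%}: provision = "
          f"{provision_per_batch(flows, rate):.2f} million")
```

the example also shows why the assumed real interest rate matters so much for the adequacy of the accumulated funds: a higher achievable real return sharply lowers the provision that must be recognized today.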
radioactive waste is usually treated with the aim of reducing its volume (e.g. incineration, evaporation). then the concentrated waste is fixed in a bitumen, cement or glass matrix in stainless steel drums. these drums are disposed of in shallow near-surface repositories. when a section of the repository has been filled with drums, it is backfilled and covered with a concrete layer. each repository has set limits and conditions, which define the maximum admissible inventory of disposed isotopes and activities. in principle, only solid and liquid short- and medium-lived isotopes are disposed of in such near-surface repositories. long-lived waste is stored in intermediary storage, to be removed to a deep geological repository when such a repository is available. in comparison with spent fuel storage and disposal, the costs are substantially lower. waste processing is performed during power plant operation, and very often these costs are a part of the operational costs. however, for operational reasons some inventory of waste may accumulate, and long-lived waste is stored. near-surface repositories will be monitored after closure for several hundred years. therefore there is also a definite liability of operators to expend financial resources. the liability to store, process and dispose of a particular batch of waste originates when such a batch is generated. the liability for the future operational, closure and monitoring costs connected with a near-surface repository likewise originates when the waste is generated.
1.4 decommissioning of installations
when the operation of a nuclear installation (a power plant in our case) comes to an end, it still contains spent fuel and radioactive waste. in addition, some construction materials of the plant have been activated by ionizing radiation and have become radioactive. after a cooling down period, the spent fuel is removed from the plant, and the radioactive waste is processed and transferred to a repository or into storage. these expenditures are met from the funds accumulated for the purpose of covering the liabilities for spent fuel and radioactive waste. decontamination and removal of the activated construction materials are the decommissioning liability of the operator. decommissioning refers to the removal of all hazardous material from the plant and ensuring that the plant is no longer subject to the regulation applicable to nuclear installations. total removal of the plant and restoration of "green field" status is not part of the decommissioning liability. there are three basic strategies for proceeding with decommissioning (table 1). each has different implications for the liability of the operator and for the connected costs. the decisive moment that creates the liability of an operator to decommission is the first start-up of the reactor (the first chain reaction).
table 1: basic decommissioning strategies
immediate decommissioning – decommissioning is completed as soon as possible.
combined (partial immediate / deferred) decommissioning – only decontamination and removal of lowly-activated materials are performed immediately; highly activated structures are left intact for years to decrease their activity, and are only then removed.
deferred decommissioning – after removal of spent fuel and operational waste, the plant is left intact for roughly 50 years.
1.5 division of responsibilities
in reality the responsibility of a nuclear power plant operator to cover all the liabilities dealt with here stems from the basic, widely recognized principle that "the polluter pays". however, in order to ensure that the liabilities are covered, state legislation establishes detailed regulations on how liability is to be assessed, how funds are to be accumulated, rules for investment, and criteria for fulfillment of the liability. legal responsibility is assumed by the state itself (a state agency, a ministry, via a selected contractor), or by organizations representing a group of operators (typically when there are several small operators with only one or two operated plants). the activities of these entities are financed by the operators through regular fixed payments or through fees based on the electricity produced. it is very important to ensure that the different liabilities of the individual operators are assessed separately, and that there are no cross subsidies. historically, the situation has been handled differently from country to country. in many countries insufficient funds have been set aside. when the first generation of nuclear power plants was built, nuclear liabilities were not taken into account, and contributions to such a fund were not an organic part of the product (electricity) price. in the 1970s, the concept of a fund to cover the liability was introduced in many countries, but it remains questionable how adequate these financial resources are. especially in countries where there was no clear border between the civilian and military nuclear programs, past nuclear liabilities will have to be covered from the state budget. another question concerns how efficiently the collected financial resources are being spent. the situation in the usa serves as a warning: huge sums were paid by the utilities in the form of a fixed fee per kwh generated, but most of the funds have been spent without solving the problems or disposing of the accumulated spent fuel. nuclear liabilities have become the subject of high level political discussions among the european parliament, the council and the commission. an inter-institutional statement issued in july 2003 set the ground for community action, highlighting the need for adequate financial resources to cover nuclear liabilities. recently there have been efforts on the european community level to regulate this area in the member states. the european commission has drafted several directives prescribing the division of responsibilities, the principles for accumulating financial resources, establishing funds and establishing an oversight role for european organs. this legislation met fierce resistance in some countries (to some degree all the countries that operate nuclear power plants, but mainly france and germany), which would have difficulty in adapting their present systems to the prescribed regime, and were opposed to the precedence of the eu organs over national regulatory bodies. until now, all drafts have been rejected.
the commission intends to issue annual reports summarizing progress in this area. the first annual report, published in 2005, admits that the financing of decommissioning is a complex issue, and that various approaches can be taken. however, the creation of an international market has brought an increased need for transparency and harmonization in the management of the financial resources in liability funds. the commission also stated its opinion that the methods of financing should be harmonized in due course. the commission announced that it would draft its recommendation in this area; this has now been submitted for comments by the member states. the recommendation reiterates the obligation to decommission an installation after permanent shutdown, the obligation to properly address waste, the "polluter pays" principle, and the availability of financial resources in due time. the recommendation goes on to establish reporting obligations regarding decommissioning plans, and proposes the establishment of a decommissioning funding group on a supranational level. on the national level, the commission requires member countries to set up a national body, which should provide expert judgments on liability assessment, but should remain independent from the contributors to the fund. the cost assessment and the accessibility of the gathered financial resources should be reviewed periodically, at least once every 5 years. the national body should report annually to the commission on the conclusions of its proceedings. nuclear utilities should set up adequate funds on the basis of the revenues obtained from their nuclear activities during the designed lifetime of the installation. the commission prefers external funds, and all new nuclear installations will be required to set up ring-fenced external funds. decommissioning and waste (and spent fuel) liabilities must be assessed independently. the more expensive options must be taken as the basis for the assessment. any shortfall in financing must be covered by the operators. the financial resources in the funds should be invested with a low and secured risk profile. for ring-fenced external funds, the return on the investments should be guaranteed by the state, even if a nominal loss is made by the independent manager of the invested funds. if the fund is internal, the operator should establish a segregated fund within its accounts, to make the financial resources gathered for nuclear liabilities identifiable and traceable at any given time. the european parliament strongly supports these initiatives of the european commission and has been pursuing its own initiatives in this area.
2 liabilities in the czech republic
the atomic energy act (18/1997 coll.), as the fundamental legislation regulating nuclear installations, was passed in 1997. this legislation set the main responsibilities and liabilities in spent fuel management, radioactive waste management and decommissioning of nuclear installations. the basic principles have been extended and specified by implementing legislation (governmental decrees, state office for nuclear safety regulations). čez, a.s. (čez), the operator of the nuclear power plants in the czech republic, has been establishing reserve funds to cover nuclear liabilities since 1993. before the passage of the atomic energy act in 1997, the setting-up of these funds was based on the company's own decision, and the funds could not be claimed against costs. the atomic energy act establishes that the state is responsible for the safe disposal of nuclear waste.
the radioactive waste repository authority (rawra) was established as the state agency responsible in the area of radioactive waste and radioactive waste disposal. in accordance with the "polluter pays" principle, producers of radioactive waste make payments into the so-called nuclear account, which is held at the czech national bank. rawra activities are financed from this account. the funds are put into safe investments specified in the atomic energy act, to cover inflation effects and to achieve some real interest over and above inflation. spent fuel is not considered as radioactive waste until its producer declares it as such and transfers it to rawra for disposal. all future costs connected with intermediate storage of the fuel and with the transportation to the rawra repository must be borne by the operator. these costs comprise the construction, operation and decommissioning costs for intermediate storage, the purchase costs of dry casks, and also the costs of transportation to the deep repository. čez decided to create an accounting reserve to cover all future liabilities connected with storage and transport of spent fuel. this accounting reserve was set up on the basis of a decision of čez, a.s., and not in accordance with the atomic energy act, and therefore it cannot be claimed against costs. the ias 37 international accounting standard requires čez to consider this liability in its balance sheet; it is regularly checked by the external auditor. according to the czech national radioactive waste policy, a deep geological repository should be in operation in 2065. rawra is responsible for its siting, development, construction, operation, closure and monitoring. the repository will have to be in operation for a sufficiently long period to dispose of the accumulated spent fuel and other waste (long-lived and high activity waste from plant operation and decommissioning). čez, the nuclear power plant operator, has paid into the nuclear account since 1997. currently it pays 50 czk per mwh generated in its nuclear power plants. this fee covers the future costs of spent fuel disposal, and also of the disposal of operational low and medium active waste. the operators of research reactors pay into the nuclear account a fee based on the heat generated in the reactor. small producers pay specified fees when their waste is transferred. a near-surface low and intermediate level waste repository is in operation on the site of the dukovany nuclear power plant. mainly operational waste from the dukovany and temelín plants is disposed of there. čez operates the repository as a contractor for rawra. the second repository in operation is the richard repository, in northern bohemia, close to the town of litoměřice. it is used for institutional radioactive waste. the bratrství repository, in a former uranium mine in western bohemia, is used only for natural isotopes. as concerns decommissioning of nuclear installations, operators are required to create a reserve fund in the form of a blocked account at a reputable bank. the cost assessment and the state of the reserve are audited by rawra. all uses of the blocked account must have rawra approval. the cost assessment is reviewed at 5-year intervals. studies and cost assessments must be performed not only for nuclear power plants, but also for other nuclear facilities, e.g. spent fuel storage facilities.
if we compare the state of nuclear liabilities in the czech republic with the recommendations of the european commission, we can state that the czech republic already complies with the main requirements. the individual liabilities are assessed independently, sufficient financial resources have been accumulated, and the adequacy of the funds is periodically checked. the decommissioning fund is managed by the operator and not by the state; however, the state controls the use of this fund via its agency rawra. the atomic energy act defined the responsibilities in the czech republic as follows:
table 2: division of responsibility in the czech republic
spent fuel: legal responsibility – storage: operator, disposal: state; financial resources – plant operator; form – storage: accounting reserve, disposal: nuclear account.
radioactive waste: legal responsibility – storage: producer, disposal: state; financial resources – waste producer; form – storage: accounting reserve, disposal: nuclear account.
decommissioning: legal responsibility – plant operator; financial resources – plant operator; form – blocked account of the operator.
3 assessment of liability
future costs are modeled on the basis of technical studies and appraisals of the individual activities. the cost studies are usually performed in fixed prices of the year of the study. if we consider cost escalation (not identical with inflation, as storage and repository costs grow less than inflation, which is led by service costs), we have to recalculate the cost estimates from the fixed prices (e.g. of the year 2005) to the current prices of the year when the appropriate amount of money will be expended:
s_t = s_t^fix · (1 + esc_1) · (1 + esc_2) · … · (1 + esc_{t−1}),  (1)
where s_t is the expenditure (cost) in year t in current prices, s_t^fix is the expenditure (cost) in year t expressed in fixed prices of the year of the study, and esc_n is the escalation factor between years n and n+1. the basic principle is that the financial resources must be accumulated during the operational lifetime of the nuclear power plant. various "technical" unit options can be used for calculating the operator's regular payments. for decommissioning, fixed annual payments are very often considered (the liability is in principle not a function of the volume of electricity generated). for spent fuel and radioactive waste, the generated electricity is usually used as the basis for calculating the operator's payment. for a certain period a certain rate is declared and used for calculating the payment (e.g. 50 czk/mwh). this rate can be found by calculating the base rate r_base in the following way. firstly we introduce the effective rate in year t, which can be calculated as
r_t^ef = r_base · (1 + d_1) · (1 + d_2) · … · (1 + d_{t−1}),  (2)
where r_base is the basic rate to be paid into the fund per mwh generated in the power plants in the first year of the analyzed period, and d_n is the increment in the rate between years n and n+1. the concept of an effective rate is logical: the rate should constitute a constant proportion of the costs of electricity, and with the gradual effect of inflation the rate should also increase. then we define the cash-flow in year t,
cf_t = e_t · r_t^ef − s_t,  (3)
where e_t is the generation of the nuclear power plants in year t, r_t^ef is the effective rate in year t, and s_t is the expenditure from the fund in year t.
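to make the bookkeeping in formulas (1)–(3) concrete, the following minimal sketch evaluates them on hypothetical inputs (the yearly series, rates and function names are invented for illustration and are not from the paper):

```python
# minimal sketch of formulas (1)-(3); all input series are hypothetical
def current_prices(s_fix, esc):
    """eq. (1): recalculate fixed-price costs into current prices;
    esc[n] is the escalation factor between years n and n+1."""
    out, factor = [], 1.0
    for t in range(len(s_fix)):
        out.append(s_fix[t] * factor)      # s_t = s_t^fix * prod(1 + esc_n)
        factor *= 1.0 + esc[t]
    return out

def effective_rates(r_base, d):
    """eq. (2): r_t^ef = r_base * prod over n < t of (1 + d_n)."""
    rates, factor = [], 1.0
    for t in range(len(d)):
        rates.append(r_base * factor)
        factor *= 1.0 + d[t]
    return rates

def cash_flows(e, r_ef, s):
    """eq. (3): cf_t = e_t * r_t^ef - s_t."""
    return [e[t] * r_ef[t] - s[t] for t in range(len(e))]

# hypothetical 3-year example: costs in czk, generation in mwh
s = current_prices([1.0e9, 1.2e9, 0.8e9], [0.02, 0.02, 0.02])
r = effective_rates(50.0, [0.03, 0.03, 0.03])     # 50 czk/mwh base rate
cf = cash_flows([25.0e6, 25.0e6, 24.0e6], r, s)
```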
finally, by solving the following formula, introducing (1), (2) and (3) (e.g. in a spreadsheet using a solver), we find the sought r_base:
(((cf_1 · (1 + y_1) + cf_2) · (1 + y_2) + cf_3) · … + cf_T) · (1 + y_T) = 0,  (4)
where y_t is the rate of yield of the financial means invested from the fund in year t, and T is the year in which the nuclear liability is fully settled. a very important boundary condition is that the summation of the first n cash-flows (n = 1 … T) in formula (4) is always ≥ 0, i.e. the compounded balance of the fund may never become negative. (a numerical sketch of this solver calculation is appended at the end of the paper.) the above formulas assume a variable investment yield rate, inflation rate and increment in the rate. however, it is very difficult to predict these factors over the very long periods involved. for practical purposes we consider them to be constant for the whole period under consideration. then, supposing a constant escalation factor esc and a constant rate of yield i, we can calculate the simplified rate
r_simpl = [ s_1^fix · (1 + esc) + s_2^fix · (1 + esc)² + … + s_T^fix · (1 + esc)^T ] / [ e_1 · (1 + i) + e_2 · (1 + i)² + … + e_T · (1 + i)^T ].  (5)
an example of a typical (hypothetical) balance of the "nuclear account" is shown in fig. 1. the figure shows that the operator pays into the nuclear account only during the operation of the power plants (1997–2042). during this period substantial financial resources accumulate. investment of the available resources from the nuclear account leads to further growth of the nuclear account. when the process of disposal is completed (the repository is decommissioned), the balance of the nuclear account is zero. the cost assessment includes some extra costs to cover contingencies and risk.
fig. 1: hypothetical balance of a nuclear account in the cr (payments into and expenditures from the nuclear account [mld. kč] and the account balance [mld. kč] over the years 1997–2097)
4 conclusions
nuclear liability must be carefully assessed and controlled, because failure to meet this liability can lead to serious environmental and economic consequences. the probability of failure to meet nuclear liability is decreasing, as this area is now becoming a subject of supranational regulation, especially in the european community. the liability assessment methodology is well developed, and the control mechanisms are being enhanced. the czech republic is very advanced in covering nuclear liabilities. it has already adopted the main principles recommended by the european commission. model calculations have been made, and the assumptions used in the model have been verified. the calculations show that the liabilities of the nuclear power plant operator are well covered.
5 acknowledgments
i would like to express my sincere thanks to j. knápek and j. vašíček from the department of economics, management and humanities for their support and long-term cooperation in the area of nuclear liabilities, and to my colleagues from čez's fuel cycle section for our wonderful working relationships.
references
[1] international accounting standard ias 37: provisions, contingent liabilities and contingent assets.
[2] the european commission: directive 2003/54/ec of the european parliament and of the council of 26 june 2003 concerning common rules for the internal market in electricity and repealing directive 96/92/ec, brussels, 26 june 2003.
[3] the european commission: communication to the european parliament and the council – report on the use of financial resources earmarked for the decommissioning of nuclear power plants, com (2004) 719, brussels, 16 october 2004.
[4] atomic energy act, law 18/1997 of collection, as amended.
ing. ladislav havlíček, e-mail: havlil10@email.cz, dept. of economics, management and humanities, czech technical university in prague, faculty of electrical engineering, technická 2, 166 27 praha 6, czech republic
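as the closing numerical illustration of section 3 referred to above, the following sketch finds r_base for formula (4) by bisection on hypothetical data; eqs. (2)–(4) are reproduced inline, and the data, names and bisection bracket are invented:

```python
# hypothetical sketch: solve eq. (4) for r_base by bisection
def final_balance(r_base, e, s, d, y):
    """compound the yearly cash-flows cf_t = e_t*r_t^ef - s_t (eqs. (2), (3))
    with the yield rates y_t, as in the nested form of eq. (4)."""
    balance, factor = 0.0, 1.0
    for t in range(len(e)):
        r_ef = r_base * factor           # eq. (2)
        factor *= 1.0 + d[t]
        cf = e[t] * r_ef - s[t]          # eq. (3); s_t already in current prices
        balance = (balance + cf) * (1.0 + y[t])
        # the boundary condition of the text requires balance >= 0 here
    return balance

def solve_r_base(e, s, d, y, lo=0.0, hi=1000.0):
    """final_balance is increasing in r_base, so plain bisection works,
    provided final_balance(lo) < 0 < final_balance(hi)."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if final_balance(mid, e, s, d, y) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# toy 3-year case: 25 twh/year generation (in mwh), escalating expenditures
r_base = solve_r_base([25.0e6] * 3, [1.0e9, 1.2e9, 0.8e9],
                      [0.03] * 3, [0.04] * 3)
```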
the european electricity market and cross-border transmission
m. adamec, m. indráková, m. karajica
this paper deals with the basic characteristics and features of trading in electricity, especially cross-border trading. first, the most important features of electricity as a commodity are explained, with the consequences for electricity trading. then the characteristics of and changes in the electricity market after liberalization are discussed. this liberalization has taken place throughout europe, and the consequences of this revolutionary change are still visible. the main features of electricity trading are mentioned in general terms. then cross-border trade in europe is discussed in greater detail. in this context the basic principles of the allocation of cross-border transmission capacities are explained. the next part of the paper considers the characteristics of the european electricity market from the trader's point of view. liquidity, as a very important index, is introduced here. finally, the most visible trends in cross-border trade and the most probable future development in this area are presented.
keywords: auction, congestion, electricity, ess, kiss, mo, scheduling, tpa, tso, unbundling, sb.
1 types of electricity market bases
the electricity market is determined by the main characteristic of electricity, i.e. that it cannot be stored. due to this feature, it is necessary to transmit and retail the currently produced electricity immediately. as a result, time optimization of energy retailing is impossible. this feature has a fundamental influence on the energy markets, or more precisely on the trading systems. there are some standards for electricity trading [1]:
1) sb (single buyer): the single buyer is the only subject that can buy electric energy. the sb buys the electricity at the lowest cost from the producers and then sells this electric energy on to the customers. the disadvantage of this model is the exclusive privilege of the one and only subject for purchasing the electric energy; therefore it is not often used in europe. (one exception is belarus and, until recently, france.)
2) tpa (third party access): access of third parties to the transmission net. this means the right of every customer or producer to free energy transportation via the transmission system.
– rtpa (regulated third party access): this assumes bilateral contracts between market subjects (producers, traders and customers) for energy retail on condition of regulated prices. the existence of a regulatory authority is assumed at the same time; this authority designates prices in the non-liberalized segments of the power industry. rtpa exists in most european countries.
– ntpa (negotiated third party access): access to the nets on condition of negotiated prices. (this basis was applied in germany, but since 2005 germany has switched to the rtpa standard.)
due to the impossibility of storing electric power, its price is highly volatile on spot markets (fig. 1), and this forces investors to use some form of hedging (risk management). the use of derivatives (futures, forwards) is very common in the power industry. (experience from long-term liberalized markets shows that derivatives reduce spot prices [2].)
2 electric energy market
in order to liberalize energy markets, it is necessary to divide up the large companies which were integrated in the past: production, transmission and trading of electricity need to be separated. in addition, this liberalization establishes free economic competition in the production and trading sectors. the only monopoly that remains is in the electric power transmission and distribution grid; the prices in these sectors are determined by the above-mentioned regulatory authority. this forced division of particular structures in the power industry is called unbundling (fig. 2).
2.1 ways of trading in electricity
there are two main markets in which electric power is traded: the otc (over-the-counter) market and the classical exchange. on such markets electricity is traded as well as electricity derivatives. from the trader's point of view it is better to trade on the exchange. there are two main reasons for this.
firstly, dealing on the exchange is better assured, and secondly, the exchange has better liquidity for each trader. this liquidity provides opportunities for speculative deals. the german, austrian and dutch markets are very popular with energy traders because of their liquidity. liquidity is discussed in greater detail in section 3 of this paper.
fig. 1: price of electricity during one week on the eex (price [eur] over the 168 hours of a week)
2.2 specifics of cross-border trade
cross-border trade in electric power in europe is enabled by the integration of the countries into the ucte (union for the coordination of transmission of electricity). this union facilitates the sale of electric power from one state to another. the major remaining problem is congestion on most of the borders, leading to insufficient transfer capacity between countries. the existence of this congestion has led to the development of methods for allocating the free capacity to the particular parties interested in electricity transmission.
2.2.1 capacity allocation methods
if there is no congestion at a border, it is not necessary to allocate the free capacity, and every bid for transmission capacity is accepted without charge (for example, on the austrian–german border). problems occur on borders where the capacity of the cross-border lines is insufficient for all bids. in such cases the following principles are used for allocating the free capacity (fig. 3).
non-market based principles:
first-come-first-serve – this guarantees the allocation of capacity to customers in the same order as they send their requests.
pro rata – when the demand is higher than the free capacity, each request is reduced proportionally, so that no congestion remains. no request is refused, but none is accepted to the full extent.
auction principles:
explicit auction – this is the most widely used way of allocating capacity in europe. each tso (transmission system operator) sets the free capacity ex ante, and this capacity is allocated by auction. bids are sorted according to their price, and they are accepted until no free capacity remains. the price of each successful request for capacity can be the same as the bid in the auction ("pay-as-bid"), or it can be equal to the lowest accepted bid ("marginal bid auction"); both pricing rules are sketched below.
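a short sketch of the explicit auction just described (the bid data, function name and return format are hypothetical; the acceptance rule, i.e. sort bids by price and accept until the free capacity is exhausted, and the two pricing rules follow the text):

```python
# hypothetical sketch of an explicit auction for cross-border capacity
def explicit_auction(bids, free_capacity, rule="pay-as-bid"):
    """bids: list of (price_per_mw, requested_mw);
    returns [price, requested_mw, granted_mw, settlement_price] per bid."""
    results, remaining, accepted = [], free_capacity, []
    for price, mw in sorted(bids, key=lambda b: b[0], reverse=True):
        granted = min(mw, remaining)     # accept until no capacity remains
        remaining -= granted
        if granted > 0:
            accepted.append(price)
        results.append([price, mw, granted, price])
    if rule == "marginal":               # all pay the lowest accepted bid
        marginal = min(accepted) if accepted else 0.0
        for row in results:
            row[3] = marginal if row[2] > 0 else 0.0
    return results

# 300 mw of free capacity, three bids: the 9.5 eur bid is only partly
# accepted and the 8.0 eur bid is rejected
allocation = explicit_auction([(12.0, 200), (9.5, 150), (8.0, 100)], 300)
```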
implicit auction – the difference from an explicit auction is that the price for the capacity is included directly in the price of the transmitted energy. the price for the capacity (as a part of the transmitted energy price) is set by the mo (market operator) in such a way that the amount of the accepted bids does not exceed the free capacity. the mo knows the bids from each participant in the market and can therefore set the exact conditions under which a bid can still be accepted. the auction is arranged by the mo, and the revenue from it belongs not to the tso (a difference from the explicit auction) but to the mo.
market splitting – this is a developed implicit auction procedure. the electrical energy is transmitted in such a way that electricity from areas with lower prices is transmitted to areas with higher prices; thus prices in the areas with lower prices level out with those in the areas with higher prices. (this process resembles the splitting of a market from one area to another, and the whole auction process has acquired its name from this similarity.) the mo in one country manages the markets on both sides of the border in a market splitting process, so very close cooperation is necessary between the two mos.
fig. 2: division of the electricity market between liberalized and regulated parts (unbundling)
fig. 3: types of congestion management
2.2.2 communication between market participants
due to the congestion on the borders, it is necessary to require the scheduling of every cross-border energy transport by the energy traders. every energy trader must therefore communicate in real time with the subjects that allocate the cross-border capacities. this communication is carried out mostly on-line and automatically. each market participant has its own eic (etso identification code), unique for each country, provided by the appropriate issuing office (mostly the tso). communication is carried out in various ways. in areas with poor access to free trade, phone and fax communication are widely used for announcing the bids. however, in western europe on-line communication prevails. this on-line communication consists in automatic file sending, mostly in xml or xls format.
the kiss method (keep it small and simple) – this method sends data files in xls format (ms excel 97). these files contain information about the exact energy transmission characteristics, e.g. the time or the amount of transmitted energy.
the ess method (etso scheduling system) – this method uses a special xml format for the data files. it is the newest way to communicate on electricity markets. (it is currently used in germany, austria, luxemburg, switzerland, hungary, poland, the czech republic, slovakia, france, slovenia, croatia and romania.)
in connection with cross-border transmission capacities, the following relation between the particular categories of capacities has been introduced:
ntc = ttc − trm,  (1)
where ttc (total transfer capacity) is the expected maximum quantity of generated energy that can be transmitted between two countries without limiting the system on either side.
the following safety limits have to be taken into account:
– maximum warming of conductors,
– maximum level of voltage in a particular system,
– limit of stability (mechanical and dynamic limitations).
the ttc values are calculated using the etso predictive model. trm (transmission reliability margin) is a precautionary reserve that takes into account unpredictable developments due to insufficient provision of information to etso from the subjects dealing in the markets. in connection with ntc we should mention atc (available transfer capacity), which introduces the true available capacity, defined as the difference between ntc and the long-term reserved capacities.
3 market liquidity
so-called liquidity is a significant indicator of market attractiveness from the viewpoint of a trader. there is no generally valid definition of market liquidity, but in practice it is very often calculated on the basis of the following relation:
liquidity of the energy market = tte / tge,  (2)
where tte (total traded energy) is the total quantity of electricity marketed on the commodity exchange of the corresponding market, and tge (total generated energy) is the total quantity of energy produced on the corresponding territory. table 1 indicates the liquidity of the markets in selected eu countries. it should be pointed out in connection with liquidity that this indicator may have considerably higher values, as the same electricity may be marketed several times and, moreover, electricity manufactured in another country may be marketed on the corresponding commodity exchange.
table 1: overview of liquidity on individual markets [5]
country – total generated energy [twh] – total traded energy [twh] – liquidity [%]
germany – 445.1 – 39.0 – 8.76
austria – 51.8 – 1.0 – 1.93
slovakia – 26.0 – 0.36 – 1.38
france – 554.0 – 7.5 – 1.35
poland – 110.9 – 1.1 – 0.99
italy – 322.0 – 2.0 – 0.62
czech republic – 61.4 – 0.3 – 0.49
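formula (2) can be checked directly against the table values (the data are copied from table 1; the snippet itself is only illustrative):

```python
# check of formula (2), liquidity = tte/tge, against table 1 (values in twh)
markets = {"germany": (445.1, 39.0),
           "austria": (51.8, 1.0),
           "czech republic": (61.4, 0.3)}
for country, (tge, tte) in markets.items():
    print(f"{country}: liquidity = {100.0 * tte / tge:.2f} %")
# germany: 100 * 39.0 / 445.1 = 8.76 %, matching the table
```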
4 overview of electric power flows
europe will be facing shortages of electric power on a medium-term and long-term basis. if the consumption of electric power grows by more than two per cent a year (which is considered to be a rather conservative estimate), there will be a shortfall in generating capacity of about 250 gwh in europe twenty years from now; this is the output of approximately 125 power stations of the size of temelín. the common market in europe has been generated as a direct consequence of the european union's effort to liberalize the market for electric power and to suppress the natural monopoly in power engineering. this not only enables competition on domestic markets, but is also part of the effort to interconnect the individual markets within continental europe, with the british isles and with the states of northern europe. this has formed the basis for some associations of the tsos (transmission system operators) set up in the individual countries. the most important such organizations are etso (european transmission system operator) and ucte (union for the coordination of transmission of electricity). these organizations are becoming less significant with the gradual spread of liberalization.
fig. 4: map of european associations which associate the tsos of different countries
– ucte: union for the coordination of transmission of electricity,
– nordel: organisation for the nordic transmission system operators,
– uktsoa: the power system of the united kingdom,
– tesis: trans-european synchronously interconnected system,
– etso: european transmission system operator.
table 2: annual balance of power in individual european countries in 2005
country – generation [twh] – annual consumption [twh] – import [twh] – export [twh]
belgium – 83.3 – 83.8 – 14.3 – 8.0
belarus – 30.7 – 34.7 – 9.0 – 5.0
bosnia herzegovina – 12.71 – 10.92 – 2.17 – 3.58
bulgaria – 43.92 – 36.462 – 0.799 – 8.380
montenegro – 2.724 – 4.534 – 1.819 – 0.0
czech republic – 82.579 – 62.691 – 12.351 – 24.985
estonia – 9.949 – 6.914 – 0.345 – 1.894
france – 549.2 – 482.4 – 32.3 – 90.9
croatia – 11.6 – 16.7 – 8.8 – 3.6
italy – 303.0 – 309.8 – 5.03 – 1.1
lithuania – 14.8 – 10.1 – 1.1 – 4.1
latvia – 4.723 – 6.871 – 5.087 – 2.940
hungary – 35.7 – 34.6 – import/export saldo 6.2 twh
macedonia – 6.475 – 8.089 – 1.659 – 0.045
germany – 581.3 – 563.6 – 53.4 – 61.9
netherlands – 94.4 – 114.7 – 27.346 – 5.887
poland – 143.95 – 130.06 – 5.00 – 16.19
portugal – 41.7 – 47.9 – 9.6 – 2.8
austria – 66.359 – 69.024 – 20.397 – 17.732
romania – 54.804 – 51.889 – 1.605 – 4.520
greece – 49.6 – 53.4 – 5.6 – 1.8
slovakia – 29.1 – 26.3 – 8.6 – 11.3
slovenia – 13.289 – 12.389 – 5.833 – 6.026
serbia – 39.3 – 38.1 – 7.9 – 9.1
spain – 263.32 – 246.19 – 8.09 – 9.41
switzerland – 55.287 – 61.637 – 37.298 – 29.828
turkey – 162.0 – 160.8 – 1.8 – 0.6
ukraine – 185.236 – 176.884 – 0.0 – 8.351
albania – 5.409 – 5.933 – 0.643 – 0.119
the russian federation – 952.0 – 939.6 – 9.9 – 22.3
table 2 shows the balance of electric power in the individual countries of europe. the largest quantity of electric power is produced in germany and in france. these countries are followed by great britain and italy, but with production lower by more than 170 twh. germany, unlike france, consumes almost all of the electric power that it produces; france is thus the largest exporter of electricity at present, and the czech republic takes second place. however, it should be pointed out that the surplus of electric power in the czech republic is a short-term one: the service life of the desulphurized coal-fired power plants will end in about 2010. the largest importers of electric power are the benelux countries and hungary. hungary covers 18 % of its requirements through importing.
5 conclusions
over several recent decades, worldwide power consumption has continued to grow strongly. there is now increasing awareness of the finiteness and scarceness of resources, and emphasis is being placed on efficient exploitation of electric power. in order to achieve effective exploitation of electric power, several measures have been taken and new tools have been introduced. one of the most significant tools is the implementation of the market value of electric power, which reflects its real costs or, more precisely, the maximum price that is acceptable to the end consumer. in the most advanced countries of the world, stock exchange trading in electric power is now common; however, stock exchanges dealing with electric power are not common in less mature countries. traders in electric power aim to achieve maximum profit from the sale of electricity, and in this way a double effect is achieved. firstly, they exert indirect pressure on producers by requiring a low purchase price for electric power; the result is efficient exploitation of the primary commodities, technologies and other inputs necessary for generating electricity. secondly, they exert indirect pressure on the end consumer by maximizing the selling price.
the result is efficient exploitation of electric power, as the end consumer is motivated to minimize consumption. another tool for efficient exploitation of electric power is the effort to increase efficiency in all parts of the electric supply chain, from generation, transmission and distribution to dealing in electric power. this effort has resulted in the reform of the power sector in a series of eu member countries. previously there had been power companies dealing with the whole electric supply chain: generation, transmission, distribution and sale of electric power. however, in the course of time it turned out that this model is ineffective. strict separation of generation, transmission, distribution and dealing in electric power (unbundling) has now been introduced. another distinctive tool is the opening of access to the grid to third parties, i.e. the implementation of the rtpa market platform, the purpose of which is to increase competition. it should be emphasized, however, that several of the less mature european countries have implemented the legislation for the rtpa market platform, but in practice they still really use the sb market platform. implementation of the rtpa platform requires a long-term perspective. a significant tool in support of intrastate dealing in electricity is the implementation of an electric power stock exchange, on which electricity is marketed on yearly, monthly, weekly or daily levels through a range of different types of products. compared with intrastate trading, auctions of cross-border transmission capacities are held mostly monthly or daily, and in some cases weekly. these auctions are mostly arranged by the operators of the transmission systems, or by so-called scheduling offices. the emergence of international auction (scheduling) offices should simplify trading in cross-border transmission capacities. it is necessary to emphasize the distinction between the real physical flows of electric power and the financial flows related to business transactions. for this reason, there is pressure to use a flow-based method that will take into account the real physical flows. implementing this method will restrict business, since the optional cross-border transmission profiles will be determined by this method. we should also mention here trading in emission allowances, though this does not at present have a great influence on trading in electric power. the main reason is that in the first allocation period the allocated emission allowances were greater than the needs of the individual countries; this led to the collapse of the market in emission allowances. however, an increase in the volume of trading in emission allowances, due to a decrease in the emission limits, could cause higher prices of emission allowances and electric power. this trend should lead in the future to electricity generation in countries where electric power can be generated more cheaply and more efficiently. this will lead to stronger cross-border trade in electric power, resulting in the establishment of many regional stock exchanges or, in the last resort, in the establishment of a global stock exchange for electric power. as fuel for generation becomes scarcer, exchanges will be established for various fuel types.
acknowledgments
the authors would like to express their gratitude to their supervisors, prof. ing. oldřich starý, csc. and doc. ing. jaromír vastl, csc., for valuable comments and contributions.
references
[1] training documents of the department of economics, management and humanities, fee ctu in prague.
[2] anderson, e. j., hu, x., winchester, d.: forward contracts in electricity markets: the australian experience, energy policy, vol. 35 (2007), elsevier.
[3] adamec, m., indráková, m., karajica, m.: european scheduling overview (training document, čez a.s.), 2007.
[4] haas, r., glachant, j. m., keseric, n., perez, y.: electricity market.
marek adamec, e-mail: adamem3@fel.cvut.cz
michaela indráková, e-mail: indram1@fel.cvut.cz
mirza karajica, e-mail: karajm1@fel.cvut.cz
department of economics, management and humanities, czech technical university in prague, faculty of electrical engineering, zikova 4, 166 27 praha, czech republic
response analysis of an rc cooling tower under seismic and windstorm effects
d. makovička
the paper compares the rc structure of a cooling tower unit under seismic loads and under strong wind loads. the calculated values of the envelopes of the displacements and the internal forces due to the seismic loading states are compared with the envelopes of the loading states due to the dead, operational and live loads, wind and temperature actions. the seismic effect takes into account a seismic area with ground motion of 0.3 g and the ductility properties of a relatively rigid structure. the ductility is assessed as a reduction in the seismic load. in this case the actions of wind pressure are higher than the seismicity effect under ductility correction. the seismic effects, taking into account the ductility properties of the structure, are lower than the actions of the wind pressure. the other static loads, especially the temperature action due to the environment and surface insolation, are very important for the design of the structure.
keywords: cooling tower, wind load, earthquake, ductility, dynamic response.
this text was a part of the international conference on advanced engineering design (aed 2006), which was held in prague in 2006.
1 introduction
the objective of this paper is to calculate the basic static and dynamic loading of twin cooling towers with fans 6 m in propeller diameter for an oil refinery. the basic static and dynamic analyses are based on the requirements of american design standards [1]. the size of the structure was designed to limit the combined static loading states, including the actions of wind pressure. the design quantities due to the action of seismic load, introducing the ductility of the structure, are lower than the actions of the wind pressure. the design loads are compared, and the dominance of the particular loading states is assessed according to the internal force response caused by these loads in the structure. this comparison explicitly shows that the temperature effects exert the biggest influence upon the structure of the towers in the resultant combined design load. the share of these temperature effects in the total stress of the structure can be estimated as approx. 50 %.
2 description of the structure
the subject of the analyses is a pair of reinforced-concrete cooling towers located next to each other on a common base plate. each cooling tower has a square plan and is terminated from above by a floor (ceiling) plate with a circular opening, over which a cylindrical diffuser is located. the cooling towers are arranged in series in such a way that they have a common internal wall dividing them all along their height. on the edges of the twin towers, in the longitudinal direction, the transversal external walls are reinforced by three vertical reinforcing ribs of longitudinal orientation; these longitudinal stiffeners, 400 mm in thickness, exceed the external surface of the transversal external wall by 1 500 mm. the transversal external and internal walls, 250 mm in thickness, exceed the external surface of the longitudinal walls by 800 mm. in addition, there are internal stiffeners in the central section of the transversal external walls to reinforce each tower. in the centre of the towers there are columns with fans on them at the level of the ceilings. the columns are strutted at two of their altitudinal levels to the external walls by means of concrete girders, and at the level of the fans by means of steel pipes into the ceiling plates. the longitudinal external walls of the towers, in which there are suction inlets, are reinforced by vertical ribs and horizontal beams flanging the suction inlets. the basic altitudinal level of the structural model, ±0.0, corresponds with the centre line of the base plate.
a spatial computational model of the twin towers was created for these calculations. the shape and the basic dimensions are given in fig. 1 (fig. 1: basic dimensions of the twin cooling tower unit). the twin cooling towers were designed for the use of grade 25 concrete (equivalent to concrete class b30). the reinforcement used steel grade 400 (equivalent to reinforcement 10 425 v). the stiffeners and the supporting structure for the technology inside the tower were designed for the use of astm a36 steel or similar (equivalent to steel of series 37). the external walls, the internal wall, the floor plates, the diffuser and the vertical stiffeners were modelled by plate elements corresponding in thickness to the modelled part of the structure. the internal columns, the internal girders, the columns/ribs in the external walls, the horizontal reinforcement of the longitudinal walls over and under the shutter opening, and the horizontal stiffeners of the fan were modelled by beam elements. the damping used for the dynamic loading states was 5 % of critical damping. in the computational model, the baseplate was supported by a winkler-pasternak sub-base model. the sub-base parameters are automatically determined by the computational program according to the sub-base structure established in the test pits. the starting values of the winkler-pasternak constants were selected as follows: c1z = 3.5 mn/m³, c1x = c1y = 2.0 mn/m³, c2x = c2y = 50 mn/m. the calculation of the improved values for the combination of dead and operational loads is shown in fig. 2. the baseplate can turn slightly on the sub-base around its horizontal axes and can shift in the vertical direction. horizontal shifting of the structure as a whole is prevented on one longitudinal and one transversal edge of the baseplate; horizontal shifting on the opposite edges is enabled in order to allow thermal expansion of the plate.
3 load
the structure model was loaded with static and dynamic loads. the static loads include the dead weight and the permanent load due to the process equipment, the live load, the operational load, temperature effects and actions of wind pressure.
the dynamic loads include the effects due to the fans of the cooling towers and seismic effects.
3.1 temperature load due to the environment
the operating temperature of the towers will comply with the temperature in the period of construction. the external walls, ceiling plates and diffuser are loaded with the non-uniform change in the surface temperature due to a decrease or increase in the ambient air temperature. the columns/ribs in the longitudinal external walls and the external stiffeners of the walls were loaded with the temperature corresponding to the centre line of the wall. the temperature of the internal columns is unaffected by temperature fluctuations in the operation process; therefore these members are left without temperature load. the internal stiffeners and the diaphragm beam are subject to temperature load only at their edges, where they contact the external walls and the ceiling. in these parts, contact sections 725 mm in width are modelled, loaded with the temperature corresponding to the centre line of the wall or ceiling. the environment temperature was adopted from national documents. the temperatures of the surface of the structure were determined in compliance with the theory of heat propagation through a solid medium. the temperature loads were only considered on the part of the structure above the formation level:
– estimated mean temperature of the structure during construction t0 = +35.0 °c,
– normal air operating temperature inside the structure ti = +35.0 °c,
– minimum ambient air temperature in the winter period te = −5.0 °c,
– maximum ambient air temperature in the summer period te = +55.0 °c.
this group of temperature loads includes: the non-uniform decrease in the temperature of the surfaces in the winter period, the non-uniform increase in the temperature of the surfaces in the summer period, surface insolation in the perpendicular direction (for ceilings only), and inclined surface insolation under the incidence of sunbeams at an angle of 45°.
3.2 wind load
the basic wind pressure values were adopted from [1], and the methodology for determining the load, including the elevation effect, was adopted from [10]:
– basic wind velocity v = 50 m/s,
– topography factor s1 = 1.0,
– wind pressure variation with height s2 = 0.99 (for h < 15 m),
– statistical factor s3 = 1.0,
– design wind velocity vs = 49.5 m/s,
– dynamic wind pressure q = 0.613 × 49.5² pa = 1.502 kpa.
3.3 earthquake load
the seismic load parameters were determined according to [1] and [9]. according to [9], para 1631.2.5, the vertical acceleration component in comparison with the horizontal component is determined by a coefficient of 0.667. according to [9], para 1631.4.1, the design elastic response spectrum is defined by:
– coefficient of significance i = 1.25,
– seismic area (zone) 1,
– coefficient of seismic area z = 0.075,
– earth medium profile sd and the corresponding seismic wave propagation velocity (according to [9]) vs = 360 m/s,
– seismic coefficients cv = 0.18, ca = 0.12,
– damping ratio (according to [9], para 1631.2.2) d = 5 %,
– maximum acceleration at the level of the foundation plate 2.5 × ca = 0.3 g.
fig. 2: winkler-pasternak factors of the elastic foundation
fig. 3: design elastic response spectrum
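with the parameters listed above, the shape of the design elastic response spectrum (fig. 3) can be sketched as follows; the ramp/plateau/descending-branch construction is the standard ubc 1997 form and is shown here as an illustrative assumption, not as a restatement of the paper's own calculation:

```python
# sketch of the ubc 1997 elastic response spectrum for ca = 0.12, cv = 0.18
def ubc97_spectrum(T, ca=0.12, cv=0.18):
    """spectral acceleration [g] for period T [s]."""
    Ts = cv / (2.5 * ca)        # control period, here 0.6 s
    T0 = 0.2 * Ts               # here 0.12 s
    if T <= T0:                 # linear ramp from ca up to the plateau
        return ca + 1.5 * ca * T / T0
    if T <= Ts:                 # plateau: 2.5 * ca = 0.3 g, as quoted above
        return 2.5 * ca
    return cv / T               # long-period descending branch

assert abs(ubc97_spectrum(0.3) - 0.3) < 1e-9   # reproduces the 0.3 g plateau
```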
4 natural vibration
for the analyses of the response of the structure to seismic load, it is necessary to know the tuning of the structural system. for these purposes, the natural vibration was calculated: the lowest 100 natural modes of vibration and the equivalent natural frequencies in the frequency interval from approx. 0 hz to 50 hz. this range is sufficient, as seismic excitation has substantial components up to approx. 32 hz. the lowest natural frequencies and the modes up to approx. 20 hz are important for determining the seismic response of the structure; the influence of the higher modes on the total seismic response is very small due to the variable nature of these higher modes (above approx. 20 hz the influence is in units or tenths of a per cent of the total response), and from frequencies above approx. 30 hz their influence on the total seismic response is even lower. a description of the 12 lowest natural modes of vibration is included in table 1.
table 1: natural frequencies of vibration [hz]
(i) – f(i) – dominant vibrating part
1 – 2.19 – rotating vibration of both towers around axis x
2 – 2.57 – rotating vibration of both towers around axis y
3 – 3.28 – sliding vibration of the towers on the sub-base in the direction of axis z
4 – 3.31 – torsional vibration of the walls around axis z
5 – 4.95 – higher bending mode of rotating vibration around axis x
6 – 5.25 – higher bending mode of rotating vibration around axis y
7 – 11.81 – bending vibration of the longitudinal walls
8 – 12.22 – higher mode of bending vibration of the longitudinal walls
9 – 13.48 – bending vibration of the transversal walls
10 – 14.19 – bending vibration of the lower cross and longitudinal walls
11 – 18.62 – higher mode of bending vibration of the lower cross and transversal walls
12 – 20.85 – torsional vibration of the beam crosses
fig. 4: comparison of design moments in plate elements for individual groups of loads
5 structural response
the seismic responses were analysed by means of seismic load decomposition into the natural modes of vibration (dynamic modal analysis, according to [2]). seismic analyses were executed for both horizontal directions x and y of the load action, taking into account the vertical composite action of the seismic excitation (ground motion in direction z) according to [2], para 1631.2.5. the envelope of the response values was formed on the basis of a seismic combination of these two dynamic analysis states. the analyses of the response of the structure to the seismic load determined the envelope of the displacements and internal forces equivalent to the maximum and minimum branch of the envelope of the two load effects in directions x+z and y+z. when sizing the structure, it is allowable to reduce the earthquake load by the coefficient of ductility r according to the american standard ([2], para 1631.5.4), which for cooling towers (according to tab. 16-p in [2]) is determined as r = 3.6. the calculated values of the envelopes of the displacements and internal forces due to the seismic loading states were compared with the envelopes of the other loading states due to the dead, operational and live loads, wind and temperature actions (figs. 4, 5, 6). the seismic effects, taking into account the ductility properties of the structure, are lower than the actions of wind pressure and the rest of the static load, especially the temperature action due to the environment and surface insolation.
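the reduction of the elastic seismic demand by the ductility coefficient is a plain division by r; the modal combination rule shown below (srss) is a common choice, but the paper does not state which rule was used, so both the rule and the numbers are purely illustrative:

```python
# illustrative only: combine modal responses and reduce by ductility r
def design_value(modal_values, r=3.6):
    """srss combination of modal internal forces, divided by the
    ductility coefficient (r = 3.6 for cooling towers per [2], tab. 16-p)."""
    srss = sum(v * v for v in modal_values) ** 0.5
    return srss / r

# hypothetical bending moments [knm] from the dominant modes of table 1
m_design = design_value([120.0, 45.0, 20.0])
```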
note: when reducing the earthquake load by the ductility factor r, it is necessary to satisfy the condition of a reasonable, sufficient size of the shear reinforcement and the spatial bending reinforcement, e.g. with a double-sided stressed reinforcement (in a reinforcement cross) on both surfaces. the results of the response are determined in terms of displacements and internal forces. the two characteristics are given separately for the maximum and minimum branch of the envelope of the relevant combination. in order to represent the results of the calculated response clearly, the structure was divided into particular parts of plate elements and beam elements. the particular results are correlated with the coordinate systems as follows. the internal forces in the plate elements, i.e. moments mx and my, and axial forces nx and ny, are determined in the middle plane of the plates, and they are correlated with the local axes of the plate elements. in the vertical members (the walls, stiffeners and diffuser), local axis x has a horizontal (global) direction and local axis y has a vertical (global) direction; local axis z has the direction normal to the middle plane of the element. the internal forces in the horizontal structures (ceilings, baseplate) have local axes x and y parallel to the global axes; axis z is normal to the centre line of the member, and it is directed upwards. in the beam elements, local axis x is the axis of the centre line of the member; axes z and y are the axes of the cross section through the beam member. axis z is as a rule the axis in the direction of the longer dimension (height) of the section.
6 conclusion
using the example of a cooling tower unit, this paper analyses the influence of wind, natural seismicity and temperature effects, beyond the usual static load (dead, operational and live loads), on the static and dynamic response, and it compares the significance of these load types for the safety and reliability of the structure.
fig. 5: comparison of design axial forces in plate elements for individual groups of loads
the comparison has revealed that the dominant effect on the structure with reference to its safety (maximum displacements, extreme stress state in selected cross sections, etc.) is exercised by the temperature effect together with the design wind load. the effects of natural seismicity (without reduction of this load by the ductility factor r) are comparable with the dynamic wind load within the interval of the design wind velocities. however, technical seismicity may become dominant for the reliability of the structure when there is vibration of selected parts, such as joints, measuring probes installed in the structure for technological purposes, etc. for the structural design, the static loads were combined with the wind effect. the wind effect is greater than the reduced seismic load. the reinforced concrete structure of the towers must also have sufficient reserves for ductility strain at seismic load. in order to enable this ductility strain, the structure must be appropriately reinforced, especially by shear reinforcement; the principles of reinforcement are given in the standard [4]. when using the ductility factor for sizing the dimensions of the structure, it must be taken into account
that after an earthquake the structure will be damaged and must be repaired (ductility strain assumes the occurrence of cracks). the internal process equipment inside the towers will probably need to be replaced.

7 acknowledgment
this paper was supported as a part of the research projects gačr no. 103/03/0082 “nonlinear response of structures under extraordinary loads and man-induced actions” and gačr no. 103/06/1521 “reliability and risks of structures in extreme conditions”, for which the authors would like to thank the agency.

references
[1] ansi/asce 7-95: minimum design loads for buildings and other structures.
[2] ubc 1997: uniform building code – volume 2, division iv – earthquake design.
[3] bs cp3:c5, p2: basic data for the design of buildings. chapter v – loading, part 2: wind loads.
[4] building code requirements for structural concrete (aci 318m-02) and commentary (aci 318rm-02), metric version.
[5] eurocode 2: design of concrete structures. en 1992, december 2004.
[6] makovička, d.: ductile behaviour of dynamically loaded structures. in: eurodyn ’99, a. a. balkema, rotterdam, 1999, p. 1136–1140.
[7] makovička, d., makovička, d.: assessment of the seismic resistance of a ventilation stack on a reactor building. nuclear engineering and design, vol. 235 (2005), elsevier b.v., p. 1325–1334.
[8] makovička, d., makovička, d.: ductile characteristics of rc structures under seismic effects (in czech). beton 2005, no. 9, p. 42–47.

doc. ing. daniel makovička, drsc.
klokner institute
czech technical university in prague
šolínova 7
166 08 praha 6, czech republic

fig. 6: comparison of the design axial and shear forces and the moments in the beam elements for individual groups of loads

implementation of sliding mode observer based reconfiguration in an autonomous underwater vehicle
a. j. mitchell, e. w. mcgookin, d. j. murray-smith
this paper looks at the implementation of a sliding mode observer (smo) based reconfiguration algorithm to deal with sensor faults within the context of navigation controllers for an autonomous underwater vehicle (auv). in this paper the reconfigurability aspects are considered for the heading controller. simulation responses are used to illustrate that the sliding mode observer is able to give state information to the controller when there is a fault in the auv’s sensor package. comparisons are made between the sliding mode controller with and without reconfigurability for a number of different sensor failures, e.g. bias errors in or the complete loss of the heading data, and the robustness of the sliding mode observer is investigated through the introduction of disturbances into the system.
keywords: autonomous underwater vehicles, sliding-mode control, fault tolerance, reconfiguration, sliding mode observers, environmental disturbances.

1 introduction
autonomous underwater vehicles (auvs) are submersible robots that operate independently without human intervention [1, 2, 3]. these vehicles have a multitude of uses, including maintenance, diver support, pipeline inspection and geological surveying. as many of these tasks require the auv to operate in close proximity to obstacles and hazards, it is imperative that their motion is accurately controlled. this requires that the particular control system designs must be robust and reliable [1, 2, 3]. automatic control of this type would enable the auv to hold its position or manoeuvre accurately with minimal adverse influence from the external disturbances present in the hostile subsea environment, e.g. ocean currents. the performance of any controller relies on accurate feedback from sensors [4]. unfortunately, this feedback is not always available, due to system failures. there are two possible methods to deal with the sensor faults considered in this paper. the first is to have redundancy built into the system by having multiple sensors for each state to be measured. this incurs the problems of added cost, weight and reduced battery life, none of which are desirable in an auv. the second method is to reconstruct the output of the faulty sensor from the responses of the remaining good sensors. this is achieved by using an observer to estimate what the faulty signal should actually be reading and then replacing the faulty signal with this estimate of the true value. it is this second method that is presented in this paper. the control method that will be used is nonlinear sliding mode control.
this control structure has been used in numerous papers, where it has been shown to be robust and reliable for submersible vehicle control [1, 2, 3]. the advantage of sliding mode control over linear control structures is that there is an extra nonlinear switching term that is able to overcome the matched unmodelled dynamics present in the system. the reconfiguration method used in this paper makes use of a nonlinear sliding mode observer (smo) [4, 5, 6]. the theory and design of smos for reconfiguration is given by edwards and spurgeon [5] and by utkin and guldner [6], and this is furthered by mcgookin [4], who looks into their use in submarine control reconfiguration. this paper continues that research by looking into their use in auvs. this paper is split into 5 sections. the first looks at the type of auv model used in this paper. the second describes the control structure used for the auv. the third gives an overview of smos and how they are designed for reconfiguration. the fourth section gives results of the smos reconfiguring the system against faults injected into the system, before the final section gives the conclusions.

2 system model
the simulations used in this paper make use of a nonlinear 6 degrees of freedom (dof) model to give as accurate a representation of the real system as possible. a diagram of the auv model used is shown in fig. 1, annotated with the main reference frames, axes and state variables.

fig. 1: graphical representation of the auv showing reference frames, axes and state variables

it is based on the nps auv ii given in [1, 2]. in fig. 1 it can be seen that the auv has vertical and horizontal control surfaces at both the bow and stern. for the work in this paper only the stern control surfaces are used to control the auv, and the bow surfaces are set to their trim values of 0°. the control surfaces are limited to ±20°, the thrusters to ±1500 rpm, and all the actuator signals are passed through a first order filter to limit the actuator rates [1, 2]. the state space form of the nonlinear equations of motion is given below [2].
$$\begin{bmatrix} \dot{\nu} \\ \dot{\eta} \end{bmatrix} = \begin{bmatrix} -M^{-1}\big(C(\nu)+D(\nu)\big) & 0 \\ J(\eta) & 0 \end{bmatrix} \begin{bmatrix} \nu \\ \eta \end{bmatrix} + \begin{bmatrix} -M^{-1} g(\eta) \\ 0 \end{bmatrix} + \begin{bmatrix} M^{-1} B(\nu) \\ 0 \end{bmatrix} \mathbf{u} \qquad (1)$$

where $\nu = [u\ v\ w\ p\ q\ r]^T$, $\eta = [x\ y\ z\ \phi\ \theta\ \psi]^T$, $\mathbf{u} = [\delta_r\ \delta_s\ n]^T$, $B(\nu)$ is the nonlinear input matrix, $C(\nu)$ is the matrix of coriolis and centripetal terms, $D(\nu)$ is the damping matrix, $g(\eta)$ is the vector of gravitational and buoyancy forces and moments, $J(\eta)$ is the euler transformation matrix, and $M$ is the mass and inertia matrix. these equations can be represented by the simplified nonlinear state space form shown below:

$$\dot{x} = A(x)\,x + B(x)\,\mathbf{u}, \qquad (2)$$

where $x = [\nu\ \eta]^T$. this can then be linearised to give the following standard state equation:

$$\dot{x} = A x + B \mathbf{u}. \qquad (3)$$

here $A$ is the system matrix and $B$ is the input matrix.

2.1 disturbances
to test the controllers and the reconfiguration in a realistic simulation environment, ocean current disturbances have been modelled. ocean currents are the turbulent flows within bodies of water [2, 7]. for an auv working near the seabed and in close proximity to objects, the ocean currents would be highly turbulent, as the objects shed vortices into the stream. to model the sea currents it is required to know the velocity of the current flow, $V_c$, the angle of attack, $\alpha_c$, and the sideslip angle, $\beta_c$, of the current in the earth-fixed reference frame. for this paper $\alpha_c$ is set to a predefined value of 0° and then $V_c$ and $\beta_c$ are varied randomly. fig. 2 shows graphically the relationship between the sea current’s velocity and the earth-fixed axes. the three terms are then converted into velocities in the earth-fixed reference frame using the following equations:

$$u_{ec} = V_c \cos\alpha_c \cos\beta_c, \qquad (4)$$
$$v_{ec} = V_c \cos\alpha_c \sin\beta_c, \qquad (5)$$
$$w_{ec} = V_c \sin\alpha_c. \qquad (6)$$

these can then be changed using euler transforms to give the velocities in the body-fixed reference frame, $u_c$, $v_c$, $w_c$. these values are then used to create the vector $\nu_c = [u_c\ v_c\ w_c\ 0\ 0\ 0]^T$ that is added by the principle of superposition to the auv model by altering the simulation equations as shown below:

$$\nu_r = \nu - \nu_c, \qquad (7)$$
$$M\dot{\nu}_r + C(\nu_r)\,\nu_r + D(\nu_r)\,\nu_r + g(\eta) = B(\nu)\,\mathbf{u}, \qquad (8)$$
$$\dot{\eta} = J(\eta)\,\nu. \qquad (9)$$

fig. 2: diagram showing the disturbance in the earth-fixed reference frame
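a minimal sketch of the current model, eqs. (4)-(6), followed by the earth-to-body transformation; the zyx euler rotation is the standard one from [2], and all numerical values are illustrative assumptions:

```python
# a minimal sketch of the ocean current disturbance model; values are
# illustrative, and the rotation matrix is the standard zyx Euler form.
import numpy as np

def current_body_velocities(Vc, alpha_c, beta_c, phi, theta, psi):
    """earth-fixed current velocities, eqs. (4)-(6), then body-fixed frame."""
    u_ec = Vc * np.cos(alpha_c) * np.cos(beta_c)
    v_ec = Vc * np.cos(alpha_c) * np.sin(beta_c)
    w_ec = Vc * np.sin(alpha_c)
    # rotation matrix from body to earth frame (zyx Euler angles)
    cph, sph = np.cos(phi), np.sin(phi)
    cth, sth = np.cos(theta), np.sin(theta)
    cps, sps = np.cos(psi), np.sin(psi)
    R = np.array([
        [cps*cth, cps*sth*sph - sps*cph, cps*sth*cph + sps*sph],
        [sps*cth, sps*sth*sph + cps*cph, sps*sth*cph - cps*sph],
        [-sth,    cth*sph,               cth*cph]])
    return R.T @ np.array([u_ec, v_ec, w_ec])   # earth -> body frame

uc, vc, wc = current_body_velocities(0.1, 0.0, np.deg2rad(30),
                                     0.0, 0.0, np.deg2rad(45))
nu_c = np.array([uc, vc, wc, 0.0, 0.0, 0.0])   # added by superposition, eq. (7)
```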
3 controller design

3.1 decoupled controller design
since most manoeuvres performed by auvs can be broken down into 3 basic manoeuvres [1, 2, 7], i.e. a change in speed, a change in direction and a change in depth, the lower level of control is made up of 3 decoupled controllers. for each of these controllers a submodel of the auv dynamics is generated by reducing the full 6 dof model to smaller ones that are based on the dominant states for each manoeuvre [1, 2, 7]. this process of isolating specific dynamics is called decoupling. for this paper only the heading submodel is considered, and this is made up of the yaw rate, $r$, and the heading angle, $\psi$. the sway velocity, $v$, can also be used in the heading submodel, but this is found to reduce the disturbance rejection abilities of the controller and so is omitted in this study. the control actuator that is used for altering the heading is the rudder at the stern, $\delta_r$.

3.2 sliding mode control
the sliding mode controller used here is sims (single input multiple state) [1, 2, 7]. sliding mode controllers are made up of two parts, the equivalent term, $u_{eq}$, and the switching term, $u_{sw}$ [1, 2, 7]. the equivalent term is a linear controller that is optimised around a specific operating condition. the switching term is the nonlinear part of the controller and works by giving the extra control action required to drive the system back to the operating condition for which the equivalent term is valid. the equivalent term used in this study is of the state feedback gain controller form [1, 2, 3, 7, 8]:

$$u_{eq} = -\mathbf{k}^T x, \qquad (10)$$

where $\mathbf{k}$ is the feedback gain calculated using robust pole placement techniques [1, 2, 8], and $x$ is the sub-system state vector. in this project all pole placement has been carried out using the place command in matlab, which uses an iterative method to place the poles [8]. the switching term is calculated from the equation for the sliding surface, given in [1, 2] as

$$\sigma(\tilde{x}) = \mathbf{h}^T \tilde{x}, \qquad (11)$$

with $\tilde{x}$ being the state tracking error. this equation can then be expanded to give the following equation [1, 2, 3, 7]:

$$u_{sw} = (\mathbf{h}^T \mathbf{b})^{-1} \left[ \mathbf{h}^T \dot{x}_d - \mathbf{h}^T \hat{f}(x) + \eta\, \mathrm{sgn}\big(\sigma(\tilde{x})\big) \right]. \qquad (12)$$

here $\mathbf{h}$ is the right eigenvector of $A_c = A - \mathbf{b}\mathbf{k}^T$, with $A$ and $\mathbf{b}$ being the system and input matrices defined previously, and $\hat{f}(x)$ is an estimation of the nonlinearities in the system, which for this paper is set to zero. as the above equation uses the signum function, sgn(·), which has only three outputs, −1, 0 and 1, this leads to chattering when the system is on the sliding surface [1, 2, 3, 7]. to get around this problem the sgn(·) term is replaced by the hyperbolic tangent function tanh(·/φ), where φ is the boundary layer size [1, 2, 3, 7]. this gives the response a much smoother transition across the sliding surface while still giving the full control effort outside the boundary layer. bringing the two terms together and making the aforementioned changes gives the full control structure that is used in this paper, shown below:

$$u = -\mathbf{k}^T x + (\mathbf{h}^T \mathbf{b})^{-1} \left[ \mathbf{h}^T \dot{x}_d + \eta \tanh\!\big(\sigma(\tilde{x})/\varphi\big) \right]. \qquad (13)$$

3.3 guidance
the method of guidance that has been implemented is a simple line-of-sight (los) waypoint following guidance [2]. instead of having a fixed list of course and time data as its input, this guidance has a list of waypoints, depths and approach velocities. the guidance then calculates the direction that the auv should take in order to head towards the waypoint. the auv continues towards the waypoint until it enters a “sphere of acceptance” around the waypoint, at which point the guidance moves on to the next waypoint. a graphical representation of the auv heading towards a waypoint is given in fig. 3. the equations to calculate the desired values that the controllers require are given below [1, 7]:

$$v_d = v_w, \qquad (14)$$
$$\psi_d = \tan^{-1}\!\left(\frac{y_w - y_c}{x_w - x_c}\right), \qquad (15)$$
$$z_d = z_w. \qquad (16)$$
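the control law (13) and the los rule (15) fit in a few lines; a minimal sketch in which the gains k, h, b, eta and phi are hypothetical placeholders rather than the authors' tuned values:

```python
# a minimal sketch of the heading control law (13) and los guidance (15);
# all gains and vectors below are hypothetical placeholders.
import numpy as np

k = np.array([2.0, 1.5])        # feedback gains (pole placement), assumed
h = np.array([0.5, 1.0])        # eigenvector defining the sliding surface
b = np.array([0.8, 0.0])        # input distribution of the heading submodel
eta, phi = 0.4, 0.05            # switching gain and boundary layer size

def smc_heading(x, xd, xd_dot):
    """u = -k'x + (h'b)^-1 [h' xd_dot + eta*tanh(sigma/phi)], with f_hat = 0."""
    sigma = h @ (x - xd)        # sliding surface, eq. (11)
    return -k @ x + (h @ xd_dot + eta * np.tanh(sigma / phi)) / (h @ b)

def los_heading(xw, yw, xc, yc):
    """desired heading towards the active waypoint, eq. (15)."""
    return np.arctan2(yw - yc, xw - xc)

psi_d = los_heading(100.0, 50.0, 0.0, 0.0)
u = smc_heading(np.array([0.01, 0.2]),        # x = [r, psi]
                np.array([0.0, psi_d]),       # desired state
                np.zeros(2))                  # desired state derivative
```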
4 reconfiguration
a sliding mode observer based reconfiguration method is used in this study. this utilises a sliding mode observer to follow the system; when a fault is detected, the output of the smo is substituted into the system in place of the faulty signal [4, 5, 6]. again, as with the control structure, submodels are used to split the full system into smaller decoupled systems. the submodel used for the heading observer is, however, different from that used in the controller design in that it has the sway velocity, $v$, included in it to make it a three state system. the extra state is used to give a better performance of the observer, as it allows the observer to operate with two faultless states when estimating the heading. the equations for an smo in the case of an lti system are shown below [4, 5]:

$$\begin{bmatrix} \dot{\hat{x}}_s \\ \dot{\hat{y}}_s \end{bmatrix} = A_s \begin{bmatrix} \hat{x}_s \\ \hat{y}_s \end{bmatrix} + B_s u_s + N v_0, \qquad (17)$$

where $A_s$ and $B_s$ are the subsystem’s system matrix and input distribution vector respectively, $\hat{x}_s$ is the estimate of the unflawed subsystem states $[v\ r]^T$, $y_s$ is the subvector of the system output, $[\psi]$, $\hat{y}_s$ is the observer’s output $[\psi]$, $u_s$ is the input to the subsystem $[\delta_r]$, $N$ contains the sliding mode gains and $v_0$ is the switching term. the switching term gives the observer the extra action to overcome the nonlinearities and matched unmodelled dynamics present in the system. when fault free, the switching term $v_0$ is defined as the sign of the difference between the actual system output and the observer output [4]. the switching term is defined as [4, 5]:

$$v_0 = \mathrm{sgn}(y_{so} - \hat{y}_s). \qquad (18)$$

here $y_{so}$ is the output of the actual system that is equivalent to the output of the observer, i.e. for the heading submodel $y_{so}$ would be the heading angle, $\psi$. the problem, however, is that when there is a fault in this signal the observer would follow the faulty signal. to get around this problem the switching term is altered so that it uses a comparable signal that is unflawed. for the heading observer the yaw rate, $r$, is used to replace the faulty $\psi$ in the switching term, as this is approximately the derivative of the heading angle [4]. to reduce the chattering effect [4, 5], soft switching in the form of a hyperbolic tangent function with a boundary layer of thickness $\varphi_0$ replaces the signum function in the switching term, which is then written as:

$$v_0 = \eta_h \tanh\!\left(\frac{r - \hat{x}_{s,2}}{\varphi_0}\right), \qquad (19)$$

where $\hat{x}_{s,2}$ is the observer’s yaw rate estimate. a diagram of the complete control structure for the reconfigurable heading controller is shown in fig. 4.

fig. 3: diagram showing the auv heading towards a waypoint
fig. 4: diagram of the control structure for the heading submodel including the reconfiguration logic

5 results
to get the following results a zigzag pattern of waypoints is laid out for the auv to follow. this allows the auv to perform both right and left turns and should highlight any asymmetry in its handling characteristics. the two faults that the reconfiguration is tested for are a complete failure of the heading sensor, where the heading output is 0°, and a bias of 10° added into the heading sensor. both of these faults are simulated to happen after 115 seconds, and the faults are detected 3 seconds after that. all of the simulations have been carried out using the full 6-dof model at a depth of 40 m and with the auv travelling at 1.7 m/s. the first graph, fig. 5, shows how the auv acts if there is a complete failure in the heading sensor when no reconfiguration is present in the system. from this it can be seen that the auv loses the ability to follow the desired heading completely, with the model of the auv breaking down just after 200 s has elapsed. this breakdown in the model is the result of the speed controller trying to compensate for the cross coupling that becomes noticeable when the actuators are at full deflection.
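for reference, the observer update (17)-(19) of section 4 can be condensed as follows; a toy sketch in which As, Bs, N and the gains are hypothetical placeholders, not the identified auv submodel:

```python
# a minimal sketch of the smo reconfiguration update, eqs. (17)-(19),
# for the three-state heading submodel [v, r, psi]; all numbers are
# hypothetical placeholders.
import numpy as np

As = np.array([[-0.5, 0.2, 0.0],
               [0.1, -0.8, 0.0],
               [0.0,  1.0, 0.0]])   # last row encodes psi_dot = r
Bs = np.array([0.3, -0.6, 0.0])     # rudder input distribution, assumed
N  = np.array([0.0, 0.5, 1.0])      # sliding mode gains, assumed
eta_h, phi0, dt = 1.0, 0.02, 0.01

def smo_step(x_hat, delta_r, r_meas):
    """euler step of (17) with the switching term (19) driven by yaw rate."""
    v0 = eta_h * np.tanh((r_meas - x_hat[1]) / phi0)   # eq. (19)
    x_hat_dot = As @ x_hat + Bs * delta_r + N * v0     # eq. (17)
    return x_hat + dt * x_hat_dot

x_hat = np.zeros(3)
for _ in range(300):                # observer tracking the healthy yaw rate
    x_hat = smo_step(x_hat, delta_r=0.1, r_meas=0.05)
psi_reconfigured = x_hat[2]         # substituted for the faulty sensor output
```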
fig. 6 gives a view looking down on the path that the auv takes, starting in the top right of the graph and travelling towards the waypoints, which are denoted as circles with a radius the size of the sphere of acceptance. from this it can be seen that the failure in the heading sensor, which happens just after the auv has passed through the first waypoint, causes it to lose direction and snake from side to side before the simulation becomes unstable and the auv appears to shoot off into infinity. figs. 7 and 8 show the response of the auv when a bias error is added to the heading sensor without reconfiguration. in the yaw angle response in fig. 7 it can be seen that the auv continues to follow accurately the value that is given by the sensor. the large spike in the rudder actuation is caused by the controller suddenly being presented with a 10° error in the heading angle and trying to correct for it. the problem caused by this bias error in the sensor is noticeable in fig. 8, where the auv misses the next waypoint. this is caused by the error bias being large enough that the auv is directed on a course that misses the sphere of acceptance around the waypoint. fig. 9 shows the heading and rudder responses when the smo based reconfiguration is implemented in the system. from the yaw response it can be seen that there is a small deviation from the commanded heading immediately after the fault, but this is quickly brought under control. also of note from this graph is the accuracy of the smo in estimating the heading angle; after the fault has been detected the output from the sensor is overwritten with the data from the observer.

fig. 5: heading angle and rudder usage responses with complete failure of the heading sensor
fig. 6: xy position plot showing movement of the auv when the heading sensor fails completely
fig. 7: heading angle and rudder usage responses with a bias fault in the heading sensor

it can be seen in the graph that the actual and observed (the sensed line after the fault) heading angles lie on top of one another. in the rudder response it can be seen that there is a very large spike in the required value around the time of the fault. this is caused by the sudden step of 150° that appears in the error signal sent to the controller when the sensed heading suddenly becomes 0°. in the actual auv these large spikes would cause undue stress and wear on the actuators, reducing their lifespan and possibly even damaging them directly. from fig. 10 it can be seen that even though there is a fault in the heading sensor the auv is still able to reach the next waypoint and then head in the direction of the waypoint after that. this suggests that even with this complete heading sensor failure the auv would be able to complete its mission. the heading and rudder responses for a 10° bias failure when reconfiguration is present are given in fig. 11. from these it can be seen that the smo is able to accurately reconstruct the faulty signal and is again able to follow the actual response very closely. as with the complete failure case, the actuator response around the time of the fault is characterised by large spikes, as the step inputs of the fault and then the reconfiguration cause the controller to react violently in an attempt to follow its input signal.
from the view looking down on the auv in fig. 12 it can be seen that the auv is again able to continue on its mission even with one signal being faulty. the ocean current based disturbances were then added to the system to test how well the observer would be able to cope in a more realistic environment. for this paper the currents were set to vary between 0 and 0.1 m/s, with a constant $\alpha_c$ of 0° and varying $\beta_c$.

fig. 8: xy position plot showing movement of the auv when the heading sensor has a 10° bias error
fig. 9: heading angle and rudder usage responses with complete failure of the heading sensor with reconfiguration present
fig. 10: xy position plot showing movement of the auv when the heading sensor fails completely with reconfiguration present
fig. 11: heading angle and rudder usage responses with a bias fault in the heading sensor with reconfiguration present
fig. 12: xy position plot showing movement of the auv when the heading sensor has a 10° bias error with reconfiguration present

from fig. 13 it can be seen that there is no visible degradation in the ability of the smo to follow the actual heading when there is a complete failure of the sensor with the disturbances in the system. the view in fig. 14 shows that the auv is also able to continue on its mission with a failed heading sensor, as it is still able to direct itself with accuracy towards the desired waypoints. in fig. 15 the yaw response again shows that the introduction of disturbances when a bias error fault occurs in the heading sensor also has very little impact. the response is again very similar to fig. 11, where no disturbance is present. in fig. 16 it can be noted that, again, the auv is able to continue on its mission, as the reconfiguration is able to overcome the effects of the failure and guide the auv to the next waypoint.

6 conclusions
in conclusion, it can be seen that sliding mode observers perform well as a method of reconfiguration. without any reconfiguration the auv is unable to continue on its mission when either a complete failure or a bias error occurs in the heading sensor. with the smo based reconfiguration the auv is able to accurately track its desired heading and continue with its mission. even when disturbances are introduced into the system the smos are able to accurately estimate the heading of the auv. one problem, rooted in the high gain nature of the sliding mode controllers, is the large spikes in the actuator signals when the faults first appear and are then dealt with. a possible solution would be to place a filter either on the input or the output of the controller to reduce the stress that this would inflict on the actuators, although this would degrade the recovery performance of the system.

references
[1] healey, a. j., lienard, d.: “multivariable sliding mode control for autonomous diving and steering of unmanned underwater vehicles.” ieee journal of oceanic engineering, vol. 18 (1993), no. 3, p. 327–339.
[2] fossen, t. i.: guidance and control of ocean vehicles. chichester (england): john wiley and sons ltd, 1994.

fig. 13: heading angle and rudder usage responses with complete failure of the heading sensor with reconfiguration and disturbances present
fig. 14: xy position plot showing movement of the auv when the heading sensor fails completely with reconfiguration and disturbances present
fig. 15: heading angle and rudder usage responses with a bias fault in the heading sensor with reconfiguration and disturbances present
fig. 16: xy position plot showing movement of the auv when the heading sensor has a 10° bias error with reconfiguration and disturbances present

[3] mitchell, a., mcgookin, e., murray-smith, d.: “comparison of control methods for autonomous underwater vehicles.” proceedings of the 1st ifac workshop on guidance and control of underwater vehicles (gcuv2003), newport, wales, 2003, p. 41–46.
[4] mcgookin, e. w.: “fault tolerant sliding mode control for submarine manoeuvring.” proceedings of the 1st ifac workshop on guidance and control of underwater vehicles (gcuv2003), newport, wales, 2003, p. 119–124.
[5] edwards, c., spurgeon, s. k.: sliding mode control: theory and applications. london (england): taylor & francis ltd, 1998.
[6] utkin, v., guldner, j., shi, j.: sliding mode control in electromechanical systems. london (england): taylor & francis ltd, 1999.
[7] mcgookin, e. w.: “sliding mode control of a submarine.” m.eng thesis, glasgow university, scotland, 1993.
[8] kautsky, j., nichols, n. k., van dooren, p.: “robust pole assignment in linear state feedback.” international journal of control, vol. 41 (1985), no. 5, p. 1129–1155.

alasdair j. mitchell
phone: +44 141 330 6137
fax: +44 141 330 6004
e-mail: a.mitchell@elec.gla.ac.uk
euan w. mcgookin
david j. murray-smith
centre for systems and control
department of electronics and electrical engineering
university of glasgow, glasgow, g12 8lt, uk

numerical simulation of nanoscale double-gate mosfets
r. stenzel, l. müller, t. herrmann, w. klix
the further improvement of nanoscale electron devices requires support by numerical simulations within the design process. after a brief description of our simba 2d/3d device simulator, the results of the simulation of dg-mosfets are presented. starting from a basic structure with a gate length of 30 nm, the model parameters were calibrated on the basis of measured values from the literature. afterwards, variations of gate length, channel thickness and doping, gate oxide parameters and source/drain doping were made in connection with numerical calculation of the device characteristics. then a dg-mosfet with a gate length of 15 nm was optimized. the optimized structure shows suppressed short channel behavior and short switching times of about 0.15 ps.
keywords: device simulation, semiconductor devices, double-gate mosfet.

notation
p, n – hole and electron density (cm⁻³)
ψ – electrostatic potential (v)
n_a, n_d – ionized acceptor and donor density (cm⁻³)
ε_s – permittivity of the semiconductor (a·s·v⁻¹·m⁻¹)
q – elementary charge (a·s)
j_p, j_n – current densities (a·cm⁻²)
r, g – recombination and generation rate (cm⁻³·s⁻¹)
μ_p, μ_n – carrier mobilities (cm²·v⁻¹·s⁻¹)
λ_p, λ_n – quantum correction potentials (v)
φ_p, φ_n – band parameters (v)
d_p, d_n – diffusion coefficients (cm²·s⁻¹)
t_p, t_n – carrier temperatures (k)
k_b – boltzmann constant (ev·k⁻¹)
s_p, s_n – energy flux densities (w·cm⁻²)
e – electric field strength (v·cm⁻¹)
t_l – lattice temperature (k)
τ_wp, τ_wn – energy relaxation times (s)
m_p, m_n – carrier effective masses (kg)
ℏ – reduced planck’s constant (j·s)
γ_p, γ_n – quantum correction coefficients
κ_p, κ_n – thermal conductivities (w·k⁻¹·cm⁻¹)

1 introduction
double-gate (dg) mosfets are considered to be a promising candidate for nanoscale cmos. the international technology roadmap for semiconductors (2005 edition) predicts printed gate lengths down to 15 nm for the next ten years. for these gate lengths conventional mosfets are limited by different short channel effects. on the other hand, structures with two gates and an extremely thin body demonstrate better control of the gate region and consequently suppression of short channel effects. numerical device simulation is an important procedure for the design and optimization of novel semiconductor devices. some advantages are that the electrical behavior is calculated before the fabrication process, non-measurable inner-electronic values can be calculated and visualized, and costs are reduced due to diagnosis and fault detection in the technological process.
2 simulation models
quantum hydrodynamic (qhd) models, which are based on a quantum fluid dynamic model, offer new ways to understand and design quantum sized semiconductor devices. the advantage of this model is its macroscopic character, which enables us to obtain a description without knowing quantum mechanical details like the initial wave function [1], [2], [3]. the classical hydrodynamic (hd) model for semiconductor device simulation can be extended by additional expressions in the transport equations and in the energy balance equations. these describe an internal quantum potential in the transport equation as well as a quantum heat flux in the energy balance equation. these additional terms in the classical hydrodynamic model allow us to describe the continuous electron and hole distribution in a semiconductor device, accumulations of carriers in potential wells, and resonant tunneling of carriers, respectively. the standard model for universal device simulations is the drift-diffusion (dd) model, which can be derived from the above mentioned model [4]. basic equations of the qhd model are the poisson equation

$$\nabla \cdot (\varepsilon_s \nabla \psi) = -q\,(p - n + N_D - N_A), \qquad (1)$$

the continuity equations (index p: holes, index n: electrons)

$$\nabla \cdot J_p = -q\,(R - G) - q\,\frac{\partial p}{\partial t}, \qquad (2)$$
$$\nabla \cdot J_n = q\,(R - G) + q\,\frac{\partial n}{\partial t}, \qquad (3)$$

the transport equations

$$J_p = -q\mu_p\, p\, \nabla(\psi + \varphi_p + \Lambda_p) - qD_p \nabla p - \mu_p k_B \nabla(p\,T_p), \qquad (4)$$
$$J_n = -q\mu_n\, n\, \nabla(\psi + \varphi_n + \Lambda_n) + qD_n \nabla n + \mu_n k_B \nabla(n\,T_n), \qquad (5)$$

the energy balance equations

$$\nabla \cdot S_p = J_p \cdot E - \frac{3}{2} k_B \left[ \frac{p\,(T_p - T_L)}{\tau_{wp}} + (R - G)\,T_p + \frac{\partial (p\,T_p)}{\partial t} \right] + \frac{q}{2} \left[ (G - R)\,\Lambda_p - \frac{\partial (p\,\Lambda_p)}{\partial t} \right], \qquad (6)$$
$$\nabla \cdot S_n = J_n \cdot E - \frac{3}{2} k_B \left[ \frac{n\,(T_n - T_L)}{\tau_{wn}} + (R - G)\,T_n + \frac{\partial (n\,T_n)}{\partial t} \right] + \frac{q}{2} \left[ (G - R)\,\Lambda_n - \frac{\partial (n\,\Lambda_n)}{\partial t} \right], \qquad (7)$$

the energy flux density equations

$$S_p = -\kappa_p \nabla T_p + \left( \frac{5}{2}\,\frac{k_B T_p}{q} + \frac{3}{2}\,\Lambda_p \right) J_p, \qquad (8)$$
$$S_n = -\kappa_n \nabla T_n - \left( \frac{5}{2}\,\frac{k_B T_n}{q} + \frac{3}{2}\,\Lambda_n \right) J_n, \qquad (9)$$

and the equations for the quantum correction potentials

$$\Lambda_p = \frac{\gamma_p \hbar^2}{6\, m_p\, q}\,\frac{\nabla^2 \sqrt{p}}{\sqrt{p}}, \qquad (10)$$
$$\Lambda_n = -\frac{\gamma_n \hbar^2}{6\, m_n\, q}\,\frac{\nabla^2 \sqrt{n}}{\sqrt{n}}. \qquad (11)$$

further approaches are necessary for the carrier mobilities, generation and recombination rates, diffusion coefficients and energy relaxation times, which are mostly material dependent. equations (1) to (11) are solved self-consistently for the variables $(\psi, p, n, T_p, T_n, \Lambda_p, \Lambda_n)$.
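equations (10) and (11) are purely local and are easy to evaluate once a carrier density is given; a minimal sketch for (11) on a 1d grid, with an assumed gaussian electron profile (all profile parameters are illustrative, not simba settings):

```python
# a minimal numerical sketch of the quantum correction potential (11),
# Lambda_n = -(gamma_n*hbar^2 / (6*m_n*q)) * laplacian(sqrt(n))/sqrt(n),
# on a 1d grid; the carrier profile n(x) is a hypothetical assumption.
import numpy as np

hbar = 1.054571817e-34            # J*s
q = 1.602176634e-19               # A*s
m_n = 0.26 * 9.1093837015e-31     # effective mass (silicon-like assumption), kg
gamma_n = 1.0                     # quantum correction coefficient (fit parameter)

x = np.linspace(-10e-9, 10e-9, 401)   # 20 nm cross-section through the channel
dx = x[1] - x[0]
n = 1e23 * np.exp(-(x / 3e-9) ** 2)   # m^-3, assumed carrier profile

s = np.sqrt(n)
lap_s = np.gradient(np.gradient(s, dx), dx)          # d^2 sqrt(n) / dx^2
Lambda_n = -(gamma_n * hbar**2 / (6.0 * m_n * q)) * lap_s / s   # volts
print(f"peak |Lambda_n| = {np.max(np.abs(Lambda_n)) * 1e3:.2f} mV")
```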
if equations (10) and (11) are neglected, i.e. for $\gamma_p = \gamma_n = 0$, the qhd model reduces to the conventional hydrodynamic (hd) model. if the carrier temperatures are set to the lattice temperature and equations (6) to (9) are neglected, the quantum drift diffusion (qdd) model is obtained, and additionally for $\gamma_p = \gamma_n = 0$ the conventional drift diffusion (dd) model. solutions of the equations are achieved by a successive algorithm (the so-called gummel algorithm). for solving the partial differential equations a box method is used. the resulting nonlinear equation systems are solved by the newton method and the corresponding linear equation systems by preconditioned gradient methods. all models are implemented fully three-dimensionally in the simba program system [5], [6].

3 basic structure simulation and verification
the starting point for the simulations is the basic structure represented in fig. 1, as a functionally relevant detail of the real device. the different parameters are assumed as follows:
– gate length l_g = 30 nm,
– source/drain lengths l_s = l_d = 100 nm,
– gate-to-source and gate-to-drain distances l_gs = l_gd = 100 nm,
– silicon film thickness t_si = 20 nm,
– gate oxide thickness t_ox = 2 nm,
– channel doping n_a = 1×10¹⁶ cm⁻³,
– source/drain doping n_d = 3×10²⁰ cm⁻³.

the simulated output characteristics, drain current i_d versus drain-to-source voltage v_ds, are plotted in fig. 2 for different gate-to-source voltages v_gs. verification of the simulation results and calibration of the model parameters were done by comparison with experimental values from [7]. a structure similar to fig. 1 with l_g = 45 nm, t_ox = 2.5 nm, n_d = 2×10²⁰ cm⁻³ was simulated and compared with the measured values. the results, represented in fig. 3, show good agreement. a further successful verification was done with results from [8].

fig. 1: basic structure of the dg-mosfet
fig. 2: output characteristics of the basic structure
fig. 3: transfer characteristics – simulation (simba) and experiment [7]

4 parameter variation and optimization
to study the influence of the structure parameters on the electrical device characteristics, various parameters were modified. figs. 4 and 5 depict the output and the transfer characteristics at different gate lengths. at shorter gate lengths a threshold voltage roll-off can be observed as a typical short channel effect, and the pinch-off behavior disappears for l_g < 30 nm. at the same time, the drain saturation current increases strongly. variation results for the silicon film thickness are represented in fig. 6. thicker channels lead to larger drain currents, and also to a displacement of the threshold voltage toward smaller values. therefore the film thickness should not be greater than 20 nm. thinner gate oxides result in increasing drain currents (fig. 7) and transconductances, and in a better pinch-off behavior. therefore the smallest possible oxide thickness should be applied.
fig. 4: output characteristics at different gate lengths
fig. 5: transfer characteristics at different gate lengths
fig. 6: output characteristics at different channel thicknesses
fig. 7: output characteristics at different gate oxide thicknesses

variation of the channel doping causes a decrease in the drain current for doping densities greater than 10¹⁷ cm⁻³ (fig. 8). the threshold voltage is strongly influenced by doping changes. for the optimization of the structures, the channel doping can be used to adjust the required threshold voltage. the source/drain doping should be high enough to reduce the series resistances, whereas the dopant diffusion into the channel has to be minimized to prevent short channel effects. in this case rapid thermal annealing processes are an essential requirement. fig. 9 shows the corresponding output characteristics. the knowledge gained from the different variations was used for the design of an optimized structure. a minimal technologically practicable gate length of l_g = 15 nm and a gate oxide thickness of t_ox = 1.5 nm are specified. further objectives are a threshold voltage of v_th = 0.1 v, a large drain saturation current and improved dynamical behavior. after several iterations the further parameters were determined as follows: t_si = 3 nm, n_a = 2.8×10¹⁹ cm⁻³, n_d = 7×10²⁰ cm⁻³. the resulting output characteristics are represented in fig. 10. fig. 11 compares the transfer characteristics of the optimized and the basic structure. an enlarged drain current as well as an improved transconductance can be observed. to determine the dynamical behavior, the gate-to-source voltage was switched from 0 v to 1 v to find the turn-on time (t_on) and from 1 v to 0 v to find the turn-off time (t_off).

fig. 8: output characteristics at different channel doping
fig. 9: output characteristics at different source/drain doping
fig. 10: output characteristics of the optimized structure
fig. 11: transfer characteristics of the basic and the optimized structure

fig. 12 shows the time response of the drain current. this provides a switching time of t_on = t_off = 0.15 ps. compared with the basic structure, the switching times are reduced by a factor of 0.6, primarily due to the structure reduction.
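a minimal sketch of how such turn-on/turn-off times can be read off a drain-current transient; the exponential trace below is synthetic stand-in data, not simba output, and the 10 %-90 % criterion is one common convention:

```python
# a minimal sketch of extracting a switching time from a transient;
# the first-order trace and its time constant are assumed, illustrative data.
import numpy as np

t = np.linspace(0.0, 1.0, 2001)               # ps
tau = 0.065                                   # ps, assumed time constant
i_on = 2600.0 * (1.0 - np.exp(-t / tau))      # uA/um, turn-on step response

def rise_time(t, i):
    """10 %-90 % rise time of a monotonically increasing transient."""
    i10, i90 = 0.1 * i[-1], 0.9 * i[-1]
    t10 = t[np.searchsorted(i, i10)]
    t90 = t[np.searchsorted(i, i90)]
    return t90 - t10

print(f"ton ~ {rise_time(t, i_on):.2f} ps")   # ~0.14 ps for tau = 0.065 ps
```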
5 conclusion
the scaling down of planar bulk mosfets according to the international technology roadmap for semiconductors requires new structures such as multiple-gate mosfets; a promising way to this is the use of double-gate transistors. the implementation will be challenging, with numerous new and difficult issues. in this case numerical device simulation is essential. variations of different structure parameters have been carried out to calculate their influence on the device characteristics. based on these results an optimized structure with a gate length of 15 nm was created. the optimized structure shows suppressed short channel effects and switching times of about 0.15 ps.

fig. 12: switching behavior of the optimized structure

references
[1] gardner, c. l.: the quantum hydrodynamic model for semiconductor devices. siam j. appl. math., vol. 54 (1994), no. 2, p. 409–427.
[2] chen, z.: a finite element method for the quantum hydrodynamic model for semiconductor devices. comput. math. appl., vol. 31 (1996), p. 17–26.
[3] wettstein, a., schenk, a., fichtner, w.: quantum device simulation with the density-gradient model on unstructured grids. ieee trans. electron devices, vol. 48 (2001), no. 2, p. 279–284.
[4] selberherr, s.: analysis and simulation of semiconductor devices. berlin, germany, springer-verlag, 1984.
[5] klix, w., stenzel, r.: simba – user manual. http://www.htw-dresden.de/~klix/simba/welcome.html
[6] höntschel, j., stenzel, r., klix, w.: simulation of quantum transport in monolithic ics based on in0.53ga0.47as–in0.52al0.48as rtds and hemts with a quantum hydrodynamic transport model. ieee trans. on electron devices, vol. 51 (2004), no. 5, p. 684–692.
[7] lee, j. h., et al.: super self-aligned double-gate (ssdg) mosfets utilizing oxidation rate difference and selective epitaxy. iedm tech. digest (1999), p. 3.5.1–3.5.4.
[8] vinet, m., et al.: bonded planar double-metal-gate nmos transistors down to 10 nm. ieee electron device letters, vol. 26 (2005), p. 317–319.

prof. dr.-ing. habil. roland stenzel
e-mail: stenzel@et.htw-dresden.de
leif müller
tom herrmann
prof. dr.-ing. habil. wilfried klix
department of electrical engineering
university of applied sciences dresden
friedrich-list-platz 1
d-01069 dresden, germany

directional loudspeaker using a parametric array
a. ritty
the theory of sound reproduction by parametric arrays is based on nonlinear acoustics. due to the nonlinearity of the air, finite amplitude ultrasound interacts with itself and generates audible secondary waves in the sound beam. a special feature of this loudspeaker is its sharper directivity compared to conventional loudspeakers of the same aperture size. this paper describes the basis of the theory used for parametric arrays, and presents the influence of the main parameters, e.g., the carrier frequency. it also describes some signal pre-processing needed to obtain the desired audible sound. a pvdf (polyvinylidenefluoride) film transducer is also studied in order to produce a prototype to confirm the theory.
keywords: parametric array, directional loudspeaker, pvdf.

1 introduction
the nonlinearities of air cause high frequency wave components (primary waves) to interact. this interaction produces new frequencies (secondary waves), which are combinations of the sums and the differences of the frequency components. this process is similar to am demodulation; therefore the term self-demodulation is often used. loudspeakers based on parametric arrays use this phenomenon to generate audible sound with a narrow directivity. this directivity allows distant targeting of specific listeners. the basis of nonlinear acoustics is presented, in order to understand the parametric array and to show the influence of the primary wave characteristics. self-demodulation generates some known distortions in the audible sound. therefore pre-processing must be applied to the signal before it is emitted. two possible methods of signal processing are presented and their efficiency is estimated by simulation.
finally, as we are going to produce a prototype, we choose a pvdf film based transducer and study its response according to different characteristics, e.g., the size of the transducer.

2 nonlinear acoustics
the parametric array was first analyzed by westervelt [1]. he suggests that the sound pressure of the secondary waves ($p_s$) produced by the nonlinear interaction obeys

$$\nabla^2 p_s - \frac{1}{c_0^2} \frac{\partial^2 p_s}{\partial t^2} = -\frac{\beta}{\rho_0 c_0^4} \frac{\partial^2 p_p^2}{\partial t^2}, \qquad (1)$$

where $p_p$ is the sound pressure of the primary waves, $\beta$ is the nonlinear coefficient, $\rho_0$ is the air density, and $c_0$ is the sound velocity. the square of $p_p$ in the source term (the right side of eq. (1)) allows self-demodulation in the area where primary waves exist. secondary waves are generated along this area, in the so-called parametric array. this is limited by the dissipation of the primary waves, which increases with frequency, and by the shock formation distance $l_p$, where strong attenuation occurs. $l_p$ decreases when the frequency or the amplitude of the primary waves increases. berktay [2] studied the self-demodulated wave in the far field and gives the sound pressure ($p_s$) on the axis of propagation,

$$p_s(z, t) = \frac{\beta\, p_{0p}^2\, S}{16 \pi \rho_0 c_0^4\, z\, \alpha}\, \frac{\partial^2}{\partial t^2}\, f^2\!\left(t - \frac{z}{c_0}\right), \qquad (2)$$

where $f$ is the envelope function of the amplitude modulated primary waves, $p_{0p}$ is the primary wave emission amplitude, $S$ the sound beam cross-section, $\alpha$ the attenuation coefficient of the primary waves, and $\Omega$ the frequency of the self-demodulated wave. this expression allows us not only to compute the level of the audible sound but also to predict the distortion contribution. the self-demodulated wave characteristics depend on many parameters of the primary waves. first, the directivity of the self-demodulated wave increases when the parametric array length and the transducer surface increase. the parametric array length is related to the primary wave frequency (dissipation of the wave and shock formation distance) and to the sound pressure level of the primary waves (shock formation distance). next, the level of the self-demodulated wave depends on the sound pressure level of the primary waves, the transducer surface and the amplitude of the envelope function, which is related to the modulation rate (eq. (2)). another important characteristic of the self-demodulated wave is the distortion rate. this can be reduced by decreasing the modulation rate: if the envelope function is $f = 1 + m\cos(\Omega t)$, then according to relation (2) the demodulated wave is proportional to $m\cos(\Omega t)$ and the distortion is proportional to $m^2\cos(2\Omega t)$. so with a small modulation rate $m$ we have very small distortion but also a small demodulated wave. as we will see in the next section, the distortion may also be reduced by pre-processing the signal.

3 signal pre-processing
signal pre-processing has three aims: amplitude modulation, distortion reduction, and transducer response compensation. in this paper, we consider the transducer to have an ideal response, so we are only interested in amplitude modulation and distortion reduction.
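before turning to the pre-processing itself, the distortion mechanism described in section 2 can be checked numerically; a minimal sketch with illustrative tone and sampling parameters (none of the values are from the paper):

```python
# a quick numerical check of the distortion mechanism: for classical am,
# f = 1 + m*cos(w t), the demodulated sound follows d^2 f^2 / dt^2 and
# contains a 2nd harmonic of relative amplitude ~m; values are illustrative.
import numpy as np

fs = 200_000.0                        # sample rate, Hz (assumed)
t = np.arange(0, 0.1, 1.0 / fs)
w = 2 * np.pi * 1000.0                # 1 kHz audio tone
m = 0.5                               # modulation rate

f_env = 1.0 + m * np.cos(w * t)       # am envelope
p_s = np.gradient(np.gradient(f_env**2, t), t)   # berktay: p_s ~ d^2 f^2/dt^2

spec = np.abs(np.fft.rfft(p_s * np.hanning(t.size)))
freqs = np.fft.rfftfreq(t.size, 1.0 / fs)
a1 = spec[np.argmin(np.abs(freqs - 1000.0))]
a2 = spec[np.argmin(np.abs(freqs - 2000.0))]
print(f"2nd harmonic relative to fundamental: {a2 / a1:.2f} (~ m = {m})")
```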
if classical amplitude modulation is used, the envelope function is given by $f = 1 + m\,s(t)$, where $s(t)$ is the audible signal to be transmitted. the berktay equation (eq. (2)) shows that the demodulated wave is proportional to the second derivative of $f^2$; therefore, to obtain $s(t)$ as the self-demodulated wave in the far field we have to use the modulation function

$$s_1(t) = \left( 1 + \iint s(t)\, \mathrm{d}t^2 \right)^{1/2}. \qquad (3)$$

this processing is the ideal one. however, the square root of a signal has an infinite spectrum, while a transducer has a limited bandwidth. in practical terms, in order to have a self-demodulated wave with a satisfactory distortion rate, the transducer must have a bandwidth that is at least four times larger than the highest frequency of $s(t)$. this constraint can be difficult to fulfil when complex signals, e.g., music, have to be transmitted. another solution is to use single side band amplitude modulation (ssb). in this case the self-demodulated signal has less distortion than when classical amplitude modulation is used without other processing. to decrease the residual distortions, it is possible to simulate the self-demodulation, extract the distortions and correct the signal, as shown in fig. 1. the advantage of this correction is that it does not increase the necessary bandwidth, and it can be used in iterative processing. these two processing methods give good results, but both have constraints: if the transducer has sufficient bandwidth the first method can be used, but if the transducer bandwidth is narrow and the calculation time is not a problem, then the second method is better.

fig. 1: correction of distortions

4 pvdf transducer
the transducer characteristics, e.g., bandwidth and efficiency, are very important in the loudspeaker. with a view to producing a prototype, we have started to study a pvdf film based transducer. these transducers are made up of one pvdf membrane placed on a support with one cavity. the static pressure inside the cavity is decreased in order to obtain a partial vacuum, which will stretch the membrane and give it a spherical shape (fig. 2). the spherical shape allows us to make use of the 3-axis piezoelectricity of the film and so increase the efficiency of the transducer. the transducer response is a coupling between the cavity response and the membrane response. it depends on the cavity dimensions and the vacuum quality inside the cavity. when the cavity radius decreases, the vibrating surface decreases and so the resonance frequencies increase. when the vacuum quality increases, the membrane tension increases, as do the resonance frequencies. in order to describe the transducer response precisely, a model will be made. one problem is the shape of the membrane, because the differential equation that describes its movement cannot be solved analytically. another problem is the coupling between the membrane and the cavity; moreover, the cavity response cannot be described with the lumped constant method, because its dimensions and the wavelength used are approximately the same. the loudspeaker is an array of transducers of this kind.

fig. 2: pvdf transducer
a support with an array of cavities is used, and the pvdf film is placed on the support. the advantage is that, as there is only one film, all the active elements are in phase.

5 acknowledgments
the study described in this paper was supervised by prof. b. castagnède and was performed in collaboration with p. lotton and b. gazengel. the work was supported by isl (institut franco-allemand de recherches de saint-louis).

references
[1] westervelt, p. j.: parametric acoustic array. j. acoust. soc. am., vol. 35 (1963), no. 4, p. 535–537.
[2] berktay, h. o.: possible exploitation of nonlinear acoustics in underwater transmitting applications. j. sound vib., vol. 2 (1965), no. 4, p. 435–461.
[3] yoneyama, m., fujimoto, j.: the audio spotlight: an application of nonlinear interaction of sound waves to a new type of loudspeaker design. j. acoust. soc. am., vol. 73 (1983), no. 5, p. 1532–1536.
[4] hamilton, m. f., blackstock, d. t.: nonlinear acoustics. academic press, 1998.

alexandre ritty
e-mail: aritty@gmail.com
department of acoustics
czech technical university
faculty of electrical engineering
technická 2
166 27 praha, czech republic

root asymptotics of spectral polynomials
b. shapiro, m. tater
we have been studying the asymptotic energy distribution of the algebraic part of the spectrum of the one-dimensional sextic anharmonic oscillator. we review some (both old and recent) results on the multiparameter spectral problem and show that our problem ranks among the degenerate cases of the heine-stieltjes spectral problem, and we derive the density of the corresponding probability measure.
keywords: lamé operator, van vleck polynomials, asymptotic root-counting measure.

1 qes sextic ahos
our model is a 1d sextic anharmonic oscillator, which is frequently used to approximate various situations in quantum mechanics. due to the quasi-exact solvability a part of the spectrum is easily tractable, and the two parameters of the potential leave enough space for the choice of application. to be more precise, we have been studying the solutions of the schrödinger equation $h\psi = e\psi$, where $\psi \in L^2(\mathbb{R})$ and

$$h = -\frac{\mathrm{d}^2}{\mathrm{d}x^2} + x^6 + b x^4 + \left( \frac{b^2}{4} - (4j + 3) \right) x^2,$$

where $j$ is a non-negative integer. during the last decade there has been considerable interest in non-selfadjoint problems, e.g. [1], [2]. we recognize that $b \in \mathbb{C}$ is a challenge, but it goes beyond the scope of this paper, and we confine ourselves to $b \in \mathbb{R}$. adopting the idea of bender and dunne [3], we seek the solution $\psi$ of the schrödinger equation $h\psi = e\psi$ in the form

$$\psi(x) = \mathrm{e}^{-\frac{x^4}{4} - \frac{b x^2}{4}} \sum_{n=0}^{\infty} \frac{p_n(e)}{n!\, \left(\frac{1}{2}\right)_n} \left( -\frac{x^2}{4} \right)^n,$$

where $(\tfrac{1}{2})_n$ is the pochhammer symbol. we are interested in even solutions only; the odd ones can be treated by analogy. for $\psi$ to be a solution, the coefficients $p_n$ must fulfil the recurrence relation

$$p_n(e) = \left( e - \frac{b\,(4n - 3)}{2} \right) p_{n-1}(e) + 16\,(n - 1)\left( n - \tfrac{3}{2} \right)(n - j - 2)\, p_{n-2}(e)$$

with the initial conditions $p_{-1}(e) = 0$ and $p_0(e) = 1$. we see that the wave function $\psi$ serves as a generating function for the polynomials $p_n(e)$. choosing a particular value of $j$ we get the particular solution of our problem, but for any non-negative value of $j$ the infinite system of $p_n$ forms an orthogonal system of monic polynomials with respect to a probability measure (favard’s theorem [4], [5]). the peculiarity is that the polynomials of degree $j + 1$ or higher contain a common factor $p_{j+1}$, i.e. they can be factorized. the roots of the polynomial $p_{j+1}$ are the eigenvalues of the schrödinger equation. these polynomials have other interesting properties [3], but we shall not use them. the algebraic part of the spectrum, i.e. the quasi-exact energy eigenvalues, are the solutions of $p_{j+1}(e) = 0$, and the corresponding eigenfunctions are the products $\psi_n(x) = q(x^2)\, \exp\!\left( -\tfrac{x^4}{4} - \tfrac{b x^2}{4} \right)$ with polynomial $q$.
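the recurrence is straightforward to implement; a minimal sketch, assuming the coefficients exactly as written above (they should be checked against the original derivation), which returns the quasi-exact even levels as the roots of p_{j+1}:

```python
# a minimal sketch of the bender-dunne-type recurrence, using the
# coefficients as reconstructed above; the roots of p_{j+1} give the
# even quasi-exact eigenvalues.
import numpy as np
from numpy.polynomial import polynomial as P

def spectral_polynomials(j, b, n_max):
    """p_n as ascending coefficient arrays in e; p_0 = 1, p_1 = e - b/2."""
    p = [np.array([1.0]), np.array([-b / 2.0, 1.0])]
    for n in range(2, n_max + 1):
        lin = np.array([-b * (4 * n - 3) / 2.0, 1.0])   # (e - b(4n-3)/2)
        c = 16.0 * (n - 1) * (n - 1.5) * (n - j - 2)
        p.append(P.polyadd(P.polymul(lin, p[n - 1]), c * p[n - 2]))
    return p

j, b = 2, 1.0
p = spectral_polynomials(j, b, j + 1)
eigs = np.sort(np.roots(p[j + 1][::-1]).real)   # roots of p_{j+1}
print("quasi-exact even levels:", eigs)
```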
we change the variable $x^2 = t$ and reduce the schrödinger equation to

$$4t\,\frac{\mathrm{d}^2 f}{\mathrm{d}t^2} + \left( 2 - 2bt - 4t^2 \right) \frac{\mathrm{d}f}{\mathrm{d}t} + \left( 4jt + e - \frac{b}{2} \right) f = 0. \qquad (1)$$

we are thus faced with a linear differential operator and its polynomial solutions. this is a problem that has been studied by many prominent mathematicians since the first half of the 19th century, and many nice results have been achieved.

2 heine-stieltjes spectral problem
generally, a linear differential operator

$$\ell(z) = \sum_{i=1}^{k} q_i(z)\, \frac{\mathrm{d}^i}{\mathrm{d}z^i}$$

with polynomial coefficients is called a (higher) lamé operator (e.g. [6]). an important number, $r = \max_{i=1,\ldots,k} \big( \deg q_i(z) - i \big)$, is called the fuchs index of $\ell(z)$. the case $r = 0$ is called exactly solvable, and it has been well studied. these operators and their polynomial eigensolutions have many interesting properties. the operator $\ell(z)$ is called non-degenerate if $\deg q_k(z) = k + r$. in the multiparameter spectral problem

$$\ell(z)\, s(z) + v(z)\, s(z) = 0,$$

a polynomial $v(z)$ of degree at most $r$ is sought so that the above equation has a polynomial solution $s(z)$ of degree $n$. this is called the (higher) heine-stieltjes spectral problem, $v(z)$ a (higher) van vleck polynomial, and $s(z)$ a (higher) stieltjes polynomial.

2.1 non-degenerate cases
two important results are worth mentioning at this point. first, a generalization of the result by heine (cf. [6]).

theorem. for any non-degenerate higher lamé operator $\ell(z)$ with algebraically independent coefficients of the $q_i(z)$, $i = 1, \ldots, k$, and for any $n \geq 1$, there exist exactly $\binom{n+r}{n}$ van vleck polynomials $v(z)$ and corresponding degree $n$ stieltjes polynomials $s(z)$.

now we mention also a physically important case,

$$\prod_{i=1}^{l} (z - \alpha_i)\, \frac{\mathrm{d}^2 s}{\mathrm{d}z^2} + \sum_{j=1}^{l} \beta_j \prod_{i \neq j} (z - \alpha_i)\, \frac{\mathrm{d}s}{\mathrm{d}z} + v\,s = 0 \qquad (2)$$

with $\alpha_1 < \alpha_2 < \cdots < \alpha_l$ real and $\beta_1, \ldots, \beta_l$ positive. then the following theorem holds.

theorem (stieltjes, van vleck, bôcher, [6]). for any integer $n \geq 1$:
1) there exist exactly $\binom{n+l-2}{n}$ polynomials $v$ of degree $l - 2$ such that equation (2) has a polynomial solution $s$ of degree exactly $n$.
2) each root of each $v$ and $s$ is real, simple, and belongs to the interval $(\alpha_1, \alpha_l)$.
3) none of the roots of $s$ can coincide with any of the $\alpha_i$.
moreover, the $\binom{n+l-2}{n}$ polynomials are in one-to-one correspondence with the $\binom{n+l-2}{n}$ possible ways to distribute $n$ points into the $l - 1$ open intervals $(\alpha_1, \alpha_2), (\alpha_2, \alpha_3), \ldots, (\alpha_{l-1}, \alpha_l)$. recently, borcea and shapiro [7] found the density of the asymptotic root distribution for the stieltjes polynomials.

2.2 the degenerate case
inspecting the structure of (1), we see that it is a degenerate case of an $r = 1$ operator. the roots of the van vleck polynomials are not confined to a finite interval as $n \to \infty$.
in order to find their limiting distribution we have to scale them. to this end we must know their rate of growth. thus, we are interested in the roots of maximal modulus and their dependence on the parameter $b$. to find this dependence we follow the main idea of the gräffe-lobachevskii method. let

$$p_n(e) = e^n + \sum_{l=0}^{n-1} p_l\, e^l$$

and we find expressions for the sums of powers of the roots, $s_k = \sum_{i=1}^{n} e_i^k$. then for the root of maximal modulus, $e_{\max}$, we have $\lim_{k\to\infty} |s_k|^{1/k} = |e_{\max}|$. because the $s_k$ are related to the $p_l$ by newton’s identities,

$$s_k = -\sum_{l=n-k+1}^{n-1} p_l\, s_{k+l-n} - k\, p_{n-k},$$

we need explicit expressions for the $p_l$ up to $l = n - k$. these formulae can be derived from the recurrence relation. we arrive at

$$s_k = c(k)\, b^{\mu}\, n^{\nu}\, (1 + o(1)),$$

where the exponents $\mu$ and $\nu$ depend on the parity of $k$: for $k = 2l$ we have $\mu = 0$ and $\nu = 3l$, $l = 0, 1, 2, \ldots$, while for odd $k$ the power of $b$ stays bounded and $\nu$ still satisfies $\nu/k \to 3/2$. thus

$$\lim_{k\to\infty} s_k^{1/k} = \lim_{k\to\infty} \left( c(k)\, b^{\mu}\, n^{\nu} \right)^{1/k} = \lim_{k\to\infty} c(k)^{1/k} \cdot n^{3/2},$$

i.e. the limit is independent of $b$. our conjecture is that $\lim_{k\to\infty} c(k)^{1/k}$ exists and is finite; we denote it by $c$. the polynomials $sp_n$, which have the same roots as $p_n$ but scaled by $c_n = c\, n^{3/2}$, do not fulfil any finite recurrence relation. however, we can rewrite them as determinants of tridiagonal matrices, $sp_n = \det(M_n)$, where $M_n$ is the $n \times n$ tridiagonal matrix with diagonal entries $e - \lambda_{n,i}$ and off-diagonal entries $\nu_{n,i}$ and $\mu_{n,i}$, all explicit rational expressions in $i$, $n$, $b$ and $c$ obtained from the recurrence relation after the scaling $e \mapsto c\, n^{3/2}\, e$. this enables us to extend the polynomial sequence by introducing a subsequence $sp_{n,i} = \det(M_{n,i})$, where $M_{n,i}$ is the upper $i \times i$ principal submatrix of $M_n$, and this enables us to use the result of kuijlaars and van assche concerning three-term recurrence relations with variable coefficients [8]. this relation reads

$$sp_{n,i}(e) = (e - \lambda_{n,i})\, sp_{n,i-1}(e) - \mu_{n,i}\, sp_{n,i-2}(e)$$

with the initial conditions $sp_{n,0} = 1$ and $sp_{n,-1} = 0$. now it remains to pass to the polynomials $tp_{n,i}$ that have the same roots as $sp_{n,i}$ and fulfil the symmetrized relation

$$e\, tp_{n,i} = a_{n,i}\, tp_{n,i+1} + \lambda_{n,i}\, tp_{n,i} + a_{n,i-1}\, tp_{n,i-1},$$

with

$$a_{n,i} = \frac{4}{c\, n^{3/2}} \sqrt{i \left( i - \tfrac{1}{2} \right) (n - i - 1)}.$$

the limits (taken along $i/n \to s$)

$$\alpha(s) = \lim_{n\to\infty} \lambda_{n,i} = 0, \qquad a(s) = \lim_{n\to\infty} a_{n,i} = \frac{4}{c}\, s \sqrt{1 - s}$$

determine the density of the asymptotic root-counting measure:

$$\nu = \int_0^1 \omega_{[-2a(s),\, 2a(s)]}\, \mathrm{d}s,$$

where

$$\omega_{[x,y]}(t) = \begin{cases} \dfrac{1}{\pi \sqrt{(y - t)(t - x)}}, & t \in (x, y), \\[1ex] 0 & \text{otherwise}. \end{cases}$$

the sought density $\rho$ is then expressed as

$$\rho(s) = \frac{c}{\pi} \int_0^1 \frac{\mathrm{d}\xi}{\sqrt{64\,\xi^2 (1 - \xi) - c^2 s^2}}.$$

we immediately see that the integral does not converge for $s = 0$. the polynomial under the square root has three real roots $\xi_1(s) \leq \xi_2(s) \leq \xi_3(s)$ if $s \in \left[ -\frac{16}{3\sqrt{3}\,c},\, \frac{16}{3\sqrt{3}\,c} \right]$, and $\xi_3(s) \leq 1$ for all $s \in \mathbb{R}$. thus $\rho$ can be written as

$$\rho(s) = \frac{c}{\pi} \int_{\xi_2}^{\xi_3} \frac{\mathrm{d}\xi}{\sqrt{64\,(\xi - \xi_1)(\xi - \xi_2)(\xi_3 - \xi)}},$$

which can be evaluated as [9]

$$\rho(s) = \frac{c}{8}\, \frac{1}{\sqrt{\xi_3 - \xi_1}}\; {}_2F_1\!\left( \tfrac{1}{2},\, \tfrac{1}{2};\, 1;\, \frac{\xi_3 - \xi_2}{\xi_3 - \xi_1} \right), \qquad s \in \left[ -\frac{16}{3\sqrt{3}\,c},\, \frac{16}{3\sqrt{3}\,c} \right].$$
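a numerical sketch of this final expression, feeding the sorted roots of $64\xi^2(1-\xi) - c^2 s^2$ into the hypergeometric formula; the value of c is an assumed normalisation, and the integral check should come out close to 1 if the reconstruction is right:

```python
# a minimal sketch of evaluating the density rho(s) derived above;
# c is an assumed normalisation constant, not a value from the paper.
import numpy as np
from scipy.special import hyp2f1

def rho(s, c):
    """density of the scaled eigenvalues, for |s| <= 16/(3*sqrt(3)*c)."""
    if abs(s) > 16.0 / (3.0 * np.sqrt(3.0) * c):
        return 0.0
    xi = np.sort(np.roots([-64.0, 64.0, 0.0, -(c * s) ** 2]).real)
    x1, x2, x3 = xi                       # xi_1 <= xi_2 <= xi_3
    m = (x3 - x2) / (x3 - x1)
    return (c / 8.0) * hyp2f1(0.5, 0.5, 1.0, m) / np.sqrt(x3 - x1)

c = np.pi                                 # assumed value
edge = 16.0 / (3.0 * np.sqrt(3.0) * c)
s_grid = np.linspace(-edge, edge, 400)[1:-1]   # avoid the degenerate endpoints
vals = [rho(s, c) for s in s_grid]
print(f"integral of rho ~ {np.trapz(vals, s_grid):.3f}")   # should be ~1
```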
gordon and breach, new york, 1978. [5] marcellán, f., álvarez-nodarse, r.: on the “favard’s theorem“ and its extensions. j. comp. appl. math., vol. 127 (2001), p. 231–254. [6] borcea, j., brändén, p., shapiro, b.: higher lamé equations, heine-stieltjes and van vleck polynomials, in preparation. 34 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 47 no. 2–3/2007 fig. 1: a) dependence of the three real roots of 64 1 02 2 2� �( )� � �c s on s (for c � �). if s c c � � �� � �� 16 3 3 16 3 3 , , there is only one real root � 3 1( )s ! ; b) comparison of density � with a numerical example for n � 100 and b � 2 66. . © czech technical university publishing house http://ctn.cvut.cz/ap/ 35 acta polytechnica vol. 47 no. 2–3/2007 [7] borcea, j., shapiro, b.: root asymptotics of spectral polynomials for the lamé polynomials, math. ca/0701883. [8] kuijlaars, a. b. j., van assche, w.: the asymptotic zero distribution of orthogonal polynomials with varying recurrence coefficients, j. approx. theory, vol. 99 (1999), p. 167–197. [9] prudnikov, a. p., brychkov, yu. a., marichev, o. i.: integrals and series. elementary functions. nauka, moscow, 1981. prof. boris shapiro e-mail: shapiro@mat.su.se department of mathematics university of stockholm s-10691 stockholm, sweden rndr. miloš tater, csc. e-mail: tater@ujf.cas.cz nuclear physics institute, academy of sciences cz-25068 řež near prague ap06_3.vp 1 introduction in case of a hypothetical core disruptive accident (cda) in a liquid metal fast breeder reactor, the contact between fuel and liquid sodium creates a high-pressure gas bubble in the core. the violent expansion of this bubble loads and deforms the vessel and the internal structures. during the 1970s and 1980s, large programmes of investigations devoted to the understanding of cda were undertaken. they included both experimental tests in small-scale mock-ups, and simulations with specialized computer codes to extrapolate the experimental results to the reactor size and to check the lmfbr integrity. based on a 1/30-scale model of the superphenix reactor, the mara programme [1, 2] was performed at cea-cadarache. it involved ten tests of gradual complexity due to the addition of internal deformable structures: � mara 1 and 2 considered a vessel partially filled with water and closed by a rigid roof [3]. � mara 4 represented, in addition, the main core support structures [4]. � mara 8 and 9 were closed by a flexible vessel and a flexible roof [5]. � mara 10 included the core support structures and a simplified representation of the structures above the core [6]. to end this series of tests, the mars test [7] was a 1/20-scale mock-up including all the significant internal components. at the end of the 1980s, a specific cda sodium-bubble-argon tri-component constitutive law [8] was developed in the general ale fast dynamics finite element castem-plexus code. this first version of the cda constitutive law was validated [9] with the cont benchmark [10]. in order to demonstrate the castem-plexus capability to predict the behaviour of real reactors [11, 12], axisymmetrical computations of the mara series were confronted with the experimental results. the computations performed at the beginning of the 1990s showed a rather good agreement between the experimental and computed results for the mara 8 and mara 10 tests, even if there were some discrepancies [13]. on the contrary, the prediction of the mars structure displacements and strains was overestimated [14]. 
in 1999, the castem-plexus code was merged with the plexis-3c code [15] (a former joint product by cea and jrc) to extend the capacities of both codes. the new-born europlexus code benefitted from a new method to deal with the fluid-structure coupling. new simulations were undertaken with the new coupling, and compared with the experimental results for: � the mara 8 mock-up [16, 17]. � the mara 10 mock-up [18, 19]. � the mars mock-up with a finer mesh, and without the representation of the non-axisymmetrical structures [20]. even if the results of the mars test were better, the deformation of some structures was still overestimated, since the finer mesh introduced an additional flexibility. the conservatism was supposed to come from the very simple representation of the mars non-axisymmetrical structures (core elements, pumps and heat exchangers) by a pressure loss. these structures, acting as porous barriers, should have a protective effect on the containment by absorbing energy and slowing down the fluid impacting the containment. therefore, a new cda constitutive law taking into account the presence of the internal structures (without meshing them) by means of an equivalent porosity method [21] was developed. another simulation with the porous model was carried out to estimate the influence of the latter structures [22]. this paper presents a brief description of the porosity model, the mars mock-up and the analysis of the results computed by the europlexus code with the porous model. the main central structures are described with a classical shell model, while the heat exchangers and pumps are described with the porous model. © czech technical university publishing house http://ctn.cvut.cz/ap/ 9 czech technical university in prague acta polytechnica vol. 46 no. 3/2006 a porosity method to describe complex 3d-structures theory and application to an explosion m.-f. robbe a theoretical method was developed to be able to describe the influence of structures of complex shape on a transient fluid flow without meshing the structures. structures are considered as solid pores inside the fluid and act as an obstacle for the flow. the method was specifically adapted to fast transient cases. the porosity method was applied to the simulation of a hypothetical core disruptive accident in a small-scale replica of a liquid metal fast breeder reactor. a 2d-axisymmetrical simulation of the mars test was performed with the europlexus code. whereas the central internal structures of the mock-up could be described with a classical shell model, the influence of the 3d peripheral structures was taken into account with the porosity method. keywords: porosity method, homogenization, nuclear reactor, explosion, fluid-structure coupling. 2 porosity model the purpose is to define a method able to represent the influence of structures located inside a fluid pool, during a fast transient fluid flow, without meshing them. the idea is based on the substitution of an equivalent “porous” fluid for the structures and the surrounding fluid. structures are assimilated to a set of solid marbles regularly distributed into the fluid. globally, the method consists in averaging the behaviour of the fluid surrounding structures on the volume filled by fluid and structures. the method is divided into three steps: � the fluid conservation laws are space-averaged on the control volume to consider the partial occupation of the control volume by fluid. the fluid equations are written with fluid, solid and control volume terms. 
as we consider that a solid is rigid, it is pointless studying the solid conservation laws. � the fluid equations are modified, by introducing a porosity coefficient, in order to replace the control volume terms by fluid ones. except a fluid-solid force, the fluid conservation laws just depend on fluid variables. � an equivalent “porous” fluid, with its own properties, is finally defined on the control volume. the conservation laws of this medium are matched up with the fluid equations previously defined. the main hypotheses are: � the fluid is considered homogeneous, even if it is composed of a mixture of several fluids. � thermal transfers and phase changes are considered as negligible, as thermal exchanges are much slower than acoustic transients. � structures are assumed to be rigid and no fluid-structure coupling is considered. the model only simulates the structure presence as an obstacle for the fluid flow. neither the recoil and deformation of structures is described, nor the energy absorbed during their deformation. � no mass source and no heat source are included in the theoretical model. however, both terms are easy to add in the conservation equations. an eulerian approach was used in order to eliminate the temporal variations of the control volume. this simplification allows to have fixed boundaries for all the spatial integrals concerning the control volume. an extension of the present method to the ale description (arbitrary lagrange euler) should be possible by taking into consideration variations of the spatial boundaries versus time. the local conservation laws of a fluid are: mass � � � � t div vf f f� �( ) � 0 momentum � � � � � � t v d i v v v d i v gf f f f f f f( ) ( ) � � � � � � � � � � � � 0 total energy � � � � t u v v div u v vf f f f f f f f� � � � � � �� � �� � � �� � 1 2 1 2 � � � � � � �� � �� � � � � � � � � � v div v g v f f f f f( ) ,� � 0 where the subscript f is relative to fluid variables, t indicates time, � the density, v the velocity, � stresses, g the gravity, and u the internal energy. 2.1 average on the control volume let us consider a fixed control volume or total volume �t cut by an interface as (fig. 1). this interface divides the control volume into a fluid subvolume �f and a solid subvolume �s. let us note at the surface bounding jointly the fluid volume and the control volume. let � ( , , , )x y z t be a function (scalar, vector or tensor) defined on the control volume �t. � ( , , , )x y z t is the average value of � on the control volume �t. � � � � � � � 1 t f d . the fluid conservation laws averaged on the control volume become: mass � � � � t div vf f f( ) ( )� � � 0 momentum � � � � � � � t v d i v v v d i v g n a f f f f f f f t f f ( ) ( ) � � � � � � � � � � � � � 1 � d as � � � 0 total energy � � � � t u v v div u v vf f f f f f f f� � � � � � � � � � � � � � � � 1 2 1 2 � � � � � � � � � � � � �� � � � � � � � � � � � � � � v div v g v n v f f f f f t f f� � � 1 � � �f a a s d� � 0. 2.2 average on the fluid volume let us consider that � ( , , , )x y z t is the average value of � on the fluid volume �f. � � � � � � � 1 f f d . let � be the porosity, defined as the fluid presence fraction inside the control volume: � � � �f t . the average values � and � are linked by: � � � �. the mass equation can be transformed easily into: � � � � � � t div vf f f� �( ) � 0. 10 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 46 no. 3/2006 czech technical university in prague fig. 
1: control volume the same method applied to the momentum equation introduces new terms: � � � � � � � � t v d i v v v d i v v vf f f f f f f f( ) ( ) � � � � � � � � � � � � �� � � � � � � � � � ��d i v g n af f t f f as � � � � � � � � � 1 0 � d the terms with ' correspond to fluctuating terms. if we consider a function , defined on the fluid volume �f, it can be separated into a fluid average term and a fluctuating term � (white noise): � � �. the stress tensor � f can be split into a pressure tensor including the diagonal terms and a viscous-stress tensor containing the non-diagonal terms: � f f f f t fp i div v i grad v grad v� � � � � � pressure ��� � � �2 3 ( ) � � � viscous stresses � �������� �������� where p is pressure, i the unit tensor, and the dynamic viscosity. the term � � � ��f f fv v � � is analogous to reynolds stresses (turbulent stresses) and can also be separated into a pressure tensor and a viscous-stress tensor: � � � � � � � � � � �f f f f tv v k i � � � �� �� re 2 3 2 similar to a pressure 3 t f t f t fdiv v i grad v grad v( ) � � � � � � � � � similar to viscous stresses � ��������� ��������� , where kt is the turbulent kinetic energy and t the turbulent viscosity. after having eliminated the negligible terms, the global stress tensor becomes: � � f f t f t f t fp i div v i grad v grad v� � � � � � � � � �re ( ) 2 3 � � � . the integral on the solid surface can be written versus basic variables: � � � � � � �� � � � � � � 1 1 1 � � � t f f a f t f a t f f n a p n a n p i s s � � � � d d �� � � � � � � � � � � f a f s a p grad f s d � � , where � � fs is the solid-fluid interaction force and contains all the stress terms different from the one in average pressure. according to [23] and supposing an approximate symmetry of structures, the tri-dimensional interaction force can be written with the following expression: � � � � � �� �� � � � � � � � f a i v vs s t t f f f 1 2 � , where � � contains the coefficients of pressure loss. thus we can deduce the final expression of the momentum conservation law: � � � � � � � � t v di v v v grad p di v d f f f f f f t ( ) ( ) ( � � � � � � � � � � � 2 3 iv v i grad v grad v g a f t f t f f � � � � ) � �� � � � � � � � � � � � � � � 1 2 s t t f f fi v v � � � � � � �� � 0. finally, the total energy equation can be rewritten: � � � � � � t u v v div u v vf f f f f f f f� � � � � � �� � �� � � �� 1 2 1 2 � � � � � � � � � �� � �� � � � � � � � � � v div p v div div v i grad v f f f f f t � 2 3 ( ) grad v v v g f t f f f � � � � � � � � � � � � � � � � � � � � 0. 2.3 definition of a “porous” fluid equivalent to the averaged medium the initial problem was formulated with conservation laws defined on the control volume, subdivided into a fluid zone and a solid zone, and using variables of both components. then the conservation laws were averaged on the fluid; this process allowed to have only fluid equations defined on the fluid subvolume and with fluid variables. to return to the ini© czech technical university publishing house http://ctn.cvut.cz/ap/ 11 czech technical university in prague acta polytechnica vol. 46 no. 3/2006 fig. 2: steps of the “porosity” method tial control volume, it is necessary to consider an equivalent fluid defined on the control volume and whose properties have to be determined (fig. 2). the equivalent “porous” fluid can be considered as a single substance filling the entire control volume. 
it is governed by the classical conservation laws with an additional force term in the momentum equation: mass � � � � t div veq eq eq� �( ) , � 0 momentum � � � � t v di v v v grad p di v eq eq eq eq eq eq t ( ) ( ) � � � � � � � � � � � 2 3 eq eq div v i grad v grad veq t eq t eq e ( ) � � � � � � � � � � � � � � � � q eqg f � � � � � 0, total energy � � � � t u v v div u veq eq eq eq eq eq eq� � � � � � �� � �� � � 1 2 1 2 � � � �� � � � �� � �� � � � � � � � v v div p v div div v eq eq eq eq eq ( ) ( ) 2 3 i grad v grad v v g v eq t eq t eq eq eq � � � � � � � � � � � � � � � � � � � � eq � 0. comparing term by term these equations with the fluid conservation laws averaged on the fluid volume, we obtain the value of each equivalent variable: � � �eq f� � � v veq f� p peq f� � u ueq f� � t teq � � � � � � f p grad a i v veq f s t t f f f� � � �� � � � � � . compared with classical fluid conservation laws, the solid presence introduces three new parameters: a porosity � describing the volumic filling rate of structures into the fluid medium, a pressure loss coefficient � � induced by the fluid-solid friction and a coefficient as t� describing the global structure shape. in addition to the fact that the new equations (mass, momentum and total energy) are written with equivalent variables, the momentum equation contains two new forces: a fluid-solid interaction force � � fs and a force p gradf � � at the interface between two equivalent media presenting different porosities. 3 description of the mars mock-up the primary circuit of the superphenix reactor (fig. 3) is enclosed in the main reactor vessel [24]. this vessel is welded to the roof slab, and encased in a safety vessel. both vessels are made of stainless steel. the mars experiment (fig. 4) (1 m in diameter and 1 m high) includes all the significant internal components of the reactor [7]. the main vessel is an assembly of a cylindrical part made of 316l stainless steel, and a torospherical bottom made of 304l stainless steel. its thickness varies between 0.8 and 1.6 mm. the unmelted part of the core is simulated by aluminum cylinders and steel hexagons fixed into two ag3 aluminum plates. the neutron shielding is represented by four radial shells and the associated supporting structures. the mock-up also includes the strongback, the diagrid support, two inner vessels, the anti-convecting device, the core catcher and the main vessel cooling system. 12 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 46 no. 3/2006 czech technical university in prague fig. 3: the superphenix reactor – roof – large rotating plug – small rotating plug – core cover plug – main vessel – internal vessels – core support structures – diagrid support – core and neutron shieldings – pumps and heat exchangers – core catcher – gas bubble in the middle of the core fig. 4: the mars mock-up the roof slab is constituted by two circular plates of different thicknesses. openings are drilled to enable the passage of the large components. the two rotating plugs are concentrically off as in the reactor. the core cover plug was simplified compared to the reactor one: however, it includes a top plate, a heat-insulation, spacer plates, an in-pile shell and pipes. the main components inserted through the roof slab (4 primary pumps, 8 intermediate heat exchangers, 4 emergency cooling exchangers, 2 integrated purification devices), as well as the supporting and joining rings are present. 
rubber-ring bands simulate the heat-insulating material between the roof and the main vessel. other components above the top closure are represented by their inertia using lead plates. the thin structures are mainly made of 304l stainless steel in order to simulate the austenistic steel of the reactor structures. the massive structures and those made of heterogeneous materials (roof slab, rotating plugs, core support structure, diagrid support) are made of a5 aluminum. the top plate of the core cover plug is made of a42 aluminum. the sodium coolant at operating conditions is simulated by water at 20 °c. the cover gas of the mock-up is the same as in the reactor (argon). the test was fired using an 80 g low-density low-pressure explosive charge of l54/16 composition [25]. the charge mass was chosen to simulate the 800 mj full-scale mechanical energy release used in the reference cda in the superphenix reactor. the explosive charge was supported by the base of the core cover plug. the test was well instrumented with: pressure transducers, accelerometers, strain gauges, high-speed cameras, and a grid drawn on the different structures. 4 numerical modelling of the mock-up 4.1 the europlexus code europlexus [26, 27, 28] is a general finite element code, co-developed by cea-saclay and jrc-ispra, and devoted to the analysis of fast transient phenomena. it results from the merging of the castem-plexus and plexis-3c codes, and can perform 1d, 2d or 3d fluid-structure calculations as required. a commercial version of the code for industrial use is available through the samtech software house. the main fields dealt with are impacts [29], explosions [30, 31, 32], pipe circuits [33, 34, 35], blowdowns [36, 37, 38, 39], hydrodynamics [40] and articulated systems [41]. europlexus is mainly based on the finite element method, but it also contains finite volumes and spectral elements. time integration is explicit and realised with a newmark algorithm. the formulation can be lagrangian, eulerian or arbitrary lagrange euler (ale). the code can take into account various non-linearities related to materials or geometry. 4.2 geometry owing to the symmetry of the mars mock-up, a 2d-axisymmetrical representation was chosen. the external structures are modeled by shells or massive elements. the main internal structures are represented with a classical shell model. the peripheral components are described with the porosity model homogenizing the components with the surrounding fluid. the main vessel is modelled with a thin shell, and two materials for the cylindrical and torospherical parts. the vessel is supposed rigidly fixed to the roof. the mock-up is hung to the rigid frame by a cylindrical shell. the top closure is assimilated to an axisymmetrical structure composed of massive elements. the openings for the passage of the components are simply accounted for by the mass they remove to the roof. local masses are added above the top closure to consider the mass of the instrumentation, the lead plates, and the peripheral components. the core cover plug was simplified: the heat and neutronic insulation is simulated by three plates, and the pipes are assimilated to two cylinders. the rings joining the roof slab and the three plugs are represented with thin aluminum shells at the top, and rubber shells at the base of the massive structures. the heat-insulating material between the roof and the main vessel is represented by a rubber-ring band. 
in the centre of the mock-up, a single rigid structure (called core support structures) describes the strongback, the neutron shielding support, the support of the baffles and the internal vessel as they are fitted together. the css is assimilated to an axisymmetrical rigid shell of constant thickness, and whose mass is the total mass of all the structures. the css is attached to the vessel by a cylindrical collar. the diagrid support is described by a thin shell, connected to the css by a swivel link. the core is schematized by an added mass distributed along the diagrid. the neutron shielding is modelled by its central shell since it governs the fluid port between the core cover plug and the shielding, and a local mass added at the base of the first shell to take into account the three other shells. the baffles surrounding the neutron shielding are assimilated to a vertical axisymmetrical shell. only the central shell of the internal vessel is meshed; the other is described by a local mass. the shielding, the baffle, and the internal vessel are embedded in the css. the core catcher is represented by a mass distributed along the base of the main vessel. 4.3 materials the behaviour of the structures is generally described with isotropic elasto-plastic constitutive laws. however elastic laws are used for the rubber elements of weak resistance joining the roof and the plugs. the cylinders schematizing the pipes of the core cover plug are described by an elastic law approximated by the homogenization of the non-axisymmetrical structures. the behaviour of the core support structure is supposed to be linear elastic. the 3d geometry of the peripheral components (pumps, heat exchangers, purification devices) cannot be meshed correctly with an axisymmetrical model. so their presence is accounted for thanks to the porosity model: the fluid elements comprised between the extreme radii of the peripheral components are homogenized with the structures. the peripheral structures are described by a volume filling rate of 47 %. even if the model allows to take into account a pressure loss by friction, this possibility is not used in this © czech technical university publishing house http://ctn.cvut.cz/ap/ 13 czech technical university in prague acta polytechnica vol. 46 no. 3/2006 simulation. figs. 5 and 6 present the global mesh and that of the structures. the porous zone corresponds to the trapezoidal area between the internal vessel and the roof. the cda constitutive law is used to describe all the fluid filling the mock-up. this law is devoted to tri-component fluid mixtures, for which one of the components can be diphasic. in the mock-up, the fluids intervening are water, argon and the explosive charge. argon is supposed to be a perfect gas with an adiabatic behaviour. p p p p a n a n a n a n a ( ) ( ) ( ) ( ) � � � � � � � � 1 1 � , where the initial density is �a ( ) .0 1 658� kg/m3, and the heat capacity ratio is � a p vc c� � 1 67. . argon is initially at atmospheric pressure pa ( )0 510� pa. the explosive charge (bubble) is described by a perfect gas with a polytropic law. p p p pb n b n b n b n b ( ) ( ) ( ) ( ) � � � � � � � � 1 1 � , where the initial density is �b ( )0 400� kg/m3, the initial pressure is �b ( ) .0 82 88 10� pa, and the polytropic coefficient �b � 1 322. . water is described by a perfect and isothermal fluid. p p p cw n g n w n w w n w n( ) ( ) ( ) ( ) ( )( )� � �� � � �1 1 2 1� � with an initial density �w ( ) .0 998 3� kg/m3, a sound velocity cw � 1550 m/s. 
water is initially at atmospheric pressure pw ( )0 510� pa. the pressure of the gas mixture pg is the sum of the partial pressures of the gases: p p p pg n a n b n v n( ) ( ) ( ) ( )� � � �� � �1 1 1 1 . water is supposed to be at saturation conditions. however if p pw n sat ( )� �1 , the presence fraction of the vapour is negligible. if the pressure decreases, water can reach saturation pressure, and the vaporisation is supposed to be instantaneous. the vapour is an isothermal perfect gas whose pressure is constant and only depends on the initial temperature t (0): p p tv n( ) ( )( )� �1 0saturation . 4.4 fluid-structure coupling structures are represented with a lagrangian description. the bubble zone is kept fixed and is eulerian. water and argon are described with an arbitrary-lagrange-euler modelling, apart from the homogenized zone. the porosity model implemented in europlexus is available only for the eulerian description, so that the homogenized zone must be eulerian. two element layers next to the internal vessel are left out of the homogenized zone to operate the fluid-structure coupling. the fluid-structure coupling implemented in europlexus works without coupling elements; the code automatically writes the relations between the fluid and solid nodes facing each other. in the core cover plug, no fluid-structure coupling is defined: � between the internal cylinders and the fluid as the fluid can cross the pipes simulated by vertical cylinders. � at the top of the external cylinder surrounding the plug because this shell is perforated in the mock-up. in order to preserve a regular mesh inside the core cover plug, the fluid nodes along the three cylinders have to stay aligned between the structure nodes at the connections with the plates. 5 analysis of the results of the simulation in the following part of the paper, we will call: � central zone: the area limited by the in-pile shell of the ccp, the neutron shielding and the diagrid. � free opening: the free space between the top of the neutron shielding and the edge of the in-pile shell of the ccp. � bottom zone: the area limited by the diagrid, the css and the bottom of the main vessel. � intermediate zone: the area between the neutron shielding and the lower part of the internal vessel. � lateral zone: the area on the right side between the internal vessel, the css and the torospherical part of the main vessel. � channel: the space between the upper part of the internal vessel and the cylindrical part of the main vessel. � upper zone: the area below the top closure. 14 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 46 no. 3/2006 czech technical university in prague fig. 5: mesh of the mars test fig. 6: mesh of structures 5.1 pressure fig. 7 presents the pressure versus time. initially, water and argon are at atmospheric pressure whereas the explosive charge simulating the gas bubble is at 288 mpa. from 0.02 ms, a pressure wave issued from the bubble zone expands spherically. the pressure wave hits successively: � the central structures at 0.06 ms. © czech technical university publishing house http://ctn.cvut.cz/ap/ 15 czech technical university in prague acta polytechnica vol. 46 no. 3/2006 fig. 7: pressure � the baffle and the ccp from 0.1 ms. � the internal vessel from 0.18 ms. � the bottom of the main vessel at 0.22 ms. � the upper part of the main and the internal vessels between 0.34 and 0.4 ms. 
from 0.1 to 0.22 ms, the diagrid and the core support structure (css) hold back the propagation of the pressure wave, so that pressure increases up to 160 mpa in the bottom right-hand corner of the central zone, while it remains lower than 60 mpa below the diagrid. the expansion of the pressure wave causes a pressure decrease in the centre of the bubble and at the bottom of the ccp. the pressure wave keeps a spherical shape until 0.22 ms. then the vertical progression is limited by the presence of the main vessel at the bottom and the argon layer at the top. so the wave progresses laterally and crashes progressively against the main vessel. at 0.38 ms, the wave forms a sort of queue that impacts the top of both vessels. at 0.26 ms, the pressure is maximum below the diagrid (25 mpa) and in the intermediate zone (18 mpa). the pressure decreases down to atmospheric pressure at the bottom of the central and intermediate zones, and at the top of the ccp. the pressure always remains low in the argon layer owing to the gas compressibility. from 0.42 to 1 ms, the pressure globally diminishes, and even reaches 12 mpa in some areas. between 1.2 and 2.6 ms, the pressure globally becomes lower than 5 mpa. the pressurised area along the rounded corner slides up to the channel. the overpressure around the central zone moves towards the peripheral components and the top of the ccp. the top of the ccp remains pressurised until 3.7 ms due to the convergence of the upward water flows in the ccp, and the horizontal flows coming from the top closure. until 4.7 ms, the pressurised area slides horizontally below the roof towards the top right-hand corner. between 5 and 6 ms, the rebound of water against the css pressurises the lateral zone. afterwards, the pressure becomes lower than 5 mpa in the entire mock-up. two main differences appear compared with the computation without the massive peripheral structures: when the pressure wave crashes against the vessels at about 0.34 ms, the passing of the wave is slightly delayed and the maximum pressure is higher (20 mpa on the front face of the porous zone, instead of 17 mpa in the corner joining the intermediate and upper parts of the internal vessel). the overpressure in the upper zone between 2.8 and 3.5 ms remains lower than 3 mpa and slides upwards along the porous zone, instead of reaching 11 mpa and impacting directly the upper part of the internal vessel. 5.2 gas fraction figs. 8 to 10 show the general volume presence fraction of the gas, the mass presence fractions of the bubble and the argon, respectively. initially, the mock-up is filled with water, except the explosive charge in the centre and the argon layer below the top closure. the gas bubble grows spherically until 0.2 ms. then it expands preferentially upwards and takes a rectangular shape, homothetic to that of the central area at 0.8 ms. the bubble deviates towards the free opening, and starts escaping in the upper zone from 1.6 ms. water remains trapped at the bottom of the central area due to the expansion of the bubble and the confinement of the central structures. from 0.4 ms, water vaporises in the areas which become depressurised after the passing of the pressure wave. the bubble escapes from the central zone until 7 ms, creating a large panache of pressurised gas in the upper zone. the pressurisation induced by the panache condenses steam in the upper and intermediate zones at 5 ms. 
nevertheless, the steam layers along the internal vessel and the diagrid still exist until 6 and 8 ms, respectively. the pressure wave impacting the top closure starts compressing the argon layer upwards at 0.2 ms. from 1.6 to 4 ms, the panache propels water upwards, which in turn pushes and compresses the low-pressure argon gas into the top right-hand corner of the mock-up. from 5 to 17 ms, the argon bag is pushed down in the channel by the violent water flows below the top closure. from 8 ms, the expansion of the bubble gas out of the central area is stopped by the rebound of water against the top closure and the internal vessel. the front of the panache flows sideways and a whirlpool forms; the base of the panache becomes thinner. due to the general depressurisation from 6 ms on, the argon expands back along the top closure. the whirlpool pushes back the bubble gas in the central area from 13 ms. at 17 ms, the bubble almost completely fills the central area. the panache goes slightly away from the free opening and forms a torus that twirls round in the middle of the water. some gas flows down along the outer side of the neutron shielding and remains trapped between the shielding and the baffle. the argon gathers and forms a bag at the top of the ccp. as fluid flows back from the channel over the internal vessel, a second bag of argon forms near the top right-hand corner. after 30 ms, the central area is completely diphasic. water vaporises along the main vessel bottom and the diagrid due to the depressurisation of the bottom zone. the main difference when the computation is done without the peripheral structures concerns steaming. larger amounts of vapour appear in the porous zone and for a longer time between 1.2 and 3 ms. it seems that the panache extends a bit less in the new simulation. 5.3 fluid velocity figs. 11 and 12 show the fluid orientation and the fluid velocities. the fluid is initially at rest in the whole mock-up. from 0.02 ms, the bubble expands spherically with a velocity of 200 m/s. at 0.1 ms, the high speed on the symmetry axis comes from the horizontal blocking condition. the confinement of the central area induces a speed decrease in the lower part. the central structures are hit at 0.1 ms. the pressure wave accelerates the water at the periphery of the central zone: water impacts successively between 0.12 and 0.3 ms the baffle, the bottom of the internal vessel, all the spacer plates of the ccp, and the vessel bottom. the 16 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 46 no. 3/2006 czech technical university in prague © czech technical university publishing house http://ctn.cvut.cz/ap/ 17 czech technical university in prague acta polytechnica vol. 46 no. 3/2006 fig. 8: volume presence fraction of gas speed falls to 140 m/s in the free opening, and water impacts the neutron shielding at 80 m/s. between 0.38 and 0.8 ms, the bubble flows towards the free opening at 100 m/s. the water progressively hits all the external structures: the highest speeds reach 30 m/s at the bottom of the main vessel and 40 m/s in the heat-insulation of the ccp. the argon leaves the top of the ccp horizontally by the perforated part of the external cylinder. from 1 to 2.4 ms, the bubble crosses the free channel with velocities up to 90 m/s, and directly impacts the top of the neutron shielding. out of the central area, the water flows: � down towards the bottom of the main vessel and the collar. 
� laterally towards the rounded corner and the channel. � upwards and diagonally towards the upper part of the internal vessel. the upward water flows below the top closure expel the argon out of the ccp, and horizontally towards the top right-hand corner. from 3 to 5 ms, the bubble flows out of the central area violently and forms a gas panache. the panache pushes water down between the neutron shielding and the baffle, upwards and diagonally towards the top right-hand corner. the thin argon layer slides outwards along the top closure at 80 m/s. water bounces back against the css and flows back toward the internal vessel. in the bottom zone, the water flow reverses. until 7 ms, the panache progresses in the upper zone at a speed up to 90 m/s, and sucks up the water of the intermediate zone. the water in the ccp slows down and deviates inwards near the top. in the channel, the convergence of the upward water flows from the lateral zone and the downward argon flows from the top corner, and leads to an outward horizontal thrust. the progression of the panache stops at 8 ms at the limit of the porous zone. the gas at the front of the panache deviates sideways and a large whirlpool forms above the baffle. the speed reaches 40 m/s in the panache. the whirlpool sucks water from the top and the side of the mock-up. a second smaller whirlpool arises in the central zone as part of the gas cannot cross the free opening. two opposite flows superimpose at the top of the mock-up: argon moves outwards whereas water is pulled inwards by the large whirlpool. the fluid in the top right corner exerts a major thrust, and therefore the flows progress downwards in the channel. the flows reverse in the ccp, and slow down in the bottom and lateral zones. 18 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 46 no. 3/2006 czech technical university in prague fig. 9: mass presence fraction of the bubble between 14 and 20 ms, the whirlpool turns away independently from the rest of the fluid. not only does it prevent the exit of gas out of the central zone, but it pushes fluid down into this confined area. as a consequence, the impact against the diagrid and the in-pile shell accelerates down the water below the diagrid, and up the water in the ccp. the whirlpool draws water from the top closure and the top of the ccp, and pushes it in the intermediate zone. the argon bag in the channel is sucked inwards and passes over the top of the internal vessel. on the contrary, argon concentrates at the top of the ccp near the symmetry axis. after 20 ms, the whirlpool dilutes progressively with the surrounding water, so that it fills half the upper zone at 50 ms. the second small whirlpool continues to twirl in the central area. the argon expands both inwards and downwards. the flows globally slow down in the entire mock-up. the differences when the computation is done without peripheral structures appear at 5 ms. the horizontal thrust in the channel against the top of the main vessel starts at 5 ms in the new simulation instead of 6 ms. the same advance of 1 ms is also noted at the time of the formation of the main whirlpool. the horizontal argon flows are also more marked. the presence of the porous zone acts as an obstacle, and deviates the water flows upwards towards the top closure. consequently, the fluid is pushed sooner and somewhat more violently sideways along the top closure. 5.4 deformed shape figs. 
13 to 16 present the deformed shapes of the mesh and structures, and the radial and vertical displacements of the structures, respectively. the first structures to deform are those closest to the explosive charge, because they are the first to be impacted by the shock wave. almost immediately, the internal shells and the base of the main vessel bend and move away. the bending of the neutron shielding induces a lowering of the shell. the plates of the ccp bend while the bottom of the cylinders buckles, because of the violent flows crossing the free opening. at 0.6 ms, the upper part of the internal vessel and the lateral wall of the main vessel are pushed outwards by the perpendicular water flows. the bending and the lift of the spacer plates pull the external cylinder of the ccp locally inwards. between 0.8 and 2 ms, the top of the neutron shielding opens completely. the bending of the baffle and the lower part of the internal vessel propagates from the bottom to © czech technical university publishing house http://ctn.cvut.cz/ap/ 19 czech technical university in prague acta polytechnica vol. 46 no. 3/2006 fig. 10: mass presence fraction of argon 20 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 46 no. 3/2006 czech technical university in prague fig. 11: fluid flows mid-height. the in-pile shell deforms both at the edge and in the centre. the entire main vessel lowers owing to the high thrust of the water on its base. as the top of the vessel is blocked, the lowering is proportional to the distance from the top. this lowering pulls down the collar, the css, and all the structures attached to the vessel. the only exception concerns the baffle, which becomes rounded and stretched by the top water flows. a bulge forms at the bottom of the lateral wall of the main vessel and the upper part of the internal vessel. the rigid css stays fixed radially. the massive top closure is lifted from 1.5 ms by the impact of the upward-directed water; the lift induces a rotation around the hanging device. the ccp goes on being lifted. the external cylinder buckles just under the top-closure level. as the heat-insulation lower plate is more rigid than the external cylinder, the splash of water against the heat-insulation induces a bending of the cylinder rather than one of the heat-insulation. the shortening caused by the buckling of the external cylinder contributes to the lift of the ccp plates. at 2 ms, maximum radial displacements are reached: � 60 mm at the top of the neutron shielding. � 45 mm at mid-height of the baffle. � 25 mm at mid-height of the lower part of the internal vessel. � 15 mm in the main vessel just below the channel. at 2.8 ms, the lowering becomes maximum in the centre of the diagrid (�30 mm), at the top of the radial shielding (�30 mm), and at the junction of the lower and intermediate parts of the internal vessel (�20 mm). the css lowering is around �10 mm. the very flexible baffle continues to be stretched and lifted because of the fluid rebound against the css. from 3 to 5 ms, globally, the deformation of structures in the lower part of the mock-up remain relatively constant whereas the structures in the upper part suffer large increasing deformations. the panache of gas flowing out of the central zone and the wide opening of the radial shielding tilt more the top of the © czech technical university publishing house http://ctn.cvut.cz/ap/ 21 czech technical university in prague acta polytechnica vol. 46 no. 3/2006 fig. 
11: fluid flows (following subfigures) 22 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 46 no. 3/2006 czech technical university in prague fig. 12: fluid speed © czech technical university publishing house http://ctn.cvut.cz/ap/ 23 czech technical university in prague acta polytechnica vol. 46 no. 3/2006 fig. 13: deformed shape of the mesh 24 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 46 no. 3/2006 czech technical university in prague fig. 14: deformed shape of structures © czech technical university publishing house http://ctn.cvut.cz/ap/ 25 czech technical university in prague acta polytechnica vol. 46 no. 3/2006 fig. 15: radial structure displacements 26 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 46 no. 3/2006 czech technical university in prague fig. 16: vertical structure displacements baffle, while the downwards flows in the intermediare zone accentuate the bending of the baffle at mid-height. the water hurled by the panache deforms the intermediate and upper parts of the internal vessel, as well as the top of the peripheral components. the upper part of the main vessel bends and moves back upwards. the ccp continues to be lifted. the upper bulge in the external cylinder crashes against the top closure and the heat-insulation lower plate is lifted and bends at 3.6 ms. the lift of the ccp pulls up the massive structures of the top closure. the joining rings being much more flexible than the massive structures, the pieces shift up mainly at the ring level, and a deformation in stairs appears clearly. between 5 and 9 ms, the top of the baffle opens due to the fluid rebounds against the css. the main vessel bottom and the diagrid stop lowering and move back up due to the reversal of flows. this lift pulls up the rest of the main vessel, the collar and all the shells attached to the css. the lower and intermediate parts of the internal vessel are pushed back inwards, by water flows rebounding against the rounded corner of the main vessel. the upper part bends and buckles due to the flows against and over the shell. the shell lowers and reaches a maximum opening of 15 mm at 7 ms. the peripheral components – modeled by the porous area – bend at mid-height. the downward flows of water and pressurised argon in the channel deform the top of the main vessel from 6 ms. the formation of this upper bulge extends the rubber-ring band joining the base of the roof to the main vessel. progressively, the upper bulge moves down, extends and joins the lower bulge at 9 ms. in the top closure, the plugs keep their original shape whereas the joining rings are completely out of shape. the maximum displacement reaches 30 mm in the lower heat-insulation plate at 9 ms, and about 20 mm in the small rotating plug. the rotation around the bottom extremity of the hanging device pulls inwards the entire top closure. the roof slab deforms at the level of the two pieces of different thickness: the thicker piece rotates and moves more than the thinner one, as the latter is tied up to the hanging device. the ccp continues to be lifted. the external cylinder is partially crashed against the top closure. the in-pile shell, the internal cylinder and the central part of the spacer plates reach a maximum displacement of 100 mm at 9 ms. 
as the fluid flows change direction and orient downwards in the ccp, the lower spacer plate and the in-pile shell come closer in the centre and at the edge. the radial displacements at the extremities of the spacer plates and in the main vessel above the collar connection remain constant until the end of the computation. between 9 and 12 ms, the main vessel and the structures attached to the css keep on moving back upwards. the top of the baffle deforms more or less like rubber, and its radial deformation becomes maximum at 12 ms and reaches 25 mm. the intermediate part of the internal vessel bends, under the effect of the whirlpool formation. both the intermediate and upper parts are lifted by the upward flows below the vessel. the flows in the channel push the upper part back inwards. the upper bulge of the main vessel takes a pointed shape, and reaches a maximum radial displacement of 35 mm at 12 ms. the formation of this bulge pulls up the part of the main vessel below. the top closure and the hanging device rotate back and return after the fluid rebound. the ccp also moves back down, and the spacer plates move aside as the fluid flows reverse. from 13 to 17 ms, the diagrid moves down again, the edge of the in-pile shell edge moves up, the baffle and the upper part of the internal vessel move back inwards, as the whirlpool pushes the fluid violently inside the central area. the impact against the diagrid accelerates the water in the bottom zone, and the main vessel bottom lowers again, which pulls down the rest of the main vessel and the structure set attached to it. the top closure rotates again and the entire ccp is lifted once again. from 25 ms onwards, the structures oscillate around their deformed position according to the evolution of the flows. compared with the computation without the peripheral structures, the main differences concern the deformation of the two vessels. the formation of the upper bulge in the main vessel starts sooner at 5 ms in the new simulation. the upper and intermediate parts of the internal vessel deform more: � the base of the upper part rounds more at 2 ms. � the top of the internal vessel slightly opens at 7 ms. � the bending of the intermediate part from 10 ms is really more marked, and leads to a deformation of the porous zone. 5.5 von mises stresses in structures fig. 17 shows the stresses in both the external and internal structures. stresses appear first in the central structures and the ccp, due to the impact of the pressure wave. between 0.2 and 0.3 ms, stresses also appear in the baffle, the two vessels, and the ring joining the ccp to the top of the small rotating plug. from 0.4 to 0.5 ms, the stress level increases in the structures previously cited, and the ring joining the two rotating plugs. high-stress spots appear at the collar attachment and the thickness change between the rounded and cylindrical parts of the main vessel. there is no stress increase in the css as it is supposed rigid. in the ccp, stresses rise at the plate-cylinder junctions because the fluid lifts and bends the plates. however stresses decrease in the cylinders as they are not coupled to structures. the stress level remains very low in the diagrid support because the constitutive law presents a very low plastic threshold. from 0.6 to 1.2 ms, stresses decrease in the ccp and the lateral wall of the main vessel. however they increase in: � the neutron shielding and the baffle because of the formation of the bubble panache. 
� the main vessel bottom with a maximum of 650 mpa at 1.2 ms on the symmetry axis. � the joining rings of the top closure and the hanging device due to the lift and rotation of the top closure. from 1.2 to 2 ms, the stressed areas of the main vessel base shift towards the collar, and then evolve according to the fluid © czech technical university publishing house http://ctn.cvut.cz/ap/ 27 czech technical university in prague acta polytechnica vol. 46 no. 3/2006 28 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 46 no. 3/2006 czech technical university in prague fig. 17: von mises stresses flows. stresses reach a maximum of 650 mpa at the top of the neutron shielding at 2 ms. they rise in the intermediate part of the internal vessel, but decrease in the lower and upper parts. even if stresses globally reduce in the ccp, high-stress spots reach 750 mpa at 1.6 ms at the junctions of the spacer plates with the external cylinder. at 2 ms, stresses reach 250 mpa in the lower heat-insulation plate, 450 mpa where the external cylinder buckles, 500 mpa and 800 mpa in the joining rings between the rotating plugs and between the large plug and the roof. stresses also increase in the hanging device and the rubber band joining the roof to the main vessel. all these stress rises are due to the lift and the rotation of the top closure. between 3 and 5 ms, stresses reduce in the neutron shielding and the in-pile shell, as the inter-shell interval available for fluid flows becomes maximum. they reduce in the lower part of the internal vessel, and rise in the intermediate and upper parts. stresses reach a maximum of 550 mpa at 3 ms in the baffle, when the panache hits the top head-on. in the main vessel bottom, the level and the location of stresses are linked to the fluid rebounds. when the fluid impacts the collar at 3 ms, stresses reach 900 mpa in the collar vicinity, and 500 mpa in the rounded corner. stresses rise later in the lateral wall and the rubber band joining the roof to the main vessel, when the fluid starts flowing down in the channel. they also increase in the three upper rings. from 6 to 17 ms, stresses globally decrease from the bottom to mid-height of the main vessel, and alternate more or less high values at the collar attachment according to the rebounds of the water. stresses still reach 550 mpa at the level of the upper bulge. the stresses observed at mid-height of the neutron shielding are caused by the thrust of the small whirlpool in the central zone. as the large whirlpool attracts the fluid from the internal vessel towards the baffle, stresses decrease in the internal vessel, but reach 550 mpa in the baffle at 6 ms. then the stress level oscillates according to the fluid rebounds against the css. in the ccp, stresses remain very low in the internal and intermediate cylinders because these cylinders are not coupled to the fluid. the in-pile shell and the spacer plates remain submitted to the thrust of the small whirlpool. the stress level at the junctions of the plates with the external cylinder depends on the orientation of the flows inside the plug and the pressure exerted by the large whirlpool. stresses increase in the joining rings and the hanging device between 6 and 10 ms simultaneously with the lift in stairs of the top closure. the highest stresses (900 mpa) are observed at 12 ms in the ring between the roof and the large rotating plug. then stresses decrease as the water flows back downwards. 
from 20 to 50 ms, stresses decrease in the lower part of the mock-up. however, a high-stress spot persists at 20 and 35 ms at the collar attachment and at the level of the upper bulge in the main vessel. at the top of the mock-up and particularly in the rings, stresses evolve according to the flow orientation in the upper zone. the ccp presents high-stress points up to 900 mpa at 25 ms at the junctions of the plates with the external cylinder, when the large whirlpool massively sucks down fluid from the top closure along the ccp. the stress level remains very low in the rigid structures (massive pieces of the top closure, css). in the diagrid support, stresses remain limited as the shell becomes almost immediately plastic and suffers large deformations compared to the stress level. compared with the computation without the peripheral structures, stresses are higher until the formation of the main whirlpool and lower afterwards. they are higher: � in the intermediate part of the internal vessel from 0.4 ms, in the entire internal vessel and the rotating plug-roof ring from 3 to 6 ms, since the fluid tends to flow around the porous zone. � at the bottom of the main vessel between 0.8 and 3 ms as if the fluid thrust were slightly higher with the porous model. � in the lateral wall of the main vessel between 4 and 6 ms due to the sooner formation of the upper bulge. stresses are slightly lower from 8 ms in the ccp with the porous simulation. between 10 and 25 ms, the maximum stresses in the top closure occur at different times in the two simulations. 5.6 plastic strains in structures fig. 18 shows the plastic strains in structures versus time. due to the impact of the shock wave, the entire neutron shielding suffers plastic deformations from 0.2 ms. at 0.3 ms, the baffle and the edge of the diagrid support become plastic. small plastic strains appear in the lower part of the ccp. from 0.4 to 0.8 ms, the lower part of both vessels, and the junction of the spacer plates with the external cylinder become plastic. between 1 and 1.2 ms, the plastic level increases strongly in the structures previously cited. plastic strains appear at the top of the ccp external cylinder, and in the ring joining the two rotating plugs. the strain level reaches: � 40 % at 1.6 ms at the top of the neutron shielding. � 30 % at 2.2 ms at mid-height of the baffle. � 25 % at 4 ms at the junction of the ccp external cylinder with the heat-insulation plate. � 12 % at 4 ms in the rings between the rotating plugs and the roof. � 8 % in the centre of the diagrid support at 1.6 ms, and in the lower part of the internal vessel at 4 ms. � 6 % in the heat-insulation plate at 5 ms. � 4 and 5 % at the bottom and at the level of the lower bulge of the main vessel, respectively. from 6 ms onwards, the plastic strains remain constant in the shielding, the baffle, the diagrid, the main vessel bottom and most of the ccp. on the contrary, strains increase at the junction of the lower and intermediate parts of the internal vessel, and in the lateral wall of the main vessel (they reach 8 % at 10 ms in the channel during the formation of the upper bulge). maximum plastic strains of 14 % and 20 % are observed at 14 ms, in the rings between the small and large rotating plugs and between the large plug and the roof slab, respectively. © czech technical university publishing house http://ctn.cvut.cz/ap/ 29 czech technical university in prague acta polytechnica vol. 46 no. 
3/2006 30 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 46 no. 3/2006 czech technical university in prague fig. 18: plastic strains the strain levels in the computations with and without the peripheral structures are globally the same. 6 conclusion in this paper, we present a porosity model that enables us to take into account, in a simplified way, the influence of complex structures on a fluid transient flow and an application of this method to the simulation of an explosion. the method is based on models representing porous media and consists in averaging structures with the surrounding fluid as if the structures were solid pores inside a fluid. it was specifically developed to model fast transient phenomena without phase change, without heat transfers and described with a eulerian formulation. the present method was implemented in the europlexus computer code to describe some internal structures of a reactor in the case of a hypothetical core disruptive accident in liquid metal fast breeder reactors. however, this method can be adapted to any kind of problem involving a transient phenomenon with a fluid flow going through a structure set of complex shape. the mars mock-up is a small-scale replica of a fast breeder reactor. it contains all the internal structures of the reactor block. however, the fluids intervening in the real accident were replaced by water, argon and an explosive charge in the experiment in order to make the test easier to manage. the mesh includes classical shells or massive structures to represent the external and internal components, as well as a porous medium to describe the internal structures of complex geometry. the internal fluids are described by the specific cda constitutive law implemented on purpose in the europlexus code for computing this kind of explosion. the main results observed during the simulation of the accident concern the propagation of a shock wave from the centre of the mock-up towards the external structures, which loads and deforms all the structures. the high-pressure gas bubble in the central part of the mock-up expands in the rest of the mock-up. this expansion contributes to accelerate and hurl the water against all the structures. the argon layer under the top closure is pushed sideways in the top right-hand corner and flows partially into the channel between the internal and main vessels. the structures most in demand are the neutron shielding and the core cover plug because of their proximity to the explosive charge. the neutron shielding opens like a flower, the core cover plug is lifted and partially buckles. the main vessel lowers at the bottom and bends twice in the lateral wall, thus forming two bulges. the top closure deforms in stairs due to the weak rigidity of the joining rings linking the massive slabs (plugs and roof). the development of the porous model with the application to the simulation of the core disruptive accident in the mars mock-up brings to an end a series of numerical simulations performed since the 1980s on the mara 8, mara 10 and mars mock-ups to improve the understanding of this kind of accident and to validate the cda models implemented in the europlexus code. references [1] louvet, j.: containment response to a core energy release. main experimental and theorical issues – future trends. in: proc. 10th int. conf. on structural mechanics in reactor technology, vol. e, anaheim (usa), august 1989, p. 305–310. 
dr. marie-france robbe
phone: 331 69088749
e-mail: sdufour@free.fr
38 rue de migneaux, 91300 massy, france

personal dosimetry enhancement for underground workplaces
l. thinová, a. fronka, d. milka
personal dosimetry for underground workers mainly concerns measurement of the concentration of radon (and its daughters) and the correct application of the data in dose calculation, using a biokinetic model for lung dosimetry. a conservative approach for estimating the potential dose in caves (or underground) is based on solid state alpha track detector measurements. the obtained dataset is converted into an annual effective dose in agreement with the icrp recommendations using the "cave factor", the value of which depends on the spectrum of aerosol particles, or on the proportional representation of the unattached and the attached fractions, and on the equilibrium factor. the main differences between apartments and caves are the absence of aerosol sources, high humidity, a low ventilation rate and the uneven surfaces in caves. a more precisely determined dose value would have a significant impact on radon remedies or on restricting the time workers stay underground. in order to determine how the effective dose is calculated, it is necessary to divide these areas into distinct categories by the following measuring procedures: continual radon measurement (to capture the differences in eerc between working hours and night-time, and also between daily and seasonal radon concentration variations); regular measurements of radon and its daughters to estimate the equilibrium factor and the presence of 218po; regular indoor air flow measurements to study the location of the radon supply and its transfer among individual areas of the cave; natural radioactive element content evaluation in subsoils and in water inside/outside, a study of the radon sources in the cave; aerosol particle-size spectrum measurements to determine the free fraction; monitoring the behaviour of guides and workers to record the actual time spent in the cave, in relation to the continuously monitored levels of rn concentration.
keywords: radon concentration, underground, cave, effective dose, aerosol particles.
1 introduction
one of the more interesting types of underground spaces, in terms of personal dosimetry, are karst caves, which are found in regions with exceptionally high radon concentrations (despite the very low uranium content in limestone) due to minimal airflow and negligible air exchange. exact effective dose estimation is required, in view of the potential health hazards caused by inhaling radon and its daughters, in order to categorize such unsafe work areas. in this respect, the main differences between apartments and caves are the absence of aerosol sources (the aerosol concentration is approx. 100 times lower than in the outdoor atmosphere), high humidity, a low ventilation rate and the uneven surfaces in caves. the conversion factor (msv/wlm) depends on the resolution of the aerosol particle size spectrum and on the distribution of radon daughters among the attached and unattached fractions [1].
with the aim of specifying the calculation of the real effective dose in caves, the following test measurements were carried out a few years ago: natural radioactive element content evaluation in subsoils and in water inside/outside, to study the radon sources in the cave; continual radon measurements; regular radon and daughters measurements using a sampling procedure, to specify the proportions of the radon daughters; regular indoor air flow measurements, to study the location of the radon supply and its transfer among individual areas of the cave; 5-day aerosol particle-size spectrum measurements, to determine the free fraction and to compare the aerosol spectra in an apartment and in a cave; comparative measurements using various radon and daughters monitors, to study their behaviour in an area with high humidity (including radim 4 – a device for simultaneous detection of the radon concentration and the concentrations of the unattached and attached fractions of its daughters); monitoring the behaviour of guides and visitors, to record the time spent in the cave in relation to the continuously monitored levels of rn concentration; and testing some personal monitors for the detection of radon and its daughters. all of the obtained data was used for a discussion of the location-related value of the "cave factor", and consequently for the effective dose calculation.
fig. 1: air flow 200 cm (a) and 30 cm (b) above ground in two places connected by a tunnel and stairs
2 some results of the air flow measurements
individual segments of underground areas are interconnected, depending on their contact with the earth's surface (through crevasses) and on the external atmospheric conditions – pressure and temperature (the so-called "summer and winter" mode). the air flow is also affected by the movement of visitors in the cave, of course (the tendency to protect the conservative microclimate of caves by constructing double doors results in an increase in radon concentration). air flow measurements give very interesting information about the origin of "radon pockets" with very high radon concentrations, and enable a study of the location of the radon supply and its transfer among individual areas of the cave. some results of air flow measurements using the testo device are shown in figs. 1, 2.
fig. 2: the different systems of air flow in the "jezero" (a) and "peklo" (b) caves
3 continual radon monitoring
comparative measurements were carried out using various radon and daughters monitors to study their behaviour in an area with high humidity (including radim 4 – a device for simultaneous measurement of the radon concentration and the concentrations of the unattached and attached fractions of its daughters). most of the results of the continual radon, radon daughters and unattached fraction measurements show an equilibrium factor around f ≈ 0.5 and an unattached fraction around 2 %. note: one of the most important questions remains: how accurately was the unattached fraction measured?
fig. 3: radim 3 continual radon monitor in a plastic box with desiccant, alphaguard continual radon monitor, fritra 4 continual radon and attached and unattached radon daughters fraction monitor
fig. 4: radon concentration in different places in a cave (radim 3)
fig. 5: some results of measurement using fritra 4
4 effective dose calculation in caves
with regard to the standard monitoring procedure for radon and its daughters, routinely used in the czech republic, two sets (summer and winter) of solid state alpha-track detectors are placed inside the cave. the results (equilibrium equivalent radon concentration – eerc) are converted into the equivalent dose, including the application of a special cave-evaluation methodology which includes additional factors (e.g. differences in the unattached fraction in comparison with the common indoor atmosphere) in order to avoid underestimating the real dose. based on the icrp 60 recommendation, the calculation used the equation
e (msv) = h_p · a · t (h) · eerc (bq·m^-3),
where t is the working time, eerc the equilibrium equivalent radon concentration, a the conversion factor, and h_p the "cave factor" = 1.5.
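as a small illustration of this formula, the sketch below evaluates the annual effective dose; the python function is written for this text, and the numerical values in the example (working time, eerc, conversion factor) are placeholders rather than measured values from this survey.

    # illustrative evaluation of e (msv) = h_p * a * t * eerc, as quoted above
    def effective_dose_msv(eerc_bq_m3, hours, conv_factor_msv_per_bqhm3,
                           cave_factor=1.5):
        # cave_factor = h_p, conv_factor = a, hours = t, eerc in bq/m3
        return cave_factor * conv_factor_msv_per_bqhm3 * hours * eerc_bq_m3

    # example: 600 working hours per year at eerc = 1000 bq/m3; the factor
    # 7.8e-6 msv per (bq*h/m3) corresponds to the common icrp 65 dose
    # convention (~5 msv per wlm) and is used here only as a placeholder.
    print(effective_dose_msv(1000.0, 600.0, 7.8e-6))   # -> about 7 msv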
the new approach presumes that the entire effective dose that a person receives from radon daughters should be calculated as the sum of the effective doses obtained from the individual sizes of the aerosols (including the unattached fraction), using the following assumptions: only 222rn and its daughters occur in the caves (if this assumption is incorrect, it would be difficult to determine the conversion factors between the exposure to the products of radon decay (wlm) and eerc, because the radon daughters are differentiated by their latent energy); the mutual ratio of the radon daughters, the equilibrium factor and the spectrum of aerosols in a given place are constant [2]. the calculation will use the respiratory tract model [3].
5 aerosol measurement campaign
in addition to the radon measurements, the aerosol particle size distribution was also measured, as one of the most important parameters for dose evaluation (diffusion battery and differential mobility analyzer, aerosol impactor). it was found that the presence of aerosol particles 1–10 μm in diameter is definitely caused by the presence of visitors or personnel; when the cave was closed, the particles disappeared quickly in the plate-out process (after approx. 1 h the concentration was about 10^-4 particles·cm^-3). on the other hand, the concentration of particles about 200 nm in diameter is relatively stable (~10 particles·cm^-3). for the last particle size group (~10 nm), it seems that the aerosols are produced by intensive work or movement (the concentration is about 100–1000 particles·cm^-3). the aerosol spectrum plays an important role in the recently suggested dose calculation (see fig. 6 and fig. 7 below).
6 conclusions
the enhancement of personal dosimetry for underground workplaces includes a study of the given questions from three main points of view:
1. a classification of underground areas, and of the main characteristics and differences that have an influence on individual irradiation from radon (or from other sources of radiation) and on measuring the concentrations of radon and radon daughters (specification of radon sources, remediation or another method for eliminating radon sources),
2. a summary and critical evaluation of approaches to the evaluation of lung irradiation, and determination of the input parameters for the evaluation of individual irradiation (using the latest information from nrpb),
3. selection of the correct measurement method (equipment) leading to the real values of the chosen parameters (see point 2), providing sufficiently clear information about the concentrations of radon and its daughters within various types of underground areas (and segments of areas), and assessing information about aerosol spectra suitable for determining the unattached fraction.
fig. 6
the aim of the research will be to elaborate a new methodology for calculating the effective dose for guides in caves that are visited by people or used for speleotherapy, applicable also to other underground spaces. the existing measurements and a follow-up survey will be used.
the existing results indicate that individual underground spaces have markedly different radon concentrations and representations of the unattached fraction. these factors have a substantial impact on the dose calculation. the results from the group of evaluated caves did not confirm the presence of aerosol sources, or of sources of radon other than the surrounding rock (and possibly water); high radon concentrations are accompanied by a very low flow rate; the equilibrium factor varies mostly between 0.2–0.5; the unattached fraction constituted approximately 2 %. we have not yet been able to determine with certainty the ratio of the attached and unattached fractions, which could significantly influence the calculations. the time spent underground by guides should be related to the actual radon concentration at a given time. personal dosimeters are advised. all results should be used as input parameters for the ludep dose-in-lungs model. using the method described above, the effective dose can be recalculated fairly precisely.
references
[1] icrp: protection against radon-222 at home and at work. icrp publication 65, pergamon press, oxford, uk, 1993.
[2] hinds, w. c.: aerosol technology: properties, behavior, and measurement of airborne particles. second edition, john wiley & sons, new york, 1998.
[3] birchall, a. et al.: respiratory tract model. nrpb-sr287, chilton, didcot, oxon ox11 0rq, 1996.
rndr. lenka thinová
e-mail: thinova@fjfi.cvut.cz
czech technical university in prague, faculty of nuclear sciences and physical engineering, břehová 7, 115 19 prague 1, czech republic
a. fronka
national radiation protection institute, bartoškova 28, 140 00 prague 4, czech republic
mgr. dušan milka
bozkov dolomite cave direction, 512 13 bozkov, czech republic
fig. 7

geoinformatics study at the czech technical university in prague
l. mervart, a. čepek
at the ctu in prague, there is a long tradition of master degree courses in geodesy, geodetic surveying and cartography. taking into account the fast development of information technologies in recent decades, we decided to prepare a new study program that would combine computer science with a background of geodetic and cartographic know-how. apart from other sources, our plans were inspired and influenced by the review of education needs, a report prepared by stig enemark (prague 1998), and by our experience from several virtual academy workshops. we have decided to call this program "geoinformatics" to emphasize the role of computer technologies in collecting, analyzing and exploiting information about our planet. within this presentation we will explain the basic ideas behind our new study program and emphasize the features that distinguish it from classical geodetic or cartographic programs. we will mention the connection between our new study program and several geodetic and software projects running at our institute: software development for real-time gps applications, cooperation with the astronomical institute, university of berne, on the development of the so-called bernese gps software, the gnu project gama for the adjustment of geodetic networks, etc.
keywords: education, curricula, geoinformation, software development.
1 what's in a name?
what's in a name? that which we call a rose by any other word would smell as sweet. (romeo and juliet – act ii, scene ii)
we dare to disagree with the great poet. the name of a study program can be important for students finishing high school and deciding which university they want to apply to. let us make a tiny linguistic digression. according to [1], geodesy is the scientific discipline that deals with the measurement and representation of the earth, its gravitational field and geodynamic phenomena (polar motion, earth tides, and crustal motion) in three-dimensional, time-varying space. the second branch of our traditional study programs, cartography, is (according to [1] again) the study and practice of making maps or globes. we find these definitions good, but do they reveal that computer science and informatics nowadays play a key role in our discipline? this question can be important for young people deciding about the direction of their future professional career. a new word, geomatics, was apparently coined by b. dubuisson in 1969. it is the discipline of gathering, storing, processing, and delivering geographic information. this broad term applies to both science and technology, and integrates several specific disciplines (including geodesy, cartography, and, last but not least, geographic information systems). we were tempted to call our study program "geomatics"; however, in the end we voted for another new word – geoinformatics.
informatics (or information science) is studied as a branch of computer science and information technology and is related to database, ontology and software engineering. it is primarily concerned with the structure, creation, management, storage, retrieval, dissemination and transfer of information. we understand "geoinformatics" as a science that synthesizes the achievements of informatics with knowledge of the principles of geodesy and cartography. in the geodetic courses we will teach our students the mathematical and physical background of geodesy as well as the practice of surveying – the techniques of gathering and processing measurements. within the study of geoinformatics we will teach our students the theoretical principles of geodesy and many things about computers and information technologies.
2 our projects – geodesy and computer science in concordance
protest my ears were never better fed with such delightful pleasing harmony. (pericles, prince of tyre – act ii, scene v)
we have studied geodesy and we like sitting at computers writing our own applications. can we bring these two things into concordance? we are deeply convinced that we can. we will present some of our projects to demonstrate the interrelations between geodesy and informatics.
2.1 real-time monitoring of gps networks
the first project is related to our work for the company gps solutions, boulder, usa. within the contract between gps solutions and the japanese geographical survey institute (gsi), we take part in the development of a program system for the real-time processing of gps data with the highest possible accuracy [8]. together with our american colleagues we have prepared a software system consisting of a server that collects data from many gps receivers and the rtnet (real-time network) processing program, which computes very accurate positions of gps stations in real time. the system is primarily designed for the real-time processing of data stemming from the japanese network geonet (gps earth observation network) – a unique network consisting of 1200 permanent gps stations. one of the main purposes of geonet is to monitor seismic deformations. understanding the character of seismic waves and the laws of their propagation may help in the design of earthquake-resistant buildings or even in the establishment of an alert system that could save human lives. the left-hand side of fig. 1 shows a map of several geonet stations located on the southern coast of hokkaido island. the right-hand side shows the motion of these stations during the tokachi-oki earthquake on september 26th, 2003, computed by our rtnet software.
we find it fascinating to see the huge seismic shocks revealed by gps measurements. the plot clearly shows the propagation of the seismic waves – stations closer to the epicenter sense the waves earlier than the more distant stations. the time delay between the so-called primary and secondary seismic waves can be observed by comparing the horizontal and vertical components of the station motions.
fig. 1
2.2 bernese gps software
we are very proud to have the opportunity to take part in the development of the so-called bernese gps software. this software package has been developed at the astronomical institute, university of berne, switzerland, since the 1980s. it is used at many institutions around the globe for post-processing gps data with the highest accuracy and for various other purposes – the software is capable of estimating a large number of different kinds of parameters: station coordinates, earth rotation parameters, satellite orbits, parameters of the atmosphere, etc. the software is recognized for the quality of its mathematical model, which ensures the accuracy of the results. it is the know-how of geodesy and celestial mechanics that stands behind the software's success. however, we are convinced that the technical quality of the program, its availability on different computer platforms, and a given level of user-friendliness are of major importance, too. fig. 2 shows a window of the bernese menu system. the bernese gps software is an example of the concordance between geodesy and informatics. it is becoming usual nowadays that many mathematical achievements find their "materialization" in sophisticated software projects.
fig. 2
2.3 gnu gama and other free software projects
talking about software development, free software, a specific software area, is one of the fields that we want our students to get involved in. in order to learn more about the phenomenon of free software, the selected essays of richard m. stallman are probably the best starting point (or one can read the interview [7]). our first major free software project, gnu gama, is dedicated to the adjustment of geodetic networks. it has been presented at various fig meetings ([3], [4]), so we need not describe it in detail here. the beginning of the gama project was influenced by our experience from virtual academy meetings, where it was first presented as a project aimed at motivating our students to get involved in software development and international collaboration.
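as a generic illustration of what a network-adjustment engine of this kind does internally — this is not gama's api, and the tiny levelling network below is invented for the example — the overdetermined observation equations are solved in the least-squares sense and the loop misclosure is distributed over the residuals:

    import numpy as np

    # levelling network: unknown heights h1, h2; benchmark h0 fixed.
    # observed height differences [m]: h1-h0, h2-h1, h0-h2
    # (one redundant loop, misclosure -3 mm).
    h0 = 100.000
    d_obs = np.array([1.234, 0.512, -1.749])
    a = np.array([[ 1.0,  0.0],    # h1      = d1 + h0
                  [-1.0,  1.0],    # h2 - h1 = d2
                  [ 0.0, -1.0]])   # -h2     = d3 - h0
    l = np.array([d_obs[0] + h0, d_obs[1], d_obs[2] - h0])
    x, *_ = np.linalg.lstsq(a, l, rcond=None)
    v = a @ x - l                  # residuals absorb the misclosure
    print("adjusted heights [m]:", x.round(4))
    print("residuals [mm]:", (v * 1000).round(2))

the -3 mm misclosure of the loop comes out evenly split into three 1 mm residuals, which is the essence of any least-squares adjustment of geodetic observations.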
another example of our free software projects was presented last year at the fig working week in athens [5]. the software part of this project was the gps observation database written by jan pytel in close collaboration with prof. kostelecký, and this year we will extend our collaboration to a project to adjust combined solutions from various observation techniques (gps, vlbi, etc.). we believe that the new courses on geoinformatics, with an intensive focus on the theoretical background, will help us to attract more talented students, who will be able to collaborate on software projects of a scientific nature, as described above.
3 the future of geodetic science
to-morrow, and to-morrow, and to-morrow, creeps in this petty pace from day to day. (macbeth – act v, scene v)
from macbeth's point of view, time seems to have passed slowly. nowadays we know the relativistic effects: for people planning the future of their educational facilities, time may run faster than they wish. our present-day knowledge may appear insufficient for tomorrow's needs. how should we deal with this situation? what knowledge will our students need several years after they have graduated? what should we teach them? this is not an easy question. taking into account our inability to estimate the precise needs of the future, we are convinced that we have to concentrate on teaching methods, ways of thinking, and general theories rather than specialized topics. our students should primarily be able to gather and analyze information. and this is actually the bottom line of geoinformatics. without knowledge of our primary science – geodesy – the most breathtaking informatics achievements are useless for us. but, conversely, our discipline (like any modern science) cannot develop without sophisticated information processing. it cannot live without informatics.
from the practical point of view we had, first of all, to follow the framework of the bachelor/master degree programs at the faculty of civil engineering, where bachelor programs last 8 semesters and master degree programs last three semesters. to distinguish clearly between our bachelors and masters, we decided that bachelors should typically be professional users of geoinformatic systems, in contrast to masters, who should become developers, analysts and leading managers of geoinformation systems. one of the major information systems in the czech republic is the cadastral information system. when preparing the bachelor curricula, we decided that education in the cadastre should be given at the same full level as in the existing study branch in geodesy and cartography. another strategic decision was that our bachelors should be prepared for managerial skills, and thus we put substantial emphasis on education in the social sciences. the definition of the structure of the new social science courses was fully in the competence of the head of our department of social sciences, doc. václav liška. with the focus on our main strategic priorities
• mathematics
• social sciences and management
• geodesy
• applied informatics
the courses offered for the bachelor degree program are summarized in table 1.
table 1: courses offered for the bachelor degree program
semester 1: calculus 1; technical geodesy 1; os linux; foreign language; introduction to law; physics 1; introduction to numerical mathematics
semester 2: calculus 2; constructive geometry; foreign language; technical geodesy 2; database systems; rhetoric; physics 2
semester 3: calculus 3; introduction to economics; theoretical geodesy 1; theory of errors and adjustment 1; programming language c++; environmental engineering; practical training in surveying
semester 4: calculus 4; psychology and sociology; theoretical geodesy 2; theory of errors and adjustment 2; project informatics; mathematical cartography; foreign language
semester 5: photogrammetry and remote sensing 1; probability and math. statistics; gis 1; mapping; project – geodesy; real estate law; management psychology
semester 6: photogrammetry and remote sensing 2; cadastre; gis 2; www services; engineering geodesy; electives (4 credits)
semester 7: optimization methods; information systems; image data processing; topographic and thematic cartography; ethics; elective courses (4 credits); pre-diploma project
semester 8: elective courses (16 credits); bachelor diploma work
each bachelor student is required to choose from 25 elective subjects starting from the sixth semester.
the courses offered for the master degree program in geoinformatics are given in table 2.
table 2: courses offered for the master degree program in geoinformatics
semester 1: graph theory; object-oriented programming; statistics – robust methods; elective courses (12 credits)
semester 2: numerical mathematics; project – statistics; project – professional specialization; elective courses (14 credits)
semester 3: social sciences and management; elective courses (4 credits); diploma project
the compulsory courses in the master degree program are accompanied by an offer of 36 elective courses, ranging from tensor calculus and mathematical modeling to space and physical geodesy or combinatorial optimization. we expect most of our bachelors to continue their studies at the master degree level. our program is designed on the basis of this presumption. however, our goal is also to open the new study branch to other fields, namely to graduates of other bachelor programs at our faculty, e.g., environmental engineering (or water engineering and water structures) and system engineering in the building industry (it), besides the existing study branch geodesy and cartography.
in june 2005 the czech ministry of education, youth and sports approved our new curricula, and the first semester in geoinformatics will be opened in 2006. the first semester of the master degree program in geoinformatics will open in 2007. one of our next steps will be to fully harmonize the first two years of the programs in geodesy and cartography with the new geoinformatics study plan.
references
[1] vaníček, p., krakiwsky, e. j.: geodesy: the concepts. north-holland, 1986, 2nd ed., isbn 0-444-87775-4.
[2] enemark, s.: "review of education needs", consultancy to the czech office for surveying, mapping and cadastre, eu phare land registration project, project no. cz 94402-02, february 1998.
[3] čepek, a., pytel, j.: "free software – an inspiration for a virtual academy." fig xxii international congress, washington, d.c., usa, april 19–26, 2002.
[4] čepek, a., pytel, j.: "acyclic visitor pattern in formulation of mathematical model." fig working week, athens, greece, may 22–27, 2004.
[5] kostelecký, j., pytel, j.: "modern geodetic control in the czech republic based on densification of the euref network", fig working week, athens, greece, may 22–27, 2004.
[6] stallman, r. m.: free software, free society (selected essays). gnu press, free software foundation, boston, ma, usa, isbn 1-882114-98-1.
[7] stallman, r. (interview): http://kerneltrap.org/node/4484
[8] rocken, c., mervart, l., lukeš, z., johnson, j., kanzaki, m.: "testing a new network rtk software system." in: proceedings of gnss 2004. institute of navigation, fairfax, 2004, p. 2831–2839.
[9] http://geoinformatika.fsv.cvut.cz/akreditace/ – archive of all accreditation documents submitted to the czech ministry of education, youth and sports.
prof. dr. ing. leoš mervart, drsc.
phone: +420 224 354 805, fax: +420 224 354 343
e-mail: mervart@fsv.cvut.cz
department of geodesy
prof. ing. aleš čepek, csc.
phone: +420 2 2435 4647, fax: +420 2 2435 5419
e-mail: cepek@fsv.cvut.cz
web site: http://gama.fsv.cvut.cz/~cepek
department of mapping and cartography
czech technical university in prague, faculty of civil engineering, thákurova 7, 166 29 prague 6, czech republic
acta polytechnica 62(1):50–55, 2022
https://doi.org/10.14311/ap.2022.62.0050
© 2022 the author(s). licensed under a cc-by 4.0 licence. published by the czech technical university in prague.
generalized three-body harmonic oscillator system: ground state
adrian m. escobar-ruiz∗, fidel montoya
universidad autónoma metropolitana, departamento de física, av. san rafael atlixco 186, 09340 ciudad de méxico, cdmx, méxico
∗ corresponding author: admau@xanum.uam.mx
abstract. in this work we report on a 3-body system in a d−dimensional space r^d with a quadratic harmonic potential in the relative distances r_ij = |r_i − r_j| between particles. our study considers unequal masses, different spring constants and it is defined in the three-dimensional (sub)space of solutions characterized (globally) by zero total angular momentum. this system is exactly solvable with hidden algebra sℓ_4(r). it is shown that in some particular cases the system becomes maximally (minimally) superintegrable. we pay special attention to a physically relevant generalization of the model where eventually the integrability is lost. in particular, the ground state and the first excited state are determined within a perturbative framework.
keywords: three-body system, exact-solvability, hidden algebra, integrability.
1. introduction
the two-body harmonic oscillator, i.e. two particles with masses m_1 and m_2 interacting via the translationally invariant potential V ∝ |r_i − r_j|^2, appears in every textbook on classical mechanics. in an arbitrary d−dimensional euclidean space r^d this system admits separation of variables in the center-of-mass and relative coordinates as well as exact solvability. the relevance of such a system is obvious: any scalar potential U = U(|r_i − r_j|) can be approximated by the two-body harmonic oscillator.
in this case, the center-of-mass and relative coordinates are nothing but the normal coordinates. therefore, in the n-body case of n > 2 particles interacting by a quadratic pairwise potential it is natural to ask the question about the existence of normal coordinates and the corresponding explicit exact solutions. interestingly, even for the three-body case n = 3 a complete separation of variables cannot be achieved in full generality.
starting in 1935, the quantum n−body problem in r^3 was studied by zernike and brinkman [1] using the so-called hyperspherical-harmonic expansion. two decades later, this method, possessing an underlying group-theoretical nature, was revisited and refined in the papers by delves [2] and smith [3]. nevertheless, in practice the success of the method is limited to the case of highly symmetric systems, namely identical particles with equal masses and equal spring constants.
in a previous work [4], the most general quantum system of a three-body chain of harmonic oscillators, in r^d, was explored exhaustively. for arbitrary masses and spring constants this problem possesses spherical symmetry. it implies that the total angular momentum is a well-defined observable, which allows one to reduce effectively the number of degrees of freedom in the corresponding schrödinger equation governing the states with zero angular momentum. in the sector of vanishing angular momentum, it turns out that this three-body quantum system is exactly solvable. the hidden algebra sℓ(4,r) responsible for the exact solvability was exhibited in [4] using the ρ-representation. in the present work we consider a physically relevant generalization of the model where eventually the integrability properties are lost. again, in our analysis we assume a system of arbitrary masses and spring constants with the total angular momentum identically zero.
in the current study we revisit the algebraic structure and solvability of the quantum 3-body oscillator system in the special set of coordinates appearing in [5], [6]. afterwards, a physically motivated generalization of the model is considered. the goal of the paper is two-fold. firstly, in the (sub)space of zero total angular momentum we will describe the reduced hamiltonian operator, which admits a hidden sℓ(4, r) algebraic structure, hence allowing exact analytical eigenfunctions. especially, at any d ≥ 1 the existence of an exactly-solvable model that depends solely on the moment of inertia of the system is demonstrated. this model admits a quasi-exactly-solvable extension as well. secondly, we explore a physically relevant generalization of the model. approximate solutions of the problem are presented just for the case of equal masses, in the framework of standard perturbation theory complemented by the variational method. the first excited state, and thus the energy gap of the system, is briefly discussed.
2. generalities
the quantum hamiltonian in r^d (d > 1) for three nonrelativistic spinless particles with masses m_1, m_2, m_3 and a translationally invariant potential is given by

H \;=\; -\sum_{i=1}^{3} \frac{1}{2\,m_i}\,\Delta_i^{(d)} \;+\; V(r_{12},\, r_{13},\, r_{23}) ,   (1)
(ℏ = 1), see e.g. [4, 5], where \Delta_i^{(d)} stands for the individual laplace operator of the i-th mass with d−dimensional position vector \mathbf{r}_i, and

r_{ij} \;=\; |\mathbf{r}_i - \mathbf{r}_j| ,   (2)

(i, j = 1, 2, 3) is the relative (mutual) distance between the bodies i and j. the eigenfunctions of (1) which solely depend on the ρ-variables, \rho_{ij} = r_{ij}^2, are governed by a three-dimensional reduced hamiltonian [4]

H_{\rm rad} \;\equiv\; -\Delta_{\rm rad} \;+\; V(\rho) ,   (3)

where

\Delta_{\rm rad} \;=\; \frac{2}{\mu_{12}}\,\rho_{12}\,\partial^2_{\rho_{12}} + \frac{2}{\mu_{13}}\,\rho_{13}\,\partial^2_{\rho_{13}} + \frac{2}{\mu_{23}}\,\rho_{23}\,\partial^2_{\rho_{23}} + \frac{2\,(\rho_{13}+\rho_{12}-\rho_{23})}{m_1}\,\partial^2_{\rho_{13},\,\rho_{12}} + \frac{2\,(\rho_{13}+\rho_{23}-\rho_{12})}{m_3}\,\partial^2_{\rho_{13},\,\rho_{23}} + \frac{2\,(\rho_{23}+\rho_{12}-\rho_{13})}{m_2}\,\partial^2_{\rho_{23},\,\rho_{12}} + \frac{d}{\mu_{12}}\,\partial_{\rho_{12}} + \frac{d}{\mu_{13}}\,\partial_{\rho_{13}} + \frac{d}{\mu_{23}}\,\partial_{\rho_{23}} ,   (4)

c.f. [5], and

\mu_{ij} \;=\; \frac{m_i\, m_j}{m_i + m_j}

denotes a reduced mass. the operator (3) describes the three-dimensional (radial) dynamics in the variables ρ_12, ρ_13, ρ_23. this operator H_rad is, in fact, equivalent to a schrödinger operator, see [4]. we call it the three-dimensional (radial) hamiltonian. all the d−dependence in (3) occurs in the coefficients in front of the first derivatives.
2.1. case of identical particles: τ-representation
now, let us consider the case of identical masses, m_1 = m_2 = m_3 = 1, thus µ_ij = 1/2, and the operator (4) is s_3 permutationally invariant in the ρ-variables. it suggests the change of variables ρ ↔ τ, where

\tau_1 = \rho_{12} + \rho_{13} + \rho_{23} , \qquad \tau_2 = \rho_{12}\,\rho_{13} + \rho_{12}\,\rho_{23} + \rho_{13}\,\rho_{23} , \qquad \tau_3 = \rho_{12}\,\rho_{13}\,\rho_{23}   (5)

are nothing but the lowest elementary symmetric polynomials in the ρ-coordinates. in these variables (5), the coefficients of the operator Δ_rad are again polynomials; hence, this operator is algebraic in both representations. explicitly,

\Delta_{\rm rad} \;=\; 6\,\tau_1\,\partial^2_{1} + 2\,\tau_1\,(7\,\tau_2 - \tau_1^2)\,\partial^2_{2} + 2\,\tau_3\,(6\,\tau_2 - \tau_1^2)\,\partial^2_{3} + 24\,\tau_2\,\partial^2_{1,2} + 36\,\tau_3\,\partial^2_{1,3} + 2\,\big[\, 9\,\tau_3\,\tau_1 + 4\,\tau_2\,(\tau_2 - \tau_1^2) \,\big]\,\partial^2_{2,3} + 6\,d\,\partial_{1} + 2\,(2d+1)\,\tau_1\,\partial_{2} + 2\,\big[\, (d+4)\,\tau_2 - \tau_1^2 \,\big]\,\partial_{3} ,   (6)

where ∂_i ≡ ∂_{τ_i}, i = 1, 2, 3.
3. laplace-beltrami operator
now, as a result of calculations it is convenient to consider the following gauge factor

\Gamma^{4} \;=\; \big( S^{2}_{\triangle} \big)^{\,2-d}\; I ,   (7)

where, with m = m_1 + m_2 + m_3,

S^{2}_{\triangle} \;=\; \frac{2\rho_{12}\,\rho_{13} + 2\rho_{12}\,\rho_{23} + 2\rho_{23}\,\rho_{13} - \rho_{12}^2 - \rho_{13}^2 - \rho_{23}^2}{16} , \qquad I \;=\; \frac{m_1 m_2\,\rho_{12} + m_1 m_3\,\rho_{13} + m_2 m_3\,\rho_{23}}{m}

possess a geometrical meaning: the term S²_△ is the area (squared) of the triangle formed by the position vectors of the three bodies, whilst the term I is the moment of inertia of the system with respect to its center of mass. the radial operator H_rad (3) is gauge-transformed into a truly schrödinger operator [4],

H_{\rm LB} \;\equiv\; \Gamma^{-1}\, H_{\rm rad}\, \Gamma \;=\; -\Delta_{\rm LB} + V + V^{\rm (eff)} ;   (8)

here Δ_LB stands for the laplace-beltrami operator

\Delta_{\rm LB}(\rho) \;=\; \sqrt{|g|}\;\partial_{\mu}\,\frac{1}{\sqrt{|g|}}\; g^{\mu\nu}\,\partial_{\nu} , \qquad (\nu, \mu = 1, 2, 3)

with ∂_1 = ∂/∂ρ_12, ∂_2 = ∂/∂ρ_13, ∂_3 = ∂/∂ρ_23. the corresponding co-metric in Δ_LB(ρ) reads

g^{\mu\nu} \;=\; \begin{pmatrix} \frac{2}{\mu_{12}}\,\rho_{12} & \frac{\rho_{13}+\rho_{12}-\rho_{23}}{m_1} & \frac{\rho_{23}+\rho_{12}-\rho_{13}}{m_2} \\ \frac{\rho_{13}+\rho_{12}-\rho_{23}}{m_1} & \frac{2}{\mu_{13}}\,\rho_{13} & \frac{\rho_{13}+\rho_{23}-\rho_{12}}{m_3} \\ \frac{\rho_{23}+\rho_{12}-\rho_{13}}{m_2} & \frac{\rho_{13}+\rho_{23}-\rho_{12}}{m_3} & \frac{2}{\mu_{23}}\,\rho_{23} \end{pmatrix} .

its determinant

|g| \;\equiv\; \det g^{\mu\nu} \;=\; \frac{32\, m^2}{m_1^2\, m_2^2\, m_3^2}\; I\; S^{2}_{\triangle}   (9)

admits factorization and is positive definite. the term V^(eff) denotes an effective potential

V^{\rm (eff)} \;=\; \frac{3}{8}\,\frac{1}{I} \;+\; \frac{(d-2)(d-4)}{32}\,\frac{m\, I}{m_1\, m_2\, m_3\; S^{2}_{\triangle}} ,

which depends on the two variables I and S²_△ alone. thus, the underlying geometry of the system emerges.
figure 1. 3-body chain of harmonic oscillators.
the classical analogue of the quantum hamiltonian operator (8) describes an effective non-relativistic classical particle in a three-dimensional curved space. explicitly, the hamiltonian function takes the form

H^{\rm (classical)}_{\rm LB} \;=\; g^{\mu\nu}\, \pi_{\mu}\, \pi_{\nu} \;+\; V ,   (10)

where π_µ, µ = 12, 23, 13, are the canonical momenta conjugate to the ρ-coordinates.
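the factorized determinant (9) can be verified symbolically; the following snippet, written for this text under the reading of the co-metric given above, checks the identity for arbitrary masses:

    import sympy as sp

    m1, m2, m3 = sp.symbols('m1 m2 m3', positive=True)
    p12, p13, p23 = sp.symbols('rho12 rho13 rho23', positive=True)
    mu = lambda a, b: a*b/(a + b)          # reduced masses mu_ij
    g = sp.Matrix([
        [2*p12/mu(m1, m2), (p12 + p13 - p23)/m1, (p12 + p23 - p13)/m2],
        [(p12 + p13 - p23)/m1, 2*p13/mu(m1, m3), (p13 + p23 - p12)/m3],
        [(p12 + p23 - p13)/m2, (p13 + p23 - p12)/m3, 2*p23/mu(m2, m3)]])
    M = m1 + m2 + m3
    S2 = (2*p12*p13 + 2*p12*p23 + 2*p13*p23
          - p12**2 - p13**2 - p23**2)/16   # squared area of the triangle
    Imom = (m1*m2*p12 + m1*m3*p13 + m2*m3*p23)/M   # moment of inertia
    # the difference simplifies to zero, confirming (9)
    print(sp.simplify(g.det() - 32*M**2/(m1*m2*m3)**2 * Imom * S2))

a quick spot check with m_1 = 2, m_2 = m_3 = 1 and ρ_12 = ρ_13 = ρ_23 = 1 gives det g^{µν} = 30 on both sides of (9).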
the hamilton-jacobi equation at vanishing potential, v = 0 (free motion), is clearly integrable. however, a complete separation of variables is absent in the ρ-representation. the poisson bracket between the kinetic energy T = g^{µν} π_µ π_ν and the function linear in the momentum variables

L^{(c)}_{1} \;=\; (\rho_{13}-\rho_{23})\,\pi_{12} \;+\; (\rho_{23}-\rho_{12})\,\pi_{13} \;+\; (\rho_{12}-\rho_{13})\,\pi_{23}

is zero.
4. three-body harmonic oscillator system
in the spectral problem with hamiltonian (3) we take the harmonic potential

V^{\rm (ho)}(\rho) \;=\; 2\,\omega^{2}\,\big[\, \nu_{12}\,\rho_{12} + \nu_{13}\,\rho_{13} + \nu_{23}\,\rho_{23} \,\big] ,   (11)

where ω > 0 is a frequency and ν_12, ν_13, ν_23 > 0 are constants with the dimension of mass. this problem can be solved exactly [4]. in particular, in ρ-space the reduced operator (3) possesses multivariate polynomial eigenfunctions, see below. we call the above potential V^(ho)(ρ) the 3-body oscillator system. we mention that in the case d = 1 (3 particles on a line), the corresponding spectral problem was studied in the paper [7]. in the current report, we analyze the d−dimensional case with d > 1. in r-variables, ρ = r², the potential (11) can be interpreted as a three-dimensional (an)isotropic one-body oscillator. it is displayed in figure 1. the configuration space is a subspace of the cube r³₊(ρ) in the e³ ρ-space. the ρ-variables must obey the "triangle condition" S²_△ ⩾ 0, namely, the area of the triangle formed by the position vectors of the bodies is always positive.
4.1. solution for the ground state
in the harmonic potential (11), the ground state eigenfunction reads

\Psi^{\rm (ho)}_{0} \;=\; e^{\,-\,\omega\,(\, a_1\,\mu_{12}\,\rho_{12} \,+\, a_2\,\mu_{13}\,\rho_{13} \,+\, a_3\,\mu_{23}\,\rho_{23} \,)} ,   (12)

where the parameters a_1, a_2, a_3 ≥ 0 are introduced for convenience; they define the spring constants, see below. the associated ground state energy

E_{0} \;=\; \omega\, d\, (a_1 + a_2 + a_3)   (13)

is mass-independent. there exist the following algebraic relations:

\nu_{12} \;=\; a_1^2\,\mu_{12} \;+\; a_1 a_2\,\frac{\mu_{12}\,\mu_{13}}{m_1} \;+\; a_1 a_3\,\frac{\mu_{12}\,\mu_{23}}{m_2} \;-\; a_2 a_3\,\frac{\mu_{13}\,\mu_{23}}{m_3} ,

\nu_{13} \;=\; a_2^2\,\mu_{13} \;+\; a_1 a_2\,\frac{\mu_{12}\,\mu_{13}}{m_1} \;+\; a_2 a_3\,\frac{\mu_{13}\,\mu_{23}}{m_3} \;-\; a_1 a_3\,\frac{\mu_{12}\,\mu_{23}}{m_2} ,

\nu_{23} \;=\; a_3^2\,\mu_{23} \;+\; a_1 a_3\,\frac{\mu_{12}\,\mu_{23}}{m_2} \;+\; a_2 a_3\,\frac{\mu_{13}\,\mu_{23}}{m_3} \;-\; a_1 a_2\,\frac{\mu_{12}\,\mu_{13}}{m_1} .

5. lie algebraic structure
using the function Ψ₀^(ho) (12) as a gauge factor, the transformed hamiltonian H_rad (3),

h^{\rm (algebraic)} \;\equiv\; \big( \Psi^{\rm (ho)}_{0} \big)^{-1}\, \big[ -\Delta_{\rm rad} + V - E_{0} \big]\, \Psi^{\rm (ho)}_{0} ,   (14)

is an algebraic operator, i.e. its coefficients are polynomials in the ρ-variables; here E_0 is taken from (13). in addition, this algebraic operator (14) is of lie-algebraic nature. it admits a representation in terms of the generators

J^{-}_{i} \;=\; \frac{\partial}{\partial y_i} , \qquad J^{0}_{ij} \;=\; y_i\,\frac{\partial}{\partial y_j} , \qquad J^{0}(n) \;=\; \sum_{i=1}^{3} y_i\,\frac{\partial}{\partial y_i} \;-\; n , \qquad J^{+}_{i}(n) \;=\; y_i\, J^{0}(n) \;=\; y_i \left( \sum_{j=1}^{3} y_j\,\frac{\partial}{\partial y_j} \;-\; n \right) ,

(i, j = 1, 2, 3) of the algebra sℓ(4,r), see [8, 9]; here n is a constant. the notation y_1 = ρ_12, y_2 = ρ_13, y_3 = ρ_23 was employed for simplicity. if n is a non-negative integer, a finite-dimensional representation space takes place,

V_{n} \;=\; \big\langle\, y_1^{n_1}\, y_2^{n_2}\, y_3^{n_3} \;\big|\; 0 \le n_1 + n_2 + n_3 \le n \,\big\rangle .   (15)

6. relation with the jacobi oscillator
now, we can indicate an emergent relation between the harmonic potential (11) and the jacobi oscillator system

H^{\rm (jacobi)} \;\equiv\; \sum_{i=1}^{2} \left[ -\frac{\partial^2}{\partial \mathbf{z}_i \cdot \partial \mathbf{z}_i} \;+\; 4\,\lambda_i\,\omega^{2}\, \mathbf{z}_i \cdot \mathbf{z}_i \right] ,   (16)

where ω > 0, λ_1, λ_2 ≥ 0, and

\mathbf{z}_1 \;=\; \sqrt{\frac{m_1\, m_2}{m_1 + m_2}}\; (\mathbf{r}_1 - \mathbf{r}_2) , \qquad \mathbf{z}_2 \;=\; \sqrt{\frac{(m_1 + m_2)\, m_3}{m_1 + m_2 + m_3}}\, \left( \mathbf{r}_3 \;-\; \frac{m_1\,\mathbf{r}_1 + m_2\,\mathbf{r}_2}{m_1 + m_2} \right)

are standard jacobi variables, see e.g. [10]. this hamiltonian describes two decoupled harmonic oscillators in flat space, see [6]. consequently, it is an exactly-solvable problem: the complete spectrum and the eigenfunctions can be calculated by purely algebraic means.
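for concreteness, here is a direct transcription of these jacobi variables (the masses and positions below are arbitrary sample values, not data from the paper):

    import numpy as np

    def jacobi_coords(m, r):
        """m: (3,) masses; r: (3, d) position vectors -> z1, z2 as above."""
        m1, m2, m3 = m
        z1 = np.sqrt(m1*m2/(m1 + m2)) * (r[0] - r[1])
        cm12 = (m1*r[0] + m2*r[1])/(m1 + m2)           # center of mass of 1-2
        z2 = np.sqrt((m1 + m2)*m3/(m1 + m2 + m3)) * (r[2] - cm12)
        return z1, z2

    m = np.array([1.0, 2.0, 3.0])
    r = np.array([[0.0, 0.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0]])
    z1, z2 = jacobi_coords(m, r)
    print(z1, z2)
    # in these coordinates the relative kinetic energy is diagonal, which is
    # precisely what makes (16) a pair of decoupled oscillators.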
the solutions of the jacobi oscillator that solely depend on the jacobi distances z_i = |z_i| are governed by the operator

H^{\rm (jacobi)}_{\rm rad} \;=\; \sum_{i=1}^{2} \left[ -\frac{\partial^2}{\partial z_i\, \partial z_i} \;-\; \frac{(d-1)}{z_i}\,\frac{\partial}{\partial z_i} \right] \;+\; 4\,\lambda_1\,\omega^{2}\, z_1^{2} \;+\; 4\,\lambda_2\,\omega^{2}\, z_2^{2} .   (17)

in this case, the associated hidden algebra is given by sℓ(2) ⊗ sℓ(2), which acts on the two-dimensional space (z_1, z_2). in particular, the eigenfunctions of H^(jacobi) (16) can be employed to construct approximate solutions for the n-body problem; for this discussion see [10]. assuming the two conditions

\frac{m_2}{m_3} \;=\; \frac{\nu_{12}}{\nu_{13}} , \qquad \frac{m_1}{m_2} \;=\; \frac{\nu_{13}}{\nu_{23}}

in the harmonic oscillator potential V^(ho) (11), we obtain

U^{\rm (ho)}_{J} \;\equiv\; 4\,\lambda_1\,\omega^{2}\, z_1^{2} + 4\,\lambda_2\,\omega^{2}\, z_2^{2} \;=\; 2\,\omega^{2}\,\big[\, \nu_{12}\,\rho_{12} + \nu_{13}\,\rho_{13} + \nu_{23}\,\rho_{23} \,\big] \;=\; V^{\rm (ho)}   (18)

with

\lambda_1 \;=\; \lambda_2 \;=\; \frac{m_1 + m_2 + m_3}{2\, m_1\, m_3}\; \nu_{13} ;

hence, in this case the three-body oscillator potential coincides with the two-body jacobi oscillator potential. in fact, imposing the single condition m_2 ν_13 = m_3 ν_12, the equality (18) is still valid, but λ_1 ≠ λ_2 and the system is no longer maximally superintegrable.
6.1. identical particles: hyperradius
a remarkable simplification occurs in the case of three identical particles with the same common spring constant, namely

m_1 = m_2 = m_3 = 1 , \qquad a_1 = a_2 = a_3 \equiv a .   (19)

thus, the potential (11) reduces to

V^{\rm (ho)} \;=\; \frac{3}{2}\, a^{2}\,\omega^{2}\, (\rho_{12} + \rho_{13} + \rho_{23}) \;=\; \frac{3}{2}\, a^{2}\,\omega^{2}\, \tau_1 .

consequently, the ground state solutions (12) and (13) read

\Psi^{(3a)}_{0} \;=\; e^{\,-\frac{\omega}{2}\, a\, (\rho_{12} + \rho_{13} + \rho_{23})} \;=\; e^{\,-\frac{\omega}{2}\, a\, \tau_1} ,   (20)

E_{0} \;=\; 3\,\omega\, d\, a ,   (21)

respectively. moreover, from (6) it follows that in this case there exists an infinite family of eigenfunctions

\psi_{n}(\tau_1) \;=\; e^{\,-\frac{1}{2}\, a\,\omega\,\tau_1}\; L^{(d-1)}_{n}(a\,\omega\,\tau_1) ,

with energies

e_{n} \;=\; 3\, a\,\omega\, (d + 2\,n) , \qquad n = 0, 1, 2, 3, \ldots ,

that depend solely on the variable τ_1, the so-called hyperradius; here L^{(d-1)}_{n}(x) denotes the generalized laguerre polynomial. these solutions are associated with a hidden sℓ(2,r) lie algebra.
6.2. arbitrary masses: moment of inertia
a generalization of the results presented in section 6.1 can be derived from the decomposition of Δ_rad (4),

\Delta_{\rm rad} \;=\; \Delta_{I} \;+\; \tilde{\Delta} ,   (22)

where Δ_I = Δ_I(I) is an algebraic operator for arbitrary d ≥ 1. it depends on the moment of inertia I only. explicitly, we have

\Delta_{I} \;=\; 2\, I\, \partial^{2}_{I,I} \;+\; 2\, d\, \partial_{I} .   (23)

the operator Δ̃ = Δ̃(I, q_1, q_2) depends on I and on two more (arbitrary) variables q_1, q_2, for which the coordinate transformation {ρ_ij} → {I, q_1, q_2} is invertible (not singular). since such an operator Δ̃ annihilates any function f = f(I), i.e. Δ̃ f = 0, the splitting (22) indicates that for any potential of the form

V \;=\; V(I) ,   (24)

the eigenvalue problem for the operator H_rad = −Δ_rad + V is further reduced to a one-dimensional spectral problem, namely

\big[ -\Delta_{I} + V(I) \big]\,\psi \;=\; E\,\psi ,   (25)

which can be called the I−representation. in the case of equal masses, m_1 = m_2 = m_3, the coordinate I is proportional to the hyperspherical radius (hyperradius). also, H_I (25) is gauge-equivalent to a one-dimensional schrödinger operator.
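as an independent consistency check (written for this text) of the τ_1-family of section 6.1, one can apply the τ_1-restriction of the operator (6) to ψ_n symbolically and read off the eigenvalue:

    import sympy as sp

    tau, a, w, d = sp.symbols('tau1 a omega d', positive=True)
    n = 2                             # any non-negative integer works here
    psi = sp.exp(-a*w*tau/2) * sp.assoc_laguerre(n, d - 1, a*w*tau)
    # restriction of (6) to functions of tau1 alone: 6*tau1*d^2/dtau1^2 + 6*d*d/dtau1
    h_psi = -(6*tau*sp.diff(psi, tau, 2) + 6*d*sp.diff(psi, tau)) \
            + sp.Rational(3, 2)*a**2*w**2*tau*psi
    print(sp.simplify(h_psi/psi))     # -> 3*a*omega*(d + 4), i.e. e_n with n = 2

the ratio collapses to a constant, confirming that ψ_n solves the τ_1-reduced spectral problem with e_n = 3aω(d + 2n).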
the relevance of $V^{(r)}$ comes from the fact that any arbitrary potential $V = V(r_{ij})$ can be approximated, near its equilibrium points, by this generalized 3-body harmonic potential. however, the existence of non-trivial exact solutions is far from evident: even for the most symmetric case of equal masses and equal spring constants, we were not able to find a hidden lie algebra in the corresponding spectral problem (3). moreover, at the classical level such a system is chaotic. this can easily be seen by computing the average lyapunov exponent in the space of parameters (h, m1), see figure 2, where h is the value of the classical hamiltonian (energy) with the potential $V^{(r)}$ (26).

also, for one-dimensional systems a classical orbit is said (see [11]) to be pt-symmetric if the orbit remains unchanged upon replacing $x(t)$ by $-x^*(-t)$. there are several classes of complex pt-symmetric non-hermitian quantum-mechanical hamiltonians whose eigenvalues are real and whose time evolution is unitary [12, 13]. however, while the corresponding quantum three-body oscillator hamiltonian is hermitian, it can still have interesting complex classical trajectories.

7.1. identical particles
in order to simplify the problem, one can consider the simplest case of equal masses and equal spring constants (19) with ω = 1. we will also assume equal rest lengths $r_{12} = r_{13} = r_{23} = r > 0$.

figure 3. ground state energy of the generalized 3-body harmonic oscillator vs r for different values of the parameter a, which defines the spring constant, see text. the solid lines correspond to the variational result, whilst the dashed ones refer to the value calculated by perturbation theory up to first order.

in this case, approximate solutions of the schrödinger equation can be obtained using perturbation theory in powers of r.

7.1.1. ground state
taking the r-dependent terms in (26) as a small perturbation, the first correction $E_{1,0}$ to the ground state energy takes the form
$$E_{1,0} = \frac{3a}{2\pi}\left(3\pi\,a\,r^2 - 4\,r\sqrt{6\pi a}\right).$$
the domain of validity of this perturbative approach is estimated by means of the variational method. the use of the simple trial function
$$\psi^{\mathrm{trial}}_0 = e^{-\frac{\omega}{2}a\,\alpha\,(\rho_{12} + \rho_{13} + \rho_{23})},$$
c.f. (20), where α is a variational parameter to be fixed by the minimization procedure, leads to the results shown in figure 3.

7.1.2. first excited state
it is important to mention that for the 3-body harmonic oscillator (r = 0) the exact first excited state possesses a degeneracy equal to 3. for r > 0, perturbation theory partially breaks this degeneracy. the energy of the approximate first excited state calculated by perturbation theory, up to first order, is displayed in figure 4.

8. conclusions
in this report, for a 3-body harmonic oscillator in the euclidean space $\mathbb{R}^d$, we consider the schrödinger operator in the ρ-variables $\rho_{ij} = r_{ij}^2$,
$$h_{\mathrm{LB}} = -\Delta_{\mathrm{LB}}(\rho_{ij}) + V^{(\mathrm{ho})}(\rho_{ij}) + V^{(\mathrm{eff})}(\rho_{ij}), \qquad (27)$$
where the kinetic energy corresponds to a 3-dimensional particle moving in a non-flat space.

figure 4. first excited state of the generalized 3-body harmonic oscillator vs r.

the schrödinger operator (27) governs the s-state solutions of the original three-body system (1); in particular, it includes the ground state.
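as a concrete illustration of the chaos diagnostic mentioned above, the sketch below estimates a finite-time lyapunov exponent from the divergence of two nearby classical trajectories. it is a toy version, not the computation behind figure 2: for simplicity the three bodies move on a line (d = 1), while the masses, ν's, ω and rest lengths follow the figure-2 values; the integration time, initial condition and perturbation size are arbitrary choices, and no renormalization is performed.

```python
import numpy as np
from scipy.integrate import solve_ivp

omega, nu, r0 = 1.0, 1.0, 1.0
m = np.array([1.0, 1.0, 1.0])
pairs = [(0, 1), (0, 2), (1, 2)]

def rhs(t, y):
    """hamiltonian flow for V = 2 w^2 sum nu (|x_i - x_j| - r0)^2, d = 1."""
    x, p = y[:3], y[3:]
    f = np.zeros(3)
    for i, j in pairs:
        dx = x[i] - x[j]
        g = 4.0 * omega**2 * nu * (abs(dx) - r0) * np.sign(dx)  # dV/dx_i
        f[i] -= g
        f[j] += g
    return np.concatenate([p / m, f])

def lyapunov_estimate(y0, t_max=200.0, eps=1e-8):
    yb0 = y0.copy()
    yb0[0] += eps                              # tiny kick on one coordinate
    ya = solve_ivp(rhs, (0.0, t_max), y0,  rtol=1e-10, atol=1e-12).y[:, -1]
    yb = solve_ivp(rhs, (0.0, t_max), yb0, rtol=1e-10, atol=1e-12).y[:, -1]
    # crude two-trajectory estimate; keep t_max moderate to avoid saturation
    return np.log(np.linalg.norm(ya - yb) / eps) / t_max

print(lyapunov_estimate(np.array([0.0, 1.3, 2.9, 0.1, -0.2, 0.05])))
```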
this implies that the solutions of the corresponding eigenvalue problem depend solely on three coordinates, contrary to the (3d)-dimensional schrödinger equation. the reduced hamiltonian $h_{\mathrm{LB}}$ is a hermitian operator for which the variational method can be implemented more easily (the energy functional is a 3-dimensional integral only). the classical analogue of (27) was presented as well. the operator (27) is, up to a gauge rotation, equivalent to an algebraic operator with hidden algebra $s\ell(4,\mathbb{R})$, thus becoming a lie-algebraic operator.

in the case of identical masses and equal frequencies, the aforementioned model was generalized to a 3-body harmonic system with a non-zero rest length r > 0. in this case, no hidden algebra nor exact solutions seem to occur. an indication of the loss of integrability is the fact that the classical counterpart of this model exhibits chaotic motion. using perturbation theory complemented by the variational method, it was shown that the ground state energy vs r develops a global minimum, hence defining a configuration of equilibrium.

acknowledgements
amer thanks the support through the programa especial de apoyo a la investigación 2021, uam-i. fm and amer are thankful to mario alan quiroz for helping with the numerical computations.

references
[1] f. zernike, h. c. brinkman. hypersphärische funktionen und die in sphärischen bereichen orthogonalen polynome. koninklijke nederlandse akademie van wetenschappen 38:161–170, 1935.
[2] l. m. delves. tertiary and general-order collisions (i). nuclear physics 9(3):391–399, 1958. https://doi.org/10.1016/0029-5582(58)90372-9.
[3] f. smith. a symmetric representation for three-body problems. i. motion in a plane. journal of mathematical physics 3(4):735, 1962. https://doi.org/10.1063/1.1724275.
[4] a. turbiner, w. miller jr, m. a. escobar-ruiz. three-body problem in d-dimensional space: ground state, (quasi)-exact-solvability. journal of mathematical physics 59(2):022108, 2018. https://doi.org/10.1063/1.4994397.
[5] a. turbiner, w. miller jr, m. a. escobar-ruiz. three-body problem in 3d space: ground state, (quasi)-exact-solvability. journal of physics a: mathematical and theoretical 50(21):215201, 2017. https://doi.org/10.1088/1751-8121/aa6cc2.
[6] w. miller jr, a. turbiner, m. a. escobar-ruiz. the quantum n-body problem in dimension d ≥ n − 1: ground state. journal of physics a: mathematical and theoretical 51(20):205201, 2018. https://doi.org/10.1088/1751-8121/aabb10.
[7] f. m. fernández. born-oppenheimer approximation for a harmonic molecule. arxiv:0810.2210v2.
[8] a. v. turbiner. quasi-exactly-solvable problems and the sl(2,r) algebra. communications in mathematical physics 118:467–474, 1988. https://doi.org/10.1007/bf01466727.
[9] a. v. turbiner. one-dimensional quasi-exactly-solvable schrödinger equations. physics reports 642:1–71, 2016. https://doi.org/10.1016/j.physrep.2016.06.002.
[10] l. m. delves. tertiary and general-order collisions (ii). nuclear physics 20:275–308, 1960. https://doi.org/10.1016/0029-5582(60)90174-7.
[11] c. m. bender, j.-h. chen, d. w. darg, k. a. milton. classical trajectories for complex hamiltonians. journal of physics a: mathematical and general 39(16):4219–4238, 2006. https://doi.org/10.1088/0305-4470/39/16/009.
[12] c. m. bender, s. boettcher. real spectra in non-hermitian hamiltonians having pt symmetry. physical review letters 80(24):5243–5246, 1998. https://doi.org/10.1103/physrevlett.80.5243.
[13] p. dorey, c. dunning, r. tateo. supersymmetry and the spontaneous breakdown of pt symmetry.
journal of physics a: mathematical and general 34(28):l391, 2001. https://doi.org/10.1088/0305-4470/34/28/102.

https://doi.org/10.14311/ap.2022.62.0549
acta polytechnica 62(5):549–557, 2022
© 2022 the author(s). licensed under a cc-by 4.0 licence. published by the czech technical university in prague.

determination of fan design parameters for light-sport aircraft
jan klesa
czech technical university, faculty of mechanical engineering, department of aerospace engineering, karlovo náměstí 13, prague 2, czech republic
correspondence: jan.klesa@fs.cvut.cz

abstract. this paper is focused on the preliminary design of an electric fan for light-sport aircraft. the usage of electric motors brings some advantages compared to piston engines, especially the small size and the independence of power on shaft rpm. a 1d compressible fluid flow model is used for the determination of the performance. the influence of various system parameters is analysed, and results for the case of the ul-39 ultralight aircraft are presented. finally, input parameters for the fan design are determined according to this analysis. these can then be used as input data for the standard fan (axial compressor) design procedure.

keywords: ducted fan, electric propulsion, axial compressor.

1. introduction
electric propulsion for aircraft became a research topic in recent years, motivated by the huge progress in low-weight electric power systems. electric flight was already developed in the 1960s for radio-controlled model aircraft, e.g., the work of fred militky [1]. the progress in battery technology (from nicd to lithium-based batteries), electric motors (from simple dc brush motors with ferrite magnets, later neodymium magnets, to today's brushless dc motors) and control electronics led to an increase in model performance and size. this led to the possibility of building manned electric aircraft in the last decade, e.g., the projects of airbus, pipistrel, extra, jihlavan, etc. today, the technology is advanced enough to build a small fully-electric aircraft. electric propulsion brings some advantages, especially a possible drag reduction due to the lower volume and cross-section of electric motors in comparison with turboprop and piston engines. this allows a smaller nacelle (for multi-engine aircraft) and a better fuselage nose shape (for single-engine aircraft). however, cooling the electric components requires relatively large cooling systems because of the low temperature difference.
the main disadvantage remains the source of the electric energy. batteries are relatively heavy and have a low energy density compared with aircraft fuel [1]. another problem is the long time necessary for recharging the batteries between flights; refuelling is usually much faster and does not require a high-power electric line connection at the airport. thus, some hybrid system using standard aircraft fuel (e.g., jet-a1) or hydrogen is necessary for a long-range/high-endurance aircraft. in this case, electricity is generated onboard by an electric generator powered by a turboshaft engine or apu. energy can then be stored in a high-energy-density medium, but the overall efficiency is lower due to the chain of necessary energy transformations. both systems are under development for use in aviation, e.g., honeywell [2] or rolls-royce [3].

a ducted fan is used on electric-powered "jet" aircraft, e.g., the airbus e-fan. a ducted fan allows the transformation of electric energy into propulsive thrust at high flight velocities, where a propeller is inefficient. it became a dynamic research area in recent years due to the efforts to build electric or hybrid-electric transport aircraft. however, ducted fans or ducted propellers have a lower performance at low flight speeds for multiple reasons:
• higher outlet velocity, which causes a lower propulsive efficiency,
• higher losses in the propulsion system due to the friction at the duct walls,
• higher fuselage (nacelle) drag,
• higher drag when flying with the engine off-regime.
but there is also some motivation for fan-powered low-speed aircraft, which can have various advantages:
• safety, because the rotating parts are covered by the duct, so the risk of damage or injuries can be lower than for a conventional propeller,
• possible noise reduction,
• "jet feeling" – fan-powered aircraft can be used for low-cost training of jet pilots.

the preliminary design and a comparison of a ducted fan with a propeller was presented in [4]. this paper is based on the experience with the long development of the ul-39 aircraft at the department of aerospace engineering of the czech technical university in prague. a more general approach with a compressible fluid flow model is used, which means that this approach can also be used for much faster aircraft than the ul-39.

figure 1. propulsion system scheme (electric motor; inflow velocity v0, outlet velocity v3; planes 1, 2, 3).

table 1. section definitions:
  0 – free atmosphere
  1 – plane in front of the fan
  2 – plane behind the fan
  3 – nozzle exit

the aircraft and also its propulsion system must fulfil legal requirements. for czech ultralight aircraft, it is the certification specification ul-2 [5] (the requirements of the german certification specification ltf-ul are very similar [6]). this brings the requirement that the aircraft has to take off within a given distance, which creates a requirement for thrust at low speeds so that the acceleration is adequate.

2. methods
the simulation model is based on a modified approach from [4], based on the experience from the development and testing of the ul-39 aircraft. a compressible fluid model is used so that this method can also be used for faster aircraft than the ultralight category.
an approach to the fan design based on the comparison of various configurations over the complete flight velocity envelope is used, due to the certification specification requirements, which are contradictory to the requirement of a high cruise speed, as shown later in this paper; this led to the necessary modification of the approach presented in [4]. an iterative method in matlab is used for the solution of the system of equations. the result of the method is the fan design point, which can then be used for the fan design by standard procedures, see e.g. [7] and [8].

2.1. physical model
the aim of the first step is to find the parameters of the propulsion system in design conditions, which were determined according to the experience with the ul-39 aircraft testing and operation [9]. it is a 1d compressible fluid flow model. the input parameters used for the fan design phase can be found in table 2. the fan has to be placed into the available space in the fuselage, which limits the maximal fan diameter and determines the length of the exhaust duct. figure 1 and table 1 explain the numbering of the different planes in the propulsion system.

table 2. input parameters for the propulsion system:
  engine power                          p      200 kw
  flight velocity range                 v0     0–100 m s−1
  air density                           ρ      1.225 kg m−3
  atmospheric pressure                  ps0    101 325 pa
  intake duct pressure loss coefficient ζ01    0.1
  fan efficiency                        ηfan   0.85
  outlet duct wall friction factor      λ23    0.013
  outlet duct length                    l      1.5 m
  air ratio of specific heats           κ      1.4
  air specific gas constant             r      287 j kg−1 k−1
  air specific heat at constant pressure cp    1004.5 j kg−1 k−1

due to the complexity of the equations, a numerical iterative approach is used for the solution. the input parameters are the fan diameter $d_1$ and the nozzle cross-section ratio $A_1/A_3$. the thrust curve, i.e. the dependence of the thrust $T$ on the flight velocity $v_0$, is then determined for every combination of $d_1$ and $A_1/A_3$. the fan hub-to-tip radius ratio is set to 0.5, i.e. the blade length is half of the fan radius. then the fan cross-section $A_1$ can be determined according to
$$A_1 = \frac{3}{4}\,\frac{\pi d_1^2}{4}. \qquad (1)$$
the total pressure in the free atmosphere in front of the aircraft is determined by the standard formula from the flight mach number $M_0$ by
$$p_{t0} = p_{s0}\left(1 + \frac{\kappa-1}{2}M_0^2\right)^{\frac{\kappa}{\kappa-1}}, \qquad (2)$$
where the flight mach number is
$$M_0 = \frac{v_0}{a_0} \qquad (3)$$
and the speed of sound in the atmosphere is
$$a_0 = \sqrt{\kappa\,r\,T_{s0}}. \qquad (4)$$
the total temperature can be determined in a similar way,
$$T_{t0} = T_{s0}\left(1 + \frac{\kappa-1}{2}M_0^2\right). \qquad (5)$$
the total pressure in the intake duct is computed by means of a loss coefficient $\zeta_{01}$ and the fan axial velocity $v_1$,
$$p_{t1} = p_{t0} - \zeta_{01}\,\frac{\rho_1 v_1^2}{2}, \qquad (6)$$
where $\zeta_{01} = 0.1$ (based on cfd simulations from [10]). the total pressure recovery coefficient (see [11]) cannot be used in this case due to the low flight speed (data from literature sources are suitable for faster aircraft). heat exchange in the intake duct is neglected, thus the total temperature remains the same,
$$T_{t1} = T_{t0}. \qquad (7)$$
the static temperature in front of the fan is
$$T_{s1} = T_{t1} - \frac{v_1^2}{2 c_p}, \qquad (8)$$
the speed of sound is then
$$a_1 = \sqrt{\kappa\,r\,T_{s1}}, \qquad (9)$$
and the mach number is
$$M_1 = \frac{v_1}{a_1}. \qquad (10)$$
the static pressure can then be calculated from the mach number,
$$p_{s1} = p_{t1}\left(1 + \frac{\kappa-1}{2}M_1^2\right)^{-\frac{\kappa}{\kappa-1}}. \qquad (11)$$
the density and the air mass flow are then computed according to
$$\rho_1 = \frac{p_{s1}}{r\,T_{s1}}, \qquad (12)$$
$$\dot{m}_1 = \rho_1\,v_1\,A_1. \qquad (13)$$
it is assumed that the whole engine power $P$ is used by the fan, i.e. the total temperature is increased in the following way:
$$T_{t2} = T_{t1} + \frac{P}{c_p\,\rho_1\,A_1\,v_1}. \qquad (14)$$
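read as a recipe, eqs. (1)–(14) turn a guessed fan axial velocity v1 into the flow state at the fan face. a minimal sketch of that step follows (not the author's matlab code); the parameter values come from table 2, the standard temperature is derived from the tabulated density and pressure, and the small inner loop is our own way of handling the circular dependence between (6), (11) and (12).

```python
import numpy as np

# parameters from table 2 (SI units); Ts0 follows from ps0 and rho0
P, ps0, rho0 = 200e3, 101325.0, 1.225
kappa, R, cp = 1.4, 287.0, 1004.5
zeta01, d1 = 0.1, 0.66
A1 = 0.75 * np.pi * d1**2 / 4.0          # eq. (1)
Ts0 = ps0 / (R * rho0)

def fan_face_state(v0, v1):
    """state in plane 1 and the total temperature behind the fan, eqs. (2)-(14)."""
    a0 = np.sqrt(kappa * R * Ts0)                                       # (4)
    M0 = v0 / a0                                                        # (3)
    pt0 = ps0 * (1 + (kappa - 1) / 2 * M0**2) ** (kappa / (kappa - 1))  # (2)
    Tt1 = Ts0 * (1 + (kappa - 1) / 2 * M0**2)                           # (5), (7)
    Ts1 = Tt1 - v1**2 / (2 * cp)                                        # (8)
    M1 = v1 / np.sqrt(kappa * R * Ts1)                                  # (9), (10)
    rho1 = ps0 / (R * Ts1)                        # initial guess for the loop
    for _ in range(20):                           # (6), (11), (12) are coupled
        pt1 = pt0 - zeta01 * rho1 * v1**2 / 2
        ps1 = pt1 * (1 + (kappa - 1) / 2 * M1**2) ** (-kappa / (kappa - 1))
        rho1 = ps1 / (R * Ts1)
    mdot = rho1 * v1 * A1                                               # (13)
    Tt2 = Tt1 + P / (cp * mdot)                                         # (14)
    return pt1, Tt1, rho1, mdot, Tt2

print(fan_face_state(v0=0.0, v1=50.0))            # paper's first-iteration guess
```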
the total temperature for isentropic compression due to the fan is
$$T_{t2i} = T_{t1} + \frac{\eta_{\mathrm{fan}}\,P}{c_p\,\rho_1\,A_1\,v_1}. \qquad (15)$$
then, the total pressure becomes
$$p_{t2} = p_{t1}\left(\frac{T_{t2i}}{T_{t1}}\right)^{\frac{\kappa}{\kappa-1}}, \qquad (16)$$
and the fan pressure ratio is
$$\pi_{12} = \frac{p_{t2}}{p_{t1}}. \qquad (17)$$
the stagnation density behind the fan is
$$\rho_{t2} = \frac{p_{t2}}{r\,T_{t2}}. \qquad (18)$$
the critical air density (for the choked flow state) behind the fan is
$$\rho_{c2} = \rho_{t2}\left(\frac{\kappa+1}{2}\right)^{-\frac{1}{\kappa-1}}, \qquad (19)$$
the corresponding critical velocity is
$$v_{c2} = \sqrt{\frac{2\,(\kappa-1)\,c_p\,T_{t2}}{\kappa+1}}, \qquad (20)$$
and the critical flow density is
$$(\rho v)_{c2} = v_{c2}\,\rho_{c2}. \qquad (21)$$
$M_2$ is computed so that the mass flow through the duct remains constant, i.e. $(\rho v)_1 = (\rho v)_2$. the speed of sound behind the fan is
$$a_2 = v_{c2}\,\sqrt{\frac{\kappa+1}{2}\left(1 + \frac{\kappa-1}{2}M_2^2\right)^{-1}}. \qquad (22)$$
then, the flow velocity is calculated from the mach number $M_2$,
$$v_2 = M_2\,a_2, \qquad (23)$$
and the air density becomes
$$\rho_2 = \rho_{t2}\left(1 + \frac{\kappa-1}{2}M_2^2\right)^{-\frac{1}{\kappa-1}}. \qquad (24)$$
the total pressure at the nozzle exit 3 is computed from the exhaust duct loss coefficient $\zeta_{23}$,
$$p_{t3} = p_{t2} - \zeta_{23}\,\frac{\rho_2 v_2^2}{2}. \qquad (25)$$
the value of the loss coefficient $\zeta_{23}$ is determined according to the information from [12]. the flow density at the nozzle exit is computed from the condition of constant mass flow,
$$(\rho v)_3 = (\rho v)_2\,\frac{A_1}{A_3}. \qquad (26)$$
the total temperature behind the fan remains constant,
$$T_{t3} = T_{t2}. \qquad (27)$$
the total air density at the nozzle exit is
$$\rho_{t3} = \frac{p_{t3}}{r\,T_{t3}}. \qquad (28)$$
the critical (choked) air density at the nozzle exit is
$$\rho_{c3} = \rho_{t3}\left(\frac{\kappa+1}{2}\right)^{-\frac{1}{\kappa-1}}, \qquad (29)$$
the corresponding critical air velocity is
$$v_{c3} = \sqrt{\frac{2\,(\kappa-1)\,c_p\,T_{t3}}{\kappa+1}}, \qquad (30)$$
and the critical flow density is
$$(\rho v)_{c3} = \rho_{c3}\,v_{c3}. \qquad (31)$$

figure 2. thrust over flight speed for different fan diameters d1, nozzle contraction ratio a1/a3 = 1.
figure 3. thrust over flight speed for different nozzle contraction ratios a1/a3, fan diameter d1 = 0.66 m.

the static temperature at the nozzle exit is
$$T_{s3} = T_{t3}\left(\frac{p_{s0}}{p_{t3}}\right)^{\frac{\kappa-1}{\kappa}}, \qquad (32)$$
the corresponding speed of sound is
$$a_3 = \sqrt{\kappa\,r\,T_{s3}}, \qquad (33)$$
and the flow velocity is
$$v_3 = M_3\,a_3. \qquad (34)$$
the nozzle exit mach number $M_3$ is determined from the relation between the static pressure $p_{s3}$ and the total pressure $p_{t3}$,
$$p_{s3} = p_{t3}\left(1 + \frac{\kappa-1}{2}M_3^2\right)^{-\frac{\kappa}{\kappa-1}}. \qquad (35)$$
the thrust of the propulsion system is determined from the momentum conservation law,
$$T = \dot{m}\,(v_3 - v_0), \qquad (36)$$
where the air mass flow is
$$\dot{m} = \rho_1\,A_1\,v_1. \qquad (37)$$
finally, the propulsion efficiency is defined by the standard formula
$$\eta = \frac{T\,v_0}{P}. \qquad (38)$$
an iterative algorithm has to be used for the computation; a value $v_1 = 50$ m s−1 can be used as a guess for the first iteration. the fan rpm is determined from the flow coefficient $\varphi = v_{\mathrm{ax}}/u$, which is assumed to be 0.5,
$$n_m = \frac{120\,v_1}{\pi\,d_1}. \qquad (39)$$

3. results
thrust curves (i.e. the dependence of the thrust on the flight velocity) for different fan diameters $d_1$ and nozzle contraction ratios $A_1/A_3$ are presented in figures 2 and 3. an increase in the fan diameter $d_1$ (see figure 2) causes a thrust increase over the given velocity range; however, this influence diminishes with increasing flight velocity, as expected from the general theory of aerospace propulsion. the influence of the nozzle contraction ratio $A_1/A_3$ on the thrust (see figure 3) is similar: a lower $A_1/A_3$ leads to a higher thrust at a lower flight velocity, but reduces the flight performance at a higher velocity. the influence of the fan diameter $d_1$ and the nozzle contraction ratio $A_1/A_3$ on the efficiency is presented in figures 4 and 5. the efficiency is relatively low in comparison with a standard propeller due to the small fan cross-section area and also due to the viscous losses in the duct system.
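the closure of the iteration, i.e. how v1 is updated, is not spelled out above. one natural choice is to treat the mismatch between the nozzle-exit flow density required by continuity (26) and the one implied by the exit mach number (35) as a residual and solve it for v1. the sketch below does exactly that; it is our own closure, not necessarily the paper's, the value of ζ23 is an assumption, an unchoked nozzle is assumed, and the v1 bracket may need adjusting for other parameter combinations.

```python
import numpy as np
from scipy.optimize import brentq

P, ps0, rho0 = 200e3, 101325.0, 1.225
kappa, R, cp = 1.4, 287.0, 1004.5
zeta01, zeta23, eta_fan = 0.1, 0.05, 0.85     # zeta23: assumed value
d1, A1A3 = 0.66, 1.17                         # fan diameter, nozzle ratio A1/A3
A1 = 0.75 * np.pi * d1**2 / 4.0
Ts0 = ps0 / (R * rho0)
kk = (kappa - 1.0) / 2.0

def flux(M, pt, Tt):                          # rho*v at given mach and totals
    Ts = Tt / (1 + kk * M**2)
    rho = pt * (1 + kk * M**2) ** (-kappa / (kappa - 1)) / (R * Ts)
    return rho * M * np.sqrt(kappa * R * Ts)

def stations(v1, v0):
    M0 = v0 / np.sqrt(kappa * R * Ts0)
    pt0 = ps0 * (1 + kk * M0**2) ** (kappa / (kappa - 1))
    Tt1 = Ts0 * (1 + kk * M0**2)
    Ts1 = Tt1 - v1**2 / (2 * cp)
    M1 = v1 / np.sqrt(kappa * R * Ts1)
    rho1 = ps0 / (R * Ts1)
    for _ in range(20):                       # (6)/(11)/(12) fixed point
        pt1 = pt0 - zeta01 * rho1 * v1**2 / 2
        rho1 = pt1 * (1 + kk * M1**2) ** (-kappa / (kappa - 1)) / (R * Ts1)
    mdot = rho1 * v1 * A1
    Tt2 = Tt1 + P / (cp * mdot)
    pt2 = pt1 * ((Tt1 + eta_fan * P / (cp * mdot)) / Tt1) ** (kappa / (kappa - 1))
    M2 = brentq(lambda M: flux(M, pt2, Tt2) - rho1 * v1, 1e-6, 0.99)
    Ts2 = Tt2 / (1 + kk * M2**2)
    v2 = M2 * np.sqrt(kappa * R * Ts2)
    pt3 = pt2 - zeta23 * (flux(M2, pt2, Tt2) / v2) * v2**2 / 2     # (25)
    Ts3 = Tt2 * (ps0 / pt3) ** ((kappa - 1) / kappa)               # (27), (32)
    M3 = np.sqrt(2 / (kappa - 1) * ((pt3 / ps0) ** ((kappa - 1) / kappa) - 1))
    v3 = M3 * np.sqrt(kappa * R * Ts3)                             # (34), (35)
    resid = rho1 * v1 * A1A3 - (ps0 / (R * Ts3)) * v3              # continuity (26)
    return resid, mdot, v3

def thrust(v0):
    v1 = brentq(lambda v: stations(v, v0)[0], 10.0, 200.0)  # outer iteration on v1
    _, mdot, v3 = stations(v1, v0)
    return mdot * (v3 - v0)                                  # eq. (36)

print(thrust(0.0))
```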
another important parameter for the fan design is the axial velocity $v_1$, presented in figures 6 and 7. it is clearly visible that the fan axial velocity $v_1$ varies strongly with the flight speed at constant electric motor power. both parameters, i.e. the fan diameter $d_1$ and the nozzle contraction ratio $A_1/A_3$, have a strong influence on $v_1$. the dependence of the fan pressure ratio on the flight velocity and the fan diameter is presented in figure 8. the fan rpm for the same situation is presented in figure 9 (the assumption of a constant $\varphi = v_{\mathrm{ax}}/u = 0.5$ is used).

figure 4. efficiency of the propulsion system for different fan diameters d1, nozzle contraction ratio a1/a3 = 1.
figure 5. efficiency of the propulsion system for different nozzle contraction ratios a1/a3, fan diameter d1 = 0.66 m.
figure 6. fan axial velocity component over flight speed for different fan diameters and a1/a3 = 1.
figure 7. fan axial velocity component over flight speed for different nozzle contraction ratios a1/a3, fan diameter d1 = 0.66 m.
figure 8. fan pressure ratio over flight speed for different fan diameters and a1/a3 = 1.
figure 9. fan rpm over flight speed for different fan diameters and a1/a3 = 1.

based on the above-mentioned results, the dependencies of the fan pressure ratio π, thrust t, fan axial velocity v1 and fan rpm nm on the fan diameter d and the nozzle contraction ratio a1/a3 for the static case (i.e. v0 = 0 km h−1, take-off) and for the maximum flight velocity (i.e. v0 = 300 km h−1) are presented in figures 10–17. based on this and the fuselage geometry, a fan diameter of d1 = 0.66 m was selected. the ratio a1/a3 is determined from the relative thrust shown in figure 18.

figure 10. dependence of static thrust on fan diameter and nozzle contraction ratio a1/a3. the selected fan design point parameters are marked by a red cross.
figure 11. dependence of thrust at flight speed v0 = 300 km h−1 on fan diameter and nozzle contraction ratio a1/a3. the selected fan design point parameters are marked by a red cross.
figure 12. dependence of fan pressure ratio at flight speed v0 = 0 km h−1 on fan diameter and nozzle contraction ratio a1/a3. the selected fan design point parameters are marked by a red cross.
figure 13. dependence of fan pressure ratio at flight speed v0 = 300 km h−1 on fan diameter and nozzle contraction ratio a1/a3. the selected fan design point parameters are marked by a red cross.

the relative thrust is defined as the ratio t/tref, where the reference value tref is
that is why the optimal system configuration is set by means of relative thrust. the outputs of this method are the fan design parameters presented in table 3. 5. conclusions the results of the propulsion system simulation for ducted fan aircraft are presented. a compressible fluid flow model is used so the described procedure can be used for a wider range of flight velocities in comparison with a simple, incompressible flow model (e.g. [4]). the procedure is described and results are presented for the example of the ul-39 aircraft. the requirements for the propulsion system are contradictory, i.e. short take-off distance and high maximal 554 vol. 62 no. 5/2022 determination of fan design parameters for light-sport aircraft figure 14. dependence of fan axial velocity component at flight speed v0 = 0 km h−1 on fan diameter and nozzle contraction ratio a1/a3. the selected fan design point parameters are marked by red cross. figure 15. dependence of fan axial velocity component at flight speed v0 = 300 km h−1 on fan diameter and nozzle contraction ratio a1/a3. the selected fan design point parameters are marked by red cross. figure 16. dependence of fan rpm at flight speed v0 = 0 km h−1 on fan diameter and nozzle contraction ratio a1/a3. the selected fan design point are marked by red cross. figure 17. dependence of fan rpm at flight speed v0 = 300 km h−1 on fan diameter and nozzle contraction ratio a1/a3. the selected fan design point parameters are marked by red cross. fan pressure ratio π 1.062 fan diameter d 660 mm electric motor rpm 6340 air axial velocity at fan vax 109.54 m s−1 fan air mass flow ṁ 33.46 kg s−1 expected thrust at 300 km h−1 t 1401.9 n expected efficiency at 300 km h−1 η 0.584 table 3. ul-39 fan design parameters for flight speed 300 km h−1 at sea level international standard atmosphere and electric motor power 200 kw. flight velocity. this leads to the necessity of a tradeoff for chosing the optimal system configuration. the influence of various design parameters on the propulsion performance is presented for the expected range of flight velocities. the proposed selection of the optimal variant is based on the maximum of mean relative thrust for the static case (i.e. flight velocity of 0 km h−1) and the expected high speed cruise (i.e. flight velocity of 300 km h−1). the presented procedure and the results can be used for a ducted fan design for an electric powered aircraft. list of symbols a speed of sound [m s−1] a cross-section area of a duct [m2] cp specific heat at constant pressure [j kg−1 k−1] d1 fan diameter [m] l duct length [m] 555 jan klesa acta polytechnica figure 18. dependence of relative thrust on nozzle contraction ratio a1/a3 at flight speed 0 and 300 km h−1. the selected nozzle contraction ratio a1/a3 = 1.17 is marked by dashed line. figure 19. thrust over flight speed for chosen fan parameters, i.e. d1 = 0.66 m and a1/a3 = 1.17. figure 20. propulsive efficiency over flight speed for chosen fan parameters, i.e. d1 = 0.66 m and a1/a3 = 1.17. figure 21. fan pressure ratio over flight speed for chosen fan parameters, i.e. d1 = 0.66 m and a1/a3 = 1.17. figure 22. fan axial velocity component over flight speed for chosen fan parameters, i.e. d1 = 0.66 m and a1/a3 = 1.17. figure 23. fan rpm over flight speed for chosen fan parameters, i.e. d1 = 0.66 m and a1/a3 = 1.17. 556 vol. 62 no. 
ṁ    air mass flow [kg s−1]
m    mach number
nm   fan rpm [rpm]
ps   static pressure [pa]
pt   total pressure [pa]
p    engine power [w]
r    air specific gas constant [j kg−1 k−1]
t    thrust [n]
tt   total temperature [k]
ts   static temperature [k]
v    velocity [m s−1]
η    efficiency
ζ    pressure loss coefficient
κ    ratio of specific heats
λ    wall friction factor
π12  fan pressure ratio
ρ    air density [kg m−3]
(ρv) flow density [kg m−2 s−1]
(ρv)c critical flow density [kg m−2 s−1]

acknowledgements
this work was supported by the technology agency of the czech republic, grant no. fv40263.

references
[1] m. hepperle. electric flight – potential and limitations. energy efficient technologies and concepts of operation (sto-mp-avt-209), 2012. accessed 2022-10-26, https://elib.dlr.de/78726/.
[2] electric & hybrid-electric propulsion, 2022. accessed 2022-10-26, https://aerospace.honeywell.com/us/en/products-and-services/product/hardware-and-systems/electric-power/hybrid-electric-electric-propulsion.
[3] rolls-royce plc. rolls-royce advances hybrid-electric flight with new technology to lead the way in advanced air mobility, 2022. accessed 2022-10-26, https://www.rolls-royce.com/media/press-releases/2022/22-06-2022-rr-advances-hybrid-electric-flight-with-new-technology.aspx.
[4] s. d. v. weelden, d. e. smith, b. r. mullins. preliminary design of a ducted fan propulsion system for general aviation aircraft. in aiaa 34th aerospace sciences meeting and exhibit, 96-0376. https://doi.org/10.2514/6.1996-376.
[5] letecká amatérská asociace čr. ul 2 část i., 2019. accessed 2022-10-26, https://www.laacr.cz/sitecollectiondocuments/predpisy/ul2%20%c4%8d%c3%a1st%20i_26.3.2019.pdf.
[6] deutscher aero club. lufttüchtigkeitsforderungen für aerodynamisch gesteuerte ultraleichtflugzeuge ltf-ul vom 15.01.2019 und änderung vom 28.02.2019 (nfl 2-459-19), 2019. accessed 2022-10-26, https://www.daec.de/fileadmin/user_upload/files/2019/luftsportgeraete_buero/ltf/ltf-ul_2019.pdf.
[7] n. a. cumpsty. compressor aerodynamics. krieger publishing company, 2nd edn., 2004.
[8] r. o. bullock, i. a. johnsen. aerodynamic design of axial flow compressors. nasa-sp-36. nasa lewis research center, cleveland, oh, 1965.
[9] r. theiner, j. brabec. experience with the design of ultralight airplane with unconventional powerplant. proceedings of the institution of mechanical engineers, part g: journal of aerospace engineering 232(14):2721–2733, 2018. https://doi.org/10.1177/0954410018774117.
[10] j. hejna. intake design for experimental fan. master's thesis, czech technical university in prague, 2021. accessed 2022-10-26, http://hdl.handle.net/10467/96889.
[11] s. farokhi. aircraft propulsion. john wiley & sons, 2nd edn., 2014.
[12] s. albrig. angewandte strömungslehre. akademie-verlag, 5th edn., 1978.
removal of specular reflections in endoscopic images
t. stehle

during an endoscopic examination, pictures from the inside of the human body are displayed on a computer monitor. disturbing light reflections are often visible in these images. in this paper, we present an approach for removing these reflections and replacing them by an estimate obtained using a spectral deconvolution algorithm.

keywords: endoscopy, light reflection removal, spectral deconvolution, defect interpolation.

1 introduction
endoscopy is a minimally invasive diagnostic procedure that is used to give the physician a realistic impression of almost any part of the gastrointestinal tract or other organs inside the human body. during an endoscopic examination, a flexible tube, which contains among other things a video camera and a lamp, is introduced into the human body. it has a steerable tip that enables the physician to change the perspective and to ease navigation, e.g. through the colon. another system for imaging the gastrointestinal tract is a so-called "camera in a pill" or "pillcam" [1]. this is swallowed by the patient and travels through the gastrointestinal tract in a natural way. the images that are recorded by the camera are sent via a radio frequency transmitter to a data recorder, which is worn on a belt around the patient's waist. due to the frontal illumination, there are often light reflections that are disturbing for the physician. in the case of classic endoscopy, the physician may change the perspective by turning the tip of the endoscope to avoid these reflections. in the case of a camera-in-pill examination, however, there is no way to force the pill to move to a better position. therefore, these reflections need to be removed in a different manner. the algorithm we are going to present may also be applied to classic endoscopy for the physician's convenience.

2 methods
the algorithm that we apply is used in image communications to conceal image data corrupted by transmission errors [2] and in the restoration of x-ray images with defective detector elements [3].
it models the defective image as a pointwise product of the undisturbed image and a known binary defect map that contains ones where the image is correct and zeros in the case of a defective pixel. this pointwise product in the spatial domain corresponds to a convolution in the spectral domain. to restore the image, a spectral deconvolution is performed. in the case of endoscopy, we first perform a segmentation by thresholding in the yuv color space, then we treat the pixels that belong to light reflections as defective pixels, setting them to zero, and finally we apply the defect interpolation algorithm.

2.1 segmentation of reflections
before the spectral deconvolution algorithm is applicable, a map has to be generated that contains ones where the image is undisturbed and zeros where specular reflections occur. these reflections are very bright, so the affected pixels show a high luminance in the yuv color space. thus, the image is transferred from rgb into the yuv color space by applying the linear transform
$$\begin{bmatrix} y \\ u \\ v \end{bmatrix} = \begin{bmatrix} 0.299 & 0.587 & 0.114 \\ -0.147 & -0.289 & 0.436 \\ 0.615 & -0.515 & -0.100 \end{bmatrix}\begin{bmatrix} r \\ g \\ b \end{bmatrix}, \qquad (1)$$
where the y component contains the luminance and u and v contain the chrominance. fig. 1 shows a typical histogram of the luminance channel of an endoscopic image. on the right hand side, there is a small peak that corresponds to pixels that contain specular reflections. we now segment these reflections by thresholding with a value that is just lower than the left edge of this hill. therefore, we low-pass filter the histogram and find the position, starting from the right hand side, where the derivative changes from a significant positive value to a value near zero. since there are often dark rings around the affected pixels in the reconstructed image, and these are also very disturbing, we enlarge the segmented areas by an erosion operation. note that we use erosion instead of dilation to enlarge the segments, since we have a black segment on a white background.

fig. 1: histogram of the luminance channel of an endoscopic image containing specular reflections. the x axis shows the luminance value, the y axis shows the number of occurrences.

2.2 spectral deconvolution
for ease of notation, we describe the univariate case; the results can be easily generalized to the bivariate case. the observed image $g(n)$ is modeled as
$$g(n) = f(n)\,w(n), \qquad G(k) = \frac{1}{N}\,F(k) * W(k) = \frac{1}{N}\sum_{l=0}^{N-1} F(l)\,W(k-l), \qquad (2)$$
where $0 \leq n, k < N$, $f(n)$ is the undisturbed image, and $w(n)$ is a binary window with $w(n) = 0$ if the corresponding pixel $n$ in $g(n)$ belongs to a reflection and $w(n) = 1$ otherwise. $G(k)$, $F(k)$ and $W(k)$ are the dft spectra of $g(n)$, $f(n)$ and $w(n)$, respectively. both $f(n)$ and $g(n)$ are real valued; hence, $G(k) = G^*(N-k)$ and $F(k) = F^*(N-k)$. our goal is to estimate $F(k)$, the dft of $f(n)$, for $0 \leq k < N$. let us select a spectral line pair $G(s)$ and $G(N-s)$ of $G(k)$.
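a compact sketch of the segmentation step follows; it uses the luminance row of (1) and scipy's binary morphology. the fixed-quantile threshold stands in for the histogram-derivative rule described above, and the 9×9 structuring element anticipates the value used in the experiments of section 3; both are simplifications, not the exact procedure.

```python
import numpy as np
from scipy import ndimage

def reflection_map(rgb):
    """rgb in [0, 1], shape (h, w, 3); returns w(n): 1 = valid, 0 = reflection."""
    y = 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]
    thr = np.quantile(y, 0.98)          # stand-in for the histogram-based threshold
    w = (y < thr)                       # zeros at the bright specular pixels
    # enlarging the zero regions = eroding the white background
    w = ndimage.binary_erosion(w, structure=np.ones((9, 9), bool))
    return w.astype(float)

# usage: g = f * reflection_map(f); the masked image then feeds the deconvolution
```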
if $F(k)$ consisted only of two lines at $s$ and $N-s$, i.e.
$$F(k) = \hat F(k) = \hat F(s)\,\delta(k-s) + \hat F^*(s)\,\delta(k-N+s), \qquad (3)$$
convolution with the window spectrum $W(k)$ would yield for the observed pair
$$G(s) = \frac{1}{N}\left[\hat F(s)\,W(0) + \hat F^*(s)\,W(2s)\right], \qquad G(N-s) = \frac{1}{N}\left[\hat F^*(s)\,W(0) + \hat F(s)\,W^*(2s)\right], \qquad (4)$$
where $\hat F(s)$ and $\hat F(N-s) = \hat F^*(s)$ are the estimated coefficients. from (4), $\hat F(s)$ can be found to be
$$\hat F(s) = N\,\frac{G(s)\,W(0) - G^*(s)\,W(2s)}{W(0)^2 - |W(2s)|^2}. \qquad (5)$$
the estimated spectrum $\hat F(k)$ would then be given by (3). generally, $F(k)$ consists of more than two spectral lines. the error after deconvolution (5) at $s$ and $N-s$ in the spectral domain is given by
$$G^{(1)}(k) = G(k) - \frac{1}{N}\,\hat F(k) * W(k) = G(k) - \frac{1}{N}\left[\hat F(s)\,W(k-s) + \hat F^*(s)\,W(k-N+s)\right], \qquad (6)$$
where $G^{(1)}(k)$ is the spectrum of the window-internal difference image
$$g^{(1)}(n) = g(n) - \hat f(n)\,w(n) = w(n)\left[f(n) - \hat f(n)\right]. \qquad (7)$$
clearly, the spectral error $G^{(1)}(k) = 0$ at the selected line pair $k = s$ and $k = N-s$. given $s$ and $N-s$, the estimates $\hat F(s)$ and $\hat F(N-s)$ are optimal in the minimum mean square error (mmse) sense if the energy $E_g$ of $g^{(1)}(n)$ is minimized. using parseval's theorem, $E_g$ can be evaluated directly in the spectral domain according to
$$E_g = \sum_{n=0}^{N-1}\left|g^{(1)}(n)\right|^2 = \frac{1}{N}\sum_{k=0}^{N-1}\left|G^{(1)}(k)\right|^2. \qquad (8)$$
this provides mmse estimates for the sought frequency coefficients. if the selected line pair is dominant in the sense specified below, its convolution with $W(k)$ tends to "hide" other, less dominant spectral coefficients of $F(k)$. this influence is removed by (6), so that another line pair can be selected from $G^{(1)}(k)$, estimated and subtracted. this leads to the following iteration for spectral deconvolution:
• initialization: $\hat F^{(0)}(k) = 0$, $G^{(0)}(k) = G(k)$, $i = 1$.
• i-th iteration step: select a pair of spectral coefficients $G^{(i-1)}(s^{(i)})$, $G^{(i-1)}(N-s^{(i)})$ out of $G^{(i-1)}(k)$.
• estimate $\hat F(s^{(i)})$, $\hat F(N-s^{(i)})$ according to (5) such that $G^{(i)}(s^{(i)}) = G^{(i)}(N-s^{(i)}) = 0$, i.e.
$$G^{(i-1)}(s^{(i)}) = \frac{1}{N}\left[\hat F(s^{(i)})\,W(0) + \hat F^*(s^{(i)})\,W(2s^{(i)})\right], \qquad G^{(i-1)}(N-s^{(i)}) = \frac{1}{N}\left[\hat F^*(s^{(i)})\,W(0) + \hat F(s^{(i)})\,W^*(2s^{(i)})\right], \qquad (9)$$
$$\hat F(s^{(i)}) = N\,\frac{G^{(i-1)}(s^{(i)})\,W(0) - G^{(i-1)*}(s^{(i)})\,W(2s^{(i)})}{W(0)^2 - |W(2s^{(i)})|^2}. \qquad (10)$$
since we seek to minimize the error (8), the line pair we should select in the i-th iteration is the one that maximizes the energy reduction $\Delta E$, which can be calculated to be
$$\Delta E = \frac{1}{N^2}\sum_{k=0}^{N-1}\left|\hat F(s^{(i)})\,W(k-s^{(i)}) + \hat F^*(s^{(i)})\,W(k-N+s^{(i)})\right|^2, \qquad (11)$$
which can be rewritten to
$$\Delta E = \frac{1}{N^2}\left[\left|\hat F(s^{(i)})\right|^2\sum_{k=0}^{N-1}\left|W(k-s^{(i)})\right|^2 + \left|\hat F^*(s^{(i)})\right|^2\sum_{k=0}^{N-1}\left|W(k-N+s^{(i)})\right|^2 + 2\,\mathrm{Re}\!\left(\hat F(s^{(i)})^2\sum_{k=0}^{N-1}W(k-s^{(i)})\,W^*(k-N+s^{(i)})\right)\right]. \qquad (12)$$
for a binary window, we have $w^2(n) = w(n)$. inserting this into parseval's theorem, we obtain
$$\sum_{k=0}^{N-1}\left|W(k-s^{(i)})\right|^2 = \sum_{k=0}^{N-1}\left|W(k)\right|^2 = N\sum_{n=0}^{N-1}w^2(n) = N\sum_{n=0}^{N-1}w(n) = N\,W(0) \qquad (13)$$
and
4/2006 d f t � � � � and w k s w k s n w n j p s n n i i k n i ( ) ( ) ( ) exp ( ) * ( ) ( ) � � � � � � � � 0 1 2 2 � � � � � n n in w s 0 1 2( ).( ) (14) inserting (13) and (14) into (12) yields � �� e i i i i i if s g s f s g s� � � ( ) ( ) �( ) ( )* ( ) ( ) ( ) ( ) *( ) ( )1 1 . (15) since we know how to estimate �( )( )f s i optimally (10), � ( )* ( )f s i can be eliminated, thus expressing the error energy reduction depending on the available error spectrum line g ki s i( ) ( )( ) 1 as � e i i i i n w w s g s w g s � ( ) ( ) ( ) ( ) re ( ( ) ( ) ( ) ( ) 0 2 2 0 2 2 2 1 2 1� �( ) ( )) ( ) .i iw s 2 2 �� � � � � � � � � ! (16) selecting the best line pair in each iteration hence implies finding s such that �e according to (16) is maximum. because of the symmetry of the dft spectra of real valued signals, it suffices to search over only half the coefficients of g ki( )( ) 1 , i.e. from k � 0 to k � n/2, to find s(i). �( ) ( ) ( )( ) ( ) ( )f s n w g si i i� 0 1 . (17) similarly, the calculation of the error energy reduction �e according to (16) modifies to � e i in w g s� 2 0 1 2 ( ) ( )( ) ( ) . (18) clearly, when selecting only a single line, the error energy reduction depends only on the modulus spectrum | ( )|( ) ( )g si i 1 , apart from a constant factor. to save computational expense in the selection step, a simplified approach is to always select s(i) such that| ( )|( ) ( )g si i 1 is maximum, regardless of whether a single line is tested, that is, s i( ) � 0 or s ni( ) � 2, or a line pair. practically, the outcome of the iteration remains almost unchanged. when the estimated spectrum � ( )* ( )f s i contains as many lines as there are samples of f(n) inside the window w(n), the remaining error energy eg vanishes. the backtransformed estimate � ( )( )f ni is then identical to g(n) inside the observation window, and contains extrapolated information outside. in practice, the iteration is stopped when �e falls below a prespecified level �, or when a maximum number of iterations is reached. to achieve high spectral resolution for the interpolated signal, it is often reasonable to apply zero-padding to g(n) and w(n) before transforming and starting the iteration. an illustration of the entire recursion is given in fig. 2. 3 results the algorithm as described above was applied to several endoscopic images. in our experiments, we used a 9×9 circular structure element for erosion of the binary mask image, and we independently applied 100 iterations of the deconvolution algorithm to each channel of the yuv color space. fig. 3 and fig. 4 show two examples with the original image and a processed version of the image. the specular reflections are removed and the holes are filled with texture that does not disturb the overall impression of the image. however, the results are not perfect. in fig. 3, several vessels are affected by specular reflections in the lower right corner, and in the processed image some of these vessels appear interrupted. in fig. 4, block artifacts were added to the image in areas where a large number of specular reflections occurred. in some cases, the specular reflections itselves were surrounded by green, pink and yellow circles. these circles were correctly identified not being specular reflections and therefore they were not removed completely. this is not severe if the physician knows that they are caused by reflections. 
in the processed image these colors are visible, too, but the reflections have been removed, so the physician might think it is the real color of the tissue. the processed images have not yet been presented to a physician because of the drawbacks described above. however, the results achieved so far are already very promising. 4 discussion we have presented an approach for removing specular reflections from endoscopic images by segmentation of the affected pixels and interpolation via a spectral deconvolution algorithm. the algorithm, however, needs further improvement. we are going to modify the segmentation of the specular reflections such that the typical elliptical shape is incorporated into 34 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 46 no. 4/2006 fig. 2: illustration of the recursive spectral deconvolution algorithm. from the final spectral estimate � ( )( )f ki , the interpolated signal estimate is obtained by an inverse dft. the segmentation process. a rejection criterion will be integrated so that the quantitiy of false positive segmentation results is reduced. in addition, we will address the color disturbances in the processed images. 5 acknowledgments i would like to thank prof. dr. t. aach, institute of imaging and computer vision, rwth aachen university, germany, who supervised this project. i would also like to thank prof. dr. med. c. trautwein, university hospital aachen, germany, who kindly provided the endoscopic images used in this paper. references [1] iddan, g., meron, g., glukhovsky, a. swain p.: wireless capsule endoscopy, nature, 2000, p. 405–417. [2] kaup, a., meisinger, k., aach, t.: frequency selective signal extrapolation with applications to error © czech technical university publishing house http://ctn.cvut.cz/ap/ 35 acta polytechnica vol. 46 no. 4/2006 fig. 3: endoscopic image of the colon with specular reflections (left) and processed image (right). the overall impression of the processed image is better, but some artifacs have been added to the image. the vessels in the lower right corner appear blurred, and some of them interrupted fig. 4: endoscopic image of the esophagus with specular reflections (left) and the processed image (right). in areas with a large number of specular reflections block artifacts have been added to the image. concealment in image communications. international journal of electronic communication (ae), vol. 59 (2005), p. 147–156. [3] aach, t., metzler, v.: defect interpolation in digital radiography – how object-oriented transform coding helps. spie medical imaging, proceedings of spie 4322, 2001, p. 824–835. thomas stehle e-mail: thomas.stehle@lfb.rwth-aachen.de institute of imaging & computer vision rwth aachen university d-52056 aachen, germany 36 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 46 no. 4/2006 ap07_4-5.vp 1 negative dispersion in pcfs 1.1 existing dispersion compensating techniques and methods for negative cd negative cd could be used in telecommunications for opposite slope dispersion compensation, and can be achieved by various methods, such as those based on chirped gratings [1], dispersion compensating filters [2], brewster-angled chirped mirrors [3], and soliton-induced negative dispersion [4]. the most mature dispersion compensating technique is the use of dispersion compensating fibers (dcfs) [5]. 
the first conventional dcf, with a large index contrast and a small core, exhibited a negative chromatic dispersion coefficient value of tens of ps/nm/km at 1550 nm [6]. more recent dual concentric core optical fibers, consisting of a single-mode central core surrounded by a guiding ring, achieve a minimum cd of −1800 ps/nm/km at 1550 nm [7]. most models proposed as pcf-based dcfs suggest the presence of dual concentric cores, introducing one ring with smaller holes or removing an entire chosen ring of holes [8, 9], or creating a proper index diversity by doping to ensure a significant index contrast [10]. the essential problem of fibers with selective filling of the core, with a small core or with a great air-filling fraction, is an often unacceptable loss, a small effective mode area or an unfitting operating wavelength range that disqualifies the fiber for high-speed transmission systems because of high attenuation at 1550 nm. increasing the index of a core can also lead to higher-order modes that may interfere in a confining core with the fundamental mode and cause modal noise, as suggested in [8]. an mof undoped in the core, with seven dual cores designed for cd compensation, was proposed by zhang et al [8]; the negative dispersion coefficient was −4500 ps/nm/km. although this may result in thousands of ps/nm/km of negative dispersion, the negative cd was sensitive to hole diameter deviation. one of the highest ever-published negative cd coefficients (−50000 ps/nm/km in a pcf and −590000 ps/nm/km in an mof with a dual concentric core) was published by yang et al in [9]. negative cd in the case of fiber bending has been obtained only in a step-index fiber [12] (2007), where lp02 is not scattered.

1.2 possibilities of modeling a bent pcf with coupling modes
fiber bending with a small curvature radius is not trivial, because the total internal reflection condition should be kept. fiber bending can be sufficiently expressed by introducing the curvature into the refractive index profile, represented by the curvature radius and the orientation in radians. the bend orientation defines the plane in which the waveguide is bent, and is referenced to the positive x-axis in a clockwise fashion. a bend orientation of 0 rad denotes that the waveguide is bent in the x-z plane, while a bend orientation of π/2 denotes that it is bent in the y-z plane. the results contained in this work refer to a bending orientation of 0 rad. the mechanism of bending of mofs was analyzed by eijkelenborg et al [12] in the sense of observing the coloring of a white pulse launched into a fiber with five cores; the bending-induced coloring can be described by a model based on angular bragg grating confinement. another approach employs the theory of coupling modes. if the central core is single-mode at 1550 nm with the electric field distribution e1(x, y) and the effective index neff1, and the cladding mode caused by the bend is described by e2(x, y) and the effective index neff2, then the minimum cd occurs at the wavelength for which the difference of the two effective indices is zero, as predicted by auguste et al [7]. this occurs at a certain wavelength, known as the phase-matching wavelength:
$$n_{\mathrm{eff1}} - n_{\mathrm{eff2}} = 0 \;\Rightarrow\; n_{\mathrm{eff1}} = n_{\mathrm{eff2}}. \qquad (1)$$
a pcf bent at a curvature radius r has an equivalent rip, taking the curvature into account, that can be expressed as:
lucki this is a modeling work, which aims to show that negative chromatic dispersion (cd) may be obtained in a pcf by fiber bending. results from the study of negative dispersion could be employed in a new dispersion compensating technique. the proposed method does not require doping in the core, and does not require external cores. the minimum negative dispersion achieved by this method was �185000 ps/nm/km. problems of bending losses and sensitivity of the dispersion with respect to deviations of geometry were studied. keywords: negative chromatic dispersion, photonic crystal fiber, bending radius, normalized hole diameter, bending losses. n x y n x y x r2 1 1( , ) ( , )� �� � � � . (2) where n1 is the refractive index of the material characterized by position (x, y) (before bending) and n2 is the refractive index of the material at the same point, corrected by a coefficient taking into account the impact of bending by adjusting the x position by the value of bending radius r, assuming bending orientation in the x-z plane (0 radians from the x-axis), so that adjustment of the y coordinate is not necessary. such assumptions have been made, e.g., in fevrier et al. [11]. bent fiber theoretically propagates two supermodes: the fundamental lp01 mode leading to negative cd and the cladding mode (modes) leading to a very positive coefficient of cd. in the case of dual concentric core step-index fibers, the value of the negative cd coefficient was comparable to the positive one [13]. for theoretical assumptions, we consider two propagating supermodes: the central supermode lp01, guided in the central core surrounded by the microstructured cladding, assumed to be a single mode at 1550 nm; the outer supermode, created by light coupled from the central-core mode, propagating in the silica glass background, and scattered on the air inclusions. the outer supermode can be multimode in the c-band as there is not enough space between the holes to form one mode or no possibility to form a cladding defect mode. the fiber is designed so that the fundamental modes of both guides couple at the phase-matching wavelength �0, providing a highly negative cd of the first supermode. beyond the phase-matching wavelength, for certain curvature of the bending, coupling leads to intensity of the cladding mode greater than the intensity of lp01. in order to analyze the behavior of the complete structure, we make use of the coupled mode theory, as described in [14] and [15]. we consider the two supermodes as the elementary modes of the two independent guides. the electric fields of the fundamental modes of the structure (composed of the association of lp01 guided in the core and mode guided in the cladding), called supermodes, are then given by: � � � i i j zi� �exp , (3) � � � ii ii j zii� �exp , (4) where �i is the radial distribution, and �i is the propagation constant of the considered mode. to simplify the definition, we decompose the supermodes into elementary modes (or inversely) in a cross-section as: � � �i i ib b� �1 1 2 2, (5) � � �ii ii iib b� �1 1 2 2. (6) in this relation, the modal coefficient is defined as: b r r p r r r r i i m i 1 1 0 1 0 1 1 0 1 2 0 2 2 1 � � � � � � � � � � � � � � � � � � d d d . (7) in the same way, we can obtain the bi2, bii1, and bii2 coefficients. the amplitudes of the field of the first supermode and the two elementary modes are positive at all points in the cross section. 
the second supermode is chosen to be positive in the axis, so bi1, bi2, bii1 must be positive and bii2 negative. these different coefficients satisfy (at any wavelength) equations given by the coupled mode theory. at the phase-matching wavelength �0, the modal coefficients become: b b b bi i ii ii1 2 1 2� � � � . (8) indeed, at �0, the power of the two supermodes is distributed in the two cores, and the presence of one will gradually excite the other automatically for the same effective index of both guides. the radial distribution of the modes can be obtained accurately by a numeric resolution. the full-vectorial fdfd technique with index averaging and the modified yee’s algorithm was used to obtain the results contained in this paper [16]. the constant of propagation can be obtained by the formula: � � � � 2 0 neff [1/m]. (9) after substitution for group velocity and group delay, we may derive the dispersion by the formula: d � d d � [ps/(nm km)]. (10) we solve the maxwell equations describing the propagation launched in the z axis, and, after discretization and applying mesh algorithm, we obtain matrices that are solved numerically, followed by the algebra leading to eigenvalue equations in terms of transverse fields: ik h h h i i u i i u u u x y z y x y x 0 0 0 0 � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � e e e x y z , (11) � � � � � � � � � � � � � � � � � � � � � � ik e e e irx ry rz x y z 0 0 0 0 0 0 0 0� � � � � i v i i v v v h h h y x y x x y z 0 0 � � � � � � � � � � � � � � � � � � � � � . (12) after estimations and trial simulations, as the curvature radius of about 66000 �m may satisfy the condition for total internal reflection, initial curvature radius r was set to 66000 �m. then it was possible to adjust the geometry to receive the negative cd in the c-band. 1.3 the sensitivity of minimum negative cd to deviations of the curvature radius or with respect to the normalized hole diameter during the fabrication process, the predicted values of the hole diameter or the curvature radius could have some deviations from the theoretical predictions. the holes could have a 44 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 47 no. 4–5/2007 different radius (if the air pressure during the drawing process is not suitable). the prediction of cd for deviation of the curvature radius or the normalized hole diameter deviation is presented here. an optimum reference configuration resulting in minimum negative cd in the c-band was predicted, as follows: pitch � 23.2 �m, d � 0.50453 (for investigation of the curvature radius deviation), and pitch � 23.2 �m, bending radius r � 66116 �m (for the investigation of the hole diameter deviation). 2 numerical results © czech technical university publishing house http://ctn.cvut.cz/ap/ 45 acta polytechnica vol. 47 no. 4–5/2007 fig. 3: minimum negative dispersion with respect to deviation of the hole diameter for assumed bending at radius 66115.7 �m fig. 2: minimum negative cd with respect to different bending radii for assumed d � 0.50453 fig. 1: fundamental mode before introducing curvature (a) coupling from lp01 to the cladding modes (b) coupling interaction untouched for imprecision or absence of holes in the x < 0 plane (c) weaker nonlinearity for even small imprecision of a hole in the x > 0 plane with strong confinement of the optical power fig. 4: value of negative chromatic dispersion at exactly 1550 nm for different bending radii deviations fig. 
3 discussion

negative chromatic dispersion (cd) can be generated by fiber bending; it results from a bending-induced mechanism of coupling from the fundamental mode to the cladding modes. the core guide and the cladding guide guide light with the same effective index, and coupling is possible when both supermodes propagate with the same velocity, at a specific phase-matching wavelength [fig. 1b], resulting in a strongly nonlinear wavelength dependency of the chromatic dispersion, with a significant negative cd over a short wavelength range. the zero-dispersion point lies halfway between the negative and the positive cd peak (e.g., if the minimum negative cd was at 1.55 μm and the maximum positive cd was at 1.5502 μm, the zero-dispersion wavelength was 1.5501 μm). the character of the effective index-matching tangent is different for a different fiber geometry. we obtained the deepest minimum of negative cd for the normalized hole diameter 0.50453, for a pcf bent at a radius of 69080 μm, at 1559.9 nm. the negative cd was −185420 ps/nm/km. an investigation of the behavior of the minimum negative cd with respect to the curvature radius showed that a smaller curvature radius (stronger bending) tunes the phase-matching wavelength towards longer wavelengths [fig. 2]. the minimum negative-cd wavelength is sharply sensitive to the bending radius, and the bandwidth of the negative cd is not flat. changing the curvature radius by 100 μm shifts the negative dispersion peak by 0.01 μm and influences the minimum cd. when the curvature radius was varied by 580 μm, from 69080 μm to 68500 μm (0.84 % of the reference curvature radius; a smaller curvature radius means greater bending), the dispersion peak was shifted by about 6.9 nm towards longer wavelengths, and the value of the extreme decreased by 35 %, from −185420 ps/nm/km to −119610 ps/nm/km. when the curvature radius had the equivalent positive deviation, the dispersion peak was shifted by 7 nm towards shorter wavelengths, and the value of the cd extreme decreased by 57 %, from −185420 ps/nm/km to −79185 ps/nm/km. however, the minimum negative cd remains within half of its extreme value over a wide range of curvature radii (a few centimeters). the minimum negative cd wavelength exhibited a linear dependency on the bending radius, as depicted in [fig. 5]. similar behavior applies to the zero-dispersion wavelength, which was slightly longer than the minimum negative cd wavelength. summarizing, a negative tolerance to the radius of curvature is preferred (a smaller curvature radius, i.e. a greater bend). when the hole diameter was increased, the negative cd peak exhibited a shift towards shorter wavelengths and its value demonstrated little variation; when the hole diameter was decreased, the same dispersion peak was shifted towards longer wavelengths and the value of the peak demonstrated a remarkable variation, becoming less negative [fig. 3]. a certain negative tolerance to the hole diameter is preferred (smaller holes): the value was decreased by about 74 % for a certain negative deviation, and by 90 % for an equivalent positive deviation. when the hole diameter was increased by 0.000082, the position of the dispersion peak was shifted by 1.5 nm towards shorter wavelengths, and the value of the peak was decreased by 90 %, from −48850 ps/nm/km to −4570 ps/nm/km.
when d had an equivalent negative deviation, the position of the dispersion peak was shifted by about 1.5 nm towards a longer wavelength, and the peak's value was decreased by 74 %, from −48850 ps/nm/km to −12643 ps/nm/km. for greater imprecision, the question whether the tolerance should be negative or positive becomes unimportant. when the diameter of the holes was increased by 0.000642, the position of the minimum cd was shifted by about 2.5 nm towards shorter wavelengths, and the minimum negative cd decreased by 90 %, from −48850 ps/nm/km to −3303 ps/nm/km. the hole diameter predicted to give the largest possible minimum negative cd for a given bending (and at a determined wavelength) is not optimal for other curvature radii. as the minimum negative cd of a pcf bent at radius 69080 μm occurred at 1559.9 nm for d = 0.50453, and we manipulated the curvature radius to tune the minimum negative cd to a wavelength of about 1550 nm, the normalized hole diameter had to be adjusted, too. the curvature radius is then the main factor for tuning the minimum negative cd wavelength, while a proper hole diameter scales the minimum negative cd and only weakly tunes the operating wavelength at the assumed bending. if the goal is to achieve the largest possible negative cd at an arbitrary wavelength, it is rational to assume a smaller tolerance to d. if the aim is to precisely tune the negative cd wavelength, it is appropriate to assume a smaller tolerance to the bending radius. a cladding defect in the plane of coupling is considered as a deviation of the refractive index profile (rip), as it was responsible for a reduction of the interesting optical nonlinearities [fig. 1d]. we introduced a negative deviation to one hole located in the region with propagating light (radius = 5.7 μm instead of 5.85255 μm). as a result, the cd at 1.55 μm was decreased by 15524 ps/nm/km. a disturbance of the regularity of the structure (even a small imprecision of certain holes only) in the plane with coupled power could damage the existing negative cd, while any imprecision of the holes in the plane of x < 0 (opposite to the bending orientation) exhibits no impact on the negative cd [fig. 1c]. an important question concerns not the value of the extreme, but the value at 1.55 μm. a reduction of about 20000 ps/nm/km in the cd magnitude at 1.55 μm corresponds to a deviation of the curvature radius of 5 μm [fig. 4]. the sensitivity to the bend radius deviation is less drastic for greater deviations. the loss increases precisely at the wavelengths characterized by significant negative cd, and the maximum losses occur exactly at the minimum negative cd. the bending loss is then very sensitive to hole diameter adjustments [fig. 6]. for the minimum negative cd of −185420 ps/nm/km, the bending loss was about 1 db/cm. on the other hand, as the adjustment of the curvature radius tunes the minimum negative cd wavelength, the bending loss as a function of the curvature radius r exhibited similar wavelength behavior, but the value of the loss at the minimum negative cd displayed no exact regularity and oscillated around 1 db/cm [fig. 7].

fig. 7: bending losses as a function of the curvature radius for assumed normalized hole diameter d = 0.504530

the effect of the coupling between modes on bending losses in pcfs was investigated by olszewski et al. in [17]. they demonstrated that the coupling between the fundamental mode and the gallery of the cladding modes causes oscillations in the dependence of the bending losses on r within the short-wavelength bending loss edge in large-core pcfs.
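as a quick arithmetic check of the link-compensation figures quoted in the conclusions below (the input numbers are the ones stated in the text; the script itself is only illustrative):

```python
d_link = 1.5         # assumed dispersion of the transmission fiber [ps/(nm km)]
l_link = 250.0       # link length [km]
d_comp = -185000.0   # negative cd of the bent pcf [ps/(nm km)]

accumulated = d_link * l_link                # 375 ps/nm over the whole link
l_comp_m = -accumulated / d_comp * 1000.0    # required compensating length [m]
print(f"accumulated cd: {accumulated:.0f} ps/nm "
      f"-> {l_comp_m:.2f} m of compensating fiber")   # about 2 m
```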
4 conclusions

the method proposed in this paper is based on introducing curvature into the fiber, which results in negative dispersion at a certain wavelength. the most negative value achieved by this method was −185000 ps/nm/km in a microstructured fiber. the negative cd was most sensitive to the hole diameter deviation, especially in the area of strong confinement of the optical power. the position of the peak is sharply sensitive to the curvature radius. a certain negative tolerance is preferred for both the hole diameter deviation and the radius of curvature. the disadvantages of this method are a significant loss of around 1 db/cm, caused by light scattered on the air holes, and the necessity to correct the mode profile to match the numerical apertures of the compensating and the compensated fiber, which can be done by a lens. it is not necessary to have a kilometer-long fiber: e.g., the cd of 375 ps/nm accumulated over a 250-km-long link (the distance from prague to brno), assuming the small dispersion of a conventional index-guiding pcf of 1.5 ps/nm/km, can be compensated with two metres of the proposed compensating fiber, provided it has a negative dispersion of −185000 ps/nm/km. the advantages of the proposed method include not only the record-breaking dispersion, but also the large effective mode area of the dispersive fundamental mode (700 μm²). this is more than ten times that of the fiber proposed in [8], and seventeen times more than in [9], achieved in a dcf with external cores. finally, the zero-dispersion wavelength can be tuned over tens of nanometers by winding a bending-based dcf onto a reel; the method does not require doping in the core. bending-induced negative cd would be suitable for compensating the dispersion in high-speed transmission systems after the bending losses have been reduced and the mode profile corrected.

references

[1] hill, k. o., bilodeau, f., malo, b., kitagawa, t., theriault, s., johnson, d. c., albert, j., takiguchi, k.: chirped in-fiber bragg gratings for compensation of optical-fiber dispersion, opt. lett., vol. 19 (1994), p. 1314–1316.

[2] takiguchi, k., okamoto, k., moriwaki, k.: planar lightwave circuit dispersion equalizer, j. lightwave technol., vol. 14 (1996), p. 2003–2011.

[3] steinmeyer, g.: brewster-angled chirped mirrors for high-fidelity dispersion compensation and bandwidths exceeding one optical octave, opt. express, vol. 11 (2003), no. 19.

[4] sakamaki, k., nakao, m.: soliton induced supercontinuum generation in photonic crystal fiber, ieee j. of selected topics in quantum electronics, vol. 10 (2004), no. 5.

[5] gruner-nielsen, l., knudsen, s., edvold, b., veng, t., magnussen, d., larsen, c., damsgaard, h.: dispersion compensating fibers, opt. fiber technol., vol. 6 (2000), p. 164–180.

[6] antos, a., smith, d.: design and characterization of dispersion compensating fiber based on the lp01 mode, j. lightw. technol., vol. 12 (1994), no. 10, p. 1739–1745.

[7] auguste, j. et al.: invited paper, optical fiber technology, vol. 8 (2002), p. 89.

[8] zhang, y., yang, s., peng, x., lu, y., chen, x., xie, s.: design of large effective area microstructured optical fiber for dispersion compensation, in: photonic crystals and fibers, bellingham: spie, vol. 5950 (2005), no. 43, isbn 0-8194-5957-7.
[9] yang, s., zhang, y., peng, x., lu, y., xie, s., li, j., chen, w., jiang, z., peng, j., li, h.: theoretical study and experimental production of high negative dispersion photonic crystal fiber with large area mode field, opt. express, vol. 14 (2006), no. 7, p. 3015.

[10] zsigri, b., lægsgaard, j., bjarklev, a.: a novel photonic crystal fibre design for dispersion compensation, j. opt. a: pure appl. opt., vol. 6 (2004), p. 717–720.

[11] fevrier, s., auguste, j.-l., blondy, j.-m., peyrilloux, a., roy, p., pagnoux, d.: accurate tuning of the highly-negative chromatic dispersion wavelength into a dual concentric core fibre by macro-bending, fibers and waveguide components, p1.8, 2007.

[12] eijkelenborg, m., canning, j., ryan, t.: bending-induced colouring in a photonic crystal fibre, opt. express, vol. 7 (2000), no. 2, p. 88.

[13] gerome, f., auguste, j., maury, j., blondy, j., marcou, j.: theoretical and experimental analysis of a chromatic dispersion compensating module using a dual concentric core fiber, j. lightwave technol., vol. 24 (2006), no. 1.

[14] cozens, j. r., boucouvalas, a. c.: coaxial optical coupler, electron. lett., vol. 18 (1982), no. 3, p. 138–140.

[15] boucouvalas, a. c.: coaxial optical fiber coupling, j. lightwave technol., vol. lt-3 (1985), no. 5, p. 1151–1158.

[16] zhu, z., brown, t. g.: full-vectorial finite-difference analysis of microstructured optical fibers, opt. express, vol. 10 (2002), no. 17, p. 853.

[17] olszewski, j., szpulak, m., urbanczyk, w.: effect of coupling between fundamental and cladding modes on bending losses in photonic crystal fibers, opt. express, vol. 13 (2005), no. 16, p. 6015.

michal lucki
luckim1@fel.cvut.cz
dept. of telecommunication engineering
czech technical university in prague
faculty of electrical engineering
technická 2
166 27 prague, czech republic

design and validation of a probe for spatially and temporally resolved measurements of vorticity and strain rates in compressible turbulence interactions

s. xanthos, m. gong, y. andreopoulos

a custom-made hot-wire vorticity probe was designed and developed, capable of measuring the time-dependent, highly fluctuating three-dimensional velocity and vorticity vectors, and the associated total temperature, in non-isothermal and inhomogeneous flows with reasonable spatial and temporal resolution. these measurements allowed computation of the vorticity stretching/tilting terms, the vorticity generation through dilatation terms, the full dissipation rate of the kinetic energy term and the full rate-of-strain tensor. the probe has been validated experimentally in low-speed boundary layers and used in the ccny shock tube research facility, where interactions of planar expansion waves or shock waves with homogeneous and isotropic turbulence have been investigated at several reynolds numbers.

keywords: vorticity measurements, compressible turbulence, shock and expansion waves.

1 nomenclature

c – speed of sound
cp – heat capacity
e – probe voltage output
m – mass
m – mesh size
p – pressure
r – gas constant
re – reynolds number
s – probe sensitivity
sij – strain rate
t – temperature
t0 – total temperature
tr – reference temperature
tw – wire temperature
ui – instantaneous velocity
ui′ – velocity fluctuation
ūi – mean velocity
x – distance from grid to probe
xi – coordinates
ε – dissipation rate
η – kolmogorov's microscale
λ – taylor's microscale
μ – viscosity
ρ – density

2 introduction

vorticity is a quantity that can describe viscous effects in a flow field much better than velocity, and it is very well suited for defining and identifying organized structures in time-dependent vortical flows, because the streamlines and pathlines are completely different in two different inertial frames of reference. in this respect, a better understanding of the nature of turbulent structures and of the vortical motions of turbulent flows, particularly in the high-wavenumber region, often requires spatially and temporally resolved measurements of velocity derivatives. lighthill [1], in his wide-ranging introduction to boundary-layer theory, provided an extensive description of vorticity dynamics in a variety of flows by using vorticity as a primitive variable for various theoretical considerations. in the past six years our group has made an intense effort to extend the successful techniques developed for measurements of three-dimensional vorticity and the rate-of-strain tensor. these techniques involve measurements in low-speed turbulent boundary layers at reynolds numbers reθ ≈ 2700 [2–5], in vortices generated over delta wings [6], in compressible flows with shock interactions [7–9] and, recently, in compressible flows with expansion wave interactions. as mentioned by vukoslavcevic et al. [10] and wallace and foss [11], the measurement of vorticity was elusive until a few years ago, when this effort was first undertaken.
in the present work, details of the design and evaluation of a vorticity probe based on hot-wire anemometry are presented. some of the fundamental aspects of turbulence can be studied better in flow configurations where the flow is nearly homogeneous and the turbulence is nearly isotropic. the presence of a solid wall as a boundary in turbulent flows complicates the understanding of turbulence by introducing large mean velocity gradients at the wall, which are responsible for the continuous production of turbulence. a better understanding of the effects of a shock wave or of expansion waves on turbulence can be obtained by considering their interaction with grid-generated turbulence, where no streamline curvature is present and there is no wall with no-slip conditions. the flow behind a turbulence-generating grid contains a large variety of turbulent scales, the size of which depends on the distance from the grid and on the grid mesh size. the custom-made vorticity probe has been used for measurements in the interaction of isotropic and homogeneous turbulence with expansion waves. in all experiments, the main objective has been to better understand the physics of the interactions and to establish the behavior of the vorticity field. detailed experimental investigations of interactions of expansion waves with isotropic turbulence simply do not exist. the only existing studies are rather limited and have been confined to turbulent boundary layers, as will be discussed shortly.

3 experimental set-up and techniques

the interactions have been simulated experimentally in the ccny shock tube facility shown in fig. 1. the shock tube facility is of large-scale dimensions, with an inside diameter of 12 inches (304 mm) and a total length of 90 feet (27.4 m), including all components. the present shock tube facility has three distinguishing features.
the most significant one is the ability to control the strength of the reflected shock and the flow quality behind it by using a removable porous end wall, placed at the flange between the dump tank and the working section. the impact of the shock wave on the end wall results in a full normal shock reflection in the case of zero porosity (solid wall), a weak shock reflection in the case of moderate porosity, or expansion waves in the case of unit porosity (open end wall). the second feature of the facility is the ability to vary the total length of the driven section by adding or removing one of the several available pieces or modules, or by rearranging their layout. a proper arrangement of the layout of the various modules of the shock tube can maximize the duration of the useful flow. the third feature of the facility is its large diameter, which allows for a large area of uniform flow in the absence of wall effects, thus providing a platform for high spatial resolution turbulence measurements. a turbulence-generating grid installed at the beginning of the working section of the facility was used to generate a homogeneous and isotropic turbulent flow. the interactions of this flow with shock waves were produced by using the porous end wall of the shock tube, while an open end wall was used to generate an expansion wave. fig. 2 shows schematically the interaction with expansion waves (ew). the working (test) section is fitted with several hot-wire and pressure ports (see fig. 3 for the working section with visible wall pressure taps and grid). thus pressure, velocity and temperature data can be acquired simultaneously at various locations downstream from the grid, and therefore we can reduce the variance between measurements. high-frequency pressure transducers and hot-wire anemometry have been used in the present investigation. pressure transducers were placed throughout the driven section in order to monitor the passage of the shock wave and also to check its uniformity through the driven section. for the present experiments of velocity and vorticity measurements, high-frequency-response kulite pressure transducers of type xcq-062 were installed in the shock tube at multiple locations, so that the wall pressure could be measured simultaneously as a function of time. time-dependent, three-dimensional vorticity measurements were carried out by using the new vorticity probe [4, 5, 8, 13]. fig. 4 shows a layout of the arrangement for the simultaneous measurements of vorticity and wall pressure along the length of the working section. the shock tube was pressurized so that leaks could be detected, as well as to calibrate the pressure transducers. the shock tube was free of leaks and the static response of the transducers was found to be linear. aluminum plates were used as diaphragms and were placed between the driver and the initially conically shaped driven section. a detailed description of the facility and the results of the qualification tests can be found in the work by briassulis [12], briassulis et al. [9] and agui et al. [13].

fig. 1: shock tube facility at ccny, driver section

fig. 2: schematic of the flow interaction with expansion waves

fig. 3: working section with visible pressure taps and turbulence-generating grid

all data were digitized by 4 national instruments analog-to-digital converters, model ni pci 6120, with 16-bit resolution and 1 mhz per channel, for a total of 16 channels.
the data acquisition system was triggered by the arrival of the shock wave at the location of a wall pressure transducer upstream of the grid. the grid was installed at the beginning of the working section, as shown in figs. 3 and 4.

4 vorticity measurements

a new multi-hot-wire probe has been developed which is capable of measuring velocity-gradient related quantities in non-isothermal or compressible flows. the present probe has been built using the experience gained with vorticity measurements in incompressible flows [4], made by a probe with nine wires, and with velocity measurements in compressible flows, made by single and cross-wire probes [9, 14]. the present vorticity probe, which consists of 12 wires, is a modification of the original design with nine wires [4]. the three additional wires were operated in the so-called constant current mode and used to measure the time-dependent total temperature. since the probe essentially consists of a set of three modules or arrays, it is necessary to describe several key features of the individual hot-wire modules (see figs. 5a and 5b). each module contains three hot-wires operated in the constant temperature mode (ctm) and one cold-wire sensor operated in the constant current mode (ccm). the three wires of the triple-wire sub-module are mutually orthogonal, each oriented at 54.7 degrees to the probe axis. each of the 5 μm diameter tungsten sensors is welded onto two individual prongs, which have been tapered at the tips. each sensor is operated independently, hence no common prongs are used. each of the 2.5 μm diameter cold wires was located on the outer part of the sub-module. extensive testing of the probe has been carried out to assess its performance in shock tube flows. the reader is referred to the work of briassulis et al. [8] and agui et al. [9] for details of the tests and of the techniques associated with the use of the probe. the probe was also tested in low-speed incompressible boundary layer flows, where vorticity measurements have been obtained in the past with a 9-wire probe [4] and with optical techniques [15]. the comparison of the data obtained with the new probe with these previous measurements was very satisfactory. the cold-wire signals were first converted to total temperature, which, together with the hot-wire signals, was used to obtain instantaneous three-dimensional mass fluxes at three neighboring locations within the probe. the numerical techniques and algorithms used in the computations of the velocity gradients were very similar to those described by honkan & andreopoulos [4]. the only difference is that in the present case mass fluxes and their gradients were computed at the centroid of each module instead of velocities and velocity gradients. the mass fluxes were further separated into density and velocity by using the method adopted by briassulis et al. [12]. decoupling density from mass fluxes assumes that the static pressure fluctuations are small. this is the so-called "weak" version of the original "strong reynolds analogy" hypothesis of morkovin. the original hypothesis is based on the assumption that pressure and total temperature fluctuations are very small. in the present work, total temperature was measured directly and therefore no corresponding assumptions were needed. the pressure, however, was measured at the wall beneath the hot-wire probes and not at the location of the hot-wire measurement inside the flow field.
the mean value of this pressure signal was used to separate the density and velocity signals, since no mean pressure variation has been detected across a given section of the flow.

fig. 4: 2-d schematic of the working section with pressure and vorticity probe locations along the 2 ft length

fig. 5: vorticity probe: (a) probe sensor geometry and arrays, (b) close-up view of the probe

the procedure involves an expression for the mass flux, mi, in terms of the total temperature, t0, and the pressure, p, at the centroid of each array:

$$m_i = \rho u_i = \frac{p\, u_i}{R\, T} = \frac{p\, u_i}{R \left( T_0 - \dfrac{u_k u_k}{2 c_p} \right)},$$

where ui is the instantaneous velocity component, i = 1, 2 or 3, and $u_k u_k = u_1^2 + u_2^2 + u_3^2$. the velocity can be decomposed into $u_i = \bar{u}_i + u_i'$. an iterative scheme was used to decouple density and velocity. during the first iteration it was assumed that the quantity $(u_2^2 + u_3^2)/(2 c_p)$, where u2 and u3 are the velocity components in the spanwise and normal directions respectively, is substantially smaller than the quantity $T_0 - u_1^2/(2 c_p)$. then the above relation can be rearranged to obtain a quadratic equation for ui:

$$\frac{R\, m_i}{2 c_p}\, u_i^2 + p\, u_i - m_i R\, T_0 = 0.$$

for each digitized point, t0 and mi were available instantaneously at the centroid of each module, while the pressure was measured at the wall. if the thin shear layer approximation is invoked, then the pressure at the centroid of the array, which appears in the last equation, can be substituted by the mean pressure at the wall. this assumption is justified because the lateral pressure fluctuations are extremely small and therefore their impact on the velocity fluctuations is minimal. the discriminant of the above equation,

$$\Delta = p^2 + \frac{2\, m_i^2 R^2 T_0}{c_p},$$

is always positive and therefore there are two real roots. the product of the two roots, as expressed by the ratio of the last term of the l.h.s. of the quadratic equation to the coefficient of the first term, is always negative. therefore one root is positive and one negative. the negative root is unrealistic and only the positive root was accepted. the longitudinal velocity component u1 was computed first, while the other two components were obtained from the mass flux ratios as

$$u_2 = \frac{m_2}{m_1} u_1 \quad \text{and} \quad u_3 = \frac{m_3}{m_1} u_1.$$

these values provided the first estimate of the velocity components, which was used to obtain a better estimate of the quantity $u_k u_k/(2 c_p)$, which subsequently was used to improve the estimate of the velocity components. this iterative scheme required no more than two iterations for convergence. in summary, it should be emphasized that the major contribution of the present hot-wire techniques is the addition of temperature wires to obtain instantaneous information on the total temperature. this allowed all partial sensitivities of the probe to be decoupled from each other; the probe voltage response could thus be decomposed as

$$\mathrm{d}e = \frac{\partial e}{\partial (\rho u)}\, \mathrm{d}(\rho u) + \frac{\partial e}{\partial T_0}\, \mathrm{d}T_0,$$

where e is the voltage output from the probe.
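the decoupling procedure just described reduces to solving the quadratic for u1 and iterating on the lateral components. a minimal sketch, assuming air constants and an interface invented for this illustration (it is not the authors' actual code):

```python
import numpy as np

R_GAS = 287.05   # gas constant of air [j/(kg k)] (assumed)
CP = 1004.5      # heat capacity of air at constant pressure [j/(kg k)] (assumed)

def decouple_mass_flux(m, t0, p_wall, n_iter=2):
    """split mass fluxes m[0..2] (streamwise component first) into density
    and velocity. m: (3, n) arrays [kg/(m^2 s)], t0: total temperature [k],
    p_wall: mean wall pressure [pa]."""
    u2 = u3 = 0.0
    for _ in range(n_iter):
        # first pass neglects (u2^2 + u3^2)/(2 cp); later passes correct for it
        t_eff = t0 - (u2 ** 2 + u3 ** 2) / (2.0 * CP)
        a = R_GAS * m[0] / (2.0 * CP)
        b = p_wall
        c = -m[0] * R_GAS * t_eff
        disc = b ** 2 - 4.0 * a * c              # = p^2 + 2 m^2 r^2 t / cp > 0
        u1 = (-b + np.sqrt(disc)) / (2.0 * a)    # only the positive root is physical
        u2 = m[1] / m[0] * u1                    # lateral components from flux ratios
        u3 = m[2] / m[0] * u1
    rho = m[0] / u1
    return rho, u1, u2, u3
```

in line with the text, two iterations are normally sufficient for convergence.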
the streamwise derivatives of the three velocity components usually require the use of taylor's hypothesis of "frozen" convected turbulence to convert temporal derivatives of velocity into spatial derivatives. in the present work this has been accomplished instead by using the full momentum equations, with the viscous terms ignored, to estimate the streamwise derivatives of the three velocity components. these expressions can be written as:

$$\frac{\partial u_1}{\partial x_1} = -\frac{1}{u_1}\left(\frac{\partial u_1}{\partial t} + u_2 \frac{\partial u_1}{\partial x_2} + u_3 \frac{\partial u_1}{\partial x_3} + \frac{1}{\rho}\frac{\partial p}{\partial x_1}\right),$$

$$\frac{\partial u_2}{\partial x_1} = -\frac{1}{u_1}\left(\frac{\partial u_2}{\partial t} + u_2 \frac{\partial u_2}{\partial x_2} + u_3 \frac{\partial u_2}{\partial x_3} + \frac{1}{\rho}\frac{\partial p}{\partial x_2}\right),$$

$$\frac{\partial u_3}{\partial x_1} = -\frac{1}{u_1}\left(\frac{\partial u_3}{\partial t} + u_2 \frac{\partial u_3}{\partial x_2} + u_3 \frac{\partial u_3}{\partial x_3} + \frac{1}{\rho}\frac{\partial p}{\partial x_3}\right).$$

thus the determination of the streamwise gradients ∂ui/∂x1 is not based entirely on taylor's original hypothesis. all the terms in the above are available at each time step, with the exception of the pressure gradients.

fig. 6: typical time-dependent wall pressure signals and their temporal gradients at three different locations in the working section below the hot-wire probes

the mean values of the pressure gradients in the lateral directions are zero and their fluctuations are extremely small. in this respect, ∂p/∂x2 and ∂p/∂x3 have been neglected in the corresponding equations. the pressure gradient ∂p/∂x1, which appears in the streamwise gradient of the longitudinal velocity, is not zero through the flow expansion zone. fig. 6 shows three wall pressure signals obtained at three different locations in the working section below the hot-wire probes. their temporal gradients, ∂p/∂t, are also shown in the same figure. these temporal gradients were converted to spatial gradients of pressure, ∂p/∂x1, by considering their propagation into the upstream flow, which takes place with a relative velocity equal to the local speed of sound, c. the propagation velocity of the ew relative to laboratory coordinates is c(t) − u1(t), where u1 is the local flow velocity. both time-dependent quantities, c and u1, are available from the combined temperature and velocity measurements. in this respect the spatial gradient was evaluated from the temporal gradient through the relation

$$\frac{\partial p}{\partial x_1} = -\frac{1}{c - u_1} \frac{\partial p}{\partial t}.$$

the contribution of this term, as well as those of the other terms, to the final computed value of $s_{11} = \partial u_1 / \partial x_1$ is demonstrated in fig. 7, where each of the individual terms is plotted separately.

fig. 7: computation of the velocity gradient ∂u1/∂x1 from its contributing terms; the signals are shifted by multiples of 4000 s⁻¹

the data in fig. 7 clearly show that the major contribution to s11 comes from the term $-\frac{1}{u_1}\frac{\partial u_1}{\partial t}$, which provides most of its high-frequency content. the terms $-\frac{u_2}{u_1}\frac{\partial u_1}{\partial x_2}$ and $-\frac{u_3}{u_1}\frac{\partial u_1}{\partial x_3}$ are substantially smaller than the leading term, a finding which also agrees with our previous work [8, 9]. the contribution from the pressure gradient term, $-\frac{1}{\rho u_1}\frac{\partial p}{\partial x_1} = \frac{1}{\rho u_1 (c - u_1)}\frac{\partial p}{\partial t}$, appears to provide a low-frequency content of modest amplitude to the final value of ∂u1/∂x1. the only assumption made in the present analysis is that the magnitude of the pressure fluctuations inside the flow field is negligible compared to the mean pressure upstream and downstream of the interaction, or to the slowly varying pressure within the expansion zone. this assumption introduced an uncertainty into the computations, which will be considered in the next section. it should be emphasized that only s11 is affected directly by the pressure gradient along the expansion zone where the interaction takes place. the rest of the velocity gradients, and therefore all components of the vorticity, are not affected directly by pressure gradients.
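the assembly of s11 from these terms is straightforward once all the time series share a common clock. the sketch below is a minimal illustration under stated assumptions: the lateral gradients are taken as given (in practice they come from the neighbouring probe arrays), the speed of sound is computed from the ideal-gas relation, and all names are invented for this example:

```python
import numpy as np

GAMMA, R_GAS = 1.4, 287.05   # ideal-gas properties of air (assumed)

def s11_from_momentum(t, u1, u2, u3, du1_dx2, du1_dx3, p_wall, rho, temp):
    """du1/dx1 from the inviscid streamwise momentum equation, instead of
    taylor's frozen-turbulence hypothesis; all arguments are 1-d time series."""
    du1_dt = np.gradient(u1, t)
    dp_dt = np.gradient(p_wall, t)
    c = np.sqrt(GAMMA * R_GAS * temp)       # local speed of sound
    dp_dx1 = -dp_dt / (c - u1)              # expansion wave travelling upstream
    return -(du1_dt + u2 * du1_dx2 + u3 * du1_dx3 + dp_dx1 / rho) / u1
```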
5 uncertainty estimates

the pressure and total temperature measurements depend directly, through the obtained calibration constants, on the raw voltage data from the individual sensors. because of their linear response, these probes produced two calibration constants, a sensitivity and a d.c. offset. therefore, estimates of the uncertainty in the measurements of pressure and total temperature acquired through a 16-bit a/d converter depended mostly on the bit resolution and on the residual errors of the calibration constants. uncertainties of less than 0.5 % in pressure and of about 2 % in total temperature were found for typical measurements of these two quantities. the mass flux measurements were tied to significantly more complex relations, which depended on the individual and relative geometry of the different sensors. the mass flux was found to depend on the following variables: the captured raw voltage ei, the reference temperature tr, the total temperature t0, the wire temperature tw, the calibration constants and the yaw or pitch coefficients. uncertainty values for the velocity were estimated to be between 1–3 %. these values are slightly better than those in [8] and [9], because of the higher resolution of the adcs. in obtaining all these estimates, the square root of the sum of the squares of all partial uncertainties involved was assumed to model the error propagation into the final results. a mathcad code was used to calculate the partial uncertainties. the density variation across the sensing area has also been estimated, from the δp/p obtained above and the measured variation of the total temperature t0, through the uncertainty propagation formula:

$$\frac{\delta \rho_0}{\rho_0} = \left[ \left( \frac{\delta p}{p} \right)^2 + \left( \frac{\delta T_0}{T_0} \right)^2 \right]^{1/2}.$$

this predicted approximately the same uncertainty as in the case of the pressure variation. following the work of agui et al. [13], estimates of the uncertainties associated with the measurements of the velocity gradients were also obtained by considering the propagation of the uncertainties in the measurement of each quantity involved in the process. a typical velocity gradient is measured through the following approximation:

$$\frac{\partial u_i}{\partial x_j} \approx \frac{u_2 - u_1}{l_p} = f,$$

where u2 and u1 are the velocities at two nearby locations and lp is the distance between these locations. if the uncertainties in the measurements of u2 and u1 are the same, $\delta u_1 = \delta u_2 = \delta u$, and lp is determined accurately, then the relative uncertainty δf/f will be given by:

$$\frac{\delta f}{f} = \left[ 2 \left( \frac{\delta u}{u_2 - u_1} \right)^2 \right]^{1/2}.$$

a typical δu is 2 % of the mean u, which corresponds to about 2 m/s, while typical velocity differences u2 − u1 can be up to six times the r.m.s. value u′. if a typical value of this velocity difference of about 30 m/s in the near field of the grid and 15 m/s further downstream is assumed, then the uncertainty δf/f appears to be 10 % in the near field and 14 % in the far field. lower uncertainty estimates are found if the relation

$$\frac{\partial u_i}{\partial x_j} \approx \frac{u'}{\lambda} = f$$

is used for their computation. in this case the relative error is

$$\frac{\delta f}{f} = \left[ \left( \frac{\delta u'}{u'} \right)^2 + \left( \frac{\delta \lambda}{\lambda} \right)^2 \right]^{1/2},$$

where λ is taylor's microscale. for a typical relative error of 5 % in u′ and of 10 % in λ, the relative error appears to be about 11 %. it should be noted that the relative error δf/f increases as the distance away from the grid increases, because the absolute value of f decreases. finally, the finite number of statistically independent events considered in the data analysis of certain flow cases introduces an uncertainty into the statistical results.
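the root-sum-square combinations used above reduce to a few lines of code. the sketch below simply re-evaluates the typical values quoted in the text as a consistency check (the helper name is invented for this illustration):

```python
import numpy as np

def rss(*rel_errors):
    """root-sum-square combination of independent relative uncertainties."""
    return float(np.sqrt(sum(e ** 2 for e in rel_errors)))

print(rss(0.005, 0.02))            # density from dp/p and dt0/t0: ~2 %
print(np.sqrt(2.0) * 2.0 / 30.0)   # finite-difference gradient, near field: ~10 %
print(rss(0.05, 0.10))             # u'/lambda form of the gradient: ~11 %
```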
computations of the integral time scale lt from the auto-correlation functions ruu(τ) indicated that the number of independent samples was in general between 200 and 400. downstream of the interaction the time scales lt increase and the number of statistically independent events is reduced. in addition, the duration of the useful data upstream of the shock is shortened at locations close to the porous end wall, because the reflected shock wave arrives there earlier than at locations close to the grid. the onset of the useful data duration in the region upstream of the shock is also delayed by the arrival of the air mass that has not gone through the grid. the number of independent samples in these cases was about n = 60–100. the relative error in the estimate of the variance of the velocity fluctuations is 2/n, which for this specific case at large x/m is between 2 and 4 %. it should be noted that n depends on the shape of ruu, which can be extended to large values if low-frequency disturbances, not related to the actual flow turbulence, are present in the flow field. if high-pass filtering at 200–400 hz is applied to the present data, lt is reduced substantially and n increases by a factor of 2. no such filtering has been applied to the present data other than what is imposed by the record length. for a 10 ms record length the lowest frequency of interest is about 100 hz. further direct evidence of the adequacy of the statistical samples is provided by the rate of convergence of the various statistical quantities computed in the present data analysis. as was shown in [8], estimates of the convergence uncertainties observed in the present analysis indicate an error of less than 3 %. this error is substantially smaller at higher mach numbers and at locations closer to the grid. the spatial resolution of the probe is between 0.6 λ and 3 λ in the region upstream of the shock, and between 0.3 λ and λ in the downstream region. the resolution expressed in kolmogorov's viscous scale $\eta = (\nu^3/\varepsilon)^{1/4}$ appears to be in the range between 3 η and 30 η. in this respect, the expected attenuation of the measured vorticity r.m.s. due to the limited spatial resolution is not very significant.

6 conclusions

a custom-designed vorticity probe was used to measure, for the first time, the rate-of-strain, rate-of-rotation and velocity-gradient tensors in compressible turbulent flows. testing and validation of the probe and its eventual use in the shock tube flow field were formidable tasks. the difficulties associated with the measurement of velocity gradients in non-isothermal flows have been discussed. issues related to calibration, data analysis and spatial and temporal resolution appeared to be the most challenging. reynolds numbers based on taylor's microscale ranging from 180 to 210 have been achieved. the interactions have been investigated by measuring the three-dimensional velocity and vorticity vectors and the full velocity gradient and rate-of-strain tensors with instrumentation of high temporal and spatial resolution. this allowed estimates of the dilatation, the compressible dissipation and the dilatational stretching to be obtained.

acknowledgments

the financial support provided by nasa & afosr is greatly acknowledged.

references

[1] lighthill, m. j.: boundary layer theory, in: laminar boundary layers (ed. rosenhead), oxford university press, 1963.

[2] agui, j.
h., andreopoulos, j.: development of a new laser vorticity probe – lavor. fluids engineering division of asme, international symposium on laser anemometry (eds. huang, otugen), lake tahoe, nv, fed vol. 191 (1994), p. 11–19.

[3] agui, j. h., andreopoulos, y.: a new laser vorticity probe – lavor: its development and validation in a turbulent boundary layer. experiments in fluids, vol. 34 (2003), p. 192–205.

[4] honkan, a., andreopoulos, y.: vorticity, strain-rate and dissipation characteristics in the near-wall region of turbulent boundary layers. j. fluid mech., vol. 350 (1997a), p. 29–96.

[5] andreopoulos, y., honkan, a.: an experimental study of the dissipative and vortical motions in turbulent boundary layers. j. fluid mech., vol. 439 (2001), p. 131–163.

[6] honkan, a., andreopoulos, y.: instantaneous three-dimensional vorticity measurements in vortical flow over a delta wing. aiaa j., vol. 35 (1997b), no. 10, p. 1612–1620.

[7] andreopoulos, y., agui, j. h., briassulis, g.: shock wave-turbulence interactions. annual review fluid mech., vol. 32 (2000), p. 309–345.

[8] briassulis, g., agui, j., andreopoulos, y.: the structure of weakly compressible grid turbulence. j. fluid mech., vol. 432 (2001), p. 219–283.

[9] agui, j. h., briassulis, g., andreopoulos, y.: studies of interactions of a propagating shock wave with decaying grid turbulence: velocity and vorticity field. j. fluid mech., vol. 524 (2005), p. 143–195.

[10] vukoslavcevic, p., wallace, j. m., balint, j.: the velocity and vorticity vector fields of a turbulent boundary layer. part 1: simultaneous measurement by hot-wire anemometry. j. fluid mech., vol. 228 (1991), p. 25–51.

[11] wallace, j. m., foss, j.: the measurement of vorticity in turbulent flows. annual review fluid mech., vol. 27 (1995), p. 469–514.

[12] briassulis, g. k.: unsteady nonlinear interactions of turbulence with shock waves. ph.d. thesis, city college of cuny, 1996, new york.

[13] agui, j.: shock wave interactions with turbulence and vortices. ph.d. thesis, city college of cuny, 1998, new york.

[14] briassulis, g., honkan, a., andreopoulos, j., watkins, c. b.: application of hot-wire anemometry in shock-tube flows. experiments in fluids, vol. 19 (1995), p. 29.

[15] agui, j. h., andreopoulos, y.: a new laser vorticity probe – lavor: its development and validation in a turbulent boundary layer. exp. fluids, vol. 34 (2003), no. 2, p. 192–205.

savvas xanthos, ph.d.
minwei gong, ph.d.
yiannis andreopoulos, ph.d.
experimental aerodynamics and fluid mechanics laboratory
the mechanical engineering department
the city college of the city university of new york
new york, new york 10031, usa

time-mean helicity distribution in turbulent swirling jets

v. tesař

helicity offers an alternative approach to investigations of the structure of turbulent flows. knowledge of the spatial distribution of the time-mean component of helicity is the starting point. yet very little is known even about basic cases in which helicity plays an important role, such as the case of a swirling jet. this is the subject of the present investigations, based mainly on numerical flowfield computations. a significantly large time-mean helicity density is found only in a rather small region, reaching to several nozzle diameters downstream from the exit. the most important result is the similarity of the helicity density profiles.

keywords: helicity, chirality, jets, swirling flows, swirl generator, supercirculation.

1. introduction

the concept of helicity was originally introduced in the 1950s in the context of magnetohydrodynamics, where it proved to be very useful and has been used quite extensively. its use in the hydromechanics of non-conducting fluids was introduced in 1969 by moffat [1], who proposed to apply it as a measure of the "knottedness" of vortex lines, which are known to be often quite tangled in turbulent flows. helicity is a quantity of very fundamental character, closely related to basic topological aspects of the flowfield. moreau [4] formulated the helicity conservation theorem. this holds for ideal non-viscous fluids, in particular for any inviscid barotropic fluid flowing under the influence of conservative body forces.
in the theory of non-viscous flows, this invariance gives helicity a status comparable to that of quantities as basic as energy. there have been several attempts to use helicity also as the starting point in attacking the problem of fluid flow turbulence. recently, interest and hopes have revived, as documented e.g. in [5, 6] and [8], and particularly in the studies by zimmerman [11, 12, 13] of helical coherent structures in turbulence, which have led to important results of direct practical engineering relevance. the present study of the helicity distribution in a swirling jet was performed to support a more extensive current research program aimed at applying the concept to the description and improvement of fuel and oxidant mixing in swirl burners. the helicity of a fluid flow [2] is defined as an integral (across a specified volume) of the scalar product of the local velocity vector w, characterising the velocity field, and the local vorticity vector ω, which characterises the rotational motions, as shown in figs. 1 and 2. in practical flowfield studies, the variable of interest is the local quantity, the helicity density h (m/s²), as defined in fig. 2:

$$h = \mathbf{w} \cdot \boldsymbol{\omega} = |\mathbf{w}|\,|\boldsymbol{\omega}| \cos\vartheta = w_1 \omega_1 + w_2 \omega_2 + w_3 \omega_3. \qquad (1)$$

an interesting fact is that, despite being a quantity fully determined by a single numerical value, helicity is not a simple scalar. its sign depends upon the reference frame used. such quantities are sometimes characterised as pseudo-scalars: the change of sign takes place during the transition from a reference frame of right-handed chirality to a left-handed one. the idea of the helical shape of fluid objects deformed by a flowfield exhibiting high helicity is presented in fig. 1. it is a very idealised case, with the two vectors w and ω collinear, the rotational speed very high, and the deformation very regular. examples of such shapes are known e.g. from tornadoes. nevertheless, in usual turbulent flows, especially with generally inclined local vectors w and ω (fig. 2) and irregularly deformed objects, the helical shape is much less easy to recognise. the deformation affects any object in the fluid. objects of particular interest are turbulent vortices. a vortex visualised by e.g. dye addition and exposed to a shear flow is elongated and twisted as if it were a deformable body, because vortices cannot be simply destroyed (they decay only gradually – in an ideal frictionless fluid their vorticity ω is conserved) and cannot be torn into pieces – this is prevented by the consequences of kelvin's theorem, the essence of which [3] may be stated as vortex cores behaving like material objects retaining their identity.
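eq. (1) translates directly into a few lines of numpy. the sketch below evaluates the helicity density field from a velocity field given on a uniform 3-d grid; the central-difference vorticity and all names are assumptions of this illustration:

```python
import numpy as np

def helicity_density(u, v, w, dx, dy, dz):
    """pointwise helicity density h = w . omega of eq. (1).
    u, v, w: velocity components indexed [i, j, k] along x, y, z."""
    du = np.gradient(u, dx, dy, dz)   # du[0] = du/dx, du[1] = du/dy, du[2] = du/dz
    dv = np.gradient(v, dx, dy, dz)
    dw = np.gradient(w, dx, dy, dz)
    om1 = dw[1] - dv[2]               # omega_x = dw/dy - dv/dz
    om2 = du[2] - dw[0]               # omega_y = du/dz - dw/dx
    om3 = dv[0] - du[1]               # omega_z = dv/dx - du/dy
    return u * om1 + v * om2 + w * om3
```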
fig. 1: helical deformation of a fluid volume in turbulent flows. this idealised shape would arise in the exceptional case of strong rotation with mutually collinear vorticity and velocity vectors

fig. 2: the quantity evaluated in the present investigation is the helicity density h – the scalar product of the local velocity vector w and the vorticity vector ω

once the vortex core is bent, it becomes twisted and curled, even if it were alone in the flowfield, by the induced rotational motion generated by the interaction with its other parts. much more tangled shapes, due to the induced motions, are then caused by the presence of other vortices in the neighbourhood. due to the stochastic nature of the motions in typical turbulent flows, the deformation is irregular. as a result, the real shapes of the visualised coherent vortical structures generally bear only a very distant resemblance to the regular shape indicated in fig. 1. the "pitch" between the loops is usually large relative to the diameter of the object, and the stochastic character of the deformation causes the "pitch magnitude" to be different at different locations. the shapes are also typically skewed, due to the mutual inclination of the two vectors (fig. 2), which again may vary from one location to another. finally, the lifetime of vortical structures is usually too short to permit seeing more than a single loop. nevertheless, the idea of the real deformation being composed of superimposed components corresponding to fig. 1 is a useful working concept. in fact, the shapes of the deformed vortices need not be helical at all for them to exhibit nonzero helicity. moffat in [1] evaluated the helicity of the interlooped configuration of vortices shown in fig. 3, and demonstrated it to be proportional to the product of their circulations Γ and a topological invariant called the linking number of the two loops. configurations similar to those in fig. 3, typically with several smaller loops instead of just the single one shown here, arise in the process of mutual interaction of structures, once the intertwined loops start to undergo the processes of "cut and connect" – a mechanism which may be particularly relevant for evaluating the spectral transfer of helicity. the starting point in investigations of helicity transport phenomena is the spatial distribution of time-mean helicity on the integral (or device) scales. unfortunately, very little information is available, despite the engineering importance of many flows with a particularly strong helical spinning mode (lacking axial symmetry). the swirling character may be imparted externally. however, swirling flows, like the well-known case of the bathtub vortex, typically exhibit self-induced rotation as a result of breaking the initially helicity-free reflectional symmetry. the purpose of this paper is to provide some information about recent investigations of time-mean helicity in a very basic flow case – that of a round jet with rotation about its axis, caused by imparting a swirl upstream inside the nozzle. both experimental (using piv) and computational studies were performed. so far most of the results have been obtained from the computations. the experiments have been delayed by the necessity of amending the standard optical interrogation system with additional components which generate a parallel light sheet of a different colour, to resolve the derivatives in evaluating the vorticity.
the resultant unique piv system is specially suitable for helicity measurements. in the computations, helicity has been evaluated at individual points of the flowfield by multiplying the local collinear components of velocity and vorticity and summing the three products, eq. (1).

2 novel swirl generator

nozzles for generating swirling jets are commonly based upon one of two usual types of swirl generators. on the one hand, there are passive devices inserted into the nozzle (or immediately upstream), the geometry of which disrupts the geometric axial symmetry. the most common forms are twisted tapes, helical rifling of the nozzle walls, or inserted inclined vanes. this method of generating the swirling motion was unsuitable for the experimental part of the present project, as it does not allow variations in the intensity of the generated swirl, unless the swirl-imparting vanes are made with variable inclination, as discussed below. on the other hand, there are active devices, with the swirl generated by the admission of an adjustable tangential inflow component superimposed on the basic axial flow (which, in the extreme case, may even vanish).

fig. 3: a high helicity in a fluid flow does not necessarily mean the presence of the helical coherent structures of fig. 1. much more important may be the high helicity found for mutually linked vortex rings (which, according to some theories, may be the basic structures in turbulence)

fig. 4: tangential inlet generation of swirling flows, which was found unsuitable for the experimental study: a) a single inlet produces asymmetric flow and instability, b) a complicated multi-inlet layout can eliminate the asymmetry but retains the instability and an uneven nozzle exit velocity profile
instead of mechanically adjustable vanes, the author used for imparting the angular momentum in the upstream enlarged inlet cross section of the nozzle the idea developed in the 1970s for use in fluidic valves [9] and later applied by markland and co-workers [10] for switching gas turbine exit flows. the idea is to impart the tangential momentum by fixed vanes with control jets issuing from their trailing edges. the vanes form the swirl generator, located in the upstream part of the nozzle, fig. 5. as shown in fig. 6, the vanes are hollow, made from parallel metal sheets and two round rods forming both the leading edge and the trailing edge. the thin jet issues from one of the slits between the outer sheets and the trailing edge rod (the other slit is used for generating the swirl of reverse chirality). the tangential momentum is imparted due to the circulation generated on the vane by the inclination of the jets (fig. 6) caused by their attachment to the round trailing edges by the coanda effect. a circulation around the vane is generated despite the geometric angle of attack of the vane being zero. the advantages of this novel swirl generator are: a) as in all principles using vanes, by imparting the angular momentum over the whole radius the swirl generator produces a flat and regular resultant velocity profile. b) the flat vane shape imparts no angular momentum without the control jet flow, generating a non-swirling jet. c) despite the flat vane shape, the magnification effect of the supercirculation makes it possible to impart angular momentum values at least an order of magnitude larger than with a vane at an adjusted attack angle (which is prone to separation at its leading edge if the attack angle is set too large). d) the control effect is adjustable instantly, by changing the control air flow admitted into the hollow vanes (while mechanical adjustment of the vanes takes time) and because of absence of inertia of moving components the generated swirl may even be made time-dependent. e) thanks to the supercirculation effect, the control flow rate needed for a strong swirl is very small – an order of magnitude smaller (a magnification factor in excess of 20 was measured in [9]) than in the direct tangential inflow according to fig. 4. f) the swirl is reproducible and may be adjusted continuously – while mechanically adjusted vanes can be set reproducibly only if they are mutually linked by some mechanical linkages (which makes them expensive) or if they are provided with a set attack angle fixation mechanism. however in this case it is not possible to adjust them continuously, only to attack angles arranged in the mechanism. g) the new coanda jet vanes are simple and easy to make. a virtual model of the swirl generator showing the four vanes and the small centre body is shown in fig. 7. the vanes © czech technical university publishing house http://ctn.cvut.cz/ap/ 11 czech technical university in prague acta polytechnica vol. 45 no. 6/2005 fig. 5: the used nozzle with the original swirl generator based upon the supercirculation produced on fixed vanes by the coanda effect. fig. 6: the principle of supercirculation vanes as successfully developed at ctu in 1973 [9]. the ambichirality required in the present investigations called for two control flow exit slits (and two inlet cavities) in each vane. are placed in the 112 mm dia upstream inlet part of the nozzle. the nozzle cross section decreases in the flow direction (fig. 
5) to the d � 32 mm exit, with a short constant-diameter channel. the vane chord is 52 mm, thickness 9 mm. the leading edge is of r � 4 mm radius, and the trailing edge is of smaller, r � 3.5 mm radius. on both sides of the trailing edge there is one 0.3 mm wide control flow exit slit. fig. 8 presents a photograph of the nozzle and the generator as they were used in the first experiments (with air inlet connected for only one direction of generated swirl rotation). 3 numerical model and preliminary tests numerical flowfield computations were performed using a finite volume discretisation of the computation domain. initially, the nominal symmetry of the flowfield was used to save some computation time by limiting the domain to only one quadrant (positive values of co-ordinates x2 and x3), with periodicity boundary conditions on the symmetry planes. the saving was, however, not large and there were some discrepancies caused by the enforced boundary conditions so that in the actual computation runs the domain was complete 3d volume. the downstream space where the investigated swirling jet is formed is cylindrical, 640 mm long, which is equivalent to 20 nozzle exit diameters. the diameter of this space is also 640 mm, i.e. its transversal dimension is also equal to 20 nozzle exit diameters. while this may not suffice for a complete development of the turbulent structure of the generated jet (note that attaining the full equilibrium of turbulent fluctuation energy requires lengths larger than 60 nozzle diameters – cf. fig.10), the size is definitely much larger than the region of interest in the experiments, and certainly suffices for evaluating the time-mean helicity (as is shown below, the region of substantial helicity magnitude does not extend to more than about 4 nozzle diameters – cf. 12 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 45 no. 6/2005 czech technical university in prague fig. 7: virtual computer model of the swirl generator from the upstream (inlet) side fig. 8: actual nozzle (left) and swirl generator (right) used in the experimental investigations fig. 9: with zero swirl the axial time-mean velocity profiles compare successfully with existing similarity solutions [16] and [17] fig. 10: less satisfactory agreement with similarity solutions [17] for the turbulent fluctuation energy in the no-swirl case is explained by very long required development length, as found also in other investigations fig. 18). the inlet into the nozzle in the experiment is from a large settling box. this is not modelled in its full extent, as experience has shown that with the character of the flow determined by the upstream boundary conditions it is sufficient to add to the domain just a small space uprstream from the nozzle. in the computations this upstream space was hemispherical, of 110 mm radius, as shown e.g. in fig. 11. an unpleasant fact for computation is the presence of incongruent length scales. for efficient (and not excessively time-consuming) computations, the discretisation should lead to as small a number of grid elements as possible, all of them preferably not differing in size and having aspect ratios near to 1.0. in the investigated case there is, however, on the one hand the size of the control slits at the trailing edges of the swirl generator vanes as narrow as 0.3 mm – while there is the 640 mm long downstream volume. this means a more than 1:2000 ratio of the important scales. 
the elements of the discretisation grid cannot have such a span of sizes, and it was necessary to use a larger number of small elements to cover the large space. after the grid adaption, performed by refinement in regions of large velocity magnitude gradient during the initial runs, the final unstructured tetrahedral discretisation contained as many as 921 167 cells. the computations were run using the two-equation turbulence model with the standard set of model constants, and with the rng approach for the modification of the model in regions of low turbulence reynolds numbers. prior to the actual swirling jet computations, verification runs were performed with zero control flows from the control slits in the vanes. the intention was to confront the computational procedure used with known results. in particular, the computed profiles of time-mean velocities and turbulence parameters were compared with the similarity solution of the axisymmetric turbulent jet using the one-equation model of turbulence by the present author and his student j. šarboch [16], and the later analogous similarity solution using the two-equation model of turbulence described in [17]. the computed axial velocity profiles were used to evaluate the convention diameter δ0.5, defined as the transversal distance between the two locations in which the axial time-mean velocity equals 0.5 wm, where wm is the maximum axial velocity at the same downstream distance x1. the profiles were converted into similarity co-ordinates, as shown in fig. 9, with the local time-mean axial velocity related to the velocity maximum in the particular profile and the transverse distance from the jet axis related to the convention jet diameter. as seen in fig. 9, all profiles were indeed found to be similar – which, in fact, is somewhat surprising, since this similarity was found at downstream distances from the nozzle smaller than the usually encountered development lengths. fig. 9 also documents excellent agreement of the computed data points with the available similarity solutions from [16] and [17]. similar comparisons were also made for the computed specific values of the fluctuation energy (per unit mass – as seen from the dimension m2/s2) nondimensionalised by relating them to the square of the maximum velocity wm. the computed values were much lower than the equilibrium values assumed in the similarity solutions. this, however, is not surprising in view of the fact that the jet leaves the nozzle with a very small, almost negligible level of turbulent fluctuations. the fluctuations are produced in the shear layers of the jet, and a considerable downstream distance is therefore needed before the turbulence assumes the fully developed state. this downstream distance is much longer than generally believed – as demonstrated in the present author's earlier experimental measurements (with his student t. střílka – [17]) presented in fig. 10, where a length as large as 60 nozzle exit diameters was required for complete development. the present results are quite in line with the general character of the experimental data in fig. 10, where the values of the dimensionless fluctuation energy on the jet axis are plotted as a function of the relative downstream distance. the most important conclusion from these validation computations is the substantial difference between the advection-dominated case of the velocity field and the production-dominated specific energy field.
the profiles of the former exhibit a local maximum on the jet axis and values decreasing monotonically with the streamwise distance. on the other hand, the latter exhibit an off-axis maximum and a gradual streamwise growth in the locations of present interest immediately downstream from the nozzle exit. 4 helicity computation results the investigations being rather time-consuming, only five cases of swirling jets (listed in table i) have been evaluated so far. they suffice, however, for the formulation of some useful more general results. all flows investigated so far have been in quite high reynolds number regimes. the four cases a, b, c, and d share the same flow rate magnitude supplied into the axial inlet and differ in the magnitude of the control flow rate admitted into the vane exit slits (distributed equally into all four slits). the four control flow rate magnitudes in the tests increase roughly in a geometric series. in the cases c and d, in fact, all flow rates were the same and the only difference was the control flow issued from the other set of four slits, producing the reverse rotation sense (direction). the last two cases, d and e, on the contrary, share the same magnitude of the control flow rate and differ in the magnitude of the axial flow rate. fig. 11: computed pathlines (colour coded by velocity magnitude) of the left-hand swirling flow downstream from the swirl generator at a medium magnitude of applied control. as the first observation, the computed pathlines in fig. 11 and fig. 12, as well as the experience gained with the laboratory rig, indicate very satisfactory operation of the novel swirl generator. this was already seen from the very small values of the relative control flow rates sufficient for obtaining quite intensive swirl. in spite of the rather small radius (only r = 3.5 mm) of the vane trailing edges, the jet sheets issuing from the control slots safely attach to them and generate the required circulation about the vanes. the quality of the flow issuing from the nozzle is completely satisfactory. the example of evaluated tangential velocity profiles in fig. 13 shows that the generated jet rotation has a solid body character (velocity increasing in proportion to the radial distance) – decaying quite rapidly, however, as soon as the jet leaves the nozzle. as for the evaluated helicity, because of the rather complicated evaluation procedure and also because of the differentiation operation involved in computing the vorticity, the scatter of the data is quite large, even though the velocities have converged to smooth spatial distributions. there is also a remarkable difference in the size of the velocity and helicity flowfields. the helicity dissipates and diffuses away very fast once the jet leaves the nozzle, while the velocity field exhibits significantly nonzero values in a large part of the available downstream volume. a decrease of the helicity magnitude to only 5 % (i.e. to within the likely error of the results) of the initial nozzle exit value was found to take place at a distance shorter than 4 exit diameters.
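the evaluation chain described above – the velocity components dotted with a vorticity obtained by numerical differentiation – is easy to sketch. the following python fragment (a minimal illustration assuming a uniform grid; it is not the evaluation code actually used) computes the helicity density as the scalar product of velocity and vorticity (cf. eq. (2) below) and checks it on a solid-body rotation with an axial through-flow, for which the helicity density equals 2ΩW exactly:

```python
import numpy as np

def helicity_density(w1, w2, w3, dx):
    """helicity density h = w . curl(w) on a uniform 3-d grid with
    spacing dx, using central differences for the vorticity."""
    d = lambda f, ax: np.gradient(f, dx, axis=ax)
    o1 = d(w3, 1) - d(w2, 2)      # vorticity component along x1
    o2 = d(w1, 2) - d(w3, 0)      # vorticity component along x2
    o3 = d(w2, 0) - d(w1, 1)      # vorticity component along x3
    return w1 * o1 + w2 * o2 + w3 * o3

# check: solid-body rotation (angular velocity om) plus axial velocity w_ax
n = 32
x = np.linspace(-1.0, 1.0, n)
x1, x2, x3 = np.meshgrid(x, x, x, indexing="ij")
om, w_ax = 3.0, 5.0
h = helicity_density(w_ax + 0.0 * x1, -om * x3, om * x2, x[1] - x[0])
assert np.allclose(h, 2.0 * om * w_ax)    # h = 2*om*w_ax everywhere
```

the check also illustrates why the evaluated helicity scatters more than the velocity: the differentiation amplifies any noise present in the converged velocity field.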
4.1 two helicity generating mechanisms the first among the interesting results which do not seem to have been discussed in the available literature is the discovered existence of two different regions with significantly increased helicity: one of them is the solid-body-rotation core, and the other is the boundary layer on the inner wall of the nozzle. the two regions are discernible downstream from the nozzle in fig. 14. the regions may even exhibit reverse chirality. if a positive (right-handed) helicity is generated by the swirl generator vanes in the core, as shown in fig. 15, then the helicity in the near-wall layer is negative – fig. 16. quite surprisingly, no doubt due to the pseudo-scalar nature of helicity, reversing the sense (direction) of the generated rotation did not lead to a simple change in the sign of the results. table i: list of investigated flows; the otherwise identical cases c and d differ in the left-handed and right-handed direction of swirl rotation; the characteristic swirl angle is evaluated from the ratio of the tangential and axial velocity maxima in the nozzle exit. fig. 12: computed pathlines (colour coded) released from a tangentially located line so that they approach one of the vanes in a tangential plane perpendicular to the vane span. fig. 13: typical profiles of tangential velocity and their changes with axial distance downstream from the nozzle exit. fig. 14: typical contours of positive and negative helicity in the perpendicular plane downstream from the nozzle exit, evaluated for flow with positive (i.e. right-handed) swirl chirality. the sign of the helicity generated in the wall layer tended to be the same (negative), so that the resultant helicity in the nozzle exit is subtracted in the positive swirl chirality flow case c (as shown in the top part of fig. 17) and added to the negative swirl chirality in the flow case d (in the bottom part of fig. 17). fig. 15: typical positive helicity contours evaluated in the meridian plane for a jet with positive swirl chirality – case d, with a medium applied relative control flow rate of 9.43 %. fig. 16: negative helicity contours evaluated in the same meridian plane for the same jet with positive swirl chirality as in fig. 15 (the negative valued helicity is generated in the boundary layer on the nozzle wall). fig. 17: the consequence of the pseudo-scalar character of helicity: reversal of the chirality of the imparted swirl does not lead to a simple change in the sign of the helicity; half-profiles of helicity evaluated immediately downstream from the nozzle exit indicate the invariance of the sign of the helicity in the nozzle wall boundary layer. fig. 18: the helicity distribution along the jet axis shows the remarkably fast decrease: values become less than the typical uncertainty beyond the downstream distance from the nozzle exit equal to 3–4 exit diameters. nevertheless, the near-wall helicity decays considerably faster, and while it remains detectable (fig. 16) – note also the vestigial negative values in the example profiles in fig. 21
at the distance equal to one nozzle exit diameter – it ceases to be an important factor in the velocity profiles evaluated further downstream (though it may be responsible for the lower values indicated by the arrow in fig. 23). 4.2 similarity of helicity profiles in the jet the profiles of the time-mean helicity density were evaluated from the individual components of the velocity and vorticity vectors by computing the time-mean variant of the expression eq. (1),

h = w1Ω1 + w2Ω2 + w3Ω3 . (2)

as an example, fig. 19 presents the values evaluated at a distance of 2 nozzle diameters from the exit for the case e, showing the profiles of the six individual quantities from eq. (2) as well as the three products to be summed. the substantial difference in the vertical scales may be noted: the resultant helicity density profile presented in fig. 20 is clearly dominated by the axial component, with just a secondary effect of the tangential values. in the computed helicity profiles, e.g. those in fig. 21, it is easy to evaluate the maximum values hmax – usually very near to the time-mean helicity density on the jet axis. fig. 19: evaluation of helicity from the three velocity components and the three components of the vorticity vector; 1 – axial components, 2 – radial components, 3 – tangential components (note the different vertical scales); computed for a 12.6 % relative control flow rate at x1 = 2 d. fig. 20: profiles of the three velocity × vorticity products from the example in fig. 19 and the helicity profile as the result of their sum. fig. 21: typical helicity profiles evaluated at several small distances downstream from the nozzle for a small, 6.45 % relative control flow rate. fig. 22: similarity of the two helicity profiles from fig. 21 when plotted in relative co-ordinates. the maximum values may then be used to nondimensionalise the computed and experimental values of helicity. it is then also easy to find the locations in which the local time-mean helicity density magnitude equals a pre-determined percentage of the maximum. in the present case, the author has determined the radial locations in which the interpolated helicity value is equal to hmax/2 at the same downstream distance x1. these locations define the convention diameter δ0.5h used to non-dimensionalise the horizontal co-ordinate. the profiles of time-mean helicity density exhibit a remarkable mutual similarity, and in the transformed co-ordinates, as seen in the examples in fig. 22 to fig. 24, the resultant nondimensionalised profiles collapse to almost a single curve – with some scatter, understandable considering the rather involved evaluation procedure involving the spatial derivatives. also the transverse size of the helicity region, characterised by the diameter δ0.5h, exhibits a considerable scatter. despite this, fig. 25 tends to suggest that the size increases in the downstream direction in a linear manner, in agreement with the idea of passive transport by turbulent vortical motions. this idea may form a basis for future formulations of helicity transport equations. fig. 26 shows an example of an early result in this direction, based upon the idea of a constant helicity diffusion coefficient in a constant velocity field. the latter simplifying assumption stems from the observation that the profiles are mostly located in the constant-velocity "potential" core of the jet.
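the determination of hmax and of the convention diameter δ0.5h described above reduces to a simple interpolation. the following python sketch (with an illustrative bell-shaped profile and hypothetical helper name, not the authors' script) finds the half-maximum locations on both flanks of a helicity profile and rescales it into the relative co-ordinates of figs. 22 to 24:

```python
import numpy as np

def helicity_profile_scaling(r, h, frac=0.5):
    """find hmax of a helicity profile h(r) and the convention diameter
    (full width where h = frac * hmax), then return the profile in the
    relative co-ordinates used for the similarity plots."""
    hmax = h.max()
    target = frac * hmax
    idx = np.where(h >= target)[0]
    i0, i1 = idx[0], idx[-1]
    r_left = np.interp(target, [h[i0 - 1], h[i0]], [r[i0 - 1], r[i0]])
    r_right = np.interp(target, [h[i1 + 1], h[i1]], [r[i1 + 1], r[i1]])
    d05h = r_right - r_left          # convention diameter of the profile
    return r / d05h, h / hmax

# illustrative bell-shaped helicity profile (stand-in for a computed one)
r = np.linspace(-0.04, 0.04, 401)    # radial distance [m]
h = 50.0 * np.exp(-(r / 0.01) ** 2)  # time-mean helicity density
eta, h_rel = helicity_profile_scaling(r, h)
```

if the nondimensionalised profiles from several downstream distances collapse onto one curve in these co-ordinates, the similarity reported above is confirmed.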
5 conclusions the concept of helicity used for investigations of turbulent flowfields, after a period of mixed previous results, is currently gaining popularity. the spatial distributions of the time-mean component of the helicity density discussed in this paper are the starting point. unfortunately, very little information has been available. the presented results of continuing investigations, obtained using parallel experimental and computational approaches, provide some information about the case of a turbulent swirling jet. the swirl was imparted by a novel method, using an unusual swirl generator design based upon the author's earlier idea of producing flow deflection by using coanda-effect generated supercirculation on fixed vanes. the design proved to be very useful, bringing several marked advantages. fig. 23: similarity of helicity profiles evaluated for the highest investigated control flow rate; the wide arrow (cf. fig. 17 and fig. 21) indicates low values, which may be a remnant of the negative wall-layer helicity. fig. 24: similarity of helicity profiles evaluated from the medium control flow rate computation results. fig. 25: axial growth rate of the convention diameter of the helicity region in the investigated swirling jet, evaluated from the data used in plotting the profiles in fig. 23. fig. 26: comparison of typical evaluated helicity profiles in relative co-ordinates and a theoretical similarity solution for a constant value of the turbulent gradient transport. the investigations of the generated time-mean helicity density in the swirling jet indicate that the region forming downstream from the nozzle is of limited extent, the time-mean helicity apparently becoming converted into the stochastic helicity component. a very important finding of this study is the fact that the evaluated profiles of time-mean helicity density exhibit a remarkable similarity. 6 acknowledgments this research has been supported by the grant "novel merging swirl burner design controlled by helical mixing" provided by epsrc – engineering and physical sciences research council, u.k. – to w. b. j. zimmerman and n. russel. the experimental rig was made by mr. christopher turner. the measurements in the swirling jets by particle image velocimetry, performed under the author's supervision by miss gavita regunath, provided a useful insight into swirling jet flows, though her results were not used in this paper. references [1] moffatt, h. k.: "degree of knottedness of tangled vortex lines." j. fluid mech., vol. 36 (1969), p. 117–129. [2] moffatt, h. k., tsinober, a.: "helicity in laminar and turbulent flow." ann. rev. fluid mech., vol. 24 (1992), p. 281–312. [3] lord kelvin: "on vortex motion." trans. roy. soc. edinburgh, vol. 25 (1869), p. 217–260. [4] moreau, j.-j.: "constantes d'un ilot tourbillonnaire en fluide parfait barotrope." c. r. acad. sci. paris, vol. 252 (1961), p. 2810–2812. [5] ditlevsen, p. d., giuliani, p.: "a note on dissipation in helical turbulence." physics of fluids, november 2001. [6] biferale, l., pierotti, d., toschi, f.: "helicity transfer in turbulent models." physical review e, vol. 57 (1998), no. 3, march 1998. [7] tesař, v.: "the problem of round jet spreading rate computed by two-equation turbulence model." proc. of conference "engineering mechanics '97", svratka, czech republic, may 1997, p. 181.
[8] suzuki, y., nagano, y.: "modification of turbulent helical/non-helical flows with small-scale energy input." phys. fluids, vol. 11 (1999), no. 11, p. 3499–3511. [9] tesař, v.: „fluidické proudové výkonové prvky se supercirkulační intensifikací hybnostní interakce proudů." (power fluidic jet-type elements with supercirculation intensification of jet momentum interaction, in czech), acta polytechnica – ctu in prague, czech republic, 1973. [10] markland, e., tsevdos, n.: "the jet flap applied to flow control." proc. of 2nd international symposium on fluid control, measurement, mechanics, and visualisation, flucome '88, sheffield, 1988. [11] khomenko, g. a., zimmerman, w. b.: "large scale structure evolution and mixing due to small scale helical forcing in a compressible viscous fluid." in: mixing in geophysical flows, eds. j. m. redondo and o. metais, upc barcelona press, 1994, p. 233–248. [12] zimmerman, w. b.: "fluctuations in concentration due to convection by a helical flow in a conducting fluid." phys. of fluids, vol. 8 (1996), no. 6, p. 1631–1641. [13] zimmerman, w. b.: "fluctuations in passive tracers due to mixing by coherent structures in isotropic, homogeneous, helical turbulence." inst. chem. eng. symposium series 140, 1996, p. 213–224. [14] tesař, v.: "two-equation turbulence model solution of the plane turbulent jet." acta polytechnica, vol. 35 (1995), no. 2, p. 19–42, prague, issn 1210-2709. [15] tesař, v.: "the solution of the plane turbulent jet." acta polytechnica, vol. 36 (1996), no. 3, p. 15–41, prague, issn 1210-2709. [16] tesař, v., šarboch, j.: "similarity solution of the axisymmetric turbulent jet using the one-equation model of turbulence." acta polytechnica, vol. 37 (1997), no. 3, p. 5–34, prague, issn 1210-2709. [17] tesař, v.: "two-equation turbulence model similarity solution of the axisymmetric fluid jet." acta polytechnica, vol. 41 (2001), no. 2, p. 26–41, prague, issn 1210-2709. [18] tollmien, w.: "berechnung turbulenter ausbreitungsvorgänge." zeitschrift für angewandte mathematik und mechanik, vol. 6 (1926), p. 468, berlin, germany. [19] tesař, v.: "similarity solutions of basic turbulent shear flows with one- and two-equation models of turbulence." zeitschrift für angewandte mathematik und mechanik, vol. 77 (1997), p. 333, berlin, germany. prof. ing. václav tesař, csc., e-mail: v.tesar@sheffield.ac.uk, university of sheffield, mappin street, s1 3jd sheffield, uk; av čr, dolejškova 5, 182 00 praha 8, czech republic
ultra wideband communications: history, evolution and emergence. v. lakkundi. ultra wideband, commonly known as uwb, is an emerging wireless personal area network (pan) radio technology that is unique in that it can be used in communications, radar systems as well as in range and location measurements. though uwb seems to be a new, trendy technology, its roots can be traced back to the 1960s. this paper describes uwb from the technological perspectives of its modern history, evolution and emergence. its benefits, existing applications and future trends are also given here. keywords: uwb, personal area networking, radar systems, pulse communications, modern history. 1 introduction the term 'ultra wideband' describes wireless physical layer technologies which use a bandwidth of at least 500 mhz or a bandwidth which is at least 20 % of the used centre frequency. mathematically,

b = 2 (fh − fl) / (fh + fl) ≥ 0.20 , (1)

where fh is the upper 10 db and fl is the lower 10 db cut-off frequency of the signal spectrum. there are two main principles which are characterized by this definition: the direct sequence ultra-wideband (ds-uwb) and the multi band orthogonal frequency division multiplexing (mb-ofdm). uwb radios are designed to use coherent wide-relative-bandwidth propagation, which has no rayleigh fading. hence, they are different from many popular radio designs used today that are seriously degraded by fading. because of this no-fading benefit, uwb can operate further and faster than conventional wisdom would expect, and it results in simple architectures that can deliver extremely high speed radios [3]. uwb is based on transmitting very short-duration pulses, often only nanoseconds or less in duration, whereby the occupied bandwidth is very high. this allows it to deliver data rates in excess of 100 mbps, while using very small amounts of power and operating in the same bands as existing communications without producing significant interference. uwb is fundamentally different from all other radio frequency communications, in that it achieves wireless communications without using a sine-wave rf carrier. instead, it uses modulated high-frequency low-energy pulses. since the actual transmission is physically a wavelet, it is considered to be a true modulated-wavelet radio. this is depicted in figs. 1 and 2, shown below. fig. 1: narrow band communication. fig. 2: ultra wideband communication. the two major methods used to modulate waveforms in uwb are pulse-position modulation (time-modulation) and pulse-amplitude modulation (pulse polarity-modulation).
however, orthogonal waveforms may also be used. 2 early history in 1893, heinrich hertz used a spark discharge to produce electromagnetic waves in his experiment. spark gaps and arc discharges between carbon electrodes were the dominant wave generators for about two decades after hertz's first experiments. in the late 1950s, an effort was made by lincoln laboratory and sperry to develop phased array radar systems. sperry's electronic scanning radar (esr) employed a so-called butler hybrid phasing matrix, which was an interconnection of 3 db branch-line couplers connected in such a manner that it formed a 2n-port network. each input port corresponded to a particular phase taper across the n output ports which, when connected to antenna elements, corresponded to a particular direction in space. in attempting to understand the wideband properties of this network, efforts were made to reference the properties of the four-port interconnection of quarter-wave tem-mode lines which formed the branch-line coupler [2]. however, the dominant form of wireless communications became sinusoidal, and it was not until the 1960s that work began again in earnest on time domain electromagnetics. the development of the sampling oscilloscope in the early 1960s and the corresponding techniques for generating sub-nanosecond baseband pulses sped up the development of uwb [1]. 3 evolution of uwb: historical milestones from measurement techniques in the 1960s, the main focus moved to the development of radar and communications devices. ground-penetrating radar was subsequently developed, as low-frequency components were useful in penetrating objects. the major events that led to the evolution of uwb are listed in table 1. since the issue of the first us patent for uwb in 1973, the field of pulse communications has moved in a new direction. most of the developments that have taken place since then have occurred in the military domain, with radars and intercept communications leading the way in r & d. 4 emergence of uwb uwb is a technology still under development. however, the first uwb chip sets are now available. uwb standards are being developed by the ieee in the form of the ieee802.15.3 standards, and the high data rate version ieee802.15.3a is currently underway. the regulation is completed in the us and is ongoing almost worldwide. currently there are two competing formats that have been proposed, and there has been a stalemate for some time. these are the uwb forum proposal (using direct-sequence cdma) and the mboa (multi band ofdm alliance) proposal.
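as a small worked illustration of the defining criterion in eq. (1) (a hypothetical helper function, not taken from the paper), the uwb test can be scripted in a few lines; the 3.1–10.6 ghz band allocated by the fcc for uwb easily satisfies both conditions:

```python
def is_uwb(f_l_hz: float, f_h_hz: float) -> bool:
    """uwb criterion: at least 500 mhz of absolute bandwidth, or a
    fractional bandwidth b = 2*(fh - fl)/(fh + fl) of at least 0.20."""
    b_frac = 2.0 * (f_h_hz - f_l_hz) / (f_h_hz + f_l_hz)
    return (f_h_hz - f_l_hz) >= 500e6 or b_frac >= 0.20

print(is_uwb(3.1e9, 10.6e9))   # True: 7.5 ghz wide, b is about 1.09
```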
4.1 key benefits and challenges the key benefits of uwb include:
– high data rates
– low power consumption
– multipath immunity
– low costs
– simultaneous ranging and communication
the primary advantages of uwb are high data rates, low cost, and low power. uwb also provides less interference than narrowband radio designs, while yielding a low probability of detection and excellent multipath immunity. when combined with the 802.15.3 pan standard, uwb can provide a very compelling wireless multimedia network for the home. it will have the ability to support multiple devices, and even multiple independent piconets, so neighbours will not interfere with other uwb networks. an additional feature of uwb is that it provides for precise ranging, or distance measurement. this feature can be used for location identification in, for example, public safety applications. though uwb is an exciting and useful technology, it also has many challenges. the basic drawback of uwb is that, under current fcc regulations, it is limited to 10 m, or a few tens of metres, depending on the desired data rate. this is consistent with the intended application as a pan technology. whole-home coverage in larger homes may require an additional networking technology, such as a metal backplane (i.e. cable or power line) or mesh networking from room to room. regulatory problems are also of huge concern. since uwb occupies a huge spectrum, which is almost completely used by several other applications, it has to be ensured that uwb will not affect these other applications. another challenge is the need for the industry to agree upon common standards for the interoperability of uwb devices. 4.2 potential applications and future trends ultra-wideband is capable of being used in a multitude of commercial applications, ranging from wireless networks (scalable from low to ultra high speeds) to remote sensing and tracking devices, ground penetrating radars, as well as many more applications that have yet to be invented. consumers will most immediately benefit from uwb that is optimized for wireless home networks. this architecture allows multimedia-enabled devices to send and receive multiple streams of digital audio and video at price points and power consumption levels currently unattainable with existing solutions [3]. potential applications include:
– wireless communications systems
– high speed local and personal area networks (wlan/wpan)
– roadside info-stations
– short range radios
– military communications
– radar and sensing
– vehicular & collision avoidance radars
– ground penetrating radar (gpr)
– through-wall imaging (police, fire, rescue)
– medical imaging
– surveillance
– location detection
– precision location (inventory, gps aid)
– intelligent transportation systems
apart from sophisticated applications such as medical imaging, uwb id tags can be used to wirelessly identify individuals with issued id tags, similar to asset tracking. other future applications are intelligent transportation systems, electronic signs and smart appliances. on the lighter side, fig. 3 depicts an undesired application of uwb [5].
table 1: milestones that led to today's uwb
milestone – year
electromagnetic waves (heinrich hertz) – 1893
phased array radar – 1950s
advanced developments in time-domain electromagnetics – 1960s
avalanche transistor & tunnel diode detector – early 1970s
short range radar sensor – 1972
narrow baseband pulse fixture – late 1970s
advances in radar technology – 1980s, 1990s
commercial uwb devices & systems – late 1990s
ieee standards on uwb – 2000s to date
fig. 3: an undesired result of uwb. 5 acknowledgments this work has been supported by the czech grant agency under grants no. 102/03/h109 and no. 102/04/0557 and by research program no. msm0021630513. references [1] ghavami, m., michael, l., kohno, r.: ultra wideband signals and systems in communication engineering. john wiley & sons, 2004. [2] barrett, t.: history of ultra wideband radar & communications: pioneers and innovators. in: proceedings of progress in electromagnetics symposium 2000, july 2000. [3] uwb forum, [online] www.uwbforum.org [4] porcino, d., hirt, w.: uwb radio technology: potential and challenges ahead. ieee communications magazine, july 2003. [5] padgett, j.: overview of uwb impulse radio. seminar umiacs/lts, march 2004. vishwas lakkundi, e-mail: vishwas@feec.vutbr.cz, institute of radio electronics, brno university of technology, purkyňova 118, 612 00 brno, czech republic

dirac oscillator in dynamical noncommutative space (acta polytechnica 61(6):689–702, 2021). ilyas haouam, université frères mentouri, laboratoire de physique mathématique et de physique subatomique (lpmps), constantine 25000, algeria. correspondence: ilyashaouam@live.fr. abstract. in this paper, we address the energy eigenvalues of the two-dimensional dirac oscillator perturbed by a dynamical noncommutative space. we derived the relativistic hamiltonian of the dirac oscillator in the dynamical noncommutative space, in which the space-space heisenberg-like commutation relations and the noncommutative parameter are position-dependent. then, we used this hamiltonian to calculate the first-order correction to the eigenvalues and eigenvectors, based on the language of creation and annihilation operators and using the perturbation theory. it is shown that the energy shift depends on the dynamical noncommutative parameter τ. knowing that, with a set of two-dimensional bopp-shift transformations, we mapped the noncommutative problem to the standard commutative one. keywords: dynamical noncommutative space, τ-space, noncommutative dirac oscillator, perturbation theory. 1. introduction in the last few decades, physicists and mathematicians have developed a mathematical theory called noncommutative geometry, which has quickly become a topic of great interest and has been finding applications in many areas of modern physics, such as high energy physics [1], cosmology [2, 3], gravity [4], quantum physics [5–7] and field theory [8, 9]. substantially, the study of noncommutative (nc) spaces is very important for understanding phenomena at the tiny scales of physical theories. the idea behind the extension of noncommutativity to the coordinates was first suggested by heisenberg in 1930 as a solution for removing the infinite quantities of field theories.
the nc space-time structures were first mentioned by snyder in 1947 [10, 11], who introduced noncommutativity in the hope of regularizing the divergencies that plagued quantum field theory. motivated by the attempts to understand string theory, quantum gravitation and black holes through nc spaces, and by seeking to highlight more phenomenological implications, we consider the dirac oscillator (do) within a two-dimensional dynamical noncommutative (dnc) space (also known as a position-dependent nc space). unlike the simplest possible type of nc spaces, in which the nc parameter is constant, here we deal with a different type of nc space, where the deformation parameter is no longer constant. however, there are many other possibilities that cannot be excluded. in fact, in the first paper by snyder himself [10], the noncommutativity parameter was taken to depend on the coordinates and the momenta. considerably different possibilities have been explored since then, especially in the lie-algebraic approaches [12], κ-poincaré noncommutativity [13] and other fuzzy spaces [14]. besides, more recently, in the position-dependent approach [15–17], the authors considered $\theta_{\mu\nu}$ to be a function of the position coordinates, i.e., $\theta \to \theta(X, Y)$. the relativistic do has a great potential for both theoretical and practical applications. the potential term is introduced linearly, by the substitution $\vec{p} \to \vec{p} - im\beta\omega\vec{r}$ in the free dirac hamiltonian; this was considered for the first time by ito et al. [18], with $\vec{r}$ being the position vector and m, β, ω > 0 being the rest mass of the particle, the dirac matrix and the constant oscillator frequency, respectively. it was named the dirac oscillator by moshinsky and szczepaniak [19] because it is a relativistic generalization of the non-relativistic harmonic oscillator: exactly in the non-relativistic limit, it reduces to a standard harmonic oscillator with a strong spin-orbit coupling term. physically, the do has attracted a lot of attention because of its considerable physical applications; it is widely studied and illustrated. it can be shown that it is a physical system which can be interpreted as an interaction of the anomalous magnetic moment with a linear electric field [20]. in addition, it can be associated with the electromagnetic potential [21]. as an exactly solvable model, the do in the background of a perpendicular uniform magnetic field has been widely studied. we mention, for instance, the following: in ref. [19], the spectra of the (3+1)-dimensional do are solved and the non-relativistic limit is discussed; in ref. [22], the symmetry properties of the do are studied; the shift operators for the symmetries are constructed explicitly in [23]. interestingly, the do may offer a new approach to studying quantum optics, where it was found that there is an exact map from the (2+1)-dimensional do to the jaynes-cummings (jc) model [24], which describes the atomic transitions in a two-level system. subsequently, it was found that this model can be mapped either to the jc or to the anti-jc model, depending on the magnitude of the magnetic field [25]. basically, the do became more and more important since the experimental observations. for instance, we mention that franco-villafañe et al. [26] proposed the first experimental microwave realisation of the one-dimensional do.
the experiment depends on a relation of the do to a corresponding tight-binding system. the experimental results obtained show that the spectrum of the one-dimensional do is in good agreement with that of the theory. quimbay et al. [27, 28] show that the do may describe a naturally occurring physical system. more precisely, the case of a two-dimensional do can be used to describe the dynamics of the charge carriers in graphene, and hence its electronic properties [29]. this paper is organized as follows. in section 2, the dnc geometry is briefly reviewed. in section 3, the two-dimensional dnc do is investigated: in sub-section 3.2, the energy spectrum in the nc space is obtained; in sub-section 3.3, based on the perturbation theory and the fock basis, the energy spectrum including the dynamical noncommutativity effect is obtained, and we summarize the results and discussions. section 4 is then devoted to the conclusions. 2. review of dynamical noncommutativity let us present the essential formulas of the dnc space algebra we need in this study. as is known, at the tiny scale (string scale), the position coordinates do not commute with each other; thus the canonical variables satisfy the following deformed heisenberg commutation relation

$[x^{nc}_\mu, x^{nc}_\nu] = i\theta_{\mu\nu}$ , (1)

with $\theta_{\mu\nu}$ being an anti-symmetric tensor. in the simplest case, the deformation parameter is considered a real constant. but, in general, $\theta_{\mu\nu}$ can be a function of the coordinates. fring et al. [16] made a generalization of an nc space to a position-dependent space by introducing a set of new variables $X$, $Y$, $P_x$, $P_y$ and converting the constant θ into a function of the coordinates, $\theta(X, Y) = \theta(1 + \tau Y^2)$. as another example of $\theta(X, Y)$, we also mention that gomes et al. [17] chose in their study $\theta(X, Y) = \theta/[1 + \theta\alpha(1 + Y^2)]$. however, as a deformation of this form of the nc parameter will almost inevitably lead to non-hermitian coordinates, it was pointed out [30] that these types of structures are related directly to non-hermitian hamiltonian systems. below, it is explained how this problem is solved. in the new type of two-dimensional nc space, which is known as the dnc space or τ-space, the commutation relations are [16]

$[X, Y] = i\theta(1 + \tau Y^2)$, $[Y, P_y] = i\hbar(1 + \tau Y^2)$, $[X, P_x] = i\hbar(1 + \tau Y^2)$, $[Y, P_x] = 0$, $[X, P_y] = 2i\tau Y(\theta P_y + \hbar X)$, $[P_x, P_y] = 0$ . (2)

it is interesting to note that $\sqrt{\theta}$ and $\sqrt{\tau}$ have the dimensions of length and inverse length, respectively. in the limit τ → 0, we obtain the following non-dynamical nc commutation relations

$[x^{nc}, y^{nc}] = i\theta$, $[y^{nc}, p^{nc}_y] = i\hbar$, $[x^{nc}, p^{nc}_x] = i\hbar$, $[y^{nc}, p^{nc}_x] = 0$, $[x^{nc}, p^{nc}_y] = 0$, $[p^{nc}_x, p^{nc}_y] = 0$ . (3)

the coordinate $X$ and the momentum $P_y$ are not hermitian, which makes any hamiltonian that includes these variables non-hermitian. we represent algebra (2) in terms of the standard hermitian nc variable operators $x^{nc}$, $y^{nc}$, $p^{nc}_x$, $p^{nc}_y$ as

$X = (1 + \tau(y^{nc})^2)\,x^{nc}$, $Y = y^{nc}$, $P_y = (1 + \tau(y^{nc})^2)\,p^{nc}_y$, $P_x = p^{nc}_x$ . (4)

from this representation, we can see that some of the operators involved above are no longer hermitian. however, to convert the non-hermitian variables into hermitian ones, we use a similarity transformation in the form of a dyson map, $\eta O \eta^{-1} = o = o^\dagger$ with $\eta = (1 + \tau Y^2)^{-1/2}$, as stated in [16]. therefore, we express the new hermitian variables $x$, $y$, $p_x$ and $p_y$ in terms of the nc variables as follows
$x = \eta X \eta^{-1} = (1 + \tau Y^2)^{-1/2}\, X\, (1 + \tau Y^2)^{1/2} = (1 + \tau(y^{nc})^2)^{1/2}\, x^{nc}\, (1 + \tau(y^{nc})^2)^{1/2}$,
$y = \eta Y \eta^{-1} = (1 + \tau(y^{nc})^2)^{-1/2}\, y^{nc}\, (1 + \tau(y^{nc})^2)^{1/2} = y^{nc}$,
$p_x = \eta P_x \eta^{-1} = (1 + \tau(y^{nc})^2)^{-1/2}\, p^{nc}_x\, (1 + \tau(y^{nc})^2)^{1/2} = p^{nc}_x$,
$p_y = \eta P_y \eta^{-1} = (1 + \tau(y^{nc})^2)^{-1/2}\, P_y\, (1 + \tau(y^{nc})^2)^{1/2} = (1 + \tau(y^{nc})^2)^{1/2}\, p^{nc}_y\, (1 + \tau(y^{nc})^2)^{1/2}$ . (5)

these new hermitian dnc variables satisfy the following commutation relations

$[x, y] = i\theta(1 + \tau y^2)$, $[y, p_y] = i\hbar(1 + \tau y^2)$, $[x, p_x] = i\hbar(1 + \tau y^2)$, $[y, p_x] = 0$, $[x, p_y] = 2i\tau y(\theta p_y + \hbar x)$, $[p_x, p_y] = 0$ . (6)

now, using the bopp-shift transformation [31], one can express the nc variables in terms of the standard commutative variables [5]

$x^{nc} = x^s - \frac{\theta}{2\hbar}p^s_y$, $p^{nc}_x = p^s_x$, $y^{nc} = y^s + \frac{\theta}{2\hbar}p^s_x$, $p^{nc}_y = p^s_y$ , (7)

where the index s refers to the standard commutative space. the interesting point is that in the dnc space there is a minimum length for $X$ in a simultaneous $X$, $Y$ measurement [16]:

$\Delta X_{min} = \theta\sqrt{\tau}\sqrt{1 + \tau\langle Y\rangle^2_\rho}$ , (8)

as well, in a simultaneous $Y$, $P_y$ measurement, we find a minimal momentum,

$\Delta(P_y)_{min} = \hbar\sqrt{\tau}\sqrt{1 + \tau\langle Y\rangle^2_\rho}$ . (9)

the motivating and interesting physical consequence of position-dependent noncommutativity is that objects in two-dimensional spaces are string-like [16]. however, investigating the do in the dnc geometry gives rise to some phenomenological consequences that may be very important and useful. 3. two-dimensional dirac oscillator in dynamical noncommutative space 3.1. extension to dynamical noncommutative space the dynamics of the do in the presence of a uniform external magnetic field is governed by the following hamiltonian

$H_D = c\vec{\alpha}\cdot\left(\vec{p}^{\,s} - \frac{e}{c}\vec{A}^s - imc\omega\beta\vec{r}^{\,s}\right) + \beta mc^2$ , (10)

where $\vec{A}^s(A^s_x, A^s_y, A^s_z)$ is the vector potential produced by the external magnetic field and e is the charge of the do (the electron charge). the $\vec{\alpha}$ matrices, in two dimensions, are represented by the following pauli matrices

$\alpha_1 = \sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, $\alpha_2 = \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$, $\beta = \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ , (11)

which satisfy the relations

$\alpha_i^2 = \beta^2 = 1$, $\alpha_i\alpha_j + \alpha_j\alpha_i = 0$, $\alpha_i\beta + \beta\alpha_i = 0$, $i = 1, 2, 3$ . (12)

in two dimensions, equation (10) becomes
(19) using the bopp-shift transformation (7), hamiltonian (19) can be expressed in terms of the standard commutative variables hd (xsi , p s i ) = cα1p s x + βmc 2 + cα2psy + cα2 { 1 2 τ ( y s + θ 2ℏ p s x )2 p s y + 1 2 τ p s y ( y s + θ 2ℏ p s x )2} + eb 2 [ α1 ( y s + θ 2ℏ p s x ) − α2 { τ 2 ( y s + θ 2ℏ p s x )2 ( x s − θ 2ℏ p s y ) +xs − θ 2ℏ p s y + 1 2 τ ( x s − θ 2ℏ p s y )( y s + θ 2ℏ p s x )2}] − imcω [ α2β ( y s + θ 2ℏ p s x ) + α1β { x s − θ 2ℏ p s y + τ 2 ( y s + θ 2ℏ p s x )2 ( x s − θ 2ℏ p s y ) + τ 2 ( x s − θ 2ℏ p s y )( y s + θ 2ℏ p s x )2}] . (20) therefore, to the first-order in θ and τ, we have (noting that terms containing θτ are neglected too) 692 vol. 61 no. 6/2021 dirac oscillator in dynamical noncommutative space hd (xsi ,p s i ) = c [ α1p s x + α2 { psy + τ 2 (ys)2 psy + 1 2 τpsy (y s)2 }] + βmc2 + e b 2 [ α1 ( ys + θ 2ℏ psx ) − α2 { xs − θ 2ℏ psy + τx s (ys)2 }] − imcω [ α2β ( ys + θ 2ℏ psx ) + α1β { xs − θ 2ℏ psy + τx s (ys)2 }] , (21) which can be written as hd = h0 + hθ + hτ , (22) with h0 = cα1psx + cα2p s y + eb 2 (α1ys − α2xs) − imcω (α1βxs + α2βys) + βmc2, (23) hθ = θ 2ℏ [ eb 2 ( α1p s x + α2p s y ) − imcω ( α2βp s x − α1βp s y )] , (24) hτ = 1 2 τ [ cα2 (ys) 2 psy + cα2p s y (y s)2 − (ebα2xs (ys) 2 − i2mcωα1βxs (ys) 2 ] = τ 2 [α2v1 + α2v2 − (ebα2 + i2mcωα1β) v3] , (25) where v1 = (ys) 2 psy, v2 = p s y (y s)2 , and v3 = xs (ys) 2 . (26) knowing that hτ is the perturbation hamiltonian in which it reflects the effects of dynamical noncommutativity of space on the do hamiltonian. we can also treat the term proportional to θ, given in equation (24) as a perturbation term. but here, and in a different way, we will accurately calculate the energy of the deformed system h0 + hθ and employ it to test the effect of the dnc space on the do. thus, we consider the following unperturbed system hu n p = h0 + hθ. (27) while the noncommutativity parameter τ is non-zero and very small, one can use the perturbation theory to find the spectrum of the systems in question. the two-dimensional do equation in the dnc space is written as follows hd |ψd ⟩ = (hu n p + hτ ) |ψd ⟩ = eθ,τ |ψd ⟩ , (28) with |ψd ⟩ = (|ψ1⟩ , |ψ2⟩) t , (29) being the wave function of the system in question. 693 ilyas haouam acta polytechnica 3.2. unperturbed eigenvalues and eigenvectors we introduce the following complex coordinates zs = xs + iys, zs = xs − iys, (30) psz = −iℏ d dzs = 12 ( psx − ipsy ) , psz = −iℏ d dzs = 12 ( psx + ipsy ) , (31) where [zs,psz ] = [z̄ s,psz ] = iℏ, [z s,psz ] = [z̄ s,psz ] = 0. (32) using equation (11), our unperturbed system (24), in the complex formalism, merely becomes hu n p = [ mc2 2cωpsz + imczsω̃ 2cωpsz − imcωz sω̃ −mc2 ] , (33) where ω = 1 + m θω̃ 2ℏ with ω̃ = ω − ωc 2 , (34) knowing that ωc = |e|bmc is the cyclotron frequency. now, let us introduce the following creation and annihilation operators a = i ( ω √ mωℏ psz − i 2ω √ mω ℏ zs ) , (35) a† = −i ( ω √ mωℏ psz + i 2ω √ mω ℏ zs ) , (36) that satisfy the following commutations relations [ a,a† ] = 1, [a,a] = [ a†,a† ] = 0. (37) thus, in terms of the creation and annihilation operators, the hamiltonian (33) takes the following form hθ = [ mc2 i2c √ mωℏa† −i2c √ mωℏa −mc2 ] = [ mc2 iga† −iga −mc2 ] , (38) with g = 2c √ mℏω being a parameter that describes the coupling between different states in nc space, and ω = ωω̃ being the correction of the frequency ω̃ of the commutative space. in addition, the parameter g = 2c √ mω̃ℏ describes the coupling between different states in the commutative space. 
now, we solve the following equation hu n p ∣∣ψd〉 = eθ ∣∣ψd〉 , (39) where eθ, ∣∣ψd〉 are the eigenenergy and wave function of the dirac equation above, respectively. by inserting equation (29) in (39), we obtain the following system of equations ( mc2 iga† −iga −mc2 )( | ψ1 > | ψ2 > ) = eθ ( | ψ1 > | ψ2 > ) , (40) 694 vol. 61 no. 6/2021 dirac oscillator in dynamical noncommutative space where ( mc2 − eθ ) | ψ1 > +iga † | ψ2 >= 0, (41) −iga | ψ1 > − ( mc2 + eθ ) | ψ2 >= 0. (42) from the equations (41) and (42), we have | ψ2 >= −iga eθ + mc2 | ψ1 >, (43) subsequently [ g2a†a + m2c4 − (eθ) 2 ] | ψ1 >= 0. (44) on the basis of the second quantization, of which | ψ1 >≡| n >, we have [ g2n + m2c4 − (eθ) 2 ] | n >= 0, a†a | n >= n | n > . (45) thus, the energy spectrum is given by e±θ,n = ± √ m2c4 + g2n, (46) which can be rewritten as e±θ,n = ±mc 2 √ 1 + 4ℏω̃ mc2 ( 1 + m θω̃ 2ℏ ) n, n = 0, 1, 2, ... (47) furthermore, we have the reduced energy spectrum e±n e0 = ± √ 1 + 4w ( 1 + 1 2 qw ) n, (48) where the non-relativistic limit feature is reduced in w = ℏω̃ mc2 , which is a parameter that controls the nonrelativistic limit within the nc space (as well as in commutative case, if θ = 0), and e0 = mc2 is a background energy, which corresponds to n = 0. and q = θθ0 with θ0 = ( ℏ mc )2of the dimension [ ℏ mc ]2 = l2 ≡ m2 . the corresponding wave function is written as a function of the basis | n >= (a †)n √ n! | 0 >, and it is given by the following formula | ψ ± n >= c ± n | n; 1 2 > +id±n | n − 1; − 1 2 >, (49) where the coefficients c±n and d±n are determined from the normalization condition. we thus obtain [24] c±n = √ e+n ± mc2 2e+n , d±n = ∓ √ e+n ∓ mc2 2e+n . (50) in the limit θ → 0, the nc energy spectrum becomes commutative one, i.e., equation (47) turns to equation (10) of ref. [32], which confirms that we are in good agreement. as well, in ref. [33] boumali et al. made a study of a do in an nc phase-space, where, if θ → 0, the energy eigenvalues (eq:50) will be similar as ours in equation (47). 695 ilyas haouam acta polytechnica we plot the reduced energy spectrum in terms of quantum number n, for the cases w = 1, q = 1; w = 1, q = 2 and the commutative case with w = 1. the e ± n e0 , as a function of quantum number n of equation (48) in both commutative (θ = q = 0) and nc (q = 1; q = 2) spaces, is illustrated in fig. 1. knowing that fig. 1 discloses that the influence of the nc parameter on the energy spectrum is considerable and significant. figure 1. a reduced energy versus quantum number in both cases of nc and commutative spaces. the following figure shows the coupling parameters g and ḡ, between different levels for the two cases in the nc space. figure 2. the coupling parameter between different levels: (a) in the case of commutative space, (b) in the case of nc space. while n are non-negative integers, we explicitly observe that our eigenvalues are non-degenerated (the spectrum has no degeneracy), this case can be explained by the fact that the particle is restricted to moving in two dimensions, and the third dimension does not contribute in the form of energy. knowing that, it will be an infinite degeneracy when there is a contribution of an element related to the third dimension, such as kz or pz . in more detail and indirectly (in other sense), the energy spectrum is degenerated. as is known, this is related to the landau problem, and it is known that there is an infinite degeneracy. 
nevertheless, we consider the energy spectrum non-degenerated, because we do not rely on the states with different angular momentum, which is not useful here. the reason is that when we use chiral creation and annihilation operators (al, a†l and ar , a†r ), we see that the number of particles nr created by right operators does not appear in the form of energy, we see only number of particles nl generated by left operators. however, right operators create excitations with a definite angular momentum in one or the other direction; thus, in this sense, we have the degeneracy. this point is very important to clarify because our calculations in the perturbation theory depend on this point. as many researchers have dealt with this sensitive point and considered that the spectrum has no degeneracy such as [33]. besides, differently, for instance, energy levels can appear explicitly degenerated, as in a study [34] about the mesoscopic states in a relativistic landau levels, the authors found that the energy spectrum depends on p2z (check eq. 13 in this cited reference), which is the underlying reason for an infinite degeneracy of all levels. 3.3. perturbed system in this sub-section, we aim to determine the correction of first-order energy by using first-order energy shift formulas. to explain the structure of our spectrum, we will use time-independent perturbation theory for 696 vol. 61 no. 6/2021 dirac oscillator in dynamical noncommutative space small values of the parameter τ. in view that energies are non-degenerated, we use the non-degenerated time-independent perturbation theory ∣∣ψn〉 = ∣∣∣ψ(0)n 〉 + τ ∣∣∣ψ(1)n 〉 + τ2 ∣∣∣ψ(2)n 〉 + ... (51) en = e(0)n + τe (1) n + τ 2e(2)n + ... (52) here, the (0) superscript denotes the quantities that are associated with the unperturbed system. the first-order correction to the eigenvalues and eigenvectors in perturbation theory are simply given by e(1)n = △en =< ψ (0) n | 1 τ hτ | ψ (0) n >, (53) ∣∣∣ψ(1)n 〉 = ∑ k ̸=n < ψ (0) k | 1 τ hτ | ψ (0) n > e (0) n − e (0) k ∣∣∣ψ(0)k 〉 . (54) inserting equation (25) into the equation above, we find e(1)n =< ψ (0) n | 1 2 {α2(v1 + v2) − (ebα2 + i2mcωα1β) v3} | ψ (0) n >, (55) the operator method can also be used to obtain the energy shift in fock space. in our scenario, we require adopting the notation of the state as follows | ψ (0) n >=| nx,ny > . (56) the perturbation matrix is given by m =< nx,ny | i 2 ( 0 −v1 − v2 + υv3 v1 + v2 − υv3 0 ) | n ′ x,n ′ y >, (57) with υ = eb + 2mcω. to calculate the influence of vi (i = 1, .., 3) on the element of the fock basis, we conveniently use the following bj , b†j (j = x,y) operators bj = √ mω 2ℏ ( xsj + i psj mω ) and b†j = √ mω 2ℏ ( xsj − i psj mω ) , (58) where [ bj,b † j ] = 1, with b†jbj = nj. (59) the above creation and annihilation operators are, in fact, extracted from the one in 3.2, when (a,a†) ={ i (ax + iay ) , −i ( a†x − ia†y )} |θ→0→ (b,b†) = { i (bx + iby ) , −i ( b†x − ib†y )} (because in 3.3, we deal only with τ). in fact, we have only one integer, which is n, but with the feature n = nx + ny . we deliberately use nx, ny instead of n because in the perturbed hamiltonian, we cannot use a complex formalism, thus we divide n into nx and ny . 
with the help of the following definitions of the eigenkets and the central properties of the creation and annihilation operators [35],

\[ b_j|n_j\rangle = \sqrt{n_j}\,|n_j - 1\rangle, \qquad b_j^\dagger|n_j\rangle = \sqrt{n_j + 1}\,|n_j + 1\rangle, \]
\[ b_j^2|n_j\rangle = \sqrt{n_j(n_j - 1)}\,|n_j - 2\rangle, \qquad b_j^{\dagger 2}|n_j\rangle = \sqrt{(n_j + 1)(n_j + 2)}\,|n_j + 2\rangle, \]
\[ b_j^3|n_j\rangle = \sqrt{n_j(n_j - 1)(n_j - 2)}\,|n_j - 3\rangle, \qquad b_j^{\dagger 3}|n_j\rangle = \sqrt{(n_j + 1)(n_j + 2)(n_j + 3)}\,|n_j + 3\rangle, \;\ldots \tag{60} \]

with \(\langle n'_j|n_j\rangle = \delta_{n'_j, n_j}\) and

\[ \left[b^\dagger b, b^\dagger\right] = b^\dagger b b^\dagger - b^{\dagger 2}b = b^\dagger, \qquad \left[b^\dagger b, b\right] = b^\dagger b^2 - b b^\dagger b = -b, \tag{61} \]
\[ b^\dagger b\, b^\dagger|n\rangle = N b^\dagger|n\rangle = (n + 1)\,b^\dagger|n\rangle, \tag{62} \]
\[ b^\dagger b\, b|n\rangle = N b|n\rangle = (n - 1)\,b|n\rangle, \tag{63} \]
\[ [N, b] = -b \quad\text{and}\quad bN = Nb + b, \tag{64} \]
\[ \left[N, b^\dagger\right] = b^\dagger \quad\text{and}\quad b^\dagger N = N b^\dagger - b^\dagger, \tag{65} \]

knowing that

\[ x_j^{s} = \sqrt{\frac{\hbar}{2m\omega}}\left(b_j + b_j^\dagger\right) \quad\text{and}\quad p_j^{s} = i\sqrt{\frac{\hbar m\omega}{2}}\left(b_j^\dagger - b_j\right), \tag{66} \]

the contributions of the different parts of the perturbed hamiltonian are as follows:

\[ \langle n_x, n_y|V_1|n'_x, n'_y\rangle = \langle n_x, n_y|(y^s)^2 p_y^s|n'_x, n'_y\rangle = \frac{-i\hbar}{2}\sqrt{\frac{\hbar}{2m\omega}}\,\delta_{n_x, n'_x}\Big\{ \sqrt{n'_y(n'_y - 1)(n'_y - 2)}\,\delta_{n_y, n'_y - 3} - \sqrt{(n'_y + 1)(n'_y + 2)(n'_y + 3)}\,\delta_{n_y, n'_y + 3} + (n'_y - 2)\sqrt{n'_y}\,\delta_{n_y, n'_y - 1} - (n'_y + 3)\sqrt{n'_y + 1}\,\delta_{n_y, n'_y + 1} \Big\}. \tag{67} \]

\[ \langle n_x, n_y|V_2|n'_x, n'_y\rangle = \langle n_x, n_y|p_y^s (y^s)^2|n'_x, n'_y\rangle = \frac{-i\hbar}{2}\sqrt{\frac{\hbar}{2m\omega}}\,\delta_{n_x, n'_x}\Big\{ \sqrt{n'_y(n'_y - 1)(n'_y - 2)}\,\delta_{n_y, n'_y - 3} - \sqrt{(n'_y + 1)(n'_y + 2)(n'_y + 3)}\,\delta_{n_y, n'_y + 3} + (n'_y + 2)\sqrt{n'_y}\,\delta_{n_y, n'_y - 1} + (3 - n'_y)\sqrt{n'_y + 1}\,\delta_{n_y, n'_y + 1} \Big\}. \tag{68} \]

\[ \langle n_x, n_y|V_3|n'_x, n'_y\rangle = \langle n_x, n_y|x^s (y^s)^2|n'_x, n'_y\rangle = \left(\frac{\hbar}{2m\omega}\right)^{3/2}\left(\sqrt{n'_x}\,\delta_{n_x, n'_x - 1} + \sqrt{n'_x + 1}\,\delta_{n_x, n'_x + 1}\right) \times \left(\sqrt{n'_y(n'_y - 1)}\,\delta_{n_y, n'_y - 2} + \sqrt{(n'_y + 1)(n'_y + 2)}\,\delta_{n_y, n'_y + 2} + (1 + 2n'_y)\,\delta_{n_y, n'_y}\right). \tag{69} \]

the relevant perturbation matrix is given by

\[ M = i\begin{pmatrix} 0 & w_{12} \\ w_{21} & 0 \end{pmatrix}, \tag{70} \]

with

\[ w_{12} = -w_{21} = \langle n_x, n_y|\tfrac{1}{2}\left\{\upsilon V_3 - (V_1 + V_2)\right\}|n'_x, n'_y\rangle. \tag{71} \]

the do hamiltonian \(H_D = H_0 + H_\theta + H_\tau\) may be represented, in the basis given by the unperturbed energy eigenkets, by the square matrix

\[ H_D \equiv \begin{pmatrix} E_{n,+}^{(0)}(\theta) & i\tau w_{12} \\ i\tau w_{21} & E_{n,-}^{(0)}(\theta) \end{pmatrix}. \tag{72} \]

the eigenvalues of the problem above are

\[ \begin{pmatrix} E_1 \\ E_2 \end{pmatrix} = \frac{E_-^{(0)} + E_+^{(0)}}{2} \pm \sqrt{\frac{\left(E_-^{(0)} - E_+^{(0)}\right)^2}{4} + \lambda^2\,|w_{12}|^2}. \tag{73} \]

here, we set λ = iτ. supposing that λ|w12| is small compared to the relevant energy scale, the difference of the energy eigenvalues of the unperturbed system,

\[ \lambda|w_{12}| < \left|E_-^{(0)} - E_+^{(0)}\right|, \tag{74} \]

we obtain the expansion of the energy eigenvalues in the presence of the perturbation (a perturbation expansion always exists for a sufficiently weak perturbation):

\[ E_1 = E_-^{(0)} + \frac{\lambda^2|w_{12}|^2}{E_-^{(0)} - E_+^{(0)}} + \ldots, \qquad E_2 = E_+^{(0)} + \frac{\lambda^2|w_{12}|^2}{E_+^{(0)} - E_-^{(0)}} + \ldots \tag{75} \]

the calculation is limited by the radius of convergence of the series expansion (75): treating λ as a complex variable and increasing it from zero, branch points are encountered at [34]

\[ \lambda|w_{12}| = \pm\frac{i\left(E_-^{(0)} - E_+^{(0)}\right)}{2}, \tag{76} \]

so the condition for the convergence of (75) in the λ = 1 full-strength case is

\[ |w_{12}| < \frac{\left|E_-^{(0)} - E_+^{(0)}\right|}{2}. \tag{77} \]

if this condition is not met, the expansion (75) is meaningless. it can be checked that all the results of the nc case can be obtained from the dnc case directly by taking the limit τ → 0; for instance, equations (51) and (52) then give the same eigenvectors and eigenvalues as in the nc space, i.e., equations (49) and (47), respectively. it may also be useful to mention that the do in deformed spaces (including nc spaces) has been investigated in [36–43].
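the two-level structure (72)–(75) is easy to verify numerically. below is a small python sketch comparing the exact eigenvalues of eq. (73) with the leading term of the expansion (75); the coupling w12 and the strength τ are illustrative numbers, the rest energy mc² = 5.11 × 10⁵ ev matches table 1 below, and \(\lambda^2|w_{12}|^2\) is taken as \(\tau^2|w_{12}|^2\) (the sign bookkeeping of the paper's choice λ = iτ only affects the direction of the shift, not its magnitude).

    import numpy as np

    mc2 = 5.11e5        # rest energy in eV, as in table 1
    w12 = 2.0e3         # coupling matrix element in eV -- illustrative value
    tau = 0.05          # perturbation strength -- illustrative value

    E_plus, E_minus = +mc2, -mc2

    # eq. (73): exact eigenvalues of the 2x2 matrix (72)
    root = np.sqrt((E_minus - E_plus) ** 2 / 4 + (tau * w12) ** 2)
    mean = (E_minus + E_plus) / 2
    E_exact = np.array([mean - root, mean + root])

    # eq. (75): leading-order shift; its magnitude matches the
    # tau^2 a_i^2 / (10.22 x 10^5) entries of table 1
    shift = (tau * w12) ** 2 / abs(E_minus - E_plus)   # = tau^2 |w12|^2 / (2 mc^2)
    print("exact levels  :", E_exact)
    print("|level shift| :", abs(abs(E_exact[1]) - mc2), "~", shift)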
we can regard equation (75) as the eigenvalues of our system, where we restrict ourselves to the first-order corrections to the eigenvalues and eigenvectors; this yields the energy shift for the ground state, and the eigensolutions for the excited states are obtained just as easily. it is interesting to illustrate the dnc effect on the do energy levels. this effect is encoded in the obtained energy shifts; hence, we evaluate the following sample, with

\[ a_1 = \frac{\upsilon}{2}\left(\frac{\hbar}{2m\omega}\right)^{3/2}, \quad a_2 = \frac{i\hbar\omega}{2}\left(\frac{\hbar}{2m\omega}\right)^{1/2}, \quad a_3 = i\hbar\left(\frac{\hbar}{m\omega}\right)^{1/2}, \quad a_4 = \frac{i\hbar}{2}\left(\frac{\hbar}{m\omega}\right)^{1/2}. \tag{78} \]

in table 1, all numerical values of the energy are in units of ev. it may be worth underlining that, thanks to the kronecker deltas, the elements of the perturbed hamiltonian seldom take many values. the dnc and non-dnc effects on the energy levels of the do are illustrated in fig. 3. the upper bound on the value of the nc parameter θ is \(\sqrt{\theta} \le 2\times10^{-20}\,\mathrm{m}\) [44]; for τ it is \(\sqrt{\tau} \le 10^{-17}\,\mathrm{eV}\) [45]. the bound on \(\sqrt{\tau}\) is consistent with the accuracy of the energy measurement, \(10^{-12}\,\mathrm{eV}\).

table 1. energy levels due to the dnc space, where we restrict ourselves to the eigenvalue corrections to the ground state, i.e. \(E_{1,2} = \pm mc^2\) (with \(mc^2 = 5.11\times10^5\) ev); all energies in ev.

  (n_x, n_y, n'_x, n'_y)   w12    e1                                     e2                                     △e
  (1, 1, 1, 1)             0      −5.11 × 10⁵                            5.11 × 10⁵                             0
  (1, 0, 0, 0)             a1     −5.11 × 10⁵ + a1²τ²/(10.22 × 10⁵)      5.11 × 10⁵ − a1²τ²/(10.22 × 10⁵)       a1²τ²/(10.22 × 10⁵)
  (0, 0, 0, 1)             a2     −5.11 × 10⁵ + a2²τ²/(10.22 × 10⁵)      5.11 × 10⁵ − a2²τ²/(10.22 × 10⁵)       a2²τ²/(10.22 × 10⁵)
  (0, 1, 0, 2)             a3     −5.11 × 10⁵ + a3²τ²/(10.22 × 10⁵)      5.11 × 10⁵ − a3²τ²/(10.22 × 10⁵)       a3²τ²/(10.22 × 10⁵)
  (0, 2, 0, 1)             −a4    −5.11 × 10⁵ + a4²τ²/(10.22 × 10⁵)      5.11 × 10⁵ − a4²τ²/(10.22 × 10⁵)       a4²τ²/(10.22 × 10⁵)

figure 3. diagram of the splittings of the energy levels due to the dnc and non-dnc spaces.

it is important to clarify that the presence of τ² in the eigenvalues is not due to the action of a second-order correction, but rather to the dirac matrices in the perturbed hamiltonian term. in fig. 3, we see the energy levels split as the θ and τ parameters are turned on; the values of the eigenvalues e1 and e2 are given in table 1. first, fig. 3 shows the effect of the nc space with the perturbation parameter off (τ = 0), where the effect of the noncommutativity is very significant, as explained in fig. 1. the effect of the noncommutativity is even more significant when the dnc perturbation term is present (τ ≠ 0); the presence of this term produces the energy shifts. the last part of fig. 3 shows the combined effect of both the dnc and nc parameters, where the effect is very evident.

4. conclusion

in conclusion, we have investigated the effects of a dnc space on the 2d do in the presence of an external magnetic field, in the language of creation and annihilation operators and through properly chosen canonical pairs of coordinates and their corresponding momenta in a complex nc space; the dynamical noncommutativity was treated as a perturbation. more precisely, we solved the do problem in a two-dimensional nc space to find the exact energy spectrum and wave functions, and we then employed these results to find the first-order corrections to the eigenvalues and eigenvectors. it is worth noting that we addressed the system in the nc space as the unperturbed system, instead of considering the fundamental system in a commutative space and the noncommutativity as a perturbation.
the first-order correction for the ground state of the do due to the noncommutativity of space is zero for the non-dnc case while it has a nonvanishing value in the dnc case. knowing that, the result reduces to that of a usual do in the limits of τ→0, θ→0. as mentioned in section 2, some operators in the dnc space are non-hermitian. this mixture of dncs, the non-hermiticity theory and the string theory can lead to fundamental new insights in these three fields. distinctly, there are plenty of interesting problems arising from our investigation, such as the investigation of further possibilities of consistent deformations, and studies of additional models in terms of the dnc variables. references [1] n. seiberg, e. witten. string theory and noncommutative geometry. journal of high energy physics (9):032, 1999. https://doi.org/10.1088/1126-6708/1999/09/032. [2] d. m. gingrich. noncommutative geometry inspired blackholes in higher dimensions at the lhc. journal of high energy physics 2010:22, 2010. https://doi.org/10.1007/jhep05(2010)022. [3] p. nicolini. noncommutative black holes, the final appeal to quantum gravity: a review. international journal of modern physics a 24(7):1229–1308, 2009. https://doi.org/10.1142/s0217751x09043353. [4] j. m. gracia-bondia. new paths towards quantum gravity, chap. notes on quantum gravity and noncommutative geometry, pp. 3–58. springer, berlin, heidelberg, 2010. https://doi.org/10.1007/978-3-642-11897-5_1. [5] m. chaichian, m. m. sheikh-jabbari, a. tureanu. hydrogen atom spectrum and the lamb shift in noncommutative qed. physical review letters 86(13):2716–2719, 2001. https://doi.org/10.1103/physrevlett.86.2716. [6] i. haouam. analytical solution of (2+1) dimensional dirac equation in time-dependent noncommutative phase-space. acta polytechnica 60(2):111–121, 2020. https://doi.org/10.14311/ap.2020.60.0111. [7] i. haouam. on the noncommutative geometry in quantum mechanics. journal of physical studies 24(2):1–10, 2002. https://doi.org/10.30970/jps.24.2002. [8] m. r. douglas, n. a. nekrasov. noncommutative field theory. reviews of modern physics 73(4):977–1029, 2001. https://doi.org/10.1103/revmodphys.73.977. [9] i. haouam. on the fisk-tait equation for spin-3/2 fermions interacting with an external magnetic field in noncommutative space-time. journal of physical studies 24(1):1801, 2020. https://doi.org/10.30970/jps.24.1801. [10] h. s. snyder. quantized space-time. physical review 71(1):38–41, 1947. https://doi.org/10.1103/physrev.71.38. [11] h. s. snyder. the electromagnetic field in quantized space-time. physical review 72(1):68–71, 1947. https://doi.org/10.1103/physrev.72.68. [12] n. sasakura. space-time uncertainty relation and lorentz invariance. journal of high energy physics (05):015, 2000. https://doi.org/10.1088/1126-6708/2000/05/015. [13] j. lukierski, h. ruegg, a. nowicki, v. n. tolstoi. q-deformation of poincaré algebra. physics letters b 264(3-4):331–338, 1991. https://doi.org/10.1016/0370-2693(91)90358-w. [14] a. p. balachandran, s. vaidya. lectures on fuzzy and fuzzy susy physics, 2007. world scientific. iisc/chep/11/05. [15] s. a. alavi, n. rezaei. dirac equation, hydrogen atom spectrum and the lamb shift in dynamical non-commutative spaces. pramana 88:77, 2017. https://doi.org/10.1007/s12043-017-1381-4. [16] a. fring, l. gouba, f. g. scholtz. strings from position-dependent noncommutativity. journal of physics a: mathematical and theoretical 43:345401, 2010. https://doi.org/10.1088/1751-8113/43/34/345401. [17] m. gomes, v. g. kupriyanov. 
position-dependent noncommutativity in quantum mechanics. physical review d 79:125011, 2009. https://doi.org/10.1103/physrevd.79.125011. [18] d. itô, k. mori, e. carriere. an example of dynamical systems with linear trajectory. nuovo cimento a (1965-1970) 51:1119–1121, 1967. https://doi.org/10.1007/bf02721775. [19] m. moshinsky, a. szczepaniak. the dirac oscillator. journal of physics a: mathematical and general 22(17):l817, 1989. https://doi.org/10.1088/0305-4470/22/17/002. [20] r. p. martínez-y-romero, a. l. salas-brito. conformal invariance in a dirac oscillator. journal of mathematical physics 33(5):1831, 1992. https://doi.org/10.1063/1.529660. [21] j. benitez, p. r. martnez y romero, h. n. núnez-yépez, a. l. salas-brito. solution and hidden supersymmetry of a dirac oscillator. physical review letters 64:1643, 1990. https://doi.org/10.1103/physrevlett.64.1643. [22] c. quesne, m. moshinsky. symmetry lie algebra of the dirac oscillator. journal of physics a: mathematical and general 23(12):2263, 1990. https://doi.org/10.1088/0305-4470/23/12/011. [23] o. l. de lange. shift operators for a dirac oscillator. journal of mathematical physics 32(5):1296, 1991. https://doi.org/10.1063/1.529328. 701 https://doi.org/10.1088/1126-6708/1999/09/032 https://doi.org/10.1007/jhep05(2010)022 https://doi.org/10.1142/s0217751x09043353 https://doi.org/10.1007/978-3-642-11897-5_1 https://doi.org/10.1103/physrevlett.86.2716 https://doi.org/10.14311/ap.2020.60.0111 https://doi.org/10.30970/jps.24.2002 https://doi.org/10.1103/revmodphys.73.977 https://doi.org/10.30970/jps.24.1801 https://doi.org/10.1103/physrev.71.38 https://doi.org/10.1103/physrev.72.68 https://doi.org/10.1088/1126-6708/2000/05/015 https://doi.org/10.1016/0370-2693(91)90358-w https://doi.org/10.1007/s12043-017-1381-4 https://doi.org/10.1088/1751-8113/43/34/345401 https://doi.org/10.1103/physrevd.79.125011 https://doi.org/10.1007/bf02721775 https://doi.org/10.1088/0305-4470/22/17/002 https://doi.org/10.1063/1.529660 https://doi.org/10.1103/physrevlett.64.1643 https://doi.org/10.1088/0305-4470/23/12/011 https://doi.org/10.1063/1.529328 ilyas haouam acta polytechnica [24] a. bermudez, m. martin-delgado, e. solano. exact mapping of the 2+1 dirac oscillator onto the jaynes-cummings model: ion-trap experimental proposal. physical review a 76:041801(r), 2007. https://doi.org/10.1103/physreva.76.041801. [25] a. bermudez, m. a. martin-delgado, a. luis. chirality quantum phase transition in the dirac oscillator. physical review a 77(6):063815, 2008. https://doi.org/10.1103/physreva.77.063815. [26] j. a. franco-villafañe, e. sadurní, s. barkhofen, et al. first experimental realization of the dirac oscillator. physical review letters 111:170405, 2013. https://doi.org/10.1103/physrevlett.111.170405. [27] c. quimbay, p. strange. arxiv:1311.2021. [28] c. quimbay, p. strange. arxiv:1312.5251. [29] a. boumali. thermodynamic properties of the graphene in a magnetic field via the two-dimensional dirac oscillator. physica scripta 90(4):045702, 2015. https://doi.org/10.1088/0031-8949/90/4/045702. [30] b. bagchi, a. fring. minimal length in quantum mechanics and non-hermitian hamiltonian systems. physics letters a 373(47):4307–4310, 2009. https://doi.org/10.1016/j.physleta.2009.09.054. [31] i. haouam. the non-relativistic limit of the dkp equation in non-commutative phase-space. symmetry 11(2):223, 2019. https://doi.org/10.3390/sym11020223. [32] b. mandal, s. verma. dirac oscillator in an external magnetic field. physics letters a 374(8):1021–1023, 2010. 
https://doi.org/10.1016/j.physleta.2009.12.048. [33] a. boumali, h. hassanabadi. the thermal properties of a two-dimensional dirac oscillator under an external magnetic field. the european physical journal plus 128:124, 2013. https://doi.org/10.1140/epjp/i2013-13124-y. [34] a. bermudez, m. a. martin-delgado, e. solano. mesoscopic superposition states in relativistic landau levels. physical review letters 99:123602, 2007. https://doi.org/10.1103/physrevlett.99.123602. [35] j. j. sakurai. modern quantum mechanics. (revised edition). addison-wesley, 1994. [36] s. cai, t. jing, g. guo, et al. dirac oscillator in noncommutative phase space. international journal of theoretical physics 49:1699–1705, 2010. https://doi.org/10.1007/s10773-010-0349-7. [37] b. p. mandal, s. k. rai. noncommutative dirac oscillator in an external magnetic field. physics letters a 376(36):2467–2470, 2012. https://doi.org/10.1016/j.physleta.2012.07.001. [38] m. hosseinpour, h. hassanabadi, m. de montigny. the dirac oscillator in a spinning cosmic string spacetime. the european physical journal c 79:311, 2019. https://doi.org/10.1140/epjc/s10052-019-6830-4. [39] m. de montigny, s. zare, h. hassanabadi. fermi field and dirac oscillator in a som-raychaudhuri space-time. general relativity and gravitation volume 50:47, 2018. https://doi.org/10.1007/s10714-018-2370-8. [40] k. bakke, h. mota. dirac oscillator in the cosmic string spacetime in the context of gravity’s rainbow. the european physical journal plus 133:409, 2018. https://doi.org/10.1140/epjp/i2018-12268-6. [41] h. chen, z.-w. long, y. yang, c.-y. long. study of the dirac oscillator in the presence of vector and scalar potentials in the cosmic string space-time. modern physics letters a 35(21):2050179, 2020. https://doi.org/10.1142/s0217732320501795. [42] f. ahmed. interaction of the dirac oscillator with the aharonov-bohm potential in (1+2)-dimensional gürses space-time backgrounds. annals of physics 415:168113, 2020. https://doi.org/10.1016/j.aop.2020.168113. [43] d. f. lima, f. m. andrade, l. b. castro, et al. on the 2d dirac oscillator in the presence of vector and scalar potentials in the cosmic string spacetime in the context of spin and pseudospin symmetries. the european physical journal c 79:596, 2019. https://doi.org/10.1140/epjc/s10052-019-7115-7. [44] o. bertolami, j. g. rosa, c. m. l. de aragão, et al. noncommutative gravitational quantum well. physical review d 72:025010, 2005. https://doi.org/10.1103/physrevd.72.025010. [45] s. a. alavi, m. a. nasab. gravitational radiation in dynamical noncommutative spaces. general relativity and gravitation 49:5, 2017. https://doi.org/10.1007/s10714-016-2167-6. 
acta polytechnica
https://doi.org/10.14311/ap.2022.62.0400
acta polytechnica 62(3):400–408, 2022
© 2022 the author(s). licensed under a cc-by 4.0 licence. published by the czech technical university in prague.

experimental verification of the impact of the air staging on the nox production and on the temperature profile in a bfb

matěj vodička∗, kristýna michaliková, jan hrdlička, pavel skopec, jitka jeníková

czech technical university in prague, faculty of mechanical engineering, department of energy engineering, technická 4, 166 07 prague, czech republic
∗ corresponding author: matej.vodicka@fs.cvut.cz

abstract. the results of an experimental research on air staging in a bubbling fluidized bed (bfb) combustor are presented within this paper. air staging is known as an effective primary measure to reduce nox formation. however, in a number of industrial bfb units it may not be sufficient to meet the emission standards. selective non-catalytic reduction (sncr) can then be a cost-effective option for a further reduction of the already formed nox. the temperature range required at the place of the reducing agent injection for an effective application of the sncr without an excessive ammonia slip is above the temperatures normally attained in bfbs. the aim of this paper is to evaluate the impact of staged air injection on the formation of nox in bfb combustors and to examine the possibility of increasing the freeboard temperature. several experiments with various secondary/primary air ratios were performed with a constant oxygen concentration in the flue gas. the experiments were carried out using wooden biomass and lignite as fuel in a 30 kwth laboratory-scale bfb combustor. furthermore, the results were verified using a 500 kwth pilot-scale bfb unit. the results confirmed that the air staging can effectively move the dominant combustion zone from the dense bed to the freeboard section, and thus the temperatures for an effective application of the sncr can be obtained.

keywords: air staging, bubbling fluidized bed, nox, sncr.

1. introduction

nitrogen oxides (nox), particularly no and no2, are gaseous pollutants that can cause significant environmental issues. one of the main contributors to the overall nox emissions is the combustion of solid fuels.
there are three mechanisms of nox formation in the combustion process: the thermal and the prompt nox formation mechanisms, where the source of nox is atmospheric nitrogen, and the oxidation of fuel-bound nitrogen. since the breaking of the tight n2 bond is strongly temperature dependent, the thermal nox formation mechanism (described by zeldovich et al. [1]) is usually considered important at temperatures higher than 1500 °c [2]. the prompt formation of nox (described by fenimore [3]) can be observed in fuel-rich zones in pre-mixed hydrocarbon flames; it is also strongly temperature dependent and is relevant from above 1400 °c. these conditions are not typical for combustion in bubbling fluidized beds (bfbs), so prompt and thermal nox are of minor importance there, and fuel-bound nitrogen is considered to be the main contributor to the nox formation [4–6].

fuel-bound nitrogen is an important source of nox in the combustion of fossil fuels and biomass. it is particularly important for coal combustion, since coal typically contains 0.5 − 2.0 wt. % of nitrogen, and for the combustion of non-wooden biomass, where the nitrogen content can reach up to 5 wt. %. the degree of conversion of fuel-n to nox is almost independent of the type of nitrogen compound, but depends significantly on the local combustion environment [7]. in the furnace, the fuel is thermally decomposed, and volatile compounds and char are produced. the fuel-bound nitrogen is distributed between the char and the volatiles, depending on the fuel structure and the devolatilisation conditions, such as temperature, heating rate, oxygen concentration, or residence time. at lower temperatures or shorter residence times, nitrogen preferably remains in the char, while at higher temperatures it becomes part of the volatiles [8]. the mechanisms of volatile-n and char-n conversion were described by winter et al. [9], who studied the combustion of single particles of bituminous coal, subbituminous coal and beech wood in an electrically heated, laboratory-scale, fluidized bed combustor.

the nox reduction can generally be realized using different methods: either by modifying the combustion process itself to suppress the nox formation (so-called primary measures), or by treatments applied after the combustion process to decrease the already formed nox (so-called secondary measures). skopec et al. [6] observed, using a 500 kwth bfb combustor, that the nox formation depends mainly on the excess of oxygen in the combustion zone and slightly also on the fluidized bed temperature. this was also confirmed by svoboda and pohořelý [10], who studied the formation of nox and n2o in a laboratory-scale pressurized bfb. they observed that the nox formation was strongly promoted by an increase in air stoichiometry (while the n2o formation was depressed) at atmospheric pressure. they also observed that an increase in the temperature of the fluidized bed slightly promotes the formation of nox and decreases the formation of n2o under a slightly elevated pressure (0.25 mpa). the primary measures therefore reduce the temperature and oxygen concentration in the furnace and subsequently allow the oxidation of the remaining combustibles above the furnace. staged injection of air or fuel and flue gas recirculation can be used as primary measures.
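the degree of conversion of fuel-n to no discussed above is the quantity reported later as "n − no" in tables 4–6. a minimal sketch of the underlying balance follows; the flue gas yield assumed below is an illustrative number (the paper does not state it), the nox is assumed to be expressed as no2, and the fuel data are taken from tables 1 and 4.

    # Fuel-N -> NO conversion ratio (mole %), a sketch of the balance behind
    # the "n - no" rows of tables 4-6. The flue gas yield is an illustrative
    # assumption; NOx is assumed to be expressed as NO2 (46 g/mol).
    M_NO2 = 46.0          # g/mol
    M_N = 14.0            # g/mol

    c_nox = 635.0         # mg/m3(N) measured NOx in dry flue gas (table 4, case 1)
    v_fluegas = 60.0      # m3(N)/h of dry flue gas -- illustrative assumption
    m_fuel = 5.8          # kg/h fuel feed (table 4)
    w_N = 0.011 * 0.690   # as-received N fraction: 1.1 % daf x 69.0 % combustibles

    n_no = (c_nox / 1000.0) * v_fluegas / M_NO2     # mol NO per hour
    n_fuel_N = m_fuel * 1000.0 * w_N / M_N          # mol fuel-N per hour
    print(f"fuel-N to NO conversion: {100.0 * n_no / n_fuel_N:.1f} % mole")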
fuel staging requires a secondary gas-phase fuel and is not a practically applied method for bfbs. in the case of staged air injection, the oxidizer is separated into two or even more streams. the first stream is introduced into the bfb as the primary air (possibly mixed with the fgr). the second stream (and possibly subsequent streams) is introduced into the freeboard section above the fluidized bed. in the primary combustion zone, there are fuel-rich conditions (a stoichiometric or even sub-stoichiometric oxygen/fuel ratio) that cause a smaller conversion of the nox precursors to nox and favor the formation of n2. furthermore, the already formed no can be further reduced a) through reburning reactions with the released fuel-n (mainly hcn and nh3), b) through reactions with carbon compounds that have not yet been completely oxidized, or c) on the char surface through catalytic reactions [11, 12]. gaseous products of incomplete combustion (co and toc), which are inevitably formed under such conditions, are subsequently oxidized in the freeboard section, where the secondary oxidizer is introduced. the efficiency of the nox reduction through the staged injection of air depends significantly on the residence time in the primary zone with sub-stoichiometric (reducing) conditions [12, 13]. the optimum residence time in this zone may vary according to the fuel structure; for a lignite coal, it is about 1.5 s [13]. if the residence time in this zone is too short, no can still form in a significant amount in the secondary oxidation zone [12]. sirisomboon and charernporn [14] also observed that the relative reduction efficiency of the staged air injection depends on the overall stoichiometry of the combustion air. with the most extensive air staging, they observed a similar reduction of about 70 ppmv of no at all air excess ratios (in the range from 20 to 80 %). since the formation of nox is strongly affected by the overall air stoichiometry, the relative reduction efficiency was higher for a lower excess of air and lower for a higher excess of air.

the secondary measures are mainly the selective non-catalytic (sncr) and the selective catalytic reduction (scr). these flue gas treatments use reducing agents containing the nh2 group (ammonia, urea, ammonia water, etc.), which can reduce no. the reduction follows equation (1) in the case of ammonia as a reducing agent,

2 NO + 2 NH3 + 1/2 O2 → 2 N2 + 3 H2O, (1)

and equation (2) in the case of urea,

2 NO + CO(NH2)2 + 1/2 O2 → 2 N2 + CO2 + 3 H2O. (2)

figure 1. comparison of the dependence of the nox reduction efficiency (expressed as the ratio of the final nox concentration with sncr to the initial nox concentration without sncr) on the flue gas temperature, using urea, cyanuric acid (hnco), ammonia and ammonium sulfate [15].

the effective operation of the sncr without an excessive ammonia slip is determined by an optimal temperature range for the injection of the reducing agent into the combustor. the ideal temperature range depends on the reducing agent used, as can be seen in figure 1. however, the required temperature range of 900 – 1000 °c is normally achieved neither in the bed nor in the freeboard of bfbs. the catalyst present in scr systems usually allows achieving a higher reduction efficiency at significantly lower temperatures compared to the sncr technology, but at a high investment cost.
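by the stoichiometry of equations (1) and (2), one mole of nh3 reduces one mole of no, while one mole of urea reduces two. a short sketch of the reagent demand for a target no reduction; the normalized stoichiometric ratio nsr = 1.5 is a typical design assumption rather than a value from the paper, and the flue gas flow is illustrative.

    # Reagent demand for SNCR, from the stoichiometry of eqs. (1) and (2).
    # NSR (moles of reagent per stoichiometric mole) and the flue gas flow
    # are assumed design values, not data from the paper.
    M_NO2, M_NH3, M_UREA = 46.0, 17.0, 60.0   # g/mol

    c_in, c_out = 256.0, 173.0    # mg/m3(N) NOx before/after (cf. table 6 levels)
    v_fg = 800.0                  # m3(N)/h dry flue gas -- illustrative
    nsr = 1.5                     # assumed normalized stoichiometric ratio

    no_reduced = (c_in - c_out) / 1000.0 * v_fg / M_NO2       # mol NO per hour
    nh3_feed = nsr * no_reduced * M_NH3 / 1000.0              # kg/h (1 NH3 : 1 NO)
    urea_feed = nsr * (no_reduced / 2.0) * M_UREA / 1000.0    # kg/h (1 urea : 2 NO)
    print(f"NH3 demand: {nh3_feed:.3f} kg/h, urea demand: {urea_feed:.3f} kg/h")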
air staging, which partly moves the combustion zone from the dense bed zone to the freeboard of the bfb, appears to be a suitable measure for increasing the freeboard temperature to the required temperature range. sirisomboon and charernporn [14] increased the temperature in the freeboard section of a pilot-scale bfb by about 100 °c through air staging and reached up to 1100 °c using high-volatile sunflower shells as fuel. although the nox formation and the possible reduction paths in bfbs have been studied by a number of authors using multiple fuels and various scales of devices, the possible application of the sncr in bfbs has not been of significant interest yet. this paper presents a comprehensive experimental study of the impact of the staged air supply on the nox formation in a bfb and on the temperature profile within the combustor as a consequence of the sub-stoichiometric combustion in the dense bed and the subsequent oxidation of the remaining combustibles in the freeboard section, in order to be able to define the process conditions for reaching the sncr optimal temperature range. a number of experiments were performed combusting lignite and wooden pellets in various operating regimes of a 30 kwth bfb experimental facility. furthermore, to validate the results and their applicability to the industrial scale, the same experiments were performed in a 500 kwth pilot-scale bfb combustor.

2. experiments

2.1. experimental setup

the 30 kwth experimental facility has been comprehensively described elsewhere [16], so it will be described only briefly here. its scheme is given in figure 2.

figure 2. scheme of the 30 kwth experimental bfb facility: (a) the facility and its equipment — 1) fluidized bed region, 2) freeboard section, 3) fluidizing gas distributor, 4) fluidized bed spillway, 5) fuel feeder, 6) cyclone particle separator, 7) flue gas fan, 8) and 9) fgr water coolers, 10) condensate drain, and 11) primary gas fan; (b) indication of the temperature measuring points.

the facility is 2 m high, has a rectangular cross-section, and is made of stainless steel, insulated from the outside using a 50 mm thick insulfrax board in the fluidized bed section and using mineral wool in the freeboard section. there are no internal heat exchangers in the facility. the volumetric flows of primary and secondary air were measured using two rotameters. the temperature profile along the height of the facility was measured using five thermocouples in the dense bed region and three thermocouples in the freeboard region. directly in the fluidized bed, the temperature was measured with two thermocouples; however, only the value of the one placed 166 mm above the fluidizing gas distributor was taken as representative. primary air with real flue gas recirculation was used to provide sufficient fluidization, and secondary air was introduced at the beginning of the freeboard section.

figure 3. scheme of the 500 kwth pilot-scale bfb facility: 1) primary gas inlets, 2) fuel feeder, 3) fluidized bed region, 4) secondary air distributors, 5) freeboard section, 6) inspection windows, 7) crossover pass, and 8) heat exchanger. the 't' signs indicate the temperature measurement points.
the flue gas was continuously sampled, and the volumetric fractions of co2, co, so2 and nox were measured using ndir sensors, while the volumetric fraction of o2 was measured using a paramagnetic sensor.

the scheme of the 500 kwth pilot-scale bfb boiler is given in figure 3. this boiler consists of three main sections: the combustion chamber with the freeboard, the crossover pass, and the heat exchanger. the fluidization gas, formed by primary air and recirculated flue gas, enters the bed through a distributor consisting of 36 nozzles placed in 6 rows. the combustion chamber and the freeboard section have a cylindrical cross-section and are insulated with a fireclay lining with a water-cooled surface. in the freeboard area, the facility is equipped with 6 thermocouples along the height. secondary air is supplied to the freeboard section by 4 distributors placed evenly on the perimeter, and each distributor can provide the secondary air inlet at 4 different heights. for the experiments, the secondary air inlets at a height of 550 mm above the fluidized bed were used. from the freeboard section, the flue gas continues to the empty crossover pass with a water-cooled surface and then enters the heat exchanger. the flue gas was continuously sampled, and the volumetric fractions of co2, co, so2 and nox were measured using ndir sensors, while the volumetric fraction of o2 was measured using a paramagnetic sensor.

2.2. materials

the experiments were carried out using the czech lignite bílina hp1 135 and pellets from spruce wood as fuel. the proximate and ultimate analyses of the lignite and the wooden pellets are given in table 1.

table 1. proximate and ultimate analysis of the fuels used within the experiments.

              as received                            dry ash free
              lhv      water   ash     comb.*       c      h      n      s       o*      volatiles
              [mj/kg]  [wt.%]  [wt.%]  [wt.%]       [wt.%] [wt.%] [wt.%] [wt.%]  [wt.%]  [wt.%]
  lignite     17.6     21.1    9.9     69.0         72.3   6.3    1.1    1.3     19.0    47.0
  wood        16.4     7.8     1.5     90.7         51.0   6.9    0.3    0.003   41.797  84.6
  * calculated as balance to 100 %.

the lignite has a significantly higher nitrogen content compared to the spruce wood, so the nox yields should also be higher in the case of lignite combustion. the spruce wood combustibles contain about 100 % more volatiles compared to the lignite, which should move the dominant combustion zone slightly higher in the facility in the case of wood combustion. for the experiments carried out using the 30 kwth facility, the lignite was sieved to a particle size of up to 7 mm. the lignite ash was used as the bed material in the case of lignite combustion. the biomass combustion experiments were carried out using spruce wood pellets (according to the enplus a1 standard) and using a lightweight ceramic aggregate (lwa) as an external bed material. the physical properties of the bed materials are given in table 2. the arithmetic mean, mode, median, 1st decile (d10), and 9th decile (d90) particle sizes were evaluated, and the density and bulk density were analyzed along with the particle size.

table 2. results of the particle size distribution analysis of the fluidized bed materials.

                        lignite ash   lwa 0−2
  ρs      [kg·m−3]      2195          1088
  ρb      [kg·m−3]      795           570
  dmean   [mm]          0.37          1.07
  dmode   [mm]          0.23          1.02
  d10     [mm]          0.15          0.54
  d50     [mm]          0.74          1.05
  d90     [mm]          1.77          1.59
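the ultimate analysis in table 1 fixes the stoichiometric air demand that underlies the excess-air ratios λ used later. a short sketch of the standard oxygen balance per kg of as-received fuel follows; the 21 % vol. o2 in air and the molar volume 22.4 m3(N)/kmol are textbook constants, and the as-received elemental fractions are obtained here by scaling the daf values of table 1 with the combustibles share.

    # Stoichiometric air demand from the ultimate analysis of table 1
    # (oxygen balance per kg of as-received fuel).
    VM = 22.4        # m3(N)/kmol molar volume
    X_O2_AIR = 0.21  # volume fraction of O2 in air

    def stoich_air(comb, c, h, s, o):
        """comb: combustibles share (as received); c,h,s,o: daf mass fractions."""
        c, h, s, o = (comb * x for x in (c, h, s, o))   # to as-received basis
        o2 = VM * (c / 12.01 + h / 4.032 + s / 32.06 - o / 32.0)  # m3(N) O2 / kg
        return o2 / X_O2_AIR                             # m3(N) air / kg fuel

    for name, args in {"lignite": (0.690, 0.723, 0.063, 0.013, 0.190),
                       "wood": (0.907, 0.510, 0.069, 0.00003, 0.41797)}.items():
        print(f"{name}: ~{stoich_air(*args):.2f} m3(N) air per kg fuel")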
the lignite ash can be classified as geldart b particles, which are well fluidizable and form vigorous bubbles [17]. the used lwa population, referred to as '0−2', is on the boundary between the b and d particle types in the geldart classification, where the class d particles are difficult to fluidize in deep beds: they spout and form exploding bubbles or channels [17]. from the analysis of the particle size distribution (psd), the minimum fluidization velocity, the minimum complete fluidization velocity (defined as umf calculated for the particle size d90), and the terminal particle velocity were evaluated for two different conditions of the fluidization gas. the numerical approach was taken from [18]. first, the gas properties corresponded to air at 20 °c (ρg = 1.20 kg·m−3, η = 1.8·10−5 pa·s), and secondly to air at 850 °c (ρg = 0.29 kg·m−3, η = 3.8·10−5 pa·s). the fluidizing gas temperature of 850 °c was chosen because, when the fluidizing gas passes through the bed of hot material, it is heated to the bed temperature within a few millimeters above the fluidizing gas distributor [19]. the minimum fluidization velocities of both bed materials were also verified experimentally by measuring the correlation of the bed pressure drop and the superficial fluidizing gas velocity u0; this method can be found in [18]. this test was carried out using air at 20 °c as the fluidization gas. the calculated minimum fluidization velocities, complete fluidization velocities, and terminal velocities, as well as the experimentally determined minimum fluidization velocities, are given in table 3.

table 3. minimum fluidization velocities umf, terminal velocities ut (calculated for dmean), minimum velocities of complete fluidization umf−90, and complete terminal velocities ut−90 (calculated for d90) of the selected fluidized bed materials. the experimental values are in brackets.

  conditions       quantity            lignite ash   lwa 0−2
  air at 20 °c     umf     [m·s−1]     0.37 (0.42)   0.40 (0.38)
                   umf−90  [m·s−1]     1.72          0.65
                   ut      [m·s−1]     2.02          3.09
                   ut−90   [m·s−1]     5.47          4.06
  air at 850 °c    umf     [m·s−1]     0.21          0.27
                   umf−90  [m·s−1]     2.59          0.62
                   ut      [m·s−1]     2.36          4.95
                   ut−90   [m·s−1]     10.23         7.24

2.3. methods

to evaluate the impact of the staged air supply on the nox emissions and on the temperature profile within the bfb combustor, as a consequence of the sub-stoichiometric combustion in the dense bed and the subsequent oxidation of the remaining combustibles in the freeboard, two series of experiments were performed using the 30 kwth bfb experimental facility, using lignite and biomass as fuels. furthermore, a series of experiments was done using the 500 kwth pilot-scale facility, using only lignite as fuel. to highlight this impact and to reduce the side effects of the fluidized bed temperature, the experiments were carried out with a constant bed temperature of 880 °c, and the oxygen level in the dry flue gas was maintained at 8 % for the wooden pellets and at 6 % for the lignite, for both facilities. the bed temperature was controlled through the change of the volumetric flow or of the composition of the fluidizing gas, which consisted of primary air and recirculated flue gas.

the degree of the combustion air staging can be described using equation (3):

ψ = V̇air,sec / V̇air,prim, (3)

where V̇air,sec is the volumetric flow of secondary air and V̇air,prim is the volumetric flow of primary air. in the experiments performed using the 30 kwth bfb facility, one case without air staging was measured as a reference for both fuels, and then four cases with the secondary/primary air ratio ψ increasing up to 2.75 were measured.
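the velocity calculations behind table 3 follow kunii and levenspiel [18]; one common closure given there is the wen–yu correlation for the minimum fluidization velocity. the sketch below applies it to the table 2 inputs as a plausible reconstruction — it is not necessarily the exact correlation the authors used, so its outputs need not match table 3 closely.

    import math

    def u_mf_wen_yu(d_p, rho_s, rho_g, mu, g=9.81):
        """Minimum fluidization velocity via the Wen-Yu correlation
        (Kunii & Levenspiel): Re_mf = sqrt(33.7^2 + 0.0408 Ar) - 33.7."""
        ar = rho_g * (rho_s - rho_g) * g * d_p**3 / mu**2   # Archimedes number
        re_mf = math.sqrt(33.7**2 + 0.0408 * ar) - 33.7
        return re_mf * mu / (rho_g * d_p)

    # table 2 mean diameters; gas properties for air at 20 °C and at 850 °C
    for name, d_p, rho_s in (("lignite ash", 0.37e-3, 2195.0),
                             ("LWA 0-2", 1.07e-3, 1088.0)):
        cold = u_mf_wen_yu(d_p, rho_s, rho_g=1.20, mu=1.8e-5)
        hot = u_mf_wen_yu(d_p, rho_s, rho_g=0.29, mu=3.8e-5)
        print(f"{name}: u_mf(20 °C) ~ {cold:.2f} m/s, u_mf(850 °C) ~ {hot:.2f} m/s")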
the minimum flows of primary and secondary air were limited by the flowmeters used for the volumetric flow measurement and were set to 10 m3n/h. the step for incrementing the secondary air flow was 6 m3n/h. however, to keep the overall oxygen level constant and to maintain the bed temperature and a sufficient fluidization, it was not possible to keep the required secondary/primary air ratio exactly constant throughout. a reference case without air staging and three cases with an increasing secondary/primary air ratio ψ up to 1.6 were performed using the 500 kwth pilot-scale facility.

3. results and discussion

the results of the biomass and coal combustion in the 30 kwth facility (given in figures 4 and 5 and in tables 4 and 5) show that the air staging positively affected the nox concentration in the flue gas: the nox formation was suppressed with a higher secondary/primary air ratio. on the other hand, the co level increased significantly, as can be seen in figures 4 and 5. this could be caused by the decrease of the excess oxygen to a sub-stoichiometric atmosphere in the fluidized bed region (λprim < 1), connected with incomplete combustion and an increased co production, which is then not oxidized effectively in the freeboard region, possibly due to the lower temperatures there. from this point of view, the application of air staging is therefore limited by acceptable co emissions. unfortunately, the flue gas temperature significantly decreased as the gas passed through the facility, due to the poor insulation of the freeboard section of the 30 kwth bfb facility.

figure 4. the dependence of the nox and co volumetric concentrations in dry flue gas on the ratio of the secondary to primary air volumetric flows ψ in the case of lignite combustion in the 30 kwth bfb facility. the gaseous pollutant concentrations are related to 6 % vol. of o2 in dry flue gas.

figure 5. the dependence of the nox and co volumetric concentrations in dry flue gas on the ratio of the secondary to primary air volumetric flows ψ in the case of biomass combustion in the 30 kwth bfb facility. the gaseous pollutant concentrations are related to 6 % vol. of o2 in dry flue gas.

table 4. experimental results of the staged supply of air on the nox formation in the case of lignite combustion in the 30 kwth bfb facility. gaseous pollutant concentrations are related to 6 % vol. of o2 in dry flue gas.

  parameter  unit         'case 1'     'case 2'     'case 3'     'case 4'     'case 5'
  ψ          [−]          0.0          0.35         0.62         1.23         2.45
  λprim      [−]          1.4          1.04         0.8          0.68         0.48
  ϕo2        [% vol.]     5.9 ± 0.03   5.93 ± 0.03  5.96 ± 0.03  6.09 ± 0.03  6.13 ± 0.01
  tbfb       [°c]         881 ± 0.6    889 ± 0.3    886 ± 0.3    890 ± 0.4    887 ± 0.3
  cnox       [mg·m−3n]    635 ± 1      491 ± 2      423 ± 1      328 ± 1      266 ± 0
  cco        [mg·m−3n]    277 ± 7      347 ± 4      451 ± 1      1038 ± 5     2155 ± 8
  u0         [m·s−1]      1.73         1.73         1.86         1.71         1.71
  n−no       [% mole]     21.61        16.73        14.41        11.16        9.06
  mfuel      [kg·h−1]     5.8          5.8          5.8          5.8          5.8

table 5. experimental results of the staged supply of air on the nox formation in the case of wood combustion in the 30 kwth bfb facility. gaseous pollutant concentrations are related to 6 % vol. of o2 in dry flue gas.

  parameter  unit         'case 1'     'case 2'     'case 3'     'case 4'     'case 5'
  ψ          [−]          0.0          0.25         0.6          1.09         2.29
  λprim      [−]          1.64         1.29         1.03         0.81         0.48
  ϕo2        [% vol.]     8.21 ± 0.05  8.24 ± 0.04  7.91 ± 0.03  7.93 ± 0.04  7.44 ± 0.03
  tbfb       [°c]         888 ± 0.7    894 ± 0.5    888 ± 0.3    890 ± 0.7    897 ± 0.9
  cnox       [mg·m−3n]    173 ± 0      158 ± 0      135 ± 0      127 ± 0      100 ± 0
  cco        [mg·m−3n]    217 ± 3      455 ± 5      524 ± 5      593 ± 6      843 ± 6
  u0         [m·s−1]      2.46         2.35         2.41         2.31         2.31
  n−no       [% mole]     12.01        11.0         9.36         8.85         6.95
  mfuel      [kg·h−1]     6.9          6.9          6.9          6.9          6.9

if the temperature is too low at the point of the secondary air injection, the oxidation of the unburned combustibles is very slow, which reduces the desired effect of the air staging. since both the fluidized bed and the freeboard section are well insulated by the fireclay lining in the 500 kwth facility, the freeboard temperatures there are significantly higher compared to the 30 kwth facility. therefore, the secondary air properly oxidizes co and the other incomplete combustion products, and the concentration of co does not increase (moreover, it decreases) with an increasing secondary/primary air ratio ψ, as can be seen in figure 6 and table 6, where the results of the lignite combustion in the 500 kwth bfb facility are given.

figure 6. the dependence of the nox and co volumetric concentrations in dry flue gas on the ratio of the secondary to primary air volumetric flows ψ in the case of lignite combustion in the 500 kwth bfb facility. the gaseous pollutant concentrations are related to 6 % vol. of o2 in dry flue gas.

table 6. experimental results of the staged supply of air on the nox formation in the case of lignite combustion in the 500 kwth bfb facility. gaseous pollutant concentrations are related to 6 % vol. of o2 in dry flue gas.

  parameter  unit         'case 1'     'case 2'     'case 3'     'case 4'
  ψ          [−]          0.0          0.48         0.91         1.6
  λprim      [−]          1.19         0.7          0.66         0.57
  ϕo2        [% vol.]     6.06 ± 0.06  5.27 ± 0.04  4.75 ± 0.04  5.73 ± 0.07
  tbfb       [°c]         887 ± 0.7    885 ± 0.5    878 ± 0.6    879 ± 0.4
  cnox       [mg·m−3n]    256 ± 2      186 ± 1      173 ± 1      253 ± 1
  cco        [mg·m−3n]    4314 ± 209   2882 ± 206   3118 ± 236   499 ± 81
  u0         [m·s−1]      1.71         1.29         1.02         1.26
  n−no       [% mole]     8.72         6.32         5.87         8.63
  mfuel      [kg·h−1]     50.0         55.0         46.0         62.0

based on the rise of the nox concentration in the flue gas related to the increase of the secondary/primary air ratio ψ from 0.92 to 1.59 in figure 6, it can be estimated that the nox reduction through the air staging has an optimum at a secondary/primary air ratio ψ of about 1. the further nox reduction observed in the results obtained using the 30 kwth facility can then be explained by the increased co concentration.

the impact of the air staging on the temperature profile in the 30 kwth combustor can be seen in figures 7 and 8, and on the temperature profile in the 500 kwth combustor in figure 9. in the case of the 30 kwth combustor, it was not possible to achieve the temperatures required for the sncr while keeping the bed temperature constant for coal combustion. in the best scenario (for the secondary/primary air ratio ψ of 1.4), the temperature rose by 10 °c in a small area directly above the fluidized bed, but this could also be caused by a not exactly identical fluidized bed temperature. in general, no significant positive impact of the increase in the secondary air supply was observed. in the case of biomass combustion, an enhanced combustion of the volatiles above the fluidized bed was observed with an increase of the secondary/primary air ratio, causing a considerable increase in temperature. there, a temperature greater than 900 °c was reached; thus, conducive conditions could be provided for the application of the sncr, although the secondary/primary air ratio must be high. however, a significant decrease of the freeboard temperatures can be observed for both the biomass and the coal combustion in figures 7 and 8, possibly due to the insufficient insulation of the freeboard section. the negative impact of the staged air supply can be caused by the increased volumetric flow, as the higher flow rate is then associated with a heat loss. it can be expected that with an improved insulation, the air staging would increase the freeboard temperature also for coal combustion, because it did so in the 500 kwth facility, where both the fluidized bed and the freeboard section are insulated using an inner fireclay lining. as can be seen in figure 9, where the dependence of the temperature height profile on the secondary/primary air ratio ψ is given, the staged air injection can move the dominant combustion zone from the furnace to the freeboard section and could therefore increase the freeboard temperature sufficiently for an efficient application of the sncr in bfb boilers, even in the case of lignite combustion. the most significant increase of the freeboard temperature was achieved for the secondary/primary air ratio of 0.92, where the freeboard temperature was 953 °c at a fluidized bed temperature of 880 °c. a further increase of the secondary/primary air ratio did not improve the freeboard temperature but rather cooled it. it can also be seen that the higher volumetric flow of the secondary air, and thus the higher flue gas velocity in the freeboard section, moved the highest temperature measured in the height profile further downstream in the flue gas. the increased temperature in the freeboard section invoked by the staged injection of air is also in agreement with the findings of sirisomboon and charernporn [14], who carried out their experiments in a well-insulated pilot-scale bfb.

figure 7. the dependence of the temperature height profile within the 30 kwth bfb facility on the ratio of the secondary to primary air volumetric flows ψ in the case of lignite combustion (ψ = 0.00, 0.38, 0.80, 1.41, 2.75). the horizontal dash-and-dot line represents the height where the secondary air is injected.

figure 8. the dependence of the temperature height profile within the 30 kwth bfb facility on the ratio of the secondary to primary air volumetric flows ψ in the case of biomass combustion (ψ = 0.00, 0.25, 0.63, 1.14, 2.60). the horizontal dash-and-dot line represents the height where the secondary air is injected.

figure 9. the dependence of the temperature height profile within the 500 kwth bfb facility on the ratio of the secondary to primary air volumetric flows ψ in the case of lignite combustion (ψ = 0.00, 0.50, 0.92, 1.59). the horizontal dash-and-dot line represents the height where the secondary air is injected.
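the tables report concentrations "related to 6 % vol. of o2 in dry flue gas": values measured at other oxygen levels are rescaled with the standard reference-oxygen formula. a minimal sketch follows; the measured inputs below are illustrative, not readings from the paper.

    # Standard normalization of a measured pollutant concentration to a
    # reference O2 level (here 6 % vol., as used in tables 4-6).
    def to_ref_o2(c_meas, o2_meas, o2_ref=6.0):
        """c_meas in mg/m3(N) at o2_meas % vol. O2 (dry) -> value at o2_ref."""
        return c_meas * (20.9 - o2_ref) / (20.9 - o2_meas)

    # illustrative reading: 210 mg/m3(N) NOx measured at 8.2 % vol. O2
    print(f"{to_ref_o2(210.0, 8.2):.0f} mg/m3(N) at 6 % O2")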
4. conclusion

the aim of this work was to experimentally verify the impact of air staging on the nox emissions from the combustion of coal and wooden biomass in a bfb, and to study the feasibility of providing sufficient conditions for the application of the selective non-catalytic reduction of nitrogen oxides in a bfb, using two experimental combustors with thermal loads of 30 and 500 kwth. the experiments showed that air staging itself is an effective primary measure for reducing the nox formation in a bfb. in the results from the 30 kwth facility, the suppression of the nox directly correlated with an increased secondary/primary air flow ratio; a nox emission reduction efficiency of about 55 % and 40 % was achieved in the case of lignite and biomass combustion, respectively. in the experiments carried out using the 500 kwth facility, the best nox emission reduction efficiency of 33 % was achieved for the secondary/primary air ratio ψ of 0.92, and its further increase did not bring a further nox reduction.

due to the poor insulation of the freeboard section of the 30 kwth bfb facility, it was not possible to increase the freeboard temperature to a value higher than the fluidized bed temperature for either of the fuels used. in the case of biomass combustion, the results confirmed that it is possible to reach the temperature range suitable for the sncr of nox by air staging, but the secondary/primary air ratio must be significantly high, resulting in sub-stoichiometric conditions in the dense bed; however, the temperature increase was observed only in the well-insulated fluidized bed region below the secondary air inlets. for coal combustion, no positive impact of the staged injection of combustion air was observed in the 30 kwth facility. the experiments carried out with the 500 kwth bfb facility showed that if the combustion chamber and the freeboard section are properly insulated, the freeboard temperature increases with the increase of the secondary/primary air ratio. the ideal value of this ratio was approximately 1, for which the freeboard temperature was about 70 °c higher than the fluidized bed temperature. it can be expected that with an insulation improvement of the 30 kwth facility, the air staging would have a positive impact on the temperature profile for the combustion of both fuels. recently, the insulation of the freeboard section of the 30 kwth bfb facility was replaced with insulfrax boards, and further experiments confirmed that the temperature height profiles within the combustor are significantly improved. this study will continue with the characterization of the formation of gaseous pollutants in the combustion of alternative biomass fuels in a bfb. in addition, the application of the sncr in the 500 kwth bfb combustor will be studied.

acknowledgements

this work was supported by the ministry of education, youth and sports under the op rde grant number cz.02.1.01/0.0/0.0/16_019/0000753 "research centre for low-carbon energy technologies", which is gratefully acknowledged.
list of symbols bfb bubbling fluidized bed lhv low heating value lwa lightweight ceramic aggregate psd particle sized distribution scr selective catalytic reduction sncr selective non-catalytic reduction cco concentration of co in the dry flue gas [mg/m3n] cnox concentration of nox in the dry flue gas [mg/m 3 n] d10 1st decile particle size [mm] d90 9th decile particle size [mm] d50 median diameter [mm] dmean mean particle size [mm] dmode mode particle size [mm] h height [mm] mf uel fuel load [kg h−1] n no conversion ratio of fuel nitrogen to no [% mole] t temperature [°c] tbf b temperature in the bubbling fluidized bed [°c] u0 superficial gas velocity [m s−1] umf minimum fluidization velocity [m s−1] umf −90 minimum fluidization velocity for d90 [m s−1] ut−90 terminal particle velocity for d90 [m s−1] ut terminal particle velocity [m s−1] vair prim volumetric flow of primary air [m3n/h] vair sec volumetric flow of secondary air [m3n/h] λprim ratio of the air excess in the primary combustion zone [–] ϕo2 volumetric fraction of o2 in dry flue gas [% vol ] ρs density of solid material [kg m−3] ρb bulk density of solid material [kg m−3] ψ ratio of the secondary to primary air volumetric flows [–] 407 m. vodička, k. michaliková, j. hrdlička et al. acta polytechnica references [1] y. b. zeldovich, p. y. sadovnikov, d. a. frankkamenetskii. oxidation of nitrogen in combustion. tech. rep., academy of sciences of ussr, institute of chemical physics, moscow-leningrad, 1947. transl. by m. shelef. [2] i. glassman, r. a. yetter. combustion. academic press, elsevier, 4th edn., 2008. [3] c. fenimore. formation of nitric oxide in premixed hydrocarbon flames. symposium (international) on combustion 13(1):373–380, 1971. https://doi.org/10.1016/s0082-0784(71)80040-1. [4] j. e. johnsson. formation and reduction of nitrogen oxides in fluidized-bed combustion. fuel 73(9):1398–1415, 1994. https://doi.org/10.1016/0016-2361(94)90055-8. [5] j. konttinen, s. kallio, m. hupa, f. winter. no formation tendency characterization for solid fuels in fluidized beds. fuel 108:238–246, 2013. https://doi.org/10.1016/j.fuel.2013.02.011. [6] p. skopec, j. hrdlička, j. opatřil, j. štefanica. nox emissions from bubbling fluidized bed combustion of lignite coal. acta polytechnica 55(4):275–281, 2015. https://doi.org/10.14311/ap.2015.55.0275. [7] j. a. miller, c. t. bowman. mechanism and modeling of nitrogen chemistry in combustion. progress in energy and combustion science 15(4):287–338, 1989. https://doi.org/10.1016/0360-1285(89)90017-8. [8] p. glarborg, a. d. jensen, j. e. johnsson. fuel nitrogen conversion in solid fuel fired systems. progress in energy and combustion science 29(2):89–113, 2003. https://doi.org/10.1016/s0360-1285(02)00031-x. [9] f. winter, c. wartha, g. löffler, h. hofbauer. the no and n2o formation mechanism during devolatilization and char combustion under fluidized-bed conditions. symposium (international) on combustion 26(2):3325–3334, 1996. https://doi.org/10.1016/s0082-0784(96)80180-9. [10] k. svoboda, m. pohořelý. influence of operating conditions and coal properties on nox and n2o emissions in pressurized fluidized bed combustion of subbituminous coals. fuel 83(7-8):1095–1103, 2004. https://doi.org/10.1016/j.fuel.2003.11.006. [11] h. stadler, d. christ, m. habermehl, et al. experimental investigation of nox emissions in oxycoal combustion. fuel 90(4):1604–1611, 2011. https://doi.org/10.1016/j.fuel.2010.11.026. [12] w. fan, y. li, q. guo, et al. 
coal-nitrogen release and nox evolution in the oxidant-staged combustion of coal. energy 125:417–426, 2017. https://doi.org/10.1016/j.energy.2017.02.130. [13] h. spliethoff, u. greul, h. rüdiger, k. r. hein. basic effects on nox emissions in air staging and reburning at a bench-scale test facility. fuel 75(5):560–564, 1996. https://doi.org/10.1016/0016-2361(95)00281-2. [14] k. sirisomboon, p. charernporn. effects of air staging on emission characteristics in a conical fluidized-bed combustor firing with sunflower shells. journal of the energy institute 90(2):316–323, 2017. https://doi.org/10.1016/j.joei.2015.12.001. [15] s. l. chen, j. a. cole, m. p. heap, et al. advanced nox reduction processes using -nh and -cn compounds in conjunction with staged air addition. in 22nd symposium (international) on combustion, pp. 1135–1145. the combustion institute, pittsburgh, 1988. [16] j. hrdlička, p. skopec, j. opatřil, t. dlouhý. oxyfuel combustion in a bubbling fluidized bed combustor. energy procedia 86:116–123, 2016. https://doi.org/10.1016/j.egypro.2016.01.012. [17] d. geldart. types of gas fluidization. powder technology 7(5):285–292, 1973. https://doi.org/10.1016/0032-5910(73)80037-3. [18] d. kunii, o. levenspiel. fluidization engineering. elsevier, 2nd edn., 1991. https://doi.org/10.1016/c2009-0-24190-0. [19] i. g. c. dryden (ed.). fluidized-bed combustion, pp. 58–63. butterworth-heinemann, 2nd edn., 1982. https: //doi.org/10.1016/b978-0-408-01250-8.50014-3. 408 https://doi.org/10.1016/s0082-0784(71)80040-1 https://doi.org/10.1016/0016-2361(94)90055-8 https://doi.org/10.1016/j.fuel.2013.02.011 https://doi.org/10.14311/ap.2015.55.0275 https://doi.org/10.1016/0360-1285(89)90017-8 https://doi.org/10.1016/s0360-1285(02)00031-x https://doi.org/10.1016/s0082-0784(96)80180-9 https://doi.org/10.1016/j.fuel.2003.11.006 https://doi.org/10.1016/j.fuel.2010.11.026 https://doi.org/10.1016/j.energy.2017.02.130 https://doi.org/10.1016/0016-2361(95)00281-2 https://doi.org/10.1016/j.joei.2015.12.001 https://doi.org/10.1016/j.egypro.2016.01.012 https://doi.org/10.1016/0032-5910(73)80037-3 https://doi.org/10.1016/c2009-0-24190-0 https://doi.org/10.1016/b978-0-408-01250-8.50014-3 https://doi.org/10.1016/b978-0-408-01250-8.50014-3 acta polytechnica 62(3):400–408, 2022 1 introduction 2 experiments 2.1 experimental setup 2.2 materials 2.3 methods 3 results and discussion 4 conclusion acknowledgements list of symbols references ap08_1.vp abbreviations ncx na�/ca2� exchanger sr sarcoplasmic reticulum serca sr ca2� influx channel ecc excitation-contraction coupling nf non-failing heart f failing heart am actomyosin complex 1 introduction the number of known molecular of cardiac excitation-contraction coupling (ecc) mechanisms is enormous and is still growing. it has become impossible to intuitively assess the relative contribution of an individual arrangement in physiological and pathophysiological events. however, such understanding is essential for identifying strategies for treating various types of heart failure. mathematical modeling and simulation offers a way of handling such an extensive set of data, while still allowing for a (semi)quantitative study. of course, there are limitations to such an approach, mainly: the accuracy of the model, its complexity (always drastically lower than in reality), low availability of consistent experimental input data, etc. however, calculations and simulations have been already proven to be useful in biological research. 
as an example, let us mention the correct prediction of the na+/ca2+ exchanger stoichiometry of 3:1, based on simulations as early as 1985 [1]. the aim of the present study is to assess the relative contribution of ncx and serca to the function of cardiac myocytes and thus to help interpret experimental data. generally, both proteins reduce the cytoplasmic concentration of ca2+ and influence both the systolic (contraction) and the diastolic (relaxation) function. while serca transports ca2+ into the sarcoplasmic reticulum (sr) and so increases intracellular calcium stores, i.e. the systolic function, ncx removes ca2+ across the cellular membrane out of the cell (fig. 1), so that the total cellular calcium is reduced and thus the systolic function is decreased. the role of ncx is far more complex, since it is regulated by voltage and by the ca2+ and na+ concentrations. consequently, it can operate in a so-called "reverse mode" during the early phase of the action potential.

fig. 1: some ca2+ transport mechanisms in the heart cell

heart failure
congestive heart failure is an important medical issue with very high mortality and no targeted treatment available. though it is among the leading causes of death in our population, the underlying pathophysiological mechanisms at the molecular level remain to be elucidated. altered calcium (ca2+) handling is seen as a key factor in the pathophysiology of heart failure. typically, serca activity is consistently reported to be reduced in cases of advanced cardiac dysfunction [2, 3]. the na+/ca2+ exchanger (ncx) has also been reported to be altered in heart failure and is currently attracting strong interest [4]. the transporter may be over-expressed [5], normal, or even reduced. the ratio between the two mechanisms seems to be an important determinant of overall function [2].

the model used for the present work is far from reproducing all known mechanisms of ecc. however, it can still provide a valuable quantitative (or semi-quantitative) insight by providing calculated data that can help to decide which are the direct effects of ncx on cellular performance (under our limited conditions), and which findings cannot be attributed to ncx and require the identification and/or addition of other mechanisms.

2 materials and methods
2.1 model
an extended version of our earlier excitation-contraction model was used [6].
this model consists of gating, calcium, regulatory and contraction subunits (fig. 2). a compartmental description of the underlying molecular processes is adopted to simulate a) calcium handling, b) contraction control by troponin, and c) the reaction kinetics of contractile elements. the contraction model and the calcium handling subunit were presented in [6] and [7]. unlike the previous models, it also includes the na+/ca2+ exchanger, which was omitted or extremely simplified in earlier versions. in principle, the ncx current is now described after winslow [8] as

$$ i_{naca}(t) = k(t)\left[\, na_i^{\,n}\, ca_e\, e^{\,r\, e(t) F/(RT)} \;-\; na_e^{\,n}\, ca_i(t)\, e^{\,(r-1)\, e(t) F/(RT)} \right], $$

and the ionic flux as

$$ q_{naca}(t) = \frac{a_{cap}\, c_{sc}}{v_c\, z\, F}\; i_{naca}(t), $$

where k(t) is the saturation parameter (see [8]), na_i (12 mol/m³) is the intracellular sodium concentration, ca_e (1.8 mol/m³) is the extracellular calcium concentration, n (3) is the stoichiometry of na-ca exchange, r (0.15) is the position of the peak of the energy barrier separating the two states (activation and deactivation of the exchanger), e(t) is the action potential, F (96485.3415 s·A/mol) is the faraday constant, R (8.3144 J/(K·mol)) is the ideal gas constant, T (290.15 K) is the absolute temperature, na_e (140 mol/m³) is the extracellular sodium concentration, ca_i(t) is the intracellular calcium concentration, a_cap (1.534×10⁻⁸ m²) is the capacitive membrane area, c_sc (1 F/m²) is the specific membrane capacity, v_c (25×10⁻¹² m³) is the cellular volume, and z (2) is the valence of the calcium ion.

fig. 2: model scheme – subunits interaction

2.2 simulation and parameters
the model was implemented in matlab / simulink. the ode15s (stiff/ndf) integration method was used. as an input, an experimentally obtained trace of the cardiac action potential was used (fig. 3) [9]. simulation parameters: stimulation frequencies 1, 2 and 3 hz, simulation duration 300 s (to avoid transient state effects). the model was also verified for various action potential durations (apd), in accordance with reality, where apd varies physiologically both with heart rate and across regions of the myocardium. however, due to the simplicity of the model, it currently does not reflect mechanisms influencing the shape and duration of the action potential, such as modification of the activity of ionic channels.

fig. 3: action potential – model input

heart failure model: chronic congestive failure was simulated according to experimental findings [10] by reducing the activity of serca from 100 % to 75 and 50 % of normal values. the serca ionic flux is defined by the equation

$$ q_{in}(t) = \left( ca_i(t) - ca_{i0} \right) \left( k_{in} + k_{inact}\, f_{\infty}(t) \right), $$

where q_in(t) is the ionic flux from the intracellular space to the network sarcoplasmic reticulum (nsr), ca_i0 (1×10⁻⁴ mol/m³) is the initial intracellular calcium concentration, k_in (4 s⁻¹) is the rate constant of the passive part of the serca channel, k_inact (1 s⁻¹) is the rate constant of the active part of the serca channel, and f_∞(t) is the steady-state deactivation coefficient. according to some findings on ncx activity in failing hearts [11], the ncx activity was varied between 50, 75, 100 and 130 % of normal activity. the ncx transporter was also virtually completely blocked (activity set to 0 %), so that its (missing) role in ecc could be identified more easily.
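to make the reconstructed flux equations concrete, the following sketch evaluates them numerically. the model itself was implemented in matlab / simulink; this python/numpy transcription is ours and only illustrates the formulas above. the constants are those listed in the text (with the lost exponent signs restored as negative powers of ten, an assumption), and the bracket structure of the serca flux is our reading of the damaged source equation.

```python
import numpy as np

# constants as listed above (si units); e-notation signs are assumed
F = 96485.3415        # faraday constant [s*A/mol]
R_GAS = 8.3144        # ideal gas constant [J/(K*mol)]
T = 290.15            # absolute temperature [K]
NA_I, NA_E = 12.0, 140.0   # intra-/extracellular na+ [mol/m^3]
CA_E = 1.8            # extracellular ca2+ [mol/m^3]
N = 3                 # stoichiometry of na-ca exchange
R_PEAK = 0.15         # position of the energy-barrier peak
A_CAP = 1.534e-8      # capacitive membrane area [m^2]
C_SC = 1.0            # specific membrane capacity [F/m^2]
V_C = 25e-12          # cellular volume [m^3]
Z = 2                 # valence of the calcium ion

def i_naca(k, e, ca_i):
    """ncx current after winslow [8]: k is the saturation parameter,
    e the membrane potential [V], ca_i the intracellular ca2+ [mol/m^3]."""
    vf = e * F / (R_GAS * T)
    return k * (NA_I**N * CA_E * np.exp(R_PEAK * vf)
                - NA_E**N * ca_i * np.exp((R_PEAK - 1.0) * vf))

def q_naca(k, e, ca_i):
    """ionic flux corresponding to the ncx current."""
    return A_CAP * C_SC / (V_C * Z * F) * i_naca(k, e, ca_i)

def q_in(ca_i, f_inf, ca_i0=1e-4, k_in=4.0, k_inact=1.0):
    """serca flux into the network sr; the bracket structure
    (k_in + k_inact * f_inf) is our assumed reading."""
    return (ca_i - ca_i0) * (k_in + k_inact * f_inf)
```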
3 results
3.1 ncx in a non-failing heart (nf)
though ncx is a minor calcium-removal mechanism (10–25 % of the total ca2+ to be removed), its contribution to calcium handling is substantial and becomes obvious over a period of time (here, after 300 s at 1 hz). the absence of ncx activity (fig. 4) results in more ca2+ remaining in the cell. this calcium is transported into internal stores (the sarcoplasmic reticulum) by the serca transporter. even the rather small amount of ca2+ that is retained during each beat gradually loads the internal calcium stores until a new input-output equilibrium is reached. thus during activation more ca2+ will be released and so more force will be generated. at the same time, the relaxation of contraction is prolonged. increased ncx activity exhibits the opposite effects. thus it can be concluded that increased ncx activity (the forward mode) alone decreases the force (negative inotropic effect) and improves relaxation (positive lusitropic effect), and vice versa.

fig. 4: intracellular ca2+ concentration in nf and f heart (ncx activity 0 % and 150 %), at 1 hz
fig. 5: contractile force in nf heart and in heart with ncx turned off, at 1 hz

the kinetics of the acto-myosin strong interaction (am(t)) represents the mechanical performance of the virtual cardiac myocyte. this is obtained from the model simulation (contraction subunit). the original description was given in [6].

the effect of heart rate
the effect of ncx during each cardiac cycle is time-limited, and is therefore directly influenced by heart frequency. shortening the cardiac period means that less time is available for ca2+ removal by ncx (fig. 8). if ncx activity is decreased, at higher rates (2–3 hz) ca2+ does not return to the baseline, and therefore the myocardium may not relax properly between beats (fig. 6).

fig. 6: contractile force in nf heart and in heart with ncx turned off, at 3 hz
fig. 7: total ca2+ fluxes (per 1 s) at 1 hz, nf

3.2 ncx in serca dysfunction
impaired calcium handling in chronic heart failure (f) was simulated by decreasing the serca activity to 50 % of the norm. the activity of ncx was varied between 0, 100 and 150 %. decreased serca activity alone resulted in a lower calcium peak and force transient (by circa 40 %) and increased calcium during the relaxation phase (compared to the non-failing model). this is consistent with many experimental findings [11, 12]. by increasing the activity of ncx to 150 % of n, the relaxation (force and ca2+) was restored. however, this comes at the cost of a further decline of the maximal force (to 50 %) (fig. 9). importantly, this effect was pronounced at high frequencies (3 hz – fig. 10).
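the loading argument of section 3.1 (a small per-beat ca2+ surplus accumulates in the sr until input and output balance again) can be illustrated with a deliberately crude beat-to-beat balance. this toy recursion is not part of the authors' model; all numbers are invented for illustration only.

```python
def store_trace(beats=300, q_entry=1.0, ncx_frac=0.2, release_gain=0.25):
    """per beat: a fixed ca2+ amount enters via the l-type channel, the
    sr releases a fraction of its content, ncx removes a share of the
    cytosolic ca2+, and serca re-uptakes the rest into the store."""
    store, trace = 10.0, []
    for _ in range(beats):
        released = release_gain * store           # sr release this beat
        cytosolic = q_entry + released            # entry plus release
        uptake = (1.0 - ncx_frac) * cytosolic     # serca re-uptake
        store += uptake - released                # net change of the store
        trace.append(store)
    return trace

# equilibrium store: store* = (1 - f) * q_entry / (f * release_gain),
# so a smaller ncx share f settles at a larger store (and larger release,
# i.e. more force), mirroring the behaviour described above.
```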
4 discussion
the role of ncx in ecc is quite complex. at first, by removing calcium from the cell, increased ncx reduces contractility and thus could have a negative effect on cardiac performance. this is in line with some experimental findings [13]. though the transporting capacity of ncx is much lower than that of other mechanisms (serca), ncx can influence the gradual loading/unloading of cellular calcium stores, and cellular calcium determines the maximal contraction force.

fig. 8: total ca2+ fluxes (per 1 s) at 1, 2 and 3 hz, nf
fig. 9: intracellular ca2+ concentration in nf and f heart (serca activity 50 %, ncx activity 100 % and 150 %), at 1 hz

secondly, lowering cellular calcium (both in stores and free ca2+) has a direct positive effect on cardiac relaxation and so can improve heart performance – the heart is a pulsation pump, so relaxation is as necessary as contraction and both need to alternate periodically. moreover, the nutrition (blood supply) of the heart is better during the relaxation phase (it almost stops during peak contraction). thus it cannot be simply concluded which of the two effects should take precedence. nor can we simply take the amplitude of the force (am) oscillation as a measure of performance. thirdly, ncx is capable of "reverse mode" operation, i.e. it can load calcium into the cell during the early phase of the cardiac cycle.

the effect of ncx is also heart-rate dependent. at low frequencies (below about 1 hz), the relatively long period offers enough reserve capacity for the extrusion of ca2+, which controls relaxation. the extrusion (and therefore the relaxation) takes longer, but ultimately fully completes (figs. 4, 5). at higher rates full relaxation is not achieved.

fig. 10: contractile force in nf and f heart (serca activity 50 %, ncx activity 100 % and 150 %), at 3 hz
fig. 11: total ca2+ fluxes (per 1 s) in nf and f heart (serca activity 50 %, ncx activity 100 % and 150 %), at 1 hz

in heart failure, reduced serca is consistently reported. in line with this, lower total calcium, calcium peak and maximal force are found. however, reports on the role of ncx are almost contradictory. o'rourke [11] reports increased ncx that compensates for defective serca in keeping diastolic (relaxed) calcium low, at the cost of further compromised systolic (contraction) performance. based on our results, this compensatory effect is relevant mainly when the heart rate is higher. however, hobai [12] reports improved cardiac function in a failing heart after inhibition of increased ncx by an externally administered inhibitory peptide, and suggests ncx inhibition as a promising treatment for heart failure. this finding may also be relevant, however only when the heart rate is low (circa 1 hz and less). blocking ncx: a) increases cellular calcium loading (and contractility), b) increases diastolic ca2+.
at low heart rates, there is enough time to extrude this increased ca2+ (fig. 9) and so maintain relaxation; at low rates the overall stimulation predominates. at rates ≥ 2 hz, diastolic ca2+ accumulates, since it is not sufficiently extruded, and results in incomplete relaxation (fig. 10). the experiments by hobai were performed at 0.5 hz, which may explain why they reported a generally beneficial role of ncx inhibition. however, in clinical cases there could be double-edged effects, since the heart frequency can easily reach 2 hz. some other works have suggested increased cardiac reserve performance after over-expression of ncx in a failing rabbit heart [13]. this effect cannot be explained by the current model, and thus requires further mechanisms to be identified and included.

no matter how complex calcium handling is, it must be true that the input and output of ca2+ into and out of the cytoplasm must generally match. any change in one of the transporting mechanisms (serca, ncx, others) must be reflected by changes in other parameters (cellular ca2+ stores, free ca2+_i, etc.). the major role of ncx is in removing the ca2+ which (necessarily) enters the cell through the l-type channel (dhpr) from outside during each cycle. thus increased ncx activity will always tend to unload the cell and compromise heart contractility. increased ncx (200 %) combined with reduced serca (50 %), as sometimes reported in heart failure, would result in extreme depletion of internal calcium, and thus the force that could be generated would hardly be sufficient. this calculation indirectly suggests that another mechanism must be involved in the ca2+ handling changes in heart failure. one option is the "reverse mode" capability of ncx in the early phase of a cycle. this regime depends on many parameters (na+ and ca2+ concentrations, membrane voltage, ncx regulations) and plays a minor role under physiological conditions. however, to make a quantitative assessment of this effect and its contribution to cell ca2+ handling as a whole, specifically in a simulated failing heart, the current model needs to be refined to better reflect the parameters controlling the ncx reverse mode.

limitations
in biological systems numerous molecular mechanisms contribute to ecc. any of them can be altered, and many may not even have been identified yet. conversely, a computational model can only reflect a very limited subset of the mechanisms involved in the real phenomenon. the results drawn here should therefore always be interpreted with caution. however, good conformance with experimental data indicates that the major mechanisms have been identified (e.g. the effect of ncx at various heart rates), while a discrepancy between simulated and experimental data suggests that further mechanisms need to be searched for (e.g. improved contractility after ncx stimulation in healthy hearts). in this way, simulations can stimulate and direct biomedical research, which has traditionally relied on experimental approaches.

5 acknowledgment
this study was supported by the ministry of education, youth and sports of the czech republic, project: transdisciplinary research in biomedical engineering ii., no. msm 6840770012, by the grant agency of the czech republic, project no. 106/04/1181, and by the czech academy of sciences, project no. 1et201210527.

references
[1] di francesco, d., noble, d.: a model of cardiac electrical activity incorporating ionic pumps and concentration changes – simulations of ionic currents and concentration changes. philos. trans. r. soc. lond. b biol.
sci., 1985, vol. 307, p. 353–398.
[2] hasenfuss, g. et al.: relationship between na+/ca2+ exchanger protein levels and diastolic function of failing human myocardium. circulation, 1999, vol. 99, p. 641–648.
[3] huke, s. et al.: altered force-frequency response in non-failing hearts with decreased serca pump-level. cardiovasc. res., 2003, vol. 59, p. 668–677.
[4] hasenfuss, g., schillinger, w.: is modulation of sodium-calcium exchange a therapeutic option in heart failure? circ. res., 2004, vol. 95, p. 225–227.
[5] pieske, b. et al.: functional effects of endothelin and regulation of endothelin receptors in isolated human nonfailing and failing myocardium. circulation, 1999, vol. 99, p. 1802–1809.
[6] mlček, m. et al.: mathematical model of the electromechanical heart contractile system – regulatory subsystem physiological considerations. physiol. res., 2001, vol. 50, p. 425–432.
[7] novak, v., neumann, j.: mathematical model of the electromechanical heart contractile system – simulation results. international journal of bioelectromagnetism, 2000, vol. 2, no. 2, electronic version.
[8] winslow, r. l. et al.: mechanisms of altered excitation-contraction coupling in canine tachycardia-induced heart failure, ii: model studies. circ. res., 1999, vol. 84, p. 571–586.
[9] trautwein, w., kassebaum, d. g.: electrophysiological study of human heart muscle. circ. res., 1962, vol. 10, p. 306–312.
[10] beuckelmann, d. j., nabauer, m., erdmann, e.: intracellular calcium handling in isolated ventricular myocytes from patients with terminal heart failure. circulation, 1992, vol. 85, p. 1046–1055.
[11] o'rourke, b. et al.: mechanisms of altered excitation-contraction coupling in canine tachycardia-induced heart failure, i: experimental studies. circ. res., 1999, vol. 84, p. 562–570.
[12] hobai, i. a., maack, c., o'rourke, b.: partial inhibition of sodium/calcium exchange restores cellular calcium handling in canine heart failure. circ. res., 2004, vol. 95, p. 292–299.
[13] munch, g. et al.: functional alterations after cardiac sodium-calcium exchanger overexpression in heart failure. am. j. physiol. heart circ. physiol., 2006, vol. 291, p. h488–h495.

martin fischer, msc. phone: +420 224 352 542 e-mail: mail@martinfischer.cz department of mechanics, biomechanics and mechatronics, czech technical university, faculty of mechanical engineering, technická 4, 166 07 prague 6, czech republic
mikuláš mlček, md, ph.d. phone: +420 224 968 407 e-mail: mikulas.mlcek@lf1.cuni.cz institute of physiology, charles university, first faculty of medicine, albertov 5, 128 00 prague 2, czech republic
svatava konvičková, msc, ph.d. phone: +420 224 352 511 e-mail: svatava.konvickova@fs.cvut.cz department of mechanics, biomechanics and mechatronics, czech technical university, faculty of mechanical engineering, technická 4, 166 07 prague 6, czech republic
otomar kittnar, md, ph.d. phone: +420 224 912 903 e-mail: otomar.kittnar@lf1.cuni.cz institute of physiology, charles university, first faculty of medicine, albertov 5, 128 00 prague 2, czech republic
robust detection of point correspondences in stereo images
a. stojanovic, m. unger
a major challenge in 3d reconstruction is the computation of the fundamental matrix. automatic computation from uncalibrated image pairs is performed from point correspondences. due to imprecision and wrong correspondences, only an approximation of the true fundamental matrix can be computed. the quality of the fundamental matrix strongly depends on the location and number of point correspondences. furthermore, the fundamental matrix is the only geometric constraint between two uncalibrated views, and hence it can be used for the detection of wrong point correspondences. this property is used by current algorithms like ransac, which computes the fundamental matrix from a restricted set of point correspondences. in most cases, not only wrong correspondences are disregarded, but also correct ones, which is due to the criterion used to eliminate outliers. in this context, a new criterion preserving a maximum of correct correspondences would be useful. in this paper we introduce a novel criterion for outlier elimination based on a probabilistic approach. the enhanced set of correspondences may be important for further computation towards a 3d reconstruction of the scene.
keywords: 3d reconstruction, robust matching, fundamental matrix, probabilistic epipolar geometry, outlier elimination.

1 introduction
modern point matching algorithms like sift (see [4, 5]) provide a large number of point matches, even in stereo image pairs with large changes in scale, translation and rotation between the images. the only inconvenience is that there are some wrong matches among the correspondences, especially when parts of the first image are not visible in the second. algorithms for computing the fundamental matrix can be classified into algorithms sensitive to wrong matches and algorithms detecting and ignoring so-called outliers, like ransac or least median of squares. in [3] and [7] algorithms of both categories are described and compared. basically, algorithms of the first category should only be used after preliminary outlier removal; the motivation to use them is that some of them perform better than ransac or lmeds (see [3]). based on [3], we used ransac combined with the normalized 8-point algorithm to compute a first estimation of the fundamental matrix in order to perform outlier removal. an observation is that using the criterion presented in [7], i.e. the distance of a point to its corresponding epipolar line, some true correspondences, which may be important for an estimation of the epipolar geometry, are also eliminated. as a consequence we have developed a new criterion to evaluate correspondences between two images, considering the estimation of the fundamental matrix and its uncertainty. the paper is structured as follows: in section two, the computation of the covariance matrix of the fundamental matrix using monte carlo simulation is described. further, we describe the computation leading to the new criterion. in section three, we show how the criterion can be used in an effective way for outlier removal, and we compare our results with those using a conventional criterion. in section four, a brief conclusion is given.
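as a practical stand-in for the first stage described above (ransac combined with the normalized 8-point algorithm, see [3, 7]), one can use opencv's fundamental-matrix estimator. this is of course not the authors' implementation, only a minimal sketch of the same pipeline step.

```python
import cv2
import numpy as np

def estimate_f_ransac(pts1, pts2, thresh=1.0, conf=0.99):
    """pts1, pts2: (n, 2) arrays of matched image points. returns the
    3x3 fundamental matrix and a boolean inlier mask. cv2.FM_RANSAC
    combines ransac sampling with an 8-point solve on the consensus set."""
    F, mask = cv2.findFundamentalMat(
        np.float64(pts1), np.float64(pts2), cv2.FM_RANSAC,
        ransacReprojThreshold=thresh, confidence=conf)
    return F, mask.ravel().astype(bool)
```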
2 confidence measure for point correspondences
to evaluate the quality of a single point correspondence from a set of point correspondences, we use the only available geometric constraint, the epipolar geometry, and its algebraic representation, the fundamental matrix. the computation of the fundamental matrix using the random sample consensus combined with the normalized 8-point algorithm is described in [3] and [7]. as the locations of point correspondences are superimposed with noise, the computation of the covariance matrix of the fundamental matrix provides further useful information.

2.1 computing the covariance matrix of the fundamental matrix
to compute the covariance matrix, we assume that the noise on the correspondence locations has a normal distribution. we simulate this effect by adding gaussian noise to the eight correspondences selected by ransac during the fundamental matrix computation. we obtain a set of different versions of the same eight point correspondences, and compute the fundamental matrix from each version using the normalized 8-point algorithm. a large number of different fundamental matrices is obtained, showing how the epipolar geometry varies under slight changes in the point correspondence locations. a good description of this effect is given by the covariance matrix of the fundamental matrix, denoted by $\Lambda_F$.
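the monte carlo procedure of section 2.1 can be sketched as follows. the helper `eight_point` stands for any normalized 8-point implementation and is an assumption, not something the paper provides; the sign and scale of each sampled fundamental matrix are aligned before the covariance is taken, since F is only defined up to scale.

```python
import numpy as np

def covariance_of_f(pts1, pts2, eight_point, sigma=0.5, runs=500, rng=None):
    """monte carlo estimate of the 9x9 covariance of vec(F).
    pts1, pts2: the (8, 2) correspondences selected by ransac;
    sigma: assumed standard deviation of the point locations [px]."""
    if rng is None:
        rng = np.random.default_rng(0)
    samples = []
    for _ in range(runs):
        f = eight_point(pts1 + rng.normal(0.0, sigma, pts1.shape),
                        pts2 + rng.normal(0.0, sigma, pts2.shape))
        v = f.ravel() / np.linalg.norm(f)    # fix the scale of F
        if samples and v @ samples[0] < 0:   # fix the sign of F
            v = -v
        samples.append(v)
    return np.cov(np.asarray(samples).T)     # Lambda_F, 9x9

def covariance_of_line(lam_f, x):
    """propagate Lambda_F to the line l = F x (eq. (2)); the jacobian of
    vec(F) -> F x is J = kron(I3, x^T) for row-major vec(F)."""
    J = np.kron(np.eye(3), np.asarray(x, float))   # 3x9
    return J @ lam_f @ J.T                         # Lambda_l, 3x3
```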
the mahalanobis distance k of a possible epipolar line m from the estimated epipolar line l is then given by

$$ (\mathbf{l} - \mathbf{m})^{\mathrm{T}}\, \Lambda_l^{+}\, (\mathbf{l} - \mathbf{m}) = k^2. $$

thus we have an approximation of how well a certain epipolar line m matches the knowledge acquired so far about the epipolar geometry and its uncertainty. for a given value of k², we can compute an envelope of epipolar lines containing all possible epipolar lines having a value less than or equal to k². under the assumption that the elements of l have a normal distribution, k² follows a cumulative $\chi^2_2$ distribution, and a probability that the true epipolar line is within this envelope can be associated with the k² value, as can be seen in fig. 1. the region can be described by a conic c defined in homogeneous coordinates by

$$ C = \mathbf{l}\,\mathbf{l}^{\mathrm{T}} - k^2\, \Lambda_l. \qquad (4) $$

in [3] the epipolar envelope is used for guided matching, i.e. searching for correspondences within the epipolar band after a first estimation of the fundamental matrix. we observed that if the fundamental matrix is computed from correspondences located only in the foreground, the uncertainty in the background becomes very high (see fig. 2). this is the reason why correct matches lying in the corners of the image are often eliminated using a conventional criterion. to prevent this, we have to develop a new criterion that is less stringent for matches in regions of high uncertainty.

fig. 1: envelope of epipolar lines: we computed the epipolar line (dark) and the epipolar band (light) in the second image. the point in the first image that we used for the computation is marked by a white point. the epipolar band shown here represents the envelope of epipolar lines with equal k² = 5.99915. the corresponding probability that the true epipolar line lies within this band is α = 0.95. we see that for this value the conic is a hyperbola with two branches.

fig. 2: high insecurity for points in the background: in this figure the effect of choosing points only in the foreground is demonstrated. we see that the insecurity for points in the background is much higher than for points in the foreground, as the epipolar band is much more extended. inria syntim owns the copyright of the image.
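a small sketch of the band test of equation (4), including the link between k² and α: for a chi-square distribution with two degrees of freedom, α = 1 − exp(−k²/2), which reproduces the value k² ≈ 5.9915 for α = 0.95 used in the figures. the sign convention of the membership test is our assumption for the scaling used here.

```python
import numpy as np

def band_conic(l, lam_l, k2):
    """conic C = l l^T - k^2 * Lambda_l (eq. (4)) bounding the band."""
    l = np.asarray(l, float)
    return np.outer(l, l) - k2 * lam_l

def inside_band(x, C):
    """x: homogeneous image point; near the estimated line the quadratic
    form is negative, far outside the band it is positive (boundary
    points satisfy x^T C x = 0)."""
    x = np.asarray(x, float)
    return x @ C @ x <= 0.0

def k2_for_alpha(alpha):
    """chi-square, 2 dof: alpha = 1 - exp(-k2/2), so k2 = -2 ln(1 - alpha);
    k2_for_alpha(0.95) ~ 5.9915."""
    return -2.0 * np.log(1.0 - alpha)
```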
2.3 minimal mahalanobis distance for a point correspondence

basic idea
the basic idea for the new criterion is to invert the problem of finding an epipolar band for a given likelihood (i.e. a given k², respectively a probability α). for a given point x′ in the second view corresponding to a point x in the first view, we want to find the conic with minimal k² comprising the point x′. in other terms, we are retrieving the value k² in equation (4) that provides a hyperbola passing through x′. the considerations behind this idea are very simple: if we know that the epipolar line of a point x in the first image will lie with a certain probability within an epipolar band in the second image, the corresponding point x′, which is located somewhere on the epipolar line, will be within the epipolar band with the same probability.

general problem statement
the point x′ belongs to the conic c if the following equation is verified:

$$ \mathbf{x}'^{\mathrm{T}}\, C\, \mathbf{x}' = 0, \qquad (5) $$

where c is given by

$$ C = \mathbf{l}\,\mathbf{l}^{\mathrm{T}} - k^2\, \Lambda_l. \qquad (6) $$

$\alpha = F_{\chi^2_2}(k^2)$ is the probability to find x′ within c. we use again the notation m for an epipolar line in the second image; here, l is the estimated epipolar line corresponding to the point x in the first image. it is not possible to retrieve the corresponding value k² directly from a point x′ using the equations above. one possibility would consist in computing a multitude of different conics using a range of different values of k², and localizing x′ in between the conics. an approximation for k² would then be obtained by interpolation, but we found a closed-form solution to the problem, providing an exact result.

closed-form solution
let us assume that the confidence that any point x′ in the second image corresponds to a point x in the first view is related to the probability of the epipolar geometry that would explain the correspondence pair x ↔ x′ while having the highest probability. in other terms, for a potential point correspondence x ↔ x′, we retrieve the epipolar line m passing through x′ with maximal probability, i.e. minimal k² in the sense of equation (7). this assumption differs from the assumption made in [1], and we obtain a different criterion, which is more suitable for our purpose. if we denote by m the unknown epipolar line and by l the estimated line in the second image, the constrained minimization problem can be stated as follows:

$$ \min_{\mathbf{m}}\; (\mathbf{l} - \mathbf{m})^{\mathrm{T}}\, \Lambda_l^{+}\, (\mathbf{l} - \mathbf{m}) \quad \text{subject to} \quad \mathbf{x}'^{\mathrm{T}} \mathbf{m} = 0. \qquad (7) $$

with $\mathbf{x}' = (x, y, z)^{\mathrm{T}}$, $\mathbf{m} = (a, b, c)^{\mathrm{T}}$, $\mathbf{l} = (l_1, l_2, l_3)^{\mathrm{T}}$, and writing the entries

$$ \Lambda_l^{+} = \begin{pmatrix} \sigma_{11} & \sigma_{12} & \sigma_{13} \\ \sigma_{21} & \sigma_{22} & \sigma_{23} \\ \sigma_{31} & \sigma_{32} & \sigma_{33} \end{pmatrix}, \qquad (8) $$

we obtain the function f(a, b, c) to be minimized,

$$ f(a,b,c) = \begin{pmatrix} l_1 - a \\ l_2 - b \\ l_3 - c \end{pmatrix}^{\mathrm{T}} \Lambda_l^{+} \begin{pmatrix} l_1 - a \\ l_2 - b \\ l_3 - c \end{pmatrix}, $$

and a constraint g(a, b, c) = 0 with

$$ g(a,b,c) = a x + b y + c z. \qquad (9) $$

for the minimization we use the lagrange multiplier method to obtain an exact solution by solving the set of equations

$$ \nabla f(a,b,c) = \lambda\, \nabla g(a,b,c), \qquad g(a,b,c) = 0. \qquad (10) $$

after expanding f(a, b, c) and computing its derivatives we have

$$ \frac{\partial f}{\partial a} = 2 a \sigma_{11} - 2 l_1 \sigma_{11} - l_2 \sigma_{21} - l_3 \sigma_{31} - l_2 \sigma_{12} - l_3 \sigma_{13} + (\sigma_{21} + \sigma_{12})\, b + (\sigma_{31} + \sigma_{13})\, c, $$

$$ \frac{\partial f}{\partial b} = 2 b \sigma_{22} - 2 l_2 \sigma_{22} - l_1 \sigma_{12} - l_1 \sigma_{21} - l_3 \sigma_{23} - l_3 \sigma_{32} + (\sigma_{21} + \sigma_{12})\, a + (\sigma_{23} + \sigma_{32})\, c, $$

$$ \frac{\partial f}{\partial c} = 2 c \sigma_{33} - 2 l_3 \sigma_{33} - l_1 \sigma_{13} - l_1 \sigma_{31} - l_2 \sigma_{23} - l_2 \sigma_{32} + (\sigma_{13} + \sigma_{31})\, a + (\sigma_{23} + \sigma_{32})\, b. $$

for g(a, b, c) we have the derivatives

$$ \frac{\partial g}{\partial a} = x, \qquad \frac{\partial g}{\partial b} = y, \qquad \frac{\partial g}{\partial c} = z. \qquad (11) $$

inserting these into (10), the multiplier λ can be eliminated, since

$$ \lambda = x^{-1}\frac{\partial f}{\partial a} = y^{-1}\frac{\partial f}{\partial b} = z^{-1}\frac{\partial f}{\partial c}. \qquad (12) $$

as we are solving a problem with three unknown variables, three equations are sufficient. using the relations from equation (12) and the constraint that the point lies on the line m, we obtain the set of equations

$$ x^{-1}\frac{\partial f}{\partial a} - y^{-1}\frac{\partial f}{\partial b} = 0, \qquad y^{-1}\frac{\partial f}{\partial b} - z^{-1}\frac{\partial f}{\partial c} = 0, \qquad a x + b y + c z = 0. \qquad (13) $$

expanding equation (13) we have the set of equations

$$ a\left(\frac{2\sigma_{11}}{x} - \frac{\sigma_{12}+\sigma_{21}}{y}\right) + b\left(\frac{\sigma_{12}+\sigma_{21}}{x} - \frac{2\sigma_{22}}{y}\right) + c\left(\frac{\sigma_{13}+\sigma_{31}}{x} - \frac{\sigma_{23}+\sigma_{32}}{y}\right) = h_1, $$

$$ a\left(\frac{2\sigma_{11}}{x} - \frac{\sigma_{13}+\sigma_{31}}{z}\right) + b\left(\frac{\sigma_{12}+\sigma_{21}}{x} - \frac{\sigma_{23}+\sigma_{32}}{z}\right) + c\left(\frac{\sigma_{13}+\sigma_{31}}{x} - \frac{2\sigma_{33}}{z}\right) = h_2, $$

$$ a x + b y + c z = 0, $$

where

$$ h_1 = \frac{2 l_1\sigma_{11} + l_2\sigma_{21} + l_3\sigma_{31} + l_2\sigma_{12} + l_3\sigma_{13}}{x} - \frac{2 l_2\sigma_{22} + l_1\sigma_{12} + l_1\sigma_{21} + l_3\sigma_{23} + l_3\sigma_{32}}{y}, $$

$$ h_2 = \frac{2 l_1\sigma_{11} + l_2\sigma_{21} + l_3\sigma_{31} + l_2\sigma_{12} + l_3\sigma_{13}}{x} - \frac{2 l_3\sigma_{33} + l_1\sigma_{13} + l_1\sigma_{31} + l_2\sigma_{23} + l_2\sigma_{32}}{z}. $$
this set of equations can be easily solved, and the solution vector (a, b, c)ᵀ is the line m passing through x′ with minimal k² in terms of equation (7). finally, we obtain the coordinates of the line m passing through x′ and the corresponding value k² that would produce an epipolar band delimited by a hyperbola passing through x′. the value k² will be of special interest, as we will use it as a quality criterion for point correspondences in our applications.
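the closed-form solution reduces to a single 3×3 linear solve. the sketch below assumes that the σ_ij are taken from the pseudo-inverse of Λ_l (so that f is the mahalanobis form) and that none of the coordinates x, y, z of x′ vanishes, as in the expansion above.

```python
import numpy as np

def min_k2_line(l, lam_l, xp):
    """solve system (13) for the line m = (a, b, c) through the
    homogeneous point xp that minimizes the mahalanobis distance to l;
    returns m and the corresponding k^2."""
    S = np.linalg.pinv(lam_l)              # entries sigma_ij
    l = np.asarray(l, float)
    x, y, z = np.asarray(xp, float)
    # grad f = 2 S m - 2 S l; lagrange: grad_a/x = grad_b/y = grad_c/z
    A = np.vstack([S[0] / x - S[1] / y,
                   S[1] / y - S[2] / z,
                   [x, y, z]])
    b = np.array([(S[0] @ l) / x - (S[1] @ l) / y,
                  (S[1] @ l) / y - (S[2] @ l) / z,
                  0.0])
    m = np.linalg.solve(A, b)
    k2 = (l - m) @ S @ (l - m)             # mahalanobis distance of m
    return m, k2
```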
example in an image pair
to summarize the algorithm, we can explain the different steps using an example in an image pair (see fig. 3). we start after the fundamental matrix and its covariance have been computed for the given image pair. we see the estimated epipolar line l in the second image that corresponds to the point x in the first image, which is marked by a white point. further, we computed the epipolar band (delimited by the hyperbola) for a value k² = 5.9915, respectively a probability α = 0.95 that the true epipolar line is located within this region. further, for points marked by a black circle in the second image, we computed the line passing through the point and having minimal k² in terms of equation (7). the lines are dark and represent the lines previously denoted by m. a further question to answer is the value k² of the chosen line m. it can be easily computed using $(\mathbf{l} - \mathbf{m})^{\mathrm{T}} \Lambda_l^{+} (\mathbf{l} - \mathbf{m}) = k^2$. in fig. 3 we obtain a value of k² = 848178 for the central point associated with the vertical line m, and a value of k² = 6.07 for the point on the right side of the image. the fact that the latter value is almost equal to 5.991 and the corresponding line m is tangent to the hyperbola is an interesting observation, and will be discussed later.

fig. 3: example of an image pair with points and their respective lines with minimal k²: we see the estimated epipolar line l in the second image, which corresponds to the point x in the first image, marked by a white point. the epipolar band for a value k² = 5.9915, respectively a probability α = 0.95, lies within the hyperbola. points marked by a black circle in the second image are shown together with the lines of minimal k², denoted by m, and drawn dark.

properties
we want to briefly show some special cases of lines with minimal k². one of the reasons why we cannot use the criterion from [1] is that it also contains depth information coming from the point correspondences chosen to compute the fundamental matrix and its covariance. in practice, this becomes visible in the property that points lying on the estimated epipolar line (which is of course the most probable epipolar line) do not all have the same probability.

fig. 4: line with minimal k² computed for a point on the estimated epipolar line: we see the epipolar band for a value k² = 5.9915, marked by the hyperbola, corresponding to the point x in the first image, which is marked by a white point. obviously the epipolar line with the smallest k² for points on the estimated epipolar line is the epipolar line itself; here we see that these two lines are superposed. the corresponding value is k² = 0.

fig. 5: line with minimal k² computed for a point on the hyperbola: we see the epipolar band for a value k² = 5.9915, marked by the hyperbola, corresponding to the point x in the first image, which is marked by a white point. the line m with minimal k² for the black point on the hyperbola is tangent to the hyperbola. the corresponding value of k² is equal to the value we chose to compute the hyperbola, i.e. k² = 5.991.

in fig. 4 we see that the line m is identical to the estimated epipolar line l; thus we have k² = 0 for each point on the estimated line. another effect is shown in fig. 5: we computed the line m for a point on the hyperbola delimiting the epipolar band. we see that the dark line we obtain is tangent to the hyperbola, and the value of k² is equal to the value we used to compute the hyperbola. in other terms, if we want to retrieve the value k² for a hyperbola passing through
this means that instead of computing the epipolar line and the distance of a point to the epipolar line, we just compute the epipolar band, and remove points located outside the epipolar band. this would resolve the problem of removing points from the background. although this application seems suitable, two parameters have to be set. first we have to make an assumption about the variance of the point correspondences, and the value for k2 from where we compute the epipolar band needs to be set. in the following we present an algorithm overcoming the drawbacks of the algorithms mentioned above by using our new criterion. 3.1 algorithm the values chosen for the standard deviation � of points to compute the covariance matrix of the fundamental matrix are varying, and thus we first computed the value k2 of each used point correspondence. if we consider the distribution of the value k2 for each point in fig. 7, we see that most of the correspondences are very small, whereas some correspondences have huge k2 values. a classification of inliers and outliers can be easily performed. © czech technical university publishing house http://ctn.cvut.cz/ap/ 27 acta polytechnica vol. 47 no. 4–5/2007 fig. 6: image representing the mahalanobis distance k2 over an image: the right image shows the epipolar band for k2 5991� . . in the left image we computed for each pixel �x the mahalanobis distance k2 and represented it using grayscales. in particular, points in white have values close to k2 0� , points in black, a value close to the maximum value on the image. we used the whole range of k2 values and divided the range into 256 levels of equal width, each one represented by its relative grayscale level. fig. 7. distribution of the values of k2: this figure shows a histogram over the values for k2 of the point correspondences from fig. 8 (a) our proposition is to compute the median value of k2 . as the ransac algorithm does not perform well in the case of more than 50 percent outliers, we only consider the case of having at least 50 percent inliers. as a consequence the median value for k2 will be the value of an inlier. an observation was that the values for inlier were very similar, whereas the values for outliers were huge. in practice we therefore eliminate every point correspondence with a value of more than ten times the median value. another proportion may be used but in our tests we obtained reasonably good results. 3.2 results the results presented here were obtained using ransac for an estimation of the fundamental matrix. further, the covariance matrix of the fundamental matrix was computed by adding noise to the eight point correspondences used to compute the fundamental matrix. in fact we obtained the same results for the outlier elimination using a standard deviation � in the range of 0.02 to 10 pixels. therefore no knowledge about the quality of the image and the variance of the correspondences is needed for the outlier removal with the new criterion. the results are presented in fig. 8. 4 discussion we have seen that the new criterion can be used for outlier removal without side information about the image quality and brings the advantage of providing a larger set of point correspondences. an observation is that the points preserved by the new algorithm are more widely spread throughout the image than the points obtained with a conventional criterion. hence the initial fundamental matrix estimation rather seems to suit point correspondences in the center of the image. 
our current research is focused on comparing fundamental matrices computed from the initial set of point correspondences to those obtained with the enhanced set, using image pairs with known epipolar geometry, i.e. a comparison to ground truth. furthermore we try to identify different planes spanned in space by the point correspondences, in order to examine the influence of the depth distribution of the correspondences onto the accuracy of the computed fundamental matrix. acknowledgments the research described in this paper was supervised by prof. j.-r. ohm, ient rwth aachen university. the authors wish to thank the team at ient, rwth aachen university for its assistance and help. references [1] brandt, s.: on the probabilistic epipolar geometry. journal of mathematical imaging and vision, in press, 2007. [2] hartley, r.-i.: in defense of the 8-point algorithm. proceedings of the 5th international conference on computer vision, june 1995, p. 1064–1070. [3] hartley, r.-i., zisserman, a.: multiple view geometry in computer vision. second edition. cambridge university press, 2004. [4] lowe, d.: object recognition from scale invariant features. proceedings of the 7th international conference on computer vision. corfu, greece, vol. 2 (1999), p. 1150–1157. [5] lowe, d.: distinctive image features from scale-invariant keypoints. international journal of computer vision, (2004), p. 91–110. [6] zhang z., deriche, r., faugeras, o., luong, q.-t.: a robust technique for matching two uncalibrated images through the recovery of the unknown epipolar geometry. artificial intelligence, november 1995. [7] zhang, z.: determining the epipolar geometry and its uncertainty: a review. international journal on computer vision, vol. 27 (1998), no. 2, p. 161–195. ing. aleksandar stojanovic e-mail: stojanovic@ient.rwth-aachen.de ing. michael unger e-mail: unger@ient.rwth-aachen.de institute of communication engineering rwth aachen university 52056 aachen, germany 28 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 47 no. 4–5/2007 fig. 8: results: (a) stereo image pair with sift correspondences, the 8 points selected by ransac are dark. (b) remaining point correspondences after outlier removal using the distance from the epipolar line. (c) enhanced point correspondence set using the new criterion. inria syntim owns the copyright of the image pair. ap_06_4.vp 1 methods the signals for classification were measured by a hand-held multi dopplex ii ultrasonic unit and sent to a pc for storage via an rs232 interface. the device measures the mean velocity of blood in an artery within a short time period, a doppler velocity waveform. multi dopplex ii is a bi-directional device. the waveforms can be displayed as forward and backward flow (see figs. 1–3) or as the difference of these two signals in combined waveforms. in a standard situation, five positions on each leg are examined: artery femoralis, a. poplitea, a. tibialis posterior, a. tibialis anterior and a. dorsalis pedis. for automatic classification, four classes were chosen, into which the signals will be classified. these classes reflect various degrees of artery occlusion, and also describe some defects that can be considered by a specialist during this examination. 1.1 classes normal course – signals acquired by an examination of arteries without peripheral arterial disease (pad) – fig. 1. stenotic course – signals measured in arteries with a stenotic diameter – fig. 2. 
occlusion – signals measured in arteries with total arterial obstruction – fig. 3. incorrect course – various errors may occur while measuring e.g.: the amplification factor was set too high – the course is cut; the signal is under strong influence from nearby veins, and so on. 1.2 features during the design process, 20 features were considered as potentially useful for the classifier. these features describe the quality of the measured signals in the time domain, in the frequency domain, or have a special medical meaning. the sequential forward search (sfs) algorithm was used to determine the most significant features. it was determined that these were the features usually considered by medical specialists during the visual examination process. sfs found the following four features: © czech technical university publishing house http://ctn.cvut.cz/ap/ 21 acta polytechnica vol. 46 no. 4/2006 bayesian classifier for medical data from doppler unit j. málek nowadays, hand-held ultrasonic doppler units (probes) are often used for noninvasive screening of atherosclerosis in the arteries of the lower limbs. the mean velocity of blood flow in time and blood pressures are measured on several positions on each lower limb. by listening to the acoustic signal generated by the device or by reading the signal displayed on screen, a specialist can detect peripheral arterial disease (pad). this project aims to design software that will be able to analyze data from such a device and classify it into several diagnostic classes. at the department of functional diagnostics at the regional hospital in liberec a database of several hundreds signals was collected. in cooperation with the specialist, the signals were manually classified into four classes. for each class, selected signal features were extracted and then used for training a bayesian classifier. another set of signals was used for evaluating and optimizing the parameters of the classifier. slightly above 84 % of successfully recognized diagnostic states, was recently achieved on the test data. keywords: medical data recognition, hand-held ultrasonic doppler unit, peripheral arterial disease. fig. 1: signals acquired in an examination of arteries without peripheral arterial disease (pad) fig. 2: signals measured in arteries with a stenotic diameter fig. 3: signals measured in arteries with nearly total arterial obstruction signal energy – log sum of squared signal samples over the whole signal. velocity/time index – difference of maximum and minimum blood velocity within one pulse region divided by the pulse region duration. deceleration – maximum velocity within one pulse region divided by its fall time. brachial pressure index (bpi) – ratio of the distal pressure in an examined position on the lower limb and the patient’s system blood pressure (measured on a. brachialis). before computing the features, the signal must be preprocessed. this was done by filtering the signal by a low pass filter to suppress the high frequency noise. it should be noted that all the features are calculated automatically in real time, without human intervention. 1.3 bayesian classifier the bayesian classifier represents each of the four classes by one or more gaussians (modes) within the probability density function (pdf) in a 4-feature space. their parameters (means and variances) are estimated from the training data. since the classes have different occurrence rates, the class prior probabilities are also taken into account. 
the number of gaussians within one class depends on the distribution of the features in 4-dimensional space and is determined by a vector quantization process. the database uses data from 1,100 examinations, measured in 10 standard positions (5 on each leg), i.e. 11,000 samples are available. in the experiments, data from 880 randomly chosen examinations was used for training the classifier; the remaining data (from 220 examinations) was left for testing. this was repeated 5 times in order to obtain more reliable results. the classifier assigned to each signal the class with the highest posterior probability. the result was compared with the expert’s decision. after this the recognition scores were calculated for each of the measuring positions and then averaged. 2 results the optimal design of the classifier was investigated in many experiments. the most relevant results are shown in table 1. it is evident that the best performance was achieved by the classifier with multi-modal pdfs and prior probabilities. the best score was 84 %, which means that in 84 % of the cases the classifier and the expert agreed. 3 discussion the recent version of the program is based on probabilistic methods. in an attempt to improve the models, i intend to adopt the existing training strategies for so called gaussian mixture models (gmms) that have already been developed in the speech lab at tu of liberec for speech signal classification. in the next step i intend to focus on implementing better frequency features calculated for the combined flow waveform rather than for separated forward and backward flow. it will be necessary to make the peak detection and feature calculation more precise. unfortunately, the presence of vein signals in the data and noise in the low amplitude signals complicates this task. it also seems useful to extend the number of classes for more precise classification of the signals. the class stenosis may be divided into two subclasses: mild and severe stenosis. 4 conclusions the method investigated here shows possible new ways to perform screening for pad in general practice surgeries. computer analysis and classification could help to achieve an early diagnosis of grave vascular diseases, so that treatment can be given before the development of critical limb ischemia. 5 acknowledgments the research described here was supervised by prof. ing. j. nouza csc., faculty of mechatronics, technical university of liberec. the medical specialist and consultant was tomáš klimovi , md, department of functional diagnostics, regional hospital of liberec. the research was partly supported by grant 1qs108040569 provided by the academy of sciences of the czech republic. references [1] transatlantic inter-society consensus (tasc), international angiology 2000, management of peripheral arterial disease (pad); suppl. 1. [2] wolfe, j. h. n. et al.: abc of vascular diseases. bmj 1992, p. 12–14. [3] topol, e. j. et al.: atlas of atherothrombosis, london: science press; 2004. [4] goldman, l. et al.: textbook of medicine, 22nd ed., philadelphia: saunders, 2004. [5] rozman, j. lékař a technika, praha: nts čls j.e.p. 1995, p. 9–11. jiří málek e-mail: jiri.malek@tul.cz faculty of mechatronics and interdisciplinary engineering studies technical university of liberec hálkova 6 461 17 liberec, czech republic 22 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 46 no. 
assessment of human hemodynamics under hyper- and microgravity: results of two aachen university parabolic flight experiments
n. blanik, m. hülsbusch, m. herzog, c. r. blazek

astronauts complain about fluid shifts from their lower extremities to their head caused by weightlessness during their flight into space. for a study of this phenomenon, rwth aachen university and charité university berlin participated in a joint project on two parabolic flight campaigns of the german aerospace centre (dlr) in september 2005 and june 2006. during these campaigns, the characteristics of the rapid fluid shifts during hyper- and microgravity were measured by a combination of ppg and ppgi optoelectronic sensor concepts.
keywords: rapid fluid shifts, photoplethysmography, photoplethysmography imaging, parabolic flights, microgravity.

1 introduction
during manned flights into space, fluid shifts from the lower extremities to the heads of the astronauts have been detected. this phenomenon is termed "puffy face syndrome" or "bird legs" because of the optical impression of the subject (bloated face and thin legs). these effects are caused by the unfamiliar influence of microgravity on the human body. the fluid shifts into the head result in motion sickness, concentration problems, vision degradation and other potentially dangerous symptoms during space flights. under normal conditions (on earth), gravitation maintains a fluid balance in the human body, whereby most of the blood volume is allocated to the veins of the lower extremities. figure 1 shows the distribution of the blood volume inside the human body while standing on earth. flying into space and escaping from the gravitational force disturbs this balance. especially interesting in this kind of fluid shift are the dynamic characteristics of the blood flow immediately after the moment when gravity changes. so far these phenomena are known only qualitatively. to research this dynamic behavior, the subjects were exposed to a rapid change of gravity (between zero and 1.8 times the normal gravitational force on earth, i.e. 0 g and 1.8 g, respectively) on parabolic flights. during this time, sensors measured the relative change of the blood volume in the upper skin layers of the legs and forehead.
fig. 1: distribution of blood volume for the different parts of the vascular system

2 parabolic flight
studies on gravitational effects requiring lower than normal gravitational force on earth are impossible under usual laboratory conditions. there are only a few ways to eliminate the disturbing influence of gravity on earth: free-fall towers, ballistic rockets, parabolic flights, or just leaving the earth. parabolic flights can be taken on board the airbus a300 zero-g owned by the european space agency (esa). each year the german aerospace centre (dlr) organizes parabolic flight campaigns especially for research projects.
fig. 2: airbus a300 zero-g directly before injecting the 0 g-phase
in september 2005 and june 2006, our joint research program had the opportunity to fly our experiments on the 7th and 8th dlr parabolic flight campaigns in bordeaux and cologne, respectively.
for this purpose the specially prepared airbus a300 is equipped with up to twelve separate experiments conducted by various participants. a campaign comprises 4 to 5 flight days, and during one flight the aeroplane accomplishes 31 parabola maneuvers. these flight maneuvers (fig. 4) follow a special trajectory which produces up to 22 seconds of microgravity (fig. 5) inside the aircraft. one maneuver can be separated into 3 phases. in the first one, the plane is pulled up from horizontal flight into a climb at an angle of up to 47°. during this time (about 20 seconds) the occupants of the plane are exposed to 1.8 g. in the next phase the state of microgravity is achieved: the plane is steered along a ballistic trajectory, a parabola, and a reduced gravity of about 0.02 g is reached inside the plane. this lasts another 20 seconds. after that, in the third phase, the plane is pulled out to its normal flight level; in doing this, 1.8 g is experienced for another 20 seconds. after a break of a few minutes the maneuver is repeated.
fig. 3: experimental configuration on the 8th dlr parabolic flight campaign
fig. 4: flight phases of parabolic flight
fig. 5: typical development of g-forces during a parabola maneuver (1: g-forces vertically, 2: g-forces in flight direction)

3 optoelectronic sensors
optoelectronic sensor concepts provide a good opportunity to assess the blood volume in the skin. they are basically non-invasive and thus allow measurements under space conditions as well, and they provide an unproblematic way to obtain data from the subjects, or even to exchange subjects during flight in the time between two parabolas. a greater number of subjects can therefore be measured.

3.1 photoplethysmography sensor
the first sensor type used here is the photoplethysmography (ppg) sensor. this sensor is stuck to the surface of the skin and contains a monochromatic light source and a photodetector. it exploits the fact that blood absorbs light at a higher rate than the surrounding tissue: the backscattered or reflected light intensity in well-perfused tissue is lower than in tissue with lower perfusion. the scattered light is received by the detector, and conclusions are drawn about the perfusion of the skin in the area below the sensor. the achievable penetration depth lies between one and three millimeters, depending on the wavelength of the light source that is used.

3.2 photoplethysmography imaging
photoplethysmography imaging (ppgi) is an advancement of classical ppg and relies on the same underlying principle. in contrast to the ppg sensor, an external light source and a highly sensitive camera are used. like the ppg sensor, this camera is able to detect within its field of view small changes of light intensity caused by the shifting of blood in the skin. using a camera as detector, the analysis of the blood volume shift is not limited to a single spot but covers a larger part of the skin surface. an additional benefit is the contactless measurement principle.
fig. 6: ppg sensor (left) and ppgi camera with surrounding light sources (right)
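referring back to the maneuver timing described in section 2 (and sketched in fig. 5), the following toy snippet builds a piecewise g-force profile of a single parabola; the phase durations and the ~0.02 g residual level are taken from the text above, while the sampling rate and the function name are assumptions.

```python
import numpy as np

def parabola_g_profile(fs=100.0):
    """piecewise vertical g-force profile of one parabola maneuver:
    ~20 s pull-up at 1.8 g, ~22 s ballistic phase at ~0.02 g,
    ~20 s pull-out at 1.8 g (durations as quoted in the text)."""
    phases = [(20.0, 1.8), (22.0, 0.02), (20.0, 1.8)]
    t, g, t0 = [], [], 0.0
    for duration, level in phases:
        n = int(duration * fs)
        t.append(t0 + np.arange(n) / fs)
        g.append(np.full(n, level))
        t0 += duration
    return np.concatenate(t), np.concatenate(g)

t, g = parabola_g_profile()
print(f"total maneuver time: {t[-1]:.1f} s, min g: {g.min()}, max g: {g.max()}")
```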
4 results
after the flights, the data from the different subjects were analysed, and it was first ascertained that fluid shifts occurred in all subjects. this proves that even the short time span of about 20 seconds of a single parabola suffices for clear detection. furthermore, a clear shift from the lower body parts to the head could be verified, as had been predicted.
fig. 7 shows three typical ppg measurements, each containing the characteristics of one subject. these recordings were taken with sensors using green and infrared light sources, placed on the subjects' foreheads. the detected light intensity is displayed in curve 1 (green light) and curve 2 (infrared light); the lower black curve shows the development of the g-forces during the parabola. the small and fast oscillations seen in the curves are caused by the subjects' heartbeats (clearly visible in the second, blue curve) and breathing. however, the main focus of this research program was on the characteristics of the low-frequency components. it is clearly visible that immediately after the injection of zero gravity the detected backscattered light intensity declines rapidly, which corresponds to an increasing volume of blood in the skin under the sensor. a comparison of the data for the different subjects shows that all of them reacted very quickly to the altered gravitational force: the time span between the change of gravity and the minimum in the detected light intensity (or the maximum if measured at the legs) was very short, only a few seconds, though it differs among the subjects. these individual characteristics are detectable with all types of sensors used here. a statistical analysis of the data is shown in fig. 8. it shows the relative shift of the light intensity detected by the sensor during the 0 g phase with respect to the start of the parabola maneuver. the differences between the different light colors are caused by the individual penetration depths of the light wavelengths that are used.
an additional comparison of the time that the blood takes to flow out of the vessels in the leg (during a change of gravity from 1.8 g to 0 g) with the time it takes to flow back (during the change from 0 g to 1.8 g) shows that the former time span is noticeably shorter than the latter. an explanation for this is that under microgravity conditions the resistance to the blood flow out of the leg vessels towards the heart (and head) is very low. in addition, this flow is assisted by the venous valves: healthy venous valves prevent a direct backward flow through the veins, so the flow towards the legs is limited to the heart pumping volume and the arterial blood flow.
fig. 7: different kinetics of blood volume shifts of three subjects (measuring points on the forehead using a green (1) and an infrared light source (2), normalized at the start of measurement)
fig. 8: statistical analysis of skin blood volume changes measured with ppg sensors on subjects' foreheads

acknowledgments
the experiment was conducted in cooperation with the centre for space medicine, department of physiology, faculty of medicine of the charité university, berlin (prof. gunga, principal investigator) and the institute of high frequency technology, rwth aachen (prof. blazek).
nikolai blanik
e-mail: blanik@ihf.rwth-aachen.de
markus hülsbusch
e-mail: huelsbusch@ihf.rwth-aachen.de
markus herzog
e-mail: herzog@ihf.rwth-aachen.de
institute of high frequency technology, rwth aachen university, germany
claudia r. blazek
e-mail: claudia.blazek@gmx.de
department of dermatology, medical faculty, rwth aachen university, germany

time-dependent mass oscillators: constants of motion and semiclassical states
kevin zelaya
czech academy of sciences, nuclear physics institute, 250 68 řež, czech republic
correspondence: kdzelaya@fis.cinvestav.mx
acta polytechnica 62(1):211–221, 2022. https://doi.org/10.14311/ap.2022.62.0211. © 2022 the author(s), licensed under a cc-by 4.0 licence, published by the czech technical university in prague.

abstract. this work reports the construction of constants of motion for a family of time-dependent mass oscillators, achieved by implementing the formalism of form-preserving point transformations. the latter allows obtaining a spectral problem for each constant of motion, one of which leads to a non-orthogonal set of eigensolutions that are, in turn, coherent states. that is, eigensolutions whose wavepacket follows a classical trajectory and saturates, in this case, the schrödinger-robertson uncertainty relationship. results obtained in this form are relatively general, and some particular examples are considered to illustrate the results further. notably, a regularized caldirola-kanai mass term is introduced in an attempt to amend some of the unusual features found in the conventional caldirola-kanai case.
keywords: time-dependent mass oscillators, caldirola-kanai oscillator, quantum invariants, coherent states, semiclassical dynamics.

1. introduction
the search for exact solutions of time-dependent (nonstationary) quantum models is a challenging task compared to the stationary (time-independent) counterpart. in the stationary case, the dynamical law (schrödinger equation) reduces to an eigenvalue equation associated with the energy observable, the hamiltonian, for which several methods can be implemented to obtain exact solutions. particularly, new exactly solvable models can be constructed from previously known ones through darboux transformations [1] (also known as susy-qm). in the nonstationary case, it is still possible to recover an eigenvalue problem for the hamiltonian if one restricts oneself to the adiabatic approximation [2, 3]. however, in general, the latter is not feasible, and other workarounds have to be implemented. despite all these challenges, time-dependent phenomena find exciting applications in physical systems such as electromagnetic traps of charged particles and plasma physics [4–8]. the parametric oscillator is perhaps the most well-known exactly solvable nonstationary model in quantum mechanics. a straightforward method to solve such a problem was introduced by lewis and riesenfeld [9], who noticed that the appropriate constant of motion (quantum invariant) admits a nonstationary eigenvalue equation with time-dependent solutions and constant eigenvalues.
in this form, nonstationary models can be addressed similarly to their stationary counterparts. this paved the way to solving other time-dependent problems [10–14]. recently, the darboux transformation has been adapted into the quantum invariant scheme to construct new time-dependent hamiltonians, together with the corresponding quantum invariant and the set of solutions [15–17]. alternatively, other methods exist to build new time-dependent models, such as the modified darboux transformation introduced by bagrov et al. [18], which relies on a differential operator that intertwines a known schrödinger equation with an unknown one. this has led to new results in the nonstationary hermitian regime [19–21]. a non-hermitian pt-symmetric extension has been discussed in [22], and some further models were reported in [23, 24]. on the other hand, the point transformation formalism [25] has proved useful for constructing and solving time-dependent oscillators. this was achieved by implementing a geometrical deformation that transforms the stationary oscillator schrödinger equation into one with time-dependent frequency and mass [26, 27]. this allows obtaining further information, such as the constants of motion, which are preserved throughout the point transformation [25], leading to a straightforward way to get such constants of motion without imposing any ansatz. a further extension for non-hermitian systems was introduced in [28], whereas a non-hermitian extension of the generalized caldirola-kanai oscillator was discussed in [29].
in this work, the point transformation formalism is exploited to construct and study the dynamics of semiclassical states associated with time-dependent mass oscillators. this is achieved by using the aforementioned preservation of constants of motion and identifying their corresponding spectral problem. notably, it is shown that one constant of motion leads to an orthogonal set of solutions, whereas a different one leads to non-orthogonal solutions that behave like semiclassical states. that is, gaussian wavepackets whose maximum point follows the corresponding classical trajectory and minimize, in this case, the schrödinger-robertson uncertainty principle. two particular examples are considered to illustrate the usefulness of the approach further.

2. materials and methods
throughout this manuscript, the time-dependent mass m(t) and frequency ω²(t) oscillator subjected to an external driving force F(t) is considered. such a model is characterized by the time-dependent hamiltonian
\[ \hat{H}_{CK}(t) = \frac{\hat{p}^2}{2m(t)} + \frac{m(t)\,\omega^2(t)}{2}\,\hat{x}^2 + F(t)\,\hat{x}, \qquad (1) \]
with x̂ and p̂x the canonical position and momentum operators, respectively, with \([\hat{x},\hat{p}_x] = i\hbar\,\mathbb{I}\). henceforth, the identity operator \(\mathbb{I}\) is omitted each time it multiplies a constant or a function. the corresponding schrödinger equation
\[ i\hbar\,\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m(t)}\,\frac{\partial^2\psi}{\partial x^2} + \frac{m(t)\,\omega^2(t)\,x^2}{2}\,\psi + F(t)\,x\,\psi, \qquad (2) \]
is recovered by using the coordinate representation \(\hat{p}_x \equiv -i\hbar\,\partial/\partial x\) and \(\hat{x} \equiv x \in \mathbb{R}\). the solutions of eq. (2) have been discussed by several authors, see [27, 30–32]. here, a brief summary of the point transformation approach discussed in [26, 27] is provided. this eases the discussion of semiclassical states and dynamics to be presented later in section 3.
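before turning to the point transformation, here is a minimal crank-nicolson sketch for eq. (2); it is an illustrative discretization only (the hamiltonian is evaluated at the left endpoint of each step), and the mass profile, grid and step sizes are assumptions, not values from the paper.

```python
import numpy as np

hbar = 1.0
def m(t):  return np.exp(-2*0.1*t)   # assumed mass profile, m(t) = mu(t)^2
def w2(t): return 4.0                # assumed constant squared frequency
def F(t):  return 0.0                # assumed: no external driving

x = np.linspace(-10, 10, 400); dx = x[1] - x[0]; dt = 1e-3
psi = np.exp(-x**2/2) / np.pi**0.25  # initial gaussian, unit norm
lap = (np.diag(np.full(399, 1.0), 1) + np.diag(np.full(399, 1.0), -1)
       - 2*np.eye(400)) / dx**2      # dirichlet laplacian

for k in range(2000):
    t = k * dt
    H = -hbar**2/(2*m(t))*lap + np.diag(m(t)*w2(t)*x**2/2 + F(t)*x)
    Aop = np.eye(400) + 1j*dt/(2*hbar)*H   # crank-nicolson half steps
    Bop = np.eye(400) - 1j*dt/(2*hbar)*H
    psi = np.linalg.solve(Aop, Bop @ psi)

print("norm preserved:", (np.abs(psi)**2).sum()*dx)
```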
2.1. point transformations
in general, the method of form-preserving point transformations relies on a geometrical deformation that maps an initial differential equation with variable coefficients into another one of the same form but with different coefficients. to illustrate this, consider the stationary oscillator hamiltonian
\[ \hat{H}_{osc} = \frac{\hat{p}_y^2}{2m_0} + \frac{m_0 w_0^2\,\hat{y}^2}{2}, \qquad [\hat{y},\hat{p}_y] = i\hbar, \qquad (3) \]
with ŷ and p̂y another couple of canonical position and momentum observables, respectively. the corresponding schrödinger equation
\[ i\hbar\,\frac{\partial\Psi}{\partial\tau} = -\frac{\hbar^2}{2m_0}\,\frac{\partial^2\Psi}{\partial y^2} + \frac{m_0 w_0^2 y^2}{2}\,\Psi, \qquad (4) \]
admits the well-known solutions [2]
\[ \Psi_n(y,\tau) = e^{-i w_0 (n+\frac{1}{2})\tau}\,\phi_n(y), \qquad (5) \]
where
\[ \phi_n(y) = \sqrt{\frac{1}{2^n n!}\sqrt{\frac{m_0 w_0}{\pi\hbar}}}\; e^{-\frac{m_0 w_0}{2\hbar}\, y^2}\, H_n\!\left(\sqrt{\frac{m_0 w_0}{\hbar}}\, y\right), \qquad (6) \]
with H_n(z) the hermite polynomials [33], fulfills the stationary eigenvalue problem
\[ H_{osc}\,\phi_n(y) = E_n^{(osc)}\,\phi_n(y), \qquad E_n^{(osc)} = \hbar w_0\left(n+\tfrac{1}{2}\right), \qquad (7) \]
with H_{osc} the coordinate representation of Ĥ_{osc}, i.e., a second-order differential operator that admits a sturm-liouville problem.
to implement the point transformation, one imposes a set of relationships between the coordinates, time parameters, and solutions of both systems in consideration [25]. in general one has
\[ y(x,t), \qquad \tau(x,t), \qquad \Psi(y(x,t),\tau(x,t)) \equiv G(x,t;\psi), \qquad (8) \]
where G(x,t;ψ) is a reparametrization of Ψ as an explicit function of x, t, and ψ. in the case under consideration, some further conditions are required to preserve the linearity and the hermiticity of Ĥ_{osc} and Ĥ_{CK}(t). a detailed discussion on the matter can be found in [27]. here, the final form of the point transformation is used, leading to
\[ y(x,t) = \frac{\mu(t)\,x + \gamma(t)}{\sigma(t)}, \qquad \tau(t) = \int^t \frac{dt'}{\sigma^2(t')}, \qquad (9) \]
and
\[ \Psi(y(x,t),\tau(t)) \equiv G(x,t;\psi) = A(x,t)\,\psi(x,t), \qquad (10) \]
with m(t) = µ²(t), together with σ(t) and γ(t) some real-valued functions to be determined. by substituting (9) into the schrödinger equation (4), and after some calculations, one arrives at a new partial differential equation for ψ(x,t) that takes the exact form in (2). the latter allows obtaining
\[ A(x,t) = \sqrt{\frac{\sigma}{\mu}}\; e^{a(x,t)}, \qquad a(x,t) := i\,\frac{m_0 w_0}{\hbar}\,\frac{\mu}{\sigma}\left(\frac{W_\mu}{2}\,x^2 + W_\gamma\,x\right) + i\eta, \qquad (11) \]
where a(x,t) is a local time-dependent complex phase and [27]
\[ \eta(t) := \frac{m_0}{2\hbar}\,\frac{\gamma(t)\,W_\gamma(t)}{\sigma(t)} - \frac{1}{2\hbar}\int^t dt'\,\frac{F(t')\,\gamma(t')}{\mu(t')}, \qquad W_\mu(t) = \sigma(t)\dot{\mu}(t) - \dot{\sigma}(t)\mu(t), \qquad W_\gamma(t) = \sigma(t)\dot{\gamma}(t) - \dot{\sigma}(t)\gamma(t), \qquad (12) \]
with \(\dot{f}(t) \equiv df(t)/dt\) a short-hand notation for the time derivative (the γ(t′) factor in the integral of (12) is restored here on dimensional grounds, as the extraction dropped it). in the latter, σ(t) and γ(t) fulfill the nonlinear ermakov equation
\[ \ddot{\sigma}(t) + \left(\omega^2(t) - \frac{\ddot{\mu}(t)}{\mu(t)}\right)\sigma(t) = \frac{w_0^2}{\sigma^3(t)}, \qquad (13) \]
and the non-homogeneous equation
\[ \ddot{\gamma}(t) + \left(\omega^2(t) - \frac{\ddot{\mu}(t)}{\mu(t)}\right)\gamma(t) = \frac{F(t)}{m_0\,\mu(t)}. \qquad (14) \]
the solutions of the ermakov equation are well known [34–36] and computed from two linearly independent solutions of the associated linear equation
\[ \ddot{q}_j(t) + \left(\omega^2(t) - \frac{\ddot{\mu}(t)}{\mu(t)}\right) q_j(t) = 0, \qquad j = 1,2, \qquad (15) \]
through the nonlinear combination
\[ \sigma(t) = \left[ a\,q_1^2(t) + b\,q_1(t)\,q_2(t) + c\,q_2^2(t) \right]^{\frac{1}{2}}, \qquad (16) \]
with \(b^2 - 4ac = -4\,\frac{w_0^2}{W_0^2}\) and \(W_0 = Wr(q_1(t),q_2(t)) \neq 0\) the wronskian of the two linearly independent solutions of (15), which is in general a time-independent complex constant. the previous constraint on a, b, and c guarantees that σ(t) is different from zero [26] for t ∈ ℝ. in this form, one obtains a set of solutions \(\{\psi_n(x,t)\}_{n=0}^{\infty}\) to the schrödinger equation (2), where
\[ \psi_n(x,t) = \sqrt{\frac{\mu(t)}{\sigma(t)}}\,[A(x,t)]^{-1}\, e^{-i w_0 (n+\frac{1}{2})\tau(t)}\; e^{-\frac{m_0 w_0}{2\hbar}\left(\frac{\mu(t)x+\gamma(t)}{\sigma(t)}\right)^2}\, H_n\!\left(\sqrt{\frac{m_0 w_0}{\hbar}}\,\frac{\mu(t)x+\gamma(t)}{\sigma(t)}\right). \qquad (17) \]
from (10)–(11) it follows that
\[ (\psi_m,\psi_n) := \int_{\mathbb{R}} dx\,\psi_m^*(x,t)\,\psi_n(x,t) = \int_{\mathbb{R}} dy\,\Psi_m^*(y,\tau)\,\Psi_n(y,\tau) = \delta_{n,m}, \qquad (18) \]
with z* the complex conjugate of z.
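to see eqs. (13)–(16) at work, the following is a minimal numerical sketch (an illustration under assumed parameters, not code from the paper): it integrates the linear equation (15) for a toy effective frequency, builds σ(t) through (16), and obtains τ(t) from (9).

```python
import numpy as np
from scipy.integrate import solve_ivp, cumulative_trapezoid

w0 = 1.0
def Omega2(t):
    # effective frequency omega^2(t) - mu''(t)/mu(t) of eq. (15); assumed profile
    return 1.0 + 0.2*np.sin(t)

def rhs(t, y):
    q1, dq1, q2, dq2 = y
    return [dq1, -Omega2(t)*q1, dq2, -Omega2(t)*q2]

# independent initial conditions give wronskian W0 = q1*q2' - q1'*q2 = 1
sol = solve_ivp(rhs, (0.0, 20.0), [1.0, 0.0, 0.0, 1.0],
                dense_output=True, rtol=1e-10, atol=1e-12)
t = np.linspace(0.0, 20.0, 2001)
q1, dq1, q2, dq2 = sol.sol(t)
W0 = q1*dq2 - dq1*q2

# eq. (16) with b = 0 and a = c: the constraint b^2 - 4ac = -4 w0^2/W0^2
# then fixes a = w0/|W0|, and sigma(t) stays strictly positive
a = c = w0/abs(W0[0])
sigma = np.sqrt(a*q1**2 + c*q2**2)
tau = cumulative_trapezoid(1.0/sigma**2, t, initial=0.0)   # eq. (9)
print("wronskian drift:", np.max(np.abs(W0 - W0[0])))
```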
that is, the inner product is preserved and thus the set \(\{\psi_n(x,t)\}_{n=0}^{\infty}\) is orthonormal in \(L^2(\mathbb{R},dx)\). the expressions presented so far are general, and specific results may be obtained once the time-dependent mass and frequency terms are specified. this is discussed in the following sections. before concluding, an explicit expression for τ(t) can be determined in terms of the two linearly independent solutions q1(t) and q2(t) as well. one gets
\[ \tau(t) = w_0^{-1}\arctan\left[\frac{W_0}{2 w_0}\left(b + 2c\,\frac{q_2(t)}{q_1(t)}\right)\right]. \qquad (19) \]

3. results: constants of motion and semiclassical states
additional information can be extracted from the stationary oscillator into the time-dependent model. particularly, point transformations preserve first integrals of the initial equation [25]. in the context of the schrödinger equation, such first integrals correspond to constants of motion, also known as quantum invariants, associated with the physical models under consideration. from the stationary oscillator, it is straightforward to realize that the hamiltonian Ĥ_{osc} is a constant of motion that characterizes the energy observable. in the time-dependent case, Ĥ_{CK}(t) is no longer a constant of motion, as \(d\hat{H}_{CK}(t)/dt \neq 0\). this implies that an eigenvalue problem associated with Ĥ_{CK} is not possible (one can still link an eigenvalue problem with Ĥ_{CK}(t) under the adiabatic approximation [3]; this work focuses on exact solutions, and such an approach will be disregarded). on the other hand, an orthonormal set of solutions \(\{\psi_n(x,t)\}_{n=0}^{\infty}\) has already been identified, and it is still unclear which eigenvalue problem such a set solves. this problem was addressed by lewis and riesenfeld [9] while solving the dynamics of the parametric oscillator. they noticed that even in the time-dependent regime there may be a constant of motion Î0(t) that admits a spectral problem
\[ \hat{I}_0(t)\,\phi(x,t) = \lambda\,\phi(x,t), \qquad (20) \]
where the eigenvalues λ are time-independent. the existence and uniqueness of such a quantum invariant is not necessarily ensured. still, for the parametric oscillator, lewis and riesenfeld managed to find the quantum invariant and solve the related spectral problem.
here, some quantum invariants associated with Ĥ_{CK} can be found through point transformations. first, notice that the point transformation was implemented in the schrödinger equation to get the time-dependent counterpart. the same transformation can be applied to a constant of motion of the harmonic oscillator to get the corresponding one in the time-dependent model. particularly, by considering the eigenvalue problem (7), and after some calculations, one gets a first quantum invariant of the form
\[ \hat{I}_1(t) := \frac{\sigma^2(t)}{2 m_0 \mu^2(t)}\,\hat{p}_x^2 + \frac{m_0}{2}\left(W_\mu^2(t) + w_0^2\,\frac{\mu^2(t)}{\sigma^2(t)}\right)\hat{x}^2 + \frac{\sigma W_\mu(t)}{2\mu(t)}\,(\hat{x}\hat{p}_x + \hat{p}_x\hat{x}) + \frac{\sigma W_\gamma(t)}{\mu(t)}\,\hat{p}_x + m_0\left(W_\gamma(t)\,W_\mu(t) + w_0^2\,\frac{\mu(t)\,\gamma(t)}{\sigma^2(t)}\right)\hat{x} + \frac{m_0}{2}\left(W_\gamma^2(t) + w_0^2\,\frac{\gamma^2(t)}{\sigma^2(t)}\right). \qquad (21) \]
it is straightforward to show that Î1(t) is indeed a quantum invariant,
\[ \frac{i}{\hbar}\,[\hat{H}_{CK},\hat{I}_1(t)] + \frac{\partial \hat{I}_1(t)}{\partial t} = 0. \qquad (22) \]
moreover, I1(t), the coordinate representation of Î1(t), defines a sturm-liouville problem with time-dependent coefficients,
\[ I_1(t)\,\psi_n(x,t) = \hbar w_0\left(n + \tfrac{1}{2}\right)\psi_n(x,t), \qquad (23) \]
which justifies the existence of the orthogonal set of solutions found in section 2. note that orthogonality has been alternatively proved in (18) using the preservation of the inner product. remarkably, there are still more quantum invariants to be exploited.
to see this, let us consider the operators
\[ \hat{a} = \sqrt{\frac{m_0 w_0}{2\hbar}}\,\hat{y} + \frac{i\,\hat{p}_y}{\sqrt{2 m_0 \hbar w_0}}, \qquad \hat{a}^{\dagger} = \sqrt{\frac{m_0 w_0}{2\hbar}}\,\hat{y} - \frac{i\,\hat{p}_y}{\sqrt{2 m_0 \hbar w_0}}, \qquad (24) \]
which factorize the stationary oscillator hamiltonian as \(\hat{H}_{osc} = \hbar w_0 (\hat{a}^{\dagger}\hat{a} + \frac{1}{2})\) and fulfill the commutation relationship \([\hat{a},\hat{a}^{\dagger}] = 1\). although â and â† are not constants of motion of Ĥ_{osc}, one can introduce a new pair of operators
\[ \hat{A} := e^{i w_0 \tau}\,\hat{a}, \qquad \hat{A}^{\dagger} := e^{-i w_0 \tau}\,\hat{a}^{\dagger}, \qquad (25) \]
where straightforward calculations show that \(\frac{i}{\hbar}[\hat{H}_{osc},\hat{A}] + \frac{\partial \hat{A}}{\partial \tau} = 0\), and similarly for Â†. that is, Â and Â† are quantum invariants of Ĥ_{osc}. the latter can now be mapped into the time-dependent model, leading straightforwardly to new quantum invariants of Ĥ_{CK}(t) of the form
\[ \hat{I}_a(t) = e^{i w_0 \tau(t)}\left[ \frac{i}{\sqrt{2 m_0 \hbar w_0}}\,\frac{\sigma(t)}{\mu(t)}\,\hat{p}_x + \left(\sqrt{\frac{m_0 w_0}{2\hbar}}\,\frac{\mu(t)}{\sigma(t)} + i\sqrt{\frac{m_0}{2\hbar w_0}}\,W_\mu(t)\right)\hat{x} + \left(\sqrt{\frac{m_0 w_0}{2\hbar}}\,\frac{\gamma(t)}{\sigma(t)} + i\sqrt{\frac{m_0}{2\hbar w_0}}\,W_\gamma(t)\right)\right], \qquad (26) \]
and its adjoint Î†a(t). before proceeding, it is worth recalling that two arbitrary quantum invariants Î(t) and Ĩ(t) of a given hamiltonian Ĥ(t) can be used to construct further invariants. this follows from the fact that the linear combination \(\ell\,\hat{I}(t) + \tilde{\ell}\,\hat{\tilde{I}}(t)\) and the product \(\hat{I}(t)\,\hat{\tilde{I}}(t)\) of quantum invariants are also quantum invariants of the same hamiltonian Ĥ(t), for ℓ and ℓ̃ time-independent coefficients. in this form, Îa(t) and Î†a(t) generate Î1(t) through
\[ \hat{I}_1(t) = \hbar w_0\left(\hat{I}_a^{\dagger}(t)\,\hat{I}_a(t) + \tfrac{1}{2}\right), \qquad (27) \]
which is analogous to the factorization of the stationary oscillator. similarly, the commutation relationship [â,â†] = 1 of the stationary oscillator is preserved. one thus gets \([\hat{I}_a(t),\hat{I}_a^{\dagger}(t)] = 1\) together with
\[ [\hat{I}_1(t),\hat{I}_a(t)] = -\hbar w_0\,\hat{I}_a(t), \qquad [\hat{I}_1(t),\hat{I}_a^{\dagger}(t)] = \hbar w_0\,\hat{I}_a^{\dagger}(t), \qquad (28) \]
which means that Îa(t) and Î†a(t) are annihilation and creation operators, respectively, for the eigensolutions of Î1(t). the latter leads to
\[ \hat{I}_a(t)\,\psi_{n+1}(x,t) = \sqrt{\hbar w_0 (n+1)}\,\psi_n(x,t), \qquad \hat{I}_a^{\dagger}(t)\,\psi_n(x,t) = \sqrt{\hbar w_0 (n+1)}\,\psi_{n+1}(x,t), \qquad (29) \]
for n = 0, 1, …. on the other hand, the orthonormal set \(\{\psi_n(x,t)\}_{n=0}^{\infty}\) can be used as a basis to expand any arbitrary solution ψ(x,t) of (2) through
\[ \psi(x,t) = \sum_{n=0}^{\infty} c_n\,\psi_n(x,t), \qquad c_n := (\psi_n(x,t),\psi(x,t)). \qquad (30) \]
now, from the above results, one may investigate the spectral problem related to the remaining quantum invariants Îa(t) and Î†a(t). by considering the annihilation operator Îa(t), one obtains the eigenvalue problem
\[ \hat{I}_a(t)\,\xi_\alpha(x,t) = \alpha\,\xi_\alpha(x,t), \qquad (31) \]
where the eigensolution ξα(x,t) can be expanded as the linear combination
\[ \xi_\alpha(x,t) = \sum_{n=0}^{\infty} \tilde{c}_n(\alpha)\,\psi_n(x,t), \qquad \alpha \in \mathbb{C}. \qquad (32) \]
this corresponds to the construction of coherent states using the barut-girardello approach [37]. the complex coefficients c̃n(α) are determined by using the action of the ladder operators (29) and exploiting the orthonormality of the set \(\{\psi_n(x,t)\}_{n=0}^{\infty}\). after substituting the linear combination ξα(x,t) into the corresponding eigenvalue problem in (31), one obtains the one-parameter normalized eigensolutions
\[ \xi_\alpha(x,t) = \exp\left(-\frac{|\alpha|^2}{2\hbar w_0}\right) \sum_{n=0}^{\infty} \left(\frac{\alpha}{\sqrt{\hbar w_0}}\right)^{n} \frac{\psi_n(x,t)}{\sqrt{n!}}. \qquad (33) \]
henceforth, the latter are called time-dependent coherent states or semiclassical states interchangeably. similar to glauber coherent states [38], the eigensolutions of the annihilation operator Îa are not orthogonal among themselves. this follows from the overlap between two solutions with different eigenvalues, say α and β, leading to
\[ |(\xi_\beta,\xi_\alpha)|^2 = \exp\left(-\frac{|\alpha|^2 + |\beta|^2 - 2\,\mathrm{re}(\alpha^*\beta)}{\hbar w_0}\right), \qquad (34) \]
which is different from zero for every α, β ∈ ℂ, with the inner product defined in (18).
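as a quick numerical illustration of eqs. (33)–(34) (a sketch with assumed units ℏ = w0 = 1, not code from the paper): the expansion weights form a poisson-like distribution, and the truncated overlap reproduces the closed form (34).

```python
import numpy as np
from scipy.special import gammaln

hbar, w0 = 1.0, 1.0

def coeffs(alpha, nmax=60):
    """expansion coefficients c_n(alpha) of xi_alpha in the basis psi_n, eq. (33)."""
    n = np.arange(nmax)
    # log-domain evaluation avoids overflow of n! for moderate nmax
    log_mag = (-abs(alpha)**2/(2*hbar*w0)
               + n*np.log(abs(alpha)/np.sqrt(hbar*w0) + 1e-300)
               - 0.5*gammaln(n + 1))
    return np.exp(log_mag) * np.exp(1j * n * np.angle(alpha))

alpha, beta = 1.2 + 0.5j, -0.3 + 0.8j
ca, cb = coeffs(alpha), coeffs(beta)
print(abs(np.vdot(cb, ca))**2)                     # truncated |(xi_beta, xi_alpha)|^2
print(np.exp(-(abs(alpha)**2 + abs(beta)**2
               - 2*np.real(np.conj(alpha)*beta))/(hbar*w0)))   # closed form, eq. (34)
```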
interestingly, the eigensolution ξα(x,t) can be brought into an alternative and handy expression by using the explicit form of ψn(x,t) given in (17), together with the well-known summation rules for the hermite polynomials. by doing so one gets
\[ \xi_\alpha(x,t) \equiv \sqrt{\frac{\mu(t)}{\sigma(t)}\sqrt{\frac{m_0 w_0}{\pi\hbar}}}\;[A(x,t)]^{-1}\,e^{-i\frac{w_0 \tau(t)}{2}}\, \exp\left[i\sqrt{\frac{2 m_0 w_0}{\hbar}}\left(\frac{\mu(t)x+\gamma(t)}{\sigma(t)}\right)\mathrm{im}\,\tilde{\alpha}(t)\right] \exp\left[-\frac{m_0 w_0}{2\hbar}\left(\frac{\mu(t)x+\gamma(t)}{\sigma(t)} - \sqrt{\frac{2\hbar}{m_0 w_0}}\,\mathrm{re}\,\tilde{\alpha}(t)\right)^{2}\right], \qquad (35) \]
with \(\tilde{\alpha}(t) = \alpha\,e^{-i w_0 \tau(t)}\). thus, ξα(x,t) is a normalized gaussian wavepacket with time-dependent width. the complex constant α plays the role of the initial conditions of the wavepacket at a given time t0; see the discussion in the following section. despite the lack of orthogonality of the elements of the set \(\{\xi_\alpha(x,t)\}_{\alpha\in\mathbb{C}}\), they can still be used as a non-orthogonal basis so that any arbitrary solution of (2) can be constructed through the appropriate linear superposition. that is, a given solution ψ(x,t) of (2) expands as
\[ \psi(x,t) = \int_{\mathbb{C}} \frac{d^2\alpha}{\pi\hbar}\; c(\alpha)\,\xi_\alpha(x,t), \qquad (36) \]
where \(c(\alpha) = (\xi_\alpha(x,t),\psi(x,t))\). so far, the spectral problems related to the quantum invariants Î1(t) and Îa(t) have led to a discrete and a continuous representation, respectively, in which any solution of (2) can be expanded. although the eigenvalue problem related to the quantum invariant Î†a(t) can be established, it leads to non-finite-norm solutions and is thus discarded.

3.1. semiclassical dynamics
with the time-dependent coherent states already constructed, one can now study the evolution in time of such states and their relation to physical observables such as the position x̂ and momentum p̂x. to this end, note that the quantum invariants obtained through point transformations preserve the commutation relation (28) of the corresponding operators of the stationary oscillator. that is, the set \(\{\hat{I}_a(t),\, \hat{I}_a^{\dagger}(t),\, \hat{I}_a^{\dagger}(t)\hat{I}_a(t)\}\) fulfills the weyl-heisenberg algebra [39]. this allows the construction of a unitary displacement operator of the form [39]
\[ D(\alpha;t) = e^{\alpha \hat{I}_a^{\dagger}(t) - \alpha^* \hat{I}_a(t)} = e^{-\frac{|\alpha|^2}{2}}\, e^{\alpha \hat{I}_a^{\dagger}}\, e^{-\alpha^* \hat{I}_a}, \qquad \alpha \in \mathbb{C}, \qquad (37) \]
so that
\[ D^{\dagger}(\alpha;t)\,\hat{I}_a(t)\,D(\alpha;t) = \hat{I}_a(t) + \alpha, \qquad D^{\dagger}(\alpha;t)\,\hat{I}_a^{\dagger}(t)\,D(\alpha;t) = \hat{I}_a^{\dagger}(t) + \alpha^*. \qquad (38) \]
it follows that the action of the first relationship on ψ0(x,t) leads to \(\hat{I}_a(t)\,D(\alpha,t)\,\psi_0(x,t) = \alpha\,D(\alpha,t)\,\psi_0(x,t)\), from which one recovers the eigenvalue equation previously analyzed in (31) by identifying \(\xi_\alpha(x,t) = D(\alpha,t)\,\psi_0(x,t)\). this corresponds to the coherent state construction of perelomov [39]. so far, two different and equivalent ways to construct the solutions ξα(x,t) have been identified, a property akin to glauber coherent states.
to further explore the time-dependent coherent states, one can take the unitary transformations (38) and combine them with the relationship between the ladder operators and the physical position x̂ and momentum p̂x observables presented in (26). after some calculations one obtains
\[ \langle\hat{x}\rangle_\alpha(t) = \sqrt{\frac{2\hbar}{m_0 w_0}}\,\frac{\sigma(t)}{\mu(t)}\,r\cos\big(w_0\tau(t) - \theta\big) - \frac{\gamma(t)}{\mu(t)}, \qquad (39) \]
where \(\alpha = r e^{i\theta}\). by using (19) and some elementary trigonometric identities, one recovers an explicit expression in terms of q1(t) and q2(t) as
\[ \langle\hat{x}\rangle_\alpha(t) = -\frac{\gamma(t)}{\mu(t)} + \sqrt{\frac{2\hbar w_0}{m_0 c}}\,\frac{r}{W_0}\left[\left(\cos\theta + \frac{W_0\, b}{2 w_0}\,\sin\theta\right)\frac{q_1(t)}{\mu(t)} + \frac{W_0\, c}{w_0}\,\sin\theta\,\frac{q_2(t)}{\mu(t)}\right]. \qquad (40) \]
similarly, the calculation for the momentum observable leads to
\[ \langle\hat{p}_x\rangle_\alpha(t) = -m_0\,\frac{\mu(t)}{\sigma(t)}\left(W_\mu(t)\,\langle\hat{x}\rangle_\alpha(t) + W_\gamma(t)\right) - \sqrt{2 m_0 \hbar w_0}\,\frac{\mu(t)}{\sigma(t)}\,r\sin\big(w_0\tau(t) - \theta\big). \qquad (41) \]
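the following minimal sketch checks eqs. (39), (41) and (42) numerically in the trivial stationary limit µ = σ = 1, γ = 0 (so Wµ = Wγ = 0); units ℏ = m0 = 1 and all parameter values are assumptions made for the illustration.

```python
import numpy as np

hbar, m0, w0 = 1.0, 1.0, 1.0
r, theta = 1.5, 0.3
t = np.linspace(0.0, 10.0, 2001)
tau = t                                   # eq. (9) with sigma = 1

x_mean = np.sqrt(2*hbar/(m0*w0)) * r * np.cos(w0*tau - theta)    # eq. (39)
p_mean = -np.sqrt(2*m0*hbar*w0) * r * np.sin(w0*tau - theta)     # eq. (41)

# eq. (42): <p> = m(t) d<x>/dt, with m(t) = m0 in this limit
print(np.max(np.abs(p_mean - m0*np.gradient(x_mean, t))))  # small; finite differences
```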
in the latter, \(\langle\hat{O}\rangle_\alpha(t) \equiv (\xi_\alpha(x,t),\hat{O}\,\xi_\alpha(x,t))\) stands for the average value of the observable Ô computed through the time-dependent coherent state ξα(x,t). the expectation value of the momentum (41) can be further simplified so that it simply rewrites as
\[ \langle\hat{p}_x\rangle_\alpha(t) = m(t)\,\frac{d}{dt}\langle\hat{x}\rangle_\alpha(t), \qquad m(t) = m_0\,\mu^2(t), \qquad (42) \]
which is an analogous relation to that obtained from the canonical equations of motion of the corresponding classical hamiltonian. this is also a consequence of the quadratic nature of the time-dependent hamiltonian Ĥ_{CK}(t) and the ehrenfest theorem. from the expectation values obtained in (39)–(42), a relationship between the complex parameter \(\alpha = r e^{i\theta}\) and the expectation values at a given initial time t = t0 can be established. the straightforward calculations lead to
\[ \begin{pmatrix} \mathrm{re}\,\tilde{\alpha}_{t_0} \\ \mathrm{im}\,\tilde{\alpha}_{t_0} \end{pmatrix} = \begin{pmatrix} \sqrt{\frac{m_0 w_0}{2\hbar}}\,\frac{\gamma_{t_0}}{\sigma_{t_0}} \\ \sqrt{\frac{m_0}{2\hbar w_0}}\,W_{\gamma_{t_0}} \end{pmatrix} + \begin{pmatrix} \sqrt{\frac{m_0 w_0}{2\hbar}}\,\frac{\mu_{t_0}}{\sigma_{t_0}} & 0 \\ \sqrt{\frac{m_0}{2\hbar w_0}}\,W_{\mu_{t_0}} & \frac{1}{\sqrt{2 m_0 \hbar w_0}}\,\frac{\sigma_{t_0}}{\mu_{t_0}} \end{pmatrix} \begin{pmatrix} \langle\hat{x}\rangle_{t_0} \\ \langle\hat{p}_x\rangle_{t_0} \end{pmatrix}, \qquad (43) \]
with \(\tilde{\alpha}_{t_0} = \alpha\,e^{-i w_0 \tau_{t_0}} = r\,e^{i(\theta - w_0\tau_{t_0})}\), \(\tau_{t_0} = \tau(t_0)\), \(\sigma_{t_0} = \sigma(t_0)\), \(\gamma_{t_0} = \gamma(t_0)\), \(W_{\gamma_{t_0}} = W_\gamma(t_0)\), \(W_{\mu_{t_0}} = W_\mu(t_0)\), \(\langle\hat{x}\rangle_{t_0} = \langle\hat{x}\rangle_\alpha(t_0)\), and \(\langle\hat{p}_x\rangle_{t_0} = \langle\hat{p}_x\rangle_\alpha(t_0)\). on the other hand, one can write the probability density associated with the time-dependent coherent state in terms of ⟨x̂⟩α(t) through
\[ P_\alpha(x,t) := |\xi_\alpha(x,t)|^2 = \sqrt{\frac{m_0 w_0}{\pi\hbar}}\,\frac{\mu(t)}{\sigma(t)}\, \exp\left[-\frac{m_0 w_0}{\hbar}\,\frac{\mu^2(t)}{\sigma^2(t)}\,\big(x - \langle\hat{x}\rangle_\alpha(t)\big)^2\right], \qquad (44) \]
which is a gaussian wavepacket whose maximum follows the classical trajectory. that is, the time-dependent coherent state is considered a semiclassical state. before concluding this section, it is worth exploring the corresponding uncertainty relations associated with the canonical observables, which can be computed by using (26), (31), and (38). after some calculations one gets
\[ (\Delta\hat{x})_\alpha^2 = \frac{\hbar}{2 m_0 w_0}\,\frac{\sigma^2(t)}{\mu^2(t)}, \qquad (\Delta\hat{p}_x)_\alpha^2 = \frac{m_0 \hbar w_0}{2}\,\frac{\mu^2(t)}{\sigma^2(t)}\left(1 + \frac{\sigma^2(t)\,W_\mu^2(t)}{w_0^2\,\mu^2(t)}\right), \qquad (45) \]
from which the uncertainty relation reduces to
\[ (\Delta\hat{x})_\alpha^2\,(\Delta\hat{p}_x)_\alpha^2 = \frac{\hbar^2}{4}\left(1 + \frac{\sigma^2(t)\,W_\mu^2(t)}{w_0^2\,\mu^2(t)}\right), \qquad (46) \]
where it is clear that, in general, ξα(x,t) does not minimize the heisenberg uncertainty relationship, except at those times t′ for which Wµ(t′) = 0. the latter follows from the fact that σ ≠ 0 for t ∈ ℝ. still, there are two special cases for which eq. (46) minimizes at all times.
• for µ(t) = µ0 and ω(t) = w1, one can always find a constant solution σ⁴(t) = w0²/w1² so that Wµ = 0. the uncertainty relationship (46) is minimized, and the time-dependent hamiltonian becomes
\[ \hat{H}_{CK}(t) = \frac{\hat{p}_x^2}{2 m_0 \mu_0^2} + \frac{m_0 \mu_0^2 w_1^2}{2}\,\hat{x}^2 + F(t)\,\hat{x}, \qquad (47) \]
which is nothing but a stationary oscillator with an external time-dependent driving force F(t) (the hamiltonian (47) is essentially stationary, for the term F(t) can be absorbed through an appropriate reparametrization of the canonical coordinate). thus, the uncertainty relation gets minimized in the stationary limit, as expected.
• for ω²(t) = w0² µ⁻⁴(t), there is a solution σ(t) = µ(t) for which Wµ = 0. this leads to a hamiltonian of the form
\[ \hat{H}_{CK}(t) = \frac{1}{\mu^2(t)}\left( \frac{\hat{p}_x^2}{2 m_0} + \frac{m_0 w_0^2}{2}\,\hat{x}^2 + \mu^2(t)\,F(t)\,\hat{x} \right). \qquad (48) \]
although the solutions ξα(x,t) minimize the heisenberg uncertainty relation only in some restricted cases, one can still explore the schrödinger-robertson inequality [40, 41]. this is defined for a pair of observables Â and B̂ through
\[ \big(\Delta\hat{A}\big)^2\big(\Delta\hat{B}\big)^2 \geq \frac{|\langle[\hat{A},\hat{B}]\rangle|^2}{4} + \sigma_{A,B}^2, \qquad (49) \]
where \(\sigma_{A,B} := \frac{1}{2}\langle\hat{A}\hat{B} + \hat{B}\hat{A}\rangle - \langle\hat{A}\rangle\langle\hat{B}\rangle\) stands for the correlation function. for the canonical position x̂ and momentum p̂x observables one gets
\[ \sigma_{x,p_x}^2 = \frac{\hbar^2}{4 w_0^2}\,\frac{\sigma^2(t)\,W_\mu^2(t)}{\mu^2(t)}, \qquad (50) \]
when computed through ξα(x,t). thus, the semiclassical states ξα(x,t) minimize the schrödinger-robertson relationship for t ∈ ℝ.
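the saturation just stated is an algebraic identity between eqs. (45)–(46) and (50), which the following sketch verifies numerically; the mass and ermakov profiles below are placeholders (the identity holds for any σ, µ), and ℏ = m0 = 1 is assumed.

```python
import numpy as np

hbar, m0, w0 = 1.0, 1.0, 1.0
t = np.linspace(0, 10, 1001)
mu, dmu = np.exp(-0.1*t), -0.1*np.exp(-0.1*t)       # assumed mass profile
sigma, dsigma = np.ones_like(t), np.zeros_like(t)   # assumed sigma(t) for illustration
W_mu = sigma*dmu - dsigma*mu

var_x = hbar/(2*m0*w0) * sigma**2/mu**2                                   # eq. (45)
var_p = m0*hbar*w0/2 * mu**2/sigma**2 * (1 + sigma**2*W_mu**2/(w0**2*mu**2))
lhs = var_x * var_p                                                       # eq. (46)
corr2 = hbar**2/(4*w0**2) * sigma**2*W_mu**2/mu**2                        # eq. (50)
print(np.max(np.abs(lhs - (hbar**2/4 + corr2))))   # ~0: schrödinger-robertson saturated
```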
4. discussion: conventional and regularized caldirola-kanai oscillators
so far, the most general setup has been addressed for a time-dependent mass oscillator. two particular examples are considered in this section to further illustrate the usefulness and behavior of the so-constructed solutions and coherent states. henceforth, all calculations are carried out in units of ℏ = 1 to simplify the ongoing discussion. throughout the rest of this manuscript, the following two time-dependent masses are considered:
\[ \mu_{ck}(t) = e^{-\kappa t}, \qquad \kappa \geq 0, \qquad (51a) \]
\[ \mu_{rck}(t) = e^{-\kappa t} + \mu_0, \qquad \kappa,\mu_0 \geq 0. \qquad (51b) \]
the first one corresponds to the well-known caldirola-kanai oscillator [42, 43], which contains a mass term that asymptotically approaches zero. this is a rather unrealistic scenario in the context of the schrödinger equation. still, one can study the dynamics on a given time range, say t ∈ [0, T], where T denotes the time spent by the mass to reduce its initial value by a factor e⁻¹. in other words, T = κ⁻¹ is equivalent to the lifetime of a decaying system. one may thus disregard the dynamics for t > T. to amend this issue, the second mass term µrck(t) has been introduced, which transits from µrck(0) = 1 + µ0 to µrck(t → ∞) = µ0. thus, there is no need to introduce any artificial truncation on the time domain. the hamiltonian associated with this mass term will be called the regularized caldirola-kanai oscillator. despite the apparent advantages of the regularized system, analytic expressions for σ(t) are significantly more complicated with respect to those obtained from µck(t). still, exact results can be obtained. the discussion is thus divided for each case separately.

4.1. caldirola-kanai case
the so-called caldirola-kanai system is another well-known nonstationary problem, characterized by a time-dependent mass decaying exponentially in time. it was independently introduced by caldirola [42] and kanai [43] in an attempt to describe the quantum counterpart of a damped oscillator. this model has been addressed by different means, such as using a fourier transform to map it into a parametric oscillator [32], and using the quantum arnold transformation [30]. for this particular case, a constant frequency ω²(t) = w1² and a driving force F(t) = a0 cos(νt), for ν, a0 ∈ ℝ, are considered. this leads to a forced caldirola-kanai oscillator hamiltonian [10, 44] of the form
\[ \hat{H}_{CK}(t) = \frac{e^{2\kappa t}}{2 m_0}\,\hat{p}_x^2 + e^{-2\kappa t}\,\frac{m_0 w_1^2}{2}\,\hat{x}^2 + a_0\cos(\nu t)\,\hat{x}. \qquad (52) \]
from the results obtained in previous sections, one gets the solutions to the ermakov and non-homogeneous equations as
\[ \sigma(t) = \left( a\,q_1^2(t) + b\,q_1(t)\,q_2(t) + c\,q_2^2(t) \right)^{\frac{1}{2}}, \qquad \gamma(t) = \gamma_1\,q_1(t) + \gamma_2\,q_2(t) + \gamma_p(t), \qquad (53) \]
respectively, with γ1 and γ2 arbitrary real constants, \(b^2 - 4ac = -\frac{16\,w_0^2}{w_1^2 - \kappa^2}\), and
\[ q_1(t) = \cos\!\left(\sqrt{w_1^2 - \kappa^2}\; t\right), \qquad q_2(t) = \sin\!\left(\sqrt{w_1^2 - \kappa^2}\; t\right), \qquad \gamma_p(t) = a_0\,e^{-\kappa t}\,\frac{(w_1^2 - \nu^2)\cos(\nu t) - 2\kappa\nu\sin(\nu t)}{(w_1^2 + \nu^2)^2 - 4\nu^2(w_1^2 - \kappa^2)}. \qquad (54) \]
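a small numerical sketch of the closed-form ingredients in eqs. (53)–(54) follows; it is an illustration, not the paper's code. the parameter choices mirror the text (w0 = a = b = 1, w1 = 2, κ = 0.5), and c is then fixed by the constraint quoted after eq. (53).

```python
import numpy as np

# caldirola-kanai case: sigma(t) from q1, q2 and the zeros of
# W_mu(t) = sigma*mu' - sigma'*mu, at which the heisenberg relation saturates.
w0, a, b, w1, kappa = 1.0, 1.0, 1.0, 2.0, 0.5
Weff = np.sqrt(w1**2 - kappa**2)
c = (b**2 + 16*w0**2/(w1**2 - kappa**2)) / (4*a)   # from b^2 - 4ac = -16 w0^2/(w1^2 - k^2)

t = np.linspace(0.0, 2.0, 4001)                    # t in [0, 1/kappa], as in the text
q1, q2 = np.cos(Weff*t), np.sin(Weff*t)
sigma = np.sqrt(a*q1**2 + b*q1*q2 + c*q2**2)
mu = np.exp(-kappa*t)                              # eq. (51a)

dsigma = np.gradient(sigma, t)                     # numerical derivatives suffice here
W_mu = sigma*(-kappa*mu) - dsigma*mu

sign_change = np.where(np.diff(np.sign(W_mu)) != 0)[0]
print("approximate zeros of W_mu(t):", t[sign_change])
```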
in the sequel, κ = 0.5 is considered, so that the caldirola-kanai oscillator is constrained to the time interval t ∈ [0, 2]; further discussion concerning the dynamics will be restricted to this time interval. it is worth recalling that the zeros of Wµ(t) correspond to the times at which the heisenberg uncertainty relationship saturates. although the expression for Wµ(t) is rather simple in this case, determining the zeros amounts to solving a transcendental equation. thus, to get further insight, one may analyze figure 1a, which depicts the behavior of such a function for µck(t) (solid blue). from the latter, one can see that zeros do indeed exist, and thus one should expect points in time at which the heisenberg inequality saturates. despite the latter, the schrödinger-robertson inequality saturates at all times. in figure 1b, one can see the behavior of the variances, from which it is clear that the variance in the position blows up as time passes, whereas the momentum variance squeezes indefinitely, approaching zero asymptotically. this odd behavior results from a mass term that quickly decays, approaching zero but never converging to it. for those reasons, a truncation of the time interval was introduced above in the form of a mean lifetime, which in this case becomes T = κ⁻¹ = 2. in this form, one still has a realistic behavior for t ∈ (0, 2). the previous results can be verified by looking at the probability densities associated with the solutions ψn(x,t) and the coherent state ξα(x,t), which are depicted in figure 2. from those probability densities, one may see the increase in the position variance (∆x̂)²α, for the wavepacket spreads rapidly in time, to the point that, for times t > 4, it is almost indistinguishable. for completeness, the classical trajectory is depicted as a dashed black curve in figure 2c, where the initial conditions ⟨x̂⟩t0 = 2 and ⟨p̂x⟩t0 = 0 have been used.
figure 1. (a) Wµ(t) = σ(t)µ̇(t) − σ̇(t)µ(t) for the caldirola-kanai mass term µck(t). (b) variances (∆x̂)²α (solid blue), (∆p̂x)²α (dashed red), the schrödinger-robertson uncertainty minimum (dotted green), and the heisenberg uncertainty minimum (thick solid black) associated with the coherent states ξα(x,t) and the mass term µck(t). the parameters have been fixed as a = c = w0 = 1, w1 = 2, and κ = 0.5.
figure 2. probability density pn = |ψn(x,t)|² for n = 0 (a), n = 1 (b), and pα = |ξα(x,t)|² (c) associated with the caldirola-kanai mass term µck(t). for simplicity, the external force F(t) and γ(t) have been fixed to zero. the rest of the parameters have been fixed as w0 = a = b = 1, w1 = 2, and κ = 0.5.

4.2. regularized caldirola-kanai
in this section, the regularized caldirola-kanai oscillator is introduced so that it amends the difficulties found in the caldirola-kanai case for t ≫ T. this model is characterized by a constant frequency ω²(t) = w1² and the mass term µrck(t) of (51b), with w1, µ0, κ > 0. the mass term converges to a constant value (different from zero), and the anomalies found in the conventional caldirola-kanai case are fixed. the main consequence of the mass regularization is that the classical equation of motion is not as trivial as in section 4.1. in turn one has
\[ \ddot{q}(t) + \left( w_1^2 - \frac{\kappa^2}{1 + \mu_0\,e^{\kappa t}} \right) q(t) = 0. \qquad (55) \]
figure 3. (a) Wµ(t) = σ(t)µ̇(t) − σ̇(t)µ(t) for the regularized caldirola-kanai mass term µrck(t). (b) variances (∆x̂)²α (solid blue), (∆p̂x)²α (dashed red), the schrödinger-robertson uncertainty minimum (dotted green), and the heisenberg uncertainty minimum (thick solid black) associated with the coherent states ξα(x,t) and the mass term µrck(t). the parameters have been fixed as a = c = w0 = 1, w1 = 2, µ0 = 0.3, and κ = 0.5.
two linearly independent solutions to the corresponding linear equation (15) can be found as
\[ q_1(t) = z(t)^{\,i\frac{w_1}{\kappa}}\; {}_2F_1\!\left( a_1,\, a_2;\; 1 - 2i\frac{w_1}{\kappa};\; -\frac{1}{z(t)} \right), \qquad q_2(t) = q_1^*(t), \qquad (56) \]
where \(z(t) = \mu_0\,e^{\kappa t}\), \(a_1 = -i\frac{w_1}{\kappa} - i\sqrt{\frac{w_1^2}{\kappa^2} - 1}\), and \(a_2 = -i\frac{w_1}{\kappa} + i\sqrt{\frac{w_1^2}{\kappa^2} - 1}\). on the other hand, \({}_2F_1(a,b;c;z)\) stands for the hypergeometric function [33], which converges in the complex unit disk |z| < 1. given that z(t): ℝ → (1, ∞), the solutions q1,2(t) in (56) converge for t ∈ ℝ. since both solutions in (56) are complex-valued, with q2 = q1*, one can construct a real-valued solution to the ermakov equation by taking re[q1(t)] and im[q1(t)] as the two real, linearly independent solutions. to simplify the ongoing discussion, the external force is considered null, F(t) = 0. one thus obtains
\[ \sigma^2(t) = a\,\mathrm{re}[q_1(t)]^2 + b\,\mathrm{re}[q_1(t)]\,\mathrm{im}[q_1(t)] + c\,\mathrm{im}[q_1(t)]^2, \qquad (57) \]
\[ \gamma(t) = \gamma_1\,\mathrm{re}[q_1(t)] + \gamma_2\,\mathrm{im}[q_1(t)], \qquad (58) \]
where the wronskian of the two linearly independent solutions re q1 and im q1 becomes W0 = w1, leading to the constraint \(b^2 - 4ac = -4\,\frac{w_0^2}{w_1^2}\). similarly to the caldirola-kanai case, the heisenberg uncertainty relation saturates at times tm such that Wµ(tm) = 0. in this case, an analytic expression for such points is fairly complicated; instead, one may look at the behavior of Wµ(t) depicted in figure 3a, from which it is clear that such points exist. on the other hand, figure 3b reveals that, in contradistinction to the caldirola-kanai case, the position variance does not grow indefinitely in time. this is rather expected since, for asymptotic times t ≫ 1, the mass term converges to a finite value different from zero; that is, the hamiltonian becomes stationary for asymptotic times. before concluding, the probability densities for ψn(x,t) and ξα(x,t) are shown in figure 4. in the latter, it can be verified that the width of the wavepackets oscillates in a bounded way for times t > 2. particularly, for the coherent-state case of figure 4c, the dynamics of the wavepacket can be identified clearly, where the maximum point follows the corresponding classical trajectory (dashed black). therefore, there is no need to introduce a truncation time T, for the mass converges to a constant value different from zero, remaining physically reasonable at all times.
figure 4. probability density pn = |ψn(x,t)|² for n = 0 (a), n = 1 (b), and pα = |ξα(x,t)|² (c) associated with the regularized caldirola-kanai mass term µrck(t). the rest of the parameters have been fixed as w0 = a = b = 1, w1 = 2, µ0 = 0.3, and κ = 0.5.
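the hypergeometric solution (56) is straightforward to evaluate numerically; the following sketch (with parameter values mirroring the text, everything else assumed) builds q1(t) with mpmath and extracts the two real solutions used in eqs. (57)–(58).

```python
import numpy as np
from mpmath import mp, hyp2f1, exp, sqrt, re, im

# complex solution q1(t) of eq. (56) for the regularized caldirola-kanai mass
mp.dps = 30
w1, kappa, mu0 = mp.mpf(2), mp.mpf(0.5), mp.mpf(0.3)
a1 = -1j*w1/kappa - 1j*sqrt(w1**2/kappa**2 - 1)
a2 = -1j*w1/kappa + 1j*sqrt(w1**2/kappa**2 - 1)
c0 = 1 - 2j*w1/kappa

def q1(t):
    z = mu0*exp(kappa*t)
    return z**(1j*w1/kappa) * hyp2f1(a1, a2, c0, -1/z)

ts = np.linspace(0.0, 6.0, 13)
vals = [q1(mp.mpf(t)) for t in ts]
re_q1 = [float(re(v)) for v in vals]   # real, linearly independent solution no. 1
im_q1 = [float(im(v)) for v in vals]   # real, linearly independent solution no. 2
print(re_q1[:3], im_q1[:3])
```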
5. conclusions
in this work, the class of form-preserving point transformations has been used to construct the constants of motion for the family of time-dependent mass oscillators. furthermore, by exploiting the underlying weyl-heisenberg algebra fulfilled by the quantum invariants, it was possible to find exact expressions for the expectation values of the position and momentum observables. the latter revealed that the coherent states are represented by gaussian wavepackets whose maxima follow the corresponding classical trajectory. besides the latter properties, it was also found that, in general, the schrödinger-robertson uncertainty relation saturates at all times, whereas the heisenberg one gets minimized only at some times. still, two special time-dependent hamiltonians exist for which the heisenberg inequality saturates at all times, one of which is the stationary limit case, as expected. remarkably, the newly introduced regularized caldirola-kanai mass term admits exact solutions that regularize the unusual behavior observed in the conventional caldirola-kanai case. more precisely, the variances become bounded, as do the expectation values. this allows obtaining localization of particles, which is desired in physical implementations such as traps of charged particles.

acknowledgements
the author acknowledges the support from the project "physicist on the move ii" (kineó ii), czech republic, grant no. cz.02.2.69/0.0/0.0/18_053/0017163. this research has also been funded by consejo nacional de ciencia y tecnología (conacyt), mexico, grant no. a1-s-24569.

references
[1] f. cooper, a. khare, u. sukhatme. supersymmetry in quantum mechanics. world scientific, singapore, 2001.
[2] f. schwabl. quantum mechanics. springer-verlag, berlin, 3rd edn., 2002.
[3] a. bohm, a. mostafazadeh, h. koizumi, et al. the geometric phase in quantum systems: foundations, mathematical concepts, and applications in molecular and condensed matter physics. springer-verlag, berlin, 2003.
[4] w. paul. electromagnetic traps for charged and neutral particles. reviews of modern physics 62(3):531–540, 1990. https://doi.org/10.1103/revmodphys.62.531.
[5] m. combescure. a quantum particle in a quadrupole radio-frequency trap. annales de l'ihp physique théorique 44(3):293–314, 1986. http://www.numdam.org/item/aihpa_1986__44_3_293_0/.
[6] d. e. pritchard. cooling neutral atoms in a magnetic trap for precision spectroscopy. physical review letters 51:1336–1339, 1983. https://doi.org/10.1103/physrevlett.51.1336.
[7] r. j. glauber. quantum theory of optical coherence, chap. the quantum mechanics of trapped wavepackets, pp. 577–594. john wiley & sons, ltd, 2006. https://doi.org/10.1002/9783527610075.ch15.
[8] b. m. mihalcea, s. lynch. investigations on dynamical stability in 3d quadrupole ion traps. applied sciences 11(7):2938, 2021. https://doi.org/10.3390/app11072938.
[9] h. r. lewis, w. b. riesenfeld. an exact quantum theory of the time-dependent harmonic oscillator and of a charged particle in a time-dependent electromagnetic field. journal of mathematical physics 10(8):1458–1473, 1969. https://doi.org/10.1063/1.1664991.
[10] v. v. dodonov, v. i. man'ko. coherent states and the resonance of a quantum damped oscillator. physical review a 20:550–560, 1979. https://doi.org/10.1103/physreva.20.550.
[11] v. v. dodonov, o. v. man'ko, v. i. man'ko. quantum nonstationary oscillator: models and applications.
journal of russian laser research 16(1):1–56, 1995. https://doi.org/10.1007/bf02581075.
[12] i. ramos-prieto, a. r. urzúa, m. fernández-guasti, h. m. moya-cessa. ermakov-lewis invariant for two coupled oscillators. journal of physics: conference series 1540(1):012009, 2020. https://doi.org/10.1088/1742-6596/1540/1/012009.
[13] v. v. dodonov. invariant quantum states of quadratic hamiltonians. entropy 23(5):634, 2021. https://doi.org/10.3390/e23050634.
[14] v. v. dodonov, m. b. horovits. energy and magnetic moment of a quantum charged particle in time-dependent magnetic and electric fields of circular and plane solenoids. entropy 23(12):1579, 2021. https://doi.org/10.3390/e23121579.
[15] k. zelaya, v. hussin. time-dependent rational extensions of the parametric oscillator: quantum invariants and the factorization method. journal of physics a: mathematical and theoretical 53(16):165301, 2020. https://doi.org/10.1088/1751-8121/ab78d1.
[16] k. zelaya. nonstationary deformed singular oscillator: quantum invariants and the factorization method. journal of physics: conference series 1540:012017, 2020. https://doi.org/10.1088/1742-6596/1540/1/012017.
[17] k. zelaya, i. marquette, v. hussin. fourth painlevé and ermakov equations: quantum invariants and new exactly-solvable time-dependent hamiltonians. journal of physics a: mathematical and theoretical 54(1):015206, 2020. https://doi.org/10.1088/1751-8121/abcab8.
[18] v. g. bagrov, b. f. samsonov, l. a. shekoyan. darboux transformation for the nonsteady schrödinger equation. russian physics journal 38(7):706–712, 1995. https://doi.org/10.1007/bf00560273.
[19] k. zelaya, o. rosas-ortiz. exactly solvable time-dependent oscillator-like potentials generated by darboux transformations. journal of physics: conference series 839:012018, 2017. https://doi.org/10.1088/1742-6596/839/1/012018.
[20] a. contreras-astorga. a time-dependent anharmonic oscillator. journal of physics: conference series 839:012019, 2017. https://doi.org/10.1088/1742-6596/839/1/012019.
[21] r. razo, s. cruz y cruz. new confining optical media generated by darboux transformations. journal of physics: conference series 1194:012091, 2019. https://doi.org/10.1088/1742-6596/1194/1/012091.
[22] j. cen, a. fring, t. frith. time-dependent darboux (supersymmetric) transformations for non-hermitian quantum systems. journal of physics a: mathematical and theoretical 52(11):115302, 2019. https://doi.org/10.1088/1751-8121/ab0335.
[23] a. contreras-astorga, v. jakubský. photonic systems with two-dimensional landscapes of complex refractive index via time-dependent supersymmetry. physical review a 99:053812, 2019. https://doi.org/10.1103/physreva.99.053812.
[24] a. contreras-astorga, v. jakubský. multimode two-dimensional pt-symmetric waveguides. journal of physics: conference series 1540:012018, 2020. https://doi.org/10.1088/1742-6596/1540/1/012018.
[25] w.-h. steeb. invertible point transformations and nonlinear differential equations. world scientific publishing, singapore, 1993. https://doi.org/10.1142/1987.
[26] k. zelaya, o. rosas-ortiz. quantum nonstationary oscillators: invariants, dynamical algebras and coherent states via point transformations. physica scripta 95(6):064004, 2020. https://doi.org/10.1088/1402-4896/ab5cbf.
[27] k. zelaya, v. hussin. point transformations: exact solutions of the quantum time-dependent mass nonstationary oscillator. in m. b. paranjape, r. mackenzie, z. thomova, et al. (eds.), quantum theory and symmetries, pp. 295–303.
springer international publishing, cham, 2021. https://doi.org/10.1007/978-3-030-55777-5_28.
[28] a. fring, r. tenney. exactly solvable time-dependent non-hermitian quantum systems from point transformations. physics letters a 410:127548, 2021. https://doi.org/10.1016/j.physleta.2021.127548.
[29] k. zelaya, o. rosas-ortiz. exact solutions for time-dependent non-hermitian oscillators: classical and quantum pictures. quantum reports 3(3):458–472, 2021. https://doi.org/10.3390/quantum3030030.
[30] v. aldaya, f. cossío, j. guerrero, f. f. lópez-ruiz. the quantum arnold transformation. journal of physics a: mathematical and theoretical 44(6):065302, 2011. https://doi.org/10.1088/1751-8113/44/6/065302.
[31] n. ünal. quasi-coherent states for the hermite oscillator. journal of mathematical physics 59(6):062104, 2018. https://doi.org/10.1063/1.5016897.
[32] i. ramos-prieto, m. fernández-guasti, h. m. moya-cessa. quantum harmonic oscillator with time-dependent mass. modern physics letters b 32(20):1850235, 2018. https://doi.org/10.1142/s0217984918502354.
[33] f. olver, d. lozier, r. boisvert, c. clark. nist handbook of mathematical functions. cambridge university press, new york, 2010. isbn 0521140633.
[34] v. p. ermakov. second-order differential equations: conditions of complete integrability. universita izvestia kiev series iii 20(9):1–25, 1880.
[35] a. o. harin. "second-order differential equations: conditions of complete integrability" (english translation). applicable analysis and discrete mathematics 2:123–145, 2008.
[36] d. schuch. quantum theory from a nonlinear perspective: riccati equations in fundamental physics. springer, cham, 2018.
[37] a. o. barut, l. girardello. new "coherent" states associated with non-compact groups. communications in mathematical physics 21(1):41–55, 1971. https://doi.org/10.1007/bf01646483.
[38] r. j. glauber. coherent and incoherent states of the radiation field. physical review 131:2766–2788, 1963. https://doi.org/10.1103/physrev.131.2766.
[39] a. perelomov. generalized coherent states and their applications. springer-verlag, berlin, 1986.
[40] h. p. robertson. the uncertainty principle. physical review 34:163–164, 1929. https://doi.org/10.1103/physrev.34.163.
[41] e. schrödinger. zum heisenbergschen unscharfeprinzip. proceedings of the prussian academy of sciences 19:296–303, 1929.
[42] p. caldirola. forze non conservative nella meccanica quantistica. il nuovo cimento 18:393–400, 1941.
https://doi.org/10.1007/bf02960144.
[43] e. kanai. on the quantization of the dissipative systems. progress of theoretical physics 3(4):440–442, 1948. https://doi.org/10.1143/ptp/3.4.440.
[44] m. dernek, n. ünal. quasi-coherent states for damped and forced harmonic oscillator. journal of mathematical physics 54(9):092102, 2013. https://doi.org/10.1063/1.4819261.

a complete parametric solutions of eigenstructure assignment by state-derivative feedback for linear control systems
t. h. s. abdelaziz, m. valášek

in this paper we introduce a complete parametric approach for solving the problem of eigenstructure assignment via state-derivative feedback for linear systems. this problem is always solvable for any controllable system iff the open-loop system matrix is nonsingular. in this work, two parametric solutions to the feedback gain matrix are introduced that describe the available degrees of freedom offered by the state-derivative feedback in selecting the associated eigenvectors from an admissible class. these freedoms can be utilized to improve robustness of the closed-loop system. accordingly, the sensitivity of the assigned eigenvalues to perturbations in the system and gain matrix is minimized. numerical examples are included to show the effectiveness of the proposed approach.
keywords: eigenstructure assignment, state-derivative feedback, linear control systems, feedback stabilization, parametrization.

1 introduction
eigenstructure assignment is one of the basic techniques for designing linear control systems. the eigenstructure assignment problem is the problem of assigning both a given self-conjugate set of eigenvalues and the corresponding eigenvectors. assigning the eigenvalues allows one to alter the stability characteristics of the system, while assigning the eigenvectors alters the transient response of the system. eigenstructure assignment via state feedback has provided design methods for a wide class of linear systems under full-state feedback, with the objective of stabilizing the control system. the parametric solution of eigenstructure assignment for state feedback has been studied by many researchers [6–10]. fahmy and tantawy [7] and fahmy and o'reilly [8–9] have developed solutions to the eigenstructure assignment problem, in which a parametric characterization of the assignable eigenvalues and generalized eigenvectors is presented. duan [6] presented two complete parametric approaches for eigenstructure assignment in linear systems via state feedback. this methodology is deeply utilized in this research work.
this paper focuses on a special feedback that uses only state derivatives instead of full-state feedback; this feedback is therefore called state-derivative feedback. the problem of arbitrary eigenstructure assignment using full state-derivative feedback naturally arises. the motivation for state-derivative feedback comes from controlled vibration suppression of mechanical systems. the main sensors of vibration are accelerometers. from accelerations we can reconstruct velocities with reasonable accuracy, but not the displacements. therefore the available signals for feedback are accelerations and velocities only, and these are exactly the derivatives of the states of the mechanical systems, which are the velocities and displacements. direct measurement of the state is difficult to achieve. one necessary condition for a control strategy to be implementable is that it must use the available measured responses to determine the control action. all of the previous research in control has assumed that all of the states can be directly measured (i.e., that there is full-state feedback). many papers have been published on controlling this class of systems (e.g. [12–17]), describing acceleration feedback for controlled vibration suppression. however, the eigenstructure assignment approach for feedback gain determination has not been used at all or has not been solved generally.
other papers dealing with acceleration feedback for mechanical systems are [18–19], but here the feedback uses all states (positions, velocities) and accelerations additionally. abdelaziz and valasek [1–3] have recently presented an eigenvalue assignment technique via state-derivative feedback for single-input and multi-input time-invariant linear systems. eigenstructure assignment via state-derivative feedback is introduced in [4-5]. in this paper, two complete parametric approaches for eigenstructure assignment in linear systems via state-derivative feedback are proposed. two complete parametric expressions for the closed-loop eigenvector matrices and the feedback gains are established in terms of closed-loop eigenvalues and a group of parameter vectors. both the closed-loop eigenvalues and this group of parameters can be properly chosen to produce a closed-loop system with some additional desired system specifications. the necessary and sufficient conditions for the existence of the eigenstructure assignment problem are described. the proposed controller is based on the measurement and feedback of the state derivatives of the system. this work has successfully extended previous techniques by state feedback and has modified them to state-derivative feedback. finally, numerical examples are included to demonstrate the effectiveness of this approach. the main contribution of this work is an efficient technique that solves the eigenstructure assignment problem via state-derivative feedback systems. the procedure defined here represents a unique treatment for extending the eigenstructure assignment technique using the state-derivative feedback in the literature. this paper is organized as follows. in the next section, the problem formulation and the necessary and sufficient conditions for the existence of the eigenstructure assignment problem are described. additionally, two complete paramet© czech technical university publishing house http://ctn.cvut.cz/ap/ 19 czech technical university in prague acta polytechnica vol. 45 no. 6/2005 a complete parametric solutions of eigenstructure assignment by state-derivative feedback for linear control systems t. h. s. abdelaziz, m. valášek in this paper we introduce a complete parametric approach for solving the problem of eigenstructure assignment via state-derivative feedback for linear systems. this problem is always solvable for any controllable systems iff the open-loop system matrix is nonsingular. in this work, two parametric solutions to the feedback gain matrix are introduced that describe the available degrees of freedom offered by the state-derivative feedback in selecting the associated eigenvectors from an admissible class. these freedoms can be utilized to improve robustness of the closed-loop system. accordingly, the sensitivity of the assigned eigenvalues to perturbations in the system and gain matrix is minimized. numerical examples are included to show the effectiveness of the proposed approach. keywords: eigenstructure assignment, state-derivative feedback, linear control systems, feedback stabilization, parametrization. ric solutions to the eigenstructure assignment problem via state-derivative feedback are presented. in section 3, illustrative examples are presented. finally, conclusions are discussed in section 4. 
2 eigenstructure assignment by state-derivative feedback for time-invariant systems in this section, we present two complete parametric approaches for solving the eigenstructure assignment problem via state-derivative feedback for linear time-invariant systems. 2.1 eigenstructure assignment problem formulation consider a linear, time-invariant, completely controllable system �( ) ( ) ( ), ( )x ax bu x xt t t t� � �0 0 , (1) where �( )x t n�r , x( )t n�r and u( )t m�r are the state-derivative, the state and the control input vectors, respectively, ( )m n� , while a � �r n n and b � �rn m are the system and control gain matrices, respectively. the fundamental assumption imposed on the system is that the system is completely controllable and matrix b has a full column rank m. the objective is to stabilize the system by means of a linear feedback that enforces the desired characteristic behavior for the states. the problem is to find the state-derivative feedback control law u k x( ) �( )t t� � , (2) which assigns prescribed closed-loop eigenvalues and corresponding eigenvectors that stabilize the system and achieve the desired performance. here, the first derivative vector of state-space �( )x t is utilized instead of the vector of state-space x( )t . then, the closed-loop system dynamics becomes �( ) ( ) ( )x i bk a xt tn� � �1 , (3) where in is the n×n identity matrix. in what follows, we assume that (in+ bk) has a full rank in order that the closed-loop system is well defined. the closed-loop characteristic polynomial is given by � det ( )� � � ���i i bk an n 1 0. (4) let �� � � � � �� �i i i s s n, , , , ,c 1 1� be a set of desired self-conjugate eigenvalues, where s is the number of distinct eigenvalues, and denote the algebraic and geometric multiplicity of the ith eigenvalue �i by mi and qi, respectively, ( )1 � �q mi i . the length of qi chains of generalized eigenvectors with �i are denoted by pij, ( j � 1, …, qi). then in the jordan canonical form of the closed-loop matrix, there are qi blocks associated with the ith eigenvalue �i of orders pij. it is satisfying that p nijj q i s i � �� 11 . in this work, we restrict ourselves by mi � qi. this means that the multiple eigenvalues are not split; they are placed in one jordan block. the partial multiplicities are not placed. therefore, the rosenbrock’s inequalities are fulfilled. the right eigenvector and generalized eigenvectors of the closed-loop matrix with �i are denoted by vij k n�c , i � 1, …, s, j � 1, …, qi, k � 1, …, pij. according to the definition of the right eigenvector and generalized eigenvectors for multiple eigenvalues, then � �( )i bk a in i n ijk ijk� � �� �1 1� v v , vij0 � 0, �i j k, , . (5) this equation demonstrates the relation of assignable right generalized eigenvectors with the associated eigenvalue. the notations are defined as � v v v� � �1, ,� s n nc , v v vi i iq n m i i� � �� � �� � �1, ,� c , vij ij ij p n pij ij� � �� � �� � � v v1 , ,� c , where vi contains all right eigenvectors and generalized eigenvectors associated with the eigenvalue �i, and det( )v � 0. 
then, the eigenstructure assignment problem for system (1) via state-derivative feedback can be stated as follows: eigenstructure assignment problem: given the real pair (a, b) and the desired self-conjugate set �� �1, ,� n �c , find the real state-derivative feedback gain matrix k � �rm n that will make the closed-loop matrix ( )i bk an � �1 have admissable eigenvalues and the associated set of right eigenvector and generalized eigenvector matrix v. the necessary and sufficient conditions that ensure solvability of the eigenstructure assignment problem via state-derivative feedback are presented in the following lemma. lemma 1: the eigenstructure assignment problem for the real pair (a, b) is solvable for any arbitrary nonzero, self-conjugate, closed-loop poles, if (a, b) is completely controllable, that is � rank b ab a b, , ,� n n� �1 , or � rank �i a bn n� �, , � �� c , and a is nonsingular. proof: from this condition, the closed-loop matrix must be defined. this means the matrix ( )i bkn � is of full rank and det( )i bkn � � 0. then, from (5) it can easily be rewritten as ( )i bk av vn � � �1 � (6) where � � �cn n is in jordan canonical form with the desired eigenvalues on the diagonal. then av v bkv� �� � (7) which can be written as i bk av vn � � � � � 1 1. (8) then, det( ) det( ) det( ) det( )i bk av v an � � � � � � � � � 1 1 1 0. (9) since v must be nonsingular, then det( )a � 0 and det( )� � 0. therefore, matrices a and � should be of full rank in order for the closed-loop matrix to be defined. � thus, the necessary and sufficient conditions for existence of the solution to the eigenstructure assignment problem via state-derivative feedback is that the system is completely con20 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 45 no. 6/2005 czech technical university in prague trollable and all eigenvalues of the original system are nonzero (a has full rank). we remark that the requirement that matrix � is diagonal, together with the invertibility of v, ensures that the closed-loop system is non-defective. non-defective systems are desirable because the poles of such systems are less sensitive to system parameter perturbation [10]. based on the above necessary and sufficient conditions, two parametric forms are derived for the state-derivative feedback gain matrix k that assigns the desired closed-loop poles. 2.2 eigenstructure assignment the main work now is to find a parametric solution to the state-derivative feedback gain matrix k that assigns the desired closed-loop eigenvalues and associated eigenvectors. equation (5) can be rewritten as a ij k i n ij k n ij kv v v� � � � �� ( ) ( )i bk i bk 1, vij 0 � 0, �i j k, , . (10) let the auxiliary vectors w vij k ij k m� �k c , i s� 1, ,� , j qi� 1, ,� , k pij� 1, ,� , (11) be introduced. the set of w ij k is defined in a similar manner to the set of vij k as follows, w w w� � � � � � � � � 1 , ,� s m nc , w w wi i iq m m i i� � � � � � � � � 1, ,� c , wij ij ij p m pij ij� � �� � �� � � w w1 , ,� c . this leads to ( ) ,� �i n ij k i ij k ij k ij k ij ij i a b b 0 0 � � � � � � � � �v w v w v w 1 1 0 0, , �i j k, , . 
, (12) the above equation can be equivalently written in the following compact matrix form, � � � �i n i ij k ij k n ij k ij ki a b i b� � � � � � � � � � � � � � � � � � �, , v w v w 1 1 � � � � � , , , , , .v wij ij i j k 0 00 0 (13) finally, the parametric equation to the right eigenvector and generalized eigenvectors can be expressed as � � � � � � � � i n i ij k n ij k ij k ij k ij k i a b i b� � � � � � � � � � � � �, , , , 1 v w ij i j k 0 � �0, , , . (14) then, parameter vectors � ij k n m� �c are chosen arbitrary under the condition that the columns of matrix v are linearly independent. a parametric solution to the eigenstructure assignment problem via state-derivative feedback is derived from (11) as k wv� �1 (15) where � � v v v w w w� �1 1 1 1( ), , ( ) , ( ), , ( )� � � �� �s s s s . then the feedback gain matrix is parametrized directly in terms of the eigenstructure of the closed-loop system, which can be selected to ensure robustness by exploiting freedom of these parameters. there exists a real feedback gain matrix k if and only if the following three conditions are satisfied: 1. the assigned eigenvalues are symmetric with respect to the real axis. 2. the right generalized eigenvectors �vijk n i iji s j q k p� � � �c , , , , , , , , ,1 1 1� � � are linearly independent and for complex-conjugate poles, � �i i2 1� then v vi j k i j k 2 1 � . 3. there exists a set of vectors �w ijk m i iji s j q k p� � � �c , , , , , , , , ,1 1 1� � � , satisfying (13) and w wi j k i j k 2 1 � for � �i i2 1� . the parametric formula for the state-derivative feedback gain matrix k that assigns the desired closed-loop poles system is now derived. in the following, we obtain the more general parametric solutions of vij k and w ij k in (13). two complete parametric forms are introduced and a new procedure is derived, which yields a parametric expression for k involving free parameter vectors. 2.3 a parametrization approach for eigenstructure assignment the aim now is to find a parametric solution to the eigenstructure assignment problem via state-derivative feedback. we remark that the development of parametric solutions to this problem is useful in that one can then think of solving other important variations of the problem, such as the robust eigenstructure assignment problem, by exploiting freedom of these parameters. the relation demonstrating the assignable right generalized eigenvectors with the eigenvalues is (13). definition 1: a square polynomial matrix p(�) is called a unimodular matrix if its determinant is a nonzero constant. definition 2: polynomial matrix p(�) is a unimodular matrix if and only if p(�) equals the product of some finite number of elementary row (or column) transformation matrices. it is well known that the matrix pair (a, b) is controllable if and only if � rank c� �i a bn n� � � �, , . due to the controllability of (a, b), there exist unimodular matrices p( )� � �c n n and q( ) ( ) ( )� � � � �c n m n m satisfying the following equation: � � p i a b q 0 i( ) , ( ) ,� � �n n� � . (16) partition the polynomial matrix q(�) into the following form q q q q q q q ( ) ( ) ( ) ( ) ( ) ( ) ( ) � � � � � � � � � � �� � � �� � � � 1 2 11 12 21 22 �� � � �� with q1( ) ( ) � � � �c n n m , q2( ) ( ) � � � �c m n m , q11( )� � �cn m, q12( )� � �cn n, q21( )� � �c m m and q22( )� � �c m n. © czech technical university publishing house http://ctn.cvut.cz/ap/ 21 czech technical university in prague acta polytechnica vol. 45 no. 
6/2005 then, converting (16) to the following form, � � p i a b q q 0 i q q ( ) , ( ) ~ ( ) , , ~ ( ) � � � � � � n n� � � � � � � � � � � 1 2 2 2with ( ) . � � � � �� � � �� (17) now, the following theorem gives a parametric solution to the eigenstructure assignment problem via state-derivative feedback. the parametric solutions of vij k and w ij k in (13) can now be given. theorem 1: let the matrix pair (a, b) be controllable, where matrix a � �r n n is nonsingular and matrix b � �r n m has a full column rank m. then all solutions of (13), vij k and w ij k , are given by: v w v ij k ij k ij k i ij k f� � � � � � � � � � � � � � � � � � q q p 1 2 ( ) ~ ( ) ( ) � � � � �� �� � � � � � � � � � � 1 1 0 0 b 0 0 w v w ij k ij ij . , , (18) or, equivalently, as v v w vij k i ij k i i ij k ij k ijf� � � � �q q p b11 12 1 1 0( ) ( ) ( )( ),� � � � 0 and � �w v w wijk i i ij k i i ij k ij kf� � �� � 1 21 22 1 1 � � � �q q p b( ) ( ) ( )( ) , ij 0 � 0 i s� 1, ,� , j qi� 1, ,� , k pij� 1, ,� , where p( )� and q( )� are unimodular matrices satisfying (16), and fij k m�c are arbitrarily free parameter vectors satisfying the following two contraints: det( )v 0� and f fi j k i j k 2 1 � if � �i i2 1� . proof: first, we need to show that the set of vectors satisfying (13) and the set of vectors given by (18) are equal. then, using (13) and (18), we have � � � � � � � � i n i ij k ij k i n i i i i a b i a b q q � � � � � � � � � � �, , ( ) ~ ( ) v w 1 2 � � � � � � � � � � � � � � � � � � � � � � � � fij k i ij k ij k i p b p 0 ( ) ( ) , � � v w1 1 1 � � � i p b b n ij k i ij k ij k ij k ij k f ( )� � � � � � � � � � � � � � � � � � v w v w 1 1 1 1 � � � � � � � � � � � � � � � �i b 0 0 n ij k ij k ij ij i j k , , , , , , . v w v w 1 1 0 0 (19) therefore, the vectors given by (18) satisfy (13). now, we show that vectors vij k and w ij k (i � 1, …, s, j � 1, …, qi, k � 1, …, pij) satisfying (13) can be expressed in the form of (18). from (17) we can obtain � � p i a b 0 i q q ( ) , , ( ) ~ ( ) , , ,� � � � � i i n i n i i i� � � � � � � � � � � � 1 2 1 1 � s. (20) then � � p i a b 0 i q q ( ) , , ( ) ~ ( ) � � � � � i i n i ij k ij k n i i � � � � � � � � � � �v w 1 2� � � � � � � � � � � � � � � �1 v w ij k ij k (21) and � � � p b 0 i 0 ( ) , , , � i ij k ij k n ij k ij k ij f � � � � � � � � � � � � � �v w e v w 1 1 0 ij i j k 0 � �0, , , , , (22) where f ij k ij k i i ij k ij ke v w � � � � � � � � � � � � � � � � � � � � q q 1 2 1 ( ) ~ ( ) � � � � � � � � �, , ,i j k . (23) then from (22) we obtain � �e v w v wijk i ijk ijk ij ij i j k� � � � � �� �p b 0 0( ) , , , , ,� 1 1 0 0 . (24) substituting (24) into (23) we obtain (18). � assuming zero initial conditions and applying the laplace transformation to (1), we obtain x i a bu g u( ) ( ) ( ) ( ) ( )� � � � �� � ��n 1 . (25) then the behaviour of our linear system is described by a rational matrix function ( )�i a bn � �1 of size n×m of a complex variable �. the input-state transfer function of the system can be factorized as g i a b n d( ) ( ) ( ) ( )� � � �� � �� �n 1 1 , (26) where n( )� � �c n m and d( )� � �c m m are right coprime polynomial matrices in �. the above equation can be written as ( ) ( ) ~ ( )� � � �i a n bdn � � � 0, with ~ ( ) ( ) d d � � � � � . (27) now, the following theorem gives a parametric solution to the eigenstructure assignment problem via state-derivative feedback. theorem 2: let the matrix pair (a, b) be controllable, where matrix a � �r n n is nonsingular and matrix b � �rn m has a full column rank m. 
then all solutions of (13), vij k and w ij k , are given by: v w ij k ij k i i ij k if � � � � � � � � � � � �� � � �� � n d n( ) ~ ( ) ( ) ~ � � � �d d d n d ( ) ( ) ! ( ) ~ ( ) � � � � i ij k k k i i f k � � �� � � �� � � � � � � � 1 1 1 1 1 � d d � �� � � �� fij 1. (28) or, equivalently, as � �v vij k i ij k l l i ij k l l k l ijf l f� � �� � � n n 0( ) ! ( ) ,� � � 1 1 0d d , and � �w wijk i ijk l l i ij k l l k l ijf l f� � �� � � ~( ) ! ~ ( ) ,d d 0� � � 1 1 0d d , i s� 1, ,� , j qi� 1, ,� , k pij� 1, ,� , 22 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 45 no. 6/2005 czech technical university in prague where n( )� and ~ ( )d � are polynomial matrices satisfying (27) and fij k m�c are arbitrarily free parameter vectors satisfying the following two contraints: det( )v 0� and f fi j k i j k 2 1 � if � �i i2 1� . proof: we need only to show that the set of vectors given by (28) satisfies (13). then take differential of order l on both sides of (27), and we obtain ( ) ( ) ( ) ~ ( ) � � � � � � � � i a n n b d b n l l l l l l l l l � � � � � � � d d d d d d d 1 1 1 1 0d� � l � � ~ ( ) .d (29) substituting � by �i and postmultiplying by a vector 1 1 l fij k ! � on both sides of (29) gives � � � �( ) ! ( ) ! ~ ( )� � � � � �i n l l i ij k i l l i ij k l f l fi a n b d� �� � 1 11d d d d � � 1 1 1 1 1 1 1 1 1 1 � � � � � � � � � �( )! ( ) ( )! ~ l f l l l i ij k l l d d d d� � � n b d� �( ) .�i ijkf �1 (30 ) summing up all the equations in (30) and using (28) we obtain ( ) , , � �i n ij k i ij k ij k ij k ij ij i a b b 0 0 � � � � � � � � �v w v w v w 1 1 0 0 , , , ,�i j k (31) where the assignable chains of right eigenvector generalized eigenvectors associated with the assigned eigenvalue �i are given by � �vij k i ij k i ij k k k f f k � � � � � � � � n n n ( ) ( ) ( ) ! ( � � � � d d d d 1 1 1 1 1 � � �� i ij ijf .) , 1 0v � 0 (32) similarity, the gain-eigenvector product � �w ijk i ijk i ijk k k f f k � � � � � � � � ~ ( ) ~ ( ) ( ) ! d d� � � � d d d d 1 1 1 1 1 � � �~( ) ,d 0� i ij ijf .1 0w � (33) summing up, all the equations (28) hold. � then, theorems 1 and 2 give two complete and explicit parametric solutions with the complete and explicit freedom of eigenstructure assignment via state-derivative feedback. these solutions are expressed by the eigenvalues and a group of free parameter vectors, fij k. by specially choosing the free parameter vectors in (18) and (28), solutions with desired properties can be obtained. remark 1: it should be noted that for the case of distinct eigenvalues (mi � qi � 1, s � n). then, the computations of vij k and w ij k , taking the simple form, are given by: v w i i i i if1 1 1 1 1 2 1 1� � � � � � � � � � � � � � � � � � � � � � � � q q 0 ( ) ~ ( ) � � � or (34) v w i i i i if i n 1 1 1 1 1 1 1 � � � � � � � � � � � �� � � �� � n d ( ) ~ ( ) , , , . � � � remark 2: for the single-input system (m � 1), the parameter vectors f ij k reduce to scalars and, accordingly, the solutions of (18) and (28) are the same (unique), regardless of the choice of f ij k. this leads to the well-known result that solution k in this case is unique [1, 2]. the general expressions for the closed-loop eigenvectors and the feedback gains can immediately be written out as soon as the two polynomial matrix reductions (16) and (26) are carried out. these reductions can be completed by a series of simple elementary matrix transformations [6]. there are two methods for computing the polynomial matrices. 
the first is the smith canonical form, which exploits the fact that for a controllable pair (a, b) the matrix ( ,�i a bn � ) maintains full rank for all values of �. the smith canonical form constructs two unimodular matrices p(�) and q(�) that diagonalized a given polynomial matrix as (16). subject to the controllability of (a, b) the augmented matrix g i i a b i � �� � �� � � �� � n n n m � 0 , (35) can be changed into the form of h p 0 i 0 q � � � �� � � �� ( ) ( ) � � n . (36) by applying a series of row elementary transformations within the upper n rows and a series of column elementary transformations within the last n � m columns, the matrices p(�) and q(�) in the final transformed matrix h are unimodular and automatically satisfying (16). consequently, we can partition q(�) to find n(�) and d(�) as q n d ( ) ( ) ( ) � � � � � � � � �� � � ��. (37) the second approach uses the matrix fraction description (mfd). if all the elements of the matrix are proper rational polynomials, then the matrix may be factored as n(�) d�1(�). the elements of ( )�i a bn � �1 are rational polynomials. thus ( )� � �i a b n dn � � � �1 1( ) ( ). the convenient solution can be found by inspection n i a b( )� �� � �( )n 1 and d i( )� � n , (38) or n i a b( ) adj� �� �( )n and d i a i( )� �� �det( )n n , (39) where adj(.) and det(.) represent, respectively, the adjoint and the determinant of matrix (.). the smith canonical form and mfd approaches both require symbolic manipulation to perform the smith decomposition or matrix inversion. this presents no difficulty when working by hand or using a symbolic package such as maple. © czech technical university publishing house http://ctn.cvut.cz/ap/ 23 czech technical university in prague acta polytechnica vol. 45 no. 6/2005 based on the discussion and analysis above, an algorithm for solving the eigenstructure assignment problem via state-derivative feedback can be given as follows: algorithm input controllable real pair (a, b), where matrix a � �rn n is nonsingular, and a set of n self-conjugate complex numbers {�1, …, �n}. step 1 construct the right coprime matrix polynomials n(�) and d(�) by applying the method presented in this section. step 2 choose arbitrary parameter vectors f ij k m�c (i � 1, …, s, j � 1, ..., qi, k � 1, ..., pij) in such a way that f fi j k i j k 2 1 � implies � �i i2 1� . step 3 calculate the right eigenvectors vij k n�c (i � 1, …, s, j � 1, ..., qi, k � 1, ..., pij) using (28). if the eigenvectors matrix v is singular, then return to step 2 and select different parameters fij k, until v is nonsingular. step 4 compute the gain-eigenvectors w ij k m�c (i � 1, …, s, j � 1, ..., qi, k � 1, ..., pij) using (28), and construct matrix w. step 5 compute the real derivative feedback gain matrix using, k � wv �1. from the above results we can observe that the system poles can always be assigned by a state-derivative feedback controller for any controllable system if and only if the open-loop system matrix a is nonsingular. in the case of single-input, m � 1, there only at most one solution. in the case of multi-input, 1 < m � n, the solution is generally non-unique, and extra conditions must be imposed to specify a solution. the extra freedom can be used to give the closed-loop system other desirable properties. the extra freedom can be used in different ways, for example to decrease the norm of the feedback gain matrix or to improve the condition of the eigenvectors of the closed-loop matrix. 
additionally, it increases the robustness of the closed-loop system against the system parameter perturbation. this issue becomes very important when the system model is not sufficiently precise or the system is subject to parameter uncertainty. then the feedback gain matrix is parameterized directly in terms of the eigenstructure of the closed-loop system, which can be selected to ensure robustness by exploiting freedom of these parameters. eigenstructure assignment is a very flexible technique. it provides access to all the available design freedom. the drawback of eigenstructure assignment is that it has no inherent mechanism for insuring robustness and can assign a robust solution as easily as a catastrophically unrobust solution. then the optimization techniques are used with the objective of finding the optimum design vectors f ij k so that the closed-loop system is robust to parameter variations. 3 illustrative examples in this section, numerical examples are used to illustrate the feasibility and effectiveness of the proposed eigenstructure assignment technique via state-derivative feedback. example 1: consider a controllable, time-invariant, multi-input linear system, �( ) ( ) ( )x x ut t t� � � � � � � � � � � � � � � � � � � � � � 0 1 0 0 0 1 1 0 1 0 0 0 1 1 0 . a pair of right matrix polynomials n(�) and d(�) satisfying (25) and ~ ( )d � can be found as: n( )� �� � � � � � � � � � � � 1 0 0 0 1 , d( )� � � � � � �� � �� � � �� 1 1 12 and ~ ( )d � � � � � � � � � � � �� � � �� 1 1 1 1 . in the following, we consider the assignment of three different cases: case 1: the desired closed-loop eigenvalues are selected as {�1, �2 and �3}. then, the closed-loop eigenvector matrix v, and the corresponding matrix w, can be written as � v n n n� ( ) , ( ) , ( )� � �1 111 2 211 3 311f f f and � w d d d� ~( ) , ~( ) , ~( )� � �1 111 2 211 3 311f f f . specially choosing � f f11 1 31 1 1 0� � , t and � f21 1 0 1� , t. then v � � � � � � � � � � � � � � 1 0 1 1 0 3 0 1 0 and w � � �� � �� � � �� 1 15 1 3 1 0 5 3 . . . finally the state-derivative gain matrix is k wv� � � � � � � � � �� � � �� �1 4 3 1 3 15 0 1 0 5 . . . case 2: the desired closed-loop poles are {�2 and �3 � i}. choosing � f11 1 0 1� , tand � f f21 1 31 1 1 0� � , t. then v � � � � � � � � � � � � � � � � 0 1 1 0 3 3 1 0 0 i i and w � � � � � � � � � �� � � �� 15 0 3 0 3 0 5 3 3 . . . . i i i i . the gain matrix is k wv� � � � � � � � � �� � � �� �1 0 6 01 15 0 1 0 5 . . . . . case 3: the desired eigenvalues are {�1, �1 and �3}. then � �v n n n n� �� �� � �� ( ) , ( ) ( ) , ( )� � � � �1 11 1 1 11 1 1 11 2 2 21 1f f f f d d and � �w d d d d� �� � ~ ( ) , ~ ( ) ~ ( ) , ~ ( )� � � � �1 11 1 1 11 1 1 11 2 2 21 1f f f f d d� � �� . choosing � f f11 1 21 1 1 0� � , t and � f11 2 0 1� , t. 24 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 45 no. 6/2005 czech technical university in prague we have v � � � � � � � � � � � � � � 1 0 1 2 0 3 0 1 0 and w � � �� � �� � � �� 0 5 15 1 3 2 0 5 3 . . . . therefore k wv� � � � � � � � � �� � � �� �1 0 8333 01667 15 0 1 0 5 . . . . . example 2: consider a controllable, multi-input, linear system �( ) . . . . ( )x xt t� � � � � � � � � � � � � � � � � � 2 5 0 5 0 0 5 2 5 2 0 2 2 1 0 0 0 0 1 � � � � � � � u( )t . a pair of right matrix polynomials n( )� and d( )� can be obtained as: n( )� � � � �� � � � � � � � � � 2 5 4 1 0 0 1 and d( ) ( . ) . � � � � � � � � � � � � � � � � � � � 2 2 5 0 5 4 10 2 2 2 . 
in the following, we consider the assignment of three different cases: case 1: the desired closed-loop eigenvalues are selected as {–1, –2 and –3}. specially choosing � f f11 1 31 1 1 0� � , t and � f21 1 0 1� , t. then v � � �� � � � � � � � � � 3 4 1 1 0 1 0 1 0 and w � � � � � � �� � � �� 4 1 0 2 0 0 6667. . the gain matrix is k wv� � � � � � � �� � � �� �1 1 1 3 0 3334 1 13334. . . case 2: the desired closed-loop eigenvalues are {–2 and –3� i}. choosing � f11 1 0 1� , t and � f f21 1 31 1 1 0� � , t. then v � � � � � �� � � � � � � � � � 4 1 2 1 2 0 1 1 1 0 0 i i , w � � � � � � � � � � � � �� � � �� 1 0 4 0 8 0 4 0 8 0 0 6 0 2 0 6 0 2 . . . . . . . . i i i i . the gain matrix k wv� � � � � � � � � � �� � � �� �1 0 4 0 8 2 6 01 0 7 0 4 . . . . . . . case 3: the desired closed-loop eigenvalues are {–2, –2 and –3}. choosing � f f11 1 21 1 1 0� � , t and � f11 2 0 1� , t. then v � � �� � � � � � � � � � 1 4 1 1 0 1 0 1 0 and w � � � � � � �� � � �� 0 1 0 1 0 0 6667. . therefore k wv� � � � � � � � �� � � �� �1 0 0 1 01667 0 8333 0 6667. . . . 4 conclusions in this paper, two complete parametric approaches for solving the eigenstructure assignment problem via state-derivative feedback are proposed. the necessary conditions to ensure solvability are that the system is completely controllable and the open-loop system matrix is nonsingular. the main result of this work is an efficient computational algorithm for solving the eigenstructure assignment problem of a linear system via state-derivative feedback. this parametric solution describes the available degrees of freedom offered by the state-derivative feedback in selecting the associated eigenvectors from an admissible class. the extra degrees of freedom on the choice of feedback gains are exploited to further improve the closed-loop robustness against perturbation. the main contribution of the present work is a compact parametric expression for the feedback controller gain matrix explicitly characterized by a set of free parameter vectors. the principle benefits of the explicit characterization of parametric class of feedback controllers lie in the ability to directly accommodate various different design criteria. references [1] abdelaziz. t. h. s., valášek, m.: “pole-placement for siso linear systems by state-derivative feedback.” iee proceeding part d: control theory & applications, 151, 4, 377–385, 2004. [2] abdelaziz, t. h. s., valášek, m.: “a direct algorithm for pole placement by state-derivative feedback for single-input linear systems.” acta polytechnica, vol. 43 (2003), no. 6, p. 52–60. [3] abdelaziz, t. h. s., valášek, m.: “a direct algorithm for pole placement by state-derivative feedback for multi-input linear systems – nonsingular case.” (accepted in kybernetika, 2004). [4] abdelaziz, t. h. s., valášek, m.: “eigenstructure assignment by state-derivative and partial output-derivative feedback for linear time-invariant control systems.” acta polytechnica, vol. 44 (2004), no. 4, p. 54–60. [5] abdelaziz, t. h. s., valášek, m.: “parametric solutions of eigenstructure assignment by state-derivative feedback for linear control systems.” proceedings of interaction and feedbacks 2003, ut av cr, praha, 2003, p. 5–12. [6] duan, g. r: “solutions of the matrix equation av+bw =vf and their application to eigenstructure assign© czech technical university publishing house http://ctn.cvut.cz/ap/ 25 czech technical university in prague acta polytechnica vol. 45 no. 6/2005 ment in linear systems.” ieee transactions on automatic control, vol. 
38 (1993), no. 2. p. 276–280. [7] fahmy, m. m., tantawy, h. s.: “eigenstructure assignment via linear state feedback control.” international journal of control, vol. 40 (1984), no. 1, p. 161–178. [8] fahmy, m. m., o’reilly, j.: “eigenstructure assignment in linear multivariable systems: a parametric solution.” ieee transactions on automatic control, vol. 28 (1983), no. 10, p. 990–994. [9] fahmy, m. m., o’reilly, j.: “on eigenstructure assignment in linear multivariable systems.” ieee transactions on automatic control, vol. 27 (1982), no. 3, p. 690–693. [10] clarke, t., griffin, s. j., ensor, j.: “a polynomial approach to eigenstructure assignment using projection with eigenvalue trade-off.” international journal of control, vol. 76 (2003), no. 4, p. 403–423. [11] kautsky, j., nichols, n. k., van dooren, p.: “robust pole assignment in linear state feedback.” international journal of control, vol. 41 (1985), p. 1129–1155. [12] preumont, a., loix, n., malaise, d., lecrenier, o.: “active damping of optical test benches with acceleration feedback.” machine vibration, vol. 2 (1993), p. 119–124. [13] preumont, a.: vibration control of active structures, kluwer, 1998. [14] bayon de noyer, m. p., hanagud, s. v.: “single actuator and multi-mode acceleration feedback control.” adaptive structures and material systems, asme, ad, vol. 54 (1997), p. 227–235. [15] bayon de noyer, m. p., hanagud, s. v.: “a comparison of h2 optimized design and cross-over point design for acceleration feedback control.” proceedings of 39th aiaa/asme/ asce/ahs, structures, structural dynamics and materials conference, vol. 4 (1998), p. 3250–3258. [16] olgac, n., elmali, h., hosek, m., renzulli, m.:“ active vibration control of distributed systems using delayed resonator with acceleration feedback.” transactions of asme journal of dynamic systems, measurement and control, vol. 119 (1997), p. 380. [17] kejval, j., sika, z., valášek, m.: “active vibration suppression of a machine.” proceedings of interaction and feedbacks 2000, ut av cr, praha, 2000, p. 75–80. [18] deur, j., peric, n.: a comparative study of servosystems with acceleration feedback.” proceedings of the 35th ieee industry applications conference, roma, italy, vol. 2 (2000), p. 1533–1540. [19] ellis, g.: “cures for mechanical resonance in industrial servo systems.” proceedings of pcim 2001 conference, nuremberg, 2001. doc. taha h. s. abdelaziz e-mail: tahahelmy@yahoo.com department of mechanical engineering faculty of engineering helwan university 1 sherif street, helwan, cairo, egypt doc. ing. michael valášek, drsc. e-mail: valasek@fsik.cvut.cz department of mechanics czech technical university in prague faculty of mechanical engineering karlovo nám. 13 121 35 praha 2, czech republic 26 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 45 no. 6/2005 czech technical university in prague << /ascii85encodepages false /allowtransparency false /autopositionepsfiles true /autorotatepages /none /binding /left /calgrayprofile (dot gain 20%) /calrgbprofile (srgb iec61966-2.1) /calcmykprofile (u.s. 
web coated \050swop\051 v2) /srgbprofile (srgb iec61966-2.1) /cannotembedfontpolicy /error /compatibilitylevel 1.4 /compressobjects /tags /compresspages true /convertimagestoindexed true /passthroughjpegimages true /createjobticket false /defaultrenderingintent /default /detectblends true /detectcurves 0.0000 /colorconversionstrategy /cmyk /dothumbnails false /embedallfonts true /embedopentype false /parseiccprofilesincomments true /embedjoboptions true /dscreportinglevel 0 /emitdscwarnings false /endpage -1 /imagememory 1048576 /lockdistillerparams false /maxsubsetpct 100 /optimize true /opm 1 /parsedsccomments true /parsedsccommentsfordocinfo true /preservecopypage true /preservedicmykvalues true /preserveepsinfo true /preserveflatness true /preservehalftoneinfo false /preserveopicomments true /preserveoverprintsettings true /startpage 1 /subsetfonts true /transferfunctioninfo /apply /ucrandbginfo /preserve /useprologue false /colorsettingsfile () /alwaysembed [ true ] /neverembed [ true ] /antialiascolorimages false /cropcolorimages true /colorimageminresolution 300 /colorimageminresolutionpolicy /ok /downsamplecolorimages true /colorimagedownsampletype /bicubic /colorimageresolution 300 /colorimagedepth -1 /colorimagemindownsampledepth 1 /colorimagedownsamplethreshold 1.50000 /encodecolorimages true /colorimagefilter /dctencode /autofiltercolorimages true /colorimageautofilterstrategy /jpeg /coloracsimagedict << /qfactor 0.15 /hsamples [1 1 1 1] /vsamples [1 1 1 1] >> /colorimagedict << /qfactor 0.15 /hsamples [1 1 1 1] /vsamples [1 1 1 1] >> /jpeg2000coloracsimagedict << /tilewidth 256 /tileheight 256 /quality 30 >> /jpeg2000colorimagedict << /tilewidth 256 /tileheight 256 /quality 30 >> /antialiasgrayimages false /cropgrayimages true /grayimageminresolution 300 /grayimageminresolutionpolicy /ok /downsamplegrayimages true /grayimagedownsampletype /bicubic /grayimageresolution 300 /grayimagedepth -1 /grayimagemindownsampledepth 2 /grayimagedownsamplethreshold 1.50000 /encodegrayimages true /grayimagefilter /dctencode /autofiltergrayimages true /grayimageautofilterstrategy /jpeg /grayacsimagedict << /qfactor 0.15 /hsamples [1 1 1 1] /vsamples [1 1 1 1] >> /grayimagedict << /qfactor 0.15 /hsamples [1 1 1 1] /vsamples [1 1 1 1] >> /jpeg2000grayacsimagedict << /tilewidth 256 /tileheight 256 /quality 30 >> /jpeg2000grayimagedict << /tilewidth 256 /tileheight 256 /quality 30 >> /antialiasmonoimages false /cropmonoimages true /monoimageminresolution 1200 /monoimageminresolutionpolicy /ok /downsamplemonoimages true /monoimagedownsampletype /bicubic /monoimageresolution 1200 /monoimagedepth -1 /monoimagedownsamplethreshold 1.50000 /encodemonoimages true /monoimagefilter /ccittfaxencode /monoimagedict << /k -1 >> /allowpsxobjects false /checkcompliance [ /none ] /pdfx1acheck false /pdfx3check false /pdfxcompliantpdfonly false /pdfxnotrimboxerror true /pdfxtrimboxtomediaboxoffset [ 0.00000 0.00000 0.00000 0.00000 ] /pdfxsetbleedboxtomediabox true /pdfxbleedboxtotrimboxoffset [ 0.00000 0.00000 0.00000 0.00000 ] /pdfxoutputintentprofile () /pdfxoutputconditionidentifier () /pdfxoutputcondition () /pdfxregistryname () /pdfxtrapped /false /createjdffile false /description << /ara /bgr /chs /cht /cze /dan /deu /esp /eti /fra /gre /heb /hrv (za stvaranje adobe pdf dokumenata najpogodnijih za visokokvalitetni ispis prije tiskanja koristite ove postavke. stvoreni pdf dokumenti mogu se otvoriti acrobat i adobe reader 5.0 i kasnijim verzijama.) 
/hun /ita /jpn /kor /lth /lvi /nld (gebruik deze instellingen om adobe pdf-documenten te maken die zijn geoptimaliseerd voor prepress-afdrukken van hoge kwaliteit. de gemaakte pdf-documenten kunnen worden geopend met acrobat en adobe reader 5.0 en hoger.) /nor /pol /ptb /rum /rus /sky /slv /suo /sve /tur /ukr /enu (use these settings to create adobe pdf documents best suited for high-quality prepress printing. created pdf documents can be opened with acrobat and adobe reader 5.0 and later.) >> /namespace [ (adobe) (common) (1.0) ] /othernamespaces [ << /asreaderspreads false /cropimagestoframes true /errorcontrol /warnandcontinue /flattenerignorespreadoverrides false /includeguidesgrids false /includenonprinting false /includeslug false /namespace [ (adobe) (indesign) (4.0) ] /omitplacedbitmaps false /omitplacedeps false /omitplacedpdf false /simulateoverprint /legacy >> << /addbleedmarks false /addcolorbars false /addcropmarks false /addpageinfo false /addregmarks false /convertcolors /converttocmyk /destinationprofilename () /destinationprofileselector /documentcmyk /downsample16bitimages true /flattenerpreset << /presetselector /mediumresolution >> /formelements false /generatestructure false /includebookmarks false /includehyperlinks false /includeinteractive false /includelayers false /includeprofiles false /multimediahandling /useobjectsettings /namespace [ (adobe) (creativesuite) (2.0) ] /pdfxoutputintentprofileselector /documentcmyk /preserveediting true /untaggedcmykhandling /leaveuntagged /untaggedrgbhandling /usedocumentprofile /usedocumentbleed false >> ] >> setdistillerparams << /hwresolution [2400 2400] /pagesize [612.000 792.000] >> setpagedevice ap08_2.vp 1 introduction supersymmetric quantum mechanics [1] is under intensive development and remarkable new features have been discovered in recent years. this attention is due both to the wide range of applicability of one-dimensional supersymmetric theories and especially superconformal quantum mechanics [2] for extremal black holes [3], in the ads-cft correspondence [4] (when setting ads2), in investigating partial breaking of extended supersymmetries [5, 6], as well as for its underlying mathematical structures. it is well known that large n (up to n � 32, starting from the maximal, eleven-dimensional supergravity) one-dimensional supersymmetric quantum mechanical models are automatically derived [7] from dimensional reduction of higher-dimensional supersymmetric field theories. large n one-dimensional supersymmetry on the other hand (possibly in the n � � limit) even emerges in condensed matter phenomena. controlling one-dimensional n-extended supersymmetry for arbitrary values of n (that is, the nature of its representation theory, how to construct manifestly supersymmetric invariants, etc.) is a technical, but challenging program with important consequences in many areas of physics, see e.g. the discussion in [8] concerning the nature of on-shell versus off-shell representations, for its implications in the context of the supersymmetric unification of interactions. over the years, progress has come from two lines of attack. in the pivotal work of [9] irreducible representations were investigated to analyze supersymmetric quantum mechanics. the special role played by clifford algebra was pointed out [10]. clifford algebras were also used in [11] to construct representations of the extended one-dimensional supersymmetry algebra for arbitrarily large values of n. 
another line of attack involved using superspace, so that manifest invariants could be constructed through superfields. for low values of n this is indeed the most convenient approach. however, with increasing n, the associated superfields become highly reducible and require the introduction of constraints to extract irreducible representations. this approach soon becomes impractical for large n. indeed, only very recently a manifestly n � 8 superfield formalism for one-dimensional theory has been introduced, see [12] and references therein. a manifest superfield formalism is however lacking for larger values of n. in this work we discuss our results [13], [14], [15] concerning the classification of linear irreducible representations realized on a finite number of time-dependent, bosonic and fermionic, fields. the connection with clifford algebras and division algebras is discussed, as well as the construction of off-shell invariant actions and some associations with graph theory. several important topics that have appeared recently in the literature, like the nature of the non-linear representations will not be discussed here. there are reviews ([16]) that cover this and other aspects. similarly, the quite important connection with supersymmetric integrable systems in (1+1) dimensions (such as the supersymmetric extension of the kdv equation, will not be discussed since they have been covered elsewhere [17]). the scheme of the paper is as follows. the next section deals with the relevance of one-dimensional supersymmetric quantum mechanics for understanding higher-dimensional supersymmetric field theory. some selected examples of dimensional reductions are pointed out. the relation between irreducible representations of one-dimensional n extended supersymmetry algebra and clifford algebras is explained in section 3. section 4 reviews the classification of clifford algebras and their relation with division algebras, following [18]. in section 5 the results of [14] concerning the classification of irreducible representations with length-4 field content are reported. section 6 computes off-shell invariant actions of one-dimensional sigma models within a manifestly supersymmetric formalism which does not require the introduction of superfields. in section 7 an n � 8 invariant action constructed in terms of octonionic structure constants is presented. the classification in [15] of nonequivalent n � 5 6, supersymmetry transformations with the same field content is given in section 8 and 9. a graphical presentation of supersymmetry transformations in terms of n-colored oriented graphs is discussed. section 10 introduces the fusion algebra produced by tensoring irreducible representations and presents it in graphical form. 58 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 48 no. 2/2008 extended supersymmetries in one dimension f. toppan this work covers part of the material presented at the advanced summer school in prague. it is mostly devoted to the structural properties of extended supersymmetries in one dimension. several results are presented on the classification of linear, irreducible representations realized on a finite number of time-dependent fields. the connections between supersymmetry transformations, clifford algebras and division algebras are discussed. a manifestly supersymmetric framework for constructing invariants without using the notion of superfields is presented. 
a few examples of one-dimensional, n-extended, off-shell invariant sigma models are computed. the relation between supersymmetry transformations and graph theory is outlined. the notion of the fusion algebra of irreps tensor products is presented. the relevance of one-dimensional supersymmetric quantum mechanics as a way to extract information on higher dimensional supersymmetric field theories is discussed. keywords: supersymmetric quantum mechanics, m-theory. ` 2 n extended supersymmetries in d � 1 and dimensional reduction of supersymmetric theories in higher dimensions one important motivation for investigating n extended supersymmetries in one dimension is the fact that their rich algebraic setting can furnish useful information concerning the construction of supersymmetric theories in a higher dimension (such as super-yang-mills, supergravity, etc.) supersymmetric quantum mechanics with large number n encodes large information of these theories. the simplest way to see this is through dimensional reduction, where all space-dimensions are frozen and the only remaining dependence is in terms of a time-like coordinate. the usefulness of this procedure is due to the fact that in such a framework we can make use of powerful mathematical tools (essentially based on the available classification of clifford algebras) which are not available in higher dimensions. it should be remembered that a four-dimensional field theory with n extended supersymmetries corresponds, once it is dimensionally reduced to one-dimension, to a supersymmetric quantum mechanics with four times (4n) the number of the original extended supersymmetries [7]. the most interesting case, in the context of the unification program, corresponds to eleven-dimensional supergravity (the low-energy limit of m-theory), which is reduced to an n � 8 four-dimensional theory and later to an n � 32 one-dimensional supersymmetric quantum mechanical system. in this section we will discuss the dimensional reduction of supersymmetric theories from d � 4 to d �1 in some specific examples. we will prove how certain d � 4 problems can be reformulated in a d �1 language. it is convenient to start with a dimensional analysis of the following theories: i) the free particle in one (time) dimension (d �1) and, for the ordinary minkowski space-time (d � 4), iia) the scalar boson theory (with quartic potential � �4 4 ! ), iib) the yang-mills theory and, finally, iic) the gravity theory (expressed in the vierbein formalism). we further make a dimensional analysis of the above three theories when dimensionally reduced (a la scherk) to a one (time) dimensional d �1 quantum mechanical system. in the following we will repeat the dimensional analysis for the supersymmetric version of these theories. case i) – the d �1 free particle it is described by a dimensionless action s given by s m dt� � 1 2 �� . (2.1) the dot denotes, as usual, the time derivative. the dimensionality of the time t is the inverse of the mass; we can therefore set ([ ]t � �1). by assuming � being dimensionless ([ ]� � 0), an overall constant (written as 1 m ) of mass dimension �1 has to be inserted to make s non-dimensional. summarizing, we have, for the above d �1 model, [ ] , [ ] , [ ] , [ ] , [ ] . t t m s d d d d d � � � � � � � � � � � 1 1 1 1 1 1 1 0 1 0 � � � (2.2) the suffix d �1 has been added for later convenience, since the theory corresponds to a one-dimensional model. 
case iia) – the d � 4 scalar boson theory the action can be presented as s d x m� � �� � � �� 4 2 2 4 1 2 1 2 1 4 � � �� �� � � � ! (2.3) a non-dimensional action s is obtained by setting, in mass dimension, [ ] , [ ] , [ ] , [ ] . � � d d d d m � � � � � � � � 4 4 4 4 1 1 1 0 � � (2.4) case iib) – the d � 4 pure qed or yang-mills theories the gauge-invariant action is given by s e d x f f� � 1 2 4 tr( )�� �� , (2.5) where the antisymmetric stress-energy tensor f�� is given by f d d�� � �� [ , ], (2.6) with d� the covariant derivative, expressed in terms of the gauge connection a� d ea� � ��� � . (2.7) e is the charge (the electric charge for qed). the action is non-dimensional, provided that [ ] , [ ] , [ ] . a f e d d d � �� � � � � � � 4 4 4 1 2 0 (2.8) case iic) – the pure gravity case the action is constructed, see [19] for details, in terms of the determinant e of the vierbein e a� and the curvature scalar r. it is given by s g d x er n � � � 6 8 4 . (2.9) the overall constant (essentially the inverse of the gravitational constant gn) is now dimensional ([ ]gn d� � �4 2). the non-dimensional action is recovered by setting [ ] , [ ] . e r a d d � � � � � 4 4 0 2 (2.10) let us now discuss the dimensional reduction from d d� �4 1. let us suppose that the three space dimensions belong to some compact manifold m (e.g. the three-sphere s3) and let us freeze the dependence of the fields on the space-dimen© czech technical university publishing house http://ctn.cvut.cz/ap/ 59 acta polytechnica vol. 48 no. 2/2008 sions (application of the time derivative �0 leads to non-vanishing results, while application of the space-derivatives �i, for i �1 2 3, , , gives zero). our space-time is now given by r � m. we get that the integration over the three space variables contributes just to an overall factor, the volume of the three-dimensional manifold m. therefore d x vol dtm 4 � �� . (2.11) since [ ]volm d� � �4 3 (2.12) we can express vol m � 13 , where m is a mass-term. a factor 1 m contributes as an overall factor in one-dimensional theory, while the remaining part 12m can be used to rescale the fields. we have, e.g., for dimensional reduction of the scalar boson theory that � �d dm� � �1 4 1 . (2.13) the dimensional reduction of the scalar boson theory ii a) is therefore given by s m dt m d� � � � � � ��� 1 1 2 1 2 1 4 2 2 2 1 4 � ! � � � � (2.14) where we have [ ] , [ ] , [ ] . � � d d d m � � � � � � 1 1 1 1 0 1 2 (2.15) the d �1 coupling constant �1 is related to the d � 4 non-dimensional coupling constant � by the relation � �1 2� m . (2.16) we proceed in a similar way in the case of yang-mills theory. we can rescale the d � 4 yang-mills fields a� to the d �1 fields b a m� � � 1 . the d �1 charge e is rescaled to e e m1 � . we have, symbolically, for the dimensionally reduced action, a sum of terms of the type � �s m dt b e bb e b� � �� 1 2 1 2 1 2 4� � , (2.17) where [ ] , [ ] . b e d d � � � � 1 1 1 0 1 (2.18) the situation is different as far as gravity theory is concerned. in that case the overall factor vol gm n produces the dimensionally correct 1 m overall factor of the one-dimensional theory. this implies that we do not need to rescale the dimensionality of the vierbein e a� and of the curvature. 
summarizing, we have the following results scalar boson gauge c � � �: [ ] [ ]d d� �� �4 11 0 onnection :[ [ vierbein a a ad d� � �] ]� �� �4 11 0 [ [ electric charge e e ed d� � � : ] ]� �� �4 11 0 :[ [e e ed d] ]� �� �4 10 1 (2.19) let us now discuss the n �1 supersymmetric version of the three d � 4 theories above. first, we have the chiral multiplet, described in [19], in terms of the chiral superfields �, �. next the vector multiplet v, the vector-multiplet in the wess-zumino gauge, the supergravity multiplet in terms of vierbein and gravitinos and, finally, the gauged supergravity multiplet presenting an extra set of auxiliary fields. the total content of fields is given by the following table, which presents also the d � 4 and respectively the d �1dimensionality of the fields (in the latter case, after dimensional reduction). we have some comments are in order: the vector multiplet corresponds, in d �1language, to the n � 4 “enveloping representation” [14] (1, 4, 6, 4, 1). the latter is a reducible, but non-decomposable representation of the n � 4 supersymmetry. its irreducible multiplets are split into (1, 4, 3, 0, 0) and (0, 0, 3, 4, 1). the wess-zumino gauge, in d �1 language, corresponds to selecting the latter n � 4 irreducible multiplet, whose fields present only non-negative dimensions. the n � 2 four-dimensional super-qed involves coupling a set of chiral superfields together with the vector multiplet. due to the dimensional analysis, the corresponding one-dimensional multiplet is the (5, 8, 3) irrep of n � 8 given by (2, 4, 2)�(3, 4, 1). 60 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 48 no. 2/2008 chiral multiplet : � �, fields content : (2, 4, 2) d � 4 dimensionality : [ , , ]1 232 4d� d �1 dimensionality : [ , , ]0 112 1d� vector multiplet : v v� † fields content : (1, 4, 6, 4, 1) d � 4 dimensionality : [ , , , , ]0 1 212 3 2 4d� d �1 dimensionality : [ , , , , ]� � �1 0 1 1 2 1 2 1d vector multiplet : v in the wz gauge fields content : (3, 4, 1) d � 4 dimensionality : [ , , ]1 232 4d� d �1 dimensionality : [ , , ]0 112 1d� supergravity multiplet : e a� � �, fields content : (16, 16) d � 4 dimensionality : [ , ]0 12 4d� d �1 dimensionality : [ , ]0 12 1d� gauged sugra multiplet : e ba i� � �, , fields content : (6, 12, 6) d � 4 dimensionality : [ , , ]0 112 4d� d �1 dimensionality : [ , , ]0 112 1d� (2.20) as far as supergravity theories are concerned, the original supergravity multiplet corresponds to four irreducible n � 4 one-dimensional multiplets, while the gauged supergravity multiplet is obtained, in the d �1viewpoint, in terms of three irreducible n � 4 multiplets whose total number of fields is (6, 12, 6). the multiplet of the physical degrees of freedom of eleven-dimensional supergravity (44 components for the graviton, i.e. the components of a so(9) traceless symmetric tensor, the 128 fermionic components of the gravitinos and the 84 components of the three form) can be accommodated into the (44, 128, 84) multiplet of an n-extended one-dimensional supersymmetry. as will be shown later, 128 bosons and 128 fermions can accommodate at most 16 off-shell supersymmetries that are linearly realized. it is under question whether an off-shell formulation of eleven-dimensional supergravity indeed exists. in any case it would require at least 32768 bosonic (and an equal number of fermionic) degrees of freedom to produce an n � 32 supersymmetry representation in d �1. 
3 supersymmetric quantum mechanics and clifford algebras in this section we discuss several results, based on ref. [13], concerning the classification of irreducible representations (from now on “irreps”) of the n-extended one-dimensional supersymmetry algebra and their connection with clifford algebras. the n extended d �1supersymmetry algebra is given by � �q q hi j ij, � � , (3.21) where the qi’s are the supersymmetry generators (for i j n, , ,�1 � ) and h i t � � � � is a hamiltonian operator (t is the time coordinate). if the diagonal matrix �ij is pseudo-euclidean (with signature (p, q), n p q� � ) we can speak of generalized supersymmetries. for convenience we limit the discussion here (despite the fact that our results can be straightforwardly generalized to pseudo-euclidean supersymmetries, having applicability, e.g., to supersymmetric spinning particles moving in pseudo-euclidean manifolds) to ordinary n-extended supersymmetries. therefore for our purposes � ij ij� . the (d-modules) representations of the (3.21) supersymmetry algebra realized in terms of linear transformations acting on finite multiplets of fields satisfy the following properties. the total number of bosonic fields equal the total number of fermionic fields. for irreps of the n-extended supersymmetry the number of bosonic (fermionic) fields is given by d, with n and d linked through n l n d g nl � � � 8 24 , ( ), (3.22) where l � 0 1 2, , ,� and n �1 2 3 4 5 6 7 8, , , , , , , . g(n) appearing in (3.22) is the radon-hurwitz function [13] the modulo 8 property of the irreps of the n-extended supersymmetry is a consequence of the famous modulo 8 property of clifford algebras. the connection between supersymmetry irreps and clifford algebras is specified later. the d �1 dimensional reduction of the maximal n � 8 supergravity produces a supersymmetric quantum mechanical system with n � 32 extended number of supersymmetries. it is therefore convenient to explicitly report the number of bosonic/fermionic component fields in any given irrep of (3.21) for any n up to n � 32. we get the table the bosonic (fermionic) fields entering an irreducible multiplet can be grouped together according to their dimensionality. sometimes instead of “dimension”, the word “spin” is used to refer to the dimensionality of the component fields. this choice of word finds some justification when discussing the d �1 dimensional reduction of higher-dimensional supersymmetric theories. the number (equal to l) of different dimensions (i.e. the number of different spin states) of a given irrep, will be referred to as the length l of the irrep. since there are at least two different spin states (one for bosons, the other for fermions), obtained when all bosons (fermions) are grouped together within the same spin, the minimal length of an irrep is l � 2. a general property of (linear) supersymmetry in any dimension is the fact that the states of highest spin in a given multiplet are auxiliary fields, whose supersymmetry transformations are given by total derivatives. just for d �1 total derivatives coincide with the (unique) time derivative. using this specific property of the one-dimensional supersymmetry it was proven in [13] that all finite linear irreps of the (3.21) supersymmetry algebra fall into classes of equivalence, each class of equivalence being singled out by an associated mini© czech technical university publishing house http://ctn.cvut.cz/ap/ 61 acta polytechnica vol. 48 no. 
It was further proven that the minimal-length irreducible multiplets are in 1-to-1 correspondence with a subclass of Clifford algebras (those which satisfy a Weyl property). The connection goes as follows. The supersymmetry generators acting on a length-2 irreducible multiplet can be expressed as

$$Q_i = \frac{1}{\sqrt{2}}\begin{pmatrix} 0 & \sigma_i \\ \tilde\sigma_i H & 0 \end{pmatrix}, \qquad (3.25)$$

where σ_i and σ̃_i are matrices entering a Weyl-type (i.e. block-antidiagonal) irreducible representation of the Clifford algebra relation

$$\Gamma_i = \begin{pmatrix} 0 & \sigma_i \\ \tilde\sigma_i & 0\end{pmatrix}, \qquad \{\Gamma_i, \Gamma_j\} = 2\eta_{ij}. \qquad (3.26)$$

The Q_i's in (3.25) are supermatrices with vanishing bosonic and non-vanishing fermionic blocks, acting on an irreducible multiplet m (thought of as a column vector) which can be either bosonic or fermionic. We conventionally consider a length-2 irreducible multiplet as bosonic if the upper half of its component fields is bosonic and the lower half fermionic; it is fermionic in the converse case.

The connection between Clifford algebra irreps of the Weyl type and minimal-length irreps of the N-extended one-dimensional supersymmetry is such that D, the dimensionality of the (Euclidean, in the present case) space-time of the Clifford algebra (3.26), coincides with the number N of the extended supersymmetries, according to

# of space-time dimensions (Weyl–Clifford)  ⇔  # of extended supersymmetries (in 1 dim):   D = N.   (3.27)

The matrix size of the associated Clifford algebra (equal to 2d, with d given in (3.22)) corresponds to the number of (bosonic plus fermionic) fields entering the one-dimensional N-extended supersymmetry irrep. The classification of Weyl-type Clifford irreps, furnished in [13], can easily be recovered from the well-known classification of Clifford irreps given in [20] (see also [21] and [22]).

The Q_i matrices (3.25), realizing the N-extended supersymmetry algebra (3.21) on length-2 irreps, have entries which are either c-numbers or proportional to the Hamiltonian H. Irreducible representations of higher length (l ≥ 3) are systematically produced [13] through repeated applications of the dressing transformations

$$Q_i \;\mapsto\; \hat Q_i^{(k)} = S^{(k)}\, Q_i\, \bigl(S^{(k)}\bigr)^{-1}, \qquad (3.28)$$

realized by diagonal matrices S^{(k)} (k = 1, 2, …, d) with entries

$$S^{(k)}_{ij} = \delta_{ij}\bigl(1 - \delta_{jk} + \delta_{jk} H\bigr), \qquad (3.29)$$

i.e. S^{(k)} is the identity apart from the entry H in the k-th diagonal position.

Some remarks are in order [13]:
i) the dressed supersymmetry operators Q̂_i (for a given set of dressing transformations) have entries which are integral powers of H. A subclass of the Q̂_i dressed operators is given by the local dressed operators, whose entries are non-negative integral powers of H (their entries have no 1/H poles). A local representation (irreps fall into this class) of an extended supersymmetry is realized by local dressed operators. The number of the extension, N′ ≤ N, corresponds to the number of local dressed operators.
ii) The local dressed representation is not necessarily an irrep. Since the total number of fields (d bosons and d fermions) is unchanged under dressing, the local dressed representation is an irrep iff d and N′ satisfy the requirement (3.22) (with N′ in place of N).
iii) The dressing changes the dimension (spin) of the fields of the original multiplet m.
Under the S^{(k)} dressing transformation (3.28), m ↦ S^{(k)} m, all fields entering m are unchanged apart from the k-th one (denoted, e.g., as φ_k and mapped to φ′_k). Its dimension is changed from [φ_k] to [φ_k] + 1: this is why the dressing changes the length of a multiplet. As an example, if the original length-2 multiplet m is a bosonic multiplet with d spin-0 bosonic fields and d spin-1/2 fermionic fields (in the following such a multiplet will be denoted as (x_i; ψ_j) ≡ (d, d)_{s=0}, for i, j = 1, …, d), then S^{(k)} m, for k ≤ d, corresponds to a length-3 multiplet with d − 1 bosonic spin-0 fields, d spin-1/2 fermionic fields and a single spin-1 bosonic field (in the following we employ the notation (d − 1, d, 1)_{s=0} for such a multiplet).

Let us now fix the overall conventions. The most general multiplet is of the form (d_1, d_2, …, d_l), where the d_i, for i = 1, 2, …, l, specify the number of fields of a given spin s + (i − 1)/2. The spin s, i.e. the spin of the lowest component fields in the multiplet, will also be referred to as the "spin of the multiplet". When looking purely at the representation properties of a given multiplet, the assignment of an overall spin s is arbitrary, since the supersymmetry transformations of the fields are not affected by s. Introducing a spin is useful for tensoring multiplets and becomes essential for physical applications, e.g. in the construction of supersymmetric invariant terms entering an action. In the above multiplet, l denotes its length and d_l the number of auxiliary fields of highest spin transforming as time derivatives. The total number of odd-indexed fields equals the total number of even-indexed fields, i.e. d_1 + d_3 + ⋯ = d_2 + d_4 + ⋯ = d. The multiplet is bosonic if the odd-indexed fields are bosonic and the even-indexed fields fermionic (it is fermionic in the converse case). For a bosonic multiplet the auxiliary fields are bosonic (fermionic) if the length l is an odd (even) number. Just like the overall spin assignment, the assignment of a bosonic (fermionic) character to a multiplet is arbitrary, since the mutual transformation properties of the fields inside a multiplet are not affected by its statistics. Therefore, multiplets always appear in dually related pairs, so that to any bosonic multiplet there exists a fermionic counterpart with the same transformation properties (see also [23]). Throughout this paper we assign integer-valued spins to bosonic multiplets and half-integer-valued spins to fermionic multiplets.

As pointed out before, the most general (d_1, d_2, …, d_l) multiplet is recovered as a dressing of its corresponding N-extended length-2 (d, d) multiplet. In [13] it was shown that all dressed supersymmetry operators producing any length-3 multiplet (of the form (d − p, d, p) for p = 1, …, d − 1) are of local type. Therefore, for length-3 multiplets we have N′ = N. This implies, in particular, that the (d − p, d, p) multiplets are non-equivalent irreps of the N-extended one-dimensional supersymmetry. As concerns length l ≥ 4 multiplets, the general problem of finding irreps was not addressed in [13].
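The dressing (3.28)–(3.29) and the locality count of remark i) can be exhibited explicitly on the N = 4 root multiplet. The following sympy sketch (a minimal illustration; the quaternionic 4×4 realization, the factor conventions and all helper names are our own choices) follows the chain (4, 4) → (1, 4, 3) → (1, 3, 3, 1) discussed next:

```python
# Dressing of the N = 4 root operators (3.25) built from a C(0,3)-type
# realization; locality = "no 1/H poles in the entries" (remark i).
import sympy as sp

H = sp.symbols('H')
t1 = sp.Matrix([[0, 1], [1, 0]])
t2 = sp.Matrix([[1, 0], [0, -1]])
tA = sp.Matrix([[0, 1], [-1, 0]])
I2 = sp.eye(2)

def kron(a, b):   # Kronecker product, hand-rolled for self-containment
    return sp.Matrix(a.rows * b.rows, a.cols * b.cols,
                     lambda i, j: a[i // b.rows, j // b.cols] * b[i % b.rows, j % b.cols])

q = [kron(tA, t1), kron(tA, t2), kron(I2, tA)]   # three anticommuting, square -1
sigma = q + [sp.eye(4)]
tilde = [-m for m in q] + [sp.eye(4)]

def Q(i):   # root operators of the form (3.25)
    Z = sp.zeros(4, 4)
    return (1 / sp.sqrt(2)) * (Z.row_join(sigma[i])).col_join((tilde[i] * H).row_join(Z))

def is_local(M):
    return all(not sp.denom(sp.cancel(e)).has(H) for e in M)

def dress(entries):
    D = sp.diag(*entries)
    return [sp.simplify(D * Q(i) * D.inv()) for i in range(4)]

# (4,4) -> (1,4,3): dress three bosons; all four operators remain local
print([is_local(M) for M in dress([1, H, H, H, 1, 1, 1, 1])])

# (1,4,3) -> (1,3,3,1): additionally dress one fermion; exactly one operator
# develops a 1/H pole, leaving N' = 3 local supersymmetries
for f in range(4, 8):
    entries = [1, H, H, H] + [H if k == f else 1 for k in range(4, 8)]
    print(f, sum(is_local(M) for M in dress(entries)))   # each choice gives 3
```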
It was shown, as a specific example, that the dressing of the length-2 (4, 4) irrep of N = 4, realized through the chain of dressings (4, 4) → (1, 4, 3) → (1, 3, 3, 1), produces at the end a length-4 multiplet (1, 3, 3, 1) carrying only three local supersymmetries (N′ = 3). Since the relation (3.22) is satisfied when the number of extended supersymmetries acting on the multiplet equals 3 and the total number of bosonic (fermionic) fields equals 4, the (1, 3, 3, 1) multiplet is, as a consequence, an irreducible representation of the N = 3 extended supersymmetry. Based on an algorithmic construction of representatives of the Clifford irreps, we present an iterative method for classifying all irreducible representations of higher length for arbitrary values N of the extended supersymmetry.

4 Clifford algebras and division algebras

Due to the relation between supersymmetric quantum mechanics and Clifford algebras, we present here a classification of the irreducible representations of Clifford algebras in terms of an algorithm which allows us to single out, in arbitrary signature space-times, a representative in each irreducible class of representations of Clifford's gamma matrices. The class of irreducible representations is unique apart from special signatures, where two non-equivalent irreducible representations are linked by an overall sign flipping (Γ_i ↔ −Γ_i).

The construction goes as follows. First, we prove that, starting from a given space-time-dimensional representation of Clifford's gamma matrices, we can recursively construct D + 2 space-time-dimensional Clifford gamma matrices with the help of two recursive algorithms. Indeed, it is a simple exercise to verify that if the γ_i's denote the d-dimensional gamma matrices of a D = p + q space-time with (p, q) signature (namely, providing a representation of the C(p, q) Clifford algebra), then 2d-dimensional gamma matrices Γ_j of a D + 2 space-time are produced according to either

$$\Gamma_j \equiv \begin{pmatrix}0 & \gamma_i\\ \gamma_i & 0\end{pmatrix},\quad \begin{pmatrix}0 & \mathbf 1_d\\ -\mathbf 1_d & 0\end{pmatrix},\quad \begin{pmatrix}\mathbf 1_d & 0\\ 0 & -\mathbf 1_d\end{pmatrix}; \qquad (p, q)\mapsto(p+1,\, q+1), \qquad (4.30)$$

or

$$\Gamma_j \equiv \begin{pmatrix}0 & \gamma_i\\ -\gamma_i & 0\end{pmatrix},\quad \begin{pmatrix}0 & \mathbf 1_d\\ \mathbf 1_d & 0\end{pmatrix},\quad \begin{pmatrix}\mathbf 1_d & 0\\ 0 & -\mathbf 1_d\end{pmatrix}; \qquad (p, q)\mapsto(q+2,\, p). \qquad (4.31)$$

It is immediately clear, e.g., that the two-dimensional real-valued Pauli matrices τ_A, τ_1, τ_2, which realize the Clifford algebra C(2, 1), are obtained by applying either (4.30) or (4.31) to the number 1, i.e. to the one-dimensional realization of C(1, 0). We have indeed

$$\tau_A = \begin{pmatrix}0&1\\-1&0\end{pmatrix},\qquad \tau_1 = \begin{pmatrix}0&1\\1&0\end{pmatrix},\qquad \tau_2 = \begin{pmatrix}1&0\\0&-1\end{pmatrix}. \qquad (4.32)$$

All Clifford algebras are obtained by recursively applying the algorithms (4.30) and (4.31) to the Clifford algebra C(1, 0) (≡ 1) and to the Clifford algebras of the series C(0, 3 + 4m) (with m a non-negative integer), which must be previously known. This is in accordance with the scheme illustrated in Table 1, whose chains (matrix sizes doubling at each step, from d = 1 up to d = 256) are:

(1,0) → (2,1) → (3,2) → (4,3) → (5,4) → (6,5) → (7,6) → (8,7) → (9,8)
(0,3) → (1,4) → (2,5) → (3,6) → (4,7) → (5,8) → (6,9)
(0,3) → (5,0) → (6,1) → (7,2) → (8,3) → (9,4) → (10,5)
(0,7) → (1,8) → (2,9) → (3,10) → (4,11) → (5,12)
(0,7) → (9,0) → (10,1) → (11,2) → (12,3) → (13,4)
(0,11) → (1,12) → (2,13)
(0,11) → (13,0) → (14,1)
(0,15) → (1,16)
(0,15) → (17,0)

Table 1: the maximal Clifford algebras (up to matrix size d = 256); the first step out of each primitive algebra uses either (4.30) or (4.31), all subsequent steps use (4.30); the underlined (primitive) entries are C(1,0), C(0,3), C(0,7), C(0,11), C(0,15).   (4.33)
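A minimal numpy sketch of the two lifting algorithms (4.30) and (4.31) follows; starting from C(1, 0) = {(1)}, each step doubles the matrix size, and the helper `check` (a name of our choosing) verifies the Clifford relations and the signature count:

```python
import numpy as np

def alg430(gammas, p, q):                    # (p,q) -> (p+1, q+1)
    d = gammas[0].shape[0]
    I, Z = np.eye(d), np.zeros((d, d))
    new = [np.block([[Z, g], [g, Z]]) for g in gammas]
    new += [np.block([[Z, I], [-I, Z]]),     # squares to -1 (adds to q)
            np.block([[I, Z], [Z, -I]])]     # squares to +1 (adds to p)
    return new, p + 1, q + 1

def alg431(gammas, p, q):                    # (p,q) -> (q+2, p)
    d = gammas[0].shape[0]
    I, Z = np.eye(d), np.zeros((d, d))
    new = [np.block([[Z, g], [-g, Z]]) for g in gammas]  # all squares flip sign
    new += [np.block([[Z, I], [I, Z]]),      # squares to +1
            np.block([[I, Z], [Z, -I]])]     # squares to +1
    return new, q + 2, p

def check(gammas, p, q):
    d = gammas[0].shape[0]
    squares = sorted(round(np.trace(g @ g) / d) for g in gammas)
    assert squares == sorted([1] * p + [-1] * q)
    for i, gi in enumerate(gammas):
        for gj in gammas[:i]:
            assert np.allclose(gi @ gj + gj @ gi, 0)

g, p, q = [np.array([[1.0]])], 1, 0          # C(1,0)
g, p, q = alg430(g, p, q); check(g, p, q)    # C(2,1): exactly the tau's of (4.32)
g, p, q = alg431(g, p, q); check(g, p, q)    # C(3,2), size 4
g, p, q = alg430(g, p, q); check(g, p, q)    # C(4,3), size 8: the first chain of Table 1
print(p, q, g[0].shape)
```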
Concerning the previous table, some remarks are in order. The columns are labeled by the matrix size d of the maximal Clifford algebras. Their signature is denoted by the (p, q) pairs. Furthermore, the underlined Clifford algebras in the table can be named "primitive maximal Clifford algebras". The remaining maximal Clifford algebras appearing in the table are "maximal descendant Clifford algebras": they are obtained from the primitive maximal Clifford algebras by iteratively applying the two recursive algorithms (4.30) and (4.31). Moreover, any non-maximal Clifford algebra is obtained from a given maximal Clifford algebra by deleting a certain number of gamma matrices (as an example, Clifford algebras in even-dimensional space-times are always non-maximal). It is immediately clear from the above construction that the maximal Clifford algebras are encountered if and only if the condition

$$p - q = 1,\ 5 \pmod 8 \qquad (4.34)$$

is matched.

The notion of a Clifford algebra of the generalized Weyl type, namely satisfying the condition (3.26), has already been introduced. All maximal Clifford algebras, both primitive and descendant, are not of the generalized Weyl type. As already pointed out, the notion of generalized Weyl spinors is based on real-valued representations of Clifford algebras which, for classification purposes, are more convenient to use than the complex Clifford algebras one in general deals with. For this reason generalized Weyl spinors exist also in odd-dimensional space-times, see formula (3.26), while standard Weyl spinors only exist in even-dimensional space-times. This can be understood by analyzing a single example. The real irrep of C(0, 7), with all negative signs, is 8-dimensional, see table (4.33), while the real irrep of C(7, 0) is 16-dimensional, but of generalized Weyl type (3.26). Accordingly, Euclidean 8-dimensional fundamental spinors can be understood either as the 8-dimensional "non-Weyl" spinors of C(0, 7), or as the 8-dimensional "Weyl-projected" spinors of C(7, 0). In the complex case, the sign flipping C(0, 7) ↔ C(7, 0) can be realized by multiplying all gamma matrices by the imaginary unit "i"; no doubling of the matrix size of the Γ's is found and the notion of Weyl spinors cannot be applied. One faces a similar situation in one-dimensional space-time. In the complex case we can realize C(1, 0) with 1 and C(0, 1) with i (both one-dimensional). On the other hand, in the real case, C(0, 1) can only be realized through the 2-dimensional irrep $\begin{pmatrix}0&1\\-1&0\end{pmatrix}$, which is block-antidiagonal. Throughout the text, the Weyl (non-Weyl) character of spinors always refers to the property (3.26) with respect to real-valued Clifford algebras.

Non-maximal Clifford algebras are of the Weyl type if and only if they are produced from a maximal Clifford algebra by deleting at least one spatial gamma matrix which, without loss of generality, can always be chosen as the one with diagonal entries.

Let us now illustrate how non-maximal Clifford algebras are produced from the corresponding maximal Clifford algebras. The construction goes as follows. We illustrate at first the example of the non-maximal Clifford algebras obtained from the 2-dimensional maximal Clifford irrep C(2, 1), furnished by the three matrices τ_1, τ_2, τ_A given in (4.32). If we restrict the Clifford algebra to τ_1, τ_A, i.e. if we delete τ_2 from the previous set, we get the 2-dimensional irrep of C(1, 1). If we further delete τ_1 we are left with τ_A only, which provides the 2-dimensional irrep of C(0, 1) discussed above.
On the other hand, deleting τ_A from C(2, 1) leaves us with τ_1, τ_2, the 2-dimensional irrep of C(2, 0). To summarize, from the 2-dimensional irrep of the maximal Clifford algebra C(2, 1) we obtain the 2-dimensional irreps of the non-maximal Clifford algebras C(1, 1), C(0, 1) and C(2, 0) through a "Γ-matrices deleting procedure". Please note that through deleting we cannot obtain from C(2, 1) the irrep of C(1, 0), since the latter is one-dimensional.

In full generality, non-maximal Clifford algebras are produced from the corresponding maximal Clifford algebras according to the following table, which specifies the number of gamma matrices of either square that should be deleted, as well as the generalized Weyl (W) or non-Weyl (NW) character of the given non-maximal Clifford algebra:

W:
  p − q ≡ 0 (mod 8):  from (p′, q′) = (p + 1, q), p′ − q′ ≡ 1 (mod 8), 1 deletion
  p − q ≡ 4 (mod 8):  from (p′, q′) = (p + 1, q), p′ − q′ ≡ 5 (mod 8), 1 deletion
  p − q ≡ 6 (mod 8):  from (p′, q′) = (p + 3, q), p′ − q′ ≡ 1 (mod 8), 3 deletions
  p − q ≡ 7 (mod 8):  from (p′, q′) = (p + 2, q), p′ − q′ ≡ 1 (mod 8), 2 deletions
NW:
  p − q ≡ 2 (mod 8):  from (p′, q′) = (p, q + 1), p′ − q′ ≡ 1 (mod 8), 1 deletion
  p − q ≡ 3 (mod 8):  from (p′, q′) = (p, q + 2), p′ − q′ ≡ 1 (mod 8), 2 deletions      (4.35)

In the above entries, "x mod 8" specifies the mod 8 residue of p − q for the given (p, q) space-time. Non-maximal Clifford algebras are denoted by their signature (p, q), while maximal Clifford algebras are denoted by (p′, q′), with p′ ≥ p, q′ ≥ q. The differences p′ − p, q′ − q denote how many Clifford gamma matrices (of positive or, respectively, negative square) have to be deleted from the given maximal Clifford algebra to produce the irrep of the corresponding non-maximal Clifford algebra. To be specific, e.g., the 6 mod 8 non-maximal Clifford algebra C(6, 0) is obtained from the maximal Clifford algebra C(9, 0), whose matrix size is 16 according to (4.33), by deleting three gamma matrices.

To complete our discussion, it remains to specify the construction of the primitive maximal Clifford algebras of both the series C(0, 3 + 8n) (which can be named the "quaternionic series", due to its connection with this division algebra, as we will see in the next section) and of the "octonionic series" C(0, 7 + 8n). The answer can be provided with the help of the three Pauli matrices (4.32). We first construct the 4×4 matrices realizing the Clifford algebra C(0, 3) and the 8×8 matrices realizing the Clifford algebra C(0, 7). They are given, respectively, by

$$C(0,3) \equiv \{\tau_A\otimes\tau_1,\ \tau_A\otimes\tau_2,\ \mathbf 1_2\otimes\tau_A\} \qquad (4.36)$$

and

$$C(0,7) \equiv \{\tau_A\otimes\tau_1\otimes\mathbf 1_2,\ \tau_A\otimes\tau_2\otimes\mathbf 1_2,\ \mathbf 1_2\otimes\tau_A\otimes\tau_1,\ \mathbf 1_2\otimes\tau_A\otimes\tau_2,\ \tau_1\otimes\mathbf 1_2\otimes\tau_A,\ \tau_2\otimes\mathbf 1_2\otimes\tau_A,\ \tau_A\otimes\tau_A\otimes\tau_A\}. \qquad (4.37)$$

The three matrices of C(0, 3) will be denoted as γ_i, i = 1, 2, 3. The seven matrices of C(0, 7) will be denoted as γ̃_i, i = 1, 2, …, 7. In order to construct the remaining Clifford algebras of the two series we first need to apply the algorithm (4.30) to C(0, 7) and construct the 16×16 matrices realizing C(1, 8) (the matrix with positive signature is denoted as Γ_9, with Γ_9² = 1, while the eight matrices with negative signature are denoted as Γ_j, j = 1, 2, …, 8, with Γ_j² = −1).
We are now in the position to explicitly construct the whole series of primitive maximal Clifford algebras C(0, 3 + 8n), C(0, 7 + 8n) through the formulas

$$C(0,\,3+8n) \equiv \{\ \gamma_i\otimes\Gamma_9\otimes\cdots\otimes\Gamma_9,\quad \mathbf 1_4\otimes\mathbf 1_{16}\otimes\cdots\otimes\mathbf 1_{16}\otimes\Gamma_j\otimes\Gamma_9\otimes\cdots\otimes\Gamma_9\ \}, \qquad (4.38)$$

where the γ_i (i = 1, 2, 3) are the C(0, 3) matrices, the slots to the right of the γ_i block are filled with n factors Γ_9, and the position of the Γ_j block (j = 1, …, 8) runs over the n sixteen-dimensional slots (with identities to its left and Γ_9's to its right); similarly,

$$C(0,\,7+8n) \equiv \{\ \tilde\gamma_i\otimes\Gamma_9\otimes\cdots\otimes\Gamma_9,\quad \mathbf 1_8\otimes\mathbf 1_{16}\otimes\cdots\otimes\mathbf 1_{16}\otimes\Gamma_j\otimes\Gamma_9\otimes\cdots\otimes\Gamma_9\ \}, \qquad (4.39)$$

with the γ̃_i (i = 1, …, 7) the C(0, 7) matrices. Please note that the tensor product of the 16-dimensional representation is taken n times. The total size of the matrix representations (4.38) is then 4·16^n, while the total size of (4.39) is 8·16^n.

With the help of the formulas presented in this section we are able to systematically construct a set of representatives of the real irreducible representations of Clifford algebras in arbitrary space-times and signatures.

It is also convenient to explicitly present the connection of Clifford algebras with the division algebras of the quaternions (and of the octonions). This relation can be understood as follows. First we note that the three matrices appearing in C(0, 3) can also be expressed in terms of the imaginary quaternions e_i satisfying

$$e_i e_j = -\delta_{ij} + \epsilon_{ijk}\, e_k. \qquad (4.40)$$

As a consequence, the whole set of primitive maximal Clifford algebras C(0, 3 + 8n), as well as their maximal descendants, can be represented with quaternionic-valued matrices. In its turn, the spinors now have to be interpreted as quaternionic-valued column vectors. Similarly, there exists an alternative realization of the basic relations of the generators of the Euclidean Clifford algebra C(0, 7), obtained by identifying its seven generators with the seven imaginary octonions o_i (for a review on octonions see e.g. [24]) satisfying the algebraic relation

$$o_i o_j = -\delta_{ij} + C_{ijk}\, o_k \qquad (4.41)$$

for i, j, k = 1, …, 7, with C_ijk the totally antisymmetric octonionic structure constants given by

$$C_{123} = C_{147} = C_{165} = C_{246} = C_{257} = C_{354} = C_{367} = 1 \qquad (4.42)$$

and vanishing otherwise. The octonions are non-associative and cannot be represented in matrix form with the usual matrix multiplication. On the other hand, a construction due to Dixon allows us to produce the seven 8×8 matrix generators of the C(0, 7) Clifford algebra in terms of the octonionic structure constants. Given a real octonion $x = x_0 + \sum_i x_i o_i$, with real coefficients x_0, x_i for i = 1, …, 7, the left action of the imaginary octonions o_i (x ↦ o_i·x) is reproduced in terms of an 8×8 Clifford gamma matrix Γ_i, linearly acting on the components x_0, x_i.
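A numpy sketch of this Dixon construction follows (the 0-based index conventions and helper names are ours); it builds the seven generators from the structure constants (4.42) and verifies the negative-signature Clifford relations:

```python
import numpy as np

# the seven Fano lines of Eq. (4.42), shifted to 0-based indices
lines = [(0, 1, 2), (0, 3, 6), (0, 5, 4), (1, 3, 5), (1, 4, 6), (2, 4, 3), (2, 5, 6)]

C = np.zeros((7, 7, 7))                     # totally antisymmetric C_ijk
for (a, b, c) in lines:
    for (i, j, k), s in [((a, b, c), 1), ((b, c, a), 1), ((c, a, b), 1),
                         ((a, c, b), -1), ((c, b, a), -1), ((b, a, c), -1)]:
        C[i, j, k] = s

G = []
for i in range(7):                          # left multiplication x -> o_i * x
    M = np.zeros((8, 8))
    M[i + 1, 0] = 1.0                       # o_i * x0  gives  x0 o_i
    M[0, i + 1] = -1.0                      # o_i * o_i = -1
    for j in range(7):
        for k in range(7):
            M[k + 1, j + 1] += C[i, j, k]   # o_i * o_j = C_ijk o_k  (i != j)
    G.append(M)

# Clifford relations of C(0,7): seven antisymmetric gammas squaring to -1
for a in range(7):
    for b in range(7):
        target = -2.0 * np.eye(8) if a == b else np.zeros((8, 8))
        assert np.allclose(G[a] @ G[b] + G[b] @ G[a], target)
print("C(0,7) verified")
```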
5 The field content of the irreducible representations

It is now possible to plug in the information contained in the Clifford algebras and apply the construction outlined in Section 3 to compute the admissible field content of the length-4 multiplets for arbitrary values of N. This construction was done in [14]. We present here the list of length-4 field contents up to N = 11. Up to N = 8 we have

N = 1:  none
N = 2:  none
N = 3:  (1, 3, 3, 1)
N = 4:  none
N = 5:  (1, 5, 7, 3), (3, 7, 5, 1), (1, 6, 7, 2), (2, 7, 6, 1), (2, 6, 6, 2), (1, 7, 7, 1)
N = 6:  (1, 6, 7, 2), (2, 7, 6, 1), (2, 6, 6, 2), (1, 7, 7, 1)
N = 7:  (1, 7, 7, 1)
N = 8:  none        (5.43)

Since there are no length-l irreps with l ≥ 5 for N ≤ 8, the above list, together with the already known length-2 and length-3 irreps, provides the complete classification of the admissible field contents of the irreducible representations for N ≤ 8. Please note that the length-4 irrep of N = 3, (1, 3, 3, 1), is self-dual under the high ⇔ low spin duality of [14], while two of the non-equivalent length-4 N = 5 irreps are self-dual, namely (2, 6, 6, 2) and (1, 7, 7, 1). The remaining ones are pairwise dually related ((1, 5, 7, 3) ⇔ (3, 7, 5, 1) and (1, 6, 7, 2) ⇔ (2, 7, 6, 1)).

The N = 9 length-4 irreducible multiplets (d_1, d_2, d_3, d_4) are, for simplicity, expressed in terms of the two positive integers h ≡ d_1 and k ≡ d_4, since d_3 = 16 − h and d_2 = 16 − k. The complete list of N = 9 length-4 field contents is expressed by the h, k satisfying the constraint

$$h + k \le 8. \qquad (5.44)$$

N = 10 is the lowest supersymmetry admitting length-5 irreps. The field content of its length-4 irreps is given by (d_1, d_2, d_3, d_4), expressed in terms of the two positive integers h ≡ d_1, k ≡ d_4, with d_3 = 32 − h, d_2 = 32 − k. If we set

$$r \equiv \min(h, k), \qquad (5.45)$$

the non-equivalent length-4 field contents are given by the ordered pairs of positive integers h, k satisfying the constraint

$$h + k - r \le 24. \qquad (5.46)$$

For N = 11 the length-4 field contents (d_1, d_2, d_3, d_4) are expressed in terms of the two positive integers h ≡ d_1, k ≡ d_4, with d_3 = 64 − h, d_2 = 64 − k. Setting as before r ≡ min(h, k) and introducing the function s(r) defined through

$$s(r) = 8 - r \ \text{for}\ r = 0, 1, \ldots, 7;\qquad s(r) = 0\ \text{otherwise}, \qquad (5.47)$$

we can express the constraint on h, k as

$$h + k - r + s(r) \le 48. \qquad (5.48)$$

6 The off-shell invariant actions of the N = 4 sigma models

In the late 1980s and early 1990s the whole set of off-shell invariant actions of the N = 4 supersymmetries was produced ([5] and references therein) by making use of the superfield formalism. This result was reached after slowly recognizing the multiplets carrying a representation of the one-dimensional N = 4 supersymmetry. The results discussed here allow us to reconstruct, in a unified framework, all off-shell invariant actions of the correct mass-dimension (the mass-dimension d = 2 of the kinetic energy) for the whole set of N = 4 irreducible multiplets. They are given by the (4, 4), (3, 4, 1), (2, 4, 2) and (1, 4, 3) multiplets. We are able to construct the invariants without using a superfield formalism. We use instead a construction which, as we will show later, can be extended even to large values of N, in cases where a superfield formalism is not available. We will use the fact that the supersymmetry generators Q_i act as graded Leibniz derivatives. Manifestly invariant actions of the N-extended supersymmetry can be obtained by expressing them as

$$I = \int dt\ Q_N Q_{N-1}\cdots Q_1\, F(x_1, x_2, \ldots, x_k), \qquad (6.49)$$

with the supersymmetry transformations applied to an arbitrary function F of the 0-dimensional fields x_i (i = 1, …, k) entering an irreducible multiplet of the N-extended supersymmetry.
Since the supersymmetry generators have mass-dimension 1/2 (being the "square roots" of the Hamiltonian), (6.49) is a manifestly supersymmetric invariant whose Lagrangian density Q_N ⋯ Q_1 F(x_1, …, x_k) has dimension d = N/2. For N = 4 the Lagrangian density therefore has the correct dimension of a kinetic term. The k variables x_i can be regarded as coordinates of a k-dimensional manifold. The corresponding actions can therefore be seen as N = 4 supersymmetric one-dimensional sigma models evolving in a k-dimensional target manifold.

For each N = 4 irrep we get the following results. In all cases below, Φ ≡ Φ(x) denotes an arbitrary function of the 0-dimensional fields x_i, and the most general invariant Lagrangian L of dimension d = 2 takes the common form

L = Φ·(kinetic and auxiliary terms) + first-derivative-of-Φ terms bilinear in the fermions + second-derivative-of-Φ terms quartic in the fermions,

with the contractions provided by the invariant tensors δ_ij and ε_ijk; the explicit expressions are given in (6.51), (6.53), (6.55) and (6.57). We get the following list.

i) The N = 4 (4, 4) case. We have

Q_i(x, x_j; ψ, ψ_j) = (ψ_i, −δ_ij ψ − ε_ijk ψ_k;  −ẋ_i, δ_ij ẋ + ε_ijk ẋ_k),
Q_4(x, x_j; ψ, ψ_j) = (ψ, ψ_j;  ẋ, ẋ_j),     (6.50)

for i, j, k = 1, 2, 3. The most general invariant Lagrangian L of dimension d = 2 has the leading term

L = Φ(x, x_j)[ẋ² + ẋ_j ẋ_j + ψψ̇ + ψ_j ψ̇_j] + … ,     (6.51)

the remaining terms being bilinear and quartic in the fermions, with coefficients 1/2 and 1/6 multiplying the δ_ij- and ε_ijk-contracted brackets built out of the first and second derivatives of Φ.

ii) The N = 4 (3, 4, 1) case. We have

Q_i(x_j; ψ, ψ_j; g) = (−δ_ij ψ − ε_ijk ψ_k;  −ẋ_i, δ_ij g + ε_ijk ẋ_k;  ψ̇_i),
Q_4(x_j; ψ, ψ_j; g) = (ψ_j;  g, ẋ_j;  ψ̇),     (6.52)

with the invariant Lagrangian

L = Φ(x_j)[ẋ_j ẋ_j + g² + ψψ̇ + ψ_j ψ̇_j] + … .     (6.53)

iii) The N = 4 (2, 4, 2) case. We have

Q_1(x, y; ψ_0, ψ_1, ψ_2, ψ_3; g, h) = (ψ_1, −ψ_0;  −ẏ, ẋ, h, −g;  −ψ̇_3, ψ̇_2),
Q_2(x, y; ψ_0, ψ_1, ψ_2, ψ_3; g, h) = (ψ_2, ψ_3;  −g, −h, ẋ, ẏ;  −ψ̇_0, −ψ̇_1),
Q_3(x, y; ψ_0, ψ_1, ψ_2, ψ_3; g, h) = (ψ_3, −ψ_2;  −h, g, −ẏ, ẋ;  ψ̇_1, −ψ̇_0),
Q_4(x, y; ψ_0, ψ_1, ψ_2, ψ_3; g, h) = (ψ_0, ψ_1;  ẋ, ẏ, g, h;  ψ̇_2, ψ̇_3),     (6.54)

with the invariant Lagrangian

L = Φ(x, y)[ẋ² + ẏ² + g² + h² + ψ_a ψ̇_a] + … .     (6.55)

iv) The N = 4 (1, 4, 3) case. We have

Q_i(x; ψ, ψ_j; g_j) = (ψ_i;  −g_i, δ_ij ẋ + ε_ijk g_k;  −δ_ij ψ̇ − ε_ijk ψ̇_k),
Q_4(x; ψ, ψ_j; g_j) = (ψ;  ẋ, g_j;  ψ̇_j),     (6.56)

with the invariant Lagrangian

L = Φ(x)[ẋ² + g_i g_i + ψψ̇ + ψ_i ψ̇_i] + Φ′(x)[g_i ψψ_i + (1/2) ε_ijk g_i ψ_j ψ_k] + (Φ″(x)/6)[ε_ijk ψψ_i ψ_j ψ_k].     (6.57)

It is worth recalling that N = 4 is associated, as we have discussed, with the algebra of the quaternions. This is why in the (4, 4), (3, 4, 1) and (1, 4, 3) cases the invariant actions can be written by making use of the quaternionic tensors δ_ij and ε_ijk. In the (2, 4, 2) case two fields are dressed to auxiliary fields and this spoils the quaternionic covariance property.
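The closure of the (4, 4) transformations (6.50) on the supersymmetry algebra can be checked mechanically. The numpy sketch below encodes their coefficients as matrices A_i (bosons → fermions) and B_i (fermions → time derivatives of bosons) and verifies the closure relation; the {Q_i, Q_j} = 2δ_ij H normalization is a convention of the sketch, and all names are ours:

```python
import numpy as np

eps = np.zeros((4, 4, 4))   # Levi-Civita on indices 1..3 (slot 0 unused)
for (i, j, k), s in [((1,2,3),1),((2,3,1),1),((3,1,2),1),
                     ((1,3,2),-1),((3,2,1),-1),((2,1,3),-1)]:
    eps[i, j, k] = s

# bosons b = (x, x1, x2, x3), fermions f = (psi, psi1, psi2, psi3);
# Q_i b = A_i f and Q_i f = B_i (d/dt) b, with the coefficients of (6.50)
A, B = [], []
for i in (1, 2, 3):
    Ai, Bi = np.zeros((4, 4)), np.zeros((4, 4))
    Ai[0, i] = 1.0                 # Q_i x = psi_i
    Bi[0, i] = -1.0                # Q_i psi = -xdot_i
    for j in (1, 2, 3):
        Ai[j, 0] = -float(i == j)  # Q_i x_j = -delta_ij psi - eps_ijk psi_k
        Bi[j, 0] = float(i == j)   # Q_i psi_j = delta_ij xdot + eps_ijk xdot_k
        for k in (1, 2, 3):
            Ai[j, k] = -eps[i, j, k]
            Bi[j, k] = eps[i, j, k]
    A.append(Ai); B.append(Bi)
A.append(np.eye(4)); B.append(np.eye(4))   # Q_4 acts as the identity map

for i in range(4):
    for j in range(4):
        target = 2.0 * (i == j) * np.eye(4)
        assert np.allclose(A[i] @ B[j] + A[j] @ B[i], target)
        assert np.allclose(B[i] @ A[j] + B[j] @ A[i], target)
print("(6.50) closes: {Q_i, Q_j} = 2 delta_ij H")
```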
7 Octonions and N = 8 sigma models

Just as the N = 4 supersymmetry is related with the algebra of the quaternions, the N = 8 supersymmetry is related with the algebra of the octonions. More specifically, it can be proven that the N = 8 supersymmetry can be produced from the lifting of the Cl(0, 7) Clifford algebra to Cl(0, 9).

On the other hand it is well known, as we have discussed before, that the seven 8×8 antisymmetric gamma matrices of Cl(0, 7) can be recovered from the left action of the imaginary octonions on the octonionic space. As a result, the entries of the seven antisymmetric gamma matrices of Cl(0, 7) can be expressed in terms of the totally antisymmetric octonionic structure constants C_ijk, whose non-vanishing values are given by

$$C_{123} = C_{147} = C_{165} = C_{246} = C_{257} = C_{354} = C_{367} = 1. \qquad (7.58)$$

The non-vanishing octonionic structure constants are associated with the seven lines of the Fano projective plane, the smallest example of a finite projective geometry, see [24]. The N = 8 supersymmetry transformations of the various irreps can, as a consequence, be expressed in terms of the octonionic structure constants. This is in particular true for the dressed (1, 8, 7) multiplet, admitting seven fields which are "dressed" to become auxiliary fields. This is an example of a multiplet which preserves the octonionic structure, since the seven dressed fields are related to the seven imaginary octonions. The supersymmetry transformations are given by

Q_i(x; ψ, ψ_j; g_j) = (ψ_i;  −g_i, δ_ij ẋ + C_ijk g_k;  −δ_ij ψ̇ − C_ijk ψ̇_k),
Q_8(x; ψ, ψ_j; g_j) = (ψ;  ẋ, g_j;  ψ̇_j),     (7.59)

for i, j, k = 1, …, 7 (the octonionic counterpart of the quaternionic (1, 4, 3) transformations (6.56), with ε_ijk replaced by C_ijk).

The strategy for constructing the most general N = 8 off-shell invariant action of the (1, 8, 7) multiplet makes use of the octonionic covariantization principle. When restricted to an N = 4 subalgebra, the invariant actions should take the form of the N = 4 (1, 4, 3) action (6.57). This restriction can be made in seven non-equivalent ways (the seven lines of the Fano plane). The general N = 8 action should be expressed in terms of the octonionic structure constants. With respect to (6.57), an extra term could in principle be present: it is given by ∫dt (…) C_ijkl ψ_i ψ_j ψ_k ψ_l, constructed in terms of the rank-4 octonionic tensor

$$C_{ijkl} = \frac{1}{6}\,\epsilon_{ijklmnp}\, C_{mnp} \qquad (7.60)$$

(where ε_ijklmnp is the seven-index totally antisymmetric tensor). Please note that the rank-4 tensor obviously vanishes when restricted to a quaternionic subspace. One immediately verifies that such a term breaks the N = 8 supersymmetries and cannot enter the invariant action. As concerns the other terms, starting from the general action (with i, j, k = 1, …, 7)

S = ∫dt { Φ(x)[ẋ² + g_i g_i + ψψ̇ + ψ_i ψ̇_i] + Φ′(x)[g_i ψψ_i + (1/2) C_ijk g_i ψ_j ψ_k] + (Φ″(x)/6)[C_ijk ψψ_i ψ_j ψ_k] },     (7.61)

we can prove that the invariance under the generators Q_i (i = 1, …, 7) is broken by terms which, after integration by parts, contain at least a second derivative Φ″: one obtains, e.g., a non-vanishing term proportional to Φ″ and to C_ijk contracted with the fermions and the auxiliary fields. In order to guarantee the full N = 8 invariance (the invariance under Q_8 is automatically guaranteed) we therefore have to set Φ″ = 0, leaving a function linear in x. As a result, the most general N = 8 off-shell invariant action of the (1, 8, 7) multiplet is given by

S = ∫dt { (ax + b)[ẋ² + g_i g_i + ψψ̇ + ψ_i ψ̇_i] + a[g_i ψψ_i + (1/2) C_ijk g_i ψ_j ψ_k] }.     (7.62)

We can express this result in the following terms: the association of the N = 8 supersymmetry with the octonions implies that the octonionic structure constants enter the N = 8 invariant actions as coupling constants.
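The rank-4 dual tensor (7.60) is straightforward to compute numerically. The following sketch (helper names ours) builds C_ijkl from the structure constants and shows that its non-zero entries sit on the complements of the Fano lines:

```python
import numpy as np
from itertools import permutations

def parity(p):
    s, p = 1, list(p)
    for i in range(len(p)):
        for j in range(i + 1, len(p)):
            if p[i] > p[j]:
                s = -s
    return s

eps = np.zeros((7,) * 7)                 # seven-index Levi-Civita tensor
for p in permutations(range(7)):
    eps[p] = parity(p)

C3 = np.zeros((7, 7, 7))                 # C_ijk of (7.58), 0-based lines
for line in [(0,1,2),(0,3,6),(0,5,4),(1,3,5),(1,4,6),(2,4,3),(2,5,6)]:
    for q in permutations(line):
        C3[q] = parity([line.index(x) for x in q])

C4 = np.einsum('ijklmnp,mnp->ijkl', eps, C3) / 6.0   # Eq. (7.60)

print(C4[3, 4, 5, 6])                    # complement of the line (0,1,2): 1.0
assert np.allclose(C4, -np.swapaxes(C4, 0, 1))       # antisymmetry spot-check
```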
The situation with respect to the other N = 8 multiplets is more complicated. This is due to the fact that the dressing of some of the bosonic fields to auxiliary fields does not respect the octonionic covariance. The construction of the invariant actions can however be performed along similar lines, the octonionic structure constants being replaced by "dressed" structure constants. The procedure for a generic irrep is more involved than in the (1, 8, 7) case. The full list of invariant actions for the N = 8 irreps is currently being worked out; the results will be reported elsewhere. The method proposed here is quite interesting, because it allows us in principle to construct the most general invariant actions. It is worth mentioning that various groups, using the N = 8 superfield formalism, are still working on the problem of constructing the most general invariant actions. Let us close this section by pointing out that the only trace of the octonions is through their structure constants, entering as parameters in the N = 8 off-shell invariant action (7.62): (7.62) is an ordinary action, in terms of ordinary associative bosonic and fermionic fields closing an ordinary N = 8 supersymmetry algebra.

8 Non-equivalent representations with the same field content

The irreducible representations of the N-extended supersymmetry algebra are nicely presented in terms of N-colored graphs with arrows (we will explain below how to draw the graphs). The existence of irreducible representations admitting the same field content but non-equivalent graphs was pointed out in [25]. In [15] the non-equivalent graphs associated to the irreducible representations up to N = 8 were classified. We discuss here both the construction of [15] and its main results. Since it can quite easily be proved that non-equivalent graphs are not encountered for N ≤ 4, it is sufficient to discuss the irreducible representations of N = 5, 6, 7, 8, which are obtained through a dressing of the N = 8 length-2 root multiplet of type (8, 8) (see the previous discussion).

Inequivalent graphs (see [15]) are described by the so-called connectivity of the irreps. Connectivity can be understood as follows. For the class of irreducible representations under consideration, any given field of dimension d is mapped, under a supersymmetry transformation, either
i) to a field of dimension d + 1/2 belonging to the multiplet (or to its opposite; the sign of the transformation is irrelevant for our purposes), or
ii) to the time derivative of a field of dimension d − 1/2.
If the given field belongs to an irrep of the N-extended one-dimensional supersymmetry algebra, then k ≤ N of its transformations are of type i), while the remaining N − k are of type ii).

Let us now specialize our discussion to a length-3 irrep (the interesting case for us). Its field content is given by (n_1, n, n − n_1), while the set of its fields is expressed by (x_i; ψ_j; g_k), with i = 1, …, n_1, j = 1, …, n, k = 1, …, n − n_1. The x_i are 0-dimensional fields (the ψ_j are 1/2-dimensional fields and the g_k are 1-dimensional fields, respectively). The connectivity associated to the given multiplet is defined in terms of the symbol ψ_g. It encodes the following information: the n fields ψ_j of dimension 1/2 are partitioned into subsets of m_r fields admitting k_r supersymmetry transformations of type i) (k_r can take the value 0), with Σ_r m_r = n. The symbol ψ_g is expressed as

ψ_g = (m_1)_{k_1} + (m_2)_{k_2} + ⋯     (8.63)

As an example, the N = 7 (6, 8, 2) multiplet admits connectivity ψ_g = 6_2 + 2_1 (see the table (9.68) below).
This means that there are two types of fields ψ_j: six of them are mapped, under the supersymmetry transformations, into the two auxiliary fields g_k (two type-i transformations each), while the two remaining fields ψ_j are mapped into a single auxiliary field. An analogous symbol, x_ψ, can be introduced, describing the supersymmetry transformations of the x_i fields into the ψ_j fields. This symbol is, however, always trivial: an N-irrep with (n_1, n, n − n_1) field content always produces x_ψ = (n_1)_N.

Let us now discuss how to compute the connectivities. The (8, 8) root multiplet involves 8 bosonic and 8 fermionic fields entering a column vector (the bosonic fields are accommodated in the upper part). The 8 supersymmetry operators Q̂_i (i = 1, …, 8) in the (8, 8) N = 8 irrep are given by the matrices

$$\hat Q_j = \frac{1}{\sqrt 2}\begin{pmatrix}0 & \gamma_j\\ -\gamma_j H & 0\end{pmatrix},\qquad \hat Q_8 = \frac{1}{\sqrt 2}\begin{pmatrix}0 & \mathbf 1_8\\ \mathbf 1_8 H & 0\end{pmatrix}, \qquad (8.64)$$

where the γ_j (j = 1, …, 7) are the 8×8 generators of the Cl(0, 7) Clifford algebra and H = i d/dt is the Hamiltonian. The Cl(0, 7) Clifford irrep is uniquely defined up to similarity transformations and an overall sign flipping [22]. Without loss of generality we can unambiguously fix the γ_j matrices to be given as in the Appendix. Each γ_j matrix (and the identity 1_8) possesses 8 non-vanishing entries, one in each column and one in each row. The whole set of non-vanishing entries of the eight matrices (A.1) fills the entire 8×8 = 64 squares of a "chessboard". The chessboard appears in the upper-right block of (8.64).

The length-3 and length-4 N = 5, 6, 7, 8 irreps (no irrep with length l > 4 exists for N ≤ 9, see [14]) are acted upon by the supersymmetry transformations Q_i, obtained from the original Q̂_i operators through a dressing,

$$\hat Q_i \;\mapsto\; Q_i = D\, \hat Q_i\, D^{-1}, \qquad (8.65)$$

realized by a diagonal dressing matrix D. It should be noticed that only the subset of "regular" dressed operators Q_i (i.e., those having no 1/H or higher poles in their entries) acts on the new irreducible multiplet. Apart from the self-dual (4, 8, 4) N = 5, 6 irreps, without loss of generality, for our purpose of computing the irrep connectivities, the diagonal dressing matrix D producing an irrep with (n_1, n, n − n_1) field content can be chosen with diagonal entries d_q given by d_q = 1 for q = 1, …, n_1 and q = n + 1, …, 2n, while d_q = H for q = n_1 + 1, …, n. Any permutation of the first n entries produces a dressing which is equivalent, for computing both the field content and the ψ_g connectivity, to D. The only exceptions correspond to the N = 5 (4, 8, 4) and N = 6 (4, 8, 4) irreps: besides the diagonal matrix D as above, non-equivalent irreps can be obtained from a diagonal dressing D′ with diagonal entries d′_q = H for q = 4, 6, 7, 8 and d′_q = 1 for the remaining values of q. Similarly, the (n_1, n_2, n − n_1, n − n_2) length-4 multiplets are acted upon by the Q_i operators dressed by a D whose diagonal entries are now given by d_q = 1 for q = 1, …, n_1 and q = n + 1, …, n + n_2, while d_q = H for q = n_1 + 1, …, n and q = n + n_2 + 1, …, 2n.

The N = 5, 6, 7, 8 length-2 (8, 8) irreps are unique (for the given value of N), see [26]. It is also easily recognized that all N = 8 length-3 irreps of a given field content produce the same value of the ψ_g connectivity (8.63). As concerns the length-3 N = 5, 6, 7 irreps, the situation is as follows. Let us consider the irreps with (k, 8, 8 − k) field content.
Their supersymmetry transformations are defined by picking an N < 8 subset of the complete set of 8 dressed Q_i operators. It is easily recognized that for N = 7, no matter which supersymmetry operator is discarded, any choice of the seven operators produces the same value of the ψ_g connectivity. Irreps with different connectivity can therefore only be found for N = 5, 6. The $\binom{8}{6} = 28$ choices of N = 6 operators fall into two classes, denoted as A and B, which can potentially produce (k, 8, 8 − k) irreps with different connectivity. Similarly, the $\binom{8}{5} = 56$ choices of N = 5 operators fall into two classes A and B which can potentially produce irreps of different connectivity. For some given (k, 8, 8 − k) irrep, the values of the ψ_g connectivity computed in the two N = 5 (as well as N = 6) classes can actually coincide; in the next section we will show when this indeed happens. To be specific, we present a list of representatives of the supersymmetry operators for each N and for each N = 5, 6 class. We have, with the diagonal dressing D,

N = 8:           Q_1, Q_2, Q_3, Q_4, Q_5, Q_6, Q_7, Q_8
N = 7:           Q_1, Q_2, Q_3, Q_4, Q_5, Q_6, Q_7
N = 6 (case A):  Q_1, Q_3, Q_4, Q_5, Q_6, Q_7
N = 6 (case B):  Q_1, Q_2, Q_3, Q_4, Q_5, Q_6
N = 5 (case A):  Q_3, Q_4, Q_5, Q_6, Q_7
N = 5 (case B):  Q_2, Q_3, Q_4, Q_5, Q_6        (8.66)

and, with the diagonal dressing D′ for the (4, 8, 4) irreps,

N = 6 (case A′): Q_1, Q_3, Q_4, Q_5, Q_6, Q_7
N = 5 (case A′): Q_3, Q_4, Q_5, Q_6, Q_7        (8.67)

We are now in the position to compute the connectivities of the irreps (the results are furnished in the next section). Quite literally, the computations can be performed by filling a chessboard with pawns representing the allowed configurations.
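The chessboard computation is easily mechanized. The following numpy sketch does so for a length-3 (n_1, 8, 8 − n_1) dressing; it uses the octonionic Cl(0, 7) realization built in Section 4 rather than the Appendix matrices (the irreps agree up to similarity, but which operator subsets realize the A or B classes is tied to the chosen realization; all names below are ours):

```python
import numpy as np

lines = [(0,1,2),(0,3,6),(0,5,4),(1,3,5),(1,4,6),(2,4,3),(2,5,6)]
G = []
for i in range(7):                           # the octonionic Cl(0,7) gammas
    M = np.zeros((8, 8))
    M[i + 1, 0], M[0, i + 1] = 1.0, -1.0
    for (a, b, c) in lines:
        for (p, q, r), s in [((a,b,c),1),((b,c,a),1),((c,a,b),1),
                             ((a,c,b),-1),((c,b,a),-1),((b,a,c),-1)]:
            if p == i:
                M[r + 1, q + 1] = s
    G.append(M)
G.append(np.eye(8))                          # the eighth operator uses 1_8, cf. (8.64)

def psi_g(n1, ops):
    """Connectivity (8.63) for (n1, 8, 8-n1): under the dressing, a fermion row
    is of type i) for Q_i exactly when the single boson column hit by gamma_i
    in that row is a dressed (auxiliary) one, i.e. column index >= n1."""
    ks = []
    for a in range(8):                       # the eight fermions
        cols = [int(np.flatnonzero(G[i][a])[0]) for i in ops]
        ks.append(sum(c >= n1 for c in cols))
    mult = {}
    for k in ks:
        mult[k] = mult.get(k, 0) + 1
    return " + ".join(f"{m}_{k}" for k, m in sorted(mult.items(), reverse=True))

print(psi_g(6, range(7)))           # N = 7 (6,8,2): gives 6_2 + 2_1
print(psi_g(4, [2, 3, 4, 5, 6]))    # an N = 5 (4,8,4) subset: gives 4_4 + 4_1
print(psi_g(4, [1, 2, 3, 4, 5]))    # another N = 5 subset: gives 4_3 + 4_2
```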
9 Classification of the irrep connectivities

In this section we report the results of the computation of the allowed connectivities for the N = 5, 6, 7 length-3 irreps. It turns out that the only values of N ≤ 8 allowing the existence of multiplets with the same field content but non-equivalent connectivities are N = 5 and N = 6. The allowed ψ_g connectivities of the length-3 irreps are reported in the following table (the A, A′, B cases of N = 5, 6 are specified):

Length-3    N = 7        N = 6                                       N = 5                                            (9.68)
(7, 8, 1)   7_1 + 1_0    6_1 + 2_0                                   5_1 + 3_0
(6, 8, 2)   6_2 + 2_1    (A) 6_2 + 2_0;  (B) 4_2 + 4_1               (A) 4_2 + 2_1 + 2_0;  (B) 2_2 + 6_1
(5, 8, 3)   5_3 + 3_2    (A) 4_3 + 2_2 + 2_1;  (B) 2_3 + 6_2         (A) 4_3 + 3_1 + 1_0;  (B) 1_3 + 5_2 + 2_1
(4, 8, 4)   4_4 + 4_3    (A) 4_4 + 4_2;  (A′) 2_4 + 4_3 + 2_2;  (B) 8_3    (A) 4_4 + 4_1;  (A′) 1_4 + 3_3 + 3_2 + 1_1;  (B) 4_3 + 4_2
(3, 8, 5)   3_5 + 5_4    (A) 2_5 + 2_4 + 4_3;  (B) 6_4 + 2_3         (A) 1_5 + 3_4 + 4_2;  (B) 2_4 + 5_3 + 1_2
(2, 8, 6)   2_6 + 6_5    (A) 2_6 + 6_4;  (B) 4_5 + 4_4               (A) 2_5 + 2_4 + 4_3;  (B) 6_4 + 2_3
(1, 8, 7)   1_7 + 7_6    2_6 + 6_5                                   3_5 + 5_4

It is useful to explicitly present, in at least one pair of examples, the supersymmetry transformations for multiplets admitting different connectivities and the same field content. We consider below the pair of N = 5 irreps differing by their connectivity, the (4, 8, 4)_A and the (4, 8, 4)_B multiplets; their explicit transformations, Eqs. (9.69) and (9.70), list the variations δx_i, δψ_j, δg_k in terms of the five global fermionic parameters ε_1, …, ε_5.

It is also convenient to visualize the two multiplets graphically. The graphical presentation (Figs. 1 and 2 below) is given as follows: three rows of (from bottom to top) 4, 8 and 4 dots are associated with the x_i, ψ_j and g_k fields, respectively. Supersymmetry transformations are represented by lines of 5 different colors (since N = 5); solid lines correspond to transformations entering with a positive sign, dashed lines to transformations with a negative sign. It is easily recognized that in the type-A graph there are 4 points ψ_j with four colored lines connecting them to the g_k points, while the 4 remaining points ψ_j admit a single line connecting them to the g_k points. In the type-B graph we have 4 points ψ_j with three colored lines and 4 points ψ_j with two colored lines connecting them to the g_k points.

Fig. 1: graph of the N = 5 (4, 8, 4) multiplet of 4_4 + 4_1 connectivity (type A)

Fig. 2: graph of the N = 5 (4, 8, 4) multiplet of 4_3 + 4_2 connectivity (type B)

10 Tensoring irreducible representations: their fusion algebras and the associated graphs

The tensor product of linear irreducible representations can be decomposed into its irreducible constituents. This decomposition contains useful information for the construction of bilinear (in general, multilinear) terms entering a supersymmetric invariant action.
We recall that the auxiliary fields in a given representation transform as total derivatives (time derivatives in one dimension). Useful information concerning the decomposition of the tensor products of the irreducible representations can be encoded in the so-called fusion algebra of the irreps and of their supersymmetric vacua. The notion of the fusion algebra of the supersymmetric vacua of the N-extended one-dimensional supersymmetry, introduced in [14], is constructed in analogy with the fusion algebra of rational conformal field theories. Fusion algebras can also be nicely presented in terms of their associated graphs. We explicitly present here the N = 1 and N = 2 fusion graphs (with two subcases for each N, according to whether or not the irreps are distinguished w.r.t. their bosonic/fermionic statistics).

Let us discuss how to present the results of [14] in graphical form. The irreps correspond to points. N_ij^k oriented lines (with arrows) connect the [j] and [k] irreps if the decomposition [i]·[j] = Σ_k N_ij^k [k] holds. The arrows are dropped from the lines if the [j] and [k] irreps can be interchanged. The [i] irrep should correspond to a generator of the fusion algebra: this means that the whole set of fusion matrices N_l ≡ (N_lj^k) is produced as sums of powers of the fusion matrix N_i ≡ (N_ij^k).

Let us discuss explicitly the N = 2 case. We obtain the following list of four irreps (if we discriminate their statistics):

[1] = (2, 2)_bos;   [2] = (1, 2, 1)_bos;   [3] = (2, 2)_fer;   [4] = (1, 2, 1)_fer.     (10.71)

The corresponding N = 2 fusion algebra is realized in terms of four mutually commuting 4×4 matrices with non-negative integer entries, N_1 ≡ X, N_2 = N_4 ≡ Y and N_3 ≡ Z (Eq. (10.72)). The fusion algebra thus admits three distinct elements, X, Y, Z, and one generator (we can choose either X or Z), due to the relations (10.73), which express Y and Z as cubic polynomials in the generator X. The vector space spanned by X, Y, Z is closed under multiplication, Eq. (10.74). This fusion algebra corresponds to the "smiling face" graph (Fig. 6 below).

We obtain the following four figures for the fusion graphs of the N = 1 and N = 2 supersymmetric quantum mechanics irreps. The "A" cases below correspond to ignoring the statistics (bosonic/fermionic) of the given irreps; in the "B" cases the number of fundamental irreps is doubled w.r.t. the previous ones, in order to take the statistics of the irreps into account.

Fig. 3: fusion graph of the N = 1 superalgebra (A case, 1 irrep, no boson/fermion distinction)

Fig. 4: fusion graph of the N = 1 superalgebra (B case, 2 irreps, boson/fermion distinction)

Fig. 5: fusion graph of the N = 2 superalgebra (A case, 2 irreps, no boson/fermion distinction)

Fig. 6: fusion graph of the N = 2 superalgebra (B case, 4 irreps, boson/fermion distinction), "the smiling face". From left to right the four points correspond to the [2] – [1] – [3] – [4] irreps, respectively; the lines are generated by the N_1 ≡ X fusion matrix, see (10.72)

11 Conclusions

Supersymmetric quantum mechanics is a fascinating subject with several open problems. Potentially the most interesting one concerns the construction of off-shell invariant actions with the dimension of a kinetic term for large values of N (let us say N > 8); they could provide a hint towards an off-shell formulation of higher-dimensional supergravity and M-theory. Other important topics concern the nature of the non-linear realizations of the supersymmetry and their connection with the linear representations. We have presented here the rich mathematics underlying the linear irreducible representations realized on a finite number of time-dependent fields.
We have shown how to use this information to construct supersymmetric invariant one-dimensional sigma models. We have seen that behind supersymmetric quantum mechanics there exists an interlacing of several mathematical structures: Clifford algebras, division algebras, graph theory. Further mathematical structures seem to enter the picture (Cayley–Dickson algebras, exceptional Lie algebras, etc.). The theory of supersymmetric quantum mechanics is rich in surprises and seems to lie at the crossroads of various mathematical disciplines. We have just given a taste of it here.

Acknowledgments

I am grateful to the organizers of the Advanced Summer School for the opportunity they gave me to present these results. I am pleased to thank my collaborators Zhanna Kuznetsova and Moises Rojas; these lectures were mostly based on our joint work.

References

[1] Witten, E.: Nucl. Phys., Vol. B 188 (1981), p. 513.
[2] Akulov, V., Pashnev, A.: Teor. Mat. Fiz., Vol. 56 (1983), p. 344; Fubini, S., Rabinovici, E.: Nucl. Phys., Vol. B 245 (1984), p. 17; Ivanov, E., Krivonos, S., Lechtenfeld, O.: JHEP 0303 (2003), p. 014; Bellucci, S., Ivanov, E., Krivonos, S., Lechtenfeld, O.: Nucl. Phys., Vol. B 684 (2004), p. 321.
[3] Claus, P., Derix, M., Kallosh, R., Kumar, J., Townsend, P. K., Van Proeyen, A.: Phys. Rev. Lett., Vol. 81 (1998), p. 4553; de Azcarraga, J. A., Izquierdo, J. M., Perez-Bueno, J. C., Townsend, P. K.: Phys. Rev., Vol. D 59 (1999), p. 084015; Michelson, J., Strominger, A.: JHEP 9909 (1999), p. 005.
[4] Britto-Pacumio, R., Michelson, J., Strominger, A., Volovich, A.: "Lectures on superconformal quantum mechanics and multi-black hole moduli spaces", hep-th/9911066.
[5] Ivanov, E. A., Krivonos, S. O., Pashnev, A. I.: Class. Quantum Grav., Vol. 8 (1991), p. 19.
[6] Donets, E. E., Pashnev, A. I., Rosales, J. J., Tsulaia, M. M.: Phys. Rev., Vol. D 61 (2000), p. 43512.
[7] Rittenberg, V., Yankielowicz, S.: Ann. Phys., Vol. 162 (1985), p. 273; Claudson, M., Halpern, M. B.: Nucl. Phys., Vol. B 250 (1985), p. 689; Flume, R.: Ann. Phys., Vol. 164 (1985), p. 189.
[8] Gates Jr., S. J., Linch, W. D., Phillips, J.: hep-th/0211034; Gates Jr., S. J., Linch III, W. D., Phillips, J., Rana, L.: Grav. Cosmol., Vol. 8 (2002), p. 96.
[9] de Crombrugghe, M., Rittenberg, V.: Ann. Phys., Vol. 151 (1983), p. 99.
[10] Baake, M., Reinicke, M., Rittenberg, V.: J. Math. Phys., Vol. 26 (1985), p. 1070.
[11] Gates Jr., S. J., Rana, L.: Phys. Lett., Vol. B 352 (1995), p. 50; ibid., Vol. B 369 (1996), p. 262.
[12] Bellucci, S., Ivanov, E., Krivonos, S., Lechtenfeld, O.: Nucl. Phys., Vol. B 699 (2004), p. 226.
[13] Pashnev, A., Toppan, F.: J. Math. Phys., Vol. 42 (2001), p. 5257 (also hep-th/0010135).
[14] Kuznetsova, Z., Rojas, M., Toppan, F.: JHEP 0603 (2006), p. 098 (also hep-th/0511274).
[15] Kuznetsova, Z., Toppan, F.: Mod. Phys. Lett., Vol. A 23 (2008), p. 37 (also hep-th/0701225).
[16] Bellucci, S., Krivonos, S.: hep-th/0602199.
[17] Toppan, F.: Nucl. Phys. B (Proc. Suppl.), Vols. 102–103 (2001), p. 270.
[18] Carrion, H. L., Rojas, M., Toppan, F.: JHEP 0304 (2003), p. 040.
[19] Wess, J., Bagger, J.: Supersymmetry and Supergravity, 2nd ed., Princeton Univ. Press, 1992.
[20] Atiyah, M. F., Bott, R., Shapiro, A.: Topology (Suppl. 1), Vol. 3 (1964), p. 3.
[21] Porteous, I. R.: Clifford Algebras and the Classical Groups, Cambridge Univ. Press, 1995.
[22] Okubo, S.: J. Math. Phys., Vol. 32 (1991), p. 1657; ibid., p. 1669.
[23] Faux, M., Gates Jr., S. J.: Phys. Rev., Vol. D 71 (2005), p. 065002.
[24] Baez, J.: The Octonions, math.RA/0105155.
[25] Doran, C. F., Faux, M. G., Gates Jr., S. J., Hubsch, T., Iga, K. M., Landweber, G. D.: hep-th/0611060.
[26] Toppan, F.: hep-th/0612276.

Appendix

We present here for completeness the set (unique up to similarity transformations and an overall sign flipping) of the seven 8×8 gamma matrices γ_i which generate the Cl(0, 7) Clifford algebra. The seven gamma matrices, together with the 8-dimensional identity 1_8, are used in constructing the N = 5, 6, 7, 8 supersymmetry irreps, as explained in the main text. Each γ_i (i = 1, …, 7) is an antisymmetric signed permutation matrix with entries 0, ±1, satisfying γ_i γ_j + γ_j γ_i = −2δ_ij 1_8; the explicit matrices γ_1, …, γ_7 and 1_8 are listed in Eq. (A.1).

Francesco Toppan
E-mail: toppan@cbpf.br
CBPF
Rua Dr. Xavier Sigaud 150
CEP 22290-180, Rio de Janeiro (RJ), Brazil
acta polytechnica
https://doi.org/10.14311/ap.2022.62.0228
acta polytechnica 62(2):228–237, 2022. © 2022 the author(s). licensed under a cc-by 4.0 licence. published by the czech technical university in prague.

improvement of spectrum sensing performance in cognitive radio using modified hybrid sensing method

hadeel s. abed∗, hikmat n. abdullah

al-nahrain university, college of information engineering, department of information and communication engineering, jadriah, 10001 baghdad, iraq
∗ corresponding author: hadeel_sami@coie.nahrainuniv.edu.iq

abstract. cognitive radio (cr) is a wireless technology for increasing bandwidth usage. spectrum sensing (ss) is the first step in cr. there are three basic techniques in ss: energy detection (ed), matched filter (mf), and cyclostationary feature detection (cfd). these techniques face challenges in detection performance (pd) and computational complexity (cc). in this paper, we propose a hybrid sensing method that combines mf and cfd so as to exploit their merits and overcome their challenges. the proposed method aims to improve pd and reduce cc: when the mf does not have enough information about the pu, the method switches to cfd, with a reduction of cc in both the mf and cfd stages. the proposed method is simulated under fading with cooperative and non-cooperative scenarios, measured using pd and the cc ratio cratio, and evaluated by comparing it with traditional and hybrid methods from the literature. the simulation results show that the proposed method outperforms the other methods in pd and cratio. for example, at eb/no equal to 0 db under the rayleigh fading channel, the pd of the proposed method increased by 38 %, 28 %, 28 %, and 18 % as compared with the modified hybrid method, the traditional hybrid method, the traditional cfd method, and the traditional mf method in the literature, respectively.
keywords: cognitive radio, spectrum sensing, matched filter, cyclostationary, energy detection, hybrid sensing method.

1. introduction

due to the large number and diversity of wireless devices and applications, the emergence of new applications, and the continuous demand for higher data rates, the radio frequency (rf) spectrum is becoming increasingly crowded [1, 2]. cognitive radio (cr) has been proposed as a promising technique that offers a solution to the spectrum scarcity problem by dynamically exploiting the unused parts of the spectrum band [3, 4]. a cognitive radio has been defined as a radio or system that senses and is aware of its operational environment and can dynamically adjust its radio operating parameters accordingly [5]. cognitive radio is a wireless technology that provides the ability to share the spectrum while avoiding any harmful interference imposed on the pu [6]; the cr aims to exploit the natural resources, including frequency, time, etc., efficiently [7].

spectrum sensing is the first step in implementing a cr system. the basic components of spectrum sensing are a primary user (pu) signal, or licensed band, and a secondary user (su), or cognitive user (cu), that senses the pu band to detect the activity of the pu and can use its spectrum when the pu is absent [8]. the su must not interfere in any way with the pu if cognitive radio networks are to succeed [9]. spectrum sensing techniques can be classified into two scenarios, non-cooperative and cooperative. three basic techniques are used for spectrum sensing: energy detection (ed), matched filter (mf), and cyclostationary feature detection (cfd). the ed spectrum sensing technique is used more than the others owing to its simplicity and minimal computational complexity; however, at low signal-to-noise ratio (snr) values and under bad channel conditions, the ed cannot differentiate between the pu signal and the noise. the matched filter (mf) maximizes the received snr in communication systems, so it can be considered the best detector [10]. the mf has the drawback that it must know the properties of the pu signal, i.e., packet format, pulse shaping, and the type of modulation; if the cr has incomplete information about the pu signal, the mf cannot be used as an optimum detector. a cyclostationary detector can then be used as a sub-optimal detector: cfd can distinguish between the pu signal and the noise, and it performs well at low snr thanks to its noise rejection characteristic [11]. however, a cyclostationary detector has a high computational complexity, since it requires a long sensing time, which is not favourable in some situations [12].

to improve the detection performance, css (cooperative spectrum sensing) is applied. css can overcome fading and shadowing in wireless channels. there are two basic structures of css, centralised and distributed [13, 14]. in css, the sus sense the spectrum separately and transmit their local decisions to a fusion centre (fc); by applying some fusion logic scheme, the fc is responsible for the overall decision [11]. the decision fusion rules can be either hard or soft: in a hard fusion rule, every su makes its local binary decision on the activity of the pu independently, while in a soft fusion rule, the sus send their sensing information to the fusion centre without making local decisions, and the decision is made at the fc by one of the combining rules [15–17].
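as a toy illustration of the hard-decision fusion just described (a generic python sketch under our own assumptions, not the authors' implementation; the function name and the example decisions are illustrative), the common or, and, and majority combining rules at the fc can be written as:

```python
import numpy as np

def fuse_hard_decisions(local_decisions, rule="or"):
    """Combine local binary PU-presence decisions (0/1) at the fusion centre.

    local_decisions : length-K sequence, one hard decision per SU.
    rule            : "or" (any SU reports PU), "and" (all SUs agree),
                      or "majority" (more than half report PU).
    """
    d = np.asarray(local_decisions)
    if rule == "or":
        return int(d.any())
    if rule == "and":
        return int(d.all())
    if rule == "majority":
        return int(d.sum() > d.size / 2)
    raise ValueError("unknown fusion rule")

# three SUs sense the band; one misses the PU because of fading
print(fuse_hard_decisions([1, 0, 1], rule="or"))        # 1: PU declared present
print(fuse_hard_decisions([1, 0, 1], rule="majority"))  # 1
print(fuse_hard_decisions([1, 0, 1], rule="and"))       # 0
```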
the rest of the paper is organized as follows. section 2 reviews the related works. section 3 gives the theoretical background of the spectrum sensing techniques. section 4 explains the procedure of the proposed hybrid method. section 5 derives the computational complexity of the proposed method. section 6 presents the simulation results and discussion, and finally the conclusions of the paper are drawn.

2. related works

several works have been proposed to improve the performance of spectrum sensing techniques. in [11], a traditional hybrid method based on energy and cyclostationary detectors in a cooperative scenario is proposed to improve the detection performance, without taking the computational complexity into consideration: the pu signal is first scanned by ed to detect whether the pu is present or not, and if the ed is not certain about the detection of the pu, the pu signal is sensed by a cyclostationary detector. in [12], the computational complexity of cfd is reduced by choosing optimum parameters. in [18], the hybrid method consists of two parallel paths of detectors; the first path is created from two sequential detector stages, where ed is used in the first phase to identify the existence of the pu signal and, where the signal has not been detected, maximum–minimum eigenvalue (mme) detection is used as a second stage. in [19], the hybrid method is realised with artificial neural networks (ann). in [20], the hybrid method consists of five types of detectors, each with its own function for deciding whether the spectrum is free or occupied. in [21], a hybrid sensing method based on ed and a cyclostationary detector is proposed, with a reduced computational complexity and an improved detection performance. in [22], the idea is similar to [12]: the computational complexity is reduced while keeping a good performance, based on an optimal parameter selection strategy for the cyclic frequency and lag.

to improve the performance of spectrum sensing techniques and solve their complexity problem, we propose a hybrid spectrum sensing method based on the matched filter and cyclostationary feature detection. this method improves the detection performance of the matched filter when it does not have sufficient information about the pu signal, or at very low snr values, and reduces the computational complexity of the cyclostationary process while keeping an excellent detection performance. the proposed method is measured using the probability of detection (pd) and the computational complexity ratio under the rayleigh multipath fading channel with cooperative and non-cooperative scenarios, and evaluated by comparing it with the traditional sensing techniques (cyclostationary and mf), the traditional hybrid method of reference [11], and the improved hybrid method of reference [21].

3. spectrum sensing techniques

there are three basic techniques used for spectrum sensing: energy detection, matched filter, and cyclostationary feature detection. each technique is explained in the following sections.
3.1. energy detector

energy detection (ed) is the simplest sensing technique and does not require any knowledge about the pu signal to operate. it performs the detection by comparing the accumulated energy of the received signal with a predefined threshold, which depends only on the noise power [1]. the received samples at the cu receiver are [23]

\[ y(n) = h\,\theta\, x(n) + n_{oi}(n), \tag{1} \]

where $y(n)$ is the signal sensed by the cu, $x(n)$ is the pu signal, $n_{oi}(n)$ is the additive white gaussian noise (awgn), $h$ is the channel gain, and $\theta$ is the activity pointer, taking one of two values:

\[ \theta = \begin{cases} 0 & \text{for the } H_0 \text{ hypothesis}, \\ 1 & \text{for the } H_1 \text{ hypothesis}. \end{cases} \tag{2} \]

when the pu is present, it is represented by hypothesis $H_1$; when the pu is absent, by hypothesis $H_0$. the probabilities of false alarm and detection are measured by comparing the energy computed from the sensed signal on an observation window $W$ with a pre-defined threshold $\lambda$. the accumulated energy $E_{nj}$ can be written as

\[ E_{nj} = \frac{1}{N} \sum_{n=1}^{N} |y(n)|^2, \tag{3} \]

where $N$ is the total number of sensed samples, $N = W f_s$, with $f_s$ the sampling frequency. the probabilities of false alarm $P_f$ and detection $P_d$ are

\[ P_f = \Pr(E_{nj} > \lambda \mid H_0), \tag{4} \]
\[ P_d = \Pr(E_{nj} > \lambda \mid H_1). \tag{5} \]

numerically, the threshold value can be computed for a constant $P_f$ value as [24]

\[ \lambda = \frac{\left( Q^{-1}(P_f) + \sqrt{N} \right)^2 \sqrt{N}}{N^2}. \tag{6} \]
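the decision rule of equations (3)-(5) can be sketched in a few lines of python; this is a generic illustration under our own assumptions, not the authors' matlab code. instead of the closed-form constant-false-alarm threshold of equation (6), the sketch calibrates the threshold empirically for a target $P_f$ under $H_0$, and the pu waveform, amplitudes, and parameter values are all made up for the example:

```python
import numpy as np

rng = np.random.default_rng(0)

def energy_statistic(y):
    """Normalized accumulated energy of equation (3): (1/N) * sum |y(n)|^2."""
    return np.mean(np.abs(y) ** 2)

def calibrate_threshold(n_samples, pf_target, noise_std=1.0, trials=10_000):
    """Pick lambda empirically so that P(E > lambda | H0) ~= pf_target.

    Monte-Carlo calibration replaces the closed form of equation (6) and keeps
    the sketch independent of the exact noise statistics.
    """
    h0 = noise_std * rng.standard_normal((trials, n_samples))
    stats = np.mean(h0 ** 2, axis=1)
    return np.quantile(stats, 1.0 - pf_target)

def energy_detect(y, threshold):
    """Hard decision of equations (4)-(5): 1 declares the PU present."""
    return int(energy_statistic(y) > threshold)

N = 512
lam = calibrate_threshold(N, pf_target=0.01)
pu = np.cos(2 * np.pi * 0.05 * np.arange(N))     # toy PU waveform
y_h1 = pu + rng.standard_normal(N)               # PU present (H1)
y_h0 = rng.standard_normal(N)                    # noise only (H0)
print(energy_detect(y_h1, lam), energy_detect(y_h0, lam))   # typically: 1 0
```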
3.2. cyclostationary feature detection

cyclostationary feature detection is a spectrum sensing technique that detects pu signals by exploiting the cyclostationary features of the received signals: their periodicity, the number of signals, their modulation type, their symbol rate, and the presence of an interferer [25]. the method is based on the autocorrelation process. the autocorrelation is computed by multiplying the received signal $y(n)$ with a delayed version of itself, and the sum of the autocorrelation is compared with a pre-defined threshold to detect the activity of the pu signal: if the summation is larger than the threshold, the pu is present; otherwise, it is absent [11, 26]. this technique can distinguish between the signal and the noise, so it performs better than ed; however, it has a high computational complexity, since it requires a long sensing time. a signal is called cyclostationary if its autocorrelation is a periodic function of time $t$ with a given period; this type of detector is called a 2nd-order cyclostationary detector [25]. the discrete cyclic autocorrelation function of a discrete-time signal $y(n)$ with a fixed lag $l$ is defined as [21]

\[ R^{\alpha}_{yy}(l) = \lim_{N \to \infty} \frac{1}{N} \sum_{n=0}^{N-1} y[n]\, y^{*}[n+l]\, e^{-j 2\pi \alpha n \Delta n}, \tag{7} \]

where $N$ is the number of samples of the signal $y[n]$ and $\Delta n$ is the sampling interval. by applying the discrete fourier transform to $R^{\alpha}_{yy}(l)$, the cyclic spectrum (cs) is given as [21]

\[ S^{\alpha}_{yy}(f) = \sum_{l=-\infty}^{\infty} R^{\alpha}_{yy}(l)\, e^{-j 2\pi f l \Delta l}. \tag{8} \]

the detection of the pu signal is achieved by sensing the cyclic frequency of its cyclic spectrum or cyclic autocorrelation function (caf): if the caf is larger than the pre-defined threshold, the signal is present; otherwise, it is absent [25].

3.3. matched filter

the matched filter is a coherent detection technique that requires prior information about the pu signal at the su. assuming that the pu transmitter sends a pilot stream simultaneously with the data, the su receives the signal and the pilot stream, and matched filter detection is performed by projecting the received signal in the direction of the pilot [1]. the test statistic can be written as

\[ T_{MFD} = \sum_{n} y(n)\, x_p^{*}(n), \tag{9} \]

where $x_p$ represents the pu signal and $y$ the signal received at the su. the test statistic $T_{MFD}$ is then compared with a pre-defined threshold to detect the activity of the pu:

\[ \begin{cases} T_{MFD} \ge \lambda, & \text{pu signal present}, \\ T_{MFD} < \lambda, & \text{pu signal absent}. \end{cases} \tag{10} \]
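the caf of equation (7) and the mf statistic of equation (9) are straightforward to estimate numerically. the python sketch below is our illustrative finite-$N$ estimate with made-up signal parameters, not code from the paper; it shows the caf of a noisy real carrier peaking at the cyclic frequency $\alpha = 2 f_0$ while staying near zero at an arbitrary $\alpha$, and the pilot correlation of the mf being large when the pilot matches:

```python
import numpy as np

def cyclic_autocorrelation(y, alpha, lag, dt=1.0):
    """Finite-N estimate of the cyclic autocorrelation of equation (7)
    at cyclic frequency alpha and fixed lag l."""
    n = np.arange(len(y) - lag)
    return np.mean(y[n] * np.conj(y[n + lag])
                   * np.exp(-2j * np.pi * alpha * n * dt))

def matched_filter_statistic(y, x_pilot):
    """Test statistic of equation (9): correlation of the received signal
    with the known PU pilot/replica."""
    return np.real(np.sum(y * np.conj(x_pilot)))

rng = np.random.default_rng(1)
N, f0 = 4096, 0.05                         # toy carrier at f0 cycles/sample
pu = np.cos(2 * np.pi * f0 * np.arange(N))
y = pu + rng.standard_normal(N)

# a real carrier exhibits a CAF peak at alpha = 2*f0 ...
print(abs(cyclic_autocorrelation(y, alpha=2 * f0, lag=0)))   # ~0.25
# ... while an unrelated cyclic frequency stays near zero
print(abs(cyclic_autocorrelation(y, alpha=0.123, lag=0)))    # ~0
print(matched_filter_statistic(y, pu))                       # ~N/2: pilot matches
```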
4. the proposed method

the design of the proposed method is based on the matched filter and cyclostationary techniques, with an improved detection performance and a reduced computational complexity in both of them. the matched filter receives the pu signal and senses half the number of samples, selecting one and skipping the next, in order to reduce the computational complexity of the convolution between the incoming received (pu) signal and the impulse response stored in the matched filter stage. when the detector does not have good knowledge of the pu, or when the received signal is distorted by the channel, the method switches to the cyclostationary technique to overcome the degradation of the detection performance. the cyclostationary stage likewise senses the pu signal using half the number of samples, sensing one and skipping one, to reduce the computational complexity of the autocorrelation process. the proposed method therefore achieves a high detection performance with a reduced computational complexity. figure 1 shows the flowchart of the proposed method, and figure 2 shows the proposed system model using the centralised cooperative network. according to [11] and [21], the probability of detection of the proposed method can be written as

\[ P_{d,proposed_i} = 1 - (1 - P_{d,MF_i})(1 - P_{d,cyco_i}), \qquad i = 1, 2, \dots, K, \tag{11} \]

where $K$ is the number of sus in the cooperative scenario, $P_{d,proposed_i}$ is the probability of detection of the proposed method, $P_{d,MF_i}$ is the probability of detection of the matched filter stage, and $P_{d,cyco_i}$ is the probability of detection of the cyclostationary stage.

figure 1: procedures of the proposed method.
figure 2: the proposed system model using the centralised cooperative network.

5. computational complexity of the proposed method

in this section, we compute the computational complexity of the two stages (mf and cfd). since the mf is based on the convolution between the received signal and the prior information on the pu signal, the frequency-domain convolution amounts to a multiplication of two signals: we compute the frequency-domain transform of both the received pu signal and the impulse response, and then multiply them. the computational complexity of the fft of $N$ samples is $O(N \log_2 N)$ according to [21], while multiplying two signals of $N$ samples each costs $O(N)$. so the computational complexity of the traditional mf becomes

\[ C_{conFFT} = 2\,O(N \log_2 N) + O(N), \tag{12} \]

where $N$ is the number of samples. in the proposed method, we select half of the samples by choosing one and skipping one, so equation (12) becomes

\[ C_{conpropoFFT} = 2\,O\!\left(\frac{N}{2}\log_2\frac{N}{2}\right) + O\!\left(\frac{N}{2}\right). \tag{13} \]

in the second stage, the cyclostationary process is based on the autocorrelation, whose computational complexity is [22, 27]

\[ C_{auto} = (\text{no. of real multiplications}) + (\text{no. of real additions}) = 4N + 4N - 2. \tag{14} \]

the complexity of the traditional cyclostationary process is then

\[ C_{cycl} = 4N + 4N - 2 + O(N \log_2 N). \tag{15} \]

since the proposed method keeps only half of the samples for the cyclostationary process, selecting one and skipping one, equation (15) reduces to

\[ C_{cyclproposed} = 2N + 2N - 2 + O\!\left(\frac{N}{2}\log\frac{N}{2}\right) = 4N - 2 + O\!\left(\frac{N}{2}\log\frac{N}{2}\right). \tag{16} \]

the total computational complexity of the proposed method is the sum of equations (13) and (16):

\[ C_{Totalproposed} = 4N - 2 + 3\,O\!\left(\frac{N}{2}\log_2\frac{N}{2}\right) + O\!\left(\frac{N}{2}\right). \tag{17} \]

the computational complexity ratio is defined as the ratio of the computational complexity of the proposed method to the maximum computational complexity (that of the traditional cyclostationary method):

\[ C_{ratio} = \frac{C_{Totalproposed}}{C_{cycl}}. \tag{18} \]

table 1 summarises the computational complexity of the proposed method, the traditional hybrid method of [11], the hybrid method of [21], the traditional cyclostationary detector, and the traditional mf.

method                              computational complexity
proposed method                     C_Totalproposed = 4N - 2 + 3 O((N/2) log2(N/2)) + O(N/2)
hybrid method in [21]               C_hybrid = 2N + 2N - 2 + O(N log2 N)
traditional hybrid [11]             C_hybridtradi = 4N + 4N - 2 + O(N log2 N)
traditional cyclostationary [21]    C_cycl = 4N + 4N - 2 + O(N log2 N)
traditional mf                      C_conFFT = 2 O(N log2 N) + O(N)

table 1: comparison of computational complexity.

it can be noted that the complexity of the traditional hybrid method is the same as that of the traditional cyclostationary method.
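the two-stage logic of figure 1 can be sketched as follows. this is a minimal python illustration under our own simplifying assumptions (a lag-0 caf, fixed illustrative thresholds, and a cyclic frequency expressed with respect to the decimated stream), not the authors' matlab implementation; it also checks the combination rule of equation (11) on example stage probabilities:

```python
import numpy as np

def decimate_half(y):
    """Keep every other sample ("sense one, skip one"), halving the cost of
    the convolution/autocorrelation stages as in equations (13) and (16)."""
    return y[::2]

def hybrid_detect(y, pilot, lam_mf, lam_cfd, alpha, pilot_known=True):
    """Two-stage decision of section 4 (illustrative): try the matched filter
    on half the samples; without PU information, fall back to the
    cyclostationary stage, also on half the samples."""
    yh = decimate_half(y)
    if pilot_known:
        t_mf = np.real(np.sum(yh * np.conj(decimate_half(pilot))))
        return int(t_mf >= lam_mf)                      # equation (10)
    # lag-0 cyclic autocorrelation of the decimated stream (equation (7), l = 0);
    # after decimation the normalized cyclic frequency of the carrier doubles
    n = np.arange(len(yh))
    caf = np.abs(np.mean(yh * yh * np.exp(-2j * np.pi * alpha * n)))
    return int(caf >= lam_cfd)

rng = np.random.default_rng(2)
N = 2048
pilot = np.cos(2 * np.pi * 0.05 * np.arange(N))
y = pilot + rng.standard_normal(N)
print(hybrid_detect(y, pilot, 100.0, 0.1, alpha=0.2, pilot_known=True))   # MF stage: 1
print(hybrid_detect(y, pilot, 100.0, 0.1, alpha=0.2, pilot_known=False))  # CFD stage: 1

# OR-type combination of equation (11) for one SU, with example stage values
pd_mf, pd_cfd = 0.80, 0.90
print(1 - (1 - pd_mf) * (1 - pd_cfd))   # 0.98: the hybrid exceeds either stage alone
```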
6. simulation results and discussion

this section shows the simulation results of the proposed method in both the cooperative and the non-cooperative scenarios. the performance is tested under awgn and rayleigh multipath fading channels. the results were obtained using matlab 2018 on windows 10. the performance of the proposed method is measured using the probability of detection and the computational complexity ratio, and evaluated by comparing it with the hybrid methods of references [11] and [21] and with the traditional methods (cyclostationary feature detection (cfd) and the matched filter (mf)). the simulation parameters are listed in table 2. the multipath fading used is the "itu indoor channel model (a)", with the specification shown in table 3 [28].

parameter                value
pu signal                qpsk
carrier frequency fc     200 hz
sampling frequency fs    4000 hz
pf                       0.001

table 2: simulation parameters.

tap   relative delay [ns]   average power [db]   doppler spectrum
1     0                     0                    flat
2     50                    -3.0                 flat
3     110                   -10.0                flat
4     170                   -18.0                flat
5     290                   -26.0                flat
6     310                   -32.0                flat

table 3: multipath fading properties of the itu indoor channel model (a).

figure 3 shows the performance curves of pd versus eb/no for the traditional sensing methods (energy detection, cyclostationary, and matched filter) under awgn in the non-cooperative scenario. the matched filter performs better than the energy detection and cyclostationary methods, especially at low eb/no, since it has good knowledge of the pu signal: for example, at eb/no equal to 0 db, the probability of detection of the matched filter is increased by 36 % and 91 % as compared to the cyclostationary and energy detection methods, respectively. however, the performance of the matched filter becomes very poor when the knowledge of the pu signal is poor, and the cyclostationary technique then becomes the best in detection performance.

figure 3: performance comparison between traditional sensing techniques under awgn, non-cooperative scenario.

the percentage improvements quoted in this and all results below are computed as

percentage improvement = (higher value - lower value) × 100 %,

i.e., when comparing two curves at the same eb/no or n, we take the two values from the curves, identify the lower one, and report their difference in percentage points.

figure 4 presents the same comparison as figure 3, but under rayleigh multipath fading. all techniques behave as in figure 3, with a degradation of the probability of detection due to the multipath fading; the matched filter again outperforms the other techniques when the knowledge of the pu is good.

figure 4: performance comparison between traditional sensing techniques under rayleigh fading, non-cooperative scenario.

figure 5 illustrates pd versus eb/no of the proposed method in the non-cooperative scenario under rayleigh multipath fading, compared with the hybrid methods of [11] and [21] and the traditional methods (cfd and matched filter detection). the probability of detection of the proposed method outperforms the other methods, especially at low eb/no values: the matched filter gives an excellent detection performance when it has the best knowledge about the pu signal, and when its knowledge is poor it switches to the cyclostationary technique, which is blind (it does not need information about the pu signal) and performs very well at low eb/no. the overall detection performance of the proposed method is therefore excellent, with a low computational complexity. for example, at eb/no equal to 0 db, the proposed method achieves an increase in detection probability of 38 %, 28 %, 28 %, and 18 % as compared with the traditional hybrid method of [11], the hybrid method of [21], the traditional cfd method, and the traditional mf method, respectively.

figure 5: pd versus eb/no of the proposed sensing method under rayleigh fading, non-cooperative scenario.
figure 6 displays the performance curves of the average pd versus eb/no of the proposed method in the cooperative and non-cooperative scenarios, compared with the traditional hybrid method of [11]. in the cooperative scenario, we assumed that 3 cus perform the sensing and that one of them suffers from multipath fading. the detection performance of the cooperative scenario shows a larger improvement than the non-cooperative one in both methods, since the effect of fading is reduced: for instance, at eb/no equal to 0 db, the detection performance of the proposed method is increased by 26 % as compared with a single cu in multipath fading, and by 20 % as compared with the traditional hybrid method, also with a single cu in multipath fading. in all cases, the proposed method performs better than the traditional hybrid method.

figure 6: performance comparison between the cooperative and non-cooperative scenarios for the proposed and traditional methods.

figure 7 shows the computational complexity ratio versus the number of samples. the proposed method has a lower computational complexity than the hybrid method of [11], the traditional cyclostationary method, and the mf, since it computes the convolution in the mf stage, or the autocorrelation in the cfd stage, with half of the samples; it is slightly higher than the hybrid method of [21], since that method uses an ed in the first stage. however, the proposed method outperforms the hybrid method of [21] and the others in the probability of detection. for example, at n equal to 100, the computational complexity ratio of the proposed method is decreased by 14 %, 14 %, and 12 % as compared to cfd, the traditional hybrid method of [11], and the mf, respectively. we conclude that the proposed method has an excellent probability of detection together with a very good reduction in computational complexity.

figure 7: computational complexity ratio versus the number of samples.

table 4 summarises the performance of the proposed method, the hybrid methods of [11] and [21], and the traditional methods (ed, cyclostationary, and mf).

method                              detection performance                    computational complexity
proposed method                     excellent                                moderate
hybrid method in [21]               good                                     low
traditional hybrid method in [11]   good                                     high
cfd                                 good                                     high
mf                                  very good (with best pu information)    moderate
ed                                  low                                      low

table 4: summary of performance measurement.

the table shows that at very low snr values the proposed method is a perfect choice for spectrum sensing in terms of detection performance and computational complexity; in a very good channel environment, the ed becomes the best choice, but since in most cases the channel environment is bad, the proposed method is the more appropriate one.

7. conclusions

in this paper, we proposed a modified hybrid sensing method to overcome the problems of the traditional spectrum sensing techniques. the proposed method is based on a combination of mf and cfd to improve the detection performance and reduce the computational complexity. it was simulated in matlab under rayleigh multipath fading with two scenarios, cooperative and non-cooperative, measured using pd and cratio, and evaluated by comparison with traditional and hybrid sensing methods from the literature. the simulation results show that the proposed method outperforms the other methods in the literature in terms of probability of detection and computational complexity in both channels. in future work, the method can be tested under other types of fading channels.
list of symbols

y(n)             received sensed signal at the cu
x(n)             pu signal
noi(n)           additive white gaussian noise
h                gain of the channel
θ                activity pointer
h1               hypothesis when the pu is present
h0               hypothesis when the pu is absent
w                observation window
λ                pre-defined threshold
enj              accumulated energy
n                number of sensed samples
fs               sampling frequency
pf               probability of false alarm
pd               probability of detection
rαyy(l)          discrete cyclic autocorrelation function
∆n               sampling interval
sαyy(f)          cyclic spectrum
xp               previous information of the pu signal
tmfd             test statistic of the mf
pd,proposedi     probability of detection of the proposed method
pd,mfi           probability of detection of the mf stage
pd,cycoi         probability of detection of the cfd stage
k                number of sus
cconfft          computational complexity of the traditional mf
ccycl            computational complexity of the traditional cfd
ctotalproposed   computational complexity of the proposed method
cratio           ratio of computational complexity
eb/no            signal-to-noise ratio per bit

list of abbreviations

cr      cognitive radio
ss      spectrum sensing
ed      energy detection
mf      matched filter
cfd     cyclostationary feature detection
cc      computational complexity
rf      radio frequency
pu      primary user
su      secondary user
cu      cognitive user
snr     signal-to-noise ratio
css     cooperative spectrum sensing
fc      fusion centre
sus     secondary users
mme     maximum–minimum eigenvalue
ann     artificial neural networks
caf     cyclic autocorrelation function
cs      cyclic spectrum
fft     fast fourier transform
itu     international telecommunication union
awgn    additive white gaussian noise

references

[1] f. salahdine. spectrum sensing techniques for cognitive radio networks, 2017. https://doi.org/10.48550/arxiv.1710.02668.
[2] m. e. bayrakdar, a. çalhan. performance analysis of sensing based spectrum handoff process for channel bonding mechanism in wireless cognitive networks. in 2017 25th signal processing and communications applications conference (siu), antalya, turkey, 2017. https://doi.org/10.1109/siu.2017.7960157.
[3] a. mukherjee, s. choudhury, p. goswami, et al. a novel approach of power allocation for secondary users in cognitive radio networks. computers & electrical engineering 75:1–8, 2018. https://doi.org/10.1016/j.compeleceng.2018.03.006.
[4] m. cicioğlu, m. e. bayrakdar, a. çalhan. performance analysis of mac technique developed for wireless cognitive radio networks. in 2018 26th signal processing and communications applications conference (siu), izmir, turkey, 2018. https://doi.org/10.1109/siu.2018.8404747.
[5] s. althunibat. towards energy efficient cooperative spectrum sensing in cognitive radio networks. ph.d. thesis, university of trento, italy, 2014. http://eprints-phd.biblio.unitn.it/1335/.
[6] e. katis. resource management of energy-aware cognitive radio networks and cloud-based infrastructures, 2015. https://doi.org/10.48550/arxiv.1505.00906.
[7] m. matinmikko (editor), m. höyhtyä, m. mustonen, et al. cognitive radio: an intelligent wireless communication system. tech. rep. vtt-r-02219-08, 2008. http://www.vtt.fi/inf/julkaisut/muut/2008/chess_research_report.pdf.
[8] a. mukherjee, s. maiti, a. datta. spectrum sensing for cognitive radio using blind source separation and hidden markov model. in 2014 fourth international conference on advanced computing & communication technologies, pp. 409–414, 2014. https://doi.org/10.1109/acct.2014.63.
[9] m. e. bayrakdar, a. çalhan.
optimization of ant colony for next generation wireless cognitive networks. journal of polytechnic 24(3):779–784, 2021. https://doi.org/10.2339/politeknik.635065.
[10] m. s. falih, h. n. abdullah. dwt based energy detection spectrum sensing method for cognitive radio system. iraqi journal of information and communications technology 3(3):1–11, 2020. https://doi.org/10.31987/ijict.3.3.99.
[11] k. yadav, s. d. roy, s. kundu. hybrid cooperative spectrum sensing with cyclostationary detection for cognitive radio networks. in 2016 ieee annual india conference (indicon), pp. 1–6, 2016. https://doi.org/10.1109/indicon.2016.7839118.
[12] h. arezumand, h. sadeghi. a low-complexity cyclostationary-based detection method for cooperative spectrum sensing in cognitive radio networks. international journal of information and communication technology 3(3):1–10, 2011. https://elmnet.ir/article/605357-22041/a-low-complexity-cyclostationary-based-detection-method-for-cooperative-spectrum-sensing-in-cognitive-radio-networks.
[13] p. bachan, s. k. ghosh, s. k. saraswat. comparative error rate analysis of cooperative spectrum sensing in non-fading and fading environments. in 2015 communication, control and intelligent systems (ccis), pp. 124–127, 2015. https://doi.org/10.1109/ccintels.2015.7437891.
[14] p. bachan, s. k. ghosh, s. r. trankatwar. parametric optimization of improved sensing scheme in multi-antenna cognitive radio network over erroneous channel. in a. khanna, d. gupta, z. pólkowski, et al. (eds.), data analytics and management, vol. 54, pp. 533–541, 2021. https://doi.org/10.1007/978-981-15-8335-3_41.
[15] h. n. abdullah, n. s. baker, m. t. s. al-kaltakchi. proposed two-stage detection rules for improving throughput in cognitive radio networks. iraqi journal of computers, communications, control & system engineering (ijccce) 19(4):1–11, 2019. https://doi.org/10.33103/uot.ijccce.19.4.1.
[16] a. mukherjee, p. goswami, a. datta. hml-based smart positioning of fusion center for cooperative communication in cognitive radio networks. ieee communications letters 20(11):2261–2263, 2016. https://doi.org/10.1109/lcomm.2016.2602266.
[17] s. k. ghosh, s. r. trankatwar, p. bachan. optimal voting rule and minimization of total error rate in cooperative spectrum sensing for cognitive radio networks. journal of telecommunications and information technology 1:43–50, 2021. https://doi.org/10.26636/jtit.2021.144420.
[18] a. r. mohamed, a. a. a. el-banna, h. a. mansour. multi-path hybrid spectrum sensing in cognitive radio. arabian journal for science and engineering 46:9377–9384, 2021. https://doi.org/10.1007/s13369-020-05281-0.
[19] a. nasser, m. chaitou, a. mansour, et al. a deep neural network model for hybrid spectrum sensing in cognitive radio. wireless personal communications 118:281–299, 2021. https://doi.org/10.1007/s11277-020-08013-7.
[20] a. s. khobragade, r. d. raut. hybrid spectrum sensing method for cognitive radio. international journal of electrical and computer engineering (ijece) 7(5):2683–2695, 2017. https://doi.org/10.11591/ijece.v7i5.pp2683-2695.
[21] h. n. abdullah, z. o. dawood, a. e. abdelkareem, h. s. abed. complexity reduction of cyclostationary sensing technique using improved hybrid sensing method. acta polytechnica 60(4):279–287, 2020. https://doi.org/10.14311/ap.2020.60.0279.
[22] d. shen, d. he, w.-h. li, y.-p. lin. an improved cyclostationary feature detection based on the selection of optimal parameter in cognitive radios. journal of shanghai jiaotong university (science) 17(1):1–7, 2012. https://doi.org/10.1007/s12204-012-1222-z.
[23] m. emara, h. s. ali, s. e. a. khamis, f. e. a. el-samie. spectrum sensing optimization and performance enhancement of cognitive radio networks. wireless personal communications 86:925–941, 2016. https://doi.org/10.1007/s11277-015-2962-5.
[24] s. atapattu. analysis of energy detection in cognitive radio networks. ph.d. thesis, university of alberta, canada, 2013. http://www.ece.ualberta.ca/~chintha/pdf/thesis/phd_saman.pdf.
[25] w. adigwe, o. r. okonkwo. a review of cyclostationary feature detection based spectrum sensing technique in cognitive radio networks. e3 journal of scientific research 4(3):041–047, 2016. https://doi.org/10.18685/ejsr(4)3_ejsr-16-010.
[26] i. g. anyim. wideband cyclostationary spectrum sensing with receiver constraints and optimization. ph.d. thesis, university of portsmouth, united kingdom, 2018.
[27] s. narieda. low complexity cyclic autocorrelation function computation for spectrum sensing. ieice communications express 6(6):387–392, 2017. https://doi.org/10.1587/comex.2016xbl0211.
[28] m. pätzold. mobile fading channels. john wiley & sons, ltd, 2002. https://doi.org/10.1002/0470847808.fmatter.
acta polytechnica
https://doi.org/10.14311/ap.2022.62.0008
acta polytechnica 62(1):8–15, 2022. © 2022 the author(s). licensed under a cc-by 4.0 licence. published by the czech technical university in prague.

quantum description of angles in the plane

roberto beneducia, emmanuel frionb, jean-pierre gazeauc,∗

a università della calabria and istituto nazionale di fisica nucleare, gruppo c. cosenza, 87036 arcavacata di rende (cs), italy
b university of helsinki, helsinki institute of physics, p. o. box 64, fin-00014 helsinki, finland
c université de paris, cnrs, astroparticule et cosmologie, 75013 paris, france
∗ corresponding author: gazeau@apc.in2p3.fr

abstract. the real plane with its set of orientations or angles in [0, π) is the simplest non-trivial example of a (projective) hilbert space and provides nice illustrations of the quantum formalism. we present some of them, namely covariant integral quantization, the linear polarisation of light as a quantum measurement, an interpretation of entanglement leading to the violation of bell inequalities, and spin one-half coherent states viewed as two entangled angles.

keywords: integral quantization, real hilbert spaces, quantum entanglement.

1. introduction

the formulation of quantum mechanics in a real hilbert space was analyzed by stueckelberg in 1960 [1] in order to show that the need for a complex hilbert space is connected to the uncertainty principle. later, solèr [2] showed that the lattice of elementary propositions is isomorphic to the lattice of closed subspaces of a separable hilbert space (over the reals, the complex numbers or the quaternions). in other words, the lattice structure of propositions in quantum physics does not suggest the hilbert space to be complex.
more recently, moretti and oppio [3] gave a stronger motivation for the hilbert space to be complex, resting on the symmetries of elementary relativistic systems. in this contribution, we do not address the question of the physical validity of the real hilbert space formulation of quantum mechanics, but limit ourselves to using the real 2-dimensional case, i.e. the euclidean plane, as a toy model for illustrating some aspects of the quantum formalism, such as quantization, entanglement and quantum measurement. the latter is nicely represented by the linear polarization of light. this real 2-dimensional case relies on the manipulation of the two real pauli matrices

\[ \sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \tag{1} \]

and their tensor products, with no mention of the third, complex matrix $\sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$. as a matter of fact, many examples aimed at illustrating the tools and concepts of quantum information, quantum measurement, quantum foundations, etc. (e.g., peres [4]) involve manipulations of these matrices. in [5], it was shown that the set of pure states in the plane is represented by half of the unit circle and the set of mixed states by half of the unit disk, and also that rotations in the plane rule time evolution through majorana-like equations, all of this using only real quantities for both closed and open systems. this paper is a direct extension of our previous paper [6], and for this reason we start the discussion by recalling some key elements of the mathematical formalism.

2. background

2.1. definition of povms

we start with the definition of a normalized positive operator-valued measure (povm) [7]. it is defined as a map $F : \mathcal{B}(\Omega) \to \mathcal{L}^{+}_{s}(\mathcal{H})$ from the borel σ-algebra of a topological space Ω to the space of linear, positive, self-adjoint operators on a hilbert space $\mathcal{H}$ such that

\[ F\!\left( \bigcup_{n=1}^{\infty} \Delta_n \right) = \sum_{n=1}^{\infty} F(\Delta_n), \qquad F(\Omega) = \mathbf{1}, \tag{2} \]

where $\{\Delta_n\}$ is a countable family of disjoint sets in $\mathcal{B}(\Omega)$ and the series converges in the weak operator topology. if $\Omega = \mathbb{R}$, we have a real povm. if $F(\Delta)$ is a projection operator for every $\Delta \in \mathcal{B}(\Omega)$, we recover the usual projection-valued measure (pvm). a quantum state is defined as a non-negative, bounded, self-adjoint operator with trace 1; the space of states is a convex space denoted by $\mathcal{S}(\mathcal{H})$. a quantum measurement corresponds to an affine map $\mathcal{S}(\mathcal{H}) \to \mathcal{M}^{+}(\Omega)$ from quantum states to probability measures, $\rho \mapsto \mu_\rho$. there is [8] a one-to-one correspondence between povms $F : \mathcal{B}(\Omega) \to \mathcal{L}^{+}_{s}(\mathcal{H})$ and such affine maps, given by $\mu_\rho(\Delta) = \mathrm{tr}(\rho\, F(\Delta))$, $\Delta \in \mathcal{B}(\Omega)$.

2.2. integral quantization

quantum mechanics is usually taught in terms of projection operators and pvms, but measurements usually give a statistical distribution around a mean value, which is incompatible with the theory. we recall here a generalization of the quantization procedure, integral quantization, based on povms instead of pvms. the basic requirements of this programme are the following: the quantization of a classical function defined on a set $X$ must respect
(1.) linearity. quantization is a linear map

\[ Q : \mathcal{C}(X) \to \mathcal{A}(\mathcal{H}), \qquad Q(f) = A_f, \tag{3} \]

where
• $\mathcal{C}(X)$ is a vector space of complex- or real-valued functions $f(x)$ on a set $X$, i.e. a "classical" mathematical model,
• $\mathcal{A}(\mathcal{H})$ is a vector space of linear operators in some real or complex hilbert space $\mathcal{H}$, i.e. a "quantum" mathematical model, notwithstanding the question of common domains in the case of unbounded operators.

(2.) unity. the map (3) is such that the function $f = 1$ is mapped to the identity operator $\mathbf{1}$ on $\mathcal{H}$.

(3.) reality. a real function $f$ is mapped to a self-adjoint or normal operator $A_f$ in $\mathcal{H}$ or, at least, a symmetric operator (in the infinite-dimensional case).

(4.) covariance. defining the action of a symmetry group $G$ on $X$ by $(g, x) \in G \times X$, $(g, x) \mapsto g \cdot x \in X$, there is a unitary representation $U$ of $G$ such that $A_{T(g)f} = U(g)\, A_f\, U(g^{-1})$, with $(T(g)f)(x) = f(g^{-1} \cdot x)$.

performing the integral quantization [9] of a function $f(x)$ on a measure space $(X, \nu)$ boils down to the linear map

\[ f \mapsto A_f = \int_X M(x)\, f(x)\, d\nu(x), \tag{4} \]

where we introduce a family of operators $M(x)$ resolving the identity; more precisely,

\[ X \ni x \mapsto M(x), \qquad \int_X M(x)\, d\nu(x) = \mathbf{1}. \tag{5} \]

if the $M(x)$ are non-negative, they provide a povm. indeed, the quantization of the characteristic function $\chi_\Delta$ of a borel set $\Delta$,

\[ F(\Delta) := A_{\chi_\Delta} = \int_\Delta M(x)\, d\nu(x), \tag{6} \]

is a povm, which provides a quantization procedure $f \mapsto A_f = \int_X f(x)\, dF(x)$.

3. euclidean plane as hilbert space of quantum states

3.1. mixed states as density matrices

density matrices act as a family of operators which can be used to perform covariant integral quantization. in the context of the euclidean plane and its rotational symmetry, one associates the polar angle $\phi \in [0, 2\pi)$ of the unit vector $\hat{u}_\phi$ with the pure state $|\phi\rangle := |\hat{u}_\phi\rangle$. as shown in figure 1, two orthogonal pure states $\hat{\imath} = |0\rangle \equiv (1, 0)^{\top}$ and $\hat{\jmath} = |\pi/2\rangle \equiv (0, 1)^{\top}$ are readily identified with the unit vectors spanning the plane, with $\langle 0|0\rangle = 1 = \langle \pi/2|\pi/2\rangle$ and $\langle 0|\pi/2\rangle = 0$; the pure state $|\phi\rangle = (\cos\phi, \sin\phi)^{\top} \leftrightarrow E_\phi = |\phi\rangle\langle\phi|$ is obtained by an anticlockwise rotation of angle $\phi$ of the pure state $|0\rangle$.

figure 1: the euclidean plane and its unit vectors viewed as pure quantum states in dirac ket notation.

denoting the orthogonal projectors on $\hat{\imath}$ and $\hat{\jmath}$ by $|0\rangle\langle 0|$ and $|\pi/2\rangle\langle \pi/2|$ respectively, the resolution of the identity reads

\[ \mathbf{1} = |0\rangle\langle 0| + \left|\tfrac{\pi}{2}\right\rangle\!\left\langle \tfrac{\pi}{2}\right| \;\Longleftrightarrow\; \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} + \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}. \tag{7} \]

recalling that a pure state in the plane, equivalently an orientation, can be decomposed as $|\phi\rangle = \cos\phi\, |0\rangle + \sin\phi\, |\pi/2\rangle$, with $\langle 0|\phi\rangle = \cos\phi$ and $\langle \pi/2|\phi\rangle = \sin\phi$, the orthogonal projector corresponding to the pure state $|\phi\rangle$ is

\[ E_\phi = \begin{pmatrix} \cos^2\phi & \cos\phi\,\sin\phi \\ \cos\phi\,\sin\phi & \sin^2\phi \end{pmatrix}, \tag{8} \]

from which we can construct the density matrices corresponding to all the mixed states,

\[ \rho = \left(\frac{1+r}{2}\right) E_\phi + \left(\frac{1-r}{2}\right) E_{\phi+\pi/2}, \qquad 0 \le r \le 1, \tag{9} \]

where the parameter $r$ represents the degree of mixing. hence the upper half-disk $(r, \phi)$, $0 \le r \le 1$, $0 \le \phi < \pi$, is in one-to-one correspondence with the set of density matrices $\rho \equiv \rho_{r,\phi}$ written as

\[ \rho_{r,\phi} = \frac{1}{2}\mathbf{1} + \frac{r}{2}\, R(\phi)\,\sigma_3\, R(-\phi) = \begin{pmatrix} \frac{1}{2} + \frac{r}{2}\cos 2\phi & \frac{r}{2}\sin 2\phi \\ \frac{r}{2}\sin 2\phi & \frac{1}{2} - \frac{r}{2}\cos 2\phi \end{pmatrix} = \frac{1}{2}\left(\mathbf{1} + r\,\sigma_{2\phi}\right), \tag{10} \]

where $R(\phi) = \begin{pmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{pmatrix}$ is a rotation matrix in the plane, and

\[ \sigma_\phi := \cos\phi\, \sigma_3 + \sin\phi\, \sigma_1 \equiv \vec{\sigma}\cdot \hat{u}_\phi = \begin{pmatrix} \cos\phi & \sin\phi \\ \sin\phi & -\cos\phi \end{pmatrix} = R(\phi)\,\sigma_3. \tag{11} \]

the observable $\sigma_\phi$ has eigenvalues $\{\pm 1\}$, with eigenvectors $|\phi/2\rangle$ and $|(\phi+\pi)/2\rangle$ respectively.
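a quick numerical sanity check of equations (8)-(10) can be run in a few lines; this python/numpy sketch (ours, not part of the original paper, with arbitrarily chosen r and ϕ) verifies the unit trace, the positivity, and the spectral decomposition (9):

```python
import numpy as np

sigma1 = np.array([[0.0, 1.0], [1.0, 0.0]])
sigma3 = np.array([[1.0, 0.0], [0.0, -1.0]])

def rho(r, phi):
    """Density matrix of equation (10): (1/2)(1 + r * sigma_{2 phi})."""
    sigma_2phi = np.cos(2 * phi) * sigma3 + np.sin(2 * phi) * sigma1
    return 0.5 * (np.eye(2) + r * sigma_2phi)

def proj(phi):
    """Orthogonal projector E_phi of equation (8)."""
    v = np.array([np.cos(phi), np.sin(phi)])
    return np.outer(v, v)

r, phi = 0.6, 0.4
m = rho(r, phi)
print(np.trace(m))               # 1.0: unit trace
print(np.linalg.eigvalsh(m))     # (1-r)/2 and (1+r)/2, both >= 0: positivity
# spectral decomposition of equation (9)
recon = (1 + r) / 2 * proj(phi) + (1 - r) / 2 * proj(phi + np.pi / 2)
print(np.allclose(m, recon))     # True
```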
it plays a crucial rôle since, as we show below, it is at the core of both the non-commutative character and the entanglement of two quantum states of the real plane, and it is a typical observable used to illustrate the quantum formalism [4].

3.2. describing non-commutativity and finding naimark extensions through rotations

let us apply integral quantization with the real density matrices (10). with $X = S^1$, the unit circle, equipped with the measure $d\nu(x) = d\phi/\pi$, $\phi \in [0, 2\pi)$, we obtain the resolution of the identity for an arbitrary $\phi_0$:

\[ \int_0^{2\pi} \rho_{r,\phi+\phi_0}\, \frac{d\phi}{\pi} = \mathbf{1}. \tag{12} \]

hence, quantizing a function (or distribution) $f(\phi)$ on the circle is done through the map

\[ f \mapsto A_f = \int_0^{2\pi} f(\phi)\, \rho_{r,\phi+\phi_0}\, \frac{d\phi}{\pi} = \langle f \rangle\, \mathbf{1} + \frac{r}{2}\left[ C_c(R_{\phi_0} f)\, \sigma_3 + C_s(R_{\phi_0} f)\, \sigma_1 \right], \tag{13} \]

with $\langle f \rangle := \frac{1}{2\pi}\int_0^{2\pi} f(\phi)\, d\phi$ the average of $f$ on the unit circle and $R_{\phi_0}(f)(\phi) := f(\phi - \phi_0)$. here we have defined the cosine and sine doubled-angle fourier coefficients of $f$,

\[ C_{c,s}(f) = \int_0^{2\pi} f(\phi)\, \begin{Bmatrix} \cos 2\phi \\ \sin 2\phi \end{Bmatrix}\, \frac{d\phi}{\pi}. \tag{14} \]

in [6], we drew three consequences from this result. the first consequence is that, upon identification of $\mathbb{R}^3$ with the subspace $V_3 = \mathrm{span}\{ e_0(\phi) := \tfrac{1}{\sqrt{2}},\, e_1(\phi) := \cos 2\phi,\, e_2(\phi) := \sin 2\phi \}$ of $L^2(S^1, d\phi/\pi)$, the integral quantization map with $\rho_{r,\phi+\phi_0}$ yields a non-commutative version of $\mathbb{R}^3$:

\[ A_{e_0} = \frac{\mathbf{1}}{\sqrt{2}}, \quad A_{e_1} = \frac{r}{2}\left[\cos 2\phi_0\, \sigma_3 + \sin 2\phi_0\, \sigma_1\right] \equiv \frac{r}{2}\,\sigma_{2\phi_0}, \quad A_{e_2} = \frac{r}{2}\left[-\sin 2\phi_0\, \sigma_3 + \cos 2\phi_0\, \sigma_1\right] \equiv \frac{r}{2}\,\sigma_{2\phi_0+\pi/2}. \]

now the commutation rule reads

\[ [A_{e_1}, A_{e_2}] = -\frac{r^2}{2}\,\tau_2, \qquad \tau_2 := \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} = -i\sigma_2, \]

which depends on the real version of the last pauli matrix and on the degree of mixing. a second consequence, typical of quantum-mechanical ensembles, is that all functions $f(\phi)$ in $V_3$ yielding density matrices through this map imply that

\[ \rho_{s,\theta} = \int_0^{2\pi} \underbrace{\left[\frac{1}{2} + \frac{s}{r}\cos 2\phi\right]}_{f(\phi)}\, \rho_{r,\phi+\theta}\, \frac{d\phi}{\pi}. \tag{15} \]

if $r \ge 2s$, this continuous superposition of mixed states is convex; therefore, a mixed state is composed of an infinite number of other mixed states. this has consequences in quantum cryptography, for example, since the initial signal cannot be recovered from the output. the third and last consequence we mention here concerns the naimark extension of a function defined on the circle. in particular, we focus on the toeplitz quantization of $f(\phi)$, which is a kind of integral quantization. in [6], we used this framework to show that there exist orthogonal projectors from $L^2(S^1, d\phi/\pi)$ to $\mathbb{R}^2$ such that, for a function $f(\phi)$, the multiplication operator on $L^2(S^1, d\phi/\pi)$, defined by

\[ v \mapsto M_f\, v = f\, v, \tag{16} \]

is mapped to $A_f$. these are precisely naimark extensions of the povms represented by density matrices (see [6] for details).
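the resolution of the identity (12) and the quantization map (13) can be checked by direct numerical integration; the numpy sketch below (ours, with arbitrary r and ϕ0) quantizes f(ϕ) = cos 2ϕ and recovers $A_f = \frac{r}{2}\sigma_{2\phi_0}$, i.e. the operator $A_{e_1}$ above:

```python
import numpy as np

def rho(r, phi):
    """Density matrix of equation (10)."""
    return 0.5 * np.array([[1 + r * np.cos(2 * phi), r * np.sin(2 * phi)],
                           [r * np.sin(2 * phi), 1 - r * np.cos(2 * phi)]])

phis = np.linspace(0.0, 2.0 * np.pi, 20_000, endpoint=False)
dphi = phis[1] - phis[0]
r, phi0 = 0.7, 0.3

# equation (12): the family integrates to the 2x2 identity
identity = sum(rho(r, p + phi0) for p in phis) * dphi / np.pi
print(np.round(identity, 6))

# equation (13) for f(phi) = cos(2 phi): <f> = 0, so A_f = (r/2) sigma_{2 phi0}
f = np.cos(2 * phis)
A_f = sum(w * rho(r, p + phi0) for w, p in zip(f, phis)) * dphi / np.pi
expected = 0.5 * r * np.array([[np.cos(2 * phi0), np.sin(2 * phi0)],
                               [np.sin(2 * phi0), -np.cos(2 * phi0)]])
print(np.allclose(A_f, expected, atol=1e-6))   # True
```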
3.3. linear polarization of light as a quantum phenomenon

in this section, we recall that the polarization tensor of light can be expressed as a density matrix, which allows us to relate the polarization of light to quantum phenomena such as the malus law and the incompatibility between two sequential measurements [6]. first, remember that the complex-valued electric field of a propagating quasi-monochromatic electromagnetic wave along the z-axis reads

\[ \vec{E}(t) = \vec{E}_0(t)\, e^{i\omega t} = E_x\, \hat{\imath} + E_y\, \hat{\jmath} = (E_\alpha), \tag{17} \]

in which we have used the previous notations for the unit vectors in the plane. the polarization is determined by $\vec{E}_0(t)$; it varies slowly with time and can be measured through nicol prisms, or other devices, by measuring the intensity of the light yielded by mean values $\propto E_\alpha E_\beta$, $E_\alpha E^{*}_\beta$ and conjugates. due to rapidly oscillating factors and a null temporal average $\langle \cdot \rangle_t$, a partially polarized light is described by the 2 × 2 hermitian matrix of stokes parameters [10–12]

\[ \frac{1}{J} \begin{pmatrix} \langle E_{0x} E^{*}_{0x} \rangle_t & \langle E_{0x} E^{*}_{0y} \rangle_t \\ \langle E_{0y} E^{*}_{0x} \rangle_t & \langle E_{0y} E^{*}_{0y} \rangle_t \end{pmatrix} \equiv \rho_{r,\phi} + \frac{a}{2}\,\sigma_2 = \frac{1+r}{2}\, E_\phi + \frac{1-r}{2}\, E_{\phi+\pi/2} + i\,\frac{a}{2}\,\tau_2, \]

where $J$ describes the intensity of the wave. in the second line, it is clear that the degree of mixing $r$ describes linear polarization, while the parameter $a$ ($-1 \le a \le 1$) is related to circular polarization. in real space we have $a = 0$, so we effectively describe the linear polarization of light.

[unnumbered figure: the propagation geometry, with the wave vector k̂ along the z-axis, the unit vectors ı̂, ȷ̂ spanning the (x, y) polarization plane, and re(E⃗) lying in that plane.]

we now wish to describe the interaction between a polarizer and partially linearly polarized light as a quantum measurement. we need to introduce two planes and their tensor product. the first one is the hilbert space on which act the states $\rho^{m}_{s,\theta}$ of the polarizer viewed as an orientation pointer; note that the action of the generator of rotations $\tau_2 = -i\sigma_2$ on these states corresponds to a $\pi/2$ rotation:

\[ \tau_2\, \rho^{m}_{s,\theta}\, \tau_2^{-1} = -\tau_2\, \rho^{m}_{s,\theta}\, \tau_2 = \rho^{m}_{s,\theta+\pi/2}. \tag{18} \]

the second plane is the hilbert space on which act the partially linearly polarized states $\rho^{l}_{r,\phi}$ of the plane wave crossing the polarizer; its spectral decomposition corresponds to the incoherent superposition of two completely linearly polarized waves,

\[ \rho^{l}_{r,\phi} = \frac{1+r}{2}\, E_\phi + \frac{1-r}{2}\, E_{\phi+\pi/2}. \tag{19} \]

the pointer detects an orientation in the plane determined by the angle $\phi$. through the pointer-system interaction, we generate a measurement whose time duration is the interval $I_M = (t_M - \eta, t_M + \eta)$ centred at $t_M$. the interaction is described by the (pseudo-)hamiltonian operator

\[ \tilde{H}_{int}(t) = g^{\eta}_{M}(t)\, \tau_2 \otimes \rho^{l}_{r,\phi}, \tag{20} \]

where $g^{\eta}_{M}$ is a dirac sequence with support in $I_M$, i.e.,

\[ \lim_{\eta \to 0} \int_{-\infty}^{+\infty} dt\, f(t)\, g^{\eta}_{M}(t) = f(t_M). \]

the interaction (20) is the tensor product of an antisymmetric operator for the pointer with a symmetric (i.e., hamiltonian) operator for the system. the operator defined for $t_0 < t_M - \eta$ as

\[ U(t, t_0) = \exp\left[ \int_{t_0}^{t} dt'\, g^{\eta}_{M}(t')\, \tau_2 \otimes \rho^{l}_{r,\phi} \right] = \exp\left[ G^{\eta}_{M}(t)\, \tau_2 \otimes \rho^{l}_{r,\phi} \right], \tag{21} \]

with $G^{\eta}_{M}(t) = \int_{t_0}^{t} dt'\, g^{\eta}_{M}(t')$, is a unitary evolution operator. from the formula involving an orthogonal projector $P$,

\[ \exp(\theta\, \tau_2 \otimes P) = R(\theta) \otimes P + \mathbf{1} \otimes (\mathbf{1} - P), \tag{22} \]

we obtain

\[ U(t, t_0) = R\!\left( G^{\eta}_{M}(t)\, \frac{1+r}{2} \right) \otimes E_\phi + R\!\left( G^{\eta}_{M}(t)\, \frac{1-r}{2} \right) \otimes E_{\phi+\pi/2}. \tag{23} \]

for $t_0 < t_M - \eta$ and $t > t_M + \eta$, we finally obtain

\[ U(t, t_0) = R\!\left( \frac{1+r}{2} \right) \otimes E_\phi + R\!\left( \frac{1-r}{2} \right) \otimes E_{\phi+\pi/2}. \tag{24} \]

preparing the polarizer in the state $\rho^{m}_{s_0,\theta_0}$, we obtain for $t > t_M + \eta$ the evolution $U(t, t_0)\, \rho^{m}_{s_0,\theta_0} \otimes \rho^{l}_{r_0,\phi_0}\, U(t, t_0)^{\dagger}$ of the initial state:

\[ \rho^{m}_{s_0,\,\theta_0+\frac{1+r}{2}} \otimes \frac{1 + r_0 \cos 2(\phi-\phi_0)}{2}\, E_\phi + \rho^{m}_{s_0,\,\theta_0+\frac{1-r}{2}} \otimes \frac{1 - r_0 \cos 2(\phi-\phi_0)}{2}\, E_{\phi+\pi/2} + \frac{1}{4}\left( R(r) + s_0\, \sigma_{2\theta_0+1} \right) \otimes r_0 \sin 2(\phi-\phi_0)\, E_\phi \tau_2 - \frac{1}{4}\left( R(-r) + s_0\, \sigma_{2\theta_0+1} \right) \otimes r_0 \sin 2(\phi-\phi_0)\, \tau_2 E_\phi. \tag{25} \]

therefore, the probability for the pointer to rotate by $\frac{1+r}{2}$, corresponding to the polarization along the orientation $\phi$, is

\[ \mathrm{tr}\left[ \left( U(t,t_0)\, \rho^{m}_{s_0,\theta_0} \otimes \rho^{l}_{r_0,\phi_0}\, U(t,t_0)^{\dagger} \right)\left( \mathbf{1} \otimes E_\phi \right) \right] = \frac{1 + r_0 \cos 2(\phi - \phi_0)}{2}, \tag{26} \]

which, for completely linearly polarized light, i.e. $r_0 = 1$, becomes the familiar malus law $\cos^2(\phi - \phi_0)$. similarly, the second term gives the probability for the perpendicular orientation $\phi + \pi/2$ and the pointer rotation by $\frac{1-r}{2}$:

\[ \mathrm{tr}\left[ \left( U(t,t_0)\, \rho^{m}_{s_0,\theta_0} \otimes \rho^{l}_{r_0,\phi_0}\, U(t,t_0)^{\dagger} \right)\left( \mathbf{1} \otimes E_{\phi+\pi/2} \right) \right] = \frac{1 - r_0 \cos 2(\phi - \phi_0)}{2}, \tag{27} \]

corresponding (in the case $r_0 = 1$) to the malus law $\sin^2(\phi - \phi_0)$.
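equation (26) can be verified by building the unitary (24) explicitly; the numpy sketch below (our illustration, with arbitrarily chosen pointer and light parameters) reproduces the malus law for r0 = 1:

```python
import numpy as np

def rot(a):
    """Rotation matrix R(a) in the plane."""
    return np.array([[np.cos(a), -np.sin(a)], [np.sin(a), np.cos(a)]])

def proj(phi):
    """Orthogonal projector E_phi of equation (8)."""
    v = np.array([np.cos(phi), np.sin(phi)])
    return np.outer(v, v)

def rho(r, phi):
    """Density matrix of equation (10)."""
    return 0.5 * np.array([[1 + r * np.cos(2 * phi), r * np.sin(2 * phi)],
                           [r * np.sin(2 * phi), 1 - r * np.cos(2 * phi)]])

r, phi = 0.8, 0.9        # polarizer axis phi, pointer rotation parameter r
s0, th0 = 0.5, 0.2       # initial pointer state rho^m_{s0, th0}
r0, phi0 = 1.0, 0.35     # fully linearly polarized light along phi0

# unitary of equation (24), pointer (x) light ordering
U = (np.kron(rot((1 + r) / 2), proj(phi))
     + np.kron(rot((1 - r) / 2), proj(phi + np.pi / 2)))
state = np.kron(rho(s0, th0), rho(r0, phi0))
final = U @ state @ U.T   # U is real, so U.T is its adjoint

# probability of equation (26): matches Malus' law cos^2(phi - phi0) for r0 = 1
p_parallel = np.trace(final @ np.kron(np.eye(2), proj(phi)))
print(p_parallel, np.cos(phi - phi0) ** 2)
```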
4. entanglement and isomorphisms

in this section, we develop our previous results further by giving an interpretation in terms of quantum entanglement. previously, we described the interaction between a polarizer and a light ray by the tensor product (20), which is analogous to the quantum entanglement of states, since it is a logical consequence of the construction of tensor products of hilbert spaces for describing the quantum states of a composite system. in the present case, we are in the presence of a remarkable sequence of vector-space isomorphisms, due to the fact that 2 × 2 = 2 + 2:

\[ \mathbb{R}^2 \otimes \mathbb{R}^2 \cong \mathbb{R}^2 \times \mathbb{R}^2 \cong \mathbb{R}^2 \oplus \mathbb{R}^2 \cong \mathbb{C}^2 \cong \mathbb{H}, \tag{28} \]

where $\mathbb{H}$ is the field of quaternions (recall that $\dim(V \otimes W) = \dim V \cdot \dim W$ while $\dim(V \times W) = \dim V + \dim W$ for two finite-dimensional vector spaces $V$ and $W$). therefore, the description of entanglement in a real hilbert space is equivalent to the description of a single system (e.g., a spin 1/2) in the complex hilbert space $\mathbb{C}^2$, or in $\mathbb{H}$. in section 4.3 we develop this observation.

4.1. bell states and quantum correlations

it is straightforward to transpose into the present setting the 1964 analysis and result presented by bell in his discussion of the epr paper [13] and of the subsequent bohm approaches based on the assumption of hidden variables [14]. we only need to replace the bell spin one-half particles with the horizontal (i.e., +1) and vertical (i.e., −1) quantum orientations in the plane as the only possible outcomes of the observable $\sigma_\phi$ of (11), supposing that there exists a pointer device designed to measure such orientations with outcomes ±1 only. in order to define the bell states and their quantum correlations, let us first write the canonical orthonormal basis of the tensor product $\mathbb{R}^2_a \otimes \mathbb{R}^2_b$, the first factor being for system "a" and the other for system "b", as

\[ |0\rangle_a \otimes |0\rangle_b, \quad \left|\tfrac{\pi}{2}\right\rangle_a \otimes \left|\tfrac{\pi}{2}\right\rangle_b, \quad |0\rangle_a \otimes \left|\tfrac{\pi}{2}\right\rangle_b, \quad \left|\tfrac{\pi}{2}\right\rangle_a \otimes |0\rangle_b. \tag{29} \]

the states $|0\rangle$ and $|\pi/2\rangle$ pertain to a or b, and are named "q-bit" or "qubit" in the standard language of quantum information. since they are pure states, they can be associated with a pointer measuring the horizontal (resp. vertical) direction or polarisation described by the state $|0\rangle$ (resp. $|\pi/2\rangle$). there are four bell pure states in $\mathbb{R}^2_a \otimes \mathbb{R}^2_b$, namely

\[ |\Phi_\pm\rangle = \frac{1}{\sqrt{2}}\left( |0\rangle_a \otimes |0\rangle_b \pm \left|\tfrac{\pi}{2}\right\rangle_a \otimes \left|\tfrac{\pi}{2}\right\rangle_b \right), \tag{30} \]

\[ |\Psi_\pm\rangle = \frac{1}{\sqrt{2}}\left( \pm |0\rangle_a \otimes \left|\tfrac{\pi}{2}\right\rangle_b + \left|\tfrac{\pi}{2}\right\rangle_a \otimes |0\rangle_b \right). \tag{31} \]

we say that they represent maximally entangled quantum states of two qubits. consider for instance the state $|\Phi_+\rangle$. if the pointer associated with a measures its qubit in the standard basis, the outcome is perfectly random, with either possibility having probability 1/2; but if the pointer associated with b then measures its qubit, the outcome, although random for b alone, is the same as the one a gets. there is quantum correlation.
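the perfect anticorrelation of the singlet-like state $|\Psi_-\rangle$, quantified by the expectation value $-\cos(\phi_a - \phi_b)$ derived in equation (34) of the next subsection, can be checked numerically. the following numpy sketch (ours, with arbitrary angles) builds $|\Psi_-\rangle$ of (31) as a vector of $\mathbb{R}^2 \otimes \mathbb{R}^2 \cong \mathbb{R}^4$:

```python
import numpy as np

def sigma(phi):
    """The orientation observable of equation (11)."""
    return np.array([[np.cos(phi), np.sin(phi)],
                     [np.sin(phi), -np.cos(phi)]])

e0 = np.array([1.0, 0.0])   # |0>
e1 = np.array([0.0, 1.0])   # |pi/2>
psi_minus = (-np.kron(e0, e1) + np.kron(e1, e0)) / np.sqrt(2)   # equation (31)

phi_a, phi_b = 0.7, 1.9
obs = np.kron(sigma(phi_a), sigma(phi_b))
print(psi_minus @ obs @ psi_minus)   # expectation value of equation (34)
print(-np.cos(phi_a - phi_b))        # same number
```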
4.2. bell inequality and its violation

let us consider a bipartite system in the state $\Psi_-$. in such a state, if a measurement of the component $\sigma^{a}_{\phi_a} := \vec{\sigma}_a \cdot \hat{u}_{\phi_a}$ ($\hat{u}_{\phi_a}$ being a unit vector with polar angle $\phi_a$) yields the value +1 (polarization along the direction $\phi_a/2$), then a measurement of $\sigma^{b}_{\phi_b}$ with $\phi_b = \phi_a$ must yield the value −1 (polarization along the direction $\frac{\phi_a+\pi}{2}$), and vice versa. from a classical perspective, explaining such a correlation requires a predetermination through the existence of hidden parameters $\lambda$ in some set $\Lambda$. assuming the two measurements to be separated by a space-like interval, the result $\varepsilon_a \in \{-1, +1\}$ (resp. $\varepsilon_b \in \{-1, +1\}$) of measuring $\sigma^{a}_{\phi_a}$ (resp. $\sigma^{b}_{\phi_b}$) is then determined by $\phi_a$ and $\lambda$ only (locality assumption), not by $\phi_b$, i.e. $\varepsilon_a = \varepsilon_a(\phi_a, \lambda)$ (resp. $\varepsilon_b = \varepsilon_b(\phi_b, \lambda)$). given a probability distribution $\rho(\lambda)$ on $\Lambda$, the classical expectation value of the product of the two components $\sigma^{a}_{\phi_a}$ and $\sigma^{b}_{\phi_b}$ is

\[ P(\phi_a, \phi_b) = \int_\Lambda d\lambda\, \rho(\lambda)\, \varepsilon_a(\phi_a, \lambda)\, \varepsilon_b(\phi_b, \lambda). \tag{32} \]

since

\[ \int_\Lambda d\lambda\, \rho(\lambda) = 1 \quad \text{and} \quad \varepsilon_{a,b} = \pm 1, \tag{33} \]

we have $-1 \le P(\phi_a, \phi_b) \le 1$. equivalent predictions within the quantum setting then impose the equality between the classical and quantum expectation values:

\[ P(\phi_a, \phi_b) = \langle \Psi_- |\, \sigma^{a}_{\phi_a} \otimes \sigma^{b}_{\phi_b}\, | \Psi_- \rangle = -\hat{u}_{\phi_a} \cdot \hat{u}_{\phi_b} = -\cos(\phi_a - \phi_b). \tag{34} \]

in the above equation, the value −1 is reached at $\phi_a = \phi_b$. this is possible for $P(\phi_a, \phi_a)$ only if $\varepsilon_a(\phi_a, \lambda) = -\varepsilon_b(\phi_a, \lambda)$. hence we can write $P(\phi_a, \phi_b)$ as

\[ P(\phi_a, \phi_b) = -\int_\Lambda d\lambda\, \rho(\lambda)\, \varepsilon(\phi_a, \lambda)\, \varepsilon(\phi_b, \lambda), \qquad \varepsilon(\phi, \lambda) \equiv \varepsilon_a(\phi, \lambda) = \pm 1. \tag{35} \]

let us now introduce a third unit vector $\hat{u}_{\phi_c}$. due to $\varepsilon^2 = 1$, we have

\[ P(\phi_a, \phi_b) - P(\phi_a, \phi_c) = \int_\Lambda d\lambda\, \rho(\lambda)\, \varepsilon(\phi_a, \lambda)\, \varepsilon(\phi_b, \lambda)\, \left[ \varepsilon(\phi_b, \lambda)\, \varepsilon(\phi_c, \lambda) - 1 \right]. \tag{36} \]

from this results the (baby) bell inequality

\[ |P(\phi_a, \phi_b) - P(\phi_a, \phi_c)| \le \int_\Lambda d\lambda\, \rho(\lambda)\, \left[ 1 - \varepsilon(\phi_b, \lambda)\, \varepsilon(\phi_c, \lambda) \right] = 1 + P(\phi_b, \phi_c). \]

hence, the assumed existence of hidden variable(s) justifying the quantum correlation in the singlet state $\Psi_-$, which is encapsulated by the above equation, has the following consequence for an arbitrary triple $(\phi_a, \phi_b, \phi_c)$:

\[ 1 - \cos(\phi_b - \phi_c) \ge \left| \cos(\phi_b - \phi_a) - \cos(\phi_c - \phi_a) \right|. \]

equivalently, in terms of the two independent angles $\zeta = \frac{\phi_a - \phi_b}{2}$ and $\eta = \frac{\phi_b - \phi_c}{2}$, we have

\[ \left| \sin^2 \zeta - \sin^2(\eta + \zeta) \right| \le \sin^2 \eta. \tag{37} \]

it is easy to find pairs $(\zeta, \eta)$ for which the inequality (37) does not hold. for instance, with $\eta = \zeta \ne 0$, i.e. $\phi_b = \frac{\phi_a + \phi_c}{2}$, we obtain

\[ \left| 4\sin^2 \eta - 3 \right| \le 1, \tag{38} \]

which does not hold for any $|\eta| < \pi/4$, i.e. for $|\phi_a - \phi_b| = |\phi_b - \phi_c| < \pi/2$. actually, we did not follow here the proof given by bell, which is a lot more elaborate. also, bell considered unit vectors in 3-space; restricting his proof to vectors in the plane does not make any difference, as is actually the case in many works devoted to the foundations of quantum mechanics.
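the violation of (37) is immediate to exhibit numerically; a small python check (ours, using the representative choice η = ζ = π/8, which lies inside the violating range |η| < π/4) reads:

```python
import numpy as np

def bell_lhs(zeta, eta):
    """|P(a,b) - P(a,c)| with P = -cos(.), 2*zeta = phi_a - phi_b and
    2*eta = phi_b - phi_c, as in inequality (37)."""
    return abs(np.sin(zeta) ** 2 - np.sin(eta + zeta) ** 2)

def bell_rhs(eta):
    return np.sin(eta) ** 2

eta = zeta = np.pi / 8
print(bell_lhs(zeta, eta), "<=", bell_rhs(eta), "?",
      bool(bell_lhs(zeta, eta) <= bell_rhs(eta)))   # False: inequality violated
```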
4.3. entanglement of two angles

quantum entanglement is usually described within the complex two-dimensional hilbert space $\mathbb{C}^2$. as a complex vector space, $\mathbb{C}^2$, with canonical basis $(e_1, e_2)$, has a real structure, i.e., it is isomorphic to a real vector space, which makes it isomorphic to $\mathbb{R}^4$, itself isomorphic to $\mathbb{R}^2 \otimes \mathbb{R}^2$. a real structure is obtained by considering the vector expansion
$$\mathbb{C}^2 \ni v = z_1 e_1 + z_2 e_2 = x_1 e_1 + y_1 (ie_1) + x_2 e_2 + y_2 (ie_2)\,, \quad (39)$$
which is equivalent to writing $z_1 = x_1 + iy_1$, $z_2 = x_2 + iy_2$, and considering the set of vectors
$$\{e_1, e_2, (ie_1), (ie_2)\} \quad (40)$$
as forming a basis of $\mathbb{R}^4$. forgetting about the subscripts a and b in (29), we can map vectors in the euclidean plane $\mathbb{R}^2$ to the complex "plane" $\mathbb{C}$ by
$$|0\rangle \mapsto 1\,, \quad \left|\tfrac{\pi}{2}\right\rangle \mapsto i\,, \quad (41)$$
which allows the correspondence between bases
$$|0\rangle \otimes |0\rangle = e_1\,, \quad \left|\tfrac{\pi}{2}\right\rangle \otimes \left|\tfrac{\pi}{2}\right\rangle = -e_2\,, \quad |0\rangle \otimes \left|\tfrac{\pi}{2}\right\rangle = (ie_1)\,, \quad \left|\tfrac{\pi}{2}\right\rangle \otimes |0\rangle = (ie_2)\,. \quad (42)$$
also, the spin states of a particle in a real basis, given by the "up" and "down" states, are defined by
$$e_1 \equiv |\uparrow\rangle \equiv \begin{pmatrix} 1 \\ 0 \end{pmatrix}\,, \quad e_2 \equiv |\downarrow\rangle \equiv \begin{pmatrix} 0 \\ 1 \end{pmatrix}\,. \quad (43)$$
finally, we obtain a unitary map from the bell basis to the basis of the real structure of $\mathbb{C}^2$,
$$\begin{pmatrix} |\phi_+\rangle & |\phi_-\rangle & |\psi_+\rangle & |\psi_-\rangle \end{pmatrix} = \begin{pmatrix} e_1 & e_2 & (ie_1) & (ie_2) \end{pmatrix} \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 & 0 & 0 \\ -1 & 1 & 0 & 0 \\ 0 & 0 & 1 & -1 \\ 0 & 0 & 1 & 1 \end{pmatrix}\,.$$
in terms of the respective components of vectors in their respective spaces, we have
$$\begin{pmatrix} x_1 \\ x_2 \\ y_1 \\ y_2 \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 & 0 & 0 \\ -1 & 1 & 0 & 0 \\ 0 & 0 & 1 & -1 \\ 0 & 0 & 1 & 1 \end{pmatrix} \begin{pmatrix} x_+ \\ x_- \\ y_+ \\ y_- \end{pmatrix}\,. \quad (44)$$
in complex notations, with $z_\pm = x_\pm + iy_\pm$, this is equivalent to
$$\begin{pmatrix} z_+ \\ z_- \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & -C \\ C & 1 \end{pmatrix} \begin{pmatrix} z_1 \\ z_2 \end{pmatrix} \equiv C@ \begin{pmatrix} z_1 \\ z_2 \end{pmatrix}\,, \quad (45)$$
in which we have introduced the conjugation operator $C$, $Cz = \bar{z}$, i.e., the mirror symmetry with respect to the real axis, $-C$ being the mirror symmetry with respect to the imaginary axis. let us now see the influence of having real bell states on schrödinger cat states. the operator "cat" $C@$ can be expressed as
$$C@ = \frac{1}{\sqrt{2}}(1 + F)\,, \quad F := C\tau_2 = \begin{pmatrix} 0 & -C \\ C & 0 \end{pmatrix}\,. \quad (46)$$
therefore, with the above choice of isomorphisms, bell entanglement in $\mathbb{R}^2 \otimes \mathbb{R}^2$ is not represented by a simple linear superposition in $\mathbb{C}^2$; it also involves the two mirror symmetries $\pm C$. the operator $F$ is a kind of "flip", whereas the "cat" or "beam splitter" operator $C@$ builds, from the up and down basic states, the two elementary schrödinger cats
$$F|\uparrow\rangle = |\downarrow\rangle\,, \quad C@|\uparrow\rangle = \frac{1}{\sqrt{2}}\left(|\uparrow\rangle + |\downarrow\rangle\right)\,, \quad (47)$$
$$F|\downarrow\rangle = -|\uparrow\rangle\,, \quad C@|\downarrow\rangle = \frac{1}{\sqrt{2}}\left(-|\uparrow\rangle + |\downarrow\rangle\right)\,. \quad (48)$$
the flip operator also appears in the construction of the spin one-half coherent states $|\theta,\phi\rangle$, defined in terms of spherical coordinates $(\theta,\phi)$ as the quantum counterpart of the classical state $\hat{n}(\theta,\phi)$ on the sphere $s^2$ by
$$|\theta,\phi\rangle = \cos\tfrac{\theta}{2}\,|\uparrow\rangle + e^{i\phi}\sin\tfrac{\theta}{2}\,|\downarrow\rangle \equiv \begin{pmatrix} \cos\tfrac{\theta}{2} \\ e^{i\phi}\sin\tfrac{\theta}{2} \end{pmatrix} = \begin{pmatrix} \cos\tfrac{\theta}{2} & -\sin\tfrac{\theta}{2}\, e^{-i\phi} \\ \sin\tfrac{\theta}{2}\, e^{i\phi} & \cos\tfrac{\theta}{2} \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \equiv D^{\frac{1}{2}}\!\left(\xi_{\hat{n}}^{-1}\right) |\uparrow\rangle\,. \quad (49)$$
here, $\xi_{\hat{n}}$ corresponds, through the homomorphism $su(2) \to so(3)$, to the specific rotation $r_{\hat{n}}$ mapping the unit vector pointing to the north pole, $\hat{k} = (0,0,1)$, to $\hat{n}$. the operator $D^{\frac{1}{2}}(\xi_{\hat{n}}^{-1})$ represents the element $\xi_{\hat{n}}^{-1}$ of su(2) in its complex two-dimensional unitary irreducible representation. as we can see in the matrix in (49), the second column of $D^{\frac{1}{2}}(\xi_{\hat{n}}^{-1})$ is precisely the flip of the first one,
$$D^{\frac{1}{2}}\!\left(\xi_{\hat{n}}^{-1}\right) = \begin{pmatrix} |\theta,\phi\rangle & F|\theta,\phi\rangle \end{pmatrix}\,. \quad (50)$$
actually, we can learn more about the isomorphisms $\mathbb{C}^2 \cong \mathbb{H} \cong \mathbb{R}_+ \times su(2)$ through the flip and matrix representations of quaternions. in quaternionic algebra, we have the property $\hat{\imath} = \hat{\jmath}\hat{k}$ (and even permutations), and a quaternion $q$ is represented by
$$\mathbb{H} \ni q = q_0 + q_1\hat{\imath} + q_2\hat{\jmath} + q_3\hat{k} = q_0 + q_3\hat{k} + \hat{\jmath}\left(q_1\hat{k} + q_2\right) \equiv \begin{pmatrix} q_0 + iq_3 \\ q_2 + iq_1 \end{pmatrix} \equiv z_q \in \mathbb{C}^2\,, \quad (51)$$
after identifying $\hat{k} \equiv i$, as both are roots of −1. then the flip appears naturally in the final identification $\mathbb{H} \cong \mathbb{R}_+ \times su(2)$ as
$$q \equiv \begin{pmatrix} q_0 + iq_3 & -q_2 + iq_1 \\ q_2 + iq_1 & q_0 - iq_3 \end{pmatrix} = \begin{pmatrix} z_q & Fz_q \end{pmatrix}\,. \quad (52)$$
let us close this article with a final remark on spin one-half coherent states as vectors in $\mathbb{R}^2_a \otimes \mathbb{R}^2_b$. the "cat states" in $\mathbb{C}^2$ given by (49), equivalently viewed as 4-vectors in $\mathbb{H} \cong \mathbb{R}^4$ as
$$|\theta,\phi\rangle \mapsto \begin{pmatrix} \cos\tfrac{\theta}{2} \\ -\sin\tfrac{\theta}{2}\cos\phi \\ \sin\tfrac{\theta}{2}\sin\phi \\ 0 \end{pmatrix}\,, \quad (53)$$
are represented as entangled states in $\mathbb{R}^2_a \otimes \mathbb{R}^2_b$ by
$$|\theta,\phi\rangle = \cos\tfrac{\theta}{2}\,|0\rangle_a \otimes |0\rangle_b - \sin\tfrac{\theta}{2}\cos\phi\,\left|\tfrac{\pi}{2}\right\rangle_a \otimes \left|\tfrac{\pi}{2}\right\rangle_b + \sin\tfrac{\theta}{2}\sin\phi\,|0\rangle_a \otimes \left|\tfrac{\pi}{2}\right\rangle_b + 0\,\left|\tfrac{\pi}{2}\right\rangle_a \otimes |0\rangle_b\,.$$
therefore, we can say that two entangled angles in the plane can be viewed as a point in the upper half-sphere $s^2/\mathbb{Z}_2$ in $\mathbb{R}^3$, shown in figure 2.

figure 2: each point in the upper half-sphere is in one-to-one correspondence with two entangled angles in the plane.
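the identities (50) and (52) are easy to confirm numerically. the following sketch is ours (not the authors' code): it implements the flip as $F(z_1, z_2) = (-\bar{z}_2, \bar{z}_1)$, checks that the second column of $D^{1/2}(\xi_{\hat n}^{-1})$ is the flip of the first, and checks that the quaternion matrix $(z_q\ Fz_q)$ has determinant $q_0^2 + q_1^2 + q_2^2 + q_3^2$; the sample values of $(\theta, \phi)$ and $q$ are arbitrary.

```python
import numpy as np

def flip(z):
    # F = C tau_2 acting on (z1, z2): F(z1, z2) = (-conj(z2), conj(z1));
    # in particular F|up> = |down> and F|down> = -|up>, cf. (47)-(48)
    return np.array([-np.conj(z[1]), np.conj(z[0])])

theta, phi = 0.7, 1.9
ket = np.array([np.cos(theta / 2),
                np.exp(1j * phi) * np.sin(theta / 2)])      # |theta, phi>

D = np.array([[np.cos(theta / 2), -np.sin(theta / 2) * np.exp(-1j * phi)],
              [np.sin(theta / 2) * np.exp(1j * phi), np.cos(theta / 2)]])

# (50): the columns of D^{1/2}(xi^{-1}) are |theta,phi> and F|theta,phi>
assert np.allclose(D[:, 0], ket) and np.allclose(D[:, 1], flip(ket))
assert np.allclose(D.conj().T @ D, np.eye(2))   # D is unitary

# (52): q = q0 + q1 i + q2 j + q3 k as the 2x2 complex matrix (z_q, F z_q);
# its determinant is the squared quaternion norm
q0, q1, q2, q3 = 1.0, -0.4, 2.2, 0.3
z_q = np.array([q0 + 1j * q3, q2 + 1j * q1])
Q = np.column_stack([z_q, flip(z_q)])
assert np.isclose(np.linalg.det(Q).real, q0**2 + q1**2 + q2**2 + q3**2)
print("checks passed")
```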
5. conclusions

integral quantization is a quantization scheme constructed on positive operator-valued measures. when applied to a two-dimensional real space, it allows for a description of quantum states as pointers in the real unit half-plane. we recalled in this paper that in this case a family of density matrices is sufficient to perform this kind of quantization, as it describes all the mixed states in this space. furthermore, a density matrix in a two-dimensional real space depends on the usual observable
$$\sigma_\phi = \begin{pmatrix} \cos\phi & \sin\phi \\ \sin\phi & -\cos\phi \end{pmatrix}\,,$$
which captures the essence of non-commutativity in real space. as a consequence, commutation relations are expressed in terms of the real matrix $\tau_2$, which serves as the basis for the description of quantum measurement. we provided an illustration considering linearly polarized light passing through a polarizer. the pointer, associated with $\tau_2$, can rotate by an angle $(1 \pm r)/2$, with $r$ the degree of mixing of the density matrix, with a probability given by the usual malus laws (26) and (27). we extended the analysis by showing that the interaction between a polarizer and a light ray is equivalent to the quantum entanglement of two hilbert spaces. orientations in the plane have only two outcomes (±1), which are the possible outcomes of $\sigma_\phi$. we showed that, for a general bipartite system, equating the classical and quantum expectation values of $\sigma_\phi$ measurements rules out the existence of local hidden variables, resulting in the well-known violation of bell inequalities, here given by (37). finally, we demonstrated that the isomorphism $\mathbb{C}^2 \cong \mathbb{R}^4$ allows us to write bell states in real space, with the introduction of the "flip" operator (46). this operator is necessary for constructing spin one-half coherent states, which we can fully describe by a set of orientations in $\mathbb{R}^3$, as shown in (53).

acknowledgements

r.b.: the present work was performed under the auspices of the gnfm (gruppo nazionale di fisica matematica). e.f. thanks the helsinki institute of physics (hip) for its hospitality.

references
[1] e. c. g. stueckelberg. quantum theory in real hilbert space. helvetica physica acta 33(8):727–752, 1960.
[2] m. p. solèr. characterization of hilbert spaces by orthomodular spaces. communications in algebra 23(1):219–243, 1995. https://doi.org/10.1080/00927879508825218.
[3] v. moretti, m. oppio. quantum theory in quaternionic hilbert space: how poincaré symmetry reduces the theory to the standard complex one. reviews in mathematical physics 31(04):1950013, 2019. https://doi.org/10.1142/s0129055x19500132.
[4] a. peres. neumark's theorem and quantum inseparability. foundations of physics 20:1441–1453, 1990. https://doi.org/10.1007/bf01883517.
[5] h. bergeron, e. m. f. curado, j.-p. gazeau, l. m. c. s. rodrigues. orientations in the plane as quantum states. brazilian journal of physics 49(3):391–401, 2019. https://doi.org/10.1007/s13538-019-00652-x.
[6] r. beneduci, e. frion, j.-p. gazeau, a. perri. real povms on the plane: integral quantization, naimark theorem and linear polarization of the light. quantum physics, 2021. arxiv:2108.04086.
[7] r. beneduci. joint measurability through naimark's dilation theorem. reports on mathematical physics 79(2):197–214, 2017. https://doi.org/10.1016/s0034-4877(17)30035-6.
[8] a. holevo. probabilistic and statistical aspects of quantum theory. springer science & business media, 2011. https://doi.org/10.1007/978-88-7642-378-9.
[9] h. bergeron, j.-p. gazeau. integral quantizations with two basic examples. annals of physics 344:43–68, 2014. https://doi.org/10.1016/j.aop.2014.02.008.
[10] w. h. mcmaster. polarization and the stokes parameters. american journal of physics 22(6):351–362, 1954. https://doi.org/10.1119/1.1933744.
[11] b. schaefer, e. collett, r. smyth, et al. measuring the stokes polarization parameters. american journal of physics 75(2):163–168, 2007. https://doi.org/10.1119/1.2386162.
[12] l. d. landau, e. m. lifshitz. the classical theory of fields. vol. 2 (4th ed.). butterworth-heinemann, 1975. isbn 978-0-7506-2768-9.
[13] j. s. bell. on the einstein-podolsky-rosen paradox. physics 1(3):195–200, 1964. https://doi.org/10.1103/physicsphysiquefizika.1.195.
[14] d. bohm. a suggested interpretation of the quantum theory in terms of "hidden" variables. i. physical review 85(2):166–179, 1952. https://doi.org/10.1103/physrev.85.166.

developing e-learning courses for mobile devices

r. szabados, k. sipos

abstract

the recent and rapid development of mobile devices and the increasing popularity of e-learning have created a demand for mobile learning packages and environments. we have analyzed the possibilities of adapting existing content for mobile devices, and have implemented two fundamentally different systems to satisfy the demand that has arisen. one of the systems creates e-learning courses from existing materials and adapts them to the specified platform (this system realizes the functionalities of the content management system). the other system is a modified version of the moodle learning management system, which can adapt existing courses right before displaying them. this paper discusses the fundamentals of e-learning and the design considerations, and investigates various methods of scalable video coding. finally, the realization details of the two systems are presented.

keywords: e-learning, scorm, mobile devices, adaptation of learning packages, video coding.

1 introduction

the recent development of computers has opened up new opportunities in education. today, in computer-assisted education, modern multimedia technologies and the internet are used to create place- and time-independent learning possibilities by using a uniform framework. the present form of computer-aided learning, or e-learning, is a result of continuous development. nowadays most of the existing e-learning systems and courses conform to one of the widespread e-learning standards or recommendations, so the learning systems are independent of the learning material, and are interoperable with each other. the key to the popularity of e-learning is the richness of multimedia contents in the courses, and the extensive accessibility of the internet, where these materials are usually presented. due to the development of mobile devices such as pdas (personal digital assistants) and smartphones, mobile internet access has become more and more common, and thus a demand for mobile e-learning systems and courses has arisen. however, most of the existing learning packages were designed to be run on pcs. this means that these materials are usually rich in multimedia contents that fit the size of a typical desktop computer's display. also the formatting and wrapping of the texts are designed presuming this display size. although mobile devices can generally display these courses, real-time shrinking and displaying of a video with fine resolution and good quality on a small display requires more computational power than is available in these devices. there is, therefore, a need for specialized learning packages for mobile devices. most course editing and management software has some support for pdas or smartphones, but this only involves limiting the displaying area. the creator of the learning package (let us call him the lecturer) has to pay attention to the size and quality of the multimedia contents for the best playback compatibility; he has to resize or reencode the images and videos with another program.
since the lecturer is usually not a computer specialist, it would be ideal if the learning content development system paid attention to these properties of the content and adapted them automatically to the given device. as the generation of a learning package takes much time, and there are already many existing courses designed for pcs, there is also a need for proper playback of these courses on mobile devices. to achieve this, automatic adaptation is required in the e-learning system: after the construction phase of the package and before delivery. to solve these problems, our goal was to analyze the possibilities of adapting courses to mobile platforms, and to develop a system capable of this adaptation. in this paper, after introducing the structure of the most common e-learning systems and the quasi-standard scorm (sharable content object reference model) [1] recommendation, we review and evaluate the adaptation possibilities. after this, we present two systems that we have implemented to solve the problem of adaptation. we finish this paper by discussing the evaluation of these systems and presenting the possibilities of their further development.

2 fundamentals of e-learning

the very first e-learning courses were developed together with their delivery system; therefore, both had to be changed whenever the learning content changed. in addition, there was no way to transfer learning content to another system, so in the early days e-learning was not very flexible. a new impulse was given to e-learning when independent organizations worked out recommendations for the structure of learning systems and for cooperation between the delivery system and the course itself. the most commonly used recommendation is the scorm, which was created by combining different widely used standards and recommendations. according to scorm, three parts of an e-learning system can be distinguished: the course, the course editing module, and the runtime environment. this is shown in fig. 1.

fig. 1: main parts of an e-learning system

in an e-learning system we talk about the learning management system (lms), which is responsible for displaying the learning content, for authenticating users, and for tracking the user's progress; the content management system (cms), which manages and generates the learning packages; and the content itself. most commonly the lms is an on-line, web-based system. after login, it provides different levels of access to the courses according to the user's rights, which can scale from basic read-only access to system administrator privileges. the cms
is usually available off-line, and provides course generation facilities. most cms programs support the creation of different e-learning package formats, ranging from off-line courses (which can be distributed on cds) to scorm-conformant packages. the scorm specifies, among others, the structure and content of a learning package. a scorm package – i.e., a scorm-compatible learning package – is made up of independent learning objects, called scos (sharable content objects), to enable the reusability of learning contents already made. the hierarchy of the course is created by organizing the scos into a tree structure. scos are like chapters and subsections in a book. the lowest-level scos are built up of html pages, which actually contain the learning materials such as text, pictures, videos, etc. a learning package includes, among others, these html files, the multimedia files, and a so-called manifest file. this latter file contains course-related metadata, such as the language and the logical structure of the course, and the location of the files appearing in the course. the name of this file must be imsmanifest.xml, and it has to be placed into the root folder of the package, because the lms displays the course relying upon this file. according to scorm, the adaptation of the multimedia content can be realized in the cms during the construction of the package, and also in the lms during the presentation of the course. the two solutions offer different kinds of adaptation. in the case of adaptation during course production, the learning package will be compatible with all scorm-conformant lmss. this way it produces platform-specific scorm packages, which contain the multimedia files adapted for the specific device. on the other hand, these packages can only be displayed appropriately on the specified device; therefore, different learning packages have to be generated for different platforms. to support the multiple package generation of the same course, the lecturer only has to name the platform for which the package is created, and the cms takes care of all the rest. in the case of adaptation in the lms, all scorm-conformant packages have to be supported, including the existing courses designed for pcs. in this situation the learning package contains the multimedia content only once, and the system has to adjust it to the device before delivering the package. this way the lecturer does not have to focus on the properties of the learning material. since the two ways of adaptation offer different advantages, we decided to implement both of these methods.

3 design considerations

content adaptation is necessary because different mobile devices have different technical capabilities. during the design of the systems we considered the following properties:

• bandwidth: when using mobile devices, the available bandwidth can be relatively low, or the quality of the connection can be poor. an example is gprs-based (general packet radio service) internet access. this may cause problems when trying to send higher-quality or longer videos over the network. this problem can also occur while using a notebook.
• computational power: pdas have relatively slow cpus, which can also cause problems during higher-quality or longer video playback.

• screen size: the small screen size causes problems on pdas or smartphones when trying to display large images or videos. when the device displays the images in full size, intelligibility decreases because the user has to scroll on the page. also, in the case of large videos, real-time shrinking requires much computational power.

• animations: electronic learning courses are popular for being rich in vivid multimedia contents; therefore, the presence of flash animations in these courses is common. however, this can cause problems if a device cannot play these files.

as the range of existing mobile devices is quite wide and they have rather different capabilities, we defined three profiles corresponding to three common device types for adaptation in our systems. for desktop computers we defined the "pc" profile, presuming a big screen size, high bandwidth and a fast processor. this is the base profile; no adaptation is necessary for it. we defined a "notebook" profile for computers with low-bandwidth internet access. finally, for the "pda" profile we assumed an average pocketpc with a 320×240 pixel screen size, a low-bandwidth connection and a slow processor. as nowadays smartphones have quite small displays and extremely varied capabilities, we omitted a "smartphone" profile for now. however, the realizations of our systems provide an easy way to insert new devices later on. according to the considerations mentioned above, we reviewed and evaluated the possible adaptation methods for the following multimedia contents:

• text: the only reason for adapting text is the small display size, so it would be essential solely for the pda profile. since in scorm the course pages are html files, the browser can automatically adjust the text to the width of the screen. thus, no other adaptation is required.

• picture: as real-time resizing of an image is not so difficult, and its transfer does not require high bandwidth, the main reason for the adaptation of images is again the small display size. when the device displays a large image in full size, it is difficult to understand it as a whole, because the user needs to scroll in order to view the different parts of the picture. on the other hand, displaying these images in a reduced size can cause a very congested image full of small, unreadable parts. that is why we decided to shrink the image to the display size of the device, and also to leave an opportunity for the student to see it in full size. we ensure this by changing the original image to a reduced version that is also a link to the original-sized one.

• flash animations: the lack of appropriate animation player software would be the only reason for adapting animations, but players (for example, macromedia flash) have been implemented for pdas too; therefore, we do not deal with this issue any further.

• video: probably the most important problem is the adequate adaptation of video files to the given platform. to achieve this, we have to consider the available bandwidth, screen size and computational power in all of the profiles. no adaptation is required in the pc profile. in the notebook profile the bottleneck is the low bandwidth; hence, the video has to be reencoded to cope with this.
adaptation is also necessary in the pda profile, because of all the constraints mentioned above. due to the large number of different video codecs, there are numerous ways of video adaptation. the storage of video streams requires large disk space, so we tried to choose a solution that does not require the multiple storage of the same stream. in the case of adaptation during course generation, this is an unreachable goal, because the scorm package has to contain all the relevant multimedia content. thus, the video streams have to be stored in each learning package. in this case we tried to find an effective video coding method, and decided to use the mpeg-4 [2] codec, due to its comprehensive support and effectiveness. in the case of adaptation just before displaying the content, our goal is achievable, because the scorm package has to contain only the best-quality video stream, and the lms can adjust it to the appropriate quality. in this case we can also use coding and delivery methods that require special server applications, by integrating them into the lms. the simplest way of adaptation is to encode and store the video stream for all of the profiles, as we have already discussed above. however, with this solution the needed storage capacity is huge, and the user has to wait for the whole file to be transferred to his device before starting the playback. some pdas do not even have enough capacity to receive the whole file, so it is not playable at all. the streaming transfer method is appropriate for avoiding this problem. this delivery method is based on a continuous data flow transmission that is decodable immediately; therefore, the user does not have to wait for the download to finish, and the playback can be started after a short buffering time. traditionally, streaming is a broadcast transfer method, but there is a special form of streaming, called video on demand (vod), that sends out the stream only on request. using streaming specifies only the mode of the data transfer; the actual encoding, and thus the storage capacity needed, can vary. storing multiple video streams using the mpeg-4 codec and transferring them with vod is an easily realizable mode of adaptation in the lms, so we compared the other candidate solutions to this method of adaptation. one of the possible solutions was to store only the best-quality video stream, and to reencode it in real time for the other platforms only on demand. this adaptation requires less storage capacity, but much more computational power, which seriously limits the number of simultaneous accesses. this is why we discarded this idea. another solution is to use multiple bitrate video streaming, which involves packing several video and audio streams with different bitrates into one file, and using a special program to select which stream to transfer to the user on demand. to open a video, a single hyperlink is enough; the server automatically chooses the proper stream according to the properties of the internet connection. this solution is not suitable for adaptation in the lms system, because the stream-choosing algorithm considers only the bandwidth, so we cannot adjust the video to the screen size of the device. furthermore, the required storage capacity is not significantly less than in the solution applied in the cms. a way to reduce the needed storage capacity is to use a scalable video codec [3]. there are two different realizations.
the first one uses layer-level, the second one bit-level scalability for improving the quality. the main point of the layered scalable video coding method is to create different layers of the video stream: a base layer and some enhancement layers. the base layer requires small bandwidth, but has poor quality and occasionally low resolution. the enhancement layers are for improving the quality or resolution. for decoding the video stream, the whole base layer is essential, and, according to the available bandwidth and the desired quality, additional enhancement layers are transmittable. the more enhancement layers are received, the better quality the video has. the disadvantage of this method is that the quality improves only if a whole enhancement layer has been received, otherwise it does not improve at all. the enhancement layers commonly improve only one property of the video stream. these properties can be seen in fig. 2.

fig. 2: different ways of scalability

spatial scalability controls the resolution of the video stream. decreasing the resolution results in a smaller picture size, which is a disadvantage when the video is played on a pc or notebook, but is favourable when sending the video to a pda. temporal scalability modifies the number of frames sent in a second. the fewer frames are sent, the smaller the transmittable stream is. however, decreasing it too much brings on discontinuity in the video, and it becomes less enjoyable. this gives a solution for the low-bandwidth problem, but not for the small screen size; therefore, this kind of encoding is good for notebooks, but not for pdas. picture quality scalability (also known as rate scalability) reduces the quality of the frames to achieve lower bitrates and faster transmission of the video stream. just as in the case of temporal scalability, it is beneficial for adapting the video to notebooks, but it does not reduce the geometrical size of the video frames either, leaving the problem of small screen size unsolved in the case of pdas. from the above-mentioned properties of the different kinds of scalability, it is clear that none of the scalability methods is able to fit all of our requirements by itself. there are existing combined scalable video codecs as well. these achieve the best quality at a specific resolution first, and switch to a higher resolution only after that. the server programs consider only the available bandwidth while selecting the number of enhancement layers to be transmitted. in our case this behavior would cause a pda with a high-bandwidth internet connection to receive the video stream in too fine a resolution. fine granular scalability (fgs) is a special form of scalable video coding that realizes bit-level scalability. in fgs only one enhancement layer exists and every bit in this layer improves the video quality; as a result, the enhancement layer can be trimmed at any point, and in this way the transmission can adapt well to the available bandwidth. in the lms, with scalable codecs the adaptation to the available bandwidth could be managed, but we would still have to store multiple streams for the different resolutions. for the defined profiles this means two streams instead of three, which would also significantly reduce the required storage capacity.
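as an illustration of the layer-selection logic discussed above, and of its weakness (only the bandwidth is considered), the following sketch picks the number of enhancement layers for each of our three profiles. the bitrates and profile parameters are invented for the example; they are not measurements from this work.

```python
# hypothetical layered stream: one base layer plus three enhancement layers
BASE_KBPS = 128
ENHANCEMENT_LAYERS_KBPS = [64, 64, 128]

# rough stand-ins for the three profiles defined in section 3
PROFILES = {
    "pc":       {"bandwidth_kbps": 2000, "width": 1024},
    "notebook": {"bandwidth_kbps": 256,  "width": 1024},
    "pda":      {"bandwidth_kbps": 256,  "width": 320},
}

def layers_to_send(profile_name):
    """pick how many enhancement layers fit into the profile's bandwidth.
    note that the screen width is never consulted: a pda with a fast
    connection would get the same stream as a notebook, which is exactly
    the shortcoming criticized above."""
    budget = PROFILES[profile_name]["bandwidth_kbps"] - BASE_KBPS
    sent = 0
    for kbps in ENHANCEMENT_LAYERS_KBPS:
        if budget < kbps:
            break
        budget -= kbps
        sent += 1
    return sent

for name in PROFILES:
    print(name, "-> base +", layers_to_send(name), "enhancement layer(s)")
```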
the only implemented fgs codec that we found was the mpeg-4 fgs reference codec and server software realized by momusys (mobile multimedia systems) [4]. as it was created as a reference implementation, the functionality of this program was quite limited and ineffective. since the implementation of this reference software, the mpeg-4 community has recognized that mpeg-4 fgs may not be efficient enough; therefore, the community is now working on a new mpeg-4-based scalable codec, called the avc (advanced video coding) scalable extension, but this is still under development. our goal of using a scalable video coding method thus failed due to the lack of appropriately operating codecs, so the implemented solution was the simplest way of adaptation, the one we used in the cms as well. for the purpose of reencoding the video files and video on demand, we searched for an existing program, and we decided to use the open source program called vlc (videolan client) [5], which provides both functionalities.

4 how to adapt in a cms

in this section we present in detail our solution for adapting the course by modifying the cms. at the time we started this project, we looked for existing open source software to use as a start-up for both of the realizations. that was how we found the code of scomaker [6] on a website of the warsaw school of information technology; we used it as a starting point for our program. scomaker is written in the ruby programming language. in the state we found it, it was not completely working software. it was able to build the hierarchy of the course (from scos) and assign the appropriate pages to the appropriate scos. navigation within the pages was also possible, and it generated the scorm 1.2 compatible manifest file as well. on the other hand, the lecturer had to modify the program code to define the hierarchy, and it was not able to find the resources. this means that the original software did not fill the pages with content; it only created empty html files. furthermore, it was not able to adapt the courses to different platforms – like most cmss, it created learning packages for pcs. also, originally the lecturer had to run four programs to create the learning package. we have modified scomaker so that it is able to find the resource files and create html files from them that are no longer empty, but contain the learning material. at this time, raw text files (.txt files), web pages containing only formatted text (.htm, .html files), pictures and illustrations (.bmp, .jpg, .jpeg, .gif, and .png files), and videos (.avi, .mpg, .mpeg, .mp4, and .wmv files) can be used as resource files. another goal was to be able to create the learning package for different platforms, such as a notebook with a poor-quality internet connection, or a pda. while designing the software, we tried to keep in mind that we were creating a program that is going to be used by people who are not computer specialists. this is why we tried to keep the user interface as simple as possible. thus, the lecturer only has to modify one tag at a given place in an xml file to choose for which platform he wants to create the course. then he has to run only one program (instead of four), and he does not need to change the program code to define the hierarchy; he just has to create folders and copy the source files into them (a schematic illustration of this folder-based course definition follows below). because the original software, in the state we found it, was only able to run on the linux platform, we decided to keep this characteristic, and port the program to windows later.
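the real tool is written in ruby; the following python fragment is only our simplified illustration of the folder-based idea (one sco per directory, resources in alphabetical order, a manifest skeleton emitted from the tree). a real imsmanifest.xml additionally needs the scorm schema declarations and metadata, which are omitted here, and all element names below are schematic.

```python
import os
import xml.etree.ElementTree as ET

def build_manifest(course_root):
    """walk a course folder tree (one sco per subdirectory) and emit a
    minimal imsmanifest.xml skeleton; illustrative only, not scorm-complete."""
    manifest = ET.Element("manifest", identifier="course")
    organizations = ET.SubElement(manifest, "organizations")
    org = ET.SubElement(organizations, "organization")
    resources = ET.SubElement(manifest, "resources")

    for sco_name in sorted(os.listdir(course_root)):
        sco_dir = os.path.join(course_root, sco_name)
        if not os.path.isdir(sco_dir):
            continue
        item = ET.SubElement(org, "item", identifierref=sco_name)
        ET.SubElement(item, "title").text = sco_name
        res = ET.SubElement(resources, "resource",
                            identifier=sco_name, type="webcontent")
        # resources are picked up in alphabetical order, as described above
        for page in sorted(os.listdir(sco_dir)):
            ET.SubElement(res, "file", href=f"{sco_name}/{page}")

    return ET.tostring(manifest, encoding="unicode")

# usage: folders laid out as my_course/<sco name>/<resource files>
# print(build_manifest("my_course"))
```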
also, though it was not the latest version of the scorm recommendation that scomaker implemented, we decided not to modify the scorm-compatible part of the software, because this is the most prevailing version today. now let us overview the operation of our software. to run the program, two parameters have to be given: one is the path of the folder containing the course hierarchy; the other is the path of the configuration file, called config.xml. this file contains the parameters needed by the program, such as the path of the template html pages, where to put the created learning package, etc. first the configuration file is processed. originally, the lecturer had to hard-code the package hierarchy into one of the program files. we modified the program so that the lecturer does not have to modify the program code. now he only has to create a folder structure that represents the hierarchy of the course (scos and pages), and copy the resource files into the corresponding page folders. the resources are placed on the pages in alphabetical order; therefore, the lecturer has to name them accordingly. from this folder structure, a temporary file is created, which is later used to create the imsmanifest.xml file that is mandatory for every scorm package. to achieve adaptability, we placed a new tag into the config.xml file, so that only the name of the profile has to be specified in order to adapt the course. now let us see how the empty html pages are filled with content. the resource files belonging to the appropriate page are collected from a temporary file, processed and inserted into the html page according to the extension of the resource file. when inserting text into the html page, the only change made is that the "\n" or "\f" (new line, new page) characters are replaced by the "<br>" html tag.
” html tag. in the case of pictures and illustrations, if no adaptation is needed, the following tag is placed into the html code: “” where #{fname} is the name and the extension of the file containing the picture or illustration. this file is then copied to the folder of the corresponding sco. picture adaptation may only be necessary when the course is created for pdas. in this case a php script is called to check if the size of the picture exceeds the permitted maximum (it is 230×300 pixels – the effective display size of most pdas), in which case adaptation takes place. adaptation means that a reduced version of the picture is placed into the page as a hyperlink, which points to the original-sized version. in this case the original picture and the picture reduced with the php script are both copied into the folder of the corresponding sco. at the insertion of web pages, the html code between the “” and the ““ tags are inserted. as in the case of raw text, the automatic text wrapping of the browser provides a proper appearance on small displays. placing a video into the course can be done by inserting the “ start video ” line into the html code. as a result, a link is placed into the page with a “start video” label. if no adaptation profile is given, the #{fname} is replaced by the original filename, thus the original, unchanged video file is placed into the course. if the pc adaptation profile is provided, the video is reencoded with the mpeg 4 codec because of its efficiency, but the properties of the video (bitrate, resolution, etc.) are not changed. the #{fname} is replaced by the original filename, but with .mp4 extension. if the “notebook” adaptation profile is given, the video is encoded with a lower bitrate; in the case of the “pda” profile, it is encoded with both a lower bitrate and smaller resolution into mpeg 4 format. in all cases the reencoding is done by a separate program, and only the new, reencoded video files are placed into the correct sco folders. also the references in the html codes point to these files. once the hierarchy, the pages and the imsmanifest.xml file are ready, the very last step is to create a .zip archive from the learning course. besides the fact that scorm requires the learning packages to be archived into an archive format such as zip, jar, cab, tar, etc., there are several advantages of creating this archive file. the main advantages are: it is easier to upload one archive file than a folder containing several files; sending the learning packages through the internet consumes less bandwidth, and also the archive requires less hard drive capacity. by the modifications described in this section, our program is capable of creating complete learning packages with multimedia contents; also the adaptation of these contents is done simply by providing a single extra parameter. such a package can be seen in fig. 3. 5 adaptation in the lms for implementing the adaptation during the playback of the course, we searched for an open source learning management system that conforms to scorm. though there are several open source lmss, only a few of them are able to display scorm packages properly. we chose the moodle [7] system because of its modular program structure, the continuous support and its high level scorm compatibility. after uploading a learning package onto the lms server, the moodle system checks the scorm conformity because the playback presumes it, and displaying a non conform course could cause errors. 
there are two ways to realize this kind of adaptation. one way is to fit the course to the platform after the verification by modifying the displaying module of the moodle system. the other way is to modify the package itself before the verification by creating a new, adapted scorm conformed course. since we had gained much experience of scorm compatibility and course generation while developing the cms, we decided to implement the latter method. besides producing an adapted learning package, we had to be able to recognize the user’s device, in order to know which device to adapt to. we also had to solve the video transmission through streaming. therefore we installed the vlc program (mentioned in section 3.) onto the server in addition to moodle. for streaming a video by vod, first the video file has to be registered into the vod server. this is done through a telnet interface, by assigning a unique identifier 52 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 47 no. 4–5/2007 fig. 3: e-learning course on a pda to the video file. after that, the video stream is available at the rtsp://servername:portnumber/identifier hyperlink. the video file can be registered to the vlc at the same time with the multimedia adaptation, and the hyperlink can also be inserted into the course at this time, so no other modules of moodle have to be modified. the used profile depends on the user’s device and also on the quality of the internet connection; hence, we have modified the login screen to grant the user the opportunity to choose the profile. after this step the program stores the profile in a cookie on the user’s device. the remaining modifications of the moodle system were placed into a separate module. in order to construct an adaptive scorm package, it is first necessary to analyze the whole course. depending on the results of the analysis, the multimedia contents have to be adjusted; the displayed html pages and the course descriptor xml file have to be modified. the analysis and the adaptation of the course are based on the processing of the imsmanifest.xml file. the adaptation of the multimedia content takes place when a reference to this content is reached for the first time while reading the manifest file. the adaptation in the lms is executed for all of the profiles. first, the software checks whether adaptation is required. the only situation when no adaptation is required is when the multimedia content is an image that is smaller than all of the display sizes of the different profiles. after checking this, the program reencodes or resizes the content, if necessary, as described in the previous section. the names of the newly generated files are always formed as filename_profilename.extension. the program registers the video files into the vlc server at this point and it stores the new unique video identifiers. the identifiers for the same video but different platforms are formed as identifier_profilename. in this way we can ensure that similar video file names will not cause problems in video streaming. after the adaptation of the material, the program refreshes the manifest file to keep the scorm compatibility. during the processing of the imsmanifest.xml the program collects all of the html pages for all multimedia contents that can refer to it. when the analysis is finished, the adaptation of media files is completed, and the course descriptor xml file is consistent with the new package. 
after that, only the modifications of the html pages occur. the program searches for the hyperlinks of the media files in the html codes, and replaces them with short javascript functions. these functions automatically produce the proper hyperlink using the original filenames (or video identifiers) and the profile name from the previously stored cookie. after the above modifications of the html pages, the new package is adaptable, but still scorm compatible, so it can pass the verification of the moodle system. for proper playback it requires the profile name to be stored in a cookie in the user’s device, and a vod server, which provides the video streams. because of these requirements, only our modified moodle system can correctly display an adapted course. 6 conclusion with the use of modern technologies, our lives are changing. as a result, educational methods are changing as well; computers are used to aid teaching, and mobile devices to develop ubiquitous learning. as e-learning is spreading, the need appears for learning packages to be playable on these mobile devices as well. to satisfy this need we implemented two fundamentally different systems capable of adapting the e-learning content and packages for mobile devices. in this paper, after discussing the fundamentals of e-learning, we presented the design considerations and the different possibilities of scalable video coding. we also explained and justified our implementation choices. finally, we briefly described the realized systems. the cms, that we created for demonstrational purposes, can create an e learning package using existing text, image, video and html files and is able to adapt these contents to the given platform. this system, as it is now, meets the requirements of proper generation of platform-specific learning packages. the courses created by our cms were tested on several lmss operating on windows and on a pda emulator. the most important property of this system is that the constructed courses are displayable by all scorm conformant lmss. another way to support the use of mobile devices is to convert existing courses that were created for pcs. this approach is advantageous, as it supports every package regardless of the cms used to create it. our modified version of moodle adapts and displays scorm compliant learning packages properly. we used several courses either created by different cmss or found on the internet to test the system. since different cmss may apply different ways of inserting a video into the course, problems may arise during playback. however, moodle handled the video files in the packages that we created with the two cmss that were used for testing. to sum up, both of the systems that we implemented are working properly and are capable of learning package adaptation, but there is still room for improvement. the set of supported multimedia contents and the supported profiles could be increased. when a suitable, efficient scalable video codec appears, it can also be easily applied in the lms system. implementing version 1.3 of the scorm recommendation (aka. scorm2004) could also be considered. acknowledgments the work described in this paper was supervised by krisztián németh, and was supported by the high speed networks laboratory, department of telecommunications and media informatics, budapest university of technology and economics. references [1] scorm 1.2 documentation, october 2001, http://www.adlnet.gov/scorm/history/12/index.cfm [2] iso/iec jtc1/sc29/wg11(mpeg working group), doc no. 
n4668 the mpeg 4 overview, march 2002, http://www.chiariglione.org/mpeg/standards/mpeg-4/ mpeg-4.htm [3] wang, y., ostermann, j., zhang, y.: video processing and communications. prentice hall, 2002. isbn 0-13-017547-1 © czech technical university publishing house http://ctn.cvut.cz/ap/ 53 acta polytechnica vol. 47 no. 4–5/2007 [4] the home page of the mobile multimedia systems project: http://cordis.europa.eu/infowin/acts/analysys/ products/thematic/mpeg4/momusys/momusys.htm [5] the homepage and the documentation of videolan client: http://www.videolan.org [6] scomaker: http://mxf.wsisiz.edu.pl/scomaker/doc/scomaker [7] the homepage and the developer’s documentation of the moodle system: http://www.moodle.org, http://docs.moodle.org/en/developer_documentation réka szabados e-mail: sr556@hszk.bme.hu katalin sipos e-mail: sk494@hszk.bme.hu department of telecommunications and media informatics faculty of electrical engineering budapest university of technology and economics 2 magyar tudósok krt. budapest, h-1117, hungary 54 © czech technical university publishing house http://ctn.cvut.cz/ap/ acta polytechnica vol. 47 no. 4–5/2007